TOP ACTIVATIONS
MAX = 2.279

easy to forget what got you to where you are now
it combines what the Taliban were saying via cellphone with US
3 x mb 8 b ET pic . twitter . com
reward . After they reach the criterion ( 3 or 10
exercise de ple tes glucose and glyc ogen rapidly , forcing
know what you ’ re thinking . “ Roger
key for T UG , an open - world survival title
to understand what their patients were saying . There
researchers believe that abnormal levels of dopamine are connected to schizophrenia
Sorry , no free Wrong Eye here .) When picking your
I completely understand what you are feeling right now because I
izes the body ’ s natural shifts between sympathetic (
of what you ’ re looking for in the mountains
where he admitted what he had done to a police officer
X w 2 cv h 4 z Sonia R inc
the other provinces and utilities are paying ," Vel ie said
B AM ! What you were seeking , you held .
reward ( water ). They were then habit uated to sitting
m B z 5 s 8 General Flynn (@ Gen
/ z J X w 2 cv h 4 z

MIN ACTIVATIONS
MIN = -2.488

or more mitigation systems to protect their networks and online services
to replicate the scene as closely as possible , and Murd
. x . 134 11 35 487 x . x .
foul cal um ny was almost certainly the work of Richard
try and match them as closely as possible and will communicate
The Best T reek As yu K ong Ch oya Week
. This would save 108 , 476 , 7 88 litres
. x . 169 11 66 589 x . x .
. x . 111 10 36 203 x . x .
. Around 1 , 337 trains in the northern
would then stand at 2 , 88 , 339 tons .
state , an incredible 12 , 436 seed lings were rejected
per cent of the 9 , 204 people employed in that
crisis close to 6 , 000 concerned issues consumers had
. x . 112 10 25 151 x . x .
AM as the premiere rather than D IV IDE AND CON
for the present ers entering from the other side of the
. x . 119 10 23 199 x . x .
. x . 156 10 23 688 x . x .
. x . 143 10 33 455 x . x .

INTERVAL 1.088 - 2.279
CONTAINS 0.617%

which you train your body to rapidly and efficiently mobilize free
fail to point out is that those executive actions contained no
researchers believe that abnormal levels of dopamine are connected to schizophrenia
pregnant . Flynn and his wife specifically decided to make the
have no j agg ies at all even with no AA

INTERVAL -0.104 - 1.088
CONTAINS 59.618%

's going to take some work , on both the political
man does while alone in his room . Oh
224 GB / s 192 GB / s 256 GB /
know I 'm being silly . But it 's just that
a cable bundle ! You must specify that you want issue

INTERVAL -1.296 - -0.104
CONTAINS 39.630%

lot of snow recently . From the reports we 've received
state court level , where 95 percent of the nation
}, {{ 9 , 14 }, { 9 , 15 },
the so - called complaint . That was back in 2008
Castro , Cuban dissidents who advocate greater human rights on the

INTERVAL -2.488 - -1.296
CONTAINS 0.135%

opinions expressed by column ists are their own and do not
face are just as powerful . The stakes of our struggle
the work of Islamic extremists or by others seeking to exploit
that 's celebrating its 40 th anniversary . It 's paired
hood in New York , also served as a set for
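
NOTE (illustrative sketch, not part of the original dump): the four INTERVAL blocks above look like an equal-width split of the range from MIN to MAX, with each CONTAINS value giving the share of sampled tokens whose activation falls in that bin. That reading is an assumption; the dump does not state how the edges were chosen. The helper name equal_width_edges is hypothetical. Under that assumption, the edges and the total share can be checked like this:

```python
# Sketch: recompute the interval edges from the reported MIN and MAX,
# assuming four equal-width bins (an assumption, not stated in the dump).

def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 evenly spaced edges from lo to hi."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]

MIN_ACT = -2.488   # "MIN = -2.488" as reported
MAX_ACT = 2.279    # "MAX = 2.279" as reported

edges = equal_width_edges(MIN_ACT, MAX_ACT, 4)

# CONTAINS percentages as reported, ordered from lowest bin to highest.
shares = [0.135, 39.630, 59.618, 0.617]

for (left, right), share in zip(zip(edges, edges[1:]), shares):
    print(f"{left:+.3f} .. {right:+.3f}  contains {share:.3f}%")

print(f"total share: {sum(shares):.3f}%")
```

Because MIN and MAX are themselves rounded to three decimals, the recomputed edges can differ from the listed ones in the last digit.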
Token easy
Token ID2562
Feature activation+7.43e-01

Pos logit contributions
zie+8.940
yg+8.002
verbal+6.921
enh+6.911
uben+6.897
Neg logit contributions
ディ-13.426
-12.647
Counter-12.343
akespeare-11.924
-11.727
Token to
Token ID284
Feature activation+7.70e-01

Pos logit contributions
zie+7.281
yg+6.565
grab+5.651
gain+5.625
verbal+5.620
Neg logit contributions
ディ-9.643
-9.029
Counter-8.950
akespeare-8.556
-8.435
Token forget
Token ID6044
Feature activation+1.05e+00

Pos logit contributions
zie+8.137
yg+7.590
gain+6.834
 SEE+6.733
grab+6.703
Neg logit contributions
ディ-9.130
-8.192
Counter-7.987
-7.614
-7.542
Token what
Token ID644
Feature activation+9.29e-01

Pos logit contributions
zie+9.533
yg+8.596
verbal+7.647
grab+7.572
gain+7.540
Neg logit contributions
ディ-13.054
-12.100
Counter-11.657
akespeare-11.344
-11.196
Token got
Token ID1392
Feature activation+8.03e-01

Pos logit contributions
zie+9.457
yg+8.474
enh+7.663
gain+7.654
grab+7.613
Neg logit contributions
ディ-10.976
-10.127
Counter-9.457
akespeare-9.446
-9.249
Token you
Token ID345
Feature activation+2.28e+00

Pos logit contributions
zie+8.391
yg+7.772
grab+7.062
gain+6.903
verbal+6.830
Neg logit contributions
ディ-9.309
-8.578
Counter-8.274
akespeare-8.169
-7.775
Token to
Token ID284
Feature activation+1.33e+00

Pos logit contributions
zie+15.471
yg+14.459
grab+13.608
 whereby+13.273
 SEE+12.852
Neg logit contributions
ディ-22.582
akespeare-19.649
-19.467
Counter-19.384
dayName-19.246
Token where
Token ID810
Feature activation+7.91e-01

Pos logit contributions
zie+11.506
yg+10.548
 SEE+9.702
enh+9.622
grab+9.606
Neg logit contributions
ディ-15.392
-13.963
Counter-13.395
akespeare-13.075
-12.889
Token you
Token ID345
Feature activation+1.54e+00

Pos logit contributions
zie+8.441
yg+7.603
grab+6.843
uben+6.824
gain+6.801
Neg logit contributions
ディ-9.210
-8.502
Counter-8.321
akespeare-8.040
-7.761
Token are
Token ID389
Feature activation+1.06e+00

Pos logit contributions
zie+9.543
yg+8.969
grab+8.130
gain+8.082
 SEE+7.619
Neg logit contributions
ディ-19.626
-18.433
Counter-18.078
akespeare-17.454
Doctor-17.214
Token now
Token ID783
Feature activation+8.02e-01

Pos logit contributions
zie+8.920
yg+8.037
grab+7.128
gain+7.039
verbal+7.036
Neg logit contributions
ディ-13.569
-12.929
Counter-12.500
akespeare-12.115
-11.994
Token it
Token ID340
Feature activation+2.64e-01

Pos logit contributions
zie+9.190
yg+8.246
uben+7.292
grab+7.200
verbal+7.156
Neg logit contributions
ディ-11.259
-10.359
Counter-10.222
akespeare-9.829
-9.583
Token combines
Token ID21001
Feature activation+8.95e-01

Pos logit contributions
zie+2.834
yg+2.549
grab+2.237
uben+2.219
verbal+2.216
Neg logit contributions
ディ-3.256
-3.064
Counter-3.031
akespeare-2.883
-2.811
Token what
Token ID644
Feature activation+6.00e-01

Pos logit contributions
zie+9.465
yg+8.794
verbal+7.730
grab+7.666
enh+7.641
Neg logit contributions
ディ-9.989
-9.349
Counter-9.098
akespeare-8.532
-8.455
Token the
Token ID262
Feature activation+6.24e-01

Pos logit contributions
zie+6.519
yg+5.919
grab+5.132
uben+5.126
aptic+5.107
Neg logit contributions
ディ-7.121
-6.638
Counter-6.488
akespeare-6.190
-6.090
Token Taliban
Token ID15471
Feature activation+9.12e-01

Pos logit contributions
zie+6.996
yg+6.282
enh+5.588
uben+5.585
grab+5.523
Neg logit contributions
ディ-7.159
-6.646
Counter-6.505
akespeare-6.210
-6.055
Token were
Token ID547
Feature activation+2.09e+00

Pos logit contributions
zie+9.206
yg+8.233
enh+7.319
verbal+7.268
grab+7.239
Neg logit contributions
ディ-10.687
-9.819
Counter-9.473
akespeare-9.414
-8.932
Token saying
Token ID2282
Feature activation+9.54e-01

Pos logit contributions
zie+17.510
yg+16.072
grab+14.882
 SEE+14.626
 ­+14.507
Neg logit contributions
ディ-18.546
-17.029
akespeare-16.221
acters-16.017
Counter-15.931
Token via
Token ID2884
Feature activation+7.91e-01

Pos logit contributions
zie+8.770
yg+7.964
verbal+6.926
grab+6.741
uben+6.655
Neg logit contributions
ディ-11.802
Counter-10.998
-10.988
akespeare-10.481
-10.356
Token cellphone
Token ID26346
Feature activation+4.12e-01

Pos logit contributions
zie+9.066
yg+8.213
verbal+7.403
enh+7.376
aptic+7.330
Neg logit contributions
ディ-8.500
-7.901
Counter-7.736
akespeare-7.328
-7.273
Token with
Token ID351
Feature activation+2.59e-01

Pos logit contributions
zie+4.738
yg+4.324
verbal+3.828
grab+3.824
aptic+3.800
Neg logit contributions
ディ-4.676
-4.389
Counter-4.341
akespeare-4.153
-4.014
Token US
Token ID1294
Feature activation-2.60e-02

Pos logit contributions
zie+3.038
yg+2.769
verbal+2.449
grab+2.448
uben+2.446
Neg logit contributions
ディ-2.947
-2.765
Counter-2.735
akespeare-2.593
-2.513
Token3
Token ID18
Feature activation+1.35e+00

Pos logit contributions
zie+12.636
yg+12.120
md+10.408
dl+10.220
enh+10.124
Neg logit contributions
ディ-12.649
-11.752
akespeare-11.252
Counter-11.226
-10.995
Tokenx
Token ID87
Feature activation+1.46e+00

Pos logit contributions
zie+13.946
yg+13.470
md+11.614
dl+11.464
enh+11.282
Neg logit contributions
ディ-13.460
-12.503
akespeare-12.069
Counter-11.912
-11.696
Tokenmb
Token ID2022
Feature activation+1.47e+00

Pos logit contributions
zie+15.180
yg+14.548
md+12.710
dl+12.459
kh+12.382
Neg logit contributions
ディ-13.892
-12.807
akespeare-12.351
Counter-12.148
-12.106
Token8
Token ID23
Feature activation+1.74e+00

Pos logit contributions
zie+15.337
yg+14.693
md+12.949
dl+12.724
kh+12.656
Neg logit contributions
ディ-13.818
-12.862
akespeare-12.319
Counter-12.119
-12.072
Tokenb
Token ID65
Feature activation+1.27e+00

Pos logit contributions
zie+17.736
yg+17.362
md+15.099
dl+14.888
kh+14.683
Neg logit contributions
ディ-14.939
-13.835
akespeare-13.385
 cloaked-12.998
-12.892
TokenET
Token ID2767
Feature activation+2.08e+00

Pos logit contributions
zie+13.428
yg+12.516
md+11.049
kh+10.775
enh+10.743
Neg logit contributions
ディ-12.846
-11.967
akespeare-11.510
Counter-11.392
-11.221
Token pic
Token ID8301
Feature activation+4.83e-01

Pos logit contributions
zie+18.767
yg+18.376
dl+15.937
grab+15.718
md+15.663
Neg logit contributions
ディ-17.469
-15.776
Counter-15.315
acters-14.893
akespeare-14.778
Token.
Token ID13
Feature activation-1.24e-02

Pos logit contributions
zie+3.647
yg+3.339
grab+2.866
md+2.845
verbal+2.690
Neg logit contributions
ディ-7.183
-6.889
akespeare-6.743
��-6.578
Counter-6.530
Tokentwitter
Token ID6956
Feature activation+7.81e-01

Pos logit contributions
ディ+0.160
+0.153
Counter+0.150
akespeare+0.146
+0.140
Neg logit contributions
zie-0.125
yg-0.115
grab-0.100
verbal-0.099
gain-0.098
Token.
Token ID13
Feature activation+4.09e-02

Pos logit contributions
zie+5.007
yg+4.829
grab+3.994
md+3.897
kh+3.644
Neg logit contributions
ディ-11.798
��-11.449
-11.332
akespeare-11.300
 compr-11.231
Tokencom
Token ID785
Feature activation+1.29e-01

Pos logit contributions
zie+0.422
yg+0.386
grab+0.340
gain+0.333
verbal+0.331
Neg logit contributions
ディ-0.523
-0.496
Counter-0.486
akespeare-0.479
-0.457
Token reward
Token ID6721
Feature activation+1.41e+00

Pos logit contributions
zie+13.976
yg+12.446
grab+11.797
aptic+11.720
 reward+11.713
Neg logit contributions
ディ-14.744
-13.331
Counter-13.190
akespeare-12.739
-12.322
Token.
Token ID13
Feature activation+1.30e+00

Pos logit contributions
zie+12.734
yg+12.079
grab+10.988
aptic+10.631
verbal+10.383
Neg logit contributions
ディ-15.567
-14.046
akespeare-13.834
Counter-13.475
-13.408
Token After
Token ID2293
Feature activation+9.06e-01

Pos logit contributions
zie+12.169
yg+11.155
 SEE+10.811
md+10.422
grab+10.337
Neg logit contributions
ディ-14.094
-12.657
akespeare-12.524
Counter-11.848
Doctor-11.700
Token they
Token ID484
Feature activation+1.39e+00

Pos logit contributions
zie+8.970
yg+8.235
grab+7.452
verbal+7.418
gain+7.400
Neg logit contributions
ディ-10.825
-10.106
Counter-9.770
-9.330
akespeare-9.295
Token reach
Token ID3151
Feature activation+1.66e+00

Pos logit contributions
zie+11.196
yg+10.270
gain+9.612
 SEE+9.468
verbal+9.100
Neg logit contributions
ディ-16.986
-15.228
-14.696
Counter-14.591
-13.879
Token the
Token ID262
Feature activation+2.05e+00

Pos logit contributions
zie+14.464
yg+13.108
grab+11.970
aptic+11.885
gain+11.682
Neg logit contributions
ディ-17.587
-15.806
Counter-15.758
akespeare-15.082
-14.859
Token criterion
Token ID34054
Feature activation+1.55e+00

Pos logit contributions
zie+16.625
 reward+15.540
yg+14.914
 dopamine+14.741
grab+14.204
Neg logit contributions
ディ-19.362
Counter-17.310
-17.112
akespeare-16.668
-16.055
Token (
Token ID357
Feature activation+1.58e+00

Pos logit contributions
zie+12.931
yg+12.468
grab+11.411
 whereby+11.199
verbal+11.124
Neg logit contributions
ディ-16.752
akespeare-14.908
-14.897
Counter-14.663
-14.346
Token3
Token ID18
Feature activation+1.13e+00

Pos logit contributions
zie+16.808
yg+15.491
gain+14.968
verbal+14.890
grab+14.858
Neg logit contributions
ディ-13.318
akespeare-12.301
-11.969
Counter-11.385
acters-11.341
Token or
Token ID393
Feature activation+6.56e-01

Pos logit contributions
zie+11.407
yg+10.739
md+9.534
verbal+9.498
aptic+9.482
Neg logit contributions
ディ-12.124
Counter-11.070
-11.031
akespeare-11.023
Doctor-10.305
Token 10
Token ID838
Feature activation+1.47e+00

Pos logit contributions
zie+6.280
yg+5.634
grab+4.968
gain+4.962
verbal+4.919
Neg logit contributions
ディ-8.415
-7.956
Counter-7.718
akespeare-7.599
-7.328
Token exercise
Token ID5517
Feature activation+1.18e+00

Pos logit contributions
zie+11.513
yg+9.929
gain+9.602
enh+9.492
 SEE+9.330
Neg logit contributions
ディ-13.786
Counter-12.471
-12.451
akespeare-12.284
1963-11.657
Token de
Token ID390
Feature activation+7.15e-01

Pos logit contributions
zie+10.161
yg+8.731
enh+8.597
gain+8.497
aptic+8.181
Neg logit contributions
ディ-14.312
-13.193
Counter-12.814
akespeare-12.557
-11.981
Tokenple
Token ID1154
Feature activation+3.98e-01

Pos logit contributions
zie+5.968
yg+5.225
enh+4.572
gain+4.560
grab+4.491
Neg logit contributions
ディ-9.963
Counter-9.472
-9.460
akespeare-9.250
-8.964
Tokentes
Token ID4879
Feature activation+1.83e+00

Pos logit contributions
zie+2.564
yg+2.146
verbal+1.728
grab+1.713
gain+1.703
Neg logit contributions
ディ-6.646
-6.306
Counter-6.148
akespeare-5.936
-5.928
Token glucose
Token ID15701
Feature activation+1.16e+00

Pos logit contributions
zie+14.671
yg+13.532
 dopamine+12.912
gain+12.594
dl+12.373
Neg logit contributions
ディ-18.189
akespeare-17.350
-17.109
-16.792
Counter-16.711
Token and
Token ID290
Feature activation+2.03e+00

Pos logit contributions
zie+10.744
yg+9.624
gain+9.012
grab+8.690
enh+8.663
Neg logit contributions
ディ-13.513
-12.505
akespeare-12.251
Counter-12.006
Doctor-11.753
Token glyc
Token ID40232
Feature activation+8.85e-01

Pos logit contributions
zie+16.241
 dopamine+15.267
enh+14.757
yg+14.743
gain+14.677
Neg logit contributions
ディ-18.936
akespeare-17.256
-17.252
-16.350
Counter-16.132
Tokenogen
Token ID6644
Feature activation+1.03e+00

Pos logit contributions
zie+8.605
yg+7.873
gain+7.072
oled+7.064
enh+7.042
Neg logit contributions
ディ-11.098
-10.130
Counter-9.883
-9.591
Doctor-9.379
Token rapidly
Token ID8902
Feature activation+6.65e-01

Pos logit contributions
zie+9.553
yg+8.768
aptic+7.763
gain+7.736
grab+7.681
Neg logit contributions
ディ-12.290
-11.302
Counter-11.212
akespeare-11.192
-10.699
Token,
Token ID11
Feature activation+1.52e+00

Pos logit contributions
zie+6.556
yg+5.804
enh+5.133
grab+5.080
aptic+5.074
Neg logit contributions
ディ-8.523
-7.939
Counter-7.866
akespeare-7.625
-7.376
Token forcing
Token ID10833
Feature activation+1.13e+00

Pos logit contributions
zie+13.047
yg+11.361
enh+11.359
gain+11.036
 whereby+10.895
Neg logit contributions
ディ-15.869
-14.781
akespeare-14.594
Counter-14.338
Doctor-13.702
Token know
Token ID760
Feature activation+5.39e-01

Pos logit contributions
zie+3.771
yg+3.407
grab+3.009
enh+2.957
gain+2.952
Neg logit contributions
ディ-4.131
-3.896
Counter-3.838
akespeare-3.673
-3.532
Token what
Token ID644
Feature activation+4.65e-01

Pos logit contributions
zie+5.690
yg+5.094
grab+4.484
enh+4.467
verbal+4.450
Neg logit contributions
ディ-6.662
-6.161
Counter-6.093
akespeare-5.942
-5.738
Token you
Token ID345
Feature activation+1.30e+00

Pos logit contributions
zie+5.674
yg+5.132
grab+4.625
verbal+4.556
enh+4.552
Neg logit contributions
ディ-5.051
-4.654
Counter-4.563
akespeare-4.419
-4.245
Token
Token ID447
Feature activation+4.24e-01

Pos logit contributions
zie+9.397
yg+7.985
 SEE+7.055
verbal+6.839
 ­+6.795
Neg logit contributions
ディ-16.921
Counter-16.080
akespeare-15.647
-15.488
acters-15.260
Token
Token ID247
Feature activation+6.69e-01

Pos logit contributions
zie+4.717
yg+4.358
uben+3.885
gain+3.868
grab+3.832
Neg logit contributions
ディ-4.831
Counter-4.490
-4.484
akespeare-4.338
Doctor-4.111
Tokenre
Token ID260
Feature activation+2.02e+00

Pos logit contributions
zie+6.369
yg+5.807
grab+4.975
gain+4.913
verbal+4.906
Neg logit contributions
ディ-8.572
Counter-8.144
-8.123
akespeare-7.860
-7.659
Token thinking
Token ID3612
Feature activation+5.87e-01

Pos logit contributions
zie+14.970
yg+14.252
 ­+12.420
grab+12.383
gain+12.150
Neg logit contributions
ディ-19.895
akespeare-18.691
Counter-18.330
-18.169
-18.097
Token.
Token ID13
Feature activation+5.72e-01

Pos logit contributions
zie+5.565
yg+4.943
aptic+4.280
grab+4.243
verbal+4.242
Neg logit contributions
ディ-7.778
-7.328
Counter-7.304
akespeare-7.058
-6.825
Token
Token ID564
Feature activation-1.77e-01

Pos logit contributions
zie+5.825
yg+5.218
grab+4.721
 SEE+4.591
enh+4.584
Neg logit contributions
ディ-6.980
-6.511
akespeare-6.402
Counter-6.382
-5.994
Token
Token ID250
Feature activation-5.29e-02

Pos logit contributions
ディ+1.458
+1.365
Counter+1.315
akespeare+1.204
+1.189
Neg logit contributions
zie-2.594
yg-2.416
enh-2.188
grab-2.179
uben-2.170
TokenRoger
Token ID43719
Feature activation+3.66e-01

Pos logit contributions
ディ+0.684
+0.649
Counter+0.644
akespeare+0.613
+0.593
Neg logit contributions
zie-0.532
yg-0.475
uben-0.415
aptic-0.412
verbal-0.412
Token key
Token ID1994
Feature activation+1.45e+00

Pos logit contributions
zie+14.196
yg+11.579
grab+11.077
 Achievement+10.593
uben+10.552
Neg logit contributions
ディ-17.321
-15.825
akespeare-15.680
Counter-15.638
acters-14.949
Token for
Token ID329
Feature activation+6.30e-01

Pos logit contributions
zie+12.292
yg+10.658
grab+9.985
 whereby+9.703
aptic+9.237
Neg logit contributions
ディ-16.724
Counter-15.707
-15.589
akespeare-15.210
-14.554
Token T
Token ID309
Feature activation+7.12e-01

Pos logit contributions
zie+7.082
yg+6.579
uben+5.722
grab+5.720
enh+5.681
Neg logit contributions
ディ-7.164
-6.719
Counter-6.552
akespeare-6.362
-6.034
TokenUG
Token ID7340
Feature activation+1.06e+00

Pos logit contributions
zie+7.253
yg+6.662
aptic+5.990
uben+5.872
grab+5.793
Neg logit contributions
ディ-8.721
Counter-8.043
-7.943
akespeare-7.521
-7.418
Token,
Token ID11
Feature activation+1.44e+00

Pos logit contributions
zie+9.738
yg+8.759
uben+7.568
gain+7.523
aptic+7.505
Neg logit contributions
ディ-12.873
-11.723
Counter-11.683
akespeare-11.554
Doctor-10.905
Token an
Token ID281
Feature activation+2.02e+00

Pos logit contributions
zie+12.441
yg+11.709
 whereby+10.635
grab+10.179
enh+10.128
Neg logit contributions
ディ-15.686
-14.540
Counter-14.157
akespeare-13.903
acters-13.215
Token open
Token ID1280
Feature activation+4.98e-01

Pos logit contributions
zie+16.234
yg+14.344
enh+13.470
 Achievement+13.047
kh+12.931
Neg logit contributions
ディ-21.279
-17.909
akespeare-17.390
Counter-17.284
1963-16.855
Token-
Token ID12
Feature activation+6.91e-01

Pos logit contributions
zie+4.592
yg+4.109
grab+3.514
uben+3.509
verbal+3.479
Neg logit contributions
ディ-6.804
-6.394
Counter-6.353
akespeare-6.120
-5.949
Tokenworld
Token ID6894
Feature activation+9.92e-01

Pos logit contributions
zie+4.750
yg+4.214
grab+3.491
enh+3.462
verbal+3.439
Neg logit contributions
ディ-10.843
-10.217
Counter-10.044
akespeare-9.968
-9.698
Token survival
Token ID9441
Feature activation+1.09e+00

Pos logit contributions
zie+10.100
yg+9.184
grab+8.077
uben+7.966
enh+7.961
Neg logit contributions
ディ-11.280
-10.239
akespeare-10.011
Counter-10.000
1963-9.455
Token title
Token ID3670
Feature activation+7.69e-01

Pos logit contributions
zie+9.942
yg+8.992
grab+7.823
uben+7.662
enh+7.654
Neg logit contributions
ディ-13.054
-12.183
akespeare-11.925
Counter-11.701
1963-11.390
Token to
Token ID284
Feature activation-1.21e-01

Pos logit contributions
zie+4.371
yg+3.948
verbal+3.452
grab+3.442
uben+3.438
Neg logit contributions
ディ-4.888
-4.590
Counter-4.550
akespeare-4.326
-4.231
Token understand
Token ID1833
Feature activation+2.35e-01

Pos logit contributions
ディ+1.364
+1.298
Counter+1.286
akespeare+1.231
+1.178
Neg logit contributions
zie-1.409
yg-1.283
aptic-1.143
uben-1.139
verbal-1.134
Token what
Token ID644
Feature activation+3.46e-01

Pos logit contributions
zie+2.718
yg+2.473
verbal+2.182
uben+2.180
grab+2.177
Neg logit contributions
ディ-2.718
-2.547
Counter-2.525
akespeare-2.394
-2.325
Token their
Token ID511
Feature activation+3.44e-01

Pos logit contributions
zie+3.841
yg+3.473
grab+3.043
uben+3.042
verbal+3.039
Neg logit contributions
ディ-4.163
-3.895
Counter-3.871
akespeare-3.669
-3.573
Token patients
Token ID3871
Feature activation+7.96e-01

Pos logit contributions
zie+4.161
yg+3.793
verbal+3.399
aptic+3.369
uben+3.368
Neg logit contributions
ディ-3.770
-3.489
Counter-3.482
akespeare-3.248
-3.177
Token were
Token ID547
Feature activation+2.00e+00

Pos logit contributions
zie+7.397
yg+6.490
uben+5.716
verbal+5.712
gain+5.674
Neg logit contributions
ディ-10.198
Counter-9.618
-9.503
akespeare-9.110
-8.866
Token saying
Token ID2282
Feature activation+4.08e-01

Pos logit contributions
zie+14.975
yg+13.776
 ­+12.587
 SEE+11.961
verbal+11.574
Neg logit contributions
ディ-21.255
-18.973
-18.660
Counter-18.602
akespeare-18.182
Token.
Token ID13
Feature activation+2.98e-01

Pos logit contributions
zie+4.096
yg+3.680
verbal+3.194
uben+3.162
aptic+3.154
Neg logit contributions
ディ-5.319
-4.987
Counter-4.963
akespeare-4.710
-4.628
Token
Token ID198
Feature activation-2.78e-01

Pos logit contributions
zie+3.321
yg+3.012
verbal+2.669
grab+2.658
gain+2.623
Neg logit contributions
ディ-3.517
-3.277
Counter-3.247
akespeare-3.143
-3.011
Token
Token ID198
Feature activation+4.51e-01

Pos logit contributions
ディ+2.897
+2.702
Counter+2.666
akespeare+2.526
+2.433
Neg logit contributions
zie-3.547
yg-3.260
grab-2.916
verbal-2.913
uben-2.911
TokenThere
Token ID1858
Feature activation+8.15e-01

Pos logit contributions
zie+5.471
yg+5.040
verbal+4.578
grab+4.514
gain+4.481
Neg logit contributions
ディ-4.776
-4.385
Counter-4.266
akespeare-4.214
 spoof-4.024
Token researchers
Token ID4837
Feature activation-4.92e-02

Pos logit contributions
zie+8.257
yg+7.817
aptic+6.996
verbal+6.953
enh+6.916
Neg logit contributions
ディ-7.948
-7.338
Counter-7.134
akespeare-6.849
-6.603
Token believe
Token ID1975
Feature activation+7.39e-01

Pos logit contributions
ディ+0.578
+0.547
Counter+0.543
akespeare+0.517
+0.501
Neg logit contributions
zie-0.556
yg-0.502
verbal-0.443
uben-0.442
grab-0.442
Token that
Token ID326
Feature activation+1.46e+00

Pos logit contributions
zie+7.411
yg+6.834
enh+5.988
aptic+5.988
verbal+5.887
Neg logit contributions
ディ-9.037
-8.447
Counter-8.347
akespeare-7.966
-7.765
Token abnormal
Token ID18801
Feature activation+1.76e+00

Pos logit contributions
zie+13.825
yg+13.214
 dopamine+13.047
aptic+11.974
enh+11.943
Neg logit contributions
ディ-14.878
-13.692
Counter-13.168
akespeare-12.613
-12.499
Token levels
Token ID2974
Feature activation+8.77e-01

Pos logit contributions
 dopamine+14.853
zie+14.498
yg+14.214
aptic+13.033
 SEE+12.338
Neg logit contributions
ディ-18.148
Counter-15.925
-15.880
-15.873
akespeare-15.612
Token of
Token ID286
Feature activation+1.98e+00

Pos logit contributions
zie+8.073
yg+7.244
aptic+6.415
verbal+6.295
gain+6.266
Neg logit contributions
ディ-11.460
-10.740
Counter-10.597
akespeare-10.232
-9.899
Token dopamine
Token ID26252
Feature activation+1.07e+00

Pos logit contributions
 dopamine+15.770
zie+14.681
yg+14.415
aptic+13.334
 SEE+12.848
Neg logit contributions
ディ-20.671
-18.476
Counter-18.301
-18.219
akespeare-18.069
Token are
Token ID389
Feature activation+1.33e-01

Pos logit contributions
zie+9.694
yg+8.693
aptic+8.018
enh+7.961
gain+7.937
Neg logit contributions
ディ-13.251
-11.900
Counter-11.637
akespeare-11.437
Doctor-10.966
Token connected
Token ID5884
Feature activation+3.03e-01

Pos logit contributions
zie+1.554
yg+1.416
uben+1.253
grab+1.251
verbal+1.251
Neg logit contributions
ディ-1.510
-1.418
Counter-1.404
akespeare-1.331
-1.290
Token to
Token ID284
Feature activation+1.42e+00

Pos logit contributions
zie+3.292
yg+2.969
uben+2.619
grab+2.614
verbal+2.605
Neg logit contributions
ディ-3.715
-3.528
Counter-3.481
akespeare-3.322
-3.223
Token schizophrenia
Token ID22794
Feature activation+4.76e-01

Pos logit contributions
zie+13.803
yg+12.923
enh+11.741
verbal+11.703
 dopamine+11.627
Neg logit contributions
ディ-14.524
-13.636
Counter-13.116
akespeare-12.577
-12.361
TokenSorry
Token ID14385
Feature activation+1.27e+00

Pos logit contributions
zie+10.515
yg+9.639
grab+9.020
gain+8.960
enh+8.948
Neg logit contributions
ディ-8.780
akespeare-8.074
-7.969
acters-7.375
Counter-7.367
Token,
Token ID11
Feature activation+9.32e-01

Pos logit contributions
zie+12.442
yg+10.956
grab+9.894
enh+9.728
gain+9.695
Neg logit contributions
ディ-13.854
-12.884
akespeare-12.711
Counter-12.510
-12.082
Token no
Token ID645
Feature activation+1.26e+00

Pos logit contributions
zie+10.309
yg+9.210
grab+8.413
enh+8.377
uben+8.255
Neg logit contributions
ディ-9.862
-9.189
Counter-8.875
akespeare-8.859
-8.425
Token free
Token ID1479
Feature activation+1.48e+00

Pos logit contributions
zie+14.576
yg+12.894
enh+11.870
grab+11.780
gain+11.757
Neg logit contributions
ディ-11.454
-10.517
Counter-10.335
akespeare-10.066
-9.791
Token Wrong
Token ID28843
Feature activation+3.88e-01

Pos logit contributions
zie+16.477
yg+14.126
grab+13.508
enh+13.307
md+13.066
Neg logit contributions
ディ-13.123
-11.912
Counter-11.863
akespeare-11.530
-11.209
Token Eye
Token ID14144
Feature activation+1.98e+00

Pos logit contributions
zie+3.654
yg+3.258
verbal+2.786
grab+2.783
uben+2.781
Neg logit contributions
ディ-5.319
-5.056
Counter-5.026
akespeare-4.794
-4.671
Token here
Token ID994
Feature activation+8.56e-01

Pos logit contributions
zie+19.683
yg+16.343
enh+15.853
gain+15.731
grab+15.603
Neg logit contributions
ディ-15.984
-14.729
akespeare-14.583
Counter-14.307
-13.656
Token.)
Token ID2014
Feature activation+1.01e+00

Pos logit contributions
zie+7.679
yg+6.515
aptic+5.657
gain+5.652
grab+5.603
Neg logit contributions
ディ-11.359
-10.761
Counter-10.711
akespeare-10.350
-10.013
Token When
Token ID1649
Feature activation+7.22e-01

Pos logit contributions
zie+9.262
yg+8.190
grab+7.671
 SEE+7.625
md+7.400
Neg logit contributions
ディ-12.090
akespeare-11.276
-11.011
Counter-10.532
-10.323
Token picking
Token ID10868
Feature activation+7.28e-01

Pos logit contributions
zie+7.802
yg+7.062
grab+6.505
gain+6.272
enh+6.268
Neg logit contributions
ディ-8.284
-7.737
Counter-7.431
akespeare-7.350
-7.021
Token your
Token ID534
Feature activation+9.87e-01

Pos logit contributions
zie+7.510
yg+6.777
grab+6.015
gain+5.978
enh+5.955
Neg logit contributions
ディ-8.790
-8.185
Counter-7.959
akespeare-7.654
-7.432
TokenI
Token ID40
Feature activation+8.53e-01

Pos logit contributions
zie+8.778
yg+8.139
grab+7.510
verbal+7.430
gain+7.315
Neg logit contributions
ディ-8.526
akespeare-7.999
-7.832
 spoof-7.518
Counter-7.458
Token completely
Token ID3190
Feature activation-2.31e-01

Pos logit contributions
zie+8.596
yg+7.852
grab+7.088
 SEE+6.894
gain+6.821
Neg logit contributions
ディ-10.063
Counter-9.480
-9.428
akespeare-8.848
-8.737
Token understand
Token ID1833
Feature activation+2.22e-01

Pos logit contributions
ディ+2.505
+2.349
Counter+2.317
akespeare+2.206
+2.127
Neg logit contributions
zie-2.807
yg-2.553
aptic-2.280
uben-2.265
verbal-2.258
Token what
Token ID644
Feature activation+4.86e-01

Pos logit contributions
zie+2.342
yg+2.113
uben+1.839
verbal+1.837
grab+1.836
Neg logit contributions
ディ-2.786
-2.630
Counter-2.607
akespeare-2.487
-2.418
Token you
Token ID345
Feature activation+1.31e+00

Pos logit contributions
zie+5.822
yg+5.311
grab+4.754
verbal+4.717
uben+4.688
Neg logit contributions
ディ-5.353
-4.954
Counter-4.882
akespeare-4.657
-4.532
Token are
Token ID389
Feature activation+1.97e+00

Pos logit contributions
zie+9.334
yg+8.311
 SEE+7.322
verbal+6.988
gain+6.801
Neg logit contributions
ディ-17.653
Counter-16.608
-16.247
-15.676
-15.510
Token feeling
Token ID4203
Feature activation+9.11e-01

Pos logit contributions
zie+13.204
yg+12.562
grab+10.847
 SEE+10.769
 ­+10.720
Neg logit contributions
ディ-22.643
Counter-21.113
-21.019
-20.575
akespeare-19.836
Token right
Token ID826
Feature activation+7.72e-01

Pos logit contributions
zie+8.076
yg+7.227
verbal+6.218
uben+6.179
grab+6.156
Neg logit contributions
ディ-11.953
Counter-11.331
-11.193
-10.731
akespeare-10.720
Token now
Token ID783
Feature activation+4.93e-01

Pos logit contributions
zie+6.955
yg+6.344
grab+5.476
uben+5.458
verbal+5.405
Neg logit contributions
ディ-10.294
-9.589
Counter-9.582
akespeare-9.199
-9.032
Token because
Token ID780
Feature activation+2.18e-01

Pos logit contributions
zie+4.619
yg+4.109
verbal+3.504
uben+3.500
aptic+3.492
Neg logit contributions
ディ-6.703
-6.343
Counter-6.326
akespeare-6.061
-5.923
Token I
Token ID314
Feature activation+9.08e-02

Pos logit contributions
zie+2.548
yg+2.323
uben+2.055
grab+2.054
verbal+2.053
Neg logit contributions
ディ-2.484
-2.329
Counter-2.306
akespeare-2.189
-2.123
Tokenizes
Token ID4340
Feature activation+1.27e+00

Pos logit contributions
zie+5.637
yg+5.171
uben+4.619
grab+4.613
verbal+4.613
Neg logit contributions
ディ-4.701
-4.391
Counter-4.338
akespeare-4.100
-3.958
Token the
Token ID262
Feature activation+1.11e+00

Pos logit contributions
zie+12.056
yg+10.443
gain+10.320
enh+9.525
grab+9.450
Neg logit contributions
ディ-13.711
-13.078
akespeare-12.987
Counter-12.780
-12.384
Token body
Token ID1767
Feature activation+1.05e+00

Pos logit contributions
zie+11.226
yg+9.920
gain+9.603
enh+9.267
grab+8.969
Neg logit contributions
ディ-11.564
-11.007
akespeare-10.812
Counter-10.785
-10.361
Token
Token ID447
Feature activation+4.56e-01

Pos logit contributions
zie+10.379
yg+9.038
gain+8.487
grab+8.340
enh+8.107
Neg logit contributions
ディ-12.426
-11.601
akespeare-11.442
Counter-11.400
-11.036
Token
Token ID247
Feature activation+1.44e+00

Pos logit contributions
zie+4.085
yg+3.668
gain+3.221
uben+3.193
grab+3.154
Neg logit contributions
ディ-6.183
-5.806
Counter-5.752
akespeare-5.679
Doctor-5.412
Tokens
Token ID82
Feature activation+1.97e+00

Pos logit contributions
zie+13.118
gain+11.559
yg+11.499
enh+10.857
grab+10.703
Neg logit contributions
ディ-15.567
akespeare-15.002
-14.621
Counter-14.320
-14.067
Token natural
Token ID3288
Feature activation+1.47e+00

Pos logit contributions
zie+16.753
gain+14.926
yg+14.319
 ­+14.206
enh+13.761
Neg logit contributions
ディ-16.951
akespeare-16.609
-16.170
Counter-15.722
-15.304
Token shifts
Token ID15381
Feature activation+3.35e-01

Pos logit contributions
zie+14.443
gain+11.982
yg+11.801
enh+11.591
grab+11.457
Neg logit contributions
ディ-14.233
akespeare-13.746
-13.517
Counter-13.269
-12.915
Token between
Token ID1022
Feature activation+6.08e-01

Pos logit contributions
zie+3.487
yg+3.137
verbal+2.727
grab+2.727
aptic+2.726
Neg logit contributions
ディ-4.262
-4.014
Counter-3.974
akespeare-3.803
-3.694
Token sympathetic
Token ID22335
Feature activation+1.88e-01

Pos logit contributions
zie+6.566
yg+5.863
gain+5.316
uben+5.221
enh+5.211
Neg logit contributions
ディ-7.113
-6.753
Counter-6.691
akespeare-6.478
-6.218
Token (
Token ID357
Feature activation+8.15e-01

Pos logit contributions
zie+2.093
yg+1.900
uben+1.671
verbal+1.664
grab+1.664
Neg logit contributions
ディ-2.249
-2.119
Counter-2.097
akespeare-1.993
-1.933
Token of
Token ID286
Feature activation+4.92e-02

Pos logit contributions
zie+6.461
yg+5.943
gain+5.274
verbal+5.189
uben+5.173
Neg logit contributions
ディ-6.909
-6.402
Counter-6.378
akespeare-6.173
-5.854
Token what
Token ID644
Feature activation+2.14e-01

Pos logit contributions
zie+0.602
yg+0.555
verbal+0.493
aptic+0.493
grab+0.492
Neg logit contributions
ディ-0.542
-0.502
Counter-0.494
akespeare-0.465
-0.453
Token you
Token ID345
Feature activation+9.26e-01

Pos logit contributions
zie+2.612
yg+2.388
uben+2.125
grab+2.122
verbal+2.121
Neg logit contributions
ディ-2.338
-2.189
Counter-2.165
akespeare-2.049
-1.981
Token
Token ID447
Feature activation+3.33e-01

Pos logit contributions
zie+8.916
yg+7.648
gain+6.927
grab+6.828
dl+6.774
Neg logit contributions
ディ-10.934
Counter-10.532
-10.337
akespeare-10.294
-9.720
Token
Token ID247
Feature activation+6.24e-01

Pos logit contributions
zie+3.856
yg+3.529
uben+3.142
grab+3.122
gain+3.120
Neg logit contributions
ディ-3.767
-3.523
Counter-3.499
akespeare-3.350
-3.199
Tokenre
Token ID260
Feature activation+1.96e+00

Pos logit contributions
zie+4.936
yg+4.325
gain+3.605
grab+3.596
enh+3.573
Neg logit contributions
ディ-9.093
-8.709
Counter-8.707
akespeare-8.496
-8.248
Token looking
Token ID2045
Feature activation+1.10e+00

Pos logit contributions
zie+17.455
grab+15.264
gain+15.187
yg+15.149
enh+14.584
Neg logit contributions
ディ-16.913
akespeare-16.335
Counter-15.768
-15.766
acters-15.561
Token for
Token ID329
Feature activation-2.32e-01

Pos logit contributions
zie+7.454
yg+6.029
grab+4.975
enh+4.874
gain+4.859
Neg logit contributions
ディ-16.843
Counter-15.708
-15.702
akespeare-15.287
-15.140
Token in
Token ID287
Feature activation+1.31e-01

Pos logit contributions
ディ+2.681
+2.552
Counter+2.509
akespeare+2.343
+2.312
Neg logit contributions
zie-2.549
yg-2.339
grab-2.070
uben-2.059
verbal-2.054
Token the
Token ID262
Feature activation+2.21e-01

Pos logit contributions
zie+1.606
yg+1.471
uben+1.308
verbal+1.308
aptic+1.307
Neg logit contributions
ディ-1.432
-1.340
Counter-1.323
akespeare-1.251
-1.212
Token mountains
Token ID12269
Feature activation-3.76e-01

Pos logit contributions
zie+2.648
yg+2.412
uben+2.144
verbal+2.139
gain+2.136
Neg logit contributions
ディ-2.463
-2.299
Counter-2.282
akespeare-2.170
-2.085
Token where
Token ID810
Feature activation+3.26e-01

Pos logit contributions
zie+4.673
yg+4.141
verbal+3.648
grab+3.619
aptic+3.600
Neg logit contributions
ディ-5.978
-5.671
Counter-5.599
akespeare-5.401
-5.220
Token he
Token ID339
Feature activation+3.55e-01

Pos logit contributions
zie+4.091
yg+3.748
uben+3.347
verbal+3.345
grab+3.345
Neg logit contributions
ディ-3.436
-3.207
Counter-3.170
akespeare-3.000
-2.895
Token admitted
Token ID6848
Feature activation+7.46e-01

Pos logit contributions
zie+3.673
yg+3.293
grab+2.881
verbal+2.870
uben+2.853
Neg logit contributions
ディ-4.503
-4.257
Counter-4.203
akespeare-4.023
-3.903
Token what
Token ID644
Feature activation+2.44e-01

Pos logit contributions
zie+7.382
yg+6.755
grab+5.870
verbal+5.810
enh+5.714
Neg logit contributions
ディ-9.265
-8.850
Counter-8.770
akespeare-8.375
-8.171
Token he
Token ID339
Feature activation+1.32e+00

Pos logit contributions
zie+3.041
yg+2.789
grab+2.485
uben+2.483
verbal+2.482
Neg logit contributions
ディ-2.602
-2.432
Counter-2.403
akespeare-2.274
-2.194
Token had
Token ID550
Feature activation+1.95e+00

Pos logit contributions
zie+9.949
yg+8.909
grab+7.892
 ­+7.459
kh+7.406
Neg logit contributions
ディ-16.802
-16.143
Counter-15.944
akespeare-15.116
acters-14.850
Token done
Token ID1760
Feature activation+5.70e-01

Pos logit contributions
zie+13.117
yg+12.608
 ­+11.433
grab+11.053
verbal+10.644
Neg logit contributions
ディ-20.580
-20.393
Counter-20.098
akespeare-18.885
-18.524
Token to
Token ID284
Feature activation+5.03e-01

Pos logit contributions
zie+5.578
yg+5.036
verbal+4.393
uben+4.367
grab+4.355
Neg logit contributions
ディ-7.307
-6.939
Counter-6.933
akespeare-6.578
-6.419
Token a
Token ID257
Feature activation+4.44e-01

Pos logit contributions
zie+5.920
yg+5.330
uben+4.751
grab+4.741
verbal+4.730
Neg logit contributions
ディ-5.619
-5.235
Counter-5.201
akespeare-4.965
-4.787
Token police
Token ID1644
Feature activation+5.79e-01

Pos logit contributions
zie+4.724
yg+4.219
verbal+3.764
uben+3.699
grab+3.693
Neg logit contributions
ディ-5.388
Counter-5.124
-5.113
akespeare-4.922
-4.716
Token officer
Token ID3818
Feature activation+4.35e-01

Pos logit contributions
zie+6.022
yg+5.334
verbal+4.746
grab+4.688
uben+4.635
Neg logit contributions
ディ-7.262
-6.813
Counter-6.725
akespeare-6.571
-6.269
TokenX
Token ID55
Feature activation+1.58e+00

Pos logit contributions
zie+13.847
yg+13.344
md+11.591
dl+11.510
kh+11.263
Neg logit contributions
ディ-13.241
-12.294
akespeare-11.705
Counter-11.594
-11.520
Tokenw
Token ID86
Feature activation+1.48e+00

Pos logit contributions
zie+16.332
yg+15.810
md+13.791
dl+13.661
kh+13.442
Neg logit contributions
ディ-14.488
-13.322
akespeare-12.800
-12.582
Counter-12.479
Token2
Token ID17
Feature activation+1.86e+00

Pos logit contributions
zie+15.518
yg+14.676
md+12.955
dl+12.740
kh+12.578
Neg logit contributions
ディ-13.917
-12.841
akespeare-12.391
Counter-12.177
-12.122
Tokencv
Token ID33967
Feature activation+1.63e+00

Pos logit contributions
zie+18.608
yg+18.183
md+15.979
dl+15.814
kh+15.590
Neg logit contributions
ディ-15.271
-14.126
akespeare-13.715
 cloaked-13.320
-13.242
Tokenh
Token ID71
Feature activation+1.76e+00

Pos logit contributions
zie+16.796
yg+16.047
md+14.201
dl+13.931
kh+13.913
Neg logit contributions
ディ-14.522
-13.429
akespeare-12.931
 cloaked-12.685
-12.623
Token4
Token ID19
Feature activation+1.94e+00

Pos logit contributions
zie+17.759
yg+16.856
md+15.093
dl+14.902
kh+14.791
Neg logit contributions
ディ-15.200
-14.010
akespeare-13.375
-13.242
 cloaked-13.210
Tokenz
Token ID89
Feature activation+1.67e+00

Pos logit contributions
zie+18.561
yg+18.204
md+15.719
dl+15.639
kh+15.378
Neg logit contributions
ディ-16.464
-15.044
akespeare-14.770
acters-14.302
 cloaked-14.297
Token
Token ID851
Feature activation+9.31e-01

Pos logit contributions
zie+13.610
yg+13.093
md+10.715
dl+10.692
grab+10.630
Neg logit contributions
ディ-18.527
-17.206
akespeare-16.688
Counter-16.550
-16.036
Token Sonia
Token ID20733
Feature activation+1.12e-01

Pos logit contributions
zie+10.106
yg+9.183
uben+8.091
grab+8.080
gain+7.982
Neg logit contributions
ディ-10.189
-9.578
Counter-9.382
akespeare-9.081
-8.712
Token R
Token ID371
Feature activation+2.26e-01

Pos logit contributions
zie+1.236
yg+1.121
aptic+0.990
verbal+0.989
grab+0.981
Neg logit contributions
ディ-1.352
-1.278
Counter-1.270
akespeare-1.198
-1.168
Tokeninc
Token ID1939
Feature activation+4.04e-01

Pos logit contributions
zie+2.426
yg+2.190
verbal+1.920
grab+1.917
uben+1.912
Neg logit contributions
ディ-2.778
-2.621
Counter-2.599
akespeare-2.484
-2.403
Token the
Token ID262
Feature activation+7.10e-02

Pos logit contributions
ディ+3.999
+3.700
Counter+3.673
akespeare+3.398
Doctor+3.304
Neg logit contributions
zie-5.050
yg-4.545
verbal-4.177
aptic-4.149
uben-4.104
Token other
Token ID584
Feature activation+3.47e-01

Pos logit contributions
zie+0.868
yg+0.794
uben+0.707
verbal+0.706
grab+0.706
Neg logit contributions
ディ-0.773
-0.724
Counter-0.716
akespeare-0.677
-0.655
Token provinces
Token ID17812
Feature activation+6.95e-01

Pos logit contributions
zie+4.055
yg+3.741
verbal+3.287
grab+3.281
uben+3.262
Neg logit contributions
ディ-3.901
-3.639
Counter-3.630
akespeare-3.429
-3.320
Token and
Token ID290
Feature activation+3.31e-05

Pos logit contributions
zie+5.912
yg+5.327
gain+4.568
grab+4.549
uben+4.449
Neg logit contributions
ディ-9.630
-9.134
Counter-8.916
akespeare-8.575
-8.496
Token utilities
Token ID20081
Feature activation+4.93e-01

Pos logit contributions
zie+0.000
yg+0.000
verbal+0.000
uben+0.000
aptic+0.000
Neg logit contributions
ディ-0.000
-0.000
Counter-0.000
akespeare-0.000
-0.000
Token are
Token ID389
Feature activation+1.92e+00

Pos logit contributions
zie+4.647
yg+4.194
grab+3.625
gain+3.606
uben+3.563
Neg logit contributions
ディ-6.555
-6.201
Counter-6.110
akespeare-5.854
-5.738
Token paying
Token ID5989
Feature activation+9.59e-02

Pos logit contributions
zie+13.067
yg+12.588
grab+11.744
 ­+11.228
 SEE+11.064
Neg logit contributions
ディ-21.022
-19.996
Counter-19.082
acters-18.849
akespeare-18.828
Token,"
Token ID553
Feature activation+4.82e-01

Pos logit contributions
zie+1.010
yg+0.908
uben+0.793
verbal+0.790
grab+0.788
Neg logit contributions
ディ-1.207
-1.144
Counter-1.132
akespeare-1.078
-1.042
Token Vel
Token ID17378
Feature activation+6.46e-01

Pos logit contributions
zie+4.646
yg+4.148
grab+3.538
uben+3.526
verbal+3.525
Neg logit contributions
ディ-6.458
-6.120
Counter-6.060
akespeare-5.821
-5.647
Tokenie
Token ID494
Feature activation+1.58e-01

Pos logit contributions
zie+4.355
yg+3.442
grab+2.540
enh+2.501
md+2.416
Neg logit contributions
ディ-11.283
-10.447
-10.000
akespeare-9.925
Counter-9.919
Token said
Token ID531
Feature activation+5.97e-02

Pos logit contributions
zie+1.660
yg+1.491
verbal+1.321
aptic+1.317
uben+1.313
Neg logit contributions
ディ-1.963
-1.882
Counter-1.858
akespeare-1.762
-1.733
Token B
Token ID347
Feature activation+2.73e-01

Pos logit contributions
zie+1.988
yg+1.810
uben+1.600
grab+1.598
verbal+1.598
Neg logit contributions
ディ-1.958
-1.841
Counter-1.821
akespeare-1.728
-1.674
TokenAM
Token ID2390
Feature activation+1.82e-01

Pos logit contributions
zie+3.072
yg+2.788
uben+2.451
grab+2.448
verbal+2.447
Neg logit contributions
ディ-3.246
-3.053
Counter-3.024
akespeare-2.874
-2.788
Token!
Token ID0
Feature activation-1.87e-01

Pos logit contributions
zie+1.927
yg+1.736
uben+1.518
verbal+1.507
grab+1.507
Neg logit contributions
ディ-2.282
-2.163
Counter-2.136
akespeare-2.036
-1.987
Token What
Token ID1867
Feature activation+3.28e-01

Pos logit contributions
ディ+2.157
+2.061
Counter+2.030
akespeare+1.909
+1.878
Neg logit contributions
zie-2.126
yg-1.916
uben-1.702
aptic-1.681
verbal-1.678
Token you
Token ID345
Feature activation+1.31e+00

Pos logit contributions
zie+3.586
yg+3.258
verbal+2.859
grab+2.857
uben+2.846
Neg logit contributions
ディ-3.966
-3.713
Counter-3.665
akespeare-3.504
-3.405
Token were
Token ID547
Feature activation+1.90e+00

Pos logit contributions
zie+10.529
yg+9.626
 SEE+9.378
gain+8.650
grab+8.545
Neg logit contributions
ディ-15.269
-14.601
Counter-14.385
akespeare-13.919
acters-13.811
Token seeking
Token ID6095
Feature activation+1.65e-01

Pos logit contributions
zie+14.014
yg+13.710
 SEE+13.279
grab+13.022
 ­+12.633
Neg logit contributions
ディ-18.197
-17.798
akespeare-16.845
Counter-16.719
acters-16.296
Token,
Token ID11
Feature activation+3.40e-01

Pos logit contributions
zie+1.766
yg+1.591
uben+1.390
grab+1.383
verbal+1.382
Neg logit contributions
ディ-2.057
-1.946
Counter-1.926
akespeare-1.835
-1.786
Token you
Token ID345
Feature activation+9.20e-01

Pos logit contributions
zie+3.745
yg+3.424
grab+3.031
verbal+3.015
uben+3.003
Neg logit contributions
ディ-4.063
-3.788
Counter-3.772
akespeare-3.600
-3.460
Token held
Token ID2714
Feature activation+6.75e-01

Pos logit contributions
zie+8.640
yg+7.826
 SEE+7.121
grab+6.989
gain+6.924
Neg logit contributions
ディ-11.120
-10.394
Counter-10.301
akespeare-9.858
acters-9.634
Token.
Token ID13
Feature activation+9.62e-02

Pos logit contributions
zie+6.667
yg+6.123
grab+5.311
uben+5.237
aptic+5.235
Neg logit contributions
ディ-8.382
-7.879
Counter-7.747
akespeare-7.577
-7.178
Token reward
Token ID6721
Feature activation+1.18e+00

Pos logit contributions
zie+13.859
yg+12.324
grab+11.992
 SEE+11.627
aptic+11.528
Neg logit contributions
ディ-13.233
-12.209
Counter-11.824
akespeare-11.570
-11.251
Token (
Token ID357
Feature activation+1.50e+00

Pos logit contributions
zie+10.587
yg+9.670
grab+8.876
aptic+8.741
verbal+8.736
Neg logit contributions
ディ-14.173
-12.981
akespeare-12.613
Counter-12.430
-12.370
Tokenwater
Token ID7050
Feature activation+5.54e-01

Pos logit contributions
zie+16.863
yg+15.522
verbal+15.170
grab+15.134
enh+15.094
Neg logit contributions
ディ-11.833
akespeare-11.211
-10.936
acters-10.192
 cloaked-9.958
Token).
Token ID737
Feature activation+1.28e+00

Pos logit contributions
zie+5.677
yg+5.080
grab+4.558
aptic+4.505
gain+4.479
Neg logit contributions
ディ-6.954
-6.532
Counter-6.400
akespeare-6.192
-6.029
Token They
Token ID1119
Feature activation+1.79e+00

Pos logit contributions
zie+11.144
yg+10.332
grab+9.607
md+9.595
 SEE+9.534
Neg logit contributions
ディ-14.340
akespeare-13.156
-12.926
Counter-11.995
 cloaked-11.864
Token were
Token ID547
Feature activation+1.86e+00

Pos logit contributions
zie+12.911
yg+12.158
 SEE+11.031
grab+10.950
enh+10.915
Neg logit contributions
ディ-19.820
-17.653
Counter-17.274
akespeare-16.700
-16.477
Token then
Token ID788
Feature activation+1.64e+00

Pos logit contributions
zie+13.079
yg+12.574
verbal+11.841
 subsequently+11.821
grab+11.820
Neg logit contributions
ディ-19.679
-18.042
akespeare-17.525
Counter-17.076
acters-16.443
Token habit
Token ID7947
Feature activation+1.66e-01

Pos logit contributions
zie+13.376
yg+12.499
verbal+11.873
grab+11.754
enh+11.627
Neg logit contributions
ディ-17.327
-15.615
akespeare-15.041
Counter-14.919
acters-14.360
Tokenuated
Token ID6605
Feature activation+6.61e-01

Pos logit contributions
zie+2.239
yg+2.047
uben+1.852
aptic+1.846
verbal+1.838
Neg logit contributions
ディ-1.610
-1.503
Counter-1.487
akespeare-1.386
-1.334
Token to
Token ID284
Feature activation+1.38e+00

Pos logit contributions
zie+6.420
yg+5.806
verbal+5.043
grab+5.022
aptic+4.996
Neg logit contributions
ディ-8.643
-8.041
Counter-7.982
akespeare-7.659
-7.458
Token sitting
Token ID5586
Feature activation+6.07e-01

Pos logit contributions
zie+13.225
yg+11.534
 SEE+11.344
grab+11.258
enh+11.003
Neg logit contributions
ディ-14.582
-13.681
Counter-12.798
akespeare-12.698
-12.595
Tokenm
Token ID76
Feature activation+1.12e+00

Pos logit contributions
zie+13.878
yg+13.297
md+11.600
dl+11.498
kh+11.247
Neg logit contributions
ディ-13.332
-12.362
akespeare-11.840
Counter-11.699
-11.592
TokenB
Token ID33
Feature activation+1.24e+00

Pos logit contributions
zie+11.913
yg+11.053
md+9.611
kh+9.450
dl+9.429
Neg logit contributions
ディ-11.814
-11.050
Counter-10.594
akespeare-10.529
-10.395
Tokenz
Token ID89
Feature activation+1.48e+00

Pos logit contributions
zie+12.984
yg+12.230
md+10.742
dl+10.649
kh+10.509
Neg logit contributions
ディ-12.794
-12.045
akespeare-11.455
Counter-11.394
-11.147
Token5
Token ID20
Feature activation+1.70e+00

Pos logit contributions
zie+15.409
yg+14.834
md+12.974
dl+12.763
kh+12.620
Neg logit contributions
ディ-13.886
-12.964
akespeare-12.391
Counter-12.197
-12.131
Tokens
Token ID82
Feature activation+1.34e+00

Pos logit contributions
zie+17.398
yg+16.863
md+14.753
dl+14.550
kh+14.386
Neg logit contributions
ディ-14.820
-13.717
akespeare-13.283
 cloaked-12.931
-12.798
Token8
Token ID23
Feature activation+1.86e+00

Pos logit contributions
zie+13.787
yg+12.787
md+11.204
dl+11.065
kh+10.996
Neg logit contributions
ディ-13.464
-12.589
akespeare-12.100
Counter-12.018
-11.743
Token
Token ID851
Feature activation+7.88e-01

Pos logit contributions
zie+17.495
yg+16.684
md+14.469
dl+14.233
grab+13.868
Neg logit contributions
ディ-16.669
-15.144
akespeare-15.052
Counter-14.884
acters-14.477
Token General
Token ID3611
Feature activation+4.78e-01

Pos logit contributions
zie+8.137
yg+7.329
uben+6.307
md+6.219
enh+6.216
Neg logit contributions
ディ-9.456
Counter-8.908
-8.868
akespeare-8.580
acters-8.157
Token Flynn
Token ID14654
Feature activation+4.54e-01

Pos logit contributions
zie+4.713
yg+4.213
uben+3.624
verbal+3.619
grab+3.616
Neg logit contributions
ディ-6.322
-5.990
Counter-5.936
akespeare-5.685
-5.525
Token (@
Token ID4275
Feature activation+7.63e-01

Pos logit contributions
zie+4.566
yg+4.094
uben+3.537
grab+3.532
verbal+3.530
Neg logit contributions
ディ-5.936
-5.623
Counter-5.569
akespeare-5.320
-5.180
TokenGen
Token ID13746
Feature activation-1.01e-01

Pos logit contributions
zie+8.385
yg+7.762
enh+6.761
grab+6.751
verbal+6.743
Neg logit contributions
ディ-8.472
-8.075
akespeare-7.917
Counter-7.824
 cloaked-7.429
Token/
Token ID14
Feature activation+1.83e+00

Pos logit contributions
zie+2.657
yg+2.418
grab+1.934
md+1.931
gain+1.863
Neg logit contributions
ディ-6.294
-6.028
akespeare-6.006
Counter-5.789
-5.691
Tokenz
Token ID89
Feature activation+1.40e+00

Pos logit contributions
zie+18.422
yg+18.073
md+15.875
dl+15.586
kh+15.373
Neg logit contributions
ディ-15.089
-13.795
akespeare-13.427
 cloaked-13.127
-13.063
TokenJ
Token ID41
Feature activation+1.32e+00

Pos logit contributions
zie+14.527
yg+14.077
md+12.257
dl+12.111
enh+11.860
Neg logit contributions
ディ-13.614
-12.668
akespeare-12.144
Counter-11.942
-11.910
TokenX
Token ID55
Feature activation+1.58e+00

Pos logit contributions
zie+13.847
yg+13.344
md+11.591
dl+11.510
kh+11.263
Neg logit contributions
ディ-13.241
-12.294
akespeare-11.705
Counter-11.594
-11.520
Tokenw
Token ID86
Feature activation+1.48e+00

Pos logit contributions
zie+16.332
yg+15.810
md+13.791
dl+13.661
kh+13.442
Neg logit contributions
ディ-14.488
-13.322
akespeare-12.800
-12.582
Counter-12.479
Token2
Token ID17
Feature activation+1.86e+00

Pos logit contributions
zie+15.518
yg+14.676
md+12.955
dl+12.740
kh+12.578
Neg logit contributions
ディ-13.917
-12.841
akespeare-12.391
Counter-12.177
-12.122
Tokencv
Token ID33967
Feature activation+1.63e+00

Pos logit contributions
zie+18.608
yg+18.183
md+15.979
dl+15.814
kh+15.590
Neg logit contributions
ディ-15.271
-14.126
akespeare-13.715
 cloaked-13.320
-13.242
Tokenh
Token ID71
Feature activation+1.76e+00

Pos logit contributions
zie+16.796
yg+16.047
md+14.201
dl+13.931
kh+13.913
Neg logit contributions
ディ-14.522
-13.429
akespeare-12.931
 cloaked-12.685
-12.623
Token4
Token ID19
Feature activation+1.94e+00

Pos logit contributions
zie+17.759
yg+16.856
md+15.093
dl+14.902
kh+14.791
Neg logit contributions
ディ-15.200
-14.010
akespeare-13.375
-13.242
 cloaked-13.210
Tokenz
Token ID89
Feature activation+1.67e+00

Pos logit contributions
zie+18.561
yg+18.204
md+15.719
dl+15.639
kh+15.378
Neg logit contributions
ディ-16.464
-15.044
akespeare-14.770
acters-14.302
 cloaked-14.297
Token
Token ID851
Feature activation+9.31e-01

Pos logit contributions
zie+13.610
yg+13.093
md+10.715
dl+10.692
grab+10.630
Neg logit contributions
ディ-18.527
-17.206
akespeare-16.688
Counter-16.550
-16.036
Token or
Token ID393
Feature activation-1.43e+00

Pos logit contributions
Counter+6.422
ディ+6.310
+6.211
 spoof+5.713
akespeare+5.697
Neg logit contributions
zie-6.677
yg-6.046
grab-5.377
aptic-5.349
uben-5.342
Token more
Token ID517
Feature activation-7.68e-01

Pos logit contributions
ディ+10.952
Counter+10.688
 spoof+10.459
+10.451
 fusion+9.672
Neg logit contributions
zie-15.578
yg-13.873
aptic-12.990
verbal-12.916
grab-12.729
Token mitigation
Token ID33585
Feature activation-5.08e-01

Pos logit contributions
Counter+7.553
ディ+7.330
+7.246
 spoof+6.768
 cloaked+6.688
Neg logit contributions
zie-9.021
yg-8.029
aptic-7.322
grab-7.319
uben-7.319
Token systems
Token ID3341
Feature activation-9.98e-01

Pos logit contributions
ディ+5.460
Counter+5.345
+5.259
+4.848
 spoof+4.839
Neg logit contributions
zie-5.815
yg-5.171
aptic-4.708
uben-4.700
grab-4.695
Token to
Token ID284
Feature activation-1.16e+00

Pos logit contributions
ディ+9.797
Counter+9.692
+9.408
 cloaked+9.009
 spoof+8.950
Neg logit contributions
zie-10.626
yg-9.758
verbal-8.703
uben-8.635
aptic-8.517
Token protect
Token ID1805
Feature activation-2.49e+00

Pos logit contributions
Counter+10.272
 spoof+9.518
+9.359
ディ+9.358
 foil+9.065
Neg logit contributions
zie-13.052
yg-12.205
verbal-11.060
aptic-11.059
grab-10.880
Token their
Token ID511
Feature activation-1.48e+00

Pos logit contributions
Counter+16.676
 guests+15.689
+15.190
 spoof+15.089
 VPN+15.007
Neg logit contributions
zie-22.234
yg-19.380
uben-18.056
aptic-17.965
verbal-17.682
Token networks
Token ID7686
Feature activation-1.72e+00

Pos logit contributions
Counter+12.175
+11.791
ディ+11.632
 guests+11.586
 spoof+11.148
Neg logit contributions
zie-16.258
yg-14.544
grab-13.281
aptic-13.087
verbal-13.065
Token and
Token ID290
Feature activation-1.40e+00

Pos logit contributions
Counter+14.157
+13.539
ディ+13.536
 guests+12.810
 spoof+12.686
Neg logit contributions
zie-16.069
yg-15.178
uben-13.669
grab-13.476
verbal-13.401
Token online
Token ID2691
Feature activation-8.88e-01

Pos logit contributions
Counter+12.892
 guests+12.202
+12.170
ディ+11.929
 spoof+11.657
Neg logit contributions
zie-14.200
yg-12.918
verbal-11.949
aptic-11.837
grab-11.550
Token services
Token ID2594
Feature activation-1.58e+00

Pos logit contributions
ディ+8.775
+8.601
Counter+8.567
 guests+8.042
+7.816
Neg logit contributions
zie-10.019
yg-9.109
uben-8.151
grab-8.123
aptic-8.112
Token to
Token ID284
Feature activation+2.17e-01

Pos logit contributions
zie+0.160
yg+0.145
aptic+0.126
uben+0.126
gain+0.125
Neg logit contributions
ディ-0.183
-0.174
Counter-0.172
akespeare-0.164
-0.158
Token replicate
Token ID24340
Feature activation-2.97e-01

Pos logit contributions
zie+2.420
yg+2.196
grab+1.935
uben+1.934
verbal+1.933
Neg logit contributions
ディ-2.602
-2.440
Counter-2.410
akespeare-2.285
-2.235
Token the
Token ID262
Feature activation-8.19e-02

Pos logit contributions
ディ+3.200
Counter+2.989
+2.965
akespeare+2.766
Doctor+2.739
Neg logit contributions
zie-3.535
yg-3.221
uben-2.886
verbal-2.866
aptic-2.831
Token scene
Token ID3715
Feature activation-4.61e-01

Pos logit contributions
ディ+0.926
+0.871
Counter+0.865
akespeare+0.819
+0.792
Neg logit contributions
zie-0.960
yg-0.876
uben-0.775
verbal-0.774
aptic-0.772
Token as
Token ID355
Feature activation-7.16e-01

Pos logit contributions
ディ+5.051
+4.860
Counter+4.782
akespeare+4.495
 spoof+4.385
Neg logit contributions
zie-5.250
yg-4.745
uben-4.218
aptic-4.154
gain-4.152
Token closely
Token ID7173
Feature activation-2.07e+00

Pos logit contributions
ディ+7.474
+6.979
Counter+6.967
 guests+6.671
 cloaked+6.562
Neg logit contributions
zie-8.063
yg-7.288
verbal-6.542
uben-6.509
gain-6.451
Token as
Token ID355
Feature activation-3.15e-01

Pos logit contributions
ディ+11.867
+11.674
Counter+11.204
 guests+11.026
 spoof+10.649
Neg logit contributions
zie-20.914
yg-19.182
uben-17.027
verbal-16.982
ggle-16.885
Token possible
Token ID1744
Feature activation-6.41e-01

Pos logit contributions
ディ+2.773
+2.545
Counter+2.523
akespeare+2.340
Doctor+2.285
Neg logit contributions
zie-4.396
yg-4.061
verbal-3.664
uben-3.638
gain-3.612
Token,
Token ID11
Feature activation+7.45e-01

Pos logit contributions
ディ+6.412
+6.226
Counter+6.060
akespeare+5.692
 spoof+5.634
Neg logit contributions
zie-7.374
yg-6.661
uben-5.981
verbal-5.943
gain-5.920
Token and
Token ID290
Feature activation+2.86e-01

Pos logit contributions
zie+7.614
yg+6.917
grab+6.215
enh+6.152
gain+6.069
Neg logit contributions
ディ-8.884
-8.233
Counter-8.137
akespeare-7.879
-7.522
Token Murd
Token ID20305
Feature activation+3.93e-01

Pos logit contributions
zie+3.166
yg+2.886
grab+2.546
aptic+2.526
uben+2.523
Neg logit contributions
ディ-3.421
-3.211
Counter-3.169
akespeare-3.023
-2.934
Token.
Token ID13
Feature activation+2.74e-01

Pos logit contributions
zie+1.460
yg+1.328
grab+1.138
md+1.111
gain+1.105
Neg logit contributions
ディ-2.262
-2.149
akespeare-2.102
Counter-2.082
-2.000
Tokenx
Token ID87
Feature activation+1.49e-01

Pos logit contributions
zie+1.989
yg+1.769
grab+1.439
gain+1.407
enh+1.381
Neg logit contributions
ディ-4.361
-4.191
akespeare-4.130
Counter-4.075
-3.946
Token.
Token ID13
Feature activation+1.22e-01

Pos logit contributions
zie+1.405
yg+1.269
grab+1.100
md+1.073
gain+1.062
Neg logit contributions
ディ-2.073
-1.988
Counter-1.929
akespeare-1.922
-1.840
Token134
Token ID19880
Feature activation-5.18e-01

Pos logit contributions
zie+1.536
yg+1.408
grab+1.270
gain+1.254
uben+1.248
Neg logit contributions
ディ-1.300
-1.240
Counter-1.196
akespeare-1.166
-1.110
Token 11
Token ID1367
Feature activation-7.00e-01

Pos logit contributions
ディ+5.254
Counter+4.917
+4.881
+4.471
1963+4.461
Neg logit contributions
zie-6.321
yg-5.784
aptic-5.333
uben-5.244
verbal-5.182
Token35
Token ID2327
Feature activation-2.00e+00

Pos logit contributions
ディ+6.184
+5.616
Counter+5.424
1963+5.375
9999+5.333
Neg logit contributions
zie-9.569
yg-8.546
uben-8.157
aptic-8.057
 newsp-7.913
Token487
Token ID35133
Feature activation+4.05e-02

Pos logit contributions
ディ+13.064
+12.354
1963+11.832
9999+11.790
554+11.585
Neg logit contributions
zie-19.884
 newsp-18.071
yg-17.539
aptic-16.977
 emanc-16.617
Token x
Token ID2124
Feature activation+4.23e-03

Pos logit contributions
zie+0.446
yg+0.407
grab+0.357
gain+0.352
verbal+0.350
Neg logit contributions
ディ-0.495
-0.464
Counter-0.455
akespeare-0.446
-0.425
Token.
Token ID13
Feature activation+1.38e-01

Pos logit contributions
zie+0.041
yg+0.037
grab+0.032
verbal+0.031
gain+0.031
Neg logit contributions
ディ-0.058
-0.054
Counter-0.053
akespeare-0.053
-0.050
Tokenx
Token ID87
Feature activation+2.43e-01

Pos logit contributions
zie+1.048
yg+0.936
grab+0.773
gain+0.755
verbal+0.741
Neg logit contributions
ディ-2.142
-2.052
Counter-2.011
akespeare-2.004
-1.925
Token.
Token ID13
Feature activation+3.61e-01

Pos logit contributions
zie+2.148
yg+1.968
grab+1.674
md+1.642
gain+1.618
Neg logit contributions
ディ-3.456
-3.290
akespeare-3.236
Counter-3.182
-3.062
Token foul
Token ID15626
Feature activation-1.95e-01

Pos logit contributions
ディ+6.465
+6.048
Counter+5.937
akespeare+5.821
 spoof+5.721
Neg logit contributions
zie-6.878
yg-6.232
verbal-5.562
aptic-5.552
uben-5.539
Token cal
Token ID2386
Feature activation+1.82e-02

Pos logit contributions
ディ+2.570
+2.430
Counter+2.407
akespeare+2.317
Doctor+2.241
Neg logit contributions
zie-1.927
yg-1.684
uben-1.484
aptic-1.469
grab-1.460
Tokenum
Token ID388
Feature activation-1.11e-01

Pos logit contributions
zie+0.185
yg+0.165
uben+0.144
aptic+0.143
verbal+0.142
Neg logit contributions
ディ-0.234
Counter-0.221
-0.220
akespeare-0.212
-0.204
Tokenny
Token ID3281
Feature activation-7.95e-01

Pos logit contributions
ディ+1.205
+1.135
Counter+1.135
akespeare+1.099
+1.043
Neg logit contributions
zie-1.310
yg-1.184
uben-1.082
verbal-1.051
aptic-1.050
Token was
Token ID373
Feature activation-1.34e+00

Pos logit contributions
ディ+7.980
+7.497
Counter+7.368
 cloaked+7.281
 spoof+7.237
Neg logit contributions
zie-9.100
yg-8.100
uben-7.468
aptic-7.369
gain-7.213
Token almost
Token ID2048
Feature activation-1.99e+00

Pos logit contributions
 cloaked+11.699
ディ+11.550
 spoof+11.246
 filtered+10.854
+10.815
Neg logit contributions
zie-14.628
yg-12.700
aptic-12.404
uben-12.249
enh-11.387
Token certainly
Token ID3729
Feature activation-1.83e+00

Pos logit contributions
 cloaked+14.219
 spoof+13.751
ディ+13.234
 concealed+13.183
 filtered+13.035
Neg logit contributions
zie-19.927
yg-17.590
aptic-17.576
ngth-16.590
uben-16.460
Token the
Token ID262
Feature activation-6.75e-01

Pos logit contributions
 cloaked+14.546
 spoof+14.208
 filtered+13.673
ディ+13.212
 concealed+12.981
Neg logit contributions
zie-18.429
uben-15.639
yg-15.363
aptic-14.785
grab-14.241
Token work
Token ID670
Feature activation-1.14e-01

Pos logit contributions
ディ+6.508
+6.059
Counter+5.996
 spoof+5.847
akespeare+5.837
Neg logit contributions
zie-8.250
yg-7.318
grab-6.755
uben-6.733
aptic-6.667
Token of
Token ID286
Feature activation-4.65e-01

Pos logit contributions
ディ+1.287
+1.213
Counter+1.194
akespeare+1.138
+1.097
Neg logit contributions
zie-1.323
yg-1.213
grab-1.074
uben-1.073
gain-1.068
Token Richard
Token ID6219
Feature activation-2.50e-01

Pos logit contributions
ディ+4.676
+4.311
Counter+4.310
akespeare+4.148
 spoof+4.063
Neg logit contributions
zie-5.720
yg-5.265
grab-4.757
uben-4.696
aptic-4.683
Token try
Token ID1949
Feature activation-5.01e-01

Pos logit contributions
zie+1.257
yg+1.139
grab+0.999
uben+0.998
verbal+0.996
Neg logit contributions
ディ-1.368
-1.288
Counter-1.271
akespeare-1.210
-1.176
Token and
Token ID290
Feature activation-6.26e-02

Pos logit contributions
ディ+4.807
Counter+4.531
+4.435
akespeare+4.070
 spoof+4.063
Neg logit contributions
zie-6.024
yg-5.662
verbal-5.179
uben-5.115
aptic-5.101
Token match
Token ID2872
Feature activation-1.17e-01

Pos logit contributions
ディ+0.739
+0.696
Counter+0.691
akespeare+0.657
+0.637
Neg logit contributions
zie-0.705
yg-0.639
verbal-0.563
aptic-0.562
uben-0.562
Token them
Token ID606
Feature activation-2.12e-01

Pos logit contributions
ディ+1.307
Counter+1.231
+1.224
akespeare+1.147
Doctor+1.114
Neg logit contributions
zie-1.376
yg-1.247
aptic-1.121
verbal-1.116
uben-1.114
Token as
Token ID355
Feature activation-3.50e-01

Pos logit contributions
ディ+2.557
+2.447
Counter+2.444
akespeare+2.315
+2.235
Neg logit contributions
zie-2.247
yg-2.014
aptic-1.800
uben-1.779
verbal-1.779
Token closely
Token ID7173
Feature activation-1.99e+00

Pos logit contributions
ディ+3.527
Counter+3.280
+3.271
akespeare+3.021
 spoof+2.948
Neg logit contributions
zie-4.375
yg-4.027
aptic-3.661
verbal-3.659
uben-3.617
Token as
Token ID355
Feature activation-5.42e-01

Pos logit contributions
ディ+11.299
Counter+11.055
+11.020
 spoof+10.291
 guests+10.184
Neg logit contributions
zie-19.854
yg-18.746
verbal-16.838
oled-16.837
md-16.833
Token possible
Token ID1744
Feature activation-4.43e-01

Pos logit contributions
ディ+4.245
Counter+3.905
+3.857
 spoof+3.530
akespeare+3.514
Neg logit contributions
zie-7.564
yg-7.140
verbal-6.528
aptic-6.491
grab-6.433
Token and
Token ID290
Feature activation+2.81e-01

Pos logit contributions
ディ+4.880
+4.672
Counter+4.663
akespeare+4.367
Doctor+4.269
Neg logit contributions
zie-4.832
yg-4.399
verbal-3.975
uben-3.934
aptic-3.911
Token will
Token ID481
Feature activation+3.69e-01

Pos logit contributions
zie+3.096
yg+2.807
grab+2.468
uben+2.459
gain+2.457
Neg logit contributions
ディ-3.411
-3.200
Counter-3.138
akespeare-3.010
-2.918
Token communicate
Token ID10996
Feature activation+1.16e-01

Pos logit contributions
zie+3.866
yg+3.484
gain+3.025
grab+3.022
enh+2.980
Neg logit contributions
ディ-4.671
-4.380
Counter-4.294
akespeare-4.113
-4.021
TokenThe
Token ID464
Feature activation-1.37e+00

Pos logit contributions
Counter+15.713
ディ+15.535
+14.877
Doctor+14.535
Filter+14.085
Neg logit contributions
zie-14.647
yg-13.885
aptic-12.904
uben-12.900
verbal-12.574
TokenBest
Token ID13014
Feature activation-1.65e+00

Pos logit contributions
Counter+10.592
ディ+10.513
Doctor+9.801
+9.571
Legend+9.201
Neg logit contributions
zie-14.844
yg-14.336
aptic-13.246
uben-12.870
gain-12.781
TokenT
Token ID51
Feature activation-2.92e-01

Pos logit contributions
ディ+16.264
Counter+15.864
+15.060
Doctor+14.756
Filter+14.318
Neg logit contributions
zie-13.423
yg-12.007
aptic-11.310
uben-11.240
verbal-11.108
Tokenreek
Token ID10316
Feature activation-9.93e-01

Pos logit contributions
ディ+3.575
+3.453
Counter+3.444
akespeare+3.302
Doctor+3.152
Neg logit contributions
zie-2.922
yg-2.562
verbal-2.358
gain-2.343
grab-2.316
TokenAs
Token ID1722
Feature activation-2.37e-01

Pos logit contributions
ディ+10.998
Counter+10.913
+10.537
Doctor+10.166
akespeare+9.846
Neg logit contributions
zie-8.554
yg-7.778
verbal-7.357
grab-7.002
aptic-6.977
Tokenyu
Token ID24767
Feature activation-1.98e+00

Pos logit contributions
ディ+3.124
Counter+3.045
+2.944
Doctor+2.840
akespeare+2.783
Neg logit contributions
zie-2.233
yg-2.037
verbal-1.732
grab-1.709
aptic-1.707
TokenK
Token ID42
Feature activation-8.25e-01

Pos logit contributions
Counter+17.803
ディ+17.364
Doctor+16.723
+16.719
Filter+15.953
Neg logit contributions
zie-14.939
yg-13.687
uben-12.792
verbal-12.761
aptic-12.684
Tokenong
Token ID506
Feature activation-1.73e+00

Pos logit contributions
Counter+6.303
+5.851
ディ+5.754
akespeare+5.527
1963+5.264
Neg logit contributions
zie-10.361
yg-9.635
verbal-9.502
aptic-9.179
gain-9.111
TokenCh
Token ID1925
Feature activation-1.38e-01

Pos logit contributions
Counter+15.661
ディ+14.354
Doctor+14.233
+13.947
Filter+13.404
Neg logit contributions
zie-14.786
yg-14.272
uben-13.499
 emanc-13.179
aptic-12.909
Tokenoya
Token ID23790
Feature activation-1.48e+00

Pos logit contributions
ディ+2.071
+1.990
Counter+1.985
akespeare+1.930
+1.861
Neg logit contributions
zie-1.077
yg-0.935
verbal-0.777
aptic-0.774
gain-0.772
TokenWeek
Token ID20916
Feature activation-7.87e-01

Pos logit contributions
Counter+14.147
ディ+13.769
Doctor+13.222
+13.043
Legend+12.559
Neg logit contributions
zie-13.351
yg-12.078
aptic-11.648
verbal-11.603
uben-11.574
Token.
Token ID13
Feature activation-4.74e-01

Pos logit contributions
ディ+2.852
+2.685
Counter+2.666
akespeare+2.500
+2.444
Neg logit contributions
zie-2.823
yg-2.601
uben-2.286
aptic-2.261
verbal-2.254
Token This
Token ID770
Feature activation+6.79e-02

Pos logit contributions
ディ+4.362
Counter+4.143
+4.136
+3.636
Doctor+3.573
Neg logit contributions
zie-6.302
yg-5.832
aptic-5.307
uben-5.265
grab-5.171
Token would
Token ID561
Feature activation+2.02e-01

Pos logit contributions
zie+0.802
yg+0.732
uben+0.650
verbal+0.648
aptic+0.647
Neg logit contributions
ディ-0.767
-0.724
Counter-0.714
akespeare-0.675
-0.655
Token save
Token ID3613
Feature activation-8.05e-01

Pos logit contributions
zie+2.240
yg+2.035
gain+1.781
grab+1.781
uben+1.775
Neg logit contributions
ディ-2.424
-2.270
Counter-2.245
akespeare-2.140
-2.085
Token 108
Token ID15495
Feature activation-7.19e-01

Pos logit contributions
ディ+7.934
+7.613
Counter+7.446
 guests+6.866
akespeare+6.823
Neg logit contributions
zie-9.312
yg-8.438
aptic-7.721
uben-7.658
verbal-7.422
Token,
Token ID11
Feature activation-1.98e+00

Pos logit contributions
ディ+7.290
+6.934
Counter+6.648
akespeare+6.249
 guests+6.090
Neg logit contributions
zie-8.343
yg-7.583
grab-6.969
uben-6.856
verbal-6.851
Token476
Token ID35435
Feature activation-4.03e-01

Pos logit contributions
ディ+14.222
+13.243
9999+12.915
514+12.797
Counter+12.243
Neg logit contributions
zie-18.113
yg-16.118
 newsp-15.951
verbal-15.141
uben-14.854
Token,
Token ID11
Feature activation-1.51e+00

Pos logit contributions
ディ+4.402
+4.123
Counter+4.030
akespeare+3.758
+3.693
Neg logit contributions
zie-4.738
yg-4.296
grab-3.911
uben-3.896
verbal-3.856
Token7
Token ID22
Feature activation+1.02e-01

Pos logit contributions
ディ+11.503
+10.582
9999+9.867
Counter+9.713
514+9.560
Neg logit contributions
zie-16.260
yg-14.732
verbal-13.681
 newsp-13.606
grab-13.339
Token88
Token ID3459
Feature activation-4.60e-01

Pos logit contributions
zie+1.200
yg+1.098
grab+0.974
verbal+0.972
gain+0.969
Neg logit contributions
ディ-1.158
-1.085
Counter-1.066
akespeare-1.022
-0.984
Token litres
Token ID49622
Feature activation-8.27e-01

Pos logit contributions
ディ+4.910
+4.649
Counter+4.566
akespeare+4.222
+4.168
Neg logit contributions
zie-5.390
yg-4.916
grab-4.470
uben-4.465
verbal-4.431
Token.
Token ID13
Feature activation+3.37e-01

Pos logit contributions
zie+0.377
yg+0.343
grab+0.295
gain+0.289
verbal+0.289
Neg logit contributions
ディ-0.541
-0.513
Counter-0.502
akespeare-0.498
-0.476
Tokenx
Token ID87
Feature activation+2.39e-01

Pos logit contributions
zie+2.317
yg+2.088
grab+1.674
gain+1.658
md+1.607
Neg logit contributions
ディ-5.414
-5.201
akespeare-5.146
Counter-5.063
-4.929
Token.
Token ID13
Feature activation+4.36e-01

Pos logit contributions
zie+2.090
yg+1.917
grab+1.639
gain+1.620
md+1.614
Neg logit contributions
ディ-3.403
-3.198
akespeare-3.148
Counter-3.139
-3.013
Token169
Token ID22172
Feature activation-7.72e-01

Pos logit contributions
zie+4.733
yg+4.273
gain+4.019
enh+3.941
grab+3.854
Neg logit contributions
ディ-5.315
Counter-4.756
-4.750
akespeare-4.578
-4.489
Token 11
Token ID1367
Feature activation-1.10e+00

Pos logit contributions
ディ+6.929
Counter+6.589
+6.443
9999+6.058
Doctor+5.944
Neg logit contributions
zie-9.654
yg-8.784
aptic-8.278
uben-8.163
verbal-7.993
Token66
Token ID2791
Feature activation-1.97e+00

Pos logit contributions
ディ+8.712
1963+7.938
9999+7.885
+7.870
Counter+7.577
Neg logit contributions
zie-14.699
yg-12.865
uben-12.583
aptic-12.507
 newsp-12.346
Token589
Token ID44169
Feature activation-1.04e-01

Pos logit contributions
ディ+13.796
+12.743
1963+12.193
9999+11.983
Counter+11.769
Neg logit contributions
zie-19.297
 newsp-17.278
yg-17.012
aptic-16.347
uben-16.103
Token x
Token ID2124
Feature activation-2.53e-01

Pos logit contributions
ディ+1.219
+1.145
Counter+1.127
akespeare+1.092
+1.044
Neg logit contributions
zie-1.192
yg-1.092
grab-0.964
verbal-0.953
gain-0.952
Token.
Token ID13
Feature activation-1.35e-01

Pos logit contributions
ディ+3.190
+3.014
Counter+2.984
akespeare+2.850
+2.768
Neg logit contributions
zie-2.661
yg-2.399
uben-2.085
grab-2.085
verbal-2.083
Tokenx
Token ID87
Feature activation-1.01e-01

Pos logit contributions
ディ+1.993
+1.901
Counter+1.880
akespeare+1.821
+1.771
Neg logit contributions
zie-1.142
yg-1.007
grab-0.841
verbal-0.834
uben-0.831
Token.
Token ID13
Feature activation+2.04e-01

Pos logit contributions
ディ+1.336
+1.266
Counter+1.244
akespeare+1.211
+1.168
Neg logit contributions
zie-1.015
yg-0.919
grab-0.795
verbal-0.787
uben-0.783
Token.
Token ID13
Feature activation+1.41e-02

Pos logit contributions
zie+0.618
yg+0.563
grab+0.483
verbal+0.473
gain+0.472
Neg logit contributions
ディ-0.903
-0.855
Counter-0.834
akespeare-0.832
-0.794
Tokenx
Token ID87
Feature activation-4.38e-01

Pos logit contributions
zie+0.112
yg+0.099
grab+0.082
gain+0.081
enh+0.080
Neg logit contributions
ディ-0.216
-0.206
Counter-0.202
akespeare-0.200
-0.193
Token.
Token ID13
Feature activation-1.07e+00

Pos logit contributions
ディ+5.167
Counter+4.909
+4.902
akespeare+4.552
Doctor+4.475
Neg logit contributions
zie-4.805
yg-4.326
aptic-3.864
uben-3.832
verbal-3.817
Token111
Token ID16243
Feature activation-9.21e-01

Pos logit contributions
Counter+4.349
Doctor+3.540
+3.334
ディ+3.036
1963+2.871
Neg logit contributions
zie-17.914
yg-17.423
aptic-16.597
uben-15.948
ggle-15.947
Token 10
Token ID838
Feature activation-8.28e-01

Pos logit contributions
ディ+8.067
Counter+7.591
+7.363
9999+6.872
+6.844
Neg logit contributions
zie-11.323
yg-10.317
aptic-9.729
uben-9.492
verbal-9.334
Token36
Token ID2623
Feature activation-1.97e+00

Pos logit contributions
ディ+7.047
+6.641
9999+6.280
1963+6.251
Counter+6.161
Neg logit contributions
zie-11.095
yg-9.928
uben-9.475
aptic-9.458
 newsp-9.422
Token203
Token ID22416
Feature activation-1.58e-01

Pos logit contributions
ディ+13.283
+12.566
1963+12.035
9999+11.975
554+11.664
Neg logit contributions
zie-19.400
 newsp-17.456
yg-17.244
aptic-16.587
uben-16.294
Token x
Token ID2124
Feature activation-1.18e-01

Pos logit contributions
ディ+1.833
+1.719
Counter+1.696
akespeare+1.630
+1.565
Neg logit contributions
zie-1.846
yg-1.685
grab-1.490
verbal-1.480
uben-1.478
Token.
Token ID13
Feature activation+1.39e-01

Pos logit contributions
ディ+1.554
+1.470
Counter+1.446
akespeare+1.409
+1.357
Neg logit contributions
zie-1.186
yg-1.074
grab-0.930
verbal-0.919
gain-0.914
Tokenx
Token ID87
Feature activation+7.65e-02

Pos logit contributions
zie+1.049
yg+0.938
grab+0.769
gain+0.754
enh+0.739
Neg logit contributions
ディ-2.167
-2.077
Counter-2.033
akespeare-2.029
-1.951
Token.
Token ID13
Feature activation-1.89e-01

Pos logit contributions
zie+0.720
yg+0.656
grab+0.563
verbal+0.550
gain+0.550
Neg logit contributions
ディ-1.057
-1.002
Counter-0.976
akespeare-0.974
-0.930
Token.
Token ID13
Feature activation-4.06e-01

Pos logit contributions
ディ+2.156
+2.034
Counter+2.022
akespeare+1.862
+1.812
Neg logit contributions
zie-2.294
yg-2.097
verbal-1.872
uben-1.868
gain-1.862
Token\n
Token ID198
Feature activation-3.06e-01

Pos logit contributions
ディ+4.215
Counter+4.044
+3.968
+3.548
Doctor+3.543
Neg logit contributions
zie-5.019
yg-4.608
aptic-4.200
uben-4.076
gain-4.059
Token\n
Token ID198
Feature activation-1.93e-01

Pos logit contributions
ディ+3.124
+2.913
Counter+2.883
akespeare+2.705
+2.615
Neg logit contributions
zie-3.942
yg-3.621
uben-3.249
aptic-3.241
verbal-3.241
TokenAround
Token ID24472
Feature activation-8.93e-01

Pos logit contributions
ディ+2.013
Counter+1.946
+1.901
Doctor+1.735
+1.709
Neg logit contributions
zie-2.443
yg-2.224
aptic-2.013
uben-1.996
gain-1.967
Token 1
Token ID352
Feature activation-6.65e-01

Pos logit contributions
ディ+8.707
Counter+8.377
+8.213
akespeare+7.732
1963+7.516
Neg logit contributions
zie-10.034
yg-9.257
aptic-8.526
grab-8.050
uben-8.014
Token,
Token ID11
Feature activation-1.96e+00

Pos logit contributions
ディ+6.425
Counter+6.306
+6.201
+5.595
akespeare+5.564
Neg logit contributions
zie-7.877
yg-6.984
aptic-6.485
verbal-6.386
grab-6.366
Token337
Token ID31496
Feature activation-4.08e-01

Pos logit contributions
ディ+13.494
+12.680
9999+12.377
Counter+12.043
1963+11.661
Neg logit contributions
zie-19.831
yg-17.644
aptic-16.661
uben-16.296
verbal-16.171
Token trains
Token ID13404
Feature activation-8.30e-01

Pos logit contributions
ディ+4.239
Counter+4.046
+3.979
akespeare+3.686
+3.571
Neg logit contributions
zie-4.955
yg-4.419
aptic-4.010
uben-3.980
gain-3.967
Token in
Token ID287
Feature activation-5.78e-01

Pos logit contributions
ディ+8.133
+7.992
Counter+7.843
akespeare+7.409
+7.217
Neg logit contributions
zie-9.112
yg-8.362
gain-7.506
verbal-7.456
grab-7.385
Token the
Token ID262
Feature activation-3.51e-01

Pos logit contributions
ディ+6.084
Counter+5.663
+5.600
akespeare+5.454
+5.236
Neg logit contributions
zie-6.544
yg-5.972
aptic-5.471
uben-5.317
verbal-5.284
Token northern
Token ID7840
Feature activation-3.78e-01

Pos logit contributions
ディ+3.744
+3.513
Counter+3.505
akespeare+3.315
+3.222
Neg logit contributions
zie-4.168
yg-3.789
uben-3.417
aptic-3.416
verbal-3.399
Token would
Token ID561
Feature activation-9.63e-02

Pos logit contributions
ディ+2.226
+2.126
Counter+2.115
akespeare+1.970
+1.921
Neg logit contributions
zie-2.246
yg-2.037
verbal-1.802
uben-1.800
aptic-1.789
Token then
Token ID788
Feature activation-3.84e-01

Pos logit contributions
ディ+1.070
+1.021
Counter+1.010
akespeare+0.956
+0.915
Neg logit contributions
zie-1.132
yg-1.034
aptic-0.928
uben-0.921
verbal-0.921
Token stand
Token ID1302
Feature activation-3.45e-01

Pos logit contributions
ディ+3.818
+3.706
Counter+3.650
akespeare+3.442
 spoof+3.289
Neg logit contributions
zie-4.694
yg-4.320
aptic-3.973
uben-3.913
verbal-3.899
Token at
Token ID379
Feature activation-2.30e-01

Pos logit contributions
ディ+3.668
+3.474
Counter+3.465
akespeare+3.229
+3.136
Neg logit contributions
zie-4.102
yg-3.717
aptic-3.377
verbal-3.335
uben-3.277
Token 2
Token ID362
Feature activation-2.52e-01

Pos logit contributions
ディ+2.787
+2.629
Counter+2.617
akespeare+2.467
+2.414
Neg logit contributions
zie-2.496
yg-2.257
aptic-2.004
uben-1.977
grab-1.969
Token,
Token ID11
Feature activation-1.96e+00

Pos logit contributions
ディ+2.773
+2.653
Counter+2.592
akespeare+2.407
+2.352
Neg logit contributions
zie-2.976
yg-2.694
grab-2.437
aptic-2.420
verbal-2.414
Token88
Token ID3459
Feature activation-2.33e-01

Pos logit contributions
ディ+13.857
+12.870
9999+12.535
514+12.483
Counter+12.031
Neg logit contributions
zie-18.664
yg-16.404
 newsp-16.238
verbal-15.390
uben-15.198
Token,
Token ID11
Feature activation-1.82e+00

Pos logit contributions
ディ+2.603
+2.470
Counter+2.459
akespeare+2.314
1963+2.232
Neg logit contributions
zie-2.697
yg-2.468
uben-2.197
aptic-2.189
verbal-2.189
Token339
Token ID29626
Feature activation+2.49e-01

Pos logit contributions
ディ+13.373
+12.487
9999+11.916
514+11.916
Counter+11.479
Neg logit contributions
zie-17.533
yg-15.547
 newsp-15.263
verbal-14.576
uben-14.371
Token tons
Token ID10860
Feature activation-4.48e-01

Pos logit contributions
zie+3.264
yg+3.006
uben+2.698
grab+2.695
verbal+2.695
Neg logit contributions
ディ-2.502
-2.328
Counter-2.300
akespeare-2.165
-2.086
Token.
Token ID13
Feature activation-3.08e-01

Pos logit contributions
ディ+4.793
+4.422
Counter+4.382
akespeare+4.105
+4.025
Neg logit contributions
zie-5.347
yg-4.939
uben-4.324
verbal-4.323
grab-4.250
Token state
Token ID1181
Feature activation+4.87e-01

Pos logit contributions
zie+2.282
yg+2.076
uben+1.834
verbal+1.832
grab+1.832
Neg logit contributions
ディ-2.271
-2.136
Counter-2.113
akespeare-2.006
-1.944
Token,
Token ID11
Feature activation-5.71e-01

Pos logit contributions
zie+4.959
yg+4.417
verbal+3.865
grab+3.862
uben+3.839
Neg logit contributions
ディ-6.172
-5.851
Counter-5.747
akespeare-5.518
-5.358
Token an
Token ID281
Feature activation+3.99e-02

Pos logit contributions
ディ+6.281
Counter+6.005
+5.896
akespeare+5.462
+5.433
Neg logit contributions
zie-6.382
yg-5.720
aptic-5.202
verbal-5.105
uben-5.015
Token incredible
Token ID8082
Feature activation-3.14e-01

Pos logit contributions
zie+0.451
yg+0.411
verbal+0.363
grab+0.361
aptic+0.360
Neg logit contributions
ディ-0.467
-0.443
Counter-0.440
akespeare-0.416
-0.404
Token 12
Token ID1105
Feature activation-5.03e-01

Pos logit contributions
ディ+3.402
Counter+3.217
+3.199
akespeare+2.967
+2.911
Neg logit contributions
zie-3.757
yg-3.437
aptic-3.068
verbal-3.020
grab-3.010
Token,
Token ID11
Feature activation-1.95e+00

Pos logit contributions
ディ+5.013
Counter+4.762
+4.672
9999+4.373
akespeare+4.286
Neg logit contributions
zie-6.151
yg-5.591
aptic-5.108
uben-5.085
verbal-5.077
Token436
Token ID43690
Feature activation-3.21e-01

Pos logit contributions
ディ+14.196
+13.245
9999+12.922
514+12.843
Counter+12.479
Neg logit contributions
zie-17.825
yg-15.972
 newsp-15.433
verbal-14.897
grab-14.817
Token seed
Token ID9403
Feature activation-5.55e-02

Pos logit contributions
ディ+3.460
Counter+3.278
+3.262
+2.987
akespeare+2.986
Neg logit contributions
zie-3.838
yg-3.458
uben-3.144
aptic-3.122
grab-3.112
Tokenlings
Token ID17783
Feature activation-1.29e-01

Pos logit contributions
ディ+0.628
Counter+0.585
+0.584
akespeare+0.547
+0.534
Neg logit contributions
zie-0.657
yg-0.596
aptic-0.535
uben-0.533
verbal-0.532
Token were
Token ID547
Feature activation-2.17e-01

Pos logit contributions
ディ+1.561
+1.472
Counter+1.469
akespeare+1.385
+1.363
Neg logit contributions
zie-1.391
yg-1.259
aptic-1.110
verbal-1.106
uben-1.104
Token rejected
Token ID8606
Feature activation-1.65e-01

Pos logit contributions
ディ+2.478
Counter+2.313
+2.303
akespeare+2.165
+2.148
Neg logit contributions
zie-2.484
yg-2.247
aptic-2.041
verbal-2.010
uben-1.982
Token per
Token ID583
Feature activation-8.46e-01

Pos logit contributions
ディ+11.217
+10.742
Counter+10.180
Doctor+10.010
 guests+9.876
Neg logit contributions
zie-15.435
uben-13.542
yg-13.411
grab-12.970
gain-12.584
Token cent
Token ID1247
Feature activation-8.60e-01

Pos logit contributions
ディ+5.892
Counter+4.952
+4.879
Doctor+4.726
acters+4.364
Neg logit contributions
zie-12.028
yg-11.362
uben-10.641
enh-10.534
gain-10.488
Token of
Token ID286
Feature activation-3.18e-01

Pos logit contributions
ディ+7.705
Counter+7.011
+6.980
acters+6.764
Doctor+6.384
Neg logit contributions
zie-10.544
yg-9.468
uben-8.677
grab-8.656
verbal-8.580
Token the
Token ID262
Feature activation-1.46e-01

Pos logit contributions
ディ+3.368
+3.129
Counter+3.095
akespeare+2.958
Doctor+2.848
Neg logit contributions
zie-3.809
yg-3.456
grab-3.163
aptic-3.142
uben-3.135
Token 9
Token ID860
Feature activation-4.13e-01

Pos logit contributions
ディ+1.556
+1.446
Counter+1.430
akespeare+1.350
+1.306
Neg logit contributions
zie-1.809
yg-1.649
grab-1.497
aptic-1.486
uben-1.482
Token,
Token ID11
Feature activation-1.92e+00

Pos logit contributions
ディ+4.169
+3.894
Counter+3.813
akespeare+3.653
9999+3.617
Neg logit contributions
zie-5.182
yg-4.639
uben-4.266
grab-4.250
aptic-4.211
Token204
Token ID18638
Feature activation-3.71e-01

Pos logit contributions
ディ+13.323
+12.261
514+11.960
9999+11.843
Counter+11.540
Neg logit contributions
zie-18.531
yg-16.826
 newsp-16.169
verbal-15.702
grab-15.506
Token people
Token ID661
Feature activation-4.98e-01

Pos logit contributions
ディ+3.757
+3.477
Counter+3.460
akespeare+3.248
Doctor+3.162
Neg logit contributions
zie-4.667
yg-4.210
uben-3.904
grab-3.893
aptic-3.814
Token employed
Token ID9322
Feature activation-4.24e-01

Pos logit contributions
ディ+5.139
+4.731
Counter+4.699
akespeare+4.392
Doctor+4.332
Neg logit contributions
zie-5.988
yg-5.435
grab-5.003
verbal-4.977
uben-4.922
Token in
Token ID287
Feature activation-6.51e-01

Pos logit contributions
ディ+4.639
+4.328
Counter+4.266
akespeare+4.146
Doctor+3.998
Neg logit contributions
zie-4.751
yg-4.337
grab-3.956
aptic-3.905
verbal-3.883
Token that
Token ID326
Feature activation-3.57e-02

Pos logit contributions
ディ+6.088
+5.618
Counter+5.607
akespeare+5.352
Doctor+5.157
Neg logit contributions
zie-7.963
yg-7.217
grab-6.763
aptic-6.734
uben-6.619
Token crisis
Token ID4902
Feature activation-1.42e-01

Pos logit contributions
ディ+0.255
+0.239
Counter+0.237
akespeare+0.225
+0.219
Neg logit contributions
zie-0.257
yg-0.235
verbal-0.207
uben-0.206
grab-0.205
Token
Token ID960
Feature activation-7.44e-02

Pos logit contributions
ディ+1.778
+1.681
Counter+1.660
akespeare+1.568
+1.557
Neg logit contributions
zie-1.504
yg-1.359
verbal-1.173
grab-1.172
uben-1.171
Tokenclose
Token ID19836
Feature activation-9.96e-01

Pos logit contributions
ディ+0.819
Counter+0.769
+0.765
akespeare+0.710
+0.696
Neg logit contributions
zie-0.903
yg-0.823
aptic-0.740
uben-0.729
verbal-0.722
Token to
Token ID284
Feature activation-7.99e-01

Pos logit contributions
ディ+8.625
+8.536
Counter+8.529
akespeare+8.049
+7.805
Neg logit contributions
zie-11.184
yg-10.409
aptic-9.438
verbal-9.366
retch-9.221
Token 6
Token ID718
Feature activation-3.40e-01

Pos logit contributions
ディ+7.836
Counter+7.554
+7.322
akespeare+6.955
 cloaked+6.848
Neg logit contributions
zie-9.338
yg-8.641
aptic-7.918
grab-7.537
uben-7.507
Token,
Token ID11
Feature activation-1.90e+00

Pos logit contributions
ディ+3.428
+3.243
Counter+3.226
akespeare+3.029
9999+2.991
Neg logit contributions
zie-4.248
yg-3.854
aptic-3.489
verbal-3.481
grab-3.476
Token000
Token ID830
Feature activation-1.08e-01

Pos logit contributions
ディ+13.045
+12.034
9999+11.601
Counter+11.428
514+11.423
Neg logit contributions
zie-18.788
yg-17.171
 newsp-16.687
verbal-15.977
grab-15.704
Token concerned
Token ID5213
Feature activation+5.95e-02

Pos logit contributions
ディ+1.293
Counter+1.225
+1.223
akespeare+1.152
+1.130
Neg logit contributions
zie-1.201
yg-1.076
verbal-0.951
uben-0.948
grab-0.948
Token issues
Token ID2428
Feature activation-2.15e-01

Pos logit contributions
zie+0.753
yg+0.688
aptic+0.618
uben+0.617
verbal+0.617
Neg logit contributions
ディ-0.623
-0.582
Counter-0.577
akespeare-0.543
-0.525
Token consumers
Token ID7008
Feature activation-3.11e-01

Pos logit contributions
ディ+2.574
+2.433
Counter+2.406
akespeare+2.289
+2.256
Neg logit contributions
zie-2.359
yg-2.109
verbal-1.877
aptic-1.874
uben-1.871
Token had
Token ID550
Feature activation-1.65e-01

Pos logit contributions
ディ+3.665
+3.450
Counter+3.437
akespeare+3.226
+3.186
Neg logit contributions
zie-3.437
yg-3.077
aptic-2.763
verbal-2.741
grab-2.720
Token.
Token ID13
Feature activation-1.89e-01

Pos logit contributions
zie+0.720
yg+0.656
grab+0.563
verbal+0.550
gain+0.550
Neg logit contributions
ディ-1.057
-1.002
Counter-0.976
akespeare-0.974
-0.930
Tokenx
Token ID87
Feature activation-5.88e-01

Pos logit contributions
ディ+2.767
+2.634
Counter+2.608
akespeare+2.515
+2.451
Neg logit contributions
zie-1.618
yg-1.424
grab-1.189
verbal-1.185
uben-1.185
Token.
Token ID13
Feature activation-9.04e-01

Pos logit contributions
ディ+6.495
Counter+6.255
+6.165
Doctor+5.674
akespeare+5.626
Neg logit contributions
zie-6.672
yg-5.967
aptic-5.493
uben-5.384
verbal-5.341
Token112
Token ID14686
Feature activation-7.96e-01

Pos logit contributions
Counter+4.719
ディ+4.535
+4.059
Doctor+3.756
Filter+3.509
Neg logit contributions
zie-14.663
yg-13.508
aptic-13.186
uben-12.639
 newsp-12.575
Token 10
Token ID838
Feature activation-7.28e-01

Pos logit contributions
ディ+7.147
Counter+6.776
+6.681
+6.097
9999+6.085
Neg logit contributions
zie-9.794
yg-8.958
aptic-8.377
uben-8.276
verbal-8.155
Token25
Token ID1495
Feature activation-1.89e+00

Pos logit contributions
ディ+6.334
+5.926
Counter+5.547
1963+5.502
9999+5.477
Neg logit contributions
zie-9.867
yg-8.852
uben-8.424
aptic-8.393
 newsp-8.328
Token151
Token ID24309
Feature activation+3.34e-02

Pos logit contributions
ディ+13.027
+12.328
9999+11.825
1963+11.683
Counter+11.415
Neg logit contributions
zie-18.949
 newsp-16.965
yg-16.793
aptic-15.960
 emanc-15.681
Token x
Token ID2124
Feature activation-1.64e-01

Pos logit contributions
zie+0.364
yg+0.335
grab+0.293
gain+0.289
verbal+0.288
Neg logit contributions
ディ-0.411
-0.386
Counter-0.377
akespeare-0.373
-0.355
Token.
Token ID13
Feature activation+6.32e-02

Pos logit contributions
ディ+2.131
+2.016
Counter+1.987
akespeare+1.926
+1.858
Neg logit contributions
zie-1.676
yg-1.516
grab-1.314
verbal-1.303
uben-1.297
Tokenx
Token ID87
Feature activation+5.70e-02

Pos logit contributions
zie+0.495
yg+0.440
grab+0.363
gain+0.357
verbal+0.351
Neg logit contributions
ディ-0.973
-0.932
Counter-0.914
akespeare-0.906
-0.873
Token.
Token ID13
Feature activation-3.30e-02

Pos logit contributions
zie+0.539
yg+0.490
grab+0.420
verbal+0.411
gain+0.410
Neg logit contributions
ディ-0.787
-0.746
Counter-0.727
akespeare-0.724
-0.691
TokenAM
Token ID2390
Feature activation-6.84e-01

Pos logit contributions
ディ+4.446
Counter+4.235
+4.223
akespeare+3.819
Doctor+3.811
Neg logit contributions
zie-5.199
yg-4.615
aptic-4.159
uben-4.145
grab-4.135
Token as
Token ID355
Feature activation-1.32e+00

Pos logit contributions
+7.223
ディ+7.086
Counter+7.058
 spoof+6.434
akespeare+6.431
Neg logit contributions
zie-7.740
yg-6.903
uben-6.390
grab-6.227
aptic-6.164
Token the
Token ID262
Feature activation-1.26e+00

Pos logit contributions
ディ+12.465
+11.922
Counter+11.620
 spoof+11.048
 cloaked+10.963
Neg logit contributions
zie-12.874
yg-11.842
aptic-11.125
gain-11.021
uben-10.973
Token premiere
Token ID19245
Feature activation-1.22e+00

Pos logit contributions
ディ+11.477
+10.989
Counter+10.904
 spoof+10.729
 cloaked+10.585
Neg logit contributions
zie-12.796
yg-11.615
aptic-11.205
gain-10.941
verbal-10.701
Token rather
Token ID2138
Feature activation-1.09e+00

Pos logit contributions
Counter+10.115
+10.009
ディ+9.988
 spoof+9.774
 guests+9.753
Neg logit contributions
zie-13.512
yg-12.025
aptic-11.591
uben-11.321
gain-11.161
Token than
Token ID621
Feature activation-1.87e+00

Pos logit contributions
ディ+9.685
Counter+9.306
+9.071
akespeare+8.596
 cloaked+8.453
Neg logit contributions
zie-12.939
yg-11.893
aptic-10.947
uben-10.764
verbal-10.689
Token D
Token ID360
Feature activation-2.72e-01

Pos logit contributions
ディ+15.482
 spoof+14.657
 guests+14.525
+14.514
Counter+14.375
Neg logit contributions
zie-15.989
yg-14.608
aptic-14.080
verbal-13.594
uben-13.576
TokenIV
Token ID3824
Feature activation-1.01e+00

Pos logit contributions
ディ+2.987
+2.865
Counter+2.790
akespeare+2.639
+2.564
Neg logit contributions
zie-3.134
yg-2.923
gain-2.599
uben-2.595
grab-2.570
TokenIDE
Token ID14114
Feature activation-5.28e-01

Pos logit contributions
ディ+6.822
+6.688
Counter+6.404
1963+5.662
Doctor+5.461
Neg logit contributions
zie-14.893
yg-13.930
uben-12.940
grab-12.822
aptic-12.736
Token AND
Token ID5357
Feature activation-4.56e-01

Pos logit contributions
ディ+4.950
+4.886
Counter+4.630
akespeare+4.317
+4.262
Neg logit contributions
zie-6.530
yg-6.190
uben-5.589
aptic-5.521
grab-5.500
Token CON
Token ID7102
Feature activation-4.46e-01

Pos logit contributions
ディ+4.423
+4.218
Counter+4.146
akespeare+3.762
+3.731
Neg logit contributions
zie-6.042
yg-5.534
aptic-5.023
uben-4.957
grab-4.894
Token for
Token ID329
Feature activation-1.05e+00

Pos logit contributions
ディ+3.706
+3.579
Counter+3.493
akespeare+3.339
+3.283
Neg logit contributions
zie-3.077
yg-2.780
gain-2.481
aptic-2.472
verbal-2.427
Token the
Token ID262
Feature activation-1.05e+00

Pos logit contributions
 guests+10.220
ディ+9.962
Counter+9.823
+9.571
 spoof+9.281
Neg logit contributions
zie-10.573
yg-9.639
uben-8.977
aptic-8.824
gain-8.736
Token present
Token ID1944
Feature activation-6.44e-01

Pos logit contributions
ディ+10.094
 guests+9.913
+9.767
Counter+9.641
akespeare+9.172
Neg logit contributions
zie-10.987
yg-10.011
uben-9.200
gain-9.051
aptic-8.965
Tokeners
Token ID364
Feature activation-9.34e-01

Pos logit contributions
ディ+5.448
+5.125
Counter+4.996
 guests+4.745
akespeare+4.527
Neg logit contributions
zie-8.347
yg-7.645
gain-7.124
aptic-7.119
uben-7.094
Token entering
Token ID8218
Feature activation-1.31e+00

Pos logit contributions
ディ+8.586
+8.348
Counter+8.318
 guests+8.022
 cloaked+7.975
Neg logit contributions
zie-10.284
yg-9.353
aptic-8.615
gain-8.566
uben-8.443
Token from
Token ID422
Feature activation-1.86e+00

Pos logit contributions
+11.422
Counter+11.093
 guests+11.081
ディ+10.868
 filtered+10.550
Neg logit contributions
zie-12.934
yg-11.450
gain-10.915
uben-10.843
oled-10.822
Token the
Token ID262
Feature activation-1.55e+00

Pos logit contributions
 guests+13.044
Counter+13.043
+13.005
ディ+12.908
 opposite+12.312
Neg logit contributions
zie-17.474
yg-15.752
uben-14.697
gain-14.642
aptic-14.618
Token other
Token ID584
Feature activation-8.44e-01

Pos logit contributions
+12.362
ディ+12.179
 guests+12.153
Counter+11.905
 spoof+11.604
Neg logit contributions
zie-15.197
yg-13.778
uben-13.204
oled-12.777
verbal-12.762
Token side
Token ID1735
Feature activation-7.56e-01

Pos logit contributions
+5.868
ディ+5.777
Counter+5.687
 guests+5.458
akespeare+5.170
Neg logit contributions
zie-11.494
yg-10.646
uben-10.062
oled-9.819
gain-9.770
Token of
Token ID286
Feature activation-3.70e-01

Pos logit contributions
ディ+6.895
+6.840
Counter+6.639
+6.328
akespeare+6.244
Neg logit contributions
zie-8.891
yg-8.136
gain-7.344
uben-7.287
aptic-7.283
Token the
Token ID262
Feature activation-4.61e-01

Pos logit contributions
ディ+3.470
+3.335
Counter+3.322
akespeare+3.065
+2.944
Neg logit contributions
zie-4.771
yg-4.265
uben-3.984
aptic-3.950
verbal-3.932
Token.
Token ID13
Feature activation+2.57e-01

Pos logit contributions
zie+1.138
yg+1.037
grab+0.889
verbal+0.866
md+0.866
Neg logit contributions
ディ-1.732
-1.642
akespeare-1.600
Counter-1.597
-1.525
Tokenx
Token ID87
Feature activation+8.98e-02

Pos logit contributions
zie+1.867
yg+1.662
grab+1.347
gain+1.303
verbal+1.280
Neg logit contributions
ディ-4.100
-3.920
akespeare-3.864
Counter-3.817
-3.689
Token.
Token ID13
Feature activation-6.20e-01

Pos logit contributions
zie+0.850
yg+0.760
grab+0.658
enh+0.643
md+0.639
Neg logit contributions
ディ-1.255
-1.181
Counter-1.151
akespeare-1.144
-1.091
Token119
Token ID16315
Feature activation-6.61e-01

Pos logit contributions
ディ+3.866
Counter+3.791
+3.618
1963+3.235
akespeare+3.202
Neg logit contributions
zie-9.629
yg-9.320
aptic-8.578
verbal-8.422
uben-8.339
Token 10
Token ID838
Feature activation-7.68e-01

Pos logit contributions
ディ+6.194
Counter+5.804
+5.739
+5.249
9999+5.214
Neg logit contributions
zie-8.329
yg-7.562
aptic-7.024
uben-7.015
verbal-6.922
Token23
Token ID1954
Feature activation-1.86e+00

Pos logit contributions
ディ+6.650
+6.201
9999+5.883
Counter+5.847
1963+5.783
Neg logit contributions
zie-10.262
yg-9.263
uben-8.839
aptic-8.695
 newsp-8.663
Token199
Token ID19104
Feature activation-3.04e-02

Pos logit contributions
ディ+12.685
+11.928
9999+11.782
1963+11.532
Counter+11.052
Neg logit contributions
zie-18.802
 newsp-16.980
yg-16.841
aptic-16.158
uben-15.735
Token x
Token ID2124
Feature activation-1.02e-01

Pos logit contributions
ディ+0.365
+0.343
Counter+0.336
akespeare+0.330
+0.314
Neg logit contributions
zie-0.340
yg-0.311
grab-0.274
gain-0.270
verbal-0.270
Token.
Token ID13
Feature activation+7.67e-02

Pos logit contributions
ディ+1.351
+1.279
Counter+1.256
akespeare+1.227
+1.180
Neg logit contributions
zie-1.025
yg-0.929
grab-0.805
verbal-0.794
gain-0.789
Tokenx
Token ID87
Feature activation+1.12e-01

Pos logit contributions
zie+0.598
yg+0.533
grab+0.440
gain+0.432
verbal+0.425
Neg logit contributions
ディ-1.180
-1.131
Counter-1.109
akespeare-1.099
-1.061
Token.
Token ID13
Feature activation+2.09e-01

Pos logit contributions
zie+1.038
yg+0.945
grab+0.809
verbal+0.790
md+0.789
Neg logit contributions
ディ-1.562
-1.482
akespeare-1.441
Counter-1.439
-1.377
Token.
Token ID13
Feature activation-7.38e-03

Pos logit contributions
ディ+0.195
+0.185
Counter+0.181
akespeare+0.178
+0.171
Neg logit contributions
zie-0.141
yg-0.128
grab-0.110
verbal-0.108
gain-0.108
Tokenx
Token ID87
Feature activation-7.39e-02

Pos logit contributions
ディ+0.112
+0.108
Counter+0.106
akespeare+0.104
+0.100
Neg logit contributions
zie-0.059
yg-0.053
grab-0.044
gain-0.043
verbal-0.043
Token.
Token ID13
Feature activation-2.94e-01

Pos logit contributions
ディ+0.895
+0.851
Counter+0.834
akespeare+0.810
+0.770
Neg logit contributions
zie-0.816
yg-0.750
grab-0.660
uben-0.654
gain-0.653
Token156
Token ID21599
Feature activation-7.86e-01

Pos logit contributions
ディ+2.682
Counter+2.464
+2.451
akespeare+2.242
+2.210
Neg logit contributions
zie-4.081
yg-3.756
aptic-3.473
uben-3.394
verbal-3.391
Token 10
Token ID838
Feature activation-9.20e-01

Pos logit contributions
ディ+7.105
Counter+6.587
+6.499
9999+6.059
1963+6.013
Neg logit contributions
zie-9.705
yg-8.977
aptic-8.485
uben-8.314
verbal-8.166
Token23
Token ID1954
Feature activation-1.86e+00

Pos logit contributions
ディ+7.631
+7.130
1963+6.873
9999+6.814
Counter+6.667
Neg logit contributions
zie-12.181
yg-10.936
 newsp-10.636
uben-10.547
aptic-10.472
Token688
Token ID34427
Feature activation-1.15e-01

Pos logit contributions
ディ+12.654
+11.932
9999+11.669
1963+11.515
Counter+11.033
Neg logit contributions
zie-18.768
 newsp-17.136
yg-16.673
aptic-16.061
 emanc-15.703
Token x
Token ID2124
Feature activation-9.47e-02

Pos logit contributions
ディ+1.357
+1.273
Counter+1.253
akespeare+1.215
+1.161
Neg logit contributions
zie-1.320
yg-1.204
grab-1.065
verbal-1.053
gain-1.051
Token.
Token ID13
Feature activation+4.39e-02

Pos logit contributions
ディ+1.257
+1.190
Counter+1.168
akespeare+1.142
+1.098
Neg logit contributions
zie-0.944
yg-0.855
grab-0.740
verbal-0.729
gain-0.725
Tokenx
Token ID87
Feature activation+9.72e-02

Pos logit contributions
zie+0.347
yg+0.308
grab+0.255
gain+0.251
verbal+0.247
Neg logit contributions
ディ-0.672
-0.643
Counter-0.632
akespeare-0.624
-0.602
Token.
Token ID13
Feature activation+2.28e-01

Pos logit contributions
zie+0.907
yg+0.827
grab+0.709
verbal+0.691
gain+0.691
Neg logit contributions
ディ-1.347
-1.280
akespeare-1.245
Counter-1.244
-1.187
Token.
Token ID13
Feature activation+3.37e-01

Pos logit contributions
zie+1.915
yg+1.752
grab+1.490
md+1.457
gain+1.440
Neg logit contributions
ディ-3.090
-2.938
akespeare-2.886
Counter-2.846
-2.739
Tokenx
Token ID87
Feature activation+2.27e-01

Pos logit contributions
zie+2.329
yg+2.087
grab+1.680
gain+1.642
md+1.598
Neg logit contributions
ディ-5.411
-5.203
akespeare-5.148
Counter-5.055
-4.921
Token.
Token ID13
Feature activation-2.06e-01

Pos logit contributions
zie+2.003
yg+1.813
grab+1.584
uben+1.525
verbal+1.521
Neg logit contributions
ディ-3.205
-3.083
akespeare-2.976
Counter-2.956
-2.859
Token143
Token ID21139
Feature activation-6.70e-01

Pos logit contributions
ディ+2.116
+1.965
Counter+1.957
akespeare+1.832
+1.767
Neg logit contributions
zie-2.632
yg-2.422
aptic-2.160
uben-2.154
verbal-2.152
Token 10
Token ID838
Feature activation-9.07e-01

Pos logit contributions
ディ+6.351
Counter+5.938
+5.895
1963+5.411
9999+5.398
Neg logit contributions
zie-8.263
yg-7.608
aptic-7.081
uben-6.891
verbal-6.889
Token33
Token ID2091
Feature activation-1.84e+00

Pos logit contributions
ディ+7.516
+7.059
9999+6.822
1963+6.756
Counter+6.556
Neg logit contributions
zie-12.093
yg-10.840
 newsp-10.423
uben-10.393
aptic-10.355
Token455
Token ID30505
Feature activation+1.65e-01

Pos logit contributions
ディ+12.258
+11.394
1963+11.109
9999+11.029
Counter+10.366
Neg logit contributions
zie-19.384
 newsp-17.630
yg-17.259
aptic-16.701
uben-16.245
Token x
Token ID2124
Feature activation-1.05e-01

Pos logit contributions
zie+1.721
yg+1.575
grab+1.373
gain+1.348
verbal+1.327
Neg logit contributions
ディ-2.107
-1.976
Counter-1.937
akespeare-1.928
-1.827
Token.
Token ID13
Feature activation+6.66e-02

Pos logit contributions
ディ+1.385
+1.310
Counter+1.288
akespeare+1.257
+1.210
Neg logit contributions
zie-1.054
yg-0.955
grab-0.827
verbal-0.816
gain-0.811
Tokenx
Token ID87
Feature activation+1.85e-01

Pos logit contributions
zie+0.523
yg+0.465
grab+0.385
gain+0.378
verbal+0.372
Neg logit contributions
ディ-1.023
-0.979
Counter-0.962
akespeare-0.953
-0.918
Token.
Token ID13
Feature activation+3.74e-01

Pos logit contributions
zie+1.672
yg+1.523
grab+1.305
md+1.269
gain+1.263
Neg logit contributions
ディ-2.611
-2.487
akespeare-2.431
Counter-2.406
-2.310
Token which
Token ID543
Feature activation+8.64e-01

Pos logit contributions
zie+2.247
yg+2.034
uben+1.781
grab+1.779
verbal+1.778
Neg logit contributions
ディ-2.499
-2.356
Counter-2.333
akespeare-2.221
-2.156
Token you
Token ID345
Feature activation+1.73e+00

Pos logit contributions
zie+9.459
yg+8.541
gain+7.773
aptic+7.738
enh+7.677
Neg logit contributions
ディ-9.560
-8.934
Counter-8.848
akespeare-8.626
-8.296
Token train
Token ID4512
Feature activation+6.55e-01

Pos logit contributions
zie+14.463
gain+13.183
 SEE+12.630
yg+12.373
enh+12.250
Neg logit contributions
ディ-17.278
akespeare-15.668
Counter-15.395
-15.374
-15.196
Token your
Token ID534
Feature activation+1.33e+00

Pos logit contributions
zie+7.184
yg+6.441
enh+5.786
gain+5.690
grab+5.676
Neg logit contributions
ディ-7.581
-7.013
Counter-6.893
akespeare-6.736
-6.515
Token body
Token ID1767
Feature activation+7.37e-01

Pos logit contributions
zie+11.587
yg+10.231
enh+9.275
 SEE+8.965
aptic+8.847
Neg logit contributions
ディ-15.196
Counter-13.896
-13.849
akespeare-13.778
-13.319
Token to
Token ID284
Feature activation+1.46e+00

Pos logit contributions
zie+7.223
yg+6.283
enh+5.579
verbal+5.562
grab+5.560
Neg logit contributions
ディ-9.473
-8.728
Counter-8.500
akespeare-8.393
-8.113
Token rapidly
Token ID8902
Feature activation+1.10e+00

Pos logit contributions
zie+13.250
 SEE+11.467
gain+11.298
yg+11.167
enh+10.643
Neg logit contributions
ディ-15.762
-14.068
akespeare-13.870
Counter-13.583
-13.431
Token and
Token ID290
Feature activation+1.08e+00

Pos logit contributions
zie+10.958
yg+9.398
gain+9.247
 SEE+8.897
enh+8.789
Neg logit contributions
ディ-12.447
-11.201
Counter-11.028
akespeare-10.967
-10.833
Token efficiently
Token ID18306
Feature activation+1.19e+00

Pos logit contributions
zie+10.091
yg+8.678
gain+8.346
enh+8.101
grab+7.954
Neg logit contributions
ディ-12.945
-12.018
akespeare-11.732
Counter-11.616
-11.169
Token mobilize
Token ID43494
Feature activation+1.58e+00

Pos logit contributions
zie+11.540
yg+9.885
gain+9.764
 SEE+9.420
enh+9.140
Neg logit contributions
ディ-13.528
-12.162
akespeare-11.904
Counter-11.713
-11.699
Token free
Token ID1479
Feature activation+8.57e-01

Pos logit contributions
zie+13.436
yg+12.455
gain+11.072
enh+10.941
 dopamine+10.889
Neg logit contributions
ディ-16.994
akespeare-15.715
-15.381
Counter-15.019
-14.946
Token fail
Token ID2038
Feature activation-1.68e-01

Pos logit contributions
ディ+0.434
+0.410
Counter+0.408
akespeare+0.381
+0.373
Neg logit contributions
zie-0.494
yg-0.449
uben-0.404
verbal-0.403
aptic-0.402
Token to
Token ID284
Feature activation+1.92e-01

Pos logit contributions
ディ+1.861
+1.776
Counter+1.765
akespeare+1.634
Doctor+1.583
Neg logit contributions
zie-1.962
yg-1.818
uben-1.604
aptic-1.578
grab-1.574
Token point
Token ID966
Feature activation+1.19e-01

Pos logit contributions
zie+2.021
yg+1.823
grab+1.587
uben+1.584
verbal+1.579
Neg logit contributions
ディ-2.425
-2.284
Counter-2.257
akespeare-2.156
-2.098
Token out
Token ID503
Feature activation+1.89e-01

Pos logit contributions
zie+1.469
yg+1.337
verbal+1.205
uben+1.201
aptic+1.200
Neg logit contributions
ディ-1.259
-1.230
Counter-1.209
akespeare-1.126
-1.087
Token is
Token ID318
Feature activation+2.01e-01

Pos logit contributions
zie+2.032
yg+1.836
verbal+1.606
uben+1.604
aptic+1.595
Neg logit contributions
ディ-2.331
-2.210
Counter-2.201
akespeare-2.086
-2.024
Token that
Token ID326
Feature activation+1.09e+00

Pos logit contributions
zie+2.320
yg+2.109
uben+1.863
aptic+1.860
verbal+1.851
Neg logit contributions
ディ-2.337
-2.214
Counter-2.196
akespeare-2.070
-2.019
Token those
Token ID883
Feature activation+1.44e+00

Pos logit contributions
zie+9.930
yg+9.135
grab+8.242
verbal+7.895
enh+7.858
Neg logit contributions
ディ-13.285
-12.174
Counter-11.652
akespeare-11.610
Doctor-11.142
Token executive
Token ID4640
Feature activation+8.30e-01

Pos logit contributions
zie+11.997
yg+11.466
verbal+10.528
 ­+10.468
enh+10.349
Neg logit contributions
ディ-15.835
-14.110
Counter-13.750
akespeare-13.711
Doctor-13.375
Token actions
Token ID4028
Feature activation+5.31e-01

Pos logit contributions
zie+6.486
yg+5.650
enh+4.878
verbal+4.867
grab+4.786
Neg logit contributions
ディ-11.842
-11.171
Counter-11.018
akespeare-10.801
-10.377
Token contained
Token ID7763
Feature activation+8.64e-01

Pos logit contributions
zie+5.210
yg+4.636
enh+4.083
grab+4.066
gain+4.059
Neg logit contributions
ディ-6.870
-6.421
Counter-6.247
akespeare-6.042
Doctor-5.889
Token no
Token ID645
Feature activation+5.64e-01

Pos logit contributions
zie+8.572
yg+7.886
verbal+7.073
enh+7.032
grab+6.993
Neg logit contributions
ディ-10.106
-9.509
Counter-9.176
akespeare-8.874
Doctor-8.564
Token researchers
Token ID4837
Feature activation-4.92e-02

Pos logit contributions
zie+8.257
yg+7.817
aptic+6.996
verbal+6.953
enh+6.916
Neg logit contributions
ディ-7.948
-7.338
Counter-7.134
akespeare-6.849
-6.603
Token believe
Token ID1975
Feature activation+7.39e-01

Pos logit contributions
ディ+0.578
+0.547
Counter+0.543
akespeare+0.517
+0.501
Neg logit contributions
zie-0.556
yg-0.502
verbal-0.443
uben-0.442
grab-0.442
Token that
Token ID326
Feature activation+1.46e+00

Pos logit contributions
zie+7.411
yg+6.834
enh+5.988
aptic+5.988
verbal+5.887
Neg logit contributions
ディ-9.037
-8.447
Counter-8.347
akespeare-7.966
-7.765
Token abnormal
Token ID18801
Feature activation+1.76e+00

Pos logit contributions
zie+13.825
yg+13.214
 dopamine+13.047
aptic+11.974
enh+11.943
Neg logit contributions
ディ-14.878
-13.692
Counter-13.168
akespeare-12.613
-12.499
Token levels
Token ID2974
Feature activation+8.77e-01

Pos logit contributions
 dopamine+14.853
zie+14.498
yg+14.214
aptic+13.033
 SEE+12.338
Neg logit contributions
ディ-18.148
Counter-15.925
-15.880
-15.873
akespeare-15.612
Token of
Token ID286
Feature activation+1.98e+00

Pos logit contributions
zie+8.073
yg+7.244
aptic+6.415
verbal+6.295
gain+6.266
Neg logit contributions
ディ-11.460
-10.740
Counter-10.597
akespeare-10.232
-9.899
Token dopamine
Token ID26252
Feature activation+1.07e+00

Pos logit contributions
 dopamine+15.770
zie+14.681
yg+14.415
aptic+13.334
 SEE+12.848
Neg logit contributions
ディ-20.671
-18.476
Counter-18.301
-18.219
akespeare-18.069
Token are
Token ID389
Feature activation+1.33e-01

Pos logit contributions
zie+9.694
yg+8.693
aptic+8.018
enh+7.961
gain+7.937
Neg logit contributions
ディ-13.251
-11.900
Counter-11.637
akespeare-11.437
Doctor-10.966
Token connected
Token ID5884
Feature activation+3.03e-01

Pos logit contributions
zie+1.554
yg+1.416
uben+1.253
grab+1.251
verbal+1.251
Neg logit contributions
ディ-1.510
-1.418
Counter-1.404
akespeare-1.331
-1.290
Token to
Token ID284
Feature activation+1.42e+00

Pos logit contributions
zie+3.292
yg+2.969
uben+2.619
grab+2.614
verbal+2.605
Neg logit contributions
ディ-3.715
-3.528
Counter-3.481
akespeare-3.322
-3.223
Token schizophrenia
Token ID22794
Feature activation+4.76e-01

Pos logit contributions
zie+13.803
yg+12.923
enh+11.741
verbal+11.703
 dopamine+11.627
Neg logit contributions
ディ-14.524
-13.636
Counter-13.116
akespeare-12.577
-12.361
Token pregnant
Token ID10423
Feature activation+3.20e-01

Pos logit contributions
ディ+0.016
+0.015
Counter+0.015
akespeare+0.014
+0.014
Neg logit contributions
zie-0.014
yg-0.013
aptic-0.011
grab-0.011
uben-0.011
Token.
Token ID13
Feature activation+3.70e-01

Pos logit contributions
zie+3.200
yg+2.868
uben+2.478
grab+2.475
verbal+2.471
Neg logit contributions
ディ-4.203
-3.980
Counter-3.944
akespeare-3.765
-3.667
Token Flynn
Token ID14654
Feature activation+3.96e-01

Pos logit contributions
zie+4.053
yg+3.720
grab+3.236
verbal+3.223
uben+3.196
Neg logit contributions
ディ-4.398
-4.137
Counter-4.067
akespeare-4.014
-3.789
Token and
Token ID290
Feature activation+2.49e-01

Pos logit contributions
zie+4.044
yg+3.656
grab+3.156
uben+3.130
verbal+3.129
Neg logit contributions
ディ-5.054
-4.750
Counter-4.720
akespeare-4.525
-4.371
Token his
Token ID465
Feature activation+3.23e-01

Pos logit contributions
zie+2.968
yg+2.713
uben+2.404
grab+2.400
verbal+2.399
Neg logit contributions
ディ-2.784
-2.611
Counter-2.581
akespeare-2.450
-2.367
Token wife
Token ID3656
Feature activation+1.11e+00

Pos logit contributions
zie+4.136
yg+3.826
verbal+3.420
grab+3.413
aptic+3.393
Neg logit contributions
ディ-3.294
-3.073
Counter-3.033
akespeare-2.894
-2.756
Token specifically
Token ID5734
Feature activation-2.43e-02

Pos logit contributions
zie+10.830
yg+10.049
uben+8.475
enh+8.445
grab+8.346
Neg logit contributions
ディ-12.816
-11.770
Counter-11.671
akespeare-11.641
Doctor-10.950
Token decided
Token ID3066
Feature activation-3.49e-01

Pos logit contributions
ディ+0.292
+0.275
Counter+0.273
akespeare+0.260
+0.253
Neg logit contributions
zie-0.267
yg-0.242
aptic-0.213
uben-0.213
verbal-0.213
Token to
Token ID284
Feature activation-4.84e-01

Pos logit contributions
ディ+3.523
+3.379
Counter+3.258
akespeare+2.968
+2.934
Neg logit contributions
zie-4.274
yg-3.880
verbal-3.595
aptic-3.583
gain-3.522
Token make
Token ID787
Feature activation+2.67e-01

Pos logit contributions
ディ+4.941
+4.691
Counter+4.649
akespeare+4.419
 spoof+4.388
Neg logit contributions
zie-5.860
yg-5.247
aptic-4.925
uben-4.786
verbal-4.785
Token the
Token ID262
Feature activation+2.83e-01

Pos logit contributions
zie+3.037
yg+2.762
uben+2.432
grab+2.427
verbal+2.426
Neg logit contributions
ディ-3.116
-2.936
Counter-2.907
akespeare-2.764
-2.679
Token have
Token ID423
Feature activation+1.38e-01

Pos logit contributions
zie+2.653
yg+2.406
grab+2.111
gain+2.107
uben+2.094
Neg logit contributions
ディ-3.117
-2.940
Counter-2.873
akespeare-2.783
-2.701
Token no
Token ID645
Feature activation+6.21e-01

Pos logit contributions
zie+1.536
yg+1.393
uben+1.222
grab+1.220
verbal+1.220
Neg logit contributions
ディ-1.666
-1.570
Counter-1.554
akespeare-1.479
-1.435
Token j
Token ID474
Feature activation+6.87e-01

Pos logit contributions
zie+6.465
yg+6.006
enh+5.266
gain+5.256
grab+5.184
Neg logit contributions
ディ-7.470
-7.049
Counter-6.889
akespeare-6.732
-6.441
Tokenagg
Token ID9460
Feature activation+2.43e-01

Pos logit contributions
zie+3.255
yg+3.083
grab+2.233
gain+2.187
enh+2.115
Neg logit contributions
ディ-12.311
-11.653
Counter-11.158
-10.986
-10.805
Tokenies
Token ID444
Feature activation+2.33e-01

Pos logit contributions
zie+2.899
yg+2.685
gain+2.279
grab+2.266
enh+2.253
Neg logit contributions
ディ-2.866
-2.609
Counter-2.564
akespeare-2.420
-2.409
Token at
Token ID379
Feature activation+1.13e+00

Pos logit contributions
zie+2.397
yg+2.165
grab+1.884
gain+1.881
uben+1.878
Neg logit contributions
ディ-2.986
-2.821
Counter-2.782
akespeare-2.673
-2.601
Token all
Token ID477
Feature activation-1.17e-01

Pos logit contributions
zie+6.746
yg+6.432
enh+5.435
grab+5.224
dl+4.902
Neg logit contributions
ディ-16.617
-15.732
Counter-15.442
akespeare-15.110
-14.876
Token even
Token ID772
Feature activation+2.70e-02

Pos logit contributions
ディ+1.374
+1.300
Counter+1.296
akespeare+1.211
Doctor+1.179
Neg logit contributions
zie-1.302
yg-1.181
verbal-1.041
uben-1.041
aptic-1.028
Token with
Token ID351
Feature activation+2.64e-01

Pos logit contributions
zie+0.289
yg+0.260
uben+0.227
verbal+0.227
grab+0.226
Neg logit contributions
ディ-0.335
-0.316
Counter-0.314
akespeare-0.298
-0.290
Token no
Token ID645
Feature activation+5.99e-01

Pos logit contributions
zie+2.871
yg+2.603
uben+2.280
grab+2.278
verbal+2.271
Neg logit contributions
ディ-3.221
-3.036
Counter-3.003
akespeare-2.866
-2.778
Token AA
Token ID15923
Feature activation+3.00e-01

Pos logit contributions
zie+6.699
yg+6.186
enh+5.471
uben+5.445
grab+5.440
Neg logit contributions
ディ-6.877
-6.370
Counter-6.257
akespeare-6.087
-5.787
Token's
Token ID338
Feature activation+3.65e-01

Pos logit contributions
zie+3.670
yg+3.271
grab+2.832
aptic+2.791
verbal+2.789
Neg logit contributions
ディ-4.948
-4.661
Counter-4.608
akespeare-4.403
-4.317
Token going
Token ID1016
Feature activation+5.38e-01

Pos logit contributions
zie+3.908
yg+3.551
grab+3.119
verbal+3.086
enh+3.070
Neg logit contributions
ディ-4.505
-4.230
Counter-4.127
akespeare-3.953
-3.850
Token to
Token ID284
Feature activation+3.70e-01

Pos logit contributions
zie+4.313
yg+3.775
grab+3.265
enh+3.193
gain+3.184
Neg logit contributions
ディ-7.833
-7.379
Counter-7.210
akespeare-7.066
-6.904
Token take
Token ID1011
Feature activation+1.99e-01

Pos logit contributions
zie+4.474
yg+4.098
grab+3.631
gain+3.602
uben+3.601
Neg logit contributions
ディ-4.077
-3.757
Counter-3.721
akespeare-3.513
-3.443
Token some
Token ID617
Feature activation+2.51e-01

Pos logit contributions
zie+2.233
yg+2.039
grab+1.788
verbal+1.784
uben+1.779
Neg logit contributions
ディ-2.363
-2.218
Counter-2.190
akespeare-2.087
-2.026
Token work
Token ID670
Feature activation+7.70e-02

Pos logit contributions
zie+2.946
yg+2.718
verbal+2.399
grab+2.394
aptic+2.382
Neg logit contributions
ディ-2.837
-2.655
Counter-2.620
akespeare-2.487
-2.401
Token,
Token ID11
Feature activation+2.38e-01

Pos logit contributions
zie+0.839
yg+0.757
uben+0.666
verbal+0.664
aptic+0.663
Neg logit contributions
ディ-0.935
-0.883
Counter-0.875
akespeare-0.835
-0.805
Token on
Token ID319
Feature activation+3.07e-01

Pos logit contributions
zie+2.585
yg+2.354
grab+2.057
verbal+2.053
aptic+2.048
Neg logit contributions
ディ-2.907
-2.726
Counter-2.699
akespeare-2.578
-2.498
Token both
Token ID1111
Feature activation-2.57e-01

Pos logit contributions
zie+3.355
yg+3.054
verbal+2.667
grab+2.656
uben+2.652
Neg logit contributions
ディ-3.742
-3.531
Counter-3.475
akespeare-3.311
-3.222
Token the
Token ID262
Feature activation-5.49e-01

Pos logit contributions
ディ+2.238
+2.039
Counter+2.026
akespeare+1.895
+1.807
Neg logit contributions
zie-3.693
yg-3.323
uben-3.077
grab-3.058
aptic-3.035
Token political
Token ID1964
Feature activation-1.68e-01

Pos logit contributions
ディ+5.613
+5.207
Counter+5.174
akespeare+4.905
 cloaked+4.843
Neg logit contributions
zie-6.682
yg-5.853
uben-5.425
grab-5.379
verbal-5.368
Tokenman
Token ID805
Feature activation-1.80e-01

Pos logit contributions
zie+2.385
yg+1.838
grab+1.597
enh+1.530
md+1.475
Neg logit contributions
ディ-7.586
-7.331
Counter-7.053
-6.998
akespeare-6.977
Token does
Token ID857
Feature activation+6.14e-02

Pos logit contributions
ディ+2.076
+1.993
Counter+1.978
akespeare+1.883
+1.817
Neg logit contributions
zie-2.020
yg-1.828
uben-1.623
gain-1.620
aptic-1.620
Token while
Token ID981
Feature activation-1.36e-01

Pos logit contributions
zie+0.675
yg+0.611
uben+0.536
aptic+0.535
verbal+0.535
Neg logit contributions
ディ-0.739
-0.701
Counter-0.695
akespeare-0.661
-0.638
Token alone
Token ID3436
Feature activation-2.67e-03

Pos logit contributions
ディ+1.547
+1.463
Counter+1.460
akespeare+1.384
+1.328
Neg logit contributions
zie-1.567
yg-1.421
aptic-1.272
verbal-1.264
uben-1.254
Token in
Token ID287
Feature activation-3.66e-01

Pos logit contributions
ディ+0.032
+0.030
Counter+0.030
akespeare+0.028
+0.027
Neg logit contributions
zie-0.030
yg-0.027
aptic-0.024
uben-0.024
verbal-0.024
Token his
Token ID465
Feature activation+6.63e-02

Pos logit contributions
ディ+3.582
+3.333
Counter+3.319
akespeare+3.131
 spoof+2.962
Neg logit contributions
zie-4.713
yg-4.257
aptic-3.912
gain-3.857
uben-3.846
Token room
Token ID2119
Feature activation-3.40e-01

Pos logit contributions
zie+0.812
yg+0.740
verbal+0.661
uben+0.661
grab+0.659
Neg logit contributions
ディ-0.719
-0.677
Counter-0.668
akespeare-0.632
-0.610
Token.
Token ID13
Feature activation+5.01e-01

Pos logit contributions
ディ+3.789
Counter+3.594
+3.550
akespeare+3.381
 cloaked+3.282
Neg logit contributions
zie-3.858
yg-3.464
aptic-3.171
uben-3.130
verbal-3.091
Token
Token ID198
Feature activation+2.06e-01

Pos logit contributions
zie+5.527
yg+5.027
grab+4.508
verbal+4.412
md+4.401
Neg logit contributions
ディ-5.812
-5.451
Counter-5.301
akespeare-5.273
-5.038
Token
Token ID198
Feature activation+3.67e-01

Pos logit contributions
zie+2.208
yg+2.036
grab+1.841
md+1.787
gain+1.784
Neg logit contributions
ディ-2.543
-2.368
akespeare-2.328
Counter-2.273
-2.184
TokenOh
Token ID5812
Feature activation+3.90e-01

Pos logit contributions
zie+4.461
yg+4.106
grab+3.699
verbal+3.678
enh+3.626
Neg logit contributions
ディ-3.933
-3.676
Counter-3.540
akespeare-3.518
-3.351
Token 224
Token ID26063
Feature activation+5.37e-01

Pos logit contributions
zie+2.590
yg+2.367
grab+2.096
verbal+2.093
uben+2.087
Neg logit contributions
ディ-2.521
-2.368
Counter-2.337
akespeare-2.243
-2.148
Token GB
Token ID13124
Feature activation+2.05e-01

Pos logit contributions
zie+5.149
yg+4.604
grab+3.942
uben+3.933
verbal+3.929
Neg logit contributions
ディ-7.207
-6.816
Counter-6.773
akespeare-6.496
-6.314
Token/
Token ID14
Feature activation+4.77e-01

Pos logit contributions
zie+1.464
yg+1.311
grab+1.101
gain+1.064
verbal+1.040
Neg logit contributions
ディ-3.222
-3.071
akespeare-3.036
Counter-2.995
-2.855
Tokens
Token ID82
Feature activation+1.31e-01

Pos logit contributions
zie+2.567
yg+2.223
grab+1.789
gain+1.757
md+1.717
Neg logit contributions
ディ-8.268
-7.965
akespeare-7.901
Counter-7.677
-7.561
Token 192
Token ID17817
Feature activation+5.43e-01

Pos logit contributions
zie+1.609
yg+1.472
uben+1.311
verbal+1.309
grab+1.309
Neg logit contributions
ディ-1.424
-1.333
Counter-1.319
akespeare-1.246
-1.206
Token GB
Token ID13124
Feature activation+1.71e-01

Pos logit contributions
zie+5.151
yg+4.602
grab+3.928
verbal+3.924
uben+3.922
Neg logit contributions
ディ-7.315
-6.927
Counter-6.875
akespeare-6.601
-6.421
Token/
Token ID14
Feature activation+3.75e-01

Pos logit contributions
zie+1.251
yg+1.123
grab+0.937
gain+0.908
verbal+0.893
Neg logit contributions
ディ-2.670
-2.540
akespeare-2.505
Counter-2.477
-2.359
Tokens
Token ID82
Feature activation+1.70e-02

Pos logit contributions
zie+2.170
yg+1.901
grab+1.525
gain+1.506
md+1.474
Neg logit contributions
ディ-6.404
-6.163
akespeare-6.087
Counter-5.958
-5.838
Token 256
Token ID17759
Feature activation+4.06e-01

Pos logit contributions
zie+0.218
yg+0.200
uben+0.180
aptic+0.179
verbal+0.179
Neg logit contributions
ディ-0.176
-0.164
Counter-0.163
akespeare-0.152
-0.148
Token GB
Token ID13124
Feature activation+2.25e-01

Pos logit contributions
zie+3.498
yg+3.121
grab+2.626
gain+2.609
verbal+2.596
Neg logit contributions
ディ-5.788
-5.471
Counter-5.430
akespeare-5.305
-5.099
Token/
Token ID14
Feature activation+3.72e-01

Pos logit contributions
zie+1.598
yg+1.436
grab+1.193
gain+1.158
md+1.140
Neg logit contributions
ディ-3.529
-3.357
akespeare-3.327
Counter-3.264
-3.113
Token know
Token ID760
Feature activation+4.13e-01

Pos logit contributions
zie+1.650
yg+1.496
uben+1.313
grab+1.312
verbal+1.311
Neg logit contributions
ディ-1.772
-1.670
Counter-1.653
akespeare-1.572
-1.526
Token I
Token ID314
Feature activation+1.80e-01

Pos logit contributions
zie+4.571
yg+4.143
uben+3.653
grab+3.645
verbal+3.630
Neg logit contributions
ディ-4.939
-4.608
Counter-4.562
akespeare-4.358
-4.232
Token'm
Token ID1101
Feature activation+4.94e-01

Pos logit contributions
zie+1.947
yg+1.759
grab+1.536
uben+1.533
verbal+1.533
Neg logit contributions
ディ-2.225
-2.100
Counter-2.077
akespeare-1.977
-1.919
Token being
Token ID852
Feature activation+1.51e-02

Pos logit contributions
zie+5.526
yg+5.042
grab+4.459
verbal+4.454
uben+4.421
Neg logit contributions
ディ-5.674
-5.350
Counter-5.251
akespeare-5.006
-4.862
Token silly
Token ID14397
Feature activation+6.78e-01

Pos logit contributions
zie+0.181
yg+0.166
uben+0.147
aptic+0.147
verbal+0.146
Neg logit contributions
ディ-0.168
-0.157
Counter-0.157
akespeare-0.148
-0.144
Token.
Token ID13
Feature activation+2.45e-01

Pos logit contributions
zie+6.470
yg+5.688
verbal+4.989
grab+4.951
uben+4.946
Neg logit contributions
ディ-8.858
-8.327
Counter-8.255
akespeare-7.903
-7.758
Token But
Token ID887
Feature activation+5.68e-02

Pos logit contributions
zie+2.748
yg+2.496
grab+2.207
verbal+2.199
uben+2.194
Neg logit contributions
ディ-2.887
-2.713
Counter-2.676
akespeare-2.572
-2.476
Token it
Token ID340
Feature activation+3.55e-01

Pos logit contributions
zie+0.661
yg+0.603
verbal+0.533
aptic+0.532
uben+0.532
Neg logit contributions
ディ-0.647
-0.611
Counter-0.605
akespeare-0.573
-0.554
Token's
Token ID338
Feature activation-1.73e-01

Pos logit contributions
zie+3.667
yg+3.292
grab+2.868
uben+2.846
verbal+2.843
Neg logit contributions
ディ-4.533
-4.279
Counter-4.231
akespeare-4.033
-3.939
Token just
Token ID655
Feature activation-4.69e-01

Pos logit contributions
ディ+1.880
+1.771
Counter+1.771
akespeare+1.672
+1.611
Neg logit contributions
zie-2.065
yg-1.884
aptic-1.702
uben-1.681
verbal-1.673
Token that
Token ID326
Feature activation+3.40e-01

Pos logit contributions
ディ+4.829
+4.564
Counter+4.563
akespeare+4.243
 cloaked+4.212
Neg logit contributions
zie-5.618
yg-5.159
aptic-4.597
uben-4.570
gain-4.521
Token a
Token ID257
Feature activation+5.13e-01

Pos logit contributions
zie+9.105
yg+8.392
grab+7.507
uben+7.375
gain+7.353
Neg logit contributions
ディ-9.366
-8.838
Counter-8.546
akespeare-8.305
-7.944
Token cable
Token ID7862
Feature activation+4.22e-02

Pos logit contributions
zie+5.620
yg+5.131
grab+4.551
uben+4.510
verbal+4.405
Neg logit contributions
ディ-6.020
-5.681
Counter-5.614
akespeare-5.385
-5.163
Token bundle
Token ID18537
Feature activation+8.17e-01

Pos logit contributions
zie+0.472
yg+0.431
uben+0.379
aptic+0.378
verbal+0.378
Neg logit contributions
ディ-0.503
-0.476
Counter-0.470
akespeare-0.445
-0.434
Token!
Token ID0
Feature activation+5.08e-01

Pos logit contributions
zie+7.685
yg+6.745
grab+5.997
verbal+5.864
enh+5.835
Neg logit contributions
ディ-10.343
-9.737
Counter-9.661
akespeare-9.418
-8.932
Token You
Token ID921
Feature activation+7.05e-01

Pos logit contributions
zie+5.714
yg+5.249
grab+4.679
gain+4.586
verbal+4.582
Neg logit contributions
ディ-5.809
-5.415
Counter-5.313
akespeare-5.201
-4.947
Token must
Token ID1276
Feature activation+5.86e-01

Pos logit contributions
zie+6.494
yg+5.846
gain+5.050
grab+4.989
enh+4.847
Neg logit contributions
ディ-9.229
-8.712
Counter-8.646
akespeare-8.383
-8.059
Token specify
Token ID11986
Feature activation-2.05e-01

Pos logit contributions
zie+5.997
yg+5.574
grab+4.752
gain+4.735
enh+4.672
Neg logit contributions
ディ-7.210
-6.794
Counter-6.690
akespeare-6.401
-6.195
Token that
Token ID326
Feature activation-7.86e-02

Pos logit contributions
ディ+2.221
+2.070
Counter+2.064
akespeare+1.943
+1.896
Neg logit contributions
zie-2.473
yg-2.254
aptic-2.035
verbal-2.021
uben-2.020
Token you
Token ID345
Feature activation-3.97e-01

Pos logit contributions
ディ+0.764
+0.709
Counter+0.706
akespeare+0.657
+0.643
Neg logit contributions
zie-1.040
yg-0.958
verbal-0.874
aptic-0.870
uben-0.864
Token want
Token ID765
Feature activation+3.07e-01

Pos logit contributions
ディ+4.646
+4.405
Counter+4.346
akespeare+4.097
+4.009
Neg logit contributions
zie-4.283
yg-3.831
verbal-3.547
uben-3.485
grab-3.411
Token issue
Token ID2071
Feature activation-2.47e-01

Pos logit contributions
zie+3.385
yg+3.069
uben+2.689
grab+2.687
verbal+2.684
Neg logit contributions
ディ-3.704
-3.492
Counter-3.455
akespeare-3.287
-3.188
Token lot
Token ID1256
Feature activation-5.97e-01

Pos logit contributions
zie+2.079
yg+1.807
grab+1.514
verbal+1.509
uben+1.507
Neg logit contributions
ディ-3.622
-3.456
Counter-3.432
akespeare-3.322
-3.227
Token of
Token ID286
Feature activation-1.13e-01

Pos logit contributions
ディ+5.412
Counter+5.160
+5.118
 cloaked+4.845
 guests+4.716
Neg logit contributions
zie-7.420
yg-6.927
uben-6.154
grab-6.054
enh-6.031
Token snow
Token ID6729
Feature activation-2.23e-01

Pos logit contributions
ディ+1.305
+1.229
Counter+1.213
akespeare+1.142
+1.111
Neg logit contributions
zie-1.300
yg-1.188
uben-1.052
grab-1.048
verbal-1.048
Token recently
Token ID2904
Feature activation-7.97e-01

Pos logit contributions
ディ+2.570
+2.405
Counter+2.374
akespeare+2.237
Doctor+2.185
Neg logit contributions
zie-2.526
yg-2.284
uben-2.042
aptic-2.018
verbal-2.015
Token.
Token ID13
Feature activation-1.25e-01

Pos logit contributions
ディ+7.745
+7.378
Counter+7.258
 cloaked+6.741
Doctor+6.691
Neg logit contributions
zie-8.821
yg-8.348
verbal-7.461
aptic-7.288
uben-7.233
Token From
Token ID3574
Feature activation-3.01e-01

Pos logit contributions
ディ+1.441
+1.355
Counter+1.347
akespeare+1.262
+1.232
Neg logit contributions
zie-1.436
yg-1.307
aptic-1.157
uben-1.157
verbal-1.153
Token the
Token ID262
Feature activation-2.03e-01

Pos logit contributions
ディ+3.215
+2.988
Counter+2.964
akespeare+2.728
 cloaked+2.719
Neg logit contributions
zie-3.600
yg-3.267
aptic-2.957
verbal-2.949
uben-2.946
Token reports
Token ID3136
Feature activation-6.16e-01

Pos logit contributions
ディ+2.174
+2.052
Counter+2.004
akespeare+1.873
+1.842
Neg logit contributions
zie-2.448
yg-2.274
uben-2.018
verbal-2.015
aptic-2.003
Token we
Token ID356
Feature activation-1.67e-01

Pos logit contributions
ディ+5.874
+5.732
Counter+5.685
 cloaked+5.229
akespeare+5.153
Neg logit contributions
zie-7.249
yg-6.766
uben-6.007
enh-5.979
gain-5.939
Token've
Token ID1053
Feature activation-2.92e-01

Pos logit contributions
ディ+2.059
+1.951
Counter+1.937
akespeare+1.853
+1.795
Neg logit contributions
zie-1.769
yg-1.595
uben-1.391
aptic-1.385
grab-1.382
Token received
Token ID2722
Feature activation-1.71e-01

Pos logit contributions
ディ+3.059
+2.884
Counter+2.852
akespeare+2.707
+2.598
Neg logit contributions
zie-3.550
yg-3.218
aptic-2.904
uben-2.882
verbal-2.866
Token state
Token ID1181
Feature activation+4.99e-01

Pos logit contributions
zie+6.399
yg+5.882
aptic+5.095
uben+5.084
grab+5.076
Neg logit contributions
ディ-7.182
Counter-6.682
-6.666
akespeare-6.309
-6.124
Token court
Token ID2184
Feature activation+3.83e-01

Pos logit contributions
zie+5.144
yg+4.665
verbal+4.069
enh+4.023
aptic+4.018
Neg logit contributions
ディ-6.268
Counter-5.839
-5.829
akespeare-5.528
-5.369
Token level
Token ID1241
Feature activation+2.68e-01

Pos logit contributions
zie+3.961
yg+3.586
uben+3.101
verbal+3.098
aptic+3.098
Neg logit contributions
ディ-4.854
-4.567
Counter-4.540
akespeare-4.333
-4.200
Token,
Token ID11
Feature activation+3.46e-01

Pos logit contributions
zie+2.861
yg+2.582
uben+2.253
grab+2.249
verbal+2.248
Neg logit contributions
ディ-3.336
-3.155
Counter-3.122
akespeare-2.976
-2.893
Token where
Token ID810
Feature activation+5.33e-01

Pos logit contributions
zie+3.707
yg+3.350
grab+2.936
aptic+2.927
enh+2.917
Neg logit contributions
ディ-4.249
-3.998
Counter-3.942
akespeare-3.781
-3.656
Token 95
Token ID6957
Feature activation-2.46e-01

Pos logit contributions
zie+6.223
yg+5.791
verbal+5.067
uben+5.054
grab+5.053
Neg logit contributions
ディ-5.897
-5.505
Counter-5.449
akespeare-5.111
-4.971
Token percent
Token ID1411
Feature activation-2.88e-01

Pos logit contributions
ディ+2.289
+2.074
Counter+2.027
akespeare+1.944
+1.828
Neg logit contributions
zie-3.387
yg-3.075
grab-2.836
uben-2.831
verbal-2.815
Token of
Token ID286
Feature activation+7.44e-01

Pos logit contributions
ディ+3.064
+2.841
Counter+2.759
akespeare+2.700
+2.556
Neg logit contributions
zie-3.554
yg-3.155
uben-2.927
grab-2.908
aptic-2.844
Token the
Token ID262
Feature activation+7.04e-01

Pos logit contributions
zie+7.990
yg+7.504
grab+6.376
enh+6.364
gain+6.358
Neg logit contributions
ディ-8.542
Counter-8.031
-8.027
akespeare-7.458
-7.371
Token nation
Token ID3277
Feature activation-1.87e-01

Pos logit contributions
zie+7.522
yg+7.004
gain+6.083
enh+6.025
uben+6.015
Neg logit contributions
ディ-8.025
Counter-7.545
-7.531
akespeare-7.053
-6.925
Token
Token ID447
Feature activation+8.62e-02

Pos logit contributions
ディ+2.226
+2.119
Counter+2.085
akespeare+1.992
+1.913
Neg logit contributions
zie-2.022
yg-1.837
uben-1.636
verbal-1.611
gain-1.603
Token},
Token ID5512
Feature activation-1.94e-01

Pos logit contributions
ディ+3.836
+3.575
Counter+3.552
akespeare+3.302
+3.212
Neg logit contributions
zie-5.057
yg-4.657
aptic-4.193
uben-4.182
verbal-4.181
Token {{
Token ID22935
Feature activation-4.35e-02

Pos logit contributions
ディ+1.781
+1.686
Counter+1.642
akespeare+1.534
+1.463
Neg logit contributions
zie-2.642
yg-2.448
uben-2.249
verbal-2.227
enh-2.214
Token9
Token ID24
Feature activation-4.52e-01

Pos logit contributions
ディ+0.509
Counter+0.479
+0.479
akespeare+0.442
+0.434
Neg logit contributions
zie-0.498
yg-0.453
aptic-0.403
uben-0.403
verbal-0.395
Token,
Token ID11
Feature activation+2.07e-01

Pos logit contributions
ディ+5.370
Counter+5.066
+5.063
akespeare+4.704
+4.615
Neg logit contributions
zie-4.979
yg-4.482
aptic-3.997
uben-3.959
verbal-3.942
Token 14
Token ID1478
Feature activation-4.42e-01

Pos logit contributions
zie+2.099
yg+1.884
uben+1.630
aptic+1.626
verbal+1.625
Neg logit contributions
ディ-2.693
-2.544
Counter-2.525
akespeare-2.410
-2.345
Token},
Token ID5512
Feature activation-3.78e-01

Pos logit contributions
ディ+4.316
+4.018
Counter+4.005
akespeare+3.716
+3.611
Neg logit contributions
zie-5.791
yg-5.315
aptic-4.800
uben-4.784
verbal-4.777
Token {
Token ID1391
Feature activation+1.90e-02

Pos logit contributions
ディ+4.392
+4.133
Counter+4.107
akespeare+3.856
+3.764
Neg logit contributions
zie-4.292
yg-3.881
aptic-3.451
uben-3.442
verbal-3.428
Token9
Token ID24
Feature activation-5.03e-01

Pos logit contributions
zie+0.235
yg+0.215
uben+0.197
aptic+0.195
verbal+0.192
Neg logit contributions
ディ-0.201
Counter-0.191
-0.189
akespeare-0.173
Doctor-0.169
Token,
Token ID11
Feature activation+1.93e-01

Pos logit contributions
ディ+5.730
Counter+5.447
+5.426
akespeare+4.972
Doctor+4.919
Neg logit contributions
zie-5.684
yg-5.123
aptic-4.642
uben-4.599
verbal-4.531
Token 15
Token ID1315
Feature activation-6.69e-01

Pos logit contributions
zie+1.804
yg+1.606
grab+1.401
verbal+1.381
gain+1.375
Neg logit contributions
ディ-2.616
-2.513
Counter-2.462
akespeare-2.377
-2.311
Token},
Token ID5512
Feature activation-5.11e-01

Pos logit contributions
ディ+5.895
Counter+5.491
+5.476
akespeare+4.945
+4.913
Neg logit contributions
zie-9.013
yg-8.330
aptic-7.625
verbal-7.558
uben-7.520
Token the
Token ID262
Feature activation+1.87e-01

Pos logit contributions
ディ+3.875
Counter+3.603
+3.580
akespeare+3.329
 spoof+3.299
Neg logit contributions
zie-4.251
yg-3.839
aptic-3.470
uben-3.442
grab-3.436
Token so
Token ID523
Feature activation-2.01e-01

Pos logit contributions
zie+2.207
yg+2.015
grab+1.782
verbal+1.782
uben+1.780
Neg logit contributions
ディ-2.106
-1.976
Counter-1.954
akespeare-1.857
-1.798
Token-
Token ID12
Feature activation+4.40e-01

Pos logit contributions
ディ+2.406
Counter+2.201
+2.189
akespeare+2.058
Doctor+2.031
Neg logit contributions
zie-2.145
yg-1.915
gain-1.751
aptic-1.722
grab-1.715
Tokencalled
Token ID7174
Feature activation+1.52e-01

Pos logit contributions
zie+2.483
yg+2.032
verbal+1.700
grab+1.599
gain+1.535
Neg logit contributions
ディ-7.476
-7.271
Counter-7.099
akespeare-7.033
-6.951
Token complaint
Token ID8224
Feature activation-2.09e-01

Pos logit contributions
zie+1.677
yg+1.522
uben+1.336
aptic+1.333
verbal+1.331
Neg logit contributions
ディ-1.828
-1.723
Counter-1.706
akespeare-1.620
-1.576
Token.
Token ID13
Feature activation-2.47e-01

Pos logit contributions
ディ+2.394
+2.251
Counter+2.247
akespeare+2.098
 spoof+2.064
Neg logit contributions
zie-2.359
yg-2.145
aptic-1.911
uben-1.904
grab-1.885
Token That
Token ID1320
Feature activation+4.53e-01

Pos logit contributions
ディ+2.782
+2.588
Counter+2.587
Doctor+2.375
akespeare+2.366
Neg logit contributions
zie-2.924
yg-2.663
aptic-2.407
uben-2.372
grab-2.350
Token was
Token ID373
Feature activation+9.44e-02

Pos logit contributions
zie+4.882
yg+4.425
grab+3.875
verbal+3.863
gain+3.832
Neg logit contributions
ディ-5.514
-5.191
Counter-5.136
akespeare-4.921
-4.748
Token back
Token ID736
Feature activation+3.73e-01

Pos logit contributions
zie+1.057
yg+0.959
uben+0.843
grab+0.841
verbal+0.841
Neg logit contributions
ディ-1.126
-1.060
Counter-1.050
akespeare-0.998
-0.968
Token in
Token ID287
Feature activation+2.96e-01

Pos logit contributions
zie+3.786
yg+3.412
verbal+2.983
grab+2.967
uben+2.946
Neg logit contributions
ディ-4.784
-4.535
Counter-4.480
akespeare-4.279
-4.177
Token 2008
Token ID3648
Feature activation-3.33e-03

Pos logit contributions
zie+3.535
yg+3.232
verbal+2.871
grab+2.853
gain+2.849
Neg logit contributions
ディ-3.270
-3.076
Counter-3.030
akespeare-2.875
-2.794
Token Castro
Token ID21193
Feature activation-3.01e-01

Pos logit contributions
zie+1.060
yg+0.976
uben+0.879
verbal+0.878
grab+0.878
Neg logit contributions
ディ-0.777
-0.721
Counter-0.712
akespeare-0.669
-0.645
Token,
Token ID11
Feature activation+6.39e-02

Pos logit contributions
ディ+3.174
+3.103
Counter+3.009
akespeare+2.891
+2.775
Neg logit contributions
zie-3.516
yg-3.305
verbal-2.898
grab-2.893
gain-2.867
Token Cuban
Token ID18147
Feature activation-2.07e-01

Pos logit contributions
zie+0.778
yg+0.714
uben+0.635
verbal+0.632
aptic+0.631
Neg logit contributions
ディ-0.698
-0.655
Counter-0.650
akespeare-0.613
-0.593
Token dissidents
Token ID49913
Feature activation-2.74e-01

Pos logit contributions
ディ+2.089
+1.964
Counter+1.949
akespeare+1.797
+1.752
Neg logit contributions
zie-2.630
yg-2.446
uben-2.200
grab-2.165
verbal-2.160
Token who
Token ID508
Feature activation-5.70e-01

Pos logit contributions
ディ+3.103
+3.016
Counter+2.971
akespeare+2.782
+2.727
Neg logit contributions
zie-3.057
yg-2.779
verbal-2.462
grab-2.454
uben-2.450
Token advocate
Token ID12811
Feature activation-2.38e-01

Pos logit contributions
ディ+5.834
+5.830
Counter+5.790
akespeare+5.405
 cloaked+5.361
Neg logit contributions
zie-6.467
yg-5.842
grab-5.336
verbal-5.334
aptic-5.280
Token greater
Token ID3744
Feature activation-2.84e-01

Pos logit contributions
ディ+2.782
+2.632
Counter+2.626
akespeare+2.437
1963+2.391
Neg logit contributions
zie-2.645
yg-2.408
uben-2.127
aptic-2.117
verbal-2.115
Token human
Token ID1692
Feature activation+6.20e-02

Pos logit contributions
ディ+3.166
+2.973
Counter+2.955
akespeare+2.771
+2.708
Neg logit contributions
zie-3.308
yg-2.995
uben-2.696
aptic-2.678
verbal-2.669
Token rights
Token ID2489
Feature activation+2.29e-01

Pos logit contributions
zie+0.750
yg+0.688
verbal+0.614
uben+0.613
aptic+0.608
Neg logit contributions
ディ-0.679
-0.637
Counter-0.629
akespeare-0.594
-0.574
Token on
Token ID319
Feature activation+5.74e-01

Pos logit contributions
zie+2.437
yg+2.199
grab+1.918
uben+1.918
verbal+1.914
Neg logit contributions
ディ-2.860
-2.697
Counter-2.669
akespeare-2.547
-2.473
Token the
Token ID262
Feature activation+9.25e-02

Pos logit contributions
zie+6.600
yg+5.989
gain+5.356
enh+5.297
grab+5.287
Neg logit contributions
ディ-6.413
-6.107
Counter-5.936
akespeare-5.717
-5.507
Token opinions
Token ID9317
Feature activation-2.37e-01

Pos logit contributions
ディ+5.478
Counter+5.166
+5.146
akespeare+4.872
+4.744
Neg logit contributions
zie-5.357
yg-4.928
aptic-4.378
verbal-4.365
gain-4.306
Token expressed
Token ID6241
Feature activation-9.85e-02

Pos logit contributions
ディ+2.262
+2.187
Counter+2.110
akespeare+1.995
+1.910
Neg logit contributions
zie-3.141
yg-2.873
uben-2.617
gain-2.593
aptic-2.572
Token by
Token ID416
Feature activation-7.23e-01

Pos logit contributions
ディ+1.244
+1.194
Counter+1.189
akespeare+1.136
 spoof+1.087
Neg logit contributions
zie-1.020
yg-0.904
aptic-0.807
verbal-0.803
uben-0.802
Token column
Token ID5721
Feature activation-1.07e+00

Pos logit contributions
ディ+5.462
Counter+5.263
+4.960
 guests+4.790
akespeare+4.710
Neg logit contributions
zie-10.856
yg-9.749
aptic-9.359
grab-9.007
enh-9.002
Tokenists
Token ID1023
Feature activation-8.18e-01

Pos logit contributions
Counter+3.476
ディ+3.228
+2.937
 guests+2.692
Doctor+1.964
Neg logit contributions
zie-19.880
yg-18.771
uben-18.040
retch-17.408
gain-17.327
Token are
Token ID389
Feature activation-1.51e+00

Pos logit contributions
Counter+8.031
+7.691
ディ+7.578
 guests+7.377
akespeare+7.023
Neg logit contributions
zie-9.882
yg-9.219
uben-8.578
grab-8.417
aptic-8.402
Token their
Token ID511
Feature activation-7.61e-01

Pos logit contributions
Counter+4.070
ディ+3.884
 guests+3.094
 cloaked+2.701
+2.654
Neg logit contributions
zie-25.666
gain-24.152
yg-23.988
aptic-23.753
oled-22.831
Token own
Token ID898
Feature activation-9.01e-01

Pos logit contributions
Counter+5.822
ディ+5.808
+5.498
 guests+5.319
Doctor+4.903
Neg logit contributions
zie-9.965
yg-9.787
aptic-9.438
uben-9.271
retch-9.228
Token and
Token ID290
Feature activation-1.08e+00

Pos logit contributions
ディ+7.333
Counter+7.193
+6.986
Doctor+6.489
 cloaked+6.200
Neg logit contributions
zie-11.853
yg-11.434
aptic-10.712
grab-10.555
gain-10.396
Token do
Token ID466
Feature activation-6.63e-01

Pos logit contributions
Counter+7.521
ディ+6.934
+6.775
+6.370
 guests+6.289
Neg logit contributions
zie-14.665
yg-13.995
aptic-13.974
grab-13.295
enh-13.242
Token not
Token ID407
Feature activation-6.12e-01

Pos logit contributions
ディ+5.448
Counter+5.138
+5.014
akespeare+4.452
+4.397
Neg logit contributions
zie-9.430
yg-8.719
uben-8.268
aptic-8.221
enh-8.036
Token face
Token ID1986
Feature activation-1.20e+00

Pos logit contributions
ディ+1.875
+1.783
Counter+1.777
akespeare+1.678
+1.630
Neg logit contributions
zie-1.773
yg-1.596
grab-1.416
verbal-1.415
aptic-1.409
Token are
Token ID389
Feature activation-1.16e+00

Pos logit contributions
Counter+10.797
 cloaked+10.507
ディ+10.383
+10.319
akespeare+9.833
Neg logit contributions
zie-12.341
yg-11.207
uben-10.191
verbal-10.131
aptic-10.079
Token just
Token ID655
Feature activation-1.21e+00

Pos logit contributions
 cloaked+10.681
Counter+10.643
ディ+10.367
+10.256
 guests+9.925
Neg logit contributions
zie-11.994
yg-11.140
aptic-10.308
verbal-9.867
uben-9.866
Token as
Token ID355
Feature activation-2.64e-01

Pos logit contributions
ディ+9.758
+9.640
Counter+9.474
 guests+8.947
 cloaked+8.824
Neg logit contributions
zie-13.520
yg-12.706
grab-11.504
verbal-11.490
uben-11.251
Token powerful
Token ID3665
Feature activation-1.62e+00

Pos logit contributions
ディ+3.111
Counter+3.021
+2.998
akespeare+2.811
 cloaked+2.766
Neg logit contributions
zie-2.837
yg-2.582
aptic-2.285
uben-2.254
verbal-2.251
Token.
Token ID13
Feature activation-1.66e+00

Pos logit contributions
+12.341
Counter+12.131
ディ+11.855
 cloaked+11.796
 guests+11.213
Neg logit contributions
zie-15.830
yg-14.427
verbal-13.003
grab-12.862
uben-12.837
Token The
Token ID383
Feature activation-1.16e+00

Pos logit contributions
Counter+13.898
ディ+13.282
+12.941
Terror+11.555
 Opera+11.449
Neg logit contributions
zie-16.427
yg-14.894
aptic-14.501
uben-13.781
yrus-13.572
Token stakes
Token ID21147
Feature activation+1.81e-01

Pos logit contributions
ディ+10.975
Counter+10.718
+10.492
 cloaked+10.333
 guests+9.837
Neg logit contributions
zie-11.760
yg-10.635
aptic-9.694
verbal-9.665
uben-9.634
Token of
Token ID286
Feature activation-4.49e-01

Pos logit contributions
zie+1.821
yg+1.630
uben+1.405
verbal+1.404
grab+1.403
Neg logit contributions
ディ-2.362
-2.244
Counter-2.221
akespeare-2.122
-2.063
Token our
Token ID674
Feature activation-4.62e-01

Pos logit contributions
ディ+4.658
Counter+4.424
+4.420
akespeare+4.147
 cloaked+3.981
Neg logit contributions
zie-5.288
yg-4.762
aptic-4.458
verbal-4.316
uben-4.312
Token struggle
Token ID6531
Feature activation+7.58e-02

Pos logit contributions
ディ+4.817
+4.636
Counter+4.575
akespeare+4.238
+4.128
Neg logit contributions
zie-5.520
yg-4.985
aptic-4.632
verbal-4.518
uben-4.477
Token the
Token ID262
Feature activation-1.03e+00

Pos logit contributions
Counter+11.395
 cloaked+10.986
 spoof+10.915
 filtered+10.338
ディ+10.195
Neg logit contributions
zie-15.635
yg-13.961
aptic-13.279
retch-12.918
uben-12.807
Token work
Token ID670
Feature activation-2.13e-01

Pos logit contributions
Counter+8.718
 spoof+8.340
ディ+8.208
+8.108
 guests+7.951
Neg logit contributions
zie-12.076
yg-11.066
gain-10.203
aptic-10.111
uben-10.014
Token of
Token ID286
Feature activation-7.78e-01

Pos logit contributions
ディ+2.316
Counter+2.195
+2.176
akespeare+2.038
+1.984
Neg logit contributions
zie-2.537
yg-2.331
gain-2.067
aptic-2.065
uben-2.063
Token Islamic
Token ID5533
Feature activation-1.21e+00

Pos logit contributions
Counter+7.442
ディ+7.187
+6.931
 spoof+6.863
 cloaked+6.653
Neg logit contributions
zie-9.041
yg-8.420
aptic-7.564
grab-7.449
verbal-7.393
Token extremists
Token ID20674
Feature activation-1.11e+00

Pos logit contributions
Counter+9.554
+8.814
ディ+8.514
 spoof+8.446
 cloaked+8.019
Neg logit contributions
zie-14.272
yg-13.692
verbal-12.598
grab-12.471
uben-12.370
Token or
Token ID393
Feature activation-1.42e+00

Pos logit contributions
Counter+9.717
ディ+9.507
 cloaked+9.343
+9.340
 spoof+9.243
Neg logit contributions
zie-12.472
yg-11.323
aptic-10.281
verbal-10.278
uben-10.051
Token by
Token ID416
Feature activation-7.44e-01

Pos logit contributions
Counter+12.065
 guests+11.707
 spoof+11.600
 cloaked+11.587
+11.123
Neg logit contributions
zie-14.991
yg-13.263
uben-12.514
aptic-12.235
retch-12.201
Token others
Token ID1854
Feature activation-8.10e-01

Pos logit contributions
ディ+7.136
Counter+7.119
+6.726
 spoof+6.694
 cloaked+6.522
Neg logit contributions
zie-8.611
yg-7.927
aptic-7.133
uben-7.100
verbal-7.038
Token seeking
Token ID6095
Feature activation-5.64e-01

Pos logit contributions
Counter+7.677
ディ+7.572
 cloaked+7.215
+7.195
 spoof+7.045
Neg logit contributions
zie-9.400
yg-8.573
aptic-7.724
verbal-7.665
uben-7.628
Token to
Token ID284
Feature activation-3.26e-01

Pos logit contributions
ディ+5.308
+5.221
Counter+5.025
 spoof+4.684
 cloaked+4.673
Neg logit contributions
zie-6.805
yg-6.209
uben-5.851
aptic-5.769
verbal-5.703
Token exploit
Token ID14561
Feature activation-4.52e-02

Pos logit contributions
ディ+3.396
+3.327
Counter+3.321
akespeare+3.146
 spoof+3.063
Neg logit contributions
zie-3.892
yg-3.538
aptic-3.202
uben-3.177
verbal-3.176
Token that
Token ID326
Feature activation-1.73e-01

Pos logit contributions
ディ+2.411
+2.278
Counter+2.249
akespeare+2.155
 spoof+2.111
Neg logit contributions
zie-2.072
yg-1.873
uben-1.648
aptic-1.648
verbal-1.641
Token's
Token ID338
Feature activation-7.71e-01

Pos logit contributions
ディ+2.011
+1.908
Counter+1.891
akespeare+1.820
 spoof+1.771
Neg logit contributions
zie-1.933
yg-1.734
uben-1.567
verbal-1.564
aptic-1.560
Token celebrating
Token ID17499
Feature activation-8.67e-01

Pos logit contributions
ディ+8.270
 spoof+7.791
Counter+7.769
+7.692
 cloaked+7.624
Neg logit contributions
zie-8.284
yg-7.264
aptic-6.883
verbal-6.780
uben-6.719
Token its
Token ID663
Feature activation-1.09e+00

Pos logit contributions
ディ+7.668
Counter+7.375
+7.135
 spoof+6.938
 guests+6.818
Neg logit contributions
zie-10.159
yg-9.252
aptic-8.793
uben-8.720
oled-8.570
Token 40
Token ID2319
Feature activation-5.75e-01

Pos logit contributions
ディ+8.981
 spoof+8.200
+8.109
Counter+8.092
 cloaked+7.908
Neg logit contributions
zie-12.813
yg-11.807
aptic-11.490
uben-11.200
verbal-10.803
Tokenth
Token ID400
Feature activation-1.48e+00

Pos logit contributions
ディ+5.255
+4.755
Counter+4.524
akespeare+4.321
1963+4.149
Neg logit contributions
zie-7.402
yg-6.817
uben-6.455
gain-6.334
grab-6.262
Token anniversary
Token ID11162
Feature activation-6.00e-01

Pos logit contributions
ディ+8.525
 spoof+7.770
+7.434
 guests+7.102
Counter+7.085
Neg logit contributions
zie-18.462
yg-16.885
aptic-16.827
verbal-16.791
uben-16.770
Token.
Token ID13
Feature activation-5.15e-01

Pos logit contributions
ディ+6.158
Counter+5.716
+5.687
 spoof+5.440
akespeare+5.401
Neg logit contributions
zie-6.800
yg-6.234
verbal-5.680
aptic-5.617
uben-5.589
Token It
Token ID632
Feature activation-5.13e-02

Pos logit contributions
ディ+5.159
Counter+4.866
+4.799
Doctor+4.412
1963+4.347
Neg logit contributions
zie-6.363
yg-5.769
uben-5.372
aptic-5.363
verbal-5.209
Token's
Token ID338
Feature activation-5.82e-01

Pos logit contributions
ディ+0.638
+0.604
Counter+0.596
akespeare+0.577
+0.554
Neg logit contributions
zie-0.538
yg-0.487
aptic-0.431
verbal-0.431
uben-0.430
Token paired
Token ID20312
Feature activation+1.45e-01

Pos logit contributions
ディ+6.436
Counter+5.987
+5.934
 spoof+5.752
akespeare+5.736
Neg logit contributions
zie-6.396
yg-5.713
aptic-5.297
verbal-5.219
uben-5.209
Tokenhood
Token ID2894
Feature activation-2.08e-01

Pos logit contributions
ディ+6.075
+5.775
Counter+5.738
akespeare+5.468
1963+5.350
Neg logit contributions
zie-6.303
yg-5.590
aptic-5.230
verbal-5.076
gain-5.054
Token in
Token ID287
Feature activation-1.09e-01

Pos logit contributions
ディ+2.306
Counter+2.176
+2.173
akespeare+2.039
+1.978
Neg logit contributions
zie-2.394
yg-2.178
aptic-1.999
uben-1.936
verbal-1.935
Token New
Token ID968
Feature activation-4.95e-01

Pos logit contributions
ディ+1.032
Counter+0.966
+0.953
akespeare+0.900
1963+0.855
Neg logit contributions
zie-1.427
yg-1.321
aptic-1.220
verbal-1.212
grab-1.197
Token York
Token ID1971
Feature activation-3.64e-01

Pos logit contributions
ディ+5.567
+5.310
Counter+5.297
akespeare+4.992
+4.864
Neg logit contributions
zie-5.464
yg-4.903
aptic-4.567
verbal-4.498
grab-4.446
Token,
Token ID11
Feature activation-7.00e-01

Pos logit contributions
ディ+4.345
Counter+4.100
+4.097
akespeare+3.852
+3.762
Neg logit contributions
zie-3.965
yg-3.598
aptic-3.193
verbal-3.182
uben-3.166
Token also
Token ID635
Feature activation-1.37e+00

Pos logit contributions
ディ+8.364
+7.819
Counter+7.653
akespeare+7.389
+7.291
Neg logit contributions
zie-7.168
yg-6.469
aptic-5.891
verbal-5.846
gain-5.744
Token served
Token ID4983
Feature activation-1.47e+00

Pos logit contributions
ディ+5.716
+4.899
 cloaked+4.654
 filtered+4.651
Counter+4.575
Neg logit contributions
zie-20.728
yg-19.984
aptic-19.876
verbal-18.730
gain-18.633
Token as
Token ID355
Feature activation-1.07e+00

Pos logit contributions
ディ+7.201
Counter+6.677
+6.146
 guests+5.957
9999+5.689
Neg logit contributions
zie-20.748
yg-19.815
oled-18.622
retch-18.444
aptic-18.415
Token a
Token ID257
Feature activation-5.90e-01

Pos logit contributions
ディ+7.600
Counter+7.221
+6.975
9999+6.353
Doctor+6.341
Neg logit contributions
zie-14.485
yg-13.546
aptic-12.709
uben-12.420
verbal-12.379
Token set
Token ID900
Feature activation+5.02e-01

Pos logit contributions
ディ+6.112
Counter+5.586
+5.563
akespeare+5.231
1963+5.066
Neg logit contributions
zie-7.357
yg-6.570
aptic-6.030
gain-5.934
uben-5.925
Token for
Token ID329
Feature activation-6.56e-01

Pos logit contributions
zie+2.432
yg+1.923
grab+1.487
verbal+1.370
enh+1.368
Neg logit contributions
ディ-9.103
-8.625
akespeare-8.424
Counter-8.378
-8.299