TOP ACTIVATIONS
MAX = 2.466

able version <|endoftext|> Rob C estern ino hosts a jam packed
toward progressive policies and generous social spending was soon followed by
towel - lined plate . Serve hot ! 1
easy bank loans and the cac het of having planes land
] This payment schedule would save the Haitian government USD 100
iff Tra x Reviewer : V the Happy L
run - up in opioid prescribing emerged from an unhealthy enchantment
levels of tax cuts and spending increases in upcoming budgets that
the public spaces , image courtesy of Dav part .
impl Board { pub fn copy _ board ( &
can cause real prices to fall , while an excess of
But I just couldn 't bear it . I was so
T > { ... } impl < T > Own ed
or so . Let it sit for a few minutes after
A run - up in prescribing from 2000 through 2011 fueled
cutting off their arms to save their heads , but that
l owski said The Str and , Delaware Street and other
I bought it for music listening and the bass is really
energy . They need us dearly after the ID split to
interesting . Apparently the Str and Craft 122 is going to

MIN ACTIVATIONS
MIN = -2.118

and connect its W AN port to the MI 424 WR
this month after using prof anity to describe President Donald Trump
MI 424 WR 's LAN port . The router should then
a laboratory in San Juan Cap ist rano , Calif .,
USB connection , or your PC hal ts , or the
) under the list of brid ged connections . The original
} at least one blank character (< Space > or <
' of people of all faiths , colors and cre eds
to one of its LAN ports with a static IP address
, said it was no surprise rates were held .
could 3 D MAX 3 D MAX
live harmon iously with your fellow men . I leave you
fellow at the leftist Southern Poverty Law Center , ass ailed
12 . Alex Dun bar ( 93 ) (
by two magical words - mouse issues - which no admin
953 16 27 145 Flat buff ers Reading 20 88 17
fundraising emails to supporters . Coord ination on this front is
} at least one blank character (< Space > or <
exposing their child to prof anity ." PHOTOS :
The p angs of dis pri z 'd love , the

INTERVAL 1.320 - 2.466
CONTAINS 0.040%

of the difference could be attributed to teacher characteristics , especially
iff Tra x Reviewer : V the Happy L
ing David Cameron at Wim bledon , to Alex Sal mond
year fight against the deal exempl ifies Washington s
A run - up in prescribing from 2000 through 2011 fueled

INTERVAL 0.174 - 1.320
CONTAINS 26.048%

m almost done with a marketing degree , I have finally
of great mutual benefit to fans and club alike .
between sovereign Israel and the settlements in the territories ended up
" When I grew up hockey was too expensive with all
[…] Since I m now finally having

INTERVAL -0.972 - 0.174
CONTAINS 73.384%

heim Glacier by 1 . 5 kilometers . And Hel heim
That is way too far gone for me now . I
and bad policies , or be reasoned with .
will want , so factories will close down and choice of
, her leadership of that powerful panel -- and how it

INTERVAL -2.118 - -0.972
CONTAINS 0.528%

th among all edge r us hers *) 2017
m ak ua k ā ne no K ona , Hawai
of girls . Generally , the Mong ols (
asted its world champion Cleveland Rams in a celebration that doubled
California v . Southern Pacific Railroad Company F acts
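
The four INTERVAL ranges above are consistent with equal-width bins between MIN (-2.118) and MAX (2.466). The following is a minimal sketch under that assumption; the function name `equal_width_edges` is hypothetical and is not taken from whatever tool produced this listing. It recomputes the bin edges and checks that the listed CONTAINS percentages add up to 100.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 edges that split [lo, hi] into equal-width bins."""
    step = (hi - lo) / n_bins
    # Round to three decimals to match the precision shown in the listing.
    return [round(lo + i * step, 3) for i in range(n_bins + 1)]


# MIN and MAX as reported in the listing.
edges = equal_width_edges(-2.118, 2.466, 4)

# CONTAINS percentages as listed, ordered from the lowest bin to the highest.
shares = [0.528, 73.384, 26.048, 0.040]
total_share = sum(shares)

print(edges)
print(total_share)
```

Under this reading, each bin is (2.466 - (-2.118)) / 4 = 1.146 wide, and nearly all sampled activations (about 99.4%) fall in the two middle bins.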

PER-TOKEN DETAILS

Tokenable
Token ID540
Feature activation-3.54e-01

Pos logit contributions
 foreskin+1.787
 salute+1.672
 conver+1.638
 digit+1.621
ratulations+1.614
Neg logit contributions
antine-0.717
bledon-0.715
mom-0.646
water-0.632
ables-0.626
Token version
Token ID2196
Feature activation+5.47e-02

Pos logit contributions
 foreskin+0.734
 salute+0.671
 conver+0.647
 digit+0.635
ratulations+0.632
Neg logit contributions
antine-0.706
bledon-0.690
ables-0.690
water-0.658
mom-0.655
Token<|endoftext|>
Token ID50256
Feature activation-1.10e-01

Pos logit contributions
antine+0.120
ables+0.120
bledon+0.119
water+0.114
mom+0.114
Neg logit contributions
 foreskin-0.101
 salute-0.092
 conver-0.088
ratulations-0.087
cens-0.086
TokenRob
Token ID14350
Feature activation+5.75e-01

Pos logit contributions
 foreskin+0.250
 salute+0.230
 conver+0.223
ratulations+0.220
 digit+0.217
Neg logit contributions
antine-0.199
ables-0.197
bledon-0.195
water-0.186
mom-0.186
Token C
Token ID327
Feature activation+1.99e-01

Pos logit contributions
antine+1.068
ables+1.051
bledon+1.036
water+0.995
mom+0.985
Neg logit contributions
 foreskin-1.272
 salute-1.168
 conver-1.132
ratulations-1.112
cens-1.106
Tokenestern
Token ID3330
Feature activation+2.47e+00

Pos logit contributions
antine+0.476
ables+0.474
bledon+0.464
water+0.449
mom+0.448
Neg logit contributions
 foreskin-0.337
 salute-0.300
 conver-0.288
ratulations-0.282
cens-0.279
Tokenino
Token ID2879
Feature activation+1.44e-02

Pos logit contributions
ables+3.172
antine+3.144
bledon+3.007
water+2.895
mom+2.797
Neg logit contributions
 foreskin-6.827
 salute-6.335
 conver-6.186
ratulations-6.139
 pancreat-6.113
Token hosts
Token ID11453
Feature activation+9.99e-02

Pos logit contributions
antine+0.031
ables+0.030
bledon+0.030
water+0.029
mom+0.029
Neg logit contributions
 foreskin-0.028
 salute-0.025
ratulations-0.024
 conver-0.024
cens-0.024
Token a
Token ID257
Feature activation+1.32e-02

Pos logit contributions
antine+0.193
ables+0.190
bledon+0.188
water+0.179
mom+0.179
Neg logit contributions
 foreskin-0.214
 salute-0.196
 conver-0.189
ratulations-0.187
cens-0.185
Token jam
Token ID16853
Feature activation+4.19e-01

Pos logit contributions
antine+0.029
ables+0.028
bledon+0.028
mom+0.027
water+0.027
Neg logit contributions
 foreskin-0.025
 salute-0.022
 conver-0.021
ratulations-0.021
cens-0.021
Token packed
Token ID11856
Feature activation-9.44e-02

Pos logit contributions
antine+0.808
ables+0.796
bledon+0.784
water+0.754
mom+0.751
Neg logit contributions
 foreskin-0.897
 salute-0.820
 conver-0.792
ratulations-0.789
cens-0.776
Token toward
Token ID3812
Feature activation+2.53e-01

Pos logit contributions
 foreskin+0.047
 salute+0.043
 conver+0.041
ratulations+0.041
cens+0.040
Neg logit contributions
antine-0.046
ables-0.045
bledon-0.045
water-0.043
mom-0.043
Token progressive
Token ID10393
Feature activation+1.30e-01

Pos logit contributions
antine+0.520
ables+0.514
bledon+0.509
water+0.488
mom+0.486
Neg logit contributions
 foreskin-0.507
 salute-0.461
ratulations-0.440
 conver-0.440
cens-0.433
Token policies
Token ID4788
Feature activation+6.88e-01

Pos logit contributions
antine+0.263
ables+0.259
bledon+0.257
water+0.246
mom+0.245
Neg logit contributions
 foreskin-0.267
 salute-0.244
 conver-0.234
ratulations-0.232
cens-0.228
Token and
Token ID290
Feature activation+5.42e-02

Pos logit contributions
antine+1.373
ables+1.369
bledon+1.342
water+1.296
mom+1.289
Neg logit contributions
 foreskin-1.413
 salute-1.278
ratulations-1.247
cens-1.198
 conver-1.193
Token generous
Token ID14431
Feature activation+8.72e-01

Pos logit contributions
antine+0.112
ables+0.110
bledon+0.109
water+0.104
mom+0.104
Neg logit contributions
 foreskin-0.109
 salute-0.099
 conver-0.095
ratulations-0.094
cens-0.093
Token social
Token ID1919
Feature activation+1.82e+00

Pos logit contributions
ables+1.826
antine+1.824
bledon+1.782
water+1.748
mom+1.719
Neg logit contributions
 foreskin-1.684
 salute-1.506
ratulations-1.466
cens-1.437
 conver-1.435
Token spending
Token ID4581
Feature activation+1.50e+00

Pos logit contributions
ables+3.864
antine+3.800
bledon+3.662
water+3.614
mom+3.520
Neg logit contributions
 foreskin-3.428
 salute-3.082
ratulations-3.005
estinal-2.964
cens-2.938
Token was
Token ID373
Feature activation+3.53e-01

Pos logit contributions
ables+3.082
antine+3.061
bledon+3.010
water+2.937
mom+2.886
Neg logit contributions
 foreskin-2.947
 salute-2.655
ratulations-2.612
cens-2.511
estinal-2.482
Token soon
Token ID2582
Feature activation-1.59e-01

Pos logit contributions
antine+0.758
ables+0.752
bledon+0.745
water+0.715
mom+0.715
Neg logit contributions
 foreskin-0.674
 salute-0.608
ratulations-0.584
 conver-0.573
cens-0.571
Token followed
Token ID3940
Feature activation-8.26e-02

Pos logit contributions
 foreskin+0.271
 salute+0.243
ratulations+0.230
 conver+0.230
cens+0.224
Neg logit contributions
antine-0.375
ables-0.372
bledon-0.369
water-0.355
mom-0.354
Token by
Token ID416
Feature activation+5.77e-01

Pos logit contributions
 foreskin+0.188
 salute+0.173
 conver+0.168
ratulations+0.166
cens+0.164
Neg logit contributions
antine-0.148
ables-0.146
bledon-0.144
water-0.137
mom-0.137
Token towel
Token ID24808
Feature activation+3.05e-01

Pos logit contributions
ables+0.703
antine+0.699
bledon+0.687
water+0.665
mom+0.662
Neg logit contributions
 foreskin-0.710
 salute-0.655
ratulations-0.630
 conver-0.614
 neigh-0.610
Token-
Token ID12
Feature activation+6.02e-02

Pos logit contributions
antine+0.604
ables+0.599
bledon+0.591
water+0.569
mom+0.565
Neg logit contributions
 foreskin-0.627
 salute-0.579
ratulations-0.559
 conver-0.552
cens-0.543
Tokenlined
Token ID10837
Feature activation-4.28e-01

Pos logit contributions
antine+0.141
ables+0.140
bledon+0.139
water+0.135
mom+0.134
Neg logit contributions
 foreskin-0.103
 salute-0.094
 conver-0.089
ratulations-0.088
 neigh-0.086
Token plate
Token ID7480
Feature activation+1.39e-01

Pos logit contributions
 foreskin+1.006
 salute+0.938
ratulations+0.906
 conver+0.898
cens+0.894
Neg logit contributions
antine-0.725
ables-0.720
bledon-0.702
mom-0.674
water-0.673
Token.
Token ID13
Feature activation+2.48e-01

Pos logit contributions
antine+0.269
ables+0.266
bledon+0.261
water+0.252
mom+0.250
Neg logit contributions
 foreskin-0.296
 salute-0.273
 conver-0.262
ratulations-0.261
cens-0.257
Token Serve
Token ID35557
Feature activation+1.79e+00

Pos logit contributions
antine+0.494
ables+0.489
bledon+0.482
water+0.467
mom+0.464
Neg logit contributions
 foreskin-0.511
 salute-0.468
 conver-0.454
ratulations-0.451
cens-0.440
Token hot
Token ID3024
Feature activation-4.29e-01

Pos logit contributions
ables+3.697
antine+3.597
bledon+3.542
water+3.468
mom+3.444
Neg logit contributions
 foreskin-3.477
 salute-3.148
ratulations-3.072
cens-3.063
 conver-3.041
Token!
Token ID0
Feature activation+2.73e-01

Pos logit contributions
 foreskin+0.904
 salute+0.828
 conver+0.800
ratulations+0.784
cens+0.777
Neg logit contributions
antine-0.840
ables-0.825
bledon-0.820
mom-0.780
water-0.778
Token
Token ID198
Feature activation+3.69e-02

Pos logit contributions
antine+0.565
ables+0.560
bledon+0.553
water+0.533
mom+0.531
Neg logit contributions
 foreskin-0.542
 salute-0.493
 conver-0.477
ratulations-0.470
cens-0.462
Token
Token ID198
Feature activation+6.41e-02

Pos logit contributions
antine+0.076
ables+0.076
bledon+0.075
water+0.072
mom+0.072
Neg logit contributions
 foreskin-0.074
 salute-0.067
 conver-0.065
ratulations-0.064
cens-0.063
Token1
Token ID16
Feature activation-1.27e-01

Pos logit contributions
antine+0.113
ables+0.111
bledon+0.110
water+0.104
mom+0.104
Neg logit contributions
 foreskin-0.148
 salute-0.136
 conver-0.132
ratulations-0.130
cens-0.129
Token easy
Token ID2562
Feature activation+1.41e-01

Pos logit contributions
 foreskin+0.209
 salute+0.190
 conver+0.183
ratulations+0.182
cens+0.180
Neg logit contributions
antine-0.203
ables-0.200
bledon-0.198
water-0.190
mom-0.189
Token bank
Token ID3331
Feature activation+3.17e-01

Pos logit contributions
antine+0.289
ables+0.287
bledon+0.284
water+0.273
mom+0.270
Neg logit contributions
 foreskin-0.281
 salute-0.256
 conver-0.246
ratulations-0.243
cens-0.241
Token loans
Token ID10021
Feature activation+1.91e-01

Pos logit contributions
ables+0.750
antine+0.744
bledon+0.720
mom+0.702
water+0.702
Neg logit contributions
 foreskin-0.543
 salute-0.484
 conver-0.455
ratulations-0.452
cens-0.451
Token and
Token ID290
Feature activation-1.67e-02

Pos logit contributions
antine+0.374
ables+0.370
bledon+0.366
water+0.351
mom+0.349
Neg logit contributions
 foreskin-0.401
 salute-0.366
 conver-0.350
ratulations-0.350
cens-0.345
Token the
Token ID262
Feature activation+1.86e-01

Pos logit contributions
 foreskin+0.033
 salute+0.030
 conver+0.029
ratulations+0.028
cens+0.028
Neg logit contributions
antine-0.035
ables-0.034
bledon-0.034
water-0.033
mom-0.033
Token cac
Token ID40428
Feature activation+1.76e+00

Pos logit contributions
antine+0.386
ables+0.379
bledon+0.374
mom+0.359
water+0.359
Neg logit contributions
 foreskin-0.366
 salute-0.332
ratulations-0.323
 conver-0.317
cens-0.316
Tokenhet
Token ID3202
Feature activation-3.30e-01

Pos logit contributions
antine+3.463
ables+3.395
bledon+3.339
mom+3.299
water+3.283
Neg logit contributions
 foreskin-3.692
 salute-3.346
 conver-3.237
cens-3.165
ratulations-3.143
Token of
Token ID286
Feature activation-2.89e-02

Pos logit contributions
 foreskin+0.667
 salute+0.607
 conver+0.584
ratulations+0.577
cens+0.570
Neg logit contributions
antine-0.676
ables-0.667
bledon-0.662
water-0.632
mom-0.631
Token having
Token ID1719
Feature activation+3.39e-01

Pos logit contributions
 foreskin+0.057
 salute+0.052
ratulations+0.050
 conver+0.050
cens+0.049
Neg logit contributions
antine-0.060
ables-0.059
bledon-0.059
mom-0.056
water-0.056
Token planes
Token ID13016
Feature activation-1.23e-01

Pos logit contributions
antine+0.697
ables+0.686
bledon+0.681
water+0.651
mom+0.649
Neg logit contributions
 foreskin-0.675
 salute-0.615
ratulations-0.595
cens-0.587
 conver-0.585
Token land
Token ID1956
Feature activation-6.51e-01

Pos logit contributions
 foreskin+0.230
 salute+0.207
 conver+0.199
ratulations+0.198
cens+0.195
Neg logit contributions
antine-0.269
ables-0.267
bledon-0.264
water-0.254
mom-0.254
Token]
Token ID60
Feature activation-7.10e-01

Pos logit contributions
 foreskin+0.878
 salute+0.784
 conver+0.748
ratulations+0.735
cens+0.722
Neg logit contributions
antine-1.280
ables-1.265
bledon-1.257
water-1.212
mom-1.210
TokenThis
Token ID1212
Feature activation-4.52e-01

Pos logit contributions
 foreskin+1.372
 salute+1.252
 conver+1.204
cens+1.174
ratulations+1.173
Neg logit contributions
antine-1.504
ables-1.482
bledon-1.481
mom-1.400
water-1.399
Token payment
Token ID6074
Feature activation-3.07e-01

Pos logit contributions
 foreskin+0.923
 salute+0.842
 conver+0.811
ratulations+0.799
cens+0.790
Neg logit contributions
antine-0.917
ables-0.903
bledon-0.896
water-0.857
mom-0.856
Token schedule
Token ID7269
Feature activation+4.88e-02

Pos logit contributions
 foreskin+0.610
 salute+0.556
 conver+0.536
ratulations+0.526
cens+0.520
Neg logit contributions
antine-0.639
ables-0.630
bledon-0.626
water-0.598
mom-0.597
Token would
Token ID561
Feature activation-1.41e-01

Pos logit contributions
antine+0.103
ables+0.102
bledon+0.101
water+0.097
mom+0.097
Neg logit contributions
 foreskin-0.095
 salute-0.086
ratulations-0.083
 conver-0.082
cens-0.081
Token save
Token ID3613
Feature activation+1.75e+00

Pos logit contributions
 foreskin+0.281
 salute+0.255
 conver+0.246
ratulations+0.244
cens+0.241
Neg logit contributions
antine-0.293
ables-0.290
bledon-0.286
water-0.275
mom-0.274
Token the
Token ID262
Feature activation-7.00e-02

Pos logit contributions
antine+3.138
ables+3.069
bledon+3.007
water+2.979
mom+2.918
Neg logit contributions
 foreskin-3.929
 salute-3.562
ratulations-3.481
 conver-3.440
cens-3.438
Token Haitian
Token ID48723
Feature activation+1.16e-01

Pos logit contributions
 foreskin+0.142
 salute+0.129
 conver+0.124
ratulations+0.122
cens+0.121
Neg logit contributions
antine-0.143
ables-0.141
bledon-0.140
water-0.134
mom-0.134
Token government
Token ID1230
Feature activation+3.76e-01

Pos logit contributions
antine+0.219
ables+0.215
bledon+0.212
water+0.205
mom+0.203
Neg logit contributions
 foreskin-0.253
 salute-0.231
 conver-0.223
ratulations-0.222
cens-0.219
Token USD
Token ID11403
Feature activation+3.61e-01

Pos logit contributions
antine+0.732
ables+0.725
bledon+0.715
water+0.689
mom+0.685
Neg logit contributions
 foreskin-0.794
 salute-0.725
 conver-0.699
ratulations-0.695
cens-0.687
Token 100
Token ID1802
Feature activation+1.03e-01

Pos logit contributions
antine+0.679
ables+0.674
bledon+0.664
water+0.638
mom+0.636
Neg logit contributions
 foreskin-0.789
 salute-0.722
 conver-0.696
ratulations-0.691
cens-0.679
Tokeniff
Token ID733
Feature activation+3.32e-01

Pos logit contributions
 foreskin+0.318
 salute+0.293
 conver+0.282
ratulations+0.278
cens+0.277
Neg logit contributions
antine-0.282
bledon-0.276
ables-0.276
mom-0.261
water-0.261
TokenTra
Token ID15721
Feature activation+2.42e-01

Pos logit contributions
antine+0.706
ables+0.700
bledon+0.685
water+0.672
mom+0.661
Neg logit contributions
 foreskin-0.653
 salute-0.592
 conver-0.565
ratulations-0.563
 neigh-0.548
Tokenx
Token ID87
Feature activation+3.93e-01

Pos logit contributions
antine+0.428
ables+0.422
bledon+0.417
water+0.398
mom+0.397
Neg logit contributions
 foreskin-0.559
 salute-0.516
 conver-0.499
ratulations-0.494
cens-0.486
Token
Token ID198
Feature activation-1.41e-01

Pos logit contributions
antine+0.779
ables+0.774
bledon+0.758
water+0.738
mom+0.736
Neg logit contributions
 foreskin-0.805
 salute-0.741
 conver-0.715
ratulations-0.706
 pancreat-0.687
Token
Token ID198
Feature activation+3.86e-02

Pos logit contributions
 foreskin+0.284
 salute+0.259
 conver+0.249
ratulations+0.247
cens+0.241
Neg logit contributions
antine-0.290
ables-0.287
bledon-0.283
water-0.273
mom-0.272
TokenReviewer
Token ID35407
Feature activation+1.75e+00

Pos logit contributions
antine+0.068
ables+0.068
bledon+0.066
water+0.064
mom+0.064
Neg logit contributions
 foreskin-0.089
 salute-0.082
 conver-0.080
ratulations-0.078
 digit-0.077
Token:
Token ID25
Feature activation+3.38e-01

Pos logit contributions
ables+3.347
antine+3.275
water+3.221
mom+3.169
bledon+3.146
Neg logit contributions
 foreskin-3.751
ratulations-3.486
 salute-3.460
estinal-3.349
 conver-3.314
Token V
Token ID569
Feature activation-2.22e-01

Pos logit contributions
antine+0.687
ables+0.675
bledon+0.666
water+0.646
mom+0.646
Neg logit contributions
 foreskin-0.676
 salute-0.617
ratulations-0.597
 conver-0.597
cens-0.586
Tokenthe
Token ID1169
Feature activation-3.11e-01

Pos logit contributions
 foreskin+0.513
 salute+0.474
 conver+0.459
ratulations+0.452
cens+0.448
Neg logit contributions
antine-0.390
ables-0.384
bledon-0.380
water-0.361
mom-0.360
TokenHappy
Token ID25082
Feature activation+4.60e-01

Pos logit contributions
 foreskin+0.722
 salute+0.666
 conver+0.646
ratulations+0.636
cens+0.629
Neg logit contributions
antine-0.545
ables-0.536
bledon-0.531
water-0.504
mom-0.503
TokenL
Token ID43
Feature activation-5.41e-01

Pos logit contributions
antine+0.899
ables+0.886
bledon+0.874
water+0.849
mom+0.847
Neg logit contributions
 foreskin-0.971
 salute-0.891
 conver-0.861
ratulations-0.850
cens-0.833
Token run
Token ID1057
Feature activation+9.96e-02

Pos logit contributions
 foreskin+0.389
 salute+0.355
 conver+0.340
ratulations+0.338
cens+0.332
Neg logit contributions
antine-0.430
ables-0.424
bledon-0.419
water-0.402
mom-0.402
Token-
Token ID12
Feature activation-1.44e-01

Pos logit contributions
antine+0.190
ables+0.190
bledon+0.185
mom+0.179
water+0.178
Neg logit contributions
 foreskin-0.214
 salute-0.197
 conver-0.191
ratulations-0.189
cens-0.185
Tokenup
Token ID929
Feature activation+3.36e-01

Pos logit contributions
 foreskin+0.353
 salute+0.328
 conver+0.319
ratulations+0.313
 digit+0.310
Neg logit contributions
antine-0.236
ables-0.232
bledon-0.227
mom-0.218
water-0.217
Token in
Token ID287
Feature activation-1.81e-02

Pos logit contributions
antine+0.591
ables+0.578
bledon+0.572
water+0.544
mom+0.543
Neg logit contributions
 foreskin-0.773
 salute-0.716
 conver-0.693
ratulations-0.686
cens-0.678
Token opioid
Token ID23039
Feature activation-5.01e-01

Pos logit contributions
 foreskin+0.033
 salute+0.030
 conver+0.029
ratulations+0.029
cens+0.028
Neg logit contributions
antine-0.040
ables-0.040
bledon-0.039
water-0.038
mom-0.038
Token prescribing
Token ID39973
Feature activation+1.75e+00

Pos logit contributions
 foreskin+0.904
 salute+0.816
 conver+0.786
ratulations+0.763
cens+0.759
Neg logit contributions
antine-1.131
ables-1.116
bledon-1.114
mom-1.066
water-1.062
Token emerged
Token ID9349
Feature activation+6.51e-01

Pos logit contributions
ables+3.479
antine+3.425
bledon+3.335
water+3.268
mom+3.229
Neg logit contributions
 foreskin-3.490
 salute-3.231
ratulations-3.181
estinal-3.012
cens-3.008
Token from
Token ID422
Feature activation+8.21e-02

Pos logit contributions
antine+1.251
ables+1.236
bledon+1.208
water+1.180
mom+1.168
Neg logit contributions
 foreskin-1.375
 salute-1.271
ratulations-1.227
 conver-1.192
estinal-1.182
Token an
Token ID281
Feature activation+2.33e-01

Pos logit contributions
antine+0.174
ables+0.172
bledon+0.170
water+0.163
mom+0.163
Neg logit contributions
 foreskin-0.159
 salute-0.145
 conver-0.139
ratulations-0.139
cens-0.136
Token unhealthy
Token ID27942
Feature activation+2.83e-01

Pos logit contributions
antine+0.511
ables+0.503
bledon+0.494
water+0.478
mom+0.477
Neg logit contributions
 foreskin-0.436
 salute-0.397
 conver-0.379
ratulations-0.377
 digit-0.369
Token enchantment
Token ID43706
Feature activation-4.27e-01

Pos logit contributions
antine+0.606
ables+0.597
bledon+0.589
water+0.571
mom+0.567
Neg logit contributions
 foreskin-0.540
 salute-0.493
ratulations-0.472
 conver-0.471
cens-0.461
Token levels
Token ID2974
Feature activation+2.94e-02

Pos logit contributions
 foreskin+0.176
 salute+0.160
 conver+0.153
ratulations+0.152
cens+0.149
Neg logit contributions
antine-0.192
ables-0.190
bledon-0.188
water-0.180
mom-0.180
Token of
Token ID286
Feature activation+2.71e-01

Pos logit contributions
antine+0.059
ables+0.058
bledon+0.058
water+0.055
mom+0.055
Neg logit contributions
 foreskin-0.061
 salute-0.055
 conver-0.053
ratulations-0.053
cens-0.052
Token tax
Token ID1687
Feature activation+1.01e-01

Pos logit contributions
antine+0.555
ables+0.550
bledon+0.544
water+0.522
mom+0.520
Neg logit contributions
 foreskin-0.543
 salute-0.494
ratulations-0.473
 conver-0.472
cens-0.465
Token cuts
Token ID6630
Feature activation+4.69e-01

Pos logit contributions
ables+0.204
antine+0.203
bledon+0.200
water+0.192
mom+0.191
Neg logit contributions
 foreskin-0.204
 salute-0.187
ratulations-0.178
 conver-0.178
cens-0.175
Token and
Token ID290
Feature activation-1.90e-01

Pos logit contributions
antine+0.967
ables+0.962
bledon+0.950
water+0.915
mom+0.910
Neg logit contributions
 foreskin-0.938
 salute-0.851
ratulations-0.815
 conver-0.810
cens-0.799
Token spending
Token ID4581
Feature activation+1.68e+00

Pos logit contributions
 foreskin+0.388
 salute+0.354
 conver+0.341
ratulations+0.336
cens+0.332
Neg logit contributions
antine-0.386
ables-0.381
bledon-0.378
water-0.361
mom-0.360
Token increases
Token ID5732
Feature activation+5.55e-01

Pos logit contributions
ables+3.478
antine+3.397
bledon+3.350
water+3.289
mom+3.217
Neg logit contributions
 foreskin-3.299
 salute-3.010
ratulations-2.952
cens-2.846
estinal-2.826
Token in
Token ID287
Feature activation+1.02e-01

Pos logit contributions
antine+1.204
ables+1.199
bledon+1.181
water+1.142
mom+1.135
Neg logit contributions
 foreskin-1.050
 salute-0.947
ratulations-0.908
 conver-0.898
cens-0.885
Token upcoming
Token ID7865
Feature activation-8.58e-02

Pos logit contributions
antine+0.206
ables+0.204
bledon+0.202
water+0.193
mom+0.193
Neg logit contributions
 foreskin-0.209
 salute-0.191
 conver-0.184
ratulations-0.181
cens-0.179
Token budgets
Token ID21011
Feature activation+5.38e-01

Pos logit contributions
 foreskin+0.145
 salute+0.129
 conver+0.124
ratulations+0.121
cens+0.119
Neg logit contributions
antine-0.204
ables-0.202
bledon-0.200
water-0.193
mom-0.193
Token that
Token ID326
Feature activation-3.66e-01

Pos logit contributions
antine+1.128
ables+1.121
bledon+1.112
water+1.070
mom+1.066
Neg logit contributions
 foreskin-1.054
 salute-0.951
ratulations-0.919
 conver-0.900
cens-0.892
Token the
Token ID262
Feature activation+5.98e-02

Pos logit contributions
antine+0.270
ables+0.269
bledon+0.265
water+0.256
mom+0.252
Neg logit contributions
 foreskin-0.282
 salute-0.256
ratulations-0.247
 conver-0.244
cens-0.243
Token public
Token ID1171
Feature activation+2.96e-01

Pos logit contributions
antine+0.125
ables+0.124
bledon+0.122
water+0.118
mom+0.117
Neg logit contributions
 foreskin-0.118
 salute-0.107
ratulations-0.103
 conver-0.102
cens-0.102
Token spaces
Token ID9029
Feature activation+2.64e-01

Pos logit contributions
antine+0.610
ables+0.608
bledon+0.601
water+0.576
mom+0.570
Neg logit contributions
 foreskin-0.587
 salute-0.534
 conver-0.513
ratulations-0.513
cens-0.506
Token,
Token ID11
Feature activation+2.20e-02

Pos logit contributions
antine+0.532
ables+0.530
bledon+0.521
water+0.503
mom+0.498
Neg logit contributions
 foreskin-0.537
 salute-0.486
ratulations-0.470
 conver-0.464
cens-0.462
Token image
Token ID2939
Feature activation+8.28e-01

Pos logit contributions
antine+0.047
ables+0.047
bledon+0.047
water+0.045
mom+0.045
Neg logit contributions
 foreskin-0.041
 salute-0.037
ratulations-0.036
 conver-0.036
cens-0.035
Token courtesy
Token ID12537
Feature activation+1.67e+00

Pos logit contributions
antine+1.742
ables+1.739
bledon+1.718
water+1.658
mom+1.648
Neg logit contributions
 foreskin-1.583
 salute-1.421
ratulations-1.393
 conver-1.371
cens-1.345
Token of
Token ID286
Feature activation+7.28e-02

Pos logit contributions
antine+3.039
ables+3.022
bledon+2.954
water+2.880
mom+2.803
Neg logit contributions
 foreskin-3.692
 salute-3.367
ratulations-3.292
 conver-3.258
estinal-3.194
Token Dav
Token ID2544
Feature activation+5.65e-01

Pos logit contributions
antine+0.139
ables+0.138
bledon+0.136
water+0.131
mom+0.129
Neg logit contributions
 foreskin-0.157
 salute-0.144
 conver-0.138
ratulations-0.138
cens-0.136
Tokenpart
Token ID3911
Feature activation+3.04e-01

Pos logit contributions
antine+0.696
ables+0.680
bledon+0.671
water+0.622
mom+0.621
Neg logit contributions
 foreskin-1.602
 salute-1.503
 conver-1.463
ratulations-1.447
cens-1.435
Token.
Token ID13
Feature activation+8.82e-02

Pos logit contributions
antine+0.574
ables+0.569
bledon+0.558
water+0.539
mom+0.536
Neg logit contributions
 foreskin-0.662
 salute-0.607
 conver-0.584
ratulations-0.580
cens-0.570
Token
Token ID198
Feature activation+1.48e-01

Pos logit contributions
antine+0.181
ables+0.179
bledon+0.177
water+0.171
mom+0.170
Neg logit contributions
 foreskin-0.178
 salute-0.162
 conver-0.156
ratulations-0.154
cens-0.151
Token
Token ID198
Feature activation-9.31e-03

Pos logit contributions
antine+0.036
ables+0.036
bledon+0.036
water+0.034
mom+0.034
Neg logit contributions
 foreskin-0.035
 salute-0.031
 conver-0.030
ratulations-0.030
cens-0.029
Tokenimpl
Token ID23928
Feature activation+1.55e+00

Pos logit contributions
 foreskin+0.023
 salute+0.022
 conver+0.021
ratulations+0.021
cens+0.020
Neg logit contributions
antine-0.015
ables-0.014
bledon-0.014
water-0.013
mom-0.013
Token Board
Token ID5926
Feature activation-1.25e-01

Pos logit contributions
antine+3.020
ables+2.981
bledon+2.903
water+2.810
mom+2.791
Neg logit contributions
 foreskin-3.271
 salute-2.990
 conver-2.866
cens-2.859
ratulations-2.857
Token {
Token ID1391
Feature activation+1.37e-01

Pos logit contributions
 foreskin+0.274
 salute+0.252
 conver+0.244
ratulations+0.239
cens+0.237
Neg logit contributions
antine-0.236
ables-0.232
bledon-0.230
water-0.219
mom-0.219
Token pub
Token ID2240
Feature activation+5.37e-01

Pos logit contributions
antine+0.260
ables+0.257
bledon+0.253
water+0.242
mom+0.242
Neg logit contributions
 foreskin-0.298
 salute-0.273
 conver-0.264
ratulations-0.262
cens-0.259
Token fn
Token ID24714
Feature activation+1.66e+00

Pos logit contributions
antine+0.742
ables+0.728
bledon+0.715
water+0.675
mom+0.673
Neg logit contributions
 foreskin-1.441
 salute-1.341
 conver-1.299
ratulations-1.296
cens-1.288
Token copy
Token ID4866
Feature activation+3.64e-01

Pos logit contributions
antine+2.988
ables+2.976
bledon+2.888
mom+2.829
water+2.813
Neg logit contributions
 foreskin-3.633
 salute-3.295
cens-3.218
ratulations-3.195
estinal-3.144
Token_
Token ID62
Feature activation+1.10e-01

Pos logit contributions
antine+0.676
ables+0.667
bledon+0.658
water+0.630
mom+0.630
Neg logit contributions
 foreskin-0.806
 salute-0.740
 conver-0.715
ratulations-0.706
cens-0.696
Tokenboard
Token ID3526
Feature activation-2.43e-01

Pos logit contributions
antine+0.181
ables+0.178
bledon+0.176
water+0.166
mom+0.166
Neg logit contributions
 foreskin-0.268
 salute-0.248
 conver-0.241
ratulations-0.237
cens-0.235
Token (
Token ID357
Feature activation+5.32e-02

Pos logit contributions
 foreskin+0.551
 salute+0.507
 conver+0.491
ratulations+0.484
cens+0.479
Neg logit contributions
antine-0.437
ables-0.430
bledon-0.426
water-0.405
mom-0.404
Token &
Token ID1222
Feature activation+8.12e-01

Pos logit contributions
antine+0.128
ables+0.127
bledon+0.125
water+0.122
mom+0.121
Neg logit contributions
 foreskin-0.088
 salute-0.079
 conver-0.075
ratulations-0.074
cens-0.073
Token can
Token ID460
Feature activation+4.39e-01

Pos logit contributions
antine+0.467
ables+0.464
bledon+0.457
water+0.442
mom+0.439
Neg logit contributions
 foreskin-0.456
 salute-0.415
ratulations-0.402
 conver-0.391
cens-0.388
Token cause
Token ID2728
Feature activation+5.90e-01

Pos logit contributions
antine+0.861
ables+0.859
bledon+0.841
water+0.812
mom+0.807
Neg logit contributions
 foreskin-0.916
 salute-0.833
ratulations-0.808
 conver-0.796
cens-0.791
Token real
Token ID1103
Feature activation+1.19e+00

Pos logit contributions
ables+1.186
antine+1.184
bledon+1.168
water+1.122
mom+1.116
Neg logit contributions
 foreskin-1.200
 salute-1.094
ratulations-1.058
 conver-1.033
cens-1.030
Token prices
Token ID4536
Feature activation+7.50e-01

Pos logit contributions
antine+2.256
ables+2.253
bledon+2.213
water+2.125
mom+2.122
Neg logit contributions
 foreskin-2.546
 salute-2.331
ratulations-2.259
 conver-2.202
estinal-2.181
Token to
Token ID284
Feature activation+2.16e-01

Pos logit contributions
ables+1.288
antine+1.283
bledon+1.261
water+1.222
mom+1.198
Neg logit contributions
 foreskin-1.739
 salute-1.610
ratulations-1.578
 conver-1.518
cens-1.517
Token fall
Token ID2121
Feature activation+1.66e+00

Pos logit contributions
antine+0.456
ables+0.453
bledon+0.447
water+0.431
mom+0.429
Neg logit contributions
 foreskin-0.421
 salute-0.380
ratulations-0.368
 conver-0.362
cens-0.360
Token,
Token ID11
Feature activation+1.62e-01

Pos logit contributions
ables+3.086
antine+3.035
bledon+2.995
water+2.923
mom+2.895
Neg logit contributions
 foreskin-3.600
 salute-3.297
ratulations-3.250
estinal-3.106
 conver-3.083
Token while
Token ID981
Feature activation+5.46e-01

Pos logit contributions
antine+0.341
ables+0.338
bledon+0.334
water+0.322
mom+0.321
Neg logit contributions
 foreskin-0.318
 salute-0.289
ratulations-0.276
 conver-0.275
cens-0.270
Token an
Token ID281
Feature activation+3.92e-02

Pos logit contributions
antine+1.146
ables+1.141
bledon+1.124
water+1.081
mom+1.076
Neg logit contributions
 foreskin-1.063
 salute-0.968
ratulations-0.926
 conver-0.915
cens-0.905
Token excess
Token ID6992
Feature activation-3.85e-01

Pos logit contributions
antine+0.082
ables+0.081
bledon+0.080
water+0.077
mom+0.077
Neg logit contributions
 foreskin-0.077
 salute-0.070
 conver-0.067
ratulations-0.067
cens-0.065
Token of
Token ID286
Feature activation+6.44e-02

Pos logit contributions
 foreskin+0.774
 salute+0.706
 conver+0.679
ratulations+0.671
cens+0.661
Neg logit contributions
antine-0.791
ables-0.782
bledon-0.776
water-0.742
mom-0.741
Token But
Token ID887
Feature activation-1.23e-01

Pos logit contributions
 foreskin+0.098
 salute+0.089
 conver+0.086
ratulations+0.085
cens+0.083
Neg logit contributions
antine-0.098
ables-0.097
bledon-0.095
water-0.092
mom-0.092
Token I
Token ID314
Feature activation+1.33e-01

Pos logit contributions
 foreskin+0.262
 salute+0.240
 conver+0.232
ratulations+0.228
cens+0.226
Neg logit contributions
antine-0.239
ables-0.235
bledon-0.233
water-0.223
mom-0.222
Token just
Token ID655
Feature activation-1.67e-01

Pos logit contributions
antine+0.270
ables+0.270
bledon+0.264
water+0.257
mom+0.255
Neg logit contributions
 foreskin-0.265
 salute-0.237
ratulations-0.233
cens-0.230
 conver-0.229
Token couldn
Token ID3521
Feature activation+2.48e-02

Pos logit contributions
 foreskin+0.326
 salute+0.293
ratulations+0.284
 conver+0.281
cens+0.279
Neg logit contributions
antine-0.349
ables-0.346
bledon-0.339
water-0.330
mom-0.330
Token't
Token ID470
Feature activation+2.35e-01

Pos logit contributions
antine+0.050
ables+0.049
bledon+0.049
water+0.047
mom+0.047
Neg logit contributions
 foreskin-0.051
 salute-0.047
 conver-0.045
ratulations-0.044
cens-0.044
Token bear
Token ID6842
Feature activation+1.65e+00

Pos logit contributions
ables+0.476
antine+0.473
bledon+0.460
water+0.450
mom+0.449
Neg logit contributions
 foreskin-0.475
 salute-0.425
ratulations-0.418
 conver-0.414
cens-0.411
Token it
Token ID340
Feature activation+4.21e-01

Pos logit contributions
antine+2.866
ables+2.846
bledon+2.755
water+2.718
mom+2.655
Neg logit contributions
 foreskin-3.775
 salute-3.470
ratulations-3.391
 conver-3.314
estinal-3.300
Token.
Token ID13
Feature activation+6.25e-02

Pos logit contributions
antine+0.773
ables+0.766
bledon+0.756
water+0.724
mom+0.722
Neg logit contributions
 foreskin-0.938
 salute-0.862
ratulations-0.833
 conver-0.828
cens-0.813
Token I
Token ID314
Feature activation+9.85e-02

Pos logit contributions
antine+0.126
ables+0.124
bledon+0.122
water+0.119
mom+0.118
Neg logit contributions
 foreskin-0.129
 salute-0.117
 conver-0.113
ratulations-0.113
cens-0.110
Token was
Token ID373
Feature activation+2.38e-01

Pos logit contributions
antine+0.199
ables+0.198
bledon+0.194
water+0.189
mom+0.188
Neg logit contributions
 foreskin-0.198
 salute-0.178
ratulations-0.174
 conver-0.171
cens-0.171
Token so
Token ID523
Feature activation+2.15e-01

Pos logit contributions
antine+0.516
ables+0.514
bledon+0.509
water+0.494
mom+0.492
Neg logit contributions
 foreskin-0.440
 salute-0.398
ratulations-0.384
cens-0.378
 conver-0.376
Token T
Token ID309
Feature activation+4.85e-02

Pos logit contributions
antine+0.370
ables+0.368
bledon+0.361
water+0.350
mom+0.349
Neg logit contributions
 foreskin-0.390
 salute-0.356
 conver-0.341
ratulations-0.339
cens-0.332
Token >
Token ID1875
Feature activation+4.12e-01

Pos logit contributions
antine+0.104
ables+0.104
bledon+0.102
water+0.098
mom+0.098
Neg logit contributions
 foreskin-0.093
 salute-0.084
 conver-0.081
ratulations-0.080
cens-0.079
Token {
Token ID1391
Feature activation+1.25e-01

Pos logit contributions
antine+0.741
ables+0.739
bledon+0.719
water+0.690
mom+0.687
Neg logit contributions
 foreskin-0.925
 salute-0.852
ratulations-0.822
 conver-0.820
cens-0.811
Token...
Token ID2644
Feature activation+2.30e-01

Pos logit contributions
antine+0.256
ables+0.253
bledon+0.250
water+0.240
mom+0.240
Neg logit contributions
 foreskin-0.253
 salute-0.230
 conver-0.222
ratulations-0.219
cens-0.216
Token }
Token ID1782
Feature activation-4.81e-01

Pos logit contributions
antine+0.476
ables+0.470
bledon+0.464
water+0.445
mom+0.445
Neg logit contributions
 foreskin-0.460
 salute-0.419
 conver-0.403
ratulations-0.399
cens-0.394
Token impl
Token ID4114
Feature activation+1.65e+00

Pos logit contributions
 foreskin+1.025
 salute+0.940
 conver+0.908
ratulations+0.893
cens+0.884
Neg logit contributions
antine-0.932
ables-0.918
bledon-0.911
water-0.868
mom-0.867
Token <
Token ID1279
Feature activation-2.98e-02

Pos logit contributions
antine+3.408
ables+3.367
bledon+3.281
water+3.194
mom+3.153
Neg logit contributions
 foreskin-3.224
 salute-2.922
ratulations-2.839
 conver-2.797
cens-2.788
Token T
Token ID309
Feature activation-1.36e-01

Pos logit contributions
 foreskin+0.062
 salute+0.056
 conver+0.054
ratulations+0.054
cens+0.053
Neg logit contributions
antine-0.060
ables-0.059
bledon-0.058
water-0.056
mom-0.056
Token >
Token ID1875
Feature activation+2.68e-01

Pos logit contributions
 foreskin+0.258
 salute+0.234
 conver+0.225
ratulations+0.221
cens+0.218
Neg logit contributions
antine-0.295
ables-0.291
bledon-0.289
water-0.276
mom-0.276
Token Own
Token ID11744
Feature activation+3.86e-01

Pos logit contributions
antine+0.533
ables+0.527
bledon+0.519
water+0.498
mom+0.496
Neg logit contributions
 foreskin-0.557
 salute-0.508
 conver-0.489
ratulations-0.485
cens-0.480
Tokened
Token ID276
Feature activation+3.08e-01

Pos logit contributions
antine+0.590
ables+0.580
bledon+0.572
water+0.539
mom+0.537
Neg logit contributions
 foreskin-0.982
 salute-0.913
 conver-0.886
ratulations-0.876
cens-0.869
Token or
Token ID393
Feature activation+2.25e-02

Pos logit contributions
antine+2.126
ables+2.111
bledon+2.072
water+2.021
mom+2.000
Neg logit contributions
 foreskin-2.166
 salute-1.980
ratulations-1.915
 conver-1.908
cens-1.891
Token so
Token ID523
Feature activation-9.76e-03

Pos logit contributions
antine+0.050
ables+0.049
bledon+0.049
water+0.047
mom+0.047
Neg logit contributions
 foreskin-0.041
 salute-0.037
 conver-0.036
ratulations-0.035
cens-0.035
Token.
Token ID13
Feature activation+1.78e-01

Pos logit contributions
 foreskin+0.021
 salute+0.019
 conver+0.019
ratulations+0.018
cens+0.018
Neg logit contributions
antine-0.019
ables-0.018
bledon-0.018
water-0.017
mom-0.017
Token Let
Token ID3914
Feature activation-1.21e-01

Pos logit contributions
antine+0.357
ables+0.353
bledon+0.349
water+0.336
mom+0.335
Neg logit contributions
 foreskin-0.368
 salute-0.336
 conver-0.324
ratulations-0.319
cens-0.315
Token it
Token ID340
Feature activation+4.28e-01

Pos logit contributions
 foreskin+0.256
 salute+0.235
 conver+0.227
ratulations+0.223
cens+0.221
Neg logit contributions
antine-0.236
ables-0.232
bledon-0.231
water-0.220
mom-0.220
Token sit
Token ID1650
Feature activation+1.64e+00

Pos logit contributions
antine+0.853
ables+0.844
bledon+0.829
water+0.805
mom+0.801
Neg logit contributions
 foreskin-0.883
 salute-0.806
ratulations-0.772
 conver-0.771
cens-0.767
Token for
Token ID329
Feature activation+4.03e-01

Pos logit contributions
ables+2.810
antine+2.767
bledon+2.680
water+2.634
mom+2.627
Neg logit contributions
 foreskin-3.797
 salute-3.518
ratulations-3.428
 conver-3.362
cens-3.329
Token a
Token ID257
Feature activation+1.31e-01

Pos logit contributions
ables+0.781
antine+0.780
bledon+0.761
mom+0.741
water+0.739
Neg logit contributions
 foreskin-0.839
 salute-0.770
cens-0.741
ratulations-0.739
 conver-0.736
Token few
Token ID1178
Feature activation+3.26e-02

Pos logit contributions
antine+0.267
ables+0.266
bledon+0.263
mom+0.255
water+0.252
Neg logit contributions
 foreskin-0.262
 salute-0.239
 conver-0.230
ratulations-0.229
cens-0.226
Token minutes
Token ID2431
Feature activation+1.29e+00

Pos logit contributions
antine+0.060
ables+0.059
bledon+0.058
mom+0.056
water+0.056
Neg logit contributions
 foreskin-0.073
 salute-0.067
 conver-0.065
ratulations-0.064
cens-0.063
Token after
Token ID706
Feature activation+1.86e-01

Pos logit contributions
antine+2.403
ables+2.398
bledon+2.341
water+2.289
mom+2.239
Neg logit contributions
 foreskin-2.784
 salute-2.572
ratulations-2.497
 conver-2.462
cens-2.444
Token A
Token ID317
Feature activation+1.59e-01

Pos logit contributions
 foreskin+0.067
 salute+0.061
 conver+0.059
ratulations+0.059
cens+0.057
Neg logit contributions
antine-0.069
ables-0.068
bledon-0.067
water-0.065
mom-0.065
Token run
Token ID1057
Feature activation+2.14e-01

Pos logit contributions
antine+0.339
ables+0.335
bledon+0.333
water+0.319
mom+0.318
Neg logit contributions
 foreskin-0.308
 salute-0.279
 conver-0.269
ratulations-0.267
cens-0.261
Token-
Token ID12
Feature activation-6.46e-02

Pos logit contributions
antine+0.405
ables+0.402
bledon+0.396
water+0.381
mom+0.379
Neg logit contributions
 foreskin-0.463
 salute-0.425
 conver-0.412
ratulations-0.407
cens-0.401
Tokenup
Token ID929
Feature activation+2.82e-01

Pos logit contributions
 foreskin+0.164
 salute+0.152
 conver+0.148
ratulations+0.146
cens+0.144
Neg logit contributions
antine-0.099
ables-0.097
bledon-0.096
water-0.091
mom-0.091
Token in
Token ID287
Feature activation+4.58e-02

Pos logit contributions
antine+0.529
ables+0.523
bledon+0.517
water+0.497
mom+0.494
Neg logit contributions
 foreskin-0.613
 salute-0.565
 conver-0.542
ratulations-0.542
cens-0.534
Token prescribing
Token ID39973
Feature activation+1.64e+00

Pos logit contributions
antine+0.098
ables+0.097
bledon+0.096
water+0.092
mom+0.092
Neg logit contributions
 foreskin-0.088
 salute-0.080
 conver-0.077
ratulations-0.076
cens-0.075
Token from
Token ID422
Feature activation+1.21e-01

Pos logit contributions
ables+3.340
antine+3.287
bledon+3.227
water+3.166
mom+3.117
Neg logit contributions
 foreskin-3.227
 salute-2.960
ratulations-2.927
cens-2.796
 conver-2.756
Token 2000
Token ID4751
Feature activation-3.50e-01

Pos logit contributions
antine+0.257
ables+0.253
bledon+0.250
water+0.241
mom+0.240
Neg logit contributions
 foreskin-0.236
 salute-0.215
 conver-0.206
ratulations-0.205
cens-0.201
Token through
Token ID832
Feature activation+4.16e-01

Pos logit contributions
 foreskin+0.758
 salute+0.697
 conver+0.671
ratulations+0.665
cens+0.655
Neg logit contributions
antine-0.664
ables-0.655
bledon-0.650
water-0.619
mom-0.618
Token 2011
Token ID2813
Feature activation+2.49e-02

Pos logit contributions
antine+0.752
ables+0.742
bledon+0.732
water+0.699
mom+0.697
Neg logit contributions
 foreskin-0.938
 salute-0.864
 conver-0.832
ratulations-0.830
cens-0.816
Token fueled
Token ID24959
Feature activation-7.12e-02

Pos logit contributions
antine+0.052
ables+0.052
bledon+0.051
water+0.049
mom+0.049
Neg logit contributions
 foreskin-0.049
 salute-0.044
ratulations-0.043
 conver-0.042
cens-0.041
Token cutting
Token ID7720
Feature activation-2.35e-01

Pos logit contributions
 foreskin+0.146
 salute+0.132
ratulations+0.126
 conver+0.124
cens+0.124
Neg logit contributions
antine-0.165
ables-0.164
bledon-0.162
water-0.157
mom-0.156
Token off
Token ID572
Feature activation-3.33e-01

Pos logit contributions
 foreskin+0.446
 salute+0.404
 conver+0.386
ratulations+0.382
cens+0.377
Neg logit contributions
antine-0.508
ables-0.503
bledon-0.498
water-0.479
mom-0.478
Token their
Token ID511
Feature activation-1.42e-02

Pos logit contributions
 foreskin+0.661
 salute+0.603
 conver+0.577
ratulations+0.572
cens+0.566
Neg logit contributions
antine-0.691
ables-0.686
bledon-0.678
water-0.654
mom-0.649
Token arms
Token ID5101
Feature activation-2.68e-01

Pos logit contributions
 foreskin+0.028
 salute+0.025
 conver+0.024
ratulations+0.024
cens+0.024
Neg logit contributions
antine-0.029
ables-0.029
bledon-0.029
water-0.028
mom-0.028
Token to
Token ID284
Feature activation+4.01e-02

Pos logit contributions
 foreskin+0.572
 salute+0.525
 conver+0.507
ratulations+0.498
cens+0.493
Neg logit contributions
antine-0.519
ables-0.511
bledon-0.507
mom-0.483
water-0.483
Token save
Token ID3613
Feature activation+1.63e+00

Pos logit contributions
antine+0.083
ables+0.082
bledon+0.081
water+0.079
mom+0.078
Neg logit contributions
 foreskin-0.080
 salute-0.072
ratulations-0.070
 conver-0.070
cens-0.068
Token their
Token ID511
Feature activation+3.54e-02

Pos logit contributions
ables+2.862
antine+2.835
water+2.760
bledon+2.756
mom+2.675
Neg logit contributions
 foreskin-3.651
 salute-3.357
ratulations-3.314
cens-3.250
estinal-3.236
Token heads
Token ID6665
Feature activation-3.61e-01

Pos logit contributions
antine+0.068
ables+0.067
bledon+0.066
water+0.064
mom+0.063
Neg logit contributions
 foreskin-0.076
 salute-0.070
 conver-0.067
ratulations-0.066
cens-0.066
Token,
Token ID11
Feature activation-4.10e-02

Pos logit contributions
 foreskin+0.779
 salute+0.715
 conver+0.690
ratulations+0.680
cens+0.673
Neg logit contributions
antine-0.691
ables-0.681
bledon-0.675
water-0.644
mom-0.643
Token but
Token ID475
Feature activation+1.40e-01

Pos logit contributions
 foreskin+0.081
 salute+0.073
ratulations+0.070
 conver+0.070
cens+0.069
Neg logit contributions
antine-0.085
ables-0.084
bledon-0.083
water-0.081
mom-0.080
Token that
Token ID326
Feature activation+1.82e-01

Pos logit contributions
antine+0.266
ables+0.263
bledon+0.260
water+0.251
mom+0.249
Neg logit contributions
 foreskin-0.302
 salute-0.276
 conver-0.266
ratulations-0.265
cens-0.262
Tokenl
Token ID75
Feature activation-4.16e-01

Pos logit contributions
antine+0.687
ables+0.675
bledon+0.667
water+0.631
mom+0.630
Neg logit contributions
 foreskin-1.088
 salute-1.010
 conver-0.981
ratulations-0.968
cens-0.957
Tokenowski
Token ID12079
Feature activation-2.52e-01

Pos logit contributions
 foreskin+0.763
 salute+0.689
 conver+0.666
ratulations+0.654
cens+0.651
Neg logit contributions
antine-0.914
bledon-0.897
ables-0.895
mom-0.856
water-0.850
Token said
Token ID531
Feature activation+1.47e-01

Pos logit contributions
 foreskin+0.523
 salute+0.475
 conver+0.458
ratulations+0.456
cens+0.450
Neg logit contributions
antine-0.500
ables-0.495
bledon-0.489
water-0.469
mom-0.468
Token The
Token ID383
Feature activation+1.26e-01

Pos logit contributions
antine+0.272
ables+0.268
bledon+0.265
water+0.253
mom+0.252
Neg logit contributions
 foreskin-0.324
 salute-0.299
 conver-0.287
ratulations-0.286
cens-0.282
Token Str
Token ID4285
Feature activation+4.49e-01

Pos logit contributions
antine+0.250
ables+0.246
bledon+0.243
water+0.233
mom+0.232
Neg logit contributions
 foreskin-0.261
 salute-0.239
 conver-0.231
ratulations-0.228
cens-0.225
Tokenand
Token ID392
Feature activation+1.63e+00

Pos logit contributions
antine+0.692
ables+0.687
bledon+0.667
water+0.632
mom+0.631
Neg logit contributions
 foreskin-1.143
 salute-1.061
 conver-1.029
ratulations-1.009
cens-1.006
Token,
Token ID11
Feature activation+7.42e-02

Pos logit contributions
ables+3.357
antine+3.320
bledon+3.258
water+3.207
mom+3.119
Neg logit contributions
 foreskin-3.213
 salute-2.906
ratulations-2.816
 conver-2.772
cens-2.736
Token Delaware
Token ID19603
Feature activation+2.99e-01

Pos logit contributions
antine+0.145
ables+0.143
bledon+0.142
water+0.136
mom+0.135
Neg logit contributions
 foreskin-0.157
 salute-0.144
ratulations-0.138
 conver-0.138
cens-0.135
Token Street
Token ID3530
Feature activation+4.73e-01

Pos logit contributions
antine+0.644
ables+0.635
bledon+0.630
water+0.606
mom+0.600
Neg logit contributions
 foreskin-0.576
 salute-0.521
 conver-0.499
ratulations-0.497
cens-0.487
Token and
Token ID290
Feature activation-1.21e-01

Pos logit contributions
antine+0.914
ables+0.914
bledon+0.903
water+0.864
mom+0.856
Neg logit contributions
 foreskin-0.999
 salute-0.913
ratulations-0.885
 conver-0.875
cens-0.870
Token other
Token ID584
Feature activation-6.01e-01

Pos logit contributions
 foreskin+0.280
 salute+0.258
 conver+0.250
ratulations+0.247
cens+0.245
Neg logit contributions
antine-0.211
ables-0.208
bledon-0.206
water-0.196
mom-0.195
Token I
Token ID314
Feature activation+1.11e-01

Pos logit contributions
antine+0.420
ables+0.414
bledon+0.409
water+0.395
mom+0.394
Neg logit contributions
 foreskin-0.421
 salute-0.384
 conver-0.371
ratulations-0.368
cens-0.359
Token bought
Token ID5839
Feature activation+3.29e-01

Pos logit contributions
antine+0.222
ables+0.222
bledon+0.217
water+0.211
mom+0.209
Neg logit contributions
 foreskin-0.227
 salute-0.202
ratulations-0.199
 conver-0.197
cens-0.196
Token it
Token ID340
Feature activation+2.97e-01

Pos logit contributions
antine+0.625
ables+0.621
bledon+0.608
water+0.587
mom+0.584
Neg logit contributions
 foreskin-0.704
 salute-0.650
ratulations-0.625
 conver-0.619
cens-0.611
Token for
Token ID329
Feature activation+1.46e-01

Pos logit contributions
antine+0.563
ables+0.559
bledon+0.551
water+0.529
mom+0.528
Neg logit contributions
 foreskin-0.639
 salute-0.586
ratulations-0.566
 conver-0.564
cens-0.553
Token music
Token ID2647
Feature activation+8.10e-01

Pos logit contributions
antine+0.271
ables+0.270
bledon+0.265
water+0.256
mom+0.255
Neg logit contributions
 foreskin-0.319
 salute-0.292
ratulations-0.284
 conver-0.282
cens-0.279
Token listening
Token ID8680
Feature activation+1.62e+00

Pos logit contributions
antine+1.637
ables+1.634
bledon+1.610
water+1.554
mom+1.552
Neg logit contributions
 foreskin-1.633
 salute-1.490
ratulations-1.419
 conver-1.417
cens-1.396
Token and
Token ID290
Feature activation+1.38e-01

Pos logit contributions
ables+3.054
antine+2.972
bledon+2.891
mom+2.865
water+2.848
Neg logit contributions
 foreskin-3.437
 salute-3.160
ratulations-3.084
cens-3.055
estinal-3.022
Token the
Token ID262
Feature activation+2.43e-01

Pos logit contributions
antine+0.269
ables+0.267
bledon+0.264
water+0.254
mom+0.253
Neg logit contributions
 foreskin-0.289
 salute-0.263
ratulations-0.254
 conver-0.252
cens-0.251
Token bass
Token ID12702
Feature activation+2.90e-01

Pos logit contributions
antine+0.512
ables+0.510
bledon+0.502
mom+0.485
water+0.484
Neg logit contributions
 foreskin-0.470
 salute-0.429
ratulations-0.412
 conver-0.408
cens-0.401
Token is
Token ID318
Feature activation+2.10e-01

Pos logit contributions
antine+0.624
ables+0.619
bledon+0.611
water+0.591
mom+0.590
Neg logit contributions
 foreskin-0.552
 salute-0.500
ratulations-0.478
 conver-0.477
cens-0.472
Token really
Token ID1107
Feature activation+6.00e-02

Pos logit contributions
ables+0.430
antine+0.429
bledon+0.422
water+0.410
mom+0.407
Neg logit contributions
 foreskin-0.415
 salute-0.379
ratulations-0.368
 conver-0.362
cens-0.361
Token energy
Token ID2568
Feature activation-4.71e-01

Pos logit contributions
 foreskin+0.735
 salute+0.672
 conver+0.647
ratulations+0.643
cens+0.638
Neg logit contributions
antine-0.678
ables-0.670
bledon-0.660
water-0.636
mom-0.635
Token.
Token ID13
Feature activation-1.85e-02

Pos logit contributions
 foreskin+1.009
 salute+0.926
 conver+0.895
ratulations+0.879
cens+0.871
Neg logit contributions
antine-0.906
ables-0.891
bledon-0.885
water-0.842
mom-0.841
Token They
Token ID1119
Feature activation-2.60e-02

Pos logit contributions
 foreskin+0.038
 salute+0.034
 conver+0.033
ratulations+0.033
cens+0.032
Neg logit contributions
antine-0.037
ables-0.037
bledon-0.036
water-0.035
mom-0.035
Token need
Token ID761
Feature activation+5.63e-01

Pos logit contributions
 foreskin+0.047
 salute+0.042
 conver+0.041
ratulations+0.040
cens+0.040
Neg logit contributions
antine-0.058
ables-0.058
bledon-0.057
water-0.055
mom-0.055
Token us
Token ID514
Feature activation+1.19e-01

Pos logit contributions
antine+1.047
ables+1.034
bledon+1.016
water+0.988
mom+0.976
Neg logit contributions
 foreskin-1.233
 salute-1.131
 conver-1.088
ratulations-1.083
cens-1.075
Token dearly
Token ID46379
Feature activation+1.61e+00

Pos logit contributions
antine+0.223
ables+0.220
bledon+0.217
water+0.208
mom+0.207
Neg logit contributions
 foreskin-0.263
 salute-0.241
 conver-0.232
ratulations-0.231
cens-0.227
Token after
Token ID706
Feature activation+8.01e-02

Pos logit contributions
antine+2.864
ables+2.837
bledon+2.767
water+2.726
mom+2.686
Neg logit contributions
 foreskin-3.617
 salute-3.314
ratulations-3.210
 conver-3.175
cens-3.122
Token the
Token ID262
Feature activation-1.91e-02

Pos logit contributions
antine+0.153
ables+0.150
bledon+0.149
water+0.143
mom+0.142
Neg logit contributions
 foreskin-0.173
 salute-0.158
 conver-0.152
ratulations-0.151
cens-0.149
Token ID
Token ID4522
Feature activation+1.23e-01

Pos logit contributions
 foreskin+0.036
 salute+0.032
ratulations+0.031
 conver+0.031
cens+0.031
Neg logit contributions
antine-0.041
ables-0.041
bledon-0.040
mom-0.039
water-0.039
Token split
Token ID6626
Feature activation-1.81e-02

Pos logit contributions
antine+0.249
ables+0.246
bledon+0.243
water+0.233
mom+0.233
Neg logit contributions
 foreskin-0.254
 salute-0.228
 conver-0.220
ratulations-0.220
cens-0.217
Token to
Token ID284
Feature activation+2.99e-02

Pos logit contributions
 foreskin+0.038
 salute+0.035
 conver+0.033
ratulations+0.033
cens+0.033
Neg logit contributions
antine-0.035
ables-0.035
bledon-0.035
water-0.033
mom-0.033
Token interesting
Token ID3499
Feature activation-6.33e-01

Pos logit contributions
antine+0.384
ables+0.382
bledon+0.375
water+0.363
mom+0.362
Neg logit contributions
 foreskin-0.328
 salute-0.296
ratulations-0.284
 conver-0.282
cens-0.278
Token.
Token ID13
Feature activation+1.79e-01

Pos logit contributions
 foreskin+1.259
 salute+1.147
 conver+1.104
ratulations+1.084
cens+1.073
Neg logit contributions
antine-1.316
ables-1.298
bledon-1.289
water-1.232
mom-1.231
Token Apparently
Token ID18626
Feature activation+1.02e-01

Pos logit contributions
antine+0.362
ables+0.358
bledon+0.353
water+0.341
mom+0.341
Neg logit contributions
 foreskin-0.367
 salute-0.334
 conver-0.322
ratulations-0.320
cens-0.313
Token the
Token ID262
Feature activation-1.22e-01

Pos logit contributions
antine+0.188
ables+0.185
bledon+0.183
water+0.175
mom+0.175
Neg logit contributions
 foreskin-0.227
 salute-0.209
 conver-0.202
ratulations-0.200
cens-0.197
Token Str
Token ID4285
Feature activation+5.90e-01

Pos logit contributions
 foreskin+0.242
 salute+0.220
 conver+0.212
ratulations+0.210
cens+0.207
Neg logit contributions
antine-0.255
ables-0.252
bledon-0.249
water-0.240
mom-0.239
Tokenand
Token ID392
Feature activation+1.60e+00

Pos logit contributions
antine+1.029
ables+1.020
bledon+0.998
mom+0.952
water+0.950
Neg logit contributions
 foreskin-1.382
 salute-1.273
 conver-1.231
ratulations-1.212
cens-1.202
Token Craft
Token ID15745
Feature activation+4.97e-01

Pos logit contributions
ables+3.289
antine+3.284
bledon+3.202
water+3.144
mom+3.105
Neg logit contributions
 foreskin-3.142
 salute-2.831
ratulations-2.738
 conver-2.709
cens-2.665
Token 122
Token ID19409
Feature activation+2.72e-01

Pos logit contributions
antine+1.010
ables+1.002
bledon+0.984
water+0.948
mom+0.945
Neg logit contributions
 foreskin-1.015
 salute-0.923
 conver-0.887
ratulations-0.878
cens-0.867
Token is
Token ID318
Feature activation+8.89e-02

Pos logit contributions
antine+0.562
ables+0.557
bledon+0.549
water+0.529
mom+0.527
Neg logit contributions
 foreskin-0.545
 salute-0.492
ratulations-0.475
 conver-0.472
cens-0.463
Token going
Token ID1016
Feature activation+2.68e-01

Pos logit contributions
antine+0.183
ables+0.181
bledon+0.178
water+0.172
mom+0.171
Neg logit contributions
 foreskin-0.178
 salute-0.162
 conver-0.155
ratulations-0.155
cens-0.153
Token to
Token ID284
Feature activation+1.40e-01

Pos logit contributions
antine+0.486
ables+0.477
bledon+0.474
water+0.453
mom+0.450
Neg logit contributions
 foreskin-0.602
 salute-0.553
 conver-0.534
ratulations-0.530
cens-0.525
Token and
Token ID290
Feature activation-1.99e-01

Pos logit contributions
 foreskin+1.831
 salute+1.674
 conver+1.627
ratulations+1.584
 digit+1.576
Neg logit contributions
antine-1.750
ables-1.715
bledon-1.709
mom-1.630
water-1.625
Token connect
Token ID2018
Feature activation-3.96e-01

Pos logit contributions
 foreskin+0.407
 salute+0.371
 conver+0.357
ratulations+0.353
cens+0.349
Neg logit contributions
antine-0.403
ables-0.399
bledon-0.395
water-0.378
mom-0.376
Token its
Token ID663
Feature activation-6.57e-01

Pos logit contributions
 foreskin+0.861
 salute+0.791
 conver+0.764
ratulations+0.752
cens+0.745
Neg logit contributions
antine-0.751
ables-0.740
bledon-0.734
water-0.698
mom-0.698
Token W
Token ID370
Feature activation-3.89e-01

Pos logit contributions
 foreskin+1.277
 salute+1.161
 conver+1.116
ratulations+1.097
cens+1.084
Neg logit contributions
antine-1.393
ables-1.374
bledon-1.364
water-1.306
mom-1.305
TokenAN
Token ID1565
Feature activation-8.98e-02

Pos logit contributions
 foreskin+0.859
 salute+0.790
 conver+0.766
ratulations+0.752
cens+0.746
Neg logit contributions
antine-0.722
ables-0.709
bledon-0.704
water-0.670
mom-0.669
Token port
Token ID2493
Feature activation-1.93e+00

Pos logit contributions
 foreskin+0.181
 salute+0.166
ratulations+0.158
 conver+0.157
cens+0.156
Neg logit contributions
antine-0.183
ables-0.183
bledon-0.180
water-0.172
mom-0.171
Token to
Token ID284
Feature activation-3.02e-01

Pos logit contributions
 foreskin+4.019
 salute+3.667
 conver+3.620
 digit+3.495
ratulations+3.461
Neg logit contributions
antine-3.760
bledon-3.673
ables-3.614
mom-3.532
water-3.471
Token the
Token ID262
Feature activation-2.62e-01

Pos logit contributions
 foreskin+0.654
 salute+0.600
 conver+0.579
ratulations+0.572
cens+0.566
Neg logit contributions
antine-0.573
ables-0.565
bledon-0.560
water-0.533
mom-0.532
Token MI
Token ID15789
Feature activation-3.95e-01

Pos logit contributions
 foreskin+0.484
 salute+0.438
ratulations+0.416
 conver+0.416
cens+0.408
Neg logit contributions
antine-0.580
ables-0.576
bledon-0.570
water-0.549
mom-0.545
Token424
Token ID40090
Feature activation-7.93e-01

Pos logit contributions
 foreskin+0.910
 salute+0.840
 conver+0.812
ratulations+0.802
cens+0.794
Neg logit contributions
antine-0.696
ables-0.686
bledon-0.680
water-0.644
mom-0.643
TokenWR
Token ID18564
Feature activation-5.74e-01

Pos logit contributions
 foreskin+1.614
 salute+1.473
 conver+1.419
ratulations+1.397
cens+1.381
Neg logit contributions
antine-1.611
ables-1.589
bledon-1.577
water-1.506
mom-1.504
Token this
Token ID428
Feature activation-1.01e-02

Pos logit contributions
antine+1.402
ables+1.400
bledon+1.378
mom+1.316
water+1.314
Neg logit contributions
 foreskin-1.695
 salute-1.545
ratulations-1.494
 conver-1.473
cens-1.460
Token month
Token ID1227
Feature activation+3.10e-01

Pos logit contributions
 foreskin+0.017
 salute+0.015
 conver+0.014
ratulations+0.014
cens+0.014
Neg logit contributions
antine-0.024
ables-0.024
bledon-0.024
mom-0.023
water-0.023
Token after
Token ID706
Feature activation+1.58e-01

Pos logit contributions
antine+0.572
ables+0.568
bledon+0.560
water+0.536
mom+0.535
Neg logit contributions
 foreskin-0.686
 salute-0.630
ratulations-0.606
 conver-0.604
cens-0.591
Token using
Token ID1262
Feature activation+6.02e-02

Pos logit contributions
antine+0.305
ables+0.303
bledon+0.297
water+0.286
mom+0.283
Neg logit contributions
 foreskin-0.333
 salute-0.305
ratulations-0.294
 conver-0.292
cens-0.288
Token prof
Token ID1534
Feature activation-2.70e-01

Pos logit contributions
antine+0.119
ables+0.118
bledon+0.116
water+0.112
mom+0.111
Neg logit contributions
 foreskin-0.125
 salute-0.114
ratulations-0.111
 conver-0.110
cens-0.108
Tokenanity
Token ID19689
Feature activation-1.91e+00

Pos logit contributions
 foreskin+0.573
 salute+0.524
 conver+0.506
ratulations+0.498
 digit+0.490
Neg logit contributions
antine-0.531
ables-0.523
bledon-0.512
mom-0.495
water-0.493
Token to
Token ID284
Feature activation-5.22e-02

Pos logit contributions
 foreskin+3.714
 salute+3.404
 conver+3.309
 digit+3.198
cens+3.160
Neg logit contributions
antine-4.016
bledon-3.931
ables-3.920
mom-3.726
water-3.688
Token describe
Token ID6901
Feature activation-2.22e-01

Pos logit contributions
 foreskin+0.099
 salute+0.088
ratulations+0.087
 conver+0.085
cens+0.084
Neg logit contributions
ables-0.112
antine-0.112
bledon-0.110
water-0.106
mom-0.105
Token President
Token ID1992
Feature activation+1.18e-01

Pos logit contributions
 foreskin+0.451
 salute+0.411
 conver+0.396
ratulations+0.395
cens+0.387
Neg logit contributions
antine-0.452
ables-0.448
bledon-0.443
water-0.423
mom-0.421
Token Donald
Token ID3759
Feature activation-8.53e-01

Pos logit contributions
antine+0.258
ables+0.257
bledon+0.254
water+0.244
mom+0.243
Neg logit contributions
 foreskin-0.223
 salute-0.201
 conver-0.192
ratulations-0.191
cens-0.188
Token Trump
Token ID1301
Feature activation-1.23e-01

Pos logit contributions
 foreskin+1.723
 salute+1.578
 conver+1.522
ratulations+1.490
 digit+1.471
Neg logit contributions
antine-1.740
ables-1.713
bledon-1.705
mom-1.628
water-1.623
Token MI
Token ID15789
Feature activation-3.95e-01

Pos logit contributions
 foreskin+0.484
 salute+0.438
ratulations+0.416
 conver+0.416
cens+0.408
Neg logit contributions
antine-0.580
ables-0.576
bledon-0.570
water-0.549
mom-0.545
Token424
Token ID40090
Feature activation-7.93e-01

Pos logit contributions
 foreskin+0.910
 salute+0.840
 conver+0.812
ratulations+0.802
cens+0.794
Neg logit contributions
antine-0.696
ables-0.686
bledon-0.680
water-0.644
mom-0.643
TokenWR
Token ID18564
Feature activation-5.74e-01

Pos logit contributions
 foreskin+1.614
 salute+1.473
 conver+1.419
ratulations+1.397
cens+1.381
Neg logit contributions
antine-1.611
ables-1.589
bledon-1.577
water-1.506
mom-1.504
Token's
Token ID338
Feature activation-4.60e-01

Pos logit contributions
 foreskin+1.151
 salute+1.050
 conver+1.014
ratulations+0.993
cens+0.983
Neg logit contributions
antine-1.181
ables-1.163
bledon-1.156
water-1.104
mom-1.104
Token LAN
Token ID24192
Feature activation-1.55e+00

Pos logit contributions
 foreskin+0.895
 salute+0.814
 conver+0.782
ratulations+0.770
cens+0.761
Neg logit contributions
antine-0.975
ables-0.963
bledon-0.955
water-0.915
mom-0.913
Token port
Token ID2493
Feature activation-1.86e+00

Pos logit contributions
 foreskin+2.676
 salute+2.403
 conver+2.336
 digit+2.248
ratulations+2.234
Neg logit contributions
antine-3.594
ables-3.514
bledon-3.508
mom-3.382
water-3.368
Token.
Token ID13
Feature activation-2.22e-01

Pos logit contributions
 foreskin+3.740
 salute+3.406
 conver+3.369
 digit+3.262
ratulations+3.192
Neg logit contributions
antine-3.775
bledon-3.677
ables-3.628
mom-3.540
water-3.474
Token The
Token ID383
Feature activation-2.50e-01

Pos logit contributions
 foreskin+0.437
 salute+0.398
 conver+0.383
ratulations+0.377
cens+0.372
Neg logit contributions
antine-0.465
ables-0.459
bledon-0.455
water-0.436
mom-0.435
Token router
Token ID20264
Feature activation-4.37e-01

Pos logit contributions
 foreskin+0.489
 salute+0.443
 conver+0.425
ratulations+0.422
cens+0.416
Neg logit contributions
antine-0.525
ables-0.520
bledon-0.515
water-0.494
mom-0.492
Token should
Token ID815
Feature activation-7.68e-01

Pos logit contributions
 foreskin+0.728
 salute+0.650
 conver+0.619
ratulations+0.610
cens+0.601
Neg logit contributions
antine-1.048
ables-1.038
bledon-1.030
water-0.992
mom-0.990
Token then
Token ID788
Feature activation-5.10e-01

Pos logit contributions
 foreskin+1.508
 salute+1.379
 conver+1.327
ratulations+1.296
 digit+1.286
Neg logit contributions
antine-1.617
ables-1.589
bledon-1.581
mom-1.511
water-1.510
Token a
Token ID257
Feature activation+1.98e-01

Pos logit contributions
antine+0.123
ables+0.122
bledon+0.121
water+0.116
mom+0.116
Neg logit contributions
 foreskin-0.109
 salute-0.100
 conver-0.095
ratulations-0.095
cens-0.094
Token laboratory
Token ID14010
Feature activation+3.31e-02

Pos logit contributions
antine+0.420
ables+0.416
bledon+0.413
water+0.397
mom+0.396
Neg logit contributions
 foreskin-0.383
 salute-0.349
ratulations-0.335
 conver-0.335
cens-0.328
Token in
Token ID287
Feature activation+2.30e-01

Pos logit contributions
antine+0.066
ables+0.065
bledon+0.064
water+0.061
mom+0.061
Neg logit contributions
 foreskin-0.069
 salute-0.063
 conver-0.061
ratulations-0.060
cens-0.059
Token San
Token ID2986
Feature activation-9.74e-02

Pos logit contributions
antine+0.421
ables+0.418
bledon+0.412
water+0.395
mom+0.390
Neg logit contributions
 foreskin-0.510
 salute-0.470
ratulations-0.455
 conver-0.453
cens-0.446
Token Juan
Token ID16852
Feature activation+6.89e-01

Pos logit contributions
 foreskin+0.220
 salute+0.203
 conver+0.196
ratulations+0.193
cens+0.191
Neg logit contributions
antine-0.176
ables-0.174
bledon-0.172
water-0.164
mom-0.163
Token Cap
Token ID4476
Feature activation-1.80e+00

Pos logit contributions
antine+1.191
ables+1.170
bledon+1.151
water+1.106
mom+1.082
Neg logit contributions
 foreskin-1.604
 salute-1.489
ratulations-1.447
 conver-1.434
 digit-1.407
Tokenist
Token ID396
Feature activation+3.73e-01

Pos logit contributions
 foreskin+4.460
 salute+4.176
 conver+4.063
ratulations+3.978
cens+3.968
Neg logit contributions
bledon-2.742
antine-2.738
ables-2.688
mom-2.549
water-2.533
Tokenrano
Token ID35823
Feature activation+4.66e-03

Pos logit contributions
antine+0.701
ables+0.694
bledon+0.680
mom+0.659
water+0.657
Neg logit contributions
 foreskin-0.817
 salute-0.752
 conver-0.726
ratulations-0.718
 digit-0.706
Token,
Token ID11
Feature activation+2.08e-01

Pos logit contributions
antine+0.008
ables+0.008
bledon+0.008
water+0.008
mom+0.008
Neg logit contributions
 foreskin-0.011
 salute-0.010
 conver-0.010
ratulations-0.010
cens-0.009
Token Calif
Token ID3192
Feature activation+7.50e-01

Pos logit contributions
antine+0.410
ables+0.405
bledon+0.402
water+0.390
mom+0.383
Neg logit contributions
 foreskin-0.434
 salute-0.395
ratulations-0.382
 conver-0.377
cens-0.374
Token.,
Token ID1539
Feature activation+5.30e-01

Pos logit contributions
antine+1.264
ables+1.247
bledon+1.216
water+1.158
mom+1.148
Neg logit contributions
 foreskin-1.805
 salute-1.670
 conver-1.612
ratulations-1.590
 digit-1.576
Token USB
Token ID8450
Feature activation-1.24e+00

Pos logit contributions
 foreskin+0.325
 salute+0.295
 conver+0.283
ratulations+0.279
cens+0.275
Neg logit contributions
antine-0.374
ables-0.370
bledon-0.367
water-0.352
mom-0.352
Token connection
Token ID4637
Feature activation-5.96e-01

Pos logit contributions
 foreskin+2.230
 salute+2.014
 conver+1.948
ratulations+1.873
 digit+1.870
Neg logit contributions
antine-2.804
ables-2.741
bledon-2.741
water-2.631
mom-2.630
Token,
Token ID11
Feature activation-1.62e-01

Pos logit contributions
 foreskin+1.257
 salute+1.151
 conver+1.113
ratulations+1.090
cens+1.081
Neg logit contributions
antine-1.165
ables-1.144
bledon-1.140
mom-1.084
water-1.083
Token or
Token ID393
Feature activation+2.46e-01

Pos logit contributions
 foreskin+0.316
 salute+0.288
 conver+0.276
ratulations+0.274
cens+0.270
Neg logit contributions
antine-0.338
ables-0.335
bledon-0.331
water-0.320
mom-0.318
Token your
Token ID534
Feature activation+1.82e-01

Pos logit contributions
antine+0.481
ables+0.478
bledon+0.469
water+0.453
mom+0.450
Neg logit contributions
 foreskin-0.512
 salute-0.472
 conver-0.451
ratulations-0.451
cens-0.443
Token PC
Token ID4217
Feature activation-1.77e+00

Pos logit contributions
antine+0.398
ables+0.396
bledon+0.390
water+0.378
mom+0.376
Neg logit contributions
 foreskin-0.338
 salute-0.308
ratulations-0.294
 conver-0.292
cens-0.287
Token hal
Token ID10284
Feature activation-1.08e+00

Pos logit contributions
 foreskin+3.196
 salute+2.873
 conver+2.797
 digit+2.691
 interpre+2.671
Neg logit contributions
antine-3.980
ables-3.899
bledon-3.893
mom-3.730
water-3.709
Tokents
Token ID912
Feature activation+2.92e-01

Pos logit contributions
 foreskin+1.521
 salute+1.327
 conver+1.270
ratulations+1.222
cens+1.213
Neg logit contributions
antine-2.876
ables-2.832
bledon-2.823
mom-2.721
water-2.715
Token,
Token ID11
Feature activation-2.50e-01

Pos logit contributions
antine+0.570
ables+0.568
bledon+0.559
water+0.537
mom+0.534
Neg logit contributions
 foreskin-0.613
 salute-0.559
 conver-0.539
ratulations-0.533
cens-0.528
Token or
Token ID393
Feature activation-2.71e-02

Pos logit contributions
 foreskin+0.494
 salute+0.450
 conver+0.432
ratulations+0.427
cens+0.421
Neg logit contributions
antine-0.521
ables-0.516
bledon-0.510
water-0.491
mom-0.490
Token the
Token ID262
Feature activation-1.39e-01

Pos logit contributions
 foreskin+0.057
 salute+0.052
 conver+0.050
ratulations+0.050
cens+0.049
Neg logit contributions
antine-0.053
ables-0.052
bledon-0.052
water-0.050
mom-0.049
Token)
Token ID8
Feature activation+7.39e-02

Pos logit contributions
ables+0.488
antine+0.487
bledon+0.477
water+0.462
mom+0.458
Neg logit contributions
 foreskin-0.575
 salute-0.526
 conver-0.503
ratulations-0.500
cens-0.493
Token under
Token ID739
Feature activation-5.18e-02

Pos logit contributions
antine+0.144
ables+0.144
bledon+0.141
water+0.135
mom+0.135
Neg logit contributions
 foreskin-0.156
 salute-0.143
 conver-0.137
ratulations-0.136
cens-0.135
Token the
Token ID262
Feature activation-1.64e-01

Pos logit contributions
 foreskin+0.098
 salute+0.089
 conver+0.085
ratulations+0.084
cens+0.083
Neg logit contributions
antine-0.112
ables-0.111
bledon-0.110
water-0.106
mom-0.105
Token list
Token ID1351
Feature activation+3.04e-01

Pos logit contributions
 foreskin+0.312
 salute+0.284
 conver+0.271
ratulations+0.268
cens+0.264
Neg logit contributions
antine-0.353
ables-0.351
bledon-0.347
water-0.333
mom-0.332
Token of
Token ID286
Feature activation+2.33e-01

Pos logit contributions
antine+0.602
ables+0.599
bledon+0.589
water+0.564
mom+0.562
Neg logit contributions
 foreskin-0.632
 salute-0.580
 conver-0.555
ratulations-0.552
cens-0.543
Token brid
Token ID38265
Feature activation-1.76e+00

Pos logit contributions
ables+0.509
antine+0.508
bledon+0.499
water+0.480
mom+0.479
Neg logit contributions
 foreskin-0.438
 salute-0.398
ratulations-0.376
 conver-0.374
cens-0.370
Tokenged
Token ID2004
Feature activation-4.20e-01

Pos logit contributions
 foreskin+2.981
 salute+2.681
 conver+2.589
ratulations+2.502
cens+2.498
Neg logit contributions
antine-4.104
bledon-4.030
ables-4.006
mom-3.876
water-3.852
Token connections
Token ID8787
Feature activation-6.91e-01

Pos logit contributions
 foreskin+0.761
 salute+0.688
 conver+0.656
ratulations+0.647
cens+0.639
Neg logit contributions
antine-0.946
ables-0.937
bledon-0.928
water-0.892
mom-0.890
Token.
Token ID13
Feature activation+7.65e-02

Pos logit contributions
 foreskin+1.432
 salute+1.308
 conver+1.274
ratulations+1.234
 digit+1.234
Neg logit contributions
antine-1.381
bledon-1.349
ables-1.342
mom-1.284
water-1.280
Token The
Token ID383
Feature activation-3.45e-02

Pos logit contributions
antine+0.157
ables+0.156
bledon+0.154
water+0.148
mom+0.148
Neg logit contributions
 foreskin-0.154
 salute-0.140
 conver-0.134
ratulations-0.133
cens-0.131
Token original
Token ID2656
Feature activation+8.21e-01

Pos logit contributions
 foreskin+0.069
 salute+0.062
ratulations+0.060
 conver+0.060
cens+0.059
Neg logit contributions
antine-0.071
ables-0.071
bledon-0.070
water-0.067
mom-0.067
Token}
Token ID92
Feature activation-1.37e-01

Pos logit contributions
antine+0.254
ables+0.252
bledon+0.247
water+0.236
mom+0.234
Neg logit contributions
 foreskin-0.394
 salute-0.366
 conver-0.354
ratulations-0.351
cens-0.348
Token at
Token ID379
Feature activation-5.46e-02

Pos logit contributions
 foreskin+0.280
 salute+0.254
ratulations+0.245
 conver+0.244
cens+0.240
Neg logit contributions
antine-0.276
ables-0.275
bledon-0.267
water-0.259
mom-0.258
Token least
Token ID1551
Feature activation-1.91e-01

Pos logit contributions
 foreskin+0.122
 salute+0.113
 conver+0.109
ratulations+0.108
cens+0.106
Neg logit contributions
antine-0.100
ables-0.098
bledon-0.097
water-0.092
mom-0.092
Token one
Token ID530
Feature activation-4.89e-01

Pos logit contributions
 foreskin+0.322
 salute+0.289
 conver+0.275
ratulations+0.272
cens+0.268
Neg logit contributions
antine-0.454
ables-0.450
bledon-0.444
water-0.430
mom-0.429
Token blank
Token ID9178
Feature activation-3.25e-01

Pos logit contributions
 foreskin+0.909
 salute+0.820
 conver+0.788
ratulations+0.783
cens+0.770
Neg logit contributions
antine-1.070
ables-1.063
bledon-1.045
water-1.011
mom-1.008
Token character
Token ID2095
Feature activation-1.74e+00

Pos logit contributions
 foreskin+0.497
 salute+0.437
ratulations+0.415
 conver+0.415
cens+0.405
Neg logit contributions
antine-0.823
ables-0.819
bledon-0.806
water-0.783
mom-0.780
Token (<
Token ID38155
Feature activation-3.28e-01

Pos logit contributions
 foreskin+3.959
 salute+3.672
 conver+3.540
 digit+3.467
ratulations+3.463
Neg logit contributions
antine-3.105
bledon-3.050
ables-3.026
mom-2.866
water-2.862
TokenSpace
Token ID14106
Feature activation-3.10e-01

Pos logit contributions
 foreskin+0.671
 salute+0.614
 conver+0.592
ratulations+0.583
 pancreat+0.569
Neg logit contributions
antine-0.664
ables-0.661
bledon-0.649
water-0.631
mom-0.626
Token>
Token ID29
Feature activation-7.49e-02

Pos logit contributions
 foreskin+0.730
 salute+0.674
 conver+0.654
ratulations+0.649
cens+0.641
Neg logit contributions
antine-0.530
ables-0.524
bledon-0.517
water-0.493
mom-0.491
Token or
Token ID393
Feature activation-4.75e-02

Pos logit contributions
 foreskin+0.146
 salute+0.132
 conver+0.127
ratulations+0.126
cens+0.123
Neg logit contributions
antine-0.158
ables-0.158
bledon-0.155
water-0.149
mom-0.149
Token <
Token ID1279
Feature activation-1.04e+00

Pos logit contributions
 foreskin+0.083
 salute+0.075
ratulations+0.072
 conver+0.072
cens+0.070
Neg logit contributions
ables-0.109
antine-0.109
bledon-0.107
water-0.104
mom-0.103
Token'
Token ID6
Feature activation+2.24e-01

Pos logit contributions
 foreskin+0.984
 salute+0.885
 conver+0.848
ratulations+0.828
cens+0.817
Neg logit contributions
antine-1.315
ables-1.297
bledon-1.292
water-1.237
mom-1.237
Token of
Token ID286
Feature activation-2.18e-01

Pos logit contributions
antine+0.448
ables+0.443
bledon+0.438
water+0.421
mom+0.419
Neg logit contributions
 foreskin-0.461
 salute-0.421
 conver-0.406
ratulations-0.404
cens-0.396
Token people
Token ID661
Feature activation-2.29e-03

Pos logit contributions
 foreskin+0.399
 salute+0.360
 conver+0.345
ratulations+0.340
cens+0.335
Neg logit contributions
antine-0.486
ables-0.481
bledon-0.477
water-0.458
mom-0.457
Token of
Token ID286
Feature activation-2.99e-01

Pos logit contributions
 foreskin+0.005
 salute+0.004
 conver+0.004
ratulations+0.004
cens+0.004
Neg logit contributions
antine-0.005
ables-0.005
bledon-0.004
water-0.004
mom-0.004
Token all
Token ID477
Feature activation-3.52e-01

Pos logit contributions
 foreskin+0.665
 salute+0.612
 conver+0.592
ratulations+0.582
cens+0.577
Neg logit contributions
antine-0.553
ables-0.544
bledon-0.540
water-0.513
mom-0.512
Token faiths
Token ID43028
Feature activation-1.72e+00

Pos logit contributions
 foreskin+0.358
 salute+0.297
ratulations+0.271
 conver+0.264
cens+0.258
Neg logit contributions
ables-1.066
antine-1.058
bledon-1.046
water-1.023
mom-1.018
Token,
Token ID11
Feature activation-4.64e-01

Pos logit contributions
 foreskin+3.816
 salute+3.535
 conver+3.425
ratulations+3.338
cens+3.329
Neg logit contributions
antine-3.155
ables-3.078
bledon-3.065
mom-2.903
water-2.897
Token colors
Token ID7577
Feature activation-5.83e-01

Pos logit contributions
 foreskin+1.039
 salute+0.956
 conver+0.920
ratulations+0.917
cens+0.902
Neg logit contributions
antine-0.846
ables-0.841
bledon-0.827
mom-0.787
water-0.787
Token and
Token ID290
Feature activation-2.24e-01

Pos logit contributions
 foreskin+1.172
 salute+1.069
 conver+1.028
ratulations+1.015
cens+1.001
Neg logit contributions
antine-1.199
ables-1.184
bledon-1.174
water-1.123
mom-1.121
Token cre
Token ID1126
Feature activation-2.07e-01

Pos logit contributions
 foreskin+0.378
 salute+0.340
ratulations+0.330
 conver+0.322
cens+0.317
Neg logit contributions
ables-0.525
antine-0.524
bledon-0.515
water-0.497
mom-0.495
Tokeneds
Token ID5379
Feature activation+2.83e-01

Pos logit contributions
 foreskin+0.455
 salute+0.418
 conver+0.404
ratulations+0.399
cens+0.393
Neg logit contributions
antine-0.388
ables-0.384
bledon-0.379
mom-0.361
water-0.360
Token to
Token ID284
Feature activation+8.54e-02

Pos logit contributions
antine+0.597
ables+0.589
bledon+0.583
water+0.555
mom+0.553
Neg logit contributions
 foreskin-0.732
 salute-0.674
 conver-0.650
ratulations-0.643
cens-0.635
Token one
Token ID530
Feature activation-2.73e-01

Pos logit contributions
antine+0.160
ables+0.158
bledon+0.156
water+0.149
mom+0.148
Neg logit contributions
 foreskin-0.188
 salute-0.172
 conver-0.166
ratulations-0.165
cens-0.163
Token of
Token ID286
Feature activation+2.18e-01

Pos logit contributions
 foreskin+0.556
 salute+0.508
 conver+0.488
ratulations+0.482
cens+0.476
Neg logit contributions
antine-0.556
ables-0.549
bledon-0.544
water-0.520
mom-0.519
Token its
Token ID663
Feature activation-2.81e-02

Pos logit contributions
antine+0.388
ables+0.387
bledon+0.382
water+0.363
mom+0.361
Neg logit contributions
 foreskin-0.493
 salute-0.453
ratulations-0.437
 conver-0.436
cens-0.430
Token LAN
Token ID24192
Feature activation-1.50e+00

Pos logit contributions
 foreskin+0.053
 salute+0.048
 conver+0.045
ratulations+0.045
cens+0.044
Neg logit contributions
antine-0.061
ables-0.061
bledon-0.060
water-0.058
mom-0.058
Token ports
Token ID14090
Feature activation-1.72e+00

Pos logit contributions
 foreskin+2.627
 salute+2.364
 conver+2.302
ratulations+2.213
 digit+2.204
Neg logit contributions
antine-3.445
bledon-3.365
ables-3.358
mom-3.234
water-3.230
Token with
Token ID351
Feature activation-1.46e-01

Pos logit contributions
 foreskin+3.458
 salute+3.157
 conver+3.116
 digit+3.018
cens+2.988
Neg logit contributions
antine-3.455
bledon-3.368
ables-3.331
mom-3.222
water-3.179
Token a
Token ID257
Feature activation-1.46e-01

Pos logit contributions
 foreskin+0.307
 salute+0.281
 conver+0.270
ratulations+0.268
cens+0.265
Neg logit contributions
antine-0.285
ables-0.283
bledon-0.279
water-0.267
mom-0.266
Token static
Token ID9037
Feature activation+1.14e-01

Pos logit contributions
 foreskin+0.278
 salute+0.251
ratulations+0.241
 conver+0.239
cens+0.236
Neg logit contributions
antine-0.313
ables-0.311
bledon-0.307
water-0.296
mom-0.294
Token IP
Token ID6101
Feature activation-1.07e+00

Pos logit contributions
antine+0.195
ables+0.193
bledon+0.190
water+0.181
mom+0.179
Neg logit contributions
 foreskin-0.269
 salute-0.248
 conver-0.239
ratulations-0.239
cens-0.235
Token address
Token ID2209
Feature activation-6.54e-01

Pos logit contributions
 foreskin+1.858
 salute+1.674
 conver+1.608
ratulations+1.555
 digit+1.548
Neg logit contributions
antine-2.484
ables-2.446
bledon-2.436
mom-2.342
water-2.333
Token,
Token ID11
Feature activation-7.84e-02

Pos logit contributions
 foreskin+0.600
 salute+0.553
 conver+0.535
ratulations+0.527
cens+0.522
Neg logit contributions
antine-0.476
ables-0.468
bledon-0.464
water-0.440
mom-0.440
Token said
Token ID531
Feature activation-7.56e-01

Pos logit contributions
 foreskin+0.155
 salute+0.141
ratulations+0.135
 conver+0.134
cens+0.132
Neg logit contributions
ables-0.163
antine-0.163
bledon-0.161
water-0.154
mom-0.154
Token it
Token ID340
Feature activation-4.59e-01

Pos logit contributions
 foreskin+1.501
 salute+1.369
 conver+1.325
ratulations+1.293
 digit+1.288
Neg logit contributions
antine-1.572
ables-1.539
bledon-1.531
mom-1.462
water-1.461
Token was
Token ID373
Feature activation-4.77e-01

Pos logit contributions
 foreskin+0.910
 salute+0.824
ratulations+0.793
 conver+0.782
cens+0.773
Neg logit contributions
antine-0.956
ables-0.954
bledon-0.940
water-0.902
mom-0.897
Token no
Token ID645
Feature activation-3.42e-01

Pos logit contributions
 foreskin+0.899
 salute+0.813
ratulations+0.774
 conver+0.772
cens+0.761
Neg logit contributions
antine-1.037
ables-1.030
bledon-1.018
water-0.982
mom-0.976
Token surprise
Token ID5975
Feature activation-1.71e+00

Pos logit contributions
 foreskin+0.556
 salute+0.495
 conver+0.473
ratulations+0.470
cens+0.461
Neg logit contributions
antine-0.828
ables-0.823
bledon-0.817
water-0.787
mom-0.787
Token rates
Token ID3965
Feature activation-4.41e-02

Pos logit contributions
 foreskin+3.219
 salute+2.967
 conver+2.901
 digit+2.853
 pancreat+2.774
Neg logit contributions
antine-3.577
ables-3.488
bledon-3.480
mom-3.340
water-3.325
Token were
Token ID547
Feature activation-3.07e-01

Pos logit contributions
 foreskin+0.091
 salute+0.083
ratulations+0.080
 conver+0.079
cens+0.078
Neg logit contributions
antine-0.088
ables-0.087
bledon-0.087
water-0.083
mom-0.082
Token held
Token ID2714
Feature activation+8.46e-02

Pos logit contributions
 foreskin+0.596
 salute+0.541
ratulations+0.520
 conver+0.505
cens+0.505
Neg logit contributions
ables-0.651
antine-0.649
bledon-0.645
water-0.619
mom-0.614
Token.
Token ID13
Feature activation-2.03e-01

Pos logit contributions
ables+0.177
antine+0.177
bledon+0.174
water+0.167
mom+0.166
Neg logit contributions
 foreskin-0.167
 salute-0.151
ratulations-0.147
 conver-0.142
cens-0.141
Token\n
Token ID198
Feature activation-2.17e-01

Pos logit contributions
 foreskin+0.403
 salute+0.368
 conver+0.353
ratulations+0.348
cens+0.343
Neg logit contributions
antine-0.424
ables-0.419
bledon-0.415
water-0.398
mom-0.398
Token could
Token ID714
Feature activation-1.54e-02

Pos logit contributions
 foreskin+0.395
 salute+0.356
ratulations+0.341
 conver+0.341
cens+0.336
Neg logit contributions
antine-0.458
ables-0.455
bledon-0.450
water-0.433
mom-0.432
Token\n
Token ID198
Feature activation-2.03e-01

Pos logit contributions
 foreskin+0.031
 salute+0.028
 conver+0.027
ratulations+0.027
cens+0.026
Neg logit contributions
antine-0.032
ables-0.032
bledon-0.031
water-0.030
mom-0.030
Token\n
Token ID198
Feature activation-1.30e-01

Pos logit contributions
 foreskin+0.399
 salute+0.363
 conver+0.350
ratulations+0.344
cens+0.340
Neg logit contributions
antine-0.426
ables-0.420
bledon-0.417
water-0.399
mom-0.399
Token3
Token ID18
Feature activation-1.86e-02

Pos logit contributions
 foreskin+0.291
 salute+0.268
 conver+0.259
ratulations+0.255
cens+0.252
Neg logit contributions
antine-0.239
ables-0.235
bledon-0.233
water-0.222
mom-0.222
TokenD
Token ID35
Feature activation-8.15e-01

Pos logit contributions
 foreskin+0.040
 salute+0.036
 conver+0.035
ratulations+0.035
cens+0.034
Neg logit contributions
antine-0.036
ables-0.036
bledon-0.035
water-0.034
mom-0.034
TokenMAX
Token ID22921
Feature activation-1.70e+00

Pos logit contributions
 foreskin+1.378
 salute+1.229
 conver+1.175
ratulations+1.152
cens+1.134
Neg logit contributions
antine-1.940
ables-1.920
bledon-1.910
water-1.835
mom-1.833
Token\n
Token ID198
Feature activation-5.48e-01

Pos logit contributions
 foreskin+3.324
 salute+3.070
 conver+2.951
cens+2.837
ratulations+2.834
Neg logit contributions
antine-3.536
ables-3.463
bledon-3.456
water-3.296
mom-3.284
Token\n
Token ID198
Feature activation-3.02e-01

Pos logit contributions
 foreskin+1.037
 salute+0.941
 conver+0.907
cens+0.883
 digit+0.878
Neg logit contributions
antine-1.189
bledon-1.169
ables-1.166
mom-1.106
water-1.104
Token3
Token ID18
Feature activation-1.11e-01

Pos logit contributions
 foreskin+0.682
 salute+0.629
 conver+0.608
ratulations+0.599
cens+0.594
Neg logit contributions
antine-0.545
ables-0.537
bledon-0.532
water-0.505
mom-0.505
TokenD
Token ID35
Feature activation-6.51e-01

Pos logit contributions
 foreskin+0.254
 salute+0.233
 conver+0.225
ratulations+0.223
cens+0.220
Neg logit contributions
antine-0.197
ables-0.194
bledon-0.193
water-0.183
mom-0.183
TokenMAX
Token ID22921
Feature activation-1.48e+00

Pos logit contributions
 foreskin+0.982
 salute+0.867
 conver+0.828
ratulations+0.801
 digit+0.794
Neg logit contributions
antine-1.660
ables-1.638
bledon-1.626
water-1.566
mom-1.563
Token live
Token ID2107
Feature activation+5.89e-01

Pos logit contributions
ables+0.603
antine+0.599
bledon+0.588
water+0.570
mom+0.567
Neg logit contributions
 foreskin-0.580
ratulations-0.515
 salute-0.514
cens-0.508
 conver-0.504
Token harmon
Token ID25625
Feature activation+6.00e-01

Pos logit contributions
antine+1.138
ables+1.137
bledon+1.112
water+1.074
mom+1.067
Neg logit contributions
 foreskin-1.232
 salute-1.124
ratulations-1.093
 conver-1.087
cens-1.085
Tokeniously
Token ID6819
Feature activation+6.42e-01

Pos logit contributions
antine+0.670
ables+0.653
bledon+0.635
water+0.588
mom+0.584
Neg logit contributions
 foreskin-1.780
 salute-1.672
 conver-1.625
ratulations-1.614
cens-1.596
Token with
Token ID351
Feature activation+7.14e-02

Pos logit contributions
antine+1.161
ables+1.157
bledon+1.131
water+1.092
mom+1.084
Neg logit contributions
 foreskin-1.435
 salute-1.312
ratulations-1.277
 conver-1.262
cens-1.249
Token your
Token ID534
Feature activation+1.76e-01

Pos logit contributions
antine+0.142
ables+0.141
bledon+0.138
water+0.133
mom+0.133
Neg logit contributions
 foreskin-0.148
 salute-0.135
 conver-0.130
ratulations-0.129
cens-0.128
Token fellow
Token ID5891
Feature activation-1.70e+00

Pos logit contributions
antine+0.409
ables+0.407
bledon+0.401
mom+0.391
water+0.389
Neg logit contributions
 foreskin-0.297
 salute-0.268
cens-0.258
ratulations-0.257
 conver-0.254
Token men
Token ID1450
Feature activation-8.73e-02

Pos logit contributions
 foreskin+2.678
 salute+2.394
 conver+2.281
ratulations+2.210
 digit+2.183
Neg logit contributions
antine-4.240
bledon-4.158
ables-4.131
mom-3.964
water-3.940
Token.
Token ID13
Feature activation-1.71e-02

Pos logit contributions
 foreskin+0.187
 salute+0.171
 conver+0.165
ratulations+0.163
cens+0.161
Neg logit contributions
antine-0.169
ables-0.167
bledon-0.164
water-0.157
mom-0.157
Token I
Token ID314
Feature activation-9.62e-03

Pos logit contributions
 foreskin+0.036
 salute+0.032
 conver+0.031
ratulations+0.031
cens+0.030
Neg logit contributions
antine-0.034
ables-0.034
bledon-0.033
water-0.032
mom-0.032
Token leave
Token ID2666
Feature activation+6.50e-01

Pos logit contributions
 foreskin+0.018
 salute+0.016
ratulations+0.016
 conver+0.016
cens+0.016
Neg logit contributions
antine-0.021
ables-0.021
bledon-0.020
mom-0.020
water-0.020
Token you
Token ID345
Feature activation-9.48e-02

Pos logit contributions
antine+1.175
ables+1.166
bledon+1.142
water+1.099
mom+1.092
Neg logit contributions
 foreskin-1.456
 salute-1.335
 conver-1.296
ratulations-1.283
cens-1.274
Token fellow
Token ID5891
Feature activation-5.41e-01

Pos logit contributions
 foreskin+0.560
 salute+0.499
 conver+0.477
ratulations+0.469
cens+0.462
Neg logit contributions
antine-0.812
ables-0.802
bledon-0.798
water-0.769
mom-0.766
Token at
Token ID379
Feature activation+4.85e-02

Pos logit contributions
 foreskin+1.085
 salute+0.989
 conver+0.953
ratulations+0.936
cens+0.926
Neg logit contributions
antine-1.117
ables-1.101
bledon-1.093
water-1.045
mom-1.044
Token the
Token ID262
Feature activation+9.80e-02

Pos logit contributions
antine+0.084
ables+0.083
bledon+0.082
water+0.079
mom+0.077
Neg logit contributions
 foreskin-0.113
 salute-0.104
ratulations-0.101
 conver-0.101
cens-0.099
Token leftist
Token ID28318
Feature activation-4.53e-01

Pos logit contributions
antine+0.206
ables+0.203
bledon+0.200
water+0.195
mom+0.192
Neg logit contributions
 foreskin-0.191
 salute-0.175
ratulations-0.170
 conver-0.167
estinal-0.164
Token Southern
Token ID8050
Feature activation-8.33e-02

Pos logit contributions
 foreskin+0.884
 salute+0.805
ratulations+0.777
 conver+0.772
estinal+0.750
Neg logit contributions
antine-0.954
ables-0.939
bledon-0.927
water-0.904
mom-0.894
Token Poverty
Token ID30627
Feature activation-1.69e+00

Pos logit contributions
 foreskin+0.156
 salute+0.143
ratulations+0.140
 conver+0.137
estinal+0.134
Neg logit contributions
antine-0.181
ables-0.177
bledon-0.172
water-0.171
mom-0.169
Token Law
Token ID3854
Feature activation-4.41e-01

Pos logit contributions
 foreskin+2.959
 salute+2.691
 conver+2.557
 digit+2.464
ratulations+2.462
Neg logit contributions
antine-3.882
ables-3.836
bledon-3.813
mom-3.641
water-3.602
Token Center
Token ID3337
Feature activation+1.66e-01

Pos logit contributions
 foreskin+0.682
 salute+0.605
 conver+0.576
ratulations+0.561
cens+0.553
Neg logit contributions
antine-1.112
ables-1.097
bledon-1.092
water-1.051
mom-1.050
Token,
Token ID11
Feature activation+6.74e-02

Pos logit contributions
antine+0.286
ables+0.283
bledon+0.280
water+0.267
mom+0.266
Neg logit contributions
 foreskin-0.390
 salute-0.359
 conver-0.346
ratulations-0.345
cens-0.339
Token ass
Token ID840
Feature activation+1.79e-01

Pos logit contributions
ables+0.146
antine+0.145
bledon+0.144
water+0.138
mom+0.137
Neg logit contributions
 foreskin-0.127
 salute-0.115
ratulations-0.110
 conver-0.109
cens-0.108
Tokenailed
Token ID6255
Feature activation-2.15e-01

Pos logit contributions
antine+0.369
ables+0.367
bledon+0.359
mom+0.346
water+0.346
Neg logit contributions
 foreskin-0.361
 salute-0.327
 conver-0.312
ratulations-0.309
 neigh-0.304
Token\n
Token ID198
Feature activation-1.72e-01

Pos logit contributions
 foreskin+0.042
 salute+0.038
 conver+0.037
ratulations+0.036
cens+0.036
Neg logit contributions
antine-0.045
ables-0.044
bledon-0.044
water-0.042
mom-0.042
Token\n
Token ID198
Feature activation-7.64e-02

Pos logit contributions
 foreskin+0.338
 salute+0.307
 conver+0.296
ratulations+0.291
cens+0.287
Neg logit contributions
antine-0.362
ables-0.358
bledon-0.355
water-0.340
mom-0.339
Token12
Token ID1065
Feature activation+1.25e-01

Pos logit contributions
 foreskin+0.168
 salute+0.154
 conver+0.149
ratulations+0.147
 digit+0.145
Neg logit contributions
antine-0.143
ables-0.141
bledon-0.140
water-0.134
mom-0.134
Token.
Token ID13
Feature activation-4.09e-01

Pos logit contributions
antine+0.235
ables+0.232
bledon+0.229
water+0.220
mom+0.219
Neg logit contributions
 foreskin-0.275
 salute-0.252
 conver-0.244
ratulations-0.242
cens-0.237
Token Alex
Token ID4422
Feature activation-1.91e-01

Pos logit contributions
 foreskin+0.878
 salute+0.805
 conver+0.779
ratulations+0.767
cens+0.757
Neg logit contributions
antine-0.786
ables-0.774
bledon-0.767
water-0.732
mom-0.731
Token Dun
Token ID5648
Feature activation-1.68e+00

Pos logit contributions
 foreskin+0.442
 salute+0.408
 conver+0.395
ratulations+0.389
cens+0.386
Neg logit contributions
antine-0.336
ables-0.330
bledon-0.327
water-0.310
mom-0.310
Tokenbar
Token ID5657
Feature activation-6.63e-01

Pos logit contributions
 foreskin+4.173
 salute+3.896
cens+3.792
 conver+3.761
 digit+3.754
Neg logit contributions
antine-2.407
ables-2.313
bledon-2.282
 FTA-2.139
mom-2.106
Token (
Token ID357
Feature activation-1.05e-01

Pos logit contributions
 foreskin+1.480
 salute+1.370
 conver+1.327
 digit+1.292
cens+1.291
Neg logit contributions
antine-1.207
ables-1.185
bledon-1.180
mom-1.112
water-1.102
Token93
Token ID6052
Feature activation+2.23e-01

Pos logit contributions
 foreskin+0.228
 salute+0.209
 conver+0.202
ratulations+0.199
cens+0.197
Neg logit contributions
antine-0.200
ables-0.197
bledon-0.196
water-0.186
mom-0.186
Token)
Token ID8
Feature activation-9.98e-02

Pos logit contributions
antine+0.491
ables+0.485
bledon+0.480
water+0.463
mom+0.462
Neg logit contributions
 foreskin-0.415
 salute-0.375
 conver-0.360
ratulations-0.356
cens-0.348
Token (
Token ID357
Feature activation-8.19e-03

Pos logit contributions
 foreskin+0.228
 salute+0.211
 conver+0.204
ratulations+0.200
cens+0.200
Neg logit contributions
antine-0.177
ables-0.174
bledon-0.173
mom-0.163
water-0.163
Token by
Token ID416
Feature activation+9.01e-03

Pos logit contributions
antine+0.141
ables+0.138
bledon+0.137
water+0.130
mom+0.130
Neg logit contributions
 foreskin-0.177
 salute-0.163
 conver-0.158
ratulations-0.155
cens-0.154
Token two
Token ID734
Feature activation+1.38e-02

Pos logit contributions
antine+0.019
ables+0.019
bledon+0.018
water+0.018
mom+0.018
Neg logit contributions
 foreskin-0.018
 salute-0.016
ratulations-0.016
 conver-0.015
cens-0.015
Token magical
Token ID10883
Feature activation-1.74e-01

Pos logit contributions
antine+0.027
ables+0.027
bledon+0.027
water+0.025
mom+0.025
Neg logit contributions
 foreskin-0.029
 salute-0.026
 conver-0.025
ratulations-0.025
cens-0.025
Token words
Token ID2456
Feature activation+4.16e-01

Pos logit contributions
 foreskin+0.352
 salute+0.322
 conver+0.308
ratulations+0.307
cens+0.302
Neg logit contributions
antine-0.354
ables-0.350
bledon-0.345
water-0.331
mom-0.331
Token -
Token ID532
Feature activation-1.32e-01

Pos logit contributions
antine+0.837
ables+0.832
bledon+0.817
water+0.790
mom+0.787
Neg logit contributions
 foreskin-0.850
 salute-0.776
ratulations-0.742
 conver-0.736
cens-0.720
Token mouse
Token ID10211
Feature activation-1.68e+00

Pos logit contributions
 foreskin+0.256
 salute+0.232
 conver+0.223
ratulations+0.222
cens+0.217
Neg logit contributions
antine-0.280
ables-0.276
bledon-0.274
water-0.263
mom-0.263
Token issues
Token ID2428
Feature activation+3.05e-01

Pos logit contributions
 foreskin+3.355
 salute+3.067
 conver+2.953
cens+2.879
ratulations+2.860
Neg logit contributions
antine-3.451
ables-3.365
bledon-3.346
water-3.192
mom-3.183
Token -
Token ID532
Feature activation-1.07e-01

Pos logit contributions
antine+0.571
ables+0.565
bledon+0.557
water+0.535
mom+0.534
Neg logit contributions
 foreskin-0.664
 salute-0.613
ratulations-0.587
 conver-0.587
 digit-0.573
Token which
Token ID543
Feature activation-7.41e-01

Pos logit contributions
 foreskin+0.213
 salute+0.194
 conver+0.186
ratulations+0.185
cens+0.181
Neg logit contributions
antine-0.221
ables-0.218
bledon-0.217
mom-0.208
water-0.208
Token no
Token ID645
Feature activation-2.81e-01

Pos logit contributions
 foreskin+1.465
 salute+1.334
 conver+1.287
ratulations+1.257
cens+1.246
Neg logit contributions
antine-1.551
ables-1.528
bledon-1.518
water-1.450
mom-1.450
Token admin
Token ID13169
Feature activation-5.88e-02

Pos logit contributions
 foreskin+0.489
 salute+0.439
 conver+0.420
ratulations+0.413
cens+0.407
Neg logit contributions
antine-0.653
ables-0.646
bledon-0.641
water-0.616
mom-0.615
Token953
Token ID49649
Feature activation+1.26e-01

Pos logit contributions
 foreskin+0.444
 salute+0.408
 conver+0.394
ratulations+0.388
cens+0.384
Neg logit contributions
antine-0.392
ables-0.386
bledon-0.383
water-0.365
mom-0.364
Token 16
Token ID1467
Feature activation-1.63e-01

Pos logit contributions
antine+0.237
ables+0.235
bledon+0.231
water+0.222
mom+0.221
Neg logit contributions
 foreskin-0.275
 salute-0.253
 conver-0.243
ratulations-0.241
cens-0.238
Token27
Token ID1983
Feature activation+2.50e-01

Pos logit contributions
 foreskin+0.364
 salute+0.335
 conver+0.324
ratulations+0.320
cens+0.315
Neg logit contributions
antine-0.300
ables-0.295
bledon-0.292
water-0.279
mom-0.278
Token145
Token ID18781
Feature activation-1.45e-01

Pos logit contributions
antine+0.450
ables+0.442
bledon+0.435
water+0.419
mom+0.417
Neg logit contributions
 foreskin-0.568
 salute-0.522
 conver-0.505
ratulations-0.502
cens-0.492
Token Flat
Token ID21939
Feature activation+1.97e-01

Pos logit contributions
 foreskin+0.290
 salute+0.264
 conver+0.253
ratulations+0.251
cens+0.247
Neg logit contributions
antine-0.302
ables-0.298
bledon-0.295
water-0.284
mom-0.282
Tokenbuff
Token ID36873
Feature activation-1.66e+00

Pos logit contributions
antine+0.384
ables+0.379
bledon+0.376
water+0.359
mom+0.358
Neg logit contributions
 foreskin-0.416
 salute-0.381
 conver-0.367
ratulations-0.362
cens-0.358
Tokeners
Token ID364
Feature activation-1.05e-01

Pos logit contributions
 foreskin+3.021
 salute+2.754
 conver+2.659
ratulations+2.581
cens+2.546
Neg logit contributions
antine-3.669
bledon-3.632
ables-3.566
mom-3.449
water-3.422
TokenReading
Token ID36120
Feature activation+2.77e-01

Pos logit contributions
 foreskin+0.216
 salute+0.198
 conver+0.190
ratulations+0.188
cens+0.186
Neg logit contributions
antine-0.209
ables-0.206
bledon-0.204
water-0.195
mom-0.195
Token 20
Token ID1160
Feature activation+1.74e-01

Pos logit contributions
antine+0.534
ables+0.528
bledon+0.519
water+0.499
mom+0.498
Neg logit contributions
 foreskin-0.594
 salute-0.546
 conver-0.524
ratulations-0.520
cens-0.512
Token88
Token ID3459
Feature activation+3.69e-01

Pos logit contributions
antine+0.322
ables+0.317
bledon+0.312
water+0.300
mom+0.300
Neg logit contributions
 foreskin-0.386
 salute-0.354
 conver-0.342
ratulations-0.339
cens-0.333
Token17
Token ID1558
Feature activation+9.60e-02

Pos logit contributions
antine+0.661
ables+0.652
bledon+0.641
water+0.616
mom+0.614
Neg logit contributions
 foreskin-0.843
 salute-0.775
 conver-0.749
ratulations-0.746
cens-0.730
Token fundraising
Token ID18066
Feature activation+2.40e-01

Pos logit contributions
 foreskin+0.153
 salute+0.138
 conver+0.133
ratulations+0.131
cens+0.130
Neg logit contributions
antine-0.168
ables-0.167
bledon-0.165
water-0.159
mom-0.159
Token emails
Token ID7237
Feature activation+6.67e-01

Pos logit contributions
antine+0.496
ables+0.494
bledon+0.486
mom+0.469
water+0.468
Neg logit contributions
 foreskin-0.476
 salute-0.429
ratulations-0.415
 conver-0.412
cens-0.410
Token to
Token ID284
Feature activation-2.38e-01

Pos logit contributions
antine+1.220
ables+1.218
bledon+1.195
water+1.150
mom+1.144
Neg logit contributions
 foreskin-1.479
 salute-1.349
ratulations-1.305
 conver-1.289
cens-1.276
Token supporters
Token ID5941
Feature activation-5.11e-01

Pos logit contributions
 foreskin+0.463
 salute+0.417
 conver+0.403
ratulations+0.402
cens+0.397
Neg logit contributions
antine-0.503
ables-0.499
bledon-0.493
water-0.474
mom-0.474
Token.
Token ID13
Feature activation+9.46e-02

Pos logit contributions
 foreskin+1.092
 salute+1.000
 conver+0.963
ratulations+0.953
cens+0.941
Neg logit contributions
antine-0.986
ables-0.973
bledon-0.964
water-0.920
mom-0.919
Token Coord
Token ID22819
Feature activation-1.66e+00

Pos logit contributions
antine+0.192
ables+0.190
bledon+0.187
water+0.181
mom+0.181
Neg logit contributions
 foreskin-0.193
 salute-0.176
 conver-0.169
ratulations-0.167
cens-0.164
Tokenination
Token ID1883
Feature activation-8.64e-01

Pos logit contributions
 foreskin+2.887
 salute+2.619
 conver+2.504
ratulations+2.459
cens+2.441
Neg logit contributions
antine-3.778
bledon-3.748
ables-3.724
water-3.587
mom-3.570
Token on
Token ID319
Feature activation-6.00e-02

Pos logit contributions
 foreskin+1.689
 salute+1.536
 conver+1.490
ratulations+1.441
 digit+1.436
Neg logit contributions
antine-1.816
ables-1.792
bledon-1.786
water-1.705
mom-1.698
Token this
Token ID428
Feature activation-4.65e-02

Pos logit contributions
 foreskin+0.122
 salute+0.111
ratulations+0.107
 conver+0.106
cens+0.105
Neg logit contributions
antine-0.122
ables-0.121
bledon-0.119
mom-0.115
water-0.115
Token front
Token ID2166
Feature activation-7.32e-01

Pos logit contributions
 foreskin+0.093
 salute+0.084
ratulations+0.081
 conver+0.081
cens+0.079
Neg logit contributions
antine-0.096
ables-0.095
bledon-0.094
mom-0.090
water-0.090
Token is
Token ID318
Feature activation+1.04e-01

Pos logit contributions
 foreskin+1.383
 salute+1.252
 conver+1.218
 digit+1.172
ratulations+1.172
Neg logit contributions
antine-1.583
ables-1.559
bledon-1.557
water-1.479
mom-1.478
Token}
Token ID92
Feature activation-9.65e-02

Pos logit contributions
 foreskin+0.062
 salute+0.057
 conver+0.055
ratulations+0.055
cens+0.054
Neg logit contributions
antine-0.049
ables-0.048
bledon-0.047
water-0.045
mom-0.045
Token at
Token ID379
Feature activation-2.70e-02

Pos logit contributions
 foreskin+0.191
 salute+0.175
 conver+0.167
ratulations+0.167
cens+0.164
Neg logit contributions
antine-0.200
ables-0.200
bledon-0.195
water-0.189
mom-0.187
Token least
Token ID1551
Feature activation-5.44e-03

Pos logit contributions
 foreskin+0.060
 salute+0.055
 conver+0.053
ratulations+0.053
cens+0.052
Neg logit contributions
antine-0.050
ables-0.049
bledon-0.048
water-0.046
mom-0.046
Token one
Token ID530
Feature activation-2.93e-01

Pos logit contributions
 foreskin+0.011
 salute+0.010
ratulations+0.009
 conver+0.009
cens+0.009
Neg logit contributions
antine-0.011
ables-0.011
bledon-0.011
water-0.011
mom-0.011
Token blank
Token ID9178
Feature activation-3.06e-01

Pos logit contributions
 foreskin+0.553
 salute+0.502
ratulations+0.481
 conver+0.480
cens+0.472
Neg logit contributions
antine-0.634
ables-0.629
bledon-0.615
water-0.598
mom-0.595
Token character
Token ID2095
Feature activation-1.65e+00

Pos logit contributions
 foreskin+0.574
 salute+0.520
 conver+0.499
ratulations+0.494
cens+0.487
Neg logit contributions
antine-0.670
ables-0.664
bledon-0.653
water-0.631
mom-0.628
Token (<
Token ID38155
Feature activation-3.44e-01

Pos logit contributions
 foreskin+3.347
 salute+3.062
 conver+2.960
 digit+2.882
ratulations+2.856
Neg logit contributions
antine-3.357
bledon-3.308
ables-3.284
mom-3.140
water-3.124
TokenSpace
Token ID14106
Feature activation-3.97e-01

Pos logit contributions
 foreskin+0.798
 salute+0.741
 conver+0.713
ratulations+0.704
cens+0.693
Neg logit contributions
antine-0.599
ables-0.594
bledon-0.582
water-0.559
mom-0.556
Token>
Token ID29
Feature activation+4.59e-02

Pos logit contributions
 foreskin+0.861
 salute+0.784
 conver+0.758
ratulations+0.755
cens+0.744
Neg logit contributions
antine-0.755
ables-0.747
bledon-0.736
water-0.709
mom-0.707
Token or
Token ID393
Feature activation-1.32e-01

Pos logit contributions
ables+0.095
antine+0.095
bledon+0.093
water+0.090
mom+0.090
Neg logit contributions
 foreskin-0.091
 salute-0.082
ratulations-0.079
 conver-0.079
cens-0.077
Token <
Token ID1279
Feature activation-9.79e-01

Pos logit contributions
 foreskin+0.246
 salute+0.223
ratulations+0.214
 conver+0.212
cens+0.209
Neg logit contributions
ables-0.290
antine-0.289
bledon-0.282
water-0.274
mom-0.271
Token exposing
Token ID21294
Feature activation+1.46e-01

Pos logit contributions
 foreskin+0.552
 salute+0.504
ratulations+0.487
 conver+0.482
cens+0.471
Neg logit contributions
antine-0.577
ables-0.573
bledon-0.563
water-0.543
mom-0.543
Token their
Token ID511
Feature activation-4.44e-02

Pos logit contributions
antine+0.277
ables+0.274
bledon+0.270
mom+0.258
water+0.257
Neg logit contributions
 foreskin-0.316
 salute-0.291
ratulations-0.281
 conver-0.281
cens-0.275
Token child
Token ID1200
Feature activation-5.38e-01

Pos logit contributions
 foreskin+0.081
 salute+0.074
 conver+0.071
ratulations+0.071
cens+0.069
Neg logit contributions
antine-0.098
ables-0.097
bledon-0.096
mom-0.093
water-0.092
Token to
Token ID284
Feature activation-1.53e-01

Pos logit contributions
 foreskin+1.189
 salute+1.092
 conver+1.054
ratulations+1.037
cens+1.029
Neg logit contributions
antine-1.000
ables-0.982
bledon-0.977
water-0.926
mom-0.924
Token prof
Token ID1534
Feature activation-9.53e-02

Pos logit contributions
 foreskin+0.302
 salute+0.276
 conver+0.266
ratulations+0.264
cens+0.259
Neg logit contributions
antine-0.320
ables-0.316
bledon-0.312
water-0.300
mom-0.298
Tokenanity
Token ID19689
Feature activation-1.64e+00

Pos logit contributions
 foreskin+0.194
 salute+0.177
 conver+0.171
ratulations+0.167
 digit+0.165
Neg logit contributions
antine-0.195
ables-0.192
bledon-0.189
mom-0.182
water-0.181
Token."
Token ID526
Feature activation-3.15e-01

Pos logit contributions
 foreskin+3.489
 salute+3.219
 conver+3.109
 digit+3.019
cens+3.012
Neg logit contributions
antine-3.180
bledon-3.102
ables-3.094
mom-2.922
water-2.915
Token
Token ID198
Feature activation-3.47e-03

Pos logit contributions
 foreskin+0.615
 salute+0.560
 conver+0.539
ratulations+0.529
cens+0.523
Neg logit contributions
antine-0.664
ables-0.654
bledon-0.650
water-0.621
mom-0.620
Token
Token ID198
Feature activation+7.70e-02

Pos logit contributions
 foreskin+0.007
 salute+0.006
 conver+0.006
ratulations+0.006
cens+0.006
Neg logit contributions
antine-0.007
ables-0.007
bledon-0.007
water-0.007
mom-0.007
TokenPHOTOS
Token ID42709
Feature activation-4.09e-01

Pos logit contributions
antine+0.132
ables+0.131
bledon+0.129
mom+0.124
water+0.124
Neg logit contributions
 foreskin-0.182
 salute-0.168
 conver-0.162
ratulations-0.160
 digit-0.158
Token:
Token ID25
Feature activation+6.80e-02

Pos logit contributions
 foreskin+0.772
 salute+0.699
 conver+0.673
ratulations+0.667
cens+0.650
Neg logit contributions
antine-0.890
ables-0.882
bledon-0.873
water-0.843
mom-0.840
TokenThe
Token ID464
Feature activation-2.18e-01

Pos logit contributions
 foreskin+1.900
 salute+1.757
 conver+1.703
ratulations+1.677
cens+1.667
Neg logit contributions
antine-1.402
ables-1.379
bledon-1.369
water-1.291
mom-1.288
Token p
Token ID279
Feature activation-3.42e-01

Pos logit contributions
 foreskin+0.395
 salute+0.356
 conver+0.342
ratulations+0.339
cens+0.334
Neg logit contributions
antine-0.490
ables-0.484
bledon-0.478
water-0.461
mom-0.461
Tokenangs
Token ID27725
Feature activation-3.28e-01

Pos logit contributions
 foreskin+0.775
 salute+0.715
 conver+0.692
ratulations+0.683
cens+0.675
Neg logit contributions
antine-0.611
ables-0.601
bledon-0.601
mom-0.567
water-0.567
Token of
Token ID286
Feature activation-4.50e-01

Pos logit contributions
 foreskin+0.676
 salute+0.616
 conver+0.594
ratulations+0.587
cens+0.579
Neg logit contributions
antine-0.659
ables-0.650
bledon-0.642
water-0.616
mom-0.614
Token dis
Token ID595
Feature activation+5.62e-01

Pos logit contributions
 foreskin+0.859
 salute+0.779
 conver+0.749
ratulations+0.735
cens+0.726
Neg logit contributions
antine-0.971
ables-0.958
bledon-0.952
water-0.911
mom-0.910
Tokenpri
Token ID3448
Feature activation-1.62e+00

Pos logit contributions
antine+0.929
ables+0.921
bledon+0.895
mom+0.858
water+0.857
Neg logit contributions
 foreskin-1.362
 salute-1.263
 conver-1.225
ratulations-1.205
 digit-1.195
Tokenz
Token ID89
Feature activation+1.62e-01

Pos logit contributions
 foreskin+3.694
 salute+3.427
 conver+3.319
ratulations+3.311
cens+3.260
Neg logit contributions
antine-2.832
bledon-2.794
ables-2.764
mom-2.624
water-2.597
Token'd
Token ID1549
Feature activation-8.36e-01

Pos logit contributions
antine+0.309
ables+0.304
bledon+0.299
water+0.287
mom+0.285
Neg logit contributions
 foreskin-0.353
 salute-0.323
 conver-0.315
ratulations-0.309
 digit-0.305
Token love
Token ID1842
Feature activation+1.03e+00

Pos logit contributions
 foreskin+1.582
 salute+1.434
 conver+1.375
ratulations+1.343
cens+1.329
Neg logit contributions
antine-1.821
ables-1.792
bledon-1.786
mom-1.704
water-1.704
Token,
Token ID11
Feature activation-2.71e-01

Pos logit contributions
antine+1.870
ables+1.857
bledon+1.799
water+1.747
mom+1.735
Neg logit contributions
 foreskin-2.315
 salute-2.095
ratulations-2.075
 conver-2.032
 digit-2.004
Token the
Token ID262
Feature activation-3.42e-01

Pos logit contributions
 foreskin+0.575
 salute+0.525
 conver+0.508
ratulations+0.505
cens+0.497
Neg logit contributions
antine-0.528
ables-0.519
bledon-0.514
mom-0.493
water-0.492
Token of
Token ID286
Feature activation+3.07e-01

Pos logit contributions
 foreskin+0.477
 salute+0.436
ratulations+0.419
 conver+0.416
cens+0.410
Neg logit contributions
antine-0.486
ables-0.478
bledon-0.473
water-0.459
mom-0.454
Token the
Token ID262
Feature activation+4.95e-02

Pos logit contributions
antine+0.564
bledon+0.551
ables+0.551
water+0.525
mom+0.523
Neg logit contributions
 foreskin-0.680
 salute-0.626
ratulations-0.605
 conver-0.600
cens-0.589
Token difference
Token ID3580
Feature activation+4.69e-01

Pos logit contributions
antine+0.104
ables+0.103
bledon+0.102
water+0.098
mom+0.098
Neg logit contributions
 foreskin-0.096
 salute-0.087
ratulations-0.084
 conver-0.083
cens-0.082
Token could
Token ID714
Feature activation+5.78e-02

Pos logit contributions
antine+0.962
ables+0.958
bledon+0.945
water+0.907
mom+0.903
Neg logit contributions
 foreskin-0.937
 salute-0.858
ratulations-0.817
 conver-0.813
cens-0.797
Token be
Token ID307
Feature activation+1.96e-01

Pos logit contributions
antine+0.118
ables+0.117
bledon+0.116
water+0.111
mom+0.111
Neg logit contributions
 foreskin-0.116
 salute-0.106
 conver-0.102
ratulations-0.101
cens-0.099
Token attributed
Token ID14183
Feature activation+1.47e+00

Pos logit contributions
antine+0.403
ables+0.402
bledon+0.398
water+0.383
mom+0.382
Neg logit contributions
 foreskin-0.385
 salute-0.354
ratulations-0.338
 conver-0.334
cens-0.328
Token to
Token ID284
Feature activation+5.67e-01

Pos logit contributions
antine+2.548
ables+2.490
bledon+2.474
water+2.352
mom+2.335
Neg logit contributions
 foreskin-3.412
 salute-3.173
 conver-3.031
ratulations-3.026
cens-2.990
Token teacher
Token ID4701
Feature activation+6.28e-01

Pos logit contributions
antine+1.169
ables+1.148
bledon+1.136
mom+1.091
water+1.090
Neg logit contributions
 foreskin-1.132
 salute-1.031
ratulations-0.998
 conver-0.983
cens-0.969
Token characteristics
Token ID9695
Feature activation+4.99e-01

Pos logit contributions
antine+1.289
ables+1.273
bledon+1.255
mom+1.207
water+1.204
Neg logit contributions
 foreskin-1.253
 salute-1.133
ratulations-1.104
 conver-1.092
cens-1.070
Token,
Token ID11
Feature activation+1.40e-01

Pos logit contributions
antine+0.980
ables+0.970
bledon+0.951
water+0.920
mom+0.914
Neg logit contributions
 foreskin-1.046
 salute-0.963
ratulations-0.930
 conver-0.901
cens-0.888
Token especially
Token ID2592
Feature activation+9.23e-01

Pos logit contributions
antine+0.336
ables+0.332
bledon+0.329
water+0.318
mom+0.318
Neg logit contributions
 foreskin-0.232
 salute-0.207
 conver-0.197
ratulations-0.197
cens-0.191
Tokeniff
Token ID733
Feature activation+3.32e-01

Pos logit contributions
 foreskin+0.318
 salute+0.293
 conver+0.282
ratulations+0.278
cens+0.277
Neg logit contributions
antine-0.282
bledon-0.276
ables-0.276
mom-0.261
water-0.261
TokenTra
Token ID15721
Feature activation+2.42e-01

Pos logit contributions
antine+0.706
ables+0.700
bledon+0.685
water+0.672
mom+0.661
Neg logit contributions
 foreskin-0.653
 salute-0.592
 conver-0.565
ratulations-0.563
 neigh-0.548
Tokenx
Token ID87
Feature activation+3.93e-01

Pos logit contributions
antine+0.428
ables+0.422
bledon+0.417
water+0.398
mom+0.397
Neg logit contributions
 foreskin-0.559
 salute-0.516
 conver-0.499
ratulations-0.494
cens-0.486
Token
Token ID198
Feature activation-1.41e-01

Pos logit contributions
antine+0.779
ables+0.774
bledon+0.758
water+0.738
mom+0.736
Neg logit contributions
 foreskin-0.805
 salute-0.741
 conver-0.715
ratulations-0.706
 pancreat-0.687
Token
Token ID198
Feature activation+3.86e-02

Pos logit contributions
 foreskin+0.284
 salute+0.259
 conver+0.249
ratulations+0.247
cens+0.241
Neg logit contributions
antine-0.290
ables-0.287
bledon-0.283
water-0.273
mom-0.272
TokenReviewer
Token ID35407
Feature activation+1.75e+00

Pos logit contributions
antine+0.068
ables+0.068
bledon+0.066
water+0.064
mom+0.064
Neg logit contributions
 foreskin-0.089
 salute-0.082
 conver-0.080
ratulations-0.078
 digit-0.077
Token:
Token ID25
Feature activation+3.38e-01

Pos logit contributions
ables+3.347
antine+3.275
water+3.221
mom+3.169
bledon+3.146
Neg logit contributions
 foreskin-3.751
ratulations-3.486
 salute-3.460
estinal-3.349
 conver-3.314
Token V
Token ID569
Feature activation-2.22e-01

Pos logit contributions
antine+0.687
ables+0.675
bledon+0.666
water+0.646
mom+0.646
Neg logit contributions
 foreskin-0.676
 salute-0.617
ratulations-0.597
 conver-0.597
cens-0.586
Tokenthe
Token ID1169
Feature activation-3.11e-01

Pos logit contributions
 foreskin+0.513
 salute+0.474
 conver+0.459
ratulations+0.452
cens+0.448
Neg logit contributions
antine-0.390
ables-0.384
bledon-0.380
water-0.361
mom-0.360
TokenHappy
Token ID25082
Feature activation+4.60e-01

Pos logit contributions
 foreskin+0.722
 salute+0.666
 conver+0.646
ratulations+0.636
cens+0.629
Neg logit contributions
antine-0.545
ables-0.536
bledon-0.531
water-0.504
mom-0.503
TokenL
Token ID43
Feature activation-5.41e-01

Pos logit contributions
antine+0.899
ables+0.886
bledon+0.874
water+0.849
mom+0.847
Neg logit contributions
 foreskin-0.971
 salute-0.891
 conver-0.861
ratulations-0.850
cens-0.833
Tokening
Token ID278
Feature activation+3.16e-02

Pos logit contributions
 foreskin+0.116
 salute+0.107
 conver+0.103
ratulations+0.102
cens+0.101
Neg logit contributions
antine-0.092
ables-0.091
bledon-0.090
water-0.085
mom-0.085
Token David
Token ID3271
Feature activation+1.34e-01

Pos logit contributions
antine+0.061
ables+0.060
bledon+0.059
water+0.057
mom+0.057
Neg logit contributions
 foreskin-0.068
 salute-0.062
 conver-0.060
ratulations-0.059
cens-0.058
Token Cameron
Token ID10821
Feature activation+2.68e-01

Pos logit contributions
antine+0.274
ables+0.270
bledon+0.268
mom+0.255
water+0.255
Neg logit contributions
 foreskin-0.270
 salute-0.245
 conver-0.236
ratulations-0.236
cens-0.232
Token at
Token ID379
Feature activation+3.02e-01

Pos logit contributions
antine+0.480
ables+0.475
bledon+0.469
water+0.449
mom+0.446
Neg logit contributions
 foreskin-0.603
 salute-0.552
 conver-0.535
ratulations-0.534
cens-0.528
Token Wim
Token ID39999
Feature activation+6.48e-01

Pos logit contributions
antine+0.559
ables+0.553
bledon+0.551
water+0.523
mom+0.520
Neg logit contributions
 foreskin-0.664
 salute-0.609
 conver-0.590
ratulations-0.587
cens-0.580
Tokenbledon
Token ID49258
Feature activation+1.36e+00

Pos logit contributions
antine+0.001
bledon+0.000
ables-0.012
water-0.075
mom-0.083
Neg logit contributions
 foreskin-2.637
 salute-2.520
 conver-2.476
ratulations-2.456
cens-2.447
Token,
Token ID11
Feature activation+2.53e-01

Pos logit contributions
bledon+2.377
antine+2.340
ables+2.329
water+2.206
mom+2.195
Neg logit contributions
 foreskin-3.075
 salute-2.821
 conver-2.725
ratulations-2.724
cens-2.697
Token to
Token ID284
Feature activation-1.89e-01

Pos logit contributions
antine+0.464
ables+0.457
bledon+0.453
water+0.433
mom+0.431
Neg logit contributions
 foreskin-0.564
 salute-0.518
 conver-0.498
ratulations-0.497
cens-0.492
Token Alex
Token ID4422
Feature activation-6.53e-02

Pos logit contributions
 foreskin+0.399
 salute+0.365
 conver+0.352
ratulations+0.349
cens+0.345
Neg logit contributions
antine-0.369
ables-0.364
bledon-0.360
water-0.344
mom-0.343
Token Sal
Token ID4849
Feature activation-3.32e-01

Pos logit contributions
 foreskin+0.143
 salute+0.132
 conver+0.127
ratulations+0.126
 digit+0.124
Neg logit contributions
antine-0.122
ables-0.120
bledon-0.118
water-0.114
mom-0.113
Tokenmond
Token ID6327
Feature activation+6.46e-02

Pos logit contributions
 foreskin+1.113
 salute+1.054
 conver+1.032
ratulations+1.022
cens+1.015
Neg logit contributions
antine-0.237
ables-0.228
bledon-0.223
water-0.194
mom-0.193
Tokenyear
Token ID1941
Feature activation-3.18e-01

Pos logit contributions
antine+0.366
ables+0.362
bledon+0.355
water+0.347
mom+0.345
Neg logit contributions
 foreskin-0.360
 salute-0.328
 conver-0.314
ratulations-0.311
 neigh-0.303
Token fight
Token ID1907
Feature activation+1.15e-02

Pos logit contributions
 foreskin+0.648
 salute+0.592
 conver+0.570
ratulations+0.561
cens+0.555
Neg logit contributions
antine-0.645
ables-0.636
bledon-0.631
water-0.603
mom-0.602
Token against
Token ID1028
Feature activation-5.30e-01

Pos logit contributions
antine+0.024
ables+0.024
bledon+0.024
water+0.023
mom+0.023
Neg logit contributions
 foreskin-0.022
 salute-0.020
 conver-0.019
ratulations-0.019
cens-0.019
Token the
Token ID262
Feature activation-1.15e-02

Pos logit contributions
 foreskin+1.196
 salute+1.102
 conver+1.066
ratulations+1.051
cens+1.041
Neg logit contributions
antine-0.958
ables-0.943
bledon-0.935
water-0.888
mom-0.887
Token deal
Token ID1730
Feature activation+2.98e-01

Pos logit contributions
 foreskin+0.027
 salute+0.024
 conver+0.024
ratulations+0.023
cens+0.023
Neg logit contributions
antine-0.020
ables-0.020
bledon-0.020
water-0.019
mom-0.019
Token exempl
Token ID21433
Feature activation+1.42e+00

Pos logit contributions
antine+0.641
ables+0.634
bledon+0.625
water+0.605
mom+0.601
Neg logit contributions
 foreskin-0.570
 salute-0.517
ratulations-0.495
 conver-0.489
cens-0.479
Tokenifies
Token ID6945
Feature activation+7.94e-01

Pos logit contributions
ables+2.150
antine+2.090
bledon+1.974
water+1.932
mom+1.922
Neg logit contributions
 foreskin-3.704
 salute-3.428
 neigh-3.245
 conver-3.245
 digit-3.245
Token Washington
Token ID2669
Feature activation+1.24e-01

Pos logit contributions
antine+1.584
ables+1.555
bledon+1.540
water+1.495
mom+1.482
Neg logit contributions
 foreskin-1.652
 salute-1.490
ratulations-1.427
 conver-1.405
cens-1.389
Token
Token ID447
Feature activation+3.21e-02

Pos logit contributions
antine+0.223
ables+0.219
bledon+0.217
water+0.207
mom+0.206
Neg logit contributions
 foreskin-0.281
 salute-0.259
 conver-0.250
ratulations-0.247
cens-0.244
Token
Token ID247
Feature activation+2.06e-01

Pos logit contributions
antine+0.066
ables+0.065
bledon+0.064
water+0.062
mom+0.062
Neg logit contributions
 foreskin-0.065
 salute-0.059
 conver-0.057
ratulations-0.056
cens-0.055
Tokens
Token ID82
Feature activation+9.63e-02

Pos logit contributions
antine+0.480
ables+0.474
bledon+0.470
water+0.453
mom+0.452
Neg logit contributions
 foreskin-0.359
 salute-0.322
 conver-0.308
ratulations-0.303
cens-0.297
Token A
Token ID317
Feature activation+1.59e-01

Pos logit contributions
 foreskin+0.067
 salute+0.061
 conver+0.059
ratulations+0.059
cens+0.057
Neg logit contributions
antine-0.069
ables-0.068
bledon-0.067
water-0.065
mom-0.065
Token run
Token ID1057
Feature activation+2.14e-01

Pos logit contributions
antine+0.339
ables+0.335
bledon+0.333
water+0.319
mom+0.318
Neg logit contributions
 foreskin-0.308
 salute-0.279
 conver-0.269
ratulations-0.267
cens-0.261
Token-
Token ID12
Feature activation-6.46e-02

Pos logit contributions
antine+0.405
ables+0.402
bledon+0.396
water+0.381
mom+0.379
Neg logit contributions
 foreskin-0.463
 salute-0.425
 conver-0.412
ratulations-0.407
cens-0.401
Tokenup
Token ID929
Feature activation+2.82e-01

Pos logit contributions
 foreskin+0.164
 salute+0.152
 conver+0.148
ratulations+0.146
cens+0.144
Neg logit contributions
antine-0.099
ables-0.097
bledon-0.096
water-0.091
mom-0.091
Token in
Token ID287
Feature activation+4.58e-02

Pos logit contributions
antine+0.529
ables+0.523
bledon+0.517
water+0.497
mom+0.494
Neg logit contributions
 foreskin-0.613
 salute-0.565
 conver-0.542
ratulations-0.542
cens-0.534
Token prescribing
Token ID39973
Feature activation+1.64e+00

Pos logit contributions
antine+0.098
ables+0.097
bledon+0.096
water+0.092
mom+0.092
Neg logit contributions
 foreskin-0.088
 salute-0.080
 conver-0.077
ratulations-0.076
cens-0.075
Token from
Token ID422
Feature activation+1.21e-01

Pos logit contributions
ables+3.340
antine+3.287
bledon+3.227
water+3.166
mom+3.117
Neg logit contributions
 foreskin-3.227
 salute-2.960
ratulations-2.927
cens-2.796
 conver-2.756
Token 2000
Token ID4751
Feature activation-3.50e-01

Pos logit contributions
antine+0.257
ables+0.253
bledon+0.250
water+0.241
mom+0.240
Neg logit contributions
 foreskin-0.236
 salute-0.215
 conver-0.206
ratulations-0.205
cens-0.201
Token through
Token ID832
Feature activation+4.16e-01

Pos logit contributions
 foreskin+0.758
 salute+0.697
 conver+0.671
ratulations+0.665
cens+0.655
Neg logit contributions
antine-0.664
ables-0.655
bledon-0.650
water-0.619
mom-0.618
Token 2011
Token ID2813
Feature activation+2.49e-02

Pos logit contributions
antine+0.752
ables+0.742
bledon+0.732
water+0.699
mom+0.697
Neg logit contributions
 foreskin-0.938
 salute-0.864
 conver-0.832
ratulations-0.830
cens-0.816
Token fueled
Token ID24959
Feature activation-7.12e-02

Pos logit contributions
antine+0.052
ables+0.052
bledon+0.051
water+0.049
mom+0.049
Neg logit contributions
 foreskin-0.049
 salute-0.044
ratulations-0.043
 conver-0.042
cens-0.041
Tokenm
Token ID76
Feature activation+2.75e-01

Pos logit contributions
 foreskin+0.636
 salute+0.599
 conver+0.584
ratulations+0.577
cens+0.574
Neg logit contributions
antine-0.231
ables-0.225
bledon-0.222
water-0.202
mom-0.201
Token almost
Token ID2048
Feature activation+5.12e-02

Pos logit contributions
antine+0.586
ables+0.583
bledon+0.578
mom+0.558
water+0.556
Neg logit contributions
 foreskin-0.524
 salute-0.473
ratulations-0.457
 conver-0.454
cens-0.452
Token done
Token ID1760
Feature activation-2.46e-01

Pos logit contributions
antine+0.103
ables+0.103
bledon+0.102
water+0.098
mom+0.098
Neg logit contributions
 foreskin-0.103
 salute-0.094
ratulations-0.090
 conver-0.090
cens-0.090
Token with
Token ID351
Feature activation-1.98e-01

Pos logit contributions
 foreskin+0.526
 salute+0.482
 conver+0.465
ratulations+0.459
cens+0.454
Neg logit contributions
antine-0.474
ables-0.467
bledon-0.464
water-0.442
mom-0.441
Token a
Token ID257
Feature activation-8.39e-02

Pos logit contributions
 foreskin+0.408
 salute+0.373
 conver+0.359
ratulations+0.356
cens+0.351
Neg logit contributions
antine-0.394
ables-0.390
bledon-0.387
water-0.370
mom-0.369
Token marketing
Token ID7124
Feature activation+9.63e-01

Pos logit contributions
 foreskin+0.168
 salute+0.153
 conver+0.148
ratulations+0.146
cens+0.144
Neg logit contributions
antine-0.172
ables-0.170
bledon-0.170
mom-0.162
water-0.162
Token degree
Token ID4922
Feature activation-3.60e-02

Pos logit contributions
antine+1.759
ables+1.757
bledon+1.725
mom+1.660
water+1.657
Neg logit contributions
 foreskin-2.114
 salute-1.942
 conver-1.878
ratulations-1.865
cens-1.852
Token,
Token ID11
Feature activation-1.04e-01

Pos logit contributions
 foreskin+0.079
 salute+0.072
 conver+0.069
ratulations+0.069
cens+0.068
Neg logit contributions
antine-0.068
ables-0.067
bledon-0.066
water-0.063
mom-0.063
Token I
Token ID314
Feature activation-6.41e-02

Pos logit contributions
 foreskin+0.210
 salute+0.191
 conver+0.184
ratulations+0.182
cens+0.180
Neg logit contributions
antine-0.211
ables-0.209
bledon-0.207
mom-0.199
water-0.199
Token have
Token ID423
Feature activation-1.15e-02

Pos logit contributions
 foreskin+0.126
 salute+0.114
 conver+0.110
ratulations+0.109
cens+0.107
Neg logit contributions
antine-0.134
ables-0.133
bledon-0.132
water-0.127
mom-0.126
Token finally
Token ID3443
Feature activation-3.18e-01

Pos logit contributions
 foreskin+0.023
 salute+0.021
 conver+0.021
ratulations+0.020
cens+0.020
Neg logit contributions
antine-0.023
ables-0.023
bledon-0.023
mom-0.022
water-0.022
Token of
Token ID286
Feature activation+2.07e-02

Pos logit contributions
ables+0.307
antine+0.306
bledon+0.300
mom+0.288
water+0.288
Neg logit contributions
 foreskin-0.302
 salute-0.272
ratulations-0.263
cens-0.260
 conver-0.259
Token great
Token ID1049
Feature activation+4.64e-01

Pos logit contributions
antine+0.045
ables+0.045
bledon+0.044
mom+0.042
water+0.042
Neg logit contributions
 foreskin-0.039
 salute-0.035
ratulations-0.034
cens-0.033
 conver-0.033
Token mutual
Token ID13584
Feature activation-5.08e-01

Pos logit contributions
antine+0.840
ables+0.837
bledon+0.818
mom+0.791
water+0.790
Neg logit contributions
 foreskin-1.038
 salute-0.941
ratulations-0.917
 conver-0.908
cens-0.907
Token benefit
Token ID4414
Feature activation+1.97e-02

Pos logit contributions
 foreskin+1.106
 salute+1.012
 conver+0.979
ratulations+0.967
cens+0.959
Neg logit contributions
antine-0.957
ables-0.946
bledon-0.934
water-0.894
mom-0.892
Token to
Token ID284
Feature activation+8.36e-02

Pos logit contributions
antine+0.035
ables+0.035
bledon+0.034
water+0.033
mom+0.032
Neg logit contributions
 foreskin-0.045
 salute-0.041
ratulations-0.040
 conver-0.040
cens-0.039
Token fans
Token ID3296
Feature activation+4.62e-01

Pos logit contributions
antine+0.160
ables+0.158
bledon+0.156
water+0.149
mom+0.148
Neg logit contributions
 foreskin-0.181
 salute-0.165
ratulations-0.158
cens-0.158
 conver-0.158
Token and
Token ID290
Feature activation-2.39e-01

Pos logit contributions
antine+0.864
ables+0.857
bledon+0.847
water+0.811
mom+0.808
Neg logit contributions
 foreskin-1.016
 salute-0.925
ratulations-0.890
cens-0.876
 conver-0.875
Token club
Token ID3430
Feature activation+3.84e-01

Pos logit contributions
 foreskin+0.437
 salute+0.394
 conver+0.378
ratulations+0.372
cens+0.367
Neg logit contributions
antine-0.537
ables-0.530
bledon-0.526
water-0.505
mom-0.504
Token alike
Token ID12936
Feature activation-3.35e-01

Pos logit contributions
antine+0.840
ables+0.830
bledon+0.820
water+0.793
mom+0.789
Neg logit contributions
 foreskin-0.726
 salute-0.652
ratulations-0.618
 conver-0.615
cens-0.614
Token.
Token ID13
Feature activation-1.48e-01

Pos logit contributions
 foreskin+0.759
 salute+0.698
 conver+0.671
ratulations+0.668
cens+0.660
Neg logit contributions
antine-0.605
ables-0.598
bledon-0.591
water-0.563
mom-0.561
Token
Token ID198
Feature activation-2.45e-01

Pos logit contributions
 foreskin+0.301
 salute+0.273
 conver+0.263
ratulations+0.261
cens+0.256
Neg logit contributions
antine-0.300
ables-0.296
bledon-0.293
water-0.282
mom-0.282
Token between
Token ID1022
Feature activation-6.03e-02

Pos logit contributions
 foreskin+0.352
 salute+0.321
 conver+0.307
ratulations+0.307
cens+0.301
Neg logit contributions
antine-0.358
ables-0.354
bledon-0.350
water-0.336
mom-0.335
Token sovereign
Token ID18901
Feature activation-5.22e-02

Pos logit contributions
 foreskin+0.114
 salute+0.104
 conver+0.100
ratulations+0.099
cens+0.097
Neg logit contributions
antine-0.131
ables-0.129
bledon-0.128
water-0.123
mom-0.123
Token Israel
Token ID2692
Feature activation+1.24e-01

Pos logit contributions
 foreskin+0.094
 salute+0.085
 conver+0.082
ratulations+0.080
cens+0.079
Neg logit contributions
antine-0.118
ables-0.117
bledon-0.116
water-0.112
mom-0.111
Token and
Token ID290
Feature activation+1.53e-01

Pos logit contributions
antine+0.253
ables+0.250
bledon+0.247
water+0.237
mom+0.236
Neg logit contributions
 foreskin-0.251
 salute-0.229
 conver-0.220
ratulations-0.218
cens-0.214
Token the
Token ID262
Feature activation+2.83e-01

Pos logit contributions
antine+0.319
ables+0.315
bledon+0.311
water+0.299
mom+0.298
Neg logit contributions
 foreskin-0.304
 salute-0.277
 conver-0.266
ratulations-0.264
cens-0.259
Token settlements
Token ID18573
Feature activation+2.88e-01

Pos logit contributions
antine+0.596
ables+0.591
bledon+0.579
water+0.562
mom+0.557
Neg logit contributions
 foreskin-0.549
 salute-0.502
 conver-0.482
ratulations-0.480
cens-0.472
Token in
Token ID287
Feature activation+5.53e-02

Pos logit contributions
antine+0.611
ables+0.607
bledon+0.598
water+0.579
mom+0.573
Neg logit contributions
 foreskin-0.558
 salute-0.506
ratulations-0.484
 conver-0.482
cens-0.474
Token the
Token ID262
Feature activation+1.05e-01

Pos logit contributions
antine+0.103
ables+0.102
bledon+0.100
water+0.096
mom+0.095
Neg logit contributions
 foreskin-0.122
 salute-0.112
 conver-0.108
ratulations-0.107
cens-0.106
Token territories
Token ID16771
Feature activation-5.91e-01

Pos logit contributions
antine+0.190
ables+0.187
bledon+0.183
water+0.177
mom+0.175
Neg logit contributions
 foreskin-0.235
 salute-0.218
 conver-0.210
ratulations-0.209
cens-0.207
Token ended
Token ID4444
Feature activation-3.37e-01

Pos logit contributions
 foreskin+1.099
 salute+0.995
 conver+0.955
ratulations+0.937
cens+0.926
Neg logit contributions
antine-1.305
ables-1.288
bledon-1.280
water-1.226
mom-1.226
Token up
Token ID510
Feature activation+1.85e-02

Pos logit contributions
 foreskin+0.646
 salute+0.587
 conver+0.564
ratulations+0.554
cens+0.547
Neg logit contributions
antine-0.724
ables-0.714
bledon-0.709
water-0.679
mom-0.679
Token "
Token ID366
Feature activation-9.17e-02

Pos logit contributions
 foreskin+0.076
 salute+0.069
 conver+0.067
ratulations+0.065
cens+0.064
Neg logit contributions
antine-0.083
ables-0.082
bledon-0.081
mom-0.079
water-0.079
TokenWhen
Token ID2215
Feature activation-2.66e-01

Pos logit contributions
 foreskin+0.217
 salute+0.200
 conver+0.194
ratulations+0.192
 digit+0.189
Neg logit contributions
antine-0.157
ables-0.155
bledon-0.153
mom-0.147
water-0.147
Token I
Token ID314
Feature activation-2.47e-02

Pos logit contributions
 foreskin+0.556
 salute+0.509
 conver+0.491
ratulations+0.484
cens+0.478
Neg logit contributions
antine-0.525
ables-0.517
bledon-0.513
water-0.490
mom-0.489
Token grew
Token ID6348
Feature activation+3.13e-01

Pos logit contributions
 foreskin+0.050
 salute+0.044
ratulations+0.043
 conver+0.043
cens+0.043
Neg logit contributions
antine-0.050
ables-0.050
bledon-0.049
mom-0.048
water-0.048
Token up
Token ID510
Feature activation-1.60e-01

Pos logit contributions
ables+0.738
antine+0.734
bledon+0.729
mom+0.705
water+0.701
Neg logit contributions
 foreskin-0.525
 salute-0.474
 conver-0.449
cens-0.440
ratulations-0.440
Token hockey
Token ID12217
Feature activation+2.06e-01

Pos logit contributions
 foreskin+0.357
 salute+0.329
 conver+0.318
ratulations+0.314
cens+0.311
Neg logit contributions
antine-0.293
ables-0.289
bledon-0.287
water-0.272
mom-0.272
Token was
Token ID373
Feature activation+1.12e-01

Pos logit contributions
antine+0.425
ables+0.423
bledon+0.417
mom+0.403
water+0.401
Neg logit contributions
 foreskin-0.410
 salute-0.374
 conver-0.359
ratulations-0.357
cens-0.352
Token too
Token ID1165
Feature activation+2.38e-02

Pos logit contributions
antine+0.224
ables+0.222
bledon+0.221
mom+0.212
water+0.211
Neg logit contributions
 foreskin-0.226
 salute-0.206
ratulations-0.200
 conver-0.199
cens-0.196
Token expensive
Token ID5789
Feature activation+5.56e-02

Pos logit contributions
antine+0.050
ables+0.050
bledon+0.049
mom+0.047
water+0.047
Neg logit contributions
 foreskin-0.046
 salute-0.042
ratulations-0.041
 conver-0.041
cens-0.040
Token with
Token ID351
Feature activation+1.19e-01

Pos logit contributions
antine+0.102
ables+0.101
bledon+0.100
water+0.095
mom+0.095
Neg logit contributions
 foreskin-0.123
 salute-0.113
 conver-0.110
ratulations-0.109
cens-0.107
Token all
Token ID477
Feature activation+7.11e-04

Pos logit contributions
antine+0.237
ables+0.235
bledon+0.232
mom+0.222
water+0.222
Neg logit contributions
 foreskin-0.244
 salute-0.223
 conver-0.215
ratulations-0.214
cens-0.211
Token […]
Token ID23834
Feature activation-1.59e-01

Pos logit contributions
 foreskin+0.273
 salute+0.248
 conver+0.239
ratulations+0.238
cens+0.233
Neg logit contributions
antine-0.286
ables-0.285
bledon-0.278
mom-0.271
water-0.271
Token\n
Token ID198
Feature activation-7.02e-02

Pos logit contributions
 foreskin+0.310
 salute+0.282
 conver+0.272
ratulations+0.269
cens+0.263
Neg logit contributions
antine-0.337
ables-0.333
bledon-0.328
mom-0.317
water-0.317
Token\n
Token ID198
Feature activation-7.02e-02

Pos logit contributions
 foreskin+0.140
 salute+0.128
 conver+0.123
ratulations+0.122
cens+0.119
Neg logit contributions
antine-0.145
ables-0.144
bledon-0.142
water-0.137
mom-0.137
TokenSince
Token ID6385
Feature activation+6.23e-02

Pos logit contributions
 foreskin+0.161
 salute+0.149
 conver+0.144
ratulations+0.142
 digit+0.140
Neg logit contributions
antine-0.124
ables-0.123
bledon-0.121
mom-0.115
water-0.115
Token I
Token ID314
Feature activation-2.86e-01

Pos logit contributions
antine+0.118
ables+0.116
bledon+0.115
water+0.109
mom+0.109
Neg logit contributions
 foreskin-0.136
 salute-0.125
 conver-0.121
ratulations-0.119
cens-0.117
Token
Token ID447
Feature activation+2.01e-01

Pos logit contributions
 foreskin+0.559
 salute+0.502
 conver+0.487
ratulations+0.486
cens+0.478
Neg logit contributions
antine-0.602
ables-0.596
bledon-0.585
water-0.566
mom-0.565
Token
Token ID247
Feature activation-5.29e-02

Pos logit contributions
antine+0.405
ables+0.400
bledon+0.395
mom+0.380
water+0.379
Neg logit contributions
 foreskin-0.409
 salute-0.375
 conver-0.360
ratulations-0.358
cens-0.350
Tokenm
Token ID76
Feature activation+3.55e-01

Pos logit contributions
 foreskin+0.123
 salute+0.113
 conver+0.110
ratulations+0.109
cens+0.107
Neg logit contributions
antine-0.092
ables-0.091
bledon-0.090
mom-0.085
water-0.085
Token now
Token ID783
Feature activation+1.88e-01

Pos logit contributions
antine+0.751
ables+0.743
bledon+0.735
mom+0.711
water+0.707
Neg logit contributions
 foreskin-0.688
 salute-0.628
 conver-0.600
ratulations-0.598
cens-0.586
Token finally
Token ID3443
Feature activation-1.80e-01

Pos logit contributions
antine+0.389
ables+0.386
bledon+0.381
mom+0.370
water+0.367
Neg logit contributions
 foreskin-0.373
 salute-0.341
 conver-0.326
ratulations-0.325
cens-0.319
Token having
Token ID1719
Feature activation+2.45e-01

Pos logit contributions
 foreskin+0.335
 salute+0.304
ratulations+0.289
 conver+0.289
cens+0.283
Neg logit contributions
antine-0.394
ables-0.391
bledon-0.387
mom-0.373
water-0.372
Tokenheim
Token ID9096
Feature activation+8.05e-02

Pos logit contributions
antine+0.557
ables+0.546
bledon+0.540
water+0.513
mom+0.511
Neg logit contributions
 foreskin-0.819
 salute-0.756
 conver-0.735
ratulations-0.726
cens-0.715
Token Glacier
Token ID49120
Feature activation+1.04e-01

Pos logit contributions
antine+0.158
ables+0.155
bledon+0.154
water+0.148
mom+0.147
Neg logit contributions
 foreskin-0.170
 salute-0.156
 conver-0.150
ratulations-0.149
cens-0.146
Token by
Token ID416
Feature activation+1.20e-02

Pos logit contributions
antine+0.200
ables+0.197
bledon+0.195
water+0.188
mom+0.186
Neg logit contributions
 foreskin-0.221
 salute-0.203
 conver-0.195
ratulations-0.194
cens-0.191
Token 1
Token ID352
Feature activation-1.53e-01

Pos logit contributions
antine+0.025
ables+0.024
bledon+0.024
water+0.023
mom+0.023
Neg logit contributions
 foreskin-0.024
 salute-0.022
 conver-0.021
ratulations-0.021
cens-0.021
Token.
Token ID13
Feature activation+1.55e-02

Pos logit contributions
 foreskin+0.335
 salute+0.308
 conver+0.297
ratulations+0.293
cens+0.290
Neg logit contributions
antine-0.289
ables-0.284
bledon-0.282
water-0.269
mom-0.268
Token5
Token ID20
Feature activation-1.20e-02

Pos logit contributions
antine+0.026
ables+0.026
bledon+0.025
water+0.024
mom+0.024
Neg logit contributions
 foreskin-0.037
 salute-0.034
 conver-0.033
ratulations-0.033
cens-0.032
Token kilometers
Token ID18212
Feature activation+2.82e-01

Pos logit contributions
 foreskin+0.026
 salute+0.024
 conver+0.023
ratulations+0.023
cens+0.022
Neg logit contributions
antine-0.023
ables-0.022
bledon-0.022
water-0.021
mom-0.021
Token.
Token ID13
Feature activation+1.80e-01

Pos logit contributions
antine+0.531
ables+0.526
bledon+0.522
water+0.500
mom+0.496
Neg logit contributions
 foreskin-0.612
 salute-0.561
ratulations-0.539
 conver-0.539
cens-0.528
Token And
Token ID843
Feature activation+2.48e-01

Pos logit contributions
antine+0.368
ables+0.364
bledon+0.359
water+0.350
mom+0.347
Neg logit contributions
 foreskin-0.365
 salute-0.332
ratulations-0.319
 conver-0.319
cens-0.310
Token Hel
Token ID5053
Feature activation+4.13e-01

Pos logit contributions
antine+0.480
ables+0.475
bledon+0.470
water+0.453
mom+0.449
Neg logit contributions
 foreskin-0.525
 salute-0.480
 conver-0.459
ratulations-0.458
cens-0.453
Tokenheim
Token ID9096
Feature activation+1.83e-01

Pos logit contributions
antine+0.645
ables+0.631
bledon+0.626
water+0.592
mom+0.589
Neg logit contributions
 foreskin-1.037
 salute-0.964
 conver-0.934
ratulations-0.924
cens-0.912
Token That
Token ID1320
Feature activation+2.26e-01

Pos logit contributions
antine+0.976
ables+0.971
bledon+0.953
water+0.929
mom+0.926
Neg logit contributions
 foreskin-1.046
 salute-0.953
 conver-0.925
ratulations-0.923
cens-0.896
Token is
Token ID318
Feature activation+4.27e-01

Pos logit contributions
antine+0.470
ables+0.466
bledon+0.461
water+0.446
mom+0.445
Neg logit contributions
 foreskin-0.445
 salute-0.404
 conver-0.390
ratulations-0.387
cens-0.380
Token way
Token ID835
Feature activation+6.57e-01

Pos logit contributions
antine+0.894
ables+0.888
bledon+0.871
water+0.850
mom+0.845
Neg logit contributions
 foreskin-0.827
 salute-0.753
 conver-0.724
ratulations-0.723
cens-0.706
Token too
Token ID1165
Feature activation+5.59e-02

Pos logit contributions
antine+1.555
ables+1.547
bledon+1.513
water+1.501
mom+1.485
Neg logit contributions
 foreskin-1.081
 salute-0.976
ratulations-0.938
 conver-0.918
estinal-0.895
Token far
Token ID1290
Feature activation+3.16e-01

Pos logit contributions
antine+0.118
ables+0.117
bledon+0.116
water+0.112
mom+0.112
Neg logit contributions
 foreskin-0.107
 salute-0.098
ratulations-0.094
 conver-0.093
cens-0.092
Token gone
Token ID3750
Feature activation-2.10e-03

Pos logit contributions
antine+0.647
ables+0.646
bledon+0.633
water+0.615
mom+0.611
Neg logit contributions
 foreskin-0.630
 salute-0.577
ratulations-0.553
 conver-0.549
cens-0.538
Token for
Token ID329
Feature activation+1.33e-01

Pos logit contributions
 foreskin+0.005
 salute+0.004
 conver+0.004
ratulations+0.004
cens+0.004
Neg logit contributions
antine-0.004
ables-0.004
bledon-0.004
water-0.004
mom-0.004
Token me
Token ID502
Feature activation-6.17e-02

Pos logit contributions
ables+0.271
antine+0.271
bledon+0.266
water+0.258
mom+0.257
Neg logit contributions
 foreskin-0.267
 salute-0.243
 conver-0.234
ratulations-0.234
cens-0.231
Token now
Token ID783
Feature activation+3.23e-01

Pos logit contributions
 foreskin+0.137
 salute+0.126
 conver+0.121
ratulations+0.120
cens+0.118
Neg logit contributions
antine-0.114
ables-0.113
bledon-0.112
water-0.106
mom-0.106
Token.
Token ID13
Feature activation+3.73e-02

Pos logit contributions
antine+0.589
ables+0.582
bledon+0.575
water+0.549
mom+0.548
Neg logit contributions
 foreskin-0.726
 salute-0.667
 conver-0.645
ratulations-0.638
cens-0.630
Token I
Token ID314
Feature activation+4.81e-03

Pos logit contributions
antine+0.075
ables+0.074
bledon+0.073
water+0.071
mom+0.071
Neg logit contributions
 foreskin-0.077
 salute-0.070
 conver-0.068
ratulations-0.067
cens-0.066
Token and
Token ID290
Feature activation+1.77e-01

Pos logit contributions
 foreskin+0.243
 salute+0.221
 conver+0.213
ratulations+0.212
cens+0.209
Neg logit contributions
ables-0.231
antine-0.231
bledon-0.226
water-0.219
mom-0.219
Token bad
Token ID2089
Feature activation+1.53e-01

Pos logit contributions
antine+0.349
ables+0.346
bledon+0.339
water+0.326
mom+0.325
Neg logit contributions
 foreskin-0.370
 salute-0.336
 conver-0.325
ratulations-0.323
cens-0.319
Token policies
Token ID4788
Feature activation+6.70e-02

Pos logit contributions
antine+0.282
ables+0.282
bledon+0.276
water+0.267
mom+0.266
Neg logit contributions
 foreskin-0.336
 salute-0.308
ratulations-0.297
 conver-0.296
cens-0.293
Token,
Token ID11
Feature activation-2.75e-01

Pos logit contributions
antine+0.120
ables+0.119
bledon+0.117
water+0.112
mom+0.112
Neg logit contributions
 foreskin-0.152
 salute-0.140
ratulations-0.134
 conver-0.134
cens-0.132
Token or
Token ID393
Feature activation+2.55e-01

Pos logit contributions
 foreskin+0.550
 salute+0.493
ratulations+0.482
 conver+0.476
cens+0.472
Neg logit contributions
antine-0.564
ables-0.561
bledon-0.551
water-0.533
mom-0.532
Token be
Token ID307
Feature activation-4.20e-01

Pos logit contributions
ables+0.526
antine+0.526
bledon+0.514
water+0.496
mom+0.496
Neg logit contributions
 foreskin-0.501
 salute-0.447
ratulations-0.443
cens-0.434
 conver-0.431
Token reasoned
Token ID37857
Feature activation+2.86e-01

Pos logit contributions
 foreskin+0.741
 salute+0.664
 conver+0.633
ratulations+0.630
cens+0.619
Neg logit contributions
antine-0.964
ables-0.961
bledon-0.949
mom-0.913
water-0.913
Token with
Token ID351
Feature activation+6.06e-02

Pos logit contributions
antine+0.511
ables+0.511
bledon+0.497
mom+0.478
water+0.476
Neg logit contributions
 foreskin-0.649
 salute-0.593
ratulations-0.576
 conver-0.570
cens-0.561
Token.
Token ID13
Feature activation-4.37e-02

Pos logit contributions
antine+0.114
ables+0.113
bledon+0.111
mom+0.106
water+0.106
Neg logit contributions
 foreskin-0.132
 salute-0.121
 conver-0.117
ratulations-0.117
cens-0.114
Token\n
Token ID198
Feature activation+1.06e-01

Pos logit contributions
 foreskin+0.091
 salute+0.083
 conver+0.080
ratulations+0.080
cens+0.078
Neg logit contributions
antine-0.087
ables-0.086
bledon-0.084
mom-0.082
water-0.082
Token\n
Token ID198
Feature activation+1.20e-01

Pos logit contributions
antine+0.219
ables+0.216
bledon+0.213
water+0.206
mom+0.206
Neg logit contributions
 foreskin-0.214
 salute-0.195
 conver-0.188
ratulations-0.186
cens-0.182
Token will
Token ID481
Feature activation+8.32e-02

Pos logit contributions
 foreskin+0.206
 salute+0.185
 conver+0.178
ratulations+0.177
cens+0.175
Neg logit contributions
antine-0.241
ables-0.239
bledon-0.237
water-0.228
mom-0.227
Token want
Token ID765
Feature activation+2.93e-01

Pos logit contributions
antine+0.175
ables+0.173
bledon+0.172
water+0.165
mom+0.164
Neg logit contributions
 foreskin-0.163
 salute-0.147
 conver-0.142
ratulations-0.141
cens-0.139
Token,
Token ID11
Feature activation-2.38e-01

Pos logit contributions
antine+0.534
ables+0.528
bledon+0.522
water+0.498
mom+0.495
Neg logit contributions
 foreskin-0.655
 salute-0.602
 conver-0.581
ratulations-0.577
cens-0.570
Token so
Token ID523
Feature activation+6.53e-02

Pos logit contributions
 foreskin+0.483
 salute+0.439
 conver+0.424
ratulations+0.419
cens+0.414
Neg logit contributions
antine-0.484
ables-0.478
bledon-0.474
water-0.454
mom-0.453
Token factories
Token ID17590
Feature activation+2.37e-01

Pos logit contributions
antine+0.124
ables+0.123
bledon+0.122
water+0.116
mom+0.116
Neg logit contributions
 foreskin-0.141
 salute-0.129
 conver-0.125
ratulations-0.123
cens-0.122
Token will
Token ID481
Feature activation-2.22e-01

Pos logit contributions
antine+0.538
ables+0.533
bledon+0.527
water+0.510
mom+0.506
Neg logit contributions
 foreskin-0.425
 salute-0.379
ratulations-0.366
 conver-0.362
cens-0.357
Token close
Token ID1969
Feature activation+3.37e-01

Pos logit contributions
 foreskin+0.432
 salute+0.394
 conver+0.379
ratulations+0.370
 digit+0.366
Neg logit contributions
antine-0.472
ables-0.465
bledon-0.462
mom-0.442
water-0.441
Token down
Token ID866
Feature activation+5.04e-01

Pos logit contributions
antine+0.679
ables+0.676
bledon+0.666
water+0.641
mom+0.638
Neg logit contributions
 foreskin-0.686
 salute-0.624
ratulations-0.601
 conver-0.598
cens-0.585
Token and
Token ID290
Feature activation-3.74e-01

Pos logit contributions
antine+0.954
ables+0.948
bledon+0.934
water+0.899
mom+0.895
Neg logit contributions
 foreskin-1.085
 salute-0.991
ratulations-0.958
 conver-0.951
cens-0.934
Token choice
Token ID3572
Feature activation+6.87e-01

Pos logit contributions
 foreskin+0.779
 salute+0.715
 conver+0.690
ratulations+0.674
 digit+0.669
Neg logit contributions
antine-0.743
ables-0.731
bledon-0.727
mom-0.692
water-0.692
Token of
Token ID286
Feature activation+2.02e-01

Pos logit contributions
ables+1.501
antine+1.492
bledon+1.476
water+1.415
mom+1.414
Neg logit contributions
 foreskin-1.278
 salute-1.150
ratulations-1.109
 conver-1.078
cens-1.071
Token,
Token ID11
Feature activation+2.07e-01

Pos logit contributions
antine+0.332
ables+0.327
bledon+0.323
water+0.307
mom+0.307
Neg logit contributions
 foreskin-0.444
 salute-0.410
 conver-0.397
ratulations-0.391
cens-0.388
Token her
Token ID607
Feature activation+6.30e-01

Pos logit contributions
antine+0.378
ables+0.373
bledon+0.367
water+0.352
mom+0.351
Neg logit contributions
 foreskin-0.466
 salute-0.426
 conver-0.411
ratulations-0.407
cens-0.404
Token leadership
Token ID5531
Feature activation+9.26e-02

Pos logit contributions
antine+1.368
ables+1.353
bledon+1.339
water+1.291
mom+1.290
Neg logit contributions
 foreskin-1.185
 salute-1.062
ratulations-1.024
 conver-1.015
cens-1.010
Token of
Token ID286
Feature activation-6.50e-02

Pos logit contributions
antine+0.194
ables+0.192
bledon+0.190
mom+0.183
water+0.183
Neg logit contributions
 foreskin-0.182
 salute-0.164
 conver-0.158
ratulations-0.158
cens-0.156
Token that
Token ID326
Feature activation+1.84e-01

Pos logit contributions
 foreskin+0.147
 salute+0.134
 conver+0.130
ratulations+0.129
cens+0.127
Neg logit contributions
antine-0.119
ables-0.117
bledon-0.115
water-0.111
mom-0.110
Token powerful
Token ID3665
Feature activation-2.93e-03

Pos logit contributions
antine+0.393
ables+0.388
bledon+0.384
water+0.370
mom+0.368
Neg logit contributions
 foreskin-0.357
 salute-0.323
 conver-0.310
ratulations-0.309
cens-0.302
Token panel
Token ID6103
Feature activation+5.51e-01

Pos logit contributions
 foreskin+0.006
 salute+0.005
 conver+0.005
ratulations+0.005
cens+0.005
Neg logit contributions
antine-0.006
ables-0.006
bledon-0.006
water-0.006
mom-0.006
Token --
Token ID1377
Feature activation+3.05e-01

Pos logit contributions
antine+1.151
ables+1.135
bledon+1.123
water+1.079
mom+1.077
Neg logit contributions
 foreskin-1.093
 salute-0.988
ratulations-0.951
 conver-0.942
cens-0.927
Token and
Token ID290
Feature activation+1.97e-01

Pos logit contributions
antine+0.620
ables+0.611
bledon+0.603
water+0.582
mom+0.581
Neg logit contributions
 foreskin-0.624
 salute-0.567
 conver-0.545
ratulations-0.541
cens-0.530
Token how
Token ID703
Feature activation-3.29e-02

Pos logit contributions
antine+0.358
ables+0.353
bledon+0.350
water+0.332
mom+0.332
Neg logit contributions
 foreskin-0.442
 salute-0.406
 conver-0.393
ratulations-0.388
cens-0.384
Token it
Token ID340
Feature activation+1.10e-02

Pos logit contributions
 foreskin+0.076
 salute+0.070
 conver+0.068
ratulations+0.067
cens+0.066
Neg logit contributions
antine-0.058
ables-0.057
bledon-0.056
water-0.054
mom-0.054
Tokenth
Token ID400
Feature activation-2.62e-01

Pos logit contributions
antine+1.378
ables+1.371
bledon+1.342
water+1.313
mom+1.290
Neg logit contributions
 foreskin-1.092
 salute-0.983
ratulations-0.939
 conver-0.928
estinal-0.912
Token among
Token ID1871
Feature activation-1.30e-01

Pos logit contributions
 foreskin+0.549
 salute+0.502
 conver+0.485
ratulations+0.477
cens+0.472
Neg logit contributions
antine-0.519
ables-0.511
bledon-0.507
water-0.485
mom-0.482
Token all
Token ID477
Feature activation-1.44e-01

Pos logit contributions
 foreskin+0.135
 salute+0.112
 conver+0.103
ratulations+0.100
 digit+0.097
Neg logit contributions
ables-0.392
antine-0.391
bledon-0.386
water-0.376
mom-0.374
Token edge
Token ID5743
Feature activation-1.32e+00

Pos logit contributions
 foreskin+0.165
 salute+0.139
 conver+0.128
ratulations+0.126
cens+0.122
Neg logit contributions
antine-0.421
ables-0.421
bledon-0.415
water-0.405
mom-0.402
Token r
Token ID374
Feature activation-2.42e-01

Pos logit contributions
 foreskin+2.532
 salute+2.311
 conver+2.230
ratulations+2.186
cens+2.161
Neg logit contributions
antine-2.810
bledon-2.755
ables-2.741
mom-2.633
water-2.614
Tokenus
Token ID385
Feature activation-1.52e+00

Pos logit contributions
 foreskin+0.567
 salute+0.525
 conver+0.508
ratulations+0.501
cens+0.497
Neg logit contributions
antine-0.417
ables-0.409
bledon-0.406
water-0.384
mom-0.384
Tokenhers
Token ID7084
Feature activation-5.52e-01

Pos logit contributions
 foreskin+1.775
 salute+1.541
 conver+1.435
ratulations+1.391
cens+1.361
Neg logit contributions
antine-4.369
bledon-4.301
ables-4.278
mom-4.163
water-4.141
Token*)
Token ID28104
Feature activation+2.89e-01

Pos logit contributions
 foreskin+1.028
 salute+0.929
 conver+0.891
ratulations+0.876
cens+0.866
Neg logit contributions
antine-1.218
ables-1.205
bledon-1.194
water-1.147
mom-1.144
Token\n
Token ID198
Feature activation-5.42e-02

Pos logit contributions
antine+0.587
ables+0.580
bledon+0.573
water+0.551
mom+0.550
Neg logit contributions
 foreskin-0.586
 salute-0.535
 conver-0.517
ratulations-0.509
cens-0.500
Token\n
Token ID198
Feature activation-1.89e-02

Pos logit contributions
 foreskin+0.108
 salute+0.098
 conver+0.094
ratulations+0.093
cens+0.092
Neg logit contributions
antine-0.113
ables-0.111
bledon-0.110
water-0.106
mom-0.106
Token2017
Token ID5539
Feature activation+2.85e-02

Pos logit contributions
 foreskin+0.048
 salute+0.045
 conver+0.043
ratulations+0.043
cens+0.042
Neg logit contributions
antine-0.029
ables-0.028
bledon-0.028
water-0.026
mom-0.026
Token m
Token ID285
Feature activation-1.37e-01

Pos logit contributions
 foreskin+1.398
 salute+1.294
 conver+1.243
ratulations+1.218
 digit+1.214
Neg logit contributions
antine-1.182
ables-1.163
bledon-1.161
water-1.095
mom-1.089
Tokenak
Token ID461
Feature activation+1.04e-01

Pos logit contributions
 foreskin+0.319
 salute+0.296
 conver+0.286
ratulations+0.282
cens+0.280
Neg logit contributions
antine-0.235
ables-0.232
bledon-0.230
water-0.217
mom-0.217
Tokenua
Token ID6413
Feature activation-2.26e-01

Pos logit contributions
antine+0.183
ables+0.180
bledon+0.178
water+0.169
mom+0.169
Neg logit contributions
 foreskin-0.242
 salute-0.223
 conver-0.216
ratulations-0.213
cens-0.211
Token k
Token ID479
Feature activation-8.90e-02

Pos logit contributions
 foreskin+0.486
 salute+0.448
 conver+0.431
ratulations+0.423
cens+0.420
Neg logit contributions
antine-0.433
ables-0.426
bledon-0.424
water-0.402
mom-0.401
Tokenā
Token ID10235
Feature activation+2.60e-01

Pos logit contributions
 foreskin+0.204
 salute+0.188
 conver+0.182
ratulations+0.180
cens+0.178
Neg logit contributions
antine-0.158
ables-0.155
bledon-0.154
water-0.146
mom-0.146
Tokenne
Token ID710
Feature activation-1.12e+00

Pos logit contributions
antine+0.463
ables+0.455
bledon+0.451
water+0.429
mom+0.429
Neg logit contributions
 foreskin-0.593
 salute-0.546
 conver-0.530
ratulations-0.523
cens-0.517
Token no
Token ID645
Feature activation-5.87e-01

Pos logit contributions
 foreskin+2.530
 salute+2.358
 conver+2.261
cens+2.218
 digit+2.215
Neg logit contributions
antine-1.998
bledon-1.969
ables-1.960
mom-1.826
water-1.826
Token K
Token ID509
Feature activation+1.37e-01

Pos logit contributions
 foreskin+1.281
 salute+1.181
 conver+1.138
ratulations+1.116
 digit+1.108
Neg logit contributions
antine-1.109
ables-1.087
bledon-1.085
water-1.025
mom-1.023
Tokenona
Token ID4450
Feature activation-3.33e-01

Pos logit contributions
antine+0.246
ables+0.242
bledon+0.239
water+0.228
mom+0.227
Neg logit contributions
 foreskin-0.312
 salute-0.287
 conver-0.278
ratulations-0.274
cens-0.271
Token,
Token ID11
Feature activation-1.52e-01

Pos logit contributions
 foreskin+0.723
 salute+0.665
 conver+0.641
ratulations+0.629
cens+0.626
Neg logit contributions
antine-0.632
ables-0.621
bledon-0.621
water-0.586
mom-0.586
Token Hawai
Token ID10847
Feature activation+7.52e-01

Pos logit contributions
 foreskin+0.331
 salute+0.304
 conver+0.294
ratulations+0.290
cens+0.287
Neg logit contributions
antine-0.289
ables-0.284
bledon-0.282
water-0.268
mom-0.268
Token of
Token ID286
Feature activation-3.41e-01

Pos logit contributions
 foreskin+1.448
 salute+1.323
 conver+1.272
ratulations+1.247
cens+1.234
Neg logit contributions
antine-1.530
ables-1.504
bledon-1.501
mom-1.429
water-1.427
Token girls
Token ID4813
Feature activation+7.16e-02

Pos logit contributions
 foreskin+0.636
 salute+0.576
 conver+0.551
ratulations+0.545
cens+0.541
Neg logit contributions
antine-0.746
ables-0.739
bledon-0.731
water-0.702
mom-0.701
Token.
Token ID13
Feature activation-2.08e-01

Pos logit contributions
antine+0.124
ables+0.123
bledon+0.121
water+0.115
mom+0.115
Neg logit contributions
 foreskin-0.166
 salute-0.154
 conver-0.148
ratulations-0.148
cens-0.146
Token\n
Token ID198
Feature activation-1.32e-01

Pos logit contributions
 foreskin+0.414
 salute+0.377
 conver+0.363
ratulations+0.358
cens+0.353
Neg logit contributions
antine-0.430
ables-0.424
bledon-0.421
water-0.403
mom-0.403
Token\n
Token ID198
Feature activation-1.09e-01

Pos logit contributions
 foreskin+0.260
 salute+0.236
 conver+0.227
ratulations+0.224
cens+0.221
Neg logit contributions
antine-0.277
ables-0.274
bledon-0.271
water-0.260
mom-0.260
TokenGenerally
Token ID37058
Feature activation-1.14e+00

Pos logit contributions
 foreskin+0.251
 salute+0.231
 conver+0.224
ratulations+0.221
cens+0.218
Neg logit contributions
antine-0.193
ables-0.190
bledon-0.188
water-0.179
mom-0.179
Token,
Token ID11
Feature activation+1.24e-02

Pos logit contributions
 foreskin+2.478
 salute+2.281
 conver+2.219
ratulations+2.165
 digit+2.149
Neg logit contributions
antine-2.139
bledon-2.109
ables-2.098
mom-1.989
water-1.983
Token the
Token ID262
Feature activation-3.94e-01

Pos logit contributions
antine+0.025
ables+0.025
bledon+0.025
water+0.024
mom+0.024
Neg logit contributions
 foreskin-0.025
 salute-0.023
 conver-0.022
ratulations-0.022
cens-0.021
Token Mong
Token ID18428
Feature activation-1.48e-01

Pos logit contributions
 foreskin+0.743
 salute+0.672
 conver+0.646
ratulations+0.640
cens+0.634
Neg logit contributions
antine-0.857
ables-0.846
bledon-0.834
water-0.804
mom-0.803
Tokenols
Token ID10220
Feature activation-1.35e-01

Pos logit contributions
 foreskin+0.360
 salute+0.333
 conver+0.323
ratulations+0.318
cens+0.316
Neg logit contributions
antine-0.241
ables-0.238
bledon-0.234
water-0.222
mom-0.221
Token (
Token ID357
Feature activation+1.26e-01

Pos logit contributions
 foreskin+0.266
 salute+0.241
 conver+0.232
ratulations+0.230
cens+0.226
Neg logit contributions
antine-0.282
ables-0.278
bledon-0.276
water-0.265
mom-0.264
Tokenasted
Token ID8992
Feature activation-1.10e-01

Pos logit contributions
 foreskin+0.847
 salute+0.775
 conver+0.749
ratulations+0.743
cens+0.732
Neg logit contributions
antine-0.726
ables-0.716
bledon-0.710
water-0.676
mom-0.674
Token its
Token ID663
Feature activation-7.94e-01

Pos logit contributions
 foreskin+0.239
 salute+0.218
 conver+0.211
ratulations+0.209
cens+0.207
Neg logit contributions
antine-0.209
ables-0.206
bledon-0.204
water-0.195
mom-0.195
Token world
Token ID995
Feature activation+1.67e-01

Pos logit contributions
 foreskin+1.548
 salute+1.408
 conver+1.354
ratulations+1.327
cens+1.312
Neg logit contributions
antine-1.680
ables-1.657
bledon-1.645
water-1.572
mom-1.572
Token champion
Token ID8783
Feature activation-1.09e-01

Pos logit contributions
antine+0.325
ables+0.321
bledon+0.318
water+0.304
mom+0.303
Neg logit contributions
 foreskin-0.354
 salute-0.324
 conver-0.313
ratulations-0.310
cens-0.305
Token Cleveland
Token ID10306
Feature activation+1.85e-01

Pos logit contributions
 foreskin+0.232
 salute+0.212
 conver+0.206
ratulations+0.204
cens+0.202
Neg logit contributions
antine-0.210
ables-0.208
bledon-0.205
water-0.196
mom-0.196
Token Rams
Token ID16077
Feature activation-1.01e+00

Pos logit contributions
antine+0.350
ables+0.346
bledon+0.342
water+0.328
mom+0.326
Neg logit contributions
 foreskin-0.402
 salute-0.367
 conver-0.356
ratulations-0.354
cens-0.350
Token in
Token ID287
Feature activation-5.69e-02

Pos logit contributions
 foreskin+2.167
 salute+1.994
 conver+1.924
ratulations+1.884
cens+1.868
Neg logit contributions
antine-1.947
ables-1.917
bledon-1.903
mom-1.808
water-1.807
Token a
Token ID257
Feature activation-8.34e-02

Pos logit contributions
 foreskin+0.122
 salute+0.111
 conver+0.108
ratulations+0.107
cens+0.106
Neg logit contributions
antine-0.109
ables-0.108
bledon-0.107
water-0.102
mom-0.102
Token celebration
Token ID14395
Feature activation-2.00e-01

Pos logit contributions
 foreskin+0.164
 salute+0.148
 conver+0.143
ratulations+0.142
cens+0.141
Neg logit contributions
antine-0.175
ables-0.173
bledon-0.171
mom-0.164
water-0.164
Token that
Token ID326
Feature activation-3.25e-01

Pos logit contributions
 foreskin+0.405
 salute+0.365
ratulations+0.357
 conver+0.353
cens+0.350
Neg logit contributions
antine-0.407
ables-0.403
bledon-0.397
water-0.383
mom-0.380
Token doubled
Token ID15229
Feature activation+1.79e-01

Pos logit contributions
 foreskin+0.612
 salute+0.547
ratulations+0.531
 conver+0.521
cens+0.520
Neg logit contributions
antine-0.706
ables-0.701
bledon-0.691
water-0.667
mom-0.665
TokenCalifornia
Token ID25284
Feature activation-1.16e-01

Pos logit contributions
 foreskin+0.362
 salute+0.333
 conver+0.322
ratulations+0.317
cens+0.313
Neg logit contributions
antine-0.299
ables-0.295
bledon-0.291
water-0.278
mom-0.278
Token v
Token ID410
Feature activation-1.11e-01

Pos logit contributions
 foreskin+0.266
 salute+0.245
 conver+0.237
ratulations+0.235
cens+0.232
Neg logit contributions
antine-0.206
ables-0.202
bledon-0.200
water-0.190
mom-0.190
Token.
Token ID13
Feature activation-4.64e-01

Pos logit contributions
 foreskin+0.238
 salute+0.219
 conver+0.211
ratulations+0.208
cens+0.206
Neg logit contributions
antine-0.214
ables-0.211
bledon-0.209
water-0.199
mom-0.199
Token Southern
Token ID8050
Feature activation-1.38e-01

Pos logit contributions
 foreskin+0.947
 salute+0.864
 conver+0.833
ratulations+0.823
cens+0.811
Neg logit contributions
antine-0.942
ables-0.929
bledon-0.920
water-0.883
mom-0.879
Token Pacific
Token ID8211
Feature activation-8.73e-01

Pos logit contributions
 foreskin+0.264
 salute+0.239
 conver+0.230
ratulations+0.226
cens+0.224
Neg logit contributions
antine-0.296
ables-0.292
bledon-0.290
water-0.278
mom-0.278
Token Railroad
Token ID36289
Feature activation-1.24e+00

Pos logit contributions
 foreskin+1.577
 salute+1.431
 conver+1.371
ratulations+1.340
cens+1.327
Neg logit contributions
antine-1.957
ables-1.934
bledon-1.928
mom-1.846
water-1.836
Token Company
Token ID5834
Feature activation-5.21e-01

Pos logit contributions
 foreskin+2.527
 salute+2.327
 conver+2.242
ratulations+2.192
cens+2.174
Neg logit contributions
antine-2.484
ables-2.445
bledon-2.444
mom-2.321
water-2.306
Token\n
Token ID198
Feature activation-2.22e-01

Pos logit contributions
 foreskin+1.067
 salute+0.975
 conver+0.940
ratulations+0.924
cens+0.914
Neg logit contributions
antine-1.051
ables-1.036
bledon-1.028
water-0.982
mom-0.981
Token\n
Token ID198
Feature activation-1.57e-01

Pos logit contributions
 foreskin+0.435
 salute+0.395
 conver+0.381
ratulations+0.374
cens+0.370
Neg logit contributions
antine-0.468
ables-0.462
bledon-0.459
water-0.439
mom-0.438
TokenF
Token ID37
Feature activation+2.99e-01

Pos logit contributions
 foreskin+0.350
 salute+0.322
 conver+0.311
ratulations+0.307
 digit+0.302
Neg logit contributions
antine-0.292
ables-0.287
bledon-0.284
water-0.272
mom-0.271
Tokenacts
Token ID8656
Feature activation-9.91e-02

Pos logit contributions
antine+0.569
ables+0.562
bledon+0.548
water+0.525
mom+0.522
Neg logit contributions
 foreskin-0.658
 salute-0.595
 conver-0.577
ratulations-0.569
 digit-0.562