TOP ACTIVATIONS
MAX = 2.642

Ferguson versus Kh ab ib Nur mag omed ov which
4 so that his R anged Combat Expert can swing for
consultation for independence in the foreseeable future . But a fresh
a Trump shirt after he punched another rally go er ,
youth court judge for the inmate ’ s release and
got way brighter !!! Cong rats 2 @ m f lynn
float technique , you are punching down into the chest and
heads Jason West and Vince Zam pe lla , outlines the
iser ies for DC 's Vert igo imprint and a handful
and 4 damage with R anged Combat Expert . Tra ited
s preference for a Nate Diaz trilogy fight was met by
dump ers driver V asant Tav are and Ash ok Yad
. O . J . Simpson : American Crime Story on
( N UST ), N UFC Fans United and Newcastle United
ence Jr s punches had blown Brook
Tele path H ython ia the Cruel vs Ign
retaliate , he threw no punches , and he did not
against Mark Hunt back at UFC 200 . US
to leak . A president ign iting speculation about secret Oval
Hom M TR Station ( Exit A or D ). <|endoftext|>

MIN ACTIVATIONS
MIN = -2.464

to accident status ). George reminded me that if I rolled
download . Includes the Catholic Encyclopedia , Church Fathers , Sum
based on how one labels oneself .
like a good ship to remind us of the power of
um . It is also cance lling similar events planned for
about data structures / al gorith ms and more about web
designers and even the home gard ener . In doing so
Caps coach Barry Tro tz reminded DC fans that there was
a fundraising email this week reminding voters that the last time
, on every occasion , remind Syria that Hat ay in
over the team radio in FP 1 at Suz uka to
s Lara Sel ig man reminded folks Tuesday , is a
years away from a driving licence . However , they can
of a new mood of cens orious political correctness sweeping university
. The phrase reminds me of a patient whose
( 0 s )} c xd 4 { ( 341 s
Thanks , Jo , for reminding us not to take the
regulatory bodies creating the black lists . EFF director
the sea to be continually reminded of this connection . World
hide , right ? Then remind them that strong encryption is

INTERVAL 1.365 - 2.642
CONTAINS 0.260%

compromise sentence that still involves prison but is below the Guidelines
. <|endoftext|> B rock Les nar has been flagged by the
initial izes and replaces the inn ards of the bound tag
To West R ukong ai ... to see Shib
that the Mom ent is not being

INTERVAL 0.089 - 1.365
CONTAINS 36.494%

went so far as to suggest that the Mayor was a
is advising Sullivan , says it has had a devastating impact
check - ins on Unt app d . Think about that
advised she will also hear the defense s request
Gl enn Greenwald writes <|endoftext|> The divers found her on a

INTERVAL -1.188 - 0.089
CONTAINS 62.705%

can already be heard somewhat . Do this with the radio
( Na of umi ) But
wrong , the least one can say is that they were
But Stella …. "… Sure
ets . When equipped with a thrust augment er , a

INTERVAL -2.464 - -1.188
CONTAINS 0.541%

the B ip artisan Policy Center . M ood
over . It was a poor choice . I didnt finish
a Russian truck convoy bearing humanitarian aid might now enter into
of educational qualifications for transport licenses are some of the features
ened to me with some older firm w ares .
Token Ferguson
Token ID15031
Feature activation+4.90e-01

Pos logit contributions
uel+0.891
+0.867
+0.803
 WATCHED+0.802
lished+0.802
Neg logit contributions
 spont-1.495
 volunt-1.433
 Pv-1.423
 imperson-1.266
ortment-1.259
Token versus
Token ID9051
Feature activation+3.96e-01

Pos logit contributions
uel+0.788
+0.772
 WATCHED+0.729
lished+0.728
+0.728
Neg logit contributions
 spont-0.851
 volunt-0.809
 Pv-0.804
 imperson-0.693
ortment-0.688
Token Kh
Token ID5311
Feature activation+3.72e-01

Pos logit contributions
uel+0.660
+0.646
 WATCHED+0.612
+0.612
lished+0.611
Neg logit contributions
 spont-0.662
 volunt-0.629
 Pv-0.625
 imperson-0.536
ortment-0.532
Tokenab
Token ID397
Feature activation+5.77e-01

Pos logit contributions
+0.635
uel+0.630
+0.601
lished+0.600
 WATCHED+0.600
Neg logit contributions
 spont-0.605
 volunt-0.574
 Pv-0.566
ortment-0.488
 imperson-0.486
Tokenib
Token ID571
Feature activation+1.33e-01

Pos logit contributions
uel+1.030
+0.967
 WATCHED+0.940
lished+0.928
+0.928
Neg logit contributions
 spont-0.928
 Pv-0.868
 volunt-0.868
 imperson-0.742
ortment-0.733
Token Nur
Token ID49281
Feature activation+2.64e+00

Pos logit contributions
uel+0.150
+0.145
 WATCHED+0.135
lished+0.134
+0.134
Neg logit contributions
 spont-0.294
 volunt-0.281
 Pv-0.280
 imperson-0.251
ortment-0.250
Tokenmag
Token ID19726
Feature activation+1.29e+00

Pos logit contributions
uel+1.988
+1.854
+1.621
uru+1.612
 WATCHED+1.601
Neg logit contributions
 spont-6.835
 volunt-6.599
 Pv-6.480
ortment-5.959
 imperson-5.872
Tokenomed
Token ID12657
Feature activation+1.62e+00

Pos logit contributions
uel+1.537
+1.460
lished+1.360
 WATCHED+1.360
+1.339
Neg logit contributions
 spont-2.782
 volunt-2.665
 Pv-2.664
 imperson-2.362
ortment-2.350
Tokenov
Token ID709
Feature activation+6.74e-01

Pos logit contributions
uel+2.531
+2.377
 WATCHED+2.320
ーン+2.288
osponsors+2.280
Neg logit contributions
 spont-2.882
 volunt-2.741
 Pv-2.714
ortment-2.324
 imperson-2.323
Token
Token ID784
Feature activation+2.97e-01

Pos logit contributions
uel+1.091
+1.062
 WATCHED+1.012
lished+1.002
+1.002
Neg logit contributions
 spont-1.166
 volunt-1.108
 Pv-1.090
ortment-0.944
 imperson-0.941
Token which
Token ID543
Feature activation+5.32e-01

Pos logit contributions
uel+0.501
+0.491
 WATCHED+0.470
+0.464
lished+0.464
Neg logit contributions
 spont-0.489
 volunt-0.467
 Pv-0.460
ortment-0.395
 imperson-0.391
Token 4
Token ID604
Feature activation+6.31e-01

Pos logit contributions
uel+0.338
+0.331
 WATCHED+0.314
+0.311
lished+0.308
Neg logit contributions
 spont-0.358
 volunt-0.342
 Pv-0.334
 imperson-0.291
ortment-0.291
Token so
Token ID523
Feature activation-1.29e-01

Pos logit contributions
uel+0.997
+0.973
 WATCHED+0.918
+0.916
lished+0.910
Neg logit contributions
 spont-1.118
 volunt-1.068
 Pv-1.049
ortment-0.913
 imperson-0.910
Token that
Token ID326
Feature activation+7.08e-02

Pos logit contributions
 spont+0.214
 volunt+0.203
 Pv+0.202
 imperson+0.173
ortment+0.173
Neg logit contributions
uel-0.216
-0.211
 WATCHED-0.200
-0.200
lished-0.200
Token his
Token ID465
Feature activation+3.73e-01

Pos logit contributions
uel+0.110
+0.107
+0.101
 WATCHED+0.101
lished+0.100
Neg logit contributions
 spont-0.126
 volunt-0.121
 Pv-0.119
ortment-0.104
 imperson-0.103
Token R
Token ID371
Feature activation+5.54e-01

Pos logit contributions
uel+0.563
+0.549
 WATCHED+0.520
+0.516
lished+0.509
Neg logit contributions
 spont-0.680
 volunt-0.650
 Pv-0.643
ortment-0.565
 imperson-0.560
Tokenanged
Token ID5102
Feature activation+2.22e+00

Pos logit contributions
uel+0.939
+0.906
 WATCHED+0.862
+0.859
lished+0.855
Neg logit contributions
 spont-0.924
 volunt-0.878
 Pv-0.870
 imperson-0.744
ortment-0.742
Token Combat
Token ID14487
Feature activation+7.97e-01

Pos logit contributions
uel+2.571
+2.406
 WATCHED+2.274
+2.247
lished+2.171
Neg logit contributions
 spont-4.761
 volunt-4.611
 Pv-4.498
ortment-4.110
 imperson-4.058
Token Expert
Token ID25516
Feature activation+1.23e-01

Pos logit contributions
uel+1.434
+1.402
+1.339
 WATCHED+1.338
lished+1.332
Neg logit contributions
 spont-1.222
 volunt-1.162
 Pv-1.145
ortment-0.968
 imperson-0.966
Token can
Token ID460
Feature activation+3.52e-02

Pos logit contributions
uel+0.178
+0.174
 WATCHED+0.164
+0.163
lished+0.162
Neg logit contributions
 spont-0.232
 volunt-0.222
 Pv-0.220
ortment-0.193
 imperson-0.192
Token swing
Token ID9628
Feature activation+1.43e+00

Pos logit contributions
uel+0.056
+0.055
 WATCHED+0.052
+0.052
lished+0.051
Neg logit contributions
 spont-0.062
 volunt-0.059
 Pv-0.058
ortment-0.050
 imperson-0.050
Token for
Token ID329
Feature activation+1.97e-01

Pos logit contributions
uel+2.299
+2.238
 WATCHED+2.125
+2.125
lished+2.103
Neg logit contributions
 spont-2.423
 volunt-2.317
 Pv-2.268
ortment-2.006
 imperson-1.960
Token consultation
Token ID18103
Feature activation+6.99e-01

Pos logit contributions
 spont+0.034
 volunt+0.033
 Pv+0.033
 imperson+0.028
ortment+0.028
Neg logit contributions
uel-0.030
-0.030
lished-0.028
 WATCHED-0.028
-0.028
Token for
Token ID329
Feature activation+2.30e-01

Pos logit contributions
uel+1.116
+1.086
 WATCHED+1.025
+1.025
lished+1.023
Neg logit contributions
 spont-1.216
 volunt-1.153
 Pv-1.148
ortment-0.994
 imperson-0.993
Token independence
Token ID10404
Feature activation+2.21e-01

Pos logit contributions
uel+0.388
+0.381
 WATCHED+0.360
lished+0.360
+0.360
Neg logit contributions
 spont-0.379
 volunt-0.360
 Pv-0.359
 imperson-0.307
ortment-0.305
Token in
Token ID287
Feature activation+3.30e-01

Pos logit contributions
uel+0.367
+0.360
 WATCHED+0.340
+0.339
lished+0.339
Neg logit contributions
 spont-0.372
 volunt-0.353
 Pv-0.351
 imperson-0.302
ortment-0.300
Token the
Token ID262
Feature activation+4.87e-02

Pos logit contributions
uel+0.545
+0.535
lished+0.506
 WATCHED+0.506
+0.505
Neg logit contributions
 spont-0.557
 volunt-0.529
 Pv-0.527
 imperson-0.452
ortment-0.448
Token foreseeable
Token ID34922
Feature activation+2.21e+00

Pos logit contributions
uel+0.072
+0.070
lished+0.066
 WATCHED+0.066
+0.066
Neg logit contributions
 spont-0.091
 volunt-0.087
 Pv-0.087
 imperson-0.076
ortment-0.075
Token future
Token ID2003
Feature activation+7.69e-01

Pos logit contributions
uel+1.661
+1.550
+1.413
 WATCHED+1.406
ーン+1.355
Neg logit contributions
 spont-5.596
 volunt-5.402
 Pv-5.381
ortment-4.932
 imperson-4.913
Token.
Token ID13
Feature activation+2.59e-02

Pos logit contributions
uel+1.265
+1.234
 WATCHED+1.173
+1.169
lished+1.164
Neg logit contributions
 spont-1.301
 volunt-1.233
 Pv-1.221
ortment-1.055
 imperson-1.053
Token But
Token ID887
Feature activation+3.48e-02

Pos logit contributions
uel+0.042
+0.041
+0.039
 WATCHED+0.039
lished+0.039
Neg logit contributions
 spont-0.044
 volunt-0.042
 Pv-0.042
 imperson-0.037
ortment-0.036
Token a
Token ID257
Feature activation+2.03e-01

Pos logit contributions
uel+0.060
+0.059
 WATCHED+0.056
+0.056
lished+0.056
Neg logit contributions
 spont-0.056
 volunt-0.053
 Pv-0.053
 imperson-0.045
ortment-0.045
Token fresh
Token ID4713
Feature activation+3.92e-03

Pos logit contributions
uel+0.319
+0.313
lished+0.295
 WATCHED+0.295
+0.295
Neg logit contributions
 spont-0.359
 volunt-0.342
 Pv-0.340
 imperson-0.295
ortment-0.292
Token a
Token ID257
Feature activation+6.35e-02

Pos logit contributions
 spont+0.182
 volunt+0.172
 Pv+0.171
ortment+0.145
 imperson+0.143
Neg logit contributions
uel-0.205
-0.201
 WATCHED-0.192
lished-0.190
-0.190
Token Trump
Token ID1301
Feature activation+1.11e-01

Pos logit contributions
uel+0.104
+0.101
 WATCHED+0.096
+0.095
lished+0.095
Neg logit contributions
 spont-0.108
 volunt-0.103
 Pv-0.102
ortment-0.088
 imperson-0.087
Token shirt
Token ID10147
Feature activation+2.54e-02

Pos logit contributions
uel+0.179
+0.176
 WATCHED+0.166
lished+0.165
+0.165
Neg logit contributions
 spont-0.193
 volunt-0.184
 Pv-0.183
ortment-0.157
 imperson-0.156
Token after
Token ID706
Feature activation+3.24e-01

Pos logit contributions
uel+0.040
+0.039
 WATCHED+0.037
+0.037
lished+0.036
Neg logit contributions
 spont-0.045
 volunt-0.043
 Pv-0.043
ortment-0.037
 imperson-0.036
Token he
Token ID339
Feature activation-7.92e-02

Pos logit contributions
uel+0.520
+0.504
 WATCHED+0.483
lished+0.475
+0.472
Neg logit contributions
 spont-0.559
 volunt-0.534
 Pv-0.528
ortment-0.462
 imperson-0.446
Token punched
Token ID25436
Feature activation+2.20e+00

Pos logit contributions
 spont+0.144
 volunt+0.137
 Pv+0.136
ortment+0.118
 imperson+0.117
Neg logit contributions
uel-0.121
-0.118
 WATCHED-0.112
lished-0.110
-0.110
Token another
Token ID1194
Feature activation-8.64e-02

Pos logit contributions
uel+3.470
+3.301
 WATCHED+3.241
 Jem+3.209
+3.138
Neg logit contributions
 spont-3.677
 volunt-3.542
 Pv-3.439
ortment-3.142
 imperson-2.902
Token rally
Token ID7903
Feature activation-3.41e-01

Pos logit contributions
 spont+0.135
 volunt+0.128
 Pv+0.127
ortment+0.107
 imperson+0.106
Neg logit contributions
uel-0.154
-0.150
 WATCHED-0.143
-0.142
lished-0.142
Tokengo
Token ID2188
Feature activation-1.84e-01

Pos logit contributions
 spont+0.580
 volunt+0.551
 Pv+0.549
 imperson+0.471
ortment+0.467
Neg logit contributions
uel-0.558
-0.548
 WATCHED-0.518
lished-0.517
-0.517
Tokener
Token ID263
Feature activation-5.27e-02

Pos logit contributions
 spont+0.333
 volunt+0.317
 Pv+0.316
 imperson+0.274
ortment+0.272
Neg logit contributions
uel-0.283
-0.277
 WATCHED-0.261
lished-0.261
-0.261
Token,
Token ID11
Feature activation+6.14e-03

Pos logit contributions
 spont+0.089
 volunt+0.085
 Pv+0.084
ortment+0.072
 imperson+0.071
Neg logit contributions
uel-0.087
-0.085
 WATCHED-0.081
-0.080
lished-0.080
Token youth
Token ID6205
Feature activation-1.81e-01

Pos logit contributions
uel+0.454
+0.446
+0.421
lished+0.420
 WATCHED+0.420
Neg logit contributions
 spont-0.474
 volunt-0.449
 Pv-0.448
 imperson-0.385
ortment-0.383
Token court
Token ID2184
Feature activation+9.01e-01

Pos logit contributions
 spont+0.336
 volunt+0.321
 Pv+0.320
 imperson+0.279
ortment+0.276
Neg logit contributions
uel-0.270
-0.264
lished-0.248
 WATCHED-0.248
-0.248
Token judge
Token ID5052
Feature activation+5.55e-01

Pos logit contributions
uel+1.472
+1.439
+1.365
 WATCHED+1.360
lished+1.354
Neg logit contributions
 spont-1.528
 volunt-1.444
 Pv-1.440
ortment-1.240
 imperson-1.224
Token for
Token ID329
Feature activation+4.56e-01

Pos logit contributions
uel+0.907
+0.887
+0.842
 WATCHED+0.838
lished+0.835
Neg logit contributions
 spont-0.947
 volunt-0.899
 Pv-0.890
ortment-0.765
 imperson-0.756
Token the
Token ID262
Feature activation+1.88e-01

Pos logit contributions
uel+0.780
+0.765
+0.726
lished+0.722
 WATCHED+0.720
Neg logit contributions
 spont-0.740
 Pv-0.697
 volunt-0.696
ortment-0.596
 imperson-0.595
Token inmate
Token ID28414
Feature activation+2.18e+00

Pos logit contributions
uel+0.275
+0.269
lished+0.253
 WATCHED+0.252
+0.252
Neg logit contributions
 spont-0.353
 volunt-0.337
 Pv-0.336
 imperson-0.293
ortment-0.292
Token
Token ID447
Feature activation+1.20e-01

Pos logit contributions
uel+2.952
+2.767
+2.627
lished+2.554
 WATCHED+2.545
Neg logit contributions
 spont-4.290
 Pv-4.092
 volunt-4.062
ortment-3.637
 imperson-3.550
Token
Token ID247
Feature activation+8.88e-02

Pos logit contributions
uel+0.184
+0.181
 WATCHED+0.170
+0.169
lished+0.168
Neg logit contributions
 spont-0.217
 volunt-0.207
 Pv-0.205
 imperson-0.179
ortment-0.177
Tokens
Token ID82
Feature activation+1.83e-01

Pos logit contributions
uel+0.127
+0.123
lished+0.115
+0.115
 WATCHED+0.115
Neg logit contributions
 spont-0.171
 volunt-0.163
 Pv-0.162
 imperson-0.142
ortment-0.141
Token release
Token ID2650
Feature activation+1.71e+00

Pos logit contributions
uel+0.253
+0.249
lished+0.234
 WATCHED+0.233
+0.232
Neg logit contributions
 spont-0.357
 volunt-0.342
 Pv-0.339
 imperson-0.298
ortment-0.295
Token and
Token ID290
Feature activation+3.33e-01

Pos logit contributions
uel+2.743
+2.638
+2.553
 WATCHED+2.467
lished+2.465
Neg logit contributions
 spont-2.931
 volunt-2.782
 Pv-2.773
ortment-2.423
 imperson-2.401
Token got
Token ID1392
Feature activation+6.18e-01

Pos logit contributions
 spont+0.402
 volunt+0.385
 Pv+0.384
 imperson+0.339
ortment+0.335
Neg logit contributions
uel-0.262
-0.256
lished-0.239
-0.238
 WATCHED-0.238
Token way
Token ID835
Feature activation-9.21e-01

Pos logit contributions
uel+0.913
+0.891
 WATCHED+0.841
+0.837
lished+0.832
Neg logit contributions
 spont-1.149
 volunt-1.099
 Pv-1.087
ortment-0.952
 imperson-0.952
Token brighter
Token ID26083
Feature activation-8.14e-02

Pos logit contributions
 spont+1.847
 Pv+1.773
 volunt+1.769
 imperson+1.563
ortment+1.537
Neg logit contributions
uel-1.217
-1.196
lished-1.125
-1.116
 WATCHED-1.110
Token!!!
Token ID10185
Feature activation-1.21e-01

Pos logit contributions
 spont+0.136
 volunt+0.129
 Pv+0.128
 imperson+0.109
ortment+0.109
Neg logit contributions
uel-0.137
-0.134
-0.127
 WATCHED-0.127
lished-0.126
Token Cong
Token ID2908
Feature activation-6.61e-01

Pos logit contributions
 spont+0.207
 volunt+0.198
 Pv+0.196
 imperson+0.169
ortment+0.168
Neg logit contributions
uel-0.197
-0.193
-0.182
 WATCHED-0.182
lished-0.182
Tokenrats
Token ID46714
Feature activation+2.17e+00

Pos logit contributions
 spont+1.632
 volunt+1.573
 Pv+1.569
 imperson+1.421
ortment+1.411
Neg logit contributions
uel-0.579
-0.556
lished-0.500
 WATCHED-0.498
-0.498
Token 2
Token ID362
Feature activation-3.83e-01

Pos logit contributions
uel+3.313
+3.206
ーン+3.035
 Jem+3.015
+3.006
Neg logit contributions
 spont-3.826
 volunt-3.681
 Pv-3.541
ortment-3.168
 imperson-3.099
Token @
Token ID2488
Feature activation-4.96e-01

Pos logit contributions
 spont+0.631
 volunt+0.599
 Pv+0.595
 imperson+0.510
ortment+0.505
Neg logit contributions
uel-0.648
-0.636
lished-0.602
 WATCHED-0.602
-0.602
Tokenm
Token ID76
Feature activation-1.58e-01

Pos logit contributions
 spont+0.912
 volunt+0.869
 Pv+0.866
 imperson+0.754
ortment+0.747
Neg logit contributions
uel-0.748
-0.733
lished-0.689
 WATCHED-0.689
-0.688
Tokenf
Token ID69
Feature activation+2.39e-01

Pos logit contributions
 spont+0.310
 volunt+0.297
 Pv+0.296
 imperson+0.260
ortment+0.257
Neg logit contributions
uel-0.218
-0.213
lished-0.199
 WATCHED-0.198
-0.198
Tokenlynn
Token ID12935
Feature activation+2.19e-02

Pos logit contributions
uel+0.436
+0.429
lished+0.408
 WATCHED+0.408
+0.407
Neg logit contributions
 spont-0.362
 volunt-0.341
 Pv-0.340
 imperson-0.286
ortment-0.282
Token float
Token ID12178
Feature activation-4.01e-01

Pos logit contributions
 spont+0.431
 volunt+0.408
 Pv+0.406
 imperson+0.348
ortment+0.348
Neg logit contributions
uel-0.427
-0.420
 WATCHED-0.399
-0.396
lished-0.395
Token technique
Token ID8173
Feature activation-8.19e-02

Pos logit contributions
 spont+0.674
 volunt+0.639
 Pv+0.636
 imperson+0.546
ortment+0.540
Neg logit contributions
uel-0.667
-0.650
lished-0.618
 WATCHED-0.617
-0.616
Token,
Token ID11
Feature activation-6.26e-02

Pos logit contributions
 spont+0.139
 volunt+0.133
 Pv+0.131
ortment+0.113
 imperson+0.113
Neg logit contributions
uel-0.134
-0.131
lished-0.124
 WATCHED-0.124
-0.123
Token you
Token ID345
Feature activation+3.31e-02

Pos logit contributions
 spont+0.104
 volunt+0.099
 Pv+0.098
ortment+0.084
 imperson+0.084
Neg logit contributions
uel-0.105
-0.103
 WATCHED-0.097
-0.097
lished-0.097
Token are
Token ID389
Feature activation-2.00e-01

Pos logit contributions
uel+0.050
+0.049
 WATCHED+0.046
lished+0.045
+0.045
Neg logit contributions
 spont-0.061
 volunt-0.058
 Pv-0.058
ortment-0.050
 imperson-0.050
Token punching
Token ID29044
Feature activation+2.16e+00

Pos logit contributions
 spont+0.327
 volunt+0.310
 Pv+0.309
ortment+0.263
 imperson+0.262
Neg logit contributions
uel-0.340
-0.333
 WATCHED-0.316
-0.313
lished-0.313
Token down
Token ID866
Feature activation+2.15e-01

Pos logit contributions
uel+3.410
+3.309
 WATCHED+3.183
+3.176
lished+3.136
Neg logit contributions
 spont-3.616
 volunt-3.484
 Pv-3.446
ortment-3.055
 imperson-2.943
Token into
Token ID656
Feature activation+2.71e-01

Pos logit contributions
uel+0.349
+0.342
 WATCHED+0.324
lished+0.323
+0.323
Neg logit contributions
 spont-0.368
 volunt-0.350
 Pv-0.349
 imperson-0.300
ortment-0.298
Token the
Token ID262
Feature activation-2.52e-02

Pos logit contributions
uel+0.461
+0.452
+0.429
 WATCHED+0.429
lished+0.428
Neg logit contributions
 spont-0.439
 volunt-0.417
 Pv-0.415
ortment-0.355
 imperson-0.353
Token chest
Token ID7721
Feature activation+6.15e-01

Pos logit contributions
 spont+0.043
 volunt+0.041
 Pv+0.040
ortment+0.035
 imperson+0.035
Neg logit contributions
uel-0.041
-0.040
 WATCHED-0.038
-0.038
lished-0.038
Token and
Token ID290
Feature activation+1.39e-01

Pos logit contributions
uel+1.006
+0.991
 WATCHED+0.937
lished+0.936
+0.934
Neg logit contributions
 spont-1.043
 volunt-0.992
 Pv-0.984
 imperson-0.845
ortment-0.837
Token heads
Token ID6665
Feature activation-1.33e-01

Pos logit contributions
uel+0.479
+0.466
 WATCHED+0.439
+0.437
lished+0.437
Neg logit contributions
 spont-0.563
 volunt-0.536
 Pv-0.529
ortment-0.462
 imperson-0.461
Token Jason
Token ID8982
Feature activation+2.21e-01

Pos logit contributions
 spont+0.225
 volunt+0.214
 Pv+0.214
 imperson+0.184
ortment+0.182
Neg logit contributions
uel-0.218
-0.215
lished-0.203
 WATCHED-0.203
-0.202
Token West
Token ID2688
Feature activation+2.99e-01

Pos logit contributions
uel+0.361
+0.350
 WATCHED+0.331
lished+0.331
+0.329
Neg logit contributions
 spont-0.379
 volunt-0.362
 Pv-0.357
 imperson-0.311
ortment-0.307
Token and
Token ID290
Feature activation+3.24e-01

Pos logit contributions
uel+0.497
+0.487
lished+0.461
+0.460
 WATCHED+0.459
Neg logit contributions
 spont-0.503
 volunt-0.478
 Pv-0.472
 imperson-0.407
ortment-0.402
Token Vince
Token ID22376
Feature activation+1.32e-01

Pos logit contributions
uel+0.535
+0.524
 WATCHED+0.496
+0.495
lished+0.495
Neg logit contributions
 spont-0.546
 volunt-0.519
 Pv-0.513
 imperson-0.442
ortment-0.440
Token Zam
Token ID38343
Feature activation+2.16e+00

Pos logit contributions
uel+0.176
+0.174
 WATCHED+0.162
lished+0.162
+0.161
Neg logit contributions
 spont-0.265
 Pv-0.254
 volunt-0.254
 imperson-0.223
ortment-0.220
Tokenpe
Token ID431
Feature activation+8.39e-01

Pos logit contributions
uel+3.052
+2.777
uru+2.748
 WATCHED+2.654
+2.622
Neg logit contributions
 spont-4.129
 volunt-3.939
 Pv-3.922
ortment-3.444
 imperson-3.399
Tokenlla
Token ID8466
Feature activation+1.53e-01

Pos logit contributions
uel+1.128
+1.099
 WATCHED+1.039
lished+1.029
+1.027
Neg logit contributions
 spont-1.673
 volunt-1.607
 Pv-1.578
ortment-1.400
 imperson-1.398
Token,
Token ID11
Feature activation+1.68e-01

Pos logit contributions
uel+0.251
+0.246
lished+0.233
 WATCHED+0.233
+0.233
Neg logit contributions
 spont-0.261
 volunt-0.248
 Pv-0.247
 imperson-0.212
ortment-0.210
Token outlines
Token ID27430
Feature activation-2.42e-01

Pos logit contributions
uel+0.251
+0.246
lished+0.231
 WATCHED+0.231
+0.231
Neg logit contributions
 spont-0.312
 volunt-0.297
 Pv-0.296
 imperson-0.258
ortment-0.256
Token the
Token ID262
Feature activation+1.01e-01

Pos logit contributions
 spont+0.399
 volunt+0.379
 Pv+0.378
 imperson+0.322
ortment+0.319
Neg logit contributions
uel-0.408
-0.402
lished-0.381
-0.380
 WATCHED-0.380
Tokeniser
Token ID5847
Feature activation-5.50e-01

Pos logit contributions
 spont+0.008
 volunt+0.007
 Pv+0.007
 imperson+0.006
ortment+0.006
Neg logit contributions
uel-0.005
-0.005
 WATCHED-0.005
lished-0.005
-0.005
Tokenies
Token ID444
Feature activation+6.84e-01

Pos logit contributions
 spont+0.856
 Pv+0.816
 volunt+0.801
ortment+0.684
 imperson+0.681
Neg logit contributions
uel-0.963
-0.961
lished-0.922
-0.917
 WATCHED-0.916
Token for
Token ID329
Feature activation+2.14e-01

Pos logit contributions
uel+1.119
+1.088
 WATCHED+1.037
+1.028
lished+1.025
Neg logit contributions
 spont-1.171
 volunt-1.110
 Pv-1.099
 imperson-0.946
ortment-0.943
Token DC
Token ID6257
Feature activation+1.24e+00

Pos logit contributions
uel+0.336
+0.329
 WATCHED+0.311
lished+0.310
+0.309
Neg logit contributions
 spont-0.380
 volunt-0.362
 Pv-0.360
 imperson-0.311
ortment-0.309
Token's
Token ID338
Feature activation-1.23e-01

Pos logit contributions
uel+1.809
+1.747
lished+1.644
+1.626
 WATCHED+1.624
Neg logit contributions
 spont-2.340
 volunt-2.250
 Pv-2.203
 imperson-1.942
ortment-1.928
Token Vert
Token ID24417
Feature activation+2.15e+00

Pos logit contributions
 spont+0.219
 volunt+0.209
 Pv+0.208
 imperson+0.180
ortment+0.178
Neg logit contributions
uel-0.193
-0.189
lished-0.178
 WATCHED-0.178
-0.178
Tokenigo
Token ID14031
Feature activation+8.09e-01

Pos logit contributions
uel+3.540
+3.361
 WATCHED+3.207
lished+3.203
+3.154
Neg logit contributions
 spont-3.704
 volunt-3.546
 Pv-3.457
ortment-2.999
 imperson-2.997
Token imprint
Token ID29122
Feature activation+1.44e+00

Pos logit contributions
uel+1.087
+1.042
 WATCHED+0.975
+0.973
lished+0.972
Neg logit contributions
 spont-1.623
 volunt-1.557
 Pv-1.535
 imperson-1.360
ortment-1.355
Token and
Token ID290
Feature activation+2.31e-01

Pos logit contributions
uel+2.366
+2.272
lished+2.156
 WATCHED+2.156
+2.153
Neg logit contributions
 spont-2.474
 volunt-2.357
 Pv-2.297
 imperson-2.009
ortment-1.988
Token a
Token ID257
Feature activation+1.34e-01

Pos logit contributions
uel+0.376
+0.367
 WATCHED+0.348
lished+0.347
+0.346
Neg logit contributions
 spont-0.396
 volunt-0.377
 Pv-0.374
 imperson-0.321
ortment-0.320
Token handful
Token ID10089
Feature activation+5.18e-01

Pos logit contributions
uel+0.220
+0.216
 WATCHED+0.204
lished+0.204
+0.204
Neg logit contributions
 spont-0.228
 volunt-0.216
 Pv-0.215
 imperson-0.185
ortment-0.184
Token and
Token ID290
Feature activation-9.47e-02

Pos logit contributions
uel+0.292
+0.288
 WATCHED+0.273
+0.271
lished+0.270
Neg logit contributions
 spont-0.297
 volunt-0.283
 Pv-0.279
ortment-0.241
 imperson-0.239
Token 4
Token ID604
Feature activation+3.90e-01

Pos logit contributions
 spont+0.164
 volunt+0.157
 Pv+0.155
 imperson+0.134
ortment+0.134
Neg logit contributions
uel-0.152
-0.149
 WATCHED-0.141
-0.140
lished-0.140
Token damage
Token ID2465
Feature activation+5.84e-01

Pos logit contributions
uel+0.643
+0.628
 WATCHED+0.592
+0.592
lished+0.587
Neg logit contributions
 spont-0.660
 volunt-0.629
 Pv-0.623
 imperson-0.537
ortment-0.536
Token with
Token ID351
Feature activation+4.94e-03

Pos logit contributions
uel+0.954
+0.924
 WATCHED+0.879
+0.872
lished+0.862
Neg logit contributions
 spont-0.998
 volunt-0.950
 Pv-0.941
ortment-0.815
 imperson-0.809
Token R
Token ID371
Feature activation+4.89e-01

Pos logit contributions
uel+0.008
+0.008
 WATCHED+0.007
lished+0.007
+0.007
Neg logit contributions
 spont-0.009
 volunt-0.008
 Pv-0.008
 imperson-0.007
ortment-0.007
Tokenanged
Token ID5102
Feature activation+2.15e+00

Pos logit contributions
uel+0.768
+0.733
 WATCHED+0.695
+0.687
lished+0.686
Neg logit contributions
 spont-0.884
 volunt-0.844
 Pv-0.837
ortment-0.727
 imperson-0.723
Token Combat
Token ID14487
Feature activation+6.76e-01

Pos logit contributions
uel+3.355
+3.169
 WATCHED+3.043
 Jem+2.975
+2.972
Neg logit contributions
 spont-3.747
 volunt-3.614
 Pv-3.485
ortment-3.148
 imperson-3.057
Token Expert
Token ID25516
Feature activation-2.82e-02

Pos logit contributions
uel+1.106
+1.068
 WATCHED+1.018
lished+1.009
+1.008
Neg logit contributions
 spont-1.154
 volunt-1.103
 Pv-1.087
ortment-0.939
 imperson-0.937
Token.
Token ID13
Feature activation+4.33e-01

Pos logit contributions
 spont+0.048
 volunt+0.045
 Pv+0.045
 imperson+0.039
ortment+0.039
Neg logit contributions
uel-0.047
-0.045
 WATCHED-0.043
-0.043
lished-0.043
Token Tra
Token ID4759
Feature activation+9.03e-01

Pos logit contributions
uel+0.719
+0.701
+0.665
 WATCHED+0.661
ーン+0.652
Neg logit contributions
 spont-0.735
 volunt-0.703
 Pv-0.682
 imperson-0.604
ortment-0.590
Tokenited
Token ID863
Feature activation+5.99e-01

Pos logit contributions
uel+1.404
+1.344
+1.297
lished+1.275
 WATCHED+1.260
Neg logit contributions
 spont-1.624
 volunt-1.545
 Pv-1.530
 imperson-1.338
ortment-1.323
Tokens
Token ID82
Feature activation+2.26e-03

Pos logit contributions
uel+0.093
+0.091
 WATCHED+0.086
lished+0.086
+0.085
Neg logit contributions
 spont-0.107
 volunt-0.102
 Pv-0.101
 imperson-0.088
ortment-0.087
Token preference
Token ID12741
Feature activation-2.48e-01

Pos logit contributions
uel+0.003
+0.003
lished+0.003
 WATCHED+0.003
+0.003
Neg logit contributions
 spont-0.004
 volunt-0.004
 Pv-0.004
 imperson-0.003
ortment-0.003
Token for
Token ID329
Feature activation+2.07e-01

Pos logit contributions
 spont+0.447
 volunt+0.426
 Pv+0.425
 imperson+0.369
ortment+0.365
Neg logit contributions
uel-0.382
-0.375
lished-0.353
 WATCHED-0.353
-0.352
Token a
Token ID257
Feature activation+7.09e-02

Pos logit contributions
uel+0.336
+0.328
 WATCHED+0.311
+0.310
lished+0.310
Neg logit contributions
 spont-0.356
 volunt-0.339
 Pv-0.337
 imperson-0.290
ortment-0.289
Token Nate
Token ID23486
Feature activation+5.83e-01

Pos logit contributions
uel+0.099
+0.097
lished+0.091
 WATCHED+0.091
+0.091
Neg logit contributions
 spont-0.138
 volunt-0.132
 Pv-0.131
 imperson-0.115
ortment-0.114
Token Diaz
Token ID26899
Feature activation+2.13e+00

Pos logit contributions
uel+0.841
+0.827
lished+0.778
 WATCHED+0.775
+0.773
Neg logit contributions
 spont-1.106
 Pv-1.055
 volunt-1.054
 imperson-0.922
ortment-0.911
Token trilogy
Token ID26298
Feature activation+5.24e-01

Pos logit contributions
uel+2.769
+2.639
 WATCHED+2.462
+2.433
lished+2.411
Neg logit contributions
 spont-4.289
 volunt-4.131
 Pv-4.059
ortment-3.626
 imperson-3.533
Token fight
Token ID1907
Feature activation+1.30e+00

Pos logit contributions
uel+0.813
+0.796
 WATCHED+0.751
lished+0.750
+0.749
Neg logit contributions
 spont-0.939
 volunt-0.895
 Pv-0.890
 imperson-0.771
ortment-0.766
Token was
Token ID373
Feature activation+3.77e-01

Pos logit contributions
uel+2.018
+1.957
 WATCHED+1.856
+1.852
lished+1.827
Neg logit contributions
 spont-2.318
 volunt-2.232
 Pv-2.192
ortment-1.906
 imperson-1.899
Token met
Token ID1138
Feature activation-4.83e-01

Pos logit contributions
uel+0.576
+0.564
 WATCHED+0.531
lished+0.530
+0.529
Neg logit contributions
 spont-0.683
 volunt-0.651
 Pv-0.648
 imperson-0.562
ortment-0.560
Token by
Token ID416
Feature activation+1.67e-01

Pos logit contributions
 spont+0.758
 volunt+0.717
 Pv+0.714
 imperson+0.605
ortment+0.597
Neg logit contributions
uel-0.854
-0.840
lished-0.799
 WATCHED-0.798
-0.796
Token dump
Token ID10285
Feature activation+6.95e-01

Pos logit contributions
 spont+0.143
 volunt+0.136
 Pv+0.134
 imperson+0.116
ortment+0.116
Neg logit contributions
uel-0.136
-0.133
 WATCHED-0.126
-0.126
lished-0.126
Tokeners
Token ID364
Feature activation-1.82e-01

Pos logit contributions
uel+1.009
+0.973
+0.923
lished+0.912
 WATCHED+0.907
Neg logit contributions
 spont-1.322
 volunt-1.260
 Pv-1.248
 imperson-1.095
ortment-1.083
Token driver
Token ID4639
Feature activation-1.65e-01

Pos logit contributions
 spont+0.309
 volunt+0.293
 Pv+0.288
 imperson+0.249
ortment+0.248
Neg logit contributions
uel-0.300
-0.293
 WATCHED-0.277
-0.277
lished-0.276
Token V
Token ID569
Feature activation+1.88e-01

Pos logit contributions
 spont+0.284
 volunt+0.269
 Pv+0.266
 imperson+0.230
ortment+0.230
Neg logit contributions
uel-0.268
-0.261
 WATCHED-0.247
-0.247
lished-0.246
Tokenasant
Token ID8775
Feature activation+7.45e-01

Pos logit contributions
uel+0.301
+0.292
 WATCHED+0.277
lished+0.276
+0.276
Neg logit contributions
 spont-0.330
 volunt-0.313
 Pv-0.310
 imperson-0.269
ortment-0.266
Token Tav
Token ID41489
Feature activation+2.13e+00

Pos logit contributions
uel+1.284
+1.239
 WATCHED+1.185
+1.176
lished+1.173
Neg logit contributions
 spont-1.218
 volunt-1.149
 Pv-1.124
 imperson-0.973
ortment-0.967
Tokenare
Token ID533
Feature activation-4.22e-01

Pos logit contributions
uel+3.429
+3.178
lished+3.090
uru+3.083
 WATCHED+3.081
Neg logit contributions
 spont-3.686
 volunt-3.481
 Pv-3.423
 imperson-2.972
ortment-2.939
Token and
Token ID290
Feature activation+5.35e-02

Pos logit contributions
 spont+0.735
 volunt+0.699
 Pv+0.695
 imperson+0.601
ortment+0.595
Neg logit contributions
uel-0.675
-0.662
lished-0.625
 WATCHED-0.625
-0.624
Token Ash
Token ID7844
Feature activation+6.01e-02

Pos logit contributions
uel+0.085
+0.083
 WATCHED+0.079
+0.078
lished+0.078
Neg logit contributions
 spont-0.093
 volunt-0.089
 Pv-0.087
ortment-0.076
 imperson-0.076
Tokenok
Token ID482
Feature activation-1.21e-01

Pos logit contributions
uel+0.084
+0.082
lished+0.077
 WATCHED+0.077
+0.077
Neg logit contributions
 spont-0.117
 volunt-0.112
 Pv-0.111
 imperson-0.098
ortment-0.097
Token Yad
Token ID41449
Feature activation-1.11e+00

Pos logit contributions
 spont+0.205
 volunt+0.194
 Pv+0.191
 imperson+0.165
ortment+0.164
Neg logit contributions
uel-0.200
-0.195
 WATCHED-0.185
-0.184
lished-0.184
Token.
Token ID13
Feature activation+1.06e-02

Pos logit contributions
uel+0.917
+0.884
lished+0.843
+0.841
 WATCHED+0.837
Neg logit contributions
 spont-0.917
 volunt-0.874
 Pv-0.863
 imperson-0.745
ortment-0.735
Token O
Token ID440
Feature activation-4.37e-01

Pos logit contributions
uel+0.016
+0.016
 WATCHED+0.015
+0.015
lished+0.015
Neg logit contributions
 spont-0.020
 volunt-0.019
 Pv-0.018
 imperson-0.016
ortment-0.016
Token.
Token ID13
Feature activation+2.74e-01

Pos logit contributions
 spont+0.795
 volunt+0.755
 Pv+0.754
 imperson+0.656
ortment+0.647
Neg logit contributions
uel-0.653
-0.650
lished-0.615
-0.614
 WATCHED-0.613
TokenJ
Token ID41
Feature activation-4.58e-01

Pos logit contributions
uel+0.364
+0.356
+0.331
lished+0.330
 WATCHED+0.329
Neg logit contributions
 spont-0.552
 volunt-0.529
 Pv-0.524
 imperson-0.467
ortment-0.460
Token.
Token ID13
Feature activation+2.42e-01

Pos logit contributions
 spont+0.728
 volunt+0.689
 Pv+0.686
 imperson+0.582
ortment+0.576
Neg logit contributions
uel-0.804
-0.790
lished-0.750
 WATCHED-0.750
-0.749
Token Simpson
Token ID20531
Feature activation+2.12e+00

Pos logit contributions
uel+0.440
+0.433
lished+0.412
+0.412
 WATCHED+0.412
Neg logit contributions
 spont-0.366
 volunt-0.346
 Pv-0.343
 imperson-0.289
ortment-0.286
Token:
Token ID25
Feature activation-9.51e-02

Pos logit contributions
uel+2.940
+2.861
 WATCHED+2.730
+2.697
ーン+2.602
Neg logit contributions
 spont-4.035
 volunt-3.828
 Pv-3.773
ortment-3.403
 imperson-3.280
Token American
Token ID1605
Feature activation-2.42e-01

Pos logit contributions
 spont+0.180
 Pv+0.172
 volunt+0.172
 imperson+0.150
ortment+0.149
Neg logit contributions
uel-0.137
-0.134
lished-0.127
 WATCHED-0.126
-0.126
Token Crime
Token ID10003
Feature activation-1.51e-01

Pos logit contributions
 spont+0.308
 Pv+0.288
 volunt+0.287
 imperson+0.232
ortment+0.227
Neg logit contributions
uel-0.501
-0.494
lished-0.477
 WATCHED-0.476
-0.472
Token Story
Token ID8362
Feature activation-3.97e-01

Pos logit contributions
 spont+0.282
 Pv+0.274
 volunt+0.269
 imperson+0.238
ortment+0.233
Neg logit contributions
uel-0.221
-0.214
lished-0.207
-0.202
osponsors-0.202
Token on
Token ID319
Feature activation+2.33e-01

Pos logit contributions
 spont+0.716
 volunt+0.682
 Pv+0.681
 imperson+0.591
ortment+0.582
Neg logit contributions
uel-0.610
-0.600
lished-0.567
-0.563
 WATCHED-0.563
Token (
Token ID357
Feature activation+2.71e-01

Pos logit contributions
uel+1.412
+1.381
+1.303
 WATCHED+1.298
lished+1.292
Neg logit contributions
 spont-1.597
 volunt-1.514
 Pv-1.491
 imperson-1.304
ortment-1.303
TokenN
Token ID45
Feature activation+7.06e-01

Pos logit contributions
uel+0.393
+0.386
lished+0.362
 WATCHED+0.361
+0.361
Neg logit contributions
 spont-0.511
 volunt-0.488
 Pv-0.486
 imperson-0.425
ortment-0.421
TokenUST
Token ID7759
Feature activation+2.06e-01

Pos logit contributions
uel+0.791
+0.771
lished+0.710
 WATCHED+0.709
+0.707
Neg logit contributions
 spont-1.566
 volunt-1.506
 Pv-1.501
 imperson-1.342
ortment-1.332
Token),
Token ID828
Feature activation+1.15e-01

Pos logit contributions
uel+0.333
+0.327
 WATCHED+0.309
lished+0.308
+0.308
Neg logit contributions
 spont-0.356
 volunt-0.339
 Pv-0.337
 imperson-0.291
ortment-0.288
Token N
Token ID399
Feature activation+1.13e+00

Pos logit contributions
uel+0.163
+0.160
lished+0.150
 WATCHED+0.150
+0.149
Neg logit contributions
 spont-0.220
 volunt-0.210
 Pv-0.209
 imperson-0.183
ortment-0.182
TokenUFC
Token ID44534
Feature activation+2.12e+00

Pos logit contributions
uel+1.461
+1.424
 WATCHED+1.323
lished+1.323
+1.319
Neg logit contributions
 spont-2.307
 volunt-2.213
 Pv-2.206
 imperson-1.948
ortment-1.934
Token Fans
Token ID24964
Feature activation+2.53e-01

Pos logit contributions
uel+3.274
+3.185
+3.002
 WATCHED+2.961
ーン+2.949
Neg logit contributions
 spont-3.787
 volunt-3.595
 Pv-3.514
ortment-3.103
 imperson-3.073
Token United
Token ID1578
Feature activation+2.44e-01

Pos logit contributions
uel+0.335
+0.327
 WATCHED+0.305
lished+0.305
+0.305
Neg logit contributions
 spont-0.512
 volunt-0.490
 Pv-0.487
 imperson-0.431
ortment-0.428
Token and
Token ID290
Feature activation+4.35e-01

Pos logit contributions
uel+0.342
+0.335
 WATCHED+0.314
lished+0.313
+0.313
Neg logit contributions
 spont-0.472
 volunt-0.451
 Pv-0.448
 imperson-0.394
ortment-0.391
Token Newcastle
Token ID22410
Feature activation+1.09e+00

Pos logit contributions
uel+0.649
+0.636
 WATCHED+0.599
+0.596
lished+0.596
Neg logit contributions
 spont-0.802
 volunt-0.764
 Pv-0.759
 imperson-0.663
ortment-0.660
Token United
Token ID1578
Feature activation+4.66e-02

Pos logit contributions
uel+1.302
+1.265
 WATCHED+1.164
+1.162
lished+1.158
Neg logit contributions
 spont-2.326
 volunt-2.236
 Pv-2.212
 imperson-1.972
ortment-1.968
Tokenence
Token ID594
Feature activation+4.40e-01

Pos logit contributions
uel+0.160
+0.158
+0.150
 WATCHED+0.150
lished+0.149
Neg logit contributions
 spont-0.160
 volunt-0.151
 Pv-0.151
 imperson-0.129
ortment-0.128
Token Jr
Token ID7504
Feature activation-3.85e-01

Pos logit contributions
uel+0.695
+0.681
 WATCHED+0.643
+0.641
lished+0.640
Neg logit contributions
 spont-0.777
 volunt-0.740
 Pv-0.734
 imperson-0.634
ortment-0.631
Token
Token ID447
Feature activation+1.24e-01

Pos logit contributions
 spont+0.615
 volunt+0.582
 Pv+0.580
 imperson+0.493
ortment+0.488
Neg logit contributions
uel-0.670
-0.659
lished-0.625
 WATCHED-0.625
-0.624
Token
Token ID247
Feature activation+1.93e-01

Pos logit contributions
uel+0.190
+0.187
 WATCHED+0.175
lished+0.175
+0.175
Neg logit contributions
 spont-0.224
 volunt-0.214
 Pv-0.213
 imperson-0.185
ortment-0.183
Tokens
Token ID82
Feature activation+2.63e-02

Pos logit contributions
uel+0.273
+0.267
lished+0.250
 WATCHED+0.250
+0.249
Neg logit contributions
 spont-0.372
 volunt-0.355
 Pv-0.353
 imperson-0.310
ortment-0.307
Token punches
Token ID25495
Feature activation+2.09e+00

Pos logit contributions
uel+0.043
+0.042
lished+0.040
 WATCHED+0.040
+0.040
Neg logit contributions
 spont-0.045
 volunt-0.043
 Pv-0.042
 imperson-0.036
ortment-0.036
Token had
Token ID550
Feature activation-1.48e-01

Pos logit contributions
uel+3.147
+3.039
 WATCHED+2.895
+2.884
ーン+2.823
Neg logit contributions
 spont-3.814
 volunt-3.669
 Pv-3.605
ortment-3.140
 imperson-3.108
Token blown
Token ID16318
Feature activation-4.41e-01

Pos logit contributions
 spont+0.267
 volunt+0.255
 Pv+0.253
 imperson+0.219
ortment+0.218
Neg logit contributions
uel-0.229
-0.224
 WATCHED-0.212
-0.211
lished-0.210
Token Brook
Token ID7590
Feature activation+7.32e-02

Pos logit contributions
 spont+0.778
 volunt+0.739
 Pv+0.738
 imperson+0.638
ortment+0.630
Neg logit contributions
uel-0.695
-0.683
lished-0.645
 WATCHED-0.643
-0.642
Token
Token ID447
Feature activation+1.63e-01

Pos logit contributions
uel+0.120
+0.118
+0.111
 WATCHED+0.111
lished+0.111
Neg logit contributions
 spont-0.125
 volunt-0.119
 Pv-0.118
ortment-0.101
 imperson-0.101
Token
Token ID247
Feature activation+1.16e-01

Pos logit contributions
uel+0.251
+0.246
 WATCHED+0.232
lished+0.231
+0.231
Neg logit contributions
 spont-0.294
 volunt-0.281
 Pv-0.279
 imperson-0.243
ortment-0.240
Token Tele
Token ID14318
Feature activation-5.76e-01

Pos logit contributions
uel+0.045
+0.044
 WATCHED+0.042
+0.042
lished+0.042
Neg logit contributions
 spont-0.055
 volunt-0.053
 Pv-0.052
 imperson-0.046
ortment-0.045
Tokenpath
Token ID6978
Feature activation+2.32e-01

Pos logit contributions
 spont+1.136
 volunt+1.087
 Pv+1.085
 imperson+0.955
ortment+0.946
Neg logit contributions
uel-0.782
-0.772
 WATCHED-0.718
lished-0.718
-0.717
Token
Token ID198
Feature activation+7.22e-02

Pos logit contributions
uel+0.383
+0.369
+0.349
lished+0.348
 WATCHED+0.348
Neg logit contributions
 spont-0.400
 volunt-0.381
 Pv-0.373
 imperson-0.323
ortment-0.322
Token
Token ID198
Feature activation+6.43e-02

Pos logit contributions
uel+0.134
+0.132
lished+0.127
 WATCHED+0.126
+0.126
Neg logit contributions
 spont-0.107
 Pv-0.101
 volunt-0.100
 imperson-0.084
ortment-0.083
TokenH
Token ID39
Feature activation+4.01e-02

Pos logit contributions
uel+0.094
+0.092
+0.086
 WATCHED+0.086
lished+0.086
Neg logit contributions
 spont-0.122
 volunt-0.116
 Pv-0.115
 imperson-0.101
ortment-0.100
Tokenython
Token ID7535
Feature activation+2.09e+00

Pos logit contributions
uel+0.059
+0.057
lished+0.054
 WATCHED+0.054
+0.053
Neg logit contributions
 spont-0.075
 volunt-0.072
 Pv-0.072
 imperson-0.063
ortment-0.062
Tokenia
Token ID544
Feature activation+1.97e-01

Pos logit contributions
uel+3.379
+3.293
+3.114
 WATCHED+3.066
lished+3.059
Neg logit contributions
 spont-3.549
 volunt-3.420
 Pv-3.325
ortment-2.943
 imperson-2.865
Token the
Token ID262
Feature activation+1.57e-01

Pos logit contributions
uel+0.305
+0.299
+0.281
 WATCHED+0.281
lished+0.280
Neg logit contributions
 spont-0.352
 volunt-0.336
 Pv-0.332
 imperson-0.289
ortment-0.288
Token Cruel
Token ID44153
Feature activation-1.60e-01

Pos logit contributions
uel+0.238
+0.234
 WATCHED+0.220
+0.219
lished+0.219
Neg logit contributions
 spont-0.285
 volunt-0.272
 Pv-0.270
 imperson-0.235
ortment-0.234
Token vs
Token ID3691
Feature activation+3.16e-01

Pos logit contributions
 spont+0.297
 volunt+0.284
 Pv+0.281
 imperson+0.246
ortment+0.244
Neg logit contributions
uel-0.240
-0.235
 WATCHED-0.221
-0.220
lished-0.220
Token Ign
Token ID16583
Feature activation+8.37e-01

Pos logit contributions
uel+0.493
+0.481
 WATCHED+0.454
+0.453
lished+0.450
Neg logit contributions
 spont-0.562
 volunt-0.537
 Pv-0.528
 imperson-0.461
ortment-0.460
Token retaliate
Token ID50081
Feature activation-3.96e-02

Pos logit contributions
 spont+0.028
 volunt+0.026
 Pv+0.026
ortment+0.023
 imperson+0.022
Neg logit contributions
uel-0.025
-0.025
 WATCHED-0.024
-0.023
lished-0.023
Token,
Token ID11
Feature activation+1.30e-01

Pos logit contributions
 spont+0.067
 volunt+0.064
 Pv+0.064
ortment+0.055
 imperson+0.054
Neg logit contributions
uel-0.065
-0.063
 WATCHED-0.060
-0.060
lished-0.060
Token he
Token ID339
Feature activation+1.18e-02

Pos logit contributions
uel+0.206
+0.203
 WATCHED+0.193
lished+0.190
+0.189
Neg logit contributions
 spont-0.226
 volunt-0.215
 Pv-0.214
ortment-0.185
 imperson-0.183
Token threw
Token ID9617
Feature activation+3.08e-01

Pos logit contributions
uel+0.017
+0.017
 WATCHED+0.016
lished+0.016
+0.016
Neg logit contributions
 spont-0.022
 volunt-0.021
 Pv-0.021
 imperson-0.019
ortment-0.019
Token no
Token ID645
Feature activation-3.35e-01

Pos logit contributions
uel+0.523
+0.513
lished+0.487
 WATCHED+0.487
+0.486
Neg logit contributions
 spont-0.503
 volunt-0.478
 Pv-0.474
ortment-0.406
 imperson-0.404
Token punches
Token ID25495
Feature activation+2.08e+00

Pos logit contributions
 spont+0.739
 volunt+0.709
 Pv+0.707
 imperson+0.634
ortment+0.625
Neg logit contributions
uel-0.381
-0.373
lished-0.345
 WATCHED-0.343
-0.342
Token,
Token ID11
Feature activation+2.81e-03

Pos logit contributions
uel+3.155
+3.024
 WATCHED+2.939
lished+2.890
+2.885
Neg logit contributions
 spont-3.662
 volunt-3.539
 Pv-3.483
ortment-3.086
 imperson-2.954
Token and
Token ID290
Feature activation-1.59e-01

Pos logit contributions
uel+0.004
+0.004
 WATCHED+0.004
lished+0.004
+0.004
Neg logit contributions
 spont-0.005
 volunt-0.005
 Pv-0.005
 imperson-0.004
ortment-0.004
Token he
Token ID339
Feature activation-1.34e-01

Pos logit contributions
 spont+0.288
 volunt+0.275
 Pv+0.274
 imperson+0.238
ortment+0.235
Neg logit contributions
uel-0.243
-0.239
lished-0.225
 WATCHED-0.225
-0.224
Token did
Token ID750
Feature activation+2.24e-01

Pos logit contributions
 spont+0.257
 volunt+0.246
 Pv+0.245
 imperson+0.214
ortment+0.213
Neg logit contributions
uel-0.192
-0.188
 WATCHED-0.176
lished-0.176
-0.175
Token not
Token ID407
Feature activation+5.51e-02

Pos logit contributions
uel+0.323
+0.317
 WATCHED+0.299
lished+0.296
+0.295
Neg logit contributions
 spont-0.419
 volunt-0.401
 Pv-0.398
ortment-0.346
 imperson-0.344
Token against
Token ID1028
Feature activation+6.35e-01

Pos logit contributions
uel+1.308
+1.280
lished+1.213
 WATCHED+1.213
+1.212
Neg logit contributions
 spont-1.272
 volunt-1.210
 Pv-1.201
 imperson-1.026
ortment-1.019
Token Mark
Token ID2940
Feature activation-7.61e-01

Pos logit contributions
uel+1.053
+1.034
lished+0.979
 WATCHED+0.978
+0.977
Neg logit contributions
 spont-1.067
 volunt-1.013
 Pv-1.009
 imperson-0.865
ortment-0.857
Token Hunt
Token ID12937
Feature activation-1.34e-01

Pos logit contributions
 spont+1.266
 Pv+1.229
 volunt+1.200
ortment+1.055
 imperson+1.020
Neg logit contributions
-1.283
uel-1.238
 WATCHED-1.192
lished-1.186
ーン-1.176
Token back
Token ID736
Feature activation+3.21e-01

Pos logit contributions
 spont+0.222
 volunt+0.210
 Pv+0.209
 imperson+0.179
ortment+0.177
Neg logit contributions
uel-0.227
-0.223
lished-0.211
 WATCHED-0.211
-0.211
Token at
Token ID379
Feature activation-8.42e-02

Pos logit contributions
uel+0.554
+0.541
lished+0.515
 WATCHED+0.514
+0.513
Neg logit contributions
 spont-0.522
 volunt-0.496
 Pv-0.491
 imperson-0.417
ortment-0.415
Token UFC
Token ID11448
Feature activation+2.07e+00

Pos logit contributions
 spont+0.187
 volunt+0.180
 Pv+0.180
 imperson+0.161
ortment+0.159
Neg logit contributions
uel-0.094
-0.092
lished-0.084
 WATCHED-0.084
-0.084
Token 200
Token ID939
Feature activation+7.00e-01

Pos logit contributions
uel+2.868
+2.720
+2.596
ーン+2.588
 WATCHED+2.581
Neg logit contributions
 spont-4.025
 volunt-3.890
 Pv-3.809
ortment-3.383
 imperson-3.356
Token.
Token ID13
Feature activation+2.80e-01

Pos logit contributions
uel+1.173
+1.139
 WATCHED+1.086
+1.085
lished+1.080
Neg logit contributions
 spont-1.166
 volunt-1.108
 Pv-1.101
ortment-0.942
 imperson-0.941
Token
Token ID198
Feature activation-2.94e-01

Pos logit contributions
uel+0.484
+0.472
 WATCHED+0.451
+0.450
lished+0.448
Neg logit contributions
 spont-0.456
 volunt-0.433
 Pv-0.422
 imperson-0.370
ortment-0.359
Token
Token ID198
Feature activation-3.49e-01

Pos logit contributions
 spont+0.417
 Pv+0.400
 volunt+0.387
 imperson+0.321
ortment+0.320
Neg logit contributions
uel-0.558
-0.554
lished-0.539
 WATCHED-0.528
osponsors-0.526
TokenUS
Token ID2937
Feature activation+1.30e-01

Pos logit contributions
 spont+0.665
 volunt+0.635
 Pv+0.633
 imperson+0.554
ortment+0.550
Neg logit contributions
uel-0.499
-0.489
lished-0.459
 WATCHED-0.459
-0.457
Token to
Token ID284
Feature activation+3.85e-01

Pos logit contributions
uel+0.391
+0.383
 WATCHED+0.362
lished+0.361
+0.360
Neg logit contributions
 spont-0.429
 volunt-0.408
 Pv-0.407
 imperson-0.348
ortment-0.348
Token leak
Token ID13044
Feature activation-6.73e-01

Pos logit contributions
uel+0.616
+0.598
 WATCHED+0.569
lished+0.560
+0.559
Neg logit contributions
 spont-0.672
 volunt-0.638
 Pv-0.636
ortment-0.550
 imperson-0.538
Token.
Token ID764
Feature activation-6.32e-02

Pos logit contributions
 spont+1.078
 volunt+1.022
 Pv+1.014
ortment+0.864
 imperson+0.863
Neg logit contributions
uel-1.168
-1.152
 WATCHED-1.092
lished-1.091
-1.089
Token A
Token ID317
Feature activation+6.76e-03

Pos logit contributions
 spont+0.105
 volunt+0.100
 Pv+0.098
 imperson+0.085
ortment+0.084
Neg logit contributions
uel-0.107
-0.105
 WATCHED-0.100
-0.099
lished-0.099
Token president
Token ID1893
Feature activation+1.69e-01

Pos logit contributions
uel+0.011
+0.011
lished+0.010
 WATCHED+0.010
+0.010
Neg logit contributions
 spont-0.012
 volunt-0.011
 Pv-0.011
 imperson-0.010
ortment-0.010
Token ign
Token ID3627
Feature activation+2.07e+00

Pos logit contributions
uel+0.249
+0.243
 WATCHED+0.228
+0.227
lished+0.227
Neg logit contributions
 spont-0.315
 volunt-0.302
 Pv-0.300
 imperson-0.260
ortment-0.260
Tokeniting
Token ID1780
Feature activation+1.13e+00

Pos logit contributions
uel+2.062
+1.834
lished+1.745
uru+1.713
+1.706
Neg logit contributions
 spont-4.879
 volunt-4.722
 Pv-4.696
 imperson-4.204
ortment-4.187
Token speculation
Token ID13367
Feature activation+5.50e-01

Pos logit contributions
uel+1.864
+1.820
 WATCHED+1.710
lished+1.708
+1.695
Neg logit contributions
 spont-1.820
 volunt-1.774
 Pv-1.751
ortment-1.537
 imperson-1.486
Token about
Token ID546
Feature activation-2.59e-01

Pos logit contributions
uel+0.884
+0.868
 WATCHED+0.822
+0.814
lished+0.814
Neg logit contributions
 spont-0.949
 volunt-0.906
 Pv-0.898
ortment-0.774
 imperson-0.771
Token secret
Token ID3200
Feature activation+8.61e-02

Pos logit contributions
 spont+0.435
 volunt+0.414
 Pv+0.412
ortment+0.353
 imperson+0.352
Neg logit contributions
uel-0.428
-0.421
 WATCHED-0.398
lished-0.397
-0.396
Token Oval
Token ID25165
Feature activation-5.80e-01

Pos logit contributions
uel+0.141
+0.138
 WATCHED+0.131
+0.130
lished+0.129
Neg logit contributions
 spont-0.146
 volunt-0.139
 Pv-0.138
ortment-0.120
 imperson-0.118
Token Hom
Token ID8074
Feature activation-9.55e-01

Pos logit contributions
uel+0.384
+0.377
+0.358
lished+0.358
 WATCHED+0.358
Neg logit contributions
 spont-0.317
 volunt-0.300
 Pv-0.297
 imperson-0.250
ortment-0.248
Token M
Token ID337
Feature activation+1.71e-02

Pos logit contributions
 spont+1.629
 Pv+1.548
 volunt+1.548
 imperson+1.329
ortment+1.309
Neg logit contributions
uel-1.552
-1.529
lished-1.451
 WATCHED-1.448
-1.442
TokenTR
Token ID5446
Feature activation+6.03e-01

Pos logit contributions
uel+0.022
+0.022
lished+0.020
 WATCHED+0.020
+0.020
Neg logit contributions
 spont-0.035
 volunt-0.034
 Pv-0.033
 imperson-0.030
ortment-0.029
Token Station
Token ID9327
Feature activation-3.92e-01

Pos logit contributions
uel+0.967
+0.941
+0.891
lished+0.882
 WATCHED+0.879
Neg logit contributions
 spont-1.053
 volunt-1.005
 Pv-0.987
ortment-0.858
 imperson-0.858
Token (
Token ID357
Feature activation+1.21e-01

Pos logit contributions
 spont+0.659
 volunt+0.626
 Pv+0.621
 imperson+0.534
ortment+0.529
Neg logit contributions
uel-0.652
-0.640
lished-0.605
 WATCHED-0.605
-0.605
TokenExit
Token ID30337
Feature activation+2.07e+00

Pos logit contributions
uel+0.190
+0.187
+0.176
lished+0.175
 WATCHED+0.175
Neg logit contributions
 spont-0.215
 volunt-0.204
 Pv-0.202
 imperson-0.176
ortment-0.175
Token A
Token ID317
Feature activation+1.26e-01

Pos logit contributions
uel+3.012
+2.896
+2.719
 WATCHED+2.717
lished+2.679
Neg logit contributions
 spont-3.840
 volunt-3.704
 Pv-3.580
ortment-3.214
 imperson-3.194
Token or
Token ID393
Feature activation-1.91e-01

Pos logit contributions
uel+0.202
+0.198
+0.187
lished+0.186
 WATCHED+0.186
Neg logit contributions
 spont-0.221
 volunt-0.210
 Pv-0.209
 imperson-0.181
ortment-0.179
Token D
Token ID360
Feature activation+6.32e-01

Pos logit contributions
 spont+0.331
 volunt+0.315
 Pv+0.313
 imperson+0.270
ortment+0.268
Neg logit contributions
uel-0.306
-0.300
 WATCHED-0.283
lished-0.283
-0.283
Token).
Token ID737
Feature activation-1.28e-01

Pos logit contributions
uel+1.033
+1.009
+0.953
 WATCHED+0.949
lished+0.947
Neg logit contributions
 spont-1.085
 volunt-1.031
 Pv-1.021
 imperson-0.881
ortment-0.876
Token<|endoftext|>
Token ID50256
Feature activation+9.67e-01

Pos logit contributions
 spont+0.215
 volunt+0.205
 Pv+0.201
 imperson+0.175
ortment+0.172
Neg logit contributions
uel-0.214
-0.211
 WATCHED-0.199
-0.199
lished-0.198
Token to
Token ID284
Feature activation-6.79e-02

Pos logit contributions
uel+0.650
+0.637
 WATCHED+0.604
+0.602
lished+0.598
Neg logit contributions
 spont-0.668
 volunt-0.636
 Pv-0.629
ortment-0.543
 imperson-0.542
Token accident
Token ID5778
Feature activation-6.78e-01

Pos logit contributions
 spont+0.111
 volunt+0.106
 Pv+0.104
 imperson+0.089
ortment+0.089
Neg logit contributions
uel-0.115
-0.113
 WATCHED-0.107
-0.107
lished-0.107
Token status
Token ID3722
Feature activation-4.81e-02

Pos logit contributions
 spont+1.117
 volunt+1.060
 Pv+1.055
 imperson+0.902
ortment+0.892
Neg logit contributions
uel-1.149
-1.129
lished-1.070
 WATCHED-1.070
-1.068
Token).
Token ID737
Feature activation-2.28e-01

Pos logit contributions
 spont+0.080
 volunt+0.075
 Pv+0.075
 imperson+0.064
ortment+0.064
Neg logit contributions
uel-0.081
-0.080
 WATCHED-0.076
-0.075
lished-0.075
Token George
Token ID4502
Feature activation-1.06e+00

Pos logit contributions
 spont+0.386
 volunt+0.368
 Pv+0.361
 imperson+0.316
ortment+0.311
Neg logit contributions
uel-0.376
-0.369
 WATCHED-0.349
-0.348
lished-0.346
Token reminded
Token ID14516
Feature activation-2.46e+00

Pos logit contributions
 spont+1.855
 volunt+1.768
 Pv+1.763
 imperson+1.529
ortment+1.498
Neg logit contributions
uel-1.670
-1.636
lished-1.558
-1.553
 WATCHED-1.546
Token me
Token ID502
Feature activation+2.04e-02

Pos logit contributions
 spont+3.670
 Pv+3.459
 volunt+3.442
 imperson+2.934
ortment+2.805
Neg logit contributions
-4.448
uel-4.448
lished-4.261
 WATCHED-4.249
osponsors-4.238
Token that
Token ID326
Feature activation-3.71e-02

Pos logit contributions
uel+0.034
+0.033
 WATCHED+0.031
lished+0.031
+0.031
Neg logit contributions
 spont-0.035
 volunt-0.033
 Pv-0.033
ortment-0.028
 imperson-0.028
Token if
Token ID611
Feature activation+7.28e-02

Pos logit contributions
 spont+0.062
 volunt+0.059
 Pv+0.058
ortment+0.050
 imperson+0.050
Neg logit contributions
uel-0.062
-0.061
 WATCHED-0.057
-0.057
lished-0.057
Token I
Token ID314
Feature activation+1.57e-02

Pos logit contributions
uel+0.124
+0.121
 WATCHED+0.114
+0.114
lished+0.114
Neg logit contributions
 spont-0.118
 volunt-0.113
 Pv-0.111
ortment-0.096
 imperson-0.095
Token rolled
Token ID11686
Feature activation-2.93e-01

Pos logit contributions
uel+0.023
+0.022
 WATCHED+0.021
lished+0.021
+0.021
Neg logit contributions
 spont-0.030
 volunt-0.028
 Pv-0.028
ortment-0.025
 imperson-0.024
Token download
Token ID4321
Feature activation-4.97e-01

Pos logit contributions
uel+1.053
+1.032
+0.985
lished+0.975
 WATCHED+0.971
Neg logit contributions
 spont-0.919
 volunt-0.862
 Pv-0.845
ortment-0.720
 imperson-0.718
Token.
Token ID13
Feature activation-2.19e-01

Pos logit contributions
 spont+0.772
 Pv+0.732
 volunt+0.727
 imperson+0.617
ortment+0.608
Neg logit contributions
uel-0.883
-0.870
lished-0.833
 WATCHED-0.826
-0.823
Token Includes
Token ID29581
Feature activation-1.52e-01

Pos logit contributions
 spont+0.477
 volunt+0.458
 Pv+0.455
 imperson+0.408
ortment+0.403
Neg logit contributions
uel-0.255
-0.246
 WATCHED-0.230
-0.229
lished-0.226
Token the
Token ID262
Feature activation-1.88e-01

Pos logit contributions
 spont+0.249
 volunt+0.236
 Pv+0.234
 imperson+0.200
ortment+0.198
Neg logit contributions
uel-0.259
-0.255
lished-0.242
 WATCHED-0.242
-0.241
Token Catholic
Token ID7835
Feature activation-1.46e+00

Pos logit contributions
 spont+0.288
 volunt+0.273
 Pv+0.270
 imperson+0.228
ortment+0.227
Neg logit contributions
uel-0.336
-0.330
 WATCHED-0.315
-0.314
lished-0.314
Token Encyclopedia
Token ID36732
Feature activation-2.46e+00

Pos logit contributions
 spont+1.302
 volunt+1.174
 Pv+1.163
 imperson+0.832
ortment+0.820
Neg logit contributions
uel-3.572
-3.517
 WATCHED-3.401
lished-3.400
-3.388
Token,
Token ID11
Feature activation-1.56e-01

Pos logit contributions
 spont+3.692
 Pv+3.602
 volunt+3.500
 imperson+2.960
ortment+2.930
Neg logit contributions
uel-4.393
-4.354
lished-4.258
ーテ-4.159
-4.150
Token Church
Token ID4564
Feature activation-1.73e+00

Pos logit contributions
 spont+0.254
 volunt+0.241
 Pv+0.241
 imperson+0.205
ortment+0.202
Neg logit contributions
uel-0.267
-0.262
lished-0.251
 WATCHED-0.251
-0.248
Token Fathers
Token ID30969
Feature activation-1.25e+00

Pos logit contributions
 spont+2.514
 volunt+2.362
 Pv+2.352
 imperson+1.952
ortment+1.928
Neg logit contributions
uel-3.260
-3.230
lished-3.077
 WATCHED-3.076
-3.068
Token,
Token ID11
Feature activation-2.80e-01

Pos logit contributions
 spont+2.235
 volunt+2.123
 Pv+2.092
 imperson+1.823
ortment+1.803
Neg logit contributions
uel-1.958
-1.911
-1.808
 WATCHED-1.797
lished-1.767
Token Sum
Token ID5060
Feature activation-6.31e-01

Pos logit contributions
 spont+0.495
 volunt+0.470
 Pv+0.459
 imperson+0.401
ortment+0.395
Neg logit contributions
uel-0.440
-0.430
lished-0.400
-0.396
 WATCHED-0.395
Token based
Token ID1912
Feature activation+6.07e-01

Pos logit contributions
uel+0.082
+0.080
 WATCHED+0.076
+0.076
lished+0.076
Neg logit contributions
 spont-0.085
 volunt-0.080
 Pv-0.080
ortment-0.069
 imperson-0.068
Token on
Token ID319
Feature activation+5.70e-02

Pos logit contributions
uel+0.943
+0.921
 WATCHED+0.872
lished+0.869
+0.867
Neg logit contributions
 spont-1.081
 volunt-1.029
 Pv-1.026
ortment-0.880
 imperson-0.875
Token how
Token ID703
Feature activation+6.70e-02

Pos logit contributions
uel+0.095
+0.094
 WATCHED+0.089
lished+0.088
+0.088
Neg logit contributions
 spont-0.094
 volunt-0.089
 Pv-0.089
ortment-0.076
 imperson-0.075
Token one
Token ID530
Feature activation+1.46e-01

Pos logit contributions
uel+0.107
+0.105
 WATCHED+0.100
lished+0.099
+0.098
Neg logit contributions
 spont-0.116
 volunt-0.111
 Pv-0.110
ortment-0.094
 imperson-0.094
Token labels
Token ID14722
Feature activation-4.45e-01

Pos logit contributions
uel+0.219
+0.215
 WATCHED+0.203
lished+0.201
+0.200
Neg logit contributions
 spont-0.269
 volunt-0.256
 Pv-0.254
ortment-0.220
 imperson-0.218
Token oneself
Token ID27186
Feature activation-2.41e+00

Pos logit contributions
 spont+0.677
 volunt+0.640
 Pv+0.637
 imperson+0.534
ortment+0.531
Neg logit contributions
uel-0.807
-0.794
 WATCHED-0.754
lished-0.754
-0.753
Token.
Token ID13
Feature activation-1.63e-01

Pos logit contributions
 spont+3.666
 Pv+3.496
 volunt+3.472
 imperson+3.003
ortment+2.883
Neg logit contributions
uel-4.282
-4.208
lished-3.997
 WATCHED-3.980
-3.979
Token
Token ID447
Feature activation-2.91e-01

Pos logit contributions
 spont+0.282
 volunt+0.269
 Pv+0.265
 imperson+0.231
ortment+0.228
Neg logit contributions
uel-0.264
-0.260
 WATCHED-0.246
-0.246
lished-0.244
Token
Token ID251
Feature activation-4.14e-02

Pos logit contributions
 spont+0.402
 volunt+0.379
 Pv+0.375
 imperson+0.311
ortment+0.306
Neg logit contributions
uel-0.571
-0.565
-0.538
 WATCHED-0.538
lished-0.537
Token
Token ID198
Feature activation+9.69e-02

Pos logit contributions
 spont+0.071
 volunt+0.067
 Pv+0.066
 imperson+0.058
ortment+0.057
Neg logit contributions
uel-0.068
-0.067
 WATCHED-0.063
-0.063
lished-0.063
Token
Token ID198
Feature activation-9.06e-03

Pos logit contributions
uel+0.179
+0.177
lished+0.170
 WATCHED+0.168
+0.168
Neg logit contributions
 spont-0.144
 Pv-0.136
 volunt-0.135
 imperson-0.113
ortment-0.112
Token like
Token ID588
Feature activation-1.79e-01

Pos logit contributions
 spont+0.145
 volunt+0.137
 Pv+0.136
ortment+0.117
 imperson+0.116
Neg logit contributions
uel-0.149
-0.144
 WATCHED-0.138
-0.137
lished-0.136
Token a
Token ID257
Feature activation-6.17e-02

Pos logit contributions
 spont+0.281
 volunt+0.269
 Pv+0.265
ortment+0.229
 imperson+0.223
Neg logit contributions
uel-0.312
-0.306
 WATCHED-0.291
-0.290
lished-0.288
Token good
Token ID922
Feature activation-4.25e-02

Pos logit contributions
 spont+0.105
 volunt+0.100
 Pv+0.099
 imperson+0.086
ortment+0.086
Neg logit contributions
uel-0.101
-0.099
 WATCHED-0.093
-0.093
lished-0.093
Token ship
Token ID4074
Feature activation-1.16e-01

Pos logit contributions
 spont+0.070
 volunt+0.067
 Pv+0.066
ortment+0.057
 imperson+0.056
Neg logit contributions
uel-0.071
-0.070
 WATCHED-0.066
-0.066
lished-0.066
Token to
Token ID284
Feature activation-9.48e-02

Pos logit contributions
 spont+0.204
 volunt+0.194
 Pv+0.192
 imperson+0.166
ortment+0.165
Neg logit contributions
uel-0.182
-0.178
 WATCHED-0.169
lished-0.168
-0.168
Token remind
Token ID7101
Feature activation-2.35e+00

Pos logit contributions
 spont+0.167
 volunt+0.159
 Pv+0.158
ortment+0.136
 imperson+0.135
Neg logit contributions
uel-0.150
-0.146
 WATCHED-0.138
-0.138
lished-0.137
Token us
Token ID514
Feature activation-3.50e-01

Pos logit contributions
 spont+3.607
 Pv+3.450
 volunt+3.381
 imperson+2.881
ortment+2.769
Neg logit contributions
-4.154
uel-4.135
lished-4.002
 WATCHED-3.953
-3.881
Token of
Token ID286
Feature activation+6.62e-02

Pos logit contributions
 spont+0.610
 volunt+0.580
 Pv+0.578
 imperson+0.499
ortment+0.494
Neg logit contributions
uel-0.559
-0.549
lished-0.519
 WATCHED-0.518
-0.518
Token the
Token ID262
Feature activation-6.95e-02

Pos logit contributions
uel+0.112
+0.110
+0.104
 WATCHED+0.104
lished+0.103
Neg logit contributions
 spont-0.108
 volunt-0.103
 Pv-0.102
ortment-0.088
 imperson-0.087
Token power
Token ID1176
Feature activation-7.77e-01

Pos logit contributions
 spont+0.122
 volunt+0.116
 Pv+0.115
 imperson+0.100
ortment+0.099
Neg logit contributions
uel-0.110
-0.108
 WATCHED-0.102
-0.102
lished-0.102
Token of
Token ID286
Feature activation+1.06e-01

Pos logit contributions
 spont+1.417
 volunt+1.351
 Pv+1.344
 imperson+1.169
ortment+1.160
Neg logit contributions
uel-1.181
-1.157
lished-1.090
 WATCHED-1.089
-1.088
Tokenum
Token ID388
Feature activation-2.91e-01

Pos logit contributions
 spont+0.692
 volunt+0.663
 Pv+0.660
 imperson+0.584
ortment+0.581
Neg logit contributions
uel-0.431
-0.423
lished-0.394
 WATCHED-0.393
-0.392
Token.
Token ID13
Feature activation-1.28e-02

Pos logit contributions
 spont+0.479
 volunt+0.454
 Pv+0.452
 imperson+0.386
ortment+0.383
Neg logit contributions
uel-0.494
-0.485
 WATCHED-0.460
lished-0.460
-0.459
Token It
Token ID632
Feature activation+2.60e-01

Pos logit contributions
 spont+0.022
 volunt+0.021
 Pv+0.020
 imperson+0.018
ortment+0.017
Neg logit contributions
uel-0.021
-0.021
 WATCHED-0.020
-0.020
lished-0.020
Token is
Token ID318
Feature activation+3.90e-01

Pos logit contributions
uel+0.379
+0.369
 WATCHED+0.347
lished+0.345
+0.344
Neg logit contributions
 spont-0.494
 volunt-0.472
 Pv-0.469
 imperson-0.408
ortment-0.407
Token also
Token ID635
Feature activation+3.53e-01

Pos logit contributions
uel+0.554
+0.540
 WATCHED+0.506
lished+0.504
+0.503
Neg logit contributions
 spont-0.750
 volunt-0.719
 Pv-0.713
 imperson-0.624
ortment-0.623
Token cance
Token ID10450
Feature activation-2.35e+00

Pos logit contributions
uel+0.567
+0.551
 WATCHED+0.523
lished+0.517
+0.517
Neg logit contributions
 spont-0.612
 volunt-0.583
 Pv-0.577
ortment-0.499
 imperson-0.494
Tokenlling
Token ID2680
Feature activation-9.10e-01

Pos logit contributions
 spont+2.248
 Pv+2.120
 volunt+2.033
 imperson+1.564
ortment+1.517
Neg logit contributions
-5.416
uel-5.343
-5.216
 WATCHED-5.214
ーテ-5.188
Token similar
Token ID2092
Feature activation-4.23e-01

Pos logit contributions
 spont+1.536
 volunt+1.462
 Pv+1.453
 imperson+1.246
ortment+1.243
Neg logit contributions
uel-1.499
-1.465
 WATCHED-1.391
lished-1.381
-1.381
Token events
Token ID2995
Feature activation-3.36e-01

Pos logit contributions
 spont+0.724
 volunt+0.689
 Pv+0.685
ortment+0.593
 imperson+0.587
Neg logit contributions
uel-0.679
-0.666
 WATCHED-0.633
-0.624
ーン-0.621
Token planned
Token ID6027
Feature activation-2.36e-01

Pos logit contributions
 spont+0.559
 volunt+0.531
 Pv+0.526
ortment+0.449
 imperson+0.449
Neg logit contributions
uel-0.565
-0.552
 WATCHED-0.525
-0.521
lished-0.521
Token for
Token ID329
Feature activation+1.14e-01

Pos logit contributions
 spont+0.422
 volunt+0.403
 Pv+0.400
 imperson+0.346
ortment+0.345
Neg logit contributions
uel-0.366
-0.358
 WATCHED-0.338
lished-0.337
-0.337
Token about
Token ID546
Feature activation-4.74e-01

Pos logit contributions
uel+0.546
+0.533
lished+0.504
 WATCHED+0.504
+0.500
Neg logit contributions
 spont-0.621
 volunt-0.592
 Pv-0.590
 imperson-0.511
ortment-0.508
Token data
Token ID1366
Feature activation+2.29e-01

Pos logit contributions
 spont+0.782
 volunt+0.742
 Pv+0.739
ortment+0.633
 imperson+0.631
Neg logit contributions
uel-0.797
-0.780
 WATCHED-0.740
lished-0.737
-0.735
Token structures
Token ID8573
Feature activation-5.49e-01

Pos logit contributions
uel+0.384
+0.377
 WATCHED+0.356
+0.356
lished+0.356
Neg logit contributions
 spont-0.379
 volunt-0.360
 Pv-0.359
 imperson-0.306
ortment-0.304
Token/
Token ID14
Feature activation+2.45e-01

Pos logit contributions
 spont+0.916
 volunt+0.870
 Pv+0.866
 imperson+0.742
ortment+0.734
Neg logit contributions
uel-0.916
-0.901
lished-0.852
 WATCHED-0.852
-0.851
Tokenal
Token ID282
Feature activation-2.54e-01

Pos logit contributions
uel+0.392
+0.384
+0.361
 WATCHED+0.360
lished+0.359
Neg logit contributions
 spont-0.428
 volunt-0.407
 Pv-0.406
 imperson-0.349
ortment-0.348
Tokengorith
Token ID7727
Feature activation-2.29e+00

Pos logit contributions
 spont+0.364
 volunt+0.343
 Pv+0.341
 imperson+0.284
ortment+0.280
Neg logit contributions
uel-0.486
-0.478
lished-0.455
 WATCHED-0.455
-0.455
Tokenms
Token ID907
Feature activation-1.70e-01

Pos logit contributions
 spont+3.849
 Pv+3.627
 volunt+3.582
 imperson+3.120
ortment+3.094
Neg logit contributions
-3.729
uel-3.691
lished-3.521
 WATCHED-3.496
-3.483
Token and
Token ID290
Feature activation+6.49e-02

Pos logit contributions
 spont+0.295
 volunt+0.281
 Pv+0.279
 imperson+0.241
ortment+0.239
Neg logit contributions
uel-0.273
-0.267
 WATCHED-0.252
-0.252
lished-0.251
Token more
Token ID517
Feature activation+6.11e-04

Pos logit contributions
uel+0.101
+0.099
 WATCHED+0.094
lished+0.093
+0.093
Neg logit contributions
 spont-0.115
 volunt-0.110
 Pv-0.109
 imperson-0.095
ortment-0.094
Token about
Token ID546
Feature activation-3.63e-01

Pos logit contributions
uel+0.001
+0.001
 WATCHED+0.001
lished+0.001
+0.001
Neg logit contributions
 spont-0.001
 volunt-0.001
 Pv-0.001
 imperson-0.001
ortment-0.001
Token web
Token ID3992
Feature activation-1.32e-01

Pos logit contributions
 spont+0.596
 volunt+0.566
 Pv+0.564
ortment+0.482
 imperson+0.480
Neg logit contributions
uel-0.613
-0.601
 WATCHED-0.571
lished-0.568
-0.567
Token designers
Token ID15441
Feature activation+9.82e-02

Pos logit contributions
 spont+0.146
 volunt+0.137
 Pv+0.137
 imperson+0.114
ortment+0.113
Neg logit contributions
uel-0.190
-0.188
-0.179
 WATCHED-0.179
lished-0.179
Token and
Token ID290
Feature activation+6.73e-02

Pos logit contributions
uel+0.159
+0.155
+0.147
 WATCHED+0.147
lished+0.146
Neg logit contributions
 spont-0.169
 volunt-0.161
 Pv-0.160
 imperson-0.137
ortment-0.137
Token even
Token ID772
Feature activation-6.20e-01

Pos logit contributions
uel+0.118
+0.115
 WATCHED+0.109
+0.109
lished+0.109
Neg logit contributions
 spont-0.106
 volunt-0.101
 Pv-0.101
 imperson-0.085
ortment-0.085
Token the
Token ID262
Feature activation-2.41e-02

Pos logit contributions
 spont+1.018
 volunt+0.964
 Pv+0.961
 imperson+0.820
ortment+0.815
Neg logit contributions
uel-1.053
-1.034
 WATCHED-0.978
lished-0.977
-0.977
Token home
Token ID1363
Feature activation-4.90e-01

Pos logit contributions
 spont+0.040
 volunt+0.038
 Pv+0.038
 imperson+0.032
ortment+0.032
Neg logit contributions
uel-0.040
-0.040
 WATCHED-0.038
-0.037
lished-0.037
Token gard
Token ID16985
Feature activation-2.28e+00

Pos logit contributions
 spont+0.847
 volunt+0.805
 Pv+0.802
 imperson+0.692
ortment+0.685
Neg logit contributions
uel-0.789
-0.775
lished-0.732
 WATCHED-0.731
-0.731
Tokenener
Token ID877
Feature activation-1.17e+00

Pos logit contributions
 spont+4.032
 volunt+3.831
 Pv+3.831
 imperson+3.324
ortment+3.270
Neg logit contributions
uel-3.527
-3.524
lished-3.305
-3.287
 WATCHED-3.286
Token.
Token ID13
Feature activation-4.62e-02

Pos logit contributions
 spont+2.016
 volunt+1.916
 Pv+1.911
 imperson+1.649
ortment+1.621
Neg logit contributions
uel-1.885
-1.857
lished-1.765
 WATCHED-1.750
-1.744
Token In
Token ID554
Feature activation-3.63e-01

Pos logit contributions
 spont+0.076
 volunt+0.072
 Pv+0.071
 imperson+0.062
ortment+0.061
Neg logit contributions
uel-0.079
-0.077
-0.073
 WATCHED-0.073
lished-0.072
Token doing
Token ID1804
Feature activation+8.19e-02

Pos logit contributions
 spont+0.614
 volunt+0.583
 Pv+0.579
 imperson+0.498
ortment+0.495
Neg logit contributions
uel-0.601
-0.585
-0.554
lished-0.553
 WATCHED-0.553
Token so
Token ID523
Feature activation-3.26e-02

Pos logit contributions
uel+0.129
+0.126
 WATCHED+0.119
+0.118
lished+0.118
Neg logit contributions
 spont-0.145
 volunt-0.137
 Pv-0.136
ortment-0.118
 imperson-0.118
Token Caps
Token ID23534
Feature activation-4.77e-01

Pos logit contributions
 spont+0.116
 volunt+0.110
 Pv+0.109
 imperson+0.093
ortment+0.093
Neg logit contributions
uel-0.119
-0.117
lished-0.110
-0.110
 WATCHED-0.110
Token coach
Token ID3985
Feature activation+2.05e-03

Pos logit contributions
 spont+0.921
 volunt+0.881
 Pv+0.877
 imperson+0.770
ortment+0.763
Neg logit contributions
uel-0.671
-0.657
lished-0.615
 WATCHED-0.615
-0.614
Token Barry
Token ID14488
Feature activation-4.47e-01

Pos logit contributions
uel+0.004
+0.004
+0.004
 WATCHED+0.004
lished+0.004
Neg logit contributions
 spont-0.003
 volunt-0.003
 Pv-0.003
 imperson-0.002
ortment-0.002
Token Tro
Token ID8498
Feature activation-8.76e-01

Pos logit contributions
 spont+0.322
 volunt+0.296
 Pv+0.272
ortment+0.189
 imperson+0.162
Neg logit contributions
uel-1.191
-1.152
 WATCHED-1.109
 Jem-1.109
-1.104
Tokentz
Token ID22877
Feature activation+9.99e-01

Pos logit contributions
 spont+1.401
 volunt+1.331
 Pv+1.322
 imperson+1.121
ortment+1.111
Neg logit contributions
uel-1.530
-1.500
lished-1.424
 WATCHED-1.422
-1.420
Token reminded
Token ID14516
Feature activation-2.27e+00

Pos logit contributions
uel+1.504
+1.467
 WATCHED+1.374
lished+1.364
+1.355
Neg logit contributions
 spont-1.847
 volunt-1.766
 Pv-1.744
ortment-1.512
 imperson-1.497
Token DC
Token ID6257
Feature activation+2.93e-01

Pos logit contributions
 spont+3.993
 Pv+3.812
 volunt+3.785
 imperson+3.313
ortment+3.224
Neg logit contributions
uel-3.499
-3.479
lished-3.307
 WATCHED-3.305
-3.279
Token fans
Token ID3296
Feature activation-2.50e-02

Pos logit contributions
uel+0.451
+0.443
lished+0.415
 WATCHED+0.414
+0.414
Neg logit contributions
 spont-0.530
 volunt-0.505
 Pv-0.503
 imperson-0.436
ortment-0.433
Token that
Token ID326
Feature activation-7.82e-02

Pos logit contributions
 spont+0.041
 volunt+0.039
 Pv+0.039
 imperson+0.033
ortment+0.033
Neg logit contributions
uel-0.042
-0.042
 WATCHED-0.039
lished-0.039
-0.039
Token there
Token ID612
Feature activation-2.72e-01

Pos logit contributions
 spont+0.130
 volunt+0.124
 Pv+0.123
ortment+0.105
 imperson+0.105
Neg logit contributions
uel-0.131
-0.128
-0.121
 WATCHED-0.121
lished-0.120
Token was
Token ID373
Feature activation-1.05e-01

Pos logit contributions
 spont+0.492
 volunt+0.469
 Pv+0.467
 imperson+0.405
ortment+0.402
Neg logit contributions
uel-0.417
-0.409
lished-0.385
 WATCHED-0.385
-0.384
Token a
Token ID257
Feature activation+1.90e-02

Pos logit contributions
 spont+0.405
 volunt+0.385
 Pv+0.381
 imperson+0.325
ortment+0.324
Neg logit contributions
uel-0.439
-0.430
 WATCHED-0.409
lished-0.408
-0.407
Token fundraising
Token ID18066
Feature activation+1.64e-01

Pos logit contributions
uel+0.030
+0.029
 WATCHED+0.028
lished+0.027
+0.027
Neg logit contributions
 spont-0.033
 volunt-0.032
 Pv-0.032
ortment-0.027
 imperson-0.027
Token email
Token ID3053
Feature activation-5.34e-01

Pos logit contributions
uel+0.265
+0.260
 WATCHED+0.246
+0.245
lished+0.244
Neg logit contributions
 spont-0.281
 volunt-0.267
 Pv-0.266
 imperson-0.228
ortment-0.228
Token this
Token ID428
Feature activation+1.27e-01

Pos logit contributions
 spont+0.937
 volunt+0.891
 Pv+0.887
 imperson+0.766
ortment+0.760
Neg logit contributions
uel-0.849
-0.833
lished-0.786
 WATCHED-0.786
-0.785
Token week
Token ID1285
Feature activation-1.33e-01

Pos logit contributions
uel+0.195
+0.191
 WATCHED+0.180
+0.179
lished+0.179
Neg logit contributions
 spont-0.228
 volunt-0.217
 Pv-0.216
ortment-0.187
 imperson-0.187
Token reminding
Token ID30047
Feature activation-2.27e+00

Pos logit contributions
 spont+0.214
 volunt+0.203
 Pv+0.201
ortment+0.171
 imperson+0.170
Neg logit contributions
uel-0.229
-0.225
 WATCHED-0.214
-0.213
lished-0.213
Token voters
Token ID4446
Feature activation+1.55e-01

Pos logit contributions
 spont+4.153
 Pv+3.972
 volunt+3.964
 imperson+3.481
ortment+3.342
Neg logit contributions
uel-3.391
-3.336
lished-3.172
 WATCHED-3.163
-3.161
Token that
Token ID326
Feature activation+2.84e-02

Pos logit contributions
uel+0.258
+0.254
 WATCHED+0.240
lished+0.238
+0.238
Neg logit contributions
 spont-0.261
 volunt-0.248
 Pv-0.246
ortment-0.211
 imperson-0.210
Token the
Token ID262
Feature activation+3.03e-01

Pos logit contributions
uel+0.049
+0.048
 WATCHED+0.045
lished+0.045
+0.045
Neg logit contributions
 spont-0.046
 volunt-0.044
 Pv-0.043
ortment-0.037
 imperson-0.037
Token last
Token ID938
Feature activation-8.23e-02

Pos logit contributions
uel+0.462
+0.454
 WATCHED+0.427
+0.425
lished+0.425
Neg logit contributions
 spont-0.547
 volunt-0.522
 Pv-0.519
 imperson-0.450
ortment-0.450
Token time
Token ID640
Feature activation+2.34e-01

Pos logit contributions
 spont+0.154
 volunt+0.147
 Pv+0.146
 imperson+0.128
ortment+0.126
Neg logit contributions
uel-0.121
-0.119
lished-0.112
 WATCHED-0.111
-0.111
Token,
Token ID11
Feature activation-1.08e-01

Pos logit contributions
uel+0.171
+0.167
 WATCHED+0.158
+0.157
lished+0.157
Neg logit contributions
 spont-0.196
 volunt-0.186
 Pv-0.186
ortment-0.161
 imperson-0.160
Token on
Token ID319
Feature activation+4.88e-02

Pos logit contributions
 spont+0.176
 volunt+0.167
 Pv+0.166
 imperson+0.142
ortment+0.140
Neg logit contributions
uel-0.186
-0.182
lished-0.173
 WATCHED-0.173
-0.173
Token every
Token ID790
Feature activation-3.93e-01

Pos logit contributions
uel+0.065
+0.063
lished+0.059
+0.059
 WATCHED+0.059
Neg logit contributions
 spont-0.098
 volunt-0.094
 Pv-0.093
 imperson-0.082
ortment-0.082
Token occasion
Token ID6695
Feature activation-5.35e-01

Pos logit contributions
 spont+0.713
 volunt+0.680
 Pv+0.677
 imperson+0.588
ortment+0.583
Neg logit contributions
uel-0.599
-0.587
lished-0.553
 WATCHED-0.553
-0.552
Token,
Token ID11
Feature activation-2.46e-01

Pos logit contributions
 spont+0.925
 volunt+0.879
 Pv+0.876
ortment+0.753
 imperson+0.752
Neg logit contributions
uel-0.860
-0.844
lished-0.798
 WATCHED-0.797
-0.794
Token remind
Token ID7101
Feature activation-2.25e+00

Pos logit contributions
 spont+0.434
 volunt+0.412
 Pv+0.411
ortment+0.356
 imperson+0.353
Neg logit contributions
uel-0.386
-0.379
 WATCHED-0.358
lished-0.356
-0.356
Token Syria
Token ID4392
Feature activation+4.63e-01

Pos logit contributions
 spont+3.476
 Pv+3.285
 volunt+3.270
 imperson+2.795
ortment+2.683
Neg logit contributions
-3.969
uel-3.940
lished-3.777
 WATCHED-3.769
osponsors-3.722
Token that
Token ID326
Feature activation-1.75e-01

Pos logit contributions
uel+0.761
+0.742
 WATCHED+0.705
lished+0.703
+0.701
Neg logit contributions
 spont-0.784
 volunt-0.745
 Pv-0.740
ortment-0.635
 imperson-0.635
Token Hat
Token ID10983
Feature activation+7.34e-01

Pos logit contributions
 spont+0.287
 volunt+0.272
 Pv+0.270
 imperson+0.231
ortment+0.230
Neg logit contributions
uel-0.297
-0.291
 WATCHED-0.276
-0.275
lished-0.275
Tokenay
Token ID323
Feature activation-1.15e+00

Pos logit contributions
uel+1.288
+1.262
 WATCHED+1.199
lished+1.197
+1.197
Neg logit contributions
 spont-1.167
 volunt-1.104
 Pv-1.097
 imperson-0.933
ortment-0.922
Token in
Token ID287
Feature activation+1.11e-01

Pos logit contributions
 spont+2.052
 volunt+1.955
 Pv+1.949
 imperson+1.692
ortment+1.656
Neg logit contributions
uel-1.774
-1.752
lished-1.661
 WATCHED-1.645
-1.637
Token over
Token ID625
Feature activation-1.26e-01

Pos logit contributions
 spont+0.302
 volunt+0.287
 Pv+0.285
 imperson+0.245
ortment+0.243
Neg logit contributions
uel-0.288
-0.283
lished-0.267
 WATCHED-0.267
-0.267
Token the
Token ID262
Feature activation-2.29e-01

Pos logit contributions
 spont+0.213
 volunt+0.203
 Pv+0.201
 imperson+0.173
ortment+0.172
Neg logit contributions
uel-0.206
-0.202
 WATCHED-0.192
-0.191
lished-0.191
Token team
Token ID1074
Feature activation-4.27e-01

Pos logit contributions
 spont+0.386
 volunt+0.367
 Pv+0.365
 imperson+0.313
ortment+0.312
Neg logit contributions
uel-0.377
-0.369
 WATCHED-0.350
lished-0.348
-0.348
Token radio
Token ID5243
Feature activation-1.09e+00

Pos logit contributions
 spont+0.732
 volunt+0.696
 Pv+0.693
 imperson+0.597
ortment+0.591
Neg logit contributions
uel-0.693
-0.681
lished-0.643
 WATCHED-0.643
-0.642
Token in
Token ID287
Feature activation-4.58e-02

Pos logit contributions
 spont+1.801
 volunt+1.705
 Pv+1.701
 imperson+1.460
ortment+1.437
Neg logit contributions
uel-1.828
-1.805
lished-1.705
-1.700
 WATCHED-1.698
Token FP
Token ID31459
Feature activation-2.25e+00

Pos logit contributions
 spont+0.076
 volunt+0.072
 Pv+0.072
 imperson+0.062
ortment+0.061
Neg logit contributions
uel-0.077
-0.075
 WATCHED-0.071
lished-0.071
-0.071
Token1
Token ID16
Feature activation+2.52e-01

Pos logit contributions
 spont+3.832
 Pv+3.638
 volunt+3.605
 imperson+3.122
ortment+3.119
Neg logit contributions
uel-3.611
-3.591
lished-3.411
 WATCHED-3.394
osponsors-3.356
Token at
Token ID379
Feature activation-2.58e-01

Pos logit contributions
uel+0.414
+0.403
 WATCHED+0.382
+0.380
lished+0.378
Neg logit contributions
 spont-0.428
 volunt-0.409
 Pv-0.404
ortment-0.349
 imperson-0.346
Token Suz
Token ID22691
Feature activation-7.37e-01

Pos logit contributions
 spont+0.439
 volunt+0.417
 Pv+0.414
ortment+0.357
 imperson+0.357
Neg logit contributions
uel-0.423
-0.413
 WATCHED-0.390
-0.389
lished-0.389
Tokenuka
Token ID14852
Feature activation-5.24e-01

Pos logit contributions
 spont+1.589
 volunt+1.526
 Pv+1.522
 imperson+1.355
ortment+1.345
Neg logit contributions
uel-0.871
-0.850
lished-0.785
 WATCHED-0.785
-0.783
Token to
Token ID284
Feature activation+2.04e-01

Pos logit contributions
 spont+0.867
 volunt+0.822
 Pv+0.818
 imperson+0.699
ortment+0.694
Neg logit contributions
uel-0.886
-0.870
 WATCHED-0.824
lished-0.823
-0.823
Tokens
Token ID82
Feature activation+3.39e-02

Pos logit contributions
uel+0.065
+0.064
lished+0.060
 WATCHED+0.060
+0.060
Neg logit contributions
 spont-0.093
 volunt-0.089
 Pv-0.089
 imperson-0.078
ortment-0.078
Token Lara
Token ID29849
Feature activation+1.51e-01

Pos logit contributions
uel+0.055
+0.054
 WATCHED+0.051
+0.051
lished+0.051
Neg logit contributions
 spont-0.058
 volunt-0.055
 Pv-0.054
 imperson-0.047
ortment-0.047
Token Sel
Token ID15300
Feature activation+5.48e-01

Pos logit contributions
uel+0.244
+0.237
+0.224
lished+0.224
 WATCHED+0.224
Neg logit contributions
 spont-0.261
 volunt-0.248
 Pv-0.245
 imperson-0.213
ortment-0.210
Tokenig
Token ID328
Feature activation-1.24e-01

Pos logit contributions
uel+0.913
+0.870
lished+0.834
 WATCHED+0.830
+0.830
Neg logit contributions
 spont-0.931
 volunt-0.888
 Pv-0.872
 imperson-0.751
ortment-0.741
Tokenman
Token ID805
Feature activation-3.28e-01

Pos logit contributions
 spont+0.208
 volunt+0.198
 Pv+0.196
 imperson+0.168
ortment+0.167
Neg logit contributions
uel-0.207
-0.202
lished-0.191
 WATCHED-0.191
-0.191
Token reminded
Token ID14516
Feature activation-2.24e+00

Pos logit contributions
 spont+0.544
 volunt+0.516
 Pv+0.513
 imperson+0.439
ortment+0.435
Neg logit contributions
uel-0.552
-0.542
 WATCHED-0.513
lished-0.513
-0.512
Token folks
Token ID7974
Feature activation+2.63e-01

Pos logit contributions
 spont+3.717
 Pv+3.552
 volunt+3.509
 imperson+3.043
ortment+2.956
Neg logit contributions
uel-3.698
-3.676
lished-3.495
 WATCHED-3.494
-3.443
Token Tuesday
Token ID3431
Feature activation-6.82e-02

Pos logit contributions
uel+0.423
+0.414
 WATCHED+0.391
+0.389
lished+0.388
Neg logit contributions
 spont-0.457
 volunt-0.435
 Pv-0.431
ortment-0.373
 imperson-0.371
Token,
Token ID11
Feature activation+9.36e-03

Pos logit contributions
 spont+0.114
 volunt+0.108
 Pv+0.107
 imperson+0.091
ortment+0.091
Neg logit contributions
uel-0.115
-0.112
 WATCHED-0.107
lished-0.106
-0.106
Token is
Token ID318
Feature activation+6.69e-02

Pos logit contributions
uel+0.014
+0.013
 WATCHED+0.013
+0.013
lished+0.013
Neg logit contributions
 spont-0.018
 volunt-0.017
 Pv-0.017
 imperson-0.015
ortment-0.014
Token a
Token ID257
Feature activation-3.67e-02

Pos logit contributions
uel+0.107
+0.105
 WATCHED+0.100
+0.099
lished+0.099
Neg logit contributions
 spont-0.116
 volunt-0.111
 Pv-0.109
ortment-0.095
 imperson-0.094
Token years
Token ID812
Feature activation+1.25e-01

Pos logit contributions
uel+0.150
+0.148
+0.141
 WATCHED+0.141
lished+0.140
Neg logit contributions
 spont-0.146
 volunt-0.138
 Pv-0.137
 imperson-0.117
ortment-0.116
Token away
Token ID1497
Feature activation-2.47e-01

Pos logit contributions
uel+0.251
+0.246
 WATCHED+0.236
+0.235
lished+0.234
Neg logit contributions
 spont-0.164
 volunt-0.154
 Pv-0.153
ortment-0.126
 imperson-0.123
Token from
Token ID422
Feature activation-1.47e-01

Pos logit contributions
 spont+0.419
 volunt+0.399
 Pv+0.397
 imperson+0.341
ortment+0.340
Neg logit contributions
uel-0.403
-0.396
 WATCHED-0.375
-0.375
lished-0.375
Token a
Token ID257
Feature activation-1.28e-01

Pos logit contributions
 spont+0.244
 volunt+0.232
 Pv+0.231
ortment+0.198
 imperson+0.197
Neg logit contributions
uel-0.245
-0.241
 WATCHED-0.228
-0.227
lished-0.227
Token driving
Token ID5059
Feature activation-9.47e-01

Pos logit contributions
 spont+0.221
 volunt+0.210
 Pv+0.209
 imperson+0.180
ortment+0.178
Neg logit contributions
uel-0.206
-0.202
lished-0.191
 WATCHED-0.191
-0.191
Token licence
Token ID17098
Feature activation-2.23e+00

Pos logit contributions
 spont+1.471
 volunt+1.390
 Pv+1.384
 imperson+1.172
ortment+1.152
Neg logit contributions
uel-1.697
-1.670
lished-1.588
 WATCHED-1.586
-1.585
Token.
Token ID13
Feature activation-1.05e-01

Pos logit contributions
 spont+3.520
 volunt+3.352
 Pv+3.332
 imperson+2.861
ortment+2.777
Neg logit contributions
uel-3.885
-3.833
lished-3.650
 WATCHED-3.648
-3.620
Token However
Token ID2102
Feature activation+1.76e-01

Pos logit contributions
 spont+0.173
 volunt+0.164
 Pv+0.161
 imperson+0.140
ortment+0.137
Neg logit contributions
uel-0.180
-0.177
 WATCHED-0.168
-0.168
lished-0.166
Token,
Token ID11
Feature activation+1.20e-01

Pos logit contributions
uel+0.281
+0.275
 WATCHED+0.260
lished+0.260
+0.259
Neg logit contributions
 spont-0.304
 volunt-0.289
 Pv-0.288
ortment-0.248
 imperson-0.247
Token they
Token ID484
Feature activation+1.40e-01

Pos logit contributions
uel+0.202
+0.197
 WATCHED+0.188
+0.186
lished+0.185
Neg logit contributions
 spont-0.196
 volunt-0.186
 Pv-0.185
ortment-0.160
 imperson-0.158
Token can
Token ID460
Feature activation-1.21e-01

Pos logit contributions
uel+0.208
+0.203
 WATCHED+0.192
+0.190
lished+0.189
Neg logit contributions
 spont-0.261
 volunt-0.248
 Pv-0.248
 imperson-0.215
ortment-0.215
Token of
Token ID286
Feature activation-1.33e-01

Pos logit contributions
 spont+1.047
 volunt+0.999
 Pv+0.994
 imperson+0.865
ortment+0.859
Neg logit contributions
uel-0.858
-0.841
lished-0.791
 WATCHED-0.791
-0.790
Token a
Token ID257
Feature activation-1.27e-01

Pos logit contributions
 spont+0.216
 volunt+0.205
 Pv+0.204
ortment+0.175
 imperson+0.173
Neg logit contributions
uel-0.226
-0.222
 WATCHED-0.211
lished-0.210
-0.209
Token new
Token ID649
Feature activation+4.78e-02

Pos logit contributions
 spont+0.202
 volunt+0.192
 Pv+0.191
ortment+0.162
 imperson+0.162
Neg logit contributions
uel-0.221
-0.218
 WATCHED-0.206
lished-0.206
-0.205
Token mood
Token ID10038
Feature activation-6.94e-01

Pos logit contributions
uel+0.073
+0.072
 WATCHED+0.067
+0.067
lished+0.067
Neg logit contributions
 spont-0.086
 volunt-0.082
 Pv-0.082
ortment-0.071
 imperson-0.071
Token of
Token ID286
Feature activation-8.32e-02

Pos logit contributions
 spont+1.159
 volunt+1.099
 Pv+1.095
 imperson+0.938
ortment+0.928
Neg logit contributions
uel-1.159
-1.139
lished-1.079
 WATCHED-1.078
-1.077
Token cens
Token ID12428
Feature activation-2.19e+00

Pos logit contributions
 spont+0.136
 volunt+0.130
 Pv+0.130
ortment+0.111
 imperson+0.110
Neg logit contributions
uel-0.139
-0.136
lished-0.129
 WATCHED-0.128
-0.127
Tokenorious
Token ID9982
Feature activation+3.02e-01

Pos logit contributions
 spont+4.854
 Pv+4.683
 volunt+4.668
ortment+4.241
 imperson+4.171
Neg logit contributions
-2.318
uel-2.294
lished-2.137
 WATCHED-2.126
-2.106
Token political
Token ID1964
Feature activation-3.61e-01

Pos logit contributions
uel+0.519
+0.507
 WATCHED+0.479
+0.478
 Jem+0.476
Neg logit contributions
 spont-0.472
 Pv-0.454
 volunt-0.453
ortment-0.385
 imperson-0.377
Token correctness
Token ID29409
Feature activation-9.94e-02

Pos logit contributions
 spont+0.409
 Pv+0.385
 volunt+0.384
ortment+0.301
 imperson+0.296
Neg logit contributions
uel-0.780
-0.770
 WATCHED-0.738
-0.733
lished-0.727
Token sweeping
Token ID18404
Feature activation+1.49e-01

Pos logit contributions
 spont+0.169
 volunt+0.161
 Pv+0.159
ortment+0.138
 imperson+0.136
Neg logit contributions
uel-0.163
-0.159
 WATCHED-0.151
-0.151
lished-0.149
Token university
Token ID6403
Feature activation-3.13e-02

Pos logit contributions
uel+0.260
+0.254
lished+0.242
+0.242
 WATCHED+0.241
Neg logit contributions
 spont-0.235
 volunt-0.223
 Pv-0.221
ortment-0.188
 imperson-0.187
Token.
Token ID13
Feature activation+3.39e-02

Pos logit contributions
 spont+0.597
 volunt+0.566
 Pv+0.565
 imperson+0.480
ortment+0.480
Neg logit contributions
uel-0.626
-0.614
-0.582
 WATCHED-0.582
lished-0.580
Token
Token ID198
Feature activation+8.91e-02

Pos logit contributions
uel+0.057
+0.056
+0.053
 WATCHED+0.053
ーン+0.053
Neg logit contributions
 spont-0.057
 volunt-0.054
 Pv-0.052
 imperson-0.047
ortment-0.045
Token
Token ID198
Feature activation-3.09e-02

Pos logit contributions
uel+0.165
+0.163
lished+0.156
 WATCHED+0.155
+0.155
Neg logit contributions
 spont-0.132
 Pv-0.125
 volunt-0.124
 imperson-0.103
ortment-0.103
TokenThe
Token ID464
Feature activation+1.80e-01

Pos logit contributions
 spont+0.058
 volunt+0.056
 Pv+0.055
 imperson+0.049
ortment+0.048
Neg logit contributions
uel-0.045
-0.044
-0.042
 WATCHED-0.041
lished-0.041
Token phrase
Token ID9546
Feature activation+2.15e-01

Pos logit contributions
uel+0.297
+0.292
 WATCHED+0.277
+0.276
lished+0.274
Neg logit contributions
 spont-0.304
 volunt-0.288
 Pv-0.287
ortment-0.247
 imperson-0.245
Token reminds
Token ID17603
Feature activation-2.18e+00

Pos logit contributions
uel+0.319
+0.313
 WATCHED+0.293
lished+0.293
+0.293
Neg logit contributions
 spont-0.401
 volunt-0.382
 Pv-0.381
 imperson-0.332
ortment-0.329
Token me
Token ID502
Feature activation+3.48e-02

Pos logit contributions
 spont+3.348
 Pv+3.182
 volunt+3.178
 imperson+2.692
ortment+2.602
Neg logit contributions
uel-3.874
-3.848
 WATCHED-3.655
lished-3.637
-3.609
Token of
Token ID286
Feature activation+1.08e-01

Pos logit contributions
uel+0.051
+0.049
 WATCHED+0.047
+0.046
ーン+0.046
Neg logit contributions
 spont-0.065
 volunt-0.063
 Pv-0.062
 imperson-0.054
ortment-0.054
Token a
Token ID257
Feature activation+9.16e-02

Pos logit contributions
uel+0.182
+0.179
 WATCHED+0.171
+0.170
ーン+0.168
Neg logit contributions
 spont-0.177
 volunt-0.168
 Pv-0.167
ortment-0.145
 imperson-0.142
Token patient
Token ID5827
Feature activation+6.80e-01

Pos logit contributions
uel+0.152
+0.149
 WATCHED+0.141
+0.140
lished+0.140
Neg logit contributions
 spont-0.153
 volunt-0.146
 Pv-0.145
ortment-0.125
 imperson-0.124
Token whose
Token ID3025
Feature activation+6.80e-02

Pos logit contributions
uel+1.015
+0.978
 WATCHED+0.935
lished+0.919
+0.919
Neg logit contributions
 spont-1.254
 volunt-1.194
 Pv-1.192
ortment-1.050
 imperson-1.023
Token(
Token ID7
Feature activation-3.37e-01

Pos logit contributions
uel+0.154
+0.150
lished+0.146
 WATCHED+0.143
+0.141
Neg logit contributions
 spont-0.173
 Pv-0.167
 volunt-0.164
 imperson-0.142
ortment-0.141
Token0
Token ID15
Feature activation-9.40e-02

Pos logit contributions
 spont+0.611
 volunt+0.583
 Pv+0.579
 imperson+0.504
ortment+0.499
Neg logit contributions
uel-0.516
-0.506
 WATCHED-0.476
lished-0.476
-0.476
Tokens
Token ID82
Feature activation-3.61e-01

Pos logit contributions
 spont+0.178
 Pv+0.171
 volunt+0.170
 imperson+0.149
ortment+0.147
Neg logit contributions
uel-0.134
-0.132
lished-0.126
 WATCHED-0.124
-0.124
Token)}
Token ID38165
Feature activation-4.72e-01

Pos logit contributions
 spont+0.514
 Pv+0.493
 volunt+0.478
 imperson+0.396
ortment+0.396
Neg logit contributions
uel-0.682
-0.673
lished-0.654
 WATCHED-0.644
-0.643
Token c
Token ID269
Feature activation-3.76e-03

Pos logit contributions
 spont+0.819
 volunt+0.779
 Pv+0.777
 imperson+0.669
ortment+0.661
Neg logit contributions
uel-0.759
-0.745
lished-0.706
-0.703
 WATCHED-0.703
Tokenxd
Token ID24954
Feature activation-2.17e+00

Pos logit contributions
 spont+0.006
 volunt+0.005
 Pv+0.005
 imperson+0.005
ortment+0.004
Neg logit contributions
uel-0.007
-0.007
lished-0.006
 WATCHED-0.006
-0.006
Token4
Token ID19
Feature activation+1.17e-01

Pos logit contributions
 spont+3.282
 Pv+3.164
 volunt+3.076
 imperson+2.566
ortment+2.528
Neg logit contributions
uel-3.784
-3.740
lished-3.698
 WATCHED-3.598
ーン-3.580
Token {
Token ID1391
Feature activation+6.69e-03

Pos logit contributions
uel+0.125
+0.121
 WATCHED+0.111
lished+0.111
+0.111
Neg logit contributions
 spont-0.266
 volunt-0.256
 Pv-0.254
 imperson-0.228
ortment-0.227
Token(
Token ID7
Feature activation-4.82e-01

Pos logit contributions
uel+0.010
+0.010
lished+0.010
 WATCHED+0.010
+0.010
Neg logit contributions
 spont-0.012
 Pv-0.011
 volunt-0.011
 imperson-0.010
ortment-0.010
Token341
Token ID33660
Feature activation-3.36e-01

Pos logit contributions
 spont+0.866
 volunt+0.825
 Pv+0.822
 imperson+0.713
ortment+0.706
Neg logit contributions
uel-0.745
-0.731
lished-0.689
 WATCHED-0.688
-0.687
Tokens
Token ID82
Feature activation-5.87e-01

Pos logit contributions
 spont+0.630
 Pv+0.607
 volunt+0.600
 imperson+0.526
ortment+0.520
Neg logit contributions
uel-0.483
-0.480
lished-0.457
 WATCHED-0.450
-0.449
Token Thanks
Token ID6930
Feature activation+4.01e-01

Pos logit contributions
uel+0.063
+0.062
+0.059
 WATCHED+0.059
lished+0.058
Neg logit contributions
 spont-0.065
 volunt-0.062
 Pv-0.061
 imperson-0.054
ortment-0.053
Token,
Token ID11
Feature activation-8.15e-03

Pos logit contributions
uel+0.575
+0.560
+0.525
 WATCHED+0.524
ーン+0.521
Neg logit contributions
 spont-0.760
 volunt-0.728
 Pv-0.721
 imperson-0.633
ortment-0.632
Token Jo
Token ID5302
Feature activation-3.07e-01

Pos logit contributions
 spont+0.017
 volunt+0.017
 Pv+0.017
 imperson+0.015
ortment+0.015
Neg logit contributions
uel-0.010
-0.010
 WATCHED-0.009
-0.009
lished-0.009
Token,
Token ID11
Feature activation-9.19e-03

Pos logit contributions
 spont+0.518
 volunt+0.492
 Pv+0.489
 imperson+0.420
ortment+0.417
Neg logit contributions
uel-0.507
-0.498
lished-0.471
 WATCHED-0.470
-0.470
Token for
Token ID329
Feature activation-9.86e-02

Pos logit contributions
 spont+0.017
 volunt+0.016
 Pv+0.016
 imperson+0.014
ortment+0.014
Neg logit contributions
uel-0.014
-0.014
lished-0.013
-0.013
 WATCHED-0.013
Token reminding
Token ID30047
Feature activation-2.16e+00

Pos logit contributions
 spont+0.155
 volunt+0.146
 Pv+0.146
 imperson+0.123
ortment+0.123
Neg logit contributions
uel-0.174
-0.170
 WATCHED-0.162
-0.162
lished-0.161
Token us
Token ID514
Feature activation-2.38e-01

Pos logit contributions
 spont+3.209
 Pv+3.030
 volunt+3.030
 imperson+2.523
ortment+2.433
Neg logit contributions
-3.947
uel-3.906
lished-3.780
 WATCHED-3.751
-3.695
Token not
Token ID407
Feature activation+1.54e-01

Pos logit contributions
 spont+0.411
 volunt+0.391
 Pv+0.389
 imperson+0.335
ortment+0.333
Neg logit contributions
uel-0.384
-0.377
 WATCHED-0.356
lished-0.355
-0.355
Token to
Token ID284
Feature activation+2.04e-01

Pos logit contributions
uel+0.244
+0.239
 WATCHED+0.226
lished+0.226
+0.225
Neg logit contributions
 spont-0.270
 volunt-0.257
 Pv-0.256
 imperson-0.221
ortment-0.219
Token take
Token ID1011
Feature activation-5.67e-01

Pos logit contributions
uel+0.331
+0.322
 WATCHED+0.307
+0.302
lished+0.301
Neg logit contributions
 spont-0.349
 volunt-0.331
 Pv-0.329
ortment-0.284
 imperson-0.281
Token the
Token ID262
Feature activation+2.07e-02

Pos logit contributions
 spont+0.879
 volunt+0.831
 Pv+0.827
 imperson+0.699
ortment+0.692
Neg logit contributions
uel-1.014
-0.998
lished-0.948
 WATCHED-0.948
-0.946
Token regulatory
Token ID11344
Feature activation-1.07e+00

Pos logit contributions
 spont+0.118
 volunt+0.112
 Pv+0.112
ortment+0.096
 imperson+0.095
Neg logit contributions
uel-0.114
-0.111
 WATCHED-0.106
lished-0.105
-0.105
Token bodies
Token ID5920
Feature activation-3.05e-01

Pos logit contributions
 spont+1.482
 volunt+1.393
 Pv+1.381
 imperson+1.150
ortment+1.106
Neg logit contributions
uel-2.094
-2.060
lished-1.985
 WATCHED-1.965
-1.956
Token creating
Token ID4441
Feature activation-4.04e-01

Pos logit contributions
 spont+0.501
 volunt+0.474
 Pv+0.471
 imperson+0.401
ortment+0.400
Neg logit contributions
uel-0.518
-0.509
 WATCHED-0.484
-0.482
lished-0.481
Token the
Token ID262
Feature activation-3.73e-02

Pos logit contributions
 spont+0.655
 volunt+0.619
 Pv+0.617
 imperson+0.524
ortment+0.523
Neg logit contributions
uel-0.695
-0.682
 WATCHED-0.647
lished-0.647
-0.646
Token black
Token ID2042
Feature activation+1.66e-01

Pos logit contributions
 spont+0.061
 volunt+0.057
 Pv+0.057
ortment+0.049
 imperson+0.049
Neg logit contributions
uel-0.064
-0.062
 WATCHED-0.059
-0.059
lished-0.059
Tokenlists
Token ID20713
Feature activation-2.16e+00

Pos logit contributions
uel+0.327
+0.318
+0.305
lished+0.304
 WATCHED+0.304
Neg logit contributions
 spont-0.231
 volunt-0.216
 Pv-0.215
ortment-0.176
 imperson-0.175
Token.
Token ID13
Feature activation+4.85e-03

Pos logit contributions
 spont+3.387
 Pv+3.202
 volunt+3.199
 imperson+2.742
ortment+2.632
Neg logit contributions
uel-3.772
-3.747
lished-3.566
 WATCHED-3.525
-3.516
Token
Token ID198
Feature activation-2.10e-02

Pos logit contributions
uel+0.008
+0.008
 WATCHED+0.008
+0.008
lished+0.008
Neg logit contributions
 spont-0.008
 volunt-0.007
 Pv-0.007
 imperson-0.006
ortment-0.006
Token
Token ID198
Feature activation-9.01e-02

Pos logit contributions
 spont+0.030
 Pv+0.029
 volunt+0.028
 imperson+0.024
ortment+0.024
Neg logit contributions
uel-0.039
-0.039
lished-0.038
 WATCHED-0.037
-0.037
TokenEFF
Token ID37267
Feature activation+6.92e-02

Pos logit contributions
 spont+0.175
 volunt+0.168
 Pv+0.167
 imperson+0.147
ortment+0.145
Neg logit contributions
uel-0.126
-0.123
lished-0.115
 WATCHED-0.115
-0.115
Token director
Token ID3437
Feature activation+4.85e-01

Pos logit contributions
uel+0.102
+0.100
lished+0.094
 WATCHED+0.094
+0.094
Neg logit contributions
 spont-0.129
 volunt-0.123
 Pv-0.123
 imperson-0.107
ortment-0.107
Token the
Token ID262
Feature activation-1.83e-01

Pos logit contributions
 spont+0.221
 volunt+0.210
 Pv+0.209
 imperson+0.178
ortment+0.177
Neg logit contributions
uel-0.226
-0.221
lished-0.210
 WATCHED-0.210
-0.209
Token sea
Token ID5417
Feature activation-5.11e-01

Pos logit contributions
 spont+0.333
 volunt+0.318
 Pv+0.317
 imperson+0.275
ortment+0.273
Neg logit contributions
uel-0.277
-0.270
 WATCHED-0.255
lished-0.255
-0.254
Token to
Token ID284
Feature activation-3.58e-02

Pos logit contributions
 spont+0.860
 volunt+0.816
 Pv+0.813
 imperson+0.697
ortment+0.691
Neg logit contributions
uel-0.846
-0.831
 WATCHED-0.786
lished-0.786
-0.785
Token be
Token ID307
Feature activation+1.40e-01

Pos logit contributions
 spont+0.064
 volunt+0.061
 Pv+0.061
ortment+0.053
 imperson+0.053
Neg logit contributions
uel-0.055
-0.054
 WATCHED-0.051
lished-0.051
-0.050
Token continually
Token ID17451
Feature activation-1.04e-01

Pos logit contributions
uel+0.219
+0.214
 WATCHED+0.204
lished+0.202
+0.200
Neg logit contributions
 spont-0.247
 volunt-0.235
 Pv-0.234
 imperson-0.202
ortment-0.202
Token reminded
Token ID14516
Feature activation-2.13e+00

Pos logit contributions
 spont+0.195
 volunt+0.187
 Pv+0.186
 imperson+0.162
ortment+0.161
Neg logit contributions
uel-0.151
-0.148
 WATCHED-0.139
lished-0.139
-0.138
Token of
Token ID286
Feature activation-5.65e-02

Pos logit contributions
 spont+3.551
 Pv+3.391
 volunt+3.369
 imperson+2.917
ortment+2.823
Neg logit contributions
-3.487
uel-3.465
lished-3.317
osponsors-3.288
-3.264
Token this
Token ID428
Feature activation-2.90e-01

Pos logit contributions
 spont+0.092
 volunt+0.087
 Pv+0.087
ortment+0.074
 imperson+0.074
Neg logit contributions
uel-0.096
-0.094
 WATCHED-0.089
lished-0.089
-0.089
Token connection
Token ID4637
Feature activation+3.60e-01

Pos logit contributions
 spont+0.504
 volunt+0.480
 Pv+0.478
 imperson+0.411
ortment+0.410
Neg logit contributions
uel-0.465
-0.455
 WATCHED-0.431
lished-0.430
-0.429
Token.
Token ID13
Feature activation-3.32e-02

Pos logit contributions
uel+0.596
+0.578
+0.550
 WATCHED+0.550
lished+0.547
Neg logit contributions
 spont-0.606
 volunt-0.578
 Pv-0.571
ortment-0.492
 imperson-0.491
Token World
Token ID2159
Feature activation-3.33e-01

Pos logit contributions
 spont+0.054
 volunt+0.051
 Pv+0.050
 imperson+0.044
ortment+0.043
Neg logit contributions
uel-0.057
-0.056
 WATCHED-0.053
-0.053
lished-0.053
Token hide
Token ID7808
Feature activation+7.73e-02

Pos logit contributions
 spont+0.177
 volunt+0.168
 Pv+0.166
ortment+0.143
 imperson+0.139
Neg logit contributions
uel-0.192
-0.187
 WATCHED-0.179
-0.177
lished-0.176
Token,
Token ID11
Feature activation+5.71e-02

Pos logit contributions
uel+0.128
+0.125
 WATCHED+0.120
+0.119
lished+0.118
Neg logit contributions
 spont-0.131
 volunt-0.124
 Pv-0.122
ortment-0.105
 imperson-0.105
Token right
Token ID826
Feature activation-1.84e-02

Pos logit contributions
uel+0.092
+0.091
 WATCHED+0.086
+0.085
lished+0.085
Neg logit contributions
 spont-0.098
 volunt-0.093
 Pv-0.093
ortment-0.080
 imperson-0.080
Token?
Token ID30
Feature activation+2.33e-01

Pos logit contributions
 spont+0.033
 volunt+0.031
 Pv+0.031
 imperson+0.027
ortment+0.027
Neg logit contributions
uel-0.029
-0.028
 WATCHED-0.027
-0.027
lished-0.027
Token Then
Token ID3244
Feature activation+3.16e-01

Pos logit contributions
uel+0.377
+0.371
 WATCHED+0.353
+0.351
ーン+0.344
Neg logit contributions
 spont-0.402
 volunt-0.383
 Pv-0.373
 imperson-0.330
ortment-0.324
Token remind
Token ID7101
Feature activation-2.13e+00

Pos logit contributions
uel+0.536
+0.523
 WATCHED+0.500
+0.494
lished+0.492
Neg logit contributions
 spont-0.518
 volunt-0.490
 Pv-0.486
ortment-0.418
 imperson-0.412
Token them
Token ID606
Feature activation-2.67e-01

Pos logit contributions
 spont+2.840
 Pv+2.659
 volunt+2.658
 imperson+2.203
ortment+2.087
Neg logit contributions
uel-4.226
-4.175
lished-4.029
 WATCHED-3.986
-3.973
Token that
Token ID326
Feature activation-9.80e-02

Pos logit contributions
 spont+0.447
 volunt+0.424
 Pv+0.422
 imperson+0.361
ortment+0.360
Neg logit contributions
uel-0.444
-0.436
 WATCHED-0.414
lished-0.412
-0.412
Token strong
Token ID1913
Feature activation-7.37e-01

Pos logit contributions
 spont+0.160
 volunt+0.152
 Pv+0.151
ortment+0.130
 imperson+0.128
Neg logit contributions
uel-0.166
-0.163
 WATCHED-0.155
-0.154
lished-0.153
Token encryption
Token ID15835
Feature activation-2.52e-01

Pos logit contributions
 spont+1.033
 volunt+0.971
 Pv+0.967
 imperson+0.795
ortment+0.794
Neg logit contributions
uel-1.428
-1.408
 WATCHED-1.341
-1.339
lished-1.338
Token is
Token ID318
Feature activation+1.78e-02

Pos logit contributions
 spont+0.471
 volunt+0.450
 Pv+0.448
 imperson+0.391
ortment+0.387
Neg logit contributions
uel-0.371
-0.364
lished-0.343
 WATCHED-0.341
-0.341
Token compromise
Token ID13110
Feature activation+4.03e-01

Pos logit contributions
uel+0.403
+0.395
lished+0.371
 WATCHED+0.371
+0.371
Neg logit contributions
 spont-0.501
 volunt-0.477
 Pv-0.475
 imperson-0.414
ortment-0.411
Token sentence
Token ID6827
Feature activation+1.11e+00

Pos logit contributions
uel+0.657
+0.644
+0.608
 WATCHED+0.607
lished+0.606
Neg logit contributions
 spont-0.686
 volunt-0.645
 Pv-0.644
 imperson-0.556
ortment-0.554
Token that
Token ID326
Feature activation+1.27e-01

Pos logit contributions
uel+1.760
+1.712
 WATCHED+1.626
+1.622
lished+1.612
Neg logit contributions
 spont-1.920
 volunt-1.807
 Pv-1.801
ortment-1.555
 imperson-1.549
Token still
Token ID991
Feature activation+6.53e-01

Pos logit contributions
uel+0.187
+0.183
lished+0.173
 WATCHED+0.173
+0.172
Neg logit contributions
 spont-0.238
 volunt-0.227
 Pv-0.226
 imperson-0.198
ortment-0.195
Token involves
Token ID9018
Feature activation-2.76e-01

Pos logit contributions
uel+0.942
+0.923
lished+0.866
 WATCHED+0.865
+0.864
Neg logit contributions
 spont-1.241
 volunt-1.185
 Pv-1.180
 imperson-1.033
ortment-1.024
Token prison
Token ID3770
Feature activation+1.63e+00

Pos logit contributions
 spont+0.460
 volunt+0.437
 Pv+0.435
 imperson+0.373
ortment+0.368
Neg logit contributions
uel-0.463
-0.455
lished-0.431
 WATCHED-0.431
-0.430
Token but
Token ID475
Feature activation+4.66e-01

Pos logit contributions
uel+2.185
+2.123
+1.994
 WATCHED+1.976
lished+1.965
Neg logit contributions
 spont-3.202
 Pv-3.046
 volunt-3.046
ortment-2.691
 imperson-2.666
Token is
Token ID318
Feature activation+1.52e-01

Pos logit contributions
uel+0.708
+0.693
lished+0.653
 WATCHED+0.652
+0.651
Neg logit contributions
 spont-0.847
 volunt-0.807
 Pv-0.804
 imperson-0.698
ortment-0.693
Token below
Token ID2174
Feature activation+2.22e-02

Pos logit contributions
uel+0.232
+0.227
lished+0.214
 WATCHED+0.214
+0.214
Neg logit contributions
 spont-0.278
 volunt-0.265
 Pv-0.264
 imperson-0.230
ortment-0.227
Token the
Token ID262
Feature activation+4.86e-02

Pos logit contributions
uel+0.037
+0.036
lished+0.034
+0.034
 WATCHED+0.034
Neg logit contributions
 spont-0.037
 volunt-0.035
 Pv-0.035
 imperson-0.030
ortment-0.030
Token Guidelines
Token ID31072
Feature activation-2.02e-01

Pos logit contributions
uel+0.076
+0.075
lished+0.070
 WATCHED+0.070
+0.070
Neg logit contributions
 spont-0.086
 volunt-0.082
 Pv-0.082
 imperson-0.071
ortment-0.070
Token.
Token ID13
Feature activation+4.81e-01

Pos logit contributions
uel+0.793
+0.772
lished+0.733
 WATCHED+0.732
+0.731
Neg logit contributions
 spont-0.822
 volunt-0.776
 Pv-0.768
 imperson-0.662
ortment-0.659
Token<|endoftext|>
Token ID50256
Feature activation+6.44e-01

Pos logit contributions
uel+0.782
+0.765
+0.725
 WATCHED+0.724
lished+0.720
Neg logit contributions
 spont-0.832
 volunt-0.789
 Pv-0.772
 imperson-0.683
ortment-0.663
TokenB
Token ID33
Feature activation-4.17e-01

Pos logit contributions
uel+0.962
+0.939
+0.881
 WATCHED+0.877
lished+0.873
Neg logit contributions
 spont-1.202
 volunt-1.153
 Pv-1.139
 imperson-1.000
ortment-0.986
Tokenrock
Token ID10823
Feature activation-1.72e-01

Pos logit contributions
 spont+0.789
 volunt+0.754
 Pv+0.751
 imperson+0.656
ortment+0.651
Neg logit contributions
uel-0.605
-0.592
lished-0.555
 WATCHED-0.555
-0.554
Token Les
Token ID11732
Feature activation-6.29e-01

Pos logit contributions
 spont+0.286
 volunt+0.271
 Pv+0.270
 imperson+0.231
ortment+0.229
Neg logit contributions
uel-0.290
-0.285
lished-0.270
 WATCHED-0.270
-0.270
Tokennar
Token ID23955
Feature activation+1.74e+00

Pos logit contributions
 spont+0.997
 volunt+0.944
 Pv+0.937
 imperson+0.792
ortment+0.785
Neg logit contributions
uel-1.118
-1.088
 WATCHED-1.035
-1.031
lished-1.027
Token has
Token ID468
Feature activation+2.53e-01

Pos logit contributions
uel+2.545
+2.475
 WATCHED+2.366
+2.312
lished+2.292
Neg logit contributions
 spont-3.236
 volunt-3.114
 Pv-3.051
ortment-2.698
 imperson-2.633
Token been
Token ID587
Feature activation-5.95e-01

Pos logit contributions
uel+0.382
+0.374
 WATCHED+0.353
lished+0.351
+0.350
Neg logit contributions
 spont-0.461
 volunt-0.440
 Pv-0.437
ortment-0.379
 imperson-0.378
Token flagged
Token ID34060
Feature activation-1.58e+00

Pos logit contributions
 spont+1.057
 volunt+1.006
 Pv+1.002
 imperson+0.867
ortment+0.860
Neg logit contributions
uel-0.931
-0.913
 WATCHED-0.861
lished-0.861
-0.859
Token by
Token ID416
Feature activation+4.53e-02

Pos logit contributions
 spont+2.646
 volunt+2.505
 Pv+2.503
 imperson+2.165
ortment+2.097
Neg logit contributions
-2.595
uel-2.588
lished-2.437
 WATCHED-2.424
osponsors-2.406
Token the
Token ID262
Feature activation-1.33e-01

Pos logit contributions
uel+0.068
+0.066
lished+0.063
+0.062
 WATCHED+0.062
Neg logit contributions
 spont-0.084
 volunt-0.080
 Pv-0.080
 imperson-0.069
ortment-0.068
Token initial
Token ID4238
Feature activation-1.43e-01

Pos logit contributions
uel+0.241
+0.235
 WATCHED+0.221
lished+0.220
+0.220
Neg logit contributions
 spont-0.321
 volunt-0.307
 Pv-0.305
 imperson-0.267
ortment-0.265
Tokenizes
Token ID4340
Feature activation-7.24e-01

Pos logit contributions
 spont+0.289
 volunt+0.277
 Pv+0.276
 imperson+0.244
ortment+0.242
Neg logit contributions
uel-0.188
-0.184
lished-0.171
 WATCHED-0.171
-0.171
Token and
Token ID290
Feature activation+1.25e-01

Pos logit contributions
 spont+1.184
 volunt+1.123
 Pv+1.118
 imperson+0.954
ortment+0.945
Neg logit contributions
uel-1.235
-1.213
lished-1.150
 WATCHED-1.150
-1.148
Token replaces
Token ID24020
Feature activation-3.34e-01

Pos logit contributions
uel+0.181
+0.177
 WATCHED+0.166
lished+0.166
+0.166
Neg logit contributions
 spont-0.238
 volunt-0.228
 Pv-0.226
 imperson-0.197
ortment-0.197
Token the
Token ID262
Feature activation-7.56e-02

Pos logit contributions
 spont+0.556
 volunt+0.529
 Pv+0.526
ortment+0.447
 imperson+0.447
Neg logit contributions
uel-0.561
-0.546
 WATCHED-0.522
lished-0.521
-0.519
Token inn
Token ID3527
Feature activation+1.38e+00

Pos logit contributions
 spont+0.121
 volunt+0.114
 Pv+0.114
ortment+0.096
 imperson+0.096
Neg logit contributions
uel-0.132
-0.129
 WATCHED-0.124
-0.122
lished-0.122
Tokenards
Token ID1371
Feature activation-4.12e-01

Pos logit contributions
uel+1.742
+1.652
lished+1.561
+1.534
 WATCHED+1.522
Neg logit contributions
 spont-2.879
 volunt-2.766
 Pv-2.744
 imperson-2.427
ortment-2.391
Token of
Token ID286
Feature activation+8.06e-03

Pos logit contributions
 spont+0.727
 volunt+0.692
 Pv+0.689
 imperson+0.595
ortment+0.590
Neg logit contributions
uel-0.650
-0.636
-0.600
 WATCHED-0.600
lished-0.599
Token the
Token ID262
Feature activation-1.27e-01

Pos logit contributions
uel+0.013
+0.013
 WATCHED+0.013
lished+0.012
+0.012
Neg logit contributions
 spont-0.013
 volunt-0.013
 Pv-0.013
ortment-0.011
 imperson-0.011
Token bound
Token ID5421
Feature activation-4.01e-01

Pos logit contributions
 spont+0.199
 volunt+0.188
 Pv+0.187
 imperson+0.157
ortment+0.157
Neg logit contributions
uel-0.226
-0.221
 WATCHED-0.212
-0.210
lished-0.210
Token tag
Token ID7621
Feature activation-2.24e-01

Pos logit contributions
 spont+0.618
 volunt+0.584
 Pv+0.582
 imperson+0.490
ortment+0.486
Neg logit contributions
uel-0.723
-0.707
 WATCHED-0.674
-0.674
lished-0.672
Token
Token ID564
Feature activation-2.67e-01

Pos logit contributions
uel+0.117
+0.115
 WATCHED+0.109
+0.108
lished+0.108
Neg logit contributions
 spont-0.127
 volunt-0.121
 Pv-0.119
 imperson-0.104
ortment-0.103
Token
Token ID250
Feature activation-1.46e-01

Pos logit contributions
 spont+0.277
 volunt+0.254
 Pv+0.254
 imperson+0.192
ortment+0.189
Neg logit contributions
uel-0.614
-0.606
lished-0.584
 WATCHED-0.582
-0.581
TokenTo
Token ID2514
Feature activation-9.36e-02

Pos logit contributions
 spont+0.259
 volunt+0.247
 Pv+0.244
 imperson+0.213
ortment+0.210
Neg logit contributions
uel-0.233
-0.228
-0.216
 WATCHED-0.214
lished-0.213
Token West
Token ID2688
Feature activation+1.09e-01

Pos logit contributions
 spont+0.158
 volunt+0.150
 Pv+0.149
ortment+0.128
 imperson+0.127
Neg logit contributions
uel-0.155
-0.151
-0.144
lished-0.143
 WATCHED-0.142
Token R
Token ID371
Feature activation+4.28e-01

Pos logit contributions
uel+0.171
+0.167
+0.158
lished+0.158
ーン+0.156
Neg logit contributions
 spont-0.195
 volunt-0.186
 Pv-0.183
 imperson-0.160
ortment-0.158
Tokenukong
Token ID46654
Feature activation+1.50e+00

Pos logit contributions
uel+0.660
+0.634
lished+0.597
 WATCHED+0.595
+0.594
Neg logit contributions
 spont-0.783
 volunt-0.744
 Pv-0.742
ortment-0.641
 imperson-0.640
Tokenai
Token ID1872
Feature activation+1.98e-01

Pos logit contributions
uel+2.963
+2.829
lished+2.710
+2.709
uru+2.687
Neg logit contributions
 spont-2.142
 volunt-2.018
 Pv-1.972
ortment-1.622
 imperson-1.620
Token...
Token ID986
Feature activation+1.86e-01

Pos logit contributions
uel+0.314
+0.306
+0.289
lished+0.287
 WATCHED+0.286
Neg logit contributions
 spont-0.350
 volunt-0.333
 Pv-0.328
 imperson-0.285
ortment-0.284
Token to
Token ID284
Feature activation-1.11e-02

Pos logit contributions
uel+0.297
+0.292
 WATCHED+0.275
+0.275
lished+0.273
Neg logit contributions
 spont-0.327
 volunt-0.310
 Pv-0.307
 imperson-0.266
ortment-0.265
Token see
Token ID766
Feature activation+2.30e-01

Pos logit contributions
 spont+0.019
 volunt+0.018
 Pv+0.018
ortment+0.016
 imperson+0.016
Neg logit contributions
uel-0.018
-0.018
-0.017
lished-0.017
 WATCHED-0.017
Token Shib
Token ID44254
Feature activation+2.15e-01

Pos logit contributions
uel+0.392
+0.381
+0.364
 WATCHED+0.362
lished+0.361
Neg logit contributions
 spont-0.378
 volunt-0.357
 Pv-0.354
ortment-0.305
 imperson-0.303
Token that
Token ID326
Feature activation-2.27e-01

Pos logit contributions
 spont+0.440
 volunt+0.417
 Pv+0.414
 imperson+0.351
ortment+0.350
Neg logit contributions
uel-0.485
-0.476
 WATCHED-0.453
lished-0.451
-0.450
Token the
Token ID262
Feature activation+1.40e-01

Pos logit contributions
 spont+0.366
 volunt+0.347
 Pv+0.345
ortment+0.294
 imperson+0.292
Neg logit contributions
uel-0.389
-0.383
 WATCHED-0.363
lished-0.362
-0.361
Token
Token ID564
Feature activation-4.26e-02

Pos logit contributions
uel+0.224
+0.220
 WATCHED+0.208
+0.207
lished+0.207
Neg logit contributions
 spont-0.242
 volunt-0.230
 Pv-0.229
ortment-0.197
 imperson-0.196
Token
Token ID250
Feature activation-3.50e-02

Pos logit contributions
 spont+0.046
 volunt+0.042
 Pv+0.042
 imperson+0.032
ortment+0.031
Neg logit contributions
uel-0.097
-0.095
 WATCHED-0.092
lished-0.092
-0.092
TokenMom
Token ID29252
Feature activation+6.82e-01

Pos logit contributions
 spont+0.066
 volunt+0.063
 Pv+0.063
 imperson+0.055
ortment+0.055
Neg logit contributions
uel-0.051
-0.050
lished-0.047
 WATCHED-0.047
-0.047
Tokenent
Token ID298
Feature activation+1.37e+00

Pos logit contributions
uel+1.166
+1.141
 WATCHED+1.079
+1.075
lished+1.072
Neg logit contributions
 spont-1.117
 volunt-1.063
 Pv-1.054
 imperson-0.904
ortment-0.891
Token
Token ID447
Feature activation-2.46e-01

Pos logit contributions
uel+2.102
+2.078
lished+1.946
+1.931
ーン+1.920
Neg logit contributions
 spont-2.458
 volunt-2.363
 Pv-2.333
 imperson-2.016
ortment-2.010
Token
Token ID251
Feature activation+1.11e-01

Pos logit contributions
 spont+0.336
 volunt+0.316
 Pv+0.313
 imperson+0.258
ortment+0.255
Neg logit contributions
uel-0.485
-0.479
 WATCHED-0.457
-0.456
lished-0.456
Token is
Token ID318
Feature activation+1.88e-02

Pos logit contributions
uel+0.165
+0.161
lished+0.152
 WATCHED+0.151
+0.151
Neg logit contributions
 spont-0.207
 volunt-0.197
 Pv-0.196
 imperson-0.171
ortment-0.170
Token not
Token ID407
Feature activation-2.46e-01

Pos logit contributions
uel+0.029
+0.029
 WATCHED+0.027
+0.027
lished+0.027
Neg logit contributions
 spont-0.033
 volunt-0.032
 Pv-0.031
ortment-0.027
 imperson-0.027
Token being
Token ID852
Feature activation-2.27e-01

Pos logit contributions
 spont+0.421
 volunt+0.400
 Pv+0.397
 imperson+0.340
ortment+0.340
Neg logit contributions
uel-0.400
-0.392
 WATCHED-0.371
lished-0.371
-0.370
Token went
Token ID1816
Feature activation-3.31e-01

Pos logit contributions
uel+0.536
+0.529
 WATCHED+0.496
lished+0.493
+0.491
Neg logit contributions
 spont-0.598
 volunt-0.568
 Pv-0.565
ortment-0.486
 imperson-0.485
Token so
Token ID523
Feature activation-2.23e-01

Pos logit contributions
 spont+0.568
 volunt+0.539
 Pv+0.536
ortment+0.460
 imperson+0.459
Neg logit contributions
uel-0.535
-0.526
 WATCHED-0.499
-0.496
lished-0.494
Token far
Token ID1290
Feature activation-2.60e-01

Pos logit contributions
 spont+0.381
 volunt+0.362
 Pv+0.360
ortment+0.312
 imperson+0.308
Neg logit contributions
uel-0.361
-0.356
 WATCHED-0.337
-0.335
lished-0.334
Token as
Token ID355
Feature activation+9.72e-03

Pos logit contributions
 spont+0.463
 volunt+0.442
 Pv+0.441
ortment+0.380
 imperson+0.377
Neg logit contributions
uel-0.404
-0.396
-0.374
 WATCHED-0.373
lished-0.372
Token to
Token ID284
Feature activation-2.74e-01

Pos logit contributions
uel+0.015
+0.015
 WATCHED+0.014
lished+0.014
+0.014
Neg logit contributions
 spont-0.017
 volunt-0.016
 Pv-0.016
ortment-0.014
 imperson-0.014
Token suggest
Token ID1950
Feature activation+1.20e-01

Pos logit contributions
 spont+0.458
 volunt+0.434
 Pv+0.433
ortment+0.369
 imperson+0.365
Neg logit contributions
uel-0.457
-0.448
 WATCHED-0.425
-0.422
lished-0.421
Token that
Token ID326
Feature activation-1.56e-01

Pos logit contributions
uel+0.203
+0.198
 WATCHED+0.189
+0.186
lished+0.186
Neg logit contributions
 spont-0.196
 volunt-0.187
 Pv-0.186
ortment-0.160
 imperson-0.156
Token the
Token ID262
Feature activation+1.27e-01

Pos logit contributions
 spont+0.267
 volunt+0.256
 Pv+0.254
ortment+0.221
 imperson+0.216
Neg logit contributions
uel-0.252
-0.247
 WATCHED-0.234
-0.232
lished-0.230
Token Mayor
Token ID10106
Feature activation+5.89e-01

Pos logit contributions
uel+0.203
+0.199
 WATCHED+0.189
+0.186
ーン+0.185
Neg logit contributions
 spont-0.218
 volunt-0.210
 Pv-0.208
ortment-0.180
 imperson-0.178
Token was
Token ID373
Feature activation+3.36e-03

Pos logit contributions
uel+0.867
+0.851
 WATCHED+0.795
+0.786
lished+0.779
Neg logit contributions
 spont-1.105
 volunt-1.059
 Pv-1.045
ortment-0.911
 imperson-0.902
Token a
Token ID257
Feature activation+8.02e-03

Pos logit contributions
uel+0.006
+0.006
 WATCHED+0.005
+0.005
lished+0.005
Neg logit contributions
 spont-0.005
 volunt-0.005
 Pv-0.005
ortment-0.004
 imperson-0.004
Token is
Token ID318
Feature activation+1.45e-01

Pos logit contributions
 spont+0.057
 volunt+0.055
 Pv+0.054
 imperson+0.048
ortment+0.048
Neg logit contributions
uel-0.038
-0.037
 WATCHED-0.034
lished-0.034
-0.034
Token advising
Token ID30341
Feature activation-8.96e-01

Pos logit contributions
uel+0.219
+0.214
 WATCHED+0.202
lished+0.201
+0.200
Neg logit contributions
 spont-0.266
 volunt-0.254
 Pv-0.251
ortment-0.219
 imperson-0.218
Token Sullivan
Token ID18501
Feature activation+9.49e-01

Pos logit contributions
 spont+1.494
 Pv+1.418
 volunt+1.416
 imperson+1.217
ortment+1.194
Neg logit contributions
uel-1.492
-1.478
lished-1.402
 WATCHED-1.395
-1.393
Token,
Token ID11
Feature activation+1.32e-01

Pos logit contributions
uel+1.494
+1.465
 WATCHED+1.380
+1.374
lished+1.367
Neg logit contributions
 spont-1.670
 volunt-1.578
 Pv-1.569
ortment-1.359
 imperson-1.354
Token says
Token ID1139
Feature activation-7.71e-02

Pos logit contributions
uel+0.205
+0.201
 WATCHED+0.190
lished+0.188
+0.188
Neg logit contributions
 spont-0.234
 volunt-0.223
 Pv-0.221
ortment-0.192
 imperson-0.191
Token it
Token ID340
Feature activation+1.90e-01

Pos logit contributions
 spont+0.130
 volunt+0.124
 Pv+0.123
ortment+0.105
 imperson+0.105
Neg logit contributions
uel-0.127
-0.124
 WATCHED-0.118
-0.117
lished-0.117
Token has
Token ID468
Feature activation+3.92e-01

Pos logit contributions
uel+0.277
+0.271
 WATCHED+0.255
+0.254
lished+0.254
Neg logit contributions
 spont-0.358
 volunt-0.342
 Pv-0.341
 imperson-0.297
ortment-0.296
Token had
Token ID550
Feature activation+1.69e-01

Pos logit contributions
uel+0.604
+0.592
 WATCHED+0.558
+0.556
lished+0.556
Neg logit contributions
 spont-0.706
 volunt-0.673
 Pv-0.670
 imperson-0.580
ortment-0.578
Token a
Token ID257
Feature activation+9.77e-02

Pos logit contributions
uel+0.282
+0.277
 WATCHED+0.262
lished+0.261
+0.261
Neg logit contributions
 spont-0.281
 volunt-0.267
 Pv-0.266
 imperson-0.227
ortment-0.227
Token devastating
Token ID14101
Feature activation+3.51e-01

Pos logit contributions
uel+0.146
+0.143
 WATCHED+0.134
lished+0.133
+0.133
Neg logit contributions
 spont-0.181
 volunt-0.172
 Pv-0.172
 imperson-0.150
ortment-0.149
Token impact
Token ID2928
Feature activation+1.49e-01

Pos logit contributions
uel+0.434
+0.421
 WATCHED+0.391
+0.386
lished+0.383
Neg logit contributions
 spont-0.737
 volunt-0.706
 Pv-0.705
ortment-0.626
 imperson-0.624
Token check
Token ID2198
Feature activation-1.90e-01

Pos logit contributions
uel+0.166
+0.161
lished+0.153
+0.153
 WATCHED+0.152
Neg logit contributions
 spont-0.176
 volunt-0.167
 Pv-0.166
 imperson-0.143
ortment-0.142
Token-
Token ID12
Feature activation+3.10e-02

Pos logit contributions
 spont+0.397
 volunt+0.381
 Pv+0.380
 imperson+0.334
ortment+0.334
Neg logit contributions
uel-0.239
-0.233
-0.216
lished-0.216
 WATCHED-0.215
Tokenins
Token ID1040
Feature activation+2.78e-01

Pos logit contributions
uel+0.055
+0.053
+0.051
lished+0.051
 WATCHED+0.051
Neg logit contributions
 spont-0.049
 volunt-0.047
 Pv-0.046
 imperson-0.039
ortment-0.039
Token on
Token ID319
Feature activation+1.57e-01

Pos logit contributions
uel+0.460
+0.449
 WATCHED+0.426
+0.425
lished+0.424
Neg logit contributions
 spont-0.472
 volunt-0.448
 Pv-0.444
 imperson-0.380
ortment-0.380
Token Unt
Token ID26970
Feature activation+8.99e-01

Pos logit contributions
uel+0.254
+0.250
lished+0.236
 WATCHED+0.236
+0.235
Neg logit contributions
 spont-0.271
 volunt-0.257
 Pv-0.257
 imperson-0.221
ortment-0.218
Tokenapp
Token ID1324
Feature activation+9.74e-01

Pos logit contributions
uel+1.133
+1.122
 WATCHED+1.040
lished+1.039
+1.036
Neg logit contributions
 spont-1.861
 volunt-1.781
 Pv-1.779
 imperson-1.576
ortment-1.568
Tokend
Token ID67
Feature activation-4.55e-02

Pos logit contributions
uel+1.350
+1.309
lished+1.234
 WATCHED+1.232
ーン+1.222
Neg logit contributions
 spont-1.910
 volunt-1.832
 Pv-1.795
 imperson-1.585
ortment-1.582
Token.
Token ID13
Feature activation+1.92e-01

Pos logit contributions
 spont+0.077
 volunt+0.073
 Pv+0.073
 imperson+0.062
ortment+0.062
Neg logit contributions
uel-0.075
-0.074
lished-0.070
 WATCHED-0.070
-0.070
Token Think
Token ID11382
Feature activation-1.40e-01

Pos logit contributions
uel+0.324
+0.317
+0.301
 WATCHED+0.301
lished+0.299
Neg logit contributions
 spont-0.320
 volunt-0.304
 Pv-0.299
 imperson-0.260
ortment-0.254
Token about
Token ID546
Feature activation-4.66e-01

Pos logit contributions
 spont+0.247
 volunt+0.235
 Pv+0.234
 imperson+0.203
ortment+0.201
Neg logit contributions
uel-0.221
-0.217
 WATCHED-0.204
-0.204
lished-0.204
Token that
Token ID326
Feature activation-1.76e-01

Pos logit contributions
 spont+0.704
 volunt+0.665
 Pv+0.661
 imperson+0.556
ortment+0.554
Neg logit contributions
uel-0.853
-0.836
 WATCHED-0.797
-0.794
lished-0.793
Token advised
Token ID13030
Feature activation-1.24e-01

Pos logit contributions
uel+0.229
+0.224
 WATCHED+0.211
lished+0.210
+0.210
Neg logit contributions
 spont-0.284
 volunt-0.270
 Pv-0.270
 imperson-0.234
ortment-0.233
Token she
Token ID673
Feature activation+3.58e-01

Pos logit contributions
 spont+0.206
 volunt+0.196
 Pv+0.195
 imperson+0.167
ortment+0.165
Neg logit contributions
uel-0.209
-0.206
lished-0.195
-0.194
 WATCHED-0.194
Token will
Token ID481
Feature activation-1.52e-01

Pos logit contributions
uel+0.514
+0.503
 WATCHED+0.473
+0.468
lished+0.467
Neg logit contributions
 spont-0.683
 volunt-0.651
 Pv-0.649
 imperson-0.566
ortment-0.566
Token also
Token ID635
Feature activation+8.53e-01

Pos logit contributions
 spont+0.269
 volunt+0.255
 Pv+0.255
 imperson+0.220
ortment+0.219
Neg logit contributions
uel-0.238
-0.233
 WATCHED-0.220
lished-0.220
-0.220
Token hear
Token ID3285
Feature activation+1.25e-01

Pos logit contributions
uel+1.315
+1.267
 WATCHED+1.209
+1.195
lished+1.187
Neg logit contributions
 spont-1.533
 Pv-1.449
 volunt-1.448
ortment-1.263
 imperson-1.240
Token the
Token ID262
Feature activation+2.00e-01

Pos logit contributions
uel+0.208
+0.204
 WATCHED+0.193
+0.192
lished+0.192
Neg logit contributions
 spont-0.211
 volunt-0.200
 Pv-0.199
 imperson-0.170
ortment-0.170
Token defense
Token ID3761
Feature activation+4.91e-01

Pos logit contributions
uel+0.304
+0.298
 WATCHED+0.281
+0.279
lished+0.277
Neg logit contributions
 spont-0.361
 volunt-0.342
 Pv-0.342
ortment-0.299
 imperson-0.297
Token
Token ID447
Feature activation+1.20e-01

Pos logit contributions
uel+0.725
+0.708
 WATCHED+0.666
+0.666
lished+0.661
Neg logit contributions
 spont-0.914
 volunt-0.869
 Pv-0.867
 imperson-0.755
ortment-0.755
Token
Token ID247
Feature activation+1.12e-01

Pos logit contributions
uel+0.188
+0.185
lished+0.175
 WATCHED+0.174
+0.174
Neg logit contributions
 spont-0.213
 volunt-0.202
 Pv-0.202
 imperson-0.174
ortment-0.173
Tokens
Token ID82
Feature activation+1.08e-01

Pos logit contributions
uel+0.154
+0.150
lished+0.140
 WATCHED+0.140
+0.140
Neg logit contributions
 spont-0.223
 volunt-0.213
 Pv-0.211
 imperson-0.187
ortment-0.185
Token request
Token ID2581
Feature activation-2.95e-01

Pos logit contributions
uel+0.166
+0.163
 WATCHED+0.153
+0.153
lished+0.152
Neg logit contributions
 spont-0.193
 volunt-0.184
 Pv-0.184
 imperson-0.159
ortment-0.159
TokenGl
Token ID9861
Feature activation-1.43e-01

Pos logit contributions
uel+0.064
+0.063
 WATCHED+0.059
+0.059
lished+0.059
Neg logit contributions
 spont-0.067
 volunt-0.064
 Pv-0.063
 imperson-0.055
ortment-0.054
Tokenenn
Token ID1697
Feature activation+4.49e-02

Pos logit contributions
 spont+0.280
 volunt+0.268
 Pv+0.266
 imperson+0.235
ortment+0.234
Neg logit contributions
uel-0.195
-0.193
lished-0.180
-0.180
 WATCHED-0.180
Token Greenwald
Token ID41821
Feature activation-3.80e-01

Pos logit contributions
uel+0.071
+0.070
 WATCHED+0.066
lished+0.066
+0.066
Neg logit contributions
 spont-0.079
 volunt-0.076
 Pv-0.075
 imperson-0.065
ortment-0.064
Token writes
Token ID6797
Feature activation-3.60e-01

Pos logit contributions
 spont+0.671
 volunt+0.638
 Pv+0.638
 imperson+0.553
ortment+0.545
Neg logit contributions
uel-0.597
-0.588
lished-0.555
-0.554
 WATCHED-0.553
Token<|endoftext|>
Token ID50256
Feature activation+9.20e-01

Pos logit contributions
 spont+0.604
 volunt+0.573
 Pv+0.571
 imperson+0.490
ortment+0.484
Neg logit contributions
uel-0.600
-0.590
lished-0.558
 WATCHED-0.558
-0.557
TokenThe
Token ID464
Feature activation+2.35e-01

Pos logit contributions
uel+1.377
+1.344
+1.261
 WATCHED+1.253
lished+1.247
Neg logit contributions
 spont-1.716
 volunt-1.647
 Pv-1.624
 imperson-1.428
ortment-1.407
Token divers
Token ID15070
Feature activation-6.96e-02

Pos logit contributions
uel+0.376
+0.369
 WATCHED+0.349
+0.347
lished+0.345
Neg logit contributions
 spont-0.408
 volunt-0.389
 Pv-0.385
ortment-0.333
 imperson-0.332
Token found
Token ID1043
Feature activation-1.36e-04

Pos logit contributions
 spont+0.130
 volunt+0.124
 Pv+0.123
 imperson+0.107
ortment+0.107
Neg logit contributions
uel-0.103
-0.101
 WATCHED-0.095
lished-0.094
-0.094
Token her
Token ID607
Feature activation+4.41e-02

Pos logit contributions
 spont+0.000
 volunt+0.000
 Pv+0.000
 imperson+0.000
ortment+0.000
Neg logit contributions
uel-0.000
-0.000
 WATCHED-0.000
-0.000
lished-0.000
Token on
Token ID319
Feature activation+2.53e-02

Pos logit contributions
uel+0.074
+0.072
 WATCHED+0.069
+0.068
lished+0.068
Neg logit contributions
 spont-0.073
 volunt-0.070
 Pv-0.069
ortment-0.059
 imperson-0.059
Token a
Token ID257
Feature activation+5.57e-02

Pos logit contributions
uel+0.042
+0.041
 WATCHED+0.038
+0.038
lished+0.038
Neg logit contributions
 spont-0.043
 volunt-0.041
 Pv-0.040
ortment-0.035
 imperson-0.035
Token can
Token ID460
Feature activation-1.97e-01

Pos logit contributions
uel+1.167
+1.151
+1.087
 WATCHED+1.079
ーン+1.057
Neg logit contributions
 spont-1.446
 volunt-1.386
 Pv-1.374
ortment-1.204
 imperson-1.188
Token already
Token ID1541
Feature activation-9.28e-01

Pos logit contributions
 spont+0.356
 volunt+0.339
 Pv+0.337
ortment+0.291
 imperson+0.291
Neg logit contributions
uel-0.303
-0.297
 WATCHED-0.280
-0.279
lished-0.278
Token be
Token ID307
Feature activation-6.95e-02

Pos logit contributions
 spont+1.635
 volunt+1.558
 Pv+1.548
 imperson+1.332
ortment+1.331
Neg logit contributions
uel-1.467
-1.437
 WATCHED-1.357
-1.353
lished-1.351
Token heard
Token ID2982
Feature activation-6.50e-01

Pos logit contributions
 spont+0.117
 volunt+0.111
 Pv+0.110
ortment+0.094
 imperson+0.093
Neg logit contributions
uel-0.116
-0.114
 WATCHED-0.108
-0.107
lished-0.107
Token somewhat
Token ID6454
Feature activation-2.24e-01

Pos logit contributions
 spont+1.068
 volunt+1.012
 Pv+1.008
 imperson+0.862
ortment+0.852
Neg logit contributions
uel-1.103
-1.084
lished-1.028
 WATCHED-1.026
-1.025
Token.
Token ID13
Feature activation-3.81e-02

Pos logit contributions
 spont+0.393
 volunt+0.374
 Pv+0.372
ortment+0.322
 imperson+0.320
Neg logit contributions
uel-0.355
-0.348
 WATCHED-0.330
-0.328
lished-0.326
Token Do
Token ID2141
Feature activation+1.28e-01

Pos logit contributions
 spont+0.065
 volunt+0.062
 Pv+0.060
 imperson+0.053
ortment+0.052
Neg logit contributions
uel-0.063
-0.062
 WATCHED-0.059
-0.059
lished-0.058
Token this
Token ID428
Feature activation-1.66e-01

Pos logit contributions
uel+0.214
+0.208
+0.198
 WATCHED+0.197
lished+0.197
Neg logit contributions
 spont-0.214
 volunt-0.204
 Pv-0.202
ortment-0.171
 imperson-0.171
Token with
Token ID351
Feature activation-1.05e-01

Pos logit contributions
 spont+0.289
 volunt+0.275
 Pv+0.274
ortment+0.236
 imperson+0.235
Neg logit contributions
uel-0.265
-0.260
 WATCHED-0.246
-0.244
lished-0.244
Token the
Token ID262
Feature activation-5.62e-02

Pos logit contributions
 spont+0.172
 volunt+0.164
 Pv+0.163
ortment+0.139
 imperson+0.139
Neg logit contributions
uel-0.176
-0.173
 WATCHED-0.165
-0.164
lished-0.163
Token radio
Token ID5243
Feature activation-6.14e-01

Pos logit contributions
 spont+0.096
 volunt+0.092
 Pv+0.091
ortment+0.078
 imperson+0.078
Neg logit contributions
uel-0.091
-0.089
 WATCHED-0.085
-0.084
lished-0.084
Token
Token ID251
Feature activation+1.45e-02

Pos logit contributions
 spont+0.120
 volunt+0.113
 Pv+0.112
 imperson+0.093
ortment+0.092
Neg logit contributions
uel-0.165
-0.163
-0.155
 WATCHED-0.155
lished-0.155
Token (
Token ID357
Feature activation+7.95e-02

Pos logit contributions
uel+0.022
+0.022
+0.020
lished+0.020
 WATCHED+0.020
Neg logit contributions
 spont-0.027
 volunt-0.025
 Pv-0.025
 imperson-0.022
ortment-0.022
TokenNa
Token ID26705
Feature activation-2.59e-01

Pos logit contributions
uel+0.152
+0.149
+0.142
lished+0.141
 WATCHED+0.140
Neg logit contributions
 spont-0.116
 volunt-0.110
 Pv-0.108
 imperson-0.091
ortment-0.090
Tokenof
Token ID1659
Feature activation-3.40e-01

Pos logit contributions
 spont+0.386
 Pv+0.369
 volunt+0.362
 imperson+0.305
ortment+0.303
Neg logit contributions
uel-0.470
-0.469
lished-0.454
 WATCHED-0.450
-0.446
Tokenumi
Token ID12994
Feature activation-7.54e-02

Pos logit contributions
 spont+0.610
 Pv+0.581
 volunt+0.581
 imperson+0.502
ortment+0.499
Neg logit contributions
uel-0.516
-0.516
lished-0.490
 WATCHED-0.486
-0.481
Token)
Token ID8
Feature activation+6.54e-02

Pos logit contributions
 spont+0.125
 Pv+0.120
 volunt+0.119
 imperson+0.102
ortment+0.100
Neg logit contributions
uel-0.126
-0.124
lished-0.119
 WATCHED-0.117
-0.117
Token
Token ID198
Feature activation+4.55e-03

Pos logit contributions
uel+0.121
+0.120
lished+0.115
 WATCHED+0.114
+0.114
Neg logit contributions
 spont-0.097
 Pv-0.092
 volunt-0.091
 imperson-0.076
ortment-0.075
Token
Token ID198
Feature activation+2.19e-02

Pos logit contributions
uel+0.008
+0.008
lished+0.008
 WATCHED+0.008
+0.008
Neg logit contributions
 spont-0.007
 Pv-0.006
 volunt-0.006
 imperson-0.005
ortment-0.005
Token
Token ID447
Feature activation-3.90e-01

Pos logit contributions
uel+0.035
+0.034
+0.032
 WATCHED+0.032
lished+0.032
Neg logit contributions
 spont-0.038
 volunt-0.037
 Pv-0.036
 imperson-0.032
ortment-0.031
Token
Token ID250
Feature activation-1.77e-01

Pos logit contributions
 spont+0.415
 volunt+0.381
 Pv+0.380
 imperson+0.291
ortment+0.287
Neg logit contributions
uel-0.887
-0.876
lished-0.843
 WATCHED-0.841
-0.840
TokenBut
Token ID1537
Feature activation+4.04e-01

Pos logit contributions
 spont+0.312
 volunt+0.297
 Pv+0.295
 imperson+0.255
ortment+0.253
Neg logit contributions
uel-0.281
-0.275
-0.260
 WATCHED-0.258
lished-0.257
Token wrong
Token ID2642
Feature activation-3.54e-01

Pos logit contributions
uel+0.524
+0.515
+0.484
 WATCHED+0.484
lished+0.480
Neg logit contributions
 spont-0.524
 volunt-0.498
 Pv-0.498
ortment-0.431
 imperson-0.421
Token,
Token ID11
Feature activation+2.50e-01

Pos logit contributions
 spont+0.601
 volunt+0.570
 Pv+0.567
 imperson+0.487
ortment+0.485
Neg logit contributions
uel-0.581
-0.570
 WATCHED-0.539
-0.537
lished-0.536
Token the
Token ID262
Feature activation+1.10e-01

Pos logit contributions
uel+0.409
+0.403
 WATCHED+0.378
lished+0.377
+0.376
Neg logit contributions
 spont-0.420
 volunt-0.399
 Pv-0.396
ortment-0.343
 imperson-0.339
Token least
Token ID1551
Feature activation+1.19e+00

Pos logit contributions
uel+0.176
+0.173
 WATCHED+0.163
+0.162
lished+0.162
Neg logit contributions
 spont-0.189
 Pv-0.179
 volunt-0.179
ortment-0.154
 imperson-0.153
Token one
Token ID530
Feature activation+1.13e-02

Pos logit contributions
uel+1.793
+1.744
lished+1.626
+1.626
 WATCHED+1.626
Neg logit contributions
 spont-2.148
 Pv-2.043
 volunt-2.041
ortment-1.785
 imperson-1.755
Token can
Token ID460
Feature activation-2.56e-01

Pos logit contributions
uel+0.016
+0.016
 WATCHED+0.015
lished+0.015
+0.015
Neg logit contributions
 spont-0.021
 volunt-0.020
 Pv-0.020
 imperson-0.018
ortment-0.018
Token say
Token ID910
Feature activation+8.32e-02

Pos logit contributions
 spont+0.418
 Pv+0.395
 volunt+0.395
 imperson+0.334
ortment+0.334
Neg logit contributions
uel-0.439
-0.429
 WATCHED-0.408
-0.404
lished-0.403
Token is
Token ID318
Feature activation+7.79e-02

Pos logit contributions
uel+0.130
+0.127
 WATCHED+0.119
+0.119
lished+0.119
Neg logit contributions
 spont-0.147
 volunt-0.140
 Pv-0.138
ortment-0.121
 imperson-0.120
Token that
Token ID326
Feature activation-2.43e-02

Pos logit contributions
uel+0.135
+0.133
 WATCHED+0.126
lished+0.126
+0.125
Neg logit contributions
 spont-0.124
 volunt-0.117
 Pv-0.117
 imperson-0.099
ortment-0.099
Token they
Token ID484
Feature activation+1.76e-01

Pos logit contributions
 spont+0.040
 volunt+0.038
 Pv+0.038
ortment+0.033
 imperson+0.032
Neg logit contributions
uel-0.041
-0.040
 WATCHED-0.038
-0.038
lished-0.038
Token were
Token ID547
Feature activation+3.15e-01

Pos logit contributions
uel+0.271
+0.265
 WATCHED+0.250
+0.248
lished+0.247
Neg logit contributions
 spont-0.318
 volunt-0.301
 Pv-0.301
ortment-0.260
 imperson-0.259
Token
Token ID198
Feature activation-6.76e-02

Pos logit contributions
 spont+0.234
 volunt+0.223
 Pv+0.220
 imperson+0.190
ortment+0.187
Neg logit contributions
uel-0.244
-0.240
-0.228
 WATCHED-0.228
lished-0.226
Token
Token ID198
Feature activation-8.32e-02

Pos logit contributions
 spont+0.097
 Pv+0.092
 volunt+0.090
ortment+0.075
 imperson+0.075
Neg logit contributions
uel-0.128
-0.126
lished-0.123
 WATCHED-0.120
-0.120
Token
Token ID31857
Feature activation+2.60e-01

Pos logit contributions
 spont+0.163
 volunt+0.156
 Pv+0.156
 imperson+0.137
ortment+0.136
Neg logit contributions
uel-0.115
-0.112
lished-0.105
 WATCHED-0.105
-0.105
TokenBut
Token ID1537
Feature activation+2.66e-01

Pos logit contributions
uel+0.399
+0.394
+0.370
 WATCHED+0.367
lished+0.365
Neg logit contributions
 spont-0.473
 volunt-0.453
 Pv-0.446
 imperson-0.393
ortment-0.386
Token Stella
Token ID45856
Feature activation+2.24e-01

Pos logit contributions
uel+0.434
+0.423
+0.400
 WATCHED+0.398
lished+0.396
Neg logit contributions
 spont-0.452
 volunt-0.430
 Pv-0.427
ortment-0.370
 imperson-0.367
Token….
Token ID11580
Feature activation+4.70e-02

Pos logit contributions
uel+0.313
+0.306
 WATCHED+0.286
+0.285
lished+0.285
Neg logit contributions
 spont-0.434
 volunt-0.415
 Pv-0.413
 imperson-0.361
ortment-0.361
Token
Token ID198
Feature activation-3.24e-02

Pos logit contributions
uel+0.080
+0.079
+0.074
 WATCHED+0.074
lished+0.074
Neg logit contributions
 spont-0.077
 volunt-0.073
 Pv-0.073
 imperson-0.062
ortment-0.062
Token
Token ID198
Feature activation-3.14e-02

Pos logit contributions
 spont+0.047
 Pv+0.045
 volunt+0.044
 imperson+0.036
ortment+0.036
Neg logit contributions
uel-0.061
-0.060
lished-0.058
 WATCHED-0.057
-0.057
Token"…
Token ID24426
Feature activation+5.18e-02

Pos logit contributions
 spont+0.060
 volunt+0.058
 Pv+0.057
 imperson+0.050
ortment+0.050
Neg logit contributions
uel-0.044
-0.044
lished-0.041
 WATCHED-0.041
-0.041
TokenSure
Token ID19457
Feature activation+2.82e-01

Pos logit contributions
uel+0.079
+0.078
+0.073
 WATCHED+0.073
lished+0.073
Neg logit contributions
 spont-0.094
 volunt-0.089
 Pv-0.089
 imperson-0.077
ortment-0.077
Token
Token ID1399
Feature activation+3.86e-01

Pos logit contributions
uel+0.452
+0.440
+0.415
ーン+0.414
lished+0.412
Neg logit contributions
 spont-0.489
 volunt-0.467
 Pv-0.460
 imperson-0.398
ortment-0.397
Tokenets
Token ID1039
Feature activation+1.90e-01

Pos logit contributions
 spont+1.667
 volunt+1.567
 Pv+1.564
 imperson+1.306
ortment+1.295
Neg logit contributions
uel-2.184
-2.183
-2.084
lished-2.073
 WATCHED-2.067
Token.
Token ID13
Feature activation-6.56e-02

Pos logit contributions
uel+0.314
+0.305
 WATCHED+0.290
+0.289
lished+0.289
Neg logit contributions
 spont-0.323
 volunt-0.306
 Pv-0.304
ortment-0.261
 imperson-0.260
Token When
Token ID1649
Feature activation-2.95e-01

Pos logit contributions
 spont+0.111
 volunt+0.106
 Pv+0.105
 imperson+0.091
ortment+0.090
Neg logit contributions
uel-0.108
-0.106
 WATCHED-0.100
-0.100
lished-0.099
Token equipped
Token ID10911
Feature activation+5.37e-01

Pos logit contributions
 spont+0.480
 volunt+0.455
 Pv+0.453
 imperson+0.385
ortment+0.385
Neg logit contributions
uel-0.506
-0.495
 WATCHED-0.469
lished-0.468
-0.468
Token with
Token ID351
Feature activation-1.96e-01

Pos logit contributions
uel+0.927
+0.907
+0.865
lished+0.864
 WATCHED+0.860
Neg logit contributions
 spont-0.858
 volunt-0.817
 Pv-0.808
ortment-0.687
 imperson-0.684
Token a
Token ID257
Feature activation-1.31e-01

Pos logit contributions
 spont+0.318
 volunt+0.301
 Pv+0.299
 imperson+0.255
ortment+0.254
Neg logit contributions
uel-0.337
-0.332
 WATCHED-0.315
lished-0.314
-0.314
Token thrust
Token ID14613
Feature activation+3.98e-01

Pos logit contributions
 spont+0.227
 volunt+0.216
 Pv+0.215
 imperson+0.185
ortment+0.184
Neg logit contributions
uel-0.210
-0.206
lished-0.195
 WATCHED-0.195
-0.194
Token augment
Token ID35016
Feature activation-1.16e+00

Pos logit contributions
uel+0.648
+0.634
lished+0.598
 WATCHED+0.597
+0.597
Neg logit contributions
 spont-0.685
 volunt-0.652
 Pv-0.645
 imperson-0.557
ortment-0.552
Tokener
Token ID263
Feature activation-4.65e-01

Pos logit contributions
 spont+1.931
 volunt+1.832
 Pv+1.827
 imperson+1.563
ortment+1.545
Neg logit contributions
uel-1.926
-1.900
lished-1.802
 WATCHED-1.799
-1.797
Token,
Token ID11
Feature activation-4.71e-02

Pos logit contributions
 spont+0.780
 volunt+0.740
 Pv+0.737
 imperson+0.632
ortment+0.626
Neg logit contributions
uel-0.774
-0.761
lished-0.720
 WATCHED-0.720
-0.719
Token a
Token ID257
Feature activation+4.93e-02

Pos logit contributions
 spont+0.076
 volunt+0.073
 Pv+0.072
 imperson+0.061
ortment+0.061
Neg logit contributions
uel-0.081
-0.079
 WATCHED-0.075
-0.075
lished-0.075
Token the
Token ID262
Feature activation+7.89e-02

Pos logit contributions
uel+0.192
+0.189
lished+0.179
 WATCHED+0.179
+0.179
Neg logit contributions
 spont-0.191
 volunt-0.182
 Pv-0.180
ortment-0.155
 imperson-0.154
Token B
Token ID347
Feature activation-1.44e-02

Pos logit contributions
uel+0.133
+0.131
+0.124
 WATCHED+0.124
lished+0.123
Neg logit contributions
 spont-0.130
 volunt-0.124
 Pv-0.123
 imperson-0.105
ortment-0.105
Tokenip
Token ID541
Feature activation-6.29e-01

Pos logit contributions
 spont+0.026
 volunt+0.024
 Pv+0.024
 imperson+0.021
ortment+0.021
Neg logit contributions
uel-0.022
-0.022
lished-0.021
 WATCHED-0.021
-0.021
Tokenartisan
Token ID19714
Feature activation-2.41e-01

Pos logit contributions
 spont+1.074
 Pv+1.018
 volunt+1.011
 imperson+0.873
ortment+0.863
Neg logit contributions
uel-1.021
-1.013
lished-0.960
-0.958
 WATCHED-0.956
Token Policy
Token ID7820
Feature activation+9.90e-03

Pos logit contributions
 spont+0.290
 Pv+0.271
 volunt+0.271
 imperson+0.216
ortment+0.209
Neg logit contributions
uel-0.514
-0.507
 WATCHED-0.487
lished-0.485
-0.483
Token Center
Token ID3337
Feature activation-1.32e+00

Pos logit contributions
uel+0.014
+0.013
lished+0.013
 WATCHED+0.013
+0.013
Neg logit contributions
 spont-0.019
 volunt-0.018
 Pv-0.018
 imperson-0.016
ortment-0.016
Token.
Token ID13
Feature activation+9.93e-02

Pos logit contributions
 spont+2.131
 Pv+2.022
 volunt+2.018
 imperson+1.719
ortment+1.694
Neg logit contributions
uel-2.255
-2.220
lished-2.104
 WATCHED-2.101
-2.100
Token
Token ID198
Feature activation+1.33e-01

Pos logit contributions
uel+0.173
+0.169
 WATCHED+0.161
+0.161
lished+0.160
Neg logit contributions
 spont-0.160
 volunt-0.152
 Pv-0.149
 imperson-0.129
ortment-0.127
Token
Token ID198
Feature activation+7.27e-02

Pos logit contributions
uel+0.247
+0.244
lished+0.234
+0.232
 WATCHED+0.232
Neg logit contributions
 spont-0.197
 Pv-0.187
 volunt-0.185
 imperson-0.155
ortment-0.153
TokenM
Token ID44
Feature activation-1.92e-01

Pos logit contributions
uel+0.105
+0.103
 WATCHED+0.096
+0.096
lished+0.096
Neg logit contributions
 spont-0.139
 volunt-0.133
 Pv-0.132
 imperson-0.116
ortment-0.115
Tokenood
Token ID702
Feature activation-8.33e-01

Pos logit contributions
 spont+0.354
 volunt+0.337
 Pv+0.336
 imperson+0.293
ortment+0.290
Neg logit contributions
uel-0.288
-0.283
lished-0.266
 WATCHED-0.266
-0.265
Token over
Token ID625
Feature activation-5.65e-02

Pos logit contributions
uel+0.700
+0.684
+0.650
 WATCHED+0.647
ーン+0.635
Neg logit contributions
 spont-0.784
 volunt-0.750
 Pv-0.746
ortment-0.657
 imperson-0.639
Token.
Token ID13
Feature activation-1.12e-02

Pos logit contributions
 spont+0.099
 volunt+0.094
 Pv+0.093
ortment+0.081
 imperson+0.080
Neg logit contributions
uel-0.089
-0.088
 WATCHED-0.083
lished-0.082
-0.082
Token It
Token ID632
Feature activation-5.23e-02

Pos logit contributions
 spont+0.018
 volunt+0.017
 Pv+0.017
 imperson+0.015
ortment+0.014
Neg logit contributions
uel-0.019
-0.019
 WATCHED-0.018
-0.018
lished-0.018
Token was
Token ID373
Feature activation+4.98e-02

Pos logit contributions
 spont+0.096
 volunt+0.091
 Pv+0.091
 imperson+0.079
ortment+0.079
Neg logit contributions
uel-0.079
-0.077
lished-0.073
 WATCHED-0.073
-0.072
Token a
Token ID257
Feature activation-1.98e-01

Pos logit contributions
uel+0.080
+0.078
 WATCHED+0.074
lished+0.074
+0.074
Neg logit contributions
 spont-0.086
 volunt-0.082
 Pv-0.081
ortment-0.070
 imperson-0.070
Token poor
Token ID3595
Feature activation-1.30e+00

Pos logit contributions
 spont+0.341
 volunt+0.325
 Pv+0.322
ortment+0.279
 imperson+0.278
Neg logit contributions
uel-0.320
-0.314
 WATCHED-0.296
lished-0.295
-0.294
Token choice
Token ID3572
Feature activation-1.36e-01

Pos logit contributions
 spont+2.079
 volunt+1.968
 Pv+1.959
 imperson+1.665
ortment+1.647
Neg logit contributions
uel-2.278
-2.240
lished-2.126
 WATCHED-2.125
-2.122
Token.
Token ID13
Feature activation-6.47e-02

Pos logit contributions
 spont+0.228
 volunt+0.217
 Pv+0.215
 imperson+0.184
ortment+0.184
Neg logit contributions
uel-0.226
-0.221
 WATCHED-0.210
-0.209
lished-0.209
Token I
Token ID314
Feature activation-3.61e-01

Pos logit contributions
 spont+0.104
 volunt+0.100
 Pv+0.098
 imperson+0.084
ortment+0.083
Neg logit contributions
uel-0.112
-0.110
 WATCHED-0.104
-0.104
lished-0.104
Token didnt
Token ID42547
Feature activation+3.42e-03

Pos logit contributions
 spont+0.672
 volunt+0.641
 Pv+0.637
 imperson+0.555
ortment+0.553
Neg logit contributions
uel-0.533
-0.522
 WATCHED-0.491
lished-0.490
-0.489
Token finish
Token ID5461
Feature activation-1.03e+00

Pos logit contributions
uel+0.005
+0.005
 WATCHED+0.005
lished+0.005
+0.005
Neg logit contributions
 spont-0.006
 volunt-0.006
 Pv-0.006
ortment-0.005
 imperson-0.005
Token a
Token ID257
Feature activation+4.47e-01

Pos logit contributions
 spont+0.129
 volunt+0.123
 Pv+0.122
 imperson+0.105
ortment+0.105
Neg logit contributions
uel-0.125
-0.123
 WATCHED-0.116
lished-0.116
-0.116
Token Russian
Token ID3394
Feature activation+6.03e-01

Pos logit contributions
uel+0.709
+0.695
 WATCHED+0.654
lished+0.652
+0.651
Neg logit contributions
 spont-0.779
 volunt-0.740
 Pv-0.735
 imperson-0.636
ortment-0.636
Token truck
Token ID7779
Feature activation-2.94e-01

Pos logit contributions
uel+0.965
+0.956
 WATCHED+0.905
+0.894
lished+0.884
Neg logit contributions
 spont-1.019
 volunt-0.972
 Pv-0.961
ortment-0.838
 imperson-0.820
Token convoy
Token ID32860
Feature activation-3.08e-01

Pos logit contributions
 spont+0.485
 volunt+0.460
 Pv+0.457
 imperson+0.389
ortment+0.387
Neg logit contributions
uel-0.499
-0.489
 WATCHED-0.464
lished-0.462
-0.462
Token bearing
Token ID14121
Feature activation-1.26e+00

Pos logit contributions
 spont+0.546
 volunt+0.520
 Pv+0.517
 imperson+0.447
ortment+0.445
Neg logit contributions
uel-0.484
-0.474
 WATCHED-0.447
lished-0.446
-0.446
Token humanitarian
Token ID15706
Feature activation-1.34e+00

Pos logit contributions
 spont+2.175
 volunt+2.070
 Pv+2.066
 imperson+1.782
ortment+1.754
Neg logit contributions
uel-2.028
-1.996
lished-1.882
-1.881
 WATCHED-1.874
Token aid
Token ID6133
Feature activation-1.97e+00

Pos logit contributions
 spont+1.304
 volunt+1.200
 Pv+1.187
 imperson+0.891
ortment+0.846
Neg logit contributions
uel-3.176
-3.150
lished-3.055
-3.021
 WATCHED-3.017
Token might
Token ID1244
Feature activation-7.41e-03

Pos logit contributions
 spont+3.434
 Pv+3.293
 volunt+3.266
 imperson+2.864
ortment+2.727
Neg logit contributions
uel-3.099
-3.085
lished-2.993
-2.914
osponsors-2.893
Token now
Token ID783
Feature activation-1.50e-02

Pos logit contributions
 spont+0.013
 volunt+0.012
 Pv+0.012
ortment+0.011
 imperson+0.011
Neg logit contributions
uel-0.012
-0.011
 WATCHED-0.011
-0.011
lished-0.011
Token enter
Token ID3802
Feature activation+2.84e-01

Pos logit contributions
 spont+0.026
 volunt+0.025
 Pv+0.025
ortment+0.021
 imperson+0.021
Neg logit contributions
uel-0.024
-0.023
 WATCHED-0.022
-0.022
lished-0.022
Token into
Token ID656
Feature activation+9.57e-02

Pos logit contributions
uel+0.479
+0.467
+0.444
lished+0.444
 WATCHED+0.443
Neg logit contributions
 spont-0.469
 volunt-0.444
 Pv-0.438
ortment-0.377
 imperson-0.377
Token of
Token ID286
Feature activation-2.37e-02

Pos logit contributions
 spont+0.906
 volunt+0.863
 Pv+0.859
 imperson+0.746
ortment+0.741
Neg logit contributions
uel-0.762
-0.747
lished-0.704
 WATCHED-0.704
-0.702
Token educational
Token ID9856
Feature activation-4.27e-01

Pos logit contributions
 spont+0.037
 volunt+0.035
 Pv+0.035
ortment+0.030
 imperson+0.029
Neg logit contributions
uel-0.041
-0.041
 WATCHED-0.039
-0.039
lished-0.039
Token qualifications
Token ID24751
Feature activation-4.61e-01

Pos logit contributions
 spont+0.707
 volunt+0.666
 Pv+0.662
ortment+0.571
 imperson+0.566
Neg logit contributions
uel-0.717
-0.703
 WATCHED-0.668
-0.667
lished-0.659
Token for
Token ID329
Feature activation+4.88e-02

Pos logit contributions
 spont+0.794
 volunt+0.753
 Pv+0.743
ortment+0.642
 imperson+0.642
Neg logit contributions
uel-0.750
-0.733
-0.694
 WATCHED-0.693
lished-0.691
Token transport
Token ID4839
Feature activation-2.34e-03

Pos logit contributions
uel+0.088
+0.087
 WATCHED+0.083
+0.082
lished+0.082
Neg logit contributions
 spont-0.074
 volunt-0.069
 Pv-0.069
ortment-0.059
 imperson-0.058
Token licenses
Token ID16625
Feature activation-1.33e+00

Pos logit contributions
 spont+0.004
 volunt+0.004
 Pv+0.003
 imperson+0.003
ortment+0.003
Neg logit contributions
uel-0.004
-0.004
-0.004
 WATCHED-0.004
lished-0.004
Token are
Token ID389
Feature activation-3.82e-02

Pos logit contributions
 spont+2.190
 volunt+2.078
 Pv+2.074
 imperson+1.773
ortment+1.750
Neg logit contributions
uel-2.233
-2.198
lished-2.084
 WATCHED-2.083
-2.076
Token some
Token ID617
Feature activation-1.49e-02

Pos logit contributions
 spont+0.069
 volunt+0.066
 Pv+0.065
 imperson+0.057
ortment+0.056
Neg logit contributions
uel-0.059
-0.057
lished-0.054
 WATCHED-0.054
-0.054
Token of
Token ID286
Feature activation-1.71e-01

Pos logit contributions
 spont+0.028
 volunt+0.026
 Pv+0.026
 imperson+0.023
ortment+0.023
Neg logit contributions
uel-0.022
-0.022
lished-0.020
 WATCHED-0.020
-0.020
Token the
Token ID262
Feature activation-1.65e-01

Pos logit contributions
 spont+0.282
 volunt+0.267
 Pv+0.264
 imperson+0.227
ortment+0.226
Neg logit contributions
uel-0.288
-0.283
lished-0.268
 WATCHED-0.267
-0.267
Token features
Token ID3033
Feature activation+3.76e-01

Pos logit contributions
 spont+0.290
 volunt+0.275
 Pv+0.273
 imperson+0.236
ortment+0.235
Neg logit contributions
uel-0.261
-0.257
 WATCHED-0.242
lished-0.242
-0.242
Tokenened
Token ID2945
Feature activation+1.35e-01

Pos logit contributions
uel+0.611
+0.595
lished+0.574
+0.568
 WATCHED+0.567
Neg logit contributions
 spont-0.455
 volunt-0.433
 Pv-0.429
 imperson-0.353
ortment-0.350
Token to
Token ID284
Feature activation-2.12e-01

Pos logit contributions
uel+0.221
+0.215
 WATCHED+0.205
+0.204
lished+0.203
Neg logit contributions
 spont-0.231
 volunt-0.221
 Pv-0.218
 imperson-0.187
ortment-0.187
Token me
Token ID502
Feature activation-1.50e-01

Pos logit contributions
 spont+0.349
 volunt+0.332
 Pv+0.327
 imperson+0.280
ortment+0.279
Neg logit contributions
uel-0.361
-0.352
 WATCHED-0.336
-0.335
lished-0.332
Token with
Token ID351
Feature activation-2.06e-01

Pos logit contributions
 spont+0.252
 volunt+0.239
 Pv+0.236
 imperson+0.202
ortment+0.202
Neg logit contributions
uel-0.251
-0.246
 WATCHED-0.235
-0.232
lished-0.232
Token some
Token ID617
Feature activation-1.32e-01

Pos logit contributions
 spont+0.342
 volunt+0.325
 Pv+0.322
 imperson+0.276
ortment+0.275
Neg logit contributions
uel-0.347
-0.341
 WATCHED-0.324
-0.323
lished-0.321
Token older
Token ID4697
Feature activation-1.28e+00

Pos logit contributions
 spont+0.220
 volunt+0.208
 Pv+0.206
 imperson+0.177
ortment+0.176
Neg logit contributions
uel-0.221
-0.216
 WATCHED-0.206
-0.205
lished-0.204
Token firm
Token ID4081
Feature activation-5.62e-01

Pos logit contributions
 spont+2.042
 volunt+1.938
 Pv+1.935
 imperson+1.643
ortment+1.615
Neg logit contributions
uel-2.223
-2.194
lished-2.089
-2.076
 WATCHED-2.074
Tokenw
Token ID86
Feature activation-3.13e-01

Pos logit contributions
 spont+1.092
 volunt+1.042
 Pv+1.038
 imperson+0.914
ortment+0.909
Neg logit contributions
uel-0.788
-0.770
lished-0.721
 WATCHED-0.719
-0.718
Tokenares
Token ID3565
Feature activation-2.16e-01

Pos logit contributions
 spont+0.804
 volunt+0.778
 Pv+0.775
 imperson+0.705
ortment+0.699
Neg logit contributions
uel-0.242
-0.232
lished-0.206
 WATCHED-0.204
-0.204
Token.
Token ID13
Feature activation-5.58e-02

Pos logit contributions
 spont+0.361
 volunt+0.343
 Pv+0.339
 imperson+0.290
ortment+0.289
Neg logit contributions
uel-0.360
-0.352
 WATCHED-0.335
lished-0.334
-0.333
Token
Token ID198
Feature activation+7.64e-02

Pos logit contributions
 spont+0.094
 volunt+0.089
 Pv+0.088
 imperson+0.076
ortment+0.075
Neg logit contributions
uel-0.093
-0.091
 WATCHED-0.086
-0.086
lished-0.086