TOP ACTIVATIONS
MAX = 10.005

sterile amalg am of social democracy and Stalin ism ; an
No single system tracks street people as they bounce among that
Go somewhere that people walk their dogs & make
and delusion as schizophren ic people whom the team tested with
transition (" trans sexual " people ) and other gender non
the intelligent and educated young people of the country and which
When I meet people from like , down South
and we 're happy for people to contact us at yog
i ators . For many people , he exempl ifies the
The hopes of people who regarded her as the
. But the people scoring the carc asses are
needs to happen to these people so that this stops ,
in 2008 . The American people bailed these sociop aths out
recent history , for those people that are unaware of how
great . Beyond that , people just like something or they
America against Jews and black people , which is more in
more to do with electoral politics than it had to do
The emotional makeup of the people is essential to hegemony ;
politics , health , religion and so on .
About 24 , 000 people have visited since it opened

MIN ACTIVATIONS
MIN = -11.259

wanted to get into tech , I
considered an elite all - around talent . All
can be sent to explore , harvest resources ,
T oward the end of the clip
last month . Against Sweden in June , they
negative ( tem perature - atten uating ) feedback in Watts
did not appear to point towards the spacecraft . Deep Space
" Having lured the child into the early stages of a
York Times that any digging into matters beyond Russia would be
this to the famed Mom of uku in New York .
deduct ibles and out - of - pocket costs , but
socialist principles can be incorporated into that , but persuasion comes
to strike fear into the hearts of criminals
us 9 , placing it into production in August 2013 .
mann , the 3 - to - 4 - inch dead
that Wes Anderson would adapt into a stop - motion feature
Catalan DUI ( Decl ar aci ón Un ilateral de Ind
could delay or derail progress toward 100 % renewable energy and
the king , gest uring towards the crowds .
Foundation : " 9 - to - 5 Working Women Education

INTERVAL 4.689 - 10.005
CONTAINS 10.344%

might be pretty special too . Despite being unable
and spend time with family ," Walker told Triple M .
to K oug ra , you will see all available K
up in the video clips they released . ( Not ably
a great example of that . The J oust mechanic was

INTERVAL -0.627 - 4.689
CONTAINS 50.488%

them . Also , Jub ilee actually gets to use her
candidate , had pledged to turf - out Assange if he
outs at town meetings in New Hampshire and Iowa are sending
across the land , would have been thwarted by supposed fair
$ 20 million that was passed to help repair damage from

INTERVAL -5.943 - -0.627
CONTAINS 33.020%

He s passionate about his city , his community
rations for the first b rit ish edition of ' Charlie
Sunday . St ri king images capture the moment
thing I felt very sure of in my mind is that
ord persist e , les F ond ateurs pre nd ront

INTERVAL -11.259 - -5.943
CONTAINS 6.148%

will be welcome . Take Home Message 8 :
reme Wh ale Prime Mar
K ES . We surprised at least two big suck ers
, because they re black and female .
. It would be interesting to see how the cub s
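
The four interval boundaries listed above are consistent with splitting the range from MIN to MAX into four equal-width bins, and the four CONTAINS shares add up to 100%. The sketch below checks both, assuming equal-width binning is how the boundaries were derived (the listing itself does not say so). The function name `interval_edges` is hypothetical and not taken from the tool that produced this output.

```python
def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced edges from lo to hi.

    Equal-width binning is an assumption about the dashboard,
    not something stated in the listing.
    """
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# MIN and MAX exactly as reported in the listing
edges = interval_edges(-11.259, 10.005)
rounded = [round(e, 3) for e in edges]
print(rounded)

# CONTAINS shares as reported, highest interval first
shares = [10.344, 50.488, 33.020, 6.148]
total = round(sum(shares), 3)
print(total)
```

If the assumption holds, the rounded edges should match the INTERVAL boundaries shown above, so each bin spans (10.005 - (-11.259)) / 4 = 5.316 activation units.
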
Token sterile
Token ID38697
Feature activation+2.09e+00

Pos logit contributions
,+0.218
,...+0.171
.+0.170
.""+0.164
 society+0.143
Neg logit contributions
-0.247
-0.216
 -0.201
 Originally-0.199
 �-0.193
Token amalg
Token ID45541
Feature activation-9.05e-01

Pos logit contributions
,+0.154
,...+0.116
.+0.113
.""+0.112
.,+0.090
Neg logit contributions
-0.233
-0.209
 -0.195
 Originally-0.190
 �-0.189
Tokenam
Token ID321
Feature activation+9.33e-01

Pos logit contributions
+0.075
+0.064
 Originally+0.058
 +0.057
 �+0.054
Neg logit contributions
,-0.094
,...-0.078
.-0.075
.""-0.074
 society-0.068
Token of
Token ID286
Feature activation-3.21e+00

Pos logit contributions
,+0.118
,...+0.101
.+0.099
.""+0.098
 society+0.091
Neg logit contributions
-0.058
-0.044
 -0.038
 Originally-0.037
 �-0.036
Token social
Token ID1919
Feature activation+7.56e+00

Pos logit contributions
+0.360
+0.323
 +0.301
 Originally+0.298
 �+0.293
Neg logit contributions
,-0.235
,...-0.177
.-0.173
.""-0.168
.,-0.136
Token democracy
Token ID7996
Feature activation+1.00e+01

Pos logit contributions
,+0.506
,...+0.377
.+0.373
.""+0.371
 society+0.288
Neg logit contributions
-0.890
-0.797
 -0.741
 Originally-0.727
 �-0.723
Token and
Token ID290
Feature activation+2.77e+00

Pos logit contributions
,+0.640
,...+0.457
.+0.447
.""+0.440
 society+0.357
Neg logit contributions
-1.242
-1.094
 -1.029
 Originally-1.016
 �-1.009
Token Stalin
Token ID17482
Feature activation+5.18e+00

Pos logit contributions
,+0.194
,...+0.146
.+0.142
.""+0.140
,.+0.112
Neg logit contributions
-0.318
-0.287
 -0.265
 Originally-0.263
 �-0.260
Tokenism
Token ID1042
Feature activation+2.51e+00

Pos logit contributions
,+0.465
,...+0.372
.+0.360
.""+0.358
 society+0.309
Neg logit contributions
-0.509
-0.437
 -0.401
 Originally-0.393
 �-0.392
Token;
Token ID26
Feature activation+7.22e+00

Pos logit contributions
,+0.095
,...+0.050
.+0.043
.""+0.043
 society+0.026
Neg logit contributions
-0.376
-0.340
 -0.323
 Originally-0.319
 �-0.318
Token an
Token ID281
Feature activation+1.56e+00

Pos logit contributions
,+0.621
.+0.475
,...+0.462
.""+0.439
 society+0.394
Neg logit contributions
-0.743
-0.643
 -0.616
 Originally-0.613
 �-0.582
TokenNo
Token ID2949
Feature activation+1.81e+00

Pos logit contributions
+0.298
+0.236
 +0.207
 Originally+0.207
 �+0.196
Neg logit contributions
,-0.483
,...-0.400
.-0.397
.""-0.388
 society-0.370
Token single
Token ID2060
Feature activation+1.89e+00

Pos logit contributions
,+0.137
,...+0.106
.+0.100
.""+0.100
 society+0.084
Neg logit contributions
-0.198
-0.177
 -0.163
 Originally-0.162
 �-0.158
Token system
Token ID1080
Feature activation+4.29e+00

Pos logit contributions
,+0.134
,...+0.101
.+0.099
.""+0.098
 society+0.078
Neg logit contributions
-0.215
-0.194
 -0.178
 Originally-0.177
 �-0.176
Token tracks
Token ID8339
Feature activation+2.59e+00

Pos logit contributions
,+0.413
.+0.329
,...+0.326
.""+0.317
 society+0.286
Neg logit contributions
-0.393
-0.335
 -0.312
 Originally-0.309
 �-0.298
Token street
Token ID4675
Feature activation+7.37e+00

Pos logit contributions
,+0.188
,...+0.141
.+0.138
.""+0.136
 society+0.111
Neg logit contributions
-0.293
-0.263
 -0.241
 Originally-0.238
 �-0.237
Token people
Token ID661
Feature activation+9.95e+00

Pos logit contributions
,+0.492
,...+0.349
.+0.346
.""+0.331
 society+0.276
Neg logit contributions
-0.886
-0.794
 -0.741
 Originally-0.740
 �-0.724
Token as
Token ID355
Feature activation-1.76e+00

Pos logit contributions
,+0.812
.+0.611
,...+0.609
.""+0.587
 society+0.520
Neg logit contributions
-1.069
-0.928
 -0.871
 Originally-0.866
 �-0.843
Token they
Token ID484
Feature activation+6.92e+00

Pos logit contributions
+0.185
+0.163
 +0.151
 Originally+0.151
 �+0.146
Neg logit contributions
,-0.144
,...-0.111
.-0.109
.""-0.106
 society-0.090
Token bounce
Token ID20110
Feature activation+1.05e+00

Pos logit contributions
,+0.661
.+0.522
,...+0.517
.""+0.497
 society+0.451
Neg logit contributions
-0.640
-0.548
 Originally-0.512
 -0.511
 �-0.487
Token among
Token ID1871
Feature activation-2.35e+00

Pos logit contributions
,+0.131
,...+0.111
.+0.110
.""+0.108
 society+0.100
Neg logit contributions
-0.065
-0.051
 -0.045
 Originally-0.045
 �-0.041
Token that
Token ID326
Feature activation+3.37e+00

Pos logit contributions
+0.265
+0.242
 Originally+0.222
 +0.221
 �+0.215
Neg logit contributions
,-0.166
,...-0.125
.-0.120
.""-0.120
,.-0.096
Token
Token ID198
Feature activation-3.97e+00

Pos logit contributions
,+1.044
.+0.899
,...+0.879
.""+0.856
 society+0.814
Neg logit contributions
-0.310
-0.199
 -0.180
 Originally-0.178
 �-0.148
Token
Token ID198
Feature activation-3.92e+00

Pos logit contributions
+0.079
 +0.013
 Originally+0.012
 Regarding+0.004
+0.000
Neg logit contributions
,-0.683
.-0.603
,...-0.576
.""-0.558
 society-0.536
TokenGo
Token ID5247
Feature activation+2.31e+00

Pos logit contributions
+0.274
+0.221
 +0.200
 Originally+0.197
 �+0.182
Neg logit contributions
,-0.466
,...-0.387
.-0.385
.""-0.373
 society-0.353
Token somewhere
Token ID7382
Feature activation+1.25e+00

Pos logit contributions
,+0.283
.+0.236
,...+0.234
.""+0.226
 society+0.211
Neg logit contributions
-0.153
-0.122
 -0.112
 Originally-0.111
 �-0.102
Token that
Token ID326
Feature activation+3.39e+00

Pos logit contributions
,+0.117
,...+0.093
.+0.092
.""+0.090
 society+0.080
Neg logit contributions
-0.115
-0.100
 -0.093
 Originally-0.092
 �-0.088
Token people
Token ID661
Feature activation+9.93e+00

Pos logit contributions
,+0.279
,...+0.221
.+0.214
.""+0.212
 society+0.179
Neg logit contributions
-0.347
-0.313
 -0.288
 Originally-0.282
 �-0.274
Token walk
Token ID2513
Feature activation+2.61e+00

Pos logit contributions
,+0.932
.+0.733
,...+0.727
.""+0.695
 society+0.629
Neg logit contributions
-0.933
-0.805
 -0.755
 Originally-0.753
 �-0.713
Token their
Token ID511
Feature activation+4.37e+00

Pos logit contributions
,+0.268
,...+0.217
.+0.214
.""+0.210
 society+0.192
Neg logit contributions
-0.217
-0.187
 -0.171
 Originally-0.169
 �-0.160
Token dogs
Token ID6844
Feature activation+6.07e+00

Pos logit contributions
,+0.317
,...+0.231
.+0.230
.""+0.218
 society+0.184
Neg logit contributions
-0.500
-0.447
 -0.422
 Originally-0.415
 �-0.403
Token &
Token ID1222
Feature activation+7.94e-01

Pos logit contributions
,+0.473
,...+0.355
.+0.348
.""+0.339
 society+0.299
Neg logit contributions
-0.658
-0.588
 -0.548
 Originally-0.544
 �-0.524
Token make
Token ID787
Feature activation+9.00e-01

Pos logit contributions
,+0.070
,...+0.054
.+0.054
.""+0.052
 society+0.046
Neg logit contributions
-0.079
-0.069
 -0.065
 Originally-0.063
 �-0.061
Token and
Token ID290
Feature activation+2.73e+00

Pos logit contributions
+0.367
+0.325
 +0.297
 Originally+0.295
 �+0.289
Neg logit contributions
,-0.282
,...-0.220
.-0.212
.""-0.212
 society-0.188
Token delusion
Token ID40969
Feature activation+4.66e+00

Pos logit contributions
,+0.188
,...+0.139
.+0.135
.""+0.130
 society+0.109
Neg logit contributions
-0.318
-0.288
 -0.266
 Originally-0.266
 �-0.260
Token as
Token ID355
Feature activation-1.80e+00

Pos logit contributions
,+0.441
,...+0.356
.+0.346
.""+0.342
 society+0.311
Neg logit contributions
-0.423
-0.367
 -0.334
 Originally-0.332
 �-0.320
Token schizophren
Token ID19991
Feature activation+2.75e+00

Pos logit contributions
+0.195
+0.174
 Originally+0.162
 +0.162
 �+0.155
Neg logit contributions
,-0.139
,...-0.107
.-0.104
.""-0.101
 society-0.085
Tokenic
Token ID291
Feature activation-1.78e+00

Pos logit contributions
,+0.331
,...+0.282
.+0.274
.""+0.271
 society+0.251
Neg logit contributions
-0.181
-0.149
 Originally-0.130
 -0.129
 �-0.121
Token people
Token ID661
Feature activation+9.90e+00

Pos logit contributions
+0.208
+0.190
 +0.174
 Originally+0.171
 �+0.171
Neg logit contributions
,-0.119
,...-0.089
.""-0.084
.-0.083
 society-0.068
Token whom
Token ID4150
Feature activation+4.07e+00

Pos logit contributions
,+0.565
,...+0.383
.+0.362
.""+0.357
 society+0.298
Neg logit contributions
-1.275
-1.155
 -1.081
 Originally-1.077
 �-1.057
Token the
Token ID262
Feature activation+4.32e+00

Pos logit contributions
,+0.300
,...+0.232
.+0.224
.""+0.218
.,+0.175
Neg logit contributions
-0.451
-0.408
 Originally-0.377
 -0.377
 �-0.364
Token team
Token ID1074
Feature activation+5.55e+00

Pos logit contributions
,+0.322
.+0.237
,...+0.237
.""+0.223
 society+0.185
Neg logit contributions
-0.489
-0.435
 -0.410
 Originally-0.408
 �-0.394
Token tested
Token ID6789
Feature activation+1.84e+00

Pos logit contributions
,+0.552
.+0.443
,...+0.441
.""+0.425
 society+0.386
Neg logit contributions
-0.487
-0.418
 Originally-0.387
 -0.386
 �-0.364
Token with
Token ID351
Feature activation-4.00e+00

Pos logit contributions
,+0.120
,...+0.087
.+0.082
.""+0.082
 society+0.071
Neg logit contributions
-0.220
-0.199
 -0.185
 Originally-0.184
 �-0.180
Token transition
Token ID6801
Feature activation+1.37e+00

Pos logit contributions
,+0.358
,...+0.280
.+0.275
.""+0.269
 society+0.230
Neg logit contributions
-0.466
-0.416
 -0.377
 Originally-0.370
 �-0.369
Token ("
Token ID5855
Feature activation+2.90e+00

Pos logit contributions
,+0.096
,...+0.070
.+0.068
.""+0.065
 society+0.056
Neg logit contributions
-0.160
-0.143
 -0.134
 Originally-0.132
 �-0.129
Tokentrans
Token ID7645
Feature activation+2.02e-01

Pos logit contributions
,+0.338
,...+0.280
.+0.278
.""+0.269
 society+0.251
Neg logit contributions
-0.209
-0.171
 -0.154
 Originally-0.153
 �-0.142
Tokensexual
Token ID18338
Feature activation+3.61e+00

Pos logit contributions
,+0.021
,...+0.018
.+0.017
.""+0.017
 society+0.015
Neg logit contributions
-0.016
-0.014
 -0.012
 Originally-0.012
 �-0.012
Token"
Token ID1
Feature activation+3.31e+00

Pos logit contributions
,+0.349
,...+0.283
.+0.279
.""+0.265
 society+0.240
Neg logit contributions
-0.326
-0.279
 -0.256
 Originally-0.252
 �-0.244
Token people
Token ID661
Feature activation+9.89e+00

Pos logit contributions
,+0.327
.+0.259
,...+0.250
.""+0.241
 society+0.222
Neg logit contributions
-0.298
-0.251
 -0.243
 Originally-0.241
 Regarding-0.226
Token)
Token ID8
Feature activation+9.50e-01

Pos logit contributions
,+0.884
.+0.693
,...+0.692
.""+0.668
 society+0.599
Neg logit contributions
-0.961
-0.833
 Originally-0.775
 -0.773
 �-0.742
Token and
Token ID290
Feature activation+2.84e+00

Pos logit contributions
,+0.097
.+0.077
,...+0.074
.""+0.072
 society+0.066
Neg logit contributions
-0.083
-0.069
 Originally-0.067
 -0.067
 Regarding-0.063
Token other
Token ID584
Feature activation+2.35e+00

Pos logit contributions
,+0.201
,...+0.147
.+0.146
.""+0.139
 society+0.114
Neg logit contributions
-0.327
-0.294
 -0.275
 Originally-0.272
 �-0.265
Token gender
Token ID5279
Feature activation+6.32e+00

Pos logit contributions
,+0.155
,...+0.112
.+0.109
.""+0.105
 society+0.084
Neg logit contributions
-0.281
-0.254
 -0.236
 Originally-0.236
 �-0.230
Token non
Token ID1729
Feature activation+2.15e+00

Pos logit contributions
,+0.456
,...+0.338
.+0.330
.""+0.317
 society+0.268
Neg logit contributions
-0.724
-0.647
 -0.601
 Originally-0.597
 �-0.584
Token the
Token ID262
Feature activation+4.10e+00

Pos logit contributions
+0.480
+0.437
 Originally+0.397
 +0.397
 �+0.392
Neg logit contributions
,-0.302
,...-0.230
.-0.222
.""-0.218
.,-0.177
Token intelligent
Token ID12661
Feature activation+3.48e+00

Pos logit contributions
,+0.320
.+0.240
,...+0.239
.""+0.227
 society+0.190
Neg logit contributions
-0.447
-0.398
 Originally-0.372
 -0.372
 �-0.359
Token and
Token ID290
Feature activation+2.68e+00

Pos logit contributions
,+0.238
,...+0.174
.+0.173
.""+0.167
 society+0.135
Neg logit contributions
-0.403
-0.364
 -0.338
 Originally-0.335
 �-0.327
Token educated
Token ID15657
Feature activation+3.64e+00

Pos logit contributions
,+0.217
,...+0.167
.+0.166
.""+0.160
 society+0.135
Neg logit contributions
-0.281
-0.249
 -0.231
 Originally-0.229
 �-0.225
Token young
Token ID1862
Feature activation+4.68e+00

Pos logit contributions
,+0.297
.+0.229
,...+0.227
.""+0.217
 society+0.189
Neg logit contributions
-0.378
-0.333
 -0.310
 Originally-0.309
 �-0.299
Token people
Token ID661
Feature activation+9.87e+00

Pos logit contributions
,+0.304
.+0.215
,...+0.213
.""+0.201
 society+0.166
Neg logit contributions
-0.561
-0.505
 -0.475
 Originally-0.472
 �-0.460
Token of
Token ID286
Feature activation-3.18e+00

Pos logit contributions
,+1.156
.+0.964
,...+0.955
.""+0.927
 society+0.865
Neg logit contributions
-0.692
-0.561
 -0.506
 Originally-0.502
 �-0.476
Token the
Token ID262
Feature activation+4.34e+00

Pos logit contributions
+0.364
+0.329
 +0.304
 Originally+0.303
 �+0.297
Neg logit contributions
,-0.225
,...-0.166
.-0.164
.""-0.158
,.-0.126
Token country
Token ID1499
Feature activation+7.40e+00

Pos logit contributions
,+0.335
.+0.250
,...+0.247
.""+0.233
 society+0.195
Neg logit contributions
-0.479
-0.426
 -0.401
 Originally-0.400
 �-0.385
Token and
Token ID290
Feature activation+2.81e+00

Pos logit contributions
,+0.612
.+0.470
,...+0.466
.""+0.443
 society+0.400
Neg logit contributions
-0.765
-0.670
 -0.625
 Originally-0.624
 �-0.607
Token which
Token ID543
Feature activation+9.62e-01

Pos logit contributions
,+0.258
.+0.205
,...+0.204
.""+0.197
 society+0.171
Neg logit contributions
-0.263
-0.229
 -0.212
 Originally-0.210
 �-0.203
Token
Token ID564
Feature activation-3.41e+00

Pos logit contributions
,+0.391
.+0.314
,...+0.309
.""+0.297
 society+0.272
Neg logit contributions
-0.326
-0.277
 -0.260
 Originally-0.258
 �-0.242
Token
Token ID250
Feature activation-1.09e+00

Pos logit contributions
+0.231
+0.186
 +0.169
 Originally+0.167
 �+0.155
Neg logit contributions
,-0.411
.-0.342
,...-0.340
.""-0.330
 society-0.309
TokenWhen
Token ID2215
Feature activation-8.67e-02

Pos logit contributions
+0.082
+0.067
 +0.061
 Originally+0.061
 �+0.058
Neg logit contributions
,-0.124
,...-0.102
.-0.101
.""-0.098
 society-0.091
Token I
Token ID314
Feature activation+5.43e+00

Pos logit contributions
+0.010
+0.008
 +0.008
 Originally+0.008
 �+0.008
Neg logit contributions
,-0.007
,...-0.005
.-0.005
.""-0.005
 society-0.004
Token meet
Token ID1826
Feature activation+2.48e+00

Pos logit contributions
,+0.512
.+0.404
,...+0.400
.""+0.384
 society+0.350
Neg logit contributions
-0.509
-0.437
 Originally-0.409
 -0.408
 �-0.389
Token people
Token ID661
Feature activation+9.86e+00

Pos logit contributions
,+0.187
,...+0.141
.+0.139
.""+0.132
 society+0.115
Neg logit contributions
-0.275
-0.245
 -0.228
 Originally-0.227
 �-0.224
Token from
Token ID422
Feature activation-5.41e+00

Pos logit contributions
,+0.858
.+0.671
,...+0.665
.""+0.639
 society+0.581
Neg logit contributions
-0.989
-0.853
 -0.794
 Originally-0.786
 �-0.775
Token like
Token ID588
Feature activation-8.90e-01

Pos logit contributions
+0.562
+0.499
 +0.460
 Originally+0.455
 �+0.448
Neg logit contributions
,-0.442
,...-0.342
.-0.336
.""-0.326
 society-0.278
Token,
Token ID11
Feature activation+8.88e+00

Pos logit contributions
+0.103
+0.092
 +0.086
 Originally+0.084
 �+0.084
Neg logit contributions
,-0.062
,...-0.046
.-0.045
.""-0.044
 society-0.037
Token down
Token ID866
Feature activation-2.57e+00

Pos logit contributions
,+0.792
.+0.614
,...+0.606
.""+0.578
 society+0.516
Neg logit contributions
-0.885
-0.763
 -0.723
 Originally-0.716
 �-0.689
Token South
Token ID2520
Feature activation+2.70e+00

Pos logit contributions
+0.246
+0.215
 +0.199
 Originally+0.196
 �+0.190
Neg logit contributions
,-0.228
,...-0.180
.-0.179
.""-0.176
 society-0.156
Token and
Token ID290
Feature activation+2.61e+00

Pos logit contributions
+0.081
+0.073
 +0.067
 Originally+0.067
 �+0.064
Neg logit contributions
,-0.059
,...-0.046
.-0.044
.""-0.043
 society-0.039
Token we
Token ID356
Feature activation+5.89e+00

Pos logit contributions
,+0.245
,...+0.196
.+0.194
.""+0.187
 society+0.169
Neg logit contributions
-0.242
-0.210
 -0.195
 Originally-0.194
 �-0.184
Token're
Token ID821
Feature activation+3.68e+00

Pos logit contributions
,+0.550
.+0.434
,...+0.433
.""+0.415
 society+0.378
Neg logit contributions
-0.552
-0.479
 Originally-0.445
 -0.445
 �-0.421
Token happy
Token ID3772
Feature activation+3.66e+00

Pos logit contributions
,+0.359
.+0.285
,...+0.280
.""+0.269
 society+0.245
Neg logit contributions
-0.335
-0.285
 -0.269
 Originally-0.268
 �-0.253
Token for
Token ID329
Feature activation-3.61e+00

Pos logit contributions
,+0.504
,...+0.430
.+0.430
.""+0.417
 society+0.395
Neg logit contributions
-0.182
-0.136
 -0.115
 Originally-0.115
 �-0.102
Token people
Token ID661
Feature activation+9.86e+00

Pos logit contributions
+0.431
+0.392
 +0.365
 Originally+0.362
 �+0.351
Neg logit contributions
,-0.239
,...-0.174
.-0.167
.""-0.160
 society-0.132
Token to
Token ID284
Feature activation-3.95e+00

Pos logit contributions
,+1.290
.+1.091
,...+1.091
.""+1.061
 society+0.999
Neg logit contributions
-0.558
-0.432
 -0.376
 Originally-0.371
 �-0.340
Token contact
Token ID2800
Feature activation+1.70e+00

Pos logit contributions
+0.352
+0.301
 +0.282
 Originally+0.281
 �+0.265
Neg logit contributions
,-0.390
.-0.310
,...-0.307
.""-0.295
 society-0.269
Token us
Token ID514
Feature activation+4.79e+00

Pos logit contributions
,+0.113
,...+0.082
.+0.079
.""+0.076
 society+0.065
Neg logit contributions
-0.203
-0.184
 Originally-0.172
 -0.171
 �-0.165
Token at
Token ID379
Feature activation-4.46e+00

Pos logit contributions
,+0.541
,...+0.445
.+0.443
.""+0.429
 society+0.401
Neg logit contributions
-0.357
-0.296
 -0.269
 Originally-0.267
 �-0.251
Token yog
Token ID27423
Feature activation+4.31e+00

Pos logit contributions
+0.457
+0.410
 Originally+0.376
 +0.375
 �+0.360
Neg logit contributions
,-0.370
,...-0.288
.-0.280
.""-0.274
 society-0.244
Tokeni
Token ID72
Feature activation-1.21e+00

Pos logit contributions
,+0.105
,...+0.087
.+0.086
.""+0.084
 society+0.078
Neg logit contributions
-0.066
-0.055
 Originally-0.049
 -0.049
 �-0.046
Tokenators
Token ID2024
Feature activation-1.01e+00

Pos logit contributions
+0.099
+0.081
 +0.073
 Originally+0.073
 �+0.070
Neg logit contributions
,-0.130
,...-0.108
.-0.105
.""-0.105
 society-0.096
Token.
Token ID13
Feature activation+7.00e+00

Pos logit contributions
+0.143
+0.128
 +0.120
 Originally+0.120
 �+0.120
Neg logit contributions
,-0.047
,...-0.028
.""-0.026
.-0.026
 society-0.020
Token For
Token ID1114
Feature activation-2.96e+00

Pos logit contributions
,+0.919
.+0.776
,...+0.758
.""+0.733
 society+0.689
Neg logit contributions
-0.400
-0.296
 -0.280
 Originally-0.271
 �-0.245
Token many
Token ID867
Feature activation+3.10e+00

Pos logit contributions
+0.306
+0.271
 Originally+0.251
 +0.250
 �+0.241
Neg logit contributions
,-0.245
,...-0.188
.-0.187
.""-0.181
 society-0.156
Token people
Token ID661
Feature activation+9.82e+00

Pos logit contributions
,+0.218
,...+0.164
.+0.164
.""+0.162
 society+0.136
Neg logit contributions
-0.349
-0.313
 Originally-0.288
 -0.287
 �-0.282
Token,
Token ID11
Feature activation+8.86e+00

Pos logit contributions
,+0.511
.+0.338
,...+0.329
.""+0.328
 society+0.254
Neg logit contributions
-1.323
-1.181
 -1.105
 Originally-1.103
 �-1.102
Token he
Token ID339
Feature activation+5.95e+00

Pos logit contributions
,+0.706
.+0.529
,...+0.521
.""+0.494
 society+0.431
Neg logit contributions
-0.967
-0.848
 -0.803
 Originally-0.801
 �-0.768
Token exempl
Token ID21433
Feature activation+7.48e-01

Pos logit contributions
,+0.565
.+0.450
,...+0.447
.""+0.435
 society+0.390
Neg logit contributions
-0.553
-0.474
 Originally-0.436
 -0.435
 �-0.421
Tokenifies
Token ID6945
Feature activation-1.65e+00

Pos logit contributions
,+0.083
,...+0.068
.+0.068
.""+0.065
 society+0.061
Neg logit contributions
-0.058
-0.048
 Originally-0.044
 -0.043
 �-0.041
Token the
Token ID262
Feature activation+4.34e+00

Pos logit contributions
+0.188
+0.167
 +0.152
 �+0.152
 Originally+0.152
Neg logit contributions
,-0.119
,...-0.090
.-0.087
.""-0.087
,.-0.070
Token
Token ID447
Feature activation-2.05e+00

Pos logit contributions
,+0.979
.+0.832
,...+0.816
.""+0.794
 society+0.748
Neg logit contributions
-0.387
-0.285
 -0.261
 Originally-0.258
 Regarding-0.225
Token
Token ID251
Feature activation+3.03e+00

Pos logit contributions
+0.137
+0.113
 Originally+0.100
 +0.099
 �+0.093
Neg logit contributions
,-0.248
,...-0.206
.-0.206
.""-0.202
 society-0.188
Token The
Token ID383
Feature activation+3.36e+00

Pos logit contributions
,+0.437
.+0.375
,...+0.374
.""+0.364
 society+0.346
Neg logit contributions
-0.138
-0.097
 -0.081
 Originally-0.081
 �-0.071
Token hopes
Token ID7176
Feature activation+5.25e+00

Pos logit contributions
,+0.284
,...+0.218
.+0.217
.""+0.206
 society+0.180
Neg logit contributions
-0.349
-0.305
 Originally-0.286
 -0.285
 �-0.275
Token of
Token ID286
Feature activation-3.18e+00

Pos logit contributions
,+0.561
,...+0.460
.+0.460
.""+0.446
 society+0.408
Neg logit contributions
-0.425
-0.354
 -0.323
 Originally-0.321
 �-0.310
Token people
Token ID661
Feature activation+9.80e+00

Pos logit contributions
+0.351
+0.311
 +0.290
 Originally+0.288
 �+0.284
Neg logit contributions
,-0.244
,...-0.187
.-0.182
.""-0.177
,.-0.145
Token who
Token ID508
Feature activation+1.86e+00

Pos logit contributions
,+1.027
.+0.830
,...+0.819
.""+0.789
 society+0.725
Neg logit contributions
-0.823
-0.688
 -0.644
 Originally-0.640
 �-0.605
Token regarded
Token ID11987
Feature activation+5.24e-01

Pos logit contributions
,+0.173
.+0.138
,...+0.137
.""+0.132
 society+0.119
Neg logit contributions
-0.174
-0.150
 Originally-0.139
 -0.139
 �-0.133
Token her
Token ID607
Feature activation+5.18e+00

Pos logit contributions
,+0.038
,...+0.029
.+0.028
.""+0.027
,.+0.022
Neg logit contributions
-0.060
-0.053
 Originally-0.050
 -0.049
 �-0.049
Token as
Token ID355
Feature activation-1.96e+00

Pos logit contributions
,+0.564
,...+0.463
.+0.462
.""+0.447
 society+0.407
Neg logit contributions
-0.411
-0.341
 -0.312
 Originally-0.308
 �-0.297
Token the
Token ID262
Feature activation+4.05e+00

Pos logit contributions
+0.202
+0.178
 +0.165
 Originally+0.164
 �+0.161
Neg logit contributions
,-0.164
,...-0.129
.-0.126
.""-0.122
 society-0.105
Token.
Token ID13
Feature activation+7.18e+00

Pos logit contributions
,+0.346
,...+0.239
.""+0.222
.+0.216
 society+0.178
Neg logit contributions
-0.796
-0.728
 -0.676
 Originally-0.664
 �-0.657
Token
Token ID198
Feature activation-3.99e+00

Pos logit contributions
,+1.072
.+0.926
,...+0.906
.""+0.881
 society+0.841
Neg logit contributions
-0.287
-0.175
 -0.157
 Originally-0.153
 �-0.125
Token
Token ID198
Feature activation-3.95e+00

Pos logit contributions
+0.074
 +0.007
 Originally+0.005
+0.000
 Regarding-0.004
Neg logit contributions
,-0.690
.-0.609
,...-0.586
.""-0.568
 society-0.546
TokenBut
Token ID1537
Feature activation+3.35e+00

Pos logit contributions
+0.281
+0.228
 Originally+0.200
 +0.199
 �+0.183
Neg logit contributions
,-0.465
,...-0.388
.-0.382
.""-0.376
 society-0.356
Token the
Token ID262
Feature activation+4.25e+00

Pos logit contributions
,+0.289
,...+0.225
.+0.225
.""+0.218
 society+0.189
Neg logit contributions
-0.334
-0.293
 -0.273
 Originally-0.272
 �-0.260
Token people
Token ID661
Feature activation+9.78e+00

Pos logit contributions
,+0.352
.+0.268
,...+0.268
.""+0.256
 society+0.218
Neg logit contributions
-0.444
-0.392
 -0.366
 Originally-0.365
 �-0.351
Token scoring
Token ID9689
Feature activation+6.91e-01

Pos logit contributions
,+1.024
.+0.828
,...+0.821
.""+0.794
 society+0.724
Neg logit contributions
-0.815
-0.688
 -0.639
 Originally-0.634
 �-0.598
Token the
Token ID262
Feature activation+4.12e+00

Pos logit contributions
,+0.051
,...+0.039
.+0.038
.""+0.037
 society+0.031
Neg logit contributions
-0.077
-0.069
 Originally-0.064
 -0.064
 �-0.062
Token carc
Token ID36756
Feature activation+2.70e+00

Pos logit contributions
,+0.350
,...+0.269
.+0.268
.""+0.258
 society+0.222
Neg logit contributions
-0.423
-0.372
 -0.347
 Originally-0.346
 �-0.333
Tokenasses
Token ID13978
Feature activation+2.16e+00

Pos logit contributions
,+0.288
.+0.234
,...+0.226
.""+0.217
 society+0.200
Neg logit contributions
-0.224
-0.184
 -0.176
 Originally-0.175
 Regarding-0.165
Token are
Token ID389
Feature activation+2.52e+00

Pos logit contributions
,+0.217
.+0.175
,...+0.174
.""+0.168
 society+0.153
Neg logit contributions
-0.188
-0.161
 -0.149
 Originally-0.148
 �-0.140
Token needs
Token ID2476
Feature activation+2.99e+00

Pos logit contributions
,+0.439
,...+0.347
.+0.347
.""+0.331
 society+0.300
Neg logit contributions
-0.434
-0.373
 -0.344
 Originally-0.344
 �-0.330
Token to
Token ID284
Feature activation-3.92e+00

Pos logit contributions
,+0.409
.+0.349
,...+0.344
.""+0.334
 society+0.316
Neg logit contributions
-0.155
-0.113
 -0.102
 Originally-0.101
 �-0.088
Token happen
Token ID1645
Feature activation+4.77e+00

Pos logit contributions
+0.365
+0.317
 +0.290
 Originally+0.290
 �+0.278
Neg logit contributions
,-0.368
,...-0.293
.-0.289
.""-0.277
 society-0.255
Token to
Token ID284
Feature activation-4.05e+00

Pos logit contributions
,+0.477
,...+0.380
.+0.380
.""+0.367
 society+0.338
Neg logit contributions
-0.424
-0.358
 -0.331
 Originally-0.328
 �-0.316
Token these
Token ID777
Feature activation+4.53e+00

Pos logit contributions
+0.399
+0.352
 +0.322
 Originally+0.319
 �+0.314
Neg logit contributions
,-0.356
,...-0.280
.-0.275
.""-0.265
 society-0.231
Token people
Token ID661
Feature activation+9.76e+00

Pos logit contributions
,+0.305
,...+0.216
.+0.213
.""+0.200
 society+0.166
Neg logit contributions
-0.543
-0.489
 -0.461
 Originally-0.459
 �-0.446
Token so
Token ID523
Feature activation+1.67e+00

Pos logit contributions
,+0.709
,...+0.517
.+0.509
.""+0.489
 society+0.430
Neg logit contributions
-1.135
-0.998
 -0.936
 Originally-0.928
 �-0.914
Token that
Token ID326
Feature activation+3.20e+00

Pos logit contributions
,+0.126
,...+0.095
.+0.093
.""+0.091
 society+0.075
Neg logit contributions
-0.186
-0.166
 -0.154
 Originally-0.150
 �-0.150
Token this
Token ID428
Feature activation+3.27e+00

Pos logit contributions
,+0.228
,...+0.171
.+0.166
.""+0.162
 society+0.129
Neg logit contributions
-0.367
-0.330
 -0.306
 Originally-0.300
 �-0.299
Token stops
Token ID9911
Feature activation+9.72e-01

Pos logit contributions
,+0.282
,...+0.221
.+0.217
.""+0.209
 society+0.178
Neg logit contributions
-0.327
-0.290
 -0.267
 Originally-0.262
 �-0.258
Token,
Token ID11
Feature activation+8.51e+00

Pos logit contributions
,+0.060
,...+0.043
.+0.040
.""+0.039
 society+0.033
Neg logit contributions
-0.122
-0.109
 -0.102
 Originally-0.100
 �-0.100
Token in
Token ID287
Feature activation-4.28e+00

Pos logit contributions
,+0.304
,...+0.229
.""+0.215
.+0.214
 society+0.185
Neg logit contributions
-0.492
-0.440
 -0.407
 Originally-0.404
 �-0.395
Token 2008
Token ID3648
Feature activation+1.17e+00

Pos logit contributions
+0.468
+0.420
 Originally+0.386
 +0.386
 �+0.375
Neg logit contributions
,-0.327
,...-0.251
.-0.243
.""-0.237
,.-0.195
Token.
Token ID13
Feature activation+7.15e+00

Pos logit contributions
,+0.059
,...+0.040
.""+0.036
.+0.034
 society+0.029
Neg logit contributions
-0.157
-0.143
 -0.134
 Originally-0.131
 �-0.130
Token The
Token ID383
Feature activation+3.28e+00

Pos logit contributions
,+0.972
.+0.828
,...+0.803
.""+0.778
 society+0.740
Neg logit contributions
-0.379
-0.264
 -0.250
 Originally-0.247
 �-0.220
Token American
Token ID1605
Feature activation+6.95e+00

Pos logit contributions
,+0.262
,...+0.199
.+0.198
.""+0.189
 society+0.161
Neg logit contributions
-0.351
-0.313
 -0.292
 Originally-0.289
 �-0.280
Token people
Token ID661
Feature activation+9.76e+00

Pos logit contributions
,+0.384
,...+0.261
.+0.248
.""+0.242
,.+0.169
Neg logit contributions
-0.905
-0.831
 Originally-0.777
 -0.770
 �-0.752
Token bailed
Token ID47739
Feature activation+3.37e+00

Pos logit contributions
,+0.905
,...+0.722
.+0.720
.""+0.701
 society+0.626
Neg logit contributions
-0.913
-0.790
 -0.724
 Originally-0.718
 �-0.699
Token these
Token ID777
Feature activation+4.45e+00

Pos logit contributions
,+0.371
,...+0.312
.+0.304
.""+0.300
 society+0.273
Neg logit contributions
-0.252
-0.215
 -0.187
 Originally-0.185
 �-0.175
Token sociop
Token ID41221
Feature activation+4.34e+00

Pos logit contributions
,+0.292
,...+0.211
.+0.205
.""+0.197
 society+0.159
Neg logit contributions
-0.533
-0.484
 -0.453
 Originally-0.450
 �-0.436
Tokenaths
Token ID33148
Feature activation+3.65e+00

Pos logit contributions
,+0.461
,...+0.378
.+0.374
.""+0.367
 society+0.336
Neg logit contributions
-0.349
-0.297
 Originally-0.271
 -0.268
 �-0.251
Token out
Token ID503
Feature activation-3.31e+00

Pos logit contributions
,+0.424
,...+0.359
.+0.351
.""+0.346
 society+0.319
Neg logit contributions
-0.253
-0.211
 -0.184
 Originally-0.182
 �-0.171
Token recent
Token ID2274
Feature activation+1.85e+00

Pos logit contributions
+0.589
+0.516
 +0.480
 Originally+0.479
 �+0.477
Neg logit contributions
,-0.387
,...-0.296
.-0.284
.""-0.277
 society-0.233
Token history
Token ID2106
Feature activation+6.60e+00

Pos logit contributions
,+0.130
,...+0.097
.+0.093
.""+0.090
 society+0.075
Neg logit contributions
-0.216
-0.192
 -0.179
 �-0.177
 Originally-0.177
Token,
Token ID11
Feature activation+8.77e+00

Pos logit contributions
,+0.398
,...+0.272
.+0.260
.""+0.254
 society+0.209
Neg logit contributions
-0.849
-0.753
 -0.714
 Originally-0.706
 �-0.701
Token for
Token ID329
Feature activation-3.73e+00

Pos logit contributions
,+0.902
.+0.726
,...+0.696
.""+0.672
 society+0.627
Neg logit contributions
-0.743
-0.625
 -0.602
 Originally-0.599
 Regarding-0.558
Token those
Token ID883
Feature activation+5.39e+00

Pos logit contributions
+0.409
+0.356
 +0.333
 Originally+0.326
 �+0.326
Neg logit contributions
,-0.294
,...-0.225
.-0.220
.""-0.212
 society-0.181
Token people
Token ID661
Feature activation+9.69e+00

Pos logit contributions
,+0.591
,...+0.482
.+0.482
.""+0.464
 society+0.427
Neg logit contributions
-0.424
-0.351
 -0.326
 Originally-0.321
 �-0.305
Token that
Token ID326
Feature activation+3.12e+00

Pos logit contributions
,+0.898
,...+0.711
.+0.699
.""+0.682
 society+0.618
Neg logit contributions
-0.935
-0.794
 -0.737
 Originally-0.727
 �-0.716
Token are
Token ID389
Feature activation+2.52e+00

Pos logit contributions
,+0.298
,...+0.238
.+0.236
.""+0.229
 society+0.205
Neg logit contributions
-0.293
-0.247
 -0.229
 Originally-0.228
 �-0.222
Token unaware
Token ID17261
Feature activation-7.24e-01

Pos logit contributions
,+0.244
,...+0.194
.+0.193
.""+0.185
 society+0.169
Neg logit contributions
-0.230
-0.196
 -0.182
 Originally-0.181
 �-0.175
Token of
Token ID286
Feature activation-3.54e+00

Pos logit contributions
+0.072
+0.061
 +0.055
 �+0.055
 Originally+0.053
Neg logit contributions
,-0.065
,...-0.052
.""-0.050
.-0.050
 society-0.045
Token how
Token ID703
Feature activation+2.93e+00

Pos logit contributions
+0.419
+0.365
 �+0.340
 +0.337
 Originally+0.334
Neg logit contributions
,-0.248
,...-0.190
.""-0.179
.-0.177
 society-0.149
Token great
Token ID1049
Feature activation+3.20e+00

Pos logit contributions
,+0.256
,...+0.197
.+0.194
.""+0.187
 society+0.164
Neg logit contributions
-0.329
-0.292
 -0.270
 Originally-0.269
 �-0.262
Token.
Token ID13
Feature activation+7.15e+00

Pos logit contributions
,+0.206
,...+0.152
.+0.139
.""+0.136
 society+0.122
Neg logit contributions
-0.381
-0.347
 -0.319
 Originally-0.315
 �-0.311
Token Beyond
Token ID12197
Feature activation-1.23e+00

Pos logit contributions
,+0.955
.+0.810
,...+0.794
.""+0.771
 society+0.727
Neg logit contributions
-0.396
-0.290
 -0.269
 Originally-0.266
 �-0.235
Token that
Token ID326
Feature activation+3.29e+00

Pos logit contributions
+0.127
+0.114
 Originally+0.104
 +0.103
 �+0.102
Neg logit contributions
,-0.098
,...-0.077
.-0.074
.""-0.073
 society-0.063
Token,
Token ID11
Feature activation+8.75e+00

Pos logit contributions
,+0.079
,...+0.029
.""+0.024
.+0.024
 society+0.003
Neg logit contributions
-0.513
-0.481
 -0.446
 Originally-0.445
 �-0.443
Token people
Token ID661
Feature activation+9.68e+00

Pos logit contributions
,+0.697
.+0.521
,...+0.512
.""+0.483
 society+0.425
Neg logit contributions
-0.952
-0.835
 -0.795
 Originally-0.793
 �-0.758
Token just
Token ID655
Feature activation+1.67e+00

Pos logit contributions
,+0.931
,...+0.740
.+0.739
.""+0.710
 society+0.643
Neg logit contributions
-0.879
-0.757
 -0.699
 Originally-0.696
 �-0.668
Token like
Token ID588
Feature activation-1.23e+00

Pos logit contributions
,+0.151
.+0.117
,...+0.117
.""+0.112
 society+0.100
Neg logit contributions
-0.162
-0.141
 -0.131
 Originally-0.130
 �-0.125
Token something
Token ID1223
Feature activation+5.51e+00

Pos logit contributions
+0.102
+0.090
 +0.079
 Originally+0.079
 �+0.077
Neg logit contributions
,-0.123
,...-0.102
.-0.099
.""-0.097
 society-0.089
Token or
Token ID393
Feature activation+1.72e+00

Pos logit contributions
,+0.433
,...+0.333
.+0.320
.""+0.309
 society+0.280
Neg logit contributions
-0.585
-0.523
 -0.477
 Originally-0.474
 �-0.465
Token they
Token ID484
Feature activation+6.52e+00

Pos logit contributions
,+0.131
,...+0.099
.+0.098
.""+0.094
 society+0.080
Neg logit contributions
-0.187
-0.169
 -0.156
 Originally-0.155
 �-0.151
Token America
Token ID2253
Feature activation+8.63e+00

Pos logit contributions
+0.544
+0.494
 +0.450
 Originally+0.447
 �+0.442
Neg logit contributions
,-0.350
,...-0.270
.""-0.258
.-0.257
.,-0.207
Token against
Token ID1028
Feature activation-4.39e+00

Pos logit contributions
,+0.502
,...+0.347
.+0.329
.""+0.319
 society+0.265
Neg logit contributions
-1.093
-0.995
 -0.931
 Originally-0.928
 �-0.903
Token Jews
Token ID6771
Feature activation+7.48e+00

Pos logit contributions
+0.476
+0.437
 +0.399
 Originally+0.395
 �+0.387
Neg logit contributions
,-0.326
,...-0.254
.-0.241
.""-0.241
.,-0.196
Token and
Token ID290
Feature activation+2.40e+00

Pos logit contributions
,+0.394
,...+0.261
.+0.243
.""+0.237
 society+0.199
Neg logit contributions
-0.988
-0.901
 -0.842
 Originally-0.839
 �-0.821
Token black
Token ID2042
Feature activation+4.36e+00

Pos logit contributions
,+0.202
,...+0.158
.+0.156
.""+0.150
 society+0.129
Neg logit contributions
-0.244
-0.216
 -0.202
 Originally-0.199
 �-0.192
Token people
Token ID661
Feature activation+9.66e+00

Pos logit contributions
,+0.237
,...+0.162
.+0.149
.""+0.149
 society+0.104
Neg logit contributions
-0.568
-0.520
 -0.489
 Originally-0.481
 �-0.476
Token,
Token ID11
Feature activation+8.69e+00

Pos logit contributions
,+0.423
,...+0.249
.+0.227
.""+0.225
 society+0.172
Neg logit contributions
-1.359
-1.249
 -1.176
 Originally-1.166
 �-1.146
Token which
Token ID543
Feature activation+8.44e-01

Pos logit contributions
,+0.859
.+0.682
,...+0.666
.""+0.639
 society+0.585
Neg logit contributions
-0.784
-0.662
 -0.631
 Originally-0.627
 �-0.590
Token is
Token ID318
Feature activation+3.10e+00

Pos logit contributions
,+0.070
,...+0.056
.+0.055
.""+0.053
 society+0.046
Neg logit contributions
-0.085
-0.076
 Originally-0.070
 -0.070
 �-0.067
Token more
Token ID517
Feature activation+1.97e-01

Pos logit contributions
,+0.286
,...+0.227
.+0.226
.""+0.218
 society+0.195
Neg logit contributions
-0.290
-0.253
 -0.234
 Originally-0.233
 �-0.223
Token in
Token ID287
Feature activation-4.29e+00

Pos logit contributions
,+0.019
.+0.015
,...+0.015
.""+0.014
 society+0.013
Neg logit contributions
-0.018
-0.016
 -0.015
 Originally-0.015
 �-0.014
Token more
Token ID517
Feature activation+2.30e-01

Pos logit contributions
,+0.189
,...+0.155
.+0.152
.""+0.149
 society+0.134
Neg logit contributions
-0.164
-0.140
 Originally-0.128
 -0.127
 �-0.124
Token to
Token ID284
Feature activation-4.05e+00

Pos logit contributions
,+0.025
,...+0.021
.+0.021
.""+0.020
 society+0.018
Neg logit contributions
-0.018
-0.015
 -0.013
 Originally-0.013
 �-0.013
Token do
Token ID466
Feature activation+1.73e+00

Pos logit contributions
+0.358
+0.307
 Originally+0.278
 +0.278
 �+0.270
Neg logit contributions
,-0.399
,...-0.324
.-0.318
.""-0.310
 society-0.278
Token with
Token ID351
Feature activation-4.05e+00

Pos logit contributions
,+0.231
.+0.196
,...+0.192
.""+0.186
 society+0.175
Neg logit contributions
-0.096
-0.072
 -0.067
 Originally-0.066
 Regarding-0.059
Token electoral
Token ID13901
Feature activation+5.16e+00

Pos logit contributions
+0.439
+0.392
 +0.356
 Originally+0.353
 �+0.352
Neg logit contributions
,-0.311
,...-0.241
.-0.234
.""-0.230
 society-0.192
Token politics
Token ID4819
Feature activation+9.62e+00

Pos logit contributions
,+0.314
,...+0.223
.+0.216
.""+0.208
 society+0.161
Neg logit contributions
-0.642
-0.583
 -0.542
 Originally-0.538
 �-0.530
Token than
Token ID621
Feature activation+1.30e+00

Pos logit contributions
,+0.891
,...+0.713
.+0.694
.""+0.685
 society+0.632
Neg logit contributions
-0.931
-0.787
 -0.717
 Originally-0.712
 �-0.705
Token it
Token ID340
Feature activation+5.64e+00

Pos logit contributions
,+0.102
,...+0.079
.+0.077
.""+0.075
 society+0.063
Neg logit contributions
-0.139
-0.124
 -0.114
 Originally-0.114
 �-0.112
Token had
Token ID550
Feature activation+1.49e+00

Pos logit contributions
,+0.547
,...+0.442
.+0.435
.""+0.426
 society+0.384
Neg logit contributions
-0.512
-0.438
 -0.398
 Originally-0.397
 �-0.388
Token to
Token ID284
Feature activation-4.35e+00

Pos logit contributions
,+0.171
,...+0.145
.+0.142
.""+0.140
 society+0.127
Neg logit contributions
-0.107
-0.088
 Originally-0.077
 -0.077
 �-0.075
Token do
Token ID466
Feature activation+1.42e+00

Pos logit contributions
+0.394
+0.336
 +0.311
 Originally+0.310
 �+0.298
Neg logit contributions
,-0.423
,...-0.338
.-0.337
.""-0.323
 society-0.291
Token The
Token ID383
Feature activation+3.28e+00

Pos logit contributions
,+0.890
.+0.746
,...+0.730
.""+0.707
 society+0.664
Neg logit contributions
-0.460
-0.357
 -0.334
 Originally-0.331
 �-0.300
Token emotional
Token ID7016
Feature activation+4.15e+00

Pos logit contributions
,+0.273
.+0.207
,...+0.207
.""+0.195
 society+0.170
Neg logit contributions
-0.344
-0.302
 -0.284
 Originally-0.283
 �-0.272
Token makeup
Token ID16029
Feature activation+5.21e+00

Pos logit contributions
,+0.326
,...+0.251
.+0.246
.""+0.237
 society+0.203
Neg logit contributions
-0.443
-0.396
 -0.368
 Originally-0.364
 �-0.354
Token of
Token ID286
Feature activation-3.36e+00

Pos logit contributions
,+0.680
,...+0.579
.+0.579
.""+0.564
 society+0.526
Neg logit contributions
-0.296
-0.228
 -0.197
 Originally-0.192
 �-0.179
Token the
Token ID262
Feature activation+4.14e+00

Pos logit contributions
+0.362
+0.323
 +0.295
 Originally+0.293
 �+0.290
Neg logit contributions
,-0.262
,...-0.204
.-0.198
.""-0.192
.,-0.158
Token people
Token ID661
Feature activation+9.61e+00

Pos logit contributions
,+0.316
.+0.234
,...+0.232
.""+0.218
 society+0.185
Neg logit contributions
-0.461
-0.409
 -0.386
 Originally-0.384
 �-0.371
Token is
Token ID318
Feature activation+2.85e+00

Pos logit contributions
,+0.983
.+0.792
,...+0.788
.""+0.758
 society+0.689
Neg logit contributions
-0.822
-0.696
 -0.645
 Originally-0.636
 �-0.609
Token essential
Token ID6393
Feature activation-1.52e+00

Pos logit contributions
,+0.279
.+0.224
,...+0.224
.""+0.217
 society+0.195
Neg logit contributions
-0.250
-0.216
 -0.198
 Originally-0.196
 �-0.188
Token to
Token ID284
Feature activation-4.35e+00

Pos logit contributions
+0.105
+0.087
 +0.078
 Originally+0.076
 �+0.073
Neg logit contributions
,-0.177
,...-0.149
.-0.146
.""-0.143
 society-0.134
Token hegemony
Token ID46877
Feature activation+4.68e+00

Pos logit contributions
+0.474
+0.420
 +0.393
 Originally+0.382
 �+0.379
Neg logit contributions
,-0.337
,...-0.258
.-0.252
.""-0.243
 society-0.200
Token;
Token ID26
Feature activation+6.76e+00

Pos logit contributions
,+0.297
,...+0.213
.+0.200
.""+0.196
 society+0.162
Neg logit contributions
-0.573
-0.515
 -0.483
 Originally-0.474
 �-0.466
Token
Token ID960
Feature activation+5.23e+00

Pos logit contributions
,+0.268
,...+0.196
.""+0.182
.+0.180
 society+0.148
Neg logit contributions
-0.509
-0.461
 -0.429
 Originally-0.418
 �-0.418
Tokenpolitics
Token ID34127
Feature activation+7.81e+00

Pos logit contributions
,+0.625
.+0.519
,...+0.518
.""+0.501
 society+0.465
Neg logit contributions
-0.361
-0.293
 -0.265
 Originally-0.261
 �-0.242
Token,
Token ID11
Feature activation+8.89e+00

Pos logit contributions
,+0.269
,...+0.140
.""+0.126
.+0.122
 society+0.073
Neg logit contributions
-1.159
-1.071
 -1.003
 Originally-0.996
 �-0.986
Token health
Token ID1535
Feature activation+7.74e+00

Pos logit contributions
,+0.616
,...+0.446
.+0.441
.""+0.422
 society+0.334
Neg logit contributions
-1.045
-0.939
 -0.882
 Originally-0.872
 �-0.850
Token,
Token ID11
Feature activation+8.88e+00

Pos logit contributions
,+0.392
.""+0.268
,...+0.266
.+0.254
 society+0.185
Neg logit contributions
-1.029
-0.933
 -0.860
 �-0.849
 Originally-0.848
Token religion
Token ID5737
Feature activation+9.60e+00

Pos logit contributions
,+0.629
,...+0.463
.+0.458
.""+0.445
,.+0.350
Neg logit contributions
-1.024
-0.923
 -0.860
 Originally-0.844
 �-0.832
Token and
Token ID290
Feature activation+2.79e+00

Pos logit contributions
,+0.104
,...-0.046
.""-0.049
.-0.076
 society-0.124
Neg logit contributions
-1.648
-1.534
 -1.432
 �-1.425
 Originally-1.419
Token so
Token ID523
Feature activation+1.92e+00

Pos logit contributions
,+0.229
,...+0.184
.""+0.180
.+0.179
,.+0.151
Neg logit contributions
-0.279
-0.252
 -0.228
 �-0.222
 Originally-0.221
Token on
Token ID319
Feature activation-5.26e+00

Pos logit contributions
,+0.254
,...+0.221
.+0.215
.""+0.214
 society+0.200
Neg logit contributions
-0.102
-0.079
 -0.063
 Originally-0.062
 �-0.059
Token.
Token ID13
Feature activation+7.10e+00

Pos logit contributions
+0.787
+0.729
 +0.676
 �+0.670
 Originally+0.668
Neg logit contributions
,-0.174
,...-0.088
.""-0.082
.-0.060
 society-0.045
Token
Token ID198
Feature activation-4.10e+00

Pos logit contributions
,+0.978
.+0.834
,...+0.813
.""+0.789
 society+0.750
Neg logit contributions
-0.364
-0.254
 -0.237
 Originally-0.235
 �-0.204
Token
Token ID198
Feature activation-4.10e+00

Pos logit contributions
+0.075
 +0.008
 Originally+0.007
+0.000
 Regarding-0.003
Neg logit contributions
,-0.703
.-0.620
,...-0.597
.""-0.579
 society-0.554
TokenAbout
Token ID8585
Feature activation-3.02e+00

Pos logit contributions
+0.296
+0.233
 Originally+0.205
 +0.202
 �+0.193
Neg logit contributions
,-0.484
,...-0.404
.-0.399
.""-0.389
 society-0.372
Token 24
Token ID1987
Feature activation-1.72e+00

Pos logit contributions
+0.253
+0.218
 +0.198
 Originally+0.198
 �+0.187
Neg logit contributions
,-0.313
,...-0.253
.-0.252
.""-0.243
 society-0.220
Token,
Token ID11
Feature activation+8.68e+00

Pos logit contributions
+0.253
+0.234
 Originally+0.219
 +0.217
 �+0.214
Neg logit contributions
,-0.060
,...-0.034
.""-0.031
.-0.028
 society-0.015
Token000
Token ID830
Feature activation-3.57e-01

Pos logit contributions
,+1.006
.+0.831
,...+0.828
.""+0.800
 society+0.739
Neg logit contributions
-0.628
-0.515
 Originally-0.469
 -0.468
 �-0.433
Token people
Token ID661
Feature activation+9.58e+00

Pos logit contributions
+0.045
+0.041
 Originally+0.039
 +0.039
 �+0.038
Neg logit contributions
,-0.021
,...-0.014
.-0.014
.""-0.013
 society-0.010
Token have
Token ID423
Feature activation+9.44e-01

Pos logit contributions
,+0.965
.+0.779
,...+0.775
.""+0.749
 society+0.686
Neg logit contributions
-0.829
-0.706
 -0.647
 Originally-0.645
 �-0.619
Token visited
Token ID8672
Feature activation+5.78e-01

Pos logit contributions
,+0.102
,...+0.083
.+0.083
.""+0.080
 society+0.074
Neg logit contributions
-0.075
-0.063
 -0.057
 Originally-0.057
 �-0.054
Token since
Token ID1201
Feature activation+9.95e-02

Pos logit contributions
,+0.045
,...+0.035
.+0.034
.""+0.033
 society+0.029
Neg logit contributions
-0.062
-0.055
 Originally-0.051
 -0.050
 �-0.049
Token it
Token ID340
Feature activation+5.49e+00

Pos logit contributions
,+0.008
,...+0.006
.""+0.006
.+0.006
 society+0.005
Neg logit contributions
-0.010
-0.009
 Originally-0.008
 -0.008
 �-0.008
Token opened
Token ID4721
Feature activation-1.39e-01

Pos logit contributions
,+0.556
,...+0.448
.+0.447
.""+0.431
 society+0.394
Neg logit contributions
-0.471
-0.404
 Originally-0.373
 -0.372
 �-0.350
Token wanted
Token ID2227
Feature activation+1.41e-01

Pos logit contributions
,+0.255
,...+0.202
.+0.201
.""+0.194
 society+0.175
Neg logit contributions
-0.257
-0.223
 -0.207
 Originally-0.206
 �-0.197
Token to
Token ID284
Feature activation-6.81e+00

Pos logit contributions
,+0.018
,...+0.015
.+0.015
.""+0.015
 society+0.014
Neg logit contributions
-0.008
-0.007
 -0.005
 Originally-0.005
 �-0.005
Token get
Token ID651
Feature activation-2.07e+00

Pos logit contributions
+0.632
+0.560
 +0.513
 Originally+0.508
 �+0.484
Neg logit contributions
,-0.615
,...-0.495
.-0.486
.""-0.479
 society-0.430
Token
Token ID564
Feature activation-6.07e+00

Pos logit contributions
+0.178
+0.153
 +0.142
 Originally+0.140
 �+0.133
Neg logit contributions
,-0.209
,...-0.168
.-0.168
.""-0.161
 society-0.146
Token
Token ID250
Feature activation-3.94e+00

Pos logit contributions
+0.407
+0.324
 Originally+0.291
 +0.291
 �+0.272
Neg logit contributions
,-0.738
.-0.615
,...-0.612
.""-0.595
 society-0.556
Tokeninto
Token ID20424
Feature activation-1.13e+01

Pos logit contributions
+0.345
+0.287
 +0.264
 Originally+0.264
 �+0.255
Neg logit contributions
,-0.398
,...-0.322
.-0.319
.""-0.309
 society-0.283
Token tech
Token ID7261
Feature activation+5.63e+00

Pos logit contributions
+1.300
+1.120
 �+1.054
 +1.039
 Originally+1.026
Neg logit contributions
,-0.820
,...-0.620
.-0.608
.""-0.604
,.-0.495
Token,
Token ID11
Feature activation+5.85e+00

Pos logit contributions
,+0.295
,...+0.190
.+0.190
.""+0.185
 society+0.139
Neg logit contributions
-0.764
-0.677
 -0.644
 �-0.638
 Originally-0.635
Token
Token ID447
Feature activation-5.21e+00

Pos logit contributions
,+0.504
.+0.387
,...+0.381
.""+0.365
 society+0.323
Neg logit contributions
-0.603
-0.520
 -0.494
 Originally-0.489
 �-0.473
Token
Token ID251
Feature activation-3.91e-02

Pos logit contributions
+0.358
+0.289
 Originally+0.256
 +0.255
 �+0.248
Neg logit contributions
,-0.616
.-0.514
,...-0.513
.""-0.512
 society-0.468
Token I
Token ID314
Feature activation+2.33e+00

Pos logit contributions
+0.004
+0.003
 +0.003
 Originally+0.003
 �+0.003
Neg logit contributions
,-0.003
.-0.003
,...-0.003
.""-0.002
 society-0.002
Token considered
Token ID3177
Feature activation-2.93e+00

Pos logit contributions
+0.276
+0.237
 +0.223
 Originally+0.220
 �+0.212
Neg logit contributions
,-0.270
,...-0.212
.-0.212
.""-0.202
 society-0.181
Token an
Token ID281
Feature activation-1.77e+00

Pos logit contributions
+0.301
+0.263
 +0.243
 Originally+0.240
 �+0.237
Neg logit contributions
,-0.246
,...-0.196
.-0.186
.""-0.184
 society-0.162
Token elite
Token ID9085
Feature activation+1.26e+00

Pos logit contributions
+0.194
+0.170
 +0.163
 Originally+0.162
 �+0.155
Neg logit contributions
,-0.139
.-0.104
,...-0.102
.""-0.096
 society-0.085
Token all
Token ID477
Feature activation-1.50e+00

Pos logit contributions
,+0.105
,...+0.082
.+0.080
.""+0.079
 society+0.070
Neg logit contributions
-0.129
-0.114
 -0.104
 Originally-0.104
 �-0.101
Token-
Token ID12
Feature activation-5.30e-01

Pos logit contributions
+0.204
+0.186
 +0.176
 Originally+0.174
 �+0.172
Neg logit contributions
,-0.074
,...-0.048
.-0.044
.""-0.043
 society-0.032
Tokenaround
Token ID14145
Feature activation-9.98e+00

Pos logit contributions
+0.020
+0.013
 +0.010
 Originally+0.010
 �+0.008
Neg logit contributions
,-0.080
,...-0.069
.-0.069
.""-0.067
 society-0.064
Token talent
Token ID7401
Feature activation+3.06e+00

Pos logit contributions
+1.032
+0.914
 +0.838
 Originally+0.827
 �+0.815
Neg logit contributions
,-0.814
,...-0.644
.""-0.614
.-0.610
 society-0.540
Token.
Token ID13
Feature activation+3.47e+00

Pos logit contributions
,+0.101
,...+0.051
.""+0.043
.+0.037
 society+0.021
Neg logit contributions
-0.469
-0.428
 -0.405
 Originally-0.401
 �-0.400
Token
Token ID198
Feature activation-7.87e+00

Pos logit contributions
,+0.417
.+0.347
,...+0.338
.""+0.326
 society+0.305
Neg logit contributions
-0.239
-0.188
 -0.178
 Originally-0.177
 �-0.161
Token
Token ID198
Feature activation-7.88e+00

Pos logit contributions
+0.142
 Originally+0.013
 +0.013
+0.000
 Regarding-0.010
Neg logit contributions
,-1.359
.-1.201
,...-1.152
.""-1.119
 society-1.073
TokenAll
Token ID3237
Feature activation-2.85e+00

Pos logit contributions
+0.564
+0.450
 +0.401
 Originally+0.396
 �+0.375
Neg logit contributions
,-0.930
,...-0.774
.-0.767
.""-0.749
 society-0.708
Token can
Token ID460
Feature activation-2.06e+00

Pos logit contributions
,+0.072
,...+0.056
.+0.056
.""+0.055
 society+0.049
Neg logit contributions
-0.082
-0.072
 -0.066
 Originally-0.066
 �-0.064
Token be
Token ID307
Feature activation-3.07e+00

Pos logit contributions
+0.184
+0.160
 +0.148
 Originally+0.147
 �+0.140
Neg logit contributions
,-0.199
,...-0.159
.-0.159
.""-0.153
 society-0.139
Token sent
Token ID1908
Feature activation-2.44e+00

Pos logit contributions
+0.259
+0.220
 +0.204
 Originally+0.202
 �+0.192
Neg logit contributions
,-0.316
.-0.255
,...-0.253
.""-0.245
 society-0.224
Token
Token ID198
Feature activation-7.58e+00

Pos logit contributions
+0.144
+0.114
 +0.101
 Originally+0.100
 �+0.091
Neg logit contributions
,-0.311
.-0.263
,...-0.261
.""-0.254
 society-0.238
Token
Token ID198
Feature activation-7.65e+00

Pos logit contributions
+0.140
 +0.015
 Originally+0.014
+0.001
 Regarding-0.007
Neg logit contributions
,-1.308
.-1.156
,...-1.114
.""-1.079
 society-1.031
Tokento
Token ID1462
Feature activation-9.76e+00

Pos logit contributions
+0.406
+0.302
 +0.261
 Originally+0.259
 �+0.229
Neg logit contributions
,-1.032
.-0.877
,...-0.873
.""-0.851
 society-0.810
Token explore
Token ID7301
Feature activation-3.82e+00

Pos logit contributions
+0.995
+0.889
 +0.818
 Originally+0.816
 �+0.786
Neg logit contributions
,-0.810
,...-0.632
.-0.622
.""-0.604
 society-0.521
Token,
Token ID11
Feature activation+4.98e+00

Pos logit contributions
+0.429
+0.391
 +0.360
 Originally+0.358
 �+0.347
Neg logit contributions
,-0.273
,...-0.204
.-0.199
.""-0.198
 society-0.166
Token harvest
Token ID13222
Feature activation-4.65e-01

Pos logit contributions
,+0.468
.+0.367
,...+0.357
.""+0.341
 society+0.311
Neg logit contributions
-0.474
-0.403
 -0.386
 Originally-0.385
 �-0.362
Token resources
Token ID4133
Feature activation+1.49e+00

Pos logit contributions
+0.057
+0.053
 +0.049
 Originally+0.048
 �+0.048
Neg logit contributions
,-0.027
.""-0.020
,...-0.019
.-0.019
 society-0.015
Token,
Token ID11
Feature activation+6.79e+00

Pos logit contributions
,+0.058
.+0.030
,...+0.029
.""+0.028
 society+0.018
Neg logit contributions
-0.217
-0.200
 -0.190
 Originally-0.188
 �-0.184
Token
Token ID447
Feature activation-3.44e+00

Pos logit contributions
,+1.301
.+1.160
,...+1.156
.""+1.135
 society+1.091
Neg logit contributions
-0.024
+0.077
 Originally+0.115
 +0.115
 �+0.137
Token
Token ID251
Feature activation+1.62e+00

Pos logit contributions
+0.255
+0.212
 Originally+0.190
 +0.188
 �+0.179
Neg logit contributions
,-0.393
,...-0.323
.-0.323
.""-0.316
 society-0.290
Token
Token ID198
Feature activation-5.34e+00

Pos logit contributions
,+0.281
.+0.248
,...+0.247
.""+0.242
 society+0.232
Neg logit contributions
-0.026
-0.004
 +0.004
 Originally+0.004
 �+0.010
Token
Token ID198
Feature activation-5.41e+00

Pos logit contributions
+0.102
 +0.014
 Originally+0.012
 Regarding+0.000
+0.000
Neg logit contributions
,-0.919
.-0.812
,...-0.778
.""-0.753
 society-0.722
TokenT
Token ID51
Feature activation-1.99e+00

Pos logit contributions
+0.094
+0.003
 Originally-0.037
 -0.038
 �-0.042
Neg logit contributions
,-0.945
,...-0.843
.-0.831
.""-0.823
 society-0.800
Tokenoward
Token ID46138
Feature activation-9.75e+00

Pos logit contributions
+0.148
+0.124
 Originally+0.111
 +0.109
 �+0.103
Neg logit contributions
,-0.224
,...-0.187
.-0.182
.""-0.181
 society-0.168
Token the
Token ID262
Feature activation+2.98e+00

Pos logit contributions
+1.137
+1.019
 +0.948
 Originally+0.934
 �+0.933
Neg logit contributions
,-0.674
,...-0.500
.-0.486
.""-0.472
 society-0.384
Token end
Token ID886
Feature activation-1.82e+00

Pos logit contributions
,+0.304
.+0.245
,...+0.243
.""+0.235
 society+0.212
Neg logit contributions
-0.257
-0.218
 -0.202
 Originally-0.200
 �-0.191
Token of
Token ID286
Feature activation-4.56e+00

Pos logit contributions
+0.114
+0.088
 +0.074
 Originally+0.073
 �+0.072
Neg logit contributions
,-0.224
.-0.192
,...-0.192
.""-0.191
 society-0.177
Token the
Token ID262
Feature activation+2.95e+00

Pos logit contributions
+0.530
+0.472
 Originally+0.434
 +0.433
 �+0.431
Neg logit contributions
,-0.319
,...-0.242
.-0.234
.""-0.230
 society-0.189
Token clip
Token ID10651
Feature activation-5.13e-01

Pos logit contributions
,+0.275
.+0.217
,...+0.216
.""+0.207
 society+0.186
Neg logit contributions
-0.280
-0.241
 -0.224
 Originally-0.223
 �-0.215
Token last
Token ID938
Feature activation-3.18e+00

Pos logit contributions
+0.251
+0.229
 +0.214
 Originally+0.210
 �+0.202
Neg logit contributions
,-0.154
.""-0.110
.-0.110
,...-0.110
 society-0.095
Token month
Token ID1227
Feature activation-8.55e-01

Pos logit contributions
+0.329
+0.295
 +0.271
 Originally+0.269
 �+0.259
Neg logit contributions
,-0.259
,...-0.202
.-0.195
.""-0.194
 society-0.170
Token.
Token ID13
Feature activation+4.24e+00

Pos logit contributions
+0.102
+0.094
 +0.088
 Originally+0.086
 �+0.083
Neg logit contributions
,-0.056
.""-0.039
,...-0.038
.-0.038
 society-0.033
Token
Token ID198
Feature activation-6.96e+00

Pos logit contributions
,+0.611
.+0.525
,...+0.515
.""+0.500
 society+0.475
Neg logit contributions
-0.191
-0.127
 -0.114
 Originally-0.113
 �-0.096
Token
Token ID198
Feature activation-6.89e+00

Pos logit contributions
+0.131
 +0.012
 Originally+0.010
+0.000
 Regarding-0.006
Neg logit contributions
,-1.201
.-1.061
,...-1.024
.""-0.989
 society-0.949
TokenAgainst
Token ID39276
Feature activation-9.74e+00

Pos logit contributions
+0.492
+0.400
 +0.356
 Originally+0.352
 �+0.323
Neg logit contributions
,-0.806
,...-0.666
.-0.661
.""-0.648
 society-0.617
Token Sweden
Token ID10710
Feature activation-2.40e-01

Pos logit contributions
+0.951
+0.861
 +0.781
 Originally+0.768
 �+0.742
Neg logit contributions
,-0.829
,...-0.661
.""-0.640
.-0.640
 society-0.553
Token in
Token ID287
Feature activation-7.20e+00

Pos logit contributions
+0.028
+0.026
 +0.024
 Originally+0.024
 �+0.023
Neg logit contributions
,-0.016
.""-0.011
,...-0.011
.-0.011
 society-0.009
Token June
Token ID2795
Feature activation-1.62e+00

Pos logit contributions
+0.711
+0.633
 +0.578
 Originally+0.574
 �+0.556
Neg logit contributions
,-0.624
,...-0.493
.-0.483
.""-0.471
 society-0.411
Token,
Token ID11
Feature activation+5.93e+00

Pos logit contributions
+0.209
+0.195
 +0.181
 Originally+0.179
 �+0.174
Neg logit contributions
,-0.087
.""-0.059
,...-0.058
.-0.056
 society-0.045
Token they
Token ID484
Feature activation+3.95e+00

Pos logit contributions
,+0.508
.+0.387
,...+0.380
.""+0.361
 society+0.322
Neg logit contributions
-0.612
-0.532
 -0.507
 Originally-0.504
 �-0.479
Token negative
Token ID4633
Feature activation-2.45e+00

Pos logit contributions
+0.014
+0.012
 +0.011
 Originally+0.011
 �+0.011
Neg logit contributions
,-0.012
,...-0.009
.-0.009
.""-0.009
 society-0.008
Token (
Token ID357
Feature activation-1.14e+00

Pos logit contributions
+0.198
+0.167
 +0.148
 Originally+0.146
 �+0.145
Neg logit contributions
,-0.259
,...-0.216
.-0.210
.""-0.210
 society-0.189
Tokentem
Token ID11498
Feature activation-3.28e+00

Pos logit contributions
+0.078
+0.062
 +0.055
 Originally+0.055
 �+0.051
Neg logit contributions
,-0.138
,...-0.115
.-0.114
.""-0.112
 society-0.105
Tokenperature
Token ID21069
Feature activation-2.44e+00

Pos logit contributions
+0.235
+0.191
 Originally+0.175
 +0.174
 �+0.161
Neg logit contributions
,-0.383
.-0.316
,...-0.314
.""-0.304
 society-0.283
Token-
Token ID12
Feature activation-3.75e-01

Pos logit contributions
+0.335
+0.302
 Originally+0.284
 +0.282
 �+0.280
Neg logit contributions
,-0.121
,...-0.077
.-0.074
.""-0.071
 society-0.051
Tokenatten
Token ID41769
Feature activation-9.74e+00

Pos logit contributions
+0.028
+0.023
 +0.021
 Originally+0.021
 �+0.020
Neg logit contributions
,-0.043
.-0.035
,...-0.035
.""-0.034
 society-0.031
Tokenuating
Token ID11927
Feature activation-6.07e+00

Pos logit contributions
+0.642
+0.521
 +0.457
 Originally+0.457
 �+0.420
Neg logit contributions
,-1.178
,...-0.987
.-0.981
.""-0.955
 society-0.898
Token)
Token ID8
Feature activation-2.50e+00

Pos logit contributions
+0.741
+0.657
 Originally+0.608
 +0.604
 �+0.601
Neg logit contributions
,-0.390
,...-0.284
.-0.276
.""-0.270
 society-0.222
Token feedback
Token ID7538
Feature activation-4.19e-01

Pos logit contributions
+0.265
+0.231
 +0.219
 Originally+0.218
 �+0.210
Neg logit contributions
,-0.207
.-0.156
,...-0.154
.""-0.146
 society-0.129
Token in
Token ID287
Feature activation-7.70e+00

Pos logit contributions
+0.042
+0.037
 +0.034
 Originally+0.034
 �+0.033
Neg logit contributions
,-0.036
,...-0.028
.-0.027
.""-0.027
 society-0.024
Token Watts
Token ID27555
Feature activation-2.63e+00

Pos logit contributions
+0.820
+0.725
 +0.671
 Originally+0.662
 �+0.654
Neg logit contributions
,-0.617
,...-0.474
.-0.466
.""-0.453
 society-0.382
Token did
Token ID750
Feature activation+4.34e-01

Pos logit contributions
+0.115
+0.099
 +0.092
 Originally+0.091
 �+0.087
Neg logit contributions
,-0.124
.-0.099
,...-0.099
.""-0.096
 society-0.087
Token not
Token ID407
Feature activation-3.01e+00

Pos logit contributions
,+0.050
,...+0.043
.+0.042
.""+0.041
 society+0.038
Neg logit contributions
-0.030
-0.025
 -0.022
 Originally-0.021
 �-0.021
Token appear
Token ID1656
Feature activation-1.26e+00

Pos logit contributions
+0.272
+0.235
 +0.218
 Originally+0.217
 �+0.206
Neg logit contributions
,-0.290
.-0.230
,...-0.230
.""-0.221
 society-0.201
Token to
Token ID284
Feature activation-7.04e+00

Pos logit contributions
+0.082
+0.066
 +0.059
 Originally+0.058
 �+0.054
Neg logit contributions
,-0.153
,...-0.128
.-0.127
.""-0.125
 society-0.117
Token point
Token ID966
Feature activation-8.17e-01

Pos logit contributions
+0.626
+0.543
 +0.500
 Originally+0.498
 �+0.473
Neg logit contributions
,-0.684
,...-0.549
.-0.545
.""-0.525
 society-0.474
Token towards
Token ID3371
Feature activation-9.64e+00

Pos logit contributions
+0.049
+0.038
 +0.034
 Originally+0.034
 �+0.031
Neg logit contributions
,-0.104
.-0.088
,...-0.088
.""-0.085
 society-0.080
Token the
Token ID262
Feature activation+1.12e+00

Pos logit contributions
+1.080
+0.963
 +0.883
 Originally+0.875
 �+0.870
Neg logit contributions
,-0.701
,...-0.541
.-0.517
.""-0.505
 society-0.425
Token spacecraft
Token ID16807
Feature activation+1.45e+00

Pos logit contributions
,+0.101
.+0.078
,...+0.078
.""+0.074
 society+0.066
Neg logit contributions
-0.110
-0.095
 -0.089
 Originally-0.089
 �-0.085
Token.
Token ID13
Feature activation+3.72e+00

Pos logit contributions
,+0.062
,...+0.038
.""+0.035
.+0.033
 society+0.025
Neg logit contributions
-0.205
-0.188
 -0.176
 Originally-0.175
 �-0.172
Token Deep
Token ID10766
Feature activation-2.44e+00

Pos logit contributions
,+0.525
.+0.450
,...+0.438
.""+0.425
 society+0.404
Neg logit contributions
-0.178
-0.121
 -0.113
 Originally-0.110
 �-0.095
Token Space
Token ID4687
Feature activation+6.60e-01

Pos logit contributions
+0.228
+0.199
 +0.183
 Originally+0.182
 �+0.175
Neg logit contributions
,-0.226
,...-0.182
.-0.177
.""-0.173
 society-0.154
Token "
Token ID366
Feature activation-1.08e+00

Pos logit contributions
,+0.528
.+0.450
,...+0.444
.""+0.432
 society+0.408
Neg logit contributions
-0.199
-0.146
 -0.131
 Originally-0.130
 �-0.113
TokenHaving
Token ID14698
Feature activation-4.27e+00

Pos logit contributions
+0.072
+0.057
 +0.052
 Originally+0.051
 �+0.047
Neg logit contributions
,-0.133
.-0.111
,...-0.110
.""-0.107
 society-0.100
Token lured
Token ID41546
Feature activation-3.13e+00

Pos logit contributions
+0.393
+0.341
 +0.314
 Originally+0.314
 �+0.301
Neg logit contributions
,-0.401
,...-0.320
.-0.317
.""-0.307
 society-0.272
Token the
Token ID262
Feature activation+7.38e-01

Pos logit contributions
+0.360
+0.324
 +0.298
 Originally+0.298
 �+0.294
Neg logit contributions
,-0.218
,...-0.164
.-0.158
.""-0.155
 society-0.125
Token child
Token ID1200
Feature activation+5.09e+00

Pos logit contributions
,+0.054
.+0.039
,...+0.039
.""+0.037
 society+0.031
Neg logit contributions
-0.085
-0.075
 -0.071
 Originally-0.071
 �-0.069
Token into
Token ID656
Feature activation-9.62e+00

Pos logit contributions
,+0.635
.+0.535
,...+0.534
.""+0.519
 society+0.484
Neg logit contributions
-0.319
-0.252
 -0.224
 Originally-0.222
 �-0.207
Token the
Token ID262
Feature activation+4.49e-01

Pos logit contributions
+1.054
+0.942
 +0.867
 Originally+0.859
 �+0.853
Neg logit contributions
,-0.728
,...-0.560
.-0.543
.""-0.528
.,-0.434
Token early
Token ID1903
Feature activation-4.55e+00

Pos logit contributions
,+0.035
.+0.026
,...+0.026
.""+0.024
 society+0.021
Neg logit contributions
-0.050
-0.044
 -0.041
 Originally-0.041
 �-0.040
Token stages
Token ID9539
Feature activation-2.07e+00

Pos logit contributions
+0.479
+0.427
 +0.387
 Originally+0.386
 �+0.380
Neg logit contributions
,-0.362
,...-0.282
.-0.274
.""-0.269
 society-0.228
Token of
Token ID286
Feature activation-6.99e+00

Pos logit contributions
+0.120
+0.094
 +0.078
 Originally+0.076
 �+0.076
Neg logit contributions
,-0.263
,...-0.226
.-0.225
.""-0.222
 society-0.205
Token a
Token ID257
Feature activation+8.67e-02

Pos logit contributions
+0.800
+0.719
 +0.669
 Originally+0.663
 �+0.654
Neg logit contributions
,-0.501
,...-0.372
.-0.363
.""-0.351
 society-0.283
Token York
Token ID1971
Feature activation+1.45e+00

Pos logit contributions
,+0.004
,...+0.003
.+0.003
.""+0.003
 society+0.003
Neg logit contributions
-0.005
-0.005
 Originally-0.004
 -0.004
 �-0.004
Token Times
Token ID3782
Feature activation+7.43e-01

Pos logit contributions
,+0.131
,...+0.105
.+0.102
.""+0.100
 society+0.088
Neg logit contributions
-0.139
-0.122
 -0.111
 Originally-0.110
 �-0.107
Token that
Token ID326
Feature activation-2.99e-02

Pos logit contributions
,+0.059
.+0.044
,...+0.044
.""+0.043
 society+0.038
Neg logit contributions
-0.082
-0.070
 -0.065
 �-0.065
 Originally-0.064
Token any
Token ID597
Feature activation-9.44e-01

Pos logit contributions
+0.003
+0.003
 +0.003
 Originally+0.003
 �+0.003
Neg logit contributions
,-0.002
,...-0.002
.-0.002
.""-0.002
 society-0.001
Token digging
Token ID18894
Feature activation-1.22e+00

Pos logit contributions
+0.101
+0.089
 +0.083
 Originally+0.082
 �+0.080
Neg logit contributions
,-0.075
.-0.057
,...-0.057
.""-0.054
 society-0.047
Token into
Token ID656
Feature activation-9.62e+00

Pos logit contributions
+0.075
+0.059
 +0.051
 Originally+0.050
 �+0.048
Neg logit contributions
,-0.154
,...-0.131
.-0.130
.""-0.127
 society-0.118
Token matters
Token ID6067
Feature activation+8.92e-01

Pos logit contributions
+1.088
+0.979
 �+0.882
 +0.879
 Originally+0.873
Neg logit contributions
,-0.684
,...-0.519
.-0.503
.""-0.503
 society-0.410
Token beyond
Token ID3675
Feature activation-4.90e+00

Pos logit contributions
,+0.095
.+0.078
,...+0.077
.""+0.075
 society+0.068
Neg logit contributions
-0.072
-0.060
 -0.056
 Originally-0.055
 �-0.053
Token Russia
Token ID3284
Feature activation+3.72e+00

Pos logit contributions
+0.552
+0.500
 �+0.451
 +0.450
 Originally+0.448
Neg logit contributions
,-0.349
,...-0.262
.-0.257
.""-0.256
 society-0.209
Token would
Token ID561
Feature activation-1.97e+00

Pos logit contributions
,+0.312
,...+0.241
.+0.240
.""+0.233
 society+0.208
Neg logit contributions
-0.392
-0.335
 -0.309
 �-0.308
 Originally-0.307
Token be
Token ID307
Feature activation-1.72e+00

Pos logit contributions
+0.170
+0.145
 +0.134
 Originally+0.132
 �+0.127
Neg logit contributions
,-0.198
.-0.160
,...-0.159
.""-0.153
 society-0.141
Token this
Token ID428
Feature activation+2.17e-01

Pos logit contributions
+0.122
+0.107
 +0.099
 Originally+0.097
 �+0.095
Neg logit contributions
,-0.114
,...-0.091
.-0.089
.""-0.087
 society-0.078
Token to
Token ID284
Feature activation-7.42e+00

Pos logit contributions
,+0.024
,...+0.020
.+0.019
.""+0.019
 society+0.018
Neg logit contributions
-0.017
-0.014
 -0.012
 Originally-0.012
 �-0.012
Token the
Token ID262
Feature activation+8.89e-01

Pos logit contributions
+0.781
+0.698
 +0.636
 Originally+0.636
 �+0.623
Neg logit contributions
,-0.587
,...-0.460
.-0.444
.""-0.443
 society-0.379
Token famed
Token ID28331
Feature activation-6.63e-01

Pos logit contributions
,+0.076
.+0.058
,...+0.058
.""+0.056
 society+0.049
Neg logit contributions
-0.091
-0.079
 Originally-0.074
 -0.074
 �-0.071
Token Mom
Token ID11254
Feature activation+3.82e+00

Pos logit contributions
+0.064
+0.056
 Originally+0.052
 +0.052
 �+0.050
Neg logit contributions
,-0.059
,...-0.047
.-0.046
.""-0.045
 society-0.040
Tokenof
Token ID1659
Feature activation-9.62e+00

Pos logit contributions
,+0.503
,...+0.427
.+0.426
.""+0.415
 society+0.391
Neg logit contributions
-0.215
-0.165
 Originally-0.144
 -0.144
 �-0.130
Tokenuku
Token ID33263
Feature activation-2.69e+00

Pos logit contributions
+0.822
+0.707
 Originally+0.649
 +0.633
 �+0.605
Neg logit contributions
,-0.964
,...-0.784
.-0.773
.""-0.755
 society-0.705
Token in
Token ID287
Feature activation-8.15e+00

Pos logit contributions
+0.310
+0.275
 Originally+0.256
 +0.255
 �+0.250
Neg logit contributions
,-0.191
,...-0.143
.-0.138
.""-0.137
 society-0.118
Token New
Token ID968
Feature activation-5.15e-01

Pos logit contributions
+0.822
+0.729
 Originally+0.670
 +0.668
 �+0.645
Neg logit contributions
,-0.692
,...-0.546
.-0.533
.""-0.520
 society-0.450
Token York
Token ID1971
Feature activation+1.07e+00

Pos logit contributions
+0.055
+0.049
 Originally+0.045
 +0.045
 �+0.043
Neg logit contributions
,-0.041
,...-0.031
.-0.031
.""-0.030
 society-0.026
Token.
Token ID13
Feature activation+3.35e+00

Pos logit contributions
,+0.048
,...+0.029
.""+0.029
.+0.027
 society+0.020
Neg logit contributions
-0.151
-0.135
 �-0.127
 Originally-0.127
 -0.126
Token deduct
Token ID18777
Feature activation-3.45e+00

Pos logit contributions
+0.168
+0.148
 Originally+0.134
 +0.133
 �+0.131
Neg logit contributions
,-0.139
,...-0.110
.-0.108
.""-0.105
 society-0.093
Tokenibles
Token ID18764
Feature activation-1.78e+00

Pos logit contributions
+0.301
+0.255
 Originally+0.239
 +0.239
 �+0.224
Neg logit contributions
,-0.347
.-0.278
,...-0.276
.""-0.265
 society-0.243
Token and
Token ID290
Feature activation-6.63e-01

Pos logit contributions
+0.225
+0.203
 +0.188
 Originally+0.188
 �+0.186
Neg logit contributions
,-0.104
,...-0.072
.""-0.069
.-0.069
 society-0.058
Token out
Token ID503
Feature activation-6.38e+00

Pos logit contributions
+0.065
+0.057
 Originally+0.053
 +0.052
 �+0.051
Neg logit contributions
,-0.058
,...-0.046
.-0.045
.""-0.044
 society-0.039
Token-
Token ID12
Feature activation-5.30e-01

Pos logit contributions
+0.871
+0.791
 Originally+0.747
 +0.737
 �+0.731
Neg logit contributions
,-0.320
,...-0.208
.-0.190
.""-0.180
 society-0.135
Tokenof
Token ID1659
Feature activation-9.55e+00

Pos logit contributions
+0.018
+0.011
 Originally+0.008
 +0.008
 �+0.006
Neg logit contributions
,-0.082
.-0.071
,...-0.071
.""-0.069
 society-0.066
Token-
Token ID12
Feature activation-7.65e-01

Pos logit contributions
+1.369
+1.239
 +1.195
 Originally+1.193
 �+1.155
Neg logit contributions
,-0.431
.-0.238
,...-0.227
.""-0.197
 society-0.135
Tokenpocket
Token ID31991
Feature activation-2.13e+00

Pos logit contributions
+0.058
+0.048
 Originally+0.043
 +0.043
 �+0.040
Neg logit contributions
,-0.086
,...-0.071
.-0.071
.""-0.068
 society-0.063
Token costs
Token ID3484
Feature activation+3.44e-01

Pos logit contributions
+0.221
+0.197
 Originally+0.179
 �+0.176
 +0.175
Neg logit contributions
,-0.172
,...-0.137
.-0.131
.""-0.128
 society-0.113
Token,
Token ID11
Feature activation+4.98e+00

Pos logit contributions
,+0.023
,...+0.016
.+0.016
.""+0.016
 society+0.013
Neg logit contributions
-0.041
-0.037
 -0.034
 Originally-0.034
 �-0.033
Token but
Token ID475
Feature activation+5.30e-03

Pos logit contributions
,+0.500
.+0.398
,...+0.386
.""+0.370
 society+0.341
Neg logit contributions
-0.444
-0.370
 -0.358
 Originally-0.354
 �-0.332
Token socialist
Token ID15889
Feature activation+2.86e+00

Pos logit contributions
+0.317
+0.280
 +0.261
 Originally+0.259
 �+0.253
Neg logit contributions
,-0.244
,...-0.188
.-0.185
.""-0.177
 society-0.156
Token principles
Token ID7811
Feature activation+2.16e+00

Pos logit contributions
,+0.191
,...+0.140
.+0.137
.""+0.133
 society+0.109
Neg logit contributions
-0.341
-0.305
 -0.286
 Originally-0.283
 �-0.278
Token can
Token ID460
Feature activation-2.06e+00

Pos logit contributions
,+0.203
,...+0.163
.+0.161
.""+0.157
 society+0.141
Neg logit contributions
-0.201
-0.172
 -0.158
 Originally-0.156
 �-0.154
Token be
Token ID307
Feature activation-3.07e+00

Pos logit contributions
+0.172
+0.140
 +0.127
 Originally+0.125
 �+0.124
Neg logit contributions
,-0.218
,...-0.180
.-0.178
.""-0.173
 society-0.159
Token incorporated
Token ID16560
Feature activation-4.32e+00

Pos logit contributions
+0.264
+0.222
 +0.205
 Originally+0.202
 �+0.196
Neg logit contributions
,-0.312
,...-0.253
.-0.252
.""-0.242
 society-0.222
Token into
Token ID656
Feature activation-9.51e+00

Pos logit contributions
+0.284
+0.222
 +0.193
 Originally+0.189
 �+0.186
Neg logit contributions
,-0.525
,...-0.446
.-0.438
.""-0.433
 society-0.408
Token that
Token ID326
Feature activation-2.64e-01

Pos logit contributions
+1.065
+0.954
 +0.875
 Originally+0.874
 �+0.862
Neg logit contributions
,-0.695
,...-0.546
.-0.512
.""-0.509
,.-0.408
Token,
Token ID11
Feature activation+5.17e+00

Pos logit contributions
+0.039
+0.035
 +0.033
 �+0.033
 Originally+0.033
Neg logit contributions
,-0.011
,...-0.006
.""-0.005
.-0.005
 society-0.003
Token but
Token ID475
Feature activation-3.05e-02

Pos logit contributions
,+0.579
.+0.474
,...+0.463
.""+0.447
 society+0.415
Neg logit contributions
-0.395
-0.325
 -0.308
 Originally-0.306
 Regarding-0.282
Token persuasion
Token ID41784
Feature activation+3.70e+00

Pos logit contributions
+0.003
+0.003
 +0.003
 Originally+0.003
 �+0.003
Neg logit contributions
,-0.002
,...-0.002
.-0.002
.""-0.002
 society-0.002
Token comes
Token ID2058
Feature activation-8.09e-01

Pos logit contributions
,+0.349
,...+0.278
.+0.277
.""+0.268
 society+0.242
Neg logit contributions
-0.347
-0.294
 -0.271
 Originally-0.267
 �-0.264
Token to
Token ID284
Feature activation-7.17e+00

Pos logit contributions
+0.097
+0.082
 Originally+0.075
 +0.075
 �+0.072
Neg logit contributions
,-0.116
,...-0.093
.-0.093
.""-0.091
 society-0.083
Token
Token ID564
Feature activation-6.55e+00

Pos logit contributions
+0.701
+0.607
 +0.567
 Originally+0.557
 �+0.540
Neg logit contributions
,-0.644
,...-0.504
.-0.501
.""-0.481
 society-0.433
Token
Token ID250
Feature activation-4.43e+00

Pos logit contributions
+0.514
+0.423
 +0.398
 Originally+0.396
 �+0.367
Neg logit contributions
,-0.724
.-0.591
,...-0.580
.""-0.559
 society-0.518
Tokenstrike
Token ID33069
Feature activation-8.37e-01

Pos logit contributions
+0.318
+0.257
 +0.234
 Originally+0.230
 �+0.215
Neg logit contributions
,-0.517
,...-0.429
.-0.427
.""-0.413
 society-0.389
Token fear
Token ID3252
Feature activation+1.79e+00

Pos logit contributions
+0.084
+0.072
 +0.067
 Originally+0.066
 �+0.065
Neg logit contributions
,-0.074
,...-0.058
.-0.057
.""-0.055
 society-0.050
Token into
Token ID656
Feature activation-9.51e+00

Pos logit contributions
,+0.242
.+0.206
,...+0.206
.""+0.200
 society+0.189
Neg logit contributions
-0.095
-0.069
 -0.061
 Originally-0.060
 �-0.054
Token the
Token ID262
Feature activation+6.67e-01

Pos logit contributions
+1.123
+0.995
 +0.929
 Originally+0.913
 �+0.910
Neg logit contributions
,-0.654
,...-0.486
.-0.469
.""-0.454
,.-0.377
Token hearts
Token ID11954
Feature activation+2.08e+00

Pos logit contributions
,+0.050
.+0.037
,...+0.037
.""+0.034
 society+0.029
Neg logit contributions
-0.076
-0.067
 -0.063
 Originally-0.063
 �-0.061
Token of
Token ID286
Feature activation-7.03e+00

Pos logit contributions
,+0.269
.+0.228
,...+0.226
.""+0.221
 society+0.207
Neg logit contributions
-0.123
-0.094
 -0.083
 Originally-0.082
 �-0.076
Token criminals
Token ID12954
Feature activation+5.27e+00

Pos logit contributions
+0.840
+0.743
 +0.702
 Originally+0.693
 �+0.684
Neg logit contributions
,-0.483
,...-0.348
.-0.345
.""-0.329
 society-0.265
Token
Token ID447
Feature activation-5.90e+00

Pos logit contributions
,+0.380
,...+0.278
.+0.274
.""+0.267
 society+0.230
Neg logit contributions
-0.617
-0.537
 -0.504
 Originally-0.501
 �-0.494
Tokenus
Token ID385
Feature activation-1.23e+00

Pos logit contributions
+0.289
+0.241
 +0.230
 Originally+0.229
 �+0.214
Neg logit contributions
,-0.343
.-0.276
,...-0.268
.""-0.258
 society-0.236
Token 9
Token ID860
Feature activation-3.83e+00

Pos logit contributions
+0.098
+0.083
 +0.074
 Originally+0.073
 �+0.070
Neg logit contributions
,-0.130
,...-0.109
.""-0.106
.-0.105
 society-0.097
Token,
Token ID11
Feature activation+5.41e+00

Pos logit contributions
+0.557
+0.506
 +0.476
 Originally+0.475
 �+0.471
Neg logit contributions
,-0.155
,...-0.087
.""-0.081
.-0.076
 society-0.053
Token placing
Token ID12560
Feature activation-3.72e+00

Pos logit contributions
,+0.538
.+0.428
,...+0.414
.""+0.396
 society+0.365
Neg logit contributions
-0.485
-0.408
 -0.394
 Originally-0.390
 Regarding-0.366
Token it
Token ID340
Feature activation+2.52e+00

Pos logit contributions
+0.409
+0.365
 +0.340
 Originally+0.339
 �+0.329
Neg logit contributions
,-0.284
,...-0.214
.-0.210
.""-0.202
 society-0.170
Token into
Token ID656
Feature activation-9.51e+00

Pos logit contributions
,+0.309
.+0.259
,...+0.258
.""+0.250
 society+0.233
Neg logit contributions
-0.165
-0.131
 -0.118
 Originally-0.118
 �-0.109
Token production
Token ID3227
Feature activation+1.07e+00

Pos logit contributions
+1.010
+0.903
 +0.826
 Originally+0.823
 �+0.811
Neg logit contributions
,-0.749
,...-0.593
.-0.565
.""-0.549
,.-0.458
Token in
Token ID287
Feature activation-7.98e+00

Pos logit contributions
,+0.106
,...+0.085
.+0.083
.""+0.081
 society+0.074
Neg logit contributions
-0.096
-0.082
 -0.075
 Originally-0.075
 �-0.072
Token August
Token ID2932
Feature activation-3.22e+00

Pos logit contributions
+0.740
+0.644
 +0.589
 Originally+0.589
 �+0.568
Neg logit contributions
,-0.750
,...-0.598
.-0.591
.""-0.572
 society-0.509
Token 2013
Token ID2211
Feature activation-2.33e+00

Pos logit contributions
+0.319
+0.279
 +0.254
 Originally+0.253
 �+0.248
Neg logit contributions
,-0.279
,...-0.221
.""-0.212
.-0.212
 society-0.191
Token.
Token ID13
Feature activation+3.34e+00

Pos logit contributions
+0.369
+0.337
 +0.319
 Originally+0.317
 �+0.316
Neg logit contributions
,-0.068
,...-0.027
.""-0.022
.-0.018
 society-0.008
Tokenmann
Token ID9038
Feature activation-3.46e+00

Pos logit contributions
+0.674
+0.608
 Originally+0.572
 +0.566
 �+0.547
Neg logit contributions
,-0.388
,...-0.286
.-0.283
.""-0.278
 society-0.243
Token,
Token ID11
Feature activation+5.66e+00

Pos logit contributions
+0.615
+0.572
 Originally+0.543
 +0.542
 �+0.538
Neg logit contributions
,-0.016
,...+0.037
.""+0.038
.+0.041
 society+0.064
Token the
Token ID262
Feature activation+1.00e+00

Pos logit contributions
,+0.474
.+0.361
,...+0.359
.""+0.341
 society+0.300
Neg logit contributions
-0.591
-0.518
 -0.488
 Originally-0.487
 �-0.465
Token 3
Token ID513
Feature activation-4.49e+00

Pos logit contributions
,+0.085
,...+0.065
.+0.065
.""+0.062
 society+0.054
Neg logit contributions
-0.103
-0.091
 Originally-0.084
 -0.084
 �-0.081
Token-
Token ID12
Feature activation-5.42e-01

Pos logit contributions
+0.707
+0.656
 Originally+0.613
 +0.609
 �+0.599
Neg logit contributions
,-0.101
,...-0.043
.""-0.041
.-0.013
 society-0.009
Tokento
Token ID1462
Feature activation-9.49e+00

Pos logit contributions
+0.040
+0.033
 +0.030
 Originally+0.030
 �+0.028
Neg logit contributions
,-0.061
,...-0.051
.-0.050
.""-0.049
 society-0.046
Token-
Token ID12
Feature activation-6.93e-01

Pos logit contributions
+1.347
+1.215
 +1.179
 Originally+1.175
 �+1.135
Neg logit contributions
,-0.445
.-0.253
,...-0.237
.""-0.208
 society-0.148
Token4
Token ID19
Feature activation-5.75e+00

Pos logit contributions
+0.032
+0.023
 +0.019
 Originally+0.019
 �+0.016
Neg logit contributions
,-0.098
,...-0.084
.-0.084
.""-0.082
 society-0.078
Token-
Token ID12
Feature activation-8.08e-01

Pos logit contributions
+0.660
+0.593
 +0.540
 Originally+0.539
 �+0.529
Neg logit contributions
,-0.398
,...-0.304
.""-0.290
.-0.283
 society-0.243
Tokeninch
Token ID8589
Feature activation-3.81e+00

Pos logit contributions
+0.060
+0.049
 +0.044
 Originally+0.044
 �+0.041
Neg logit contributions
,-0.092
,...-0.076
.-0.076
.""-0.074
 society-0.069
Token dead
Token ID2636
Feature activation-3.50e-02

Pos logit contributions
+0.400
+0.356
 +0.327
 Originally+0.325
 �+0.317
Neg logit contributions
,-0.302
,...-0.237
.-0.230
.""-0.225
 society-0.199
Token that
Token ID326
Feature activation+2.28e-01

Pos logit contributions
,+0.210
.+0.167
,...+0.164
.""+0.157
 society+0.144
Neg logit contributions
-0.188
-0.159
 -0.151
 Originally-0.150
 �-0.141
Token Wes
Token ID18622
Feature activation-2.29e+00

Pos logit contributions
,+0.021
,...+0.017
.+0.017
.""+0.017
 society+0.015
Neg logit contributions
-0.021
-0.018
 -0.017
 Originally-0.017
 �-0.016
Token Anderson
Token ID9918
Feature activation-2.90e+00

Pos logit contributions
+0.204
+0.179
 Originally+0.165
 +0.162
 �+0.155
Neg logit contributions
,-0.220
,...-0.179
.-0.174
.""-0.172
 society-0.154
Token would
Token ID561
Feature activation-1.46e+00

Pos logit contributions
+0.269
+0.237
 Originally+0.218
 +0.217
 �+0.207
Neg logit contributions
,-0.264
,...-0.212
.-0.209
.""-0.206
 society-0.186
Token adapt
Token ID6068
Feature activation-3.94e+00

Pos logit contributions
+0.129
+0.112
 Originally+0.103
 +0.103
 �+0.097
Neg logit contributions
,-0.143
,...-0.114
.-0.114
.""-0.110
 society-0.100
Token into
Token ID656
Feature activation-9.36e+00

Pos logit contributions
+0.400
+0.358
 +0.331
 Originally+0.330
 �+0.314
Neg logit contributions
,-0.323
,...-0.252
.""-0.243
.-0.242
 society-0.218
Token a
Token ID257
Feature activation-1.25e+00

Pos logit contributions
+0.934
+0.849
 +0.765
 Originally+0.762
 �+0.738
Neg logit contributions
,-0.769
,...-0.627
.""-0.592
.-0.584
 society-0.497
Token stop
Token ID2245
Feature activation-1.89e+00

Pos logit contributions
+0.135
+0.121
 +0.112
 Originally+0.111
 �+0.107
Neg logit contributions
,-0.099
,...-0.075
.-0.073
.""-0.071
 society-0.060
Token-
Token ID12
Feature activation-7.65e-01

Pos logit contributions
+0.223
+0.199
 +0.189
 Originally+0.185
 �+0.181
Neg logit contributions
,-0.130
,...-0.093
.-0.091
.""-0.087
 society-0.075
Tokenmotion
Token ID38714
Feature activation-1.72e+00

Pos logit contributions
+0.056
+0.046
 +0.042
 Originally+0.041
 �+0.038
Neg logit contributions
,-0.087
,...-0.072
.-0.072
.""-0.070
 society-0.066
Token feature
Token ID3895
Feature activation-2.72e+00

Pos logit contributions
+0.190
+0.173
 +0.159
 Originally+0.156
 �+0.153
Neg logit contributions
,-0.126
,...-0.096
.-0.092
.""-0.092
 society-0.077
Token Catalan
Token ID31066
Feature activation-1.47e+00

Pos logit contributions
,+0.126
,...+0.097
.+0.097
.""+0.091
 society+0.080
Neg logit contributions
-0.153
-0.134
 -0.126
 Originally-0.125
 �-0.121
Token DUI
Token ID39414
Feature activation+2.09e+00

Pos logit contributions
+0.169
+0.150
 Originally+0.139
 +0.139
 �+0.137
Neg logit contributions
,-0.107
,...-0.080
.-0.078
.""-0.076
 society-0.062
Token (
Token ID357
Feature activation-8.95e-01

Pos logit contributions
,+0.198
,...+0.158
.+0.157
.""+0.152
 society+0.136
Neg logit contributions
-0.197
-0.166
 -0.153
 Originally-0.153
 �-0.149
TokenDecl
Token ID37835
Feature activation-4.36e+00

Pos logit contributions
+0.062
+0.050
 +0.045
 Originally+0.045
 �+0.042
Neg logit contributions
,-0.107
,...-0.089
.-0.089
.""-0.086
 society-0.081
Tokenar
Token ID283
Feature activation-4.31e+00

Pos logit contributions
+0.304
+0.249
 Originally+0.221
 +0.218
 �+0.205
Neg logit contributions
,-0.510
,...-0.427
.-0.419
.""-0.411
 society-0.389
Tokenaci
Token ID32009
Feature activation-9.33e+00

Pos logit contributions
+0.310
+0.252
 Originally+0.227
 +0.222
 �+0.211
Neg logit contributions
,-0.496
,...-0.416
.-0.410
.""-0.400
 society-0.373
Tokenón
Token ID18840
Feature activation-2.27e+00

Pos logit contributions
+0.696
+0.563
 Originally+0.505
 +0.494
 �+0.469
Neg logit contributions
,-1.058
,...-0.882
.-0.873
.""-0.854
 society-0.792
Token Un
Token ID791
Feature activation-2.99e+00

Pos logit contributions
+0.200
+0.169
 Originally+0.156
 +0.155
 �+0.148
Neg logit contributions
,-0.227
,...-0.183
.-0.182
.""-0.176
 society-0.160
Tokenilateral
Token ID14796
Feature activation-4.99e+00

Pos logit contributions
+0.261
+0.223
 +0.206
 Originally+0.204
 �+0.194
Neg logit contributions
,-0.299
,...-0.242
.-0.239
.""-0.231
 society-0.212
Token de
Token ID390
Feature activation-5.61e+00

Pos logit contributions
+0.457
+0.390
 Originally+0.361
 +0.360
 �+0.343
Neg logit contributions
,-0.480
,...-0.384
.-0.382
.""-0.369
 society-0.336
Token Ind
Token ID1423
Feature activation-3.57e+00

Pos logit contributions
+0.484
+0.410
 Originally+0.375
 +0.375
 �+0.358
Neg logit contributions
,-0.571
,...-0.463
.-0.459
.""-0.442
 society-0.405
Token could
Token ID714
Feature activation-1.05e+00

Pos logit contributions
,+0.292
,...+0.234
.+0.233
.""+0.225
 society+0.204
Neg logit contributions
-0.274
-0.232
 -0.216
 Originally-0.212
 �-0.207
Token delay
Token ID5711
Feature activation-3.30e+00

Pos logit contributions
+0.092
+0.078
 +0.073
 Originally+0.071
 �+0.069
Neg logit contributions
,-0.104
,...-0.084
.-0.084
.""-0.081
 society-0.074
Token or
Token ID393
Feature activation-1.34e+00

Pos logit contributions
+0.359
+0.319
 +0.292
 Originally+0.291
 �+0.286
Neg logit contributions
,-0.252
,...-0.197
.-0.189
.""-0.185
 society-0.159
Token derail
Token ID33749
Feature activation-4.36e+00

Pos logit contributions
+0.119
+0.102
 +0.094
 Originally+0.093
 �+0.089
Neg logit contributions
,-0.131
,...-0.105
.-0.105
.""-0.101
 society-0.092
Token progress
Token ID4371
Feature activation-1.03e+00

Pos logit contributions
+0.490
+0.435
 Originally+0.403
 +0.401
 �+0.392
Neg logit contributions
,-0.323
,...-0.247
.-0.237
.""-0.229
 society-0.196
Token toward
Token ID3812
Feature activation-9.27e+00

Pos logit contributions
+0.059
+0.045
 +0.039
 Originally+0.039
 �+0.036
Neg logit contributions
,-0.134
,...-0.114
.-0.113
.""-0.111
 society-0.104
Token 100
Token ID1802
Feature activation-3.73e+00

Pos logit contributions
+0.963
+0.848
 +0.780
 Originally+0.775
 �+0.761
Neg logit contributions
,-0.763
,...-0.596
.-0.583
.""-0.565
 society-0.483
Token%
Token ID4
Feature activation+2.18e-02

Pos logit contributions
+0.321
+0.270
 Originally+0.238
 +0.237
 �+0.235
Neg logit contributions
,-0.373
,...-0.316
.""-0.305
.-0.298
 society-0.271
Token renewable
Token ID15713
Feature activation-1.43e+00

Pos logit contributions
,+0.002
.+0.001
,...+0.001
.""+0.001
 society+0.001
Neg logit contributions
-0.002
-0.002
 -0.002
 Originally-0.002
 �-0.002
Token energy
Token ID2568
Feature activation+9.49e-01

Pos logit contributions
+0.157
+0.140
 +0.128
 �+0.126
 Originally+0.126
Neg logit contributions
,-0.108
,...-0.084
.-0.080
.""-0.080
 society-0.066
Token and
Token ID290
Feature activation+7.09e-01

Pos logit contributions
,+0.082
,...+0.065
.+0.062
.""+0.062
 society+0.055
Neg logit contributions
-0.095
-0.083
 -0.076
 Originally-0.075
 �-0.074
Token the
Token ID262
Feature activation+1.62e+00

Pos logit contributions
,+0.083
,...+0.062
.+0.060
.""+0.058
 society+0.047
Neg logit contributions
-0.141
-0.125
 -0.116
 Originally-0.116
 �-0.115
Token king
Token ID5822
Feature activation+2.26e+00

Pos logit contributions
,+0.118
,...+0.088
.+0.087
.""+0.083
 society+0.069
Neg logit contributions
-0.185
-0.164
 Originally-0.153
 -0.153
 �-0.149
Token,
Token ID11
Feature activation+6.18e+00

Pos logit contributions
,+0.066
,...+0.028
.""+0.022
.+0.019
 society+0.010
Neg logit contributions
-0.358
-0.324
 �-0.306
 -0.306
 Originally-0.304
Token gest
Token ID10521
Feature activation-2.49e+00

Pos logit contributions
,+0.694
.+0.569
,...+0.556
.""+0.537
 society+0.499
Neg logit contributions
-0.471
-0.386
 -0.365
 Originally-0.363
 Regarding-0.335
Tokenuring
Token ID870
Feature activation-3.28e+00

Pos logit contributions
+0.169
+0.134
 +0.126
 Originally+0.124
 Regarding+0.113
Neg logit contributions
,-0.303
.-0.253
,...-0.247
.""-0.239
 society-0.223
Token towards
Token ID3371
Feature activation-9.20e+00

Pos logit contributions
+0.197
+0.151
 +0.132
 Originally+0.130
 �+0.124
Neg logit contributions
,-0.419
,...-0.356
.-0.352
.""-0.344
 society-0.323
Token the
Token ID262
Feature activation+1.45e+00

Pos logit contributions
+1.067
+0.944
 +0.873
 �+0.871
 Originally+0.854
Neg logit contributions
,-0.649
,...-0.500
.-0.468
.""-0.457
,.-0.382
Token crowds
Token ID15779
Feature activation+3.03e+00

Pos logit contributions
,+0.120
,...+0.091
.+0.091
.""+0.086
 society+0.075
Neg logit contributions
-0.153
-0.134
 -0.126
 Originally-0.125
 �-0.121
Token.
Token ID13
Feature activation+4.22e+00

Pos logit contributions
,+0.203
,...+0.147
.+0.139
.""+0.138
 society+0.118
Neg logit contributions
-0.365
-0.324
 -0.304
 Originally-0.300
 �-0.298
Token
Token ID564
Feature activation-6.23e+00

Pos logit contributions
,+0.584
.+0.498
,...+0.489
.""+0.476
 society+0.450
Neg logit contributions
-0.211
-0.152
 -0.138
 Originally-0.136
 Regarding-0.117
Token
Token ID250
Feature activation-3.91e+00

Pos logit contributions
+0.487
+0.400
 +0.378
 Originally+0.375
 �+0.348
Neg logit contributions
,-0.689
.-0.563
,...-0.552
.""-0.532
 society-0.492
Token Foundation
Token ID5693
Feature activation+1.69e+00

Pos logit contributions
+0.280
+0.249
 +0.232
 Originally+0.231
 �+0.221
Neg logit contributions
,-0.216
,...-0.165
.-0.162
.""-0.158
 society-0.137
Token:
Token ID25
Feature activation+1.03e+00

Pos logit contributions
,+0.086
,...+0.057
.""+0.054
.+0.054
 society+0.040
Neg logit contributions
-0.225
-0.207
 -0.194
 Originally-0.193
 �-0.188
Token "
Token ID366
Feature activation-5.57e-01

Pos logit contributions
,+0.105
.+0.085
,...+0.084
.""+0.081
 society+0.074
Neg logit contributions
-0.087
-0.075
 -0.070
 Originally-0.070
 Regarding-0.065
Token9
Token ID24
Feature activation-4.52e+00

Pos logit contributions
+0.038
+0.031
 Originally+0.027
 +0.027
 �+0.025
Neg logit contributions
,-0.067
,...-0.056
.-0.055
.""-0.054
 society-0.050
Token-
Token ID12
Feature activation-2.48e-01

Pos logit contributions
+0.561
+0.509
 Originally+0.475
 +0.472
 �+0.461
Neg logit contributions
,-0.278
,...-0.202
.""-0.185
.-0.184
 society-0.151
Tokento
Token ID1462
Feature activation-9.13e+00

Pos logit contributions
+0.009
+0.006
 +0.005
 Originally+0.005
 �+0.004
Neg logit contributions
,-0.038
.-0.033
,...-0.032
.""-0.032
 society-0.030
Token-
Token ID12
Feature activation-2.73e-01

Pos logit contributions
+1.313
+1.187
 +1.150
 Originally+1.147
 �+1.109
Neg logit contributions
,-0.411
.-0.226
,...-0.212
.""-0.184
 society-0.126
Token5
Token ID20
Feature activation-3.98e+00

Pos logit contributions
+0.013
+0.009
 +0.008
 Originally+0.008
 �+0.007
Neg logit contributions
,-0.038
.-0.033
,...-0.033
.""-0.032
 society-0.030
Token Working
Token ID14594
Feature activation-8.44e-01

Pos logit contributions
+0.285
+0.238
 Originally+0.211
 +0.202
 �+0.198
Neg logit contributions
,-0.450
,...-0.381
.-0.372
.""-0.370
 society-0.339
Token Women
Token ID6926
Feature activation+3.57e+00

Pos logit contributions
+0.083
+0.072
 Originally+0.066
 +0.065
 �+0.063
Neg logit contributions
,-0.075
,...-0.059
.-0.058
.""-0.057
 society-0.052
Token Education
Token ID7868
Feature activation+4.29e+00

Pos logit contributions
,+0.297
,...+0.230
.+0.225
.""+0.218
 society+0.197
Neg logit contributions
-0.368
-0.326
 Originally-0.301
 -0.298
 �-0.289
Token might
Token ID1244
Feature activation+4.25e-01

Pos logit contributions
,+0.084
,...+0.061
.+0.060
.""+0.058
 society+0.049
Neg logit contributions
-0.152
-0.137
 -0.128
 Originally-0.127
 �-0.124
Token be
Token ID307
Feature activation-1.83e+00

Pos logit contributions
,+0.043
,...+0.036
.+0.035
.""+0.034
 society+0.032
Neg logit contributions
-0.035
-0.030
 -0.027
 Originally-0.027
 �-0.026
Token pretty
Token ID2495
Feature activation+4.85e-01

Pos logit contributions
+0.189
+0.169
 +0.154
 Originally+0.153
 �+0.149
Neg logit contributions
,-0.147
,...-0.117
.""-0.111
.-0.110
 society-0.097
Token special
Token ID2041
Feature activation+1.33e+00

Pos logit contributions
,+0.038
,...+0.030
.+0.028
.""+0.028
 society+0.025
Neg logit contributions
-0.051
-0.046
 -0.042
 Originally-0.042
 �-0.040
Token too
Token ID1165
Feature activation-1.58e-01

Pos logit contributions
,+0.134
,...+0.111
.""+0.108
.+0.107
 society+0.099
Neg logit contributions
-0.110
-0.097
 -0.085
 Originally-0.085
 �-0.081
Token.
Token ID13
Feature activation+4.97e+00

Pos logit contributions
+0.023
+0.022
 +0.020
 Originally+0.020
 �+0.020
Neg logit contributions
,-0.005
,...-0.003
.""-0.003
.-0.002
 society-0.001
Token\n
Token ID198
Feature activation-6.19e+00

Pos logit contributions
,+0.744
.+0.643
,...+0.630
.""+0.613
 society+0.583
Neg logit contributions
-0.196
-0.122
 -0.109
 Originally-0.105
 �-0.085
Token\n
Token ID198
Feature activation-6.14e+00

Pos logit contributions
+0.121
 +0.017
 Originally+0.014
 Regarding+0.002
+0.000
Neg logit contributions
,-1.065
.-0.940
,...-0.903
.""-0.872
 society-0.837
TokenDespite
Token ID8332
Feature activation-2.67e+00

Pos logit contributions
+0.497
+0.411
 Originally+0.369
 +0.364
 �+0.344
Neg logit contributions
,-0.663
,...-0.545
.-0.535
.""-0.524
 society-0.500
Token being
Token ID852
Feature activation+1.61e+00

Pos logit contributions
+0.272
+0.241
 Originally+0.221
 +0.220
 �+0.213
Neg logit contributions
,-0.224
,...-0.176
.-0.172
.""-0.168
 society-0.146
Token unable
Token ID5906
Feature activation-3.85e+00

Pos logit contributions
,+0.158
.+0.126
,...+0.125
.""+0.120
 society+0.109
Neg logit contributions
-0.145
-0.124
 Originally-0.116
 -0.115
 �-0.109
Token and
Token ID290
Feature activation+1.42e+00

Pos logit contributions
+0.202
+0.177
 Originally+0.157
 +0.157
 �+0.151
Neg logit contributions
,-0.241
,...-0.199
.-0.192
.""-0.190
 society-0.175
Token spend
Token ID4341
Feature activation-9.48e-01

Pos logit contributions
,+0.125
,...+0.097
.+0.097
.""+0.093
 society+0.082
Neg logit contributions
-0.140
-0.123
 -0.114
 Originally-0.114
 �-0.109
Token time
Token ID640
Feature activation+2.89e+00

Pos logit contributions
+0.100
+0.090
 +0.083
 Originally+0.082
 �+0.080
Neg logit contributions
,-0.075
,...-0.058
.-0.055
.""-0.054
 society-0.048
Token with
Token ID351
Feature activation-5.29e+00

Pos logit contributions
,+0.374
,...+0.318
.+0.315
.""+0.307
 society+0.289
Neg logit contributions
-0.165
-0.130
 -0.112
 Originally-0.111
 �-0.102
Token family
Token ID1641
Feature activation+6.94e+00

Pos logit contributions
+0.594
+0.542
 +0.494
 �+0.482
 Originally+0.482
Neg logit contributions
,-0.373
,...-0.289
.""-0.276
.-0.270
.,-0.221
Token,"
Token ID553
Feature activation+6.10e+00

Pos logit contributions
,+0.459
,...+0.338
.+0.318
.""+0.316
 society+0.274
Neg logit contributions
-0.824
-0.743
 -0.684
 Originally-0.679
 �-0.673
Token Walker
Token ID10120
Feature activation-9.37e-01

Pos logit contributions
,+0.611
,...+0.493
.+0.490
.""+0.477
 society+0.431
Neg logit contributions
-0.539
-0.459
 -0.419
 Originally-0.417
 �-0.398
Token told
Token ID1297
Feature activation+1.78e+00

Pos logit contributions
+0.090
+0.079
 +0.071
 Originally+0.071
 �+0.069
Neg logit contributions
,-0.084
,...-0.068
.""-0.066
.-0.065
 society-0.060
Token Triple
Token ID19817
Feature activation-1.74e+00

Pos logit contributions
,+0.161
,...+0.127
.+0.124
.""+0.122
 society+0.107
Neg logit contributions
-0.172
-0.150
 Originally-0.138
 -0.138
 �-0.132
Token M
Token ID337
Feature activation+1.26e-01

Pos logit contributions
+0.154
+0.130
 +0.123
 Originally+0.123
 �+0.115
Neg logit contributions
,-0.174
.-0.138
,...-0.136
.""-0.130
 society-0.119
Token.
Token ID13
Feature activation+5.77e+00

Pos logit contributions
,+0.008
,...+0.006
.+0.005
.""+0.005
 society+0.004
Neg logit contributions
-0.015
-0.014
 -0.013
 Originally-0.013
 �-0.013
Token to
Token ID284
Feature activation-4.27e+00

Pos logit contributions
+0.457
+0.412
 Originally+0.384
 +0.383
 �+0.374
Neg logit contributions
,-0.283
.-0.208
,...-0.208
.""-0.201
 society-0.173
Token K
Token ID509
Feature activation+1.13e+00

Pos logit contributions
+0.372
+0.321
 +0.297
 Originally+0.295
 �+0.282
Neg logit contributions
,-0.425
,...-0.342
.-0.340
.""-0.328
 society-0.298
Tokenoug
Token ID20805
Feature activation-1.88e+00

Pos logit contributions
,+0.105
,...+0.083
.+0.083
.""+0.081
 society+0.074
Neg logit contributions
-0.106
-0.092
 Originally-0.085
 -0.085
 �-0.081
Tokenra
Token ID430
Feature activation-3.14e+00

Pos logit contributions
+0.120
+0.096
 Originally+0.087
 +0.087
 �+0.078
Neg logit contributions
,-0.231
,...-0.195
.-0.193
.""-0.188
 society-0.175
Token,
Token ID11
Feature activation+8.42e+00

Pos logit contributions
+0.452
+0.413
 Originally+0.393
 +0.390
 �+0.383
Neg logit contributions
,-0.124
,...-0.069
.-0.068
.""-0.066
 society-0.044
Token you
Token ID345
Feature activation+5.58e+00

Pos logit contributions
,+0.628
.+0.459
,...+0.450
.""+0.423
 society+0.367
Neg logit contributions
-0.959
-0.847
 -0.809
 Originally-0.805
 �-0.772
Token will
Token ID481
Feature activation+1.14e+00

Pos logit contributions
,+0.512
,...+0.403
.+0.402
.""+0.385
 society+0.349
Neg logit contributions
-0.531
-0.463
 -0.432
 Originally-0.425
 �-0.409
Token see
Token ID766
Feature activation-9.09e-01

Pos logit contributions
,+0.125
,...+0.102
.+0.102
.""+0.098
 society+0.091
Neg logit contributions
-0.088
-0.075
 -0.068
 Originally-0.068
 �-0.064
Token all
Token ID477
Feature activation+1.28e+00

Pos logit contributions
+0.091
+0.080
 Originally+0.074
 +0.074
 �+0.071
Neg logit contributions
,-0.078
,...-0.061
.-0.060
.""-0.058
 society-0.052
Token available
Token ID1695
Feature activation-2.89e+00

Pos logit contributions
,+0.147
,...+0.122
.+0.121
.""+0.118
 society+0.109
Neg logit contributions
-0.093
-0.078
 Originally-0.069
 -0.069
 �-0.065
Token K
Token ID509
Feature activation+7.82e-01

Pos logit contributions
+0.257
+0.224
 Originally+0.203
 +0.202
 �+0.194
Neg logit contributions
,-0.282
,...-0.228
.-0.224
.""-0.220
 society-0.198
Token up
Token ID510
Feature activation-5.93e+00

Pos logit contributions
,+0.120
,...+0.094
.+0.089
.""+0.088
 society+0.077
Neg logit contributions
-0.155
-0.136
 -0.125
 Originally-0.124
 �-0.124
Token in
Token ID287
Feature activation-5.98e+00

Pos logit contributions
+0.575
+0.492
 +0.455
 Originally+0.448
 �+0.445
Neg logit contributions
,-0.539
,...-0.428
.-0.415
.""-0.410
 society-0.374
Token the
Token ID262
Feature activation+2.70e+00

Pos logit contributions
+0.679
+0.605
 +0.564
 Originally+0.558
 �+0.553
Neg logit contributions
,-0.434
,...-0.336
.-0.317
.""-0.308
 society-0.257
Token video
Token ID2008
Feature activation+4.28e+00

Pos logit contributions
,+0.232
.+0.178
,...+0.177
.""+0.167
 society+0.149
Neg logit contributions
-0.276
-0.239
 -0.227
 Originally-0.226
 �-0.217
Token clips
Token ID19166
Feature activation+6.19e-01

Pos logit contributions
,+0.299
,...+0.221
.+0.210
.""+0.209
 society+0.181
Neg logit contributions
-0.504
-0.445
 -0.417
 Originally-0.412
 �-0.410
Token they
Token ID484
Feature activation+5.31e+00

Pos logit contributions
,+0.045
,...+0.034
.+0.033
.""+0.032
 society+0.028
Neg logit contributions
-0.070
-0.062
 -0.058
 Originally-0.057
 �-0.057
Token released
Token ID2716
Feature activation-1.49e+00

Pos logit contributions
,+0.546
.+0.439
,...+0.436
.""+0.418
 society+0.384
Neg logit contributions
-0.457
-0.383
 -0.359
 Originally-0.358
 �-0.338
Token.
Token ID13
Feature activation+5.54e+00

Pos logit contributions
+0.166
+0.146
 +0.136
 Originally+0.135
 �+0.133
Neg logit contributions
,-0.113
,...-0.085
.-0.082
.""-0.080
 society-0.071
Token (
Token ID357
Feature activation+3.26e-01

Pos logit contributions
,+0.805
.+0.693
,...+0.678
.""+0.659
 society+0.627
Neg logit contributions
-0.237
-0.158
 -0.144
 Originally-0.141
 Regarding-0.114
TokenNot
Token ID3673
Feature activation-1.46e+00

Pos logit contributions
,+0.039
.+0.033
,...+0.032
.""+0.031
 society+0.029
Neg logit contributions
-0.022
-0.018
 -0.016
 Originally-0.016
 �-0.015
Tokenably
Token ID1346
Feature activation-4.82e+00

Pos logit contributions
+0.136
+0.116
 +0.109
 Originally+0.108
 �+0.104
Neg logit contributions
,-0.139
,...-0.110
.-0.109
.""-0.104
 society-0.095
Token a
Token ID257
Feature activation+6.93e-01

Pos logit contributions
,+0.164
,...+0.127
.+0.127
.""+0.123
 society+0.110
Neg logit contributions
-0.186
-0.163
 Originally-0.153
 -0.151
 �-0.145
Token great
Token ID1049
Feature activation+1.73e+00

Pos logit contributions
,+0.056
,...+0.042
.+0.042
.""+0.040
 society+0.035
Neg logit contributions
-0.073
-0.065
 Originally-0.061
 -0.061
 �-0.058
Token example
Token ID1672
Feature activation+1.19e+00

Pos logit contributions
,+0.142
,...+0.110
.+0.108
.""+0.105
 society+0.093
Neg logit contributions
-0.178
-0.159
 -0.147
 Originally-0.146
 �-0.140
Token of
Token ID286
Feature activation-4.56e+00

Pos logit contributions
,+0.096
,...+0.076
.""+0.073
.+0.071
 society+0.065
Neg logit contributions
-0.125
-0.109
 Originally-0.100
 -0.100
 �-0.099
Token that
Token ID326
Feature activation+1.99e+00

Pos logit contributions
+0.475
+0.427
 Originally+0.389
 +0.388
 �+0.383
Neg logit contributions
,-0.361
,...-0.282
.-0.269
.""-0.265
 society-0.236
Token.
Token ID13
Feature activation+5.66e+00

Pos logit contributions
,+0.067
,...+0.032
.""+0.030
.+0.025
 society+0.017
Neg logit contributions
-0.300
-0.276
 Originally-0.259
 -0.258
 �-0.258
Token The
Token ID383
Feature activation+1.86e+00

Pos logit contributions
,+0.710
.+0.595
,...+0.580
.""+0.560
 society+0.526
Neg logit contributions
-0.361
-0.275
 -0.260
 Originally-0.254
 �-0.233
Token J
Token ID449
Feature activation+2.05e-01

Pos logit contributions
,+0.165
,...+0.128
.+0.128
.""+0.123
 society+0.109
Neg logit contributions
-0.182
-0.160
 Originally-0.149
 -0.148
 �-0.142
Tokenoust
Token ID23968
Feature activation-8.77e-01

Pos logit contributions
,+0.023
.+0.019
,...+0.019
.""+0.018
 society+0.017
Neg logit contributions
-0.016
-0.013
 -0.012
 Originally-0.012
 �-0.011
Token mechanic
Token ID21600
Feature activation+2.36e+00

Pos logit contributions
+0.089
+0.078
 Originally+0.073
 +0.073
 �+0.070
Neg logit contributions
,-0.074
,...-0.057
.-0.057
.""-0.055
 society-0.049
Token was
Token ID373
Feature activation+1.68e+00

Pos logit contributions
,+0.214
,...+0.168
.+0.167
.""+0.163
 society+0.148
Neg logit contributions
-0.228
-0.197
 -0.182
 Originally-0.182
 �-0.176
Token them
Token ID606
Feature activation+4.18e+00

Pos logit contributions
+0.483
+0.436
 Originally+0.399
 +0.399
 �+0.389
Neg logit contributions
,-0.339
,...-0.267
.-0.251
.""-0.250
 society-0.211
Token.
Token ID13
Feature activation+5.88e+00

Pos logit contributions
,+0.156
,...+0.084
.""+0.077
.+0.070
 society+0.054
Neg logit contributions
-0.626
-0.565
 -0.536
 �-0.533
 Originally-0.527
Token Also
Token ID4418
Feature activation-8.95e-01

Pos logit contributions
,+0.809
.+0.690
,...+0.674
.""+0.655
 society+0.619
Neg logit contributions
-0.303
-0.213
 -0.198
 Originally-0.194
 �-0.171
Token,
Token ID11
Feature activation+7.46e+00

Pos logit contributions
+0.131
+0.119
 +0.114
 Originally+0.113
 �+0.111
Neg logit contributions
,-0.035
.-0.019
,...-0.018
.""-0.017
 society-0.012
Token Jub
Token ID38075
Feature activation-2.42e+00

Pos logit contributions
,+0.669
.+0.518
,...+0.510
.""+0.485
 society+0.436
Neg logit contributions
-0.738
-0.637
 -0.606
 Originally-0.603
 �-0.572
Tokenilee
Token ID40626
Feature activation+9.05e-01

Pos logit contributions
+0.242
+0.207
 +0.200
 Originally+0.199
 Regarding+0.190
Neg logit contributions
,-0.217
.-0.168
,...-0.161
.""-0.153
 society-0.138
Token actually
Token ID1682
Feature activation+4.72e-01

Pos logit contributions
,+0.078
,...+0.061
.+0.061
.""+0.059
 society+0.053
Neg logit contributions
-0.092
-0.079
 -0.073
 Originally-0.073
 �-0.072
Token gets
Token ID3011
Feature activation+1.36e+00

Pos logit contributions
,+0.043
.+0.033
,...+0.033
.""+0.032
 society+0.029
Neg logit contributions
-0.046
-0.040
 -0.037
 Originally-0.037
 �-0.036
Token to
Token ID284
Feature activation-5.34e+00

Pos logit contributions
,+0.128
.+0.100
,...+0.100
.""+0.096
 society+0.086
Neg logit contributions
-0.128
-0.110
 -0.103
 Originally-0.102
 �-0.098
Token use
Token ID779
Feature activation-1.97e+00

Pos logit contributions
+0.496
+0.431
 +0.401
 Originally+0.394
 �+0.382
Neg logit contributions
,-0.499
,...-0.396
.-0.393
.""-0.379
 society-0.343
Token her
Token ID607
Feature activation+4.06e+00

Pos logit contributions
+0.219
+0.197
 +0.184
 Originally+0.182
 �+0.178
Neg logit contributions
,-0.145
,...-0.110
.-0.107
.""-0.105
,.-0.087
Token candidate
Token ID4540
Feature activation+9.30e-01

Pos logit contributions
,+0.176
,...+0.126
.+0.125
.""+0.120
 society+0.101
Neg logit contributions
-0.299
-0.268
 -0.251
 Originally-0.251
 �-0.241
Token,
Token ID11
Feature activation+6.77e+00

Pos logit contributions
,+0.100
.+0.083
,...+0.083
.""+0.082
 society+0.075
Neg logit contributions
-0.072
-0.060
 -0.054
 Originally-0.053
 �-0.051
Token had
Token ID550
Feature activation-2.13e-01

Pos logit contributions
,+0.654
.+0.519
,...+0.511
.""+0.490
 society+0.446
Neg logit contributions
-0.621
-0.531
 -0.500
 Originally-0.497
 �-0.470
Token pledged
Token ID16387
Feature activation+1.23e+00

Pos logit contributions
+0.018
+0.016
 Originally+0.015
 +0.015
 �+0.014
Neg logit contributions
,-0.021
,...-0.017
.-0.017
.""-0.016
 society-0.015
Token to
Token ID284
Feature activation-6.02e+00

Pos logit contributions
,+0.159
,...+0.136
.+0.135
.""+0.131
 society+0.123
Neg logit contributions
-0.070
-0.055
 -0.047
 Originally-0.047
 �-0.044
Token turf
Token ID29553
Feature activation+2.59e+00

Pos logit contributions
+0.558
+0.484
 +0.451
 Originally+0.449
 �+0.428
Neg logit contributions
,-0.567
.-0.448
,...-0.447
.""-0.426
 society-0.385
Token-
Token ID12
Feature activation+8.72e-01

Pos logit contributions
,+0.225
,...+0.179
.+0.174
.""+0.170
 society+0.148
Neg logit contributions
-0.254
-0.227
 Originally-0.207
 -0.206
 �-0.198
Tokenout
Token ID448
Feature activation-5.86e+00

Pos logit contributions
,+0.102
,...+0.085
.+0.084
.""+0.082
 society+0.076
Neg logit contributions
-0.061
-0.051
 -0.045
 Originally-0.045
 �-0.041
Token Assange
Token ID21882
Feature activation+3.02e-01

Pos logit contributions
+0.590
+0.535
 Originally+0.486
 +0.483
 �+0.467
Neg logit contributions
,-0.482
,...-0.380
.-0.368
.""-0.364
 society-0.311
Token if
Token ID611
Feature activation-6.23e-01

Pos logit contributions
,+0.031
,...+0.025
.+0.025
.""+0.024
 society+0.022
Neg logit contributions
-0.025
-0.022
 -0.020
 Originally-0.020
 �-0.019
Token he
Token ID339
Feature activation+3.95e+00

Pos logit contributions
+0.067
+0.061
 +0.056
 Originally+0.055
 �+0.054
Neg logit contributions
,-0.047
,...-0.036
.-0.035
.""-0.035
 society-0.029
Tokenouts
Token ID5269
Feature activation-5.97e+00

Pos logit contributions
+0.165
+0.133
 +0.122
 Originally+0.122
 �+0.113
Neg logit contributions
,-0.274
.-0.227
,...-0.225
.""-0.218
 society-0.203
Token at
Token ID379
Feature activation-7.46e+00

Pos logit contributions
+0.439
+0.361
 +0.328
 Originally+0.324
 �+0.308
Neg logit contributions
,-0.677
.-0.560
,...-0.557
.""-0.541
 society-0.504
Token town
Token ID3240
Feature activation+4.56e+00

Pos logit contributions
+0.832
+0.752
 Originally+0.691
 +0.686
 �+0.673
Neg logit contributions
,-0.547
,...-0.417
.-0.403
.""-0.391
 society-0.331
Token meetings
Token ID8292
Feature activation+3.57e+00

Pos logit contributions
,+0.356
,...+0.271
.+0.267
.""+0.261
 society+0.227
Neg logit contributions
-0.490
-0.437
 -0.404
 Originally-0.401
 �-0.390
Token in
Token ID287
Feature activation-7.25e+00

Pos logit contributions
,+0.353
.+0.284
,...+0.283
.""+0.274
 society+0.252
Neg logit contributions
-0.312
-0.267
 -0.246
 Originally-0.242
 �-0.233
Token New
Token ID968
Feature activation+4.39e-01

Pos logit contributions
+0.746
+0.666
 Originally+0.605
 +0.605
 �+0.589
Neg logit contributions
,-0.595
,...-0.465
.-0.455
.""-0.446
 society-0.386
Token Hampshire
Token ID13910
Feature activation-2.89e-01

Pos logit contributions
,+0.036
,...+0.028
.+0.027
.""+0.026
 society+0.023
Neg logit contributions
-0.046
-0.041
 Originally-0.037
 -0.037
 �-0.036
Token and
Token ID290
Feature activation-1.92e-01

Pos logit contributions
+0.027
+0.023
 +0.021
 Originally+0.021
 �+0.020
Neg logit contributions
,-0.027
,...-0.021
.-0.021
.""-0.021
 society-0.019
Token Iowa
Token ID9406
Feature activation+9.50e-01

Pos logit contributions
+0.017
+0.015
 +0.014
 Originally+0.013
 �+0.013
Neg logit contributions
,-0.019
,...-0.015
.-0.015
.""-0.014
 society-0.013
Token are
Token ID389
Feature activation-1.20e-01

Pos logit contributions
,+0.090
,...+0.072
.+0.071
.""+0.069
 society+0.063
Neg logit contributions
-0.087
-0.075
 -0.068
 Originally-0.068
 �-0.066
Token sending
Token ID7216
Feature activation-5.87e-01

Pos logit contributions
+0.012
+0.010
 +0.009
 Originally+0.009
 �+0.009
Neg logit contributions
,-0.011
,...-0.008
.-0.008
.""-0.008
 society-0.007
Token across
Token ID1973
Feature activation-5.68e+00

Pos logit contributions
,+0.081
,...+0.059
.+0.057
.""+0.057
 society+0.048
Neg logit contributions
-0.139
-0.122
 -0.113
 �-0.112
 Originally-0.112
Token the
Token ID262
Feature activation+2.98e+00

Pos logit contributions
+0.645
+0.579
 Originally+0.525
 +0.523
 �+0.512
Neg logit contributions
,-0.403
,...-0.308
.""-0.302
.-0.294
.,-0.235
Token land
Token ID1956
Feature activation+4.83e+00

Pos logit contributions
,+0.232
.+0.173
,...+0.172
.""+0.164
 society+0.137
Neg logit contributions
-0.329
-0.290
 -0.272
 Originally-0.272
 �-0.262
Token,
Token ID11
Feature activation+7.45e+00

Pos logit contributions
,+0.292
,...+0.203
.+0.195
.""+0.194
 society+0.162
Neg logit contributions
-0.617
-0.544
 -0.510
 �-0.508
 Originally-0.504
Token would
Token ID561
Feature activation+5.47e-01

Pos logit contributions
,+0.670
.+0.520
,...+0.510
.""+0.487
 society+0.437
Neg logit contributions
-0.738
-0.634
 -0.604
 Originally-0.601
 �-0.571
Token have
Token ID423
Feature activation-1.49e-01

Pos logit contributions
,+0.059
,...+0.049
.+0.048
.""+0.047
 society+0.043
Neg logit contributions
-0.043
-0.036
 -0.033
 Originally-0.033
 �-0.031
Token been
Token ID587
Feature activation+5.15e-01

Pos logit contributions
+0.012
+0.010
 +0.010
 Originally+0.009
 �+0.009
Neg logit contributions
,-0.016
,...-0.013
.-0.013
.""-0.012
 society-0.011
Token thwarted
Token ID44200
Feature activation+5.35e-01

Pos logit contributions
,+0.050
,...+0.040
.+0.040
.""+0.039
 society+0.035
Neg logit contributions
-0.046
-0.039
 -0.037
 Originally-0.036
 �-0.035
Token by
Token ID416
Feature activation-5.15e+00

Pos logit contributions
,+0.034
,...+0.025
.""+0.023
.+0.023
 society+0.020
Neg logit contributions
-0.066
-0.058
 -0.054
 Originally-0.054
 �-0.054
Token supposed
Token ID4385
Feature activation-1.00e+00

Pos logit contributions
+0.547
+0.497
 +0.446
 Originally+0.443
 �+0.436
Neg logit contributions
,-0.391
,...-0.312
.""-0.301
.-0.295
 society-0.244
Token fair
Token ID3148
Feature activation+1.62e+00

Pos logit contributions
+0.106
+0.095
 +0.087
 Originally+0.086
 �+0.085
Neg logit contributions
,-0.078
,...-0.061
.-0.059
.""-0.058
 society-0.048
Token $
Token ID720
Feature activation-1.81e+00

Pos logit contributions
+0.407
+0.360
 +0.329
 Originally+0.327
 �+0.316
Neg logit contributions
,-0.373
,...-0.297
.-0.287
.""-0.282
 society-0.251
Token20
Token ID1238
Feature activation-4.27e+00

Pos logit contributions
+0.103
+0.080
 +0.070
 Originally+0.070
 �+0.062
Neg logit contributions
,-0.237
.-0.200
,...-0.200
.""-0.195
 society-0.184
Token million
Token ID1510
Feature activation+2.39e+00

Pos logit contributions
+0.560
+0.512
 Originally+0.476
 +0.471
 �+0.464
Neg logit contributions
,-0.222
,...-0.153
.""-0.144
.-0.135
 society-0.121
Token that
Token ID326
Feature activation+8.53e-01

Pos logit contributions
,+0.151
,...+0.107
.+0.102
.""+0.099
 society+0.087
Neg logit contributions
-0.292
-0.265
 -0.247
 Originally-0.246
 �-0.239
Token was
Token ID373
Feature activation+8.71e-01

Pos logit contributions
,+0.074
,...+0.060
.+0.058
.""+0.058
 society+0.050
Neg logit contributions
-0.082
-0.074
 Originally-0.067
 -0.066
 �-0.064
Token passed
Token ID3804
Feature activation+9.75e-01

Pos logit contributions
,+0.089
.+0.071
,...+0.071
.""+0.068
 society+0.063
Neg logit contributions
-0.074
-0.063
 Originally-0.059
 -0.059
 �-0.055
Token to
Token ID284
Feature activation-5.97e+00

Pos logit contributions
,+0.120
,...+0.100
.+0.100
.""+0.097
 society+0.091
Neg logit contributions
-0.063
-0.050
 -0.045
 Originally-0.045
 �-0.041
Token help
Token ID1037
Feature activation+1.14e+00

Pos logit contributions
+0.581
+0.523
 Originally+0.474
 +0.472
 �+0.453
Neg logit contributions
,-0.516
,...-0.411
.-0.398
.""-0.390
 society-0.338
Token repair
Token ID9185
Feature activation+1.63e+00

Pos logit contributions
,+0.092
,...+0.071
.+0.069
.""+0.068
 society+0.058
Neg logit contributions
-0.119
-0.107
 Originally-0.098
 -0.098
 �-0.094
Token damage
Token ID2465
Feature activation+2.01e+00

Pos logit contributions
,+0.121
,...+0.094
.+0.090
.""+0.090
 society+0.074
Neg logit contributions
-0.178
-0.161
 Originally-0.146
 -0.145
 �-0.142
Token from
Token ID422
Feature activation-6.89e+00

Pos logit contributions
,+0.223
,...+0.184
.+0.181
.""+0.178
 society+0.166
Neg logit contributions
-0.151
-0.128
 -0.114
 Originally-0.113
 �-0.107
Token He
Token ID679
Feature activation+2.23e+00

Pos logit contributions
,+0.706
.+0.588
,...+0.576
.""+0.557
 society+0.521
Neg logit contributions
-0.391
-0.306
 -0.288
 Originally-0.284
 �-0.261
Token
Token ID447
Feature activation-3.43e+00

Pos logit contributions
,+0.188
.+0.145
,...+0.145
.""+0.140
 society+0.124
Neg logit contributions
-0.227
-0.200
 Originally-0.186
 -0.185
 �-0.178
Token
Token ID247
Feature activation+1.54e+00

Pos logit contributions
+0.223
+0.176
 Originally+0.154
 +0.154
 �+0.145
Neg logit contributions
,-0.424
.-0.354
,...-0.353
.""-0.347
 society-0.322
Tokens
Token ID82
Feature activation-2.46e+00

Pos logit contributions
,+0.193
.+0.162
,...+0.161
.""+0.158
 society+0.148
Neg logit contributions
-0.099
-0.077
 -0.068
 Originally-0.068
 �-0.063
Token passionate
Token ID15347
Feature activation+1.55e+00

Pos logit contributions
+0.245
+0.209
 Originally+0.195
 +0.193
 �+0.189
Neg logit contributions
,-0.218
,...-0.170
.-0.169
.""-0.163
 society-0.147
Token about
Token ID546
Feature activation-4.65e+00

Pos logit contributions
,+0.154
,...+0.124
.+0.123
.""+0.120
 society+0.110
Neg logit contributions
-0.137
-0.116
 -0.106
 Originally-0.105
 �-0.102
Token his
Token ID465
Feature activation+3.31e+00

Pos logit contributions
+0.518
+0.460
 +0.424
 Originally+0.418
 �+0.416
Neg logit contributions
,-0.348
,...-0.260
.-0.257
.""-0.254
 society-0.207
Token city
Token ID1748
Feature activation+6.42e+00

Pos logit contributions
,+0.241
.+0.175
,...+0.174
.""+0.166
 society+0.139
Neg logit contributions
-0.382
-0.339
 -0.319
 Originally-0.315
 �-0.309
Token,
Token ID11
Feature activation+7.53e+00

Pos logit contributions
,+0.410
,...+0.288
.+0.281
.""+0.279
 society+0.234
Neg logit contributions
-0.802
-0.705
 -0.659
 �-0.654
 Originally-0.650
Token his
Token ID465
Feature activation+3.30e+00

Pos logit contributions
,+0.599
.+0.446
,...+0.438
.""+0.417
 society+0.362
Neg logit contributions
-0.824
-0.722
 -0.685
 Originally-0.682
 �-0.656
Token community
Token ID2055
Feature activation+6.03e+00

Pos logit contributions
,+0.223
.+0.158
,...+0.156
.""+0.149
 society+0.118
Neg logit contributions
-0.396
-0.355
 -0.334
 Originally-0.330
 �-0.323
Tokenrations
Token ID9143
Feature activation-3.96e+00

Pos logit contributions
+0.072
+0.054
 Originally+0.045
 +0.044
 �+0.039
Neg logit contributions
,-0.205
,...-0.177
.-0.174
.""-0.171
 society-0.161
Token for
Token ID329
Feature activation-5.20e+00

Pos logit contributions
+0.306
+0.259
 +0.237
 Originally+0.236
 �+0.218
Neg logit contributions
,-0.435
,...-0.357
.-0.354
.""-0.345
 society-0.317
Token the
Token ID262
Feature activation+2.71e+00

Pos logit contributions
+0.542
+0.488
 +0.453
 Originally+0.447
 �+0.429
Neg logit contributions
,-0.423
,...-0.328
.-0.319
.""-0.311
 society-0.266
Token first
Token ID717
Feature activation-4.60e-01

Pos logit contributions
,+0.230
.+0.176
,...+0.175
.""+0.165
 society+0.145
Neg logit contributions
-0.279
-0.245
 -0.231
 Originally-0.230
 �-0.219
Token b
Token ID275
Feature activation-7.44e-01

Pos logit contributions
+0.047
+0.042
 +0.039
 Originally+0.039
 �+0.037
Neg logit contributions
,-0.038
,...-0.030
.-0.029
.""-0.028
 society-0.025
Tokenrit
Token ID799
Feature activation-9.31e-01

Pos logit contributions
+0.056
+0.046
 Originally+0.042
 +0.042
 �+0.039
Neg logit contributions
,-0.084
,...-0.070
.-0.068
.""-0.067
 society-0.062
Tokenish
Token ID680
Feature activation-2.12e+00

Pos logit contributions
+0.081
+0.070
 Originally+0.064
 +0.064
 �+0.060
Neg logit contributions
,-0.093
,...-0.075
.-0.074
.""-0.072
 society-0.066
Token edition
Token ID8313
Feature activation+1.92e-01

Pos logit contributions
+0.231
+0.207
 +0.194
 Originally+0.192
 �+0.184
Neg logit contributions
,-0.164
,...-0.124
.-0.122
.""-0.117
 society-0.099
Token of
Token ID286
Feature activation-4.87e+00

Pos logit contributions
,+0.023
,...+0.020
.+0.019
.""+0.019
 society+0.018
Neg logit contributions
-0.012
-0.010
 -0.009
 Originally-0.009
 �-0.008
Token '
Token ID705
Feature activation-1.42e-02

Pos logit contributions
+0.472
+0.422
 +0.390
 Originally+0.389
 �+0.366
Neg logit contributions
,-0.430
,...-0.341
.-0.334
.""-0.326
 society-0.285
TokenCharlie
Token ID37136
Feature activation+1.04e+00

Pos logit contributions
+0.001
+0.001
 +0.001
 Originally+0.001
 �+0.001
Neg logit contributions
,-0.002
,...-0.001
.-0.001
.""-0.001
 society-0.001
Token Sunday
Token ID3502
Feature activation+1.36e+00

Pos logit contributions
,+0.118
,...+0.091
.+0.088
.""+0.086
 society+0.074
Neg logit contributions
-0.161
-0.144
 Originally-0.133
 -0.133
 �-0.128
Token.
Token ID13
Feature activation+5.06e+00

Pos logit contributions
,+0.080
,...+0.057
.""+0.055
.+0.053
 society+0.045
Neg logit contributions
-0.170
-0.157
 -0.145
 Originally-0.144
 �-0.140
Token
Token ID198
Feature activation-6.02e+00

Pos logit contributions
,+0.820
.+0.717
,...+0.706
.""+0.690
 society+0.659
Neg logit contributions
-0.137
-0.063
 -0.048
 Originally-0.045
 �-0.024
Token
Token ID198
Feature activation-6.00e+00

Pos logit contributions
+0.112
 +0.013
 Originally+0.010
+0.000
 Regarding-0.002
Neg logit contributions
,-1.038
.-0.916
,...-0.880
.""-0.853
 society-0.818
TokenSt
Token ID1273
Feature activation-1.34e+00

Pos logit contributions
+0.403
+0.320
 Originally+0.281
 +0.274
 �+0.255
Neg logit contributions
,-0.730
,...-0.614
.-0.605
.""-0.592
 society-0.568
Tokenri
Token ID380
Feature activation-3.61e+00

Pos logit contributions
+0.101
+0.086
 Originally+0.076
 +0.076
 �+0.071
Neg logit contributions
,-0.148
,...-0.125
.""-0.119
.-0.119
 society-0.111
Tokenking
Token ID3364
Feature activation-2.02e+00

Pos logit contributions
+0.291
+0.248
 Originally+0.227
 +0.223
 �+0.211
Neg logit contributions
,-0.382
,...-0.313
.-0.309
.""-0.306
 society-0.279
Token images
Token ID4263
Feature activation+2.78e+00

Pos logit contributions
+0.201
+0.177
 Originally+0.165
 +0.164
 �+0.157
Neg logit contributions
,-0.175
,...-0.137
.-0.135
.""-0.130
 society-0.116
Token capture
Token ID8006
Feature activation-3.75e+00

Pos logit contributions
,+0.331
.+0.276
,...+0.275
.""+0.268
 society+0.249
Neg logit contributions
-0.190
-0.155
 -0.139
 Originally-0.139
 �-0.129
Token the
Token ID262
Feature activation+2.28e+00

Pos logit contributions
+0.422
+0.379
 Originally+0.349
 +0.348
 �+0.341
Neg logit contributions
,-0.274
,...-0.208
.-0.201
.""-0.195
 society-0.161
Token moment
Token ID2589
Feature activation+6.28e-01

Pos logit contributions
,+0.181
.+0.136
,...+0.136
.""+0.129
 society+0.110
Neg logit contributions
-0.247
-0.218
 -0.204
 Originally-0.204
 �-0.197
Token thing
Token ID1517
Feature activation+4.02e+00

Pos logit contributions
,+0.059
,...+0.046
.+0.046
.""+0.044
 society+0.039
Neg logit contributions
-0.065
-0.056
 -0.053
 Originally-0.052
 �-0.051
Token I
Token ID314
Feature activation+3.57e+00

Pos logit contributions
,+0.363
,...+0.288
.+0.285
.""+0.276
 society+0.246
Neg logit contributions
-0.391
-0.335
 -0.313
 Originally-0.309
 �-0.304
Token felt
Token ID2936
Feature activation+9.55e-01

Pos logit contributions
,+0.334
.+0.262
,...+0.258
.""+0.247
 society+0.224
Neg logit contributions
-0.339
-0.290
 -0.275
 Originally-0.274
 �-0.260
Token very
Token ID845
Feature activation-1.46e+00

Pos logit contributions
,+0.089
,...+0.070
.+0.070
.""+0.067
 society+0.061
Neg logit contributions
-0.091
-0.078
 -0.073
 Originally-0.072
 �-0.070
Token sure
Token ID1654
Feature activation+3.39e-01

Pos logit contributions
+0.136
+0.118
 +0.110
 Originally+0.108
 �+0.104
Neg logit contributions
,-0.135
,...-0.108
.-0.107
.""-0.103
 society-0.093
Token of
Token ID286
Feature activation-5.11e+00

Pos logit contributions
,+0.037
,...+0.030
.+0.030
.""+0.029
 society+0.027
Neg logit contributions
-0.027
-0.022
 -0.020
 Originally-0.020
 �-0.019
Token in
Token ID287
Feature activation-6.21e+00

Pos logit contributions
+0.495
+0.422
 +0.393
 Originally+0.389
 �+0.382
Neg logit contributions
,-0.462
,...-0.368
.-0.363
.""-0.355
 society-0.323
Token my
Token ID616
Feature activation+3.85e+00

Pos logit contributions
+0.704
+0.618
 +0.584
 Originally+0.581
 �+0.572
Neg logit contributions
,-0.466
,...-0.349
.-0.343
.""-0.327
 society-0.276
Token mind
Token ID2000
Feature activation+4.04e+00

Pos logit contributions
,+0.317
.+0.239
,...+0.236
.""+0.223
 society+0.197
Neg logit contributions
-0.411
-0.358
 -0.341
 Originally-0.339
 �-0.326
Token is
Token ID318
Feature activation+1.15e+00

Pos logit contributions
,+0.383
,...+0.306
.+0.304
.""+0.296
 society+0.270
Neg logit contributions
-0.378
-0.319
 -0.298
 Originally-0.294
 �-0.287
Token that
Token ID326
Feature activation+1.41e+00

Pos logit contributions
,+0.093
,...+0.073
.+0.071
.""+0.069
 society+0.060
Neg logit contributions
-0.123
-0.107
 -0.100
 �-0.099
 Originally-0.098
Tokenord
Token ID585
Feature activation+3.43e-01

Pos logit contributions
+0.294
+0.251
 Originally+0.230
 +0.229
 �+0.215
Neg logit contributions
,-0.339
,...-0.278
.-0.268
.""-0.265
 society-0.247
Token persist
Token ID21160
Feature activation-2.13e+00

Pos logit contributions
,+0.032
,...+0.026
.+0.026
.""+0.025
 society+0.023
Neg logit contributions
-0.031
-0.027
 -0.025
 Originally-0.025
 �-0.024
Tokene
Token ID68
Feature activation-3.93e+00

Pos logit contributions
+0.136
+0.109
 +0.098
 Originally+0.095
 �+0.087
Neg logit contributions
,-0.262
,...-0.220
.-0.219
.""-0.215
 society-0.202
Token,
Token ID11
Feature activation+7.40e+00

Pos logit contributions
+0.368
+0.317
 +0.295
 Originally+0.290
 �+0.278
Neg logit contributions
,-0.360
,...-0.287
.-0.283
.""-0.275
 society-0.254
Token les
Token ID10287
Feature activation+4.92e-01

Pos logit contributions
,+0.734
.+0.584
,...+0.572
.""+0.549
 society+0.502
Neg logit contributions
-0.664
-0.561
 -0.533
 Originally-0.530
 �-0.499
Token F
Token ID376
Feature activation-1.24e+00

Pos logit contributions
,+0.045
,...+0.036
.+0.036
.""+0.034
 society+0.031
Neg logit contributions
-0.047
-0.041
 Originally-0.038
 -0.038
 �-0.036
Tokenond
Token ID623
Feature activation-3.01e+00

Pos logit contributions
+0.077
+0.062
 Originally+0.055
 +0.054
 �+0.049
Neg logit contributions
,-0.153
,...-0.131
.-0.126
.""-0.125
 society-0.120
Tokenateurs
Token ID45211
Feature activation+1.17e+00

Pos logit contributions
+0.250
+0.215
 Originally+0.197
 +0.195
 �+0.181
Neg logit contributions
,-0.309
,...-0.252
.-0.248
.""-0.242
 society-0.226
Token pre
Token ID662
Feature activation-1.65e+00

Pos logit contributions
,+0.111
,...+0.089
.+0.088
.""+0.085
 society+0.078
Neg logit contributions
-0.105
-0.091
 -0.085
 Originally-0.083
 �-0.080
Tokennd
Token ID358
Feature activation-1.59e+00

Pos logit contributions
+0.125
+0.103
 +0.094
 Originally+0.094
 �+0.086
Neg logit contributions
,-0.186
,...-0.155
.-0.152
.""-0.149
 society-0.139
Tokenront
Token ID4298
Feature activation-1.65e+00

Pos logit contributions
+0.091
+0.070
 +0.061
 Originally+0.061
 �+0.055
Neg logit contributions
,-0.208
,...-0.176
.-0.176
.""-0.171
 society-0.162
Token will
Token ID481
Feature activation-7.67e-01

Pos logit contributions
,+0.078
.+0.063
,...+0.063
.""+0.061
 society+0.056
Neg logit contributions
-0.064
-0.054
 -0.050
 Originally-0.049
 �-0.047
Token be
Token ID307
Feature activation-2.17e+00

Pos logit contributions
+0.067
+0.057
 +0.053
 Originally+0.053
 �+0.050
Neg logit contributions
,-0.077
.-0.062
,...-0.062
.""-0.059
 society-0.054
Token welcome
Token ID7062
Feature activation-3.38e+00

Pos logit contributions
+0.187
+0.160
 +0.148
 Originally+0.147
 �+0.139
Neg logit contributions
,-0.220
,...-0.177
.-0.176
.""-0.169
 society-0.154
Token.
Token ID13
Feature activation+4.65e+00

Pos logit contributions
+0.493
+0.451
 +0.427
 Originally+0.423
 �+0.418
Neg logit contributions
,-0.135
,...-0.076
.-0.066
.""-0.065
 society-0.043
Token
Token ID198
Feature activation-6.50e+00

Pos logit contributions
,+0.663
.+0.569
,...+0.555
.""+0.540
 society+0.513
Neg logit contributions
-0.212
-0.143
 -0.133
 Originally-0.130
 �-0.108
Token
Token ID198
Feature activation-6.62e+00

Pos logit contributions
+0.123
 +0.015
 Originally+0.014
+0.000
 Regarding-0.005
Neg logit contributions
,-1.121
.-0.991
,...-0.948
.""-0.920
 society-0.885
TokenTake
Token ID12322
Feature activation-2.20e+00

Pos logit contributions
+0.468
+0.370
 +0.328
 Originally+0.328
 �+0.309
Neg logit contributions
,-0.790
,...-0.661
.-0.653
.""-0.638
 society-0.600
Token Home
Token ID5995
Feature activation+1.82e+00

Pos logit contributions
+0.222
+0.196
 +0.182
 Originally+0.180
 �+0.174
Neg logit contributions
,-0.186
,...-0.147
.-0.144
.""-0.140
 society-0.123
Token Message
Token ID16000
Feature activation+8.54e-01

Pos logit contributions
,+0.217
,...+0.182
.+0.181
.""+0.177
 society+0.164
Neg logit contributions
-0.124
-0.103
 Originally-0.092
 -0.091
 �-0.085
Token 8
Token ID807
Feature activation-4.62e+00

Pos logit contributions
,+0.134
.+0.117
,...+0.116
.""+0.114
 society+0.108
Neg logit contributions
-0.027
-0.016
 -0.012
 Originally-0.012
 �-0.008
Token:
Token ID25
Feature activation+1.34e+00

Pos logit contributions
+0.464
+0.406
 +0.376
 Originally+0.376
 �+0.360
Neg logit contributions
,-0.404
,...-0.317
.-0.307
.""-0.303
 society-0.271
Tokenreme
Token ID2182
Feature activation-2.69e+00

Pos logit contributions
,+0.116
,...+0.096
.+0.095
.""+0.093
 society+0.087
Neg logit contributions
-0.071
-0.059
 Originally-0.053
 -0.053
 �-0.049
Token
Token ID198
Feature activation-6.27e+00

Pos logit contributions
+0.111
+0.084
 Originally+0.065
 +0.064
 �+0.052
Neg logit contributions
,-0.392
,...-0.340
.-0.337
.""-0.332
 society-0.313
Token
Token ID198
Feature activation-6.32e+00

Pos logit contributions
+0.118
 +0.011
 Originally+0.010
+0.000
 Regarding-0.006
Neg logit contributions
,-1.083
.-0.957
,...-0.922
.""-0.892
 society-0.856
TokenWh
Token ID1199
Feature activation-1.57e+00

Pos logit contributions
+0.447
+0.368
 Originally+0.324
 +0.320
 �+0.295
Neg logit contributions
,-0.737
,...-0.614
.-0.606
.""-0.597
 society-0.566
Tokenale
Token ID1000
Feature activation-3.86e+00

Pos logit contributions
+0.116
+0.097
 Originally+0.088
 +0.087
 �+0.081
Neg logit contributions
,-0.178
,...-0.147
.-0.145
.""-0.142
 society-0.133
Token
Token ID198
Feature activation-6.31e+00

Pos logit contributions
+0.211
+0.168
 +0.142
 Originally+0.142
 �+0.125
Neg logit contributions
,-0.511
,...-0.436
.-0.433
.""-0.425
 society-0.398
Token
Token ID198
Feature activation-6.34e+00

Pos logit contributions
+0.117
 +0.010
 Originally+0.010
+0.000
 Regarding-0.007
Neg logit contributions
,-1.090
.-0.963
,...-0.928
.""-0.898
 society-0.862
TokenPrime
Token ID26405
Feature activation-1.47e+00

Pos logit contributions
+0.450
+0.371
 Originally+0.326
 +0.323
 �+0.298
Neg logit contributions
,-0.739
,...-0.615
.-0.608
.""-0.598
 society-0.566
Token
Token ID198
Feature activation-6.31e+00

Pos logit contributions
+0.039
+0.024
 Originally+0.014
 +0.013
 �+0.007
Neg logit contributions
,-0.236
,...-0.208
.-0.206
.""-0.204
 society-0.193
Token
Token ID198
Feature activation-6.31e+00

Pos logit contributions
+0.118
 +0.011
 Originally+0.010
+0.000
 Regarding-0.006
Neg logit contributions
,-1.090
.-0.962
,...-0.927
.""-0.897
 society-0.861
TokenMar
Token ID7676
Feature activation-1.67e+00

Pos logit contributions
+0.447
+0.368
 Originally+0.324
 +0.320
 �+0.296
Neg logit contributions
,-0.735
,...-0.612
.-0.605
.""-0.595
 society-0.564
TokenK
Token ID42
Feature activation-1.23e+00

Pos logit contributions
+0.275
+0.239
 +0.216
 Originally+0.215
 �+0.200
Neg logit contributions
,-0.347
,...-0.285
.-0.279
.""-0.274
 society-0.257
TokenES
Token ID1546
Feature activation-2.22e+00

Pos logit contributions
+0.108
+0.095
 +0.087
 Originally+0.086
 �+0.081
Neg logit contributions
,-0.120
,...-0.096
.-0.095
.""-0.094
 society-0.087
Token.
Token ID13
Feature activation+4.94e+00

Pos logit contributions
+0.224
+0.200
 +0.187
 Originally+0.185
 �+0.175
Neg logit contributions
,-0.189
,...-0.145
.-0.144
.""-0.142
 society-0.129
Token We
Token ID775
Feature activation+1.25e+00

Pos logit contributions
,+0.631
.+0.531
,...+0.520
.""+0.504
 society+0.474
Neg logit contributions
-0.302
-0.229
 -0.213
 Originally-0.212
 �-0.191
Token surprised
Token ID6655
Feature activation-2.24e+00

Pos logit contributions
,+0.121
.+0.096
,...+0.095
.""+0.091
 society+0.084
Neg logit contributions
-0.113
-0.097
 -0.091
 Originally-0.091
 �-0.085
Token at
Token ID379
Feature activation-6.71e+00

Pos logit contributions
+0.246
+0.222
 +0.208
 Originally+0.206
 �+0.198
Neg logit contributions
,-0.168
,...-0.126
.-0.124
.""-0.119
 society-0.103
Token least
Token ID1551
Feature activation-6.24e-01

Pos logit contributions
+0.706
+0.633
 +0.590
 Originally+0.586
 �+0.562
Neg logit contributions
,-0.536
,...-0.410
.-0.404
.""-0.394
 society-0.338
Token two
Token ID734
Feature activation-2.74e-01

Pos logit contributions
+0.058
+0.051
 +0.047
 Originally+0.047
 �+0.044
Neg logit contributions
,-0.058
,...-0.046
.-0.046
.""-0.044
 society-0.040
Token big
Token ID1263
Feature activation+3.21e+00

Pos logit contributions
+0.027
+0.024
 +0.022
 Originally+0.022
 �+0.021
Neg logit contributions
,-0.024
,...-0.019
.-0.019
.""-0.018
 society-0.016
Token suck
Token ID10110
Feature activation-1.29e+00

Pos logit contributions
,+0.271
,...+0.210
.+0.208
.""+0.204
 society+0.179
Neg logit contributions
-0.324
-0.287
 -0.267
 Originally-0.264
 �-0.254
Tokeners
Token ID364
Feature activation-3.53e+00

Pos logit contributions
+0.093
+0.077
 +0.069
 Originally+0.069
 �+0.063
Neg logit contributions
,-0.149
,...-0.124
.-0.122
.""-0.120
 society-0.111
Token,
Token ID11
Feature activation+6.73e+00

Pos logit contributions
+0.445
+0.401
 +0.375
 Originally+0.371
 �+0.366
Neg logit contributions
,-0.236
,...-0.169
.""-0.165
.-0.160
 society-0.136
Token because
Token ID780
Feature activation+1.05e+00

Pos logit contributions
,+0.631
.+0.495
,...+0.479
.""+0.459
 society+0.418
Neg logit contributions
-0.640
-0.545
 -0.524
 Originally-0.521
 Regarding-0.490
Token they
Token ID484
Feature activation+4.78e+00

Pos logit contributions
,+0.085
,...+0.065
.+0.065
.""+0.063
 society+0.054
Neg logit contributions
-0.111
-0.098
 -0.091
 Originally-0.090
 �-0.088
Token
Token ID447
Feature activation-4.23e+00

Pos logit contributions
,+0.470
.+0.375
,...+0.374
.""+0.360
 society+0.328
Neg logit contributions
-0.436
-0.365
 -0.339
 Originally-0.336
 �-0.327
Token
Token ID247
Feature activation+7.83e-01

Pos logit contributions
+0.376
+0.317
 +0.301
 Originally+0.299
 �+0.281
Neg logit contributions
,-0.423
.-0.337
,...-0.331
.""-0.317
 society-0.291
Tokenre
Token ID260
Feature activation-6.31e+00

Pos logit contributions
,+0.099
,...+0.083
.+0.083
.""+0.081
 society+0.076
Neg logit contributions
-0.049
-0.038
 -0.033
 Originally-0.033
 �-0.031
Token black
Token ID2042
Feature activation+2.55e+00

Pos logit contributions
+0.637
+0.547
 +0.509
 Originally+0.505
 �+0.494
Neg logit contributions
,-0.550
,...-0.430
.-0.425
.""-0.409
 society-0.365
Token and
Token ID290
Feature activation+6.43e-01

Pos logit contributions
,+0.105
,...+0.062
.""+0.058
.+0.053
 society+0.035
Neg logit contributions
-0.369
-0.335
 -0.316
 �-0.311
 Originally-0.310
Token female
Token ID4048
Feature activation+3.10e+00

Pos logit contributions
,+0.052
,...+0.040
.+0.040
.""+0.039
 society+0.033
Neg logit contributions
-0.068
-0.059
 -0.056
 Originally-0.055
 �-0.054
Token.
Token ID13
Feature activation+5.11e+00

Pos logit contributions
,+0.108
,...+0.056
.""+0.052
.+0.045
 society+0.025
Neg logit contributions
-0.466
-0.427
 -0.404
 �-0.396
 Originally-0.395
Token
Token ID198
Feature activation-6.01e+00

Pos logit contributions
,+0.775
.+0.672
,...+0.657
.""+0.639
 society+0.610
Neg logit contributions
-0.188
-0.112
 -0.099
 Originally-0.097
 �-0.075
Token.
Token ID13
Feature activation+4.29e+00

Pos logit contributions
+0.548
+0.491
 +0.464
 Originally+0.458
 �+0.448
Neg logit contributions
,-0.279
,...-0.194
.-0.186
.""-0.183
 society-0.158
Token It
Token ID632
Feature activation+8.48e-01

Pos logit contributions
,+0.544
.+0.457
,...+0.445
.""+0.430
 society+0.405
Neg logit contributions
-0.265
-0.201
 -0.190
 Originally-0.189
 �-0.170
Token would
Token ID561
Feature activation-1.12e+00

Pos logit contributions
,+0.075
.+0.058
,...+0.057
.""+0.055
 society+0.049
Neg logit contributions
-0.086
-0.074
 -0.070
 Originally-0.069
 �-0.067
Token be
Token ID307
Feature activation-2.54e+00

Pos logit contributions
+0.097
+0.082
 +0.077
 Originally+0.076
 �+0.072
Neg logit contributions
,-0.112
,...-0.091
.-0.090
.""-0.087
 society-0.080
Token interesting
Token ID3499
Feature activation-1.89e+00

Pos logit contributions
+0.257
+0.225
 +0.213
 Originally+0.211
 �+0.201
Neg logit contributions
,-0.218
,...-0.167
.-0.167
.""-0.159
 society-0.141
Token to
Token ID284
Feature activation-6.92e+00

Pos logit contributions
+0.096
+0.072
 +0.061
 Originally+0.058
 �+0.055
Neg logit contributions
,-0.255
,...-0.219
.-0.218
.""-0.214
 society-0.202
Token see
Token ID766
Feature activation-3.28e+00

Pos logit contributions
+0.626
+0.541
 +0.505
 Originally+0.495
 �+0.475
Neg logit contributions
,-0.668
,...-0.532
.-0.530
.""-0.509
 society-0.459
Token how
Token ID703
Feature activation+3.83e-01

Pos logit contributions
+0.340
+0.299
 +0.277
 Originally+0.273
 �+0.267
Neg logit contributions
,-0.273
,...-0.211
.-0.207
.""-0.204
 society-0.177
Token the
Token ID262
Feature activation+1.43e+00

Pos logit contributions
,+0.030
,...+0.022
.+0.022
.""+0.021
 society+0.018
Neg logit contributions
-0.042
-0.037
 -0.035
 Originally-0.034
 �-0.033
Token cub
Token ID13617
Feature activation-5.07e-01

Pos logit contributions
,+0.116
.+0.087
,...+0.087
.""+0.083
 society+0.071
Neg logit contributions
-0.153
-0.134
 -0.126
 Originally-0.125
 �-0.121
Tokens
Token ID82
Feature activation-4.01e+00

Pos logit contributions
+0.031
+0.024
 +0.021
 Originally+0.021
 �+0.019
Neg logit contributions
,-0.064
,...-0.054
.-0.054
.""-0.053
 society-0.049