TOP ACTIVATIONS
MAX = 17.784

from the window . Rel uct antly , he came away
. Rel uct antly , Elsa left the
sea monster insisted . Rel uct antly , it followed him
buy his own . Rel uct antly , his friend gave
. F right ened , Lucy screamed and
in one piece . All right ?" John nodded
dragon ! F right ened , Mary started running
a tall tree . Un bek now st to him ,
and Tom felt a strange sensation in their stomach s .
and Ben felt a funny sensation in their stomach s .
felt a strange t ingly sensation and when it stopped they
out loud and enjoyed the sensation of being so high up
a hot and tang y sensation in their mouths . They
and Tom experienced a strange sensation they had never felt before
, they felt a strange sensation . They felt light and
Suddenly , he felt a sensation . He began to slide
of new treasures . Del ight ed , he explored each
a pleasant t ing ling sensation throughout his body . The
feels a hot and burning sensation in her mouth . She
he felt a fl apping sensation in the air . It

MIN ACTIVATIONS
MIN = -11.976

and they pretended to be builders . They used their trucks
you . You are great builders ," Dad said .
pretend they are drivers and builders . They make roads with
, " You are not builders , you are boys .
. They pretend they are builders and make roads and houses
. Daniel even watched the builders make a big sand castle
not dirty , we are builders ." Their mom
! We are the best builders !" Sam cheered .
and Lily , the best builders ever . And we are
. We are the best builders ever !" Lily said .
liked to pretend they were builders and fix things around the
liked to pretend they were builders and make things with the
were good friends and good builders . <|endoftext|> L ily and
liked to pretend they were builders and fix things with their
later . They are good builders . <|endoftext|> Tom and Mia
blue berries ?" " Yes , please !" Lily
L incoln ". " Who is Lincoln , Ben
they decided to be mail carriers and deliver letters to their
his dad . " Dad , Dad ! What
at her . " Thank you , z ebra

INTERVAL 10.344 - 17.784
CONTAINS 0.153%

€ � � Let â € ™ s find out .
€ ˜ thank you â € ™ . One day she
for an operation . â € Tim my asked
€ � � That â € ™ s so cool !
friends . â € � � I don â

INTERVAL 2.904 - 10.344
CONTAINS 13.029%

helped stop the thief . From that day on , Blue
felt sad , but she didn 't let the ducks bring
sky . One day , a three year old
cream was clean . But it still smelled sm elly !
Once upon a time , in a small toy room ,

INTERVAL -4.536 - 2.904
CONTAINS 82.320%

tooth . L ily 's mom told her
Amy thanked Manuel , waved goodbye and left the
rabbit hopped out and John felt so happy . He reached
ucky and his friends sw am around the pond , making
so she went to the garden to think about what she

INTERVAL -11.976 - -4.536
CONTAINS 4.498%

little girl was so happy that she could carry her dolls
to the beach with his family . The water was very
her mom . " Yes , you can ,
bird how to fly and how to find worms to eat
said he didn 't know how . Jerry decided
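
The four INTERVAL blocks above look like equal-width bins spanning MIN to MAX. The following is a minimal Python sketch under that assumption; the names `bin_edges`, `MIN_ACT`, `MAX_ACT` and `N_BINS` are illustrative and do not come from the dashboard. It rebuilds the interval boundaries and checks that the CONTAINS percentages add up to 100.

```python
# Sketch: rebuild the interval boundaries from the reported MIN and MAX,
# assuming equal-width bins (an assumption, though it matches the listed edges).
MIN_ACT = -11.976   # from "MIN = -11.976"
MAX_ACT = 17.784    # from "MAX = 17.784"
N_BINS = 4          # four INTERVAL blocks are listed


def bin_edges(lo, hi, n):
    """Return the n + 1 edges of n equal-width bins on [lo, hi], rounded to 3 places."""
    width = (hi - lo) / n
    return [round(lo + i * width, 3) for i in range(n + 1)]


edges = bin_edges(MIN_ACT, MAX_ACT, N_BINS)

# Share of tokens per interval, copied from the CONTAINS lines (highest bin first).
shares = [0.153, 13.029, 82.320, 4.498]
total = round(sum(shares), 3)

print(edges)
print(total)
```

With a bin width of 7.44, the edges come out as -11.976, -4.536, 2.904, 10.344 and 17.784, which agrees with the INTERVAL headers. Most tokens (82.320%) fall in the bin that straddles zero.
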
Token from
Token ID422
Feature activation+4.49e+00

Pos logit contributions
rated+0.403
aned+0.343
ently+0.323
err+0.304
rily+0.296
Neg logit contributions
 who-0.313
 detectives-0.309
 builders-0.302
 kings-0.301
 spies-0.291
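
Each per-token record in this dump follows the same layout: a `Token` line, a `Token ID` line, a `Feature activation` line, then the positive and negative logit contributions. The sketch below shows one way to turn such a record into a dictionary. It assumes the layout is exactly as it appears here; `parse_record` and `ENTRY` are hypothetical names and not part of any dashboard tooling.

```python
import re

# Assumed entry layout, inferred from the dump: token text fused with a signed
# decimal, e.g. "rated+0.403" or " who-0.313".
ENTRY = re.compile(r"^(.*?)([+-]\d+\.\d+)$")


def parse_record(lines):
    """Parse one per-token block into a dict (a sketch, not the dashboard's own code)."""
    rec = {"token": None, "token_id": None, "activation": None, "pos": [], "neg": []}
    bucket = None  # list currently receiving logit entries
    for line in lines:
        if line.startswith("Token ID"):
            rec["token_id"] = int(line[len("Token ID"):])
        elif line.startswith("Feature activation"):
            rec["activation"] = float(line[len("Feature activation"):])
        elif line.startswith("Pos logit"):
            bucket = rec["pos"]
        elif line.startswith("Neg logit"):
            bucket = rec["neg"]
        elif bucket is None and line.startswith("Token"):
            # Keep any leading space: it is part of the token itself.
            rec["token"] = line[len("Token"):]
        elif line.strip() and bucket is not None:
            m = ENTRY.match(line)
            if m:
                bucket.append((m.group(1), float(m.group(2))))
    return rec


sample = [
    "Token from",
    "Token ID422",
    "Feature activation+4.49e+00",
    "",
    "Pos logit contributions",
    "rated+0.403",
    "Neg logit contributions",
    " who-0.313",
]
rec = parse_record(sample)
print(rec)
```

Leading whitespace is kept on purpose, since ` from` and `from` are different tokens in a byte-pair vocabulary.
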
Token the
Token ID262
Feature activation+1.47e+00

Pos logit contributions
rated+0.247
aned+0.212
ently+0.205
err+0.193
rily+0.191
Neg logit contributions
 detectives-0.137
 who-0.137
 builders-0.134
 kings-0.132
 spies-0.127
Token window
Token ID4324
Feature activation+2.34e+00

Pos logit contributions
rated+0.076
aned+0.061
ently+0.060
err+0.058
 sensation+0.057
Neg logit contributions
 detectives-0.050
 who-0.048
 builders-0.048
 kings-0.047
 spies-0.046
Token.
Token ID13
Feature activation+1.90e+00

Pos logit contributions
rated+0.116
aned+0.099
ently+0.093
err+0.089
rily+0.086
Neg logit contributions
 who-0.087
 detectives-0.083
 kings-0.082
 builders-0.082
 farmers-0.079
Token Rel
Token ID4718
Feature activation+4.58e+00

Pos logit contributions
rated+0.100
aned+0.087
ently+0.083
err+0.079
rily+0.077
Neg logit contributions
 detectives-0.062
 who-0.061
 builders-0.061
 kings-0.061
 spies-0.058
Tokenuct
Token ID4782
Feature activation+1.78e+01

Pos logit contributions
rated+0.121
aned+0.079
err+0.073
ently+0.070
rily+0.064
Neg logit contributions
 who-0.276
 detectives-0.264
 kings-0.262
 builders-0.261
 farmers-0.254
Tokenantly
Token ID3875
Feature activation+4.78e+00

Pos logit contributions
rated+0.665
aned+0.486
ently+0.438
 sensation+0.436
ulp+0.415
Neg logit contributions
 who-0.905
 police-0.848
 builders-0.847
 detectives-0.835
 kings-0.835
Token,
Token ID11
Feature activation-3.74e+00

Pos logit contributions
rated+0.274
aned+0.237
ently+0.227
err+0.217
rily+0.212
Neg logit contributions
 who-0.140
 detectives-0.137
 builders-0.134
 kings-0.133
 spies-0.127
Token he
Token ID339
Feature activation-1.33e+00

Pos logit contributions
 who+0.105
 detectives+0.104
 builders+0.101
 kings+0.100
 spies+0.096
Neg logit contributions
rated-0.217
aned-0.189
ently-0.181
err-0.172
rily-0.169
Token came
Token ID1625
Feature activation-3.72e+00

Pos logit contributions
 detectives+0.046
 builders+0.045
 spies+0.043
 kings+0.043
 farmers+0.042
Neg logit contributions
rated-0.069
aned-0.059
ently-0.055
err-0.052
rily-0.051
Token away
Token ID1497
Feature activation+8.28e+00

Pos logit contributions
 detectives+0.149
 who+0.147
 builders+0.146
 kings+0.145
 spies+0.140
Neg logit contributions
rated-0.170
aned-0.142
ently-0.135
err-0.127
rily-0.123
Token.
Token ID13
Feature activation+1.78e+00

Pos logit contributions
rated+0.133
aned+0.115
ently+0.107
err+0.100
rily+0.098
Neg logit contributions
 who-0.096
 detectives-0.093
 kings-0.090
 builders-0.090
 spies-0.088
Token (single space)
Token ID220
Feature activation+6.47e+00

Pos logit contributions
rated+0.103
aned+0.088
ently+0.084
err+0.081
rily+0.077
Neg logit contributions
 who-0.053
 detectives-0.051
 builders-0.048
 kings-0.048
 spies-0.047
Token (newline)
Token ID198
Feature activation-2.03e+00

Pos logit contributions
rated+0.415
aned+0.358
ently+0.342
err+0.330
rily+0.328
Neg logit contributions
 who-0.152
 detectives-0.141
 builders-0.135
 kings-0.134
 spies-0.129
Token (newline)
Token ID198
Feature activation-2.05e+00

Pos logit contributions
 detectives+0.058
 kings+0.057
 builders+0.057
 who+0.056
 spies+0.053
Neg logit contributions
rated-0.115
aned-0.102
ently-0.098
err-0.093
rily-0.091
TokenRel
Token ID6892
Feature activation+8.50e+00

Pos logit contributions
 kings+0.063
 builders+0.062
 detectives+0.061
 spies+0.058
 who+0.057
Neg logit contributions
rated-0.113
aned-0.100
ently-0.096
rily-0.090
err-0.089
Tokenuct
Token ID4782
Feature activation+1.77e+01

Pos logit contributions
rated+0.295
 sensation+0.220
err+0.209
rum+0.204
rily+0.188
Neg logit contributions
 who-0.441
Playing-0.409
 detectives-0.403
 kings-0.396
 named-0.391
Tokenantly
Token ID3875
Feature activation+4.84e+00

Pos logit contributions
rated+0.640
aned+0.471
ently+0.428
rum+0.409
 sensation+0.408
Neg logit contributions
 who-0.905
 detectives-0.861
 builders-0.860
 kings-0.850
 police-0.848
Token,
Token ID11
Feature activation-3.83e+00

Pos logit contributions
rated+0.260
aned+0.226
ently+0.215
err+0.202
rily+0.201
Neg logit contributions
 detectives-0.153
 who-0.152
 builders-0.150
 kings-0.150
 spies-0.142
Token Elsa
Token ID19226
Feature activation+6.47e-01

Pos logit contributions
 who+0.107
 detectives+0.106
 builders+0.103
 kings+0.102
 spies+0.097
Neg logit contributions
rated-0.224
aned-0.195
ently-0.187
err-0.178
rily-0.175
Token left
Token ID1364
Feature activation-2.41e+00

Pos logit contributions
rated+0.033
aned+0.029
ently+0.027
err+0.026
rily+0.025
Neg logit contributions
 detectives-0.022
 builders-0.022
 kings-0.021
 spies-0.020
 who-0.020
Token the
Token ID262
Feature activation+1.37e+00

Pos logit contributions
 who+0.075
 detectives+0.073
 kings+0.071
 builders+0.071
 spies+0.068
Neg logit contributions
rated-0.136
aned-0.116
ently-0.110
err-0.106
rily-0.103
Token sea
Token ID5417
Feature activation-9.56e-01

Pos logit contributions
rated+0.072
aned+0.059
 sensation+0.057
err+0.056
ently+0.056
Neg logit contributions
 builders-0.050
 who-0.050
 detectives-0.050
 kings-0.049
 farmers-0.046
Token monster
Token ID9234
Feature activation-2.11e-02

Pos logit contributions
 detectives+0.032
 who+0.031
 builders+0.031
 kings+0.031
 spies+0.030
Neg logit contributions
rated-0.050
aned-0.042
ently-0.041
err-0.038
rily-0.038
Token insisted
Token ID11189
Feature activation+3.62e+00

Pos logit contributions
 who+0.001
 detectives+0.001
 kings+0.001
 builders+0.001
 spies+0.001
Neg logit contributions
rated-0.001
aned-0.001
ently-0.001
err-0.001
rily-0.001
Token.
Token ID13
Feature activation+2.02e+00

Pos logit contributions
rated+0.224
aned+0.188
err+0.180
ently+0.168
rily+0.165
Neg logit contributions
 who-0.110
 builders-0.092
 detectives-0.091
 kings-0.091
 farmers-0.091
Token Rel
Token ID4718
Feature activation+4.52e+00

Pos logit contributions
rated+0.113
aned+0.096
ently+0.090
err+0.085
rily+0.083
Neg logit contributions
 who-0.067
 detectives-0.064
 builders-0.062
 kings-0.061
 spies-0.060
Tokenuct
Token ID4782
Feature activation+1.77e+01

Pos logit contributions
rated+0.152
aned+0.113
ently+0.103
err+0.103
rily+0.096
Neg logit contributions
 who-0.239
 detectives-0.230
 kings-0.228
 builders-0.226
 farmers-0.220
Tokenantly
Token ID3875
Feature activation+4.84e+00

Pos logit contributions
rated+0.693
aned+0.525
 sensation+0.476
ently+0.473
 spine+0.466
Neg logit contributions
 who-0.858
We-0.806
 police-0.805
 builders-0.793
 named-0.787
Token,
Token ID11
Feature activation-3.65e+00

Pos logit contributions
rated+0.304
aned+0.261
ently+0.250
err+0.246
rily+0.232
Neg logit contributions
 who-0.127
 detectives-0.121
 builders-0.119
 kings-0.114
 farmers-0.111
Token it
Token ID340
Feature activation+3.76e+00

Pos logit contributions
 who+0.101
 detectives+0.100
 builders+0.097
 kings+0.096
 spies+0.092
Neg logit contributions
rated-0.214
aned-0.187
ently-0.179
err-0.170
rily-0.167
Token followed
Token ID3940
Feature activation+1.43e+00

Pos logit contributions
rated+0.208
aned+0.180
ently+0.172
err+0.163
rily+0.160
Neg logit contributions
 who-0.115
 detectives-0.114
 builders-0.111
 kings-0.111
 spies-0.106
Token him
Token ID683
Feature activation+1.33e+00

Pos logit contributions
rated+0.090
aned+0.079
ently+0.074
err+0.071
 sensation+0.068
Neg logit contributions
 who-0.039
 detectives-0.034
 kings-0.034
 builders-0.033
 spies-0.033
Token buy
Token ID2822
Feature activation-3.89e+00

Pos logit contributions
 detectives+0.006
 builders+0.006
 who+0.006
 kings+0.006
 superheroes+0.006
Neg logit contributions
rated-0.015
aned-0.013
ently-0.013
rily-0.012
err-0.012
Token his
Token ID465
Feature activation+2.63e+00

Pos logit contributions
 detectives+0.117
 who+0.116
 builders+0.113
 kings+0.112
 spies+0.108
Neg logit contributions
rated-0.215
aned-0.186
ently-0.180
err-0.169
rily-0.168
Token own
Token ID898
Feature activation-3.74e-01

Pos logit contributions
rated+0.138
aned+0.119
ently+0.112
err+0.105
rily+0.103
Neg logit contributions
 who-0.093
 detectives-0.089
 kings-0.088
 builders-0.088
 farmers-0.085
Token.
Token ID13
Feature activation+1.64e+00

Pos logit contributions
 who+0.013
 kings+0.012
 store+0.011
 builders+0.011
 farmers+0.011
Neg logit contributions
rated-0.020
aned-0.018
ently-0.017
err-0.016
rily-0.015
Token Rel
Token ID4718
Feature activation+4.43e+00

Pos logit contributions
rated+0.087
aned+0.074
ently+0.071
err+0.067
rily+0.066
Neg logit contributions
 who-0.055
 detectives-0.054
 builders-0.053
 kings-0.053
 spies-0.051
Tokenuct
Token ID4782
Feature activation+1.77e+01

Pos logit contributions
rated+0.177
aned+0.143
ently+0.134
err+0.125
rily+0.121
Neg logit contributions
 who-0.205
 detectives-0.203
 builders-0.199
 kings-0.199
 spies-0.193
Tokenantly
Token ID3875
Feature activation+4.87e+00

Pos logit contributions
rated+0.623
aned+0.457
ently+0.419
 sensation+0.404
ulp+0.387
Neg logit contributions
 who-0.922
 builders-0.888
 police-0.874
 detectives-0.872
 kings-0.862
Token,
Token ID11
Feature activation-3.77e+00

Pos logit contributions
rated+0.256
aned+0.225
ently+0.213
err+0.199
rily+0.199
Neg logit contributions
 detectives-0.157
 who-0.155
 kings-0.153
 builders-0.152
 spies-0.146
Token his
Token ID465
Feature activation+2.63e+00

Pos logit contributions
 detectives+0.124
 kings+0.122
 builders+0.119
 who+0.118
 spies+0.114
Neg logit contributions
rated-0.196
aned-0.168
ently-0.162
err-0.152
rily-0.150
Token friend
Token ID1545
Feature activation-4.79e+00

Pos logit contributions
rated+0.143
aned+0.124
ently+0.118
err+0.115
rily+0.109
Neg logit contributions
 who-0.080
 detectives-0.080
 builders-0.077
 kings-0.077
 spies-0.073
Token gave
Token ID2921
Feature activation-2.48e+00

Pos logit contributions
 detectives+0.159
 builders+0.155
 kings+0.150
 spies+0.149
 farmers+0.146
Neg logit contributions
rated-0.251
aned-0.216
ently-0.203
err-0.190
rily-0.188
Token.
Token ID13
Feature activation+1.76e+00

Pos logit contributions
 detectives+0.100
 who+0.098
 builders+0.098
 kings+0.097
 spies+0.094
Neg logit contributions
rated-0.120
aned-0.101
ently-0.096
err-0.089
rily-0.088
Token (single space)
Token ID220
Feature activation+6.66e+00

Pos logit contributions
rated+0.103
aned+0.087
ently+0.084
err+0.078
rily+0.076
Neg logit contributions
 who-0.054
 detectives-0.050
 builders-0.048
 kings-0.046
 spies-0.046
Token (newline)
Token ID198
Feature activation-1.93e+00

Pos logit contributions
rated+0.428
aned+0.370
ently+0.352
err+0.339
rily+0.336
Neg logit contributions
 who-0.157
 detectives-0.147
 builders-0.141
 kings-0.140
 spies-0.134
Token (newline)
Token ID198
Feature activation-1.96e+00

Pos logit contributions
 detectives+0.053
 who+0.052
 builders+0.051
 kings+0.051
 spies+0.048
Neg logit contributions
rated-0.114
aned-0.100
ently-0.095
err-0.091
rily-0.089
TokenF
Token ID37
Feature activation+3.51e+00

Pos logit contributions
 detectives+0.074
 kings+0.073
 builders+0.072
 who+0.070
 spies+0.070
Neg logit contributions
rated-0.095
aned-0.081
ently-0.076
err-0.073
rily-0.071
Tokenright
Token ID3506
Feature activation+1.41e+01

Pos logit contributions
rated+0.123
aned+0.106
 hurts+0.090
 AW+0.090
ently+0.087
Neg logit contributions
 who-0.172
 named-0.150
 store-0.143
 kings-0.143
 detectives-0.142
Tokenened
Token ID2945
Feature activation+4.53e+00

Pos logit contributions
rated+0.445
err+0.300
aned+0.291
ently+0.280
 sensation+0.232
Neg logit contributions
 who-0.793
 kings-0.779
 builders-0.778
 detectives-0.766
 pirates-0.757
Token,
Token ID11
Feature activation-4.01e+00

Pos logit contributions
rated+0.260
aned+0.221
ently+0.212
err+0.203
ingle+0.196
Neg logit contributions
 who-0.145
 detectives-0.136
 builders-0.130
 kings-0.128
 spies-0.128
Token Lucy
Token ID22162
Feature activation+4.54e+00

Pos logit contributions
 detectives+0.129
 builders+0.127
 who+0.126
 kings+0.126
 spies+0.120
Neg logit contributions
rated-0.213
aned-0.182
ently-0.175
err-0.166
rily-0.164
Token screamed
Token ID25421
Feature activation+5.37e+00

Pos logit contributions
rated+0.233
aned+0.205
ently+0.191
rily+0.177
err+0.175
Neg logit contributions
 detectives-0.158
 builders-0.155
 kings-0.150
 spies-0.149
 who-0.145
Token and
Token ID290
Feature activation+1.37e+00

Pos logit contributions
rated+0.300
aned+0.259
ently+0.247
err+0.232
raw+0.222
Neg logit contributions
 who-0.179
 detectives-0.165
 kings-0.158
 spies-0.158
 builders-0.156
Token in
Token ID287
Feature activation+5.49e+00

Pos logit contributions
rated+0.104
aned+0.091
ently+0.086
err+0.083
rily+0.078
Neg logit contributions
 who-0.063
 detectives-0.062
 builders-0.060
 kings-0.060
 spies-0.058
Token one
Token ID530
Feature activation-3.06e+00

Pos logit contributions
rated+0.288
aned+0.242
ently+0.237
err+0.225
rily+0.224
Neg logit contributions
 detectives-0.177
 who-0.171
 builders-0.169
 kings-0.168
 farmers-0.162
Token piece
Token ID3704
Feature activation-2.55e+00

Pos logit contributions
 who+0.104
 detectives+0.098
 builders+0.097
 kings+0.097
 farmers+0.093
Neg logit contributions
rated-0.162
aned-0.146
ently-0.135
err-0.128
rily-0.124
Token.
Token ID13
Feature activation+1.86e+00

Pos logit contributions
 who+0.096
 builders+0.091
 detectives+0.091
 kings+0.090
 farmers+0.087
Neg logit contributions
rated-0.127
aned-0.110
ently-0.101
err-0.098
rily-0.093
Token All
Token ID1439
Feature activation-1.76e+00

Pos logit contributions
rated+0.085
aned+0.071
ently+0.067
err+0.064
rily+0.062
Neg logit contributions
 kings-0.072
 detectives-0.072
 builders-0.071
 who-0.070
 farmers-0.068
Tokenright
Token ID3506
Feature activation+1.39e+01

Pos logit contributions
 who+0.051
 detectives+0.049
 builders+0.048
 kings+0.048
 farmers+0.045
Neg logit contributions
rated-0.102
aned-0.090
ently-0.085
err-0.082
rily-0.080
Token?"
Token ID1701
Feature activation+9.48e-01

Pos logit contributions
rated+0.637
aned+0.530
err+0.507
ently+0.485
 sensation+0.450
Neg logit contributions
 builders-0.571
 who-0.561
 kings-0.555
 detectives-0.547
 farmers-0.534
Token (newline)
Token ID198
Feature activation-2.07e+00

Pos logit contributions
rated+0.064
aned+0.056
ently+0.052
rily+0.051
err+0.051
Neg logit contributions
 who-0.022
 builders-0.018
 detectives-0.018
 farmers-0.016
 see-0.016
Token (newline)
Token ID198
Feature activation-2.07e+00

Pos logit contributions
 who+0.051
 detectives+0.048
 builders+0.047
 kings+0.046
 spies+0.044
Neg logit contributions
rated-0.130
aned-0.114
ently-0.108
err-0.104
rily-0.103
TokenJohn
Token ID7554
Feature activation+3.41e+00

Pos logit contributions
 detectives+0.069
 kings+0.067
 builders+0.065
 who+0.065
 spies+0.064
Neg logit contributions
rated-0.109
aned-0.094
ently-0.090
err-0.086
rily-0.082
Token nodded
Token ID14464
Feature activation+1.82e+00

Pos logit contributions
rated+0.193
aned+0.162
ently+0.159
err+0.153
rily+0.150
Neg logit contributions
 who-0.109
 builders-0.092
 kings-0.092
 detectives-0.092
 farmers-0.089
Token dragon
Token ID10441
Feature activation-2.40e+00

Pos logit contributions
 builders+0.126
 detectives+0.125
 kings+0.121
 who+0.121
 farmers+0.117
Neg logit contributions
rated-0.201
aned-0.162
err-0.159
ently-0.158
night-0.153
Token!
Token ID0
Feature activation+8.43e-01

Pos logit contributions
 builders+0.089
 detectives+0.089
 kings+0.085
 farmers+0.083
 mechanics+0.083
Neg logit contributions
rated-0.111
aned-0.090
ently-0.087
night-0.082
err-0.082
Token (newline)
Token ID198
Feature activation-2.41e+00

Pos logit contributions
rated+0.043
aned+0.037
ently+0.035
err+0.033
rily+0.033
Neg logit contributions
 detectives-0.029
 builders-0.028
 who-0.028
 kings-0.028
 spies-0.027
Token (newline)
Token ID198
Feature activation-2.53e+00

Pos logit contributions
 detectives+0.073
 who+0.072
 builders+0.072
 kings+0.071
 spies+0.068
Neg logit contributions
rated-0.133
aned-0.116
ently-0.111
err-0.106
rily-0.103
TokenF
Token ID37
Feature activation+3.06e+00

Pos logit contributions
 detectives+0.076
 builders+0.074
 kings+0.074
 superheroes+0.070
 spies+0.070
Neg logit contributions
rated-0.138
aned-0.121
ently-0.116
err-0.113
rily-0.107
Tokenright
Token ID3506
Feature activation+1.37e+01

Pos logit contributions
rated+0.125
aned+0.114
 hurts+0.101
ently+0.096
rily+0.096
Neg logit contributions
 who-0.130
 named-0.121
 kings-0.109
 store-0.108
 judge-0.106
Tokenened
Token ID2945
Feature activation+4.10e+00

Pos logit contributions
rated+0.678
err+0.553
aned+0.552
ently+0.516
rum+0.495
Neg logit contributions
 who-0.502
 kings-0.499
 builders-0.474
 pirates-0.461
 detectives-0.455
Token,
Token ID11
Feature activation-4.84e+00

Pos logit contributions
rated+0.240
aned+0.204
ently+0.192
err+0.188
ingle+0.185
Neg logit contributions
 who-0.129
 detectives-0.120
 builders-0.115
 kings-0.113
 spies-0.112
Token Mary
Token ID5335
Feature activation+4.93e+00

Pos logit contributions
 detectives+0.150
 who+0.150
 builders+0.146
 kings+0.145
 spies+0.139
Neg logit contributions
rated-0.265
aned-0.228
ently-0.218
err-0.207
rily-0.204
Token started
Token ID2067
Feature activation-1.68e+00

Pos logit contributions
rated+0.273
aned+0.235
ently+0.225
err+0.214
rily+0.210
Neg logit contributions
 who-0.152
 detectives-0.149
 kings-0.145
 builders-0.145
 spies-0.138
Token running
Token ID2491
Feature activation+1.90e+00

Pos logit contributions
 who+0.051
 detectives+0.050
 kings+0.049
 builders+0.049
 spies+0.046
Neg logit contributions
rated-0.094
aned-0.081
ently-0.078
err-0.074
rily-0.073
Token a
Token ID257
Feature activation-3.55e+00

Pos logit contributions
 builders+0.174
 detectives+0.172
 captains+0.161
 mechanics+0.159
 kings+0.157
Neg logit contributions
rated-0.238
ently-0.211
aned-0.205
 sensation-0.201
err-0.200
Token tall
Token ID7331
Feature activation-2.06e+00

Pos logit contributions
 builders+0.120
 detectives+0.118
 kings+0.116
 farmers+0.111
 who+0.109
Neg logit contributions
rated-0.170
err-0.136
 sensation-0.133
 cheek-0.132
aned-0.131
Token tree
Token ID5509
Feature activation-1.09e+00

Pos logit contributions
 detectives+0.064
 who+0.061
 builders+0.061
 kings+0.060
 spies+0.059
Neg logit contributions
rated-0.112
aned-0.094
ently-0.090
err-0.087
 sensation-0.085
Token.
Token ID13
Feature activation+1.69e+00

Pos logit contributions
 detectives+0.041
 who+0.040
 builders+0.040
 kings+0.040
 spies+0.039
Neg logit contributions
rated-0.052
aned-0.044
ently-0.042
err-0.039
rily-0.038
Token Un
Token ID791
Feature activation+7.52e-01

Pos logit contributions
rated+0.096
aned+0.082
ently+0.079
err+0.074
rily+0.071
Neg logit contributions
 who-0.053
 detectives-0.051
 kings-0.049
 builders-0.048
 spies-0.047
Tokenbek
Token ID47083
Feature activation+1.37e+01

Pos logit contributions
rated+0.033
aned+0.027
ently+0.025
err+0.023
ingle+0.023
Neg logit contributions
 detectives-0.032
 builders-0.031
 spies-0.031
 kings-0.031
 farmers-0.031
Tokennow
Token ID2197
Feature activation+6.97e+00

Pos logit contributions
rated+0.442
aned+0.329
ently+0.304
err+0.277
rily+0.266
Neg logit contributions
 who-0.733
 detectives-0.733
 builders-0.721
 kings-0.721
 spies-0.699
Tokenst
Token ID301
Feature activation+6.39e+00

Pos logit contributions
rated+0.381
aned+0.330
ently+0.315
err+0.300
rily+0.292
Neg logit contributions
 who-0.222
 detectives-0.217
 kings-0.211
 builders-0.210
 spies-0.202
Token to
Token ID284
Feature activation-1.37e-01

Pos logit contributions
rated+0.245
aned+0.180
 sensation+0.166
err+0.164
ently+0.158
Neg logit contributions
 who-0.315
 kings-0.305
 detectives-0.292
 builders-0.288
 see-0.283
Token him
Token ID683
Feature activation+1.37e+00

Pos logit contributions
 detectives+0.005
 builders+0.005
 who+0.005
 spies+0.005
 kings+0.005
Neg logit contributions
rated-0.006
aned-0.005
ently-0.005
err-0.005
 sensation-0.005
Token,
Token ID11
Feature activation-3.70e+00

Pos logit contributions
rated+0.077
aned+0.066
ently+0.063
err+0.060
rily+0.059
Neg logit contributions
 who-0.041
 detectives-0.041
 builders-0.040
 kings-0.039
 spies-0.038
Token and
Token ID290
Feature activation+1.20e+00

Pos logit contributions
rated+0.036
aned+0.031
ently+0.029
err+0.027
rily+0.027
Neg logit contributions
 detectives-0.024
 builders-0.023
 who-0.023
 kings-0.023
 spies-0.022
Token Tom
Token ID4186
Feature activation+1.34e+00

Pos logit contributions
rated+0.071
aned+0.062
ently+0.059
err+0.056
rily+0.055
Neg logit contributions
 who-0.034
 detectives-0.033
 kings-0.033
 builders-0.032
 spies-0.031
Token felt
Token ID2936
Feature activation-1.10e+00

Pos logit contributions
rated+0.071
aned+0.063
ently+0.057
err+0.053
rily+0.053
Neg logit contributions
 detectives-0.045
 builders-0.043
 spies-0.042
 farmers-0.041
 kings-0.040
Token a
Token ID257
Feature activation-3.49e+00

Pos logit contributions
 detectives+0.033
 builders+0.033
 kings+0.032
 who+0.032
 spies+0.031
Neg logit contributions
rated-0.061
aned-0.052
ently-0.049
rily-0.047
 sensation-0.047
Token strange
Token ID6283
Feature activation+5.19e-01

Pos logit contributions
 who+0.119
 detectives+0.114
 builders+0.113
 kings+0.112
 spies+0.107
Neg logit contributions
rated-0.182
aned-0.158
ently-0.151
err-0.141
rily-0.139
Token sensation
Token ID18098
Feature activation+1.35e+01

Pos logit contributions
rated+0.024
aned+0.022
ently+0.021
rily+0.019
err+0.019
Neg logit contributions
 who-0.021
 detectives-0.018
 kings-0.018
 see-0.018
 police-0.018
Token in
Token ID287
Feature activation+5.67e+00

Pos logit contributions
rated+0.724
aned+0.624
ently+0.592
err+0.583
rily+0.548
Neg logit contributions
 who-0.445
 kings-0.409
 builders-0.409
 detectives-0.405
 farmers-0.392
Token their
Token ID511
Feature activation+3.88e-01

Pos logit contributions
rated+0.324
aned+0.282
ently+0.270
err+0.254
rily+0.248
Neg logit contributions
 who-0.171
 detectives-0.167
 builders-0.163
 kings-0.163
 spies-0.156
Token stomach
Token ID11384
Feature activation+3.31e+00

Pos logit contributions
rated+0.018
aned+0.015
ently+0.014
err+0.013
rily+0.013
Neg logit contributions
 who-0.016
 kings-0.015
 detectives-0.015
 builders-0.015
 superheroes-0.015
Tokens
Token ID82
Feature activation-9.26e-01

Pos logit contributions
rated+0.157
aned+0.132
ently+0.125
err+0.117
rily+0.115
Neg logit contributions
 who-0.127
 detectives-0.127
 builders-0.124
 kings-0.123
 spies-0.119
Token.
Token ID13
Feature activation+1.73e+00

Pos logit contributions
 who+0.030
 kings+0.026
 builders+0.026
 detectives+0.025
 farmers+0.025
Neg logit contributions
rated-0.053
aned-0.045
err-0.043
ently-0.042
rily-0.040
Token and
Token ID290
Feature activation+1.01e+00

Pos logit contributions
rated+0.214
aned+0.187
ently+0.174
err+0.163
rily+0.161
Neg logit contributions
 detectives-0.146
 builders-0.144
 kings-0.139
 spies-0.138
 farmers-0.135
Token Ben
Token ID3932
Feature activation+8.36e-01

Pos logit contributions
rated+0.056
aned+0.048
ently+0.046
err+0.044
rily+0.043
Neg logit contributions
 detectives-0.031
 who-0.031
 builders-0.030
 kings-0.030
 farmers-0.028
Token felt
Token ID2936
Feature activation-1.15e+00

Pos logit contributions
rated+0.045
aned+0.039
ently+0.036
rily+0.033
err+0.033
Neg logit contributions
 detectives-0.028
 builders-0.027
 spies-0.026
 farmers-0.025
 kings-0.025
Token a
Token ID257
Feature activation-3.60e+00

Pos logit contributions
 detectives+0.034
 builders+0.034
 who+0.033
 kings+0.033
 spies+0.031
Neg logit contributions
rated-0.064
aned-0.055
ently-0.052
rily-0.050
err-0.050
Token funny
Token ID8258
Feature activation+1.59e+00

Pos logit contributions
 who+0.123
 detectives+0.119
 builders+0.117
 kings+0.116
 spies+0.111
Neg logit contributions
rated-0.187
aned-0.162
ently-0.155
err-0.145
rily-0.142
Token sensation
Token ID18098
Feature activation+1.35e+01

Pos logit contributions
rated+0.083
aned+0.072
ently+0.069
err+0.063
rily+0.063
Neg logit contributions
 who-0.056
 detectives-0.052
 builders-0.051
 kings-0.051
 spies-0.050
Token in
Token ID287
Feature activation+5.58e+00

Pos logit contributions
rated+0.723
aned+0.623
ently+0.591
err+0.582
rily+0.545
Neg logit contributions
 who-0.446
 kings-0.410
 builders-0.409
 detectives-0.405
 farmers-0.393
Token their
Token ID511
Feature activation+4.09e-01

Pos logit contributions
rated+0.326
aned+0.287
ently+0.275
err+0.257
rily+0.250
Neg logit contributions
 who-0.162
 detectives-0.156
 kings-0.153
 builders-0.152
 spies-0.147
Token stomach
Token ID11384
Feature activation+3.28e+00

Pos logit contributions
rated+0.019
aned+0.016
ently+0.015
err+0.014
rily+0.014
Neg logit contributions
 who-0.017
 detectives-0.016
 kings-0.016
 builders-0.016
 spies-0.016
Tokens
Token ID82
Feature activation-9.38e-01

Pos logit contributions
rated+0.155
aned+0.131
ently+0.124
err+0.116
rily+0.114
Neg logit contributions
 who-0.127
 detectives-0.126
 builders-0.123
 kings-0.123
 spies-0.118
Token.
Token ID13
Feature activation+1.60e+00

Pos logit contributions
 who+0.031
 kings+0.026
 builders+0.026
 detectives+0.025
 farmers+0.025
Neg logit contributions
rated-0.053
aned-0.046
err-0.043
ently-0.042
raw-0.040
Token felt
Token ID2936
Feature activation-1.34e+00

Pos logit contributions
 detectives+0.094
 spies+0.089
 builders+0.085
 farmers+0.081
 charity+0.080
Neg logit contributions
rated-0.127
aned-0.111
ently-0.099
rily-0.098
led-0.096
Token a
Token ID257
Feature activation-3.84e+00

Pos logit contributions
 detectives+0.044
 builders+0.043
 kings+0.042
 spies+0.042
 who+0.040
Neg logit contributions
rated-0.070
aned-0.059
rily-0.056
ently-0.055
 sensation-0.055
Token strange
Token ID6283
Feature activation+3.37e-01

Pos logit contributions
 detectives+0.131
 who+0.128
 builders+0.127
 kings+0.126
 spies+0.122
Neg logit contributions
rated-0.201
aned-0.171
ently-0.162
err-0.154
rily-0.152
Token t
Token ID256
Feature activation-8.29e-01

Pos logit contributions
rated+0.016
aned+0.014
ently+0.013
err+0.012
rily+0.012
Neg logit contributions
 who-0.013
 detectives-0.012
 builders-0.012
 kings-0.012
 farmers-0.011
Tokeningly
Token ID4420
Feature activation+3.59e+00

Pos logit contributions
 detectives+0.041
 who+0.041
 kings+0.040
 builders+0.040
 spies+0.039
Neg logit contributions
rated-0.030
aned-0.024
ently-0.023
err-0.020
rily-0.020
Token sensation
Token ID18098
Feature activation+1.34e+01

Pos logit contributions
rated+0.189
aned+0.166
ently+0.155
err+0.151
rily+0.145
Neg logit contributions
 who-0.126
 kings-0.114
 detectives-0.113
 builders-0.113
 farmers-0.109
Token and
Token ID290
Feature activation+9.61e-01

Pos logit contributions
rated+0.708
aned+0.607
ently+0.576
err+0.554
rily+0.534
Neg logit contributions
 who-0.452
 detectives-0.438
 builders-0.432
 kings-0.430
 farmers-0.413
Token when
Token ID618
Feature activation+3.05e+00

Pos logit contributions
rated+0.049
aned+0.042
ently+0.040
rily+0.037
 sensation+0.037
Neg logit contributions
 detectives-0.033
 builders-0.032
 kings-0.031
 spies-0.030
 who-0.030
Token it
Token ID340
Feature activation+3.78e+00

Pos logit contributions
rated+0.172
aned+0.148
ently+0.143
err+0.135
rily+0.134
Neg logit contributions
 who-0.087
 detectives-0.086
 builders-0.083
 kings-0.083
 spies-0.079
Token stopped
Token ID5025
Feature activation+2.19e+00

Pos logit contributions
rated+0.220
aned+0.191
ently+0.191
err+0.180
rily+0.180
Neg logit contributions
 who-0.104
 kings-0.094
 detectives-0.093
 builders-0.093
 superheroes-0.090
Token they
Token ID484
Feature activation-2.95e+00

Pos logit contributions
rated+0.110
aned+0.094
ently+0.090
err+0.084
rily+0.084
Neg logit contributions
 detectives-0.077
 who-0.076
 builders-0.074
 kings-0.074
 spies-0.071
Token out
Token ID503
Feature activation+2.02e+00

Pos logit contributions
rated+0.173
aned+0.148
ently+0.139
err+0.132
ingle+0.126
Neg logit contributions
 who-0.107
 detectives-0.104
 kings-0.100
 builders-0.100
 spies-0.097
Token loud
Token ID7812
Feature activation+3.80e+00

Pos logit contributions
rated+0.105
aned+0.090
ently+0.086
err+0.081
rily+0.080
Neg logit contributions
 who-0.068
 detectives-0.068
 builders-0.067
 kings-0.066
 spies-0.064
Token and
Token ID290
Feature activation+1.52e+00

Pos logit contributions
rated+0.220
aned+0.190
ently+0.181
err+0.175
ingle+0.170
Neg logit contributions
 who-0.115
 detectives-0.106
 kings-0.105
 builders-0.104
 farmers-0.101
Token enjoyed
Token ID8359
Feature activation+1.24e-01

Pos logit contributions
rated+0.084
aned+0.072
ently+0.069
err+0.065
rily+0.064
Neg logit contributions
 who-0.047
 detectives-0.047
 builders-0.046
 kings-0.045
 spies-0.043
Token the
Token ID262
Feature activation+1.56e+00

Pos logit contributions
rated+0.006
aned+0.005
ently+0.005
rily+0.005
err+0.005
Neg logit contributions
 detectives-0.004
 builders-0.004
 who-0.004
 kings-0.004
 spies-0.004
Token sensation
Token ID18098
Feature activation+1.34e+01

Pos logit contributions
rated+0.090
aned+0.081
ently+0.076
rily+0.075
err+0.072
Neg logit contributions
 who-0.045
 kings-0.043
 builders-0.043
 detectives-0.042
 farmers-0.041
Token of
Token ID286
Feature activation-3.70e+00

Pos logit contributions
rated+0.786
aned+0.687
err+0.647
ently+0.639
rily+0.605
Neg logit contributions
 who-0.388
 builders-0.347
 kings-0.344
 detectives-0.334
 farmers-0.334
Token being
Token ID852
Feature activation-1.50e+00

Pos logit contributions
 who+0.105
 detectives+0.104
 builders+0.101
 kings+0.101
 spies+0.096
Neg logit contributions
rated-0.214
aned-0.186
ently-0.178
err-0.170
rily-0.167
Token so
Token ID523
Feature activation+2.81e+00

Pos logit contributions
 detectives+0.042
 who+0.042
 builders+0.041
 kings+0.040
 spies+0.039
Neg logit contributions
rated-0.087
aned-0.075
ently-0.072
err-0.068
rily-0.067
Token high
Token ID1029
Feature activation+7.20e+00

Pos logit contributions
rated+0.156
aned+0.134
ently+0.126
err+0.122
rily+0.118
Neg logit contributions
 who-0.085
 detectives-0.085
 spies-0.080
 kings-0.079
 builders-0.079
Token up
Token ID510
Feature activation+4.41e+00

Pos logit contributions
rated+0.359
aned+0.328
err+0.297
ently+0.294
ulp+0.290
Neg logit contributions
 who-0.257
 builders-0.233
 detectives-0.233
 kings-0.231
 farmers-0.227
Token a
Token ID257
Feature activation-3.73e+00

Pos logit contributions
 detectives+0.045
 builders+0.045
 who+0.044
 kings+0.044
 spies+0.042
Neg logit contributions
rated-0.068
aned-0.058
ently-0.054
rily-0.053
err-0.052
Token hot
Token ID3024
Feature activation+2.49e+00

Pos logit contributions
 who+0.129
 detectives+0.124
 builders+0.121
 kings+0.120
 spies+0.117
Neg logit contributions
rated-0.194
aned-0.167
ently-0.160
err-0.150
rily-0.144
Token and
Token ID290
Feature activation+1.07e+00

Pos logit contributions
rated+0.131
aned+0.115
ently+0.111
err+0.103
rily+0.099
Neg logit contributions
 who-0.088
 builders-0.082
 detectives-0.080
 kings-0.079
 farmers-0.079
Token tang
Token ID13875
Feature activation+2.78e+00

Pos logit contributions
rated+0.058
aned+0.050
ently+0.048
err+0.045
rily+0.044
Neg logit contributions
 who-0.034
 detectives-0.034
 builders-0.033
 kings-0.033
 spies-0.032
Tokeny
Token ID88
Feature activation+2.03e+00

Pos logit contributions
rated+0.133
aned+0.114
ently+0.113
err+0.108
appointed+0.102
Neg logit contributions
 who-0.108
 kings-0.093
 farmers-0.092
 builders-0.091
 chickens-0.088
Token sensation
Token ID18098
Feature activation+1.34e+01

Pos logit contributions
rated+0.116
aned+0.103
ently+0.099
err+0.094
ingle+0.088
Neg logit contributions
 who-0.068
 kings-0.055
 pirates-0.054
 farmers-0.054
 spies-0.053
Token in
Token ID287
Feature activation+5.51e+00

Pos logit contributions
rated+0.739
aned+0.632
err+0.606
ently+0.598
raw+0.556
Neg logit contributions
 who-0.442
 kings-0.384
 builders-0.374
 farmers-0.365
 detectives-0.363
Token their
Token ID511
Feature activation+1.80e-01

Pos logit contributions
rated+0.329
aned+0.290
ently+0.280
err+0.260
night+0.252
Neg logit contributions
 who-0.157
 detectives-0.146
 kings-0.145
 builders-0.143
 spies-0.139
Token mouths
Token ID28552
Feature activation+2.42e+00

Pos logit contributions
rated+0.008
aned+0.007
ently+0.006
err+0.006
rily+0.005
Neg logit contributions
 who-0.008
 detectives-0.008
 kings-0.008
 builders-0.008
 farmers-0.008
Token.
Token ID13
Feature activation+1.80e+00

Pos logit contributions
rated+0.135
aned+0.115
ently+0.109
err+0.108
night+0.099
Neg logit contributions
 who-0.081
 kings-0.072
 detectives-0.070
 builders-0.069
 spies-0.068
Token They
Token ID1119
Feature activation-2.40e+00

Pos logit contributions
rated+0.102
aned+0.089
ently+0.085
err+0.081
rily+0.080
Neg logit contributions
 detectives-0.052
 who-0.052
 builders-0.051
 kings-0.051
 spies-0.048
Token and
Token ID290
Feature activation+1.40e+00

Pos logit contributions
rated+0.074
aned+0.062
ently+0.061
err+0.058
rily+0.058
Neg logit contributions
 who-0.042
 detectives-0.038
 kings-0.038
 builders-0.038
 farmers-0.036
Token Tom
Token ID4186
Feature activation+1.23e+00

Pos logit contributions
rated+0.079
aned+0.068
ently+0.065
rily+0.061
err+0.061
Neg logit contributions
 who-0.045
 detectives-0.044
 builders-0.043
 kings-0.043
 spies-0.041
Token experienced
Token ID5924
Feature activation-4.53e-01

Pos logit contributions
rated+0.063
aned+0.055
ently+0.051
err+0.048
rily+0.046
Neg logit contributions
 detectives-0.043
 builders-0.042
 kings-0.041
 spies-0.041
 who-0.040
Token a
Token ID257
Feature activation-3.49e+00

Pos logit contributions
 detectives+0.014
 who+0.013
 builders+0.013
 kings+0.013
 spies+0.012
Neg logit contributions
rated-0.025
aned-0.022
ently-0.021
err-0.020
rily-0.020
Token strange
Token ID6283
Feature activation+3.89e-01

Pos logit contributions
 detectives+0.124
 builders+0.121
 kings+0.120
 who+0.120
 spies+0.116
Neg logit contributions
rated-0.177
aned-0.148
ently-0.140
err-0.135
rily-0.130
Token sensation
Token ID18098
Feature activation+1.34e+01

Pos logit contributions
rated+0.019
aned+0.017
ently+0.016
rily+0.015
err+0.014
Neg logit contributions
 who-0.015
 detectives-0.014
 kings-0.014
 builders-0.014
 farmers-0.013
Token they
Token ID484
Feature activation-2.59e+00

Pos logit contributions
rated+0.745
aned+0.642
ently+0.609
err+0.603
rily+0.574
Neg logit contributions
 who-0.422
 builders-0.387
 kings-0.384
 detectives-0.381
 farmers-0.367
Token had
Token ID550
Feature activation-7.85e-01

Pos logit contributions
 who+0.080
 detectives+0.080
 builders+0.078
 kings+0.077
 spies+0.074
Neg logit contributions
rated-0.143
aned-0.123
ently-0.118
err-0.112
rily-0.110
Token never
Token ID1239
Feature activation+2.74e+00

Pos logit contributions
 who+0.024
 detectives+0.024
 builders+0.023
 kings+0.023
 spies+0.022
Neg logit contributions
rated-0.044
aned-0.038
ently-0.036
err-0.034
rily-0.034
Token felt
Token ID2936
Feature activation-1.18e+00

Pos logit contributions
rated+0.159
aned+0.139
ently+0.134
err+0.127
rily+0.127
Neg logit contributions
 who-0.077
 detectives-0.075
 kings-0.074
 builders-0.073
 spies-0.069
Token before
Token ID878
Feature activation+6.95e+00

Pos logit contributions
 who+0.036
 detectives+0.034
 kings+0.033
 builders+0.032
 spies+0.031
Neg logit contributions
rated-0.068
aned-0.059
ently-0.057
err-0.054
rily-0.053
Token,
Token ID11
Feature activation-3.82e+00

Pos logit contributions
rated+0.047
aned+0.040
ently+0.039
err+0.036
rily+0.036
Neg logit contributions
 who-0.028
 detectives-0.028
 builders-0.027
 kings-0.027
 farmers-0.026
Token they
Token ID484
Feature activation-2.53e+00

Pos logit contributions
 who+0.092
 detectives+0.092
 builders+0.089
 kings+0.088
 farmers+0.083
Neg logit contributions
rated-0.232
aned-0.202
ently-0.195
err-0.187
rily-0.184
Token felt
Token ID2936
Feature activation-1.14e+00

Pos logit contributions
 detectives+0.078
 builders+0.076
 farmers+0.074
 kings+0.071
 charity+0.071
Neg logit contributions
rated-0.135
aned-0.119
ently-0.107
rily-0.102
led-0.101
Token a
Token ID257
Feature activation-3.58e+00

Pos logit contributions
 builders+0.038
 detectives+0.036
 kings+0.036
 queens+0.035
 who+0.034
Neg logit contributions
rated-0.056
aned-0.049
 sensation-0.048
ently-0.045
rily-0.045
Token strange
Token ID6283
Feature activation+4.16e-01

Pos logit contributions
 who+0.131
 detectives+0.129
 builders+0.126
 kings+0.125
 spies+0.121
Neg logit contributions
rated-0.177
aned-0.151
ently-0.144
err-0.135
rily-0.132
Token sensation
Token ID18098
Feature activation+1.34e+01

Pos logit contributions
rated+0.020
aned+0.017
ently+0.017
raw+0.015
err+0.015
Neg logit contributions
 who-0.017
 see-0.014
 detectives-0.014
 spies-0.014
 pirates-0.014
Token.
Token ID13
Feature activation+1.98e+00

Pos logit contributions
rated+0.730
aned+0.621
ently+0.590
err+0.589
rily+0.549
Neg logit contributions
 who-0.443
 builders-0.398
 kings-0.395
 detectives-0.392
 farmers-0.384
Token They
Token ID1119
Feature activation-2.17e+00

Pos logit contributions
rated+0.118
aned+0.100
ently+0.095
err+0.088
rily+0.088
Neg logit contributions
 who-0.061
 detectives-0.058
 builders-0.056
 spies-0.055
 kings-0.054
Token felt
Token ID2936
Feature activation-1.19e+00

Pos logit contributions
 detectives+0.075
 builders+0.074
 kings+0.071
 farmers+0.071
 spies+0.069
Neg logit contributions
rated-0.112
aned-0.097
ently-0.088
rily-0.084
err-0.083
Token light
Token ID1657
Feature activation+9.44e-01

Pos logit contributions
 builders+0.037
 detectives+0.037
 kings+0.036
 who+0.034
 queens+0.034
Neg logit contributions
rated-0.062
aned-0.054
ently-0.051
 sensation-0.051
rily-0.050
Token and
Token ID290
Feature activation+1.34e+00

Pos logit contributions
rated+0.052
aned+0.045
ently+0.042
err+0.042
raw+0.041
Neg logit contributions
 who-0.031
 builders-0.028
 kings-0.028
 detectives-0.027
 farmers-0.027
TokenSuddenly
Token ID38582
Feature activation+7.33e-01

Pos logit contributions
 detectives+0.062
 who+0.061
 kings+0.060
 builders+0.060
 spies+0.057
Neg logit contributions
rated-0.118
aned-0.102
ently-0.098
err-0.093
rily-0.091
Token,
Token ID11
Feature activation-3.82e+00

Pos logit contributions
rated+0.038
aned+0.033
ently+0.032
err+0.029
rily+0.029
Neg logit contributions
 detectives-0.024
 who-0.024
 kings-0.023
 builders-0.023
 spies-0.022
Token he
Token ID339
Feature activation-1.53e+00

Pos logit contributions
 who+0.104
 detectives+0.101
 builders+0.100
 kings+0.098
 spies+0.093
Neg logit contributions
rated-0.228
aned-0.201
ently-0.191
err-0.183
rily-0.179
Token felt
Token ID2936
Feature activation-1.33e+00

Pos logit contributions
 detectives+0.039
 builders+0.038
 kings+0.037
 who+0.037
 spies+0.036
Neg logit contributions
rated-0.091
aned-0.082
ently-0.076
err-0.073
rily-0.071
Token a
Token ID257
Feature activation-3.62e+00

Pos logit contributions
 builders+0.042
 detectives+0.041
 kings+0.040
 who+0.037
 queens+0.037
Neg logit contributions
rated-0.070
aned-0.061
 sensation-0.058
ently-0.057
rily-0.056
Token sensation
Token ID18098
Feature activation+1.34e+01

Pos logit contributions
 who+0.129
 detectives+0.124
 builders+0.122
 kings+0.121
 farmers+0.117
Neg logit contributions
rated-0.182
aned-0.156
ently-0.151
err-0.140
rily-0.137
Token.
Token ID13
Feature activation+1.82e+00

Pos logit contributions
rated+0.735
aned+0.626
ently+0.603
err+0.594
rily+0.552
Neg logit contributions
 who-0.433
 builders-0.385
 detectives-0.382
 kings-0.381
 farmers-0.374
Token He
Token ID679
Feature activation-2.30e-01

Pos logit contributions
rated+0.096
aned+0.081
ently+0.076
err+0.072
ingle+0.070
Neg logit contributions
 who-0.065
 builders-0.062
 detectives-0.061
 police-0.058
 kings-0.058
Token began
Token ID2540
Feature activation-1.95e+00

Pos logit contributions
 who+0.007
 detectives+0.007
 kings+0.006
 builders+0.006
 farmers+0.006
Neg logit contributions
rated-0.013
aned-0.011
ently-0.011
err-0.010
rily-0.010
Token to
Token ID284
Feature activation-2.18e-01

Pos logit contributions
 who+0.058
 detectives+0.056
 builders+0.055
 kings+0.055
 farmers+0.052
Neg logit contributions
rated-0.112
aned-0.096
ently-0.093
err-0.087
rily-0.087
Token slide
Token ID10649
Feature activation+4.16e-01

Pos logit contributions
 who+0.008
 detectives+0.008
 kings+0.007
 builders+0.007
 farmers+0.007
Neg logit contributions
rated-0.011
aned-0.009
ently-0.009
err-0.008
rily-0.008
Token of
Token ID286
Feature activation-3.73e+00

Pos logit contributions
 detectives+0.069
 builders+0.065
 captains+0.065
 spies+0.064
 farmers+0.064
Neg logit contributions
rated-0.075
ently-0.065
aned-0.064
rily-0.060
ingle-0.059
Token new
Token ID649
Feature activation-2.79e+00

Pos logit contributions
 detectives+0.129
 builders+0.126
 captains+0.125
 who+0.122
 farmers+0.122
Neg logit contributions
rated-0.173
ently-0.144
aned-0.143
err-0.137
 sensation-0.135
Token treasures
Token ID29561
Feature activation-4.05e+00

Pos logit contributions
Playing+0.122
 detectives+0.122
 captains+0.121
 vet+0.120
 veterinarian+0.120
Neg logit contributions
rated-0.094
ently-0.079
aned-0.073
err-0.070
ened-0.066
Token.
Token ID13
Feature activation+1.92e+00

Pos logit contributions
 detectives+0.168
 builders+0.166
 farmers+0.161
 kings+0.161
 superheroes+0.158
Neg logit contributions
rated-0.167
ently-0.138
aned-0.137
rily-0.129
err-0.119
Token Del
Token ID4216
Feature activation+5.51e+00

Pos logit contributions
rated+0.071
aned+0.057
ently+0.056
err+0.054
rily+0.050
Neg logit contributions
 kings-0.089
 builders-0.087
 who-0.087
 detectives-0.086
 superheroes-0.085
Tokenight
Token ID432
Feature activation+1.34e+01

Pos logit contributions
rated+0.263
aned+0.214
err+0.191
 sensation+0.191
rily+0.190
Neg logit contributions
 who-0.221
 kings-0.214
 builders-0.207
 farmers-0.200
 detectives-0.200
Tokened
Token ID276
Feature activation+5.23e-01

Pos logit contributions
rated+0.484
aned+0.380
ently+0.345
err+0.341
rum+0.310
Neg logit contributions
 who-0.671
 kings-0.666
 builders-0.654
 detectives-0.654
 pirates-0.627
Token,
Token ID11
Feature activation-3.73e+00

Pos logit contributions
rated+0.025
aned+0.022
ently+0.021
rily+0.020
night+0.019
Neg logit contributions
 detectives-0.018
 builders-0.018
 who-0.017
 kings-0.017
 superheroes-0.017
Token he
Token ID339
Feature activation-1.48e+00

Pos logit contributions
 detectives+0.128
 builders+0.126
 kings+0.122
 who+0.121
boys+0.120
Neg logit contributions
rated-0.167
ently-0.141
aned-0.138
err-0.135
rily-0.132
Token explored
Token ID18782
Feature activation-1.62e+00

Pos logit contributions
 builders+0.050
 detectives+0.049
 charity+0.049
 spies+0.049
 farmers+0.049
Neg logit contributions
rated-0.072
aned-0.065
ently-0.060
led-0.056
ened-0.056
Token each
Token ID1123
Feature activation-3.02e+00

Pos logit contributions
 detectives+0.056
 builders+0.056
 who+0.053
 farmers+0.053
 spies+0.053
Neg logit contributions
rated-0.079
ently-0.067
aned-0.065
rily-0.063
err-0.061
Token a
Token ID257
Feature activation-3.56e+00

Pos logit contributions
 detectives+0.040
 builders+0.040
 who+0.039
 kings+0.039
 spies+0.037
Neg logit contributions
rated-0.073
aned-0.063
ently-0.060
rily-0.057
err-0.056
Token pleasant
Token ID15497
Feature activation+8.45e-01

Pos logit contributions
 who+0.112
 detectives+0.105
 kings+0.104
 builders+0.103
 superheroes+0.099
Neg logit contributions
rated-0.196
aned-0.172
ently-0.164
err-0.155
rily-0.152
Token t
Token ID256
Feature activation-6.21e-01

Pos logit contributions
rated+0.044
aned+0.039
ently+0.037
err+0.034
rily+0.033
Neg logit contributions
 who-0.030
 detectives-0.028
 kings-0.027
 builders-0.027
 superheroes-0.026
Tokening
Token ID278
Feature activation+4.60e-01

Pos logit contributions
 who+0.035
 builders+0.035
 detectives+0.034
 kings+0.034
 spies+0.033
Neg logit contributions
rated-0.019
aned-0.014
ently-0.012
err-0.011
rily-0.010
Tokenling
Token ID1359
Feature activation+1.89e+00

Pos logit contributions
rated+0.023
aned+0.021
ently+0.020
err+0.019
raw+0.018
Neg logit contributions
 who-0.017
 kings-0.015
 store-0.014
 superheroes-0.014
 pirates-0.014
Token sensation
Token ID18098
Feature activation+1.34e+01

Pos logit contributions
rated+0.098
aned+0.093
ently+0.086
err+0.082
raw+0.082
Neg logit contributions
 who-0.066
 kings-0.056
 living-0.055
 store-0.054
 superheroes-0.054
Token throughout
Token ID3690
Feature activation+1.59e+00

Pos logit contributions
rated+0.731
aned+0.632
ently+0.597
err+0.593
rily+0.554
Neg logit contributions
 who-0.433
 kings-0.395
 builders-0.394
 detectives-0.389
 farmers-0.380
Token his
Token ID465
Feature activation+2.64e+00

Pos logit contributions
rated+0.073
aned+0.060
ently+0.060
 sensation+0.056
rily+0.055
Neg logit contributions
 detectives-0.060
 builders-0.058
 who-0.057
 kings-0.057
 farmers-0.055
Token body
Token ID1767
Feature activation+1.14e+00

Pos logit contributions
rated+0.140
aned+0.125
ently+0.115
rily+0.113
err+0.109
Neg logit contributions
 who-0.095
 detectives-0.086
 superheroes-0.086
 kings-0.085
 builders-0.083
Token.
Token ID13
Feature activation+1.93e+00

Pos logit contributions
rated+0.053
aned+0.044
ently+0.042
rily+0.039
err+0.039
Neg logit contributions
 detectives-0.045
 builders-0.044
 who-0.043
 kings-0.043
 spies-0.042
Token The
Token ID383
Feature activation+7.50e-01

Pos logit contributions
rated+0.115
aned+0.099
ently+0.094
rily+0.089
err+0.087
Neg logit contributions
 who-0.058
 detectives-0.053
 builders-0.052
 kings-0.050
 spies-0.049
Token feels
Token ID5300
Feature activation-2.30e+00

Pos logit contributions
 detectives+0.052
 builders+0.052
 who+0.050
 kings+0.050
 farmers+0.050
Neg logit contributions
rated-0.051
aned-0.043
ently-0.040
err-0.037
rily-0.036
Token a
Token ID257
Feature activation-3.79e+00

Pos logit contributions
 builders+0.084
 detectives+0.083
 kings+0.081
 who+0.078
 captains+0.077
Neg logit contributions
rated-0.106
aned-0.092
ently-0.085
 sensation-0.085
rily-0.085
Token hot
Token ID3024
Feature activation+2.48e+00

Pos logit contributions
 who+0.132
 detectives+0.122
 builders+0.120
 kings+0.119
 spies+0.117
Neg logit contributions
rated-0.198
aned-0.172
ently-0.164
err-0.152
rily-0.147
Token and
Token ID290
Feature activation+8.04e-01

Pos logit contributions
rated+0.120
aned+0.104
ently+0.100
err+0.092
rily+0.089
Neg logit contributions
 who-0.097
 builders-0.093
 detectives-0.090
 kings-0.090
 farmers-0.089
Token burning
Token ID9482
Feature activation+1.95e+00

Pos logit contributions
rated+0.043
aned+0.037
ently+0.036
err+0.034
rily+0.033
Neg logit contributions
 who-0.027
 detectives-0.026
 kings-0.026
 builders-0.025
 spies-0.025
Token sensation
Token ID18098
Feature activation+1.34e+01

Pos logit contributions
rated+0.106
aned+0.095
ently+0.091
err+0.090
appointed+0.084
Neg logit contributions
 who-0.062
 kings-0.057
 builders-0.056
 farmers-0.055
 pirates-0.055
Token in
Token ID287
Feature activation+5.52e+00

Pos logit contributions
rated+0.726
aned+0.610
err+0.584
ently+0.574
night+0.537
Neg logit contributions
 who-0.456
 kings-0.410
 builders-0.407
 detectives-0.397
 farmers-0.395
Token her
Token ID607
Feature activation+2.60e+00

Pos logit contributions
rated+0.329
aned+0.285
ently+0.275
err+0.257
night+0.253
Neg logit contributions
 who-0.160
 kings-0.148
 detectives-0.146
 builders-0.146
 spies-0.142
Token mouth
Token ID5422
Feature activation+2.62e+00

Pos logit contributions
rated+0.132
aned+0.107
ently+0.098
err+0.096
rily+0.089
Neg logit contributions
 who-0.109
 living-0.098
 detectives-0.097
 store-0.096
 kings-0.096
Token.
Token ID13
Feature activation+1.52e+00

Pos logit contributions
rated+0.147
aned+0.126
ently+0.121
err+0.119
rily+0.113
Neg logit contributions
 who-0.084
 kings-0.072
 builders-0.071
 farmers-0.071
 detectives-0.070
Token She
Token ID1375
Feature activation-1.09e+00

Pos logit contributions
rated+0.086
aned+0.075
ently+0.072
err+0.069
rily+0.067
Neg logit contributions
 detectives-0.043
 who-0.043
 builders-0.043
 kings-0.042
 farmers-0.040
Token he
Token ID339
Feature activation-1.95e+00

Pos logit contributions
 detectives+0.141
 who+0.140
 builders+0.137
 kings+0.136
 spies+0.129
Neg logit contributions
rated-0.284
aned-0.245
ently-0.237
err-0.225
rily-0.223
Token felt
Token ID2936
Feature activation-1.39e+00

Pos logit contributions
 detectives+0.048
 builders+0.048
 who+0.046
 spies+0.045
 kings+0.044
Neg logit contributions
rated-0.117
aned-0.104
ently-0.098
err-0.092
rily-0.091
Token a
Token ID257
Feature activation-3.69e+00

Pos logit contributions
 detectives+0.043
 builders+0.043
 kings+0.041
 who+0.041
 spies+0.040
Neg logit contributions
rated-0.074
aned-0.064
ently-0.060
 sensation-0.059
rily-0.057
Token fl
Token ID781
Feature activation+6.24e+00

Pos logit contributions
 who+0.130
 detectives+0.128
 builders+0.125
 kings+0.124
 spies+0.119
Neg logit contributions
rated-0.187
aned-0.160
ently-0.153
err-0.144
rily-0.141
Tokenapping
Token ID5912
Feature activation+5.89e-01

Pos logit contributions
rated+0.284
aned+0.236
ently+0.223
err+0.208
rily+0.200
Neg logit contributions
 who-0.256
 detectives-0.251
 builders-0.247
 kings-0.245
 spies-0.238
Token sensation
Token ID18098
Feature activation+1.34e+01

Pos logit contributions
rated+0.030
aned+0.027
ently+0.025
err+0.024
raw+0.023
Neg logit contributions
 who-0.022
 kings-0.019
 builders-0.018
 detectives-0.018
 farmers-0.018
Token in
Token ID287
Feature activation+5.24e+00

Pos logit contributions
rated+0.715
aned+0.610
err+0.579
ently+0.578
rily+0.534
Neg logit contributions
 who-0.446
 builders-0.409
 kings-0.407
 detectives-0.404
 farmers-0.394
Token the
Token ID262
Feature activation+1.20e+00

Pos logit contributions
rated+0.261
aned+0.223
ently+0.212
err+0.194
rily+0.190
Neg logit contributions
 who-0.199
 detectives-0.191
 kings-0.191
 builders-0.190
 spies-0.183
Token air
Token ID1633
Feature activation+5.27e+00

Pos logit contributions
rated+0.055
aned+0.047
ently+0.044
err+0.041
rily+0.040
Neg logit contributions
 who-0.048
 detectives-0.048
 builders-0.047
 kings-0.047
 spies-0.045
Token.
Token ID13
Feature activation+1.38e+00

Pos logit contributions
rated+0.273
aned+0.236
ently+0.223
err+0.218
rily+0.203
Neg logit contributions
 who-0.188
 kings-0.172
 detectives-0.170
 builders-0.168
 farmers-0.164
Token It
Token ID632
Feature activation+8.87e-01

Pos logit contributions
rated+0.075
aned+0.062
ently+0.059
err+0.054
rily+0.054
Neg logit contributions
 who-0.049
 detectives-0.046
 builders-0.044
 spies-0.043
 kings-0.043
Token and
Token ID290
Feature activation+1.00e+00

Pos logit contributions
 detectives+0.146
 builders+0.144
 kings+0.140
 farmers+0.135
 spies+0.134
Neg logit contributions
rated-0.204
aned-0.173
ently-0.162
rily-0.160
 sensation-0.157
Token they
Token ID484
Feature activation-3.02e+00

Pos logit contributions
rated+0.048
aned+0.040
ied+0.037
 sensation+0.037
rily+0.037
Neg logit contributions
 veterinarian-0.029
 detectives-0.029
 builders-0.028
 kings-0.027
 vet-0.027
Token pretended
Token ID36766
Feature activation-3.70e+00

Pos logit contributions
 detectives+0.108
 spies+0.105
 farmers+0.097
 builders+0.097
 vet+0.095
Neg logit contributions
rated-0.150
aned-0.124
led-0.106
ided-0.105
rily-0.105
Token to
Token ID284
Feature activation-3.75e-01

Pos logit contributions
 detectives+0.107
 who+0.104
 builders+0.103
 kings+0.102
 farmers+0.098
Neg logit contributions
rated-0.206
aned-0.178
ently-0.171
err-0.162
rily-0.160
Token be
Token ID307
Feature activation-2.60e+00

Pos logit contributions
 detectives+0.014
 spies+0.013
 builders+0.013
 farmers+0.013
 superheroes+0.013
Neg logit contributions
rated-0.019
aned-0.015
ently-0.013
 sensation-0.013
raw-0.013
Token builders
Token ID31606
Feature activation-1.20e+01

Pos logit contributions
Playing+0.058
 who+0.050
 veterinarian+0.049
 hardware+0.049
 vet+0.048
Neg logit contributions
rated-0.136
err-0.115
 sensation-0.113
aned-0.110
 off-0.109
Token.
Token ID13
Feature activation+1.40e+00

Pos logit contributions
 detectives+0.538
 builders+0.513
 hardware+0.508
 farmers+0.501
Playing+0.498
Neg logit contributions
rated-0.429
aned-0.343
ently-0.330
rily-0.299
err-0.286
Token They
Token ID1119
Feature activation-2.54e+00

Pos logit contributions
rated+0.082
aned+0.070
ently+0.067
rily+0.063
err+0.063
Neg logit contributions
 who-0.042
 detectives-0.040
 builders-0.039
 kings-0.039
 farmers-0.037
Token used
Token ID973
Feature activation-3.79e+00

Pos logit contributions
 detectives+0.096
 spies+0.094
 veterinarian+0.090
 vet+0.089
 farmers+0.087
Neg logit contributions
rated-0.117
aned-0.093
ened-0.081
ided-0.079
ied-0.078
Token their
Token ID511
Feature activation-5.30e-02

Pos logit contributions
 detectives+0.132
 who+0.129
 kings+0.122
 builders+0.122
 farmers+0.122
Neg logit contributions
rated-0.177
aned-0.154
ently-0.143
rily-0.139
 sensation-0.137
Token trucks
Token ID13960
Feature activation-4.33e+00

Pos logit contributions
 who+0.002
 detectives+0.002
 builders+0.002
 kings+0.002
 spies+0.002
Neg logit contributions
rated-0.003
aned-0.003
ently-0.002
err-0.002
rily-0.002
Token you
Token ID345
Feature activation+1.80e+00

Pos logit contributions
 detectives+0.156
 builders+0.147
 kings+0.143
 farmers+0.142
 who+0.142
Neg logit contributions
rated-0.175
ently-0.157
aned-0.148
err-0.137
rily-0.134
Token.
Token ID13
Feature activation+1.70e+00

Pos logit contributions
rated+0.085
aned+0.071
ently+0.067
err+0.063
rily+0.062
Neg logit contributions
 detectives-0.071
 who-0.070
 builders-0.069
 kings-0.069
 spies-0.067
Token You
Token ID921
Feature activation+4.49e+00

Pos logit contributions
rated+0.069
aned+0.056
ently+0.054
err+0.051
night+0.049
Neg logit contributions
 detectives-0.072
 kings-0.072
 builders-0.072
 who-0.072
 farmers-0.070
Token are
Token ID389
Feature activation-2.86e+00

Pos logit contributions
rated+0.198
aned+0.156
 hurts+0.137
night+0.134
ently+0.133
Neg logit contributions
 veterinarian-0.162
 farmers-0.162
 detectives-0.161
 vet-0.160
 spies-0.156
Token great
Token ID1049
Feature activation-7.82e-01

Pos logit contributions
Playing+0.093
Some+0.093
 hardware+0.092
 detectives+0.090
Did+0.090
Neg logit contributions
rated-0.113
aned-0.098
ently-0.092
rily-0.091
err-0.090
Token builders
Token ID31606
Feature activation-1.20e+01

Pos logit contributions
 hardware+0.020
 who+0.020
Playing+0.019
Did+0.019
Some+0.019
Neg logit contributions
rated-0.040
aned-0.035
ently-0.035
night-0.034
rily-0.033
Token,"
Token ID553
Feature activation-2.71e+00

Pos logit contributions
 detectives+0.520
 builders+0.502
 kings+0.493
 farmers+0.493
 spies+0.483
Neg logit contributions
rated-0.491
aned-0.386
ently-0.381
rily-0.343
rum-0.337
Token Dad
Token ID17415
Feature activation-3.47e+00

Pos logit contributions
 detectives+0.102
 who+0.099
 builders+0.098
 farmers+0.098
 kings+0.096
Neg logit contributions
rated-0.110
ently-0.094
aned-0.094
night-0.088
raw-0.088
Token said
Token ID531
Feature activation+5.98e-01

Pos logit contributions
 detectives+0.120
 kings+0.116
 builders+0.115
 who+0.114
 spies+0.112
Neg logit contributions
rated-0.177
aned-0.154
ently-0.145
err-0.136
rily-0.135
Token.
Token ID13
Feature activation+1.68e+00

Pos logit contributions
rated+0.035
aned+0.029
err+0.027
ently+0.026
 sensation+0.025
Neg logit contributions
 who-0.021
 builders-0.018
 detectives-0.018
 spies-0.018
 kings-0.017
Token\n
Token ID198
Feature activation-2.47e+00

Pos logit contributions
rated+0.060
aned+0.053
err+0.052
ently+0.050
rily+0.048
Neg logit contributions
 kings-0.072
 detectives-0.069
 farmers-0.069
boys-0.068
 builders-0.067
Token pretend
Token ID16614
Feature activation-3.63e+00

Pos logit contributions
 detectives+0.096
 spies+0.092
 farmers+0.089
 queens+0.088
 builders+0.088
Neg logit contributions
rated-0.119
aned-0.087
raw-0.080
 hurts-0.077
 cheek-0.074
Token they
Token ID484
Feature activation-2.98e+00

Pos logit contributions
 detectives+0.098
 builders+0.092
 kings+0.091
 who+0.090
 spies+0.087
Neg logit contributions
rated-0.204
aned-0.176
ently-0.170
err-0.161
night-0.159
Token are
Token ID389
Feature activation-2.84e+00

Pos logit contributions
 detectives+0.124
 vet+0.120
 spies+0.118
 farmers+0.117
 builders+0.117
Neg logit contributions
rated-0.125
aned-0.095
 hurts-0.088
led-0.080
ently-0.078
Token drivers
Token ID6643
Feature activation-8.21e+00

Pos logit contributions
 detectives+0.065
 who+0.062
Playing+0.062
 hardware+0.061
 vet+0.059
Neg logit contributions
rated-0.149
aned-0.131
ently-0.121
rily-0.118
icle-0.117
Token and
Token ID290
Feature activation+1.10e+00

Pos logit contributions
 detectives+0.336
 builders+0.318
 kings+0.308
 farmers+0.306
 spies+0.303
Neg logit contributions
rated-0.348
aned-0.290
ently-0.276
rily-0.254
err-0.250
Token builders
Token ID31606
Feature activation-1.19e+01

Pos logit contributions
rated+0.048
 hurts+0.036
aned+0.036
 lesson+0.035
erved+0.034
Neg logit contributions
 detectives-0.034
 veterinarian-0.034
 hardware-0.033
 chefs-0.033
 farmers-0.033
Token.
Token ID13
Feature activation+1.55e+00

Pos logit contributions
 detectives+0.535
 builders+0.516
 farmers+0.503
 kings+0.500
 spies+0.500
Neg logit contributions
rated-0.469
aned-0.387
ently-0.364
rily-0.330
err-0.323
Token They
Token ID1119
Feature activation-2.59e+00

Pos logit contributions
rated+0.087
aned+0.075
ently+0.072
err+0.069
rily+0.068
Neg logit contributions
 detectives-0.046
 who-0.046
 builders-0.045
 kings-0.045
 spies-0.042
Token make
Token ID787
Feature activation-3.53e+00

Pos logit contributions
 detectives+0.097
 spies+0.092
 farmers+0.090
 builders+0.087
 queens+0.087
Neg logit contributions
rated-0.119
aned-0.087
raw-0.081
 hurts-0.078
 cheek-0.074
Token roads
Token ID9725
Feature activation-1.81e+00

Pos logit contributions
 detectives+0.136
 who+0.133
 builders+0.131
 kings+0.128
 farmers+0.127
Neg logit contributions
rated-0.161
aned-0.135
ently-0.128
err-0.120
rily-0.118
Token with
Token ID351
Feature activation+9.93e-01

Pos logit contributions
 detectives+0.064
 who+0.063
 builders+0.062
 kings+0.061
 farmers+0.059
Neg logit contributions
rated-0.091
aned-0.077
ently-0.074
err-0.069
rily-0.068
Token,
Token ID11
Feature activation-3.81e+00

Pos logit contributions
rated+0.044
aned+0.039
ently+0.038
rily+0.035
night+0.034
Neg logit contributions
 detectives-0.030
 who-0.030
 kings-0.029
 builders-0.029
 spies-0.028
Token "
Token ID366
Feature activation-8.69e+00

Pos logit contributions
 detectives+0.100
 kings+0.097
 builders+0.095
 who+0.095
 spies+0.090
Neg logit contributions
rated-0.225
aned-0.201
ently-0.192
err-0.181
rily-0.179
TokenYou
Token ID1639
Feature activation+5.74e+00

Pos logit contributions
 detectives+0.260
 who+0.258
 kings+0.251
 builders+0.251
 spies+0.240
Neg logit contributions
rated-0.490
aned-0.424
ently-0.405
err-0.387
rily-0.376
Token are
Token ID389
Feature activation-2.75e+00

Pos logit contributions
rated+0.280
aned+0.232
ently+0.213
 hurts+0.207
night+0.203
Neg logit contributions
 detectives-0.192
 kings-0.190
 builders-0.181
 farmers-0.179
 spies-0.178
Token not
Token ID407
Feature activation-8.20e-03

Pos logit contributions
 detectives+0.100
 who+0.100
 kings+0.098
 builders+0.096
 farmers+0.092
Neg logit contributions
rated-0.126
aned-0.108
ently-0.099
err-0.097
 sensation-0.091
Token builders
Token ID31606
Feature activation-1.19e+01

Pos logit contributions
 who+0.000
 detectives+0.000
 builders+0.000
 kings+0.000
 spies+0.000
Neg logit contributions
rated-0.000
aned-0.000
ently-0.000
err-0.000
rily-0.000
Token,
Token ID11
Feature activation-3.78e+00

Pos logit contributions
 detectives+0.428
 who+0.414
 builders+0.414
 kings+0.410
 spies+0.397
Neg logit contributions
rated-0.593
aned-0.503
ently-0.480
err-0.447
rily-0.437
Token you
Token ID345
Feature activation+1.95e+00

Pos logit contributions
 detectives+0.114
 who+0.111
 kings+0.110
 builders+0.108
 superheroes+0.103
Neg logit contributions
rated-0.207
aned-0.179
ently-0.173
err-0.163
rily-0.160
Token are
Token ID389
Feature activation-2.76e+00

Pos logit contributions
rated+0.101
aned+0.087
ently+0.081
err+0.075
 hurts+0.075
Neg logit contributions
 detectives-0.065
 kings-0.062
 builders-0.061
 spies-0.060
 who-0.059
Token boys
Token ID6510
Feature activation-5.28e+00

Pos logit contributions
 detectives+0.088
 who+0.086
 kings+0.085
 builders+0.083
 farmers+0.080
Neg logit contributions
rated-0.139
aned-0.120
ently-0.112
err-0.108
rily-0.104
Token.
Token ID13
Feature activation+2.02e+00

Pos logit contributions
 detectives+0.238
 builders+0.229
 kings+0.225
 farmers+0.222
 spies+0.221
Neg logit contributions
rated-0.207
aned-0.162
ently-0.156
rum-0.144
rily-0.141
Token.
Token ID13
Feature activation+1.36e+00

Pos logit contributions
rated+0.121
aned+0.103
ently+0.098
err+0.092
rily+0.090
Neg logit contributions
 who-0.088
 detectives-0.086
 builders-0.085
 kings-0.084
 spies-0.081
Token They
Token ID1119
Feature activation-2.56e+00

Pos logit contributions
rated+0.072
aned+0.063
ently+0.060
err+0.057
rily+0.056
Neg logit contributions
 detectives-0.043
 builders-0.042
 kings-0.042
 who-0.042
 farmers-0.039
Token pretend
Token ID16614
Feature activation-3.38e+00

Pos logit contributions
 detectives+0.092
 spies+0.087
 builders+0.084
 queens+0.084
 farmers+0.082
Neg logit contributions
rated-0.129
aned-0.100
raw-0.092
 sensation-0.087
ently-0.085
Token they
Token ID484
Feature activation-2.96e+00

Pos logit contributions
 detectives+0.095
 builders+0.089
 who+0.087
 kings+0.087
 spies+0.085
Neg logit contributions
rated-0.184
aned-0.157
ently-0.153
err-0.144
night-0.143
Token are
Token ID389
Feature activation-2.82e+00

Pos logit contributions
 detectives+0.127
 vet+0.122
 spies+0.121
 veterinarian+0.119
 builders+0.118
Neg logit contributions
rated-0.120
aned-0.087
ently-0.078
 hurts-0.077
raw-0.075
Token builders
Token ID31606
Feature activation-1.19e+01

Pos logit contributions
 who+0.067
Playing+0.066
Some+0.063
 detectives+0.063
 vet+0.062
Neg logit contributions
rated-0.148
aned-0.125
rily-0.121
ently-0.120
err-0.115
Token and
Token ID290
Feature activation+7.38e-01

Pos logit contributions
 detectives+0.511
 builders+0.479
 farmers+0.467
 spies+0.465
Playing+0.463
Neg logit contributions
rated-0.479
aned-0.386
ently-0.372
rily-0.342
night-0.325
Token make
Token ID787
Feature activation-3.45e+00

Pos logit contributions
rated+0.033
 hurts+0.026
 lesson+0.025
aned+0.024
 tant+0.024
Neg logit contributions
 veterinarian-0.021
Playing-0.021
 who-0.020
 detectives-0.019
ograph-0.019
Token roads
Token ID9725
Feature activation-1.35e+00

Pos logit contributions
 detectives+0.113
 who+0.113
 builders+0.109
 kings+0.107
 farmers+0.104
Neg logit contributions
rated-0.177
aned-0.151
ently-0.146
rily-0.138
err-0.138
Token and
Token ID290
Feature activation+7.87e-01

Pos logit contributions
 detectives+0.050
 who+0.049
 builders+0.048
 kings+0.047
 farmers+0.046
Neg logit contributions
rated-0.066
aned-0.055
ently-0.053
err-0.049
rily-0.049
Token houses
Token ID7777
Feature activation-5.45e+00

Pos logit contributions
rated+0.033
aned+0.024
 sensation+0.023
 hurts+0.023
ently+0.022
Neg logit contributions
 veterinarian-0.028
 detectives-0.027
 photographer-0.027
Playing-0.027
 scientist-0.026
Token.
Token ID13
Feature activation+1.86e+00

Pos logit contributions
rated+0.120
aned+0.104
ently+0.096
err+0.092
rily+0.087
Neg logit contributions
 who-0.083
 detectives-0.079
 kings-0.079
 builders-0.078
 farmers-0.075
Token Daniel
Token ID7806
Feature activation+2.73e+00

Pos logit contributions
rated+0.096
aned+0.082
ently+0.078
err+0.074
rily+0.072
Neg logit contributions
 who-0.065
 detectives-0.064
 builders-0.063
 kings-0.062
 farmers-0.060
Token even
Token ID772
Feature activation-2.81e+00

Pos logit contributions
rated+0.139
aned+0.121
ently+0.112
rily+0.104
err+0.103
Neg logit contributions
 detectives-0.096
 builders-0.094
 spies-0.092
 kings-0.091
 farmers-0.087
Token watched
Token ID7342
Feature activation+3.34e+00

Pos logit contributions
 detectives+0.086
 builders+0.084
 spies+0.081
 kings+0.081
 who+0.078
Neg logit contributions
rated-0.155
aned-0.133
ently-0.124
night-0.118
err-0.118
Token the
Token ID262
Feature activation+1.55e+00

Pos logit contributions
rated+0.188
aned+0.161
ently+0.151
err+0.148
rily+0.137
Neg logit contributions
 who-0.109
 detectives-0.101
 builders-0.101
 kings-0.101
 farmers-0.097
Token builders
Token ID31606
Feature activation-1.19e+01

Pos logit contributions
rated+0.085
aned+0.073
ently+0.069
err+0.066
rily+0.065
Neg logit contributions
 detectives-0.048
 who-0.047
 kings-0.046
 builders-0.045
 spies-0.044
Token make
Token ID787
Feature activation-3.28e+00

Pos logit contributions
 detectives+0.376
 who+0.365
 builders+0.363
 kings+0.359
 spies+0.350
Neg logit contributions
rated-0.643
aned-0.554
ently-0.530
rily-0.496
err-0.495
Token a
Token ID257
Feature activation-3.65e+00

Pos logit contributions
 who+0.095
 detectives+0.094
 builders+0.091
 kings+0.091
 spies+0.087
Neg logit contributions
rated-0.187
aned-0.162
ently-0.155
err-0.148
rily-0.145
Token big
Token ID1263
Feature activation-4.30e+00

Pos logit contributions
 detectives+0.122
 who+0.120
 builders+0.119
 kings+0.118
 spies+0.114
Neg logit contributions
rated-0.189
aned-0.159
ently-0.152
err-0.147
rily-0.144
Token sand
Token ID6450
Feature activation+2.26e+00

Pos logit contributions
 detectives+0.160
 who+0.156
 builders+0.155
 kings+0.154
 spies+0.150
Neg logit contributions
rated-0.209
aned-0.172
ently-0.164
err-0.155
rily-0.154
Token castle
Token ID16669
Feature activation-3.73e+00

Pos logit contributions
rated+0.119
aned+0.102
ently+0.097
err+0.092
rily+0.090
Neg logit contributions
 who-0.076
 detectives-0.075
 builders-0.074
 kings-0.073
 spies-0.070
Token not
Token ID407
Feature activation+5.26e-02

Pos logit contributions
 who+0.074
 detectives+0.073
 kings+0.071
 builders+0.071
Playing+0.068
Neg logit contributions
rated-0.142
aned-0.124
ently-0.112
rily-0.110
err-0.109
Token dirty
Token ID11841
Feature activation+7.80e-01

Pos logit contributions
rated+0.003
aned+0.003
ently+0.002
err+0.002
rily+0.002
Neg logit contributions
 detectives-0.001
 who-0.001
 kings-0.001
 builders-0.001
 spies-0.001
Token,
Token ID11
Feature activation-3.78e+00

Pos logit contributions
rated+0.038
aned+0.032
ently+0.030
err+0.029
rily+0.028
Neg logit contributions
 who-0.029
 detectives-0.029
 builders-0.029
 kings-0.028
 spies-0.027
Token we
Token ID356
Feature activation-3.27e+00

Pos logit contributions
 who+0.095
 detectives+0.094
 builders+0.091
 kings+0.091
 spies+0.086
Neg logit contributions
rated-0.230
aned-0.201
ently-0.193
err-0.184
rily-0.182
Token are
Token ID389
Feature activation-2.70e+00

Pos logit contributions
 who+0.087
 detectives+0.084
 builders+0.082
 kings+0.081
 spies+0.077
Neg logit contributions
rated-0.195
aned-0.172
ently-0.165
err-0.158
rily-0.156
Token builders
Token ID31606
Feature activation-1.18e+01

Pos logit contributions
 detectives+0.085
 who+0.084
 builders+0.083
 kings+0.082
 farmers+0.079
Neg logit contributions
rated-0.139
aned-0.118
ently-0.109
err-0.105
rily-0.105
Token."
Token ID526
Feature activation+1.28e+00

Pos logit contributions
 detectives+0.553
 builders+0.531
 kings+0.526
 who+0.524
 farmers+0.519
Neg logit contributions
rated-0.459
aned-0.359
ently-0.336
rily-0.309
err-0.304
Token
Token ID198
Feature activation-1.94e+00

Pos logit contributions
rated+0.099
aned+0.082
rily+0.080
ently+0.079
err+0.076
Neg logit contributions
 who-0.022
 named-0.017
 see-0.016
 builders-0.016
 police-0.016
Token
Token ID198
Feature activation-1.93e+00

Pos logit contributions
 who+0.044
 detectives+0.038
 builders+0.038
 kings+0.037
 spies+0.035
Neg logit contributions
rated-0.129
aned-0.111
ently-0.106
rily-0.103
err-0.103
TokenTheir
Token ID14574
Feature activation+3.27e-01

Pos logit contributions
 detectives+0.068
 kings+0.067
 who+0.065
 builders+0.065
 superheroes+0.063
Neg logit contributions
rated-0.098
aned-0.086
ently-0.080
err-0.076
rily-0.072
Token mom
Token ID1995
Feature activation-4.28e+00

Pos logit contributions
rated+0.017
aned+0.015
ently+0.014
err+0.014
night+0.013
Neg logit contributions
 who-0.011
 detectives-0.010
 kings-0.010
 builders-0.010
 farmers-0.010
Token!
Token ID0
Feature activation+1.89e+00

Pos logit contributions
rated+0.182
aned+0.154
ently+0.143
err+0.140
rily+0.135
Neg logit contributions
 who-0.134
 detectives-0.132
 builders-0.132
 kings-0.130
 spies-0.124
Token We
Token ID775
Feature activation-2.67e+00

Pos logit contributions
rated+0.093
aned+0.078
ently+0.074
err+0.070
rily+0.069
Neg logit contributions
 who-0.070
 detectives-0.069
 builders-0.068
 kings-0.068
 spies-0.065
Token are
Token ID389
Feature activation-2.62e+00

Pos logit contributions
 detectives+0.090
 who+0.086
 spies+0.085
 farmers+0.085
 builders+0.084
Neg logit contributions
rated-0.137
aned-0.113
ently-0.103
err-0.101
night-0.099
Token the
Token ID262
Feature activation+1.34e+00

Pos logit contributions
 who+0.080
 detectives+0.076
Playing+0.076
Some+0.074
Did+0.073
Neg logit contributions
rated-0.129
aned-0.114
ently-0.106
rily-0.104
err-0.102
Token best
Token ID1266
Feature activation-2.13e+00

Pos logit contributions
rated+0.056
err+0.046
aned+0.045
ently+0.044
 sensation+0.042
Neg logit contributions
 who-0.051
 detectives-0.050
Playing-0.048
 named-0.046
 spies-0.046
Token builders
Token ID31606
Feature activation-1.18e+01

Pos logit contributions
 who+0.078
 detectives+0.077
 kings+0.074
 builders+0.073
 farmers+0.073
Neg logit contributions
rated-0.104
aned-0.087
ently-0.084
err-0.080
rily-0.077
Token!"
Token ID2474
Feature activation-3.96e-01

Pos logit contributions
 detectives+0.460
 builders+0.443
 who+0.441
 kings+0.437
 farmers+0.434
Neg logit contributions
rated-0.546
aned-0.445
ently-0.430
err-0.401
rily-0.398
Token Sam
Token ID3409
Feature activation+8.12e-01

Pos logit contributions
 detectives+0.015
 builders+0.015
 kings+0.015
 farmers+0.014
 who+0.014
Neg logit contributions
rated-0.017
aned-0.015
ently-0.014
err-0.014
night-0.013
Token cheered
Token ID34767
Feature activation-6.95e-01

Pos logit contributions
rated+0.044
aned+0.038
ently+0.036
err+0.034
rily+0.034
Neg logit contributions
 who-0.026
 detectives-0.026
 builders-0.025
 kings-0.025
 spies-0.024
Token.
Token ID13
Feature activation+1.94e+00

Pos logit contributions
 who+0.024
 detectives+0.022
 builders+0.021
 spies+0.021
 kings+0.021
Neg logit contributions
rated-0.040
aned-0.034
ently-0.032
err-0.030
raw-0.030
Token
Token ID198
Feature activation-2.06e+00

Pos logit contributions
rated+0.093
aned+0.083
ently+0.080
err+0.079
rily+0.074
Neg logit contributions
 kings-0.066
 detectives-0.066
 builders-0.064
 farmers-0.063
 who-0.062
Token and
Token ID290
Feature activation+1.09e+00

Pos logit contributions
rated+0.024
aned+0.019
ently+0.019
rily+0.018
err+0.017
Neg logit contributions
 detectives-0.021
 builders-0.020
 kings-0.020
 spies-0.019
 farmers-0.019
Token Lily
Token ID20037
Feature activation+2.57e-01

Pos logit contributions
rated+0.051
aned+0.041
night+0.039
ently+0.039
err+0.039
Neg logit contributions
 builders-0.036
 detectives-0.036
 kings-0.035
 farmers-0.035
 who-0.034
Token,
Token ID11
Feature activation-3.93e+00

Pos logit contributions
rated+0.012
aned+0.010
ently+0.009
err+0.009
rily+0.009
Neg logit contributions
 detectives-0.010
 builders-0.010
 who-0.010
 kings-0.010
 spies-0.009
Token the
Token ID262
Feature activation+1.40e+00

Pos logit contributions
 detectives+0.117
 builders+0.114
 kings+0.110
 farmers+0.106
 who+0.105
Neg logit contributions
rated-0.211
aned-0.178
ently-0.176
err-0.167
rily-0.165
Token best
Token ID1266
Feature activation-2.25e+00

Pos logit contributions
rated+0.069
aned+0.057
ently+0.056
err+0.053
 sensation+0.051
Neg logit contributions
 detectives-0.049
 who-0.048
 builders-0.047
 kings-0.046
 spies-0.046
Token builders
Token ID31606
Feature activation-1.18e+01

Pos logit contributions
 who+0.077
 detectives+0.077
 builders+0.075
 kings+0.075
 spies+0.072
Neg logit contributions
rated-0.117
aned-0.100
ently-0.095
err-0.090
rily-0.088
Token ever
Token ID1683
Feature activation+4.12e+00

Pos logit contributions
 detectives+0.459
 who+0.450
 builders+0.446
 kings+0.442
 spies+0.430
Neg logit contributions
rated-0.554
aned-0.461
ently-0.440
err-0.410
rily-0.404
Token.
Token ID13
Feature activation+2.03e+00

Pos logit contributions
rated+0.184
aned+0.153
ently+0.144
err+0.134
rily+0.131
Neg logit contributions
 who-0.171
 detectives-0.171
 builders-0.167
 kings-0.166
 spies-0.161
Token And
Token ID843
Feature activation+4.26e-01

Pos logit contributions
rated+0.091
aned+0.075
ently+0.072
err+0.070
rily+0.067
Neg logit contributions
 kings-0.078
 detectives-0.078
 who-0.077
 builders-0.077
 farmers-0.075
Token we
Token ID356
Feature activation-3.35e+00

Pos logit contributions
rated+0.021
ently+0.017
aned+0.017
night+0.016
rily+0.016
Neg logit contributions
 builders-0.013
 detectives-0.013
 kings-0.012
 farmers-0.012
Playing-0.012
Token are
Token ID389
Feature activation-2.75e+00

Pos logit contributions
 detectives+0.116
 builders+0.109
 farmers+0.108
 spies+0.108
 kings+0.107
Neg logit contributions
rated-0.170
aned-0.142
ently-0.129
err-0.125
night-0.121
Token.
Token ID13
Feature activation+1.22e+00

Pos logit contributions
rated+0.028
aned+0.024
ently+0.023
err+0.021
rily+0.021
Neg logit contributions
 who-0.024
 detectives-0.023
 builders-0.023
 kings-0.023
 spies-0.022
Token We
Token ID775
Feature activation-2.44e+00

Pos logit contributions
rated+0.057
aned+0.046
ently+0.044
err+0.043
rily+0.042
Neg logit contributions
 detectives-0.046
 who-0.045
 kings-0.045
 builders-0.045
 farmers-0.044
Token are
Token ID389
Feature activation-2.67e+00

Pos logit contributions
 detectives+0.079
 builders+0.076
 spies+0.074
 who+0.073
 farmers+0.073
Neg logit contributions
rated-0.129
aned-0.108
ently-0.098
err-0.097
night-0.096
Token the
Token ID262
Feature activation+9.65e-01

Pos logit contributions
 who+0.078
 detectives+0.076
 hardware+0.076
Playing+0.075
 builders+0.075
Neg logit contributions
rated-0.129
aned-0.116
ently-0.104
rily-0.104
err-0.101
Token best
Token ID1266
Feature activation-2.38e+00

Pos logit contributions
rated+0.046
err+0.040
 sensation+0.037
ently+0.036
aned+0.035
Neg logit contributions
 who-0.028
Playing-0.028
 detectives-0.028
 builders-0.026
Did-0.026
Token builders
Token ID31606
Feature activation-1.18e+01

Pos logit contributions
 who+0.072
 detectives+0.071
 kings+0.069
Playing+0.069
Did+0.068
Neg logit contributions
rated-0.121
aned-0.099
ently-0.099
night-0.095
err-0.095
Token ever
Token ID1683
Feature activation+4.23e+00

Pos logit contributions
 detectives+0.477
 builders+0.455
 kings+0.449
 who+0.446
 farmers+0.443
Neg logit contributions
rated-0.526
aned-0.423
ently-0.409
rily-0.377
err-0.374
Token!"
Token ID2474
Feature activation-4.18e-01

Pos logit contributions
rated+0.189
aned+0.159
ently+0.149
err+0.140
rily+0.137
Neg logit contributions
 who-0.177
 detectives-0.176
 builders-0.172
 kings-0.172
 spies-0.166
Token Lily
Token ID20037
Feature activation+5.30e-01

Pos logit contributions
 who+0.014
 detectives+0.013
 builders+0.013
 kings+0.013
 spies+0.012
Neg logit contributions
rated-0.023
aned-0.020
ently-0.019
err-0.018
rily-0.017
Token said
Token ID531
Feature activation+1.06e+00

Pos logit contributions
rated+0.028
aned+0.024
ently+0.023
err+0.022
rily+0.022
Neg logit contributions
 detectives-0.017
 who-0.017
 builders-0.017
 kings-0.017
 spies-0.016
Token.
Token ID13
Feature activation+1.55e+00

Pos logit contributions
rated+0.059
aned+0.048
err+0.046
ently+0.045
 sensation+0.042
Neg logit contributions
 who-0.037
 detectives-0.034
 builders-0.034
 spies-0.033
 kings-0.033
Token liked
Token ID8288
Feature activation-3.46e+00

Pos logit contributions
 detectives+0.093
 spies+0.092
 farmers+0.083
 superheroes+0.083
 veterinarian+0.083
Neg logit contributions
rated-0.117
aned-0.095
ently-0.081
led-0.080
ided-0.080
Token to
Token ID284
Feature activation-3.95e-01

Pos logit contributions
 detectives+0.139
 spies+0.132
 builders+0.129
 kings+0.129
 farmers+0.128
Neg logit contributions
rated-0.125
aned-0.104
ently-0.101
err-0.096
rily-0.096
Token pretend
Token ID16614
Feature activation-3.32e+00

Pos logit contributions
 detectives+0.013
 queens+0.013
 kings+0.013
 spies+0.013
Playing+0.013
Neg logit contributions
rated-0.019
aned-0.016
ently-0.015
 sensation-0.014
ingle-0.014
Token they
Token ID484
Feature activation-2.85e+00

Pos logit contributions
 detectives+0.092
 builders+0.085
 kings+0.085
 who+0.083
 farmers+0.082
Neg logit contributions
rated-0.170
ently-0.149
aned-0.148
rily-0.139
err-0.134
Token were
Token ID547
Feature activation-2.08e+00

Pos logit contributions
 detectives+0.105
 spies+0.100
 veterinarian+0.096
 vet+0.096
 builders+0.095
Neg logit contributions
rated-0.139
aned-0.114
ently-0.106
led-0.102
ingle-0.097
Token builders
Token ID31606
Feature activation-1.18e+01

Pos logit contributions
Playing+0.050
 who+0.049
Did+0.047
 veterinarian+0.045
 vet+0.045
Neg logit contributions
rated-0.106
aned-0.090
ently-0.090
rily-0.086
 off-0.083
Token and
Token ID290
Feature activation+1.09e+00

Pos logit contributions
 detectives+0.526
Playing+0.515
Some+0.505
 hardware+0.500
 vet+0.499
Neg logit contributions
rated-0.412
aned-0.314
ently-0.307
rily-0.298
rum-0.268
Token fix
Token ID4259
Feature activation-2.79e+00

Pos logit contributions
rated+0.046
 tant+0.035
aned+0.035
 off+0.035
 lesson+0.035
Neg logit contributions
Playing-0.032
 veterinarian-0.030
boys-0.030
 detectives-0.029
Charl-0.028
Token things
Token ID1243
Feature activation-1.77e+00

Pos logit contributions
 detectives+0.098
 who+0.097
 builders+0.095
 kings+0.094
 spies+0.091
Neg logit contributions
rated-0.141
aned-0.120
ently-0.115
err-0.108
rily-0.106
Token around
Token ID1088
Feature activation+2.42e+00

Pos logit contributions
 detectives+0.077
 kings+0.075
 builders+0.074
 farmers+0.073
 who+0.073
Neg logit contributions
rated-0.071
aned-0.058
ently-0.057
rily-0.052
err-0.049
Token the
Token ID262
Feature activation+1.32e+00

Pos logit contributions
rated+0.115
ently+0.096
aned+0.095
rily+0.090
err+0.087
Neg logit contributions
 detectives-0.085
 builders-0.081
 who-0.080
 kings-0.080
 spies-0.077
Token liked
Token ID8288
Feature activation-3.34e+00

Pos logit contributions
 detectives+0.094
 spies+0.089
 veterinarian+0.086
 farmers+0.085
 superheroes+0.082
Neg logit contributions
rated-0.108
aned-0.093
led-0.079
ided-0.078
ently-0.077
Token to
Token ID284
Feature activation-4.08e-01

Pos logit contributions
 detectives+0.130
 builders+0.121
 spies+0.120
 farmers+0.119
 kings+0.118
Neg logit contributions
rated-0.126
aned-0.110
ently-0.108
rily-0.100
err-0.100
Token pretend
Token ID16614
Feature activation-3.14e+00

Pos logit contributions
 detectives+0.013
 kings+0.013
 spies+0.012
 builders+0.012
 farmers+0.012
Neg logit contributions
rated-0.021
aned-0.018
ently-0.017
 sensation-0.016
rily-0.016
Token they
Token ID484
Feature activation-2.83e+00

Pos logit contributions
 detectives+0.091
 builders+0.085
 kings+0.084
 who+0.082
 farmers+0.081
Neg logit contributions
rated-0.157
aned-0.139
ently-0.139
rily-0.128
 sensation-0.125
Token were
Token ID547
Feature activation-1.91e+00

Pos logit contributions
 detectives+0.102
 spies+0.097
 builders+0.095
 vet+0.093
 veterinarian+0.093
Neg logit contributions
rated-0.138
aned-0.114
ently-0.106
led-0.102
ened-0.097
Token builders
Token ID31606
Feature activation-1.17e+01

Pos logit contributions
Playing+0.049
 who+0.048
Did+0.047
Would+0.046
Some+0.046
Neg logit contributions
rated-0.092
aned-0.081
ently-0.078
rily-0.075
 off-0.073
Token and
Token ID290
Feature activation+9.32e-01

Pos logit contributions
 detectives+0.502
Playing+0.485
Some+0.476
 builders+0.476
 hardware+0.472
Neg logit contributions
rated-0.444
aned-0.348
ently-0.343
rily-0.329
rum-0.301
Token make
Token ID787
Feature activation-3.22e+00

Pos logit contributions
rated+0.041
aned+0.033
 hurts+0.032
 tant+0.032
 off+0.031
Neg logit contributions
Playing-0.027
 veterinarian-0.024
boys-0.024
 detectives-0.024
 who-0.023
Token things
Token ID1243
Feature activation-1.72e+00

Pos logit contributions
 detectives+0.107
 who+0.106
 builders+0.104
 kings+0.102
 farmers+0.100
Neg logit contributions
rated-0.159
aned-0.137
ently-0.134
rily-0.125
err-0.125
Token with
Token ID351
Feature activation+1.18e+00

Pos logit contributions
 detectives+0.064
 kings+0.063
 who+0.062
 farmers+0.061
 builders+0.061
Neg logit contributions
rated-0.076
aned-0.062
ently-0.061
rily-0.058
ingle-0.055
Token the
Token ID262
Feature activation+1.28e+00

Pos logit contributions
rated+0.060
aned+0.051
ently+0.050
err+0.046
rily+0.046
Neg logit contributions
 detectives-0.039
 who-0.038
 builders-0.038
 kings-0.037
 farmers-0.036
Token were
Token ID547
Feature activation-1.96e+00

Pos logit contributions
 detectives+0.084
 builders+0.081
 farmers+0.080
 spies+0.078
 lawyers+0.074
Neg logit contributions
rated-0.110
aned-0.094
night-0.083
err-0.082
ently-0.082
Token good
Token ID922
Feature activation+1.07e+00

Pos logit contributions
 detectives+0.061
 hardware+0.059
 who+0.058
 builders+0.058
 charity+0.058
Neg logit contributions
rated-0.089
aned-0.080
ently-0.074
rily-0.073
err-0.071
Token friends
Token ID2460
Feature activation-7.08e+00

Pos logit contributions
rated+0.058
aned+0.053
night+0.052
ently+0.052
err+0.050
Neg logit contributions
 who-0.025
 hardware-0.025
 detectives-0.024
 charity-0.022
 store-0.022
Token and
Token ID290
Feature activation+1.12e+00

Pos logit contributions
 detectives+0.292
 builders+0.287
 kings+0.282
 who+0.280
 farmers+0.276
Neg logit contributions
rated-0.307
aned-0.253
ently-0.246
err-0.228
rily-0.225
Token good
Token ID922
Feature activation+1.04e+00

Pos logit contributions
rated+0.057
night+0.049
aned+0.047
 sensation+0.045
ently+0.045
Neg logit contributions
 detectives-0.031
 builders-0.030
 hardware-0.030
 farmers-0.030
 who-0.029
Token builders
Token ID31606
Feature activation-1.17e+01

Pos logit contributions
rated+0.067
aned+0.059
ently+0.058
err+0.055
night+0.055
Neg logit contributions
 who-0.021
 detectives-0.020
 kings-0.019
 builders-0.019
 farmers-0.018
Token.
Token ID13
Feature activation+1.96e+00

Pos logit contributions
 detectives+0.503
 builders+0.489
 who+0.477
 kings+0.475
 farmers+0.473
Neg logit contributions
rated-0.491
aned-0.400
ently-0.386
err-0.355
night-0.349
Token<|endoftext|>
Token ID50256
Feature activation+3.50e+00

Pos logit contributions
rated+0.086
err+0.073
aned+0.069
ently+0.068
night+0.067
Neg logit contributions
 builders-0.072
 detectives-0.072
 kings-0.071
 farmers-0.068
 hardware-0.065
TokenL
Token ID43
Feature activation+6.92e-01

Pos logit contributions
rated+0.200
aned+0.173
ently+0.163
err+0.154
rily+0.154
Neg logit contributions
 who-0.107
 detectives-0.104
 kings-0.101
 builders-0.101
 spies-0.097
Tokenily
Token ID813
Feature activation-7.22e-02

Pos logit contributions
rated+0.037
aned+0.032
ently+0.030
err+0.029
rily+0.028
Neg logit contributions
 who-0.023
 detectives-0.023
 kings-0.023
 builders-0.022
 spies-0.022
Token and
Token ID290
Feature activation+1.35e+00

Pos logit contributions
 detectives+0.002
 builders+0.002
 spies+0.002
 farmers+0.002
 kings+0.002
Neg logit contributions
rated-0.004
aned-0.003
ently-0.003
night-0.003
err-0.003
Token liked
Token ID8288
Feature activation-3.43e+00

Pos logit contributions
 vet+0.101
 veterinarian+0.099
 farmers+0.099
 queens+0.097
 detectives+0.095
Neg logit contributions
rated-0.089
led-0.066
aned-0.065
ened-0.062
 sensation-0.056
Token to
Token ID284
Feature activation-3.62e-01

Pos logit contributions
 detectives+0.133
 who+0.127
 farmers+0.127
 kings+0.126
 builders+0.126
Neg logit contributions
rated-0.133
aned-0.108
ently-0.102
night-0.101
 sensation-0.100
Token pretend
Token ID16614
Feature activation-3.46e+00

Pos logit contributions
 detectives+0.010
 kings+0.010
 builders+0.010
 who+0.010
 farmers+0.010
Neg logit contributions
rated-0.020
aned-0.017
ently-0.016
 sensation-0.016
err-0.015
Token they
Token ID484
Feature activation-2.83e+00

Pos logit contributions
 detectives+0.084
 builders+0.082
 kings+0.080
 who+0.079
 farmers+0.077
Neg logit contributions
rated-0.201
aned-0.177
ently-0.173
err-0.162
rily-0.161
Token were
Token ID547
Feature activation-2.09e+00

Pos logit contributions
 detectives+0.105
 builders+0.104
 spies+0.099
 farmers+0.098
 vet+0.095
Neg logit contributions
rated-0.134
aned-0.109
led-0.100
ently-0.098
ingle-0.098
Token builders
Token ID31606
Feature activation-1.17e+01

Pos logit contributions
 who+0.038
Playing+0.036
 hardware+0.034
 vet+0.033
Did+0.033
Neg logit contributions
rated-0.122
aned-0.106
ently-0.104
rily-0.102
err-0.101
Token and
Token ID290
Feature activation+1.23e+00

Pos logit contributions
 detectives+0.530
 builders+0.512
 hardware+0.508
Playing+0.502
 farmers+0.501
Neg logit contributions
rated-0.409
aned-0.313
ently-0.304
rily-0.291
err-0.271
Token fix
Token ID4259
Feature activation-2.87e+00

Pos logit contributions
rated+0.050
 lesson+0.040
 off+0.039
 sensation+0.039
 hurts+0.038
Neg logit contributions
Playing-0.036
 veterinarian-0.035
Far-0.033
boys-0.032
Charl-0.032
Token things
Token ID1243
Feature activation-1.82e+00

Pos logit contributions
 who+0.101
 detectives+0.101
 builders+0.099
 kings+0.098
 spies+0.094
Neg logit contributions
rated-0.145
aned-0.124
ently-0.118
err-0.111
rily-0.109
Token with
Token ID351
Feature activation+1.07e+00

Pos logit contributions
 detectives+0.079
 kings+0.078
 builders+0.077
 who+0.076
 farmers+0.075
Neg logit contributions
rated-0.074
aned-0.060
ently-0.059
rily-0.054
err-0.052
Token their
Token ID511
Feature activation+5.77e-03

Pos logit contributions
rated+0.052
aned+0.044
ently+0.043
err+0.040
rily+0.039
Neg logit contributions
 detectives-0.036
 builders-0.034
 who-0.034
 kings-0.033
 farmers-0.033
Token later
Token ID1568
Feature activation+4.79e+00

Pos logit contributions
rated+0.283
aned+0.239
ently+0.229
err+0.216
rily+0.212
Neg logit contributions
 detectives-0.196
 who-0.194
 builders-0.191
 kings-0.190
 spies-0.183
Token.
Token ID13
Feature activation+2.00e+00

Pos logit contributions
rated+0.244
aned+0.208
ently+0.192
err+0.189
rily+0.177
Neg logit contributions
 who-0.179
 detectives-0.173
 builders-0.170
 kings-0.167
 spies-0.165
Token They
Token ID1119
Feature activation-2.33e+00

Pos logit contributions
rated+0.105
aned+0.089
ently+0.085
err+0.083
rily+0.079
Neg logit contributions
 detectives-0.065
 kings-0.064
 builders-0.064
 who-0.063
 farmers-0.061
Token are
Token ID389
Feature activation-2.69e+00

Pos logit contributions
 detectives+0.081
 builders+0.080
 spies+0.076
 farmers+0.074
 mechanics+0.072
Neg logit contributions
rated-0.117
aned-0.094
raw-0.087
ently-0.086
err-0.085
Token good
Token ID922
Feature activation+1.01e+00

Pos logit contributions
 detectives+0.081
 who+0.079
 builders+0.078
 kings+0.075
 spies+0.073
Neg logit contributions
rated-0.141
aned-0.122
ently-0.113
rily-0.112
err-0.110
Token builders
Token ID31606
Feature activation-1.17e+01

Pos logit contributions
rated+0.063
aned+0.055
ently+0.054
err+0.051
night+0.051
Neg logit contributions
 who-0.021
 detectives-0.021
 kings-0.020
 builders-0.019
 farmers-0.018
Token.
Token ID13
Feature activation+1.95e+00

Pos logit contributions
 detectives+0.493
 builders+0.475
 kings+0.460
 farmers+0.459
 spies+0.454
Neg logit contributions
rated-0.495
aned-0.395
ently-0.387
rily-0.357
err-0.355
Token<|endoftext|>
Token ID50256
Feature activation+3.46e+00

Pos logit contributions
rated+0.096
aned+0.079
err+0.078
ently+0.077
night+0.074
Neg logit contributions
 kings-0.066
 detectives-0.066
 builders-0.065
 who-0.062
 farmers-0.062
TokenTom
Token ID13787
Feature activation+8.31e-01

Pos logit contributions
rated+0.193
aned+0.167
ently+0.157
err+0.148
rily+0.148
Neg logit contributions
 who-0.109
 detectives-0.107
 builders-0.105
 kings-0.104
 spies-0.100
Token and
Token ID290
Feature activation+1.37e+00

Pos logit contributions
rated+0.039
aned+0.033
ently+0.031
err+0.029
rily+0.028
Neg logit contributions
 detectives-0.032
 who-0.032
 builders-0.032
 kings-0.031
 spies-0.031
Token Mia
Token ID32189
Feature activation+3.23e+00

Pos logit contributions
rated+0.068
aned+0.055
night+0.053
ently+0.053
err+0.052
Neg logit contributions
 detectives-0.047
 kings-0.046
 builders-0.046
 who-0.044
 farmers-0.044
Token blue
Token ID4171
Feature activation-8.64e-01

Pos logit contributions
 detectives+0.007
 who+0.007
 kings+0.007
 builders+0.007
 farmers+0.007
Neg logit contributions
rated-0.010
aned-0.009
ently-0.008
err-0.008
rily-0.008
Tokenberries
Token ID20853
Feature activation-1.27e+00

Pos logit contributions
 who+0.027
 kings+0.025
 detectives+0.024
 builders+0.024
 farmers+0.023
Neg logit contributions
rated-0.050
aned-0.044
ently-0.041
err-0.039
rily-0.038
Token?"
Token ID1701
Feature activation-5.19e-01

Pos logit contributions
 who+0.043
 detectives+0.041
 kings+0.040
 builders+0.040
 farmers+0.038
Neg logit contributions
rated-0.067
aned-0.058
ently-0.055
err-0.053
rily-0.050
Token
Token ID198
Feature activation-2.55e+00

Pos logit contributions
 who+0.015
 detectives+0.014
 builders+0.014
 kings+0.014
 farmers+0.013
Neg logit contributions
rated-0.031
aned-0.027
ently-0.025
err-0.024
rily-0.024
Token
Token ID198
Feature activation-2.70e+00

Pos logit contributions
 who+0.062
 detectives+0.057
 kings+0.055
 builders+0.055
 spies+0.052
Neg logit contributions
rated-0.163
aned-0.142
ently-0.136
err-0.128
rily-0.128
Token"
Token ID1
Feature activation-1.14e+01

Pos logit contributions
 detectives+0.079
 builders+0.078
 kings+0.077
 who+0.075
 spies+0.073
Neg logit contributions
rated-0.149
aned-0.130
ently-0.125
err-0.121
rily-0.117
TokenYes
Token ID5297
Feature activation+5.60e+00

Pos logit contributions
 detectives+0.348
 kings+0.339
 builders+0.331
 spies+0.320
 superheroes+0.309
Neg logit contributions
rated-0.616
aned-0.534
err-0.509
ently-0.502
 sensation-0.484
Token,
Token ID11
Feature activation-4.90e+00

Pos logit contributions
rated+0.366
aned+0.326
err+0.312
ently+0.304
 sensation+0.296
Neg logit contributions
 who-0.138
 kings-0.115
 builders-0.112
 detectives-0.109
 police-0.105
Token please
Token ID3387
Feature activation-3.43e-03

Pos logit contributions
 detectives+0.154
 builders+0.150
 kings+0.149
 who+0.147
 spies+0.141
Neg logit contributions
rated-0.259
aned-0.222
ently-0.213
rily-0.204
err-0.203
Token!"
Token ID2474
Feature activation-1.05e+00

Pos logit contributions
 who+0.000
 detectives+0.000
 builders+0.000
 kings+0.000
 spies+0.000
Neg logit contributions
rated-0.000
aned-0.000
ently-0.000
err-0.000
rily-0.000
Token Lily
Token ID20037
Feature activation+3.31e-01

Pos logit contributions
 detectives+0.033
 who+0.033
 builders+0.032
 kings+0.032
 spies+0.031
Neg logit contributions
rated-0.057
aned-0.049
ently-0.047
err-0.044
rily-0.044
TokenL
Token ID43
Feature activation+7.48e-01

Pos logit contributions
rated+0.077
aned+0.064
ently+0.061
err+0.058
rily+0.056
Neg logit contributions
 builders-0.077
 detectives-0.077
 kings-0.076
 who-0.076
 farmers-0.074
Tokenincoln
Token ID11690
Feature activation+3.05e+00

Pos logit contributions
rated+0.041
aned+0.035
ently+0.032
rily+0.032
err+0.031
Neg logit contributions
 who-0.025
 kings-0.023
 builders-0.023
 detectives-0.023
 farmers-0.022
Token".
Token ID1911
Feature activation+6.29e-01

Pos logit contributions
rated+0.165
aned+0.143
ently+0.133
err+0.132
rily+0.126
Neg logit contributions
 who-0.102
 kings-0.092
 detectives-0.092
 builders-0.090
 farmers-0.088
Token
Token ID198
Feature activation-2.66e+00

Pos logit contributions
rated+0.036
aned+0.031
ently+0.029
err+0.027
rily+0.027
Neg logit contributions
 who-0.020
 detectives-0.019
 builders-0.018
 kings-0.018
 spies-0.018
Token
Token ID198
Feature activation-2.81e+00

Pos logit contributions
 who+0.067
 detectives+0.064
 kings+0.062
 builders+0.062
 spies+0.059
Neg logit contributions
rated-0.165
aned-0.144
ently-0.138
err-0.131
rily-0.130
Token"
Token ID1
Feature activation-1.14e+01

Pos logit contributions
 detectives+0.083
 kings+0.081
 builders+0.081
 who+0.080
 superheroes+0.076
Neg logit contributions
rated-0.156
aned-0.136
ently-0.130
err-0.126
rily-0.120
TokenWho
Token ID8241
Feature activation-2.96e+00

Pos logit contributions
 detectives+0.385
 kings+0.379
 who+0.369
 builders+0.369
 spies+0.355
Neg logit contributions
rated-0.593
aned-0.501
err-0.477
ently-0.467
 sensation-0.447
Token is
Token ID318
Feature activation-7.15e-01

Pos logit contributions
 detectives+0.104
 builders+0.102
 farmers+0.098
 who+0.097
 kings+0.097
Neg logit contributions
rated-0.143
aned-0.116
ently-0.115
err-0.108
night-0.105
Token Lincoln
Token ID12406
Feature activation-3.29e+00

Pos logit contributions
 kings+0.021
 who+0.021
 detectives+0.020
 captains+0.020
 builders+0.020
Neg logit contributions
rated-0.034
aned-0.028
ently-0.028
night-0.027
 hurts-0.026
Token,
Token ID11
Feature activation-4.83e+00

Pos logit contributions
 who+0.116
 detectives+0.116
 builders+0.113
 kings+0.112
 spies+0.108
Neg logit contributions
rated-0.166
aned-0.141
ently-0.134
err-0.126
rily-0.124
Token Ben
Token ID3932
Feature activation+2.50e-01

Pos logit contributions
 detectives+0.144
 builders+0.139
 kings+0.136
 who+0.134
 farmers+0.133
Neg logit contributions
rated-0.250
aned-0.210
ently-0.208
rily-0.199
err-0.198
Token they
Token ID484
Feature activation-2.80e+00

Pos logit contributions
 who+0.072
 detectives+0.071
 builders+0.068
 kings+0.067
 spies+0.062
Neg logit contributions
rated-0.266
aned-0.237
ently-0.228
err-0.219
rily-0.216
Token decided
Token ID3066
Feature activation-1.32e+00

Pos logit contributions
 farmers+0.101
 detectives+0.100
 builders+0.099
 veterinarian+0.098
 vet+0.098
Neg logit contributions
rated-0.121
angled-0.087
aned-0.086
ied-0.085
ened-0.085
Token to
Token ID284
Feature activation-3.10e-01

Pos logit contributions
 detectives+0.048
 builders+0.048
 kings+0.048
 who+0.046
 farmers+0.046
Neg logit contributions
rated-0.060
aned-0.050
ently-0.049
night-0.045
rily-0.044
Token be
Token ID307
Feature activation-2.54e+00

Pos logit contributions
 detectives+0.010
 builders+0.010
 spies+0.010
 who+0.009
 farmers+0.009
Neg logit contributions
rated-0.015
aned-0.013
raw-0.012
ingle-0.012
night-0.012
Token mail
Token ID6920
Feature activation-5.96e+00

Pos logit contributions
Playing+0.071
 who+0.070
 veterinarian+0.066
 peac+0.065
 judge+0.065
Neg logit contributions
rated-0.118
 sensation-0.096
aned-0.095
err-0.094
night-0.089
Token carriers
Token ID16651
Feature activation-1.14e+01

Pos logit contributions
 detectives+0.193
 who+0.193
 builders+0.188
 kings+0.187
 farmers+0.179
Neg logit contributions
rated-0.312
aned-0.265
ently-0.257
err-0.244
rily-0.239
Token and
Token ID290
Feature activation+1.23e+00

Pos logit contributions
 detectives+0.448
 builders+0.444
 farmers+0.418
 kings+0.416
 hardware+0.414
Neg logit contributions
rated-0.474
ently-0.386
aned-0.372
rily-0.365
err-0.352
Token deliver
Token ID5203
Feature activation-5.92e+00

Pos logit contributions
rated+0.056
 off+0.045
night+0.044
 forehead+0.041
aned+0.040
Neg logit contributions
Playing-0.039
years-0.033
 farmers-0.031
Did-0.031
 detectives-0.031
Token letters
Token ID7475
Feature activation-4.20e+00

Pos logit contributions
 detectives+0.197
 who+0.190
 builders+0.189
 kings+0.187
 spies+0.181
Neg logit contributions
rated-0.293
ently-0.242
aned-0.241
rily-0.227
night-0.226
Token to
Token ID284
Feature activation-3.12e-01

Pos logit contributions
 detectives+0.161
 builders+0.155
 kings+0.155
 who+0.153
 farmers+0.151
Neg logit contributions
rated-0.185
aned-0.155
ently-0.153
rily-0.143
err-0.137
Token their
Token ID511
Feature activation+3.08e-02

Pos logit contributions
 detectives+0.010
Playing+0.010
 builders+0.010
 kings+0.009
Did+0.009
Neg logit contributions
rated-0.014
ently-0.012
rily-0.011
aned-0.011
err-0.011
Token his
Token ID465
Feature activation+2.01e+00

Pos logit contributions
rated+0.060
aned+0.052
ently+0.048
err+0.047
rily+0.045
Neg logit contributions
 who-0.045
 detectives-0.042
 kings-0.042
 builders-0.041
 superheroes-0.040
Token dad
Token ID9955
Feature activation-7.00e+00

Pos logit contributions
rated+0.109
aned+0.094
ently+0.089
err+0.084
rily+0.082
Neg logit contributions
 who-0.066
 detectives-0.064
 kings-0.063
 builders-0.062
 spies-0.060
Token.
Token ID13
Feature activation+1.70e-01

Pos logit contributions
 detectives+0.234
 builders+0.228
 kings+0.223
 who+0.223
 spies+0.219
Neg logit contributions
rated-0.366
aned-0.313
ently-0.301
err-0.278
rily-0.276
Token
Token ID198
Feature activation-2.63e+00

Pos logit contributions
rated+0.008
aned+0.007
ently+0.007
err+0.006
rily+0.006
Neg logit contributions
 detectives-0.006
 who-0.006
 builders-0.006
 kings-0.006
 spies-0.006
Token
Token ID198
Feature activation-2.71e+00

Pos logit contributions
 detectives+0.081
 who+0.079
 builders+0.079
 kings+0.078
 spies+0.074
Neg logit contributions
rated-0.144
aned-0.124
ently-0.119
err-0.114
rily-0.111
Token"
Token ID1
Feature activation-1.13e+01

Pos logit contributions
 detectives+0.080
 builders+0.078
 kings+0.078
 who+0.076
 superheroes+0.074
Neg logit contributions
rated-0.148
aned-0.130
ently-0.124
err-0.121
rily-0.115
TokenDad
Token ID46270
Feature activation-2.52e+00

Pos logit contributions
 detectives+0.337
 kings+0.328
 builders+0.327
 who+0.322
 superheroes+0.305
Neg logit contributions
rated-0.629
aned-0.539
ently-0.514
err-0.513
rily-0.490
Token,
Token ID11
Feature activation-5.19e+00

Pos logit contributions
 detectives+0.085
 builders+0.082
 who+0.082
 kings+0.081
 spies+0.080
Neg logit contributions
rated-0.131
aned-0.111
ently-0.106
err-0.098
night-0.097
Token Dad
Token ID17415
Feature activation-3.67e+00

Pos logit contributions
 detectives+0.148
 builders+0.144
 kings+0.143
 farmers+0.138
 superheroes+0.132
Neg logit contributions
rated-0.283
aned-0.234
ently-0.229
err-0.221
rily-0.221
Token!
Token ID0
Feature activation+8.89e-01

Pos logit contributions
 who+0.125
 detectives+0.124
 builders+0.121
 kings+0.120
 spies+0.116
Neg logit contributions
rated-0.191
aned-0.163
ently-0.156
err-0.147
rily-0.144
Token What
Token ID1867
Feature activation-3.02e+00

Pos logit contributions
rated+0.047
aned+0.040
ently+0.039
err+0.037
rily+0.036
Neg logit contributions
 detectives-0.028
 who-0.028
 kings-0.027
 builders-0.027
 farmers-0.026
Token at
Token ID379
Feature activation+2.45e+00

Pos logit contributions
 who+0.143
 detectives+0.142
 builders+0.139
 kings+0.138
 spies+0.133
Neg logit contributions
rated-0.219
aned-0.188
ently-0.179
err-0.169
rily-0.166
Token her
Token ID607
Feature activation+1.96e+00

Pos logit contributions
rated+0.131
aned+0.112
ently+0.108
err+0.102
rily+0.101
Neg logit contributions
 detectives-0.077
 builders-0.074
 who-0.074
 kings-0.073
 spies-0.071
Token.
Token ID13
Feature activation+3.99e-01

Pos logit contributions
rated+0.108
aned+0.093
ently+0.089
err+0.085
rily+0.083
Neg logit contributions
 who-0.060
 detectives-0.060
 builders-0.058
 kings-0.058
 spies-0.055
Token
Token ID198
Feature activation-2.55e+00

Pos logit contributions
rated+0.022
aned+0.019
ently+0.018
err+0.017
rily+0.017
Neg logit contributions
 who-0.012
 detectives-0.012
 builders-0.012
 kings-0.012
 spies-0.011
Token
Token ID198
Feature activation-2.65e+00

Pos logit contributions
 who+0.072
 detectives+0.071
 builders+0.069
 kings+0.069
 spies+0.066
Neg logit contributions
rated-0.148
aned-0.129
ently-0.123
err-0.117
rily-0.115
Token"
Token ID1
Feature activation-1.13e+01

Pos logit contributions
 detectives+0.079
 builders+0.078
 kings+0.077
 who+0.074
 superheroes+0.073
Neg logit contributions
rated-0.145
aned-0.127
ently-0.122
err-0.119
rily-0.112
TokenThank
Token ID10449
Feature activation-1.78e+00

Pos logit contributions
 detectives+0.358
 kings+0.350
 builders+0.349
 who+0.338
 spies+0.326
Neg logit contributions
rated-0.609
aned-0.520
err-0.495
ently-0.492
rily-0.466
Token you
Token ID345
Feature activation+1.51e+00

Pos logit contributions
 detectives+0.057
 who+0.056
 builders+0.055
 kings+0.055
 spies+0.053
Neg logit contributions
rated-0.095
aned-0.082
ently-0.079
err-0.074
rily-0.073
Token,
Token ID11
Feature activation-4.81e+00

Pos logit contributions
rated+0.082
aned+0.071
ently+0.067
err+0.064
rily+0.062
Neg logit contributions
 who-0.049
 detectives-0.048
 builders-0.047
 kings-0.047
 spies-0.045
Token z
Token ID1976
Feature activation+3.80e+00

Pos logit contributions
 detectives+0.167
 builders+0.165
 who+0.160
 kings+0.160
 farmers+0.155
Neg logit contributions
rated-0.235
aned-0.196
ently-0.190
err-0.185
rily-0.183
Tokenebra
Token ID37052
Feature activation-6.64e-01

Pos logit contributions
rated+0.167
aned+0.130
err+0.114
ently+0.114
rily+0.111
Neg logit contributions
 who-0.173
 kings-0.171
 detectives-0.171
 builders-0.171
 spies-0.164
Token€
Token ID26391
Feature activation+1.18e+01

Pos logit contributions
rated+0.175
aned+0.145
ently+0.138
rily+0.133
err+0.128
Neg logit contributions
 who-0.153
 detectives-0.149
 builders-0.146
 kings-0.145
 spies-0.140
Token�
Token ID129
Feature activation+5.33e+00

Pos logit contributions
rated+0.505
aned+0.441
ently+0.420
err+0.378
ingle+0.363
Neg logit contributions
 detectives-0.467
 kings-0.456
 builders-0.454
 farmers-0.444
 superheroes-0.442
Token�
Token ID241
Feature activation-7.29e+00

Pos logit contributions
rated+0.257
ently+0.204
aned+0.202
rily+0.195
err+0.192
Neg logit contributions
 who-0.201
 detectives-0.193
 farmers-0.187
 builders-0.187
 kings-0.187
TokenLet
Token ID5756
Feature activation+1.96e+00

Pos logit contributions
 who+0.194
 detectives+0.179
 builders+0.176
 kings+0.174
 spies+0.164
Neg logit contributions
rated-0.447
aned-0.393
ently-0.376
rily-0.357
err-0.352
Tokenâ
Token ID22940
Feature activation+8.63e+00

Pos logit contributions
rated+0.128
aned+0.114
 sensation+0.110
ently+0.109
rily+0.108
Neg logit contributions
 who-0.048
 see-0.041
 store-0.040
 police-0.038
We-0.036
Token€
Token ID26391
Feature activation+1.18e+01

Pos logit contributions
rated+0.334
rily+0.294
aned+0.280
ently+0.264
err+0.257
Neg logit contributions
 who-0.390
 store-0.372
 see-0.366
 named-0.366
 judge-0.357
Token™
Token ID8151
Feature activation+1.74e+00

Pos logit contributions
rated+0.499
aned+0.408
ently+0.381
err+0.367
ingle+0.362
Neg logit contributions
 kings-0.465
 builders-0.460
 detectives-0.456
 who-0.443
 superheroes-0.433
Tokens
Token ID82
Feature activation-1.13e+00

Pos logit contributions
rated+0.097
aned+0.079
 sensation+0.075
ently+0.074
err+0.073
Neg logit contributions
 who-0.061
 named-0.053
 kings-0.051
 builders-0.051
 store-0.050
Token find
Token ID1064
Feature activation-6.34e+00

Pos logit contributions
 detectives+0.040
 spies+0.038
 kings+0.038
 builders+0.038
 superheroes+0.038
Neg logit contributions
rated-0.056
aned-0.047
ently-0.045
rily-0.041
err-0.041
Token out
Token ID503
Feature activation+1.67e+00

Pos logit contributions
 detectives+0.194
 builders+0.189
 kings+0.187
 who+0.187
 farmers+0.178
Neg logit contributions
rated-0.344
aned-0.297
ently-0.289
err-0.272
rily-0.272
Token.
Token ID13
Feature activation+1.81e+00

Pos logit contributions
rated+0.096
aned+0.083
ently+0.080
err+0.075
rily+0.075
Neg logit contributions
 detectives-0.047
 who-0.046
 builders-0.046
 kings-0.046
 spies-0.044
Token€
Token ID26391
Feature activation+1.19e+01

Pos logit contributions
rated+0.171
aned+0.142
ently+0.134
rily+0.126
err+0.123
Neg logit contributions
 who-0.166
 detectives-0.162
 builders-0.158
 kings-0.157
 farmers-0.152
Token˜
Token ID41185
Feature activation+1.46e+00

Pos logit contributions
rated+0.498
aned+0.426
ently+0.400
err+0.372
ingle+0.360
Neg logit contributions
 detectives-0.494
 builders-0.487
 kings-0.480
 farmers-0.471
 spies-0.467
Tokenthank
Token ID40716
Feature activation-1.42e+00

Pos logit contributions
rated+0.073
aned+0.063
ently+0.059
err+0.056
rily+0.054
Neg logit contributions
 who-0.054
 detectives-0.053
 builders-0.051
 kings-0.051
 spies-0.049
Token you
Token ID345
Feature activation+2.17e+00

Pos logit contributions
 who+0.043
 detectives+0.043
 builders+0.041
 kings+0.041
 spies+0.039
Neg logit contributions
rated-0.080
aned-0.069
ently-0.066
err-0.063
rily-0.062
Tokenâ
Token ID22940
Feature activation+8.88e+00

Pos logit contributions
rated+0.110
aned+0.095
ently+0.089
err+0.084
rily+0.083
Neg logit contributions
 who-0.080
 detectives-0.076
 kings-0.074
 builders-0.074
 farmers-0.071
Token€
Token ID26391
Feature activation+1.19e+01

Pos logit contributions
rily+0.285
rated+0.274
aned+0.261
ently+0.241
ulp+0.220
Neg logit contributions
 who-0.437
 named-0.423
 store-0.404
 judge-0.404
 see-0.385
Token™
Token ID8151
Feature activation+1.74e+00

Pos logit contributions
rated+0.559
aned+0.479
ently+0.457
err+0.431
ingle+0.413
Neg logit contributions
 detectives-0.440
 builders-0.438
 kings-0.430
 who-0.418
 farmers-0.416
Token.
Token ID13
Feature activation+1.55e+00

Pos logit contributions
rated+0.091
aned+0.075
ently+0.069
err+0.067
 sensation+0.067
Neg logit contributions
 who-0.065
 kings-0.058
 builders-0.058
 detectives-0.057
 named-0.056
Token One
Token ID1881
Feature activation-2.87e+00

Pos logit contributions
rated+0.080
aned+0.068
ently+0.065
err+0.062
rily+0.061
Neg logit contributions
 detectives-0.052
 who-0.052
 builders-0.051
 kings-0.051
 spies-0.049
Token day
Token ID1110
Feature activation+6.02e+00

Pos logit contributions
 who+0.107
 store+0.096
 named+0.094
 farmers+0.090
 living+0.089
Neg logit contributions
aned-0.148
rated-0.144
rily-0.141
ently-0.134
iously-0.118
Token she
Token ID673
Feature activation-1.99e+00

Pos logit contributions
rated+0.351
aned+0.309
ently+0.293
err+0.280
rily+0.277
Neg logit contributions
 who-0.175
 detectives-0.169
 builders-0.163
 kings-0.163
 spies-0.155
Token for
Token ID329
Feature activation-5.69e-01

Pos logit contributions
rated+0.143
aned+0.117
ently+0.114
err+0.108
night+0.104
Neg logit contributions
 detectives-0.108
 kings-0.106
 builders-0.105
 who-0.103
 spies-0.101
Token an
Token ID281
Feature activation-7.04e-02

Pos logit contributions
 detectives+0.017
 builders+0.016
 kings+0.016
 spies+0.015
 farmers+0.015
Neg logit contributions
rated-0.028
aned-0.025
ently-0.024
night-0.023
err-0.022
Token operation
Token ID4905
Feature activation+1.22e+00

Pos logit contributions
 detectives+0.003
 who+0.003
 builders+0.003
 kings+0.003
 spies+0.003
Neg logit contributions
rated-0.003
aned-0.003
ently-0.003
err-0.002
rily-0.002
Token.
Token ID13
Feature activation+1.60e+00

Pos logit contributions
rated+0.057
aned+0.048
ently+0.045
err+0.042
rily+0.041
Neg logit contributions
 detectives-0.047
 who-0.047
 builders-0.046
 kings-0.046
 spies-0.044
Tokenâ
Token ID22940
Feature activation+8.61e+00

Pos logit contributions
rated+0.075
aned+0.062
ently+0.057
err+0.057
rily+0.055
Neg logit contributions
 kings-0.060
 detectives-0.059
 builders-0.059
 who-0.058
 farmers-0.056
Token€
Token ID26391
Feature activation+1.18e+01

Pos logit contributions
rated+0.297
rily+0.293
aned+0.264
ently+0.256
err+0.241
Neg logit contributions
 who-0.398
 named-0.377
 store-0.374
Playing-0.364
We-0.351
Token
Token ID198
Feature activation-2.11e+00

Pos logit contributions
rated+0.529
aned+0.437
ently+0.406
err+0.385
ingle+0.375
Neg logit contributions
 detectives-0.457
 kings-0.453
 builders-0.451
 farmers-0.436
 superheroes-0.427
Token
Token ID198
Feature activation-2.10e+00

Pos logit contributions
 who+0.053
 detectives+0.050
 builders+0.049
 kings+0.048
 spies+0.046
Neg logit contributions
rated-0.130
aned-0.114
ently-0.109
err-0.105
rily-0.103
TokenTim
Token ID14967
Feature activation+8.93e-01

Pos logit contributions
 detectives+0.076
 kings+0.074
 builders+0.072
 spies+0.071
 superheroes+0.070
Neg logit contributions
rated-0.108
aned-0.090
ently-0.082
err-0.082
appointed-0.077
Tokenmy
Token ID1820
Feature activation-1.36e+00

Pos logit contributions
rated+0.060
rily+0.058
aned+0.057
ently+0.056
 sensation+0.055
Neg logit contributions
 who-0.016
 named-0.010
 kings-0.010
 saw-0.010
 store-0.010
Token asked
Token ID1965
Feature activation-2.91e+00

Pos logit contributions
 who+0.041
 detectives+0.040
 builders+0.038
 kings+0.038
 spies+0.036
Neg logit contributions
rated-0.077
aned-0.067
ently-0.064
err-0.061
rily-0.061
Token€
Token ID26391
Feature activation+1.21e+01

Pos logit contributions
rated+0.193
aned+0.163
ently+0.158
rily+0.154
err+0.141
Neg logit contributions
 who-0.157
 detectives-0.149
 builders-0.149
 kings-0.144
 farmers-0.142
Token�
Token ID129
Feature activation+5.55e+00

Pos logit contributions
rated+0.574
aned+0.494
ently+0.465
err+0.437
ingle+0.414
Neg logit contributions
 detectives-0.448
 kings-0.437
 builders-0.434
 who-0.433
 spies-0.419
Token�
Token ID241
Feature activation-6.99e+00

Pos logit contributions
rated+0.284
ently+0.236
aned+0.226
rily+0.223
err+0.221
Neg logit contributions
 who-0.184
 farmers-0.175
 detectives-0.171
 kings-0.168
 builders-0.166
TokenThat
Token ID2504
Feature activation+5.29e+00

Pos logit contributions
 who+0.196
 detectives+0.177
 builders+0.174
 kings+0.172
 farmers+0.167
Neg logit contributions
rated-0.418
aned-0.373
ently-0.356
rily-0.332
err-0.327
Tokenâ
Token ID22940
Feature activation+8.91e+00

Pos logit contributions
rated+0.269
aned+0.229
ently+0.218
err+0.205
rily+0.202
Neg logit contributions
 who-0.187
 detectives-0.186
 builders-0.182
 kings-0.181
 spies-0.174
Token€
Token ID26391
Feature activation+1.21e+01

Pos logit contributions
rated+0.293
rily+0.293
aned+0.282
ently+0.245
 tant+0.242
Neg logit contributions
 who-0.422
 store-0.419
 named-0.397
 judge-0.393
We-0.387
Token™
Token ID8151
Feature activation+2.07e+00

Pos logit contributions
rated+0.613
aned+0.523
ently+0.498
err+0.474
rily+0.457
Neg logit contributions
 detectives-0.409
 who-0.406
 kings-0.401
 builders-0.400
 spies-0.382
Tokens
Token ID82
Feature activation-9.07e-01

Pos logit contributions
rated+0.114
aned+0.095
 sensation+0.090
rily+0.088
err+0.088
Neg logit contributions
 who-0.070
 named-0.060
 kings-0.059
 builders-0.058
 store-0.057
Token so
Token ID523
Feature activation+2.65e+00

Pos logit contributions
 who+0.029
 detectives+0.028
 kings+0.028
 builders+0.028
 farmers+0.026
Neg logit contributions
rated-0.049
aned-0.043
ently-0.041
err-0.039
rily-0.038
Token cool
Token ID3608
Feature activation+4.81e-01

Pos logit contributions
rated+0.150
aned+0.129
ently+0.124
err+0.118
rily+0.115
Neg logit contributions
 detectives-0.078
 who-0.077
 builders-0.076
 kings-0.075
 spies-0.072
Token!
Token ID0
Feature activation+2.25e+00

Pos logit contributions
rated+0.025
aned+0.022
ently+0.021
err+0.020
rily+0.019
Neg logit contributions
 who-0.017
 detectives-0.016
 builders-0.015
 kings-0.015
 farmers-0.015
Token friends
Token ID2460
Feature activation-7.21e+00

Pos logit contributions
rated+0.149
aned+0.130
ently+0.127
err+0.123
night+0.118
Neg logit contributions
 detectives-0.061
 builders-0.059
 who-0.058
 kings-0.057
 spies-0.054
Token.
Token ID13
Feature activation+1.64e+00

Pos logit contributions
 detectives+0.230
 builders+0.225
 who+0.221
 kings+0.221
 spies+0.212
Neg logit contributions
rated-0.385
aned-0.328
ently-0.319
err-0.296
rily-0.293
Token
Token ID198
Feature activation-2.19e+00

Pos logit contributions
rated+0.092
aned+0.080
ently+0.077
err+0.074
rily+0.073
Neg logit contributions
 detectives-0.047
 kings-0.046
 who-0.046
 builders-0.046
 spies-0.043
Token
Token ID198
Feature activation-2.20e+00

Pos logit contributions
 detectives+0.062
 kings+0.061
 builders+0.061
 who+0.060
 spies+0.057
Neg logit contributions
rated-0.125
aned-0.110
ently-0.106
err-0.101
rily-0.098
Tokenâ
Token ID22940
Feature activation+8.59e+00

Pos logit contributions
 detectives+0.069
 kings+0.068
 builders+0.066
 spies+0.064
 superheroes+0.063
Neg logit contributions
rated-0.118
aned-0.105
ently-0.097
err-0.095
raw-0.091
Token€
Token ID26391
Feature activation+1.18e+01

Pos logit contributions
rated+0.280
rily+0.254
aned+0.227
ently+0.224
 tant+0.222
Neg logit contributions
 who-0.435
Playing-0.402
 judge-0.398
 named-0.397
 store-0.395
Token�
Token ID129
Feature activation+5.32e+00

Pos logit contributions
rated+0.517
aned+0.435
ently+0.391
err+0.375
night+0.366
Neg logit contributions
 kings-0.462
 builders-0.457
 detectives-0.456
 who-0.435
 farmers-0.434
Token�
Token ID241
Feature activation-7.23e+00

Pos logit contributions
rated+0.254
ently+0.206
aned+0.204
err+0.192
rily+0.187
Neg logit contributions
 who-0.203
 detectives-0.198
 farmers-0.191
 kings-0.190
 builders-0.190
TokenI
Token ID40
Feature activation-2.60e+00

Pos logit contributions
 who+0.178
 detectives+0.160
 builders+0.153
 kings+0.151
 spies+0.146
Neg logit contributions
rated-0.456
aned-0.408
ently-0.392
rily-0.371
err-0.367
Token don
Token ID836
Feature activation+5.47e+00

Pos logit contributions
 who+0.069
 detectives+0.065
 kings+0.062
 builders+0.061
 spies+0.059
Neg logit contributions
rated-0.156
aned-0.139
ently-0.135
rily-0.128
err-0.125
Tokenâ
Token ID22940
Feature activation+8.55e+00

Pos logit contributions
rated+0.261
aned+0.234
ently+0.219
err+0.200
ingle+0.196
Neg logit contributions
 detectives-0.203
 kings-0.201
 who-0.196
 builders-0.194
 spies-0.189
Token helped
Token ID4193
Feature activation-1.05e+00

Pos logit contributions
 detectives+0.020
 builders+0.020
 who+0.020
 kings+0.019
 farmers+0.019
Neg logit contributions
rated-0.038
aned-0.034
ently-0.032
err-0.030
night-0.030
Token stop
Token ID2245
Feature activation+3.09e+00

Pos logit contributions
 who+0.033
 detectives+0.033
 builders+0.032
 kings+0.032
 spies+0.031
Neg logit contributions
rated-0.057
aned-0.049
ently-0.047
err-0.045
rily-0.044
Token the
Token ID262
Feature activation+1.41e+00

Pos logit contributions
rated+0.165
aned+0.145
ently+0.139
rily+0.130
err+0.130
Neg logit contributions
 detectives-0.092
 builders-0.089
 kings-0.086
 who-0.085
 spies-0.083
Token thief
Token ID25906
Feature activation-4.27e+00

Pos logit contributions
rated+0.082
aned+0.068
err+0.066
ently+0.066
 sensation+0.063
Neg logit contributions
 who-0.038
 detectives-0.037
 kings-0.035
 builders-0.034
 spies-0.033
Token.
Token ID13
Feature activation+1.89e+00

Pos logit contributions
 detectives+0.155
 who+0.155
 builders+0.152
 kings+0.151
 spies+0.145
Neg logit contributions
rated-0.211
aned-0.179
ently-0.170
err-0.160
rily-0.157
Token From
Token ID3574
Feature activation+5.74e+00

Pos logit contributions
rated+0.081
aned+0.068
ently+0.064
err+0.060
rily+0.058
Neg logit contributions
 detectives-0.080
 builders-0.080
 kings-0.079
 who-0.079
 spies-0.076
Token that
Token ID326
Feature activation-4.26e+00

Pos logit contributions
rated+0.338
aned+0.302
ently+0.294
night+0.277
err+0.270
Neg logit contributions
 detectives-0.138
 kings-0.134
 builders-0.133
 superheroes-0.123
 farmers-0.123
Token day
Token ID1110
Feature activation+5.96e+00

Pos logit contributions
 who+0.193
 detectives+0.189
 builders+0.186
 kings+0.186
 farmers+0.180
Neg logit contributions
rated-0.177
aned-0.147
ently-0.137
rily-0.126
err-0.126
Token on
Token ID319
Feature activation+7.30e+00

Pos logit contributions
rated+0.288
aned+0.241
ently+0.227
err+0.217
rily+0.214
Neg logit contributions
 who-0.230
 detectives-0.227
 builders-0.223
 kings-0.220
 spies-0.214
Token,
Token ID11
Feature activation-3.77e+00

Pos logit contributions
rated+0.397
aned+0.343
ently+0.331
err+0.309
rily+0.304
Neg logit contributions
 detectives-0.229
 who-0.229
 builders-0.223
 kings-0.223
 spies-0.213
Token Blue
Token ID4518
Feature activation-2.18e-01

Pos logit contributions
 detectives+0.132
 builders+0.131
 kings+0.126
 spies+0.123
 farmers+0.120
Neg logit contributions
rated-0.179
ently-0.143
aned-0.143
err-0.134
rily-0.132
Token felt
Token ID2936
Feature activation-1.59e+00

Pos logit contributions
 who+0.018
 detectives+0.016
 kings+0.016
 builders+0.016
 farmers+0.015
Neg logit contributions
rated-0.033
aned-0.028
ently-0.027
err-0.026
rily-0.026
Token sad
Token ID6507
Feature activation+1.69e+00

Pos logit contributions
 builders+0.062
 detectives+0.061
 kings+0.060
 who+0.058
 farmers+0.057
Neg logit contributions
rated-0.074
aned-0.062
ently-0.055
err-0.054
 sensation-0.054
Token,
Token ID11
Feature activation-3.91e+00

Pos logit contributions
rated+0.093
aned+0.080
ently+0.075
err+0.072
rily+0.071
Neg logit contributions
 who-0.056
 detectives-0.053
 builders-0.052
 kings-0.051
 spies-0.050
Token but
Token ID475
Feature activation+3.71e+00

Pos logit contributions
 who+0.120
 detectives+0.104
 kings+0.100
 spies+0.099
 farmers+0.098
Neg logit contributions
rated-0.229
aned-0.201
ently-0.195
err-0.186
raw-0.184
Token she
Token ID673
Feature activation-2.25e+00

Pos logit contributions
rated+0.243
aned+0.215
ently+0.207
err+0.199
rily+0.196
Neg logit contributions
 who-0.078
 detectives-0.077
 builders-0.075
 kings-0.074
 spies-0.069
Token didn
Token ID1422
Feature activation+6.29e+00

Pos logit contributions
 builders+0.075
 detectives+0.073
 kings+0.069
 spies+0.068
 farmers+0.067
Neg logit contributions
rated-0.118
aned-0.102
err-0.093
ently-0.093
night-0.091
Token't
Token ID470
Feature activation-3.85e-02

Pos logit contributions
rated+0.362
aned+0.308
ently+0.295
err+0.282
rily+0.278
Neg logit contributions
 who-0.183
 detectives-0.181
 builders-0.176
 kings-0.173
 spies-0.167
Token let
Token ID1309
Feature activation+2.57e+00

Pos logit contributions
 detectives+0.001
 who+0.001
 builders+0.001
 kings+0.001
 farmers+0.001
Neg logit contributions
rated-0.002
aned-0.002
ently-0.002
err-0.002
night-0.002
Token the
Token ID262
Feature activation+1.18e+00

Pos logit contributions
rated+0.157
aned+0.140
ently+0.133
err+0.126
rily+0.126
Neg logit contributions
 who-0.069
 detectives-0.064
 kings-0.063
 farmers-0.062
 see-0.061
Token ducks
Token ID39694
Feature activation-4.49e+00

Pos logit contributions
rated+0.073
aned+0.065
ently+0.062
err+0.059
rily+0.058
Neg logit contributions
 who-0.029
 detectives-0.028
 builders-0.027
 kings-0.027
 spies-0.026
Token bring
Token ID2222
Feature activation-2.92e+00

Pos logit contributions
 who+0.134
 detectives+0.133
 builders+0.130
 kings+0.129
 spies+0.123
Neg logit contributions
rated-0.251
aned-0.217
ently-0.208
err-0.198
rily-0.194
Token sky
Token ID6766
Feature activation+2.21e+00

Pos logit contributions
rated+0.076
aned+0.067
ently+0.064
rily+0.059
err+0.059
Neg logit contributions
 who-0.046
 detectives-0.045
 kings-0.044
 builders-0.044
 farmers-0.042
Token.
Token ID13
Feature activation+1.51e+00

Pos logit contributions
rated+0.107
aned+0.091
ently+0.085
err+0.081
rily+0.078
Neg logit contributions
 who-0.085
 detectives-0.082
 kings-0.081
 builders-0.080
 farmers-0.078
Token
Token ID198
Feature activation-2.12e+00

Pos logit contributions
rated+0.082
aned+0.071
ently+0.068
err+0.064
rily+0.063
Neg logit contributions
 who-0.048
 detectives-0.048
 builders-0.046
 kings-0.046
 spies-0.044
Token
Token ID198
Feature activation-2.20e+00

Pos logit contributions
 detectives+0.069
 builders+0.067
 kings+0.066
 who+0.064
 spies+0.063
Neg logit contributions
rated-0.112
aned-0.099
ently-0.096
err-0.092
rily-0.088
TokenOne
Token ID3198
Feature activation-3.46e+00

Pos logit contributions
 detectives+0.071
 builders+0.069
 kings+0.067
 spies+0.067
 superheroes+0.065
Neg logit contributions
rated-0.114
aned-0.100
ently-0.098
err-0.096
right-0.088
Token day
Token ID1110
Feature activation+5.75e+00

Pos logit contributions
 who+0.141
 kings+0.124
 farmers+0.123
 named+0.122
 builders+0.120
Neg logit contributions
rated-0.165
aned-0.159
ently-0.143
rily-0.138
rum-0.127
Token,
Token ID11
Feature activation-4.25e+00

Pos logit contributions
rated+0.307
aned+0.264
ently+0.255
err+0.240
rily+0.236
Neg logit contributions
 detectives-0.181
 who-0.180
 builders-0.177
 kings-0.175
 spies-0.168
Token a
Token ID257
Feature activation-3.91e+00

Pos logit contributions
 detectives+0.135
 who+0.133
 builders+0.132
 kings+0.131
 spies+0.125
Neg logit contributions
rated-0.226
aned-0.193
ently-0.186
err-0.177
rily-0.175
Token three
Token ID1115
Feature activation-4.96e+00

Pos logit contributions
 detectives+0.136
 kings+0.135
 builders+0.132
 spies+0.128
 who+0.124
Neg logit contributions
rated-0.178
night-0.142
 sensation-0.140
 cheek-0.139
err-0.138
Token year
Token ID614
Feature activation+5.59e+00

Pos logit contributions
 detectives+0.243
 who+0.240
 builders+0.238
 kings+0.237
 spies+0.232
Neg logit contributions
rated-0.184
aned-0.146
ently-0.136
err-0.124
rily-0.120
Token old
Token ID1468
Feature activation-5.06e+00

Pos logit contributions
rated+0.290
aned+0.246
ently+0.240
err+0.222
rily+0.221
Neg logit contributions
 detectives-0.185
 who-0.183
 kings-0.178
 builders-0.177
 spies-0.173
Token cream
Token ID8566
Feature activation+1.95e+00

Pos logit contributions
rated+0.066
aned+0.057
ently+0.054
err+0.051
rily+0.050
Neg logit contributions
 who-0.046
 detectives-0.045
 builders-0.044
 kings-0.044
 spies-0.042
Token was
Token ID373
Feature activation+6.90e-01

Pos logit contributions
rated+0.102
aned+0.087
ently+0.083
err+0.079
rily+0.077
Neg logit contributions
 who-0.066
 detectives-0.065
 builders-0.064
 kings-0.064
 spies-0.061
Token clean
Token ID3424
Feature activation+1.44e+00

Pos logit contributions
rated+0.039
aned+0.034
ently+0.033
err+0.032
rily+0.030
Neg logit contributions
 who-0.020
 detectives-0.020
 builders-0.020
 kings-0.020
 spies-0.018
Token.
Token ID13
Feature activation+1.80e+00

Pos logit contributions
rated+0.073
aned+0.065
ently+0.060
err+0.059
raw+0.056
Neg logit contributions
 who-0.051
 kings-0.047
 builders-0.047
 detectives-0.047
 store-0.045
Token But
Token ID887
Feature activation+4.97e+00

Pos logit contributions
rated+0.092
aned+0.077
ently+0.073
err+0.067
rily+0.067
Neg logit contributions
 who-0.067
 detectives-0.065
 builders-0.064
 kings-0.063
 farmers-0.061
Token it
Token ID340
Feature activation+3.77e+00

Pos logit contributions
rated+0.288
aned+0.251
ently+0.239
err+0.228
rily+0.224
Neg logit contributions
 who-0.144
 detectives-0.141
 builders-0.137
 kings-0.136
 spies-0.130
Token still
Token ID991
Feature activation+3.47e+00

Pos logit contributions
rated+0.205
aned+0.176
ently+0.171
rily+0.159
err+0.159
Neg logit contributions
 who-0.122
 detectives-0.116
 kings-0.114
 builders-0.114
 farmers-0.108
Token smelled
Token ID37609
Feature activation-1.40e+00

Pos logit contributions
rated+0.179
aned+0.160
ently+0.156
err+0.149
rily+0.148
Neg logit contributions
 who-0.115
 detectives-0.107
 builders-0.106
 kings-0.105
 superheroes-0.102
Token sm
Token ID895
Feature activation+4.50e-01

Pos logit contributions
 who+0.038
 detectives+0.034
 kings+0.034
 builders+0.034
 police+0.033
Neg logit contributions
rated-0.081
aned-0.075
ently-0.071
err-0.069
rum-0.065
Tokenelly
Token ID6148
Feature activation+9.77e-01

Pos logit contributions
rated+0.021
aned+0.017
ently+0.016
rily+0.015
err+0.015
Neg logit contributions
 builders-0.018
 who-0.018
 detectives-0.018
 kings-0.018
 spies-0.017
Token!
Token ID0
Feature activation+2.13e+00

Pos logit contributions
rated+0.053
aned+0.046
ently+0.043
err+0.042
rily+0.039
Neg logit contributions
 who-0.033
 detectives-0.031
 builders-0.031
 kings-0.030
 farmers-0.029
TokenOnce
Token ID7454
Feature activation+4.46e+00

Pos logit contributions
rated+0.252
aned+0.224
ently+0.210
rily+0.203
err+0.197
Neg logit contributions
 who-0.071
 detectives-0.059
 kings-0.057
 store-0.055
 builders-0.055
Token upon
Token ID2402
Feature activation+5.62e+00

Pos logit contributions
rated+0.219
aned+0.189
ently+0.167
 sensation+0.163
err+0.162
Neg logit contributions
 who-0.181
 kings-0.163
 detectives-0.156
 see-0.155
 builders-0.154
Token a
Token ID257
Feature activation-3.91e+00

Pos logit contributions
rated+0.389
aned+0.349
ently+0.336
err+0.318
rily+0.317
Neg logit contributions
 who-0.105
 detectives-0.097
 kings-0.093
 builders-0.090
 spies-0.087
Token time
Token ID640
Feature activation+8.86e+00

Pos logit contributions
 who+0.206
 named+0.183
 kings+0.180
 detectives+0.179
 see+0.177
Neg logit contributions
rated-0.139
aned-0.123
ently-0.111
rily-0.105
rum-0.092
Token,
Token ID11
Feature activation-3.91e+00

Pos logit contributions
rated+0.525
aned+0.458
ently+0.434
err+0.414
rily+0.413
Neg logit contributions
 who-0.250
 detectives-0.241
 kings-0.233
 builders-0.232
 spies-0.222
Token in
Token ID287
Feature activation+5.38e+00

Pos logit contributions
 who+0.086
 detectives+0.066
 kings+0.064
 spies+0.060
 builders+0.059
Neg logit contributions
rated-0.267
aned-0.247
ently-0.234
err-0.220
night-0.218
Token a
Token ID257
Feature activation-3.84e+00

Pos logit contributions
rated+0.324
aned+0.280
ently+0.271
err+0.268
rily+0.262
Neg logit contributions
 detectives-0.130
 builders-0.129
 kings-0.124
 who-0.120
 farmers-0.116
Token small
Token ID1402
Feature activation-3.65e+00

Pos logit contributions
 builders+0.130
 detectives+0.125
 spies+0.117
 kings+0.117
 farmers+0.116
Neg logit contributions
rated-0.190
err-0.152
aned-0.145
night-0.141
ently-0.141
Token toy
Token ID13373
Feature activation-1.97e+00

Pos logit contributions
 detectives+0.102
 builders+0.098
 spies+0.095
Playing+0.094
 kings+0.091
Neg logit contributions
rated-0.190
err-0.156
night-0.152
rily-0.152
ently-0.152
Token room
Token ID2119
Feature activation+9.81e-01

Pos logit contributions
 builders+0.066
 detectives+0.066
 queens+0.062
 spies+0.061
 farmers+0.061
Neg logit contributions
rated-0.082
ently-0.078
aned-0.070
night-0.070
err-0.069
Token,
Token ID11
Feature activation-3.87e+00

Pos logit contributions
rated+0.048
aned+0.041
ently+0.041
err+0.037
night+0.037
Neg logit contributions
 detectives-0.035
 builders-0.034
 spies-0.033
 superheroes-0.032
 kings-0.032
Token tooth
Token ID16162
Feature activation+1.71e+00

Pos logit contributions
rated+0.146
aned+0.128
ently+0.116
err+0.110
rily+0.108
Neg logit contributions
 who-0.087
 detectives-0.082
 kings-0.082
 superheroes-0.078
 spies-0.077
Token.
Token ID13
Feature activation+1.67e+00

Pos logit contributions
rated+0.085
aned+0.073
ently+0.068
err+0.066
rily+0.062
Neg logit contributions
 who-0.064
 detectives-0.062
 kings-0.061
 builders-0.060
 spies-0.059
Token␣
Token ID220
Feature activation+6.58e+00

Pos logit contributions
rated+0.109
aned+0.096
ently+0.092
err+0.086
rily+0.086
Neg logit contributions
 who-0.042
 detectives-0.035
 store-0.034
 kings-0.033
 named-0.033
Token\n
Token ID198
Feature activation-2.04e+00

Pos logit contributions
rated+0.415
aned+0.361
ently+0.345
err+0.330
rily+0.327
Neg logit contributions
 who-0.160
 detectives-0.150
 kings-0.145
 builders-0.145
 spies-0.137
Token\n
Token ID198
Feature activation-2.07e+00

Pos logit contributions
 who+0.051
 detectives+0.049
 kings+0.047
 builders+0.047
 spies+0.044
Neg logit contributions
rated-0.126
aned-0.110
ently-0.106
err-0.101
rily-0.100
TokenL
Token ID43
Feature activation+9.27e-01

Pos logit contributions
 detectives+0.065
 builders+0.064
 who+0.064
 kings+0.063
 spies+0.061
Neg logit contributions
rated-0.115
aned-0.099
ently-0.094
err-0.090
rily-0.087
Tokenily
Token ID813
Feature activation+1.05e-01

Pos logit contributions
rated+0.056
aned+0.048
rily+0.044
ently+0.043
 spine+0.042
Neg logit contributions
 who-0.027
 detectives-0.024
 kings-0.024
 spies-0.023
 builders-0.022
Token's
Token ID338
Feature activation-1.38e+00

Pos logit contributions
rated+0.005
aned+0.005
ently+0.004
err+0.004
night+0.004
Neg logit contributions
 builders-0.004
 detectives-0.004
 kings-0.004
 spies-0.004
 farmers-0.003
Token mom
Token ID1995
Feature activation-4.53e+00

Pos logit contributions
 detectives+0.045
 builders+0.045
 who+0.044
 kings+0.043
 farmers+0.042
Neg logit contributions
rated-0.070
aned-0.060
ently-0.058
err-0.057
night-0.056
Token told
Token ID1297
Feature activation+2.44e-01

Pos logit contributions
 detectives+0.169
 spies+0.167
 builders+0.162
 charity+0.157
 superheroes+0.150
Neg logit contributions
rated-0.220
aned-0.195
ently-0.181
angled-0.165
ied-0.161
Token her
Token ID607
Feature activation+2.56e+00

Pos logit contributions
rated+0.015
aned+0.013
ently+0.013
err+0.012
rily+0.012
Neg logit contributions
 who-0.007
 detectives-0.006
 kings-0.006
 builders-0.006
 superheroes-0.006
Token␣
Token ID220
Feature activation+6.74e+00

Pos logit contributions
 detectives+0.049
 builders+0.048
 who+0.048
 kings+0.048
 spies+0.045
Neg logit contributions
rated-0.098
aned-0.086
ently-0.083
err-0.079
rily-0.077
Token\n
Token ID198
Feature activation-1.78e+00

Pos logit contributions
rated+0.409
aned+0.356
ently+0.342
err+0.326
rily+0.322
Neg logit contributions
 who-0.174
 detectives-0.170
 builders-0.165
 kings-0.163
 spies-0.156
TokenAmy
Token ID40797
Feature activation+4.38e+00

Pos logit contributions
 detectives+0.059
 builders+0.057
 kings+0.056
 spies+0.055
 who+0.054
Neg logit contributions
rated-0.091
aned-0.080
ently-0.076
err-0.076
rily-0.070
Token thanked
Token ID26280
Feature activation-3.82e-01

Pos logit contributions
rated+0.226
aned+0.195
ently+0.183
err+0.174
rily+0.170
Neg logit contributions
 detectives-0.151
 builders-0.147
 kings-0.144
 who-0.143
 spies-0.142
Token Manuel
Token ID25995
Feature activation+9.55e-01

Pos logit contributions
 who+0.012
 detectives+0.012
 builders+0.012
 kings+0.011
 spies+0.011
Neg logit contributions
rated-0.021
aned-0.018
ently-0.017
err-0.016
rily-0.016
Token,
Token ID11
Feature activation-3.72e+00

Pos logit contributions
rated+0.047
aned+0.039
ently+0.037
err+0.035
rily+0.034
Neg logit contributions
 who-0.036
 detectives-0.035
 builders-0.035
 kings-0.034
 spies-0.033
Token waved
Token ID26834
Feature activation+1.53e+00

Pos logit contributions
 detectives+0.116
 builders+0.113
 kings+0.108
 spies+0.106
 superheroes+0.101
Neg logit contributions
rated-0.200
aned-0.177
ently-0.168
rily-0.162
err-0.161
Token goodbye
Token ID24829
Feature activation+7.76e+00

Pos logit contributions
rated+0.072
aned+0.060
ently+0.056
err+0.053
rily+0.049
Neg logit contributions
 who-0.063
 detectives-0.061
 kings-0.060
 builders-0.059
 spies-0.058
Token and
Token ID290
Feature activation+1.61e+00

Pos logit contributions
rated+0.431
aned+0.372
ently+0.353
err+0.338
rily+0.326
Neg logit contributions
 who-0.250
 detectives-0.237
 builders-0.231
 kings-0.230
 farmers-0.221
Token left
Token ID1364
Feature activation-1.98e+00

Pos logit contributions
rated+0.089
aned+0.077
ently+0.073
err+0.069
rily+0.068
Neg logit contributions
 who-0.050
 detectives-0.050
 builders-0.048
 kings-0.048
 spies-0.046
Token the
Token ID262
Feature activation+1.47e+00

Pos logit contributions
 who+0.065
 detectives+0.063
 builders+0.061
 kings+0.061
 spies+0.059
Neg logit contributions
rated-0.109
aned-0.092
ently-0.087
err-0.084
rily-0.081
Token rabbit
Token ID22746
Feature activation-4.25e+00

Pos logit contributions
 detectives+0.045
 who+0.045
 builders+0.044
 kings+0.043
 spies+0.041
Neg logit contributions
rated-0.101
aned-0.088
ently-0.084
err-0.080
rily-0.079
Token hopped
Token ID45230
Feature activation+1.80e+00

Pos logit contributions
 who+0.128
 detectives+0.124
 kings+0.121
 builders+0.120
 spies+0.115
Neg logit contributions
rated-0.239
aned-0.208
ently-0.198
err-0.189
rily-0.186
Token out
Token ID503
Feature activation+9.36e-01

Pos logit contributions
rated+0.097
aned+0.083
ently+0.078
rum+0.072
err+0.071
Neg logit contributions
 who-0.064
 detectives-0.058
 named-0.057
 kings-0.056
 spies-0.055
Token and
Token ID290
Feature activation+6.55e-01

Pos logit contributions
rated+0.049
aned+0.042
ently+0.039
err+0.037
rily+0.035
Neg logit contributions
 who-0.033
 detectives-0.032
 kings-0.032
 builders-0.031
 spies-0.030
Token John
Token ID1757
Feature activation+2.48e+00

Pos logit contributions
rated+0.034
aned+0.029
ently+0.027
err+0.026
rily+0.026
Neg logit contributions
 builders-0.022
 detectives-0.022
 kings-0.021
 who-0.021
 spies-0.021
Token felt
Token ID2936
Feature activation-1.48e+00

Pos logit contributions
rated+0.131
aned+0.113
ently+0.106
err+0.101
rily+0.100
Neg logit contributions
 detectives-0.082
 builders-0.081
 who-0.079
 kings-0.079
 spies-0.077
Token so
Token ID523
Feature activation+2.53e+00

Pos logit contributions
 detectives+0.043
 builders+0.042
 who+0.042
 kings+0.041
 spies+0.040
Neg logit contributions
rated-0.084
aned-0.072
ently-0.069
err-0.066
rily-0.065
Token happy
Token ID3772
Feature activation-1.14e-02

Pos logit contributions
rated+0.155
aned+0.137
ently+0.132
err+0.126
rily+0.123
Neg logit contributions
 who-0.063
 detectives-0.060
 kings-0.058
 builders-0.057
 farmers-0.055
Token.
Token ID13
Feature activation+1.35e+00

Pos logit contributions
 who+0.000
 detectives+0.000
 builders+0.000
 kings+0.000
 spies+0.000
Neg logit contributions
rated-0.001
aned-0.001
ently-0.000
err-0.000
rily-0.000
Token He
Token ID679
Feature activation-1.84e-01

Pos logit contributions
rated+0.066
aned+0.056
ently+0.053
err+0.049
rily+0.049
Neg logit contributions
 who-0.051
 detectives-0.050
 builders-0.049
 kings-0.048
 spies-0.047
Token reached
Token ID4251
Feature activation+1.25e+00

Pos logit contributions
 detectives+0.006
 builders+0.006
 who+0.006
 kings+0.006
 spies+0.006
Neg logit contributions
rated-0.010
aned-0.008
ently-0.008
err-0.007
rily-0.007
Tokenucky
Token ID5309
Feature activation-2.55e-01

Pos logit contributions
rated+0.089
aned+0.077
ently+0.070
 sensation+0.066
rum+0.065
Neg logit contributions
 detectives-0.063
 who-0.061
 kings-0.060
 spies-0.060
 pirates-0.059
Token and
Token ID290
Feature activation+1.16e+00

Pos logit contributions
 detectives+0.009
 builders+0.009
 kings+0.008
 spies+0.008
 who+0.008
Neg logit contributions
rated-0.013
aned-0.011
ently-0.010
err-0.010
night-0.010
Token his
Token ID465
Feature activation+2.44e+00

Pos logit contributions
rated+0.065
aned+0.058
ently+0.057
err+0.050
rily+0.048
Neg logit contributions
 who-0.039
 named-0.036
 kings-0.035
 detectives-0.035
 pirates-0.035
Token friends
Token ID2460
Feature activation-7.28e+00

Pos logit contributions
rated+0.150
aned+0.134
ently+0.128
err+0.126
rily+0.121
Neg logit contributions
 who-0.056
 detectives-0.054
 builders-0.052
 kings-0.051
 farmers-0.047
Token sw
Token ID1509
Feature activation+3.61e+00

Pos logit contributions
 detectives+0.238
 builders+0.228
 spies+0.226
 queens+0.214
 kings+0.213
Neg logit contributions
rated-0.399
aned-0.327
ently-0.299
night-0.297
err-0.283
Tokenam
Token ID321
Feature activation-5.42e-01

Pos logit contributions
rated+0.143
ently+0.103
 sensation+0.099
aned+0.099
rily+0.090
Neg logit contributions
 kings-0.168
 detectives-0.165
 captains-0.165
 builders-0.163
boys-0.160
Token around
Token ID1088
Feature activation+2.06e+00

Pos logit contributions
 who+0.018
 detectives+0.017
 builders+0.016
 kings+0.016
 spies+0.016
Neg logit contributions
rated-0.030
aned-0.026
ently-0.026
err-0.024
rily-0.023
Token the
Token ID262
Feature activation+1.17e+00

Pos logit contributions
rated+0.106
aned+0.089
ently+0.086
err+0.083
rily+0.080
Neg logit contributions
 detectives-0.068
 builders-0.067
 kings-0.065
 who-0.064
 spies-0.063
Token pond
Token ID16723
Feature activation-1.46e+00

Pos logit contributions
rated+0.060
aned+0.051
err+0.047
ently+0.046
rily+0.046
Neg logit contributions
 builders-0.038
 detectives-0.038
 kings-0.036
 who-0.036
 mechanics-0.036
Token,
Token ID11
Feature activation-3.80e+00

Pos logit contributions
 detectives+0.050
 builders+0.049
 who+0.048
 kings+0.048
 spies+0.047
Neg logit contributions
rated-0.075
aned-0.063
ently-0.061
err-0.057
rily-0.056
Token making
Token ID1642
Feature activation-1.38e+00

Pos logit contributions
 detectives+0.110
 builders+0.109
 kings+0.105
 who+0.102
 spies+0.101
Neg logit contributions
rated-0.214
aned-0.183
ently-0.172
err-0.167
rily-0.164
Token so
Token ID523
Feature activation+2.54e+00

Pos logit contributions
 detectives+0.166
 builders+0.164
 kings+0.162
 who+0.156
 spies+0.153
Neg logit contributions
rated-0.215
aned-0.180
ently-0.172
rily-0.161
err-0.160
Token she
Token ID673
Feature activation-2.17e+00

Pos logit contributions
rated+0.150
aned+0.131
ently+0.128
rily+0.120
err+0.119
Neg logit contributions
 detectives-0.063
 builders-0.062
 kings-0.060
 who-0.059
 spies-0.057
Token went
Token ID1816
Feature activation-3.30e-02

Pos logit contributions
 detectives+0.063
 who+0.062
 builders+0.062
 kings+0.061
 spies+0.058
Neg logit contributions
rated-0.123
aned-0.107
ently-0.102
err-0.097
rily-0.096
Token to
Token ID284
Feature activation-2.32e-01

Pos logit contributions
 detectives+0.001
 who+0.001
 builders+0.001
 kings+0.001
 farmers+0.001
Neg logit contributions
rated-0.002
aned-0.001
ently-0.001
err-0.001
rily-0.001
Token the
Token ID262
Feature activation+1.27e+00

Pos logit contributions
 detectives+0.007
 builders+0.007
 who+0.007
 kings+0.007
 farmers+0.006
Neg logit contributions
rated-0.013
aned-0.011
ently-0.011
err-0.010
rily-0.010
Token garden
Token ID11376
Feature activation-2.31e+00

Pos logit contributions
rated+0.066
err+0.054
aned+0.053
night+0.053
ently+0.052
Neg logit contributions
 builders-0.034
 detectives-0.033
 who-0.033
Playing-0.033
 kings-0.032
Token to
Token ID284
Feature activation-1.93e-01

Pos logit contributions
 detectives+0.101
 builders+0.099
 superheroes+0.095
 spies+0.094
 mechanics+0.094
Neg logit contributions
rated-0.088
ently-0.072
aned-0.069
rily-0.068
 sensation-0.063
Token think
Token ID892
Feature activation+2.77e+00

Pos logit contributions
 builders+0.006
 detectives+0.006
 kings+0.005
 farmers+0.005
 spies+0.005
Neg logit contributions
rated-0.010
aned-0.008
ently-0.008
 sensation-0.008
err-0.008
Token about
Token ID546
Feature activation-1.26e+00

Pos logit contributions
rated+0.165
aned+0.145
err+0.134
ently+0.133
rily+0.128
Neg logit contributions
 who-0.080
 detectives-0.072
 kings-0.071
 builders-0.070
 spies-0.067
Token what
Token ID644
Feature activation-4.26e+00

Pos logit contributions
 builders+0.045
 detectives+0.044
 captains+0.041
 kings+0.040
 spies+0.040
Neg logit contributions
rated-0.058
ently-0.050
aned-0.047
rily-0.045
err-0.045
Token she
Token ID673
Feature activation-2.13e+00

Pos logit contributions
 detectives+0.131
 builders+0.131
 kings+0.125
 captains+0.124
 spies+0.121
Neg logit contributions
rated-0.218
ently-0.180
aned-0.179
err-0.173
night-0.169
Token little
Token ID1310
Feature activation-1.23e+00

Pos logit contributions
rated+0.078
aned+0.064
ently+0.062
err+0.061
night+0.058
Neg logit contributions
 detectives-0.049
 who-0.047
 builders-0.046
 kings-0.046
 spies-0.045
Token girl
Token ID2576
Feature activation-5.41e-01

Pos logit contributions
 detectives+0.039
 who+0.038
 builders+0.038
 kings+0.037
 spies+0.036
Neg logit contributions
rated-0.066
aned-0.055
ently-0.053
err-0.051
rily-0.050
Token was
Token ID373
Feature activation+2.94e-01

Pos logit contributions
 detectives+0.017
 builders+0.017
 kings+0.016
 spies+0.016
 who+0.016
Neg logit contributions
rated-0.030
aned-0.026
ently-0.024
err-0.023
rily-0.023
Token so
Token ID523
Feature activation+2.35e+00

Pos logit contributions
rated+0.016
aned+0.014
ently+0.013
err+0.012
rily+0.012
Neg logit contributions
 detectives-0.009
 builders-0.009
 kings-0.009
 spies-0.009
 who-0.008
Token happy
Token ID3772
Feature activation-9.84e-02

Pos logit contributions
rated+0.136
aned+0.118
ently+0.111
err+0.106
rily+0.105
Neg logit contributions
 detectives-0.068
 builders-0.066
 kings-0.065
 who-0.065
 spies-0.063
Token that
Token ID326
Feature activation-4.69e+00

Pos logit contributions
 detectives+0.003
 who+0.003
 builders+0.003
 kings+0.003
 spies+0.002
Neg logit contributions
rated-0.006
aned-0.005
ently-0.005
err-0.005
rily-0.004
Token she
Token ID673
Feature activation-2.16e+00

Pos logit contributions
 detectives+0.126
 builders+0.124
 who+0.120
 kings+0.117
 spies+0.115
Neg logit contributions
rated-0.275
aned-0.229
ently-0.227
night-0.216
err-0.215
Token could
Token ID714
Feature activation+5.65e-01

Pos logit contributions
 detectives+0.065
 builders+0.064
 who+0.063
 kings+0.062
 spies+0.060
Neg logit contributions
rated-0.121
aned-0.105
ently-0.099
err-0.094
rily-0.093
Token carry
Token ID3283
Feature activation-2.59e+00

Pos logit contributions
rated+0.030
aned+0.027
ently+0.026
night+0.023
rily+0.023
Neg logit contributions
 detectives-0.018
 builders-0.017
 kings-0.017
 spies-0.016
 who-0.016
Token her
Token ID607
Feature activation+2.55e+00

Pos logit contributions
 who+0.072
 detectives+0.069
 kings+0.067
 builders+0.067
 spies+0.064
Neg logit contributions
rated-0.156
aned-0.136
ently-0.128
err-0.123
rily-0.121
Token dolls
Token ID36062
Feature activation-3.30e+00

Pos logit contributions
rated+0.136
aned+0.117
ently+0.111
err+0.105
rily+0.103
Neg logit contributions
 who-0.086
 detectives-0.083
 kings-0.082
 builders-0.082
 spies-0.078
Token to
Token ID284
Feature activation-3.08e-01

Pos logit contributions
 who+0.010
 detectives+0.010
 kings+0.010
 builders+0.010
 farmers+0.009
Neg logit contributions
rated-0.013
aned-0.011
ently-0.011
err-0.010
night-0.010
Token the
Token ID262
Feature activation+1.18e+00

Pos logit contributions
 detectives+0.010
 builders+0.010
 who+0.009
 kings+0.009
 farmers+0.009
Neg logit contributions
rated-0.016
aned-0.014
ently-0.013
err-0.013
rily-0.012
Token beach
Token ID10481
Feature activation+6.88e-01

Pos logit contributions
rated+0.056
err+0.050
night+0.046
icle+0.046
aned+0.046
Neg logit contributions
 detectives-0.034
 builders-0.034
 who-0.033
 kings-0.032
 spies-0.030
Token with
Token ID351
Feature activation+1.14e+00

Pos logit contributions
rated+0.032
aned+0.026
ently+0.025
err+0.024
night+0.023
Neg logit contributions
 detectives-0.026
 builders-0.025
 kings-0.024
 superheroes-0.024
 spies-0.024
Token his
Token ID465
Feature activation+2.34e+00

Pos logit contributions
rated+0.028
aned+0.020
err+0.016
night+0.016
ently+0.015
Neg logit contributions
 who-0.065
 detectives-0.065
 builders-0.063
 farmers-0.063
 kings-0.063
Token family
Token ID1641
Feature activation-6.81e+00

Pos logit contributions
rated+0.106
aned+0.092
err+0.088
ently+0.087
night+0.083
Neg logit contributions
 who-0.081
 detectives-0.080
 kings-0.078
 farmers-0.076
 builders-0.075
Token.
Token ID13
Feature activation+1.53e+00

Pos logit contributions
 detectives+0.330
 builders+0.316
 kings+0.313
 spies+0.311
 superheroes+0.307
Neg logit contributions
rated-0.239
ently-0.184
aned-0.183
rily-0.167
night-0.162
Token The
Token ID383
Feature activation+4.62e-01

Pos logit contributions
rated+0.083
aned+0.072
ently+0.069
err+0.064
rily+0.064
Neg logit contributions
 who-0.050
 detectives-0.049
 builders-0.048
 kings-0.047
 spies-0.045
Token water
Token ID1660
Feature activation+5.59e-01

Pos logit contributions
rated+0.024
aned+0.020
ently+0.020
err+0.019
rily+0.018
Neg logit contributions
 detectives-0.016
 who-0.016
 kings-0.015
 builders-0.015
 spies-0.014
Token was
Token ID373
Feature activation+5.32e-01

Pos logit contributions
rated+0.030
aned+0.026
ently+0.024
err+0.023
rily+0.023
Neg logit contributions
 who-0.019
 detectives-0.018
 builders-0.018
 kings-0.018
 spies-0.017
Token very
Token ID845
Feature activation+8.02e-02

Pos logit contributions
rated+0.030
aned+0.026
ently+0.025
err+0.023
rily+0.023
Neg logit contributions
 who-0.016
 detectives-0.016
 builders-0.015
 kings-0.015
 spies-0.015
Token her
Token ID607
Feature activation+2.57e+00

Pos logit contributions
 who+0.101
 detectives+0.093
 kings+0.091
 spies+0.090
 builders+0.089
Neg logit contributions
rated-0.166
aned-0.137
err-0.127
ently-0.126
rily-0.120
Token mom
Token ID1995
Feature activation-4.49e+00

Pos logit contributions
rated+0.129
aned+0.112
ently+0.108
err+0.105
rily+0.101
Neg logit contributions
 detectives-0.083
 builders-0.080
 kings-0.080
 farmers-0.079
 who-0.076
Token.
Token ID13
Feature activation+1.69e+00

Pos logit contributions
 who+0.175
 detectives+0.170
 kings+0.166
 builders+0.166
 spies+0.160
Neg logit contributions
rated-0.216
aned-0.182
ently-0.172
err-0.163
rily-0.157
Token\n
Token ID198
Feature activation-2.11e+00

Pos logit contributions
rated+0.102
aned+0.089
ently+0.085
err+0.081
rily+0.080
Neg logit contributions
 who-0.045
 detectives-0.044
 builders-0.042
 kings-0.042
 spies-0.040
Token\n
Token ID198
Feature activation-2.10e+00

Pos logit contributions
 who+0.043
 detectives+0.037
 kings+0.036
 builders+0.035
 store+0.035
Neg logit contributions
rated-0.146
aned-0.126
ently-0.120
 sensation-0.117
err-0.116
Token"
Token ID1
Feature activation-1.05e+01

Pos logit contributions
 who+0.047
 detectives+0.046
 builders+0.044
 kings+0.044
 spies+0.041
Neg logit contributions
rated-0.135
aned-0.119
ently-0.115
err-0.110
rily-0.108
TokenYes
Token ID5297
Feature activation+6.28e+00

Pos logit contributions
 kings+0.451
 detectives+0.443
 builders+0.433
 spies+0.420
 superheroes+0.418
Neg logit contributions
rated-0.439
aned-0.362
err-0.331
ently-0.329
right-0.316
Token,
Token ID11
Feature activation-3.88e+00

Pos logit contributions
rated+0.437
aned+0.384
err+0.380
 sensation+0.357
ently+0.353
Neg logit contributions
 who-0.132
 builders-0.103
 kings-0.103
 store-0.098
 police-0.097
Token you
Token ID345
Feature activation+2.01e+00

Pos logit contributions
 who+0.119
 detectives+0.112
 builders+0.109
 kings+0.109
 spies+0.105
Neg logit contributions
rated-0.223
aned-0.194
ently-0.184
err-0.177
rily-0.171
Token can
Token ID460
Feature activation-1.30e+00

Pos logit contributions
rated+0.100
aned+0.078
ently+0.075
night+0.074
raw+0.073
Neg logit contributions
 who-0.068
 detectives-0.068
 farmers-0.067
 kings-0.067
Playing-0.066
Token,
Token ID11
Feature activation-3.86e+00

Pos logit contributions
 detectives+0.052
 builders+0.051
 kings+0.050
 spies+0.049
 farmers+0.048
Neg logit contributions
rated-0.060
aned-0.051
ently-0.048
rily-0.044
ingle-0.044
Token bird
Token ID6512
Feature activation-3.60e+00

Pos logit contributions
 detectives+0.034
 who+0.034
 builders+0.033
 kings+0.032
 spies+0.030
Neg logit contributions
rated-0.089
aned-0.078
ently-0.075
err-0.072
rily-0.071
Token how
Token ID703
Feature activation-4.69e+00

Pos logit contributions
 who+0.119
 detectives+0.114
 kings+0.113
 builders+0.110
 farmers+0.106
Neg logit contributions
rated-0.196
aned-0.169
ently-0.160
err-0.152
rily-0.148
Token to
Token ID284
Feature activation-4.97e-01

Pos logit contributions
 who+0.146
 detectives+0.144
 builders+0.140
 kings+0.140
 spies+0.133
Neg logit contributions
rated-0.261
aned-0.226
ently-0.216
err-0.204
rily-0.201
Token fly
Token ID6129
Feature activation-1.47e+00

Pos logit contributions
 detectives+0.016
 builders+0.015
 spies+0.014
 who+0.014
 kings+0.014
Neg logit contributions
rated-0.027
aned-0.023
ently-0.022
 sensation-0.021
rily-0.021
Token and
Token ID290
Feature activation+7.41e-01

Pos logit contributions
 who+0.052
 kings+0.048
 detectives+0.047
 builders+0.045
 spies+0.044
Neg logit contributions
rated-0.078
aned-0.070
ently-0.065
err-0.061
rily-0.060
Token how
Token ID703
Feature activation-4.54e+00

Pos logit contributions
rated+0.040
aned+0.032
ently+0.030
night+0.030
 sensation+0.029
Neg logit contributions
 builders-0.024
 detectives-0.023
 who-0.022
 farmers-0.022
 kings-0.022
Token to
Token ID284
Feature activation-3.88e-01

Pos logit contributions
 who+0.146
 detectives+0.140
 kings+0.139
 builders+0.139
 farmers+0.131
Neg logit contributions
rated-0.248
aned-0.218
ently-0.206
err-0.196
rily-0.192
Token find
Token ID1064
Feature activation-6.60e+00

Pos logit contributions
 detectives+0.013
 builders+0.013
 spies+0.012
 kings+0.012
 farmers+0.012
Neg logit contributions
rated-0.020
aned-0.017
ently-0.016
 sensation-0.015
rily-0.015
Token worms
Token ID29174
Feature activation-3.31e+00

Pos logit contributions
 detectives+0.254
 builders+0.252
 who+0.245
 kings+0.241
 farmers+0.239
Neg logit contributions
rated-0.300
ently-0.247
aned-0.244
err-0.227
rily-0.226
Token to
Token ID284
Feature activation-3.63e-01

Pos logit contributions
 detectives+0.128
 builders+0.127
 who+0.124
 kings+0.123
 farmers+0.120
Neg logit contributions
rated-0.152
aned-0.126
ently-0.123
rily-0.114
err-0.112
Token eat
Token ID4483
Feature activation-2.00e+00

Pos logit contributions
 builders+0.012
 detectives+0.012
 mechanics+0.011
 spies+0.011
 police+0.011
Neg logit contributions
rated-0.018
ently-0.015
aned-0.015
 sensation-0.014
rily-0.014
Token said
Token ID531
Feature activation+6.79e-01

Pos logit contributions
rated+0.043
aned+0.038
ently+0.037
rily+0.035
err+0.035
Neg logit contributions
 who-0.022
 detectives-0.018
 kings-0.018
 superheroes-0.017
 store-0.017
Token he
Token ID339
Feature activation-1.70e+00

Pos logit contributions
rated+0.039
aned+0.034
ently+0.033
err+0.031
rily+0.031
Neg logit contributions
 detectives-0.019
 who-0.019
 builders-0.019
 kings-0.018
 spies-0.017
Token didn
Token ID1422
Feature activation+6.34e+00

Pos logit contributions
 builders+0.056
 detectives+0.056
 kings+0.054
 who+0.053
 spies+0.052
Neg logit contributions
rated-0.090
aned-0.077
ently-0.071
err-0.069
night-0.067
Token't
Token ID470
Feature activation+1.53e-01

Pos logit contributions
rated+0.382
aned+0.323
ently+0.311
err+0.300
rily+0.296
Neg logit contributions
 who-0.171
 detectives-0.165
 builders-0.161
 kings-0.157
 spies-0.151
Token know
Token ID760
Feature activation-3.62e+00

Pos logit contributions
rated+0.010
aned+0.009
ently+0.009
rily+0.008
err+0.008
Neg logit contributions
 who-0.003
 kings-0.003
 detectives-0.003
 builders-0.003
 superheroes-0.003
Token how
Token ID703
Feature activation-4.61e+00

Pos logit contributions
 detectives+0.097
 builders+0.094
 who+0.094
 kings+0.092
 spies+0.088
Neg logit contributions
rated-0.212
aned-0.185
ently-0.180
err-0.169
rily-0.168
Token.
Token ID13
Feature activation+1.50e+00

Pos logit contributions
 who+0.133
 detectives+0.120
 kings+0.118
 builders+0.117
 superheroes+0.114
Neg logit contributions
rated-0.276
aned-0.248
ently-0.235
err-0.226
rily-0.222
Token\n
Token ID198
Feature activation-2.31e+00

Pos logit contributions
rated+0.099
aned+0.085
ently+0.082
rily+0.080
err+0.077
Neg logit contributions
 who-0.034
 detectives-0.032
 builders-0.031
 kings-0.030
 spies-0.029
Token\n
Token ID198
Feature activation-2.30e+00

Pos logit contributions
 who+0.050
 detectives+0.042
 kings+0.040
 builders+0.040
 spies+0.037
Neg logit contributions
rated-0.156
aned-0.137
ently-0.129
rily-0.127
err-0.124
TokenJerry
Token ID43462
Feature activation-4.21e-01

Pos logit contributions
 who+0.056
 detectives+0.056
 builders+0.054
 kings+0.054
 spies+0.051
Neg logit contributions
rated-0.142
aned-0.125
ently-0.120
err-0.115
rily-0.113
Token decided
Token ID3066
Feature activation-1.41e+00

Pos logit contributions
 detectives+0.013
 builders+0.013
 who+0.012
 kings+0.012
 spies+0.012
Neg logit contributions
rated-0.023
aned-0.020
ently-0.019
err-0.018
rily-0.018