TOP ACTIVATIONS
MAX = 13.226

pretty hair ." " Hi , Max ," Tom
L incoln ". " Who is Lincoln , Ben
have there ." " Thanks !" said Tim my
from Tom . " Tom , go away !
blue berries ?" " Yes , please !" Lily
eat them . " Maybe we can fly with
the slide ." " I see ," the man
go away ." " What do you want to
they are ." " Maybe they are rob ins
are you ?" " I am a spirit ,"
ge ese . " Maybe it got hurt by
touch it ?" " No , we cannot touch
are happy . " No , thank you !"
and sweet ." " Maybe it is for someone
your bird ." " No , it can 't
touch it ." " Me too ," Ben said
are well ." " Thank you , Lily ,"
a feather . " Thank you , bird ,
y ucky ." " Me neither ," Ben said
for Ben . " Ben , you need to

MIN ACTIVATIONS
MIN = -16.670

. Rel uct antly , Elsa left the
sea monster insisted . Rel uct antly , it followed him
buy his own . Rel uct antly , his friend gave
from the window . Rel uct antly , he came away
so sad and embarrassed . Tears ran down her face ,
and hurt her knee . Tears started to run down her
Mama began to cry . Tears started streaming down her cheeks
big sad inside her . Tears rolled down her cheeks and
girl was very sad . Tears started rolling down her cheeks
was too high up . Tears threatened to fall from his
her friends were there . Tears started rolling down her cheeks
no energy to play . Tears rolled down his cheeks and
girl didn 't understand . Tears started rolling down her face
Ron ny nodded . Tears filled his eyes .
and he felt sad . Tears started to stream down his
but it seemed impossible . Tears started to come from her
it had been caught . Tears filled the oy ster 's
as the fire grew . Tears ran down their faces as
. She got hurt ! Tears ran down her face .
to go to work . Tears started flowing down his cheeks

INTERVAL 5.752 - 13.226
CONTAINS 2.035%

. " You are very kind . I will be your
. But his mom said " no " because it was
very beautiful , Lily . I love it . You did
" Where are we going mom my ?" asked
the juice . " Hey , you two ,"

INTERVAL -1.722 - 5.752
CONTAINS 72.368%

They swim and play and slide on the rocks . Sometimes
their head and said no . Then one day
brave and he had tried to deliver the packages , even
and walked towards the bush . They both got there in
park . " Yes !" Sam exclaimed , as he ran

INTERVAL -9.196 - -1.722
CONTAINS 25.409%

and said he must stay there without moving for a whole
man is complete !" Lily says . " He is very
and stopped to listen . Her heart began to beat fast
any more threats . <|endoftext|> Once upon a time , there
a fun day and she couldn 't wait to tell her

INTERVAL -16.670 - -9.196
CONTAINS 0.188%

waved back as Joe flew off into the sky . Joe
said . " Wow , that sounds fun !"
the wolf , Hi there !
will find it . " The guard nodded
asked her , What
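
The four INTERVAL ranges above look like equal-width bins between MIN and MAX: each spans (13.226 - (-16.670)) / 4 = 7.474. The sketch below is a minimal check of that reading. It assumes linear binning, and the helper name `equal_width_edges` is hypothetical; the source does not name the tool or method that produced the listing.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return the n_bins + 1 edges that split [lo, hi] into equal-width bins."""
    width = (hi - lo) / n_bins
    # Round to 3 decimals to match the precision printed in the listing.
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]

# MIN and MAX as reported in the listing above.
edges = equal_width_edges(-16.670, 13.226, 4)
print(edges)

# Shares from the CONTAINS lines, lowest interval first.
shares = [0.188, 25.409, 72.368, 2.035]
print(round(sum(shares), 3))
```

If the equal-width reading is right, the printed edges reproduce the INTERVAL boundaries (-9.196, -1.722, 5.752) and the shares sum to 100%. Nearly all activations sit in the two middle bins, so both the top and the bottom sequences are rare tail events.
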
Token pretty
Token ID2495
Feature activation+2.51e+00

Pos logit contributions
 chefs+0.075
 queens+0.069
 charity+0.061
 kings+0.059
 spies+0.057
Neg logit contributions
aned-0.113
 forehead-0.106
rated-0.106
iously-0.106
ently-0.105
Token hair
Token ID4190
Feature activation-7.60e-01

Pos logit contributions
 chefs+0.070
 queens+0.063
 charity+0.058
 kings+0.055
 spies+0.054
Neg logit contributions
aned-0.098
ently-0.093
 forehead-0.093
rated-0.092
iously-0.092
Token."
Token ID526
Feature activation-1.07e+00

Pos logit contributions
aned+0.026
 forehead+0.025
rated+0.024
ently+0.024
iously+0.024
Neg logit contributions
 chefs-0.025
 queens-0.023
 charity-0.021
 kings-0.021
 spies-0.020
Token\n
Token ID198
Feature activation+1.88e+00

Pos logit contributions
aned+0.045
rated+0.043
ently+0.041
iously+0.041
 forehead+0.040
Neg logit contributions
 chefs-0.030
 queens-0.026
 charity-0.025
 kings-0.024
 pirates-0.023
Token\n
Token ID198
Feature activation+1.95e+00

Pos logit contributions
 chefs+0.056
 queens+0.051
 charity+0.048
 kings+0.047
 spies+0.044
Neg logit contributions
aned-0.072
rated-0.068
iously-0.067
 forehead-0.067
ently-0.066
Token"
Token ID1
Feature activation+1.32e+01

Pos logit contributions
 chefs+0.048
 queens+0.041
 kings+0.036
 spies+0.033
 pirates+0.033
Neg logit contributions
aned-0.085
appointed-0.080
ently-0.079
rated-0.079
 forehead-0.078
TokenHi
Token ID17250
Feature activation-2.36e-01

Pos logit contributions
 chefs+0.366
 queens+0.315
 kings+0.306
 spies+0.286
 builders+0.283
Neg logit contributions
aned-0.500
appointed-0.485
 forehead-0.483
rated-0.462
ently-0.460
Token,
Token ID11
Feature activation+5.73e+00

Pos logit contributions
aned+0.008
rated+0.008
 forehead+0.007
iously+0.007
ently+0.007
Neg logit contributions
 chefs-0.008
 queens-0.007
 charity-0.007
 kings-0.006
 spies-0.006
Token Max
Token ID5436
Feature activation-1.05e+00

Pos logit contributions
 chefs+0.180
 queens+0.164
 kings+0.148
 charity+0.147
 builders+0.141
Neg logit contributions
aned-0.199
 forehead-0.191
rated-0.186
ently-0.183
iously-0.183
Token,"
Token ID553
Feature activation+1.32e+00

Pos logit contributions
aned+0.034
rated+0.032
 forehead+0.031
ently+0.031
iously+0.031
Neg logit contributions
 chefs-0.038
 queens-0.035
 kings-0.032
 charity-0.032
 spies-0.030
Token Tom
Token ID4186
Feature activation-1.82e-01

Pos logit contributions
 chefs+0.038
 queens+0.035
 kings+0.031
 charity+0.031
 spies+0.029
Neg logit contributions
aned-0.051
rated-0.048
 forehead-0.048
ently-0.047
iously-0.047
TokenL
Token ID43
Feature activation+1.28e+00

Pos logit contributions
aned+0.042
rated+0.039
ently+0.037
iously+0.036
 forehead+0.036
Neg logit contributions
 chefs-0.055
 queens-0.051
 charity-0.048
 kings-0.048
 spies-0.047
Tokenincoln
Token ID11690
Feature activation-3.52e+00

Pos logit contributions
 chefs+0.033
 queens+0.030
 kings+0.027
 charity+0.026
 pirates+0.024
Neg logit contributions
aned-0.054
rated-0.052
appointed-0.051
iously-0.051
 forehead-0.050
Token".
Token ID1911
Feature activation-1.39e+00

Pos logit contributions
aned+0.137
rated+0.128
appointed+0.126
 forehead+0.125
ently+0.124
Neg logit contributions
 chefs-0.102
 queens-0.091
 kings-0.084
 charity-0.082
 pirates-0.079
Token\n
Token ID198
Feature activation+1.63e+00

Pos logit contributions
aned+0.054
rated+0.052
ently+0.050
 forehead+0.049
iously+0.049
Neg logit contributions
 chefs-0.042
 queens-0.038
 charity-0.036
 kings-0.034
 spies-0.033
Token\n
Token ID198
Feature activation+1.87e+00

Pos logit contributions
 chefs+0.045
 queens+0.041
 charity+0.038
 kings+0.037
 spies+0.035
Neg logit contributions
aned-0.066
rated-0.063
iously-0.062
 forehead-0.062
ently-0.060
Token"
Token ID1
Feature activation+1.32e+01

Pos logit contributions
 chefs+0.043
 queens+0.037
 kings+0.033
 charity+0.031
 pirates+0.030
Neg logit contributions
aned-0.084
rated-0.080
appointed-0.079
ently-0.079
 forehead-0.078
TokenWho
Token ID8241
Feature activation+3.16e+00

Pos logit contributions
 chefs+0.384
 queens+0.340
 kings+0.323
 charity+0.305
 pirates+0.299
Neg logit contributions
aned-0.503
rated-0.476
appointed-0.476
 forehead-0.467
ently-0.454
Token is
Token ID318
Feature activation+1.41e-01

Pos logit contributions
 chefs+0.110
 queens+0.100
 kings+0.094
 charity+0.091
 pirates+0.088
Neg logit contributions
aned-0.109
rated-0.100
 forehead-0.099
appointed-0.098
iously-0.098
Token Lincoln
Token ID12406
Feature activation+3.45e+00

Pos logit contributions
 chefs+0.004
 queens+0.004
 charity+0.003
 kings+0.003
 spies+0.003
Neg logit contributions
aned-0.005
rated-0.005
ently-0.005
iously-0.005
 forehead-0.005
Token,
Token ID11
Feature activation+5.76e+00

Pos logit contributions
 chefs+0.113
 queens+0.103
 kings+0.097
 charity+0.093
 pirates+0.092
Neg logit contributions
aned-0.123
rated-0.113
appointed-0.112
ently-0.111
 forehead-0.111
Token Ben
Token ID3932
Feature activation+2.26e-01

Pos logit contributions
 chefs+0.179
 queens+0.162
 kings+0.146
 charity+0.146
 spies+0.138
Neg logit contributions
aned-0.199
 forehead-0.190
rated-0.188
ently-0.186
iously-0.185
Token have
Token ID423
Feature activation+2.80e+00

Pos logit contributions
aned+0.019
rated+0.018
ently+0.018
iously+0.018
 forehead+0.018
Neg logit contributions
 chefs-0.016
 queens-0.014
 kings-0.013
 charity-0.013
 pirates-0.013
Token there
Token ID612
Feature activation-1.50e+00

Pos logit contributions
 chefs+0.080
 queens+0.072
 charity+0.064
 kings+0.063
 spies+0.061
Neg logit contributions
aned-0.107
ently-0.099
 forehead-0.099
iously-0.099
rated-0.098
Token."
Token ID526
Feature activation-1.04e+00

Pos logit contributions
aned+0.056
rated+0.053
 forehead+0.052
ently+0.052
iously+0.051
Neg logit contributions
 chefs-0.046
 queens-0.042
 kings-0.038
 charity-0.038
 spies-0.036
Token\n
Token ID198
Feature activation+1.71e+00

Pos logit contributions
aned+0.042
rated+0.041
ently+0.039
iously+0.038
 forehead+0.038
Neg logit contributions
 chefs-0.030
 queens-0.027
 charity-0.025
 kings-0.024
 pirates-0.023
Token\n
Token ID198
Feature activation+1.82e+00

Pos logit contributions
 chefs+0.050
 queens+0.046
 charity+0.042
 kings+0.041
 pirates+0.039
Neg logit contributions
aned-0.067
rated-0.064
iously-0.063
 forehead-0.062
ently-0.061
Token"
Token ID1
Feature activation+1.32e+01

Pos logit contributions
 chefs+0.045
 queens+0.038
 kings+0.034
 charity+0.031
 spies+0.031
Neg logit contributions
aned-0.079
appointed-0.074
rated-0.074
ently-0.074
 forehead-0.073
TokenThanks
Token ID9690
Feature activation-3.16e+00

Pos logit contributions
 chefs+0.381
 queens+0.339
 kings+0.328
 spies+0.302
 pirates+0.295
Neg logit contributions
aned-0.475
appointed-0.456
 forehead-0.450
rated-0.439
ently-0.430
Token!"
Token ID2474
Feature activation+7.28e-02

Pos logit contributions
aned+0.129
rated+0.120
ently+0.117
appointed+0.116
 forehead+0.114
Neg logit contributions
 chefs-0.091
 queens-0.082
 kings-0.075
 charity-0.074
 pirates-0.071
Token said
Token ID531
Feature activation-3.32e+00

Pos logit contributions
 chefs+0.002
 queens+0.002
 kings+0.002
 charity+0.002
 spies+0.002
Neg logit contributions
aned-0.003
rated-0.003
ently-0.003
iously-0.003
appointed-0.003
Token Tim
Token ID5045
Feature activation-1.92e-01

Pos logit contributions
aned+0.140
rated+0.135
ently+0.130
appointed+0.128
 forehead+0.126
Neg logit contributions
 chefs-0.091
 queens-0.080
 kings-0.073
 charity-0.071
 pirates-0.070
Tokenmy
Token ID1820
Feature activation+9.47e-02

Pos logit contributions
aned+0.009
ently+0.008
inkle+0.008
rated+0.008
heit+0.008
Neg logit contributions
 chefs-0.004
 queens-0.003
 kings-0.003
 pirates-0.003
please-0.002
Token from
Token ID422
Feature activation-3.95e+00

Pos logit contributions
aned+0.285
rated+0.267
ently+0.261
 forehead+0.260
appointed+0.260
Neg logit contributions
 chefs-0.193
 queens-0.174
 kings-0.159
 charity-0.156
 pirates-0.149
Token Tom
Token ID4186
Feature activation+4.28e-01

Pos logit contributions
aned+0.144
 forehead+0.135
iously+0.134
rated+0.134
ently+0.132
Neg logit contributions
 chefs-0.122
 queens-0.111
 charity-0.101
 kings-0.098
 builders-0.093
Token.
Token ID13
Feature activation-6.29e-01

Pos logit contributions
 chefs+0.010
 queens+0.008
 kings+0.008
 charity+0.008
 pirates+0.007
Neg logit contributions
aned-0.019
rated-0.018
appointed-0.018
ently-0.017
iously-0.017
Token\n
Token ID198
Feature activation+1.85e+00

Pos logit contributions
aned+0.024
rated+0.024
 forehead+0.022
ently+0.022
iously+0.022
Neg logit contributions
 chefs-0.019
 queens-0.017
 charity-0.016
 kings-0.015
 spies-0.015
Token\n
Token ID198
Feature activation+1.96e+00

Pos logit contributions
 chefs+0.051
 queens+0.046
 charity+0.043
 kings+0.042
 spies+0.040
Neg logit contributions
aned-0.075
rated-0.072
iously-0.071
 forehead-0.071
ently-0.069
Token"
Token ID1
Feature activation+1.32e+01

Pos logit contributions
 chefs+0.047
 queens+0.042
 charity+0.037
 kings+0.037
 spies+0.034
Neg logit contributions
aned-0.087
rated-0.083
 forehead-0.081
ently-0.081
iously-0.081
TokenTom
Token ID13787
Feature activation+6.31e-01

Pos logit contributions
 chefs+0.335
 queens+0.304
 kings+0.277
 charity+0.262
 spies+0.248
Neg logit contributions
aned-0.551
rated-0.521
appointed-0.514
 forehead-0.512
ently-0.504
Token,
Token ID11
Feature activation+6.03e+00

Pos logit contributions
 chefs+0.017
 queens+0.014
 kings+0.013
 pirates+0.012
 builders+0.012
Neg logit contributions
aned-0.026
rated-0.025
ently-0.025
 forehead-0.025
appointed-0.025
Token go
Token ID467
Feature activation+2.37e+00

Pos logit contributions
 chefs+0.193
 queens+0.179
 kings+0.161
 charity+0.159
 builders+0.151
Neg logit contributions
aned-0.209
 forehead-0.199
rated-0.197
iously-0.192
ently-0.191
Token away
Token ID1497
Feature activation-6.62e+00

Pos logit contributions
 chefs+0.087
 queens+0.081
 charity+0.074
 kings+0.074
 spies+0.070
Neg logit contributions
aned-0.075
rated-0.070
 forehead-0.068
iously-0.067
appointed-0.067
Token!
Token ID0
Feature activation-8.99e-01

Pos logit contributions
aned+0.217
rated+0.202
 forehead+0.198
ently+0.197
iously+0.196
Neg logit contributions
 chefs-0.235
 queens-0.217
 kings-0.199
 charity-0.198
 spies-0.189
Token blue
Token ID4171
Feature activation+1.06e+00

Pos logit contributions
 queens+0.009
 chefs+0.009
 kings+0.007
 charity+0.007
 spies+0.007
Neg logit contributions
aned-0.011
 forehead-0.010
iously-0.010
ently-0.010
appointed-0.010
Tokenberries
Token ID20853
Feature activation+2.19e+00

Pos logit contributions
 chefs+0.027
 queens+0.024
 kings+0.023
 pirates+0.022
 charity+0.020
Neg logit contributions
aned-0.047
rated-0.044
appointed-0.043
iously-0.042
ently-0.042
Token?"
Token ID1701
Feature activation-1.45e+00

Pos logit contributions
 chefs+0.056
 queens+0.049
 kings+0.045
 pirates+0.042
 charity+0.039
Neg logit contributions
aned-0.093
rated-0.087
 forehead-0.087
iously-0.085
appointed-0.085
Token\n
Token ID198
Feature activation+1.48e+00

Pos logit contributions
aned+0.059
rated+0.057
ently+0.054
 forehead+0.054
iously+0.053
Neg logit contributions
 chefs-0.041
 queens-0.036
 kings-0.033
 charity-0.033
 pirates-0.032
Token\n
Token ID198
Feature activation+1.70e+00

Pos logit contributions
 chefs+0.038
 queens+0.034
 charity+0.032
 kings+0.031
 pirates+0.029
Neg logit contributions
aned-0.063
rated-0.061
iously-0.060
 forehead-0.059
ently-0.057
Token"
Token ID1
Feature activation+1.31e+01

Pos logit contributions
 chefs+0.041
 queens+0.036
 kings+0.032
 charity+0.031
 spies+0.029
Neg logit contributions
aned-0.075
rated-0.071
 forehead-0.070
ently-0.070
iously-0.069
TokenYes
Token ID5297
Feature activation-4.91e+00

Pos logit contributions
 chefs+0.345
 queens+0.301
 kings+0.281
 charity+0.263
 spies+0.260
Neg logit contributions
aned-0.539
appointed-0.510
 forehead-0.507
rated-0.505
ently-0.494
Token,
Token ID11
Feature activation+5.83e+00

Pos logit contributions
aned+0.213
rated+0.200
appointed+0.193
ently+0.192
 forehead+0.191
Neg logit contributions
 chefs-0.122
 queens-0.109
 kings-0.100
 charity-0.096
 pirates-0.092
Token please
Token ID3387
Feature activation+1.13e+00

Pos logit contributions
 chefs+0.182
 queens+0.164
 kings+0.150
 charity+0.149
 spies+0.141
Neg logit contributions
aned-0.213
rated-0.200
 forehead-0.198
iously-0.197
ently-0.196
Token!"
Token ID2474
Feature activation+2.02e-01

Pos logit contributions
 chefs+0.037
 queens+0.033
 kings+0.031
 charity+0.031
 pirates+0.029
Neg logit contributions
aned-0.041
ently-0.037
rated-0.037
appointed-0.036
iously-0.036
Token Lily
Token ID20037
Feature activation+3.20e-01

Pos logit contributions
 chefs+0.006
 queens+0.005
 charity+0.005
 kings+0.005
 pirates+0.004
Neg logit contributions
aned-0.008
rated-0.008
ently-0.008
appointed-0.008
iously-0.008
Token eat
Token ID4483
Feature activation+3.11e+00

Pos logit contributions
 chefs+0.057
 queens+0.052
 kings+0.047
 charity+0.047
 spies+0.045
Neg logit contributions
aned-0.058
rated-0.055
 forehead-0.054
iously-0.053
appointed-0.053
Token them
Token ID606
Feature activation+2.35e+00

Pos logit contributions
 chefs+0.093
 queens+0.085
 charity+0.076
 kings+0.076
 builders+0.073
Neg logit contributions
aned-0.112
ently-0.107
rated-0.107
iously-0.106
 forehead-0.106
Token.
Token ID13
Feature activation-5.94e-01

Pos logit contributions
 chefs+0.073
 queens+0.067
 kings+0.060
 charity+0.060
 spies+0.057
Neg logit contributions
aned-0.087
rated-0.082
 forehead-0.080
ently-0.080
iously-0.080
Token\n
Token ID198
Feature activation+1.99e+00

Pos logit contributions
aned+0.023
rated+0.022
ently+0.021
 forehead+0.021
iously+0.021
Neg logit contributions
 chefs-0.018
 queens-0.016
 charity-0.015
 kings-0.015
 spies-0.015
Token\n
Token ID198
Feature activation+2.05e+00

Pos logit contributions
 chefs+0.054
 queens+0.049
 charity+0.046
 kings+0.044
 pirates+0.043
Neg logit contributions
aned-0.081
rated-0.078
iously-0.077
 forehead-0.077
ently-0.074
Token"
Token ID1
Feature activation+1.31e+01

Pos logit contributions
 chefs+0.045
 queens+0.040
 charity+0.035
 kings+0.034
 spies+0.032
Neg logit contributions
aned-0.095
rated-0.091
 forehead-0.090
iously-0.089
ently-0.089
TokenMaybe
Token ID13300
Feature activation+2.59e+00

Pos logit contributions
 chefs+0.336
 queens+0.305
 kings+0.276
 charity+0.259
 builders+0.251
Neg logit contributions
aned-0.541
rated-0.513
 forehead-0.506
appointed-0.503
ently-0.502
Token we
Token ID356
Feature activation+6.08e+00

Pos logit contributions
 chefs+0.083
 queens+0.076
 charity+0.069
 kings+0.068
 spies+0.065
Neg logit contributions
aned-0.091
rated-0.086
ently-0.085
iously-0.085
 forehead-0.085
Token can
Token ID460
Feature activation+1.94e+00

Pos logit contributions
 chefs+0.209
 queens+0.196
 charity+0.178
 kings+0.172
 spies+0.168
Neg logit contributions
aned-0.192
rated-0.187
 forehead-0.176
appointed-0.175
iously-0.174
Token fly
Token ID6129
Feature activation+1.99e+00

Pos logit contributions
 chefs+0.066
 queens+0.063
 kings+0.054
 charity+0.054
 spies+0.053
Neg logit contributions
aned-0.065
 forehead-0.061
rated-0.060
appointed-0.060
iously-0.060
Token with
Token ID351
Feature activation+3.20e-01

Pos logit contributions
 chefs+0.060
 queens+0.054
 charity+0.049
 kings+0.049
 builders+0.046
Neg logit contributions
aned-0.075
rated-0.071
ently-0.069
 forehead-0.069
iously-0.069
Token the
Token ID262
Feature activation-9.87e-01

Pos logit contributions
aned+0.140
rated+0.132
ently+0.128
 forehead+0.128
appointed+0.127
Neg logit contributions
 chefs-0.096
 queens-0.085
 kings-0.077
 charity-0.076
 pirates-0.073
Token slide
Token ID10649
Feature activation+8.59e-01

Pos logit contributions
aned+0.037
rated+0.034
ently+0.034
iously+0.033
appointed+0.033
Neg logit contributions
 chefs-0.031
 queens-0.028
 charity-0.026
 kings-0.026
 pirates-0.025
Token."
Token ID526
Feature activation-1.40e+00

Pos logit contributions
 chefs+0.021
 queens+0.018
 kings+0.018
 pirates+0.016
 charity+0.016
Neg logit contributions
aned-0.036
rated-0.034
 forehead-0.034
iously-0.034
appointed-0.034
Token\n
Token ID198
Feature activation+1.55e+00

Pos logit contributions
aned+0.058
rated+0.057
ently+0.053
iously+0.052
 forehead+0.052
Neg logit contributions
 chefs-0.039
 queens-0.034
 charity-0.032
 kings-0.030
 pirates-0.030
Token\n
Token ID198
Feature activation+1.71e+00

Pos logit contributions
 chefs+0.042
 queens+0.038
 charity+0.036
 kings+0.034
 spies+0.033
Neg logit contributions
aned-0.063
rated-0.061
iously-0.060
 forehead-0.059
ently-0.058
Token"
Token ID1
Feature activation+1.31e+01

Pos logit contributions
 chefs+0.042
 queens+0.037
 kings+0.033
 charity+0.031
 spies+0.030
Neg logit contributions
aned-0.075
rated-0.070
appointed-0.070
ently-0.070
 forehead-0.069
TokenI
Token ID40
Feature activation+4.68e+00

Pos logit contributions
 chefs+0.383
 queens+0.341
 kings+0.322
 charity+0.299
 spies+0.296
Neg logit contributions
aned-0.499
appointed-0.472
 forehead-0.466
rated-0.466
ently-0.448
Token see
Token ID766
Feature activation+5.77e+00

Pos logit contributions
 chefs+0.154
 queens+0.142
 kings+0.129
 charity+0.128
 spies+0.122
Neg logit contributions
aned-0.166
rated-0.156
 forehead-0.152
ently-0.152
iously-0.151
Token,"
Token ID553
Feature activation+1.45e+00

Pos logit contributions
 chefs+0.177
 queens+0.160
 kings+0.146
 charity+0.144
 spies+0.137
Neg logit contributions
aned-0.221
rated-0.207
 forehead-0.202
ently-0.202
appointed-0.200
Token the
Token ID262
Feature activation-6.74e-01

Pos logit contributions
 chefs+0.043
 queens+0.038
 kings+0.035
 charity+0.034
 pirates+0.033
Neg logit contributions
aned-0.058
rated-0.055
ently-0.054
appointed-0.053
iously-0.053
Token man
Token ID582
Feature activation+2.87e+00

Pos logit contributions
aned+0.027
rated+0.025
ently+0.025
iously+0.025
appointed+0.024
Neg logit contributions
 chefs-0.020
 queens-0.018
 kings-0.016
 charity-0.016
 pirates-0.015
Token go
Token ID467
Feature activation+2.24e+00

Pos logit contributions
aned+0.025
 forehead+0.024
rated+0.024
ently+0.023
appointed+0.023
Neg logit contributions
 chefs-0.024
 queens-0.022
 kings-0.019
 charity-0.019
 builders-0.018
Token away
Token ID1497
Feature activation-6.94e+00

Pos logit contributions
 chefs+0.073
 queens+0.065
 kings+0.059
 charity+0.059
 builders+0.056
Neg logit contributions
aned-0.078
 forehead-0.074
rated-0.073
appointed-0.072
iously-0.071
Token."
Token ID526
Feature activation-9.64e-01

Pos logit contributions
aned+0.229
rated+0.222
ently+0.218
iously+0.216
 forehead+0.216
Neg logit contributions
 chefs-0.230
 queens-0.208
 charity-0.191
 kings-0.190
 builders-0.181
Token\n
Token ID198
Feature activation+2.20e+00

Pos logit contributions
aned+0.043
rated+0.042
ently+0.039
iously+0.039
 forehead+0.039
Neg logit contributions
 chefs-0.024
 queens-0.020
 charity-0.020
 pirates-0.018
 kings-0.018
Token\n
Token ID198
Feature activation+2.23e+00

Pos logit contributions
 chefs+0.058
 queens+0.052
 charity+0.049
 kings+0.047
 pirates+0.045
Neg logit contributions
aned-0.091
rated-0.088
iously-0.086
 forehead-0.085
appointed-0.083
Token"
Token ID1
Feature activation+1.31e+01

Pos logit contributions
 chefs+0.056
 queens+0.049
 kings+0.043
 charity+0.040
 spies+0.040
Neg logit contributions
aned-0.096
rated-0.089
appointed-0.089
ently-0.089
 forehead-0.089
TokenWhat
Token ID2061
Feature activation+2.56e+00

Pos logit contributions
 chefs+0.447
 queens+0.410
 kings+0.394
 spies+0.359
 pirates+0.359
Neg logit contributions
aned-0.425
appointed-0.401
 forehead-0.400
rated-0.396
iously-0.382
Token do
Token ID466
Feature activation-1.20e+00

Pos logit contributions
 chefs+0.101
 queens+0.090
 kings+0.087
 charity+0.085
 pirates+0.084
Neg logit contributions
aned-0.082
iously-0.073
rated-0.072
 forehead-0.072
appointed-0.071
Token you
Token ID345
Feature activation+3.62e-01

Pos logit contributions
aned+0.049
rated+0.045
 forehead+0.044
ently+0.044
iously+0.044
Neg logit contributions
 chefs-0.036
 queens-0.032
 charity-0.029
 kings-0.029
 pirates-0.028
Token want
Token ID765
Feature activation+4.16e+00

Pos logit contributions
 chefs+0.010
 queens+0.008
 kings+0.008
 pirates+0.008
 spies+0.008
Neg logit contributions
aned-0.015
ently-0.015
iously-0.014
 forehead-0.014
appointed-0.014
Token to
Token ID284
Feature activation+2.84e-01

Pos logit contributions
 chefs+0.118
 queens+0.106
 kings+0.096
 charity+0.095
 spies+0.090
Neg logit contributions
aned-0.169
rated-0.159
 forehead-0.155
ently-0.155
iously-0.155
Token they
Token ID484
Feature activation+4.62e+00

Pos logit contributions
 chefs+0.103
 queens+0.093
 kings+0.084
 charity+0.083
 pirates+0.078
Neg logit contributions
aned-0.142
rated-0.134
 forehead-0.131
ently-0.131
iously-0.130
Token are
Token ID389
Feature activation+2.57e+00

Pos logit contributions
 chefs+0.140
 queens+0.126
 charity+0.113
 kings+0.111
 spies+0.107
Neg logit contributions
aned-0.175
rated-0.166
 forehead-0.162
ently-0.160
iously-0.159
Token."
Token ID526
Feature activation-8.71e-01

Pos logit contributions
 chefs+0.072
 queens+0.065
 charity+0.061
 spies+0.053
 builders+0.053
Neg logit contributions
aned-0.097
 forehead-0.091
ently-0.089
iously-0.089
appointed-0.089
Token\n
Token ID198
Feature activation+1.96e+00

Pos logit contributions
aned+0.036
rated+0.035
ently+0.033
iously+0.033
 forehead+0.033
Neg logit contributions
 chefs-0.024
 queens-0.021
 charity-0.020
 kings-0.020
 pirates-0.019
Token\n
Token ID198
Feature activation+2.01e+00

Pos logit contributions
 chefs+0.056
 queens+0.050
 charity+0.047
 kings+0.046
 pirates+0.044
Neg logit contributions
aned-0.078
rated-0.075
iously-0.073
 forehead-0.073
ently-0.071
Token"
Token ID1
Feature activation+1.31e+01

Pos logit contributions
 chefs+0.050
 queens+0.044
 kings+0.039
 charity+0.037
 spies+0.036
Neg logit contributions
aned-0.087
rated-0.082
appointed-0.081
ently-0.081
 forehead-0.080
TokenMaybe
Token ID13300
Feature activation+2.65e+00

Pos logit contributions
 chefs+0.392
 queens+0.342
 kings+0.336
 spies+0.307
 charity+0.302
Neg logit contributions
aned-0.465
appointed-0.451
 forehead-0.443
rated-0.429
iously-0.421
Token they
Token ID484
Feature activation+4.59e+00

Pos logit contributions
 chefs+0.083
 queens+0.075
 charity+0.068
 kings+0.068
 spies+0.064
Neg logit contributions
aned-0.098
rated-0.092
ently-0.090
 forehead-0.090
iously-0.090
Token are
Token ID389
Feature activation+2.44e+00

Pos logit contributions
 chefs+0.158
 queens+0.145
 charity+0.132
 kings+0.129
 spies+0.126
Neg logit contributions
aned-0.153
rated-0.147
 forehead-0.140
iously-0.140
ently-0.139
Token rob
Token ID3857
Feature activation+4.73e+00

Pos logit contributions
 chefs+0.066
 queens+0.066
 charity+0.062
J+0.057
 kings+0.053
Neg logit contributions
aned-0.083
 forehead-0.082
appointed-0.079
iously-0.079
rated-0.078
Tokenins
Token ID1040
Feature activation+5.08e+00

Pos logit contributions
 chefs+0.132
 charity+0.123
 queens+0.117
 spies+0.104
 kings+0.102
Neg logit contributions
aned-0.174
ently-0.171
rated-0.164
iously-0.163
 forehead-0.159
Token are
Token ID389
Feature activation+2.66e+00

Pos logit contributions
 chefs+0.092
 queens+0.083
 kings+0.077
 charity+0.075
 pirates+0.072
Neg logit contributions
aned-0.106
rated-0.098
 forehead-0.097
ently-0.096
iously-0.096
Token you
Token ID345
Feature activation+6.21e-01

Pos logit contributions
 chefs+0.081
 queens+0.074
 charity+0.067
 kings+0.065
 spies+0.062
Neg logit contributions
aned-0.094
ently-0.088
iously-0.087
rated-0.086
appointed-0.086
Token?"
Token ID1701
Feature activation-1.24e+00

Pos logit contributions
 chefs+0.022
 queens+0.020
 kings+0.019
 charity+0.018
 pirates+0.018
Neg logit contributions
aned-0.021
rated-0.019
 forehead-0.019
ently-0.019
iously-0.019
Token\n
Token ID198
Feature activation+2.01e+00

Pos logit contributions
aned+0.051
rated+0.051
ently+0.047
iously+0.046
 forehead+0.046
Neg logit contributions
 chefs-0.034
 queens-0.029
 charity-0.028
 kings-0.027
 pirates-0.027
Token\n
Token ID198
Feature activation+2.09e+00

Pos logit contributions
 chefs+0.052
 queens+0.047
 charity+0.044
 kings+0.042
 pirates+0.040
Neg logit contributions
aned-0.085
rated-0.083
iously-0.081
 forehead-0.079
ently-0.077
Token"
Token ID1
Feature activation+1.31e+01

Pos logit contributions
 chefs+0.051
 queens+0.045
 kings+0.040
 charity+0.038
 spies+0.037
Neg logit contributions
aned-0.091
rated-0.086
ently-0.085
 forehead-0.085
appointed-0.085
TokenI
Token ID40
Feature activation+5.01e+00

Pos logit contributions
 chefs+0.330
 queens+0.286
 kings+0.276
 spies+0.246
 charity+0.245
Neg logit contributions
aned-0.529
appointed-0.503
 forehead-0.499
ently-0.484
rated-0.479
Token am
Token ID716
Feature activation+2.30e+00

Pos logit contributions
 chefs+0.172
 queens+0.162
 kings+0.146
 charity+0.145
 builders+0.140
Neg logit contributions
aned-0.167
rated-0.160
 forehead-0.152
appointed-0.150
ently-0.150
Token a
Token ID257
Feature activation+3.58e+00

Pos logit contributions
 chefs+0.071
 queens+0.065
 charity+0.059
 kings+0.058
 spies+0.055
Neg logit contributions
aned-0.084
 forehead-0.079
iously-0.078
rated-0.078
ently-0.077
Token spirit
Token ID4437
Feature activation+2.14e+00

Pos logit contributions
 chefs+0.103
 queens+0.097
 kings+0.087
 builders+0.082
 charity+0.080
Neg logit contributions
 forehead-0.127
rated-0.117
aned-0.113
iously-0.112
 cheek-0.111
Token,"
Token ID553
Feature activation+1.43e+00

Pos logit contributions
 chefs+0.080
 queens+0.073
 charity+0.068
 kings+0.068
 spies+0.065
Neg logit contributions
aned-0.065
rated-0.061
 forehead-0.060
iously-0.059
ently-0.059
Token ge
Token ID4903
Feature activation+2.12e+00

Pos logit contributions
 chefs+0.023
 queens+0.020
 charity+0.018
 kings+0.018
 spies+0.017
Neg logit contributions
aned-0.028
 forehead-0.027
rated-0.027
iously-0.027
ently-0.027
Tokenese
Token ID2771
Feature activation+3.82e+00

Pos logit contributions
 chefs+0.067
 queens+0.061
 kings+0.056
 charity+0.055
 pirates+0.052
Neg logit contributions
aned-0.078
rated-0.073
 forehead-0.072
ently-0.072
iously-0.071
Token.
Token ID13
Feature activation-7.97e-01

Pos logit contributions
 chefs+0.101
 queens+0.091
 kings+0.084
 charity+0.082
 pirates+0.078
Neg logit contributions
aned-0.160
rated-0.149
appointed-0.146
ently-0.145
iously-0.145
Token\n
Token ID198
Feature activation+2.00e+00

Pos logit contributions
aned+0.031
rated+0.030
 forehead+0.028
ently+0.028
iously+0.027
Neg logit contributions
 chefs-0.024
 queens-0.022
 charity-0.021
 kings-0.020
 pirates-0.020
Token\n
Token ID198
Feature activation+2.07e+00

Pos logit contributions
 chefs+0.061
 queens+0.056
 charity+0.052
 kings+0.051
 pirates+0.049
Neg logit contributions
aned-0.075
rated-0.071
iously-0.070
 forehead-0.070
ently-0.068
Token"
Token ID1
Feature activation+1.31e+01

Pos logit contributions
 chefs+0.055
 queens+0.048
 kings+0.043
 charity+0.041
 spies+0.040
Neg logit contributions
aned-0.087
rated-0.081
appointed-0.081
ently-0.080
 forehead-0.080
TokenMaybe
Token ID13300
Feature activation+2.60e+00

Pos logit contributions
 chefs+0.343
 queens+0.306
 kings+0.283
 charity+0.268
 spies+0.259
Neg logit contributions
aned-0.531
rated-0.504
 forehead-0.499
appointed-0.497
ently-0.489
Token it
Token ID340
Feature activation-2.17e+00

Pos logit contributions
 chefs+0.081
 queens+0.074
 charity+0.067
 kings+0.067
 spies+0.063
Neg logit contributions
aned-0.096
rated-0.090
ently-0.089
 forehead-0.089
iously-0.088
Token got
Token ID1392
Feature activation+4.49e+00

Pos logit contributions
aned+0.077
 forehead+0.072
ently+0.071
appointed+0.069
 spine+0.069
Neg logit contributions
 chefs-0.071
 queens-0.064
 kings-0.063
 pirates-0.059
 spies-0.056
Token hurt
Token ID5938
Feature activation+4.21e-01

Pos logit contributions
 chefs+0.130
 queens+0.114
 charity+0.109
 kings+0.102
 spies+0.097
Neg logit contributions
aned-0.170
rated-0.164
iously-0.160
appointed-0.158
 forehead-0.157
Token by
Token ID416
Feature activation-7.49e+00

Pos logit contributions
 chefs+0.017
 queens+0.016
 charity+0.016
 kings+0.015
 spies+0.014
Neg logit contributions
aned-0.010
rated-0.010
iously-0.010
ently-0.010
appointed-0.009
Token touch
Token ID3638
Feature activation-2.38e+00

Pos logit contributions
 chefs+0.184
 queens+0.168
 kings+0.150
 charity+0.149
 spies+0.144
Neg logit contributions
aned-0.203
rated-0.192
 forehead-0.191
appointed-0.188
ently-0.186
Token it
Token ID340
Feature activation-2.38e+00

Pos logit contributions
aned+0.084
rated+0.078
ently+0.077
 forehead+0.077
iously+0.076
Neg logit contributions
 chefs-0.078
 queens-0.071
 charity-0.065
 kings-0.065
 spies-0.061
Token?"
Token ID1701
Feature activation-1.44e+00

Pos logit contributions
aned+0.090
rated+0.083
iously+0.082
 forehead+0.081
appointed+0.081
Neg logit contributions
 chefs-0.074
 queens-0.067
 kings-0.061
 charity-0.061
 spies-0.059
Token\n
Token ID198
Feature activation+1.79e+00

Pos logit contributions
aned+0.061
rated+0.059
 forehead+0.055
ently+0.055
iously+0.055
Neg logit contributions
 chefs-0.039
 queens-0.033
 charity-0.031
 kings-0.031
 pirates-0.030
Token\n
Token ID198
Feature activation+1.92e+00

Pos logit contributions
 chefs+0.043
 queens+0.038
 charity+0.037
 kings+0.035
 pirates+0.033
Neg logit contributions
aned-0.077
rated-0.076
iously-0.074
 forehead-0.073
appointed-0.070
Token"
Token ID1
Feature activation+1.31e+01

Pos logit contributions
 chefs+0.045
 queens+0.040
 kings+0.034
 charity+0.034
 spies+0.032
Neg logit contributions
aned-0.086
rated-0.082
 forehead-0.081
ently-0.081
iously-0.080
TokenNo
Token ID2949
Feature activation-3.25e+00

Pos logit contributions
 chefs+0.420
 queens+0.382
 kings+0.362
 pirates+0.333
 charity+0.332
Neg logit contributions
aned-0.465
appointed-0.438
 forehead-0.435
rated-0.432
ently-0.422
Token,
Token ID11
Feature activation+5.81e+00

Pos logit contributions
aned+0.156
rated+0.147
appointed+0.141
ently+0.140
 forehead+0.138
Neg logit contributions
 chefs-0.070
 queens-0.055
 kings-0.052
 charity-0.052
 pirates-0.049
Token we
Token ID356
Feature activation+6.01e+00

Pos logit contributions
 chefs+0.188
 queens+0.172
 kings+0.156
 charity+0.156
 spies+0.148
Neg logit contributions
aned-0.209
rated-0.195
 forehead-0.192
ently-0.191
iously-0.190
Token cannot
Token ID2314
Feature activation+1.79e-01

Pos logit contributions
 chefs+0.189
 queens+0.170
 kings+0.156
 charity+0.155
 spies+0.147
Neg logit contributions
aned-0.224
rated-0.208
ently-0.207
 forehead-0.207
iously-0.206
Token touch
Token ID3638
Feature activation-2.35e+00

Pos logit contributions
 chefs+0.006
 queens+0.005
 kings+0.005
 charity+0.005
 spies+0.005
Neg logit contributions
aned-0.006
rated-0.006
 forehead-0.006
ently-0.006
iously-0.006
Token are
Token ID389
Feature activation+2.67e+00

Pos logit contributions
 chefs+0.099
 queens+0.092
 charity+0.080
 kings+0.078
 spies+0.078
Neg logit contributions
aned-0.116
rated-0.113
 forehead-0.111
appointed-0.108
iously-0.106
Token happy
Token ID3772
Feature activation+2.27e-01

Pos logit contributions
 queens+0.070
 chefs+0.069
 charity+0.066
Sp+0.057
 builders+0.056
Neg logit contributions
 forehead-0.095
aned-0.095
appointed-0.088
rated-0.086
iously-0.084
Token.
Token ID13
Feature activation-6.90e-01

Pos logit contributions
 chefs+0.007
 queens+0.007
 charity+0.007
 kings+0.006
 spies+0.006
Neg logit contributions
aned-0.007
 forehead-0.007
ently-0.007
iously-0.007
rated-0.007
Token
Token ID198
Feature activation+2.27e+00

Pos logit contributions
aned+0.025
rated+0.023
 forehead+0.023
ently+0.023
iously+0.023
Neg logit contributions
 chefs-0.022
 queens-0.020
 kings-0.019
 charity-0.019
 spies-0.018
Token
Token ID198
Feature activation+2.23e+00

Pos logit contributions
 chefs+0.074
 queens+0.068
 charity+0.063
 kings+0.063
 pirates+0.060
Neg logit contributions
aned-0.081
rated-0.077
iously-0.076
 forehead-0.075
ently-0.073
Token"
Token ID1
Feature activation+1.31e+01

Pos logit contributions
 chefs+0.057
 queens+0.051
 kings+0.044
 charity+0.043
 spies+0.041
Neg logit contributions
aned-0.095
rated-0.089
appointed-0.089
ently-0.088
 forehead-0.087
TokenNo
Token ID2949
Feature activation-3.55e+00

Pos logit contributions
 chefs+0.331
 queens+0.303
 kings+0.277
 charity+0.264
 spies+0.253
Neg logit contributions
aned-0.531
rated-0.506
appointed-0.505
 forehead-0.499
ently-0.483
Token,
Token ID11
Feature activation+5.94e+00

Pos logit contributions
aned+0.163
rated+0.151
appointed+0.148
ently+0.147
 forehead+0.144
Neg logit contributions
 chefs-0.089
 queens-0.073
 kings-0.070
 charity-0.068
 pirates-0.066
Token thank
Token ID5875
Feature activation+9.95e-01

Pos logit contributions
 chefs+0.199
 queens+0.183
 kings+0.167
 charity+0.165
 builders+0.157
Neg logit contributions
aned-0.196
 forehead-0.188
rated-0.185
iously-0.182
ently-0.179
Token you
Token ID345
Feature activation+3.85e-01

Pos logit contributions
 chefs+0.029
 queens+0.026
 kings+0.024
 charity+0.023
 pirates+0.022
Neg logit contributions
aned-0.040
rated-0.038
ently-0.037
 forehead-0.036
iously-0.036
Token!"
Token ID2474
Feature activation+6.93e-02

Pos logit contributions
 chefs+0.012
 queens+0.011
 kings+0.010
 charity+0.010
 spies+0.009
Neg logit contributions
aned-0.015
rated-0.014
ently-0.013
 forehead-0.013
appointed-0.013
Token and
Token ID290
Feature activation-1.95e+00

Pos logit contributions
 chefs+0.101
 queens+0.093
 charity+0.084
 kings+0.083
 spies+0.079
Neg logit contributions
aned-0.128
rated-0.122
 forehead-0.120
ently-0.119
iously-0.119
Token sweet
Token ID6029
Feature activation+2.07e+00

Pos logit contributions
aned+0.075
rated+0.070
ently+0.069
 forehead+0.069
iously+0.069
Neg logit contributions
 chefs-0.058
 queens-0.053
 kings-0.048
 charity-0.047
 spies-0.045
Token."
Token ID526
Feature activation-1.22e+00

Pos logit contributions
 chefs+0.067
 queens+0.060
 kings+0.055
 charity+0.053
 pirates+0.052
Neg logit contributions
aned-0.077
rated-0.070
appointed-0.070
ently-0.069
 forehead-0.069
Token
Token ID198
Feature activation+1.44e+00

Pos logit contributions
aned+0.052
rated+0.051
ently+0.048
iously+0.047
 forehead+0.047
Neg logit contributions
 chefs-0.033
 queens-0.028
 charity-0.026
 kings-0.025
 pirates-0.025
Token
Token ID198
Feature activation+1.66e+00

Pos logit contributions
 chefs+0.037
 queens+0.033
 charity+0.031
 kings+0.030
 pirates+0.028
Neg logit contributions
aned-0.061
rated-0.059
iously-0.059
 forehead-0.057
ently-0.055
Token"
Token ID1
Feature activation+1.31e+01

Pos logit contributions
 chefs+0.040
 queens+0.035
 kings+0.031
 charity+0.030
 spies+0.029
Neg logit contributions
aned-0.073
rated-0.068
ently-0.068
 forehead-0.067
appointed-0.067
TokenMaybe
Token ID13300
Feature activation+2.65e+00

Pos logit contributions
 chefs+0.384
 queens+0.340
 kings+0.330
 spies+0.306
 charity+0.302
Neg logit contributions
aned-0.479
appointed-0.460
 forehead-0.455
rated-0.448
ently-0.435
Token it
Token ID340
Feature activation-2.25e+00

Pos logit contributions
 chefs+0.082
 queens+0.075
 charity+0.068
 kings+0.067
 spies+0.064
Neg logit contributions
aned-0.098
rated-0.093
ently-0.091
 forehead-0.091
iously-0.091
Token is
Token ID318
Feature activation-4.15e-02

Pos logit contributions
aned+0.077
 forehead+0.072
ently+0.070
iously+0.069
appointed+0.069
Neg logit contributions
 chefs-0.078
 queens-0.071
 kings-0.068
 pirates-0.063
 charity-0.061
Token for
Token ID329
Feature activation+2.60e+00

Pos logit contributions
aned+0.002
rated+0.002
 forehead+0.002
iously+0.002
ently+0.002
Neg logit contributions
 chefs-0.001
 queens-0.001
 charity-0.001
 kings-0.001
 spies-0.001
Token someone
Token ID2130
Feature activation+4.44e+00

Pos logit contributions
 chefs+0.076
 queens+0.069
 charity+0.062
 kings+0.062
 spies+0.058
Neg logit contributions
aned-0.094
iously-0.090
 forehead-0.090
ently-0.088
rated-0.086
Token your
Token ID534
Feature activation-7.81e-01

Pos logit contributions
 chefs+0.002
 queens+0.002
 charity+0.002
 kings+0.002
 spies+0.002
Neg logit contributions
aned-0.002
rated-0.002
appointed-0.002
 forehead-0.002
 hurts-0.002
Token bird
Token ID6512
Feature activation+2.64e+00

Pos logit contributions
aned+0.028
rated+0.026
ently+0.026
appointed+0.026
iously+0.025
Neg logit contributions
 chefs-0.025
 queens-0.023
 kings-0.021
 charity-0.021
 pirates-0.020
Token."
Token ID526
Feature activation-1.43e+00

Pos logit contributions
 chefs+0.080
 queens+0.073
 kings+0.068
 charity+0.065
 pirates+0.063
Neg logit contributions
aned-0.100
rated-0.092
ently-0.091
appointed-0.091
iously-0.091
Token
Token ID198
Feature activation+1.61e+00

Pos logit contributions
aned+0.060
rated+0.058
ently+0.056
iously+0.055
 forehead+0.054
Neg logit contributions
 chefs-0.039
 queens-0.034
 charity-0.033
 kings-0.031
 pirates-0.030
Token
Token ID198
Feature activation+1.80e+00

Pos logit contributions
 chefs+0.042
 queens+0.037
 charity+0.036
 kings+0.034
 pirates+0.032
Neg logit contributions
aned-0.067
rated-0.065
iously-0.065
 forehead-0.063
ently-0.062
Token"
Token ID1
Feature activation+1.31e+01

Pos logit contributions
 chefs+0.041
 queens+0.036
 kings+0.032
 charity+0.030
 spies+0.029
Neg logit contributions
aned-0.081
rated-0.077
ently-0.076
appointed-0.076
 forehead-0.076
TokenNo
Token ID2949
Feature activation-3.11e+00

Pos logit contributions
 chefs+0.384
 queens+0.340
 kings+0.319
 charity+0.302
 spies+0.295
Neg logit contributions
aned-0.503
rated-0.472
 forehead-0.471
appointed-0.471
ently-0.457
Token,
Token ID11
Feature activation+5.83e+00

Pos logit contributions
aned+0.146
rated+0.137
appointed+0.134
ently+0.132
 forehead+0.129
Neg logit contributions
 chefs-0.070
 queens-0.057
 kings-0.054
 charity-0.052
 pirates-0.051
Token it
Token ID340
Feature activation-2.25e+00

Pos logit contributions
 chefs+0.190
 queens+0.173
 kings+0.158
 charity+0.157
 spies+0.149
Neg logit contributions
aned-0.207
rated-0.194
 forehead-0.192
ently-0.190
iously-0.190
Token can
Token ID460
Feature activation+1.91e+00

Pos logit contributions
aned+0.085
 spine+0.081
ently+0.079
 forehead+0.078
iously+0.078
Neg logit contributions
 chefs-0.064
 kings-0.053
 kind-0.051
 queens-0.048
 like-0.048
Token't
Token ID470
Feature activation+7.93e-01

Pos logit contributions
 chefs+0.064
 queens+0.060
 kings+0.056
 charity+0.055
 pirates+0.053
Neg logit contributions
aned-0.063
rated-0.060
ently-0.059
iously-0.058
 forehead-0.057
Token touch
Token ID3638
Feature activation-2.25e+00

Pos logit contributions
 chefs+0.005
 queens+0.005
 kings+0.004
 charity+0.004
 spies+0.004
Neg logit contributions
aned-0.006
rated-0.006
 forehead-0.006
ently-0.006
iously-0.006
Token it
Token ID340
Feature activation-2.21e+00

Pos logit contributions
aned+0.085
rated+0.080
ently+0.078
 forehead+0.078
iously+0.078
Neg logit contributions
 chefs-0.068
 queens-0.062
 charity-0.056
 kings-0.056
 spies-0.053
Token."
Token ID526
Feature activation-1.04e+00

Pos logit contributions
aned+0.091
rated+0.084
 forehead+0.083
iously+0.082
ently+0.082
Neg logit contributions
 chefs-0.061
 queens-0.054
 charity-0.049
 kings-0.049
 pirates-0.046
Token
Token ID198
Feature activation+1.77e+00

Pos logit contributions
aned+0.045
rated+0.044
ently+0.042
iously+0.041
 forehead+0.041
Neg logit contributions
 chefs-0.027
 queens-0.023
 charity-0.022
 kings-0.021
 pirates-0.021
Token
Token ID198
Feature activation+1.87e+00

Pos logit contributions
 chefs+0.044
 queens+0.039
 charity+0.037
 kings+0.035
 pirates+0.034
Neg logit contributions
aned-0.076
rated-0.074
iously-0.073
 forehead-0.072
appointed-0.069
Token"
Token ID1
Feature activation+1.31e+01

Pos logit contributions
 chefs+0.043
 queens+0.038
 kings+0.033
 charity+0.033
 spies+0.030
Neg logit contributions
aned-0.084
rated-0.080
 forehead-0.079
ently-0.079
iously-0.078
TokenMe
Token ID5308
Feature activation-4.41e+00

Pos logit contributions
 chefs+0.399
 queens+0.361
 kings+0.339
 charity+0.317
 spies+0.311
Neg logit contributions
aned-0.486
appointed-0.457
rated-0.455
 forehead-0.452
ently-0.439
Token too
Token ID1165
Feature activation-2.35e-01

Pos logit contributions
aned+0.211
rated+0.200
ently+0.193
zed+0.187
 forehead+0.186
Neg logit contributions
 chefs-0.111
 queens-0.095
 pirates-0.082
 kings-0.081
 spies-0.080
Token,"
Token ID553
Feature activation+1.46e+00

Pos logit contributions
aned+0.009
 forehead+0.008
rated+0.008
ently+0.008
iously+0.008
Neg logit contributions
 chefs-0.007
 queens-0.007
 kings-0.006
 charity-0.006
 builders-0.006
Token Ben
Token ID3932
Feature activation+1.99e-01

Pos logit contributions
 chefs+0.043
 queens+0.039
 kings+0.035
 charity+0.035
 pirates+0.033
Neg logit contributions
aned-0.059
rated-0.055
ently-0.054
appointed-0.053
iously-0.053
Token said
Token ID531
Feature activation-3.03e+00

Pos logit contributions
 chefs+0.007
 queens+0.006
 kings+0.006
 pirates+0.006
 spies+0.006
Neg logit contributions
aned-0.006
ently-0.006
rated-0.006
appointed-0.006
iously-0.006
Token are
Token ID389
Feature activation+2.40e+00

Pos logit contributions
aned+0.038
rated+0.035
ently+0.035
iously+0.035
 forehead+0.035
Neg logit contributions
 chefs-0.029
 queens-0.026
 kings-0.024
 charity-0.024
 pirates-0.023
Token well
Token ID880
Feature activation-7.19e-01

Pos logit contributions
 chefs+0.067
 queens+0.062
 charity+0.055
 kings+0.053
 spies+0.051
Neg logit contributions
aned-0.091
 forehead-0.086
rated-0.085
ently-0.084
iously-0.083
Token."
Token ID526
Feature activation-1.03e+00

Pos logit contributions
aned+0.024
rated+0.023
 forehead+0.023
ently+0.022
iously+0.022
Neg logit contributions
 chefs-0.023
 queens-0.022
 charity-0.020
 kings-0.020
 spies-0.019
Token
Token ID198
Feature activation+1.80e+00

Pos logit contributions
aned+0.044
rated+0.043
ently+0.040
iously+0.040
 forehead+0.040
Neg logit contributions
 chefs-0.028
 queens-0.024
 charity-0.023
 kings-0.022
 pirates-0.022
Token
Token ID198
Feature activation+1.89e+00

Pos logit contributions
 chefs+0.049
 queens+0.045
 charity+0.042
 kings+0.041
 pirates+0.039
Neg logit contributions
aned-0.073
rated-0.070
iously-0.069
 forehead-0.068
ently-0.067
Token"
Token ID1
Feature activation+1.31e+01

Pos logit contributions
 chefs+0.045
 queens+0.039
 kings+0.034
 charity+0.032
 spies+0.031
Neg logit contributions
aned-0.084
rated-0.079
appointed-0.079
ently-0.078
 forehead-0.078
TokenThank
Token ID10449
Feature activation+2.69e+00

Pos logit contributions
 chefs+0.396
 queens+0.350
 kings+0.340
 spies+0.312
 builders+0.308
Neg logit contributions
aned-0.457
appointed-0.442
 forehead-0.436
rated-0.421
ently-0.416
Token you
Token ID345
Feature activation+4.39e-01

Pos logit contributions
 chefs+0.086
 queens+0.077
 kings+0.071
 charity+0.071
 spies+0.067
Neg logit contributions
aned-0.098
rated-0.091
ently-0.091
 forehead-0.090
iously-0.090
Token,
Token ID11
Feature activation+5.65e+00

Pos logit contributions
 chefs+0.014
 queens+0.013
 kings+0.012
 charity+0.012
 spies+0.011
Neg logit contributions
aned-0.016
rated-0.015
ently-0.015
 forehead-0.015
appointed-0.015
Token Lily
Token ID20037
Feature activation+5.85e-01

Pos logit contributions
 chefs+0.174
 queens+0.157
 kings+0.142
 charity+0.142
 spies+0.134
Neg logit contributions
aned-0.210
rated-0.197
 forehead-0.195
ently-0.193
iously-0.193
Token,"
Token ID553
Feature activation+1.35e+00

Pos logit contributions
 chefs+0.018
 queens+0.016
 kings+0.015
 pirates+0.014
 spies+0.014
Neg logit contributions
aned-0.022
rated-0.020
appointed-0.020
ently-0.020
iously-0.020
Token a
Token ID257
Feature activation+3.82e+00

Pos logit contributions
 chefs+0.066
 queens+0.058
 kings+0.052
 charity+0.051
 pirates+0.048
Neg logit contributions
aned-0.115
rated-0.108
ently-0.106
appointed-0.105
 forehead-0.105
Token feather
Token ID17972
Feature activation-8.95e-01

Pos logit contributions
 chefs+0.118
 queens+0.107
 charity+0.097
 kings+0.097
 spies+0.092
Neg logit contributions
aned-0.144
rated-0.134
ently-0.132
iously-0.131
 forehead-0.131
Token.
Token ID13
Feature activation-7.60e-01

Pos logit contributions
aned+0.036
rated+0.033
appointed+0.033
ently+0.033
iously+0.033
Neg logit contributions
 chefs-0.026
 queens-0.023
 kings-0.021
 charity-0.021
 pirates-0.020
Token
Token ID198
Feature activation+2.14e+00

Pos logit contributions
aned+0.028
rated+0.027
 forehead+0.025
ently+0.025
iously+0.025
Neg logit contributions
 chefs-0.025
 queens-0.023
 charity-0.021
 kings-0.021
 pirates-0.020
Token
Token ID198
Feature activation+2.14e+00

Pos logit contributions
 chefs+0.059
 queens+0.053
 charity+0.050
 kings+0.048
 pirates+0.046
Neg logit contributions
aned-0.087
rated-0.083
iously-0.082
 forehead-0.082
ently-0.079
Token"
Token ID1
Feature activation+1.31e+01

Pos logit contributions
 chefs+0.053
 queens+0.047
 kings+0.041
 charity+0.040
 spies+0.038
Neg logit contributions
aned-0.093
rated-0.088
ently-0.087
 forehead-0.086
appointed-0.086
TokenThank
Token ID10449
Feature activation+2.83e+00

Pos logit contributions
 chefs+0.361
 queens+0.325
 kings+0.300
 charity+0.281
 builders+0.274
Neg logit contributions
aned-0.515
rated-0.487
 forehead-0.486
appointed-0.481
ently-0.474
Token you
Token ID345
Feature activation+7.04e-01

Pos logit contributions
 chefs+0.090
 queens+0.082
 kings+0.075
 charity+0.075
 spies+0.070
Neg logit contributions
aned-0.102
rated-0.095
ently-0.095
 forehead-0.094
iously-0.094
Token,
Token ID11
Feature activation+5.82e+00

Pos logit contributions
 chefs+0.022
 queens+0.020
 kings+0.018
 charity+0.018
 spies+0.017
Neg logit contributions
aned-0.027
rated-0.025
ently-0.024
 forehead-0.024
appointed-0.024
Token bird
Token ID6512
Feature activation+1.63e+00

Pos logit contributions
 chefs+0.187
 queens+0.170
 kings+0.153
 charity+0.153
 builders+0.145
Neg logit contributions
aned-0.203
 forehead-0.192
rated-0.190
iously-0.188
ently-0.188
Token,
Token ID11
Feature activation+5.72e+00

Pos logit contributions
 chefs+0.052
 queens+0.048
 kings+0.045
 pirates+0.043
 charity+0.042
Neg logit contributions
aned-0.060
appointed-0.055
rated-0.054
 forehead-0.053
ently-0.053
Token y
Token ID331
Feature activation-1.60e+00

Pos logit contributions
aned+0.129
rated+0.119
 forehead+0.116
ently+0.115
iously+0.114
Neg logit contributions
 chefs-0.179
 queens-0.167
 charity-0.155
 kings-0.155
 spies-0.148
Tokenucky
Token ID5309
Feature activation+1.95e+00

Pos logit contributions
aned+0.044
rated+0.040
ently+0.039
iously+0.039
 forehead+0.039
Neg logit contributions
 chefs-0.066
 queens-0.061
 charity-0.057
 kings-0.057
 spies-0.055
Token."
Token ID526
Feature activation-1.46e+00

Pos logit contributions
 chefs+0.057
 queens+0.048
 kings+0.044
 pirates+0.043
 spies+0.041
Neg logit contributions
aned-0.078
appointed-0.073
ently-0.072
rated-0.072
iously-0.070
Token
Token ID198
Feature activation+1.44e+00

Pos logit contributions
aned+0.060
rated+0.058
ently+0.055
iously+0.054
 forehead+0.054
Neg logit contributions
 chefs-0.041
 queens-0.036
 charity-0.033
 kings-0.032
 pirates-0.032
Token
Token ID198
Feature activation+1.68e+00

Pos logit contributions
 chefs+0.038
 queens+0.034
 charity+0.032
 kings+0.030
 pirates+0.029
Neg logit contributions
aned-0.060
rated-0.058
iously-0.057
 forehead-0.056
ently-0.055
Token"
Token ID1
Feature activation+1.31e+01

Pos logit contributions
 chefs+0.040
 queens+0.035
 kings+0.031
 charity+0.030
 spies+0.029
Neg logit contributions
aned-0.075
rated-0.070
ently-0.069
appointed-0.069
 forehead-0.069
TokenMe
Token ID5308
Feature activation-4.49e+00

Pos logit contributions
 chefs+0.372
 queens+0.333
 kings+0.319
 charity+0.290
 spies+0.290
Neg logit contributions
aned-0.500
appointed-0.474
 forehead-0.470
rated-0.465
ently-0.449
Token neither
Token ID6159
Feature activation+4.06e+00

Pos logit contributions
aned+0.207
rated+0.194
ently+0.192
 forehead+0.185
appointed+0.184
Neg logit contributions
 chefs-0.116
 queens-0.099
 kings-0.084
 pirates-0.082
 spies-0.081
Token,"
Token ID553
Feature activation+1.34e+00

Pos logit contributions
 chefs+0.126
 queens+0.115
 kings+0.107
 charity+0.100
 builders+0.100
Neg logit contributions
aned-0.152
 forehead-0.140
ently-0.138
appointed-0.135
rated-0.135
Token Ben
Token ID3932
Feature activation+2.64e-01

Pos logit contributions
 chefs+0.037
 queens+0.033
 kings+0.030
 charity+0.030
 pirates+0.029
Neg logit contributions
aned-0.056
rated-0.053
ently-0.051
appointed-0.050
iously-0.050
Token said
Token ID531
Feature activation-3.03e+00

Pos logit contributions
 chefs+0.010
 queens+0.008
 kings+0.008
 pirates+0.008
 spies+0.008
Neg logit contributions
aned-0.008
ently-0.008
rated-0.008
appointed-0.007
iously-0.007
Token for
Token ID329
Feature activation+2.78e+00

Pos logit contributions
 queens+0.083
 chefs+0.082
 charity+0.076
 kings+0.073
 spies+0.067
Neg logit contributions
aned-0.085
iously-0.084
ently-0.079
rated-0.079
 forehead-0.077
Token Ben
Token ID3932
Feature activation+4.75e-01

Pos logit contributions
 chefs+0.077
 queens+0.074
 charity+0.064
 kings+0.063
 spies+0.059
Neg logit contributions
aned-0.104
iously-0.101
 forehead-0.097
ently-0.097
appointed-0.093
Token.
Token ID13
Feature activation-5.69e-01

Pos logit contributions
 chefs+0.015
 queens+0.014
 kings+0.012
 charity+0.012
 spies+0.012
Neg logit contributions
aned-0.017
rated-0.016
 forehead-0.016
ently-0.016
iously-0.016
Token
Token ID198
Feature activation+2.09e+00

Pos logit contributions
aned+0.022
rated+0.020
 forehead+0.020
iously+0.020
ently+0.020
Neg logit contributions
 chefs-0.018
 queens-0.016
 charity-0.015
 kings-0.015
 spies-0.014
Token
Token ID198
Feature activation+2.13e+00

Pos logit contributions
 chefs+0.065
 queens+0.059
 charity+0.055
 kings+0.054
 spies+0.051
Neg logit contributions
aned-0.078
rated-0.073
 forehead-0.072
iously-0.072
ently-0.071
Token"
Token ID1
Feature activation+1.31e+01

Pos logit contributions
 chefs+0.054
 queens+0.048
 kings+0.042
 charity+0.040
 spies+0.038
Neg logit contributions
aned-0.091
rated-0.085
appointed-0.085
ently-0.085
 forehead-0.084
TokenBen
Token ID11696
Feature activation+1.24e+00

Pos logit contributions
 chefs+0.328
 queens+0.312
 kings+0.290
 charity+0.270
 builders+0.254
Neg logit contributions
aned-0.508
rated-0.472
 forehead-0.471
appointed-0.468
ently-0.463
Token,
Token ID11
Feature activation+5.97e+00

Pos logit contributions
 chefs+0.039
 queens+0.035
 kings+0.032
 charity+0.031
 spies+0.030
Neg logit contributions
aned-0.046
rated-0.043
ently-0.042
 forehead-0.042
iously-0.042
Token you
Token ID345
Feature activation+3.35e-01

Pos logit contributions
 chefs+0.191
 queens+0.180
 kings+0.161
 charity+0.157
 builders+0.152
Neg logit contributions
aned-0.197
 forehead-0.193
rated-0.186
iously-0.182
ently-0.181
Token need
Token ID761
Feature activation+5.37e+00

Pos logit contributions
 chefs+0.012
 queens+0.011
 charity+0.010
 kings+0.010
 spies+0.010
Neg logit contributions
aned-0.010
rated-0.009
 forehead-0.009
appointed-0.009
iously-0.009
Token to
Token ID284
Feature activation+2.33e-01

Pos logit contributions
 chefs+0.145
 queens+0.135
 kings+0.119
 charity+0.118
 spies+0.112
Neg logit contributions
aned-0.207
 forehead-0.197
ently-0.194
iously-0.194
appointed-0.187
Token.
Token ID13
Feature activation-1.43e+00

Pos logit contributions
aned+0.091
rated+0.084
appointed+0.082
ently+0.082
iously+0.081
Neg logit contributions
 chefs-0.061
 queens-0.052
 charity-0.050
 kings-0.049
 spies-0.048
Token
Token ID220
Feature activation-6.05e+00

Pos logit contributions
aned+0.053
rated+0.053
ently+0.049
iously+0.049
appointed+0.046
Neg logit contributions
 chefs-0.042
 charity-0.036
 queens-0.036
 spies-0.033
 pirates-0.033
Token
Token ID198
Feature activation+1.57e+00

Pos logit contributions
rated+0.258
aned+0.257
 forehead+0.252
iously+0.250
appointed+0.234
Neg logit contributions
 chefs-0.144
 queens-0.121
 charity-0.118
 kings-0.108
 spies-0.107
Token
Token ID198
Feature activation+1.59e+00

Pos logit contributions
 chefs+0.048
 queens+0.043
 charity+0.040
 kings+0.039
 spies+0.037
Neg logit contributions
aned-0.059
rated-0.056
iously-0.055
 forehead-0.054
ently-0.054
TokenRel
Token ID6892
Feature activation-7.15e+00

Pos logit contributions
 chefs+0.051
 queens+0.046
 kings+0.042
 charity+0.041
 spies+0.039
Neg logit contributions
aned-0.059
 forehead-0.054
rated-0.054
ently-0.054
appointed-0.054
Tokenuct
Token ID4782
Feature activation-1.67e+01

Pos logit contributions
 forehead+0.144
rated+0.138
aned+0.130
iously+0.128
 spine+0.125
Neg logit contributions
 chefs-0.308
 queens-0.281
 kings-0.277
J-0.268
 especially-0.266
Tokenantly
Token ID3875
Feature activation-6.42e+00

Pos logit contributions
aned+0.306
rated+0.303
 forehead+0.297
iously+0.285
 spine+0.281
Neg logit contributions
 chefs-0.756
 queens-0.722
 kings-0.684
 pirates-0.665
 builders-0.661
Token,
Token ID11
Feature activation+5.11e+00

Pos logit contributions
aned+0.237
rated+0.222
ently+0.217
 forehead+0.217
iously+0.215
Neg logit contributions
 chefs-0.203
 queens-0.184
 kings-0.167
 charity-0.167
 spies-0.159
Token Elsa
Token ID19226
Feature activation-7.05e-01

Pos logit contributions
 chefs+0.141
 queens+0.125
 charity+0.114
 kings+0.110
 spies+0.106
Neg logit contributions
aned-0.212
rated-0.200
iously-0.194
ently-0.194
appointed-0.190
Token left
Token ID1364
Feature activation+2.23e+00

Pos logit contributions
aned+0.026
rated+0.025
ently+0.024
appointed+0.024
 forehead+0.024
Neg logit contributions
 chefs-0.022
 queens-0.019
 kings-0.018
 charity-0.017
 pirates-0.017
Token the
Token ID262
Feature activation-1.26e+00

Pos logit contributions
 chefs+0.068
 queens+0.062
 kings+0.056
 charity+0.055
 spies+0.052
Neg logit contributions
aned-0.084
 forehead-0.079
ently-0.079
rated-0.078
iously-0.078
Token sea
Token ID5417
Feature activation+6.23e-01

Pos logit contributions
aned+0.049
 forehead+0.049
rated+0.047
iously+0.046
appointed+0.045
Neg logit contributions
 chefs-0.038
 queens-0.036
 kings-0.032
 charity-0.032
 builders-0.030
Token monster
Token ID9234
Feature activation+3.00e-01

Pos logit contributions
 chefs+0.022
 queens+0.020
 kings+0.018
 charity+0.018
 spies+0.018
Neg logit contributions
aned-0.021
rated-0.019
 forehead-0.019
ently-0.019
iously-0.019
Token insisted
Token ID11189
Feature activation-3.81e+00

Pos logit contributions
 chefs+0.010
 queens+0.009
 kings+0.008
 charity+0.008
 pirates+0.008
Neg logit contributions
aned-0.011
rated-0.010
ently-0.010
appointed-0.010
 forehead-0.010
Token.
Token ID13
Feature activation-1.59e+00

Pos logit contributions
aned+0.164
rated+0.162
appointed+0.156
 forehead+0.151
iously+0.148
Neg logit contributions
 chefs-0.093
 queens-0.080
 kings-0.074
 pirates-0.072
 charity-0.070
Token Rel
Token ID4718
Feature activation-3.59e+00

Pos logit contributions
aned+0.059
rated+0.058
 forehead+0.052
ently+0.051
iously+0.051
Neg logit contributions
 chefs-0.053
 queens-0.047
 charity-0.044
 pirates-0.043
 kings-0.042
Tokenuct
Token ID4782
Feature activation-1.67e+01

Pos logit contributions
 forehead+0.084
aned+0.082
rated+0.081
 spine+0.076
 Today+0.073
Neg logit contributions
 chefs-0.138
 queens-0.131
 kings-0.129
 charity-0.129
 especially-0.122
Tokenantly
Token ID3875
Feature activation-6.42e+00

Pos logit contributions
 spine+0.363
aned+0.358
 forehead+0.355
rated+0.351
appointed+0.330
Neg logit contributions
 chefs-0.700
 queens-0.632
 pirates-0.630
 kings-0.616
We-0.608
Token,
Token ID11
Feature activation+4.93e+00

Pos logit contributions
aned+0.272
rated+0.265
ently+0.252
 forehead+0.252
appointed+0.248
Neg logit contributions
 chefs-0.168
 queens-0.146
 kings-0.132
 pirates-0.130
 builders-0.126
Token it
Token ID340
Feature activation-3.08e+00

Pos logit contributions
 chefs+0.121
 queens+0.106
 kings+0.096
 charity+0.092
 pirates+0.092
Neg logit contributions
aned-0.225
rated-0.211
ently-0.205
 forehead-0.202
iously-0.202
Token followed
Token ID3940
Feature activation-2.35e+00

Pos logit contributions
aned+0.116
ently+0.109
 forehead+0.107
appointed+0.107
rated+0.107
Neg logit contributions
 chefs-0.092
 queens-0.083
 kings-0.078
 pirates-0.075
 spies-0.073
Token him
Token ID683
Feature activation-4.71e-01

Pos logit contributions
aned+0.091
rated+0.085
ently+0.084
 forehead+0.084
iously+0.084
Neg logit contributions
 chefs-0.071
 queens-0.064
 kings-0.057
 charity-0.057
 spies-0.054
Token buy
Token ID2822
Feature activation+2.70e+00

Pos logit contributions
aned+0.009
 forehead+0.009
rated+0.009
ently+0.009
appointed+0.009
Neg logit contributions
 chefs-0.007
 queens-0.006
 kings-0.006
 spies-0.005
 charity-0.005
Token his
Token ID465
Feature activation-2.19e+00

Pos logit contributions
 chefs+0.086
 queens+0.082
 charity+0.071
 kings+0.071
 spies+0.069
Neg logit contributions
aned-0.091
ently-0.088
iously-0.087
 forehead-0.085
rated-0.084
Token own
Token ID898
Feature activation-3.69e-01

Pos logit contributions
aned+0.087
rated+0.081
ently+0.079
appointed+0.078
 forehead+0.077
Neg logit contributions
 chefs-0.066
 queens-0.057
 charity-0.054
 kings-0.053
 pirates-0.051
Token.
Token ID13
Feature activation-1.44e+00

Pos logit contributions
aned+0.015
rated+0.014
ently+0.014
appointed+0.013
 forehead+0.013
Neg logit contributions
 chefs-0.011
 queens-0.009
 kings-0.009
 charity-0.008
 pirates-0.008
Token Rel
Token ID4718
Feature activation-3.46e+00

Pos logit contributions
aned+0.054
rated+0.054
 forehead+0.052
ently+0.050
iously+0.049
Neg logit contributions
 chefs-0.043
 charity-0.036
 queens-0.036
 builders-0.034
 spies-0.034
Tokenuct
Token ID4782
Feature activation-1.66e+01

Pos logit contributions
 forehead+0.102
aned+0.102
rated+0.100
ently+0.093
iously+0.093
Neg logit contributions
 chefs-0.118
 queens-0.109
 charity-0.105
 kings-0.105
 builders-0.098
Tokenantly
Token ID3875
Feature activation-6.42e+00

Pos logit contributions
 forehead+0.330
rated+0.313
aned+0.312
 spine+0.309
iously+0.290
Neg logit contributions
 chefs-0.743
 queens-0.681
 kings-0.666
 pirates-0.664
 builders-0.663
Token,
Token ID11
Feature activation+5.09e+00

Pos logit contributions
aned+0.231
rated+0.214
iously+0.211
 forehead+0.211
ently+0.211
Neg logit contributions
 chefs-0.209
 queens-0.191
 charity-0.174
 kings-0.174
 spies-0.164
Token his
Token ID465
Feature activation-2.19e+00

Pos logit contributions
 chefs+0.159
 queens+0.144
 kings+0.131
 charity+0.131
 spies+0.125
Neg logit contributions
aned-0.190
rated-0.178
 forehead-0.174
ently-0.173
iously-0.172
Token friend
Token ID1545
Feature activation+3.02e+00

Pos logit contributions
aned+0.086
rated+0.080
 forehead+0.080
iously+0.079
ently+0.079
Neg logit contributions
 chefs-0.064
 queens-0.058
 kings-0.052
 charity-0.052
 spies-0.048
Token gave
Token ID2921
Feature activation+2.33e+00

Pos logit contributions
 chefs+0.104
 queens+0.096
 charity+0.087
 kings+0.087
 spies+0.083
Neg logit contributions
aned-0.102
rated-0.095
 forehead-0.093
iously-0.093
ently-0.093
Token from
Token ID422
Feature activation-4.75e+00

Pos logit contributions
aned+0.274
rated+0.255
 forehead+0.251
ently+0.250
iously+0.248
Neg logit contributions
 chefs-0.220
 queens-0.200
 kings-0.181
 charity-0.180
 spies-0.171
Token the
Token ID262
Feature activation-1.35e+00

Pos logit contributions
aned+0.181
rated+0.172
iously+0.169
 forehead+0.169
ently+0.169
Neg logit contributions
 chefs-0.141
 queens-0.128
 charity-0.113
 kings-0.113
 spies-0.106
Token window
Token ID4324
Feature activation-4.38e+00

Pos logit contributions
aned+0.045
 forehead+0.044
rated+0.044
appointed+0.042
ently+0.042
Neg logit contributions
 chefs-0.046
 queens-0.041
 charity-0.037
 kings-0.037
 spies-0.036
Token.
Token ID13
Feature activation-1.48e+00

Pos logit contributions
aned+0.177
rated+0.166
 forehead+0.162
ently+0.162
appointed+0.161
Neg logit contributions
 chefs-0.121
 queens-0.110
 kings-0.100
 charity-0.098
 pirates-0.093
Token Rel
Token ID4718
Feature activation-3.56e+00

Pos logit contributions
aned+0.056
rated+0.054
ently+0.051
 forehead+0.051
iously+0.049
Neg logit contributions
 chefs-0.045
 queens-0.040
 charity-0.038
 kings-0.037
 spies-0.036
Tokenuct
Token ID4782
Feature activation-1.66e+01

Pos logit contributions
 forehead+0.068
rated+0.063
aned+0.063
 spine+0.059
 Today+0.056
Neg logit contributions
 chefs-0.155
 queens-0.147
 kings-0.144
 charity-0.142
 especially-0.139
Tokenantly
Token ID3875
Feature activation-6.32e+00

Pos logit contributions
rated+0.349
 forehead+0.346
aned+0.337
 spine+0.331
iously+0.300
Neg logit contributions
 chefs-0.699
 queens-0.659
 kings-0.639
 pirates-0.628
 police-0.620
Token,
Token ID11
Feature activation+4.97e+00

Pos logit contributions
aned+0.251
rated+0.242
ently+0.232
 forehead+0.232
appointed+0.227
Neg logit contributions
 chefs-0.177
 queens-0.160
 kings-0.144
 charity-0.140
 pirates-0.138
Token he
Token ID339
Feature activation+1.29e+00

Pos logit contributions
 chefs+0.135
 queens+0.121
 kings+0.109
 charity+0.107
 pirates+0.103
Neg logit contributions
aned-0.206
rated-0.196
ently-0.189
 forehead-0.188
iously-0.187
Token came
Token ID1625
Feature activation+4.34e+00

Pos logit contributions
 chefs+0.042
 queens+0.039
 kings+0.035
 charity+0.035
 spies+0.033
Neg logit contributions
aned-0.046
rated-0.043
 forehead-0.042
ently-0.042
iously-0.042
Token away
Token ID1497
Feature activation-7.20e+00

Pos logit contributions
 chefs+0.132
 queens+0.120
 kings+0.109
 charity+0.108
 spies+0.103
Neg logit contributions
aned-0.165
rated-0.154
 forehead-0.151
ently-0.151
iously-0.150
Token so
Token ID523
Feature activation-3.11e+00

Pos logit contributions
aned+0.004
 forehead+0.004
rated+0.004
appointed+0.004
iously+0.004
Neg logit contributions
 chefs-0.003
 queens-0.003
 charity-0.003
 kings-0.003
 builders-0.003
Token sad
Token ID6507
Feature activation-1.81e+00

Pos logit contributions
aned+0.141
rated+0.133
ently+0.131
 forehead+0.131
iously+0.131
Neg logit contributions
 chefs-0.072
 queens-0.064
 kings-0.056
 charity-0.055
 spies-0.051
Token and
Token ID290
Feature activation-2.08e+00

Pos logit contributions
aned+0.073
rated+0.067
appointed+0.065
 forehead+0.065
iously+0.065
Neg logit contributions
 chefs-0.051
 queens-0.046
 charity-0.042
 kings-0.041
 pirates-0.039
Token embarrassed
Token ID21100
Feature activation-2.88e+00

Pos logit contributions
aned+0.086
ently+0.082
iously+0.078
rated+0.078
 forehead+0.076
Neg logit contributions
 chefs-0.055
 queens-0.051
 kings-0.046
 pirates-0.043
 spies-0.043
Token.
Token ID13
Feature activation-1.50e+00

Pos logit contributions
aned+0.118
rated+0.110
ently+0.107
 forehead+0.107
appointed+0.106
Neg logit contributions
 chefs-0.078
 queens-0.070
 charity-0.063
 kings-0.063
 spies-0.059
Token Tears
Token ID40328
Feature activation-1.28e+01

Pos logit contributions
aned+0.055
rated+0.053
 forehead+0.050
iously+0.049
ently+0.048
Neg logit contributions
 chefs-0.047
 queens-0.043
 charity-0.042
 kings-0.039
 spies-0.037
Token ran
Token ID4966
Feature activation+2.01e-02

Pos logit contributions
aned+0.444
rated+0.417
 forehead+0.409
iously+0.409
ently+0.407
Neg logit contributions
 chefs-0.420
 queens-0.385
 kings-0.353
 charity-0.349
 pirates-0.335
Token down
Token ID866
Feature activation-5.54e+00

Pos logit contributions
 chefs+0.001
 queens+0.001
 kings+0.001
 charity+0.001
 spies+0.001
Neg logit contributions
aned-0.001
rated-0.001
 forehead-0.001
ently-0.001
iously-0.001
Token her
Token ID607
Feature activation-2.44e+00

Pos logit contributions
aned+0.209
ently+0.194
iously+0.192
 forehead+0.192
rated+0.190
Neg logit contributions
 chefs-0.171
 queens-0.152
 charity-0.137
 kings-0.136
 pirates-0.127
Token face
Token ID1986
Feature activation-1.40e+00

Pos logit contributions
aned+0.073
rated+0.072
ulously+0.065
appointed+0.064
ently+0.062
Neg logit contributions
 charity-0.080
 chefs-0.078
 queens-0.075
 fashion-0.074
 pirates-0.072
Token,
Token ID11
Feature activation+4.77e+00

Pos logit contributions
aned+0.058
rated+0.055
 forehead+0.053
ently+0.053
appointed+0.053
Neg logit contributions
 chefs-0.036
 queens-0.033
 charity-0.030
 kings-0.029
 spies-0.027
Token and
Token ID290
Feature activation-1.98e+00

Pos logit contributions
aned+0.216
rated+0.203
 forehead+0.199
ently+0.199
iously+0.197
Neg logit contributions
 chefs-0.161
 queens-0.145
 kings-0.131
 charity-0.130
 spies-0.123
Token hurt
Token ID5938
Feature activation-2.53e-01

Pos logit contributions
aned+0.073
ently+0.072
rated+0.071
iously+0.067
appointed+0.067
Neg logit contributions
 chefs-0.058
 queens-0.053
 kings-0.048
 spies-0.046
 charity-0.046
Token her
Token ID607
Feature activation-2.36e+00

Pos logit contributions
aned+0.011
rated+0.010
ently+0.010
 forehead+0.010
iously+0.010
Neg logit contributions
 chefs-0.007
 queens-0.006
 kings-0.005
 charity-0.005
 spies-0.005
Token knee
Token ID10329
Feature activation-2.34e+00

Pos logit contributions
aned+0.074
rated+0.067
ently+0.067
appointed+0.065
iously+0.058
Neg logit contributions
 chefs-0.086
 queens-0.076
 pirates-0.075
 spies-0.073
 kings-0.073
Token.
Token ID13
Feature activation-1.44e+00

Pos logit contributions
aned+0.097
ently+0.093
rated+0.092
iously+0.090
 forehead+0.090
Neg logit contributions
 chefs-0.056
 queens-0.050
 kings-0.046
 charity-0.045
 spies-0.042
Token Tears
Token ID40328
Feature activation-1.28e+01

Pos logit contributions
rated+0.060
aned+0.059
 forehead+0.054
ently+0.054
inkle+0.054
Neg logit contributions
 chefs-0.037
 queens-0.031
 kings-0.029
 charity-0.028
 spies-0.027
Token started
Token ID2067
Feature activation+8.37e-01

Pos logit contributions
aned+0.434
rated+0.402
 forehead+0.394
ently+0.394
iously+0.391
Neg logit contributions
 chefs-0.442
 queens-0.406
 kings-0.372
 charity-0.372
 spies-0.354
Token to
Token ID284
Feature activation-3.95e-01

Pos logit contributions
 chefs+0.021
 queens+0.019
 charity+0.017
 kings+0.017
 spies+0.015
Neg logit contributions
aned-0.036
rated-0.034
 forehead-0.034
ently-0.034
iously-0.033
Token run
Token ID1057
Feature activation+6.05e-01

Pos logit contributions
aned+0.016
iously+0.015
ently+0.015
rated+0.015
appointed+0.014
Neg logit contributions
 chefs-0.010
 queens-0.009
 kings-0.008
 charity-0.008
 pirates-0.008
Token down
Token ID866
Feature activation-5.60e+00

Pos logit contributions
 chefs+0.023
 queens+0.021
 kings+0.020
 charity+0.020
 pirates+0.019
Neg logit contributions
aned-0.019
rated-0.017
 forehead-0.017
iously-0.017
appointed-0.017
Token her
Token ID607
Feature activation-2.38e+00

Pos logit contributions
aned+0.207
iously+0.192
ently+0.191
 forehead+0.190
rated+0.190
Neg logit contributions
 chefs-0.177
 queens-0.159
 charity-0.146
 kings-0.145
 pirates-0.136
Token Mama
Token ID40084
Feature activation-1.10e+00

Pos logit contributions
 chefs+0.143
 queens+0.123
 kings+0.111
 pirates+0.107
 builders+0.105
Neg logit contributions
aned-0.214
rated-0.198
ently-0.195
appointed-0.192
iously-0.191
Token began
Token ID2540
Feature activation+9.90e-01

Pos logit contributions
aned+0.040
rated+0.038
ently+0.037
 forehead+0.037
appointed+0.037
Neg logit contributions
 chefs-0.035
 queens-0.032
 kings-0.029
 charity-0.028
 pirates-0.028
Token to
Token ID284
Feature activation-3.58e-01

Pos logit contributions
 chefs+0.023
 queens+0.020
 kings+0.018
 charity+0.018
 spies+0.016
Neg logit contributions
aned-0.045
rated-0.042
 forehead-0.042
ently-0.042
iously-0.041
Token cry
Token ID3960
Feature activation-6.12e+00

Pos logit contributions
aned+0.012
rated+0.011
 forehead+0.011
iously+0.011
appointed+0.011
Neg logit contributions
 chefs-0.012
 queens-0.011
 kings-0.010
 charity-0.010
 spies-0.010
Token.
Token ID13
Feature activation-1.59e+00

Pos logit contributions
aned+0.247
rated+0.231
ently+0.228
 forehead+0.225
appointed+0.225
Neg logit contributions
 chefs-0.173
 queens-0.153
 charity-0.138
 kings-0.137
 spies-0.132
Token Tears
Token ID40328
Feature activation-1.28e+01

Pos logit contributions
aned+0.062
rated+0.059
ently+0.058
 forehead+0.055
iously+0.054
Neg logit contributions
 chefs-0.047
 queens-0.039
 kings-0.036
 pirates-0.035
 charity-0.035
Token started
Token ID2067
Feature activation+8.78e-01

Pos logit contributions
aned+0.438
rated+0.407
 forehead+0.399
ently+0.398
iously+0.395
Neg logit contributions
 chefs-0.436
 queens-0.400
 kings-0.366
 charity-0.366
 spies-0.348
Token streaming
Token ID11305
Feature activation-6.73e+00

Pos logit contributions
 chefs+0.023
 queens+0.021
 kings+0.018
 charity+0.018
 pirates+0.017
Neg logit contributions
aned-0.037
rated-0.035
ently-0.034
 forehead-0.034
appointed-0.034
Token down
Token ID866
Feature activation-5.66e+00

Pos logit contributions
aned+0.194
ently+0.178
rated+0.176
 forehead+0.175
iously+0.171
Neg logit contributions
 chefs-0.259
 queens-0.236
 kings-0.223
 pirates-0.215
 spies-0.214
Token her
Token ID607
Feature activation-2.31e+00

Pos logit contributions
aned+0.200
rated+0.184
ently+0.184
iously+0.182
 forehead+0.182
Neg logit contributions
 chefs-0.186
 queens-0.172
 charity-0.157
 kings-0.155
 spies-0.148
Token cheeks
Token ID25839
Feature activation-2.74e+00

Pos logit contributions
aned+0.078
rated+0.078
appointed+0.073
ulously+0.071
ently+0.071
Neg logit contributions
 chefs-0.069
 pirates-0.058
 charity-0.058
 queens-0.057
 spies-0.054
Token big
Token ID1263
Feature activation+3.69e+00

Pos logit contributions
 chefs+0.081
 queens+0.073
 charity+0.065
 kings+0.065
 spies+0.062
Neg logit contributions
aned-0.107
rated-0.100
ently-0.099
iously-0.098
appointed-0.097
Token sad
Token ID6507
Feature activation-1.92e+00

Pos logit contributions
 chefs+0.149
 queens+0.140
 charity+0.130
 kings+0.130
 spies+0.124
Neg logit contributions
aned-0.102
rated-0.095
 forehead-0.094
iously-0.091
ently-0.090
Token inside
Token ID2641
Feature activation-9.32e-01

Pos logit contributions
aned+0.083
ently+0.076
appointed+0.076
rated+0.075
iously+0.073
Neg logit contributions
 chefs-0.049
 queens-0.042
 charity-0.039
 pirates-0.037
 kings-0.037
Token her
Token ID607
Feature activation-2.17e+00

Pos logit contributions
aned+0.038
rated+0.035
 forehead+0.035
ently+0.035
iously+0.035
Neg logit contributions
 chefs-0.026
 queens-0.023
 kings-0.021
 charity-0.021
 spies-0.020
Token.
Token ID13
Feature activation-1.43e+00

Pos logit contributions
aned+0.071
rated+0.067
appointed+0.063
ently+0.062
iously+0.060
Neg logit contributions
 chefs-0.081
 queens-0.073
 kings-0.068
 pirates-0.067
 charity-0.067
Token Tears
Token ID40328
Feature activation-1.27e+01

Pos logit contributions
rated+0.058
aned+0.057
ently+0.053
 forehead+0.053
iously+0.051
Neg logit contributions
 chefs-0.038
 queens-0.031
 charity-0.031
Where-0.029
Why-0.028
Token rolled
Token ID11686
Feature activation+1.30e+00

Pos logit contributions
aned+0.434
rated+0.400
 forehead+0.393
ently+0.391
appointed+0.387
Neg logit contributions
 chefs-0.445
 queens-0.409
 charity-0.376
 kings-0.373
 spies-0.357
Token down
Token ID866
Feature activation-5.48e+00

Pos logit contributions
 chefs+0.050
 queens+0.046
 kings+0.043
 charity+0.042
 spies+0.042
Neg logit contributions
aned-0.038
rated-0.035
ently-0.034
iously-0.034
 forehead-0.034
Token her
Token ID607
Feature activation-2.23e+00

Pos logit contributions
aned+0.196
iously+0.183
ently+0.182
 forehead+0.181
appointed+0.177
Neg logit contributions
 chefs-0.180
 queens-0.162
 charity-0.151
 kings-0.146
 pirates-0.139
Token cheeks
Token ID25839
Feature activation-2.58e+00

Pos logit contributions
rated+0.072
aned+0.072
appointed+0.065
ently+0.063
ulously+0.062
Neg logit contributions
 chefs-0.071
 queens-0.064
 pirates-0.061
 charity-0.061
 fashion-0.061
Token and
Token ID290
Feature activation-2.03e+00

Pos logit contributions
aned+0.095
rated+0.088
 forehead+0.087
iously+0.087
ently+0.086
Neg logit contributions
 chefs-0.082
 queens-0.075
 charity-0.068
 kings-0.068
 spies-0.064
Token girl
Token ID2576
Feature activation-1.07e+00

Pos logit contributions
aned+0.036
 forehead+0.035
rated+0.034
iously+0.034
ently+0.033
Neg logit contributions
 chefs-0.031
 queens-0.028
 charity-0.027
 kings-0.025
 spies-0.024
Token was
Token ID373
Feature activation-1.73e+00

Pos logit contributions
aned+0.037
rated+0.036
ently+0.035
 forehead+0.034
appointed+0.034
Neg logit contributions
 chefs-0.035
 queens-0.031
 kings-0.030
 pirates-0.028
 charity-0.028
Token very
Token ID845
Feature activation+5.68e-01

Pos logit contributions
aned+0.068
rated+0.063
 forehead+0.063
iously+0.062
appointed+0.062
Neg logit contributions
 chefs-0.051
 queens-0.046
 charity-0.042
 kings-0.041
 spies-0.039
Token sad
Token ID6507
Feature activation-1.86e+00

Pos logit contributions
 chefs+0.018
 queens+0.017
 charity+0.016
 kings+0.015
 pirates+0.015
Neg logit contributions
aned-0.021
rated-0.019
 forehead-0.019
appointed-0.019
iously-0.019
Token.
Token ID13
Feature activation-1.39e+00

Pos logit contributions
aned+0.070
rated+0.065
ently+0.064
 forehead+0.064
iously+0.064
Neg logit contributions
 chefs-0.057
 queens-0.052
 kings-0.047
 charity-0.047
 spies-0.044
Token Tears
Token ID40328
Feature activation-1.27e+01

Pos logit contributions
aned+0.055
rated+0.053
ently+0.051
 forehead+0.050
inkle+0.049
Neg logit contributions
 chefs-0.039
 queens-0.034
 charity-0.032
 kings-0.031
 spies-0.029
Token started
Token ID2067
Feature activation+9.54e-01

Pos logit contributions
aned+0.444
rated+0.410
ently+0.402
 forehead+0.402
iously+0.398
Neg logit contributions
 chefs-0.432
 queens-0.396
 charity-0.363
 kings-0.361
 spies-0.346
Token rolling
Token ID10708
Feature activation+1.61e+00

Pos logit contributions
 chefs+0.025
 queens+0.023
 charity+0.020
 kings+0.020
 spies+0.019
Neg logit contributions
aned-0.040
iously-0.038
 forehead-0.037
rated-0.037
ently-0.037
Token down
Token ID866
Feature activation-5.57e+00

Pos logit contributions
 chefs+0.064
 queens+0.059
 kings+0.055
 charity+0.055
 spies+0.053
Neg logit contributions
aned-0.047
rated-0.043
 forehead-0.042
ently-0.042
iously-0.041
Token her
Token ID607
Feature activation-2.31e+00

Pos logit contributions
aned+0.204
ently+0.190
iously+0.189
rated+0.185
appointed+0.185
Neg logit contributions
 chefs-0.178
 queens-0.160
 charity-0.150
 kings-0.145
 pirates-0.139
Token cheeks
Token ID25839
Feature activation-2.64e+00

Pos logit contributions
aned+0.068
rated+0.065
appointed+0.060
ently+0.058
ulously+0.053
Neg logit contributions
 chefs-0.090
 queens-0.080
 charity-0.076
 kings-0.075
 pirates-0.074
Token was
Token ID373
Feature activation-1.73e+00

Pos logit contributions
aned+0.098
 forehead+0.097
ently+0.097
rated+0.094
iously+0.093
Neg logit contributions
 chefs-0.095
 queens-0.086
 kings-0.080
 especially-0.079
 pirates-0.077
Token too
Token ID1165
Feature activation-5.25e-01

Pos logit contributions
aned+0.072
ently+0.070
rated+0.069
 forehead+0.068
iously+0.066
Neg logit contributions
 chefs-0.044
 queens-0.041
 kings-0.036
 spies-0.033
 pirates-0.033
Token high
Token ID1029
Feature activation-5.39e+00

Pos logit contributions
aned+0.022
ently+0.021
 forehead+0.021
iously+0.021
rated+0.020
Neg logit contributions
 chefs-0.014
 queens-0.012
 kings-0.011
 charity-0.011
 builders-0.010
Token up
Token ID510
Feature activation-3.38e+00

Pos logit contributions
aned+0.210
 forehead+0.195
rated+0.194
appointed+0.191
ently+0.191
Neg logit contributions
 chefs-0.156
 queens-0.137
 kings-0.126
 pirates-0.120
 spies-0.120
Token.
Token ID13
Feature activation-1.32e+00

Pos logit contributions
aned+0.131
rated+0.123
 forehead+0.121
ently+0.121
iously+0.120
Neg logit contributions
 chefs-0.100
 queens-0.090
 kings-0.081
 charity-0.081
 spies-0.076
Token Tears
Token ID40328
Feature activation-1.27e+01

Pos logit contributions
aned+0.048
rated+0.047
 forehead+0.044
ently+0.043
iously+0.042
Neg logit contributions
 chefs-0.043
 queens-0.038
 charity-0.036
 pirates-0.034
 kings-0.034
Token threatened
Token ID8556
Feature activation+2.15e+00

Pos logit contributions
aned+0.443
rated+0.412
 forehead+0.404
ently+0.404
iously+0.401
Neg logit contributions
 chefs-0.425
 queens-0.390
 charity-0.355
 kings-0.355
 spies-0.338
Token to
Token ID284
Feature activation-2.91e-01

Pos logit contributions
 chefs+0.057
 queens+0.051
 kings+0.045
 charity+0.045
 spies+0.042
Neg logit contributions
aned-0.090
rated-0.085
 forehead-0.083
ently-0.083
appointed-0.083
Token fall
Token ID2121
Feature activation-1.56e+00

Pos logit contributions
aned+0.012
rated+0.011
ently+0.011
appointed+0.011
iously+0.011
Neg logit contributions
 chefs-0.008
 queens-0.007
 charity-0.007
 kings-0.006
 pirates-0.006
Token from
Token ID422
Feature activation-4.79e+00

Pos logit contributions
aned+0.055
rated+0.052
 forehead+0.051
ently+0.051
appointed+0.050
Neg logit contributions
 chefs-0.050
 queens-0.045
 kings-0.042
 charity-0.041
 spies-0.040
Token his
Token ID465
Feature activation-2.34e+00

Pos logit contributions
aned+0.196
rated+0.184
ently+0.181
 forehead+0.180
appointed+0.179
Neg logit contributions
 chefs-0.134
 queens-0.119
 kings-0.108
 charity-0.106
 pirates-0.102
Token her
Token ID607
Feature activation-2.09e+00

Pos logit contributions
 chefs+0.099
 queens+0.088
 kings+0.078
 charity+0.078
 spies+0.077
Neg logit contributions
aned-0.111
ently-0.105
iously-0.104
 forehead-0.102
appointed-0.100
Token friends
Token ID2460
Feature activation+6.57e+00

Pos logit contributions
aned+0.076
 forehead+0.072
rated+0.072
iously+0.071
ently+0.071
Neg logit contributions
 chefs-0.064
 queens-0.060
 kings-0.053
 charity-0.052
 spies-0.050
Token were
Token ID547
Feature activation+1.01e+00

Pos logit contributions
 chefs+0.216
 queens+0.201
 charity+0.178
 kings+0.177
 spies+0.170
Neg logit contributions
aned-0.230
rated-0.218
ently-0.211
 forehead-0.210
iously-0.210
Token there
Token ID612
Feature activation-2.52e+00

Pos logit contributions
 chefs+0.031
 queens+0.030
 charity+0.026
 kings+0.025
 builders+0.024
Neg logit contributions
aned-0.034
 forehead-0.032
iously-0.032
rated-0.031
ently-0.031
Token.
Token ID13
Feature activation-1.36e+00

Pos logit contributions
aned+0.086
rated+0.083
ently+0.082
iously+0.081
 forehead+0.081
Neg logit contributions
 chefs-0.081
 queens-0.075
 kings-0.066
 charity-0.066
 builders-0.064
Token Tears
Token ID40328
Feature activation-1.27e+01

Pos logit contributions
aned+0.053
rated+0.050
 forehead+0.048
ently+0.047
iously+0.046
Neg logit contributions
 chefs-0.041
 queens-0.036
 kings-0.033
 charity-0.033
 pirates-0.032
Token started
Token ID2067
Feature activation+9.83e-01

Pos logit contributions
aned+0.437
rated+0.402
ently+0.394
 forehead+0.389
appointed+0.388
Neg logit contributions
 chefs-0.445
 queens-0.408
 charity-0.377
 kings-0.370
 spies-0.356
Token rolling
Token ID10708
Feature activation+1.77e+00

Pos logit contributions
 chefs+0.028
 queens+0.025
 charity+0.023
 kings+0.022
 spies+0.020
Neg logit contributions
aned-0.039
iously-0.037
rated-0.037
ently-0.036
appointed-0.036
Token down
Token ID866
Feature activation-5.56e+00

Pos logit contributions
 chefs+0.072
 queens+0.067
 charity+0.063
 kings+0.062
 spies+0.060
Neg logit contributions
aned-0.049
rated-0.045
ently-0.044
 forehead-0.044
iously-0.044
Token her
Token ID607
Feature activation-2.10e+00

Pos logit contributions
aned+0.194
ently+0.183
iously+0.182
rated+0.178
 forehead+0.176
Neg logit contributions
 chefs-0.184
 queens-0.167
 charity-0.154
 kings-0.149
 spies-0.142
Token cheeks
Token ID25839
Feature activation-2.69e+00

Pos logit contributions
aned+0.055
rated+0.052
ently+0.048
appointed+0.048
ulously+0.046
Neg logit contributions
 chefs-0.082
 charity-0.078
 queens-0.077
 pirates-0.073
 spies-0.071
Token no
Token ID645
Feature activation+2.71e+00

Pos logit contributions
 chefs+0.027
 queens+0.024
 kings+0.021
 charity+0.021
 pirates+0.020
Neg logit contributions
aned-0.046
rated-0.043
ently-0.043
 forehead-0.042
iously-0.042
Token energy
Token ID2568
Feature activation-2.25e+00

Pos logit contributions
 chefs+0.086
 queens+0.077
 kings+0.071
 charity+0.071
 pirates+0.070
Neg logit contributions
aned-0.102
ently-0.093
rated-0.092
appointed-0.091
iously-0.091
Token to
Token ID284
Feature activation-3.20e-01

Pos logit contributions
aned+0.090
 forehead+0.085
rated+0.085
iously+0.084
ently+0.083
Neg logit contributions
 chefs-0.062
 queens-0.056
 kings-0.049
 charity-0.049
 spies-0.046
Token play
Token ID711
Feature activation+2.76e+00

Pos logit contributions
aned+0.011
rated+0.010
ently+0.010
 forehead+0.010
iously+0.010
Neg logit contributions
 chefs-0.011
 queens-0.010
 charity-0.009
 kings-0.009
 pirates-0.009
Token.
Token ID13
Feature activation-1.61e+00

Pos logit contributions
 chefs+0.081
 queens+0.078
 kings+0.068
 builders+0.066
 charity+0.065
Neg logit contributions
aned-0.097
ently-0.095
 forehead-0.094
rated-0.092
iously-0.090
Token Tears
Token ID40328
Feature activation-1.26e+01

Pos logit contributions
aned+0.061
rated+0.060
inkle+0.059
 forehead+0.057
ently+0.056
Neg logit contributions
 chefs-0.045
 queens-0.036
 charity-0.036
 especially-0.035
 pirates-0.035
Token rolled
Token ID11686
Feature activation+1.60e+00

Pos logit contributions
aned+0.437
rated+0.407
 forehead+0.399
ently+0.399
iously+0.396
Neg logit contributions
 chefs-0.424
 queens-0.389
 kings-0.355
 charity-0.354
 spies-0.337
Token down
Token ID866
Feature activation-5.55e+00

Pos logit contributions
 chefs+0.061
 queens+0.056
 kings+0.052
 charity+0.051
 spies+0.050
Neg logit contributions
aned-0.048
rated-0.044
 forehead-0.044
ently-0.043
iously-0.043
Token his
Token ID465
Feature activation-2.27e+00

Pos logit contributions
aned+0.184
iously+0.169
ently+0.169
rated+0.168
 forehead+0.166
Neg logit contributions
 chefs-0.196
 queens-0.181
 charity-0.169
 kings-0.164
 spies-0.157
Token cheeks
Token ID25839
Feature activation-2.65e+00

Pos logit contributions
rated+0.075
aned+0.072
appointed+0.067
ently+0.066
ulously+0.064
Neg logit contributions
 chefs-0.070
 pirates-0.066
 superheroes-0.065
 charity-0.064
 queens-0.063
Token and
Token ID290
Feature activation-2.00e+00

Pos logit contributions
aned+0.101
rated+0.095
ently+0.093
 forehead+0.093
iously+0.092
Neg logit contributions
 chefs-0.079
 queens-0.072
 kings-0.065
 charity-0.064
 spies-0.061
Token girl
Token ID2576
Feature activation-1.19e+00

Pos logit contributions
aned+0.039
rated+0.037
 forehead+0.037
iously+0.036
ently+0.036
Neg logit contributions
 chefs-0.030
 queens-0.028
 charity-0.025
 kings-0.025
 builders-0.023
Token didn
Token ID1422
Feature activation-5.75e+00

Pos logit contributions
aned+0.045
rated+0.044
ently+0.042
appointed+0.042
 forehead+0.042
Neg logit contributions
 chefs-0.034
 especially-0.028
 kings-0.027
 queens-0.027
 pirates-0.026
Token't
Token ID470
Feature activation+6.61e-01

Pos logit contributions
rated+0.231
 forehead+0.218
iously+0.215
 spine+0.214
aned+0.214
Neg logit contributions
 chefs-0.142
 queens-0.124
 charity-0.124
 especially-0.114
 builders-0.113
Token understand
Token ID1833
Feature activation-9.00e-01

Pos logit contributions
 chefs+0.019
 queens+0.018
 charity+0.016
 kings+0.016
 spies+0.015
Neg logit contributions
aned-0.026
ently-0.024
iously-0.024
rated-0.024
 forehead-0.024
Token.
Token ID13
Feature activation-1.59e+00

Pos logit contributions
aned+0.035
rated+0.033
 forehead+0.033
ently+0.033
iously+0.032
Neg logit contributions
 chefs-0.026
 queens-0.024
 kings-0.021
 charity-0.021
 spies-0.020
Token Tears
Token ID40328
Feature activation-1.26e+01

Pos logit contributions
aned+0.064
rated+0.062
inkle+0.059
 forehead+0.057
ently+0.055
Neg logit contributions
 chefs-0.046
 queens-0.037
 charity-0.037
 kings-0.036
 pirates-0.034
Token started
Token ID2067
Feature activation+1.08e+00

Pos logit contributions
aned+0.430
rated+0.397
ently+0.389
 forehead+0.388
iously+0.384
Neg logit contributions
 chefs-0.438
 queens-0.403
 charity-0.369
 kings-0.368
 spies-0.351
Token rolling
Token ID10708
Feature activation+1.89e+00

Pos logit contributions
 chefs+0.029
 queens+0.026
 charity+0.024
 kings+0.023
 spies+0.022
Neg logit contributions
aned-0.045
iously-0.042
rated-0.042
 forehead-0.042
ently-0.042
Token down
Token ID866
Feature activation-5.45e+00

Pos logit contributions
 chefs+0.076
 queens+0.070
 kings+0.065
 charity+0.065
 spies+0.063
Neg logit contributions
aned-0.053
rated-0.049
 forehead-0.048
ently-0.048
iously-0.047
Token her
Token ID607
Feature activation-1.98e+00

Pos logit contributions
aned+0.194
ently+0.182
iously+0.180
rated+0.178
 forehead+0.176
Neg logit contributions
 chefs-0.179
 queens-0.163
 charity-0.150
 kings-0.145
 spies-0.139
Token face
Token ID1986
Feature activation-1.29e+00

Pos logit contributions
aned+0.057
rated+0.055
appointed+0.050
ently+0.049
ulously+0.049
Neg logit contributions
 chefs-0.070
 charity-0.066
 queens-0.063
 pirates-0.062
 spies-0.061
Token\n
Token ID198
Feature activation+1.86e+00

Pos logit contributions
 chefs+0.054
 queens+0.048
 charity+0.043
 kings+0.043
 spies+0.041
Neg logit contributions
aned-0.073
rated-0.071
 forehead-0.068
iously-0.067
appointed-0.067
TokenRon
Token ID23672
Feature activation-1.98e+00

Pos logit contributions
 chefs+0.058
 queens+0.054
 kings+0.050
 charity+0.048
 spies+0.047
Neg logit contributions
aned-0.068
iously-0.062
ently-0.062
appointed-0.061
 forehead-0.061
Tokenny
Token ID3281
Feature activation-2.07e+00

Pos logit contributions
appointed+0.054
aned+0.052
inkle+0.051
ently+0.051
alk+0.048
Neg logit contributions
 chefs-0.064
please-0.054
 queens-0.054
L-0.052
J-0.052
Token nodded
Token ID14464
Feature activation-3.22e+00

Pos logit contributions
rated+0.073
aned+0.072
ently+0.070
appointed+0.066
 forehead+0.066
Neg logit contributions
 chefs-0.062
 queens-0.055
 kings-0.052
 pirates-0.050
 especially-0.050
Token.
Token ID13
Feature activation-1.41e+00

Pos logit contributions
aned+0.117
rated+0.110
 forehead+0.109
ently+0.108
iously+0.108
Neg logit contributions
 chefs-0.104
 queens-0.095
 charity-0.086
 kings-0.086
 spies-0.081
Token Tears
Token ID40328
Feature activation-1.26e+01

Pos logit contributions
aned+0.055
rated+0.054
ently+0.050
 forehead+0.050
iously+0.050
Neg logit contributions
 chefs-0.042
 queens-0.036
 charity-0.033
 kings-0.032
 spies-0.031
Token filled
Token ID5901
Feature activation+2.41e+00

Pos logit contributions
aned+0.431
rated+0.400
 forehead+0.392
ently+0.392
iously+0.389
Neg logit contributions
 chefs-0.430
 queens-0.395
 kings-0.361
 charity-0.361
 spies-0.344
Token his
Token ID465
Feature activation-2.18e+00

Pos logit contributions
 chefs+0.073
 queens+0.063
 kings+0.058
 charity+0.058
 pirates+0.057
Neg logit contributions
aned-0.092
rated-0.088
 forehead-0.086
appointed-0.084
ently-0.084
Token eyes
Token ID2951
Feature activation-2.93e+00

Pos logit contributions
rated+0.077
aned+0.073
ently+0.072
appointed+0.069
zag+0.063
Neg logit contributions
 chefs-0.059
 pirates-0.052
 watering-0.051
 nos-0.051
 big-0.051
Token.
Token ID13
Feature activation-1.52e+00

Pos logit contributions
aned+0.110
rated+0.103
 forehead+0.101
ently+0.101
iously+0.100
Neg logit contributions
 chefs-0.090
 queens-0.082
 kings-0.074
 charity-0.074
 spies-0.070
Token\n
Token ID198
Feature activation+1.62e+00

Pos logit contributions
aned+0.057
rated+0.056
 forehead+0.052
ently+0.052
iously+0.051
Neg logit contributions
 chefs-0.048
 queens-0.041
 charity-0.038
 kings-0.037
 pirates-0.036
Token and
Token ID290
Feature activation-1.85e+00

Pos logit contributions
aned+0.135
rated+0.128
 forehead+0.124
ently+0.124
appointed+0.123
Neg logit contributions
 chefs-0.096
 queens-0.085
 kings-0.077
 charity-0.076
 pirates-0.073
Token he
Token ID339
Feature activation+1.53e+00

Pos logit contributions
aned+0.062
rated+0.058
ently+0.057
 forehead+0.056
iously+0.056
Neg logit contributions
 chefs-0.065
 queens-0.059
 kings-0.055
 charity-0.054
 spies-0.052
Token felt
Token ID2936
Feature activation+5.90e-02

Pos logit contributions
 chefs+0.046
 queens+0.041
 kings+0.038
 charity+0.036
 pirates+0.036
Neg logit contributions
aned-0.058
rated-0.056
ently-0.055
 forehead-0.054
iously-0.053
Token sad
Token ID6507
Feature activation-1.81e+00

Pos logit contributions
 chefs+0.002
 queens+0.002
 charity+0.002
 kings+0.002
 builders+0.001
Neg logit contributions
aned-0.002
rated-0.002
 forehead-0.002
appointed-0.002
ently-0.002
Token.
Token ID13
Feature activation-1.53e+00

Pos logit contributions
aned+0.067
rated+0.063
 forehead+0.062
ently+0.062
iously+0.062
Neg logit contributions
 chefs-0.056
 queens-0.052
 kings-0.047
 charity-0.047
 spies-0.044
Token Tears
Token ID40328
Feature activation-1.26e+01

Pos logit contributions
aned+0.056
rated+0.053
ently+0.051
 forehead+0.051
inkle+0.050
Neg logit contributions
 chefs-0.049
 queens-0.041
 charity-0.040
 builders-0.039
 pirates-0.038
Token started
Token ID2067
Feature activation+1.04e+00

Pos logit contributions
aned+0.435
rated+0.401
 forehead+0.393
ently+0.392
appointed+0.388
Neg logit contributions
 chefs-0.431
 queens-0.399
 charity-0.367
 kings-0.363
 spies-0.346
Token to
Token ID284
Feature activation-2.21e-01

Pos logit contributions
 chefs+0.026
 queens+0.023
 kings+0.020
 charity+0.020
 spies+0.019
Neg logit contributions
aned-0.045
rated-0.043
 forehead-0.042
ently-0.042
iously-0.042
Token stream
Token ID4269
Feature activation+2.08e+00

Pos logit contributions
aned+0.009
iously+0.008
ently+0.008
rated+0.008
appointed+0.008
Neg logit contributions
 chefs-0.006
 queens-0.006
 charity-0.005
 kings-0.005
 pirates-0.005
Token down
Token ID866
Feature activation-5.52e+00

Pos logit contributions
 chefs+0.067
 queens+0.061
 kings+0.056
 charity+0.055
 spies+0.053
Neg logit contributions
aned-0.075
rated-0.070
 forehead-0.068
ently-0.068
iously-0.068
Token his
Token ID465
Feature activation-2.30e+00

Pos logit contributions
aned+0.208
rated+0.195
 forehead+0.191
ently+0.190
iously+0.189
Neg logit contributions
 chefs-0.171
 queens-0.155
 kings-0.141
 charity-0.139
 pirates-0.134
Token but
Token ID475
Feature activation-4.23e+00

Pos logit contributions
 chefs+0.132
 queens+0.119
 kings+0.106
 charity+0.106
 spies+0.099
Neg logit contributions
aned-0.202
rated-0.191
ently-0.187
 forehead-0.186
iously-0.185
Token it
Token ID340
Feature activation-3.21e+00

Pos logit contributions
aned+0.176
rated+0.168
ently+0.163
 forehead+0.162
iously+0.162
Neg logit contributions
 chefs-0.110
 queens-0.102
 kings-0.091
 charity-0.089
 spies-0.084
Token seemed
Token ID3947
Feature activation-1.73e+00

Pos logit contributions
aned+0.110
 forehead+0.105
ently+0.103
iously+0.103
rated+0.102
Neg logit contributions
 chefs-0.100
 queens-0.094
 kings-0.086
 especially-0.082
 charity-0.082
Token impossible
Token ID5340
Feature activation-2.00e+00

Pos logit contributions
aned+0.067
rated+0.065
appointed+0.064
iously+0.064
 forehead+0.063
Neg logit contributions
 chefs-0.050
 queens-0.044
 charity-0.040
 kings-0.038
 spies-0.037
Token.
Token ID13
Feature activation-1.74e+00

Pos logit contributions
aned+0.081
rated+0.076
 forehead+0.075
appointed+0.074
ently+0.073
Neg logit contributions
 chefs-0.055
 queens-0.049
 kings-0.045
 charity-0.044
 spies-0.042
Token Tears
Token ID40328
Feature activation-1.26e+01

Pos logit contributions
aned+0.064
rated+0.063
 forehead+0.058
iously+0.056
ently+0.056
Neg logit contributions
 chefs-0.054
 queens-0.049
 charity-0.046
 kings-0.044
 spies-0.042
Token started
Token ID2067
Feature activation+9.07e-01

Pos logit contributions
aned+0.435
rated+0.402
ently+0.394
 forehead+0.393
iously+0.389
Neg logit contributions
 chefs-0.430
 queens-0.394
 charity-0.362
 kings-0.359
 spies-0.343
Token to
Token ID284
Feature activation-3.25e-01

Pos logit contributions
 chefs+0.024
 queens+0.021
 charity+0.019
 kings+0.019
 spies+0.018
Neg logit contributions
aned-0.038
rated-0.036
 forehead-0.035
ently-0.035
iously-0.035
Token come
Token ID1282
Feature activation+1.92e+00

Pos logit contributions
aned+0.012
iously+0.011
ently+0.011
rated+0.011
 forehead+0.011
Neg logit contributions
 chefs-0.009
 queens-0.008
 charity-0.008
 kings-0.008
 pirates-0.007
Token from
Token ID422
Feature activation-4.82e+00

Pos logit contributions
 chefs+0.060
 queens+0.055
 kings+0.050
 charity+0.050
 spies+0.048
Neg logit contributions
aned-0.071
rated-0.065
 forehead-0.065
ently-0.064
iously-0.064
Token her
Token ID607
Feature activation-2.25e+00

Pos logit contributions
aned+0.194
rated+0.182
 forehead+0.179
ently+0.179
iously+0.177
Neg logit contributions
 chefs-0.137
 queens-0.123
 kings-0.110
 charity-0.109
 spies-0.104
Token it
Token ID340
Feature activation-3.00e+00

Pos logit contributions
aned+0.078
ently+0.074
 forehead+0.072
rated+0.071
iously+0.071
Neg logit contributions
 chefs-0.052
 queens-0.048
 kings-0.043
 pirates-0.040
 spies-0.040
Token had
Token ID550
Feature activation+1.13e+00

Pos logit contributions
aned+0.110
ently+0.108
 forehead+0.106
rated+0.104
appointed+0.103
Neg logit contributions
 chefs-0.091
 queens-0.080
 kings-0.075
 pirates-0.074
 spies-0.070
Token been
Token ID587
Feature activation-3.61e-01

Pos logit contributions
 chefs+0.029
 queens+0.026
 kings+0.023
 charity+0.023
 spies+0.022
Neg logit contributions
aned-0.048
rated-0.045
 forehead-0.044
ently-0.044
iously-0.044
Token caught
Token ID4978
Feature activation+3.98e+00

Pos logit contributions
aned+0.014
iously+0.013
rated+0.013
 forehead+0.012
ently+0.012
Neg logit contributions
 chefs-0.011
 queens-0.010
 charity-0.009
 kings-0.008
 builders-0.008
Token.
Token ID13
Feature activation-1.56e+00

Pos logit contributions
 chefs+0.115
 queens+0.104
 charity+0.094
 kings+0.092
 spies+0.087
Neg logit contributions
aned-0.156
rated-0.147
ently-0.145
iously-0.144
 forehead-0.144
Token Tears
Token ID40328
Feature activation-1.25e+01

Pos logit contributions
rated+0.060
aned+0.059
 forehead+0.055
ently+0.053
inkle+0.052
Neg logit contributions
 chefs-0.047
 queens-0.041
 pirates-0.039
 kings-0.038
 charity-0.038
Token filled
Token ID5901
Feature activation+2.43e+00

Pos logit contributions
aned+0.438
rated+0.405
ently+0.397
 forehead+0.396
iously+0.394
Neg logit contributions
 chefs-0.424
 queens-0.389
 charity-0.357
 kings-0.353
 spies-0.337
Token the
Token ID262
Feature activation-1.22e+00

Pos logit contributions
 chefs+0.050
 queens+0.039
 kings+0.037
 pirates+0.036
 spies+0.034
Neg logit contributions
aned-0.117
rated-0.112
 forehead-0.112
appointed-0.107
iously-0.107
Token oy
Token ID35104
Feature activation-8.61e-01

Pos logit contributions
aned+0.055
ently+0.050
appointed+0.050
rated+0.050
iously+0.049
Neg logit contributions
 chefs-0.032
 queens-0.027
 kings-0.025
 pirates-0.024
 charity-0.023
Tokenster
Token ID1706
Feature activation+3.69e+00

Pos logit contributions
aned+0.029
rated+0.027
appointed+0.027
 forehead+0.026
ently+0.026
Neg logit contributions
 chefs-0.030
 queens-0.027
 kings-0.025
 charity-0.024
 pirates-0.024
Token's
Token ID338
Feature activation+8.89e-01

Pos logit contributions
 chefs+0.091
 queens+0.083
 pirates+0.079
 kings+0.078
 including+0.072
Neg logit contributions
aned-0.149
rated-0.142
appointed-0.139
-0.135
ently-0.135
Token as
Token ID355
Feature activation+1.67e+00

Pos logit contributions
aned+0.056
rated+0.052
ently+0.051
appointed+0.051
iously+0.050
Neg logit contributions
 chefs-0.031
 queens-0.026
 kings-0.025
 charity-0.024
 pirates-0.024
Token the
Token ID262
Feature activation-1.04e+00

Pos logit contributions
 chefs+0.049
 queens+0.044
 charity+0.040
 kings+0.039
 spies+0.037
Neg logit contributions
aned-0.063
rated-0.060
 forehead-0.059
ently-0.059
appointed-0.059
Token fire
Token ID2046
Feature activation+6.78e-01

Pos logit contributions
aned+0.040
 forehead+0.038
rated+0.038
appointed+0.037
iously+0.036
Neg logit contributions
 chefs-0.030
 queens-0.028
 charity-0.025
 kings-0.024
 spies-0.023
Token grew
Token ID6348
Feature activation+5.57e+00

Pos logit contributions
 chefs+0.021
 queens+0.019
 kings+0.018
 charity+0.017
 pirates+0.017
Neg logit contributions
aned-0.025
rated-0.023
ently-0.023
 forehead-0.023
appointed-0.023
Token.
Token ID13
Feature activation-1.19e+00

Pos logit contributions
 chefs+0.181
 queens+0.167
 charity+0.165
 kings+0.150
 spies+0.145
Neg logit contributions
aned-0.194
rated-0.189
appointed-0.182
iously-0.180
ently-0.180
Token Tears
Token ID40328
Feature activation-1.25e+01

Pos logit contributions
aned+0.042
ently+0.039
rated+0.039
 forehead+0.037
iously+0.036
Neg logit contributions
 chefs-0.040
 queens-0.034
 charity-0.033
 pirates-0.033
 kings-0.033
Token ran
Token ID4966
Feature activation+5.40e-01

Pos logit contributions
aned+0.440
rated+0.409
 forehead+0.398
ently+0.398
appointed+0.394
Neg logit contributions
 chefs-0.422
 queens-0.388
 charity-0.357
 kings-0.352
 spies-0.337
Token down
Token ID866
Feature activation-5.26e+00

Pos logit contributions
 chefs+0.020
 queens+0.018
 charity+0.017
 kings+0.017
 pirates+0.016
Neg logit contributions
aned-0.017
rated-0.016
 forehead-0.016
appointed-0.016
iously-0.015
Token their
Token ID511
Feature activation+1.03e-01

Pos logit contributions
aned+0.165
iously+0.156
ently+0.155
rated+0.152
 forehead+0.151
Neg logit contributions
 chefs-0.183
 queens-0.171
 charity-0.163
 kings-0.151
please-0.145
Token faces
Token ID6698
Feature activation+2.09e+00

Pos logit contributions
 chefs+0.004
 queens+0.003
 pirates+0.003
 charity+0.003
 kings+0.003
Neg logit contributions
aned-0.003
ently-0.003
rated-0.003
appointed-0.003
iously-0.003
Token as
Token ID355
Feature activation+1.58e+00

Pos logit contributions
 chefs+0.064
 queens+0.058
 kings+0.052
 charity+0.052
 spies+0.049
Neg logit contributions
aned-0.079
rated-0.073
 forehead-0.072
ently-0.072
iously-0.072
Token.
Token ID13
Feature activation-8.33e-01

Pos logit contributions
aned+0.033
rated+0.032
ently+0.031
appointed+0.031
iously+0.031
Neg logit contributions
 chefs-0.020
 queens-0.018
 kings-0.016
 charity-0.016
 pirates-0.015
Token She
Token ID1375
Feature activation+7.08e-01

Pos logit contributions
aned+0.033
rated+0.032
ently+0.030
 forehead+0.030
iously+0.030
Neg logit contributions
 chefs-0.024
 queens-0.022
 charity-0.021
 kings-0.020
 pirates-0.019
Token got
Token ID1392
Feature activation+4.06e+00

Pos logit contributions
 chefs+0.022
 queens+0.020
 kings+0.018
 charity+0.018
 pirates+0.017
Neg logit contributions
aned-0.026
rated-0.025
ently-0.024
appointed-0.024
iously-0.024
Token hurt
Token ID5938
Feature activation+1.24e-01

Pos logit contributions
 chefs+0.146
 queens+0.139
 charity+0.129
 kings+0.126
 spies+0.118
Neg logit contributions
aned-0.127
 forehead-0.118
appointed-0.113
rated-0.113
iously-0.112
Token!
Token ID0
Feature activation-1.35e+00

Pos logit contributions
 chefs+0.004
 queens+0.004
 charity+0.004
 kings+0.003
 spies+0.003
Neg logit contributions
aned-0.004
ently-0.004
 forehead-0.004
iously-0.004
rated-0.004
Token Tears
Token ID40328
Feature activation-1.25e+01

Pos logit contributions
rated+0.055
aned+0.054
ently+0.050
 forehead+0.050
iously+0.049
Neg logit contributions
 chefs-0.038
 queens-0.032
 charity-0.032
 spies-0.029
 kings-0.029
Token ran
Token ID4966
Feature activation+3.56e-02

Pos logit contributions
aned+0.430
rated+0.401
 forehead+0.392
ently+0.392
iously+0.390
Neg logit contributions
 chefs-0.425
 queens-0.389
 kings-0.356
 charity-0.354
 spies-0.339
Token down
Token ID866
Feature activation-5.41e+00

Pos logit contributions
 chefs+0.001
 queens+0.001
 kings+0.001
 charity+0.001
 pirates+0.001
Neg logit contributions
aned-0.001
rated-0.001
 forehead-0.001
appointed-0.001
iously-0.001
Token her
Token ID607
Feature activation-2.39e+00

Pos logit contributions
aned+0.198
iously+0.182
 forehead+0.181
ently+0.181
appointed+0.178
Neg logit contributions
 chefs-0.174
 queens-0.158
 charity-0.143
 kings-0.143
 pirates-0.134
Token face
Token ID1986
Feature activation-1.21e+00

Pos logit contributions
rated+0.068
aned+0.066
ently+0.061
appointed+0.059
ulously+0.057
Neg logit contributions
 chefs-0.085
 charity-0.078
 queens-0.077
 spies-0.077
 pirates-0.076
Token.
Token ID13
Feature activation-1.54e+00

Pos logit contributions
aned+0.050
rated+0.048
ently+0.047
appointed+0.046
 forehead+0.046
Neg logit contributions
 chefs-0.031
 queens-0.027
 charity-0.025
 kings-0.024
 spies-0.023
Token to
Token ID284
Feature activation-1.76e-01

Pos logit contributions
 chefs+0.032
 queens+0.029
 charity+0.026
 kings+0.025
 spies+0.024
Neg logit contributions
aned-0.044
rated-0.041
 forehead-0.041
iously-0.041
appointed-0.041
Token go
Token ID467
Feature activation+1.53e+00

Pos logit contributions
aned+0.006
rated+0.006
ently+0.006
iously+0.006
 forehead+0.006
Neg logit contributions
 chefs-0.006
 queens-0.005
 charity-0.005
 kings-0.005
 pirates-0.005
Token to
Token ID284
Feature activation-1.90e-01

Pos logit contributions
 chefs+0.049
 queens+0.044
 kings+0.040
 charity+0.038
 builders+0.037
Neg logit contributions
aned-0.056
 forehead-0.053
rated-0.052
ently-0.052
appointed-0.051
Token work
Token ID670
Feature activation+9.52e-01

Pos logit contributions
aned+0.008
rated+0.008
ently+0.007
iously+0.007
appointed+0.007
Neg logit contributions
 chefs-0.005
 queens-0.004
 charity-0.004
 kings-0.004
 pirates-0.004
Token.
Token ID13
Feature activation-1.27e+00

Pos logit contributions
 chefs+0.027
 queens+0.025
 charity+0.022
 kings+0.022
 spies+0.021
Neg logit contributions
aned-0.038
rated-0.035
 forehead-0.035
iously-0.034
ently-0.034
Token Tears
Token ID40328
Feature activation-1.25e+01

Pos logit contributions
rated+0.047
inkle+0.046
aned+0.046
ently+0.044
 forehead+0.043
Neg logit contributions
 chefs-0.034
 charity-0.031
 queens-0.030
 spies-0.029
 pirates-0.029
Token started
Token ID2067
Feature activation+1.11e+00

Pos logit contributions
aned+0.436
rated+0.404
 forehead+0.395
ently+0.394
appointed+0.390
Neg logit contributions
 chefs-0.426
 queens-0.392
 charity-0.359
 kings-0.357
 spies-0.340
Token flowing
Token ID17609
Feature activation+3.53e-01

Pos logit contributions
 chefs+0.030
 queens+0.027
 charity+0.024
 kings+0.024
 spies+0.023
Neg logit contributions
aned-0.046
rated-0.043
iously-0.042
ently-0.042
 forehead-0.042
Token down
Token ID866
Feature activation-5.52e+00

Pos logit contributions
 chefs+0.014
 queens+0.013
 charity+0.012
 kings+0.012
 spies+0.011
Neg logit contributions
aned-0.010
rated-0.010
iously-0.009
 forehead-0.009
ently-0.009
Token his
Token ID465
Feature activation-2.11e+00

Pos logit contributions
aned+0.202
ently+0.186
rated+0.186
iously+0.185
 forehead+0.184
Neg logit contributions
 chefs-0.177
 queens-0.161
 charity-0.147
 kings-0.144
 spies-0.137
Token cheeks
Token ID25839
Feature activation-2.57e+00

Pos logit contributions
rated+0.061
aned+0.060
ently+0.057
appointed+0.053
ulously+0.052
Neg logit contributions
 chefs-0.072
 charity-0.068
 pirates-0.065
 fashion-0.064
 superheroes-0.063
Token.
Token ID13
Feature activation-1.49e+00

Pos logit contributions
aned+0.105
rated+0.101
appointed+0.097
ently+0.096
iously+0.096
Neg logit contributions
 chefs-0.072
 queens-0.064
 kings-0.058
 charity-0.057
 spies-0.055
Token "
Token ID366
Feature activation+1.10e+01

Pos logit contributions
aned+0.047
 forehead+0.043
ently+0.043
iously+0.042
appointed+0.042
Neg logit contributions
 chefs-0.053
 queens-0.049
 kings-0.045
 charity-0.045
 builders-0.042
TokenYou
Token ID1639
Feature activation-2.94e+00

Pos logit contributions
 chefs+0.289
 queens+0.259
 kings+0.237
 charity+0.229
 builders+0.218
Neg logit contributions
aned-0.430
rated-0.406
ently-0.403
 forehead-0.402
appointed-0.401
Token are
Token ID389
Feature activation+2.10e+00

Pos logit contributions
aned+0.080
rated+0.076
 forehead+0.075
appointed+0.071
 hurts+0.068
Neg logit contributions
 chefs-0.108
 queens-0.104
 charity-0.097
 kings-0.091
 spies-0.090
Token very
Token ID845
Feature activation+9.51e-01

Pos logit contributions
 chefs+0.064
 queens+0.061
 charity+0.058
please+0.052
 kings+0.052
Neg logit contributions
 forehead-0.062
aned-0.062
appointed-0.059
-0.056
iously-0.055
Token kind
Token ID1611
Feature activation+6.49e+00

Pos logit contributions
 chefs+0.020
 queens+0.017
 charity+0.015
 kings+0.015
 spies+0.014
Neg logit contributions
aned-0.042
 forehead-0.040
rated-0.040
appointed-0.040
ently-0.039
Token.
Token ID13
Feature activation-1.48e+00

Pos logit contributions
 chefs+0.225
 charity+0.210
 queens+0.209
 kings+0.193
please+0.184
Neg logit contributions
aned-0.188
iously-0.180
 forehead-0.177
ently-0.176
rated-0.173
Token I
Token ID314
Feature activation+6.82e+00

Pos logit contributions
aned+0.040
rated+0.036
 forehead+0.035
ently+0.035
iously+0.035
Neg logit contributions
 chefs-0.061
 queens-0.057
 kings-0.053
 charity-0.053
 spies-0.051
Token will
Token ID481
Feature activation-1.02e+00

Pos logit contributions
 chefs+0.200
 queens+0.193
 kings+0.162
 builders+0.161
 charity+0.160
Neg logit contributions
rated-0.246
aned-0.245
 forehead-0.233
appointed-0.226
inkle-0.221
Token be
Token ID307
Feature activation+3.40e+00

Pos logit contributions
aned+0.033
 forehead+0.032
appointed+0.030
rated+0.030
ently+0.030
Neg logit contributions
 chefs-0.036
 queens-0.033
 kings-0.028
 spies-0.028
 builders-0.028
Token your
Token ID534
Feature activation-7.67e-01

Pos logit contributions
 chefs+0.102
 queens+0.098
 charity+0.094
 kings+0.082
 builders+0.081
Neg logit contributions
aned-0.110
 forehead-0.107
appointed-0.101
rated-0.100
ently-0.099
Token.
Token ID13
Feature activation-1.29e+00

Pos logit contributions
 chefs+0.025
 queens+0.024
 charity+0.022
 kings+0.021
 builders+0.020
Neg logit contributions
rated-0.020
aned-0.019
iously-0.018
ently-0.018
 forehead-0.018
Token But
Token ID887
Feature activation-4.66e+00

Pos logit contributions
aned+0.050
ently+0.046
 forehead+0.045
rated+0.045
iously+0.045
Neg logit contributions
 chefs-0.040
 charity-0.035
 queens-0.033
 kings-0.033
 pirates-0.033
Token his
Token ID465
Feature activation-2.00e+00

Pos logit contributions
aned+0.232
rated+0.217
ently+0.210
 forehead+0.207
iously+0.202
Neg logit contributions
 chefs-0.098
 pirates-0.090
 kings-0.081
 especially-0.079
 spies-0.076
Token mom
Token ID1995
Feature activation+4.24e+00

Pos logit contributions
aned+0.087
rated+0.083
ently+0.082
iously+0.080
appointed+0.080
Neg logit contributions
 chefs-0.053
 queens-0.044
 pirates-0.042
 kings-0.041
 spies-0.041
Token said
Token ID531
Feature activation-3.35e+00

Pos logit contributions
 chefs+0.145
 queens+0.135
 charity+0.123
 kings+0.122
 spies+0.117
Neg logit contributions
aned-0.145
rated-0.135
ently-0.132
 forehead-0.131
appointed-0.130
Token "
Token ID366
Feature activation+1.09e+01

Pos logit contributions
aned+0.127
rated+0.118
 forehead+0.117
ently+0.117
iously+0.116
Neg logit contributions
 chefs-0.102
 queens-0.093
 kings-0.084
 charity-0.084
 spies-0.079
Tokenno
Token ID3919
Feature activation-4.55e+00

Pos logit contributions
 chefs+0.317
 queens+0.286
 kings+0.257
 charity+0.256
 spies+0.242
Neg logit contributions
aned-0.432
rated-0.406
 forehead-0.398
ently-0.398
iously-0.396
Token"
Token ID1
Feature activation+1.22e+01

Pos logit contributions
aned+0.178
rated+0.170
 forehead+0.167
appointed+0.165
iously+0.162
Neg logit contributions
 chefs-0.140
 queens-0.123
 kings-0.112
 pirates-0.107
 charity-0.107
Token because
Token ID780
Feature activation-2.12e-01

Pos logit contributions
 chefs+0.305
 queens+0.266
 kings+0.223
 charity+0.218
 spies+0.217
Neg logit contributions
aned-0.518
rated-0.489
 forehead-0.486
iously-0.477
appointed-0.476
Token it
Token ID340
Feature activation-2.95e+00

Pos logit contributions
aned+0.009
rated+0.008
iously+0.008
ently+0.008
 forehead+0.008
Neg logit contributions
 chefs-0.006
 queens-0.006
 kings-0.005
 pirates-0.005
 charity-0.005
Token was
Token ID373
Feature activation-1.34e+00

Pos logit contributions
aned+0.109
ently+0.104
iously+0.102
 forehead+0.102
rated+0.100
Neg logit contributions
 chefs-0.096
 queens-0.084
 kings-0.077
 pirates-0.076
 spies-0.074
Token very
Token ID845
Feature activation+7.64e-01

Pos logit contributions
 chefs+0.023
 queens+0.020
 charity+0.018
 kings+0.018
 spies+0.016
Neg logit contributions
aned-0.049
rated-0.046
ently-0.046
iously-0.045
 forehead-0.045
Token beautiful
Token ID4950
Feature activation+1.16e+00

Pos logit contributions
 chefs+0.014
 queens+0.012
 kings+0.010
 charity+0.010
 pirates+0.008
Neg logit contributions
aned-0.039
rated-0.037
iously-0.036
ently-0.036
 forehead-0.036
Token,
Token ID11
Feature activation+5.27e+00

Pos logit contributions
 chefs+0.035
 queens+0.032
 kings+0.029
 charity+0.028
 pirates+0.027
Neg logit contributions
aned-0.046
rated-0.042
ently-0.042
appointed-0.041
iously-0.041
Token Lily
Token ID20037
Feature activation+2.38e-01

Pos logit contributions
 chefs+0.143
 queens+0.128
 kings+0.116
 charity+0.115
 builders+0.107
Neg logit contributions
aned-0.210
 forehead-0.199
rated-0.197
ently-0.195
appointed-0.195
Token.
Token ID13
Feature activation-1.60e+00

Pos logit contributions
 chefs+0.007
 queens+0.006
 kings+0.006
 charity+0.005
 pirates+0.005
Neg logit contributions
aned-0.010
rated-0.009
ently-0.009
iously-0.009
 forehead-0.009
Token I
Token ID314
Feature activation+6.97e+00

Pos logit contributions
aned+0.054
rated+0.049
ently+0.049
iously+0.048
 forehead+0.047
Neg logit contributions
 chefs-0.058
 queens-0.053
 kings-0.049
 charity-0.049
 pirates-0.047
Token love
Token ID1842
Feature activation+4.28e+00

Pos logit contributions
 chefs+0.216
 queens+0.210
 charity+0.182
 kings+0.180
 spies+0.178
Neg logit contributions
rated-0.244
aned-0.241
 forehead-0.227
inkle-0.225
appointed-0.219
Token it
Token ID340
Feature activation-2.76e+00

Pos logit contributions
 chefs+0.137
 queens+0.122
 charity+0.115
 kings+0.112
 spies+0.108
Neg logit contributions
aned-0.151
 forehead-0.142
iously-0.140
ently-0.140
rated-0.138
Token.
Token ID13
Feature activation-1.60e+00

Pos logit contributions
aned+0.107
rated+0.101
 forehead+0.100
ently+0.099
iously+0.098
Neg logit contributions
 chefs-0.081
 queens-0.073
 charity-0.066
 kings-0.066
 spies-0.062
Token You
Token ID921
Feature activation-1.71e+00

Pos logit contributions
aned+0.047
rated+0.042
ently+0.042
iously+0.041
 forehead+0.041
Neg logit contributions
 chefs-0.064
 queens-0.059
 kings-0.055
 charity-0.055
 spies-0.053
Token did
Token ID750
Feature activation-7.59e+00

Pos logit contributions
aned+0.059
rated+0.056
 forehead+0.055
appointed+0.054
ently+0.053
Neg logit contributions
 chefs-0.056
 queens-0.051
 charity-0.048
 kings-0.046
 spies-0.045
Token
Token ID198
Feature activation+1.14e+00

Pos logit contributions
rated+0.280
aned+0.278
iously+0.276
 forehead+0.275
appointed+0.257
Neg logit contributions
 chefs-0.131
 queens-0.108
 charity-0.105
 kings-0.100
 pirates-0.091
Token
Token ID198
Feature activation+1.31e+00

Pos logit contributions
 chefs+0.031
 queens+0.028
 charity+0.026
 kings+0.025
 pirates+0.024
Neg logit contributions
aned-0.046
rated-0.044
 forehead-0.044
iously-0.044
ently-0.042
Token"
Token ID1
Feature activation+1.26e+01

Pos logit contributions
 chefs+0.035
 queens+0.031
 kings+0.028
 charity+0.027
 spies+0.026
Neg logit contributions
aned-0.054
rated-0.051
ently-0.050
appointed-0.050
 forehead-0.050
TokenWhere
Token ID8496
Feature activation+5.85e+00

Pos logit contributions
 chefs+0.344
 queens+0.306
 kings+0.285
 charity+0.268
 spies+0.263
Neg logit contributions
aned-0.507
appointed-0.479
rated-0.478
 forehead-0.475
ently-0.464
Token are
Token ID389
Feature activation+2.34e+00

Pos logit contributions
 chefs+0.180
 queens+0.152
 kings+0.150
 charity+0.141
 pirates+0.141
Neg logit contributions
aned-0.239
iously-0.218
 forehead-0.215
rated-0.214
appointed-0.214
Token we
Token ID356
Feature activation+6.13e+00

Pos logit contributions
 chefs+0.076
 queens+0.068
 charity+0.065
 kings+0.063
 spies+0.060
Neg logit contributions
aned-0.078
iously-0.073
rated-0.073
 forehead-0.072
appointed-0.072
Token going
Token ID1016
Feature activation+1.59e+00

Pos logit contributions
 chefs+0.183
 queens+0.172
 charity+0.156
 kings+0.146
 spies+0.144
Neg logit contributions
aned-0.209
rated-0.206
 forehead-0.200
iously-0.199
appointed-0.197
Token mom
Token ID1995
Feature activation+4.85e+00

Pos logit contributions
 chefs+0.053
 queens+0.049
 charity+0.045
 kings+0.044
 spies+0.042
Neg logit contributions
aned-0.054
rated-0.050
ently-0.049
 forehead-0.049
iously-0.049
Tokenmy
Token ID1820
Feature activation+6.83e-01

Pos logit contributions
 chefs+0.182
 queens+0.169
 charity+0.158
 kings+0.155
 spies+0.150
Neg logit contributions
aned-0.144
rated-0.136
ently-0.131
 forehead-0.130
appointed-0.129
Token?"
Token ID1701
Feature activation-1.28e+00

Pos logit contributions
 chefs+0.026
 queens+0.024
 charity+0.022
 kings+0.022
 spies+0.021
Neg logit contributions
aned-0.020
rated-0.019
 forehead-0.018
ently-0.018
iously-0.018
Token asked
Token ID1965
Feature activation+1.46e+00

Pos logit contributions
aned+0.054
rated+0.053
ently+0.050
iously+0.049
 forehead+0.049
Neg logit contributions
 chefs-0.033
 queens-0.029
 kings-0.027
 charity-0.027
 pirates-0.026
Token the
Token ID262
Feature activation-1.16e+00

Pos logit contributions
 chefs+0.002
 queens+0.002
please+0.001
 charity+0.001
 kings+0.001
Neg logit contributions
aned-0.001
 forehead-0.001
iously-0.001
ulously-0.001
chief-0.001
Token juice
Token ID13135
Feature activation+8.16e-01

Pos logit contributions
aned+0.041
 forehead+0.041
rated+0.039
appointed+0.037
ently+0.037
Neg logit contributions
 chefs-0.036
 queens-0.034
 kings-0.030
 charity-0.028
 builders-0.028
Token.
Token ID13
Feature activation-1.58e+00

Pos logit contributions
 chefs+0.024
 queens+0.022
 kings+0.020
 charity+0.019
 spies+0.019
Neg logit contributions
aned-0.032
rated-0.030
 forehead-0.029
ently-0.029
iously-0.029
Token
Token ID198
Feature activation+1.96e+00

Pos logit contributions
aned+0.045
rated+0.041
 forehead+0.040
ently+0.040
iously+0.039
Neg logit contributions
 chefs-0.064
 queens-0.059
 kings-0.055
 charity-0.055
 spies-0.053
Token
Token ID198
Feature activation+1.96e+00

Pos logit contributions
 chefs+0.061
 queens+0.055
 charity+0.050
 kings+0.050
 spies+0.047
Neg logit contributions
aned-0.073
rated-0.069
 forehead-0.068
iously-0.067
ently-0.067
Token"
Token ID1
Feature activation+1.24e+01

Pos logit contributions
 chefs+0.070
 queens+0.063
 kings+0.059
 charity+0.057
 superheroes+0.056
Neg logit contributions
aned-0.062
appointed-0.057
ently-0.054
rated-0.053
 forehead-0.052
TokenHey
Token ID10814
Feature activation-1.98e+00

Pos logit contributions
 chefs+0.353
 queens+0.312
 kings+0.303
 charity+0.287
 detectives+0.276
Neg logit contributions
aned-0.436
 forehead-0.409
rated-0.409
appointed-0.404
ulously-0.384
Token,
Token ID11
Feature activation+5.48e+00

Pos logit contributions
aned+0.053
rated+0.052
iously+0.051
 forehead+0.050
ently+0.048
Neg logit contributions
 chefs-0.072
 queens-0.069
 charity-0.065
 kings-0.061
please-0.059
Token you
Token ID345
Feature activation+3.20e-01

Pos logit contributions
 chefs+0.173
 queens+0.163
please+0.144
 kings+0.143
 charity+0.138
Neg logit contributions
aned-0.179
 forehead-0.178
iously-0.170
rated-0.167
ently-0.164
Token two
Token ID734
Feature activation+2.19e+00

Pos logit contributions
 chefs+0.009
 queens+0.008
 charity+0.007
 kings+0.007
 spies+0.007
Neg logit contributions
aned-0.013
rated-0.012
 forehead-0.012
ently-0.012
iously-0.012
Token,"
Token ID553
Feature activation+8.36e-01

Pos logit contributions
 chefs+0.080
 queens+0.075
 charity+0.069
 kings+0.067
please+0.064
Neg logit contributions
aned-0.066
iously-0.060
rated-0.060
ently-0.059
 forehead-0.059
Token They
Token ID1119
Feature activation+2.60e+00

Pos logit contributions
aned+0.059
rated+0.055
 forehead+0.054
ently+0.054
iously+0.053
Neg logit contributions
 chefs-0.043
 queens-0.036
 charity-0.035
 kings-0.035
 pirates-0.033
Token swim
Token ID9422
Feature activation+1.19e-02

Pos logit contributions
 chefs+0.076
 queens+0.070
 charity+0.062
 spies+0.058
 kings+0.058
Neg logit contributions
aned-0.098
rated-0.096
 forehead-0.093
appointed-0.091
iously-0.089
Token and
Token ID290
Feature activation-1.87e+00

Pos logit contributions
 chefs+0.000
 queens+0.000
 charity+0.000
 kings+0.000
 spies+0.000
Neg logit contributions
aned-0.000
rated-0.000
 forehead-0.000
ently-0.000
iously-0.000
Token play
Token ID711
Feature activation+2.82e+00

Pos logit contributions
aned+0.071
ently+0.066
rated+0.066
iously+0.065
 forehead+0.064
Neg logit contributions
 chefs-0.058
 queens-0.053
 kings-0.048
 charity-0.047
 pirates-0.045
Token and
Token ID290
Feature activation-1.92e+00

Pos logit contributions
 chefs+0.082
 queens+0.081
 charity+0.069
 kings+0.067
 builders+0.065
Neg logit contributions
aned-0.099
ently-0.096
rated-0.094
 forehead-0.094
iously-0.090
Token slide
Token ID10649
Feature activation-1.11e-01

Pos logit contributions
aned+0.057
rated+0.057
 forehead+0.055
appointed+0.054
iously+0.052
Neg logit contributions
 chefs-0.070
 queens-0.063
 charity-0.059
 kings-0.058
 spies-0.057
Token on
Token ID319
Feature activation-6.15e+00

Pos logit contributions
aned+0.004
rated+0.004
ently+0.004
iously+0.004
appointed+0.004
Neg logit contributions
 chefs-0.004
 queens-0.003
 charity-0.003
 kings-0.003
 spies-0.003
Token the
Token ID262
Feature activation-1.19e+00

Pos logit contributions
ently+0.168
aned+0.168
iously+0.164
 forehead+0.152
ulously+0.150
Neg logit contributions
 chefs-0.225
 queens-0.212
please-0.189
 builders-0.185
 detectives-0.185
Token rocks
Token ID12586
Feature activation+2.47e+00

Pos logit contributions
aned+0.043
rated+0.040
 forehead+0.039
ently+0.039
iously+0.039
Neg logit contributions
 chefs-0.039
 queens-0.036
 kings-0.032
 charity-0.032
 spies-0.031
Token.
Token ID13
Feature activation-1.53e+00

Pos logit contributions
 chefs+0.078
 queens+0.071
 charity+0.065
 kings+0.064
 spies+0.061
Neg logit contributions
aned-0.090
rated-0.084
ently-0.083
 forehead-0.083
iously-0.083
Token Sometimes
Token ID8975
Feature activation+1.81e-01

Pos logit contributions
aned+0.059
rated+0.055
 forehead+0.055
ently+0.055
iously+0.054
Neg logit contributions
 chefs-0.047
 queens-0.041
 charity-0.038
 kings-0.038
 pirates-0.037
Token their
Token ID511
Feature activation+1.05e-01

Pos logit contributions
aned+0.121
rated+0.114
ently+0.112
 forehead+0.112
iously+0.112
Neg logit contributions
 chefs-0.096
 queens-0.088
 charity-0.079
 kings-0.079
 spies-0.074
Token head
Token ID1182
Feature activation-2.64e+00

Pos logit contributions
 chefs+0.004
 queens+0.004
 kings+0.004
 pirates+0.004
 charity+0.004
Neg logit contributions
aned-0.003
ently-0.002
rated-0.002
appointed-0.002
iously-0.002
Token and
Token ID290
Feature activation-1.64e+00

Pos logit contributions
aned+0.103
rated+0.097
 forehead+0.095
ently+0.095
appointed+0.094
Neg logit contributions
 chefs-0.075
 queens-0.068
 kings-0.062
 charity-0.061
 spies-0.058
Token said
Token ID531
Feature activation-3.30e+00

Pos logit contributions
aned+0.052
rated+0.048
iously+0.048
 forehead+0.048
ently+0.047
Neg logit contributions
 chefs-0.059
 queens-0.055
 charity-0.051
 kings-0.049
 builders-0.047
Token no
Token ID645
Feature activation+2.93e+00

Pos logit contributions
aned+0.107
iously+0.103
ently+0.102
 forehead+0.100
appointed+0.096
Neg logit contributions
 chefs-0.109
 queens-0.102
 charity-0.094
 kings-0.091
 spies-0.086
Token.
Token ID13
Feature activation-1.43e+00

Pos logit contributions
 chefs+0.090
 queens+0.081
 kings+0.074
 charity+0.073
 spies+0.069
Neg logit contributions
aned-0.111
rated-0.104
 forehead-0.102
ently-0.102
iously-0.102
Token
Token ID198
Feature activation+1.81e+00

Pos logit contributions
aned+0.053
rated+0.052
 forehead+0.049
ently+0.047
iously+0.047
Neg logit contributions
 chefs-0.047
 queens-0.041
 charity-0.039
 kings-0.039
 pirates-0.038
Token
Token ID198
Feature activation+1.81e+00

Pos logit contributions
 chefs+0.043
 queens+0.038
 charity+0.037
 kings+0.034
 pirates+0.033
Neg logit contributions
aned-0.077
rated-0.076
 forehead-0.075
iously-0.074
appointed-0.070
TokenThen
Token ID6423
Feature activation-3.56e+00

Pos logit contributions
 chefs+0.044
 queens+0.039
 kings+0.035
 charity+0.035
 spies+0.032
Neg logit contributions
aned-0.080
rated-0.076
 forehead-0.075
ently-0.074
iously-0.074
Token one
Token ID530
Feature activation+4.67e+00

Pos logit contributions
aned+0.119
iously+0.114
rated+0.113
ently+0.111
appointed+0.111
Neg logit contributions
 chefs-0.116
 queens-0.106
 charity-0.099
 kings-0.094
 spies-0.090
Token day
Token ID1110
Feature activation-5.63e+00

Pos logit contributions
 chefs+0.164
 queens+0.151
 kings+0.140
 charity+0.137
 pirates+0.135
Neg logit contributions
aned-0.167
ently-0.150
iously-0.146
appointed-0.145
rated-0.144
Token brave
Token ID14802
Feature activation+1.98e+00

Pos logit contributions
aned+0.013
iously+0.012
 forehead+0.012
appointed+0.011
ently+0.011
Neg logit contributions
 chefs-0.011
 queens-0.010
 charity-0.010
 kings-0.009
Sp-0.008
Token and
Token ID290
Feature activation-1.83e+00

Pos logit contributions
 chefs+0.069
 queens+0.064
 charity+0.059
 kings+0.057
 builders+0.054
Neg logit contributions
aned-0.063
rated-0.060
iously-0.060
 forehead-0.060
ently-0.058
Token he
Token ID339
Feature activation+1.44e+00

Pos logit contributions
aned+0.074
 forehead+0.072
rated+0.072
appointed+0.071
iously+0.070
Neg logit contributions
 chefs-0.048
 queens-0.042
 charity-0.037
 kings-0.037
 spies-0.033
Token had
Token ID550
Feature activation+1.01e+00

Pos logit contributions
 chefs+0.042
 queens+0.038
 kings+0.035
 charity+0.034
 pirates+0.033
Neg logit contributions
aned-0.055
rated-0.052
ently-0.052
 forehead-0.051
appointed-0.051
Token tried
Token ID3088
Feature activation-1.70e+00

Pos logit contributions
 chefs+0.029
 queens+0.026
 charity+0.023
 kings+0.023
 spies+0.022
Neg logit contributions
aned-0.040
rated-0.037
 forehead-0.037
ently-0.037
iously-0.037
Token to
Token ID284
Feature activation-1.70e-01

Pos logit contributions
aned+0.069
ently+0.064
iously+0.064
rated+0.064
appointed+0.063
Neg logit contributions
 chefs-0.047
 queens-0.042
 charity-0.037
 kings-0.037
 builders-0.034
Token deliver
Token ID5203
Feature activation+4.95e+00

Pos logit contributions
aned+0.006
 forehead+0.006
rated+0.006
iously+0.006
ently+0.006
Neg logit contributions
 chefs-0.005
 queens-0.005
 kings-0.004
 charity-0.004
 spies-0.004
Token the
Token ID262
Feature activation-1.21e+00

Pos logit contributions
 chefs+0.163
 queens+0.149
 charity+0.132
 kings+0.131
 spies+0.128
Neg logit contributions
aned-0.160
ently-0.155
iously-0.152
 forehead-0.151
rated-0.150
Token packages
Token ID10392
Feature activation+4.20e+00

Pos logit contributions
rated+0.042
 forehead+0.041
aned+0.040
ently+0.039
appointed+0.039
Neg logit contributions
 chefs-0.038
 queens-0.036
 kings-0.031
J-0.030
 charity-0.030
Token,
Token ID11
Feature activation+5.10e+00

Pos logit contributions
 chefs+0.122
 queens+0.110
 kings+0.099
 charity+0.098
 spies+0.093
Neg logit contributions
aned-0.163
rated-0.154
 forehead-0.152
ently-0.151
iously-0.151
Token even
Token ID772
Feature activation+3.65e+00

Pos logit contributions
 chefs+0.167
 queens+0.152
 kings+0.138
 charity+0.137
 builders+0.129
Neg logit contributions
 forehead-0.167
aned-0.164
iously-0.160
rated-0.157
appointed-0.154
Token and
Token ID290
Feature activation-1.96e+00

Pos logit contributions
aned+0.058
rated+0.055
ently+0.053
 forehead+0.053
appointed+0.053
Neg logit contributions
 chefs-0.039
 queens-0.035
 kings-0.031
 charity-0.030
 pirates-0.030
Token walked
Token ID6807
Feature activation-2.66e-01

Pos logit contributions
aned+0.079
rated+0.074
ently+0.073
 forehead+0.073
iously+0.072
Neg logit contributions
 chefs-0.056
 queens-0.050
 kings-0.045
 charity-0.044
 spies-0.042
Token towards
Token ID3371
Feature activation-4.42e+00

Pos logit contributions
aned+0.009
rated+0.009
 forehead+0.009
appointed+0.009
iously+0.009
Neg logit contributions
 chefs-0.009
 queens-0.008
 charity-0.007
 kings-0.007
 spies-0.007
Token the
Token ID262
Feature activation-1.22e+00

Pos logit contributions
aned+0.134
iously+0.130
 forehead+0.126
rated+0.124
ently+0.122
Neg logit contributions
 chefs-0.152
 queens-0.143
 charity-0.128
 kings-0.127
 spies-0.123
Token bush
Token ID24484
Feature activation-1.46e+00

Pos logit contributions
aned+0.039
rated+0.037
appointed+0.037
 forehead+0.036
ently+0.035
Neg logit contributions
 chefs-0.044
 queens-0.040
 charity-0.037
 kings-0.037
 pirates-0.036
Token.
Token ID13
Feature activation-1.51e+00

Pos logit contributions
aned+0.058
rated+0.055
 forehead+0.053
ently+0.053
iously+0.053
Neg logit contributions
 chefs-0.042
 queens-0.037
 kings-0.034
 charity-0.034
 pirates-0.032
Token They
Token ID1119
Feature activation+2.49e+00

Pos logit contributions
aned+0.042
rated+0.039
 forehead+0.038
iously+0.038
appointed+0.038
Neg logit contributions
 chefs-0.060
 queens-0.057
 kings-0.053
 charity-0.052
 spies-0.050
Token both
Token ID1111
Feature activation-1.84e+00

Pos logit contributions
 chefs+0.083
 queens+0.075
 charity+0.073
 spies+0.065
 builders+0.065
Neg logit contributions
aned-0.086
rated-0.080
iously-0.079
appointed-0.079
 forehead-0.078
Token got
Token ID1392
Feature activation+4.15e+00

Pos logit contributions
aned+0.064
rated+0.059
 forehead+0.058
ently+0.058
iously+0.058
Neg logit contributions
 chefs-0.062
 queens-0.057
 kings-0.052
 charity-0.052
 spies-0.049
Token there
Token ID612
Feature activation-2.53e+00

Pos logit contributions
 chefs+0.127
 charity+0.127
 queens+0.125
 kings+0.111
 builders+0.102
Neg logit contributions
aned-0.136
 forehead-0.127
appointed-0.127
iously-0.124
rated-0.122
Token in
Token ID287
Feature activation-4.89e+00

Pos logit contributions
aned+0.092
rated+0.087
 forehead+0.086
iously+0.086
ently+0.086
Neg logit contributions
 chefs-0.080
 queens-0.074
 charity-0.066
 kings-0.065
 spies-0.062
Token park
Token ID3952
Feature activation-5.19e-01

Pos logit contributions
aned+0.036
 forehead+0.035
rated+0.034
appointed+0.034
ently+0.033
Neg logit contributions
 chefs-0.025
 queens-0.022
 kings-0.019
 charity-0.019
 spies-0.018
Token.
Token ID13
Feature activation-1.07e+00

Pos logit contributions
aned+0.020
rated+0.019
 forehead+0.018
ently+0.018
iously+0.018
Neg logit contributions
 chefs-0.016
 queens-0.014
 kings-0.013
 charity-0.013
 spies-0.012
Token "
Token ID366
Feature activation+1.09e+01

Pos logit contributions
aned+0.043
rated+0.042
ently+0.040
 forehead+0.040
iously+0.039
Neg logit contributions
 chefs-0.031
 queens-0.027
 charity-0.026
 pirates-0.024
 spies-0.024
TokenYes
Token ID5297
Feature activation-5.45e+00

Pos logit contributions
 chefs+0.332
 queens+0.293
 kings+0.282
 spies+0.263
 builders+0.257
Neg logit contributions
aned-0.377
rated-0.366
appointed-0.365
 forehead-0.356
ulously-0.342
Token!"
Token ID2474
Feature activation-1.06e+00

Pos logit contributions
aned+0.227
rated+0.208
appointed+0.206
ently+0.205
 forehead+0.200
Neg logit contributions
 chefs-0.147
 queens-0.133
 kings-0.120
 charity-0.118
 pirates-0.115
Token Sam
Token ID3409
Feature activation-6.25e-01

Pos logit contributions
rated+0.045
aned+0.043
ently+0.042
iously+0.040
 forehead+0.039
Neg logit contributions
 chefs-0.028
 queens-0.024
 pirates-0.022
 spies-0.022
 charity-0.021
Token exclaimed
Token ID33552
Feature activation-5.35e+00

Pos logit contributions
rated+0.025
aned+0.024
ently+0.024
inkle+0.023
 forehead+0.022
Neg logit contributions
 chefs-0.014
 especially-0.013
 queens-0.013
 kings-0.013
 even-0.013
Token,
Token ID11
Feature activation+5.14e+00

Pos logit contributions
aned+0.218
rated+0.206
ently+0.200
appointed+0.199
 forehead+0.198
Neg logit contributions
 chefs-0.148
 queens-0.131
 kings-0.119
 charity-0.118
 pirates-0.113
Token as
Token ID355
Feature activation+1.47e+00

Pos logit contributions
 chefs+0.136
 queens+0.121
 kings+0.107
 charity+0.107
 pirates+0.103
Neg logit contributions
aned-0.215
rated-0.208
ently-0.202
appointed-0.194
iously-0.192
Token he
Token ID339
Feature activation+1.73e+00

Pos logit contributions
 chefs+0.043
 queens+0.039
 kings+0.036
 charity+0.034
 pirates+0.034
Neg logit contributions
aned-0.059
rated-0.054
 forehead-0.054
ently-0.053
iously-0.053
Token ran
Token ID4966
Feature activation+2.89e-01

Pos logit contributions
 chefs+0.055
 queens+0.050
 kings+0.046
 charity+0.045
 pirates+0.043
Neg logit contributions
aned-0.063
rated-0.059
ently-0.058
 forehead-0.058
appointed-0.057
Token and
Token ID290
Feature activation-1.74e+00

Pos logit contributions
aned+0.104
rated+0.098
 forehead+0.096
appointed+0.095
iously+0.094
Neg logit contributions
 chefs-0.071
 queens-0.063
 kings-0.059
 charity-0.057
 pirates-0.054
Token said
Token ID531
Feature activation-3.51e+00

Pos logit contributions
aned+0.066
rated+0.062
 forehead+0.061
iously+0.061
appointed+0.060
Neg logit contributions
 chefs-0.050
 queens-0.046
 charity-0.041
 kings-0.041
 spies-0.038
Token he
Token ID339
Feature activation+1.70e+00

Pos logit contributions
aned+0.115
ently+0.108
iously+0.107
 forehead+0.106
rated+0.105
Neg logit contributions
 chefs-0.120
 queens-0.113
 charity-0.101
 kings-0.101
 spies-0.095
Token must
Token ID1276
Feature activation-4.69e+00

Pos logit contributions
 chefs+0.061
 queens+0.057
 charity+0.052
 kings+0.052
 spies+0.050
Neg logit contributions
aned-0.055
rated-0.051
 forehead-0.049
ently-0.049
iously-0.049
Token stay
Token ID2652
Feature activation-2.06e+00

Pos logit contributions
aned+0.175
rated+0.164
iously+0.163
ently+0.162
 forehead+0.161
Neg logit contributions
 chefs-0.143
 queens-0.126
 charity-0.121
 kings-0.119
 pirates-0.113
Token there
Token ID612
Feature activation-2.63e+00

Pos logit contributions
aned+0.077
 forehead+0.074
rated+0.073
ently+0.072
appointed+0.072
Neg logit contributions
 chefs-0.060
 queens-0.058
 charity-0.050
 kings-0.049
 builders-0.046
Token without
Token ID1231
Feature activation-2.45e+00

Pos logit contributions
aned+0.097
rated+0.092
ently+0.091
 forehead+0.090
iously+0.090
Neg logit contributions
 chefs-0.081
 queens-0.074
 kings-0.065
 charity-0.065
 spies-0.062
Token moving
Token ID3867
Feature activation+7.51e-01

Pos logit contributions
aned+0.089
iously+0.085
 forehead+0.083
rated+0.081
ently+0.081
Neg logit contributions
 chefs-0.074
 queens-0.069
 charity-0.062
 kings-0.061
 builders-0.058
Token for
Token ID329
Feature activation+2.07e+00

Pos logit contributions
 chefs+0.022
 queens+0.020
 kings+0.018
 charity+0.018
 spies+0.017
Neg logit contributions
aned-0.029
rated-0.027
 forehead-0.027
ently-0.026
iously-0.026
Token a
Token ID257
Feature activation+2.95e+00

Pos logit contributions
 chefs+0.063
 queens+0.057
 kings+0.051
 charity+0.049
 spies+0.047
Neg logit contributions
aned-0.076
iously-0.073
ently-0.071
 forehead-0.071
rated-0.069
Token whole
Token ID2187
Feature activation+1.97e-01

Pos logit contributions
 chefs+0.090
 queens+0.081
 kings+0.074
 charity+0.072
 spies+0.070
Neg logit contributions
aned-0.110
rated-0.105
 forehead-0.104
ently-0.102
iously-0.102
Tokenman
Token ID805
Feature activation+1.11e+00

Pos logit contributions
aned+0.140
appointed+0.137
rated+0.126
iously+0.125
ently+0.124
Neg logit contributions
 chefs-0.137
 pirates-0.110
 queens-0.109
 kings-0.108
 builders-0.101
Token is
Token ID318
Feature activation-1.66e-01

Pos logit contributions
 chefs+0.040
 queens+0.035
 kings+0.035
 pirates+0.032
 spies+0.031
Neg logit contributions
aned-0.038
appointed-0.035
ently-0.034
rated-0.034
 forehead-0.034
Token complete
Token ID1844
Feature activation-9.70e-01

Pos logit contributions
aned+0.007
rated+0.007
ently+0.007
iously+0.007
 forehead+0.007
Neg logit contributions
 chefs-0.004
 queens-0.004
 kings-0.003
 charity-0.003
 pirates-0.003
Token!"
Token ID2474
Feature activation-2.98e-01

Pos logit contributions
aned+0.030
rated+0.028
ently+0.027
 forehead+0.027
iously+0.027
Neg logit contributions
 chefs-0.036
 queens-0.033
 kings-0.030
 charity-0.030
 spies-0.029
Token Lily
Token ID20037
Feature activation+2.78e-01

Pos logit contributions
aned+0.013
rated+0.012
ently+0.012
 forehead+0.012
iously+0.012
Neg logit contributions
 chefs-0.008
 queens-0.007
 kings-0.006
 charity-0.006
 spies-0.006
Token says
Token ID1139
Feature activation-2.52e+00

Pos logit contributions
 chefs+0.011
 kings+0.009
 spies+0.009
 pirates+0.009
 queens+0.009
Neg logit contributions
aned-0.008
rated-0.008
ently-0.008
 forehead-0.007
iously-0.007
Token.
Token ID13
Feature activation-1.59e+00

Pos logit contributions
aned+0.105
rated+0.102
appointed+0.097
ently+0.097
 forehead+0.095
Neg logit contributions
 chefs-0.067
 queens-0.058
 kings-0.053
 charity-0.051
 spies-0.051
Token "
Token ID366
Feature activation+1.11e+01

Pos logit contributions
aned+0.058
rated+0.055
 forehead+0.053
ently+0.053
iously+0.053
Neg logit contributions
 chefs-0.051
 queens-0.047
 kings-0.043
 charity-0.042
 spies-0.041
TokenHe
Token ID1544
Feature activation-3.85e-01

Pos logit contributions
 chefs+0.281
 queens+0.245
 kings+0.223
 charity+0.218
 spies+0.205
Neg logit contributions
aned-0.457
rated-0.435
 forehead-0.434
appointed-0.428
ently-0.421
Token is
Token ID318
Feature activation-3.61e-01

Pos logit contributions
aned+0.012
appointed+0.011
rated+0.011
ently+0.011
 forehead+0.010
Neg logit contributions
 chefs-0.015
 kings-0.013
 queens-0.013
 pirates-0.013
 spies-0.012
Token very
Token ID845
Feature activation+8.19e-01

Pos logit contributions
aned+0.014
 forehead+0.013
rated+0.013
iously+0.013
appointed+0.013
Neg logit contributions
 chefs-0.010
 queens-0.009
 charity-0.009
 kings-0.008
 builders-0.007
Token and
Token ID290
Feature activation-1.54e+00

Pos logit contributions
aned+0.072
rated+0.070
 forehead+0.068
iously+0.067
ently+0.067
Neg logit contributions
 chefs-0.063
 queens-0.058
 charity-0.052
 kings-0.050
 spies-0.047
Token stopped
Token ID5025
Feature activation-2.23e+00

Pos logit contributions
aned+0.070
rated+0.066
 forehead+0.065
ently+0.065
iously+0.065
Neg logit contributions
 chefs-0.036
 queens-0.031
 kings-0.027
 charity-0.027
 spies-0.025
Token to
Token ID284
Feature activation-1.18e-01

Pos logit contributions
aned+0.077
iously+0.073
rated+0.073
ently+0.072
 forehead+0.071
Neg logit contributions
 chefs-0.071
 queens-0.067
 charity-0.062
 kings-0.057
 builders-0.053
Token listen
Token ID6004
Feature activation-1.01e+00

Pos logit contributions
aned+0.005
rated+0.005
 forehead+0.005
iously+0.005
appointed+0.005
Neg logit contributions
 chefs-0.003
 queens-0.003
 kings-0.002
 charity-0.002
 spies-0.002
Token.
Token ID13
Feature activation-1.28e+00

Pos logit contributions
aned+0.041
rated+0.039
 forehead+0.038
ently+0.038
iously+0.038
Neg logit contributions
 chefs-0.028
 queens-0.025
 charity-0.022
 kings-0.022
 spies-0.021
Token Her
Token ID2332
Feature activation-2.14e+00

Pos logit contributions
aned+0.054
rated+0.051
iously+0.049
ently+0.049
 forehead+0.049
Neg logit contributions
 chefs-0.034
 queens-0.030
 charity-0.027
 kings-0.027
 spies-0.026
Token heart
Token ID2612
Feature activation-3.48e+00

Pos logit contributions
aned+0.097
rated+0.093
ently+0.092
iously+0.089
appointed+0.086
Neg logit contributions
 chefs-0.052
 spies-0.042
 pirates-0.041
 superheroes-0.039
 queens-0.039
Token began
Token ID2540
Feature activation+1.26e+00

Pos logit contributions
aned+0.119
ently+0.115
 forehead+0.114
iously+0.114
rated+0.113
Neg logit contributions
 chefs-0.107
 queens-0.101
 kings-0.092
 pirates-0.087
 builders-0.085
Token to
Token ID284
Feature activation-1.22e-01

Pos logit contributions
 chefs+0.035
 queens+0.031
 charity+0.028
 kings+0.027
 spies+0.025
Neg logit contributions
aned-0.051
iously-0.048
rated-0.048
 forehead-0.048
ently-0.047
Token beat
Token ID4405
Feature activation+2.05e+00

Pos logit contributions
aned+0.004
rated+0.004
 forehead+0.004
ently+0.004
iously+0.004
Neg logit contributions
 chefs-0.004
 queens-0.004
 kings-0.004
 charity-0.004
 spies-0.003
Token fast
Token ID3049
Feature activation-7.79e-01

Pos logit contributions
 chefs+0.069
 queens+0.058
 charity+0.055
 kings+0.053
 builders+0.050
Neg logit contributions
aned-0.074
rated-0.069
iously-0.069
appointed-0.069
 forehead-0.068
Token any
Token ID597
Feature activation+2.92e+00

Pos logit contributions
iously+0.080
aned+0.078
 forehead+0.077
rated+0.074
appointed+0.070
Neg logit contributions
 chefs-0.079
 queens-0.071
 charity-0.067
 kings-0.063
 builders-0.062
Token more
Token ID517
Feature activation+5.20e+00

Pos logit contributions
 chefs+0.086
 queens+0.079
 charity+0.074
 kings+0.069
 spies+0.066
Neg logit contributions
aned-0.102
rated-0.101
iously-0.099
 forehead-0.099
ently-0.095
Token threats
Token ID7432
Feature activation+1.59e+00

Pos logit contributions
 chefs+0.142
 queens+0.128
 charity+0.116
 kings+0.114
 spies+0.106
Neg logit contributions
aned-0.210
rated-0.199
iously-0.195
 forehead-0.195
ently-0.194
Token.
Token ID13
Feature activation-1.46e+00

Pos logit contributions
 chefs+0.046
 queens+0.041
 kings+0.037
 charity+0.037
 spies+0.035
Neg logit contributions
aned-0.062
rated-0.059
 forehead-0.058
ently-0.058
iously-0.057
Token<|endoftext|>
Token ID50256
Feature activation-4.99e+00

Pos logit contributions
aned+0.056
rated+0.054
 forehead+0.052
ently+0.049
iously+0.049
Neg logit contributions
 chefs-0.045
 queens-0.041
 charity-0.040
 kings-0.037
 pirates-0.037
TokenOnce
Token ID7454
Feature activation-4.47e+00

Pos logit contributions
aned+0.201
 forehead+0.189
iously+0.188
rated+0.188
appointed+0.181
Neg logit contributions
 chefs-0.150
 queens-0.130
 kings-0.120
 pirates-0.118
 charity-0.116
Token upon
Token ID2402
Feature activation-7.79e+00

Pos logit contributions
aned+0.176
 forehead+0.167
iously+0.164
rated+0.161
appointed+0.157
Neg logit contributions
 chefs-0.125
 queens-0.115
 kings-0.112
 pirates-0.104
 charity-0.101
Token a
Token ID257
Feature activation+2.91e+00

Pos logit contributions
aned+0.351
rated+0.331
ently+0.326
iously+0.325
 forehead+0.324
Neg logit contributions
 chefs-0.185
 queens-0.163
 kings-0.143
 charity-0.141
 pirates-0.134
Token time
Token ID640
Feature activation-6.58e+00

Pos logit contributions
 chefs+0.107
 queens+0.097
 charity+0.092
 kings+0.091
 pirates+0.087
Neg logit contributions
aned-0.092
rated-0.083
 forehead-0.083
ently-0.083
iously-0.082
Token,
Token ID11
Feature activation+5.13e+00

Pos logit contributions
aned+0.247
rated+0.231
 forehead+0.227
ently+0.225
iously+0.224
Neg logit contributions
 chefs-0.204
 queens-0.185
 kings-0.168
 charity-0.167
 spies-0.159
Token there
Token ID612
Feature activation-2.66e+00

Pos logit contributions
 chefs+0.077
 queens+0.057
 kings+0.049
 pirates+0.048
 spies+0.043
Neg logit contributions
aned-0.283
iously-0.266
 forehead-0.266
ently-0.263
rated-0.261
Token a
Token ID257
Feature activation+2.89e+00

Pos logit contributions
aned+0.012
iously+0.012
 forehead+0.011
ently+0.011
zed+0.011
Neg logit contributions
 chefs-0.012
 charity-0.011
 queens-0.011
 spies-0.010
 builders-0.010
Token fun
Token ID1257
Feature activation+3.92e+00

Pos logit contributions
 chefs+0.084
 queens+0.077
 kings+0.068
 spies+0.066
 builders+0.066
Neg logit contributions
 forehead-0.103
rated-0.096
iously-0.092
aned-0.092
appointed-0.091
Token day
Token ID1110
Feature activation-5.69e+00

Pos logit contributions
 chefs+0.142
 queens+0.127
 charity+0.121
 kings+0.119
 pirates+0.117
Neg logit contributions
aned-0.131
rated-0.117
ently-0.117
appointed-0.115
iously-0.115
Token and
Token ID290
Feature activation-1.71e+00

Pos logit contributions
aned+0.217
rated+0.204
 forehead+0.203
iously+0.202
ently+0.202
Neg logit contributions
 chefs-0.169
 queens-0.152
 charity-0.137
 kings-0.136
 spies-0.128
Token she
Token ID673
Feature activation+1.94e+00

Pos logit contributions
aned+0.060
rated+0.057
 forehead+0.056
iously+0.056
appointed+0.056
Neg logit contributions
 chefs-0.056
 queens-0.049
 charity-0.047
 kings-0.045
 spies-0.044
Token couldn
Token ID3521
Feature activation-5.87e+00

Pos logit contributions
 chefs+0.055
 queens+0.049
 kings+0.047
 pirates+0.045
 especially+0.044
Neg logit contributions
aned-0.077
ently-0.072
appointed-0.071
rated-0.071
iously-0.071
Token't
Token ID470
Feature activation+6.93e-01

Pos logit contributions
rated+0.247
 forehead+0.240
aned+0.239
iously+0.236
 spine+0.227
Neg logit contributions
 chefs-0.129
 queens-0.125
 especially-0.121
 charity-0.112
 even-0.107
Token wait
Token ID4043
Feature activation-3.13e+00

Pos logit contributions
 chefs+0.028
 queens+0.026
 kings+0.024
 charity+0.023
 spies+0.023
Neg logit contributions
aned-0.020
rated-0.018
 forehead-0.018
iously-0.018
appointed-0.017
Token to
Token ID284
Feature activation-1.94e-01

Pos logit contributions
aned+0.143
rated+0.136
 forehead+0.133
ently+0.133
iously+0.132
Neg logit contributions
 chefs-0.071
 queens-0.063
 kings-0.055
 charity-0.054
 spies-0.050
Token tell
Token ID1560
Feature activation-1.30e+00

Pos logit contributions
aned+0.007
 forehead+0.006
rated+0.006
iously+0.006
appointed+0.006
Neg logit contributions
 chefs-0.006
 queens-0.006
 kings-0.005
 charity-0.005
 builders-0.005
Token her
Token ID607
Feature activation-2.02e+00

Pos logit contributions
aned+0.053
 forehead+0.050
ently+0.050
rated+0.049
iously+0.049
Neg logit contributions
 chefs-0.036
 queens-0.032
 charity-0.028
 kings-0.028
 spies-0.026
Token waved
Token ID26834
Feature activation-3.03e+00

Pos logit contributions
aned+0.081
ently+0.077
rated+0.076
 forehead+0.072
appointed+0.072
Neg logit contributions
 chefs-0.064
 queens-0.060
 kings-0.055
 pirates-0.053
 spies-0.051
Token back
Token ID736
Feature activation-8.93e-01

Pos logit contributions
aned+0.065
iously+0.061
 forehead+0.060
rated+0.059
ently+0.057
Neg logit contributions
 chefs-0.142
 queens-0.133
 charity-0.125
 kings-0.122
 spies-0.118
Token as
Token ID355
Feature activation+1.28e+00

Pos logit contributions
aned+0.042
rated+0.041
 forehead+0.038
appointed+0.038
ently+0.037
Neg logit contributions
 chefs-0.018
 queens-0.016
 kings-0.015
 pirates-0.014
 spies-0.014
Token Joe
Token ID5689
Feature activation-2.53e+00

Pos logit contributions
 chefs+0.044
 queens+0.040
 charity+0.038
 kings+0.037
 spies+0.035
Neg logit contributions
aned-0.042
rated-0.040
ently-0.039
iously-0.039
 forehead-0.038
Token flew
Token ID13112
Feature activation-1.64e-01

Pos logit contributions
aned+0.095
rated+0.090
appointed+0.088
ently+0.088
 forehead+0.087
Neg logit contributions
 chefs-0.074
 queens-0.069
 kings-0.065
 pirates-0.062
 charity-0.060
Token off
Token ID572
Feature activation-9.32e+00

Pos logit contributions
aned+0.005
rated+0.005
ently+0.005
 forehead+0.005
iously+0.005
Neg logit contributions
 chefs-0.006
 queens-0.006
 charity-0.005
 kings-0.005
 spies-0.005
Token into
Token ID656
Feature activation-3.84e+00

Pos logit contributions
aned+0.348
 forehead+0.322
iously+0.321
ently+0.321
rated+0.320
Neg logit contributions
 chefs-0.289
 queens-0.260
 charity-0.236
 kings-0.231
 spies-0.220
Token the
Token ID262
Feature activation-1.51e+00

Pos logit contributions
aned+0.115
 forehead+0.111
iously+0.109
rated+0.107
ently+0.107
Neg logit contributions
 chefs-0.135
 queens-0.122
 charity-0.110
please-0.108
 kings-0.108
Token sky
Token ID6766
Feature activation-4.31e+00

Pos logit contributions
aned+0.056
ently+0.051
rated+0.051
iously+0.050
 forehead+0.050
Neg logit contributions
 chefs-0.048
 queens-0.044
 kings-0.041
 charity-0.041
 pirates-0.039
Token.
Token ID13
Feature activation-1.72e+00

Pos logit contributions
aned+0.159
rated+0.149
 forehead+0.147
ently+0.147
iously+0.147
Neg logit contributions
 chefs-0.135
 queens-0.122
 charity-0.111
 kings-0.110
 spies-0.105
Token Joe
Token ID5689
Feature activation-2.51e+00

Pos logit contributions
aned+0.064
 forehead+0.061
rated+0.057
ently+0.055
iously+0.054
Neg logit contributions
 chefs-0.053
 queens-0.046
 kings-0.045
 builders-0.045
 charity-0.044
Token said
Token ID531
Feature activation-3.11e+00

Pos logit contributions
 chefs+0.081
 queens+0.071
 kings+0.069
 pirates+0.065
 spies+0.064
Neg logit contributions
aned-0.062
rated-0.059
appointed-0.058
iously-0.057
ently-0.056
Token.
Token ID13
Feature activation-1.11e+00

Pos logit contributions
aned+0.129
rated+0.123
appointed+0.119
ently+0.116
iously+0.116
Neg logit contributions
 chefs-0.085
 queens-0.073
 kings-0.067
 charity-0.066
 pirates-0.063
Token
Token ID198
Feature activation+1.98e+00

Pos logit contributions
aned+0.041
rated+0.039
 forehead+0.038
ently+0.038
iously+0.038
Neg logit contributions
 chefs-0.035
 queens-0.031
 charity-0.028
 kings-0.028
 spies-0.027
Token
Token ID198
Feature activation+2.03e+00

Pos logit contributions
 chefs+0.062
 queens+0.056
 charity+0.051
 kings+0.051
 spies+0.048
Neg logit contributions
aned-0.074
rated-0.069
 forehead-0.068
iously-0.067
ently-0.067
Token"
Token ID1
Feature activation+1.28e+01

Pos logit contributions
 chefs+0.056
 queens+0.048
 kings+0.044
 pirates+0.041
 charity+0.041
Neg logit contributions
aned-0.080
appointed-0.076
ently-0.075
rated-0.074
 forehead-0.073
TokenWow
Token ID22017
Feature activation-9.86e+00

Pos logit contributions
 chefs+0.446
 queens+0.409
 kings+0.404
 charity+0.369
 spies+0.367
Neg logit contributions
aned-0.374
appointed-0.367
 forehead-0.367
rated-0.355
iously-0.336
Token,
Token ID11
Feature activation+5.61e+00

Pos logit contributions
aned+0.404
rated+0.376
 forehead+0.368
appointed+0.368
iously+0.364
Neg logit contributions
 chefs-0.285
 queens-0.251
 kings-0.232
 pirates-0.219
 charity-0.218
Token that
Token ID326
Feature activation+3.37e+00

Pos logit contributions
 chefs+0.176
 queens+0.162
 kings+0.146
 charity+0.145
 builders+0.137
Neg logit contributions
aned-0.190
 forehead-0.186
iously-0.182
rated-0.181
ently-0.177
Token sounds
Token ID5238
Feature activation+2.84e+00

Pos logit contributions
 chefs+0.114
 pirates+0.098
 kings+0.098
 queens+0.093
 se+0.088
Neg logit contributions
aned-0.131
iously-0.113
 forehead-0.113
ently-0.112
appointed-0.111
Token fun
Token ID1257
Feature activation+4.09e+00

Pos logit contributions
 chefs+0.055
 queens+0.048
 kings+0.040
 charity+0.040
 spies+0.036
Neg logit contributions
aned-0.135
rated-0.133
 forehead-0.129
appointed-0.128
ently-0.128
Token!"
Token ID2474
Feature activation-1.15e-01

Pos logit contributions
 chefs+0.154
 queens+0.143
 charity+0.133
 kings+0.131
 spies+0.125
Neg logit contributions
aned-0.123
rated-0.115
ently-0.112
iously-0.112
 forehead-0.111
Token the
Token ID262
Feature activation-9.66e-01

Pos logit contributions
aned+0.007
rated+0.006
 forehead+0.006
iously+0.006
ently+0.006
Neg logit contributions
 chefs-0.005
 queens-0.005
 kings-0.004
 charity-0.004
 spies-0.004
Token wolf
Token ID17481
Feature activation-7.78e-02

Pos logit contributions
 forehead+0.031
aned+0.031
rated+0.031
appointed+0.029
iously+0.029
Neg logit contributions
 chefs-0.033
 queens-0.031
 kings-0.028
 charity-0.028
 builders-0.026
Token,
Token ID11
Feature activation+5.31e+00

Pos logit contributions
aned+0.003
rated+0.003
ently+0.003
appointed+0.003
 forehead+0.003
Neg logit contributions
 chefs-0.002
 queens-0.002
 kings-0.002
 charity-0.002
 pirates-0.002
Token
Token ID6184
Feature activation-5.91e+00

Pos logit contributions
 chefs+0.139
 queens+0.123
 kings+0.112
 charity+0.110
 spies+0.105
Neg logit contributions
aned-0.223
rated-0.212
ently-0.208
 forehead-0.204
iously-0.203
Token
Token ID95
Feature activation-3.09e+00

Pos logit contributions
aned+0.198
rated+0.189
ently+0.181
 forehead+0.178
iously+0.173
Neg logit contributions
 chefs-0.204
 queens-0.182
 charity-0.175
 kings-0.171
 pirates-0.166
Token
Token ID26391
Feature activation-1.13e+01

Pos logit contributions
aned+0.092
ently+0.087
rated+0.085
 forehead+0.083
iously+0.081
Neg logit contributions
 chefs-0.110
 charity-0.098
 queens-0.097
 kings-0.093
 pirates-0.092
Token
Token ID129
Feature activation-4.08e+00

Pos logit contributions
aned+0.360
rated+0.335
 forehead+0.329
ently+0.326
iously+0.324
Neg logit contributions
 chefs-0.410
 queens-0.378
 kings-0.348
 charity-0.345
 spies-0.333
Token
Token ID241
Feature activation+8.98e+00

Pos logit contributions
ently+0.132
rated+0.130
 forehead+0.130
aned+0.126
ipop+0.125
Neg logit contributions
 chefs-0.119
 kings-0.106
 charity-0.103
 spies-0.103
 pirates-0.102
TokenHi
Token ID17250
Feature activation-6.22e-01

Pos logit contributions
 chefs+0.222
 queens+0.196
 charity+0.173
 kings+0.170
 pirates+0.165
Neg logit contributions
aned-0.395
ently-0.370
rated-0.368
 forehead-0.366
iously-0.363
Token there
Token ID612
Feature activation-2.61e+00

Pos logit contributions
aned+0.024
rated+0.022
 forehead+0.022
ently+0.022
appointed+0.022
Neg logit contributions
 chefs-0.019
 queens-0.017
 kings-0.016
 charity-0.016
 pirates-0.015
Token!
Token ID0
Feature activation-1.90e+00

Pos logit contributions
aned+0.104
rated+0.095
 forehead+0.094
iously+0.093
ently+0.093
Neg logit contributions
 chefs-0.078
 queens-0.068
 kings-0.064
 charity-0.064
 pirates-0.061
Token will
Token ID481
Feature activation-1.24e+00

Pos logit contributions
 chefs+0.045
 queens+0.041
 kings+0.038
 charity+0.036
 pirates+0.035
Neg logit contributions
aned-0.077
ently-0.072
appointed-0.071
iously-0.071
 forehead-0.070
Token find
Token ID1064
Feature activation+5.02e+00

Pos logit contributions
aned+0.047
rated+0.044
 forehead+0.043
ently+0.043
iously+0.043
Neg logit contributions
 chefs-0.038
 queens-0.034
 kings-0.031
 charity-0.031
 spies-0.029
Token it
Token ID340
Feature activation-3.24e+00

Pos logit contributions
 chefs+0.174
 queens+0.159
 charity+0.147
 kings+0.144
 builders+0.138
Neg logit contributions
aned-0.164
ently-0.153
rated-0.152
iously-0.151
 forehead-0.150
Token.
Token ID13
Feature activation-1.56e+00

Pos logit contributions
aned+0.121
rated+0.113
 forehead+0.111
ently+0.111
iously+0.110
Neg logit contributions
 chefs-0.100
 queens-0.091
 kings-0.082
 charity-0.082
 spies-0.078
Tokenâ
Token ID22940
Feature activation-7.12e+00

Pos logit contributions
aned+0.052
rated+0.050
ently+0.047
 forehead+0.047
iously+0.046
Neg logit contributions
 chefs-0.056
 queens-0.051
 charity-0.050
 kings-0.046
 spies-0.046
Token
Token ID26391
Feature activation-1.12e+01

Pos logit contributions
aned+0.151
iously+0.146
 forehead+0.140
 jaws+0.136
 fists+0.133
Neg logit contributions
 responsible-0.248
Thank-0.248
 like-0.242
L-0.242
 charity-0.241
Token
Token ID198
Feature activation+1.45e+00

Pos logit contributions
aned+0.388
rated+0.366
 forehead+0.358
ently+0.353
iously+0.349
Neg logit contributions
 chefs-0.383
 queens-0.350
 kings-0.324
 charity-0.322
 pirates-0.309
Token
Token ID198
Feature activation+1.48e+00

Pos logit contributions
 chefs+0.039
 queens+0.035
 charity+0.033
 kings+0.031
 pirates+0.029
Neg logit contributions
aned-0.059
rated-0.057
 forehead-0.056
iously-0.056
ently-0.054
TokenThe
Token ID464
Feature activation-2.78e+00

Pos logit contributions
 chefs+0.047
 queens+0.042
 kings+0.040
 spies+0.037
 pirates+0.037
Neg logit contributions
aned-0.055
appointed-0.051
ently-0.050
rated-0.049
 forehead-0.048
Token guard
Token ID4860
Feature activation+4.35e+00

Pos logit contributions
aned+0.108
 forehead+0.105
rated+0.103
iously+0.102
ently+0.101
Neg logit contributions
 chefs-0.079
 queens-0.068
 charity-0.061
 kings-0.061
 spies-0.058
Token nodded
Token ID14464
Feature activation-3.18e+00

Pos logit contributions
 chefs+0.140
 queens+0.129
 kings+0.120
 charity+0.114
 pirates+0.114
Neg logit contributions
aned-0.155
rated-0.146
appointed-0.143
 forehead-0.143
ently-0.142
Token asked
Token ID1965
Feature activation+1.06e+00

Pos logit contributions
aned+0.062
ently+0.060
rated+0.058
iously+0.058
 forehead+0.057
Neg logit contributions
 chefs-0.038
 queens-0.034
 kings-0.031
 charity-0.029
 pirates-0.029
Token her
Token ID607
Feature activation-2.00e+00

Pos logit contributions
 chefs+0.037
 queens+0.035
 charity+0.031
 kings+0.030
 builders+0.030
Neg logit contributions
ently-0.033
aned-0.033
rated-0.032
iously-0.031
 forehead-0.030
Token,
Token ID11
Feature activation+5.23e+00

Pos logit contributions
aned+0.071
 forehead+0.069
ently+0.069
rated+0.068
iously+0.067
Neg logit contributions
 chefs-0.059
 queens-0.056
 kings-0.048
 charity-0.047
 builders-0.046
Token
Token ID6184
Feature activation-5.34e+00

Pos logit contributions
 chefs+0.175
 queens+0.162
 kings+0.147
 charity+0.145
 spies+0.138
Neg logit contributions
aned-0.179
 forehead-0.168
rated-0.168
ently-0.164
iously-0.164
Token
Token ID95
Feature activation-2.38e+00

Pos logit contributions
aned+0.167
 forehead+0.154
rated+0.153
appointed+0.151
iously+0.151
Neg logit contributions
 chefs-0.198
 queens-0.186
 kings-0.171
 charity-0.163
 pirates-0.163
Token
Token ID26391
Feature activation-1.08e+01

Pos logit contributions
aned+0.070
rated+0.064
ently+0.063
iously+0.062
 forehead+0.062
Neg logit contributions
 chefs-0.093
 queens-0.086
 charity-0.080
 kings-0.079
 spies-0.076
Token
Token ID129
Feature activation-3.66e+00

Pos logit contributions
aned+0.289
rated+0.260
appointed+0.258
ently+0.255
 forehead+0.254
Neg logit contributions
 chefs-0.430
 queens-0.405
 kings-0.378
 charity-0.374
 builders-0.362
Token
Token ID241
Feature activation+9.37e+00

Pos logit contributions
aned+0.107
rated+0.105
ently+0.103
 forehead+0.101
iously+0.100
Neg logit contributions
 chefs-0.129
 queens-0.115
 charity-0.114
 kings-0.111
 spies-0.108
TokenWhat
Token ID2061
Feature activation+2.31e+00

Pos logit contributions
 chefs+0.231
 queens+0.203
 charity+0.178
 kings+0.178
 spies+0.165
Neg logit contributions
aned-0.412
rated-0.388
 forehead-0.383
ently-0.383
iously-0.381
Tokenâ
Token ID22940
Feature activation-6.63e+00

Pos logit contributions
 chefs+0.084
 queens+0.078
 kings+0.071
 charity+0.071
 spies+0.068
Neg logit contributions
aned-0.073
rated-0.068
ently-0.067
 forehead-0.067
iously-0.066
Token
Token ID26391
Feature activation-1.07e+01

Pos logit contributions
aned+0.194
iously+0.181
 forehead+0.169
ently+0.164
ulously+0.155
Neg logit contributions
 chefs-0.252
 charity-0.231
 queens-0.229
 including-0.212
Why-0.208
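
The per-token records above share a fixed layout: a `Token` line, a `Token ID` line, a `Feature activation` line, then the `Pos logit contributions` and `Neg logit contributions` lists, with each label fused directly to its value (for example `Token ID6807` or `aned+0.079`). The following is a minimal sketch of how such a dump could be split back into structured records. It assumes the layout seen here holds throughout. The names `TokenRecord` and `parse_records` and the regular expressions are illustrative choices, not part of whatever tool produced this output.

```python
import re
from dataclasses import dataclass, field

# "Token ID6807" -> 6807 (label fused to the integer ID).
TOKEN_ID_RE = re.compile(r"^Token ID(\d+)$")
# "aned+0.079" or " chefs-0.056": token text, then a trailing signed decimal.
CONTRIB_RE = re.compile(r"^(?P<tok>.*?)(?P<val>[+-]\d+\.\d+)$")


@dataclass
class TokenRecord:  # hypothetical container, named for this sketch only
    token: str
    token_id: int = -1
    activation: float = float("nan")
    pos: list = field(default_factory=list)  # (token text, logit contribution)
    neg: list = field(default_factory=list)


def parse_records(lines):
    """Group fused label/value lines into one TokenRecord per token."""
    records = []
    current = None  # record currently being filled
    bucket = None   # list that contribution lines are appended to
    for raw in lines:
        line = raw.rstrip("\n")
        # Inside a contribution list, a trailing signed decimal means a contribution.
        if bucket is not None:
            m = CONTRIB_RE.match(line)
            if m:
                bucket.append((m.group("tok"), float(m.group("val"))))
                continue
        id_match = TOKEN_ID_RE.match(line)
        if id_match:
            if current is not None:
                current.token_id = int(id_match.group(1))
            continue
        if current is not None and line.startswith("Feature activation"):
            current.activation = float(line[len("Feature activation"):])
        elif current is not None and line == "Pos logit contributions":
            bucket = current.pos
        elif current is not None and line == "Neg logit contributions":
            bucket = current.neg
        elif line.startswith("Token"):
            # Keep the token text verbatim, including any leading space.
            current = TokenRecord(token=line[len("Token"):])
            records.append(current)
            bucket = None
        # Anything else (blank lines, orphaned lines before the first record) is skipped.
    return records


sample = """ pirates-0.030
Token walked
Token ID6807
Feature activation-2.66e-01

Pos logit contributions
aned+0.079
rated+0.074
Neg logit contributions
 chefs-0.056
 queens-0.050
Token "
Token ID366
Feature activation+1.09e+01
""".splitlines()

records = parse_records(sample)
```

Leading whitespace in token text is preserved on purpose, since tokens such as ` chefs` and `aned` differ in whether they begin a new word. Lines that appear before the first `Token` line, such as a contribution left over from a truncated record, are ignored rather than guessed at.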