TOP ACTIVATIONS
MAX = 10.189

and held her hand even tighter . " Me too
!" She hugged her bear tighter and turned away . The
her and held the doll tighter . He shouted , "
. Daddy hugged her even tighter and wiped away her tears
he hugged the thing even tighter and enjoyed how it felt
better and held his pillow tighter . Mom asked
head and hugged her dolls tighter . Max did not give
She hugged the wreck even tighter . The
her and she hugged it tighter . Soon
stopped and tied the string tighter . Soon he was home
, she hugged it even tighter and said " I love
gave her a hug even tighter than any hug before .
He smiled , hugged her tighter and said he knew she
her and held her dog tighter . " Go
. She hugged it even tighter and fell asleep with a
. But if she gets worse , call me . And
. They hugged her even tighter when she promised never to
feel better . She felt worse . She saw the dog
, but the sting got worse . She rubbed her eyes
smiled and hugged the pillow tighter , never wanting to let

MIN ACTIVATIONS
MIN = -9.689

big , beautiful rainbow - col oured butterfly ! The little
delicious vegetables and bright - col oured flowers .
of laundry on the clothes line . " He wants to
a strange , raspberry - col oured object hiding under a
I fell off the clothes line and I can 't get
back on the zip - line , z ipping all around
it would hug the shore line and admire the wrecked boats
a walk along the shore line . But something
when it hugged the shore line again , it noticed a
things - col ouring books , stickers and
It was a zip - line ! It was made of
robe back onto the clothes line . He used all his
suddenly , a brightly - col oured belt caught her eye
the edge of the shore line and just stood there .
jumped onto the zip - line and z ipped around the
bed wearing his favourite py j amas , feeling so proud
day , a chocolate - col oured whip named C heddar
, with all its different col ours , shapes and textures
warm , and the shore line was ancient .
date , a rainbow - col oured butterfly flew in through
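
Note on the INTERVAL sections: the boundaries listed there are consistent with splitting the range from MIN = -9.689 to MAX = 10.189 into four equal-width bins, and the four CONTAINS percentages add up to 100%. The sketch below reproduces that arithmetic. It assumes equal-width binning, which the dump does not state outright, and the function name `equal_width_edges` is hypothetical rather than taken from any tool.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 edges that split [lo, hi] into equal-width bins."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# MIN and MAX as printed in this dump (already rounded to 3 decimals).
edges = equal_width_edges(-9.689, 10.189, 4)

# Walk the bins from highest to lowest, the order used in the dump.
for lower, upper in reversed(list(zip(edges[:-1], edges[1:]))):
    print(f"INTERVAL {lower:.3f} - {upper:.3f}")

# CONTAINS percentages as printed, highest interval first.
contains = [1.388, 36.594, 61.015, 1.003]
total = sum(contains)
print(f"total coverage: {total:.3f}%")
```

Because the printed MIN and MAX are rounded, the recomputed edges can differ from the listed ones in the last decimal place.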

INTERVAL 5.220 - 10.189
CONTAINS 1.388%

and offered to repair the bad event .
diamond flower . She couldn 't believe it . She had
of it so it wouldn 't break into parts like his
. His shoes got wet again , but he didn 't
felt different . He felt lighter , happier and accepted .

INTERVAL 0.250 - 5.220
CONTAINS 36.594%

we 'll always keep you safe ." Lily smiled and felt
sang . They sang the song that made them famous .
down on the floor and started to put the puzzle pieces
sat down in the middle of the pitch . No one
onto the chim ney . He made it to the top

INTERVAL -4.720 - 0.250
CONTAINS 61.015%

they could be friends . <|endoftext|> Tom and Anna liked to
had new friends to play with and she always kept the
home . He saw the open freezer and said " Uh
drum with his mom . They played together for a long
shining and the flowers blo oming . She put on her

INTERVAL -9.689 - -4.720
CONTAINS 1.003%

. " The bird hurt me !" Ben runs
hopped into the car . Daddy drove her to the big
Let 's go in and explore ." L ily
cap and said , " This is the perfect one !"
" We are sorry , Mom . We were just playing
Token and
Token ID290
Feature activation-4.06e-01

Pos logit contributions
 coward+0.020
 loser+0.017
 hawk+0.017
 badly+0.013
 villain+0.013
Neg logit contributions
 Katie-0.035
 dining-0.034
 counters-0.033
view-0.033
iah-0.033
Token held
Token ID2714
Feature activation+1.53e+00

Pos logit contributions
view+0.025
 dining+0.025
 Katie+0.025
 doorstep+0.025
 counters+0.025
Neg logit contributions
 coward-0.020
 hawk-0.018
 loser-0.017
 badly-0.016
 Sly-0.015
Token her
Token ID607
Feature activation-8.66e-01

Pos logit contributions
 coward+0.071
 loser+0.062
 hawk+0.060
 badly+0.054
ching+0.050
Neg logit contributions
 Katie-0.105
 dining-0.101
 counters-0.100
view-0.100
 doorstep-0.100
Token hand
Token ID1021
Feature activation-1.17e+00

Pos logit contributions
 Katie+0.047
 dining+0.045
view+0.045
 counters+0.044
 pavement+0.044
Neg logit contributions
 coward-0.052
 loser-0.048
 hawk-0.047
 badly-0.044
ching-0.041
Token even
Token ID772
Feature activation-6.93e-01

Pos logit contributions
 Katie+0.087
 dining+0.084
 counters+0.083
view+0.082
 doorstep+0.082
Neg logit contributions
 coward-0.047
 loser-0.041
 hawk-0.039
 badly-0.035
ching-0.031
Token tighter
Token ID28619
Feature activation+1.02e+01

Pos logit contributions
 Katie+0.058
 dining+0.056
 counters+0.056
view+0.055
 doorstep+0.055
Neg logit contributions
 coward-0.022
 loser-0.018
 hawk-0.017
 badly-0.014
 hunter-0.012
Token.
Token ID13
Feature activation-1.61e-01

Pos logit contributions
 coward+0.449
 loser+0.385
 hawk+0.375
 villain+0.310
 badly+0.308
Neg logit contributions
 Katie-0.737
 dining-0.704
 counters-0.697
 doorstep-0.695
view-0.694
Token
Token ID198
Feature activation+1.13e+00

Pos logit contributions
view+0.011
 doorstep+0.010
 pavement+0.010
 Katie+0.010
 occasion+0.010
Neg logit contributions
 coward-0.008
 hawk-0.007
 loser-0.007
 Sly-0.006
 hunter-0.006
Token"
Token ID1
Feature activation-1.70e-01

Pos logit contributions
 coward+0.034
 loser+0.026
 hawk+0.026
 badly+0.022
ching+0.021
Neg logit contributions
 Katie-0.095
 pavement-0.092
 doorstep-0.091
 counters-0.091
view-0.091
TokenMe
Token ID5308
Feature activation-9.96e-01

Pos logit contributions
 doorstep+0.011
 counters+0.011
 Katie+0.011
 dining+0.011
 pavement+0.011
Neg logit contributions
 coward-0.008
 loser-0.008
 hawk-0.007
Bad-0.007
ching-0.006
Token too
Token ID1165
Feature activation+3.72e+00

Pos logit contributions
 Katie+0.083
 dining+0.081
 counters+0.080
view+0.080
 doorstep+0.080
Neg logit contributions
 coward-0.032
 loser-0.027
 hawk-0.025
 badly-0.021
 Sly-0.018
Token!"
Token ID2474
Feature activation-7.29e-01

Pos logit contributions
 coward+0.296
 hawk+0.261
 loser+0.250
ching+0.227
 dodged+0.225
Neg logit contributions
 Katie-0.380
 dining-0.353
 doorstep-0.350
 counters-0.341
iah-0.340
Token She
Token ID1375
Feature activation-9.48e-01

Pos logit contributions
 dining+0.045
 Katie+0.045
iah+0.045
view+0.045
 doorstep+0.044
Neg logit contributions
 coward-0.036
 loser-0.031
 hawk-0.030
 badly-0.029
 Tweet-0.028
Token hugged
Token ID41130
Feature activation-8.87e-01

Pos logit contributions
 Katie+0.049
iah+0.046
 counters+0.046
 doorstep+0.046
 theatre+0.044
Neg logit contributions
 coward-0.058
 loser-0.056
 hawk-0.054
 hunters-0.046
 weaker-0.045
Token her
Token ID607
Feature activation-9.15e-01

Pos logit contributions
 Katie+0.051
iah+0.045
 pavement+0.045
 counters+0.044
 clothing+0.042
Neg logit contributions
 coward-0.056
 loser-0.052
 hawk-0.047
ching-0.042
 badly-0.041
Token bear
Token ID6842
Feature activation+4.96e+00

Pos logit contributions
 fridge+0.050
iah+0.049
 mant+0.049
 outfit+0.048
 doorstep+0.048
Neg logit contributions
 coward-0.048
 loser-0.044
 badly-0.040
 hawk-0.039
Bad-0.035
Token tighter
Token ID28619
Feature activation+1.02e+01

Pos logit contributions
 coward+0.262
 loser+0.234
 hawk+0.228
Bad+0.195
 villain+0.194
Neg logit contributions
 Katie-0.308
 doorstep-0.299
 counters-0.291
 dining-0.291
view-0.285
Token and
Token ID290
Feature activation-4.49e-01

Pos logit contributions
 coward+0.572
 hawk+0.499
 loser+0.480
 villain+0.443
Bad+0.429
Neg logit contributions
 Katie-0.638
 doorstep-0.579
view-0.569
iah-0.568
 occasion-0.558
Token turned
Token ID2900
Feature activation-1.79e+00

Pos logit contributions
 Katie+0.026
 counters+0.025
 dining+0.024
view+0.024
 doorstep+0.024
Neg logit contributions
 coward-0.025
 loser-0.024
 hawk-0.023
 badly-0.020
ching-0.019
Token away
Token ID1497
Feature activation+5.89e+00

Pos logit contributions
 Katie+0.108
 counters+0.101
 anticipation+0.098
 pavement+0.098
shirt+0.097
Neg logit contributions
 coward-0.104
 loser-0.098
 hawk-0.087
 villain-0.077
 hunter-0.076
Token.
Token ID13
Feature activation-1.91e-01

Pos logit contributions
 coward+0.314
 loser+0.290
 hawk+0.276
Bad+0.233
 villain+0.230
Neg logit contributions
 Katie-0.381
 counters-0.343
 doorstep-0.338
 dining-0.337
 pavement-0.337
Token The
Token ID383
Feature activation-1.18e+00

Pos logit contributions
 Katie+0.013
view+0.012
 dining+0.012
iah+0.012
 doorstep+0.012
Neg logit contributions
 coward-0.009
 loser-0.008
 hawk-0.007
 badly-0.007
 Sly-0.007
Token her
Token ID607
Feature activation-8.34e-01

Pos logit contributions
 Katie+0.125
 counters+0.116
 pavement+0.115
view+0.114
iah+0.111
Neg logit contributions
 coward-0.109
 loser-0.104
 hawk-0.095
 hero-0.083
 hunter-0.081
Token and
Token ID290
Feature activation-4.09e-01

Pos logit contributions
 Katie+0.055
 doorstep+0.053
view+0.053
 dining+0.052
 counters+0.052
Neg logit contributions
 coward-0.040
 hawk-0.036
 loser-0.035
 Sly-0.030
 hunter-0.030
Token held
Token ID2714
Feature activation+1.41e+00

Pos logit contributions
 Katie+0.022
 counters+0.020
iah+0.020
 pavement+0.020
 dining+0.019
Neg logit contributions
 coward-0.024
 loser-0.024
 hawk-0.022
ching-0.020
 hero-0.020
Token the
Token ID262
Feature activation+2.34e-01

Pos logit contributions
 coward+0.072
 hawk+0.064
 loser+0.064
 hunter+0.054
 villain+0.053
Neg logit contributions
 Katie-0.097
iah-0.087
 counters-0.086
 anticipation-0.085
 Jamie-0.084
Token doll
Token ID3654
Feature activation+3.56e+00

Pos logit contributions
 coward+0.012
 hawk+0.011
 loser+0.011
 badly+0.009
 Sly+0.009
Neg logit contributions
 Katie-0.014
 dining-0.014
 counters-0.014
 doorstep-0.014
 pavement-0.014
Token tighter
Token ID28619
Feature activation+1.02e+01

Pos logit contributions
 coward+0.194
 hawk+0.171
 loser+0.167
 villain+0.144
Bad+0.141
Neg logit contributions
 doorstep-0.217
 Katie-0.215
 counters-0.211
 dining-0.207
view-0.207
Token.
Token ID13
Feature activation-1.69e-01

Pos logit contributions
 coward+0.581
 hawk+0.510
 loser+0.465
 villain+0.446
 hunter+0.433
Neg logit contributions
 Katie-0.614
 doorstep-0.567
 pavement-0.567
iah-0.564
view-0.560
Token He
Token ID679
Feature activation+1.38e+00

Pos logit contributions
 Katie+0.011
 dining+0.011
view+0.011
iah+0.011
 doorstep+0.011
Neg logit contributions
 coward-0.008
 loser-0.007
 hawk-0.007
 badly-0.006
 Sly-0.006
Token shouted
Token ID17293
Feature activation-2.06e+00

Pos logit contributions
 loser+0.086
 coward+0.085
 hawk+0.079
ching+0.069
 villain+0.068
Neg logit contributions
 Katie-0.067
 doorstep-0.064
 counters-0.064
iah-0.063
 pavement-0.061
Token,
Token ID11
Feature activation+3.26e-01

Pos logit contributions
 Katie+0.133
 counters+0.127
 pavement+0.121
 dining+0.120
 doorstep+0.118
Neg logit contributions
 coward-0.104
 hawk-0.100
 loser-0.090
 hunter-0.078
 weaker-0.073
Token "
Token ID366
Feature activation+1.88e-01

Pos logit contributions
 coward+0.018
 loser+0.017
 hawk+0.016
 badly+0.014
 villain+0.014
Neg logit contributions
 Katie-0.019
 counters-0.018
 dining-0.018
 pavement-0.018
 doorstep-0.018
Token.
Token ID13
Feature activation-1.67e-01

Pos logit contributions
 Katie+0.396
iah+0.392
 dining+0.380
 counters+0.380
view+0.380
Neg logit contributions
 coward-0.171
 loser-0.149
 badly-0.145
 hawk-0.128
 too-0.126
Token Daddy
Token ID26926
Feature activation-6.07e+00

Pos logit contributions
view+0.011
 anniversary+0.010
 occasion+0.010
 doorstep+0.010
 contain+0.010
Neg logit contributions
 coward-0.007
 too-0.007
 Tweet-0.007
 loser-0.006
 Spot-0.006
Token hugged
Token ID41130
Feature activation-8.77e-01

Pos logit contributions
 Katie+0.376
 dining+0.363
 counters+0.358
view+0.358
 doorstep+0.355
Neg logit contributions
 coward-0.318
 loser-0.283
 hawk-0.276
 badly-0.249
ching-0.237
Token her
Token ID607
Feature activation-9.32e-01

Pos logit contributions
 dining+0.058
 doorstep+0.058
view+0.057
 pavement+0.057
 counters+0.056
Neg logit contributions
 coward-0.039
 loser-0.034
 hawk-0.034
 badly-0.032
 on-0.031
Token even
Token ID772
Feature activation-7.72e-01

Pos logit contributions
 Katie+0.070
view+0.068
 dining+0.068
 counters+0.068
 pavement+0.067
Neg logit contributions
 coward-0.036
 loser-0.031
 hawk-0.029
 badly-0.027
 hero-0.024
Token tighter
Token ID28619
Feature activation+1.01e+01

Pos logit contributions
 Katie+0.064
 counters+0.062
 dining+0.062
view+0.062
iah+0.061
Neg logit contributions
 coward-0.025
 loser-0.021
 hawk-0.020
 badly-0.016
 villain-0.014
Token and
Token ID290
Feature activation-4.57e-01

Pos logit contributions
 coward+0.417
 loser+0.356
 hawk+0.345
 badly+0.289
 villain+0.276
Neg logit contributions
 Katie-0.759
 dining-0.727
 counters-0.719
 doorstep-0.718
view-0.718
Token wiped
Token ID21122
Feature activation+1.31e+00

Pos logit contributions
 Katie+0.029
 dining+0.028
 counters+0.028
 doorstep+0.028
view+0.028
Neg logit contributions
 coward-0.023
 loser-0.020
 hawk-0.020
 badly-0.018
 hurt-0.017
Token away
Token ID1497
Feature activation+5.93e+00

Pos logit contributions
 coward+0.042
 loser+0.035
 hawk+0.033
 badly+0.030
ching+0.027
Neg logit contributions
 Katie-0.105
 dining-0.105
 doorstep-0.103
view-0.102
iah-0.102
Token her
Token ID607
Feature activation-9.25e-01

Pos logit contributions
 coward+0.225
 loser+0.192
 hawk+0.181
 badly+0.169
 hurt+0.147
Neg logit contributions
 Katie-0.442
 dining-0.435
view-0.432
 doorstep-0.431
 counters-0.428
Token tears
Token ID10953
Feature activation+2.81e+00

Pos logit contributions
 Katie+0.054
iah+0.053
 dining+0.051
umm+0.051
oph+0.050
Neg logit contributions
 badly-0.046
 coward-0.045
 hurt-0.045
 loser-0.044
 on-0.043
Token he
Token ID339
Feature activation+2.98e+00

Pos logit contributions
 coward+0.012
 loser+0.011
 hawk+0.010
ching+0.009
 badly+0.008
Neg logit contributions
 Katie-0.022
 dining-0.020
 counters-0.020
iah-0.020
 pavement-0.020
Token hugged
Token ID41130
Feature activation-8.52e-01

Pos logit contributions
 coward+0.174
 loser+0.162
 hawk+0.155
 badly+0.138
 villain+0.134
Neg logit contributions
 Katie-0.167
 counters-0.160
iah-0.160
 doorstep-0.159
 dining-0.157
Token the
Token ID262
Feature activation+2.69e-01

Pos logit contributions
 Katie+0.067
 counters+0.063
 pavement+0.063
view+0.063
iah+0.062
Neg logit contributions
 coward-0.034
 loser-0.029
 hawk-0.027
 badly-0.022
ching-0.022
Token thing
Token ID1517
Feature activation+3.39e+00

Pos logit contributions
 coward+0.015
 loser+0.014
 hawk+0.013
 badly+0.012
 hero+0.012
Neg logit contributions
 Katie-0.016
view-0.015
iah-0.015
 dining-0.015
 counters-0.015
Token even
Token ID772
Feature activation-6.98e-01

Pos logit contributions
 coward+0.143
 loser+0.123
 hawk+0.119
 badly+0.100
 villain+0.095
Neg logit contributions
 Katie-0.248
 dining-0.237
 doorstep-0.235
 counters-0.235
view-0.235
Token tighter
Token ID28619
Feature activation+1.01e+01

Pos logit contributions
 Katie+0.054
 counters+0.052
view+0.051
iah+0.051
 dining+0.050
Neg logit contributions
 coward-0.027
 loser-0.024
 hawk-0.023
 villain-0.018
 badly-0.017
Token and
Token ID290
Feature activation-4.00e-01

Pos logit contributions
 coward+0.479
 hawk+0.402
 loser+0.398
 villain+0.338
Bad+0.330
Neg logit contributions
 Katie-0.715
 dining-0.674
iah-0.670
 doorstep-0.668
 counters-0.667
Token enjoyed
Token ID8359
Feature activation+3.98e-01

Pos logit contributions
 Katie+0.027
 dining+0.025
 counters+0.025
view+0.025
 doorstep+0.025
Neg logit contributions
 coward-0.020
 loser-0.018
 hawk-0.017
 badly-0.015
 villain-0.014
Token how
Token ID703
Feature activation-3.03e-01

Pos logit contributions
 coward+0.015
 loser+0.013
 hawk+0.012
 badly+0.010
 hunter+0.009
Neg logit contributions
 Katie-0.031
 dining-0.030
 counters-0.030
view-0.030
 doorstep-0.030
Token it
Token ID340
Feature activation+1.02e+00

Pos logit contributions
 Katie+0.023
 dining+0.022
view+0.022
 doorstep+0.022
 counters+0.022
Neg logit contributions
 coward-0.012
 loser-0.011
 hawk-0.010
 badly-0.009
 weaker-0.008
Token felt
Token ID2936
Feature activation+7.41e-01

Pos logit contributions
 coward+0.056
 loser+0.051
 hawk+0.049
 badly+0.045
 villain+0.042
Neg logit contributions
 Katie-0.061
 dining-0.058
 counters-0.058
iah-0.057
view-0.057
Token better
Token ID1365
Feature activation+8.56e+00

Pos logit contributions
 coward+0.024
 loser+0.022
 hawk+0.021
 villain+0.016
 Sly+0.016
Neg logit contributions
 Katie-0.047
 dining-0.044
view-0.044
 counters-0.044
iah-0.043
Token and
Token ID290
Feature activation-4.90e-01

Pos logit contributions
 coward+0.398
 hawk+0.360
 loser+0.334
 Sly+0.289
 villain+0.285
Neg logit contributions
 Katie-0.600
 dining-0.559
 doorstep-0.550
view-0.548
iah-0.547
Token held
Token ID2714
Feature activation+1.44e+00

Pos logit contributions
 Katie+0.031
 dining+0.030
 counters+0.030
view+0.029
 doorstep+0.029
Neg logit contributions
 coward-0.025
 loser-0.023
 hawk-0.022
 badly-0.019
ching-0.018
Token his
Token ID465
Feature activation+8.11e-01

Pos logit contributions
 coward+0.057
 loser+0.049
 hawk+0.048
 badly+0.038
 villain+0.036
Neg logit contributions
 Katie-0.112
iah-0.106
 counters-0.105
 dining-0.105
 doorstep-0.104
Token pillow
Token ID28774
Feature activation+3.20e+00

Pos logit contributions
 coward+0.052
 loser+0.047
 hawk+0.046
 badly+0.041
 Sly+0.041
Neg logit contributions
 Katie-0.041
 doorstep-0.040
 dining-0.040
 counters-0.039
 mant-0.037
Token tighter
Token ID28619
Feature activation+1.01e+01

Pos logit contributions
 coward+0.172
 loser+0.153
 hawk+0.152
Bad+0.130
 Sly+0.127
Neg logit contributions
 Katie-0.194
 doorstep-0.193
 dining-0.191
 counters-0.190
view-0.186
Token.
Token ID13
Feature activation-2.30e-01

Pos logit contributions
 coward+0.496
 hawk+0.427
 loser+0.416
 villain+0.358
Bad+0.351
Neg logit contributions
 Katie-0.694
 doorstep-0.656
 dining-0.652
view-0.647
 counters-0.645
Token
Token ID198
Feature activation+1.14e+00

Pos logit contributions
 Katie+0.016
 dining+0.016
 occasion+0.016
 counters+0.016
view+0.016
Neg logit contributions
 coward-0.009
 loser-0.008
 hawk-0.007
 Sly-0.007
 Dino-0.007
Token
Token ID198
Feature activation+1.15e+00

Pos logit contributions
 coward+0.059
 loser+0.052
 hawk+0.051
 badly+0.045
 villain+0.043
Neg logit contributions
 Katie-0.073
view-0.069
 dining-0.069
 counters-0.069
 doorstep-0.069
TokenMom
Token ID29252
Feature activation-3.40e+00

Pos logit contributions
 coward+0.061
 hawk+0.053
 loser+0.053
 badly+0.046
 villain+0.045
Neg logit contributions
 Katie-0.071
view-0.069
iah-0.068
 counters-0.068
 dining-0.068
Token asked
Token ID1965
Feature activation-2.47e-01

Pos logit contributions
 Katie+0.206
 dining+0.202
 counters+0.200
view+0.199
 pavement+0.198
Neg logit contributions
 coward-0.177
 loser-0.155
 hawk-0.153
 badly-0.138
 hunter-0.135
Token head
Token ID1182
Feature activation-1.38e+00

Pos logit contributions
 doorstep+0.027
 counters+0.025
 Katie+0.024
 pavement+0.024
 dining+0.024
Neg logit contributions
 coward-0.071
 hawk-0.066
 loser-0.062
 Sly-0.060
ching-0.059
Token and
Token ID290
Feature activation-4.37e-01

Pos logit contributions
 Katie+0.091
view+0.086
 doorstep+0.086
 counters+0.085
 dining+0.084
Neg logit contributions
 coward-0.068
 hawk-0.059
 loser-0.058
 villain-0.048
 Sly-0.048
Token hugged
Token ID41130
Feature activation-8.79e-01

Pos logit contributions
 Katie+0.024
 counters+0.021
iah+0.021
 dining+0.021
 pavement+0.021
Neg logit contributions
 coward-0.027
 loser-0.026
 hawk-0.024
ching-0.021
 badly-0.021
Token her
Token ID607
Feature activation-8.34e-01

Pos logit contributions
 Katie+0.051
iah+0.045
 pavement+0.044
 counters+0.044
 clothing+0.043
Neg logit contributions
 coward-0.054
 loser-0.049
 hawk-0.048
ching-0.040
 hunter-0.040
Token dolls
Token ID36062
Feature activation+1.62e+00

Pos logit contributions
 doorstep+0.045
 Katie+0.045
 dining+0.045
 counters+0.045
 balcony+0.043
Neg logit contributions
 coward-0.047
 hawk-0.044
 loser-0.039
ching-0.036
 Sly-0.036
Token tighter
Token ID28619
Feature activation+1.01e+01

Pos logit contributions
 coward+0.089
 hawk+0.080
 loser+0.078
Bad+0.069
 villain+0.066
Neg logit contributions
 Katie-0.097
 dining-0.094
 doorstep-0.093
view-0.091
 counters-0.091
Token.
Token ID13
Feature activation-1.42e-01

Pos logit contributions
 coward+0.581
 hawk+0.518
 loser+0.484
 villain+0.443
Bad+0.437
Neg logit contributions
 Katie-0.606
view-0.566
 doorstep-0.565
iah-0.554
 dining-0.554
Token Max
Token ID5436
Feature activation+2.94e+00

Pos logit contributions
 Katie+0.009
 counters+0.008
 dining+0.008
 pavement+0.008
 doorstep+0.008
Neg logit contributions
 coward-0.008
 loser-0.007
 hawk-0.007
 badly-0.006
 hunter-0.006
Token did
Token ID750
Feature activation+2.66e+00

Pos logit contributions
 coward+0.172
 loser+0.164
 hawk+0.156
 badly+0.138
ching+0.132
Neg logit contributions
 Katie-0.163
 doorstep-0.152
 counters-0.152
iah-0.149
 dining-0.148
Token not
Token ID407
Feature activation+4.52e+00

Pos logit contributions
 hawk+0.140
 coward+0.133
 loser+0.124
 hunter+0.113
 hunters+0.109
Neg logit contributions
 Katie-0.163
 contain-0.147
 anticipation-0.147
 Jamie-0.143
view-0.143
Token give
Token ID1577
Feature activation+2.00e+00

Pos logit contributions
 coward+0.272
 hawk+0.253
 loser+0.243
 badly+0.221
 Sly+0.217
Neg logit contributions
 Katie-0.240
view-0.221
 pavement-0.220
 occasion-0.217
 contain-0.215
Token She
Token ID1375
Feature activation-9.42e-01

Pos logit contributions
 dining+0.016
iah+0.015
view+0.015
 theatre+0.015
 Katie+0.015
Neg logit contributions
 coward-0.013
 loser-0.011
 Tweet-0.011
 hawk-0.011
 badly-0.010
Token hugged
Token ID41130
Feature activation-8.85e-01

Pos logit contributions
 Katie+0.047
iah+0.045
 doorstep+0.045
yah+0.042
 counters+0.042
Neg logit contributions
 coward-0.057
 loser-0.057
 hawk-0.051
 villain-0.046
 hunters-0.046
Token the
Token ID262
Feature activation+1.09e-01

Pos logit contributions
 Katie+0.061
 counters+0.057
 dining+0.057
view+0.057
iah+0.057
Neg logit contributions
 coward-0.043
 loser-0.037
 hawk-0.036
 badly-0.032
 hunter-0.030
Token wreck
Token ID15228
Feature activation+1.34e+00

Pos logit contributions
 coward+0.005
 loser+0.004
 hawk+0.004
 badly+0.004
 Sly+0.004
Neg logit contributions
 Katie-0.007
 pavement-0.007
 counters-0.007
 doorstep-0.007
 dining-0.007
Token even
Token ID772
Feature activation-7.62e-01

Pos logit contributions
 coward+0.056
 loser+0.048
 hawk+0.048
 badly+0.038
 Sly+0.038
Neg logit contributions
 Katie-0.098
 counters-0.094
 dining-0.094
view-0.093
 doorstep-0.093
Token tighter
Token ID28619
Feature activation+1.01e+01

Pos logit contributions
 Katie+0.052
 counters+0.050
view+0.050
shirt+0.049
 anticipation+0.049
Neg logit contributions
 coward-0.033
 loser-0.033
 hawk-0.029
 villain-0.024
 hero-0.023
Token.
Token ID13
Feature activation-2.53e-01

Pos logit contributions
 coward+0.516
 hawk+0.444
 loser+0.435
 villain+0.381
 hunter+0.372
Neg logit contributions
 Katie-0.677
iah-0.619
 doorstep-0.616
 dining-0.614
 pavement-0.613
Token
Token ID220
Feature activation-2.18e+00

Pos logit contributions
iah+0.018
 dining+0.018
view+0.017
 theatre+0.017
 occasion+0.017
Neg logit contributions
 coward-0.009
 Tweet-0.009
 loser-0.008
 Wh-0.008
 too-0.008
Token
Token ID198
Feature activation+1.16e+00

Pos logit contributions
 Katie+0.154
 dining+0.151
iah+0.148
 counters+0.148
view+0.148
Neg logit contributions
 coward-0.093
 loser-0.081
 hawk-0.076
 badly-0.071
ching-0.066
Token
Token ID198
Feature activation+1.15e+00

Pos logit contributions
 coward+0.041
 loser+0.034
 hawk+0.032
 badly+0.030
ching+0.029
Neg logit contributions
 dining-0.088
 counters-0.086
iah-0.086
 Katie-0.086
 doorstep-0.085
TokenThe
Token ID464
Feature activation-5.51e-01

Pos logit contributions
 coward+0.059
 loser+0.053
 hawk+0.051
 badly+0.051
Bad+0.048
Neg logit contributions
 Katie-0.071
 dining-0.070
iah-0.069
 doorstep-0.066
 counters-0.066
Token her
Token ID607
Feature activation-9.75e-01

Pos logit contributions
 counters+0.022
 dining+0.022
 Katie+0.022
 doorstep+0.021
view+0.021
Neg logit contributions
 coward-0.017
 loser-0.015
 hawk-0.015
 badly-0.013
 Sly-0.013
Token and
Token ID290
Feature activation-5.15e-01

Pos logit contributions
 Katie+0.074
 dining+0.071
 counters+0.070
iah+0.070
view+0.069
Neg logit contributions
 coward-0.038
 loser-0.034
 hawk-0.032
 badly-0.029
 hunter-0.026
Token she
Token ID673
Feature activation+1.27e+00

Pos logit contributions
 Katie+0.033
 dining+0.031
 counters+0.031
iah+0.031
view+0.031
Neg logit contributions
 coward-0.026
 loser-0.024
 hawk-0.023
 badly-0.020
 villain-0.019
Token hugged
Token ID41130
Feature activation-9.13e-01

Pos logit contributions
 coward+0.069
 loser+0.063
 hawk+0.060
 badly+0.053
 villain+0.051
Neg logit contributions
 Katie-0.078
 dining-0.073
 doorstep-0.073
 counters-0.073
view-0.073
Token it
Token ID340
Feature activation+9.25e-01

Pos logit contributions
 Katie+0.066
iah+0.061
view+0.061
 dining+0.061
 counters+0.061
Neg logit contributions
 coward-0.041
 loser-0.036
 hawk-0.034
 badly-0.029
 villain-0.028
Token tighter
Token ID28619
Feature activation+1.01e+01

Pos logit contributions
 coward+0.042
 loser+0.037
 hawk+0.036
 badly+0.033
ching+0.030
Neg logit contributions
 Katie-0.064
 dining-0.061
 counters-0.061
view-0.060
iah-0.060
Token.
Token ID13
Feature activation-2.17e-01

Pos logit contributions
 coward+0.507
 loser+0.431
 hawk+0.429
 villain+0.374
 Sly+0.359
Neg logit contributions
 Katie-0.679
view-0.640
 doorstep-0.639
 dining-0.638
iah-0.633
Token
Token ID220
Feature activation-2.22e+00

Pos logit contributions
 dining+0.016
 doorstep+0.016
 counters+0.016
view+0.016
 theatre+0.016
Neg logit contributions
 coward-0.008
 loser-0.007
 hawk-0.006
 Sly-0.006
 Tweet-0.006
Token
Token ID198
Feature activation+1.13e+00

Pos logit contributions
 Katie+0.158
 dining+0.155
 counters+0.153
 doorstep+0.153
iah+0.152
Neg logit contributions
 coward-0.094
 loser-0.081
 hawk-0.077
 badly-0.071
ching-0.067
Token
Token ID198
Feature activation+1.13e+00

Pos logit contributions
 coward+0.041
 loser+0.035
 hawk+0.031
 badly+0.030
ching+0.028
Neg logit contributions
 dining-0.086
 counters-0.085
 Katie-0.085
 doorstep-0.085
 pavement-0.083
TokenSoon
Token ID28093
Feature activation-1.54e+00

Pos logit contributions
 coward+0.044
 loser+0.039
 hawk+0.036
 badly+0.033
ching+0.030
Neg logit contributions
 Katie-0.086
 dining-0.083
 counters-0.083
 doorstep-0.082
iah-0.081
Token stopped
Token ID5025
Feature activation+2.62e+00

Pos logit contributions
 coward+0.158
 loser+0.150
 hawk+0.141
 badly+0.127
Bad+0.121
Neg logit contributions
 Katie-0.169
 doorstep-0.166
iah-0.166
 counters-0.164
 theatre-0.162
Token and
Token ID290
Feature activation-4.43e-01

Pos logit contributions
 coward+0.136
 loser+0.121
 hawk+0.113
 villain+0.102
 hunter+0.096
Neg logit contributions
 Katie-0.169
view-0.159
 pavement-0.158
 counters-0.158
 dining-0.157
Token tied
Token ID8165
Feature activation+6.20e+00

Pos logit contributions
 Katie+0.026
 dining+0.024
view+0.024
iah+0.024
 doorstep+0.024
Neg logit contributions
 coward-0.025
 loser-0.024
 hawk-0.022
 badly-0.020
 hunters-0.019
Token the
Token ID262
Feature activation+1.54e-01

Pos logit contributions
 coward+0.285
 loser+0.243
 hawk+0.232
 villain+0.200
 badly+0.192
Neg logit contributions
 Katie-0.456
 pavement-0.417
 counters-0.417
iah-0.414
 dining-0.409
Token string
Token ID4731
Feature activation+3.29e+00

Pos logit contributions
 coward+0.009
 loser+0.008
 hawk+0.008
 badly+0.007
 hurt+0.007
Neg logit contributions
 Katie-0.009
view-0.009
iah-0.009
 dining-0.009
 counters-0.008
Token tighter
Token ID28619
Feature activation+1.01e+01

Pos logit contributions
 coward+0.175
 hawk+0.150
 loser+0.149
Bad+0.132
 villain+0.128
Neg logit contributions
 Katie-0.206
 pavement-0.196
 doorstep-0.194
 counters-0.191
view-0.190
Token.
Token ID13
Feature activation-2.37e-01

Pos logit contributions
 coward+0.523
 hawk+0.435
 loser+0.418
 villain+0.386
Bad+0.382
Neg logit contributions
 Katie-0.669
 pavement-0.627
iah-0.619
 doorstep-0.617
 counters-0.607
Token Soon
Token ID15894
Feature activation-1.09e+00

Pos logit contributions
 Katie+0.016
iah+0.016
 dining+0.016
view+0.016
 theatre+0.015
Neg logit contributions
 coward-0.011
 loser-0.010
 hawk-0.009
 badly-0.008
 Tweet-0.008
Token he
Token ID339
Feature activation+2.97e+00

Pos logit contributions
 Katie+0.071
 anticipation+0.062
 Jamie+0.061
 counters+0.060
 pavement+0.060
Neg logit contributions
 coward-0.057
 loser-0.051
 hawk-0.048
Bad-0.045
ching-0.043
Token was
Token ID373
Feature activation-2.47e-01

Pos logit contributions
 coward+0.170
 loser+0.158
 hawk+0.151
 badly+0.137
 Sly+0.130
Neg logit contributions
 Katie-0.163
 doorstep-0.157
view-0.154
iah-0.154
 pavement-0.151
Token home
Token ID1363
Feature activation-4.53e-01

Pos logit contributions
 Katie+0.016
 dining+0.015
 counters+0.015
view+0.015
 doorstep+0.015
Neg logit contributions
 coward-0.013
 loser-0.011
 hawk-0.011
 villain-0.009
 Sly-0.009
Token,
Token ID11
Feature activation+2.71e-01

Pos logit contributions
 Katie+0.200
 dining+0.195
view+0.194
 counters+0.193
 doorstep+0.193
Neg logit contributions
 coward-0.077
 loser-0.064
 hawk-0.061
 badly-0.053
 hunter-0.045
Token she
Token ID673
Feature activation+1.19e+00

Pos logit contributions
 coward+0.016
 loser+0.015
 hawk+0.014
 badly+0.013
ching+0.013
Neg logit contributions
 Katie-0.015
 dining-0.014
 counters-0.014
iah-0.014
 doorstep-0.014
Token hugged
Token ID41130
Feature activation-9.04e-01

Pos logit contributions
 coward+0.056
 loser+0.047
 hawk+0.047
 hurt+0.043
 badly+0.043
Neg logit contributions
 Katie-0.078
 dining-0.076
 pavement-0.076
view-0.076
 counters-0.075
Token it
Token ID340
Feature activation+9.60e-01

Pos logit contributions
 dining+0.071
 counters+0.070
 doorstep+0.069
 Katie+0.069
view+0.068
Neg logit contributions
 coward-0.030
 hawk-0.026
 loser-0.025
 badly-0.023
 on-0.021
Token even
Token ID772
Feature activation-7.83e-01

Pos logit contributions
 coward+0.030
 badly+0.027
 on+0.027
 too+0.026
 loser+0.025
Neg logit contributions
 Katie-0.073
iah-0.072
 dining-0.072
view-0.072
 counters-0.072
Token tighter
Token ID28619
Feature activation+1.01e+01

Pos logit contributions
 Katie+0.065
 counters+0.063
 dining+0.062
view+0.062
iah+0.061
Neg logit contributions
 coward-0.026
 loser-0.023
 hawk-0.021
 badly-0.016
 villain-0.015
Token and
Token ID290
Feature activation-4.42e-01

Pos logit contributions
 coward+0.429
 loser+0.367
 hawk+0.356
 badly+0.297
 villain+0.292
Neg logit contributions
 Katie-0.746
 dining-0.712
 doorstep-0.704
view-0.703
 counters-0.702
Token said
Token ID531
Feature activation-1.12e+00

Pos logit contributions
 Katie+0.028
 dining+0.027
 doorstep+0.027
view+0.027
 counters+0.027
Neg logit contributions
 coward-0.022
 loser-0.019
 hawk-0.019
 badly-0.018
 hurt-0.017
Token "
Token ID366
Feature activation+2.16e-01

Pos logit contributions
 Katie+0.074
view+0.073
 pavement+0.073
 dining+0.072
 doorstep+0.072
Neg logit contributions
 coward-0.053
 loser-0.046
 hawk-0.044
 badly-0.043
ching-0.039
TokenI
Token ID40
Feature activation-2.34e+00

Pos logit contributions
 coward+0.011
 loser+0.010
 hawk+0.010
Bad+0.009
ching+0.009
Neg logit contributions
 dining-0.014
 doorstep-0.013
 counters-0.013
 Katie-0.013
 patio-0.013
Token love
Token ID1842
Feature activation-9.61e-01

Pos logit contributions
 Katie+0.150
 counters+0.143
 dining+0.143
 pavement+0.141
view+0.141
Neg logit contributions
 coward-0.122
 loser-0.108
 hawk-0.102
 badly-0.095
ching-0.089
Token gave
Token ID2921
Feature activation+1.34e+00

Pos logit contributions
 coward+0.048
 loser+0.044
 hawk+0.042
 badly+0.038
 villain+0.036
Neg logit contributions
 Katie-0.055
 dining-0.052
 doorstep-0.051
 counters-0.051
iah-0.051
Token her
Token ID607
Feature activation-1.02e+00

Pos logit contributions
 coward+0.055
 loser+0.047
 hawk+0.046
 badly+0.040
ching+0.037
Neg logit contributions
 Katie-0.097
 dining-0.096
 counters-0.095
view-0.094
 doorstep-0.094
Token a
Token ID257
Feature activation+5.44e-01

Pos logit contributions
 Katie+0.078
 dining+0.076
 counters+0.076
view+0.075
 pavement+0.075
Neg logit contributions
 coward-0.038
 loser-0.032
 hawk-0.031
 badly-0.028
ching-0.025
Token hug
Token ID16225
Feature activation+1.52e+00

Pos logit contributions
 coward+0.028
 loser+0.025
 hawk+0.024
 badly+0.022
ching+0.020
Neg logit contributions
 Katie-0.035
 dining-0.033
 counters-0.033
view-0.033
iah-0.033
Token even
Token ID772
Feature activation-7.87e-01

Pos logit contributions
 coward+0.053
 loser+0.045
 hawk+0.043
 badly+0.037
ching+0.034
Neg logit contributions
 Katie-0.119
 dining-0.116
 counters-0.116
iah-0.114
view-0.114
Token tighter
Token ID28619
Feature activation+1.00e+01

Pos logit contributions
 Katie+0.064
 counters+0.062
view+0.062
 dining+0.061
iah+0.060
Neg logit contributions
 coward-0.026
 loser-0.023
 hawk-0.021
 badly-0.017
 villain-0.016
Token than
Token ID621
Feature activation+1.07e+00

Pos logit contributions
 coward+0.432
 loser+0.368
 hawk+0.355
 badly+0.298
 villain+0.292
Neg logit contributions
 Katie-0.736
 dining-0.703
view-0.697
 doorstep-0.697
 counters-0.694
Token any
Token ID597
Feature activation+3.60e+00

Pos logit contributions
 coward+0.047
 loser+0.041
 hawk+0.040
 badly+0.034
 hunter+0.032
Neg logit contributions
 Katie-0.078
 dining-0.073
 counters-0.072
iah-0.072
view-0.072
Token hug
Token ID16225
Feature activation+1.54e+00

Pos logit contributions
 coward+0.182
 loser+0.161
 hawk+0.155
 badly+0.137
 Sly+0.131
Neg logit contributions
 Katie-0.234
 counters-0.220
 dining-0.220
 doorstep-0.220
iah-0.219
Token before
Token ID878
Feature activation+1.30e+00

Pos logit contributions
 coward+0.059
 loser+0.051
 hawk+0.048
 badly+0.042
ching+0.038
Neg logit contributions
 Katie-0.117
 dining-0.113
 counters-0.112
view-0.112
 doorstep-0.112
Token.
Token ID13
Feature activation-2.16e-01

Pos logit contributions
 coward+0.051
 loser+0.044
 hawk+0.043
 badly+0.038
ching+0.034
Neg logit contributions
 Katie-0.096
 dining-0.094
view-0.093
 counters-0.093
iah-0.093
Token He
Token ID679
Feature activation+1.43e+00

Pos logit contributions
 Katie+0.008
 dining+0.008
 counters+0.008
view+0.008
iah+0.008
Neg logit contributions
 coward-0.006
 loser-0.006
 hawk-0.005
 Sly-0.005
 badly-0.005
Token smiled
Token ID13541
Feature activation+4.45e-01

Pos logit contributions
 coward+0.083
 loser+0.078
 hawk+0.073
 badly+0.066
ching+0.064
Neg logit contributions
 Katie-0.083
 doorstep-0.077
iah-0.076
 dining-0.076
 counters-0.075
Token,
Token ID11
Feature activation+3.90e-01

Pos logit contributions
 coward+0.022
 loser+0.019
 hawk+0.019
 villain+0.015
Bad+0.015
Neg logit contributions
 Katie-0.031
view-0.029
 dining-0.028
 doorstep-0.028
iah-0.028
Token hugged
Token ID41130
Feature activation-8.51e-01

Pos logit contributions
 coward+0.019
 hawk+0.016
 loser+0.016
 badly+0.015
 hurt+0.014
Neg logit contributions
 Katie-0.026
 dining-0.025
 doorstep-0.025
 counters-0.025
iah-0.025
Token her
Token ID607
Feature activation-9.49e-01

Pos logit contributions
 Katie+0.053
 counters+0.050
 dining+0.049
iah+0.049
view+0.049
Neg logit contributions
 coward-0.046
 loser-0.041
 hawk-0.040
 badly-0.035
ching-0.034
Token tighter
Token ID28619
Feature activation+1.00e+01

Pos logit contributions
 Katie+0.061
 doorstep+0.058
 counters+0.057
 dining+0.057
iah+0.056
Neg logit contributions
 coward-0.051
 hawk-0.044
 loser-0.043
Bad-0.037
 villain-0.037
Token and
Token ID290
Feature activation-3.04e-01

Pos logit contributions
 coward+0.459
 hawk+0.386
 loser+0.379
 villain+0.327
Bad+0.319
Neg logit contributions
 Katie-0.728
 doorstep-0.680
 dining-0.672
iah-0.671
view-0.670
Token said
Token ID531
Feature activation-1.05e+00

Pos logit contributions
 Katie+0.019
 dining+0.018
 counters+0.018
view+0.018
 doorstep+0.018
Neg logit contributions
 coward-0.016
 loser-0.014
 hawk-0.014
 badly-0.012
 hunter-0.012
Token he
Token ID339
Feature activation+3.03e+00

Pos logit contributions
 Katie+0.076
 dining+0.070
 counters+0.068
iah+0.068
view+0.067
Neg logit contributions
 coward-0.046
 loser-0.041
 hawk-0.041
 hunter-0.033
 hunters-0.033
Token knew
Token ID2993
Feature activation+7.24e-01

Pos logit contributions
 coward+0.149
 loser+0.136
 hawk+0.132
 badly+0.111
 villain+0.108
Neg logit contributions
 Katie-0.199
 dining-0.187
 doorstep-0.187
iah-0.187
 counters-0.186
Token she
Token ID673
Feature activation+1.35e+00

Pos logit contributions
 coward+0.033
 loser+0.030
 hawk+0.028
 hunter+0.023
 badly+0.023
Neg logit contributions
 Katie-0.052
iah-0.048
 dining-0.048
 counters-0.048
view-0.047
Token her
Token ID607
Feature activation-9.46e-01

Pos logit contributions
 Katie+0.021
 dining+0.021
 counters+0.021
view+0.021
 doorstep+0.020
Neg logit contributions
 coward-0.013
 loser-0.011
 hawk-0.011
 badly-0.009
 villain-0.009
Token and
Token ID290
Feature activation-2.01e-01

Pos logit contributions
 Katie+0.064
 dining+0.061
iah+0.061
 counters+0.061
view+0.060
Neg logit contributions
 coward-0.045
 loser-0.040
 hawk-0.039
 badly-0.035
 hunter-0.033
Token held
Token ID2714
Feature activation+1.43e+00

Pos logit contributions
 Katie+0.012
 counters+0.011
 dining+0.011
 theatre+0.011
iah+0.011
Neg logit contributions
 coward-0.011
 loser-0.011
 hawk-0.010
 badly-0.009
 villain-0.009
Token her
Token ID607
Feature activation-9.52e-01

Pos logit contributions
 coward+0.073
 loser+0.065
 hawk+0.063
 badly+0.054
 villain+0.053
Neg logit contributions
 Katie-0.096
iah-0.088
 counters-0.087
 dining-0.086
 doorstep-0.085
Token dog
Token ID3290
Feature activation+3.89e+00

Pos logit contributions
 Katie+0.046
 dining+0.044
 doorstep+0.044
 counters+0.044
view+0.043
Neg logit contributions
 coward-0.063
 loser-0.057
 hawk-0.056
 badly-0.050
 villain-0.050
Token tighter
Token ID28619
Feature activation+1.00e+01

Pos logit contributions
 coward+0.199
 loser+0.174
 hawk+0.173
 villain+0.145
Bad+0.140
Neg logit contributions
 Katie-0.248
 doorstep-0.241
 counters-0.240
 dining-0.239
view-0.233
Token.
Token ID13
Feature activation+1.26e-02

Pos logit contributions
 coward+0.558
 hawk+0.480
 loser+0.467
 villain+0.436
Bad+0.408
Neg logit contributions
 Katie-0.622
 doorstep-0.578
view-0.577
iah-0.577
 counters-0.570
Token\n
Token ID198
Feature activation+1.19e+00

Pos logit contributions
 coward+0.001
 loser+0.001
 hawk+0.001
 badly+0.001
 Sly+0.000
Neg logit contributions
 Katie-0.001
 dining-0.001
view-0.001
 doorstep-0.001
 counters-0.001
Token\n
Token ID198
Feature activation+1.18e+00

Pos logit contributions
 coward+0.057
 loser+0.050
 hawk+0.048
 badly+0.043
 hunter+0.040
Neg logit contributions
 Katie-0.080
 dining-0.077
 counters-0.076
view-0.076
 doorstep-0.075
Token"
Token ID1
Feature activation-1.74e-01

Pos logit contributions
 coward+0.047
 loser+0.040
 hawk+0.039
 badly+0.033
 villain+0.031
Neg logit contributions
 Katie-0.090
 dining-0.087
 counters-0.086
view-0.085
iah-0.084
TokenGo
Token ID5247
Feature activation-1.83e+00

Pos logit contributions
 Katie+0.010
view+0.009
iah+0.009
 dining+0.009
 pavement+0.009
Neg logit contributions
 coward-0.010
 loser-0.009
 hawk-0.009
 badly-0.008
 hero-0.008
Token.
Token ID13
Feature activation-1.05e-01

Pos logit contributions
 Katie+0.055
 doorstep+0.053
 counters+0.052
 dining+0.052
view+0.051
Neg logit contributions
 coward-0.045
 loser-0.040
 hawk-0.039
Bad-0.032
ching-0.032
Token She
Token ID1375
Feature activation-9.72e-01

Pos logit contributions
view+0.006
 dining+0.006
 counters+0.006
iah+0.006
 doorstep+0.006
Neg logit contributions
 coward-0.005
 loser-0.005
 hawk-0.004
 Sly-0.004
 Tweet-0.004
Token hugged
Token ID41130
Feature activation-6.94e-01

Pos logit contributions
 Katie+0.056
 dining+0.053
 counters+0.053
 doorstep+0.053
view+0.052
Neg logit contributions
 coward-0.056
 loser-0.051
 hawk-0.049
 badly-0.044
 villain-0.042
Token it
Token ID340
Feature activation+1.18e+00

Pos logit contributions
 Katie+0.041
 counters+0.037
iah+0.037
 pavement+0.037
view+0.037
Neg logit contributions
 coward-0.041
 loser-0.038
 hawk-0.036
 badly-0.032
ching-0.031
Token even
Token ID772
Feature activation-6.44e-01

Pos logit contributions
 coward+0.046
 loser+0.040
 hawk+0.038
 badly+0.036
 hero+0.033
Neg logit contributions
 Katie-0.087
 dining-0.085
view-0.084
 counters-0.084
iah-0.084
Token tighter
Token ID28619
Feature activation+1.00e+01

Pos logit contributions
 Katie+0.049
view+0.047
 counters+0.046
 dining+0.046
shirt+0.045
Neg logit contributions
 coward-0.026
 loser-0.024
 hawk-0.021
 badly-0.018
 villain-0.017
Token and
Token ID290
Feature activation-1.41e-01

Pos logit contributions
 coward+0.496
 loser+0.421
 hawk+0.418
 villain+0.362
 hunter+0.343
Neg logit contributions
 Katie-0.692
 doorstep-0.640
view-0.634
iah-0.632
 dining-0.632
Token fell
Token ID3214
Feature activation+4.68e+00

Pos logit contributions
 Katie+0.009
 dining+0.008
 counters+0.008
view+0.008
 doorstep+0.008
Neg logit contributions
 coward-0.008
 loser-0.007
 hawk-0.007
 badly-0.006
 villain-0.006
Token asleep
Token ID16039
Feature activation+4.14e+00

Pos logit contributions
 coward+0.268
 loser+0.241
 hawk+0.231
Bad+0.197
 hunter+0.197
Neg logit contributions
 Katie-0.276
 counters-0.273
 pavement-0.268
 dining-0.265
 doorstep-0.260
Token with
Token ID351
Feature activation-9.65e-01

Pos logit contributions
 coward+0.203
 hawk+0.180
 loser+0.176
 Sly+0.145
ching+0.144
Neg logit contributions
 Katie-0.274
 dining-0.269
 doorstep-0.259
 counters-0.254
view-0.254
Token a
Token ID257
Feature activation+7.95e-01

Pos logit contributions
 Katie+0.066
 dining+0.063
 counters+0.063
view+0.062
 doorstep+0.062
Neg logit contributions
 coward-0.046
 loser-0.040
 hawk-0.039
 badly-0.034
 hunter-0.032
Token.
Token ID13
Feature activation-1.36e-02

Pos logit contributions
 Katie+0.167
view+0.161
iah+0.160
 dining+0.160
 counters+0.160
Neg logit contributions
 coward-0.073
 loser-0.064
 hawk-0.056
 badly-0.052
 hurt-0.045
Token But
Token ID887
Feature activation-1.28e+00

Pos logit contributions
view+0.001
 doorstep+0.001
 Katie+0.001
 dining+0.001
 theatre+0.001
Neg logit contributions
 coward-0.001
 loser-0.001
 hawk-0.001
 Sly-0.001
 badly-0.001
Token if
Token ID611
Feature activation+3.48e+00

Pos logit contributions
 Katie+0.087
 dining+0.083
iah+0.081
 counters+0.081
 doorstep+0.080
Neg logit contributions
 coward-0.061
 loser-0.054
 hawk-0.053
 badly-0.046
ching-0.044
Token she
Token ID673
Feature activation+1.42e+00

Pos logit contributions
 coward+0.196
 loser+0.179
 hawk+0.173
ching+0.155
 badly+0.151
Neg logit contributions
 Katie-0.209
 dining-0.194
iah-0.190
 counters-0.188
 doorstep-0.187
Token gets
Token ID3011
Feature activation+2.08e+00

Pos logit contributions
 coward+0.069
 loser+0.063
 hawk+0.061
 badly+0.054
 weaker+0.052
Neg logit contributions
 Katie-0.091
iah-0.087
 doorstep-0.086
 dining-0.086
 counters-0.085
Token worse
Token ID4785
Feature activation+9.98e+00

Pos logit contributions
 coward+0.085
 loser+0.076
 hawk+0.071
 villain+0.059
 badly+0.057
Neg logit contributions
 Katie-0.156
 dining-0.148
 counters-0.148
iah-0.147
view-0.146
Token,
Token ID11
Feature activation+4.47e-01

Pos logit contributions
 coward+0.386
 loser+0.329
 hawk+0.316
 badly+0.267
 hunter+0.246
Neg logit contributions
 Katie-0.765
 dining-0.735
 counters-0.730
view-0.726
 doorstep-0.724
Token call
Token ID869
Feature activation-2.77e+00

Pos logit contributions
 coward+0.023
 loser+0.020
 hawk+0.019
 badly+0.017
ching+0.016
Neg logit contributions
 Katie-0.029
 dining-0.028
 counters-0.027
 doorstep-0.027
iah-0.027
Token me
Token ID502
Feature activation-5.05e+00

Pos logit contributions
 Katie+0.200
 dining+0.193
view+0.193
 counters+0.192
 doorstep+0.191
Neg logit contributions
 coward-0.115
 loser-0.101
 hawk-0.094
 badly-0.085
 villain-0.078
Token.
Token ID13
Feature activation+1.50e-02

Pos logit contributions
 Katie+0.363
 dining+0.349
 counters+0.345
iah+0.343
 doorstep+0.342
Neg logit contributions
 coward-0.218
 loser-0.189
 hawk-0.186
 badly-0.157
 villain-0.149
Token And
Token ID843
Feature activation-1.21e+00

Pos logit contributions
 coward+0.001
 loser+0.001
 hawk+0.001
 badly+0.001
 villain+0.001
Neg logit contributions
 Katie-0.001
view-0.001
iah-0.001
 doorstep-0.001
 dining-0.001
Token.
Token ID13
Feature activation-1.55e-01

Pos logit contributions
 coward+0.096
 loser+0.085
 hawk+0.077
 badly+0.075
 hero+0.065
Neg logit contributions
 Katie-0.203
iah-0.196
 dining-0.195
view-0.194
 counters-0.192
Token They
Token ID1119
Feature activation-9.52e-01

Pos logit contributions
iah+0.010
 dining+0.010
 contain+0.010
view+0.010
 occasion+0.010
Neg logit contributions
 coward-0.006
 loser-0.005
 too-0.005
 Tweet-0.005
 Wh-0.005
Token hugged
Token ID41130
Feature activation-8.08e-01

Pos logit contributions
 Katie+0.061
 counters+0.060
 dining+0.059
 pavement+0.059
view+0.058
Neg logit contributions
 coward-0.047
 loser-0.040
 hawk-0.040
 badly-0.037
 hurt-0.035
Token her
Token ID607
Feature activation-9.34e-01

Pos logit contributions
 dining+0.055
iah+0.053
 doorstep+0.053
 counters+0.053
view+0.053
Neg logit contributions
 coward-0.034
 loser-0.031
 badly-0.030
 hawk-0.029
 on-0.027
Token even
Token ID772
Feature activation-7.75e-01

Pos logit contributions
 Katie+0.069
 dining+0.067
 counters+0.066
view+0.066
 doorstep+0.066
Neg logit contributions
 coward-0.038
 loser-0.033
 hawk-0.032
 badly-0.028
 hunter-0.025
Token tighter
Token ID28619
Feature activation+9.97e+00

Pos logit contributions
 Katie+0.064
 counters+0.062
 dining+0.062
view+0.061
iah+0.060
Neg logit contributions
 coward-0.026
 loser-0.022
 hawk-0.021
 badly-0.016
 villain-0.015
Token when
Token ID618
Feature activation-3.13e-01

Pos logit contributions
 coward+0.397
 loser+0.339
 hawk+0.326
 badly+0.275
 villain+0.258
Neg logit contributions
 Katie-0.760
 dining-0.730
 counters-0.723
view-0.721
 doorstep-0.721
Token she
Token ID673
Feature activation+1.28e+00

Pos logit contributions
 Katie+0.022
 dining+0.021
 counters+0.021
view+0.021
 doorstep+0.021
Neg logit contributions
 coward-0.014
 loser-0.012
 hawk-0.012
 badly-0.010
 villain-0.010
Token promised
Token ID8072
Feature activation+4.38e+00

Pos logit contributions
 coward+0.061
 loser+0.052
 hawk+0.051
 badly+0.046
 hunter+0.043
Neg logit contributions
 Katie-0.087
 counters-0.083
 dining-0.083
 pavement-0.082
view-0.081
Token never
Token ID1239
Feature activation+5.00e+00

Pos logit contributions
 coward+0.152
 loser+0.129
 hawk+0.120
 badly+0.114
 hurt+0.097
Neg logit contributions
 Katie-0.339
 dining-0.337
 counters-0.332
iah-0.331
view-0.330
Token to
Token ID284
Feature activation-2.47e-01

Pos logit contributions
 coward+0.203
 loser+0.177
 hawk+0.169
 badly+0.143
 villain+0.134
Neg logit contributions
 Katie-0.373
 dining-0.357
view-0.354
 counters-0.354
 doorstep-0.354
Token feel
Token ID1254
Feature activation+2.36e-01

Pos logit contributions
 coward+0.297
 hawk+0.286
 loser+0.268
 badly+0.256
Bad+0.249
Neg logit contributions
 Katie-0.206
 contain-0.200
 pavement-0.183
view-0.183
 occasion-0.181
Token better
Token ID1365
Feature activation+8.59e+00

Pos logit contributions
 coward+0.010
 loser+0.009
 hawk+0.009
 villain+0.007
 Sly+0.007
Neg logit contributions
 Katie-0.017
 counters-0.015
view-0.015
 pavement-0.015
 dining-0.015
Token.
Token ID13
Feature activation+2.14e-02

Pos logit contributions
 coward+0.450
 hawk+0.426
 loser+0.385
 Sly+0.352
 hunter+0.350
Neg logit contributions
 Katie-0.513
view-0.513
 dining-0.495
 doorstep-0.489
 counters-0.483
Token She
Token ID1375
Feature activation-8.44e-01

Pos logit contributions
 coward+0.001
 loser+0.001
 hawk+0.001
 badly+0.001
 Sly+0.001
Neg logit contributions
 Katie-0.001
 dining-0.001
view-0.001
iah-0.001
 doorstep-0.001
Token felt
Token ID2936
Feature activation+6.94e-01

Pos logit contributions
 Katie+0.050
iah+0.047
 doorstep+0.047
 counters+0.046
view+0.046
Neg logit contributions
 coward-0.045
 loser-0.043
 hawk-0.040
 weaker-0.035
 badly-0.035
Token worse
Token ID4785
Feature activation+9.96e+00

Pos logit contributions
 coward+0.030
 loser+0.028
 hawk+0.026
 villain+0.021
 hero+0.021
Neg logit contributions
 Katie-0.047
 counters-0.045
view-0.045
 dining-0.044
 pavement-0.044
Token.
Token ID13
Feature activation+1.80e-02

Pos logit contributions
 coward+0.514
 hawk+0.457
 loser+0.439
 Sly+0.386
 hunter+0.381
Neg logit contributions
 Katie-0.614
view-0.606
 dining-0.594
 pavement-0.585
 counters-0.577
Token She
Token ID1375
Feature activation-8.37e-01

Pos logit contributions
 coward+0.001
 loser+0.001
 hawk+0.001
 badly+0.001
 Sly+0.001
Neg logit contributions
 Katie-0.001
view-0.001
 dining-0.001
iah-0.001
 doorstep-0.001
Token saw
Token ID2497
Feature activation-2.03e+00

Pos logit contributions
 Katie+0.048
 doorstep+0.045
iah+0.045
 counters+0.045
 pavement+0.044
Neg logit contributions
 coward-0.046
 loser-0.044
 hawk-0.041
 badly-0.035
 hunters-0.035
Token the
Token ID262
Feature activation+2.28e-01

Pos logit contributions
 Katie+0.140
 counters+0.130
view+0.129
 theatre+0.128
 dining+0.128
Neg logit contributions
 coward-0.094
 loser-0.087
 hawk-0.080
 badly-0.069
 hunter-0.066
Token dog
Token ID3290
Feature activation+3.99e+00

Pos logit contributions
 coward+0.008
 loser+0.006
 hawk+0.006
 badly+0.006
 Sly+0.005
Neg logit contributions
 dining-0.017
 Katie-0.017
 doorstep-0.017
 pavement-0.017
 counters-0.017
Token,
Token ID11
Feature activation+2.66e-01

Pos logit contributions
 Katie+0.072
 counters+0.071
iah+0.069
 dining+0.069
 pavement+0.068
Neg logit contributions
 coward-0.046
 loser-0.040
 hawk-0.038
 villain-0.033
 badly-0.033
Token but
Token ID475
Feature activation-1.92e+00

Pos logit contributions
 coward+0.013
 loser+0.011
 hawk+0.011
 badly+0.010
 hunter+0.009
Neg logit contributions
 Katie-0.018
 dining-0.017
 counters-0.017
view-0.017
 doorstep-0.017
Token the
Token ID262
Feature activation+1.81e-01

Pos logit contributions
 Katie+0.130
iah+0.121
 dining+0.121
 counters+0.119
view+0.117
Neg logit contributions
 coward-0.095
 loser-0.084
 hawk-0.082
 hunter-0.072
ching-0.070
Token sting
Token ID24276
Feature activation+4.44e+00

Pos logit contributions
 coward+0.012
 loser+0.011
 hawk+0.010
 badly+0.009
 hunter+0.009
Neg logit contributions
 Katie-0.009
 dining-0.009
view-0.009
 counters-0.009
 doorstep-0.009
Token got
Token ID1392
Feature activation+1.33e+00

Pos logit contributions
 coward+0.228
 loser+0.207
 hawk+0.196
 badly+0.175
 hero+0.170
Neg logit contributions
 Katie-0.277
 counters-0.265
iah-0.261
view-0.261
 pavement-0.260
Token worse
Token ID4785
Feature activation+9.95e+00

Pos logit contributions
 coward+0.058
 loser+0.057
 hawk+0.050
 villain+0.049
 hero+0.046
Neg logit contributions
 counters-0.086
 shades-0.084
 Katie-0.082
 anticipation-0.081
iah-0.080
Token.
Token ID13
Feature activation-1.38e-01

Pos logit contributions
 coward+0.466
 loser+0.402
 hawk+0.395
 hunter+0.328
 badly+0.327
Neg logit contributions
 Katie-0.686
 dining-0.658
 counters-0.657
view-0.651
 pavement-0.647
Token She
Token ID1375
Feature activation-8.75e-01

Pos logit contributions
view+0.008
 occasion+0.008
 dining+0.008
 theatre+0.008
iah+0.008
Neg logit contributions
 coward-0.007
 loser-0.007
 badly-0.006
 hawk-0.006
 too-0.006
Token rubbed
Token ID31862
Feature activation+2.40e+00

Pos logit contributions
 Katie+0.050
 counters+0.049
iah+0.049
 doorstep+0.047
 dining+0.046
Neg logit contributions
 coward-0.048
 loser-0.046
 hawk-0.043
 hero-0.038
 badly-0.037
Token her
Token ID607
Feature activation-8.42e-01

Pos logit contributions
 coward+0.147
 loser+0.131
 hawk+0.129
 hunter+0.118
 villain+0.115
Neg logit contributions
 Katie-0.136
 counters-0.127
iah-0.125
view-0.121
 pavement-0.121
Token eyes
Token ID2951
Feature activation-3.56e+00

Pos logit contributions
 Katie+0.040
 counters+0.039
 dining+0.038
 doorstep+0.038
view+0.037
Neg logit contributions
 coward-0.057
 loser-0.052
 hawk-0.050
 badly-0.046
 villain-0.045
Token smiled
Token ID13541
Feature activation+4.51e-01

Pos logit contributions
 Katie+0.063
 dining+0.061
 counters+0.061
view+0.060
 pavement+0.060
Neg logit contributions
 coward-0.044
 loser-0.038
 hawk-0.037
 badly-0.034
 hunter-0.032
Token and
Token ID290
Feature activation-3.22e-01

Pos logit contributions
 coward+0.018
 loser+0.015
 hawk+0.015
 badly+0.013
 hunter+0.012
Neg logit contributions
 Katie-0.034
 dining-0.033
 counters-0.033
view-0.033
 doorstep-0.032
Token hugged
Token ID41130
Feature activation-7.96e-01

Pos logit contributions
 counters+0.020
 dining+0.020
view+0.020
 Katie+0.020
 doorstep+0.020
Neg logit contributions
 coward-0.016
 hawk-0.014
 loser-0.014
 badly-0.013
 hurt-0.013
Token the
Token ID262
Feature activation+1.74e-01

Pos logit contributions
 dining+0.061
view+0.060
 counters+0.060
 Katie+0.060
 doorstep+0.060
Neg logit contributions
 coward-0.027
 loser-0.023
 hawk-0.022
 badly-0.021
 on-0.019
Token pillow
Token ID28774
Feature activation+3.13e+00

Pos logit contributions
 coward+0.009
 loser+0.008
 hawk+0.008
 hero+0.007
 hunter+0.007
Neg logit contributions
 Katie-0.011
view-0.011
iah-0.011
 counters-0.010
 dining-0.010
Token tighter
Token ID28619
Feature activation+9.95e+00

Pos logit contributions
 coward+0.125
 loser+0.109
 hawk+0.103
 badly+0.096
 hero+0.086
Neg logit contributions
 Katie-0.230
view-0.221
iah-0.221
 counters-0.220
 dining-0.220
Token,
Token ID11
Feature activation+3.59e-01

Pos logit contributions
 coward+0.441
 loser+0.374
 hawk+0.366
 badly+0.305
 villain+0.302
Neg logit contributions
 Katie-0.722
 dining-0.687
 doorstep-0.681
 counters-0.678
iah-0.677
Token never
Token ID1239
Feature activation+4.99e+00

Pos logit contributions
 coward+0.016
 hawk+0.013
 loser+0.013
 badly+0.012
 hurt+0.012
Neg logit contributions
 Katie-0.024
iah-0.024
 dining-0.024
 doorstep-0.024
 counters-0.024
Token wanting
Token ID10291
Feature activation-9.60e-01

Pos logit contributions
 coward+0.234
 loser+0.207
 hawk+0.200
 badly+0.175
 villain+0.165
Neg logit contributions
 Katie-0.343
 dining-0.327
view-0.323
 counters-0.323
 doorstep-0.322
Token to
Token ID284
Feature activation-2.72e-01

Pos logit contributions
 Katie+0.075
 dining+0.072
 counters+0.071
view+0.071
 doorstep+0.071
Neg logit contributions
 coward-0.036
 loser-0.031
 hawk-0.029
 badly-0.025
 hunter-0.023
Token let
Token ID1309
Feature activation+9.25e-01

Pos logit contributions
 Katie+0.016
 dining+0.016
 counters+0.016
iah+0.016
view+0.015
Neg logit contributions
 coward-0.015
 loser-0.013
 hawk-0.013
 badly-0.012
 hunter-0.011
Token big
Token ID1263
Feature activation-1.44e+00

Pos logit contributions
 coward+0.027
 loser+0.024
 hawk+0.024
 badly+0.021
 hunter+0.020
Neg logit contributions
 Katie-0.031
view-0.029
 dining-0.029
 counters-0.029
iah-0.029
Token,
Token ID11
Feature activation+1.72e-01

Pos logit contributions
 Katie+0.087
 dining+0.083
 counters+0.082
 doorstep+0.081
view+0.081
Neg logit contributions
 coward-0.079
 loser-0.071
 hawk-0.068
 badly-0.062
ching-0.059
Token beautiful
Token ID4950
Feature activation-3.54e+00

Pos logit contributions
 coward+0.009
 loser+0.008
 hawk+0.008
 badly+0.007
 hunter+0.007
Neg logit contributions
 Katie-0.011
 dining-0.010
view-0.010
 doorstep-0.010
 counters-0.010
Token rainbow
Token ID27223
Feature activation-2.83e+00

Pos logit contributions
 Katie+0.210
 counters+0.202
view+0.202
iah+0.201
 dining+0.201
Neg logit contributions
 coward-0.188
 hawk-0.172
 loser-0.170
 badly-0.156
 hunter-0.147
Token-
Token ID12
Feature activation+5.79e-01

Pos logit contributions
 Katie+0.209
 dining+0.202
 counters+0.201
iah+0.199
view+0.198
Neg logit contributions
 coward-0.111
 loser-0.098
 hawk-0.097
 badly-0.085
ching-0.078
Tokencol
Token ID4033
Feature activation-9.69e+00

Pos logit contributions
 coward+0.046
 hawk+0.044
 loser+0.044
 badly+0.042
ching+0.041
Neg logit contributions
 dining-0.018
 Katie-0.018
 counters-0.017
 doorstep-0.017
iah-0.016
Tokenoured
Token ID8167
Feature activation-3.44e+00

Pos logit contributions
 dining+0.321
 Katie+0.316
 counters+0.295
view+0.286
 patio+0.280
Neg logit contributions
 coward-0.751
 loser-0.706
 hawk-0.691
 badly-0.681
ls-0.655
Token butterfly
Token ID35113
Feature activation+2.94e+00

Pos logit contributions
 Katie+0.229
 dining+0.220
 counters+0.219
 doorstep+0.217
iah+0.217
Neg logit contributions
 coward-0.165
 loser-0.147
 hawk-0.143
 badly-0.128
 hunter-0.119
Token!
Token ID0
Feature activation-1.50e+00

Pos logit contributions
 coward+0.126
 loser+0.110
 hawk+0.106
 badly+0.092
 hunter+0.085
Neg logit contributions
 Katie-0.212
 dining-0.203
 counters-0.202
view-0.201
 doorstep-0.200
Token The
Token ID383
Feature activation-1.33e+00

Pos logit contributions
 dining+0.089
view+0.089
 counters+0.089
 doorstep+0.088
iah+0.088
Neg logit contributions
 coward-0.076
 loser-0.069
 hawk-0.068
 Tweet-0.065
 badly-0.064
Token little
Token ID1310
Feature activation-1.75e+00

Pos logit contributions
 Katie+0.088
 counters+0.085
iah+0.085
view+0.085
 dining+0.085
Neg logit contributions
 coward-0.063
 loser-0.057
 hawk-0.056
 badly-0.049
 hunter-0.047
Token delicious
Token ID12625
Feature activation-4.56e-01

Pos logit contributions
 doorstep+0.074
 Katie+0.072
yah+0.070
iah+0.070
view+0.070
Neg logit contributions
 coward-0.049
 hawk-0.045
 on-0.043
 hunters-0.042
 loser-0.042
Token vegetables
Token ID13701
Feature activation-8.42e-01

Pos logit contributions
view+0.031
 Katie+0.031
 doorstep+0.031
 contain+0.030
iah+0.030
Neg logit contributions
 coward-0.018
 hawk-0.017
 on-0.017
 hunters-0.016
ls-0.016
Token and
Token ID290
Feature activation-3.87e-01

Pos logit contributions
 Katie+0.067
 dining+0.064
 counters+0.064
view+0.064
 doorstep+0.063
Neg logit contributions
 coward-0.030
 loser-0.025
 hawk-0.024
 badly-0.021
 hunter-0.019
Token bright
Token ID6016
Feature activation-4.39e+00

Pos logit contributions
 Katie+0.023
 doorstep+0.022
view+0.022
 theatre+0.021
 dining+0.021
Neg logit contributions
 coward-0.021
 hawk-0.018
 loser-0.018
 hurt-0.017
 hunter-0.016
Token-
Token ID12
Feature activation+8.63e-01

Pos logit contributions
 Katie+0.262
 doorstep+0.260
 dining+0.253
view+0.249
 theatre+0.246
Neg logit contributions
 coward-0.217
 hawk-0.206
 loser-0.192
 on-0.189
 hunters-0.184
Tokencol
Token ID4033
Feature activation-9.48e+00

Pos logit contributions
 coward+0.049
 hawk+0.044
 loser+0.044
ching+0.043
ls+0.043
Neg logit contributions
 doorstep-0.045
 Katie-0.044
 dining-0.043
 theatre-0.042
 clothing-0.041
Tokenoured
Token ID8167
Feature activation-3.43e+00

Pos logit contributions
 dining+0.370
 Katie+0.360
view+0.337
 theatre+0.336
 counters+0.333
Neg logit contributions
 coward-0.660
 hawk-0.607
 loser-0.605
ls-0.597
 badly-0.592
Token flowers
Token ID12734
Feature activation-2.55e+00

Pos logit contributions
 Katie+0.211
 doorstep+0.203
view+0.200
 dining+0.200
iah+0.198
Neg logit contributions
 coward-0.180
 hawk-0.161
 loser-0.159
 badly-0.143
 hunters-0.140
Token.
Token ID13
Feature activation-2.06e-01

Pos logit contributions
 Katie+0.201
 dining+0.192
 counters+0.191
 doorstep+0.191
iah+0.190
Neg logit contributions
 coward-0.090
 loser-0.075
 hawk-0.074
 badly-0.064
ching-0.058
Token (space)
Token ID220
Feature activation-2.14e+00

Pos logit contributions
view+0.015
 doorstep+0.015
 counters+0.015
 dining+0.014
 Katie+0.014
Neg logit contributions
 coward-0.008
 hawk-0.007
 Tweet-0.007
 loser-0.006
 Sly-0.006
Token\n
Token ID198
Feature activation+1.28e+00

Pos logit contributions
 Katie+0.153
 dining+0.149
 counters+0.148
 doorstep+0.147
 pavement+0.146
Neg logit contributions
 coward-0.090
 loser-0.077
 hawk-0.074
 badly-0.068
ching-0.063
Token of
Token ID286
Feature activation+1.58e-01

Pos logit contributions
 Katie+0.010
view+0.010
 dining+0.010
 counters+0.010
 doorstep+0.010
Neg logit contributions
 coward-0.007
 loser-0.007
 hawk-0.006
 Sly-0.005
 villain-0.005
Token laundry
Token ID25724
Feature activation-5.06e+00

Pos logit contributions
 coward+0.009
 loser+0.008
 hawk+0.007
 badly+0.007
 Sly+0.006
Neg logit contributions
 Katie-0.010
 counters-0.009
 dining-0.009
 pavement-0.009
view-0.009
Token on
Token ID319
Feature activation+2.49e+00

Pos logit contributions
 Katie+0.344
 dining+0.330
 counters+0.328
view+0.326
 doorstep+0.325
Neg logit contributions
 coward-0.238
 loser-0.209
 hawk-0.202
 badly-0.178
 hunter-0.167
Token the
Token ID262
Feature activation+3.38e-01

Pos logit contributions
 coward+0.099
 loser+0.083
 hawk+0.079
 badly+0.065
 villain+0.063
Neg logit contributions
 Katie-0.194
 dining-0.184
 counters-0.183
 pavement-0.181
 doorstep-0.181
Token clothes
Token ID8242
Feature activation-5.10e+00

Pos logit contributions
 coward+0.024
 loser+0.023
 hawk+0.023
 badly+0.021
 hurt+0.021
Neg logit contributions
 Katie-0.014
view-0.013
iah-0.013
 dining-0.012
 counters-0.012
Tokenline
Token ID1370
Feature activation-9.36e+00

Pos logit contributions
 Katie+0.392
 dining+0.378
 counters+0.375
view+0.373
iah+0.373
Neg logit contributions
 coward-0.188
 loser-0.163
 hawk-0.152
 badly-0.139
ching-0.121
Token.
Token ID13
Feature activation-3.29e-02

Pos logit contributions
 Katie+0.750
 dining+0.730
 counters+0.721
iah+0.719
 doorstep+0.716
Neg logit contributions
 coward-0.300
 loser-0.256
 hawk-0.238
 badly-0.226
 on-0.197
Token "
Token ID366
Feature activation+2.61e-01

Pos logit contributions
 Katie+0.002
 dining+0.002
 counters+0.002
view+0.002
 doorstep+0.002
Neg logit contributions
 coward-0.002
 loser-0.001
 hawk-0.001
 badly-0.001
 villain-0.001
TokenHe
Token ID1544
Feature activation+1.83e+00

Pos logit contributions
 coward+0.014
 loser+0.012
 hawk+0.012
 badly+0.011
Bad+0.010
Neg logit contributions
 Katie-0.017
 dining-0.016
 doorstep-0.016
 counters-0.016
iah-0.015
Token wants
Token ID3382
Feature activation+1.27e-01

Pos logit contributions
 coward+0.083
 loser+0.073
 hawk+0.070
 badly+0.062
ching+0.058
Neg logit contributions
 Katie-0.128
 dining-0.123
view-0.122
 counters-0.121
 doorstep-0.121
Token to
Token ID284
Feature activation-2.49e-01

Pos logit contributions
 coward+0.005
 loser+0.005
 hawk+0.004
 badly+0.004
ching+0.003
Neg logit contributions
 Katie-0.009
 dining-0.009
 counters-0.009
 doorstep-0.009
view-0.009
Token a
Token ID257
Feature activation+5.37e-01

Pos logit contributions
 Katie+0.070
 dining+0.067
 counters+0.066
view+0.066
 doorstep+0.066
Neg logit contributions
 coward-0.036
 loser-0.031
 hawk-0.029
 badly-0.025
 villain-0.023
Token strange
Token ID6283
Feature activation-1.91e+00

Pos logit contributions
 coward+0.027
 loser+0.024
 hawk+0.023
 badly+0.021
ching+0.020
Neg logit contributions
 Katie-0.034
 dining-0.033
 counters-0.032
 doorstep-0.032
iah-0.032
Token,
Token ID11
Feature activation+2.49e-01

Pos logit contributions
 Katie+0.124
 dining+0.120
 counters+0.119
view+0.119
iah+0.118
Neg logit contributions
 coward-0.094
 hawk-0.084
 loser-0.084
 badly-0.075
 hunter-0.071
Token raspberry
Token ID38973
Feature activation+1.74e+00

Pos logit contributions
 coward+0.011
 hawk+0.010
 loser+0.010
 badly+0.009
 hunter+0.008
Neg logit contributions
 Katie-0.017
 dining-0.016
 doorstep-0.016
iah-0.016
view-0.016
Token-
Token ID12
Feature activation+1.10e+00

Pos logit contributions
 coward+0.084
 loser+0.074
 hawk+0.072
 badly+0.064
 hunter+0.060
Neg logit contributions
 Katie-0.117
 dining-0.111
 counters-0.110
view-0.110
 doorstep-0.110
Tokencol
Token ID4033
Feature activation-9.16e+00

Pos logit contributions
 coward+0.077
 loser+0.071
 hawk+0.070
 badly+0.065
 hunter+0.062
Neg logit contributions
 Katie-0.049
 dining-0.046
 counters-0.045
view-0.045
 doorstep-0.045
Tokenoured
Token ID8167
Feature activation-3.41e+00

Pos logit contributions
 Katie+0.289
 dining+0.271
view+0.257
 counters+0.256
 doorstep+0.250
Neg logit contributions
 coward-0.752
 loser-0.706
 hawk-0.689
 badly-0.658
 Sly-0.640
Token object
Token ID2134
Feature activation-5.18e-01

Pos logit contributions
 Katie+0.212
 dining+0.203
 counters+0.201
view+0.200
 doorstep+0.199
Neg logit contributions
 coward-0.180
 loser-0.160
 hawk-0.155
 badly-0.139
 villain-0.132
Token hiding
Token ID11816
Feature activation+1.96e+00

Pos logit contributions
 Katie+0.039
 counters+0.038
 dining+0.038
 doorstep+0.037
view+0.037
Neg logit contributions
 coward-0.020
 loser-0.018
 hawk-0.016
 badly-0.013
 villain-0.013
Token under
Token ID739
Feature activation+9.33e-01

Pos logit contributions
 coward+0.116
 loser+0.103
 hawk+0.099
 villain+0.088
ching+0.087
Neg logit contributions
 Katie-0.115
iah-0.107
 dining-0.106
 counters-0.106
view-0.106
Token a
Token ID257
Feature activation+5.44e-01

Pos logit contributions
 coward+0.033
 loser+0.028
 hawk+0.027
 badly+0.023
 hurt+0.021
Neg logit contributions
 Katie-0.073
 dining-0.072
view-0.071
iah-0.071
 counters-0.071
TokenI
Token ID40
Feature activation-2.39e+00

Pos logit contributions
 coward+0.008
 loser+0.007
 hawk+0.007
 badly+0.006
Bad+0.006
Neg logit contributions
 Katie-0.010
 dining-0.010
 counters-0.010
 doorstep-0.010
iah-0.010
Token fell
Token ID3214
Feature activation+4.70e+00

Pos logit contributions
 Katie+0.142
iah+0.133
 dining+0.130
 doorstep+0.129
 occasion+0.129
Neg logit contributions
 coward-0.123
 hawk-0.116
 loser-0.113
 weaker-0.104
 badly-0.099
Token off
Token ID572
Feature activation+2.62e+00

Pos logit contributions
 coward+0.247
 hawk+0.210
 loser+0.204
 dodged+0.174
 hunter+0.172
Neg logit contributions
 Katie-0.292
 pavement-0.285
view-0.279
 counters-0.278
 patio-0.275
Token the
Token ID262
Feature activation+1.12e-01

Pos logit contributions
 coward+0.114
 loser+0.099
 hawk+0.095
 badly+0.078
 Sly+0.076
Neg logit contributions
 Katie-0.192
 dining-0.182
view-0.178
 counters-0.178
iah-0.177
Token clothes
Token ID8242
Feature activation-4.91e+00

Pos logit contributions
 coward+0.007
 loser+0.006
 hawk+0.006
 badly+0.006
 hunter+0.005
Neg logit contributions
 Katie-0.006
 dining-0.006
view-0.006
 counters-0.006
iah-0.006
Tokenline
Token ID1370
Feature activation-9.15e+00

Pos logit contributions
 Katie+0.367
 dining+0.354
 counters+0.353
view+0.350
iah+0.349
Neg logit contributions
 coward-0.194
 loser-0.169
 hawk-0.159
 badly-0.142
ching-0.126
Token and
Token ID290
Feature activation-4.18e-01

Pos logit contributions
 Katie+0.725
 dining+0.705
 counters+0.705
iah+0.697
 doorstep+0.695
Neg logit contributions
 coward-0.308
 loser-0.269
 hawk-0.245
 badly-0.222
 hurt-0.193
Token I
Token ID314
Feature activation-2.50e+00

Pos logit contributions
 Katie+0.029
 dining+0.027
 counters+0.027
view+0.027
iah+0.027
Neg logit contributions
 coward-0.020
 loser-0.017
 hawk-0.017
 badly-0.015
 villain-0.014
Token can
Token ID460
Feature activation+1.94e+00

Pos logit contributions
 Katie+0.154
 doorstep+0.148
view+0.147
 dining+0.146
iah+0.146
Neg logit contributions
 coward-0.125
 hawk-0.114
 loser-0.114
 hunters-0.097
 weaker-0.095
Token't
Token ID470
Feature activation+5.77e+00

Pos logit contributions
 coward+0.085
 loser+0.073
 hawk+0.072
 badly+0.061
 hunters+0.059
Neg logit contributions
 Katie-0.138
 doorstep-0.133
 dining-0.132
 counters-0.131
view-0.131
Token get
Token ID651
Feature activation+1.08e+00

Pos logit contributions
 coward+0.341
 loser+0.311
 hawk+0.301
 badly+0.278
 villain+0.263
Neg logit contributions
 Katie-0.326
 dining-0.311
 counters-0.308
view-0.304
iah-0.303
Token back
Token ID736
Feature activation+1.77e+00

Pos logit contributions
 Katie+0.002
 dining+0.002
 counters+0.001
 doorstep+0.001
view+0.001
Neg logit contributions
 coward-0.001
 loser-0.001
 hawk-0.001
 badly-0.001
 villain-0.001
Token on
Token ID319
Feature activation+2.58e+00

Pos logit contributions
 coward+0.055
 loser+0.047
 hawk+0.044
 badly+0.040
 on+0.039
Neg logit contributions
 Katie-0.142
view-0.139
 dining-0.139
 counters-0.139
iah-0.139
Token the
Token ID262
Feature activation+5.83e-02

Pos logit contributions
 coward+0.098
 loser+0.084
 hawk+0.081
 badly+0.069
 hunter+0.063
Neg logit contributions
 Katie-0.197
 dining-0.191
 counters-0.190
view-0.189
iah-0.188
Token zip
Token ID19974
Feature activation-5.19e-01

Pos logit contributions
 hawk+0.003
 coward+0.003
 hurt+0.003
 on+0.003
 bad+0.003
Neg logit contributions
 Katie-0.003
view-0.003
eline-0.003
aney-0.003
 Birthday-0.003
Token-
Token ID12
Feature activation+1.02e+00

Pos logit contributions
 Katie+0.039
 dining+0.038
 counters+0.038
view+0.038
iah+0.038
Neg logit contributions
 coward-0.019
 loser-0.017
 hawk-0.016
 badly-0.014
ching-0.013
Tokenline
Token ID1370
Feature activation-9.15e+00

Pos logit contributions
 coward+0.060
 loser+0.056
 hawk+0.054
ching+0.054
 badly+0.052
Neg logit contributions
 dining-0.051
 Katie-0.050
 counters-0.050
 doorstep-0.050
 pavement-0.048
Token,
Token ID11
Feature activation+2.68e-01

Pos logit contributions
 Katie+0.746
 dining+0.740
 counters+0.738
iah+0.730
view+0.724
Neg logit contributions
 coward-0.257
 loser-0.222
 hawk-0.198
 badly-0.194
 on-0.182
Token z
Token ID1976
Feature activation+1.56e+00

Pos logit contributions
 coward+0.013
 loser+0.011
 hawk+0.011
 badly+0.010
 hurt+0.009
Neg logit contributions
 Katie-0.018
view-0.017
iah-0.017
 dining-0.017
 doorstep-0.017
Tokenipping
Token ID4501
Feature activation-9.25e-01

Pos logit contributions
 coward+0.091
 loser+0.083
 hawk+0.080
 badly+0.073
ching+0.069
Neg logit contributions
 Katie-0.089
 dining-0.085
 counters-0.084
view-0.084
 doorstep-0.084
Token all
Token ID477
Feature activation-1.21e+00

Pos logit contributions
 Katie+0.071
 dining+0.068
 counters+0.068
view+0.068
 doorstep+0.068
Neg logit contributions
 coward-0.035
 loser-0.030
 hawk-0.029
 badly-0.025
ching-0.023
Token around
Token ID1088
Feature activation-2.52e+00

Pos logit contributions
 Katie+0.090
 doorstep+0.089
 counters+0.089
 dining+0.089
iah+0.087
Neg logit contributions
 coward-0.044
 loser-0.039
 hawk-0.037
 badly-0.033
 on-0.032
Token it
Token ID340
Feature activation+9.40e-01

Pos logit contributions
 coward+0.010
 loser+0.009
 hawk+0.008
 badly+0.007
ching+0.007
Neg logit contributions
 Katie-0.020
 dining-0.019
iah-0.019
 counters-0.019
 pavement-0.019
Token would
Token ID561
Feature activation+2.25e+00

Pos logit contributions
 coward+0.052
 loser+0.049
 hawk+0.046
 badly+0.044
 Sly+0.041
Neg logit contributions
iah-0.051
 theatre-0.050
view-0.050
 dining-0.050
 Katie-0.050
Token hug
Token ID16225
Feature activation+1.51e+00

Pos logit contributions
 coward+0.122
 loser+0.110
 hawk+0.106
 badly+0.096
ching+0.091
Neg logit contributions
 Katie-0.136
 dining-0.130
view-0.128
 counters-0.128
 doorstep-0.128
Token the
Token ID262
Feature activation+1.56e-01

Pos logit contributions
 coward+0.060
 loser+0.051
 hawk+0.049
 badly+0.042
ching+0.039
Neg logit contributions
 Katie-0.115
 dining-0.110
 counters-0.109
view-0.109
 doorstep-0.108
Token shore
Token ID15191
Feature activation-5.33e+00

Pos logit contributions
 coward+0.008
 loser+0.007
 hawk+0.007
 badly+0.006
 hunter+0.006
Neg logit contributions
 Katie-0.010
 dining-0.009
 counters-0.009
view-0.009
 doorstep-0.009
Tokenline
Token ID1370
Feature activation-9.15e+00

Pos logit contributions
 Katie+0.434
 counters+0.417
 doorstep+0.416
 dining+0.415
view+0.413
Neg logit contributions
 coward-0.169
 loser-0.146
 hawk-0.131
 badly-0.116
 on-0.111
Token and
Token ID290
Feature activation-3.91e-01

Pos logit contributions
 Katie+0.736
 counters+0.718
 dining+0.714
iah+0.708
 doorstep+0.708
Neg logit contributions
 coward-0.290
 loser-0.246
 hawk-0.229
 badly-0.214
 on-0.202
Token admire
Token ID21099
Feature activation-2.02e+00

Pos logit contributions
 Katie+0.022
 dining+0.021
 counters+0.021
view+0.021
 doorstep+0.020
Neg logit contributions
 coward-0.023
 loser-0.021
 hawk-0.020
 badly-0.018
 hunter-0.018
Token the
Token ID262
Feature activation+1.67e-01

Pos logit contributions
 Katie+0.157
 dining+0.148
view+0.147
iah+0.146
 counters+0.146
Neg logit contributions
 coward-0.079
 loser-0.069
 hawk-0.063
 badly-0.052
ching-0.050
Token wrecked
Token ID44770
Feature activation+3.03e+00

Pos logit contributions
 coward+0.010
 loser+0.009
 hawk+0.009
 badly+0.008
 hunter+0.008
Neg logit contributions
 Katie-0.009
 dining-0.009
 counters-0.009
 doorstep-0.009
view-0.009
Token boats
Token ID16659
Feature activation-2.80e+00

Pos logit contributions
 coward+0.153
 loser+0.136
 hawk+0.130
 badly+0.117
 Sly+0.111
Neg logit contributions
 Katie-0.196
 dining-0.187
 counters-0.186
view-0.185
 doorstep-0.185
Token a
Token ID257
Feature activation+5.22e-01

Pos logit contributions
 on+0.010
 coward+0.009
 loser+0.007
 too+0.006
 better+0.006
Neg logit contributions
 dining-0.049
view-0.048
 doorstep-0.048
 counters-0.047
 patio-0.046
Token walk
Token ID2513
Feature activation-1.63e+00

Pos logit contributions
 coward+0.028
 hawk+0.025
 loser+0.025
 badly+0.023
 hurt+0.022
Neg logit contributions
 Katie-0.031
 counters-0.031
view-0.031
 dining-0.030
 pavement-0.030
Token along
Token ID1863
Feature activation+7.27e-01

Pos logit contributions
 Katie+0.125
 dining+0.121
 counters+0.121
view+0.120
 doorstep+0.120
Neg logit contributions
 coward-0.061
 loser-0.053
 hawk-0.050
 badly-0.045
 hunter-0.039
Token the
Token ID262
Feature activation+1.19e-01

Pos logit contributions
 coward+0.022
 loser+0.019
 hawk+0.018
 badly+0.017
 on+0.014
Neg logit contributions
 Katie-0.058
 counters-0.058
 dining-0.057
iah-0.057
view-0.057
Token shore
Token ID15191
Feature activation-5.29e+00

Pos logit contributions
 coward+0.008
 hawk+0.007
 hunter+0.007
 loser+0.007
 hurt+0.007
Neg logit contributions
 Katie-0.005
aney-0.005
view-0.005
 dining-0.005
 counters-0.005
Tokenline
Token ID1370
Feature activation-9.13e+00

Pos logit contributions
 Katie+0.455
 counters+0.440
view+0.439
 doorstep+0.436
 dining+0.436
Neg logit contributions
 coward-0.125
 loser-0.107
 on-0.096
 too-0.088
 hawk-0.086
Token.
Token ID13
Feature activation-2.28e-01

Pos logit contributions
 Katie+0.766
 counters+0.761
 dining+0.753
iah+0.744
 pavement+0.735
Neg logit contributions
 coward-0.233
 loser-0.197
 badly-0.178
 on-0.178
 hawk-0.172
Token
Token ID198
Feature activation+1.21e+00

Pos logit contributions
view+0.020
 anniversary+0.020
iah+0.020
 clothing+0.019
 counters+0.019
Neg logit contributions
 Spot-0.004
 Tweet-0.004
 Zip-0.003
 too-0.003
-0.003
Token
Token ID198
Feature activation+1.21e+00

Pos logit contributions
 coward+0.042
 hawk+0.033
 loser+0.033
 badly+0.031
ching+0.029
Neg logit contributions
 counters-0.093
 dining-0.093
 pavement-0.091
iah-0.091
 Katie-0.091
TokenBut
Token ID1537
Feature activation-1.25e+00

Pos logit contributions
 coward+0.048
 loser+0.041
 badly+0.040
ching+0.039
 hawk+0.039
Neg logit contributions
iah-0.084
umm-0.082
 counters-0.081
 Katie-0.080
 dining-0.080
Token something
Token ID1223
Feature activation-2.39e+00

Pos logit contributions
 Katie+0.092
 dining+0.089
 counters+0.088
view+0.088
 doorstep+0.087
Neg logit contributions
 coward-0.051
 loser-0.044
 hawk-0.042
 badly-0.037
 hunter-0.034
Token when
Token ID618
Feature activation-4.28e-01

Pos logit contributions
 coward+0.011
 loser+0.010
 hawk+0.009
ching+0.008
 villain+0.007
Neg logit contributions
 Katie-0.017
 dining-0.015
 pavement-0.015
 counters-0.015
 theatre-0.015
Token it
Token ID340
Feature activation+9.17e-01

Pos logit contributions
 Katie+0.028
 dining+0.024
 theatre+0.024
 pavement+0.024
iah+0.024
Neg logit contributions
 coward-0.022
 loser-0.020
ching-0.017
ino-0.017
 hawk-0.016
Token hugged
Token ID41130
Feature activation-9.23e-01

Pos logit contributions
 coward+0.050
 loser+0.045
 hawk+0.043
 badly+0.039
 Sly+0.038
Neg logit contributions
 Katie-0.054
 doorstep-0.052
view-0.052
 dining-0.052
 theatre-0.051
Token the
Token ID262
Feature activation+1.12e-01

Pos logit contributions
 Katie+0.069
 pavement+0.065
iah+0.064
view+0.064
 counters+0.064
Neg logit contributions
 coward-0.040
 loser-0.035
 hawk-0.030
 badly-0.026
ching-0.026
Token shore
Token ID15191
Feature activation-5.34e+00

Pos logit contributions
 coward+0.006
 loser+0.005
 hawk+0.005
 badly+0.004
 hunter+0.004
Neg logit contributions
 Katie-0.007
 dining-0.007
 counters-0.007
view-0.007
 doorstep-0.007
Tokenline
Token ID1370
Feature activation-9.13e+00

Pos logit contributions
 Katie+0.451
view+0.438
 dining+0.438
 counters+0.436
 doorstep+0.435
Neg logit contributions
 coward-0.157
 loser-0.129
 hawk-0.122
 badly-0.100
 hunter-0.088
Token again
Token ID757
Feature activation+6.31e+00

Pos logit contributions
 Katie+0.776
 counters+0.762
 dining+0.761
view+0.753
 doorstep+0.752
Neg logit contributions
 coward-0.250
 loser-0.204
 hawk-0.196
 badly-0.167
 weaker-0.143
Token,
Token ID11
Feature activation+2.48e-01

Pos logit contributions
 coward+0.165
 loser+0.132
 hawk+0.129
 badly+0.105
 hurt+0.089
Neg logit contributions
 Katie-0.547
 counters-0.542
 dining-0.541
view-0.536
 doorstep-0.535
Token it
Token ID340
Feature activation+9.16e-01

Pos logit contributions
 coward+0.012
 loser+0.011
 hawk+0.009
ching+0.008
 badly+0.008
Neg logit contributions
 Katie-0.018
 dining-0.016
 pavement-0.016
 counters-0.015
 theatre-0.015
Token noticed
Token ID6810
Feature activation-9.68e-01

Pos logit contributions
 coward+0.052
 loser+0.050
 hawk+0.044
 badly+0.041
 Sly+0.040
Neg logit contributions
 Katie-0.051
 doorstep-0.051
view-0.050
 dining-0.049
 theatre-0.049
Token a
Token ID257
Feature activation+4.91e-01

Pos logit contributions
 Katie+0.072
view+0.068
 dining+0.067
 counters+0.067
 doorstep+0.066
Neg logit contributions
 coward-0.041
 loser-0.037
 hawk-0.033
 villain-0.028
 badly-0.028
Token things
Token ID1243
Feature activation-9.32e-01

Pos logit contributions
iah+0.026
view+0.026
 doorstep+0.026
 Katie+0.025
 dining+0.025
Neg logit contributions
 coward-0.011
 hawk-0.009
 hunters-0.009
 loser-0.009
 badly-0.009
Tokenâ
Token ID22940
Feature activation-6.15e-01

Pos logit contributions
 Katie+0.072
 dining+0.070
 counters+0.070
view+0.069
 doorstep+0.069
Neg logit contributions
 coward-0.034
 loser-0.028
 hawk-0.027
 badly-0.024
ching-0.022
Token€
Token ID26391
Feature activation-1.82e+00

Pos logit contributions
 desks+0.029
 Katie+0.029
 balcony+0.029
 counters+0.029
 dining+0.028
Neg logit contributions
 loser-0.036
 coward-0.033
ching-0.031
 better-0.030
 harder-0.030
Token�
Token ID447
Feature activation-1.04e+00

Pos logit contributions
 pavement+0.112
 anticipation+0.111
 contain+0.110
view+0.110
 mant+0.110
Neg logit contributions
 coward-0.084
 loser-0.076
 badly-0.072
 hawk-0.068
ls-0.067
Token�
Token ID251
Feature activation+4.05e-01

Pos logit contributions
 Katie+0.064
view+0.063
 pavement+0.062
 doorstep+0.062
 counters+0.061
Neg logit contributions
 coward-0.050
 loser-0.045
 hawk-0.045
 badly-0.041
 on-0.040
Tokencol
Token ID4033
Feature activation-9.10e+00

Pos logit contributions
 on+0.012
 too+0.011
 coward+0.011
 better+0.009
 again+0.009
Neg logit contributions
 doorstep-0.033
 pavement-0.032
 counters-0.031
 dining-0.030
view-0.030
Tokenouring
Token ID21823
Feature activation+1.04e+00

Pos logit contributions
 dining+0.317
 Katie+0.311
 counters+0.305
 pavement+0.295
view+0.292
Neg logit contributions
 coward-0.692
 loser-0.642
 badly-0.624
 hawk-0.604
ching-0.586
Token books
Token ID3835
Feature activation+9.07e-02

Pos logit contributions
 coward+0.037
 loser+0.031
 hawk+0.030
 badly+0.027
 Sly+0.024
Neg logit contributions
 Katie-0.080
 doorstep-0.078
 dining-0.078
 counters-0.077
view-0.077
Token,
Token ID11
Feature activation+2.22e-01

Pos logit contributions
 coward+0.003
 loser+0.003
 hawk+0.002
 badly+0.002
ching+0.002
Neg logit contributions
 Katie-0.007
 dining-0.007
 counters-0.007
iah-0.007
view-0.007
Token stickers
Token ID28568
Feature activation+1.79e+00

Pos logit contributions
 coward+0.009
 too+0.007
 loser+0.007
 badly+0.007
 hawk+0.007
Neg logit contributions
 doorstep-0.015
iah-0.015
 dining-0.015
 counters-0.015
umm-0.015
Token and
Token ID290
Feature activation-4.95e-01

Pos logit contributions
 coward+0.069
 loser+0.059
 hawk+0.057
 badly+0.048
 villain+0.044
Neg logit contributions
 Katie-0.137
 dining-0.132
 counters-0.131
view-0.130
 doorstep-0.130
TokenIt
Token ID1026
Feature activation-1.31e+00

Pos logit contributions
 coward+0.054
 badly+0.048
 hawk+0.048
 loser+0.047
ching+0.043
Neg logit contributions
 Katie-0.076
iah-0.075
 dining-0.073
view-0.072
 counters-0.072
Token was
Token ID373
Feature activation-2.65e-01

Pos logit contributions
 Katie+0.100
 dining+0.098
view+0.098
 counters+0.096
 pavement+0.095
Neg logit contributions
 coward-0.049
 hawk-0.040
 loser-0.039
 hurt-0.035
 badly-0.035
Token a
Token ID257
Feature activation+5.28e-01

Pos logit contributions
 Katie+0.021
 doorstep+0.021
view+0.021
 counters+0.021
iah+0.021
Neg logit contributions
 coward-0.008
 loser-0.007
 hawk-0.006
 badly-0.006
 on-0.005
Token zip
Token ID19974
Feature activation-5.47e-01

Pos logit contributions
 coward+0.028
 loser+0.025
 hawk+0.024
 badly+0.022
 hunter+0.021
Neg logit contributions
 Katie-0.033
view-0.032
 counters-0.032
iah-0.031
 dining-0.031
Token-
Token ID12
Feature activation+1.10e+00

Pos logit contributions
 Katie+0.037
 dining+0.036
 counters+0.035
view+0.035
iah+0.035
Neg logit contributions
 coward-0.026
 loser-0.023
 hawk-0.022
 badly-0.020
 hunter-0.018
Tokenline
Token ID1370
Feature activation-9.08e+00

Pos logit contributions
 coward+0.067
 loser+0.063
ching+0.063
 hawk+0.062
 badly+0.062
Neg logit contributions
 dining-0.050
 Katie-0.050
 doorstep-0.048
 counters-0.047
iah-0.045
Token!
Token ID0
Feature activation-1.57e+00

Pos logit contributions
 counters+0.771
 Katie+0.766
 dining+0.761
iah+0.753
view+0.739
Neg logit contributions
 on-0.194
 coward-0.191
 badly-0.187
 loser-0.176
 too-0.172
Token It
Token ID632
Feature activation-1.59e+00

Pos logit contributions
view+0.103
iah+0.102
 dining+0.095
 clothing+0.093
 theatre+0.092
Neg logit contributions
 coward-0.070
 badly-0.066
 Tweet-0.065
 Blue-0.064
 loser-0.062
Token was
Token ID373
Feature activation-2.67e-01

Pos logit contributions
 Katie+0.119
 dining+0.116
view+0.115
 counters+0.115
iah+0.113
Neg logit contributions
 coward-0.063
 hawk-0.052
 loser-0.052
 badly-0.046
 hurt-0.045
Token made
Token ID925
Feature activation+9.22e-01

Pos logit contributions
 Katie+0.020
view+0.019
 doorstep+0.019
 dining+0.019
 counters+0.019
Neg logit contributions
 coward-0.011
 loser-0.009
 hawk-0.009
 badly-0.008
 on-0.007
Token of
Token ID286
Feature activation+1.42e-01

Pos logit contributions
 coward+0.022
 on+0.019
 badly+0.018
 loser+0.018
 too+0.017
Neg logit contributions
 dining-0.078
iah-0.077
view-0.076
 Katie-0.076
 doorstep-0.076
Tokenrobe
Token ID25481
Feature activation-1.03e+00

Pos logit contributions
 Katie+0.106
 dining+0.100
 counters+0.100
view+0.098
 doorstep+0.098
Neg logit contributions
 coward-0.151
 loser-0.138
 hawk-0.137
 badly-0.125
 hunter-0.121
Token back
Token ID736
Feature activation+1.86e+00

Pos logit contributions
 Katie+0.068
 pavement+0.066
 doorstep+0.065
view+0.064
 counters+0.064
Neg logit contributions
 coward-0.049
 hawk-0.043
 loser-0.042
Bad-0.036
 hunter-0.034
Token onto
Token ID4291
Feature activation-2.24e-01

Pos logit contributions
 coward+0.090
 loser+0.078
 hawk+0.078
Bad+0.066
 villain+0.065
Neg logit contributions
 Katie-0.125
 dining-0.118
 patio-0.118
 doorstep-0.117
 counters-0.117
Token the
Token ID262
Feature activation+1.11e-01

Pos logit contributions
 Katie+0.016
 pavement+0.015
 dining+0.014
 counters+0.014
iah+0.014
Neg logit contributions
 coward-0.010
 loser-0.009
 hawk-0.008
 badly-0.007
ls-0.007
Token clothes
Token ID8242
Feature activation-4.85e+00

Pos logit contributions
 coward+0.007
 loser+0.006
 hawk+0.006
 badly+0.005
 hunter+0.005
Neg logit contributions
 Katie-0.006
 dining-0.006
view-0.006
 counters-0.006
iah-0.006
Tokenline
Token ID1370
Feature activation-9.05e+00

Pos logit contributions
 Katie+0.369
 dining+0.356
 counters+0.353
view+0.351
iah+0.350
Neg logit contributions
 coward-0.188
 loser-0.162
 hawk-0.153
 badly-0.134
ching-0.121
Token.
Token ID13
Feature activation-1.59e-01

Pos logit contributions
 Katie+0.695
 dining+0.672
 counters+0.667
iah+0.662
 doorstep+0.661
Neg logit contributions
 coward-0.340
 loser-0.293
 hawk-0.276
 badly-0.245
ching-0.220
Token He
Token ID679
Feature activation+1.33e+00

Pos logit contributions
 Katie+0.010
view+0.010
 dining+0.010
iah+0.009
 doorstep+0.009
Neg logit contributions
 coward-0.008
 loser-0.007
 hawk-0.007
 Sly-0.006
 badly-0.006
Token used
Token ID973
Feature activation+4.95e+00

Pos logit contributions
 coward+0.078
 loser+0.074
 hawk+0.072
 badly+0.065
Bad+0.060
Neg logit contributions
 Katie-0.070
 theatre-0.066
 pavement-0.064
 counters-0.064
iah-0.063
Token all
Token ID477
Feature activation-1.16e+00

Pos logit contributions
 coward+0.233
 loser+0.207
 hawk+0.193
 badly+0.161
 hunter+0.160
Neg logit contributions
 Katie-0.346
 anticipation-0.329
 pavement-0.321
 counters-0.321
 clothing-0.321
Token his
Token ID465
Feature activation+8.45e-01

Pos logit contributions
 Katie+0.084
view+0.079
 dining+0.077
 counters+0.077
 pavement+0.077
Neg logit contributions
 coward-0.052
 loser-0.046
 hawk-0.044
 badly-0.038
 hunter-0.036
Token suddenly
Token ID6451
Feature activation+1.88e+00

Pos logit contributions
 Katie+0.022
 dining+0.020
iah+0.020
 counters+0.020
 doorstep+0.020
Neg logit contributions
 coward-0.017
 loser-0.016
 hawk-0.015
 badly-0.013
ching-0.013
Token,
Token ID11
Feature activation+2.30e-01

Pos logit contributions
 coward+0.088
 loser+0.078
 hawk+0.073
 badly+0.065
ching+0.061
Neg logit contributions
 Katie-0.132
 dining-0.124
 counters-0.122
iah-0.122
 doorstep-0.121
Token a
Token ID257
Feature activation+5.67e-01

Pos logit contributions
 coward+0.012
 loser+0.010
 hawk+0.010
 badly+0.009
 hunter+0.008
Neg logit contributions
 Katie-0.015
 dining-0.014
 counters-0.014
view-0.014
iah-0.014
Token brightly
Token ID35254
Feature activation-3.00e+00

Pos logit contributions
 coward+0.027
 loser+0.024
 hawk+0.023
 badly+0.020
 Sly+0.019
Neg logit contributions
 Katie-0.038
 dining-0.037
 counters-0.036
 doorstep-0.036
iah-0.036
Token-
Token ID12
Feature activation+1.18e+00

Pos logit contributions
 Katie+0.161
 dining+0.157
 doorstep+0.153
 counters+0.152
iah+0.148
Neg logit contributions
 coward-0.177
 hawk-0.168
 loser-0.161
 badly-0.152
 hurt-0.152
Tokencol
Token ID4033
Feature activation-9.04e+00

Pos logit contributions
 coward+0.085
 loser+0.078
 hawk+0.078
 badly+0.073
ching+0.071
Neg logit contributions
 Katie-0.050
 dining-0.048
 counters-0.047
 doorstep-0.047
iah-0.045
Tokenoured
Token ID8167
Feature activation-3.28e+00

Pos logit contributions
 Katie+0.278
 dining+0.261
 counters+0.251
view+0.243
 doorstep+0.234
Neg logit contributions
 coward-0.756
 loser-0.709
 hawk-0.695
 badly-0.665
 villain-0.638
Token belt
Token ID10999
Feature activation+1.69e+00

Pos logit contributions
 Katie+0.232
 dining+0.223
 counters+0.221
view+0.221
 doorstep+0.220
Neg logit contributions
 coward-0.144
 loser-0.126
 hawk-0.122
 badly-0.106
 hunter-0.099
Token caught
Token ID4978
Feature activation+3.06e+00

Pos logit contributions
 coward+0.078
 loser+0.068
 hawk+0.064
 badly+0.055
 Sly+0.054
Neg logit contributions
 Katie-0.116
 doorstep-0.113
view-0.112
 pavement-0.112
 counters-0.112
Token her
Token ID607
Feature activation-9.78e-01

Pos logit contributions
 coward+0.141
 loser+0.124
 hawk+0.116
 badly+0.100
 villain+0.097
Neg logit contributions
 Katie-0.217
 dining-0.205
iah-0.204
 counters-0.203
view-0.201
Token eye
Token ID4151
Feature activation-3.22e+00

Pos logit contributions
 Katie+0.051
 dining+0.048
 counters+0.047
iah+0.047
view+0.047
Neg logit contributions
 coward-0.062
 loser-0.057
 hawk-0.055
 badly-0.051
 hunter-0.049
Token the
Token ID262
Feature activation+9.32e-02

Pos logit contributions
 counters+0.027
view+0.027
 dining+0.027
 doorstep+0.026
umm+0.026
Neg logit contributions
 coward-0.007
 loser-0.005
 hawk-0.005
 on-0.003
 hunter-0.003
Token edge
Token ID5743
Feature activation-3.68e+00

Pos logit contributions
 coward+0.006
 hawk+0.005
 loser+0.005
 hunter+0.005
 hurt+0.005
Neg logit contributions
 Katie-0.005
iah-0.004
view-0.004
 counters-0.004
 dining-0.004
Token of
Token ID286
Feature activation+1.30e-01

Pos logit contributions
 Katie+0.301
 dining+0.300
 doorstep+0.297
 counters+0.296
iah+0.295
Neg logit contributions
 coward-0.106
 loser-0.091
 hawk-0.085
 on-0.075
 badly-0.075
Token the
Token ID262
Feature activation+9.60e-02

Pos logit contributions
 coward+0.004
 loser+0.003
 hawk+0.003
 badly+0.003
 hunter+0.002
Neg logit contributions
 Katie-0.011
 dining-0.011
 counters-0.010
 doorstep-0.010
view-0.010
Token shore
Token ID15191
Feature activation-5.21e+00

Pos logit contributions
 coward+0.006
 hawk+0.005
 loser+0.005
 hunter+0.005
 hurt+0.005
Neg logit contributions
 Katie-0.005
view-0.005
iah-0.005
shirt-0.005
 occasion-0.005
Tokenline
Token ID1370
Feature activation-9.02e+00

Pos logit contributions
 Katie+0.454
view+0.442
 doorstep+0.440
 counters+0.440
 dining+0.436
Neg logit contributions
 coward-0.116
 loser-0.097
 on-0.082
 hawk-0.082
 too-0.081
Token and
Token ID290
Feature activation-4.42e-01

Pos logit contributions
 Katie+0.778
 counters+0.770
 dining+0.765
iah+0.753
 pavement+0.752
Neg logit contributions
 coward-0.201
 loser-0.161
 on-0.150
 hawk-0.149
 badly-0.148
Token just
Token ID655
Feature activation+2.16e+00

Pos logit contributions
 Katie+0.029
view+0.029
 counters+0.028
 dining+0.028
 doorstep+0.028
Neg logit contributions
 coward-0.021
 hawk-0.017
 loser-0.017
 badly-0.016
 hurt-0.016
Token stood
Token ID6204
Feature activation+1.30e+00

Pos logit contributions
 coward+0.103
 loser+0.089
 hawk+0.088
 hurt+0.084
 badly+0.081
Neg logit contributions
 Katie-0.137
 counters-0.137
 dining-0.136
 doorstep-0.135
iah-0.132
Token there
Token ID612
Feature activation-8.63e-01

Pos logit contributions
 coward+0.052
 loser+0.045
 hawk+0.043
 badly+0.037
 hunter+0.034
Neg logit contributions
 Katie-0.098
 dining-0.094
 counters-0.093
view-0.093
 doorstep-0.092
Token.
Token ID13
Feature activation-1.35e-01

Pos logit contributions
 Katie+0.067
 counters+0.065
 dining+0.065
view+0.065
iah+0.064
Neg logit contributions
 coward-0.031
 loser-0.026
 hawk-0.024
 badly-0.022
ching-0.020
Token jumped
Token ID11687
Feature activation-4.57e-02

Pos logit contributions
 Katie+0.079
 dining+0.078
view+0.077
 counters+0.077
umm+0.076
Neg logit contributions
 coward-0.042
 loser-0.035
 hawk-0.035
 hurt-0.033
 badly-0.032
Token onto
Token ID4291
Feature activation-2.01e-01

Pos logit contributions
 Katie+0.003
 counters+0.003
 pavement+0.003
 doorstep+0.003
 dining+0.003
Neg logit contributions
 coward-0.002
 loser-0.002
 hawk-0.002
Bad-0.002
 villain-0.002
Token the
Token ID262
Feature activation+2.19e-01

Pos logit contributions
 Katie+0.016
 dining+0.015
 counters+0.015
 doorstep+0.015
 pavement+0.015
Neg logit contributions
 coward-0.008
 loser-0.007
 hawk-0.006
 badly-0.005
 hunter-0.005
Token zip
Token ID19974
Feature activation-5.07e-01

Pos logit contributions
 coward+0.011
 loser+0.010
 hawk+0.010
 badly+0.009
 hunter+0.009
Neg logit contributions
 Katie-0.014
 dining-0.013
view-0.013
iah-0.013
 counters-0.013
Token-
Token ID12
Feature activation+1.16e+00

Pos logit contributions
 Katie+0.039
 dining+0.038
 counters+0.038
view+0.038
iah+0.037
Neg logit contributions
 coward-0.019
 loser-0.016
 hawk-0.015
 badly-0.013
 hunter-0.012
Tokenline
Token ID1370
Feature activation-9.01e+00

Pos logit contributions
 coward+0.070
 loser+0.066
 hawk+0.063
ching+0.062
 badly+0.061
Neg logit contributions
 dining-0.058
 Katie-0.057
 counters-0.055
 doorstep-0.054
view-0.051
Token and
Token ID290
Feature activation-4.33e-01

Pos logit contributions
 dining+0.797
 Katie+0.792
 counters+0.792
iah+0.782
view+0.770
Neg logit contributions
 coward-0.173
 loser-0.154
 badly-0.143
 on-0.143
 too-0.134
Token z
Token ID1976
Feature activation+1.64e+00

Pos logit contributions
 Katie+0.028
 dining+0.027
 counters+0.027
view+0.027
 doorstep+0.027
Neg logit contributions
 coward-0.022
 loser-0.019
 hawk-0.019
 badly-0.017
 villain-0.016
Tokenipped
Token ID3949
Feature activation-6.27e-01

Pos logit contributions
 coward+0.108
 hawk+0.098
 loser+0.097
 badly+0.088
 hunter+0.087
Neg logit contributions
 Katie-0.081
iah-0.075
 dining-0.075
view-0.075
 counters-0.075
Token around
Token ID1088
Feature activation-2.42e+00

Pos logit contributions
 Katie+0.046
 counters+0.044
 dining+0.044
 doorstep+0.043
 pavement+0.043
Neg logit contributions
 coward-0.027
 loser-0.024
 hawk-0.023
 badly-0.019
 villain-0.018
Token the
Token ID262
Feature activation+2.33e-01

Pos logit contributions
 Katie+0.189
 dining+0.182
view+0.182
 counters+0.181
iah+0.180
Neg logit contributions
 coward-0.089
 loser-0.076
 hawk-0.072
 badly-0.062
 villain-0.056
Token bed
Token ID3996
Feature activation-2.66e+00

Pos logit contributions
 coward+0.083
 loser+0.072
 hawk+0.069
 badly+0.061
 villain+0.056
Neg logit contributions
 Katie-0.144
 dining-0.139
 counters-0.138
view-0.138
iah-0.137
Token wearing
Token ID5762
Feature activation-2.40e+00

Pos logit contributions
 Katie+0.208
 counters+0.202
 dining+0.201
iah+0.200
 pavement+0.200
Neg logit contributions
 coward-0.090
 loser-0.077
 badly-0.070
 hawk-0.067
 on-0.057
Token his
Token ID465
Feature activation+9.52e-01

Pos logit contributions
 Katie+0.203
 dining+0.198
 doorstep+0.196
 counters+0.196
view+0.196
Neg logit contributions
 coward-0.069
 loser-0.055
 hawk-0.053
 badly-0.045
 hunter-0.038
Token favourite
Token ID12507
Feature activation-5.05e+00

Pos logit contributions
 coward+0.046
 loser+0.043
 hawk+0.040
 too+0.040
 badly+0.039
Neg logit contributions
 Katie-0.059
view-0.057
 counters-0.056
 dining-0.056
 doorstep-0.056
Token py
Token ID12972
Feature activation-4.67e+00

Pos logit contributions
 Katie+0.300
iah+0.284
 dining+0.283
view+0.283
 pavement+0.282
Neg logit contributions
 coward-0.254
 loser-0.232
 hawk-0.227
 badly-0.227
 on-0.226
Tokenj
Token ID73
Feature activation-8.98e+00

Pos logit contributions
 Katie+0.177
 dining+0.166
 anticipation+0.161
 counters+0.158
umm+0.157
Neg logit contributions
 badly-0.316
 coward-0.296
 too-0.292
 on-0.290
 loser-0.290
Tokenamas
Token ID17485
Feature activation-3.05e+00

Pos logit contributions
 anticipation+0.378
 Katie+0.371
 contain+0.369
 dining+0.362
 mant+0.361
Neg logit contributions
 loser-0.544
 coward-0.539
ching-0.525
 badly-0.514
 too-0.512
Token,
Token ID11
Feature activation+2.73e-01

Pos logit contributions
 Katie+0.240
 counters+0.235
 dining+0.233
view+0.230
 pavement+0.230
Neg logit contributions
 coward-0.101
 loser-0.087
 hawk-0.079
 badly-0.077
 on-0.067
Token feeling
Token ID4203
Feature activation-7.44e-01

Pos logit contributions
 coward+0.010
 loser+0.008
 hawk+0.008
 badly+0.008
 hurt+0.007
Neg logit contributions
 Katie-0.020
 dining-0.020
 doorstep-0.020
iah-0.020
umm-0.020
Token so
Token ID523
Feature activation-4.09e+00

Pos logit contributions
 doorstep+0.051
 Katie+0.051
 pavement+0.050
 counters+0.049
 dining+0.049
Neg logit contributions
 coward-0.034
 loser-0.029
 hawk-0.028
 badly-0.027
 hurt-0.025
Token proud
Token ID6613
Feature activation-1.50e+00

Pos logit contributions
 doorstep+0.285
 pavement+0.283
 counters+0.279
 Katie+0.275
 patio+0.275
Neg logit contributions
 coward-0.170
 badly-0.155
 hurt-0.144
 bad-0.140
 better-0.139
Token day
Token ID1110
Feature activation-4.74e-01

Pos logit contributions
 Katie+0.061
 dining+0.058
 counters+0.058
view+0.058
iah+0.058
Neg logit contributions
 coward-0.044
 loser-0.039
 hawk-0.037
 badly-0.034
 villain-0.032
Token,
Token ID11
Feature activation+3.29e-01

Pos logit contributions
 Katie+0.030
 dining+0.027
iah+0.026
 counters+0.026
 theatre+0.026
Neg logit contributions
 coward-0.025
 loser-0.022
 hawk-0.021
 weaker-0.019
 villain-0.018
Token a
Token ID257
Feature activation+6.46e-01

Pos logit contributions
 coward+0.016
 loser+0.015
 hawk+0.014
ching+0.012
 weaker+0.012
Neg logit contributions
 Katie-0.022
 dining-0.020
 counters-0.020
iah-0.019
 patio-0.019
Token chocolate
Token ID11311
Feature activation+1.22e+00

Pos logit contributions
 coward+0.034
 loser+0.031
 hawk+0.029
 badly+0.027
ching+0.026
Neg logit contributions
 Katie-0.039
 dining-0.037
 counters-0.036
iah-0.036
 doorstep-0.036
Token-
Token ID12
Feature activation+1.21e+00

Pos logit contributions
 coward+0.057
 loser+0.050
 hawk+0.047
 badly+0.042
ching+0.040
Neg logit contributions
 Katie-0.084
 dining-0.081
 counters-0.080
 doorstep-0.080
view-0.080
Tokencol
Token ID4033
Feature activation-8.97e+00

Pos logit contributions
 coward+0.080
 loser+0.073
 hawk+0.071
 badly+0.066
ching+0.064
Neg logit contributions
 Katie-0.060
 dining-0.057
 counters-0.056
 doorstep-0.056
view-0.055
Tokenoured
Token ID8167
Feature activation-3.21e+00

Pos logit contributions
 Katie+0.268
 dining+0.246
 counters+0.239
view+0.235
 doorstep+0.231
Neg logit contributions
 coward-0.762
 loser-0.713
 hawk-0.700
 badly-0.665
 Sly-0.642
Token whip
Token ID22972
Feature activation+3.45e+00

Pos logit contributions
 Katie+0.209
 dining+0.200
 counters+0.198
view+0.197
 doorstep+0.197
Neg logit contributions
 coward-0.160
 loser-0.143
 hawk-0.139
 badly-0.124
 hunter-0.117
Token named
Token ID3706
Feature activation-1.37e+00

Pos logit contributions
 coward+0.131
 loser+0.113
 hawk+0.104
 badly+0.086
 villain+0.082
Neg logit contributions
 Katie-0.263
 dining-0.254
view-0.254
 doorstep-0.254
 pavement-0.253
Token C
Token ID327
Feature activation-1.95e+00

Pos logit contributions
 Katie+0.099
 dining+0.095
 counters+0.094
view+0.094
 doorstep+0.094
Neg logit contributions
 coward-0.058
 loser-0.051
 hawk-0.049
 badly-0.042
 Sly-0.040
Tokenheddar
Token ID44937
Feature activation+2.64e+00

Pos logit contributions
 Katie+0.101
 anticipation+0.101
 contain+0.095
 counters+0.092
 mant+0.091
Neg logit contributions
 coward-0.118
 loser-0.106
 hawk-0.103
ching-0.101
 badly-0.096
Token,
Token ID11
Feature activation+2.52e-01

Pos logit contributions
 Katie+0.250
 dining+0.242
 counters+0.240
view+0.239
 doorstep+0.238
Neg logit contributions
 coward-0.122
 loser-0.104
 hawk-0.100
 badly-0.087
 hunter-0.078
Token with
Token ID351
Feature activation-1.19e+00

Pos logit contributions
 coward+0.010
 hawk+0.008
 loser+0.008
 too+0.007
 badly+0.007
Neg logit contributions
 dining-0.019
 doorstep-0.019
 Katie-0.018
iah-0.018
view-0.018
Token all
Token ID477
Feature activation-1.10e+00

Pos logit contributions
 dining+0.082
 doorstep+0.082
 counters+0.081
iah+0.081
view+0.080
Neg logit contributions
 coward-0.052
 hawk-0.045
 loser-0.044
 badly-0.041
 too-0.039
Token its
Token ID663
Feature activation+1.88e+00

Pos logit contributions
 doorstep+0.080
 dining+0.079
 counters+0.079
 Katie+0.078
 pavement+0.078
Neg logit contributions
 coward-0.041
 hawk-0.035
 loser-0.035
 badly-0.031
ching-0.030
Token different
Token ID1180
Feature activation-8.52e-01

Pos logit contributions
 coward+0.108
 loser+0.098
 hawk+0.097
 badly+0.087
ching+0.087
Neg logit contributions
 Katie-0.106
 dining-0.102
view-0.102
iah-0.101
 doorstep-0.100
Tokencol
Token ID4033
Feature activation-8.96e+00

Pos logit contributions
 dining+0.043
iah+0.043
 doorstep+0.043
view+0.043
 Katie+0.043
Neg logit contributions
 coward-0.048
 loser-0.045
 hawk-0.044
 on-0.042
ching-0.042
Tokenours
Token ID4662
Feature activation-7.15e-01

Pos logit contributions
 dining+0.429
 Katie+0.420
 doorstep+0.400
view+0.398
 theatre+0.395
Neg logit contributions
 coward-0.537
 loser-0.501
ls-0.492
 badly-0.477
 hawk-0.477
Token,
Token ID11
Feature activation+2.55e-01

Pos logit contributions
 Katie+0.056
 doorstep+0.056
 dining+0.056
view+0.055
 counters+0.055
Neg logit contributions
 coward-0.024
 loser-0.020
 hawk-0.019
 badly-0.017
 on-0.016
Token shapes
Token ID15268
Feature activation-1.74e+00

Pos logit contributions
 coward+0.012
 loser+0.010
 hawk+0.010
 badly+0.009
 on+0.009
Neg logit contributions
 Katie-0.017
 doorstep-0.017
 dining-0.017
view-0.017
iah-0.016
Token and
Token ID290
Feature activation-4.19e-01

Pos logit contributions
 Katie+0.134
 dining+0.130
 counters+0.129
view+0.128
 doorstep+0.128
Neg logit contributions
 coward-0.065
 loser-0.056
 hawk-0.053
 badly-0.045
ching-0.041
Token textures
Token ID20028
Feature activation-1.42e+00

Pos logit contributions
 Katie+0.021
 doorstep+0.021
iah+0.021
 dining+0.021
view+0.021
Neg logit contributions
 coward-0.025
 loser-0.023
 hawk-0.023
 hunters-0.021
 hurt-0.021
Token warm
Token ID5814
Feature activation-3.47e+00

Pos logit contributions
 doorstep+0.025
 Katie+0.025
view+0.024
 counters+0.024
 dining+0.024
Neg logit contributions
 coward-0.022
 hawk-0.020
 loser-0.019
 badly-0.018
 hurt-0.018
Token,
Token ID11
Feature activation+2.74e-01

Pos logit contributions
 Katie+0.293
 dining+0.290
 doorstep+0.287
view+0.286
 counters+0.286
Neg logit contributions
 coward-0.083
 hawk-0.074
 on-0.071
 too-0.069
 badly-0.066
Token and
Token ID290
Feature activation-4.14e-01

Pos logit contributions
 coward+0.007
 hawk+0.006
 on+0.006
 loser+0.006
 badly+0.005
Neg logit contributions
 doorstep-0.023
 Katie-0.023
 dining-0.023
iah-0.023
view-0.022
Token the
Token ID262
Feature activation+2.11e-01

Pos logit contributions
 doorstep+0.030
view+0.029
 dining+0.028
 counters+0.028
iah+0.028
Neg logit contributions
 coward-0.017
 hawk-0.014
 loser-0.014
 badly-0.013
 on-0.013
Token shore
Token ID15191
Feature activation-5.17e+00

Pos logit contributions
 coward+0.011
 loser+0.010
 hawk+0.010
 hurt+0.009
 bad+0.009
Neg logit contributions
 Katie-0.012
 doorstep-0.012
view-0.012
shirt-0.012
eline-0.012
Tokenline
Token ID1370
Feature activation-8.95e+00

Pos logit contributions
 Katie+0.402
 dining+0.391
 doorstep+0.388
view+0.387
 counters+0.384
Neg logit contributions
 coward-0.186
 loser-0.155
 hawk-0.144
 hurt-0.131
 badly-0.129
Token was
Token ID373
Feature activation-2.09e-01

Pos logit contributions
 Katie+0.706
 dining+0.698
 counters+0.689
 doorstep+0.682
 pavement+0.677
Neg logit contributions
 coward-0.295
 loser-0.243
 badly-0.230
 hawk-0.225
 hurt-0.222
Token ancient
Token ID6156
Feature activation-1.28e+00

Pos logit contributions
 Katie+0.013
 dining+0.013
 counters+0.012
 doorstep+0.012
view+0.012
Neg logit contributions
 coward-0.011
 loser-0.010
 hawk-0.009
 badly-0.009
 hunter-0.008
Token.
Token ID13
Feature activation-1.98e-01

Pos logit contributions
 Katie+0.097
 dining+0.093
 counters+0.092
view+0.092
 doorstep+0.092
Neg logit contributions
 coward-0.050
 loser-0.043
 hawk-0.041
 badly-0.036
 hunter-0.033
Token
Token ID220
Feature activation-2.19e+00

Pos logit contributions
iah+0.016
 clothing+0.015
 anniversary+0.015
 theatre+0.015
view+0.015
Neg logit contributions
 too-0.005
 Tweet-0.005
 again-0.005
 coward-0.005
 Zip-0.005
Token
Token ID198
Feature activation+1.18e+00

Pos logit contributions
 Katie+0.161
 dining+0.160
 counters+0.158
iah+0.158
 doorstep+0.157
Neg logit contributions
 coward-0.085
 loser-0.073
 hawk-0.067
 badly-0.064
ching-0.059
Token date
Token ID3128
Feature activation-1.54e+00

Pos logit contributions
 coward+0.009
 loser+0.008
 hawk+0.008
 on+0.008
 bad+0.008
Neg logit contributions
 Katie-0.006
view-0.006
iah-0.006
 counters-0.006
 pavement-0.006
Token,
Token ID11
Feature activation+2.48e-01

Pos logit contributions
view+0.129
 Katie+0.129
iah+0.128
 dining+0.128
 counters+0.126
Neg logit contributions
 coward-0.040
 loser-0.035
 on-0.031
 hawk-0.029
 badly-0.028
Token a
Token ID257
Feature activation+5.81e-01

Pos logit contributions
 coward+0.012
 loser+0.010
 hawk+0.010
 badly+0.009
 hunter+0.008
Neg logit contributions
 Katie-0.017
 dining-0.016
 counters-0.016
view-0.016
 doorstep-0.016
Token rainbow
Token ID27223
Feature activation-2.47e+00

Pos logit contributions
 coward+0.030
 loser+0.027
 hawk+0.026
 badly+0.024
 hunter+0.023
Neg logit contributions
 Katie-0.036
view-0.035
iah-0.035
 counters-0.035
 dining-0.035
Token-
Token ID12
Feature activation+1.23e+00

Pos logit contributions
 Katie+0.168
 dining+0.165
 counters+0.162
iah+0.161
 doorstep+0.159
Neg logit contributions
 coward-0.108
 hawk-0.098
 loser-0.095
 on-0.086
 badly-0.085
Tokencol
Token ID4033
Feature activation-8.95e+00

Pos logit contributions
 coward+0.082
 hawk+0.077
 loser+0.077
ching+0.077
 on+0.075
Neg logit contributions
 dining-0.052
 counters-0.049
 doorstep-0.049
 Katie-0.048
iah-0.045
Tokenoured
Token ID8167
Feature activation-3.26e+00

Pos logit contributions
 dining+0.303
 Katie+0.297
 counters+0.281
view+0.274
 patio+0.263
Neg logit contributions
 coward-0.690
 loser-0.650
 hawk-0.629
 badly-0.627
ls-0.589
Token butterfly
Token ID35113
Feature activation+3.25e+00

Pos logit contributions
 Katie+0.199
iah+0.192
view+0.192
 dining+0.191
 counters+0.191
Neg logit contributions
 coward-0.169
 loser-0.155
 hawk-0.153
 badly-0.135
 hunter-0.129
Token flew
Token ID13112
Feature activation+2.85e+00

Pos logit contributions
 coward+0.136
 loser+0.117
 hawk+0.116
 badly+0.101
ching+0.096
Neg logit contributions
 Katie-0.235
 dining-0.228
 counters-0.225
iah-0.225
view-0.224
Token in
Token ID287
Feature activation-5.10e-01

Pos logit contributions
 coward+0.157
 loser+0.139
 hawk+0.130
 villain+0.114
 badly+0.114
Neg logit contributions
 Katie-0.179
 pavement-0.165
view-0.165
 counters-0.164
 dining-0.164
Token through
Token ID832
Feature activation-1.61e+00

Pos logit contributions
 Katie+0.038
 dining+0.037
 counters+0.037
view+0.037
iah+0.037
Neg logit contributions
 coward-0.020
 loser-0.018
 hawk-0.017
 badly-0.015
ching-0.013
Token and
Token ID290
Feature activation-4.40e-01

Pos logit contributions
 Katie+0.003
 dining+0.003
 counters+0.003
view+0.003
 doorstep+0.003
Neg logit contributions
 coward-0.002
 loser-0.002
 hawk-0.001
 badly-0.001
 villain-0.001
Token offered
Token ID4438
Feature activation+1.83e+00

Pos logit contributions
 Katie+0.027
 dining+0.025
 counters+0.025
iah+0.025
view+0.025
Neg logit contributions
 coward-0.024
 loser-0.022
 hawk-0.021
 badly-0.019
 villain-0.018
Token to
Token ID284
Feature activation-3.32e-01

Pos logit contributions
 coward+0.063
 loser+0.052
 hawk+0.050
 badly+0.042
 Sly+0.038
Neg logit contributions
 Katie-0.146
 dining-0.143
view-0.142
 counters-0.142
 doorstep-0.141
Token repair
Token ID9185
Feature activation+2.19e+00

Pos logit contributions
 Katie+0.021
 dining+0.021
iah+0.021
 counters+0.021
view+0.020
Neg logit contributions
 coward-0.016
 loser-0.014
 hawk-0.014
 badly-0.013
 hurt-0.013
Token the
Token ID262
Feature activation+1.23e-01

Pos logit contributions
 coward+0.089
 loser+0.077
 hawk+0.072
 badly+0.060
 villain+0.058
Neg logit contributions
 Katie-0.168
 counters-0.158
 dining-0.157
iah-0.157
 pavement-0.155
Token bad
Token ID2089
Feature activation+5.26e+00

Pos logit contributions
 coward+0.007
 loser+0.007
 hawk+0.007
 badly+0.006
 hero+0.006
Neg logit contributions
 Katie-0.007
view-0.006
iah-0.006
 counters-0.006
 dining-0.006
Token event
Token ID1785
Feature activation-4.29e+00

Pos logit contributions
 coward+0.235
 loser+0.207
 hawk+0.198
 badly+0.172
 hunter+0.163
Neg logit contributions
 Katie-0.372
 dining-0.359
 counters-0.357
view-0.357
iah-0.352
Token.
Token ID13
Feature activation-2.31e-01

Pos logit contributions
 Katie+0.344
view+0.337
 dining+0.337
iah+0.337
 counters+0.334
Neg logit contributions
 coward-0.135
 loser-0.113
 hawk-0.099
 badly-0.098
 on-0.088
Token
Token ID220
Feature activation-2.19e+00

Pos logit contributions
view+0.017
iah+0.017
 dining+0.017
 doorstep+0.017
 anniversary+0.016
Neg logit contributions
 coward-0.007
 Tweet-0.007
 Spot-0.007
 Blue-0.007
 too-0.007
Token
Token ID198
Feature activation+1.21e+00

Pos logit contributions
 Katie+0.155
 dining+0.151
 counters+0.149
iah+0.149
view+0.149
Neg logit contributions
 coward-0.094
 loser-0.081
 hawk-0.076
 badly-0.070
ching-0.065
Token
Token ID198
Feature activation+1.19e+00

Pos logit contributions
 coward+0.045
 loser+0.038
 hawk+0.036
 badly+0.031
ching+0.031
Neg logit contributions
 dining-0.090
 counters-0.089
 Katie-0.089
 doorstep-0.089
iah-0.089
Token diamond
Token ID15291
Feature activation+2.81e+00

Pos logit contributions
 counters+0.202
 Katie+0.195
iah+0.194
 dining+0.192
view+0.191
Neg logit contributions
 coward-0.156
 hawk-0.155
 loser-0.140
 badly-0.140
 hunter-0.133
Token flower
Token ID15061
Feature activation+9.32e-01

Pos logit contributions
 coward+0.116
 loser+0.101
 hawk+0.097
 badly+0.085
 hunter+0.078
Neg logit contributions
 Katie-0.206
 dining-0.198
 counters-0.197
view-0.196
iah-0.195
Token.
Token ID13
Feature activation-1.99e-01

Pos logit contributions
 coward+0.033
 loser+0.029
 hawk+0.028
 badly+0.025
 hunter+0.022
Neg logit contributions
 Katie-0.071
 counters-0.070
 dining-0.069
iah-0.069
view-0.069
Token She
Token ID1375
Feature activation-8.82e-01

Pos logit contributions
view+0.012
iah+0.012
 dining+0.012
 counters+0.012
 doorstep+0.012
Neg logit contributions
 coward-0.009
 loser-0.009
 Tweet-0.008
 too-0.008
 Sly-0.008
Token couldn
Token ID3521
Feature activation+1.46e+00

Pos logit contributions
 Katie+0.059
 dining+0.058
 counters+0.058
view+0.057
 pavement+0.057
Neg logit contributions
 coward-0.040
 hawk-0.034
 loser-0.033
 badly-0.031
 hurt-0.031
Token't
Token ID470
Feature activation+5.82e+00

Pos logit contributions
 coward+0.060
 loser+0.051
 hawk+0.049
 badly+0.040
 hunter+0.040
Neg logit contributions
 Katie-0.111
 doorstep-0.104
view-0.103
 dining-0.103
iah-0.102
Token believe
Token ID1975
Feature activation-5.06e+00

Pos logit contributions
 coward+0.380
 loser+0.365
 hawk+0.336
 hurt+0.327
 badly+0.325
Neg logit contributions
 counters-0.281
 dining-0.275
 Katie-0.270
iah-0.259
view-0.256
Token it
Token ID340
Feature activation+9.37e-01

Pos logit contributions
 Katie+0.368
 dining+0.348
 counters+0.345
iah+0.345
view+0.344
Neg logit contributions
 coward-0.222
 loser-0.194
 hawk-0.185
 badly-0.158
 hunter-0.152
Token.
Token ID13
Feature activation-1.90e-01

Pos logit contributions
 coward+0.036
 loser+0.031
 hawk+0.029
 badly+0.027
ching+0.025
Neg logit contributions
 Katie-0.069
view-0.069
 dining-0.069
 counters-0.068
 pavement-0.067
Token She
Token ID1375
Feature activation-8.72e-01

Pos logit contributions
view+0.012
iah+0.011
 counters+0.011
 doorstep+0.011
 dining+0.011
Neg logit contributions
 coward-0.009
 loser-0.008
 Tweet-0.008
 too-0.008
 Sly-0.007
Token had
Token ID550
Feature activation+6.63e-01

Pos logit contributions
 dining+0.058
 Katie+0.058
 counters+0.058
view+0.057
 pavement+0.057
Neg logit contributions
 coward-0.039
 hawk-0.033
 loser-0.033
 badly-0.030
 hurt-0.030
Token of
Token ID286
Feature activation+1.41e-01

Pos logit contributions
 Katie+0.019
 dining+0.018
view+0.018
 counters+0.018
 doorstep+0.018
Neg logit contributions
 coward-0.014
 hawk-0.012
 loser-0.012
 badly-0.010
 Sly-0.010
Token it
Token ID340
Feature activation+1.02e+00

Pos logit contributions
 coward+0.006
 loser+0.005
 hawk+0.005
 hunter+0.004
ls+0.004
Neg logit contributions
 Katie-0.010
 pavement-0.010
 counters-0.010
 clothing-0.009
iah-0.009
Token so
Token ID523
Feature activation-4.17e+00

Pos logit contributions
 coward+0.047
 loser+0.040
 hawk+0.040
Bad+0.032
 badly+0.032
Neg logit contributions
 Katie-0.072
 doorstep-0.068
 dining-0.068
 counters-0.068
 pavement-0.067
Token it
Token ID340
Feature activation+1.00e+00

Pos logit contributions
 Katie+0.312
iah+0.289
 pavement+0.287
 counters+0.283
 doorstep+0.282
Neg logit contributions
 coward-0.172
 loser-0.147
 hawk-0.143
ching-0.113
 hunter-0.112
Token wouldn
Token ID3636
Feature activation+5.69e+00

Pos logit contributions
 coward+0.050
 loser+0.045
 hawk+0.044
 badly+0.037
 hunters+0.036
Neg logit contributions
 Katie-0.065
iah-0.062
 counters-0.062
 doorstep-0.062
 pavement-0.061
Token't
Token ID470
Feature activation+5.81e+00

Pos logit contributions
 coward+0.273
 hawk+0.227
 loser+0.226
 hunters+0.202
 hunter+0.186
Neg logit contributions
 Katie-0.375
 doorstep-0.373
view-0.368
 pavement-0.365
iah-0.352
Token break
Token ID2270
Feature activation+3.82e+00

Pos logit contributions
 coward+0.306
 hawk+0.263
 loser+0.262
ls+0.234
 Sly+0.228
Neg logit contributions
 pavement-0.344
 Katie-0.338
 contain-0.336
 doorstep-0.336
view-0.336
Token into
Token ID656
Feature activation-6.11e-01

Pos logit contributions
 coward+0.181
 hawk+0.156
 loser+0.147
 Sly+0.125
 hunter+0.124
Neg logit contributions
 Katie-0.257
 pavement-0.248
view-0.242
 counters-0.239
 shades-0.234
Token parts
Token ID3354
Feature activation+1.32e+00

Pos logit contributions
 Katie+0.044
view+0.041
 counters+0.041
 dining+0.040
 pavement+0.040
Neg logit contributions
 coward-0.027
 hawk-0.023
 loser-0.022
 Sly-0.020
 hunter-0.019
Token like
Token ID588
Feature activation-1.85e+00

Pos logit contributions
 coward+0.052
 hawk+0.044
 loser+0.042
 Sly+0.035
 hunter+0.034
Neg logit contributions
 Katie-0.101
view-0.096
 dining-0.096
 counters-0.095
 pavement-0.094
Token his
Token ID465
Feature activation+9.44e-01

Pos logit contributions
 Katie+0.139
 pavement+0.131
 counters+0.130
view+0.130
 dining+0.130
Neg logit contributions
 coward-0.075
 loser-0.065
 hawk-0.060
 Sly-0.049
 hunter-0.048
Token.
Token ID13
Feature activation-2.81e-01

Pos logit contributions
 coward+0.050
 loser+0.043
 hawk+0.043
Bad+0.035
 hunter+0.035
Neg logit contributions
 Katie-0.070
 counters-0.069
 pavement-0.068
 doorstep-0.068
 dining-0.068
Token His
Token ID2399
Feature activation-7.44e-01

Pos logit contributions
 Katie+0.020
iah+0.020
 dining+0.020
view+0.020
 theatre+0.020
Neg logit contributions
 coward-0.011
 loser-0.009
 Tweet-0.009
 badly-0.009
 hawk-0.009
Token shoes
Token ID10012
Feature activation-1.89e+00

Pos logit contributions
 Katie+0.043
 doorstep+0.042
 pavement+0.041
 counters+0.041
 dining+0.041
Neg logit contributions
 coward-0.042
 loser-0.037
 hawk-0.036
 badly-0.033
 Sly-0.032
Token got
Token ID1392
Feature activation+1.37e+00

Pos logit contributions
 Katie+0.118
 dining+0.112
 counters+0.111
view+0.111
 doorstep+0.110
Neg logit contributions
 coward-0.100
 loser-0.089
 hawk-0.087
 badly-0.078
 hunter-0.074
Token wet
Token ID9583
Feature activation-3.45e-02

Pos logit contributions
 coward+0.085
 loser+0.080
 hawk+0.078
 villain+0.068
 hunter+0.068
Neg logit contributions
 counters-0.070
 Katie-0.068
 dining-0.064
 pavement-0.064
 anticipation-0.063
Token again
Token ID757
Feature activation+6.40e+00

Pos logit contributions
 Katie+0.002
 counters+0.002
 pavement+0.002
view+0.002
 dining+0.002
Neg logit contributions
 coward-0.002
 loser-0.001
 hawk-0.001
 villain-0.001
 hunter-0.001
Token,
Token ID11
Feature activation+2.72e-01

Pos logit contributions
 coward+0.233
 loser+0.198
 hawk+0.188
 badly+0.161
 hunter+0.144
Neg logit contributions
 Katie-0.505
 dining-0.488
 counters-0.484
view-0.482
 doorstep-0.481
Token but
Token ID475
Feature activation-1.90e+00

Pos logit contributions
 coward+0.014
 loser+0.013
 hawk+0.012
 badly+0.011
 villain+0.010
Neg logit contributions
 Katie-0.018
 counters-0.017
 dining-0.017
 pavement-0.017
view-0.017
Token he
Token ID339
Feature activation+3.00e+00

Pos logit contributions
 Katie+0.126
 pavement+0.114
iah+0.112
 dining+0.112
 doorstep+0.111
Neg logit contributions
 coward-0.096
 loser-0.089
 hawk-0.082
ching-0.077
ino-0.072
Token didn
Token ID1422
Feature activation+2.61e+00

Pos logit contributions
 coward+0.171
 loser+0.163
 hawk+0.162
 hunters+0.139
 villain+0.136
Neg logit contributions
 Katie-0.163
iah-0.158
 doorstep-0.155
view-0.150
 dining-0.150
Token't
Token ID470
Feature activation+5.85e+00

Pos logit contributions
 coward+0.119
 hawk+0.099
 loser+0.099
 hunter+0.085
 hunters+0.084
Neg logit contributions
 Katie-0.184
 doorstep-0.177
 pavement-0.174
view-0.173
 counters-0.172
Token felt
Token ID2936
Feature activation+6.22e-01

Pos logit contributions
 Katie+0.143
 dining+0.138
 counters+0.137
view+0.136
 doorstep+0.135
Neg logit contributions
 coward-0.110
 loser-0.097
 hawk-0.095
 badly-0.085
 hunter-0.080
Token different
Token ID1180
Feature activation-8.93e-01

Pos logit contributions
 coward+0.026
 loser+0.023
 hawk+0.022
 badly+0.019
 villain+0.018
Neg logit contributions
 Katie-0.045
 dining-0.043
 counters-0.043
view-0.043
iah-0.042
Token.
Token ID13
Feature activation-1.53e-01

Pos logit contributions
 Katie+0.068
 dining+0.066
 counters+0.065
iah+0.065
view+0.065
Neg logit contributions
 coward-0.034
 loser-0.030
 hawk-0.028
 badly-0.025
ching-0.022
Token He
Token ID679
Feature activation+1.34e+00

Pos logit contributions
view+0.011
iah+0.011
 dining+0.011
 doorstep+0.011
 counters+0.011
Neg logit contributions
 coward-0.006
 loser-0.005
 hawk-0.005
 Sly-0.005
 hero-0.005
Token felt
Token ID2936
Feature activation+6.28e-01

Pos logit contributions
 coward+0.068
 loser+0.059
 hawk+0.058
 badly+0.052
 hunter+0.050
Neg logit contributions
 Katie-0.087
 dining-0.084
 counters-0.083
view-0.082
iah-0.082
Token lighter
Token ID14871
Feature activation+5.67e+00

Pos logit contributions
 coward+0.027
 loser+0.024
 hawk+0.023
 badly+0.020
ching+0.019
Neg logit contributions
 Katie-0.045
 dining-0.043
 counters-0.043
view-0.043
 doorstep-0.043
Token,
Token ID11
Feature activation+2.64e-01

Pos logit contributions
 coward+0.227
 loser+0.196
 hawk+0.187
 badly+0.161
 hunter+0.148
Neg logit contributions
 Katie-0.426
 dining-0.409
 counters-0.407
view-0.405
 doorstep-0.404
Token happier
Token ID23030
Feature activation+3.91e+00

Pos logit contributions
 coward+0.011
 loser+0.009
 hawk+0.009
 badly+0.008
 weaker+0.007
Neg logit contributions
 Katie-0.019
 dining-0.019
 doorstep-0.019
iah-0.018
 counters-0.018
Token and
Token ID290
Feature activation-3.83e-01

Pos logit contributions
 coward+0.159
 loser+0.137
 hawk+0.132
 badly+0.115
 hunter+0.105
Neg logit contributions
 Katie-0.291
 dining-0.281
 counters-0.279
view-0.277
 doorstep-0.277
Token accepted
Token ID6292
Feature activation+4.18e+00

Pos logit contributions
 Katie+0.029
 dining+0.028
iah+0.028
 counters+0.028
view+0.028
Neg logit contributions
 coward-0.015
 loser-0.013
 hawk-0.012
 badly-0.011
 weaker-0.010
Token.
Token ID13
Feature activation-1.30e-01

Pos logit contributions
 coward+0.143
 loser+0.119
 hawk+0.114
 badly+0.098
 Sly+0.087
Neg logit contributions
 Katie-0.333
 dining-0.325
 counters-0.323
 doorstep-0.322
view-0.322
Token we
Token ID356
Feature activation-2.36e+00

Pos logit contributions
 coward+0.012
 loser+0.010
 hawk+0.010
 badly+0.009
ching+0.008
Neg logit contributions
 Katie-0.020
 dining-0.019
 counters-0.019
 doorstep-0.018
view-0.018
Token'll
Token ID1183
Feature activation+5.23e-01

Pos logit contributions
 Katie+0.160
 doorstep+0.153
iah+0.153
 dining+0.151
 theatre+0.146
Neg logit contributions
 coward-0.106
 hawk-0.094
 loser-0.093
 badly-0.081
Bad-0.080
Token always
Token ID1464
Feature activation+2.67e+00

Pos logit contributions
 coward+0.026
 loser+0.024
 hawk+0.023
 badly+0.021
 hurt+0.020
Neg logit contributions
 Katie-0.033
 counters-0.033
 dining-0.032
view-0.032
 pavement-0.031
Token keep
Token ID1394
Feature activation+9.59e-01

Pos logit contributions
 coward+0.134
 loser+0.121
 hawk+0.114
 badly+0.106
 hurt+0.101
Neg logit contributions
 Katie-0.172
 counters-0.166
 dining-0.166
view-0.163
iah-0.162
Token you
Token ID345
Feature activation-3.19e+00

Pos logit contributions
 coward+0.048
 loser+0.042
 hawk+0.041
 badly+0.037
 hurt+0.036
Neg logit contributions
 Katie-0.061
 dining-0.060
view-0.060
 counters-0.059
 doorstep-0.059
Token safe
Token ID3338
Feature activation+1.95e+00

Pos logit contributions
 Katie+0.209
 dining+0.198
view+0.197
 counters+0.197
iah+0.196
Neg logit contributions
 coward-0.156
 loser-0.140
 hawk-0.133
 badly-0.127
 hurt-0.116
Token."
Token ID526
Feature activation-4.62e-02

Pos logit contributions
 coward+0.084
 loser+0.072
 hawk+0.070
 badly+0.059
ching+0.057
Neg logit contributions
 Katie-0.141
 dining-0.136
 counters-0.135
view-0.134
 doorstep-0.133
Token Lily
Token ID20037
Feature activation+5.86e-01

Pos logit contributions
iah+0.003
 dining+0.003
 Katie+0.003
view+0.003
 doorstep+0.003
Neg logit contributions
 coward-0.002
 loser-0.001
 Tweet-0.001
 too-0.001
 badly-0.001
Token smiled
Token ID13541
Feature activation+4.24e-01

Pos logit contributions
 coward+0.028
 loser+0.025
 hawk+0.024
 badly+0.023
 hurt+0.022
Neg logit contributions
 Katie-0.037
 dining-0.037
 counters-0.036
 pavement-0.036
view-0.036
Token and
Token ID290
Feature activation-4.03e-01

Pos logit contributions
 coward+0.023
 loser+0.020
 hawk+0.020
Bad+0.017
 Sly+0.017
Neg logit contributions
 Katie-0.028
 dining-0.025
view-0.025
 doorstep-0.025
iah-0.025
Token felt
Token ID2936
Feature activation+5.99e-01

Pos logit contributions
 dining+0.026
 Katie+0.026
 counters+0.025
view+0.025
 doorstep+0.025
Neg logit contributions
 coward-0.020
 hawk-0.017
 loser-0.016
 badly-0.015
 hurt-0.015
Token sang
Token ID25889
Feature activation+2.13e-01

Pos logit contributions
 coward+0.163
 hawk+0.144
 loser+0.144
 badly+0.128
 Sly+0.122
Neg logit contributions
 Katie-0.196
view-0.186
 dining-0.185
 pavement-0.182
 counters-0.182
Token.
Token ID13
Feature activation-6.62e-02

Pos logit contributions
 coward+0.010
 hawk+0.008
 loser+0.008
 badly+0.007
 Sly+0.007
Neg logit contributions
 Katie-0.015
 dining-0.014
view-0.014
 counters-0.014
 doorstep-0.014
Token They
Token ID1119
Feature activation-7.95e-01

Pos logit contributions
 Katie+0.004
 doorstep+0.004
 dining+0.004
 counters+0.004
iah+0.004
Neg logit contributions
 coward-0.003
 loser-0.003
 hawk-0.003
 badly-0.002
 Sly-0.002
Token sang
Token ID25889
Feature activation+2.01e-01

Pos logit contributions
 Katie+0.043
view+0.040
 dining+0.039
iah+0.039
 theatre+0.039
Neg logit contributions
 coward-0.048
 loser-0.044
 hawk-0.043
 badly-0.040
 Sly-0.038
Token the
Token ID262
Feature activation+2.37e-01

Pos logit contributions
 coward+0.009
 loser+0.008
 hawk+0.008
 badly+0.007
 Sly+0.007
Neg logit contributions
 Katie-0.014
 dining-0.013
 counters-0.013
view-0.013
 doorstep-0.013
Token song
Token ID3496
Feature activation+3.59e-01

Pos logit contributions
 coward+0.014
 loser+0.013
 hawk+0.012
 villain+0.011
 hero+0.011
Neg logit contributions
 Katie-0.013
iah-0.013
 dining-0.013
 counters-0.012
view-0.012
Token that
Token ID326
Feature activation-5.13e-01

Pos logit contributions
 coward+0.014
 loser+0.012
 hawk+0.011
 badly+0.010
 hunter+0.009
Neg logit contributions
 Katie-0.027
 dining-0.026
 counters-0.026
view-0.026
 doorstep-0.026
Token made
Token ID925
Feature activation+8.54e-01

Pos logit contributions
 Katie+0.038
 dining+0.036
 counters+0.036
view+0.036
iah+0.036
Neg logit contributions
 coward-0.021
 loser-0.019
 hawk-0.018
 badly-0.016
 hunter-0.014
Token them
Token ID606
Feature activation-2.00e+00

Pos logit contributions
 coward+0.034
 loser+0.029
 hawk+0.027
 badly+0.024
 villain+0.022
Neg logit contributions
 Katie-0.064
 dining-0.063
 doorstep-0.062
 counters-0.062
view-0.062
Token famous
Token ID5863
Feature activation-4.82e-02

Pos logit contributions
 Katie+0.121
 dining+0.115
 counters+0.114
iah+0.114
 doorstep+0.113
Neg logit contributions
 coward-0.108
 loser-0.098
 hawk-0.094
 badly-0.086
ching-0.081
Token.
Token ID13
Feature activation-8.97e-02

Pos logit contributions
 Katie+0.004
 dining+0.003
 counters+0.003
view+0.003
 doorstep+0.003
Neg logit contributions
 coward-0.002
 loser-0.002
 hawk-0.002
 badly-0.001
 villain-0.001
Token down
Token ID866
Feature activation+4.11e-01

Pos logit contributions
 coward+0.047
 loser+0.042
 hawk+0.042
ching+0.036
 villain+0.035
Neg logit contributions
 Katie-0.062
 dining-0.062
iah-0.062
 counters-0.061
view-0.060
Token on
Token ID319
Feature activation+2.63e+00

Pos logit contributions
 coward+0.017
 hawk+0.014
 loser+0.014
 badly+0.011
 villain+0.011
Neg logit contributions
 Katie-0.031
 dining-0.030
iah-0.029
 theatre-0.029
view-0.029
Token the
Token ID262
Feature activation+1.76e-01

Pos logit contributions
 coward+0.101
 loser+0.085
 hawk+0.080
 badly+0.068
ching+0.063
Neg logit contributions
 Katie-0.207
 dining-0.198
 counters-0.196
 doorstep-0.195
iah-0.194
Token floor
Token ID4314
Feature activation-4.07e+00

Pos logit contributions
 coward+0.013
 loser+0.012
 hawk+0.012
 badly+0.011
 hurt+0.011
Neg logit contributions
 Katie-0.007
iah-0.007
view-0.007
 dining-0.007
 counters-0.006
Token and
Token ID290
Feature activation-4.63e-01

Pos logit contributions
 Katie+0.334
iah+0.323
 counters+0.321
view+0.320
 dining+0.319
Neg logit contributions
 coward-0.126
 loser-0.108
 hawk-0.091
 badly-0.088
 on-0.076
Token started
Token ID2067
Feature activation+1.50e+00

Pos logit contributions
 Katie+0.027
 dining+0.026
view+0.026
 counters+0.025
iah+0.025
Neg logit contributions
 coward-0.026
 loser-0.023
 hawk-0.023
 badly-0.021
 hunter-0.020
Token to
Token ID284
Feature activation-2.86e-01

Pos logit contributions
 coward+0.066
 loser+0.058
 hawk+0.056
 badly+0.048
 hunter+0.045
Neg logit contributions
 Katie-0.107
 dining-0.103
 counters-0.102
view-0.101
 doorstep-0.101
Token put
Token ID1234
Feature activation-4.96e-01

Pos logit contributions
 Katie+0.014
iah+0.014
view+0.014
 counters+0.013
 dining+0.013
Neg logit contributions
 coward-0.018
 loser-0.017
 hawk-0.016
 hurt-0.015
 badly-0.015
Token the
Token ID262
Feature activation+2.38e-01

Pos logit contributions
 Katie+0.035
 counters+0.034
iah+0.033
 dining+0.033
 pavement+0.033
Neg logit contributions
 coward-0.023
 loser-0.020
 hawk-0.019
 badly-0.017
 villain-0.016
Token puzzle
Token ID15027
Feature activation+5.34e+00

Pos logit contributions
 coward+0.014
 hawk+0.012
 loser+0.011
 badly+0.011
 Sly+0.011
Neg logit contributions
 dining-0.013
 counters-0.013
 mant-0.012
 doorstep-0.012
 patio-0.012
Token pieces
Token ID5207
Feature activation+3.69e+00

Pos logit contributions
 coward+0.271
 loser+0.234
 hawk+0.234
Bad+0.203
 Sly+0.192
Neg logit contributions
 Katie-0.348
 counters-0.341
 dining-0.341
 doorstep-0.336
 pavement-0.331
Token sat
Token ID3332
Feature activation+1.12e+00

Pos logit contributions
 coward+0.184
 loser+0.168
 hawk+0.160
 badly+0.146
 villain+0.137
Neg logit contributions
 Katie-0.196
 dining-0.187
 counters-0.186
 doorstep-0.185
iah-0.184
Token down
Token ID866
Feature activation+5.86e-01

Pos logit contributions
 coward+0.053
 loser+0.047
 hawk+0.044
 villain+0.039
ching+0.038
Neg logit contributions
 Katie-0.077
 dining-0.075
 counters-0.075
iah-0.074
view-0.073
Token in
Token ID287
Feature activation-2.82e-01

Pos logit contributions
 coward+0.023
 loser+0.020
 hawk+0.019
 badly+0.016
 villain+0.015
Neg logit contributions
 Katie-0.045
 dining-0.043
view-0.042
 counters-0.042
 doorstep-0.042
Token the
Token ID262
Feature activation+5.03e-01

Pos logit contributions
 Katie+0.023
 counters+0.022
 dining+0.022
iah+0.022
 doorstep+0.022
Neg logit contributions
 coward-0.009
 loser-0.008
 hawk-0.007
 badly-0.006
 hunter-0.005
Token middle
Token ID3504
Feature activation-3.67e+00

Pos logit contributions
 coward+0.029
 hawk+0.028
 loser+0.027
 hunter+0.025
 hurt+0.024
Neg logit contributions
 Katie-0.028
iah-0.026
 counters-0.025
view-0.025
 dining-0.024
Token of
Token ID286
Feature activation+2.93e-01

Pos logit contributions
iah+0.318
 Katie+0.317
 dining+0.312
 counters+0.310
 doorstep+0.307
Neg logit contributions
 coward-0.073
 loser-0.065
 too-0.060
 on-0.057
 badly-0.054
Token the
Token ID262
Feature activation+4.60e-01

Pos logit contributions
 coward+0.010
 loser+0.009
 hawk+0.008
 badly+0.007
ching+0.006
Neg logit contributions
 Katie-0.024
 dining-0.023
 counters-0.023
view-0.022
 doorstep-0.022
Token pitch
Token ID7078
Feature activation-1.24e+00

Pos logit contributions
 coward+0.027
 hawk+0.026
 loser+0.025
 hunter+0.023
 hero+0.022
Neg logit contributions
 Katie-0.025
iah-0.023
 counters-0.022
view-0.022
aney-0.022
Token.
Token ID13
Feature activation+1.23e-01

Pos logit contributions
 Katie+0.097
 counters+0.093
iah+0.093
 dining+0.093
view+0.092
Neg logit contributions
 coward-0.043
 loser-0.037
 hawk-0.035
 badly-0.031
 hunter-0.028
Token No
Token ID1400
Feature activation+3.55e+00

Pos logit contributions
 coward+0.005
 loser+0.005
 hawk+0.004
 Sly+0.004
 Tweet+0.004
Neg logit contributions
view-0.008
iah-0.008
 dining-0.008
 doorstep-0.008
 counters-0.008
Token one
Token ID530
Feature activation+2.12e+00

Pos logit contributions
 coward+0.183
 hawk+0.157
 loser+0.157
 badly+0.136
 Sly+0.133
Neg logit contributions
 Katie-0.229
 occasion-0.213
 theatre-0.211
 pavement-0.207
 doorstep-0.206
Token onto
Token ID4291
Feature activation-1.51e-01

Pos logit contributions
 Katie+0.013
 counters+0.012
 doorstep+0.012
 dining+0.012
view+0.012
Neg logit contributions
 coward-0.009
 loser-0.008
 hawk-0.007
 villain-0.006
Bad-0.006
Token the
Token ID262
Feature activation+1.49e-01

Pos logit contributions
 Katie+0.011
 pavement+0.010
 counters+0.010
 dining+0.010
 doorstep+0.010
Neg logit contributions
 coward-0.007
 loser-0.006
 hawk-0.005
 villain-0.005
 badly-0.005
Token chim
Token ID18205
Feature activation-1.69e+00

Pos logit contributions
 coward+0.010
 loser+0.009
 hawk+0.009
 badly+0.008
 hurt+0.008
Neg logit contributions
 Katie-0.008
view-0.007
 dining-0.007
iah-0.007
 counters-0.007
Tokenney
Token ID1681
Feature activation-1.96e+00

Pos logit contributions
 Katie+0.076
 dining+0.068
 counters+0.067
iah+0.067
view+0.066
Neg logit contributions
 coward-0.117
 loser-0.110
 hawk-0.105
 badly-0.102
 weaker-0.096
Token.
Token ID13
Feature activation-1.56e-01

Pos logit contributions
 Katie+0.154
 dining+0.149
iah+0.148
view+0.147
 counters+0.147
Neg logit contributions
 coward-0.068
 loser-0.059
 hawk-0.055
 badly-0.051
ching-0.044
Token He
Token ID679
Feature activation+1.44e+00

Pos logit contributions
view+0.010
iah+0.010
 Katie+0.010
 dining+0.009
 doorstep+0.009
Neg logit contributions
 coward-0.008
 loser-0.007
 hawk-0.007
 Sly-0.006
 Tweet-0.006
Token made
Token ID925
Feature activation+1.02e+00

Pos logit contributions
 coward+0.082
 loser+0.077
 hawk+0.072
 badly+0.064
 villain+0.063
Neg logit contributions
 Katie-0.082
 counters-0.078
 doorstep-0.078
iah-0.078
 dining-0.076
Token it
Token ID340
Feature activation+1.00e+00

Pos logit contributions
 coward+0.044
 loser+0.038
 hawk+0.037
 badly+0.032
 hunter+0.030
Neg logit contributions
 Katie-0.074
 dining-0.071
 counters-0.070
view-0.070
 doorstep-0.070
Token to
Token ID284
Feature activation-2.54e-01

Pos logit contributions
 coward+0.048
 loser+0.042
 hawk+0.041
Bad+0.034
 villain+0.034
Neg logit contributions
 Katie-0.067
 dining-0.064
 counters-0.064
 doorstep-0.064
view-0.064
Token the
Token ID262
Feature activation+1.56e-01

Pos logit contributions
 Katie+0.019
 pavement+0.018
 dining+0.018
view+0.017
 doorstep+0.017
Neg logit contributions
 coward-0.011
 loser-0.009
 hawk-0.009
 badly-0.008
ching-0.007
Token top
Token ID1353
Feature activation+2.03e+00

Pos logit contributions
 coward+0.008
 loser+0.007
 hawk+0.007
 badly+0.006
 hunter+0.006
Neg logit contributions
 Katie-0.010
 dining-0.009
view-0.009
 counters-0.009
iah-0.009
Token they
Token ID484
Feature activation+1.21e+00

Pos logit contributions
 coward+0.135
 loser+0.118
 hawk+0.111
ching+0.095
 badly+0.092
Neg logit contributions
 Katie-0.272
 dining-0.251
iah-0.250
 counters-0.246
 doorstep-0.244
Token could
Token ID714
Feature activation+1.40e+00

Pos logit contributions
 coward+0.059
 loser+0.052
 hawk+0.050
 badly+0.044
 villain+0.042
Neg logit contributions
 Katie-0.081
 dining-0.078
 counters-0.077
view-0.077
 doorstep-0.077
Token be
Token ID307
Feature activation-6.06e-01

Pos logit contributions
 coward+0.076
 loser+0.068
 hawk+0.067
 badly+0.059
 villain+0.057
Neg logit contributions
 Katie-0.085
 dining-0.081
 doorstep-0.081
view-0.080
 counters-0.080
Token friends
Token ID2460
Feature activation+1.17e+00

Pos logit contributions
 Katie+0.041
 dining+0.039
 counters+0.039
view+0.038
iah+0.037
Neg logit contributions
 coward-0.029
 hawk-0.026
 loser-0.026
 villain-0.021
 Sly-0.021
Token.
Token ID13
Feature activation+8.57e-02

Pos logit contributions
 coward+0.055
 loser+0.048
 hawk+0.048
 Sly+0.039
 villain+0.039
Neg logit contributions
 Katie-0.080
 dining-0.076
view-0.074
 doorstep-0.074
 counters-0.073
Token<|endoftext|>
Token ID50256
Feature activation-1.56e-01

Pos logit contributions
 coward+0.004
 loser+0.003
 hawk+0.003
 badly+0.003
 Sly+0.003
Neg logit contributions
 Katie-0.006
 dining-0.006
view-0.006
 counters-0.006
 doorstep-0.006
TokenTom
Token ID13787
Feature activation-5.10e-01

Pos logit contributions
 Katie+0.010
 dining+0.009
view+0.009
 counters+0.009
iah+0.009
Neg logit contributions
 coward-0.008
 loser-0.007
 hawk-0.007
 badly-0.007
 villain-0.006
Token and
Token ID290
Feature activation-2.05e-01

Pos logit contributions
 Katie+0.032
 counters+0.032
 dining+0.032
 doorstep+0.032
 pavement+0.031
Neg logit contributions
 coward-0.025
 loser-0.022
 hawk-0.021
 badly-0.019
 hunter-0.018
Token Anna
Token ID11735
Feature activation+8.92e-01

Pos logit contributions
 Katie+0.015
 dining+0.015
 counters+0.015
 doorstep+0.015
view+0.015
Neg logit contributions
 coward-0.008
 loser-0.007
 hawk-0.007
 badly-0.006
 hunter-0.005
Token liked
Token ID8288
Feature activation-1.96e-01

Pos logit contributions
 coward+0.037
 loser+0.032
 hawk+0.030
 hurt+0.029
 badly+0.028
Neg logit contributions
 counters-0.062
 dining-0.062
 Katie-0.061
umm-0.060
 pavement-0.060
Token to
Token ID284
Feature activation-3.09e-02

Pos logit contributions
 Katie+0.016
iah+0.016
 doorstep+0.016
 counters+0.016
 dining+0.016
Neg logit contributions
 coward-0.006
 loser-0.005
 hawk-0.005
 badly-0.004
 hunters-0.004
Token had
Token ID550
Feature activation+6.45e-01

Pos logit contributions
 coward+0.031
 loser+0.028
 hawk+0.027
 badly+0.025
Bad+0.023
Neg logit contributions
 Katie-0.039
iah-0.037
 dining-0.037
view-0.037
 doorstep-0.037
Token new
Token ID649
Feature activation+5.42e-02

Pos logit contributions
 coward+0.027
 loser+0.023
 hawk+0.021
 badly+0.019
ching+0.018
Neg logit contributions
 doorstep-0.046
 counters-0.046
view-0.046
 Katie-0.046
 dining-0.046
Token friends
Token ID2460
Feature activation+1.21e+00

Pos logit contributions
 coward+0.002
 loser+0.002
 hawk+0.002
 badly+0.001
ching+0.001
Neg logit contributions
 Katie-0.004
 dining-0.004
 counters-0.004
view-0.004
 doorstep-0.004
Token to
Token ID284
Feature activation-3.63e-01

Pos logit contributions
 coward+0.049
 loser+0.043
 hawk+0.041
 badly+0.035
 villain+0.032
Neg logit contributions
 Katie-0.090
 dining-0.086
view-0.085
 counters-0.085
 doorstep-0.085
Token play
Token ID711
Feature activation-2.26e+00

Pos logit contributions
 counters+0.021
 Katie+0.021
 dining+0.020
iah+0.020
view+0.020
Neg logit contributions
 coward-0.020
 loser-0.018
 hawk-0.018
 badly-0.016
 hurt-0.016
Token with
Token ID351
Feature activation-1.25e+00

Pos logit contributions
 Katie+0.164
 dining+0.158
 counters+0.157
view+0.156
 doorstep+0.155
Neg logit contributions
 coward-0.097
 loser-0.084
 hawk-0.081
 badly-0.070
 hunter-0.065
Token and
Token ID290
Feature activation-4.46e-01

Pos logit contributions
 Katie+0.087
 dining+0.084
 counters+0.083
view+0.082
 doorstep+0.082
Neg logit contributions
 coward-0.057
 loser-0.050
 hawk-0.048
 badly-0.042
 hunter-0.039
Token she
Token ID673
Feature activation+1.23e+00

Pos logit contributions
 Katie+0.030
 dining+0.029
 counters+0.029
view+0.029
 doorstep+0.028
Neg logit contributions
 coward-0.021
 loser-0.019
 hawk-0.018
 badly-0.016
 hunter-0.015
Token always
Token ID1464
Feature activation+2.67e+00

Pos logit contributions
 coward+0.069
 loser+0.066
 hawk+0.065
 badly+0.057
Bad+0.054
Neg logit contributions
 Katie-0.069
iah-0.065
 theatre-0.063
view-0.063
 dining-0.062
Token kept
Token ID4030
Feature activation+1.61e+00

Pos logit contributions
 coward+0.124
 loser+0.108
 hawk+0.102
 badly+0.091
 hurt+0.086
Neg logit contributions
 Katie-0.184
 counters-0.179
 dining-0.178
 doorstep-0.177
view-0.176
Token the
Token ID262
Feature activation+1.77e-01

Pos logit contributions
 coward+0.070
 loser+0.060
 hawk+0.059
 badly+0.054
 hurt+0.051
Neg logit contributions
 Katie-0.111
 dining-0.111
 doorstep-0.111
 counters-0.110
view-0.109
Token home
Token ID1363
Feature activation-5.20e-01

Pos logit contributions
 Katie+0.006
 dining+0.005
 counters+0.005
 doorstep+0.005
view+0.005
Neg logit contributions
 coward-0.004
 loser-0.003
 hawk-0.003
 badly-0.003
 villain-0.003
Token.
Token ID13
Feature activation-2.43e-02

Pos logit contributions
 Katie+0.040
 dining+0.038
 counters+0.038
view+0.038
iah+0.038
Neg logit contributions
 coward-0.020
 loser-0.017
 hawk-0.016
 badly-0.014
 hunter-0.013
Token He
Token ID679
Feature activation+1.37e+00

Pos logit contributions
view+0.002
 occasion+0.002
 doorstep+0.002
 anniversary+0.002
iah+0.002
Neg logit contributions
 coward-0.001
 loser-0.001
 too-0.001
 badly-0.001
 Spot-0.001
Token saw
Token ID2497
Feature activation-1.92e+00

Pos logit contributions
 coward+0.063
 hawk+0.053
 loser+0.052
 badly+0.048
 hurt+0.046
Neg logit contributions
 Katie-0.092
view-0.091
 dining-0.090
 pavement-0.090
 counters-0.090
Token the
Token ID262
Feature activation+2.02e-01

Pos logit contributions
 Katie+0.145
 dining+0.141
 counters+0.140
 doorstep+0.140
view+0.140
Neg logit contributions
 coward-0.074
 loser-0.062
 hawk-0.060
 badly-0.053
 Sly-0.048
Token open
Token ID1280
Feature activation-2.13e+00

Pos logit contributions
 coward+0.011
 loser+0.010
 hawk+0.010
 badly+0.009
 hunter+0.009
Neg logit contributions
 Katie-0.012
view-0.011
iah-0.011
 dining-0.011
 counters-0.011
Token freezer
Token ID30967
Feature activation-2.93e+00

Pos logit contributions
 Katie+0.106
view+0.100
 dining+0.099
iah+0.099
 counters+0.098
Neg logit contributions
 coward-0.134
 loser-0.123
 hawk-0.122
 badly-0.114
 hunter-0.109
Token and
Token ID290
Feature activation-3.06e-01

Pos logit contributions
 Katie+0.229
view+0.220
 dining+0.220
 counters+0.218
iah+0.217
Neg logit contributions
 coward-0.101
 loser-0.085
 hawk-0.082
 badly-0.076
 on-0.066
Token said
Token ID531
Feature activation-1.09e+00

Pos logit contributions
 Katie+0.021
view+0.021
iah+0.020
 dining+0.020
 doorstep+0.020
Neg logit contributions
 coward-0.014
 loser-0.011
 hawk-0.011
 badly-0.011
 hurt-0.010
Token "
Token ID366
Feature activation+2.79e-01

Pos logit contributions
 Katie+0.081
 dining+0.076
 counters+0.075
iah+0.073
 doorstep+0.072
Neg logit contributions
 coward-0.046
 loser-0.041
 hawk-0.040
ching-0.032
 villain-0.032
TokenUh
Token ID34653
Feature activation+1.16e+00

Pos logit contributions
 coward+0.013
 loser+0.013
Bad+0.011
 hawk+0.011
ching+0.010
Neg logit contributions
 counters-0.018
 balcony-0.018
 dining-0.018
 doorstep-0.018
 patio-0.017
Token drum
Token ID13026
Feature activation-5.33e-01

Pos logit contributions
 coward+0.057
 hawk+0.052
 loser+0.051
 Sly+0.046
 badly+0.045
Neg logit contributions
 Katie-0.050
 doorstep-0.049
 counters-0.049
 pavement-0.049
view-0.049
Token with
Token ID351
Feature activation-1.16e+00

Pos logit contributions
view+0.038
 pavement+0.038
 doorstep+0.038
 Katie+0.038
 counters+0.038
Neg logit contributions
 coward-0.022
 hawk-0.019
 loser-0.018
Bad-0.015
 Sly-0.014
Token his
Token ID465
Feature activation+9.58e-01

Pos logit contributions
 Katie+0.077
 pavement+0.073
 counters+0.072
 desks+0.071
 clothing+0.071
Neg logit contributions
 coward-0.054
 hawk-0.052
 loser-0.048
 hunter-0.041
ls-0.038
Token mom
Token ID1995
Feature activation-1.87e+00

Pos logit contributions
 coward+0.050
 hawk+0.047
 loser+0.042
 Sly+0.040
 hunter+0.037
Neg logit contributions
 clothing-0.053
 doorstep-0.053
 bedroom-0.052
 counters-0.052
 desks-0.051
Token.
Token ID13
Feature activation-2.69e-01

Pos logit contributions
 Katie+0.121
 dining+0.116
 doorstep+0.115
 counters+0.115
view+0.115
Neg logit contributions
 coward-0.096
 hawk-0.085
 loser-0.083
 villain-0.069
 badly-0.069
Token They
Token ID1119
Feature activation-8.08e-01

Pos logit contributions
 theatre+0.019
 Katie+0.019
 counters+0.019
 occasion+0.019
 dining+0.019
Neg logit contributions
 coward-0.010
 Tweet-0.009
 Wh-0.008
 Spot-0.008
 badly-0.008
Token played
Token ID2826
Feature activation-4.43e-01

Pos logit contributions
 Katie+0.041
iah+0.038
 dining+0.038
view+0.038
 doorstep+0.038
Neg logit contributions
 coward-0.052
 loser-0.049
 hawk-0.047
 badly-0.043
 Sly-0.041
Token together
Token ID1978
Feature activation+1.67e+00

Pos logit contributions
 Katie+0.028
 counters+0.028
 dining+0.028
 theatre+0.027
 pavement+0.027
Neg logit contributions
 coward-0.022
 hawk-0.020
 loser-0.020
 badly-0.016
 Sly-0.016
Token for
Token ID329
Feature activation+1.92e+00

Pos logit contributions
 coward+0.095
 hawk+0.089
 loser+0.083
 infected+0.078
Bad+0.077
Neg logit contributions
 Katie-0.093
 theatre-0.092
 dining-0.090
 doorstep-0.089
 pavement-0.087
Token a
Token ID257
Feature activation+7.05e-01

Pos logit contributions
 coward+0.106
 loser+0.095
 hawk+0.092
 badly+0.081
 weaker+0.078
Neg logit contributions
 Katie-0.116
 dining-0.109
 pavement-0.109
view-0.109
 counters-0.109
Token long
Token ID890
Feature activation+6.78e-01

Pos logit contributions
 coward+0.033
 loser+0.029
 hawk+0.029
 hunters+0.024
 Sly+0.024
Neg logit contributions
 Katie-0.047
 doorstep-0.045
 dining-0.045
 occasion-0.045
 pavement-0.044
Token shining
Token ID22751
Feature activation-3.95e+00

Pos logit contributions
 Katie+0.253
 dining+0.244
 counters+0.241
 doorstep+0.240
view+0.240
Neg logit contributions
 coward-0.128
 loser-0.113
 hawk-0.113
 badly-0.097
 on-0.094
Token and
Token ID290
Feature activation-3.99e-01

Pos logit contributions
 Katie+0.305
 dining+0.291
 counters+0.291
iah+0.289
view+0.288
Neg logit contributions
 coward-0.154
 loser-0.130
 hawk-0.121
 badly-0.101
 villain-0.097
Token the
Token ID262
Feature activation+1.31e-01

Pos logit contributions
 doorstep+0.035
view+0.034
umm+0.033
 dining+0.033
 counters+0.033
Neg logit contributions
 coward-0.010
 hawk-0.008
 hurt-0.007
 on-0.007
 too-0.007
Token flowers
Token ID12734
Feature activation-2.29e+00

Pos logit contributions
 coward+0.007
 hawk+0.006
 loser+0.006
 hunter+0.005
 hurt+0.005
Neg logit contributions
 Katie-0.008
 doorstep-0.008
iah-0.007
 dining-0.007
view-0.007
Token blo
Token ID24924
Feature activation-5.68e+00

Pos logit contributions
 Katie+0.184
 dining+0.177
 doorstep+0.175
 counters+0.175
view+0.174
Neg logit contributions
 coward-0.076
 hawk-0.063
 loser-0.062
 badly-0.058
 on-0.057
Tokenoming
Token ID3383
Feature activation-2.09e+00

Pos logit contributions
 Katie+0.342
 doorstep+0.329
view+0.320
 dining+0.320
 pavement+0.318
Neg logit contributions
 coward-0.290
 loser-0.265
ls-0.262
 hawk-0.260
 on-0.251
Token.
Token ID13
Feature activation-1.27e-01

Pos logit contributions
 Katie+0.151
 counters+0.145
 dining+0.144
view+0.144
 pavement+0.142
Neg logit contributions
 coward-0.092
 loser-0.079
 hawk-0.074
 badly-0.063
 villain-0.062
Token She
Token ID1375
Feature activation-9.63e-01

Pos logit contributions
view+0.009
 theatre+0.008
 dining+0.008
iah+0.008
 doorstep+0.008
Neg logit contributions
 coward-0.005
 loser-0.005
 Tweet-0.005
 too-0.004
 Spot-0.004
Token put
Token ID1234
Feature activation-4.81e-01

Pos logit contributions
 Katie+0.061
 dining+0.060
view+0.059
 counters+0.059
 doorstep+0.058
Neg logit contributions
 coward-0.049
 loser-0.042
 hawk-0.042
 badly-0.039
 hurt-0.037
Token on
Token ID319
Feature activation+2.62e+00

Pos logit contributions
 Katie+0.042
 dining+0.040
 counters+0.040
iah+0.040
view+0.040
Neg logit contributions
 coward-0.014
 loser-0.011
 hawk-0.010
 badly-0.008
 villain-0.007
Token her
Token ID607
Feature activation-9.32e-01

Pos logit contributions
 coward+0.087
 hawk+0.079
 on+0.079
 badly+0.077
 hurt+0.075
Neg logit contributions
 doorstep-0.188
umm-0.188
 Katie-0.188
view-0.187
 dining-0.186
Token.
Token ID13
Feature activation+5.54e-02

Pos logit contributions
 coward+0.101
 loser+0.088
 hawk+0.087
 badly+0.072
 Sly+0.071
Neg logit contributions
 Katie-0.162
 counters-0.154
 dining-0.154
view-0.153
 pavement-0.150
Token "
Token ID366
Feature activation+5.69e-01

Pos logit contributions
 coward+0.003
 loser+0.003
 hawk+0.003
 badly+0.002
 Sly+0.002
Neg logit contributions
 doorstep-0.003
iah-0.003
view-0.003
 counters-0.003
 dining-0.003
TokenThe
Token ID464
Feature activation-3.14e-01

Pos logit contributions
 coward+0.030
 loser+0.027
 hawk+0.026
 badly+0.024
 hunter+0.023
Neg logit contributions
 Katie-0.035
 dining-0.033
view-0.033
 counters-0.033
iah-0.033
Token bird
Token ID6512
Feature activation+4.16e+00

Pos logit contributions
 Katie+0.020
 dining+0.020
 counters+0.019
 doorstep+0.019
 pavement+0.019
Neg logit contributions
 coward-0.015
 loser-0.013
 hawk-0.013
 badly-0.012
 Sly-0.011
Token hurt
Token ID5938
Feature activation+5.63e+00

Pos logit contributions
 coward+0.202
 loser+0.192
 hawk+0.174
 Sly+0.160
 badly+0.153
Neg logit contributions
 Katie-0.251
view-0.245
 counters-0.243
 theatre-0.242
 dining-0.241
Token me
Token ID502
Feature activation-5.12e+00

Pos logit contributions
 coward+0.298
 loser+0.264
 hawk+0.256
 badly+0.227
 hunter+0.220
Neg logit contributions
 Katie-0.356
 dining-0.334
 counters-0.332
view-0.330
iah-0.329
Token!"
Token ID2474
Feature activation-6.33e-01

Pos logit contributions
 Katie+0.343
view+0.332
 counters+0.332
 doorstep+0.329
 dining+0.328
Neg logit contributions
 coward-0.239
 hawk-0.204
 loser-0.204
 villain-0.173
 Sly-0.170
Token\n
Token ID198
Feature activation+1.43e+00

Pos logit contributions
 doorstep+0.043
 dining+0.043
view+0.043
 Katie+0.043
iah+0.043
Neg logit contributions
 coward-0.027
 loser-0.023
 hawk-0.023
 badly-0.021
 hurt-0.019
Token\n
Token ID198
Feature activation+1.42e+00

Pos logit contributions
 coward+0.065
 loser+0.057
 hawk+0.055
 badly+0.049
ching+0.046
Neg logit contributions
 Katie-0.099
 dining-0.095
 counters-0.094
 doorstep-0.094
iah-0.094
TokenBen
Token ID11696
Feature activation-3.72e-01

Pos logit contributions
 coward+0.061
 loser+0.054
 hawk+0.051
 badly+0.046
Bad+0.043
Neg logit contributions
 Katie-0.102
 doorstep-0.098
 dining-0.098
iah-0.097
 counters-0.097
Token runs
Token ID4539
Feature activation+1.88e+00

Pos logit contributions
 Katie+0.022
 counters+0.021
 dining+0.021
 doorstep+0.021
view+0.021
Neg logit contributions
 coward-0.020
 loser-0.019
 hawk-0.018
 badly-0.017
 villain-0.015
Token hopped
Token ID45230
Feature activation+1.23e+00

Pos logit contributions
view+0.033
 Katie+0.031
 doorstep+0.031
umm+0.031
 occasion+0.031
Neg logit contributions
 coward-0.023
 hawk-0.020
 loser-0.019
 on-0.019
 hurt-0.018
Token into
Token ID656
Feature activation-6.00e-01

Pos logit contributions
 coward+0.057
 loser+0.050
 hawk+0.049
 badly+0.042
 villain+0.040
Neg logit contributions
 Katie-0.086
 dining-0.082
 counters-0.081
view-0.080
 doorstep-0.080
Token the
Token ID262
Feature activation+8.24e-02

Pos logit contributions
view+0.051
 Katie+0.050
 doorstep+0.050
 dining+0.050
iah+0.050
Neg logit contributions
 coward-0.016
 hawk-0.012
 loser-0.012
 badly-0.010
 hurt-0.009
Token car
Token ID1097
Feature activation+2.07e+00

Pos logit contributions
 coward+0.004
 hawk+0.004
 loser+0.004
 hunter+0.004
 badly+0.004
Neg logit contributions
 Katie-0.005
iah-0.005
view-0.005
 doorstep-0.004
 dining-0.004
Token.
Token ID13
Feature activation-2.00e-01

Pos logit contributions
 coward+0.040
 loser+0.037
 badly+0.036
 too+0.034
 on+0.034
Neg logit contributions
 Katie-0.182
iah-0.174
 dining-0.172
 counters-0.171
view-0.168
Token Daddy
Token ID26926
Feature activation-6.10e+00

Pos logit contributions
view+0.015
 anniversary+0.015
 dining+0.014
 occasion+0.014
iah+0.014
Neg logit contributions
 Tweet-0.007
 Spot-0.007
 Wh-0.006
 on-0.006
 too-0.006
Token drove
Token ID10357
Feature activation-6.75e-01

Pos logit contributions
 Katie+0.401
 dining+0.394
view+0.393
 counters+0.386
 pavement+0.386
Neg logit contributions
 coward-0.285
 hawk-0.246
 loser-0.243
 badly-0.219
ching-0.215
Token her
Token ID607
Feature activation-1.03e+00

Pos logit contributions
 dining+0.052
 Katie+0.051
view+0.051
 doorstep+0.051
 counters+0.051
Neg logit contributions
 coward-0.023
 loser-0.019
 hawk-0.018
 badly-0.017
 on-0.016
Token to
Token ID284
Feature activation-3.49e-01

Pos logit contributions
 Katie+0.079
 dining+0.075
view+0.075
 counters+0.075
iah+0.075
Neg logit contributions
 coward-0.038
 loser-0.033
 hawk-0.032
 badly-0.028
 hunter-0.026
Token the
Token ID262
Feature activation+7.95e-02

Pos logit contributions
 counters+0.031
view+0.030
iah+0.030
umm+0.030
 doorstep+0.030
Neg logit contributions
 coward-0.008
 loser-0.006
 hawk-0.006
 hurt-0.004
 hunter-0.004
Token big
Token ID1263
Feature activation-1.43e+00

Pos logit contributions
 coward+0.005
 loser+0.004
 hawk+0.004
 badly+0.004
 hunter+0.004
Neg logit contributions
 Katie-0.004
iah-0.004
view-0.004
 counters-0.004
 dining-0.004
Token Let
Token ID3914
Feature activation-1.95e+00

Pos logit contributions
view+0.004
 dining+0.004
 Katie+0.004
 doorstep+0.004
iah+0.004
Neg logit contributions
 coward-0.003
 loser-0.003
 hawk-0.003
 villain-0.002
Bad-0.002
Token's
Token ID338
Feature activation-2.04e+00

Pos logit contributions
 Katie+0.107
 dining+0.100
 counters+0.099
 doorstep+0.099
iah+0.099
Neg logit contributions
 coward-0.120
 loser-0.109
 hawk-0.106
 badly-0.097
 hunter-0.092
Token go
Token ID467
Feature activation+2.91e-01

Pos logit contributions
 Katie+0.125
 dining+0.122
 counters+0.121
view+0.120
iah+0.118
Neg logit contributions
 coward-0.105
 loser-0.096
 hawk-0.092
 badly-0.086
 hurt-0.085
Token in
Token ID287
Feature activation-5.31e-01

Pos logit contributions
 coward+0.013
 loser+0.011
 hawk+0.011
 badly+0.010
 hurt+0.009
Neg logit contributions
 Katie-0.020
 dining-0.020
view-0.020
 counters-0.019
iah-0.019
Token and
Token ID290
Feature activation-1.76e-01

Pos logit contributions
 Katie+0.039
 dining+0.038
 counters+0.037
 doorstep+0.037
view+0.037
Neg logit contributions
 coward-0.023
 loser-0.019
 hawk-0.019
 badly-0.016
 Sly-0.015
Token explore
Token ID7301
Feature activation-6.01e+00

Pos logit contributions
 Katie+0.010
view+0.009
iah+0.009
 dining+0.009
 counters+0.009
Neg logit contributions
 coward-0.010
 loser-0.009
 hawk-0.009
 badly-0.008
 hurt-0.008
Token."
Token ID526
Feature activation+1.40e-01

Pos logit contributions
 Katie+0.459
 dining+0.438
 counters+0.434
view+0.429
 doorstep+0.427
Neg logit contributions
 coward-0.240
 loser-0.208
 hawk-0.196
 badly-0.163
 Sly-0.155
Token\n
Token ID198
Feature activation+1.32e+00

Pos logit contributions
 coward+0.006
 loser+0.006
 hawk+0.005
 badly+0.005
 Sly+0.005
Neg logit contributions
 Katie-0.010
 dining-0.009
view-0.009
iah-0.009
 doorstep-0.009
Token\n
Token ID198
Feature activation+1.32e+00

Pos logit contributions
 coward+0.072
 loser+0.066
 hawk+0.063
 badly+0.056
 villain+0.054
Neg logit contributions
 Katie-0.081
 dining-0.077
view-0.077
 counters-0.076
 doorstep-0.075
TokenL
Token ID43
Feature activation-1.75e-01

Pos logit contributions
 coward+0.074
 hawk+0.069
 loser+0.064
 badly+0.058
 Sly+0.056
Neg logit contributions
 dining-0.075
 Katie-0.075
 counters-0.074
view-0.074
 pavement-0.070
Tokenily
Token ID813
Feature activation-1.58e-02

Pos logit contributions
 Katie+0.010
 counters+0.009
 dining+0.009
view+0.009
iah+0.009
Neg logit contributions
 coward-0.011
 hawk-0.010
 loser-0.009
 badly-0.009
 hunter-0.008
Token cap
Token ID1451
Feature activation+3.68e-01

Pos logit contributions
 coward+0.004
 loser+0.003
 hawk+0.003
 badly+0.003
 villain+0.003
Neg logit contributions
 Katie-0.005
 dining-0.005
 counters-0.005
view-0.005
 doorstep-0.005
Token and
Token ID290
Feature activation-5.05e-01

Pos logit contributions
 coward+0.011
 loser+0.010
 hawk+0.009
 badly+0.009
 on+0.007
Neg logit contributions
 Katie-0.030
 dining-0.029
view-0.029
 counters-0.029
iah-0.029
Token said
Token ID531
Feature activation-1.20e+00

Pos logit contributions
 Katie+0.032
 dining+0.030
 counters+0.030
view+0.030
 doorstep+0.030
Neg logit contributions
 coward-0.026
 loser-0.023
 hawk-0.023
 badly-0.021
 hunter-0.019
Token,
Token ID11
Feature activation+1.47e-01

Pos logit contributions
 Katie+0.092
 dining+0.086
 counters+0.085
iah+0.084
view+0.083
Neg logit contributions
 coward-0.049
 loser-0.043
 hawk-0.041
 badly-0.033
 hunter-0.033
Token "
Token ID366
Feature activation+1.39e-01

Pos logit contributions
 coward+0.006
 loser+0.005
 hawk+0.005
 badly+0.004
 Sly+0.004
Neg logit contributions
view-0.011
 Katie-0.011
 doorstep-0.011
 dining-0.011
 counters-0.011
TokenThis
Token ID1212
Feature activation-5.37e+00

Pos logit contributions
 loser+0.005
Bad+0.005
 coward+0.005
 hawk+0.005
ching+0.005
Neg logit contributions
 dining-0.010
 counters-0.010
 doorstep-0.010
 balcony-0.010
 desks-0.009
Token is
Token ID318
Feature activation-6.59e-01

Pos logit contributions
 Katie+0.352
 dining+0.336
view+0.333
 doorstep+0.333
 counters+0.333
Neg logit contributions
 coward-0.263
 loser-0.226
 hawk-0.224
 badly-0.199
 weaker-0.189
Token the
Token ID262
Feature activation+7.23e-02

Pos logit contributions
 Katie+0.048
 dining+0.046
 counters+0.046
view+0.046
 doorstep+0.046
Neg logit contributions
 coward-0.028
 loser-0.024
 hawk-0.023
 badly-0.021
 Sly-0.019
Token perfect
Token ID2818
Feature activation-9.80e-01

Pos logit contributions
 coward+0.003
 loser+0.003
 hawk+0.003
 badly+0.003
 hunter+0.002
Neg logit contributions
 Katie-0.005
 dining-0.005
 counters-0.005
view-0.005
iah-0.005
Token one
Token ID530
Feature activation+2.03e+00

Pos logit contributions
 Katie+0.063
 dining+0.061
 counters+0.060
view+0.060
 doorstep+0.060
Neg logit contributions
 coward-0.049
 loser-0.044
 hawk-0.042
 badly-0.038
 hunter-0.036
Token!"
Token ID2474
Feature activation-7.70e-01

Pos logit contributions
 coward+0.079
 loser+0.070
 hawk+0.064
 badly+0.059
 hero+0.053
Neg logit contributions
 Katie-0.151
 counters-0.150
 dining-0.149
view-0.148
 doorstep-0.146
Token "
Token ID366
Feature activation+3.83e-01

Pos logit contributions
 coward+0.032
 loser+0.028
 hawk+0.027
 badly+0.024
 villain+0.023
Neg logit contributions
 Katie-0.039
 dining-0.038
 counters-0.037
 doorstep-0.037
iah-0.037
TokenWe
Token ID1135
Feature activation-4.37e+00

Pos logit contributions
 coward+0.024
 loser+0.021
 hawk+0.021
 badly+0.019
 Sly+0.019
Neg logit contributions
 Katie-0.020
view-0.019
 counters-0.019
iah-0.019
 dining-0.019
Token are
Token ID389
Feature activation-7.80e-01

Pos logit contributions
 Katie+0.214
 doorstep+0.193
view+0.193
 dining+0.192
iah+0.190
Neg logit contributions
 coward-0.270
 hawk-0.265
 loser-0.246
 weaker-0.232
 Sly-0.232
Token sorry
Token ID7926
Feature activation+3.66e+00

Pos logit contributions
 Katie+0.053
 dining+0.051
 counters+0.049
 theatre+0.048
 balcony+0.047
Neg logit contributions
 coward-0.035
 loser-0.032
 hawk-0.032
ching-0.027
 Sly-0.027
Token,
Token ID11
Feature activation+6.28e-01

Pos logit contributions
 coward+0.161
 hawk+0.139
 loser+0.136
 Sly+0.109
 hunter+0.109
Neg logit contributions
 Katie-0.270
 counters-0.252
 dining-0.250
 doorstep-0.247
iah-0.245
Token Mom
Token ID11254
Feature activation-5.04e+00

Pos logit contributions
 coward+0.032
 hawk+0.028
 loser+0.028
ching+0.027
 badly+0.027
Neg logit contributions
 Katie-0.039
 dining-0.036
 theatre-0.035
 counters-0.035
 porch-0.035
Token.
Token ID13
Feature activation+9.98e-02

Pos logit contributions
 Katie+0.351
 dining+0.332
 counters+0.331
view+0.330
iah+0.328
Neg logit contributions
 coward-0.236
 loser-0.206
 hawk-0.204
 badly-0.174
 Sly-0.167
Token We
Token ID775
Feature activation-4.31e+00

Pos logit contributions
 coward+0.006
 loser+0.005
 hawk+0.005
 badly+0.005
Bad+0.005
Neg logit contributions
 Katie-0.006
 dining-0.005
view-0.005
 counters-0.005
iah-0.005
Token were
Token ID547
Feature activation-1.81e+00

Pos logit contributions
 Katie+0.244
view+0.231
 doorstep+0.228
iah+0.226
 dining+0.223
Neg logit contributions
 coward-0.231
 hawk-0.221
 loser-0.210
 hunters-0.192
 Sly-0.190
Token just
Token ID655
Feature activation+2.35e+00

Pos logit contributions
 Katie+0.117
 dining+0.113
 counters+0.112
 theatre+0.109
iah+0.106
Neg logit contributions
 coward-0.085
 hawk-0.080
 loser-0.080
 Sly-0.067
 badly-0.066
Token playing
Token ID2712
Feature activation-2.88e+00

Pos logit contributions
 coward+0.127
 loser+0.114
 hawk+0.110
 badly+0.100
 hunter+0.094
Neg logit contributions
 Katie-0.144
 dining-0.138
 counters-0.137
view-0.136
 doorstep-0.136