TOP ACTIVATIONS
MAX = 4.498

your tower . It 's so tall !" he said .
block . " It is very tall and pretty !"
and blue blocks created a very colourful tower .
said , " Yes , very much ! I love it
was an obedient boy , so he said " yes "
the ground . It was so big that they could both
Look , our tower is so high !" Lily said .
and started to draw a very big picture .
her tower very tall and very pretty . But
desk . He drew a very big and beautiful flower .
Look , our tower is so high !" Sara says .
liked counting and he was very good at it . He
his tower . It is very tall and has many colors
Ben wants to make a very long rail . He takes
Tim my loved the sandwich so much that he asked his
" Yes , it is very tall and pretty . I
They made the sand castle very big and strong .
was taller than them and very strong . "
. He likes his hat very much . "
top . The tower is very tall . "

MIN ACTIVATIONS
MIN = -3.879

it wanted him to take the sack . It was so
for telling her to take the cat to the vet .
bridge . A small mouse stepped forward and said that he
L ily 's mom called her to come inside ,
L ily 's mom called for her , but Lily
so y ucky that everyone stepped back and made a face
Lily and her mom my stepped out of the truck and
But then , her mom called her in for dinner .
door . She saw Lily crou ching behind the clothes .
stop the thief . She stepped in front of the thief
while , Lily 's mom called her inside for some hot
She thanked her cat for the special gift .
't worry , Lily . We 'll find it ." They
, Lily 's mom my called her to go home .
Benny thanked the mole for the adventure and hopped back up
, " Thank you for the cookies and for helping me
called out to Bobby , " Where is my grandson ?
know . The moral of the story is to be careful
." Gr anny stepped closer and looked at the
floor . Lily 's mom stepped on it and it showed

INTERVAL 2.404 - 4.498
CONTAINS 0.332%

battery . It had a very important job - to power
. They liked to make different things with the blocks ,
was the cut est little three year old . He loved
to say thank you , so he made them some cookies
Tim my hit the ball so hard that he became dizz

INTERVAL 0.309 - 2.404
CONTAINS 35.235%

lots of pretty fireworks lighting up the sky . The colors
story aloud to each other . They were so proud of
space . When he returned to Earth , he told all
for ?" Lily says . " It 's for
that he couldn 't wait to ride it . He practiced

INTERVAL -1.785 - 0.309
CONTAINS 62.853%

lead the way ". The girls followed Maria and
boy named Jamie . Jamie loved to play outside and explore
upon a time , there was a little girl named Lily
was a little girl who liked to draw . She had
Lily and her mom my were surprised but happy to see

INTERVAL -3.879 - -1.785
CONTAINS 1.579%

you , Grand ma . You are very nice . We
while , Lily 's mom asked her to shut the dish
to Tom , " Stop ! It is a trap !
They remembered their friendship . <|endoftext|> L ila and Ben were
, Lily 's mom my asked her , " L ily
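
Note on the INTERVAL blocks above: the four ranges appear to be equal-width bins spanning MIN = -3.879 to MAX = 4.498, and the CONTAINS figures appear to be the share of token activations falling in each bin. The sketch below is a minimal illustration under that assumption, not the dashboard's actual code; the function names `interval_edges` and `interval_shares` are hypothetical.

```python
def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced bin edges from lo to hi."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def interval_shares(activations, edges):
    """Return the percentage of activations that fall in each bin.

    Each bin is half-open [edge_i, edge_{i+1}); any value at or above
    the last inner edge (including the maximum) goes in the final bin.
    """
    n_bins = len(edges) - 1
    counts = [0] * n_bins
    for a in activations:
        idx = n_bins - 1  # default: last bin
        for i in range(n_bins):
            if a < edges[i + 1]:
                idx = i
                break
        counts[idx] += 1
    total = len(activations)
    return [100.0 * c / total for c in counts]


# Edges implied by the MIN and MAX values reported in this dump.
edges = interval_edges(-3.879, 4.498)
print([round(e, 3) for e in edges])

# Toy activations (made up for illustration, not taken from the dump).
print(interval_shares([-3.0, 0.0, 0.1, 1.0, 4.498], edges))
```

The computed inner edges agree with the reported interval boundaries (-1.785, 0.309, 2.404) to within rounding, which is what suggests the equal-width reading.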
Token your
Token ID534
Feature activation+1.49e+00

Pos logit contributions
lf+1.565
 flavors+1.522
 sizes+1.442
 kinds+1.441
cest+1.382
Neg logit contributions
hered-1.509
oken-1.497
pped-1.408
 risking-1.396
 grounded-1.391
Token tower
Token ID10580
Feature activation+2.69e+00

Pos logit contributions
lf+2.007
 flavors+1.927
 kinds+1.791
 sizes+1.771
cest+1.768
Neg logit contributions
oken-3.010
hered-2.845
oning-2.709
ives-2.675
 grounded-2.626
Token.
Token ID13
Feature activation+4.79e-01

Pos logit contributions
 flavors+4.547
lf+4.528
 sizes+4.264
 kinds+4.217
cest+4.162
Neg logit contributions
hered-4.152
ives-3.933
 grounded-3.899
oken-3.868
pped-3.732
Token It
Token ID632
Feature activation+1.53e+00

Pos logit contributions
lf+0.998
 flavors+0.965
 kinds+0.923
 sizes+0.907
cest+0.876
Neg logit contributions
hered-0.618
oken-0.592
 grounded-0.540
ives-0.531
oning-0.531
Token's
Token ID338
Feature activation+2.47e+00

Pos logit contributions
lf+3.072
 flavors+2.923
 kinds+2.723
cest+2.660
 sizes+2.656
Neg logit contributions
hered-2.086
oken-1.997
 grounded-1.890
ives-1.855
pped-1.790
Token so
Token ID523
Feature activation+4.50e+00

Pos logit contributions
lf+4.875
 flavors+4.715
cest+4.456
 kinds+4.449
 sizes+4.229
Neg logit contributions
oken-3.513
 grounded-3.302
pped-3.077
hered-3.066
 okay-3.029
Token tall
Token ID7331
Feature activation+1.23e+00

Pos logit contributions
lf+8.550
 flavors+8.539
cest+7.898
 oval+7.772
 kinds+7.415
Neg logit contributions
 grounded-7.257
oken-7.162
 okay-6.878
ives-6.800
 Drink-6.785
Token!"
Token ID2474
Feature activation+1.47e+00

Pos logit contributions
lf+1.805
 flavors+1.732
 kinds+1.625
 sizes+1.596
 oval+1.525
Neg logit contributions
hered-2.266
oken-2.135
 grounded-2.087
ives-2.061
 risking-2.058
Token he
Token ID339
Feature activation+2.31e+00

Pos logit contributions
 flavors+2.473
 sizes+2.430
lf+2.427
 kinds+2.337
cest+2.277
Neg logit contributions
hered-2.443
oken-2.199
 grounded-2.164
 risking-2.141
oning-2.112
Token said
Token ID531
Feature activation+1.28e+00

Pos logit contributions
 flavors+3.890
lf+3.841
 sizes+3.741
cest+3.705
 kinds+3.629
Neg logit contributions
hered-3.710
 grounded-3.388
oken-3.197
 risking-3.191
pped-3.114
Token.
Token ID13
Feature activation+1.03e+00

Pos logit contributions
 flavors+1.987
lf+1.973
 sizes+1.877
 kinds+1.839
cest+1.704
Neg logit contributions
hered-2.431
oken-2.254
 grounded-2.130
oned-2.113
oning-2.101
Token block
Token ID2512
Feature activation+1.42e+00

Pos logit contributions
lf+1.810
 flavors+1.687
 sizes+1.660
 kinds+1.621
cest+1.495
Neg logit contributions
hered-3.261
oken-2.994
pped-2.883
oned-2.830
oning-2.819
Token.
Token ID13
Feature activation-4.89e-01

Pos logit contributions
lf+2.501
 flavors+2.438
 kinds+2.328
 sizes+2.315
cest+2.163
Neg logit contributions
hered-2.343
oken-2.122
pped-2.060
ives-2.018
 grounded-1.963
Token "
Token ID366
Feature activation+1.09e+00

Pos logit contributions
hered+0.842
oken+0.832
 grounded+0.760
 risking+0.755
pped+0.755
Neg logit contributions
lf-0.803
 flavors-0.778
 kinds-0.731
 sizes-0.730
 oval-0.686
TokenIt
Token ID1026
Feature activation+2.06e+00

Pos logit contributions
lf+2.549
 flavors+2.408
 kinds+2.336
 sizes+2.282
cest+2.217
Neg logit contributions
hered-1.193
oken-1.057
pped-1.020
ives-0.995
Sleep-0.980
Token is
Token ID318
Feature activation+2.51e+00

Pos logit contributions
lf+4.267
 flavors+3.936
 kinds+3.615
cest+3.575
lights+3.566
Neg logit contributions
hered-2.776
oken-2.532
ives-2.505
 grounded-2.470
pped-2.331
Token very
Token ID845
Feature activation+4.47e+00

Pos logit contributions
lf+5.144
 flavors+4.850
 kinds+4.693
cest+4.518
 sizes+4.373
Neg logit contributions
oken-3.181
 grounded-3.088
hered-3.058
pped-2.952
 risking-2.895
Token tall
Token ID7331
Feature activation+1.86e+00

Pos logit contributions
lf+8.555
 flavors+8.150
cest+7.562
 kinds+7.521
 sizes+7.483
Neg logit contributions
oken-7.684
 grounded-7.399
 okay-7.070
 OK-6.815
oning-6.606
Token and
Token ID290
Feature activation+3.27e+00

Pos logit contributions
lf+3.451
 flavors+3.222
 kinds+3.066
 sizes+2.991
 oval+2.883
Neg logit contributions
hered-2.811
oken-2.533
ives-2.488
 grounded-2.433
 risking-2.357
Token pretty
Token ID2495
Feature activation+2.17e+00

Pos logit contributions
lf+6.292
 flavors+5.790
 kinds+5.448
cest+5.418
omo+5.303
Neg logit contributions
 grounded-4.946
hered-4.666
oken-4.587
Come-4.497
 risking-4.477
Token!"
Token ID2474
Feature activation-3.26e-01

Pos logit contributions
lf+3.341
 flavors+2.892
 kinds+2.708
 sizes+2.656
 oval+2.575
Neg logit contributions
hered-4.176
ives-3.751
oken-3.640
pped-3.615
 risking-3.584
Token
Token ID198
Feature activation+1.10e+00

Pos logit contributions
hered+0.589
oken+0.577
 grounded+0.529
pped+0.528
 risking+0.527
Neg logit contributions
lf-0.516
 flavors-0.497
 kinds-0.468
 sizes-0.465
 oval-0.435
Token and
Token ID290
Feature activation+2.83e+00

Pos logit contributions
lf+1.833
 flavors+1.755
 sizes+1.688
 kinds+1.663
cest+1.607
Neg logit contributions
hered-2.019
oken-1.888
pped-1.800
ives-1.757
 risking-1.714
Token blue
Token ID4171
Feature activation+4.40e-01

Pos logit contributions
lf+4.324
 flavors+4.301
cest+4.163
 kinds+4.085
 sizes+4.038
Neg logit contributions
hered-5.044
pped-4.554
Come-4.551
 grounded-4.474
Grab-4.424
Token blocks
Token ID7021
Feature activation+8.38e-01

Pos logit contributions
lf+0.706
 flavors+0.664
 sizes+0.625
 kinds+0.622
cest+0.593
Neg logit contributions
hered-0.797
oken-0.758
pped-0.708
ives-0.698
 risking-0.690
Token created
Token ID2727
Feature activation+1.65e+00

Pos logit contributions
lf+1.581
 flavors+1.527
 sizes+1.429
 kinds+1.425
 oval+1.370
Neg logit contributions
hered-1.254
oken-1.178
pped-1.118
ives-1.084
 risking-1.071
Token a
Token ID257
Feature activation+2.71e+00

Pos logit contributions
lf+3.510
 flavors+3.311
 sizes+3.107
cest+3.104
 kinds+3.096
Neg logit contributions
oken-2.024
hered-1.965
pped-1.747
ives-1.694
 risking-1.686
Token very
Token ID845
Feature activation+4.40e+00

Pos logit contributions
lf+3.796
 flavors+3.727
 kinds+3.673
 sizes+3.430
cest+3.147
Neg logit contributions
hered-5.054
oken-5.036
 grounded-4.883
oning-4.754
 risking-4.636
Token colourful
Token ID44799
Feature activation+2.47e+00

Pos logit contributions
lf+7.730
 kinds+7.422
 flavors+7.362
 sizes+7.358
cest+6.889
Neg logit contributions
oken-7.836
 grounded-7.092
oning-6.873
 somewhere-6.791
 poisoned-6.757
Token tower
Token ID10580
Feature activation+2.34e+00

Pos logit contributions
lf+2.459
 flavors+2.384
 sizes+2.244
 kinds+2.150
cest+2.102
Neg logit contributions
hered-5.667
oken-5.301
oning-5.000
ives-4.980
oned-4.961
Token.
Token ID13
Feature activation-5.37e-01

Pos logit contributions
 flavors+3.900
lf+3.784
 kinds+3.502
 sizes+3.400
omo+3.292
Neg logit contributions
hered-4.107
ives-3.614
pped-3.559
oken-3.509
 grounded-3.424
Token
Token ID198
Feature activation+1.13e-01

Pos logit contributions
hered+0.938
oken+0.918
 grounded+0.846
 risking+0.846
pped+0.842
Neg logit contributions
lf-0.874
 flavors-0.849
 kinds-0.806
 sizes-0.797
 oval-0.747
Token
Token ID198
Feature activation-8.15e-02

Pos logit contributions
lf+0.214
 flavors+0.207
 sizes+0.198
 kinds+0.196
 oval+0.187
Neg logit contributions
hered-0.168
oken-0.163
pped-0.148
 grounded-0.146
 risking-0.146
Token said
Token ID531
Feature activation+1.68e+00

Pos logit contributions
lf+2.682
 flavors+2.532
 sizes+2.509
 kinds+2.444
cest+2.370
Neg logit contributions
hered-2.350
oken-2.195
 grounded-2.127
 risking-2.076
pped-2.054
Token,
Token ID11
Feature activation-1.36e-02

Pos logit contributions
lf+2.785
 flavors+2.732
 sizes+2.670
 kinds+2.580
 oval+2.433
Neg logit contributions
hered-2.998
oken-2.747
pped-2.563
 risking-2.552
ives-2.550
Token "
Token ID366
Feature activation+1.52e+00

Pos logit contributions
hered+0.026
oken+0.025
pped+0.023
 grounded+0.023
 risking+0.023
Neg logit contributions
lf-0.020
 flavors-0.020
 kinds-0.018
 sizes-0.018
cest-0.017
TokenYes
Token ID5297
Feature activation+1.96e+00

Pos logit contributions
lf+3.086
 flavors+2.756
 kinds+2.697
 sizes+2.677
cest+2.601
Neg logit contributions
hered-2.178
Sleep-1.980
oken-1.941
ives-1.939
Come-1.906
Token,
Token ID11
Feature activation+6.15e-01

Pos logit contributions
lf+3.291
 flavors+3.276
 sizes+3.229
 kinds+3.205
cest+2.859
Neg logit contributions
hered-3.451
oken-3.176
pped-3.001
oning-2.970
 permission-2.935
Token very
Token ID845
Feature activation+4.38e+00

Pos logit contributions
lf+1.033
 flavors+1.001
 sizes+0.966
 kinds+0.961
cest+0.895
Neg logit contributions
hered-1.000
oken-0.974
 grounded-0.927
 risking-0.903
pped-0.895
Token much
Token ID881
Feature activation+2.46e+00

Pos logit contributions
 flavors+8.831
lf+8.814
 sizes+8.635
 kinds+8.535
lights+7.933
Neg logit contributions
 okay-6.876
 OK-6.700
oken-6.356
 Drink-6.093
 grounded-5.892
Token!
Token ID0
Feature activation+3.92e-01

Pos logit contributions
lf+4.286
 flavors+4.036
 kinds+3.762
 sizes+3.762
cest+3.610
Neg logit contributions
hered-4.022
oken-3.830
ives-3.807
 grounded-3.700
Sleep-3.670
Token I
Token ID314
Feature activation-5.31e-01

Pos logit contributions
lf+0.733
 flavors+0.701
 kinds+0.668
 sizes+0.661
cest+0.632
Neg logit contributions
hered-0.600
oken-0.567
ives-0.525
 grounded-0.521
pped-0.514
Token love
Token ID1842
Feature activation-5.72e-05

Pos logit contributions
hered+0.792
oken+0.773
pped+0.697
 risking+0.695
 grounded+0.689
Neg logit contributions
lf-0.999
 flavors-0.974
 kinds-0.932
 sizes-0.922
cest-0.873
Token it
Token ID340
Feature activation+1.43e+00

Pos logit contributions
hered+0.000
oken+0.000
 risking+0.000
 grounded+0.000
pped+0.000
Neg logit contributions
lf-0.000
 flavors-0.000
 kinds-0.000
 sizes-0.000
 oval-0.000
Token was
Token ID373
Feature activation+4.85e-01

Pos logit contributions
lf+1.863
 sizes+1.791
 flavors+1.779
 kinds+1.723
cest+1.715
Neg logit contributions
hered-1.743
oken-1.662
 grounded-1.597
pped-1.571
ives-1.530
Token an
Token ID281
Feature activation+4.77e-01

Pos logit contributions
lf+0.765
 flavors+0.735
 sizes+0.690
 kinds+0.683
cest+0.646
Neg logit contributions
hered-0.872
oken-0.868
 grounded-0.809
 risking-0.792
pped-0.792
Token obedient
Token ID41652
Feature activation-1.50e-04

Pos logit contributions
lf+0.659
 flavors+0.638
 sizes+0.599
 kinds+0.583
cest+0.535
Neg logit contributions
oken-0.940
hered-0.931
 grounded-0.875
pped-0.854
 risking-0.840
Token boy
Token ID2933
Feature activation+4.90e-01

Pos logit contributions
hered+0.000
oken+0.000
 grounded+0.000
pped+0.000
 risking+0.000
Neg logit contributions
lf-0.000
 flavors-0.000
 sizes-0.000
 kinds-0.000
 oval-0.000
Token,
Token ID11
Feature activation-2.94e-01

Pos logit contributions
lf+0.696
 flavors+0.671
 sizes+0.622
 kinds+0.614
 oval+0.564
Neg logit contributions
hered-0.978
oken-0.923
 grounded-0.880
 risking-0.873
pped-0.872
Token so
Token ID523
Feature activation+4.36e+00

Pos logit contributions
hered+0.520
oken+0.509
pped+0.464
 grounded+0.464
 risking+0.463
Neg logit contributions
lf-0.476
 flavors-0.457
 kinds-0.432
 sizes-0.430
 oval-0.403
Token he
Token ID339
Feature activation+7.00e-01

Pos logit contributions
 sizes+9.147
 oval+8.762
 kinds+8.579
 flavors+8.451
 sorts+8.116
Neg logit contributions
 okay-6.979
 OK-6.645
pped-6.253
Come-6.111
hered-6.000
Token said
Token ID531
Feature activation+7.97e-01

Pos logit contributions
lf+1.266
 flavors+1.246
 sizes+1.226
 kinds+1.212
cest+1.149
Neg logit contributions
hered-1.039
oken-0.980
 grounded-0.948
pped-0.938
 risking-0.902
Token "
Token ID366
Feature activation+1.15e+00

Pos logit contributions
lf+1.336
 flavors+1.300
 kinds+1.266
 sizes+1.260
cest+1.172
Neg logit contributions
hered-1.356
oken-1.288
pped-1.214
 risking-1.209
 grounded-1.189
Tokenyes
Token ID8505
Feature activation+8.54e-01

Pos logit contributions
lf+2.394
 flavors+2.248
 sizes+2.190
 kinds+2.166
cest+2.108
Neg logit contributions
hered-1.534
oken-1.352
pped-1.352
ives-1.295
Sleep-1.249
Token"
Token ID1
Feature activation+7.69e-01

Pos logit contributions
 flavors+1.199
lf+1.197
 kinds+1.153
 sizes+1.152
 oval+1.037
Neg logit contributions
hered-1.712
oken-1.584
pped-1.556
 grounded-1.515
ives-1.508
Token the
Token ID262
Feature activation+1.49e+00

Pos logit contributions
lf+2.864
 flavors+2.862
 kinds+2.710
cest+2.709
 sizes+2.686
Neg logit contributions
hered-2.718
pped-2.497
oken-2.485
oning-2.366
oned-2.317
Token ground
Token ID2323
Feature activation+7.96e-01

Pos logit contributions
lf+2.054
 flavors+1.987
 kinds+1.925
 sizes+1.846
cest+1.753
Neg logit contributions
hered-2.965
oken-2.883
pped-2.680
oning-2.678
ives-2.670
Token.
Token ID13
Feature activation+3.74e-01

Pos logit contributions
lf+1.101
 flavors+1.055
 kinds+0.974
 sizes+0.958
cest+0.909
Neg logit contributions
hered-1.585
oken-1.519
pped-1.436
ives-1.418
 risking-1.410
Token It
Token ID632
Feature activation+2.63e+00

Pos logit contributions
lf+0.672
 flavors+0.651
 kinds+0.613
 sizes+0.606
cest+0.587
Neg logit contributions
hered-0.594
oken-0.562
oning-0.516
ives-0.515
pped-0.514
Token was
Token ID373
Feature activation+2.18e+00

Pos logit contributions
lf+5.114
 flavors+5.001
cest+4.611
 kinds+4.594
 sizes+4.561
Neg logit contributions
hered-3.846
pped-3.241
oken-3.167
ives-3.128
 grounded-3.117
Token so
Token ID523
Feature activation+4.35e+00

Pos logit contributions
lf+4.806
 flavors+4.551
cest+4.309
 kinds+4.249
 sizes+4.117
Neg logit contributions
oken-2.604
hered-2.541
pped-2.292
 grounded-2.200
 risking-2.152
Token big
Token ID1263
Feature activation+1.37e-01

Pos logit contributions
lf+8.275
 flavors+8.036
cest+7.828
 sizes+7.071
oc+6.938
Neg logit contributions
oken-7.470
ives-7.178
hered-6.835
't-6.562
Come-6.354
Token that
Token ID326
Feature activation+1.60e+00

Pos logit contributions
lf+0.230
 flavors+0.221
 kinds+0.209
 sizes+0.208
 oval+0.196
Neg logit contributions
hered-0.231
oken-0.222
pped-0.205
 risking-0.204
 grounded-0.204
Token they
Token ID484
Feature activation+1.36e+00

Pos logit contributions
lf+2.753
 flavors+2.737
 kinds+2.472
 oval+2.460
cest+2.457
Neg logit contributions
hered-2.561
oken-2.372
 risking-2.282
 grounded-2.250
pped-2.214
Token could
Token ID714
Feature activation+1.82e+00

Pos logit contributions
 flavors+2.465
lf+2.459
 kinds+2.328
 sizes+2.286
cest+2.216
Neg logit contributions
hered-2.084
pped-1.915
oken-1.888
 grounded-1.857
ives-1.779
Token both
Token ID1111
Feature activation+2.74e+00

Pos logit contributions
 flavors+3.478
lf+3.419
 sizes+3.352
 kinds+3.313
cest+3.229
Neg logit contributions
hered-2.592
oken-2.515
pped-2.323
oning-2.309
 grounded-2.264
TokenLook
Token ID8567
Feature activation+7.01e-01

Pos logit contributions
hered+0.154
oken+0.147
pped+0.130
 grounded+0.130
 risking+0.129
Neg logit contributions
lf-0.272
 flavors-0.263
 kinds-0.253
 sizes-0.252
 oval-0.240
Token,
Token ID11
Feature activation+5.35e-02

Pos logit contributions
lf+1.125
 flavors+1.114
 sizes+1.078
 kinds+1.070
cest+0.980
Neg logit contributions
hered-1.234
oken-1.185
pped-1.112
 grounded-1.104
 risking-1.075
Token our
Token ID674
Feature activation+2.40e+00

Pos logit contributions
lf+0.093
 flavors+0.090
 kinds+0.086
 sizes+0.086
cest+0.080
Neg logit contributions
hered-0.087
oken-0.084
 grounded-0.077
 risking-0.077
pped-0.076
Token tower
Token ID10580
Feature activation+2.13e+00

Pos logit contributions
lf+2.057
 flavors+1.853
 kinds+1.730
cest+1.723
 sizes+1.714
Neg logit contributions
oken-5.965
hered-5.605
 grounded-5.400
 permission-5.400
oning-5.362
Token is
Token ID318
Feature activation+2.35e+00

Pos logit contributions
lf+4.070
 flavors+3.801
 sizes+3.672
 kinds+3.650
 oval+3.549
Neg logit contributions
hered-3.131
oken-2.788
 grounded-2.733
ives-2.683
pped-2.660
Token so
Token ID523
Feature activation+4.32e+00

Pos logit contributions
lf+4.503
 flavors+4.405
 kinds+4.236
cest+4.024
 sizes+4.014
Neg logit contributions
hered-3.252
oken-3.230
 grounded-3.132
pped-3.110
 risking-2.968
Token high
Token ID1029
Feature activation+6.87e-01

Pos logit contributions
lf+7.656
 flavors+7.651
cest+7.384
 sizes+7.101
 kinds+7.045
Neg logit contributions
't-7.054
 grounded-7.012
oken-6.952
oning-6.777
 Drink-6.568
Token!"
Token ID2474
Feature activation+3.55e-01

Pos logit contributions
lf+0.924
 flavors+0.876
 kinds+0.826
 sizes+0.814
 oval+0.755
Neg logit contributions
hered-1.394
oken-1.322
pped-1.265
 grounded-1.257
 risking-1.249
Token Lily
Token ID20037
Feature activation+1.94e+00

Pos logit contributions
lf+0.599
 flavors+0.588
 kinds+0.553
 sizes+0.552
cest+0.524
Neg logit contributions
hered-0.596
oken-0.568
 grounded-0.528
 risking-0.522
pped-0.520
Token said
Token ID531
Feature activation-6.60e-02

Pos logit contributions
lf+3.191
 flavors+3.097
 sizes+2.975
cest+2.936
 kinds+2.876
Neg logit contributions
hered-3.385
ives-3.013
oken-2.932
 grounded-2.894
pped-2.856
Token.
Token ID13
Feature activation-1.16e+00

Pos logit contributions
hered+0.140
oken+0.136
pped+0.126
 grounded+0.126
 risking+0.126
Neg logit contributions
lf-0.084
 flavors-0.080
 kinds-0.075
 sizes-0.074
 oval-0.068
Token and
Token ID290
Feature activation+7.41e-01

Pos logit contributions
hered+0.019
oken+0.018
pped+0.017
 grounded+0.017
 risking+0.017
Neg logit contributions
lf-0.021
 flavors-0.020
 kinds-0.019
 sizes-0.019
cest-0.018
Token started
Token ID2067
Feature activation+9.13e-02

Pos logit contributions
lf+1.267
 flavors+1.225
 kinds+1.187
 sizes+1.177
cest+1.132
Neg logit contributions
hered-1.197
oken-1.140
pped-1.090
 risking-1.071
 grounded-1.061
Token to
Token ID284
Feature activation+1.13e+00

Pos logit contributions
lf+0.189
 flavors+0.184
 sizes+0.175
 kinds+0.175
cest+0.167
Neg logit contributions
hered-0.119
oken-0.115
pped-0.102
 risking-0.102
 grounded-0.101
Token draw
Token ID3197
Feature activation+1.19e+00

Pos logit contributions
lf+2.052
 flavors+2.017
 sizes+1.901
 kinds+1.899
cest+1.868
Neg logit contributions
oken-1.689
hered-1.689
pped-1.512
oning-1.501
 grounded-1.477
Token a
Token ID257
Feature activation+2.26e+00

Pos logit contributions
lf+2.135
 flavors+2.054
 kinds+1.939
 sizes+1.921
cest+1.885
Neg logit contributions
hered-1.853
oken-1.810
pped-1.667
ives-1.646
oning-1.570
Token very
Token ID845
Feature activation+4.31e+00

Pos logit contributions
lf+3.050
 flavors+3.034
 kinds+2.937
 sizes+2.872
cest+2.529
Neg logit contributions
hered-4.402
oken-4.157
ives-3.970
oning-3.959
 grounded-3.950
Token big
Token ID1263
Feature activation+1.93e+00

Pos logit contributions
 sizes+7.316
 flavors+7.168
lf+7.162
 kinds+7.058
cest+6.485
Neg logit contributions
oken-8.134
oning-7.329
 grounded-7.188
add-6.762
 poisoned-6.744
Token picture
Token ID4286
Feature activation+1.34e+00

Pos logit contributions
lf+2.822
 flavors+2.621
 kinds+2.439
 sizes+2.423
cest+2.245
Neg logit contributions
hered-3.778
oken-3.512
ives-3.200
 grounded-3.193
oning-3.156
Token.
Token ID13
Feature activation-1.32e-01

Pos logit contributions
 flavors+2.205
lf+2.190
 sizes+2.028
 kinds+2.015
cest+1.955
Neg logit contributions
hered-2.348
oken-2.147
ives-2.120
pped-2.039
 grounded-1.971
Token
Token ID198
Feature activation+4.76e-01

Pos logit contributions
hered+0.210
oken+0.204
pped+0.185
 grounded+0.184
 risking+0.184
Neg logit contributions
lf-0.237
 flavors-0.228
 kinds-0.217
 sizes-0.216
 oval-0.204
Token
Token ID198
Feature activation-9.22e-02

Pos logit contributions
lf+0.925
 flavors+0.916
 sizes+0.881
 kinds+0.862
 oval+0.832
Neg logit contributions
hered-0.668
oken-0.641
pped-0.583
 grounded-0.574
 risking-0.573
Token her
Token ID607
Feature activation+2.12e+00

Pos logit contributions
lf+3.120
 flavors+2.970
 sizes+2.799
cest+2.787
 kinds+2.764
Neg logit contributions
oken-2.099
hered-1.983
pped-1.825
oning-1.788
ives-1.757
Token tower
Token ID10580
Feature activation+3.01e+00

Pos logit contributions
lf+2.546
 flavors+2.304
cest+2.154
 kinds+2.095
 sizes+2.085
Neg logit contributions
oken-4.714
hered-4.446
 permission-4.214
 grounded-4.209
oning-4.046
Token very
Token ID845
Feature activation+3.81e+00

Pos logit contributions
lf+5.933
 flavors+5.738
cest+5.221
 sizes+5.184
 kinds+5.155
Neg logit contributions
hered-4.339
oken-4.224
ives-3.813
oning-3.763
 somewhere-3.663
Token tall
Token ID7331
Feature activation+1.99e+00

Pos logit contributions
lf+7.387
 flavors+6.779
 kinds+6.242
 oval+6.042
 sizes+5.952
Neg logit contributions
oken-6.148
oning-5.681
 grounded-5.623
ives-5.436
hered-5.414
Token and
Token ID290
Feature activation+3.49e+00

Pos logit contributions
lf+3.990
 flavors+3.759
 kinds+3.589
 sizes+3.502
cest+3.463
Neg logit contributions
hered-2.621
oken-2.524
pped-2.270
ives-2.245
 risking-2.224
Token very
Token ID845
Feature activation+4.31e+00

Pos logit contributions
lf+6.970
 flavors+6.888
 kinds+6.453
cest+6.446
omo+6.414
Neg logit contributions
hered-4.500
Come-4.444
oken-4.350
 grounded-4.338
pped-4.253
Token pretty
Token ID2495
Feature activation+2.42e+00

Pos logit contributions
lf+9.223
 flavors+9.046
 kinds+8.292
omo+8.128
 sizes+8.034
Neg logit contributions
oken-6.546
 okay-6.022
 grounded-5.918
ives-5.590
 risking-5.589
Token.
Token ID13
Feature activation-1.90e-01

Pos logit contributions
lf+4.142
 flavors+3.759
 kinds+3.595
omo+3.548
cest+3.490
Neg logit contributions
hered-4.082
oken-3.790
pped-3.657
ting-3.636
ives-3.590
Token
Token ID198
Feature activation+4.62e-01

Pos logit contributions
hered+0.297
oken+0.290
pped+0.262
 grounded+0.262
 risking+0.261
Neg logit contributions
lf-0.347
 flavors-0.334
 kinds-0.318
 sizes-0.317
 oval-0.299
Token
Token ID198
Feature activation+9.34e-02

Pos logit contributions
lf+0.904
 flavors+0.890
 sizes+0.852
 kinds+0.841
 oval+0.807
Neg logit contributions
hered-0.648
oken-0.625
pped-0.569
 grounded-0.563
ives-0.563
TokenBut
Token ID1537
Feature activation-5.38e-01

Pos logit contributions
lf+0.153
 flavors+0.150
 kinds+0.142
 sizes+0.141
 oval+0.132
Neg logit contributions
hered-0.162
oken-0.157
pped-0.146
 grounded-0.145
 risking-0.144
Token desk
Token ID6915
Feature activation+1.32e+00

Pos logit contributions
lf+0.597
 flavors+0.569
 kinds+0.543
 sizes+0.527
cest+0.483
Neg logit contributions
hered-1.173
oken-1.144
 grounded-1.077
pped-1.073
oning-1.069
Token.
Token ID13
Feature activation+3.64e-01

Pos logit contributions
 flavors+1.870
lf+1.824
 sizes+1.703
 kinds+1.702
cest+1.628
Neg logit contributions
hered-2.536
oken-2.440
ives-2.353
oning-2.307
pped-2.305
Token He
Token ID679
Feature activation+1.52e+00

Pos logit contributions
lf+0.600
 flavors+0.571
 sizes+0.542
 kinds+0.539
cest+0.515
Neg logit contributions
hered-0.635
oken-0.621
oning-0.567
pped-0.567
 grounded-0.563
Token drew
Token ID9859
Feature activation+9.55e-01

Pos logit contributions
lf+2.611
 flavors+2.571
 sizes+2.562
 kinds+2.495
cest+2.445
Neg logit contributions
hered-2.313
pped-2.190
oken-2.189
 grounded-2.082
ives-1.998
Token a
Token ID257
Feature activation+2.01e+00

Pos logit contributions
lf+2.005
 flavors+1.940
 kinds+1.848
 sizes+1.843
cest+1.800
Neg logit contributions
hered-1.189
oken-1.159
pped-1.054
ives-1.003
 grounded-0.992
Token very
Token ID845
Feature activation+4.30e+00

Pos logit contributions
 flavors+2.460
lf+2.451
 kinds+2.446
 sizes+2.400
cest+2.131
Neg logit contributions
hered-4.129
oken-3.891
 grounded-3.768
oning-3.701
 risking-3.674
Token big
Token ID1263
Feature activation+1.91e+00

Pos logit contributions
 sizes+7.645
 kinds+7.289
 flavors+6.923
lf+6.853
cest+6.830
Neg logit contributions
oken-7.915
 grounded-7.094
oning-6.965
 poisoned-6.481
 okay-6.367
Token and
Token ID290
Feature activation+3.42e+00

Pos logit contributions
lf+2.743
 flavors+2.653
 sizes+2.575
 kinds+2.533
cest+2.331
Neg logit contributions
hered-3.653
oken-3.369
 grounded-3.187
pped-3.143
oning-3.065
Token beautiful
Token ID4950
Feature activation+1.88e+00

Pos logit contributions
lf+5.822
cest+5.625
 flavors+5.579
 kinds+5.545
 sizes+5.534
Neg logit contributions
hered-5.172
 grounded-5.035
Come-5.033
pped-4.870
oken-4.790
Token flower
Token ID15061
Feature activation+1.35e+00

Pos logit contributions
lf+2.477
 flavors+2.477
 sizes+2.449
 kinds+2.364
cest+2.255
Neg logit contributions
hered-3.637
oken-3.353
pped-3.182
 risking-3.123
oned-3.088
Token.
Token ID13
Feature activation+8.11e-01

Pos logit contributions
lf+2.051
 flavors+1.982
 kinds+1.867
 sizes+1.827
cest+1.739
Neg logit contributions
hered-2.471
oken-2.292
ives-2.189
pped-2.173
 risking-2.169
TokenLook
Token ID8567
Feature activation+8.21e-01

Pos logit contributions
lf+0.860
 flavors+0.819
 kinds+0.796
 sizes+0.787
cest+0.756
Neg logit contributions
hered-0.437
oken-0.400
pped-0.365
 grounded-0.356
ives-0.353
Token,
Token ID11
Feature activation+5.84e-02

Pos logit contributions
lf+1.308
 flavors+1.274
 sizes+1.245
 kinds+1.228
cest+1.117
Neg logit contributions
hered-1.474
oken-1.430
pped-1.340
 grounded-1.312
oning-1.289
Token our
Token ID674
Feature activation+2.45e+00

Pos logit contributions
lf+0.106
 flavors+0.102
 kinds+0.098
 sizes+0.097
cest+0.091
Neg logit contributions
hered-0.091
oken-0.088
 grounded-0.081
 risking-0.080
pped-0.080
Token tower
Token ID10580
Feature activation+2.51e+00

Pos logit contributions
lf+2.725
 sizes+2.427
 kinds+2.392
 flavors+2.351
cest+2.267
Neg logit contributions
oken-5.536
hered-5.272
 grounded-5.036
 permission-5.029
oning-4.881
Token is
Token ID318
Feature activation+2.40e+00

Pos logit contributions
lf+5.202
 flavors+4.720
 sizes+4.659
 kinds+4.565
cest+4.437
Neg logit contributions
hered-3.377
oken-3.035
 grounded-3.002
ives-2.958
pped-2.787
Token so
Token ID523
Feature activation+4.27e+00

Pos logit contributions
lf+4.500
 flavors+4.179
 kinds+4.083
cest+3.881
 sizes+3.835
Neg logit contributions
hered-3.576
oken-3.527
 grounded-3.383
pped-3.331
 risking-3.238
Token high
Token ID1029
Feature activation+1.02e+00

Pos logit contributions
lf+8.107
 flavors+7.276
cest+7.150
 kinds+6.941
 sizes+6.854
Neg logit contributions
oken-6.931
 grounded-6.765
't-6.700
 Drink-6.656
 okay-6.553
Token!"
Token ID2474
Feature activation-1.91e-01

Pos logit contributions
lf+1.343
 flavors+1.269
 kinds+1.204
 sizes+1.184
 oval+1.077
Neg logit contributions
hered-2.099
oken-1.972
pped-1.909
 grounded-1.888
 risking-1.873
Token Sara
Token ID24799
Feature activation+2.14e-01

Pos logit contributions
hered+0.299
oken+0.291
pped+0.263
 grounded+0.263
 risking+0.262
Neg logit contributions
lf-0.348
 flavors-0.336
 kinds-0.320
 sizes-0.319
 oval-0.301
Token says
Token ID1139
Feature activation-6.16e-01

Pos logit contributions
lf+0.302
 flavors+0.281
 sizes+0.262
 kinds+0.261
cest+0.245
Neg logit contributions
hered-0.425
oken-0.414
ives-0.386
 grounded-0.383
pped-0.380
Token.
Token ID13
Feature activation-1.41e+00

Pos logit contributions
hered+1.356
oken+1.347
 grounded+1.263
pped+1.258
 risking+1.257
Neg logit contributions
lf-0.707
 flavors-0.661
 kinds-0.612
 sizes-0.609
 oval-0.565
Token liked
Token ID8288
Feature activation+1.39e+00

Pos logit contributions
lf+5.223
 flavors+5.006
cest+4.884
 sizes+4.849
 kinds+4.796
Neg logit contributions
hered-4.198
oken-3.980
ives-3.802
pped-3.719
 grounded-3.649
Token counting
Token ID14143
Feature activation+2.60e+00

Pos logit contributions
lf+2.946
 flavors+2.878
 kinds+2.681
 sizes+2.671
cest+2.668
Neg logit contributions
hered-1.599
oken-1.549
 risking-1.435
oning-1.363
ives-1.362
Token and
Token ID290
Feature activation+2.44e+00

Pos logit contributions
lf+5.400
 flavors+5.005
cest+4.884
omo+4.755
 kinds+4.628
Neg logit contributions
oken-3.305
hered-3.273
oning-2.919
 risking-2.915
pped-2.882
Token he
Token ID339
Feature activation+2.83e+00

Pos logit contributions
lf+4.507
 flavors+4.250
cest+4.022
 kinds+3.925
omo+3.913
Neg logit contributions
hered-3.711
oken-3.466
oning-3.375
oned-3.302
 risking-3.295
Token was
Token ID373
Feature activation+3.18e+00

Pos logit contributions
lf+5.294
 flavors+5.040
cest+4.834
 sizes+4.761
 kinds+4.721
Neg logit contributions
hered-4.059
oken-3.959
ives-3.760
pped-3.630
 grounded-3.624
Token very
Token ID845
Feature activation+4.24e+00

Pos logit contributions
lf+6.978
 flavors+6.504
lights+6.272
cest+5.969
 kinds+5.858
Neg logit contributions
oken-3.924
hered-3.820
 risking-3.679
 permission-3.650
 grounded-3.643
Token good
Token ID922
Feature activation+8.88e-01

Pos logit contributions
lf+9.714
 flavors+8.682
lights+8.635
 sizes+7.904
 kinds+7.875
Neg logit contributions
oken-5.980
 grounded-5.878
oning-5.527
 okay-5.499
hered-5.432
Token at
Token ID379
Feature activation+1.62e+00

Pos logit contributions
lf+1.690
 flavors+1.599
 sizes+1.520
 kinds+1.503
 oval+1.454
Neg logit contributions
hered-1.334
oken-1.294
 risking-1.173
 grounded-1.169
ives-1.168
Token it
Token ID340
Feature activation+1.70e+00

Pos logit contributions
lf+2.719
 flavors+2.701
 sizes+2.523
cest+2.496
 kinds+2.424
Neg logit contributions
oken-2.605
hered-2.567
 risking-2.361
oning-2.338
pped-2.336
Token.
Token ID13
Feature activation+1.91e+00

Pos logit contributions
lf+2.687
 flavors+2.626
 sizes+2.347
 kinds+2.340
cest+2.281
Neg logit contributions
hered-3.063
oken-2.917
ives-2.784
oning-2.753
 grounded-2.731
Token He
Token ID679
Feature activation+2.72e+00

Pos logit contributions
lf+3.401
 flavors+3.189
cest+3.099
 kinds+3.019
 sizes+2.958
Neg logit contributions
oken-3.051
hered-3.042
ives-2.904
oning-2.871
 Leave-2.823
Token his
Token ID465
Feature activation+1.33e+00

Pos logit contributions
lf+1.406
 flavors+1.347
 sizes+1.254
 kinds+1.235
cest+1.189
Neg logit contributions
hered-1.725
oken-1.639
pped-1.550
 risking-1.546
 grounded-1.541
Token tower
Token ID10580
Feature activation+2.28e+00

Pos logit contributions
lf+1.537
 flavors+1.417
 sizes+1.333
 kinds+1.333
cest+1.268
Neg logit contributions
oken-2.912
hered-2.833
 grounded-2.651
 permission-2.644
 risking-2.592
Token.
Token ID13
Feature activation+7.25e-01

Pos logit contributions
lf+4.782
 flavors+4.650
 sizes+4.463
cest+4.380
 kinds+4.377
Neg logit contributions
hered-2.839
ives-2.523
oken-2.497
 grounded-2.327
pped-2.319
Token It
Token ID632
Feature activation+2.45e+00

Pos logit contributions
lf+1.255
 flavors+1.176
 sizes+1.127
 kinds+1.123
cest+1.061
Neg logit contributions
hered-1.242
oken-1.192
ives-1.098
pped-1.091
 risking-1.089
Token is
Token ID318
Feature activation+2.37e+00

Pos logit contributions
lf+5.307
 flavors+4.891
cest+4.613
 kinds+4.556
 sizes+4.461
Neg logit contributions
hered-3.003
 grounded-2.707
oken-2.680
 Leave-2.638
ives-2.634
Token very
Token ID845
Feature activation+4.23e+00

Pos logit contributions
lf+4.411
 flavors+4.250
 kinds+4.113
cest+3.920
oc+3.815
Neg logit contributions
oken-3.407
hered-3.316
 grounded-3.264
pped-3.205
 risking-3.083
Token tall
Token ID7331
Feature activation+1.73e+00

Pos logit contributions
lf+7.607
 flavors+7.450
 sizes+7.206
 kinds+7.091
cest+7.048
Neg logit contributions
oken-7.036
 grounded-6.989
 okay-6.370
Come-6.296
oning-6.249
Token and
Token ID290
Feature activation+3.17e+00

Pos logit contributions
lf+3.655
 flavors+3.453
 kinds+3.325
 sizes+3.266
 oval+3.159
Neg logit contributions
hered-2.115
oken-1.912
ives-1.806
 grounded-1.790
pped-1.746
Token has
Token ID468
Feature activation+1.69e+00

Pos logit contributions
lf+5.891
 flavors+5.297
 kinds+5.181
cest+5.140
omo+4.600
Neg logit contributions
 grounded-4.928
Come-4.760
oken-4.645
hered-4.582
 risking-4.469
Token many
Token ID867
Feature activation+3.30e+00

Pos logit contributions
lf+3.571
 flavors+3.201
 kinds+3.077
 sizes+2.979
cest+2.975
Neg logit contributions
oken-2.201
hered-2.124
pped-1.959
 risking-1.958
 grounded-1.906
Token colors
Token ID7577
Feature activation+9.82e-01

Pos logit contributions
lf+5.028
cest+4.342
 flavors+4.309
 kinds+4.177
master+4.121
Neg logit contributions
oken-6.390
 consequences-5.815
ives-5.467
 risking-5.449
pped-5.445
Token Ben
Token ID3932
Feature activation+2.26e-02

Pos logit contributions
hered+0.467
oken+0.457
pped+0.417
 grounded+0.416
 risking+0.415
Neg logit contributions
lf-0.470
 flavors-0.453
 kinds-0.429
 sizes-0.428
 oval-0.402
Token wants
Token ID3382
Feature activation+8.09e-01

Pos logit contributions
lf+0.040
 flavors+0.038
 kinds+0.036
 sizes+0.036
 oval+0.034
Neg logit contributions
hered-0.036
oken-0.035
 grounded-0.032
pped-0.032
 risking-0.032
Token to
Token ID284
Feature activation+1.63e+00

Pos logit contributions
lf+1.872
 flavors+1.782
 sizes+1.719
 kinds+1.702
cest+1.656
Neg logit contributions
hered-0.882
oken-0.819
 risking-0.726
pped-0.725
 grounded-0.721
Token make
Token ID787
Feature activation+1.77e+00

Pos logit contributions
lf+3.219
 flavors+3.110
 sizes+3.053
 kinds+2.952
cest+2.824
Neg logit contributions
hered-2.388
oken-2.203
pped-2.034
oning-1.997
 Leave-1.977
Token a
Token ID257
Feature activation+1.92e+00

Pos logit contributions
lf+3.868
 flavors+3.757
 sizes+3.568
 kinds+3.507
 oval+3.389
Neg logit contributions
oken-2.142
hered-2.088
pped-1.866
 grounded-1.806
 permission-1.791
Token very
Token ID845
Feature activation+4.23e+00

Pos logit contributions
lf+3.266
 flavors+3.211
 kinds+3.054
 sizes+3.000
cest+2.661
Neg logit contributions
hered-3.188
oken-3.115
 grounded-2.942
oning-2.835
 risking-2.745
Token long
Token ID890
Feature activation+1.50e+00

Pos logit contributions
 flavors+8.097
 sizes+8.085
 kinds+7.870
lf+7.849
 sorts+6.803
Neg logit contributions
oken-7.321
 grounded-6.654
 okay-6.326
oning-6.287
 somewhere-6.121
Token rail
Token ID6787
Feature activation+1.43e+00

Pos logit contributions
 flavors+2.294
lf+2.275
 kinds+2.117
 sizes+2.034
cest+1.947
Neg logit contributions
hered-2.733
oken-2.591
oning-2.381
 grounded-2.355
oned-2.311
Token.
Token ID13
Feature activation+1.97e+00

Pos logit contributions
lf+2.258
 flavors+2.202
 kinds+2.007
 sizes+1.984
cest+1.868
Neg logit contributions
hered-2.635
oken-2.498
 grounded-2.336
ives-2.312
pped-2.288
Token He
Token ID679
Feature activation+2.95e-02

Pos logit contributions
lf+3.124
 flavors+2.768
cest+2.606
 sizes+2.566
 kinds+2.529
Neg logit contributions
hered-3.749
oken-3.697
ives-3.579
oning-3.502
oned-3.409
Token takes
Token ID2753
Feature activation+9.40e-01

Pos logit contributions
lf+0.050
 flavors+0.048
 kinds+0.045
 sizes+0.045
cest+0.043
Neg logit contributions
hered-0.050
oken-0.049
 grounded-0.044
pped-0.044
 risking-0.044
Token Tim
Token ID5045
Feature activation+8.29e-01

Pos logit contributions
lf+1.483
 flavors+1.420
 sizes+1.346
 kinds+1.340
cest+1.334
Neg logit contributions
hered-1.235
oken-1.175
ives-1.084
oning-1.082
 grounded-1.065
Tokenmy
Token ID1820
Feature activation+1.38e+00

Pos logit contributions
lf+1.587
 flavors+1.528
 sizes+1.480
 kinds+1.434
cest+1.406
Neg logit contributions
hered-1.184
oken-1.159
pped-1.075
 grounded-1.061
ives-1.020
Token loved
Token ID6151
Feature activation+1.43e+00

Pos logit contributions
lf+2.301
 sizes+2.245
 flavors+2.231
 kinds+2.162
cest+2.127
Neg logit contributions
hered-2.176
oken-2.068
 grounded-2.010
pped-2.000
ives-1.928
Token the
Token ID262
Feature activation+4.68e-01

Pos logit contributions
lf+2.429
 flavors+2.325
 sizes+2.296
cest+2.268
 kinds+2.261
Neg logit contributions
hered-2.181
oken-2.118
 risking-2.106
pped-2.010
 grounded-2.006
Token sandwich
Token ID20433
Feature activation+1.90e+00

Pos logit contributions
lf+0.665
 flavors+0.614
 sizes+0.585
 kinds+0.576
cest+0.550
Neg logit contributions
hered-0.913
oken-0.887
ives-0.816
 grounded-0.814
 risking-0.811
Token so
Token ID523
Feature activation+4.22e+00

Pos logit contributions
lf+3.545
 flavors+3.422
 sizes+3.255
 kinds+3.211
cest+3.142
Neg logit contributions
hered-2.803
oken-2.544
ives-2.494
pped-2.398
 grounded-2.368
Token much
Token ID881
Feature activation+1.91e+00

Pos logit contributions
 sizes+8.277
 oval+8.190
cest+7.857
lf+7.843
 flavors+7.748
Neg logit contributions
pped-6.579
oken-6.240
hered-6.130
 okay-6.041
Sleep-5.930
Token that
Token ID326
Feature activation+2.64e+00

Pos logit contributions
lf+3.378
 flavors+3.204
 sizes+3.157
 kinds+3.142
 oval+3.034
Neg logit contributions
hered-3.039
ives-2.713
 risking-2.705
oken-2.659
 grounded-2.658
Token he
Token ID339
Feature activation+9.41e-01

Pos logit contributions
 sizes+3.367
 flavors+3.331
 kinds+3.255
lf+3.214
cest+3.096
Neg logit contributions
hered-5.079
 grounded-5.012
pped-4.964
 risking-4.896
Come-4.876
Token asked
Token ID1965
Feature activation+9.16e-01

Pos logit contributions
lf+1.672
 flavors+1.623
 sizes+1.596
 kinds+1.591
cest+1.505
Neg logit contributions
hered-1.416
oken-1.376
pped-1.317
 grounded-1.299
 risking-1.258
Token his
Token ID465
Feature activation+1.10e+00

Pos logit contributions
lf+1.024
 flavors+1.006
 sizes+0.963
 kinds+0.915
 oval+0.868
Neg logit contributions
hered-2.026
oken-1.998
pped-1.901
 grounded-1.901
 permission-1.884
Token"
Token ID1
Feature activation+8.42e-01

Pos logit contributions
lf+2.147
 flavors+1.981
 sizes+1.958
 kinds+1.954
 oval+1.811
Neg logit contributions
hered-2.333
oken-2.161
pped-2.129
 grounded-2.101
Sleep-2.073
TokenYes
Token ID5297
Feature activation+1.45e+00

Pos logit contributions
lf+1.760
 flavors+1.658
 kinds+1.607
 sizes+1.571
cest+1.517
Neg logit contributions
hered-1.121
oken-1.004
pped-0.950
 risking-0.941
ives-0.940
Token,
Token ID11
Feature activation-2.05e-01

Pos logit contributions
lf+2.278
 flavors+2.246
 sizes+2.226
 kinds+2.217
 sorts+1.965
Neg logit contributions
hered-2.623
oken-2.490
pped-2.395
oning-2.342
 grounded-2.293
Token it
Token ID340
Feature activation+1.24e+00

Pos logit contributions
hered+0.353
oken+0.345
pped+0.315
 grounded+0.315
 risking+0.314
Neg logit contributions
lf-0.343
 flavors-0.329
 kinds-0.312
 sizes-0.311
 oval-0.291
Token is
Token ID318
Feature activation+2.64e+00

Pos logit contributions
lf+2.249
 flavors+2.109
 sizes+2.009
 kinds+1.992
cest+1.935
Neg logit contributions
hered-1.953
oken-1.868
ives-1.753
 grounded-1.733
pped-1.691
Token very
Token ID845
Feature activation+4.22e+00

Pos logit contributions
lf+4.356
 flavors+3.861
 kinds+3.578
lights+3.525
cest+3.494
Neg logit contributions
hered-4.537
oken-4.399
 grounded-4.191
 OK-4.135
 okay-4.110
Token tall
Token ID7331
Feature activation+1.71e+00

Pos logit contributions
lf+7.908
 flavors+7.378
 sizes+7.226
cest+7.085
lights+6.914
Neg logit contributions
oken-6.941
 grounded-6.566
 okay-6.343
ives-6.241
oning-6.230
Token and
Token ID290
Feature activation+2.82e+00

Pos logit contributions
lf+2.520
 flavors+2.317
 kinds+2.144
 sizes+2.121
 oval+2.040
Neg logit contributions
hered-3.241
oken-2.990
ives-2.922
 grounded-2.916
 risking-2.831
Token pretty
Token ID2495
Feature activation+1.60e+00

Pos logit contributions
lf+5.513
 flavors+5.011
 kinds+4.712
cest+4.663
omo+4.621
Neg logit contributions
 grounded-4.177
hered-4.014
oken-3.859
 risking-3.836
Come-3.704
Token.
Token ID13
Feature activation+3.30e-01

Pos logit contributions
lf+2.306
 flavors+2.039
 sizes+1.870
 kinds+1.859
 oval+1.792
Neg logit contributions
hered-3.215
oken-2.905
ives-2.882
 grounded-2.819
 risking-2.811
Token I
Token ID314
Feature activation+5.05e-01

Pos logit contributions
lf+0.648
 flavors+0.619
 kinds+0.590
 sizes+0.580
cest+0.556
Neg logit contributions
hered-0.471
oken-0.452
 grounded-0.415
ives-0.410
pped-0.407
Token They
Token ID1119
Feature activation+1.30e+00

Pos logit contributions
hered+0.620
oken+0.605
pped+0.556
 grounded+0.556
 risking+0.555
Neg logit contributions
lf-0.559
 flavors-0.539
 kinds-0.510
 sizes-0.509
 oval-0.476
Token made
Token ID925
Feature activation+1.43e+00

Pos logit contributions
lf+2.464
 flavors+2.343
 kinds+2.214
 sizes+2.205
cest+2.163
Neg logit contributions
hered-1.905
oken-1.816
pped-1.746
 grounded-1.679
ives-1.640
Token the
Token ID262
Feature activation+1.81e+00

Pos logit contributions
lf+3.005
 flavors+2.718
 sizes+2.627
 kinds+2.581
cest+2.546
Neg logit contributions
oken-1.908
hered-1.770
pped-1.645
 risking-1.602
 grounded-1.575
Token sand
Token ID6450
Feature activation+2.47e+00

Pos logit contributions
lf+2.627
 flavors+2.210
 kinds+2.206
 sizes+2.142
cest+2.011
Neg logit contributions
oken-3.564
hered-3.343
 grounded-3.134
 consequences-3.131
 permission-3.121
Tokencastle
Token ID18676
Feature activation+3.05e+00

Pos logit contributions
lf+3.872
 flavors+3.462
 kinds+3.345
 sizes+3.236
cest+2.972
Neg logit contributions
hered-4.455
oken-4.313
ives-4.000
oning-3.912
 somewhere-3.854
Token very
Token ID845
Feature activation+4.19e+00

Pos logit contributions
lf+6.853
 flavors+6.315
 kinds+5.945
 sizes+5.915
cest+5.803
Neg logit contributions
hered-3.736
oken-3.321
ives-3.199
 somewhere-3.054
oning-3.039
Token big
Token ID1263
Feature activation+5.47e-01

Pos logit contributions
lf+8.256
 flavors+6.835
 kinds+6.570
 sizes+6.410
 oval+6.242
Neg logit contributions
oken-7.382
add-6.347
 okay-6.287
 grounded-6.227
oning-6.220
Token and
Token ID290
Feature activation+2.66e+00

Pos logit contributions
lf+1.015
 flavors+0.955
 kinds+0.900
 sizes+0.892
 oval+0.857
Neg logit contributions
hered-0.831
oken-0.795
 risking-0.729
 grounded-0.727
ives-0.725
Token strong
Token ID1913
Feature activation+1.54e+00

Pos logit contributions
lf+4.575
 flavors+4.099
cest+4.028
 kinds+3.923
 sizes+3.690
Neg logit contributions
hered-4.230
oken-4.195
 grounded-4.104
Come-4.056
pped-3.880
Token.
Token ID13
Feature activation-3.33e-01

Pos logit contributions
lf+2.471
 flavors+2.145
 kinds+2.027
 sizes+1.944
cest+1.904
Neg logit contributions
hered-2.890
oken-2.687
 risking-2.562
pped-2.534
ives-2.533
Token
Token ID198
Feature activation+1.39e-01

Pos logit contributions
hered+0.561
oken+0.548
pped+0.500
 grounded+0.500
 risking+0.499
Neg logit contributions
lf-0.565
 flavors-0.546
 kinds-0.517
 sizes-0.516
 oval-0.485
Token was
Token ID373
Feature activation+1.84e+00

Pos logit contributions
lf+3.954
 flavors+3.851
 kinds+3.650
cest+3.576
 sizes+3.573
Neg logit contributions
hered-3.308
oken-2.957
 grounded-2.878
pped-2.872
 poisoned-2.755
Token taller
Token ID25242
Feature activation+2.72e+00

Pos logit contributions
lf+3.461
 flavors+3.275
 kinds+3.059
cest+3.039
 sizes+2.943
Neg logit contributions
oken-2.747
hered-2.561
 grounded-2.495
pped-2.452
 risking-2.410
Token than
Token ID621
Feature activation-1.46e-01

Pos logit contributions
lf+4.372
 flavors+3.663
 kinds+3.512
 sizes+3.346
 oval+3.231
Neg logit contributions
hered-5.066
oken-4.888
ives-4.674
 consequences-4.389
 risking-4.379
Token them
Token ID606
Feature activation+4.51e-01

Pos logit contributions
hered+0.247
oken+0.242
pped+0.221
 grounded+0.220
 risking+0.220
Neg logit contributions
lf-0.246
 flavors-0.236
 kinds-0.224
 sizes-0.223
 oval-0.210
Token and
Token ID290
Feature activation+2.17e+00

Pos logit contributions
lf+0.625
 flavors+0.597
 kinds+0.558
 sizes+0.550
 oval+0.517
Neg logit contributions
hered-0.899
oken-0.870
pped-0.821
 risking-0.814
ives-0.807
Token very
Token ID845
Feature activation+4.19e+00

Pos logit contributions
lf+3.970
 flavors+3.697
 kinds+3.542
cest+3.465
 sizes+3.331
Neg logit contributions
hered-3.193
 risking-3.041
oken-3.017
pped-2.957
 grounded-2.929
Token strong
Token ID1913
Feature activation+1.05e+00

Pos logit contributions
lf+8.017
 flavors+7.574
omo+6.863
cest+6.579
 sizes+6.527
Neg logit contributions
oken-7.262
 grounded-6.794
 poisoned-6.203
ives-6.194
 okay-6.124
Token.
Token ID13
Feature activation+1.85e-01

Pos logit contributions
lf+1.537
 flavors+1.396
 kinds+1.298
 sizes+1.253
 oval+1.225
Neg logit contributions
hered-2.078
oken-1.952
 risking-1.860
ives-1.840
pped-1.837
Token
Token ID198
Feature activation+2.36e-01

Pos logit contributions
lf+0.342
 flavors+0.326
 kinds+0.310
 sizes+0.309
cest+0.294
Neg logit contributions
hered-0.289
oken-0.282
 grounded-0.254
pped-0.253
 risking-0.253
Token
Token ID198
Feature activation+1.06e-01

Pos logit contributions
lf+0.453
 flavors+0.439
 sizes+0.420
 kinds+0.417
 oval+0.398
Neg logit contributions
hered-0.344
oken-0.334
pped-0.302
 grounded-0.300
 risking-0.299
Token"
Token ID1
Feature activation+3.96e-01

Pos logit contributions
lf+0.167
 flavors+0.161
 kinds+0.152
 sizes+0.151
 oval+0.141
Neg logit contributions
hered-0.192
oken-0.187
pped-0.174
 grounded-0.173
 risking-0.172
Token.
Token ID13
Feature activation+3.89e-01

Pos logit contributions
lf+1.036
 flavors+0.999
 kinds+0.918
 sizes+0.911
cest+0.850
Neg logit contributions
hered-1.491
oken-1.444
ives-1.358
pped-1.352
 grounded-1.350
Token He
Token ID679
Feature activation+2.93e-01

Pos logit contributions
lf+0.744
 flavors+0.717
 kinds+0.680
 sizes+0.677
cest+0.648
Neg logit contributions
hered-0.565
oken-0.547
 grounded-0.498
pped-0.495
ives-0.493
Token likes
Token ID7832
Feature activation+1.11e+00

Pos logit contributions
lf+0.482
 flavors+0.455
 kinds+0.434
 sizes+0.433
cest+0.408
Neg logit contributions
hered-0.502
oken-0.496
 grounded-0.457
pped-0.454
 risking-0.451
Token his
Token ID465
Feature activation+8.91e-01

Pos logit contributions
lf+1.939
 flavors+1.828
 kinds+1.754
 sizes+1.742
cest+1.682
Neg logit contributions
hered-1.709
oken-1.683
 risking-1.613
pped-1.548
 grounded-1.542
Token hat
Token ID6877
Feature activation+1.92e+00

Pos logit contributions
lf+1.212
 flavors+1.144
 kinds+1.103
 sizes+1.102
cest+1.018
Neg logit contributions
hered-1.736
oken-1.710
 grounded-1.576
 risking-1.573
pped-1.557
Token very
Token ID845
Feature activation+4.18e+00

Pos logit contributions
lf+3.750
 flavors+3.560
 kinds+3.471
 sizes+3.390
 oval+3.216
Neg logit contributions
hered-2.785
oken-2.465
pped-2.398
ives-2.347
 grounded-2.320
Token much
Token ID881
Feature activation+1.33e+00

Pos logit contributions
lf+8.239
 flavors+7.820
 sizes+7.650
 oval+7.604
 kinds+7.509
Neg logit contributions
oken-6.317
 somewhere-5.819
Sleep-5.591
hered-5.587
 okay-5.397
Token.
Token ID13
Feature activation+1.29e+00

Pos logit contributions
lf+2.139
 flavors+1.910
 kinds+1.790
 sizes+1.786
 oval+1.728
Neg logit contributions
hered-2.460
oken-2.292
ives-2.275
 grounded-2.205
 risking-2.174
Token
Token ID198
Feature activation+8.54e-01

Pos logit contributions
lf+2.367
 flavors+2.218
 kinds+2.106
 sizes+2.075
cest+2.021
Neg logit contributions
hered-2.023
oken-1.934
 Leave-1.844
ives-1.837
Sleep-1.825
Token
Token ID198
Feature activation+1.14e+00

Pos logit contributions
lf+1.723
 flavors+1.706
 sizes+1.644
 kinds+1.607
cest+1.557
Neg logit contributions
oken-1.102
hered-1.096
 grounded-0.991
pped-0.982
 risking-0.969
Token"
Token ID1
Feature activation-2.49e-01

Pos logit contributions
 flavors+1.881
lf+1.850
 sizes+1.794
 kinds+1.783
cest+1.644
Neg logit contributions
hered-1.913
oken-1.846
 grounded-1.811
pped-1.777
oning-1.755
Token top
Token ID1353
Feature activation+2.83e+00

Pos logit contributions
lf+0.821
 flavors+0.792
 sizes+0.761
 kinds+0.760
cest+0.718
Neg logit contributions
hered-0.878
oken-0.848
pped-0.797
 risking-0.774
oning-0.772
Token.
Token ID13
Feature activation+1.27e-02

Pos logit contributions
lf+5.680
 flavors+5.506
 sizes+5.184
 kinds+5.039
cest+5.008
Neg logit contributions
hered-4.055
oken-3.706
ives-3.650
pped-3.439
oning-3.341
Token The
Token ID383
Feature activation+1.91e+00

Pos logit contributions
lf+0.024
 flavors+0.023
 kinds+0.022
 sizes+0.022
cest+0.021
Neg logit contributions
hered-0.019
oken-0.019
pped-0.017
 grounded-0.017
 risking-0.017
Token tower
Token ID10580
Feature activation+1.51e+00

Pos logit contributions
lf+2.555
 flavors+2.379
 kinds+2.244
 sizes+2.204
cest+2.169
Neg logit contributions
oken-3.873
hered-3.752
 grounded-3.559
oning-3.499
 consequences-3.474
Token is
Token ID318
Feature activation+2.24e+00

Pos logit contributions
lf+2.726
 flavors+2.583
 kinds+2.458
 sizes+2.451
cest+2.432
Neg logit contributions
hered-2.261
oken-2.206
 grounded-2.078
ives-2.032
pped-2.020
Token very
Token ID845
Feature activation+4.17e+00

Pos logit contributions
lf+3.496
 flavors+3.329
 kinds+3.152
cest+3.000
 sizes+2.927
Neg logit contributions
oken-4.041
hered-3.940
pped-3.815
 grounded-3.730
 risking-3.619
Token tall
Token ID7331
Feature activation+1.46e+00

Pos logit contributions
lf+8.335
 flavors+7.672
cest+7.416
 sizes+7.361
 kinds+7.111
Neg logit contributions
oken-6.967
 grounded-6.446
 okay-6.127
 poisoned-6.093
oning-5.823
Token.
Token ID13
Feature activation-4.82e-01

Pos logit contributions
lf+2.732
 flavors+2.578
 kinds+2.469
 sizes+2.408
 oval+2.330
Neg logit contributions
hered-2.192
oken-2.006
pped-1.880
ives-1.873
 risking-1.864
Token
Token ID198
Feature activation+4.79e-02

Pos logit contributions
hered+0.769
oken+0.749
 grounded+0.683
pped+0.681
 risking+0.681
Neg logit contributions
lf-0.856
 flavors-0.831
 kinds-0.790
 sizes-0.786
 oval-0.746
Token
Token ID198
Feature activation+2.36e-01

Pos logit contributions
lf+0.090
 flavors+0.087
 sizes+0.083
 kinds+0.082
 oval+0.078
Neg logit contributions
hered-0.072
oken-0.070
pped-0.063
 grounded-0.063
 risking-0.063
Token"
Token ID1
Feature activation+3.21e-01

Pos logit contributions
lf+0.365
 flavors+0.352
 kinds+0.333
 sizes+0.332
cest+0.307
Neg logit contributions
hered-0.437
oken-0.422
pped-0.396
 grounded-0.391
oning-0.388
Token it
Token ID340
Feature activation-5.35e-01

Pos logit contributions
hered+0.074
oken+0.073
 risking+0.067
 grounded+0.067
pped+0.067
Neg logit contributions
lf-0.050
 flavors-0.048
 kinds-0.045
 sizes-0.045
 oval-0.042
Token wanted
Token ID2227
Feature activation-1.15e+00

Pos logit contributions
hered+1.053
oken+1.038
pped+0.955
 grounded+0.952
 risking+0.951
Neg logit contributions
lf-0.766
 flavors-0.729
 kinds-0.688
 sizes-0.684
 oval-0.628
Token him
Token ID683
Feature activation-6.85e-01

Pos logit contributions
hered+1.761
oken+1.689
pped+1.556
 grounded+1.542
ives+1.534
Neg logit contributions
lf-2.158
 flavors-2.156
 kinds-2.058
 sizes-2.027
 oval-1.916
Token to
Token ID284
Feature activation-5.42e-01

Pos logit contributions
hered+0.958
oken+0.933
pped+0.830
 grounded+0.825
 risking+0.813
Neg logit contributions
lf-1.350
 flavors-1.329
 kinds-1.276
 sizes-1.262
cest-1.201
Token take
Token ID1011
Feature activation-6.70e-01

Pos logit contributions
hered+1.014
oken+0.993
 grounded+0.915
 risking+0.914
pped+0.913
Neg logit contributions
lf-0.819
 flavors-0.785
 kinds-0.741
 sizes-0.733
 oval-0.685
Token the
Token ID262
Feature activation-3.88e+00

Pos logit contributions
hered+1.211
oken+1.184
 grounded+1.092
 risking+1.091
pped+1.089
Neg logit contributions
lf-1.042
 flavors-1.013
 kinds-0.958
 sizes-0.949
 oval-0.878
Token sack
Token ID23704
Feature activation+8.21e-02

Pos logit contributions
hered+8.565
oken+8.494
ives+8.198
oned+8.165
pped+8.147
Neg logit contributions
lf-5.422
 flavors-5.213
 tower-5.209
 kinds-4.719
 sizes-4.629
Token.
Token ID13
Feature activation+1.92e+00

Pos logit contributions
lf+0.136
 flavors+0.131
 sizes+0.123
 kinds+0.123
 oval+0.115
Neg logit contributions
hered-0.142
oken-0.137
pped-0.126
 grounded-0.126
 risking-0.125
Token It
Token ID632
Feature activation+6.87e-01

Pos logit contributions
lf+3.500
 flavors+3.268
 sizes+3.037
cest+3.002
 oval+2.995
Neg logit contributions
oken-3.093
hered-2.979
oning-2.776
 Leave-2.772
Come-2.764
Token was
Token ID373
Feature activation-1.12e-01

Pos logit contributions
lf+1.163
 flavors+1.108
 sizes+1.047
 kinds+1.037
cest+0.986
Neg logit contributions
hered-1.166
oken-1.104
 grounded-1.036
pped-1.027
 risking-1.015
Token so
Token ID523
Feature activation+1.73e+00

Pos logit contributions
hered+0.195
oken+0.191
 grounded+0.175
pped+0.175
 risking+0.175
Neg logit contributions
lf-0.182
 flavors-0.175
 kinds-0.165
 sizes-0.164
 oval-0.154
Token for
Token ID329
Feature activation-1.12e+00

Pos logit contributions
hered+0.149
oken+0.145
 grounded+0.134
pped+0.134
 risking+0.134
Neg logit contributions
lf-0.108
 flavors-0.103
 kinds-0.096
 sizes-0.096
 oval-0.089
Token telling
Token ID5149
Feature activation-7.35e-01

Pos logit contributions
hered+1.761
oken+1.717
ives+1.597
pped+1.518
oning+1.499
Neg logit contributions
lf-2.114
 flavors-1.955
 kinds-1.892
 sizes-1.848
 oval-1.769
Token her
Token ID607
Feature activation-4.70e-01

Pos logit contributions
hered+1.296
oken+1.252
pped+1.156
 risking+1.144
 grounded+1.144
Neg logit contributions
lf-1.196
 flavors-1.149
 kinds-1.100
 sizes-1.084
 oval-0.999
Token to
Token ID284
Feature activation-3.41e-01

Pos logit contributions
hered+0.824
oken+0.802
pped+0.734
 grounded+0.727
 risking+0.726
Neg logit contributions
lf-0.776
 flavors-0.742
 kinds-0.706
 sizes-0.701
cest-0.652
Token take
Token ID1011
Feature activation-1.18e+00

Pos logit contributions
hered+0.595
oken+0.582
pped+0.531
 risking+0.530
 grounded+0.530
Neg logit contributions
lf-0.557
 flavors-0.534
 kinds-0.505
 sizes-0.502
 oval-0.470
Token the
Token ID262
Feature activation-3.77e+00

Pos logit contributions
hered+1.851
oken+1.827
 risking+1.664
 grounded+1.655
pped+1.645
Neg logit contributions
lf-2.050
 flavors-2.002
 kinds-1.938
 sizes-1.869
 oval-1.767
Token cat
Token ID3797
Feature activation-2.35e+00

Pos logit contributions
pped+7.272
oken+7.069
hered+6.973
oned+6.911
 risking+6.473
Neg logit contributions
 flavors-6.336
lf-6.042
 kinds-5.964
 type-5.783
 tower-5.776
Token to
Token ID284
Feature activation-6.73e-01

Pos logit contributions
oken+4.508
hered+4.216
pped+4.113
 risking+4.023
Grab+3.923
Neg logit contributions
lf-3.467
 kinds-3.381
 flavors-3.294
 sizes-3.273
 tower-3.094
Token the
Token ID262
Feature activation-1.67e+00

Pos logit contributions
hered+1.200
oken+1.171
pped+1.078
ives+1.065
 grounded+1.065
Neg logit contributions
lf-1.081
 flavors-1.033
 kinds-0.982
 sizes-0.964
 oval-0.910
Token vet
Token ID20202
Feature activation-7.72e-01

Pos logit contributions
hered+2.252
oken+2.176
pped+2.017
ives+1.987
oned+1.887
Neg logit contributions
lf-3.460
 flavors-3.373
 tower-3.199
 kinds-3.112
 sizes-3.086
Token.
Token ID13
Feature activation+8.43e-02

Pos logit contributions
hered+1.708
oken+1.702
pped+1.593
 risking+1.578
 grounded+1.574
Neg logit contributions
lf-0.884
 flavors-0.829
 kinds-0.778
 sizes-0.773
 oval-0.700
Token bridge
Token ID7696
Feature activation-2.29e-02

Pos logit contributions
hered+1.178
oken+1.148
pped+1.088
 risking+1.077
 grounded+1.076
Neg logit contributions
lf-0.500
 flavors-0.468
 sizes-0.427
 kinds-0.426
 oval-0.383
Token.
Token ID13
Feature activation-2.02e-01

Pos logit contributions
hered+0.048
oken+0.047
 grounded+0.044
pped+0.044
 risking+0.044
Neg logit contributions
lf-0.029
 flavors-0.028
 kinds-0.026
 sizes-0.026
 oval-0.024
Token A
Token ID317
Feature activation+5.31e-01

Pos logit contributions
hered+0.352
oken+0.344
pped+0.315
 grounded+0.314
 risking+0.314
Neg logit contributions
lf-0.330
 flavors-0.317
 kinds-0.300
 sizes-0.299
 oval-0.280
Token small
Token ID1402
Feature activation-4.14e-01

Pos logit contributions
lf+0.752
 flavors+0.737
 kinds+0.685
 sizes+0.683
 oval+0.623
Neg logit contributions
hered-1.038
oken-0.999
 risking-0.937
 grounded-0.935
oning-0.914
Token mouse
Token ID10211
Feature activation-1.38e+00

Pos logit contributions
hered+0.703
oken+0.691
pped+0.636
 risking+0.628
 grounded+0.628
Neg logit contributions
lf-0.693
 flavors-0.666
 kinds-0.636
 sizes-0.630
 oval-0.596
Token stepped
Token ID10764
Feature activation-3.74e+00

Pos logit contributions
oken+2.528
hered+2.433
oning+2.263
ives+2.248
 grounded+2.236
Neg logit contributions
lf-2.255
 flavors-2.051
 kinds-1.985
 sizes-1.934
cest-1.865
Token forward
Token ID2651
Feature activation-1.46e+00

Pos logit contributions
oken+6.437
hered+6.406
 risking+6.288
ives+6.232
Sleep+6.216
Neg logit contributions
lf-6.093
 tall-5.673
 oval-5.472
 his-5.376
 blue-5.344
Token and
Token ID290
Feature activation-7.30e-01

Pos logit contributions
oken+2.789
hered+2.764
 grounded+2.570
pped+2.543
ives+2.506
Neg logit contributions
lf-2.130
 flavors-1.967
 kinds-1.960
 sizes-1.918
cest-1.842
Token said
Token ID531
Feature activation-7.00e-01

Pos logit contributions
hered+1.257
oken+1.252
pped+1.130
ives+1.124
 risking+1.113
Neg logit contributions
lf-1.213
 flavors-1.162
 kinds-1.106
 sizes-1.090
 oval-1.031
Token that
Token ID326
Feature activation-4.03e-01

Pos logit contributions
hered+1.511
oken+1.499
pped+1.387
 grounded+1.387
ives+1.375
Neg logit contributions
lf-0.855
 flavors-0.796
 kinds-0.732
 sizes-0.725
 oval-0.673
Token he
Token ID339
Feature activation-1.36e+00

Pos logit contributions
hered+0.854
oken+0.839
pped+0.780
 grounded+0.775
 risking+0.771
Neg logit contributions
lf-0.515
 flavors-0.487
 kinds-0.455
 sizes-0.452
 oval-0.411
Token
Token ID198
Feature activation+2.44e-01

Pos logit contributions
lf+0.189
 flavors+0.185
 sizes+0.176
 kinds+0.176
 oval+0.167
Neg logit contributions
hered-0.151
oken-0.146
pped-0.133
 grounded-0.132
 risking-0.131
TokenL
Token ID43
Feature activation-1.55e+00

Pos logit contributions
lf+0.378
 flavors+0.373
 kinds+0.353
 sizes+0.351
 oval+0.325
Neg logit contributions
hered-0.445
oken-0.429
pped-0.404
 grounded-0.400
 risking-0.394
Tokenily
Token ID813
Feature activation+5.80e-01

Pos logit contributions
hered+1.823
oken+1.611
 risking+1.550
oning+1.404
 grounded+1.392
Neg logit contributions
lf-3.616
 flavors-3.366
 kinds-3.363
 sizes-3.184
cest-3.128
Token's
Token ID338
Feature activation-1.04e+00

Pos logit contributions
lf+1.002
 flavors+0.994
 sizes+0.944
 kinds+0.943
cest+0.874
Neg logit contributions
hered-0.942
oken-0.900
pped-0.846
 grounded-0.838
 risking-0.816
Token mom
Token ID1995
Feature activation-2.38e-01

Pos logit contributions
hered+2.484
oken+2.397
pped+2.307
 risking+2.297
ives+2.287
Neg logit contributions
lf-1.110
 flavors-0.998
 tower-0.887
 kinds-0.881
 sizes-0.874
Token called
Token ID1444
Feature activation-3.69e+00

Pos logit contributions
hered+0.406
oken+0.397
pped+0.361
 grounded+0.361
 risking+0.361
Neg logit contributions
lf-0.401
 flavors-0.385
 kinds-0.365
 sizes-0.363
 oval-0.341
Token her
Token ID607
Feature activation-2.27e+00

Pos logit contributions
ives+9.200
oken+8.930
hered+8.926
pped+8.436
 risking+8.357
Neg logit contributions
lf-4.101
 kinds-3.353
 his-3.229
 tower-2.981
 type-2.948
Token to
Token ID284
Feature activation-2.35e+00

Pos logit contributions
oken+3.680
hered+3.584
pped+3.426
 risking+3.291
ives+3.188
Neg logit contributions
lf-4.087
 flavors-3.821
 kinds-3.775
 sizes-3.668
 tower-3.636
Token come
Token ID1282
Feature activation-2.09e+00

Pos logit contributions
oken+3.818
hered+3.810
ives+3.687
 risking+3.565
pped+3.445
Neg logit contributions
lf-4.272
 flavors-3.891
 kinds-3.806
 type-3.617
 tower-3.555
Token inside
Token ID2641
Feature activation-4.63e-01

Pos logit contributions
oken+3.689
hered+3.554
 risking+3.399
ives+3.348
pped+3.248
Neg logit contributions
lf-3.424
 flavors-3.187
 kinds-3.125
 tower-3.038
cest-3.019
Token,
Token ID11
Feature activation-1.55e+00

Pos logit contributions
hered+0.859
oken+0.844
pped+0.777
 risking+0.772
 grounded+0.772
Neg logit contributions
lf-0.708
 flavors-0.674
 kinds-0.642
 sizes-0.630
 oval-0.588
Token
Token ID198
Feature activation-5.04e-01

Pos logit contributions
lf+0.129
 flavors+0.125
 sizes+0.119
 kinds+0.119
 oval+0.113
Neg logit contributions
hered-0.103
oken-0.101
pped-0.091
 grounded-0.090
 risking-0.090
TokenL
Token ID43
Feature activation-1.99e+00

Pos logit contributions
hered+0.952
oken+0.930
 risking+0.856
pped+0.850
 grounded+0.850
Neg logit contributions
lf-0.768
 flavors-0.720
 kinds-0.675
 sizes-0.674
 oval-0.630
Tokenily
Token ID813
Feature activation-7.70e-01

Pos logit contributions
hered+2.452
 risking+2.167
oken+2.147
oning+1.926
 grounded+1.866
Neg logit contributions
lf-4.567
 kinds-4.236
 flavors-4.178
 sizes-3.955
 tower-3.873
Token's
Token ID338
Feature activation-4.22e-01

Pos logit contributions
hered+1.376
oken+1.351
 risking+1.246
ives+1.230
 grounded+1.225
Neg logit contributions
lf-1.239
 flavors-1.171
 kinds-1.099
 sizes-1.088
 oval-1.034
Token mom
Token ID1995
Feature activation-1.39e+00

Pos logit contributions
hered+0.946
oken+0.924
pped+0.867
 risking+0.867
 grounded+0.863
Neg logit contributions
lf-0.493
 flavors-0.461
 kinds-0.421
 sizes-0.419
 oval-0.383
Token called
Token ID1444
Feature activation-3.68e+00

Pos logit contributions
hered+2.454
oken+2.440
 risking+2.238
ives+2.236
Sleep+2.183
Neg logit contributions
lf-2.325
 flavors-2.088
 kinds-2.010
 sizes-1.929
 oval-1.862
Token for
Token ID329
Feature activation-2.18e+00

Pos logit contributions
hered+8.780
ives+8.710
oken+8.646
 risking+8.289
pped+8.235
Neg logit contributions
lf-4.324
 kinds-3.578
 type-3.248
 his-3.246
 tower-3.145
Token her
Token ID607
Feature activation-2.43e+00

Pos logit contributions
hered+3.907
oken+3.627
ives+3.528
 risking+3.413
pped+3.409
Neg logit contributions
lf-3.827
 flavors-3.485
 kinds-3.310
 sizes-3.084
 type-3.049
Token,
Token ID11
Feature activation-2.13e+00

Pos logit contributions
hered+4.487
oken+4.388
pped+4.383
 risking+4.084
Grab+4.064
Neg logit contributions
lf-3.800
 flavors-3.685
 kinds-3.656
 sizes-3.564
 tower-3.370
Token but
Token ID475
Feature activation-1.60e+00

Pos logit contributions
hered+4.565
oken+4.517
ives+4.226
pped+4.126
 risking+3.945
Neg logit contributions
lf-2.805
 flavors-2.576
 sizes-2.501
 kinds-2.470
 oval-2.390
Token Lily
Token ID20037
Feature activation-1.03e+00

Pos logit contributions
hered+3.327
oken+3.285
ives+3.139
oning+2.973
 risking+2.962
Neg logit contributions
lf-2.292
 flavors-1.963
 kinds-1.876
 sizes-1.848
 oval-1.711
Token so
Token ID523
Feature activation+1.29e+00

Pos logit contributions
hered+1.783
oken+1.725
ives+1.625
pped+1.595
oning+1.582
Neg logit contributions
lf-1.307
 flavors-1.248
 sizes-1.187
 kinds-1.187
 oval-1.119
Token y
Token ID331
Feature activation+5.04e-01

Pos logit contributions
lf+1.973
 flavors+1.874
 sizes+1.746
 kinds+1.716
cest+1.666
Neg logit contributions
oken-2.432
hered-2.348
 grounded-2.277
 risking-2.187
 poisoned-2.173
Tokenucky
Token ID5309
Feature activation-1.24e-01

Pos logit contributions
lf+0.745
 flavors+0.706
 sizes+0.656
 kinds+0.651
cest+0.609
Neg logit contributions
hered-0.977
oken-0.961
pped-0.885
ives-0.875
oning-0.868
Token that
Token ID326
Feature activation+3.63e-01

Pos logit contributions
hered+0.227
oken+0.222
pped+0.204
 risking+0.204
 grounded+0.204
Neg logit contributions
lf-0.191
 flavors-0.183
 kinds-0.172
 sizes-0.172
 oval-0.160
Token everyone
Token ID2506
Feature activation+1.64e-01

Pos logit contributions
lf+0.567
 flavors+0.545
 kinds+0.512
 sizes+0.512
 oval+0.487
Neg logit contributions
hered-0.644
oken-0.625
 risking-0.584
 grounded-0.584
pped-0.578
Token stepped
Token ID10764
Feature activation-3.67e+00

Pos logit contributions
lf+0.274
 flavors+0.265
 sizes+0.249
 kinds+0.248
 oval+0.236
Neg logit contributions
hered-0.279
oken-0.271
 grounded-0.251
 risking-0.250
pped-0.249
Token back
Token ID736
Feature activation-3.50e-01

Pos logit contributions
hered+6.507
oken+6.389
 risking+6.326
ives+6.187
Sleep+6.028
Neg logit contributions
lf-5.666
 kinds-5.383
 flavors-5.339
 shape-5.241
 tall-5.232
Token and
Token ID290
Feature activation-3.97e-01

Pos logit contributions
hered+0.537
oken+0.524
pped+0.472
 grounded+0.470
 risking+0.470
Neg logit contributions
lf-0.646
 flavors-0.624
 kinds-0.596
 sizes-0.592
 oval-0.559
Token made
Token ID925
Feature activation-4.99e-01

Pos logit contributions
hered+0.711
oken+0.698
pped+0.638
 grounded+0.635
 risking+0.634
Neg logit contributions
lf-0.633
 flavors-0.607
 kinds-0.573
 sizes-0.571
 oval-0.533
Token a
Token ID257
Feature activation+1.65e+00

Pos logit contributions
hered+0.830
oken+0.802
pped+0.734
 risking+0.732
 grounded+0.731
Neg logit contributions
lf-0.857
 flavors-0.832
 kinds-0.795
 sizes-0.782
 oval-0.734
Token face
Token ID1986
Feature activation+1.44e-01

Pos logit contributions
lf+2.520
 flavors+2.385
 sizes+2.341
 kinds+2.325
cest+2.043
Neg logit contributions
hered-3.057
oken-3.006
 grounded-2.861
 risking-2.795
oning-2.747
Token Lily
Token ID20037
Feature activation-3.41e-01

Pos logit contributions
hered+0.139
oken+0.135
 grounded+0.124
pped+0.124
 risking+0.124
Neg logit contributions
lf-0.125
 flavors-0.121
 kinds-0.114
 sizes-0.114
 oval-0.106
Token and
Token ID290
Feature activation+3.47e-02

Pos logit contributions
hered+0.583
oken+0.571
 risking+0.519
pped+0.519
 grounded+0.518
Neg logit contributions
lf-0.569
 flavors-0.545
 kinds-0.515
 sizes-0.512
 oval-0.482
Token her
Token ID607
Feature activation+4.55e-01

Pos logit contributions
lf+0.061
 flavors+0.059
 sizes+0.056
 kinds+0.056
 oval+0.052
Neg logit contributions
hered-0.057
oken-0.055
 grounded-0.050
pped-0.050
 risking-0.050
Token mom
Token ID1995
Feature activation-1.01e-02

Pos logit contributions
lf+0.590
 flavors+0.561
 sizes+0.534
 kinds+0.533
cest+0.496
Neg logit contributions
hered-0.926
oken-0.924
 grounded-0.854
 risking-0.838
pped-0.838
Tokenmy
Token ID1820
Feature activation-7.62e-01

Pos logit contributions
hered+0.016
oken+0.016
pped+0.014
 grounded+0.014
 risking+0.014
Neg logit contributions
lf-0.018
 flavors-0.017
 kinds-0.016
 sizes-0.016
 oval-0.015
Token stepped
Token ID10764
Feature activation-3.67e+00

Pos logit contributions
hered+1.300
oken+1.298
 risking+1.175
 grounded+1.164
ives+1.154
Neg logit contributions
lf-1.273
 flavors-1.221
 kinds-1.153
 sizes-1.138
 oval-1.065
Token out
Token ID503
Feature activation-9.10e-01

Pos logit contributions
 risking+6.346
hered+6.190
ives+6.108
oken+6.066
Sleep+5.875
Neg logit contributions
lf-6.108
 kinds-5.501
 flavors-5.478
 oval-5.428
 blue-5.394
Token of
Token ID286
Feature activation-1.12e+00

Pos logit contributions
hered+1.716
oken+1.675
pped+1.571
 risking+1.564
 grounded+1.545
Neg logit contributions
lf-1.360
 flavors-1.284
 kinds-1.273
 sizes-1.214
 oval-1.145
Token the
Token ID262
Feature activation-1.22e+00

Pos logit contributions
hered+2.371
oken+2.335
ives+2.233
pped+2.166
 grounded+2.165
Neg logit contributions
lf-1.431
 flavors-1.359
 kinds-1.297
 sizes-1.220
 oval-1.152
Token truck
Token ID7779
Feature activation-5.95e-01

Pos logit contributions
hered+2.706
oken+2.608
pped+2.505
ives+2.464
 risking+2.456
Neg logit contributions
lf-1.578
 flavors-1.457
 kinds-1.347
 tower-1.289
 sizes-1.286
Token and
Token ID290
Feature activation+1.76e-01

Pos logit contributions
hered+1.080
oken+1.079
 risking+0.986
pped+0.983
 grounded+0.980
Neg logit contributions
lf-0.924
 flavors-0.879
 kinds-0.833
 sizes-0.828
 oval-0.778
Token But
Token ID887
Feature activation-2.19e-01

Pos logit contributions
hered+0.821
oken+0.806
pped+0.734
 grounded+0.732
 risking+0.732
Neg logit contributions
lf-0.826
 flavors-0.792
 kinds-0.751
 sizes-0.747
 oval-0.706
Token then
Token ID788
Feature activation+6.64e-02

Pos logit contributions
hered+0.377
oken+0.368
pped+0.336
 grounded+0.336
 risking+0.335
Neg logit contributions
lf-0.363
 flavors-0.349
 kinds-0.330
 sizes-0.329
 oval-0.308
Token,
Token ID11
Feature activation-1.07e+00

Pos logit contributions
lf+0.090
 flavors+0.086
 kinds+0.081
 sizes+0.081
 oval+0.074
Neg logit contributions
hered-0.135
oken-0.132
pped-0.123
 risking-0.123
 grounded-0.122
Token her
Token ID607
Feature activation-7.15e-01

Pos logit contributions
hered+2.113
oken+2.106
ives+1.957
 grounded+1.906
oning+1.896
Neg logit contributions
lf-1.561
 flavors-1.432
 kinds-1.326
 sizes-1.303
 tower-1.238
Token mom
Token ID1995
Feature activation-1.08e+00

Pos logit contributions
hered+1.602
oken+1.551
 risking+1.474
pped+1.468
 grounded+1.461
Neg logit contributions
lf-0.845
 flavors-0.787
 kinds-0.707
 sizes-0.706
 tower-0.664
Token called
Token ID1444
Feature activation-3.60e+00

Pos logit contributions
hered+1.941
oken+1.940
 risking+1.770
ives+1.743
 grounded+1.725
Neg logit contributions
lf-1.776
 flavors-1.626
 kinds-1.555
 sizes-1.521
 oval-1.449
Token her
Token ID607
Feature activation-2.34e+00

Pos logit contributions
ives+8.361
hered+8.150
oken+8.036
 risking+7.726
 grounded+7.533
Neg logit contributions
lf-4.600
 kinds-3.693
 his-3.613
 flavors-3.394
 type-3.375
Token in
Token ID287
Feature activation-2.01e+00

Pos logit contributions
oken+4.243
hered+4.207
pped+4.024
 risking+3.951
oning+3.763
Neg logit contributions
lf-3.773
 flavors-3.557
 kinds-3.493
 tower-3.335
 sizes-3.313
Token for
Token ID329
Feature activation-2.12e+00

Pos logit contributions
hered+4.212
oken+4.166
pped+4.069
ives+3.992
 risking+3.931
Neg logit contributions
lf-2.593
 flavors-2.428
 kinds-2.399
 sizes-2.180
 type-2.066
Token dinner
Token ID8073
Feature activation+6.12e-01

Pos logit contributions
hered+3.662
ives+3.504
oken+3.436
 risking+3.246
 grounded+3.219
Neg logit contributions
lf-3.781
 flavors-3.511
 kinds-3.243
 oval-3.038
 sizes-3.033
Token.
Token ID13
Feature activation-1.31e+00

Pos logit contributions
lf+0.828
 flavors+0.798
 sizes+0.754
 kinds+0.747
 oval+0.700
Neg logit contributions
hered-1.229
oken-1.186
 grounded-1.104
pped-1.101
ives-1.100
Token door
Token ID3420
Feature activation+1.52e+00

Pos logit contributions
hered+1.364
oken+1.310
pped+1.228
 risking+1.219
 grounded+1.205
Neg logit contributions
lf-0.992
 flavors-0.945
 kinds-0.882
 sizes-0.877
 tower-0.815
Token.
Token ID13
Feature activation-7.46e-01

Pos logit contributions
lf+2.424
 flavors+2.282
 sizes+2.165
 kinds+2.084
cest+2.042
Neg logit contributions
hered-2.744
oken-2.645
ives-2.522
oning-2.481
 grounded-2.456
Token She
Token ID1375
Feature activation+9.19e-01

Pos logit contributions
hered+1.262
oken+1.222
pped+1.121
 risking+1.113
 grounded+1.107
Neg logit contributions
lf-1.278
 flavors-1.221
 kinds-1.158
 sizes-1.152
 oval-1.098
Token saw
Token ID2497
Feature activation-4.43e-01

Pos logit contributions
lf+1.606
 flavors+1.578
 sizes+1.557
 kinds+1.536
cest+1.449
Neg logit contributions
hered-1.415
oken-1.390
 grounded-1.310
pped-1.304
 risking-1.254
Token Lily
Token ID20037
Feature activation-1.33e+00

Pos logit contributions
hered+0.836
oken+0.815
pped+0.749
 risking+0.748
 grounded+0.748
Neg logit contributions
lf-0.671
 flavors-0.642
 kinds-0.606
 sizes-0.601
 oval-0.558
Token crou
Token ID33838
Feature activation-3.58e+00

Pos logit contributions
oken+2.264
hered+2.258
 risking+2.030
ives+2.021
 grounded+2.010
Neg logit contributions
lf-2.327
 flavors-2.130
 kinds-2.033
 sizes-2.027
 tower-1.965
Tokenching
Token ID10813
Feature activation-1.69e+00

Pos logit contributions
 risking+4.567
Sleep+4.539
hered+4.528
oken+4.371
 grounded+4.265
Neg logit contributions
lf-6.529
v-6.523
 oval-6.281
 kinds-6.249
cest-6.150
Token behind
Token ID2157
Feature activation-1.20e+00

Pos logit contributions
oken+2.905
hered+2.799
 risking+2.711
 grounded+2.645
pped+2.607
Neg logit contributions
lf-2.727
 flavors-2.696
 kinds-2.678
 sizes-2.617
 oval-2.434
Token the
Token ID262
Feature activation-5.91e-01

Pos logit contributions
hered+2.369
oken+2.338
ives+2.202
 grounded+2.195
 risking+2.166
Neg logit contributions
lf-1.714
 flavors-1.597
 kinds-1.507
 sizes-1.474
 tower-1.387
Token clothes
Token ID8242
Feature activation-2.90e-01

Pos logit contributions
hered+1.095
oken+1.066
pped+0.983
 risking+0.979
 grounded+0.977
Neg logit contributions
lf-0.923
 flavors-0.875
 sizes-0.822
 kinds-0.820
cest-0.760
Token.
Token ID13
Feature activation-7.97e-01

Pos logit contributions
hered+0.563
oken+0.552
 grounded+0.509
pped+0.509
 risking+0.508
Neg logit contributions
lf-0.413
 flavors-0.393
 kinds-0.369
 sizes-0.367
 oval-0.340
Token stop
Token ID2245
Feature activation-1.09e+00

Pos logit contributions
hered+0.817
oken+0.799
pped+0.714
ives+0.709
 risking+0.707
Neg logit contributions
lf-1.084
 flavors-1.042
 kinds-0.995
 sizes-0.981
 oval-0.933
Token the
Token ID262
Feature activation-2.57e+00

Pos logit contributions
hered+2.158
oken+2.128
ives+1.977
 grounded+1.955
pped+1.937
Neg logit contributions
lf-1.560
 flavors-1.453
 sizes-1.361
 kinds-1.358
 oval-1.262
Token thief
Token ID25906
Feature activation-1.58e+00

Pos logit contributions
hered+4.045
oken+4.016
pped+3.819
oned+3.651
ives+3.615
Neg logit contributions
lf-5.073
 tower-4.740
 flavors-4.726
 kinds-4.309
 sizes-4.309
Token.
Token ID13
Feature activation+3.28e-02

Pos logit contributions
oken+3.537
hered+3.389
pped+3.237
 grounded+3.175
 risking+3.171
Neg logit contributions
lf-1.900
 flavors-1.692
 kinds-1.682
 sizes-1.663
 tower-1.531
Token She
Token ID1375
Feature activation-6.32e-01

Pos logit contributions
lf+0.062
 flavors+0.059
 sizes+0.056
 kinds+0.056
cest+0.053
Neg logit contributions
hered-0.049
oken-0.048
 grounded-0.043
 risking-0.043
pped-0.043
Token stepped
Token ID10764
Feature activation-3.57e+00

Pos logit contributions
hered+1.095
oken+1.085
 risking+0.980
pped+0.974
ives+0.971
Neg logit contributions
lf-1.057
 flavors-1.006
 kinds-0.948
 sizes-0.935
 oval-0.889
Token in
Token ID287
Feature activation-1.88e+00

Pos logit contributions
oken+6.437
 risking+6.282
hered+6.234
Sleep+5.994
ives+5.755
Neg logit contributions
lf-5.480
 tall-5.218
 tower-5.022
 oval-4.994
 flavors-4.983
Token front
Token ID2166
Feature activation-1.22e-01

Pos logit contributions
oken+3.760
hered+3.741
 risking+3.536
pped+3.526
 grounded+3.499
Neg logit contributions
lf-2.585
 flavors-2.506
 kinds-2.471
 sizes-2.207
 oval-2.128
Token of
Token ID286
Feature activation-1.16e+00

Pos logit contributions
hered+0.226
oken+0.222
pped+0.203
 grounded+0.203
 risking+0.203
Neg logit contributions
lf-0.188
 flavors-0.180
 sizes-0.169
 kinds-0.169
 oval-0.157
Token the
Token ID262
Feature activation-1.91e+00

Pos logit contributions
hered+2.419
oken+2.331
ives+2.240
 grounded+2.192
pped+2.168
Neg logit contributions
lf-1.528
 flavors-1.495
 kinds-1.434
 sizes-1.340
 oval-1.238
Token thief
Token ID25906
Feature activation-1.73e+00

Pos logit contributions
hered+2.776
oken+2.705
pped+2.554
ives+2.428
oned+2.422
Neg logit contributions
lf-3.842
 flavors-3.696
 tower-3.579
 kinds-3.447
 sizes-3.384
Token while
Token ID981
Feature activation-1.28e+00

Pos logit contributions
lf+1.097
 flavors+1.090
 kinds+1.045
 sizes+1.025
cest+0.943
Neg logit contributions
hered-1.321
oken-1.226
 risking-1.160
oning-1.144
 grounded-1.140
Token,
Token ID11
Feature activation-1.22e+00

Pos logit contributions
oken+2.930
hered+2.895
ives+2.784
 grounded+2.747
 risking+2.730
Neg logit contributions
lf-1.420
 flavors-1.165
 kinds-1.106
 sizes-1.058
 oval-1.056
Token Lily
Token ID20037
Feature activation-5.46e-01

Pos logit contributions
oken+2.445
hered+2.443
ives+2.285
 grounded+2.239
 risking+2.209
Neg logit contributions
lf-1.736
 flavors-1.567
 kinds-1.452
 sizes-1.437
 tower-1.378
Token's
Token ID338
Feature activation-3.62e-01

Pos logit contributions
hered+0.929
oken+0.915
 risking+0.837
 grounded+0.831
pped+0.821
Neg logit contributions
lf-0.917
 flavors-0.872
 kinds-0.821
 sizes-0.814
 oval-0.773
Token mom
Token ID1995
Feature activation-1.40e+00

Pos logit contributions
hered+0.764
oken+0.747
 risking+0.696
pped+0.695
 grounded+0.695
Neg logit contributions
lf-0.464
 flavors-0.440
 kinds-0.406
 sizes-0.403
 oval-0.373
Token called
Token ID1444
Feature activation-3.54e+00

Pos logit contributions
oken+2.382
hered+2.330
 risking+2.176
ives+2.113
 grounded+2.090
Neg logit contributions
lf-2.468
 flavors-2.226
 kinds-2.135
 sizes-2.060
cest-2.006
Token her
Token ID607
Feature activation-2.27e+00

Pos logit contributions
ives+8.598
hered+8.521
oken+8.439
 risking+8.027
pped+7.947
Neg logit contributions
lf-3.885
 kinds-3.270
 his-3.166
 type-2.817
 tower-2.786
Token inside
Token ID2641
Feature activation-8.33e-01

Pos logit contributions
oken+3.985
hered+3.801
pped+3.668
 risking+3.604
 Leave+3.449
Neg logit contributions
lf-3.847
 flavors-3.546
 kinds-3.518
 sizes-3.396
 tower-3.331
Token for
Token ID329
Feature activation-2.22e+00

Pos logit contributions
hered+1.584
oken+1.553
pped+1.441
 risking+1.431
 grounded+1.430
Neg logit contributions
lf-1.236
 flavors-1.176
 kinds-1.137
 sizes-1.086
 oval-1.015
Token some
Token ID617
Feature activation-9.68e-01

Pos logit contributions
hered+3.939
oken+3.793
ives+3.769
 Leave+3.561
 grounded+3.541
Neg logit contributions
lf-3.919
 flavors-3.552
 kinds-3.242
 oval-3.068
 tower-3.063
Token hot
Token ID3024
Feature activation-3.90e-01

Pos logit contributions
hered+1.822
oken+1.728
 grounded+1.631
pped+1.625
 risking+1.622
Neg logit contributions
lf-1.484
 flavors-1.456
 kinds-1.353
 sizes-1.316
 tower-1.245
Token She
Token ID1375
Feature activation+2.17e-01

Pos logit contributions
hered+0.527
oken+0.515
pped+0.466
 grounded+0.465
 risking+0.464
Neg logit contributions
lf-0.595
 flavors-0.573
 kinds-0.546
 sizes-0.544
 oval-0.513
Token thanked
Token ID26280
Feature activation-1.72e+00

Pos logit contributions
lf+0.388
 flavors+0.372
 sizes+0.356
 kinds+0.356
 oval+0.334
Neg logit contributions
hered-0.343
oken-0.331
pped-0.306
 grounded-0.305
 risking-0.302
Token her
Token ID607
Feature activation-9.08e-01

Pos logit contributions
oken+2.991
hered+2.947
ives+2.755
 grounded+2.675
 risking+2.655
Neg logit contributions
lf-2.909
 flavors-2.691
 kinds-2.552
 sizes-2.457
 oval-2.437
Token cat
Token ID3797
Feature activation-1.70e+00

Pos logit contributions
hered+1.516
oken+1.493
pped+1.364
 risking+1.356
 grounded+1.320
Neg logit contributions
lf-1.583
 flavors-1.515
 kinds-1.444
 sizes-1.438
 oval-1.363
Token for
Token ID329
Feature activation-1.60e+00

Pos logit contributions
oken+3.126
hered+3.023
pped+2.849
 risking+2.826
 grounded+2.713
Neg logit contributions
lf-2.742
 flavors-2.554
 kinds-2.517
 sizes-2.450
 tower-2.371
Token the
Token ID262
Feature activation-3.54e+00

Pos logit contributions
oken+2.668
hered+2.655
ives+2.518
oning+2.348
oned+2.312
Neg logit contributions
lf-2.892
 flavors-2.597
 kinds-2.523
 sizes-2.419
 oval-2.352
Token special
Token ID2041
Feature activation-2.69e+00

Pos logit contributions
pped+9.906
hered+9.882
oken+9.590
oned+9.563
ives+9.441
Neg logit contributions
 flavors-2.799
lf-2.774
 kinds-2.499
 tower-2.451
 type-2.408
Token gift
Token ID6979
Feature activation-6.28e-01

Pos logit contributions
oken+6.299
pped+6.100
 grounded+5.977
hered+5.962
 risking+5.941
Neg logit contributions
lf-3.240
 flavors-3.088
 kinds-2.973
 type-2.821
 oval-2.659
Token.
Token ID13
Feature activation-2.65e-01

Pos logit contributions
oken+1.134
hered+1.134
pped+1.032
 risking+1.031
 grounded+1.025
Neg logit contributions
lf-0.982
 flavors-0.934
 kinds-0.889
 sizes-0.880
 oval-0.822
Token
Token ID198
Feature activation-4.75e-01

Pos logit contributions
hered+0.417
oken+0.407
pped+0.368
 grounded+0.368
 risking+0.367
Neg logit contributions
lf-0.478
 flavors-0.461
 kinds-0.438
 sizes-0.437
 oval-0.412
Token
Token ID198
Feature activation-7.91e-01

Pos logit contributions
hered+0.649
oken+0.631
 risking+0.559
 grounded+0.558
pped+0.557
Neg logit contributions
lf-0.959
 flavors-0.918
 kinds-0.881
 sizes-0.872
 oval-0.830
Token't
Token ID470
Feature activation-2.45e-01

Pos logit contributions
oken+1.465
hered+1.408
 risking+1.230
ives+1.155
 grounded+1.129
Neg logit contributions
lf-4.163
 flavors-3.823
 kinds-3.707
 sizes-3.641
 tower-3.606
Token worry
Token ID5490
Feature activation+4.94e-01

Pos logit contributions
hered+0.379
oken+0.370
pped+0.334
 grounded+0.333
 risking+0.332
Neg logit contributions
lf-0.452
 flavors-0.436
 kinds-0.415
 sizes-0.413
 oval-0.391
Token,
Token ID11
Feature activation-7.67e-01

Pos logit contributions
lf+0.691
 flavors+0.677
 kinds+0.627
 sizes+0.622
 oval+0.572
Neg logit contributions
hered-0.983
oken-0.951
 grounded-0.881
pped-0.875
 risking-0.873
Token Lily
Token ID20037
Feature activation+3.18e-01

Pos logit contributions
hered+1.481
oken+1.447
pped+1.327
ives+1.319
 risking+1.312
Neg logit contributions
lf-1.152
 flavors-1.089
 sizes-1.023
 kinds-1.016
 oval-0.946
Token.
Token ID13
Feature activation-1.90e+00

Pos logit contributions
lf+0.414
 flavors+0.398
 kinds+0.365
 sizes+0.363
 oval+0.338
Neg logit contributions
hered-0.662
oken-0.641
 grounded-0.602
 risking-0.600
pped-0.600
Token We
Token ID775
Feature activation-3.54e+00

Pos logit contributions
hered+3.413
oken+3.393
pped+3.128
 risking+3.076
ives+2.983
Neg logit contributions
lf-2.949
 flavors-2.804
 sizes-2.794
 kinds-2.718
 oval-2.686
Token'll
Token ID1183
Feature activation-1.71e+00

Pos logit contributions
hered+6.364
oken+6.203
Sleep+5.893
oning+5.825
 risking+5.783
Neg logit contributions
 kinds-5.838
lf-5.483
 flavors-5.424
 sizes-5.339
 sorts-5.214
Token find
Token ID1064
Feature activation-9.75e-01

Pos logit contributions
hered+3.091
oken+3.053
ives+2.857
 risking+2.811
pped+2.769
Neg logit contributions
lf-2.669
 flavors-2.504
 kinds-2.354
 sizes-2.345
cest-2.275
Token it
Token ID340
Feature activation-1.10e+00

Pos logit contributions
hered+1.846
oken+1.801
ives+1.651
oning+1.643
 risking+1.637
Neg logit contributions
lf-1.431
 flavors-1.409
 kinds-1.324
 sizes-1.289
cest-1.226
Token."
Token ID526
Feature activation-1.54e+00

Pos logit contributions
hered+1.992
oken+1.990
 risking+1.796
 grounded+1.770
pped+1.765
Neg logit contributions
lf-1.735
 flavors-1.655
 kinds-1.548
 sizes-1.546
cest-1.450
Token They
Token ID1119
Feature activation-1.39e+00

Pos logit contributions
oken+3.096
hered+3.087
 risking+2.907
pped+2.907
 grounded+2.848
Neg logit contributions
lf-2.081
 flavors-1.988
 kinds-1.910
 sizes-1.899
 oval-1.769
Token,
Token ID11
Feature activation-1.19e+00

Pos logit contributions
hered+1.888
oken+1.883
 risking+1.772
 grounded+1.768
pped+1.751
Neg logit contributions
lf-0.974
 flavors-0.889
 kinds-0.827
 sizes-0.809
 oval-0.739
Token Lily
Token ID20037
Feature activation-5.90e-02

Pos logit contributions
hered+2.389
oken+2.371
ives+2.192
 grounded+2.178
 risking+2.147
Neg logit contributions
lf-1.714
 flavors-1.561
 kinds-1.460
 sizes-1.446
 tower-1.379
Token's
Token ID338
Feature activation-7.66e-01

Pos logit contributions
hered+0.096
oken+0.093
pped+0.085
 grounded+0.085
 risking+0.085
Neg logit contributions
lf-0.103
 flavors-0.099
 kinds-0.094
 sizes-0.094
 oval-0.088
Token mom
Token ID1995
Feature activation-5.46e-01

Pos logit contributions
hered+1.701
oken+1.657
ives+1.564
 risking+1.561
pped+1.556
Neg logit contributions
lf-0.919
 flavors-0.867
 kinds-0.772
 sizes-0.764
 tower-0.733
Tokenmy
Token ID1820
Feature activation-1.06e+00

Pos logit contributions
hered+0.912
oken+0.899
 risking+0.814
 grounded+0.808
pped+0.800
Neg logit contributions
lf-0.948
 flavors-0.902
 kinds-0.857
 sizes-0.847
 oval-0.802
Token called
Token ID1444
Feature activation-3.53e+00

Pos logit contributions
oken+1.838
hered+1.807
 risking+1.642
 grounded+1.626
oning+1.608
Neg logit contributions
lf-1.808
 flavors-1.690
 kinds-1.593
 sizes-1.567
 oval-1.495
Token her
Token ID607
Feature activation-1.91e+00

Pos logit contributions
oken+8.209
ives+8.143
hered+8.116
 risking+7.806
pped+7.624
Neg logit contributions
lf-4.027
 kinds-3.545
 his-3.445
 flavors-3.152
 type-3.089
Token to
Token ID284
Feature activation-2.08e+00

Pos logit contributions
oken+2.994
hered+2.864
pped+2.687
 risking+2.662
oning+2.555
Neg logit contributions
lf-3.561
 flavors-3.414
 kinds-3.335
 sizes-3.215
 tower-3.178
Token go
Token ID467
Feature activation-2.12e+00

Pos logit contributions
hered+3.374
oken+3.333
ives+3.169
 risking+3.168
 grounded+3.072
Neg logit contributions
lf-3.719
 flavors-3.478
 kinds-3.317
 type-3.114
 tower-3.110
Token home
Token ID1363
Feature activation+4.37e-01

Pos logit contributions
oken+3.660
hered+3.559
 risking+3.286
pped+3.232
ives+3.220
Neg logit contributions
lf-3.482
 flavors-3.364
 kinds-3.264
 tower-3.128
cest-3.064
Token.
Token ID13
Feature activation-6.07e-01

Pos logit contributions
lf+0.617
 flavors+0.593
 sizes+0.560
 kinds+0.546
 oval+0.517
Neg logit contributions
hered-0.861
oken-0.839
 grounded-0.784
 risking-0.773
pped-0.770
Token Benny
Token ID44275
Feature activation-4.18e-01

Pos logit contributions
hered+2.011
oken+2.005
pped+1.833
 risking+1.823
 grounded+1.794
Neg logit contributions
lf-2.010
 flavors-1.897
 sizes-1.835
 kinds-1.835
 oval-1.757
Token thanked
Token ID26280
Feature activation-1.08e+00

Pos logit contributions
hered+0.793
oken+0.782
 risking+0.716
pped+0.716
 grounded+0.715
Neg logit contributions
lf-0.618
 flavors-0.590
 kinds-0.555
 sizes-0.551
 oval-0.513
Token the
Token ID262
Feature activation-2.36e+00

Pos logit contributions
hered+2.355
oken+2.310
ives+2.203
pped+2.157
 risking+2.133
Neg logit contributions
lf-1.365
 flavors-1.227
 kinds-1.182
 sizes-1.118
 oval-1.070
Token mole
Token ID9411
Feature activation-5.73e-01

Pos logit contributions
hered+4.567
oken+4.524
pped+4.331
ives+4.311
 risking+4.063
Neg logit contributions
lf-3.728
 flavors-3.529
 kinds-3.372
 tower-3.251
 sizes-3.246
Token for
Token ID329
Feature activation-1.31e+00

Pos logit contributions
hered+0.970
oken+0.961
pped+0.878
 risking+0.869
 grounded+0.866
Neg logit contributions
lf-0.957
 flavors-0.918
 kinds-0.881
 sizes-0.874
 oval-0.825
Token the
Token ID262
Feature activation-3.53e+00

Pos logit contributions
hered+2.756
oken+2.712
ives+2.597
pped+2.475
oning+2.464
Neg logit contributions
lf-1.782
 flavors-1.576
 kinds-1.530
 sizes-1.463
cest-1.384
Token adventure
Token ID8855
Feature activation-3.52e-01

Pos logit contributions
pped+8.457
hered+8.317
oken+7.984
ives+7.945
oned+7.716
Neg logit contributions
lf-4.308
 flavors-4.151
 tower-4.047
 kinds-4.039
 sizes-3.989
Token and
Token ID290
Feature activation+2.29e-01

Pos logit contributions
hered+0.620
oken+0.611
pped+0.558
 grounded+0.557
 risking+0.556
Neg logit contributions
lf-0.566
 flavors-0.542
 kinds-0.514
 sizes-0.511
 oval-0.477
Token hopped
Token ID45230
Feature activation-1.63e+00

Pos logit contributions
lf+0.389
 flavors+0.376
 sizes+0.356
 kinds+0.353
 oval+0.334
Neg logit contributions
hered-0.385
oken-0.370
 grounded-0.347
pped-0.342
 risking-0.342
Token back
Token ID736
Feature activation-3.68e-01

Pos logit contributions
oken+2.491
hered+2.376
 risking+2.280
ives+2.172
 grounded+2.163
Neg logit contributions
lf-2.963
 flavors-2.814
 kinds-2.754
 sizes-2.615
cest-2.587
Token up
Token ID510
Feature activation-6.55e-01

Pos logit contributions
hered+0.507
oken+0.495
pped+0.442
 grounded+0.440
 risking+0.438
Neg logit contributions
lf-0.734
 flavors-0.710
 kinds-0.682
 sizes-0.677
 oval-0.643
Token,
Token ID11
Feature activation-7.32e-01

Pos logit contributions
hered+0.438
oken+0.430
pped+0.399
 grounded+0.399
 risking+0.398
Neg logit contributions
lf-0.265
 flavors-0.251
 kinds-0.233
 sizes-0.232
 oval-0.213
Token "
Token ID366
Feature activation+7.93e-02

Pos logit contributions
oken+1.485
hered+1.483
pped+1.349
 grounded+1.346
 risking+1.343
Neg logit contributions
lf-1.001
 flavors-0.926
 sizes-0.862
 kinds-0.854
 oval-0.810
TokenThank
Token ID10449
Feature activation-7.62e-01

Pos logit contributions
lf+0.164
 flavors+0.158
 kinds+0.152
 sizes+0.151
 oval+0.144
Neg logit contributions
hered-0.104
oken-0.099
pped-0.090
 grounded-0.089
 risking-0.088
Token you
Token ID345
Feature activation+1.01e+00

Pos logit contributions
hered+1.048
oken+1.043
 risking+0.924
 grounded+0.916
pped+0.907
Neg logit contributions
lf-1.538
 flavors-1.462
 kinds-1.394
 sizes-1.370
 oval-1.320
Token for
Token ID329
Feature activation-5.99e-01

Pos logit contributions
 flavors+1.631
lf+1.610
 sizes+1.526
 kinds+1.518
cest+1.381
Neg logit contributions
hered-1.805
oken-1.717
 grounded-1.622
oning-1.584
pped-1.583
Token the
Token ID262
Feature activation-3.51e+00

Pos logit contributions
hered+0.857
oken+0.833
ives+0.747
 risking+0.731
pped+0.728
Neg logit contributions
lf-1.190
 flavors-1.129
 kinds-1.080
 sizes-1.065
 oval-1.022
Token cookies
Token ID14746
Feature activation+9.58e-01

Pos logit contributions
hered+9.240
pped+8.929
oned+8.579
oken+8.443
 Leave+8.401
Neg logit contributions
lf-3.916
 flavors-3.851
 tower-3.557
 kinds-3.412
 sizes-3.399
Token and
Token ID290
Feature activation+2.88e-01

Pos logit contributions
lf+1.356
 flavors+1.282
 sizes+1.205
 kinds+1.180
 oval+1.149
Neg logit contributions
hered-1.884
oken-1.745
ives-1.712
 grounded-1.694
pped-1.653
Token for
Token ID329
Feature activation-4.34e-01

Pos logit contributions
lf+0.444
 flavors+0.423
 sizes+0.398
 kinds+0.396
 oval+0.374
Neg logit contributions
hered-0.521
oken-0.509
 grounded-0.481
 risking-0.474
pped-0.472
Token helping
Token ID5742
Feature activation+8.49e-01

Pos logit contributions
hered+0.614
oken+0.595
ives+0.528
pped+0.526
 risking+0.525
Neg logit contributions
lf-0.863
 flavors-0.827
 kinds-0.791
 sizes-0.783
 oval-0.748
Token me
Token ID502
Feature activation+6.11e-01

Pos logit contributions
lf+1.465
 flavors+1.430
 sizes+1.391
 kinds+1.360
cest+1.290
Neg logit contributions
hered-1.375
oken-1.350
pped-1.273
 grounded-1.255
 risking-1.225
Token called
Token ID1444
Feature activation-1.87e+00

Pos logit contributions
hered+0.560
oken+0.549
pped+0.501
 risking+0.500
 grounded+0.500
Neg logit contributions
lf-0.508
 flavors-0.486
 kinds-0.459
 sizes-0.457
 oval-0.428
Token out
Token ID503
Feature activation+3.50e-01

Pos logit contributions
oken+3.497
hered+3.488
ives+3.250
pped+3.206
 risking+3.150
Neg logit contributions
lf-2.905
 kinds-2.625
 flavors-2.602
 sizes-2.486
 tower-2.358
Token to
Token ID284
Feature activation-1.56e+00

Pos logit contributions
lf+0.581
 flavors+0.573
 sizes+0.536
 kinds+0.532
cest+0.497
Neg logit contributions
hered-0.605
oken-0.584
 grounded-0.533
 risking-0.533
pped-0.529
Token Bobby
Token ID17155
Feature activation-6.73e-01

Pos logit contributions
oken+2.753
hered+2.749
ives+2.616
pped+2.482
 risking+2.472
Neg logit contributions
lf-2.572
 flavors-2.363
 kinds-2.294
 sizes-2.182
 type-2.060
Token,
Token ID11
Feature activation-1.17e+00

Pos logit contributions
hered+1.429
oken+1.420
pped+1.320
 risking+1.308
 grounded+1.307
Neg logit contributions
lf-0.834
 flavors-0.777
 kinds-0.737
 sizes-0.734
 oval-0.673
Token "
Token ID366
Feature activation-3.50e+00

Pos logit contributions
hered+2.375
oken+2.366
ives+2.186
pped+2.162
 grounded+2.113
Neg logit contributions
lf-1.573
 flavors-1.462
 sizes-1.388
 kinds-1.372
 oval-1.290
TokenWhere
Token ID8496
Feature activation-1.20e+00

Pos logit contributions
oken+6.530
ives+6.027
hered+5.919
 Drink+5.189
 grounded+5.174
Neg logit contributions
lf-6.558
 flavors-6.011
 kinds-5.416
cest-5.221
 sizes-5.105
Token is
Token ID318
Feature activation-6.88e-01

Pos logit contributions
hered+1.978
oken+1.889
pped+1.734
oning+1.732
 risking+1.729
Neg logit contributions
lf-2.138
 flavors-2.022
 kinds-1.964
 sizes-1.881
 tower-1.830
Token my
Token ID616
Feature activation-6.19e-01

Pos logit contributions
hered+1.370
oken+1.352
ives+1.245
pped+1.235
 grounded+1.222
Neg logit contributions
lf-0.974
 flavors-0.921
 kinds-0.862
 sizes-0.848
 oval-0.786
Token grandson
Token ID31845
Feature activation-6.46e-01

Pos logit contributions
hered+1.263
oken+1.243
pped+1.159
 risking+1.145
 grounded+1.145
Neg logit contributions
lf-0.836
 flavors-0.803
 kinds-0.744
 sizes-0.739
 tower-0.683
Token?
Token ID30
Feature activation-1.08e+00

Pos logit contributions
hered+1.162
oken+1.154
 risking+1.046
pped+1.046
ives+1.042
Neg logit contributions
lf-1.016
 flavors-0.965
 kinds-0.912
 sizes-0.895
cest-0.851
Token know
Token ID760
Feature activation-4.30e-01

Pos logit contributions
hered+0.386
oken+0.376
pped+0.339
 grounded+0.337
 risking+0.337
Neg logit contributions
lf-0.484
 flavors-0.467
 kinds-0.446
 sizes-0.445
 oval-0.420
Token.
Token ID13
Feature activation-7.17e-01

Pos logit contributions
hered+0.872
oken+0.853
 grounded+0.787
pped+0.786
 risking+0.783
Neg logit contributions
lf-0.584
 flavors-0.558
 kinds-0.519
 sizes-0.519
cest-0.473
Token The
Token ID383
Feature activation-3.04e+00

Pos logit contributions
hered+1.326
oken+1.283
pped+1.202
 risking+1.181
 grounded+1.168
Neg logit contributions
lf-1.101
 flavors-1.076
 sizes-1.025
 kinds-1.021
 oval-0.953
Token moral
Token ID6573
Feature activation-1.10e+00

Pos logit contributions
hered+6.679
pped+6.109
ives+6.082
oned+6.045
oken+6.039
Neg logit contributions
lf-4.718
 flavors-4.421
 tower-4.281
 sizes-3.948
 kinds-3.877
Token of
Token ID286
Feature activation-9.98e-01

Pos logit contributions
hered+2.266
oken+2.181
pped+2.048
oning+2.032
 risking+2.024
Neg logit contributions
lf-1.483
 flavors-1.465
 kinds-1.413
 sizes-1.327
 oval-1.217
Token the
Token ID262
Feature activation-3.48e+00

Pos logit contributions
hered+2.103
oken+2.064
ives+1.920
pped+1.900
oning+1.886
Neg logit contributions
lf-1.342
 flavors-1.230
 kinds-1.193
 sizes-1.129
 oval-1.047
Token story
Token ID1621
Feature activation-5.19e-01

Pos logit contributions
hered+8.271
pped+7.820
oned+7.580
ked+7.442
oken+7.327
Neg logit contributions
lf-5.345
 flavors-4.626
 kinds-4.337
 tower-4.267
 sizes-4.176
Token is
Token ID318
Feature activation-1.66e+00

Pos logit contributions
hered+0.980
oken+0.962
pped+0.887
 risking+0.879
 grounded+0.871
Neg logit contributions
lf-0.781
 flavors-0.748
 kinds-0.706
 sizes-0.697
cest-0.644
Token to
Token ID284
Feature activation-8.86e-01

Pos logit contributions
hered+3.335
oken+3.112
pped+2.992
ives+2.882
oned+2.880
Neg logit contributions
lf-2.393
 flavors-2.269
 kinds-2.228
 sizes-2.106
 oval-2.038
Token be
Token ID307
Feature activation-3.56e-01

Pos logit contributions
hered+1.625
oken+1.568
pped+1.454
ives+1.445
 risking+1.439
Neg logit contributions
lf-1.383
 flavors-1.325
 kinds-1.261
 sizes-1.253
 oval-1.173
Token careful
Token ID8161
Feature activation-8.42e-01

Pos logit contributions
hered+0.743
oken+0.727
pped+0.676
 risking+0.671
 grounded+0.669
Neg logit contributions
lf-0.461
 flavors-0.442
 kinds-0.413
 sizes-0.409
 oval-0.376
Token."
Token ID526
Feature activation-5.36e-01

Pos logit contributions
lf+0.404
 flavors+0.386
 kinds+0.357
 sizes+0.354
 oval+0.329
Neg logit contributions
hered-0.602
oken-0.586
 grounded-0.549
pped-0.546
 risking-0.544
Token\n
Token ID198
Feature activation+1.55e-01

Pos logit contributions
hered+0.885
oken+0.877
pped+0.795
 grounded+0.792
 risking+0.790
Neg logit contributions
lf-0.926
 flavors-0.883
 kinds-0.839
 sizes-0.834
 oval-0.791
Token\n
Token ID198
Feature activation+5.75e-01

Pos logit contributions
lf+0.291
 flavors+0.285
 sizes+0.272
 kinds+0.271
cest+0.256
Neg logit contributions
hered-0.230
oken-0.223
pped-0.201
 grounded-0.200
 risking-0.200
TokenGr
Token ID8642
Feature activation-2.68e-01

Pos logit contributions
lf+0.924
 flavors+0.918
 kinds+0.874
 sizes+0.868
cest+0.806
Neg logit contributions
hered-1.009
oken-0.961
pped-0.906
 grounded-0.901
oning-0.896
Tokenanny
Token ID7737
Feature activation-1.22e+00

Pos logit contributions
hered+0.586
oken+0.575
pped+0.537
 risking+0.536
 grounded+0.535
Neg logit contributions
lf-0.321
 flavors-0.303
 kinds-0.282
 sizes-0.280
 oval-0.254
Token stepped
Token ID10764
Feature activation-3.48e+00

Pos logit contributions
hered+2.428
oken+2.404
 risking+2.264
oning+2.220
ives+2.205
Neg logit contributions
lf-1.748
 flavors-1.598
 kinds-1.490
 sizes-1.438
 oval-1.408
Token closer
Token ID5699
Feature activation-2.56e-01

Pos logit contributions
 risking+6.170
oken+6.029
hered+5.938
ives+5.679
Sleep+5.621
Neg logit contributions
lf-5.772
 tall-4.992
 oval-4.978
 kinds-4.963
 tower-4.866
Token and
Token ID290
Feature activation-7.82e-01

Pos logit contributions
hered+0.389
oken+0.379
pped+0.342
 grounded+0.341
 risking+0.340
Neg logit contributions
lf-0.477
 flavors-0.461
 kinds-0.439
 sizes-0.437
 oval-0.413
Token looked
Token ID3114
Feature activation-7.48e-01

Pos logit contributions
hered+1.439
oken+1.436
 risking+1.293
pped+1.284
 grounded+1.282
Neg logit contributions
lf-1.207
 flavors-1.140
 kinds-1.077
 sizes-1.054
 oval-1.016
Token at
Token ID379
Feature activation-2.60e-02

Pos logit contributions
hered+1.236
oken+1.185
pped+1.082
ives+1.078
 risking+1.078
Neg logit contributions
lf-1.299
 flavors-1.248
 kinds-1.196
 sizes-1.184
 oval-1.121
Token the
Token ID262
Feature activation-1.68e+00

Pos logit contributions
hered+0.047
oken+0.046
pped+0.043
 grounded+0.042
 risking+0.042
Neg logit contributions
lf-0.041
 flavors-0.039
 sizes-0.037
 kinds-0.037
 oval-0.034
Token floor
Token ID4314
Feature activation+2.43e-01

Pos logit contributions
lf+1.333
 flavors+1.279
 kinds+1.258
 sizes+1.235
cest+1.144
Neg logit contributions
hered-1.334
oken-1.271
 grounded-1.190
pped-1.189
oning-1.185
Token.
Token ID13
Feature activation-6.03e-01

Pos logit contributions
lf+0.323
 flavors+0.310
 kinds+0.287
 sizes+0.287
 oval+0.266
Neg logit contributions
hered-0.490
oken-0.481
pped-0.450
 grounded-0.445
 risking-0.445
Token Lily
Token ID20037
Feature activation+6.66e-01

Pos logit contributions
hered+1.097
oken+1.073
 risking+0.993
pped+0.992
 grounded+0.984
Neg logit contributions
lf-0.952
 flavors-0.909
 kinds-0.857
 sizes-0.856
 oval-0.810
Token's
Token ID338
Feature activation+3.21e-01

Pos logit contributions
lf+1.107
 flavors+1.080
 sizes+1.051
 kinds+1.045
cest+0.985
Neg logit contributions
hered-1.103
oken-1.059
pped-0.997
 grounded-0.991
 risking-0.977
Token mom
Token ID1995
Feature activation+4.33e-01

Pos logit contributions
lf+0.416
 flavors+0.393
 sizes+0.374
 kinds+0.374
cest+0.340
Neg logit contributions
hered-0.662
oken-0.659
 grounded-0.606
pped-0.601
 risking-0.598
Token stepped
Token ID10764
Feature activation-3.47e+00

Pos logit contributions
lf+0.684
 flavors+0.668
 sizes+0.637
 kinds+0.629
cest+0.592
Neg logit contributions
hered-0.768
oken-0.739
pped-0.699
 grounded-0.695
 risking-0.680
Token on
Token ID319
Feature activation-1.61e+00

Pos logit contributions
hered+5.985
oken+5.853
 risking+5.778
ives+5.658
Sleep+5.482
Neg logit contributions
lf-5.840
 flavors-5.269
 kinds-5.235
 tall-5.084
 blue-5.007
Token it
Token ID340
Feature activation+2.92e-01

Pos logit contributions
oken+3.615
hered+3.585
ives+3.385
 grounded+3.331
 risking+3.266
Neg logit contributions
lf-2.037
 flavors-1.896
 kinds-1.715
 sizes-1.598
 tower-1.587
Token and
Token ID290
Feature activation+2.62e-01

Pos logit contributions
lf+0.553
 flavors+0.538
 sizes+0.508
 kinds+0.506
cest+0.482
Neg logit contributions
hered-0.434
oken-0.415
 grounded-0.377
pped-0.375
ives-0.373
Token it
Token ID340
Feature activation+1.54e+00

Pos logit contributions
lf+0.437
 flavors+0.427
 sizes+0.409
 kinds+0.406
 oval+0.380
Neg logit contributions
hered-0.444
oken-0.424
pped-0.400
 grounded-0.398
 risking-0.395
Token showed
Token ID3751
Feature activation-2.31e-01

Pos logit contributions
 flavors+2.689
lf+2.672
 kinds+2.667
 sizes+2.662
cest+2.518
Neg logit contributions
hered-2.416
oken-2.164
pped-2.158
 grounded-2.090
 poisoned-2.050
Token battery
Token ID6555
Feature activation+2.31e-01

Pos logit contributions
 flavors+1.696
lf+1.661
 sizes+1.645
 kinds+1.638
cest+1.427
Neg logit contributions
hered-1.959
oken-1.872
 grounded-1.791
 risking-1.721
Sleep-1.712
Token.
Token ID13
Feature activation+8.09e-01

Pos logit contributions
lf+0.362
 flavors+0.355
 kinds+0.332
 sizes+0.332
 oval+0.307
Neg logit contributions
hered-0.411
oken-0.396
pped-0.367
 grounded-0.367
 risking-0.366
Token It
Token ID632
Feature activation+9.28e-01

Pos logit contributions
lf+1.616
 flavors+1.582
 sizes+1.507
 kinds+1.482
cest+1.452
Neg logit contributions
hered-1.115
oken-1.041
pped-0.952
 grounded-0.941
oning-0.928
Token had
Token ID550
Feature activation+4.53e-01

Pos logit contributions
lf+1.492
 flavors+1.464
 kinds+1.372
 sizes+1.362
cest+1.282
Neg logit contributions
hered-1.611
oken-1.555
 grounded-1.477
pped-1.460
 risking-1.425
Token a
Token ID257
Feature activation+1.60e+00

Pos logit contributions
lf+0.882
 flavors+0.844
 kinds+0.802
 sizes+0.797
cest+0.758
Neg logit contributions
hered-0.644
oken-0.636
pped-0.580
 grounded-0.572
 risking-0.571
Token very
Token ID845
Feature activation+3.00e+00

Pos logit contributions
lf+2.192
 flavors+2.149
 kinds+2.104
 sizes+2.043
cest+1.832
Neg logit contributions
hered-3.152
oken-3.067
 grounded-2.955
 risking-2.898
 Drink-2.754
Token important
Token ID1593
Feature activation+2.99e-01

Pos logit contributions
lf+4.412
 sizes+4.388
 flavors+4.307
 kinds+4.270
cest+3.904
Neg logit contributions
oken-6.084
 grounded-5.454
hered-5.185
 poisoned-5.142
 risking-4.925
Token job
Token ID1693
Feature activation+4.87e-01

Pos logit contributions
lf+0.450
 flavors+0.429
 sizes+0.404
 kinds+0.402
 oval+0.373
Neg logit contributions
hered-0.560
oken-0.545
 grounded-0.502
 risking-0.501
pped-0.496
Token -
Token ID532
Feature activation-9.12e-02

Pos logit contributions
lf+0.779
 flavors+0.759
 kinds+0.713
 sizes+0.706
 oval+0.673
Neg logit contributions
hered-0.841
oken-0.818
 risking-0.759
 grounded-0.757
pped-0.748
Token to
Token ID284
Feature activation+1.16e+00

Pos logit contributions
hered+0.167
oken+0.163
 grounded+0.150
pped+0.150
 risking+0.150
Neg logit contributions
lf-0.140
 flavors-0.134
 kinds-0.127
 sizes-0.126
 oval-0.117
Token power
Token ID1176
Feature activation+1.19e+00

Pos logit contributions
 flavors+2.148
lf+2.121
 sizes+2.071
 kinds+2.030
cest+1.933
Neg logit contributions
oken-1.700
hered-1.696
 grounded-1.594
pped-1.531
 risking-1.489
Token.
Token ID13
Feature activation+1.49e+00

Pos logit contributions
lf+0.805
 flavors+0.779
 kinds+0.730
 sizes+0.723
cest+0.675
Neg logit contributions
hered-1.174
oken-1.146
pped-1.064
 risking-1.058
oning-1.055
Token They
Token ID1119
Feature activation+1.63e+00

Pos logit contributions
lf+2.644
 flavors+2.392
 kinds+2.213
cest+2.205
 sizes+2.164
Neg logit contributions
hered-2.468
oken-2.391
oning-2.232
ives-2.221
 Drink-2.188
Token liked
Token ID8288
Feature activation+1.40e+00

Pos logit contributions
 flavors+3.110
lf+3.075
 kinds+2.872
 sizes+2.802
 oval+2.727
Neg logit contributions
hered-2.293
oken-2.148
pped-2.132
 grounded-2.045
 risking-1.948
Token to
Token ID284
Feature activation+2.85e+00

Pos logit contributions
lf+3.291
 flavors+3.203
 sizes+3.046
 kinds+3.041
cest+3.037
Neg logit contributions
hered-1.242
oken-1.208
 risking-1.124
 grounded-1.051
pped-1.025
Token make
Token ID787
Feature activation+2.23e+00

Pos logit contributions
 flavors+5.278
lf+5.175
cest+4.967
 sizes+4.936
 kinds+4.926
Neg logit contributions
oken-3.984
hered-3.979
oning-3.818
oned-3.762
pped-3.577
Token different
Token ID1180
Feature activation+3.16e+00

Pos logit contributions
lf+3.992
 flavors+3.651
 sizes+3.431
cest+3.417
 kinds+3.355
Neg logit contributions
oken-3.610
hered-3.336
oning-3.215
 grounded-3.117
ives-3.096
Token things
Token ID1243
Feature activation+1.14e+00

Pos logit contributions
lf+5.418
 oval+4.375
 flavors+4.225
cest+4.212
omo+4.210
Neg logit contributions
oken-5.517
 consequences-5.269
ives-5.112
hered-5.106
oning-5.020
Token with
Token ID351
Feature activation+1.58e+00

Pos logit contributions
lf+2.039
 flavors+1.896
 kinds+1.787
 sizes+1.745
 oval+1.689
Neg logit contributions
hered-1.892
oken-1.731
ives-1.657
 risking-1.640
 grounded-1.639
Token the
Token ID262
Feature activation+8.23e-01

Pos logit contributions
lf+3.168
 flavors+2.960
 kinds+2.822
cest+2.820
 sizes+2.762
Neg logit contributions
oken-2.132
hered-2.114
pped-1.922
 risking-1.900
 permission-1.894
Token blocks
Token ID7021
Feature activation+1.39e+00

Pos logit contributions
lf+1.140
 flavors+1.050
 kinds+0.988
 sizes+0.944
cest+0.934
Neg logit contributions
oken-1.631
hered-1.608
 grounded-1.474
oning-1.467
ives-1.464
Token,
Token ID11
Feature activation-3.72e-01

Pos logit contributions
lf+2.191
 flavors+2.087
 kinds+1.928
 sizes+1.869
 oval+1.784
Neg logit contributions
hered-2.544
oken-2.374
ives-2.313
pped-2.251
 grounded-2.237
Token was
Token ID373
Feature activation-1.89e+00

Pos logit contributions
hered+2.039
oken+2.027
 risking+1.856
oning+1.845
ives+1.818
Neg logit contributions
lf-1.802
 flavors-1.713
 kinds-1.630
 sizes-1.567
 tower-1.491
Token the
Token ID262
Feature activation-9.78e-02

Pos logit contributions
hered+3.224
ives+3.041
oken+2.961
 risking+2.886
oning+2.776
Neg logit contributions
lf-3.233
 kinds-2.927
 flavors-2.918
 oval-2.833
 tower-2.780
Token cut
Token ID2005
Feature activation+6.87e-01

Pos logit contributions
hered+0.170
oken+0.166
 grounded+0.152
pped+0.152
 risking+0.151
Neg logit contributions
lf-0.161
 flavors-0.154
 kinds-0.146
 sizes-0.145
 oval-0.136
Tokenest
Token ID395
Feature activation+3.39e-01

Pos logit contributions
lf+1.166
 flavors+1.128
 sizes+1.060
 kinds+1.056
 oval+1.015
Neg logit contributions
hered-1.141
oken-1.108
pped-1.027
 grounded-1.007
 risking-1.001
Token little
Token ID1310
Feature activation+6.91e-01

Pos logit contributions
lf+0.516
 flavors+0.489
 kinds+0.463
 sizes+0.462
 oval+0.431
Neg logit contributions
hered-0.623
oken-0.605
 risking-0.556
pped-0.556
 grounded-0.553
Token three
Token ID1115
Feature activation+2.64e+00

Pos logit contributions
lf+1.055
 flavors+0.975
 kinds+0.911
 sizes+0.902
 oval+0.861
Neg logit contributions
hered-1.278
oken-1.232
 risking-1.140
 grounded-1.136
pped-1.127
Token year
Token ID614
Feature activation+1.81e+00

Pos logit contributions
lf+4.246
 flavors+4.187
 oval+4.152
 sizes+4.015
cest+3.742
Neg logit contributions
hered-4.140
oken-4.133
 grounded-3.733
pped-3.685
ives-3.684
Token old
Token ID1468
Feature activation+7.40e-01

Pos logit contributions
lf+2.493
 flavors+2.431
 kinds+2.227
 sizes+2.215
cest+2.098
Neg logit contributions
hered-3.476
oken-3.470
 grounded-3.245
ives-3.174
 permission-3.127
Token.
Token ID13
Feature activation+1.19e+00

Pos logit contributions
lf+1.186
 flavors+1.125
 kinds+1.060
 sizes+1.055
 oval+1.021
Neg logit contributions
hered-1.291
oken-1.226
 risking-1.154
 grounded-1.149
pped-1.139
Token He
Token ID679
Feature activation+5.81e-01

Pos logit contributions
lf+1.774
 flavors+1.581
cest+1.428
 kinds+1.415
 sizes+1.400
Neg logit contributions
oken-2.323
hered-2.292
oning-2.103
ives-2.096
 grounded-2.089
Token loved
Token ID6151
Feature activation+3.35e-01

Pos logit contributions
lf+0.932
 flavors+0.897
 sizes+0.855
 kinds+0.850
cest+0.803
Neg logit contributions
hered-0.998
oken-0.983
pped-0.919
 grounded-0.910
 risking-0.890
Token to
Token ID284
Feature activation+6.74e-01

Pos logit contributions
hered+1.458
oken+1.413
pped+1.273
ives+1.258
 grounded+1.255
Neg logit contributions
lf-1.735
 flavors-1.706
 kinds-1.614
 sizes-1.585
 tower-1.508
Token say
Token ID910
Feature activation+2.33e-01

Pos logit contributions
lf+1.445
 flavors+1.394
 sizes+1.363
 kinds+1.340
 oval+1.284
Neg logit contributions
hered-0.832
oken-0.789
 grounded-0.715
 risking-0.696
pped-0.696
Token thank
Token ID5875
Feature activation+5.17e-02

Pos logit contributions
lf+0.495
 flavors+0.478
 sizes+0.459
 kinds+0.459
 oval+0.435
Neg logit contributions
hered-0.292
oken-0.284
 grounded-0.254
 risking-0.252
pped-0.248
Token you
Token ID345
Feature activation-2.94e-02

Pos logit contributions
lf+0.108
 flavors+0.105
 kinds+0.101
 sizes+0.101
cest+0.096
Neg logit contributions
hered-0.066
oken-0.064
pped-0.057
 grounded-0.057
 risking-0.056
Token,
Token ID11
Feature activation-5.82e-01

Pos logit contributions
hered+0.049
oken+0.047
 grounded+0.043
pped+0.043
 risking+0.043
Neg logit contributions
lf-0.051
 flavors-0.049
 kinds-0.046
 sizes-0.046
 oval-0.043
Token so
Token ID523
Feature activation+2.56e+00

Pos logit contributions
hered+1.019
oken+0.997
pped+0.910
 risking+0.902
 grounded+0.894
Neg logit contributions
lf-0.960
 flavors-0.922
 sizes-0.876
 kinds-0.871
 oval-0.824
Token he
Token ID339
Feature activation-1.42e-01

Pos logit contributions
 flavors+4.249
 sizes+4.059
 kinds+4.050
lf+3.779
 oval+3.658
Neg logit contributions
hered-4.487
pped-4.401
 grounded-4.323
 risking-4.197
 okay-4.084
Token made
Token ID925
Feature activation+2.47e-01

Pos logit contributions
hered+0.239
oken+0.233
pped+0.213
 grounded+0.212
 risking+0.211
Neg logit contributions
lf-0.242
 flavors-0.233
 kinds-0.221
 sizes-0.220
 oval-0.207
Token them
Token ID606
Feature activation-2.02e-01

Pos logit contributions
lf+0.476
 flavors+0.458
 sizes+0.441
 kinds+0.434
 oval+0.418
Neg logit contributions
hered-0.355
oken-0.350
pped-0.316
 grounded-0.315
 risking-0.311
Token some
Token ID617
Feature activation-7.25e-01

Pos logit contributions
hered+0.359
oken+0.351
pped+0.321
 grounded+0.321
 risking+0.321
Neg logit contributions
lf-0.322
 flavors-0.309
 kinds-0.292
 sizes-0.290
 oval-0.272
Token cookies
Token ID14746
Feature activation-2.52e-01

Pos logit contributions
hered+1.446
oken+1.392
pped+1.317
 risking+1.295
 grounded+1.288
Neg logit contributions
lf-1.016
 flavors-1.000
 kinds-0.936
 sizes-0.915
 tower-0.846
Token Tim
Token ID5045
Feature activation-5.56e-01

Pos logit contributions
lf+1.321
 flavors+1.276
 sizes+1.235
 kinds+1.225
cest+1.161
Neg logit contributions
hered-1.366
oken-1.276
pped-1.231
 risking-1.214
 grounded-1.198
Tokenmy
Token ID1820
Feature activation-7.96e-01

Pos logit contributions
hered+0.961
oken+0.940
 risking+0.861
 grounded+0.852
pped+0.851
Neg logit contributions
lf-0.924
 flavors-0.886
 kinds-0.847
 sizes-0.834
cest-0.789
Token hit
Token ID2277
Feature activation+5.01e-01

Pos logit contributions
hered+1.359
oken+1.349
 risking+1.221
oning+1.210
 grounded+1.210
Neg logit contributions
lf-1.336
 flavors-1.272
 kinds-1.212
 sizes-1.188
cest-1.144
Token the
Token ID262
Feature activation-8.70e-01

Pos logit contributions
lf+0.842
 flavors+0.806
 sizes+0.773
 kinds+0.768
cest+0.726
Neg logit contributions
hered-0.855
oken-0.835
pped-0.782
 grounded-0.766
 risking-0.757
Token ball
Token ID2613
Feature activation-1.25e-01

Pos logit contributions
hered+1.635
oken+1.627
 risking+1.483
pped+1.481
 grounded+1.476
Neg logit contributions
lf-1.296
 flavors-1.258
 sizes-1.167
 kinds-1.159
 tower-1.103
Token so
Token ID523
Feature activation+2.56e+00

Pos logit contributions
hered+0.189
oken+0.184
pped+0.166
 grounded+0.166
 risking+0.165
Neg logit contributions
lf-0.231
 flavors-0.224
 kinds-0.213
 sizes-0.212
 oval-0.200
Token hard
Token ID1327
Feature activation+2.96e-01

Pos logit contributions
 flavors+4.515
 sizes+4.366
 kinds+4.324
lf+4.288
cest+4.012
Neg logit contributions
pped-4.486
hered-4.325
oken-4.138
 risking-4.049
 grounded-4.045
Token that
Token ID326
Feature activation+1.62e+00

Pos logit contributions
lf+0.481
 flavors+0.471
 kinds+0.442
 sizes+0.442
 oval+0.415
Neg logit contributions
hered-0.515
oken-0.495
pped-0.464
 risking-0.461
 grounded-0.457
Token he
Token ID339
Feature activation+2.29e-01

Pos logit contributions
 flavors+2.196
lf+2.117
 kinds+2.112
 sizes+2.101
cest+1.955
Neg logit contributions
hered-3.070
oken-2.947
 grounded-2.916
pped-2.914
 risking-2.884
Token became
Token ID2627
Feature activation+7.34e-02

Pos logit contributions
lf+0.411
 flavors+0.400
 kinds+0.385
 sizes+0.381
 oval+0.359
Neg logit contributions
hered-0.358
oken-0.349
pped-0.323
 grounded-0.318
 risking-0.317
Token dizz
Token ID29527
Feature activation+5.38e-01

Pos logit contributions
lf+0.135
 flavors+0.130
 kinds+0.123
 sizes+0.123
 oval+0.116
Neg logit contributions
hered-0.114
oken-0.112
 risking-0.101
 grounded-0.101
pped-0.101
Token lots
Token ID6041
Feature activation+6.50e-01

Pos logit contributions
lf+1.568
 flavors+1.500
 sizes+1.429
 kinds+1.408
 oval+1.366
Neg logit contributions
hered-1.262
oken-1.261
 risking-1.153
 grounded-1.151
pped-1.147
Token of
Token ID286
Feature activation+1.23e-01

Pos logit contributions
lf+1.081
 flavors+1.028
 sizes+0.960
 kinds+0.950
 oval+0.921
Neg logit contributions
hered-1.114
oken-1.087
ives-1.005
pped-0.996
 risking-0.990
Token pretty
Token ID2495
Feature activation+1.07e+00

Pos logit contributions
lf+0.173
 flavors+0.163
 sizes+0.152
 kinds+0.152
 oval+0.143
Neg logit contributions
hered-0.239
oken-0.239
 grounded-0.219
 risking-0.219
pped-0.217
Token fireworks
Token ID26056
Feature activation+2.91e-01

Pos logit contributions
lf+1.612
 flavors+1.480
 sizes+1.395
 kinds+1.383
 oval+1.343
Neg logit contributions
hered-1.963
oken-1.929
 risking-1.774
pped-1.758
ives-1.736
Token lighting
Token ID12019
Feature activation+1.72e+00

Pos logit contributions
lf+0.512
 flavors+0.492
 kinds+0.465
 sizes+0.461
 oval+0.439
Neg logit contributions
hered-0.471
oken-0.450
 risking-0.413
pped-0.412
 grounded-0.409
Token up
Token ID510
Feature activation+1.02e+00

Pos logit contributions
lf+3.530
 flavors+3.240
cest+3.084
 kinds+3.064
 oval+3.045
Neg logit contributions
oken-2.203
hered-2.179
 risking-1.924
oning-1.870
pped-1.863
Token the
Token ID262
Feature activation-6.19e-01

Pos logit contributions
lf+1.759
 flavors+1.646
 sizes+1.541
 kinds+1.525
 oval+1.499
Neg logit contributions
hered-1.671
oken-1.655
 risking-1.506
pped-1.489
 grounded-1.488
Token sky
Token ID6766
Feature activation+1.50e+00

Pos logit contributions
hered+1.119
oken+1.089
pped+1.008
 risking+1.003
 grounded+0.995
Neg logit contributions
lf-0.995
 flavors-0.957
 sizes-0.896
 kinds-0.895
 tower-0.839
Token.
Token ID13
Feature activation+9.26e-01

Pos logit contributions
lf+2.744
 flavors+2.618
 kinds+2.355
 sizes+2.332
cest+2.267
Neg logit contributions
hered-2.436
oken-2.213
ives-2.089
oning-2.043
pped-2.011
Token The
Token ID383
Feature activation-1.58e-01

Pos logit contributions
lf+1.527
 flavors+1.462
 kinds+1.391
 sizes+1.376
cest+1.371
Neg logit contributions
hered-1.562
oken-1.552
oning-1.425
 grounded-1.419
oned-1.406
Token colors
Token ID7577
Feature activation+5.51e-01

Pos logit contributions
hered+0.303
oken+0.297
 grounded+0.274
pped+0.273
 risking+0.273
Neg logit contributions
lf-0.232
 flavors-0.222
 kinds-0.208
 sizes-0.208
 oval-0.193
Token story
Token ID1621
Feature activation+1.17e+00

Pos logit contributions
lf+0.004
 flavors+0.004
 kinds+0.004
 sizes+0.004
 oval+0.003
Neg logit contributions
hered-0.006
oken-0.006
 grounded-0.005
pped-0.005
 risking-0.005
Token aloud
Token ID32227
Feature activation+1.24e+00

Pos logit contributions
lf+2.153
 flavors+2.150
 kinds+2.026
 sizes+2.008
cest+1.943
Neg logit contributions
hered-1.790
oken-1.611
ives-1.573
 grounded-1.504
pped-1.479
Token to
Token ID284
Feature activation+1.75e+00

Pos logit contributions
lf+2.369
 flavors+2.264
 kinds+2.149
 sizes+2.121
cest+2.041
Neg logit contributions
hered-1.848
oken-1.728
ives-1.585
pped-1.582
 risking-1.551
Token each
Token ID1123
Feature activation+8.14e-01

Pos logit contributions
lf+3.072
 flavors+2.968
 kinds+2.757
 sizes+2.751
cest+2.674
Neg logit contributions
oken-2.855
hered-2.832
 grounded-2.619
 permission-2.579
 risking-2.559
Token other
Token ID584
Feature activation+2.30e-01

Pos logit contributions
lf+1.295
 flavors+1.273
 sizes+1.212
 kinds+1.155
 oval+1.108
Neg logit contributions
hered-1.461
oken-1.390
pped-1.305
 grounded-1.301
 risking-1.279
Token.
Token ID13
Feature activation+4.26e-01

Pos logit contributions
lf+0.368
 flavors+0.356
 kinds+0.333
 sizes+0.331
 oval+0.311
Neg logit contributions
hered-0.410
oken-0.394
 grounded-0.364
ives-0.361
pped-0.361
Token They
Token ID1119
Feature activation+1.24e+00

Pos logit contributions
lf+0.742
 flavors+0.706
 kinds+0.675
 sizes+0.665
cest+0.645
Neg logit contributions
hered-0.707
oken-0.684
ives-0.631
 grounded-0.623
pped-0.622
Token were
Token ID547
Feature activation+1.94e+00

Pos logit contributions
lf+2.298
 flavors+2.263
 sizes+2.160
 kinds+2.135
cest+2.063
Neg logit contributions
hered-1.824
oken-1.681
pped-1.655
 grounded-1.652
ives-1.561
Token so
Token ID523
Feature activation+2.27e+00

Pos logit contributions
lf+3.624
 flavors+3.501
 kinds+3.214
 sizes+3.171
 oval+3.093
Neg logit contributions
hered-2.913
oken-2.900
 grounded-2.838
pped-2.639
 risking-2.632
Token proud
Token ID6613
Feature activation+8.54e-01

Pos logit contributions
lf+3.727
 flavors+3.598
 sizes+3.428
 oval+3.241
 kinds+3.188
Neg logit contributions
oken-4.174
hered-3.954
 grounded-3.912
ives-3.710
pped-3.555
Token of
Token ID286
Feature activation+5.12e-01

Pos logit contributions
lf+1.475
 flavors+1.424
 sizes+1.333
 kinds+1.314
 oval+1.260
Neg logit contributions
hered-1.396
oken-1.335
ives-1.257
 risking-1.244
 grounded-1.243
Token space
Token ID2272
Feature activation+1.16e+00

Pos logit contributions
lf+2.267
 flavors+2.252
 sizes+2.122
 kinds+2.031
 oval+1.989
Neg logit contributions
oken-2.201
hered-2.200
 risking-2.027
 grounded-2.006
 permission-1.957
Token.
Token ID13
Feature activation+6.23e-01

Pos logit contributions
 flavors+1.863
lf+1.858
 sizes+1.719
 kinds+1.703
 oval+1.662
Neg logit contributions
hered-1.993
oken-1.943
 grounded-1.805
 risking-1.805
pped-1.777
Token When
Token ID1649
Feature activation+7.06e-01

Pos logit contributions
lf+0.993
 flavors+0.934
 kinds+0.876
 sizes+0.867
cest+0.838
Neg logit contributions
hered-1.119
oken-1.108
 grounded-1.025
ives-1.020
oning-1.006
Token he
Token ID339
Feature activation+1.93e-01

Pos logit contributions
lf+0.800
 flavors+0.778
 sizes+0.745
 kinds+0.736
 oval+0.674
Neg logit contributions
hered-1.554
oken-1.517
 risking-1.439
pped-1.438
 grounded-1.433
Token returned
Token ID4504
Feature activation+1.08e-01

Pos logit contributions
lf+0.359
 flavors+0.351
 sizes+0.338
 kinds+0.334
 oval+0.316
Neg logit contributions
hered-0.292
oken-0.282
pped-0.258
 grounded-0.255
 risking-0.252
Token to
Token ID284
Feature activation+7.64e-01

Pos logit contributions
lf+0.179
 flavors+0.172
 sizes+0.163
 kinds+0.163
 oval+0.153
Neg logit contributions
hered-0.185
oken-0.180
 risking-0.165
pped-0.165
 grounded-0.164
Token Earth
Token ID3668
Feature activation+2.97e-01

Pos logit contributions
lf+1.043
 flavors+1.030
 sizes+0.966
 kinds+0.922
 oval+0.883
Neg logit contributions
hered-1.501
oken-1.498
 grounded-1.395
 risking-1.383
pped-1.365
Token,
Token ID11
Feature activation+3.17e-01

Pos logit contributions
lf+0.410
 flavors+0.396
 sizes+0.366
 kinds+0.366
 oval+0.339
Neg logit contributions
hered-0.588
oken-0.568
 risking-0.528
 grounded-0.526
pped-0.525
Token he
Token ID339
Feature activation+2.91e-01

Pos logit contributions
lf+0.333
 flavors+0.314
 kinds+0.288
 sizes+0.288
 oval+0.264
Neg logit contributions
hered-0.731
oken-0.715
 risking-0.677
pped-0.675
 grounded-0.675
Token told
Token ID1297
Feature activation+3.70e-01

Pos logit contributions
lf+0.506
 flavors+0.490
 sizes+0.466
 kinds+0.466
 oval+0.437
Neg logit contributions
hered-0.472
oken-0.459
pped-0.424
 grounded-0.422
 risking-0.417
Token all
Token ID477
Feature activation+1.21e+00

Pos logit contributions
lf+0.434
 flavors+0.418
 sizes+0.390
 kinds+0.382
 oval+0.353
Neg logit contributions
hered-0.808
oken-0.800
 grounded-0.745
pped-0.741
 risking-0.737
Token for
Token ID329
Feature activation-5.79e-01

Pos logit contributions
lf+1.146
 flavors+1.113
 sizes+1.048
 kinds+1.045
 oval+0.970
Neg logit contributions
hered-1.301
oken-1.274
 grounded-1.192
 risking-1.183
pped-1.175
Token?"
Token ID1701
Feature activation+3.68e-01

Pos logit contributions
hered+1.208
oken+1.182
 risking+1.095
pped+1.093
 grounded+1.089
Neg logit contributions
lf-0.762
 flavors-0.715
 kinds-0.666
 sizes-0.654
 oval-0.611
Token Lily
Token ID20037
Feature activation+7.52e-03

Pos logit contributions
lf+0.621
 flavors+0.600
 sizes+0.567
 kinds+0.565
cest+0.529
Neg logit contributions
hered-0.623
oken-0.605
 grounded-0.557
 risking-0.554
pped-0.550
Token says
Token ID1139
Feature activation-5.57e-01

Pos logit contributions
lf+0.012
 flavors+0.011
 kinds+0.011
 sizes+0.011
cest+0.010
Neg logit contributions
hered-0.013
oken-0.013
 grounded-0.012
pped-0.012
 risking-0.012
Token.
Token ID13
Feature activation-6.36e-01

Pos logit contributions
hered+1.239
oken+1.226
 grounded+1.144
pped+1.143
 risking+1.141
Neg logit contributions
lf-0.632
 flavors-0.592
 kinds-0.549
 sizes-0.544
 oval-0.500
Token\n
Token ID198
Feature activation+7.44e-01

Pos logit contributions
hered+1.057
oken+1.054
 grounded+0.952
pped+0.951
 risking+0.946
Neg logit contributions
lf-1.073
 flavors-1.033
 kinds-0.984
 sizes-0.973
 oval-0.931
Token\n
Token ID198
Feature activation+1.18e+00

Pos logit contributions
lf+1.517
 flavors+1.480
 sizes+1.422
 kinds+1.407
cest+1.337
Neg logit contributions
hered-1.007
oken-0.961
pped-0.872
ives-0.871
 grounded-0.863
Token"
Token ID1
Feature activation+4.96e-01

Pos logit contributions
lf+1.904
 flavors+1.846
 kinds+1.750
 sizes+1.734
cest+1.562
Neg logit contributions
hered-2.205
oken-1.964
pped-1.956
ives-1.947
 grounded-1.930
TokenIt
Token ID1026
Feature activation+1.03e+00

Pos logit contributions
lf+1.097
 flavors+1.047
 kinds+1.014
 sizes+1.001
cest+0.957
Neg logit contributions
hered-0.593
oken-0.545
pped-0.499
 risking-0.497
 grounded-0.496
Token's
Token ID338
Feature activation+1.05e+00

Pos logit contributions
lf+1.914
 flavors+1.804
 kinds+1.701
 sizes+1.692
cest+1.608
Neg logit contributions
hered-1.580
oken-1.497
 grounded-1.432
 risking-1.396
ives-1.388
Token for
Token ID329
Feature activation-2.04e-01

Pos logit contributions
lf+2.008
 flavors+1.890
 kinds+1.828
 sizes+1.779
cest+1.729
Neg logit contributions
oken-1.555
hered-1.510
 grounded-1.408
pped-1.372
 risking-1.367
Token that
Token ID326
Feature activation+2.95e+00

Pos logit contributions
lf+1.475
 flavors+1.429
 sizes+1.366
 kinds+1.341
 oval+1.311
Neg logit contributions
hered-1.127
oken-1.073
 risking-0.990
 grounded-0.970
pped-0.969
Token he
Token ID339
Feature activation+5.84e-01

Pos logit contributions
 flavors+4.217
 sizes+4.077
lf+4.019
 oval+3.862
lights+3.744
Neg logit contributions
 grounded-5.547
 risking-5.389
hered-5.381
Sleep-5.327
 permission-5.298
Token couldn
Token ID3521
Feature activation-1.84e-01

Pos logit contributions
lf+0.987
 flavors+0.967
 kinds+0.926
 sizes+0.923
 oval+0.867
Neg logit contributions
hered-0.962
oken-0.920
pped-0.872
 grounded-0.864
 risking-0.851
Token't
Token ID470
Feature activation+3.10e-01

Pos logit contributions
hered+0.119
oken+0.111
pped+0.084
 grounded+0.084
 risking+0.083
Neg logit contributions
lf-0.503
 flavors-0.492
 kinds-0.476
 sizes-0.475
 oval-0.458
Token wait
Token ID4043
Feature activation+2.76e-02

Pos logit contributions
lf+0.592
 flavors+0.582
 sizes+0.560
 kinds+0.558
 oval+0.520
Neg logit contributions
hered-0.453
oken-0.435
pped-0.394
 grounded-0.392
oning-0.386
Token to
Token ID284
Feature activation+9.12e-01

Pos logit contributions
lf+0.058
 flavors+0.057
 sizes+0.054
 kinds+0.054
 oval+0.052
Neg logit contributions
hered-0.035
oken-0.034
pped-0.030
 risking-0.030
 grounded-0.030
Token ride
Token ID6594
Feature activation+1.85e+00

Pos logit contributions
lf+1.596
 flavors+1.580
 sizes+1.571
 kinds+1.511
cest+1.435
Neg logit contributions
hered-1.454
oken-1.405
 grounded-1.299
pped-1.298
 risking-1.276
Token it
Token ID340
Feature activation+7.19e-01

Pos logit contributions
 sizes+3.393
lf+3.385
 flavors+3.381
 kinds+3.328
 oval+3.200
Neg logit contributions
hered-2.794
pped-2.552
oken-2.540
 risking-2.429
 grounded-2.401
Token.
Token ID13
Feature activation+1.07e+00

Pos logit contributions
lf+1.170
 flavors+1.158
 sizes+1.085
 kinds+1.076
cest+1.022
Neg logit contributions
hered-1.252
oken-1.190
pped-1.107
 grounded-1.096
 risking-1.092
Token He
Token ID679
Feature activation+1.16e+00

Pos logit contributions
lf+1.557
 flavors+1.419
 kinds+1.312
 sizes+1.311
cest+1.301
Neg logit contributions
hered-2.101
oken-2.098
ives-1.981
oning-1.946
 grounded-1.935
Token practiced
Token ID19893
Feature activation+7.28e-01

Pos logit contributions
lf+2.072
 flavors+2.033
 sizes+1.993
 kinds+1.955
cest+1.863
Neg logit contributions
hered-1.760
oken-1.679
pped-1.624
 grounded-1.585
 risking-1.533
Token lead
Token ID1085
Feature activation-7.71e-01

Pos logit contributions
hered+0.330
oken+0.321
pped+0.291
 grounded+0.291
 risking+0.290
Neg logit contributions
lf-0.373
 flavors-0.360
 kinds-0.342
 sizes-0.341
 oval-0.321
Token the
Token ID262
Feature activation-1.19e+00

Pos logit contributions
hered+1.325
oken+1.300
 grounded+1.183
pped+1.183
 risking+1.178
Neg logit contributions
lf-1.286
 flavors-1.232
 kinds-1.178
 sizes-1.161
 oval-1.094
Token way
Token ID835
Feature activation-5.12e-01

Pos logit contributions
hered+2.693
oken+2.606
pped+2.488
ives+2.466
oned+2.412
Neg logit contributions
lf-1.406
 flavors-1.368
 sizes-1.262
 kinds-1.239
 tower-1.146
Token".
Token ID1911
Feature activation-1.02e+00

Pos logit contributions
hered+0.715
oken+0.699
pped+0.618
 risking+0.615
 grounded+0.614
Neg logit contributions
lf-1.012
 flavors-0.973
 kinds-0.935
 sizes-0.933
cest-0.886
Token\n
Token ID198
Feature activation-5.15e-01

Pos logit contributions
hered+1.789
oken+1.764
pped+1.643
 grounded+1.623
 risking+1.621
Neg logit contributions
lf-1.655
 flavors-1.578
 kinds-1.540
 sizes-1.501
 oval-1.417
Token\n
Token ID198
Feature activation-1.15e-01

Pos logit contributions
hered+0.837
oken+0.816
 risking+0.740
pped+0.739
 grounded+0.738
Neg logit contributions
lf-0.914
 flavors-0.869
 kinds-0.829
 sizes-0.818
 oval-0.777
TokenThe
Token ID464
Feature activation-7.80e-01

Pos logit contributions
hered+0.215
oken+0.210
pped+0.194
 grounded+0.194
 risking+0.193
Neg logit contributions
lf-0.175
 flavors-0.168
 kinds-0.158
 sizes-0.157
 oval-0.146
Token girls
Token ID4813
Feature activation-4.14e-01

Pos logit contributions
hered+1.356
oken+1.310
pped+1.222
 risking+1.210
ives+1.201
Neg logit contributions
lf-1.353
 flavors-1.276
 kinds-1.214
 sizes-1.202
 tower-1.142
Token followed
Token ID3940
Feature activation-9.47e-01

Pos logit contributions
hered+0.712
oken+0.700
 risking+0.638
pped+0.637
 grounded+0.636
Neg logit contributions
lf-0.685
 flavors-0.655
 kinds-0.623
 sizes-0.618
 oval-0.577
Token Maria
Token ID14200
Feature activation-7.74e-01

Pos logit contributions
hered+2.003
oken+1.987
 grounded+1.872
 risking+1.867
pped+1.863
Neg logit contributions
lf-1.196
 flavors-1.097
 kinds-1.056
 sizes-1.029
 oval-0.958
Token and
Token ID290
Feature activation-3.07e-01

Pos logit contributions
hered+1.192
oken+1.189
pped+1.075
 grounded+1.069
 risking+1.063
Neg logit contributions
lf-1.392
 flavors-1.337
 kinds-1.294
 sizes-1.287
 oval-1.212
Token boy
Token ID2933
Feature activation+8.25e-01

Pos logit contributions
lf+1.074
 flavors+1.048
 sizes+0.997
 kinds+0.978
cest+0.912
Neg logit contributions
hered-1.080
oken-1.022
 grounded-0.957
 risking-0.925
pped-0.920
Token named
Token ID3706
Feature activation+1.20e+00

Pos logit contributions
 flavors+1.750
lf+1.742
 sizes+1.665
 kinds+1.632
cest+1.537
Neg logit contributions
hered-1.028
oken-0.955
pped-0.877
 grounded-0.875
 risking-0.859
Token Jamie
Token ID17826
Feature activation+1.20e+00

Pos logit contributions
lf+2.096
 flavors+2.065
 kinds+1.979
 sizes+1.944
cest+1.912
Neg logit contributions
hered-1.918
oken-1.828
 grounded-1.704
oning-1.695
pped-1.682
Token.
Token ID13
Feature activation+3.06e+00

Pos logit contributions
 flavors+1.779
lf+1.778
 sizes+1.657
 kinds+1.625
 oval+1.515
Neg logit contributions
hered-2.227
oken-2.142
pped-2.043
 grounded-2.026
 risking-2.014
Token Jamie
Token ID17826
Feature activation+4.98e-01

Pos logit contributions
lf+4.900
 flavors+4.295
cest+4.226
 sizes+4.030
 oval+3.855
Neg logit contributions
oken-5.667
hered-5.255
ives-5.234
Sleep-5.166
Come-5.154
Token loved
Token ID6151
Feature activation+9.71e-02

Pos logit contributions
lf+0.771
 flavors+0.755
 sizes+0.715
 kinds+0.711
 oval+0.662
Neg logit contributions
hered-0.879
oken-0.860
 grounded-0.805
pped-0.799
 risking-0.790
Token to
Token ID284
Feature activation+7.03e-01

Pos logit contributions
lf+0.204
 flavors+0.197
 sizes+0.189
 kinds+0.189
cest+0.180
Neg logit contributions
hered-0.121
oken-0.118
 risking-0.105
 grounded-0.104
pped-0.104
Token play
Token ID711
Feature activation-4.21e-01

Pos logit contributions
lf+1.240
 flavors+1.215
 sizes+1.174
 kinds+1.148
cest+1.111
Neg logit contributions
hered-1.104
oken-1.071
pped-0.968
oning-0.954
 grounded-0.951
Token outside
Token ID2354
Feature activation+7.96e-01

Pos logit contributions
hered+0.693
oken+0.679
 grounded+0.616
 risking+0.615
pped+0.614
Neg logit contributions
lf-0.740
 flavors-0.704
 kinds-0.667
 sizes-0.661
 oval-0.628
Token and
Token ID290
Feature activation+9.30e-01

Pos logit contributions
lf+1.616
 flavors+1.595
 sizes+1.502
 kinds+1.493
cest+1.441
Neg logit contributions
hered-1.046
oken-0.992
pped-0.893
 grounded-0.885
 risking-0.884
Token explore
Token ID7301
Feature activation+1.99e+00

Pos logit contributions
lf+1.727
 flavors+1.673
 kinds+1.587
 sizes+1.576
cest+1.521
Neg logit contributions
hered-1.374
oken-1.318
 risking-1.248
 grounded-1.220
pped-1.210
Token upon
Token ID2402
Feature activation-6.46e-01

Pos logit contributions
 flavors+1.781
lf+1.779
 sizes+1.704
 kinds+1.669
 oval+1.628
Neg logit contributions
hered-0.787
oken-0.780
 grounded-0.660
pped-0.644
 risking-0.642
Token a
Token ID257
Feature activation-4.14e-01

Pos logit contributions
hered+0.963
oken+0.938
 grounded+0.840
 risking+0.832
pped+0.829
Neg logit contributions
lf-1.229
 flavors-1.186
 kinds-1.134
 sizes-1.118
 oval-1.057
Token time
Token ID640
Feature activation-6.62e-01

Pos logit contributions
hered+0.605
oken+0.586
 risking+0.531
pped+0.531
 grounded+0.530
Neg logit contributions
lf-0.791
 flavors-0.763
 kinds-0.727
 sizes-0.723
 oval-0.687
Token,
Token ID11
Feature activation-7.93e-01

Pos logit contributions
hered+1.451
oken+1.439
 grounded+1.349
pped+1.344
 risking+1.340
Neg logit contributions
lf-0.783
 flavors-0.720
 kinds-0.672
 sizes-0.661
 oval-0.602
Token there
Token ID612
Feature activation+6.95e-01

Pos logit contributions
hered+1.000
oken+0.983
ives+0.873
 risking+0.850
pped+0.849
Neg logit contributions
lf-1.697
 flavors-1.612
 kinds-1.560
 sizes-1.550
 oval-1.491
Token was
Token ID373
Feature activation-1.55e+00

Pos logit contributions
lf+1.087
 flavors+1.050
 sizes+1.015
 kinds+0.982
 oval+0.953
Neg logit contributions
hered-1.245
oken-1.201
pped-1.118
 risking-1.100
 grounded-1.088
Token a
Token ID257
Feature activation+1.12e+00

Pos logit contributions
hered+2.603
oken+2.537
 risking+2.346
ives+2.344
 grounded+2.307
Neg logit contributions
lf-2.721
 flavors-2.594
 kinds-2.522
 sizes-2.421
 oval-2.377
Token little
Token ID1310
Feature activation+8.43e-01

Pos logit contributions
 flavors+1.749
lf+1.721
 sizes+1.695
 kinds+1.692
cest+1.471
Neg logit contributions
hered-1.981
oken-1.890
 grounded-1.804
 risking-1.745
Sleep-1.729
Token girl
Token ID2576
Feature activation+7.45e-01

Pos logit contributions
lf+1.416
 flavors+1.372
 sizes+1.301
 kinds+1.271
cest+1.170
Neg logit contributions
hered-1.451
oken-1.340
 grounded-1.271
 risking-1.231
pped-1.225
Token named
Token ID3706
Feature activation-7.34e-01

Pos logit contributions
 flavors+1.582
lf+1.562
 sizes+1.500
 kinds+1.474
cest+1.385
Neg logit contributions
hered-0.935
oken-0.877
 grounded-0.801
pped-0.797
 risking-0.777
Token Lily
Token ID20037
Feature activation+1.03e+00

Pos logit contributions
hered+1.405
oken+1.381
 risking+1.285
pped+1.276
 grounded+1.268
Neg logit contributions
lf-1.102
 flavors-1.022
 kinds-0.974
 sizes-0.966
 oval-0.909
Token was
Token ID373
Feature activation-4.40e-01

Pos logit contributions
hered+1.271
oken+1.224
 risking+1.112
pped+1.105
 grounded+1.082
Neg logit contributions
lf-1.614
 flavors-1.578
 sizes-1.522
 kinds-1.517
 oval-1.465
Token a
Token ID257
Feature activation+5.13e-01

Pos logit contributions
hered+0.732
oken+0.710
pped+0.648
 risking+0.645
 grounded+0.643
Neg logit contributions
lf-0.752
 flavors-0.723
 kinds-0.689
 sizes-0.685
 oval-0.645
Token little
Token ID1310
Feature activation+2.98e-01

Pos logit contributions
lf+0.702
 flavors+0.679
 kinds+0.646
 sizes+0.644
cest+0.580
Neg logit contributions
hered-1.009
oken-0.978
 grounded-0.923
 risking-0.910
pped-0.892
Token girl
Token ID2576
Feature activation-4.74e-01

Pos logit contributions
lf+0.457
 flavors+0.435
 sizes+0.406
 kinds+0.405
cest+0.373
Neg logit contributions
hered-0.554
oken-0.535
 grounded-0.500
 risking-0.493
pped-0.486
Token who
Token ID508
Feature activation+3.23e-01

Pos logit contributions
hered+0.791
oken+0.780
pped+0.706
 grounded+0.705
 risking+0.702
Neg logit contributions
lf-0.804
 flavors-0.770
 kinds-0.734
 sizes-0.728
 oval-0.688
Token liked
Token ID8288
Feature activation+1.32e-01

Pos logit contributions
lf+0.489
 flavors+0.471
 sizes+0.441
 kinds+0.433
 oval+0.414
Neg logit contributions
hered-0.597
oken-0.583
 grounded-0.544
pped-0.538
 risking-0.536
Token to
Token ID284
Feature activation+1.30e+00

Pos logit contributions
lf+0.273
 flavors+0.264
 sizes+0.252
 kinds+0.252
cest+0.241
Neg logit contributions
hered-0.170
oken-0.165
 risking-0.148
 grounded-0.146
pped-0.145
Token draw
Token ID3197
Feature activation+1.03e+00

Pos logit contributions
lf+2.387
 flavors+2.340
 sizes+2.264
 kinds+2.211
cest+2.168
Neg logit contributions
hered-1.936
oken-1.876
oning-1.670
pped-1.647
 Drink-1.645
Token.
Token ID13
Feature activation+1.02e-01

Pos logit contributions
lf+1.720
 flavors+1.695
 sizes+1.576
 kinds+1.574
cest+1.528
Neg logit contributions
hered-1.685
oken-1.642
pped-1.516
ives-1.507
 grounded-1.468
Token She
Token ID1375
Feature activation+1.47e+00

Pos logit contributions
lf+0.189
 flavors+0.182
 kinds+0.172
 sizes+0.171
cest+0.164
Neg logit contributions
hered-0.155
oken-0.151
 grounded-0.135
pped-0.135
ives-0.134
Token had
Token ID550
Feature activation+5.68e-01

Pos logit contributions
lf+2.653
 flavors+2.629
 sizes+2.506
 kinds+2.481
cest+2.439
Neg logit contributions
hered-2.158
oken-2.048
pped-1.979
 grounded-1.944
ives-1.894
Token Lily
Token ID20037
Feature activation+6.40e-01

Pos logit contributions
hered+0.710
oken+0.697
pped+0.640
 grounded+0.639
 risking+0.638
Neg logit contributions
lf-0.595
 flavors-0.569
 kinds-0.536
 sizes-0.535
 oval-0.501
Token and
Token ID290
Feature activation-1.54e-01

Pos logit contributions
lf+1.079
 flavors+1.062
 kinds+1.002
 sizes+0.998
cest+0.936
Neg logit contributions
hered-1.077
oken-1.014
pped-0.961
 grounded-0.941
 risking-0.933
Token her
Token ID607
Feature activation-3.27e-01

Pos logit contributions
hered+0.249
oken+0.242
pped+0.220
 grounded+0.219
 risking+0.219
Neg logit contributions
lf-0.274
 flavors-0.264
 kinds-0.251
 sizes-0.250
 oval-0.236
Token mom
Token ID1995
Feature activation-5.23e-01

Pos logit contributions
hered+0.647
oken+0.635
pped+0.589
 grounded+0.588
 risking+0.586
Neg logit contributions
lf-0.461
 flavors-0.440
 kinds-0.411
 sizes-0.410
 oval-0.380
Tokenmy
Token ID1820
Feature activation-7.12e-01

Pos logit contributions
hered+0.863
oken+0.850
 risking+0.774
 grounded+0.772
pped+0.766
Neg logit contributions
lf-0.898
 flavors-0.865
 kinds-0.825
 sizes-0.820
 oval-0.766
Token were
Token ID547
Feature activation-1.28e+00

Pos logit contributions
oken+1.200
hered+1.198
 grounded+1.089
 risking+1.089
pped+1.070
Neg logit contributions
lf-1.189
 flavors-1.144
 kinds-1.089
 sizes-1.086
 oval-1.009
Token surprised
Token ID6655
Feature activation-8.84e-01

Pos logit contributions
hered+2.226
oken+2.153
ives+2.063
pped+1.968
Sleep+1.962
Neg logit contributions
lf-2.120
 flavors-1.962
 kinds-1.876
 sizes-1.875
cest-1.780
Token but
Token ID475
Feature activation-4.16e-01

Pos logit contributions
hered+1.692
oken+1.671
pped+1.547
 grounded+1.541
 risking+1.534
Neg logit contributions
lf-1.292
 flavors-1.248
 kinds-1.201
 sizes-1.178
cest-1.087
Token happy
Token ID3772
Feature activation-1.08e+00

Pos logit contributions
hered+0.768
oken+0.756
pped+0.689
ives+0.688
 grounded+0.687
Neg logit contributions
lf-0.643
 flavors-0.614
 kinds-0.580
 sizes-0.579
 oval-0.538
Token to
Token ID284
Feature activation+1.94e-01

Pos logit contributions
hered+2.195
oken+2.191
pped+2.024
 grounded+2.007
oning+1.984
Neg logit contributions
lf-1.502
 flavors-1.430
 kinds-1.352
 sizes-1.330
cest-1.214
Token see
Token ID766
Feature activation-6.95e-01

Pos logit contributions
lf+0.362
 flavors+0.351
 sizes+0.334
 kinds+0.333
 oval+0.313
Neg logit contributions
hered-0.299
oken-0.288
pped-0.259
 grounded-0.258
 risking-0.258
Token you
Token ID345
Feature activation+2.10e-01

Pos logit contributions
oken+1.672
hered+1.657
 risking+1.487
pped+1.473
 grounded+1.472
Neg logit contributions
lf-2.204
 flavors-2.114
 kinds-2.016
 sizes-1.963
 oval-1.889
Token,
Token ID11
Feature activation-8.57e-02

Pos logit contributions
lf+0.290
 flavors+0.278
 kinds+0.258
 sizes+0.257
 oval+0.237
Neg logit contributions
hered-0.425
oken-0.412
 grounded-0.384
pped-0.380
ives-0.380
Token Grand
Token ID5675
Feature activation+4.70e-01

Pos logit contributions
hered+0.151
oken+0.148
 grounded+0.136
pped+0.135
 risking+0.135
Neg logit contributions
lf-0.138
 flavors-0.133
 kinds-0.126
 sizes-0.126
 oval-0.117
Tokenma
Token ID2611
Feature activation+8.98e-01

Pos logit contributions
lf+0.733
 flavors+0.699
 sizes+0.680
 kinds+0.663
 oval+0.634
Neg logit contributions
hered-0.845
oken-0.826
 grounded-0.763
pped-0.759
 risking-0.744
Token.
Token ID13
Feature activation-1.25e+00

Pos logit contributions
lf+1.320
 flavors+1.223
 sizes+1.135
 kinds+1.109
 oval+1.080
Neg logit contributions
hered-1.755
oken-1.625
ives-1.580
 grounded-1.579
pped-1.539
Token You
Token ID921
Feature activation-1.80e+00

Pos logit contributions
oken+1.911
hered+1.872
pped+1.723
 risking+1.685
 grounded+1.637
Neg logit contributions
lf-2.281
 flavors-2.255
 sizes-2.201
 kinds-2.190
 oval-2.060
Token are
Token ID389
Feature activation+4.03e-01

Pos logit contributions
oken+2.666
hered+2.657
 risking+2.457
oned+2.429
oning+2.421
Neg logit contributions
 flavors-3.348
lf-3.338
 kinds-3.270
 sizes-3.184
cest-2.946
Token very
Token ID845
Feature activation+1.79e+00

Pos logit contributions
lf+0.681
 flavors+0.658
 kinds+0.618
 sizes+0.617
cest+0.577
Neg logit contributions
hered-0.681
oken-0.667
 grounded-0.627
 risking-0.612
pped-0.607
Token nice
Token ID3621
Feature activation-1.77e-01

Pos logit contributions
lf+3.023
 flavors+2.857
 sizes+2.713
 kinds+2.589
 oval+2.573
Neg logit contributions
oken-3.113
hered-3.100
 grounded-2.988
ives-2.833
Sleep-2.825
Token.
Token ID13
Feature activation-7.76e-01

Pos logit contributions
hered+0.356
oken+0.348
pped+0.323
 grounded+0.322
 risking+0.322
Neg logit contributions
lf-0.242
 flavors-0.231
 kinds-0.215
 sizes-0.215
 oval-0.198
Token We
Token ID775
Feature activation-1.75e+00

Pos logit contributions
hered+1.081
oken+1.067
pped+0.956
 risking+0.941
 grounded+0.926
Neg logit contributions
lf-1.514
 flavors-1.487
 sizes-1.436
 kinds-1.434
 oval-1.358
Token while
Token ID981
Feature activation-8.90e-01

Pos logit contributions
 flavors+1.333
lf+1.304
 kinds+1.278
 sizes+1.265
cest+1.156
Neg logit contributions
hered-1.707
oken-1.579
 risking-1.498
 grounded-1.494
oning-1.482
Token,
Token ID11
Feature activation-6.73e-01

Pos logit contributions
oken+1.973
hered+1.966
ives+1.846
 risking+1.837
 grounded+1.833
Neg logit contributions
lf-1.052
 flavors-0.917
 kinds-0.858
 sizes-0.835
 oval-0.809
Token Lily
Token ID20037
Feature activation-6.61e-01

Pos logit contributions
hered+1.277
oken+1.263
ives+1.158
 grounded+1.151
 risking+1.150
Neg logit contributions
lf-1.018
 flavors-0.951
 kinds-0.890
 sizes-0.882
 oval-0.834
Token's
Token ID338
Feature activation-1.75e-01

Pos logit contributions
hered+1.106
oken+1.097
 risking+0.999
 grounded+0.984
pped+0.979
Neg logit contributions
lf-1.134
 flavors-1.076
 kinds-1.018
 sizes-1.005
 oval-0.959
Token mom
Token ID1995
Feature activation-1.18e+00

Pos logit contributions
hered+0.381
oken+0.374
 grounded+0.348
pped+0.348
 risking+0.347
Neg logit contributions
lf-0.213
 flavors-0.202
 kinds-0.187
 sizes-0.186
 oval-0.169
Token asked
Token ID1965
Feature activation-2.35e+00

Pos logit contributions
oken+2.093
hered+2.071
 risking+1.925
ives+1.860
oning+1.842
Neg logit contributions
lf-1.982
 flavors-1.810
 kinds-1.729
 sizes-1.670
 oval-1.609
Token her
Token ID607
Feature activation-1.16e+00

Pos logit contributions
hered+5.063
oken+4.900
 risking+4.703
ives+4.567
 grounded+4.543
Neg logit contributions
lf-3.128
 kinds-2.762
 flavors-2.736
 sizes-2.433
 type-2.420
Token to
Token ID284
Feature activation-7.25e-01

Pos logit contributions
hered+1.913
oken+1.849
pped+1.707
 risking+1.667
 grounded+1.614
Neg logit contributions
lf-2.087
 flavors-1.965
 kinds-1.910
 sizes-1.844
 tower-1.787
Token shut
Token ID4423
Feature activation+2.17e-01

Pos logit contributions
hered+1.054
oken+1.023
 risking+0.927
ives+0.918
pped+0.914
Neg logit contributions
lf-1.415
 flavors-1.350
 kinds-1.283
 sizes-1.258
 oval-1.218
Token the
Token ID262
Feature activation-2.50e-01

Pos logit contributions
lf+0.328
 flavors+0.318
 sizes+0.301
 kinds+0.299
cest+0.280
Neg logit contributions
hered-0.401
oken-0.393
pped-0.367
 grounded-0.362
 risking-0.362
Token dish
Token ID9433
Feature activation+1.15e+00

Pos logit contributions
hered+0.488
oken+0.477
pped+0.441
 grounded+0.440
 risking+0.440
Neg logit contributions
lf-0.360
 flavors-0.344
 kinds-0.322
 sizes-0.320
 oval-0.297
Token to
Token ID284
Feature activation+2.29e-02

Pos logit contributions
oken+3.538
hered+3.454
 risking+3.288
pped+3.243
 grounded+3.233
Neg logit contributions
lf-2.135
 flavors-1.933
 kinds-1.807
 oval-1.799
 sizes-1.795
Token Tom
Token ID4186
Feature activation-1.56e+00

Pos logit contributions
lf+0.036
 flavors+0.034
 sizes+0.033
 kinds+0.033
 oval+0.030
Neg logit contributions
hered-0.042
oken-0.041
pped-0.038
 grounded-0.037
 risking-0.037
Token,
Token ID11
Feature activation-1.79e+00

Pos logit contributions
oken+3.529
hered+3.418
pped+3.214
 risking+3.187
 grounded+3.171
Neg logit contributions
lf-1.787
 flavors-1.622
 kinds-1.567
 sizes-1.541
 tower-1.413
Token "
Token ID366
Feature activation-2.08e+00

Pos logit contributions
oken+4.096
hered+3.986
ives+3.779
pped+3.683
 grounded+3.586
Neg logit contributions
lf-2.071
 flavors-1.802
 sizes-1.732
 oval-1.697
 kinds-1.632
TokenStop
Token ID19485
Feature activation-7.57e-01

Pos logit contributions
oken+4.282
hered+3.951
ives+3.703
 risking+3.628
oning+3.596
Neg logit contributions
lf-3.124
 flavors-3.121
 kinds-2.647
 tower-2.589
cest-2.569
Token!
Token ID0
Feature activation-2.25e+00

Pos logit contributions
oken+1.659
hered+1.651
ives+1.532
 grounded+1.519
pped+1.500
Neg logit contributions
lf-0.915
 flavors-0.838
 sizes-0.767
 kinds-0.764
cest-0.719
Token It
Token ID632
Feature activation-9.23e-01

Pos logit contributions
oken+3.182
hered+3.062
 risking+2.738
ives+2.691
Sleep+2.628
Neg logit contributions
lf-4.459
 flavors-4.264
 tower-4.145
 oval-4.061
 kinds-3.965
Token is
Token ID318
Feature activation-1.32e+00

Pos logit contributions
hered+1.665
oken+1.664
pped+1.501
 risking+1.499
oning+1.478
Neg logit contributions
lf-1.445
 flavors-1.382
 kinds-1.313
 sizes-1.300
 oval-1.228
Token a
Token ID257
Feature activation-4.75e-01

Pos logit contributions
hered+2.643
oken+2.586
ives+2.445
oned+2.368
pped+2.356
Neg logit contributions
lf-1.880
 flavors-1.765
 sizes-1.664
 kinds-1.621
 oval-1.616
Token trap
Token ID12840
Feature activation-4.23e-01

Pos logit contributions
hered+0.710
oken+0.698
pped+0.627
 grounded+0.617
ives+0.617
Neg logit contributions
lf-0.898
 flavors-0.863
 kinds-0.818
 sizes-0.816
 oval-0.782
Token!
Token ID0
Feature activation-1.84e+00

Pos logit contributions
hered+0.957
oken+0.949
pped+0.881
 risking+0.881
 grounded+0.878
Neg logit contributions
lf-0.469
 flavors-0.436
 kinds-0.400
 sizes-0.398
 oval-0.361
Token They
Token ID1119
Feature activation+1.94e-01

Pos logit contributions
hered+0.489
oken+0.478
pped+0.438
 grounded+0.436
 risking+0.436
Neg logit contributions
lf-0.463
 flavors-0.445
 kinds-0.421
 sizes-0.420
 oval-0.394
Token remembered
Token ID12086
Feature activation-6.43e-01

Pos logit contributions
lf+0.373
 flavors+0.358
 kinds+0.344
 sizes+0.342
 oval+0.327
Neg logit contributions
hered-0.283
oken-0.272
 grounded-0.248
pped-0.248
 risking-0.245
Token their
Token ID511
Feature activation-1.95e+00

Pos logit contributions
hered+1.267
oken+1.234
pped+1.137
ives+1.133
 grounded+1.126
Neg logit contributions
lf-0.928
 flavors-0.897
 kinds-0.836
 sizes-0.824
 oval-0.763
Token friendship
Token ID14738
Feature activation-1.79e-01

Pos logit contributions
hered+4.418
pped+4.245
oken+4.145
ives+4.063
oned+4.039
Neg logit contributions
lf-2.456
 flavors-2.406
 tower-2.165
 sizes-2.101
 kinds-2.069
Token.
Token ID13
Feature activation-2.26e-01

Pos logit contributions
hered+0.380
oken+0.373
pped+0.347
 grounded+0.347
 risking+0.346
Neg logit contributions
lf-0.224
 flavors-0.212
 kinds-0.197
 sizes-0.196
 oval-0.179
Token<|endoftext|>
Token ID50256
Feature activation-1.87e+00

Pos logit contributions
hered+0.380
oken+0.371
pped+0.338
 grounded+0.337
 risking+0.336
Neg logit contributions
lf-0.386
 flavors-0.371
 kinds-0.352
 sizes-0.351
 oval-0.330
TokenL
Token ID43
Feature activation-8.15e-01

Pos logit contributions
oken+2.386
hered+2.301
 grounded+2.182
 risking+2.120
 permission+2.096
Neg logit contributions
lf-3.957
 flavors-3.854
 kinds-3.735
 sizes-3.537
 tower-3.483
Tokenila
Token ID10102
Feature activation-6.00e-01

Pos logit contributions
hered+1.205
oken+1.140
 risking+1.056
 grounded+1.031
pped+1.018
Neg logit contributions
lf-1.602
 flavors-1.526
 kinds-1.479
 sizes-1.436
cest-1.371
Token and
Token ID290
Feature activation-5.86e-01

Pos logit contributions
hered+1.061
oken+1.050
 risking+0.960
 grounded+0.952
oning+0.949
Neg logit contributions
lf-0.966
 flavors-0.930
 kinds-0.877
 sizes-0.865
 oval-0.810
Token Ben
Token ID3932
Feature activation-1.08e+00

Pos logit contributions
hered+1.039
oken+1.033
pped+0.934
 risking+0.931
 grounded+0.927
Neg logit contributions
lf-0.939
 flavors-0.906
 kinds-0.858
 sizes-0.845
 oval-0.797
Token were
Token ID547
Feature activation-1.97e+00

Pos logit contributions
oken+1.914
hered+1.899
 risking+1.755
oning+1.728
 grounded+1.716
Neg logit contributions
lf-1.710
 flavors-1.706
 kinds-1.605
 sizes-1.581
cest-1.473
Token,
Token ID11
Feature activation-5.57e-01

Pos logit contributions
lf+1.087
 flavors+1.055
 sizes+0.988
 kinds+0.975
 oval+0.936
Neg logit contributions
hered-1.349
oken-1.330
 grounded-1.211
pped-1.207
 risking-1.206
Token Lily
Token ID20037
Feature activation-2.54e-01

Pos logit contributions
hered+1.044
oken+1.030
ives+0.942
 grounded+0.941
pped+0.941
Neg logit contributions
lf-0.852
 flavors-0.804
 kinds-0.754
 sizes-0.752
 oval-0.705
Token's
Token ID338
Feature activation-2.63e-01

Pos logit contributions
hered+0.412
oken+0.402
 grounded+0.364
pped+0.364
 risking+0.364
Neg logit contributions
lf-0.447
 flavors-0.430
 kinds-0.408
 sizes-0.406
 oval-0.383
Token mom
Token ID1995
Feature activation+4.90e-01

Pos logit contributions
hered+0.578
oken+0.567
pped+0.529
 risking+0.528
 grounded+0.528
Neg logit contributions
lf-0.316
 flavors-0.298
 kinds-0.275
 sizes-0.273
 oval-0.249
Tokenmy
Token ID1820
Feature activation-5.10e-01

Pos logit contributions
lf+0.909
 flavors+0.897
 sizes+0.849
 kinds+0.842
cest+0.803
Neg logit contributions
hered-0.737
oken-0.698
pped-0.658
 grounded-0.649
ives-0.626
Token asked
Token ID1965
Feature activation-2.23e+00

Pos logit contributions
hered+0.954
oken+0.949
 risking+0.870
 grounded+0.864
pped+0.860
Neg logit contributions
lf-0.774
 flavors-0.729
 kinds-0.686
 sizes-0.679
 oval-0.642
Token her
Token ID607
Feature activation-8.83e-01

Pos logit contributions
hered+5.122
oken+5.026
 risking+4.829
 grounded+4.713
pped+4.693
Neg logit contributions
lf-2.580
 flavors-2.280
 kinds-2.277
 sizes-2.029
 tower-1.971
Token,
Token ID11
Feature activation-1.31e+00

Pos logit contributions
hered+1.485
oken+1.441
pped+1.334
 risking+1.304
 grounded+1.280
Neg logit contributions
lf-1.546
 flavors-1.471
 kinds-1.418
 sizes-1.384
 tower-1.321
Token "
Token ID366
Feature activation-2.02e+00

Pos logit contributions
hered+2.903
oken+2.896
ives+2.660
pped+2.648
 risking+2.615
Neg logit contributions
lf-1.593
 flavors-1.442
 kinds-1.362
 sizes-1.346
 oval-1.288
TokenL
Token ID43
Feature activation-1.98e+00

Pos logit contributions
oken+3.357
hered+3.233
 risking+2.794
ives+2.758
 grounded+2.718
Neg logit contributions
lf-3.853
 flavors-3.722
 kinds-3.479
 sizes-3.412
 oval-3.284
Tokenily
Token ID813
Feature activation-7.45e-01

Pos logit contributions
hered+2.431
oken+2.201
 risking+2.150
 grounded+1.973
oning+1.905
Neg logit contributions
lf-4.517
 kinds-4.152
 flavors-4.127
 sizes-3.890
 tower-3.828