TOP ACTIVATIONS
MAX = 5.218

you go !" Ben said , giving her the t eddy
robots ever ," Tom said , following her with his car
zoom again !" Ben said , handing her the car .
clean up ," Lily said , starting to pick up her
sounds fun ." Max said , as they walked to the
than you !" Lily said , as she turned her car
the clock !" Lily said , as the clock started to
you go ." Ben says , as he gives her his
home !" their parents say , looking at them with surprise
, zoom !" said Lily , excited ly . " Grand
" Sure !" said Lily , happy to have a friend
it is !" Lily whispered , pointing to the doll .
a star !" Lily said , showing her pizza .
the oil !" Ben says , as he pushes his red
much fun !" Lily said , smiling . "
get it !" Max whispered , climbing on a chair .
big crab !" Tom said , holding up his bucket .
, Spot !" Tom said , holding a stuffed dog as
and shiny !" Lily said , leaning over to see .
's play !" Sam said , eager to try .

MIN ACTIVATIONS
MIN = -4.463

She did not know where she was . She did not
of energy . Her favourite past ime was going to the
named Louise . Her favourite past ime was playing in the
She did not know where she was . She did not
I don 't know where she is ," the little girl
She did not know where she was going , but she
" Stop , these are ours !" Sara said
pony named Spark y . Spark y was very small and
I don 't know where he is ." Tim wanted to
<|endoftext|> Once upon a time there was a man . He
He asks his mom where he is . His mom says
. They come to see what is going on . They
play with . All of her friends came to the party
<|endoftext|> Once upon a time there was a little girl named
It was the most beautiful past el she had ever seen
<|endoftext|> Once upon a time there lived a kid named Tom
<|endoftext|> Once upon a time there was a 3 year old
<|endoftext|> Once upon a time there was a little girl named
I don 't know where he is . I 'm so
took our drawings and our past els !" Mom

INTERVAL 2.798 - 5.218
CONTAINS 1.245%

felt better . She said , " I 'm sorry ,
up ." The frog was so excited . He said
. Kai 's mom was very proud of him and she
said . Mom was busy in the kitchen .
Ben asked , " What kind of cheese

INTERVAL 0.378 - 2.798
CONTAINS 34.311%

that she was brave enough to try something new . After
mom my and daddy laughed with happiness . Then they cl
terrible !" Dan cried out . " What are we going
, juicy cherry on a tree . Benny hopped over to
my was happy that he had organized it nicely . He

INTERVAL -2.043 - 0.378
CONTAINS 60.847%

. " Help ! Help !" Tom and Lily
a girl named Molly . She was a very intelligent girl
. She thought and thought until she had an idea .
day , Tim my went to school and Max had to
! I will sell them at the market and make some

INTERVAL -4.463 - -2.043
CONTAINS 3.597%

things you have ! <|endoftext|> Sam and Joe were best friends
said , " Ted , let 's part our hair in
One morning , a little girl named Sue woke up early
yard is very messy . There are toys , leaves ,
ringing sound . She wondered what it could be .

Token you
Token ID345
Feature activation-9.16e-01

Pos logit contributions
orns+3.853
ests+3.674
oline+3.625
ible+3.559
iles+3.555
Neg logit contributions
shadow-4.905
 louder-4.752
 proud-4.670
 determined-4.644
 confident-4.499
Token go
Token ID467
Feature activation+9.41e-02

Pos logit contributions
orns+1.699
 is+1.685
oline+1.650
ible+1.630
ons+1.614
Neg logit contributions
shadow-1.592
 louder-1.489
ured-1.421
 determined-1.409
Just-1.394
Token!"
Token ID2474
Feature activation-4.17e-01

Pos logit contributions
shadow+0.224
ured+0.206
 determined+0.204
 louder+0.204
 confident+0.204
Neg logit contributions
 is-0.122
 lives-0.107
ie-0.107
orns-0.105
 belongs-0.104
Token Ben
Token ID3932
Feature activation+8.04e-01

Pos logit contributions
 is+0.506
 lives+0.453
ie+0.451
orns+0.445
 belongs+0.443
Neg logit contributions
shadow-1.001
 louder-0.948
 proud-0.942
 confident-0.941
ured-0.937
Token said
Token ID531
Feature activation+1.97e+00

Pos logit contributions
shadow+1.870
ured+1.724
ivated+1.702
 determined+1.696
 confident+1.692
Neg logit contributions
 is-1.110
ie-0.970
 lives-0.947
 belongs-0.929
oline-0.921
Token,
Token ID11
Feature activation+5.22e+00

Pos logit contributions
shadow+5.393
Unfortunately+5.030
Half+4.975
ivated+4.940
ured+4.913
Neg logit contributions
 is-2.157
ie-1.863
 belongs-1.568
 lives-1.557
 acts-1.537
Token giving
Token ID3501
Feature activation-1.48e+00

Pos logit contributions
Unfortunately+13.710
shadow+13.680
Half+12.954
Hours+12.365
har+12.280
Neg logit contributions
 is-8.467
ie-7.198
 brother-6.419
 doesn-6.221
 was-6.178
Token her
Token ID607
Feature activation-1.52e+00

Pos logit contributions
 is+3.021
 lives+2.969
iles+2.951
 belongs+2.903
ible+2.876
Neg logit contributions
 louder-2.212
shadow-2.211
 proud-2.063
 confident-2.025
ured-2.013
Token the
Token ID262
Feature activation-6.55e-01

Pos logit contributions
iles+3.088
 is+3.066
orns+3.038
oline+2.988
 lives+2.982
Neg logit contributions
shadow-2.396
 louder-2.242
 proud-2.159
 confident-2.086
 determined-2.082
Token t
Token ID256
Feature activation-1.55e+00

Pos logit contributions
 is+1.000
ie+0.960
orns+0.948
oline+0.936
iles+0.933
Neg logit contributions
shadow-1.347
 louder-1.250
 determined-1.246
 confident-1.233
 proud-1.232
Tokeneddy
Token ID21874
Feature activation-2.56e+00

Pos logit contributions
 is+1.991
 lives+1.830
 acts+1.783
 belongs+1.774
ible+1.769
Neg logit contributions
shadow-3.687
eyed-3.374
Just-3.316
 louder-3.297
ooting-3.292
Token robots
Token ID14193
Feature activation-1.49e+00

Pos logit contributions
shadow+0.214
ured+0.198
 louder+0.196
 determined+0.196
 confident+0.196
Neg logit contributions
 is-0.124
ie-0.110
 lives-0.110
orns-0.108
 belongs-0.107
Token ever
Token ID1683
Feature activation-1.91e+00

Pos logit contributions
 is+2.265
ie+2.212
ible+2.206
orns+2.202
oline+2.172
Neg logit contributions
shadow-2.962
 proud-2.836
 confident-2.827
 determined-2.789
 louder-2.736
Token,"
Token ID553
Feature activation-3.91e-01

Pos logit contributions
ests+2.811
 is+2.711
oline+2.642
orns+2.605
ible+2.597
Neg logit contributions
shadow-4.242
 proud-3.861
 confident-3.838
 louder-3.827
 pleased-3.783
Token Tom
Token ID4186
Feature activation+5.52e-01

Pos logit contributions
 is+0.493
ie+0.446
 lives+0.440
orns+0.436
 belongs+0.433
Neg logit contributions
shadow-0.929
 louder-0.871
 confident-0.860
 determined-0.856
 proud-0.856
Token said
Token ID531
Feature activation+2.14e+00

Pos logit contributions
shadow+1.256
ured+1.158
 determined+1.141
ivated+1.141
Unfortunately+1.136
Neg logit contributions
 is-0.778
ie-0.692
 lives-0.680
 belongs-0.664
oline-0.659
Token,
Token ID11
Feature activation+5.21e+00

Pos logit contributions
shadow+5.909
Unfortunately+5.605
Half+5.487
ivated+5.473
 sidelines+5.439
Neg logit contributions
 is-2.279
ie-1.994
 acts-1.642
 belongs-1.640
 lives-1.637
Token following
Token ID1708
Feature activation-6.30e-01

Pos logit contributions
Unfortunately+12.168
shadow+11.724
Half+11.092
eyed+10.693
ured+10.620
Neg logit contributions
 is-9.916
ie-8.474
 dinosaurs-8.147
 bears-7.769
 lives-7.752
Token her
Token ID607
Feature activation+3.80e-01

Pos logit contributions
 is+1.052
 lives+0.970
 belongs+0.962
orns+0.957
ie+0.938
Neg logit contributions
shadow-1.232
 louder-1.161
ured-1.159
 determined-1.120
 confident-1.116
Token with
Token ID351
Feature activation+2.85e-01

Pos logit contributions
shadow+0.835
ured+0.766
 determined+0.756
 confident+0.754
ivated+0.753
Neg logit contributions
 is-0.571
ie-0.509
 lives-0.507
 belongs-0.494
orns-0.493
Token his
Token ID465
Feature activation-5.72e-01

Pos logit contributions
shadow+0.545
ured+0.494
 determined+0.487
 louder+0.485
 confident+0.485
Neg logit contributions
 is-0.502
ie-0.457
 lives-0.456
orns-0.447
 belongs-0.447
Token car
Token ID1097
Feature activation+1.08e+00

Pos logit contributions
 is+0.950
ie+0.897
orns+0.884
oline+0.876
 lives+0.869
Neg logit contributions
shadow-1.107
 determined-1.033
 louder-1.029
 confident-1.021
ured-1.015
Token zoom
Token ID19792
Feature activation-1.12e+00

Pos logit contributions
shadow+2.027
ured+1.841
 determined+1.818
 pleased+1.807
 confident+1.802
Neg logit contributions
 is-2.490
 lives-2.228
ie-2.223
 belongs-2.215
ons-2.129
Token again
Token ID757
Feature activation-1.26e+00

Pos logit contributions
 is+2.296
ie+2.240
 lives+2.229
iles+2.202
orns+2.195
Neg logit contributions
shadow-1.717
 louder-1.580
ured-1.496
 confident-1.459
 proud-1.453
Token!"
Token ID2474
Feature activation-4.13e-01

Pos logit contributions
 is+1.886
ests+1.779
 belongs+1.765
oline+1.765
 lives+1.757
Neg logit contributions
shadow-2.700
 louder-2.485
 confident-2.415
Just-2.398
 determined-2.392
Token Ben
Token ID3932
Feature activation+1.11e+00

Pos logit contributions
 is+0.595
orns+0.549
ie+0.548
 lives+0.539
ible+0.535
Neg logit contributions
shadow-0.898
 confident-0.834
 louder-0.834
 determined-0.834
ured-0.829
Token said
Token ID531
Feature activation+2.52e+00

Pos logit contributions
shadow+2.615
ured+2.439
 crashing+2.412
ivated+2.400
pering+2.389
Neg logit contributions
 is-1.505
ie-1.300
 lives-1.258
 belongs-1.238
ons-1.233
Token,
Token ID11
Feature activation+5.19e+00

Pos logit contributions
shadow+7.155
Unfortunately+6.821
Half+6.767
 sidelines+6.656
ivated+6.647
Neg logit contributions
 is-2.424
ie-2.118
 happened-1.787
pa-1.688
 belongs-1.687
Token handing
Token ID22786
Feature activation-1.54e+00

Pos logit contributions
shadow+11.969
Unfortunately+11.919
Half+11.242
ooting+10.696
eyed+10.566
Neg logit contributions
 is-9.389
ie-8.449
 dinosaurs-7.866
 brother-7.725
 happened-7.619
Token her
Token ID607
Feature activation-7.38e-01

Pos logit contributions
 is+2.958
 lives+2.952
iles+2.947
orns+2.947
 belongs+2.915
Neg logit contributions
 louder-2.384
shadow-2.364
 proud-2.237
ured-2.185
 confident-2.173
Token the
Token ID262
Feature activation+3.49e-01

Pos logit contributions
 is+1.236
ie+1.175
iles+1.169
oline+1.168
 lives+1.168
Neg logit contributions
shadow-1.434
 louder-1.299
 proud-1.288
 determined-1.277
 confident-1.269
Token car
Token ID1097
Feature activation+9.03e-01

Pos logit contributions
shadow+0.771
ured+0.710
 determined+0.699
 confident+0.698
 louder+0.697
Neg logit contributions
 is-0.515
 lives-0.456
ie-0.455
 belongs-0.445
orns-0.444
Token.
Token ID13
Feature activation+1.15e+00

Pos logit contributions
shadow+1.785
ured+1.621
ivated+1.615
Unfortunately+1.599
 determined+1.595
Neg logit contributions
 is-1.579
ie-1.405
 lives-1.392
 belongs-1.386
oline-1.355
Token clean
Token ID3424
Feature activation-9.91e-01

Pos logit contributions
shadow+1.213
ured+1.081
 determined+1.071
ivated+1.062
 pleased+1.058
Neg logit contributions
 is-1.347
 lives-1.231
ie-1.227
 belongs-1.197
orns-1.191
Token up
Token ID510
Feature activation-4.82e-01

Pos logit contributions
 is+1.743
 lives+1.650
orns+1.642
 belongs+1.614
 happened+1.614
Neg logit contributions
shadow-1.792
 louder-1.687
ured-1.671
 proud-1.606
 confident-1.605
Token,"
Token ID553
Feature activation+2.69e-02

Pos logit contributions
 is+0.715
ie+0.660
orns+0.651
 lives+0.650
oline+0.642
Neg logit contributions
shadow-1.026
 louder-0.948
ured-0.941
 confident-0.934
 determined-0.929
Token Lily
Token ID20037
Feature activation+1.55e-01

Pos logit contributions
shadow+0.064
 louder+0.059
ured+0.059
 confident+0.058
 determined+0.058
Neg logit contributions
 is-0.035
ie-0.031
 lives-0.031
orns-0.030
 belongs-0.030
Token said
Token ID531
Feature activation+1.65e+00

Pos logit contributions
shadow+0.341
ured+0.313
 determined+0.310
 louder+0.310
 confident+0.310
Neg logit contributions
 is-0.226
ie-0.202
 lives-0.201
orns-0.197
 belongs-0.197
Token,
Token ID11
Feature activation+5.14e+00

Pos logit contributions
shadow+4.345
Unfortunately+4.064
ivated+4.007
ured+3.991
Half+3.972
Neg logit contributions
 is-1.891
ie-1.641
 lives-1.449
ons-1.446
 acts-1.437
Token starting
Token ID3599
Feature activation+2.76e+00

Pos logit contributions
Unfortunately+12.800
shadow+12.593
Half+11.549
har+11.251
eyed+11.234
Neg logit contributions
 is-8.952
ie-7.593
 dinosaurs-7.330
 dreams-7.271
 brother-7.146
Token to
Token ID284
Feature activation+1.81e+00

Pos logit contributions
shadow+5.496
ivated+5.129
ured+5.036
Unfortunately+4.881
 pleased+4.718
Neg logit contributions
 is-5.470
 lives-4.687
ons-4.678
 dinosaurs-4.669
 dreams-4.657
Token pick
Token ID2298
Feature activation-1.22e+00

Pos logit contributions
shadow+3.296
ured+2.972
Unfortunately+2.947
Just+2.896
ivated+2.895
Neg logit contributions
 is-3.642
ie-3.281
 lives-3.240
 dreams-3.199
 acts-3.189
Token up
Token ID510
Feature activation-3.78e-01

Pos logit contributions
 is+2.459
 lives+2.372
 belongs+2.338
ie+2.298
iles+2.284
Neg logit contributions
shadow-1.821
 louder-1.817
ured-1.716
 confident-1.676
 proud-1.670
Token her
Token ID607
Feature activation-1.49e+00

Pos logit contributions
 is+0.667
 lives+0.613
ie+0.613
 belongs+0.608
orns+0.603
Neg logit contributions
shadow-0.698
 louder-0.649
ured-0.644
 confident-0.637
 determined-0.635
Token sounds
Token ID5238
Feature activation-7.05e-01

Pos logit contributions
orns+2.211
iles+2.087
oline+2.032
ie+2.015
ests+2.009
Neg logit contributions
 louder-3.921
 determined-3.836
shadow-3.833
 confident-3.797
 proud-3.776
Token fun
Token ID1257
Feature activation-2.39e+00

Pos logit contributions
 is+1.348
ie+1.326
orns+1.312
ons+1.289
ests+1.284
Neg logit contributions
shadow-1.147
 confident-1.076
 proud-1.071
 louder-1.070
 determined-1.052
Token."
Token ID526
Feature activation-3.07e-01

Pos logit contributions
orns+3.948
ible+3.917
icer+3.877
iles+3.855
ests+3.853
Neg logit contributions
shadow-4.593
 louder-4.469
 proud-4.377
 confident-4.350
 determined-4.338
Token Max
Token ID5436
Feature activation+1.51e-01

Pos logit contributions
 is+0.407
ie+0.367
 lives+0.362
orns+0.362
 belongs+0.359
Neg logit contributions
shadow-0.710
 louder-0.664
 confident-0.656
 proud-0.654
 determined-0.653
Token said
Token ID531
Feature activation+1.82e+00

Pos logit contributions
shadow+0.329
ured+0.303
 determined+0.299
 louder+0.299
 confident+0.299
Neg logit contributions
 is-0.222
ie-0.199
 lives-0.198
orns-0.195
 belongs-0.194
Token,
Token ID11
Feature activation+5.14e+00

Pos logit contributions
shadow+4.896
Unfortunately+4.571
Half+4.501
ivated+4.491
ured+4.486
Neg logit contributions
 is-2.011
ie-1.780
ons-1.495
 lives-1.493
 belongs-1.489
Token as
Token ID355
Feature activation-1.21e+00

Pos logit contributions
shadow+12.600
Unfortunately+12.398
Half+11.519
har+11.226
eyed+11.182
Neg logit contributions
 is-8.930
ie-7.952
 dinosaurs-7.267
berries-7.085
 brother-7.040
Token they
Token ID484
Feature activation+5.45e-01

Pos logit contributions
 is+2.494
 belongs+2.370
orns+2.359
 lives+2.356
oline+2.342
Neg logit contributions
shadow-1.903
 louder-1.830
 confident-1.802
 determined-1.771
 proud-1.767
Token walked
Token ID6807
Feature activation+1.95e+00

Pos logit contributions
shadow+1.095
ured+0.994
 determined+0.978
 confident+0.977
ivated+0.975
Neg logit contributions
 is-0.911
ie-0.830
 lives-0.822
orns-0.805
oline-0.803
Token to
Token ID284
Feature activation+5.31e-01

Pos logit contributions
shadow+3.793
ivated+3.300
Unfortunately+3.300
 pleased+3.238
ured+3.234
Neg logit contributions
 is-3.780
ie-3.490
 lives-3.338
oline-3.303
orns-3.283
Token the
Token ID262
Feature activation+8.45e-01

Pos logit contributions
shadow+1.097
ured+1.003
 determined+0.989
 confident+0.983
ivated+0.980
Neg logit contributions
 is-0.865
ie-0.779
 lives-0.775
 belongs-0.752
orns-0.751
Token than
Token ID621
Feature activation-7.86e-01

Pos logit contributions
 is+0.906
ie+0.850
 lives+0.847
orns+0.840
oline+0.835
Neg logit contributions
shadow-1.061
 louder-0.966
 confident-0.960
 determined-0.959
ured-0.954
Token you
Token ID345
Feature activation-8.86e-01

Pos logit contributions
 is+1.023
 lives+0.917
 belongs+0.896
ible+0.892
ie+0.891
Neg logit contributions
shadow-1.859
 louder-1.729
ured-1.703
 crashing-1.693
 determined-1.684
Token!"
Token ID2474
Feature activation+3.15e-01

Pos logit contributions
 is+1.101
orns+1.049
ie+1.030
 lives+1.025
ests+1.015
Neg logit contributions
shadow-2.104
 louder-1.924
 determined-1.920
 proud-1.904
 confident-1.902
Token Lily
Token ID20037
Feature activation+6.46e-01

Pos logit contributions
shadow+0.752
ured+0.696
 determined+0.688
 louder+0.686
 confident+0.686
Neg logit contributions
 is-0.405
ie-0.356
 lives-0.353
orns-0.343
 belongs-0.343
Token said
Token ID531
Feature activation+2.33e+00

Pos logit contributions
shadow+1.354
ured+1.239
ivated+1.216
 determined+1.213
 confident+1.212
Neg logit contributions
 is-1.029
ie-0.918
 lives-0.905
 belongs-0.889
oline-0.885
Token,
Token ID11
Feature activation+5.13e+00

Pos logit contributions
shadow+6.519
Unfortunately+6.144
Half+6.097
 sidelines+5.982
ivated+5.953
Neg logit contributions
 is-2.358
ie-2.078
 happened-1.731
 acts-1.664
ons-1.654
Token as
Token ID355
Feature activation-1.33e-01

Pos logit contributions
shadow+12.098
Unfortunately+11.709
Half+11.158
har+10.494
ooting+10.430
Neg logit contributions
 is-9.646
ie-8.443
 was-7.666
 lives-7.630
 happened-7.473
Token she
Token ID673
Feature activation+1.14e+00

Pos logit contributions
 is+0.267
 lives+0.247
ie+0.247
orns+0.245
 belongs+0.244
Neg logit contributions
shadow-0.219
 determined-0.197
 confident-0.197
 louder-0.197
ured-0.196
Token turned
Token ID2900
Feature activation+1.98e+00

Pos logit contributions
shadow+2.073
ured+1.895
ivated+1.884
 confident+1.823
 pleased+1.822
Neg logit contributions
 is-2.143
ie-1.906
 lives-1.897
oline-1.877
 belongs-1.853
Token her
Token ID607
Feature activation+9.52e-01

Pos logit contributions
shadow+4.112
ured+3.649
Just+3.558
uring+3.524
Unfortunately+3.523
Neg logit contributions
 is-3.590
ie-3.249
oline-3.113
 happened-3.043
ons-3.028
Token car
Token ID1097
Feature activation+2.54e+00

Pos logit contributions
shadow+2.085
ured+1.910
ivated+1.892
Unfortunately+1.888
 determined+1.872
Neg logit contributions
 is-1.495
 lives-1.299
ie-1.289
 belongs-1.279
oline-1.237
Token the
Token ID262
Feature activation+3.64e-01

Pos logit contributions
 is+1.331
orns+1.246
 lives+1.244
ie+1.239
 belongs+1.236
Neg logit contributions
shadow-1.484
 louder-1.361
ured-1.340
 proud-1.317
 determined-1.311
Token clock
Token ID8801
Feature activation+2.36e-02

Pos logit contributions
shadow+0.748
ured+0.682
 determined+0.671
 confident+0.669
 louder+0.668
Neg logit contributions
 is-0.592
 lives-0.532
ie-0.531
 belongs-0.520
orns-0.517
Token!"
Token ID2474
Feature activation+4.03e-01

Pos logit contributions
shadow+0.055
ured+0.051
 louder+0.051
 determined+0.051
 confident+0.051
Neg logit contributions
 is-0.030
 lives-0.027
ie-0.027
orns-0.027
 belongs-0.026
Token Lily
Token ID20037
Feature activation+1.01e+00

Pos logit contributions
shadow+0.939
ured+0.867
 determined+0.856
 louder+0.853
 confident+0.852
Neg logit contributions
 is-0.545
ie-0.481
 lives-0.477
 belongs-0.465
orns-0.463
Token said
Token ID531
Feature activation+2.87e+00

Pos logit contributions
shadow+2.169
ured+2.018
ivated+1.976
 crashing+1.955
 determined+1.953
Neg logit contributions
 is-1.562
ie-1.393
 lives-1.344
ons-1.326
oline-1.315
Token,
Token ID11
Feature activation+5.12e+00

Pos logit contributions
shadow+8.215
Unfortunately+7.804
Half+7.804
 sidelines+7.596
Just+7.497
Neg logit contributions
 is-2.797
ie-2.539
 happened-2.128
ons-2.006
pa-1.972
Token as
Token ID355
Feature activation-4.05e-01

Pos logit contributions
shadow+11.766
Unfortunately+11.738
Half+11.182
ured+10.737
ooting+10.463
Neg logit contributions
 is-9.267
ie-8.520
 dreams-7.585
 was-7.546
 brother-7.519
Token the
Token ID262
Feature activation+5.37e-01

Pos logit contributions
 is+0.805
orns+0.749
 lives+0.749
ie+0.745
 belongs+0.744
Neg logit contributions
shadow-0.666
 louder-0.609
 confident-0.608
 determined-0.605
 proud-0.602
Token clock
Token ID8801
Feature activation+1.62e+00

Pos logit contributions
shadow+1.193
ured+1.093
ivated+1.068
Unfortunately+1.064
 determined+1.062
Neg logit contributions
 is-0.803
 lives-0.713
ie-0.711
 belongs-0.689
oline-0.684
Token started
Token ID2067
Feature activation+2.96e+00

Pos logit contributions
shadow+3.361
ivated+3.001
Unfortunately+2.993
ured+2.990
pering+2.949
Neg logit contributions
 is-2.765
ie-2.518
 happened-2.422
 lives-2.411
oline-2.394
Token to
Token ID284
Feature activation+2.68e+00

Pos logit contributions
shadow+6.647
ured+6.210
ivated+6.101
Unfortunately+6.034
Just+5.851
Neg logit contributions
 is-5.038
ie-4.477
 happened-4.446
 dreams-4.351
ons-4.344
Token you
Token ID345
Feature activation-6.68e-01

Pos logit contributions
orns+3.386
ests+3.233
oline+3.144
ible+3.129
iles+3.045
Neg logit contributions
shadow-4.724
 louder-4.549
 proud-4.476
 determined-4.459
 confident-4.333
Token go
Token ID467
Feature activation+1.60e-01

Pos logit contributions
 is+1.279
orns+1.251
 lives+1.216
ie+1.213
oline+1.211
Neg logit contributions
shadow-1.117
 louder-1.020
ured-0.997
 determined-0.984
 proud-0.972
Token."
Token ID526
Feature activation+7.38e-01

Pos logit contributions
shadow+0.399
ured+0.370
 determined+0.366
 louder+0.366
 confident+0.365
Neg logit contributions
 is-0.188
ie-0.163
 lives-0.162
orns-0.158
 belongs-0.158
Token Ben
Token ID3932
Feature activation-2.19e-01

Pos logit contributions
shadow+1.764
ured+1.626
 determined+1.600
ivated+1.593
 louder+1.591
Neg logit contributions
 is-1.008
 lives-0.867
ie-0.858
orns-0.842
 belongs-0.840
Token says
Token ID1139
Feature activation+2.22e+00

Pos logit contributions
 is+0.181
ie+0.160
orns+0.155
 lives+0.153
ible+0.146
Neg logit contributions
shadow-0.603
ured-0.568
 louder-0.568
 determined-0.566
 confident-0.566
Token,
Token ID11
Feature activation+5.11e+00

Pos logit contributions
shadow+6.128
Unfortunately+5.769
ivated+5.652
Half+5.643
 confident+5.589
Neg logit contributions
 is-2.438
ie-2.052
 belongs-1.723
 lives-1.717
 acts-1.659
Token as
Token ID355
Feature activation-5.07e-01

Pos logit contributions
Unfortunately+12.513
shadow+12.492
har+11.458
Half+11.381
Hours+11.098
Neg logit contributions
 is-9.038
ie-7.549
 dinosaurs-6.890
 doesn-6.876
 lives-6.761
Token he
Token ID339
Feature activation-6.26e-01

Pos logit contributions
 is+1.027
 lives+0.956
orns+0.955
 belongs+0.948
ie+0.946
Neg logit contributions
shadow-0.824
 louder-0.759
 confident-0.748
 determined-0.748
 proud-0.735
Token gives
Token ID3607
Feature activation-2.25e+00

Pos logit contributions
 is+0.939
ie+0.914
orns+0.875
 lives+0.866
 belongs+0.860
Neg logit contributions
shadow-1.274
 determined-1.184
 louder-1.180
 confident-1.177
ured-1.168
Token her
Token ID607
Feature activation-1.19e+00

Pos logit contributions
 is+5.072
 lives+4.992
 belongs+4.931
iles+4.855
 acts+4.816
Neg logit contributions
shadow-3.051
 louder-3.012
 proud-2.708
ured-2.661
 confident-2.547
Token his
Token ID465
Feature activation-1.14e+00

Pos logit contributions
 is+2.231
iles+2.182
orns+2.160
ie+2.157
 lives+2.144
Neg logit contributions
shadow-2.006
 louder-1.866
 proud-1.831
 confident-1.798
 determined-1.785
Token home
Token ID1363
Feature activation+1.01e-01

Pos logit contributions
 is+0.025
 lives+0.023
ie+0.023
orns+0.023
 belongs+0.023
Neg logit contributions
shadow-0.023
ured-0.021
 louder-0.021
 determined-0.021
 confident-0.021
Token!"
Token ID2474
Feature activation+7.05e-02

Pos logit contributions
shadow+0.235
ured+0.216
 determined+0.214
 louder+0.214
 confident+0.214
Neg logit contributions
 is-0.135
ie-0.120
 lives-0.119
orns-0.117
 belongs-0.116
Token their
Token ID511
Feature activation+1.78e-01

Pos logit contributions
shadow+0.155
ured+0.143
 louder+0.142
 determined+0.142
 confident+0.141
Neg logit contributions
 is-0.103
ie-0.092
 lives-0.092
orns-0.091
 belongs-0.090
Token parents
Token ID3397
Feature activation+1.35e+00

Pos logit contributions
shadow+0.384
ured+0.354
 determined+0.350
 louder+0.349
 confident+0.349
Neg logit contributions
 is-0.267
ie-0.239
 lives-0.238
orns-0.234
 belongs-0.233
Token say
Token ID910
Feature activation+2.54e+00

Pos logit contributions
shadow+2.853
 confident+2.522
ured+2.515
Unfortunately+2.507
Half+2.494
Neg logit contributions
 is-2.321
ie-2.058
 lives-1.995
 belongs-1.970
 acts-1.934
Token,
Token ID11
Feature activation+5.11e+00

Pos logit contributions
shadow+7.116
Unfortunately+6.745
Half+6.744
ivated+6.599
 confident+6.585
Neg logit contributions
 is-2.636
ie-2.244
ons-1.893
 lives-1.810
 happened-1.777
Token looking
Token ID2045
Feature activation+6.26e-01

Pos logit contributions
shadow+11.343
Unfortunately+10.997
Half+10.693
ured+10.211
mic+10.151
Neg logit contributions
 is-9.987
ie-8.656
pa-7.993
 was-7.948
 brother-7.920
Token at
Token ID379
Feature activation-1.75e+00

Pos logit contributions
shadow+1.089
ured+0.966
Unfortunately+0.947
Just+0.942
ivated+0.939
Neg logit contributions
 is-1.242
ie-1.140
 lives-1.134
 belongs-1.112
oline-1.097
Token them
Token ID606
Feature activation+2.22e+00

Pos logit contributions
 is+3.416
 belongs+3.328
orns+3.313
ests+3.287
 lives+3.278
Neg logit contributions
 louder-2.820
shadow-2.755
ured-2.674
 proud-2.634
 crashing-2.610
Token with
Token ID351
Feature activation-6.47e-02

Pos logit contributions
shadow+5.773
Unfortunately+5.365
Half+5.306
Just+5.291
 confident+5.224
Neg logit contributions
 is-2.848
ie-2.494
 lives-2.273
 belongs-2.126
 dreams-2.113
Token surprise
Token ID5975
Feature activation-9.77e-01

Pos logit contributions
 is+0.119
ie+0.110
 lives+0.108
orns+0.108
 belongs+0.108
Neg logit contributions
shadow-0.116
ured-0.107
 louder-0.106
 determined-0.106
 confident-0.105
Token,
Token ID11
Feature activation+1.07e+00

Pos logit contributions
 is+1.529
orns+1.463
ests+1.459
 lives+1.450
iles+1.433
Neg logit contributions
shadow-2.519
 louder-2.298
ured-2.278
 proud-2.250
 determined-2.244
Token zoom
Token ID19792
Feature activation-9.58e-01

Pos logit contributions
shadow+2.347
ured+2.181
ivated+2.161
 determined+2.134
 confident+2.133
Neg logit contributions
 is-1.643
 lives-1.428
ie-1.418
 belongs-1.386
 dreams-1.332
Token!"
Token ID2474
Feature activation+9.91e-01

Pos logit contributions
 is+1.498
orns+1.419
 lives+1.415
iles+1.398
ie+1.393
Neg logit contributions
shadow-1.935
 louder-1.788
ured-1.776
 proud-1.717
 confident-1.716
Token said
Token ID531
Feature activation+2.70e-02

Pos logit contributions
shadow+2.329
ured+2.147
 determined+2.123
 crashing+2.113
ivated+2.111
Neg logit contributions
 is-1.346
ie-1.181
 lives-1.165
 belongs-1.132
ons-1.119
Token Lily
Token ID20037
Feature activation+2.31e+00

Pos logit contributions
shadow+0.062
ured+0.057
 louder+0.057
 determined+0.057
 confident+0.057
Neg logit contributions
 is-0.037
 lives-0.032
ie-0.032
orns-0.032
 belongs-0.032
Token,
Token ID11
Feature activation+5.11e+00

Pos logit contributions
shadow+5.997
Unfortunately+5.817
ured+5.752
ivated+5.710
Half+5.699
Neg logit contributions
 is-2.577
ie-2.327
 lives-1.979
ons-1.966
 dreams-1.917
Token excited
Token ID6568
Feature activation+9.79e-01

Pos logit contributions
Unfortunately+10.995
shadow+10.904
Half+10.634
ured+9.959
pering+9.925
Neg logit contributions
 is-10.001
ie-8.936
 lived-8.185
 lives-8.183
 was-8.178
Tokenly
Token ID306
Feature activation+2.45e+00

Pos logit contributions
shadow+2.265
 determined+2.064
Unfortunately+2.061
ivated+2.050
 confident+2.049
Neg logit contributions
 is-1.391
ie-1.278
 lives-1.208
oline-1.174
 belongs-1.168
Token.
Token ID13
Feature activation+2.40e+00

Pos logit contributions
shadow+6.310
Unfortunately+6.033
Half+5.863
ivated+5.836
Just+5.791
Neg logit contributions
 is-2.815
ie-2.521
 lives-2.401
ons-2.401
pa-2.328
Token "
Token ID366
Feature activation+2.21e+00

Pos logit contributions
shadow+4.467
ured+4.244
ivated+4.205
 confident+4.139
 determined+4.127
Neg logit contributions
 is-4.708
 doesn-3.923
 lives-3.910
 isn-3.874
 belongs-3.873
TokenGrand
Token ID23581
Feature activation-1.23e+00

Pos logit contributions
shadow+4.309
ured+4.022
 determined+3.899
 pleased+3.873
uring+3.850
Neg logit contributions
 is-3.863
oline-3.454
orns-3.453
 lives-3.426
ie-3.413
Token"
Token ID1
Feature activation+1.69e+00

Pos logit contributions
shadow+4.330
ivated+4.183
ured+4.164
 crashing+4.107
pering+4.062
Neg logit contributions
 is-3.338
ie-2.962
 belongs-2.781
ara-2.763
 lives-2.757
TokenSure
Token ID19457
Feature activation-1.08e+00

Pos logit contributions
shadow+2.238
ured+1.933
pering+1.907
uring+1.857
 crashing+1.847
Neg logit contributions
 is-4.095
 lives-3.693
 belongs-3.685
ie-3.675
ests-3.675
Token!"
Token ID2474
Feature activation+9.29e-01

Pos logit contributions
 is+1.604
orns+1.575
ests+1.540
oline+1.536
 lives+1.513
Neg logit contributions
shadow-2.336
 louder-2.112
 determined-2.103
 proud-2.081
 confident-2.072
Token said
Token ID531
Feature activation-2.82e-01

Pos logit contributions
shadow+2.180
ured+2.022
 determined+1.996
ivated+1.986
 louder+1.981
Neg logit contributions
 is-1.283
ie-1.107
 lives-1.085
 belongs-1.067
ons-1.065
Token Lily
Token ID20037
Feature activation+1.94e+00

Pos logit contributions
 is+0.389
 lives+0.350
orns+0.346
 belongs+0.341
ie+0.341
Neg logit contributions
shadow-0.643
ured-0.592
 louder-0.589
 confident-0.585
 determined-0.583
Token,
Token ID11
Feature activation+5.10e+00

Pos logit contributions
shadow+5.065
Unfortunately+4.823
 crashing+4.764
ivated+4.757
ured+4.757
Neg logit contributions
 is-2.142
ie-1.896
ons-1.653
 belongs-1.629
 lives-1.625
Token happy
Token ID3772
Feature activation+1.32e+00

Pos logit contributions
shadow+13.246
Unfortunately+12.137
har+11.875
pering+11.864
Half+11.859
Neg logit contributions
 is-8.130
ie-6.745
 doesn-6.544
 happened-6.447
 was-6.391
Token to
Token ID284
Feature activation+1.31e+00

Pos logit contributions
shadow+2.626
ivated+2.359
ured+2.347
Unfortunately+2.338
 determined+2.333
Neg logit contributions
 is-2.393
ie-2.205
 lives-2.112
 happened-2.073
oline-2.071
Token have
Token ID423
Feature activation-7.57e-01

Pos logit contributions
shadow+2.379
ured+2.097
 determined+2.084
 louder+2.050
ivated+2.050
Neg logit contributions
 is-2.564
ie-2.352
 lives-2.268
ons-2.239
 belongs-2.225
Token a
Token ID257
Feature activation+9.21e-01

Pos logit contributions
 is+1.383
 lives+1.306
 belongs+1.287
ie+1.283
iles+1.274
Neg logit contributions
shadow-1.376
 confident-1.256
 proud-1.256
 determined-1.243
 louder-1.242
Token friend
Token ID1545
Feature activation-6.63e-02

Pos logit contributions
shadow+1.942
ured+1.814
ivated+1.758
Unfortunately+1.735
uring+1.730
Neg logit contributions
 is-1.501
 lives-1.324
ie-1.312
 belongs-1.284
 dreams-1.263
Token it
Token ID340
Feature activation-1.79e+00

Pos logit contributions
orns+4.201
ests+4.070
iles+3.721
oline+3.647
ible+3.646
Neg logit contributions
shadow-7.248
 proud-6.957
 louder-6.782
 determined-6.774
 relieved-6.737
Token is
Token ID318
Feature activation-1.41e+00

Pos logit contributions
ests+2.037
orns+1.985
ible+1.809
iles+1.772
pa+1.764
Neg logit contributions
shadow-4.501
 louder-4.205
 determined-4.164
 proud-4.134
ured-4.065
Token!"
Token ID2474
Feature activation+2.96e-01

Pos logit contributions
 is+2.561
orns+2.558
ie+2.543
ests+2.514
 acts+2.471
Neg logit contributions
shadow-2.485
 louder-2.271
ured-2.208
 determined-2.196
 proud-2.195
Token Lily
Token ID20037
Feature activation+7.62e-01

Pos logit contributions
shadow+0.653
ured+0.600
 determined+0.593
 confident+0.591
 louder+0.591
Neg logit contributions
 is-0.434
ie-0.386
 lives-0.385
 belongs-0.376
orns-0.376
Token whispered
Token ID25029
Feature activation+2.28e+00

Pos logit contributions
shadow+1.638
ured+1.501
ivated+1.471
 confident+1.469
 determined+1.468
Neg logit contributions
 is-1.176
ie-1.040
 lives-1.030
 belongs-1.006
oline-1.003
Token,
Token ID11
Feature activation+5.08e+00

Pos logit contributions
shadow+6.003
Unfortunately+5.636
Half+5.472
ivated+5.458
 pleased+5.451
Neg logit contributions
 is-2.527
ie-2.414
pa-2.086
 happened-2.023
 lives-2.009
Token pointing
Token ID10609
Feature activation+1.44e+00

Pos logit contributions
shadow+13.518
Unfortunately+13.205
Half+12.582
ooting+12.159
har+12.053
Neg logit contributions
 is-7.745
ie-6.671
 lives-5.827
pa-5.811
 was-5.651
Token to
Token ID284
Feature activation-8.73e-02

Pos logit contributions
shadow+2.764
Unfortunately+2.463
ivated+2.445
ured+2.423
 confident+2.388
Neg logit contributions
 is-2.755
 lives-2.422
ie-2.403
 belongs-2.366
 happened-2.328
Token the
Token ID262
Feature activation+1.88e-01

Pos logit contributions
 is+0.153
ie+0.142
orns+0.140
 lives+0.139
 belongs+0.139
Neg logit contributions
shadow-0.166
ured-0.151
 louder-0.151
 determined-0.149
 confident-0.149
Token doll
Token ID3654
Feature activation+3.06e-01

Pos logit contributions
shadow+0.438
ured+0.405
 determined+0.400
 louder+0.400
 confident+0.399
Neg logit contributions
 is-0.251
ie-0.223
 lives-0.221
orns-0.216
 belongs-0.215
Token.
Token ID13
Feature activation+1.41e+00

Pos logit contributions
shadow+0.684
ured+0.629
 determined+0.622
 confident+0.621
 louder+0.621
Neg logit contributions
 is-0.436
ie-0.388
 lives-0.386
 belongs-0.377
orns-0.376
Token a
Token ID257
Feature activation+5.64e-01

Pos logit contributions
 is+1.296
orns+1.216
 lives+1.213
ie+1.205
 belongs+1.204
Neg logit contributions
shadow-1.386
 louder-1.281
ured-1.265
 proud-1.257
 determined-1.237
Token star
Token ID3491
Feature activation+2.57e-01

Pos logit contributions
shadow+1.303
ured+1.204
ivated+1.185
 determined+1.181
 confident+1.178
Neg logit contributions
 is-0.782
 lives-0.684
ie-0.680
 belongs-0.663
orns-0.647
Token!"
Token ID2474
Feature activation+8.06e-02

Pos logit contributions
shadow+0.549
ured+0.504
 determined+0.498
 louder+0.497
 confident+0.496
Neg logit contributions
 is-0.391
ie-0.351
 lives-0.349
orns-0.342
 belongs-0.341
Token Lily
Token ID20037
Feature activation+5.47e-01

Pos logit contributions
shadow+0.200
ured+0.187
 louder+0.185
 determined+0.185
 confident+0.185
Neg logit contributions
 is-0.094
ie-0.083
 lives-0.082
orns-0.081
 belongs-0.080
Token said
Token ID531
Feature activation+2.03e+00

Pos logit contributions
shadow+1.226
ured+1.127
 determined+1.108
 louder+1.108
ivated+1.106
Neg logit contributions
 is-0.791
ie-0.699
 lives-0.691
 belongs-0.676
oline-0.675
Token,
Token ID11
Feature activation+5.08e+00

Pos logit contributions
shadow+5.601
Unfortunately+5.225
Half+5.116
ured+5.092
Just+5.047
Neg logit contributions
 is-2.188
ie-1.884
 happened-1.604
 dreams-1.599
ons-1.598
Token showing
Token ID4478
Feature activation-7.71e-01

Pos logit contributions
shadow+11.744
Unfortunately+11.101
Half+10.292
ooting+9.901
har+9.706
Neg logit contributions
 is-10.212
ie-8.876
 lives-8.177
 dinosaurs-8.055
 happened-8.020
Token her
Token ID607
Feature activation-1.07e+00

Pos logit contributions
 is+1.349
orns+1.275
 lives+1.269
 belongs+1.252
ie+1.239
Neg logit contributions
shadow-1.399
 louder-1.359
ured-1.334
 proud-1.299
 confident-1.294
Token pizza
Token ID14256
Feature activation+3.35e-01

Pos logit contributions
orns+1.774
 is+1.758
ie+1.704
oline+1.691
 lives+1.673
Neg logit contributions
shadow-2.034
 proud-1.968
 pleased-1.947
 determined-1.940
 louder-1.930
Token.
Token ID13
Feature activation+7.65e-01

Pos logit contributions
shadow+0.682
ured+0.621
 determined+0.612
 confident+0.611
 louder+0.610
Neg logit contributions
 is-0.551
ie-0.496
 lives-0.494
 belongs-0.484
orns-0.484
Token\n
Token ID198
Feature activation-2.60e-01

Pos logit contributions
shadow+1.556
ured+1.410
 determined+1.381
 confident+1.380
 louder+1.373
Neg logit contributions
 is-1.303
 lives-1.149
ie-1.130
 belongs-1.121
oline-1.120
Token the
Token ID262
Feature activation+1.19e-01

Pos logit contributions
 is+0.665
ie+0.617
 lives+0.615
 belongs+0.608
orns+0.603
Neg logit contributions
shadow-0.786
 louder-0.717
ured-0.716
Unfortunately-0.701
 determined-0.700
Token oil
Token ID3056
Feature activation-1.02e+00

Pos logit contributions
shadow+0.249
ured+0.229
 louder+0.226
 determined+0.226
 confident+0.226
Neg logit contributions
 is-0.184
ie-0.167
 lives-0.165
orns-0.162
 belongs-0.162
Token!"
Token ID2474
Feature activation-3.37e-01

Pos logit contributions
 is+1.267
 lives+1.211
orns+1.177
ie+1.169
 acts+1.150
Neg logit contributions
shadow-2.317
 louder-2.183
ured-2.161
Unfortunately-2.117
 proud-2.112
Token Ben
Token ID3932
Feature activation-3.68e-01

Pos logit contributions
 is+0.405
ie+0.367
orns+0.366
 lives+0.355
oline+0.349
Neg logit contributions
shadow-0.811
ured-0.769
 determined-0.761
 louder-0.759
ivated-0.757
Token says
Token ID1139
Feature activation+2.50e+00

Pos logit contributions
 is+0.303
ie+0.273
orns+0.267
 lives+0.259
ible+0.247
Neg logit contributions
shadow-1.013
 louder-0.962
ured-0.959
 determined-0.957
 confident-0.956
Token,
Token ID11
Feature activation+5.07e+00

Pos logit contributions
shadow+7.215
Unfortunately+6.787
Half+6.687
ivated+6.567
 sidelines+6.544
Neg logit contributions
 is-2.436
ie-2.078
 happened-1.743
 acts-1.716
 lives-1.690
Token as
Token ID355
Feature activation-3.08e-01

Pos logit contributions
shadow+11.744
Unfortunately+10.992
har+10.105
Half+10.025
ooting+9.798
Neg logit contributions
 is-10.107
ie-8.667
 happened-8.285
 was-8.071
 lives-8.066
Token he
Token ID339
Feature activation-3.80e-01

Pos logit contributions
 is+0.605
ie+0.566
orns+0.563
 lives+0.562
 belongs+0.558
Neg logit contributions
shadow-0.508
 louder-0.468
 determined-0.467
 confident-0.465
 proud-0.460
Token pushes
Token ID20070
Feature activation+8.38e-01

Pos logit contributions
 is+0.561
ie+0.536
orns+0.519
 lives+0.519
 belongs+0.508
Neg logit contributions
shadow-0.790
 louder-0.735
 determined-0.733
 confident-0.733
ured-0.726
Token his
Token ID465
Feature activation-2.30e-01

Pos logit contributions
shadow+1.756
ured+1.596
 determined+1.576
ivated+1.568
 confident+1.567
Neg logit contributions
 is-1.367
ie-1.240
 lives-1.191
orns-1.182
oline-1.175
Token red
Token ID2266
Feature activation-2.18e-01

Pos logit contributions
 is+0.321
ie+0.295
 lives+0.290
orns+0.289
oline+0.283
Neg logit contributions
shadow-0.517
 louder-0.477
ured-0.476
 determined-0.472
 confident-0.470
Token much
Token ID881
Feature activation+1.28e+00

Pos logit contributions
shadow+2.381
ured+2.196
ivated+2.087
 louder+2.057
 determined+2.036
Neg logit contributions
 is-2.937
 lives-2.617
 belongs-2.591
ie-2.518
oline-2.494
Token fun
Token ID1257
Feature activation-9.12e-01

Pos logit contributions
shadow+2.727
ured+2.588
ivated+2.553
eyed+2.491
 determined+2.462
Neg logit contributions
 is-2.050
 lives-1.749
 belongs-1.734
ie-1.732
 happened-1.701
Token!"
Token ID2474
Feature activation+1.60e-01

Pos logit contributions
 is+1.117
orns+1.044
 lives+1.020
ible+1.019
ie+1.011
Neg logit contributions
shadow-2.168
ured-2.009
 determined-2.003
 louder-1.998
 confident-1.987
Token Lily
Token ID20037
Feature activation+8.54e-01

Pos logit contributions
shadow+0.316
ured+0.288
 determined+0.284
 louder+0.284
 confident+0.284
Neg logit contributions
 is-0.272
ie-0.247
 lives-0.246
orns-0.242
 belongs-0.241
Token said
Token ID531
Feature activation+2.33e+00

Pos logit contributions
shadow+1.764
ured+1.624
ivated+1.592
 crashing+1.582
 louder+1.581
Neg logit contributions
 is-1.377
ie-1.235
 lives-1.211
 belongs-1.187
oline-1.181
Token,
Token ID11
Feature activation+5.07e+00

Pos logit contributions
shadow+6.352
Unfortunately+5.996
Half+5.891
Just+5.821
 sidelines+5.760
Neg logit contributions
 is-2.531
ie-2.226
 happened-1.936
 lives-1.918
pa-1.889
Token smiling
Token ID16755
Feature activation+1.93e+00

Pos logit contributions
shadow+11.755
Unfortunately+11.480
Half+10.799
ured+10.571
eyed+10.522
Neg logit contributions
 is-9.365
ie-7.522
 was-7.349
 lives-7.345
 doesn-7.221
Token.
Token ID13
Feature activation+1.56e+00

Pos logit contributions
shadow+5.164
Unfortunately+4.803
ivated+4.702
Half+4.668
ured+4.651
Neg logit contributions
 is-2.125
ie-1.887
 dreams-1.688
 lives-1.659
 happened-1.649
Token\n
Token ID198
Feature activation+2.44e-01

Pos logit contributions
shadow+3.006
ured+2.733
 confident+2.670
 determined+2.659
ivated+2.658
Neg logit contributions
 is-2.885
 lives-2.473
 belongs-2.432
ie-2.429
 happened-2.429
Token\n
Token ID198
Feature activation+2.59e+00

Pos logit contributions
shadow+0.473
ured+0.430
 determined+0.424
 louder+0.423
 confident+0.423
Neg logit contributions
 is-0.421
ie-0.383
 lives-0.382
orns-0.375
 belongs-0.374
Token"
Token ID1
Feature activation+2.18e+00

Pos logit contributions
shadow+4.704
 crashing+4.435
 confident+4.372
 pleased+4.330
ured+4.318
Neg logit contributions
 is-5.110
ie-4.490
 lives-4.453
ons-4.407
 belongs-4.311
Token get
Token ID651
Feature activation+1.75e-01

Pos logit contributions
shadow+3.863
 pleased+3.336
 louder+3.336
Unfortunately+3.313
ivated+3.313
Neg logit contributions
 is-4.157
 lives-3.699
ie-3.579
 dreams-3.529
 belongs-3.494
Token it
Token ID340
Feature activation-2.43e-01

Pos logit contributions
shadow+0.372
ured+0.341
 determined+0.337
 louder+0.336
 confident+0.335
Neg logit contributions
 is-0.272
ie-0.245
 lives-0.244
orns-0.239
 belongs-0.238
Token!"
Token ID2474
Feature activation+2.61e-01

Pos logit contributions
 is+0.419
 lives+0.387
orns+0.384
ie+0.381
oline+0.377
Neg logit contributions
shadow-0.467
ured-0.420
 louder-0.419
 determined-0.414
 confident-0.411
Token Max
Token ID5436
Feature activation+3.26e-01

Pos logit contributions
shadow+0.619
ured+0.573
 determined+0.567
 louder+0.565
 confident+0.565
Neg logit contributions
 is-0.340
ie-0.299
 lives-0.297
orns-0.289
 belongs-0.289
Token whispered
Token ID25029
Feature activation+2.13e+00

Pos logit contributions
shadow+0.721
ured+0.662
 determined+0.654
 confident+0.653
 louder+0.652
Neg logit contributions
 is-0.477
ie-0.425
 lives-0.423
 belongs-0.413
orns-0.411
Token,
Token ID11
Feature activation+5.07e+00

Pos logit contributions
shadow+5.556
Unfortunately+5.202
Half+5.072
ivated+5.064
 confident+5.013
Neg logit contributions
 is-2.501
ie-2.392
 lives-1.967
 dreams-1.943
 happened-1.935
Token climbing
Token ID14281
Feature activation+5.47e-01

Pos logit contributions
shadow+14.582
Unfortunately+14.122
Half+13.959
har+13.285
ooting+13.122
Neg logit contributions
 is-6.961
ie-5.798
 doesn-4.801
 lives-4.722
 was-4.641
Token on
Token ID319
Feature activation-3.39e-01

Pos logit contributions
shadow+0.986
ured+0.882
 confident+0.874
 determined+0.872
ivated+0.870
Neg logit contributions
 is-1.035
ie-0.941
 lives-0.935
 belongs-0.919
oline-0.918
Token a
Token ID257
Feature activation+1.42e+00

Pos logit contributions
 is+0.577
ie+0.535
 lives+0.534
 belongs+0.529
orns+0.525
Neg logit contributions
shadow-0.645
 louder-0.598
ured-0.597
 determined-0.589
 confident-0.585
Token chair
Token ID5118
Feature activation+1.52e+00

Pos logit contributions
shadow+3.334
ured+3.082
ivated+3.040
Just+2.971
 confident+2.969
Neg logit contributions
 is-2.063
ie-1.768
 lives-1.734
 belongs-1.690
oline-1.679
Token.
Token ID13
Feature activation+1.35e+00

Pos logit contributions
shadow+3.909
 pleased+3.526
ured+3.523
Unfortunately+3.521
ivated+3.517
Neg logit contributions
 is-1.853
ie-1.592
 lives-1.511
oline-1.483
 belongs-1.451
Token big
Token ID1263
Feature activation-3.92e-01

Pos logit contributions
shadow+0.884
ured+0.811
ivated+0.795
 determined+0.795
 confident+0.793
Neg logit contributions
 is-0.678
ie-0.607
 lives-0.607
 belongs-0.592
orns-0.589
Token crab
Token ID32202
Feature activation-5.85e-01

Pos logit contributions
 is+0.616
orns+0.568
ie+0.565
 lives+0.559
 belongs+0.558
Neg logit contributions
shadow-0.787
 louder-0.726
 determined-0.722
 proud-0.721
 confident-0.720
Token!"
Token ID2474
Feature activation-1.54e-01

Pos logit contributions
 is+0.702
 lives+0.623
ible+0.614
ie+0.610
oline+0.609
Neg logit contributions
shadow-1.416
ured-1.296
 louder-1.291
 proud-1.290
 determined-1.287
Token Tom
Token ID4186
Feature activation+3.89e-01

Pos logit contributions
 is+0.197
ie+0.177
orns+0.175
 lives+0.173
 belongs+0.170
Neg logit contributions
shadow-0.365
ured-0.339
 louder-0.337
 determined-0.337
 confident-0.336
Token said
Token ID531
Feature activation+1.92e+00

Pos logit contributions
shadow+0.858
ured+0.789
 determined+0.778
 confident+0.777
 louder+0.777
Neg logit contributions
 is-0.573
ie-0.513
 lives-0.508
 belongs-0.496
orns-0.494
Token,
Token ID11
Feature activation+5.06e+00

Pos logit contributions
shadow+5.230
Unfortunately+4.925
ured+4.833
ivated+4.811
Half+4.803
Neg logit contributions
 is-2.085
ie-1.816
 happened-1.593
 lives-1.581
 belongs-1.545
Token holding
Token ID4769
Feature activation-4.95e-02

Pos logit contributions
shadow+11.414
Unfortunately+11.292
Half+10.422
ured+9.968
ooting+9.883
Neg logit contributions
 is-9.643
ie-8.840
 dinosaurs-8.059
 lives-7.884
pa-7.741
Token up
Token ID510
Feature activation+1.50e+00

Pos logit contributions
 is+0.094
ie+0.086
 lives+0.086
orns+0.085
 belongs+0.085
Neg logit contributions
shadow-0.087
ured-0.080
 louder-0.079
 determined-0.078
 confident-0.078
Token his
Token ID465
Feature activation-6.12e-01

Pos logit contributions
shadow+3.339
Unfortunately+2.930
ured+2.925
ivated+2.922
 determined+2.875
Neg logit contributions
 is-2.405
ie-2.138
 lives-2.101
 dreams-2.077
oline-2.061
Token bucket
Token ID19236
Feature activation+1.30e+00

Pos logit contributions
 is+1.061
ie+0.995
orns+0.989
oline+0.981
 lives+0.970
Neg logit contributions
shadow-1.140
 determined-1.053
 proud-1.051
ured-1.041
 pleased-1.040
Token.
Token ID13
Feature activation+1.45e+00

Pos logit contributions
shadow+3.223
ivated+2.928
ured+2.925
Unfortunately+2.903
 confident+2.878
Neg logit contributions
 is-1.710
ie-1.516
 lives-1.406
oline-1.400
 belongs-1.396
Token,
Token ID11
Feature activation-4.41e-01

Pos logit contributions
 is+2.193
orns+2.069
oline+2.061
ests+2.056
ible+2.042
Neg logit contributions
shadow-3.581
 louder-3.302
 determined-3.266
 proud-3.263
 confident-3.255
Token Spot
Token ID15899
Feature activation-1.75e+00

Pos logit contributions
 is+0.765
ie+0.717
orns+0.715
 lives+0.709
ible+0.703
Neg logit contributions
shadow-0.827
ured-0.748
 louder-0.744
 determined-0.739
 confident-0.734
Token!"
Token ID2474
Feature activation+1.41e-02

Pos logit contributions
oline+2.312
orns+2.305
 is+2.289
ests+2.286
ons+2.207
Neg logit contributions
shadow-3.934
 louder-3.603
 proud-3.601
 determined-3.554
 confident-3.548
Token Tom
Token ID4186
Feature activation+4.14e-01

Pos logit contributions
shadow+0.033
ured+0.031
 louder+0.031
 confident+0.031
 determined+0.031
Neg logit contributions
 is-0.018
ie-0.016
 lives-0.016
orns-0.016
 belongs-0.015
Token said
Token ID531
Feature activation+2.30e+00

Pos logit contributions
shadow+0.888
ured+0.813
 determined+0.802
 confident+0.801
 louder+0.800
Neg logit contributions
 is-0.634
ie-0.569
 lives-0.564
 belongs-0.551
orns-0.548
Token,
Token ID11
Feature activation+5.06e+00

Pos logit contributions
shadow+6.380
Unfortunately+6.068
Half+5.978
Just+5.888
ivated+5.854
Neg logit contributions
 is-2.369
ie-2.112
 happened-1.758
 lives-1.732
 acts-1.686
Token holding
Token ID4769
Feature activation-1.72e-01

Pos logit contributions
shadow+12.683
Unfortunately+12.623
Half+11.811
Hours+11.121
Finally+10.866
Neg logit contributions
 is-8.850
ie-7.555
 dinosaurs-7.108
 lives-7.039
 was-6.943
Token a
Token ID257
Feature activation+1.84e+00

Pos logit contributions
 is+0.330
ie+0.307
 lives+0.305
orns+0.302
 belongs+0.301
Neg logit contributions
shadow-0.296
ured-0.271
 louder-0.268
 determined-0.266
 confident-0.265
Token stuffed
Token ID22259
Feature activation-5.08e-01

Pos logit contributions
shadow+4.212
ured+3.940
ivated+3.906
uring+3.880
Unfortunately+3.823
Neg logit contributions
 is-2.838
 lives-2.378
 belongs-2.299
ie-2.293
ons-2.258
Token dog
Token ID3290
Feature activation+1.64e+00

Pos logit contributions
 is+0.553
 belongs+0.474
 lives+0.471
orns+0.458
ie+0.456
Neg logit contributions
shadow-1.281
ured-1.211
 louder-1.200
 determined-1.186
 confident-1.185
Token as
Token ID355
Feature activation+3.89e-02

Pos logit contributions
shadow+3.574
Unfortunately+3.292
 pleased+3.260
ivated+3.239
 confident+3.238
Neg logit contributions
 is-2.593
ie-2.317
 lives-2.279
 belongs-2.186
ons-2.156
Token and
Token ID290
Feature activation+8.68e-01

Pos logit contributions
 is+1.849
 lives+1.820
orns+1.764
ible+1.755
 belongs+1.740
Neg logit contributions
shadow-2.799
 louder-2.686
ured-2.599
 proud-2.595
 crashing-2.576
Token shiny
Token ID22441
Feature activation-1.27e+00

Pos logit contributions
shadow+1.516
ured+1.387
ivated+1.348
Unfortunately+1.342
Just+1.325
Neg logit contributions
 is-1.719
 lives-1.561
ie-1.538
 belongs-1.532
orns-1.485
Token!"
Token ID2474
Feature activation+3.66e-01

Pos logit contributions
 is+1.702
 lives+1.648
ests+1.622
orns+1.616
ons+1.606
Neg logit contributions
shadow-2.749
 louder-2.664
 proud-2.597
ured-2.586
 confident-2.573
Token Lily
Token ID20037
Feature activation+8.75e-02

Pos logit contributions
shadow+0.891
ured+0.828
 determined+0.815
ivated+0.813
 confident+0.813
Neg logit contributions
 is-0.464
ie-0.403
 lives-0.401
 belongs-0.389
orns-0.387
Token said
Token ID531
Feature activation+1.67e+00

Pos logit contributions
shadow+0.190
ured+0.175
 determined+0.173
 confident+0.173
 louder+0.173
Neg logit contributions
 is-0.130
 lives-0.117
ie-0.117
orns-0.115
 belongs-0.114
Token,
Token ID11
Feature activation+5.06e+00

Pos logit contributions
shadow+4.480
Unfortunately+4.171
ured+4.112
Half+4.090
ivated+4.070
Neg logit contributions
 is-1.903
ie-1.640
 lives-1.440
 belongs-1.422
 happened-1.416
Token leaning
Token ID21804
Feature activation+3.21e+00

Pos logit contributions
shadow+11.672
Unfortunately+11.577
Half+10.884
har+10.123
Hours+10.118
Neg logit contributions
 is-9.597
ie-8.731
 lives-7.581
 was-7.508
 dinosaurs-7.381
Token over
Token ID625
Feature activation+4.53e-02

Pos logit contributions
shadow+6.886
Unfortunately+6.220
 sidelines+6.137
 confident+6.119
ivated+6.111
Neg logit contributions
 is-5.946
ie-5.293
 happened-4.918
pa-4.869
ible-4.811
Token to
Token ID284
Feature activation+7.03e-01

Pos logit contributions
shadow+0.086
ured+0.078
 louder+0.078
 determined+0.078
 confident+0.077
Neg logit contributions
 is-0.079
 lives-0.073
ie-0.073
 belongs-0.072
orns-0.071
Token see
Token ID766
Feature activation-8.88e-01

Pos logit contributions
shadow+1.534
ured+1.404
 determined+1.385
 confident+1.381
Unfortunately+1.380
Neg logit contributions
 is-1.096
ie-0.976
 lives-0.959
 belongs-0.933
ible-0.922
Token.
Token ID13
Feature activation+1.33e+00

Pos logit contributions
 is+1.555
oline+1.493
orns+1.480
ie+1.473
 belongs+1.471
Neg logit contributions
shadow-1.609
 louder-1.531
 proud-1.507
ured-1.477
 determined-1.458
Token's
Token ID338
Feature activation+9.71e-01

Pos logit contributions
ests+3.353
orns+3.143
iles+3.120
 dinosaurs+3.014
oline+3.006
Neg logit contributions
shadow-4.897
 louder-4.879
Unfortunately-4.691
ured-4.661
Just-4.627
Token play
Token ID711
Feature activation-2.70e-01

Pos logit contributions
shadow+1.795
ured+1.600
 pleased+1.584
ivated+1.580
 confident+1.567
Neg logit contributions
 is-1.849
 lives-1.670
ie-1.655
 belongs-1.610
 dreams-1.597
Token!"
Token ID2474
Feature activation-6.21e-01

Pos logit contributions
 is+0.519
ie+0.495
orns+0.484
 lives+0.484
 belongs+0.477
Neg logit contributions
shadow-0.462
 louder-0.416
ured-0.413
 confident-0.408
 proud-0.406
Token Sam
Token ID3409
Feature activation-5.91e-02

Pos logit contributions
 is+0.906
ie+0.837
 lives+0.829
 belongs+0.823
ible+0.822
Neg logit contributions
shadow-1.329
 louder-1.259
 proud-1.240
 confident-1.236
 determined-1.227
Token said
Token ID531
Feature activation+1.71e+00

Pos logit contributions
 is+0.089
 lives+0.081
ie+0.081
orns+0.080
 belongs+0.079
Neg logit contributions
shadow-0.126
ured-0.116
 louder-0.115
 determined-0.115
 confident-0.115
Token,
Token ID11
Feature activation+5.06e+00

Pos logit contributions
shadow+4.542
Unfortunately+4.224
Half+4.140
ured+4.135
ivated+4.132
Neg logit contributions
 is-1.937
ie-1.700
 lives-1.484
 belongs-1.483
 happened-1.459
Token eager
Token ID11069
Feature activation+2.28e-01

Pos logit contributions
shadow+13.200
Unfortunately+12.828
har+11.964
Half+11.944
ooting+11.804
Neg logit contributions
 is-7.904
ie-6.659
 dinosaurs-6.309
 bears-6.178
 brother-6.100
Token to
Token ID284
Feature activation+1.07e+00

Pos logit contributions
shadow+0.410
ured+0.370
 determined+0.364
 louder+0.364
 confident+0.363
Neg logit contributions
 is-0.427
ie-0.391
 lives-0.390
orns-0.384
 belongs-0.384
Token try
Token ID1949
Feature activation+8.04e-01

Pos logit contributions
shadow+1.932
ured+1.723
Unfortunately+1.702
 determined+1.699
Just+1.694
Neg logit contributions
 is-2.087
ie-1.879
 lives-1.877
 belongs-1.823
ons-1.806
Token.
Token ID13
Feature activation+8.65e-01

Pos logit contributions
shadow+1.768
ured+1.650
ivated+1.626
 determined+1.608
Unfortunately+1.606
Neg logit contributions
 is-1.212
ie-1.055
 lives-1.043
orns-1.026
 belongs-1.019
Token\n
Token ID198
Feature activation-9.06e-01

Pos logit contributions
shadow+1.720
ured+1.581
ivated+1.555
 determined+1.541
 confident+1.525
Neg logit contributions
 is-1.514
ie-1.326
 lives-1.326
orns-1.317
 belongs-1.300
Token She
Token ID1375
Feature activation-1.74e+00

Pos logit contributions
 is+0.734
ie+0.710
 lives+0.689
 belongs+0.685
orns+0.682
Neg logit contributions
shadow-0.564
 louder-0.506
Unfortunately-0.505
ured-0.505
Just-0.504
Token did
Token ID750
Feature activation+1.18e+00

Pos logit contributions
orns+2.853
 is+2.794
ie+2.759
ests+2.750
ons+2.729
Neg logit contributions
 determined-3.257
shadow-3.215
Just-3.168
 proud-3.161
 louder-3.145
Token not
Token ID407
Feature activation+2.28e-01

Pos logit contributions
shadow+1.386
ured+1.189
ivated+1.143
Unfortunately+1.121
 crashing+1.111
Neg logit contributions
 is-3.070
 lives-2.789
 belongs-2.767
ie-2.754
 dreams-2.704
Token know
Token ID760
Feature activation-1.90e+00

Pos logit contributions
shadow+0.319
ured+0.278
 determined+0.273
 louder+0.272
 confident+0.272
Neg logit contributions
 is-0.519
ie-0.483
 lives-0.482
orns-0.475
 belongs-0.475
Token where
Token ID810
Feature activation-1.22e+00

Pos logit contributions
 is+3.657
ests+3.620
orns+3.619
ible+3.603
oline+3.589
Neg logit contributions
shadow-3.122
 louder-3.097
 crashing-2.879
 determined-2.796
 proud-2.777
Token she
Token ID673
Feature activation-4.46e+00

Pos logit contributions
 is+2.446
orns+2.431
ible+2.429
ie+2.358
oline+2.356
Neg logit contributions
shadow-1.831
 louder-1.748
 determined-1.709
 crashing-1.673
 confident-1.660
Token was
Token ID373
Feature activation-3.91e-01

Pos logit contributions
ests+8.189
orns+7.840
ishes+7.799
ons+7.393
iles+7.278
Neg logit contributions
 determined-8.879
Just-8.593
 proud-8.564
ured-8.533
 relieved-8.471
Token.
Token ID13
Feature activation-7.67e-01

Pos logit contributions
 is+0.582
ie+0.524
 lives+0.523
orns+0.522
oline+0.520
Neg logit contributions
shadow-0.845
ured-0.771
 louder-0.768
 determined-0.766
 confident-0.762
Token She
Token ID1375
Feature activation-2.18e+00

Pos logit contributions
 is+1.560
ie+1.552
 belongs+1.482
 lives+1.481
ons+1.474
Neg logit contributions
shadow-1.171
Unfortunately-1.076
Just-1.075
 louder-1.066
 determined-1.055
Token did
Token ID750
Feature activation+1.20e+00

Pos logit contributions
orns+3.877
ons+3.773
ests+3.748
 is+3.709
ie+3.684
Neg logit contributions
 determined-3.850
Just-3.745
 louder-3.697
 confident-3.650
 proud-3.635
Token not
Token ID407
Feature activation+5.36e-02

Pos logit contributions
shadow+1.419
ured+1.228
ivated+1.184
Unfortunately+1.158
 crashing+1.155
Neg logit contributions
 is-3.126
 lives-2.849
 belongs-2.825
ie-2.802
 dreams-2.758
Token of
Token ID286
Feature activation-5.40e-01

Pos logit contributions
 is+0.036
ie+0.032
 lives+0.032
orns+0.032
 belongs+0.032
Neg logit contributions
shadow-0.049
ured-0.046
 louder-0.045
 determined-0.045
 confident-0.045
Token energy
Token ID2568
Feature activation+6.71e-01

Pos logit contributions
 is+0.885
 belongs+0.808
ie+0.806
ible+0.790
 lives+0.784
Neg logit contributions
shadow-1.062
 confident-1.010
 determined-1.004
 proud-1.000
 louder-0.994
Token.
Token ID13
Feature activation-1.52e+00

Pos logit contributions
shadow+1.590
ured+1.452
ivated+1.434
Unfortunately+1.433
 louder+1.430
Neg logit contributions
 is-0.904
 lives-0.786
ie-0.785
 belongs-0.758
orns-0.752
Token Her
Token ID2332
Feature activation-1.04e+00

Pos logit contributions
 is+3.330
ie+3.297
iles+3.252
ible+3.249
 belongs+3.221
Neg logit contributions
shadow-2.027
Unfortunately-1.880
Just-1.853
 confident-1.825
 proud-1.812
Token favourite
Token ID12507
Feature activation-2.73e+00

Pos logit contributions
 is+1.773
ie+1.726
orns+1.724
ons+1.676
iles+1.657
Neg logit contributions
shadow-1.861
 determined-1.836
 proud-1.830
 confident-1.824
ured-1.778
Token past
Token ID1613
Feature activation-4.30e+00

Pos logit contributions
iles+4.371
orns+4.329
ishes+4.139
ible+4.106
ons+4.011
Neg logit contributions
 proud-5.330
 determined-5.328
 confident-5.324
 understanding-5.249
 pleased-5.117
Tokenime
Token ID524
Feature activation-2.68e+00

Pos logit contributions
 doesn+8.160
 belongs+8.140
ible+8.003
 tastes+7.942
iles+7.872
Neg logit contributions
Her-7.278
rolling-6.588
 slightly-6.560
shadow-6.537
ivated-6.527
Token was
Token ID373
Feature activation+2.13e-01

Pos logit contributions
iles+3.384
ishes+3.368
ests+3.328
orns+3.300
 belongs+3.275
Neg logit contributions
shadow-5.882
 proud-5.719
Just-5.703
 determined-5.668
 confident-5.614
Token going
Token ID1016
Feature activation+3.37e-01

Pos logit contributions
shadow+0.389
ured+0.351
 determined+0.347
 louder+0.346
 confident+0.346
Neg logit contributions
 is-0.392
ie-0.359
 lives-0.358
orns-0.352
 belongs-0.352
Token to
Token ID284
Feature activation+1.07e+00

Pos logit contributions
shadow+0.556
ured+0.497
 determined+0.487
 confident+0.486
 louder+0.486
Neg logit contributions
 is-0.683
ie-0.629
 lives-0.626
orns-0.617
 belongs-0.616
Token the
Token ID262
Feature activation+5.52e-01

Pos logit contributions
shadow+2.148
ured+1.971
 determined+1.916
ivated+1.910
Just+1.908
Neg logit contributions
 is-1.868
 lives-1.681
ie-1.639
 belongs-1.605
 dreams-1.604
Token named
Token ID3706
Feature activation-7.57e-01

Pos logit contributions
 is+0.362
orns+0.330
 lives+0.329
 belongs+0.329
oline+0.327
Neg logit contributions
shadow-0.363
ured-0.329
 determined-0.327
 louder-0.326
 confident-0.326
Token Louise
Token ID31774
Feature activation-1.26e+00

Pos logit contributions
 is+1.094
orns+1.010
ible+1.007
iles+0.991
 lives+0.991
Neg logit contributions
shadow-1.696
ured-1.538
 louder-1.528
ivated-1.521
 determined-1.517
Token.
Token ID13
Feature activation-1.27e+00

Pos logit contributions
 is+2.001
orns+1.984
ests+1.956
iles+1.902
oline+1.891
Neg logit contributions
shadow-2.456
 proud-2.320
ured-2.313
 determined-2.311
 louder-2.304
Token Her
Token ID2332
Feature activation-7.21e-01

Pos logit contributions
 is+2.725
 belongs+2.627
ie+2.613
orns+2.595
 lives+2.582
Neg logit contributions
shadow-1.669
 louder-1.619
 confident-1.595
Just-1.593
 determined-1.586
Token favourite
Token ID12507
Feature activation-2.30e+00

Pos logit contributions
 is+1.228
orns+1.179
ie+1.177
ons+1.146
iles+1.143
Neg logit contributions
shadow-1.323
 proud-1.266
 determined-1.258
 confident-1.251
ured-1.238
Token past
Token ID1613
Feature activation-4.29e+00

Pos logit contributions
orns+3.702
iles+3.695
ons+3.658
 is+3.567
ible+3.558
Neg logit contributions
 proud-4.344
 confident-4.252
 determined-4.205
 understanding-4.180
shadow-4.163
Tokenime
Token ID524
Feature activation-2.31e+00

Pos logit contributions
 doesn+9.096
ible+8.912
iles+8.836
 belongs+8.657
 tastes+8.636
Neg logit contributions
Her-7.253
eyed-6.153
Just-6.061
shadow-5.908
Suddenly-5.819
Token was
Token ID373
Feature activation+2.56e-01

Pos logit contributions
iles+3.932
ishes+3.921
 tastes+3.915
 is+3.895
 belongs+3.892
Neg logit contributions
Just-3.947
shadow-3.941
 proud-3.868
eyed-3.766
 determined-3.712
Token playing
Token ID2712
Feature activation+1.19e+00

Pos logit contributions
shadow+0.476
ured+0.431
 determined+0.424
 louder+0.423
 confident+0.423
Neg logit contributions
 is-0.467
ie-0.426
 lives-0.425
orns-0.417
 belongs-0.417
Token in
Token ID287
Feature activation-3.86e-02

Pos logit contributions
shadow+2.053
ured+1.753
 pleased+1.744
 determined+1.735
 confident+1.705
Neg logit contributions
 is-2.491
ie-2.273
 lives-2.251
orns-2.239
oline-2.209
Token the
Token ID262
Feature activation+7.94e-01

Pos logit contributions
 is+0.069
ie+0.063
 lives+0.063
 belongs+0.063
orns+0.062
Neg logit contributions
shadow-0.072
ured-0.065
 determined-0.064
 louder-0.064
 confident-0.064
Token She
Token ID1375
Feature activation-7.99e-01

Pos logit contributions
 is+1.104
ie+1.059
 lives+1.036
 belongs+1.024
orns+1.021
Neg logit contributions
shadow-0.795
Unfortunately-0.706
ured-0.702
Just-0.701
 louder-0.696
Token did
Token ID750
Feature activation+1.69e+00

Pos logit contributions
 is+1.326
orns+1.288
ests+1.262
ie+1.246
iles+1.244
Neg logit contributions
shadow-1.505
 determined-1.432
 louder-1.411
ured-1.399
 confident-1.393
Token not
Token ID407
Feature activation+5.84e-01

Pos logit contributions
shadow+2.149
ured+1.880
 crashing+1.848
ivated+1.794
rolling+1.771
Neg logit contributions
 is-4.331
 lives-3.891
 belongs-3.871
 happened-3.804
ie-3.801
Token know
Token ID760
Feature activation-1.66e+00

Pos logit contributions
shadow+1.025
ured+0.920
Unfortunately+0.897
 determined+0.897
ivated+0.896
Neg logit contributions
 is-1.133
ie-1.048
 lives-1.038
 belongs-1.016
orns-1.014
Token where
Token ID810
Feature activation-1.94e+00

Pos logit contributions
 is+3.202
ests+3.117
oline+3.089
 belongs+3.076
orns+3.071
Neg logit contributions
shadow-2.792
 louder-2.730
 crashing-2.551
 determined-2.538
 understanding-2.528
Token she
Token ID673
Feature activation-4.27e+00

Pos logit contributions
ible+4.193
orns+4.151
ests+4.109
iles+4.104
 is+4.075
Neg logit contributions
shadow-2.656
 louder-2.608
 determined-2.553
 proud-2.463
 confident-2.457
Token was
Token ID373
Feature activation-4.45e-01

Pos logit contributions
ests+8.163
orns+7.369
ishes+7.282
icer+7.216
berries+7.106
Neg logit contributions
 determined-8.888
 proud-8.570
 confident-8.409
 louder-8.397
 relieved-8.377
Token.
Token ID13
Feature activation-6.06e-01

Pos logit contributions
 is+0.752
ests+0.692
orns+0.686
 lives+0.685
oline+0.683
Neg logit contributions
shadow-0.875
 louder-0.792
ured-0.790
 determined-0.785
 confident-0.780
Token She
Token ID1375
Feature activation-9.95e-01

Pos logit contributions
 is+1.273
ie+1.233
 lives+1.198
 belongs+1.187
iles+1.185
Neg logit contributions
shadow-0.927
Unfortunately-0.829
Just-0.822
 louder-0.818
ured-0.815
Token did
Token ID750
Feature activation+1.56e+00

Pos logit contributions
 is+1.672
orns+1.642
ests+1.595
ons+1.592
ie+1.586
Neg logit contributions
shadow-1.818
 determined-1.768
 louder-1.735
ured-1.726
Just-1.726
Token not
Token ID407
Feature activation+3.19e-01

Pos logit contributions
shadow+1.946
ured+1.695
 crashing+1.657
ivated+1.627
Unfortunately+1.608
Neg logit contributions
 is-4.003
 lives-3.625
 belongs-3.597
ie-3.540
 dreams-3.527
Token I
Token ID314
Feature activation-1.21e+00

Pos logit contributions
shadow+0.280
ured+0.254
 determined+0.251
 louder+0.251
 confident+0.250
Neg logit contributions
 is-0.243
ie-0.222
 lives-0.220
orns-0.218
 belongs-0.216
Token don
Token ID836
Feature activation+6.96e-01

Pos logit contributions
orns+1.914
oline+1.904
 is+1.867
iles+1.860
ests+1.845
Neg logit contributions
shadow-2.354
 determined-2.224
 proud-2.214
 confident-2.178
 pleased-2.140
Token't
Token ID470
Feature activation+2.02e-01

Pos logit contributions
shadow+1.027
ured+0.910
 confident+0.888
 louder+0.885
 determined+0.884
Neg logit contributions
 is-1.551
ie-1.424
 lives-1.412
 belongs-1.399
orns-1.394
Token know
Token ID760
Feature activation-3.16e+00

Pos logit contributions
shadow+0.350
ured+0.314
 determined+0.309
 confident+0.308
 louder+0.308
Neg logit contributions
 is-0.394
ie-0.362
 lives-0.361
orns-0.356
 belongs-0.355
Token where
Token ID810
Feature activation-2.67e+00

Pos logit contributions
ests+6.317
oline+6.061
orns+5.994
ible+5.959
iles+5.958
Neg logit contributions
shadow-6.048
 louder-5.639
 understanding-5.198
 proud-5.090
 determined-4.917
Token she
Token ID673
Feature activation-4.27e+00

Pos logit contributions
ible+6.480
iles+6.405
ests+6.396
orns+6.371
oline+6.219
Neg logit contributions
shadow-3.382
 louder-3.115
 proud-3.034
 determined-2.954
 crashing-2.910
Token is
Token ID318
Feature activation-8.86e-01

Pos logit contributions
ests+8.471
iles+8.147
orns+8.020
icer+7.361
oline+7.343
Neg logit contributions
shadow-8.123
ured-7.937
 relieved-7.875
 determined-7.869
uring-7.855
Token,"
Token ID553
Feature activation-6.42e-01

Pos logit contributions
 is+1.311
ests+1.227
orns+1.221
oline+1.219
ible+1.210
Neg logit contributions
shadow-1.914
ured-1.726
 louder-1.721
 determined-1.717
 proud-1.704
Token the
Token ID262
Feature activation-4.46e-01

Pos logit contributions
 is+0.972
orns+0.906
oline+0.900
 lives+0.896
iles+0.895
Neg logit contributions
shadow-1.353
 louder-1.260
 confident-1.235
 proud-1.233
 determined-1.232
Token little
Token ID1310
Feature activation-1.30e+00

Pos logit contributions
 is+0.769
orns+0.726
ie+0.716
ible+0.713
 lives+0.706
Neg logit contributions
shadow-0.836
 louder-0.777
 determined-0.773
 proud-0.768
 confident-0.767
Token girl
Token ID2576
Feature activation-4.58e-01

Pos logit contributions
 is+1.835
orns+1.796
ons+1.792
iles+1.749
ests+1.737
Neg logit contributions
shadow-2.778
 louder-2.741
 confident-2.602
 determined-2.601
 proud-2.585
Token She
Token ID1375
Feature activation-1.39e+00

Pos logit contributions
shadow+0.826
ured+0.746
 determined+0.733
 louder+0.729
 confident+0.728
Neg logit contributions
 is-0.878
ie-0.798
 lives-0.797
orns-0.786
 belongs-0.781
Token did
Token ID750
Feature activation+1.71e+00

Pos logit contributions
 is+2.111
orns+2.089
ests+2.006
ible+1.996
iles+1.990
Neg logit contributions
shadow-2.733
 determined-2.673
 proud-2.633
Just-2.604
 louder-2.601
Token not
Token ID407
Feature activation+1.20e+00

Pos logit contributions
shadow+2.200
ured+1.919
eyed+1.815
ivated+1.806
Unfortunately+1.787
Neg logit contributions
 is-4.396
 lives-3.922
ie-3.893
 belongs-3.876
 doesn-3.802
Token know
Token ID760
Feature activation-1.53e+00

Pos logit contributions
shadow+2.144
ured+1.941
Unfortunately+1.898
 determined+1.876
Just+1.863
Neg logit contributions
 is-2.336
ie-2.212
 lives-2.137
orns-2.094
 belongs-2.065
Token where
Token ID810
Feature activation-1.16e+00

Pos logit contributions
 is+2.906
ests+2.840
 belongs+2.809
oline+2.809
orns+2.806
Neg logit contributions
shadow-2.561
 louder-2.437
 crashing-2.355
 determined-2.284
 proud-2.282
Token she
Token ID673
Feature activation-4.25e+00

Pos logit contributions
 is+2.288
ible+2.259
iles+2.219
oline+2.218
orns+2.215
Neg logit contributions
shadow-1.796
 louder-1.662
 crashing-1.636
 determined-1.630
 proud-1.600
Token was
Token ID373
Feature activation+3.03e-01

Pos logit contributions
ests+9.776
orns+9.109
ishes+8.732
iles+8.628
 dinosaurs+8.614
Neg logit contributions
 determined-6.867
Just-6.712
 proud-6.494
ured-6.417
 relieved-6.331
Token going
Token ID1016
Feature activation+2.06e-01

Pos logit contributions
shadow+0.581
ured+0.527
 determined+0.520
 louder+0.518
 confident+0.518
Neg logit contributions
 is-0.531
ie-0.484
 lives-0.481
orns-0.472
 belongs-0.472
Token,
Token ID11
Feature activation+5.76e-01

Pos logit contributions
shadow+0.470
ured+0.434
 determined+0.428
 louder+0.428
 confident+0.427
Neg logit contributions
 is-0.285
ie-0.253
 lives-0.252
orns-0.246
 belongs-0.246
Token but
Token ID475
Feature activation-2.90e-01

Pos logit contributions
shadow+0.891
ured+0.788
 confident+0.764
 determined+0.761
ivated+0.760
Neg logit contributions
 is-1.238
ie-1.139
 lives-1.135
 belongs-1.114
orns-1.111
Token she
Token ID673
Feature activation-1.69e+00

Pos logit contributions
 is+0.566
ie+0.526
orns+0.525
oline+0.521
 lives+0.520
Neg logit contributions
shadow-0.490
 determined-0.445
 louder-0.441
ured-0.440
 confident-0.437
Token\n
Token ID198
Feature activation-8.04e-02

Pos logit contributions
shadow+1.353
ured+1.236
 determined+1.211
ivated+1.209
 confident+1.203
Neg logit contributions
 is-1.200
 lives-1.064
 belongs-1.042
ie-1.039
orns-1.037
Token\n
Token ID198
Feature activation+1.79e+00

Pos logit contributions
 is+0.142
ie+0.132
 lives+0.131
orns+0.129
 belongs+0.129
Neg logit contributions
shadow-0.151
ured-0.135
Unfortunately-0.134
 louder-0.134
 determined-0.134
Token"
Token ID1
Feature activation+1.06e+00

Pos logit contributions
shadow+3.154
ured+2.973
 crashing+2.895
 determined+2.894
ivated+2.891
Neg logit contributions
 is-3.620
 lives-3.172
 belongs-3.084
orns-3.060
ons-3.030
TokenStop
Token ID19485
Feature activation-6.05e-01

Pos logit contributions
shadow+2.425
ured+2.240
 determined+2.209
 confident+2.205
 pleased+2.184
Neg logit contributions
 is-1.518
 lives-1.314
orns-1.303
ie-1.286
 belongs-1.277
Token,
Token ID11
Feature activation-1.98e-01

Pos logit contributions
 is+0.908
 lives+0.840
ests+0.840
orns+0.838
 belongs+0.830
Neg logit contributions
shadow-1.308
 louder-1.184
ured-1.179
 determined-1.173
ivated-1.172
Token these
Token ID777
Feature activation-4.25e+00

Pos logit contributions
 is+0.388
 lives+0.360
orns+0.360
ie+0.359
 belongs+0.354
Neg logit contributions
shadow-0.333
ured-0.300
 louder-0.299
 determined-0.294
 proud-0.293
Token are
Token ID389
Feature activation-1.37e-02

Pos logit contributions
oline+5.191
pa+4.738
ests+4.599
ible+4.535
ie+4.506
Neg logit contributions
shadow-10.601
 proud-10.521
 click-10.470
 understanding-10.041
 revealing-9.945
Token ours
Token ID16903
Feature activation-1.76e+00

Pos logit contributions
 is+0.025
ie+0.023
 lives+0.022
orns+0.022
oline+0.022
Neg logit contributions
shadow-0.026
ured-0.023
 louder-0.023
 determined-0.023
 confident-0.023
Token!"
Token ID2474
Feature activation+4.65e-01

Pos logit contributions
orns+2.425
oline+2.417
ests+2.379
ible+2.337
ons+2.323
Neg logit contributions
shadow-3.833
 louder-3.622
 proud-3.614
ured-3.553
 determined-3.483
Token Sara
Token ID24799
Feature activation+4.09e-01

Pos logit contributions
shadow+0.881
ured+0.796
 determined+0.782
 confident+0.778
 louder+0.778
Neg logit contributions
 is-0.837
 lives-0.757
ie-0.754
 belongs-0.741
orns-0.736
Token said
Token ID531
Feature activation+1.99e+00

Pos logit contributions
shadow+0.835
ured+0.763
 determined+0.751
 confident+0.749
 louder+0.748
Neg logit contributions
 is-0.669
ie-0.603
 lives-0.598
 belongs-0.586
orns-0.585
Token pony
Token ID26902
Feature activation-6.02e-01

Pos logit contributions
shadow+0.059
 determined+0.053
ured+0.053
 confident+0.053
 louder+0.053
Neg logit contributions
 is-0.058
ie-0.053
 lives-0.052
orns-0.052
 belongs-0.052
Token named
Token ID3706
Feature activation-1.09e+00

Pos logit contributions
 is+1.110
iles+1.042
ible+1.034
orns+1.033
oline+1.033
Neg logit contributions
shadow-1.045
 proud-0.961
 determined-0.951
 confident-0.950
ured-0.940
Token Spark
Token ID17732
Feature activation-4.03e+00

Pos logit contributions
 is+1.741
 belongs+1.619
ie+1.608
 lives+1.605
ible+1.590
Neg logit contributions
shadow-2.260
ured-2.056
 louder-2.026
 confident-2.025
 proud-2.013
Tokeny
Token ID88
Feature activation-1.78e+00

Pos logit contributions
pa+6.551
 foods+6.474
ons+6.391
iles+6.391
 tastes+6.362
Neg logit contributions
 proud-7.876
 louder-7.671
 confident-7.630
eyed-7.629
shadow-7.524
Token.
Token ID13
Feature activation-5.51e-01

Pos logit contributions
ons+2.769
orns+2.766
iles+2.762
 is+2.742
ible+2.664
Neg logit contributions
shadow-3.260
 proud-3.249
 confident-3.205
Just-3.195
 louder-3.180
Token Spark
Token ID17732
Feature activation-4.23e+00

Pos logit contributions
 is+0.998
ie+0.956
ible+0.937
 lives+0.935
iles+0.931
Neg logit contributions
shadow-0.971
ured-0.877
 louder-0.877
 confident-0.876
ivated-0.869
Tokeny
Token ID88
Feature activation-2.68e+00

Pos logit contributions
 foods+5.903
 dreams+5.856
pa+5.808
 acts+5.624
iles+5.590
Neg logit contributions
eyed-9.345
 proud-9.230
shadow-9.104
 grateful-8.928
Just-8.905
Token was
Token ID373
Feature activation+7.77e-01

Pos logit contributions
iles+4.505
orns+4.381
ons+4.304
ishes+4.283
ible+4.157
Neg logit contributions
 proud-5.180
Just-5.176
 determined-5.140
 confident-5.122
shadow-4.786
Token very
Token ID845
Feature activation+2.68e-01

Pos logit contributions
shadow+1.297
ured+1.166
Unfortunately+1.125
ivated+1.124
 louder+1.115
Neg logit contributions
 is-1.593
 lives-1.458
ie-1.454
 belongs-1.415
orns-1.407
Token small
Token ID1402
Feature activation-1.38e+00

Pos logit contributions
shadow+0.425
ured+0.378
 determined+0.371
 louder+0.370
 confident+0.369
Neg logit contributions
 is-0.557
ie-0.515
 lives-0.514
orns-0.506
 belongs-0.506
Token and
Token ID290
Feature activation-3.01e-01

Pos logit contributions
 is+2.433
ons+2.422
orns+2.392
oline+2.388
ie+2.376
Neg logit contributions
shadow-2.334
 determined-2.298
 louder-2.284
 confident-2.281
 proud-2.255
Token I
Token ID314
Feature activation-1.51e+00

Pos logit contributions
 is+0.718
ie+0.677
orns+0.675
ible+0.657
ons+0.657
Neg logit contributions
shadow-0.894
ured-0.822
 louder-0.814
 determined-0.806
Unfortunately-0.798
Token don
Token ID836
Feature activation+7.04e-01

Pos logit contributions
orns+2.333
oline+2.324
ons+2.256
ests+2.237
iles+2.218
Neg logit contributions
shadow-2.995
 determined-2.916
 proud-2.912
 confident-2.853
 pleased-2.792
Token't
Token ID470
Feature activation-7.08e-02

Pos logit contributions
shadow+1.037
ured+0.922
 confident+0.896
 louder+0.895
 determined+0.892
Neg logit contributions
 is-1.569
ie-1.440
 lives-1.428
 belongs-1.417
orns-1.411
Token know
Token ID760
Feature activation-3.10e+00

Pos logit contributions
 is+0.140
 lives+0.129
ie+0.129
orns+0.128
oline+0.128
Neg logit contributions
shadow-0.118
ured-0.106
 confident-0.105
 determined-0.105
 louder-0.105
Token where
Token ID810
Feature activation-2.50e+00

Pos logit contributions
ests+7.063
oline+6.879
orns+6.780
ible+6.612
ops+6.501
Neg logit contributions
shadow-4.889
 louder-4.759
 crashing-4.274
 understanding-4.199
 proud-4.166
Token he
Token ID339
Feature activation-4.20e+00

Pos logit contributions
ible+5.500
ests+5.477
orns+5.435
oline+5.364
icer+5.248
Neg logit contributions
shadow-3.589
 louder-3.430
 crashing-3.337
 proud-3.317
 determined-3.305
Token is
Token ID318
Feature activation-1.03e+00

Pos logit contributions
ests+8.100
orns+7.535
iles+7.332
oline+7.313
berries+7.156
Neg logit contributions
shadow-7.991
uring-7.854
ured-7.796
 determined-7.774
 proud-7.615
Token."
Token ID526
Feature activation-1.05e+00

Pos logit contributions
 is+1.534
ests+1.467
oline+1.458
orns+1.452
ible+1.431
Neg logit contributions
shadow-2.199
ured-1.991
 determined-1.982
 louder-1.979
 proud-1.973
Token Tim
Token ID5045
Feature activation-9.90e-01

Pos logit contributions
 is+1.560
ie+1.515
ible+1.485
 lives+1.472
orns+1.469
Neg logit contributions
shadow-2.182
Unfortunately-2.027
Just-2.027
 proud-2.012
ured-1.996
Token wanted
Token ID2227
Feature activation-1.53e+00

Pos logit contributions
 is+1.523
orns+1.487
ests+1.437
 lives+1.433
iles+1.428
Neg logit contributions
shadow-1.998
 determined-1.867
Just-1.863
 louder-1.856
 proud-1.823
Token to
Token ID284
Feature activation+8.15e-02

Pos logit contributions
 is+3.516
ible+3.461
oline+3.435
ie+3.424
orns+3.403
Neg logit contributions
shadow-1.862
 louder-1.784
 determined-1.728
 proud-1.709
 confident-1.686
Token<|endoftext|>
Token ID50256
Feature activation-2.24e+00

Pos logit contributions
 is+0.274
ie+0.254
 lives+0.253
oline+0.250
orns+0.249
Neg logit contributions
shadow-0.291
 confident-0.259
 louder-0.259
Unfortunately-0.258
ured-0.258
TokenOnce
Token ID7454
Feature activation-2.68e+00

Pos logit contributions
ishes+5.021
ie+5.002
 belongs+4.986
ible+4.967
iles+4.958
Neg logit contributions
shadow-2.951
 proud-2.774
Unfortunately-2.740
Just-2.722
eyed-2.657
Token upon
Token ID2402
Feature activation-1.15e+00

Pos logit contributions
ible+5.637
orns+5.424
 foods+5.399
ests+5.340
 acts+5.338
Neg logit contributions
shadow-4.461
 proud-4.275
 pleased-4.206
 determined-4.027
 impressed-3.997
Token a
Token ID257
Feature activation-1.61e+00

Pos logit contributions
 is+2.267
ible+2.199
 belongs+2.194
orns+2.183
ie+2.182
Neg logit contributions
shadow-1.814
 proud-1.690
 louder-1.684
Just-1.665
 confident-1.664
Token time
Token ID640
Feature activation-2.18e+00

Pos logit contributions
 is+1.722
orns+1.707
ishes+1.623
 belongs+1.617
ible+1.616
Neg logit contributions
shadow-3.911
Just-3.832
 determined-3.818
 confident-3.764
 proud-3.756
Token there
Token ID612
Feature activation-4.18e+00

Pos logit contributions
ible+4.006
orns+3.922
oline+3.823
ishes+3.771
ests+3.749
Neg logit contributions
shadow-4.034
 determined-3.673
 proud-3.631
ivated-3.599
 pleased-3.596
Token was
Token ID373
Feature activation-1.56e+00

Pos logit contributions
ishes+8.186
orns+7.908
iles+7.714
ests+7.426
unn+7.313
Neg logit contributions
 proud-8.486
 relieved-8.199
 determined-7.938
 pleased-7.849
But-7.677
Token a
Token ID257
Feature activation+6.66e-01

Pos logit contributions
 is+3.304
 belongs+3.198
ie+3.126
 acts+3.104
ible+3.087
Neg logit contributions
shadow-2.249
 proud-2.236
 determined-2.159
 confident-2.154
 louder-2.151
Token man
Token ID582
Feature activation+4.65e-02

Pos logit contributions
shadow+1.404
ured+1.294
ivated+1.256
Unfortunately+1.252
 determined+1.247
Neg logit contributions
 is-1.061
 lives-0.951
ie-0.950
 belongs-0.912
oline-0.902
Token.
Token ID13
Feature activation-2.14e-01

Pos logit contributions
shadow+0.091
ured+0.083
 determined+0.082
 confident+0.082
 louder+0.082
Neg logit contributions
 is-0.078
 lives-0.071
ie-0.071
 belongs-0.070
orns-0.070
Token He
Token ID679
Feature activation-1.49e+00

Pos logit contributions
 is+0.448
ie+0.420
 lives+0.414
orns+0.412
 belongs+0.411
Neg logit contributions
shadow-0.329
 confident-0.292
ured-0.291
 louder-0.289
 determined-0.289
Token He
Token ID679
Feature activation-2.06e+00

Pos logit contributions
 is+0.649
ie+0.625
 lives+0.611
orns+0.603
 belongs+0.601
Neg logit contributions
shadow-0.474
 louder-0.417
ured-0.417
Unfortunately-0.416
 determined-0.414
Token asks
Token ID7893
Feature activation-1.56e+00

Pos logit contributions
ests+2.361
ie+2.359
ons+2.323
orns+2.293
pa+2.210
Neg logit contributions
 determined-4.821
 louder-4.733
shadow-4.653
 proud-4.598
 confident-4.587
Token his
Token ID465
Feature activation-1.12e+00

Pos logit contributions
 is+3.305
 belongs+3.232
 lives+3.203
orns+3.185
oline+3.173
Neg logit contributions
 louder-2.232
shadow-2.218
ured-2.034
 determined-2.002
 understanding-2.000
Token mom
Token ID1995
Feature activation-1.09e+00

Pos logit contributions
 is+1.932
ie+1.899
orns+1.884
oline+1.876
ons+1.841
Neg logit contributions
shadow-2.090
 louder-1.940
 proud-1.918
 determined-1.918
 confident-1.899
Token where
Token ID810
Feature activation-2.06e+00

Pos logit contributions
 is+1.883
orns+1.846
oline+1.831
 lives+1.806
ible+1.799
Neg logit contributions
shadow-1.983
 louder-1.863
 determined-1.791
ured-1.773
 proud-1.759
Token he
Token ID339
Feature activation-4.17e+00

Pos logit contributions
ible+4.804
orns+4.689
oline+4.628
iles+4.623
ishes+4.593
Neg logit contributions
shadow-2.437
 determined-2.327
 louder-2.326
 proud-2.325
 crashing-2.255
Token is
Token ID318
Feature activation+3.73e-01

Pos logit contributions
ests+6.240
oline+5.569
orns+5.559
 foods+5.510
iles+5.371
Neg logit contributions
shadow-9.295
 proud-9.213
 determined-9.120
 victorious-8.863
 relieved-8.854
Token.
Token ID13
Feature activation+5.17e-01

Pos logit contributions
shadow+0.836
ured+0.771
 determined+0.760
 louder+0.760
 confident+0.759
Neg logit contributions
 is-0.536
ie-0.476
 lives-0.474
 belongs-0.463
orns-0.462
Token His
Token ID2399
Feature activation-1.51e+00

Pos logit contributions
shadow+0.941
ured+0.849
 determined+0.831
 confident+0.829
ivated+0.828
Neg logit contributions
 is-0.970
 lives-0.878
ie-0.876
 belongs-0.862
orns-0.858
Token mom
Token ID1995
Feature activation-1.18e+00

Pos logit contributions
oline+2.727
ie+2.727
orns+2.716
 is+2.689
ons+2.673
Neg logit contributions
shadow-2.622
 determined-2.535
 proud-2.531
 confident-2.456
 louder-2.438
Token says
Token ID1139
Feature activation-1.06e+00

Pos logit contributions
orns+1.206
ests+1.172
oline+1.159
 is+1.157
ie+1.143
Neg logit contributions
shadow-2.876
 louder-2.762
 determined-2.761
 proud-2.726
ured-2.704
Token.
Token ID13
Feature activation+3.53e-01

Pos logit contributions
shadow+2.427
ured+2.229
 determined+2.211
Unfortunately+2.194
 confident+2.187
Neg logit contributions
 is-1.325
ie-1.148
 lives-1.105
oline-1.062
 belongs-1.060
Token They
Token ID1119
Feature activation+5.78e-01

Pos logit contributions
shadow+0.689
ured+0.625
 determined+0.616
 confident+0.614
 louder+0.612
Neg logit contributions
 is-0.616
 lives-0.556
ie-0.554
 belongs-0.545
orns-0.544
Token come
Token ID1282
Feature activation+5.23e-01

Pos logit contributions
shadow+1.168
ured+1.060
 determined+1.041
 confident+1.039
Unfortunately+1.035
Neg logit contributions
 is-0.980
 lives-0.877
ie-0.872
 belongs-0.850
orns-0.844
Token to
Token ID284
Feature activation+1.45e-01

Pos logit contributions
shadow+0.885
ured+0.797
 confident+0.784
 determined+0.782
ivated+0.779
Neg logit contributions
 is-1.041
ie-0.953
 lives-0.947
 belongs-0.926
orns-0.923
Token see
Token ID766
Feature activation-2.29e+00

Pos logit contributions
shadow+0.267
ured+0.241
 louder+0.238
 determined+0.238
 confident+0.238
Neg logit contributions
 is-0.263
ie-0.242
 lives-0.240
orns-0.237
 belongs-0.237
Token what
Token ID644
Feature activation-4.16e+00

Pos logit contributions
ests+4.610
orns+4.575
oline+4.532
ie+4.492
ible+4.410
Neg logit contributions
 louder-3.737
shadow-3.650
 proud-3.474
 understanding-3.365
 crashing-3.335
Token is
Token ID318
Feature activation+1.04e+00

Pos logit contributions
orns+9.369
ible+9.169
ishes+8.952
ie+8.595
icer+8.513
Neg logit contributions
 proud-7.490
 louder-7.212
 determined-7.021
 relieved-6.851
 pleased-6.802
Token going
Token ID1016
Feature activation+6.39e-01

Pos logit contributions
shadow+1.852
ured+1.658
Unfortunately+1.627
ivated+1.606
 confident+1.604
Neg logit contributions
 is-2.078
 lives-1.859
 belongs-1.835
ie-1.823
 happened-1.796
Token on
Token ID319
Feature activation-2.10e+00

Pos logit contributions
shadow+1.156
ured+1.047
 determined+1.026
 confident+1.026
ivated+1.019
Neg logit contributions
 is-1.222
ie-1.099
 lives-1.097
 belongs-1.081
orns-1.078
Token.
Token ID13
Feature activation-4.12e-01

Pos logit contributions
orns+3.395
ests+3.390
 acts+3.349
 belongs+3.321
ie+3.313
Neg logit contributions
ured-3.775
shadow-3.764
 proud-3.734
 louder-3.726
uring-3.679
Token They
Token ID1119
Feature activation+5.92e-01

Pos logit contributions
 is+0.744
ie+0.737
orns+0.700
ible+0.693
ons+0.691
Neg logit contributions
shadow-0.728
Unfortunately-0.667
ured-0.665
 louder-0.665
 determined-0.659
Token play
Token ID711
Feature activation-2.66e-01

Pos logit contributions
shadow+0.108
ured+0.098
 determined+0.096
 louder+0.096
 confident+0.096
Neg logit contributions
 is-0.119
ie-0.110
 lives-0.109
orns-0.109
 belongs-0.108
Token with
Token ID351
Feature activation-1.96e+00

Pos logit contributions
 is+0.596
ie+0.561
 lives+0.559
orns+0.558
 belongs+0.554
Neg logit contributions
shadow-0.364
ured-0.324
 louder-0.320
 confident-0.316
 determined-0.314
Token.
Token ID13
Feature activation-1.06e+00

Pos logit contributions
 is+3.321
 acts+3.279
iles+3.244
ible+3.223
orns+3.218
Neg logit contributions
shadow-3.588
 proud-3.474
 louder-3.402
 confident-3.359
ured-3.328
Token All
Token ID1439
Feature activation-2.44e+00

Pos logit contributions
 is+2.143
iles+2.101
ible+2.073
ie+2.059
orns+2.047
Neg logit contributions
shadow-1.676
Unfortunately-1.502
Just-1.481
 proud-1.475
 louder-1.452
Token of
Token ID286
Feature activation-2.21e+00

Pos logit contributions
ible+4.749
ests+4.741
orns+4.624
iles+4.613
 is+4.541
Neg logit contributions
 proud-3.852
 pleased-3.675
 confident-3.655
shadow-3.634
 determined-3.611
Token her
Token ID607
Feature activation-4.15e+00

Pos logit contributions
 belongs+4.704
 is+4.581
ible+4.436
 lives+4.416
 acts+4.388
Neg logit contributions
shadow-3.247
 louder-3.200
 proud-3.096
 understanding-3.025
Unfortunately-3.013
Token friends
Token ID2460
Feature activation-1.62e+00

Pos logit contributions
orns+8.859
ible+8.347
iles+8.166
oline+8.005
 belongs+7.879
Neg logit contributions
 proud-8.497
 determined-7.670
 pleased-7.600
 confident-7.351
 grateful-7.312
Token came
Token ID1625
Feature activation+9.04e-01

Pos logit contributions
orns+2.513
ests+2.412
 is+2.385
iles+2.369
oline+2.288
Neg logit contributions
shadow-3.288
 proud-3.177
 determined-3.166
 pleased-3.102
 confident-3.089
Token to
Token ID284
Feature activation-4.09e-01

Pos logit contributions
shadow+1.679
ured+1.533
Unfortunately+1.501
 confident+1.496
 determined+1.492
Neg logit contributions
 is-1.677
ie-1.542
 lives-1.508
 belongs-1.458
oline-1.457
Token the
Token ID262
Feature activation-7.60e-01

Pos logit contributions
 is+0.781
orns+0.731
 belongs+0.723
ie+0.722
 lives+0.716
Neg logit contributions
shadow-0.705
 louder-0.639
 confident-0.633
ured-0.631
 proud-0.629
Token party
Token ID2151
Feature activation-5.77e-01

Pos logit contributions
 is+1.075
ie+1.034
orns+1.022
ible+1.002
iles+0.997
Neg logit contributions
shadow-1.614
 proud-1.525
 louder-1.512
 determined-1.503
 confident-1.498
Token<|endoftext|>
Token ID50256
Feature activation-2.32e+00

Pos logit contributions
 is+0.926
ie+0.909
 lives+0.870
oline+0.860
 belongs+0.860
Neg logit contributions
shadow-0.924
Unfortunately-0.827
 proud-0.818
 confident-0.815
Just-0.812
TokenOnce
Token ID7454
Feature activation-2.69e+00

Pos logit contributions
ie+5.231
ishes+5.157
 belongs+5.102
ible+5.074
iles+5.070
Neg logit contributions
shadow-3.091
 proud-2.859
Just-2.851
Unfortunately-2.851
eyed-2.787
Token upon
Token ID2402
Feature activation-9.12e-01

Pos logit contributions
ible+5.618
orns+5.494
 foods+5.334
 acts+5.298
ests+5.220
Neg logit contributions
shadow-4.518
 pleased-4.276
 proud-4.264
 determined-4.051
 impressed-4.048
Token a
Token ID257
Feature activation-1.47e+00

Pos logit contributions
 is+1.756
ie+1.689
ible+1.686
orns+1.680
 belongs+1.678
Neg logit contributions
shadow-1.504
 proud-1.380
 louder-1.373
 determined-1.372
 confident-1.368
Token time
Token ID640
Feature activation-2.11e+00

Pos logit contributions
 is+1.561
orns+1.553
ishes+1.453
ests+1.444
 belongs+1.442
Neg logit contributions
shadow-3.616
 determined-3.541
Just-3.516
 confident-3.468
 proud-3.449
Token there
Token ID612
Feature activation-4.13e+00

Pos logit contributions
ible+3.807
orns+3.801
oline+3.644
ishes+3.595
ests+3.565
Neg logit contributions
shadow-3.986
 determined-3.660
 pleased-3.600
 proud-3.571
ivated-3.556
Token was
Token ID373
Feature activation-1.46e+00

Pos logit contributions
ishes+7.960
orns+7.817
iles+7.428
unn+7.145
ests+7.074
Neg logit contributions
 proud-8.548
 relieved-8.244
 determined-8.183
 pleased-8.021
 crashing-7.734
Token a
Token ID257
Feature activation+6.54e-01

Pos logit contributions
 is+3.051
 belongs+2.938
ie+2.885
 acts+2.857
ible+2.848
Neg logit contributions
shadow-2.142
 proud-2.116
 determined-2.061
 confident-2.048
 louder-2.033
Token little
Token ID1310
Feature activation-7.92e-01

Pos logit contributions
shadow+1.357
ured+1.248
ivated+1.213
Unfortunately+1.209
 determined+1.203
Neg logit contributions
 is-1.062
 lives-0.955
ie-0.954
 belongs-0.918
oline-0.907
Token girl
Token ID2576
Feature activation-4.69e-01

Pos logit contributions
 is+1.002
 belongs+0.894
orns+0.889
 lives+0.867
ible+0.867
Neg logit contributions
shadow-1.816
 louder-1.738
 proud-1.736
 confident-1.723
 determined-1.710
Token named
Token ID3706
Feature activation-1.03e+00

Pos logit contributions
 is+0.869
orns+0.800
 belongs+0.796
oline+0.794
ible+0.791
Neg logit contributions
shadow-0.820
 determined-0.740
ured-0.738
 confident-0.737
 proud-0.737
Token It
Token ID632
Feature activation-1.09e+00

Pos logit contributions
 is+1.858
ie+1.807
iles+1.782
ible+1.773
orns+1.768
Neg logit contributions
shadow-1.153
 confident-1.065
Unfortunately-1.043
 proud-1.039
ivated-1.035
Token was
Token ID373
Feature activation+7.70e-01

Pos logit contributions
orns+0.952
iles+0.942
 is+0.939
ests+0.895
ible+0.891
Neg logit contributions
shadow-2.784
 proud-2.742
 determined-2.741
 confident-2.731
Just-2.704
Token the
Token ID262
Feature activation+6.36e-01

Pos logit contributions
shadow+1.480
ured+1.339
ivated+1.308
Unfortunately+1.294
 confident+1.291
Neg logit contributions
 is-1.390
ie-1.267
 lives-1.252
 belongs-1.215
orns-1.206
Token most
Token ID749
Feature activation-3.86e-01

Pos logit contributions
shadow+1.376
ured+1.251
 determined+1.227
ivated+1.224
Unfortunately+1.222
Neg logit contributions
 is-0.984
ie-0.882
 lives-0.880
 belongs-0.853
orns-0.849
Token beautiful
Token ID4950
Feature activation-1.00e+00

Pos logit contributions
 is+0.794
ie+0.741
orns+0.740
 lives+0.733
 belongs+0.728
Neg logit contributions
shadow-0.602
 confident-0.558
 proud-0.554
 determined-0.548
 pleased-0.546
Token past
Token ID1613
Feature activation-4.12e+00

Pos logit contributions
 is+1.323
iles+1.253
orns+1.229
 belongs+1.211
 lives+1.199
Neg logit contributions
shadow-2.175
 confident-2.091
 proud-2.078
 louder-2.070
ured-2.064
Tokenel
Token ID417
Feature activation-2.64e+00

Pos logit contributions
 doesn+5.988
 acts+5.948
 lives+5.943
 belongs+5.833
 cares+5.670
Neg logit contributions
 crashing-8.672
Her-8.546
 slightly-8.148
eyed-8.066
ured-8.029
Token she
Token ID673
Feature activation-1.84e+00

Pos logit contributions
 acts+3.926
 is+3.919
text+3.915
ests+3.707
 lives+3.705
Neg logit contributions
 crashing-5.200
ured-5.079
shadow-5.077
 proud-4.902
 louder-4.883
Token had
Token ID550
Feature activation-5.42e-01

Pos logit contributions
orns+2.711
 is+2.704
ests+2.693
iles+2.656
 acts+2.589
Neg logit contributions
shadow-3.792
 proud-3.714
 determined-3.683
 confident-3.667
 pleased-3.593
Token ever
Token ID1683
Feature activation-9.33e-01

Pos logit contributions
 is+0.814
ie+0.739
 lives+0.732
orns+0.728
 belongs+0.724
Neg logit contributions
shadow-1.146
ured-1.060
 determined-1.050
 confident-1.048
 proud-1.041
Token seen
Token ID1775
Feature activation-1.50e+00

Pos logit contributions
 is+1.522
 lives+1.373
ests+1.370
 belongs+1.369
ie+1.363
Neg logit contributions
shadow-1.890
 confident-1.730
 louder-1.716
 determined-1.713
 pleased-1.705
Token<|endoftext|>
Token ID50256
Feature activation-2.40e+00

Pos logit contributions
 is+3.135
ie+3.123
 belongs+3.123
iles+3.106
ible+3.066
Neg logit contributions
shadow-2.490
 proud-2.322
Unfortunately-2.317
Just-2.214
 pleased-2.202
TokenOnce
Token ID7454
Feature activation-2.81e+00

Pos logit contributions
ishes+5.489
ie+5.484
ible+5.367
 belongs+5.362
iles+5.332
Neg logit contributions
shadow-3.116
 proud-2.967
Unfortunately-2.960
Just-2.924
eyed-2.812
Token upon
Token ID2402
Feature activation-1.10e+00

Pos logit contributions
ible+6.131
orns+5.895
 foods+5.771
ests+5.729
 acts+5.670
Neg logit contributions
shadow-4.602
 proud-4.427
 pleased-4.409
 impressed-4.161
 determined-4.146
Token a
Token ID257
Feature activation-1.54e+00

Pos logit contributions
 is+2.163
ible+2.094
ie+2.091
 belongs+2.089
orns+2.082
Neg logit contributions
shadow-1.760
 proud-1.642
 louder-1.625
 confident-1.613
Just-1.612
Token time
Token ID640
Feature activation-1.96e+00

Pos logit contributions
orns+1.643
 is+1.637
ishes+1.537
ons+1.533
 belongs+1.523
Neg logit contributions
shadow-3.764
 determined-3.669
Just-3.662
 confident-3.620
 proud-3.614
Token there
Token ID612
Feature activation-4.11e+00

Pos logit contributions
ible+3.505
orns+3.487
oline+3.331
ishes+3.294
ests+3.293
Neg logit contributions
shadow-3.744
 determined-3.415
 proud-3.370
 pleased-3.359
ivated-3.347
Token lived
Token ID5615
Feature activation+2.47e-01

Pos logit contributions
ishes+8.143
orns+7.919
iles+7.513
ests+7.330
unn+7.315
Neg logit contributions
 proud-8.593
 relieved-8.169
 determined-8.089
 pleased-7.998
But-7.770
Token a
Token ID257
Feature activation+1.71e-01

Pos logit contributions
shadow+0.470
ured+0.427
 determined+0.421
 louder+0.420
 confident+0.419
Neg logit contributions
 is-0.435
ie-0.396
 lives-0.395
orns-0.387
 belongs-0.387
Token kid
Token ID5141
Feature activation-1.11e+00

Pos logit contributions
shadow+0.322
ured+0.291
 determined+0.289
 confident+0.288
 louder+0.288
Neg logit contributions
 is-0.302
ie-0.276
 lives-0.274
orns-0.271
 belongs-0.270
Token named
Token ID3706
Feature activation-1.52e+00

Pos logit contributions
 is+2.101
ible+2.004
orns+1.999
 belongs+1.979
oline+1.974
Neg logit contributions
shadow-1.807
 proud-1.690
 determined-1.650
 confident-1.650
Just-1.641
Token Tom
Token ID4186
Feature activation-2.43e+00

Pos logit contributions
 is+2.381
orns+2.311
ie+2.212
 acts+2.180
 lives+2.162
Neg logit contributions
shadow-3.132
 louder-2.919
 crashing-2.897
ured-2.884
 proud-2.847
Token<|endoftext|>
Token ID50256
Feature activation-2.29e+00

Pos logit contributions
 is+2.203
ie+2.130
 belongs+2.117
ible+2.114
oline+2.101
Neg logit contributions
shadow-1.932
 proud-1.766
Unfortunately-1.752
Just-1.709
 pleased-1.697
TokenOnce
Token ID7454
Feature activation-2.63e+00

Pos logit contributions
ishes+5.149
ible+5.130
 belongs+5.094
ie+5.061
 lives+5.033
Neg logit contributions
shadow-3.027
 proud-2.852
Just-2.820
Unfortunately-2.816
eyed-2.691
Token upon
Token ID2402
Feature activation-1.12e+00

Pos logit contributions
ible+5.590
orns+5.274
 acts+5.220
 foods+5.210
ests+5.199
Neg logit contributions
shadow-4.331
 pleased-4.075
 proud-4.067
 determined-3.868
 impressed-3.862
Token a
Token ID257
Feature activation-1.53e+00

Pos logit contributions
 is+2.209
ible+2.143
 belongs+2.135
ie+2.119
orns+2.108
Neg logit contributions
shadow-1.789
 proud-1.663
 louder-1.648
Just-1.646
 determined-1.630
Token time
Token ID640
Feature activation-2.12e+00

Pos logit contributions
 is+1.635
orns+1.611
ible+1.524
ishes+1.523
 belongs+1.521
Neg logit contributions
shadow-3.751
Just-3.652
 determined-3.647
 confident-3.593
 proud-3.583
Token there
Token ID612
Feature activation-4.11e+00

Pos logit contributions
ible+3.868
orns+3.791
oline+3.708
ishes+3.641
ests+3.631
Neg logit contributions
shadow-3.957
 determined-3.613
 pleased-3.539
 proud-3.539
ivated-3.510
Token was
Token ID373
Feature activation-1.17e+00

Pos logit contributions
ishes+7.956
orns+7.693
iles+7.385
ests+7.278
unn+7.156
Neg logit contributions
 proud-8.336
 relieved-8.060
 determined-7.888
 pleased-7.781
But-7.718
Token a
Token ID257
Feature activation+7.22e-01

Pos logit contributions
 is+2.371
 belongs+2.256
ie+2.225
 acts+2.202
 lives+2.195
Neg logit contributions
shadow-1.816
 proud-1.743
 determined-1.713
 confident-1.702
 louder-1.701
Token 3
Token ID513
Feature activation-1.06e+00

Pos logit contributions
shadow+1.501
ured+1.383
ivated+1.339
Unfortunately+1.334
 determined+1.325
Neg logit contributions
 is-1.174
 lives-1.054
ie-1.052
 belongs-1.010
oline-0.997
Token year
Token ID614
Feature activation+6.78e-01

Pos logit contributions
 is+1.515
 belongs+1.398
orns+1.359
ible+1.347
 lives+1.342
Neg logit contributions
shadow-2.274
 proud-2.174
 determined-2.160
 confident-2.148
 pleased-2.147
Token old
Token ID1468
Feature activation-6.25e-01

Pos logit contributions
shadow+1.309
ured+1.192
 determined+1.168
ivated+1.167
Unfortunately+1.166
Neg logit contributions
 is-1.187
ie-1.100
 lives-1.084
 belongs-1.061
oline-1.051
Token<|endoftext|>
Token ID50256
Feature activation-2.39e+00

Pos logit contributions
 is+1.962
ie+1.913
iles+1.889
 belongs+1.887
ible+1.871
Neg logit contributions
shadow-1.745
 proud-1.605
Unfortunately-1.584
 confident-1.554
 pleased-1.547
TokenOnce
Token ID7454
Feature activation-2.28e+00

Pos logit contributions
ie+5.468
ishes+5.432
 belongs+5.371
ible+5.308
iles+5.295
Neg logit contributions
shadow-3.051
 proud-2.936
Unfortunately-2.918
Just-2.899
 victorious-2.778
Token upon
Token ID2402
Feature activation-1.10e+00

Pos logit contributions
ible+4.557
orns+4.388
 foods+4.308
 acts+4.287
ests+4.277
Neg logit contributions
shadow-3.964
 proud-3.772
 pleased-3.762
 determined-3.569
 impressed-3.567
Token a
Token ID257
Feature activation-1.38e+00

Pos logit contributions
 is+2.166
ible+2.098
 belongs+2.093
ie+2.092
orns+2.092
Neg logit contributions
shadow-1.771
 proud-1.640
 louder-1.631
Just-1.618
 determined-1.616
Token time
Token ID640
Feature activation-1.81e+00

Pos logit contributions
 is+1.447
orns+1.406
 belongs+1.338
ests+1.315
ons+1.313
Neg logit contributions
shadow-3.428
 determined-3.346
Just-3.315
 confident-3.288
 proud-3.272
Token there
Token ID612
Feature activation-4.10e+00

Pos logit contributions
ible+3.113
orns+3.058
oline+2.977
ishes+2.922
 is+2.915
Neg logit contributions
shadow-3.534
 determined-3.224
 proud-3.187
ivated-3.178
 pleased-3.171
Token was
Token ID373
Feature activation-1.29e+00

Pos logit contributions
ishes+8.097
orns+7.858
iles+7.491
ests+7.228
unn+7.155
Neg logit contributions
 proud-8.463
 relieved-8.199
 determined-8.168
 pleased-7.977
But-7.739
Token a
Token ID257
Feature activation+8.49e-01

Pos logit contributions
 is+2.649
 belongs+2.519
ie+2.475
 acts+2.455
ible+2.445
Neg logit contributions
shadow-1.985
 proud-1.929
 determined-1.899
 confident-1.884
 louder-1.852
Token little
Token ID1310
Feature activation-4.13e-01

Pos logit contributions
shadow+1.797
ured+1.660
Unfortunately+1.605
ivated+1.605
uring+1.580
Neg logit contributions
 is-1.356
 lives-1.219
ie-1.216
 belongs-1.158
oline-1.143
Token girl
Token ID2576
Feature activation-2.46e-01

Pos logit contributions
 is+0.505
 belongs+0.441
orns+0.439
 lives+0.436
oline+0.430
Neg logit contributions
shadow-0.982
 louder-0.928
 confident-0.923
 proud-0.922
 determined-0.919
Token named
Token ID3706
Feature activation-6.09e-01

Pos logit contributions
 is+0.448
orns+0.408
 belongs+0.407
 lives+0.406
oline+0.405
Neg logit contributions
shadow-0.446
ured-0.402
 determined-0.401
 confident-0.400
 louder-0.399
Token I
Token ID314
Feature activation-5.31e-01

Pos logit contributions
 is+0.704
ie+0.678
orns+0.661
 lives+0.653
ons+0.652
Neg logit contributions
shadow-0.588
ured-0.521
 louder-0.518
Unfortunately-0.512
 determined-0.510
Token don
Token ID836
Feature activation+9.57e-01

Pos logit contributions
 is+0.687
orns+0.647
oline+0.643
ie+0.639
ons+0.627
Neg logit contributions
shadow-1.228
 determined-1.130
 proud-1.124
ured-1.119
 confident-1.119
Token't
Token ID470
Feature activation+1.50e-01

Pos logit contributions
shadow+1.463
ured+1.303
 confident+1.275
 louder+1.264
 pleased+1.261
Neg logit contributions
 is-2.110
ie-1.900
ons-1.889
 lives-1.884
 belongs-1.879
Token know
Token ID760
Feature activation-2.24e+00

Pos logit contributions
shadow+0.270
ured+0.242
 determined+0.239
 louder+0.239
 confident+0.239
Neg logit contributions
 is-0.282
ie-0.259
 lives-0.258
orns-0.254
 belongs-0.254
Token where
Token ID810
Feature activation-2.19e+00

Pos logit contributions
oline+4.329
orns+4.296
ests+4.284
 is+4.178
ible+4.172
Neg logit contributions
shadow-3.916
 louder-3.708
 crashing-3.435
 determined-3.373
ured-3.342
Token he
Token ID339
Feature activation-4.10e+00

Pos logit contributions
orns+4.940
ible+4.934
oline+4.848
ests+4.753
iles+4.728
Neg logit contributions
shadow-2.937
 louder-2.723
 determined-2.647
 crashing-2.619
 proud-2.570
Token is
Token ID318
Feature activation-8.02e-01

Pos logit contributions
ests+6.186
orns+5.794
oline+5.459
iles+5.364
berries+5.291
Neg logit contributions
shadow-9.295
ured-9.096
 proud-8.964
 determined-8.908
uring-8.876
Token.
Token ID13
Feature activation-1.04e+00

Pos logit contributions
 is+1.241
ests+1.156
orns+1.151
ie+1.149
oline+1.142
Neg logit contributions
shadow-1.673
ured-1.525
 louder-1.513
 determined-1.503
 proud-1.496
Token I
Token ID314
Feature activation-1.03e+00

Pos logit contributions
ie+2.229
 is+2.205
orns+2.161
ons+2.154
oline+2.124
Neg logit contributions
shadow-1.522
 louder-1.338
Just-1.330
ured-1.326
Unfortunately-1.319
Token'm
Token ID1101
Feature activation+2.47e+00

Pos logit contributions
 is+1.647
orns+1.636
oline+1.616
ons+1.599
ie+1.599
Neg logit contributions
shadow-1.999
 determined-1.853
 proud-1.833
 confident-1.822
eyed-1.807
Token so
Token ID523
Feature activation+1.57e+00

Pos logit contributions
shadow+4.026
ured+3.681
Just+3.602
Unfortunately+3.591
Half+3.510
Neg logit contributions
 is-5.361
 lives-4.863
ie-4.717
 happened-4.701
 dinosaurs-4.619
Token took
Token ID1718
Feature activation-1.34e+00

Pos logit contributions
orns+0.794
ests+0.757
 is+0.725
ible+0.722
oline+0.721
Neg logit contributions
shadow-3.639
 crashing-3.440
 proud-3.432
 louder-3.427
 determined-3.392
Token our
Token ID674
Feature activation-2.16e+00

Pos logit contributions
 is+2.496
ests+2.443
orns+2.399
ie+2.394
oline+2.384
Neg logit contributions
shadow-2.274
 louder-2.158
ured-2.095
 crashing-2.017
 proud-1.981
Token drawings
Token ID23388
Feature activation+1.00e-01

Pos logit contributions
ests+4.187
 is+4.185
ie+4.123
orns+4.108
text+3.988
Neg logit contributions
shadow-3.399
ured-3.318
 proud-3.229
 louder-3.177
 determined-3.158
Token and
Token ID290
Feature activation-6.87e-01

Pos logit contributions
shadow+0.201
ured+0.183
 determined+0.181
 louder+0.181
 confident+0.180
Neg logit contributions
 is-0.164
ie-0.149
 lives-0.148
orns-0.146
 belongs-0.145
Token our
Token ID674
Feature activation-1.99e+00

Pos logit contributions
 is+1.151
ie+1.093
orns+1.071
oline+1.069
ests+1.055
Neg logit contributions
shadow-1.316
 louder-1.202
ured-1.199
 determined-1.186
 proud-1.177
Token past
Token ID1613
Feature activation-4.10e+00

Pos logit contributions
 is+2.883
orns+2.777
ie+2.724
ests+2.708
oline+2.660
Neg logit contributions
 proud-4.019
ured-4.007
shadow-4.002
 determined-4.000
 louder-3.969
Tokenels
Token ID1424
Feature activation-1.28e+00

Pos logit contributions
unn+4.863
 lives+4.848
ests+4.814
 acts+4.785
 happened+4.720
Neg logit contributions
shadow-9.034
ured-9.006
 crashing-8.817
uring-8.796
 revealing-8.645
Token!"
Token ID2474
Feature activation+1.35e+00

Pos logit contributions
 is+1.832
ie+1.755
orns+1.751
oline+1.748
 lives+1.747
Neg logit contributions
shadow-2.634
 louder-2.436
ured-2.434
 proud-2.406
 crashing-2.398
Token
Token ID198
Feature activation+4.02e-01

Pos logit contributions
shadow+2.991
ured+2.761
ivated+2.729
 determined+2.702
 confident+2.688
Neg logit contributions
 is-2.160
 lives-1.819
ie-1.749
 happened-1.736
 belongs-1.730
Token
Token ID198
Feature activation+2.25e+00

Pos logit contributions
shadow+0.789
ured+0.721
 determined+0.709
 confident+0.707
 louder+0.706
Neg logit contributions
 is-0.686
ie-0.619
 lives-0.618
 belongs-0.603
orns-0.603
TokenMom
Token ID29252
Feature activation-2.82e-01

Pos logit contributions
shadow+3.824
ivated+3.662
ured+3.622
 confident+3.569
ooting+3.492
Neg logit contributions
 is-4.694
 lives-4.186
ie-4.024
 belongs-3.899
text-3.895
Token felt
Token ID2936
Feature activation+1.07e+00

Pos logit contributions
 is+1.907
orns+1.880
ie+1.861
iles+1.840
 belongs+1.813
Neg logit contributions
shadow-2.047
 determined-1.928
 proud-1.923
 confident-1.912
Just-1.908
Token better
Token ID1365
Feature activation+2.20e+00

Pos logit contributions
shadow+1.584
ured+1.352
Just+1.310
Unfortunately+1.304
ooting+1.298
Neg logit contributions
 is-2.450
 lives-2.244
ie-2.194
 belongs-2.188
 dreams-2.172
Token.
Token ID13
Feature activation-1.12e-01

Pos logit contributions
shadow+5.426
ured+5.101
Unfortunately+5.068
ivated+5.010
 crashing+4.976
Neg logit contributions
 is-2.931
 lives-2.421
ie-2.417
 dreams-2.337
pa-2.269
Token She
Token ID1375
Feature activation-7.06e-01

Pos logit contributions
 is+0.236
ie+0.223
 lives+0.219
 belongs+0.217
orns+0.217
Neg logit contributions
shadow-0.170
ured-0.151
 confident-0.149
 determined-0.149
 louder-0.148
Token said
Token ID531
Feature activation+8.18e-01

Pos logit contributions
 is+1.172
orns+1.137
ie+1.123
iles+1.097
oline+1.088
Neg logit contributions
shadow-1.322
 determined-1.249
 proud-1.243
 confident-1.240
Just-1.228
Token,
Token ID11
Feature activation+2.98e+00

Pos logit contributions
shadow+1.958
ured+1.800
Unfortunately+1.770
ivated+1.770
 confident+1.769
Neg logit contributions
 is-1.105
ie-0.946
 lives-0.942
 belongs-0.910
ons-0.896
Token "
Token ID366
Feature activation+5.11e-01

Pos logit contributions
shadow+5.674
ured+5.232
ivated+5.177
uring+5.005
har+4.980
Neg logit contributions
 is-6.261
 lives-5.188
 doesn-5.152
 happened-5.115
 isn-5.031
TokenI
Token ID40
Feature activation-5.72e-01

Pos logit contributions
shadow+0.711
ured+0.619
 determined+0.608
 louder+0.602
 confident+0.600
Neg logit contributions
 is-1.181
 lives-1.095
ie-1.094
orns-1.081
 belongs-1.074
Token'm
Token ID1101
Feature activation+2.69e+00

Pos logit contributions
 is+0.911
orns+0.883
ie+0.872
oline+0.867
ons+0.847
Neg logit contributions
shadow-1.136
 determined-1.058
 proud-1.056
 confident-1.054
 pleased-1.039
Token sorry
Token ID7926
Feature activation-8.13e-01

Pos logit contributions
shadow+3.458
ured+3.087
Just+3.077
Unfortunately+2.978
rolling+2.929
Neg logit contributions
 is-6.951
 lives-6.298
 dreams-6.155
 happened-6.139
ie-6.103
Token,
Token ID11
Feature activation-8.04e-02

Pos logit contributions
 is+1.174
orns+1.166
oline+1.135
ible+1.123
ons+1.114
Neg logit contributions
shadow-1.735
 louder-1.638
 proud-1.576
ured-1.572
 crashing-1.567
Token up
Token ID510
Feature activation+4.56e-01

Pos logit contributions
 is+1.584
 lives+1.462
ible+1.456
 acts+1.441
oline+1.440
Neg logit contributions
shadow-2.042
 louder-1.807
ured-1.800
 proud-1.777
 crashing-1.777
Token."
Token ID526
Feature activation-3.98e-01

Pos logit contributions
shadow+1.029
ured+0.951
 determined+0.937
 louder+0.936
 confident+0.935
Neg logit contributions
 is-0.640
ie-0.568
 lives-0.562
orns-0.549
 belongs-0.549
Token
Token ID198
Feature activation+1.14e-01

Pos logit contributions
 is+0.785
ie+0.740
 lives+0.729
orns+0.722
 belongs+0.720
Neg logit contributions
shadow-0.661
ured-0.592
Unfortunately-0.586
Just-0.585
 confident-0.584
TokenThe
Token ID464
Feature activation-7.85e-03

Pos logit contributions
shadow+0.218
ured+0.196
 determined+0.195
 confident+0.194
 louder+0.194
Neg logit contributions
 is-0.201
ie-0.184
 lives-0.184
 belongs-0.181
orns-0.180
Token frog
Token ID21264
Feature activation+3.69e-02

Pos logit contributions
 is+0.013
ie+0.012
 lives+0.011
orns+0.011
 belongs+0.011
Neg logit contributions
shadow-0.016
ured-0.015
 louder-0.015
 determined-0.015
 confident-0.014
Token was
Token ID373
Feature activation+2.82e+00

Pos logit contributions
shadow+0.080
ured+0.074
 determined+0.074
 louder+0.073
 confident+0.073
Neg logit contributions
 is-0.054
 lives-0.048
ie-0.048
orns-0.048
 belongs-0.047
Token so
Token ID523
Feature activation+1.84e+00

Pos logit contributions
shadow+4.738
ured+4.183
 sidelines+4.042
Half+4.027
ooting+4.025
Neg logit contributions
 is-6.306
 lives-5.619
ie-5.419
 happened-5.348
 dreams-5.316
Token excited
Token ID6568
Feature activation-9.79e-01

Pos logit contributions
shadow+2.368
ured+2.071
Unfortunately+1.933
 louder+1.882
Just+1.877
Neg logit contributions
 is-4.773
 lives-4.395
ie-4.297
 belongs-4.284
 happened-4.261
Token.
Token ID13
Feature activation+1.32e+00

Pos logit contributions
 is+1.805
 lives+1.703
orns+1.696
ests+1.696
 belongs+1.678
Neg logit contributions
shadow-1.716
 louder-1.575
ured-1.568
Just-1.541
 confident-1.536
Token He
Token ID679
Feature activation+7.04e-01

Pos logit contributions
shadow+2.371
ured+2.087
ivated+2.085
 determined+2.050
 louder+2.046
Neg logit contributions
 is-2.618
 lives-2.332
 belongs-2.309
oline-2.298
 doesn-2.249
Token said
Token ID531
Feature activation+1.43e+00

Pos logit contributions
shadow+1.315
ured+1.182
ivated+1.170
 louder+1.161
 confident+1.157
Neg logit contributions
 is-1.285
ie-1.173
 lives-1.167
 belongs-1.139
oline-1.132
Token.
Token ID13
Feature activation-7.28e-01

Pos logit contributions
shadow+0.808
ured+0.746
 determined+0.737
 louder+0.735
 confident+0.734
Neg logit contributions
 is-0.478
ie-0.422
 lives-0.419
 belongs-0.409
orns-0.409
Token Kai
Token ID22091
Feature activation-9.67e-01

Pos logit contributions
 is+1.381
ie+1.333
oline+1.285
 belongs+1.278
 lives+1.275
Neg logit contributions
shadow-1.269
Unfortunately-1.127
 determined-1.117
Just-1.115
 confident-1.106
Token's
Token ID338
Feature activation-5.84e-01

Pos logit contributions
 is+1.517
orns+1.455
oline+1.418
ests+1.407
iles+1.402
Neg logit contributions
shadow-1.921
 determined-1.815
 confident-1.783
ured-1.782
 proud-1.779
Token mom
Token ID1995
Feature activation-3.95e-01

Pos logit contributions
 is+0.915
orns+0.849
ie+0.842
 belongs+0.837
oline+0.833
Neg logit contributions
shadow-1.187
 determined-1.134
 confident-1.112
 proud-1.110
 louder-1.106
Token was
Token ID373
Feature activation+4.01e+00

Pos logit contributions
 is+0.569
orns+0.527
 lives+0.518
oline+0.514
 belongs+0.513
Neg logit contributions
shadow-0.853
 determined-0.791
ured-0.783
 louder-0.783
 confident-0.782
Token very
Token ID845
Feature activation+3.95e+00

Pos logit contributions
shadow+7.320
ured+6.943
rolling+6.693
 sidelines+6.611
ooting+6.587
Neg logit contributions
 is-9.122
 lives-7.770
 happened-7.333
pa-7.325
ie-7.262
Token proud
Token ID6613
Feature activation-5.03e-01

Pos logit contributions
shadow+6.737
ured+6.501
Half+6.413
Unfortunately+6.392
 crashing+6.324
Neg logit contributions
 is-9.184
 lives-8.157
ie-7.913
 belongs-7.799
 happened-7.548
Token of
Token ID286
Feature activation-6.80e-01

Pos logit contributions
 is+0.814
 belongs+0.752
 lives+0.752
orns+0.749
ible+0.745
Neg logit contributions
shadow-0.993
 louder-0.925
 confident-0.920
 determined-0.919
 proud-0.911
Token him
Token ID683
Feature activation-2.03e-01

Pos logit contributions
 is+1.001
oline+0.932
ie+0.923
 belongs+0.922
 lives+0.910
Neg logit contributions
shadow-1.440
 determined-1.337
ured-1.336
 louder-1.331
 crashing-1.322
Token and
Token ID290
Feature activation+7.65e-01

Pos logit contributions
 is+0.352
ie+0.324
 lives+0.322
orns+0.322
oline+0.318
Neg logit contributions
shadow-0.386
ured-0.353
 determined-0.350
 louder-0.350
 confident-0.349
Token she
Token ID673
Feature activation-3.74e-02

Pos logit contributions
shadow+1.490
ured+1.348
ivated+1.337
 louder+1.318
Unfortunately+1.316
Neg logit contributions
 is-1.357
 lives-1.227
ie-1.213
 belongs-1.194
orns-1.165
Token said
Token ID531
Feature activation+1.70e+00

Pos logit contributions
shadow+2.734
ured+2.463
ooting+2.439
 determined+2.430
 confident+2.428
Neg logit contributions
 is-1.693
ie-1.489
 lives-1.451
 belongs-1.406
 happened-1.402
Token.
Token ID13
Feature activation+9.97e-01

Pos logit contributions
shadow+4.534
Unfortunately+4.161
ured+4.117
ivated+4.102
Half+4.071
Neg logit contributions
 is-1.983
ie-1.577
 lives-1.556
 happened-1.552
 dreams-1.501
Token
Token ID198
Feature activation+2.97e-01

Pos logit contributions
shadow+2.131
ured+1.952
ivated+1.915
 determined+1.913
 confident+1.892
Neg logit contributions
 is-1.620
 lives-1.401
 happened-1.349
 belongs-1.347
oline-1.345
Token
Token ID198
Feature activation+2.64e+00

Pos logit contributions
shadow+0.581
ured+0.529
 determined+0.521
 confident+0.520
 louder+0.519
Neg logit contributions
 is-0.510
ie-0.462
 lives-0.461
orns-0.452
 belongs-0.451
TokenMom
Token ID29252
Feature activation+1.93e-01

Pos logit contributions
shadow+4.844
ivated+4.692
ured+4.596
 confident+4.532
ooting+4.527
Neg logit contributions
 is-5.305
 lives-4.607
ara-4.571
ons-4.393
 belongs-4.297
Token was
Token ID373
Feature activation+3.93e+00

Pos logit contributions
shadow+0.395
ured+0.361
 determined+0.357
 louder+0.357
 confident+0.356
Neg logit contributions
 is-0.310
ie-0.281
 lives-0.280
orns-0.275
 belongs-0.274
Token busy
Token ID8179
Feature activation+2.40e+00

Pos logit contributions
shadow+7.290
ured+6.828
Unfortunately+6.460
rolling+6.413
mic+6.396
Neg logit contributions
 is-9.050
 lives-7.604
 happened-7.496
pa-7.295
ons-7.221
Token in
Token ID287
Feature activation+1.03e+00

Pos logit contributions
shadow+5.648
Unfortunately+5.195
Half+5.064
Just+5.013
ured+5.011
Neg logit contributions
 is-3.766
ie-3.270
 lives-3.206
 dreams-3.050
 acts-3.019
Token the
Token ID262
Feature activation+3.57e-01

Pos logit contributions
shadow+2.261
ured+2.040
Just+2.004
Unfortunately+1.983
 pleased+1.982
Neg logit contributions
 is-1.617
 lives-1.435
ie-1.377
 dreams-1.354
orns-1.350
Token kitchen
Token ID9592
Feature activation+5.66e-01

Pos logit contributions
shadow+0.784
ured+0.720
 determined+0.708
 confident+0.705
 louder+0.705
Neg logit contributions
 is-0.537
 lives-0.477
ie-0.476
 belongs-0.465
orns-0.464
Token.
Token ID13
Feature activation+8.60e-01

Pos logit contributions
shadow+1.348
ured+1.234
 determined+1.220
 confident+1.218
ivated+1.217
Neg logit contributions
 is-0.754
 lives-0.652
ie-0.651
 belongs-0.631
orns-0.621
Token
Token ID220
Feature activation-6.40e-01

Pos logit contributions
 is+0.134
ie+0.125
 lives+0.124
 belongs+0.122
orns+0.122
Neg logit contributions
shadow-0.135
ured-0.123
 louder-0.121
 confident-0.121
 determined-0.121
Token
Token ID198
Feature activation-6.17e-01

Pos logit contributions
 is+1.173
ie+1.128
 lives+1.118
oline+1.091
orns+1.091
Neg logit contributions
shadow-1.150
Unfortunately-1.016
 proud-1.012
 louder-1.009
ured-1.008
Token
Token ID198
Feature activation+6.80e-01

Pos logit contributions
 is+1.131
 lives+1.077
ie+1.074
 belongs+1.065
ible+1.055
Neg logit contributions
shadow-1.098
Unfortunately-0.992
Just-0.986
 proud-0.977
 louder-0.968
TokenBen
Token ID11696
Feature activation-6.86e-01

Pos logit contributions
shadow+1.398
ured+1.276
 determined+1.254
 confident+1.250
ivated+1.248
Neg logit contributions
 is-1.127
ie-1.000
 lives-0.996
orns-0.978
oline-0.977
Token asked
Token ID1965
Feature activation-5.46e-01

Pos logit contributions
 is+1.099
orns+1.056
ests+1.014
 lives+1.008
iles+1.005
Neg logit contributions
shadow-1.338
 determined-1.255
ured-1.253
 proud-1.233
 confident-1.228
Token,
Token ID11
Feature activation+3.32e+00

Pos logit contributions
 is+0.812
orns+0.752
 lives+0.752
 belongs+0.744
oline+0.731
Neg logit contributions
shadow-1.166
 louder-1.097
ured-1.081
 determined-1.080
 pleased-1.067
Token "
Token ID366
Feature activation+6.30e-01

Pos logit contributions
shadow+6.708
ured+6.183
har+6.122
pering+6.057
uring+6.022
Neg logit contributions
 is-6.492
ie-5.467
 happened-5.353
 isn-5.169
 doesn-5.165
TokenWhat
Token ID2061
Feature activation-3.11e+00

Pos logit contributions
shadow+0.940
ured+0.830
 confident+0.817
 determined+0.813
 louder+0.809
Neg logit contributions
 is-1.380
ie-1.271
 lives-1.264
orns-1.255
 belongs-1.243
Token kind
Token ID1611
Feature activation-1.99e+00

Pos logit contributions
orns+6.412
oline+6.003
ie+5.983
text+5.703
ishes+5.677
Neg logit contributions
 proud-5.327
 determined-5.229
 louder-5.216
 understanding-5.131
 pleased-5.088
Token of
Token ID286
Feature activation-1.63e+00

Pos logit contributions
ible+3.480
orns+3.464
ests+3.368
 is+3.350
oline+3.333
Neg logit contributions
shadow-3.428
 proud-3.335
 louder-3.333
 pleased-3.305
 confident-3.192
Token cheese
Token ID9891
Feature activation-1.17e+00

Pos logit contributions
oline+2.100
 is+2.078
ie+2.054
text+2.039
orns+1.983
Neg logit contributions
shadow-3.567
 proud-3.518
 determined-3.431
 louder-3.418
 pleased-3.414
Token that
Token ID326
Feature activation-2.12e-01

Pos logit contributions
shadow+0.578
ured+0.529
 determined+0.522
 louder+0.521
 confident+0.521
Neg logit contributions
 is-0.426
ie-0.383
 lives-0.381
orns-0.373
 belongs-0.373
Token she
Token ID673
Feature activation-2.99e-01

Pos logit contributions
 is+0.426
ie+0.398
orns+0.396
 lives+0.395
oline+0.393
Neg logit contributions
shadow-0.344
 determined-0.311
 confident-0.308
 louder-0.308
ured-0.306
Token was
Token ID373
Feature activation+3.00e+00

Pos logit contributions
 is+0.422
orns+0.381
 lives+0.379
ie+0.379
iles+0.369
Neg logit contributions
shadow-0.666
 determined-0.617
ured-0.616
 confident-0.614
 proud-0.612
Token brave
Token ID14802
Feature activation-1.23e+00

Pos logit contributions
shadow+4.278
ured+4.018
Unfortunately+3.985
rolling+3.972
 sidelines+3.947
Neg logit contributions
 is-7.028
 happened-6.442
ie-6.398
ons-6.354
 lives-6.297
Token enough
Token ID1576
Feature activation+6.65e-01

Pos logit contributions
 is+2.136
 belongs+2.044
ons+2.036
iles+2.031
orns+2.027
Neg logit contributions
shadow-2.176
 louder-2.146
 confident-2.124
 determined-2.121
ured-2.056
Token to
Token ID284
Feature activation+1.60e+00

Pos logit contributions
shadow+1.149
ured+1.032
ivated+1.013
 louder+0.999
Unfortunately+0.995
Neg logit contributions
 is-1.327
ie-1.217
 lives-1.208
orns-1.196
 belongs-1.186
Token try
Token ID1949
Feature activation-2.61e-02

Pos logit contributions
shadow+3.086
ured+2.710
ivated+2.678
Unfortunately+2.677
 louder+2.655
Neg logit contributions
 is-3.063
ie-2.730
 lives-2.727
 dreams-2.644
 belongs-2.612
Token something
Token ID1223
Feature activation-3.44e-01

Pos logit contributions
 is+0.044
 lives+0.040
ie+0.040
 belongs+0.039
orns+0.039
Neg logit contributions
shadow-0.051
ured-0.047
 louder-0.047
 determined-0.046
 confident-0.046
Token new
Token ID649
Feature activation-1.25e+00

Pos logit contributions
 is+0.602
orns+0.563
 lives+0.560
ie+0.555
oline+0.555
Neg logit contributions
shadow-0.641
 louder-0.587
ured-0.584
 confident-0.581
 determined-0.579
Token.
Token ID13
Feature activation-2.81e-01

Pos logit contributions
 is+1.995
 lives+1.948
oline+1.911
orns+1.908
ie+1.903
Neg logit contributions
shadow-2.416
 louder-2.279
 confident-2.276
ured-2.270
 determined-2.256
Token After
Token ID2293
Feature activation+1.10e+00

Pos logit contributions
 is+0.619
ie+0.587
 lives+0.579
orns+0.577
iles+0.577
Neg logit contributions
shadow-0.402
ured-0.357
 determined-0.350
Unfortunately-0.349
 confident-0.349
Token mom
Token ID1995
Feature activation+2.57e-01

Pos logit contributions
orns+1.636
 is+1.615
 belongs+1.535
iles+1.523
ible+1.521
Neg logit contributions
shadow-1.864
 determined-1.831
 proud-1.829
 pleased-1.804
 louder-1.794
Tokenmy
Token ID1820
Feature activation+3.90e-01

Pos logit contributions
shadow+0.712
ured+0.667
 determined+0.661
 louder+0.660
 confident+0.659
Neg logit contributions
 is-0.230
ie-0.190
 lives-0.188
orns-0.181
 belongs-0.180
Token and
Token ID290
Feature activation+7.99e-01

Pos logit contributions
shadow+0.781
ured+0.711
 determined+0.700
 confident+0.699
 louder+0.698
Neg logit contributions
 is-0.649
ie-0.587
 lives-0.582
 belongs-0.569
orns-0.566
Token daddy
Token ID35822
Feature activation+2.05e-01

Pos logit contributions
shadow+1.829
ured+1.701
ivated+1.660
Unfortunately+1.656
 confident+1.651
Neg logit contributions
 is-1.130
ie-1.027
 lives-0.987
pa-0.957
ons-0.952
Token laughed
Token ID13818
Feature activation+1.94e+00

Pos logit contributions
shadow+0.386
ured+0.350
 determined+0.345
 louder+0.345
 confident+0.344
Neg logit contributions
 is-0.364
ie-0.332
 lives-0.331
orns-0.326
 belongs-0.326
Token with
Token ID351
Feature activation+3.86e-01

Pos logit contributions
shadow+4.851
Unfortunately+4.307
ured+4.248
pering+4.237
ivated+4.223
Neg logit contributions
 is-2.717
ie-2.316
 dreams-2.213
 lives-2.184
 happened-2.104
Token happiness
Token ID12157
Feature activation+4.25e-01

Pos logit contributions
shadow+0.672
ured+0.601
 determined+0.589
 louder+0.586
 confident+0.585
Neg logit contributions
 is-0.752
ie-0.689
 lives-0.687
orns-0.675
 belongs-0.672
Token.
Token ID13
Feature activation+1.37e+00

Pos logit contributions
shadow+0.957
ured+0.878
 determined+0.868
 confident+0.865
ivated+0.864
Neg logit contributions
 is-0.608
ie-0.542
 lives-0.536
 belongs-0.521
orns-0.521
Token Then
Token ID3244
Feature activation+1.45e+00

Pos logit contributions
shadow+2.992
ured+2.746
 determined+2.719
ivated+2.702
 confident+2.687
Neg logit contributions
 is-2.178
 lives-1.861
ie-1.852
 belongs-1.818
ible-1.805
Token they
Token ID484
Feature activation+3.08e-02

Pos logit contributions
shadow+3.485
ured+3.263
ivated+3.211
pering+3.198
uring+3.169
Neg logit contributions
 is-1.975
ie-1.726
 dreams-1.642
 lives-1.620
 happened-1.599
Token cl
Token ID537
Feature activation+3.76e-01

Pos logit contributions
shadow+0.058
ured+0.053
 louder+0.052
 determined+0.052
 confident+0.052
Neg logit contributions
 is-0.055
 lives-0.050
ie-0.050
orns-0.049
 belongs-0.049
Token terrible
Token ID7818
Feature activation-1.46e+00

Pos logit contributions
shadow+0.017
ured+0.015
 louder+0.015
 determined+0.015
 confident+0.015
Neg logit contributions
 is-0.017
ie-0.016
 lives-0.016
orns-0.016
 belongs-0.015
Token!"
Token ID2474
Feature activation+1.65e-01

Pos logit contributions
 is+2.032
orns+1.990
ests+1.985
 lives+1.874
ible+1.873
Neg logit contributions
shadow-3.164
 louder-3.038
ured-3.021
 determined-2.941
uring-2.923
Token Dan
Token ID6035
Feature activation-3.84e-01

Pos logit contributions
shadow+0.365
ured+0.336
 determined+0.332
 louder+0.331
 confident+0.331
Neg logit contributions
 is-0.237
ie-0.212
 lives-0.211
orns-0.207
 belongs-0.206
Token cried
Token ID16896
Feature activation+1.83e+00

Pos logit contributions
 is+0.525
orns+0.480
 lives+0.475
ie+0.470
 belongs+0.462
Neg logit contributions
shadow-0.857
ured-0.796
 louder-0.795
 determined-0.794
 confident-0.790
Token out
Token ID503
Feature activation+1.92e+00

Pos logit contributions
shadow+4.917
Unfortunately+4.521
ivated+4.478
Half+4.466
 determined+4.436
Neg logit contributions
 is-1.929
ie-1.821
ible-1.538
 happened-1.515
 dreams-1.504
Token.
Token ID13
Feature activation+1.25e+00

Pos logit contributions
shadow+4.866
Unfortunately+4.387
ured+4.305
ivated+4.298
Just+4.258
Neg logit contributions
 is-2.517
ie-2.258
 lives-2.114
 happened-2.110
iles-2.038
Token "
Token ID366
Feature activation+7.36e-01

Pos logit contributions
shadow+2.211
ured+1.987
ivated+1.980
 confident+1.962
 determined+1.939
Neg logit contributions
 is-2.468
 lives-2.176
ie-2.128
 belongs-2.125
oline-2.100
TokenWhat
Token ID2061
Feature activation-3.47e+00

Pos logit contributions
shadow+1.277
ured+1.157
 confident+1.128
 determined+1.128
 louder+1.127
Neg logit contributions
 is-1.430
 lives-1.304
ie-1.299
orns-1.288
 belongs-1.276
Token are
Token ID389
Feature activation-5.54e-01

Pos logit contributions
orns+6.536
ie+5.845
icer+5.783
ishes+5.769
oline+5.698
Neg logit contributions
 determined-7.123
 proud-7.016
 louder-6.916
 understanding-6.783
 impressed-6.685
Token we
Token ID356
Feature activation+7.56e-01

Pos logit contributions
 is+0.685
ie+0.614
oline+0.607
orns+0.605
ible+0.595
Neg logit contributions
shadow-1.334
ured-1.245
 determined-1.235
 louder-1.232
 proud-1.229
Token going
Token ID1016
Feature activation+4.15e-01

Pos logit contributions
shadow+1.296
ured+1.166
Unfortunately+1.139
 confident+1.135
Just+1.128
Neg logit contributions
 is-1.507
ie-1.379
 lives-1.371
 belongs-1.342
orns-1.325
Token,
Token ID11
Feature activation+1.49e+00

Pos logit contributions
shadow+0.667
ured+0.615
 determined+0.607
 louder+0.605
 confident+0.605
Neg logit contributions
 is-0.419
ie-0.373
 lives-0.371
orns-0.362
 belongs-0.361
Token juicy
Token ID35075
Feature activation+9.28e-02

Pos logit contributions
shadow+2.933
ured+2.698
Unfortunately+2.606
uring+2.569
ivated+2.569
Neg logit contributions
 is-2.678
 lives-2.384
 belongs-2.359
ie-2.358
orns-2.279
Token cherry
Token ID23612
Feature activation+2.95e-01

Pos logit contributions
shadow+0.206
ured+0.190
 determined+0.188
 louder+0.187
 confident+0.187
Neg logit contributions
 is-0.132
 lives-0.117
ie-0.117
 belongs-0.115
orns-0.115
Token on
Token ID319
Feature activation+3.65e-01

Pos logit contributions
shadow+0.633
ured+0.581
 determined+0.573
 louder+0.572
 confident+0.572
Neg logit contributions
 is-0.450
ie-0.404
 lives-0.401
orns-0.394
 belongs-0.393
Token a
Token ID257
Feature activation+6.65e-01

Pos logit contributions
shadow+0.722
ured+0.654
 determined+0.646
 confident+0.644
 louder+0.643
Neg logit contributions
 is-0.622
ie-0.564
 lives-0.561
orns-0.552
 belongs-0.549
Token tree
Token ID5509
Feature activation+1.67e+00

Pos logit contributions
shadow+1.453
ured+1.332
 louder+1.308
 confident+1.308
ivated+1.306
Neg logit contributions
 is-1.010
ie-0.903
 lives-0.888
 belongs-0.871
orns-0.862
Token.
Token ID13
Feature activation+7.39e-01

Pos logit contributions
shadow+4.245
ured+3.963
 pleased+3.906
 louder+3.882
 confident+3.875
Neg logit contributions
 is-2.181
ie-1.884
 lives-1.805
 belongs-1.763
orns-1.755
Token Benny
Token ID44275
Feature activation-2.62e-01

Pos logit contributions
shadow+1.293
ured+1.166
 determined+1.138
 louder+1.137
ivated+1.136
Neg logit contributions
 is-1.451
ie-1.314
 lives-1.307
 belongs-1.291
orns-1.286
Token hopped
Token ID45230
Feature activation+2.76e+00

Pos logit contributions
 is+0.358
orns+0.324
 lives+0.321
oline+0.320
ie+0.320
Neg logit contributions
shadow-0.593
 determined-0.551
ured-0.545
Just-0.542
 confident-0.540
Token over
Token ID625
Feature activation-1.01e+00

Pos logit contributions
shadow+5.282
ured+4.700
 confident+4.684
mic+4.673
rolling+4.668
Neg logit contributions
 is-5.637
ie-4.912
oline-4.900
 belongs-4.670
 lives-4.622
Token to
Token ID284
Feature activation+5.50e-01

Pos logit contributions
 is+2.179
 belongs+2.134
 lives+2.106
ie+2.085
ible+2.051
Neg logit contributions
shadow-1.397
 determined-1.279
 louder-1.272
 crashing-1.250
 proud-1.242
Tokenmy
Token ID1820
Feature activation-5.65e-01

Pos logit contributions
 lives+4.721
 belongs+4.612
 happened+4.561
iles+4.539
text+4.507
Neg logit contributions
shadow-6.659
 louder-6.179
Just-6.052
 determined-6.042
eyed-6.034
Token was
Token ID373
Feature activation+2.94e+00

Pos logit contributions
 is+0.708
orns+0.674
iles+0.648
 lives+0.642
ie+0.637
Neg logit contributions
shadow-1.309
 determined-1.236
 proud-1.219
 confident-1.218
Just-1.215
Token happy
Token ID3772
Feature activation+6.44e-01

Pos logit contributions
shadow+5.191
ured+5.050
 sidelines+4.722
Just+4.682
pering+4.661
Neg logit contributions
 is-6.226
 dreams-5.325
 lives-5.307
 happened-5.300
ie-5.228
Token that
Token ID326
Feature activation-2.85e-01

Pos logit contributions
shadow+1.256
ured+1.150
ivated+1.121
 determined+1.115
Unfortunately+1.109
Neg logit contributions
 is-1.142
ie-1.036
 lives-1.019
orns-0.996
 belongs-0.991
Token he
Token ID339
Feature activation-7.20e-01

Pos logit contributions
 is+0.587
orns+0.550
oline+0.550
ie+0.549
 lives+0.548
Neg logit contributions
shadow-0.451
 determined-0.403
 louder-0.402
 confident-0.401
 proud-0.399
Token had
Token ID550
Feature activation+7.51e-01

Pos logit contributions
 is+1.018
orns+0.957
iles+0.947
ie+0.938
 lives+0.930
Neg logit contributions
shadow-1.573
 proud-1.467
 confident-1.466
 determined-1.462
ured-1.436
Token organized
Token ID8389
Feature activation-4.57e-01

Pos logit contributions
shadow+1.534
ured+1.408
ivated+1.378
 louder+1.378
 determined+1.374
Neg logit contributions
 is-1.254
 lives-1.118
ie-1.104
 belongs-1.090
ons-1.087
Token it
Token ID340
Feature activation-9.78e-03

Pos logit contributions
 is+0.813
ie+0.757
 lives+0.757
orns+0.752
 belongs+0.746
Neg logit contributions
shadow-0.841
 louder-0.770
 confident-0.764
ured-0.762
 determined-0.760
Token nicely
Token ID16576
Feature activation-8.28e-01

Pos logit contributions
 is+0.017
 lives+0.016
ie+0.016
orns+0.016
 belongs+0.015
Neg logit contributions
shadow-0.018
ured-0.017
 determined-0.017
 louder-0.017
 confident-0.017
Token.
Token ID13
Feature activation-3.83e-01

Pos logit contributions
 is+1.398
 lives+1.306
orns+1.300
ie+1.286
oline+1.278
Neg logit contributions
shadow-1.593
ured-1.460
 determined-1.448
 confident-1.440
 louder-1.438
Token He
Token ID679
Feature activation-2.70e-01

Pos logit contributions
 is+0.823
ie+0.787
 lives+0.773
orns+0.773
iles+0.768
Neg logit contributions
shadow-0.567
Unfortunately-0.494
ured-0.492
 confident-0.492
 determined-0.490
Token.
Token ID13
Feature activation+2.47e-01

Pos logit contributions
shadow+1.623
ured+1.498
 confident+1.485
 determined+1.481
ivated+1.475
Neg logit contributions
 is-0.889
ie-0.771
 lives-0.749
 belongs-0.743
orns-0.730
Token\n
Token ID198
Feature activation-1.10e+00

Pos logit contributions
shadow+0.484
ured+0.440
 determined+0.434
 louder+0.433
 confident+0.432
Neg logit contributions
 is-0.425
ie-0.386
 lives-0.384
orns-0.377
 belongs-0.377
Token\n
Token ID198
Feature activation+1.02e+00

Pos logit contributions
 is+2.080
ible+2.079
orns+2.031
 lives+2.023
 belongs+2.021
Neg logit contributions
shadow-1.829
Unfortunately-1.661
Just-1.636
 proud-1.613
eyed-1.592
Token"
Token ID1
Feature activation+3.58e-01

Pos logit contributions
shadow+2.038
ured+1.858
ivated+1.847
 confident+1.841
 determined+1.839
Neg logit contributions
 is-1.810
 lives-1.606
ie-1.587
 belongs-1.565
oline-1.548
TokenHelp
Token ID22087
Feature activation-1.99e+00

Pos logit contributions
shadow+0.726
ured+0.662
 determined+0.652
 confident+0.651
 louder+0.649
Neg logit contributions
 is-0.598
ie-0.539
 lives-0.538
orns-0.529
 belongs-0.526
Token!
Token ID0
Feature activation-4.69e-01

Pos logit contributions
 is+3.093
 lives+3.079
 belongs+3.077
ests+3.056
oline+2.905
Neg logit contributions
shadow-4.242
 louder-3.947
Just-3.688
 crashing-3.683
ured-3.674
Token Help
Token ID10478
Feature activation-2.80e+00

Pos logit contributions
 is+1.061
ie+1.024
orns+1.013
 lives+1.001
oline+0.995
Neg logit contributions
shadow-0.649
ured-0.554
 louder-0.551
ivated-0.540
 determined-0.539
Token!"
Token ID2474
Feature activation-4.23e-01

Pos logit contributions
 lives+4.403
 dreams+4.387
 acts+4.302
ests+4.283
 belongs+4.190
Neg logit contributions
shadow-5.971
 louder-5.635
 crashing-5.379
Unfortunately-5.328
Just-5.310
Token Tom
Token ID4186
Feature activation+2.13e-01

Pos logit contributions
 is+0.550
ie+0.511
orns+0.505
 lives+0.492
ible+0.486
Neg logit contributions
shadow-0.974
ured-0.911
 louder-0.904
 proud-0.903
 confident-0.903
Token and
Token ID290
Feature activation+4.98e-01

Pos logit contributions
shadow+0.413
ured+0.375
 determined+0.371
 louder+0.370
 confident+0.369
Neg logit contributions
 is-0.368
ie-0.335
 lives-0.334
orns-0.328
 belongs-0.328
Token Lily
Token ID20037
Feature activation+7.25e-01

Pos logit contributions
shadow+1.229
ured+1.141
 determined+1.125
 confident+1.123
ivated+1.122
Neg logit contributions
 is-0.611
ie-0.534
 lives-0.526
 belongs-0.511
orns-0.502
Token a
Token ID257
Feature activation+9.25e-01

Pos logit contributions
 is+1.082
ie+1.022
 belongs+1.005
 lives+0.999
orns+0.998
Neg logit contributions
shadow-0.992
 determined-0.917
 louder-0.916
 confident-0.904
ured-0.903
Token girl
Token ID2576
Feature activation-1.88e-01

Pos logit contributions
shadow+1.997
ured+1.841
ivated+1.796
Unfortunately+1.782
Just+1.766
Neg logit contributions
 is-1.441
 lives-1.290
ie-1.273
 belongs-1.224
ible-1.196
Token named
Token ID3706
Feature activation-1.33e+00

Pos logit contributions
 is+0.337
orns+0.308
 lives+0.306
ie+0.306
oline+0.305
Neg logit contributions
shadow-0.348
ured-0.314
 determined-0.314
 confident-0.312
 louder-0.310
Token Molly
Token ID30236
Feature activation-1.12e+00

Pos logit contributions
 is+2.138
ests+1.969
ible+1.964
 lives+1.955
iles+1.944
Neg logit contributions
shadow-2.823
ured-2.515
 louder-2.502
ivated-2.488
 proud-2.484
Token.
Token ID13
Feature activation-7.18e-01

Pos logit contributions
 is+1.733
orns+1.658
ests+1.633
oline+1.598
 belongs+1.597
Neg logit contributions
shadow-2.320
 determined-2.140
ured-2.138
 louder-2.136
 proud-2.123
Token She
Token ID1375
Feature activation-6.97e-01

Pos logit contributions
 is+1.406
ie+1.316
 lives+1.313
 belongs+1.312
orns+1.312
Neg logit contributions
shadow-1.175
 determined-1.072
 louder-1.064
ured-1.063
 confident-1.063
Token was
Token ID373
Feature activation+1.59e+00

Pos logit contributions
 is+0.845
orns+0.811
iles+0.768
ests+0.763
ible+0.761
Neg logit contributions
shadow-1.633
 determined-1.564
 proud-1.540
Just-1.539
 confident-1.537
Token a
Token ID257
Feature activation+1.32e+00

Pos logit contributions
shadow+2.881
ured+2.528
Unfortunately+2.489
Just+2.473
ivated+2.416
Neg logit contributions
 is-3.132
 lives-2.904
ie-2.828
 dreams-2.732
 belongs-2.673
Token very
Token ID845
Feature activation+1.70e+00

Pos logit contributions
shadow+2.559
ured+2.277
Unfortunately+2.220
ivated+2.194
uring+2.184
Neg logit contributions
 is-2.458
 lives-2.262
ie-2.208
 belongs-2.120
 dreams-2.103
Token intelligent
Token ID12661
Feature activation-1.00e+00

Pos logit contributions
shadow+2.850
Unfortunately+2.497
ured+2.479
 crashing+2.412
 louder+2.408
Neg logit contributions
 is-3.659
 lives-3.430
ie-3.336
 belongs-3.314
 happened-3.160
Token girl
Token ID2576
Feature activation-1.42e+00

Pos logit contributions
 is+1.373
orns+1.292
ons+1.282
 belongs+1.257
ests+1.223
Neg logit contributions
shadow-2.151
 confident-2.103
 louder-2.090
 determined-2.085
 proud-2.063
Token.
Token ID13
Feature activation-4.82e-01

Pos logit contributions
 is+0.769
orns+0.723
 lives+0.717
ie+0.713
oline+0.711
Neg logit contributions
shadow-1.001
 louder-0.935
ured-0.924
 determined-0.914
 confident-0.909
Token She
Token ID1375
Feature activation+6.31e-01

Pos logit contributions
 is+1.020
ie+0.971
 lives+0.953
orns+0.942
 belongs+0.939
Neg logit contributions
shadow-0.731
ured-0.653
Unfortunately-0.647
 determined-0.645
Just-0.639
Token thought
Token ID1807
Feature activation-5.92e-01

Pos logit contributions
shadow+1.308
ured+1.190
ivated+1.175
 confident+1.165
 louder+1.165
Neg logit contributions
 is-1.030
ie-0.925
 lives-0.921
 belongs-0.898
oline-0.879
Token and
Token ID290
Feature activation-5.36e-01

Pos logit contributions
 is+1.052
orns+0.983
 lives+0.983
ie+0.972
 belongs+0.965
Neg logit contributions
shadow-1.086
 determined-1.004
 louder-0.992
 confident-0.991
ured-0.987
Token thought
Token ID1807
Feature activation-2.10e-01

Pos logit contributions
 is+0.926
orns+0.864
oline+0.852
 lives+0.848
 belongs+0.847
Neg logit contributions
shadow-1.025
 determined-0.940
 louder-0.937
ured-0.934
 confident-0.922
Token until
Token ID1566
Feature activation+2.40e-01

Pos logit contributions
 is+0.369
 lives+0.341
orns+0.337
ie+0.337
 belongs+0.334
Neg logit contributions
shadow-0.397
ured-0.362
 determined-0.359
 louder-0.359
 confident-0.357
Token she
Token ID673
Feature activation-1.33e+00

Pos logit contributions
shadow+0.432
ured+0.390
 determined+0.384
 louder+0.383
 confident+0.382
Neg logit contributions
 is-0.451
ie-0.414
 lives-0.412
orns-0.405
 belongs-0.405
Token had
Token ID550
Feature activation-1.99e-01

Pos logit contributions
 is+1.987
orns+1.963
ests+1.869
iles+1.861
 belongs+1.852
Neg logit contributions
shadow-2.815
 determined-2.698
Just-2.645
 proud-2.632
 confident-2.616
Token an
Token ID281
Feature activation-2.81e-02

Pos logit contributions
 is+0.365
ie+0.339
 lives+0.336
orns+0.332
 belongs+0.331
Neg logit contributions
shadow-0.365
 determined-0.329
ured-0.326
 louder-0.326
 confident-0.326
Token idea
Token ID2126
Feature activation+1.45e-02

Pos logit contributions
 is+0.042
ie+0.037
 lives+0.037
orns+0.037
 belongs+0.036
Neg logit contributions
shadow-0.061
ured-0.056
 determined-0.056
 confident-0.056
 louder-0.056
Token.
Token ID13
Feature activation+1.29e-01

Pos logit contributions
shadow+0.033
ured+0.031
 determined+0.030
 louder+0.030
 confident+0.030
Neg logit contributions
 is-0.020
 lives-0.018
ie-0.018
orns-0.017
 belongs-0.017
Token day
Token ID1110
Feature activation-1.06e+00

Pos logit contributions
 is+0.467
orns+0.416
ible+0.409
 belongs+0.406
ons+0.403
Neg logit contributions
shadow-0.949
 confident-0.883
 proud-0.881
 determined-0.880
 louder-0.876
Token,
Token ID11
Feature activation-5.25e-01

Pos logit contributions
 is+1.613
orns+1.544
 lives+1.535
 belongs+1.529
ible+1.520
Neg logit contributions
shadow-2.192
 louder-2.048
 proud-2.001
 determined-1.985
Just-1.974
Token Tim
Token ID5045
Feature activation-2.35e+00

Pos logit contributions
 is+0.760
orns+0.699
 lives+0.697
ie+0.693
 belongs+0.687
Neg logit contributions
shadow-1.149
 louder-1.078
 determined-1.054
 proud-1.053
 confident-1.047
Tokenmy
Token ID1820
Feature activation-1.37e+00

Pos logit contributions
ests+2.902
 acts+2.861
 lives+2.843
 belongs+2.817
 is+2.816
Neg logit contributions
shadow-5.642
Just-5.216
 determined-5.112
 louder-5.055
eyed-5.014
Token went
Token ID1816
Feature activation+2.57e+00

Pos logit contributions
 is+2.070
ests+2.058
iles+2.057
orns+2.028
ible+1.989
Neg logit contributions
shadow-2.785
 determined-2.595
Just-2.589
 proud-2.568
 confident-2.544
Token to
Token ID284
Feature activation-5.57e-01

Pos logit contributions
shadow+5.493
ured+5.272
 pleased+5.164
ivated+5.100
Unfortunately+5.019
Neg logit contributions
 is-4.549
orns-3.916
ie-3.836
 lives-3.807
 belongs-3.704
Token school
Token ID1524
Feature activation+9.04e-01

Pos logit contributions
 is+0.990
ie+0.934
orns+0.923
oline+0.915
ible+0.912
Neg logit contributions
shadow-1.036
 confident-0.922
 determined-0.922
 louder-0.921
 proud-0.917
Token and
Token ID290
Feature activation+1.93e-01

Pos logit contributions
shadow+1.699
ured+1.539
Unfortunately+1.496
 louder+1.494
 determined+1.492
Neg logit contributions
 is-1.662
 lives-1.496
ie-1.492
oline-1.456
 belongs-1.455
Token Max
Token ID5436
Feature activation-6.59e-01

Pos logit contributions
shadow+0.406
ured+0.371
 determined+0.367
 louder+0.366
 confident+0.366
Neg logit contributions
 is-0.301
ie-0.271
 lives-0.269
orns-0.265
 belongs-0.264
Token had
Token ID550
Feature activation+4.27e-02

Pos logit contributions
 is+0.978
orns+0.930
iles+0.909
ie+0.909
oline+0.897
Neg logit contributions
shadow-1.398
 determined-1.303
 confident-1.280
ured-1.280
 proud-1.274
Token to
Token ID284
Feature activation+9.30e-01

Pos logit contributions
shadow+0.078
ured+0.070
 determined+0.070
 louder+0.070
 confident+0.070
Neg logit contributions
 is-0.078
ie-0.073
 lives-0.072
orns-0.071
 belongs-0.071
Token!
Token ID0
Feature activation+4.00e-01

Pos logit contributions
orns+2.183
 is+2.105
ie+2.040
ible+2.030
ests+2.020
Neg logit contributions
shadow-2.933
 louder-2.779
 proud-2.724
ured-2.724
 confident-2.717
Token I
Token ID314
Feature activation+1.75e-01

Pos logit contributions
shadow+0.829
ured+0.759
 determined+0.749
 confident+0.747
 louder+0.747
Neg logit contributions
 is-0.640
 lives-0.573
ie-0.570
 belongs-0.560
orns-0.557
Token will
Token ID481
Feature activation+1.92e+00

Pos logit contributions
shadow+0.373
ured+0.343
 determined+0.339
 confident+0.338
 louder+0.338
Neg logit contributions
 is-0.267
ie-0.241
 lives-0.239
orns-0.236
 belongs-0.234
Token sell
Token ID3677
Feature activation-1.14e+00

Pos logit contributions
shadow+3.668
 louder+3.193
ivated+3.180
Half+3.169
Unfortunately+3.163
Neg logit contributions
 is-3.779
 lives-3.392
ie-3.349
 dreams-3.254
 belongs-3.157
Token them
Token ID606
Feature activation-6.13e-01

Pos logit contributions
 is+1.857
oline+1.734
ie+1.726
 lives+1.723
 belongs+1.719
Neg logit contributions
shadow-2.258
 louder-2.080
ured-2.053
 proud-2.018
 determined-1.988
Token at
Token ID379
Feature activation-1.63e+00

Pos logit contributions
 is+1.133
 lives+1.051
ie+1.051
orns+1.042
 belongs+1.033
Neg logit contributions
shadow-1.095
ured-0.989
 louder-0.966
 proud-0.959
 determined-0.956
Token the
Token ID262
Feature activation-5.45e-01

Pos logit contributions
 is+2.960
ests+2.845
oline+2.821
 belongs+2.802
ie+2.797
Neg logit contributions
shadow-2.910
 proud-2.620
 louder-2.611
ured-2.587
 determined-2.575
Token market
Token ID1910
Feature activation-1.13e+00

Pos logit contributions
 is+0.878
orns+0.801
ie+0.800
 belongs+0.793
 lives+0.788
Neg logit contributions
shadow-1.083
 louder-1.018
 determined-1.006
ured-1.004
 confident-0.999
Token and
Token ID290
Feature activation+7.67e-01

Pos logit contributions
 is+1.742
orns+1.665
 lives+1.630
ie+1.626
 happened+1.623
Neg logit contributions
shadow-2.323
ured-2.115
 louder-2.111
 determined-2.096
 proud-2.095
Token make
Token ID787
Feature activation-6.09e-01

Pos logit contributions
shadow+1.418
ured+1.277
ivated+1.258
 louder+1.254
 determined+1.253
Neg logit contributions
 is-1.414
 lives-1.289
ie-1.269
 belongs-1.250
orns-1.228
Token some
Token ID617
Feature activation-7.78e-02

Pos logit contributions
 is+1.091
orns+1.009
ie+1.007
oline+1.004
 belongs+1.000
Neg logit contributions
shadow-1.123
ured-1.023
 louder-1.020
 proud-1.007
 determined-1.000
Token things
Token ID1243
Feature activation-3.29e-01

Pos logit contributions
 is+1.403
iles+1.355
orns+1.332
ie+1.319
ible+1.299
Neg logit contributions
shadow-2.011
 confident-1.906
 determined-1.887
 louder-1.885
ured-1.883
Token you
Token ID345
Feature activation-1.31e+00

Pos logit contributions
 is+0.551
ie+0.510
orns+0.508
 lives+0.507
oline+0.502
Neg logit contributions
shadow-0.635
ured-0.581
 determined-0.575
 confident-0.574
 louder-0.572
Token have
Token ID423
Feature activation-3.66e-01

Pos logit contributions
ie+2.150
 is+2.150
ests+2.137
orns+2.125
oline+2.119
Neg logit contributions
shadow-2.449
 determined-2.345
 confident-2.327
 proud-2.321
ured-2.282
Token!
Token ID0
Feature activation-3.77e-01

Pos logit contributions
 is+0.628
ie+0.589
oline+0.580
orns+0.578
 lives+0.577
Neg logit contributions
shadow-0.700
ured-0.639
 determined-0.631
 confident-0.629
 proud-0.623
Token<|endoftext|>
Token ID50256
Feature activation-2.61e+00

Pos logit contributions
 is+0.681
ie+0.646
 lives+0.632
 belongs+0.628
oline+0.627
Neg logit contributions
shadow-0.696
Unfortunately-0.620
 determined-0.610
ured-0.610
Just-0.610
TokenSam
Token ID16305
Feature activation-2.72e+00

Pos logit contributions
ie+4.800
 belongs+4.788
 lives+4.674
ishes+4.666
ible+4.603
Neg logit contributions
shadow-4.436
Just-4.266
 crashing-4.251
Unfortunately-4.165
 proud-4.161
Token and
Token ID290
Feature activation-6.35e-01

Pos logit contributions
ests+3.808
iles+3.778
orns+3.658
 foods+3.551
 belongs+3.534
Neg logit contributions
shadow-5.536
 proud-5.505
 determined-5.474
 confident-5.393
eyed-5.343
Token Joe
Token ID5689
Feature activation-2.14e+00

Pos logit contributions
 is+1.176
orns+1.106
ie+1.095
oline+1.090
 belongs+1.084
Neg logit contributions
shadow-1.103
 determined-1.011
 confident-1.005
 proud-1.005
ured-1.004
Token were
Token ID547
Feature activation+8.84e-02

Pos logit contributions
ests+2.830
orns+2.824
 is+2.726
oline+2.717
iles+2.710
Neg logit contributions
 proud-4.468
shadow-4.397
 determined-4.388
eyed-4.374
 crashing-4.324
Token best
Token ID1266
Feature activation-1.20e-01

Pos logit contributions
shadow+0.175
ured+0.159
 determined+0.159
 louder+0.158
 confident+0.157
Neg logit contributions
 is-0.148
ie-0.135
 lives-0.134
orns-0.132
 belongs-0.132
Token friends
Token ID2460
Feature activation-1.59e-01

Pos logit contributions
 is+0.146
ie+0.127
 belongs+0.125
 lives+0.125
orns+0.123
Neg logit contributions
shadow-0.294
 determined-0.275
ured-0.274
 confident-0.273
 louder-0.272
Token said
Token ID531
Feature activation+1.58e+00

Pos logit contributions
 is+0.251
orns+0.229
 lives+0.228
ie+0.227
 belongs+0.225
Neg logit contributions
shadow-0.313
 determined-0.285
ured-0.284
 confident-0.283
 louder-0.282
Token,
Token ID11
Feature activation+3.57e+00

Pos logit contributions
shadow+3.442
ured+3.127
ivated+3.113
Unfortunately+3.080
pering+3.073
Neg logit contributions
 is-2.575
ie-2.208
 happened-2.184
 lives-2.178
 dreams-2.124
Token "
Token ID366
Feature activation+1.36e-01

Pos logit contributions
shadow+7.231
ivated+6.890
ured+6.827
har+6.793
eyed+6.659
Neg logit contributions
 is-7.090
 happened-5.791
 doesn-5.776
 isn-5.718
 lived-5.589
TokenTed
Token ID38972
Feature activation-1.61e+00

Pos logit contributions
shadow+0.224
ured+0.200
 determined+0.197
 louder+0.196
 confident+0.196
Neg logit contributions
 is-0.277
ie-0.255
 lives-0.255
 belongs-0.251
orns-0.251
Token,
Token ID11
Feature activation-1.34e-01

Pos logit contributions
orns+2.497
ests+2.474
 is+2.382
oline+2.376
iles+2.300
Neg logit contributions
shadow-3.301
 determined-3.074
 proud-3.054
 pleased-3.023
 louder-3.017
Token let
Token ID1309
Feature activation-2.11e+00

Pos logit contributions
 is+0.200
orns+0.183
ie+0.183
 lives+0.181
oline+0.179
Neg logit contributions
shadow-0.285
ured-0.261
 determined-0.260
 louder-0.260
 confident-0.258
Token's
Token ID338
Feature activation+1.15e-01

Pos logit contributions
ests+2.851
orns+2.798
iles+2.681
 is+2.673
 acts+2.652
Neg logit contributions
shadow-4.869
 louder-4.640
 proud-4.461
 determined-4.449
ured-4.430
Token part
Token ID636
Feature activation-1.02e+00

Pos logit contributions
shadow+0.200
ured+0.180
 determined+0.178
 louder+0.177
 confident+0.177
Neg logit contributions
 is-0.221
ie-0.203
 lives-0.202
orns-0.200
 belongs-0.199
Token our
Token ID674
Feature activation-1.61e+00

Pos logit contributions
 is+1.735
ible+1.653
 lives+1.647
orns+1.643
 belongs+1.612
Neg logit contributions
shadow-1.921
 louder-1.763
 proud-1.733
ured-1.723
 confident-1.708
Token hair
Token ID4190
Feature activation-4.10e-01

Pos logit contributions
 is+2.343
ie+2.184
orns+2.176
oline+2.173
ons+2.157
Neg logit contributions
shadow-3.336
 determined-3.256
 proud-3.214
 confident-3.204
 louder-3.202
Token in
Token ID287
Feature activation-3.63e-01

Pos logit contributions
 is+0.691
 lives+0.645
orns+0.637
ie+0.634
oline+0.628
Neg logit contributions
shadow-0.791
ured-0.713
 louder-0.708
 proud-0.705
 confident-0.703
TokenOne
Token ID3198
Feature activation-2.12e+00

Pos logit contributions
ie+5.731
ishes+5.657
ible+5.633
 belongs+5.562
 acts+5.504
Neg logit contributions
shadow-3.056
 proud-2.963
Just-2.945
Unfortunately-2.910
eyed-2.837
Token morning
Token ID3329
Feature activation-9.19e-01

Pos logit contributions
orns+3.415
ible+3.362
ishes+3.269
ests+3.268
 is+3.100
Neg logit contributions
 proud-4.301
 determined-4.163
 confident-4.143
 pleased-4.140
shadow-4.107
Token,
Token ID11
Feature activation-1.03e+00

Pos logit contributions
 is+1.356
orns+1.348
ible+1.316
ests+1.306
oline+1.281
Neg logit contributions
shadow-1.910
 proud-1.767
 determined-1.743
 louder-1.742
ured-1.741
Token a
Token ID257
Feature activation+1.34e-01

Pos logit contributions
orns+1.948
 is+1.922
ible+1.908
oline+1.885
ests+1.872
Neg logit contributions
shadow-1.724
 proud-1.590
 determined-1.569
 louder-1.563
Just-1.539
Token little
Token ID1310
Feature activation-2.55e-01

Pos logit contributions
shadow+0.265
ured+0.241
 determined+0.239
 louder+0.239
 confident+0.239
Neg logit contributions
 is-0.223
ie-0.203
 lives-0.202
orns-0.200
 belongs-0.199
Token girl
Token ID2576
Feature activation-2.12e+00

Pos logit contributions
 is+0.295
 belongs+0.253
 lives+0.252
ie+0.251
orns+0.251
Neg logit contributions
shadow-0.626
 louder-0.585
 confident-0.582
 proud-0.581
 determined-0.579
Token named
Token ID3706
Feature activation-1.29e+00

Pos logit contributions
iles+4.355
ests+4.310
ible+4.274
orns+4.204
ishes+4.135
Neg logit contributions
 proud-2.915
shadow-2.913
eyed-2.864
 determined-2.772
Just-2.752
Token Sue
Token ID26089
Feature activation-3.27e+00

Pos logit contributions
 is+1.911
orns+1.899
ible+1.895
iles+1.865
ests+1.865
Neg logit contributions
shadow-2.717
ured-2.500
 proud-2.483
ivated-2.471
 louder-2.449
Token woke
Token ID19092
Feature activation+3.41e-01

Pos logit contributions
ests+6.932
iles+6.917
 dinosaurs+6.415
ible+6.317
orns+6.214
Neg logit contributions
eyed-5.118
 proud-5.116
 relieved-4.780
 determined-4.652
 grateful-4.629
Token up
Token ID510
Feature activation+1.94e+00

Pos logit contributions
shadow+0.581
ured+0.520
 determined+0.512
 confident+0.510
 louder+0.509
Neg logit contributions
 is-0.672
ie-0.620
 lives-0.616
orns-0.606
 belongs-0.605
Token early
Token ID1903
Feature activation+8.29e-01

Pos logit contributions
shadow+4.055
ured+3.786
Unfortunately+3.671
 determined+3.546
Just+3.522
Neg logit contributions
 is-3.435
ie-3.102
 lives-3.031
 happened-2.923
 dreams-2.899
Token yard
Token ID12699
Feature activation-1.92e+00

Pos logit contributions
 is+0.461
ie+0.427
 lives+0.418
 belongs+0.413
orns+0.413
Neg logit contributions
shadow-0.607
 louder-0.567
 determined-0.565
 confident-0.560
ured-0.560
Token is
Token ID318
Feature activation+1.70e+00

Pos logit contributions
orns+0.734
ible+0.659
ie+0.637
pa+0.571
iles+0.543
Neg logit contributions
shadow-5.849
 louder-5.817
 determined-5.777
 proud-5.741
 crashing-5.735
Token very
Token ID845
Feature activation+1.45e+00

Pos logit contributions
shadow+3.162
ured+2.969
ivated+2.858
uring+2.756
eyed+2.753
Neg logit contributions
 is-3.383
 lives-2.943
ie-2.904
 belongs-2.903
 happened-2.858
Token messy
Token ID24097
Feature activation-1.33e+00

Pos logit contributions
shadow+2.604
ured+2.402
ivated+2.286
 confident+2.230
uring+2.216
Neg logit contributions
 is-2.880
 lives-2.600
ie-2.574
 belongs-2.566
oline-2.471
Token.
Token ID13
Feature activation-9.54e-01

Pos logit contributions
 is+2.018
 lives+1.970
orns+1.941
iles+1.925
ie+1.898
Neg logit contributions
shadow-2.601
 louder-2.516
ured-2.500
uring-2.455
 confident-2.437
Token There
Token ID1318
Feature activation-3.13e+00

Pos logit contributions
 is+1.678
ie+1.623
 lives+1.573
oline+1.565
ible+1.558
Neg logit contributions
shadow-1.783
Unfortunately-1.601
 crashing-1.597
 louder-1.589
Just-1.582
Token are
Token ID389
Feature activation-3.08e-01

Pos logit contributions
oline+3.438
orns+3.435
ible+3.167
ests+2.933
ishes+2.829
Neg logit contributions
 proud-7.855
 determined-7.837
 crashing-7.790
 louder-7.590
shadow-7.560
Token toys
Token ID14958
Feature activation-1.21e+00

Pos logit contributions
 is+0.479
ie+0.448
 lives+0.439
oline+0.435
orns+0.434
Neg logit contributions
shadow-0.625
 louder-0.579
 determined-0.579
ured-0.574
Unfortunately-0.572
Token,
Token ID11
Feature activation-1.26e-02

Pos logit contributions
 is+1.988
ie+1.987
ible+1.956
 lives+1.930
iles+1.926
Neg logit contributions
shadow-2.176
ured-2.017
 crashing-2.015
 louder-2.007
 determined-2.006
Token leaves
Token ID5667
Feature activation-2.87e-01

Pos logit contributions
 is+0.018
ie+0.016
 lives+0.016
orns+0.016
 belongs+0.016
Neg logit contributions
shadow-0.028
ured-0.026
 louder-0.025
 determined-0.025
 confident-0.025
Token,
Token ID11
Feature activation-4.46e-01

Pos logit contributions
 is+0.401
 lives+0.369
ie+0.367
orns+0.360
oline+0.359
Neg logit contributions
shadow-0.625
ured-0.581
 louder-0.575
 determined-0.570
 crashing-0.567
Token ringing
Token ID32333
Feature activation+8.95e-01

Pos logit contributions
shadow+3.499
ured+3.250
ivated+3.162
Unfortunately+3.112
eyed+3.103
Neg logit contributions
 is-2.533
 lives-2.228
ie-2.176
 happened-2.121
 belongs-2.094
Token sound
Token ID2128
Feature activation+1.26e+00

Pos logit contributions
shadow+1.249
ured+1.087
 determined+1.056
Unfortunately+1.052
ivated+1.050
Neg logit contributions
 is-2.109
ie-1.955
 lives-1.933
 belongs-1.895
oline-1.892
Token.
Token ID13
Feature activation+2.49e-01

Pos logit contributions
shadow+2.597
ured+2.331
Unfortunately+2.311
 determined+2.300
 confident+2.271
Neg logit contributions
 is-2.176
ie-1.955
 lives-1.888
 happened-1.887
oline-1.868
Token She
Token ID1375
Feature activation+4.91e-01

Pos logit contributions
shadow+0.465
ured+0.421
 determined+0.415
 louder+0.414
 confident+0.414
Neg logit contributions
 is-0.447
ie-0.408
 lives-0.407
orns-0.400
 belongs-0.399
Token wondered
Token ID14028
Feature activation-7.16e-01

Pos logit contributions
shadow+1.134
ured+1.046
 confident+1.030
ivated+1.028
 determined+1.028
Neg logit contributions
 is-0.678
ie-0.600
 lives-0.595
 belongs-0.575
oline-0.570
Token what
Token ID644
Feature activation-2.50e+00

Pos logit contributions
 is+1.276
orns+1.215
 lives+1.189
ests+1.188
oline+1.187
Neg logit contributions
shadow-1.335
 louder-1.230
 determined-1.201
ured-1.194
 confident-1.186
Token it
Token ID340
Feature activation-1.23e+00

Pos logit contributions
orns+4.786
ests+4.750
ishes+4.730
iles+4.625
icer+4.553
Neg logit contributions
 proud-4.271
 determined-4.223
 louder-4.219
 confident-4.178
shadow-4.147
Token could
Token ID714
Feature activation+5.58e-01

Pos logit contributions
ests+1.167
orns+1.161
 is+1.125
oline+1.121
text+1.076
Neg logit contributions
shadow-3.178
 louder-3.005
 determined-2.998
 proud-2.983
 confident-2.957
Token be
Token ID307
Feature activation+3.40e-01

Pos logit contributions
shadow+1.034
ured+0.931
 determined+0.916
 confident+0.914
 louder+0.912
Neg logit contributions
 is-1.033
ie-0.939
 lives-0.934
 belongs-0.917
orns-0.902
Token.
Token ID13
Feature activation-2.34e-02

Pos logit contributions
shadow+0.678
ured+0.618
 determined+0.609
 louder+0.607
 confident+0.607
Neg logit contributions
 is-0.571
ie-0.517
 lives-0.515
 belongs-0.505
orns-0.503
Token\n
Token ID198
Feature activation-8.07e-01

Pos logit contributions
 is+0.044
ie+0.040
 lives+0.040
orns+0.040
 belongs+0.040
Neg logit contributions
shadow-0.042
ured-0.037
 determined-0.037
 louder-0.037
 confident-0.037