TOP ACTIVATIONS
MAX = 0.558

a small puppy named Fl uffy . He lived in a
" His name is Fl uffy . He is sick .
a furry bunny named Fl uffy . Fl uffy loved to
One day , while Fl uffy was running around with her
Lily is happy . Fl uffy is well . The veterinarian
One day , Fl uffy went outside to play .
a k itty named Fl uffy . Fl uffy liked to
named Fl uffy . Fl uffy liked to play with balls
feed him carrots . Fl uffy was soft and warm and
the ducks away . Fl uffy got so scared that she
. Can I pet Fl uffy ?" He is curious .
loud and scary . Fl uffy and his friends were afraid
Lily is worried about Fl uffy . She wants him to
, you can pet Fl uffy . But be careful .
it a name , Fl uffy . Fl uffy was happy
a good sister to Fl uffy . Come , Fl uffy
, Fl uffy . Fl uffy was happy and loved to
that day on , Fl uffy and the red ball were
kitten . Lily and Fl uffy became best friends and did
called Fl uffy . Fl uffy was a very enthusiastic dog

MIN ACTIVATIONS
MIN = -0.587

a tiny little bird came swoop ing in and lifted up
Suddenly , a big bird swoop ed down and tried to
flap and they began to swoop around it . They were
One day a friendly owl swoop ed down and said hi
then an even bigger pigeon swoop ed down and landed beside
. Suddenly , an owl swoop ed down and snatched up
in the light . It swoop ed down and started flying
was playing outside . It swoop ed down close to the
but a se ag ull swoop ed down and snatched it
Just then , an owl swoop ed down from the sky
He was loop ing and swoop ing through the air faster
eat them . The bird swoop ed down and grabbed Fin
park when a big bird swoop ed down from the sky
. The eagle swoop ed down from the sky
was faster . The hawk swoop ed down and grabbed the
when the se ag ull swoop ed down and snatched Tim
a hawk . The hawk swoop ed down and grabbed S
out a fishing rod and swoop ed it around Su zy
Suddenly , a big bird swoop ed down and took the
was very graceful as it swoop ed down and grabbed the

INTERVAL 0.272 - 0.558
CONTAINS 2.874%

" Pl uto is not ugly , Bella .
wrong . " That dog is scary ," Tim my said
Soon , the friends stumbled into a clearing and
dishes . When she was done , her mom gave her
special connection to it . From that day on , she

INTERVAL -0.015 - 0.272
CONTAINS 52.742%

me get some food ?" Jim said , " Yes !
So one day , he decided to try it
" Good boy , Spot !" Lily says . "
bowl . She mixed it all up and put the cake
a real car . He made a loud engine noise and

INTERVAL -0.301 - -0.015
CONTAINS 42.203%

I had a wonderful dream . I dreamed of a magic
love . How can I repay you ?" the man said
girl , and one morning she saw something very blue in
, " How can you help ?" M olly
in the park . He saw all kinds of animals ,

INTERVAL -0.587 - -0.301
CONTAINS 2.181%

John it was foolish to touch the screen . John was
the hallway . Then she run into the kitchen . In
can he go for a walk . She said yes but
looking , until finally , he spotted her !
skin . But soon , he felt a sting in his
Token a
Token ID257
Feature activation+1.51e-02

Pos logit contributions
uffy+0.444
ours+0.435
 fault+0.421
 cousins+0.418
omm+0.417
Neg logit contributions
 branch-0.292
He-0.284
 swoop-0.270
 steer-0.260
 reach-0.258
Token small
Token ID1402
Feature activation-1.19e-01

Pos logit contributions
 twin+0.030
uffy+0.029
 best+0.028
 cousins+0.028
 twins+0.028
Neg logit contributions
 steer-0.030
 swoop-0.030
 paddle-0.029
He-0.029
 branch-0.028
Token puppy
Token ID26188
Feature activation-2.47e-02

Pos logit contributions
 swoop+0.217
 steer+0.217
 paddle+0.213
He+0.213
 branch+0.211
Neg logit contributions
uffy-0.245
 twin-0.243
 cousins-0.240
ours-0.238
 fault-0.235
Token named
Token ID3706
Feature activation+1.55e-01

Pos logit contributions
 steer+0.059
 swoop+0.059
 paddle+0.058
He+0.058
 branch+0.057
Neg logit contributions
uffy-0.037
 twin-0.037
 cousins-0.036
ours-0.036
 best-0.035
Token Fl
Token ID1610
Feature activation+1.41e-01

Pos logit contributions
ours+0.254
uffy+0.249
 fault+0.248
 twin+0.247
 cousins+0.244
Neg logit contributions
 steer-0.348
 swoop-0.347
 branch-0.347
He-0.346
 paddle-0.343
Tokenuffy
Token ID15352
Feature activation+5.57e-01

Pos logit contributions
 cousins+0.086
 sisters+0.081
ours+0.078
 fault+0.075
 best+0.072
Neg logit contributions
He-0.438
 branch-0.438
 steer-0.432
 paddle-0.431
 lift-0.424
Token.
Token ID13
Feature activation-5.29e-02

Pos logit contributions
uffy+1.196
 twin+1.185
 cousins+1.170
ours+1.161
 fault+1.147
Neg logit contributions
 swoop-0.968
 steer-0.967
 paddle-0.954
He-0.946
 branch-0.943
Token He
Token ID679
Feature activation-2.62e-01

Pos logit contributions
 swoop+0.120
 steer+0.120
 paddle+0.119
He+0.119
 branch+0.118
Neg logit contributions
uffy-0.085
 twin-0.084
 cousins-0.083
ours-0.082
 fault-0.081
Token lived
Token ID5615
Feature activation+3.49e-01

Pos logit contributions
 swoop+0.656
He+0.650
 steer+0.640
 branch+0.639
 paddle+0.633
Neg logit contributions
ours-0.379
 fault-0.372
uffy-0.370
 twin-0.369
 cousins-0.358
Token in
Token ID287
Feature activation+8.02e-02

Pos logit contributions
 fault+0.670
ours+0.667
uffy+0.667
 cousins+0.654
 twin+0.652
Neg logit contributions
 branch-0.702
He-0.699
 swoop-0.698
 steer-0.683
 paddle-0.666
Token a
Token ID257
Feature activation+1.51e-02

Pos logit contributions
 twin+0.128
 best+0.126
 cousins+0.124
uffy+0.121
 twins+0.121
Neg logit contributions
 swoop-0.186
 paddle-0.183
 steer-0.182
 branch-0.172
He-0.172
Token "
Token ID366
Feature activation+8.51e-02

Pos logit contributions
 twin+0.067
uffy+0.064
 cousins+0.064
ours+0.063
 best+0.063
Neg logit contributions
 steer-0.134
 paddle-0.132
 swoop-0.132
 branch-0.127
He-0.127
TokenHis
Token ID6653
Feature activation-3.89e-01

Pos logit contributions
uffy+0.144
 twins+0.139
 twin+0.137
 cousins+0.136
omm+0.135
Neg logit contributions
 paddle-0.181
 steer-0.175
 reach-0.170
 branch-0.169
 lift-0.167
Token name
Token ID1438
Feature activation+1.03e-01

Pos logit contributions
 branch+0.734
 swoop+0.714
 steer+0.703
 paddle+0.700
 lift+0.686
Neg logit contributions
ours-0.774
ains-0.735
apped-0.728
 twin-0.719
 fault-0.718
Token is
Token ID318
Feature activation+3.35e-01

Pos logit contributions
 fault+0.100
uffy+0.099
ours+0.099
 cousins+0.099
 twin+0.098
Neg logit contributions
 swoop-0.300
He-0.296
 branch-0.296
 steer-0.295
 paddle-0.289
Token Fl
Token ID1610
Feature activation+1.40e-01

Pos logit contributions
ours+0.612
 fault+0.600
icking+0.592
uffy+0.587
apped+0.586
Neg logit contributions
 swoop-0.680
 branch-0.676
He-0.670
 steer-0.653
 paddle-0.648
Tokenuffy
Token ID15352
Feature activation+5.57e-01

Pos logit contributions
 cousins+0.068
 fault+0.066
ours+0.058
 best+0.057
 birth+0.054
Neg logit contributions
 branch-0.441
 steer-0.438
 swoop-0.432
 paddle-0.432
 reach-0.427
Token.
Token ID13
Feature activation-5.50e-02

Pos logit contributions
ours+1.335
 fault+1.311
 cousins+1.293
uffy+1.289
apped+1.287
Neg logit contributions
 branch-0.828
He-0.808
 swoop-0.803
 steer-0.797
 paddle-0.779
Token He
Token ID679
Feature activation-2.72e-01

Pos logit contributions
 swoop+0.092
 steer+0.092
 paddle+0.090
 branch+0.090
He+0.089
Neg logit contributions
uffy-0.122
 twin-0.120
 cousins-0.119
ours-0.118
 fault-0.116
Token is
Token ID318
Feature activation+3.35e-01

Pos logit contributions
 swoop+0.740
 steer+0.725
He+0.722
 paddle+0.719
 branch+0.715
Neg logit contributions
 twin-0.311
ours-0.310
uffy-0.309
 fault-0.302
 cousins-0.302
Token sick
Token ID6639
Feature activation-6.40e-02

Pos logit contributions
uffy+0.671
ours+0.658
 fault+0.636
 twin+0.632
icking+0.629
Neg logit contributions
 swoop-0.646
 branch-0.635
He-0.632
 steer-0.624
 paddle-0.619
Token.
Token ID13
Feature activation-5.47e-02

Pos logit contributions
 steer+0.109
 swoop+0.108
He+0.108
 branch+0.107
 paddle+0.107
Neg logit contributions
uffy-0.140
 twin-0.137
ours-0.137
 cousins-0.136
 fault-0.134
Token a
Token ID257
Feature activation+1.81e-02

Pos logit contributions
uffy+0.419
ours+0.411
 fault+0.399
 cousins+0.397
 twin+0.393
Neg logit contributions
 branch-0.312
He-0.311
 swoop-0.300
 steer-0.294
 paddle-0.290
Token furry
Token ID40857
Feature activation-1.69e-01

Pos logit contributions
 twin+0.036
uffy+0.034
 best+0.034
 cousins+0.033
 twins+0.033
Neg logit contributions
 steer-0.037
 swoop-0.036
 paddle-0.035
He-0.035
 branch-0.034
Token bunny
Token ID44915
Feature activation-1.45e-01

Pos logit contributions
 swoop+0.284
 branch+0.282
 steer+0.281
 paddle+0.280
He+0.277
Neg logit contributions
uffy-0.369
ours-0.365
 twin-0.363
 cousins-0.362
 fault-0.361
Token named
Token ID3706
Feature activation+1.60e-01

Pos logit contributions
 steer+0.372
 paddle+0.368
 swoop+0.366
He+0.361
 branch+0.355
Neg logit contributions
uffy-0.195
 twin-0.194
 cousins-0.186
 best-0.180
ours-0.179
Token Fl
Token ID1610
Feature activation+1.41e-01

Pos logit contributions
uffy+0.277
 twin+0.274
 cousins+0.271
ours+0.271
 fault+0.266
Neg logit contributions
 swoop-0.346
 steer-0.345
 branch-0.341
 paddle-0.341
He-0.340
Tokenuffy
Token ID15352
Feature activation+5.56e-01

Pos logit contributions
 cousins+0.105
ours+0.097
 fault+0.096
 twin+0.091
 sisters+0.089
Neg logit contributions
He-0.436
 steer-0.432
 branch-0.432
 swoop-0.429
 paddle-0.429
Token.
Token ID13
Feature activation-5.22e-02

Pos logit contributions
uffy+1.118
 twin+1.111
 cousins+1.091
 best+1.058
ours+1.054
Neg logit contributions
 steer-1.037
 swoop-1.032
 paddle-1.031
He-1.010
 branch-0.999
Token Fl
Token ID1610
Feature activation+1.41e-01

Pos logit contributions
 swoop+0.124
 steer+0.124
 paddle+0.122
 branch+0.121
He+0.121
Neg logit contributions
uffy-0.079
 twin-0.078
 cousins-0.075
ours-0.075
 best-0.074
Tokenuffy
Token ID15352
Feature activation+5.56e-01

Pos logit contributions
 cousins+0.082
 fault+0.069
ours+0.067
 sisters+0.064
omm+0.060
Neg logit contributions
He-0.461
 branch-0.455
 swoop-0.444
 steer-0.444
 lift-0.444
Token loved
Token ID6151
Feature activation+2.61e-01

Pos logit contributions
ours+0.890
 fault+0.883
omm+0.847
 twin+0.845
 cousins+0.843
Neg logit contributions
 swoop-1.350
He-1.331
 branch-1.322
 steer-1.270
 paddle-1.267
Token to
Token ID284
Feature activation+2.85e-02

Pos logit contributions
uffy+0.492
 twin+0.484
 cousins+0.477
ours+0.477
 fault+0.470
Neg logit contributions
 swoop-0.527
 steer-0.524
 paddle-0.518
 branch-0.517
He-0.515
TokenOne
Token ID3198
Feature activation-1.42e-01

Pos logit contributions
uffy+0.057
 twin+0.056
 cousins+0.056
ours+0.056
 fault+0.055
Neg logit contributions
 swoop-0.054
 steer-0.054
He-0.054
 branch-0.054
 paddle-0.053
Token day
Token ID1110
Feature activation+1.83e-01

Pos logit contributions
 branch+0.325
He+0.320
 steer+0.319
 swoop+0.317
 paddle+0.312
Neg logit contributions
ours-0.232
 fault-0.231
 cousins-0.227
uffy-0.227
omm-0.222
Token,
Token ID11
Feature activation+5.54e-02

Pos logit contributions
 fault+0.393
 cousins+0.380
ours+0.371
 twin+0.369
anny+0.366
Neg logit contributions
 swoop-0.330
 branch-0.325
He-0.323
 steer-0.322
 paddle-0.321
Token while
Token ID981
Feature activation-2.78e-01

Pos logit contributions
uffy+0.093
 fault+0.093
 twin+0.093
 cousins+0.092
ours+0.092
Neg logit contributions
 swoop-0.121
 steer-0.120
 branch-0.120
He-0.120
 paddle-0.119
Token Fl
Token ID1610
Feature activation+1.39e-01

Pos logit contributions
 branch+0.523
 swoop+0.520
He+0.509
 paddle+0.509
 steer+0.507
Neg logit contributions
 fault-0.586
ours-0.552
icking-0.542
uffy-0.542
anny-0.539
Tokenuffy
Token ID15352
Feature activation+5.56e-01

Pos logit contributions
 cousins+0.093
 fault+0.086
 birth+0.082
 presents+0.080
 sisters+0.078
Neg logit contributions
He-0.425
 branch-0.424
 steer-0.416
 reach-0.415
 paddle-0.414
Token was
Token ID373
Feature activation+1.82e-01

Pos logit contributions
ours+1.029
 fault+1.024
 cousins+1.006
 twin+0.992
uffy+0.990
Neg logit contributions
 swoop-1.143
 branch-1.129
He-1.127
 steer-1.125
 paddle-1.120
Token running
Token ID2491
Feature activation-3.83e-02

Pos logit contributions
ours+0.232
uffy+0.231
 fault+0.228
omm+0.228
 cousins+0.223
Neg logit contributions
 branch-0.476
 swoop-0.470
 paddle-0.469
 steer-0.466
 reach-0.464
Token around
Token ID1088
Feature activation+1.29e-02

Pos logit contributions
 steer+0.088
 swoop+0.087
 paddle+0.087
He+0.084
 branch+0.083
Neg logit contributions
 best-0.061
uffy-0.059
 twin-0.058
 cousins-0.055
ained-0.055
Token with
Token ID351
Feature activation-5.16e-02

Pos logit contributions
uffy+0.024
 twin+0.024
 cousins+0.023
ours+0.023
 best+0.023
Neg logit contributions
 swoop-0.027
 steer-0.027
 paddle-0.026
He-0.026
 branch-0.026
Token her
Token ID607
Feature activation+9.83e-02

Pos logit contributions
 swoop+0.127
 steer+0.127
 paddle+0.125
He+0.123
 branch+0.120
Neg logit contributions
 twin-0.075
 best-0.074
 cousins-0.073
uffy-0.071
 fault-0.068
Token Lily
Token ID20037
Feature activation+2.13e-01

Pos logit contributions
He+0.075
 swoop+0.075
 steer+0.075
 paddle+0.073
 branch+0.073
Neg logit contributions
uffy-0.129
 twin-0.127
 cousins-0.127
ours-0.126
 fault-0.125
Token is
Token ID318
Feature activation+3.35e-01

Pos logit contributions
uffy+0.359
 cousins+0.346
anny+0.344
icking+0.344
omm+0.343
Neg logit contributions
 swoop-0.480
 steer-0.468
 branch-0.467
 paddle-0.462
 reach-0.449
Token happy
Token ID3772
Feature activation-6.22e-02

Pos logit contributions
uffy+0.756
ours+0.733
 twin+0.720
 cousins+0.716
 fault+0.712
Neg logit contributions
 swoop-0.567
 branch-0.561
 steer-0.557
He-0.556
 paddle-0.545
Token.
Token ID13
Feature activation-5.22e-02

Pos logit contributions
 swoop+0.116
 steer+0.115
 paddle+0.114
He+0.113
 branch+0.113
Neg logit contributions
uffy-0.127
 twin-0.126
 cousins-0.124
ours-0.123
 fault-0.122
Token Fl
Token ID1610
Feature activation+1.42e-01

Pos logit contributions
He+0.084
 branch+0.076
 swoop+0.076
 steer+0.075
 paddle+0.074
Neg logit contributions
 cousins-0.122
icking-0.122
 fault-0.120
 twin-0.120
ours-0.120
Tokenuffy
Token ID15352
Feature activation+5.56e-01

Pos logit contributions
 birth+0.104
 cousins+0.101
ied+0.087
 fault+0.083
 best+0.080
Neg logit contributions
 branch-0.418
He-0.415
 steer-0.405
 reach-0.401
 swoop-0.401
Token is
Token ID318
Feature activation+3.34e-01

Pos logit contributions
uffy+0.887
 cousins+0.863
 twin+0.854
 fault+0.842
omm+0.837
Neg logit contributions
 swoop-1.299
 steer-1.275
 branch-1.273
 paddle-1.261
He-1.238
Token well
Token ID880
Feature activation-4.37e-03

Pos logit contributions
uffy+0.718
ours+0.704
 twin+0.684
 cousins+0.682
 fault+0.682
Neg logit contributions
 swoop-0.601
He-0.596
 branch-0.593
 steer-0.589
 paddle-0.576
Token.
Token ID13
Feature activation-5.15e-02

Pos logit contributions
 swoop+0.008
 steer+0.008
 branch+0.008
He+0.007
 paddle+0.007
Neg logit contributions
uffy-0.009
 cousins-0.009
ours-0.009
 twin-0.009
 fault-0.009
Token The
Token ID383
Feature activation+2.36e-02

Pos logit contributions
He+0.082
 steer+0.078
 swoop+0.078
 branch+0.077
 paddle+0.076
Neg logit contributions
uffy-0.119
 cousins-0.119
ours-0.118
 twin-0.117
 fault-0.117
Token veterinarian
Token ID41404
Feature activation-7.27e-02

Pos logit contributions
 twin+0.044
uffy+0.044
 cousins+0.044
 best+0.043
ours+0.043
Neg logit contributions
 steer-0.047
 swoop-0.047
 paddle-0.047
He-0.046
 branch-0.046
Token\n
Token ID198
Feature activation+2.88e-02

Pos logit contributions
 cousins+0.059
uffy+0.058
 twin+0.058
ours+0.058
 fault+0.058
Neg logit contributions
He-0.053
 steer-0.053
 swoop-0.053
 branch-0.053
 paddle-0.052
TokenOne
Token ID3198
Feature activation-1.44e-01

Pos logit contributions
uffy+0.065
 twin+0.065
 best+0.063
ours+0.062
 cousins+0.061
Neg logit contributions
 paddle-0.047
 steer-0.046
 swoop-0.046
 branch-0.044
He-0.043
Token day
Token ID1110
Feature activation+1.81e-01

Pos logit contributions
 branch+0.337
 steer+0.332
 swoop+0.331
He+0.330
 paddle+0.326
Neg logit contributions
ours-0.225
 fault-0.225
uffy-0.223
 cousins-0.220
 twin-0.215
Token,
Token ID11
Feature activation+5.42e-02

Pos logit contributions
 fault+0.381
 cousins+0.369
ours+0.361
uffy+0.358
 twin+0.356
Neg logit contributions
 swoop-0.335
 steer-0.333
 paddle-0.331
 branch-0.330
He-0.330
Token Fl
Token ID1610
Feature activation+1.36e-01

Pos logit contributions
uffy+0.099
 twin+0.098
 cousins+0.096
ours+0.096
 fault+0.094
Neg logit contributions
 swoop-0.112
 steer-0.111
 paddle-0.110
He-0.109
 branch-0.109
Tokenuffy
Token ID15352
Feature activation+5.55e-01

Pos logit contributions
 birth+0.097
 presents+0.095
 cousins+0.094
 invitations+0.086
ours+0.085
Neg logit contributions
 branch-0.407
 lift-0.381
 swoop-0.379
 reach-0.377
 steer-0.375
Token went
Token ID1816
Feature activation+1.15e-02

Pos logit contributions
uffy+1.158
ours+1.151
 fault+1.148
 twin+1.141
 cousins+1.135
Neg logit contributions
 swoop-1.012
 steer-0.985
 branch-0.985
He-0.977
 paddle-0.975
Token outside
Token ID2354
Feature activation+6.87e-02

Pos logit contributions
uffy+0.017
 best+0.017
 twin+0.017
 cousins+0.017
ours+0.016
Neg logit contributions
 steer-0.027
 swoop-0.027
 paddle-0.027
He-0.027
 branch-0.026
Token to
Token ID284
Feature activation+2.80e-02

Pos logit contributions
 twin+0.122
uffy+0.121
 cousins+0.119
ours+0.118
 best+0.116
Neg logit contributions
 swoop-0.145
 steer-0.143
 paddle-0.142
He-0.142
 branch-0.139
Token play
Token ID711
Feature activation-1.00e-01

Pos logit contributions
 twin+0.054
uffy+0.052
 cousins+0.052
 best+0.051
ours+0.051
Neg logit contributions
He-0.054
 swoop-0.052
 steer-0.050
 branch-0.048
 paddle-0.048
Token.
Token ID13
Feature activation-4.81e-02

Pos logit contributions
 swoop+0.231
 steer+0.228
 paddle+0.225
He+0.213
 distance+0.206
Neg logit contributions
 best-0.163
 twin-0.161
uffy-0.153
 twins-0.153
apped-0.150
Token a
Token ID257
Feature activation+1.74e-02

Pos logit contributions
uffy+0.440
ours+0.431
 cousins+0.418
 fault+0.417
icking+0.412
Neg logit contributions
 branch-0.298
He-0.288
 swoop-0.279
 reach-0.267
 steer-0.265
Token k
Token ID479
Feature activation-8.61e-02

Pos logit contributions
 twin+0.035
uffy+0.034
 best+0.033
 cousins+0.033
 twins+0.032
Neg logit contributions
 steer-0.035
 swoop-0.034
 paddle-0.033
He-0.033
 branch-0.032
Tokenitty
Token ID9760
Feature activation+1.48e-01

Pos logit contributions
 steer+0.204
 paddle+0.202
 swoop+0.201
He+0.195
 branch+0.193
Neg logit contributions
uffy-0.141
 twin-0.135
 cousins-0.129
ours-0.127
apped-0.124
Token named
Token ID3706
Feature activation+1.59e-01

Pos logit contributions
uffy+0.195
 twin+0.194
 cousins+0.187
 best+0.177
ours+0.175
Neg logit contributions
 steer-0.383
 paddle-0.381
 swoop-0.376
He-0.375
 branch-0.363
Token Fl
Token ID1610
Feature activation+1.39e-01

Pos logit contributions
 fault+0.296
ours+0.292
 cousins+0.290
 twin+0.285
 best+0.280
Neg logit contributions
 branch-0.332
 swoop-0.326
 steer-0.319
He-0.315
 paddle-0.312
Tokenuffy
Token ID15352
Feature activation+5.55e-01

Pos logit contributions
 cousins+0.074
 fault+0.064
ours+0.063
 sisters+0.061
 best+0.059
Neg logit contributions
 branch-0.464
He-0.454
 swoop-0.453
 steer-0.449
 lift-0.443
Token.
Token ID13
Feature activation-5.23e-02

Pos logit contributions
uffy+1.109
 twin+1.105
 cousins+1.083
 best+1.048
ours+1.046
Neg logit contributions
 steer-1.042
 paddle-1.040
 swoop-1.035
He-1.014
 branch-0.996
Token Fl
Token ID1610
Feature activation+1.39e-01

Pos logit contributions
 swoop+0.123
 steer+0.123
 paddle+0.121
 branch+0.120
He+0.120
Neg logit contributions
uffy-0.081
 twin-0.079
 cousins-0.077
ours-0.077
 best-0.075
Tokenuffy
Token ID15352
Feature activation+5.55e-01

Pos logit contributions
 cousins+0.099
 birth+0.088
 fault+0.083
 sisters+0.080
 invitations+0.074
Neg logit contributions
 branch-0.442
He-0.430
 swoop-0.424
 lift-0.420
 reach-0.417
Token liked
Token ID8288
Feature activation+7.06e-02

Pos logit contributions
 fault+0.972
ours+0.958
icking+0.915
 twin+0.912
omm+0.908
Neg logit contributions
 swoop-1.275
He-1.258
 branch-1.254
 steer-1.193
 lift-1.182
Token to
Token ID284
Feature activation+2.79e-02

Pos logit contributions
uffy+0.131
 twin+0.129
 cousins+0.128
ours+0.127
 fault+0.125
Neg logit contributions
 swoop-0.143
 steer-0.143
 paddle-0.141
 branch-0.141
He-0.141
Token named
Token ID3706
Feature activation+1.59e-01

Pos logit contributions
uffy+0.195
 twin+0.194
 cousins+0.187
 best+0.177
ours+0.175
Neg logit contributions
 steer-0.383
 paddle-0.381
 swoop-0.376
He-0.375
 branch-0.363
Token Fl
Token ID1610
Feature activation+1.39e-01

Pos logit contributions
 fault+0.296
ours+0.292
 cousins+0.290
 twin+0.285
 best+0.280
Neg logit contributions
 branch-0.332
 swoop-0.326
 steer-0.319
He-0.315
 paddle-0.312
Tokenuffy
Token ID15352
Feature activation+5.55e-01

Pos logit contributions
 cousins+0.074
 fault+0.064
ours+0.063
 sisters+0.061
 best+0.059
Neg logit contributions
 branch-0.464
He-0.454
 swoop-0.453
 steer-0.449
 lift-0.443
Token.
Token ID13
Feature activation-5.23e-02

Pos logit contributions
uffy+1.109
 twin+1.105
 cousins+1.083
 best+1.048
ours+1.046
Neg logit contributions
 steer-1.042
 paddle-1.040
 swoop-1.035
He-1.014
 branch-0.996
Token Fl
Token ID1610
Feature activation+1.39e-01

Pos logit contributions
 swoop+0.123
 steer+0.123
 paddle+0.121
 branch+0.120
He+0.120
Neg logit contributions
uffy-0.081
 twin-0.079
 cousins-0.077
ours-0.077
 best-0.075
Tokenuffy
Token ID15352
Feature activation+5.55e-01

Pos logit contributions
 cousins+0.099
 birth+0.088
 fault+0.083
 sisters+0.080
 invitations+0.074
Neg logit contributions
 branch-0.442
He-0.430
 swoop-0.424
 lift-0.420
 reach-0.417
Token liked
Token ID8288
Feature activation+7.06e-02

Pos logit contributions
 fault+0.972
ours+0.958
icking+0.915
 twin+0.912
omm+0.908
Neg logit contributions
 swoop-1.275
He-1.258
 branch-1.254
 steer-1.193
 lift-1.182
Token to
Token ID284
Feature activation+2.79e-02

Pos logit contributions
uffy+0.131
 twin+0.129
 cousins+0.128
ours+0.127
 fault+0.125
Neg logit contributions
 swoop-0.143
 steer-0.143
 paddle-0.141
 branch-0.141
He-0.141
Token play
Token ID711
Feature activation-9.83e-02

Pos logit contributions
 sisters+0.052
 twin+0.052
 best+0.051
uffy+0.049
 cousins+0.049
Neg logit contributions
He-0.052
 distance-0.048
 steer-0.046
 swoop-0.046
 farther-0.045
Token with
Token ID351
Feature activation-5.00e-02

Pos logit contributions
 swoop+0.209
 steer+0.209
 paddle+0.207
He+0.201
 branch+0.200
Neg logit contributions
uffy-0.173
 twin-0.171
 best-0.169
 cousins-0.167
ours-0.163
Token balls
Token ID11333
Feature activation-2.06e-02

Pos logit contributions
 steer+0.092
 swoop+0.092
 paddle+0.091
He+0.090
 branch+0.088
Neg logit contributions
uffy-0.103
 twin-0.102
 cousins-0.101
ours-0.099
 best-0.099
Token feed
Token ID3745
Feature activation-2.31e-01

Pos logit contributions
He+0.067
 swoop+0.067
 steer+0.065
 branch+0.065
 paddle+0.062
Neg logit contributions
 cousins-0.101
 twin-0.100
 best-0.100
uffy-0.098
 sisters-0.096
Token him
Token ID683
Feature activation+5.09e-03

Pos logit contributions
 steer+0.444
He+0.441
 swoop+0.438
 paddle+0.436
 branch+0.435
Neg logit contributions
uffy-0.453
 cousins-0.449
 fault-0.447
 twin-0.446
ours-0.445
Token carrots
Token ID34397
Feature activation-7.87e-02

Pos logit contributions
uffy+0.010
 twin+0.010
 cousins+0.010
ours+0.010
 fault+0.010
Neg logit contributions
 swoop-0.009
 steer-0.009
 paddle-0.009
 branch-0.009
He-0.009
Token.
Token ID13
Feature activation-4.80e-02

Pos logit contributions
 swoop+0.164
 steer+0.160
 paddle+0.158
He+0.154
 branch+0.151
Neg logit contributions
 twin-0.142
 best-0.142
uffy-0.140
 cousins-0.134
 fault-0.133
Token Fl
Token ID1610
Feature activation+1.38e-01

Pos logit contributions
 steer+0.096
He+0.096
 swoop+0.096
 branch+0.095
 paddle+0.095
Neg logit contributions
uffy-0.089
 cousins-0.089
 twin-0.088
ours-0.087
 fault-0.087
Tokenuffy
Token ID15352
Feature activation+5.55e-01

Pos logit contributions
 cousins+0.111
 invitations+0.102
 birth+0.094
ied+0.092
 fault+0.090
Neg logit contributions
He-0.396
 branch-0.393
 paddle-0.382
 steer-0.382
 swoop-0.379
Token was
Token ID373
Feature activation+1.81e-01

Pos logit contributions
 fault+0.991
ours+0.982
 cousins+0.972
uffy+0.969
 twin+0.958
Neg logit contributions
 swoop-1.203
 branch-1.172
 steer-1.157
 paddle-1.155
He-1.154
Token soft
Token ID2705
Feature activation-1.42e-01

Pos logit contributions
uffy+0.395
ours+0.394
 fault+0.381
 cousins+0.376
ains+0.376
Neg logit contributions
He-0.325
 branch-0.324
 swoop-0.323
 steer-0.302
 paddle-0.297
Token and
Token ID290
Feature activation-4.47e-02

Pos logit contributions
 swoop+0.247
 steer+0.247
 paddle+0.243
He+0.242
 branch+0.242
Neg logit contributions
uffy-0.304
 twin-0.301
 cousins-0.298
ours-0.297
 fault-0.293
Token warm
Token ID5814
Feature activation-1.22e-01

Pos logit contributions
 steer+0.073
 swoop+0.072
 paddle+0.072
 branch+0.071
He+0.071
Neg logit contributions
uffy-0.100
 twin-0.100
 cousins-0.098
ours-0.097
 best-0.097
Token and
Token ID290
Feature activation-4.44e-02

Pos logit contributions
 swoop+0.214
 steer+0.213
 paddle+0.210
He+0.210
 branch+0.209
Neg logit contributions
uffy-0.262
 twin-0.259
 cousins-0.256
ours-0.256
 fault-0.252
Token the
Token ID262
Feature activation-1.94e-04

Pos logit contributions
 swoop+0.457
 steer+0.455
He+0.452
 paddle+0.449
 branch+0.449
Neg logit contributions
uffy-0.297
 twin-0.291
 cousins-0.289
ours-0.289
 fault-0.285
Token ducks
Token ID39694
Feature activation-1.73e-03

Pos logit contributions
 swoop+0.000
 steer+0.000
He+0.000
 branch+0.000
 paddle+0.000
Neg logit contributions
 best-0.000
 twin-0.000
 twins-0.000
 cousins-0.000
uffy-0.000
Token away
Token ID1497
Feature activation-1.32e-01

Pos logit contributions
He+0.004
 steer+0.003
 swoop+0.003
 paddle+0.003
 branch+0.003
Neg logit contributions
 twin-0.003
uffy-0.003
icking-0.003
 best-0.003
 fault-0.003
Token.
Token ID13
Feature activation-4.70e-02

Pos logit contributions
 swoop+0.257
 steer+0.255
 paddle+0.252
 branch+0.248
He+0.247
Neg logit contributions
uffy-0.261
 twin-0.258
 best-0.249
 cousins-0.244
ours-0.242
Token Fl
Token ID1610
Feature activation+1.38e-01

Pos logit contributions
 swoop+0.086
 steer+0.085
He+0.085
 branch+0.084
 paddle+0.084
Neg logit contributions
uffy-0.096
 twin-0.095
 cousins-0.095
ours-0.094
 fault-0.094
Tokenuffy
Token ID15352
Feature activation+5.55e-01

Pos logit contributions
 birth+0.117
 cousins+0.117
 presents+0.112
 fault+0.107
 invitations+0.106
Neg logit contributions
 branch-0.380
He-0.370
 reach-0.369
 lift-0.367
 steer-0.366
Token got
Token ID1392
Feature activation-7.10e-02

Pos logit contributions
uffy+1.099
 twin+1.087
 cousins+1.066
ours+1.054
 best+1.038
Neg logit contributions
 steer-1.057
 paddle-1.039
 swoop-1.037
He-1.029
 branch-1.022
Token so
Token ID523
Feature activation+1.32e-02

Pos logit contributions
 swoop+0.104
 steer+0.104
 paddle+0.102
He+0.101
 branch+0.101
Neg logit contributions
uffy-0.172
 twin-0.170
 cousins-0.169
ours-0.168
 fault-0.167
Token scared
Token ID12008
Feature activation-1.93e-01

Pos logit contributions
uffy+0.031
 twin+0.031
 best+0.030
 cousins+0.030
ours+0.030
Neg logit contributions
 steer-0.020
 paddle-0.020
 swoop-0.020
 branch-0.020
He-0.019
Token that
Token ID326
Feature activation-2.00e-02

Pos logit contributions
 steer+0.403
 paddle+0.402
 swoop+0.399
 branch+0.385
 lift+0.373
Neg logit contributions
uffy-0.356
 twin-0.353
 cousins-0.338
 best-0.336
ours-0.334
Token she
Token ID673
Feature activation-1.72e-01

Pos logit contributions
 steer+0.015
 swoop+0.015
 paddle+0.014
 branch+0.014
He+0.014
Neg logit contributions
uffy-0.063
 twin-0.063
 cousins-0.062
ours-0.062
 best-0.061
Token.
Token ID13
Feature activation-5.92e-02

Pos logit contributions
uffy+0.031
ours+0.030
 cousins+0.029
 twin+0.029
 fault+0.029
Neg logit contributions
 branch-0.021
He-0.021
 steer-0.021
 swoop-0.021
 paddle-0.020
Token Can
Token ID1680
Feature activation-6.23e-02

Pos logit contributions
He+0.110
 steer+0.109
 swoop+0.107
 paddle+0.107
 branch+0.107
Neg logit contributions
uffy-0.121
 twin-0.117
 fault-0.116
ours-0.116
 cousins-0.116
Token I
Token ID314
Feature activation-5.32e-02

Pos logit contributions
 steer+0.110
 swoop+0.110
He+0.108
 paddle+0.108
 branch+0.108
Neg logit contributions
uffy-0.131
 twin-0.130
 cousins-0.129
 fault-0.128
ours-0.128
Token pet
Token ID4273
Feature activation-2.06e-01

Pos logit contributions
 swoop+0.093
He+0.090
 steer+0.089
 branch+0.088
 paddle+0.087
Neg logit contributions
uffy-0.113
 twin-0.111
 best-0.107
 cousins-0.105
 cousin-0.103
Token Fl
Token ID1610
Feature activation+1.38e-01

Pos logit contributions
 steer+0.427
 swoop+0.427
 branch+0.421
 paddle+0.420
He+0.418
Neg logit contributions
uffy-0.372
 twin-0.366
 fault-0.365
 cousins-0.363
ours-0.361
Tokenuffy
Token ID15352
Feature activation+5.55e-01

Pos logit contributions
 fault+0.083
ied+0.077
 cousins+0.077
 best+0.074
omm+0.068
Neg logit contributions
 steer-0.420
 branch-0.418
 paddle-0.417
 swoop-0.415
 reach-0.405
Token?"
Token ID1701
Feature activation+1.59e-01

Pos logit contributions
 fault+1.016
 cousins+0.973
ours+0.961
uffy+0.952
anny+0.948
Neg logit contributions
 branch-1.167
 steer-1.155
 swoop-1.136
 paddle-1.129
He-1.115
Token He
Token ID679
Feature activation-2.79e-01

Pos logit contributions
uffy+0.230
 twin+0.230
 cousins+0.210
 cousin+0.207
ours+0.205
Neg logit contributions
 swoop-0.400
 paddle-0.393
 steer-0.390
 branch-0.380
 lift-0.365
Token is
Token ID318
Feature activation+3.28e-01

Pos logit contributions
 swoop+0.592
 steer+0.588
 paddle+0.581
He+0.579
 branch+0.577
Neg logit contributions
uffy-0.496
 twin-0.483
 cousins-0.482
 fault-0.478
ours-0.466
Token curious
Token ID11040
Feature activation-9.75e-02

Pos logit contributions
uffy+0.722
 twin+0.701
 cousins+0.700
ours+0.698
 fault+0.693
Neg logit contributions
 swoop-0.562
 steer-0.554
He-0.553
 branch-0.552
 paddle-0.545
Token.
Token ID13
Feature activation-5.85e-02

Pos logit contributions
 swoop+0.183
 steer+0.183
 paddle+0.181
 branch+0.176
He+0.175
Neg logit contributions
 twin-0.196
uffy-0.193
 cousins-0.192
 best-0.190
ours-0.190
Token loud
Token ID7812
Feature activation-2.67e-01

Pos logit contributions
uffy+0.404
ours+0.399
 fault+0.389
 cousins+0.388
 twin+0.388
Neg logit contributions
 swoop-0.312
 branch-0.310
 steer-0.304
He-0.304
 paddle-0.303
Token and
Token ID290
Feature activation-4.79e-02

Pos logit contributions
 swoop+0.466
 steer+0.465
 paddle+0.458
 branch+0.456
He+0.456
Neg logit contributions
uffy-0.574
 twin-0.567
 cousins-0.561
ours-0.559
 fault-0.551
Token scary
Token ID14343
Feature activation-2.19e-01

Pos logit contributions
 steer+0.075
 paddle+0.073
He+0.072
 swoop+0.072
 branch+0.070
Neg logit contributions
uffy-0.109
 twin-0.109
 cousins-0.107
 best-0.107
ours-0.106
Token.
Token ID13
Feature activation-4.80e-02

Pos logit contributions
 swoop+0.339
 branch+0.338
 steer+0.338
He+0.332
 paddle+0.332
Neg logit contributions
uffy-0.510
ours-0.502
 cousins-0.498
 twin-0.497
omm-0.491
Token Fl
Token ID1610
Feature activation+1.38e-01

Pos logit contributions
 swoop+0.093
 steer+0.093
He+0.093
 paddle+0.092
 branch+0.092
Neg logit contributions
uffy-0.092
 twin-0.091
 cousins-0.091
 fault-0.090
ours-0.090
Tokenuffy
Token ID15352
Feature activation+5.55e-01

Pos logit contributions
 cousins+0.096
 birth+0.092
 presents+0.086
omm+0.081
 fault+0.080
Neg logit contributions
 branch-0.424
 steer-0.409
He-0.409
 reach-0.407
 lift-0.404
Token and
Token ID290
Feature activation-4.76e-02

Pos logit contributions
uffy+1.115
 twin+1.105
ours+1.105
 fault+1.100
 cousins+1.098
Neg logit contributions
 swoop-1.044
 steer-1.026
 branch-1.019
 paddle-1.012
He-1.011
Token his
Token ID465
Feature activation-1.43e-01

Pos logit contributions
 swoop+0.083
 steer+0.083
 branch+0.082
 paddle+0.082
He+0.081
Neg logit contributions
uffy-0.102
 twin-0.101
ours-0.100
 cousins-0.100
 fault-0.099
Token friends
Token ID2460
Feature activation+3.83e-01

Pos logit contributions
 paddle+0.389
 branch+0.387
 swoop+0.371
 steer+0.370
 rock+0.369
Neg logit contributions
ours-0.182
omm-0.157
ains-0.155
anny-0.152
acey-0.152
Token were
Token ID547
Feature activation+1.61e-01

Pos logit contributions
uffy+0.731
ours+0.721
 fault+0.716
 twin+0.713
omm+0.707
Neg logit contributions
 swoop-0.777
 steer-0.766
 paddle-0.763
 branch-0.752
He-0.746
Token afraid
Token ID7787
Feature activation-1.00e-01

Pos logit contributions
ours+0.392
uffy+0.390
omm+0.382
 fault+0.377
 twin+0.373
Neg logit contributions
 swoop-0.242
 branch-0.240
 paddle-0.235
 steer-0.232
He-0.232
Token Lily
Token ID20037
Feature activation+2.07e-01

Pos logit contributions
He+0.074
 steer+0.073
 swoop+0.073
 branch+0.072
 paddle+0.071
Neg logit contributions
uffy-0.157
 cousins-0.156
 twin-0.155
ours-0.155
 fault-0.154
Token is
Token ID318
Feature activation+3.25e-01

Pos logit contributions
uffy+0.305
 cousins+0.302
 twin+0.299
anny+0.297
 fault+0.295
Neg logit contributions
 swoop-0.498
 steer-0.491
 branch-0.488
 paddle-0.488
He-0.487
Token worried
Token ID7960
Feature activation-3.23e-02

Pos logit contributions
uffy+0.774
ours+0.755
ains+0.734
 fault+0.730
 twin+0.729
Neg logit contributions
 branch-0.517
 swoop-0.515
He-0.503
 steer-0.493
 paddle-0.485
Token about
Token ID546
Feature activation+3.45e-01

Pos logit contributions
 steer+0.057
He+0.057
 swoop+0.056
 branch+0.056
 paddle+0.055
Neg logit contributions
uffy-0.069
 twin-0.067
 cousins-0.067
ours-0.067
 fault-0.066
Token Fl
Token ID1610
Feature activation+1.38e-01

Pos logit contributions
uffy+0.576
 twin+0.568
ours+0.565
 cousins+0.562
apped+0.550
Neg logit contributions
 steer-0.765
 swoop-0.759
 branch-0.755
 paddle-0.754
He-0.744
Tokenuffy
Token ID15352
Feature activation+5.55e-01

Pos logit contributions
 cousins+0.093
 fault+0.090
 birth+0.084
 best+0.078
apped+0.074
Neg logit contributions
 branch-0.404
 steer-0.398
 swoop-0.398
 reach-0.394
 paddle-0.394
Token.
Token ID13
Feature activation-5.77e-02

Pos logit contributions
uffy+1.117
ours+1.117
 cousins+1.103
 twin+1.095
 fault+1.082
Neg logit contributions
 branch-1.020
 swoop-1.018
 steer-1.018
 paddle-1.000
He-0.997
Token She
Token ID1375
Feature activation-5.34e-02

Pos logit contributions
He+0.098
 branch+0.095
 swoop+0.095
 steer+0.094
 paddle+0.093
Neg logit contributions
 cousins-0.126
 twin-0.125
uffy-0.125
ours-0.125
 fault-0.124
Token wants
Token ID3382
Feature activation+1.14e-01

Pos logit contributions
 swoop+0.122
 steer+0.119
 paddle+0.119
He+0.118
 branch+0.118
Neg logit contributions
uffy-0.087
 twin-0.086
 cousins-0.084
 fault-0.082
ours-0.082
Token him
Token ID683
Feature activation-8.88e-04

Pos logit contributions
uffy+0.231
ours+0.228
 twin+0.222
 cousins+0.222
apped+0.221
Neg logit contributions
 steer-0.217
 swoop-0.216
 branch-0.214
He-0.213
 paddle-0.212
Token to
Token ID284
Feature activation+2.52e-02

Pos logit contributions
 branch+0.002
 steer+0.002
 swoop+0.002
He+0.002
 paddle+0.002
Neg logit contributions
ours-0.002
uffy-0.002
omm-0.002
 twin-0.002
 cousins-0.002
Token,
Token ID11
Feature activation+4.87e-02

Pos logit contributions
 swoop+0.219
 branch+0.218
 steer+0.216
He+0.214
 paddle+0.209
Neg logit contributions
ours-0.228
uffy-0.226
 cousins-0.224
 twin-0.223
 fault-0.223
Token you
Token ID345
Feature activation+7.96e-02

Pos logit contributions
uffy+0.079
 twin+0.078
 cousins+0.076
ours+0.075
 best+0.074
Neg logit contributions
 swoop-0.111
 steer-0.110
 paddle-0.110
He-0.109
 branch-0.108
Token can
Token ID460
Feature activation+1.12e-01

Pos logit contributions
uffy+0.161
 cousins+0.155
omm+0.154
 fault+0.153
ours+0.152
Neg logit contributions
He-0.153
 steer-0.149
 swoop-0.148
 reach-0.148
 paddle-0.147
Token pet
Token ID4273
Feature activation-2.06e-01

Pos logit contributions
uffy+0.273
 twin+0.271
 cousins+0.269
ours+0.268
 fault+0.266
Neg logit contributions
 swoop-0.163
 steer-0.161
 paddle-0.159
 branch-0.159
He-0.157
Token Fl
Token ID1610
Feature activation+1.38e-01

Pos logit contributions
 swoop+0.449
 steer+0.448
 paddle+0.442
 branch+0.442
He+0.440
Neg logit contributions
uffy-0.348
 twin-0.345
 cousins-0.341
ours-0.339
 fault-0.338
Tokenuffy
Token ID15352
Feature activation+5.55e-01

Pos logit contributions
 fault+0.077
 cousins+0.076
ied+0.065
ours+0.063
 sisters+0.063
Neg logit contributions
 steer-0.434
 swoop-0.432
 branch-0.432
 paddle-0.427
He-0.424
Token.
Token ID13
Feature activation-5.62e-02

Pos logit contributions
ours+1.294
 cousins+1.291
 fault+1.285
uffy+1.279
 twin+1.267
Neg logit contributions
 branch-0.854
 swoop-0.837
 steer-0.836
He-0.818
 paddle-0.812
Token But
Token ID887
Feature activation-9.09e-02

Pos logit contributions
 steer+0.098
 swoop+0.098
He+0.097
 paddle+0.096
 branch+0.096
Neg logit contributions
uffy-0.119
 twin-0.118
 cousins-0.117
ours-0.117
 fault-0.116
Token be
Token ID307
Feature activation-1.01e-01

Pos logit contributions
 steer+0.170
 swoop+0.169
 branch+0.166
 paddle+0.165
He+0.165
Neg logit contributions
 cousins-0.184
uffy-0.183
ours-0.182
 twin-0.182
 fault-0.182
Token careful
Token ID8161
Feature activation+6.71e-03

Pos logit contributions
 swoop+0.165
 branch+0.165
He+0.160
 steer+0.156
 reach+0.148
Neg logit contributions
uffy-0.239
 fault-0.233
ours-0.230
 cousins-0.230
 twin-0.230
Token.
Token ID13
Feature activation-5.66e-02

Pos logit contributions
uffy+0.015
 cousins+0.015
 twin+0.015
ours+0.015
 fault+0.015
Neg logit contributions
 steer-0.011
 branch-0.011
He-0.011
 swoop-0.011
 paddle-0.010
Token it
Token ID340
Feature activation-1.25e-01

Pos logit contributions
omm+0.162
uffy+0.159
ours+0.155
 fault+0.153
ains+0.150
Neg logit contributions
 branch-0.086
 swoop-0.086
 steer-0.084
 paddle-0.081
 reach-0.080
Token a
Token ID257
Feature activation+1.49e-02

Pos logit contributions
 branch+0.232
 swoop+0.232
 steer+0.230
 paddle+0.227
He+0.224
Neg logit contributions
uffy-0.259
ours-0.252
 fault-0.247
 twin-0.247
omm-0.245
Token name
Token ID1438
Feature activation+9.68e-02

Pos logit contributions
 twin+0.031
uffy+0.031
 cousins+0.031
 best+0.031
 fault+0.030
Neg logit contributions
 steer-0.027
 swoop-0.027
He-0.026
 paddle-0.025
 branch-0.025
Token,
Token ID11
Feature activation+5.35e-02

Pos logit contributions
uffy+0.238
ours+0.231
 fault+0.229
 cousins+0.226
 twin+0.224
Neg logit contributions
 branch-0.146
He-0.141
 swoop-0.139
 steer-0.134
 paddle-0.130
Token Fl
Token ID1610
Feature activation+1.35e-01

Pos logit contributions
 twin+0.102
 cousins+0.099
 twins+0.097
 best+0.094
ours+0.093
Neg logit contributions
 swoop-0.106
 steer-0.103
 paddle-0.102
He-0.098
 distance-0.097
Tokenuffy
Token ID15352
Feature activation+5.54e-01

Pos logit contributions
 cousins+0.080
 best+0.074
 sisters+0.072
 birth+0.067
omm+0.063
Neg logit contributions
 branch-0.434
 swoop-0.413
 steer-0.413
 lift-0.410
 paddle-0.409
Token.
Token ID13
Feature activation-5.35e-02

Pos logit contributions
ours+1.327
uffy+1.319
 fault+1.303
 cousins+1.301
 twin+1.299
Neg logit contributions
 branch-0.833
 swoop-0.823
 steer-0.811
He-0.803
 paddle-0.796
Token Fl
Token ID1610
Feature activation+1.35e-01

Pos logit contributions
He+0.090
 branch+0.087
 steer+0.086
 paddle+0.086
 swoop+0.083
Neg logit contributions
 cousins-0.121
 fault-0.121
icking-0.119
uffy-0.117
 best-0.116
Tokenuffy
Token ID15352
Feature activation+5.54e-01

Pos logit contributions
 birth+0.098
 cousins+0.091
 presents+0.088
 invitations+0.084
ied+0.081
Neg logit contributions
 branch-0.408
 lift-0.385
 reach-0.380
 steer-0.379
 swoop-0.379
Token was
Token ID373
Feature activation+1.83e-01

Pos logit contributions
ours+1.010
 fault+0.999
uffy+0.977
omm+0.977
 cousins+0.949
Neg logit contributions
 swoop-1.210
 branch-1.175
He-1.158
 steer-1.155
 paddle-1.146
Token happy
Token ID3772
Feature activation-6.61e-02

Pos logit contributions
uffy+0.429
ours+0.428
 fault+0.409
ains+0.405
icking+0.399
Neg logit contributions
 branch-0.305
 swoop-0.302
He-0.294
 paddle-0.282
 steer-0.281
Token a
Token ID257
Feature activation+9.39e-03

Pos logit contributions
 swoop+0.102
 steer+0.099
 paddle+0.099
 branch+0.098
He+0.098
Neg logit contributions
 twin-0.118
 cousins-0.118
uffy-0.115
 best-0.115
ours-0.113
Token good
Token ID922
Feature activation+1.19e-01

Pos logit contributions
 twin+0.017
uffy+0.017
 cousins+0.016
 best+0.016
 fault+0.016
Neg logit contributions
 steer-0.020
 swoop-0.020
 paddle-0.019
He-0.019
 branch-0.019
Token sister
Token ID6621
Feature activation+4.09e-01

Pos logit contributions
uffy+0.215
ours+0.213
 twin+0.209
 cousins+0.207
 fault+0.204
Neg logit contributions
 swoop-0.248
 steer-0.247
 paddle-0.246
 branch-0.244
He-0.243
Token to
Token ID284
Feature activation+2.91e-02

Pos logit contributions
uffy+0.954
ours+0.943
 fault+0.907
 cousins+0.906
omm+0.906
Neg logit contributions
He-0.639
 steer-0.635
 branch-0.633
 swoop-0.624
 paddle-0.606
Token Fl
Token ID1610
Feature activation+1.39e-01

Pos logit contributions
uffy+0.058
 twin+0.057
 cousins+0.057
ours+0.056
 fault+0.055
Neg logit contributions
 swoop-0.055
 steer-0.055
 paddle-0.054
He-0.054
 branch-0.054
Tokenuffy
Token ID15352
Feature activation+5.54e-01

Pos logit contributions
 fault+0.086
 cousins+0.080
 best+0.078
ours+0.072
 sisters+0.069
Neg logit contributions
 paddle-0.420
 steer-0.417
 branch-0.413
He-0.407
 lift-0.406
Token.
Token ID13
Feature activation-5.89e-02

Pos logit contributions
uffy+1.252
ours+1.249
 cousins+1.222
 fault+1.219
 twin+1.216
Neg logit contributions
 steer-0.888
 branch-0.887
 swoop-0.877
He-0.872
 paddle-0.866
Token Come
Token ID7911
Feature activation+1.88e-01

Pos logit contributions
He+0.082
 steer+0.081
 swoop+0.080
 paddle+0.079
 branch+0.079
Neg logit contributions
uffy-0.146
 twin-0.144
ours-0.144
 cousins-0.143
 fault-0.143
Token,
Token ID11
Feature activation+5.00e-02

Pos logit contributions
uffy+0.343
 twin+0.338
 cousins+0.334
ours+0.331
 best+0.325
Neg logit contributions
 swoop-0.387
 steer-0.385
 paddle-0.380
He-0.379
 branch-0.379
Token Fl
Token ID1610
Feature activation+1.39e-01

Pos logit contributions
 twin+0.116
uffy+0.116
 cousins+0.114
ours+0.112
 best+0.112
Neg logit contributions
 swoop-0.077
 steer-0.076
 paddle-0.075
 branch-0.075
He-0.074
Tokenuffy
Token ID15352
Feature activation+5.54e-01

Pos logit contributions
 cousins+0.086
 sisters+0.077
 fault+0.068
 best+0.066
 birth+0.065
Neg logit contributions
 branch-0.442
 steer-0.436
He-0.436
 paddle-0.435
 swoop-0.430
Token,
Token ID11
Feature activation+5.35e-02

Pos logit contributions
uffy+0.238
ours+0.231
 fault+0.229
 cousins+0.226
 twin+0.224
Neg logit contributions
 branch-0.146
He-0.141
 swoop-0.139
 steer-0.134
 paddle-0.130
Token Fl
Token ID1610
Feature activation+1.35e-01

Pos logit contributions
 twin+0.102
 cousins+0.099
 twins+0.097
 best+0.094
ours+0.093
Neg logit contributions
 swoop-0.106
 steer-0.103
 paddle-0.102
He-0.098
 distance-0.097
Tokenuffy
Token ID15352
Feature activation+5.54e-01

Pos logit contributions
 cousins+0.080
 best+0.074
 sisters+0.072
 birth+0.067
omm+0.063
Neg logit contributions
 branch-0.434
 swoop-0.413
 steer-0.413
 lift-0.410
 paddle-0.409
Token.
Token ID13
Feature activation-5.35e-02

Pos logit contributions
ours+1.327
uffy+1.319
 fault+1.303
 cousins+1.301
 twin+1.299
Neg logit contributions
 branch-0.833
 swoop-0.823
 steer-0.811
He-0.803
 paddle-0.796
Token Fl
Token ID1610
Feature activation+1.35e-01

Pos logit contributions
He+0.090
 branch+0.087
 steer+0.086
 paddle+0.086
 swoop+0.083
Neg logit contributions
 cousins-0.121
 fault-0.121
icking-0.119
uffy-0.117
 best-0.116
Tokenuffy
Token ID15352
Feature activation+5.54e-01

Pos logit contributions
 birth+0.098
 cousins+0.091
 presents+0.088
 invitations+0.084
ied+0.081
Neg logit contributions
 branch-0.408
 lift-0.385
 reach-0.380
 steer-0.379
 swoop-0.379
Token was
Token ID373
Feature activation+1.83e-01

Pos logit contributions
ours+1.010
 fault+0.999
uffy+0.977
omm+0.977
 cousins+0.949
Neg logit contributions
 swoop-1.210
 branch-1.175
He-1.158
 steer-1.155
 paddle-1.146
Token happy
Token ID3772
Feature activation-6.61e-02

Pos logit contributions
uffy+0.429
ours+0.428
 fault+0.409
ains+0.405
icking+0.399
Neg logit contributions
 branch-0.305
 swoop-0.302
He-0.294
 paddle-0.282
 steer-0.281
Token and
Token ID290
Feature activation-4.98e-02

Pos logit contributions
 branch+0.119
He+0.117
 swoop+0.117
 steer+0.116
 paddle+0.113
Neg logit contributions
uffy-0.141
ours-0.138
 cousins-0.137
 twin-0.137
omm-0.134
Token loved
Token ID6151
Feature activation+2.57e-01

Pos logit contributions
 swoop+0.117
 steer+0.117
 paddle+0.116
 branch+0.115
He+0.115
Neg logit contributions
uffy-0.076
 twin-0.075
 cousins-0.074
ours-0.073
 best-0.072
Token to
Token ID284
Feature activation+2.75e-02

Pos logit contributions
uffy+0.509
ours+0.497
 twin+0.491
 fault+0.481
 cousins+0.481
Neg logit contributions
 swoop-0.505
 branch-0.502
 steer-0.500
He-0.493
 paddle-0.492
Token that
Token ID326
Feature activation-2.20e-02

Pos logit contributions
 swoop+0.193
 steer+0.192
 paddle+0.189
 branch+0.189
He+0.189
Neg logit contributions
uffy-0.219
 twin-0.217
 cousins-0.214
ours-0.213
 fault-0.211
Token day
Token ID1110
Feature activation+1.84e-01

Pos logit contributions
 branch+0.050
 swoop+0.048
 reach+0.046
He+0.046
 lift+0.046
Neg logit contributions
ours-0.041
 cousins-0.039
uffy-0.039
omm-0.037
ains-0.037
Token on
Token ID319
Feature activation+6.49e-02

Pos logit contributions
 fault+0.370
ours+0.365
 cousins+0.362
uffy+0.355
 twin+0.355
Neg logit contributions
He-0.351
 branch-0.349
 swoop-0.349
 steer-0.343
 paddle-0.340
Token,
Token ID11
Feature activation+5.78e-02

Pos logit contributions
 cousins+0.147
ours+0.146
ains+0.142
 fault+0.142
anny+0.139
Neg logit contributions
 branch-0.107
 swoop-0.105
 reach-0.102
 paddle-0.102
He-0.101
Token Fl
Token ID1610
Feature activation+1.38e-01

Pos logit contributions
ours+0.107
 fault+0.106
 cousins+0.105
omm+0.103
uffy+0.101
Neg logit contributions
 branch-0.119
 swoop-0.118
 steer-0.117
He-0.116
 paddle-0.113
Tokenuffy
Token ID15352
Feature activation+5.54e-01

Pos logit contributions
 cousins+0.115
 birth+0.114
 invitations+0.102
 presents+0.099
ours+0.093
Neg logit contributions
 branch-0.409
 swoop-0.399
He-0.392
 steer-0.382
 lift-0.377
Token and
Token ID290
Feature activation-4.66e-02

Pos logit contributions
uffy+1.045
ours+1.031
 cousins+1.020
 twin+1.015
 fault+0.997
Neg logit contributions
 swoop-1.135
 steer-1.112
 branch-1.112
He-1.105
 paddle-1.090
Token the
Token ID262
Feature activation-7.68e-03

Pos logit contributions
 swoop+0.095
 steer+0.094
 branch+0.093
 paddle+0.093
He+0.092
Neg logit contributions
uffy-0.088
 twin-0.086
ours-0.086
 cousins-0.085
 fault-0.085
Token red
Token ID2266
Feature activation-8.22e-02

Pos logit contributions
 swoop+0.013
 steer+0.013
 paddle+0.013
He+0.013
 branch+0.013
Neg logit contributions
uffy-0.016
 twin-0.016
 cousins-0.016
ours-0.016
 fault-0.016
Token ball
Token ID2613
Feature activation-5.49e-02

Pos logit contributions
 swoop+0.144
 steer+0.143
 paddle+0.141
He+0.140
 branch+0.140
Neg logit contributions
uffy-0.177
 twin-0.175
 cousins-0.173
ours-0.172
 fault-0.170
Token were
Token ID547
Feature activation+1.60e-01

Pos logit contributions
 steer+0.127
 swoop+0.126
 paddle+0.126
He+0.125
 branch+0.124
Neg logit contributions
uffy-0.084
 twin-0.084
 cousins-0.081
 best-0.081
ours-0.080
Token kitten
Token ID42143
Feature activation+5.97e-02

Pos logit contributions
 branch+0.105
 swoop+0.104
 paddle+0.104
 steer+0.102
He+0.097
Neg logit contributions
ours-0.119
uffy-0.118
ains-0.115
 cousins-0.114
omm-0.114
Token.
Token ID13
Feature activation-5.20e-02

Pos logit contributions
ours+0.141
uffy+0.137
omm+0.135
 twin+0.133
 cousins+0.133
Neg logit contributions
 branch-0.093
 swoop-0.091
 steer-0.089
 paddle-0.089
He-0.087
Token Lily
Token ID20037
Feature activation+2.14e-01

Pos logit contributions
He+0.104
 branch+0.098
 swoop+0.096
 steer+0.092
 paddle+0.092
Neg logit contributions
 fault-0.102
 cousins-0.098
uffy-0.098
icking-0.097
ains-0.097
Token and
Token ID290
Feature activation-4.90e-02

Pos logit contributions
uffy+0.405
 fault+0.404
ours+0.403
 cousins+0.390
 twin+0.390
Neg logit contributions
 swoop-0.445
 steer-0.428
He-0.428
 branch-0.427
 paddle-0.423
Token Fl
Token ID1610
Feature activation+1.36e-01

Pos logit contributions
 branch+0.090
 swoop+0.090
 paddle+0.089
 steer+0.089
 lift+0.085
Neg logit contributions
 fault-0.104
anny-0.101
ours-0.101
ains-0.100
uffy-0.099
Tokenuffy
Token ID15352
Feature activation+5.54e-01

Pos logit contributions
 birth+0.112
 fault+0.105
box+0.098
 presents+0.098
 cousins+0.097
Neg logit contributions
 swoop-0.367
 branch-0.365
Tweet-0.354
 paddle-0.353
 fly-0.353
Token became
Token ID2627
Feature activation+1.20e-01

Pos logit contributions
uffy+0.937
ours+0.912
 fault+0.909
 twin+0.902
 cousins+0.890
Neg logit contributions
 swoop-1.255
 paddle-1.233
 steer-1.233
 branch-1.216
He-1.206
Token best
Token ID1266
Feature activation+5.24e-01

Pos logit contributions
uffy+0.192
ours+0.189
omm+0.184
 fault+0.182
icking+0.177
Neg logit contributions
 swoop-0.283
 branch-0.283
 paddle-0.276
 reach-0.273
 catch-0.273
Token friends
Token ID2460
Feature activation+3.81e-01

Pos logit contributions
ours+0.501
uffy+0.474
omm+0.464
 fault+0.461
ains+0.456
Neg logit contributions
 steer-1.518
He-1.511
 branch-1.510
 paddle-1.509
 swoop-1.508
Token and
Token ID290
Feature activation-4.93e-02

Pos logit contributions
ours+0.897
uffy+0.897
 twin+0.870
omm+0.869
 cousins+0.863
Neg logit contributions
 branch-0.596
He-0.590
 swoop-0.580
 steer-0.575
 paddle-0.558
Token did
Token ID750
Feature activation+1.32e-01

Pos logit contributions
 swoop+0.102
 paddle+0.097
 branch+0.097
 steer+0.095
 move+0.091
Neg logit contributions
omm-0.099
ours-0.097
 fault-0.097
uffy-0.095
ains-0.092
Token called
Token ID1444
Feature activation+8.85e-02

Pos logit contributions
 steer+0.018
 swoop+0.018
 paddle+0.018
He+0.018
 branch+0.018
Neg logit contributions
uffy-0.010
 twin-0.010
 cousins-0.010
ours-0.010
 best-0.009
Token Fl
Token ID1610
Feature activation+1.39e-01

Pos logit contributions
ours+0.150
uffy+0.147
 cousins+0.145
 fault+0.144
 twin+0.144
Neg logit contributions
 steer-0.193
 branch-0.193
 paddle-0.191
 swoop-0.190
He-0.188
Tokenuffy
Token ID15352
Feature activation+5.54e-01

Pos logit contributions
 cousins+0.097
 fault+0.095
ours+0.091
 sisters+0.089
omm+0.085
Neg logit contributions
He-0.432
 branch-0.420
 steer-0.419
 reach-0.413
 paddle-0.410
Token.
Token ID13
Feature activation-5.70e-02

Pos logit contributions
uffy+1.236
ours+1.223
 twin+1.222
 cousins+1.211
 fault+1.205
Neg logit contributions
 swoop-0.912
 steer-0.910
 branch-0.901
He-0.896
 paddle-0.893
Token Fl
Token ID1610
Feature activation+1.38e-01

Pos logit contributions
 swoop+0.124
 steer+0.124
He+0.124
 paddle+0.123
 branch+0.123
Neg logit contributions
uffy-0.096
 cousins-0.095
 twin-0.095
ours-0.094
 fault-0.094
Tokenuffy
Token ID15352
Feature activation+5.54e-01

Pos logit contributions
 cousins+0.083
 fault+0.078
 birth+0.075
ours+0.074
 sisters+0.073
Neg logit contributions
He-0.444
 branch-0.426
 steer-0.417
 reach-0.415
 lift-0.414
Token was
Token ID373
Feature activation+1.83e-01

Pos logit contributions
 fault+1.013
ours+1.007
omm+0.964
 twin+0.961
icking+0.958
Neg logit contributions
He-1.214
 swoop-1.205
 branch-1.194
 steer-1.145
 lift-1.132
Token a
Token ID257
Feature activation+1.63e-02

Pos logit contributions
ours+0.389
 fault+0.382
uffy+0.380
ains+0.371
 cousins+0.369
Neg logit contributions
 branch-0.328
He-0.327
 swoop-0.326
 paddle-0.318
 steer-0.314
Token very
Token ID845
Feature activation+6.93e-02

Pos logit contributions
 twin+0.032
 best+0.032
uffy+0.031
 cousins+0.030
 twins+0.029
Neg logit contributions
 steer-0.033
 swoop-0.033
 paddle-0.031
 branch-0.030
He-0.030
Token enthusiastic
Token ID21848
Feature activation-5.84e-02

Pos logit contributions
uffy+0.143
 twin+0.141
 best+0.138
 cousins+0.137
ours+0.135
Neg logit contributions
 steer-0.129
 swoop-0.129
 branch-0.126
 paddle-0.125
He-0.125
Token dog
Token ID3290
Feature activation-7.93e-03

Pos logit contributions
 swoop+0.117
 steer+0.116
 paddle+0.113
He+0.112
 branch+0.110
Neg logit contributions
 twin-0.113
uffy-0.112
 cousins-0.109
 best-0.107
 fault-0.106
Token a
Token ID257
Feature activation+5.16e-03

Pos logit contributions
uffy+0.104
 twin+0.102
 cousins+0.101
ours+0.101
 fault+0.100
Neg logit contributions
 swoop-0.087
 steer-0.086
 paddle-0.085
 branch-0.085
He-0.085
Token tiny
Token ID7009
Feature activation-1.33e-01

Pos logit contributions
 twin+0.012
 best+0.011
uffy+0.011
 cousins+0.011
 sisters+0.011
Neg logit contributions
 steer-0.009
 paddle-0.008
He-0.008
 swoop-0.008
 reach-0.008
Token little
Token ID1310
Feature activation-6.17e-02

Pos logit contributions
 steer+0.235
He+0.231
 paddle+0.219
 swoop+0.214
 reach+0.204
Neg logit contributions
 twin-0.300
 best-0.274
uffy-0.272
 brother-0.270
 cousins-0.269
Token bird
Token ID6512
Feature activation-2.13e-01

Pos logit contributions
 steer+0.112
He+0.111
 paddle+0.107
 swoop+0.102
 reach+0.098
Neg logit contributions
 twin-0.135
 best-0.127
 cousins-0.124
 sisters-0.124
 brother-0.124
Token came
Token ID1625
Feature activation+1.36e-02

Pos logit contributions
 steer+0.330
 paddle+0.328
He+0.327
 branch+0.311
 distance+0.299
Neg logit contributions
uffy-0.476
 twin-0.470
icking-0.454
 best-0.454
 cousins-0.453
Token swoop
Token ID38527
Feature activation-5.82e-01

Pos logit contributions
 best+0.022
 twin+0.021
icking+0.020
apped+0.020
 fault+0.020
Neg logit contributions
 steer-0.030
 paddle-0.030
He-0.029
 swoop-0.029
 branch-0.028
Tokening
Token ID278
Feature activation+2.17e-01

Pos logit contributions
 swoop+1.441
 steer+1.440
 paddle+1.425
He+1.419
 branch+1.418
Neg logit contributions
uffy-0.815
 twin-0.803
 cousins-0.788
ours-0.783
 fault-0.768
Token in
Token ID287
Feature activation+6.58e-02

Pos logit contributions
 twin+0.327
 best+0.326
apped+0.318
uffy+0.297
icking+0.293
Neg logit contributions
 steer-0.483
 paddle-0.478
He-0.466
 swoop-0.451
 distance-0.445
Token and
Token ID290
Feature activation-5.33e-02

Pos logit contributions
 twin+0.102
 best+0.100
 twins+0.094
 fault+0.090
ma+0.090
Neg logit contributions
 paddle-0.143
 steer-0.138
 swoop-0.133
He-0.129
bread-0.124
Token lifted
Token ID13663
Feature activation-2.34e-01

Pos logit contributions
He+0.092
 steer+0.089
 paddle+0.088
 branch+0.086
 distance+0.085
Neg logit contributions
ours-0.103
 twin-0.103
uffy-0.103
 cousins-0.101
 best-0.101
Token up
Token ID510
Feature activation+1.37e-01

Pos logit contributions
 steer+0.483
 paddle+0.480
 swoop+0.479
He+0.475
 branch+0.469
Neg logit contributions
 twin-0.420
uffy-0.419
 cousins-0.406
 best-0.406
ours-0.405
TokenSuddenly
Token ID38582
Feature activation-2.40e-01

Pos logit contributions
 twin+0.002
uffy+0.002
 cousins+0.002
ours+0.002
 best+0.001
Neg logit contributions
 paddle-0.001
 steer-0.001
 swoop-0.001
 branch-0.001
 lift-0.001
Token,
Token ID11
Feature activation+3.51e-02

Pos logit contributions
 swoop+0.480
 steer+0.472
 branch+0.469
 paddle+0.469
He+0.466
Neg logit contributions
uffy-0.451
ours-0.446
 cousins-0.444
 fault-0.444
 twin-0.441
Token a
Token ID257
Feature activation-5.82e-03

Pos logit contributions
uffy+0.066
 twin+0.065
 cousins+0.064
ours+0.064
 fault+0.063
Neg logit contributions
 swoop-0.070
 steer-0.070
 paddle-0.069
He-0.069
 branch-0.069
Token big
Token ID1263
Feature activation-1.41e-01

Pos logit contributions
 steer+0.010
 paddle+0.009
 swoop+0.009
He+0.009
 reach+0.009
Neg logit contributions
 twin-0.013
 best-0.013
uffy-0.013
 cousins-0.013
 twins-0.012
Token bird
Token ID6512
Feature activation-2.16e-01

Pos logit contributions
 steer+0.250
He+0.247
 paddle+0.232
 swoop+0.232
 reach+0.212
Neg logit contributions
 twin-0.313
 cousins-0.294
 best-0.293
 brother-0.290
 twins-0.289
Token swoop
Token ID38527
Feature activation-5.81e-01

Pos logit contributions
 steer+0.395
 paddle+0.392
He+0.385
 branch+0.362
 distance+0.353
Neg logit contributions
uffy-0.405
ours-0.394
 twin-0.392
apped-0.390
 cousins-0.374
Tokened
Token ID276
Feature activation+1.70e-01

Pos logit contributions
 swoop+1.513
 steer+1.513
 paddle+1.497
He+1.492
 branch+1.491
Neg logit contributions
uffy-0.737
 twin-0.724
 cousins-0.709
ours-0.705
 fault-0.689
Token down
Token ID866
Feature activation-8.49e-02

Pos logit contributions
 twin+0.259
 best+0.252
apped+0.251
uffy+0.248
ours+0.246
Neg logit contributions
 steer-0.380
 paddle-0.375
He-0.362
 swoop-0.355
 branch-0.351
Token and
Token ID290
Feature activation-6.46e-02

Pos logit contributions
 paddle+0.200
 steer+0.196
 swoop+0.191
 branch+0.180
He+0.180
Neg logit contributions
apped-0.118
 twin-0.116
 best-0.115
 twins-0.106
icking-0.104
Token tried
Token ID3088
Feature activation-1.58e-01

Pos logit contributions
He+0.109
 steer+0.106
 distance+0.103
 paddle+0.102
 branch+0.099
Neg logit contributions
ours-0.129
uffy-0.125
 twin-0.123
 cousins-0.121
 best-0.120
Token to
Token ID284
Feature activation+1.44e-02

Pos logit contributions
 steer+0.349
 swoop+0.348
He+0.346
 paddle+0.346
 branch+0.338
Neg logit contributions
 twin-0.256
 fault-0.253
 best-0.250
uffy-0.248
 cousins-0.242
Token flap
Token ID37699
Feature activation-3.41e-01

Pos logit contributions
 steer+0.186
 paddle+0.182
He+0.175
 swoop+0.172
 branch+0.167
Neg logit contributions
 best-0.150
icking-0.147
 twin-0.144
 fault-0.138
apped-0.134
Token and
Token ID290
Feature activation-5.34e-02

Pos logit contributions
 steer+0.657
 paddle+0.652
 swoop+0.642
 branch+0.626
He+0.622
Neg logit contributions
 twin-0.655
 best-0.651
uffy-0.628
 fault-0.628
apped-0.623
Token they
Token ID484
Feature activation+3.31e-02

Pos logit contributions
He+0.091
 steer+0.085
 paddle+0.084
 branch+0.079
 distance+0.078
Neg logit contributions
 twin-0.110
 best-0.107
 cousins-0.106
 sisters-0.105
ours-0.103
Token began
Token ID2540
Feature activation-5.10e-02

Pos logit contributions
 twin+0.064
apped+0.062
 cousins+0.062
 best+0.062
uffy+0.061
Neg logit contributions
He-0.061
 steer-0.061
 paddle-0.059
 branch-0.058
 distance-0.055
Token to
Token ID284
Feature activation+2.09e-02

Pos logit contributions
He+0.119
 steer+0.117
 paddle+0.115
 swoop+0.113
 branch+0.112
Neg logit contributions
 twin-0.078
 best-0.076
 fault-0.075
 cousins-0.074
ours-0.073
Token swoop
Token ID38527
Feature activation-5.80e-01

Pos logit contributions
 best+0.047
 sisters+0.045
 twin+0.044
 fault+0.043
 brother+0.043
Neg logit contributions
He-0.033
His-0.026
 steer-0.025
Exc-0.025
bread-0.024
Token around
Token ID1088
Feature activation-6.69e-04

Pos logit contributions
 steer+1.186
 swoop+1.183
 paddle+1.171
 branch+1.164
He+1.163
Neg logit contributions
uffy-1.055
 twin-1.042
 cousins-1.020
ours-1.018
 best-1.017
Token it
Token ID340
Feature activation-1.32e-01

Pos logit contributions
He+0.001
 swoop+0.001
 steer+0.001
 paddle+0.001
 branch+0.001
Neg logit contributions
 twin-0.001
 best-0.001
uffy-0.001
 cousins-0.001
 fault-0.001
Token.
Token ID13
Feature activation-5.17e-02

Pos logit contributions
 steer+0.272
 paddle+0.270
He+0.263
 swoop+0.257
 branch+0.246
Neg logit contributions
 best-0.237
 twin-0.233
uffy-0.229
apped-0.221
 fault-0.217
Token They
Token ID1119
Feature activation+1.09e-01

Pos logit contributions
 swoop+0.111
 steer+0.111
 paddle+0.109
 branch+0.109
He+0.108
Neg logit contributions
uffy-0.089
 twin-0.089
ours-0.087
 cousins-0.086
 fault-0.085
Token were
Token ID547
Feature activation+1.44e-01

Pos logit contributions
 twin+0.214
 cousins+0.211
 best+0.209
 sisters+0.203
uffy+0.202
Neg logit contributions
He-0.189
 steer-0.183
 paddle-0.177
 branch-0.173
 distance-0.166
TokenOne
Token ID3198
Feature activation-1.60e-01

Pos logit contributions
 twin+0.027
uffy+0.026
 cousins+0.026
ours+0.025
 best+0.025
Neg logit contributions
 paddle-0.018
 steer-0.018
 swoop-0.018
 branch-0.017
 lift-0.017
Token day
Token ID1110
Feature activation+1.64e-01

Pos logit contributions
 branch+0.384
 swoop+0.382
 steer+0.377
He+0.376
 paddle+0.374
Neg logit contributions
 cousins-0.242
 fault-0.242
uffy-0.240
ours-0.237
omm-0.230
Token a
Token ID257
Feature activation-4.57e-03

Pos logit contributions
uffy+0.302
ours+0.298
 fault+0.298
 cousins+0.296
 twin+0.295
Neg logit contributions
 swoop-0.334
 steer-0.331
 paddle-0.330
He-0.328
 branch-0.326
Token friendly
Token ID8030
Feature activation-1.97e-01

Pos logit contributions
 steer+0.008
 swoop+0.008
He+0.008
 paddle+0.008
 branch+0.008
Neg logit contributions
 twin-0.010
uffy-0.009
 best-0.009
 cousins-0.009
 cousin-0.009
Token owl
Token ID39610
Feature activation-2.61e-01

Pos logit contributions
 steer+0.319
 swoop+0.316
He+0.312
 paddle+0.309
 branch+0.306
Neg logit contributions
 twin-0.449
uffy-0.448
 cousins-0.439
ours-0.433
 best-0.430
Token swoop
Token ID38527
Feature activation-5.78e-01

Pos logit contributions
 steer+0.520
 paddle+0.515
He+0.514
 branch+0.493
 distance+0.476
Neg logit contributions
 twin-0.450
uffy-0.449
ours-0.438
 cousins-0.434
apped-0.427
Tokened
Token ID276
Feature activation+1.71e-01

Pos logit contributions
 swoop+1.499
 steer+1.492
 paddle+1.477
He+1.474
 branch+1.472
Neg logit contributions
uffy-0.744
 twin-0.733
 cousins-0.721
ours-0.714
 fault-0.703
Token down
Token ID866
Feature activation-8.47e-02

Pos logit contributions
 twin+0.279
 best+0.274
uffy+0.272
 cousins+0.267
ours+0.266
Neg logit contributions
 steer-0.371
 paddle-0.366
He-0.357
 swoop-0.356
 branch-0.349
Token and
Token ID290
Feature activation-6.44e-02

Pos logit contributions
 paddle+0.191
 steer+0.190
 swoop+0.186
He+0.177
 branch+0.176
Neg logit contributions
 best-0.128
apped-0.127
 twin-0.127
ours-0.119
uffy-0.118
Token said
Token ID531
Feature activation-1.09e-02

Pos logit contributions
He+0.114
 steer+0.114
 paddle+0.110
 branch+0.107
 distance+0.106
Neg logit contributions
ours-0.125
uffy-0.122
 twin-0.121
 cousins-0.120
 best-0.118
Token hi
Token ID23105
Feature activation-1.01e-01

Pos logit contributions
 swoop+0.023
 steer+0.023
 paddle+0.023
He+0.023
 branch+0.023
Neg logit contributions
uffy-0.019
 twin-0.019
 cousins-0.019
ours-0.019
 fault-0.018
Token then
Token ID788
Feature activation-4.86e-02

Pos logit contributions
 branch+0.462
 swoop+0.459
 steer+0.436
He+0.434
 paddle+0.430
Neg logit contributions
 fault-0.717
uffy-0.716
ours-0.712
 twin-0.694
 cousins-0.693
Token an
Token ID281
Feature activation+8.76e-02

Pos logit contributions
 swoop+0.082
 steer+0.081
 branch+0.080
He+0.080
 paddle+0.080
Neg logit contributions
uffy-0.107
 twin-0.106
ours-0.105
 fault-0.105
 cousins-0.105
Token even
Token ID772
Feature activation-6.35e-02

Pos logit contributions
 twin+0.160
 best+0.158
 cousins+0.154
omm+0.150
 fault+0.150
Neg logit contributions
 steer-0.182
 paddle-0.177
 swoop-0.176
He-0.172
 lift-0.166
Token bigger
Token ID5749
Feature activation-1.50e-01

Pos logit contributions
 paddle+0.111
 steer+0.110
He+0.107
 swoop+0.105
 branch+0.105
Neg logit contributions
 best-0.135
 twin-0.134
 cousins-0.130
ours-0.129
uffy-0.127
Token pigeon
Token ID46387
Feature activation-2.60e-01

Pos logit contributions
 steer+0.254
He+0.248
 paddle+0.244
 swoop+0.242
 branch+0.231
Neg logit contributions
 twin-0.335
uffy-0.324
 best-0.321
 cousins-0.316
ours-0.309
Token swoop
Token ID38527
Feature activation-5.76e-01

Pos logit contributions
 steer+0.469
 paddle+0.462
He+0.455
 branch+0.450
 swoop+0.444
Neg logit contributions
 twin-0.532
uffy-0.532
 cousins-0.521
 best-0.514
ours-0.513
Tokened
Token ID276
Feature activation+1.74e-01

Pos logit contributions
 swoop+1.491
 steer+1.482
 paddle+1.468
He+1.467
 branch+1.462
Neg logit contributions
uffy-0.745
 twin-0.736
 cousins-0.724
ours-0.717
 fault-0.705
Token down
Token ID866
Feature activation-8.01e-02

Pos logit contributions
 best+0.289
 twin+0.286
uffy+0.281
 fault+0.275
ours+0.275
Neg logit contributions
 steer-0.371
 paddle-0.365
He-0.354
 swoop-0.352
 branch-0.347
Token and
Token ID290
Feature activation-6.02e-02

Pos logit contributions
 steer+0.175
 paddle+0.174
 swoop+0.171
 branch+0.163
He+0.162
Neg logit contributions
 best-0.130
 twin-0.128
apped-0.127
ours-0.123
uffy-0.122
Token landed
Token ID11406
Feature activation-2.27e-01

Pos logit contributions
He+0.102
 steer+0.099
 paddle+0.097
 distance+0.095
 branch+0.094
Neg logit contributions
ours-0.120
 twin-0.116
uffy-0.115
 best-0.113
 cousins-0.112
Token beside
Token ID13970
Feature activation-1.60e-01

Pos logit contributions
 steer+0.498
 paddle+0.491
 swoop+0.490
He+0.479
 branch+0.477
Neg logit contributions
uffy-0.386
 twin-0.384
 best-0.376
ours-0.376
 cousins-0.372
Token.
Token ID13
Feature activation-5.97e-02

Pos logit contributions
 swoop+0.387
 paddle+0.381
 steer+0.381
 branch+0.375
He+0.364
Neg logit contributions
 twin-0.375
 best-0.370
 fault-0.370
uffy-0.368
 cousins-0.365
Token Suddenly
Token ID24975
Feature activation-5.63e-03

Pos logit contributions
He+0.104
 branch+0.097
 swoop+0.097
 steer+0.096
 paddle+0.094
Neg logit contributions
 cousins-0.131
ours-0.129
uffy-0.129
 twin-0.128
 best-0.127
Token,
Token ID11
Feature activation+5.26e-02

Pos logit contributions
 swoop+0.011
 steer+0.011
 paddle+0.011
He+0.011
 branch+0.011
Neg logit contributions
uffy-0.011
 twin-0.011
 cousins-0.011
ours-0.010
 fault-0.010
Token an
Token ID281
Feature activation+8.82e-02

Pos logit contributions
 twin+0.127
uffy+0.126
 cousins+0.124
 best+0.124
ours+0.121
Neg logit contributions
 swoop-0.077
 steer-0.076
 paddle-0.074
He-0.074
 branch-0.073
Token owl
Token ID39610
Feature activation-2.56e-01

Pos logit contributions
 twin+0.164
 best+0.162
 cousins+0.159
omm+0.158
 fault+0.152
Neg logit contributions
 steer-0.180
 swoop-0.174
 paddle-0.174
He-0.171
 distance-0.164
Token swoop
Token ID38527
Feature activation-5.75e-01

Pos logit contributions
 paddle+0.493
 steer+0.490
He+0.482
 branch+0.453
 distance+0.453
Neg logit contributions
uffy-0.438
 twin-0.438
icking-0.434
 cousins-0.426
apped-0.424
Tokened
Token ID276
Feature activation+1.78e-01

Pos logit contributions
 steer+1.509
 swoop+1.503
 paddle+1.493
 branch+1.490
He+1.485
Neg logit contributions
uffy-0.716
 twin-0.703
 cousins-0.687
ours-0.681
 best-0.664
Token down
Token ID866
Feature activation-8.02e-02

Pos logit contributions
 twin+0.271
 best+0.270
apped+0.265
ours+0.257
uffy+0.256
Neg logit contributions
 steer-0.389
 paddle-0.384
He-0.370
 swoop-0.368
 branch-0.366
Token and
Token ID290
Feature activation-5.40e-02

Pos logit contributions
 steer+0.187
 paddle+0.186
 swoop+0.183
 branch+0.174
He+0.171
Neg logit contributions
apped-0.111
 best-0.111
 twin-0.107
icking-0.100
ours-0.098
Token snatched
Token ID43276
Feature activation-2.79e-01

Pos logit contributions
He+0.093
 steer+0.091
 distance+0.091
 paddle+0.091
 branch+0.086
Neg logit contributions
ours-0.100
uffy-0.099
 twin-0.098
 cousins-0.097
 best-0.096
Token up
Token ID510
Feature activation+1.40e-01

Pos logit contributions
 swoop+0.561
 steer+0.560
 paddle+0.554
He+0.551
 branch+0.550
Neg logit contributions
uffy-0.514
 twin-0.514
 cousins-0.504
ours-0.503
 best-0.496
Token in
Token ID287
Feature activation+6.34e-02

Pos logit contributions
 steer+0.231
 paddle+0.225
 swoop+0.224
He+0.224
 branch+0.220
Neg logit contributions
 twin-0.210
 best-0.206
uffy-0.204
ours-0.203
 cousins-0.202
Token the
Token ID262
Feature activation-1.98e-02

Pos logit contributions
 twin+0.092
 best+0.089
 twins+0.087
 living+0.082
 sisters+0.081
Neg logit contributions
 steer-0.144
 paddle-0.143
 swoop-0.138
He-0.136
bread-0.131
Token light
Token ID1657
Feature activation-2.18e-01

Pos logit contributions
 steer+0.033
 swoop+0.032
He+0.032
 paddle+0.032
 branch+0.031
Neg logit contributions
 best-0.043
 twin-0.043
uffy-0.042
 cousins-0.042
ours-0.041
Token.
Token ID13
Feature activation-7.03e-02

Pos logit contributions
 steer+0.428
 swoop+0.420
He+0.420
 paddle+0.419
 branch+0.399
Neg logit contributions
 twin-0.413
 best-0.400
uffy-0.398
ours-0.389
 fault-0.385
Token It
Token ID632
Feature activation-5.03e-02

Pos logit contributions
 steer+0.139
 swoop+0.139
 paddle+0.137
 branch+0.135
He+0.133
Neg logit contributions
uffy-0.134
 twin-0.133
ours-0.131
 cousins-0.129
 fault-0.127
Token swoop
Token ID38527
Feature activation-5.75e-01

Pos logit contributions
 steer+0.122
 paddle+0.120
 branch+0.119
He+0.117
 swoop+0.116
Neg logit contributions
 twin-0.070
 best-0.068
uffy-0.067
ours-0.066
 cousins-0.063
Tokened
Token ID276
Feature activation+1.70e-01

Pos logit contributions
 swoop+1.489
 steer+1.483
 paddle+1.469
He+1.465
 branch+1.462
Neg logit contributions
uffy-0.743
 twin-0.733
 cousins-0.721
ours-0.714
 fault-0.701
Token down
Token ID866
Feature activation-8.39e-02

Pos logit contributions
 best+0.289
 twin+0.280
apped+0.275
uffy+0.274
ours+0.274
Neg logit contributions
 steer-0.355
He-0.345
 paddle-0.344
 swoop-0.343
 branch-0.336
Token and
Token ID290
Feature activation-7.46e-02

Pos logit contributions
 steer+0.187
 swoop+0.184
 paddle+0.183
He+0.176
 branch+0.175
Neg logit contributions
 best-0.134
apped-0.131
 twin-0.127
ours-0.123
icking-0.122
Token started
Token ID2067
Feature activation-4.12e-02

Pos logit contributions
He+0.123
 steer+0.118
 branch+0.116
 distance+0.113
 paddle+0.112
Neg logit contributions
ours-0.152
apped-0.144
 twin-0.144
 best-0.143
uffy-0.143
Token flying
Token ID7348
Feature activation-1.50e-01

Pos logit contributions
He+0.097
 steer+0.095
 paddle+0.092
 branch+0.092
 swoop+0.092
Neg logit contributions
 twin-0.059
 best-0.058
 fault-0.057
ours-0.055
 cousins-0.053
Token was
Token ID373
Feature activation+1.68e-01

Pos logit contributions
uffy+0.065
 twin+0.064
 cousins+0.063
ours+0.063
 best+0.062
Neg logit contributions
 steer-0.077
 paddle-0.076
 swoop-0.076
 branch-0.076
He-0.075
Token playing
Token ID2712
Feature activation+3.06e-01

Pos logit contributions
uffy+0.320
 twin+0.312
ours+0.311
 cousins+0.310
 fault+0.304
Neg logit contributions
 swoop-0.338
 steer-0.333
 branch-0.332
 paddle-0.330
He-0.327
Token outside
Token ID2354
Feature activation+5.43e-02

Pos logit contributions
 twin+0.496
 best+0.494
 cousins+0.470
apped+0.464
uffy+0.462
Neg logit contributions
 swoop-0.669
 steer-0.665
 paddle-0.657
He-0.642
 branch-0.619
Token.
Token ID13
Feature activation-6.56e-02

Pos logit contributions
 twin+0.109
uffy+0.108
 cousins+0.106
ours+0.105
 best+0.105
Neg logit contributions
 swoop-0.101
 steer-0.101
 paddle-0.099
He-0.098
 branch-0.098
Token It
Token ID632
Feature activation-4.45e-02

Pos logit contributions
 swoop+0.120
 steer+0.120
 paddle+0.118
 branch+0.117
He+0.117
Neg logit contributions
uffy-0.134
 twin-0.133
 cousins-0.131
ours-0.131
 fault-0.129
Token swoop
Token ID38527
Feature activation-5.74e-01

Pos logit contributions
 steer+0.106
 paddle+0.105
 branch+0.103
He+0.103
 swoop+0.102
Neg logit contributions
 twin-0.066
uffy-0.064
 best-0.063
 cousins-0.062
ours-0.061
Tokened
Token ID276
Feature activation+1.76e-01

Pos logit contributions
 swoop+1.464
 steer+1.450
He+1.440
 paddle+1.437
 branch+1.429
Neg logit contributions
uffy-0.768
 twin-0.763
 cousins-0.752
ours-0.748
 fault-0.737
Token down
Token ID866
Feature activation-8.10e-02

Pos logit contributions
 twin+0.306
 best+0.305
uffy+0.301
 cousins+0.297
ours+0.293
Neg logit contributions
 steer-0.362
 swoop-0.357
 paddle-0.354
He-0.352
 branch-0.349
Token close
Token ID1969
Feature activation-2.55e-01

Pos logit contributions
 steer+0.172
 swoop+0.171
 paddle+0.169
 branch+0.163
He+0.163
Neg logit contributions
 best-0.138
 twin-0.137
apped-0.134
uffy-0.132
ours-0.131
Token to
Token ID284
Feature activation+1.84e-02

Pos logit contributions
 steer+0.520
 swoop+0.520
 paddle+0.515
He+0.509
 branch+0.502
Neg logit contributions
 twin-0.458
uffy-0.452
 best-0.449
 fault-0.447
 cousins-0.444
Token the
Token ID262
Feature activation-1.52e-02

Pos logit contributions
 twin+0.032
 best+0.032
 cousins+0.031
ours+0.030
uffy+0.030
Neg logit contributions
He-0.037
 swoop-0.037
 paddle-0.036
 steer-0.036
 branch-0.035
Token but
Token ID475
Feature activation-2.53e-01

Pos logit contributions
 twin+0.113
 cousins+0.110
 best+0.108
 twins+0.106
 sisters+0.104
Neg logit contributions
 swoop-0.079
 steer-0.075
 paddle-0.074
 branch-0.071
He-0.069
Token a
Token ID257
Feature activation+8.44e-03

Pos logit contributions
 swoop+0.299
 steer+0.295
He+0.292
 paddle+0.292
 branch+0.286
Neg logit contributions
uffy-0.699
 fault-0.684
ours-0.683
 twin-0.683
 cousins-0.678
Token se
Token ID384
Feature activation-2.91e-01

Pos logit contributions
 twin+0.017
 best+0.017
uffy+0.016
 cousins+0.016
 cousin+0.016
Neg logit contributions
 steer-0.016
He-0.016
 swoop-0.015
 paddle-0.015
 reach-0.015
Tokenag
Token ID363
Feature activation+5.55e-02

Pos logit contributions
 steer+0.651
 paddle+0.647
 swoop+0.646
 branch+0.641
He+0.638
Neg logit contributions
uffy-0.475
 twin-0.468
 cousins-0.458
ours-0.457
 fault-0.454
Tokenull
Token ID724
Feature activation-3.59e-02

Pos logit contributions
 twin+0.102
uffy+0.101
apped+0.098
ours+0.097
 cousins+0.097
Neg logit contributions
 steer-0.111
 paddle-0.110
 swoop-0.109
He-0.108
 branch-0.108
Token swoop
Token ID38527
Feature activation-5.73e-01

Pos logit contributions
 steer+0.072
He+0.072
 paddle+0.071
 branch+0.070
 distance+0.069
Neg logit contributions
uffy-0.066
 twin-0.063
icking-0.061
 best-0.061
 cousins-0.060
Tokened
Token ID276
Feature activation+1.80e-01

Pos logit contributions
 swoop+1.432
 steer+1.413
He+1.403
 paddle+1.399
 branch+1.390
Neg logit contributions
uffy-0.802
 twin-0.799
 cousins-0.790
ours-0.782
 fault-0.772
Token down
Token ID866
Feature activation-7.73e-02

Pos logit contributions
 best+0.248
 twin+0.247
apped+0.239
ours+0.237
 fault+0.230
Neg logit contributions
 steer-0.423
 paddle-0.418
He-0.410
 branch-0.408
 swoop-0.405
Token and
Token ID290
Feature activation-5.92e-02

Pos logit contributions
 paddle+0.178
 steer+0.178
 swoop+0.178
 branch+0.172
He+0.165
Neg logit contributions
 best-0.114
apped-0.112
 twin-0.111
ours-0.107
icking-0.105
Token snatched
Token ID43276
Feature activation-2.79e-01

Pos logit contributions
He+0.111
 distance+0.107
 branch+0.105
 steer+0.104
 paddle+0.101
Neg logit contributions
ours-0.112
uffy-0.105
 twin-0.104
 best-0.102
apped-0.100
Token it
Token ID340
Feature activation-1.32e-01

Pos logit contributions
 swoop+0.517
 steer+0.517
He+0.512
 paddle+0.511
 branch+0.509
Neg logit contributions
 twin-0.551
ours-0.544
uffy-0.543
 cousins-0.537
 best-0.537
TokenJust
Token ID5703
Feature activation-2.72e-01

Pos logit contributions
 twin+0.037
uffy+0.036
anny+0.035
 cousins+0.034
omm+0.034
Neg logit contributions
 paddle-0.025
 steer-0.023
 swoop-0.023
 branch-0.022
 lift-0.022
Token then
Token ID788
Feature activation-4.56e-02

Pos logit contributions
He+0.531
 steer+0.525
 paddle+0.523
 swoop+0.522
 branch+0.505
Neg logit contributions
uffy-0.499
 twin-0.499
 best-0.497
ours-0.487
 cousins-0.480
Token,
Token ID11
Feature activation+4.55e-02

Pos logit contributions
 swoop+0.098
 steer+0.098
 paddle+0.097
He+0.096
 branch+0.096
Neg logit contributions
uffy-0.079
 twin-0.078
 cousins-0.077
ours-0.077
 fault-0.076
Token an
Token ID281
Feature activation+9.50e-02

Pos logit contributions
 twin+0.084
uffy+0.084
 cousins+0.082
 best+0.081
ours+0.080
Neg logit contributions
 swoop-0.092
 steer-0.092
He-0.090
 paddle-0.090
 branch-0.089
Token owl
Token ID39610
Feature activation-2.53e-01

Pos logit contributions
 twin+0.180
 best+0.175
 cousins+0.172
omm+0.169
ours+0.168
Neg logit contributions
 steer-0.191
He-0.185
 paddle-0.185
 swoop-0.184
 branch-0.174
Token swoop
Token ID38527
Feature activation-5.73e-01

Pos logit contributions
 steer+0.448
 paddle+0.446
He+0.427
 branch+0.413
 distance+0.406
Neg logit contributions
uffy-0.502
 twin-0.490
ours-0.484
 cousins-0.474
 best-0.472
Tokened
Token ID276
Feature activation+1.78e-01

Pos logit contributions
 swoop+1.494
 steer+1.494
 paddle+1.478
He+1.473
 branch+1.473
Neg logit contributions
uffy-0.727
 twin-0.715
 cousins-0.700
ours-0.696
 fault-0.680
Token down
Token ID866
Feature activation-7.49e-02

Pos logit contributions
 twin+0.272
 best+0.271
apped+0.266
ours+0.264
uffy+0.259
Neg logit contributions
 steer-0.396
 paddle-0.392
He-0.380
 swoop-0.375
 branch-0.370
Token from
Token ID422
Feature activation-4.17e-02

Pos logit contributions
 paddle+0.175
 steer+0.175
 swoop+0.171
 branch+0.162
He+0.161
Neg logit contributions
apped-0.105
 best-0.104
 twin-0.100
ours-0.094
icking-0.093
Token the
Token ID262
Feature activation-1.36e-02

Pos logit contributions
He+0.089
 paddle+0.088
 steer+0.087
 swoop+0.082
bread+0.080
Neg logit contributions
 best-0.061
ours-0.061
 cousins-0.060
 twin-0.060
 twins-0.060
Token sky
Token ID6766
Feature activation-3.41e-02

Pos logit contributions
 steer+0.022
He+0.021
 swoop+0.021
 paddle+0.020
 lift+0.019
Neg logit contributions
 twin-0.031
 best-0.031
 twins-0.030
uffy-0.030
ours-0.029
Token He
Token ID679
Feature activation-2.76e-01

Pos logit contributions
 swoop+0.103
 steer+0.103
 paddle+0.102
 branch+0.101
He+0.101
Neg logit contributions
uffy-0.110
 twin-0.109
 cousins-0.107
ours-0.107
 fault-0.105
Token was
Token ID373
Feature activation+1.72e-01

Pos logit contributions
 steer+0.516
 swoop+0.513
 paddle+0.510
He+0.510
 branch+0.506
Neg logit contributions
uffy-0.548
 twin-0.546
 cousins-0.536
ours-0.534
 best-0.525
Token loop
Token ID9052
Feature activation-1.91e-01

Pos logit contributions
 twin+0.330
uffy+0.329
 cousins+0.324
ours+0.321
 best+0.320
Neg logit contributions
 steer-0.337
 swoop-0.334
 paddle-0.332
He-0.329
 branch-0.326
Tokening
Token ID278
Feature activation+2.25e-01

Pos logit contributions
 steer+0.439
 paddle+0.437
 swoop+0.432
 branch+0.429
He+0.423
Neg logit contributions
uffy-0.304
 twin-0.298
 best-0.295
 fault-0.286
 cousins-0.281
Token and
Token ID290
Feature activation-5.24e-02

Pos logit contributions
 best+0.398
 twin+0.381
 fault+0.376
uffy+0.375
ours+0.361
Neg logit contributions
 steer-0.465
 paddle-0.463
 swoop-0.452
He-0.450
 branch-0.446
Token swoop
Token ID38527
Feature activation-5.73e-01

Pos logit contributions
 steer+0.093
 paddle+0.093
He+0.093
 swoop+0.090
 branch+0.089
Neg logit contributions
 twin-0.106
 best-0.105
 cousins-0.104
uffy-0.103
ours-0.102
Tokening
Token ID278
Feature activation+2.25e-01

Pos logit contributions
 swoop+1.490
 steer+1.490
 paddle+1.475
 branch+1.469
He+1.467
Neg logit contributions
uffy-0.729
 twin-0.717
 cousins-0.701
ours-0.696
 fault-0.682
Token through
Token ID832
Feature activation-5.35e-02

Pos logit contributions
 best+0.349
apped+0.329
 twin+0.326
 fault+0.309
uffy+0.299
Neg logit contributions
 steer-0.494
He-0.488
 paddle-0.485
 distance-0.463
 swoop-0.462
Token the
Token ID262
Feature activation-1.12e-02

Pos logit contributions
 paddle+0.122
 swoop+0.118
 steer+0.118
He+0.114
 lift+0.111
Neg logit contributions
 twin-0.086
 best-0.081
 twins-0.077
 cousins-0.076
 fault-0.076
Token air
Token ID1633
Feature activation-1.55e-02

Pos logit contributions
 steer+0.022
 swoop+0.021
 paddle+0.021
He+0.021
 lift+0.020
Neg logit contributions
 twin-0.022
 best-0.021
 cousins-0.021
uffy-0.021
 fault-0.021
Token faster
Token ID5443
Feature activation-3.73e-01

Pos logit contributions
 steer+0.032
 paddle+0.032
 swoop+0.031
He+0.031
 branch+0.029
Neg logit contributions
 twin-0.027
 best-0.027
 fault-0.026
uffy-0.026
icking-0.025
Token eat
Token ID4483
Feature activation-2.54e-01

Pos logit contributions
 twin+0.051
uffy+0.049
 fault+0.049
 best+0.048
 cousins+0.048
Neg logit contributions
He-0.040
 winding-0.036
 distance-0.035
 branch-0.035
bread-0.035
Token them
Token ID606
Feature activation+1.62e-01

Pos logit contributions
 swoop+0.615
 steer+0.614
 paddle+0.609
He+0.605
 branch+0.605
Neg logit contributions
uffy-0.375
 twin-0.368
 cousins-0.364
ours-0.359
 fault-0.354
Token.
Token ID13
Feature activation-5.49e-02

Pos logit contributions
uffy+0.341
 twin+0.338
 cousins+0.333
ours+0.332
 fault+0.328
Neg logit contributions
 swoop-0.289
 steer-0.288
 paddle-0.284
He-0.282
 branch-0.282
Token The
Token ID383
Feature activation+1.99e-02

Pos logit contributions
 steer+0.095
 swoop+0.094
 branch+0.093
 paddle+0.092
He+0.091
Neg logit contributions
uffy-0.119
 twin-0.117
ours-0.116
 cousins-0.114
 best-0.113
Token bird
Token ID6512
Feature activation-2.05e-01

Pos logit contributions
 best+0.048
 twin+0.048
 cousins+0.046
 twins+0.046
uffy+0.045
Neg logit contributions
 steer-0.032
He-0.031
 paddle-0.030
 swoop-0.029
 reach-0.028
Token swoop
Token ID38527
Feature activation-5.73e-01

Pos logit contributions
 steer+0.415
 paddle+0.411
He+0.402
 branch+0.396
 swoop+0.390
Neg logit contributions
uffy-0.381
 twin-0.379
 cousins-0.363
 best-0.362
ours-0.354
Tokened
Token ID276
Feature activation+1.82e-01

Pos logit contributions
 swoop+1.438
 steer+1.413
He+1.408
 paddle+1.404
 branch+1.390
Neg logit contributions
uffy-0.794
 twin-0.788
 cousins-0.786
ours-0.780
 fault-0.779
Token down
Token ID866
Feature activation-7.62e-02

Pos logit contributions
 twin+0.327
 best+0.324
uffy+0.319
apped+0.313
ours+0.311
Neg logit contributions
 steer-0.366
He-0.353
 branch-0.353
 paddle-0.351
 swoop-0.351
Token and
Token ID290
Feature activation-5.06e-02

Pos logit contributions
 steer+0.168
 swoop+0.165
 paddle+0.165
 branch+0.159
He+0.156
Neg logit contributions
 twin-0.123
 best-0.122
apped-0.121
uffy-0.115
ours-0.112
Token grabbed
Token ID13176
Feature activation-1.70e-01

Pos logit contributions
He+0.089
 distance+0.086
 steer+0.085
 branch+0.081
 paddle+0.080
Neg logit contributions
ours-0.097
uffy-0.094
 twin-0.092
 best-0.089
 cousins-0.087
Token Fin
Token ID4463
Feature activation-1.45e-01

Pos logit contributions
 steer+0.301
 swoop+0.298
He+0.298
 branch+0.296
 paddle+0.291
Neg logit contributions
 twin-0.353
uffy-0.351
ours-0.346
 best-0.345
 cousins-0.344
Token park
Token ID3952
Feature activation+2.58e-02

Pos logit contributions
 steer+0.024
 swoop+0.023
 paddle+0.023
He+0.022
 branch+0.022
Neg logit contributions
 twin-0.019
uffy-0.019
 best-0.019
 cousins-0.019
ours-0.019
Token when
Token ID618
Feature activation+2.41e-02

Pos logit contributions
 twin+0.044
 best+0.043
uffy+0.042
 cousins+0.041
apped+0.041
Neg logit contributions
 paddle-0.057
 steer-0.057
 swoop-0.057
He-0.055
 branch-0.052
Token a
Token ID257
Feature activation+2.44e-03

Pos logit contributions
uffy+0.058
 twin+0.057
 cousins+0.057
ours+0.056
 fault+0.056
Neg logit contributions
 swoop-0.036
 steer-0.036
 paddle-0.035
He-0.035
 branch-0.035
Token big
Token ID1263
Feature activation-1.32e-01

Pos logit contributions
 twin+0.005
uffy+0.005
 best+0.005
 cousins+0.005
 cousin+0.005
Neg logit contributions
 steer-0.005
 paddle-0.004
 swoop-0.004
He-0.004
 branch-0.004
Token bird
Token ID6512
Feature activation-2.06e-01

Pos logit contributions
He+0.242
 steer+0.238
 swoop+0.227
 paddle+0.223
 branch+0.214
Neg logit contributions
 twin-0.281
uffy-0.275
 cousins-0.270
ours-0.267
 best-0.266
Token swoop
Token ID38527
Feature activation-5.73e-01

Pos logit contributions
 steer+0.425
 paddle+0.425
He+0.416
 branch+0.388
 lift+0.378
Neg logit contributions
uffy-0.356
 twin-0.340
ours-0.335
apped-0.334
icking-0.323
Tokened
Token ID276
Feature activation+1.78e-01

Pos logit contributions
 steer+1.487
 swoop+1.486
 paddle+1.471
He+1.465
 branch+1.464
Neg logit contributions
uffy-0.732
 twin-0.719
 cousins-0.704
ours-0.700
 fault-0.683
Token down
Token ID866
Feature activation-7.88e-02

Pos logit contributions
 twin+0.284
 best+0.282
uffy+0.277
apped+0.276
ours+0.276
Neg logit contributions
 steer-0.394
 paddle-0.385
He-0.379
 swoop-0.370
 branch-0.364
Token from
Token ID422
Feature activation-3.99e-02

Pos logit contributions
 paddle+0.181
 steer+0.180
 swoop+0.176
He+0.169
 branch+0.166
Neg logit contributions
apped-0.119
 twin-0.117
 best-0.117
ours-0.115
uffy-0.111
Token the
Token ID262
Feature activation-8.78e-03

Pos logit contributions
He+0.085
 paddle+0.083
 steer+0.083
 swoop+0.077
 lift+0.076
Neg logit contributions
ours-0.062
 fault-0.060
 best-0.060
 twin-0.060
 twins-0.060
Token sky
Token ID6766
Feature activation-3.01e-02

Pos logit contributions
 steer+0.014
He+0.013
 paddle+0.013
 swoop+0.012
 lift+0.012
Neg logit contributions
 best-0.021
 twin-0.021
 twins-0.020
 cousins-0.020
uffy-0.019
Token.
Token ID13
Feature activation-6.02e-02

Pos logit contributions
 swoop+0.347
 paddle+0.339
 steer+0.337
He+0.334
 branch+0.328
Neg logit contributions
 twin-0.311
 best-0.308
uffy-0.305
icking-0.301
 cousins-0.296
Token\n
Token ID198
Feature activation+2.31e-02

Pos logit contributions
 swoop+0.128
 steer+0.126
 paddle+0.125
 branch+0.123
He+0.119
Neg logit contributions
 twin-0.106
uffy-0.105
ours-0.101
 cousins-0.100
omm-0.099
Token\n
Token ID198
Feature activation+2.20e-02

Pos logit contributions
uffy+0.036
 twin+0.036
anny+0.034
ours+0.033
 best+0.033
Neg logit contributions
 paddle-0.053
 swoop-0.052
 steer-0.052
 branch-0.051
 lift-0.049
TokenThe
Token ID464
Feature activation-2.71e-01

Pos logit contributions
 twin+0.046
anny+0.044
uffy+0.043
omm+0.042
 cousins+0.041
Neg logit contributions
 paddle-0.036
 swoop-0.034
 branch-0.033
 steer-0.033
 move-0.032
Token eagle
Token ID31176
Feature activation-4.47e-01

Pos logit contributions
 steer+0.273
He+0.250
 paddle+0.248
 swoop+0.226
 lift+0.217
Neg logit contributions
 best-0.819
 twin-0.758
 cousins-0.747
 siblings-0.741
 twins-0.740
Token swoop
Token ID38527
Feature activation-5.72e-01

Pos logit contributions
 paddle+0.936
 steer+0.929
He+0.903
 branch+0.885
 distance+0.869
Neg logit contributions
 twin-0.773
uffy-0.750
 best-0.740
ours-0.723
icking-0.720
Tokened
Token ID276
Feature activation+1.78e-01

Pos logit contributions
 steer+1.516
 swoop+1.513
 paddle+1.500
 branch+1.496
He+1.492
Neg logit contributions
uffy-0.700
 twin-0.687
 cousins-0.669
ours-0.666
 best-0.648
Token down
Token ID866
Feature activation-7.92e-02

Pos logit contributions
 best+0.283
 twin+0.269
apped+0.263
ours+0.253
icking+0.251
Neg logit contributions
 steer-0.383
 paddle-0.376
He-0.370
 swoop-0.366
 branch-0.360
Token from
Token ID422
Feature activation-4.02e-02

Pos logit contributions
 steer+0.179
 paddle+0.178
 swoop+0.176
 branch+0.166
He+0.166
Neg logit contributions
 best-0.120
apped-0.117
 twin-0.113
icking-0.108
ours-0.105
Token the
Token ID262
Feature activation-7.20e-03

Pos logit contributions
He+0.085
 paddle+0.083
 steer+0.083
 swoop+0.080
 lift+0.076
Neg logit contributions
 best-0.063
 twin-0.061
 cousins-0.060
 fault-0.060
ours-0.059
Token sky
Token ID6766
Feature activation-3.34e-02

Pos logit contributions
 steer+0.010
He+0.010
 paddle+0.009
 swoop+0.009
 lift+0.008
Neg logit contributions
 best-0.018
 twin-0.017
 twins-0.016
 brother-0.016
 sisters-0.016
Token was
Token ID373
Feature activation+1.71e-01

Pos logit contributions
 steer+0.881
 paddle+0.880
He+0.876
 branch+0.850
 swoop+0.842
Neg logit contributions
uffy-0.537
 twin-0.532
 best-0.510
 cousins-0.486
ours-0.482
Token faster
Token ID5443
Feature activation-3.67e-01

Pos logit contributions
 best+0.307
 twin+0.306
 fault+0.289
 cousins+0.289
uffy+0.285
Neg logit contributions
 paddle-0.358
 steer-0.354
He-0.341
 swoop-0.338
 branch-0.323
Token.
Token ID13
Feature activation-6.03e-02

Pos logit contributions
 paddle+0.724
 swoop+0.723
 steer+0.711
He+0.690
 branch+0.684
Neg logit contributions
 fault-0.702
 twin-0.695
 best-0.692
uffy-0.684
 cousins-0.664
Token The
Token ID383
Feature activation+1.71e-02

Pos logit contributions
 swoop+0.124
 steer+0.122
 paddle+0.121
 branch+0.119
He+0.117
Neg logit contributions
 twin-0.112
uffy-0.111
ours-0.110
 fault-0.108
 cousins-0.106
Token hawk
Token ID48710
Feature activation-3.69e-01

Pos logit contributions
 best+0.043
 twin+0.042
 twins+0.042
uffy+0.041
 cousins+0.040
Neg logit contributions
 steer-0.025
He-0.023
 swoop-0.022
 paddle-0.022
 reach-0.021
Token swoop
Token ID38527
Feature activation-5.72e-01

Pos logit contributions
He+0.725
 paddle+0.725
 steer+0.717
 branch+0.678
 distance+0.660
Neg logit contributions
 twin-0.671
uffy-0.655
 best-0.645
 cousins-0.605
icking-0.604
Tokened
Token ID276
Feature activation+1.86e-01

Pos logit contributions
 swoop+1.442
 steer+1.421
He+1.414
 paddle+1.405
 branch+1.401
Neg logit contributions
uffy-0.790
 cousins-0.786
 twin-0.783
ours-0.772
 fault-0.766
Token down
Token ID866
Feature activation-7.37e-02

Pos logit contributions
 best+0.280
 twin+0.278
apped+0.259
ours+0.257
uffy+0.252
Neg logit contributions
 steer-0.409
 paddle-0.408
 swoop-0.392
He-0.390
 distance-0.382
Token and
Token ID290
Feature activation-6.03e-02

Pos logit contributions
 paddle+0.171
 swoop+0.168
 steer+0.168
 branch+0.158
 distance+0.156
Neg logit contributions
 best-0.110
 twin-0.109
apped-0.104
 twins-0.098
ours-0.097
Token grabbed
Token ID13176
Feature activation-1.67e-01

Pos logit contributions
He+0.102
 paddle+0.098
 distance+0.098
 steer+0.097
 branch+0.094
Neg logit contributions
ours-0.116
 twin-0.111
uffy-0.107
icked-0.106
 best-0.106
Token the
Token ID262
Feature activation-1.78e-02

Pos logit contributions
 swoop+0.342
He+0.340
 steer+0.340
 paddle+0.336
 branch+0.331
Neg logit contributions
 twin-0.302
ours-0.293
uffy-0.292
 best-0.291
 cousins-0.286
Token when
Token ID618
Feature activation+2.89e-02

Pos logit contributions
 steer+0.466
 swoop+0.464
 branch+0.459
 paddle+0.458
He+0.455
Neg logit contributions
uffy-0.572
 cousins-0.558
 twin-0.558
ours-0.556
 fault-0.548
Token the
Token ID262
Feature activation-1.53e-02

Pos logit contributions
uffy+0.060
ours+0.059
 twin+0.058
 cousins+0.058
 fault+0.057
Neg logit contributions
 steer-0.053
 swoop-0.053
 paddle-0.053
 branch-0.052
He-0.052
Token se
Token ID384
Feature activation-2.90e-01

Pos logit contributions
 steer+0.026
 swoop+0.025
He+0.025
 branch+0.025
 paddle+0.024
Neg logit contributions
 best-0.035
 twin-0.035
 twins-0.033
 cousins-0.033
uffy-0.032
Tokenag
Token ID363
Feature activation+5.58e-02

Pos logit contributions
 swoop+0.615
 steer+0.601
 paddle+0.586
He+0.585
 branch+0.582
Neg logit contributions
uffy-0.526
 cousins-0.522
 twin-0.518
ours-0.509
 best-0.501
Tokenull
Token ID724
Feature activation-3.54e-02

Pos logit contributions
 cousins+0.120
uffy+0.120
 twin+0.117
ours+0.117
 fault+0.117
Neg logit contributions
 swoop-0.098
 steer-0.097
 paddle-0.095
He-0.095
 branch-0.094
Token swoop
Token ID38527
Feature activation-5.72e-01

Pos logit contributions
 steer+0.070
He+0.069
 branch+0.069
 paddle+0.069
 swoop+0.067
Neg logit contributions
 twin-0.066
uffy-0.066
 cousins-0.063
 best-0.063
ours-0.062
Tokened
Token ID276
Feature activation+1.79e-01

Pos logit contributions
 swoop+1.408
 steer+1.386
 paddle+1.378
He+1.374
 branch+1.358
Neg logit contributions
uffy-0.826
 cousins-0.819
 twin-0.818
ours-0.809
 fault-0.800
Token down
Token ID866
Feature activation-7.85e-02

Pos logit contributions
 twin+0.310
 best+0.304
uffy+0.303
ours+0.299
 fault+0.298
Neg logit contributions
 steer-0.374
 swoop-0.371
He-0.370
 paddle-0.366
 branch-0.366
Token and
Token ID290
Feature activation-6.01e-02

Pos logit contributions
 swoop+0.169
 steer+0.168
 paddle+0.165
 branch+0.163
He+0.162
Neg logit contributions
 twin-0.133
 best-0.131
apped-0.129
ours-0.129
uffy-0.128
Token snatched
Token ID43276
Feature activation-2.80e-01

Pos logit contributions
He+0.104
 steer+0.098
 branch+0.097
 paddle+0.094
 swoop+0.092
Neg logit contributions
 twin-0.126
ours-0.125
 cousins-0.124
uffy-0.123
 best-0.121
Token Tim
Token ID5045
Feature activation+1.30e-01

Pos logit contributions
 steer+0.525
 swoop+0.525
 paddle+0.520
 branch+0.513
He+0.508
Neg logit contributions
uffy-0.577
 twin-0.559
 cousins-0.558
ours-0.550
 fault-0.546
Token a
Token ID257
Feature activation+1.06e-02

Pos logit contributions
uffy+0.160
 cousins+0.157
ours+0.157
 twin+0.156
 fault+0.152
Neg logit contributions
 swoop-0.159
 branch-0.158
 steer-0.157
 paddle-0.156
He-0.152
Token hawk
Token ID48710
Feature activation-3.72e-01

Pos logit contributions
 twin+0.025
uffy+0.024
 best+0.024
 cousins+0.023
 twins+0.023
Neg logit contributions
 steer-0.017
He-0.016
 paddle-0.016
 reach-0.015
 swoop-0.015
Token.
Token ID13
Feature activation-5.54e-02

Pos logit contributions
He+0.775
 steer+0.763
 paddle+0.741
 branch+0.711
 reach+0.709
Neg logit contributions
uffy-0.647
 twin-0.636
 best-0.605
 cousins-0.602
icking-0.599
Token The
Token ID383
Feature activation+2.46e-02

Pos logit contributions
 steer+0.115
 swoop+0.114
 paddle+0.112
 branch+0.112
He+0.111
Neg logit contributions
uffy-0.100
 twin-0.100
ours-0.098
 cousins-0.097
 best-0.095
Token hawk
Token ID48710
Feature activation-3.72e-01

Pos logit contributions
 best+0.069
 twin+0.064
 twins+0.063
 cousins+0.062
 siblings+0.061
Neg logit contributions
 steer-0.031
He-0.029
 paddle-0.028
 swoop-0.028
 reach-0.027
Token swoop
Token ID38527
Feature activation-5.72e-01

Pos logit contributions
 steer+0.775
 paddle+0.760
He+0.747
 branch+0.732
 reach+0.718
Neg logit contributions
 twin-0.630
uffy-0.628
 best-0.590
 cousins-0.584
icking-0.577
Tokened
Token ID276
Feature activation+1.79e-01

Pos logit contributions
 swoop+1.491
 steer+1.486
 paddle+1.472
He+1.467
 branch+1.465
Neg logit contributions
uffy-0.729
 twin-0.718
 cousins-0.704
ours-0.699
 fault-0.686
Token down
Token ID866
Feature activation-7.45e-02

Pos logit contributions
 best+0.309
apped+0.300
 twin+0.300
uffy+0.298
ours+0.288
Neg logit contributions
 steer-0.354
He-0.348
 paddle-0.342
 branch-0.342
 swoop-0.337
Token and
Token ID290
Feature activation-4.95e-02

Pos logit contributions
 steer+0.166
 swoop+0.164
 paddle+0.163
 branch+0.158
He+0.157
Neg logit contributions
apped-0.113
 best-0.113
 twin-0.112
uffy-0.106
icking-0.104
Token grabbed
Token ID13176
Feature activation-1.67e-01

Pos logit contributions
He+0.087
 steer+0.083
 distance+0.081
 branch+0.080
 paddle+0.080
Neg logit contributions
uffy-0.093
 twin-0.091
ours-0.091
 best-0.088
 cousins-0.086
Token S
Token ID311
Feature activation+4.16e-02

Pos logit contributions
 steer+0.336
He+0.335
 swoop+0.334
 branch+0.330
 paddle+0.328
Neg logit contributions
 twin-0.306
uffy-0.304
ours-0.299
 best-0.298
 cousins-0.295
Token out
Token ID503
Feature activation+7.18e-02

Pos logit contributions
 swoop+0.080
 steer+0.080
 paddle+0.079
He+0.079
 branch+0.079
Neg logit contributions
uffy-0.064
 twin-0.064
 cousins-0.063
ours-0.062
 fault-0.061
Token a
Token ID257
Feature activation+5.46e-03

Pos logit contributions
 twin+0.131
uffy+0.128
 cousins+0.127
ours+0.126
 best+0.125
Neg logit contributions
 swoop-0.149
 steer-0.147
 paddle-0.145
 branch-0.145
He-0.143
Token fishing
Token ID12478
Feature activation-1.89e-01

Pos logit contributions
 twin+0.012
 best+0.012
uffy+0.011
 cousins+0.011
 brother+0.011
Neg logit contributions
He-0.009
 steer-0.009
 swoop-0.008
 branch-0.008
Leave-0.007
Token rod
Token ID15299
Feature activation-1.92e-01

Pos logit contributions
 swoop+0.284
He+0.280
 steer+0.266
 branch+0.259
 paddle+0.255
Neg logit contributions
 twin-0.438
 best-0.435
uffy-0.422
ours-0.421
apped-0.416
Token and
Token ID290
Feature activation-4.24e-02

Pos logit contributions
 swoop+0.423
 steer+0.417
 paddle+0.411
He+0.403
 branch+0.391
Neg logit contributions
 twin-0.309
icking-0.295
apped-0.295
 best-0.292
 fault-0.289
Token swoop
Token ID38527
Feature activation-5.72e-01

Pos logit contributions
He+0.074
 steer+0.072
 branch+0.071
 swoop+0.071
 paddle+0.070
Neg logit contributions
 twin-0.087
 best-0.083
ours-0.083
 cousins-0.083
uffy-0.081
Tokened
Token ID276
Feature activation+1.80e-01

Pos logit contributions
 steer+1.545
 swoop+1.536
 paddle+1.528
 branch+1.526
He+1.515
Neg logit contributions
uffy-0.664
 twin-0.649
ours-0.627
 cousins-0.626
 best-0.622
Token it
Token ID340
Feature activation-1.28e-01

Pos logit contributions
 best+0.297
 twin+0.297
ours+0.285
 fault+0.281
apped+0.280
Neg logit contributions
 steer-0.374
He-0.367
 paddle-0.363
 swoop-0.359
 branch-0.356
Token around
Token ID1088
Feature activation+7.66e-03

Pos logit contributions
 steer+0.282
 paddle+0.276
He+0.273
 branch+0.272
 swoop+0.270
Neg logit contributions
 best-0.208
 twin-0.201
apped-0.199
uffy-0.196
ours-0.195
Token Su
Token ID1778
Feature activation+8.36e-02

Pos logit contributions
 twin+0.013
uffy+0.013
 best+0.012
 cousins+0.012
ours+0.012
Neg logit contributions
 steer-0.016
 paddle-0.016
He-0.016
 swoop-0.016
 branch-0.016
Tokenzy
Token ID7357
Feature activation+1.12e-01

Pos logit contributions
 twin+0.131
uffy+0.131
 cousins+0.129
ours+0.128
 best+0.125
Neg logit contributions
 swoop-0.193
 steer-0.192
 branch-0.191
 paddle-0.189
He-0.188
Token Suddenly
Token ID24975
Feature activation-2.41e-03

Pos logit contributions
He+0.102
 low+0.091
 steer+0.090
 paddle+0.090
 branch+0.089
Neg logit contributions
uffy-0.119
ours-0.117
ils-0.116
 cousins-0.115
ins-0.114
Token,
Token ID11
Feature activation+5.31e-02

Pos logit contributions
 paddle+0.004
 swoop+0.004
 steer+0.004
 move+0.004
 he+0.004
Neg logit contributions
ours-0.005
anny-0.005
 fault-0.005
uffy-0.005
omm-0.005
Token a
Token ID257
Feature activation+8.35e-03

Pos logit contributions
uffy+0.113
ours+0.112
 fault+0.111
omm+0.108
anny+0.108
Neg logit contributions
 paddle-0.094
 steer-0.093
 branch-0.091
 swoop-0.090
He-0.090
Token big
Token ID1263
Feature activation-1.30e-01

Pos logit contributions
 twin+0.019
 best+0.018
 cousins+0.018
uffy+0.018
 sisters+0.017
Neg logit contributions
 steer-0.015
He-0.014
 swoop-0.014
 paddle-0.014
 branch-0.013
Token bird
Token ID6512
Feature activation-2.05e-01

Pos logit contributions
He+0.231
 steer+0.231
 swoop+0.216
 paddle+0.212
 branch+0.203
Neg logit contributions
 twin-0.284
 cousins-0.267
 best-0.266
 brother-0.266
 twins-0.260
Token swoop
Token ID38527
Feature activation-5.72e-01

Pos logit contributions
 steer+0.416
 paddle+0.413
He+0.403
 branch+0.380
 distance+0.372
Neg logit contributions
uffy-0.352
 twin-0.340
 cousins-0.333
apped-0.328
icking-0.326
Tokened
Token ID276
Feature activation+1.78e-01

Pos logit contributions
 swoop+1.459
 steer+1.456
 paddle+1.441
He+1.436
 branch+1.435
Neg logit contributions
uffy-0.760
 twin-0.748
 cousins-0.734
ours-0.729
 fault-0.715
Token down
Token ID866
Feature activation-7.53e-02

Pos logit contributions
 twin+0.286
apped+0.275
 best+0.273
uffy+0.272
 cousins+0.270
Neg logit contributions
 steer-0.386
 paddle-0.376
He-0.373
 swoop-0.370
 branch-0.369
Token and
Token ID290
Feature activation-5.11e-02

Pos logit contributions
 swoop+0.175
 steer+0.174
 paddle+0.173
 branch+0.166
He+0.160
Neg logit contributions
 twin-0.109
apped-0.109
 best-0.106
icking-0.098
 twins-0.098
Token took
Token ID1718
Feature activation-3.63e-02

Pos logit contributions
 distance+0.094
He+0.094
 steer+0.094
 paddle+0.092
 branch+0.088
Neg logit contributions
ours-0.091
 twin-0.088
 cousins-0.087
uffy-0.087
apped-0.085
Token the
Token ID262
Feature activation-8.35e-03

Pos logit contributions
 swoop+0.072
 steer+0.072
 paddle+0.071
 branch+0.070
He+0.069
Neg logit contributions
uffy-0.071
ours-0.069
 twin-0.068
 cousins-0.068
 fault-0.068
Token was
Token ID373
Feature activation+1.76e-01

Pos logit contributions
 steer+0.436
 paddle+0.434
He+0.416
 branch+0.411
 swoop+0.403
Neg logit contributions
 twin-0.347
uffy-0.338
 best-0.328
 cousins-0.325
ours-0.317
Token very
Token ID845
Feature activation+6.43e-02

Pos logit contributions
 twin+0.378
 best+0.375
 cousins+0.366
uffy+0.365
ours+0.355
Neg logit contributions
 steer-0.311
 paddle-0.302
He-0.297
 swoop-0.294
 branch-0.286
Token graceful
Token ID44363
Feature activation-1.63e-01

Pos logit contributions
 best+0.137
 twin+0.136
uffy+0.132
 twins+0.125
ours+0.123
Neg logit contributions
 steer-0.113
 paddle-0.108
He-0.106
 branch-0.106
 distance-0.104
Token as
Token ID355
Feature activation-1.80e-01

Pos logit contributions
 steer+0.316
 swoop+0.314
 paddle+0.312
He+0.307
 branch+0.300
Neg logit contributions
uffy-0.320
 twin-0.320
 cousins-0.313
 best-0.312
 fault-0.310
Token it
Token ID340
Feature activation-1.30e-01

Pos logit contributions
 paddle+0.331
 steer+0.330
 swoop+0.327
He+0.326
 branch+0.320
Neg logit contributions
 best-0.358
 twin-0.355
uffy-0.344
 fault-0.339
 cousins-0.339
Token swoop
Token ID38527
Feature activation-5.72e-01

Pos logit contributions
 steer+0.210
He+0.208
 paddle+0.206
 branch+0.202
 swoop+0.195
Neg logit contributions
uffy-0.286
 twin-0.280
 best-0.276
ours-0.273
 cousins-0.268
Tokened
Token ID276
Feature activation+1.83e-01

Pos logit contributions
 swoop+1.397
 steer+1.395
 paddle+1.380
He+1.375
 branch+1.374
Neg logit contributions
uffy-0.820
 twin-0.808
 cousins-0.794
ours-0.789
 fault-0.774
Token down
Token ID866
Feature activation-7.56e-02

Pos logit contributions
 best+0.292
 twin+0.277
apped+0.263
 fault+0.261
ours+0.258
Neg logit contributions
 steer-0.405
 paddle-0.397
He-0.392
 swoop-0.382
 branch-0.378
Token and
Token ID290
Feature activation-5.43e-02

Pos logit contributions
 steer+0.181
 paddle+0.180
 swoop+0.177
He+0.170
 branch+0.167
Neg logit contributions
 best-0.109
apped-0.104
 twin-0.102
icking-0.097
ours-0.095
Token grabbed
Token ID13176
Feature activation-1.71e-01

Pos logit contributions
He+0.094
 steer+0.090
 paddle+0.088
 branch+0.085
 distance+0.084
Neg logit contributions
ours-0.110
 twin-0.108
 best-0.105
uffy-0.102
 cousins-0.101
Token the
Token ID262
Feature activation-1.50e-02

Pos logit contributions
He+0.346
 steer+0.342
 paddle+0.336
 swoop+0.334
 branch+0.326
Neg logit contributions
 twin-0.313
 best-0.309
ours-0.304
 fault-0.295
uffy-0.293
Token\n
Token ID198
Feature activation+2.27e-02

Pos logit contributions
 swoop+0.111
 steer+0.111
 paddle+0.109
 branch+0.108
He+0.108
Neg logit contributions
uffy-0.102
 twin-0.101
 cousins-0.099
ours-0.099
 fault-0.098
Token\n
Token ID198
Feature activation+2.31e-02

Pos logit contributions
uffy+0.045
 twin+0.044
 cousins+0.044
ours+0.044
 fault+0.043
Neg logit contributions
 swoop-0.043
 steer-0.043
He-0.043
 paddle-0.043
 branch-0.043
Token"
Token ID1
Feature activation-1.16e-01

Pos logit contributions
uffy+0.047
 twin+0.046
 cousins+0.044
 best+0.043
omm+0.043
Neg logit contributions
 swoop-0.042
 paddle-0.042
 distance-0.042
 steer-0.042
 branch-0.041
TokenPl
Token ID3646
Feature activation-8.69e-02

Pos logit contributions
 paddle+0.211
 swoop+0.210
 branch+0.208
 steer+0.208
 reach+0.202
Neg logit contributions
 twin-0.239
uffy-0.237
 best-0.227
 cousins-0.224
ours-0.223
Tokenuto
Token ID9390
Feature activation+1.82e-02

Pos logit contributions
 steer+0.191
He+0.188
 swoop+0.188
 paddle+0.184
 branch+0.183
Neg logit contributions
 twin-0.148
 cousins-0.147
uffy-0.146
 best-0.143
ours-0.140
Token is
Token ID318
Feature activation+3.35e-01

Pos logit contributions
 cousins+0.029
 twin+0.029
uffy+0.029
omm+0.029
ours+0.028
Neg logit contributions
 swoop-0.041
 paddle-0.041
 steer-0.040
 branch-0.039
 lift-0.039
Token not
Token ID407
Feature activation+8.61e-02

Pos logit contributions
uffy+0.715
ours+0.696
 cousins+0.678
 twin+0.674
apped+0.666
Neg logit contributions
 swoop-0.612
 branch-0.604
 steer-0.591
He-0.589
 paddle-0.584
Token ugly
Token ID13400
Feature activation-4.69e-02

Pos logit contributions
uffy+0.170
ours+0.165
 twin+0.163
 cousins+0.160
omm+0.159
Neg logit contributions
 swoop-0.170
 steer-0.167
 branch-0.166
 paddle-0.165
He-0.164
Token,
Token ID11
Feature activation+5.26e-02

Pos logit contributions
 swoop+0.094
 steer+0.094
 paddle+0.093
He+0.092
 branch+0.092
Neg logit contributions
uffy-0.088
 twin-0.087
 cousins-0.086
ours-0.085
 fault-0.084
Token Bella
Token ID39304
Feature activation+1.10e-01

Pos logit contributions
 twin+0.114
uffy+0.113
 cousins+0.111
ours+0.110
 best+0.110
Neg logit contributions
 swoop-0.090
 steer-0.089
 paddle-0.089
 branch-0.087
He-0.087
Token.
Token ID13
Feature activation-5.50e-02

Pos logit contributions
uffy+0.242
 twin+0.240
ours+0.240
 cousins+0.239
 fault+0.235
Neg logit contributions
 swoop-0.182
 steer-0.179
 branch-0.179
He-0.178
 paddle-0.176
Token wrong
Token ID2642
Feature activation+2.64e-01

Pos logit contributions
uffy+0.202
 twin+0.200
ours+0.196
 cousins+0.194
 fault+0.188
Neg logit contributions
 swoop-0.504
 steer-0.500
 branch-0.497
 paddle-0.497
He-0.494
Token.
Token ID13
Feature activation-5.27e-02

Pos logit contributions
uffy+0.566
 twin+0.562
 cousins+0.554
ours+0.550
 fault+0.545
Neg logit contributions
 swoop-0.463
 steer-0.461
 paddle-0.454
 branch-0.451
He-0.450
Token "
Token ID366
Feature activation+9.08e-02

Pos logit contributions
 swoop+0.115
 steer+0.115
 paddle+0.114
He+0.113
 branch+0.113
Neg logit contributions
uffy-0.090
 twin-0.088
 cousins-0.087
ours-0.087
 fault-0.085
TokenThat
Token ID2504
Feature activation-1.17e-01

Pos logit contributions
 twin+0.170
uffy+0.166
 cousins+0.164
ained+0.162
 twins+0.161
Neg logit contributions
 paddle-0.172
 steer-0.166
 branch-0.164
 swoop-0.156
 lift-0.155
Token dog
Token ID3290
Feature activation-9.68e-03

Pos logit contributions
 swoop+0.246
 steer+0.245
 paddle+0.242
 branch+0.241
He+0.241
Neg logit contributions
uffy-0.209
 twin-0.206
 cousins-0.203
ours-0.203
 fault-0.200
Token is
Token ID318
Feature activation+3.39e-01

Pos logit contributions
 swoop+0.022
 steer+0.021
 branch+0.021
 paddle+0.021
He+0.021
Neg logit contributions
 fault-0.016
ours-0.016
 twin-0.016
 cousins-0.016
uffy-0.016
Token scary
Token ID14343
Feature activation-2.19e-01

Pos logit contributions
uffy+0.712
ours+0.707
 twin+0.691
 cousins+0.684
 fault+0.676
Neg logit contributions
 swoop-0.630
 branch-0.612
 steer-0.612
 paddle-0.604
He-0.603
Token,"
Token ID553
Feature activation-7.15e-02

Pos logit contributions
 swoop+0.415
 steer+0.414
 paddle+0.409
He+0.406
 branch+0.404
Neg logit contributions
uffy-0.436
 twin-0.433
 cousins-0.428
ours-0.423
 fault-0.420
Token Tim
Token ID5045
Feature activation+1.44e-01

Pos logit contributions
 steer+0.173
 paddle+0.169
 branch+0.166
 swoop+0.165
He+0.160
Neg logit contributions
 twin-0.103
 best-0.101
uffy-0.100
ours-0.100
 cousins-0.099
Tokenmy
Token ID1820
Feature activation+2.84e-01

Pos logit contributions
uffy+0.129
 twin+0.126
 cousins+0.123
ours+0.123
 fault+0.118
Neg logit contributions
 swoop-0.430
 steer-0.427
He-0.424
 branch-0.423
 paddle-0.423
Token said
Token ID531
Feature activation+8.11e-03

Pos logit contributions
 twin+0.441
uffy+0.430
 best+0.420
 cousins+0.408
 brother+0.400
Neg logit contributions
 steer-0.669
 paddle-0.666
 branch-0.635
 distance-0.631
He-0.628
Token\n
Token ID198
Feature activation+2.03e-02

Pos logit contributions
 cousins+0.035
 twin+0.034
ours+0.034
uffy+0.033
 fault+0.033
Neg logit contributions
He-0.024
 steer-0.021
 branch-0.020
 swoop-0.020
 paddle-0.019
Token\n
Token ID198
Feature activation+2.07e-02

Pos logit contributions
uffy+0.033
 twin+0.033
 cousins+0.031
 best+0.031
ours+0.030
Neg logit contributions
 paddle-0.045
 swoop-0.045
 steer-0.044
 branch-0.044
 lift-0.042
TokenSoon
Token ID28093
Feature activation-1.81e-01

Pos logit contributions
 twin+0.046
uffy+0.043
anny+0.043
 cousins+0.043
 best+0.042
Neg logit contributions
 paddle-0.032
 swoop-0.031
 branch-0.030
 steer-0.029
 lift-0.029
Token,
Token ID11
Feature activation+4.74e-02

Pos logit contributions
 swoop+0.358
 steer+0.355
 paddle+0.352
He+0.350
 branch+0.348
Neg logit contributions
uffy-0.352
ours-0.344
 twin-0.342
 cousins-0.340
 fault-0.339
Token the
Token ID262
Feature activation-5.68e-03

Pos logit contributions
 twin+0.097
uffy+0.095
 cousins+0.095
 best+0.095
ours+0.091
Neg logit contributions
 swoop-0.087
 steer-0.086
He-0.085
 paddle-0.084
 branch-0.084
Token friends
Token ID2460
Feature activation+3.76e-01

Pos logit contributions
 steer+0.006
 swoop+0.006
He+0.006
 paddle+0.006
 branch+0.006
Neg logit contributions
 best-0.016
 twin-0.016
 cousins-0.016
uffy-0.016
 twins-0.015
Token stumbled
Token ID24241
Feature activation-1.13e-01

Pos logit contributions
uffy+0.722
 twin+0.721
 cousins+0.708
ours+0.698
 best+0.690
Neg logit contributions
 steer-0.727
 swoop-0.725
 paddle-0.715
 branch-0.714
He-0.713
Token into
Token ID656
Feature activation-3.02e-02

Pos logit contributions
 steer+0.244
 swoop+0.242
 paddle+0.240
 branch+0.236
He+0.235
Neg logit contributions
uffy-0.196
 twin-0.194
 cousins-0.189
 best-0.189
 fault-0.187
Token a
Token ID257
Feature activation+1.08e-02

Pos logit contributions
 swoop+0.065
 steer+0.064
 paddle+0.063
He+0.063
 branch+0.061
Neg logit contributions
 twin-0.054
 cousins-0.052
 best-0.051
uffy-0.051
 fault-0.050
Token clearing
Token ID17304
Feature activation-2.51e-01

Pos logit contributions
 twin+0.024
 best+0.023
uffy+0.023
 cousins+0.022
 twins+0.021
Neg logit contributions
 steer-0.019
 paddle-0.018
He-0.018
 swoop-0.018
 lift-0.016
Token and
Token ID290
Feature activation-5.66e-02

Pos logit contributions
 steer+0.487
 swoop+0.486
 paddle+0.482
He+0.479
 branch+0.464
Neg logit contributions
 twin-0.478
uffy-0.471
 best-0.466
ours-0.463
 cousins-0.462
Token dishes
Token ID16759
Feature activation+2.69e-01

Pos logit contributions
 paddle+0.032
 steer+0.032
He+0.032
 branch+0.032
 swoop+0.032
Neg logit contributions
uffy-0.020
 cousins-0.019
 twin-0.019
ours-0.019
 fault-0.018
Token.
Token ID13
Feature activation-5.03e-02

Pos logit contributions
 best+0.453
 twin+0.445
icking+0.427
uffy+0.413
 fault+0.407
Neg logit contributions
 swoop-0.618
 steer-0.591
 paddle-0.580
He-0.577
 branch-0.556
Token When
Token ID1649
Feature activation+1.07e-01

Pos logit contributions
He+0.070
 low+0.065
 branch+0.064
 reach+0.062
 swoop+0.058
Neg logit contributions
 cousins-0.127
 fault-0.125
 twin-0.125
uffy-0.124
icking-0.123
Token she
Token ID673
Feature activation-1.79e-01

Pos logit contributions
 fault+0.288
uffy+0.287
 cousins+0.287
 twin+0.286
ours+0.286
Neg logit contributions
 branch-0.128
 swoop-0.127
He-0.124
 steer-0.123
 paddle-0.123
Token was
Token ID373
Feature activation+1.78e-01

Pos logit contributions
 swoop+0.322
 steer+0.316
He+0.316
 paddle+0.311
 branch+0.308
Neg logit contributions
 fault-0.387
 cousins-0.382
uffy-0.380
ours-0.370
 twin-0.368
Token done
Token ID1760
Feature activation+3.45e-01

Pos logit contributions
uffy+0.256
 cousins+0.245
ours+0.244
 twin+0.240
 fault+0.236
Neg logit contributions
 steer-0.438
He-0.437
 swoop-0.436
 branch-0.436
 paddle-0.435
Token,
Token ID11
Feature activation+5.52e-02

Pos logit contributions
uffy+0.534
 twin+0.530
 cousins+0.515
ours+0.513
 best+0.504
Neg logit contributions
 swoop-0.812
 steer-0.808
He-0.796
 branch-0.795
 paddle-0.793
Token her
Token ID607
Feature activation+9.20e-02

Pos logit contributions
uffy+0.135
ours+0.133
 fault+0.133
 twin+0.132
 cousins+0.132
Neg logit contributions
 swoop-0.079
 branch-0.079
 steer-0.078
He-0.078
 paddle-0.078
Token mom
Token ID1995
Feature activation+2.75e-01

Pos logit contributions
ains+0.153
ours+0.151
ained+0.150
apped+0.145
anny+0.144
Neg logit contributions
 branch-0.214
 lift-0.201
 rock-0.199
 paddle-0.197
 reach-0.196
Token gave
Token ID2921
Feature activation+5.95e-02

Pos logit contributions
uffy+0.544
 twin+0.534
ours+0.530
 fault+0.530
 cousins+0.530
Neg logit contributions
 swoop-0.531
 steer-0.525
He-0.518
 paddle-0.517
 branch-0.515
Token her
Token ID607
Feature activation+9.24e-02

Pos logit contributions
uffy+0.148
omm+0.142
ains+0.141
ours+0.141
 fault+0.140
Neg logit contributions
 swoop-0.089
 reach-0.087
 branch-0.086
He-0.085
 low-0.083
Token special
Token ID2041
Feature activation-1.19e-02

Pos logit contributions
uffy+0.016
ours+0.016
 twin+0.015
 cousins+0.015
 fault+0.015
Neg logit contributions
 swoop-0.017
 branch-0.017
 steer-0.017
 paddle-0.017
He-0.017
Token connection
Token ID4637
Feature activation-1.71e-01

Pos logit contributions
 swoop+0.024
 steer+0.024
 branch+0.024
 paddle+0.024
He+0.023
Neg logit contributions
uffy-0.022
 twin-0.022
ours-0.022
 cousins-0.022
 fault-0.021
Token to
Token ID284
Feature activation+2.69e-02

Pos logit contributions
 branch+0.283
 steer+0.280
He+0.278
 swoop+0.277
 reach+0.268
Neg logit contributions
uffy-0.381
ours-0.379
 twin-0.372
 cousins-0.367
omm-0.361
Token it
Token ID340
Feature activation-1.28e-01

Pos logit contributions
uffy+0.054
ours+0.052
 twin+0.051
 cousins+0.050
apped+0.050
Neg logit contributions
 steer-0.053
 branch-0.052
 swoop-0.052
He-0.051
 paddle-0.051
Token.
Token ID13
Feature activation-6.12e-02

Pos logit contributions
 steer+0.224
 branch+0.224
 swoop+0.224
He+0.223
 paddle+0.217
Neg logit contributions
uffy-0.270
ours-0.267
 cousins-0.265
 twin-0.264
 fault-0.259
Token From
Token ID3574
Feature activation+2.74e-01

Pos logit contributions
He+0.111
 branch+0.098
 swoop+0.093
 steer+0.092
 reach+0.089
Neg logit contributions
uffy-0.133
icking-0.133
 cousins-0.132
 best-0.132
ours-0.128
Token that
Token ID326
Feature activation-2.45e-02

Pos logit contributions
 twin+0.591
uffy+0.591
ours+0.578
 cousins+0.577
 fault+0.577
Neg logit contributions
 swoop-0.482
 steer-0.476
 branch-0.476
He-0.470
 paddle-0.466
Token day
Token ID1110
Feature activation+1.80e-01

Pos logit contributions
 branch+0.054
 swoop+0.052
He+0.051
 reach+0.050
 lift+0.050
Neg logit contributions
ours-0.046
uffy-0.045
 cousins-0.044
ains-0.043
 fault-0.042
Token on
Token ID319
Feature activation+6.28e-02

Pos logit contributions
 fault+0.373
ours+0.365
uffy+0.359
 twin+0.359
 cousins+0.357
Neg logit contributions
He-0.339
 swoop-0.331
 branch-0.329
 steer-0.327
 reach-0.323
Token,
Token ID11
Feature activation+5.35e-02

Pos logit contributions
ours+0.152
 cousins+0.149
ains+0.147
apped+0.146
 fault+0.145
Neg logit contributions
 reach-0.092
 branch-0.092
 lift-0.088
 he-0.088
 swoop-0.088
Token she
Token ID673
Feature activation-1.78e-01

Pos logit contributions
uffy+0.102
ours+0.101
 fault+0.098
 cousins+0.097
ains+0.097
Neg logit contributions
He-0.107
 branch-0.105
 swoop-0.105
 reach-0.104
 steer-0.103
Token me
Token ID502
Feature activation+9.58e-02

Pos logit contributions
 swoop+0.407
 steer+0.405
 paddle+0.405
He+0.401
 branch+0.398
Neg logit contributions
 twin-0.242
uffy-0.242
 cousins-0.235
ours-0.233
 best-0.228
Token get
Token ID651
Feature activation-2.40e-01

Pos logit contributions
 twin+0.204
 best+0.203
icking+0.200
uffy+0.199
ours+0.199
Neg logit contributions
 swoop-0.168
He-0.164
 steer-0.160
 paddle-0.158
 branch-0.152
Token some
Token ID617
Feature activation+6.43e-02

Pos logit contributions
 swoop+0.466
 steer+0.463
 branch+0.458
 paddle+0.455
He+0.453
Neg logit contributions
uffy-0.482
 cousins-0.466
 twin-0.461
 fault-0.455
ours-0.450
Token food
Token ID2057
Feature activation+4.77e-02

Pos logit contributions
uffy+0.119
 twin+0.118
 cousins+0.116
ours+0.115
 fault+0.114
Neg logit contributions
 swoop-0.131
 steer-0.131
 paddle-0.129
He-0.129
 branch-0.128
Token?"
Token ID1701
Feature activation+1.68e-01

Pos logit contributions
uffy+0.070
 twin+0.069
 cousins+0.068
ours+0.067
 fault+0.066
Neg logit contributions
 steer-0.115
 swoop-0.114
 paddle-0.114
 branch-0.114
He-0.113
Token Jim
Token ID5395
Feature activation+1.80e-02

Pos logit contributions
uffy+0.287
 twin+0.286
 cousins+0.279
ours+0.279
 fault+0.274
Neg logit contributions
 swoop-0.367
 steer-0.365
 paddle-0.362
 branch-0.359
He-0.356
Token said
Token ID531
Feature activation-8.54e-04

Pos logit contributions
uffy+0.038
 cousins+0.038
 fault+0.038
ours+0.037
 twin+0.037
Neg logit contributions
 swoop-0.033
He-0.031
 steer-0.031
 paddle-0.031
 branch-0.031
Token,
Token ID11
Feature activation+4.54e-02

Pos logit contributions
 swoop+0.002
 steer+0.002
 paddle+0.002
 branch+0.002
He+0.002
Neg logit contributions
uffy-0.001
 twin-0.001
 cousins-0.001
ours-0.001
 best-0.001
Token "
Token ID366
Feature activation+8.55e-02

Pos logit contributions
 twin+0.063
 best+0.058
uffy+0.057
 cousins+0.057
apped+0.057
Neg logit contributions
 steer-0.115
 swoop-0.114
 paddle-0.113
 branch-0.110
He-0.109
TokenYes
Token ID5297
Feature activation-2.07e-01

Pos logit contributions
 best+0.150
 twin+0.149
othes+0.141
apped+0.141
 favorites+0.141
Neg logit contributions
 paddle-0.168
 swoop-0.165
 branch-0.159
 steer-0.154
 current-0.149
Token!
Token ID0
Feature activation-8.86e-02

Pos logit contributions
 swoop+0.431
 steer+0.430
 paddle+0.425
He+0.423
 branch+0.422
Neg logit contributions
uffy-0.376
 twin-0.372
 cousins-0.366
ours-0.365
 fault-0.359
Token\n
Token ID198
Feature activation+2.61e-02

Pos logit contributions
 swoop+0.109
 steer+0.108
 paddle+0.107
 branch+0.106
He+0.106
Neg logit contributions
uffy-0.111
 twin-0.110
 cousins-0.108
ours-0.108
 fault-0.106
Token\n
Token ID198
Feature activation+2.60e-02

Pos logit contributions
uffy+0.048
 twin+0.047
 cousins+0.046
ours+0.046
 fault+0.045
Neg logit contributions
 swoop-0.054
 steer-0.054
 paddle-0.053
 branch-0.052
He-0.052
TokenSo
Token ID2396
Feature activation-1.06e-01

Pos logit contributions
 twin+0.048
anny+0.047
 twins+0.044
 living+0.043
loe+0.042
Neg logit contributions
 paddle-0.049
 swoop-0.042
 steer-0.041
 wave-0.040
 branch-0.040
Token one
Token ID530
Feature activation-2.34e-02

Pos logit contributions
 swoop+0.212
 steer+0.211
 paddle+0.209
He+0.208
 branch+0.207
Neg logit contributions
uffy-0.202
 twin-0.198
 cousins-0.197
ours-0.196
 fault-0.194
Token day
Token ID1110
Feature activation+1.77e-01

Pos logit contributions
 paddle+0.065
 steer+0.064
 swoop+0.064
He+0.063
 branch+0.061
Neg logit contributions
 twin-0.027
 best-0.026
uffy-0.025
 cousins-0.024
ours-0.022
Token,
Token ID11
Feature activation+5.19e-02

Pos logit contributions
 twin+0.279
uffy+0.266
 best+0.265
 cousins+0.264
ours+0.258
Neg logit contributions
 swoop-0.406
 steer-0.404
 branch-0.400
 paddle-0.399
He-0.397
Token he
Token ID339
Feature activation-3.93e-01

Pos logit contributions
 twin+0.109
 best+0.106
 cousins+0.105
uffy+0.103
 twins+0.101
Neg logit contributions
 swoop-0.094
 steer-0.090
 paddle-0.089
He-0.088
 branch-0.088
Token decided
Token ID3066
Feature activation-1.19e-01

Pos logit contributions
 swoop+0.680
 steer+0.670
 paddle+0.661
 branch+0.654
He+0.654
Neg logit contributions
uffy-0.863
 twin-0.847
ours-0.842
 fault-0.839
 cousins-0.839
Token to
Token ID284
Feature activation+2.45e-02

Pos logit contributions
 swoop+0.293
 steer+0.290
 paddle+0.290
 branch+0.288
He+0.285
Neg logit contributions
 best-0.156
 fault-0.153
 cousins-0.150
 twin-0.146
ours-0.141
Token try
Token ID1949
Feature activation-2.23e-01

Pos logit contributions
 twin+0.055
 best+0.053
 cousins+0.052
uffy+0.052
 sisters+0.052
Neg logit contributions
He-0.039
 swoop-0.037
 branch-0.036
 steer-0.035
 paddle-0.034
Token it
Token ID340
Feature activation-1.26e-01

Pos logit contributions
 swoop+0.458
 steer+0.450
He+0.448
 paddle+0.445
 branch+0.435
Neg logit contributions
 twin-0.404
 best-0.397
ours-0.395
uffy-0.393
 fault-0.393
Token\n
Token ID198
Feature activation+1.33e-02

Pos logit contributions
uffy+0.027
 cousins+0.027
 twin+0.027
ours+0.027
 fault+0.027
Neg logit contributions
He-0.025
 steer-0.025
 swoop-0.025
 branch-0.025
 paddle-0.024
Token"
Token ID1
Feature activation-1.25e-01

Pos logit contributions
uffy+0.025
 twin+0.025
ours+0.024
 cousins+0.024
 best+0.024
Neg logit contributions
 paddle-0.027
 steer-0.026
 swoop-0.025
 branch-0.025
 lift-0.024
TokenGood
Token ID10248
Feature activation-9.24e-02

Pos logit contributions
 paddle+0.224
 steer+0.216
 lift+0.212
 branch+0.208
 reach+0.204
Neg logit contributions
 twin-0.254
uffy-0.244
 cousins-0.241
 twins-0.239
 best-0.237
Token boy
Token ID2933
Feature activation-1.72e-02

Pos logit contributions
 swoop+0.181
 branch+0.178
 steer+0.177
 paddle+0.177
He+0.173
Neg logit contributions
ours-0.179
uffy-0.177
 cousins-0.176
 twin-0.176
 fault-0.172
Token,
Token ID11
Feature activation+4.75e-02

Pos logit contributions
 paddle+0.037
 steer+0.037
He+0.036
 swoop+0.036
 branch+0.035
Neg logit contributions
uffy-0.030
 twin-0.030
 cousins-0.029
 best-0.029
ours-0.028
Token Spot
Token ID15899
Feature activation+9.82e-02

Pos logit contributions
 twin+0.075
uffy+0.074
 cousins+0.072
 best+0.071
ours+0.070
Neg logit contributions
 swoop-0.110
 steer-0.110
 paddle-0.109
He-0.109
 branch-0.107
Token!"
Token ID2474
Feature activation-6.10e-02

Pos logit contributions
uffy+0.186
 twin+0.180
 best+0.176
 cousins+0.175
 fault+0.171
Neg logit contributions
 paddle-0.198
 steer-0.197
He-0.191
 swoop-0.189
 branch-0.184
Token Lily
Token ID20037
Feature activation+2.10e-01

Pos logit contributions
 swoop+0.172
 steer+0.172
 paddle+0.169
 branch+0.167
He+0.164
Neg logit contributions
 twin-0.068
uffy-0.064
 cousins-0.063
 twins-0.061
ours-0.060
Token says
Token ID1139
Feature activation+1.48e-01

Pos logit contributions
uffy+0.312
 twin+0.310
 cousins+0.305
ours+0.301
 fault+0.296
Neg logit contributions
 steer-0.501
 swoop-0.500
 paddle-0.496
He-0.492
 branch-0.490
Token.
Token ID13
Feature activation-5.88e-02

Pos logit contributions
 twin+0.283
uffy+0.280
 best+0.272
 cousins+0.271
apped+0.271
Neg logit contributions
 swoop-0.293
 steer-0.286
 paddle-0.286
He-0.285
 branch-0.281
Token "
Token ID366
Feature activation+7.99e-02

Pos logit contributions
 steer+0.147
 swoop+0.146
 paddle+0.145
 branch+0.140
He+0.138
Neg logit contributions
uffy-0.084
 twin-0.083
ours-0.081
 cousins-0.079
 best-0.077
Token bowl
Token ID9396
Feature activation+5.11e-02

Pos logit contributions
 swoop+0.062
 steer+0.062
 paddle+0.061
He+0.061
 branch+0.061
Neg logit contributions
uffy-0.050
 twin-0.049
 cousins-0.049
ours-0.048
 fault-0.048
Token.
Token ID13
Feature activation-8.29e-02

Pos logit contributions
 twin+0.096
 best+0.093
uffy+0.092
icking+0.092
ours+0.092
Neg logit contributions
 swoop-0.104
 steer-0.104
 paddle-0.103
He-0.098
 branch-0.095
Token She
Token ID1375
Feature activation-6.01e-02

Pos logit contributions
 swoop+0.152
He+0.151
 steer+0.151
 branch+0.150
 paddle+0.149
Neg logit contributions
uffy-0.169
 twin-0.167
 cousins-0.167
ours-0.164
 fault-0.164
Token mixed
Token ID7668
Feature activation+1.53e-01

Pos logit contributions
 swoop+0.111
 steer+0.110
He+0.108
 branch+0.108
 paddle+0.108
Neg logit contributions
uffy-0.124
 cousins-0.121
 fault-0.121
 twin-0.121
ours-0.119
Token it
Token ID340
Feature activation-1.41e-01

Pos logit contributions
uffy+0.305
 cousins+0.305
 twin+0.299
 fault+0.293
ours+0.289
Neg logit contributions
 swoop-0.292
 steer-0.291
He-0.288
 branch-0.288
 paddle-0.285
Token all
Token ID477
Feature activation+1.70e-01

Pos logit contributions
 swoop+0.309
 steer+0.308
 branch+0.305
He+0.304
 paddle+0.303
Neg logit contributions
uffy-0.238
 cousins-0.235
 twin-0.234
ours-0.230
 fault-0.227
Token up
Token ID510
Feature activation+1.42e-01

Pos logit contributions
 cousins+0.331
uffy+0.311
 twin+0.298
 cousin+0.297
omm+0.294
Neg logit contributions
 branch-0.354
He-0.350
 swoop-0.347
 steer-0.344
 reach-0.343
Token and
Token ID290
Feature activation-7.13e-02

Pos logit contributions
uffy+0.275
 twin+0.274
 best+0.267
ours+0.267
 cousins+0.267
Neg logit contributions
 steer-0.275
 swoop-0.273
 paddle-0.273
He-0.268
 branch-0.264
Token put
Token ID1234
Feature activation-1.51e-01

Pos logit contributions
 swoop+0.129
 steer+0.128
 paddle+0.126
 branch+0.126
He+0.125
Neg logit contributions
uffy-0.152
 twin-0.148
 cousins-0.147
 fault-0.146
ours-0.145
Token the
Token ID262
Feature activation-2.11e-02

Pos logit contributions
 swoop+0.235
 branch+0.231
 steer+0.229
He+0.228
 paddle+0.224
Neg logit contributions
uffy-0.358
 cousins-0.357
 twin-0.350
 fault-0.345
ours-0.339
Token cake
Token ID12187
Feature activation+1.53e-01

Pos logit contributions
 swoop+0.044
 steer+0.043
He+0.043
 paddle+0.042
 branch+0.042
Neg logit contributions
 twin-0.039
uffy-0.038
 best-0.038
ours-0.037
 cousins-0.037
Token a
Token ID257
Feature activation+1.03e-02

Pos logit contributions
uffy+0.025
 twin+0.025
 cousins+0.025
ours+0.024
 best+0.024
Neg logit contributions
 swoop-0.024
 steer-0.024
 paddle-0.024
He-0.023
 branch-0.023
Token real
Token ID1103
Feature activation-6.54e-02

Pos logit contributions
 twin+0.020
 best+0.019
uffy+0.019
 cousins+0.019
 cousin+0.018
Neg logit contributions
 swoop-0.021
 steer-0.021
He-0.020
 paddle-0.020
 reach-0.019
Token car
Token ID1097
Feature activation+5.87e-03

Pos logit contributions
 swoop+0.137
 steer+0.135
He+0.133
 paddle+0.131
 branch+0.131
Neg logit contributions
 twin-0.120
uffy-0.117
 cousins-0.117
 best-0.116
 fault-0.114
Token.
Token ID13
Feature activation-5.59e-02

Pos logit contributions
uffy+0.012
 twin+0.012
 cousins+0.012
ours+0.011
 fault+0.011
Neg logit contributions
 swoop-0.011
 steer-0.011
 paddle-0.011
He-0.011
 branch-0.011
Token He
Token ID679
Feature activation-2.65e-01

Pos logit contributions
 swoop+0.107
 steer+0.105
 paddle+0.104
 branch+0.102
He+0.096
Neg logit contributions
 twin-0.111
uffy-0.110
ours-0.107
 cousins-0.106
omm-0.105
Token made
Token ID925
Feature activation+3.71e-02

Pos logit contributions
 steer+0.497
 swoop+0.495
 paddle+0.484
He+0.478
 branch+0.464
Neg logit contributions
uffy-0.570
 fault-0.553
ours-0.553
omm-0.545
 cousins-0.522
Token a
Token ID257
Feature activation+1.15e-02

Pos logit contributions
 twin+0.074
uffy+0.073
 cousins+0.073
ours+0.072
 best+0.072
Neg logit contributions
 swoop-0.069
 steer-0.069
 paddle-0.068
 branch-0.068
He-0.067
Token loud
Token ID7812
Feature activation-2.69e-01

Pos logit contributions
 twin+0.024
 cousins+0.024
 best+0.024
uffy+0.023
 fault+0.023
Neg logit contributions
 swoop-0.021
 steer-0.021
He-0.020
 paddle-0.020
 branch-0.020
Token engine
Token ID3113
Feature activation-7.42e-02

Pos logit contributions
He+0.373
 steer+0.358
 swoop+0.356
 paddle+0.356
 branch+0.338
Neg logit contributions
 twin-0.670
 cousins-0.634
 best-0.625
 twins-0.624
uffy-0.623
Token noise
Token ID7838
Feature activation-2.61e-01

Pos logit contributions
 swoop+0.154
He+0.153
 paddle+0.150
 steer+0.149
 branch+0.147
Neg logit contributions
 twin-0.131
 best-0.127
 cousins-0.124
uffy-0.121
 fault-0.121
Token and
Token ID290
Feature activation-5.27e-02

Pos logit contributions
 swoop+0.548
 paddle+0.538
 steer+0.534
He+0.528
 branch+0.527
Neg logit contributions
 twin-0.472
 best-0.456
 cousins-0.455
uffy-0.445
 fault-0.443
Token I
Token ID314
Feature activation-5.02e-02

Pos logit contributions
ained+0.104
apped+0.100
uffy+0.100
outed+0.099
ours+0.097
Neg logit contributions
 rock-0.077
 branch-0.076
 steer-0.076
 move-0.076
 lift-0.076
Token had
Token ID550
Feature activation+6.25e-02

Pos logit contributions
 swoop+0.087
 steer+0.087
 paddle+0.086
He+0.084
 branch+0.083
Neg logit contributions
ours-0.108
uffy-0.108
 cousins-0.107
 twin-0.106
omm-0.105
Token a
Token ID257
Feature activation+5.82e-03

Pos logit contributions
uffy+0.148
omm+0.144
icking+0.140
ours+0.139
anny+0.137
Neg logit contributions
 branch-0.104
 swoop-0.102
 reach-0.094
 steer-0.093
 catch-0.091
Token wonderful
Token ID7932
Feature activation+1.20e-01

Pos logit contributions
uffy+0.012
ours+0.012
 cousins+0.012
icking+0.012
 twin+0.012
Neg logit contributions
 paddle-0.010
 swoop-0.010
 branch-0.010
 steer-0.010
He-0.010
Token dream
Token ID4320
Feature activation-1.02e-01

Pos logit contributions
ours+0.238
uffy+0.237
ains+0.234
icking+0.228
 cousins+0.225
Neg logit contributions
 branch-0.225
 swoop-0.224
 paddle-0.224
 steer-0.221
 lift-0.214
Token.
Token ID13
Feature activation-6.00e-02

Pos logit contributions
 swoop+0.189
 steer+0.187
 paddle+0.187
He+0.184
 branch+0.183
Neg logit contributions
uffy-0.206
 twin-0.206
 cousins-0.202
ours-0.200
 fault-0.199
Token I
Token ID314
Feature activation-5.00e-02

Pos logit contributions
He+0.101
 steer+0.098
 branch+0.095
 paddle+0.095
 swoop+0.093
Neg logit contributions
uffy-0.134
 fault-0.131
icking-0.129
 twin-0.129
ours-0.129
Token dreamed
Token ID27947
Feature activation-1.00e-02

Pos logit contributions
 swoop+0.101
 steer+0.101
 paddle+0.099
 branch+0.099
He+0.099
Neg logit contributions
uffy-0.093
 twin-0.092
 cousins-0.091
ours-0.090
 fault-0.089
Token of
Token ID286
Feature activation+2.41e-03

Pos logit contributions
He+0.021
 branch+0.020
 steer+0.020
 swoop+0.020
 reach+0.019
Neg logit contributions
uffy-0.018
 cousins-0.018
ours-0.018
 fault-0.018
 twin-0.018
Token a
Token ID257
Feature activation+6.30e-03

Pos logit contributions
uffy+0.005
ours+0.005
icking+0.005
anny+0.005
apped+0.005
Neg logit contributions
 branch-0.005
 swoop-0.005
 steer-0.005
 paddle-0.005
 reach-0.004
Token magic
Token ID5536
Feature activation+3.03e-03

Pos logit contributions
 twin+0.013
uffy+0.013
 cousins+0.013
 best+0.013
 fault+0.013
Neg logit contributions
 steer-0.011
 swoop-0.011
 paddle-0.011
He-0.011
 branch-0.011
Token love
Token ID1842
Feature activation+6.83e-02

Pos logit contributions
ours+0.330
uffy+0.328
apped+0.312
ains+0.306
 twin+0.306
Neg logit contributions
 paddle-0.423
 branch-0.422
 steer-0.422
 swoop-0.413
He-0.401
Token.
Token ID13
Feature activation-6.99e-02

Pos logit contributions
uffy+0.162
ours+0.160
 twin+0.155
 cousins+0.154
 fault+0.154
Neg logit contributions
He-0.107
 steer-0.105
 branch-0.104
 swoop-0.100
 rock-0.096
Token How
Token ID1374
Feature activation+1.82e-01

Pos logit contributions
He+0.149
 steer+0.146
 paddle+0.143
 swoop+0.142
 branch+0.141
Neg logit contributions
uffy-0.126
 fault-0.121
 cousins-0.121
 twin-0.120
ours-0.120
Token can
Token ID460
Feature activation+1.07e-01

Pos logit contributions
 cousins+0.278
uffy+0.278
omm+0.267
apped+0.252
 twin+0.251
Neg logit contributions
He-0.443
 steer-0.416
 lift-0.403
 move-0.399
 branch-0.398
Token I
Token ID314
Feature activation-5.56e-02

Pos logit contributions
 twin+0.178
uffy+0.177
 cousins+0.173
ours+0.170
 best+0.169
Neg logit contributions
 swoop-0.238
 branch-0.234
 steer-0.233
 paddle-0.232
He-0.231
Token repay
Token ID20152
Feature activation-1.94e-01

Pos logit contributions
 swoop+0.100
 branch+0.097
He+0.093
 steer+0.092
 paddle+0.091
Neg logit contributions
 twin-0.113
 best-0.111
uffy-0.109
 cousins-0.107
icking-0.106
Token you
Token ID345
Feature activation+7.42e-02

Pos logit contributions
 steer+0.389
 swoop+0.382
He+0.381
 branch+0.375
 paddle+0.374
Neg logit contributions
uffy-0.370
ours-0.365
 fault-0.361
 twin-0.359
 cousins-0.350
Token?"
Token ID1701
Feature activation+1.46e-01

Pos logit contributions
uffy+0.108
 twin+0.106
 cousins+0.104
ours+0.104
 fault+0.102
Neg logit contributions
 steer-0.180
 swoop-0.180
 paddle-0.178
He-0.178
 branch-0.177
Token the
Token ID262
Feature activation-1.97e-02

Pos logit contributions
 twin+0.259
uffy+0.258
 cousins+0.251
ours+0.248
 best+0.241
Neg logit contributions
 swoop-0.310
 steer-0.308
 paddle-0.305
 branch-0.303
He-0.295
Token man
Token ID582
Feature activation-3.01e-01

Pos logit contributions
 swoop+0.043
 steer+0.043
 paddle+0.043
He+0.042
 branch+0.042
Neg logit contributions
uffy-0.033
 twin-0.033
 cousins-0.033
ours-0.032
 fault-0.032
Token said
Token ID531
Feature activation-1.04e-02

Pos logit contributions
 paddle+0.645
 steer+0.643
 branch+0.642
 swoop+0.642
He+0.630
Neg logit contributions
 twin-0.521
ours-0.507
uffy-0.504
 cousins-0.496
 best-0.479
Token girl
Token ID2576
Feature activation-1.25e-02

Pos logit contributions
 steer+0.131
 swoop+0.128
He+0.127
 paddle+0.126
 branch+0.122
Neg logit contributions
 twin-0.125
uffy-0.123
 best-0.117
 cousins-0.116
 twins-0.113
Token,
Token ID11
Feature activation+4.49e-02

Pos logit contributions
 swoop+0.028
 branch+0.028
 steer+0.028
He+0.028
 paddle+0.027
Neg logit contributions
uffy-0.020
 fault-0.020
ours-0.020
 cousins-0.020
 twin-0.020
Token and
Token ID290
Feature activation-6.11e-02

Pos logit contributions
uffy+0.074
 twin+0.073
 best+0.070
 cousins+0.069
ours+0.069
Neg logit contributions
 steer-0.101
 paddle-0.101
 swoop-0.099
 branch-0.096
He-0.095
Token one
Token ID530
Feature activation-2.84e-02

Pos logit contributions
 swoop+0.116
 steer+0.115
He+0.113
 paddle+0.113
 branch+0.113
Neg logit contributions
uffy-0.123
 twin-0.122
 cousins-0.120
ours-0.120
 fault-0.119
Token morning
Token ID3329
Feature activation+1.74e-01

Pos logit contributions
 branch+0.067
 steer+0.067
 swoop+0.067
He+0.067
 paddle+0.066
Neg logit contributions
uffy-0.044
ours-0.043
 fault-0.042
 cousins-0.042
 twin-0.042
Token she
Token ID673
Feature activation-1.78e-01

Pos logit contributions
 fault+0.388
 cousins+0.375
ains+0.375
ours+0.371
uffy+0.364
Neg logit contributions
 branch-0.307
 swoop-0.307
 paddle-0.285
 reach-0.283
He-0.282
Token saw
Token ID2497
Feature activation-7.01e-02

Pos logit contributions
 swoop+0.335
 steer+0.322
 paddle+0.319
He+0.318
 branch+0.314
Neg logit contributions
 fault-0.374
uffy-0.369
omm-0.366
 cousins-0.364
 twin-0.363
Token something
Token ID1223
Feature activation+4.80e-02

Pos logit contributions
 branch+0.121
He+0.119
 swoop+0.118
 steer+0.112
 paddle+0.112
Neg logit contributions
uffy-0.159
ours-0.151
 cousins-0.151
 fault-0.151
 twin-0.149
Token very
Token ID845
Feature activation+5.86e-02

Pos logit contributions
uffy+0.100
 twin+0.100
 best+0.099
 cousins+0.098
ours+0.098
Neg logit contributions
 steer-0.087
 swoop-0.085
He-0.084
 paddle-0.084
 branch-0.082
Token blue
Token ID4171
Feature activation-9.73e-02

Pos logit contributions
 best+0.108
uffy+0.106
 twin+0.106
ours+0.103
 cousins+0.100
Neg logit contributions
 steer-0.122
 swoop-0.119
He-0.118
 branch-0.116
 paddle-0.116
Token in
Token ID287
Feature activation+7.17e-02

Pos logit contributions
 swoop+0.177
 steer+0.176
 branch+0.175
 paddle+0.174
He+0.173
Neg logit contributions
uffy-0.201
 twin-0.199
 cousins-0.198
ours-0.196
 fault-0.194
Token,
Token ID11
Feature activation+4.78e-02

Pos logit contributions
 swoop+0.003
 steer+0.003
He+0.003
 paddle+0.003
 branch+0.003
Neg logit contributions
 cousins-0.003
uffy-0.003
 twin-0.003
 fault-0.003
ours-0.003
Token "
Token ID366
Feature activation+8.41e-02

Pos logit contributions
 twin+0.064
omm+0.061
 best+0.060
uffy+0.060
ours+0.060
Neg logit contributions
 steer-0.122
 swoop-0.120
 paddle-0.120
 branch-0.118
He-0.115
TokenHow
Token ID2437
Feature activation-6.98e-02

Pos logit contributions
 twin+0.163
 best+0.161
uffy+0.158
ours+0.156
 cousins+0.154
Neg logit contributions
 paddle-0.164
 steer-0.156
 branch-0.154
 swoop-0.154
 lift-0.150
Token can
Token ID460
Feature activation+1.13e-01

Pos logit contributions
He+0.147
 branch+0.144
 steer+0.144
 lift+0.140
 paddle+0.140
Neg logit contributions
 cousins-0.125
uffy-0.123
omm-0.120
 twin-0.120
ours-0.117
Token you
Token ID345
Feature activation+7.87e-02

Pos logit contributions
 twin+0.193
uffy+0.192
 cousins+0.189
 best+0.188
 sisters+0.182
Neg logit contributions
 swoop-0.243
He-0.235
 branch-0.235
 paddle-0.234
 steer-0.233
Token help
Token ID1037
Feature activation-1.62e-01

Pos logit contributions
 best+0.150
icking+0.143
 twin+0.139
 sisters+0.135
 cousins+0.133
Neg logit contributions
He-0.145
 swoop-0.145
 distance-0.138
 branch-0.137
 steer-0.134
Token?"
Token ID1701
Feature activation+1.61e-01

Pos logit contributions
 swoop+0.388
 steer+0.384
 paddle+0.383
He+0.380
 branch+0.376
Neg logit contributions
uffy-0.244
 twin-0.242
 cousins-0.235
ours-0.232
 best-0.232
Token
Token ID198
Feature activation+1.73e-02

Pos logit contributions
 cousins+0.319
uffy+0.318
 fault+0.314
 twin+0.310
ours+0.309
Neg logit contributions
He-0.311
 branch-0.301
 swoop-0.301
 steer-0.298
 paddle-0.297
Token
Token ID198
Feature activation+1.67e-02

Pos logit contributions
 fault+0.040
 cousins+0.040
uffy+0.038
ours+0.038
 twin+0.038
Neg logit contributions
He-0.030
 branch-0.027
 swoop-0.027
 steer-0.027
 reach-0.027
TokenM
Token ID44
Feature activation-4.90e-02

Pos logit contributions
uffy+0.036
 twin+0.035
 cousins+0.035
ours+0.035
 fault+0.034
Neg logit contributions
 swoop-0.029
 steer-0.029
 paddle-0.029
 branch-0.029
He-0.029
Tokenolly
Token ID5098
Feature activation+2.36e-01

Pos logit contributions
He+0.107
 branch+0.106
 reach+0.103
 swoop+0.100
 lift+0.099
Neg logit contributions
 cousins-0.078
 fault-0.075
ains-0.073
ained-0.071
 best-0.070
Token in
Token ID287
Feature activation+6.45e-02

Pos logit contributions
 swoop+0.246
 paddle+0.239
 steer+0.239
He+0.235
 branch+0.232
Neg logit contributions
uffy-0.176
 twin-0.174
 best-0.172
 cousins-0.171
apped-0.165
Token the
Token ID262
Feature activation-2.47e-02

Pos logit contributions
 twin+0.107
 cousins+0.104
 best+0.102
uffy+0.101
 twins+0.099
Neg logit contributions
 swoop-0.145
 paddle-0.141
 steer-0.139
He-0.137
 branch-0.132
Token park
Token ID3952
Feature activation+1.65e-02

Pos logit contributions
 swoop+0.047
 steer+0.047
 paddle+0.046
He+0.046
 branch+0.046
Neg logit contributions
uffy-0.049
 twin-0.049
 cousins-0.048
ours-0.048
 fault-0.047
Token.
Token ID13
Feature activation-6.76e-02

Pos logit contributions
 twin+0.033
uffy+0.032
 best+0.032
 cousins+0.032
ours+0.032
Neg logit contributions
 swoop-0.032
 steer-0.031
 paddle-0.031
He-0.030
 branch-0.030
Token He
Token ID679
Feature activation-2.85e-01

Pos logit contributions
 swoop+0.106
 steer+0.099
 paddle+0.097
 branch+0.095
He+0.090
Neg logit contributions
 twin-0.161
ours-0.158
 fault-0.157
omm-0.154
uffy-0.153
Token saw
Token ID2497
Feature activation-7.01e-02

Pos logit contributions
 swoop+0.561
 steer+0.552
 paddle+0.546
 branch+0.546
He+0.546
Neg logit contributions
uffy-0.555
 twin-0.551
 fault-0.550
ours-0.547
 cousins-0.546
Token all
Token ID477
Feature activation+1.71e-01

Pos logit contributions
 swoop+0.140
He+0.139
 steer+0.139
 branch+0.139
 paddle+0.137
Neg logit contributions
uffy-0.135
 twin-0.130
ours-0.130
 cousins-0.129
 fault-0.127
Token kinds
Token ID6982
Feature activation+1.08e-01

Pos logit contributions
 twin+0.282
uffy+0.275
ours+0.273
 cousins+0.270
 best+0.270
Neg logit contributions
 steer-0.378
 swoop-0.377
 paddle-0.375
He-0.365
 branch-0.364
Token of
Token ID286
Feature activation-8.27e-03

Pos logit contributions
 twin+0.183
 cousins+0.178
 best+0.177
uffy+0.177
ours+0.173
Neg logit contributions
 swoop-0.239
 steer-0.238
 paddle-0.236
 branch-0.230
He-0.226
Token animals
Token ID4695
Feature activation+5.77e-02

Pos logit contributions
 steer+0.018
 swoop+0.018
 paddle+0.017
 branch+0.017
He+0.017
Neg logit contributions
 twin-0.014
uffy-0.014
 cousins-0.014
ours-0.014
 best-0.014
Token,
Token ID11
Feature activation+4.39e-02

Pos logit contributions
uffy+0.102
 twin+0.101
 best+0.101
 cousins+0.099
ours+0.099
Neg logit contributions
 steer-0.120
 swoop-0.120
 paddle-0.118
He-0.117
 branch-0.117
Token John
Token ID1757
Feature activation+1.42e-01

Pos logit contributions
uffy+0.277
 twin+0.275
 cousins+0.271
ours+0.268
 fault+0.264
Neg logit contributions
 swoop-0.371
 steer-0.370
 paddle-0.366
 branch-0.364
He-0.364
Token it
Token ID340
Feature activation-1.25e-01

Pos logit contributions
uffy+0.337
ours+0.331
 twin+0.328
 cousins+0.326
 fault+0.320
Neg logit contributions
He-0.216
 steer-0.214
 swoop-0.214
 paddle-0.210
 branch-0.208
Token was
Token ID373
Feature activation+1.79e-01

Pos logit contributions
He+0.276
 swoop+0.272
 steer+0.269
 branch+0.267
 paddle+0.265
Neg logit contributions
ours-0.209
uffy-0.205
 cousins-0.204
 twin-0.197
apped-0.195
Token foolish
Token ID19538
Feature activation+1.97e-03

Pos logit contributions
uffy+0.382
ours+0.379
apped+0.366
 cousins+0.359
 twin+0.356
Neg logit contributions
 swoop-0.329
 branch-0.325
 steer-0.320
He-0.318
 paddle-0.313
Token to
Token ID284
Feature activation+2.61e-02

Pos logit contributions
uffy+0.004
ours+0.004
 twin+0.004
 cousins+0.004
omm+0.004
Neg logit contributions
 steer-0.003
He-0.003
 swoop-0.003
 branch-0.003
 paddle-0.003
Token touch
Token ID3638
Feature activation-3.58e-01

Pos logit contributions
uffy+0.056
 fault+0.056
icking+0.055
 best+0.055
 twin+0.055
Neg logit contributions
He-0.041
 swoop-0.037
 distance-0.037
 branch-0.035
 steer-0.034
Token the
Token ID262
Feature activation-4.71e-03

Pos logit contributions
 steer+0.657
 branch+0.649
He+0.644
 swoop+0.643
 paddle+0.640
Neg logit contributions
uffy-0.745
ours-0.719
 cousins-0.715
 twin-0.713
 fault-0.700
Token screen
Token ID3159
Feature activation+7.38e-02

Pos logit contributions
 steer+0.010
He+0.010
 swoop+0.010
 branch+0.010
 paddle+0.010
Neg logit contributions
uffy-0.009
ours-0.008
 twin-0.008
 fault-0.008
 cousins-0.008
Token.
Token ID13
Feature activation-5.26e-02

Pos logit contributions
 twin+0.141
 best+0.137
uffy+0.135
 fault+0.134
 cousins+0.132
Neg logit contributions
 swoop-0.149
 paddle-0.144
 steer-0.144
He-0.141
 branch-0.139
Token John
Token ID1757
Feature activation+1.43e-01

Pos logit contributions
 swoop+0.102
 steer+0.101
 paddle+0.100
He+0.100
 branch+0.099
Neg logit contributions
uffy-0.103
 twin-0.102
 cousins-0.101
ours-0.100
 fault-0.099
Token was
Token ID373
Feature activation+1.79e-01

Pos logit contributions
uffy+0.255
 twin+0.251
 cousins+0.248
ours+0.247
 fault+0.244
Neg logit contributions
 swoop-0.301
 steer-0.300
 paddle-0.296
He-0.295
 branch-0.295
Token the
Token ID262
Feature activation-1.83e-02

Pos logit contributions
 swoop+0.140
 steer+0.139
 paddle+0.139
He+0.136
 branch+0.133
Neg logit contributions
 twin-0.141
uffy-0.140
 cousins-0.137
 best-0.136
ours-0.135
Token hallway
Token ID23959
Feature activation-9.33e-02

Pos logit contributions
 swoop+0.042
 steer+0.042
 paddle+0.042
He+0.040
 branch+0.040
Neg logit contributions
uffy-0.030
 twin-0.029
 cousins-0.029
 best-0.029
 fault-0.027
Token.
Token ID13
Feature activation-5.67e-02

Pos logit contributions
 steer+0.169
 swoop+0.169
 paddle+0.168
He+0.165
 branch+0.163
Neg logit contributions
 twin-0.192
uffy-0.192
 cousins-0.188
ours-0.186
 best-0.185
Token Then
Token ID3244
Feature activation+1.31e-01

Pos logit contributions
 branch+0.094
He+0.094
 swoop+0.091
 steer+0.091
 paddle+0.090
Neg logit contributions
uffy-0.125
 cousins-0.125
 twin-0.124
 fault-0.122
 best-0.121
Token she
Token ID673
Feature activation-1.74e-01

Pos logit contributions
uffy+0.288
 twin+0.285
 cousins+0.282
ours+0.281
 fault+0.277
Neg logit contributions
 swoop-0.223
 steer-0.222
 paddle-0.219
He-0.217
 branch-0.217
Token run
Token ID1057
Feature activation-3.25e-01

Pos logit contributions
 swoop+0.297
 steer+0.292
 paddle+0.288
He+0.287
 branch+0.287
Neg logit contributions
uffy-0.387
 twin-0.381
ours-0.380
 fault-0.379
 cousins-0.377
Token into
Token ID656
Feature activation-2.76e-02

Pos logit contributions
 swoop+0.752
 steer+0.748
 paddle+0.740
 branch+0.714
He+0.713
Neg logit contributions
uffy-0.489
 best-0.483
 twin-0.481
 cousins-0.475
 fault-0.457
Token the
Token ID262
Feature activation-1.74e-02

Pos logit contributions
 swoop+0.062
 paddle+0.061
 steer+0.061
He+0.060
 branch+0.058
Neg logit contributions
 twin-0.047
 cousins-0.045
uffy-0.044
 sisters-0.043
 best-0.043
Token kitchen
Token ID9592
Feature activation+2.42e-01

Pos logit contributions
He+0.043
 steer+0.043
 branch+0.043
 swoop+0.043
 paddle+0.043
Neg logit contributions
uffy-0.024
ours-0.024
 twin-0.024
 cousins-0.024
 fault-0.023
Token.
Token ID13
Feature activation-5.62e-02

Pos logit contributions
 twin+0.462
uffy+0.453
 cousins+0.448
 fault+0.441
 best+0.440
Neg logit contributions
 swoop-0.485
 steer-0.481
 paddle-0.476
He-0.470
 branch-0.454
Token In
Token ID554
Feature activation+1.68e-01

Pos logit contributions
He+0.101
 branch+0.101
 steer+0.100
 swoop+0.100
 paddle+0.099
Neg logit contributions
uffy-0.117
 twin-0.116
 cousins-0.115
 fault-0.114
ours-0.113
Token can
Token ID460
Feature activation+1.11e-01

Pos logit contributions
uffy+0.579
 fault+0.549
 cousins+0.546
apped+0.536
omm+0.532
Neg logit contributions
He-0.497
 steer-0.484
 branch-0.479
 lift-0.479
 reach-0.476
Token he
Token ID339
Feature activation-3.95e-01

Pos logit contributions
icking+0.229
 twin+0.223
uffy+0.223
 best+0.220
ours+0.217
Neg logit contributions
 swoop-0.205
 branch-0.189
 distance-0.182
He-0.176
 steer-0.172
Token go
Token ID467
Feature activation-1.05e-01

Pos logit contributions
 swoop+0.638
 steer+0.619
He+0.617
 branch+0.617
 paddle+0.609
Neg logit contributions
uffy-0.890
 twin-0.885
 best-0.876
 cousins-0.874
ours-0.864
Token for
Token ID329
Feature activation+2.70e-01

Pos logit contributions
 swoop+0.241
He+0.230
 branch+0.224
 current+0.223
 eagle+0.221
Neg logit contributions
 best-0.166
icking-0.153
uffy-0.147
 cousins-0.144
 twin-0.142
Token a
Token ID257
Feature activation+1.10e-02

Pos logit contributions
 best+0.376
 cousins+0.365
 twin+0.357
 twins+0.343
 fault+0.336
Neg logit contributions
 swoop-0.659
He-0.641
 steer-0.635
 paddle-0.635
 branch-0.609
Token walk
Token ID2513
Feature activation-3.38e-01

Pos logit contributions
 twin+0.028
uffy+0.028
 best+0.028
 cousins+0.027
 cousin+0.027
Neg logit contributions
He-0.015
 swoop-0.014
 steer-0.013
 branch-0.013
 reach-0.012
Token.
Token ID13
Feature activation-5.60e-02

Pos logit contributions
 swoop+0.742
 paddle+0.720
 steer+0.720
He+0.707
 branch+0.694
Neg logit contributions
 best-0.566
uffy-0.564
 twin-0.563
 cousins-0.541
icking-0.530
Token She
Token ID1375
Feature activation-3.49e-02

Pos logit contributions
He+0.096
 steer+0.091
 paddle+0.090
 swoop+0.088
 branch+0.087
Neg logit contributions
 cousins-0.125
uffy-0.124
 fault-0.123
ours-0.121
 twin-0.121
Token said
Token ID531
Feature activation+5.12e-03

Pos logit contributions
 swoop+0.058
 paddle+0.056
 steer+0.055
He+0.055
 branch+0.055
Neg logit contributions
 fault-0.083
uffy-0.078
ains-0.077
 cousins-0.077
ours-0.077
Token yes
Token ID3763
Feature activation-1.04e-01

Pos logit contributions
uffy+0.011
 twin+0.011
 cousins+0.011
ours+0.011
 fault+0.011
Neg logit contributions
 swoop-0.009
 steer-0.009
 paddle-0.008
 branch-0.008
He-0.008
Token but
Token ID475
Feature activation-2.44e-01

Pos logit contributions
 swoop+0.210
 paddle+0.209
 steer+0.208
 branch+0.202
He+0.202
Neg logit contributions
uffy-0.195
 twin-0.195
 best-0.191
 cousins-0.190
 fault-0.189
Token looking
Token ID2045
Feature activation-4.35e-02

Pos logit contributions
 steer+0.100
 paddle+0.099
 swoop+0.098
He+0.097
 branch+0.097
Neg logit contributions
 best-0.090
 twin-0.090
 cousins-0.089
uffy-0.088
ours-0.086
Token,
Token ID11
Feature activation+5.07e-02

Pos logit contributions
 steer+0.116
 swoop+0.116
 paddle+0.115
 branch+0.114
He+0.114
Neg logit contributions
uffy-0.052
 twin-0.052
 cousins-0.051
 best-0.050
ours-0.050
Token until
Token ID1566
Feature activation-8.22e-02

Pos logit contributions
 best+0.114
 twin+0.113
 cousins+0.111
 fault+0.109
uffy+0.108
Neg logit contributions
 steer-0.080
 swoop-0.079
 paddle-0.079
He-0.076
 branch-0.075
Token finally
Token ID3443
Feature activation-1.14e-01

Pos logit contributions
 steer+0.089
 swoop+0.088
 paddle+0.085
He+0.084
 branch+0.084
Neg logit contributions
 twin-0.227
 best-0.227
uffy-0.226
 cousins-0.221
ours-0.219
Token,
Token ID11
Feature activation+5.15e-02

Pos logit contributions
 steer+0.144
 swoop+0.143
 paddle+0.141
 branch+0.140
He+0.139
Neg logit contributions
uffy-0.297
 twin-0.296
 cousins-0.292
 best-0.291
ours-0.290
Token he
Token ID339
Feature activation-3.96e-01

Pos logit contributions
 best+0.135
 twin+0.132
 cousins+0.130
uffy+0.129
icking+0.127
Neg logit contributions
 swoop-0.063
 steer-0.058
 branch-0.057
 paddle-0.056
He-0.056
Token spotted
Token ID13489
Feature activation-1.64e-01

Pos logit contributions
 swoop+0.731
 steer+0.717
He+0.710
 paddle+0.709
 branch+0.704
Neg logit contributions
uffy-0.825
 twin-0.811
ours-0.806
 cousins-0.805
 fault-0.801
Token her
Token ID607
Feature activation+9.01e-02

Pos logit contributions
 steer+0.327
 swoop+0.327
 paddle+0.323
He+0.320
 branch+0.319
Neg logit contributions
uffy-0.308
 twin-0.307
 cousins-0.302
ours-0.300
 fault-0.297
Token!
Token ID0
Feature activation-8.48e-02

Pos logit contributions
uffy+0.137
 twin+0.135
 cousins+0.133
ours+0.131
 fault+0.130
Neg logit contributions
 swoop-0.214
 steer-0.213
 paddle-0.211
He-0.210
 branch-0.210
Token
Token ID198
Feature activation+2.73e-02

Pos logit contributions
 swoop+0.157
 steer+0.156
 paddle+0.153
 branch+0.151
He+0.148
Neg logit contributions
uffy-0.174
 twin-0.173
ours-0.170
 cousins-0.169
 fault-0.165
Token
Token ID198
Feature activation+2.70e-02

Pos logit contributions
 twin+0.044
uffy+0.044
ours+0.042
 cousins+0.041
anny+0.041
Neg logit contributions
 swoop-0.061
 paddle-0.061
 steer-0.061
 branch-0.059
 lift-0.057
Token skin
Token ID4168
Feature activation+3.21e-02

Pos logit contributions
uffy+0.033
 twin+0.033
ours+0.033
 cousins+0.032
 fault+0.032
Neg logit contributions
 swoop-0.029
 steer-0.028
 branch-0.028
 paddle-0.028
He-0.028
Token.
Token ID13
Feature activation-6.91e-02

Pos logit contributions
uffy+0.063
 twin+0.063
ours+0.062
 best+0.061
 fault+0.061
Neg logit contributions
 steer-0.061
 swoop-0.060
 paddle-0.060
He-0.059
 branch-0.058
Token But
Token ID887
Feature activation-1.10e-01

Pos logit contributions
He+0.126
 swoop+0.123
 steer+0.122
 branch+0.121
 paddle+0.121
Neg logit contributions
uffy-0.144
 cousins-0.141
 fault-0.140
 twin-0.139
ours-0.139
Token soon
Token ID2582
Feature activation-7.47e-02

Pos logit contributions
 swoop+0.192
 steer+0.192
 branch+0.190
 paddle+0.189
He+0.184
Neg logit contributions
uffy-0.237
 cousins-0.233
ours-0.231
 fault-0.231
 twin-0.228
Token,
Token ID11
Feature activation+3.37e-02

Pos logit contributions
 swoop+0.130
 steer+0.127
He+0.124
 branch+0.124
 paddle+0.124
Neg logit contributions
uffy-0.162
ours-0.161
 twin-0.159
 cousins-0.158
 fault-0.157
Token he
Token ID339
Feature activation-4.13e-01

Pos logit contributions
uffy+0.071
 twin+0.070
 cousins+0.069
ours+0.069
 fault+0.068
Neg logit contributions
 swoop-0.060
 steer-0.060
 paddle-0.059
He-0.058
 branch-0.058
Token felt
Token ID2936
Feature activation-1.46e-02

Pos logit contributions
 swoop+0.824
 steer+0.795
He+0.790
 paddle+0.787
 branch+0.777
Neg logit contributions
uffy-0.802
 cousins-0.785
ours-0.783
 twin-0.781
 fault-0.772
Token a
Token ID257
Feature activation+2.18e-03

Pos logit contributions
 steer+0.031
 swoop+0.031
 paddle+0.030
 branch+0.029
He+0.029
Neg logit contributions
 best-0.026
 twin-0.025
 fault-0.025
ours-0.024
 sisters-0.024
Token sting
Token ID24276
Feature activation-3.59e-01

Pos logit contributions
 best+0.005
 twin+0.005
 brother+0.004
uffy+0.004
 fault+0.004
Neg logit contributions
 steer-0.004
 swoop-0.004
He-0.004
 paddle-0.003
 branch-0.003
Token in
Token ID287
Feature activation+7.25e-02

Pos logit contributions
 steer+0.754
 swoop+0.743
 paddle+0.742
 branch+0.724
He+0.723
Neg logit contributions
 twin-0.645
uffy-0.641
 best-0.620
 cousins-0.618
ours-0.613
Token his
Token ID465
Feature activation-1.63e-01

Pos logit contributions
 twin+0.135
 best+0.128
 twins+0.127
 fault+0.124
 sisters+0.123
Neg logit contributions
 paddle-0.142
 steer-0.141
 swoop-0.140
He-0.134
 branch-0.129