TOP ACTIVATIONS
MAX = 0.142

coat and boots when it r ained . <|endoftext|> Once upon
used her umbrella when it r ained and never let go
, it kept on flying err atically with inexplicable speed .
they decided never to go kay aking without help again !
to the lake to go kay aking again soon ! <|endoftext|>
building and some men started un loading boxes .
af o et ida and f en ug reek and mustard
for helping her load and un load the dishes . She
spread out their blankets and unp acked their delicious snacks .
poppy and nig ella and f enn el and an ise
helmets and climbed into the kay ak with their dad .
the sea . As he d ived deep , he saw
to the lake to go kay aking . Tim my was
mom and dad to go kay aking . Her mom and
water . He hopped and spl ashed around , having so
and his mom continued to un pack . Jack knew that
and Anne jumped in the kay ak and padd led away
John got in the kay ak and started padd ling
. She d ived and d ived but couldn 't find
They climbed into the kay ak and pushed it into

MIN ACTIVATIONS
MIN = -0.109

mom said , " Yes , but we need to sort
, " I am sorry , Amy ." But it was
mom said , " Yes , but be careful not to
Lily said , " Yes , I can serve you some
. " Yes , I do !" replied Charlie
M ummy said " Yes , of course you can Amy
mom said , " Yes , but be careful ." Amy
then said , " Yes , that 's a lovely idea
Tim my . " Yes , let 's go !" replied
asked Max . " Yes , please ," replied Ellie .
Tom asked . " Yes , let 's build the tower
Max replied , " Yes , I have a large letter
and said , " Yes , but first you have to
and said , " Yes , please !" They played on
Tim my . " Yes , we can climb the hill
" I don 't know , but we can try ,"
smiled and said " Yes , it does ! And it
just said , " No , this is my swing ."
. Mom said " Yes , here 's a perfect knife
Tom said , " Yes , come to my house ."

INTERVAL 0.080 - 0.142
CONTAINS 0.268%

went back outside with their sp ade and started to dig
wanted to play in the p uddle . They did not
She was happy when she pe ed on the toilet .
It tasted terrible and she rec o iled .
the leader of the group stepped forward . She

INTERVAL 0.017 - 0.080
CONTAINS 21.346%

It was a mine ! M andy quickly raced
pasta . They decided to take the pasta and run away
was so excited , but also a little anxious . He
Ben . I like to stay in the shallow end ."
arrived at their hotel , Tim my 's mom asked him

INTERVAL -0.046 - 0.017
CONTAINS 76.081%

Sara . I 'm glad you like them . Do you
named Tim . One day , Tim and Wednesday
to play in the park . She especially loved to sit
loved it very much . The mom my wanted
an enormous hole in the ground . They wondered what was

INTERVAL -0.109 - -0.046
CONTAINS 2.305%

" L ily and Tom , you did very well .
the lions . It 's okay . We can clean up
That was a great story , Benny . Thank you !"
. " Can we play together ?" Ben and
ll see a purple sunset !" The

TOKEN DETAILS

Tokencoat
Token ID31434
Feature activation+1.01e-02

Pos logit contributions
aked+0.053
 dressed+0.053
aled+0.052
ips+0.052
years+0.049
Neg logit contributions
 we-0.111
 salad-0.109
 difference-0.106
 forgive-0.104
 yeah-0.103
Token and
Token ID290
Feature activation-8.51e-03

Pos logit contributions
aked+0.021
ips+0.020
aled+0.020
 dressed+0.020
years+0.020
Neg logit contributions
 we-0.012
 salad-0.011
 difference-0.011
 forgive-0.011
 would-0.010
Token boots
Token ID14412
Feature activation+3.43e-02

Pos logit contributions
 we+0.013
 salad+0.012
 difference+0.011
 would+0.011
 forgive+0.011
Neg logit contributions
aked-0.015
ips-0.015
aled-0.015
 dressed-0.015
ang-0.014
Token when
Token ID618
Feature activation-1.60e-02

Pos logit contributions
aked+0.071
aled+0.070
 dressed+0.069
ips+0.069
years+0.067
Neg logit contributions
 we-0.040
 salad-0.038
 difference-0.036
 forgive-0.036
 would-0.035
Token it
Token ID340
Feature activation+2.06e-02

Pos logit contributions
 we+0.013
 salad+0.012
 difference+0.011
 yeah+0.010
 tastes+0.010
Neg logit contributions
aked-0.040
 dressed-0.039
aled-0.039
ips-0.039
 ped-0.038
Token r
Token ID374
Feature activation+1.42e-01

Pos logit contributions
aked+0.042
aled+0.041
 dressed+0.041
ips+0.041
years+0.040
Neg logit contributions
 we-0.026
 salad-0.023
 difference-0.023
 would-0.022
 tastes-0.022
Tokenained
Token ID1328
Feature activation+3.67e-02

Pos logit contributions
years+0.158
 ped+0.147
ga+0.147
 dressed+0.146
Char+0.145
Neg logit contributions
 pie-0.273
ue-0.263
 anymore-0.244
 we-0.244
?"-0.241
Token.
Token ID13
Feature activation+1.31e-03

Pos logit contributions
aked+0.079
 dressed+0.078
aled+0.078
ips+0.077
years+0.077
Neg logit contributions
 we-0.041
 salad-0.036
 difference-0.035
 forgive-0.034
 would-0.034
Token<|endoftext|>
Token ID50256
Feature activation+6.86e-03

Pos logit contributions
 dressed+0.002
ips+0.002
aked+0.002
aled+0.002
ained+0.002
Neg logit contributions
 salad-0.002
 sir-0.002
 we-0.002
fly-0.002
 slice-0.002
TokenOnce
Token ID7454
Feature activation-1.18e-02

Pos logit contributions
aked+0.016
 dressed+0.015
aled+0.015
ips+0.015
years+0.015
Neg logit contributions
 we-0.007
 salad-0.006
 difference-0.006
 forgive-0.006
 slice-0.005
Token upon
Token ID2402
Feature activation-4.12e-02

Pos logit contributions
 tastes+0.020
 difference+0.019
fly+0.019
 salad+0.019
 repay+0.019
Neg logit contributions
 dressed-0.018
ips-0.017
ang-0.016
inn-0.016
ained-0.016
Token used
Token ID973
Feature activation+3.33e-02

Pos logit contributions
ips+0.077
aled+0.076
angers+0.074
ang+0.074
rol+0.074
Neg logit contributions
 we-0.031
 would-0.030
 forgive-0.027
 remember-0.027
 moment-0.027
Token her
Token ID607
Feature activation+1.88e-02

Pos logit contributions
ang+0.071
aked+0.070
aled+0.069
ips+0.068
ga+0.068
Neg logit contributions
 we-0.037
 salad-0.036
 forgive-0.034
 difference-0.034
 forgiveness-0.033
Token umbrella
Token ID25510
Feature activation+2.34e-02

Pos logit contributions
aked+0.035
 dressed+0.035
aled+0.035
ips+0.035
years+0.034
Neg logit contributions
 we-0.026
 salad-0.025
 difference-0.024
 forgive-0.023
 yeah-0.023
Token when
Token ID618
Feature activation-2.67e-02

Pos logit contributions
aked+0.046
 dressed+0.046
ips+0.045
aled+0.045
 ped+0.044
Neg logit contributions
 we-0.029
 salad-0.029
 difference-0.028
 sir-0.026
 tastes-0.026
Token it
Token ID340
Feature activation+1.26e-02

Pos logit contributions
 difference+0.022
 salad+0.022
 we+0.021
 slice+0.021
 sir+0.021
Neg logit contributions
 dressed-0.063
aked-0.062
ips-0.061
 ped-0.060
aled-0.060
Token r
Token ID374
Feature activation+1.39e-01

Pos logit contributions
aked+0.028
 dressed+0.027
aled+0.027
ips+0.027
years+0.027
Neg logit contributions
 we-0.013
 salad-0.012
 difference-0.012
 tastes-0.011
 would-0.011
Tokenained
Token ID1328
Feature activation+3.60e-02

Pos logit contributions
 dressed+0.134
ga+0.133
years+0.123
 ped+0.121
 speakers+0.118
Neg logit contributions
 pie-0.289
ue-0.276
 salad-0.264
fly-0.252
flower-0.251
Token and
Token ID290
Feature activation+5.90e-03

Pos logit contributions
aked+0.076
 dressed+0.075
ips+0.074
aled+0.074
years+0.073
Neg logit contributions
 we-0.041
 salad-0.038
 difference-0.037
 forgive-0.035
 would-0.035
Token never
Token ID1239
Feature activation-5.36e-04

Pos logit contributions
 dressed+0.012
aked+0.012
aled+0.012
ips+0.012
 ped+0.011
Neg logit contributions
 we-0.007
 salad-0.007
 difference-0.007
 sir-0.007
 yeah-0.006
Token let
Token ID1309
Feature activation-2.28e-02

Pos logit contributions
 salad+0.001
 we+0.001
 difference+0.001
 yeah+0.001
 slice+0.001
Neg logit contributions
 dressed-0.001
aked-0.001
 ped-0.001
aled-0.001
ips-0.001
Token go
Token ID467
Feature activation+3.63e-02

Pos logit contributions
 salad+0.028
 we+0.027
 difference+0.027
 slice+0.027
 yeah+0.027
Neg logit contributions
 dressed-0.045
aled-0.043
aked-0.043
 ped-0.042
ips-0.042
Token,
Token ID11
Feature activation-4.38e-02

Pos logit contributions
 salad+0.022
 difference+0.022
 we+0.021
 slice+0.021
ccoli+0.020
Neg logit contributions
 dressed-0.041
ips-0.040
aked-0.040
 ped-0.039
aled-0.039
Token it
Token ID340
Feature activation-2.85e-03

Pos logit contributions
 difference+0.054
 salad+0.051
fly+0.051
ccoli+0.051
 sir+0.050
Neg logit contributions
 dressed-0.088
 ped-0.084
years-0.081
aked-0.081
aled-0.080
Token kept
Token ID4030
Feature activation+4.67e-02

Pos logit contributions
 we+0.004
 salad+0.004
 difference+0.004
 sir+0.004
 yeah+0.004
Neg logit contributions
aked-0.005
 dressed-0.005
ips-0.005
aled-0.005
 ped-0.005
Token on
Token ID319
Feature activation+2.02e-02

Pos logit contributions
aked+0.085
opped+0.084
aled+0.082
ips+0.082
ang+0.081
Neg logit contributions
 we-0.065
fly-0.064
 salad-0.063
 bird-0.063
 pie-0.063
Token flying
Token ID7348
Feature activation+3.32e-02

Pos logit contributions
ips+0.039
ang+0.039
aked+0.038
izz+0.038
aled+0.036
Neg logit contributions
 salad-0.029
 mushroom-0.028
 bird-0.028
 moment-0.028
 difference-0.027
Token err
Token ID11454
Feature activation+1.37e-01

Pos logit contributions
aked+0.056
 dressed+0.056
ips+0.055
aled+0.055
years+0.053
Neg logit contributions
 we-0.051
 salad-0.049
 difference-0.049
 forgive-0.046
 tastes-0.046
Tokenatically
Token ID4142
Feature activation+3.25e-02

Pos logit contributions
 dressed+0.184
 ped+0.178
 buck+0.168
rolled+0.164
ained+0.163
Neg logit contributions
 forgive-0.213
ue-0.209
 repay-0.198
 pie-0.198
 believe-0.194
Token with
Token ID351
Feature activation+5.08e-03

Pos logit contributions
aked+0.070
ips+0.068
aled+0.068
years+0.067
 dressed+0.066
Neg logit contributions
 we-0.037
 difference-0.035
 salad-0.035
fly-0.032
 tastes-0.031
Token inexplicable
Token ID49602
Feature activation+1.90e-02

Pos logit contributions
aked+0.009
ips+0.009
 dressed+0.009
aled+0.009
ang+0.009
Neg logit contributions
 we-0.007
 salad-0.007
 difference-0.007
 forgive-0.006
 tastes-0.006
Token speed
Token ID2866
Feature activation+1.96e-02

Pos logit contributions
aked+0.032
 dressed+0.031
aled+0.031
ips+0.031
 ped+0.030
Neg logit contributions
 we-0.030
 salad-0.029
 difference-0.028
 sir-0.027
 yeah-0.027
Token.
Token ID13
Feature activation-8.16e-03

Pos logit contributions
aked+0.043
ips+0.043
years+0.042
 dressed+0.042
aled+0.042
Neg logit contributions
 we-0.021
 would-0.019
 salad-0.018
 difference-0.017
fly-0.017
Token they
Token ID484
Feature activation+7.15e-03

Pos logit contributions
 salad+0.006
 difference+0.006
 we+0.005
 sir+0.005
 yeah+0.005
Neg logit contributions
 dressed-0.009
aked-0.009
 ped-0.009
aled-0.009
ips-0.008
Token decided
Token ID3066
Feature activation-1.45e-02

Pos logit contributions
aked+0.014
ips+0.014
 dressed+0.013
aled+0.013
 ped+0.013
Neg logit contributions
 we-0.010
 salad-0.009
 difference-0.009
 forgive-0.008
 slice-0.008
Token never
Token ID1239
Feature activation-3.09e-03

Pos logit contributions
 salad+0.017
 difference+0.017
 we+0.017
 slice+0.016
 yeah+0.016
Neg logit contributions
 dressed-0.029
aked-0.028
aled-0.028
 ped-0.028
ips-0.027
Token to
Token ID284
Feature activation+1.24e-02

Pos logit contributions
 salad+0.005
ue+0.004
 difference+0.004
 yeah+0.004
 slice+0.004
Neg logit contributions
 dressed-0.006
 ped-0.005
aked-0.005
aled-0.005
years-0.005
Token go
Token ID467
Feature activation+4.04e-02

Pos logit contributions
 dressed+0.026
 ped+0.026
aled+0.025
aked+0.025
ips+0.025
Neg logit contributions
 salad-0.014
 we-0.013
 yeah-0.012
 difference-0.012
 tastes-0.012
Token kay
Token ID34681
Feature activation+1.37e-01

Pos logit contributions
aked+0.075
 dressed+0.073
aled+0.072
ips+0.072
 ped+0.070
Neg logit contributions
 we-0.058
 salad-0.055
 difference-0.054
 forgive-0.051
 sir-0.051
Tokenaking
Token ID868
Feature activation+2.20e-02

Pos logit contributions
ann+0.192
ga+0.188
itted+0.185
 dressed+0.180
angers+0.179
Neg logit contributions
 forgive-0.225
 difference-0.206
-0.203
 slice-0.203
 we-0.201
Token without
Token ID1231
Feature activation+2.42e-02

Pos logit contributions
aked+0.046
 dressed+0.046
ips+0.045
aled+0.045
 ped+0.044
Neg logit contributions
 we-0.025
 salad-0.024
 difference-0.024
 forgive-0.022
 sir-0.022
Token help
Token ID1037
Feature activation-6.91e-03

Pos logit contributions
aked+0.048
ips+0.047
aled+0.047
ang+0.045
 dressed+0.045
Neg logit contributions
 we-0.033
 salad-0.031
 difference-0.030
 forgive-0.028
 sir-0.028
Token again
Token ID757
Feature activation-1.14e-02

Pos logit contributions
 salad+0.010
 yeah+0.009
 we+0.009
 slice+0.009
 tastes+0.009
Neg logit contributions
 dressed-0.013
ga-0.012
ips-0.012
 ped-0.011
aked-0.011
Token!
Token ID0
Feature activation+7.92e-03

Pos logit contributions
 we+0.014
 salad+0.014
 difference+0.014
 yeah+0.013
 sir+0.013
Neg logit contributions
 dressed-0.022
aked-0.022
ips-0.022
aled-0.022
 ped-0.021
Token to
Token ID284
Feature activation-1.41e-03

Pos logit contributions
aked+0.030
aled+0.030
ips+0.030
 dressed+0.029
years+0.029
Neg logit contributions
 we-0.019
 salad-0.017
 difference-0.017
 forgive-0.017
 sir-0.016
Token the
Token ID262
Feature activation+2.05e-02

Pos logit contributions
 we+0.002
 salad+0.002
 difference+0.002
 forgive+0.002
 yeah+0.002
Neg logit contributions
aked-0.003
aled-0.003
ips-0.003
 dressed-0.003
ang-0.003
Token lake
Token ID13546
Feature activation+1.40e-02

Pos logit contributions
aked+0.045
aled+0.044
ips+0.044
itted+0.041
years+0.041
Neg logit contributions
 difference-0.029
 we-0.029
 bird-0.027
 end-0.026
 salad-0.025
Token to
Token ID284
Feature activation+1.64e-02

Pos logit contributions
 dressed+0.027
aked+0.026
ips+0.025
 ped+0.025
aled+0.025
Neg logit contributions
 we-0.018
 salad-0.018
 difference-0.017
 yeah-0.017
 tastes-0.017
Token go
Token ID467
Feature activation+5.44e-02

Pos logit contributions
aked+0.040
ips+0.038
aled+0.038
 dressed+0.038
years+0.037
Neg logit contributions
 we-0.015
 difference-0.014
 salad-0.014
 forgive-0.014
 yeah-0.013
Token kay
Token ID34681
Feature activation+1.37e-01

Pos logit contributions
aked+0.112
aled+0.109
years+0.106
ips+0.106
itted+0.103
Neg logit contributions
 we-0.076
 forgive-0.064
 difference-0.063
 salad-0.063
 would-0.062
Tokenaking
Token ID868
Feature activation+3.31e-02

Pos logit contributions
ga+0.190
itted+0.185
ann+0.184
 ped+0.183
aled+0.182
Neg logit contributions
 forgive-0.223
-0.205
 we-0.204
 promise-0.201
 slice-0.200
Token again
Token ID757
Feature activation+3.68e-03

Pos logit contributions
aked+0.068
aled+0.066
ips+0.065
years+0.065
 dressed+0.064
Neg logit contributions
 we-0.043
 salad-0.040
 difference-0.039
 tastes-0.036
 forgive-0.036
Token soon
Token ID2582
Feature activation-1.04e-02

Pos logit contributions
aked+0.008
aled+0.008
 dressed+0.008
ips+0.008
 ped+0.007
Neg logit contributions
 we-0.004
 salad-0.004
 difference-0.004
 sir-0.004
 forgive-0.004
Token!
Token ID0
Feature activation+6.74e-03

Pos logit contributions
 we+0.013
 salad+0.012
 difference+0.012
 yeah+0.011
 sir+0.011
Neg logit contributions
aked-0.021
 dressed-0.021
ips-0.020
aled-0.020
 ped-0.020
Token<|endoftext|>
Token ID50256
Feature activation+1.96e-02

Pos logit contributions
 dressed+0.013
aked+0.013
ips+0.013
aled+0.013
ang+0.012
Neg logit contributions
 salad-0.008
 we-0.008
 difference-0.008
 tastes-0.008
 sir-0.008
Token building
Token ID2615
Feature activation+1.60e-02

Pos logit contributions
 we+0.009
 salad+0.009
 difference+0.009
 yeah+0.008
 tastes+0.008
Neg logit contributions
aked-0.012
 dressed-0.012
aled-0.011
ips-0.011
 ped-0.011
Token and
Token ID290
Feature activation+4.73e-04

Pos logit contributions
aked+0.031
 dressed+0.031
ips+0.031
aled+0.031
years+0.030
Neg logit contributions
 we-0.021
 salad-0.020
 difference-0.019
 sir-0.018
 slice-0.018
Token some
Token ID617
Feature activation-1.16e-02

Pos logit contributions
 dressed+0.001
aked+0.001
aled+0.001
 ped+0.001
ips+0.001
Neg logit contributions
 salad-0.001
 we-0.001
 difference-0.001
 yeah-0.001
 sir-0.001
Token men
Token ID1450
Feature activation+1.05e-02

Pos logit contributions
 we+0.016
 salad+0.016
 difference+0.016
 forgive+0.015
 slice+0.015
Neg logit contributions
aked-0.022
ips-0.022
aled-0.021
 dressed-0.021
years-0.021
Token started
Token ID2067
Feature activation+4.92e-02

Pos logit contributions
aked+0.019
 dressed+0.019
ips+0.019
aled+0.019
 ped+0.018
Neg logit contributions
 we-0.015
 salad-0.014
 difference-0.014
 forgive-0.013
 sir-0.013
Token un
Token ID555
Feature activation+1.36e-01

Pos logit contributions
aked+0.125
aled+0.125
ang+0.123
ips+0.123
years+0.122
Neg logit contributions
 we-0.041
 pie-0.033
 bird-0.030
 salad-0.030
 they-0.030
Tokenloading
Token ID25138
Feature activation+2.52e-02

Pos logit contributions
ips+0.221
ained+0.217
itted+0.211
aled+0.207
aked+0.198
Neg logit contributions
 we-0.204
keeper-0.196
man-0.194
 slice-0.185
 sir-0.185
Token boxes
Token ID10559
Feature activation+3.07e-02

Pos logit contributions
years+0.050
aked+0.049
aled+0.049
ips+0.048
 dressed+0.047
Neg logit contributions
 we-0.034
 salad-0.033
 difference-0.031
 tastes-0.031
 slice-0.030
Token.
Token ID13
Feature activation+7.09e-03

Pos logit contributions
aked+0.072
aled+0.071
years+0.071
ips+0.069
ang+0.067
Neg logit contributions
 we-0.032
 salad-0.027
 sir-0.025
 difference-0.025
 would-0.024
Token\n
Token ID198
Feature activation+7.73e-03

Pos logit contributions
aked+0.015
 dressed+0.015
ips+0.015
aled+0.015
 ped+0.015
Neg logit contributions
 we-0.008
 salad-0.008
 difference-0.007
 sir-0.007
 yeah-0.007
Token\n
Token ID198
Feature activation+9.06e-03

Pos logit contributions
aked+0.017
 dressed+0.017
ips+0.016
aled+0.016
 ped+0.016
Neg logit contributions
 we-0.008
 salad-0.008
 difference-0.008
 sir-0.007
 yeah-0.007
Tokenaf
Token ID1878
Feature activation+2.37e-02

Pos logit contributions
aked+0.019
aled+0.019
 dressed+0.019
ips+0.019
 ped+0.018
Neg logit contributions
 we-0.011
 salad-0.011
 difference-0.010
 sir-0.010
 forgive-0.010
Tokeno
Token ID78
Feature activation+2.39e-02

Pos logit contributions
ips+0.038
 dressed+0.038
aked+0.038
aled+0.038
 ped+0.037
Neg logit contributions
 we-0.038
 salad-0.035
 tastes-0.034
 difference-0.034
 sir-0.034
Tokenet
Token ID316
Feature activation+4.92e-02

Pos logit contributions
aked+0.045
aled+0.045
 dressed+0.044
ips+0.044
years+0.043
Neg logit contributions
 we-0.032
 salad-0.030
 difference-0.028
 would-0.028
 tastes-0.028
Tokenida
Token ID3755
Feature activation+1.59e-02

Pos logit contributions
 dressed+0.074
aked+0.074
aled+0.074
ips+0.073
years+0.071
Neg logit contributions
 we-0.082
 salad-0.078
 would-0.075
 difference-0.074
 sir-0.074
Token and
Token ID290
Feature activation-8.01e-03

Pos logit contributions
aked+0.029
 dressed+0.029
ips+0.029
aled+0.029
 ped+0.028
Neg logit contributions
 we-0.022
 salad-0.021
 difference-0.020
 tastes-0.019
 yeah-0.019
Token f
Token ID277
Feature activation+1.36e-01

Pos logit contributions
 we+0.012
 salad+0.012
 difference+0.012
 sir+0.011
 yeah+0.011
Neg logit contributions
aked-0.013
 dressed-0.013
ips-0.013
aled-0.013
 ped-0.013
Tokenen
Token ID268
Feature activation+2.76e-02

Pos logit contributions
 ped+0.150
 ramps+0.147
 winding+0.133
 dressed+0.131
 buck+0.130
Neg logit contributions
fly-0.252
affe-0.215
 tastes-0.214
flies-0.208
 yeah-0.204
Tokenug
Token ID1018
Feature activation+7.87e-02

Pos logit contributions
aked+0.049
 dressed+0.048
aled+0.048
ips+0.048
 ped+0.047
Neg logit contributions
 we-0.040
 salad-0.038
 difference-0.037
 tastes-0.036
 slice-0.035
Tokenreek
Token ID10316
Feature activation+3.11e-02

Pos logit contributions
 dressed+0.079
ips+0.078
years+0.076
aked+0.073
aled+0.072
Neg logit contributions
 we-0.162
fly-0.161
 would-0.156
 moment-0.152
 bird-0.149
Token and
Token ID290
Feature activation+2.95e-03

Pos logit contributions
aked+0.061
aled+0.061
ips+0.060
years+0.060
 dressed+0.059
Neg logit contributions
 we-0.040
 salad-0.036
 difference-0.035
 would-0.035
 forgive-0.034
Token mustard
Token ID33694
Feature activation+3.23e-02

Pos logit contributions
aked+0.005
ips+0.005
 dressed+0.005
aled+0.005
 ped+0.005
Neg logit contributions
 we-0.005
 salad-0.005
 difference-0.004
 tastes-0.004
 slice-0.004
Token for
Token ID329
Feature activation+1.34e-02

Pos logit contributions
 we+0.026
 salad+0.024
 difference+0.023
 forgive+0.023
 would+0.022
Neg logit contributions
aked-0.039
ips-0.039
aled-0.038
 dressed-0.038
years-0.038
Token helping
Token ID5742
Feature activation-4.63e-03

Pos logit contributions
aked+0.029
ang+0.028
ips+0.027
aled+0.026
rol+0.026
Neg logit contributions
 forgive-0.017
 we-0.017
 salad-0.016
 forgiveness-0.016
 slice-0.015
Token her
Token ID607
Feature activation+3.70e-05

Pos logit contributions
 we+0.006
 salad+0.006
 difference+0.006
 forgive+0.005
 sir+0.005
Neg logit contributions
aked-0.009
aled-0.009
 dressed-0.009
ips-0.009
 ped-0.009
Token load
Token ID3440
Feature activation+5.53e-02

Pos logit contributions
aked+0.000
 dressed+0.000
ips+0.000
aled+0.000
 ped+0.000
Neg logit contributions
 we-0.000
 salad-0.000
 difference-0.000
 sir-0.000
 forgive-0.000
Token and
Token ID290
Feature activation+2.28e-02

Pos logit contributions
aled+0.119
years+0.118
aked+0.118
 ped+0.117
ang+0.116
Neg logit contributions
 we-0.063
 salad-0.059
 tastes-0.054
 difference-0.052
 sir-0.052
Token un
Token ID555
Feature activation+1.36e-01

Pos logit contributions
aked+0.042
ips+0.041
 dressed+0.041
aled+0.041
years+0.040
Neg logit contributions
 we-0.033
 salad-0.031
 difference-0.030
 forgive-0.029
 slice-0.029
Tokenload
Token ID2220
Feature activation+2.68e-02

Pos logit contributions
ained+0.250
itted+0.237
arted+0.236
agging+0.235
rol+0.231
Neg logit contributions
 we-0.170
man-0.167
 slice-0.165
keeper-0.156
 forgive-0.147
Token the
Token ID262
Feature activation+1.79e-02

Pos logit contributions
aked+0.058
aled+0.057
years+0.056
 ped+0.056
 dressed+0.055
Neg logit contributions
 we-0.032
 salad-0.029
 difference-0.027
 tastes-0.027
 sir-0.026
Token dishes
Token ID16759
Feature activation+3.26e-02

Pos logit contributions
aled+0.037
aked+0.037
ips+0.036
 ped+0.036
years+0.036
Neg logit contributions
 we-0.023
 salad-0.023
 difference-0.022
 tastes-0.021
 slice-0.020
Token.
Token ID13
Feature activation-1.21e-03

Pos logit contributions
aked+0.072
aled+0.072
years+0.070
ips+0.069
 dressed+0.069
Neg logit contributions
 we-0.037
 salad-0.034
 difference-0.031
 slice-0.030
 sir-0.030
Token She
Token ID1375
Feature activation+1.40e-02

Pos logit contributions
 we+0.001
 salad+0.001
 difference+0.001
 sir+0.001
 yeah+0.001
Neg logit contributions
aked-0.003
 dressed-0.002
aled-0.002
ips-0.002
 ped-0.002
Token spread
Token ID4104
Feature activation+3.78e-02

Pos logit contributions
 we+0.006
 salad+0.006
 would+0.005
 forgive+0.005
 slice+0.005
Neg logit contributions
ips-0.010
aked-0.010
aled-0.009
ang-0.009
ga-0.009
Token out
Token ID503
Feature activation+2.68e-02

Pos logit contributions
aked+0.063
ips+0.062
 dressed+0.062
aled+0.062
years+0.060
Neg logit contributions
 we-0.059
 salad-0.058
 difference-0.056
 slice-0.054
 tastes-0.053
Token their
Token ID511
Feature activation+1.67e-02

Pos logit contributions
aked+0.053
ga+0.053
ang+0.052
ips+0.052
years+0.052
Neg logit contributions
 salad-0.035
 we-0.034
 difference-0.031
 slice-0.030
 tastes-0.030
Token blankets
Token ID34794
Feature activation+4.01e-02

Pos logit contributions
aked+0.030
 dressed+0.029
ips+0.029
aled+0.029
 ped+0.028
Neg logit contributions
 we-0.024
 salad-0.023
 difference-0.023
 sir-0.022
 yeah-0.022
Token and
Token ID290
Feature activation-9.13e-03

Pos logit contributions
aked+0.087
aled+0.087
years+0.085
ips+0.084
 dressed+0.083
Neg logit contributions
 we-0.045
 salad-0.042
 difference-0.039
 slice-0.038
 would-0.037
Token unp
Token ID8593
Feature activation+1.36e-01

Pos logit contributions
 we+0.014
 salad+0.014
 difference+0.014
 yeah+0.013
 sir+0.013
Neg logit contributions
 dressed-0.015
aked-0.015
aled-0.015
ips-0.015
 ped-0.014
Tokenacked
Token ID6021
Feature activation+2.82e-02

Pos logit contributions
 ped+0.222
 training+0.218
 buck+0.198
years+0.195
 Fay+0.191
Neg logit contributions
ue-0.199
-0.181
man-0.177
keeper-0.169
ally-0.168
Token their
Token ID511
Feature activation+3.33e-03

Pos logit contributions
aked+0.048
ips+0.047
aled+0.047
 dressed+0.047
years+0.046
Neg logit contributions
 we-0.044
 salad-0.043
 difference-0.041
 tastes-0.040
 slice-0.039
Token delicious
Token ID12625
Feature activation-4.54e-03

Pos logit contributions
aked+0.006
 dressed+0.006
ips+0.006
aled+0.006
 ped+0.006
Neg logit contributions
 we-0.005
 salad-0.004
 difference-0.004
 yeah-0.004
 sir-0.004
Token snacks
Token ID28487
Feature activation+1.42e-02

Pos logit contributions
 we+0.005
 salad+0.005
 difference+0.005
 sir+0.004
 yeah+0.004
Neg logit contributions
aked-0.010
 dressed-0.010
ips-0.009
aled-0.009
 ped-0.009
Token.
Token ID13
Feature activation+1.27e-02

Pos logit contributions
aked+0.029
 dressed+0.029
aled+0.029
ips+0.028
 ped+0.028
Neg logit contributions
 we-0.017
 salad-0.016
 difference-0.016
 sir-0.015
 slice-0.015
Token poppy
Token ID47867
Feature activation+1.88e-02

Pos logit contributions
 we+0.003
 salad+0.003
 difference+0.003
 sir+0.003
 yeah+0.003
Neg logit contributions
aked-0.004
 dressed-0.004
aled-0.003
ips-0.003
 ped-0.003
Token and
Token ID290
Feature activation+1.01e-03

Pos logit contributions
 dressed+0.035
aked+0.035
ips+0.034
aled+0.033
 ped+0.033
Neg logit contributions
 we-0.026
 salad-0.025
 difference-0.025
 yeah-0.024
 sir-0.023
Token nig
Token ID44873
Feature activation+6.27e-02

Pos logit contributions
aked+0.002
 dressed+0.002
aled+0.002
ips+0.002
 ped+0.002
Neg logit contributions
 we-0.002
 salad-0.001
 difference-0.001
 sir-0.001
 yeah-0.001
Tokenella
Token ID12627
Feature activation+7.17e-03

Pos logit contributions
 dressed+0.089
aked+0.089
aled+0.087
ips+0.087
 ped+0.084
Neg logit contributions
 we-0.112
 salad-0.106
 difference-0.105
 sir-0.104
 yeah-0.103
Token and
Token ID290
Feature activation+7.96e-03

Pos logit contributions
aked+0.014
 dressed+0.014
ips+0.014
aled+0.014
 ped+0.014
Neg logit contributions
 we-0.009
 salad-0.008
 difference-0.008
 sir-0.008
 yeah-0.008
Token f
Token ID277
Feature activation+1.36e-01

Pos logit contributions
aked+0.015
 dressed+0.015
aled+0.015
ips+0.015
 ped+0.015
Neg logit contributions
 we-0.010
 salad-0.010
 difference-0.010
 sir-0.009
 yeah-0.009
Tokenenn
Token ID1697
Feature activation+6.24e-02

Pos logit contributions
 ramps+0.176
 winding+0.165
 ped+0.160
 convey+0.158
 escal+0.155
Neg logit contributions
fly-0.228
 yeah-0.212
 why-0.207
 we-0.203
 forgive-0.200
Tokenel
Token ID417
Feature activation+3.90e-02

Pos logit contributions
 dressed+0.081
aled+0.077
 ped+0.077
aked+0.077
 ramps+0.076
Neg logit contributions
 we-0.114
fly-0.107
 tastes-0.105
 would-0.105
 yeah-0.105
Token and
Token ID290
Feature activation+6.54e-03

Pos logit contributions
aked+0.077
 dressed+0.077
ips+0.075
aled+0.075
 ped+0.073
Neg logit contributions
 we-0.049
 salad-0.046
 difference-0.045
 sir-0.043
 tastes-0.043
Token an
Token ID281
Feature activation+3.07e-02

Pos logit contributions
aked+0.012
 dressed+0.012
ips+0.012
aled+0.012
 ped+0.012
Neg logit contributions
 we-0.009
 salad-0.008
 difference-0.008
 sir-0.008
 yeah-0.008
Tokenise
Token ID786
Feature activation+3.49e-02

Pos logit contributions
ips+0.060
 dressed+0.058
iddling+0.057
aked+0.057
angers+0.056
Neg logit contributions
 we-0.037
 difference-0.037
 sir-0.035
 bird-0.034
 tastes-0.034
Token helmets
Token ID28359
Feature activation+1.45e-02

Pos logit contributions
ips+0.032
years+0.032
rol+0.031
ang+0.030
uff+0.030
Neg logit contributions
 we-0.026
 would-0.025
 salad-0.023
 sandwich-0.022
 forget-0.022
Token and
Token ID290
Feature activation+4.52e-03

Pos logit contributions
aked+0.031
years+0.030
ips+0.030
aled+0.029
ang+0.029
Neg logit contributions
 we-0.017
 would-0.016
 forgive-0.015
 salad-0.015
 sir-0.014
Token climbed
Token ID19952
Feature activation+7.39e-02

Pos logit contributions
ips+0.008
years+0.008
aked+0.008
ang+0.008
aled+0.007
Neg logit contributions
 we-0.007
 salad-0.007
 would-0.007
 forgive-0.006
 forget-0.006
Token into
Token ID656
Feature activation+1.73e-02

Pos logit contributions
years+0.159
aked+0.146
iddling+0.145
ang+0.143
itted+0.142
Neg logit contributions
 we-0.100
 would-0.088
 salad-0.087
 they-0.085
 moment-0.084
Token the
Token ID262
Feature activation+3.15e-02

Pos logit contributions
aked+0.044
aled+0.042
itted+0.041
ained+0.041
years+0.041
Neg logit contributions
 we-0.016
 salad-0.014
 difference-0.013
 slice-0.013
 moment-0.013
Token kay
Token ID34681
Feature activation+1.36e-01

Pos logit contributions
years+0.069
aked+0.066
ips+0.065
aled+0.061
itted+0.061
Neg logit contributions
 difference-0.045
 we-0.042
 salad-0.041
 bird-0.041
 end-0.038
Tokenak
Token ID461
Feature activation+2.24e-02

Pos logit contributions
 dressed+0.227
years+0.220
pered+0.211
arted+0.210
 ped+0.210
Neg logit contributions
-0.182
 forgive-0.179
man-0.175
 slice-0.173
 sir-0.171
Token with
Token ID351
Feature activation+1.08e-02

Pos logit contributions
years+0.052
aked+0.052
ips+0.051
aled+0.051
 dressed+0.050
Neg logit contributions
 we-0.021
 salad-0.020
 difference-0.018
 sir-0.018
 slice-0.017
Token their
Token ID511
Feature activation+1.02e-02

Pos logit contributions
aked+0.022
ips+0.022
ang+0.022
ga+0.021
years+0.021
Neg logit contributions
 we-0.013
 salad-0.013
 difference-0.012
 forgive-0.012
 slice-0.011
Token dad
Token ID9955
Feature activation-9.82e-03

Pos logit contributions
aked+0.020
aled+0.020
 dressed+0.020
ips+0.020
years+0.019
Neg logit contributions
 we-0.014
 salad-0.013
 difference-0.013
 sir-0.012
 forgive-0.012
Token.
Token ID13
Feature activation+7.46e-03

Pos logit contributions
 we+0.011
 salad+0.010
 difference+0.010
 sir+0.009
 slice+0.009
Neg logit contributions
aked-0.021
aled-0.021
ips-0.021
 dressed-0.021
years-0.020
Token the
Token ID262
Feature activation+1.83e-02

Pos logit contributions
aked+0.030
aled+0.029
ips+0.029
years+0.029
ga+0.028
Neg logit contributions
 we-0.015
 salad-0.014
 difference-0.014
 moment-0.013
 forgive-0.013
Token sea
Token ID5417
Feature activation+5.29e-03

Pos logit contributions
aked+0.032
ips+0.032
years+0.031
aled+0.031
 dressed+0.031
Neg logit contributions
 we-0.028
 salad-0.028
 difference-0.027
 sandwich-0.025
 tastes-0.025
Token.
Token ID13
Feature activation+2.61e-02

Pos logit contributions
 dressed+0.010
aked+0.010
ips+0.009
 ped+0.009
aled+0.009
Neg logit contributions
 we-0.007
 salad-0.007
 difference-0.007
 yeah-0.007
 tastes-0.006
Token As
Token ID1081
Feature activation+5.26e-03

Pos logit contributions
aked+0.052
 dressed+0.052
ips+0.051
aled+0.051
years+0.050
Neg logit contributions
 we-0.032
 salad-0.031
 difference-0.030
 sir-0.029
 yeah-0.029
Token he
Token ID339
Feature activation+5.19e-02

Pos logit contributions
aked+0.015
ips+0.015
aled+0.015
 dressed+0.014
ga+0.014
Neg logit contributions
 we-0.003
 salad-0.002
 difference-0.002
 sir-0.002
 would-0.002
Token d
Token ID288
Feature activation+1.35e-01

Pos logit contributions
ips+0.119
years+0.115
ga+0.112
aled+0.112
pered+0.112
Neg logit contributions
 would-0.059
 we-0.056
 salad-0.047
 may-0.046
 moment-0.046
Tokenived
Token ID1572
Feature activation+3.87e-02

Pos logit contributions
years+0.210
 dressed+0.207
ind+0.201
 buck+0.199
 CDs+0.198
Neg logit contributions
 salad-0.179
 pie-0.175
ue-0.170
affe-0.169
 please-0.165
Token deep
Token ID2769
Feature activation+5.92e-02

Pos logit contributions
aked+0.064
ips+0.063
 dressed+0.063
aled+0.063
years+0.061
Neg logit contributions
 we-0.062
 salad-0.060
 difference-0.057
 sir-0.056
 tastes-0.055
Token,
Token ID11
Feature activation-8.45e-04

Pos logit contributions
years+0.106
aked+0.106
ips+0.103
aled+0.101
idget+0.099
Neg logit contributions
 we-0.093
 salad-0.089
 pie-0.077
 moment-0.077
 bird-0.077
Token he
Token ID339
Feature activation+3.17e-02

Pos logit contributions
 we+0.001
 salad+0.001
 difference+0.000
 sir+0.000
 yeah+0.000
Neg logit contributions
aked-0.002
 dressed-0.002
ips-0.002
aled-0.002
 ped-0.002
Token saw
Token ID2497
Feature activation+4.85e-03

Pos logit contributions
aked+0.068
ips+0.068
aled+0.067
 dressed+0.067
years+0.065
Neg logit contributions
 we-0.035
 salad-0.032
 difference-0.030
 would-0.029
 forgive-0.029
Token to
Token ID284
Feature activation-9.66e-03

Pos logit contributions
 we+0.006
 salad+0.006
 would+0.005
 forgive+0.005
 sandwich+0.005
Neg logit contributions
ips-0.020
ang-0.020
ga-0.020
years-0.020
aled-0.020
Token the
Token ID262
Feature activation-2.78e-03

Pos logit contributions
 we+0.013
 salad+0.013
 difference+0.012
 forgive+0.012
 sir+0.012
Neg logit contributions
aked-0.019
ips-0.018
 dressed-0.018
aled-0.018
years-0.017
Token lake
Token ID13546
Feature activation+2.02e-02

Pos logit contributions
 we+0.004
 salad+0.004
 difference+0.004
 sir+0.004
 forgive+0.004
Neg logit contributions
aked-0.005
 dressed-0.005
ips-0.005
aled-0.005
 ped-0.005
Token to
Token ID284
Feature activation+2.02e-02

Pos logit contributions
 dressed+0.040
aked+0.040
ips+0.039
aled+0.039
 ped+0.038
Neg logit contributions
 we-0.025
 salad-0.024
 difference-0.024
 yeah-0.023
 tastes-0.022
Token go
Token ID467
Feature activation+4.81e-02

Pos logit contributions
aked+0.045
ips+0.044
 dressed+0.044
aled+0.043
years+0.043
Neg logit contributions
 we-0.022
 salad-0.020
 difference-0.020
 forgive-0.019
 slice-0.018
Token kay
Token ID34681
Feature activation+1.35e-01

Pos logit contributions
aked+0.099
aled+0.098
years+0.098
ips+0.096
ga+0.093
Neg logit contributions
 we-0.067
 salad-0.057
 forgive-0.055
 would-0.054
 moment-0.053
Tokenaking
Token ID868
Feature activation+3.62e-02

Pos logit contributions
ann+0.202
years+0.200
 ped+0.200
ga+0.200
itted+0.197
Neg logit contributions
 forgive-0.199
-0.196
 we-0.191
 slice-0.188
man-0.184
Token.
Token ID13
Feature activation+1.85e-02

Pos logit contributions
years+0.089
ips+0.087
aked+0.087
aled+0.084
ga+0.083
Neg logit contributions
 salad-0.034
 we-0.034
 difference-0.028
 pie-0.027
 tastes-0.027
Token Tim
Token ID5045
Feature activation+3.40e-02

Pos logit contributions
 dressed+0.036
aked+0.036
aled+0.036
ips+0.035
 ped+0.035
Neg logit contributions
 we-0.023
 salad-0.023
 difference-0.022
 yeah-0.021
 sir-0.021
Tokenmy
Token ID1820
Feature activation+3.69e-02

Pos logit contributions
aked+0.083
 dressed+0.082
ips+0.082
aled+0.082
years+0.080
Neg logit contributions
 we-0.027
 salad-0.025
 difference-0.024
 tastes-0.022
 forgive-0.022
Token was
Token ID373
Feature activation+7.49e-02

Pos logit contributions
ips+0.093
aled+0.091
aked+0.091
years+0.089
 dressed+0.088
Neg logit contributions
 we-0.030
 would-0.025
 salad-0.025
 difference-0.023
 forgive-0.023
Token mom
Token ID1995
Feature activation-4.52e-03

Pos logit contributions
 we+0.009
 salad+0.009
 difference+0.009
 tastes+0.008
 yeah+0.008
Neg logit contributions
 dressed-0.027
aked-0.027
ips-0.027
aled-0.026
 ped-0.026
Token and
Token ID290
Feature activation-2.00e-02

Pos logit contributions
 we+0.006
 salad+0.005
 difference+0.005
 yeah+0.005
 tastes+0.005
Neg logit contributions
 dressed-0.009
aked-0.009
ips-0.009
 ped-0.008
ga-0.008
Token dad
Token ID9955
Feature activation-6.07e-03

Pos logit contributions
 difference+0.029
ue+0.026
fly+0.025
 tastes+0.025
affe+0.025
Neg logit contributions
 dressed-0.042
 ped-0.038
aked-0.035
 buck-0.035
years-0.035
Token to
Token ID284
Feature activation+6.95e-03

Pos logit contributions
 we+0.007
 salad+0.006
 difference+0.006
 slice+0.006
 forgive+0.006
Neg logit contributions
aled-0.013
aked-0.013
ips-0.013
years-0.013
 dressed-0.013
Token go
Token ID467
Feature activation+4.95e-02

Pos logit contributions
aked+0.015
 dressed+0.015
ips+0.015
aled+0.015
years+0.014
Neg logit contributions
 we-0.008
 salad-0.007
 difference-0.007
 forgive-0.007
 sir-0.007
Token kay
Token ID34681
Feature activation+1.34e-01

Pos logit contributions
aked+0.090
aled+0.090
years+0.088
ips+0.086
ga+0.083
Neg logit contributions
 we-0.079
 salad-0.069
 would-0.066
 forgive-0.065
 moment-0.065
Tokenaking
Token ID868
Feature activation+4.07e-02

Pos logit contributions
 heels+0.205
ga+0.202
 Fay+0.201
 ped+0.200
aled+0.198
Neg logit contributions
man-0.182
-0.181
 forgive-0.181
 we-0.175
 slice-0.171
Token.
Token ID13
Feature activation+7.81e-03

Pos logit contributions
aked+0.088
years+0.088
ips+0.088
aled+0.088
 dressed+0.083
Neg logit contributions
 we-0.047
 salad-0.045
 difference-0.041
 tastes-0.040
 pie-0.039
Token Her
Token ID2332
Feature activation+2.14e-02

Pos logit contributions
 dressed+0.015
aked+0.014
ips+0.014
aled+0.014
 ped+0.013
Neg logit contributions
 we-0.011
 difference-0.010
 salad-0.010
 tastes-0.010
 sir-0.010
Token mom
Token ID1995
Feature activation+1.94e-02

Pos logit contributions
aled+0.060
aked+0.060
years+0.059
ips+0.059
 dressed+0.057
Neg logit contributions
 we-0.011
 salad-0.009
 difference-0.009
 sir-0.008
 would-0.008
Token and
Token ID290
Feature activation-1.55e-02

Pos logit contributions
aked+0.038
 dressed+0.038
ips+0.037
aled+0.037
 ped+0.037
Neg logit contributions
 we-0.024
 salad-0.023
 difference-0.023
 yeah-0.021
 sir-0.021
Token water
Token ID1660
Feature activation+3.07e-02

Pos logit contributions
aled+0.057
aked+0.057
ips+0.055
years+0.054
 ped+0.054
Neg logit contributions
 salad-0.048
 we-0.047
 difference-0.047
 sandwich-0.044
 bird-0.043
Token.
Token ID13
Feature activation+1.66e-02

Pos logit contributions
aked+0.065
aled+0.064
 dressed+0.063
ips+0.063
years+0.062
Neg logit contributions
 we-0.035
 salad-0.034
 difference-0.032
 sir-0.030
 forgive-0.030
Token He
Token ID679
Feature activation+2.42e-02

Pos logit contributions
aked+0.031
 dressed+0.031
aled+0.030
ips+0.030
 ped+0.029
Neg logit contributions
 we-0.023
 salad-0.022
 difference-0.022
 sir-0.020
 yeah-0.020
Token hopped
Token ID45230
Feature activation+8.80e-02

Pos logit contributions
ips+0.052
aled+0.051
aked+0.051
ained+0.049
years+0.049
Neg logit contributions
 we-0.028
 would-0.028
 salad-0.024
 forget-0.023
 bird-0.023
Token and
Token ID290
Feature activation+3.83e-02

Pos logit contributions
years+0.175
aked+0.169
ips+0.169
ind+0.169
ang+0.165
Neg logit contributions
 we-0.116
 moment-0.109
 would-0.108
 salad-0.107
 forgive-0.101
Token spl
Token ID4328
Feature activation+1.34e-01

Pos logit contributions
ind+0.072
ips+0.072
years+0.070
aked+0.069
angers+0.069
Neg logit contributions
 we-0.059
 would-0.053
 moment-0.049
 salad-0.049
 they-0.048
Tokenashed
Token ID5263
Feature activation+5.01e-02

Pos logit contributions
years+0.235
 dressed+0.230
ind+0.227
 ped+0.220
Char+0.217
Neg logit contributions
 would-0.166
ue-0.160
 yeah-0.155
 slice-0.148
 we-0.146
Token around
Token ID1088
Feature activation+2.88e-02

Pos logit contributions
years+0.088
aked+0.086
ips+0.086
aled+0.085
 dressed+0.084
Neg logit contributions
 we-0.074
 salad-0.074
 moment-0.069
 would-0.069
 difference-0.068
Token,
Token ID11
Feature activation+1.08e-02

Pos logit contributions
aked+0.060
aled+0.059
 dressed+0.058
ips+0.058
years+0.056
Neg logit contributions
 we-0.036
 salad-0.034
 difference-0.032
 forgive-0.030
 moment-0.030
Token having
Token ID1719
Feature activation+4.02e-02

Pos logit contributions
aked+0.019
aled+0.019
ips+0.019
 dressed+0.019
years+0.018
Neg logit contributions
 we-0.016
 salad-0.015
 difference-0.015
 yeah-0.014
 forgive-0.014
Token so
Token ID523
Feature activation+4.48e-02

Pos logit contributions
aked+0.071
ips+0.071
aled+0.071
ang+0.070
 dressed+0.068
Neg logit contributions
 we-0.059
 salad-0.058
 difference-0.055
 forgive-0.053
 tastes-0.053
Token and
Token ID290
Feature activation-1.24e-02

Pos logit contributions
ips+0.044
aked+0.043
aled+0.042
 dressed+0.042
years+0.041
Neg logit contributions
 we-0.024
 salad-0.022
 difference-0.021
 forgive-0.021
 would-0.021
Token his
Token ID465
Feature activation-1.12e-02

Pos logit contributions
 we+0.023
 difference+0.023
 salad+0.022
 tastes+0.022
 sir+0.022
Neg logit contributions
 dressed-0.017
 ped-0.017
aked-0.016
aled-0.016
years-0.014
Token mom
Token ID1995
Feature activation+3.22e-02

Pos logit contributions
 we+0.008
 salad+0.007
 difference+0.007
 yeah+0.006
 tastes+0.006
Neg logit contributions
aked-0.028
 dressed-0.028
ips-0.028
aled-0.028
 ped-0.027
Token continued
Token ID3767
Feature activation+4.98e-02

Pos logit contributions
ips+0.067
aled+0.064
aked+0.062
angers+0.061
ang+0.061
Neg logit contributions
 would-0.041
 forgive-0.039
 we-0.039
 slice-0.036
 difference-0.036
Token to
Token ID284
Feature activation+4.33e-02

Pos logit contributions
idget+0.119
rol+0.119
ang+0.118
izz+0.116
ud+0.113
Neg logit contributions
 they-0.052
 we-0.051
 would-0.045
 moment-0.042
 he-0.038
Token un
Token ID555
Feature activation+1.33e-01

Pos logit contributions
aked+0.102
ips+0.100
years+0.098
aled+0.096
ang+0.096
Neg logit contributions
 we-0.046
 difference-0.043
 forgive-0.042
 salad-0.040
 forget-0.039
Tokenpack
Token ID8002
Feature activation+3.42e-02

Pos logit contributions
itted+0.214
ips+0.214
aked+0.212
ained+0.212
aled+0.206
Neg logit contributions
 we-0.207
man-0.186
 forgive-0.186
keeper-0.179
 slice-0.176
Token.
Token ID13
Feature activation+2.42e-02

Pos logit contributions
aked+0.073
aled+0.071
years+0.070
ga+0.070
ang+0.070
Neg logit contributions
 we-0.040
 salad-0.038
 difference-0.038
 tastes-0.034
 forgive-0.034
Token Jack
Token ID3619
Feature activation+3.99e-02

Pos logit contributions
aked+0.037
 dressed+0.037
aled+0.036
ips+0.036
 ped+0.035
Neg logit contributions
 we-0.041
 salad-0.040
 difference-0.039
 sir-0.038
 yeah-0.038
Token knew
Token ID2993
Feature activation-1.61e-02

Pos logit contributions
ips+0.096
aled+0.093
aked+0.091
ained+0.091
years+0.090
Neg logit contributions
 would-0.037
 we-0.035
 salad-0.029
 difference-0.028
 may-0.028
Token that
Token ID326
Feature activation-2.82e-02

Pos logit contributions
 we+0.015
 salad+0.015
 difference+0.014
 tastes+0.013
 slice+0.013
Neg logit contributions
 dressed-0.037
aked-0.036
ips-0.035
aled-0.035
 ped-0.035
Token and
Token ID290
Feature activation-2.91e-02

Pos logit contributions
ips+0.060
aked+0.060
 dressed+0.059
aled+0.059
years+0.058
Neg logit contributions
 we-0.033
 salad-0.031
 difference-0.030
 tastes-0.029
 forgive-0.029
Token Anne
Token ID15397
Feature activation+2.63e-02

Pos logit contributions
 difference+0.046
 salad+0.046
 sandwich+0.045
 pie+0.044
 slice+0.043
Neg logit contributions
 dressed-0.047
 ped-0.046
aled-0.046
aked-0.044
years-0.043
Token jumped
Token ID11687
Feature activation+8.44e-02

Pos logit contributions
ips+0.051
aked+0.050
ained+0.049
years+0.049
angers+0.048
Neg logit contributions
 we-0.035
 would-0.034
 forgive-0.033
 salad-0.031
 difference-0.031
Token in
Token ID287
Feature activation+1.25e-02

Pos logit contributions
years+0.195
aked+0.181
ips+0.167
ind+0.164
iddling+0.160
Neg logit contributions
 we-0.117
 they-0.098
 again-0.097
 salad-0.096
 thank-0.093
Token the
Token ID262
Feature activation+4.90e-02

Pos logit contributions
aked+0.026
aled+0.025
years+0.025
 dressed+0.025
ips+0.025
Neg logit contributions
 we-0.016
 difference-0.015
 salad-0.015
 slice-0.015
 moment-0.014
Token kay
Token ID34681
Feature activation+1.33e-01

Pos logit contributions
years+0.095
aked+0.092
ips+0.091
angers+0.087
idget+0.085
Neg logit contributions
 we-0.077
 bird-0.076
 difference-0.075
 end-0.069
-0.068
Tokenak
Token ID461
Feature activation+2.35e-02

Pos logit contributions
 dressed+0.223
years+0.208
pered+0.207
ga+0.203
 marching+0.202
Neg logit contributions
 slice-0.180
-0.173
 forgive-0.172
 difference-0.164
 sir-0.164
Token and
Token ID290
Feature activation+1.97e-02

Pos logit contributions
aked+0.054
years+0.053
aled+0.053
ips+0.052
 dressed+0.052
Neg logit contributions
 we-0.024
 salad-0.022
 difference-0.021
 sir-0.020
 would-0.020
Token padd
Token ID14098
Feature activation+7.17e-02

Pos logit contributions
aked+0.036
ips+0.035
years+0.034
aled+0.033
 dressed+0.032
Neg logit contributions
 we-0.032
 would-0.028
 salad-0.027
 forget-0.026
 tastes-0.026
Tokenled
Token ID992
Feature activation+4.53e-02

Pos logit contributions
years+0.107
 dressed+0.104
ips+0.101
angers+0.097
 ped+0.097
Neg logit contributions
 we-0.118
 would-0.114
 tastes-0.111
 difference-0.111
 moment-0.110
Token away
Token ID1497
Feature activation+2.03e-02

Pos logit contributions
years+0.081
aked+0.080
ips+0.079
ang+0.075
 dressed+0.075
Neg logit contributions
 we-0.072
 salad-0.065
 moment-0.064
 difference-0.063
 would-0.062
Token
Token ID198
Feature activation+2.33e-02

Pos logit contributions
aked+0.060
 dressed+0.059
ips+0.059
aled+0.059
years+0.057
Neg logit contributions
 we-0.026
 salad-0.024
 difference-0.023
 forgive-0.022
 sir-0.022
TokenJohn
Token ID7554
Feature activation+3.59e-02

Pos logit contributions
aked+0.041
 dressed+0.040
aled+0.039
ips+0.039
 ped+0.038
Neg logit contributions
 we-0.035
 salad-0.034
 difference-0.033
 sir-0.032
 yeah-0.032
Token got
Token ID1392
Feature activation+8.23e-02

Pos logit contributions
ips+0.083
aked+0.080
aled+0.079
years+0.078
ained+0.077
Neg logit contributions
 we-0.036
 would-0.033
 salad-0.031
 forgive-0.030
 difference-0.030
Token in
Token ID287
Feature activation+1.98e-02

Pos logit contributions
ips+0.163
ga+0.160
angers+0.159
ind+0.156
ang+0.156
Neg logit contributions
 we-0.110
 forgive-0.103
 salad-0.101
 pie-0.097
 bird-0.096
Token the
Token ID262
Feature activation+5.54e-02

Pos logit contributions
aked+0.043
years+0.043
aled+0.042
ips+0.042
ang+0.041
Neg logit contributions
 difference-0.022
 moment-0.022
 we-0.022
 salad-0.022
 slice-0.021
Token kay
Token ID34681
Feature activation+1.33e-01

Pos logit contributions
years+0.093
aked+0.093
ips+0.092
angers+0.084
aled+0.084
Neg logit contributions
 difference-0.099
 we-0.096
 bird-0.091
 end-0.091
 forget-0.086
Tokenak
Token ID461
Feature activation+2.80e-02

Pos logit contributions
 dressed+0.225
ga+0.215
 ped+0.212
years+0.210
 Fay+0.207
Neg logit contributions
 forgive-0.175
 slice-0.164
 difference-0.162
 sir-0.157
 yeah-0.155
Token and
Token ID290
Feature activation+1.94e-02

Pos logit contributions
aked+0.061
aled+0.061
ips+0.060
years+0.060
 dressed+0.059
Neg logit contributions
 we-0.030
 salad-0.028
 difference-0.027
 sir-0.026
 forgive-0.025
Token started
Token ID2067
Feature activation+7.18e-02

Pos logit contributions
ips+0.034
aked+0.034
aled+0.033
years+0.033
 dressed+0.032
Neg logit contributions
 we-0.030
 salad-0.028
 would-0.026
 slice-0.026
 difference-0.026
Token padd
Token ID14098
Feature activation+7.51e-02

Pos logit contributions
aked+0.153
years+0.148
aled+0.145
idget+0.144
ips+0.142
Neg logit contributions
 we-0.098
 they-0.082
 he-0.082
 would-0.080
 taste-0.076
Tokenling
Token ID1359
Feature activation+4.51e-02

Pos logit contributions
years+0.142
ips+0.141
 dressed+0.138
angers+0.137
aked+0.133
Neg logit contributions
 difference-0.088
 we-0.087
 moment-0.086
 would-0.084
 tastes-0.082
Token.
Token ID13
Feature activation+8.49e-03

Pos logit contributions
 dressed+0.023
aked+0.023
ips+0.023
aled+0.023
 ped+0.022
Neg logit contributions
 we-0.014
 salad-0.014
 difference-0.013
 yeah-0.013
 sir-0.013
Token She
Token ID1375
Feature activation+1.35e-02

Pos logit contributions
aked+0.016
 dressed+0.016
ips+0.016
aled+0.016
 ped+0.016
Neg logit contributions
 we-0.011
 salad-0.011
 difference-0.010
 sir-0.010
 yeah-0.010
Token d
Token ID288
Feature activation+1.22e-01

Pos logit contributions
ips+0.030
aked+0.030
aled+0.029
 dressed+0.029
ang+0.028
Neg logit contributions
 we-0.014
 salad-0.013
 difference-0.013
 forgive-0.012
 would-0.012
Tokenived
Token ID1572
Feature activation+3.30e-02

Pos logit contributions
 dressed+0.176
 buck+0.169
rob+0.162
 ped+0.159
ips+0.158
Neg logit contributions
affe-0.184
ue-0.182
 slice-0.170
 forgive-0.167
 salad-0.167
Token and
Token ID290
Feature activation+2.22e-02

Pos logit contributions
aked+0.051
ips+0.051
aled+0.050
 dressed+0.050
years+0.048
Neg logit contributions
 we-0.056
 salad-0.054
 difference-0.052
 tastes-0.050
 forgive-0.050
Token d
Token ID288
Feature activation+1.33e-01

Pos logit contributions
ips+0.045
aked+0.045
aled+0.044
ained+0.043
ang+0.042
Neg logit contributions
 we-0.030
 salad-0.026
 forgive-0.026
 would-0.025
 yeah-0.024
Tokenived
Token ID1572
Feature activation+3.40e-02

Pos logit contributions
rob+0.203
 dressed+0.200
 CDs+0.196
 buck+0.194
ips+0.192
Neg logit contributions
ue-0.186
affe-0.179
 forgive-0.165
 salad-0.163
isy-0.163
Token but
Token ID475
Feature activation+2.00e-03

Pos logit contributions
aked+0.057
ips+0.057
aled+0.056
 dressed+0.056
years+0.054
Neg logit contributions
 we-0.054
 salad-0.052
 difference-0.050
 tastes-0.048
 forgive-0.048
Token couldn
Token ID3521
Feature activation+6.54e-03

Pos logit contributions
aked+0.004
 dressed+0.004
ips+0.004
aled+0.004
 ped+0.004
Neg logit contributions
 we-0.002
 salad-0.002
 difference-0.002
 sir-0.002
 yeah-0.002
Token't
Token ID470
Feature activation-4.71e-03

Pos logit contributions
 dressed+0.012
ips+0.012
aled+0.012
aked+0.012
 ped+0.012
Neg logit contributions
 salad-0.009
 difference-0.009
 we-0.008
 slice-0.008
 yeah-0.008
Token find
Token ID1064
Feature activation+1.67e-02

Pos logit contributions
ccoli+0.006
 salad+0.006
 tastes+0.006
 difference+0.006
 yeah+0.006
Neg logit contributions
 dressed-0.009
 ped-0.009
ga-0.008
aled-0.008
pered-0.008
Token
Token ID198
Feature activation+1.57e-02

Pos logit contributions
aked+0.029
 dressed+0.029
ips+0.028
aled+0.028
 ped+0.028
Neg logit contributions
 we-0.015
 salad-0.014
 difference-0.014
 yeah-0.013
 tastes-0.013
TokenThey
Token ID2990
Feature activation+1.67e-02

Pos logit contributions
 dressed+0.029
aked+0.029
aled+0.029
ips+0.028
years+0.028
Neg logit contributions
 we-0.023
 salad-0.021
 difference-0.021
 sir-0.020
 forgive-0.020
Token climbed
Token ID19952
Feature activation+8.24e-02

Pos logit contributions
ips+0.035
ained+0.034
angers+0.033
itted+0.032
aked+0.032
Neg logit contributions
 would-0.022
 forgive-0.020
 we-0.020
 slice-0.019
 salad-0.019
Token into
Token ID656
Feature activation+7.76e-03

Pos logit contributions
years+0.151
aked+0.141
 dressed+0.138
aled+0.137
ips+0.136
Neg logit contributions
 we-0.130
 salad-0.116
 would-0.115
 bird-0.115
 moment-0.113
Token the
Token ID262
Feature activation+3.70e-02

Pos logit contributions
aked+0.019
aled+0.019
years+0.018
ained+0.018
itted+0.018
Neg logit contributions
 we-0.008
 salad-0.007
 slice-0.007
 difference-0.006
 bird-0.006
Token kay
Token ID34681
Feature activation+1.32e-01

Pos logit contributions
aked+0.065
years+0.064
ips+0.064
aled+0.061
 dressed+0.059
Neg logit contributions
 difference-0.064
 we-0.064
 bird-0.062
 salad-0.059
 end-0.058
Tokenak
Token ID461
Feature activation+3.01e-02

Pos logit contributions
 dressed+0.223
years+0.208
 ped+0.206
ga+0.203
aled+0.201
Neg logit contributions
 forgive-0.178
 slice-0.174
 sir-0.173
-0.172
man-0.171
Token and
Token ID290
Feature activation+2.40e-02

Pos logit contributions
aked+0.065
years+0.065
aled+0.064
ips+0.064
 dressed+0.064
Neg logit contributions
 we-0.033
 salad-0.031
 difference-0.029
 sir-0.028
 forgive-0.028
Token pushed
Token ID7121
Feature activation+3.83e-02

Pos logit contributions
ips+0.043
aked+0.043
years+0.041
aled+0.041
 dressed+0.041
Neg logit contributions
 we-0.037
 salad-0.034
 would-0.032
 slice-0.032
 forgive-0.032
Token it
Token ID340
Feature activation+2.29e-02

Pos logit contributions
years+0.078
aked+0.078
 dressed+0.078
ips+0.077
aled+0.076
Neg logit contributions
 we-0.049
 salad-0.044
 difference-0.042
 slice-0.041
 forgive-0.040
Token into
Token ID656
Feature activation+8.17e-03

Pos logit contributions
years+0.050
aked+0.048
 dressed+0.047
ips+0.047
ga+0.047
Neg logit contributions
 we-0.028
 would-0.024
 salad-0.024
 forgive-0.024
 difference-0.023
Token mom
Token ID1995
Feature activation+7.68e-03

Pos logit contributions
 we+0.023
 difference+0.023
fly+0.022
 salad+0.022
 yeah+0.022
Neg logit contributions
 dressed-0.048
ips-0.047
aked-0.047
 ped-0.046
aled-0.044
Token said
Token ID531
Feature activation+3.17e-04

Pos logit contributions
 dressed+0.014
aked+0.014
 ped+0.013
years+0.013
aled+0.013
Neg logit contributions
 difference-0.011
 salad-0.011
 we-0.011
 yeah-0.010
 sir-0.010
Token,
Token ID11
Feature activation-3.53e-02

Pos logit contributions
 dressed+0.001
ips+0.001
years+0.001
 ped+0.000
aked+0.000
Neg logit contributions
 difference-0.000
fly-0.000
 mushroom-0.000
 tastes-0.000
 salad-0.000
Token "
Token ID366
Feature activation-1.15e-02

Pos logit contributions
 difference+0.045
 tastes+0.039
 salad+0.039
fly+0.037
 we+0.037
Neg logit contributions
 dressed-0.072
 ped-0.068
years-0.067
ga-0.067
ips-0.067
TokenYes
Token ID5297
Feature activation-5.15e-02

Pos logit contributions
 salad+0.018
 difference+0.018
ccoli+0.017
 ending+0.016
 sir+0.016
Neg logit contributions
 dressed-0.020
aked-0.019
years-0.019
ang-0.018
 ped-0.018
Token,
Token ID11
Feature activation-1.09e-01

Pos logit contributions
fly+0.089
ccoli+0.083
 difference+0.083
 repay+0.082
 tastes+0.081
Neg logit contributions
ug-0.067
ips-0.066
uff-0.064
inn-0.064
agging-0.063
Token but
Token ID475
Feature activation-4.90e-02

Pos logit contributions
fly+0.177
 difference+0.158
ccoli+0.147
 repay+0.146
flake+0.138
Neg logit contributions
 dressed-0.176
 ramps-0.142
 marching-0.138
 bundled-0.138
 ped-0.137
Token we
Token ID356
Feature activation-2.03e-02

Pos logit contributions
 difference+0.058
affe+0.057
ccoli+0.056
fly+0.054
 slice+0.054
Neg logit contributions
 dressed-0.097
 ped-0.084
 marching-0.080
 buck-0.080
ips-0.080
Token need
Token ID761
Feature activation-2.54e-03

Pos logit contributions
 salad+0.018
 we+0.018
 difference+0.018
 yeah+0.017
 sir+0.016
Neg logit contributions
 dressed-0.047
aked-0.045
aled-0.044
ips-0.043
 ped-0.043
Token to
Token ID284
Feature activation+6.70e-03

Pos logit contributions
 we+0.003
 salad+0.003
 difference+0.002
 forgive+0.002
 sir+0.002
Neg logit contributions
aked-0.006
aled-0.005
ips-0.005
 dressed-0.005
ang-0.005
Token sort
Token ID3297
Feature activation+1.67e-02

Pos logit contributions
aked+0.016
ips+0.016
aled+0.016
 dressed+0.015
years+0.015
Neg logit contributions
 we-0.006
 salad-0.006
 difference-0.006
 forgive-0.005
 slice-0.005
Token,
Token ID11
Feature activation-4.29e-02

Pos logit contributions
 difference+0.030
 forget+0.027
 tastes+0.027
fly+0.027
 repay+0.026
Neg logit contributions
 dressed-0.026
years-0.025
pered-0.025
 ped-0.024
aked-0.024
Token "
Token ID366
Feature activation-2.26e-02

Pos logit contributions
 difference+0.040
 we+0.040
 salad+0.039
 tastes+0.036
 yeah+0.035
Neg logit contributions
 dressed-0.097
aked-0.096
aled-0.094
ips-0.094
 ped-0.094
TokenI
Token ID40
Feature activation-1.43e-02

Pos logit contributions
 salad+0.041
 difference+0.040
 sir+0.038
 sandwich+0.037
 choice+0.037
Neg logit contributions
aked-0.035
 dressed-0.032
years-0.032
 ped-0.032
aled-0.031
Token am
Token ID716
Feature activation+3.85e-02

Pos logit contributions
 salad+0.017
 difference+0.015
 we+0.015
 sir+0.015
ue+0.015
Neg logit contributions
 dressed-0.031
aked-0.029
 ped-0.028
ang-0.027
 buck-0.026
Token sorry
Token ID7926
Feature activation-7.78e-02

Pos logit contributions
aked+0.087
aled+0.083
ips+0.083
 dressed+0.081
 ped+0.079
Neg logit contributions
 we-0.045
 salad-0.040
 forgive-0.039
 difference-0.038
 sir-0.037
Token,
Token ID11
Feature activation-1.08e-01

Pos logit contributions
 repay+0.145
fly+0.143
 slice+0.143
 pineapple+0.141
cker+0.138
Neg logit contributions
 dressed-0.069
ug-0.069
 outside-0.068
pered-0.068
 around-0.068
Token Amy
Token ID14235
Feature activation-6.22e-02

Pos logit contributions
 repay+0.163
fly+0.162
by+0.156
ologist+0.147
man+0.145
Neg logit contributions
 dressed-0.138
 tight-0.132
 running-0.127
 z-0.126
 ramps-0.123
Token."
Token ID526
Feature activation-2.28e-02

Pos logit contributions
 salad+0.111
 yeah+0.106
 slice+0.102
 sir+0.100
 difference+0.100
Neg logit contributions
 dressed-0.087
aked-0.079
 ped-0.075
ang-0.074
ga-0.074
Token But
Token ID887
Feature activation-5.40e-02

Pos logit contributions
 difference+0.032
fly+0.031
 favor+0.030
 repay+0.029
 sir+0.029
Neg logit contributions
aked-0.040
aled-0.039
ips-0.037
 dressed-0.037
pered-0.036
Token it
Token ID340
Feature activation-2.47e-02

Pos logit contributions
 repay+0.099
 difference+0.098
ologist+0.094
 favor+0.094
ccoli+0.093
Neg logit contributions
 dressed-0.071
 ped-0.061
years-0.061
ips-0.059
 onto-0.057
Token was
Token ID373
Feature activation+3.28e-02

Pos logit contributions
 difference+0.031
 sir+0.031
 salad+0.031
 yeah+0.030
 we+0.030
Neg logit contributions
 dressed-0.050
aked-0.048
 ped-0.047
aled-0.045
ips-0.044
Token mom
Token ID1995
Feature activation+9.56e-03

Pos logit contributions
 we+0.019
 difference+0.019
 salad+0.018
fly+0.018
 tastes+0.018
Neg logit contributions
 dressed-0.039
ips-0.038
aked-0.038
 ped-0.037
uff-0.036
Token said
Token ID531
Feature activation-1.02e-03

Pos logit contributions
 dressed+0.017
aked+0.015
 ped+0.015
ang+0.014
aled+0.014
Neg logit contributions
 difference-0.016
 we-0.015
 salad-0.015
 ending-0.014
 yeah-0.013
Token,
Token ID11
Feature activation-2.93e-02

Pos logit contributions
 difference+0.001
fly+0.001
 salad+0.001
 tastes+0.001
 forget+0.001
Neg logit contributions
 dressed-0.002
ips-0.002
aked-0.002
years-0.002
 ped-0.002
Token "
Token ID366
Feature activation-1.36e-02

Pos logit contributions
 difference+0.027
 we+0.027
 salad+0.026
 tastes+0.024
 yeah+0.023
Neg logit contributions
 dressed-0.067
aked-0.066
ips-0.065
 ped-0.065
aled-0.065
TokenYes
Token ID5297
Feature activation-5.06e-02

Pos logit contributions
 salad+0.020
 difference+0.020
ccoli+0.018
 ending+0.018
 we+0.018
Neg logit contributions
aked-0.024
 dressed-0.024
years-0.024
ang-0.023
ips-0.023
Token,
Token ID11
Feature activation-1.07e-01

Pos logit contributions
fly+0.078
 difference+0.075
 repay+0.073
 tastes+0.072
ccoli+0.072
Neg logit contributions
ug-0.075
ips-0.075
 dressed-0.072
uff-0.071
pered-0.070
Token but
Token ID475
Feature activation-5.42e-02

Pos logit contributions
fly+0.173
 difference+0.158
man+0.151
 repay+0.149
ccoli+0.147
Neg logit contributions
 dressed-0.171
 ramps-0.142
usted-0.141
ug-0.136
 bundled-0.136
Token be
Token ID307
Feature activation+5.34e-02

Pos logit contributions
affe+0.060
 difference+0.059
ccoli+0.058
 slice+0.057
 salad+0.055
Neg logit contributions
 dressed-0.105
ips-0.091
 ped-0.090
ained-0.087
 F-0.087
Token careful
Token ID8161
Feature activation-2.64e-02

Pos logit contributions
aled+0.087
ga+0.083
ips+0.082
 dressed+0.082
 ped+0.080
Neg logit contributions
 we-0.086
 salad-0.086
 forgive-0.085
 slice-0.083
 difference-0.082
Token not
Token ID407
Feature activation-1.22e-02

Pos logit contributions
 difference+0.033
 salad+0.032
 yeah+0.031
 we+0.031
 slice+0.029
Neg logit contributions
 dressed-0.052
aked-0.051
ips-0.051
ga-0.049
aled-0.049
Token to
Token ID284
Feature activation+9.26e-03

Pos logit contributions
 we+0.015
 difference+0.015
 salad+0.015
 yeah+0.014
 sir+0.014
Neg logit contributions
 dressed-0.024
aked-0.024
ips-0.023
aled-0.023
 ped-0.022
Token Lily
Token ID20037
Feature activation+2.32e-02

Pos logit contributions
fly+0.023
 difference+0.022
 salad+0.022
 sir+0.022
 we+0.021
Neg logit contributions
 dressed-0.021
aked-0.021
ips-0.020
aled-0.020
 ped-0.019
Token said
Token ID531
Feature activation-7.50e-03

Pos logit contributions
 dressed+0.046
 ped+0.044
aked+0.043
aled+0.041
ips+0.040
Neg logit contributions
 difference-0.030
 salad-0.030
 we-0.030
 sir-0.027
 slice-0.027
Token,
Token ID11
Feature activation-2.85e-02

Pos logit contributions
 difference+0.011
 tastes+0.010
fly+0.010
 salad+0.010
 mushroom+0.009
Neg logit contributions
 dressed-0.014
 ped-0.014
years-0.013
ga-0.013
ips-0.013
Token "
Token ID366
Feature activation-1.63e-02

Pos logit contributions
 difference+0.027
 we+0.026
 salad+0.026
 tastes+0.024
 yeah+0.023
Neg logit contributions
 dressed-0.065
 ped-0.063
aked-0.063
aled-0.063
ips-0.062
TokenYes
Token ID5297
Feature activation-5.15e-02

Pos logit contributions
 salad+0.024
 difference+0.024
 we+0.022
 sir+0.022
 tastes+0.021
Neg logit contributions
 dressed-0.029
aked-0.028
 ped-0.028
years-0.028
aled-0.027
Token,
Token ID11
Feature activation-1.06e-01

Pos logit contributions
fly+0.069
 difference+0.068
 salad+0.066
 tastes+0.065
ccoli+0.062
Neg logit contributions
 dressed-0.087
 ped-0.085
ips-0.085
ga-0.083
aled-0.083
Token I
Token ID314
Feature activation-2.40e-02

Pos logit contributions
fly+0.138
 difference+0.135
 tastes+0.121
 feast+0.119
 decision+0.108
Neg logit contributions
 dressed-0.206
 ped-0.187
 ramps-0.173
 buck-0.172
 marching-0.170
Token can
Token ID460
Feature activation-1.07e-02

Pos logit contributions
 salad+0.025
 difference+0.024
 sir+0.023
 we+0.023
ue+0.023
Neg logit contributions
 dressed-0.055
aked-0.052
 ped-0.049
ips-0.047
aled-0.047
Token serve
Token ID4691
Feature activation-2.48e-02

Pos logit contributions
 we+0.013
 salad+0.012
 difference+0.011
 forgive+0.011
 yeah+0.011
Neg logit contributions
aked-0.023
 dressed-0.022
aled-0.022
ips-0.022
years-0.021
Token you
Token ID345
Feature activation-3.00e-02

Pos logit contributions
 we+0.022
 salad+0.021
 difference+0.020
 forgive+0.018
 sir+0.018
Neg logit contributions
aked-0.059
 dressed-0.058
aled-0.058
ips-0.058
 ped-0.057
Token some
Token ID617
Feature activation-3.64e-02

Pos logit contributions
 we+0.036
 salad+0.034
 difference+0.033
 sir+0.031
 forgive+0.031
Neg logit contributions
aked-0.062
 dressed-0.061
aled-0.061
ips-0.060
 ped-0.059
Token.
Token ID13
Feature activation-1.47e-02

Pos logit contributions
aked+0.026
 dressed+0.025
ips+0.025
aled+0.025
 ped+0.024
Neg logit contributions
 we-0.014
 salad-0.014
 difference-0.013
 forgive-0.012
 sir-0.012
Token
Token ID198
Feature activation-4.60e-03

Pos logit contributions
 difference+0.026
 decision+0.026
 ending+0.025
 choice+0.025
 tastes+0.025
Neg logit contributions
 ped-0.021
 dressed-0.021
ga-0.019
aled-0.018
aked-0.018
Token
Token ID198
Feature activation-5.30e-03

Pos logit contributions
 salad+0.007
 difference+0.007
 tastes+0.007
 ending+0.007
 yeah+0.007
Neg logit contributions
aked-0.007
 dressed-0.007
ips-0.007
ang-0.007
 ped-0.007
Token"
Token ID1
Feature activation-3.42e-02

Pos logit contributions
 difference+0.008
 ending+0.008
 salad+0.008
 tastes+0.008
 yeah+0.008
Neg logit contributions
aked-0.008
 ped-0.008
ga-0.008
 dressed-0.008
ang-0.008
TokenYes
Token ID5297
Feature activation-5.91e-02

Pos logit contributions
 difference+0.051
 salad+0.050
ccoli+0.046
 tastes+0.046
 we+0.046
Neg logit contributions
 dressed-0.060
years-0.057
aked-0.057
ang-0.057
 ped-0.057
Token,
Token ID11
Feature activation-1.06e-01

Pos logit contributions
 difference+0.096
 tastes+0.096
ccoli+0.095
fly+0.095
fruit+0.091
Neg logit contributions
ug-0.069
uff-0.069
 F-0.068
inn-0.065
ga-0.065
Token I
Token ID314
Feature activation-2.02e-02

Pos logit contributions
 difference+0.125
fly+0.116
 tastes+0.114
andra+0.105
 decision+0.105
Neg logit contributions
 dressed-0.198
 ped-0.174
 z-0.170
 ramps-0.169
 buck-0.169
Token do
Token ID466
Feature activation-1.24e-02

Pos logit contributions
 sir+0.027
 difference+0.026
 salad+0.026
 we+0.026
 yeah+0.025
Neg logit contributions
 dressed-0.040
aked-0.037
 ped-0.035
ips-0.033
ained-0.033
Token!"
Token ID2474
Feature activation-2.06e-02

Pos logit contributions
 difference+0.014
 salad+0.014
 we+0.013
 sir+0.013
 yeah+0.013
Neg logit contributions
 dressed-0.025
aked-0.025
 ped-0.025
ips-0.024
ang-0.024
Token replied
Token ID8712
Feature activation-1.93e-02

Pos logit contributions
 difference+0.032
 ending+0.031
 yeah+0.031
castle+0.029
 decision+0.029
Neg logit contributions
 dressed-0.033
 ped-0.032
aked-0.030
aled-0.028
alled-0.027
Token Charlie
Token ID11526
Feature activation-1.09e-02

Pos logit contributions
 difference+0.025
 tastes+0.024
 yeah+0.023
 forget+0.023
 we+0.022
Neg logit contributions
 dressed-0.036
 ped-0.034
aled-0.034
aked-0.033
ga-0.033
Token M
Token ID337
Feature activation+2.39e-02

Pos logit contributions
 difference+0.013
 sir+0.013
 ending+0.013
 decision+0.012
 favor+0.012
Neg logit contributions
 dressed-0.010
ips-0.008
years-0.008
aked-0.008
aled-0.008
Tokenummy
Token ID13513
Feature activation+1.73e-03

Pos logit contributions
 dressed+0.040
ips+0.039
years+0.038
uff+0.038
aked+0.037
Neg logit contributions
 we-0.037
 yeah-0.034
 forget-0.034
 difference-0.034
 sir-0.032
Token said
Token ID531
Feature activation+5.12e-03

Pos logit contributions
 dressed+0.003
aked+0.003
 ped+0.003
aled+0.003
ips+0.003
Neg logit contributions
 we-0.002
 difference-0.002
 salad-0.002
 yeah-0.002
 sir-0.002
Token "
Token ID366
Feature activation-4.62e-03

Pos logit contributions
 dressed+0.011
aked+0.011
ips+0.011
years+0.011
 ped+0.011
Neg logit contributions
 difference-0.005
 salad-0.005
 we-0.005
 tastes-0.004
fly-0.004
TokenYes
Token ID5297
Feature activation-5.24e-02

Pos logit contributions
 we+0.005
 salad+0.005
 difference+0.005
 sir+0.005
 yeah+0.005
Neg logit contributions
aked-0.010
 dressed-0.010
ips-0.010
aled-0.010
 ped-0.009
Token,
Token ID11
Feature activation-1.05e-01

Pos logit contributions
fly+0.076
 difference+0.073
 repay+0.071
 ending+0.068
 salad+0.067
Neg logit contributions
ips-0.086
 dressed-0.084
ug-0.079
inn-0.079
usted-0.078
Token of
Token ID286
Feature activation-3.20e-03

Pos logit contributions
fly+0.174
man+0.159
 difference+0.152
 ending+0.151
 repay+0.149
Neg logit contributions
 dressed-0.179
usted-0.145
 ramps-0.138
 bundled-0.133
 ped-0.132
Token course
Token ID1781
Feature activation-8.41e-02

Pos logit contributions
 we+0.003
 salad+0.003
 difference+0.003
 sir+0.003
 slice+0.002
Neg logit contributions
 dressed-0.007
aked-0.007
ips-0.007
aled-0.007
 ped-0.007
Token you
Token ID345
Feature activation-1.30e-02

Pos logit contributions
fly+0.130
 difference+0.127
 ending+0.123
 repay+0.118
man+0.113
Neg logit contributions
 dressed-0.134
ips-0.122
 ped-0.112
aked-0.109
years-0.108
Token can
Token ID460
Feature activation+1.84e-02

Pos logit contributions
 we+0.006
 salad+0.005
 difference+0.005
 forgive+0.004
 sir+0.004
Neg logit contributions
aked-0.036
 dressed-0.036
aled-0.036
ips-0.036
 ped-0.035
Token Amy
Token ID14235
Feature activation-3.97e-02

Pos logit contributions
aked+0.045
aled+0.042
years+0.042
ang+0.041
ips+0.041
Neg logit contributions
 we-0.020
 forgive-0.018
 would-0.017
 I-0.017
 remember-0.016
Token mom
Token ID1995
Feature activation+1.27e-02

Pos logit contributions
 we+0.015
 salad+0.015
 difference+0.015
 tastes+0.014
fly+0.014
Neg logit contributions
 dressed-0.033
ips-0.033
aked-0.033
 ped-0.032
uff-0.031
Token said
Token ID531
Feature activation+9.73e-04

Pos logit contributions
 dressed+0.024
aked+0.023
 ped+0.022
years+0.022
aled+0.022
Neg logit contributions
 difference-0.018
 salad-0.018
 we-0.018
 yeah-0.016
 tastes-0.016
Token,
Token ID11
Feature activation-2.94e-02

Pos logit contributions
 dressed+0.002
ips+0.002
aked+0.002
years+0.002
 ped+0.002
Neg logit contributions
 difference-0.001
 salad-0.001
 tastes-0.001
fly-0.001
 forget-0.001
Token "
Token ID366
Feature activation-1.08e-02

Pos logit contributions
 we+0.026
 difference+0.026
 salad+0.025
 tastes+0.023
 yeah+0.023
Neg logit contributions
 dressed-0.068
aked-0.068
ips-0.067
aled-0.066
 ped-0.066
TokenYes
Token ID5297
Feature activation-5.06e-02

Pos logit contributions
 salad+0.016
 difference+0.016
ccoli+0.015
 ending+0.014
 sandwich+0.014
Neg logit contributions
aked-0.020
 dressed-0.019
years-0.019
ang-0.018
 ped-0.018
Token,
Token ID11
Feature activation-1.05e-01

Pos logit contributions
fly+0.078
 difference+0.075
 repay+0.074
 tastes+0.073
ccoli+0.073
Neg logit contributions
ug-0.075
ips-0.074
pered-0.071
 dressed-0.071
uff-0.070
Token but
Token ID475
Feature activation-5.06e-02

Pos logit contributions
fly+0.171
 difference+0.156
man+0.150
ccoli+0.148
 repay+0.148
Neg logit contributions
 dressed-0.168
 ramps-0.137
usted-0.135
 bundled-0.133
ug-0.133
Token be
Token ID307
Feature activation+6.05e-02

Pos logit contributions
 slice+0.058
 salad+0.058
 difference+0.057
affe+0.055
ccoli+0.055
Neg logit contributions
 dressed-0.100
 ped-0.089
ips-0.085
 marching-0.084
 buck-0.084
Token careful
Token ID8161
Feature activation-2.55e-02

Pos logit contributions
aled+0.117
ga+0.114
ann+0.111
rol+0.111
ud+0.110
Neg logit contributions
 forgive-0.082
 we-0.081
 why-0.079
 remember-0.073
 yeah-0.072
Token."
Token ID526
Feature activation-1.39e-02

Pos logit contributions
 salad+0.032
 difference+0.031
 yeah+0.030
 we+0.030
 slice+0.029
Neg logit contributions
 dressed-0.051
aked-0.049
ips-0.049
aled-0.047
ga-0.047
Token Amy
Token ID14235
Feature activation+2.01e-02

Pos logit contributions
 difference+0.020
 sir+0.019
fly+0.019
 yeah+0.018
 repay+0.018
Neg logit contributions
aked-0.026
aled-0.024
ips-0.024
 dressed-0.024
itted-0.022
Token then
Token ID788
Feature activation-2.44e-02

Pos logit contributions
 difference+0.043
 we+0.041
 salad+0.040
fly+0.040
 sir+0.038
Neg logit contributions
 dressed-0.069
 ped-0.065
aked-0.065
aled-0.063
ips-0.062
Token said
Token ID531
Feature activation-6.57e-03

Pos logit contributions
 we+0.029
 salad+0.026
 difference+0.025
 forgive+0.024
 sir+0.024
Neg logit contributions
aked-0.052
aled-0.051
ips-0.051
years-0.050
 dressed-0.050
Token,
Token ID11
Feature activation-2.91e-02

Pos logit contributions
 difference+0.009
 tastes+0.008
 forget+0.007
fly+0.007
 salad+0.007
Neg logit contributions
 dressed-0.013
 ped-0.012
years-0.012
ga-0.012
opped-0.012
Token "
Token ID366
Feature activation-1.75e-02

Pos logit contributions
 we+0.022
 salad+0.020
 difference+0.018
 sir+0.018
 forgive+0.017
Neg logit contributions
aked-0.074
ips-0.073
aled-0.072
 dressed-0.072
 ped-0.070
TokenYes
Token ID5297
Feature activation-5.22e-02

Pos logit contributions
 salad+0.031
 difference+0.031
 sir+0.029
fly+0.029
 mushroom+0.028
Neg logit contributions
aked-0.028
 dressed-0.026
years-0.024
 ped-0.024
aled-0.023
Token,
Token ID11
Feature activation-1.05e-01

Pos logit contributions
fly+0.087
 difference+0.081
 repay+0.080
 tastes+0.080
ccoli+0.078
Neg logit contributions
ug-0.069
uff-0.066
inn-0.065
 dressed-0.065
usted-0.064
Token that
Token ID326
Feature activation-4.34e-02

Pos logit contributions
fly+0.180
 difference+0.153
man+0.151
 repay+0.151
flake+0.139
Neg logit contributions
 dressed-0.167
usted-0.140
 tight-0.138
 ramps-0.136
 ped-0.132
Token's
Token ID338
Feature activation+3.27e-02

Pos logit contributions
 salad+0.047
 slice+0.038
 yeah+0.038
 sir+0.038
 mushroom+0.038
Neg logit contributions
 dressed-0.096
 ped-0.094
aked-0.091
ips-0.088
ang-0.087
Token a
Token ID257
Feature activation-2.50e-02

Pos logit contributions
aked+0.066
ips+0.065
aled+0.064
years+0.063
 dressed+0.063
Neg logit contributions
 we-0.044
 salad-0.040
 difference-0.039
 sir-0.038
 forgive-0.037
Token lovely
Token ID14081
Feature activation-1.88e-02

Pos logit contributions
 we+0.038
 salad+0.037
 difference+0.036
 tastes+0.034
 yeah+0.034
Neg logit contributions
 dressed-0.042
aked-0.042
ips-0.041
aled-0.041
 ped-0.040
Token idea
Token ID2126
Feature activation-5.95e-02

Pos logit contributions
 we+0.031
 salad+0.030
 difference+0.030
 sir+0.029
 yeah+0.028
Neg logit contributions
aked-0.030
 dressed-0.029
aled-0.029
ips-0.029
years-0.028
Token Tim
Token ID5045
Feature activation+1.43e-03

Pos logit contributions
 difference+0.050
fly+0.047
 forget+0.045
 yeah+0.045
 salad+0.045
Neg logit contributions
 dressed-0.044
aked-0.041
aled-0.038
ips-0.038
 ped-0.038
Tokenmy
Token ID1820
Feature activation-5.69e-03

Pos logit contributions
 dressed+0.002
 ped+0.002
aked+0.002
idget+0.002
ips+0.002
Neg logit contributions
 we-0.003
 salad-0.002
 ending-0.002
 feast-0.002
 turkey-0.002
Token.
Token ID13
Feature activation-6.34e-05

Pos logit contributions
 we+0.008
 salad+0.008
 difference+0.008
 yeah+0.007
 tastes+0.007
Neg logit contributions
 dressed-0.010
 ped-0.010
ips-0.010
aked-0.010
aled-0.010
Token "
Token ID366
Feature activation-1.35e-02

Pos logit contributions
 difference+0.000
 salad+0.000
fly+0.000
 sir+0.000
 dragon+0.000
Neg logit contributions
 dressed-0.000
aked-0.000
 ped-0.000
ips-0.000
aled-0.000
TokenYes
Token ID5297
Feature activation-5.90e-02

Pos logit contributions
 salad+0.020
 difference+0.020
 sir+0.018
 we+0.018
 mushroom+0.018
Neg logit contributions
 dressed-0.024
aked-0.024
years-0.024
 ped-0.023
ang-0.023
Token,
Token ID11
Feature activation-1.05e-01

Pos logit contributions
fly+0.097
ccoli+0.092
 difference+0.091
 tastes+0.090
 mushroom+0.089
Neg logit contributions
ug-0.083
 dressed-0.075
uff-0.073
ga-0.071
usted-0.071
Token let
Token ID1309
Feature activation-2.67e-02

Pos logit contributions
 difference+0.130
fly+0.129
ccoli+0.111
 tastes+0.110
 ending+0.109
Neg logit contributions
 dressed-0.207
 ped-0.180
 ramps-0.180
ug-0.177
 buck-0.172
Token's
Token ID338
Feature activation-8.96e-03

Pos logit contributions
 salad+0.026
fly+0.025
 yeah+0.024
 difference+0.024
 slice+0.024
Neg logit contributions
 dressed-0.060
aked-0.058
aled-0.057
 ped-0.057
ips-0.057
Token go
Token ID467
Feature activation+1.85e-02

Pos logit contributions
 we+0.011
 salad+0.010
 difference+0.010
 yeah+0.010
 tastes+0.010
Neg logit contributions
 dressed-0.018
aked-0.018
aled-0.018
ips-0.018
 ped-0.017
Token!"
Token ID2474
Feature activation-1.49e-03

Pos logit contributions
aked+0.043
years+0.043
aled+0.043
ips+0.042
 dressed+0.041
Neg logit contributions
 we-0.020
 salad-0.015
 forgive-0.015
 difference-0.014
 would-0.014
Token replied
Token ID8712
Feature activation-5.17e-03

Pos logit contributions
 difference+0.002
fly+0.002
 salad+0.002
 sir+0.002
 favor+0.002
Neg logit contributions
 dressed-0.003
aked-0.003
 ped-0.003
ips-0.002
aled-0.002
Token asked
Token ID1965
Feature activation-3.25e-02

Pos logit contributions
 salad+0.021
 difference+0.021
 slice+0.020
 yeah+0.020
 sandwich+0.020
Neg logit contributions
 dressed-0.026
 ped-0.025
aled-0.023
aked-0.023
ga-0.023
Token Max
Token ID5436
Feature activation-4.70e-03

Pos logit contributions
 salad+0.045
 difference+0.045
 slice+0.044
 we+0.044
 yeah+0.043
Neg logit contributions
 dressed-0.058
 ped-0.057
aled-0.056
aked-0.055
years-0.054
Token.
Token ID13
Feature activation-1.50e-02

Pos logit contributions
 we+0.007
 salad+0.007
 difference+0.007
 yeah+0.006
 tastes+0.006
Neg logit contributions
 dressed-0.008
 ped-0.008
aked-0.008
years-0.008
aled-0.008
Token "
Token ID366
Feature activation-1.95e-02

Pos logit contributions
 difference+0.019
 salad+0.019
 we+0.018
 tastes+0.017
 slice+0.017
Neg logit contributions
 dressed-0.029
 ped-0.028
aled-0.028
aked-0.028
ga-0.028
TokenYes
Token ID5297
Feature activation-6.41e-02

Pos logit contributions
 salad+0.028
 difference+0.028
 we+0.027
 sandwich+0.026
 ending+0.026
Neg logit contributions
 dressed-0.035
aked-0.035
years-0.035
 ped-0.034
ang-0.033
Token,
Token ID11
Feature activation-1.05e-01

Pos logit contributions
 tastes+0.102
 difference+0.097
fly+0.097
 Something+0.096
 moment+0.095
Neg logit contributions
ug-0.087
ga-0.078
pered-0.078
uff-0.078
uster-0.077
Token please
Token ID3387
Feature activation-4.45e-02

Pos logit contributions
 difference+0.110
fly+0.103
idents+0.102
 tastes+0.102
 ending+0.095
Neg logit contributions
 dressed-0.210
 ped-0.196
 buck-0.184
 ramps-0.181
years-0.180
Token,"
Token ID553
Feature activation-2.77e-02

Pos logit contributions
 difference+0.059
 salad+0.058
 ending+0.056
 yeah+0.056
 slice+0.051
Neg logit contributions
 dressed-0.073
aked-0.071
 ped-0.069
 buck-0.066
aled-0.066
Token replied
Token ID8712
Feature activation-2.41e-02

Pos logit contributions
 difference+0.048
 salad+0.046
 ending+0.045
 slice+0.044
 yeah+0.044
Neg logit contributions
 dressed-0.042
 ped-0.041
 buck-0.036
aked-0.035
aled-0.035
Token Ellie
Token ID44801
Feature activation-1.77e-02

Pos logit contributions
 difference+0.035
 salad+0.035
 we+0.034
 tastes+0.034
 slice+0.033
Neg logit contributions
 dressed-0.041
 ped-0.040
aled-0.040
aked-0.037
ips-0.037
Token.
Token ID13
Feature activation-1.35e-02

Pos logit contributions
 we+0.027
 salad+0.027
 difference+0.027
 yeah+0.027
 tastes+0.025
Neg logit contributions
 dressed-0.029
 ped-0.029
aked-0.028
years-0.027
aled-0.027
Token Tom
Token ID4186
Feature activation+5.79e-03

Pos logit contributions
 difference+0.016
 salad+0.016
 we+0.016
fly+0.015
 slice+0.015
Neg logit contributions
 dressed-0.022
aked-0.021
 ped-0.021
ips-0.021
aled-0.021
Token asked
Token ID1965
Feature activation-3.03e-02

Pos logit contributions
 dressed+0.010
 ped+0.009
aked+0.009
aled+0.009
years+0.008
Neg logit contributions
 difference-0.009
 we-0.009
 salad-0.009
 yeah-0.009
 slice-0.008
Token.
Token ID13
Feature activation-1.24e-02

Pos logit contributions
 difference+0.054
fly+0.052
 we+0.051
ccoli+0.051
 salad+0.050
Neg logit contributions
 dressed-0.045
aked-0.042
 ped-0.042
ips-0.041
aled-0.040
Token "
Token ID366
Feature activation-1.19e-02

Pos logit contributions
 we+0.013
 salad+0.012
 difference+0.012
 yeah+0.011
 forgive+0.011
Neg logit contributions
aked-0.027
 dressed-0.027
ips-0.026
aled-0.026
 ped-0.026
TokenYes
Token ID5297
Feature activation-6.22e-02

Pos logit contributions
 we+0.015
 salad+0.015
 difference+0.014
 sir+0.013
 tastes+0.013
Neg logit contributions
aked-0.024
 dressed-0.024
aled-0.023
ips-0.023
 ped-0.023
Token,
Token ID11
Feature activation-1.04e-01

Pos logit contributions
fly+0.107
 difference+0.100
ccoli+0.098
 tastes+0.096
 Something+0.096
Neg logit contributions
ug-0.084
 dressed-0.080
ips-0.075
uff-0.074
ga-0.074
Token let
Token ID1309
Feature activation-2.76e-02

Pos logit contributions
fly+0.165
 difference+0.148
ccoli+0.139
 ending+0.133
man+0.130
Neg logit contributions
 dressed-0.194
 ped-0.168
 buck-0.158
 F-0.150
 bundled-0.147
Token's
Token ID338
Feature activation-2.38e-02

Pos logit contributions
fly+0.028
 salad+0.028
 difference+0.027
 yeah+0.026
 ending+0.026
Neg logit contributions
 dressed-0.060
aked-0.059
aled-0.058
ips-0.058
 ped-0.057
Token build
Token ID1382
Feature activation-6.56e-04

Pos logit contributions
 difference+0.029
 salad+0.028
 yeah+0.028
 we+0.027
 sir+0.027
Neg logit contributions
 dressed-0.046
aled-0.044
 ped-0.044
ips-0.043
aked-0.042
Token the
Token ID262
Feature activation-5.56e-03

Pos logit contributions
 difference+0.001
 we+0.001
 yeah+0.001
 sir+0.001
 ending+0.001
Neg logit contributions
 dressed-0.001
ips-0.001
aked-0.001
aled-0.001
 ramps-0.001
Token tower
Token ID10580
Feature activation-3.70e-02

Pos logit contributions
fly+0.007
ccoli+0.007
 we+0.007
 sir+0.006
 forgive+0.006
Neg logit contributions
 ramps-0.011
ips-0.011
aked-0.010
aled-0.010
 dressed-0.010
Token Max
Token ID5436
Feature activation+1.44e-02

Pos logit contributions
 difference+0.025
 salad+0.025
 ending+0.024
 tastes+0.024
 slice+0.024
Neg logit contributions
 dressed-0.022
 ped-0.022
aled-0.021
ga-0.020
aked-0.020
Token replied
Token ID8712
Feature activation-6.39e-03

Pos logit contributions
 dressed+0.027
 ped+0.027
aked+0.026
aled+0.025
ga+0.025
Neg logit contributions
 difference-0.019
 we-0.019
 salad-0.019
 slice-0.017
 yeah-0.017
Token,
Token ID11
Feature activation-3.69e-02

Pos logit contributions
 difference+0.011
 forget+0.010
 tastes+0.010
ccoli+0.009
 salad+0.009
Neg logit contributions
 ped-0.010
years-0.010
 dressed-0.010
pered-0.009
aled-0.009
Token "
Token ID366
Feature activation-2.15e-02

Pos logit contributions
 difference+0.052
 tastes+0.047
 salad+0.046
 choice+0.044
 we+0.044
Neg logit contributions
 ped-0.069
 dressed-0.068
ga-0.066
years-0.064
aled-0.063
TokenYes
Token ID5297
Feature activation-5.55e-02

Pos logit contributions
 difference+0.035
 salad+0.034
 sir+0.033
ccoli+0.032
 sandwich+0.031
Neg logit contributions
 dressed-0.035
years-0.035
ang-0.034
ga-0.034
aked-0.034
Token,
Token ID11
Feature activation-1.04e-01

Pos logit contributions
 difference+0.088
ccoli+0.085
 tastes+0.083
fly+0.083
 slice+0.081
Neg logit contributions
uff-0.074
ga-0.072
ug-0.071
 ped-0.069
 dressed-0.069
Token I
Token ID314
Feature activation-2.66e-02

Pos logit contributions
 difference+0.131
fly+0.120
 tastes+0.118
andra+0.116
idents+0.112
Neg logit contributions
 dressed-0.187
 ped-0.175
 marching-0.170
 ramps-0.163
 Fay-0.162
Token have
Token ID423
Feature activation-6.87e-03

Pos logit contributions
 sir+0.036
 difference+0.034
 salad+0.034
 yeah+0.034
ue+0.033
Neg logit contributions
 dressed-0.053
aked-0.049
 ped-0.045
aved-0.045
ained-0.044
Token a
Token ID257
Feature activation-3.54e-02

Pos logit contributions
 difference+0.012
fly+0.011
 sir+0.011
ue+0.011
 ending+0.011
Neg logit contributions
 dressed-0.010
aked-0.010
 ped-0.009
aled-0.009
ips-0.009
Token large
Token ID1588
Feature activation-3.58e-02

Pos logit contributions
ue+0.047
 we+0.047
 yeah+0.045
 tastes+0.045
 why+0.042
Neg logit contributions
 dressed-0.056
 ped-0.053
aked-0.052
ga-0.052
 training-0.051
Token letter
Token ID3850
Feature activation-4.10e-02

Pos logit contributions
 we+0.047
ue+0.043
 yeah+0.043
 why+0.042
 tastes+0.041
Neg logit contributions
 dressed-0.060
aked-0.060
ga-0.058
 ped-0.058
pered-0.056
Token and
Token ID290
Feature activation+1.53e-03

Pos logit contributions
 dressed+0.011
aked+0.010
 ped+0.010
ips+0.009
aled+0.009
Neg logit contributions
 we-0.011
 difference-0.010
 tastes-0.010
 salad-0.010
 yeah-0.010
Token said
Token ID531
Feature activation+2.43e-03

Pos logit contributions
 dressed+0.003
aked+0.003
aled+0.003
 ped+0.003
ips+0.003
Neg logit contributions
 we-0.002
 difference-0.002
 salad-0.002
 tastes-0.002
 yeah-0.002
Token,
Token ID11
Feature activation-3.12e-02

Pos logit contributions
 dressed+0.005
ips+0.004
aked+0.004
 ped+0.004
years+0.004
Neg logit contributions
 difference-0.003
 forget-0.003
 salad-0.003
fly-0.003
 tastes-0.003
Token "
Token ID366
Feature activation-6.73e-03

Pos logit contributions
 difference+0.031
 salad+0.030
 we+0.029
 tastes+0.028
 slice+0.026
Neg logit contributions
 dressed-0.069
aked-0.067
ips-0.067
 ped-0.067
aled-0.066
TokenYes
Token ID5297
Feature activation-4.54e-02

Pos logit contributions
 salad+0.010
 difference+0.010
ccoli+0.009
 sir+0.009
 we+0.009
Neg logit contributions
 dressed-0.012
aked-0.012
years-0.012
 ped-0.011
ang-0.011
Token,
Token ID11
Feature activation-1.04e-01

Pos logit contributions
fly+0.067
 salad+0.063
 difference+0.062
 repay+0.062
ccoli+0.061
Neg logit contributions
ug-0.070
 dressed-0.070
ips-0.069
uff-0.068
ga-0.066
Token but
Token ID475
Feature activation-4.34e-02

Pos logit contributions
fly+0.171
 difference+0.144
man+0.144
 repay+0.143
ccoli+0.135
Neg logit contributions
 dressed-0.164
 ramps-0.140
usted-0.139
 ped-0.129
ug-0.127
Token first
Token ID717
Feature activation-6.37e-02

Pos logit contributions
 salad+0.048
 slice+0.046
 difference+0.045
ccoli+0.044
affe+0.044
Neg logit contributions
 dressed-0.090
 ped-0.080
ips-0.078
ained-0.077
aled-0.077
Token you
Token ID345
Feature activation-1.16e-02

Pos logit contributions
fly+0.064
 slice+0.062
 salad+0.062
 difference+0.060
 yeah+0.056
Neg logit contributions
 dressed-0.137
ips-0.128
 ped-0.128
 ramps-0.123
aked-0.122
Token have
Token ID423
Feature activation+1.08e-02

Pos logit contributions
 salad+0.009
 we+0.009
 difference+0.009
 yeah+0.008
 slice+0.008
Neg logit contributions
 dressed-0.028
aked-0.027
aled-0.027
ips-0.026
 ped-0.026
Token to
Token ID284
Feature activation+2.11e-02

Pos logit contributions
aked+0.023
 dressed+0.023
ips+0.023
aled+0.023
 ped+0.022
Neg logit contributions
 we-0.012
 salad-0.012
 difference-0.011
 sir-0.011
 yeah-0.011
Token and
Token ID290
Feature activation-1.45e-03

Pos logit contributions
 dressed+0.007
aked+0.007
 ped+0.007
ips+0.007
aled+0.007
Neg logit contributions
 we-0.007
 salad-0.006
 difference-0.006
 tastes-0.006
 yeah-0.006
Token said
Token ID531
Feature activation-7.00e-03

Pos logit contributions
 we+0.002
 difference+0.002
 salad+0.002
 slice+0.002
 sir+0.002
Neg logit contributions
 dressed-0.003
aked-0.003
aled-0.003
ips-0.003
 ped-0.003
Token,
Token ID11
Feature activation-3.49e-02

Pos logit contributions
 difference+0.009
 salad+0.008
 tastes+0.008
 forget+0.008
fly+0.008
Neg logit contributions
ips-0.013
 ped-0.013
 dressed-0.013
aked-0.013
years-0.013
Token "
Token ID366
Feature activation-1.76e-02

Pos logit contributions
 we+0.031
 difference+0.031
 salad+0.031
 tastes+0.028
 yeah+0.027
Neg logit contributions
 dressed-0.080
aked-0.079
ips-0.078
aled-0.078
 ped-0.078
TokenYes
Token ID5297
Feature activation-5.35e-02

Pos logit contributions
 salad+0.028
 difference+0.027
 sir+0.025
ccoli+0.024
 mushroom+0.024
Neg logit contributions
 dressed-0.030
aked-0.029
 ped-0.029
years-0.029
ang-0.029
Token,
Token ID11
Feature activation-1.03e-01

Pos logit contributions
fly+0.074
 salad+0.073
 difference+0.073
 tastes+0.072
 slice+0.068
Neg logit contributions
 dressed-0.085
ug-0.084
ips-0.083
ga-0.083
uff-0.082
Token please
Token ID3387
Feature activation-4.11e-02

Pos logit contributions
 difference+0.112
fly+0.109
 tastes+0.101
 feast+0.093
 salad+0.092
Neg logit contributions
 dressed-0.225
 ped-0.206
 ramps-0.199
ug-0.192
 buck-0.191
Token!"
Token ID2474
Feature activation-1.48e-02

Pos logit contributions
 salad+0.049
 difference+0.047
 we+0.044
 yeah+0.042
 slice+0.042
Neg logit contributions
 dressed-0.083
aked-0.081
ips-0.079
aled-0.079
 ped-0.078
Token They
Token ID1119
Feature activation-8.89e-03

Pos logit contributions
 we+0.019
 salad+0.018
 difference+0.018
 sir+0.017
 yeah+0.017
Neg logit contributions
aked-0.029
 dressed-0.029
aled-0.028
ips-0.028
 ped-0.028
Token played
Token ID2826
Feature activation+2.42e-02

Pos logit contributions
 would+0.013
 salad+0.011
 forgive+0.011
 we+0.011
 forget+0.010
Neg logit contributions
ips-0.018
ang-0.018
ga-0.017
angers-0.017
pering-0.017
Token on
Token ID319
Feature activation-6.62e-03

Pos logit contributions
 dressed+0.038
aked+0.038
aled+0.038
ips+0.037
years+0.036
Neg logit contributions
 we-0.040
 salad-0.039
 difference-0.038
 forgive-0.037
 slice-0.036
Token Tim
Token ID5045
Feature activation-7.51e-03

Pos logit contributions
 difference+0.036
 we+0.035
 salad+0.035
 tastes+0.033
 yeah+0.033
Neg logit contributions
aked-0.048
 dressed-0.048
aled-0.047
ips-0.047
 ped-0.046
Tokenmy
Token ID1820
Feature activation-1.14e-02

Pos logit contributions
 we+0.013
 salad+0.012
 ending+0.012
 difference+0.012
 feast+0.012
Neg logit contributions
 dressed-0.013
aked-0.012
 ped-0.012
aled-0.011
ips-0.011
Token.
Token ID13
Feature activation-5.69e-03

Pos logit contributions
 we+0.016
 salad+0.015
 difference+0.015
 tastes+0.014
 yeah+0.014
Neg logit contributions
 dressed-0.022
 ped-0.021
aked-0.020
ips-0.020
aled-0.020
Token "
Token ID366
Feature activation-1.49e-02

Pos logit contributions
 difference+0.007
 salad+0.007
 we+0.007
fly+0.007
 tastes+0.007
Neg logit contributions
aked-0.011
 dressed-0.011
ips-0.011
aled-0.011
 ped-0.011
TokenYes
Token ID5297
Feature activation-6.23e-02

Pos logit contributions
 salad+0.021
 difference+0.021
 we+0.020
 mushroom+0.019
 sir+0.019
Neg logit contributions
 dressed-0.028
aked-0.028
years-0.026
 ped-0.026
ang-0.026
Token,
Token ID11
Feature activation-1.03e-01

Pos logit contributions
fly+0.104
 mushroom+0.097
 tastes+0.097
ccoli+0.097
 writer+0.096
Neg logit contributions
ug-0.089
 dressed-0.078
uff-0.078
usted-0.077
ga-0.076
Token we
Token ID356
Feature activation-2.99e-02

Pos logit contributions
 difference+0.123
fly+0.116
 tastes+0.103
 ending+0.102
ccoli+0.100
Neg logit contributions
 dressed-0.212
ug-0.190
 ramps-0.184
 ped-0.184
usted-0.178
Token can
Token ID460
Feature activation-5.67e-03

Pos logit contributions
fly+0.028
 salad+0.028
 difference+0.027
 forgiveness+0.026
 tastes+0.025
Neg logit contributions
 dressed-0.067
aked-0.066
 ped-0.064
aled-0.063
ips-0.063
Token climb
Token ID12080
Feature activation+1.79e-02

Pos logit contributions
 salad+0.007
 difference+0.007
fly+0.007
ccoli+0.007
 ending+0.006
Neg logit contributions
 ped-0.011
 dressed-0.011
ips-0.011
aled-0.010
aked-0.010
Token the
Token ID262
Feature activation-6.79e-03

Pos logit contributions
aked+0.037
 dressed+0.035
aled+0.035
years+0.035
ips+0.035
Neg logit contributions
 we-0.024
 salad-0.021
 difference-0.020
 tastes-0.020
 sir-0.019
Token hill
Token ID12788
Feature activation-1.56e-02

Pos logit contributions
 we+0.008
 salad+0.007
 difference+0.007
 sir+0.007
 tastes+0.007
Neg logit contributions
aked-0.015
 dressed-0.014
aled-0.014
ips-0.014
years-0.014
Token"
Token ID1
Feature activation-2.57e-02

Pos logit contributions
aked+0.022
aled+0.021
 dressed+0.021
 ped+0.021
ga+0.021
Neg logit contributions
 salad-0.015
 difference-0.015
 we-0.014
 ending-0.014
 tastes-0.014
TokenI
Token ID40
Feature activation-1.91e-02

Pos logit contributions
 we+0.030
 salad+0.029
 difference+0.028
 sir+0.026
 tastes+0.026
Neg logit contributions
aked-0.053
 dressed-0.053
ips-0.052
aled-0.052
 ped-0.051
Token don
Token ID836
Feature activation-2.52e-03

Pos logit contributions
 salad+0.026
 difference+0.023
 pie+0.023
 slice+0.022
 sir+0.021
Neg logit contributions
 dressed-0.039
aked-0.036
 ped-0.035
aved-0.034
ga-0.034
Token't
Token ID470
Feature activation-4.01e-04

Pos logit contributions
 we+0.003
 salad+0.002
 sir+0.002
 forgive+0.002
 difference+0.002
Neg logit contributions
aked-0.006
aled-0.006
 dressed-0.006
years-0.006
ips-0.006
Token know
Token ID760
Feature activation-5.54e-02

Pos logit contributions
 we+0.000
 salad+0.000
 difference+0.000
 yeah+0.000
 slice+0.000
Neg logit contributions
 dressed-0.001
aked-0.001
aled-0.001
ips-0.001
 ped-0.001
Token,
Token ID11
Feature activation-1.03e-01

Pos logit contributions
 difference+0.065
 ending+0.064
 slice+0.062
 salad+0.061
 forget+0.060
Neg logit contributions
 dressed-0.109
 ped-0.101
ips-0.101
aked-0.099
 ramps-0.098
Token but
Token ID475
Feature activation-6.39e-02

Pos logit contributions
 slice+0.134
castle+0.129
fly+0.128
 tastes+0.128
ue+0.127
Neg logit contributions
 dressed-0.164
 Sn-0.150
ga-0.145
 F-0.144
 ramps-0.143
Token we
Token ID356
Feature activation-2.82e-02

Pos logit contributions
ue+0.068
 difference+0.062
 slice+0.059
andra+0.056
 feast+0.056
Neg logit contributions
 dressed-0.135
 marching-0.115
 ped-0.114
years-0.113
 ramps-0.111
Token can
Token ID460
Feature activation-6.88e-03

Pos logit contributions
 salad+0.029
 difference+0.028
 yeah+0.027
 slice+0.027
 sir+0.027
Neg logit contributions
 dressed-0.062
aled-0.057
aked-0.057
 ped-0.055
aved-0.054
Token try
Token ID1949
Feature activation-2.60e-02

Pos logit contributions
 salad+0.008
 difference+0.008
 yeah+0.008
 ending+0.008
 moment+0.008
Neg logit contributions
 dressed-0.014
aled-0.013
 ped-0.013
ips-0.012
ga-0.012
Token,"
Token ID553
Feature activation-1.32e-02

Pos logit contributions
 yeah+0.031
 difference+0.031
 salad+0.030
 we+0.030
 sir+0.030
Neg logit contributions
 dressed-0.052
ips-0.049
aked-0.048
 ped-0.048
ga-0.048
Token smiled
Token ID13541
Feature activation+1.12e-02

Pos logit contributions
 dressed+0.023
 ped+0.022
aked+0.022
years+0.021
aled+0.021
Neg logit contributions
 difference-0.019
 we-0.018
 salad-0.017
 sir-0.017
 ending-0.016
Token and
Token ID290
Feature activation+6.59e-03

Pos logit contributions
 dressed+0.019
aked+0.018
 ped+0.017
ips+0.017
aled+0.017
Neg logit contributions
 we-0.018
 difference-0.017
 salad-0.017
 yeah-0.017
 tastes-0.017
Token said
Token ID531
Feature activation-6.27e-03

Pos logit contributions
 dressed+0.013
aked+0.012
 ped+0.012
aled+0.012
ips+0.012
Neg logit contributions
 difference-0.009
 we-0.008
 salad-0.008
 sir-0.008
 yeah-0.008
Token "
Token ID366
Feature activation-9.73e-03

Pos logit contributions
 difference+0.009
 forget+0.009
fly+0.008
 repay+0.008
 salad+0.008
Neg logit contributions
 dressed-0.011
ips-0.011
years-0.010
 ped-0.010
opped-0.010
TokenYes
Token ID5297
Feature activation-5.51e-02

Pos logit contributions
 difference+0.017
 sir+0.017
 salad+0.016
 we+0.016
 sandwich+0.015
Neg logit contributions
aked-0.015
 dressed-0.015
years-0.014
ang-0.014
ips-0.014
Token,
Token ID11
Feature activation-1.03e-01

Pos logit contributions
fly+0.094
 repay+0.092
 difference+0.091
ccoli+0.083
 forgiveness+0.082
Neg logit contributions
ips-0.069
uff-0.069
usted-0.066
ula-0.066
ug-0.065
Token it
Token ID340
Feature activation-3.43e-02

Pos logit contributions
fly+0.170
 repay+0.162
 difference+0.153
ologist+0.150
man+0.147
Neg logit contributions
 dressed-0.150
 tight-0.128
 ramps-0.127
usted-0.127
 off-0.120
Token does
Token ID857
Feature activation+4.43e-03

Pos logit contributions
 sir+0.034
 we+0.033
 salad+0.031
 difference+0.031
 yeah+0.031
Neg logit contributions
 dressed-0.075
aked-0.075
ips-0.070
aled-0.067
 ped-0.065
Token!
Token ID0
Feature activation-1.54e-02

Pos logit contributions
 dressed+0.009
ips+0.008
aked+0.008
 ped+0.008
ga+0.008
Neg logit contributions
 difference-0.005
fly-0.005
 sir-0.005
 we-0.005
 salad-0.005
Token And
Token ID843
Feature activation-3.83e-02

Pos logit contributions
 salad+0.016
 difference+0.015
 we+0.015
 slice+0.014
 sandwich+0.014
Neg logit contributions
aked-0.034
 dressed-0.033
aled-0.033
ips-0.033
ga-0.032
Token it
Token ID340
Feature activation-2.98e-02

Pos logit contributions
 difference+0.036
 salad+0.035
fly+0.034
ue+0.033
 sir+0.032
Neg logit contributions
 dressed-0.085
aked-0.078
years-0.077
aled-0.077
 ped-0.077
Token just
Token ID655
Feature activation+1.18e-02

Pos logit contributions
 salad+0.011
 difference+0.011
 we+0.011
 sir+0.010
 slice+0.010
Neg logit contributions
 dressed-0.018
aked-0.018
 ped-0.018
ang-0.017
years-0.017
Token said
Token ID531
Feature activation-1.59e-02

Pos logit contributions
 dressed+0.023
aked+0.023
aled+0.023
ips+0.022
 ped+0.022
Neg logit contributions
 we-0.015
 salad-0.015
 difference-0.014
 sir-0.013
 slice-0.013
Token,
Token ID11
Feature activation-5.56e-02

Pos logit contributions
 difference+0.019
 salad+0.018
fly+0.018
 tastes+0.017
 slice+0.017
Neg logit contributions
 dressed-0.031
 ped-0.031
years-0.030
ips-0.030
ga-0.029
Token "
Token ID366
Feature activation-2.97e-02

Pos logit contributions
 difference+0.064
 salad+0.063
 tastes+0.059
 slice+0.058
 sandwich+0.056
Neg logit contributions
 dressed-0.115
 ped-0.112
ga-0.111
years-0.110
ips-0.108
TokenNo
Token ID2949
Feature activation-4.72e-02

Pos logit contributions
 salad+0.044
 difference+0.042
 we+0.041
 mushroom+0.039
 slice+0.039
Neg logit contributions
 dressed-0.053
aked-0.053
 ped-0.051
ips-0.051
aled-0.051
Token,
Token ID11
Feature activation-1.03e-01

Pos logit contributions
fly+0.055
 salad+0.055
 slice+0.055
 we+0.051
ccoli+0.051
Neg logit contributions
 dressed-0.090
ips-0.088
 ped-0.087
ga-0.082
aled-0.082
Token this
Token ID428
Feature activation-3.93e-02

Pos logit contributions
fly+0.147
man+0.147
 repay+0.138
flake+0.126
 difference+0.125
Neg logit contributions
 dressed-0.154
 ped-0.142
 ramps-0.135
 tight-0.134
 skipping-0.134
Token is
Token ID318
Feature activation+8.19e-03

Pos logit contributions
 salad+0.050
fly+0.047
 we+0.046
 difference+0.046
ccoli+0.045
Neg logit contributions
aked-0.076
ips-0.075
 dressed-0.075
 ped-0.074
aled-0.072
Token my
Token ID616
Feature activation-9.26e-03

Pos logit contributions
aked+0.016
 dressed+0.016
aled+0.015
ips+0.015
 ped+0.015
Neg logit contributions
 we-0.011
 salad-0.011
 difference-0.010
 yeah-0.010
 sir-0.010
Token swing
Token ID9628
Feature activation-3.19e-02

Pos logit contributions
 salad+0.012
 we+0.012
 difference+0.012
fly+0.011
 slice+0.011
Neg logit contributions
 dressed-0.017
ips-0.017
 ped-0.017
aked-0.016
aled-0.016
Token."
Token ID526
Feature activation-2.44e-02

Pos logit contributions
 salad+0.038
 difference+0.037
 we+0.037
 yeah+0.035
 forgive+0.034
Neg logit contributions
 dressed-0.063
aked-0.062
ips-0.062
aled-0.061
 ped-0.060
Token.
Token ID13
Feature activation-1.59e-02

Pos logit contributions
 difference+0.030
 we+0.029
 salad+0.029
 yeah+0.027
 tastes+0.027
Neg logit contributions
 dressed-0.040
aked-0.037
ips-0.037
 ped-0.036
aled-0.035
Token Mom
Token ID11254
Feature activation+1.67e-04

Pos logit contributions
 difference+0.028
 salad+0.027
fly+0.027
 forget+0.027
 we+0.026
Neg logit contributions
 dressed-0.022
ips-0.022
aked-0.022
aled-0.021
itted-0.020
Token said
Token ID531
Feature activation-1.65e-02

Pos logit contributions
 dressed+0.000
aked+0.000
 ped+0.000
years+0.000
ang+0.000
Neg logit contributions
 difference-0.000
 salad-0.000
 we-0.000
 yeah-0.000
 ending-0.000
Token "
Token ID366
Feature activation-2.56e-02

Pos logit contributions
 difference+0.022
fly+0.019
 tastes+0.018
 mushroom+0.017
 salad+0.017
Neg logit contributions
 dressed-0.034
years-0.032
 ped-0.031
ips-0.029
ga-0.029
TokenYes
Token ID5297
Feature activation-5.44e-02

Pos logit contributions
 salad+0.043
 difference+0.042
 sir+0.040
 we+0.040
 mushroom+0.039
Neg logit contributions
aked-0.042
 dressed-0.042
years-0.039
ips-0.039
 ped-0.038
Token,
Token ID11
Feature activation-1.02e-01

Pos logit contributions
fly+0.076
 difference+0.070
 salad+0.067
 repay+0.065
 tastes+0.065
Neg logit contributions
 dressed-0.092
ips-0.091
ug-0.088
ga-0.087
ula-0.086
Token here
Token ID994
Feature activation-6.58e-02

Pos logit contributions
fly+0.161
man+0.139
 repay+0.129
 difference+0.128
 ending+0.123
Neg logit contributions
 dressed-0.182
usted-0.154
 ramps-0.149
 ped-0.146
ips-0.145
Token's
Token ID338
Feature activation+7.89e-03

Pos logit contributions
fly+0.069
 salad+0.061
 difference+0.061
 yeah+0.059
 slice+0.058
Neg logit contributions
 dressed-0.142
ips-0.128
aked-0.125
 ped-0.123
ga-0.119
Token a
Token ID257
Feature activation-4.66e-02

Pos logit contributions
 dressed+0.015
aked+0.015
ips+0.015
aled+0.015
 ped+0.015
Neg logit contributions
 we-0.010
 difference-0.010
 salad-0.009
 yeah-0.009
 sir-0.009
Token perfect
Token ID2818
Feature activation-3.47e-02

Pos logit contributions
ue+0.061
 we+0.061
fly+0.059
 yeah+0.055
 Meg+0.054
Neg logit contributions
 dressed-0.070
 ped-0.070
 training-0.067
 leather-0.067
 buck-0.065
Token knife
Token ID9845
Feature activation-3.89e-02

Pos logit contributions
 we+0.042
 sir+0.037
fly+0.036
ue+0.036
keys+0.036
Neg logit contributions
 dressed-0.063
aked-0.063
 ped-0.060
inn-0.059
opped-0.057
Token Tom
Token ID4186
Feature activation+5.18e-03

Pos logit contributions
 difference+0.004
 salad+0.004
 we+0.004
 slice+0.003
fly+0.003
Neg logit contributions
aked-0.006
aled-0.006
 dressed-0.006
ips-0.006
 ped-0.006
Token said
Token ID531
Feature activation-8.60e-03

Pos logit contributions
 dressed+0.009
aked+0.009
 ped+0.009
aled+0.008
ips+0.008
Neg logit contributions
 difference-0.008
 salad-0.008
 we-0.008
 yeah-0.007
 sir-0.007
Token,
Token ID11
Feature activation-2.96e-02

Pos logit contributions
 difference+0.014
 salad+0.012
 tastes+0.012
fly+0.012
ccoli+0.012
Neg logit contributions
 dressed-0.014
 ped-0.014
years-0.013
ips-0.013
pered-0.013
Token "
Token ID366
Feature activation-1.47e-02

Pos logit contributions
 difference+0.029
 we+0.028
 salad+0.027
 tastes+0.025
 yeah+0.024
Neg logit contributions
 dressed-0.066
aked-0.065
 ped-0.065
ips-0.065
aled-0.064
TokenYes
Token ID5297
Feature activation-5.45e-02

Pos logit contributions
 salad+0.020
 difference+0.020
 we+0.019
 sir+0.018
ccoli+0.018
Neg logit contributions
aked-0.028
 dressed-0.028
years-0.027
aled-0.027
 ped-0.027
Token,
Token ID11
Feature activation-1.02e-01

Pos logit contributions
 difference+0.084
fly+0.083
ccoli+0.079
 tastes+0.079
 salad+0.077
Neg logit contributions
ug-0.081
ips-0.078
 dressed-0.077
inn-0.077
uff-0.077
Token come
Token ID1282
Feature activation+1.08e-02

Pos logit contributions
 difference+0.159
fly+0.153
ccoli+0.140
 ending+0.136
 moment+0.133
Neg logit contributions
 dressed-0.176
 ped-0.168
ug-0.155
 ramps-0.154
aled-0.145
Token to
Token ID284
Feature activation-5.58e-03

Pos logit contributions
 dressed+0.017
aked+0.017
aled+0.016
ips+0.016
 ped+0.016
Neg logit contributions
 salad-0.018
 difference-0.018
 yeah-0.017
 we-0.017
 ending-0.017
Token my
Token ID616
Feature activation-6.19e-03

Pos logit contributions
 difference+0.008
ccoli+0.008
 we+0.008
 yeah+0.008
 tastes+0.007
Neg logit contributions
 dressed-0.010
aled-0.010
 ped-0.009
years-0.009
aked-0.009
Token house
Token ID2156
Feature activation-1.22e-02

Pos logit contributions
 we+0.008
ccoli+0.008
 yeah+0.008
 salad+0.007
 sir+0.007
Neg logit contributions
 dressed-0.011
aked-0.011
ips-0.011
aled-0.010
 ped-0.010
Token."
Token ID526
Feature activation-1.40e-03

Pos logit contributions
 we+0.014
 salad+0.014
 difference+0.014
 yeah+0.013
 tastes+0.013
Neg logit contributions
 dressed-0.025
aked-0.025
ips-0.024
aled-0.024
 ped-0.023
Token went
Token ID1816
Feature activation+6.37e-02

Pos logit contributions
ips+0.010
ang+0.009
ained+0.009
ga+0.009
angers+0.009
Neg logit contributions
 would-0.007
 we-0.006
 salad-0.005
 forgive-0.005
 remember-0.005
Token back
Token ID736
Feature activation+1.71e-02

Pos logit contributions
years+0.139
aled+0.135
aked+0.135
ud+0.131
opped+0.131
Neg logit contributions
 we-0.078
 salad-0.064
 moment-0.064
 would-0.064
 they-0.064
Token outside
Token ID2354
Feature activation+1.21e-02

Pos logit contributions
aled+0.038
years+0.037
ga+0.037
ips+0.036
rol+0.036
Neg logit contributions
 we-0.020
 moment-0.016
 would-0.016
 salad-0.016
 bird-0.015
Token with
Token ID351
Feature activation-5.36e-03

Pos logit contributions
aked+0.025
aled+0.025
 dressed+0.025
ips+0.024
years+0.024
Neg logit contributions
 we-0.014
 salad-0.014
 difference-0.013
 forgive-0.012
 slice-0.012
Token their
Token ID511
Feature activation+1.05e-02

Pos logit contributions
 we+0.008
 salad+0.008
 difference+0.008
 sir+0.007
 yeah+0.007
Neg logit contributions
aked-0.009
 dressed-0.009
ips-0.009
aled-0.009
 ped-0.009
Token sp
Token ID599
Feature activation+8.01e-02

Pos logit contributions
aked+0.020
 dressed+0.019
ips+0.019
aled+0.019
 ped+0.018
Neg logit contributions
 we-0.014
 salad-0.014
 difference-0.014
 yeah-0.013
 sir-0.013
Tokenade
Token ID671
Feature activation+2.03e-02

Pos logit contributions
ga+0.181
 dressed+0.180
aled+0.179
 ped+0.176
years+0.175
Neg logit contributions
 salad-0.083
 we-0.070
 tastes-0.070
 sandwich-0.069
 difference-0.068
Token and
Token ID290
Feature activation-1.60e-03

Pos logit contributions
aked+0.042
aled+0.041
 dressed+0.041
ips+0.041
years+0.040
Neg logit contributions
 we-0.024
 salad-0.023
 difference-0.022
 sir-0.021
 forgive-0.021
Token started
Token ID2067
Feature activation+4.84e-02

Pos logit contributions
 we+0.002
 salad+0.002
 difference+0.002
 would+0.002
 slice+0.002
Neg logit contributions
ips-0.003
aked-0.003
aled-0.003
 dressed-0.003
ga-0.003
Token to
Token ID284
Feature activation+2.15e-02

Pos logit contributions
aled+0.098
ang+0.097
aked+0.097
ips+0.095
ga+0.094
Neg logit contributions
 we-0.063
 salad-0.060
 would-0.056
 pie-0.055
 moment-0.052
Token dig
Token ID3100
Feature activation+5.57e-02

Pos logit contributions
 dressed+0.044
aked+0.043
aled+0.042
 ped+0.042
ips+0.041
Neg logit contributions
 we-0.026
 salad-0.025
 difference-0.024
 sir-0.023
 yeah-0.023
Token wanted
Token ID2227
Feature activation-1.01e-02

Pos logit contributions
ips+0.036
aked+0.034
 dressed+0.034
ang+0.034
ga+0.033
Neg logit contributions
 we-0.017
 salad-0.016
 would-0.016
 forgive-0.015
 tastes-0.015
Token to
Token ID284
Feature activation-1.02e-02

Pos logit contributions
 we+0.012
 salad+0.012
 difference+0.011
 sir+0.011
 yeah+0.011
Neg logit contributions
aked-0.021
 dressed-0.021
ips-0.020
aled-0.020
 ped-0.020
Token play
Token ID711
Feature activation+1.25e-02

Pos logit contributions
 salad+0.015
ccoli+0.014
 mushroom+0.013
 turkey+0.013
fly+0.013
Neg logit contributions
 ped-0.018
 dressed-0.018
aled-0.018
ga-0.017
 ramps-0.016
Token in
Token ID287
Feature activation-4.19e-03

Pos logit contributions
aked+0.024
 dressed+0.023
aled+0.023
ips+0.023
years+0.022
Neg logit contributions
 we-0.017
 salad-0.016
 difference-0.016
 forgive-0.015
 sir-0.015
Token the
Token ID262
Feature activation+1.26e-02

Pos logit contributions
 we+0.005
 salad+0.005
 difference+0.005
 forgive+0.004
 slice+0.004
Neg logit contributions
aked-0.009
aled-0.009
 dressed-0.009
ips-0.009
years-0.009
Token p
Token ID279
Feature activation+8.10e-02

Pos logit contributions
aked+0.026
years+0.026
 dressed+0.025
aled+0.025
ips+0.025
Neg logit contributions
 salad-0.017
 we-0.016
 difference-0.016
 pie-0.016
 bird-0.015
Tokenuddle
Token ID24500
Feature activation-1.31e-03

Pos logit contributions
 dressed+0.155
aked+0.150
aled+0.149
years+0.147
ga+0.147
Neg logit contributions
 we-0.112
 difference-0.105
 salad-0.105
ue-0.102
 sir-0.101
Token.
Token ID13
Feature activation-1.09e-02

Pos logit contributions
 we+0.002
 difference+0.002
 salad+0.002
 yeah+0.002
 tastes+0.001
Neg logit contributions
 dressed-0.003
aked-0.003
ips-0.003
aled-0.002
 ped-0.002
Token They
Token ID1119
Feature activation-1.76e-02

Pos logit contributions
 salad+0.016
 we+0.015
 difference+0.015
 slice+0.015
 tastes+0.015
Neg logit contributions
aked-0.020
 dressed-0.019
ips-0.019
aled-0.019
 ped-0.018
Token did
Token ID750
Feature activation+5.03e-03

Pos logit contributions
 we+0.017
 would+0.017
 forgive+0.017
 salad+0.016
 forget+0.015
Neg logit contributions
ips-0.041
aked-0.040
aled-0.038
ang-0.038
 dressed-0.038
Token not
Token ID407
Feature activation-6.99e-03

Pos logit contributions
 ped+0.010
 dressed+0.009
aled+0.009
ips+0.009
 ramps+0.009
Neg logit contributions
fly-0.007
 sir-0.006
affe-0.006
 salad-0.006
 difference-0.006
Token She
Token ID1375
Feature activation+1.21e-02

Pos logit contributions
aked+0.019
 dressed+0.019
ang+0.018
 ped+0.018
aled+0.018
Neg logit contributions
 we-0.011
 salad-0.011
 difference-0.010
 yeah-0.010
 tastes-0.010
Token was
Token ID373
Feature activation+6.26e-02

Pos logit contributions
aled+0.025
ips+0.025
aked+0.025
years+0.025
ga+0.024
Neg logit contributions
 we-0.014
 would-0.013
 salad-0.013
 forgive-0.012
 tastes-0.012
Token happy
Token ID3772
Feature activation-3.52e-02

Pos logit contributions
aked+0.133
aled+0.131
ud+0.131
ang+0.130
ann+0.130
Neg logit contributions
 why-0.066
 forgive-0.065
 we-0.064
 remember-0.061
 not-0.061
Token when
Token ID618
Feature activation-2.72e-02

Pos logit contributions
 sir+0.057
 difference+0.057
 salad+0.055
ue+0.055
 slice+0.055
Neg logit contributions
 dressed-0.058
ips-0.053
 ped-0.052
years-0.047
ang-0.045
Token she
Token ID673
Feature activation+2.56e-02

Pos logit contributions
 difference+0.026
 we+0.025
 salad+0.025
 sir+0.024
 slice+0.024
Neg logit contributions
 dressed-0.061
ips-0.060
aked-0.059
 ped-0.059
years-0.058
Token pe
Token ID613
Feature activation+9.11e-02

Pos logit contributions
aled+0.056
aked+0.056
ips+0.056
ga+0.055
 dressed+0.055
Neg logit contributions
 we-0.027
 would-0.025
 salad-0.024
 tastes-0.023
 forgive-0.023
Tokened
Token ID276
Feature activation+1.38e-02

Pos logit contributions
 dressed+0.114
aled+0.114
ga+0.111
aked+0.110
 ramps+0.107
Neg logit contributions
 we-0.165
 yeah-0.160
 salad-0.159
 tastes-0.158
 slice-0.156
Token on
Token ID319
Feature activation-1.17e-02

Pos logit contributions
aked+0.027
aled+0.027
ips+0.026
 dressed+0.026
years+0.026
Neg logit contributions
 we-0.018
 salad-0.017
 difference-0.017
 tastes-0.016
 forgive-0.016
Token the
Token ID262
Feature activation+8.71e-03

Pos logit contributions
 we+0.013
 salad+0.013
 difference+0.012
 bird+0.011
 tastes+0.011
Neg logit contributions
aked-0.026
aled-0.025
ang-0.025
years-0.025
ips-0.024
Token toilet
Token ID16146
Feature activation+1.55e-02

Pos logit contributions
aked+0.018
years+0.017
aled+0.017
ips+0.017
 dressed+0.016
Neg logit contributions
 we-0.012
 salad-0.012
 difference-0.011
 tastes-0.011
 sandwich-0.011
Token.
Token ID13
Feature activation+5.52e-03

Pos logit contributions
aked+0.032
 dressed+0.032
aled+0.031
ips+0.031
 ped+0.030
Neg logit contributions
 we-0.018
 salad-0.017
 difference-0.017
 sir-0.016
 yeah-0.016
Token It
Token ID632
Feature activation-1.19e-02

Pos logit contributions
aked+0.009
 dressed+0.009
aled+0.009
ips+0.009
 ped+0.009
Neg logit contributions
 we-0.006
 salad-0.006
 difference-0.005
 yeah-0.005
 tastes-0.005
Token tasted
Token ID29187
Feature activation+1.75e-02

Pos logit contributions
 we+0.015
 salad+0.015
 difference+0.015
 sir+0.014
 yeah+0.014
Neg logit contributions
 dressed-0.023
aked-0.023
aled-0.022
 ped-0.022
ips-0.022
Token terrible
Token ID7818
Feature activation-2.26e-02

Pos logit contributions
 dressed+0.029
aked+0.028
aled+0.027
ips+0.026
years+0.026
Neg logit contributions
 we-0.027
 sir-0.026
 difference-0.026
 yeah-0.025
 salad-0.025
Token and
Token ID290
Feature activation-1.47e-02

Pos logit contributions
 we+0.027
 salad+0.026
 difference+0.026
 sir+0.024
 slice+0.024
Neg logit contributions
 dressed-0.046
aked-0.045
ips-0.044
aled-0.044
 ped-0.043
Token she
Token ID673
Feature activation+9.16e-03

Pos logit contributions
 difference+0.018
 we+0.017
 salad+0.017
 sir+0.017
 yeah+0.016
Neg logit contributions
 dressed-0.030
aked-0.029
aled-0.028
ips-0.028
 ped-0.028
Token rec
Token ID664
Feature activation+9.49e-02

Pos logit contributions
aked+0.018
 dressed+0.018
ips+0.018
aled+0.018
 ped+0.017
Neg logit contributions
 we-0.012
 salad-0.011
 difference-0.011
 forgive-0.010
 tastes-0.010
Tokeno
Token ID78
Feature activation+2.17e-02

Pos logit contributions
 ramps+0.192
 dressed+0.181
pered+0.180
ga+0.179
angers+0.178
Neg logit contributions
ue-0.112
 forgive-0.108
 slice-0.103
 yeah-0.103
 favor-0.098
Tokeniled
Token ID3902
Feature activation+1.73e-02

Pos logit contributions
aked+0.042
 dressed+0.042
ips+0.041
aled+0.041
 ped+0.040
Neg logit contributions
 we-0.028
 salad-0.027
 difference-0.026
 sir-0.025
 yeah-0.025
Token.
Token ID13
Feature activation-9.23e-04

Pos logit contributions
aked+0.034
 dressed+0.033
ips+0.033
aled+0.033
 ped+0.032
Neg logit contributions
 we-0.022
 salad-0.022
 difference-0.021
 tastes-0.020
 slice-0.020
Token\n
Token ID198
Feature activation+2.02e-02

Pos logit contributions
 we+0.001
 salad+0.001
 difference+0.001
 yeah+0.001
 sir+0.001
Neg logit contributions
aked-0.002
 dressed-0.002
aled-0.002
ips-0.002
 ped-0.002
Token\n
Token ID198
Feature activation+1.84e-02

Pos logit contributions
aked+0.047
 dressed+0.046
aled+0.046
ips+0.046
 ped+0.045
Neg logit contributions
 we-0.019
 salad-0.017
 difference-0.016
 forgive-0.016
 sir-0.015
Token the
Token ID262
Feature activation-1.46e-02

Pos logit contributions
 difference+0.012
 we+0.012
 salad+0.012
 slice+0.011
 sir+0.011
Neg logit contributions
 dressed-0.017
 ped-0.016
aked-0.016
years-0.015
aled-0.015
Token leader
Token ID3554
Feature activation-1.50e-02

Pos logit contributions
 we+0.021
 salad+0.020
 yeah+0.020
 difference+0.019
 slice+0.019
Neg logit contributions
 dressed-0.026
aked-0.026
aled-0.025
ips-0.025
 ped-0.024
Token of
Token ID286
Feature activation+6.54e-03

Pos logit contributions
 salad+0.020
 we+0.019
 difference+0.019
 yeah+0.018
 tastes+0.018
Neg logit contributions
 dressed-0.029
aked-0.028
 ped-0.027
ips-0.027
aled-0.027
Token the
Token ID262
Feature activation-5.81e-03

Pos logit contributions
aked+0.014
ips+0.014
aled+0.013
 dressed+0.013
ang+0.013
Neg logit contributions
 we-0.007
 salad-0.007
 difference-0.007
 forgive-0.007
 why-0.006
Token group
Token ID1448
Feature activation-1.51e-02

Pos logit contributions
 we+0.008
 salad+0.008
 difference+0.008
 yeah+0.008
 sir+0.007
Neg logit contributions
aked-0.011
 dressed-0.010
ips-0.010
aled-0.010
 ped-0.010
Token stepped
Token ID10764
Feature activation+9.24e-02

Pos logit contributions
 salad+0.021
 we+0.021
 difference+0.020
 yeah+0.020
 sir+0.020
Neg logit contributions
 dressed-0.027
aked-0.027
ips-0.026
 ped-0.026
aled-0.026
Token forward
Token ID2651
Feature activation+4.42e-03

Pos logit contributions
aled+0.167
ips+0.167
 ped+0.162
aked+0.161
angers+0.156
Neg logit contributions
 forgive-0.117
 bird-0.116
 we-0.113
 salad-0.112
 again-0.110
Token.
Token ID13
Feature activation+1.74e-02

Pos logit contributions
aked+0.009
 dressed+0.009
aled+0.009
ips+0.009
 ped+0.009
Neg logit contributions
 we-0.005
 salad-0.005
 difference-0.005
 sir-0.004
 forgive-0.004
Token\n
Token ID198
Feature activation+3.57e-02

Pos logit contributions
aked+0.043
aled+0.042
 ped+0.042
years+0.042
ips+0.041
Neg logit contributions
 we-0.015
 difference-0.012
 salad-0.012
 forgive-0.011
 sir-0.011
Token\n
Token ID198
Feature activation+3.74e-02

Pos logit contributions
aled+0.100
aked+0.097
years+0.097
ips+0.096
 ped+0.096
Neg logit contributions
 we-0.019
 forgive-0.017
-0.012
 difference-0.011
?'"-0.011
TokenShe
Token ID3347
Feature activation+1.17e-02

Pos logit contributions
aled+0.080
ips+0.076
aked+0.075
pered+0.072
 ped+0.072
Neg logit contributions
 we-0.047
 forgive-0.042
-0.039
 would-0.036
?'"-0.036
Token It
Token ID632
Feature activation-5.84e-03

Pos logit contributions
aked+0.043
 dressed+0.043
ips+0.043
aled+0.043
 ped+0.041
Neg logit contributions
 we-0.027
 salad-0.025
 difference-0.025
 sir-0.023
 yeah-0.023
Token was
Token ID373
Feature activation+4.78e-02

Pos logit contributions
 salad+0.006
 we+0.006
 difference+0.006
 sir+0.005
 yeah+0.005
Neg logit contributions
 dressed-0.013
aked-0.013
 ped-0.013
aled-0.012
ips-0.012
Token a
Token ID257
Feature activation-2.33e-02

Pos logit contributions
ips+0.107
aked+0.106
aled+0.105
ang+0.102
ga+0.098
Neg logit contributions
 salad-0.055
 we-0.051
 pie-0.048
 bird-0.048
 sandwich-0.048
Token mine
Token ID6164
Feature activation-1.69e-02

Pos logit contributions
 we+0.032
 salad+0.030
 difference+0.030
 yeah+0.029
 tastes+0.028
Neg logit contributions
 dressed-0.043
aked-0.042
 ped-0.041
ips-0.041
aled-0.040
Token!
Token ID0
Feature activation+2.46e-02

Pos logit contributions
 difference+0.025
 salad+0.023
 forgive+0.023
 sir+0.022
 we+0.022
Neg logit contributions
 dressed-0.031
ips-0.030
 ped-0.030
aked-0.030
ang-0.029
Token\n
Token ID198
Feature activation+2.99e-02

Pos logit contributions
aked+0.052
 dressed+0.051
aled+0.051
ips+0.051
 ped+0.050
Neg logit contributions
 we-0.029
 salad-0.026
 difference-0.025
 sir-0.024
 yeah-0.024
Token\n
Token ID198
Feature activation+3.78e-02

Pos logit contributions
aked+0.066
 dressed+0.065
aled+0.065
ips+0.065
 ped+0.063
Neg logit contributions
 we-0.031
 salad-0.029
 difference-0.028
 forgive-0.026
 sir-0.026
TokenM
Token ID44
Feature activation+4.59e-02

Pos logit contributions
aled+0.080
ips+0.079
 dressed+0.078
aked+0.077
 ped+0.076
Neg logit contributions
 we-0.046
 salad-0.041
 forgive-0.039
 slice-0.038
 difference-0.038
Tokenandy
Token ID10757
Feature activation+2.85e-02

Pos logit contributions
 dressed+0.100
aled+0.099
aked+0.099
ga+0.098
ips+0.096
Neg logit contributions
 we-0.049
 salad-0.048
 sir-0.045
 difference-0.045
ue-0.044
Token quickly
Token ID2952
Feature activation+1.19e-02

Pos logit contributions
aked+0.057
 dressed+0.057
ips+0.056
aled+0.056
 ped+0.054
Neg logit contributions
 we-0.035
 salad-0.034
 difference-0.033
 sir-0.031
 yeah-0.031
Token raced
Token ID32897
Feature activation+4.42e-02

Pos logit contributions
ips+0.024
ained+0.023
aked+0.022
aled+0.022
ang+0.021
Neg logit contributions
 we-0.016
 salad-0.016
 forgive-0.015
 slice-0.015
 sandwich-0.015
Token pasta
Token ID26296
Feature activation-3.11e-02

Pos logit contributions
fly+0.054
ue+0.051
 tweet+0.048
 we+0.048
affe+0.046
Neg logit contributions
 dressed-0.045
aked-0.037
 buck-0.036
iam-0.035
 skating-0.035
Token.
Token ID13
Feature activation-1.49e-02

Pos logit contributions
 difference+0.046
 yeah+0.046
fly+0.044
 we+0.043
ccoli+0.043
Neg logit contributions
 dressed-0.052
aked-0.050
ips-0.050
 ped-0.047
aled-0.046
Token They
Token ID1119
Feature activation-2.58e-02

Pos logit contributions
 we+0.019
 salad+0.017
 difference+0.017
 sir+0.016
 forgive+0.016
Neg logit contributions
aked-0.030
 dressed-0.030
aled-0.030
ips-0.030
years-0.029
Token decided
Token ID3066
Feature activation-4.20e-02

Pos logit contributions
 we+0.024
 would+0.024
 forgive+0.022
 salad+0.021
 forget+0.021
Neg logit contributions
ips-0.062
aked-0.059
ang-0.058
aled-0.058
ga-0.057
Token to
Token ID284
Feature activation-1.35e-02

Pos logit contributions
 difference+0.065
 salad+0.064
 mushroom+0.063
 yeah+0.063
fly+0.062
Neg logit contributions
 dressed-0.072
 ped-0.071
aked-0.063
aled-0.063
 buck-0.062
Token take
Token ID1011
Feature activation+1.91e-02

Pos logit contributions
 we+0.015
 salad+0.015
 yeah+0.014
 difference+0.014
 tastes+0.014
Neg logit contributions
 dressed-0.028
aled-0.027
 ped-0.027
aked-0.027
ips-0.026
Token the
Token ID262
Feature activation-2.78e-02

Pos logit contributions
ang+0.041
years+0.040
 dressed+0.040
aked+0.040
 ped+0.040
Neg logit contributions
 we-0.022
 salad-0.021
 difference-0.019
 tastes-0.019
 sandwich-0.019
Token pasta
Token ID26296
Feature activation-2.06e-03

Pos logit contributions
fly+0.044
ue+0.042
 yeah+0.039
 tweet+0.038
 Susan+0.038
Neg logit contributions
 dressed-0.047
aked-0.042
 buck-0.040
 ramps-0.039
ips-0.039
Token and
Token ID290
Feature activation-2.35e-02

Pos logit contributions
 we+0.003
 difference+0.003
 yeah+0.003
 salad+0.003
 tastes+0.003
Neg logit contributions
 dressed-0.004
aked-0.003
ips-0.003
aled-0.003
 ped-0.003
Token run
Token ID1057
Feature activation+5.68e-02

Pos logit contributions
 difference+0.037
ue+0.035
 yeah+0.035
fly+0.035
 moment+0.033
Neg logit contributions
 dressed-0.040
aled-0.037
 ped-0.037
aked-0.036
 ramps-0.034
Token away
Token ID1497
Feature activation+6.36e-03

Pos logit contributions
years+0.118
ips+0.113
ang+0.112
irling+0.110
uff+0.110
Neg logit contributions
 we-0.075
 would-0.065
 forgive-0.063
 moment-0.062
 bird-0.060
Token was
Token ID373
Feature activation+5.90e-02

Pos logit contributions
 we+0.000
 salad+0.000
 would+0.000
 difference+0.000
 forgive+0.000
Neg logit contributions
ips-0.000
aked-0.000
aled-0.000
 dressed-0.000
ga-0.000
Token so
Token ID523
Feature activation+4.60e-02

Pos logit contributions
ips+0.124
aked+0.124
aled+0.122
ud+0.120
angers+0.119
Neg logit contributions
 we-0.065
 salad-0.065
 slice-0.061
 pie-0.061
 forgive-0.060
Token excited
Token ID6568
Feature activation-4.04e-02

Pos logit contributions
aled+0.106
aked+0.103
ips+0.102
ga+0.100
 ramps+0.098
Neg logit contributions
 we-0.050
 salad-0.046
 forgive-0.043
 slice-0.043
 sir-0.040
Token,
Token ID11
Feature activation-3.21e-02

Pos logit contributions
 difference+0.060
 slice+0.059
 forgive+0.058
 yeah+0.056
 salad+0.056
Neg logit contributions
 dressed-0.072
 ped-0.069
aked-0.067
ips-0.065
aled-0.064
Token but
Token ID475
Feature activation-9.85e-03

Pos logit contributions
 we+0.024
 salad+0.022
 difference+0.021
 sir+0.019
 yeah+0.019
Neg logit contributions
aked-0.081
 dressed-0.080
ips-0.079
aled-0.079
 ped-0.078
Token also
Token ID635
Feature activation+2.34e-02

Pos logit contributions
 we+0.009
 salad+0.009
 difference+0.009
 sir+0.008
 yeah+0.008
Neg logit contributions
aked-0.023
 dressed-0.023
aled-0.022
ips-0.022
 ped-0.022
Token a
Token ID257
Feature activation-1.91e-02

Pos logit contributions
 dressed+0.037
aked+0.037
aled+0.036
ips+0.036
 ped+0.035
Neg logit contributions
 we-0.039
 salad-0.037
 difference-0.037
 sir-0.035
 yeah-0.035
Token little
Token ID1310
Feature activation-1.23e-02

Pos logit contributions
 we+0.027
 tastes+0.026
ue+0.026
 yeah+0.025
 salad+0.025
Neg logit contributions
 dressed-0.030
 ped-0.030
ang-0.029
aked-0.029
years-0.029
Token anxious
Token ID18116
Feature activation-2.59e-02

Pos logit contributions
 we+0.013
ue+0.013
 salad+0.013
 yeah+0.013
 tastes+0.012
Neg logit contributions
 dressed-0.025
ang-0.024
 ped-0.024
aked-0.024
years-0.023
Token.
Token ID13
Feature activation+2.95e-03

Pos logit contributions
 we+0.028
 salad+0.026
 yeah+0.024
 tastes+0.024
 slice+0.024
Neg logit contributions
 dressed-0.054
 ped-0.053
aked-0.053
ga-0.052
aled-0.052
Token He
Token ID679
Feature activation-5.28e-03

Pos logit contributions
 dressed+0.006
aked+0.006
ips+0.006
aled+0.006
 ped+0.006
Neg logit contributions
 we-0.004
 salad-0.003
 difference-0.003
 sir-0.003
 yeah-0.003
Token Ben
Token ID3932
Feature activation-5.22e-02

Pos logit contributions
 difference+0.123
fly+0.121
 tastes+0.120
 salad+0.116
 moment+0.116
Neg logit contributions
 dressed-0.105
 ped-0.097
ips-0.096
ga-0.094
aled-0.093
Token.
Token ID13
Feature activation-1.34e-02

Pos logit contributions
 salad+0.082
 yeah+0.081
 difference+0.078
 forgiveness+0.077
 slice+0.076
Neg logit contributions
aked-0.079
 dressed-0.078
ang-0.076
 ped-0.075
ga-0.072
Token I
Token ID314
Feature activation-1.63e-02

Pos logit contributions
 salad+0.011
 we+0.011
 difference+0.011
 tastes+0.010
 slice+0.010
Neg logit contributions
aked-0.032
 dressed-0.032
ips-0.031
aled-0.031
 ped-0.030
Token like
Token ID588
Feature activation-2.53e-02

Pos logit contributions
 salad+0.016
 we+0.016
 difference+0.015
 sir+0.014
 mushroom+0.014
Neg logit contributions
 dressed-0.038
aked-0.037
 ped-0.035
ips-0.035
aled-0.035
Token to
Token ID284
Feature activation+6.27e-03

Pos logit contributions
 we+0.032
 salad+0.029
 difference+0.026
 sir+0.026
 pie+0.025
Neg logit contributions
aked-0.056
aled-0.053
years-0.053
 dressed-0.052
ips-0.052
Token stay
Token ID2652
Feature activation+4.40e-02

Pos logit contributions
 dressed+0.013
aked+0.013
aled+0.012
ips+0.012
 ped+0.012
Neg logit contributions
 we-0.008
 salad-0.007
 difference-0.007
 sir-0.007
 yeah-0.007
Token in
Token ID287
Feature activation-4.61e-03

Pos logit contributions
years+0.098
aled+0.095
ann+0.093
aked+0.092
ud+0.092
Neg logit contributions
 we-0.055
 remember-0.048
?"-0.048
 forgive-0.046
 promise-0.046
Token the
Token ID262
Feature activation+1.67e-02

Pos logit contributions
 we+0.005
 difference+0.004
 salad+0.004
 bird+0.003
 I+0.003
Neg logit contributions
aked-0.012
years-0.011
itted-0.011
aled-0.011
angers-0.011
Token shallow
Token ID19337
Feature activation+1.68e-02

Pos logit contributions
years+0.033
aked+0.032
 dressed+0.032
aled+0.031
ga+0.031
Neg logit contributions
 salad-0.023
 difference-0.023
 we-0.023
 sandwich-0.021
 bird-0.021
Token end
Token ID886
Feature activation+8.76e-04

Pos logit contributions
aked+0.035
 dressed+0.035
aled+0.034
ips+0.034
 ped+0.034
Neg logit contributions
 we-0.019
 salad-0.019
 difference-0.018
 sir-0.017
 forgive-0.017
Token."
Token ID526
Feature activation-1.67e-02

Pos logit contributions
aked+0.002
 dressed+0.002
ips+0.002
aled+0.002
 ped+0.002
Neg logit contributions
 we-0.001
 salad-0.001
 difference-0.001
 yeah-0.001
 sir-0.001
Token arrived
Token ID5284
Feature activation+3.43e-02

Pos logit contributions
ips+0.058
aked+0.058
aled+0.056
ang+0.056
years+0.056
Neg logit contributions
 we-0.025
 salad-0.022
 would-0.021
 difference-0.020
 forgive-0.020
Token at
Token ID379
Feature activation-6.32e-04

Pos logit contributions
aked+0.075
aled+0.074
 dressed+0.074
ips+0.073
years+0.071
Neg logit contributions
 we-0.038
 salad-0.036
 difference-0.034
 sir-0.032
 yeah-0.032
Token their
Token ID511
Feature activation+2.51e-02

Pos logit contributions
 we+0.001
 salad+0.001
 difference+0.001
 sir+0.001
 slice+0.001
Neg logit contributions
aked-0.001
ips-0.001
aled-0.001
years-0.001
 dressed-0.001
Token hotel
Token ID7541
Feature activation+1.79e-02

Pos logit contributions
aked+0.056
ips+0.053
aled+0.053
years+0.052
ga+0.051
Neg logit contributions
 we-0.033
 difference-0.031
 salad-0.030
 bird-0.028
 sandwich-0.027
Token,
Token ID11
Feature activation-3.26e-02

Pos logit contributions
 dressed+0.040
aked+0.040
ips+0.039
 ped+0.039
aled+0.038
Neg logit contributions
 we-0.017
 salad-0.017
 difference-0.017
 yeah-0.016
 forgive-0.016
Token Tim
Token ID5045
Feature activation+2.18e-02

Pos logit contributions
 difference+0.046
 tastes+0.041
 slice+0.041
 salad+0.040
 favor+0.040
Neg logit contributions
 dressed-0.062
 ped-0.059
aked-0.057
aled-0.056
 ramps-0.055
Tokenmy
Token ID1820
Feature activation+2.42e-02

Pos logit contributions
aked+0.049
 dressed+0.049
aled+0.048
ips+0.048
 ped+0.047
Neg logit contributions
 we-0.022
 salad-0.021
 difference-0.021
 yeah-0.019
 sir-0.019
Token's
Token ID338
Feature activation+1.41e-02

Pos logit contributions
aked+0.057
ips+0.056
aled+0.056
 dressed+0.056
years+0.055
Neg logit contributions
 we-0.022
 salad-0.020
 difference-0.019
 sir-0.018
 forgive-0.018
Token mom
Token ID1995
Feature activation+5.37e-03

Pos logit contributions
aled+0.038
aked+0.038
ips+0.038
 dressed+0.037
years+0.036
Neg logit contributions
 we-0.009
 salad-0.008
 difference-0.008
 sir-0.008
 bird-0.007
Token asked
Token ID1965
Feature activation-5.29e-03

Pos logit contributions
 dressed+0.010
aked+0.010
 ped+0.009
ips+0.009
aled+0.009
Neg logit contributions
 we-0.007
 salad-0.007
 difference-0.007
 yeah-0.007
 sir-0.007
Token him
Token ID683
Feature activation-1.69e-02

Pos logit contributions
 we+0.009
 difference+0.009
 salad+0.008
 yeah+0.008
 tastes+0.008
Neg logit contributions
 dressed-0.008
aked-0.008
ips-0.008
aled-0.008
 ped-0.008
Token Sara
Token ID24799
Feature activation-1.81e-02

Pos logit contributions
 salad+0.066
fly+0.065
 difference+0.064
 we+0.064
 tastes+0.062
Neg logit contributions
 dressed-0.075
ips-0.071
aked-0.070
 ped-0.069
ang-0.068
Token.
Token ID13
Feature activation+5.06e-03

Pos logit contributions
 salad+0.028
 yeah+0.027
 we+0.027
 difference+0.027
 slice+0.026
Neg logit contributions
 dressed-0.030
aked-0.029
ips-0.028
ga-0.027
 ped-0.027
Token I
Token ID314
Feature activation+1.96e-03

Pos logit contributions
aked+0.010
 dressed+0.010
ips+0.010
aled+0.010
 ped+0.010
Neg logit contributions
 salad-0.006
 we-0.006
 difference-0.006
 slice-0.006
 sir-0.005
Token'm
Token ID1101
Feature activation+5.97e-02

Pos logit contributions
 dressed+0.004
aked+0.004
ips+0.004
aled+0.004
 ped+0.004
Neg logit contributions
 salad-0.002
 we-0.002
 difference-0.002
 sir-0.002
 yeah-0.002
Token glad
Token ID9675
Feature activation-5.65e-02

Pos logit contributions
aked+0.115
aled+0.114
ips+0.110
 ped+0.108
 dressed+0.107
Neg logit contributions
 we-0.085
 salad-0.078
 forgive-0.075
 difference-0.074
 sir-0.074
Token you
Token ID345
Feature activation+3.40e-03

Pos logit contributions
 difference+0.057
 slice+0.055
 salad+0.053
fly+0.052
 mushroom+0.049
Neg logit contributions
 dressed-0.127
ips-0.119
 ped-0.118
ang-0.114
years-0.114
Token like
Token ID588
Feature activation-1.76e-02

Pos logit contributions
 dressed+0.007
aked+0.007
aled+0.007
ips+0.007
years+0.006
Neg logit contributions
 we-0.004
 salad-0.004
 difference-0.004
 sir-0.004
 yeah-0.004
Token them
Token ID606
Feature activation-3.19e-02

Pos logit contributions
 difference+0.025
 yeah+0.024
 we+0.024
fly+0.024
 sir+0.024
Neg logit contributions
 dressed-0.031
ips-0.030
aked-0.029
aled-0.028
years-0.028
Token.
Token ID13
Feature activation-1.84e-03

Pos logit contributions
 we+0.038
 salad+0.036
 difference+0.035
 forgive+0.033
 sir+0.033
Neg logit contributions
aked-0.066
 dressed-0.065
aled-0.065
ips-0.064
 ped-0.063
Token Do
Token ID2141
Feature activation-3.17e-02

Pos logit contributions
 salad+0.002
 we+0.002
 difference+0.002
 slice+0.002
 sandwich+0.002
Neg logit contributions
aked-0.004
 dressed-0.004
aled-0.003
ips-0.003
ga-0.003
Token you
Token ID345
Feature activation-1.15e-02

Pos logit contributions
fly+0.053
 difference+0.051
 slice+0.049
 mushroom+0.045
ccoli+0.045
Neg logit contributions
 dressed-0.051
 ped-0.050
ips-0.048
years-0.048
aked-0.045
Token named
Token ID3706
Feature activation+1.30e-02

Pos logit contributions
 salad+0.012
 we+0.011
 difference+0.011
 tastes+0.011
 yeah+0.011
Neg logit contributions
 dressed-0.019
aked-0.018
 ped-0.018
ips-0.017
ang-0.017
Token Tim
Token ID5045
Feature activation-6.03e-05

Pos logit contributions
 dressed+0.026
aked+0.026
aled+0.026
ips+0.026
 ped+0.025
Neg logit contributions
 we-0.016
 salad-0.015
 difference-0.015
 slice-0.014
 yeah-0.014
Token.
Token ID13
Feature activation+9.55e-03

Pos logit contributions
 salad+0.000
 we+0.000
 difference+0.000
 slice+0.000
 yeah+0.000
Neg logit contributions
 dressed-0.000
aked-0.000
 ped-0.000
aled-0.000
ips-0.000
Token
Token ID198
Feature activation+1.46e-02

Pos logit contributions
aked+0.017
 dressed+0.017
aled+0.016
ips+0.016
 ped+0.016
Neg logit contributions
 salad-0.013
 we-0.013
 difference-0.013
 tastes-0.013
 slice-0.013
Token
Token ID198
Feature activation+2.15e-02

Pos logit contributions
aked+0.032
 dressed+0.032
aled+0.032
ips+0.032
 ped+0.031
Neg logit contributions
 we-0.015
 salad-0.014
 difference-0.014
 forgive-0.013
 sir-0.013
TokenOne
Token ID3198
Feature activation-9.57e-03

Pos logit contributions
 dressed+0.044
ips+0.043
aked+0.043
aled+0.043
years+0.041
Neg logit contributions
 we-0.027
 salad-0.024
 forgive-0.023
 difference-0.023
ue-0.023
Token day
Token ID1110
Feature activation-1.25e-02

Pos logit contributions
 we+0.011
 salad+0.011
 difference+0.010
 forgive+0.010
 sir+0.010
Neg logit contributions
ips-0.020
aled-0.020
aked-0.020
 dressed-0.020
years-0.019
Token,
Token ID11
Feature activation-3.95e-02

Pos logit contributions
 difference+0.012
 salad+0.011
 we+0.011
 tastes+0.010
 slice+0.010
Neg logit contributions
aked-0.029
 dressed-0.029
ips-0.028
 ped-0.028
aled-0.028
Token Tim
Token ID5045
Feature activation+1.16e-02

Pos logit contributions
 difference+0.073
fly+0.064
 tastes+0.064
ccoli+0.063
 salad+0.061
Neg logit contributions
 dressed-0.062
 ped-0.061
aked-0.056
years-0.054
aled-0.054
Token and
Token ID290
Feature activation-3.35e-02

Pos logit contributions
aked+0.025
 dressed+0.025
ips+0.025
aled+0.025
 ped+0.024
Neg logit contributions
 we-0.012
 salad-0.012
 difference-0.011
 sir-0.011
 forgive-0.011
Token Wednesday
Token ID3583
Feature activation-4.08e-03

Pos logit contributions
 difference+0.057
 we+0.053
 tastes+0.053
fly+0.052
 yeah+0.051
Neg logit contributions
 dressed-0.054
 ped-0.054
aled-0.050
aked-0.050
years-0.047
Token to
Token ID284
Feature activation-7.99e-03

Pos logit contributions
 salad+0.000
 we+0.000
 pie+0.000
 bird+0.000
 would+0.000
Neg logit contributions
aked-0.001
aled-0.001
ips-0.001
ang-0.001
pered-0.001
Token play
Token ID711
Feature activation+1.43e-02

Pos logit contributions
 we+0.009
 difference+0.009
 salad+0.009
 sir+0.008
 yeah+0.008
Neg logit contributions
 dressed-0.016
aked-0.016
aled-0.016
ips-0.016
 ped-0.016
Token in
Token ID287
Feature activation-1.21e-03

Pos logit contributions
aked+0.024
aled+0.024
years+0.023
ips+0.022
 dressed+0.022
Neg logit contributions
 salad-0.024
 we-0.024
 sandwich-0.022
 forgive-0.022
 mushroom-0.022
Token the
Token ID262
Feature activation+9.27e-03

Pos logit contributions
 we+0.001
 salad+0.001
 bird+0.001
 pie+0.001
 mushroom+0.001
Neg logit contributions
aked-0.003
aled-0.003
ips-0.003
ang-0.003
itted-0.003
Token park
Token ID3952
Feature activation-4.56e-03

Pos logit contributions
aled+0.020
aked+0.020
ips+0.020
years+0.020
ang+0.019
Neg logit contributions
 salad-0.012
 we-0.011
 bird-0.010
 pie-0.010
 difference-0.010
Token.
Token ID13
Feature activation-6.15e-03

Pos logit contributions
 we+0.005
 salad+0.005
 difference+0.004
 forgive+0.004
 slice+0.004
Neg logit contributions
aked-0.010
aled-0.010
ips-0.010
 dressed-0.010
years-0.009
Token She
Token ID1375
Feature activation+1.01e-02

Pos logit contributions
 sir+0.012
 we+0.012
 sandwich+0.012
fly+0.012
 forgiveness+0.012
Neg logit contributions
 dressed-0.008
aked-0.007
aled-0.006
itted-0.006
 ped-0.006
Token especially
Token ID2592
Feature activation+2.61e-02

Pos logit contributions
ips+0.025
aled+0.025
aked+0.024
years+0.024
 dressed+0.024
Neg logit contributions
 we-0.008
 would-0.008
 salad-0.007
 moment-0.006
 tastes-0.006
Token loved
Token ID6151
Feature activation+4.00e-03

Pos logit contributions
ips+0.061
aled+0.060
rol+0.057
years+0.057
alled+0.057
Neg logit contributions
 would-0.024
 we-0.022
 moment-0.021
 salad-0.020
 forgive-0.019
Token to
Token ID284
Feature activation-1.14e-03

Pos logit contributions
aked+0.010
aled+0.010
ips+0.010
ang+0.009
opped+0.009
Neg logit contributions
 we-0.004
 salad-0.003
 would-0.003
 pie-0.003
 bird-0.003
Token sit
Token ID1650
Feature activation+5.62e-02

Pos logit contributions
 we+0.002
 difference+0.001
 salad+0.001
 sir+0.001
 yeah+0.001
Neg logit contributions
 dressed-0.002
aked-0.002
aled-0.002
 ped-0.002
ips-0.002
Token loved
Token ID6151
Feature activation+2.80e-05

Pos logit contributions
 dressed+0.007
aked+0.007
 ped+0.006
ips+0.006
aled+0.006
Neg logit contributions
 we-0.005
 salad-0.005
 difference-0.005
 yeah-0.004
 sir-0.004
Token it
Token ID340
Feature activation-1.76e-02

Pos logit contributions
aked+0.000
 dressed+0.000
ips+0.000
aled+0.000
 ped+0.000
Neg logit contributions
 we-0.000
 salad-0.000
 difference-0.000
 sir-0.000
 yeah-0.000
Token very
Token ID845
Feature activation+1.39e-02

Pos logit contributions
 we+0.025
 salad+0.024
 difference+0.024
 sir+0.022
 yeah+0.022
Neg logit contributions
aked-0.032
 dressed-0.032
aled-0.031
ips-0.031
 ped-0.030
Token much
Token ID881
Feature activation-1.84e-03

Pos logit contributions
 dressed+0.021
aked+0.021
ips+0.020
aled+0.020
 ped+0.020
Neg logit contributions
 we-0.023
 salad-0.023
 difference-0.023
 yeah-0.022
 sir-0.022
Token.
Token ID13
Feature activation-1.21e-03

Pos logit contributions
 we+0.002
 salad+0.002
 difference+0.002
 sir+0.002
 yeah+0.002
Neg logit contributions
aked-0.004
 dressed-0.004
ips-0.004
aled-0.004
 ped-0.004
Token
Token ID198
Feature activation-3.82e-03

Pos logit contributions
 salad+0.002
 we+0.002
 difference+0.002
 tastes+0.002
 sandwich+0.002
Neg logit contributions
 dressed-0.002
aled-0.002
aked-0.002
ips-0.002
ga-0.002
Token
Token ID198
Feature activation+4.58e-03

Pos logit contributions
 salad+0.005
 we+0.005
 difference+0.005
 tastes+0.005
 yeah+0.005
Neg logit contributions
 dressed-0.007
aked-0.007
ips-0.007
 ped-0.007
aled-0.007
TokenThe
Token ID464
Feature activation-1.00e-02

Pos logit contributions
ips+0.011
aled+0.011
aked+0.010
 dressed+0.010
pered+0.010
Neg logit contributions
 we-0.005
 salad-0.004
ue-0.004
 would-0.004
 moment-0.004
Token mom
Token ID1995
Feature activation-3.43e-03

Pos logit contributions
 we+0.011
 salad+0.010
 difference+0.010
 sir+0.009
 forgive+0.009
Neg logit contributions
aked-0.022
 dressed-0.022
ips-0.021
aled-0.021
 ped-0.021
Tokenmy
Token ID1820
Feature activation+3.15e-03

Pos logit contributions
 difference+0.004
 we+0.004
 salad+0.004
 yeah+0.004
 sir+0.004
Neg logit contributions
 dressed-0.007
aked-0.007
 ped-0.007
aled-0.006
ang-0.006
Token wanted
Token ID2227
Feature activation-1.47e-02

Pos logit contributions
 dressed+0.006
aked+0.006
aled+0.006
ips+0.006
 ped+0.006
Neg logit contributions
 we-0.004
 salad-0.004
 difference-0.004
 yeah-0.004
 sir-0.004
Token an
Token ID281
Feature activation+1.42e-02

Pos logit contributions
 we+0.025
 difference+0.024
 salad+0.024
 yeah+0.023
 sir+0.022
Neg logit contributions
aked-0.025
 dressed-0.025
aled-0.024
ips-0.024
 ped-0.024
Token enormous
Token ID9812
Feature activation-3.67e-02

Pos logit contributions
ips+0.030
aked+0.029
aled+0.027
years+0.027
iddling+0.026
Neg logit contributions
 salad-0.019
 pie-0.019
 bird-0.018
 sir-0.018
 turkey-0.018
Token hole
Token ID7604
Feature activation+1.13e-02

Pos logit contributions
 we+0.043
 salad+0.040
 difference+0.040
 yeah+0.038
 forgive+0.038
Neg logit contributions
 dressed-0.074
aked-0.074
ips-0.073
aled-0.072
 ped-0.071
Token in
Token ID287
Feature activation-8.19e-03

Pos logit contributions
aked+0.017
aled+0.017
 dressed+0.017
ips+0.017
years+0.016
Neg logit contributions
 we-0.020
 salad-0.019
 difference-0.018
 slice-0.018
 tastes-0.018
Token the
Token ID262
Feature activation-3.88e-03

Pos logit contributions
 we+0.010
 salad+0.009
 difference+0.009
 tastes+0.008
 slice+0.008
Neg logit contributions
aked-0.017
ips-0.017
aled-0.017
years-0.017
 dressed-0.017
Token ground
Token ID2323
Feature activation+6.65e-03

Pos logit contributions
 we+0.006
 salad+0.006
 difference+0.006
 slice+0.006
 sir+0.006
Neg logit contributions
aked-0.006
 dressed-0.006
ips-0.006
aled-0.006
years-0.006
Token.
Token ID13
Feature activation-2.02e-03

Pos logit contributions
aled+0.015
aked+0.015
years+0.015
 dressed+0.015
ips+0.014
Neg logit contributions
 we-0.007
 salad-0.006
 difference-0.006
 tastes-0.006
 slice-0.006
Token They
Token ID1119
Feature activation-5.20e-03

Pos logit contributions
 we+0.003
 salad+0.003
 difference+0.003
 yeah+0.003
 forgive+0.003
Neg logit contributions
aked-0.004
 dressed-0.004
ips-0.004
aled-0.004
 ped-0.003
Token wondered
Token ID14028
Feature activation-3.68e-02

Pos logit contributions
 we+0.006
 salad+0.005
 would+0.005
 forgive+0.005
 sir+0.005
Neg logit contributions
ips-0.012
aled-0.012
aked-0.012
 dressed-0.011
ga-0.011
Token what
Token ID644
Feature activation-4.80e-02

Pos logit contributions
 difference+0.046
 salad+0.043
fly+0.043
ue+0.043
 repay+0.043
Neg logit contributions
 dressed-0.075
 ped-0.072
 ramps-0.066
 buck-0.065
 marching-0.064
Token was
Token ID373
Feature activation+2.62e-02

Pos logit contributions
 yeah+0.054
 salad+0.053
 difference+0.053
 sir+0.052
ccoli+0.049
Neg logit contributions
 dressed-0.102
aked-0.097
 ped-0.094
ips-0.091
aled-0.088
Token "
Token ID366
Feature activation+7.89e-04

Pos logit contributions
 we+0.025
 salad+0.025
 difference+0.024
 tastes+0.022
 yeah+0.021
Neg logit contributions
 dressed-0.065
aked-0.064
ips-0.064
aled-0.064
 ped-0.063
TokenL
Token ID43
Feature activation+2.33e-02

Pos logit contributions
aked+0.001
 dressed+0.001
ips+0.001
aled+0.001
 ped+0.001
Neg logit contributions
 we-0.001
 salad-0.001
 difference-0.001
 forgive-0.001
 yeah-0.001
Tokenily
Token ID813
Feature activation-4.56e-02

Pos logit contributions
aked+0.038
 dressed+0.038
ips+0.037
aled+0.037
 ped+0.036
Neg logit contributions
 we-0.037
 salad-0.036
 difference-0.036
 sir-0.034
 yeah-0.034
Token and
Token ID290
Feature activation-4.72e-02

Pos logit contributions
 we+0.049
 salad+0.049
 difference+0.048
 yeah+0.044
 slice+0.043
Neg logit contributions
 dressed-0.097
aked-0.097
ips-0.095
aled-0.092
 ped-0.091
Token Tom
Token ID4186
Feature activation-5.00e-02

Pos logit contributions
fly+0.071
 difference+0.069
 salad+0.068
 we+0.066
 mushroom+0.065
Neg logit contributions
 dressed-0.081
aled-0.079
ips-0.079
 ped-0.078
aked-0.077
Token,
Token ID11
Feature activation-6.18e-02

Pos logit contributions
 salad+0.053
 we+0.052
 difference+0.051
 yeah+0.048
 slice+0.048
Neg logit contributions
 dressed-0.107
aked-0.105
ips-0.103
aled-0.099
 ped-0.099
Token you
Token ID345
Feature activation-2.08e-02

Pos logit contributions
ue+0.069
fly+0.067
 salad+0.065
 difference+0.064
 mushroom+0.062
Neg logit contributions
 dressed-0.129
 ped-0.118
ips-0.115
ga-0.111
years-0.111
Token did
Token ID750
Feature activation+1.87e-02

Pos logit contributions
 we+0.018
 salad+0.017
 difference+0.017
 yeah+0.015
 pie+0.015
Neg logit contributions
 dressed-0.049
aked-0.048
aled-0.047
ips-0.046
 ped-0.046
Token very
Token ID845
Feature activation+7.58e-03

Pos logit contributions
 dressed+0.034
 ped+0.032
aled+0.032
ips+0.031
aked+0.031
Neg logit contributions
 difference-0.025
 we-0.025
fly-0.025
 sir-0.025
 salad-0.025
Token well
Token ID880
Feature activation+1.83e-02

Pos logit contributions
aked+0.011
 dressed+0.011
ips+0.010
aled+0.010
 ped+0.010
Neg logit contributions
 we-0.014
 salad-0.014
 difference-0.013
 sir-0.013
 yeah-0.013
Token.
Token ID13
Feature activation-8.10e-03

Pos logit contributions
aked+0.043
aled+0.042
years+0.041
ips+0.041
opped+0.040
Neg logit contributions
 we-0.019
 salad-0.016
 difference-0.015
 sir-0.015
 forgive-0.015
Token the
Token ID262
Feature activation-2.33e-02

Pos logit contributions
 difference+0.009
 we+0.009
 salad+0.009
 yeah+0.009
 forgive+0.009
Neg logit contributions
 dressed-0.013
aked-0.012
ips-0.012
aled-0.012
 ped-0.012
Token lions
Token ID31420
Feature activation-4.12e-02

Pos logit contributions
 we+0.024
 yeah+0.023
 difference+0.023
 salad+0.022
ue+0.022
Neg logit contributions
 dressed-0.049
aked-0.049
ips-0.048
aled-0.047
ga-0.046
Token.
Token ID13
Feature activation-4.46e-03

Pos logit contributions
 difference+0.055
 yeah+0.053
 salad+0.053
 we+0.053
 forgive+0.051
Neg logit contributions
 dressed-0.076
aked-0.075
ips-0.075
aled-0.073
 ped-0.072
Token It
Token ID632
Feature activation-3.72e-02

Pos logit contributions
 we+0.005
 salad+0.005
 difference+0.005
 sir+0.005
 yeah+0.005
Neg logit contributions
aked-0.009
 dressed-0.009
ips-0.009
aled-0.009
 ped-0.009
Token's
Token ID338
Feature activation+3.37e-02

Pos logit contributions
 yeah+0.035
 salad+0.035
 sir+0.035
 difference+0.034
 ending+0.033
Neg logit contributions
 dressed-0.081
aled-0.081
aked-0.080
ips-0.079
 ped-0.077
Token okay
Token ID8788
Feature activation-4.85e-02

Pos logit contributions
aked+0.061
ips+0.059
 dressed+0.059
aled+0.059
years+0.058
Neg logit contributions
 we-0.050
 salad-0.048
 difference-0.047
 sir-0.044
 tastes-0.044
Token.
Token ID13
Feature activation-1.40e-02

Pos logit contributions
 yeah+0.094
 difference+0.088
andra+0.084
 feast+0.083
 salad+0.083
Neg logit contributions
 dressed-0.064
ips-0.052
ind-0.051
ained-0.050
aked-0.049
Token We
Token ID775
Feature activation-1.95e-02

Pos logit contributions
 salad+0.016
 difference+0.015
 we+0.015
 sandwich+0.014
 slice+0.014
Neg logit contributions
aled-0.029
aked-0.029
 dressed-0.029
ips-0.028
ga-0.027
Token can
Token ID460
Feature activation-7.03e-03

Pos logit contributions
 we+0.015
 salad+0.014
 difference+0.013
 sir+0.013
 yeah+0.013
Neg logit contributions
 dressed-0.049
aked-0.048
aled-0.047
ips-0.046
 ped-0.045
Token clean
Token ID3424
Feature activation+7.22e-03

Pos logit contributions
 salad+0.008
 difference+0.008
 ending+0.008
 sir+0.008
 yeah+0.008
Neg logit contributions
 dressed-0.014
aled-0.013
ips-0.013
 ped-0.013
ga-0.013
Token up
Token ID510
Feature activation-3.07e-02

Pos logit contributions
 dressed+0.012
ips+0.012
aked+0.012
aled+0.012
 ped+0.011
Neg logit contributions
 we-0.011
 salad-0.011
 difference-0.011
 forgive-0.010
 yeah-0.010
TokenThat
Token ID2504
Feature activation-5.40e-02

Pos logit contributions
 salad+0.016
 we+0.016
 difference+0.015
 sir+0.015
 tastes+0.015
Neg logit contributions
aked-0.020
 dressed-0.020
aled-0.019
ips-0.019
years-0.019
Token was
Token ID373
Feature activation+3.34e-02

Pos logit contributions
 yeah+0.064
 salad+0.063
 sir+0.063
ue+0.061
 we+0.060
Neg logit contributions
 dressed-0.103
 ped-0.091
aked-0.088
 buck-0.085
ained-0.084
Token a
Token ID257
Feature activation-2.13e-02

Pos logit contributions
 dressed+0.053
aked+0.050
aled+0.049
 ped+0.047
years+0.047
Neg logit contributions
 salad-0.053
 we-0.053
 difference-0.053
 yeah-0.051
 slice-0.051
Token great
Token ID1049
Feature activation-1.15e-02

Pos logit contributions
 we+0.034
ue+0.033
 tastes+0.032
 salad+0.032
 yeah+0.032
Neg logit contributions
 dressed-0.031
 ped-0.029
aked-0.029
ga-0.028
ang-0.028
Token story
Token ID1621
Feature activation-2.85e-02

Pos logit contributions
 we+0.015
fly+0.014
 tastes+0.014
 yeah+0.014
ue+0.014
Neg logit contributions
 dressed-0.020
 ped-0.019
aked-0.019
aled-0.018
ga-0.018
Token,
Token ID11
Feature activation-5.82e-02

Pos logit contributions
 salad+0.035
 difference+0.034
 yeah+0.033
 tastes+0.032
 we+0.031
Neg logit contributions
 dressed-0.058
aked-0.055
ips-0.054
aled-0.053
 ped-0.053
Token Benny
Token ID44275
Feature activation-3.18e-02

Pos logit contributions
 difference+0.068
 salad+0.067
 we+0.066
 tastes+0.066
 yeah+0.064
Neg logit contributions
 dressed-0.115
 ped-0.107
aked-0.105
aled-0.105
ga-0.104
Token.
Token ID13
Feature activation-1.10e-02

Pos logit contributions
 yeah+0.038
 we+0.037
 salad+0.037
 difference+0.037
 slice+0.035
Neg logit contributions
 dressed-0.062
aked-0.059
ga-0.059
 ped-0.058
ang-0.057
Token Thank
Token ID6952
Feature activation-1.18e-02

Pos logit contributions
 salad+0.012
 we+0.012
 difference+0.012
 slice+0.011
 sandwich+0.011
Neg logit contributions
 dressed-0.023
aked-0.023
aled-0.023
ips-0.023
 ped-0.022
Token you
Token ID345
Feature activation-3.23e-02

Pos logit contributions
 difference+0.011
 salad+0.011
 we+0.010
 tastes+0.010
 slice+0.010
Neg logit contributions
 dressed-0.027
aked-0.027
ips-0.026
years-0.026
aled-0.026
Token!"
Token ID2474
Feature activation-2.42e-03

Pos logit contributions
 difference+0.045
 salad+0.044
 we+0.044
 tastes+0.044
 yeah+0.043
Neg logit contributions
 dressed-0.056
aked-0.055
years-0.053
ips-0.053
aled-0.053
Token.
Token ID13
Feature activation-1.41e-02

Pos logit contributions
 we+0.010
 salad+0.009
 difference+0.009
 yeah+0.009
 tastes+0.009
Neg logit contributions
aked-0.014
 dressed-0.014
ips-0.013
aled-0.013
 ped-0.013
Token "
Token ID366
Feature activation-1.74e-02

Pos logit contributions
 difference+0.017
 salad+0.017
 we+0.017
 tastes+0.016
 yeah+0.016
Neg logit contributions
aked-0.028
 dressed-0.028
ips-0.027
aled-0.027
 ped-0.027
TokenCan
Token ID6090
Feature activation-5.98e-02

Pos logit contributions
 salad+0.023
 difference+0.022
 we+0.021
 tastes+0.021
 mushroom+0.020
Neg logit contributions
aked-0.035
 dressed-0.034
ips-0.033
aled-0.033
 ped-0.033
Token we
Token ID356
Feature activation-3.25e-02

Pos logit contributions
 difference+0.079
 feast+0.073
 slice+0.070
ccoli+0.067
 salad+0.066
Neg logit contributions
 ped-0.114
 dressed-0.110
 Fay-0.102
 F-0.099
aled-0.099
Token play
Token ID711
Feature activation-6.93e-03

Pos logit contributions
 difference+0.042
 salad+0.042
ccoli+0.041
 tastes+0.041
 moment+0.040
Neg logit contributions
 dressed-0.062
 ped-0.058
aled-0.055
ips-0.055
ga-0.055
Token together
Token ID1978
Feature activation-4.78e-02

Pos logit contributions
 difference+0.011
 we+0.011
 salad+0.011
 sir+0.010
 tastes+0.010
Neg logit contributions
 dressed-0.011
ips-0.011
aked-0.011
aled-0.011
 ped-0.010
Token?"
Token ID1701
Feature activation-4.02e-03

Pos logit contributions
 difference+0.044
 salad+0.043
 tastes+0.041
 we+0.041
 yeah+0.040
Neg logit contributions
 dressed-0.108
ips-0.106
aked-0.102
 ped-0.099
ang-0.099
Token
Token ID198
Feature activation+1.42e-02

Pos logit contributions
 difference+0.006
 tastes+0.006
 salad+0.006
 slice+0.005
fly+0.005
Neg logit contributions
 dressed-0.007
aked-0.007
 ped-0.007
aled-0.007
ips-0.007
Token
Token ID198
Feature activation+1.99e-02

Pos logit contributions
aked+0.029
 dressed+0.028
ips+0.028
aled+0.028
 ped+0.027
Neg logit contributions
 salad-0.017
 we-0.017
 difference-0.016
 tastes-0.016
 yeah-0.015
TokenBen
Token ID11696
Feature activation+3.80e-02

Pos logit contributions
aked+0.036
 dressed+0.036
aled+0.035
ips+0.035
 ped+0.034
Neg logit contributions
 we-0.028
 salad-0.028
 difference-0.027
 tastes-0.026
 yeah-0.026
Token and
Token ID290
Feature activation-1.74e-02

Pos logit contributions
aked+0.082
 dressed+0.081
aled+0.080
ips+0.080
 ped+0.078
Neg logit contributions
 we-0.042
 salad-0.040
 difference-0.039
 yeah-0.036
 sir-0.036
Token
Token ID26391
Feature activation+2.42e-02

Pos logit contributions
 ramps+0.116
years+0.115
ips+0.112
 ped+0.111
aked+0.110
Neg logit contributions
 forgive-0.061
 difference-0.060
ue-0.060
 we-0.059
 sir-0.056
Token
Token ID8151
Feature activation+2.70e-02

Pos logit contributions
ips+0.044
 dressed+0.044
aked+0.043
 ped+0.043
ang+0.041
Neg logit contributions
 salad-0.035
 difference-0.033
 mushroom-0.032
 forgive-0.031
 tastes-0.031
Tokenll
Token ID297
Feature activation+1.48e-02

Pos logit contributions
aked+0.048
 dressed+0.048
ips+0.048
aled+0.046
 ped+0.046
Neg logit contributions
 we-0.039
 salad-0.039
 difference-0.038
 tastes-0.036
 yeah-0.035
Token see
Token ID766
Feature activation-1.67e-02

Pos logit contributions
aked+0.033
 dressed+0.032
aled+0.032
ips+0.032
 ped+0.031
Neg logit contributions
 we-0.015
 salad-0.015
 difference-0.014
 sir-0.013
 yeah-0.013
Token a
Token ID257
Feature activation-3.93e-02

Pos logit contributions
 we+0.018
 salad+0.017
 difference+0.017
 yeah+0.016
 sir+0.016
Neg logit contributions
 dressed-0.036
aked-0.036
ips-0.035
aled-0.035
 ped-0.034
Token purple
Token ID14032
Feature activation-5.15e-02

Pos logit contributions
 we+0.053
 yeah+0.051
 salad+0.050
ue+0.049
 tastes+0.049
Neg logit contributions
 dressed-0.068
 ped-0.064
ga-0.062
ang-0.062
years-0.061
Token sunset
Token ID26428
Feature activation-2.46e-02

Pos logit contributions
 we+0.056
 difference+0.055
ccoli+0.055
 yeah+0.054
 sir+0.054
Neg logit contributions
 dressed-0.102
 ped-0.095
ga-0.095
ips-0.093
ang-0.091
Token!"
Token ID2474
Feature activation+2.44e-02

Pos logit contributions
 difference+0.030
 salad+0.030
 we+0.030
 forgive+0.029
 yeah+0.029
Neg logit contributions
 dressed-0.049
aked-0.047
ips-0.047
aled-0.046
 ped-0.045
Token
Token ID198
Feature activation+3.33e-02

Pos logit contributions
aked+0.059
aled+0.059
years+0.058
ips+0.057
 ramps+0.056
Neg logit contributions
 we-0.023
 yeah-0.018
 forgive-0.017
 difference-0.017
 salad-0.017
Token
Token ID198
Feature activation+3.79e-02

Pos logit contributions
aled+0.086
aked+0.085
ips+0.084
years+0.084
 dressed+0.082
Neg logit contributions
 we-0.024
 forgive-0.023
 difference-0.020
ue-0.018
 salad-0.018
TokenThe
Token ID464
Feature activation-1.80e-02

Pos logit contributions
aled+0.086
ips+0.082
aked+0.081
years+0.079
pered+0.077
Neg logit contributions
 we-0.043
 forgive-0.042
 yeah-0.037
ue-0.036
 would-0.035