TOP ACTIVATIONS
MAX = 0.203

, they put the cup cakes in the oven to bake
After they put the cup cakes in the oven , Lily
After they baked the cup cakes , they shared them with
. They put the cup cakes on a plate and brought
because it made the cup cakes fluffy . As
After they put the cup cakes in the oven , Lily
hard so that the cup cakes would be extra fluffy .
They put the cakes in the oven and waited
help eat the honey - cakes and the worker bee said
happy to see the sun fl owers growing tall and strong
falls . He knocks the cake off the table . The
it was the best cup cakes she ever had .
happy . She showed the pie a spark ly necklace ,
, Tim my saw the ge ese and thought they looked
When the cup cakes were done , Mill ie
happy with all the sun fl owers and he ordered his
home . She put the pie on a plate and cut
show his dad the cup cakes . It was a noisy
one of the delicious cup cakes . Y um ! <|endoftext|>
woke up , the cup cakes were ready . They were

MIN ACTIVATIONS
MIN = -0.182

Hello , Bun - B un ! You have pretty gray
. " B un - Cloud . I like
my said , " B un , it 's time to
generous rabbit named Bun B un . Every day , Bun
said , " Hello , Bun - B un ! You
. She said " B un , it 's time to
" We are in space !" Lily says . She
you here , young ' un ?" Bob
little rabbit named Bun B un . As they
happily . " Hello , Bun - Cloud . We are
to play in a fair manner . In the end ,
and saw it said ' un lock '. She
bus with the word " un iversity " on it .
me , Bun - B un ! I have a surprise
, " Mom my , Bun Bun is broken . He
wild cat that lives in far away places . It can
and don 't go too far ," she said . "
. " Go away , Lily ," he says
. " Go away , you filthy dog !"
. " Go away ! Leave my kids alone

INTERVAL 0.107 - 0.203
CONTAINS 0.238%

up there ?" Asked the gorilla . " I
made a face . The apple was not y ummy .
moved and the apple fell right down into the little girl
who loved to make delicious cakes . He always shared his
were happy . M oral of the story :

INTERVAL 0.010 - 0.107
CONTAINS 37.593%

to see what was inside . " Maybe it
They kept playing until the sun started to
One of the friends wanted the big peach , but
planets around her . But then she saw a bright ,
One day , Jan decided to tell Joy the truth .

INTERVAL -0.086 - 0.010
CONTAINS 61.541%

girl ," her dad said . " Always remember to keep
my . Tim my loved to run and play outside .
And they all lived happily ever after . <|endoftext|> One day
the other and he had so much fun .
so proud of what she had done and hugged her daddy

INTERVAL -0.182 - -0.086
CONTAINS 0.628%

Mia say , " OK , Mom . We will sort
. He was three years old and wanted to spend time
It 's okay , sweet ie . You can build another
It 's okay , Ben . I fixed your finger .
. " OK , but be careful . The
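
NOTE ON THE INTERVALS
The four INTERVAL blocks above appear to split the range from MIN (-0.182) to MAX (0.203) into four equal-width bins, and the CONTAINS percentages sum to 100%. The sketch below only checks that reading against the printed numbers. The helper name equal_width_edges and the equal-width binning itself are assumptions made for illustration; the dump does not say how the tool chose its bins, and the printed edges are rounded to three decimals.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 edges that split [lo, hi] into equal-width intervals."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# MIN and MAX as printed in the dump (already rounded to 3 decimals).
edges = equal_width_edges(-0.182, 0.203, 4)

# CONTAINS percentages as printed, ordered from the lowest bin to the highest.
shares = [0.628, 61.541, 37.593, 0.238]

for (start, end), share in zip(zip(edges, edges[1:]), shares):
    print(f"{start:+.4f} to {end:+.4f}: {share}% of tokens")
```

Under this reading, almost all tokens (about 99%) fall in the two middle bins, so the TOP and MIN example lists describe only the thin tails of the activation distribution.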
Token,
Token ID11
Feature activation+1.90e-02

Pos logit contributions
comb+0.246
Jud+0.219
cr+0.216
cious+0.197
gan+0.196
Neg logit contributions
les-0.121
agging-0.112
?"-0.108
ors-0.107
 me-0.105
Token they
Token ID484
Feature activation+3.98e-02

Pos logit contributions
comb+0.048
Jud+0.044
cr+0.043
cious+0.040
 coconut+0.039
Neg logit contributions
 me-0.019
?"-0.019
les-0.019
 Teddy-0.016
 trumpet-0.015
Token put
Token ID1234
Feature activation+2.47e-03

Pos logit contributions
comb+0.096
Jud+0.084
cr+0.082
cious+0.081
emon+0.076
Neg logit contributions
les-0.049
?"-0.047
 me-0.046
 everywhere-0.037
 wears-0.037
Token the
Token ID262
Feature activation+4.49e-02

Pos logit contributions
comb+0.006
Jud+0.005
cr+0.005
cious+0.005
 coconut+0.005
Neg logit contributions
les-0.003
?"-0.003
 me-0.003
agging-0.002
ors-0.002
Token cup
Token ID6508
Feature activation+1.24e-01

Pos logit contributions
comb+0.090
Jud+0.079
cr+0.075
cious+0.069
gan+0.066
Neg logit contributions
les-0.071
?"-0.066
 me-0.066
agging-0.064
ors-0.061
Tokencakes
Token ID37263
Feature activation+2.03e-01

Pos logit contributions
comb+0.259
Jud+0.246
cious+0.224
cr+0.209
Jess+0.206
Neg logit contributions
les-0.191
?"-0.169
 me-0.166
iren-0.160
ler-0.150
Token in
Token ID287
Feature activation-2.32e-03

Pos logit contributions
comb+0.552
Jud+0.530
cious+0.492
cr+0.486
 coconut+0.468
Neg logit contributions
les-0.167
?"-0.141
 me-0.131
agging-0.124
ler-0.108
Token the
Token ID262
Feature activation-2.65e-02

Pos logit contributions
?"+0.003
 me+0.003
les+0.002
 channel+0.002
?+0.002
Neg logit contributions
comb-0.006
Jud-0.005
cious-0.005
cr-0.005
gan-0.004
Token oven
Token ID14361
Feature activation+7.55e-02

Pos logit contributions
les+0.037
 me+0.035
?"+0.035
agging+0.032
iren+0.031
Neg logit contributions
comb-0.059
Jud-0.052
cr-0.050
cious-0.046
gan-0.044
Token to
Token ID284
Feature activation+2.03e-02

Pos logit contributions
comb+0.200
Jud+0.185
cr+0.178
cious+0.169
 coconut+0.166
Neg logit contributions
les-0.065
?"-0.058
 me-0.056
ors-0.052
agging-0.052
Token bake
Token ID28450
Feature activation+9.01e-02

Pos logit contributions
comb+0.041
Jud+0.037
cr+0.036
cious+0.034
 coconut+0.031
Neg logit contributions
?"-0.033
 me-0.032
les-0.028
?-0.026
 my-0.025
TokenAfter
Token ID3260
Feature activation+7.05e-02

Pos logit contributions
comb+0.036
cr+0.030
Jud+0.030
 coconut+0.028
 Maggie+0.027
Neg logit contributions
les-0.027
 me-0.027
?"-0.025
agging-0.024
ors-0.024
Token they
Token ID484
Feature activation+2.76e-02

Pos logit contributions
comb+0.166
Jud+0.147
cr+0.143
cious+0.133
gan+0.131
Neg logit contributions
?"-0.084
 me-0.081
,-0.077
 shooting-0.068
 that-0.068
Token put
Token ID1234
Feature activation-3.45e-03

Pos logit contributions
comb+0.067
Jud+0.061
cious+0.057
cr+0.057
stack+0.053
Neg logit contributions
?"-0.030
les-0.029
 me-0.028
ines-0.024
 messed-0.024
Token the
Token ID262
Feature activation+4.86e-02

Pos logit contributions
les+0.005
agging+0.004
?"+0.004
ors+0.004
 me+0.004
Neg logit contributions
comb-0.007
Jud-0.007
cr-0.006
cious-0.006
gan-0.006
Token cup
Token ID6508
Feature activation+1.23e-01

Pos logit contributions
comb+0.089
Jud+0.077
cr+0.074
cious+0.067
gan+0.064
Neg logit contributions
les-0.085
?"-0.079
 me-0.078
agging-0.076
ors-0.075
Tokencakes
Token ID37263
Feature activation+2.02e-01

Pos logit contributions
comb+0.214
Jud+0.193
cious+0.172
cr+0.169
 coconut+0.155
Neg logit contributions
les-0.228
?"-0.213
 me-0.212
iren-0.199
ors-0.198
Token in
Token ID287
Feature activation-5.58e-03

Pos logit contributions
comb+0.524
Jud+0.488
cr+0.460
cious+0.451
 coconut+0.430
Neg logit contributions
les-0.189
?"-0.170
 me-0.165
agging-0.150
ors-0.142
Token the
Token ID262
Feature activation-2.85e-02

Pos logit contributions
?"+0.006
 me+0.006
les+0.006
 channel+0.005
?+0.005
Neg logit contributions
comb-0.014
Jud-0.012
cr-0.012
cious-0.011
gan-0.011
Token oven
Token ID14361
Feature activation+7.36e-02

Pos logit contributions
les+0.039
 me+0.038
?"+0.037
agging+0.034
 wrong+0.033
Neg logit contributions
comb-0.063
Jud-0.056
cr-0.055
cious-0.050
gan-0.047
Token,
Token ID11
Feature activation+2.79e-03

Pos logit contributions
comb+0.208
Jud+0.188
cr+0.186
cious+0.174
gan+0.171
Neg logit contributions
les-0.052
agging-0.043
?"-0.042
 me-0.040
ors-0.040
Token Lily
Token ID20037
Feature activation+2.83e-02

Pos logit contributions
comb+0.008
Jud+0.007
cr+0.007
cious+0.006
 coconut+0.006
Neg logit contributions
?"-0.002
 me-0.002
les-0.002
 Teddy-0.002
 everywhere-0.002
TokenAfter
Token ID3260
Feature activation+7.33e-02

Pos logit contributions
comb+0.041
Jud+0.035
cr+0.034
 coconut+0.031
cious+0.031
Neg logit contributions
les-0.033
 me-0.033
?"-0.032
agging-0.031
ors-0.029
Token they
Token ID484
Feature activation+3.15e-02

Pos logit contributions
comb+0.180
Jud+0.159
cr+0.156
ocado+0.147
gan+0.145
Neg logit contributions
 me-0.074
,-0.073
?"-0.072
 shooting-0.070
 that-0.067
Token baked
Token ID22979
Feature activation+1.11e-01

Pos logit contributions
comb+0.077
Jud+0.070
cious+0.067
cr+0.067
gan+0.061
Neg logit contributions
les-0.033
?"-0.031
 me-0.030
ines-0.026
 messed-0.026
Token the
Token ID262
Feature activation+4.45e-02

Pos logit contributions
comb+0.298
Jud+0.274
cr+0.263
cious+0.249
 coconut+0.238
Neg logit contributions
les-0.102
?"-0.098
 me-0.097
agging-0.080
ors-0.076
Token cup
Token ID6508
Feature activation+1.20e-01

Pos logit contributions
comb+0.092
Jud+0.081
cr+0.079
cious+0.072
gan+0.070
Neg logit contributions
les-0.065
?"-0.060
agging-0.059
 me-0.059
ors-0.058
Tokencakes
Token ID37263
Feature activation+1.99e-01

Pos logit contributions
comb+0.204
Jud+0.186
cr+0.167
cious+0.165
 coconut+0.151
Neg logit contributions
les-0.222
?"-0.209
 me-0.208
ors-0.197
iren-0.195
Token,
Token ID11
Feature activation+1.96e-02

Pos logit contributions
comb+0.592
Jud+0.560
cr+0.536
cious+0.522
 coconut+0.504
Neg logit contributions
les-0.109
?"-0.093
 me-0.087
agging-0.068
ors-0.063
Token they
Token ID484
Feature activation+3.80e-02

Pos logit contributions
comb+0.052
Jud+0.047
cr+0.045
cious+0.044
 coconut+0.042
Neg logit contributions
 me-0.018
?"-0.018
les-0.018
 Teddy-0.016
 everywhere-0.015
Token shared
Token ID4888
Feature activation+4.62e-02

Pos logit contributions
comb+0.085
Jud+0.076
cr+0.072
cious+0.071
emon+0.068
Neg logit contributions
les-0.052
?"-0.050
 me-0.048
 wears-0.042
 everywhere-0.040
Token them
Token ID606
Feature activation+2.28e-02

Pos logit contributions
comb+0.130
Jud+0.114
cr+0.108
stack+0.102
cious+0.102
Neg logit contributions
les-0.044
?"-0.043
 me-0.041
 everywhere-0.030
 wears-0.029
Token with
Token ID351
Feature activation+1.37e-02

Pos logit contributions
comb+0.063
Jud+0.058
stack+0.053
 brigade+0.052
cr+0.051
Neg logit contributions
les-0.021
?"-0.018
 me-0.017
agging-0.016
ors-0.015
Token.
Token ID13
Feature activation+9.58e-03

Pos logit contributions
comb+0.098
Jud+0.088
cr+0.086
cious+0.081
gan+0.079
Neg logit contributions
les-0.035
?"-0.031
agging-0.029
 me-0.029
ors-0.029
Token They
Token ID1119
Feature activation+1.39e-02

Pos logit contributions
comb+0.021
Jud+0.018
cr+0.018
cious+0.017
 coconut+0.016
Neg logit contributions
les-0.013
?"-0.012
 me-0.012
ors-0.011
agging-0.011
Token put
Token ID1234
Feature activation-6.23e-03

Pos logit contributions
comb+0.031
Jud+0.028
cious+0.026
cr+0.026
emon+0.024
Neg logit contributions
les-0.019
?"-0.018
 me-0.016
 wears-0.016
ler-0.015
Token the
Token ID262
Feature activation+3.49e-02

Pos logit contributions
les+0.008
 me+0.007
?"+0.007
ors+0.007
agging+0.007
Neg logit contributions
comb-0.014
Jud-0.013
cr-0.012
cious-0.012
 coconut-0.011
Token cup
Token ID6508
Feature activation+1.17e-01

Pos logit contributions
comb+0.072
Jud+0.064
cr+0.060
cious+0.055
gan+0.053
Neg logit contributions
les-0.052
?"-0.049
 me-0.049
agging-0.046
ler-0.044
Tokencakes
Token ID37263
Feature activation+1.98e-01

Pos logit contributions
comb+0.279
Jud+0.266
cious+0.239
cr+0.236
 coconut+0.227
Neg logit contributions
les-0.137
?"-0.119
 me-0.119
iren-0.105
ors-0.105
Token on
Token ID319
Feature activation+2.40e-02

Pos logit contributions
comb+0.522
Jud+0.499
cr+0.456
cious+0.453
 coconut+0.435
Neg logit contributions
les-0.178
?"-0.156
 me-0.149
agging-0.138
ler-0.123
Token a
Token ID257
Feature activation-2.18e-02

Pos logit contributions
comb+0.058
Jud+0.052
cr+0.050
cious+0.048
emon+0.046
Neg logit contributions
 me-0.028
les-0.027
?"-0.027
ler-0.022
ors-0.022
Token plate
Token ID7480
Feature activation+3.54e-02

Pos logit contributions
les+0.033
?"+0.030
agging+0.030
 me+0.030
ors+0.029
Neg logit contributions
comb-0.044
Jud-0.038
cr-0.037
cious-0.034
gan-0.033
Token and
Token ID290
Feature activation-1.19e-02

Pos logit contributions
comb+0.091
Jud+0.084
cr+0.081
cious+0.077
 coconut+0.075
Neg logit contributions
les-0.033
?"-0.029
 me-0.028
agging-0.027
ors-0.026
Token brought
Token ID3181
Feature activation-1.32e-03

Pos logit contributions
?"+0.014
les+0.014
 me+0.013
 wears+0.012
 bl+0.011
Neg logit contributions
comb-0.028
Jud-0.026
cr-0.024
cious-0.023
Jess-0.023
Token because
Token ID780
Feature activation+1.87e-02

Pos logit contributions
comb+0.218
Jud+0.198
cr+0.192
cious+0.179
 coconut+0.176
Neg logit contributions
les-0.084
?"-0.076
 me-0.073
ors-0.073
agging-0.071
Token it
Token ID340
Feature activation+1.89e-03

Pos logit contributions
comb+0.055
Jud+0.050
cr+0.049
cious+0.047
gan+0.044
Neg logit contributions
?"-0.014
 me-0.012
les-0.011
ors-0.010
 my-0.009
Token made
Token ID925
Feature activation+2.05e-02

Pos logit contributions
comb+0.004
Jud+0.004
cr+0.003
cious+0.003
gan+0.003
Neg logit contributions
les-0.003
?"-0.002
agging-0.002
 me-0.002
ors-0.002
Token the
Token ID262
Feature activation+3.63e-02

Pos logit contributions
comb+0.045
Jud+0.040
cr+0.039
cious+0.036
 coconut+0.035
Neg logit contributions
les-0.028
 me-0.027
?"-0.026
ors-0.024
agging-0.024
Token cup
Token ID6508
Feature activation+1.15e-01

Pos logit contributions
comb+0.074
Jud+0.066
cr+0.063
cious+0.058
gan+0.056
Neg logit contributions
les-0.055
?"-0.052
 me-0.051
agging-0.050
ors-0.048
Tokencakes
Token ID37263
Feature activation+1.97e-01

Pos logit contributions
comb+0.224
Jud+0.204
cr+0.188
cious+0.179
 coconut+0.172
Neg logit contributions
les-0.186
?"-0.174
 me-0.168
ors-0.162
agging-0.159
Token fluffy
Token ID39145
Feature activation+4.80e-03

Pos logit contributions
comb+0.444
Jud+0.400
cr+0.386
cious+0.359
 coconut+0.349
Neg logit contributions
les-0.248
?"-0.229
 me-0.222
agging-0.221
ors-0.214
Token.
Token ID13
Feature activation+4.17e-02

Pos logit contributions
comb+0.012
Jud+0.011
cr+0.011
cious+0.010
gan+0.010
Neg logit contributions
les-0.005
?"-0.004
agging-0.004
 me-0.004
ors-0.004
Token
Token ID198
Feature activation+2.24e-02

Pos logit contributions
comb+0.089
Jud+0.078
cr+0.077
cious+0.070
 coconut+0.068
Neg logit contributions
les-0.058
?"-0.054
 me-0.053
agging-0.052
ors-0.052
Token
Token ID198
Feature activation+1.52e-02

Pos logit contributions
comb+0.036
Jud+0.033
cr+0.031
 Maggie+0.027
Jess+0.027
Neg logit contributions
les-0.041
agging-0.039
ening-0.038
ors-0.038
ler-0.037
TokenAs
Token ID1722
Feature activation+1.01e-02

Pos logit contributions
comb+0.029
cr+0.025
Jud+0.024
 coconut+0.022
cious+0.021
Neg logit contributions
les-0.025
 me-0.025
?"-0.024
agging-0.023
ors-0.023
TokenAfter
Token ID3260
Feature activation+5.50e-02

Pos logit contributions
 me+0.008
?"+0.006
iren+0.006
ock+0.006
agging+0.006
Neg logit contributions
comb-0.012
cr-0.009
 coconut-0.008
mons-0.008
cious-0.008
Token they
Token ID484
Feature activation+1.70e-02

Pos logit contributions
comb+0.124
Jud+0.111
cr+0.107
cious+0.098
gan+0.098
Neg logit contributions
 me-0.068
?"-0.067
les-0.065
agging-0.061
iren-0.059
Token put
Token ID1234
Feature activation-1.53e-02

Pos logit contributions
comb+0.040
Jud+0.036
cr+0.035
cious+0.034
mons+0.031
Neg logit contributions
les-0.020
 me-0.019
?"-0.019
 wears-0.017
ock-0.017
Token the
Token ID262
Feature activation+4.36e-02

Pos logit contributions
les+0.021
agging+0.019
ors+0.019
?"+0.019
 me+0.018
Neg logit contributions
comb-0.032
Jud-0.028
cr-0.027
gan-0.025
 Maggie-0.025
Token cup
Token ID6508
Feature activation+1.12e-01

Pos logit contributions
comb+0.096
Jud+0.085
cr+0.082
cious+0.076
gan+0.073
Neg logit contributions
les-0.060
?"-0.055
 me-0.054
agging-0.054
ors-0.051
Tokencakes
Token ID37263
Feature activation+1.91e-01

Pos logit contributions
comb+0.263
Jud+0.243
cr+0.227
cious+0.221
 coconut+0.211
Neg logit contributions
les-0.135
?"-0.121
 me-0.121
iren-0.112
ors-0.112
Token in
Token ID287
Feature activation-1.51e-02

Pos logit contributions
comb+0.490
Jud+0.458
cr+0.431
cious+0.416
 coconut+0.405
Neg logit contributions
les-0.185
?"-0.164
 me-0.158
agging-0.153
ors-0.143
Token the
Token ID262
Feature activation-3.48e-02

Pos logit contributions
?"+0.017
 me+0.017
les+0.016
 channel+0.014
agging+0.014
Neg logit contributions
comb-0.036
Jud-0.032
cr-0.032
cious-0.030
emon-0.029
Token oven
Token ID14361
Feature activation+6.51e-02

Pos logit contributions
les+0.049
 me+0.045
?"+0.045
agging+0.044
ors+0.042
Neg logit contributions
comb-0.075
Jud-0.066
cr-0.065
cious-0.059
gan-0.058
Token,
Token ID11
Feature activation-2.85e-03

Pos logit contributions
comb+0.188
Jud+0.173
cr+0.169
cious+0.160
 coconut+0.157
Neg logit contributions
les-0.040
?"-0.034
 me-0.032
agging-0.030
ors-0.030
Token Lily
Token ID20037
Feature activation+2.14e-02

Pos logit contributions
 me+0.003
?"+0.003
 Teddy+0.003
les+0.003
 wears+0.003
Neg logit contributions
comb-0.007
Jud-0.006
cr-0.006
cious-0.006
emon-0.006
Token hard
Token ID1327
Feature activation+7.13e-03

Pos logit contributions
comb+0.034
Jud+0.031
cr+0.029
stack+0.027
cious+0.027
Neg logit contributions
les-0.019
 me-0.018
?"-0.017
ors-0.016
ler-0.016
Token so
Token ID523
Feature activation+2.35e-02

Pos logit contributions
comb+0.017
cr+0.015
Jud+0.015
cious+0.014
gan+0.013
Neg logit contributions
les-0.008
agging-0.007
?"-0.007
 me-0.007
ors-0.007
Token that
Token ID326
Feature activation+9.48e-03

Pos logit contributions
comb+0.050
Jud+0.047
cr+0.044
cious+0.040
emon+0.040
Neg logit contributions
 me-0.031
les-0.030
?"-0.029
ors-0.027
 wrong-0.026
Token the
Token ID262
Feature activation+3.20e-02

Pos logit contributions
comb+0.022
cr+0.019
Jud+0.019
cious+0.019
 coconut+0.018
Neg logit contributions
 me-0.011
?"-0.011
les-0.010
 wears-0.009
ors-0.009
Token cup
Token ID6508
Feature activation+1.08e-01

Pos logit contributions
comb+0.068
Jud+0.061
cr+0.058
cious+0.052
gan+0.050
Neg logit contributions
les-0.047
 me-0.047
?"-0.045
 problem-0.042
 channel-0.041
Tokencakes
Token ID37263
Feature activation+1.90e-01

Pos logit contributions
comb+0.185
Jud+0.184
cious+0.158
cr+0.151
 coconut+0.145
Neg logit contributions
les-0.190
?"-0.184
 me-0.181
ors-0.161
iren-0.157
Token would
Token ID561
Feature activation+2.33e-03

Pos logit contributions
comb+0.389
Jud+0.371
cr+0.334
cious+0.327
 coconut+0.312
Neg logit contributions
les-0.273
?"-0.260
 me-0.253
ler-0.228
ors-0.227
Token be
Token ID307
Feature activation+7.04e-03

Pos logit contributions
comb+0.005
Jud+0.004
cr+0.004
cious+0.004
gan+0.004
Neg logit contributions
?"-0.003
 me-0.003
les-0.003
agging-0.003
ors-0.003
Token extra
Token ID3131
Feature activation+5.39e-03

Pos logit contributions
comb+0.015
Jud+0.014
cr+0.013
cious+0.012
 coconut+0.012
Neg logit contributions
les-0.009
?"-0.009
 me-0.009
agging-0.008
ors-0.008
Token fluffy
Token ID39145
Feature activation-2.84e-02

Pos logit contributions
comb+0.011
Jud+0.010
cr+0.010
cious+0.009
stack+0.009
Neg logit contributions
les-0.008
?"-0.007
 me-0.007
agging-0.007
ler-0.007
Token.
Token ID13
Feature activation+3.39e-02

Pos logit contributions
les+0.025
?"+0.022
 me+0.021
agging+0.020
ors+0.020
Neg logit contributions
comb-0.075
Jud-0.069
cr-0.067
cious-0.063
 coconut-0.062
Token
Token ID198
Feature activation+2.32e-02

Pos logit contributions
comb+0.089
cr+0.075
cious+0.071
Jud+0.069
 coconut+0.066
Neg logit contributions
 me-0.034
?"-0.032
les-0.026
?-0.023
 wears-0.023
Token
Token ID198
Feature activation+1.80e-02

Pos logit contributions
comb+0.065
cr+0.055
cious+0.053
Jud+0.051
mons+0.050
Neg logit contributions
?"-0.021
 me-0.021
les-0.017
ines-0.016
iren-0.015
TokenThey
Token ID2990
Feature activation+5.83e-02

Pos logit contributions
comb+0.039
cious+0.030
cr+0.030
stack+0.030
 extingu+0.029
Neg logit contributions
 me-0.030
?"-0.025
ocked-0.022
h-0.021
ines-0.021
Token put
Token ID1234
Feature activation+3.39e-03

Pos logit contributions
comb+0.128
Jud+0.118
cious+0.107
cr+0.106
mons+0.104
Neg logit contributions
les-0.081
 me-0.072
?"-0.070
ler-0.064
 wears-0.064
Token the
Token ID262
Feature activation+5.19e-02

Pos logit contributions
comb+0.008
Jud+0.007
cr+0.007
cious+0.007
 coconut+0.006
Neg logit contributions
les-0.004
 me-0.004
?"-0.004
agging-0.003
ler-0.003
Token cakes
Token ID30849
Feature activation+1.88e-01

Pos logit contributions
comb+0.103
Jud+0.098
cr+0.089
cious+0.083
 Judy+0.078
Neg logit contributions
les-0.083
 me-0.080
?"-0.078
agging-0.073
 channel-0.071
Token in
Token ID287
Feature activation+7.04e-03

Pos logit contributions
Jud+0.513
comb+0.497
cious+0.462
emon+0.454
Sweet+0.444
Neg logit contributions
les-0.132
 everywhere-0.113
 me-0.111
,-0.110
 around-0.091
Token the
Token ID262
Feature activation-1.66e-02

Pos logit contributions
comb+0.018
cious+0.016
Jud+0.016
cr+0.015
emon+0.015
Neg logit contributions
 me-0.007
?"-0.007
,-0.006
 my-0.005
?-0.005
Token oven
Token ID14361
Feature activation+8.29e-02

Pos logit contributions
 me+0.021
les+0.021
?"+0.019
agging+0.018
 wrong+0.017
Neg logit contributions
comb-0.039
Jud-0.036
cr-0.034
cious-0.031
gan-0.030
Token and
Token ID290
Feature activation+1.93e-02

Pos logit contributions
comb+0.219
Jud+0.206
cr+0.196
cious+0.188
 coconut+0.184
Neg logit contributions
les-0.069
 me-0.061
?"-0.060
ors-0.053
agging-0.053
Token waited
Token ID13488
Feature activation+5.97e-02

Pos logit contributions
comb+0.045
Jud+0.043
cr+0.039
cious+0.039
emon+0.038
Neg logit contributions
 me-0.023
les-0.022
?"-0.020
 wears-0.019
 everywhere-0.017
Token help
Token ID1037
Feature activation-2.17e-02

Pos logit contributions
comb+0.098
Jud+0.085
cr+0.084
cious+0.083
gan+0.078
Neg logit contributions
 me-0.056
?"-0.052
les-0.048
 my-0.042
 trumpet-0.042
Token eat
Token ID4483
Feature activation+7.89e-02

Pos logit contributions
 me+0.024
les+0.024
?"+0.022
ors+0.020
ler+0.019
Neg logit contributions
comb-0.051
Jud-0.047
cr-0.045
cious-0.044
gan-0.043
Token the
Token ID262
Feature activation+2.11e-02

Pos logit contributions
comb+0.182
Jud+0.170
cious+0.154
cr+0.153
gan+0.146
Neg logit contributions
les-0.100
 me-0.097
?"-0.096
ler-0.076
ock-0.076
Token honey
Token ID12498
Feature activation+1.01e-01

Pos logit contributions
comb+0.044
Jud+0.044
cr+0.041
 themselves+0.037
cious+0.037
Neg logit contributions
 me-0.030
les-0.029
?"-0.026
 problem-0.025
 wrong-0.024
Token-
Token ID12
Feature activation+4.22e-02

Pos logit contributions
Jud+0.266
comb+0.244
 coconut+0.234
emon+0.232
cious+0.232
Neg logit contributions
les-0.072
?"-0.070
 me-0.068
,-0.061
ors-0.051
Tokencakes
Token ID37263
Feature activation+1.87e-01

Pos logit contributions
comb+0.084
Jud+0.071
stack+0.064
cr+0.064
cious+0.063
Neg logit contributions
les-0.073
 me-0.065
?"-0.062
ie-0.060
ock-0.058
Token and
Token ID290
Feature activation+2.73e-02

Pos logit contributions
Jud+0.514
comb+0.509
cious+0.471
cr+0.459
 coconut+0.450
Neg logit contributions
les-0.122
?"-0.122
 me-0.110
,-0.086
ors-0.081
Token the
Token ID262
Feature activation-4.49e-03

Pos logit contributions
comb+0.060
Jud+0.056
cr+0.052
cious+0.052
emon+0.049
Neg logit contributions
 me-0.036
?"-0.032
les-0.030
 wears-0.028
 my-0.027
Token worker
Token ID8383
Feature activation+5.70e-02

Pos logit contributions
 me+0.007
les+0.006
?"+0.006
 problem+0.006
 wrong+0.006
Neg logit contributions
comb-0.009
Jud-0.009
cr-0.008
cious-0.007
 themselves-0.007
Token bee
Token ID20697
Feature activation+3.47e-02

Pos logit contributions
comb+0.134
Jud+0.127
cr+0.116
cious+0.112
 coconut+0.112
Neg logit contributions
les-0.069
 me-0.061
?"-0.058
ler-0.054
lers-0.052
Token said
Token ID531
Feature activation-1.14e-02

Pos logit contributions
comb+0.073
Jud+0.071
 coconut+0.064
cr+0.064
gan+0.062
Neg logit contributions
les-0.046
 me-0.041
?"-0.039
lers-0.038
ler-0.036
Token happy
Token ID3772
Feature activation+6.35e-02

Pos logit contributions
comb+0.043
Jud+0.038
cr+0.037
cious+0.035
stack+0.035
Neg logit contributions
?"-0.021
 me-0.019
les-0.019
 lonely-0.018
agging-0.018
Token to
Token ID284
Feature activation+4.81e-02

Pos logit contributions
comb+0.146
cr+0.129
Jud+0.129
cious+0.117
 Maggie+0.116
Neg logit contributions
les-0.075
agging-0.070
?"-0.066
ors-0.065
iren-0.064
Token see
Token ID766
Feature activation+7.45e-04

Pos logit contributions
comb+0.102
Jud+0.091
cr+0.090
cious+0.083
gan+0.082
Neg logit contributions
?"-0.067
 me-0.063
les-0.062
agging-0.057
 wears-0.056
Token the
Token ID262
Feature activation+4.07e-02

Pos logit contributions
comb+0.002
Jud+0.001
cr+0.001
 coconut+0.001
 Maggie+0.001
Neg logit contributions
les-0.001
agging-0.001
ors-0.001
?"-0.001
 me-0.001
Token sun
Token ID4252
Feature activation+6.67e-02

Pos logit contributions
comb+0.082
Jud+0.071
cr+0.070
cious+0.066
gan+0.065
Neg logit contributions
les-0.063
 me-0.061
?"-0.058
agging-0.054
ening-0.054
Tokenfl
Token ID2704
Feature activation+1.85e-01

Pos logit contributions
comb+0.129
Jud+0.125
cious+0.110
emon+0.110
cr+0.110
Neg logit contributions
les-0.102
 me-0.091
ening-0.090
?"-0.087
agging-0.083
Tokenowers
Token ID3618
Feature activation+7.03e-02

Pos logit contributions
comb+0.317
Jud+0.285
cr+0.260
cious+0.251
gan+0.240
Neg logit contributions
les-0.343
agging-0.303
ors-0.302
?"-0.300
 me-0.294
Token growing
Token ID3957
Feature activation+7.98e-02

Pos logit contributions
comb+0.175
Jud+0.165
cr+0.153
gan+0.150
cious+0.148
Neg logit contributions
les-0.071
?"-0.060
 me-0.059
ening-0.056
agging-0.056
Token tall
Token ID7331
Feature activation+2.01e-02

Pos logit contributions
comb+0.196
Jud+0.177
cr+0.170
cious+0.161
gan+0.157
Neg logit contributions
les-0.086
?"-0.076
 me-0.074
agging-0.074
ors-0.072
Token and
Token ID290
Feature activation+4.93e-02

Pos logit contributions
comb+0.054
Jud+0.051
cr+0.047
cious+0.045
gan+0.045
Neg logit contributions
les-0.017
 me-0.014
?"-0.014
ors-0.013
agging-0.013
Token strong
Token ID1913
Feature activation-2.95e-02

Pos logit contributions
comb+0.116
Jud+0.102
cr+0.096
cious+0.094
emon+0.094
Neg logit contributions
 me-0.059
les-0.056
?"-0.056
 wears-0.050
ening-0.049
Token falls
Token ID8953
Feature activation+7.13e-04

Pos logit contributions
agging+0.017
les+0.016
ors+0.015
 channel+0.015
ening+0.014
Neg logit contributions
comb-0.024
cr-0.022
Jud-0.020
 coconut-0.019
gan-0.019
Token.
Token ID13
Feature activation-2.08e-02

Pos logit contributions
comb+0.001
cr+0.001
 themselves+0.001
Jud+0.001
gan+0.001
Neg logit contributions
brate-0.001
 concert-0.001
agging-0.001
les-0.001
 instrument-0.001
Token He
Token ID679
Feature activation-2.82e-02

Pos logit contributions
les+0.030
agging+0.028
ors+0.027
?"+0.026
ening+0.026
Neg logit contributions
comb-0.043
Jud-0.038
cr-0.038
gan-0.034
 Maggie-0.034
Token knocks
Token ID36539
Feature activation+2.98e-02

Pos logit contributions
agging+0.038
ock+0.037
ors+0.037
 concert+0.034
 instrument+0.034
Neg logit contributions
comb-0.055
cr-0.050
 coconut-0.047
Jud-0.044
gan-0.044
Token the
Token ID262
Feature activation+5.72e-02

Pos logit contributions
comb+0.060
cr+0.055
Jud+0.054
 Maggie+0.050
gan+0.050
Neg logit contributions
agging-0.039
les-0.038
ors-0.037
ler-0.034
ening-0.034
Token cake
Token ID12187
Feature activation+1.84e-01

Pos logit contributions
comb+0.114
Jud+0.105
cr+0.098
cious+0.091
 Maggie+0.085
Neg logit contributions
les-0.091
?"-0.088
 me-0.086
agging-0.081
 wrong-0.078
Token off
Token ID572
Feature activation+8.90e-03

Pos logit contributions
comb+0.456
Jud+0.416
cr+0.401
cious+0.382
emon+0.367
Neg logit contributions
les-0.194
?"-0.175
 me-0.171
agging-0.162
ors-0.160
Token the
Token ID262
Feature activation+2.20e-02

Pos logit contributions
comb+0.023
Jud+0.020
cr+0.020
cious+0.019
emon+0.018
Neg logit contributions
les-0.009
?"-0.008
 me-0.008
ors-0.007
ler-0.007
Token table
Token ID3084
Feature activation+5.37e-02

Pos logit contributions
comb+0.050
Jud+0.047
cr+0.044
cious+0.040
Jess+0.039
Neg logit contributions
les-0.027
?"-0.027
 me-0.027
agging-0.024
 channel-0.023
Token.
Token ID13
Feature activation-6.42e-03

Pos logit contributions
comb+0.143
Jud+0.130
cr+0.127
cious+0.120
gan+0.117
Neg logit contributions
les-0.046
?"-0.040
 me-0.038
agging-0.038
ors-0.037
Token The
Token ID383
Feature activation+1.67e-03

Pos logit contributions
les+0.010
agging+0.009
ors+0.009
?"+0.009
 me+0.009
Neg logit contributions
comb-0.013
Jud-0.011
cr-0.011
cious-0.010
gan-0.010
Token it
Token ID340
Feature activation+1.05e-02

Pos logit contributions
comb+0.006
Jud+0.006
cr+0.005
 coconut+0.005
cious+0.005
Neg logit contributions
les-0.002
?"-0.002
 me-0.002
agging-0.001
ors-0.001
Token was
Token ID373
Feature activation-4.73e-03

Pos logit contributions
comb+0.023
Jud+0.022
cr+0.020
 coconut+0.019
cious+0.019
Neg logit contributions
les-0.013
?"-0.013
 me-0.012
 wears-0.011
ors-0.011
Token the
Token ID262
Feature activation+1.67e-02

Pos logit contributions
les+0.007
agging+0.006
ors+0.006
?"+0.006
 me+0.006
Neg logit contributions
comb-0.010
Jud-0.009
cr-0.009
cious-0.008
gan-0.008
Token best
Token ID1266
Feature activation+7.81e-02

Pos logit contributions
comb+0.036
Jud+0.032
cr+0.031
cious+0.029
gan+0.028
Neg logit contributions
les-0.022
?"-0.021
 me-0.021
agging-0.020
ler-0.019
Token cup
Token ID6508
Feature activation+9.75e-02

Pos logit contributions
comb+0.173
Jud+0.157
cr+0.148
cious+0.138
gan+0.134
Neg logit contributions
les-0.103
 me-0.097
?"-0.097
agging-0.090
 wears-0.087
Tokencakes
Token ID37263
Feature activation+1.84e-01

Pos logit contributions
comb+0.229
Jud+0.226
cr+0.197
cious+0.197
Jess+0.193
Neg logit contributions
les-0.107
?"-0.100
 me-0.096
ler-0.088
ors-0.087
Token she
Token ID673
Feature activation-1.29e-03

Pos logit contributions
comb+0.473
Jud+0.469
cr+0.416
 coconut+0.408
cious+0.407
Neg logit contributions
les-0.159
?"-0.153
 me-0.143
ler-0.125
ors-0.121
Token ever
Token ID1683
Feature activation+4.91e-02

Pos logit contributions
les+0.002
?"+0.002
 me+0.002
 wears+0.002
agging+0.002
Neg logit contributions
comb-0.003
Jud-0.002
cr-0.002
 coconut-0.002
cious-0.002
Token had
Token ID550
Feature activation-5.06e-03

Pos logit contributions
comb+0.087
Jud+0.075
cr+0.074
cious+0.066
gan+0.063
Neg logit contributions
les-0.083
agging-0.080
ors-0.078
?"-0.077
 me-0.076
Token.
Token ID13
Feature activation+2.49e-02

Pos logit contributions
les+0.005
?"+0.005
 me+0.005
agging+0.004
ors+0.004
Neg logit contributions
comb-0.013
Jud-0.011
cr-0.011
cious-0.010
 coconut-0.010
Token
Token ID220
Feature activation-3.80e-02

Pos logit contributions
comb+0.059
cr+0.049
Jud+0.049
cious+0.046
 coconut+0.045
Neg logit contributions
 me-0.031
?"-0.031
les-0.029
ors-0.027
 wears-0.026
Token happy
Token ID3772
Feature activation+5.12e-02

Pos logit contributions
comb+0.088
Jud+0.080
cr+0.077
cious+0.072
gan+0.070
Neg logit contributions
les-0.046
?"-0.044
 me-0.043
agging-0.041
ors-0.039
Token.
Token ID13
Feature activation+3.57e-02

Pos logit contributions
comb+0.124
Jud+0.109
cr+0.109
cious+0.099
gan+0.098
Neg logit contributions
les-0.057
agging-0.051
ors-0.049
?"-0.048
iren-0.047
Token She
Token ID1375
Feature activation+2.78e-03

Pos logit contributions
comb+0.084
cr+0.074
Jud+0.073
cious+0.067
 coconut+0.066
Neg logit contributions
 me-0.042
les-0.041
?"-0.040
ors-0.038
 wears-0.036
Token showed
Token ID3751
Feature activation+3.83e-02

Pos logit contributions
comb+0.006
Jud+0.005
cr+0.005
cious+0.005
 coconut+0.005
Neg logit contributions
les-0.004
 me-0.004
?"-0.004
 wears-0.003
ors-0.003
Token the
Token ID262
Feature activation+3.68e-02

Pos logit contributions
comb+0.094
cr+0.081
Jud+0.081
cious+0.075
gan+0.073
Neg logit contributions
les-0.047
 me-0.044
?"-0.041
ors-0.039
agging-0.038
Token pie
Token ID2508
Feature activation+1.83e-01

Pos logit contributions
comb+0.100
cr+0.093
Jud+0.089
cious+0.087
ees+0.081
Neg logit contributions
 me-0.035
?"-0.034
 problem-0.032
 wrong-0.030
 mess-0.028
Token a
Token ID257
Feature activation+2.64e-02

Pos logit contributions
comb+0.481
Jud+0.459
cr+0.424
cious+0.415
Jess+0.407
Neg logit contributions
les-0.145
 me-0.143
?"-0.122
 everywhere-0.117
ors-0.111
Token spark
Token ID9009
Feature activation-3.89e-02

Pos logit contributions
comb+0.063
Jud+0.057
cr+0.055
cious+0.052
gan+0.051
Neg logit contributions
les-0.028
?"-0.027
 me-0.027
agging-0.025
 trumpet-0.023
Tokenly
Token ID306
Feature activation-1.46e-02

Pos logit contributions
agging+0.036
iren+0.035
?"+0.035
 me+0.035
ors+0.034
Neg logit contributions
comb-0.095
cr-0.083
emon-0.076
Jud-0.075
cious-0.075
Token necklace
Token ID38987
Feature activation+4.76e-02

Pos logit contributions
les+0.024
?"+0.023
 me+0.023
ening+0.021
agging+0.021
Neg logit contributions
comb-0.027
Jud-0.025
cr-0.022
cious-0.022
gan-0.021
Token,
Token ID11
Feature activation+7.71e-03

Pos logit contributions
comb+0.118
Jud+0.104
cr+0.103
cious+0.095
gan+0.093
Neg logit contributions
les-0.050
?"-0.045
agging-0.044
 me-0.043
ors-0.043
Token,
Token ID11
Feature activation+3.37e-03

Pos logit contributions
 me+0.001
les+0.001
?"+0.000
,+0.000
 wrong-0.000
Neg logit contributions
comb-0.016
Jud-0.015
cr-0.014
cious-0.014
 coconut-0.013
Token Tim
Token ID5045
Feature activation+3.44e-02

Pos logit contributions
comb+0.009
Jud+0.008
cious+0.008
emon+0.007
cr+0.007
Neg logit contributions
?"-0.003
 me-0.003
 Teddy-0.003
,-0.003
 everywhere-0.003
Tokenmy
Token ID1820
Feature activation-4.41e-03

Pos logit contributions
comb+0.075
Jud+0.069
cr+0.066
gan+0.062
cious+0.060
Neg logit contributions
les-0.041
agging-0.040
ors-0.038
iren-0.037
?"-0.037
Token saw
Token ID2497
Feature activation+1.84e-02

Pos logit contributions
les+0.006
?"+0.006
agging+0.006
 me+0.006
ors+0.006
Neg logit contributions
comb-0.009
Jud-0.008
cr-0.008
cious-0.007
gan-0.007
Token the
Token ID262
Feature activation+2.58e-02

Pos logit contributions
comb+0.045
Jud+0.038
cr+0.036
cious+0.036
gan+0.034
Neg logit contributions
 me-0.022
les-0.022
?"-0.022
 wrong-0.020
 trumpet-0.019
Token ge
Token ID4903
Feature activation+1.83e-01

Pos logit contributions
comb+0.056
Jud+0.048
cr+0.046
 Maggie+0.041
gan+0.041
Neg logit contributions
les-0.037
 me-0.036
?"-0.035
agging-0.033
 wrong-0.033
Tokenese
Token ID2771
Feature activation+1.53e-01

Pos logit contributions
comb+0.347
Jud+0.285
cr+0.271
 Maggie+0.261
cious+0.245
Neg logit contributions
agging-0.306
les-0.305
 me-0.287
ors-0.281
?"-0.274
Token and
Token ID290
Feature activation+5.40e-03

Pos logit contributions
comb+0.381
Jud+0.354
cr+0.326
cious+0.317
gan+0.315
Neg logit contributions
les-0.168
 me-0.145
agging-0.143
?"-0.141
ening-0.129
Token thought
Token ID1807
Feature activation+7.56e-03

Pos logit contributions
comb+0.013
Jud+0.011
cr+0.011
emon+0.010
cious+0.010
Neg logit contributions
 me-0.006
les-0.006
?"-0.006
 wears-0.005
 trumpet-0.005
Token they
Token ID484
Feature activation+1.97e-02

Pos logit contributions
comb+0.019
Jud+0.017
cr+0.016
cious+0.015
gan+0.015
Neg logit contributions
les-0.008
?"-0.008
 me-0.007
ors-0.007
agging-0.007
Token looked
Token ID3114
Feature activation-1.94e-02

Pos logit contributions
comb+0.037
Jud+0.033
cr+0.030
emon+0.028
cious+0.028
Neg logit contributions
les-0.032
?"-0.030
 me-0.030
ors-0.029
agging-0.028
Token\n
Token ID198
Feature activation+2.54e-02

Pos logit contributions
comb+0.110
cr+0.096
Jud+0.093
cious+0.088
 coconut+0.087
Neg logit contributions
?"-0.054
 me-0.054
les-0.053
ors-0.049
iren-0.045
Token\n
Token ID198
Feature activation+1.81e-02

Pos logit contributions
comb+0.052
Jud+0.045
cr+0.044
cious+0.040
gan+0.039
Neg logit contributions
les-0.038
agging-0.034
?"-0.034
 me-0.034
ors-0.034
TokenWhen
Token ID2215
Feature activation+4.51e-02

Pos logit contributions
comb+0.036
cr+0.031
gan+0.030
cious+0.028
Jud+0.028
Neg logit contributions
 me-0.028
?"-0.027
les-0.025
ors-0.024
iren-0.023
Token the
Token ID262
Feature activation+1.60e-02

Pos logit contributions
comb+0.099
cr+0.094
cious+0.087
emon+0.086
ocado+0.084
Neg logit contributions
 me-0.058
?"-0.055
les-0.050
,-0.049
 problem-0.048
Token cup
Token ID6508
Feature activation+9.79e-02

Pos logit contributions
comb+0.032
Jud+0.031
cr+0.030
cious+0.027
 coconut+0.026
Neg logit contributions
les-0.024
 me-0.024
?"-0.023
 problem-0.022
 channel-0.021
Tokencakes
Token ID37263
Feature activation+1.82e-01

Pos logit contributions
comb+0.172
Jud+0.166
cious+0.158
cr+0.157
 themselves+0.144
Neg logit contributions
les-0.174
 me-0.168
?"-0.159
iren-0.146
,-0.139
Token were
Token ID547
Feature activation+4.63e-02

Pos logit contributions
comb+0.385
Jud+0.368
cr+0.346
cious+0.337
gan+0.318
Neg logit contributions
les-0.266
 me-0.242
?"-0.238
ler-0.207
iren-0.203
Token done
Token ID1760
Feature activation+3.89e-02

Pos logit contributions
comb+0.107
Jud+0.098
cr+0.094
cious+0.090
 coconut+0.084
Neg logit contributions
les-0.054
?"-0.053
 me-0.052
agging-0.050
ors-0.044
Token,
Token ID11
Feature activation+1.46e-02

Pos logit contributions
comb+0.123
Jud+0.118
cr+0.114
cious+0.109
 coconut+0.108
Neg logit contributions
les-0.011
?"-0.010
 me-0.008
ors-0.005
agging-0.005
Token Mill
Token ID9212
Feature activation+8.37e-03

Pos logit contributions
comb+0.032
cr+0.029
emon+0.029
cious+0.028
ocado+0.027
Neg logit contributions
 me-0.019
?"-0.018
 everywhere-0.016
les-0.016
,-0.016
Tokenie
Token ID494
Feature activation-4.06e-02

Pos logit contributions
comb+0.026
Jud+0.024
cr+0.024
cious+0.023
 coconut+0.022
Neg logit contributions
les-0.003
?"-0.002
 me-0.002
agging-0.002
ors-0.002
Token happy
Token ID3772
Feature activation+7.06e-02

Pos logit contributions
comb+0.050
Jud+0.043
cr+0.042
mons+0.039
emon+0.039
Neg logit contributions
 me-0.027
?"-0.027
les-0.026
agging-0.024
 wears-0.023
Token with
Token ID351
Feature activation+3.45e-02

Pos logit contributions
comb+0.184
Jud+0.167
cr+0.162
cious+0.153
gan+0.149
Neg logit contributions
les-0.065
?"-0.059
 me-0.058
agging-0.054
ors-0.054
Token all
Token ID477
Feature activation+4.64e-03

Pos logit contributions
comb+0.089
Jud+0.077
cr+0.077
cious+0.074
gan+0.070
Neg logit contributions
les-0.034
 me-0.033
?"-0.031
agging-0.030
ening-0.028
Token the
Token ID262
Feature activation+1.85e-02

Pos logit contributions
comb+0.011
Jud+0.010
cr+0.010
cious+0.009
gan+0.009
Neg logit contributions
les-0.006
?"-0.005
 me-0.005
agging-0.005
ening-0.005
Token sun
Token ID4252
Feature activation+5.85e-02

Pos logit contributions
comb+0.039
Jud+0.034
cr+0.034
cious+0.031
gan+0.030
Neg logit contributions
les-0.027
 me-0.024
?"-0.024
agging-0.024
ors-0.023
Tokenfl
Token ID2704
Feature activation+1.82e-01

Pos logit contributions
comb+0.122
Jud+0.113
cr+0.104
cious+0.099
emon+0.099
Neg logit contributions
les-0.085
 me-0.074
?"-0.074
agging-0.072
ors-0.071
Tokenowers
Token ID3618
Feature activation+6.66e-02

Pos logit contributions
comb+0.306
Jud+0.266
cr+0.244
cious+0.228
 coconut+0.220
Neg logit contributions
les-0.350
agging-0.319
ors-0.313
?"-0.311
 me-0.307
Token and
Token ID290
Feature activation+3.92e-02

Pos logit contributions
comb+0.167
Jud+0.155
cr+0.145
gan+0.141
cious+0.139
Neg logit contributions
les-0.068
?"-0.058
 me-0.057
agging-0.054
ening-0.052
Token he
Token ID339
Feature activation-3.97e-02

Pos logit contributions
comb+0.097
Jud+0.084
cr+0.080
emon+0.079
cious+0.078
Neg logit contributions
 me-0.046
les-0.045
 wears-0.039
?"-0.039
 my-0.034
Token ordered
Token ID6149
Feature activation+5.13e-02

Pos logit contributions
les+0.059
 me+0.055
?"+0.054
 wears+0.051
ors+0.049
Neg logit contributions
comb-0.083
Jud-0.073
cr-0.070
cious-0.065
emon-0.064
Token his
Token ID465
Feature activation+2.92e-02

Pos logit contributions
comb+0.123
Jud+0.109
cious+0.100
cr+0.097
gan+0.097
Neg logit contributions
 me-0.064
les-0.062
?"-0.056
agging-0.050
ock-0.048
Token home
Token ID1363
Feature activation-2.65e-02

Pos logit contributions
comb+0.013
Jud+0.011
cr+0.011
cious+0.010
gan+0.010
Neg logit contributions
les-0.008
?"-0.007
agging-0.007
ors-0.007
 me-0.007
Token.
Token ID13
Feature activation+2.87e-02

Pos logit contributions
?"+0.014
les+0.010
 me+0.009
agging+0.009
ors+0.007
Neg logit contributions
comb-0.080
Jud-0.075
cr-0.072
emon-0.069
 coconut-0.069
Token She
Token ID1375
Feature activation+1.23e-03

Pos logit contributions
comb+0.065
Jud+0.057
cr+0.056
emon+0.050
cious+0.050
Neg logit contributions
 me-0.038
les-0.036
?"-0.036
 wears-0.033
ors-0.032
Token put
Token ID1234
Feature activation+2.77e-03

Pos logit contributions
comb+0.003
Jud+0.003
cr+0.003
cious+0.002
 coconut+0.002
Neg logit contributions
?"-0.001
les-0.001
 me-0.001
 wears-0.001
ors-0.001
Token the
Token ID262
Feature activation+4.65e-02

Pos logit contributions
comb+0.008
Jud+0.007
cr+0.006
cious+0.006
gan+0.006
Neg logit contributions
les-0.003
 me-0.003
?"-0.002
ler-0.002
lers-0.002
Token pie
Token ID2508
Feature activation+1.81e-01

Pos logit contributions
comb+0.086
Jud+0.077
cr+0.066
cious+0.066
Jess+0.063
Neg logit contributions
 me-0.086
?"-0.085
 wrong-0.082
 problem-0.080
les-0.079
Token on
Token ID319
Feature activation+3.18e-02

Pos logit contributions
comb+0.492
Jud+0.492
cious+0.440
emon+0.429
Jess+0.427
Neg logit contributions
les-0.138
 me-0.115
?"-0.115
 everywhere-0.102
ors-0.102
Token a
Token ID257
Feature activation-1.07e-02

Pos logit contributions
comb+0.083
Jud+0.074
cr+0.072
cious+0.070
emon+0.067
Neg logit contributions
 me-0.035
?"-0.031
les-0.030
 my-0.024
ors-0.024
Token plate
Token ID7480
Feature activation+4.84e-02

Pos logit contributions
les+0.017
?"+0.016
 me+0.016
agging+0.016
ors+0.016
Neg logit contributions
comb-0.020
Jud-0.018
cr-0.017
cious-0.015
gan-0.015
Token and
Token ID290
Feature activation+6.55e-03

Pos logit contributions
comb+0.121
Jud+0.109
cr+0.107
cious+0.099
gan+0.097
Neg logit contributions
les-0.049
?"-0.044
agging-0.043
 me-0.042
ors-0.041
Token cut
Token ID2005
Feature activation-1.04e-02

Pos logit contributions
comb+0.017
Jud+0.015
cious+0.014
cr+0.014
aney+0.013
Neg logit contributions
?"-0.007
 me-0.007
les-0.006
 Teddy-0.006
 bl-0.005
Token show
Token ID905
Feature activation-2.14e-02

Pos logit contributions
comb+0.058
Jud+0.050
cr+0.050
cious+0.047
mons+0.045
Neg logit contributions
 me-0.026
?"-0.025
les-0.025
 wears-0.021
 trumpet-0.021
Token his
Token ID465
Feature activation+2.58e-02

Pos logit contributions
les+0.024
?"+0.022
agging+0.022
 me+0.021
ors+0.021
Neg logit contributions
comb-0.051
Jud-0.046
cr-0.045
cious-0.042
gan-0.041
Token dad
Token ID9955
Feature activation+2.14e-02

Pos logit contributions
comb+0.054
Jud+0.047
cr+0.045
cious+0.041
emon+0.040
Neg logit contributions
 me-0.039
?"-0.038
les-0.038
 trumpet-0.037
 problem-0.035
Token the
Token ID262
Feature activation+2.32e-02

Pos logit contributions
comb+0.056
Jud+0.050
cr+0.047
cious+0.044
mons+0.043
Neg logit contributions
les-0.022
?"-0.020
 me-0.019
ors-0.018
 wears-0.017
Token cup
Token ID6508
Feature activation+9.49e-02

Pos logit contributions
comb+0.054
Jud+0.048
cr+0.046
gan+0.043
cious+0.043
Neg logit contributions
les-0.028
 me-0.027
?"-0.026
 trumpet-0.025
agging-0.025
Tokencakes
Token ID37263
Feature activation+1.81e-01

Pos logit contributions
comb+0.203
Jud+0.190
cr+0.169
cious+0.167
Jess+0.160
Neg logit contributions
les-0.132
?"-0.121
 me-0.120
iren-0.113
ors-0.112
Token.
Token ID13
Feature activation+2.15e-02

Pos logit contributions
comb+0.491
Jud+0.461
cr+0.429
cious+0.419
 coconut+0.409
Neg logit contributions
les-0.151
?"-0.132
 me-0.125
agging-0.119
ors-0.112
Token It
Token ID632
Feature activation+2.08e-02

Pos logit contributions
comb+0.051
Jud+0.044
cr+0.044
cious+0.040
emon+0.039
Neg logit contributions
les-0.025
?"-0.025
 me-0.025
ors-0.022
agging-0.022
Token was
Token ID373
Feature activation+1.03e-02

Pos logit contributions
comb+0.048
Jud+0.042
cr+0.040
cious+0.038
 coconut+0.038
Neg logit contributions
les-0.027
 me-0.026
?"-0.025
 wears-0.022
ler-0.022
Token a
Token ID257
Feature activation+5.92e-03

Pos logit contributions
comb+0.024
Jud+0.021
cr+0.020
cious+0.019
gan+0.018
Neg logit contributions
les-0.013
?"-0.012
 me-0.012
agging-0.011
ors-0.011
Token noisy
Token ID31210
Feature activation+2.38e-02

Pos logit contributions
comb+0.014
Jud+0.012
cr+0.012
cious+0.011
gan+0.011
Neg logit contributions
les-0.007
 me-0.007
?"-0.007
agging-0.006
 trumpet-0.006
Token one
Token ID530
Feature activation+1.96e-02

Pos logit contributions
comb+0.090
Jud+0.080
cr+0.076
cious+0.071
gan+0.071
Neg logit contributions
 me-0.036
?"-0.035
les-0.035
agging-0.032
 shooting-0.030
Token of
Token ID286
Feature activation+5.65e-02

Pos logit contributions
comb+0.047
Jud+0.044
cr+0.042
cious+0.040
gan+0.038
Neg logit contributions
les-0.022
?"-0.021
 me-0.020
agging-0.019
ler-0.018
Token the
Token ID262
Feature activation+1.21e-02

Pos logit contributions
comb+0.140
Jud+0.121
cr+0.119
cious+0.112
gan+0.108
Neg logit contributions
les-0.065
 me-0.063
?"-0.061
agging-0.058
ors-0.054
Token delicious
Token ID12625
Feature activation+1.31e-01

Pos logit contributions
comb+0.028
Jud+0.025
cr+0.024
cious+0.022
gan+0.021
Neg logit contributions
les-0.016
 me-0.015
?"-0.015
agging-0.014
 channel-0.013
Token cup
Token ID6508
Feature activation+9.05e-02

Pos logit contributions
comb+0.239
Jud+0.208
cr+0.200
cious+0.178
gan+0.175
Neg logit contributions
les-0.229
?"-0.211
 me-0.211
agging-0.205
ors-0.200
Tokencakes
Token ID37263
Feature activation+1.81e-01

Pos logit contributions
comb+0.226
Jud+0.219
cr+0.198
cious+0.194
 coconut+0.190
Neg logit contributions
les-0.093
?"-0.081
 me-0.079
ors-0.073
agging-0.072
Token.
Token ID13
Feature activation+3.53e-02

Pos logit contributions
comb+0.491
Jud+0.478
cr+0.439
cious+0.425
 coconut+0.423
Neg logit contributions
les-0.141
?"-0.120
 me-0.114
agging-0.106
ors-0.098
Token Y
Token ID575
Feature activation+7.86e-02

Pos logit contributions
comb+0.080
Jud+0.070
cr+0.068
cious+0.062
 coconut+0.061
Neg logit contributions
les-0.046
?"-0.043
 me-0.043
agging-0.041
ors-0.040
Tokenum
Token ID388
Feature activation+5.27e-02

Pos logit contributions
comb+0.185
Jud+0.164
cr+0.158
cious+0.148
 coconut+0.145
Neg logit contributions
les-0.100
 me-0.096
agging-0.092
?"-0.091
ors-0.089
Token!
Token ID0
Feature activation+3.17e-02

Pos logit contributions
comb+0.147
Jud+0.133
cr+0.131
cious+0.122
 coconut+0.120
Neg logit contributions
les-0.040
?"-0.035
 me-0.033
agging-0.032
ors-0.031
Token<|endoftext|>
Token ID50256
Feature activation-6.13e-02

Pos logit contributions
comb+0.077
Jud+0.067
cr+0.064
cious+0.061
 coconut+0.059
Neg logit contributions
?"-0.036
les-0.036
 me-0.035
 wears-0.032
agging-0.031
Token woke
Token ID19092
Feature activation+2.97e-02

Pos logit contributions
les+0.022
agging+0.020
ors+0.020
?"+0.020
 me+0.019
Neg logit contributions
comb-0.031
Jud-0.027
cr-0.027
cious-0.024
gan-0.024
Token up
Token ID510
Feature activation+2.17e-02

Pos logit contributions
comb+0.072
Jud+0.064
cr+0.063
cious+0.058
gan+0.057
Neg logit contributions
les-0.032
agging-0.029
?"-0.029
 me-0.028
ors-0.028
Token,
Token ID11
Feature activation+1.02e-02

Pos logit contributions
comb+0.059
Jud+0.053
cr+0.052
cious+0.049
gan+0.048
Neg logit contributions
les-0.018
agging-0.015
?"-0.015
ors-0.015
 me-0.014
Token the
Token ID262
Feature activation+9.91e-03

Pos logit contributions
comb+0.025
Jud+0.023
cr+0.023
cious+0.022
emon+0.022
Neg logit contributions
 me-0.011
?"-0.010
les-0.009
 Teddy-0.009
 wrong-0.008
Token cup
Token ID6508
Feature activation+1.02e-01

Pos logit contributions
comb+0.022
Jud+0.020
cr+0.020
cious+0.018
gan+0.017
Neg logit contributions
les-0.014
 me-0.013
?"-0.012
agging-0.012
 problem-0.011
Tokencakes
Token ID37263
Feature activation+1.80e-01

Pos logit contributions
comb+0.179
Jud+0.173
cr+0.160
cious+0.157
 coconut+0.145
Neg logit contributions
les-0.180
 me-0.170
?"-0.157
iren-0.150
ler-0.144
Token were
Token ID547
Feature activation+3.66e-02

Pos logit contributions
comb+0.337
Jud+0.317
cr+0.294
cious+0.279
gan+0.269
Neg logit contributions
les-0.301
 me-0.270
?"-0.267
agging-0.249
ler-0.244
Token ready
Token ID3492
Feature activation+5.40e-02

Pos logit contributions
comb+0.086
Jud+0.078
cr+0.076
cious+0.072
 coconut+0.068
Neg logit contributions
les-0.041
 me-0.040
?"-0.040
agging-0.037
ening-0.034
Token.
Token ID13
Feature activation+2.97e-02

Pos logit contributions
comb+0.138
Jud+0.126
cr+0.122
cious+0.115
 coconut+0.113
Neg logit contributions
les-0.051
?"-0.047
 me-0.046
agging-0.044
ors-0.042
Token They
Token ID1119
Feature activation+2.82e-02

Pos logit contributions
comb+0.077
Jud+0.070
cr+0.068
cious+0.063
 coconut+0.063
Neg logit contributions
les-0.030
 me-0.029
?"-0.026
ors-0.025
 Teddy-0.024
Token were
Token ID547
Feature activation+3.24e-02

Pos logit contributions
comb+0.063
Jud+0.060
cious+0.055
cr+0.055
mons+0.054
Neg logit contributions
les-0.036
 me-0.033
?"-0.032
 wears-0.030
ie-0.029
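
The per-token records in this dump can be read back into structured form. What follows is a minimal sketch, not the dashboard's own export code. It assumes each record is laid out as a "Token" line, a "Token ID" line, a "Feature activation" line, and then the "Pos logit contributions" and "Neg logit contributions" lists for that same token. That grouping is inferred from the dump itself, where repeated token runs with identical activations are followed by identical contribution lists. The names TokenRecord and parse_records are hypothetical.

```python
import re
from dataclasses import dataclass, field

# A label immediately followed by a signed number, e.g. "comb+0.100" or " me-0.035".
VALUE_RE = re.compile(r"^(.*?)([+-]\d+(?:\.\d+)?(?:e[+-]?\d+)?)$")
TOKEN_ID_RE = re.compile(r"^Token ID(\d+)$")
ACTIVATION_PREFIX = "Feature activation"


@dataclass
class TokenRecord:
    token: str          # token text, leading space preserved (" pie" vs "ly")
    token_id: int
    activation: float
    pos: list = field(default_factory=list)  # [(token_text, contribution), ...]
    neg: list = field(default_factory=list)


def parse_records(text):
    """Parse dashboard lines into TokenRecord objects.

    Assumption: contribution lists follow the token they describe.
    Contribution lines seen before the first complete token header
    (a chunk that starts mid-record) are dropped.
    """
    records = []
    pending_token = None
    pending_id = None
    bucket = None  # list currently receiving contribution lines
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith(ACTIVATION_PREFIX):
            if pending_token is not None and pending_id is not None:
                value = float(line[len(ACTIVATION_PREFIX):])
                records.append(TokenRecord(pending_token, pending_id, value))
            pending_token, pending_id, bucket = None, None, None
        elif line == "Pos logit contributions":
            bucket = records[-1].pos if records else None
        elif line == "Neg logit contributions":
            bucket = records[-1].neg if records else None
        elif TOKEN_ID_RE.match(line):
            pending_id = int(TOKEN_ID_RE.match(line).group(1))
        elif line.startswith("Token") and not VALUE_RE.match(line):
            pending_token = line[len("Token"):]
            bucket = None
        else:
            match = VALUE_RE.match(line)
            if match and bucket is not None:
                bucket.append((match.group(1), float(match.group(2))))
    return records
```

Lines are deliberately not stripped, because a leading space is part of the token text. A token whose own text ends in a signed number would be misread as a contribution line; this sketch does not handle that case.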
TokenHello
Token ID15496
Feature activation-4.39e-02

Pos logit contributions
les+0.005
agging+0.005
?"+0.005
ors+0.005
ines+0.005
Neg logit contributions
comb-0.004
Jud-0.004
cr-0.004
cious-0.003
Let-0.003
Token,
Token ID11
Feature activation-9.58e-02

Pos logit contributions
agging+0.065
ag+0.064
 channels+0.063
iren+0.063
 channel+0.062
Neg logit contributions
comb-0.076
cr-0.068
 Mole-0.066
 Ab-0.065
Jud-0.064
Token Bun
Token ID28515
Feature activation-1.65e-01

Pos logit contributions
les+0.156
agging+0.146
lers+0.144
ler+0.140
ag+0.137
Neg logit contributions
comb-0.159
cr-0.146
 Mole-0.140
 Ab-0.139
Jud-0.138
Token-
Token ID12
Feature activation-3.14e-02

Pos logit contributions
 channels+0.228
 channel+0.221
agging+0.220
 volume+0.207
iren+0.206
Neg logit contributions
comb-0.316
cr-0.281
Jud-0.273
cious-0.255
M-0.244
TokenB
Token ID33
Feature activation-1.05e-01

Pos logit contributions
agging+0.066
les+0.064
 channel+0.063
ening+0.062
 problem+0.062
Neg logit contributions
comb-0.036
Jud-0.034
cr-0.032
lock-0.026
gan-0.026
Tokenun
Token ID403
Feature activation-1.82e-01

Pos logit contributions
les+0.136
 channels+0.129
?"+0.127
 me+0.125
 channel+0.125
Neg logit contributions
comb-0.211
cr-0.196
Jud-0.178
ees-0.176
gan-0.176
Token!
Token ID0
Feature activation-4.93e-02

Pos logit contributions
 channels+0.211
agging+0.210
 volume+0.200
 channel+0.200
les+0.197
Neg logit contributions
comb-0.394
cr-0.353
Jud-0.335
cious-0.327
gan-0.308
Token You
Token ID921
Feature activation-8.07e-03

Pos logit contributions
les+0.075
agging+0.067
ler+0.064
ors+0.062
?"+0.062
Neg logit contributions
comb-0.101
cr-0.089
Jud-0.088
cious-0.081
gan-0.079
Token have
Token ID423
Feature activation-1.05e-02

Pos logit contributions
agging+0.017
les+0.017
iren+0.017
ors+0.016
ening+0.016
Neg logit contributions
comb-0.011
cr-0.010
Jud-0.009
cious-0.007
 coconut-0.007
Token pretty
Token ID2495
Feature activation-7.05e-02

Pos logit contributions
les+0.018
agging+0.017
ors+0.017
iren+0.016
ler+0.016
Neg logit contributions
comb-0.019
Jud-0.017
cr-0.017
cious-0.015
gan-0.014
Token gray
Token ID12768
Feature activation-6.15e-02

Pos logit contributions
iren+0.121
ler+0.115
 channel+0.113
ock+0.112
 concert+0.111
Neg logit contributions
comb-0.102
cr-0.100
cious-0.088
 themselves-0.085
 mothers-0.081
Token.
Token ID13
Feature activation-1.68e-02

Pos logit contributions
les+0.065
?"+0.057
agging+0.056
 me+0.056
ors+0.055
Neg logit contributions
comb-0.172
Jud-0.153
cr-0.152
cious-0.142
gan-0.138
Token\n
Token ID198
Feature activation-1.17e-02

Pos logit contributions
les+0.030
agging+0.028
ors+0.028
ening+0.027
ler+0.027
Neg logit contributions
comb-0.029
cr-0.025
Jud-0.024
gan-0.022
 Maggie-0.022
Token\n
Token ID198
Feature activation-1.53e-02

Pos logit contributions
les+0.022
agging+0.021
ening+0.021
ors+0.021
 channel+0.020
Neg logit contributions
comb-0.017
Jud-0.016
cr-0.015
 themselves-0.014
 Maggie-0.014
Token"
Token ID1
Feature activation+4.50e-03

Pos logit contributions
les+0.025
 me+0.024
?"+0.024
ler+0.022
agging+0.022
Neg logit contributions
comb-0.030
Jud-0.026
cr-0.024
cious-0.022
 coconut-0.022
TokenB
Token ID33
Feature activation-9.91e-02

Pos logit contributions
comb+0.009
Jud+0.007
cr+0.007
 coconut+0.007
 Maggie+0.007
Neg logit contributions
les-0.007
 me-0.007
 sax-0.006
ock-0.006
agging-0.006
Tokenun
Token ID403
Feature activation-1.76e-01

Pos logit contributions
les+0.174
?"+0.162
agging+0.161
 me+0.160
ors+0.159
Neg logit contributions
comb-0.176
Jud-0.150
cr-0.146
cious-0.130
gan-0.126
Token-
Token ID12
Feature activation-3.36e-02

Pos logit contributions
les+0.163
agging+0.160
?"+0.142
 me+0.142
ors+0.141
Neg logit contributions
comb-0.437
cr-0.393
Jud-0.385
cious-0.366
 Maggie-0.354
TokenCloud
Token ID18839
Feature activation-3.84e-02

Pos logit contributions
les+0.052
?"+0.049
agging+0.048
 me+0.048
ors+0.047
Neg logit contributions
comb-0.066
Jud-0.057
cr-0.055
cious-0.051
gan-0.049
Token.
Token ID13
Feature activation-6.61e-02

Pos logit contributions
agging+0.044
les+0.042
 me+0.042
ors+0.039
?"+0.039
Neg logit contributions
comb-0.084
cr-0.076
Jud-0.072
cious-0.069
gan-0.068
Token I
Token ID314
Feature activation-3.42e-02

Pos logit contributions
les+0.097
agging+0.089
?"+0.088
 me+0.086
ors+0.086
Neg logit contributions
comb-0.136
Jud-0.118
cr-0.117
cious-0.106
gan-0.104
Token like
Token ID588
Feature activation-3.13e-03

Pos logit contributions
les+0.055
?"+0.051
 me+0.050
agging+0.049
ors+0.049
Neg logit contributions
comb-0.066
Jud-0.057
cr-0.055
cious-0.050
gan-0.049
Tokenmy
Token ID1820
Feature activation-2.54e-02

Pos logit contributions
comb+0.106
Jud+0.095
cr+0.093
cious+0.084
gan+0.083
Neg logit contributions
les-0.063
agging-0.059
ors-0.056
?"-0.056
 me-0.055
Token said
Token ID531
Feature activation-4.28e-03

Pos logit contributions
les+0.033
?"+0.031
 me+0.029
ors+0.028
agging+0.027
Neg logit contributions
comb-0.058
Jud-0.050
cr-0.050
cious-0.046
gan-0.045
Token,
Token ID11
Feature activation-2.15e-02

Pos logit contributions
?"+0.004
les+0.003
 me+0.003
agging+0.003
ors+0.003
Neg logit contributions
comb-0.011
Jud-0.010
cr-0.010
gan-0.009
 coconut-0.009
Token "
Token ID366
Feature activation+5.89e-03

Pos logit contributions
les+0.035
?"+0.032
agging+0.032
 me+0.032
ors+0.031
Neg logit contributions
comb-0.041
Jud-0.036
cr-0.035
cious-0.031
gan-0.030
TokenB
Token ID33
Feature activation-9.06e-02

Pos logit contributions
comb+0.010
Jud+0.009
cr+0.008
cious+0.007
Jess+0.007
Neg logit contributions
les-0.010
?"-0.010
agging-0.010
ors-0.010
ines-0.010
Tokenun
Token ID403
Feature activation-1.74e-01

Pos logit contributions
les+0.114
?"+0.114
 me+0.108
 channels+0.106
 channel+0.105
Neg logit contributions
comb-0.188
cr-0.169
Jud-0.153
ees-0.149
gan-0.148
Token,
Token ID11
Feature activation-8.05e-02

Pos logit contributions
les+0.157
agging+0.140
?"+0.135
 me+0.133
ors+0.129
Neg logit contributions
comb-0.452
Jud-0.403
cr-0.401
cious-0.375
 coconut-0.364
Token it
Token ID340
Feature activation-4.79e-02

Pos logit contributions
les+0.122
agging+0.113
iren+0.108
ors+0.103
ines+0.103
Neg logit contributions
comb-0.166
cr-0.143
Jud-0.136
 coconut-0.126
 Maggie-0.125
Token's
Token ID338
Feature activation-5.98e-03

Pos logit contributions
les+0.069
?"+0.063
 me+0.063
ors+0.060
agging+0.059
Neg logit contributions
comb-0.101
Jud-0.089
cr-0.086
cious-0.079
 coconut-0.078
Token time
Token ID640
Feature activation-4.00e-02

Pos logit contributions
les+0.006
ors+0.005
agging+0.005
iren+0.004
ler+0.004
Neg logit contributions
comb-0.016
Jud-0.014
cr-0.014
gan-0.013
emon-0.013
Token to
Token ID284
Feature activation-7.87e-03

Pos logit contributions
les+0.067
ler+0.060
agging+0.060
ors+0.059
iren+0.057
Neg logit contributions
comb-0.074
cr-0.065
Jud-0.062
cious-0.057
 Maggie-0.056
Token generous
Token ID14431
Feature activation+1.24e-03

Pos logit contributions
comb+0.020
Jud+0.018
cr+0.017
cious+0.016
gan+0.016
Neg logit contributions
?"-0.011
les-0.011
 me-0.011
agging-0.010
 wears-0.010
Token rabbit
Token ID22746
Feature activation-3.78e-02

Pos logit contributions
comb+0.002
Jud+0.002
cr+0.002
cious+0.002
gan+0.002
Neg logit contributions
les-0.002
?"-0.002
 me-0.002
agging-0.002
ors-0.002
Token named
Token ID3706
Feature activation-7.92e-02

Pos logit contributions
les+0.049
agging+0.045
ors+0.042
iren+0.041
 me+0.041
Neg logit contributions
comb-0.084
Jud-0.074
cr-0.074
cious-0.068
gan-0.066
Token Bun
Token ID28515
Feature activation-1.33e-01

Pos logit contributions
les+0.126
agging+0.117
?"+0.113
ors+0.111
iren+0.110
Neg logit contributions
comb-0.152
Jud-0.133
cr-0.130
cious-0.118
gan-0.115
TokenB
Token ID33
Feature activation-7.48e-02

Pos logit contributions
les+0.146
iren+0.139
agging+0.137
?"+0.136
 channel+0.133
Neg logit contributions
comb-0.303
cr-0.276
Jud-0.271
cious-0.252
gan-0.244
Tokenun
Token ID403
Feature activation-1.65e-01

Pos logit contributions
les+0.082
?"+0.077
 me+0.073
ening+0.072
 wears+0.072
Neg logit contributions
comb-0.170
cr-0.160
Jud-0.155
gan-0.145
cious-0.140
Token.
Token ID13
Feature activation-1.32e-02

Pos logit contributions
les+0.180
agging+0.170
 channel+0.168
iren+0.166
 me+0.163
Neg logit contributions
comb-0.368
cr-0.341
Jud-0.331
cious-0.312
gan-0.302
Token Every
Token ID3887
Feature activation-3.65e-02

Pos logit contributions
les+0.024
agging+0.021
ler+0.021
ines+0.020
 concert+0.020
Neg logit contributions
comb-0.020
cr-0.019
 themselves-0.019
 Mole-0.018
Let-0.017
Token day
Token ID1110
Feature activation-9.43e-03

Pos logit contributions
les+0.049
agging+0.044
?"+0.044
ors+0.044
 me+0.043
Neg logit contributions
comb-0.080
Jud-0.070
cr-0.069
cious-0.063
 coconut-0.062
Token,
Token ID11
Feature activation-1.82e-03

Pos logit contributions
 channel+0.015
les+0.015
agging+0.015
ines+0.015
 channels+0.014
Neg logit contributions
cr-0.014
comb-0.013
Over-0.013
 Mole-0.013
 themselves-0.013
Token Bun
Token ID28515
Feature activation-9.18e-02

Pos logit contributions
les+0.003
agging+0.003
ines+0.003
iren+0.003
 channel+0.003
Neg logit contributions
comb-0.003
cr-0.003
 themselves-0.003
 mirac-0.003
 Mole-0.002
Token said
Token ID531
Feature activation-2.03e-03

Pos logit contributions
comb+0.099
Jud+0.092
cr+0.087
cious+0.080
gan+0.077
Neg logit contributions
agging-0.081
les-0.080
ors-0.074
iren-0.073
ock-0.073
Token,
Token ID11
Feature activation-2.07e-02

Pos logit contributions
les+0.001
?"+0.001
 me+0.001
agging+0.001
ors+0.001
Neg logit contributions
comb-0.006
Jud-0.006
cr-0.005
cious-0.005
gan-0.005
Token "
Token ID366
Feature activation-2.65e-03

Pos logit contributions
les+0.034
?"+0.032
agging+0.031
 me+0.031
ors+0.031
Neg logit contributions
comb-0.039
Jud-0.034
cr-0.033
cious-0.030
gan-0.029
TokenHello
Token ID15496
Feature activation-4.39e-02

Pos logit contributions
les+0.005
agging+0.005
?"+0.005
ors+0.005
ines+0.005
Neg logit contributions
comb-0.004
Jud-0.004
cr-0.004
cious-0.003
Let-0.003
Token,
Token ID11
Feature activation-9.58e-02

Pos logit contributions
agging+0.065
ag+0.064
 channels+0.063
iren+0.063
 channel+0.062
Neg logit contributions
comb-0.076
cr-0.068
 Mole-0.066
 Ab-0.065
Jud-0.064
Token Bun
Token ID28515
Feature activation-1.65e-01

Pos logit contributions
les+0.156
agging+0.146
lers+0.144
ler+0.140
ag+0.137
Neg logit contributions
comb-0.159
cr-0.146
 Mole-0.140
 Ab-0.139
Jud-0.138
Token-
Token ID12
Feature activation-3.14e-02

Pos logit contributions
 channels+0.228
 channel+0.221
agging+0.220
 volume+0.207
iren+0.206
Neg logit contributions
comb-0.316
cr-0.281
Jud-0.273
cious-0.255
M-0.244
TokenB
Token ID33
Feature activation-1.05e-01

Pos logit contributions
agging+0.066
les+0.064
 channel+0.063
ening+0.062
 problem+0.062
Neg logit contributions
comb-0.036
Jud-0.034
cr-0.032
lock-0.026
gan-0.026
Tokenun
Token ID403
Feature activation-1.82e-01

Pos logit contributions
les+0.136
 channels+0.129
?"+0.127
 me+0.125
 channel+0.125
Neg logit contributions
comb-0.211
cr-0.196
Jud-0.178
ees-0.176
gan-0.176
Token!
Token ID0
Feature activation-4.93e-02

Pos logit contributions
 channels+0.211
agging+0.210
 volume+0.200
 channel+0.200
les+0.197
Neg logit contributions
comb-0.394
cr-0.353
Jud-0.335
cious-0.327
gan-0.308
Token You
Token ID921
Feature activation-8.07e-03

Pos logit contributions
les+0.075
agging+0.067
ler+0.064
ors+0.062
?"+0.062
Neg logit contributions
comb-0.101
cr-0.089
Jud-0.088
cious-0.081
gan-0.079
Token.
Token ID13
Feature activation+1.26e-02

Pos logit contributions
comb+0.116
Jud+0.102
cr+0.101
cious+0.093
gan+0.090
Neg logit contributions
les-0.056
agging-0.049
?"-0.049
ors-0.048
 me-0.046
Token She
Token ID1375
Feature activation-1.13e-02

Pos logit contributions
comb+0.014
Let+0.013
 themselves+0.012
cr+0.012
 Mole+0.012
Neg logit contributions
agging-0.026
les-0.026
ler-0.026
ines-0.025
ock-0.025
Token said
Token ID531
Feature activation+7.90e-03

Pos logit contributions
les+0.015
 me+0.014
?"+0.014
 wears+0.013
ors+0.012
Neg logit contributions
comb-0.026
Jud-0.022
cr-0.022
cious-0.020
mons-0.020
Token "
Token ID366
Feature activation+1.48e-02

Pos logit contributions
comb+0.022
Jud+0.020
cr+0.019
cious+0.018
 coconut+0.018
Neg logit contributions
les-0.006
?"-0.006
 me-0.005
ors-0.005
agging-0.005
TokenB
Token ID33
Feature activation-8.37e-02

Pos logit contributions
comb+0.028
Jud+0.023
cr+0.022
cious+0.021
 coconut+0.020
Neg logit contributions
les-0.025
 me-0.024
agging-0.023
?"-0.023
ock-0.022
Tokenun
Token ID403
Feature activation-1.65e-01

Pos logit contributions
les+0.127
?"+0.119
 me+0.117
agging+0.117
 wears+0.110
Neg logit contributions
comb-0.166
cr-0.143
Jud-0.142
cious-0.129
gan-0.126
Token,
Token ID11
Feature activation-7.06e-02

Pos logit contributions
les+0.161
agging+0.145
?"+0.141
 me+0.138
ors+0.134
Neg logit contributions
comb-0.416
Jud-0.371
cr-0.367
cious-0.343
gan-0.333
Token it
Token ID340
Feature activation-4.06e-02

Pos logit contributions
les+0.096
agging+0.087
?"+0.084
ors+0.084
 me+0.081
Neg logit contributions
comb-0.154
Jud-0.135
cr-0.133
cious-0.121
 coconut-0.119
Token's
Token ID338
Feature activation-2.48e-03

Pos logit contributions
les+0.064
agging+0.062
ors+0.059
?"+0.059
iren+0.057
Neg logit contributions
comb-0.078
Jud-0.066
cr-0.065
cious-0.059
gan-0.057
Token time
Token ID640
Feature activation-3.79e-02

Pos logit contributions
les+0.004
ors+0.003
agging+0.003
iren+0.003
ler+0.003
Neg logit contributions
comb-0.005
cr-0.004
Jud-0.004
gan-0.004
 Maggie-0.004
Token to
Token ID284
Feature activation-3.00e-03

Pos logit contributions
les+0.063
ler+0.056
agging+0.055
ors+0.055
iren+0.053
Neg logit contributions
comb-0.070
cr-0.060
Jud-0.058
cious-0.054
 Maggie-0.054
Token\n
Token ID198
Feature activation-1.78e-02

Pos logit contributions
les+0.017
agging+0.015
?"+0.015
ors+0.015
 me+0.015
Neg logit contributions
comb-0.023
Jud-0.020
cr-0.019
cious-0.018
gan-0.017
Token"
Token ID1
Feature activation+3.11e-03

Pos logit contributions
 me+0.023
?"+0.023
les+0.022
iren+0.020
agging+0.019
Neg logit contributions
comb-0.042
cr-0.034
Jud-0.034
 coconut-0.032
cious-0.031
TokenWe
Token ID1135
Feature activation+1.20e-02

Pos logit contributions
comb+0.007
cious+0.006
Jud+0.006
apa+0.005
stack+0.005
Neg logit contributions
les-0.004
 me-0.004
ock-0.004
ch-0.004
 channel-0.004
Token are
Token ID389
Feature activation-2.77e-02

Pos logit contributions
comb+0.025
Jud+0.022
cr+0.021
cious+0.020
gan+0.019
Neg logit contributions
les-0.017
?"-0.017
 me-0.016
agging-0.015
ors-0.015
Token in
Token ID287
Feature activation-7.49e-02

Pos logit contributions
les+0.039
agging+0.035
ors+0.034
?"+0.034
 me+0.033
Neg logit contributions
comb-0.059
Jud-0.052
cr-0.051
cious-0.047
gan-0.046
Token space
Token ID2272
Feature activation-1.62e-01

Pos logit contributions
les+0.130
agging+0.119
ors+0.118
iren+0.109
ock+0.108
Neg logit contributions
comb-0.137
Jud-0.117
cr-0.109
 Maggie-0.103
 coconut-0.102
Token!"
Token ID2474
Feature activation-9.36e-03

Pos logit contributions
les+0.131
agging+0.111
?"+0.106
ors+0.106
 me+0.104
Neg logit contributions
comb-0.441
Jud-0.395
cr-0.391
cious-0.365
gan-0.359
Token Lily
Token ID20037
Feature activation-1.91e-02

Pos logit contributions
les+0.015
agging+0.014
ors+0.014
?"+0.013
ening+0.013
Neg logit contributions
comb-0.018
Jud-0.016
cr-0.015
 Maggie-0.014
cious-0.014
Token says
Token ID1139
Feature activation-3.40e-02

Pos logit contributions
agging+0.024
les+0.023
ors+0.021
ler+0.021
ening+0.020
Neg logit contributions
comb-0.041
Jud-0.037
cr-0.037
gan-0.033
 Maggie-0.033
Token.
Token ID13
Feature activation-4.42e-02

Pos logit contributions
les+0.026
agging+0.023
?"+0.022
 me+0.021
ors+0.021
Neg logit contributions
comb-0.094
Jud-0.085
cr-0.083
 coconut-0.078
cious-0.078
Token She
Token ID1375
Feature activation-3.76e-02

Pos logit contributions
les+0.054
?"+0.053
 me+0.052
agging+0.049
iren+0.047
Neg logit contributions
comb-0.102
Jud-0.086
cr-0.086
 coconut-0.081
cious-0.080
Token you
Token ID345
Feature activation+1.72e-02

Pos logit contributions
comb+0.002
cr+0.002
Jud+0.002
 coconut+0.002
cious+0.002
Neg logit contributions
les-0.001
agging-0.001
ors-0.001
iren-0.001
ines-0.001
Token here
Token ID994
Feature activation+3.63e-02

Pos logit contributions
comb+0.042
cr+0.038
Jud+0.037
cious+0.034
 coconut+0.034
Neg logit contributions
les-0.018
ors-0.016
agging-0.016
iren-0.015
ler-0.014
Token,
Token ID11
Feature activation-6.54e-02

Pos logit contributions
comb+0.110
cr+0.099
Jud+0.097
 Maggie+0.091
cious+0.090
Neg logit contributions
les-0.019
agging-0.014
ors-0.013
iren-0.012
ening-0.011
Token young
Token ID1862
Feature activation-5.52e-02

Pos logit contributions
?"+0.086
 me+0.086
 my+0.069
 mask+0.069
les+0.068
Neg logit contributions
comb-0.153
Jud-0.137
cr-0.136
aped-0.129
cious-0.128
Token'
Token ID6
Feature activation+5.45e-02

Pos logit contributions
les+0.086
agging+0.081
ors+0.079
?"+0.078
 me+0.077
Neg logit contributions
comb-0.107
Jud-0.093
cr-0.090
cious-0.082
 coconut-0.080
Tokenun
Token ID403
Feature activation-1.60e-01

Pos logit contributions
comb+0.169
Jud+0.155
cr+0.154
gan+0.143
cious+0.143
Neg logit contributions
les-0.023
agging-0.016
ors-0.015
 me-0.011
ening-0.011
Token?"
Token ID1701
Feature activation-2.91e-02

Pos logit contributions
agging+0.119
les+0.117
 me+0.102
 channels+0.101
 channel+0.096
Neg logit contributions
comb-0.419
cr-0.374
Jud-0.367
cious-0.349
emon-0.337
Token
Token ID220
Feature activation-7.51e-02

Pos logit contributions
les+0.038
?"+0.037
 me+0.037
ors+0.034
agging+0.033
Neg logit contributions
comb-0.065
Jud-0.055
cr-0.055
cious-0.051
 coconut-0.050
Token
Token ID198
Feature activation+2.81e-03

Pos logit contributions
les+0.105
?"+0.098
agging+0.095
 me+0.095
ors+0.094
Neg logit contributions
comb-0.159
Jud-0.139
cr-0.136
cious-0.125
 coconut-0.122
Token
Token ID198
Feature activation-5.32e-06

Pos logit contributions
comb+0.006
cr+0.005
Jud+0.005
cious+0.005
 coconut+0.005
Neg logit contributions
 me-0.004
les-0.004
?"-0.004
agging-0.003
iren-0.003
TokenBob
Token ID18861
Feature activation+5.11e-02

Feature not active on prev token; no predictions were affected.
Token little
Token ID1310
Feature activation-2.58e-02

Pos logit contributions
les+0.026
?"+0.024
agging+0.024
 me+0.023
ors+0.023
Neg logit contributions
comb-0.042
Jud-0.037
cr-0.036
cious-0.033
gan-0.032
Token rabbit
Token ID22746
Feature activation-2.19e-02

Pos logit contributions
les+0.044
?"+0.041
agging+0.041
 me+0.040
ors+0.040
Neg logit contributions
comb-0.047
Jud-0.040
cr-0.039
cious-0.035
gan-0.034
Token named
Token ID3706
Feature activation-6.51e-02

Pos logit contributions
les+0.031
agging+0.028
ors+0.027
iren+0.026
?"+0.026
Neg logit contributions
comb-0.046
cr-0.041
Jud-0.041
cious-0.037
 coconut-0.035
Token Bun
Token ID28515
Feature activation-1.25e-01

Pos logit contributions
les+0.079
?"+0.075
 me+0.075
ors+0.071
agging+0.070
Neg logit contributions
comb-0.150
Jud-0.130
cr-0.128
cious-0.118
 coconut-0.116
TokenB
Token ID33
Feature activation-6.94e-02

Pos logit contributions
les+0.122
?"+0.106
agging+0.104
 me+0.102
ors+0.102
Neg logit contributions
comb-0.323
Jud-0.291
cr-0.282
cious-0.264
 coconut-0.262
Tokenun
Token ID403
Feature activation-1.58e-01

Pos logit contributions
les+0.084
?"+0.079
 me+0.077
ening+0.075
 wears+0.075
Neg logit contributions
comb-0.151
cr-0.144
Jud-0.137
gan-0.128
cious-0.124
Token.
Token ID13
Feature activation+1.13e-03

Pos logit contributions
les+0.154
agging+0.139
?"+0.136
 me+0.135
ors+0.129
Neg logit contributions
comb-0.393
cr-0.355
Jud-0.354
cious-0.327
gan-0.316
Token
Token ID198
Feature activation-8.26e-04

Pos logit contributions
cr+0.001
 themselves+0.001
comb+0.001
Let+0.001
 Mole+0.001
Neg logit contributions
agging-0.003
les-0.003
ines-0.002
ler-0.002
ening-0.002
Token
Token ID198
Feature activation-2.72e-03

Pos logit contributions
ening+0.002
agging+0.002
les+0.002
ors+0.002
ler+0.002
Neg logit contributions
Jud-0.001
comb-0.001
cr-0.001
Jess-0.001
Maybe-0.001
TokenAs
Token ID1722
Feature activation-1.72e-03

Pos logit contributions
ening+0.006
les+0.006
agging+0.005
 breaths+0.005
 channel+0.005
Neg logit contributions
comb-0.003
Jud-0.003
cr-0.003
M-0.002
Let-0.002
Token they
Token ID484
Feature activation+2.19e-02

Pos logit contributions
agging+0.004
les+0.004
ag+0.004
 channel+0.004
ors+0.004
Neg logit contributions
comb-0.001
Jud-0.001
cr-0.001
 themselves-0.001
M-0.001
Token happily
Token ID18177
Feature activation-4.29e-02

Pos logit contributions
les+0.043
 me+0.040
?"+0.040
agging+0.039
ors+0.037
Neg logit contributions
comb-0.078
Jud-0.070
cr-0.068
cious-0.062
 coconut-0.062
Token.
Token ID13
Feature activation-2.09e-02

Pos logit contributions
les+0.027
agging+0.022
?"+0.022
 me+0.022
ors+0.019
Neg logit contributions
comb-0.124
Jud-0.116
cr-0.108
 coconut-0.104
cious-0.102
Token "
Token ID366
Feature activation-2.68e-03

Pos logit contributions
les+0.038
ler+0.035
agging+0.035
ors+0.035
ening+0.034
Neg logit contributions
comb-0.035
Jud-0.031
cr-0.030
 Maggie-0.027
gan-0.026
TokenHello
Token ID15496
Feature activation-3.86e-02

Pos logit contributions
les+0.004
 me+0.004
?"+0.004
agging+0.004
 sax+0.004
Neg logit contributions
comb-0.005
Jud-0.004
cr-0.004
 coconut-0.004
gan-0.004
Token,
Token ID11
Feature activation-9.46e-02

Pos logit contributions
agging+0.049
iren+0.047
les+0.047
ag+0.044
ler+0.042
Neg logit contributions
comb-0.079
cr-0.070
gan-0.066
 Ab-0.065
 Maggie-0.065
Token Bun
Token ID28515
Feature activation-1.57e-01

Pos logit contributions
les+0.142
agging+0.133
ors+0.129
ines+0.123
ler+0.123
Neg logit contributions
comb-0.181
cr-0.157
Jud-0.146
 Ab-0.142
gan-0.141
Token-
Token ID12
Feature activation-2.58e-02

Pos logit contributions
agging+0.166
les+0.162
ors+0.149
iren+0.149
 channel+0.148
Neg logit contributions
comb-0.370
cr-0.325
Jud-0.317
cious-0.302
gan-0.299
TokenCloud
Token ID18839
Feature activation-3.72e-02

Pos logit contributions
les+0.043
agging+0.042
ors+0.039
?"+0.039
 me+0.039
Neg logit contributions
comb-0.045
cr-0.040
Jud-0.040
cious-0.035
gan-0.035
Token.
Token ID13
Feature activation-7.41e-02

Pos logit contributions
agging+0.040
les+0.038
 me+0.037
ors+0.036
iren+0.036
Neg logit contributions
comb-0.083
cr-0.075
Jud-0.071
cious-0.069
gan-0.068
Token We
Token ID775
Feature activation+5.79e-03

Pos logit contributions
les+0.098
agging+0.089
?"+0.086
ors+0.086
 me+0.085
Neg logit contributions
comb-0.163
Jud-0.143
cr-0.143
cious-0.130
gan-0.128
Token are
Token ID389
Feature activation-1.44e-02

Pos logit contributions
comb+0.012
Jud+0.010
cr+0.010
cious+0.009
 coconut+0.009
Neg logit contributions
les-0.008
agging-0.008
?"-0.008
ors-0.008
 me-0.008
Token to
Token ID284
Feature activation-2.70e-02

Pos logit contributions
les+0.027
 me+0.026
?"+0.026
ors+0.024
 wears+0.024
Neg logit contributions
comb-0.043
Jud-0.038
cr-0.037
cious-0.034
 coconut-0.033
Token play
Token ID711
Feature activation-6.57e-02

Pos logit contributions
les+0.039
 me+0.037
?"+0.037
agging+0.036
ors+0.034
Neg logit contributions
comb-0.056
Jud-0.049
cr-0.048
cious-0.044
 coconut-0.043
Token in
Token ID287
Feature activation-5.05e-02

Pos logit contributions
les+0.083
?"+0.077
 me+0.076
agging+0.074
ors+0.072
Neg logit contributions
comb-0.151
Jud-0.134
cr-0.130
cious-0.121
 coconut-0.118
Token a
Token ID257
Feature activation-7.08e-02

Pos logit contributions
les+0.061
 me+0.060
?"+0.058
agging+0.055
ors+0.053
Neg logit contributions
comb-0.118
Jud-0.103
cr-0.102
cious-0.095
gan-0.092
Token fair
Token ID3148
Feature activation-2.52e-02

Pos logit contributions
les+0.096
 me+0.092
?"+0.090
agging+0.087
ler+0.083
Neg logit contributions
comb-0.155
Jud-0.136
cr-0.135
cious-0.121
gan-0.120
Token manner
Token ID5642
Feature activation-1.57e-01

Pos logit contributions
les+0.040
agging+0.037
?"+0.037
ors+0.037
 me+0.036
Neg logit contributions
comb-0.048
Jud-0.041
cr-0.040
cious-0.036
 coconut-0.036
Token.
Token ID13
Feature activation-2.24e-02

Pos logit contributions
les+0.105
?"+0.095
agging+0.084
 me+0.084
ors+0.084
Neg logit contributions
comb-0.444
Jud-0.418
cr-0.399
 coconut-0.383
cious-0.371
Token In
Token ID554
Feature activation-2.07e-02

Pos logit contributions
 me+0.029
?"+0.027
ors+0.025
les+0.024
 Teddy+0.024
Neg logit contributions
comb-0.051
Jud-0.045
cr-0.045
cious-0.041
 coconut-0.040
Token the
Token ID262
Feature activation-4.84e-02

Pos logit contributions
les+0.031
agging+0.028
?"+0.028
ors+0.028
 me+0.027
Neg logit contributions
comb-0.042
Jud-0.037
cr-0.036
cious-0.032
gan-0.032
Token end
Token ID886
Feature activation-1.38e-02

Pos logit contributions
les+0.074
?"+0.068
 me+0.068
agging+0.068
ors+0.065
Neg logit contributions
comb-0.098
Jud-0.085
cr-0.083
cious-0.075
gan-0.073
Token,
Token ID11
Feature activation-2.21e-02

Pos logit contributions
?"+0.008
les+0.008
 me+0.008
ors+0.006
iren+0.006
Neg logit contributions
comb-0.041
Jud-0.037
cr-0.036
cious-0.034
 coconut-0.034
Token and
Token ID290
Feature activation+2.12e-02

Pos logit contributions
comb+0.080
Jud+0.075
emon+0.073
cr+0.072
cious+0.071
Neg logit contributions
 me-0.011
?"-0.010
les-0.009
,-0.009
?-0.005
Token saw
Token ID2497
Feature activation+1.08e-02

Pos logit contributions
comb+0.048
Jud+0.043
cr+0.042
cious+0.039
emon+0.039
Neg logit contributions
les-0.027
 me-0.026
?"-0.025
 wears-0.024
ors-0.023
Token it
Token ID340
Feature activation+1.40e-02

Pos logit contributions
comb+0.027
Jud+0.024
cr+0.024
cious+0.022
gan+0.022
Neg logit contributions
les-0.011
 me-0.011
?"-0.010
ors-0.009
 wrong-0.009
Token said
Token ID531
Feature activation+9.06e-03

Pos logit contributions
comb+0.030
Jud+0.027
cr+0.026
cious+0.024
gan+0.023
Neg logit contributions
les-0.019
 me-0.018
?"-0.018
agging-0.017
ors-0.017
Token '
Token ID705
Feature activation-2.41e-02

Pos logit contributions
comb+0.024
Jud+0.022
cr+0.021
cious+0.020
 coconut+0.020
Neg logit contributions
?"-0.007
les-0.007
 me-0.007
agging-0.006
 wrong-0.006
Tokenun
Token ID403
Feature activation-1.57e-01

Pos logit contributions
les+0.040
?"+0.038
 me+0.037
ler+0.035
iren+0.035
Neg logit contributions
comb-0.044
Jud-0.037
cr-0.034
gan-0.033
 Maggie-0.033
Tokenlock
Token ID5354
Feature activation-1.30e-02

Pos logit contributions
les+0.224
?"+0.210
ors+0.204
 me+0.204
ler+0.191
Neg logit contributions
comb-0.325
Jud-0.296
cr-0.281
gan-0.259
 coconut-0.256
Token'.
Token ID4458
Feature activation-3.03e-02

Pos logit contributions
les+0.015
?"+0.014
 me+0.014
ors+0.013
agging+0.013
Neg logit contributions
comb-0.030
Jud-0.027
cr-0.027
cious-0.025
 coconut-0.024
Token
Token ID198
Feature activation-4.56e-04

Pos logit contributions
les+0.044
?"+0.041
 me+0.041
agging+0.040
ors+0.039
Neg logit contributions
comb-0.063
Jud-0.055
cr-0.054
cious-0.049
gan-0.048
Token
Token ID198
Feature activation-4.19e-03

Pos logit contributions
les+0.001
agging+0.001
ors+0.001
?"+0.001
ening+0.001
Neg logit contributions
comb-0.001
Jud-0.001
cr-0.001
cious-0.001
 Maggie-0.001
TokenShe
Token ID3347
Feature activation-1.38e-02

Pos logit contributions
les+0.007
 me+0.007
?"+0.007
agging+0.007
ors+0.006
Neg logit contributions
comb-0.008
cr-0.006
Jud-0.006
gan-0.006
 coconut-0.006
Token bus
Token ID1323
Feature activation+1.95e-03

Pos logit contributions
comb+0.017
Jud+0.015
cr+0.015
cious+0.014
gan+0.014
Neg logit contributions
les-0.009
?"-0.009
 me-0.009
agging-0.008
ler-0.008
Token with
Token ID351
Feature activation-7.78e-03

Pos logit contributions
comb+0.004
cr+0.004
Jud+0.004
cious+0.003
gan+0.003
Neg logit contributions
les-0.003
ors-0.002
agging-0.002
 me-0.002
?"-0.002
Token the
Token ID262
Feature activation-2.47e-02

Pos logit contributions
les+0.011
 me+0.010
?"+0.010
agging+0.009
ors+0.009
Neg logit contributions
comb-0.017
Jud-0.015
cr-0.014
cious-0.013
gan-0.013
Token word
Token ID1573
Feature activation+7.79e-02

Pos logit contributions
les+0.031
?"+0.028
agging+0.028
 me+0.028
ors+0.027
Neg logit contributions
comb-0.056
Jud-0.050
cr-0.048
cious-0.044
gan-0.043
Token "
Token ID366
Feature activation+2.24e-02

Pos logit contributions
comb+0.196
Jud+0.178
cr+0.172
cious+0.161
gan+0.157
Neg logit contributions
les-0.078
?"-0.071
 me-0.068
agging-0.067
ors-0.065
Tokenun
Token ID403
Feature activation-1.56e-01

Pos logit contributions
comb+0.044
Jud+0.036
cious+0.032
 coconut+0.031
 Maggie+0.031
Neg logit contributions
les-0.037
 me-0.035
?"-0.035
iren-0.033
ock-0.033
Tokeniversity
Token ID1608
Feature activation-8.79e-02

Pos logit contributions
les+0.215
?"+0.196
ors+0.191
 me+0.190
agging+0.188
Neg logit contributions
comb-0.339
Jud-0.301
cr-0.289
gan-0.263
 coconut-0.262
Token"
Token ID1
Feature activation+2.94e-02

Pos logit contributions
les+0.127
?"+0.118
 me+0.115
agging+0.114
ors+0.113
Neg logit contributions
comb-0.185
Jud-0.164
cr-0.157
cious-0.143
 coconut-0.142
Token on
Token ID319
Feature activation-4.70e-02

Pos logit contributions
comb+0.064
Jud+0.056
cr+0.052
cious+0.051
 coconut+0.050
Neg logit contributions
les-0.040
?"-0.039
 me-0.038
iren-0.035
ors-0.035
Token it
Token ID340
Feature activation-6.47e-02

Pos logit contributions
les+0.059
 me+0.058
?"+0.057
ors+0.054
agging+0.053
Neg logit contributions
comb-0.105
Jud-0.093
cr-0.092
gan-0.083
cious-0.082
Token.
Token ID13
Feature activation-4.64e-02

Pos logit contributions
les+0.054
?"+0.052
 me+0.050
ors+0.045
agging+0.043
Neg logit contributions
comb-0.172
Jud-0.160
cr-0.153
cious-0.145
 coconut-0.144
Token me
Token ID502
Feature activation-9.27e-02

Pos logit contributions
les+0.034
ines+0.031
 channels+0.029
agging+0.029
 wears+0.028
Neg logit contributions
cr-0.065
Jud-0.065
comb-0.063
 Judy-0.059
 coconut-0.057
Token,
Token ID11
Feature activation-6.35e-02

Pos logit contributions
les+0.091
agging+0.075
ening+0.071
iren+0.071
ors+0.070
Neg logit contributions
comb-0.241
cr-0.213
Jud-0.205
gan-0.194
cious-0.194
Token Bun
Token ID28515
Feature activation-1.41e-01

Pos logit contributions
les+0.101
agging+0.096
ines+0.094
iren+0.090
ors+0.088
Neg logit contributions
comb-0.119
cr-0.106
Jud-0.099
 coconut-0.092
gan-0.090
Token-
Token ID12
Feature activation-6.42e-03

Pos logit contributions
 channels+0.196
 channel+0.182
agging+0.166
iren+0.165
 volume+0.165
Neg logit contributions
comb-0.287
cr-0.258
Jud-0.236
cious-0.235
let-0.228
TokenB
Token ID33
Feature activation-7.39e-02

Pos logit contributions
les+0.011
agging+0.010
?"+0.010
 channel+0.010
ors+0.009
Neg logit contributions
comb-0.011
Jud-0.010
cr-0.010
cious-0.009
 coconut-0.009
Tokenun
Token ID403
Feature activation-1.56e-01

Pos logit contributions
les+0.089
 channels+0.081
 channel+0.079
?"+0.079
 me+0.077
Neg logit contributions
comb-0.165
cr-0.154
Jud-0.144
gan-0.138
cious-0.135
Token!
Token ID0
Feature activation-2.50e-02

Pos logit contributions
 channels+0.168
 channel+0.158
agging+0.155
 volume+0.152
les+0.151
Neg logit contributions
comb-0.348
cr-0.316
cious-0.295
Jud-0.294
 coconut-0.273
Token I
Token ID314
Feature activation-1.27e-02

Pos logit contributions
les+0.033
?"+0.030
agging+0.030
 me+0.030
ors+0.029
Neg logit contributions
comb-0.054
Jud-0.048
cr-0.047
cious-0.043
gan-0.042
Token have
Token ID423
Feature activation+1.86e-02

Pos logit contributions
les+0.020
agging+0.019
ors+0.019
 me+0.018
ening+0.018
Neg logit contributions
comb-0.024
cr-0.021
Jud-0.021
cious-0.018
emon-0.018
Token a
Token ID257
Feature activation-2.73e-02

Pos logit contributions
comb+0.032
cr+0.028
Jud+0.028
cious+0.025
 coconut+0.024
Neg logit contributions
les-0.033
agging-0.030
ors-0.030
?"-0.029
iren-0.029
Token surprise
Token ID5975
Feature activation+9.24e-03

Pos logit contributions
les+0.042
ors+0.040
agging+0.037
ening+0.036
ock+0.035
Neg logit contributions
comb-0.056
cr-0.047
Jud-0.047
 coconut-0.045
cious-0.043
Token,
Token ID11
Feature activation-1.88e-02

Pos logit contributions
les+0.012
agging+0.009
?"+0.009
ors+0.009
 me+0.009
Neg logit contributions
comb-0.044
Jud-0.040
cr-0.040
cious-0.037
gan-0.036
Token "
Token ID366
Feature activation+2.12e-03

Pos logit contributions
les+0.033
agging+0.030
?"+0.030
ors+0.029
 me+0.029
Neg logit contributions
comb-0.034
Jud-0.029
cr-0.028
cious-0.025
gan-0.025
TokenMom
Token ID29252
Feature activation+1.60e-02

Pos logit contributions
comb+0.003
Jud+0.002
cr+0.002
cious+0.002
gan+0.002
Neg logit contributions
les-0.005
agging-0.005
ors-0.004
?"-0.004
ines-0.004
Tokenmy
Token ID1820
Feature activation-7.31e-02

Pos logit contributions
comb+0.025
cr+0.022
Jud+0.022
ill+0.020
gan+0.020
Neg logit contributions
les-0.030
agging-0.028
ler-0.027
 sax-0.027
ors-0.026
Token,
Token ID11
Feature activation-7.94e-02

Pos logit contributions
les+0.100
agging+0.096
 channels+0.091
 sax+0.090
iren+0.087
Neg logit contributions
comb-0.153
Jud-0.136
cr-0.128
 Judy-0.123
gan-0.121
Token Bun
Token ID28515
Feature activation-1.54e-01

Pos logit contributions
les+0.122
agging+0.114
ors+0.103
ines+0.102
ler+0.102
Neg logit contributions
comb-0.166
cr-0.142
Jud-0.137
 coconut-0.128
gan-0.126
Token Bun
Token ID28515
Feature activation-1.49e-01

Pos logit contributions
 channels+0.178
agging+0.172
 channel+0.169
les+0.163
iren+0.161
Neg logit contributions
comb-0.340
cr-0.285
Jud-0.284
cious-0.270
gan-0.259
Token is
Token ID318
Feature activation-1.94e-02

Pos logit contributions
agging+0.167
 channels+0.166
les+0.161
 channel+0.159
iren+0.158
Neg logit contributions
comb-0.336
cr-0.287
Jud-0.281
cious-0.271
gan-0.258
Token broken
Token ID5445
Feature activation-4.51e-02

Pos logit contributions
les+0.033
ors+0.028
ines+0.027
ock+0.027
iren+0.026
Neg logit contributions
comb-0.037
cr-0.032
Jud-0.032
gan-0.028
emon-0.028
Token.
Token ID13
Feature activation-2.36e-02

Pos logit contributions
les+0.069
iren+0.064
agging+0.063
brate+0.063
 wears+0.062
Neg logit contributions
comb-0.082
cr-0.068
 themselves-0.063
gan-0.060
 Ab-0.054
Token He
Token ID679
Feature activation-3.63e-02

Pos logit contributions
les+0.040
agging+0.038
ines+0.035
ler+0.034
ag+0.034
Neg logit contributions
comb-0.043
cr-0.036
Jud-0.035
gan-0.034
cious-0.033
Token wild
Token ID4295
Feature activation-3.45e-02

Pos logit contributions
comb+0.046
Jud+0.040
cr+0.039
cious+0.035
 coconut+0.034
Neg logit contributions
les-0.038
agging-0.035
?"-0.035
ors-0.034
 me-0.034
Token cat
Token ID3797
Feature activation-1.40e-04

Pos logit contributions
les+0.059
agging+0.055
?"+0.054
ors+0.053
ler+0.052
Neg logit contributions
comb-0.062
cr-0.052
Jud-0.052
cious-0.048
 coconut-0.045
Token that
Token ID326
Feature activation-6.18e-02

Pos logit contributions
les+0.000
?"+0.000
 me+0.000
agging+0.000
ors+0.000
Neg logit contributions
comb-0.000
Jud-0.000
cr-0.000
cious-0.000
 coconut-0.000
Token lives
Token ID3160
Feature activation-1.08e-02

Pos logit contributions
les+0.074
?"+0.068
 me+0.068
ors+0.063
agging+0.062
Neg logit contributions
comb-0.145
Jud-0.129
cr-0.125
cious-0.116
gan-0.115
Token in
Token ID287
Feature activation-5.37e-02

Pos logit contributions
les+0.012
?"+0.011
 me+0.011
agging+0.011
ors+0.011
Neg logit contributions
comb-0.026
Jud-0.023
cr-0.023
cious-0.021
 coconut-0.021
Token far
Token ID1290
Feature activation-1.53e-01

Pos logit contributions
les+0.071
?"+0.066
 me+0.065
agging+0.064
ors+0.063
Neg logit contributions
comb-0.119
Jud-0.104
cr-0.102
cious-0.094
gan-0.092
Tokenaway
Token ID8272
Feature activation-1.02e-01

Pos logit contributions
agging+0.232
ors+0.201
les+0.199
 trumpet+0.199
 sax+0.197
Neg logit contributions
comb-0.267
Jud-0.247
cr-0.234
 coconut-0.227
 Maggie-0.224
Token places
Token ID4113
Feature activation-7.72e-02

Pos logit contributions
agging+0.124
les+0.122
iren+0.112
ors+0.111
ler+0.108
Neg logit contributions
comb-0.221
Jud-0.199
cr-0.191
cious-0.183
 coconut-0.179
Token.
Token ID13
Feature activation-2.52e-02

Pos logit contributions
les+0.065
agging+0.056
ors+0.053
?"+0.051
 me+0.051
Neg logit contributions
comb-0.206
Jud-0.185
cr-0.183
cious-0.171
gan-0.167
Token It
Token ID632
Feature activation+4.21e-03

Pos logit contributions
?"+0.030
les+0.029
 me+0.029
ors+0.027
 wears+0.026
Neg logit contributions
comb-0.058
Jud-0.051
cr-0.049
cious-0.046
 coconut-0.046
Token can
Token ID460
Feature activation-1.11e-03

Pos logit contributions
comb+0.010
Jud+0.009
cr+0.009
cious+0.008
 Maggie+0.008
Neg logit contributions
les-0.005
agging-0.004
ors-0.004
?"-0.004
iren-0.004
Token and
Token ID290
Feature activation-1.71e-02

Pos logit contributions
les+0.019
?"+0.016
agging+0.016
 me+0.015
ors+0.015
Neg logit contributions
comb-0.061
Jud-0.055
cr-0.055
cious-0.051
gan-0.050
Token don
Token ID836
Feature activation-4.89e-02

Pos logit contributions
les+0.025
?"+0.024
 me+0.024
agging+0.023
 wears+0.022
Neg logit contributions
comb-0.035
Jud-0.031
cr-0.030
cious-0.027
emon-0.026
Token't
Token ID470
Feature activation-7.55e-02

Pos logit contributions
les+0.051
agging+0.045
 me+0.042
ors+0.042
?"+0.042
Neg logit contributions
comb-0.121
Jud-0.108
cr-0.106
cious-0.098
gan-0.096
Token go
Token ID467
Feature activation-7.62e-03

Pos logit contributions
les+0.121
agging+0.111
?"+0.111
 me+0.109
ors+0.109
Neg logit contributions
comb-0.146
Jud-0.126
cr-0.123
cious-0.111
gan-0.108
Token too
Token ID1165
Feature activation-2.11e-02

Pos logit contributions
les+0.012
agging+0.011
ors+0.011
ening+0.011
ler+0.010
Neg logit contributions
comb-0.014
cr-0.012
Jud-0.011
gan-0.011
 themselves-0.011
Token far
Token ID1290
Feature activation-1.53e-01

Pos logit contributions
les+0.033
agging+0.030
ors+0.030
ening+0.028
 channel+0.027
Neg logit contributions
comb-0.041
Jud-0.037
cr-0.035
 coconut-0.033
gan-0.033
Token,"
Token ID553
Feature activation-3.36e-02

Pos logit contributions
iren+0.204
agging+0.204
 beats+0.201
 channel+0.199
 channels+0.198
Neg logit contributions
comb-0.294
cr-0.243
 themselves-0.237
Jud-0.231
 Maggie-0.225
Token she
Token ID673
Feature activation-6.69e-02

Pos logit contributions
 me+0.048
?"+0.045
,+0.040
?+0.040
 my+0.036
Neg logit contributions
comb-0.084
cious-0.070
cr-0.066
ocado-0.065
fire-0.062
Token said
Token ID531
Feature activation-3.00e-02

Pos logit contributions
les+0.065
 me+0.060
?"+0.058
 wears+0.050
agging+0.048
Neg logit contributions
comb-0.175
Jud-0.154
cr-0.151
 coconut-0.144
cious-0.143
Token.
Token ID13
Feature activation-1.95e-02

Pos logit contributions
les+0.021
agging+0.019
?"+0.019
 me+0.017
ors+0.017
Neg logit contributions
comb-0.084
Jud-0.077
cr-0.075
 coconut-0.071
cious-0.070
Token "
Token ID366
Feature activation-1.17e-03

Pos logit contributions
les+0.025
?"+0.025
 me+0.024
agging+0.023
ors+0.022
Neg logit contributions
comb-0.043
Jud-0.037
cr-0.037
cious-0.034
 coconut-0.033
Token.
Token ID13
Feature activation-1.63e-02

Pos logit contributions
les+0.071
agging+0.067
?"+0.064
 me+0.062
ors+0.061
Neg logit contributions
comb-0.156
cr-0.137
Jud-0.132
gan-0.126
cious-0.124
Token
Token ID198
Feature activation-1.59e-02

Pos logit contributions
les+0.022
agging+0.020
ors+0.020
?"+0.019
ler+0.019
Neg logit contributions
comb-0.035
Jud-0.031
cr-0.031
gan-0.028
cious-0.028
Token
Token ID198
Feature activation-2.21e-02

Pos logit contributions
les+0.026
agging+0.024
ors+0.023
ening+0.023
?"+0.023
Neg logit contributions
comb-0.029
Jud-0.027
cr-0.025
cious-0.022
 Maggie-0.022
Token"
Token ID1
Feature activation+5.45e-04

Pos logit contributions
 me+0.031
les+0.031
?"+0.030
iren+0.030
agging+0.028
Neg logit contributions
comb-0.048
cr-0.040
Jud-0.038
cious-0.036
 coconut-0.035
TokenGo
Token ID5247
Feature activation+7.25e-03

Pos logit contributions
comb+0.001
Jud+0.001
cious+0.001
cr+0.001
 Maggie+0.001
Neg logit contributions
les-0.001
 me-0.001
iren-0.001
?"-0.001
ler-0.001
Token away
Token ID1497
Feature activation-1.53e-01

Pos logit contributions
comb+0.014
cr+0.012
Jud+0.012
lock+0.011
 coconut+0.011
Neg logit contributions
agging-0.010
les-0.010
 channels-0.010
ening-0.010
ors-0.010
Token,
Token ID11
Feature activation-1.12e-01

Pos logit contributions
les+0.132
agging+0.123
ors+0.108
ening+0.099
ler+0.099
Neg logit contributions
comb-0.409
Jud-0.365
cr-0.357
gan-0.333
 Maggie-0.332
Token Lily
Token ID20037
Feature activation-7.70e-02

Pos logit contributions
les+0.136
?"+0.130
 me+0.130
agging+0.119
ors+0.117
Neg logit contributions
comb-0.263
Jud-0.234
cr-0.228
cious-0.213
gan-0.203
Token,"
Token ID553
Feature activation-5.16e-02

Pos logit contributions
les+0.074
agging+0.068
ors+0.061
iren+0.059
ening+0.059
Neg logit contributions
comb-0.196
Jud-0.171
cr-0.171
cious-0.158
gan-0.157
Token he
Token ID339
Feature activation-9.03e-02

Pos logit contributions
les+0.092
agging+0.086
ors+0.084
?"+0.081
ening+0.081
Neg logit contributions
comb-0.089
Jud-0.078
cr-0.076
gan-0.067
cious-0.066
Token says
Token ID1139
Feature activation-4.27e-02

Pos logit contributions
agging+0.139
ors+0.137
ening+0.132
ler+0.127
les+0.126
Neg logit contributions
comb-0.149
Jud-0.138
cr-0.135
gan-0.126
 themselves-0.120
Token.
Token ID13
Feature activation-2.82e-02

Pos logit contributions
comb+0.017
cr+0.015
Jud+0.015
cious+0.014
emon+0.014
Neg logit contributions
les-0.005
?"-0.004
agging-0.004
 me-0.004
ors-0.004
Token
Token ID198
Feature activation-2.56e-02

Pos logit contributions
les+0.041
?"+0.038
agging+0.038
 me+0.037
ors+0.037
Neg logit contributions
comb-0.058
Jud-0.051
cr-0.050
cious-0.045
gan-0.044
Token
Token ID198
Feature activation-3.06e-02

Pos logit contributions
les+0.039
agging+0.036
?"+0.035
ors+0.035
 me+0.034
Neg logit contributions
comb-0.051
Jud-0.045
cr-0.044
cious-0.040
gan-0.038
Token"
Token ID1
Feature activation-6.31e-03

Pos logit contributions
les+0.050
?"+0.047
 me+0.046
agging+0.046
ors+0.045
Neg logit contributions
comb-0.058
Jud-0.050
cr-0.049
cious-0.044
gan-0.043
TokenGo
Token ID5247
Feature activation+6.82e-03

Pos logit contributions
 me+0.010
les+0.009
?"+0.008
iren+0.008
 sax+0.008
Neg logit contributions
comb-0.014
cr-0.011
Jud-0.011
cious-0.011
stack-0.010
Token away
Token ID1497
Feature activation-1.53e-01

Pos logit contributions
comb+0.015
Jud+0.014
cr+0.013
 coconut+0.012
gan+0.012
Neg logit contributions
les-0.008
ors-0.008
agging-0.008
ening-0.007
ler-0.007
Token,
Token ID11
Feature activation-1.12e-01

Pos logit contributions
les+0.080
?"+0.073
 me+0.066
agging+0.059
ors+0.057
Neg logit contributions
comb-0.456
Jud-0.418
cr-0.415
cious-0.389
emon-0.381
Token you
Token ID345
Feature activation-3.66e-02

Pos logit contributions
 me+0.156
les+0.152
?"+0.149
agging+0.135
 wears+0.130
Neg logit contributions
comb-0.247
Jud-0.222
cr-0.216
cious-0.196
emon-0.187
Token filthy
Token ID35677
Feature activation-5.05e-02

Pos logit contributions
les+0.052
?"+0.048
 me+0.047
agging+0.047
ors+0.046
Neg logit contributions
comb-0.078
Jud-0.069
cr-0.067
cious-0.061
gan-0.060
Token dog
Token ID3290
Feature activation-5.44e-02

Pos logit contributions
les+0.061
?"+0.055
 me+0.054
agging+0.054
ors+0.053
Neg logit contributions
comb-0.118
Jud-0.106
cr-0.103
cious-0.095
gan-0.093
Token!"
Token ID2474
Feature activation-9.45e-03

Pos logit contributions
les+0.047
agging+0.042
ors+0.038
?"+0.037
ening+0.036
Neg logit contributions
comb-0.143
Jud-0.127
cr-0.127
cious-0.118
 coconut-0.115
Token.
Token ID13
Feature activation-3.46e-02

Pos logit contributions
comb+0.001
Jud+0.001
cr+0.001
cious+0.001
 coconut+0.001
Neg logit contributions
les-0.000
?"-0.000
agging-0.000
 me-0.000
ors-0.000
Token
Token ID198
Feature activation-2.44e-02

Pos logit contributions
les+0.053
agging+0.049
?"+0.048
ors+0.048
 me+0.047
Neg logit contributions
comb-0.069
Jud-0.060
cr-0.059
cious-0.053
gan-0.052
Token
Token ID198
Feature activation-2.79e-02

Pos logit contributions
les+0.039
agging+0.036
ors+0.035
?"+0.034
ler+0.034
Neg logit contributions
comb-0.047
Jud-0.042
cr-0.040
cious-0.036
 Maggie-0.035
Token"
Token ID1
Feature activation-2.30e-03

Pos logit contributions
les+0.040
?"+0.040
 me+0.039
iren+0.037
agging+0.037
Neg logit contributions
comb-0.059
cr-0.049
Jud-0.048
cious-0.044
gan-0.044
TokenGo
Token ID5247
Feature activation+6.69e-03

Pos logit contributions
 me+0.003
iren+0.003
ch+0.003
les+0.003
 sax+0.003
Neg logit contributions
comb-0.006
Jud-0.004
cious-0.004
cr-0.004
ossom-0.004
Token away
Token ID1497
Feature activation-1.52e-01

Pos logit contributions
comb+0.013
cr+0.012
Jud+0.011
lock+0.010
gan+0.010
Neg logit contributions
les-0.009
ors-0.009
agging-0.009
ening-0.009
 channels-0.009
Token!
Token ID0
Feature activation-6.45e-02

Pos logit contributions
les+0.109
agging+0.092
ors+0.084
?"+0.082
 me+0.080
Neg logit contributions
comb-0.431
Jud-0.387
cr-0.381
cious-0.357
gan-0.354
Token Leave
Token ID17446
Feature activation-4.36e-02

Pos logit contributions
les+0.076
?"+0.073
 me+0.070
agging+0.069
ors+0.069
Neg logit contributions
comb-0.152
Jud-0.134
cr-0.129
cious-0.119
 coconut-0.117
Token my
Token ID616
Feature activation-1.17e-02

Pos logit contributions
les+0.044
agging+0.039
?"+0.038
ors+0.037
 me+0.035
Neg logit contributions
comb-0.109
Jud-0.098
cr-0.096
cious-0.089
 coconut-0.088
Token kids
Token ID3988
Feature activation-5.48e-03

Pos logit contributions
les+0.016
?"+0.015
 me+0.015
agging+0.015
ors+0.014
Neg logit contributions
comb-0.026
Jud-0.023
cr-0.022
cious-0.020
emon-0.020
Token alone
Token ID3436
Feature activation-1.15e-01

Pos logit contributions
les+0.006
ors+0.005
agging+0.005
ening+0.005
iren+0.005
Neg logit contributions
comb-0.013
cr-0.012
Jud-0.011
cious-0.011
gan-0.011
Token up
Token ID510
Feature activation-2.90e-02

Pos logit contributions
comb+0.008
cr+0.007
Jud+0.007
 Maggie+0.006
 coconut+0.006
Neg logit contributions
les-0.003
agging-0.003
ors-0.002
ler-0.002
ock-0.002
Token there
Token ID612
Feature activation-2.65e-02

Pos logit contributions
les+0.042
agging+0.037
ors+0.037
ock+0.035
ler+0.034
Neg logit contributions
comb-0.062
Jud-0.054
cr-0.052
 coconut-0.049
 Maggie-0.048
Token?"
Token ID1701
Feature activation-1.86e-02

Pos logit contributions
les+0.016
ors+0.012
agging+0.012
iren+0.011
ler+0.010
Neg logit contributions
comb-0.078
cr-0.071
Jud-0.069
 Maggie-0.065
 coconut-0.065
Token Asked
Token ID26220
Feature activation-5.90e-02

Pos logit contributions
les+0.026
agging+0.024
ors+0.023
?"+0.022
ock+0.021
Neg logit contributions
comb-0.039
Jud-0.035
cr-0.034
cious-0.031
 Maggie-0.031
Token the
Token ID262
Feature activation+1.95e-02

Pos logit contributions
les+0.073
?"+0.069
 me+0.068
ors+0.064
agging+0.064
Neg logit contributions
comb-0.136
Jud-0.119
cr-0.116
cious-0.109
 coconut-0.105
Token gorilla
Token ID45314
Feature activation+1.30e-01

Pos logit contributions
comb+0.043
Jud+0.038
cr+0.037
cious+0.034
emon+0.033
Neg logit contributions
les-0.027
?"-0.025
 me-0.025
agging-0.024
ors-0.022
Token.
Token ID13
Feature activation+2.29e-02

Pos logit contributions
comb+0.352
Jud+0.320
cr+0.313
cious+0.296
emon+0.292
Neg logit contributions
les-0.102
?"-0.091
 me-0.088
agging-0.082
ors-0.080
Token
Token ID198
Feature activation+1.23e-02

Pos logit contributions
comb+0.049
Jud+0.042
cr+0.041
cious+0.038
gan+0.037
Neg logit contributions
les-0.032
?"-0.030
 me-0.029
agging-0.029
ors-0.028
Token
Token ID198
Feature activation+7.50e-03

Pos logit contributions
comb+0.025
Jud+0.022
cr+0.022
cious+0.020
gan+0.019
Neg logit contributions
les-0.018
?"-0.016
agging-0.016
 me-0.016
ors-0.016
Token"
Token ID1
Feature activation+1.73e-02

Pos logit contributions
comb+0.017
cr+0.014
Jud+0.013
 Maggie+0.013
gan+0.012
Neg logit contributions
 me-0.011
?"-0.010
les-0.010
iren-0.009
ors-0.009
TokenI
Token ID40
Feature activation-2.62e-02

Pos logit contributions
comb+0.037
Jud+0.027
gan+0.026
 Maggie+0.025
cious+0.025
Neg logit contributions
?"-0.025
 me-0.024
iren-0.024
 sax-0.024
 trumpet-0.023
Token made
Token ID925
Feature activation+1.30e-02

Pos logit contributions
les+0.017
 me+0.016
?"+0.016
 wears+0.014
 hits+0.013
Neg logit contributions
comb-0.028
Jud-0.025
cr-0.023
cious-0.022
emon-0.021
Token a
Token ID257
Feature activation+1.59e-02

Pos logit contributions
comb+0.029
Jud+0.025
cr+0.024
cious+0.023
stack+0.023
Neg logit contributions
les-0.018
 me-0.018
?"-0.016
agging-0.014
 wrong-0.014
Token face
Token ID1986
Feature activation+9.15e-03

Pos logit contributions
comb+0.038
Jud+0.034
cr+0.033
cious+0.031
gan+0.031
Neg logit contributions
 me-0.018
les-0.017
?"-0.016
agging-0.015
 problem-0.014
Token.
Token ID13
Feature activation+1.08e-02

Pos logit contributions
comb+0.024
Jud+0.023
cr+0.021
cious+0.020
gan+0.020
Neg logit contributions
les-0.008
?"-0.007
 me-0.007
agging-0.006
ening-0.006
Token The
Token ID383
Feature activation+4.79e-03

Pos logit contributions
comb+0.023
Jud+0.020
cr+0.020
cious+0.018
 coconut+0.017
Neg logit contributions
les-0.015
?"-0.014
 me-0.014
agging-0.014
ors-0.013
Token apple
Token ID17180
Feature activation+1.14e-01

Pos logit contributions
comb+0.008
Jud+0.008
cr+0.007
 Maggie+0.007
cious+0.007
Neg logit contributions
 me-0.009
?"-0.008
les-0.008
agging-0.008
 problem-0.008
Token was
Token ID373
Feature activation+3.64e-03

Pos logit contributions
comb+0.249
Jud+0.244
cious+0.223
cr+0.218
Jess+0.212
Neg logit contributions
les-0.145
 me-0.138
?"-0.125
 hits-0.118
ler-0.115
Token not
Token ID407
Feature activation+7.09e-03

Pos logit contributions
comb+0.008
Jud+0.007
cr+0.007
cious+0.006
gan+0.006
Neg logit contributions
les-0.005
?"-0.005
agging-0.005
ors-0.005
 me-0.005
Token y
Token ID331
Feature activation+3.64e-02

Pos logit contributions
comb+0.015
Jud+0.013
cr+0.013
cious+0.012
gan+0.011
Neg logit contributions
les-0.010
?"-0.009
agging-0.009
 me-0.009
ors-0.009
Tokenummy
Token ID13513
Feature activation+8.29e-03

Pos logit contributions
comb+0.083
Jud+0.073
cr+0.072
cious+0.066
gan+0.065
Neg logit contributions
les-0.045
?"-0.041
agging-0.041
 me-0.040
ors-0.040
Token.
Token ID13
Feature activation+2.99e-03

Pos logit contributions
comb+0.023
Jud+0.022
cr+0.021
cious+0.020
Jess+0.019
Neg logit contributions
les-0.006
?"-0.005
 me-0.005
ors-0.005
agging-0.004
Token moved
Token ID3888
Feature activation+2.76e-02

Pos logit contributions
comb+0.066
Jud+0.060
cr+0.056
cious+0.054
 coconut+0.053
Neg logit contributions
les-0.039
 me-0.035
?"-0.034
agging-0.033
ors-0.033
Token and
Token ID290
Feature activation+3.98e-02

Pos logit contributions
comb+0.066
Jud+0.059
cr+0.056
cious+0.054
 coconut+0.053
Neg logit contributions
les-0.032
 me-0.029
agging-0.027
?"-0.027
ors-0.027
Token the
Token ID262
Feature activation+1.59e-02

Pos logit contributions
comb+0.095
Jud+0.084
cr+0.081
cious+0.080
emon+0.077
Neg logit contributions
les-0.045
 me-0.044
?"-0.040
 hits-0.037
ening-0.037
Token apple
Token ID17180
Feature activation+1.25e-01

Pos logit contributions
comb+0.035
Jud+0.033
cr+0.032
cious+0.028
 Maggie+0.028
Neg logit contributions
les-0.021
 me-0.020
 problem-0.018
?"-0.018
 trumpet-0.018
Token fell
Token ID3214
Feature activation+1.20e-02

Pos logit contributions
comb+0.279
Jud+0.266
cious+0.241
cr+0.240
emon+0.230
Neg logit contributions
les-0.154
 me-0.142
 hits-0.125
?"-0.125
ler-0.123
Token right
Token ID826
Feature activation+1.24e-01

Pos logit contributions
comb+0.025
cr+0.022
Jud+0.020
gan+0.019
 themselves+0.018
Neg logit contributions
les-0.017
?"-0.016
agging-0.016
ors-0.016
 wears-0.015
Token down
Token ID866
Feature activation+1.46e-02

Pos logit contributions
comb+0.258
cr+0.226
Jud+0.218
gan+0.205
 Maggie+0.200
Neg logit contributions
les-0.168
ors-0.158
?"-0.155
iren-0.155
agging-0.153
Token into
Token ID656
Feature activation-2.67e-02

Pos logit contributions
comb+0.039
Jud+0.035
cr+0.034
cious+0.033
gan+0.032
Neg logit contributions
les-0.012
?"-0.011
 me-0.011
agging-0.011
ors-0.010
Token the
Token ID262
Feature activation+2.58e-04

Pos logit contributions
 me+0.028
les+0.024
?"+0.023
ening+0.022
 shooting+0.021
Neg logit contributions
comb-0.068
cr-0.059
Jud-0.057
cious-0.057
gan-0.056
Token little
Token ID1310
Feature activation+1.68e-02

Pos logit contributions
comb+0.001
Jud+0.001
cr+0.001
cious+0.000
gan+0.000
Neg logit contributions
les-0.000
?"-0.000
agging-0.000
 me-0.000
ors-0.000
Token girl
Token ID2576
Feature activation+1.20e-02

Pos logit contributions
comb+0.030
Jud+0.025
cr+0.024
cious+0.022
 coconut+0.021
Neg logit contributions
les-0.029
agging-0.028
?"-0.027
ors-0.027
 me-0.026
Token who
Token ID508
Feature activation-3.75e-02

Pos logit contributions
les+0.029
agging+0.027
ors+0.025
 me+0.025
?"+0.024
Neg logit contributions
comb-0.040
Jud-0.035
cr-0.035
cious-0.032
emon-0.031
Token loved
Token ID6151
Feature activation-6.78e-02

Pos logit contributions
les+0.055
agging+0.051
?"+0.050
ors+0.049
 me+0.049
Neg logit contributions
comb-0.077
Jud-0.067
cr-0.066
cious-0.060
gan-0.058
Token to
Token ID284
Feature activation-5.53e-03

Pos logit contributions
les+0.093
agging+0.085
?"+0.084
ors+0.082
 me+0.080
Neg logit contributions
comb-0.144
Jud-0.128
cr-0.126
cious-0.115
 coconut-0.113
Token make
Token ID787
Feature activation-1.76e-02

Pos logit contributions
les+0.009
?"+0.008
agging+0.008
 me+0.008
ors+0.008
Neg logit contributions
comb-0.011
Jud-0.009
cr-0.009
cious-0.008
gan-0.008
Token delicious
Token ID12625
Feature activation+9.10e-02

Pos logit contributions
les+0.024
 me+0.022
?"+0.021
ors+0.020
agging+0.020
Neg logit contributions
comb-0.041
Jud-0.035
cr-0.034
cious-0.032
gan-0.032
Token cakes
Token ID30849
Feature activation+1.09e-01

Pos logit contributions
comb+0.168
Jud+0.149
cr+0.139
gan+0.124
cious+0.123
Neg logit contributions
les-0.162
 me-0.149
?"-0.149
agging-0.142
ors-0.141
Token.
Token ID13
Feature activation+2.12e-02

Pos logit contributions
comb+0.304
Jud+0.301
cr+0.273
 coconut+0.269
cious+0.268
Neg logit contributions
?"-0.069
les-0.068
 me-0.055
ors-0.051
agging-0.050
Token He
Token ID679
Feature activation-9.80e-03

Pos logit contributions
comb+0.047
Jud+0.042
cr+0.041
cious+0.037
gan+0.037
Neg logit contributions
les-0.028
agging-0.025
ors-0.024
?"-0.024
 me-0.024
Token always
Token ID1464
Feature activation-4.46e-02

Pos logit contributions
?"+0.011
les+0.010
 me+0.010
 wears+0.009
ler+0.008
Neg logit contributions
comb-0.024
Jud-0.022
cr-0.021
cious-0.020
 coconut-0.019
Token shared
Token ID4888
Feature activation+2.50e-02

Pos logit contributions
les+0.069
agging+0.065
ors+0.064
?"+0.062
 me+0.062
Neg logit contributions
comb-0.087
Jud-0.076
cr-0.074
cious-0.067
gan-0.065
Token his
Token ID465
Feature activation-2.46e-02

Pos logit contributions
comb+0.060
Jud+0.053
cr+0.051
cious+0.047
gan+0.046
Neg logit contributions
les-0.030
?"-0.028
 me-0.028
agging-0.026
ors-0.025
Token were
Token ID547
Feature activation+2.72e-02

Pos logit contributions
comb+0.152
Jud+0.144
cr+0.129
cious+0.126
gan+0.124
Neg logit contributions
les-0.108
?"-0.092
 me-0.092
ler-0.089
ors-0.089
Token happy
Token ID3772
Feature activation+4.57e-02

Pos logit contributions
comb+0.059
Jud+0.051
cr+0.050
cious+0.046
 coconut+0.045
Neg logit contributions
les-0.037
 me-0.035
?"-0.035
agging-0.035
ors-0.033
Token.
Token ID13
Feature activation+2.68e-02

Pos logit contributions
comb+0.116
Jud+0.104
cr+0.101
cious+0.096
 coconut+0.093
Neg logit contributions
les-0.046
?"-0.041
 me-0.040
agging-0.040
ors-0.038
Token
Token ID198
Feature activation+1.80e-02

Pos logit contributions
comb+0.061
Jud+0.052
cr+0.052
emon+0.048
cious+0.047
Neg logit contributions
 me-0.034
?"-0.033
les-0.033
ors-0.031
 wears-0.030
Token
Token ID198
Feature activation+1.25e-02

Pos logit contributions
comb+0.042
cr+0.034
Jud+0.033
gan+0.033
cious+0.032
Neg logit contributions
 me-0.024
?"-0.023
les-0.023
agging-0.021
ock-0.020
TokenM
Token ID44
Feature activation+1.21e-01

Pos logit contributions
comb+0.027
cr+0.022
gan+0.022
mons+0.022
cious+0.021
Neg logit contributions
 me-0.019
les-0.017
?"-0.017
ock-0.016
ler-0.016
Tokenoral
Token ID6864
Feature activation+8.70e-02

Pos logit contributions
comb+0.254
Jud+0.223
cr+0.218
cious+0.199
gan+0.194
Neg logit contributions
les-0.172
?"-0.158
agging-0.157
 me-0.155
ors-0.154
Token of
Token ID286
Feature activation-1.39e-03

Pos logit contributions
comb+0.168
Jud+0.156
cr+0.147
emon+0.136
 coconut+0.131
Neg logit contributions
les-0.129
ors-0.119
agging-0.117
ening-0.111
ler-0.110
Token the
Token ID262
Feature activation-5.90e-02

Pos logit contributions
les+0.002
?"+0.002
 me+0.002
agging+0.002
ors+0.002
Neg logit contributions
comb-0.003
Jud-0.003
cr-0.003
cious-0.002
gan-0.002
Token story
Token ID1621
Feature activation+2.74e-02

Pos logit contributions
les+0.070
ors+0.063
agging+0.062
?"+0.062
 me+0.059
Neg logit contributions
comb-0.136
Jud-0.123
cr-0.119
cious-0.109
 coconut-0.107
Token:
Token ID25
Feature activation+3.15e-02

Pos logit contributions
comb+0.059
Jud+0.052
cr+0.051
cious+0.046
gan+0.046
Neg logit contributions
les-0.037
agging-0.034
ors-0.032
?"-0.032
 me-0.032
Token to
Token ID284
Feature activation+1.47e-02

Pos logit contributions
comb+0.054
Jud+0.048
cr+0.047
cious+0.043
 coconut+0.042
Neg logit contributions
les-0.027
?"-0.024
 me-0.024
agging-0.024
ors-0.024
Token see
Token ID766
Feature activation-1.96e-02

Pos logit contributions
comb+0.040
cr+0.035
Jud+0.035
cious+0.033
emon+0.033
Neg logit contributions
 me-0.014
?"-0.013
les-0.012
 wears-0.010
ors-0.010
Token what
Token ID644
Feature activation+4.97e-02

Pos logit contributions
les+0.021
?"+0.021
 me+0.019
ors+0.018
 wears+0.017
Neg logit contributions
comb-0.050
Jud-0.044
cr-0.042
cious-0.040
gan-0.040
Token was
Token ID373
Feature activation-1.15e-02

Pos logit contributions
comb+0.114
Jud+0.099
cr+0.097
gan+0.089
cious+0.089
Neg logit contributions
les-0.063
?"-0.060
 me-0.058
 wears-0.054
ors-0.053
Token inside
Token ID2641
Feature activation+7.34e-03

Pos logit contributions
les+0.022
agging+0.021
iren+0.020
ler+0.020
ock+0.020
Neg logit contributions
comb-0.017
cr-0.014
Jud-0.014
cious-0.013
emon-0.012
Token.
Token ID13
Feature activation+1.68e-02

Pos logit contributions
comb+0.020
Jud+0.018
cr+0.018
cious+0.016
gan+0.016
Neg logit contributions
les-0.006
?"-0.006
 me-0.005
agging-0.005
ors-0.005
Token
Token ID198
Feature activation+1.83e-03

Pos logit contributions
comb+0.040
Jud+0.034
cr+0.033
cious+0.030
 coconut+0.030
Neg logit contributions
les-0.021
 me-0.020
?"-0.020
 wears-0.018
ors-0.017
Token
Token ID198
Feature activation-4.15e-03

Pos logit contributions
comb+0.004
Jud+0.003
cr+0.003
cious+0.003
 coconut+0.003
Neg logit contributions
les-0.003
?"-0.002
 me-0.002
agging-0.002
ors-0.002
Token"
Token ID1
Feature activation+5.99e-03

Pos logit contributions
 me+0.005
les+0.005
?"+0.005
agging+0.004
ler+0.004
Neg logit contributions
comb-0.010
Jud-0.008
cr-0.008
 coconut-0.008
stack-0.008
TokenMaybe
Token ID13300
Feature activation+1.10e-02

Pos logit contributions
comb+0.014
ossom+0.011
agg+0.011
gan+0.010
apa+0.010
Neg logit contributions
 me-0.006
?"-0.005
 sax-0.005
 Teddy-0.005
 volume-0.005
Token it
Token ID340
Feature activation-2.64e-02

Pos logit contributions
comb+0.029
Jud+0.026
cr+0.025
cious+0.023
aney+0.023
Neg logit contributions
?"-0.012
 me-0.011
les-0.009
?-0.009
 my-0.009
Token
Token ID220
Feature activation-3.74e-02

Pos logit contributions
comb+0.067
Jud+0.059
cr+0.058
cious+0.053
gan+0.052
Neg logit contributions
les-0.043
?"-0.039
agging-0.039
 me-0.038
ors-0.038
Token
Token ID198
Feature activation+2.31e-02

Pos logit contributions
les+0.045
?"+0.044
 me+0.042
ors+0.040
agging+0.040
Neg logit contributions
comb-0.088
Jud-0.076
cr-0.074
cious-0.070
 coconut-0.069
Token
Token ID198
Feature activation+1.71e-02

Pos logit contributions
comb+0.050
Jud+0.044
cr+0.043
cious+0.039
 coconut+0.038
Neg logit contributions
les-0.032
?"-0.030
 me-0.029
agging-0.029
ors-0.028
TokenThey
Token ID2990
Feature activation+5.06e-02

Pos logit contributions
comb+0.035
Jud+0.031
cr+0.030
 Judy+0.027
cious+0.027
Neg logit contributions
les-0.026
 me-0.026
?"-0.024
agging-0.024
ler-0.023
Token kept
Token ID4030
Feature activation-1.12e-02

Pos logit contributions
comb+0.119
Jud+0.105
cr+0.102
cious+0.096
mons+0.094
Neg logit contributions
les-0.063
 me-0.060
?"-0.054
 wears-0.051
ler-0.049
Token playing
Token ID2712
Feature activation+2.98e-02

Pos logit contributions
les+0.013
?"+0.012
ors+0.011
agging+0.011
ler+0.011
Neg logit contributions
comb-0.026
Jud-0.023
cr-0.023
cious-0.021
 Maggie-0.021
Token until
Token ID1566
Feature activation+2.21e-02

Pos logit contributions
comb+0.071
Jud+0.063
cr+0.061
cious+0.057
gan+0.056
Neg logit contributions
les-0.035
?"-0.032
agging-0.032
 me-0.031
ors-0.030
Token the
Token ID262
Feature activation+2.74e-02

Pos logit contributions
comb+0.053
Jud+0.048
cr+0.047
cious+0.044
emon+0.042
Neg logit contributions
les-0.025
?"-0.024
 me-0.023
agging-0.021
ler-0.020
Token sun
Token ID4252
Feature activation+6.11e-02

Pos logit contributions
comb+0.067
Jud+0.061
cr+0.059
cious+0.054
 Maggie+0.052
Neg logit contributions
les-0.032
 me-0.031
agging-0.030
?"-0.028
 trumpet-0.027
Token started
Token ID2067
Feature activation+2.80e-02

Pos logit contributions
comb+0.135
Jud+0.124
cr+0.116
cious+0.112
 coconut+0.110
Neg logit contributions
les-0.087
 me-0.075
?"-0.072
ler-0.068
ors-0.067
Token to
Token ID284
Feature activation+4.89e-02

Pos logit contributions
comb+0.060
Jud+0.053
cr+0.053
cious+0.046
gan+0.046
Neg logit contributions
les-0.038
?"-0.035
agging-0.035
ors-0.034
 me-0.033
Token
Token ID198
Feature activation+8.59e-03

Pos logit contributions
comb+0.028
Jud+0.025
cr+0.024
cious+0.021
gan+0.021
Neg logit contributions
les-0.022
agging-0.020
?"-0.020
ors-0.020
 me-0.019
TokenOne
Token ID3198
Feature activation+2.36e-02

Pos logit contributions
comb+0.018
cr+0.013
Jud+0.013
 coconut+0.012
 Maggie+0.012
Neg logit contributions
 me-0.014
les-0.013
?"-0.013
iren-0.013
agging-0.013
Token of
Token ID286
Feature activation+3.83e-02

Pos logit contributions
comb+0.068
cr+0.061
Jud+0.060
cious+0.059
 themselves+0.058
Neg logit contributions
?"-0.019
 me-0.017
 problem-0.017
les-0.015
?-0.014
Token the
Token ID262
Feature activation-1.06e-02

Pos logit contributions
comb+0.094
Jud+0.080
cr+0.079
cious+0.075
gan+0.072
Neg logit contributions
les-0.045
 me-0.042
?"-0.042
agging-0.039
ors-0.038
Token friends
Token ID2460
Feature activation+2.38e-02

Pos logit contributions
 me+0.013
 problem+0.013
?"+0.012
les+0.011
 wrong+0.011
Neg logit contributions
comb-0.025
cr-0.023
Jud-0.022
cious-0.021
 Maggie-0.019
Token wanted
Token ID2227
Feature activation+3.43e-02

Pos logit contributions
comb+0.057
Jud+0.049
cr+0.048
cious+0.045
 coconut+0.043
Neg logit contributions
les-0.029
 me-0.029
?"-0.029
ors-0.025
 wears-0.024
Token the
Token ID262
Feature activation-7.93e-03

Pos logit contributions
comb+0.097
Jud+0.083
cr+0.080
cious+0.079
gan+0.076
Neg logit contributions
 me-0.030
?"-0.029
les-0.026
 wrong-0.025
ors-0.023
Token big
Token ID1263
Feature activation-5.57e-03

Pos logit contributions
les+0.016
?"+0.015
 me+0.015
agging+0.014
 problem+0.014
Neg logit contributions
comb-0.012
Jud-0.010
cr-0.010
cious-0.009
gan-0.008
Token peach
Token ID47565
Feature activation+9.18e-02

Pos logit contributions
les+0.011
agging+0.010
?"+0.010
ors+0.010
 me+0.010
Neg logit contributions
comb-0.009
Jud-0.007
cr-0.007
cious-0.006
 coconut-0.006
Token,
Token ID11
Feature activation+6.96e-03

Pos logit contributions
comb+0.263
Jud+0.256
cious+0.235
Jess+0.233
cr+0.231
Neg logit contributions
?"-0.053
 me-0.048
les-0.047
 wrong-0.039
?-0.036
Token but
Token ID475
Feature activation+3.09e-03

Pos logit contributions
comb+0.018
Jud+0.016
cr+0.016
cious+0.015
gan+0.014
Neg logit contributions
les-0.007
?"-0.006
 me-0.006
agging-0.006
ors-0.006
Token planets
Token ID14705
Feature activation-1.71e-02

Pos logit contributions
comb+0.013
cr+0.011
Jud+0.011
 themselves+0.011
 coconut+0.010
Neg logit contributions
agging-0.011
iren-0.011
arl-0.011
les-0.011
ors-0.011
Token around
Token ID1088
Feature activation-5.23e-02

Pos logit contributions
les+0.018
agging+0.017
?"+0.016
 me+0.016
ors+0.016
Neg logit contributions
comb-0.042
Jud-0.037
cr-0.037
cious-0.034
gan-0.033
Token her
Token ID607
Feature activation-2.34e-02

Pos logit contributions
les+0.083
agging+0.081
iren+0.076
ler+0.075
?"+0.075
Neg logit contributions
comb-0.097
Jud-0.085
cr-0.080
 Maggie-0.075
 themselves-0.072
Token.
Token ID13
Feature activation+3.23e-02

Pos logit contributions
les+0.031
iren+0.031
agging+0.028
ock+0.027
?"+0.027
Neg logit contributions
comb-0.050
cr-0.042
cious-0.041
Jud-0.040
 themselves-0.039
Token But
Token ID887
Feature activation+6.76e-03

Pos logit contributions
comb+0.075
Jud+0.067
cr+0.066
 coconut+0.060
cious+0.060
Neg logit contributions
 me-0.038
les-0.037
?"-0.035
ors-0.035
agging-0.033
Token then
Token ID788
Feature activation+1.28e-02

Pos logit contributions
comb+0.016
Jud+0.015
cr+0.015
 coconut+0.013
cious+0.013
Neg logit contributions
 me-0.008
?"-0.007
les-0.006
 wears-0.006
,-0.006
Token she
Token ID673
Feature activation-2.86e-02

Pos logit contributions
comb+0.035
cr+0.032
Jud+0.031
 coconut+0.029
gan+0.029
Neg logit contributions
les-0.010
 me-0.009
?"-0.009
 wears-0.007
ors-0.007
Token saw
Token ID2497
Feature activation+1.63e-02

Pos logit contributions
les+0.043
?"+0.039
agging+0.039
 me+0.039
ors+0.038
Neg logit contributions
comb-0.058
Jud-0.051
cr-0.050
cious-0.045
gan-0.044
Token a
Token ID257
Feature activation+1.42e-02

Pos logit contributions
comb+0.032
Jud+0.027
cr+0.027
cious+0.024
gan+0.024
Neg logit contributions
les-0.026
?"-0.024
 me-0.024
agging-0.023
ors-0.023
Token bright
Token ID6016
Feature activation-9.12e-03

Pos logit contributions
comb+0.026
Jud+0.023
cr+0.021
 coconut+0.021
cious+0.020
Neg logit contributions
ors-0.024
les-0.023
agging-0.023
iren-0.021
ag-0.021
Token,
Token ID11
Feature activation+2.02e-02

Pos logit contributions
ors+0.016
agging+0.015
iren+0.014
inches+0.014
ches+0.013
Neg logit contributions
comb-0.014
 coconut-0.012
 tad-0.011
 Maggie-0.011
cr-0.011
TokenOne
Token ID3198
Feature activation+2.87e-02

Pos logit contributions
comb+0.029
cr+0.025
gan+0.022
 coconut+0.022
 pulp+0.021
Neg logit contributions
 me-0.012
?"-0.010
ines-0.009
les-0.009
ors-0.008
Token day
Token ID1110
Feature activation-2.49e-02

Pos logit contributions
comb+0.087
cr+0.078
Jud+0.076
 themselves+0.074
stack+0.073
Neg logit contributions
?"-0.018
 me-0.016
les-0.013
?-0.013
ie-0.013
Token,
Token ID11
Feature activation-5.35e-03

Pos logit contributions
,+0.000
?"-0.000
 me-0.000
les-0.001
 problem-0.005
Neg logit contributions
comb-0.085
cr-0.079
Jud-0.077
cious-0.075
 coconut-0.074
Token Jan
Token ID2365
Feature activation+8.34e-03

Pos logit contributions
 me+0.007
,+0.007
?"+0.007
 everywhere+0.006
 wrong+0.006
Neg logit contributions
comb-0.011
cr-0.010
emon-0.010
cious-0.010
ocado-0.009
Token decided
Token ID3066
Feature activation+7.38e-02

Pos logit contributions
comb+0.019
cr+0.016
Jud+0.016
cious+0.015
 coconut+0.015
Neg logit contributions
les-0.012
 me-0.010
?"-0.010
ors-0.010
agging-0.009
Token to
Token ID284
Feature activation+1.84e-02

Pos logit contributions
comb+0.176
Jud+0.155
cr+0.153
cious+0.141
gan+0.138
Neg logit contributions
les-0.084
?"-0.080
 me-0.079
agging-0.074
ors-0.074
Token tell
Token ID1560
Feature activation+1.97e-02

Pos logit contributions
comb+0.040
Jud+0.035
cr+0.034
cious+0.033
gan+0.031
Neg logit contributions
 me-0.025
?"-0.025
les-0.024
ors-0.021
 trumpet-0.021
Token Joy
Token ID14087
Feature activation+4.14e-02

Pos logit contributions
comb+0.045
cr+0.038
cious+0.036
Jud+0.036
ocado+0.034
Neg logit contributions
 me-0.028
?"-0.025
les-0.024
ors-0.023
 wrong-0.022
Token the
Token ID262
Feature activation-4.69e-03

Pos logit contributions
comb+0.109
Jud+0.097
cr+0.096
cious+0.087
 coconut+0.087
Neg logit contributions
les-0.038
?"-0.036
 me-0.036
ors-0.030
 wrong-0.030
Token truth
Token ID3872
Feature activation+2.39e-02

Pos logit contributions
 me+0.005
?"+0.005
les+0.005
agging+0.005
 trumpet+0.005
Neg logit contributions
comb-0.011
Jud-0.010
cr-0.010
cious-0.009
gan-0.009
Token.
Token ID13
Feature activation+2.86e-02

Pos logit contributions
comb+0.067
Jud+0.062
cr+0.060
 coconut+0.057
cious+0.057
Neg logit contributions
les-0.017
?"-0.016
 me-0.014
ors-0.014
agging-0.013
Token girl
Token ID2576
Feature activation-4.34e-02

Pos logit contributions
les+0.047
?"+0.044
agging+0.043
 me+0.043
ors+0.042
Neg logit contributions
comb-0.068
Jud-0.060
cr-0.059
cious-0.053
gan-0.052
Token,"
Token ID553
Feature activation-1.76e-02

Pos logit contributions
les+0.045
agging+0.040
ors+0.038
 me+0.036
iren+0.035
Neg logit contributions
comb-0.108
cr-0.094
Jud-0.094
cious-0.088
gan-0.086
Token her
Token ID607
Feature activation-1.13e-02

Pos logit contributions
les+0.024
?"+0.022
 me+0.022
agging+0.021
ors+0.020
Neg logit contributions
comb-0.039
Jud-0.034
cr-0.033
cious-0.031
gan-0.030
Token dad
Token ID9955
Feature activation-1.07e-02

Pos logit contributions
les+0.021
?"+0.021
agging+0.020
 me+0.020
ors+0.020
Neg logit contributions
comb-0.019
Jud-0.016
cr-0.015
cious-0.013
emon-0.013
Token said
Token ID531
Feature activation-1.60e-02

Pos logit contributions
les+0.011
?"+0.010
 me+0.010
agging+0.010
ors+0.010
Neg logit contributions
comb-0.026
Jud-0.024
cr-0.023
cious-0.022
 coconut-0.021
Token.
Token ID13
Feature activation+2.18e-04

Pos logit contributions
les+0.013
?"+0.012
agging+0.011
 me+0.011
ors+0.011
Neg logit contributions
comb-0.043
Jud-0.039
cr-0.038
cious-0.036
 coconut-0.036
Token "
Token ID366
Feature activation+1.48e-02

Pos logit contributions
comb+0.000
Jud+0.000
cr+0.000
cious+0.000
gan+0.000
Neg logit contributions
les-0.000
agging-0.000
?"-0.000
ors-0.000
 me-0.000
TokenAlways
Token ID30374
Feature activation-5.23e-02

Pos logit contributions
comb+0.029
Jud+0.023
cr+0.022
 coconut+0.022
cious+0.021
Neg logit contributions
les-0.023
 me-0.022
ock-0.022
?"-0.022
agging-0.021
Token remember
Token ID3505
Feature activation+5.90e-03

Pos logit contributions
les+0.094
agging+0.086
ors+0.086
?"+0.084
iren+0.082
Neg logit contributions
comb-0.091
Jud-0.077
cr-0.076
cious-0.067
 coconut-0.065
Token to
Token ID284
Feature activation+1.48e-03

Pos logit contributions
comb+0.013
Jud+0.011
cr+0.011
 Maggie+0.010
cious+0.010
Neg logit contributions
les-0.008
agging-0.007
ler-0.007
ors-0.007
iren-0.007
Token keep
Token ID1394
Feature activation-6.12e-02

Pos logit contributions
comb+0.003
Jud+0.002
cr+0.002
cious+0.002
gan+0.002
Neg logit contributions
les-0.002
?"-0.002
 me-0.002
agging-0.002
ors-0.002
Tokenmy
Token ID1820
Feature activation-3.23e-02

Pos logit contributions
les+0.002
ors+0.002
agging+0.002
iren+0.002
brate+0.002
Neg logit contributions
comb-0.003
Jud-0.003
cr-0.003
gan-0.003
 coconut-0.003
Token.
Token ID13
Feature activation-9.34e-03

Pos logit contributions
les+0.042
agging+0.038
ors+0.036
iren+0.036
 me+0.035
Neg logit contributions
comb-0.069
Jud-0.063
cr-0.062
cious-0.056
 coconut-0.055
Token Tim
Token ID5045
Feature activation+8.84e-03

Pos logit contributions
les+0.015
?"+0.014
agging+0.014
 me+0.014
ors+0.014
Neg logit contributions
comb-0.018
Jud-0.015
cr-0.015
cious-0.014
gan-0.013
Tokenmy
Token ID1820
Feature activation-2.69e-02

Pos logit contributions
comb+0.018
Jud+0.016
cr+0.016
 coconut+0.014
gan+0.014
Neg logit contributions
agging-0.012
les-0.012
ors-0.011
iren-0.011
 concert-0.010
Token loved
Token ID6151
Feature activation-6.39e-02

Pos logit contributions
les+0.044
agging+0.042
ors+0.040
iren+0.039
ening+0.038
Neg logit contributions
comb-0.050
Jud-0.043
cr-0.043
cious-0.040
gan-0.037
Token to
Token ID284
Feature activation-6.08e-03

Pos logit contributions
les+0.101
agging+0.091
ler+0.089
ors+0.087
?"+0.085
Neg logit contributions
comb-0.120
Jud-0.108
cr-0.107
 coconut-0.099
cious-0.096
Token run
Token ID1057
Feature activation-6.26e-03

Pos logit contributions
les+0.008
agging+0.008
?"+0.008
 me+0.007
ors+0.007
Neg logit contributions
comb-0.013
Jud-0.011
cr-0.011
cious-0.010
 coconut-0.010
Token and
Token ID290
Feature activation-5.54e-03

Pos logit contributions
ucker+0.012
inkle+0.011
 dent+0.011
ening+0.011
agging+0.010
Neg logit contributions
 themselves-0.007
cr-0.006
comb-0.005
hand-0.005
Maybe-0.005
Token play
Token ID711
Feature activation-4.64e-02

Pos logit contributions
les+0.006
?"+0.006
 me+0.006
agging+0.006
ors+0.006
Neg logit contributions
comb-0.013
Jud-0.012
cr-0.011
cious-0.011
gan-0.010
Token outside
Token ID2354
Feature activation-5.48e-02

Pos logit contributions
ening+0.067
les+0.065
 wears+0.064
iren+0.063
ors+0.061
Neg logit contributions
comb-0.084
cr-0.074
 themselves-0.069
Jud-0.069
 mirac-0.065
Token.
Token ID13
Feature activation+1.82e-02

Pos logit contributions
les+0.077
agging+0.066
iren+0.066
ors+0.065
 wears+0.063
Neg logit contributions
comb-0.113
cr-0.102
Jud-0.093
 themselves-0.090
gan-0.088
Token And
Token ID843
Feature activation-1.09e-03

Pos logit contributions
?"+0.015
les+0.015
 me+0.015
ors+0.014
agging+0.013
Neg logit contributions
comb-0.031
Jud-0.027
cr-0.026
cious-0.025
emon-0.024
Token they
Token ID484
Feature activation-1.31e-02

Pos logit contributions
 me+0.001
?"+0.001
 wears+0.001
,+0.001
 Teddy+0.001
Neg logit contributions
comb-0.003
Jud-0.002
cious-0.002
cr-0.002
emon-0.002
Token all
Token ID477
Feature activation-3.52e-02

Pos logit contributions
 me+0.014
?"+0.013
les+0.013
 everywhere+0.011
 wears+0.011
Neg logit contributions
comb-0.031
Jud-0.029
cr-0.028
cious-0.028
emon-0.027
Token lived
Token ID5615
Feature activation-3.23e-02

Pos logit contributions
les+0.048
?"+0.046
 me+0.045
 wears+0.042
ler+0.040
Neg logit contributions
comb-0.078
Jud-0.068
cr-0.066
cious-0.062
mons-0.060
Token happily
Token ID18177
Feature activation-5.12e-02

Pos logit contributions
?"+0.040
 me+0.039
,+0.035
agging+0.034
les+0.034
Neg logit contributions
comb-0.076
Jud-0.068
cious-0.067
cr-0.063
emon-0.063
Token ever
Token ID1683
Feature activation+9.46e-03

Pos logit contributions
agging+0.086
les+0.083
?"+0.082
 me+0.080
lers+0.078
Neg logit contributions
comb-0.094
Jud-0.085
 coconut-0.072
cr-0.071
 Maggie-0.069
Token after
Token ID706
Feature activation+3.10e-02

Pos logit contributions
comb+0.018
Jud+0.015
cr+0.015
cious+0.013
gan+0.013
Neg logit contributions
les-0.016
?"-0.015
 me-0.015
agging-0.015
ors-0.014
Token.
Token ID13
Feature activation-4.39e-03

Pos logit contributions
comb+0.080
Jud+0.072
cr+0.071
cious+0.066
 coconut+0.065
Neg logit contributions
les-0.029
agging-0.025
?"-0.025
 me-0.024
ors-0.024
Token<|endoftext|>
Token ID50256
Feature activation-8.40e-02

Pos logit contributions
?"+0.004
 me+0.004
les+0.004
 wears+0.003
ors+0.003
Neg logit contributions
comb-0.011
Jud-0.010
cr-0.009
cious-0.009
gan-0.009
TokenOne
Token ID3198
Feature activation-1.39e-02

Pos logit contributions
les+0.128
agging+0.117
ors+0.116
?"+0.115
 me+0.114
Neg logit contributions
comb-0.166
Jud-0.146
cr-0.143
cious-0.128
gan-0.125
Token day
Token ID1110
Feature activation-6.04e-02

Pos logit contributions
?"+0.014
les+0.013
 me+0.013
agging+0.011
 wears+0.011
Neg logit contributions
comb-0.036
Jud-0.032
cr-0.032
cious-0.030
stack-0.028
Token the
Token ID262
Feature activation-3.39e-03

Pos logit contributions
comb+0.054
Jud+0.048
cr+0.047
cious+0.044
gan+0.043
Neg logit contributions
les-0.030
 me-0.030
?"-0.029
agging-0.026
ock-0.025
Token other
Token ID584
Feature activation+5.70e-02

Pos logit contributions
 me+0.005
les+0.004
?"+0.004
 trumpet+0.004
 wrong+0.004
Neg logit contributions
comb-0.008
Jud-0.007
cr-0.007
cious-0.006
emon-0.006
Token and
Token ID290
Feature activation+1.44e-02

Pos logit contributions
comb+0.126
Jud+0.112
cr+0.105
cious+0.097
emon+0.095
Neg logit contributions
 me-0.083
les-0.082
?"-0.076
?-0.069
ock-0.068
Token he
Token ID339
Feature activation-4.77e-02

Pos logit contributions
comb+0.033
Jud+0.030
cr+0.028
emon+0.027
cious+0.027
Neg logit contributions
 me-0.018
les-0.017
?"-0.016
 wears-0.016
 trumpet-0.015
Token had
Token ID550
Feature activation+4.98e-03

Pos logit contributions
les+0.073
?"+0.067
agging+0.067
 me+0.066
ors+0.065
Neg logit contributions
comb-0.095
Jud-0.083
cr-0.081
cious-0.073
gan-0.071
Token so
Token ID523
Feature activation+4.09e-03

Pos logit contributions
comb+0.011
Jud+0.009
cr+0.009
cious+0.008
gan+0.008
Neg logit contributions
les-0.007
 me-0.006
?"-0.006
agging-0.006
ors-0.006
Token much
Token ID881
Feature activation+2.65e-02

Pos logit contributions
comb+0.009
Jud+0.008
cr+0.008
cious+0.007
gan+0.007
Neg logit contributions
les-0.006
?"-0.005
agging-0.005
 me-0.005
ors-0.005
Token fun
Token ID1257
Feature activation+1.90e-02

Pos logit contributions
comb+0.054
Jud+0.051
cr+0.050
 coconut+0.046
emon+0.045
Neg logit contributions
ors-0.035
les-0.034
ler-0.031
agging-0.031
?"-0.030
Token.
Token ID13
Feature activation+4.22e-02

Pos logit contributions
comb+0.044
cr+0.039
Jud+0.038
cious+0.035
 Maggie+0.035
Neg logit contributions
les-0.021
?"-0.019
agging-0.019
 me-0.019
iren-0.019
Token
Token ID198
Feature activation+2.28e-02

Pos logit contributions
comb+0.106
Jud+0.091
cr+0.090
cious+0.085
emon+0.082
Neg logit contributions
les-0.043
 me-0.042
?"-0.042
ors-0.039
 wears-0.038
Token
Token ID198
Feature activation+1.53e-02

Pos logit contributions
comb+0.057
cr+0.048
Jud+0.047
cious+0.045
gan+0.045
Neg logit contributions
?"-0.025
les-0.024
 me-0.024
agging-0.021
iren-0.021
Token so
Token ID523
Feature activation+1.58e-02

Pos logit contributions
comb+0.021
Jud+0.019
cr+0.018
mons+0.017
cious+0.017
Neg logit contributions
?"-0.006
 me-0.006
les-0.005
agging-0.005
 wears-0.004
Token proud
Token ID6613
Feature activation+5.11e-03

Pos logit contributions
comb+0.035
Jud+0.031
cr+0.030
emon+0.027
cious+0.027
Neg logit contributions
?"-0.020
les-0.019
 me-0.019
agging-0.018
 wears-0.018
Token of
Token ID286
Feature activation+4.54e-02

Pos logit contributions
comb+0.011
Jud+0.009
cr+0.009
 Maggie+0.009
cious+0.008
Neg logit contributions
les-0.007
agging-0.007
ors-0.006
?"-0.006
iren-0.006
Token what
Token ID644
Feature activation+6.17e-02

Pos logit contributions
comb+0.093
cr+0.081
Jud+0.080
cious+0.073
 coconut+0.071
Neg logit contributions
 me-0.067
les-0.066
?"-0.065
agging-0.064
ors-0.061
Token she
Token ID673
Feature activation-2.40e-02

Pos logit contributions
comb+0.138
cr+0.120
Jud+0.116
cious+0.108
gan+0.108
Neg logit contributions
?"-0.082
les-0.080
 me-0.080
 wears-0.074
?-0.069
Token had
Token ID550
Feature activation+1.00e-02

Pos logit contributions
les+0.041
agging+0.039
ors+0.038
?"+0.037
 me+0.036
Neg logit contributions
comb-0.043
Jud-0.038
cr-0.036
cious-0.033
gan-0.032
Token done
Token ID1760
Feature activation+1.74e-02

Pos logit contributions
comb+0.020
Jud+0.018
cr+0.017
cious+0.015
gan+0.015
Neg logit contributions
les-0.015
agging-0.014
ors-0.014
iren-0.013
ler-0.013
Token and
Token ID290
Feature activation+2.38e-02

Pos logit contributions
comb+0.046
Jud+0.041
cr+0.040
cious+0.038
 coconut+0.037
Neg logit contributions
les-0.016
?"-0.014
agging-0.014
 me-0.013
ors-0.013
Token hugged
Token ID41130
Feature activation-3.75e-02

Pos logit contributions
comb+0.058
Jud+0.051
cr+0.049
cious+0.047
 coconut+0.047
Neg logit contributions
 me-0.027
les-0.025
?"-0.024
 wears-0.023
 bl-0.022
Token her
Token ID607
Feature activation-2.38e-03

Pos logit contributions
les+0.057
agging+0.053
?"+0.053
ors+0.051
iren+0.051
Neg logit contributions
comb-0.074
Jud-0.065
cr-0.062
cious-0.057
 Maggie-0.056
Token daddy
Token ID35822
Feature activation-4.05e-02

Pos logit contributions
les+0.004
?"+0.004
agging+0.004
 me+0.004
ors+0.004
Neg logit contributions
comb-0.004
Jud-0.004
cr-0.004
cious-0.003
gan-0.003
Token Mia
Token ID32189
Feature activation+1.41e-02

Pos logit contributions
 me+0.030
?"+0.027
 wears+0.024
 my+0.024
les+0.024
Neg logit contributions
comb-0.045
cr-0.038
cious-0.037
Jud-0.036
mons-0.035
Token say
Token ID910
Feature activation+4.18e-03

Pos logit contributions
comb+0.029
Jud+0.026
cr+0.025
cious+0.024
emon+0.023
Neg logit contributions
les-0.021
 me-0.020
?"-0.020
agging-0.018
 wears-0.017
Token,
Token ID11
Feature activation-3.48e-02

Pos logit contributions
comb+0.012
Jud+0.011
cr+0.011
cious+0.010
gan+0.010
Neg logit contributions
les-0.003
?"-0.002
agging-0.002
 me-0.002
ors-0.002
Token "
Token ID366
Feature activation-1.60e-02

Pos logit contributions
?"+0.038
 me+0.036
les+0.036
agging+0.036
 wears+0.035
Neg logit contributions
comb-0.082
Jud-0.076
cr-0.071
 coconut-0.068
emon-0.066
TokenOK
Token ID11380
Feature activation-4.39e-02

Pos logit contributions
 me+0.027
les+0.027
?"+0.027
ler+0.026
agging+0.025
Neg logit contributions
comb-0.028
Jud-0.024
cr-0.021
 Maggie-0.020
 Judy-0.020
Token,
Token ID11
Feature activation-8.76e-02

Pos logit contributions
les+0.043
agging+0.037
ors+0.037
iren+0.035
ler+0.034
Neg logit contributions
comb-0.112
cr-0.098
Jud-0.098
gan-0.094
 Maggie-0.091
Token Mom
Token ID11254
Feature activation-6.81e-03

Pos logit contributions
 me+0.152
?"+0.150
les+0.138
 wears+0.132
 my+0.129
Neg logit contributions
comb-0.158
Jud-0.148
cr-0.135
cious-0.128
 coconut-0.117
Token.
Token ID13
Feature activation-8.05e-02

Pos logit contributions
les+0.008
agging+0.007
ors+0.007
iren+0.007
ler+0.007
Neg logit contributions
comb-0.016
cr-0.014
Jud-0.014
gan-0.013
cious-0.013
Token We
Token ID775
Feature activation+3.48e-03

Pos logit contributions
?"+0.100
 me+0.093
les+0.090
 wears+0.089
ors+0.085
Neg logit contributions
comb-0.182
Jud-0.171
cr-0.156
 coconut-0.146
cious-0.145
Token will
Token ID481
Feature activation-2.74e-02

Pos logit contributions
comb+0.008
Jud+0.007
cr+0.007
cious+0.007
gan+0.007
Neg logit contributions
les-0.004
?"-0.004
 me-0.004
agging-0.003
 wears-0.003
Token sort
Token ID3297
Feature activation-5.48e-02

Pos logit contributions
les+0.048
agging+0.043
ors+0.043
ler+0.042
iren+0.042
Neg logit contributions
comb-0.051
Jud-0.043
cr-0.042
cious-0.038
gan-0.037
Token.
Token ID13
Feature activation-3.16e-03

Pos logit contributions
les+0.023
agging+0.020
ors+0.019
 me+0.019
?"+0.019
Neg logit contributions
comb-0.056
Jud-0.050
cr-0.049
cious-0.046
gan-0.045
Token He
Token ID679
Feature activation-2.29e-02

Pos logit contributions
les+0.006
agging+0.005
ors+0.005
ler+0.005
iren+0.005
Neg logit contributions
comb-0.006
Jud-0.005
cr-0.005
 Maggie-0.004
cious-0.004
Token was
Token ID373
Feature activation-1.09e-02

Pos logit contributions
les+0.036
agging+0.034
ors+0.034
iren+0.032
?"+0.032
Neg logit contributions
comb-0.043
Jud-0.038
cr-0.037
cious-0.034
gan-0.033
Token three
Token ID1115
Feature activation+7.68e-04

Pos logit contributions
les+0.017
ors+0.015
agging+0.015
?"+0.014
ler+0.014
Neg logit contributions
comb-0.022
Jud-0.019
cr-0.019
cious-0.018
gan-0.017
Token years
Token ID812
Feature activation-3.80e-02

Pos logit contributions
comb+0.002
Jud+0.002
cr+0.002
cious+0.002
gan+0.002
Neg logit contributions
les-0.001
?"-0.001
agging-0.001
 me-0.001
ors-0.001
Token old
Token ID1468
Feature activation-8.96e-02

Pos logit contributions
les+0.044
agging+0.040
ors+0.038
?"+0.036
ler+0.036
Neg logit contributions
comb-0.083
Jud-0.074
 Maggie-0.071
 coconut-0.070
cr-0.070
Token and
Token ID290
Feature activation+1.04e-02

Pos logit contributions
les+0.098
agging+0.089
ors+0.085
?"+0.084
 me+0.082
Neg logit contributions
comb-0.215
Jud-0.189
cr-0.189
cious-0.176
gan-0.174
Token wanted
Token ID2227
Feature activation+7.70e-03

Pos logit contributions
comb+0.023
Jud+0.020
cr+0.020
cious+0.018
gan+0.018
Neg logit contributions
les-0.014
agging-0.012
?"-0.012
ors-0.012
 me-0.012
Token to
Token ID284
Feature activation-9.97e-04

Pos logit contributions
comb+0.019
Jud+0.017
cr+0.016
cious+0.015
gan+0.015
Neg logit contributions
?"-0.008
 me-0.008
les-0.008
agging-0.007
ors-0.007
Token spend
Token ID4341
Feature activation+1.21e-02

Pos logit contributions
?"+0.001
 me+0.001
les+0.001
 trumpet+0.001
?+0.001
Neg logit contributions
comb-0.002
Jud-0.002
cr-0.002
cious-0.002
gan-0.002
Token time
Token ID640
Feature activation-3.77e-02

Pos logit contributions
 themselves+0.016
comb+0.016
cr+0.014
Jud+0.014
 Maggie+0.013
Neg logit contributions
les-0.022
ors-0.022
ler-0.021
iren-0.021
ening-0.020
TokenIt
Token ID1026
Feature activation+2.19e-03

Pos logit contributions
les+0.010
 me+0.010
ock+0.009
iren+0.009
?"+0.009
Neg logit contributions
comb-0.012
cr-0.009
Jud-0.009
cious-0.008
 coconut-0.008
Token's
Token ID338
Feature activation-1.57e-03

Pos logit contributions
comb+0.004
Jud+0.004
cr+0.004
cious+0.003
gan+0.003
Neg logit contributions
les-0.003
agging-0.003
?"-0.003
ors-0.003
 me-0.003
Token okay
Token ID8788
Feature activation+6.52e-02

Pos logit contributions
les+0.004
ors+0.003
agging+0.003
iren+0.003
ler+0.003
Neg logit contributions
comb-0.002
Jud-0.002
cr-0.002
gan-0.001
emon-0.001
Token,
Token ID11
Feature activation-6.73e-02

Pos logit contributions
comb+0.162
Jud+0.143
cr+0.142
gan+0.137
 Maggie+0.135
Neg logit contributions
les-0.071
agging-0.062
iren-0.058
ler-0.057
ors-0.055
Token sweet
Token ID6029
Feature activation+6.54e-03

Pos logit contributions
les+0.105
agging+0.098
?"+0.095
ors+0.094
iren+0.092
Neg logit contributions
comb-0.131
Jud-0.113
cr-0.112
gan-0.102
 Maggie-0.100
Tokenie
Token ID494
Feature activation-9.86e-02

Pos logit contributions
comb+0.017
Jud+0.015
cr+0.015
 Maggie+0.014
gan+0.014
Neg logit contributions
agging-0.005
 channel-0.005
les-0.005
ors-0.005
 wears-0.004
Token.
Token ID13
Feature activation-5.24e-02

Pos logit contributions
les+0.082
?"+0.072
agging+0.069
 me+0.068
ors+0.067
Neg logit contributions
comb-0.264
Jud-0.241
cr-0.236
cious-0.220
 coconut-0.217
Token You
Token ID921
Feature activation+9.03e-03

Pos logit contributions
les+0.074
agging+0.067
?"+0.066
 me+0.064
ors+0.064
Neg logit contributions
comb-0.111
Jud-0.098
cr-0.096
cious-0.087
gan-0.086
Token can
Token ID460
Feature activation-3.83e-02

Pos logit contributions
comb+0.018
Jud+0.016
cr+0.016
gan+0.014
cious+0.014
Neg logit contributions
les-0.014
agging-0.013
ors-0.012
 me-0.012
iren-0.012
Token build
Token ID1382
Feature activation-3.03e-02

Pos logit contributions
les+0.065
agging+0.060
ening+0.059
ler+0.057
ors+0.057
Neg logit contributions
comb-0.071
Jud-0.059
cr-0.058
gan-0.051
 coconut-0.050
Token another
Token ID1194
Feature activation+3.59e-02

Pos logit contributions
les+0.047
agging+0.043
?"+0.043
ors+0.042
 me+0.041
Neg logit contributions
comb-0.059
Jud-0.051
cr-0.050
cious-0.045
gan-0.045
TokenIt
Token ID1026
Feature activation-9.33e-03

Pos logit contributions
les+0.001
 me+0.000
?"+0.000
ock+0.000
agging+0.000
Neg logit contributions
comb-0.001
Jud-0.000
cr-0.000
cious-0.000
 coconut-0.000
Token's
Token ID338
Feature activation-1.09e-02

Pos logit contributions
les+0.015
agging+0.015
ors+0.014
?"+0.013
iren+0.013
Neg logit contributions
comb-0.018
cr-0.015
Jud-0.015
cious-0.013
gan-0.013
Token okay
Token ID8788
Feature activation+5.35e-02

Pos logit contributions
les+0.028
ors+0.026
iren+0.026
agging+0.025
ler+0.025
Neg logit contributions
comb-0.012
cr-0.009
Jud-0.008
gan-0.008
 themselves-0.007
Token,
Token ID11
Feature activation-8.52e-02

Pos logit contributions
comb+0.136
cr+0.118
Jud+0.118
gan+0.113
 Maggie+0.113
Neg logit contributions
les-0.057
agging-0.051
iren-0.047
ors-0.045
ler-0.043
Token Ben
Token ID3932
Feature activation-8.13e-02

Pos logit contributions
les+0.134
agging+0.124
ors+0.123
iren+0.120
?"+0.120
Neg logit contributions
comb-0.166
Jud-0.142
cr-0.139
gan-0.127
cious-0.126
Token.
Token ID13
Feature activation-8.69e-02

Pos logit contributions
agging+0.096
les+0.095
iren+0.087
ors+0.086
 channels+0.083
Neg logit contributions
comb-0.188
Jud-0.163
cr-0.161
cious-0.150
gan-0.144
Token I
Token ID314
Feature activation-3.67e-02

Pos logit contributions
les+0.123
agging+0.112
ors+0.106
?"+0.105
 me+0.105
Neg logit contributions
comb-0.187
Jud-0.161
cr-0.159
cious-0.147
gan-0.145
Token fixed
Token ID5969
Feature activation-8.37e-02

Pos logit contributions
les+0.068
?"+0.064
agging+0.063
 me+0.063
ors+0.062
Neg logit contributions
comb-0.061
Jud-0.052
cr-0.051
cious-0.045
gan-0.043
Token your
Token ID534
Feature activation-5.76e-03

Pos logit contributions
les+0.103
agging+0.101
iren+0.094
ors+0.089
 channel+0.088
Neg logit contributions
comb-0.173
cr-0.162
 themselves-0.153
Jud-0.150
 Maggie-0.146
Token finger
Token ID7660
Feature activation-2.16e-02

Pos logit contributions
les+0.009
?"+0.008
agging+0.008
ors+0.008
 me+0.008
Neg logit contributions
comb-0.011
Jud-0.010
cr-0.010
cious-0.009
 coconut-0.009
Token.
Token ID13
Feature activation-5.98e-02

Pos logit contributions
les+0.021
agging+0.019
ors+0.018
?"+0.018
 me+0.018
Neg logit contributions
comb-0.054
cr-0.048
Jud-0.047
cious-0.044
gan-0.043
Token.
Token ID13
Feature activation-1.74e-02

Pos logit contributions
comb+0.024
Jud+0.021
cr+0.021
cious+0.020
gan+0.019
Neg logit contributions
les-0.009
agging-0.008
?"-0.008
ors-0.008
 me-0.008
Token\n
Token ID198
Feature activation-1.50e-02

Pos logit contributions
les+0.023
?"+0.022
 me+0.021
agging+0.021
ors+0.020
Neg logit contributions
comb-0.039
Jud-0.033
cr-0.033
cious-0.031
 coconut-0.030
Token\n
Token ID198
Feature activation-2.03e-02

Pos logit contributions
les+0.024
agging+0.023
ors+0.022
ening+0.022
?"+0.021
Neg logit contributions
comb-0.028
Jud-0.025
cr-0.024
 Maggie-0.021
gan-0.021
Token"
Token ID1
Feature activation+1.70e-03

Pos logit contributions
 me+0.025
?"+0.023
les+0.023
iren+0.021
ler+0.020
Neg logit contributions
comb-0.050
cr-0.041
Jud-0.040
 coconut-0.039
cious-0.038
TokenOK
Token ID11380
Feature activation-4.69e-02

Pos logit contributions
comb+0.003
Jud+0.003
cr+0.003
cious+0.002
 coconut+0.002
Neg logit contributions
 me-0.003
?"-0.002
les-0.002
 sax-0.002
iren-0.002
Token,
Token ID11
Feature activation-9.19e-02

Pos logit contributions
les+0.049
agging+0.045
iren+0.041
ors+0.040
ler+0.039
Neg logit contributions
comb-0.118
cr-0.103
Jud-0.102
gan-0.098
 Maggie-0.097
Token but
Token ID475
Feature activation-6.17e-02

Pos logit contributions
 me+0.092
?"+0.085
les+0.079
ors+0.068
ock+0.068
Neg logit contributions
comb-0.234
Jud-0.218
cr-0.206
cious-0.198
 coconut-0.190
Token be
Token ID307
Feature activation-3.96e-02

Pos logit contributions
 me+0.080
?"+0.079
les+0.065
 wears+0.064
ors+0.063
Neg logit contributions
comb-0.138
Jud-0.127
cr-0.119
cious-0.117
aney-0.110
Token careful
Token ID8161
Feature activation-2.02e-02

Pos logit contributions
les+0.076
?"+0.071
agging+0.071
ors+0.070
 me+0.070
Neg logit contributions
comb-0.064
Jud-0.054
cr-0.052
cious-0.046
gan-0.044
Token.
Token ID13
Feature activation-5.66e-02

Pos logit contributions
les+0.018
agging+0.016
?"+0.016
 me+0.015
ors+0.015
Neg logit contributions
comb-0.053
Jud-0.047
cr-0.047
cious-0.044
gan-0.043
Token The
Token ID383
Feature activation-2.65e-02

Pos logit contributions
?"+0.072
 me+0.070
ors+0.063
les+0.061
iren+0.061
Neg logit contributions
comb-0.126
Jud-0.114
cr-0.109
cious-0.102
 coconut-0.100