TOP ACTIVATIONS
MAX = 9.727

fell down with a th ud . They cried and hurt
the ground with a th ud . He lay on the
be ruined . Daisy tr ud ged along , carrying her
floor with a loud th ud . " Ha
the snow . He tr ud ged back inside , still
ground with a loud th ud . Tom cries
the rug with a th ud and a g iggle .
bone landed with a th ud in the dirt . Ham
mom said , " R ud olph , it 's time
the grass with a th ud . " O
her with a loud th ud . Sally felt a sharp
the ground with a th ud . Both Alice and Lisa
ground with a loud th ud . His bike was bent
to cry . He tr ud ged back home , carrying
a happy potato named Sp ud . Sp ud was very
the floor with a th ud . She felt a pain
ground with a loud th ud . He hurt his arm
sad . Tr ud y said , " I
not find it . Tr ud y felt so sad .
back . He sadly tr ud ged home , without his

MIN ACTIVATIONS
MIN = -8.534

it was the best cup cakes she ever had .
a bite . The cup cakes were so tasty ! Lily
big blanket and ate cup cakes and sandwiches . When the
When the cup cakes were done , Mill ie
When the cup cakes were done baking , they
help her make some cup cakes and put up some decorations
her mom had made cup cakes ! Bella was so excited
Mom my was baking cup cakes . She got out the
Grand ma was making cup cakes for the family . Dave
Finally , the cup cakes were ready and they all
, Dave y 's cup cakes were done , and they
shoot hay and eat cup cakes . It was a good
Can I put the cup cakes in the oven ?" June
sandwiches and some little cup cakes . Lily was happy because
and enjoyed the delicious cup cakes . From that day on
the most delicious honey - cakes that were ever made .
After they put the cup cakes in the oven , Lily
and his friends ate cup cakes and played with the fixed
my and eat delicious cup cakes ! <|endoftext|> Once upon a
hard so that the cup cakes would be extra fluffy .

INTERVAL 5.162 - 9.727
CONTAINS 0.265%

have it !" Jul ie didn 't understand .
Anna and Ben were twins who liked to play games .
felt encouraged by his friends who were so supportive . One
, there was a family who wanted to go on a
had a big lake and many trees . They could swim

INTERVAL 0.596 - 5.162
CONTAINS 26.891%

diary anymore . Lily was heart broken and cried all day
time , there was a little girl named Lily who loved
Give it back , give it back !" It repeated what
. So , he used his foot to kick the ball
the forest to play with her friends . Suddenly , they

INTERVAL -3.969 - 0.596
CONTAINS 71.808%

she saw a hose . It was dripping water . Lily
." She took them to the kitchen and showed
. " Mom my , can you help me
there was a little girl called T illy . One day
scared girl . She was very frightened and did not want

INTERVAL -8.534 - -3.969
CONTAINS 1.035%

," he said . " What kind of competition ?"
the water . She watched as the sand made little bubbles
loves . I 'm so proud of you . You helped
baseball and said , " Thank you . I 'm sorry
." S ally nodded and began to climb
Token fell
Token ID3214
Feature activation+8.21e-01

Pos logit contributions
ud+0.255
and+0.253
Jo+0.250
jo+0.241
rum+0.239
Neg logit contributions
-0.271
idents-0.218
 hotter-0.216
Sorry-0.214
 cooler-0.201
Token down
Token ID866
Feature activation-1.33e+00

Pos logit contributions
ud+0.084
Jo+0.083
and+0.083
jo+0.080
rum+0.079
Neg logit contributions
-0.060
 hotter-0.047
idents-0.046
Sorry-0.044
 cooler-0.042
Token with
Token ID351
Feature activation+1.80e-01

Pos logit contributions
+0.153
idents+0.139
Sorry+0.133
 hotter+0.129
cakes+0.126
Neg logit contributions
ud-0.080
and-0.077
Jo-0.071
rum-0.069
 Pen-0.064
Token a
Token ID257
Feature activation+1.61e-01

Pos logit contributions
ud+0.014
and+0.013
Jo+0.013
rum+0.013
 Ali+0.012
Neg logit contributions
-0.018
idents-0.015
 hotter-0.014
Sorry-0.014
cakes-0.014
Token th
Token ID294
Feature activation+3.30e+00

Pos logit contributions
and+0.015
ud+0.015
Jo+0.014
jo+0.014
rum+0.014
Neg logit contributions
-0.014
 hotter-0.011
idents-0.011
Sorry-0.011
 cooler-0.010
Tokenud
Token ID463
Feature activation+9.73e+00

Pos logit contributions
Jo+0.066
and+0.064
 Pen+0.061
 Ali+0.059
 Jacob+0.058
Neg logit contributions
-0.501
 hotter-0.433
Sorry-0.432
 unst-0.413
 cooler-0.407
Token.
Token ID13
Feature activation-5.93e-02

Pos logit contributions
Jo+1.011
and+0.997
jo+0.964
old+0.957
Jamie+0.951
Neg logit contributions
-0.653
 hotter-0.493
 unst-0.465
Sorry-0.441
 cooler-0.405
Token They
Token ID1119
Feature activation-2.34e-01

Pos logit contributions
+0.006
idents+0.005
 hotter+0.005
Sorry+0.004
 cooler+0.004
Neg logit contributions
ud-0.005
and-0.005
Jo-0.005
rum-0.005
jo-0.005
Token cried
Token ID16896
Feature activation+2.27e+00

Pos logit contributions
+0.023
idents+0.018
 hotter+0.018
Sorry+0.018
 cooler+0.017
Neg logit contributions
ud-0.019
and-0.019
Jo-0.018
rum-0.017
jo-0.017
Token and
Token ID290
Feature activation-3.47e-01

Pos logit contributions
ud+0.174
and+0.172
Jo+0.167
rum+0.159
jo+0.158
Neg logit contributions
-0.229
idents-0.189
 hotter-0.186
Sorry-0.184
 cooler-0.174
Token hurt
Token ID5938
Feature activation-1.18e+00

Pos logit contributions
+0.032
 hotter+0.026
idents+0.026
Sorry+0.026
 cooler+0.025
Neg logit contributions
and-0.030
ud-0.030
Jo-0.029
jo-0.028
rum-0.027
Token the
Token ID262
Feature activation-4.54e-01

Pos logit contributions
Jo+0.055
and+0.055
ud+0.054
rum+0.052
jo+0.052
Neg logit contributions
-0.074
Sorry-0.061
idents-0.060
 hotter-0.060
cakes-0.059
Token ground
Token ID2323
Feature activation+1.19e+00

Pos logit contributions
+0.037
idents+0.029
 hotter+0.028
Sorry+0.028
cakes+0.026
Neg logit contributions
ud-0.043
and-0.043
Jo-0.042
rum-0.040
jo-0.040
Token with
Token ID351
Feature activation+1.80e-01

Pos logit contributions
ud+0.115
Jo+0.113
and+0.110
rum+0.108
jo+0.107
Neg logit contributions
-0.094
 hotter-0.073
 cooler-0.073
idents-0.070
 unst-0.068
Token a
Token ID257
Feature activation+1.61e-01

Pos logit contributions
ud+0.014
and+0.014
Jo+0.014
rum+0.013
jo+0.013
Neg logit contributions
-0.018
idents-0.015
 hotter-0.014
Sorry-0.014
cakes-0.014
Token th
Token ID294
Feature activation+3.30e+00

Pos logit contributions
ud+0.015
and+0.014
Jo+0.014
rum+0.014
jo+0.014
Neg logit contributions
-0.014
idents-0.011
 hotter-0.011
Sorry-0.011
 cooler-0.010
Tokenud
Token ID463
Feature activation+9.73e+00

Pos logit contributions
 Pen+0.101
and+0.098
 Mae+0.094
 Jacob+0.093
 Ali+0.091
Neg logit contributions
-0.482
 hotter-0.392
Sorry-0.383
 unst-0.376
ized-0.374
Token.
Token ID13
Feature activation-5.93e-02

Pos logit contributions
and+0.994
Jo+0.921
old+0.900
ud+0.895
 Mae+0.885
Neg logit contributions
-0.717
 hotter-0.537
 unst-0.521
Sorry-0.478
 cooler-0.459
Token He
Token ID679
Feature activation-4.56e-01

Pos logit contributions
+0.005
 hotter+0.005
idents+0.004
Sorry+0.004
 cooler+0.004
Neg logit contributions
and-0.005
ud-0.005
Jo-0.005
jo-0.005
rum-0.004
Token lay
Token ID3830
Feature activation+4.77e+00

Pos logit contributions
+0.037
idents+0.029
 hotter+0.029
Sorry+0.029
 cooler+0.028
Neg logit contributions
and-0.043
ud-0.042
rum-0.041
Jo-0.040
jo-0.039
Token on
Token ID319
Feature activation-9.03e-01

Pos logit contributions
ud+0.461
and+0.450
Jo+0.441
rum+0.433
jo+0.426
Neg logit contributions
-0.382
idents-0.303
Sorry-0.283
 hotter-0.280
cakes-0.276
Token the
Token ID262
Feature activation-4.44e-01

Pos logit contributions
+0.096
idents+0.086
Sorry+0.085
 hotter+0.082
cakes+0.082
Neg logit contributions
ud-0.057
Jo-0.056
and-0.054
rum-0.052
 Ali-0.052
Token be
Token ID307
Feature activation+2.09e-02

Pos logit contributions
+0.005
idents+0.004
 hotter+0.004
Sorry+0.003
 cooler+0.003
Neg logit contributions
ud-0.005
and-0.005
Jo-0.005
jo-0.005
rum-0.005
Token ruined
Token ID20484
Feature activation-3.45e+00

Pos logit contributions
ud+0.002
and+0.002
Jo+0.002
jo+0.002
rum+0.002
Neg logit contributions
-0.002
 hotter-0.001
idents-0.001
Sorry-0.001
 cooler-0.001
Token.
Token ID13
Feature activation-5.94e-02

Pos logit contributions
+0.266
idents+0.208
 hotter+0.204
 cooler+0.202
 unst+0.192
Neg logit contributions
ud-0.339
Jo-0.329
and-0.324
jo-0.316
rum-0.309
Token Daisy
Token ID40355
Feature activation+8.58e-01

Pos logit contributions
+0.005
idents+0.004
 hotter+0.004
Sorry+0.004
cakes+0.003
Neg logit contributions
ud-0.006
and-0.005
Jo-0.005
rum-0.005
jo-0.005
Token tr
Token ID491
Feature activation+2.03e+00

Pos logit contributions
and+0.084
ud+0.082
rum+0.079
Jo+0.076
jo+0.074
Neg logit contributions
-0.067
idents-0.053
Sorry-0.053
 hotter-0.052
 cooler-0.050
Tokenud
Token ID463
Feature activation+9.73e+00

Pos logit contributions
and+0.060
 Pen+0.050
Jo+0.049
 Mae+0.047
 Ali+0.045
Neg logit contributions
-0.313
idents-0.255
Sorry-0.254
 hotter-0.244
cakes-0.241
Tokenged
Token ID2004
Feature activation+1.52e+00

Pos logit contributions
and+1.042
Jo+0.975
ud+0.963
jo+0.954
weet+0.948
Neg logit contributions
-0.661
Sorry-0.479
 hotter-0.463
 unst-0.432
idents-0.424
Token along
Token ID1863
Feature activation+2.90e+00

Pos logit contributions
ud+0.158
Jo+0.151
jo+0.149
and+0.147
rum+0.147
Neg logit contributions
-0.106
 hotter-0.088
 unst-0.084
idents-0.084
Sorry-0.082
Token,
Token ID11
Feature activation+2.35e-01

Pos logit contributions
ud+0.292
and+0.275
rum+0.273
jo+0.270
Jo+0.267
Neg logit contributions
-0.231
 hotter-0.169
idents-0.169
Sorry-0.161
 surely-0.158
Token carrying
Token ID6872
Feature activation+2.55e+00

Pos logit contributions
ud+0.022
and+0.021
rum+0.021
Jo+0.020
jo+0.020
Neg logit contributions
-0.020
 hotter-0.017
idents-0.016
Sorry-0.015
 cooler-0.015
Token her
Token ID607
Feature activation+7.31e-01

Pos logit contributions
ud+0.232
and+0.225
jo+0.219
Jo+0.218
rum+0.214
Neg logit contributions
-0.219
 hotter-0.181
 cooler-0.174
idents-0.172
 unst-0.167
Token floor
Token ID4314
Feature activation+3.57e+00

Pos logit contributions
+0.042
 hotter+0.034
idents+0.034
 cooler+0.032
Sorry+0.032
Neg logit contributions
ud-0.039
and-0.038
Jo-0.038
rum-0.037
jo-0.036
Token with
Token ID351
Feature activation+1.75e-01

Pos logit contributions
Jo+0.369
rum+0.358
jo+0.357
and+0.351
 Pen+0.345
Neg logit contributions
-0.273
 hotter-0.195
 cooler-0.192
idents-0.183
 surely-0.175
Token a
Token ID257
Feature activation+1.55e-01

Pos logit contributions
ud+0.013
and+0.013
Jo+0.013
rum+0.012
jo+0.012
Neg logit contributions
-0.018
idents-0.015
 hotter-0.015
Sorry-0.015
cakes-0.014
Token loud
Token ID7812
Feature activation+4.11e-01

Pos logit contributions
ud+0.013
and+0.013
Jo+0.013
rum+0.012
jo+0.012
Neg logit contributions
-0.014
idents-0.011
 hotter-0.011
Sorry-0.011
cakes-0.010
Token th
Token ID294
Feature activation+3.29e+00

Pos logit contributions
and+0.042
ud+0.041
old+0.037
Jo+0.037
jo+0.036
Neg logit contributions
-0.032
 hotter-0.025
idents-0.025
 cooler-0.025
Sorry-0.023
Tokenud
Token ID463
Feature activation+9.72e+00

Pos logit contributions
and+0.084
 Pen+0.076
Jo+0.074
 Ali+0.070
 Jacob+0.070
Neg logit contributions
-0.494
 hotter-0.411
Sorry-0.403
 unst-0.394
 cooler-0.389
Token.
Token ID13
Feature activation-6.35e-02

Pos logit contributions
and+0.995
Jo+0.952
old+0.924
ud+0.911
jo+0.893
Neg logit contributions
-0.718
 hotter-0.535
 unst-0.504
Sorry-0.483
idents-0.475
Token\n
Token ID198
Feature activation-1.13e-01

Pos logit contributions
+0.005
idents+0.004
 hotter+0.004
Sorry+0.004
 cooler+0.004
Neg logit contributions
ud-0.006
and-0.006
Jo-0.006
rum-0.005
jo-0.005
Token\n
Token ID198
Feature activation-1.12e-01

Pos logit contributions
+0.008
Sorry+0.006
 hotter+0.005
idents+0.005
 unst+0.005
Neg logit contributions
ud-0.011
rum-0.011
and-0.011
Jo-0.011
 Mae-0.011
Token"
Token ID1
Feature activation-4.91e-01

Pos logit contributions
+0.010
Sorry+0.008
idents+0.008
 hotter+0.008
 cooler+0.007
Neg logit contributions
ud-0.010
and-0.010
Jo-0.010
jo-0.009
rum-0.009
TokenHa
Token ID23303
Feature activation+5.54e-01

Pos logit contributions
+0.043
idents+0.037
 hotter+0.034
 cooler+0.033
cakes+0.032
Neg logit contributions
ud-0.044
Jo-0.043
and-0.042
 Pen-0.040
 Ali-0.039
Token the
Token ID262
Feature activation-4.56e-01

Pos logit contributions
ud+0.113
and+0.108
Jo+0.104
jo+0.103
rum+0.100
Neg logit contributions
-0.075
 hotter-0.072
 cooler-0.070
 unst-0.061
 ang-0.057
Token snow
Token ID6729
Feature activation+1.55e+00

Pos logit contributions
+0.040
 hotter+0.033
idents+0.032
Sorry+0.031
 cooler+0.031
Neg logit contributions
ud-0.041
and-0.040
Jo-0.039
rum-0.039
jo-0.038
Token.
Token ID13
Feature activation-5.95e-02

Pos logit contributions
ud+0.134
and+0.132
Jo+0.129
rum+0.124
jo+0.123
Neg logit contributions
-0.140
idents-0.113
 hotter-0.112
Sorry-0.109
 cooler-0.104
Token He
Token ID679
Feature activation-4.62e-01

Pos logit contributions
+0.005
idents+0.004
 hotter+0.004
Sorry+0.004
 cooler+0.004
Neg logit contributions
ud-0.005
and-0.005
Jo-0.005
jo-0.005
rum-0.005
Token tr
Token ID491
Feature activation+2.02e+00

Pos logit contributions
+0.034
 cooler+0.027
Sorry+0.027
 hotter+0.027
idents+0.027
Neg logit contributions
and-0.046
ud-0.045
rum-0.044
Jo-0.042
jo-0.042
Tokenud
Token ID463
Feature activation+9.72e+00

Pos logit contributions
and+0.049
 Pen+0.039
Jo+0.038
 Mae+0.035
jo+0.033
Neg logit contributions
-0.317
idents-0.263
Sorry-0.261
 hotter-0.257
cakes-0.254
Tokenged
Token ID2004
Feature activation+1.52e+00

Pos logit contributions
and+0.981
ud+0.922
Jo+0.916
rum+0.907
jo+0.902
Neg logit contributions
-0.722
Sorry-0.533
 hotter-0.529
idents-0.501
 unst-0.492
Token back
Token ID736
Feature activation-3.21e+00

Pos logit contributions
ud+0.160
Jo+0.156
jo+0.153
rum+0.153
and+0.152
Neg logit contributions
-0.099
 hotter-0.084
 cooler-0.078
 unst-0.077
idents-0.076
Token inside
Token ID2641
Feature activation-2.04e+00

Pos logit contributions
+0.261
 hotter+0.200
idents+0.199
 cooler+0.199
 unst+0.187
Neg logit contributions
ud-0.301
and-0.291
Jo-0.291
rum-0.289
jo-0.279
Token,
Token ID11
Feature activation+2.32e-01

Pos logit contributions
+0.143
 cooler+0.106
 hotter+0.104
 unst+0.095
idents+0.095
Neg logit contributions
ud-0.212
jo-0.204
and-0.202
rum-0.201
Jo-0.198
Token still
Token ID991
Feature activation-3.77e+00

Pos logit contributions
ud+0.022
and+0.021
Jo+0.021
rum+0.020
jo+0.020
Neg logit contributions
-0.019
 hotter-0.016
idents-0.015
Sorry-0.015
 cooler-0.015
Token ground
Token ID2323
Feature activation+1.19e+00

Pos logit contributions
+0.034
 hotter+0.029
 cooler+0.028
idents+0.027
 unst+0.025
Neg logit contributions
ud-0.045
Jo-0.044
and-0.044
jo-0.042
rum-0.042
Token with
Token ID351
Feature activation+1.71e-01

Pos logit contributions
Jo+0.137
jo+0.132
ud+0.131
iley+0.131
and+0.129
Neg logit contributions
-0.069
 hotter-0.052
 cooler-0.051
idents-0.042
Sorry-0.041
Token a
Token ID257
Feature activation+1.61e-01

Pos logit contributions
and+0.016
ud+0.015
Jo+0.015
jo+0.015
rum+0.014
Neg logit contributions
-0.015
 hotter-0.012
 cooler-0.012
idents-0.011
Sorry-0.011
Token loud
Token ID7812
Feature activation+4.04e-01

Pos logit contributions
and+0.015
ud+0.015
Jo+0.014
jo+0.014
rum+0.013
Neg logit contributions
-0.014
 hotter-0.011
idents-0.011
Sorry-0.011
 cooler-0.011
Token th
Token ID294
Feature activation+3.30e+00

Pos logit contributions
and+0.041
ud+0.040
old+0.037
Jo+0.036
jo+0.035
Neg logit contributions
-0.031
 hotter-0.026
idents-0.025
 cooler-0.025
Sorry-0.023
Tokenud
Token ID463
Feature activation+9.72e+00

Pos logit contributions
and+0.109
Jo+0.100
 Jacob+0.097
 Pen+0.096
 Ali+0.093
Neg logit contributions
-0.459
 hotter-0.391
Sorry-0.376
 cooler-0.365
 unst-0.365
Token.
Token ID13
Feature activation-6.40e-02

Pos logit contributions
and+1.045
Jo+1.001
old+0.992
 languages+0.972
Anne+0.948
Neg logit contributions
-0.613
 hotter-0.467
 unst-0.421
Sorry-0.403
bled-0.380
Token\n
Token ID198
Feature activation-1.18e-01

Pos logit contributions
+0.005
 hotter+0.004
idents+0.004
Sorry+0.004
 cooler+0.004
Neg logit contributions
ud-0.006
and-0.006
Jo-0.006
rum-0.006
jo-0.006
Token\n
Token ID198
Feature activation-1.19e-01

Pos logit contributions
+0.007
Sorry+0.005
 hotter+0.004
 cooler+0.004
idents+0.004
Neg logit contributions
ud-0.013
rum-0.013
Jo-0.012
jo-0.012
weet-0.012
TokenTom
Token ID13787
Feature activation+1.86e+00

Pos logit contributions
+0.010
Sorry+0.009
 hotter+0.008
 cooler+0.007
idents+0.007
Neg logit contributions
ud-0.011
and-0.010
Jo-0.010
jo-0.010
rum-0.010
Token cries
Token ID24691
Feature activation+3.89e+00

Pos logit contributions
and+0.166
Jo+0.158
rum+0.149
 perspective+0.145
ud+0.141
Neg logit contributions
-0.146
 cooler-0.131
 hotter-0.130
idents-0.126
Sorry-0.118
Token the
Token ID262
Feature activation-4.50e-01

Pos logit contributions
+0.087
idents+0.072
 hotter+0.071
Sorry+0.070
 cooler+0.066
Neg logit contributions
ud-0.074
and-0.073
Jo-0.071
jo-0.068
rum-0.067
Token rug
Token ID14477
Feature activation+2.91e+00

Pos logit contributions
+0.042
 hotter+0.035
idents+0.035
 cooler+0.033
Sorry+0.033
Neg logit contributions
ud-0.037
rum-0.035
and-0.035
Jo-0.034
jo-0.033
Token with
Token ID351
Feature activation+1.71e-01

Pos logit contributions
ud+0.278
Jo+0.273
rum+0.272
and+0.265
jo+0.256
Neg logit contributions
-0.245
cakes-0.185
idents-0.183
 cooler-0.181
 hotter-0.178
Token a
Token ID257
Feature activation+1.61e-01

Pos logit contributions
ud+0.015
and+0.015
Jo+0.014
rum+0.014
jo+0.014
Neg logit contributions
-0.016
 hotter-0.013
idents-0.013
Sorry-0.012
 cooler-0.012
Token th
Token ID294
Feature activation+3.30e+00

Pos logit contributions
ud+0.014
and+0.014
Jo+0.014
rum+0.013
jo+0.013
Neg logit contributions
-0.014
idents-0.012
 hotter-0.011
Sorry-0.011
cakes-0.011
Tokenud
Token ID463
Feature activation+9.72e+00

Pos logit contributions
and+0.102
 Pen+0.090
 Jacob+0.085
 English+0.083
Jo+0.081
Neg logit contributions
-0.472
 hotter-0.376
Sorry-0.369
cakes-0.368
 cooler-0.358
Token and
Token ID290
Feature activation-3.46e-01

Pos logit contributions
and+1.032
Jo+0.965
old+0.950
ud+0.926
Jamie+0.920
Neg logit contributions
-0.689
 hotter-0.488
Sorry-0.443
 unst-0.439
idents-0.425
Token a
Token ID257
Feature activation+1.50e-01

Pos logit contributions
+0.029
 hotter+0.024
Sorry+0.023
idents+0.023
 cooler+0.023
Neg logit contributions
ud-0.032
and-0.032
rum-0.031
Jo-0.030
jo-0.030
Token g
Token ID308
Feature activation+2.76e+00

Pos logit contributions
ud+0.013
Jo+0.012
and+0.012
jo+0.012
rum+0.011
Neg logit contributions
-0.014
idents-0.011
 hotter-0.011
Sorry-0.011
cakes-0.010
Tokeniggle
Token ID24082
Feature activation+1.90e+00

Pos logit contributions
and+0.254
Jo+0.233
 Pen+0.233
ud+0.229
old+0.218
Neg logit contributions
-0.242
idents-0.194
 hotter-0.171
cakes-0.167
oming-0.163
Token.
Token ID13
Feature activation-5.90e-02

Pos logit contributions
and+0.183
ud+0.182
Jo+0.173
jo+0.167
rum+0.166
Neg logit contributions
-0.147
 hotter-0.114
Sorry-0.111
 cooler-0.111
idents-0.109
Token bone
Token ID9970
Feature activation-2.09e-01

Pos logit contributions
+0.046
idents+0.039
 hotter+0.039
Sorry+0.038
cakes+0.038
Neg logit contributions
ud-0.031
and-0.029
Jo-0.028
rum-0.027
jo-0.026
Token landed
Token ID11406
Feature activation-1.54e+00

Pos logit contributions
+0.017
idents+0.014
 hotter+0.013
Sorry+0.013
cakes+0.012
Neg logit contributions
ud-0.020
and-0.019
Jo-0.019
jo-0.018
rum-0.018
Token with
Token ID351
Feature activation+1.74e-01

Pos logit contributions
+0.144
idents+0.131
Sorry+0.117
 hotter+0.116
cakes+0.113
Neg logit contributions
ud-0.126
and-0.124
Jo-0.114
rum-0.109
 Pen-0.105
Token a
Token ID257
Feature activation+1.49e-01

Pos logit contributions
ud+0.014
and+0.014
Jo+0.014
rum+0.013
jo+0.013
Neg logit contributions
-0.017
idents-0.014
 hotter-0.014
Sorry-0.013
 cooler-0.013
Token th
Token ID294
Feature activation+3.30e+00

Pos logit contributions
ud+0.013
Jo+0.012
and+0.012
rum+0.012
jo+0.012
Neg logit contributions
-0.013
idents-0.011
 hotter-0.011
Sorry-0.010
cakes-0.010
Tokenud
Token ID463
Feature activation+9.72e+00

Pos logit contributions
and+0.108
 Pen+0.094
Jo+0.090
 Ali+0.087
 Mae+0.084
Neg logit contributions
-0.482
 hotter-0.397
Sorry-0.389
 unst-0.378
 cooler-0.377
Token in
Token ID287
Feature activation-2.41e-01

Pos logit contributions
and+0.933
Jo+0.875
ud+0.871
old+0.842
jo+0.841
Neg logit contributions
-0.812
 hotter-0.629
Sorry-0.589
 unst-0.575
idents-0.573
Token the
Token ID262
Feature activation-4.48e-01

Pos logit contributions
+0.023
idents+0.019
 hotter+0.019
Sorry+0.019
 cooler+0.018
Neg logit contributions
ud-0.019
and-0.019
Jo-0.019
jo-0.018
rum-0.018
Token dirt
Token ID13647
Feature activation+8.73e-01

Pos logit contributions
+0.039
idents+0.031
 hotter+0.031
Sorry+0.030
cakes+0.028
Neg logit contributions
ud-0.040
and-0.040
Jo-0.039
jo-0.037
rum-0.037
Token.
Token ID13
Feature activation-6.13e-02

Pos logit contributions
and+0.077
Jo+0.075
ud+0.075
jo+0.072
rum+0.072
Neg logit contributions
-0.079
 hotter-0.063
idents-0.060
 cooler-0.060
Sorry-0.058
Token Ham
Token ID4345
Feature activation-3.93e+00

Pos logit contributions
+0.006
idents+0.005
Sorry+0.005
 hotter+0.005
cakes+0.005
Neg logit contributions
ud-0.005
Jo-0.004
and-0.004
 Ali-0.004
 Pen-0.004
Token mom
Token ID1995
Feature activation-1.72e+00

Pos logit contributions
ud+0.222
and+0.221
Jo+0.217
rum+0.209
jo+0.207
Neg logit contributions
-0.231
idents-0.187
 hotter-0.185
Sorry-0.181
 cooler-0.176
Token said
Token ID531
Feature activation-5.65e-01

Pos logit contributions
+0.128
 hotter+0.101
idents+0.099
Sorry+0.098
 cooler+0.096
Neg logit contributions
and-0.172
ud-0.169
rum-0.166
Jo-0.164
jo-0.157
Token,
Token ID11
Feature activation+2.50e-01

Pos logit contributions
+0.038
idents+0.029
Sorry+0.028
cakes+0.027
 cooler+0.027
Neg logit contributions
ud-0.067
and-0.062
jo-0.060
rum-0.060
Jo-0.060
Token "
Token ID366
Feature activation+8.12e-01

Pos logit contributions
ud+0.025
and+0.023
rum+0.022
Jo+0.022
jo+0.021
Neg logit contributions
-0.020
idents-0.017
 hotter-0.016
 cooler-0.016
Sorry-0.015
TokenR
Token ID49
Feature activation+4.90e+00

Pos logit contributions
rum+0.077
ud+0.076
and+0.074
 Ali+0.072
jo+0.072
Neg logit contributions
-0.064
Sorry-0.055
 hotter-0.051
idents-0.050
 cooler-0.044
Tokenud
Token ID463
Feature activation+9.72e+00

Pos logit contributions
 Ali+0.363
Jo+0.358
and+0.348
 Mae+0.344
 Jacob+0.341
Neg logit contributions
-0.466
idents-0.383
Sorry-0.361
 hotter-0.356
ory-0.356
Tokenolph
Token ID10196
Feature activation+1.42e+00

Pos logit contributions
and+0.834
Jo+0.785
Anne+0.714
jo+0.712
Jamie+0.711
Neg logit contributions
-0.909
idents-0.720
Sorry-0.694
 hotter-0.670
 unst-0.667
Token,
Token ID11
Feature activation+2.36e-01

Pos logit contributions
and+0.132
Jo+0.128
ud+0.127
rum+0.119
jo+0.118
Neg logit contributions
-0.123
idents-0.099
Sorry-0.089
 cooler-0.089
 hotter-0.088
Token it
Token ID340
Feature activation+1.49e+00

Pos logit contributions
ud+0.025
jo+0.023
and+0.023
Jo+0.023
rum+0.022
Neg logit contributions
-0.017
 cooler-0.015
idents-0.015
 hotter-0.014
Sorry-0.013
Token's
Token ID338
Feature activation+2.58e+00

Pos logit contributions
and+0.107
ud+0.102
Jo+0.096
rum+0.094
jo+0.092
Neg logit contributions
-0.156
idents-0.140
 hotter-0.136
 cooler-0.132
Sorry-0.128
Token time
Token ID640
Feature activation+1.47e+00

Pos logit contributions
ud+0.156
and+0.155
Jo+0.152
jo+0.141
rum+0.140
Neg logit contributions
-0.301
 hotter-0.259
idents-0.257
Sorry-0.248
 cooler-0.247
Token the
Token ID262
Feature activation-4.52e-01

Pos logit contributions
+0.092
idents+0.078
Sorry+0.076
 hotter+0.075
cakes+0.071
Neg logit contributions
ud-0.068
and-0.066
Jo-0.066
 Ali-0.061
jo-0.061
Token grass
Token ID8701
Feature activation+3.02e+00

Pos logit contributions
+0.037
 hotter+0.031
idents+0.030
 cooler+0.029
Sorry+0.028
Neg logit contributions
ud-0.042
and-0.042
rum-0.041
Jo-0.041
jo-0.041
Token with
Token ID351
Feature activation+1.84e-01

Pos logit contributions
Jo+0.273
rum+0.265
ud+0.265
and+0.264
jo+0.258
Neg logit contributions
-0.270
 hotter-0.209
 cooler-0.202
idents-0.201
Sorry-0.192
Token a
Token ID257
Feature activation+1.54e-01

Pos logit contributions
ud+0.014
and+0.014
Jo+0.014
rum+0.013
jo+0.013
Neg logit contributions
-0.018
idents-0.015
 hotter-0.015
Sorry-0.015
 cooler-0.014
Token th
Token ID294
Feature activation+3.31e+00

Pos logit contributions
ud+0.014
and+0.014
Jo+0.013
rum+0.013
jo+0.013
Neg logit contributions
-0.013
idents-0.011
 hotter-0.011
Sorry-0.010
 cooler-0.010
Tokenud
Token ID463
Feature activation+9.72e+00

Pos logit contributions
and+0.163
 Jacob+0.153
Jo+0.153
 Pen+0.152
 Ali+0.147
Neg logit contributions
-0.418
 hotter-0.335
Sorry-0.325
 cooler-0.316
cakes-0.311
Token.
Token ID13
Feature activation-5.88e-02

Pos logit contributions
and+1.014
Jo+0.971
old+0.949
jo+0.928
ud+0.922
Neg logit contributions
-0.702
 hotter-0.520
 unst-0.474
Sorry-0.468
 cooler-0.441
Token\n
Token ID198
Feature activation-1.18e-01

Pos logit contributions
+0.005
idents+0.004
 hotter+0.004
Sorry+0.004
 cooler+0.004
Neg logit contributions
ud-0.005
and-0.005
Jo-0.005
rum-0.005
jo-0.005
Token\n
Token ID198
Feature activation-1.19e-01

Pos logit contributions
+0.009
Sorry+0.006
 hotter+0.006
idents+0.006
 cooler+0.006
Neg logit contributions
ud-0.012
Jo-0.012
rum-0.011
and-0.011
jo-0.011
Token"
Token ID1
Feature activation-4.87e-01

Pos logit contributions
+0.011
idents+0.009
 hotter+0.009
Sorry+0.009
 cooler+0.008
Neg logit contributions
ud-0.010
and-0.010
Jo-0.010
rum-0.009
jo-0.009
TokenO
Token ID46
Feature activation-2.19e+00

Pos logit contributions
+0.041
idents+0.035
 hotter+0.032
cakes+0.030
 cooler+0.030
Neg logit contributions
ud-0.045
and-0.044
Jo-0.044
 Pen-0.040
jo-0.040
Token her
Token ID607
Feature activation+7.26e-01

Pos logit contributions
ud+0.043
Jo+0.042
and+0.042
rum+0.039
 Ali+0.039
Neg logit contributions
-0.068
idents-0.059
 hotter-0.058
Sorry-0.057
cakes-0.054
Token with
Token ID351
Feature activation+1.84e-01

Pos logit contributions
ud+0.062
Jo+0.059
and+0.058
rum+0.057
jo+0.054
Neg logit contributions
-0.068
idents-0.055
 cooler-0.052
 unst-0.050
 hotter-0.050
Token a
Token ID257
Feature activation+1.50e-01

Pos logit contributions
and+0.018
ud+0.018
jo+0.017
rum+0.017
Jo+0.017
Neg logit contributions
-0.014
 cooler-0.011
 hotter-0.011
 unst-0.011
Sorry-0.011
Token loud
Token ID7812
Feature activation+4.06e-01

Pos logit contributions
and+0.014
ud+0.014
Jo+0.014
rum+0.013
jo+0.013
Neg logit contributions
-0.013
 hotter-0.010
idents-0.010
 cooler-0.009
Sorry-0.009
Token th
Token ID294
Feature activation+3.30e+00

Pos logit contributions
and+0.041
ud+0.040
old+0.036
Jo+0.035
 Mae+0.034
Neg logit contributions
-0.032
idents-0.025
 hotter-0.025
 cooler-0.025
 unst-0.022
Tokenud
Token ID463
Feature activation+9.72e+00

Pos logit contributions
and+0.121
 Pen+0.106
 Jacob+0.106
 Ali+0.100
 Mae+0.097
Neg logit contributions
-0.457
 hotter-0.378
Sorry-0.364
 unst-0.353
 cooler-0.353
Token.
Token ID13
Feature activation-6.39e-02

Pos logit contributions
and+0.990
old+0.921
Jo+0.911
ud+0.895
 languages+0.881
Neg logit contributions
-0.720
 hotter-0.525
 unst-0.492
Sorry-0.473
 cooler-0.443
Token Sally
Token ID25737
Feature activation-2.05e+00

Pos logit contributions
+0.006
idents+0.005
 hotter+0.005
Sorry+0.005
 cooler+0.004
Neg logit contributions
and-0.005
ud-0.005
Jo-0.005
rum-0.005
jo-0.005
Token felt
Token ID2936
Feature activation-5.05e-01

Pos logit contributions
+0.155
Sorry+0.118
 hotter+0.116
idents+0.115
 cooler+0.111
Neg logit contributions
and-0.203
ud-0.194
rum-0.192
jo-0.178
Jo-0.177
Token a
Token ID257
Feature activation+1.55e-01

Pos logit contributions
+0.044
 hotter+0.036
idents+0.036
Sorry+0.034
 cooler+0.033
Neg logit contributions
ud-0.045
and-0.045
Jo-0.044
jo-0.042
rum-0.042
Token sharp
Token ID7786
Feature activation+1.74e+00

Pos logit contributions
ud+0.014
and+0.013
Jo+0.013
rum+0.013
jo+0.013
Neg logit contributions
-0.014
idents-0.011
 hotter-0.011
Sorry-0.011
 cooler-0.010
Token the
Token ID262
Feature activation-4.60e-01

Pos logit contributions
and+0.235
ud+0.234
rum+0.223
Jo+0.220
jo+0.212
Neg logit contributions
-0.231
idents-0.204
 hotter-0.204
cakes-0.190
 cooler-0.187
Token ground
Token ID2323
Feature activation+1.19e+00

Pos logit contributions
+0.041
 hotter+0.033
idents+0.033
Sorry+0.032
 cooler+0.032
Neg logit contributions
ud-0.041
and-0.040
Jo-0.039
rum-0.038
jo-0.038
Token with
Token ID351
Feature activation+1.66e-01

Pos logit contributions
ud+0.111
Jo+0.111
and+0.109
jo+0.108
rum+0.106
Neg logit contributions
-0.098
 hotter-0.079
idents-0.074
 cooler-0.073
Sorry-0.071
Token a
Token ID257
Feature activation+1.57e-01

Pos logit contributions
and+0.016
ud+0.016
rum+0.015
jo+0.015
Jo+0.014
Neg logit contributions
-0.013
 hotter-0.011
 cooler-0.010
idents-0.010
 unst-0.010
Token th
Token ID294
Feature activation+3.29e+00

Pos logit contributions
and+0.014
ud+0.014
Jo+0.014
jo+0.013
rum+0.013
Neg logit contributions
-0.014
 hotter-0.011
idents-0.011
Sorry-0.011
 cooler-0.010
Tokenud
Token ID463
Feature activation+9.72e+00

Pos logit contributions
and+0.103
 Pen+0.096
 Ali+0.091
 Jacob+0.091
Jo+0.086
Neg logit contributions
-0.477
 hotter-0.387
Sorry-0.384
 unst-0.370
 cooler-0.365
Token.
Token ID13
Feature activation-6.10e-02

Pos logit contributions
and+0.983
old+0.913
Jo+0.911
ud+0.890
jo+0.886
Neg logit contributions
-0.759
 hotter-0.554
Sorry-0.527
 unst-0.523
idents-0.490
Token Both
Token ID5747
Feature activation-2.48e+00

Pos logit contributions
+0.007
idents+0.006
cakes+0.006
 hotter+0.006
Sorry+0.005
Neg logit contributions
ud-0.004
and-0.004
rum-0.004
Jo-0.004
jo-0.003
Token Alice
Token ID14862
Feature activation+5.42e+00

Pos logit contributions
+0.242
cakes+0.221
idents+0.207
 cooler+0.206
Sorry+0.205
Neg logit contributions
and-0.182
rum-0.181
ud-0.177
jo-0.156
Jo-0.155
Token and
Token ID290
Feature activation-3.63e-01

Pos logit contributions
rum+0.540
and+0.537
ud+0.495
jo+0.494
weet+0.478
Neg logit contributions
-0.417
Sorry-0.310
idents-0.296
 cooler-0.289
 hotter-0.289
Token Lisa
Token ID15378
Feature activation+2.21e+00

Pos logit contributions
+0.033
idents+0.027
 hotter+0.026
Sorry+0.026
 cooler+0.025
Neg logit contributions
and-0.031
ud-0.031
rum-0.029
Jo-0.028
jo-0.028
Token ground
Token ID2323
Feature activation+1.18e+00

Pos logit contributions
+0.040
 hotter+0.032
idents+0.032
Sorry+0.031
 cooler+0.030
Neg logit contributions
ud-0.042
and-0.041
Jo-0.040
jo-0.039
rum-0.039
Token with
Token ID351
Feature activation+1.80e-01

Pos logit contributions
Jo+0.123
jo+0.118
rum+0.117
ud+0.117
and+0.115
Neg logit contributions
-0.086
 hotter-0.067
 cooler-0.066
 surely-0.058
idents-0.057
Token a
Token ID257
Feature activation+1.41e-01

Pos logit contributions
and+0.015
ud+0.015
Jo+0.015
jo+0.014
rum+0.014
Neg logit contributions
-0.017
 hotter-0.014
idents-0.013
 cooler-0.013
Sorry-0.013
Token loud
Token ID7812
Feature activation+4.10e-01

Pos logit contributions
ud+0.012
and+0.012
Jo+0.012
rum+0.011
jo+0.011
Neg logit contributions
-0.013
idents-0.010
 hotter-0.010
Sorry-0.010
 cooler-0.009
Token th
Token ID294
Feature activation+3.29e+00

Pos logit contributions
and+0.040
ud+0.039
Jo+0.035
old+0.035
jo+0.034
Neg logit contributions
-0.034
 hotter-0.027
idents-0.027
 cooler-0.026
Sorry-0.025
Tokenud
Token ID463
Feature activation+9.72e+00

Pos logit contributions
and+0.096
 Pen+0.082
Jo+0.081
 Jacob+0.078
 Ali+0.076
Neg logit contributions
-0.484
 hotter-0.403
Sorry-0.394
 cooler-0.383
 unst-0.380
Token.
Token ID13
Feature activation-6.10e-02

Pos logit contributions
and+0.998
Jo+0.950
ud+0.923
old+0.918
jo+0.905
Neg logit contributions
-0.711
 hotter-0.538
 unst-0.494
Sorry-0.492
 cooler-0.467
Token His
Token ID2399
Feature activation+3.70e-01

Pos logit contributions
+0.006
idents+0.005
 hotter+0.004
Sorry+0.004
cakes+0.004
Neg logit contributions
ud-0.005
and-0.005
Jo-0.005
rum-0.005
jo-0.005
Token bike
Token ID7161
Feature activation+1.21e+00

Pos logit contributions
ud+0.031
and+0.030
Jo+0.030
rum+0.028
jo+0.028
Neg logit contributions
-0.035
idents-0.029
 hotter-0.028
Sorry-0.028
 cooler-0.027
Token was
Token ID373
Feature activation+4.02e-01

Pos logit contributions
ud+0.102
and+0.100
Jo+0.098
rum+0.094
jo+0.094
Neg logit contributions
-0.113
idents-0.092
 hotter-0.091
Sorry-0.089
 cooler-0.085
Token bent
Token ID17157
Feature activation-1.13e+00

Pos logit contributions
ud+0.038
and+0.038
Jo+0.037
rum+0.036
jo+0.036
Neg logit contributions
-0.033
idents-0.026
 hotter-0.025
Sorry-0.025
 cooler-0.023
Token to
Token ID284
Feature activation+7.50e-01

Pos logit contributions
+0.236
 unst+0.185
 cooler+0.184
 hotter+0.183
idents+0.179
Neg logit contributions
ud-0.285
and-0.270
jo-0.267
Jo-0.263
rum-0.257
Token cry
Token ID3960
Feature activation+3.38e+00

Pos logit contributions
ud+0.070
and+0.070
Jo+0.064
jo+0.063
rum+0.061
Neg logit contributions
-0.064
idents-0.052
 cooler-0.052
 hotter-0.051
 unst-0.049
Token.
Token ID13
Feature activation-6.13e-02

Pos logit contributions
ud+0.294
and+0.286
Jo+0.284
rum+0.270
jo+0.270
Neg logit contributions
-0.303
idents-0.246
 hotter-0.245
Sorry-0.236
 cooler-0.228
Token He
Token ID679
Feature activation-4.67e-01

Pos logit contributions
+0.005
idents+0.004
 hotter+0.004
Sorry+0.004
 cooler+0.003
Neg logit contributions
ud-0.006
and-0.006
Jo-0.006
rum-0.006
jo-0.005
Token tr
Token ID491
Feature activation+2.03e+00

Pos logit contributions
+0.038
idents+0.031
 hotter+0.031
Sorry+0.030
 cooler+0.029
Neg logit contributions
and-0.044
ud-0.043
Jo-0.041
rum-0.041
jo-0.039
Tokenud
Token ID463
Feature activation+9.72e+00

Pos logit contributions
and+0.064
 Pen+0.058
Jo+0.054
 Mae+0.053
weet+0.049
Neg logit contributions
-0.304
idents-0.247
Sorry-0.246
 hotter-0.235
cakes-0.235
Tokenged
Token ID2004
Feature activation+1.52e+00

Pos logit contributions
and+1.071
weet+1.003
Jo+0.999
 languages+0.997
ud+0.989
Neg logit contributions
-0.604
Sorry-0.424
 hotter-0.399
 unst-0.386
bled-0.383
Token back
Token ID736
Feature activation-3.21e+00

Pos logit contributions
ud+0.165
Jo+0.160
and+0.157
jo+0.155
rum+0.153
Neg logit contributions
-0.095
 hotter-0.077
idents-0.075
 unst-0.074
Sorry-0.073
Token home
Token ID1363
Feature activation-1.25e-01

Pos logit contributions
+0.272
idents+0.212
 cooler+0.207
 hotter+0.207
 unst+0.199
Neg logit contributions
ud-0.294
Jo-0.282
and-0.281
rum-0.273
jo-0.266
Token,
Token ID11
Feature activation+2.42e-01

Pos logit contributions
+0.011
idents+0.008
 hotter+0.008
 unst+0.008
 cooler+0.008
Neg logit contributions
ud-0.012
Jo-0.011
and-0.011
jo-0.011
weet-0.010
Token carrying
Token ID6872
Feature activation+2.53e+00

Pos logit contributions
ud+0.023
and+0.022
Jo+0.021
jo+0.021
rum+0.021
Neg logit contributions
-0.020
 hotter-0.016
idents-0.016
Sorry-0.016
 cooler-0.015
Token a
Token ID257
Feature activation+1.58e-01

Pos logit contributions
ud+0.035
and+0.035
Jo+0.034
rum+0.034
jo+0.033
Neg logit contributions
-0.037
 hotter-0.030
idents-0.029
Sorry-0.028
 cooler-0.028
Token happy
Token ID3772
Feature activation+1.63e-01

Pos logit contributions
ud+0.010
Jo+0.010
and+0.009
 Pen+0.009
 Ali+0.009
Neg logit contributions
-0.018
idents-0.015
 hotter-0.015
cakes-0.014
Sorry-0.014
Token potato
Token ID21219
Feature activation+6.50e-02

Pos logit contributions
ud+0.014
and+0.014
Jo+0.014
rum+0.013
jo+0.013
Neg logit contributions
-0.015
idents-0.012
 hotter-0.012
Sorry-0.011
 cooler-0.011
Token named
Token ID3706
Feature activation+4.69e-01

Pos logit contributions
and+0.006
ud+0.006
Jo+0.006
rum+0.006
Jamie+0.006
Neg logit contributions
-0.005
 cooler-0.004
idents-0.004
cakes-0.003
Sorry-0.003
Token Sp
Token ID1338
Feature activation-1.91e+00

Pos logit contributions
ud+0.049
and+0.049
Jo+0.048
rum+0.046
jo+0.046
Neg logit contributions
-0.034
idents-0.026
 hotter-0.025
Sorry-0.024
 cooler-0.023
Tokenud
Token ID463
Feature activation+9.72e+00

Pos logit contributions
+0.242
cakes+0.201
idents+0.200
 hotter+0.193
Sorry+0.193
Neg logit contributions
Jo-0.080
 Pen-0.079
 Ali-0.079
 Mae-0.074
rum-0.074
Token.
Token ID13
Feature activation-6.84e-02

Pos logit contributions
and+1.042
Jo+0.969
ud+0.927
old+0.924
Jamie+0.920
Neg logit contributions
-0.694
idents-0.486
Sorry-0.472
 hotter-0.452
 cooler-0.438
Token Sp
Token ID1338
Feature activation-1.93e+00

Pos logit contributions
+0.005
idents+0.004
 hotter+0.004
Sorry+0.004
 cooler+0.004
Neg logit contributions
ud-0.007
and-0.007
Jo-0.006
jo-0.006
rum-0.006
Tokenud
Token ID463
Feature activation+9.71e+00

Pos logit contributions
+0.188
idents+0.152
cakes+0.144
 hotter+0.141
Sorry+0.140
Neg logit contributions
 Pen-0.133
Jo-0.131
 Ali-0.131
 Mae-0.130
 English-0.127
Token was
Token ID373
Feature activation+4.14e-01

Pos logit contributions
and+0.857
weet+0.759
old+0.755
 languages+0.745
Jo+0.731
Neg logit contributions
-0.855
Sorry-0.654
idents-0.634
uc-0.610
 hotter-0.604
Token very
Token ID845
Feature activation-8.83e-01

Pos logit contributions
ud+0.042
and+0.041
Jo+0.040
jo+0.039
rum+0.039
Neg logit contributions
-0.032
 hotter-0.026
 cooler-0.026
idents-0.024
 unst-0.023
Token the
Token ID262
Feature activation-4.50e-01

Pos logit contributions
+0.085
idents+0.068
 hotter+0.068
Sorry+0.065
 cooler+0.064
Neg logit contributions
ud-0.078
and-0.077
Jo-0.074
rum-0.073
jo-0.072
Token floor
Token ID4314
Feature activation+3.58e+00

Pos logit contributions
+0.041
 hotter+0.035
idents+0.034
 cooler+0.033
cakes+0.032
Neg logit contributions
ud-0.038
rum-0.036
Jo-0.036
and-0.035
jo-0.035
Token with
Token ID351
Feature activation+1.79e-01

Pos logit contributions
rum+0.373
jo+0.371
Jo+0.370
and+0.357
ud+0.354
Neg logit contributions
-0.265
 hotter-0.184
 cooler-0.181
 surely-0.166
 finally-0.166
Token a
Token ID257
Feature activation+1.52e-01

Pos logit contributions
ud+0.015
and+0.015
Jo+0.014
jo+0.014
rum+0.014
Neg logit contributions
-0.017
 hotter-0.014
idents-0.014
Sorry-0.013
 cooler-0.013
Token th
Token ID294
Feature activation+3.30e+00

Pos logit contributions
ud+0.013
and+0.013
Jo+0.013
rum+0.012
jo+0.012
Neg logit contributions
-0.013
idents-0.011
 hotter-0.011
Sorry-0.010
cakes-0.010
Tokenud
Token ID463
Feature activation+9.71e+00

Pos logit contributions
and+0.102
 Jacob+0.097
 Pen+0.097
 Ali+0.091
Jo+0.088
Neg logit contributions
-0.471
 hotter-0.381
Sorry-0.371
cakes-0.363
 unst-0.361
Token.
Token ID13
Feature activation-5.53e-02

Pos logit contributions
and+0.948
old+0.879
Jo+0.876
 languages+0.848
ud+0.846
Neg logit contributions
-0.773
 hotter-0.583
 unst-0.543
Sorry-0.536
idents-0.506
Token She
Token ID1375
Feature activation+3.92e-01

Pos logit contributions
+0.005
idents+0.004
 hotter+0.004
cakes+0.003
Sorry+0.003
Neg logit contributions
ud-0.005
rum-0.005
and-0.005
Jo-0.005
jo-0.005
Token felt
Token ID2936
Feature activation-5.05e-01

Pos logit contributions
and+0.041
rum+0.041
ud+0.040
jo+0.039
Jo+0.038
Neg logit contributions
-0.027
Sorry-0.021
 cooler-0.020
 hotter-0.020
idents-0.020
Token a
Token ID257
Feature activation+1.53e-01

Pos logit contributions
+0.044
idents+0.036
 hotter+0.035
Sorry+0.034
 cooler+0.032
Neg logit contributions
ud-0.045
and-0.044
Jo-0.044
rum-0.042
jo-0.042
Token pain
Token ID2356
Feature activation+2.35e+00

Pos logit contributions
ud+0.013
and+0.012
Jo+0.012
rum+0.012
jo+0.011
Neg logit contributions
-0.014
idents-0.012
 hotter-0.012
Sorry-0.011
cakes-0.011
Token ground
Token ID2323
Feature activation+1.19e+00

Pos logit contributions
+0.042
idents+0.035
 hotter+0.035
Sorry+0.033
 cooler+0.033
Neg logit contributions
ud-0.039
and-0.038
Jo-0.038
rum-0.037
jo-0.036
Token with
Token ID351
Feature activation+1.80e-01

Pos logit contributions
Jo+0.132
jo+0.124
ud+0.122
and+0.120
rum+0.120
Neg logit contributions
-0.081
 hotter-0.059
 cooler-0.059
idents-0.052
 surely-0.050
Token a
Token ID257
Feature activation+1.53e-01

Pos logit contributions
ud+0.014
and+0.014
Jo+0.013
rum+0.013
jo+0.013
Neg logit contributions
-0.018
idents-0.015
 hotter-0.015
Sorry-0.015
 cooler-0.014
Token loud
Token ID7812
Feature activation+4.05e-01

Pos logit contributions
ud+0.013
and+0.013
Jo+0.013
jo+0.012
rum+0.012
Neg logit contributions
-0.014
idents-0.011
 hotter-0.011
Sorry-0.011
 cooler-0.010
Token th
Token ID294
Feature activation+3.30e+00

Pos logit contributions
and+0.040
ud+0.038
Jo+0.036
old+0.035
jo+0.034
Neg logit contributions
-0.033
 hotter-0.026
idents-0.026
 cooler-0.025
Sorry-0.024
Tokenud
Token ID463
Feature activation+9.71e+00

Pos logit contributions
and+0.110
Jo+0.100
 Pen+0.100
 Jacob+0.098
den+0.093
Neg logit contributions
-0.462
 hotter-0.385
Sorry-0.378
 unst-0.362
 cooler-0.360
Token.
Token ID13
Feature activation-5.53e-02

Pos logit contributions
and+1.002
Jo+0.967
old+0.948
 languages+0.922
jo+0.909
Neg logit contributions
-0.683
 hotter-0.508
 unst-0.480
Sorry-0.453
idents-0.431
Token He
Token ID679
Feature activation-4.60e-01

Pos logit contributions
+0.005
idents+0.004
 hotter+0.004
Sorry+0.003
 cooler+0.003
Neg logit contributions
ud-0.005
and-0.005
Jo-0.005
jo-0.005
rum-0.005
Token hurt
Token ID5938
Feature activation-1.19e+00

Pos logit contributions
+0.040
idents+0.033
 hotter+0.032
Sorry+0.032
 cooler+0.030
Neg logit contributions
and-0.041
ud-0.040
Jo-0.039
rum-0.038
jo-0.038
Token his
Token ID465
Feature activation+1.24e+00

Pos logit contributions
+0.082
 hotter+0.075
 cooler+0.068
Sorry+0.067
idents+0.064
Neg logit contributions
Jo-0.115
jo-0.114
and-0.114
ud-0.111
old-0.110
Token arm
Token ID3211
Feature activation+4.18e+00

Pos logit contributions
Jo+0.104
and+0.103
ud+0.102
jo+0.097
old+0.094
Neg logit contributions
-0.117
idents-0.098
 hotter-0.094
 cooler-0.094
 unst-0.089
Token sad
Token ID6507
Feature activation-4.87e-01

Pos logit contributions
ud+0.078
and+0.077
Jo+0.076
rum+0.073
jo+0.073
Neg logit contributions
-0.067
idents-0.052
 hotter-0.052
Sorry-0.050
cakes-0.047
Token.
Token ID13
Feature activation-5.89e-02

Pos logit contributions
+0.040
idents+0.033
 hotter+0.031
 cooler+0.030
Sorry+0.029
Neg logit contributions
ud-0.046
Jo-0.045
and-0.045
rum-0.043
jo-0.043
Token\n
Token ID198
Feature activation-1.15e-01

Pos logit contributions
+0.005
idents+0.004
 hotter+0.004
Sorry+0.004
 cooler+0.003
Neg logit contributions
ud-0.005
and-0.005
Jo-0.005
rum-0.005
jo-0.005
Token\n
Token ID198
Feature activation-1.18e-01

Pos logit contributions
+0.009
Sorry+0.007
 hotter+0.006
idents+0.006
 cooler+0.006
Neg logit contributions
ud-0.011
and-0.011
rum-0.011
jo-0.010
Jo-0.010
TokenTr
Token ID2898
Feature activation+8.64e-01

Pos logit contributions
+0.010
Sorry+0.008
 hotter+0.008
idents+0.008
 cooler+0.007
Neg logit contributions
ud-0.011
and-0.010
jo-0.010
Jo-0.010
rum-0.010
Tokenud
Token ID463
Feature activation+9.71e+00

Pos logit contributions
and+0.029
 villains+0.026
weet+0.026
Jo+0.025
 language+0.021
Neg logit contributions
-0.120
 hotter-0.099
idents-0.095
uddle-0.093
Sorry-0.092
Tokeny
Token ID88
Feature activation+4.90e+00

Pos logit contributions
and+0.731
weet+0.655
 villains+0.646
 languages+0.638
Jo+0.634
Neg logit contributions
-0.961
Sorry-0.746
 hotter-0.716
bled-0.696
uc-0.686
Token said
Token ID531
Feature activation-5.67e-01

Pos logit contributions
and+0.427
rum+0.416
ud+0.416
Jo+0.391
weet+0.385
Neg logit contributions
-0.422
idents-0.351
Sorry-0.339
 cooler-0.334
 hotter-0.320
Token,
Token ID11
Feature activation+2.45e-01

Pos logit contributions
+0.049
idents+0.040
Sorry+0.040
 hotter+0.040
 cooler+0.037
Neg logit contributions
ud-0.053
and-0.049
Jo-0.048
jo-0.047
rum-0.046
Token "
Token ID366
Feature activation+8.29e-01

Pos logit contributions
ud+0.018
and+0.018
Jo+0.017
rum+0.016
jo+0.016
Neg logit contributions
-0.025
idents-0.021
 hotter-0.021
Sorry-0.021
 cooler-0.020
TokenI
Token ID40
Feature activation-6.10e-01

Pos logit contributions
ud+0.078
and+0.077
Jo+0.077
jo+0.072
rum+0.072
Neg logit contributions
-0.070
idents-0.056
 hotter-0.054
 cooler-0.051
Sorry-0.051
Token not
Token ID407
Feature activation-5.33e-02

Pos logit contributions
+0.094
 hotter+0.081
Sorry+0.080
idents+0.079
cakes+0.078
Neg logit contributions
Jo-0.054
and-0.054
ud-0.053
 Ali-0.050
rum-0.049
Token find
Token ID1064
Feature activation-4.50e-02

Pos logit contributions
+0.005
 hotter+0.004
idents+0.004
Sorry+0.004
cakes+0.004
Neg logit contributions
and-0.004
ud-0.004
Jo-0.004
jo-0.004
rum-0.004
Token it
Token ID340
Feature activation+1.48e+00

Pos logit contributions
+0.003
 cooler+0.003
 hotter+0.003
idents+0.003
 unst+0.003
Neg logit contributions
and-0.005
ud-0.004
Jo-0.004
jo-0.004
rum-0.004
Token.
Token ID13
Feature activation-5.89e-02

Pos logit contributions
ud+0.133
Jo+0.129
and+0.127
jo+0.122
rum+0.122
Neg logit contributions
-0.128
idents-0.104
 hotter-0.103
 cooler-0.099
Sorry-0.097
Token Tr
Token ID833
Feature activation+1.23e+00

Pos logit contributions
+0.005
idents+0.004
 hotter+0.004
Sorry+0.004
 cooler+0.004
Neg logit contributions
ud-0.005
and-0.005
Jo-0.005
rum-0.005
jo-0.005
Tokenud
Token ID463
Feature activation+9.71e+00

Pos logit contributions
and+0.049
 villains+0.044
Jo+0.038
weet+0.038
 language+0.034
Neg logit contributions
-0.172
idents-0.141
 hotter-0.135
uddle-0.131
uc-0.130
Tokeny
Token ID88
Feature activation+4.90e+00

Pos logit contributions
and+0.712
weet+0.631
Jo+0.626
 villains+0.622
rum+0.609
Neg logit contributions
-0.999
Sorry-0.764
idents-0.748
 hotter-0.737
uc-0.721
Token felt
Token ID2936
Feature activation-5.02e-01

Pos logit contributions
and+0.438
rum+0.436
ud+0.430
weet+0.407
Jo+0.405
Neg logit contributions
-0.413
idents-0.345
Sorry-0.323
 cooler-0.320
 hotter-0.309
Token so
Token ID523
Feature activation+8.18e-01

Pos logit contributions
+0.043
 hotter+0.034
idents+0.034
Sorry+0.033
 cooler+0.032
Neg logit contributions
ud-0.046
and-0.045
Jo-0.045
jo-0.043
rum-0.043
Token sad
Token ID6507
Feature activation-4.87e-01

Pos logit contributions
ud+0.078
and+0.077
Jo+0.076
rum+0.073
jo+0.073
Neg logit contributions
-0.067
idents-0.052
 hotter-0.052
Sorry-0.050
cakes-0.047
Token.
Token ID13
Feature activation-5.89e-02

Pos logit contributions
+0.040
idents+0.033
 hotter+0.031
 cooler+0.030
Sorry+0.029
Neg logit contributions
ud-0.046
Jo-0.045
and-0.045
rum-0.043
jo-0.043
Token back
Token ID736
Feature activation-3.19e+00

Pos logit contributions
ud+0.132
and+0.130
Jo+0.128
rum+0.123
jo+0.123
Neg logit contributions
-0.129
idents-0.103
 hotter-0.102
Sorry-0.099
 cooler-0.094
Token.
Token ID13
Feature activation-6.35e-02

Pos logit contributions
+0.264
idents+0.211
 hotter+0.207
 cooler+0.205
 unst+0.200
Neg logit contributions
ud-0.295
Jo-0.293
and-0.276
jo-0.272
 Ali-0.268
Token He
Token ID679
Feature activation-4.58e-01

Pos logit contributions
+0.005
idents+0.004
Sorry+0.004
 hotter+0.004
 cooler+0.004
Neg logit contributions
ud-0.006
and-0.006
Jo-0.006
rum-0.006
jo-0.006
Token sadly
Token ID21098
Feature activation+3.94e-01

Pos logit contributions
+0.037
idents+0.029
Sorry+0.028
 hotter+0.028
 cooler+0.028
Neg logit contributions
and-0.044
ud-0.043
rum-0.041
Jo-0.041
jo-0.040
Token tr
Token ID491
Feature activation+2.02e+00

Pos logit contributions
ud+0.040
and+0.040
Jo+0.038
rum+0.037
jo+0.037
Neg logit contributions
-0.030
idents-0.023
 hotter-0.023
Sorry-0.023
 cooler-0.021
Tokenud
Token ID463
Feature activation+9.71e+00

Pos logit contributions
and+0.073
 Pen+0.068
Jo+0.067
 Mae+0.062
weet+0.061
Neg logit contributions
-0.294
idents-0.238
Sorry-0.236
 hotter-0.227
cakes-0.224
Tokenged
Token ID2004
Feature activation+1.52e+00

Pos logit contributions
and+0.997
Jo+0.955
ud+0.929
weet+0.926
jo+0.924
Neg logit contributions
-0.689
Sorry-0.491
 hotter-0.486
 unst-0.477
idents-0.459
Token home
Token ID1363
Feature activation-1.14e-01

Pos logit contributions
ud+0.162
Jo+0.159
jo+0.155
and+0.155
rum+0.151
Neg logit contributions
-0.098
 hotter-0.078
 unst-0.078
idents-0.076
Sorry-0.073
Token,
Token ID11
Feature activation+2.37e-01

Pos logit contributions
+0.010
idents+0.008
 hotter+0.008
 unst+0.007
 cooler+0.007
Neg logit contributions
ud-0.011
Jo-0.010
and-0.010
jo-0.010
rum-0.010
Token without
Token ID1231
Feature activation+9.69e-01

Pos logit contributions
ud+0.022
and+0.021
Jo+0.021
jo+0.020
rum+0.020
Neg logit contributions
-0.021
 hotter-0.016
idents-0.016
 cooler-0.015
Sorry-0.015
Token his
Token ID465
Feature activation+1.24e+00

Pos logit contributions
ud+0.093
and+0.093
jo+0.093
Jo+0.089
rum+0.084
Neg logit contributions
-0.075
 hotter-0.065
 unst-0.060
 cooler-0.059
 ang-0.056
Token it
Token ID340
Feature activation+1.09e+00

Pos logit contributions
+0.085
idents+0.068
Sorry+0.065
 hotter+0.065
 cooler+0.065
Neg logit contributions
ud-0.101
and-0.094
Jo-0.092
rum-0.088
jo-0.087
Token was
Token ID373
Feature activation+1.19e-01

Pos logit contributions
ud+0.103
and+0.097
Jo+0.095
rum+0.094
jo+0.088
Neg logit contributions
-0.089
idents-0.077
 hotter-0.072
 cooler-0.071
Sorry-0.070
Token the
Token ID262
Feature activation-7.17e-01

Pos logit contributions
ud+0.010
and+0.010
Jo+0.010
jo+0.010
rum+0.010
Neg logit contributions
-0.011
 hotter-0.009
idents-0.009
Sorry-0.008
 cooler-0.008
Token best
Token ID1266
Feature activation+1.63e-02

Pos logit contributions
+0.060
idents+0.048
 hotter+0.048
Sorry+0.046
 cooler+0.044
Neg logit contributions
ud-0.066
and-0.066
Jo-0.064
rum-0.062
jo-0.062
Token cup
Token ID6508
Feature activation-9.96e-01

Pos logit contributions
ud+0.002
rum+0.001
jo+0.001
Jo+0.001
and+0.001
Neg logit contributions
-0.001
 cooler-0.001
idents-0.001
 reward-0.001
cakes-0.001
Tokencakes
Token ID37263
Feature activation-8.53e+00

Pos logit contributions
+0.094
idents+0.075
cakes+0.075
 hotter+0.073
 cooler+0.072
Neg logit contributions
ud-0.083
Jo-0.080
and-0.079
rum-0.078
jo-0.076
Token she
Token ID673
Feature activation-5.82e-01

Pos logit contributions
+0.766
cakes+0.598
idents+0.596
 cooler+0.578
 hotter+0.574
Neg logit contributions
ud-0.768
Jo-0.730
rum-0.729
and-0.718
jo-0.714
Token ever
Token ID1683
Feature activation+1.22e+00

Pos logit contributions
+0.054
idents+0.044
Sorry+0.043
 hotter+0.042
 cooler+0.041
Neg logit contributions
ud-0.050
and-0.048
Jo-0.047
rum-0.046
jo-0.044
Token had
Token ID550
Feature activation-7.69e-01

Pos logit contributions
ud+0.092
and+0.091
Jo+0.089
jo+0.085
rum+0.084
Neg logit contributions
-0.125
idents-0.103
 hotter-0.102
Sorry-0.099
cakes-0.095
Token.
Token ID13
Feature activation-1.63e-01

Pos logit contributions
+0.067
idents+0.051
Sorry+0.051
 cooler+0.049
cakes+0.049
Neg logit contributions
ud-0.069
Jo-0.067
and-0.066
rum-0.066
jo-0.065
Token
Token ID220
Feature activation+1.15e+00

Pos logit contributions
+0.014
idents+0.011
Sorry+0.011
 hotter+0.010
cakes+0.010
Neg logit contributions
ud-0.015
and-0.014
rum-0.014
Jo-0.014
jo-0.013
Token a
Token ID257
Feature activation-2.46e-01

Pos logit contributions
+0.046
 hotter+0.037
idents+0.037
Sorry+0.036
 cooler+0.035
Neg logit contributions
ud-0.047
and-0.046
Jo-0.045
jo-0.044
rum-0.043
Token bite
Token ID13197
Feature activation-3.18e-01

Pos logit contributions
+0.021
idents+0.017
 hotter+0.016
Sorry+0.016
 cooler+0.015
Neg logit contributions
ud-0.023
and-0.022
Jo-0.022
rum-0.021
jo-0.021
Token.
Token ID13
Feature activation-3.26e-01

Pos logit contributions
+0.023
 hotter+0.019
cakes+0.019
 cooler+0.019
idents+0.018
Neg logit contributions
Jo-0.031
ud-0.029
jo-0.029
and-0.028
Jamie-0.028
Token The
Token ID383
Feature activation-2.17e+00

Pos logit contributions
+0.029
idents+0.022
Sorry+0.021
 hotter+0.021
 cooler+0.020
Neg logit contributions
ud-0.030
and-0.029
Jo-0.029
rum-0.028
jo-0.027
Token cup
Token ID6508
Feature activation-9.96e-01

Pos logit contributions
+0.201
idents+0.166
 hotter+0.160
Sorry+0.159
cakes+0.149
Neg logit contributions
ud-0.176
and-0.172
Jo-0.169
jo-0.161
rum-0.160
Tokencakes
Token ID37263
Feature activation-8.53e+00

Pos logit contributions
+0.086
idents+0.070
 hotter+0.069
Sorry+0.067
 cooler+0.063
Neg logit contributions
ud-0.089
and-0.086
Jo-0.086
jo-0.081
rum-0.081
Token were
Token ID547
Feature activation+2.84e-01

Pos logit contributions
+0.770
idents+0.646
 hotter+0.623
Sorry+0.610
 cooler+0.572
Neg logit contributions
ud-0.710
and-0.692
Jo-0.686
 Pen-0.642
rum-0.640
Token so
Token ID523
Feature activation+6.79e-01

Pos logit contributions
ud+0.023
and+0.023
Jo+0.022
rum+0.021
 Pen+0.021
Neg logit contributions
-0.027
idents-0.022
Sorry-0.021
 hotter-0.020
cakes-0.019
Token tasty
Token ID25103
Feature activation-3.05e+00

Pos logit contributions
ud+0.064
and+0.064
Jo+0.063
rum+0.060
jo+0.059
Neg logit contributions
-0.055
idents-0.043
 hotter-0.042
Sorry-0.041
 cooler-0.038
Token!
Token ID0
Feature activation-8.63e-01

Pos logit contributions
+0.267
idents+0.213
 hotter+0.204
 cooler+0.200
cakes+0.199
Neg logit contributions
ud-0.278
and-0.272
Jo-0.272
jo-0.266
rum-0.260
Token Lily
Token ID20037
Feature activation-1.15e+00

Pos logit contributions
+0.078
idents+0.061
 hotter+0.060
Sorry+0.059
 cooler+0.056
Neg logit contributions
ud-0.076
and-0.075
Jo-0.073
rum-0.072
jo-0.070
Token big
Token ID1263
Feature activation-4.40e-01

Pos logit contributions
+0.021
 hotter+0.016
 cooler+0.016
idents+0.016
Sorry+0.016
Neg logit contributions
ud-0.023
and-0.023
Jo-0.022
rum-0.022
jo-0.021
Token blanket
Token ID18447
Feature activation+1.88e+00

Pos logit contributions
+0.037
 hotter+0.029
 cooler+0.028
idents+0.028
Sorry+0.026
Neg logit contributions
and-0.043
ud-0.042
Jo-0.041
rum-0.039
jo-0.039
Token and
Token ID290
Feature activation-5.76e-01

Pos logit contributions
and+0.165
ud+0.165
Jo+0.160
rum+0.153
jo+0.151
Neg logit contributions
-0.170
 hotter-0.133
idents-0.132
 cooler-0.128
Sorry-0.127
Token ate
Token ID15063
Feature activation-1.49e+00

Pos logit contributions
+0.049
 hotter+0.040
idents+0.040
Sorry+0.039
 cooler+0.038
Neg logit contributions
ud-0.054
and-0.052
Jo-0.050
rum-0.049
jo-0.049
Token cup
Token ID6508
Feature activation-9.72e-01

Pos logit contributions
+0.141
idents+0.116
 hotter+0.116
Sorry+0.111
 cooler+0.109
Neg logit contributions
ud-0.122
and-0.120
Jo-0.118
jo-0.114
rum-0.113
Tokencakes
Token ID37263
Feature activation-8.52e+00

Pos logit contributions
+0.042
Sorry+0.028
idents+0.028
 hotter+0.026
 surely+0.019
Neg logit contributions
and-0.128
ud-0.127
Jo-0.123
 Pen-0.118
rum-0.117
Token and
Token ID290
Feature activation-4.89e-01

Pos logit contributions
+0.868
idents+0.736
Sorry+0.722
 hotter+0.717
 unst+0.655
Neg logit contributions
and-0.634
ud-0.617
Jo-0.595
 Pen-0.555
rum-0.554
Token sandwiches
Token ID29777
Feature activation-2.27e+00

Pos logit contributions
+0.034
 hotter+0.027
idents+0.026
 cooler+0.025
Sorry+0.025
Neg logit contributions
ud-0.053
and-0.052
Jo-0.050
rum-0.050
jo-0.050
Token.
Token ID13
Feature activation-1.63e-01

Pos logit contributions
+0.197
idents+0.153
 hotter+0.152
cakes+0.151
 cooler+0.151
Neg logit contributions
ud-0.207
Jo-0.203
and-0.200
jo-0.195
rum-0.193
Token When
Token ID1649
Feature activation-3.68e+00

Pos logit contributions
+0.014
idents+0.010
Sorry+0.010
 hotter+0.010
 cooler+0.010
Neg logit contributions
ud-0.016
and-0.015
Jo-0.015
rum-0.015
jo-0.014
Token the
Token ID262
Feature activation-5.36e-01

Pos logit contributions
+0.350
idents+0.290
 hotter+0.280
Sorry+0.277
 unst+0.262
Neg logit contributions
Jo-0.284
ud-0.279
 Ali-0.276
 Pen-0.273
 Mae-0.272
Token\n
Token ID198
Feature activation-2.26e-01

Pos logit contributions
+0.017
idents+0.014
 hotter+0.014
Sorry+0.014
 cooler+0.013
Neg logit contributions
and-0.015
ud-0.015
Jo-0.015
jo-0.014
rum-0.014
Token\n
Token ID198
Feature activation-2.22e-01

Pos logit contributions
+0.016
Sorry+0.010
idents+0.010
cakes+0.010
 hotter+0.010
Neg logit contributions
ud-0.024
rum-0.023
and-0.022
jo-0.022
Jo-0.022
TokenWhen
Token ID2215
Feature activation-4.14e+00

Pos logit contributions
+0.020
idents+0.016
 hotter+0.016
 cooler+0.015
Sorry+0.015
Neg logit contributions
Jo-0.020
and-0.019
ud-0.019
rum-0.018
jo-0.018
Token the
Token ID262
Feature activation-5.68e-01

Pos logit contributions
+0.473
idents+0.419
uddle+0.406
Sorry+0.403
cakes+0.395
Neg logit contributions
Jo-0.247
 Jess-0.226
 Ali-0.222
 Pen-0.221
 Mary-0.221
Token cup
Token ID6508
Feature activation-9.36e-01

Pos logit contributions
+0.049
idents+0.043
 hotter+0.043
Sorry+0.042
 unst+0.042
Neg logit contributions
Jo-0.042
and-0.042
 Mae-0.041
 living-0.041
 Pen-0.041
Tokencakes
Token ID37263
Feature activation-8.50e+00

Pos logit contributions
+0.057
idents+0.048
 hotter+0.045
Sorry+0.044
 unst+0.042
Neg logit contributions
Jo-0.101
ud-0.101
and-0.097
 Pen-0.095
 Ali-0.094
Token were
Token ID547
Feature activation+3.08e-01

Pos logit contributions
+0.779
idents+0.750
Sorry+0.685
 hotter+0.679
 unst+0.664
Neg logit contributions
Jo-0.626
ud-0.612
 Pen-0.575
and-0.557
 Jacob-0.552
Token done
Token ID1760
Feature activation-2.42e+00

Pos logit contributions
ud+0.024
and+0.023
 about+0.023
 Pen+0.022
 Jacob+0.022
Neg logit contributions
-0.029
idents-0.025
Sorry-0.024
 hotter-0.020
 trigger-0.020
Token,
Token ID11
Feature activation+1.06e-01

Pos logit contributions
+0.258
Sorry+0.227
idents+0.226
 hotter+0.224
 cooler+0.209
Neg logit contributions
Jo-0.152
ud-0.151
and-0.149
rum-0.136
 Pen-0.135
Token Mill
Token ID9212
Feature activation-9.05e-01

Pos logit contributions
Jo+0.007
 Ali+0.006
 Pen+0.006
 Mary+0.006
 Jess+0.006
Neg logit contributions
-0.012
Sorry-0.010
idents-0.010
 hotter-0.010
uddle-0.009
Tokenie
Token ID494
Feature activation+2.68e-01

Pos logit contributions
+0.061
 hotter+0.052
idents+0.049
 cooler+0.047
Sorry+0.047
Neg logit contributions
and-0.090
rum-0.084
ud-0.084
Jo-0.081
 villains-0.081
Token\n
Token ID198
Feature activation-2.29e-01

Pos logit contributions
+0.015
idents+0.012
 hotter+0.012
Sorry+0.012
cakes+0.011
Neg logit contributions
ud-0.014
and-0.014
Jo-0.013
rum-0.013
jo-0.013
Token\n
Token ID198
Feature activation-2.21e-01

Pos logit contributions
+0.017
idents+0.012
Sorry+0.012
cakes+0.012
 hotter+0.012
Neg logit contributions
ud-0.023
rum-0.022
Jo-0.022
and-0.021
jo-0.021
TokenWhen
Token ID2215
Feature activation-4.14e+00

Pos logit contributions
+0.019
idents+0.016
 hotter+0.015
Sorry+0.015
 cooler+0.014
Neg logit contributions
ud-0.020
and-0.020
Jo-0.019
rum-0.018
jo-0.018
Token the
Token ID262
Feature activation-5.61e-01

Pos logit contributions
+0.438
idents+0.385
Sorry+0.377
 hotter+0.377
athed+0.368
Neg logit contributions
 Pen-0.227
ud-0.226
 Mae-0.224
 Ali-0.222
Jo-0.215
Token cup
Token ID6508
Feature activation-9.15e-01

Pos logit contributions
+0.053
 hotter+0.048
idents+0.047
 unst+0.046
Sorry+0.043
Neg logit contributions
Jo-0.038
 living-0.038
and-0.037
ud-0.035
 Pen-0.035
Tokencakes
Token ID37263
Feature activation-8.49e+00

Pos logit contributions
+0.056
idents+0.049
 hotter+0.048
 unst+0.046
Sorry+0.044
Neg logit contributions
Jo-0.095
ud-0.093
 Pen-0.090
 Jacob-0.089
 Ali-0.087
Token were
Token ID547
Feature activation+3.16e-01

Pos logit contributions
+0.724
idents+0.714
 hotter+0.658
 unst+0.627
Sorry+0.623
Neg logit contributions
Jo-0.634
ud-0.621
 Pen-0.603
 about-0.583
 Jacob-0.583
Token done
Token ID1760
Feature activation-2.42e+00

Pos logit contributions
 about+0.024
ud+0.024
and+0.023
 Pen+0.022
Jo+0.022
Neg logit contributions
-0.030
idents-0.027
Sorry-0.025
 hotter-0.023
 trigger-0.022
Token baking
Token ID16871
Feature activation-1.72e+00

Pos logit contributions
+0.268
Sorry+0.239
idents+0.238
 hotter+0.236
 cooler+0.219
Neg logit contributions
Jo-0.141
and-0.138
ud-0.137
 Pen-0.126
 Ali-0.125
Token,
Token ID11
Feature activation+1.16e-01

Pos logit contributions
+0.185
Sorry+0.160
idents+0.157
 hotter+0.156
 cooler+0.143
Neg logit contributions
and-0.109
ud-0.106
Jo-0.105
 Pen-0.098
 Ali-0.097
Token they
Token ID484
Feature activation-1.56e+00

Pos logit contributions
 Pen+0.006
 Mae+0.005
 Jacob+0.005
 Ali+0.005
 Anna+0.005
Neg logit contributions
-0.014
Sorry-0.011
idents-0.011
 unst-0.011
 hotter-0.011
Token help
Token ID1037
Feature activation-1.84e+00

Pos logit contributions
ud+0.061
and+0.061
Jo+0.060
rum+0.057
jo+0.057
Neg logit contributions
-0.052
idents-0.042
 hotter-0.041
Sorry-0.040
 cooler-0.037
Token her
Token ID607
Feature activation+6.19e-01

Pos logit contributions
+0.204
 hotter+0.171
idents+0.171
Sorry+0.167
 cooler+0.160
Neg logit contributions
and-0.123
ud-0.121
Jo-0.118
rum-0.110
jo-0.110
Token make
Token ID787
Feature activation-9.77e-02

Pos logit contributions
and+0.053
ud+0.051
Jo+0.051
rum+0.049
jo+0.049
Neg logit contributions
-0.057
 hotter-0.046
idents-0.046
Sorry-0.046
cakes-0.041
Token some
Token ID617
Feature activation-1.88e-01

Pos logit contributions
+0.009
idents+0.007
 hotter+0.007
Sorry+0.007
 cooler+0.007
Neg logit contributions
ud-0.008
and-0.008
Jo-0.008
rum-0.008
jo-0.008
Token cup
Token ID6508
Feature activation-9.15e-01

Pos logit contributions
+0.016
idents+0.013
 hotter+0.013
Sorry+0.012
cakes+0.011
Neg logit contributions
ud-0.017
and-0.017
Jo-0.017
rum-0.016
jo-0.016
Tokencakes
Token ID37263
Feature activation-8.49e+00

Pos logit contributions
+0.066
Sorry+0.052
 hotter+0.050
idents+0.050
 surely+0.041
Neg logit contributions
and-0.099
ud-0.089
Jo-0.088
old-0.085
jo-0.085
Token and
Token ID290
Feature activation-4.68e-01

Pos logit contributions
+1.002
Sorry+0.903
idents+0.857
 hotter+0.853
 trigger+0.766
Neg logit contributions
and-0.516
 about-0.449
Jo-0.408
ud-0.407
 Pen-0.406
Token put
Token ID1234
Feature activation+4.40e-01

Pos logit contributions
+0.040
idents+0.031
 hotter+0.031
Sorry+0.030
 cooler+0.028
Neg logit contributions
ud-0.043
and-0.042
Jo-0.042
rum-0.040
jo-0.040
Token up
Token ID510
Feature activation-1.43e+00

Pos logit contributions
ud+0.026
and+0.026
 Pen+0.025
 Mae+0.025
Jo+0.024
Neg logit contributions
-0.049
idents-0.043
Sorry-0.042
 hotter-0.038
 unst-0.036
Token some
Token ID617
Feature activation-2.03e-01

Pos logit contributions
+0.154
idents+0.134
Sorry+0.132
 hotter+0.125
 unst+0.116
Neg logit contributions
ud-0.094
and-0.094
Jo-0.092
 Pen-0.089
 Mae-0.087
Token decorations
Token ID39011
Feature activation-3.75e+00

Pos logit contributions
+0.019
idents+0.015
Sorry+0.015
 hotter+0.014
 unst+0.013
Neg logit contributions
and-0.018
ud-0.018
Jo-0.016
 Pen-0.016
rum-0.016
Token her
Token ID607
Feature activation+6.27e-01

Pos logit contributions
ud+0.060
and+0.057
Jo+0.055
rum+0.055
jo+0.055
Neg logit contributions
-0.051
idents-0.042
 hotter-0.042
Sorry-0.039
 cooler-0.038
Token mom
Token ID1995
Feature activation-1.82e+00

Pos logit contributions
ud+0.065
and+0.063
Jo+0.062
rum+0.060
jo+0.060
Neg logit contributions
-0.048
idents-0.037
 hotter-0.035
 cooler-0.034
Sorry-0.034
Token had
Token ID550
Feature activation-7.43e-01

Pos logit contributions
+0.148
idents+0.114
Sorry+0.110
 hotter+0.108
 cooler+0.105
Neg logit contributions
and-0.172
ud-0.172
rum-0.165
Jo-0.161
jo-0.156
Token made
Token ID925
Feature activation+6.46e-01

Pos logit contributions
+0.063
idents+0.051
 hotter+0.050
Sorry+0.049
 cooler+0.049
Neg logit contributions
ud-0.069
and-0.066
rum-0.064
jo-0.064
Jo-0.063
Token cup
Token ID6508
Feature activation-9.10e-01

Pos logit contributions
ud+0.055
and+0.053
rum+0.052
Jo+0.052
jo+0.050
Neg logit contributions
-0.058
idents-0.048
 hotter-0.048
 cooler-0.045
cakes-0.045
Tokencakes
Token ID37263
Feature activation-8.48e+00

Pos logit contributions
+0.063
idents+0.047
 hotter+0.046
Sorry+0.046
 cooler+0.040
Neg logit contributions
and-0.098
ud-0.096
Jo-0.096
jo-0.092
rum-0.091
Token!
Token ID0
Feature activation-8.64e-01

Pos logit contributions
+0.807
idents+0.661
 hotter+0.653
Sorry+0.642
 cooler+0.606
Neg logit contributions
ud-0.689
and-0.683
Jo-0.668
rum-0.635
jo-0.634
Token Bella
Token ID39304
Feature activation-6.86e-01

Pos logit contributions
+0.084
 hotter+0.072
idents+0.072
Sorry+0.070
 cooler+0.067
Neg logit contributions
and-0.066
ud-0.066
Jo-0.064
jo-0.061
 Ali-0.061
Token was
Token ID373
Feature activation+3.04e-01

Pos logit contributions
+0.054
idents+0.042
cakes+0.041
Sorry+0.040
 cooler+0.038
Neg logit contributions
and-0.067
ud-0.066
rum-0.063
weet-0.060
jo-0.058
Token so
Token ID523
Feature activation+7.12e-01

Pos logit contributions
ud+0.032
jo+0.030
and+0.030
rum+0.029
Jo+0.029
Neg logit contributions
-0.022
 hotter-0.019
 cooler-0.018
 unst-0.017
 finally-0.017
Token excited
Token ID6568
Feature activation+8.94e-01

Pos logit contributions
ud+0.066
and+0.063
Jo+0.062
jo+0.061
rum+0.060
Neg logit contributions
-0.062
 hotter-0.050
idents-0.049
Sorry-0.049
 cooler-0.046
Token Mom
Token ID11254
Feature activation-6.96e-01

Pos logit contributions
ud+0.013
and+0.013
Jo+0.012
rum+0.012
jo+0.012
Neg logit contributions
-0.013
Sorry-0.010
idents-0.010
 hotter-0.010
 cooler-0.010
Tokenmy
Token ID1820
Feature activation+1.36e-01

Pos logit contributions
+0.053
idents+0.042
 hotter+0.041
Sorry+0.040
 cooler+0.040
Neg logit contributions
and-0.069
ud-0.066
Jo-0.064
rum-0.063
jo-0.061
Token was
Token ID373
Feature activation+3.15e-01

Pos logit contributions
and+0.014
ud+0.013
Jo+0.013
rum+0.013
jo+0.012
Neg logit contributions
-0.010
idents-0.008
Sorry-0.008
 hotter-0.008
 cooler-0.007
Token baking
Token ID16871
Feature activation-1.67e+00

Pos logit contributions
ud+0.024
and+0.024
Jo+0.024
jo+0.023
rum+0.023
Neg logit contributions
-0.031
 hotter-0.026
idents-0.026
Sorry-0.025
 cooler-0.024
Token cup
Token ID6508
Feature activation-9.04e-01

Pos logit contributions
+0.161
idents+0.133
 hotter+0.131
Sorry+0.129
 cooler+0.121
Neg logit contributions
ud-0.134
and-0.132
Jo-0.129
rum-0.122
jo-0.121
Tokencakes
Token ID37263
Feature activation-8.47e+00

Pos logit contributions
+0.043
idents+0.028
 hotter+0.028
Sorry+0.026
 cooler+0.022
Neg logit contributions
and-0.116
ud-0.116
Jo-0.113
jo-0.109
rum-0.108
Token.
Token ID13
Feature activation-1.73e-01

Pos logit contributions
+0.856
idents+0.737
 hotter+0.728
Sorry+0.727
 cooler+0.666
Neg logit contributions
and-0.619
ud-0.603
Jo-0.585
 Pen-0.541
jo-0.537
Token She
Token ID1375
Feature activation+2.86e-01

Pos logit contributions
+0.017
idents+0.015
 hotter+0.015
Sorry+0.014
 unst+0.013
Neg logit contributions
and-0.012
ud-0.012
Jo-0.012
jo-0.011
 Pen-0.011
Token got
Token ID1392
Feature activation-2.33e+00

Pos logit contributions
and+0.027
ud+0.026
rum+0.026
Jo+0.025
jo+0.025
Neg logit contributions
-0.024
Sorry-0.019
idents-0.018
cakes-0.018
 hotter-0.018
Token out
Token ID503
Feature activation+1.26e+00

Pos logit contributions
+0.268
idents+0.215
Sorry+0.208
 hotter+0.194
 surely+0.185
Neg logit contributions
and-0.159
ud-0.147
Jo-0.136
 Pen-0.133
 Mae-0.133
Token the
Token ID262
Feature activation-5.56e-01

Pos logit contributions
ud+0.091
and+0.089
Jo+0.087
rum+0.082
jo+0.081
Neg logit contributions
-0.133
idents-0.112
Sorry-0.109
 hotter-0.108
 cooler-0.101
Token Grand
Token ID5675
Feature activation+3.37e-01

Pos logit contributions
+0.012
idents+0.010
 hotter+0.010
 cooler+0.009
Sorry+0.009
Neg logit contributions
ud-0.015
and-0.015
Jo-0.015
rum-0.015
jo-0.014
Tokenma
Token ID2611
Feature activation-6.28e-01

Pos logit contributions
ud+0.029
and+0.028
Jo+0.027
rum+0.026
 Pen+0.026
Neg logit contributions
-0.031
idents-0.025
cakes-0.024
Sorry-0.024
 hotter-0.024
Token was
Token ID373
Feature activation+3.15e-01

Pos logit contributions
+0.050
idents+0.038
 cooler+0.038
cakes+0.038
 hotter+0.037
Neg logit contributions
and-0.060
rum-0.059
ud-0.058
Jo-0.055
weet-0.053
Token making
Token ID1642
Feature activation+1.39e-01

Pos logit contributions
ud+0.029
and+0.028
rum+0.028
jo+0.028
Jo+0.027
Neg logit contributions
-0.026
 hotter-0.023
 cooler-0.022
Sorry-0.020
 unst-0.019
Token cup
Token ID6508
Feature activation-9.04e-01

Pos logit contributions
ud+0.013
rum+0.013
jo+0.013
and+0.013
Jo+0.012
Neg logit contributions
-0.011
 hotter-0.009
 cooler-0.009
idents-0.009
cakes-0.009
Tokencakes
Token ID37263
Feature activation-8.47e+00

Pos logit contributions
+0.033
Sorry+0.019
idents+0.019
 hotter+0.018
 unst+0.012
Neg logit contributions
and-0.126
ud-0.123
Jo-0.123
jo-0.117
 Ali-0.115
Token for
Token ID329
Feature activation-9.35e-01

Pos logit contributions
+0.908
Sorry+0.808
idents+0.789
 hotter+0.768
 unst+0.706
Neg logit contributions
and-0.575
Jo-0.530
ud-0.526
 Ali-0.482
 Mae-0.481
Token the
Token ID262
Feature activation-5.55e-01

Pos logit contributions
+0.082
idents+0.064
 hotter+0.064
Sorry+0.062
 cooler+0.061
Neg logit contributions
ud-0.084
and-0.082
jo-0.079
rum-0.079
Jo-0.079
Token family
Token ID1641
Feature activation+3.84e-02

Pos logit contributions
+0.051
idents+0.042
 hotter+0.041
Sorry+0.040
 cooler+0.039
Neg logit contributions
ud-0.047
and-0.046
Jo-0.045
rum-0.045
jo-0.044
Token.
Token ID13
Feature activation-1.70e-01

Pos logit contributions
ud+0.004
rum+0.004
Jo+0.003
and+0.003
jo+0.003
Neg logit contributions
-0.003
cakes-0.002
 cooler-0.002
idents-0.002
 hotter-0.002
Token Dave
Token ID9935
Feature activation+4.11e+00

Pos logit contributions
+0.015
idents+0.012
 hotter+0.012
Sorry+0.012
 cooler+0.011
Neg logit contributions
ud-0.015
and-0.015
Jo-0.015
rum-0.014
jo-0.014
Token\n
Token ID198
Feature activation-2.09e-01

Pos logit contributions
+0.017
Sorry+0.012
idents+0.012
 hotter+0.012
cakes+0.011
Neg logit contributions
ud-0.021
rum-0.021
Jo-0.020
and-0.020
jo-0.019
TokenFinally
Token ID11158
Feature activation-2.90e+00

Pos logit contributions
+0.018
idents+0.015
 hotter+0.015
Sorry+0.014
 cooler+0.014
Neg logit contributions
and-0.019
ud-0.019
Jo-0.018
jo-0.017
rum-0.017
Token,
Token ID11
Feature activation+1.42e-01

Pos logit contributions
+0.325
idents+0.278
 hotter+0.276
Sorry+0.272
cakes+0.260
Neg logit contributions
ud-0.179
Jo-0.173
and-0.165
 Ali-0.161
rum-0.160
Token the
Token ID262
Feature activation-5.40e-01

Pos logit contributions
 Mae+0.006
 Pen+0.006
 Jacob+0.006
 Anna+0.006
 Mary+0.006
Neg logit contributions
-0.017
idents-0.014
anks-0.014
cakes-0.014
Sorry-0.014
Token cup
Token ID6508
Feature activation-9.04e-01

Pos logit contributions
+0.051
idents+0.042
 hotter+0.040
 unst+0.039
Sorry+0.039
Neg logit contributions
Jo-0.042
and-0.041
ud-0.040
 Pen-0.039
 living-0.039
Tokencakes
Token ID37263
Feature activation-8.47e+00

Pos logit contributions
+0.048
idents+0.036
 hotter+0.035
 unst+0.033
Sorry+0.030
Neg logit contributions
ud-0.107
Jo-0.106
and-0.103
jo-0.100
 Pen-0.100
Token were
Token ID547
Feature activation+3.21e-01

Pos logit contributions
+0.828
idents+0.754
 hotter+0.712
 unst+0.690
Sorry+0.686
Neg logit contributions
ud-0.583
Jo-0.571
 Pen-0.521
and-0.516
 about-0.503
Token ready
Token ID3492
Feature activation-2.66e+00

Pos logit contributions
ud+0.025
and+0.025
 about+0.025
 Pen+0.024
Jo+0.024
Neg logit contributions
-0.030
idents-0.025
Sorry-0.024
 hotter-0.021
 unst-0.020
Token and
Token ID290
Feature activation-4.74e-01

Pos logit contributions
+0.271
idents+0.224
 hotter+0.223
Sorry+0.219
 cooler+0.208
Neg logit contributions
ud-0.199
and-0.197
Jo-0.192
rum-0.182
jo-0.181
Token they
Token ID484
Feature activation-1.54e+00

Pos logit contributions
+0.052
idents+0.040
Sorry+0.039
 hotter+0.037
cakes+0.035
Neg logit contributions
 Pen-0.031
and-0.030
 Ali-0.030
 Mae-0.030
Jo-0.030
Token all
Token ID477
Feature activation+1.01e+00

Pos logit contributions
+0.135
idents+0.108
 hotter+0.106
Sorry+0.101
 cooler+0.093
Neg logit contributions
ud-0.140
Jo-0.138
and-0.135
jo-0.129
 Pen-0.128
Token,
Token ID11
Feature activation+1.62e-01

Pos logit contributions
and+0.135
ud+0.134
jo+0.129
Jo+0.128
rum+0.127
Neg logit contributions
-0.123
 hotter-0.097
idents-0.097
 cooler-0.096
Sorry-0.090
Token Dave
Token ID9935
Feature activation+4.12e+00

Pos logit contributions
ud+0.012
and+0.011
Jo+0.011
rum+0.010
jo+0.010
Neg logit contributions
-0.017
idents-0.014
 hotter-0.014
Sorry-0.014
 cooler-0.013
Tokeny
Token ID88
Feature activation+4.81e+00

Pos logit contributions
and+0.225
ud+0.211
Jo+0.203
rum+0.199
 Pen+0.182
Neg logit contributions
-0.490
idents-0.417
 hotter-0.409
Sorry-0.407
cakes-0.401
Token's
Token ID338
Feature activation+2.48e+00

Pos logit contributions
and+0.441
ud+0.436
rum+0.419
Jo+0.410
jo+0.395
Neg logit contributions
-0.407
idents-0.323
Sorry-0.311
 hotter-0.308
cakes-0.304
Token cup
Token ID6508
Feature activation-9.03e-01

Pos logit contributions
ud+0.223
and+0.220
Jo+0.217
jo+0.207
rum+0.206
Neg logit contributions
-0.214
 hotter-0.171
idents-0.171
Sorry-0.168
 cooler-0.155
Tokencakes
Token ID37263
Feature activation-8.47e+00

Pos logit contributions
+0.045
idents+0.034
 hotter+0.032
Sorry+0.031
 unst+0.026
Neg logit contributions
ud-0.111
and-0.109
Jo-0.109
jo-0.104
 Pen-0.101
Token were
Token ID547
Feature activation+3.41e-01

Pos logit contributions
+0.733
idents+0.656
Sorry+0.622
 hotter+0.612
 cooler+0.551
Neg logit contributions
ud-0.701
Jo-0.693
and-0.677
 Pen-0.629
jo-0.628
Token done
Token ID1760
Feature activation-2.39e+00

Pos logit contributions
ud+0.032
and+0.031
Jo+0.031
rum+0.029
jo+0.029
Neg logit contributions
-0.029
idents-0.023
 hotter-0.022
Sorry-0.022
 cooler-0.021
Token,
Token ID11
Feature activation+1.42e-01

Pos logit contributions
+0.217
idents+0.173
 hotter+0.169
Sorry+0.162
 cooler+0.161
Neg logit contributions
ud-0.210
and-0.203
Jo-0.197
rum-0.192
jo-0.192
Token and
Token ID290
Feature activation-4.40e-01

Pos logit contributions
ud+0.012
and+0.012
Jo+0.012
rum+0.011
jo+0.011
Neg logit contributions
-0.013
idents-0.011
 hotter-0.011
Sorry-0.010
 cooler-0.010
Token they
Token ID484
Feature activation-1.53e+00

Pos logit contributions
+0.036
 hotter+0.030
idents+0.029
Sorry+0.028
 cooler+0.028
Neg logit contributions
ud-0.042
and-0.041
Jo-0.039
rum-0.039
jo-0.038
Token shoot
Token ID2686
Feature activation-1.73e-01

Pos logit contributions
ud+0.057
Jo+0.056
and+0.055
 Ali+0.054
 Pen+0.053
Neg logit contributions
-0.057
idents-0.048
 hotter-0.047
Sorry-0.047
cakes-0.043
Token hay
Token ID27678
Feature activation+2.17e+00

Pos logit contributions
+0.017
idents+0.014
Sorry+0.013
 hotter+0.013
cakes+0.012
Neg logit contributions
ud-0.014
and-0.014
Jo-0.013
rum-0.013
jo-0.013
Token and
Token ID290
Feature activation-4.30e-01

Pos logit contributions
ud+0.172
and+0.171
Jo+0.164
rum+0.158
jo+0.158
Neg logit contributions
-0.208
idents-0.171
Sorry-0.170
 hotter-0.170
cakes-0.155
Token eat
Token ID4483
Feature activation-1.45e+00

Pos logit contributions
+0.035
idents+0.028
 hotter+0.028
Sorry+0.027
 cooler+0.026
Neg logit contributions
ud-0.041
and-0.041
Jo-0.040
rum-0.038
jo-0.038
Token cup
Token ID6508
Feature activation-8.99e-01

Pos logit contributions
+0.129
idents+0.104
 hotter+0.103
Sorry+0.100
 cooler+0.096
Neg logit contributions
ud-0.128
and-0.126
Jo-0.124
jo-0.118
rum-0.118
Tokencakes
Token ID37263
Feature activation-8.47e+00

Pos logit contributions
+0.036
Sorry+0.024
idents+0.024
 hotter+0.023
 surely+0.017
Neg logit contributions
and-0.123
ud-0.120
Jo-0.115
jo-0.111
rum-0.110
Token.
Token ID13
Feature activation-1.54e-01

Pos logit contributions
+0.897
idents+0.773
Sorry+0.767
 hotter+0.762
 unst+0.688
Neg logit contributions
and-0.606
ud-0.572
Jo-0.536
 Pen-0.501
rum-0.498
Token It
Token ID632
Feature activation-6.72e-01

Pos logit contributions
+0.015
 hotter+0.013
idents+0.012
Sorry+0.012
 cooler+0.012
Neg logit contributions
ud-0.012
and-0.012
Jo-0.011
jo-0.011
rum-0.011
Token was
Token ID373
Feature activation+3.22e-01

Pos logit contributions
+0.066
idents+0.055
 hotter+0.054
Sorry+0.053
 cooler+0.051
Neg logit contributions
ud-0.053
and-0.052
Jo-0.051
rum-0.048
jo-0.048
Token a
Token ID257
Feature activation+5.85e-02

Pos logit contributions
ud+0.024
and+0.024
Jo+0.023
rum+0.022
 Ali+0.022
Neg logit contributions
-0.033
idents-0.027
Sorry-0.027
 hotter-0.026
cakes-0.025
Token good
Token ID922
Feature activation+1.22e-01

Pos logit contributions
and+0.004
ud+0.004
Jo+0.004
jo+0.004
 Pen+0.004
Neg logit contributions
-0.006
idents-0.005
 hotter-0.005
Sorry-0.004
cakes-0.004
TokenCan
Token ID6090
Feature activation-2.27e+00

Pos logit contributions
+0.042
idents+0.033
 hotter+0.032
Sorry+0.030
 cooler+0.029
Neg logit contributions
ud-0.057
and-0.056
Jo-0.056
jo-0.053
rum-0.052
Token I
Token ID314
Feature activation+2.56e+00

Pos logit contributions
+0.226
idents+0.186
 hotter+0.184
Sorry+0.181
 cooler+0.172
Neg logit contributions
ud-0.176
and-0.173
Jo-0.169
rum-0.162
jo-0.161
Token put
Token ID1234
Feature activation+4.86e-01

Pos logit contributions
ud+0.185
Jo+0.183
and+0.179
 Ali+0.168
 Mae+0.167
Neg logit contributions
-0.228
 hotter-0.214
idents-0.213
Sorry-0.203
 ang-0.189
Token the
Token ID262
Feature activation-5.31e-01

Pos logit contributions
ud+0.032
and+0.031
 Pen+0.030
Jo+0.030
 Mae+0.030
Neg logit contributions
-0.051
Sorry-0.044
idents-0.044
 hotter-0.041
cakes-0.039
Token cup
Token ID6508
Feature activation-8.94e-01

Pos logit contributions
+0.047
idents+0.038
Sorry+0.038
 hotter+0.037
 unst+0.034
Neg logit contributions
ud-0.045
Jo-0.044
and-0.044
 Pen-0.042
 Mae-0.041
Tokencakes
Token ID37263
Feature activation-8.45e+00

Pos logit contributions
+0.060
idents+0.052
Sorry+0.048
 hotter+0.047
 ang+0.043
Neg logit contributions
and-0.092
ud-0.083
 Pen-0.081
Jo-0.080
 Mae-0.080
Token in
Token ID287
Feature activation-3.28e-01

Pos logit contributions
+0.871
idents+0.760
Sorry+0.734
 hotter+0.694
 trigger+0.669
Neg logit contributions
and-0.581
ud-0.515
 Pen-0.503
 Mae-0.500
 about-0.492
Token the
Token ID262
Feature activation-5.41e-01

Pos logit contributions
+0.031
idents+0.025
 hotter+0.025
Sorry+0.025
 cooler+0.024
Neg logit contributions
ud-0.027
and-0.027
Jo-0.026
jo-0.025
rum-0.025
Token oven
Token ID14361
Feature activation-8.82e-01

Pos logit contributions
+0.054
idents+0.047
Sorry+0.046
 hotter+0.044
bled+0.041
Neg logit contributions
and-0.038
ud-0.036
Jo-0.035
 living-0.035
 Pen-0.034
Token?"
Token ID1701
Feature activation-8.70e-01

Pos logit contributions
+0.091
idents+0.075
Sorry+0.075
 hotter+0.074
 unst+0.069
Neg logit contributions
ud-0.061
and-0.061
Jo-0.059
jo-0.056
 Pen-0.055
Token June
Token ID2795
Feature activation+1.24e+00

Pos logit contributions
+0.094
idents+0.079
 hotter+0.079
Sorry+0.078
 cooler+0.073
Neg logit contributions
ud-0.061
and-0.059
Jo-0.055
rum-0.054
jo-0.052
Token sandwiches
Token ID29777
Feature activation-2.21e+00

Pos logit contributions
+0.058
 cooler+0.049
idents+0.045
 hotter+0.045
cakes+0.044
Neg logit contributions
ud-0.070
rum-0.068
jo-0.067
Jo-0.066
and-0.066
Token and
Token ID290
Feature activation-4.07e-01

Pos logit contributions
+0.220
idents+0.182
 hotter+0.180
Sorry+0.176
 cooler+0.168
Neg logit contributions
ud-0.171
and-0.168
Jo-0.165
rum-0.157
jo-0.156
Token some
Token ID617
Feature activation-1.31e-01

Pos logit contributions
+0.027
idents+0.020
 hotter+0.020
Sorry+0.019
 cooler+0.018
Neg logit contributions
ud-0.045
and-0.044
Jo-0.044
rum-0.042
jo-0.042
Token little
Token ID1310
Feature activation+1.14e+00

Pos logit contributions
+0.010
idents+0.008
 hotter+0.008
Sorry+0.008
cakes+0.007
Neg logit contributions
ud-0.013
and-0.013
Jo-0.013
rum-0.012
jo-0.012
Token cup
Token ID6508
Feature activation-8.70e-01

Pos logit contributions
ud+0.108
and+0.107
Jo+0.105
rum+0.101
jo+0.101
Neg logit contributions
-0.092
idents-0.074
 hotter-0.073
Sorry-0.070
 cooler-0.068
Tokencakes
Token ID37263
Feature activation-8.45e+00

Pos logit contributions
+0.036
idents+0.023
 hotter+0.023
Sorry+0.021
 unst+0.016
Neg logit contributions
and-0.117
ud-0.116
Jo-0.112
jo-0.110
rum-0.109
Token.
Token ID13
Feature activation-1.30e-01

Pos logit contributions
+0.847
idents+0.739
 hotter+0.733
Sorry+0.723
 unst+0.645
Neg logit contributions
and-0.635
ud-0.615
Jo-0.564
 Pen-0.535
rum-0.525
Token Lily
Token ID20037
Feature activation-1.12e+00

Pos logit contributions
+0.013
 hotter+0.010
Sorry+0.010
idents+0.010
 unst+0.010
Neg logit contributions
and-0.010
ud-0.010
Jo-0.009
 Pen-0.009
 Mae-0.009
Token was
Token ID373
Feature activation+3.48e-01

Pos logit contributions
+0.090
idents+0.072
 cooler+0.067
Sorry+0.067
cakes+0.065
Neg logit contributions
and-0.109
rum-0.104
ud-0.104
Jo-0.099
jo-0.097
Token happy
Token ID3772
Feature activation+9.08e-02

Pos logit contributions
ud+0.032
and+0.030
Jo+0.030
jo+0.030
rum+0.029
Neg logit contributions
-0.030
 hotter-0.025
idents-0.025
 cooler-0.024
Sorry-0.023
Token because
Token ID780
Feature activation-2.24e+00

Pos logit contributions
ud+0.008
and+0.007
Jo+0.007
rum+0.007
jo+0.007
Neg logit contributions
-0.009
idents-0.007
 hotter-0.007
Sorry-0.007
 cooler-0.007
Token and
Token ID290
Feature activation-4.24e-01

Pos logit contributions
ud+0.009
Jo+0.008
and+0.008
rum+0.008
jo+0.008
Neg logit contributions
-0.009
idents-0.008
Sorry-0.008
 hotter-0.008
 cooler-0.007
Token enjoyed
Token ID8359
Feature activation+4.64e-01

Pos logit contributions
+0.039
 hotter+0.031
idents+0.030
Sorry+0.029
 cooler+0.027
Neg logit contributions
and-0.036
Jo-0.035
ud-0.034
 Ali-0.033
 Pen-0.033
Token the
Token ID262
Feature activation-5.23e-01

Pos logit contributions
ud+0.033
and+0.032
Jo+0.032
rum+0.030
jo+0.029
Neg logit contributions
-0.049
idents-0.042
 hotter-0.041
Sorry-0.040
 cooler-0.039
Token delicious
Token ID12625
Feature activation-4.32e+00

Pos logit contributions
+0.047
Sorry+0.038
idents+0.037
 hotter+0.037
 cooler+0.034
Neg logit contributions
and-0.044
Jo-0.044
ud-0.044
rum-0.041
jo-0.041
Token cup
Token ID6508
Feature activation-8.81e-01

Pos logit contributions
+0.369
idents+0.292
 hotter+0.290
Sorry+0.287
 cooler+0.265
Neg logit contributions
ud-0.392
and-0.387
Jo-0.384
rum-0.366
jo-0.362
Tokencakes
Token ID37263
Feature activation-8.44e+00

Pos logit contributions
+0.059
idents+0.046
Sorry+0.046
 hotter+0.046
 cooler+0.037
Neg logit contributions
and-0.097
ud-0.092
Jo-0.092
jo-0.087
 Jacob-0.086
Token.
Token ID13
Feature activation-1.30e-01

Pos logit contributions
+0.847
idents+0.724
Sorry+0.716
 hotter+0.716
 unst+0.639
Neg logit contributions
and-0.640
ud-0.602
Jo-0.591
 Pen-0.544
 Mae-0.541
Token From
Token ID3574
Feature activation-1.66e+00

Pos logit contributions
+0.013
 hotter+0.012
Sorry+0.011
idents+0.011
 cooler+0.011
Neg logit contributions
and-0.010
ud-0.009
Jo-0.009
jo-0.009
 Ali-0.008
Token that
Token ID326
Feature activation+6.32e-01

Pos logit contributions
+0.220
idents+0.182
 hotter+0.179
cakes+0.175
 unst+0.174
Neg logit contributions
Jo-0.072
 who-0.070
ud-0.069
old-0.069
 about-0.068
Token day
Token ID1110
Feature activation+1.42e+00

Pos logit contributions
ud+0.063
and+0.063
rum+0.061
Jo+0.060
old+0.059
Neg logit contributions
-0.048
 hotter-0.036
idents-0.034
 surely-0.033
 cooler-0.033
Token on
Token ID319
Feature activation-9.91e-01

Pos logit contributions
ud+0.126
and+0.123
Jo+0.120
rum+0.116
jo+0.114
Neg logit contributions
-0.125
idents-0.103
 hotter-0.100
Sorry-0.097
 cooler-0.093
Token the
Token ID262
Feature activation-5.12e-01

Pos logit contributions
+0.002
 hotter+0.002
idents+0.002
 cooler+0.002
cakes+0.002
Neg logit contributions
ud-0.003
and-0.002
rum-0.002
jo-0.002
Jo-0.002
Token most
Token ID749
Feature activation-2.90e+00

Pos logit contributions
+0.047
idents+0.037
Sorry+0.037
 hotter+0.037
 unst+0.034
Neg logit contributions
and-0.043
ud-0.043
Jo-0.042
 Pen-0.040
jo-0.040
Token delicious
Token ID12625
Feature activation-4.33e+00

Pos logit contributions
+0.201
 hotter+0.155
idents+0.154
 cooler+0.145
cakes+0.143
Neg logit contributions
ud-0.310
and-0.299
Jo-0.297
rum-0.293
jo-0.291
Token honey
Token ID12498
Feature activation-1.20e+00

Pos logit contributions
+0.307
idents+0.235
 hotter+0.229
 cooler+0.218
cakes+0.216
Neg logit contributions
ud-0.463
and-0.456
Jo-0.443
rum-0.436
jo-0.434
Token-
Token ID12
Feature activation-1.63e+00

Pos logit contributions
+0.111
idents+0.090
 hotter+0.089
Sorry+0.086
 cooler+0.084
Neg logit contributions
ud-0.102
and-0.100
Jo-0.098
rum-0.094
jo-0.093
Tokencakes
Token ID37263
Feature activation-8.44e+00

Pos logit contributions
+0.152
idents+0.125
 hotter+0.124
Sorry+0.121
cakes+0.120
Neg logit contributions
ud-0.136
and-0.128
Jo-0.127
rum-0.123
jo-0.122
Token that
Token ID326
Feature activation+6.41e-01

Pos logit contributions
+0.824
idents+0.684
 hotter+0.676
Sorry+0.669
 cooler+0.622
Neg logit contributions
and-0.654
ud-0.652
Jo-0.638
rum-0.600
jo-0.598
Token were
Token ID547
Feature activation+3.61e-01

Pos logit contributions
ud+0.064
and+0.061
rum+0.060
Jo+0.059
jo+0.057
Neg logit contributions
-0.051
 hotter-0.041
idents-0.039
 cooler-0.038
Sorry-0.037
Token ever
Token ID1683
Feature activation+1.29e+00

Pos logit contributions
ud+0.032
and+0.032
Jo+0.031
jo+0.030
rum+0.030
Neg logit contributions
-0.032
idents-0.026
 hotter-0.025
Sorry-0.025
 cooler-0.024
Token made
Token ID925
Feature activation+6.63e-01

Pos logit contributions
and+0.111
ud+0.109
Jo+0.107
rum+0.102
jo+0.100
Neg logit contributions
-0.120
idents-0.097
Sorry-0.093
 hotter-0.087
cakes-0.084
Token.
Token ID13
Feature activation-1.41e-01

Pos logit contributions
ud+0.056
Jo+0.053
and+0.053
rum+0.051
jo+0.051
Neg logit contributions
-0.061
 hotter-0.050
idents-0.049
cakes-0.047
Sorry-0.047
TokenAfter
Token ID3260
Feature activation-1.47e+00

Pos logit contributions
+0.019
Sorry+0.016
idents+0.013
 hotter+0.013
cakes+0.013
Neg logit contributions
ud-0.022
rum-0.021
jo-0.021
and-0.021
Jo-0.020
Token they
Token ID484
Feature activation-1.53e+00

Pos logit contributions
+0.138
idents+0.111
 hotter+0.109
Sorry+0.108
 cooler+0.103
Neg logit contributions
ud-0.125
and-0.123
Jo-0.119
jo-0.115
rum-0.115
Token put
Token ID1234
Feature activation+4.81e-01

Pos logit contributions
+0.105
idents+0.096
 hotter+0.090
Sorry+0.078
cakes+0.073
Neg logit contributions
ud-0.162
Jo-0.160
and-0.151
 Ali-0.147
old-0.146
Token the
Token ID262
Feature activation-5.24e-01

Pos logit contributions
ud+0.034
and+0.033
Jo+0.033
 Ali+0.032
rum+0.032
Neg logit contributions
-0.048
Sorry-0.041
idents-0.041
 hotter-0.038
cakes-0.037
Token cup
Token ID6508
Feature activation-8.80e-01

Pos logit contributions
+0.053
idents+0.045
Sorry+0.045
 hotter+0.043
 unst+0.040
Neg logit contributions
ud-0.035
Jo-0.034
and-0.034
rum-0.031
 Pen-0.031
Tokencakes
Token ID37263
Feature activation-8.43e+00

Pos logit contributions
+0.071
idents+0.061
Sorry+0.058
 hotter+0.057
 cooler+0.049
Neg logit contributions
and-0.081
ud-0.078
Jo-0.075
 Ali-0.071
 Pen-0.071
Token in
Token ID287
Feature activation-2.87e-01

Pos logit contributions
+0.813
idents+0.716
Sorry+0.681
 hotter+0.661
 cooler+0.588
Neg logit contributions
and-0.665
ud-0.634
Jo-0.602
 Pen-0.579
jo-0.558
Token the
Token ID262
Feature activation-5.02e-01

Pos logit contributions
+0.025
 hotter+0.019
idents+0.019
Sorry+0.018
 cooler+0.018
Neg logit contributions
and-0.027
ud-0.027
Jo-0.026
jo-0.025
rum-0.025
Token oven
Token ID14361
Feature activation-8.28e-01

Pos logit contributions
+0.049
idents+0.041
Sorry+0.039
 hotter+0.039
cakes+0.036
Neg logit contributions
ud-0.039
and-0.039
Jo-0.038
jo-0.036
 Pen-0.035
Token,
Token ID11
Feature activation+2.01e-01

Pos logit contributions
+0.081
idents+0.066
 hotter+0.066
Sorry+0.064
 cooler+0.062
Neg logit contributions
ud-0.066
and-0.065
Jo-0.064
rum-0.061
jo-0.060
Token Lily
Token ID20037
Feature activation-1.09e+00

Pos logit contributions
ud+0.018
and+0.018
Jo+0.017
rum+0.017
jo+0.017
Neg logit contributions
-0.018
idents-0.014
 hotter-0.014
Sorry-0.014
 cooler-0.013
Token and
Token ID290
Feature activation-4.09e-01

Pos logit contributions
+0.126
idents+0.106
 cooler+0.097
Sorry+0.097
 unst+0.093
Neg logit contributions
and-0.166
rum-0.162
ud-0.160
Jo-0.154
weet-0.153
Token his
Token ID465
Feature activation+1.18e+00

Pos logit contributions
+0.046
idents+0.038
 hotter+0.038
Sorry+0.037
cakes+0.036
Neg logit contributions
ud-0.027
and-0.026
Jo-0.026
rum-0.025
jo-0.025
Token friends
Token ID2460
Feature activation-2.71e-01

Pos logit contributions
ud+0.084
and+0.079
Jo+0.079
rum+0.074
jo+0.074
Neg logit contributions
-0.122
 hotter-0.105
idents-0.104
Sorry-0.101
cakes-0.094
Token ate
Token ID15063
Feature activation-1.36e+00

Pos logit contributions
+0.022
idents+0.017
Sorry+0.017
 hotter+0.016
 cooler+0.016
Neg logit contributions
ud-0.026
and-0.026
rum-0.024
Jo-0.024
jo-0.024
Token cup
Token ID6508
Feature activation-8.66e-01

Pos logit contributions
+0.133
idents+0.109
Sorry+0.109
 hotter+0.104
cakes+0.097
Neg logit contributions
and-0.106
ud-0.106
Jo-0.101
 Pen-0.096
rum-0.096
Tokencakes
Token ID37263
Feature activation-8.42e+00

Pos logit contributions
+0.033
 hotter+0.022
Sorry+0.022
idents+0.020
 unst+0.014
Neg logit contributions
and-0.121
ud-0.115
Jo-0.114
jo-0.109
 Pen-0.108
Token and
Token ID290
Feature activation-4.12e-01

Pos logit contributions
+0.876
Sorry+0.754
idents+0.751
 hotter+0.747
 unst+0.671
Neg logit contributions
and-0.617
ud-0.579
Jo-0.552
 Pen-0.512
 Jacob-0.510
Token played
Token ID2826
Feature activation-9.74e-01

Pos logit contributions
+0.032
idents+0.025
 hotter+0.025
Sorry+0.025
 cooler+0.024
Neg logit contributions
ud-0.041
and-0.041
rum-0.039
Jo-0.038
jo-0.038
Token with
Token ID351
Feature activation+1.06e-01

Pos logit contributions
+0.089
idents+0.073
 hotter+0.072
Sorry+0.070
 cooler+0.066
Neg logit contributions
ud-0.083
and-0.082
Jo-0.080
rum-0.077
jo-0.076
Token the
Token ID262
Feature activation-5.21e-01

Pos logit contributions
ud+0.009
and+0.008
Jo+0.008
rum+0.008
jo+0.008
Neg logit contributions
-0.010
idents-0.008
 hotter-0.008
Sorry-0.008
 cooler-0.008
Token fixed
Token ID5969
Feature activation-4.60e+00

Pos logit contributions
+0.051
idents+0.041
Sorry+0.040
 hotter+0.040
 surely+0.037
Neg logit contributions
Jo-0.040
ud-0.039
and-0.039
 living-0.036
jo-0.036
Tokenmy
Token ID1820
Feature activation+1.74e-01

Pos logit contributions
+0.163
idents+0.132
 hotter+0.131
Sorry+0.128
 cooler+0.121
Neg logit contributions
ud-0.151
and-0.149
Jo-0.146
rum-0.140
jo-0.139
Token and
Token ID290
Feature activation-4.09e-01

Pos logit contributions
ud+0.014
and+0.014
Jo+0.014
rum+0.013
jo+0.013
Neg logit contributions
-0.016
idents-0.013
 hotter-0.013
Sorry-0.013
 cooler-0.013
Token eat
Token ID4483
Feature activation-1.42e+00

Pos logit contributions
+0.035
idents+0.029
 hotter+0.028
Sorry+0.027
 cooler+0.026
Neg logit contributions
ud-0.037
and-0.037
Jo-0.036
jo-0.035
rum-0.035
Token delicious
Token ID12625
Feature activation-4.32e+00

Pos logit contributions
+0.131
idents+0.108
Sorry+0.106
 hotter+0.105
cakes+0.095
Neg logit contributions
and-0.118
ud-0.118
Jo-0.114
rum-0.108
 Pen-0.107
Token cup
Token ID6508
Feature activation-8.66e-01

Pos logit contributions
+0.350
idents+0.276
Sorry+0.272
 hotter+0.271
 unst+0.244
Neg logit contributions
ud-0.411
and-0.406
Jo-0.400
rum-0.377
 Pen-0.373
Tokencakes
Token ID37263
Feature activation-8.42e+00

Pos logit contributions
+0.031
idents+0.019
 hotter+0.018
Sorry+0.018
 unst+0.012
Neg logit contributions
and-0.122
ud-0.120
Jo-0.117
jo-0.112
rum-0.111
Token!
Token ID0
Feature activation-8.21e-01

Pos logit contributions
+0.825
idents+0.691
 hotter+0.685
Sorry+0.674
 cooler+0.624
Neg logit contributions
and-0.657
ud-0.652
Jo-0.628
jo-0.588
rum-0.588
Token<|endoftext|>
Token ID50256
Feature activation+6.38e-01

Pos logit contributions
+0.083
 hotter+0.072
Sorry+0.069
idents+0.069
 cooler+0.066
Neg logit contributions
and-0.060
ud-0.060
Jo-0.058
jo-0.055
 Pen-0.053
TokenOnce
Token ID7454
Feature activation-1.25e+00

Pos logit contributions
ud+0.063
rum+0.063
Jo+0.062
and+0.062
jo+0.061
Neg logit contributions
-0.049
 hotter-0.033
 cooler-0.032
idents-0.032
Sorry-0.031
Token upon
Token ID2402
Feature activation-2.77e+00

Pos logit contributions
+0.086
idents+0.065
 hotter+0.063
Sorry+0.059
 cooler+0.059
Neg logit contributions
ud-0.132
Jo-0.131
and-0.131
rum-0.127
jo-0.123
Token a
Token ID257
Feature activation+8.28e-02

Pos logit contributions
+0.213
 cooler+0.161
idents+0.160
 hotter+0.160
Sorry+0.159
Neg logit contributions
ud-0.278
rum-0.265
Jo-0.263
jo-0.263
and-0.255
Token hard
Token ID1327
Feature activation+1.64e+00

Pos logit contributions
+0.080
idents+0.063
Sorry+0.062
 hotter+0.060
 cooler+0.057
Neg logit contributions
and-0.083
ud-0.082
Jo-0.079
 Pen-0.075
rum-0.075
Token so
Token ID523
Feature activation+7.59e-01

Pos logit contributions
ud+0.138
and+0.136
Jo+0.133
rum+0.127
jo+0.127
Neg logit contributions
-0.152
idents-0.123
 hotter-0.122
Sorry-0.119
 cooler-0.113
Token that
Token ID326
Feature activation+6.50e-01

Pos logit contributions
ud+0.074
and+0.073
Jo+0.072
rum+0.069
jo+0.069
Neg logit contributions
-0.060
idents-0.047
 hotter-0.047
Sorry-0.046
 cooler-0.043
Token the
Token ID262
Feature activation-5.11e-01

Pos logit contributions
ud+0.064
and+0.062
rum+0.060
Jo+0.059
jo+0.057
Neg logit contributions
-0.053
 hotter-0.044
idents-0.044
cakes-0.041
Sorry-0.040
Token cup
Token ID6508
Feature activation-8.66e-01

Pos logit contributions
+0.046
idents+0.037
 hotter+0.036
Sorry+0.036
 unst+0.034
Neg logit contributions
ud-0.043
Jo-0.043
and-0.043
 Pen-0.039
jo-0.039
Tokencakes
Token ID37263
Feature activation-8.42e+00

Pos logit contributions
+0.052
idents+0.039
 hotter+0.038
Sorry+0.038
 cooler+0.033
Neg logit contributions
ud-0.098
and-0.097
Jo-0.096
jo-0.091
rum-0.091
Token would
Token ID561
Feature activation+8.10e-01

Pos logit contributions
+0.779
idents+0.652
 hotter+0.644
Sorry+0.642
 cooler+0.590
Neg logit contributions
ud-0.658
and-0.658
Jo-0.653
 Pen-0.605
rum-0.594
Token be
Token ID307
Feature activation-3.71e-02

Pos logit contributions
ud+0.073
and+0.072
Jo+0.072
rum+0.068
 Pen+0.067
Neg logit contributions
-0.067
idents-0.054
 hotter-0.053
Sorry-0.052
cakes-0.049
Token extra
Token ID3131
Feature activation-1.13e+00

Pos logit contributions
+0.004
idents+0.003
Sorry+0.003
cakes+0.002
 hotter+0.002
Neg logit contributions
and-0.003
ud-0.003
Jo-0.003
rum-0.003
 Pen-0.003
Token fluffy
Token ID39145
Feature activation+4.95e-01

Pos logit contributions
+0.097
idents+0.076
Sorry+0.074
 hotter+0.071
cakes+0.065
Neg logit contributions
and-0.104
ud-0.101
Jo-0.099
rum-0.094
 Pen-0.092
Token.
Token ID13
Feature activation-1.32e-01

Pos logit contributions
ud+0.043
and+0.042
Jo+0.042
jo+0.040
rum+0.040
Neg logit contributions
-0.045
idents-0.036
 hotter-0.036
 cooler-0.034
Sorry-0.033
Token have
Token ID423
Feature activation-1.30e+00

Pos logit contributions
+0.201
 hotter+0.161
idents+0.161
Sorry+0.153
 cooler+0.145
Neg logit contributions
ud-0.223
and-0.221
Jo-0.217
rum-0.209
jo-0.207
Token it
Token ID340
Feature activation+1.48e+00

Pos logit contributions
+0.112
idents+0.091
 cooler+0.089
Sorry+0.089
 hotter+0.088
Neg logit contributions
ud-0.116
and-0.115
Jo-0.114
jo-0.112
rum-0.108
Token!"
Token ID2474
Feature activation+1.65e+00

Pos logit contributions
ud+0.108
and+0.107
Jo+0.103
rum+0.099
jo+0.098
Neg logit contributions
-0.153
idents-0.127
 hotter-0.126
Sorry-0.124
 cooler-0.118
Token\n
Token ID198
Feature activation-1.32e-01

Pos logit contributions
ud+0.140
and+0.130
rum+0.129
Jo+0.124
jo+0.121
Neg logit contributions
-0.154
idents-0.127
Sorry-0.122
 hotter-0.120
cakes-0.115
Token\n
Token ID198
Feature activation-1.29e-01

Pos logit contributions
+0.010
Sorry+0.007
idents+0.007
cakes+0.006
 hotter+0.006
Neg logit contributions
ud-0.014
rum-0.013
and-0.012
weet-0.012
jo-0.012
TokenJul
Token ID16980
Feature activation+6.17e+00

Pos logit contributions
+0.011
Sorry+0.011
idents+0.009
cakes+0.009
 hotter+0.008
Neg logit contributions
ud-0.011
rum-0.010
weet-0.009
and-0.009
 Pen-0.009
Tokenie
Token ID494
Feature activation+3.84e-01

Pos logit contributions
and+0.618
ud+0.598
 Pen+0.590
Jo+0.563
rum+0.562
Neg logit contributions
-0.362
cakes-0.302
idents-0.301
 cooler-0.272
 hotter-0.268
Token didn
Token ID1422
Feature activation-2.95e+00

Pos logit contributions
ud+0.035
and+0.035
Jo+0.033
rum+0.033
jo+0.032
Neg logit contributions
-0.033
idents-0.027
 hotter-0.026
Sorry-0.026
 cooler-0.024
Token't
Token ID470
Feature activation-2.40e+00

Pos logit contributions
+0.263
 hotter+0.232
idents+0.230
Sorry+0.228
 cooler+0.218
Neg logit contributions
ud-0.236
Jo-0.232
and-0.230
rum-0.221
 Pen-0.213
Token understand
Token ID1833
Feature activation+3.94e+00

Pos logit contributions
+0.260
idents+0.221
 hotter+0.218
Sorry+0.216
 cooler+0.207
Neg logit contributions
ud-0.164
and-0.159
Jo-0.156
jo-0.147
rum-0.146
Token.
Token ID13
Feature activation-6.97e-02

Pos logit contributions
ud+0.364
Jo+0.355
and+0.350
jo+0.333
 Pen+0.323
Neg logit contributions
-0.323
idents-0.278
 hotter-0.272
 cooler-0.252
Sorry-0.243
TokenAnna
Token ID31160
Feature activation+4.81e+00

Pos logit contributions
ud+0.069
and+0.068
Jo+0.068
rum+0.067
jo+0.066
Neg logit contributions
-0.054
 hotter-0.039
idents-0.038
Sorry-0.038
 cooler-0.037
Token and
Token ID290
Feature activation-3.43e-01

Pos logit contributions
and+0.461
rum+0.439
Jo+0.438
ud+0.434
jo+0.411
Neg logit contributions
-0.375
 cooler-0.290
idents-0.286
Sorry-0.285
 hotter-0.275
Token Ben
Token ID3932
Feature activation+3.00e+00

Pos logit contributions
+0.030
idents+0.025
Sorry+0.024
 hotter+0.024
 cooler+0.023
Neg logit contributions
ud-0.031
and-0.031
Jo-0.030
rum-0.029
jo-0.028
Token were
Token ID547
Feature activation+4.27e-01

Pos logit contributions
ud+0.268
and+0.265
Jo+0.260
rum+0.249
jo+0.249
Neg logit contributions
-0.261
idents-0.209
 hotter-0.206
Sorry-0.202
 cooler-0.190
Token twins
Token ID20345
Feature activation+4.58e-01

Pos logit contributions
Jo+0.035
and+0.034
rum+0.034
ud+0.034
jo+0.033
Neg logit contributions
-0.041
 hotter-0.033
 cooler-0.033
idents-0.033
Sorry-0.031
Token who
Token ID508
Feature activation+5.42e+00

Pos logit contributions
rum+0.042
and+0.041
ud+0.041
Jo+0.041
jo+0.040
Neg logit contributions
-0.040
idents-0.029
 cooler-0.028
 hotter-0.028
Sorry-0.028
Token liked
Token ID8288
Feature activation+2.39e+00

Pos logit contributions
ud+0.302
and+0.296
Jo+0.289
jo+0.268
rum+0.266
Neg logit contributions
-0.655
idents-0.563
 hotter-0.558
Sorry-0.548
 cooler-0.526
Token to
Token ID284
Feature activation+7.42e-01

Pos logit contributions
ud+0.208
rum+0.201
Jo+0.199
jo+0.196
and+0.190
Neg logit contributions
-0.212
idents-0.179
 hotter-0.178
 cooler-0.176
cakes-0.171
Token play
Token ID711
Feature activation-1.62e+00

Pos logit contributions
ud+0.077
and+0.076
Jo+0.074
rum+0.072
jo+0.071
Neg logit contributions
-0.055
idents-0.042
 hotter-0.041
Sorry-0.040
 cooler-0.037
Token games
Token ID1830
Feature activation+1.91e+00

Pos logit contributions
+0.158
idents+0.133
 hotter+0.132
Sorry+0.128
cakes+0.121
Neg logit contributions
and-0.127
ud-0.126
Jo-0.121
rum-0.115
jo-0.113
Token.
Token ID13
Feature activation-6.13e-02

Pos logit contributions
ud+0.184
Jo+0.180
rum+0.175
and+0.172
jo+0.172
Neg logit contributions
-0.157
 cooler-0.117
idents-0.116
cakes-0.114
 hotter-0.113
Token felt
Token ID2936
Feature activation-5.58e-01

Pos logit contributions
+0.031
 finally+0.028
 continued+0.027
 cooler+0.027
 surely+0.026
Neg logit contributions
rum-0.054
and-0.054
ud-0.051
weet-0.049
 outage-0.046
Token encouraged
Token ID10085
Feature activation+4.15e-01

Pos logit contributions
+0.045
 hotter+0.039
 cooler+0.037
 unst+0.036
idents+0.035
Neg logit contributions
ud-0.054
jo-0.049
Jo-0.049
and-0.049
rum-0.049
Token by
Token ID416
Feature activation-1.82e+00

Pos logit contributions
ud+0.042
rum+0.038
Jo+0.038
jo+0.036
and+0.036
Neg logit contributions
-0.030
 hotter-0.024
idents-0.023
 unst-0.023
Sorry-0.022
Token his
Token ID465
Feature activation+1.21e+00

Pos logit contributions
+0.135
 cooler+0.111
 hotter+0.108
 brand+0.101
idents+0.095
Neg logit contributions
ud-0.179
and-0.172
jo-0.169
old-0.163
Jo-0.156
Token friends
Token ID2460
Feature activation-2.37e-01

Pos logit contributions
ud+0.113
and+0.110
Jo+0.108
rum+0.105
jo+0.104
Neg logit contributions
-0.102
idents-0.083
 hotter-0.080
 cooler-0.077
Sorry-0.077
Token who
Token ID508
Feature activation+5.39e+00

Pos logit contributions
+0.015
 cooler+0.011
idents+0.011
 surely+0.011
cakes+0.010
Neg logit contributions
ud-0.026
rum-0.024
Jo-0.023
jo-0.023
Jamie-0.023
Token were
Token ID547
Feature activation+4.18e-01

Pos logit contributions
and+0.516
ud+0.490
rum+0.485
Jo+0.449
jo+0.447
Neg logit contributions
-0.428
 cooler-0.335
Sorry-0.328
 surely-0.322
 hotter-0.321
Token so
Token ID523
Feature activation+8.08e-01

Pos logit contributions
ud+0.040
and+0.039
rum+0.038
jo+0.037
Jo+0.036
Neg logit contributions
-0.032
 hotter-0.029
 cooler-0.029
 unst-0.024
Sorry-0.024
Token supportive
Token ID16443
Feature activation+5.29e-01

Pos logit contributions
ud+0.076
and+0.073
Jo+0.072
jo+0.071
rum+0.070
Neg logit contributions
-0.067
 hotter-0.053
Sorry-0.052
 cooler-0.051
idents-0.051
Token.
Token ID13
Feature activation-7.69e-02

Pos logit contributions
ud+0.048
Jo+0.048
rum+0.048
and+0.046
Jamie+0.046
Neg logit contributions
-0.039
 cooler-0.032
idents-0.031
cakes-0.030
 hotter-0.030
Token One
Token ID1881
Feature activation+2.81e-01

Pos logit contributions
+0.007
idents+0.005
 hotter+0.005
Sorry+0.004
 cooler+0.004
Neg logit contributions
ud-0.007
and-0.007
Jo-0.007
rum-0.007
jo-0.007
Token,
Token ID11
Feature activation+2.25e-01

Pos logit contributions
ud+0.163
Jo+0.157
jo+0.155
rum+0.151
and+0.150
Neg logit contributions
-0.106
idents-0.071
 cooler-0.062
 unst-0.060
Sorry-0.057
Token there
Token ID612
Feature activation-4.44e-01

Pos logit contributions
ud+0.022
Jo+0.022
and+0.022
jo+0.021
rum+0.021
Neg logit contributions
-0.018
 hotter-0.014
idents-0.014
Sorry-0.013
 cooler-0.013
Token was
Token ID373
Feature activation+4.04e-01

Pos logit contributions
+0.029
idents+0.018
 cooler+0.017
 finally+0.016
 hotter+0.016
Neg logit contributions
and-0.052
weet-0.050
rum-0.050
ud-0.049
jo-0.049
Token a
Token ID257
Feature activation+1.41e-01

Pos logit contributions
ud+0.035
and+0.034
Jo+0.034
rum+0.033
jo+0.033
Neg logit contributions
-0.037
 hotter-0.029
idents-0.029
Sorry-0.028
 cooler-0.028
Token family
Token ID1641
Feature activation+1.47e-01

Pos logit contributions
ud+0.009
Jo+0.008
and+0.008
 Pen+0.008
 Ali+0.008
Neg logit contributions
-0.016
idents-0.014
 hotter-0.013
Sorry-0.013
cakes-0.013
Token who
Token ID508
Feature activation+5.40e+00

Pos logit contributions
ud+0.011
rum+0.011
Jo+0.011
 Ali+0.011
jo+0.011
Neg logit contributions
-0.014
 cooler-0.011
Sorry-0.011
 hotter-0.011
cakes-0.010
Token wanted
Token ID2227
Feature activation+2.40e+00

Pos logit contributions
ud+0.330
and+0.324
Jo+0.314
rum+0.296
jo+0.293
Neg logit contributions
-0.627
idents-0.534
 hotter-0.528
Sorry-0.521
 cooler-0.501
Token to
Token ID284
Feature activation+7.31e-01

Pos logit contributions
ud+0.232
rum+0.230
jo+0.223
Jo+0.219
and+0.208
Neg logit contributions
-0.179
 cooler-0.161
 hotter-0.161
idents-0.154
cakes-0.149
Token go
Token ID467
Feature activation-8.42e-01

Pos logit contributions
ud+0.075
and+0.074
Jo+0.072
rum+0.070
jo+0.070
Neg logit contributions
-0.055
idents-0.042
 hotter-0.041
Sorry-0.040
 cooler-0.038
Token on
Token ID319
Feature activation-9.16e-01

Pos logit contributions
+0.096
idents+0.083
Sorry+0.082
 hotter+0.079
cakes+0.073
Neg logit contributions
and-0.056
Jo-0.050
ud-0.050
jo-0.047
old-0.046
Token a
Token ID257
Feature activation+1.44e-01

Pos logit contributions
+0.093
idents+0.077
 hotter+0.076
Sorry+0.075
cakes+0.071
Neg logit contributions
ud-0.069
and-0.067
Jo-0.067
jo-0.063
rum-0.062
Token had
Token ID550
Feature activation-6.58e-01

Pos logit contributions
+0.054
idents+0.043
 hotter+0.043
Sorry+0.042
 cooler+0.042
Neg logit contributions
ud-0.050
and-0.049
rum-0.046
Jo-0.046
jo-0.045
Token a
Token ID257
Feature activation+1.55e-01

Pos logit contributions
 cooler+0.044
+0.041
cakes+0.038
 hotter+0.037
 sack+0.034
Neg logit contributions
ud-0.065
rum-0.065
and-0.062
iley-0.061
jo-0.060
Token big
Token ID1263
Feature activation-1.42e-01

Pos logit contributions
ud+0.014
and+0.014
Jo+0.014
rum+0.013
jo+0.013
Neg logit contributions
-0.013
idents-0.011
 hotter-0.011
Sorry-0.010
 cooler-0.010
Token lake
Token ID13546
Feature activation+9.30e-01

Pos logit contributions
+0.012
 cooler+0.010
idents+0.010
 hotter+0.009
Sorry+0.008
Neg logit contributions
and-0.014
ud-0.013
Jo-0.013
jo-0.012
rum-0.012
Token and
Token ID290
Feature activation-3.44e-01

Pos logit contributions
ud+0.087
and+0.085
Jo+0.085
rum+0.082
jo+0.081
Neg logit contributions
-0.079
 cooler-0.062
 hotter-0.061
idents-0.061
cakes-0.059
Token many
Token ID867
Feature activation+5.26e+00

Pos logit contributions
+0.023
 cooler+0.021
cakes+0.019
 hotter+0.019
idents+0.019
Neg logit contributions
ud-0.037
and-0.036
rum-0.036
jo-0.035
Jo-0.033
Token trees
Token ID7150
Feature activation-1.24e+00

Pos logit contributions
and+0.514
ud+0.506
 Mae+0.497
jo+0.491
Jo+0.484
Neg logit contributions
-0.359
 cooler-0.342
 hotter-0.312
cakes-0.287
Sorry-0.282
Token.
Token ID13
Feature activation-5.57e-02

Pos logit contributions
+0.107
 hotter+0.084
 cooler+0.083
idents+0.082
cakes+0.078
Neg logit contributions
ud-0.113
and-0.110
Jo-0.109
jo-0.106
rum-0.105
Token They
Token ID1119
Feature activation-2.44e-01

Pos logit contributions
+0.005
idents+0.004
 hotter+0.004
Sorry+0.004
 cooler+0.004
Neg logit contributions
and-0.005
ud-0.005
Jo-0.005
rum-0.005
jo-0.005
Token could
Token ID714
Feature activation-8.61e-01

Pos logit contributions
+0.020
 cooler+0.015
Sorry+0.015
idents+0.014
 hotter+0.014
Neg logit contributions
and-0.024
rum-0.023
ud-0.023
Jo-0.021
jo-0.021
Token swim
Token ID9422
Feature activation-5.89e-02

Pos logit contributions
+0.075
idents+0.060
 hotter+0.060
Sorry+0.059
cakes+0.055
Neg logit contributions
ud-0.077
and-0.076
Jo-0.075
jo-0.071
rum-0.071
Token diary
Token ID26339
Feature activation-7.82e-01

Pos logit contributions
and+0.051
ud+0.049
Jo+0.048
rum+0.047
jo+0.046
Neg logit contributions
-0.067
Sorry-0.055
 hotter-0.055
idents-0.054
cakes-0.051
Token anymore
Token ID7471
Feature activation-1.14e+00

Pos logit contributions
+0.064
idents+0.051
 hotter+0.050
Sorry+0.049
 cooler+0.046
Neg logit contributions
ud-0.074
and-0.072
Jo-0.072
rum-0.069
jo-0.068
Token.
Token ID13
Feature activation-1.34e-01

Pos logit contributions
+0.090
idents+0.069
Sorry+0.062
 hotter+0.062
 cooler+0.061
Neg logit contributions
ud-0.113
and-0.107
Jo-0.107
jo-0.103
rum-0.102
Token Lily
Token ID20037
Feature activation-1.12e+00

Pos logit contributions
+0.014
 hotter+0.012
idents+0.011
Sorry+0.011
 cooler+0.011
Neg logit contributions
and-0.010
ud-0.010
Jo-0.009
jo-0.009
rum-0.009
Token was
Token ID373
Feature activation+3.43e-01

Pos logit contributions
+0.087
idents+0.070
Sorry+0.067
 hotter+0.066
 cooler+0.062
Neg logit contributions
and-0.109
ud-0.109
Jo-0.105
rum-0.104
jo-0.102
Token heart
Token ID2612
Feature activation+2.81e+00

Pos logit contributions
ud+0.031
and+0.031
Jo+0.030
rum+0.029
jo+0.029
Neg logit contributions
-0.030
idents-0.024
 hotter-0.023
Sorry-0.023
 cooler-0.022
Tokenbroken
Token ID25826
Feature activation-1.83e+00

Pos logit contributions
ud+0.293
and+0.292
Jo+0.287
jo+0.275
rum+0.272
Neg logit contributions
-0.207
idents-0.155
Sorry-0.152
 hotter-0.150
cakes-0.136
Token and
Token ID290
Feature activation-4.23e-01

Pos logit contributions
+0.163
idents+0.132
 hotter+0.129
Sorry+0.124
 cooler+0.122
Neg logit contributions
ud-0.158
Jo-0.157
and-0.157
rum-0.151
jo-0.149
Token cried
Token ID16896
Feature activation+2.20e+00

Pos logit contributions
+0.045
idents+0.036
 hotter+0.036
Sorry+0.035
cakes+0.034
Neg logit contributions
and-0.030
ud-0.030
Jo-0.029
jo-0.028
rum-0.028
Token all
Token ID477
Feature activation+1.04e+00

Pos logit contributions
and+0.168
ud+0.164
Jo+0.154
rum+0.150
jo+0.147
Neg logit contributions
-0.224
idents-0.185
Sorry-0.185
 hotter-0.181
cakes-0.173
Token day
Token ID1110
Feature activation+1.42e+00

Pos logit contributions
ud+0.105
and+0.098
Jo+0.096
rum+0.094
 Mae+0.092
Neg logit contributions
-0.067
 hotter-0.061
 cooler-0.058
idents-0.056
 unst-0.055
Token time
Token ID640
Feature activation+1.39e+00

Pos logit contributions
rum+0.006
Jo+0.006
and+0.006
 Ali+0.006
 Pen+0.006
Neg logit contributions
-0.007
idents-0.005
 cooler-0.005
Sorry-0.005
-0.005
Token,
Token ID11
Feature activation+1.56e-01

Pos logit contributions
ud+0.154
Jo+0.149
jo+0.147
and+0.145
rum+0.144
Neg logit contributions
-0.102
idents-0.064
 cooler-0.059
 unst-0.055
Sorry-0.053
Token there
Token ID612
Feature activation-5.10e-01

Pos logit contributions
ud+0.015
and+0.014
Jo+0.014
jo+0.014
rum+0.014
Neg logit contributions
-0.013
 hotter-0.010
idents-0.010
Sorry-0.010
 cooler-0.010
Token was
Token ID373
Feature activation+3.22e-01

Pos logit contributions
+0.032
idents+0.020
 cooler+0.020
 finally+0.019
 hotter+0.017
Neg logit contributions
and-0.060
weet-0.059
rum-0.058
ud-0.056
jo-0.056
Token a
Token ID257
Feature activation+6.48e-02

Pos logit contributions
ud+0.028
and+0.027
Jo+0.027
rum+0.026
jo+0.026
Neg logit contributions
-0.029
 hotter-0.023
idents-0.023
Sorry-0.023
 cooler-0.022
Token little
Token ID1310
Feature activation+1.10e+00

Pos logit contributions
ud+0.004
Jo+0.004
and+0.004
 Pen+0.004
 Ali+0.003
Neg logit contributions
-0.007
idents-0.006
 hotter-0.006
cakes-0.006
Sorry-0.006
Token girl
Token ID2576
Feature activation-5.78e-01

Pos logit contributions
ud+0.097
and+0.093
Jo+0.093
jo+0.089
rum+0.087
Neg logit contributions
-0.098
idents-0.080
 hotter-0.077
Sorry-0.076
cakes-0.072
Token named
Token ID3706
Feature activation+3.78e-01

Pos logit contributions
+0.041
Sorry+0.031
 cooler+0.029
 hotter+0.028
 sack+0.027
Neg logit contributions
rum-0.055
and-0.055
ud-0.054
Jo-0.054
 Jacob-0.052
Token Lily
Token ID20037
Feature activation-1.15e+00

Pos logit contributions
ud+0.036
and+0.035
Jo+0.035
rum+0.034
jo+0.033
Neg logit contributions
-0.031
idents-0.025
 hotter-0.024
Sorry-0.024
 cooler-0.022
Token who
Token ID508
Feature activation+5.32e+00

Pos logit contributions
+0.049
 unst+0.033
Sorry+0.033
 still+0.032
 continued+0.032
Neg logit contributions
 Mae-0.129
Jo-0.128
ud-0.128
 Jacob-0.127
 Pen-0.127
Token loved
Token ID6151
Feature activation+3.33e+00

Pos logit contributions
ud+0.297
and+0.290
Jo+0.281
rum+0.262
jo+0.259
Neg logit contributions
-0.647
idents-0.556
 hotter-0.549
Sorry-0.544
 cooler-0.524
TokenGive
Token ID23318
Feature activation+7.91e-01

Pos logit contributions
ud+0.061
and+0.061
Jo+0.058
rum+0.053
jo+0.051
Neg logit contributions
-0.085
idents-0.072
 hotter-0.070
 cooler-0.070
cakes-0.069
Token it
Token ID340
Feature activation+1.49e+00

Pos logit contributions
ud+0.069
and+0.069
Jo+0.068
jo+0.065
rum+0.063
Neg logit contributions
-0.068
idents-0.059
 hotter-0.056
Sorry-0.056
 cooler-0.052
Token back
Token ID736
Feature activation-3.19e+00

Pos logit contributions
and+0.170
ud+0.168
Jo+0.163
rum+0.162
jo+0.158
Neg logit contributions
-0.096
 hotter-0.066
idents-0.065
Sorry-0.064
cakes-0.056
Token,
Token ID11
Feature activation+2.35e-01

Pos logit contributions
+0.248
idents+0.220
 cooler+0.200
Sorry+0.198
 unst+0.196
Neg logit contributions
Jo-0.294
ud-0.283
and-0.273
jo-0.269
weet-0.251
Token give
Token ID1577
Feature activation-1.74e+00

Pos logit contributions
ud+0.019
and+0.019
Jo+0.018
jo+0.018
rum+0.017
Neg logit contributions
-0.023
idents-0.019
 hotter-0.019
Sorry-0.018
 cooler-0.018
Token it
Token ID340
Feature activation+1.49e+00

Pos logit contributions
+0.154
idents+0.134
 hotter+0.131
Sorry+0.129
 cooler+0.125
Neg logit contributions
ud-0.145
and-0.144
Jo-0.143
jo-0.136
rum-0.132
Token back
Token ID736
Feature activation-3.19e+00

Pos logit contributions
and+0.169
ud+0.167
rum+0.162
Jo+0.162
jo+0.158
Neg logit contributions
-0.098
 hotter-0.068
idents-0.067
Sorry-0.065
cakes-0.058
Token!"
Token ID2474
Feature activation+1.66e+00

Pos logit contributions
+0.258
idents+0.233
Sorry+0.216
 unst+0.215
 cooler+0.212
Neg logit contributions
Jo-0.283
ud-0.275
and-0.260
jo-0.257
 Pen-0.241
Token It
Token ID632
Feature activation-5.81e-01

Pos logit contributions
ud+0.134
and+0.133
Jo+0.129
rum+0.123
jo+0.123
Neg logit contributions
-0.159
idents-0.131
 hotter-0.129
Sorry-0.127
 cooler-0.120
Token repeated
Token ID5100
Feature activation+5.27e-01

Pos logit contributions
+0.054
idents+0.044
 hotter+0.043
Sorry+0.042
cakes+0.040
Neg logit contributions
ud-0.049
and-0.047
Jo-0.047
rum-0.045
jo-0.045
Token what
Token ID644
Feature activation-1.46e+00

Pos logit contributions
ud+0.048
and+0.046
Jo+0.046
jo+0.044
rum+0.042
Neg logit contributions
-0.045
idents-0.038
 hotter-0.037
Sorry-0.035
 cooler-0.035
Token.
Token ID13
Feature activation-8.83e-02

Pos logit contributions
ud+0.148
Jo+0.144
jo+0.138
and+0.137
rum+0.136
Neg logit contributions
-0.113
 cooler-0.089
idents-0.087
 hotter-0.085
 unst-0.081
Token So
Token ID1406
Feature activation+9.06e-01

Pos logit contributions
+0.007
idents+0.006
 hotter+0.006
Sorry+0.006
 cooler+0.005
Neg logit contributions
ud-0.008
and-0.008
Jo-0.008
rum-0.008
jo-0.008
Token,
Token ID11
Feature activation+2.26e-01

Pos logit contributions
ud+0.072
Jo+0.071
and+0.071
rum+0.067
 Ali+0.067
Neg logit contributions
-0.088
idents-0.072
 hotter-0.071
Sorry-0.069
cakes-0.067
Token he
Token ID339
Feature activation-1.57e-01

Pos logit contributions
Jo+0.019
and+0.019
ud+0.019
 Ali+0.018
 Pen+0.018
Neg logit contributions
-0.021
idents-0.017
 hotter-0.016
Sorry-0.016
cakes-0.015
Token used
Token ID973
Feature activation+7.70e-02

Pos logit contributions
+0.013
idents+0.010
Sorry+0.010
 hotter+0.010
 cooler+0.010
Neg logit contributions
ud-0.015
and-0.015
Jo-0.014
rum-0.014
jo-0.014
Token his
Token ID465
Feature activation+1.23e+00

Pos logit contributions
ud+0.007
and+0.007
Jo+0.006
jo+0.006
rum+0.006
Neg logit contributions
-0.007
 hotter-0.006
idents-0.006
 cooler-0.005
Sorry-0.005
Token foot
Token ID2366
Feature activation+5.13e+00

Pos logit contributions
ud+0.098
and+0.097
Jo+0.095
rum+0.092
jo+0.091
Neg logit contributions
-0.115
 hotter-0.095
idents-0.095
Sorry-0.094
cakes-0.088
Token to
Token ID284
Feature activation+7.42e-01

Pos logit contributions
ud+0.379
Jo+0.371
and+0.360
jo+0.347
rum+0.346
Neg logit contributions
-0.527
idents-0.435
 hotter-0.430
Sorry-0.423
 cooler-0.410
Token kick
Token ID4829
Feature activation+2.71e+00

Pos logit contributions
ud+0.071
and+0.070
Jo+0.069
rum+0.066
jo+0.066
Neg logit contributions
-0.061
idents-0.048
 hotter-0.047
Sorry-0.046
 cooler-0.043
Token the
Token ID262
Feature activation-4.52e-01

Pos logit contributions
ud+0.224
and+0.220
Jo+0.219
jo+0.209
rum+0.206
Neg logit contributions
-0.256
idents-0.210
 hotter-0.209
Sorry-0.201
 cooler-0.197
Token ball
Token ID2613
Feature activation+5.40e-01

Pos logit contributions
+0.042
idents+0.034
 hotter+0.034
Sorry+0.033
cakes+0.031
Neg logit contributions
ud-0.038
and-0.037
Jo-0.036
rum-0.035
jo-0.034
Token the
Token ID262
Feature activation-4.61e-01

Pos logit contributions
ud+0.064
and+0.062
Jo+0.061
rum+0.059
jo+0.058
Neg logit contributions
-0.066
idents-0.054
 hotter-0.053
Sorry-0.052
 cooler-0.049
Token forest
Token ID8222
Feature activation+5.46e-01

Pos logit contributions
+0.047
idents+0.040
 hotter+0.038
Sorry+0.038
 unst+0.036
Neg logit contributions
and-0.033
ud-0.033
Jo-0.033
 Pen-0.030
 Mae-0.029
Token to
Token ID284
Feature activation+7.22e-01

Pos logit contributions
ud+0.057
rum+0.055
Jo+0.055
jo+0.052
Jamie+0.052
Neg logit contributions
-0.038
idents-0.030
 cooler-0.029
cakes-0.025
 unst-0.025
Token play
Token ID711
Feature activation-1.64e+00

Pos logit contributions
Jo+0.055
and+0.054
 Pen+0.053
 Ali+0.053
 Mae+0.052
Neg logit contributions
-0.070
idents-0.060
Sorry-0.058
 hotter-0.058
cakes-0.056
Token with
Token ID351
Feature activation+1.60e-01

Pos logit contributions
+0.166
idents+0.141
 hotter+0.138
Sorry+0.135
cakes+0.128
Neg logit contributions
and-0.125
ud-0.122
Jo-0.119
rum-0.113
jo-0.111
Token her
Token ID607
Feature activation+7.00e-01

Pos logit contributions
ud+0.014
and+0.013
Jo+0.013
jo+0.013
rum+0.012
Neg logit contributions
-0.015
 hotter-0.012
 cooler-0.012
idents-0.012
Sorry-0.012
Token friends
Token ID2460
Feature activation-2.36e-01

Pos logit contributions
ud+0.042
and+0.042
Jo+0.041
 Pen+0.039
 Ali+0.039
Neg logit contributions
-0.078
 hotter-0.070
idents-0.068
Sorry-0.068
 surely-0.063
Token.
Token ID13
Feature activation-7.76e-02

Pos logit contributions
+0.019
 cooler+0.014
idents+0.014
 hotter+0.014
Sorry+0.013
Neg logit contributions
Jo-0.023
ud-0.023
and-0.023
jo-0.022
rum-0.022
Token Suddenly
Token ID24975
Feature activation+2.16e+00

Pos logit contributions
+0.008
 hotter+0.007
Sorry+0.007
idents+0.007
 unst+0.006
Neg logit contributions
and-0.006
ud-0.005
Jo-0.005
jo-0.005
 Pen-0.005
Token,
Token ID11
Feature activation+2.15e-01

Pos logit contributions
and+0.192
ud+0.191
Jo+0.183
rum+0.181
jo+0.177
Neg logit contributions
-0.192
idents-0.155
 hotter-0.154
Sorry-0.148
 cooler-0.147
Token they
Token ID484
Feature activation-1.46e+00

Pos logit contributions
 Pen+0.014
 Ali+0.014
Jo+0.014
 Mae+0.014
 Jacob+0.013
Neg logit contributions
-0.023
idents-0.019
 hotter-0.018
Sorry-0.018
 unst-0.017
Token she
Token ID673
Feature activation-4.49e-01

Pos logit contributions
 Ali+0.013
 Pen+0.013
 Mae+0.012
Jo+0.012
 Jacob+0.012
Neg logit contributions
-0.027
idents-0.022
cakes-0.021
 hotter-0.021
Sorry-0.021
Token saw
Token ID2497
Feature activation-6.23e-02

Pos logit contributions
+0.046
idents+0.038
 hotter+0.038
Sorry+0.037
 cooler+0.036
Neg logit contributions
ud-0.033
and-0.033
Jo-0.032
rum-0.031
jo-0.030
Token a
Token ID257
Feature activation+1.44e-01

Pos logit contributions
+0.006
 hotter+0.005
idents+0.005
Sorry+0.005
 cooler+0.004
Neg logit contributions
ud-0.005
and-0.005
Jo-0.005
rum-0.005
jo-0.005
Token hose
Token ID31489
Feature activation+9.23e-01

Pos logit contributions
ud+0.012
and+0.012
Jo+0.012
jo+0.011
rum+0.011
Neg logit contributions
-0.013
idents-0.011
 hotter-0.010
Sorry-0.010
cakes-0.010
Token.
Token ID13
Feature activation-7.51e-02

Pos logit contributions
ud+0.075
and+0.074
Jo+0.073
rum+0.069
jo+0.069
Neg logit contributions
-0.088
idents-0.072
 hotter-0.071
Sorry-0.070
 cooler-0.066
Token It
Token ID632
Feature activation-5.88e-01

Pos logit contributions
+0.008
idents+0.007
 hotter+0.006
Sorry+0.006
 cooler+0.006
Neg logit contributions
ud-0.006
and-0.005
Jo-0.005
rum-0.005
 Pen-0.005
Token was
Token ID373
Feature activation+3.97e-01

Pos logit contributions
+0.066
idents+0.057
 hotter+0.054
Sorry+0.053
cakes+0.051
Neg logit contributions
Jo-0.038
ud-0.038
 Ali-0.034
jo-0.034
and-0.034
Token dripping
Token ID37472
Feature activation-1.39e-01

Pos logit contributions
ud+0.029
Jo+0.027
and+0.027
 Pen+0.027
 Ali+0.027
Neg logit contributions
-0.040
idents-0.035
Sorry-0.033
cakes-0.032
 hotter-0.031
Token water
Token ID1660
Feature activation-1.35e+00

Pos logit contributions
+0.015
idents+0.013
Sorry+0.012
cakes+0.011
 hotter+0.011
Neg logit contributions
ud-0.010
and-0.009
rum-0.008
Jo-0.008
 Pen-0.008
Token.
Token ID13
Feature activation-7.05e-02

Pos logit contributions
+0.129
idents+0.106
 hotter+0.103
Sorry+0.103
cakes+0.096
Neg logit contributions
ud-0.110
and-0.108
Jo-0.105
rum-0.100
jo-0.100
Token Lily
Token ID20037
Feature activation-1.06e+00

Pos logit contributions
+0.007
idents+0.006
 hotter+0.006
Sorry+0.006
 cooler+0.006
Neg logit contributions
and-0.005
ud-0.005
Jo-0.005
 Ali-0.005
 Pen-0.005
Token."
Token ID526
Feature activation+1.44e+00

Pos logit contributions
ud+0.099
Jo+0.096
and+0.093
jo+0.090
rum+0.089
Neg logit contributions
-0.104
idents-0.085
Sorry-0.078
 hotter-0.078
 cooler-0.078
Token
Token ID198
Feature activation-1.22e-01

Pos logit contributions
ud+0.140
Jo+0.133
rum+0.133
and+0.132
jo+0.128
Neg logit contributions
-0.120
idents-0.089
Sorry-0.088
 hotter-0.086
 cooler-0.081
Token
Token ID198
Feature activation-1.11e-01

Pos logit contributions
+0.009
Sorry+0.006
idents+0.006
 hotter+0.006
 cooler+0.005
Neg logit contributions
ud-0.013
rum-0.012
and-0.012
Jo-0.012
jo-0.012
TokenShe
Token ID3347
Feature activation+1.78e-01

Pos logit contributions
+0.010
Sorry+0.009
 hotter+0.007
Thank+0.007
 cooler+0.007
Neg logit contributions
ud-0.009
rum-0.009
jo-0.009
and-0.008
Jo-0.008
Token took
Token ID1718
Feature activation-7.37e-02

Pos logit contributions
ud+0.017
and+0.017
Jo+0.016
rum+0.016
jo+0.016
Neg logit contributions
-0.015
idents-0.012
 hotter-0.011
Sorry-0.011
 cooler-0.011
Token them
Token ID606
Feature activation-7.06e-01

Pos logit contributions
+0.007
 hotter+0.006
idents+0.005
Sorry+0.005
 cooler+0.005
Neg logit contributions
ud-0.006
and-0.006
Jo-0.006
rum-0.006
jo-0.006
Token to
Token ID284
Feature activation+7.38e-01

Pos logit contributions
+0.063
idents+0.053
 hotter+0.051
Sorry+0.050
 cooler+0.049
Neg logit contributions
ud-0.060
Jo-0.058
and-0.057
rum-0.056
jo-0.055
Token the
Token ID262
Feature activation-4.51e-01

Pos logit contributions
ud+0.067
and+0.065
Jo+0.063
jo+0.063
rum+0.062
Neg logit contributions
-0.066
idents-0.052
 hotter-0.052
 cooler-0.050
Sorry-0.049
Token kitchen
Token ID9592
Feature activation+4.37e-01

Pos logit contributions
+0.046
idents+0.039
 hotter+0.038
Sorry+0.037
 cooler+0.036
Neg logit contributions
ud-0.034
and-0.033
Jo-0.032
rum-0.031
jo-0.031
Token and
Token ID290
Feature activation-3.47e-01

Pos logit contributions
ud+0.040
Jo+0.039
and+0.038
jo+0.038
rum+0.037
Neg logit contributions
-0.038
idents-0.031
 cooler-0.030
 hotter-0.029
cakes-0.028
Token showed
Token ID3751
Feature activation-2.28e+00

Pos logit contributions
+0.031
idents+0.025
 hotter+0.024
Sorry+0.024
 cooler+0.021
Neg logit contributions
ud-0.030
and-0.030
Jo-0.030
 Ali-0.029
 Pen-0.029
Token.
Token ID13
Feature activation-5.70e-02

Pos logit contributions
ud+0.177
and+0.177
Jo+0.169
rum+0.163
jo+0.161
Neg logit contributions
-0.225
idents-0.184
 hotter-0.180
Sorry-0.180
cakes-0.170
Token
Token ID198
Feature activation-1.29e-01

Pos logit contributions
+0.005
idents+0.004
Sorry+0.004
 hotter+0.003
cakes+0.003
Neg logit contributions
ud-0.006
and-0.005
rum-0.005
Jo-0.005
jo-0.005
Token
Token ID198
Feature activation-1.16e-01

Pos logit contributions
+0.008
Sorry+0.005
idents+0.005
cakes+0.004
 hotter+0.004
Neg logit contributions
ud-0.014
rum-0.014
and-0.013
weet-0.013
 Pen-0.013
Token"
Token ID1
Feature activation-4.88e-01

Pos logit contributions
+0.009
Sorry+0.008
idents+0.007
cakes+0.007
 surely+0.006
Neg logit contributions
ud-0.011
and-0.010
rum-0.010
 Pen-0.010
Jo-0.009
TokenMom
Token ID29252
Feature activation-2.13e+00

Pos logit contributions
+0.038
idents+0.030
 hotter+0.028
 cooler+0.026
Sorry+0.026
Neg logit contributions
ud-0.049
Jo-0.049
and-0.048
jo-0.045
rum-0.045
Tokenmy
Token ID1820
Feature activation+2.30e-01

Pos logit contributions
+0.189
idents+0.153
 hotter+0.151
Sorry+0.147
 cooler+0.140
Neg logit contributions
ud-0.186
and-0.186
Jo-0.181
rum-0.173
jo-0.172
Token,
Token ID11
Feature activation+2.29e-01

Pos logit contributions
and+0.019
ud+0.019
Jo+0.018
rum+0.017
jo+0.017
Neg logit contributions
-0.022
idents-0.018
 hotter-0.018
Sorry-0.018
 cooler-0.017
Token can
Token ID460
Feature activation-1.85e+00

Pos logit contributions
ud+0.021
and+0.021
Jo+0.020
jo+0.019
rum+0.019
Neg logit contributions
-0.019
idents-0.016
 hotter-0.016
 cooler-0.015
Sorry-0.015
Token you
Token ID345
Feature activation-2.32e-01

Pos logit contributions
+0.198
 hotter+0.172
idents+0.168
Sorry+0.163
 cooler+0.160
Neg logit contributions
Jo-0.117
ud-0.114
and-0.112
 Pen-0.111
 Ali-0.110
Token help
Token ID1037
Feature activation-1.74e+00

Pos logit contributions
+0.020
 hotter+0.017
idents+0.017
Sorry+0.016
cakes+0.015
Neg logit contributions
and-0.020
Jo-0.019
ud-0.019
rum-0.018
old-0.018
Token me
Token ID502
Feature activation+1.13e+00

Pos logit contributions
+0.104
idents+0.092
 hotter+0.089
 cooler+0.086
Sorry+0.083
Neg logit contributions
ud-0.187
Jo-0.177
and-0.177
rum-0.172
jo-0.171
Token there
Token ID612
Feature activation-4.26e-01

Pos logit contributions
+0.065
idents+0.046
 hotter+0.043
 cooler+0.040
Sorry+0.037
Neg logit contributions
ud-0.139
Jo-0.138
and-0.136
rum-0.135
jo-0.128
Token was
Token ID373
Feature activation+4.14e-01

Pos logit contributions
+0.035
idents+0.026
 hotter+0.024
 cooler+0.024
Sorry+0.024
Neg logit contributions
ud-0.042
and-0.041
rum-0.040
weet-0.039
Jo-0.039
Token a
Token ID257
Feature activation+1.59e-01

Pos logit contributions
rum+0.042
ud+0.042
jo+0.041
Jo+0.040
and+0.040
Neg logit contributions
-0.031
 cooler-0.023
 hotter-0.022
idents-0.022
Sorry-0.021
Token little
Token ID1310
Feature activation+1.20e+00

Pos logit contributions
ud+0.013
and+0.012
Jo+0.012
rum+0.012
jo+0.012
Neg logit contributions
-0.016
idents-0.013
 hotter-0.013
Sorry-0.012
 cooler-0.012
Token girl
Token ID2576
Feature activation-4.82e-01

Pos logit contributions
ud+0.115
and+0.115
Jo+0.112
rum+0.109
jo+0.108
Neg logit contributions
-0.097
idents-0.076
 hotter-0.075
Sorry-0.073
 cooler-0.070
Token called
Token ID1444
Feature activation+3.83e-01

Pos logit contributions
+0.032
Sorry+0.023
 continued+0.022
 still+0.020
 cooler+0.020
Neg logit contributions
rum-0.049
and-0.049
ud-0.048
Jo-0.048
 Jacob-0.047
Token T
Token ID309
Feature activation+1.04e+00

Pos logit contributions
ud+0.039
rum+0.038
and+0.036
jo+0.035
Jo+0.035
Neg logit contributions
-0.030
 Ros-0.021
idents-0.021
 Lil-0.020
cakes-0.020
Tokenilly
Token ID6548
Feature activation-6.88e-01

Pos logit contributions
and+0.086
ud+0.086
Jo+0.084
rum+0.082
jo+0.079
Neg logit contributions
-0.095
idents-0.078
 hotter-0.074
Sorry-0.073
cakes-0.071
Token.
Token ID13
Feature activation-5.94e-02

Pos logit contributions
+0.048
idents+0.034
 cooler+0.033
cakes+0.031
 unst+0.030
Neg logit contributions
ud-0.072
Jo-0.069
and-0.069
rum-0.067
Jamie-0.065
Token One
Token ID1881
Feature activation+3.00e-01

Pos logit contributions
+0.006
 hotter+0.006
Sorry+0.005
idents+0.005
cakes+0.005
Neg logit contributions
ud-0.004
and-0.004
Jo-0.004
jo-0.004
 Ali-0.004
Token day
Token ID1110
Feature activation+1.50e+00

Pos logit contributions
ud+0.023
and+0.022
Jo+0.022
rum+0.021
jo+0.020
Neg logit contributions
-0.030
 hotter-0.025
idents-0.025
Sorry-0.025
 cooler-0.024
Token scared
Token ID12008
Feature activation+5.84e-02

Pos logit contributions
ud+0.007
Jo+0.007
and+0.007
 Pen+0.006
jo+0.006
Neg logit contributions
-0.011
idents-0.010
 hotter-0.010
Sorry-0.009
cakes-0.009
Token girl
Token ID2576
Feature activation-5.27e-01

Pos logit contributions
and+0.006
ud+0.005
rum+0.005
Jo+0.005
Jamie+0.005
Neg logit contributions
-0.005
 cooler-0.004
 hotter-0.004
idents-0.004
 sack-0.003
Token.
Token ID13
Feature activation-1.15e-01

Pos logit contributions
+0.034
 cooler+0.024
 sack+0.022
 continued+0.022
 unst+0.022
Neg logit contributions
rum-0.053
Jo-0.052
ud-0.052
Jamie-0.052
 Jacob-0.050
Token She
Token ID1375
Feature activation+3.39e-01

Pos logit contributions
+0.010
idents+0.007
cakes+0.007
 hotter+0.007
Sorry+0.007
Neg logit contributions
ud-0.012
rum-0.011
Jo-0.011
and-0.011
jo-0.010
Token was
Token ID373
Feature activation+3.56e-01

Pos logit contributions
and+0.027
ud+0.026
rum+0.026
Jo+0.024
weet+0.023
Neg logit contributions
-0.032
idents-0.027
Sorry-0.027
 cooler-0.027
 hotter-0.026
Token very
Token ID845
Feature activation-9.28e-01

Pos logit contributions
ud+0.038
rum+0.037
and+0.037
Jo+0.036
jo+0.035
Neg logit contributions
-0.026
 hotter-0.021
 cooler-0.021
idents-0.019
 unst-0.018
Token frightened
Token ID24776
Feature activation-1.27e+00

Pos logit contributions
+0.074
 hotter+0.058
idents+0.057
Sorry+0.056
 cooler+0.054
Neg logit contributions
ud-0.091
and-0.090
Jo-0.088
rum-0.086
jo-0.085
Token and
Token ID290
Feature activation-4.11e-01

Pos logit contributions
 too+0.036
 as+0.026
 cooler+0.023
+0.022
 hotter+0.021
Neg logit contributions
Jamie-0.169
rum-0.168
Jul-0.164
Jo-0.164
Jonathan-0.159
Token did
Token ID750
Feature activation-1.62e+00

Pos logit contributions
+0.029
idents+0.025
 cooler+0.024
 hotter+0.024
Sorry+0.023
Neg logit contributions
ud-0.044
and-0.043
rum-0.040
Jo-0.040
jo-0.038
Token not
Token ID407
Feature activation-1.07e-01

Pos logit contributions
+0.122
idents+0.103
Sorry+0.098
 hotter+0.095
 cooler+0.095
Neg logit contributions
ud-0.170
and-0.155
jo-0.151
Jo-0.151
rum-0.149
Token want
Token ID765
Feature activation+1.24e+00

Pos logit contributions
+0.009
 cooler+0.008
 hotter+0.008
 unst+0.008
idents+0.008
Neg logit contributions
ud-0.009
and-0.009
Jo-0.008
jo-0.008
 Jacob-0.008
Token,"
Token ID553
Feature activation+2.02e+00

Pos logit contributions
+0.045
idents+0.043
 cooler+0.037
cakes+0.037
 work+0.034
Neg logit contributions
ud-0.062
and-0.054
Jo-0.054
rum-0.053
Jamie-0.053
Token he
Token ID339
Feature activation-1.49e-01

Pos logit contributions
ud+0.186
and+0.185
Jo+0.183
rum+0.174
jo+0.174
Neg logit contributions
-0.170
idents-0.134
 hotter-0.133
Sorry-0.129
 cooler-0.121
Token said
Token ID531
Feature activation-5.78e-01

Pos logit contributions
+0.011
idents+0.008
 hotter+0.008
Sorry+0.008
 cooler+0.007
Neg logit contributions
ud-0.015
and-0.015
Jo-0.015
rum-0.014
jo-0.014
Token.
Token ID13
Feature activation-5.44e-02

Pos logit contributions
+0.050
idents+0.041
 hotter+0.040
Sorry+0.039
 cooler+0.038
Neg logit contributions
ud-0.054
and-0.051
Jo-0.050
jo-0.048
rum-0.047
Token "
Token ID366
Feature activation+8.18e-01

Pos logit contributions
+0.005
idents+0.004
cakes+0.004
Sorry+0.004
 hotter+0.004
Neg logit contributions
ud-0.005
Jo-0.004
and-0.004
rum-0.004
jo-0.004
TokenWhat
Token ID2061
Feature activation-4.55e+00

Pos logit contributions
ud+0.082
rum+0.078
Jo+0.077
jo+0.076
and+0.076
Neg logit contributions
-0.064
Sorry-0.051
idents-0.048
 hotter-0.048
cakes-0.045
Token kind
Token ID1611
Feature activation+2.60e+00

Pos logit contributions
idents+0.164
 cooler+0.156
+0.150
 taste+0.149
 reward+0.139
Neg logit contributions
ud-0.572
jo-0.549
Jo-0.536
rum-0.533
anna-0.518
Token of
Token ID286
Feature activation+2.84e+00

Pos logit contributions
ud+0.190
and+0.176
Jo+0.175
rum+0.172
jo+0.162
Neg logit contributions
-0.267
cakes-0.234
 hotter-0.232
 cooler-0.228
Sorry-0.226
Token competition
Token ID5449
Feature activation-2.22e+00

Pos logit contributions
ud+0.342
and+0.335
jo+0.320
rum+0.312
Jo+0.312
Neg logit contributions
-0.181
idents-0.126
 cooler-0.124
 hotter-0.121
Sorry-0.117
Token?"
Token ID1701
Feature activation-7.89e-01

Pos logit contributions
+0.206
idents+0.170
 hotter+0.167
Sorry+0.163
 cooler+0.157
Neg logit contributions
ud-0.188
and-0.184
Jo-0.181
rum-0.172
jo-0.172
Token
Token ID198
Feature activation-1.13e-01

Pos logit contributions
+0.072
Sorry+0.057
idents+0.057
 hotter+0.056
cakes+0.055
Neg logit contributions
ud-0.070
and-0.064
rum-0.063
Jo-0.063
jo-0.061
Token the
Token ID262
Feature activation-4.56e-01

Pos logit contributions
+0.226
 hotter+0.181
idents+0.177
Sorry+0.175
 cooler+0.172
Neg logit contributions
ud-0.217
and-0.214
jo-0.208
Jo-0.208
rum-0.203
Token water
Token ID1660
Feature activation-1.34e+00

Pos logit contributions
+0.042
idents+0.034
Sorry+0.032
 hotter+0.031
cakes+0.030
Neg logit contributions
ud-0.039
and-0.038
Jo-0.036
 Mae-0.034
jo-0.034
Token.
Token ID13
Feature activation-5.95e-02

Pos logit contributions
+0.119
 hotter+0.094
idents+0.094
 cooler+0.090
Sorry+0.089
Neg logit contributions
ud-0.117
Jo-0.116
and-0.115
jo-0.111
rum-0.110
Token She
Token ID1375
Feature activation+3.89e-01

Pos logit contributions
+0.006
idents+0.005
 hotter+0.005
Sorry+0.004
 cooler+0.004
Neg logit contributions
ud-0.005
and-0.005
Jo-0.005
rum-0.005
jo-0.005
Token watched
Token ID7342
Feature activation+2.63e+00

Pos logit contributions
and+0.039
rum+0.038
ud+0.037
jo+0.036
Jo+0.036
Neg logit contributions
-0.029
 cooler-0.023
 hotter-0.022
idents-0.022
Sorry-0.022
Token as
Token ID355
Feature activation-4.25e+00

Pos logit contributions
ud+0.222
and+0.218
Jo+0.214
rum+0.205
jo+0.205
Neg logit contributions
-0.244
idents-0.198
 hotter-0.196
Sorry-0.192
 cooler-0.182
Token the
Token ID262
Feature activation-4.51e-01

Pos logit contributions
+0.360
 hotter+0.288
idents+0.274
 cooler+0.270
Sorry+0.267
Neg logit contributions
and-0.388
ud-0.384
rum-0.377
jo-0.369
Jo-0.366
Token sand
Token ID6450
Feature activation+4.64e-01

Pos logit contributions
+0.037
idents+0.029
 hotter+0.027
Sorry+0.027
 unst+0.025
Neg logit contributions
ud-0.043
and-0.043
Jo-0.043
jo-0.040
 Ali-0.040
Token made
Token ID925
Feature activation+7.30e-01

Pos logit contributions
ud+0.043
and+0.041
Jo+0.041
 Ali+0.039
 Mae+0.039
Neg logit contributions
-0.038
idents-0.030
Sorry-0.030
 hotter-0.028
 unst-0.026
Token little
Token ID1310
Feature activation+1.19e+00

Pos logit contributions
ud+0.055
and+0.055
Jo+0.054
jo+0.050
rum+0.050
Neg logit contributions
-0.074
idents-0.061
 hotter-0.060
Sorry-0.060
 cooler-0.056
Token bubbles
Token ID25037
Feature activation-4.62e-02

Pos logit contributions
ud+0.111
and+0.109
Jo+0.107
rum+0.104
jo+0.103
Neg logit contributions
-0.098
idents-0.079
 hotter-0.078
Sorry-0.076
 cooler-0.073
Token loves
Token ID10408
Feature activation+2.86e+00

Pos logit contributions
ud+0.039
and+0.039
Jo+0.038
rum+0.036
jo+0.036
Neg logit contributions
-0.043
idents-0.035
 hotter-0.034
Sorry-0.034
 cooler-0.032
Token.
Token ID13
Feature activation-1.53e-01

Pos logit contributions
ud+0.264
Jo+0.259
and+0.255
jo+0.247
rum+0.247
Neg logit contributions
-0.232
idents-0.184
 cooler-0.179
Sorry-0.178
cakes-0.173
Token I
Token ID314
Feature activation+2.55e+00

Pos logit contributions
+0.015
Sorry+0.011
idents+0.011
 hotter+0.010
cakes+0.010
Neg logit contributions
ud-0.013
and-0.013
Jo-0.012
jo-0.012
rum-0.012
Token'm
Token ID1101
Feature activation-1.42e+00

Pos logit contributions
ud+0.197
Jo+0.194
and+0.192
rum+0.181
jo+0.181
Neg logit contributions
-0.251
 hotter-0.210
idents-0.208
Sorry-0.198
cakes-0.192
Token so
Token ID523
Feature activation+7.47e-01

Pos logit contributions
+0.116
 hotter+0.091
idents+0.090
Sorry+0.087
 cooler+0.085
Neg logit contributions
ud-0.138
and-0.135
Jo-0.131
jo-0.128
rum-0.127
Token proud
Token ID6613
Feature activation-4.75e+00

Pos logit contributions
ud+0.082
and+0.079
Jo+0.078
jo+0.077
rum+0.076
Neg logit contributions
-0.052
 hotter-0.039
idents-0.039
Sorry-0.038
 cooler-0.036
Token of
Token ID286
Feature activation+2.80e+00

Pos logit contributions
+0.360
 cooler+0.324
cakes+0.321
Sorry+0.306
 hotter+0.305
Neg logit contributions
Jo-0.411
ud-0.407
jo-0.406
rum-0.395
Jamie-0.378
Token you
Token ID345
Feature activation-2.76e-01

Pos logit contributions
ud+0.283
jo+0.275
and+0.273
Jo+0.267
rum+0.247
Neg logit contributions
-0.225
Sorry-0.171
idents-0.167
 hotter-0.166
 cooler-0.162
Token.
Token ID13
Feature activation-1.01e-01

Pos logit contributions
+0.023
idents+0.019
 cooler+0.018
 hotter+0.017
Sorry+0.017
Neg logit contributions
ud-0.026
Jo-0.025
and-0.024
jo-0.023
rum-0.023
Token You
Token ID921
Feature activation-9.74e-02

Pos logit contributions
+0.009
Sorry+0.007
idents+0.007
 hotter+0.006
cakes+0.006
Neg logit contributions
ud-0.009
and-0.009
Jo-0.009
jo-0.008
rum-0.008
Token helped
Token ID4193
Feature activation+2.49e-01

Pos logit contributions
+0.010
 hotter+0.008
idents+0.008
Sorry+0.008
cakes+0.007
Neg logit contributions
ud-0.007
Jo-0.007
and-0.007
rum-0.007
jo-0.007
Token baseball
Token ID9283
Feature activation-3.35e-02

Pos logit contributions
ud+0.099
and+0.096
Jo+0.092
rum+0.090
jo+0.088
Neg logit contributions
-0.110
Sorry-0.095
 hotter-0.094
idents-0.092
cakes-0.091
Token and
Token ID290
Feature activation-3.45e-01

Pos logit contributions
+0.003
idents+0.003
 hotter+0.003
Sorry+0.003
cakes+0.002
Neg logit contributions
ud-0.003
and-0.003
Jo-0.003
rum-0.002
jo-0.002
Token said
Token ID531
Feature activation-5.79e-01

Pos logit contributions
+0.029
idents+0.023
 hotter+0.022
Sorry+0.022
cakes+0.020
Neg logit contributions
ud-0.032
and-0.032
Jo-0.031
rum-0.030
jo-0.030
Token,
Token ID11
Feature activation+2.45e-01

Pos logit contributions
+0.043
 hotter+0.035
Sorry+0.034
 cooler+0.034
idents+0.034
Neg logit contributions
ud-0.062
jo-0.056
Jo-0.055
and-0.055
weet-0.052
Token "
Token ID366
Feature activation+8.25e-01

Pos logit contributions
ud+0.019
and+0.018
Jo+0.018
jo+0.017
rum+0.017
Neg logit contributions
-0.025
idents-0.021
 hotter-0.021
Sorry-0.020
 cooler-0.019
TokenThank
Token ID10449
Feature activation-4.10e+00

Pos logit contributions
ud+0.096
and+0.095
Jo+0.094
jo+0.090
rum+0.090
Neg logit contributions
-0.050
idents-0.036
 hotter-0.035
Sorry-0.033
 cooler-0.031
Token you
Token ID345
Feature activation-2.30e-01

Pos logit contributions
+0.381
idents+0.306
Sorry+0.302
 hotter+0.300
cakes+0.280
Neg logit contributions
ud-0.350
and-0.341
Jo-0.335
jo-0.324
rum-0.321
Token.
Token ID13
Feature activation-6.57e-02

Pos logit contributions
+0.020
idents+0.016
 hotter+0.016
Sorry+0.015
 cooler+0.015
Neg logit contributions
ud-0.021
and-0.020
Jo-0.020
rum-0.019
jo-0.019
Token I
Token ID314
Feature activation+2.64e+00

Pos logit contributions
+0.007
idents+0.005
Sorry+0.005
 hotter+0.005
cakes+0.005
Neg logit contributions
ud-0.005
and-0.005
Jo-0.005
jo-0.005
rum-0.005
Token'm
Token ID1101
Feature activation-1.34e+00

Pos logit contributions
ud+0.206
Jo+0.202
and+0.201
rum+0.190
jo+0.190
Neg logit contributions
-0.258
 hotter-0.215
idents-0.214
Sorry-0.206
cakes-0.199
Token sorry
Token ID7926
Feature activation-1.44e+00

Pos logit contributions
+0.123
 hotter+0.100
idents+0.098
Sorry+0.094
 cooler+0.094
Neg logit contributions
ud-0.119
and-0.116
Jo-0.111
jo-0.110
rum-0.107
Token."
Token ID526
Feature activation+1.43e+00

Pos logit contributions
ud+0.067
and+0.066
Jo+0.065
rum+0.062
jo+0.062
Neg logit contributions
-0.069
idents-0.056
 hotter-0.056
Sorry-0.054
 cooler-0.052
Token
Token ID220
Feature activation+1.25e+00

Pos logit contributions
ud+0.118
and+0.116
Jo+0.114
rum+0.108
jo+0.108
Neg logit contributions
-0.135
idents-0.110
 hotter-0.109
Sorry-0.107
 cooler-0.101
Token
Token ID198
Feature activation-1.16e-01

Pos logit contributions
ud+0.117
Jo+0.113
and+0.111
rum+0.108
jo+0.108
Neg logit contributions
-0.104
idents-0.080
 hotter-0.078
Sorry-0.076
cakes-0.073
Token
Token ID198
Feature activation-1.18e-01

Pos logit contributions
+0.009
Sorry+0.006
idents+0.006
cakes+0.006
 hotter+0.006
Neg logit contributions
ud-0.012
rum-0.011
Jo-0.011
jo-0.011
and-0.011
TokenS
Token ID50
Feature activation-2.03e+00

Pos logit contributions
+0.009
Sorry+0.008
idents+0.006
cakes+0.006
 hotter+0.006
Neg logit contributions
ud-0.012
rum-0.011
jo-0.011
and-0.011
Jo-0.011
Tokenally
Token ID453
Feature activation-4.43e+00

Pos logit contributions
+0.214
 hotter+0.207
idents+0.187
 afloat+0.180
 unst+0.179
Neg logit contributions
ud-0.147
and-0.128
Jo-0.127
old-0.126
jo-0.121
Token nodded
Token ID14464
Feature activation+1.30e+00

Pos logit contributions
+0.354
idents+0.276
Sorry+0.271
 hotter+0.261
 cooler+0.258
Neg logit contributions
and-0.422
ud-0.415
rum-0.395
Jo-0.386
weet-0.380
Token and
Token ID290
Feature activation-3.46e-01

Pos logit contributions
and+0.087
Jo+0.084
ud+0.083
jo+0.079
rum+0.078
Neg logit contributions
-0.144
idents-0.123
 hotter-0.119
Sorry-0.118
cakes-0.115
Token began
Token ID2540
Feature activation-2.84e+00

Pos logit contributions
+0.026
idents+0.020
 hotter+0.020
Sorry+0.020
 cooler+0.019
Neg logit contributions
ud-0.035
and-0.035
Jo-0.033
rum-0.033
jo-0.033
Token to
Token ID284
Feature activation+7.48e-01

Pos logit contributions
+0.260
 hotter+0.210
 cooler+0.202
 unst+0.201
idents+0.199
Neg logit contributions
ud-0.250
and-0.231
jo-0.231
Jo-0.224
rum-0.220
Token climb
Token ID12080
Feature activation+5.33e-01

Pos logit contributions
ud+0.072
and+0.071
Jo+0.069
rum+0.066
jo+0.066
Neg logit contributions
-0.061
idents-0.048
 hotter-0.048
Sorry-0.046
 cooler-0.044