TOP ACTIVATIONS
MAX = 0.106

who liked to play hide and seek . One day ,
bunny liked to play hide and seek with his friends .
Betsy liked to play hide and seek with her friends .
who loved to play hide and seek . One day ,
they decide to play hide and seek . Tom is the
She loved to play hide and seek with her toys .
She loved to play hide and seek with her friends .
Benny loved to play hide and seek with his friends .
She loved to play hide and seek with her friends .
my loved to play hide and seek with his friends .
that said " No T resp assing ". But Lily was
ucky loved to play hide and seek with her friends .
She loved to play hide and seek with her friends .
ens loved to play hide and seek with Lily . One
mess . There are pots and pans on the floor .
tea parties and play hide and seek . Sally loved her
Tim loved to play hide and seek with his friends .
He loved to play hide and seek with his friends .
and loved to play hide and seek with the other animals
my loved to play hide and seek with his friends .

MIN ACTIVATIONS
MIN = -0.082

smiles were big . The end . <|endoftext|> Once upon a
at drawing pictures . The end . <|endoftext|> Once upon a
best friends forever . The end . <|endoftext|> Once upon a
their little girl . The end . <|endoftext|> Once upon a
play together nicely . The end . <|endoftext|> Once upon a
even closer friends . The end . <|endoftext|> Once upon a
the broken shelf . The end . <|endoftext|> Once upon a
icy black board . The end . <|endoftext|> Once upon a
amount of dessert . The end . <|endoftext|> Once upon a
And their mom . The end . <|endoftext|> Ben and Lily
happy and cozy . The end . <|endoftext|> Once upon a
other 's hair . The end . <|endoftext|> Once upon a
an important lesson . The end . <|endoftext|> Once upon a
happy and grateful . The end . <|endoftext|> Tim and Sue
just a fruit . The end . <|endoftext|> Once upon a
on their adventure . The end . <|endoftext|> Once upon a
happy and smiling . The end . <|endoftext|> Once upon a
bed every day . The end . <|endoftext|> Once upon a
the piano keys . The end . <|endoftext|> Once upon a
best friends forever . The end . <|endoftext|> Once upon a

INTERVAL 0.059 - 0.106
CONTAINS 0.351%

to decor ate the room to make it look all pretty
hive . They worked hard to make honey . One day
cut and fold the cardboard to make it look like shields
wanted to make new friends and learn new things .
. She loved to play with her toys all day .

INTERVAL 0.012 - 0.059
CONTAINS 29.455%

Help me , Tim !" Tim tries to help
with the chain . But Mom does not like
It was a big flag , with bright colors . It
the best of friends and played together every day . Peter
his way and could not find his family . The little

INTERVAL -0.035 - 0.012
CONTAINS 64.838%

make it brilliant again . <|endoftext|> Once upon a time ,
the wolf . He kicked him with his ho oves .
to touch the slide . He left the box and walked
She ran back in to show her mum all the stones
. He followed the aroma until he came to a clearing

INTERVAL -0.082 - -0.035
CONTAINS 5.355%

The tree didn 't say anything , but Spot knew the
said , " OK . I love you too . Let
" You 're welcome , my sweet ies . Now let
and make things . One day , they find a toy
The little boy smiled brightly , feeling satisfied and
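
The four INTERVAL blocks above are consistent with splitting the range from MIN = -0.082 to MAX = 0.106 into four equal-width bins (each 0.047 wide) and reporting the share of token activations in each bin. A minimal Python sketch of that bucketing follows. The helper name `interval_shares`, the default bin count, and the sample activations are illustrative assumptions, not taken from the tool that produced this dump.

```python
from typing import List, Tuple


def interval_shares(
    activations: List[float], n_bins: int = 4
) -> List[Tuple[float, float, float]]:
    """Split [min, max] into n_bins equal-width intervals.

    Returns (low, high, percent) per interval, highest interval first,
    mirroring the order of the INTERVAL blocks. Hypothetical helper.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    lo, hi = min(activations), max(activations)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for a in activations:
        # Clamp so the maximum lands in the top bin; guard width == 0.
        idx = min(int((a - lo) / width), n_bins - 1) if width > 0 else 0
        counts[idx] += 1
    total = len(activations)
    rows = [
        (lo + i * width, lo + (i + 1) * width, 100.0 * counts[i] / total)
        for i in range(n_bins)
    ]
    return rows[::-1]


# Made-up sample whose extremes match MIN = -0.082 and MAX = 0.106.
sample = [-0.082, -0.050, -0.010, 0.000, 0.005, 0.020, 0.040, 0.106]
for low, high, pct in interval_shares(sample):
    print(f"INTERVAL {low:.3f} - {high:.3f}  CONTAINS {pct:.3f}%")
```

Run over the full set of activations, the four shares should sum to roughly 100%, as the reported values do (0.351 + 29.455 + 64.838 + 5.355 = 99.999).
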
Token = ' who'
Token ID = 508
Feature activation = +1.61e-02

Pos logit contributions
 conclusions+0.053
 pans+0.053
Fl+0.053
Moon+0.052
ts+0.052
Neg logit contributions
:-0.029
sed-0.027
 fault-0.026
 okay-0.024
 enough-0.023
Token = ' liked'
Token ID = 8288
Feature activation = +1.98e-03

Pos logit contributions
 conclusions+0.035
 pans+0.033
eds+0.033
Moon+0.033
ts+0.032
Neg logit contributions
:-0.019
sed-0.017
 fault-0.017
 okay-0.017
 happened-0.016
Token = ' to'
Token ID = 284
Feature activation = +5.09e-02

Pos logit contributions
Fl+0.005
 conclusions+0.005
Moon+0.005
ts+0.005
eds+0.005
Neg logit contributions
 fault-0.001
:-0.001
sed-0.001
 enough-0.001
 okay-0.001
Token = ' play'
Token ID = 711
Feature activation = +4.31e-02

Pos logit contributions
eds+0.074
Fl+0.071
 pans+0.071
ations+0.070
Brown+0.070
Neg logit contributions
:-0.092
sed-0.088
 okay-0.086
 fault-0.083
 tum-0.081
Token = ' hide'
Token ID = 7808
Feature activation = +2.77e-02

Pos logit contributions
 conclusions+0.074
 pans+0.073
eds+0.073
Fl+0.073
Brown+0.072
Neg logit contributions
:-0.066
 okay-0.060
sed-0.058
 about-0.058
 enough-0.056
Token = ' and'
Token ID = 290
Feature activation = +1.06e-01

Pos logit contributions
Moon+0.063
Brown+0.062
 conclusions+0.061
Bird+0.060
Fl+0.060
Neg logit contributions
:-0.029
 enough-0.025
 about-0.023
 okay-0.022
 fault-0.021
Token = ' seek'
Token ID = 5380
Feature activation = +3.94e-02

Pos logit contributions
irs+0.200
cking+0.198
ler+0.195
eling+0.193
cks+0.189
Neg logit contributions
 okay-0.140
,-0.140
 full-0.139
:-0.138
 tum-0.137
Token = '.'
Token ID = 13
Feature activation = +1.59e-02

Pos logit contributions
 conclusions+0.085
Moon+0.085
Fl+0.085
Bird+0.084
eds+0.084
Neg logit contributions
:-0.043
 enough-0.038
 fault-0.037
 okay-0.037
 about-0.036
Token = ' One'
Token ID = 1881
Feature activation = -2.13e-02

Pos logit contributions
eds+0.037
ts+0.035
Fl+0.035
 conclusions+0.035
 pans+0.035
Neg logit contributions
:-0.016
 fault-0.015
sed-0.015
 okay-0.014
 enough-0.013
Token = ' day'
Token ID = 1110
Feature activation = -2.89e-02

Pos logit contributions
 fault+0.038
sed+0.037
:+0.037
 okay+0.035
 enough+0.035
Neg logit contributions
Fl-0.031
Moon-0.030
eds-0.030
Bee-0.030
 conclusions-0.030
Token = ','
Token ID = 11
Feature activation = -6.71e-04

Pos logit contributions
 fault+0.038
sed+0.031
owl+0.028
ige+0.027
 okay+0.027
Neg logit contributions
Fl-0.062
 blankets-0.059
Bee-0.058
Moon-0.058
Buzz-0.058
Token = ' bunny'
Token ID = 44915
Feature activation = +3.72e-02

Pos logit contributions
ts+0.016
eds+0.016
 conclusions+0.016
cking+0.015
irs+0.015
Neg logit contributions
 okay-0.022
:-0.021
 best-0.021
sed-0.021
 enough-0.020
Token = ' liked'
Token ID = 8288
Feature activation = +2.83e-03

Pos logit contributions
 conclusions+0.075
 pans+0.073
Fl+0.072
Moon+0.072
eds+0.072
Neg logit contributions
:-0.049
sed-0.045
 fault-0.044
 okay-0.041
 enough-0.039
Token = ' to'
Token ID = 284
Feature activation = +5.12e-02

Pos logit contributions
 conclusions+0.008
Fl+0.008
Moon+0.008
Brown+0.008
ations+0.008
Neg logit contributions
:-0.001
 fault-0.001
sed-0.001
 about-0.001
 enough-0.001
Token = ' play'
Token ID = 711
Feature activation = +4.10e-02

Pos logit contributions
ations+0.079
eds+0.078
Brown+0.075
 pans+0.075
cking+0.075
Neg logit contributions
:-0.089
sed-0.085
 okay-0.081
 tum-0.079
 opinion-0.078
Token = ' hide'
Token ID = 7808
Feature activation = +2.89e-02

Pos logit contributions
 conclusions+0.073
 pans+0.072
eds+0.071
Brown+0.070
Moon+0.070
Neg logit contributions
:-0.063
 about-0.057
 okay-0.056
sed-0.054
 enough-0.052
Token = ' and'
Token ID = 290
Feature activation = +1.03e-01

Pos logit contributions
Moon+0.066
Brown+0.066
 conclusions+0.065
Bird+0.063
Fl+0.063
Neg logit contributions
:-0.030
 enough-0.026
 about-0.024
 okay-0.022
sed-0.021
Token = ' seek'
Token ID = 5380
Feature activation = +3.82e-02

Pos logit contributions
irs+0.202
ler+0.191
cking+0.191
eling+0.191
ations+0.187
Neg logit contributions
,-0.140
:-0.138
 full-0.137
 okay-0.133
 tum-0.132
Token = ' with'
Token ID = 351
Feature activation = +5.37e-02

Pos logit contributions
 conclusions+0.077
Moon+0.077
Fl+0.077
eds+0.076
Bird+0.076
Neg logit contributions
:-0.048
 enough-0.042
 okay-0.042
 fault-0.042
sed-0.042
Token = ' his'
Token ID = 465
Feature activation = +1.14e-03

Pos logit contributions
ations+0.115
ge+0.110
lling+0.109
cking+0.107
eling+0.106
Neg logit contributions
:-0.067
 enough-0.063
 tum-0.060
 no-0.060
 okay-0.059
Token = ' friends'
Token ID = 2460
Feature activation = +1.12e-02

Pos logit contributions
 conclusions+0.002
Fl+0.002
Moon+0.002
eds+0.002
ts+0.002
Neg logit contributions
sed-0.001
 fault-0.001
:-0.001
 okay-0.001
 enough-0.001
Token = '.'
Token ID = 13
Feature activation = +8.03e-03

Pos logit contributions
Fl+0.024
 conclusions+0.023
Moon+0.023
eds+0.023
Bee+0.023
Neg logit contributions
 fault-0.013
:-0.013
sed-0.012
 okay-0.011
 enough-0.011
Token = ' Betsy'
Token ID = 45775
Feature activation = +4.15e-02

Pos logit contributions
Fl+0.022
eds+0.021
 conclusions+0.021
Moon+0.021
ts+0.021
Neg logit contributions
 fault-0.038
:-0.038
sed-0.037
 okay-0.036
 enough-0.035
Token = ' liked'
Token ID = 8288
Feature activation = +4.36e-03

Pos logit contributions
 conclusions+0.086
 pans+0.083
eds+0.082
Moon+0.082
Fl+0.081
Neg logit contributions
:-0.053
 fault-0.047
sed-0.047
 okay-0.043
 happened-0.043
Token = ' to'
Token ID = 284
Feature activation = +4.87e-02

Pos logit contributions
 conclusions+0.012
Fl+0.012
ts+0.012
eds+0.012
Moon+0.012
Neg logit contributions
:-0.002
 fault-0.002
sed-0.002
 enough-0.002
 about-0.002
Token = ' play'
Token ID = 711
Feature activation = +4.26e-02

Pos logit contributions
eds+0.067
Fl+0.065
ations+0.063
Brown+0.062
 conclusions+0.062
Neg logit contributions
:-0.092
sed-0.088
 okay-0.085
 fault-0.084
 opinion-0.080
Token = ' hide'
Token ID = 7808
Feature activation = +3.09e-02

Pos logit contributions
 conclusions+0.076
eds+0.073
 pans+0.073
Brown+0.071
Fl+0.071
Neg logit contributions
:-0.065
 about-0.059
 okay-0.058
sed-0.056
 enough-0.054
Token = ' and'
Token ID = 290
Feature activation = +1.02e-01

Pos logit contributions
 conclusions+0.073
Moon+0.072
Brown+0.071
 sidelines+0.069
irs+0.069
Neg logit contributions
:-0.030
 enough-0.025
 about-0.024
 okay-0.021
sed-0.020
Token = ' seek'
Token ID = 5380
Feature activation = +3.77e-02

Pos logit contributions
irs+0.195
ler+0.185
eling+0.182
cking+0.182
 conclusions+0.179
Neg logit contributions
 okay-0.141
,-0.140
:-0.140
 full-0.137
 ending-0.133
Token = ' with'
Token ID = 351
Feature activation = +5.94e-02

Pos logit contributions
 conclusions+0.073
Moon+0.072
Fl+0.072
eds+0.072
Bird+0.071
Neg logit contributions
:-0.051
 enough-0.045
sed-0.045
 okay-0.045
 fault-0.044
Token = ' her'
Token ID = 607
Feature activation = +1.47e-03

Pos logit contributions
ations+0.116
cking+0.115
ler+0.114
eds+0.114
lling+0.112
Neg logit contributions
:-0.079
 enough-0.072
 okay-0.070
 about-0.068
 best-0.068
Token = ' friends'
Token ID = 2460
Feature activation = +1.15e-02

Pos logit contributions
Fl+0.003
 conclusions+0.003
Moon+0.003
eds+0.003
Bee+0.003
Neg logit contributions
 fault-0.002
sed-0.002
:-0.002
 okay-0.002
 enough-0.002
Token = '.'
Token ID = 13
Feature activation = +1.10e-02

Pos logit contributions
Fl+0.023
eds+0.022
Moon+0.022
Bee+0.022
ts+0.022
Neg logit contributions
 fault-0.015
sed-0.015
:-0.014
 okay-0.013
 enough-0.013
Token = ' who'
Token ID = 508
Feature activation = +1.79e-02

Pos logit contributions
 conclusions+0.004
Fl+0.004
Moon+0.003
 pans+0.003
Bee+0.003
Neg logit contributions
:-0.003
 fault-0.002
sed-0.002
 okay-0.002
 enough-0.002
Token = ' loved'
Token ID = 6151
Feature activation = +1.34e-02

Pos logit contributions
Fl+0.034
 conclusions+0.034
eds+0.034
Moon+0.034
 pans+0.033
Neg logit contributions
:-0.023
 fault-0.023
sed-0.022
 okay-0.021
 enough-0.020
Token = ' to'
Token ID = 284
Feature activation = +3.99e-02

Pos logit contributions
Fl+0.036
 conclusions+0.036
Bee+0.035
Ruby+0.035
Bird+0.035
Neg logit contributions
:-0.008
 fault-0.008
sed-0.007
 enough-0.007
 about-0.006
Token = ' play'
Token ID = 711
Feature activation = +3.82e-02

Pos logit contributions
Fl+0.060
eds+0.059
Bear+0.054
Bee+0.054
Ruby+0.054
Neg logit contributions
:-0.074
 fault-0.068
sed-0.068
 okay-0.068
 about-0.065
Token = ' hide'
Token ID = 7808
Feature activation = +2.83e-02

Pos logit contributions
eds+0.060
Fl+0.060
Bee+0.060
 conclusions+0.059
Moon+0.058
Neg logit contributions
:-0.064
 okay-0.059
 about-0.058
sed-0.057
 enough-0.057
Token = ' and'
Token ID = 290
Feature activation = +1.02e-01

Pos logit contributions
Moon+0.065
 conclusions+0.063
Bird+0.063
Brown+0.062
Bee+0.062
Neg logit contributions
:-0.029
 enough-0.026
 about-0.024
 okay-0.022
 fault-0.021
Token = ' seek'
Token ID = 5380
Feature activation = +3.77e-02

Pos logit contributions
cking+0.189
ler+0.185
cks+0.184
irs+0.183
eds+0.180
Neg logit contributions
 okay-0.142
 full-0.137
 ending-0.136
:-0.135
 tum-0.133
Token = '.'
Token ID = 13
Feature activation = +1.32e-02

Pos logit contributions
Moon+0.082
Fl+0.082
 conclusions+0.082
Bee+0.082
Bird+0.082
Neg logit contributions
:-0.041
 enough-0.036
 fault-0.036
 okay-0.035
 about-0.034
Token = ' One'
Token ID = 1881
Feature activation = -2.66e-02

Pos logit contributions
Fl+0.027
Moon+0.026
 conclusions+0.026
Bee+0.026
 pans+0.026
Neg logit contributions
 fault-0.016
sed-0.016
:-0.015
 okay-0.014
 enough-0.014
Token = ' day'
Token ID = 1110
Feature activation = -3.21e-02

Pos logit contributions
 fault+0.054
sed+0.051
:+0.049
 okay+0.048
 enough+0.047
Neg logit contributions
Fl-0.036
Bee-0.034
Moon-0.034
Bear-0.033
Bird-0.033
Token = ','
Token ID = 11
Feature activation = -4.99e-03

Pos logit contributions
 fault+0.049
sed+0.041
owl+0.037
 intentions+0.036
ope+0.035
Neg logit contributions
Fl-0.063
 blankets-0.059
ie-0.059
 pans-0.059
ts-0.058
Token = ' they'
Token ID = 484
Feature activation = +5.45e-03

Pos logit contributions
 fault+0.042
sed+0.039
owl+0.036
:+0.035
ield+0.035
Neg logit contributions
 pans-0.035
ts-0.034
Fl-0.034
 blankets-0.033
Moon-0.033
Token = ' decide'
Token ID = 5409
Feature activation = -3.65e-02

Pos logit contributions
Fl+0.008
Moon+0.008
 conclusions+0.008
eds+0.008
Bee+0.008
Neg logit contributions
 fault-0.010
:-0.009
sed-0.009
 okay-0.009
 enough-0.009
Token = ' to'
Token ID = 284
Feature activation = +3.58e-02

Pos logit contributions
sed+0.040
 fault+0.039
:+0.034
 okay+0.033
apped+0.031
Neg logit contributions
Fl-0.078
 pans-0.077
 conclusions-0.077
ts-0.077
 blankets-0.076
Token = ' play'
Token ID = 711
Feature activation = +2.74e-02

Pos logit contributions
Fl+0.056
eds+0.056
 pans+0.054
Bee+0.053
Bear+0.052
Neg logit contributions
:-0.062
 fault-0.060
 okay-0.057
 enough-0.056
sed-0.055
Token = ' hide'
Token ID = 7808
Feature activation = +1.91e-02

Pos logit contributions
eds+0.051
Bee+0.051
Fl+0.050
Moon+0.050
Brown+0.049
Neg logit contributions
:-0.040
 fault-0.036
 okay-0.036
 about-0.035
 enough-0.035
Token = ' and'
Token ID = 290
Feature activation = +1.01e-01

Pos logit contributions
Moon+0.041
Bird+0.041
Brown+0.041
Fl+0.040
Bee+0.040
Neg logit contributions
:-0.022
 enough-0.020
 fault-0.019
 okay-0.018
 about-0.018
Token = ' seek'
Token ID = 5380
Feature activation = +2.94e-02

Pos logit contributions
eds+0.181
cking+0.178
ler+0.173
icer+0.171
hing+0.171
Neg logit contributions
 okay-0.149
:-0.141
sed-0.139
 ending-0.138
 no-0.129
Token = '.'
Token ID = 13
Feature activation = -1.46e-02

Pos logit contributions
Fl+0.061
Moon+0.061
Bird+0.061
Bee+0.060
 conclusions+0.060
Neg logit contributions
:-0.035
 fault-0.032
 enough-0.031
sed-0.031
 okay-0.030
Token = ' Tom'
Token ID = 4186
Feature activation = +6.69e-03

Pos logit contributions
 fault+0.023
 okay+0.021
sed+0.021
 happened+0.020
:+0.020
Neg logit contributions
Fl-0.023
 pans-0.023
Moon-0.023
 blankets-0.023
 sofa-0.023
Token = ' is'
Token ID = 318
Feature activation = -2.01e-02

Pos logit contributions
Fl+0.011
 conclusions+0.011
Moon+0.011
eds+0.011
Bee+0.011
Neg logit contributions
 fault-0.010
:-0.010
sed-0.010
 okay-0.009
 enough-0.009
Token = ' the'
Token ID = 262
Feature activation = -8.75e-03

Pos logit contributions
 fault+0.033
sed+0.032
:+0.031
ield+0.029
ope+0.029
Neg logit contributions
ts-0.032
Fl-0.031
 pans-0.031
Moon-0.031
 blankets-0.030
Token = ' She'
Token ID = 1375
Feature activation = +1.97e-02

Pos logit contributions
Fl+0.027
Moon+0.026
Bee+0.026
 conclusions+0.026
 blankets+0.026
Neg logit contributions
 fault-0.026
sed-0.025
:-0.024
 okay-0.023
 enough-0.023
Token = ' loved'
Token ID = 6151
Feature activation = +1.57e-02

Pos logit contributions
Fl+0.037
Moon+0.036
eds+0.036
Bee+0.035
 conclusions+0.035
Neg logit contributions
 fault-0.028
sed-0.026
:-0.026
 okay-0.025
 enough-0.024
Token = ' to'
Token ID = 284
Feature activation = +4.12e-02

Pos logit contributions
 conclusions+0.045
Fl+0.045
Bird+0.044
Bee+0.044
Bear+0.043
Neg logit contributions
:-0.007
 fault-0.006
 about-0.006
sed-0.005
 enough-0.005
Token = ' play'
Token ID = 711
Feature activation = +3.69e-02

Pos logit contributions
Fl+0.054
eds+0.054
Bear+0.050
 conclusions+0.050
Moon+0.049
Neg logit contributions
:-0.082
sed-0.078
 fault-0.078
 okay-0.076
 about-0.073
Token = ' hide'
Token ID = 7808
Feature activation = +2.78e-02

Pos logit contributions
eds+0.058
 conclusions+0.057
Bee+0.056
Fl+0.056
Moon+0.055
Neg logit contributions
:-0.064
 okay-0.059
 about-0.059
sed-0.056
 enough-0.055
Token = ' and'
Token ID = 290
Feature activation = +1.01e-01

Pos logit contributions
Moon+0.062
 conclusions+0.062
Brown+0.061
Bird+0.061
Bear+0.060
Neg logit contributions
:-0.029
 enough-0.025
 about-0.024
 okay-0.023
sed-0.022
Token = ' seek'
Token ID = 5380
Feature activation = +3.51e-02

Pos logit contributions
irs+0.179
ler+0.178
cking+0.175
 conclusions+0.172
eds+0.171
Neg logit contributions
 okay-0.150
:-0.142
 ending-0.138
 full-0.138
,-0.137
Token = ' with'
Token ID = 351
Feature activation = +6.10e-02

Pos logit contributions
Fl+0.063
 conclusions+0.063
Moon+0.063
eds+0.063
Bee+0.062
Neg logit contributions
:-0.051
 fault-0.048
sed-0.047
 okay-0.046
 enough-0.046
Token = ' her'
Token ID = 607
Feature activation = -5.97e-04

Pos logit contributions
eds+0.116
cking+0.112
ations+0.111
bing+0.110
ler+0.109
Neg logit contributions
:-0.087
 enough-0.082
 about-0.078
 okay-0.077
 best-0.074
Token = ' toys'
Token ID = 14958
Feature activation = +3.39e-02

Pos logit contributions
 fault+0.001
sed+0.001
:+0.001
 okay+0.001
 enough+0.001
Neg logit contributions
Fl-0.001
 conclusions-0.001
eds-0.001
Moon-0.001
Bee-0.001
Token = '.'
Token ID = 13
Feature activation = +1.16e-02

Pos logit contributions
 blankets+0.057
ts+0.057
Fl+0.056
 pans+0.054
eds+0.053
Neg logit contributions
 fault-0.052
sed-0.052
ope-0.047
 okay-0.046
ield-0.045
Token = ' She'
Token ID = 1375
Feature activation = +2.22e-02

Pos logit contributions
Fl+0.039
Moon+0.037
 conclusions+0.037
Bee+0.037
eds+0.037
Neg logit contributions
 fault-0.027
sed-0.026
:-0.025
 okay-0.024
 enough-0.023
Token = ' loved'
Token ID = 6151
Feature activation = +1.75e-02

Pos logit contributions
Fl+0.042
 conclusions+0.041
eds+0.041
Moon+0.041
 pans+0.040
Neg logit contributions
:-0.030
 fault-0.030
sed-0.029
 okay-0.027
 enough-0.026
Token = ' to'
Token ID = 284
Feature activation = +3.91e-02

Pos logit contributions
 conclusions+0.050
Fl+0.050
Bee+0.048
Bear+0.048
Bird+0.048
Neg logit contributions
:-0.009
 fault-0.008
 about-0.007
 enough-0.007
sed-0.007
Token = ' play'
Token ID = 711
Feature activation = +3.88e-02

Pos logit contributions
Fl+0.049
eds+0.048
 conclusions+0.045
Bear+0.045
Moon+0.045
Neg logit contributions
:-0.080
 fault-0.076
sed-0.076
 okay-0.074
 enough-0.071
Token = ' hide'
Token ID = 7808
Feature activation = +3.15e-02

Pos logit contributions
 conclusions+0.063
eds+0.062
Bee+0.060
Fl+0.059
Moon+0.058
Neg logit contributions
:-0.065
 about-0.060
 okay-0.058
sed-0.056
 enough-0.055
Token = ' and'
Token ID = 290
Feature activation = +1.00e-01

Pos logit contributions
Moon+0.072
 conclusions+0.072
Brown+0.071
Bird+0.070
George+0.070
Neg logit contributions
:-0.032
 enough-0.027
 about-0.026
 okay-0.023
sed-0.022
Token = ' seek'
Token ID = 5380
Feature activation = +3.79e-02

Pos logit contributions
irs+0.176
ler+0.176
cking+0.174
 conclusions+0.174
eds+0.171
Neg logit contributions
 okay-0.148
:-0.146
 full-0.140
 ending-0.138
,-0.137
Token = ' with'
Token ID = 351
Feature activation = +5.99e-02

Pos logit contributions
 conclusions+0.068
Fl+0.068
Moon+0.068
eds+0.067
Bird+0.067
Neg logit contributions
:-0.056
 fault-0.051
 enough-0.051
sed-0.050
 okay-0.050
Token = ' her'
Token ID = 607
Feature activation = -6.64e-04

Pos logit contributions
eds+0.115
ations+0.112
cking+0.111
ler+0.111
 conclusions+0.111
Neg logit contributions
:-0.083
 enough-0.079
 about-0.073
 okay-0.071
 best-0.070
Token = ' friends'
Token ID = 2460
Feature activation = +1.01e-02

Pos logit contributions
 fault+0.001
sed+0.001
:+0.001
 okay+0.001
 enough+0.001
Neg logit contributions
Fl-0.001
 conclusions-0.001
eds-0.001
Moon-0.001
Bird-0.001
Token = '.'
Token ID = 13
Feature activation = +1.38e-02

Pos logit contributions
Fl+0.020
eds+0.019
Moon+0.019
Bear+0.019
 pans+0.018
Neg logit contributions
 fault-0.014
sed-0.014
:-0.012
 okay-0.012
 enough-0.011
Token = ' Benny'
Token ID = 44275
Feature activation = +2.90e-02

Pos logit contributions
Fl+0.018
eds+0.018
 conclusions+0.018
Moon+0.018
ts+0.017
Neg logit contributions
 fault-0.033
:-0.033
sed-0.033
 okay-0.032
 enough-0.031
Token = ' loved'
Token ID = 6151
Feature activation = +1.77e-02

Pos logit contributions
 conclusions+0.057
 pans+0.055
Fl+0.055
eds+0.055
Moon+0.055
Neg logit contributions
:-0.038
 fault-0.036
sed-0.036
 okay-0.033
 enough-0.032
Token = ' to'
Token ID = 284
Feature activation = +4.16e-02

Pos logit contributions
 conclusions+0.049
Fl+0.048
Moon+0.047
Brown+0.047
Bird+0.046
Neg logit contributions
:-0.010
 fault-0.009
sed-0.008
 enough-0.008
 about-0.007
Token = ' play'
Token ID = 711
Feature activation = +4.03e-02

Pos logit contributions
eds+0.058
Fl+0.057
Brown+0.055
ations+0.055
 pans+0.054
Neg logit contributions
:-0.078
sed-0.073
 okay-0.073
 fault-0.071
 about-0.069
Token = ' hide'
Token ID = 7808
Feature activation = +2.61e-02

Pos logit contributions
 conclusions+0.072
eds+0.069
 pans+0.069
Moon+0.068
Fl+0.068
Neg logit contributions
:-0.062
 about-0.055
 okay-0.055
sed-0.052
 enough-0.051
Token = ' and'
Token ID = 290
Feature activation = +9.96e-02

Pos logit contributions
Moon+0.060
 conclusions+0.059
Brown+0.058
Bird+0.058
Fl+0.057
Neg logit contributions
:-0.027
 enough-0.023
 about-0.022
 okay-0.020
sed-0.019
Token = ' seek'
Token ID = 5380
Feature activation = +3.56e-02

Pos logit contributions
irs+0.185
cking+0.180
ler+0.178
eling+0.175
cks+0.174
Neg logit contributions
 okay-0.139
,-0.137
:-0.137
 full-0.134
 ending-0.131
Token = ' with'
Token ID = 351
Feature activation = +5.63e-02

Pos logit contributions
Fl+0.065
 conclusions+0.065
Moon+0.064
eds+0.064
Bird+0.063
Neg logit contributions
:-0.052
 fault-0.047
sed-0.047
 enough-0.046
 okay-0.046
Token = ' his'
Token ID = 465
Feature activation = -3.74e-04

Pos logit contributions
cking+0.111
ations+0.111
eds+0.107
eling+0.106
 conclusions+0.106
Neg logit contributions
:-0.075
 enough-0.068
 about-0.065
 okay-0.065
 best-0.063
Token = ' friends'
Token ID = 2460
Feature activation = +7.48e-03

Pos logit contributions
 fault+0.000
sed+0.000
:+0.000
 okay+0.000
 enough+0.000
Neg logit contributions
Fl-0.001
 conclusions-0.001
Moon-0.001
eds-0.001
ts-0.001
Token = '.'
Token ID = 13
Feature activation = +1.14e-02

Pos logit contributions
Fl+0.015
eds+0.014
Moon+0.014
Bee+0.014
Bear+0.014
Neg logit contributions
 fault-0.010
sed-0.010
:-0.009
 okay-0.009
 enough-0.008
Token = ' She'
Token ID = 1375
Feature activation = +2.20e-02

Pos logit contributions
Fl+0.039
Moon+0.038
 conclusions+0.038
Bee+0.037
eds+0.037
Neg logit contributions
 fault-0.027
sed-0.026
:-0.025
 okay-0.024
 enough-0.023
Token = ' loved'
Token ID = 6151
Feature activation = +2.03e-02

Pos logit contributions
Fl+0.041
 conclusions+0.040
eds+0.040
Moon+0.040
Bee+0.039
Neg logit contributions
 fault-0.031
:-0.030
sed-0.030
 okay-0.028
 enough-0.027
Token = ' to'
Token ID = 284
Feature activation = +3.70e-02

Pos logit contributions
 conclusions+0.057
Fl+0.056
Bee+0.055
Bird+0.054
Bear+0.054
Neg logit contributions
:-0.011
 fault-0.010
 enough-0.009
 about-0.009
sed-0.009
Token = ' play'
Token ID = 711
Feature activation = +3.84e-02

Pos logit contributions
eds+0.048
Fl+0.048
 conclusions+0.045
Bear+0.045
Moon+0.044
Neg logit contributions
:-0.074
 fault-0.070
sed-0.069
 okay-0.069
 about-0.065
Token = ' hide'
Token ID = 7808
Feature activation = +3.09e-02

Pos logit contributions
eds+0.064
 conclusions+0.064
Bee+0.061
Moon+0.060
Fl+0.060
Neg logit contributions
:-0.062
 about-0.058
 okay-0.057
sed-0.053
 enough-0.053
Token = ' and'
Token ID = 290
Feature activation = +9.95e-02

Pos logit contributions
Moon+0.071
 conclusions+0.070
Brown+0.069
Bird+0.068
George+0.068
Neg logit contributions
:-0.032
 enough-0.027
 about-0.026
 okay-0.023
;-0.022
Token = ' seek'
Token ID = 5380
Feature activation = +3.63e-02

Pos logit contributions
ler+0.176
cking+0.175
irs+0.174
eds+0.172
 conclusions+0.172
Neg logit contributions
 okay-0.150
:-0.145
 ending-0.138
 full-0.137
,-0.135
Token = ' with'
Token ID = 351
Feature activation = +6.06e-02

Pos logit contributions
Fl+0.063
 conclusions+0.063
Moon+0.062
eds+0.062
Bee+0.062
Neg logit contributions
:-0.055
 fault-0.052
sed-0.051
 okay-0.050
 enough-0.050
Token = ' her'
Token ID = 607
Feature activation = -1.04e-03

Pos logit contributions
eds+0.116
ations+0.113
cking+0.112
Fl+0.112
Bird+0.112
Neg logit contributions
:-0.085
 enough-0.081
 about-0.075
 fault-0.072
 okay-0.072
Token = ' friends'
Token ID = 2460
Feature activation = +9.21e-03

Pos logit contributions
 fault+0.001
sed+0.001
:+0.001
 okay+0.001
 enough+0.001
Neg logit contributions
Fl-0.002
 conclusions-0.002
eds-0.002
Moon-0.002
Bee-0.002
Token = '.'
Token ID = 13
Feature activation = +1.26e-02

Pos logit contributions
Fl+0.018
eds+0.017
Moon+0.017
Bear+0.017
 pans+0.017
Neg logit contributions
 fault-0.013
sed-0.013
:-0.011
 okay-0.011
 enough-0.010
Token = 'my'
Token ID = 1820
Feature activation = +1.93e-02

Pos logit contributions
 conclusions+0.033
Fl+0.032
Moon+0.031
 pans+0.031
Bird+0.030
Neg logit contributions
:-0.051
 fault-0.051
sed-0.047
 okay-0.047
 enough-0.046
Token = ' loved'
Token ID = 6151
Feature activation = +1.97e-02

Pos logit contributions
Fl+0.034
Moon+0.033
eds+0.033
Bee+0.033
Bear+0.033
Neg logit contributions
 fault-0.028
sed-0.027
:-0.027
 okay-0.026
 enough-0.025
Token = ' to'
Token ID = 284
Feature activation = +3.96e-02

Pos logit contributions
 conclusions+0.055
Fl+0.054
Bee+0.053
Ruby+0.053
Bird+0.053
Neg logit contributions
:-0.011
 fault-0.010
 enough-0.009
 about-0.009
sed-0.009
Token = ' play'
Token ID = 711
Feature activation = +3.95e-02

Pos logit contributions
Fl+0.050
eds+0.049
 conclusions+0.046
Moon+0.046
Bear+0.046
Neg logit contributions
:-0.081
 fault-0.077
sed-0.077
 okay-0.075
 about-0.072
Token = ' hide'
Token ID = 7808
Feature activation = +2.96e-02

Pos logit contributions
 conclusions+0.063
eds+0.063
Bee+0.063
Fl+0.061
Moon+0.061
Neg logit contributions
:-0.066
 about-0.061
 okay-0.060
sed-0.058
 enough-0.056
Token = ' and'
Token ID = 290
Feature activation = +9.89e-02

Pos logit contributions
Moon+0.068
 conclusions+0.067
Bird+0.066
Brown+0.065
Bee+0.065
Neg logit contributions
:-0.031
 enough-0.027
 about-0.025
 okay-0.023
sed-0.022
Token = ' seek'
Token ID = 5380
Feature activation = +3.79e-02

Pos logit contributions
cking+0.168
irs+0.168
 conclusions+0.168
eds+0.165
ler+0.164
Neg logit contributions
 okay-0.153
:-0.149
 ending-0.144
 full-0.140
,-0.138
Token = ' with'
Token ID = 351
Feature activation = +5.90e-02

Pos logit contributions
Moon+0.071
Fl+0.071
 conclusions+0.071
Bee+0.070
Bird+0.070
Neg logit contributions
:-0.053
 enough-0.048
 fault-0.048
 okay-0.047
sed-0.047
Token = ' his'
Token ID = 465
Feature activation = -3.48e-03

Pos logit contributions
 conclusions+0.107
Fl+0.107
Bee+0.106
ations+0.105
eds+0.105
Neg logit contributions
:-0.085
 enough-0.081
 Dad-0.074
 about-0.074
 fault-0.074
Token = ' friends'
Token ID = 2460
Feature activation = +8.84e-03

Pos logit contributions
 fault+0.005
:+0.005
sed+0.005
 okay+0.004
 enough+0.004
Neg logit contributions
Fl-0.006
eds-0.006
Moon-0.006
 conclusions-0.006
 pans-0.006
Token = '.'
Token ID = 13
Feature activation = +1.17e-02

Pos logit contributions
Fl+0.017
eds+0.016
Bear+0.016
Moon+0.016
ts+0.016
Neg logit contributions
sed-0.013
 fault-0.013
:-0.011
 okay-0.011
 enough-0.010
Token = ' that'
Token ID = 326
Feature activation = -7.79e-03

Pos logit contributions
Fl+0.019
 pans+0.018
 blankets+0.018
Bird+0.018
Moon+0.018
Neg logit contributions
 fault-0.021
sed-0.019
 okay-0.018
ope-0.017
:-0.017
Token = ' said'
Token ID = 531
Feature activation = -3.50e-02

Pos logit contributions
 fault+0.011
sed+0.010
:+0.010
 okay+0.009
 enough+0.009
Neg logit contributions
Fl-0.015
Moon-0.014
ts-0.014
 pans-0.014
Bird-0.014
Token = ' "'
Token ID = 366
Feature activation = +4.75e-02

Pos logit contributions
 fault+0.060
sed+0.051
 intentions+0.049
oll+0.048
itted+0.048
Neg logit contributions
Fl-0.054
 flower-0.053
Moon-0.052
Bird-0.051
Bee-0.051
Token = 'No'
Token ID = 2949
Feature activation = -1.43e-02

Pos logit contributions
 blankets+0.070
ts+0.070
 pans+0.067
eds+0.067
 rubble+0.067
Neg logit contributions
ield-0.078
 okay-0.078
ope-0.077
 happened-0.076
:-0.076
Token = ' T'
Token ID = 309
Feature activation = +2.59e-03

Pos logit contributions
 fault+0.022
sed+0.022
:+0.021
apped+0.021
 okay+0.020
Neg logit contributions
Fl-0.023
eds-0.023
 blankets-0.023
 conclusions-0.023
Moon-0.022
Token = 'resp'
Token ID = 4363
Feature activation = +9.82e-02

Pos logit contributions
Fl+0.005
eds+0.004
ie+0.004
Bee+0.004
Buzz+0.004
Neg logit contributions
sed-0.004
 fault-0.004
:-0.003
 enough-0.003
 okay-0.003
Token = 'assing'
Token ID = 19696
Feature activation = +1.91e-02

Pos logit contributions
 pans+0.168
ie+0.162
 blankets+0.161
Bee+0.161
Moon+0.158
Neg logit contributions
:-0.146
sed-0.142
 fault-0.134
ield-0.133
 enough-0.132
Token = '".'
Token ID = 1911
Feature activation = +5.14e-03

Pos logit contributions
Fl+0.028
eds+0.027
Moon+0.027
 conclusions+0.027
Bee+0.027
Neg logit contributions
 fault-0.034
sed-0.033
:-0.032
 okay-0.031
 enough-0.030
Token = ' But'
Token ID = 887
Feature activation = +3.14e-03

Pos logit contributions
Fl+0.009
Moon+0.009
 blankets+0.009
 conclusions+0.008
Bird+0.008
Neg logit contributions
 fault-0.008
sed-0.007
:-0.007
 okay-0.007
 happened-0.007
Token = ' Lily'
Token ID = 20037
Feature activation = -1.43e-03

Pos logit contributions
Fl+0.004
Moon+0.004
 conclusions+0.004
eds+0.004
ts+0.004
Neg logit contributions
 fault-0.006
sed-0.006
:-0.006
 okay-0.005
 enough-0.005
Token = ' was'
Token ID = 373
Feature activation = -3.18e-02

Pos logit contributions
 fault+0.003
sed+0.002
joy+0.002
apped+0.002
owl+0.002
Neg logit contributions
Fl-0.002
ie-0.002
Bird-0.002
Bear-0.002
ts-0.002
Token = 'ucky'
Token ID = 5309
Feature activation = +2.51e-02

Pos logit contributions
Fl+0.024
eds+0.023
Bear+0.023
Bee+0.023
 conclusions+0.023
Neg logit contributions
 fault-0.024
sed-0.024
:-0.023
 enough-0.022
 okay-0.022
Token = ' loved'
Token ID = 6151
Feature activation = +1.52e-02

Pos logit contributions
Fl+0.045
eds+0.043
Moon+0.043
Bee+0.043
Bird+0.042
Neg logit contributions
 fault-0.038
sed-0.036
:-0.035
 okay-0.034
 enough-0.033
Token = ' to'
Token ID = 284
Feature activation = +4.35e-02

Pos logit contributions
 conclusions+0.041
Fl+0.041
Moon+0.040
Brown+0.040
Bee+0.039
Neg logit contributions
:-0.009
 fault-0.009
sed-0.008
 enough-0.007
 okay-0.007
Token = ' play'
Token ID = 711
Feature activation = +3.99e-02

Pos logit contributions
Fl+0.061
eds+0.061
Brown+0.059
ations+0.059
Bear+0.058
Neg logit contributions
:-0.080
sed-0.077
 okay-0.074
 fault-0.073
 tum-0.071
Token = ' hide'
Token ID = 7808
Feature activation = +2.84e-02

Pos logit contributions
 conclusions+0.071
 pans+0.069
Moon+0.069
eds+0.069
Brown+0.068
Neg logit contributions
:-0.060
 okay-0.055
sed-0.054
 about-0.052
 fault-0.051
Token = ' and'
Token ID = 290
Feature activation = +9.82e-02

Pos logit contributions
Moon+0.065
 conclusions+0.064
Brown+0.063
Bird+0.062
Fl+0.061
Neg logit contributions
:-0.030
 enough-0.025
 about-0.023
 okay-0.023
sed-0.022
Token = ' seek'
Token ID = 5380
Feature activation = +3.73e-02

Pos logit contributions
irs+0.174
eling+0.170
ler+0.167
 conclusions+0.166
icer+0.163
Neg logit contributions
 okay-0.142
:-0.136
 full-0.136
,-0.133
sed-0.132
Token = ' with'
Token ID = 351
Feature activation = +5.68e-02

Pos logit contributions
Moon+0.070
 conclusions+0.070
Fl+0.070
eds+0.069
Bird+0.069
Neg logit contributions
:-0.052
 enough-0.046
 okay-0.046
 fault-0.046
sed-0.046
Token = ' her'
Token ID = 607
Feature activation = -2.78e-03

Pos logit contributions
ations+0.110
lling+0.103
cking+0.102
 conclusions+0.102
eds+0.102
Neg logit contributions
:-0.076
 enough-0.073
 okay-0.068
 fault-0.068
icked-0.067
Token = ' friends'
Token ID = 2460
Feature activation = +8.12e-03

Pos logit contributions
 fault+0.003
sed+0.003
:+0.003
 okay+0.003
 enough+0.003
Neg logit contributions
 conclusions-0.006
Fl-0.006
Moon-0.006
eds-0.006
Bird-0.006
Token = '.'
Token ID = 13
Feature activation = +1.35e-02

Pos logit contributions
Fl+0.016
eds+0.016
Moon+0.016
 conclusions+0.016
Bee+0.016
Neg logit contributions
 fault-0.010
sed-0.010
:-0.009
 okay-0.009
 enough-0.009
Token = ' She'
Token ID = 1375
Feature activation = +2.12e-02

Pos logit contributions
Fl+0.036
Moon+0.035
 conclusions+0.034
Bee+0.034
Bear+0.034
Neg logit contributions
 fault-0.027
sed-0.026
:-0.025
 okay-0.023
 enough-0.023
Token = ' loved'
Token ID = 6151
Feature activation = +1.85e-02

Pos logit contributions
Fl+0.039
Moon+0.038
eds+0.037
 conclusions+0.037
Bee+0.037
Neg logit contributions
 fault-0.030
:-0.029
sed-0.029
 okay-0.027
 enough-0.027
Token to
Token ID284
Feature activation+3.84e-02

Pos logit contributions
 conclusions+0.053
Fl+0.052
Bee+0.051
Bird+0.050
Bear+0.050
Neg logit contributions
:-0.010
 fault-0.008
 enough-0.008
 about-0.007
sed-0.007
Token play
Token ID711
Feature activation+3.90e-02

Pos logit contributions
eds+0.049
Fl+0.048
 conclusions+0.045
Bear+0.045
Moon+0.045
Neg logit contributions
:-0.078
 fault-0.074
sed-0.073
 okay-0.073
 enough-0.069
Token hide
Token ID7808
Feature activation+3.00e-02

Pos logit contributions
 conclusions+0.064
eds+0.064
Bee+0.061
Moon+0.060
Fl+0.060
Neg logit contributions
:-0.064
 about-0.059
 okay-0.058
sed-0.055
 enough-0.054
Token and
Token ID290
Feature activation+9.81e-02

Pos logit contributions
 conclusions+0.069
Moon+0.068
Brown+0.066
Bird+0.066
Bear+0.066
Neg logit contributions
:-0.031
 enough-0.026
 about-0.025
 okay-0.023
sed-0.022
Token seek
Token ID5380
Feature activation+3.69e-02

Pos logit contributions
cking+0.168
irs+0.167
ler+0.166
eds+0.166
 conclusions+0.165
Neg logit contributions
 okay-0.154
:-0.147
 ending-0.141
 full-0.141
 tum-0.136
Token with
Token ID351
Feature activation+5.94e-02

Pos logit contributions
Fl+0.065
 conclusions+0.064
Moon+0.064
eds+0.064
Bee+0.063
Neg logit contributions
:-0.056
 fault-0.052
sed-0.051
 enough-0.050
 okay-0.050
Token her
Token ID607
Feature activation-1.53e-03

Pos logit contributions
eds+0.113
cking+0.108
Fl+0.108
ations+0.108
 conclusions+0.108
Neg logit contributions
:-0.085
 enough-0.080
 about-0.075
 okay-0.073
 best-0.072
Token friends
Token ID2460
Feature activation+9.13e-03

Pos logit contributions
 fault+0.002
:+0.002
sed+0.002
 okay+0.002
 enough+0.002
Neg logit contributions
Fl-0.003
 conclusions-0.003
eds-0.003
Moon-0.003
Bee-0.003
Token.
Token ID13
Feature activation+1.22e-02

Pos logit contributions
Fl+0.017
eds+0.017
Moon+0.017
Bear+0.017
 pans+0.016
Neg logit contributions
 fault-0.013
sed-0.013
:-0.011
 okay-0.011
 enough-0.010
Tokenens
Token ID641
Feature activation+3.16e-02

Pos logit contributions
Fl+0.010
ie+0.009
Bear+0.009
Bee+0.009
Bird+0.008
Neg logit contributions
sed-0.018
 fault-0.018
:-0.017
apped-0.016
 enough-0.016
Token loved
Token ID6151
Feature activation+1.33e-02

Pos logit contributions
Fl+0.058
Moon+0.056
eds+0.055
Bee+0.055
ts+0.055
Neg logit contributions
 fault-0.045
sed-0.043
:-0.043
 okay-0.041
 enough-0.040
Token to
Token ID284
Feature activation+4.05e-02

Pos logit contributions
 conclusions+0.036
Moon+0.035
Fl+0.035
ts+0.035
ations+0.035
Neg logit contributions
:-0.008
 fault-0.007
sed-0.007
 about-0.006
 enough-0.006
Token play
Token ID711
Feature activation+3.75e-02

Pos logit contributions
eds+0.048
Fl+0.047
ations+0.047
 pans+0.047
Moon+0.046
Neg logit contributions
:-0.084
sed-0.081
 fault-0.079
 okay-0.078
 about-0.075
Token hide
Token ID7808
Feature activation+2.66e-02

Pos logit contributions
 conclusions+0.066
eds+0.066
 pans+0.066
Moon+0.064
Bee+0.063
Neg logit contributions
:-0.058
 about-0.053
 okay-0.052
sed-0.048
 enough-0.046
Token and
Token ID290
Feature activation+9.80e-02

Pos logit contributions
Moon+0.059
 conclusions+0.058
Brown+0.057
Bird+0.056
Fl+0.056
Neg logit contributions
:-0.029
 enough-0.025
 okay-0.024
 about-0.023
sed-0.023
Token seek
Token ID5380
Feature activation+3.51e-02

Pos logit contributions
irs+0.180
ler+0.174
icer+0.169
cking+0.169
 conclusions+0.168
Neg logit contributions
 okay-0.144
sed-0.138
:-0.136
 ending-0.135
 full-0.129
Token with
Token ID351
Feature activation+5.19e-02

Pos logit contributions
 conclusions+0.065
Moon+0.065
eds+0.064
Fl+0.064
Bee+0.064
Neg logit contributions
:-0.050
 okay-0.045
 enough-0.045
 fault-0.045
sed-0.045
Token Lily
Token ID20037
Feature activation+1.66e-02

Pos logit contributions
cking+0.108
ations+0.108
eds+0.106
 conclusions+0.103
Moon+0.102
Neg logit contributions
:-0.062
 okay-0.061
 enough-0.057
 fault-0.054
 about-0.053
Token.
Token ID13
Feature activation+9.59e-03

Pos logit contributions
 conclusions+0.036
Moon+0.036
Fl+0.035
 pans+0.035
eds+0.035
Neg logit contributions
:-0.019
 fault-0.018
sed-0.017
 okay-0.016
 enough-0.016
Token One
Token ID1881
Feature activation-2.17e-02

Pos logit contributions
Fl+0.019
eds+0.018
Moon+0.018
 conclusions+0.018
Bee+0.018
Neg logit contributions
 fault-0.013
:-0.012
sed-0.012
 okay-0.011
 enough-0.011
Token mess
Token ID2085
Feature activation-8.26e-03

Pos logit contributions
Fl+0.022
Moon+0.021
eds+0.021
 conclusions+0.021
Bee+0.021
Neg logit contributions
:-0.028
 fault-0.027
sed-0.027
 okay-0.026
 enough-0.026
Token.
Token ID13
Feature activation-2.29e-02

Pos logit contributions
 fault+0.011
sed+0.011
:+0.011
 okay+0.010
 enough+0.010
Neg logit contributions
Fl-0.015
ts-0.015
eds-0.015
 pans-0.015
 blankets-0.015
Token There
Token ID1318
Feature activation+1.25e-03

Pos logit contributions
 fault+0.043
 okay+0.037
 ok+0.037
 luck+0.037
sed+0.037
Neg logit contributions
 pans-0.034
Fl-0.034
 chairs-0.033
 tables-0.033
Moon-0.033
Token are
Token ID389
Feature activation+2.26e-02

Pos logit contributions
Fl+0.002
ts+0.002
eds+0.002
 blankets+0.002
Moon+0.002
Neg logit contributions
sed-0.002
 fault-0.002
 okay-0.002
joy-0.002
:-0.002
Token pots
Token ID32195
Feature activation+1.88e-02

Pos logit contributions
Bird+0.032
eds+0.032
Fl+0.031
Moon+0.031
 conclusions+0.031
Neg logit contributions
 fault-0.042
:-0.042
sed-0.040
 enough-0.040
 okay-0.040
Token and
Token ID290
Feature activation+9.77e-02

Pos logit contributions
Fl+0.031
 pans+0.031
eds+0.031
 blankets+0.031
ts+0.031
Neg logit contributions
 fault-0.030
sed-0.029
 okay-0.026
:-0.026
joy-0.025
Token pans
Token ID36209
Feature activation+4.44e-02

Pos logit contributions
Hope+0.096
|+0.096
Bird+0.093
Moon+0.092
 conclusions+0.092
Neg logit contributions
:-0.216
sed-0.203
 tum-0.200
 okay-0.198
 Dad-0.196
Token on
Token ID319
Feature activation+5.69e-02

Pos logit contributions
Fl+0.087
eds+0.085
Moon+0.085
 conclusions+0.085
Bee+0.084
Neg logit contributions
 fault-0.057
:-0.055
sed-0.055
 okay-0.051
 enough-0.049
Token the
Token ID262
Feature activation+2.03e-02

Pos logit contributions
Brown+0.109
Bird+0.108
Hop+0.106
eds+0.106
zip+0.105
Neg logit contributions
:-0.086
 enough-0.081
 fault-0.070
 Dad-0.068
,-0.067
Token floor
Token ID4314
Feature activation+7.63e-03

Pos logit contributions
Bird+0.023
Bee+0.023
Buzz+0.023
Moon+0.023
Fl+0.023
Neg logit contributions
:-0.044
sed-0.043
 fault-0.042
 okay-0.040
 tum-0.039
Token.
Token ID13
Feature activation-2.57e-02

Pos logit contributions
Fl+0.015
eds+0.015
ts+0.015
 conclusions+0.015
 pans+0.015
Neg logit contributions
 fault-0.009
sed-0.009
:-0.009
 okay-0.008
 enough-0.008
Token tea
Token ID8887
Feature activation+1.26e-02

Pos logit contributions
Fl+0.041
Moon+0.040
 conclusions+0.040
Bear+0.039
Bee+0.039
Neg logit contributions
:-0.028
 fault-0.026
sed-0.025
 about-0.025
 enough-0.024
Token parties
Token ID4671
Feature activation+2.55e-02

Pos logit contributions
Fl+0.017
ts+0.016
 blankets+0.016
eds+0.015
Moon+0.015
Neg logit contributions
 fault-0.025
sed-0.024
:-0.023
joy-0.023
 okay-0.022
Token and
Token ID290
Feature activation+6.20e-02

Pos logit contributions
Fl+0.043
ts+0.043
 blankets+0.042
eds+0.041
 pans+0.041
Neg logit contributions
 fault-0.040
sed-0.039
:-0.037
 okay-0.035
joy-0.035
Token play
Token ID711
Feature activation+4.31e-02

Pos logit contributions
eds+0.099
Moon+0.091
Bear+0.091
Fl+0.090
ind+0.090
Neg logit contributions
:-0.105
sed-0.104
 about-0.102
 okay-0.101
 ending-0.098
Token hide
Token ID7808
Feature activation+2.69e-02

Pos logit contributions
Brown+0.069
eds+0.068
 pans+0.068
Bee+0.067
 conclusions+0.067
Neg logit contributions
:-0.072
 about-0.067
 okay-0.064
apped-0.062
sed-0.062
Token and
Token ID290
Feature activation+9.75e-02

Pos logit contributions
Moon+0.057
Brown+0.056
Bird+0.056
 conclusions+0.055
Bear+0.055
Neg logit contributions
:-0.031
 enough-0.028
 fault-0.026
 okay-0.026
 about-0.025
Token seek
Token ID5380
Feature activation+3.60e-02

Pos logit contributions
 conclusions+0.175
irs+0.172
ind+0.171
eling+0.170
eds+0.169
Neg logit contributions
 okay-0.138
:-0.132
sed-0.131
 full-0.131
 best-0.124
Token.
Token ID13
Feature activation-7.72e-03

Pos logit contributions
 conclusions+0.081
Moon+0.081
Bee+0.080
Bird+0.080
Fl+0.080
Neg logit contributions
:-0.038
 enough-0.033
 about-0.032
 okay-0.032
 fault-0.031
Token Sally
Token ID25737
Feature activation+3.20e-02

Pos logit contributions
 fault+0.015
sed+0.014
 enough+0.013
owl+0.012
oll+0.012
Neg logit contributions
Fl-0.011
Bird-0.011
Buzz-0.011
Moon-0.010
Bee-0.010
Token loved
Token ID6151
Feature activation+1.50e-02

Pos logit contributions
Fl+0.061
 conclusions+0.061
eds+0.060
Moon+0.060
 pans+0.059
Neg logit contributions
:-0.042
 fault-0.041
sed-0.041
 okay-0.037
 enough-0.037
Token her
Token ID607
Feature activation-3.52e-03

Pos logit contributions
 conclusions+0.040
Fl+0.039
Bird+0.038
ler+0.038
Bear+0.038
Neg logit contributions
:-0.011
 about-0.010
 fault-0.010
sed-0.009
 enough-0.008
Token Tim
Token ID5045
Feature activation+2.33e-02

Pos logit contributions
Fl+0.019
Bee+0.018
Moon+0.018
Brown+0.018
Bird+0.018
Neg logit contributions
 fault-0.021
sed-0.020
:-0.020
 enough-0.019
 okay-0.019
Token loved
Token ID6151
Feature activation+1.93e-02

Pos logit contributions
 conclusions+0.046
Fl+0.045
Moon+0.045
 pans+0.045
eds+0.044
Neg logit contributions
:-0.030
 fault-0.029
sed-0.028
 okay-0.027
 enough-0.025
Token to
Token ID284
Feature activation+3.89e-02

Pos logit contributions
Fl+0.053
 conclusions+0.053
Bee+0.051
Ruby+0.051
Bird+0.051
Neg logit contributions
:-0.012
 fault-0.011
 enough-0.010
 about-0.009
sed-0.009
Token play
Token ID711
Feature activation+3.91e-02

Pos logit contributions
eds+0.050
Fl+0.050
 conclusions+0.047
Bear+0.046
Moon+0.046
Neg logit contributions
:-0.079
sed-0.073
 fault-0.073
 okay-0.072
 enough-0.069
Token hide
Token ID7808
Feature activation+3.54e-02

Pos logit contributions
 conclusions+0.064
eds+0.064
Bee+0.063
Fl+0.062
Moon+0.061
Neg logit contributions
:-0.065
 about-0.059
 okay-0.059
sed-0.057
 enough-0.056
Token and
Token ID290
Feature activation+9.75e-02

Pos logit contributions
Moon+0.084
 conclusions+0.083
Bird+0.080
Brown+0.080
Bee+0.080
Neg logit contributions
:-0.034
 enough-0.028
 about-0.027
,-0.023
 okay-0.023
Token seek
Token ID5380
Feature activation+4.00e-02

Pos logit contributions
irs+0.167
 conclusions+0.164
cking+0.161
ler+0.156
eds+0.156
Neg logit contributions
:-0.147
 okay-0.143
,-0.142
 ending-0.140
 full-0.139
Token with
Token ID351
Feature activation+5.96e-02

Pos logit contributions
Moon+0.081
 conclusions+0.081
Bee+0.080
Bird+0.080
Fl+0.080
Neg logit contributions
:-0.052
 enough-0.045
 about-0.045
 okay-0.044
 fault-0.043
Token his
Token ID465
Feature activation-7.51e-04

Pos logit contributions
ations+0.115
Bird+0.114
Fl+0.114
 conclusions+0.113
Bee+0.113
Neg logit contributions
:-0.078
 enough-0.076
 about-0.070
 Dad-0.068
 fault-0.068
Token friends
Token ID2460
Feature activation+8.22e-03

Pos logit contributions
 fault+0.001
:+0.001
sed+0.001
 okay+0.001
 enough+0.001
Neg logit contributions
Fl-0.001
Moon-0.001
eds-0.001
 conclusions-0.001
Bee-0.001
Token.
Token ID13
Feature activation+8.68e-03

Pos logit contributions
Fl+0.015
ie+0.014
Bear+0.014
ts+0.014
 blankets+0.014
Neg logit contributions
sed-0.013
 fault-0.013
 okay-0.011
owl-0.011
:-0.011
Token He
Token ID679
Feature activation+2.89e-02

Pos logit contributions
eds+0.039
 conclusions+0.038
ts+0.038
Fl+0.037
 pans+0.037
Neg logit contributions
:-0.019
sed-0.018
 fault-0.017
 okay-0.017
 enough-0.016
Token loved
Token ID6151
Feature activation+1.95e-02

Pos logit contributions
 conclusions+0.066
eds+0.062
 pans+0.061
Moon+0.060
ations+0.060
Neg logit contributions
:-0.035
sed-0.032
 happened-0.029
 okay-0.027
 fault-0.026
Token to
Token ID284
Feature activation+4.46e-02

Pos logit contributions
 conclusions+0.057
Fl+0.057
Ruby+0.054
Moon+0.054
Bird+0.054
Neg logit contributions
:-0.009
 fault-0.007
sed-0.007
 about-0.007
 enough-0.007
Token play
Token ID711
Feature activation+4.09e-02

Pos logit contributions
eds+0.069
Fl+0.067
ations+0.066
lling+0.064
cking+0.063
Neg logit contributions
:-0.081
sed-0.074
 opinion-0.070
 tum-0.070
 okay-0.069
Token hide
Token ID7808
Feature activation+3.31e-02

Pos logit contributions
 conclusions+0.075
eds+0.074
eling+0.072
Fl+0.071
Pen+0.070
Neg logit contributions
:-0.061
 about-0.056
 okay-0.055
 enough-0.049
sed-0.049
Token and
Token ID290
Feature activation+9.75e-02

Pos logit contributions
Moon+0.078
 conclusions+0.077
Brown+0.074
Bird+0.074
Pen+0.073
Neg logit contributions
:-0.033
 enough-0.028
 about-0.026
 okay-0.023
;-0.022
Token seek
Token ID5380
Feature activation+3.87e-02

Pos logit contributions
irs+0.180
ler+0.179
cking+0.174
 conclusions+0.174
eling+0.170
Neg logit contributions
:-0.135
,-0.133
 full-0.132
 okay-0.129
 tum-0.128
Token with
Token ID351
Feature activation+5.97e-02

Pos logit contributions
 conclusions+0.077
Moon+0.077
Fl+0.077
Bird+0.076
Bee+0.076
Neg logit contributions
:-0.050
 enough-0.045
 about-0.044
 okay-0.043
 fault-0.043
Token his
Token ID465
Feature activation+1.98e-03

Pos logit contributions
ations+0.127
 conclusions+0.124
|+0.123
lling+0.122
cking+0.122
Neg logit contributions
 enough-0.069
:-0.068
 Dad-0.062
 no-0.061
 best-0.060
Token friends
Token ID2460
Feature activation+8.96e-03

Pos logit contributions
 conclusions+0.004
Fl+0.004
Moon+0.004
eds+0.004
ts+0.004
Neg logit contributions
sed-0.002
 fault-0.002
:-0.002
 okay-0.002
 enough-0.002
Token.
Token ID13
Feature activation+1.28e-02

Pos logit contributions
Fl+0.018
eds+0.017
Moon+0.017
Bee+0.017
Bear+0.017
Neg logit contributions
 fault-0.011
sed-0.011
:-0.011
 okay-0.010
 enough-0.010
Token and
Token ID290
Feature activation+4.25e-02

Pos logit contributions
Fl+0.007
 blankets+0.007
ts+0.007
eds+0.007
 conclusions+0.007
Neg logit contributions
 fault-0.005
sed-0.005
 okay-0.005
:-0.005
ope-0.004
Token loved
Token ID6151
Feature activation+1.12e-02

Pos logit contributions
Fl+0.082
ts+0.079
Moon+0.079
Bee+0.078
 pans+0.078
Neg logit contributions
 fault-0.059
:-0.055
sed-0.055
 okay-0.051
 enough-0.050
Token to
Token ID284
Feature activation+4.61e-02

Pos logit contributions
Fl+0.031
 conclusions+0.031
Moon+0.031
Bee+0.030
Bird+0.030
Neg logit contributions
:-0.006
 fault-0.006
 about-0.005
 enough-0.005
sed-0.005
Token play
Token ID711
Feature activation+4.22e-02

Pos logit contributions
Fl+0.067
eds+0.066
 pans+0.064
Bear+0.063
Moon+0.063
Neg logit contributions
:-0.084
sed-0.079
 okay-0.078
 fault-0.077
 about-0.075
Token hide
Token ID7808
Feature activation+3.45e-02

Pos logit contributions
 pans+0.075
eds+0.075
Bee+0.074
Moon+0.074
 conclusions+0.073
Neg logit contributions
:-0.063
 about-0.059
 okay-0.058
 enough-0.053
sed-0.050
Token and
Token ID290
Feature activation+9.73e-02

Pos logit contributions
Moon+0.076
Brown+0.074
Bird+0.073
 conclusions+0.073
Bee+0.072
Neg logit contributions
:-0.039
 enough-0.034
 about-0.032
 okay-0.031
;-0.028
Token seek
Token ID5380
Feature activation+4.24e-02

Pos logit contributions
irs+0.178
cking+0.175
ler+0.174
ations+0.168
eling+0.167
Neg logit contributions
 okay-0.136
 ending-0.131
sed-0.130
:-0.126
 full-0.125
Token with
Token ID351
Feature activation+5.43e-02

Pos logit contributions
Moon+0.090
Bee+0.089
Bird+0.089
 conclusions+0.089
Fl+0.088
Neg logit contributions
:-0.050
 about-0.045
 enough-0.044
 okay-0.043
 fault-0.039
Token the
Token ID262
Feature activation+1.11e-02

Pos logit contributions
ations+0.097
cking+0.095
Bee+0.095
eds+0.094
bing+0.093
Neg logit contributions
:-0.078
 enough-0.075
 okay-0.072
 about-0.071
 fault-0.069
Token other
Token ID584
Feature activation+1.37e-02

Pos logit contributions
eds+0.016
Fl+0.016
Moon+0.016
ts+0.016
 conclusions+0.016
Neg logit contributions
 fault-0.020
sed-0.020
 okay-0.019
:-0.019
 enough-0.018
Token animals
Token ID4695
Feature activation+2.66e-02

Pos logit contributions
Fl+0.015
eds+0.015
 conclusions+0.014
Moon+0.014
Bee+0.014
Neg logit contributions
 fault-0.029
sed-0.029
:-0.029
 okay-0.027
 enough-0.027
Tokenmy
Token ID1820
Feature activation+1.68e-02

Pos logit contributions
 conclusions+0.032
Moon+0.030
 pans+0.030
Fl+0.030
Bird+0.029
Neg logit contributions
:-0.046
 fault-0.045
 okay-0.042
sed-0.041
 happened-0.041
Token loved
Token ID6151
Feature activation+1.83e-02

Pos logit contributions
Fl+0.031
eds+0.030
Moon+0.030
 conclusions+0.030
Bee+0.030
Neg logit contributions
 fault-0.024
:-0.023
sed-0.023
 okay-0.022
 enough-0.021
Token to
Token ID284
Feature activation+3.64e-02

Pos logit contributions
 conclusions+0.050
Fl+0.050
Bee+0.049
Ruby+0.048
Bird+0.048
Neg logit contributions
:-0.011
 fault-0.010
 enough-0.009
sed-0.009
 about-0.009
Token play
Token ID711
Feature activation+3.86e-02

Pos logit contributions
Fl+0.044
eds+0.043
 conclusions+0.041
Moon+0.041
Bear+0.040
Neg logit contributions
:-0.076
 fault-0.072
sed-0.072
 okay-0.070
 enough-0.068
Token hide
Token ID7808
Feature activation+3.32e-02

Pos logit contributions
 conclusions+0.062
eds+0.061
Bee+0.061
Moon+0.059
Fl+0.059
Neg logit contributions
:-0.065
 about-0.060
 okay-0.059
sed-0.058
 enough-0.056
Token and
Token ID290
Feature activation+9.69e-02

Pos logit contributions
Moon+0.077
 conclusions+0.076
Bird+0.075
Brown+0.074
Bee+0.074
Neg logit contributions
:-0.033
 enough-0.028
 about-0.027
 okay-0.023
;-0.023
Token seek
Token ID5380
Feature activation+3.89e-02

Pos logit contributions
irs+0.168
 conclusions+0.165
cking+0.163
ler+0.160
eds+0.159
Neg logit contributions
:-0.145
 okay-0.144
,-0.138
 ending-0.138
 full-0.136
Token with
Token ID351
Feature activation+5.79e-02

Pos logit contributions
 conclusions+0.075
Moon+0.075
Fl+0.075
Bird+0.075
Bee+0.075
Neg logit contributions
:-0.053
 enough-0.047
 about-0.046
 okay-0.046
 fault-0.046
Token his
Token ID465
Feature activation-3.64e-03

Pos logit contributions
 conclusions+0.110
Fl+0.109
Bird+0.109
Bee+0.109
ations+0.108
Neg logit contributions
:-0.078
 enough-0.076
 Dad-0.070
 about-0.068
 fault-0.068
Token friends
Token ID2460
Feature activation+9.46e-03

Pos logit contributions
 fault+0.005
sed+0.005
:+0.005
 okay+0.004
 enough+0.004
Neg logit contributions
Fl-0.007
 conclusions-0.007
Moon-0.007
eds-0.007
Bee-0.007
Token.
Token ID13
Feature activation+1.19e-02

Pos logit contributions
Fl+0.018
eds+0.017
Bear+0.017
Moon+0.017
ts+0.017
Neg logit contributions
 fault-0.013
sed-0.013
:-0.012
 okay-0.011
 enough-0.011
Token smiles
Token ID21845
Feature activation-4.51e-02

Pos logit contributions
 fault+0.038
sed+0.038
:+0.037
 okay+0.035
 enough+0.033
Neg logit contributions
Fl-0.043
Moon-0.043
eds-0.042
 conclusions-0.042
Bee-0.042
Token were
Token ID547
Feature activation-2.73e-02

Pos logit contributions
 fault+0.081
owl+0.077
sed+0.077
ope+0.072
 intentions+0.070
Neg logit contributions
Fl-0.061
 blankets-0.061
Brown-0.060
hing-0.057
bing-0.057
Token big
Token ID1263
Feature activation-3.30e-02

Pos logit contributions
sed+0.052
 fault+0.050
:+0.048
owl+0.045
ield+0.044
Neg logit contributions
Fl-0.038
eds-0.036
Moon-0.036
ts-0.036
Brown-0.036
Token.
Token ID13
Feature activation-2.02e-02

Pos logit contributions
ope+0.061
sed+0.059
eware+0.056
ield+0.054
 happened+0.054
Neg logit contributions
 blankets-0.047
ie-0.039
Fl-0.039
 fabric-0.038
 pillow-0.038
Token The
Token ID383
Feature activation-2.13e-02

Pos logit contributions
 fault+0.029
sed+0.029
:+0.027
 enough+0.026
 okay+0.026
Neg logit contributions
Fl-0.037
Moon-0.035
 conclusions-0.034
Brown-0.034
eds-0.034
Token end
Token ID886
Feature activation-8.21e-02

Pos logit contributions
 fault+0.034
:+0.034
sed+0.033
 okay+0.032
 enough+0.031
Neg logit contributions
Fl-0.035
eds-0.034
Moon-0.034
 conclusions-0.034
Bee-0.034
Token.
Token ID13
Feature activation-1.87e-02

Pos logit contributions
sed+0.145
 fault+0.133
apped+0.119
joy+0.118
 intentions+0.117
Neg logit contributions
Fl-0.135
eds-0.128
 blankets-0.128
ie-0.127
Brown-0.125
Token<|endoftext|>
Token ID50256
Feature activation-1.06e-02

Pos logit contributions
 fault+0.027
sed+0.027
:+0.026
 enough+0.024
 okay+0.024
Neg logit contributions
Fl-0.034
Moon-0.033
 conclusions-0.032
Bear-0.032
eds-0.032
TokenOnce
Token ID7454
Feature activation-1.81e-02

Pos logit contributions
 fault+0.016
sed+0.016
:+0.015
 okay+0.015
 enough+0.014
Neg logit contributions
Fl-0.019
Moon-0.018
Bear-0.018
eds-0.018
Bee-0.018
Token upon
Token ID2402
Feature activation-1.06e-02

Pos logit contributions
sed+0.035
 fault+0.034
ield+0.031
:+0.031
 okay+0.031
Neg logit contributions
ie-0.024
eds-0.023
Fl-0.023
Moon-0.022
Bee-0.022
Token a
Token ID257
Feature activation-5.44e-02

Pos logit contributions
:+0.011
 okay+0.011
 fault+0.010
 about+0.010
 happened+0.010
Neg logit contributions
eds-0.024
 pans-0.023
 conclusions-0.023
Fl-0.023
Moon-0.023
Token at
Token ID379
Feature activation+2.39e-02

Pos logit contributions
sed+0.024
 fault+0.023
:+0.022
 okay+0.022
 ok+0.020
Neg logit contributions
eds-0.027
 conclusions-0.026
ts-0.026
 pans-0.026
 forth-0.025
Token drawing
Token ID8263
Feature activation+5.99e-04

Pos logit contributions
Fl+0.036
Moon+0.034
eds+0.034
 conclusions+0.034
Bee+0.034
Neg logit contributions
 fault-0.042
:-0.041
sed-0.041
 okay-0.039
 enough-0.038
Token pictures
Token ID5986
Feature activation+6.42e-03

Pos logit contributions
Fl+0.001
Moon+0.001
eds+0.001
Bee+0.001
 conclusions+0.001
Neg logit contributions
 fault-0.001
:-0.001
sed-0.001
 okay-0.001
 enough-0.001
Token.
Token ID13
Feature activation-1.68e-02

Pos logit contributions
Fl+0.012
eds+0.012
Moon+0.012
 conclusions+0.012
Bee+0.012
Neg logit contributions
 fault-0.008
sed-0.008
:-0.008
 okay-0.007
 enough-0.007
Token The
Token ID383
Feature activation-1.87e-02

Pos logit contributions
 fault+0.024
sed+0.023
:+0.022
 okay+0.021
 enough+0.021
Neg logit contributions
Fl-0.031
Moon-0.030
eds-0.030
 conclusions-0.030
Bear-0.030
Token end
Token ID886
Feature activation-8.21e-02

Pos logit contributions
 fault+0.035
sed+0.034
:+0.034
 enough+0.032
 okay+0.031
Neg logit contributions
Fl-0.025
 conclusions-0.024
 blankets-0.024
Moon-0.024
Bird-0.024
Token.
Token ID13
Feature activation-1.95e-02

Pos logit contributions
sed+0.132
 fault+0.124
 enough+0.111
joy+0.108
 okay+0.106
Neg logit contributions
Fl-0.144
eds-0.139
 conclusions-0.134
Bear-0.134
 blankets-0.134
Token<|endoftext|>
Token ID50256
Feature activation-1.29e-02

Pos logit contributions
 fault+0.029
sed+0.028
:+0.027
 enough+0.026
 okay+0.026
Neg logit contributions
Fl-0.035
Moon-0.034
Bear-0.033
eds-0.033
Bird-0.033
TokenOnce
Token ID7454
Feature activation-1.90e-02

Pos logit contributions
 fault+0.020
sed+0.020
:+0.018
 okay+0.018
 enough+0.018
Neg logit contributions
Fl-0.022
Moon-0.021
eds-0.021
Bear-0.021
Bee-0.021
Token upon
Token ID2402
Feature activation-7.03e-03

Pos logit contributions
sed+0.037
 fault+0.036
ield+0.034
ope+0.033
 okay+0.033
Neg logit contributions
ie-0.025
eds-0.023
Fl-0.023
Moon-0.022
Bee-0.022
Token a
Token ID257
Feature activation-5.29e-02

Pos logit contributions
:+0.008
 okay+0.007
 fault+0.007
 about+0.007
 happened+0.007
Neg logit contributions
eds-0.015
 pans-0.015
 conclusions-0.015
Fl-0.015
Moon-0.015
Token best
Token ID1266
Feature activation-2.97e-02

Pos logit contributions
 fault+0.036
sed+0.035
:+0.035
 okay+0.032
 enough+0.032
Neg logit contributions
Fl-0.039
Moon-0.038
eds-0.038
 conclusions-0.038
Bee-0.038
Token friends
Token ID2460
Feature activation-3.82e-02

Pos logit contributions
 fault+0.052
sed+0.051
:+0.050
ope+0.048
 okay+0.048
Neg logit contributions
 blankets-0.042
eds-0.040
 pans-0.038
ts-0.037
ie-0.036
Token forever
Token ID8097
Feature activation-1.37e-02

Pos logit contributions
sed+0.073
 fault+0.067
ses+0.061
ope+0.061
owl+0.060
Neg logit contributions
Fl-0.056
eds-0.053
 blankets-0.052
ts-0.051
 conclusions-0.050
Token.
Token ID13
Feature activation-1.75e-02

Pos logit contributions
sed+0.022
 fault+0.021
ield+0.019
:+0.019
owl+0.018
Neg logit contributions
eds-0.022
 blankets-0.022
Fl-0.022
ts-0.022
 conclusions-0.021
Token The
Token ID383
Feature activation-1.57e-02

Pos logit contributions
 fault+0.023
:+0.022
sed+0.022
 okay+0.021
 enough+0.020
Neg logit contributions
Fl-0.034
Moon-0.033
eds-0.033
 conclusions-0.033
Bee-0.033
Token end
Token ID886
Feature activation-8.20e-02

Pos logit contributions
 fault+0.025
:+0.025
sed+0.024
 okay+0.023
 enough+0.022
Neg logit contributions
Fl-0.026
eds-0.026
 conclusions-0.026
Moon-0.025
Bee-0.025
Token.
Token ID13
Feature activation-2.02e-02

Pos logit contributions
sed+0.135
 fault+0.127
 okay+0.113
joy+0.113
 enough+0.113
Neg logit contributions
Fl-0.140
eds-0.135
 blankets-0.133
Bear-0.132
 conclusions-0.130
Token<|endoftext|>
Token ID50256
Feature activation-9.67e-03

Pos logit contributions
 fault+0.030
sed+0.030
:+0.028
 enough+0.026
 okay+0.026
Neg logit contributions
Fl-0.036
Moon-0.035
Bear-0.034
 conclusions-0.034
eds-0.034
TokenOnce
Token ID7454
Feature activation-2.22e-02

Pos logit contributions
sed+0.016
 fault+0.016
:+0.014
 okay+0.014
 enough+0.014
Neg logit contributions
Fl-0.016
Moon-0.015
Bear-0.015
Bee-0.015
Buzz-0.015
Token upon
Token ID2402
Feature activation-8.54e-03

Pos logit contributions
sed+0.046
 fault+0.043
ield+0.041
ope+0.040
 okay+0.040
Neg logit contributions
ie-0.028
eds-0.025
Fl-0.024
Bee-0.024
Moon-0.024
Token a
Token ID257
Feature activation-4.82e-02

Pos logit contributions
:+0.009
 okay+0.009
 fault+0.009
 about+0.008
 happened+0.008
Neg logit contributions
eds-0.019
 pans-0.018
 conclusions-0.018
Fl-0.018
Moon-0.018
Token their
Token ID511
Feature activation-9.22e-03

Pos logit contributions
Fl+0.040
Moon+0.040
eds+0.040
Bee+0.039
 pans+0.039
Neg logit contributions
 fault-0.051
:-0.051
sed-0.049
 okay-0.049
 enough-0.047
Token little
Token ID1310
Feature activation+5.42e-04

Pos logit contributions
 fault+0.015
:+0.015
sed+0.015
 okay+0.014
 enough+0.014
Neg logit contributions
Fl-0.015
Moon-0.014
eds-0.014
 conclusions-0.014
Bee-0.014
Token girl
Token ID2576
Feature activation-2.75e-02

Pos logit contributions
Fl+0.001
Moon+0.001
eds+0.001
 conclusions+0.001
Bee+0.001
Neg logit contributions
 fault-0.001
:-0.001
sed-0.001
 okay-0.001
 enough-0.001
Token.
Token ID13
Feature activation-1.74e-02

Pos logit contributions
 fault+0.039
sed+0.039
:+0.036
 okay+0.034
ope+0.033
Neg logit contributions
Fl-0.049
eds-0.048
 blankets-0.047
Bear-0.047
 pans-0.047
Token The
Token ID383
Feature activation-1.99e-02

Pos logit contributions
 fault+0.022
:+0.022
sed+0.021
 okay+0.020
 enough+0.020
Neg logit contributions
Fl-0.034
eds-0.033
 conclusions-0.033
Moon-0.033
ts-0.033
Token end
Token ID886
Feature activation-8.15e-02

Pos logit contributions
 fault+0.036
sed+0.035
:+0.034
 enough+0.032
 okay+0.032
Neg logit contributions
Fl-0.029
 conclusions-0.028
Moon-0.028
 pans-0.028
 blankets-0.028
Token.
Token ID13
Feature activation-2.14e-02

Pos logit contributions
sed+0.147
 fault+0.134
apped+0.120
 intentions+0.120
 enough+0.119
Neg logit contributions
Fl-0.132
 blankets-0.126
eds-0.124
ie-0.121
Bear-0.119
Token<|endoftext|>
Token ID50256
Feature activation-9.63e-03

Pos logit contributions
 fault+0.030
sed+0.030
:+0.029
 okay+0.027
 enough+0.026
Neg logit contributions
Fl-0.040
Moon-0.038
 conclusions-0.038
eds-0.038
Bee-0.038
TokenOnce
Token ID7454
Feature activation-2.01e-02

Pos logit contributions
 fault+0.015
sed+0.014
:+0.013
 okay+0.013
 enough+0.013
Neg logit contributions
Fl-0.017
Moon-0.016
Bear-0.016
eds-0.016
Bee-0.016
Token upon
Token ID2402
Feature activation-6.46e-03

Pos logit contributions
sed+0.040
 fault+0.038
ield+0.036
 okay+0.035
ope+0.035
Neg logit contributions
ie-0.026
eds-0.024
Fl-0.024
Moon-0.023
Bee-0.023
Token a
Token ID257
Feature activation-5.40e-02

Pos logit contributions
:+0.006
 okay+0.006
 about+0.006
 fault+0.006
 happened+0.006
Neg logit contributions
eds-0.015
 pans-0.015
 conclusions-0.015
Fl-0.014
Bee-0.014
Token play
Token ID711
Feature activation+2.15e-02

Pos logit contributions
Fl+0.092
Moon+0.090
eds+0.089
Bee+0.088
 conclusions+0.088
Neg logit contributions
 fault-0.083
:-0.081
sed-0.079
 okay-0.077
 enough-0.074
Token together
Token ID1978
Feature activation+1.80e-02

Pos logit contributions
Fl+0.027
Moon+0.026
eds+0.026
 conclusions+0.026
Bee+0.025
Neg logit contributions
 fault-0.043
:-0.042
sed-0.042
 okay-0.040
 enough-0.039
Token nicely
Token ID16576
Feature activation-9.31e-03

Pos logit contributions
Fl+0.030
ts+0.029
eds+0.029
 conclusions+0.029
Moon+0.029
Neg logit contributions
 fault-0.028
sed-0.028
:-0.027
 okay-0.025
 enough-0.025
Token.
Token ID13
Feature activation-1.53e-02

Pos logit contributions
 fault+0.012
sed+0.011
:+0.011
 okay+0.010
 enough+0.010
Neg logit contributions
Fl-0.019
eds-0.018
 conclusions-0.018
Moon-0.018
ts-0.018
Token The
Token ID383
Feature activation-1.41e-02

Pos logit contributions
 fault+0.021
sed+0.021
:+0.020
 okay+0.019
 enough+0.019
Neg logit contributions
Fl-0.029
eds-0.028
Moon-0.028
 conclusions-0.028
Bee-0.027
Token end
Token ID886
Feature activation-8.01e-02

Pos logit contributions
 fault+0.026
sed+0.026
:+0.025
 enough+0.023
 okay+0.023
Neg logit contributions
Fl-0.020
 conclusions-0.019
 blankets-0.019
Moon-0.019
 pans-0.019
Token.
Token ID13
Feature activation-1.98e-02

Pos logit contributions
sed+0.111
 fault+0.109
 enough+0.098
:+0.097
 okay+0.095
Neg logit contributions
Fl-0.154
eds-0.149
 conclusions-0.146
Bear-0.144
Bee-0.144
Token<|endoftext|>
Token ID50256
Feature activation-1.50e-02

Pos logit contributions
 fault+0.029
sed+0.028
:+0.027
 enough+0.025
 okay+0.025
Neg logit contributions
Fl-0.036
Moon-0.034
eds-0.034
Bird-0.034
Bear-0.034
TokenOnce
Token ID7454
Feature activation-1.95e-02

Pos logit contributions
sed+0.024
 fault+0.024
:+0.021
 enough+0.021
 okay+0.021
Neg logit contributions
Fl-0.026
eds-0.024
Moon-0.024
Bear-0.024
Bee-0.024
Token upon
Token ID2402
Feature activation-7.36e-03

Pos logit contributions
sed+0.039
 fault+0.037
ield+0.035
ope+0.035
 enough+0.034
Neg logit contributions
ie-0.025
eds-0.024
Fl-0.023
Bee-0.022
Moon-0.022
Token a
Token ID257
Feature activation-4.88e-02

Pos logit contributions
:+0.008
 okay+0.008
 fault+0.007
 about+0.007
 happened+0.007
Neg logit contributions
eds-0.016
 pans-0.016
 conclusions-0.016
Fl-0.016
Moon-0.015
Token even
Token ID772
Feature activation+7.52e-03

Pos logit contributions
sed+0.072
 fault+0.071
:+0.068
owl+0.064
 okay+0.063
Neg logit contributions
Fl-0.052
Moon-0.050
eds-0.049
Bee-0.049
ts-0.049
Token closer
Token ID5699
Feature activation-3.82e-02

Pos logit contributions
Fl+0.011
eds+0.011
Moon+0.011
 conclusions+0.011
Bee+0.011
Neg logit contributions
 fault-0.013
 okay-0.013
:-0.013
 enough-0.012
sed-0.012
Token friends
Token ID2460
Feature activation-1.32e-02

Pos logit contributions
sed+0.044
 fault+0.041
:+0.038
 okay+0.037
owl+0.036
Neg logit contributions
eds-0.078
 blankets-0.074
 conclusions-0.074
Fl-0.073
ie-0.072
Token.
Token ID13
Feature activation-2.04e-02

Pos logit contributions
sed+0.020
 fault+0.020
:+0.018
 okay+0.018
owl+0.017
Neg logit contributions
Fl-0.022
eds-0.022
 conclusions-0.021
ts-0.021
Bear-0.021
Token The
Token ID383
Feature activation-6.93e-03

Pos logit contributions
sed+0.030
 fault+0.029
:+0.028
 enough+0.027
 okay+0.027
Neg logit contributions
Fl-0.037
Brown-0.036
Bear-0.036
Moon-0.035
Bird-0.035
Token end
Token ID886
Feature activation-8.00e-02

Pos logit contributions
 fault+0.012
sed+0.011
:+0.011
 okay+0.010
 enough+0.010
Neg logit contributions
Fl-0.011
Moon-0.011
Bee-0.011
 conclusions-0.011
Bear-0.011
Token.
Token ID13
Feature activation-2.36e-02

Pos logit contributions
sed+0.128
 fault+0.123
 enough+0.108
joy+0.107
apped+0.106
Neg logit contributions
Fl-0.141
eds-0.137
Bear-0.133
 blankets-0.131
Bird-0.131
Token<|endoftext|>
Token ID50256
Feature activation-1.10e-02

Pos logit contributions
 fault+0.034
sed+0.034
:+0.033
 okay+0.030
 enough+0.030
Neg logit contributions
Fl-0.043
Moon-0.042
Bear-0.041
Brown-0.041
Bird-0.041
TokenOnce
Token ID7454
Feature activation-1.83e-02

Pos logit contributions
 fault+0.017
sed+0.017
:+0.015
 okay+0.015
 enough+0.015
Neg logit contributions
Fl-0.019
Moon-0.019
Bear-0.018
Bee-0.018
eds-0.018
Token upon
Token ID2402
Feature activation-4.77e-03

Pos logit contributions
sed+0.034
 fault+0.033
:+0.031
 okay+0.031
ield+0.031
Neg logit contributions
ie-0.024
eds-0.024
Fl-0.024
Moon-0.023
Bee-0.023
Token a
Token ID257
Feature activation-5.38e-02

Pos logit contributions
 okay+0.005
:+0.005
 about+0.004
 fault+0.004
 happened+0.004
Neg logit contributions
eds-0.011
 pans-0.011
 conclusions-0.011
 forth-0.011
ations-0.011
Token the
Token ID262
Feature activation+3.80e-03

Pos logit contributions
Fl+0.022
Moon+0.021
Bee+0.021
Bird+0.021
Bear+0.021
Neg logit contributions
 fault-0.031
:-0.030
sed-0.029
 okay-0.028
 enough-0.028
Token broken
Token ID5445
Feature activation+1.21e-02

Pos logit contributions
eds+0.006
Moon+0.006
Fl+0.006
Buzz+0.006
Bear+0.006
Neg logit contributions
 fault-0.006
sed-0.006
 okay-0.006
:-0.006
 ending-0.006
Token shelf
Token ID18316
Feature activation-6.72e-03

Pos logit contributions
Fl+0.016
Moon+0.016
 conclusions+0.016
Bee+0.015
eds+0.015
Neg logit contributions
 fault-0.024
:-0.023
sed-0.023
 enough-0.022
 okay-0.022
Token.
Token ID13
Feature activation-1.22e-02

Pos logit contributions
 fault+0.009
sed+0.008
:+0.008
 okay+0.008
 enough+0.007
Neg logit contributions
Fl-0.013
eds-0.013
ts-0.013
 conclusions-0.013
 pans-0.013
Token The
Token ID383
Feature activation-1.05e-02

Pos logit contributions
 fault+0.017
:+0.016
sed+0.016
 okay+0.015
 enough+0.015
Neg logit contributions
Fl-0.023
Moon-0.022
eds-0.022
 conclusions-0.022
Bee-0.022
Token end
Token ID886
Feature activation-8.00e-02

Pos logit contributions
 fault+0.017
:+0.016
sed+0.016
 okay+0.016
 enough+0.015
Neg logit contributions
Fl-0.018
eds-0.017
 conclusions-0.017
Moon-0.017
Bee-0.017
Token.
Token ID13
Feature activation-1.87e-02

Pos logit contributions
sed+0.122
 fault+0.118
 enough+0.103
 okay+0.103
apped+0.102
Neg logit contributions
Fl-0.145
eds-0.138
 blankets-0.138
Brown-0.135
Bear-0.135
Token<|endoftext|>
Token ID50256
Feature activation-1.12e-02

Pos logit contributions
 fault+0.027
sed+0.026
:+0.025
 enough+0.024
 okay+0.024
Neg logit contributions
Fl-0.035
Moon-0.033
eds-0.033
Bird-0.033
Bear-0.033
TokenOnce
Token ID7454
Feature activation-2.13e-02

Pos logit contributions
 fault+0.018
sed+0.018
:+0.016
 okay+0.016
 enough+0.016
Neg logit contributions
Fl-0.019
Moon-0.018
Bear-0.018
eds-0.018
Bee-0.018
Token upon
Token ID2402
Feature activation-8.48e-03

Pos logit contributions
sed+0.041
 fault+0.040
ield+0.037
ope+0.037
:+0.036
Neg logit contributions
ie-0.028
eds-0.026
Fl-0.026
Moon-0.025
Bee-0.025
Token a
Token ID257
Feature activation-5.13e-02

Pos logit contributions
:+0.009
 okay+0.009
 fault+0.008
 about+0.008
 happened+0.007
Neg logit contributions
eds-0.019
 pans-0.019
 conclusions-0.019
Fl-0.018
ations-0.018
Token icy
Token ID30284
Feature activation+1.26e-02

Pos logit contributions
Fl+0.009
Moon+0.009
Buzz+0.009
Bear+0.008
Bee+0.008
Neg logit contributions
 fault-0.012
:-0.012
sed-0.012
 okay-0.011
 enough-0.011
Token black
Token ID2042
Feature activation+5.86e-03

Pos logit contributions
Fl+0.016
Moon+0.016
ts+0.015
Bee+0.015
Bear+0.015
Neg logit contributions
 fault-0.025
:-0.025
sed-0.024
 okay-0.023
 enough-0.022
Tokenboard
Token ID3526
Feature activation-1.80e-02

Pos logit contributions
eds+0.010
 conclusions+0.010
Fl+0.010
 blankets+0.009
 sofa+0.009
Neg logit contributions
 fault-0.009
sed-0.009
 okay-0.009
 enough-0.008
:-0.008
Token.
Token ID13
Feature activation-1.49e-02

Pos logit contributions
 fault+0.023
sed+0.023
:+0.022
 okay+0.021
 enough+0.020
Neg logit contributions
Fl-0.035
eds-0.034
 conclusions-0.033
Moon-0.033
ts-0.033
Token The
Token ID383
Feature activation-2.21e-02

Pos logit contributions
sed+0.022
 fault+0.022
:+0.021
 enough+0.020
 okay+0.020
Neg logit contributions
Fl-0.027
Moon-0.026
Bear-0.025
eds-0.025
Bee-0.025
Token end
Token ID886
Feature activation-7.97e-02

Pos logit contributions
 fault+0.043
sed+0.042
:+0.041
icked+0.039
 enough+0.039
Neg logit contributions
Fl-0.029
 conclusions-0.028
 blankets-0.027
 pans-0.027
Moon-0.027
Token.
Token ID13
Feature activation-1.99e-02

Pos logit contributions
sed+0.127
 fault+0.119
 enough+0.107
joy+0.102
 okay+0.102
Neg logit contributions
Fl-0.142
eds-0.140
 conclusions-0.134
Bear-0.133
 blankets-0.133
Token<|endoftext|>
Token ID50256
Feature activation-1.36e-02

Pos logit contributions
 fault+0.030
sed+0.029
:+0.028
 enough+0.026
 okay+0.026
Neg logit contributions
Fl-0.036
Moon-0.034
Bear-0.034
Bird-0.034
Bee-0.034
TokenOnce
Token ID7454
Feature activation-1.93e-02

Pos logit contributions
sed+0.023
 fault+0.023
 okay+0.020
 enough+0.020
:+0.020
Neg logit contributions
Fl-0.023
Bear-0.021
Moon-0.021
eds-0.021
Bee-0.021
Token upon
Token ID2402
Feature activation-8.46e-03

Pos logit contributions
sed+0.037
 fault+0.036
ield+0.033
ope+0.033
 okay+0.033
Neg logit contributions
ie-0.026
eds-0.024
Fl-0.024
Bee-0.023
Moon-0.023
Token a
Token ID257
Feature activation-5.65e-02

Pos logit contributions
:+0.009
 okay+0.009
 fault+0.008
 about+0.008
 happened+0.008
Neg logit contributions
eds-0.019
 pans-0.019
 conclusions-0.018
Fl-0.018
Moon-0.018
Token amount
Token ID2033
Feature activation-2.57e-02

Pos logit contributions
sed+0.077
:+0.069
 okay+0.068
ver+0.068
icked+0.067
Neg logit contributions
 pans-0.056
ie-0.053
eds-0.053
 blankets-0.053
 conclusions-0.052
Token of
Token ID286
Feature activation+1.47e-02

Pos logit contributions
sed+0.043
 fault+0.040
 okay+0.037
:+0.036
ope+0.036
Neg logit contributions
Fl-0.041
 blankets-0.040
eds-0.040
 pans-0.040
 conclusions-0.040
Token dessert
Token ID23084
Feature activation-1.28e-02

Pos logit contributions
Fl+0.025
eds+0.024
Bird+0.024
Moon+0.024
Bee+0.024
Neg logit contributions
:-0.024
 fault-0.024
 okay-0.024
 enough-0.023
sed-0.022
Token.
Token ID13
Feature activation-1.92e-02

Pos logit contributions
sed+0.017
 fault+0.017
:+0.016
 okay+0.015
apped+0.014
Neg logit contributions
 pans-0.024
Fl-0.024
eds-0.023
 blankets-0.023
ts-0.023
Token The
Token ID383
Feature activation-2.06e-02

Pos logit contributions
 fault+0.026
sed+0.025
:+0.025
 okay+0.023
 enough+0.023
Neg logit contributions
Fl-0.037
Moon-0.036
 conclusions-0.035
eds-0.035
Bee-0.035
Token end
Token ID886
Feature activation-7.97e-02

Pos logit contributions
 fault+0.036
sed+0.035
:+0.035
 okay+0.033
 enough+0.032
Neg logit contributions
Fl-0.031
 pans-0.030
 conclusions-0.030
Moon-0.030
ts-0.029
Token.
Token ID13
Feature activation-2.14e-02

Pos logit contributions
sed+0.123
 fault+0.115
 okay+0.104
 enough+0.103
apped+0.101
Neg logit contributions
Fl-0.143
eds-0.139
 blankets-0.137
 conclusions-0.135
Bear-0.135
Token<|endoftext|>
Token ID50256
Feature activation-1.15e-02

Pos logit contributions
 fault+0.030
sed+0.029
:+0.029
 okay+0.026
 enough+0.026
Neg logit contributions
Fl-0.040
Moon-0.038
 conclusions-0.038
eds-0.038
Bee-0.038
TokenOnce
Token ID7454
Feature activation-1.70e-02

Pos logit contributions
 fault+0.018
sed+0.017
:+0.016
 okay+0.016
 enough+0.015
Neg logit contributions
Fl-0.020
Moon-0.019
eds-0.019
Bear-0.019
Bee-0.019
Token upon
Token ID2402
Feature activation-2.67e-03

Pos logit contributions
sed+0.034
 fault+0.033
ield+0.030
ope+0.030
 okay+0.030
Neg logit contributions
ie-0.022
eds-0.020
Fl-0.020
Moon-0.019
Bee-0.019
Token a
Token ID257
Feature activation-4.78e-02

Pos logit contributions
 okay+0.003
:+0.003
 about+0.002
 happened+0.002
 fault+0.002
Neg logit contributions
eds-0.006
 pans-0.006
 conclusions-0.006
ations-0.006
 forth-0.006
Token And
Token ID843
Feature activation-1.09e-02

Pos logit contributions
sed+0.067
 fault+0.062
 enough+0.060
:+0.059
owl+0.059
Neg logit contributions
Fl-0.061
Moon-0.056
Bear-0.056
Brown-0.055
 conclusions-0.055
Token their
Token ID511
Feature activation-2.03e-02

Pos logit contributions
 fault+0.018
sed+0.018
:+0.017
 okay+0.016
 enough+0.016
Neg logit contributions
Fl-0.017
 conclusions-0.017
Moon-0.017
ts-0.017
 pans-0.016
Token mom
Token ID1995
Feature activation-2.20e-02

Pos logit contributions
 fault+0.035
sed+0.033
:+0.033
 okay+0.030
 enough+0.030
Neg logit contributions
Fl-0.033
Moon-0.033
Bird-0.033
Bee-0.033
Bear-0.032
Token.
Token ID13
Feature activation-3.59e-02

Pos logit contributions
 fault+0.032
:+0.031
sed+0.031
 okay+0.029
 enough+0.028
Neg logit contributions
Fl-0.040
 conclusions-0.039
Moon-0.039
eds-0.038
Bee-0.038
Token The
Token ID383
Feature activation-1.20e-02

Pos logit contributions
sed+0.062
 fault+0.058
 enough+0.056
:+0.054
owl+0.052
Neg logit contributions
Fl-0.059
Moon-0.056
Bear-0.056
Brown-0.055
 conclusions-0.055
Token end
Token ID886
Feature activation-7.96e-02

Pos logit contributions
 fault+0.019
:+0.019
sed+0.019
 okay+0.018
 enough+0.017
Neg logit contributions
Fl-0.020
eds-0.020
Moon-0.019
 conclusions-0.019
Bee-0.019
Token.
Token ID13
Feature activation-3.51e-02

Pos logit contributions
sed+0.132
 fault+0.125
joy+0.113
ige+0.112
 okay+0.111
Neg logit contributions
Fl-0.130
 conclusions-0.127
 blankets-0.127
eds-0.125
 pans-0.123
Token<|endoftext|>
Token ID50256
Feature activation-1.13e-03

Pos logit contributions
sed+0.062
 fault+0.060
 enough+0.056
:+0.055
 intentions+0.052
Neg logit contributions
Fl-0.056
Moon-0.054
Bear-0.053
Brown-0.052
Bird-0.052
TokenBen
Token ID11696
Feature activation-1.01e-02

Pos logit contributions
 fault+0.002
sed+0.002
:+0.002
 okay+0.002
 enough+0.002
Neg logit contributions
Fl-0.002
Moon-0.002
eds-0.002
 conclusions-0.002
Bee-0.002
Token and
Token ID290
Feature activation+1.83e-02

Pos logit contributions
 fault+0.016
sed+0.015
:+0.015
 okay+0.015
 enough+0.014
Neg logit contributions
Fl-0.017
eds-0.017
ts-0.016
Bee-0.016
Moon-0.016
Token Lily
Token ID20037
Feature activation-5.28e-03

Pos logit contributions
Fl+0.027
Moon+0.027
Bird+0.026
Bear+0.026
Bee+0.026
Neg logit contributions
 fault-0.033
sed-0.032
:-0.031
 okay-0.029
joy-0.028
Token happy
Token ID3772
Feature activation-2.65e-02

Pos logit contributions
 fault+0.025
:+0.024
sed+0.024
 okay+0.023
 enough+0.023
Neg logit contributions
Fl-0.020
eds-0.020
Moon-0.020
 conclusions-0.019
Bee-0.019
Token and
Token ID290
Feature activation+1.87e-02

Pos logit contributions
 fault+0.035
sed+0.034
:+0.034
 okay+0.031
 enough+0.030
Neg logit contributions
Fl-0.051
 conclusions-0.049
eds-0.049
Moon-0.049
 pans-0.049
Token cozy
Token ID37438
Feature activation-2.85e-03

Pos logit contributions
Fl+0.029
Moon+0.028
 conclusions+0.028
eds+0.028
Bee+0.027
Neg logit contributions
 fault-0.032
:-0.032
sed-0.031
 okay-0.030
 enough-0.029
Token.
Token ID13
Feature activation-2.17e-02

Pos logit contributions
 fault+0.004
sed+0.004
:+0.004
 okay+0.003
 enough+0.003
Neg logit contributions
Fl-0.006
 conclusions-0.005
eds-0.005
 blankets-0.005
Moon-0.005
Token The
Token ID383
Feature activation-1.67e-02

Pos logit contributions
 fault+0.032
sed+0.031
:+0.030
 enough+0.028
 okay+0.028
Neg logit contributions
Fl-0.040
Moon-0.038
Bee-0.038
Bear-0.038
Bird-0.038
Token end
Token ID886
Feature activation-7.95e-02

Pos logit contributions
:+0.024
 fault+0.024
sed+0.024
 okay+0.023
 ending+0.021
Neg logit contributions
eds-0.031
Moon-0.030
Fl-0.030
Bear-0.030
Bee-0.029
Token.
Token ID13
Feature activation-1.59e-02

Pos logit contributions
sed+0.126
 fault+0.118
 enough+0.108
 okay+0.106
joy+0.105
Neg logit contributions
Fl-0.142
eds-0.136
 blankets-0.134
Brown-0.131
Moon-0.131
Token<|endoftext|>
Token ID50256
Feature activation-1.42e-02

Pos logit contributions
 fault+0.023
sed+0.022
:+0.021
 enough+0.020
 okay+0.020
Neg logit contributions
Fl-0.030
Moon-0.028
 conclusions-0.028
Bee-0.028
Bear-0.028
TokenOnce
Token ID7454
Feature activation-2.08e-02

Pos logit contributions
 fault+0.022
sed+0.022
:+0.020
 okay+0.019
 enough+0.019
Neg logit contributions
Fl-0.025
Moon-0.024
eds-0.024
Bear-0.024
Bee-0.024
Token upon
Token ID2402
Feature activation-5.47e-03

Pos logit contributions
sed+0.041
 fault+0.040
ield+0.037
:+0.036
 okay+0.036
Neg logit contributions
ie-0.027
eds-0.026
Fl-0.025
Moon-0.024
Bee-0.024
Token a
Token ID257
Feature activation-4.75e-02

Pos logit contributions
:+0.005
 okay+0.005
 fault+0.005
 about+0.005
 happened+0.005
Neg logit contributions
eds-0.013
 pans-0.012
 conclusions-0.012
ations-0.012
Fl-0.012
Token other
Token ID584
Feature activation+1.56e-02

Pos logit contributions
Brown+0.021
Fl+0.020
cking+0.020
Bee+0.020
 pans+0.020
Neg logit contributions
:-0.026
 enough-0.023
 best-0.023
 about-0.023
 fault-0.023
Token's
Token ID338
Feature activation-5.79e-03

Pos logit contributions
Bird+0.031
Bee+0.031
Ro+0.031
Moon+0.031
Fl+0.031
Neg logit contributions
:-0.021
 about-0.019
 okay-0.019
 enough-0.019
 fault-0.019
Token hair
Token ID4190
Feature activation-1.19e-02

Pos logit contributions
 fault+0.009
:+0.009
sed+0.009
 okay+0.008
 enough+0.008
Neg logit contributions
Fl-0.010
Moon-0.010
Bee-0.010
Bear-0.010
eds-0.010
Token.
Token ID13
Feature activation-1.75e-02

Pos logit contributions
 fault+0.016
sed+0.016
:+0.016
 okay+0.015
 enough+0.014
Neg logit contributions
Fl-0.022
eds-0.021
 conclusions-0.021
Moon-0.021
ts-0.021
Token The
Token ID383
Feature activation-1.95e-02

Pos logit contributions
 fault+0.022
:+0.022
sed+0.021
 okay+0.020
 enough+0.019
Neg logit contributions
Fl-0.034
eds-0.034
 conclusions-0.034
Moon-0.034
 pans-0.034
Token end
Token ID886
Feature activation-7.94e-02

Pos logit contributions
 fault+0.034
sed+0.034
:+0.033
 okay+0.031
 enough+0.031
Neg logit contributions
Fl-0.029
 conclusions-0.028
Moon-0.028
 pans-0.028
 blankets-0.027
Token.
Token ID13
Feature activation-1.99e-02

Pos logit contributions
sed+0.123
 fault+0.118
 okay+0.106
 enough+0.105
apped+0.102
Neg logit contributions
Fl-0.143
eds-0.137
 blankets-0.134
 conclusions-0.133
Bear-0.132
Token<|endoftext|>
Token ID50256
Feature activation-1.02e-02

Pos logit contributions
 fault+0.027
sed+0.026
:+0.026
 okay+0.024
 enough+0.023
Neg logit contributions
Fl-0.038
Moon-0.037
eds-0.037
 conclusions-0.037
Bee-0.036
TokenOnce
Token ID7454
Feature activation-1.88e-02

Pos logit contributions
 fault+0.016
sed+0.016
 okay+0.015
:+0.015
 enough+0.014
Neg logit contributions
Fl-0.017
Moon-0.016
Bear-0.016
eds-0.016
Bee-0.016
Token upon
Token ID2402
Feature activation-7.77e-03

Pos logit contributions
sed+0.037
 fault+0.035
ield+0.033
 okay+0.033
:+0.033
Neg logit contributions
ie-0.024
eds-0.023
Fl-0.023
Moon-0.022
Bee-0.022
Token a
Token ID257
Feature activation-5.58e-02

Pos logit contributions
:+0.008
 okay+0.008
 about+0.007
 fault+0.007
 happened+0.007
Neg logit contributions
eds-0.018
 pans-0.018
 conclusions-0.018
Fl-0.017
 forth-0.017
Token an
Token ID281
Feature activation+6.26e-03

Pos logit contributions
sed+0.029
 fault+0.028
:+0.027
apped+0.025
owl+0.025
Neg logit contributions
Fl-0.024
ts-0.023
 conclusions-0.023
Bear-0.023
Brown-0.023
Token important
Token ID1593
Feature activation-2.52e-02

Pos logit contributions
Fl+0.013
 conclusions+0.013
eds+0.012
Moon+0.012
Bee+0.012
Neg logit contributions
sed-0.007
 fault-0.007
:-0.007
 okay-0.007
 enough-0.006
Token lesson
Token ID11483
Feature activation-3.67e-02

Pos logit contributions
icked+0.041
sed+0.040
:+0.040
apped+0.037
 okay+0.036
Neg logit contributions
 conclusions-0.039
 blankets-0.037
Fl-0.037
Bee-0.036
Brown-0.036
Token.
Token ID13
Feature activation-2.09e-02

Pos logit contributions
sed+0.074
 fault+0.066
itted+0.062
ses+0.061
ige+0.060
Neg logit contributions
ts-0.050
ie-0.050
lf-0.048
Fl-0.048
 myself-0.046
Token The
Token ID383
Feature activation-2.09e-02

Pos logit contributions
sed+0.033
 fault+0.032
:+0.029
 enough+0.029
 okay+0.029
Neg logit contributions
Fl-0.036
Bear-0.035
Brown-0.034
Moon-0.034
Bird-0.034
Token end
Token ID886
Feature activation-7.94e-02

Pos logit contributions
 fault+0.036
sed+0.036
:+0.035
 enough+0.033
 okay+0.032
Neg logit contributions
Fl-0.031
 conclusions-0.030
Moon-0.030
 blankets-0.030
 pans-0.030
Token.
Token ID13
Feature activation-2.08e-02

Pos logit contributions
sed+0.134
 fault+0.125
 enough+0.113
apped+0.110
joy+0.109
Neg logit contributions
Fl-0.134
eds-0.132
Bear-0.127
 conclusions-0.125
Brown-0.125
Token<|endoftext|>
Token ID50256
Feature activation-6.05e-03

Pos logit contributions
 fault+0.031
sed+0.031
:+0.029
 enough+0.028
 okay+0.027
Neg logit contributions
Fl-0.037
Moon-0.036
Bear-0.036
Bird-0.035
Brown-0.035
TokenOnce
Token ID7454
Feature activation-1.95e-02

Pos logit contributions
 fault+0.009
sed+0.009
:+0.008
 okay+0.008
 enough+0.008
Neg logit contributions
Fl-0.011
Moon-0.010
Bear-0.010
eds-0.010
Bee-0.010
Token upon
Token ID2402
Feature activation-6.16e-03

Pos logit contributions
sed+0.039
 fault+0.037
ield+0.035
ope+0.035
 okay+0.034
Neg logit contributions
ie-0.025
eds-0.024
Fl-0.023
Moon-0.022
Bee-0.022
Token a
Token ID257
Feature activation-5.07e-02

Pos logit contributions
:+0.006
 okay+0.006
 fault+0.006
 about+0.006
 happened+0.005
Neg logit contributions
eds-0.014
 pans-0.014
 conclusions-0.014
Fl-0.013
ations-0.013
Token happy
Token ID3772
Feature activation-3.22e-02

Pos logit contributions
 fault+0.041
:+0.040
sed+0.039
 okay+0.038
 enough+0.037
Neg logit contributions
Fl-0.046
 conclusions-0.046
eds-0.045
Moon-0.045
Bee-0.044
Token and
Token ID290
Feature activation+5.19e-03

Pos logit contributions
 fault+0.050
sed+0.049
:+0.044
 okay+0.044
ield+0.044
Neg logit contributions
 blankets-0.053
Fl-0.053
eds-0.052
Moon-0.052
Bee-0.051
Token grateful
Token ID14066
Feature activation-4.35e-02

Pos logit contributions
Fl+0.007
 pans+0.007
ts+0.007
 blankets+0.007
 myself+0.007
Neg logit contributions
 fault-0.010
:-0.009
sed-0.009
 enough-0.008
joy-0.008
Token.
Token ID13
Feature activation-3.75e-02

Pos logit contributions
 fault+0.073
sed+0.069
ver+0.068
ope+0.065
ield+0.065
Neg logit contributions
 blankets-0.066
ie-0.063
Fl-0.060
 pans-0.060
eds-0.058
Token The
Token ID383
Feature activation-8.56e-03

Pos logit contributions
 fault+0.070
sed+0.069
 enough+0.066
owl+0.064
ield+0.062
Neg logit contributions
Fl-0.056
Moon-0.052
Bear-0.051
Bee-0.050
Brown-0.049
Token end
Token ID886
Feature activation-7.94e-02

Pos logit contributions
 fault+0.013
:+0.013
sed+0.013
 okay+0.012
 enough+0.012
Neg logit contributions
Fl-0.014
eds-0.014
Moon-0.014
 conclusions-0.014
ts-0.014
Token.
Token ID13
Feature activation-3.24e-02

Pos logit contributions
sed+0.135
 fault+0.129
 okay+0.119
joy+0.117
ige+0.115
Neg logit contributions
 blankets-0.126
Fl-0.126
eds-0.125
 pans-0.119
 conclusions-0.119
Token<|endoftext|>
Token ID50256
Feature activation-5.21e-03

Pos logit contributions
sed+0.059
 fault+0.059
 enough+0.055
:+0.053
 intentions+0.052
Neg logit contributions
Fl-0.049
Moon-0.047
Bear-0.046
Bee-0.045
Bird-0.045
TokenTim
Token ID14967
Feature activation+6.83e-03

Pos logit contributions
 fault+0.009
sed+0.008
:+0.008
 okay+0.008
 enough+0.008
Neg logit contributions
Fl-0.008
Moon-0.008
eds-0.008
 conclusions-0.008
Bee-0.008
Token and
Token ID290
Feature activation+1.33e-02

Pos logit contributions
Fl+0.012
eds+0.012
Moon+0.012
 conclusions+0.012
Bee+0.012
Neg logit contributions
 fault-0.010
sed-0.010
:-0.010
 okay-0.009
 enough-0.009
Token Sue
Token ID26089
Feature activation+7.40e-03

Pos logit contributions
Fl+0.016
Moon+0.015
Bird+0.015
Bear+0.015
Bee+0.015
Neg logit contributions
 fault-0.028
sed-0.027
:-0.025
joy-0.025
 okay-0.024
Token just
Token ID655
Feature activation-4.36e-02

Pos logit contributions
 fault+0.040
sed+0.039
:+0.039
 okay+0.034
 enough+0.034
Neg logit contributions
Fl-0.075
eds-0.073
 conclusions-0.073
Moon-0.073
Bee-0.072
Token a
Token ID257
Feature activation-2.97e-02

Pos logit contributions
 fault+0.062
sed+0.061
:+0.060
joy+0.055
apped+0.054
Neg logit contributions
Fl-0.074
 conclusions-0.073
ts-0.073
 blankets-0.071
 myself-0.071
Token fruit
Token ID8234
Feature activation-1.06e-02

Pos logit contributions
sed+0.047
:+0.046
 fault+0.046
apped+0.044
icked+0.043
Neg logit contributions
Fl-0.048
Bee-0.046
 conclusions-0.046
Bear-0.046
Moon-0.046
Token.
Token ID13
Feature activation-1.81e-02

Pos logit contributions
 fault+0.014
sed+0.013
:+0.013
 okay+0.012
joy+0.012
Neg logit contributions
Fl-0.020
 conclusions-0.020
eds-0.019
 pans-0.019
 blankets-0.019
Token The
Token ID383
Feature activation-2.32e-02

Pos logit contributions
 fault+0.025
sed+0.025
:+0.024
 okay+0.023
 enough+0.023
Neg logit contributions
Fl-0.034
Moon-0.032
Bear-0.032
eds-0.032
Bee-0.032
Token end
Token ID886
Feature activation-7.93e-02

Pos logit contributions
sed+0.044
 fault+0.044
:+0.042
icked+0.041
apped+0.041
Neg logit contributions
 blankets-0.029
 conclusions-0.029
Fl-0.029
 pans-0.028
Bee-0.027
Token.
Token ID13
Feature activation-2.18e-02

Pos logit contributions
sed+0.131
 fault+0.121
 enough+0.111
apped+0.110
ige+0.108
Neg logit contributions
Fl-0.136
eds-0.130
 blankets-0.129
 conclusions-0.128
Bear-0.126
Token<|endoftext|>
Token ID50256
Feature activation-1.36e-02

Pos logit contributions
 fault+0.032
sed+0.032
:+0.030
 enough+0.028
 okay+0.028
Neg logit contributions
Fl-0.040
Moon-0.038
Bear-0.038
Bird-0.038
Bee-0.038
TokenOnce
Token ID7454
Feature activation-1.70e-02

Pos logit contributions
sed+0.023
 fault+0.023
 okay+0.020
 enough+0.020
:+0.020
Neg logit contributions
Fl-0.022
Bear-0.021
Moon-0.021
eds-0.021
Bee-0.021
Token upon
Token ID2402
Feature activation-7.16e-03

Pos logit contributions
sed+0.033
 fault+0.032
 okay+0.029
ield+0.029
ope+0.029
Neg logit contributions
ie-0.022
eds-0.021
Fl-0.021
Moon-0.020
Bee-0.020
Token a
Token ID257
Feature activation-5.50e-02

Pos logit contributions
:+0.007
 okay+0.007
 about+0.006
 fault+0.006
 happened+0.006
Neg logit contributions
eds-0.017
 pans-0.017
 conclusions-0.016
 forth-0.016
ations-0.016
Token on
Token ID319
Feature activation+8.41e-03

Pos logit contributions
 fault+0.009
sed+0.009
:+0.009
 okay+0.008
 enough+0.008
Neg logit contributions
Fl-0.011
eds-0.011
 conclusions-0.011
Moon-0.011
 blankets-0.011
Token their
Token ID511
Feature activation+9.71e-03

Pos logit contributions
Fl+0.013
Bee+0.013
Moon+0.013
Bird+0.013
 pans+0.013
Neg logit contributions
:-0.014
 fault-0.014
 enough-0.014
 okay-0.013
sed-0.013
Token adventure
Token ID8855
Feature activation-1.77e-03

Pos logit contributions
Bee+0.016
Fl+0.016
Bear+0.016
Bird+0.016
Moon+0.016
Neg logit contributions
 fault-0.016
sed-0.015
:-0.015
 enough-0.015
 okay-0.014
Token.
Token ID13
Feature activation-1.06e-02

Pos logit contributions
 fault+0.002
:+0.002
sed+0.002
 okay+0.002
 enough+0.002
Neg logit contributions
Fl-0.003
Moon-0.003
 conclusions-0.003
eds-0.003
Bee-0.003
Token The
Token ID383
Feature activation-1.87e-02

Pos logit contributions
 fault+0.015
sed+0.014
:+0.014
 okay+0.013
 enough+0.013
Neg logit contributions
Fl-0.020
Moon-0.019
eds-0.019
 conclusions-0.019
Bear-0.019
Token end
Token ID886
Feature activation-7.92e-02

Pos logit contributions
:+0.026
 fault+0.025
sed+0.025
 okay+0.025
 enough+0.024
Neg logit contributions
eds-0.036
Fl-0.035
 conclusions-0.034
zip-0.034
Bee-0.034
Token.
Token ID13
Feature activation-1.60e-02

Pos logit contributions
sed+0.127
 fault+0.120
 okay+0.107
apped+0.106
 enough+0.104
Neg logit contributions
Fl-0.140
eds-0.134
 blankets-0.134
Bear-0.129
 conclusions-0.129
Token<|endoftext|>
Token ID50256
Feature activation-1.14e-02

Pos logit contributions
 fault+0.023
sed+0.023
:+0.022
 okay+0.021
 enough+0.020
Neg logit contributions
Fl-0.030
Moon-0.028
Bear-0.028
 conclusions-0.028
eds-0.028
TokenOnce
Token ID7454
Feature activation-1.63e-02

Pos logit contributions
 fault+0.018
sed+0.018
:+0.016
 okay+0.016
 enough+0.016
Neg logit contributions
Fl-0.020
Moon-0.019
Bear-0.019
eds-0.019
Bee-0.019
Token upon
Token ID2402
Feature activation-6.70e-03

Pos logit contributions
sed+0.031
 fault+0.031
ield+0.028
:+0.028
 okay+0.028
Neg logit contributions
ie-0.021
eds-0.020
Fl-0.020
Bee-0.019
Moon-0.019
Token a
Token ID257
Feature activation-4.82e-02

Pos logit contributions
 okay+0.006
:+0.006
 about+0.006
 fault+0.005
 happened+0.005
Neg logit contributions
eds-0.016
 pans-0.016
 conclusions-0.015
ations-0.015
 forth-0.015
Token happy
Token ID3772
Feature activation-1.88e-02

Pos logit contributions
 fault+0.006
sed+0.006
:+0.006
 okay+0.006
 enough+0.005
Neg logit contributions
Fl-0.009
 conclusions-0.009
Bee-0.009
Moon-0.009
eds-0.009
Token and
Token ID290
Feature activation+2.12e-02

Pos logit contributions
 fault+0.026
sed+0.025
 okay+0.024
:+0.023
ope+0.022
Neg logit contributions
Fl-0.034
eds-0.034
 blankets-0.034
 conclusions-0.033
ts-0.033
Token smiling
Token ID16755
Feature activation-3.58e-02

Pos logit contributions
 blankets+0.031
Fl+0.031
 pans+0.031
 forth+0.030
zip+0.030
Neg logit contributions
 fault-0.038
:-0.034
sed-0.034
 intentions-0.033
ope-0.032
Token.
Token ID13
Feature activation-1.98e-02

Pos logit contributions
 fault+0.063
sed+0.060
ope+0.054
oke+0.054
owl+0.053
Neg logit contributions
 blankets-0.051
Fl-0.049
ie-0.047
 strings-0.046
Bear-0.046
Token The
Token ID383
Feature activation-2.10e-02

Pos logit contributions
 fault+0.034
sed+0.033
:+0.031
 enough+0.030
 okay+0.030
Neg logit contributions
Fl-0.032
Brown-0.030
Moon-0.030
Bear-0.030
Ruby-0.029
Token end
Token ID886
Feature activation-7.91e-02

Pos logit contributions
 fault+0.033
:+0.032
sed+0.032
 okay+0.031
 enough+0.029
Neg logit contributions
Fl-0.036
eds-0.035
Moon-0.035
 conclusions-0.035
Bee-0.035
Token.
Token ID13
Feature activation-2.17e-02

Pos logit contributions
sed+0.130
 fault+0.125
 enough+0.111
 okay+0.111
joy+0.110
Neg logit contributions
Fl-0.135
eds-0.131
 blankets-0.130
Bear-0.126
Brown-0.125
Token<|endoftext|>
Token ID50256
Feature activation-8.59e-03

Pos logit contributions
 fault+0.034
sed+0.033
:+0.031
 enough+0.030
 okay+0.029
Neg logit contributions
Fl-0.038
Moon-0.036
Bear-0.036
Brown-0.036
Bird-0.036
TokenOnce
Token ID7454
Feature activation-1.81e-02

Pos logit contributions
 fault+0.014
sed+0.014
 okay+0.012
:+0.012
 enough+0.012
Neg logit contributions
Fl-0.015
Moon-0.014
Bear-0.014
eds-0.014
Bee-0.014
Token upon
Token ID2402
Feature activation-7.18e-03

Pos logit contributions
sed+0.033
 fault+0.033
 okay+0.030
ield+0.030
:+0.030
Neg logit contributions
ie-0.025
eds-0.024
Fl-0.024
Moon-0.023
Bee-0.023
Token a
Token ID257
Feature activation-5.12e-02

Pos logit contributions
:+0.007
 okay+0.007
 fault+0.007
 about+0.006
 happened+0.006
Neg logit contributions
eds-0.016
 pans-0.016
 conclusions-0.016
Fl-0.016
ations-0.016
Token bed
Token ID3996
Feature activation-2.78e-02

Pos logit contributions
 fault+0.017
:+0.017
sed+0.017
 okay+0.016
 enough+0.016
Neg logit contributions
Fl-0.011
eds-0.011
 conclusions-0.011
 blankets-0.011
 pans-0.011
Token every
Token ID790
Feature activation-1.42e-02

Pos logit contributions
 fault+0.056
sed+0.056
:+0.053
 okay+0.051
ield+0.051
Neg logit contributions
 blankets-0.033
ts-0.032
Fl-0.032
eds-0.031
ie-0.030
Token day
Token ID1110
Feature activation-2.06e-02

Pos logit contributions
:+0.021
 fault+0.021
sed+0.020
 enough+0.020
 okay+0.020
Neg logit contributions
Fl-0.025
ts-0.025
 conclusions-0.025
 pans-0.024
Brown-0.024
Token.
Token ID13
Feature activation-7.29e-03

Pos logit contributions
 fault+0.031
sed+0.030
:+0.027
ope+0.026
 okay+0.026
Neg logit contributions
eds-0.035
 blankets-0.035
ts-0.035
Fl-0.035
 conclusions-0.034
Token The
Token ID383
Feature activation-2.04e-02

Pos logit contributions
 fault+0.010
sed+0.010
:+0.009
 okay+0.009
 enough+0.009
Neg logit contributions
Fl-0.014
Moon-0.013
 conclusions-0.013
eds-0.013
Bee-0.013
Token end
Token ID886
Feature activation-7.91e-02

Pos logit contributions
 fault+0.028
 okay+0.027
:+0.027
sed+0.027
 ending+0.026
Neg logit contributions
eds-0.040
ts-0.038
zip-0.038
 conclusions-0.037
Fl-0.037
Token.
Token ID13
Feature activation-1.41e-02

Pos logit contributions
sed+0.128
 fault+0.121
 enough+0.108
 okay+0.106
joy+0.106
Neg logit contributions
Fl-0.140
eds-0.135
 blankets-0.132
Bear-0.129
ie-0.129
Token<|endoftext|>
Token ID50256
Feature activation-9.52e-03

Pos logit contributions
sed+0.022
 fault+0.022
:+0.020
 enough+0.020
 okay+0.019
Neg logit contributions
Fl-0.025
Brown-0.023
Moon-0.023
Bear-0.023
Bird-0.023
TokenOnce
Token ID7454
Feature activation-1.39e-02

Pos logit contributions
 fault+0.014
sed+0.014
:+0.013
 okay+0.013
 enough+0.013
Neg logit contributions
Fl-0.017
Moon-0.016
eds-0.016
Bear-0.016
Bee-0.016
Token upon
Token ID2402
Feature activation-6.72e-03

Pos logit contributions
sed+0.027
 fault+0.026
:+0.024
 okay+0.024
ield+0.024
Neg logit contributions
ie-0.018
eds-0.017
Fl-0.017
Moon-0.017
Bee-0.017
Token a
Token ID257
Feature activation-5.13e-02

Pos logit contributions
:+0.007
 okay+0.007
 fault+0.006
 about+0.006
 happened+0.006
Neg logit contributions
eds-0.016
 pans-0.015
 conclusions-0.015
Fl-0.015
ations-0.015
Token the
Token ID262
Feature activation+8.49e-03

Pos logit contributions
Moon+0.052
Bee+0.052
Buzz+0.052
Bird+0.051
Fl+0.050
Neg logit contributions
 fault-0.057
:-0.056
 enough-0.053
 okay-0.050
 about-0.049
Token piano
Token ID19132
Feature activation+1.26e-02

Pos logit contributions
Buzz+0.010
Bee+0.010
Bear+0.009
Fl+0.009
eds+0.009
Neg logit contributions
 fault-0.019
:-0.018
sed-0.018
 okay-0.017
 enough-0.017
Token keys
Token ID8251
Feature activation+1.25e-02

Pos logit contributions
Fl+0.026
Moon+0.026
Bee+0.026
 pans+0.025
eds+0.025
Neg logit contributions
 fault-0.015
:-0.015
sed-0.014
 okay-0.013
 enough-0.013
Token.
Token ID13
Feature activation-2.23e-02

Pos logit contributions
Fl+0.024
Moon+0.024
eds+0.024
 conclusions+0.024
Bee+0.023
Neg logit contributions
 fault-0.016
:-0.016
sed-0.015
 okay-0.014
 enough-0.014
Token The
Token ID383
Feature activation-1.56e-02

Pos logit contributions
sed+0.039
 fault+0.036
:+0.035
 enough+0.034
 intentions+0.033
Neg logit contributions
Fl-0.037
Bear-0.034
Brown-0.034
Moon-0.034
Bird-0.033
Token end
Token ID886
Feature activation-7.90e-02

Pos logit contributions
 fault+0.027
sed+0.026
:+0.026
 okay+0.024
 enough+0.024
Neg logit contributions
Fl-0.024
Moon-0.023
 conclusions-0.023
 pans-0.023
eds-0.023
Token.
Token ID13
Feature activation-2.85e-02

Pos logit contributions
sed+0.121
 fault+0.114
 enough+0.102
apped+0.101
:+0.099
Neg logit contributions
Fl-0.145
eds-0.140
 conclusions-0.138
Brown-0.135
Bear-0.135
Token<|endoftext|>
Token ID50256
Feature activation-8.72e-03

Pos logit contributions
sed+0.050
 fault+0.048
:+0.044
 enough+0.043
 intentions+0.042
Neg logit contributions
Fl-0.047
Bear-0.044
Moon-0.044
Brown-0.043
Bird-0.043
TokenOnce
Token ID7454
Feature activation-1.80e-02

Pos logit contributions
 fault+0.013
sed+0.013
:+0.012
 okay+0.012
 enough+0.012
Neg logit contributions
Fl-0.015
Moon-0.015
Bear-0.015
eds-0.015
Bee-0.015
Token upon
Token ID2402
Feature activation-5.47e-03

Pos logit contributions
sed+0.036
 fault+0.034
ield+0.032
 okay+0.032
ope+0.032
Neg logit contributions
ie-0.023
eds-0.021
Fl-0.021
Moon-0.020
Bee-0.020
Token a
Token ID257
Feature activation-5.25e-02

Pos logit contributions
 okay+0.005
:+0.005
 about+0.005
 fault+0.005
 happened+0.004
Neg logit contributions
eds-0.013
 pans-0.013
 conclusions-0.012
 forth-0.012
Fl-0.012
Token best
Token ID1266
Feature activation-2.87e-02

Pos logit contributions
 fault+0.044
:+0.042
sed+0.041
 okay+0.040
 enough+0.039
Neg logit contributions
Fl-0.071
 conclusions-0.071
eds-0.070
Moon-0.069
 pans-0.069
Token friends
Token ID2460
Feature activation-3.64e-02

Pos logit contributions
 fault+0.046
sed+0.045
:+0.044
 okay+0.040
ope+0.039
Neg logit contributions
eds-0.047
 blankets-0.047
Fl-0.047
Bear-0.045
Bee-0.045
Token forever
Token ID8097
Feature activation-1.68e-02

Pos logit contributions
sed+0.067
 fault+0.064
ope+0.057
 expression+0.056
owl+0.056
Neg logit contributions
Fl-0.055
eds-0.052
ie-0.051
 blankets-0.051
Ruby-0.050
Token.
Token ID13
Feature activation-2.10e-02

Pos logit contributions
sed+0.025
 fault+0.025
:+0.023
ield+0.022
owl+0.021
Neg logit contributions
eds-0.029
Fl-0.029
 blankets-0.029
Bee-0.028
ts-0.028
Token The
Token ID383
Feature activation-1.49e-02

Pos logit contributions
 fault+0.028
sed+0.028
:+0.027
 okay+0.025
 enough+0.025
Neg logit contributions
Fl-0.040
Moon-0.039
 conclusions-0.039
eds-0.039
Bee-0.038
Token end
Token ID886
Feature activation-7.89e-02

Pos logit contributions
 fault+0.024
:+0.023
sed+0.023
 okay+0.022
 enough+0.021
Neg logit contributions
Fl-0.025
eds-0.024
 conclusions-0.024
Moon-0.024
Bee-0.024
Token.
Token ID13
Feature activation-1.98e-02

Pos logit contributions
sed+0.123
 fault+0.117
 okay+0.102
 enough+0.102
joy+0.101
Neg logit contributions
Fl-0.143
eds-0.138
 blankets-0.135
Bear-0.134
Bee-0.133
Token<|endoftext|>
Token ID50256
Feature activation-7.38e-03

Pos logit contributions
 fault+0.027
sed+0.027
:+0.026
 okay+0.025
 enough+0.024
Neg logit contributions
Fl-0.037
Moon-0.036
 conclusions-0.035
eds-0.035
Bee-0.035
TokenOnce
Token ID7454
Feature activation-1.18e-02

Pos logit contributions
 fault+0.011
sed+0.010
:+0.010
 okay+0.010
 enough+0.009
Neg logit contributions
Fl-0.014
Moon-0.013
eds-0.013
Bear-0.013
Bee-0.013
Token upon
Token ID2402
Feature activation-3.81e-03

Pos logit contributions
sed+0.023
 fault+0.022
:+0.021
ield+0.020
 okay+0.020
Neg logit contributions
ie-0.015
eds-0.015
Fl-0.015
Moon-0.014
Bee-0.014
Token a
Token ID257
Feature activation-4.90e-02

Pos logit contributions
 okay+0.004
:+0.003
 about+0.003
 fault+0.003
 happened+0.003
Neg logit contributions
eds-0.009
 pans-0.009
 conclusions-0.009
 forth-0.009
ations-0.009
Token to
Token ID284
Feature activation+4.26e-02

Pos logit contributions
 fault+0.030
sed+0.029
:+0.028
 okay+0.025
 enough+0.024
Neg logit contributions
Fl-0.078
Moon-0.076
eds-0.076
 conclusions-0.076
 pans-0.075
Token decor
Token ID11705
Feature activation+5.88e-02

Pos logit contributions
Fl+0.071
eds+0.068
cking+0.066
 pans+0.065
ts+0.063
Neg logit contributions
:-0.072
ield-0.065
 opinion-0.064
 fault-0.064
 enough-0.063
Tokenate
Token ID378
Feature activation+4.25e-02

Pos logit contributions
 conclusions+0.092
Fl+0.092
Bee+0.089
Buzz+0.087
Bird+0.086
Neg logit contributions
:-0.102
 fault-0.095
 enough-0.094
 okay-0.092
 about-0.090
Token the
Token ID262
Feature activation+3.66e-02

Pos logit contributions
|+0.080
 conclusions+0.080
Buzz+0.078
Bird+0.078
Brown+0.078
Neg logit contributions
:-0.069
 enough-0.064
 about-0.062
 okay-0.055
 best-0.055
Token room
Token ID2119
Feature activation+2.36e-02

Pos logit contributions
ck+0.061
|+0.060
ts+0.059
cking+0.058
 conclusions+0.058
Neg logit contributions
:-0.059
 best-0.059
 fault-0.057
sed-0.053
 okay-0.053
Token to
Token ID284
Feature activation+6.45e-02

Pos logit contributions
Fl+0.040
 conclusions+0.039
Moon+0.039
eds+0.039
Bee+0.039
Neg logit contributions
:-0.037
 fault-0.036
sed-0.035
 okay-0.034
 enough-0.033
Token make
Token ID787
Feature activation+4.05e-02

Pos logit contributions
eds+0.120
Fl+0.119
Bear+0.111
Ruby+0.110
Bee+0.108
Neg logit contributions
:-0.097
 okay-0.088
 about-0.088
 opinion-0.085
 enough-0.081
Token it
Token ID340
Feature activation+1.76e-02

Pos logit contributions
Fl+0.086
|+0.085
Ruby+0.084
Bee+0.083
 conclusions+0.083
Neg logit contributions
 enough-0.054
 about-0.053
:-0.053
 okay-0.049
 best-0.048
Token look
Token ID804
Feature activation+8.74e-03

Pos logit contributions
Fl+0.032
 conclusions+0.031
eds+0.031
Bird+0.031
Bear+0.030
Neg logit contributions
:-0.028
 okay-0.027
 fault-0.027
 enough-0.027
 about-0.026
Token all
Token ID477
Feature activation+1.40e-02

Pos logit contributions
Fl+0.015
 conclusions+0.014
eds+0.014
Bee+0.014
Moon+0.014
Neg logit contributions
:-0.014
 fault-0.014
 okay-0.013
sed-0.013
 enough-0.013
Token pretty
Token ID2495
Feature activation+1.45e-02

Pos logit contributions
 conclusions+0.025
Bird+0.025
Fl+0.025
Bee+0.024
Moon+0.024
Neg logit contributions
:-0.022
 enough-0.022
 about-0.022
 fault-0.022
 okay-0.021
Token hive
Token ID35881
Feature activation+1.84e-02

Pos logit contributions
Brown+0.044
zip+0.044
ts+0.044
Fl+0.043
 conclusions+0.043
Neg logit contributions
:-0.060
 okay-0.059
 enough-0.058
sed-0.057
 best-0.056
Token.
Token ID13
Feature activation-3.60e-03

Pos logit contributions
Fl+0.036
eds+0.036
Moon+0.035
 conclusions+0.035
Bee+0.035
Neg logit contributions
 fault-0.023
sed-0.022
:-0.022
 okay-0.021
 enough-0.020
Token They
Token ID1119
Feature activation+3.30e-02

Pos logit contributions
:+0.004
 fault+0.004
sed+0.004
 okay+0.004
 enough+0.003
Neg logit contributions
eds-0.008
Fl-0.008
 conclusions-0.008
ts-0.007
 pans-0.007
Token worked
Token ID3111
Feature activation-8.77e-03

Pos logit contributions
 conclusions+0.074
eds+0.072
Fl+0.071
Brown+0.070
 pans+0.070
Neg logit contributions
:-0.038
sed-0.034
 about-0.030
 happened-0.030
 okay-0.030
Token hard
Token ID1327
Feature activation-1.07e-02

Pos logit contributions
 fault+0.015
sed+0.015
:+0.015
 okay+0.014
 enough+0.014
Neg logit contributions
Fl-0.013
ts-0.013
eds-0.013
 pans-0.013
 blankets-0.012
Token to
Token ID284
Feature activation+6.34e-02

Pos logit contributions
 fault+0.014
sed+0.014
 okay+0.013
:+0.012
 ok+0.012
Neg logit contributions
ts-0.020
eds-0.020
 blankets-0.020
 pans-0.019
Fl-0.019
Token make
Token ID787
Feature activation+3.88e-02

Pos logit contributions
Fl+0.110
Brown+0.110
George+0.110
Bear+0.109
eds+0.108
Neg logit contributions
:-0.092
 okay-0.090
 about-0.081
 opinion-0.079
 fault-0.079
Token honey
Token ID12498
Feature activation+1.58e-02

Pos logit contributions
Brown+0.067
Fl+0.064
zip+0.063
George+0.060
Bear+0.060
Neg logit contributions
 enough-0.066
 okay-0.064
 about-0.063
 fault-0.063
:-0.062
Token.
Token ID13
Feature activation-1.54e-03

Pos logit contributions
Fl+0.029
 conclusions+0.028
Moon+0.028
eds+0.028
Bear+0.028
Neg logit contributions
 fault-0.023
:-0.022
sed-0.021
 okay-0.021
 enough-0.021
Token One
Token ID1881
Feature activation-1.82e-02

Pos logit contributions
:+0.002
 fault+0.001
sed+0.001
 okay+0.001
 about+0.001
Neg logit contributions
eds-0.003
ts-0.003
 conclusions-0.003
 pans-0.003
cking-0.003
Token day
Token ID1110
Feature activation-3.19e-02

Pos logit contributions
:+0.029
 fault+0.028
sed+0.028
 okay+0.027
 enough+0.026
Neg logit contributions
Fl-0.030
 conclusions-0.029
eds-0.029
ts-0.029
 pans-0.029
Token cut
Token ID2005
Feature activation+2.75e-02

Pos logit contributions
 conclusions+0.056
Fl+0.055
eds+0.055
Moon+0.054
Pen+0.052
Neg logit contributions
:-0.030
sed-0.029
 fault-0.027
 enough-0.026
 okay-0.025
Token and
Token ID290
Feature activation+5.69e-02

Pos logit contributions
 conclusions+0.043
Moon+0.042
Pen+0.041
Brown+0.041
eds+0.041
Neg logit contributions
:-0.048
 enough-0.047
 fault-0.045
 okay-0.043
 about-0.042
Token fold
Token ID5591
Feature activation+3.29e-02

Pos logit contributions
eds+0.090
ind+0.089
Fl+0.089
Moon+0.089
 conclusions+0.089
Neg logit contributions
:-0.095
sed-0.093
 enough-0.091
 okay-0.088
 happened-0.087
Token the
Token ID262
Feature activation+1.31e-02

Pos logit contributions
Moon+0.051
Fl+0.050
Brown+0.050
Bird+0.050
Bee+0.049
Neg logit contributions
:-0.060
 enough-0.057
 fault-0.057
 about-0.053
 okay-0.053
Token cardboard
Token ID26819
Feature activation+3.90e-02

Pos logit contributions
Fl+0.017
Moon+0.017
Bird+0.016
Pen+0.016
zip+0.016
Neg logit contributions
sed-0.026
 fault-0.026
 enough-0.025
:-0.025
 okay-0.025
Token to
Token ID284
Feature activation+6.39e-02

Pos logit contributions
Fl+0.057
 conclusions+0.056
Moon+0.055
Bird+0.054
Pen+0.053
Neg logit contributions
:-0.074
 fault-0.073
sed-0.070
 enough-0.068
 okay-0.063
Token make
Token ID787
Feature activation+3.67e-02

Pos logit contributions
Fl+0.109
eds+0.105
Pen+0.104
Moon+0.102
E+0.102
Neg logit contributions
:-0.097
 okay-0.094
 enough-0.092
 about-0.090
sed-0.089
Token it
Token ID340
Feature activation+8.90e-03

Pos logit contributions
Fl+0.071
Bee+0.067
Brown+0.067
Bird+0.067
Moon+0.066
Neg logit contributions
 enough-0.058
 about-0.054
 fault-0.054
:-0.052
 okay-0.052
Token look
Token ID804
Feature activation+2.76e-03

Pos logit contributions
Fl+0.015
Bird+0.015
Moon+0.015
eds+0.014
Bear+0.014
Neg logit contributions
 fault-0.015
 okay-0.015
 enough-0.015
:-0.014
 about-0.014
Token like
Token ID588
Feature activation+9.75e-03

Pos logit contributions
Fl+0.005
Bee+0.005
Moon+0.005
eds+0.005
 conclusions+0.005
Neg logit contributions
:-0.004
 fault-0.004
 okay-0.004
 enough-0.004
sed-0.004
Token shields
Token ID21132
Feature activation+1.62e-02

Pos logit contributions
eds+0.018
Fl+0.018
Moon+0.018
 conclusions+0.018
Bird+0.018
Neg logit contributions
:-0.014
 fault-0.014
 enough-0.013
sed-0.013
 okay-0.013
Token wanted
Token ID2227
Feature activation-2.73e-03

Pos logit contributions
Fl+0.023
ts+0.022
Moon+0.022
 pans+0.022
eds+0.022
Neg logit contributions
 fault-0.020
sed-0.018
 okay-0.018
:-0.018
 enough-0.017
Token to
Token ID284
Feature activation+4.65e-02

Pos logit contributions
 fault+0.002
sed+0.002
:+0.002
 okay+0.002
 enough+0.002
Neg logit contributions
Fl-0.006
Moon-0.006
 conclusions-0.006
eds-0.006
 pans-0.006
Token make
Token ID787
Feature activation+3.24e-02

Pos logit contributions
Fl+0.069
eds+0.068
Bee+0.067
Moon+0.066
Bear+0.066
Neg logit contributions
:-0.082
 fault-0.081
sed-0.078
 okay-0.076
 enough-0.075
Token new
Token ID649
Feature activation+7.22e-03

Pos logit contributions
Fl+0.060
Bee+0.059
Brown+0.058
Bear+0.058
Bird+0.058
Neg logit contributions
:-0.049
 fault-0.048
 enough-0.047
 okay-0.045
 about-0.045
Token friends
Token ID2460
Feature activation+1.07e-02

Pos logit contributions
 conclusions+0.011
 pans+0.011
Fl+0.011
 blankets+0.011
eds+0.011
Neg logit contributions
sed-0.012
 fault-0.012
:-0.011
 okay-0.011
 enough-0.010
Token and
Token ID290
Feature activation+6.32e-02

Pos logit contributions
Fl+0.021
eds+0.020
 conclusions+0.020
Moon+0.020
 pans+0.020
Neg logit contributions
 fault-0.014
sed-0.014
:-0.013
 okay-0.012
 enough-0.012
Token learn
Token ID2193
Feature activation-8.15e-03

Pos logit contributions
Fl+0.096
eds+0.094
Bee+0.092
Moon+0.092
Bear+0.091
Neg logit contributions
:-0.112
 fault-0.111
sed-0.108
 okay-0.105
 enough-0.102
Token new
Token ID649
Feature activation+1.50e-02

Pos logit contributions
 fault+0.013
:+0.013
sed+0.013
 okay+0.012
 enough+0.012
Neg logit contributions
Fl-0.013
Moon-0.013
eds-0.013
Bee-0.013
 conclusions-0.013
Token things
Token ID1243
Feature activation-9.64e-03

Pos logit contributions
Fl+0.015
 conclusions+0.015
eds+0.015
Moon+0.015
 pans+0.015
Neg logit contributions
 fault-0.033
sed-0.033
:-0.032
 okay-0.031
 enough-0.030
Token.
Token ID13
Feature activation-6.03e-03

Pos logit contributions
sed+0.015
 fault+0.014
apped+0.012
joy+0.012
ope+0.012
Neg logit contributions
Fl-0.017
 blankets-0.016
ts-0.016
 pans-0.016
 myself-0.016
Token
Token ID198
Feature activation+4.21e-02

Pos logit contributions
sed+0.011
 fault+0.010
:+0.009
 okay+0.009
 happened+0.009
Neg logit contributions
Fl-0.010
Moon-0.009
 conclusions-0.009
E-0.009
Pen-0.009
Token.
Token ID13
Feature activation+1.96e-02

Pos logit contributions
:+0.001
 fault+0.001
sed+0.001
 okay+0.001
 enough+0.001
Neg logit contributions
Fl-0.003
 conclusions-0.003
Moon-0.003
 pans-0.003
ts-0.003
Token She
Token ID1375
Feature activation+2.28e-02

Pos logit contributions
Fl+0.038
Moon+0.036
 conclusions+0.036
Bee+0.036
Bear+0.036
Neg logit contributions
 fault-0.027
sed-0.026
:-0.025
 okay-0.023
 enough-0.023
Token loved
Token ID6151
Feature activation+1.92e-02

Pos logit contributions
Fl+0.043
 conclusions+0.043
eds+0.042
Moon+0.042
 pans+0.042
Neg logit contributions
:-0.031
 fault-0.030
sed-0.030
 okay-0.028
 enough-0.027
Token to
Token ID284
Feature activation+4.18e-02

Pos logit contributions
 conclusions+0.055
Fl+0.054
Bee+0.053
Bird+0.053
Bear+0.052
Neg logit contributions
:-0.009
 fault-0.008
 enough-0.008
 about-0.007
sed-0.007
Token play
Token ID711
Feature activation+3.98e-02

Pos logit contributions
eds+0.052
Fl+0.052
 conclusions+0.048
Bear+0.048
Moon+0.048
Neg logit contributions
:-0.085
 fault-0.082
sed-0.081
 okay-0.080
 enough-0.077
Token with
Token ID351
Feature activation+6.39e-02

Pos logit contributions
 conclusions+0.064
eds+0.063
Bee+0.061
Moon+0.060
Fl+0.060
Neg logit contributions
:-0.067
 about-0.062
 okay-0.061
sed-0.058
 enough-0.057
Token her
Token ID607
Feature activation+1.23e-02

Pos logit contributions
 conclusions+0.138
|+0.135
eds+0.134
Bear+0.132
lling+0.130
Neg logit contributions
:-0.075
 happened-0.068
 about-0.068
 enough-0.064
 at-0.062
Token toys
Token ID14958
Feature activation+3.78e-02

Pos logit contributions
 conclusions+0.019
eds+0.018
eling+0.018
Bird+0.018
Moon+0.017
Neg logit contributions
sed-0.023
 fault-0.022
:-0.021
 best-0.021
 okay-0.020
Token all
Token ID477
Feature activation+2.29e-02

Pos logit contributions
Fl+0.066
ts+0.064
 blankets+0.063
Moon+0.063
 pans+0.063
Neg logit contributions
 fault-0.058
sed-0.057
:-0.053
 okay-0.051
ope-0.050
Token day
Token ID1110
Feature activation+7.49e-04

Pos logit contributions
 conclusions+0.050
Bear+0.049
 pans+0.049
Bird+0.048
Moon+0.048
Neg logit contributions
:-0.029
 about-0.028
 fault-0.025
 happened-0.022
 enough-0.022
Token.
Token ID13
Feature activation+1.47e-02

Pos logit contributions
Fl+0.001
 conclusions+0.001
Moon+0.001
eds+0.001
Bee+0.001
Neg logit contributions
:-0.001
 fault-0.001
sed-0.001
 okay-0.001
 enough-0.001
TokenHelp
Token ID22087
Feature activation+4.51e-03

Pos logit contributions
ie+0.066
 blankets+0.060
eds+0.058
 outer+0.057
ations+0.057
Neg logit contributions
 okay-0.048
 No-0.046
 OK-0.045
 fault-0.043
ope-0.043
Token me
Token ID502
Feature activation-2.85e-02

Pos logit contributions
Fl+0.008
Moon+0.008
 conclusions+0.008
eds+0.008
Bee+0.008
Neg logit contributions
:-0.007
 fault-0.007
sed-0.006
 okay-0.006
 enough-0.006
Token,
Token ID11
Feature activation-3.09e-03

Pos logit contributions
 fault+0.051
sed+0.050
 ending+0.045
joy+0.045
 intentions+0.045
Neg logit contributions
eds-0.041
Fl-0.040
ie-0.039
 myself-0.038
zip-0.038
Token Tim
Token ID5045
Feature activation-1.48e-02

Pos logit contributions
:+0.004
sed+0.004
 fault+0.004
 okay+0.004
 enough+0.004
Neg logit contributions
eds-0.006
Fl-0.006
Moon-0.006
 conclusions-0.006
Bee-0.006
Token!"
Token ID2474
Feature activation-2.44e-02

Pos logit contributions
 fault+0.025
sed+0.025
:+0.023
apped+0.022
 okay+0.022
Neg logit contributions
Fl-0.023
eds-0.023
ie-0.022
Bee-0.022
ts-0.021
Token
Token ID198
Feature activation+2.12e-02

Pos logit contributions
 fault+0.039
sed+0.038
:+0.035
 enough+0.034
 okay+0.034
Neg logit contributions
Fl-0.041
Bird-0.039
Bear-0.039
 conclusions-0.039
Bee-0.039
Token
Token ID198
Feature activation+4.59e-02

Pos logit contributions
 conclusions+0.044
Fl+0.044
ts+0.044
 pans+0.044
Moon+0.044
Neg logit contributions
:-0.025
 fault-0.022
sed-0.022
 okay-0.021
ope-0.019
TokenTim
Token ID14967
Feature activation+9.74e-03

Pos logit contributions
 conclusions+0.062
Moon+0.061
eds+0.061
ts+0.061
 outer+0.061
Neg logit contributions
:-0.086
 okay-0.079
sed-0.078
 fault-0.075
ope-0.075
Token tries
Token ID8404
Feature activation-2.78e-02

Pos logit contributions
Fl+0.016
eds+0.015
Moon+0.015
ts+0.015
 conclusions+0.015
Neg logit contributions
 fault-0.016
sed-0.015
:-0.015
 okay-0.015
 enough-0.014
Token to
Token ID284
Feature activation+3.44e-02

Pos logit contributions
sed+0.040
apped+0.036
ope+0.036
 fault+0.035
oll+0.035
Neg logit contributions
Fl-0.051
 blankets-0.049
eds-0.049
 conclusions-0.048
zip-0.048
Token help
Token ID1037
Feature activation+3.59e-03

Pos logit contributions
eds+0.061
iflower+0.058
 pans+0.057
ations+0.056
Moon+0.055
Neg logit contributions
 okay-0.054
:-0.050
 enough-0.049
 OK-0.049
sed-0.047
Token with
Token ID351
Feature activation+4.47e-02

Pos logit contributions
:+0.032
 fault+0.030
sed+0.028
 okay+0.028
 enough+0.028
Neg logit contributions
Fl-0.054
eds-0.054
Moon-0.054
Bee-0.054
 conclusions-0.053
Token the
Token ID262
Feature activation+2.65e-03

Pos logit contributions
Brown+0.086
Bird+0.085
Moon+0.085
Bee+0.085
Fl+0.083
Neg logit contributions
 fault-0.069
 enough-0.065
:-0.064
 okay-0.057
 about-0.057
Token chain
Token ID6333
Feature activation-2.20e-03

Pos logit contributions
Fl+0.004
 conclusions+0.004
eds+0.004
Moon+0.004
Bee+0.004
Neg logit contributions
 fault-0.005
:-0.005
sed-0.005
 okay-0.004
 enough-0.004
Token.
Token ID13
Feature activation-2.47e-02

Pos logit contributions
 fault+0.002
:+0.002
sed+0.002
 enough+0.002
 okay+0.002
Neg logit contributions
Fl-0.005
Moon-0.005
 conclusions-0.005
Bee-0.005
eds-0.005
Token
Token ID198
Feature activation+3.13e-02

Pos logit contributions
 fault+0.038
sed+0.037
:+0.035
 enough+0.034
 okay+0.034
Neg logit contributions
Fl-0.043
Moon-0.042
Bear-0.041
 conclusions-0.041
Bee-0.040
Token
Token ID198
Feature activation+5.88e-02

Pos logit contributions
 pans+0.071
 conclusions+0.070
eds+0.070
ts+0.070
Fl+0.069
Neg logit contributions
:-0.034
 fault-0.028
 okay-0.026
 about-0.026
sed-0.025
TokenBut
Token ID1537
Feature activation+3.85e-03

Pos logit contributions
eds+0.099
 blankets+0.097
ie+0.096
 conclusions+0.095
 wax+0.092
Neg logit contributions
:-0.095
 okay-0.087
ope-0.078
apping-0.077
:"-0.076
Token Mom
Token ID11254
Feature activation-5.54e-03

Pos logit contributions
Fl+0.006
Moon+0.005
 conclusions+0.005
eds+0.005
ts+0.005
Neg logit contributions
 fault-0.007
sed-0.007
:-0.007
 okay-0.006
 enough-0.006
Token does
Token ID857
Feature activation-2.59e-02

Pos logit contributions
 fault+0.009
sed+0.009
 okay+0.008
:+0.008
apped+0.008
Neg logit contributions
eds-0.009
Fl-0.009
ts-0.009
 pans-0.009
 blankets-0.009
Token not
Token ID407
Feature activation-1.61e-02

Pos logit contributions
sed+0.032
 fault+0.031
apped+0.025
joy+0.025
:+0.025
Neg logit contributions
ts-0.053
Fl-0.053
 pans-0.052
 conclusions-0.052
 myself-0.051
Token like
Token ID588
Feature activation+1.43e-02

Pos logit contributions
sed+0.031
 fault+0.030
:+0.027
apped+0.027
ield+0.027
Neg logit contributions
 conclusions-0.021
 pans-0.021
ts-0.021
eds-0.021
 myself-0.021
Token It
Token ID632
Feature activation+1.55e-02

Pos logit contributions
Fl+0.012
Moon+0.012
 conclusions+0.011
Bee+0.011
Bear+0.011
Neg logit contributions
 fault-0.008
sed-0.007
:-0.007
 okay-0.007
 enough-0.007
Token was
Token ID373
Feature activation+1.23e-03

Pos logit contributions
 conclusions+0.031
Fl+0.031
 pans+0.030
eds+0.030
Moon+0.030
Neg logit contributions
:-0.020
sed-0.019
 fault-0.018
 happened-0.017
 okay-0.017
Token a
Token ID257
Feature activation+1.33e-02

Pos logit contributions
Fl+0.002
 blankets+0.002
 pans+0.002
Moon+0.002
ts+0.002
Neg logit contributions
 fault-0.002
sed-0.002
:-0.002
joy-0.002
 okay-0.002
Token big
Token ID1263
Feature activation+3.61e-03

Pos logit contributions
Fl+0.020
eds+0.020
 conclusions+0.020
Moon+0.020
Bee+0.019
Neg logit contributions
 fault-0.023
:-0.022
sed-0.022
 okay-0.021
 enough-0.020
Token flag
Token ID6056
Feature activation+1.89e-02

Pos logit contributions
Fl+0.005
 pans+0.005
 conclusions+0.005
Moon+0.005
eds+0.005
Neg logit contributions
 fault-0.006
sed-0.006
:-0.006
 okay-0.006
 enough-0.006
Token,
Token ID11
Feature activation+2.90e-02

Pos logit contributions
Fl+0.038
 conclusions+0.038
Moon+0.037
eds+0.037
Bee+0.037
Neg logit contributions
:-0.023
 fault-0.023
sed-0.022
 okay-0.020
 enough-0.020
Token with
Token ID351
Feature activation+4.88e-02

Pos logit contributions
Fl+0.048
eds+0.046
Moon+0.046
 conclusions+0.046
Bee+0.046
Neg logit contributions
 fault-0.046
:-0.045
sed-0.045
 okay-0.043
 enough-0.042
Token bright
Token ID6016
Feature activation+3.38e-02

Pos logit contributions
Fl+0.091
 conclusions+0.089
Bee+0.086
eds+0.086
Bear+0.085
Neg logit contributions
:-0.073
 enough-0.068
sed-0.067
 fault-0.067
 okay-0.063
Token colors
Token ID7577
Feature activation+1.40e-02

Pos logit contributions
Fl+0.038
eds+0.037
Moon+0.037
 conclusions+0.037
 pans+0.037
Neg logit contributions
 fault-0.071
sed-0.069
:-0.069
 okay-0.067
 enough-0.065
Token.
Token ID13
Feature activation-1.13e-03

Pos logit contributions
 blankets+0.022
eds+0.021
 pans+0.021
ts+0.021
ie+0.021
Neg logit contributions
sed-0.023
 fault-0.023
 okay-0.022
 ok-0.021
ope-0.021
Token It
Token ID632
Feature activation+1.10e-02

Pos logit contributions
:+0.001
 fault+0.001
sed+0.001
 okay+0.001
 enough+0.001
Neg logit contributions
eds-0.002
Fl-0.002
 conclusions-0.002
 pans-0.002
ts-0.002
Token the
Token ID262
Feature activation-1.79e-02

Pos logit contributions
 fault+0.030
sed+0.030
:+0.029
 okay+0.027
 enough+0.026
Neg logit contributions
Fl-0.039
Moon-0.038
 conclusions-0.038
eds-0.038
ts-0.038
Token best
Token ID1266
Feature activation-2.85e-02

Pos logit contributions
 fault+0.009
:+0.008
sed+0.007
 okay+0.006
 enough+0.006
Neg logit contributions
Fl-0.050
eds-0.049
 conclusions-0.049
Moon-0.049
 pans-0.049
Token of
Token ID286
Feature activation+1.63e-03

Pos logit contributions
 fault+0.040
sed+0.040
:+0.039
 okay+0.037
 enough+0.035
Neg logit contributions
Fl-0.052
eds-0.050
 conclusions-0.050
Bee-0.050
Moon-0.050
Token friends
Token ID2460
Feature activation-3.03e-02

Pos logit contributions
Fl+0.003
eds+0.003
 conclusions+0.003
Moon+0.003
ts+0.003
Neg logit contributions
 fault-0.002
:-0.002
sed-0.002
 okay-0.002
 enough-0.002
Token and
Token ID290
Feature activation+2.99e-02

Pos logit contributions
sed+0.049
 fault+0.049
 okay+0.043
ope+0.043
owl+0.041
Neg logit contributions
ts-0.047
Fl-0.047
eds-0.047
 blankets-0.047
ie-0.046
Token played
Token ID2826
Feature activation+1.57e-02

Pos logit contributions
Fl+0.054
 conclusions+0.053
eds+0.053
Moon+0.053
 pans+0.052
Neg logit contributions
:-0.043
 fault-0.042
sed-0.042
 okay-0.039
 enough-0.039
Token together
Token ID1978
Feature activation+2.18e-02

Pos logit contributions
Fl+0.023
Moon+0.022
eds+0.022
 pans+0.022
Bee+0.022
Neg logit contributions
:-0.028
 fault-0.027
sed-0.027
 okay-0.027
 enough-0.026
Token every
Token ID790
Feature activation-4.80e-03

Pos logit contributions
Fl+0.038
 pans+0.037
Moon+0.037
eds+0.036
Bee+0.036
Neg logit contributions
:-0.034
 fault-0.033
sed-0.032
 enough-0.031
 okay-0.030
Token day
Token ID1110
Feature activation-1.45e-02

Pos logit contributions
:+0.006
 enough+0.005
 ending+0.005
 fault+0.005
 about+0.004
Neg logit contributions
ts-0.011
 pans-0.011
Brown-0.010
Fl-0.010
zip-0.010
Token.
Token ID13
Feature activation-1.77e-02

Pos logit contributions
 fault+0.019
sed+0.018
:+0.017
 okay+0.016
 enough+0.016
Neg logit contributions
Fl-0.028
eds-0.028
 conclusions-0.028
ts-0.028
Moon-0.027
Token Peter
Token ID5613
Feature activation+1.33e-02

Pos logit contributions
 fault+0.022
:+0.022
sed+0.019
 okay+0.019
 enough+0.018
Neg logit contributions
Fl-0.037
ts-0.036
 pans-0.036
eds-0.036
 conclusions-0.035
Token his
Token ID465
Feature activation-2.49e-02

Pos logit contributions
Moon+0.027
Fl+0.027
Bee+0.027
Bird+0.026
Bear+0.026
Neg logit contributions
 fault-0.032
:-0.031
 enough-0.030
 okay-0.029
 about-0.029
Token way
Token ID835
Feature activation-6.91e-04

Pos logit contributions
 fault+0.042
sed+0.041
:+0.039
 okay+0.037
 enough+0.036
Neg logit contributions
Moon-0.042
Fl-0.041
Bee-0.040
Bird-0.040
Bear-0.040
Token and
Token ID290
Feature activation+2.11e-02

Pos logit contributions
 fault+0.001
:+0.001
sed+0.001
 okay+0.001
 enough+0.001
Neg logit contributions
Fl-0.001
Moon-0.001
ts-0.001
Bee-0.001
 conclusions-0.001
Token could
Token ID714
Feature activation+6.46e-03

Pos logit contributions
Fl+0.040
Moon+0.039
 conclusions+0.039
Bee+0.039
eds+0.038
Neg logit contributions
 fault-0.029
:-0.029
sed-0.028
 okay-0.027
 enough-0.026
Token not
Token ID407
Feature activation+1.36e-02

Pos logit contributions
Fl+0.014
eds+0.014
 pans+0.014
 conclusions+0.014
Moon+0.014
Neg logit contributions
 fault-0.007
:-0.007
sed-0.006
 okay-0.006
 enough-0.006
Token find
Token ID1064
Feature activation+1.62e-02

Pos logit contributions
 pans+0.021
Fl+0.021
eds+0.021
Moon+0.020
 blankets+0.020
Neg logit contributions
:-0.024
 okay-0.023
 fault-0.023
sed-0.022
 enough-0.021
Token his
Token ID465
Feature activation-2.79e-02

Pos logit contributions
Bee+0.024
Fl+0.024
 pans+0.024
eds+0.024
Moon+0.023
Neg logit contributions
 fault-0.029
 enough-0.029
:-0.028
 okay-0.027
sed-0.026
Token family
Token ID1641
Feature activation+1.35e-02

Pos logit contributions
 fault+0.045
:+0.044
sed+0.044
 okay+0.041
 enough+0.040
Neg logit contributions
Fl-0.046
Moon-0.045
eds-0.044
 conclusions-0.044
Bee-0.044
Token.
Token ID13
Feature activation-9.38e-03

Pos logit contributions
Fl+0.027
eds+0.026
 conclusions+0.026
Moon+0.026
ts+0.026
Neg logit contributions
 fault-0.017
sed-0.016
:-0.016
 okay-0.015
 enough-0.014
Token The
Token ID383
Feature activation-1.06e-02

Pos logit contributions
 fault+0.012
sed+0.012
:+0.012
 okay+0.011
 enough+0.011
Neg logit contributions
Fl-0.018
Moon-0.018
 conclusions-0.018
Bear-0.018
Bee-0.017
Token little
Token ID1310
Feature activation-4.12e-03

Pos logit contributions
sed+0.016
 fault+0.016
:+0.016
 okay+0.015
 ending+0.015
Neg logit contributions
eds-0.019
Fl-0.019
zip-0.019
ts-0.018
Moon-0.018
Token make
Token ID787
Feature activation+1.63e-02

Pos logit contributions
Fl+0.037
Moon+0.036
eds+0.035
Bee+0.035
Bird+0.034
Neg logit contributions
:-0.048
 fault-0.047
 okay-0.046
sed-0.044
 enough-0.044
Token it
Token ID340
Feature activation-8.83e-03

Pos logit contributions
Fl+0.027
Bee+0.026
Bird+0.026
Brown+0.026
Moon+0.026
Neg logit contributions
 fault-0.028
:-0.026
 enough-0.026
 okay-0.026
 about-0.024
Token brilliant
Token ID10457
Feature activation-2.16e-03

Pos logit contributions
 fault+0.014
:+0.014
 okay+0.014
sed+0.013
 enough+0.013
Neg logit contributions
Fl-0.015
Moon-0.014
eds-0.014
Bird-0.014
Bee-0.014
Token again
Token ID757
Feature activation-1.55e-02

Pos logit contributions
 fault+0.003
sed+0.003
:+0.003
 okay+0.003
 enough+0.003
Neg logit contributions
Fl-0.004
 conclusions-0.004
eds-0.004
 pans-0.004
Moon-0.004
Token.
Token ID13
Feature activation-2.35e-02

Pos logit contributions
 fault+0.019
:+0.018
sed+0.018
 okay+0.017
 enough+0.016
Neg logit contributions
Fl-0.031
Moon-0.030
eds-0.030
 conclusions-0.030
Bee-0.030
Token<|endoftext|>
Token ID50256
Feature activation-4.30e-03

Pos logit contributions
sed+0.043
 fault+0.040
 enough+0.038
 okay+0.037
 intentions+0.036
Neg logit contributions
Fl-0.034
Brown-0.033
Moon-0.033
Bear-0.032
Ruby-0.032
TokenOnce
Token ID7454
Feature activation-1.62e-02

Pos logit contributions
 fault+0.007
sed+0.007
:+0.006
 okay+0.006
 enough+0.006
Neg logit contributions
Fl-0.007
Moon-0.007
eds-0.007
Bee-0.007
 conclusions-0.007
Token upon
Token ID2402
Feature activation-6.08e-03

Pos logit contributions
sed+0.032
 fault+0.031
 okay+0.029
:+0.028
ield+0.028
Neg logit contributions
ie-0.021
eds-0.019
Fl-0.019
Moon-0.019
Bee-0.019
Token a
Token ID257
Feature activation-5.00e-02

Pos logit contributions
 okay+0.005
:+0.005
 about+0.005
 happened+0.005
 enough+0.004
Neg logit contributions
eds-0.015
 pans-0.015
 conclusions-0.014
 forth-0.014
ations-0.014
Token time
Token ID640
Feature activation-5.60e-02

Pos logit contributions
sed+0.070
 fault+0.070
:+0.066
apped+0.062
icked+0.062
Neg logit contributions
Fl-0.093
Bee-0.088
Buzz-0.087
Ruby-0.087
Bird-0.086
Token,
Token ID11
Feature activation-2.34e-03

Pos logit contributions
 fault+0.073
sed+0.070
ige+0.064
 intentions+0.062
owl+0.061
Neg logit contributions
Fl-0.112
ie-0.106
Bee-0.103
George-0.103
Ruby-0.102
Token the
Token ID262
Feature activation+8.61e-03

Pos logit contributions
Moon+0.033
eds+0.033
Fl+0.033
Bee+0.033
 pans+0.033
Neg logit contributions
:-0.038
 fault-0.038
sed-0.036
 enough-0.036
 okay-0.035
Token wolf
Token ID17481
Feature activation+1.74e-02

Pos logit contributions
irs+0.012
zip+0.012
ations+0.012
Moon+0.011
|+0.011
Neg logit contributions
sed-0.017
 fault-0.016
:-0.016
 okay-0.015
 enough-0.015
Token.
Token ID13
Feature activation-1.70e-02

Pos logit contributions
Moon+0.040
 pans+0.040
Brown+0.040
ts+0.039
 conclusions+0.039
Neg logit contributions
:-0.019
 fault-0.016
 enough-0.015
 okay-0.015
sed-0.015
Token He
Token ID679
Feature activation+4.14e-03

Pos logit contributions
 fault+0.026
sed+0.024
:+0.023
 happened+0.022
 okay+0.022
Neg logit contributions
Fl-0.030
Moon-0.029
Bird-0.029
Bear-0.029
Bee-0.029
Token kicked
Token ID12165
Feature activation+9.31e-03

Pos logit contributions
 conclusions+0.008
eds+0.008
Moon+0.008
 pans+0.008
Fl+0.008
Neg logit contributions
sed-0.005
:-0.005
 fault-0.005
 enough-0.005
 happened-0.005
Token him
Token ID683
Feature activation+3.55e-03

Pos logit contributions
Moon+0.014
eds+0.014
 conclusions+0.014
 pans+0.014
Bee+0.014
Neg logit contributions
:-0.016
 fault-0.016
 enough-0.015
sed-0.015
 okay-0.015
Token with
Token ID351
Feature activation+3.58e-02

Pos logit contributions
Moon+0.007
 pans+0.007
 conclusions+0.007
Fl+0.007
Bee+0.007
Neg logit contributions
:-0.005
 fault-0.005
sed-0.004
 okay-0.004
 enough-0.004
Token his
Token ID465
Feature activation-1.45e-02

Pos logit contributions
Fl+0.048
Moon+0.047
eds+0.047
 conclusions+0.046
 pans+0.046
Neg logit contributions
 fault-0.068
:-0.068
sed-0.067
 enough-0.065
 okay-0.065
Token ho
Token ID8169
Feature activation+1.89e-02

Pos logit contributions
 fault+0.024
:+0.024
sed+0.023
 okay+0.022
 enough+0.022
Neg logit contributions
Fl-0.023
eds-0.022
 pans-0.022
 conclusions-0.022
Moon-0.022
Tokenoves
Token ID5241
Feature activation+5.92e-03

Pos logit contributions
Fl+0.020
Moon+0.019
 conclusions+0.019
 pans+0.019
ts+0.019
Neg logit contributions
 fault-0.041
:-0.041
sed-0.040
 okay-0.039
 enough-0.039
Token.
Token ID13
Feature activation-2.22e-02

Pos logit contributions
Fl+0.012
eds+0.011
 conclusions+0.011
Bee+0.011
Moon+0.011
Neg logit contributions
 fault-0.007
sed-0.007
:-0.007
 okay-0.007
 enough-0.006
Token to
Token ID284
Feature activation+3.09e-02

Pos logit contributions
sed+0.058
 fault+0.057
apped+0.051
ope+0.051
ver+0.050
Neg logit contributions
 blankets-0.061
Fl-0.059
 pans-0.057
 rock-0.056
eds-0.056
Token touch
Token ID3638
Feature activation+1.04e-02

Pos logit contributions
Fl+0.059
eds+0.059
 pans+0.058
Brown+0.058
Bear+0.058
Neg logit contributions
:-0.043
sed-0.042
 fault-0.041
 okay-0.040
 enough-0.038
Token the
Token ID262
Feature activation+1.40e-02

Pos logit contributions
Brown+0.018
Moon+0.016
|+0.016
Bee+0.016
Fl+0.016
Neg logit contributions
:-0.018
 enough-0.018
 fault-0.017
 okay-0.016
 about-0.016
Token slide
Token ID10649
Feature activation-6.49e-03

Pos logit contributions
Brown+0.022
Fl+0.022
Moon+0.022
zip+0.022
Pen+0.022
Neg logit contributions
sed-0.025
 fault-0.025
 enough-0.023
:-0.023
 ending-0.022
Token.
Token ID13
Feature activation-2.82e-02

Pos logit contributions
 fault+0.008
sed+0.008
:+0.008
 okay+0.007
 enough+0.007
Neg logit contributions
Fl-0.013
 conclusions-0.013
eds-0.013
Moon-0.012
ts-0.012
Token He
Token ID679
Feature activation+1.64e-03

Pos logit contributions
 fault+0.049
sed+0.048
 happened+0.046
:+0.044
 enough+0.044
Neg logit contributions
Fl-0.044
Bear-0.041
Moon-0.041
zip-0.040
Bee-0.040
Token left
Token ID1364
Feature activation+8.55e-03

Pos logit contributions
Fl+0.003
ts+0.003
Moon+0.003
Bee+0.003
Bird+0.003
Neg logit contributions
 fault-0.002
:-0.002
sed-0.002
 okay-0.002
 enough-0.002
Token the
Token ID262
Feature activation+9.37e-03

Pos logit contributions
Fl+0.011
Moon+0.011
 conclusions+0.011
eds+0.011
Brown+0.010
Neg logit contributions
:-0.017
 fault-0.017
sed-0.016
 enough-0.016
 okay-0.016
Token box
Token ID3091
Feature activation+1.40e-02

Pos logit contributions
Moon+0.014
Bear+0.014
Bird+0.014
Bee+0.013
Brown+0.013
Neg logit contributions
 fault-0.018
:-0.017
sed-0.017
 enough-0.016
 ending-0.016
Token and
Token ID290
Feature activation+3.04e-02

Pos logit contributions
Fl+0.021
eds+0.021
Moon+0.021
 conclusions+0.021
Bee+0.020
Neg logit contributions
 fault-0.024
sed-0.023
:-0.023
 okay-0.022
 enough-0.022
Token walked
Token ID6807
Feature activation-9.42e-03

Pos logit contributions
 conclusions+0.057
eds+0.056
Brown+0.056
Fl+0.056
Bee+0.056
Neg logit contributions
sed-0.043
:-0.042
 enough-0.041
 fault-0.039
 okay-0.037
Token She
Token ID1375
Feature activation-5.80e-03

Pos logit contributions
:+0.026
 fault+0.024
sed+0.023
 okay+0.022
 about+0.022
Neg logit contributions
eds-0.035
ts-0.035
Fl-0.034
 conclusions-0.034
 pans-0.034
Token ran
Token ID4966
Feature activation-4.72e-03

Pos logit contributions
:+0.008
sed+0.008
 fault+0.007
 enough+0.007
 okay+0.007
Neg logit contributions
 conclusions-0.011
Fl-0.011
eds-0.011
Moon-0.011
Bee-0.011
Token back
Token ID736
Feature activation-2.52e-03

Pos logit contributions
:+0.007
 fault+0.007
sed+0.007
 enough+0.007
 okay+0.006
Neg logit contributions
Fl-0.008
 conclusions-0.008
Bee-0.008
Moon-0.008
Bear-0.008
Token in
Token ID287
Feature activation+1.15e-03

Pos logit contributions
:+0.003
 fault+0.003
sed+0.003
 enough+0.003
 about+0.002
Neg logit contributions
Fl-0.005
 conclusions-0.005
Moon-0.005
Bird-0.005
Bee-0.005
Token to
Token ID284
Feature activation+2.38e-02

Pos logit contributions
Fl+0.002
Moon+0.002
eds+0.002
 conclusions+0.002
Bee+0.001
Neg logit contributions
 fault-0.002
:-0.002
sed-0.002
 okay-0.002
 enough-0.002
Token show
Token ID905
Feature activation+5.03e-03

Pos logit contributions
Brown+0.039
Fl+0.039
Bear+0.038
Bee+0.037
 pans+0.037
Neg logit contributions
:-0.043
 fault-0.040
 about-0.039
 enough-0.038
sed-0.038
Token her
Token ID607
Feature activation-1.21e-02

Pos logit contributions
Fl+0.006
Bee+0.006
eds+0.006
Moon+0.006
Bear+0.006
Neg logit contributions
:-0.011
 fault-0.010
sed-0.010
 okay-0.010
 enough-0.010
Token mum
Token ID25682
Feature activation-2.17e-02

Pos logit contributions
 fault+0.014
:+0.014
sed+0.013
 okay+0.012
 enough+0.012
Neg logit contributions
Fl-0.026
Moon-0.025
Bird-0.025
Brown-0.025
eds-0.025
Token all
Token ID477
Feature activation+9.36e-04

Pos logit contributions
:+0.034
 fault+0.031
 enough+0.030
 okay+0.029
 about+0.029
Neg logit contributions
Fl-0.039
 conclusions-0.039
Moon-0.038
Bee-0.038
eds-0.038
Token the
Token ID262
Feature activation-5.41e-03

Pos logit contributions
Fl+0.002
Moon+0.002
Bird+0.002
Brown+0.002
Bear+0.002
Neg logit contributions
:-0.002
 fault-0.002
 about-0.002
 enough-0.001
 okay-0.001
Token stones
Token ID14966
Feature activation+1.57e-02

Pos logit contributions
 fault+0.009
:+0.009
 okay+0.008
 enough+0.008
sed+0.008
Neg logit contributions
Fl-0.009
Moon-0.009
Bird-0.009
Brown-0.009
eds-0.009
Token.
Token ID13
Feature activation+8.93e-03

Pos logit contributions
 fault+0.017
joy+0.016
sed+0.016
 okay+0.015
ver+0.015
Neg logit contributions
 blankets-0.016
Fl-0.015
ie-0.015
ts-0.015
 pans-0.015
Token He
Token ID679
Feature activation+1.21e-02

Pos logit contributions
Fl+0.017
Moon+0.017
Bear+0.017
Bird+0.017
Bee+0.017
Neg logit contributions
 fault-0.012
sed-0.011
:-0.011
 enough-0.010
 okay-0.010
Token followed
Token ID3940
Feature activation+2.48e-02

Pos logit contributions
 conclusions+0.027
eds+0.025
 myself+0.025
Fl+0.025
 outer+0.025
Neg logit contributions
:-0.015
sed-0.013
 happened-0.012
 okay-0.012
 enough-0.012
Token the
Token ID262
Feature activation+2.02e-02

Pos logit contributions
Fl+0.039
eds+0.039
Brown+0.038
 pans+0.038
ations+0.038
Neg logit contributions
:-0.045
 enough-0.041
 okay-0.040
 about-0.039
sed-0.038
Token aroma
Token ID31242
Feature activation+5.28e-03

Pos logit contributions
ts+0.027
zip+0.026
Moon+0.026
ations+0.026
|+0.026
Neg logit contributions
sed-0.039
:-0.039
 fault-0.036
 okay-0.036
 enough-0.035
Token until
Token ID1566
Feature activation+5.31e-03

Pos logit contributions
Fl+0.010
eds+0.009
Moon+0.009
 conclusions+0.009
Bee+0.009
Neg logit contributions
 fault-0.007
:-0.007
sed-0.007
 okay-0.007
 enough-0.006
Token he
Token ID339
Feature activation-1.08e-02

Pos logit contributions
Fl+0.008
eds+0.008
 conclusions+0.008
Brown+0.007
Moon+0.007
Neg logit contributions
:-0.009
sed-0.009
 fault-0.009
 okay-0.009
 enough-0.009
Token came
Token ID1625
Feature activation-1.89e-02

Pos logit contributions
 fault+0.019
sed+0.017
:+0.016
 okay+0.016
owl+0.016
Neg logit contributions
Fl-0.018
Buzz-0.017
Bee-0.017
Bird-0.017
Bear-0.017
Token to
Token ID284
Feature activation+2.54e-02

Pos logit contributions
sed+0.034
 fault+0.033
oll+0.030
oke+0.029
ses+0.028
Neg logit contributions
eds-0.028
Moon-0.027
Bear-0.027
Bee-0.027
 myself-0.027
Token a
Token ID257
Feature activation+2.93e-02

Pos logit contributions
eds+0.057
Fl+0.055
Brown+0.053
 pans+0.053
Ruby+0.053
Neg logit contributions
 fault-0.028
:-0.027
 okay-0.027
 enough-0.026
sed-0.025
Token clearing
Token ID17304
Feature activation+4.64e-03

Pos logit contributions
ts+0.046
eds+0.045
 myself+0.044
cking+0.044
zip+0.044
Neg logit contributions
:-0.051
 okay-0.045
sed-0.044
 enough-0.044
 fault-0.044
Token The
Token ID383
Feature activation+2.20e-02

Pos logit contributions
Fl+0.009
Bird+0.008
Moon+0.008
Bear+0.008
Bee+0.008
Neg logit contributions
 fault-0.008
sed-0.007
:-0.007
 okay-0.007
 enough-0.007
Token tree
Token ID5509
Feature activation+7.36e-03

Pos logit contributions
Fl+0.018
eds+0.018
 conclusions+0.017
ts+0.016
 pans+0.016
Neg logit contributions
:-0.053
 fault-0.053
sed-0.053
 okay-0.051
 enough-0.050
Token didn
Token ID1422
Feature activation+3.37e-03

Pos logit contributions
Fl+0.014
Moon+0.014
Bird+0.014
Bee+0.014
Buzz+0.014
Neg logit contributions
 fault-0.010
sed-0.009
:-0.008
 okay-0.008
 enough-0.008
Token't
Token ID470
Feature activation+1.97e-04

Pos logit contributions
Fl+0.005
eds+0.005
 conclusions+0.005
Moon+0.005
 pans+0.005
Neg logit contributions
:-0.006
 fault-0.006
sed-0.005
 okay-0.005
 enough-0.005
Token say
Token ID910
Feature activation-3.90e-02

Pos logit contributions
Fl+0.000
Bird+0.000
Moon+0.000
 conclusions+0.000
Bee+0.000
Neg logit contributions
 fault-0.000
sed-0.000
:-0.000
joy-0.000
 enough-0.000
Token anything
Token ID1997
Feature activation-3.70e-02

Pos logit contributions
sed+0.071
 fault+0.068
itted+0.066
apped+0.065
ope+0.064
Neg logit contributions
Fl-0.051
ts-0.051
 myself-0.050
Moon-0.048
 my-0.047
Token,
Token ID11
Feature activation-1.70e-02

Pos logit contributions
sed+0.056
 fault+0.052
apped+0.051
itted+0.047
ope+0.047
Neg logit contributions
Fl-0.066
ts-0.063
ie-0.061
Bear-0.059
 blankets-0.059
Token but
Token ID475
Feature activation-1.02e-02

Pos logit contributions
 fault+0.025
sed+0.024
:+0.023
 okay+0.021
apped+0.021
Neg logit contributions
Fl-0.031
Bird-0.029
ts-0.029
 conclusions-0.029
Bear-0.029
Token Spot
Token ID15899
Feature activation+1.43e-02

Pos logit contributions
 fault+0.018
sed+0.017
:+0.017
 okay+0.015
joy+0.015
Neg logit contributions
Fl-0.016
Bird-0.015
Moon-0.015
Buzz-0.015
Bee-0.015
Token knew
Token ID2993
Feature activation-2.19e-02

Pos logit contributions
Fl+0.025
Buzz+0.024
Bird+0.024
ie+0.023
ts+0.023
Neg logit contributions
 fault-0.024
sed-0.020
joy-0.020
owl-0.019
 intentions-0.019
Token the
Token ID262
Feature activation+1.53e-03

Pos logit contributions
sed+0.045
 fault+0.045
apped+0.041
itted+0.041
:+0.041
Neg logit contributions
Fl-0.027
ts-0.026
 blankets-0.024
ie-0.024
Moon-0.024
Token said
Token ID531
Feature activation-4.59e-02

Pos logit contributions
Fl+0.045
Moon+0.043
ts+0.043
 pans+0.043
Bee+0.043
Neg logit contributions
 fault-0.033
:-0.031
sed-0.031
 okay-0.029
 enough-0.028
Token,
Token ID11
Feature activation-2.06e-02

Pos logit contributions
 fault+0.066
sed+0.059
ige+0.055
ses+0.051
 seeker+0.051
Neg logit contributions
Fl-0.084
ts-0.080
 myself-0.078
Buzz-0.078
Bee-0.078
Token "
Token ID366
Feature activation+2.09e-02

Pos logit contributions
 fault+0.021
sed+0.017
 enough+0.016
 intentions+0.016
:+0.015
Neg logit contributions
Fl-0.048
Bear-0.046
Moon-0.046
Bird-0.046
Bee-0.046
TokenOK
Token ID11380
Feature activation-2.59e-02

Pos logit contributions
eds+0.041
ts+0.040
 conclusions+0.039
 blankets+0.039
ations+0.039
Neg logit contributions
sed-0.026
 okay-0.026
 fault-0.025
:-0.025
ope-0.023
Token.
Token ID13
Feature activation-2.35e-02

Pos logit contributions
sed+0.040
 fault+0.039
itted+0.037
apped+0.034
icked+0.034
Neg logit contributions
 myself-0.046
Fl-0.046
ie-0.046
 blankets-0.045
 my-0.045
Token I
Token ID314
Feature activation-4.20e-02

Pos logit contributions
 fault+0.039
:+0.038
 enough+0.036
sed+0.036
 happened+0.035
Neg logit contributions
Fl-0.040
Moon-0.037
Bee-0.037
Bear-0.037
Bird-0.036
Token love
Token ID1842
Feature activation+1.06e-02

Pos logit contributions
sed+0.081
:+0.076
 fault+0.076
joy+0.075
apped+0.072
Neg logit contributions
Fl-0.056
Bird-0.052
Bee-0.052
Bear-0.052
zip-0.050
Token you
Token ID345
Feature activation-9.58e-03

Pos logit contributions
Fl+0.016
Moon+0.016
Bee+0.016
 conclusions+0.016
Bird+0.016
Neg logit contributions
:-0.020
 enough-0.018
 fault-0.018
sed-0.018
 okay-0.017
Token too
Token ID1165
Feature activation-3.35e-02

Pos logit contributions
 fault+0.013
sed+0.013
:+0.012
 happened+0.012
joy+0.011
Neg logit contributions
Fl-0.018
 blankets-0.018
eds-0.017
Bear-0.017
 pans-0.017
Token.
Token ID13
Feature activation-2.40e-02

Pos logit contributions
sed+0.051
 fault+0.050
 happened+0.045
apped+0.044
 intentions+0.043
Neg logit contributions
Fl-0.057
 blankets-0.056
 myself-0.055
 pans-0.055
eds-0.054
Token Let
Token ID3914
Feature activation-6.88e-03

Pos logit contributions
 fault+0.036
:+0.035
sed+0.034
 enough+0.033
 happened+0.033
Neg logit contributions
Fl-0.044
Moon-0.041
Bear-0.041
Bird-0.041
Bee-0.041
Token "
Token ID366
Feature activation+2.37e-02

Pos logit contributions
 fault+0.040
sed+0.034
 intentions+0.030
 enough+0.030
owl+0.028
Neg logit contributions
Fl-0.075
 pans-0.072
Bear-0.072
Bird-0.071
Moon-0.071
TokenYou
Token ID1639
Feature activation-3.58e-02

Pos logit contributions
ts+0.043
eds+0.041
 blankets+0.040
 conclusions+0.039
 rubble+0.039
Neg logit contributions
 okay-0.034
:-0.034
 fault-0.032
joy-0.032
ope-0.032
Token're
Token ID821
Feature activation-3.69e-02

Pos logit contributions
sed+0.056
 fault+0.054
:+0.053
joy+0.050
 ending+0.046
Neg logit contributions
ts-0.058
Fl-0.058
Bear-0.057
Moon-0.057
 blankets-0.056
Token welcome
Token ID7062
Feature activation-3.29e-02

Pos logit contributions
sed+0.052
 fault+0.051
:+0.046
apped+0.045
ield+0.045
Neg logit contributions
 myself-0.067
Fl-0.066
ts-0.065
 pans-0.065
 blankets-0.064
Token,
Token ID11
Feature activation-1.40e-02

Pos logit contributions
 fault+0.047
sed+0.046
itted+0.040
ses+0.038
apped+0.038
Neg logit contributions
Fl-0.061
 pans-0.060
 myself-0.060
ie-0.058
 blankets-0.058
Token my
Token ID616
Feature activation-3.62e-02

Pos logit contributions
 fault+0.020
:+0.020
sed+0.019
 okay+0.018
 enough+0.018
Neg logit contributions
Fl-0.025
eds-0.025
Moon-0.025
 conclusions-0.025
Bee-0.024
Token sweet
Token ID6029
Feature activation-1.20e-02

Pos logit contributions
 fault+0.052
:+0.051
sed+0.051
 okay+0.047
 enough+0.047
Neg logit contributions
Fl-0.064
 conclusions-0.062
Moon-0.062
eds-0.062
 pans-0.062
Tokenies
Token ID444
Feature activation-2.25e-03

Pos logit contributions
 fault+0.022
sed+0.021
:+0.021
 okay+0.020
 enough+0.019
Neg logit contributions
Fl-0.017
 conclusions-0.016
eds-0.016
 pans-0.016
Bee-0.016
Token.
Token ID13
Feature activation-1.90e-02

Pos logit contributions
 fault+0.003
:+0.003
sed+0.003
 okay+0.003
 enough+0.003
Neg logit contributions
Fl-0.004
Moon-0.004
 conclusions-0.004
eds-0.004
Bee-0.004
Token Now
Token ID2735
Feature activation-2.71e-02

Pos logit contributions
 fault+0.029
sed+0.028
:+0.027
 enough+0.026
 okay+0.025
Neg logit contributions
Fl-0.034
Moon-0.032
Bee-0.032
Bear-0.032
Bird-0.031
Token let
Token ID1309
Feature activation+1.06e-02

Pos logit contributions
 fault+0.048
sed+0.047
:+0.046
 okay+0.043
 enough+0.043
Neg logit contributions
Fl-0.040
Moon-0.039
 conclusions-0.039
eds-0.038
 pans-0.038
Token and
Token ID290
Feature activation+7.77e-02

Pos logit contributions
Fl+0.063
 conclusions+0.061
Moon+0.061
Bee+0.061
Bird+0.060
Neg logit contributions
:-0.044
 fault-0.042
sed-0.040
 enough-0.038
 okay-0.037
Token make
Token ID787
Feature activation+3.19e-02

Pos logit contributions
Fl+0.124
|+0.117
eds+0.116
Bird+0.115
Pen+0.113
Neg logit contributions
:-0.140
sed-0.135
 happened-0.126
 fault-0.126
 enough-0.123
Token things
Token ID1243
Feature activation+2.18e-02

Pos logit contributions
Fl+0.049
Brown+0.046
|+0.046
Bee+0.046
Bird+0.045
Neg logit contributions
:-0.062
 fault-0.061
 enough-0.059
 about-0.058
 okay-0.056
Token.
Token ID13
Feature activation-8.89e-03

Pos logit contributions
Fl+0.038
eds+0.037
Moon+0.037
Bee+0.036
 conclusions+0.036
Neg logit contributions
 fault-0.033
:-0.033
sed-0.031
 okay-0.031
 enough-0.031
Token One
Token ID1881
Feature activation-2.63e-02

Pos logit contributions
 fault+0.014
sed+0.014
 enough+0.013
:+0.013
 okay+0.012
Neg logit contributions
Fl-0.015
 conclusions-0.014
Moon-0.014
Bird-0.014
 blankets-0.014
Token day
Token ID1110
Feature activation-4.15e-02

Pos logit contributions
 fault+0.047
sed+0.046
:+0.045
 okay+0.043
 enough+0.042
Neg logit contributions
Fl-0.039
Moon-0.037
eds-0.037
 conclusions-0.037
Bee-0.037
Token,
Token ID11
Feature activation-2.75e-02

Pos logit contributions
 fault+0.077
sed+0.069
irk+0.065
owl+0.064
oll+0.062
Neg logit contributions
ie-0.061
Fl-0.061
 pans-0.060
 blankets-0.060
ts-0.060
Token they
Token ID484
Feature activation+4.41e-03

Pos logit contributions
 fault+0.057
sed+0.054
:+0.050
owl+0.050
ope+0.050
Neg logit contributions
 pans-0.034
 blankets-0.034
Fl-0.033
ts-0.032
zip-0.031
Token find
Token ID1064
Feature activation-2.25e-03

Pos logit contributions
Fl+0.008
eds+0.007
Moon+0.007
Bee+0.007
Brown+0.007
Neg logit contributions
:-0.007
 fault-0.007
sed-0.007
 okay-0.006
 enough-0.006
Token a
Token ID257
Feature activation+1.29e-02

Pos logit contributions
 fault+0.004
sed+0.004
 okay+0.004
joy+0.004
:+0.004
Neg logit contributions
 blankets-0.003
ts-0.003
Fl-0.003
 pans-0.003
Moon-0.003
Token toy
Token ID13373
Feature activation+2.96e-02

Pos logit contributions
Fl+0.021
 conclusions+0.020
eds+0.020
ts+0.020
Moon+0.020
Neg logit contributions
 fault-0.022
:-0.022
sed-0.021
 okay-0.020
 enough-0.020
Token\n
Token ID198
Feature activation+3.68e-02

Pos logit contributions
 fault+0.026
sed+0.025
:+0.025
 okay+0.023
 enough+0.023
Neg logit contributions
Fl-0.035
Moon-0.034
eds-0.034
 conclusions-0.034
Bee-0.034
Token\n
Token ID198
Feature activation+5.24e-02

Pos logit contributions
 pans+0.090
ts+0.089
eds+0.089
 forth+0.088
 blankets+0.087
Neg logit contributions
:-0.037
:"-0.025
 okay-0.024
ope-0.023
 about-0.023
TokenThe
Token ID464
Feature activation-6.60e-03

Pos logit contributions
eds+0.086
 pans+0.084
 blankets+0.084
ts+0.083
 wax+0.080
Neg logit contributions
:-0.091
 okay-0.080
:"-0.080
ope-0.079
apping-0.074
Token little
Token ID1310
Feature activation-6.33e-03

Pos logit contributions
:+0.010
sed+0.010
 fault+0.009
 okay+0.009
 ending+0.009
Neg logit contributions
eds-0.012
zip-0.011
Fl-0.011
 conclusions-0.011
ts-0.011
Token boy
Token ID2933
Feature activation-3.59e-03

Pos logit contributions
:+0.010
sed+0.010
 fault+0.010
 enough+0.010
joy+0.009
Neg logit contributions
eds-0.011
Fl-0.010
 conclusions-0.010
Brown-0.010
ts-0.010
Token smiled
Token ID13541
Feature activation-4.55e-02

Pos logit contributions
 fault+0.005
sed+0.005
:+0.005
 okay+0.005
 enough+0.005
Neg logit contributions
Fl-0.006
Moon-0.006
eds-0.006
 conclusions-0.006
Bee-0.006
Token brightly
Token ID35254
Feature activation-1.49e-02

Pos logit contributions
 fault+0.093
sed+0.091
 happened+0.082
ses+0.080
oke+0.080
Neg logit contributions
ie-0.053
 blankets-0.053
bing-0.051
 myself-0.050
 outer-0.049
Token,
Token ID11
Feature activation-1.40e-02

Pos logit contributions
 fault+0.024
sed+0.022
 okay+0.021
:+0.020
ope+0.019
Neg logit contributions
Fl-0.026
ts-0.025
Moon-0.025
 pans-0.024
 blankets-0.024
Token feeling
Token ID4203
Feature activation-2.67e-02

Pos logit contributions
 fault+0.019
:+0.019
sed+0.019
 okay+0.017
 enough+0.017
Neg logit contributions
Fl-0.026
 conclusions-0.026
eds-0.026
Moon-0.026
Bee-0.025
Token satisfied
Token ID11378
Feature activation-4.15e-02

Pos logit contributions
 fault+0.050
sed+0.050
ield+0.043
ope+0.042
ses+0.042
Neg logit contributions
Fl-0.039
Bear-0.038
Buzz-0.037
ts-0.037
Moon-0.036
Token and
Token ID290
Feature activation+1.38e-02

Pos logit contributions
 fault+0.074
sed+0.072
ield+0.062
ver+0.062
ses+0.062
Neg logit contributions
 blankets-0.061
ie-0.058
Fl-0.057
 myself-0.057
 pans-0.057