TOP ACTIVATIONS
MAX = 0.115

was very pleased with her honest and hard work . <|endoftext|>
in excitement . M um said to C eline
new hair brush on her long hair .
. But they did not have enough money to cover all
But Tom was too busy having fun and wanted to stay
life , with or without y ummy Ice - C ream
and fell onto the floor . He was feeling very ashamed
mom asked him why he had worked so hard to clean
store . Mama was looking for something and Ros y was
he remembered that his mom had some cleaning spray .
her blocks , but her brother wanted to play with them
left . M um and Sister were very
told Louise that the oil had run out very quickly .
to walk ." M um and Dad told Little
at her dad with an pleading expression on her face and
to prevent her mom from having to clean up the mud
smell and was filled with y ucky water .
She remembered how the lady had helped her and decided to
go outside ." M um and Dad smiled and
, Tommy 's trumpet breaks . He tries to blow ,

MIN ACTIVATIONS
MIN = -0.137

They watched the moon together until they fell asleep
. She would fly up and up , then come back
would fly over the hills and forests . One day ,
adventure . He watched the birds flying in the sky and
friends . They watch the birds and the flowers . They
Tom watched it go higher and higher until he could not
dad would watch the baby birds grow from a safe distance
up and down . The bird sits on the middle and
the other birds . The birds sing and thank the man
every night to watch the moon together . They became the
They watched the comet for a while , until
scary !" They swing higher and higher , pretending to fly
, Lily would visit the beetle every day and make sure
stars . She would watch them tw inkle in the sky
. He would make the rocket fly high in the sky
, I can make the moon go around the earth !"
the little girl visited the dragon every day and they would
. They would watch the planes take off and land .
dad . They would watch planes take off and land .
They both drew frogs and colored them . They

INTERVAL 0.052 - 0.115
CONTAINS 1.049%

, " No , we have to throw it away .
L ily was happy to have a friend to walk home
looked at each other and quickly ran towards it . The
iflower and Lily had a y ummy dinner together . And
it ’ s too far for you to go

INTERVAL -0.011 - 0.052
CONTAINS 68.127%

likes to print . She has a big box of paper
know what to do . It was so scary ! She
pay off . A few weeks later , Adam heard about
a cr ay on . She is very proud
they both began to cry . Suddenly , a

INTERVAL -0.074 - -0.011
CONTAINS 30.645%

be a reminder to listen . If you listen carefully ,
" Because ," the little boy said , " ang
He hoped he could play basketball and have fun . He
who seemed very friendly . He crou ched down
smiles and hugs to spread peace . Every day they reminded

INTERVAL -0.137 - -0.074
CONTAINS 0.179%

trumpet . She would play it every day and all her
. They make a big rainbow with all the colors .
" Do you want to come fly my k ite with
you can see more things . I have some slides with
he can keep flying up and exploring the sky . <|endoftext|>
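
The four interval headers above look like equal-width bins spanning MIN to MAX: each is 0.063 wide, and the CONTAINS percentages sum to 100. A minimal sketch of that reading follows. The helper name `equal_width_edges` is hypothetical, and equal-width binning is an assumption inferred from the printed numbers, not a documented property of the tool that produced this dump.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 edges of equal-width bins on [lo, hi], rounded to 3 decimals."""
    step = (hi - lo) / n_bins
    return [round(lo + i * step, 3) for i in range(n_bins + 1)]


# MIN and MAX as printed in the dump.
edges = equal_width_edges(-0.137, 0.115, 4)

# Pair consecutive edges into (low, high) intervals, lowest first.
intervals = list(zip(edges[:-1], edges[1:]))

# CONTAINS percentages as printed, reordered from lowest interval to highest.
shares = [0.179, 30.645, 68.127, 1.049]

for (low, high), share in zip(intervals, shares):
    print(f"INTERVAL {low} - {high}  CONTAINS {share}%")

print("total:", round(sum(shares), 3))
```

Under this reading, about 98.8% of activations fall in the two middle bins, so the top and bottom examples listed earlier are rare tail cases.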
Token was
Token ID373
Feature activation+2.61e-02

Pos logit contributions
 Sky+0.001
fly+0.001
 tweet+0.001
 Saturn+0.001
 globe+0.001
Neg logit contributions
gh-0.001
 purpose-0.001
array-0.001
 behaviour-0.001
 Sister-0.001
Token very
Token ID845
Feature activation+3.30e-02

Pos logit contributions
gh+0.061
 behaviour+0.058
 purpose+0.058
array+0.057
urance+0.056
Neg logit contributions
 Sky-0.030
fly-0.028
 higher-0.027
 below-0.026
 tweet-0.026
Token pleased
Token ID10607
Feature activation+1.34e-03

Pos logit contributions
gh+0.087
array+0.084
 purpose+0.082
urance+0.082
 behaviour+0.081
Neg logit contributions
 Sky-0.028
 higher-0.025
 together-0.025
fly-0.025
 below-0.024
Token with
Token ID351
Feature activation+2.08e-02

Pos logit contributions
gh+0.003
 purpose+0.003
array+0.003
 behaviour+0.003
 Sister+0.003
Neg logit contributions
 Sky-0.002
fly-0.002
 tweet-0.002
 globe-0.001
 Saturn-0.001
Token her
Token ID607
Feature activation+4.49e-02

Pos logit contributions
gh+0.043
array+0.040
 behaviour+0.038
 purpose+0.038
omm+0.037
Neg logit contributions
 Sky-0.033
fly-0.030
 tweet-0.028
 Saturn-0.027
 Jupiter-0.027
Token honest
Token ID5508
Feature activation+1.15e-01

Pos logit contributions
gh+0.095
array+0.086
David+0.083
scar+0.083
omm+0.081
Neg logit contributions
fly-0.059
 Sky-0.058
 together-0.058
 around-0.057
 tweet-0.057
Token and
Token ID290
Feature activation+3.93e-03

Pos logit contributions
gh+0.220
liv+0.197
uf+0.191
omm+0.191
scar+0.190
Neg logit contributions
 flap-0.149
 around-0.147
 for-0.143
 and-0.142
 fly-0.140
Token hard
Token ID1327
Feature activation+6.91e-02

Pos logit contributions
gh+0.007
 purpose+0.006
array+0.006
 behaviour+0.006
omm+0.006
Neg logit contributions
 Sky-0.007
fly-0.007
 tweet-0.006
 Saturn-0.006
 globe-0.006
Token work
Token ID670
Feature activation+5.99e-02

Pos logit contributions
gh+0.149
uf+0.146
omm+0.144
David+0.140
scar+0.133
Neg logit contributions
 around-0.084
 together-0.079
 down-0.078
 below-0.075
 on-0.073
Token.
Token ID13
Feature activation-7.37e-03

Pos logit contributions
uf+0.135
array+0.130
omm+0.128
 mischief+0.128
gh+0.128
Neg logit contributions
 together-0.074
 around-0.070
 Sky-0.068
 below-0.065
fly-0.064
Token<|endoftext|>
Token ID50256
Feature activation-2.47e-02

Pos logit contributions
fly+0.011
 Sky+0.011
 tweet+0.010
 globe+0.009
 higher+0.009
Neg logit contributions
gh-0.015
 purpose-0.014
omm-0.014
 Sister-0.014
array-0.014
Token in
Token ID287
Feature activation+3.50e-02

Pos logit contributions
 Sky+0.006
fly+0.006
 tweet+0.005
 Saturn+0.005
 globe+0.005
Neg logit contributions
gh-0.009
 purpose-0.008
array-0.008
 behaviour-0.008
 Sister-0.008
Token excitement
Token ID14067
Feature activation+4.44e-02

Pos logit contributions
gh+0.075
omm+0.071
scar+0.068
addy+0.066
array+0.066
Neg logit contributions
 Sky-0.051
 tweet-0.049
 below-0.047
fly-0.047
 higher-0.047
Token.
Token ID13
Feature activation+5.82e-02

Pos logit contributions
gh+0.095
array+0.091
omm+0.090
 mischief+0.088
scar+0.088
Neg logit contributions
fly-0.059
 Sky-0.059
 below-0.055
 together-0.052
 tweet-0.050
Token
Token ID198
Feature activation+3.55e-02

Pos logit contributions
gh+0.162
 purpose+0.146
 behaviour+0.146
 supposed+0.140
 consequences+0.139
Neg logit contributions
 Sky-0.043
 below-0.043
 down-0.039
 higher-0.039
als-0.038
Token
Token ID198
Feature activation+6.25e-02

Pos logit contributions
gh+0.093
 behaviour+0.092
array+0.090
urance+0.089
 purpose+0.089
Neg logit contributions
 Sky-0.031
fly-0.025
 Saturn-0.022
 below-0.022
 higher-0.021
TokenM
Token ID44
Feature activation+1.05e-01

Pos logit contributions
 behaviour+0.140
gh+0.140
urance+0.137
 purpose+0.133
 consequences+0.132
Neg logit contributions
 higher-0.066
 together-0.062
 Sky-0.062
 south-0.061
 below-0.058
Tokenum
Token ID388
Feature activation+9.54e-03

Pos logit contributions
gh+0.197
array+0.181
 purpose+0.178
 mischief+0.177
urance+0.177
Neg logit contributions
als-0.140
 below-0.131
ite-0.127
ocular-0.116
ls-0.114
Token said
Token ID531
Feature activation+3.85e-02

Pos logit contributions
gh+0.019
 purpose+0.018
array+0.017
 behaviour+0.017
urance+0.017
Neg logit contributions
fly-0.015
 Sky-0.015
 tweet-0.013
 below-0.012
 Saturn-0.012
Token to
Token ID284
Feature activation+7.47e-02

Pos logit contributions
array+0.089
gh+0.087
 purpose+0.085
urance+0.085
 mischief+0.084
Neg logit contributions
 Sky-0.048
 tweet-0.043
fly-0.042
als-0.038
 higher-0.038
Token C
Token ID327
Feature activation+4.05e-02

Pos logit contributions
gh+0.147
 purpose+0.136
array+0.134
 behaviour+0.134
 policy+0.130
Neg logit contributions
 tweet-0.100
 Sky-0.099
 Flor-0.095
 Saturn-0.094
 fly-0.093
Tokeneline
Token ID4470
Feature activation-9.52e-03

Pos logit contributions
gh+0.066
 purpose+0.060
 mischief+0.059
 punishment+0.059
 attempts+0.057
Neg logit contributions
fly-0.074
 below-0.072
 Sky-0.072
 globe-0.069
als-0.068
Token new
Token ID649
Feature activation+3.53e-02

Pos logit contributions
gh+0.069
array+0.067
urance+0.064
David+0.064
 behaviour+0.063
Neg logit contributions
 Sky-0.049
 tweet-0.044
fly-0.043
 flap-0.042
 higher-0.042
Token hair
Token ID4190
Feature activation+1.18e-02

Pos logit contributions
gh+0.060
array+0.057
urance+0.055
 forgotten+0.055
David+0.054
Neg logit contributions
 Sky-0.063
fly-0.059
 tweet-0.058
 flap-0.057
 higher-0.056
Tokenbrush
Token ID32680
Feature activation+2.20e-02

Pos logit contributions
gh+0.022
array+0.021
 purpose+0.021
 mischief+0.020
 Sister+0.020
Neg logit contributions
fly-0.019
 Sky-0.019
 tweet-0.017
 globe-0.016
 below-0.016
Token on
Token ID319
Feature activation+4.46e-02

Pos logit contributions
gh+0.043
omm+0.041
array+0.041
 mischief+0.040
 Sister+0.040
Neg logit contributions
fly-0.034
 Sky-0.033
 higher-0.029
 tweet-0.029
 below-0.029
Token her
Token ID607
Feature activation+3.73e-02

Pos logit contributions
 behaviour+0.110
gh+0.109
array+0.107
urance+0.106
scar+0.104
Neg logit contributions
 Sky-0.047
 Saturn-0.041
 Jupiter-0.036
 Bloom-0.036
 Mars-0.035
Token long
Token ID890
Feature activation+1.04e-01

Pos logit contributions
array+0.066
gh+0.065
David+0.063
scar+0.063
 Sister+0.060
Neg logit contributions
 Sky-0.062
fly-0.053
 flap-0.053
 tweet-0.052
 globe-0.052
Token hair
Token ID4190
Feature activation+2.45e-02

Pos logit contributions
Enough+0.226
scar+0.210
 supposed+0.208
David+0.208
udging+0.207
Neg logit contributions
 flap-0.127
 Sky-0.119
 fly-0.118
 soar-0.114
 tweet-0.107
Token.
Token ID13
Feature activation-2.40e-03

Pos logit contributions
gh+0.048
 mischief+0.045
array+0.045
 Sister+0.045
 behaviour+0.044
Neg logit contributions
fly-0.038
 Sky-0.037
 below-0.033
 globe-0.032
als-0.032
Token
Token ID220
Feature activation-2.94e-02

Pos logit contributions
fly+0.003
 Sky+0.003
 tweet+0.003
 globe+0.003
 Saturn+0.003
Neg logit contributions
gh-0.005
 purpose-0.005
array-0.005
 Sister-0.005
omm-0.005
Token
Token ID198
Feature activation-3.42e-02

Pos logit contributions
fly+0.050
 Sky+0.046
 tweet+0.045
flies+0.042
 globe+0.041
Neg logit contributions
gh-0.056
 purpose-0.051
omm-0.050
addy-0.050
em-0.049
Token
Token ID198
Feature activation+1.57e-02

Pos logit contributions
fly+0.066
 Sky+0.058
 tweet+0.058
flies+0.057
als+0.052
Neg logit contributions
gh-0.061
 purpose-0.055
addy-0.054
array-0.052
em-0.052
Token.
Token ID13
Feature activation+4.52e-02

Pos logit contributions
omm+0.213
addy+0.208
omo+0.201
innie+0.198
scar+0.196
Neg logit contributions
 together-0.091
 around-0.084
 below-0.078
 higher-0.075
 on-0.075
Token But
Token ID887
Feature activation+8.74e-03

Pos logit contributions
gh+0.087
 behaviour+0.072
illy+0.070
addy+0.069
 purpose+0.068
Neg logit contributions
 Sky-0.082
 below-0.073
fly-0.071
 Jupiter-0.070
 together-0.069
Token they
Token ID484
Feature activation+1.70e-02

Pos logit contributions
gh+0.013
 purpose+0.011
array+0.011
 behaviour+0.011
omm+0.011
Neg logit contributions
 Sky-0.018
fly-0.018
 tweet-0.017
 below-0.016
 globe-0.016
Token did
Token ID750
Feature activation+6.10e-02

Pos logit contributions
gh+0.033
 purpose+0.031
array+0.030
 mischief+0.029
omm+0.029
Neg logit contributions
fly-0.028
 Sky-0.027
 tweet-0.024
 below-0.024
 globe-0.023
Token not
Token ID407
Feature activation+6.06e-02

Pos logit contributions
omm+0.132
array+0.131
illy+0.130
kids+0.129
child+0.128
Neg logit contributions
 together-0.078
 higher-0.069
 Sky-0.066
 closer-0.062
 soar-0.061
Token have
Token ID423
Feature activation+1.01e-01

Pos logit contributions
omm+0.148
addy+0.140
gh+0.139
liv+0.137
innie+0.135
Neg logit contributions
fly-0.067
 fly-0.066
 flap-0.062
 soar-0.062
 tweet-0.061
Token enough
Token ID1576
Feature activation+7.88e-02

Pos logit contributions
omm+0.252
innie+0.236
addy+0.225
array+0.224
urance+0.221
Neg logit contributions
 together-0.104
 the-0.096
 on-0.092
 higher-0.091
 all-0.089
Token money
Token ID1637
Feature activation+5.98e-02

Pos logit contributions
gh+0.156
omm+0.150
scar+0.144
liv+0.144
innie+0.143
Neg logit contributions
 together-0.135
 below-0.123
 higher-0.118
 around-0.118
 on-0.117
Token to
Token ID284
Feature activation+8.03e-02

Pos logit contributions
array+0.137
gh+0.137
omm+0.137
 mischief+0.132
urance+0.131
Neg logit contributions
 below-0.063
 together-0.063
fly-0.062
 higher-0.061
 Sky-0.060
Token cover
Token ID3002
Feature activation+1.09e-02

Pos logit contributions
omm+0.182
gh+0.169
innie+0.167
addy+0.162
scar+0.160
Neg logit contributions
 fly-0.103
 soar-0.097
 flap-0.092
 below-0.091
 together-0.090
Token all
Token ID477
Feature activation+1.58e-02

Pos logit contributions
gh+0.024
array+0.022
 purpose+0.022
 behaviour+0.022
omm+0.022
Neg logit contributions
 Sky-0.014
fly-0.014
 tweet-0.012
 globe-0.012
 below-0.012
Token But
Token ID887
Feature activation+4.56e-02

Pos logit contributions
gh+0.204
 behaviour+0.189
 illness+0.184
oken+0.184
 supposed+0.181
Neg logit contributions
 down-0.068
 below-0.064
 higher-0.061
 together-0.060
 Sky-0.060
Token Tom
Token ID4186
Feature activation+6.85e-02

Pos logit contributions
gh+0.096
 mischief+0.091
 purpose+0.091
urance+0.091
array+0.088
Neg logit contributions
 Sky-0.058
 together-0.058
 higher-0.057
 down-0.054
 below-0.053
Token was
Token ID373
Feature activation+5.96e-02

Pos logit contributions
 purpose+0.146
 mischief+0.140
urance+0.138
array+0.134
gh+0.133
Neg logit contributions
 around-0.083
fly-0.083
bug-0.080
 Sky-0.080
 together-0.075
Token too
Token ID1165
Feature activation+7.70e-02

Pos logit contributions
 purpose+0.141
urance+0.141
 behaviour+0.140
gh+0.139
kids+0.138
Neg logit contributions
 around-0.064
 higher-0.061
 together-0.056
 on-0.056
 closer-0.055
Token busy
Token ID8179
Feature activation+8.38e-02

Pos logit contributions
gh+0.199
 occasions+0.194
urance+0.186
kids+0.180
child+0.170
Neg logit contributions
 together-0.076
 high-0.074
 higher-0.073
 below-0.072
 closer-0.071
Token having
Token ID1719
Feature activation+1.01e-01

Pos logit contributions
omm+0.183
uf+0.179
 Sister+0.176
gh+0.171
David+0.170
Neg logit contributions
 together-0.109
 around-0.107
 below-0.094
 higher-0.094
 on-0.093
Token fun
Token ID1257
Feature activation+7.84e-02

Pos logit contributions
uf+0.266
omm+0.257
urance+0.252
scar+0.249
ailed+0.248
Neg logit contributions
 together-0.084
 around-0.076
 higher-0.072
 on-0.069
 all-0.065
Token and
Token ID290
Feature activation+5.29e-02

Pos logit contributions
uf+0.195
 Sister+0.193
omm+0.190
David+0.185
child+0.181
Neg logit contributions
 together-0.095
 around-0.081
 on-0.067
 under-0.066
 higher-0.064
Token wanted
Token ID2227
Feature activation+6.60e-02

Pos logit contributions
gh+0.135
omm+0.133
uf+0.130
urance+0.129
 purpose+0.128
Neg logit contributions
 Sky-0.042
 together-0.040
 around-0.040
 higher-0.038
 down-0.035
Token to
Token ID284
Feature activation+6.62e-02

Pos logit contributions
urance+0.181
omm+0.175
ailed+0.174
uf+0.173
gh+0.172
Neg logit contributions
 around-0.048
 together-0.046
 higher-0.045
 closer-0.044
 Sky-0.041
Token stay
Token ID2652
Feature activation+3.76e-02

Pos logit contributions
omm+0.156
gh+0.156
kids+0.149
uf+0.148
child+0.148
Neg logit contributions
 fly-0.071
 soar-0.068
 flap-0.066
 Sky-0.057
 around-0.055
Token life
Token ID1204
Feature activation+1.35e-03

Pos logit contributions
gh+0.062
omm+0.061
array+0.060
urance+0.059
 behaviour+0.058
Neg logit contributions
 Sky-0.035
fly-0.032
 tweet-0.029
 below-0.029
 higher-0.029
Token,
Token ID11
Feature activation-2.03e-02

Pos logit contributions
gh+0.003
 purpose+0.003
array+0.003
 behaviour+0.002
 Sister+0.002
Neg logit contributions
 Sky-0.002
fly-0.002
 tweet-0.002
 globe-0.002
 Saturn-0.002
Token with
Token ID351
Feature activation+2.91e-02

Pos logit contributions
fly+0.035
 Sky+0.034
 tweet+0.030
 globe+0.029
flies+0.029
Neg logit contributions
gh-0.038
 purpose-0.035
 forgotten-0.035
 Sister-0.034
 behaviour-0.034
Token or
Token ID393
Feature activation+2.66e-03

Pos logit contributions
gh+0.062
array+0.059
omm+0.058
urance+0.057
 purpose+0.056
Neg logit contributions
 Sky-0.040
fly-0.038
 tweet-0.034
 Saturn-0.034
 Jupiter-0.033
Token without
Token ID1231
Feature activation+6.46e-02

Pos logit contributions
gh+0.005
 purpose+0.004
array+0.004
 behaviour+0.004
omm+0.004
Neg logit contributions
 Sky-0.005
fly-0.005
 tweet-0.004
 Saturn-0.004
 globe-0.004
Token y
Token ID331
Feature activation+1.00e-01

Pos logit contributions
omm+0.118
gh+0.115
liv+0.109
addy+0.108
innie+0.107
Neg logit contributions
 Sky-0.101
fly-0.101
 tweet-0.096
 higher-0.095
 below-0.092
Tokenummy
Token ID13513
Feature activation+7.20e-02

Pos logit contributions
liv+0.181
 virus+0.180
David+0.178
child+0.177
 forgotten+0.174
Neg logit contributions
als-0.127
 below-0.118
 higher-0.115
 Sky-0.109
 down-0.109
Token Ice
Token ID6663
Feature activation+5.97e-02

Pos logit contributions
gh+0.151
omm+0.151
liv+0.150
David+0.147
array+0.142
Neg logit contributions
 around-0.086
 below-0.084
als-0.082
 on-0.080
 higher-0.080
Token-
Token ID12
Feature activation+2.52e-02

Pos logit contributions
array+0.111
urance+0.109
gh+0.106
 mischief+0.105
ailed+0.101
Neg logit contributions
fly-0.093
 globe-0.088
 Sky-0.088
 below-0.087
als-0.087
TokenC
Token ID34
Feature activation+7.61e-02

Pos logit contributions
gh+0.045
array+0.042
urance+0.041
 purpose+0.041
omm+0.041
Neg logit contributions
fly-0.043
 Sky-0.043
 tweet-0.039
 below-0.038
als-0.038
Tokenream
Token ID1476
Feature activation+4.68e-02

Pos logit contributions
 virus+0.114
urance+0.114
 expression+0.111
 babys+0.110
array+0.109
Neg logit contributions
fly-0.121
ites-0.116
als-0.115
 Sky-0.112
flies-0.110
Token and
Token ID290
Feature activation+3.85e-02

Pos logit contributions
gh+0.034
 purpose+0.032
 behaviour+0.032
array+0.032
urance+0.031
Neg logit contributions
 Sky-0.012
fly-0.012
 tweet-0.010
 higher-0.010
 below-0.009
Token fell
Token ID3214
Feature activation+1.75e-02

Pos logit contributions
gh+0.078
omm+0.077
 purpose+0.076
urance+0.075
kids+0.075
Neg logit contributions
 Sky-0.055
fly-0.049
 higher-0.049
 flap-0.048
 tweet-0.047
Token onto
Token ID4291
Feature activation+5.46e-02

Pos logit contributions
gh+0.046
array+0.043
 purpose+0.043
 behaviour+0.043
omm+0.042
Neg logit contributions
 Sky-0.017
fly-0.017
 higher-0.014
 tweet-0.014
 below-0.013
Token the
Token ID262
Feature activation+1.20e-02

Pos logit contributions
gh+0.158
 behaviour+0.153
 supposed+0.149
urance+0.145
array+0.145
Neg logit contributions
 Sky-0.040
 Saturn-0.028
 Jupiter-0.025
 higher-0.025
 below-0.023
Token floor
Token ID4314
Feature activation+5.56e-02

Pos logit contributions
gh+0.023
 purpose+0.021
array+0.021
omm+0.021
 behaviour+0.021
Neg logit contributions
 Sky-0.020
fly-0.019
 tweet-0.017
 globe-0.017
 below-0.017
Token.
Token ID13
Feature activation+9.91e-02

Pos logit contributions
kids+0.126
 Sister+0.125
gh+0.122
omm+0.120
child+0.120
Neg logit contributions
 below-0.067
 higher-0.066
 Sky-0.066
fly-0.066
 around-0.060
Token He
Token ID679
Feature activation+2.50e-02

Pos logit contributions
gh+0.254
 behaviour+0.241
illy+0.240
 supposed+0.236
 occasions+0.236
Neg logit contributions
 down-0.074
 Sky-0.066
 around-0.059
 below-0.059
 higher-0.050
Token was
Token ID373
Feature activation+2.98e-02

Pos logit contributions
gh+0.044
omm+0.042
 purpose+0.042
urance+0.041
 behaviour+0.040
Neg logit contributions
 Sky-0.044
fly-0.042
 tweet-0.039
 flap-0.038
 higher-0.038
Token feeling
Token ID4203
Feature activation+2.88e-02

Pos logit contributions
gh+0.061
 behaviour+0.057
 purpose+0.057
kids+0.056
array+0.056
Neg logit contributions
 Sky-0.045
fly-0.043
 higher-0.040
 below-0.038
 flap-0.038
Token very
Token ID845
Feature activation+3.71e-02

Pos logit contributions
gh+0.062
kids+0.057
array+0.057
urance+0.056
 purpose+0.056
Neg logit contributions
 Sky-0.041
fly-0.038
 higher-0.038
 below-0.035
 tweet-0.035
Token ashamed
Token ID22461
Feature activation+4.31e-02

Pos logit contributions
gh+0.077
kids+0.075
child+0.072
em+0.072
omm+0.072
Neg logit contributions
 Sky-0.052
 higher-0.051
 below-0.049
 soar-0.047
fly-0.047
Token mom
Token ID1995
Feature activation+1.85e-02

Pos logit contributions
em+0.136
gh+0.136
array+0.132
David+0.128
 supposed+0.126
Neg logit contributions
 Sky-0.054
 fly-0.043
 rocket-0.043
 globe-0.042
 planet-0.041
Token asked
Token ID1965
Feature activation+1.28e-02

Pos logit contributions
gh+0.035
 purpose+0.033
 behaviour+0.031
array+0.031
 mischief+0.031
Neg logit contributions
 Sky-0.031
fly-0.031
 tweet-0.028
 below-0.026
 Saturn-0.026
Token him
Token ID683
Feature activation-7.24e-03

Pos logit contributions
gh+0.023
array+0.021
 purpose+0.021
addy+0.020
omm+0.020
Neg logit contributions
 Sky-0.023
fly-0.022
 tweet-0.020
 below-0.019
 Saturn-0.019
Token why
Token ID1521
Feature activation+4.54e-02

Pos logit contributions
fly+0.012
 Sky+0.012
 tweet+0.011
 globe+0.010
 Saturn+0.010
Neg logit contributions
gh-0.014
 purpose-0.012
 behaviour-0.012
 Sister-0.012
omm-0.012
Token he
Token ID339
Feature activation+5.70e-02

Pos logit contributions
gh+0.119
array+0.115
urance+0.114
omm+0.111
 purpose+0.111
Neg logit contributions
 Sky-0.039
fly-0.035
 below-0.033
 together-0.030
 tweet-0.028
Token had
Token ID550
Feature activation+9.74e-02

Pos logit contributions
 purpose+0.129
gh+0.123
omm+0.119
array+0.118
 mischief+0.116
Neg logit contributions
fly-0.073
bug-0.062
 Sky-0.062
 below-0.060
 tweet-0.060
Token worked
Token ID3111
Feature activation+6.74e-02

Pos logit contributions
omm+0.207
gh+0.199
scar+0.193
urance+0.193
innie+0.190
Neg logit contributions
 around-0.114
 together-0.114
 on-0.113
 in-0.106
 all-0.105
Token so
Token ID523
Feature activation+5.03e-02

Pos logit contributions
urance+0.163
uf+0.161
omm+0.159
kids+0.158
child+0.155
Neg logit contributions
 together-0.075
 higher-0.070
 around-0.067
 down-0.057
 on-0.056
Token hard
Token ID1327
Feature activation+7.54e-02

Pos logit contributions
kids+0.106
gh+0.105
urance+0.104
array+0.103
child+0.101
Neg logit contributions
 higher-0.066
 together-0.064
 Sky-0.063
 below-0.062
 around-0.059
Token to
Token ID284
Feature activation+5.87e-02

Pos logit contributions
omm+0.185
gh+0.181
OW+0.174
uf+0.173
urance+0.172
Neg logit contributions
 together-0.075
 around-0.069
 on-0.062
 below-0.062
 down-0.060
Token clean
Token ID3424
Feature activation+3.21e-02

Pos logit contributions
omm+0.131
gh+0.130
kids+0.128
 behaviour+0.124
child+0.123
Neg logit contributions
 fly-0.067
 flap-0.066
 soar-0.064
 higher-0.060
 Sky-0.057
Token store
Token ID3650
Feature activation+4.93e-02

Pos logit contributions
gh+0.024
 behaviour+0.022
 purpose+0.021
urance+0.021
omm+0.021
Neg logit contributions
 Sky-0.020
fly-0.020
 tweet-0.018
 globe-0.017
 Saturn-0.017
Token.
Token ID13
Feature activation+2.57e-03

Pos logit contributions
gh+0.126
 purpose+0.116
 mischief+0.114
omm+0.113
 behaviour+0.112
Neg logit contributions
 together-0.051
fly-0.048
 below-0.048
 Sky-0.048
 around-0.042
Token Mama
Token ID40084
Feature activation+2.96e-02

Pos logit contributions
gh+0.005
 purpose+0.005
array+0.005
 behaviour+0.004
omm+0.004
Neg logit contributions
 Sky-0.004
fly-0.004
 tweet-0.004
 Saturn-0.003
 globe-0.003
Token was
Token ID373
Feature activation+5.11e-02

Pos logit contributions
gh+0.067
 purpose+0.063
 behaviour+0.059
 mischief+0.059
urance+0.059
Neg logit contributions
fly-0.041
 Sky-0.041
 tweet-0.034
 below-0.034
 globe-0.033
Token looking
Token ID2045
Feature activation+2.12e-02

Pos logit contributions
gh+0.116
 behaviour+0.113
 purpose+0.110
urance+0.109
OW+0.104
Neg logit contributions
 Sky-0.060
 together-0.059
 higher-0.057
 below-0.056
 around-0.054
Token for
Token ID329
Feature activation+9.65e-02

Pos logit contributions
gh+0.057
 purpose+0.053
array+0.052
urance+0.052
omm+0.051
Neg logit contributions
 Sky-0.018
fly-0.017
 below-0.015
 higher-0.015
 tweet-0.015
Token something
Token ID1223
Feature activation+5.65e-02

Pos logit contributions
gh+0.252
urance+0.252
omm+0.239
aned+0.237
istent+0.236
Neg logit contributions
 together-0.068
 flowers-0.064
 mountains-0.064
 Sky-0.062
 the-0.060
Token and
Token ID290
Feature activation+3.65e-02

Pos logit contributions
urance+0.129
omm+0.126
array+0.124
child+0.122
gh+0.122
Neg logit contributions
 Sky-0.068
 together-0.067
 around-0.065
 closer-0.064
 higher-0.064
Token Ros
Token ID10018
Feature activation+4.52e-02

Pos logit contributions
gh+0.081
 purpose+0.074
urance+0.073
omm+0.071
 behaviour+0.070
Neg logit contributions
 Sky-0.048
fly-0.047
 together-0.045
 tweet-0.043
 Saturn-0.042
Tokeny
Token ID88
Feature activation+6.75e-02

Pos logit contributions
gh+0.118
 purpose+0.114
 punishment+0.113
 mischief+0.112
omm+0.110
Neg logit contributions
fly-0.045
 Sky-0.035
flies-0.034
 together-0.030
als-0.029
Token was
Token ID373
Feature activation+5.15e-02

Pos logit contributions
gh+0.137
omm+0.134
 purpose+0.133
array+0.126
urance+0.125
Neg logit contributions
fly-0.098
 Sky-0.094
 below-0.087
 together-0.083
 around-0.083
Token he
Token ID339
Feature activation+3.35e-02

Pos logit contributions
gh+0.011
 purpose+0.010
array+0.010
 behaviour+0.010
omm+0.010
Neg logit contributions
 Sky-0.008
fly-0.008
 tweet-0.007
 Saturn-0.006
 globe-0.006
Token remembered
Token ID12086
Feature activation+4.41e-02

Pos logit contributions
 purpose+0.071
urance+0.068
gh+0.067
omm+0.065
 behaviour+0.065
Neg logit contributions
fly-0.047
 Sky-0.044
bug-0.041
 tweet-0.039
 closer-0.039
Token that
Token ID326
Feature activation+1.17e-02

Pos logit contributions
gh+0.116
omm+0.113
urance+0.112
kids+0.109
array+0.108
Neg logit contributions
 Sky-0.034
fly-0.032
 Jupiter-0.030
 together-0.030
 Saturn-0.028
Token his
Token ID465
Feature activation+3.70e-02

Pos logit contributions
gh+0.026
 purpose+0.024
omm+0.024
array+0.024
urance+0.024
Neg logit contributions
 Sky-0.015
fly-0.015
 tweet-0.013
 Saturn-0.013
 globe-0.013
Token mom
Token ID1995
Feature activation+2.53e-02

Pos logit contributions
gh+0.073
array+0.069
em+0.069
urance+0.067
omm+0.067
Neg logit contributions
 Sky-0.056
 globe-0.049
 tweet-0.049
 below-0.048
 flap-0.048
Token had
Token ID550
Feature activation+9.64e-02

Pos logit contributions
gh+0.054
 purpose+0.053
 mischief+0.050
omm+0.050
array+0.049
Neg logit contributions
fly-0.036
 Sky-0.033
 tweet-0.030
bug-0.029
flies-0.028
Token some
Token ID617
Feature activation+8.28e-02

Pos logit contributions
omm+0.208
urance+0.196
uf+0.193
becca+0.186
kids+0.186
Neg logit contributions
 together-0.119
 on-0.112
 in-0.100
 fly-0.099
 fl-0.098
Token cleaning
Token ID12724
Feature activation+3.19e-02

Pos logit contributions
gh+0.173
omm+0.172
urance+0.169
David+0.168
oph+0.162
Neg logit contributions
 around-0.101
 fly-0.100
 flies-0.097
 on-0.097
 flap-0.097
Token spray
Token ID11662
Feature activation+2.85e-02

Pos logit contributions
gh+0.076
urance+0.072
omm+0.070
array+0.070
 behaviour+0.069
Neg logit contributions
fly-0.034
 Sky-0.034
 globe-0.030
 higher-0.030
 below-0.029
Token.
Token ID13
Feature activation+1.56e-02

Pos logit contributions
gh+0.062
urance+0.057
omm+0.057
array+0.057
 purpose+0.056
Neg logit contributions
fly-0.039
 Sky-0.038
 higher-0.034
 tweet-0.034
 below-0.033
Token
Token ID220
Feature activation-7.90e-03

Pos logit contributions
gh+0.034
 purpose+0.031
 behaviour+0.031
 forgotten+0.030
array+0.030
Neg logit contributions
 Sky-0.023
fly-0.021
 tweet-0.019
 below-0.018
 Saturn-0.018
Token her
Token ID607
Feature activation-1.51e-02

Pos logit contributions
gh+0.030
 behaviour+0.027
urance+0.027
 Sister+0.027
array+0.027
Neg logit contributions
 Sky-0.025
fly-0.023
 tweet-0.022
 higher-0.021
 Saturn-0.021
Token blocks
Token ID7021
Feature activation+6.02e-03

Pos logit contributions
fly+0.028
 Sky+0.027
 tweet+0.026
 Saturn+0.024
 Jupiter+0.024
Neg logit contributions
gh-0.026
 purpose-0.024
 behaviour-0.023
 forgotten-0.022
array-0.022
Token,
Token ID11
Feature activation-9.99e-03

Pos logit contributions
gh+0.011
 purpose+0.010
array+0.010
 behaviour+0.010
 Sister+0.010
Neg logit contributions
fly-0.010
 Sky-0.010
 tweet-0.009
 below-0.009
 globe-0.009
Token but
Token ID475
Feature activation-9.03e-03

Pos logit contributions
fly+0.017
 Sky+0.016
 tweet+0.015
 globe+0.014
flies+0.014
Neg logit contributions
gh-0.019
array-0.018
 purpose-0.018
 Sister-0.017
 behaviour-0.017
Token her
Token ID607
Feature activation+1.31e-02

Pos logit contributions
fly+0.016
 Sky+0.015
 tweet+0.014
 globe+0.014
flies+0.013
Neg logit contributions
gh-0.017
 purpose-0.015
array-0.015
 forgotten-0.014
omm-0.014
Token brother
Token ID3956
Feature activation+9.63e-02

Pos logit contributions
gh+0.026
array+0.024
 purpose+0.023
 Sister+0.023
urance+0.023
Neg logit contributions
 Sky-0.021
fly-0.020
 tweet-0.018
 globe-0.018
 Saturn-0.017
Token wanted
Token ID2227
Feature activation+4.59e-02

Pos logit contributions
 mischief+0.238
 purpose+0.229
oken+0.228
array+0.227
igsaw+0.225
Neg logit contributions
 Sky-0.077
 together-0.077
 around-0.077
fly-0.076
 could-0.067
Token to
Token ID284
Feature activation+3.60e-02

Pos logit contributions
urance+0.115
uf+0.112
gh+0.111
 behaviour+0.109
array+0.107
Neg logit contributions
 Sky-0.049
 higher-0.039
 together-0.038
fly-0.038
 Saturn-0.038
Token play
Token ID711
Feature activation-1.71e-02

Pos logit contributions
gh+0.086
 Sister+0.082
omm+0.082
 behaviour+0.081
array+0.079
Neg logit contributions
 Sky-0.038
 soar-0.038
fly-0.037
 fly-0.036
 flap-0.035
Token with
Token ID351
Feature activation+1.20e-02

Pos logit contributions
fly+0.027
 Sky+0.025
 tweet+0.023
flies+0.022
als+0.021
Neg logit contributions
gh-0.037
 purpose-0.034
array-0.032
 forgotten-0.031
 mischief-0.031
Token them
Token ID606
Feature activation+4.37e-04

Pos logit contributions
gh+0.024
array+0.022
 behaviour+0.022
 purpose+0.022
urance+0.022
Neg logit contributions
 Sky-0.018
fly-0.018
 tweet-0.016
 Saturn-0.015
 higher-0.015
Token left
Token ID1364
Feature activation+3.61e-02

Pos logit contributions
 Sky+0.014
fly+0.014
 tweet+0.013
 Saturn+0.012
 globe+0.012
Neg logit contributions
gh-0.013
 purpose-0.011
array-0.011
 forgotten-0.011
 behaviour-0.011
Token.
Token ID13
Feature activation+2.11e-02

Pos logit contributions
gh+0.086
omm+0.081
array+0.081
 purpose+0.079
urance+0.078
Neg logit contributions
 Sky-0.040
fly-0.039
 below-0.038
 globe-0.034
 together-0.034
Token
Token ID220
Feature activation+1.20e-02

Pos logit contributions
gh+0.052
 purpose+0.047
 behaviour+0.046
array+0.046
 forgotten+0.044
Neg logit contributions
 Sky-0.024
fly-0.023
 Saturn-0.020
 below-0.019
 globe-0.019
Token
Token ID198
Feature activation+7.99e-03

Pos logit contributions
gh+0.029
 purpose+0.027
array+0.026
 behaviour+0.026
 forgotten+0.026
Neg logit contributions
 Sky-0.014
fly-0.014
 tweet-0.012
 Saturn-0.011
 below-0.011
Token
Token ID198
Feature activation+2.97e-02

Pos logit contributions
gh+0.019
 purpose+0.018
array+0.017
 behaviour+0.017
omm+0.017
Neg logit contributions
 Sky-0.010
fly-0.009
 tweet-0.008
 Saturn-0.008
 globe-0.008
TokenM
Token ID44
Feature activation+9.61e-02

Pos logit contributions
gh+0.059
 behaviour+0.058
 purpose+0.057
array+0.057
urance+0.056
Neg logit contributions
 Sky-0.040
fly-0.038
 together-0.035
 Saturn-0.035
 globe-0.035
Tokenum
Token ID388
Feature activation+1.14e-02

Pos logit contributions
 blond+0.177
 blonde+0.176
gh+0.175
array+0.169
 purpose+0.169
Neg logit contributions
ite-0.097
s-0.096
 together-0.091
ites-0.089
fish-0.088
Token and
Token ID290
Feature activation+5.25e-02

Pos logit contributions
gh+0.027
 purpose+0.025
array+0.024
 behaviour+0.024
 mischief+0.024
Neg logit contributions
fly-0.014
 Sky-0.014
 tweet-0.012
 globe-0.011
 Saturn-0.011
Token Sister
Token ID26553
Feature activation+1.45e-02

Pos logit contributions
gh+0.110
 purpose+0.090
 supposed+0.089
array+0.087
 policy+0.085
Neg logit contributions
 Sky-0.076
 down-0.076
 deer-0.075
als-0.074
 below-0.073
Token were
Token ID547
Feature activation+1.33e-02

Pos logit contributions
gh+0.029
 purpose+0.026
array+0.025
 behaviour+0.025
omm+0.025
Neg logit contributions
 Sky-0.024
fly-0.024
 tweet-0.021
 globe-0.020
 below-0.020
Token very
Token ID845
Feature activation+8.23e-03

Pos logit contributions
gh+0.029
 purpose+0.027
array+0.027
 behaviour+0.026
omm+0.026
Neg logit contributions
 Sky-0.019
fly-0.018
 tweet-0.016
 below-0.015
 higher-0.015
Token told
Token ID1297
Feature activation-2.98e-02

Pos logit contributions
gh+0.035
 purpose+0.034
array+0.032
 behaviour+0.032
urance+0.031
Neg logit contributions
fly-0.032
 Sky-0.031
 tweet-0.028
 below-0.027
 Saturn-0.026
Token Louise
Token ID31774
Feature activation+1.13e-02

Pos logit contributions
fly+0.044
 globe+0.041
 Sky+0.038
 higher+0.038
 tweet+0.038
Neg logit contributions
gh-0.058
 purpose-0.056
 Sister-0.054
 behaviour-0.054
illy-0.053
Token that
Token ID326
Feature activation+8.05e-03

Pos logit contributions
gh+0.024
 purpose+0.022
array+0.022
omm+0.021
 behaviour+0.021
Neg logit contributions
 Sky-0.017
fly-0.016
 tweet-0.015
 Saturn-0.013
 Jupiter-0.013
Token the
Token ID262
Feature activation+4.85e-02

Pos logit contributions
gh+0.017
 purpose+0.016
array+0.016
omm+0.016
urance+0.016
Neg logit contributions
 Sky-0.011
fly-0.011
 tweet-0.010
 Saturn-0.009
 globe-0.009
Token oil
Token ID3056
Feature activation+5.48e-02

Pos logit contributions
gh+0.093
David+0.085
anked+0.080
oph+0.080
dad+0.080
Neg logit contributions
 Sky-0.071
 flap-0.070
 deer-0.069
 globe-0.069
fly-0.066
Token had
Token ID550
Feature activation+9.55e-02

Pos logit contributions
gh+0.113
array+0.108
 purpose+0.106
 mischief+0.106
urance+0.105
Neg logit contributions
fly-0.079
 Sky-0.074
 globe-0.070
als-0.067
 below-0.066
Token run
Token ID1057
Feature activation+6.09e-02

Pos logit contributions
omm+0.169
array+0.160
gh+0.156
innie+0.156
urance+0.154
Neg logit contributions
 Sky-0.146
 fly-0.139
fly-0.138
 flap-0.135
 around-0.134
Token out
Token ID503
Feature activation+1.41e-02

Pos logit contributions
omm+0.162
addy+0.156
gh+0.156
innie+0.151
array+0.151
Neg logit contributions
 around-0.051
 higher-0.050
 Sky-0.047
fly-0.046
 together-0.044
Token very
Token ID845
Feature activation+4.00e-02

Pos logit contributions
gh+0.031
 purpose+0.029
omm+0.029
array+0.028
 Sister+0.028
Neg logit contributions
 Sky-0.019
fly-0.019
 tweet-0.017
 below-0.016
 Saturn-0.015
Token quickly
Token ID2952
Feature activation+7.71e-02

Pos logit contributions
gh+0.092
urance+0.091
omm+0.089
array+0.089
child+0.088
Neg logit contributions
 Sky-0.046
 higher-0.046
 below-0.043
fly-0.040
 closer-0.039
Token.
Token ID13
Feature activation-5.17e-02

Pos logit contributions
omm+0.183
gh+0.182
innie+0.171
scar+0.170
addy+0.170
Neg logit contributions
fly-0.078
 below-0.077
 around-0.075
 together-0.073
 higher-0.073
Token to
Token ID284
Feature activation+3.70e-02

Pos logit contributions
gh+0.070
urance+0.066
omm+0.065
array+0.065
 behaviour+0.063
Neg logit contributions
 Sky-0.041
fly-0.039
 higher-0.035
 tweet-0.035
 Saturn-0.034
Token walk
Token ID2513
Feature activation-1.74e-02

Pos logit contributions
gh+0.083
omm+0.079
child+0.077
array+0.077
em+0.075
Neg logit contributions
fly-0.046
 Sky-0.046
 soar-0.044
 tweet-0.043
 flap-0.042
Token."
Token ID526
Feature activation+6.03e-02

Pos logit contributions
fly+0.024
 Sky+0.024
 tweet+0.022
 Saturn+0.020
 globe+0.019
Neg logit contributions
gh-0.038
 purpose-0.036
array-0.034
 forgotten-0.034
 behaviour-0.033
Token
Token ID198
Feature activation+3.98e-02

Pos logit contributions
gh+0.166
 supposed+0.148
 purpose+0.147
 behaviour+0.143
kids+0.142
Neg logit contributions
 together-0.046
 down-0.042
 Sky-0.042
 higher-0.038
 below-0.036
Token
Token ID198
Feature activation+5.88e-02

Pos logit contributions
gh+0.107
 behaviour+0.107
array+0.107
 purpose+0.104
 forgotten+0.103
Neg logit contributions
 Sky-0.028
fly-0.022
 together-0.020
 Fly-0.018
 Saturn-0.018
TokenM
Token ID44
Feature activation+9.55e-02

Pos logit contributions
urance+0.126
 behaviour+0.126
 supposed+0.125
gh+0.124
 purpose+0.122
Neg logit contributions
 together-0.063
 Sky-0.054
 higher-0.053
 closer-0.053
 Fly-0.051
Tokenum
Token ID388
Feature activation+2.15e-02

Pos logit contributions
 supposed+0.172
array+0.171
 mischief+0.170
 blonde+0.170
 virus+0.166
Neg logit contributions
ite-0.106
cast-0.100
 together-0.097
fish-0.094
 below-0.091
Token and
Token ID290
Feature activation+5.44e-02

Pos logit contributions
gh+0.050
 purpose+0.049
 mischief+0.046
array+0.046
 behaviour+0.046
Neg logit contributions
fly-0.027
 Sky-0.026
 tweet-0.023
 below-0.021
flies-0.021
Token Dad
Token ID17415
Feature activation+2.34e-02

Pos logit contributions
gh+0.120
 purpose+0.106
 supposed+0.101
urance+0.100
 policy+0.098
Neg logit contributions
 together-0.075
 down-0.073
 closer-0.072
 below-0.072
 Sky-0.071
Token told
Token ID1297
Feature activation+6.32e-02

Pos logit contributions
gh+0.046
 purpose+0.045
 mischief+0.041
omm+0.041
array+0.041
Neg logit contributions
 Sky-0.037
fly-0.037
 tweet-0.035
 below-0.033
 flap-0.032
Token Little
Token ID7703
Feature activation+7.39e-02

Pos logit contributions
gh+0.131
 supposed+0.128
 policy+0.121
omm+0.120
 forgotten+0.119
Neg logit contributions
 together-0.082
 Sky-0.079
 around-0.076
 down-0.072
 Jupiter-0.071
Token at
Token ID379
Feature activation-2.71e-02

Pos logit contributions
fly+0.036
 Sky+0.034
 tweet+0.030
 globe+0.029
flies+0.027
Neg logit contributions
 forgotten-0.056
gh-0.056
array-0.055
 behaviour-0.055
 mischief-0.055
Token her
Token ID607
Feature activation+1.76e-02

Pos logit contributions
fly+0.046
 Sky+0.043
 tweet+0.042
 globe+0.040
flies+0.038
Neg logit contributions
gh-0.046
 purpose-0.046
addy-0.045
 Sister-0.045
 mischief-0.043
Token dad
Token ID9955
Feature activation+2.57e-03

Pos logit contributions
gh+0.032
array+0.028
omm+0.028
 purpose+0.028
urance+0.027
Neg logit contributions
 Sky-0.031
fly-0.031
 tweet-0.028
 below-0.028
 globe-0.028
Token with
Token ID351
Feature activation+3.26e-02

Pos logit contributions
gh+0.006
 purpose+0.005
array+0.005
 behaviour+0.005
omm+0.005
Neg logit contributions
 Sky-0.004
fly-0.004
 tweet-0.003
 globe-0.003
 Saturn-0.003
Token an
Token ID281
Feature activation+5.18e-02

Pos logit contributions
gh+0.070
omm+0.066
array+0.064
addy+0.063
 purpose+0.063
Neg logit contributions
 Sky-0.047
fly-0.043
 tweet-0.042
 flap-0.040
 below-0.039
Token pleading
Token ID30279
Feature activation+9.53e-02

Pos logit contributions
gh+0.093
 purpose+0.085
 Sister+0.083
 forgotten+0.082
array+0.082
Neg logit contributions
 Sky-0.081
 flap-0.078
fly-0.077
 tweet-0.077
 Jupiter-0.076
Token expression
Token ID5408
Feature activation+1.51e-02

Pos logit contributions
liv+0.172
array+0.172
scar+0.170
Enough+0.168
uf+0.166
Neg logit contributions
 flap-0.134
fly-0.131
 globe-0.127
 tweet-0.122
 Sky-0.119
Token on
Token ID319
Feature activation+2.82e-02

Pos logit contributions
gh+0.033
 purpose+0.031
array+0.031
omm+0.030
 mischief+0.030
Neg logit contributions
fly-0.021
 Sky-0.020
 tweet-0.018
 below-0.018
 higher-0.017
Token her
Token ID607
Feature activation+3.88e-02

Pos logit contributions
array+0.057
gh+0.056
 supposed+0.055
 forgotten+0.055
 behaviour+0.054
Neg logit contributions
 Sky-0.044
fly-0.040
 tweet-0.035
 Saturn-0.035
 below-0.034
Token face
Token ID1986
Feature activation+3.34e-02

Pos logit contributions
array+0.068
gh+0.065
omm+0.064
 forgotten+0.062
scar+0.062
Neg logit contributions
 Sky-0.063
fly-0.062
 tweet-0.058
 flap-0.057
 globe-0.057
Token and
Token ID290
Feature activation+7.22e-03

Pos logit contributions
gh+0.066
array+0.064
 purpose+0.062
 mischief+0.062
 Sister+0.060
Neg logit contributions
fly-0.051
 Sky-0.050
 tweet-0.043
 below-0.043
flies-0.042
Token to
Token ID284
Feature activation+3.79e-02

Pos logit contributions
gh+0.032
 purpose+0.029
omm+0.029
array+0.029
 behaviour+0.029
Neg logit contributions
 Sky-0.022
fly-0.022
 tweet-0.019
 below-0.019
 higher-0.019
Token prevent
Token ID2948
Feature activation+4.62e-02

Pos logit contributions
gh+0.083
omm+0.080
kids+0.077
 behaviour+0.077
child+0.076
Neg logit contributions
 Sky-0.048
 flap-0.047
 fly-0.047
 soar-0.047
fly-0.044
Token her
Token ID607
Feature activation+4.86e-02

Pos logit contributions
gh+0.114
array+0.104
omm+0.103
urance+0.103
David+0.103
Neg logit contributions
 Sky-0.055
fly-0.043
 higher-0.042
 below-0.041
 Bloom-0.040
Token mom
Token ID1995
Feature activation+2.62e-02

Pos logit contributions
gh+0.113
David+0.107
omm+0.104
array+0.101
urance+0.101
Neg logit contributions
fly-0.056
 Sky-0.053
 higher-0.050
 together-0.049
flies-0.049
Token from
Token ID422
Feature activation+3.98e-02

Pos logit contributions
gh+0.060
omm+0.056
 purpose+0.055
 behaviour+0.054
 Sister+0.053
Neg logit contributions
fly-0.034
 Sky-0.032
 below-0.029
 higher-0.028
 tweet-0.028
Token having
Token ID1719
Feature activation+9.52e-02

Pos logit contributions
gh+0.073
omm+0.070
urance+0.066
addy+0.065
 behaviour+0.065
Neg logit contributions
 Sky-0.065
fly-0.062
 below-0.060
 flap-0.059
 higher-0.059
Token to
Token ID284
Feature activation+8.30e-02

Pos logit contributions
omm+0.247
scar+0.241
uf+0.239
David+0.238
child+0.237
Neg logit contributions
 together-0.088
 higher-0.065
 on-0.063
 around-0.062
 Sky-0.061
Token clean
Token ID3424
Feature activation+4.19e-02

Pos logit contributions
omm+0.206
child+0.194
kids+0.190
gh+0.187
 behaviour+0.183
Neg logit contributions
 fly-0.099
 soar-0.090
 flap-0.085
 float-0.078
 sing-0.075
Token up
Token ID510
Feature activation+5.16e-02

Pos logit contributions
gh+0.116
urance+0.111
David+0.109
omm+0.109
OW+0.107
Neg logit contributions
 higher-0.028
 around-0.026
 Sky-0.026
 down-0.025
 together-0.025
Token the
Token ID262
Feature activation+5.26e-02

Pos logit contributions
gh+0.121
omm+0.118
urance+0.115
David+0.113
array+0.112
Neg logit contributions
 Sky-0.055
 higher-0.052
 below-0.052
 together-0.052
fly-0.048
Token mud
Token ID17492
Feature activation+8.12e-03

Pos logit contributions
gh+0.108
David+0.103
urance+0.102
 Samantha+0.100
uf+0.100
Neg logit contributions
 higher-0.071
 around-0.066
 globe-0.065
 flap-0.064
fl-0.063
Token smell
Token ID8508
Feature activation+5.96e-02

Pos logit contributions
David+0.146
child+0.134
em+0.132
kids+0.132
urance+0.130
Neg logit contributions
fly-0.116
 globe-0.105
 dragon-0.102
 flap-0.101
 higher-0.100
Token and
Token ID290
Feature activation+3.92e-02

Pos logit contributions
omm+0.159
urance+0.156
 Sister+0.155
array+0.154
 behaviour+0.153
Neg logit contributions
 below-0.043
 around-0.041
fly-0.040
 Sky-0.039
 globe-0.036
Token was
Token ID373
Feature activation+4.50e-02

Pos logit contributions
gh+0.095
urance+0.087
omm+0.087
 behaviour+0.085
array+0.085
Neg logit contributions
 Sky-0.046
fly-0.043
 below-0.038
 globe-0.036
 tweet-0.035
Token filled
Token ID5901
Feature activation+1.66e-02

Pos logit contributions
gh+0.087
urance+0.080
 behaviour+0.079
array+0.079
kids+0.079
Neg logit contributions
fly-0.074
 Sky-0.073
 below-0.065
 higher-0.061
 tweet-0.061
Token with
Token ID351
Feature activation+5.49e-02

Pos logit contributions
gh+0.045
 behaviour+0.043
omm+0.043
urance+0.043
 Sister+0.043
Neg logit contributions
 Sky-0.013
fly-0.013
 below-0.010
 globe-0.010
 higher-0.009
Token y
Token ID331
Feature activation+9.44e-02

Pos logit contributions
gh+0.113
omm+0.103
urance+0.099
array+0.097
uf+0.096
Neg logit contributions
 Sky-0.081
fly-0.078
 below-0.074
 higher-0.072
 around-0.072
Tokenucky
Token ID5309
Feature activation+3.48e-02

Pos logit contributions
gh+0.168
David+0.158
 forgotten+0.158
liv+0.157
 virus+0.155
Neg logit contributions
als-0.142
 together-0.128
 below-0.126
 higher-0.121
 Sky-0.120
Token water
Token ID1660
Feature activation+3.15e-02

Pos logit contributions
gh+0.071
omm+0.064
array+0.064
David+0.063
urance+0.063
Neg logit contributions
fly-0.053
 Sky-0.052
flies-0.046
 globe-0.046
 tweet-0.045
Token.
Token ID13
Feature activation+3.73e-02

Pos logit contributions
gh+0.064
omm+0.059
array+0.058
 purpose+0.058
urance+0.057
Neg logit contributions
fly-0.049
 Sky-0.046
 below-0.042
 tweet-0.041
flies-0.041
Token
Token ID220
Feature activation+4.60e-03

Pos logit contributions
gh+0.076
 behaviour+0.066
 purpose+0.064
 forgotten+0.063
 virus+0.062
Neg logit contributions
 Sky-0.061
 below-0.055
fly-0.052
 globe-0.051
 Jupiter-0.050
Token
Token ID198
Feature activation+2.65e-02

Pos logit contributions
gh+0.011
 purpose+0.010
array+0.010
 behaviour+0.010
 Sister+0.010
Neg logit contributions
 Sky-0.006
fly-0.006
 tweet-0.005
 Saturn-0.005
 globe-0.005
Token She
Token ID1375
Feature activation+2.81e-02

Pos logit contributions
gh+0.103
 behaviour+0.089
scar+0.088
 supposed+0.087
dad+0.087
Neg logit contributions
 Sky-0.045
 Bloom-0.035
als-0.034
 Jupiter-0.033
 Sparkle-0.033
Token remembered
Token ID12086
Feature activation+3.46e-02

Pos logit contributions
gh+0.057
 purpose+0.054
urance+0.052
 behaviour+0.051
 mischief+0.050
Neg logit contributions
fly-0.044
 Sky-0.044
 tweet-0.038
 flap-0.036
 deer-0.036
Token how
Token ID703
Feature activation+4.57e-02

Pos logit contributions
gh+0.091
omm+0.083
urance+0.082
array+0.081
 purpose+0.081
Neg logit contributions
 Sky-0.034
fly-0.031
 tweet-0.028
 below-0.028
 Jupiter-0.027
Token the
Token ID262
Feature activation+3.55e-02

Pos logit contributions
gh+0.104
urance+0.100
array+0.099
omm+0.098
kids+0.097
Neg logit contributions
 together-0.058
 Sky-0.052
 around-0.051
 below-0.046
 higher-0.046
Token lady
Token ID10846
Feature activation-7.60e-03

Pos logit contributions
gh+0.088
urance+0.080
omm+0.079
uf+0.078
anked+0.076
Neg logit contributions
 higher-0.038
 Sky-0.037
 below-0.036
 fly-0.034
 globe-0.034
Token had
Token ID550
Feature activation+9.41e-02

Pos logit contributions
 Sky+0.014
fly+0.014
 tweet+0.013
 Saturn+0.012
 globe+0.012
Neg logit contributions
gh-0.013
 purpose-0.011
array-0.011
 forgotten-0.011
 behaviour-0.011
Token helped
Token ID4193
Feature activation+3.67e-02

Pos logit contributions
omm+0.207
uf+0.188
gh+0.185
urance+0.184
innie+0.182
Neg logit contributions
 together-0.125
 fly-0.109
 on-0.108
 around-0.107
 in-0.106
Token her
Token ID607
Feature activation+3.00e-02

Pos logit contributions
gh+0.077
array+0.073
omm+0.071
omo+0.070
scar+0.070
Neg logit contributions
 Sky-0.053
fly-0.049
 tweet-0.045
 below-0.044
 higher-0.043
Token and
Token ID290
Feature activation+1.69e-02

Pos logit contributions
gh+0.065
array+0.060
omm+0.059
David+0.058
kids+0.058
Neg logit contributions
fly-0.041
 Sky-0.040
 below-0.036
 flap-0.035
 fly-0.035
Token decided
Token ID3066
Feature activation+3.39e-02

Pos logit contributions
gh+0.033
 purpose+0.030
array+0.029
urance+0.029
omm+0.029
Neg logit contributions
fly-0.028
 Sky-0.028
 tweet-0.025
 below-0.023
 Saturn-0.023
Token to
Token ID284
Feature activation+4.07e-02

Pos logit contributions
gh+0.079
omm+0.076
array+0.075
urance+0.075
 mischief+0.073
Neg logit contributions
 Sky-0.039
fly-0.035
 below-0.034
 higher-0.033
 together-0.033
Token go
Token ID467
Feature activation-6.72e-03

Pos logit contributions
gh+0.010
 purpose+0.009
array+0.009
omm+0.009
 behaviour+0.009
Neg logit contributions
 Sky-0.010
fly-0.010
 tweet-0.009
 Saturn-0.009
 globe-0.009
Token outside
Token ID2354
Feature activation-2.98e-02

Pos logit contributions
fly+0.009
 Sky+0.009
 tweet+0.008
 globe+0.008
 Saturn+0.007
Neg logit contributions
gh-0.015
 purpose-0.014
array-0.013
 forgotten-0.013
 Sister-0.013
Token."
Token ID526
Feature activation+3.98e-02

Pos logit contributions
fly+0.053
 Sky+0.052
 tweet+0.049
 globe+0.044
 Saturn+0.044
Neg logit contributions
gh-0.052
 purpose-0.049
array-0.048
 forgotten-0.047
 Sister-0.047
Token
Token ID198
Feature activation+2.15e-02

Pos logit contributions
gh+0.106
 purpose+0.097
 behaviour+0.094
 forgotten+0.093
 supposed+0.093
Neg logit contributions
 Sky-0.035
 together-0.032
fly-0.031
 Fly-0.027
 tweet-0.027
Token
Token ID198
Feature activation+3.34e-02

Pos logit contributions
gh+0.054
 behaviour+0.051
 purpose+0.051
array+0.051
 forgotten+0.050
Neg logit contributions
 Sky-0.022
fly-0.020
 tweet-0.017
 Saturn-0.016
 globe-0.016
TokenM
Token ID44
Feature activation+9.40e-02

Pos logit contributions
gh+0.068
 behaviour+0.066
 purpose+0.065
urance+0.064
omm+0.063
Neg logit contributions
 Sky-0.045
fly-0.043
 together-0.041
 tweet-0.040
 higher-0.039
Tokenum
Token ID388
Feature activation+3.45e-03

Pos logit contributions
 blonde+0.161
 supposed+0.157
 purpose+0.156
 blond+0.156
 mischief+0.156
Neg logit contributions
ite-0.111
 together-0.105
cast-0.105
 tweet-0.095
ites-0.095
Token and
Token ID290
Feature activation+4.94e-02

Pos logit contributions
gh+0.008
 purpose+0.007
array+0.007
 behaviour+0.007
omm+0.007
Neg logit contributions
fly-0.004
 Sky-0.004
 tweet-0.004
 globe-0.003
 Saturn-0.003
Token Dad
Token ID17415
Feature activation+9.97e-03

Pos logit contributions
gh+0.100
 purpose+0.087
 supposed+0.083
urance+0.080
 policy+0.079
Neg logit contributions
 together-0.076
 Sky-0.075
 down-0.075
 tweet-0.073
 higher-0.072
Token smiled
Token ID13541
Feature activation-4.71e-03

Pos logit contributions
gh+0.018
 purpose+0.017
array+0.016
 behaviour+0.016
omm+0.016
Neg logit contributions
 Sky-0.017
fly-0.017
 tweet-0.016
 below-0.015
 globe-0.015
Token and
Token ID290
Feature activation-5.92e-03

Pos logit contributions
 Sky+0.007
fly+0.007
 tweet+0.006
 globe+0.005
 Saturn+0.005
Neg logit contributions
gh-0.010
 purpose-0.009
 behaviour-0.009
array-0.009
 Sister-0.009
Token,
Token ID11
Feature activation+5.11e-02

Pos logit contributions
urance+0.165
array+0.165
 mischief+0.165
gh+0.163
child+0.163
Neg logit contributions
 together-0.044
 Sky-0.039
 higher-0.036
 on-0.035
 aboard-0.035
Token Tommy
Token ID19919
Feature activation+3.60e-03

Pos logit contributions
gh+0.096
urance+0.091
 behaviour+0.089
omm+0.086
array+0.085
Neg logit contributions
 together-0.080
 Sky-0.080
 higher-0.078
 below-0.075
 down-0.074
Token's
Token ID338
Feature activation-1.97e-03

Pos logit contributions
gh+0.008
 purpose+0.007
array+0.007
 behaviour+0.007
omm+0.007
Neg logit contributions
fly-0.005
 Sky-0.005
 tweet-0.005
 Saturn-0.004
 globe-0.004
Token trumpet
Token ID40334
Feature activation+4.52e-03

Pos logit contributions
fly+0.004
 Sky+0.004
 tweet+0.003
 Saturn+0.003
 globe+0.003
Neg logit contributions
gh-0.003
 purpose-0.003
 behaviour-0.003
array-0.003
 Sister-0.003
Token breaks
Token ID9457
Feature activation+2.72e-02

Pos logit contributions
gh+0.008
 purpose+0.008
array+0.007
 behaviour+0.007
omm+0.007
Neg logit contributions
fly-0.008
 Sky-0.008
 tweet-0.007
 globe-0.007
 Saturn-0.007
Token.
Token ID13
Feature activation+9.30e-02

Pos logit contributions
gh+0.066
 behaviour+0.062
urance+0.061
array+0.061
 Sister+0.060
Neg logit contributions
fly-0.031
 Sky-0.031
 higher-0.026
 below-0.026
 Saturn-0.025
Token He
Token ID679
Feature activation-1.33e-02

Pos logit contributions
gh+0.257
illy+0.242
 injuries+0.236
 supposed+0.235
liv+0.232
Neg logit contributions
 down-0.058
 Sky-0.058
 Twist-0.043
 around-0.043
 below-0.042
Token tries
Token ID8404
Feature activation+1.24e-02

Pos logit contributions
fly+0.021
 Sky+0.021
 tweet+0.019
 globe+0.018
 below+0.018
Neg logit contributions
gh-0.026
array-0.024
 purpose-0.023
 forgotten-0.023
 behaviour-0.023
Token to
Token ID284
Feature activation-1.07e-02

Pos logit contributions
gh+0.025
array+0.023
omm+0.022
 behaviour+0.022
urance+0.022
Neg logit contributions
 Sky-0.019
fly-0.019
 tweet-0.017
 higher-0.017
 Saturn-0.016
Token blow
Token ID6611
Feature activation-1.74e-02

Pos logit contributions
 Sky+0.019
fly+0.019
 tweet+0.017
 globe+0.017
 Saturn+0.016
Neg logit contributions
gh-0.019
 purpose-0.018
array-0.017
urance-0.016
 behaviour-0.016
Token,
Token ID11
Feature activation-7.47e-03

Pos logit contributions
fly+0.029
 Sky+0.027
 tweet+0.025
flies+0.024
 globe+0.023
Neg logit contributions
gh-0.035
 purpose-0.033
 mischief-0.030
array-0.030
 behaviour-0.030
Token
Token ID198
Feature activation-3.04e-02

Pos logit contributions
fly+0.027
 Sky+0.023
flies+0.023
 globe+0.022
 tweet+0.022
Neg logit contributions
gh-0.026
 purpose-0.025
 Sister-0.024
 punishment-0.024
addy-0.024
Token
Token ID198
Feature activation-8.04e-03

Pos logit contributions
fly+0.056
 Sky+0.050
flies+0.048
 tweet+0.048
als+0.047
Neg logit contributions
gh-0.055
 purpose-0.050
addy-0.048
anked-0.047
David-0.047
TokenThey
Token ID2990
Feature activation-7.75e-03

Pos logit contributions
fly+0.017
 Sky+0.016
 tweet+0.015
flies+0.015
 Saturn+0.014
Neg logit contributions
gh-0.012
 purpose-0.011
array-0.011
addy-0.010
 Sister-0.010
Token watched
Token ID7342
Feature activation-3.29e-02

Pos logit contributions
fly+0.014
 Sky+0.014
 tweet+0.013
 globe+0.012
 Saturn+0.012
Neg logit contributions
gh-0.013
array-0.012
 purpose-0.012
 Sister-0.012
 forgotten-0.012
Token the
Token ID262
Feature activation-5.53e-02

Pos logit contributions
fly+0.059
 Sky+0.052
 globe+0.051
flies+0.049
 tweet+0.048
Neg logit contributions
gh-0.057
 purpose-0.056
 Sister-0.054
 punishment-0.054
anked-0.054
Token moon
Token ID8824
Feature activation-1.37e-01

Pos logit contributions
fly+0.077
 Sky+0.076
bug+0.072
 Bloom+0.067
 Flor+0.064
Neg logit contributions
 purpose-0.107
gh-0.101
array-0.099
child-0.098
 mischief-0.097
Token together
Token ID1978
Feature activation-4.26e-02

Pos logit contributions
 Asia+0.201
 Sky+0.198
 Bloom+0.190
 Fly+0.187
 globe+0.186
Neg logit contributions
gh-0.197
 efforts-0.187
 behaviour-0.175
 purpose-0.173
anked-0.170
Token until
Token ID1566
Feature activation-1.47e-02

Pos logit contributions
fly+0.084
 Sky+0.083
 tweet+0.076
 globe+0.075
 Jupiter+0.075
Neg logit contributions
gh-0.063
 purpose-0.058
 behaviour-0.056
 punishment-0.054
urance-0.052
Token they
Token ID484
Feature activation-3.61e-02

Pos logit contributions
fly+0.023
 Sky+0.022
 tweet+0.020
flies+0.019
 globe+0.018
Neg logit contributions
gh-0.030
 purpose-0.028
 Sister-0.027
 forgotten-0.027
 punishment-0.027
Token fell
Token ID3214
Feature activation-3.37e-02

Pos logit contributions
 Sky+0.078
fly+0.076
 tweet+0.073
 globe+0.070
 Jupiter+0.070
Neg logit contributions
 forgotten-0.048
gh-0.047
 disob-0.040
 happened-0.040
addy-0.039
Token asleep
Token ID16039
Feature activation-3.26e-02

Pos logit contributions
 Sky+0.049
fly+0.049
 tweet+0.045
 Saturn+0.043
 Bloom+0.042
Neg logit contributions
gh-0.065
 purpose-0.063
array-0.059
 forgotten-0.059
anked-0.059
Token.
Token ID13
Feature activation-4.31e-02

Pos logit contributions
 Sky+0.116
fly+0.114
 tweet+0.114
 globe+0.108
 Fly+0.106
Neg logit contributions
gh-0.048
 purpose-0.045
 time-0.040
 business-0.036
 supposed-0.036
Token She
Token ID1375
Feature activation-7.48e-02

Pos logit contributions
fly+0.093
flies+0.080
 Sky+0.080
 globe+0.078
 tweet+0.077
Neg logit contributions
gh-0.060
 purpose-0.060
 Sister-0.059
urance-0.056
addy-0.056
Token would
Token ID561
Feature activation-7.59e-02

Pos logit contributions
fly+0.171
 Sky+0.167
 tweet+0.158
stack+0.156
 south+0.153
Neg logit contributions
gh-0.076
 happened-0.073
 hated-0.071
 been-0.069
 forgotten-0.066
Token fly
Token ID6129
Feature activation-9.20e-02

Pos logit contributions
fly+0.116
 Sky+0.106
flies+0.106
rots+0.105
als+0.103
Neg logit contributions
gh-0.133
 disob-0.126
 purpose-0.121
 time-0.120
 punish-0.117
Token up
Token ID510
Feature activation-6.58e-02

Pos logit contributions
fly+0.175
stack+0.168
 globe+0.160
 launcher+0.157
rots+0.154
Neg logit contributions
 time-0.124
 purpose-0.114
gh-0.113
 forgotten-0.103
 gotten-0.089
Token and
Token ID290
Feature activation-1.32e-01

Pos logit contributions
fly+0.117
 tweet+0.100
 globe+0.098
 Sky+0.097
flies+0.094
Neg logit contributions
 purpose-0.101
gh-0.101
 time-0.099
 forgotten-0.099
 trouble-0.094
Token up
Token ID510
Feature activation-4.26e-02

Pos logit contributions
rots+0.239
ectar+0.215
ang+0.209
fly+0.207
 Sky+0.203
Neg logit contributions
 quit-0.155
 gotten-0.136
 disob-0.133
 forgotten-0.133
 delay-0.132
Token,
Token ID11
Feature activation-1.09e-01

Pos logit contributions
fly+0.078
 Sky+0.068
 globe+0.067
 tweet+0.066
 Saturn+0.063
Neg logit contributions
gh-0.070
 purpose-0.067
 forgotten-0.065
 punishment-0.061
 time-0.060
Token then
Token ID788
Feature activation-7.09e-02

Pos logit contributions
fly+0.195
rots+0.186
 globe+0.173
girls+0.171
fish+0.171
Neg logit contributions
 expression-0.131
 complaining-0.125
 forgotten-0.123
 grown-0.123
 refused-0.122
Token come
Token ID1282
Feature activation-9.84e-02

Pos logit contributions
fly+0.113
 globe+0.108
 Sky+0.096
pole+0.092
 Saturn+0.091
Neg logit contributions
gh-0.115
 behaviour-0.112
 happened-0.107
 disob-0.105
 forgotten-0.105
Token back
Token ID736
Feature activation-7.50e-02

Pos logit contributions
rob+0.188
 Sky+0.176
fish+0.170
 Blink+0.167
igator+0.164
Neg logit contributions
 supposed-0.125
gh-0.115
 business-0.112
 forgotten-0.106
 purpose-0.105
Token would
Token ID561
Feature activation-4.56e-02

Pos logit contributions
 Sky+0.179
 Saturn+0.160
stack+0.153
 higher+0.152
 Jupiter+0.152
Neg logit contributions
gh-0.110
 happened-0.107
 disob-0.099
 hated-0.096
 forgotten-0.095
Token fly
Token ID6129
Feature activation-9.48e-02

Pos logit contributions
 Sky+0.061
fly+0.059
 globe+0.054
 Saturn+0.051
als+0.051
Neg logit contributions
gh-0.097
 purpose-0.091
 disob-0.090
 punish-0.087
 forgotten-0.085
Token over
Token ID625
Feature activation-4.83e-02

Pos logit contributions
stack+0.176
rots+0.165
 globe+0.165
 Sky+0.164
fly+0.162
Neg logit contributions
 time-0.132
gh-0.125
 purpose-0.118
 forgotten-0.100
 gotten-0.091
Token the
Token ID262
Feature activation-5.98e-02

Pos logit contributions
fly+0.088
 Sky+0.079
 globe+0.075
 tweet+0.075
pole+0.074
Neg logit contributions
 purpose-0.079
 mischief-0.079
 trouble-0.078
gh-0.077
 time-0.073
Token hills
Token ID18639
Feature activation-7.30e-02

Pos logit contributions
fly+0.105
 Sky+0.101
 tweet+0.095
 together+0.093
bug+0.090
Neg logit contributions
 purpose-0.098
 forgotten-0.090
gh-0.087
 situation-0.087
 mischief-0.087
Token and
Token ID290
Feature activation-1.31e-01

Pos logit contributions
 Sky+0.149
 tweet+0.147
fly+0.141
 Fly+0.133
 Flor+0.131
Neg logit contributions
gh-0.089
 purpose-0.081
 time-0.081
 forgotten-0.077
 business-0.075
Token forests
Token ID17039
Feature activation-6.31e-02

Pos logit contributions
rots+0.235
 globe+0.206
ang+0.203
 Sky+0.198
ites+0.196
Neg logit contributions
 quit-0.156
 disob-0.155
 punish-0.150
 grown-0.149
 gotten-0.147
Token.
Token ID13
Feature activation-4.34e-02

Pos logit contributions
fly+0.140
 Sky+0.139
 tweet+0.138
 Fly+0.130
 Saturn+0.130
Neg logit contributions
gh-0.074
 forgotten-0.060
 purpose-0.057
 business-0.053
 time-0.052
Token One
Token ID1881
Feature activation-3.41e-02

Pos logit contributions
fly+0.096
flies+0.085
 tweet+0.085
fish+0.084
 Sky+0.083
Neg logit contributions
 purpose-0.053
gh-0.050
urance-0.048
 time-0.047
array-0.047
Token day
Token ID1110
Feature activation+9.80e-03

Pos logit contributions
 Sky+0.079
fly+0.078
 tweet+0.073
flies+0.073
als+0.072
Neg logit contributions
gh-0.038
 time-0.037
 purpose-0.036
 punishment-0.033
 forgotten-0.032
Token,
Token ID11
Feature activation-3.86e-02

Pos logit contributions
gh+0.020
array+0.018
 purpose+0.018
omm+0.018
 behaviour+0.018
Neg logit contributions
 Sky-0.015
fly-0.015
 tweet-0.014
 higher-0.013
 globe-0.013
Token adventure
Token ID8855
Feature activation+1.36e-02

Pos logit contributions
gh+0.061
 purpose+0.055
omm+0.055
 Sister+0.055
 behaviour+0.055
Neg logit contributions
 Sky-0.053
fly-0.051
 Jupiter-0.048
 Saturn-0.048
 tweet-0.047
Token.
Token ID13
Feature activation+2.80e-02

Pos logit contributions
gh+0.031
omm+0.029
 purpose+0.029
 behaviour+0.029
 Sister+0.029
Neg logit contributions
 Sky-0.017
fly-0.017
 below-0.014
 tweet-0.014
 higher-0.014
Token He
Token ID679
Feature activation-3.48e-04

Pos logit contributions
gh+0.069
 purpose+0.061
 behaviour+0.061
 forgotten+0.060
addy+0.060
Neg logit contributions
 Sky-0.033
fly-0.030
 below-0.028
 globe-0.027
 Jupiter-0.026
Token watched
Token ID7342
Feature activation-3.00e-03

Pos logit contributions
 Sky+0.001
fly+0.001
 tweet+0.001
 globe+0.001
 Saturn+0.001
Neg logit contributions
gh-0.001
 purpose-0.001
array-0.001
 behaviour-0.001
 Sister-0.001
Token the
Token ID262
Feature activation-1.55e-02

Pos logit contributions
fly+0.005
 Sky+0.005
 tweet+0.004
 globe+0.004
 Saturn+0.004
Neg logit contributions
gh-0.006
 purpose-0.006
array-0.005
 behaviour-0.005
 Sister-0.005
Token birds
Token ID10087
Feature activation-1.27e-01

Pos logit contributions
fly+0.022
 Sky+0.022
 tweet+0.019
 Saturn+0.018
flies+0.018
Neg logit contributions
gh-0.032
 purpose-0.031
array-0.030
 behaviour-0.030
 Sister-0.029
Token flying
Token ID7348
Feature activation-6.75e-02

Pos logit contributions
 globe+0.202
 Sky+0.197
 Dragon+0.190
 Asia+0.183
 Lia+0.178
Neg logit contributions
 time-0.150
 cleaning-0.136
 grown-0.136
array-0.135
 shaving-0.134
Token in
Token ID287
Feature activation-1.71e-02

Pos logit contributions
 tweet+0.094
 Sky+0.091
fly+0.089
 Dragon+0.086
bug+0.080
Neg logit contributions
gh-0.118
 purpose-0.116
 time-0.116
 forgotten-0.101
anked-0.099
Token the
Token ID262
Feature activation-3.03e-02

Pos logit contributions
fly+0.025
 Sky+0.024
 tweet+0.022
flies+0.021
als+0.020
Neg logit contributions
gh-0.035
 purpose-0.034
array-0.033
 mischief-0.032
 Sister-0.031
Token sky
Token ID6766
Feature activation-5.26e-02

Pos logit contributions
fly+0.050
 Sky+0.048
 tweet+0.044
flies+0.041
 globe+0.040
Neg logit contributions
array-0.054
 purpose-0.054
gh-0.053
 forgotten-0.051
 Sister-0.051
Token and
Token ID290
Feature activation-8.75e-02

Pos logit contributions
 Sky+0.117
fly+0.111
 tweet+0.111
 globe+0.107
 Saturn+0.103
Neg logit contributions
gh-0.058
 purpose-0.056
 time-0.050
em-0.048
array-0.047
Token friends
Token ID2460
Feature activation-2.56e-02

Pos logit contributions
fly+0.037
 Sky+0.035
 tweet+0.033
 globe+0.031
 Jupiter+0.031
Neg logit contributions
gh-0.033
 purpose-0.030
 forgotten-0.029
 mischief-0.029
 Sister-0.028
Token.
Token ID13
Feature activation-1.27e-02

Pos logit contributions
fly+0.057
 Sky+0.055
 tweet+0.052
 Saturn+0.051
 Jupiter+0.050
Neg logit contributions
gh-0.034
 purpose-0.029
 behaviour-0.029
 forgotten-0.028
 Sister-0.027
Token They
Token ID1119
Feature activation-4.99e-02

Pos logit contributions
fly+0.021
 Sky+0.019
 tweet+0.019
 globe+0.017
 flap+0.017
Neg logit contributions
gh-0.024
 purpose-0.023
 Sister-0.022
array-0.022
urance-0.022
Token watch
Token ID2342
Feature activation-5.68e-02

Pos logit contributions
fly+0.094
 Sky+0.088
flies+0.080
 Jupiter+0.079
 globe+0.078
Neg logit contributions
gh-0.077
 forgotten-0.076
 disob-0.073
 punish-0.071
 Sister-0.070
Token the
Token ID262
Feature activation-3.40e-02

Pos logit contributions
fly+0.108
 Sky+0.091
 globe+0.091
stack+0.090
flies+0.087
Neg logit contributions
gh-0.090
 purpose-0.088
 time-0.083
 mischief-0.080
 Tommy-0.079
Token birds
Token ID10087
Feature activation-1.25e-01

Pos logit contributions
fly+0.055
 Sky+0.054
 tweet+0.047
 Saturn+0.045
 Jupiter+0.045
Neg logit contributions
gh-0.063
 purpose-0.061
 behaviour-0.059
 punishment-0.058
 mischief-0.057
Token and
Token ID290
Feature activation-1.03e-01

Pos logit contributions
 globe+0.197
 Sky+0.181
 Asia+0.179
 Dragon+0.178
 Saturn+0.178
Neg logit contributions
 time-0.162
gh-0.153
 punish-0.153
 grown-0.144
urance-0.144
Token the
Token ID262
Feature activation-4.47e-02

Pos logit contributions
fly+0.202
pole+0.200
 globe+0.198
rots+0.192
pp+0.181
Neg logit contributions
 mischief-0.109
 Sister-0.108
 grown-0.104
 trouble-0.102
 Tommy-0.100
Token flowers
Token ID12734
Feature activation-1.03e-01

Pos logit contributions
fly+0.062
 Sky+0.060
 Saturn+0.051
 Jupiter+0.050
 Fly+0.048
Neg logit contributions
gh-0.091
 purpose-0.089
 behaviour-0.085
 mischief-0.084
array-0.082
Token.
Token ID13
Feature activation-1.93e-02

Pos logit contributions
fly+0.214
 Sky+0.214
 Jupiter+0.208
 Fly+0.206
 Dragon+0.206
Neg logit contributions
gh-0.091
 time-0.084
 purpose-0.078
 forgotten-0.074
 grown-0.069
Token They
Token ID1119
Feature activation-7.72e-02

Pos logit contributions
fly+0.036
 Sky+0.032
 tweet+0.032
 flap+0.030
 globe+0.030
Neg logit contributions
gh-0.032
 purpose-0.032
 Sister-0.031
urance-0.030
array-0.029
Token Tom
Token ID4186
Feature activation-3.09e-02

Pos logit contributions
gh+0.094
 behaviour+0.087
illy+0.084
 supposed+0.084
 purpose+0.083
Neg logit contributions
 Sky-0.049
 below-0.041
 higher-0.041
 Saturn-0.038
 mountains-0.036
Token watched
Token ID7342
Feature activation-2.08e-02

Pos logit contributions
fly+0.062
 Sky+0.061
 tweet+0.057
 globe+0.054
 Saturn+0.053
Neg logit contributions
gh-0.046
 forgotten-0.041
array-0.041
 Sister-0.039
 disob-0.038
Token it
Token ID340
Feature activation-7.40e-02

Pos logit contributions
fly+0.035
 Sky+0.032
 tweet+0.030
flies+0.029
 globe+0.029
Neg logit contributions
gh-0.041
 purpose-0.038
 behaviour-0.036
 punishment-0.036
 mischief-0.035
Token go
Token ID467
Feature activation-3.91e-02

Pos logit contributions
fly+0.122
 Sky+0.116
 globe+0.114
als+0.113
rots+0.111
Neg logit contributions
gh-0.128
 purpose-0.114
urance-0.108
 punishment-0.106
 behaviour-0.106
Token higher
Token ID2440
Feature activation-6.33e-02

Pos logit contributions
fly+0.058
 Sky+0.058
 tweet+0.054
bug+0.049
flies+0.049
Neg logit contributions
gh-0.080
 forgotten-0.071
 purpose-0.071
array-0.068
 business-0.064
Token and
Token ID290
Feature activation-1.23e-01

Pos logit contributions
 tweet+0.131
 Sky+0.128
fly+0.124
bug+0.113
 Fly+0.111
Neg logit contributions
 purpose-0.087
gh-0.078
 time-0.077
array-0.072
anked-0.071
Token higher
Token ID2440
Feature activation-5.00e-02

Pos logit contributions
rots+0.182
 Sky+0.175
fly+0.174
 globe+0.162
stack+0.157
Neg logit contributions
 forgotten-0.197
 complained-0.175
 quit-0.171
 disaster-0.168
array-0.168
Token until
Token ID1566
Feature activation-5.75e-03

Pos logit contributions
 Sky+0.105
 tweet+0.104
fly+0.103
bug+0.094
 Jupiter+0.094
Neg logit contributions
 purpose-0.062
gh-0.062
anked-0.054
 forgotten-0.052
 behaviour-0.051
Token he
Token ID339
Feature activation-3.07e-02

Pos logit contributions
fly+0.008
 Sky+0.008
 tweet+0.007
 globe+0.007
flies+0.007
Neg logit contributions
gh-0.012
 purpose-0.012
array-0.011
 forgotten-0.011
 Sister-0.011
Token could
Token ID714
Feature activation-3.83e-02

Pos logit contributions
 Sky+0.061
fly+0.061
 tweet+0.057
 globe+0.055
 deer+0.053
Neg logit contributions
gh-0.048
 forgotten-0.043
addy-0.038
 purpose-0.038
 happened-0.038
Token not
Token ID407
Feature activation-4.44e-02

Pos logit contributions
fly+0.073
 Sky+0.072
als+0.066
 globe+0.066
 tweet+0.065
Neg logit contributions
gh-0.061
 forgotten-0.056
 purpose-0.055
 supposed-0.051
urance-0.050
Token dad
Token ID9955
Feature activation-2.58e-02

Pos logit contributions
gh+0.041
array+0.038
 purpose+0.035
urance+0.035
anked+0.035
Neg logit contributions
 Sky-0.045
fly-0.041
 deer-0.040
 tweet-0.040
 below-0.040
Token would
Token ID561
Feature activation-3.12e-02

Pos logit contributions
 Sky+0.052
fly+0.050
 tweet+0.046
 Saturn+0.045
 Jupiter+0.044
Neg logit contributions
gh-0.040
 forgotten-0.036
 Sister-0.035
 behaviour-0.034
array-0.034
Token watch
Token ID2342
Feature activation-7.27e-02

Pos logit contributions
 Sky+0.065
fly+0.064
 Saturn+0.057
 tweet+0.057
 globe+0.057
Neg logit contributions
gh-0.044
 purpose-0.041
urance-0.038
array-0.037
 forgotten-0.037
Token the
Token ID262
Feature activation-1.58e-02

Pos logit contributions
fly+0.128
 globe+0.118
 Sky+0.115
fish+0.109
als+0.109
Neg logit contributions
 purpose-0.112
gh-0.112
 mischief-0.106
 behaviour-0.104
 time-0.101
Token baby
Token ID5156
Feature activation-8.79e-02

Pos logit contributions
fly+0.029
 Sky+0.028
 tweet+0.026
 Saturn+0.025
 globe+0.024
Neg logit contributions
 purpose-0.026
gh-0.026
 behaviour-0.025
 Sister-0.024
 mischief-0.024
Token birds
Token ID10087
Feature activation-1.23e-01

Pos logit contributions
 Saturn+0.134
 Jupiter+0.123
 globe+0.118
fly+0.118
 Mars+0.117
Neg logit contributions
 behaviour-0.143
 trouble-0.132
 punish-0.131
 forgotten-0.129
 mischief-0.128
Token grow
Token ID1663
Feature activation-5.92e-02

Pos logit contributions
 globe+0.182
 Saturn+0.178
 Asia+0.174
 Sky+0.167
 Dragon+0.166
Neg logit contributions
 punish-0.176
gh-0.166
urance-0.163
 behaviour-0.162
 forgotten-0.160
Token from
Token ID422
Feature activation-3.73e-02

Pos logit contributions
fly+0.088
 Sky+0.086
 globe+0.082
stack+0.081
 tweet+0.080
Neg logit contributions
gh-0.107
 purpose-0.101
 behaviour-0.100
 mischief-0.099
anked-0.098
Token a
Token ID257
Feature activation-3.17e-02

Pos logit contributions
 Sky+0.064
fly+0.064
 tweet+0.057
als+0.054
flies+0.053
Neg logit contributions
 purpose-0.065
 mischief-0.064
 trouble-0.061
 forgotten-0.061
gh-0.061
Token safe
Token ID3338
Feature activation-1.83e-02

Pos logit contributions
fly+0.051
 Sky+0.050
flies+0.045
 tweet+0.044
 Saturn+0.043
Neg logit contributions
gh-0.060
 purpose-0.058
 forgotten-0.057
 mischief-0.053
 Sister-0.053
Token distance
Token ID5253
Feature activation-2.30e-02

Pos logit contributions
 Sky+0.031
fly+0.031
 tweet+0.028
 Saturn+0.027
 Jupiter+0.026
Neg logit contributions
 purpose-0.032
gh-0.032
array-0.030
 forgotten-0.030
 behaviour-0.030
Token up
Token ID510
Feature activation-3.38e-02

Pos logit contributions
fly+0.081
 Sky+0.075
 Jupiter+0.071
flies+0.068
 tweet+0.068
Neg logit contributions
gh-0.089
 purpose-0.082
 time-0.075
 trouble-0.074
 behaviour-0.072
Token and
Token ID290
Feature activation-1.07e-01

Pos logit contributions
fly+0.065
 Sky+0.060
 tweet+0.058
flies+0.055
 globe+0.054
Neg logit contributions
gh-0.052
 purpose-0.051
 trouble-0.047
 forgotten-0.046
 mischief-0.045
Token down
Token ID866
Feature activation-3.08e-02

Pos logit contributions
fly+0.175
 Jupiter+0.170
rots+0.165
 Saturn+0.154
 Sky+0.153
Neg logit contributions
 forgotten-0.156
 Tommy-0.149
 complained-0.149
 whine-0.142
 Sister-0.138
Token.
Token ID13
Feature activation-3.80e-02

Pos logit contributions
fly+0.069
 Sky+0.066
 tweet+0.064
 Jupiter+0.060
flies+0.060
Neg logit contributions
gh-0.037
 purpose-0.037
 trouble-0.034
 forgotten-0.034
 mischief-0.032
Token The
Token ID383
Feature activation-4.23e-02

Pos logit contributions
fly+0.084
 tweet+0.075
pole+0.072
flies+0.071
 globe+0.070
Neg logit contributions
 Sister-0.051
 purpose-0.050
 mischief-0.048
 Tommy-0.047
urance-0.045
Token bird
Token ID6512
Feature activation-1.22e-01

Pos logit contributions
fly+0.060
 Sky+0.060
 Sparkle+0.053
 Jupiter+0.051
 Saturn+0.051
Neg logit contributions
 purpose-0.081
 trouble-0.079
 punishment-0.078
gh-0.076
 accident-0.075
Token sits
Token ID10718
Feature activation-6.77e-02

Pos logit contributions
stack+0.215
 globe+0.204
fly+0.203
 Jupiter+0.197
chard+0.192
Neg logit contributions
 time-0.148
 mistake-0.136
 complained-0.136
 trouble-0.135
 hated-0.134
Token on
Token ID319
Feature activation-1.30e-02

Pos logit contributions
fly+0.130
 Sky+0.129
 tweet+0.117
flies+0.115
 Jupiter+0.112
Neg logit contributions
gh-0.093
 forgotten-0.085
dad-0.085
 badly-0.084
addy-0.083
Token the
Token ID262
Feature activation-1.36e-02

Pos logit contributions
fly+0.020
 Sky+0.019
 tweet+0.017
flies+0.016
 globe+0.016
Neg logit contributions
gh-0.027
 purpose-0.025
 Sister-0.024
array-0.024
 mischief-0.024
Token middle
Token ID3504
Feature activation-2.55e-02

Pos logit contributions
fly+0.023
 Sky+0.022
 tweet+0.019
flies+0.019
 Jupiter+0.018
Neg logit contributions
gh-0.027
 purpose-0.024
array-0.023
 forgotten-0.023
 behaviour-0.023
Token and
Token ID290
Feature activation-1.11e-01

Pos logit contributions
fly+0.048
 Sky+0.047
 tweet+0.043
 Jupiter+0.041
flies+0.040
Neg logit contributions
gh-0.043
 purpose-0.039
em-0.037
 behaviour-0.037
addy-0.037
Token the
Token ID262
Feature activation-1.48e-02

Pos logit contributions
gh+0.011
 purpose+0.010
array+0.010
omm+0.010
 behaviour+0.010
Neg logit contributions
 Sky-0.008
fly-0.008
 tweet-0.007
 Saturn-0.007
 below-0.007
Token other
Token ID584
Feature activation-1.68e-02

Pos logit contributions
fly+0.018
 Sky+0.018
 tweet+0.014
 Saturn+0.014
 Jupiter+0.014
Neg logit contributions
gh-0.034
 purpose-0.033
 behaviour-0.032
array-0.031
 forgotten-0.031
Token birds
Token ID10087
Feature activation-8.42e-02

Pos logit contributions
fly+0.023
 Sky+0.023
 tweet+0.019
 Saturn+0.019
 Jupiter+0.019
Neg logit contributions
gh-0.034
 purpose-0.033
 behaviour-0.033
 mischief-0.032
array-0.032
Token.
Token ID13
Feature activation-2.95e-02

Pos logit contributions
 Sky+0.180
fly+0.178
 Jupiter+0.177
 Saturn+0.176
 globe+0.174
Neg logit contributions
gh-0.075
 forgotten-0.072
 supposed-0.068
 behaviour-0.065
 disob-0.063
Token The
Token ID383
Feature activation-2.16e-02

Pos logit contributions
fly+0.062
 tweet+0.054
flies+0.053
 Sky+0.053
 globe+0.053
Neg logit contributions
 purpose-0.040
gh-0.040
 Sister-0.040
 mischief-0.039
 Tommy-0.039
Token birds
Token ID10087
Feature activation-1.22e-01

Pos logit contributions
fly+0.028
 Sky+0.027
flies+0.022
 Saturn+0.022
 Jupiter+0.022
Neg logit contributions
gh-0.046
 purpose-0.046
 punishment-0.045
 behaviour-0.044
 forgotten-0.044
Token sing
Token ID1702
Feature activation-6.38e-02

Pos logit contributions
 Jupiter+0.189
fly+0.187
 Sky+0.183
 Saturn+0.182
 Pluto+0.177
Neg logit contributions
 punish-0.162
 disob-0.160
 time-0.158
 grown-0.154
 pain-0.147
Token and
Token ID290
Feature activation-9.19e-02

Pos logit contributions
fly+0.117
stack+0.113
 globe+0.111
flies+0.104
 Sky+0.101
Neg logit contributions
gh-0.097
 purpose-0.087
 time-0.084
 behaviour-0.080
 punishment-0.078
Token thank
Token ID5875
Feature activation-2.87e-02

Pos logit contributions
pole+0.154
 globe+0.140
rots+0.138
fly+0.134
ite+0.127
Neg logit contributions
 trouble-0.126
 pain-0.125
 punish-0.125
 purpose-0.125
 mischief-0.124
Token the
Token ID262
Feature activation-1.33e-02

Pos logit contributions
fly+0.061
 Sky+0.054
 globe+0.053
 tweet+0.051
flies+0.051
Neg logit contributions
gh-0.045
 purpose-0.037
 Sister-0.037
 behaviour-0.037
em-0.036
Token man
Token ID582
Feature activation-6.16e-02

Pos logit contributions
fly+0.017
 Sky+0.017
 Saturn+0.014
 tweet+0.014
 Jupiter+0.013
Neg logit contributions
gh-0.029
 purpose-0.028
 behaviour-0.027
array-0.027
 Sister-0.027
Token every
Token ID790
Feature activation-3.18e-02

Pos logit contributions
stack+0.218
fly+0.217
pp+0.208
pole+0.196
cher+0.191
Neg logit contributions
 trouble-0.127
 time-0.118
 happened-0.117
 purpose-0.111
 mistakes-0.111
Token night
Token ID1755
Feature activation-4.42e-02

Pos logit contributions
 Sky+0.063
fly+0.062
flies+0.058
 tweet+0.057
als+0.056
Neg logit contributions
gh-0.046
 purpose-0.043
 punishment-0.042
 behaviour-0.041
 forgotten-0.040
Token to
Token ID284
Feature activation-7.26e-02

Pos logit contributions
 Sky+0.080
fly+0.078
 tweet+0.076
 globe+0.072
 flap+0.071
Neg logit contributions
gh-0.071
 purpose-0.067
 forgotten-0.066
array-0.065
 behaviour-0.065
Token watch
Token ID2342
Feature activation-6.82e-02

Pos logit contributions
fly+0.143
 globe+0.139
als+0.137
 Sky+0.136
pole+0.135
Neg logit contributions
 purpose-0.095
gh-0.087
 mischief-0.085
 punish-0.085
 grown-0.081
Token the
Token ID262
Feature activation-6.11e-02

Pos logit contributions
fly+0.122
stack+0.115
 Sky+0.112
 globe+0.112
 tweet+0.105
Neg logit contributions
gh-0.103
 purpose-0.100
 behaviour-0.094
 mischief-0.093
 punishment-0.089
Token moon
Token ID8824
Feature activation-1.22e-01

Pos logit contributions
fly+0.089
 Sky+0.085
stack+0.081
bug+0.081
 Flor+0.075
Neg logit contributions
 purpose-0.115
 time-0.111
 happened-0.106
 mischief-0.106
array-0.106
Token together
Token ID1978
Feature activation-4.40e-02

Pos logit contributions
 Sky+0.195
 Asia+0.189
 globe+0.182
 Bloom+0.182
 Fly+0.179
Neg logit contributions
gh-0.173
 behaviour-0.147
 purpose-0.147
urance-0.141
 time-0.141
Token.
Token ID13
Feature activation-2.97e-02

Pos logit contributions
 Sky+0.094
fly+0.092
 tweet+0.087
 globe+0.085
 Saturn+0.084
Neg logit contributions
gh-0.058
 purpose-0.051
 behaviour-0.047
 forgotten-0.046
urance-0.044
Token They
Token ID1119
Feature activation-6.65e-02

Pos logit contributions
fly+0.053
 Sky+0.048
 flap+0.046
 tweet+0.046
 higher+0.046
Neg logit contributions
gh-0.050
 purpose-0.049
omm-0.048
urance-0.047
 Sister-0.046
Token became
Token ID2627
Feature activation-4.51e-02

Pos logit contributions
 Sky+0.143
fly+0.139
 higher+0.139
 aboard+0.134
 south+0.133
Neg logit contributions
 forgotten-0.076
gh-0.073
 disagreed-0.066
 disob-0.064
 been-0.064
Token the
Token ID262
Feature activation-4.31e-02

Pos logit contributions
fly+0.102
 tweet+0.091
 Sky+0.090
 globe+0.089
als+0.088
Neg logit contributions
 purpose-0.056
 Sister-0.055
 mischief-0.054
gh-0.053
 supposed-0.052
Token\n
Token ID198
Feature activation-1.05e-02

Pos logit contributions
fly+0.003
 Sky+0.003
 tweet+0.002
 globe+0.002
 Saturn+0.002
Neg logit contributions
gh-0.004
 purpose-0.004
array-0.004
 Sister-0.004
 behaviour-0.004
Token\n
Token ID198
Feature activation+9.33e-03

Pos logit contributions
fly+0.016
 Sky+0.015
 tweet+0.014
flies+0.013
 globe+0.013
Neg logit contributions
gh-0.022
 purpose-0.020
 Sister-0.019
array-0.019
addy-0.019
TokenThey
Token ID2990
Feature activation-1.01e-02

Pos logit contributions
gh+0.017
 purpose+0.015
 behaviour+0.015
array+0.015
omm+0.015
Neg logit contributions
 Sky-0.016
fly-0.016
 tweet-0.015
 Saturn-0.014
 globe-0.014
Token watched
Token ID7342
Feature activation-2.52e-02

Pos logit contributions
fly+0.018
 Sky+0.017
 tweet+0.016
 globe+0.015
 Saturn+0.015
Neg logit contributions
gh-0.018
 purpose-0.016
array-0.016
 forgotten-0.016
 Sister-0.016
Token the
Token ID262
Feature activation-2.51e-02

Pos logit contributions
fly+0.042
 Sky+0.038
 globe+0.036
 tweet+0.036
flies+0.034
Neg logit contributions
gh-0.047
 purpose-0.045
 Sister-0.044
 behaviour-0.043
 punishment-0.043
Token comet
Token ID31733
Feature activation-1.22e-01

Pos logit contributions
fly+0.040
 Sky+0.040
 tweet+0.035
 Saturn+0.033
 Jupiter+0.033
Neg logit contributions
gh-0.047
 purpose-0.046
 behaviour-0.044
 Sister-0.043
 punishment-0.043
Token for
Token ID329
Feature activation-2.47e-03

Pos logit contributions
 globe+0.175
 Sky+0.172
 Asia+0.165
fly+0.164
als+0.162
Neg logit contributions
gh-0.192
 behaviour-0.181
 purpose-0.178
 efforts-0.173
urance-0.169
Token a
Token ID257
Feature activation-5.35e-03

Pos logit contributions
fly+0.004
 Sky+0.004
 tweet+0.004
 globe+0.003
 Saturn+0.003
Neg logit contributions
gh-0.005
 purpose-0.004
array-0.004
 behaviour-0.004
 Sister-0.004
Token while
Token ID981
Feature activation+2.83e-03

Pos logit contributions
fly+0.010
 Sky+0.010
 tweet+0.009
 Saturn+0.008
 below+0.008
Neg logit contributions
gh-0.009
 purpose-0.009
array-0.008
 behaviour-0.008
 Sister-0.008
Token,
Token ID11
Feature activation-3.69e-02

Pos logit contributions
gh+0.006
 purpose+0.005
array+0.005
 behaviour+0.005
omm+0.005
Neg logit contributions
 Sky-0.004
fly-0.004
 tweet-0.004
 Saturn-0.004
 globe-0.004
Token until
Token ID1566
Feature activation-1.94e-02

Pos logit contributions
fly+0.068
 Sky+0.067
 tweet+0.060
als+0.059
 globe+0.059
Neg logit contributions
 forgotten-0.058
gh-0.057
array-0.057
 Sister-0.057
 mischief-0.054
Token scary
Token ID14343
Feature activation-3.37e-02

Pos logit contributions
fly+0.218
als+0.208
ites+0.204
stack+0.203
flies+0.202
Neg logit contributions
 unfair-0.106
 forgotten-0.098
 trouble-0.095
 expensive-0.092
 embarrassing-0.088
Token!"
Token ID2474
Feature activation+3.87e-03

Pos logit contributions
fly+0.064
 Sky+0.063
 tweet+0.060
 Saturn+0.057
 globe+0.056
Neg logit contributions
gh-0.051
 purpose-0.048
 behaviour-0.045
array-0.044
 supposed-0.044
Token They
Token ID1119
Feature activation-2.81e-02

Pos logit contributions
gh+0.008
 purpose+0.008
array+0.008
 behaviour+0.007
omm+0.007
Neg logit contributions
 Sky-0.006
fly-0.005
 tweet-0.005
 Saturn-0.005
 globe-0.005
Token swing
Token ID9628
Feature activation-7.79e-02

Pos logit contributions
fly+0.051
 Sky+0.049
 globe+0.044
 Saturn+0.042
 Jupiter+0.042
Neg logit contributions
gh-0.045
 Sister-0.043
 forgotten-0.043
 disob-0.042
array-0.042
Token higher
Token ID2440
Feature activation-7.02e-02

Pos logit contributions
fly+0.127
 Dragon+0.123
 reef+0.116
fish+0.114
 globe+0.112
Neg logit contributions
anked-0.109
 trouble-0.105
gh-0.104
 Tommy-0.102
 purpose-0.099
Token and
Token ID290
Feature activation-1.21e-01

Pos logit contributions
 Sky+0.146
 tweet+0.144
fly+0.140
 Sparkle+0.138
 Dragon+0.137
Neg logit contributions
 purpose-0.079
anked-0.079
 time-0.074
gh-0.070
em-0.070
Token higher
Token ID2440
Feature activation-5.93e-02

Pos logit contributions
rots+0.179
fly+0.175
amer+0.161
 globe+0.158
 Dragon+0.154
Neg logit contributions
 forgotten-0.186
 complained-0.172
 refused-0.168
 disagreed-0.164
 Tommy-0.163
Token,
Token ID11
Feature activation-7.44e-02

Pos logit contributions
 Sky+0.124
 tweet+0.122
fly+0.120
 Jupiter+0.115
 Sparkle+0.115
Neg logit contributions
 purpose-0.069
anked-0.065
gh-0.064
 time-0.058
em-0.058
Token pretending
Token ID23662
Feature activation-2.40e-02

Pos logit contributions
fly+0.142
 Sky+0.136
rots+0.130
 Dragon+0.126
 globe+0.125
Neg logit contributions
 forgotten-0.108
array-0.102
 accident-0.098
gh-0.096
 supposed-0.095
Token to
Token ID284
Feature activation-6.42e-02

Pos logit contributions
fly+0.046
 Sky+0.045
 tweet+0.042
 Saturn+0.040
 globe+0.040
Neg logit contributions
gh-0.040
 purpose-0.035
 forgotten-0.033
 behaviour-0.033
array-0.032
Token fly
Token ID6129
Feature activation-7.52e-02

Pos logit contributions
 globe+0.112
rots+0.111
 Sky+0.110
fly+0.108
als+0.108
Neg logit contributions
 purpose-0.095
gh-0.092
 disob-0.083
 reason-0.082
 mischief-0.081
Token,
Token ID11
Feature activation-3.05e-02

Pos logit contributions
gh+0.024
array+0.023
 purpose+0.022
 behaviour+0.022
 forgotten+0.022
Neg logit contributions
 Sky-0.018
fly-0.017
 tweet-0.015
 Bloom-0.014
 globe-0.014
Token Lily
Token ID20037
Feature activation-1.69e-02

Pos logit contributions
fly+0.075
 Sky+0.065
 globe+0.065
 tweet+0.064
flies+0.064
Neg logit contributions
 purpose-0.035
gh-0.034
 punishment-0.033
 Sister-0.032
 forgotten-0.032
Token would
Token ID561
Feature activation-1.99e-02

Pos logit contributions
 Sky+0.033
fly+0.032
 tweet+0.030
 globe+0.028
 Saturn+0.028
Neg logit contributions
gh-0.028
 forgotten-0.024
array-0.024
 purpose-0.024
 Sister-0.024
Token visit
Token ID3187
Feature activation-6.43e-02

Pos logit contributions
 Sky+0.039
fly+0.038
 globe+0.035
 below+0.034
 tweet+0.034
Neg logit contributions
gh-0.031
 purpose-0.028
array-0.027
 behaviour-0.027
urance-0.027
Token the
Token ID262
Feature activation-2.62e-02

Pos logit contributions
fly+0.136
 globe+0.125
stack+0.116
ite+0.112
pole+0.111
Neg logit contributions
 purpose-0.088
gh-0.082
 mischief-0.081
 forgotten-0.080
 trouble-0.078
Token beetle
Token ID45489
Feature activation-1.21e-01

Pos logit contributions
fly+0.041
 Sky+0.039
 tweet+0.033
flies+0.033
 Jupiter+0.032
Neg logit contributions
gh-0.053
 purpose-0.053
 forgotten-0.050
 behaviour-0.050
 mischief-0.050
Token every
Token ID790
Feature activation-1.65e-02

Pos logit contributions
fly+0.199
 globe+0.195
 Sky+0.188
 Saturn+0.186
 Pluto+0.172
Neg logit contributions
gh-0.169
 time-0.159
 business-0.158
 behaviour-0.152
 forgotten-0.149
Token day
Token ID1110
Feature activation-4.13e-02

Pos logit contributions
 Sky+0.033
fly+0.033
 tweet+0.030
flies+0.030
 globe+0.029
Neg logit contributions
gh-0.025
 purpose-0.023
 behaviour-0.022
 forgotten-0.022
 mischief-0.022
Token and
Token ID290
Feature activation-7.07e-02

Pos logit contributions
fly+0.077
 Sky+0.075
 tweet+0.071
 globe+0.070
 flap+0.065
Neg logit contributions
gh-0.068
 forgotten-0.061
 behaviour-0.059
 purpose-0.059
 supposed-0.059
Token make
Token ID787
Feature activation-4.43e-02

Pos logit contributions
 globe+0.130
fly+0.122
 Sky+0.122
ites+0.118
pole+0.116
Neg logit contributions
 forgotten-0.101
 purpose-0.093
gh-0.093
 disob-0.092
 punishment-0.092
Token sure
Token ID1654
Feature activation-3.27e-02

Pos logit contributions
fly+0.096
 globe+0.089
stack+0.084
 Sky+0.084
 tweet+0.080
Neg logit contributions
gh-0.067
 mischief-0.064
 purpose-0.064
 trouble-0.058
 behaviour-0.058
Token stars
Token ID5788
Feature activation-5.55e-02

Pos logit contributions
gh+0.045
urance+0.041
omm+0.040
 behaviour+0.040
array+0.040
Neg logit contributions
 Sky-0.034
fly-0.031
 tweet-0.029
 deer-0.028
 Saturn-0.027
Token.
Token ID13
Feature activation-8.64e-03

Pos logit contributions
 Sky+0.121
 tweet+0.120
fly+0.116
 Saturn+0.116
 Fly+0.113
Neg logit contributions
gh-0.065
 purpose-0.053
 forgotten-0.051
 supposed-0.051
 time-0.048
Token She
Token ID1375
Feature activation-2.40e-02

Pos logit contributions
fly+0.016
 Sky+0.015
 tweet+0.014
 globe+0.013
flies+0.013
Neg logit contributions
gh-0.015
 purpose-0.014
omm-0.014
array-0.014
 Sister-0.013
Token would
Token ID561
Feature activation-5.07e-02

Pos logit contributions
 Sky+0.047
fly+0.047
 tweet+0.044
 higher+0.042
 below+0.042
Neg logit contributions
gh-0.040
 forgotten-0.033
 purpose-0.032
array-0.031
 supposed-0.031
Token watch
Token ID2342
Feature activation-1.12e-01

Pos logit contributions
 Sky+0.095
fly+0.093
flies+0.085
 below+0.083
 globe+0.083
Neg logit contributions
gh-0.080
 purpose-0.070
 forgotten-0.065
 disob-0.064
array-0.064
Token them
Token ID606
Feature activation-1.21e-01

Pos logit contributions
stack+0.238
ite+0.226
fly+0.220
 Follow+0.219
bug+0.215
Neg logit contributions
gh-0.135
 purpose-0.133
 time-0.126
 grown-0.115
 mischief-0.101
Token tw
Token ID665
Feature activation-5.18e-02

Pos logit contributions
stack+0.196
 Follow+0.180
 globe+0.178
 Sky+0.168
fly+0.167
Neg logit contributions
gh-0.169
 time-0.167
 purpose-0.155
 grown-0.153
 chores-0.152
Tokeninkle
Token ID19894
Feature activation-5.78e-02

Pos logit contributions
 Sky+0.117
fly+0.113
 tweet+0.111
 globe+0.108
 Dragon+0.106
Neg logit contributions
gh-0.068
addy-0.056
 purpose-0.055
anked-0.055
omm-0.052
Token in
Token ID287
Feature activation-1.53e-02

Pos logit contributions
 Sky+0.086
stack+0.085
fly+0.085
 tweet+0.083
 globe+0.079
Neg logit contributions
gh-0.102
 purpose-0.098
 time-0.090
omm-0.088
 behaviour-0.088
Token the
Token ID262
Feature activation-3.85e-02

Pos logit contributions
fly+0.027
 Sky+0.026
 tweet+0.023
flies+0.023
 globe+0.022
Neg logit contributions
gh-0.027
 purpose-0.027
 mischief-0.025
array-0.025
 Sister-0.024
Token sky
Token ID6766
Feature activation-6.14e-02

Pos logit contributions
fly+0.066
 Sky+0.065
flies+0.058
 tweet+0.056
stack+0.055
Neg logit contributions
gh-0.067
 purpose-0.066
array-0.064
 mischief-0.062
 Sister-0.061
Token.
Token ID13
Feature activation-1.69e-02

Pos logit contributions
fly+0.002
 Sky+0.002
 tweet+0.002
 globe+0.002
 Saturn+0.002
Neg logit contributions
gh-0.002
 purpose-0.002
array-0.002
 behaviour-0.002
 Sister-0.002
Token He
Token ID679
Feature activation-3.05e-02

Pos logit contributions
fly+0.034
 Sky+0.032
 tweet+0.031
flies+0.030
 globe+0.029
Neg logit contributions
gh-0.025
 purpose-0.024
omm-0.024
 Sister-0.024
array-0.023
Token would
Token ID561
Feature activation-4.51e-02

Pos logit contributions
 Sky+0.060
fly+0.057
 tweet+0.056
 higher+0.052
 below+0.051
Neg logit contributions
gh-0.049
 forgotten-0.043
 purpose-0.040
 Sister-0.040
array-0.039
Token make
Token ID787
Feature activation-6.35e-02

Pos logit contributions
 Sky+0.079
fly+0.078
als+0.071
flies+0.070
 globe+0.068
Neg logit contributions
gh-0.075
 purpose-0.069
 forgotten-0.065
array-0.065
urance-0.065
Token the
Token ID262
Feature activation-7.02e-02

Pos logit contributions
fly+0.134
stack+0.119
bug+0.116
 tweet+0.114
 globe+0.113
Neg logit contributions
gh-0.098
 purpose-0.093
 mischief-0.091
 trouble-0.087
 business-0.082
Token rocket
Token ID10701
Feature activation-1.21e-01

Pos logit contributions
fly+0.114
bug+0.103
 Sky+0.103
 tweet+0.100
flies+0.096
Neg logit contributions
 purpose-0.139
 time-0.135
 trouble-0.129
 consequences-0.125
 mischief-0.125
Token fly
Token ID6129
Feature activation-7.88e-02

Pos logit contributions
 Sky+0.188
stack+0.185
als+0.182
 Flor+0.180
 Fly+0.176
Neg logit contributions
gh-0.179
 purpose-0.176
 product-0.151
 business-0.151
 punish-0.150
Token high
Token ID1029
Feature activation-5.45e-02

Pos logit contributions
 Sky+0.123
 tweet+0.122
fly+0.112
 Flor+0.108
bug+0.107
Neg logit contributions
gh-0.145
 purpose-0.135
 time-0.130
 forgotten-0.120
 business-0.113
Token in
Token ID287
Feature activation-1.26e-02

Pos logit contributions
 tweet+0.087
 Sky+0.084
fly+0.081
 Saturn+0.071
 Fly+0.070
Neg logit contributions
gh-0.105
 purpose-0.099
 forgotten-0.090
array-0.090
anked-0.087
Token the
Token ID262
Feature activation-1.58e-02

Pos logit contributions
fly+0.018
 Sky+0.018
 tweet+0.017
flies+0.015
als+0.015
Neg logit contributions
gh-0.027
 purpose-0.026
array-0.024
 mischief-0.024
 forgotten-0.024
Token sky
Token ID6766
Feature activation-5.45e-02

Pos logit contributions
fly+0.026
 Sky+0.026
 tweet+0.023
flies+0.022
 Bloom+0.021
Neg logit contributions
gh-0.031
 purpose-0.029
array-0.028
 forgotten-0.027
 mischief-0.027
Token,
Token ID11
Feature activation-4.70e-02

Pos logit contributions
 Sky+0.003
fly+0.003
 tweet+0.003
 Saturn+0.002
 globe+0.002
Neg logit contributions
gh-0.003
 purpose-0.003
array-0.003
 behaviour-0.003
 Sister-0.003
Token I
Token ID314
Feature activation-1.96e-04

Pos logit contributions
fly+0.091
 Sky+0.081
 below+0.079
flies+0.078
 globe+0.077
Neg logit contributions
gh-0.068
 Tommy-0.065
 purpose-0.065
 punishment-0.062
 disob-0.061
Token can
Token ID460
Feature activation-3.54e-02

Pos logit contributions
 Sky+0.000
fly+0.000
 tweet+0.000
 Saturn+0.000
 globe+0.000
Neg logit contributions
gh-0.000
 purpose-0.000
array-0.000
 behaviour-0.000
 Sister-0.000
Token make
Token ID787
Feature activation-5.05e-02

Pos logit contributions
fly+0.061
 Sky+0.060
 tweet+0.054
flies+0.054
 globe+0.053
Neg logit contributions
gh-0.062
 purpose-0.056
 punish-0.051
 disob-0.050
urance-0.050
Token the
Token ID262
Feature activation-5.33e-02

Pos logit contributions
fly+0.099
 globe+0.088
 Sky+0.086
stack+0.086
 tweet+0.084
Neg logit contributions
gh-0.081
 purpose-0.080
 mischief-0.074
 trouble-0.071
array-0.068
Token moon
Token ID8824
Feature activation-1.21e-01

Pos logit contributions
fly+0.089
 Sky+0.085
bug+0.078
 tweet+0.071
 Fly+0.070
Neg logit contributions
 purpose-0.098
 mischief-0.088
 trouble-0.086
 laundry-0.085
array-0.085
Token go
Token ID467
Feature activation-5.79e-02

Pos logit contributions
 Sky+0.190
 Fly+0.189
stack+0.187
 Bloom+0.182
 Saturn+0.177
Neg logit contributions
gh-0.187
 purpose-0.172
 product-0.149
 punish-0.140
 behaviour-0.139
Token around
Token ID1088
Feature activation-4.65e-02

Pos logit contributions
 Sky+0.083
 tweet+0.082
fly+0.080
stack+0.076
 Dragon+0.074
Neg logit contributions
gh-0.107
 purpose-0.104
 forgotten-0.096
 time-0.095
array-0.092
Token the
Token ID262
Feature activation-3.25e-02

Pos logit contributions
fly+0.079
 tweet+0.075
 Sky+0.074
flies+0.067
 globe+0.066
Neg logit contributions
 purpose-0.082
gh-0.081
 mischief-0.075
array-0.074
 trouble-0.072
Token earth
Token ID4534
Feature activation-9.32e-02

Pos logit contributions
fly+0.051
 Sky+0.050
 tweet+0.047
flies+0.042
 Fly+0.040
Neg logit contributions
 purpose-0.063
gh-0.061
array-0.060
 mischief-0.058
 Sister-0.056
Token!"
Token ID2474
Feature activation-4.84e-03

Pos logit contributions
 Sky+0.190
 tweet+0.184
 Fly+0.176
 Saturn+0.174
 Jupiter+0.169
Neg logit contributions
 purpose-0.115
gh-0.109
array-0.096
em-0.091
 forgotten-0.091
Token the
Token ID262
Feature activation-5.04e-02

Pos logit contributions
fly+0.143
 tweet+0.122
 flap+0.120
flies+0.118
 globe+0.117
Neg logit contributions
 Sister-0.094
 Tommy-0.092
 time-0.089
 purpose-0.089
 punishment-0.088
Token little
Token ID1310
Feature activation-6.59e-02

Pos logit contributions
fly+0.131
flies+0.122
ites+0.121
als+0.121
stack+0.120
Neg logit contributions
 time-0.047
 punishment-0.041
 purpose-0.041
 trouble-0.036
gh-0.035
Token girl
Token ID2576
Feature activation-4.50e-02

Pos logit contributions
stack+0.147
fly+0.145
girls+0.141
 Saturn+0.140
ites+0.137
Neg logit contributions
 trouble-0.061
 time-0.058
 purpose-0.056
 Sister-0.054
 punishment-0.053
Token visited
Token ID8672
Feature activation-8.11e-02

Pos logit contributions
 Sky+0.082
fly+0.076
 tweet+0.073
 globe+0.072
 higher+0.071
Neg logit contributions
gh-0.073
 forgotten-0.069
 behaviour-0.066
 Sister-0.066
 supposed-0.065
Token the
Token ID262
Feature activation-5.53e-02

Pos logit contributions
fly+0.169
ite+0.153
flies+0.148
 globe+0.148
als+0.148
Neg logit contributions
 purpose-0.113
 time-0.106
gh-0.101
 business-0.091
 trouble-0.091
Token dragon
Token ID10441
Feature activation-1.20e-01

Pos logit contributions
fly+0.103
 tweet+0.097
ites+0.094
flies+0.092
als+0.092
Neg logit contributions
 purpose-0.088
 time-0.087
 situation-0.082
 grown-0.081
 forgotten-0.080
Token every
Token ID790
Feature activation-3.82e-02

Pos logit contributions
 higher+0.186
 globe+0.185
 Sky+0.183
 Saturn+0.182
fly+0.178
Neg logit contributions
gh-0.160
 business-0.160
 time-0.148
 behaviour-0.146
 efforts-0.141
Token day
Token ID1110
Feature activation-4.20e-02

Pos logit contributions
 Sky+0.084
fly+0.084
flies+0.082
als+0.080
 tweet+0.079
Neg logit contributions
gh-0.045
 time-0.044
 purpose-0.041
 punishment-0.039
 behaviour-0.039
Token and
Token ID290
Feature activation-7.66e-02

Pos logit contributions
fly+0.082
 Sky+0.079
 tweet+0.078
 globe+0.075
 flap+0.072
Neg logit contributions
gh-0.065
 forgotten-0.056
 purpose-0.056
 supposed-0.055
 behaviour-0.054
Token they
Token ID484
Feature activation-8.71e-02

Pos logit contributions
 globe+0.137
fly+0.136
 Sky+0.133
 higher+0.130
 south+0.130
Neg logit contributions
 happened-0.107
 forgotten-0.106
gh-0.101
 complained-0.099
 Sister-0.095
Token would
Token ID561
Feature activation-7.39e-02

Pos logit contributions
 Sky+0.184
 higher+0.180
fly+0.173
 globe+0.169
 Follow+0.169
Neg logit contributions
 forgotten-0.092
gh-0.091
 disob-0.082
 disagreed-0.081
 happened-0.080
Token.
Token ID13
Feature activation-7.19e-03

Pos logit contributions
gh+0.034
omm+0.032
 Sister+0.032
array+0.032
 purpose+0.031
Neg logit contributions
 Sky-0.026
fly-0.025
 tweet-0.022
 Saturn-0.022
 globe-0.022
Token They
Token ID1119
Feature activation-1.32e-02

Pos logit contributions
fly+0.011
 Sky+0.011
 tweet+0.010
flies+0.009
 globe+0.009
Neg logit contributions
gh-0.014
 purpose-0.013
array-0.013
 Sister-0.013
omm-0.013
Token would
Token ID561
Feature activation-3.29e-02

Pos logit contributions
fly+0.026
 Sky+0.026
 tweet+0.023
 globe+0.022
flies+0.022
Neg logit contributions
gh-0.021
 forgotten-0.019
 Sister-0.019
array-0.019
 behaviour-0.018
Token watch
Token ID2342
Feature activation-5.14e-02

Pos logit contributions
fly+0.066
 Sky+0.063
flies+0.058
 Dragon+0.057
als+0.057
Neg logit contributions
gh-0.049
 purpose-0.045
 forgotten-0.045
array-0.044
urance-0.044
Token the
Token ID262
Feature activation-2.35e-02

Pos logit contributions
fly+0.107
 Sky+0.090
flies+0.089
 tweet+0.088
bug+0.088
Neg logit contributions
gh-0.080
 purpose-0.077
 mischief-0.076
 Tommy-0.074
array-0.073
Token planes
Token ID13016
Feature activation-1.20e-01

Pos logit contributions
fly+0.039
 Sky+0.036
 tweet+0.032
flies+0.031
bug+0.030
Neg logit contributions
gh-0.047
 purpose-0.044
array-0.044
 forgotten-0.043
 behaviour-0.042
Token take
Token ID1011
Feature activation+5.80e-03

Pos logit contributions
 Sky+0.199
fly+0.189
 Flor+0.184
 Dragon+0.183
Sky+0.180
Neg logit contributions
 punish-0.166
 time-0.164
 delay-0.157
 afford-0.156
gh-0.156
Token off
Token ID572
Feature activation-2.03e-02

Pos logit contributions
gh+0.012
 purpose+0.012
array+0.011
omm+0.011
 behaviour+0.011
Neg logit contributions
 Sky-0.008
fly-0.008
 tweet-0.007
 Saturn-0.007
 globe-0.007
Token and
Token ID290
Feature activation-6.76e-02

Pos logit contributions
fly+0.031
 Sky+0.029
 tweet+0.028
flies+0.025
 globe+0.024
Neg logit contributions
gh-0.041
 purpose-0.040
 forgotten-0.037
array-0.037
 Sister-0.037
Token land
Token ID1956
Feature activation-3.09e-02

Pos logit contributions
fly+0.131
 Sky+0.126
rots+0.120
 globe+0.114
 tweet+0.112
Neg logit contributions
 forgotten-0.094
gh-0.091
 punish-0.088
 purpose-0.087
 Sister-0.087
Token.
Token ID13
Feature activation-1.76e-02

Pos logit contributions
fly+0.048
 Sky+0.045
 tweet+0.044
 Fly+0.038
bug+0.037
Neg logit contributions
gh-0.062
 purpose-0.057
 forgotten-0.057
 behaviour-0.054
 punishment-0.054
Token dad
Token ID9955
Feature activation+7.06e-03

Pos logit contributions
fly+0.025
 Sky+0.024
 tweet+0.022
 globe+0.022
 below+0.022
Neg logit contributions
gh-0.010
 Sister-0.009
 purpose-0.009
array-0.009
 forgotten-0.009
Token.
Token ID13
Feature activation-3.49e-03

Pos logit contributions
gh+0.013
 purpose+0.012
array+0.012
 behaviour+0.011
omm+0.011
Neg logit contributions
fly-0.012
 Sky-0.012
 tweet-0.011
 globe-0.011
 below-0.010
Token They
Token ID1119
Feature activation-2.18e-02

Pos logit contributions
fly+0.006
 Sky+0.006
 tweet+0.005
 globe+0.005
 Saturn+0.005
Neg logit contributions
gh-0.006
 purpose-0.006
array-0.006
 Sister-0.006
omm-0.006
Token would
Token ID561
Feature activation-3.85e-02

Pos logit contributions
fly+0.041
 Sky+0.040
 tweet+0.036
flies+0.035
 higher+0.034
Neg logit contributions
gh-0.036
 forgotten-0.033
 Sister-0.032
array-0.031
kids-0.031
Token watch
Token ID2342
Feature activation-5.13e-02

Pos logit contributions
fly+0.081
 Sky+0.075
flies+0.071
 Dragon+0.070
als+0.069
Neg logit contributions
gh-0.054
 forgotten-0.048
array-0.048
 purpose-0.047
urance-0.047
Token planes
Token ID13016
Feature activation-1.20e-01

Pos logit contributions
fly+0.108
flies+0.088
 Sky+0.087
stack+0.086
 tweet+0.084
Neg logit contributions
gh-0.078
 purpose-0.077
 mischief-0.075
array-0.075
 time-0.075
Token take
Token ID1011
Feature activation-1.90e-03

Pos logit contributions
fly+0.223
 Flor+0.214
 bead+0.203
 Sky+0.202
bug+0.201
Neg logit contributions
 time-0.141
gh-0.128
 punish-0.123
 sale-0.122
 done-0.121
Token off
Token ID572
Feature activation-2.88e-02

Pos logit contributions
fly+0.003
 Sky+0.003
 tweet+0.003
 globe+0.003
 Saturn+0.003
Neg logit contributions
gh-0.004
 purpose-0.003
array-0.003
 behaviour-0.003
 Sister-0.003
Token and
Token ID290
Feature activation-7.13e-02

Pos logit contributions
fly+0.047
 Sky+0.042
 tweet+0.041
flies+0.038
 globe+0.037
Neg logit contributions
gh-0.054
 purpose-0.053
 forgotten-0.050
 Sister-0.050
array-0.050
Token land
Token ID1956
Feature activation-3.75e-02

Pos logit contributions
fly+0.140
 Sky+0.130
rots+0.129
 globe+0.120
flies+0.118
Neg logit contributions
 forgotten-0.096
gh-0.092
 punish-0.090
 purpose-0.088
array-0.088
Token.
Token ID13
Feature activation-2.45e-02

Pos logit contributions
fly+0.061
 Sky+0.054
 tweet+0.054
 Fly+0.048
bug+0.046
Neg logit contributions
gh-0.071
 purpose-0.068
 forgotten-0.066
em-0.063
 business-0.063
Token
Token ID198
Feature activation-3.74e-02

Pos logit contributions
fly+0.071
 tweet+0.063
flies+0.062
 below+0.061
fish+0.060
Neg logit contributions
 Sister-0.049
 purpose-0.047
gh-0.047
 Tommy-0.047
omm-0.046
Token
Token ID198
Feature activation-1.64e-02

Pos logit contributions
fly+0.072
 tweet+0.065
 Sky+0.064
flies+0.064
als+0.062
Neg logit contributions
gh-0.063
 purpose-0.058
addy-0.055
David-0.054
child-0.053
TokenThey
Token ID2990
Feature activation-3.02e-02

Pos logit contributions
fly+0.032
 Sky+0.031
 tweet+0.029
flies+0.028
 below+0.028
Neg logit contributions
gh-0.027
 purpose-0.024
array-0.024
 Sister-0.023
David-0.022
Token both
Token ID1111
Feature activation-1.46e-02

Pos logit contributions
 Sky+0.063
fly+0.062
 tweet+0.057
 globe+0.055
 Fly+0.054
Neg logit contributions
gh-0.043
 forgotten-0.040
array-0.037
 Sister-0.036
addy-0.036
Token drew
Token ID9859
Feature activation-8.62e-02

Pos logit contributions
 Sky+0.028
fly+0.027
 tweet+0.025
 globe+0.024
 Saturn+0.024
Neg logit contributions
gh-0.024
 forgotten-0.021
array-0.021
 purpose-0.021
 Sister-0.021
Token frogs
Token ID37475
Feature activation-1.20e-01

Pos logit contributions
stack+0.194
fly+0.191
 tweet+0.170
 globe+0.167
 Follow+0.167
Neg logit contributions
gh-0.085
 trouble-0.082
 time-0.081
 purpose-0.077
 Tommy-0.072
Token and
Token ID290
Feature activation-5.53e-02

Pos logit contributions
Flash+0.258
rob+0.236
 globe+0.228
 Asia+0.227
stack+0.224
Neg logit contributions
gh-0.081
 grown-0.075
 time-0.074
:-0.073
 --0.070
Token colored
Token ID16396
Feature activation-8.21e-02

Pos logit contributions
fly+0.080
 Sky+0.075
 globe+0.073
 below+0.073
 higher+0.068
Neg logit contributions
 Sister-0.098
 purpose-0.096
 mischief-0.094
gh-0.094
 forgotten-0.093
Token them
Token ID606
Feature activation-8.54e-02

Pos logit contributions
stack+0.147
fly+0.141
 Follow+0.141
 Sky+0.132
 tweet+0.130
Neg logit contributions
 purpose-0.107
gh-0.103
 time-0.095
anked-0.094
 Tommy-0.092
Token.
Token ID13
Feature activation-3.15e-02

Pos logit contributions
stack+0.156
 Follow+0.150
fly+0.145
 tweet+0.141
 south+0.140
Neg logit contributions
gh-0.117
 purpose-0.102
 time-0.099
 happened-0.089
 sale-0.087
Token They
Token ID1119
Feature activation-5.18e-02

Pos logit contributions
fly+0.063
 tweet+0.057
 Sky+0.056
fish+0.054
 globe+0.053
Neg logit contributions
 purpose-0.047
 Sister-0.047
urance-0.046
gh-0.045
array-0.044
Token,
Token ID11
Feature activation+1.60e-02

Pos logit contributions
array+0.094
 mischief+0.091
 purpose+0.089
kids+0.087
child+0.087
Neg logit contributions
 Sky-0.041
 tweet-0.039
 together-0.037
fly-0.036
 higher-0.032
Token "
Token ID366
Feature activation+3.58e-03

Pos logit contributions
gh+0.036
 purpose+0.035
array+0.034
 mischief+0.033
 behaviour+0.033
Neg logit contributions
 Sky-0.021
fly-0.020
 tweet-0.019
 globe-0.016
 below-0.016
TokenNo
Token ID2949
Feature activation+5.26e-02

Pos logit contributions
gh+0.006
 purpose+0.006
array+0.006
 behaviour+0.005
omm+0.005
Neg logit contributions
 Sky-0.007
fly-0.007
 tweet-0.006
 Saturn-0.006
 globe-0.006
Token,
Token ID11
Feature activation+6.30e-03

Pos logit contributions
gh+0.136
David+0.135
 supposed+0.129
scar+0.128
 forgotten+0.127
Neg logit contributions
 tweet-0.044
 Sky-0.043
 together-0.038
 flap-0.036
 higher-0.034
Token we
Token ID356
Feature activation+7.34e-03

Pos logit contributions
gh+0.015
 purpose+0.014
array+0.014
 behaviour+0.013
omm+0.013
Neg logit contributions
 Sky-0.008
fly-0.008
 tweet-0.007
 Saturn-0.006
 globe-0.006
Token have
Token ID423
Feature activation+5.47e-02

Pos logit contributions
gh+0.018
 purpose+0.017
array+0.016
omm+0.016
 behaviour+0.016
Neg logit contributions
fly-0.008
 Sky-0.008
 tweet-0.007
 globe-0.006
 Saturn-0.006
Token to
Token ID284
Feature activation+6.41e-02

Pos logit contributions
gh+0.137
urance+0.136
omm+0.136
scar+0.129
array+0.128
Neg logit contributions
 together-0.047
 Sky-0.047
 higher-0.042
fly-0.036
 closer-0.036
Token throw
Token ID3714
Feature activation+9.74e-04

Pos logit contributions
gh+0.144
omm+0.141
David+0.138
child+0.138
liv+0.137
Neg logit contributions
 fly-0.068
 soar-0.067
 flap-0.066
fly-0.066
 tweet-0.063
Token it
Token ID340
Feature activation-1.88e-02

Pos logit contributions
gh+0.002
 purpose+0.002
array+0.002
 behaviour+0.002
 Sister+0.002
Neg logit contributions
 Sky-0.001
fly-0.001
 tweet-0.001
 globe-0.001
 Saturn-0.001
Token away
Token ID1497
Feature activation-2.29e-02

Pos logit contributions
fly+0.025
 Sky+0.025
 tweet+0.022
 Saturn+0.021
 globe+0.020
Neg logit contributions
gh-0.042
 purpose-0.039
 forgotten-0.038
 behaviour-0.037
array-0.037
Token.
Token ID13
Feature activation-1.83e-02

Pos logit contributions
fly+0.038
 Sky+0.038
 tweet+0.036
 Saturn+0.033
 globe+0.032
Neg logit contributions
gh-0.041
 purpose-0.040
 forgotten-0.038
 behaviour-0.037
 Sister-0.037
TokenL
Token ID43
Feature activation+3.20e-02

Pos logit contributions
gh+0.032
 behaviour+0.030
array+0.030
urance+0.030
 purpose+0.030
Neg logit contributions
 Sky-0.027
fly-0.026
 tweet-0.023
 higher-0.022
 globe-0.022
Tokenily
Token ID813
Feature activation+8.17e-03

Pos logit contributions
gh+0.067
 purpose+0.065
 Sister+0.064
array+0.063
 mischief+0.063
Neg logit contributions
 Sky-0.042
fly-0.041
bug-0.039
 tweet-0.037
 together-0.035
Token was
Token ID373
Feature activation+1.39e-02

Pos logit contributions
gh+0.016
 purpose+0.014
array+0.014
 behaviour+0.014
omm+0.014
Neg logit contributions
 Sky-0.013
fly-0.013
 tweet-0.012
 globe-0.011
 Saturn-0.011
Token happy
Token ID3772
Feature activation+9.49e-03

Pos logit contributions
gh+0.031
 purpose+0.029
 behaviour+0.029
array+0.029
 Sister+0.028
Neg logit contributions
 Sky-0.019
fly-0.018
 tweet-0.016
 higher-0.015
 below-0.015
Token to
Token ID284
Feature activation+1.65e-02

Pos logit contributions
gh+0.021
 purpose+0.019
array+0.019
omm+0.019
urance+0.019
Neg logit contributions
 Sky-0.013
fly-0.013
 tweet-0.012
 below-0.011
 Saturn-0.010
Token have
Token ID423
Feature activation+5.92e-02

Pos logit contributions
gh+0.035
array+0.034
 purpose+0.033
omm+0.033
 behaviour+0.032
Neg logit contributions
 Sky-0.023
fly-0.022
 tweet-0.021
 flap-0.019
 soar-0.019
Token a
Token ID257
Feature activation+1.40e-02

Pos logit contributions
omm+0.153
 behaviour+0.138
urance+0.137
 expression+0.136
anked+0.135
Neg logit contributions
 together-0.062
 Sky-0.055
 Sparkle-0.049
 around-0.046
 Fly-0.044
Token friend
Token ID1545
Feature activation+3.30e-03

Pos logit contributions
gh+0.028
array+0.026
urance+0.025
omm+0.025
 purpose+0.025
Neg logit contributions
 Sky-0.022
fly-0.022
 tweet-0.020
 higher-0.019
 globe-0.019
Token to
Token ID284
Feature activation+3.61e-02

Pos logit contributions
gh+0.007
 purpose+0.007
array+0.007
 behaviour+0.007
omm+0.007
Neg logit contributions
 Sky-0.005
fly-0.005
 tweet-0.004
 Saturn-0.004
 globe-0.004
Token walk
Token ID2513
Feature activation-1.11e-02

Pos logit contributions
gh+0.072
omm+0.071
array+0.071
em+0.068
 behaviour+0.068
Neg logit contributions
 Sky-0.052
fly-0.050
 tweet-0.049
 flap-0.048
 soar-0.047
Token home
Token ID1363
Feature activation+4.21e-03

Pos logit contributions
fly+0.015
 Sky+0.014
 tweet+0.013
 globe+0.012
 Saturn+0.012
Neg logit contributions
gh-0.025
 purpose-0.023
array-0.022
 behaviour-0.022
 Sister-0.022
Token looked
Token ID3114
Feature activation-1.81e-02

Pos logit contributions
gh+0.025
 purpose+0.023
array+0.022
omm+0.022
 behaviour+0.021
Neg logit contributions
fly-0.022
 Sky-0.021
 tweet-0.019
 below-0.018
flies-0.018
Token at
Token ID379
Feature activation-2.56e-02

Pos logit contributions
fly+0.025
 Sky+0.025
 tweet+0.022
 globe+0.020
 Saturn+0.020
Neg logit contributions
gh-0.038
 forgotten-0.037
array-0.037
 Sister-0.037
 behaviour-0.036
Token each
Token ID1123
Feature activation+4.14e-03

Pos logit contributions
fly+0.042
 Sky+0.041
 tweet+0.038
 Dragon+0.035
pole+0.035
Neg logit contributions
 purpose-0.045
 Sister-0.044
gh-0.044
 behaviour-0.043
 mischief-0.043
Token other
Token ID584
Feature activation+2.49e-03

Pos logit contributions
gh+0.009
 purpose+0.008
array+0.008
omm+0.008
 behaviour+0.008
Neg logit contributions
 Sky-0.006
fly-0.006
 tweet-0.005
 globe-0.005
 Saturn-0.005
Token and
Token ID290
Feature activation-1.49e-02

Pos logit contributions
gh+0.006
 purpose+0.005
array+0.005
 behaviour+0.005
omm+0.005
Neg logit contributions
 Sky-0.003
fly-0.003
 tweet-0.003
 globe-0.003
 Saturn-0.003
Token quickly
Token ID2952
Feature activation+5.91e-02

Pos logit contributions
 Sky+0.030
fly+0.029
 tweet+0.027
 globe+0.027
 Saturn+0.026
Neg logit contributions
gh-0.022
 forgotten-0.021
 Sister-0.020
 behaviour-0.020
array-0.020
Token ran
Token ID4966
Feature activation+1.28e-02

Pos logit contributions
omm+0.136
 purpose+0.130
gh+0.129
innie+0.122
 virus+0.120
Neg logit contributions
 together-0.072
fly-0.070
 Sky-0.068
als-0.068
 tweet-0.067
Token towards
Token ID3371
Feature activation-3.61e-03

Pos logit contributions
gh+0.032
omm+0.030
 purpose+0.030
array+0.030
 behaviour+0.030
Neg logit contributions
 Sky-0.014
fly-0.013
 higher-0.011
 below-0.011
 tweet-0.011
Token it
Token ID340
Feature activation+9.18e-04

Pos logit contributions
fly+0.005
 Sky+0.005
 tweet+0.004
 globe+0.004
 Saturn+0.004
Neg logit contributions
gh-0.008
 purpose-0.007
array-0.007
 behaviour-0.007
 Sister-0.007
Token.
Token ID13
Feature activation-1.24e-02

Pos logit contributions
gh+0.002
 purpose+0.001
array+0.001
 behaviour+0.001
 Sister+0.001
Neg logit contributions
 Sky-0.002
fly-0.002
 tweet-0.001
 globe-0.001
 Saturn-0.001
Token The
Token ID383
Feature activation+1.39e-02

Pos logit contributions
fly+0.021
 Sky+0.020
 tweet+0.019
 globe+0.018
 higher+0.017
Neg logit contributions
gh-0.022
 Sister-0.021
 purpose-0.021
array-0.021
omm-0.021
Tokeniflower
Token ID42642
Feature activation-2.42e-02

Pos logit contributions
 Sky+0.062
 tweet+0.058
fly+0.057
 Saturn+0.054
 higher+0.052
Neg logit contributions
gh-0.051
 purpose-0.046
omm-0.044
 behaviour-0.042
 forgotten-0.041
Token and
Token ID290
Feature activation-3.49e-02

Pos logit contributions
 Sky+0.046
fly+0.044
 tweet+0.042
 Saturn+0.041
 Jupiter+0.040
Neg logit contributions
gh-0.040
 purpose-0.035
 forgotten-0.035
 behaviour-0.035
 Sister-0.034
Token Lily
Token ID20037
Feature activation-1.66e-02

Pos logit contributions
fly+0.062
 Sky+0.059
 tweet+0.055
 below+0.054
 Fly+0.053
Neg logit contributions
 purpose-0.058
gh-0.056
 Sister-0.055
 punishment-0.055
 forgotten-0.054
Token had
Token ID550
Feature activation+2.04e-02

Pos logit contributions
 Sky+0.032
fly+0.031
 tweet+0.029
 Saturn+0.028
 higher+0.027
Neg logit contributions
gh-0.027
 forgotten-0.024
 Sister-0.024
 purpose-0.023
array-0.023
Token a
Token ID257
Feature activation+8.14e-03

Pos logit contributions
gh+0.042
omm+0.039
 behaviour+0.038
array+0.038
 Sister+0.038
Neg logit contributions
 Sky-0.030
fly-0.030
 tweet-0.027
 Saturn-0.026
 higher-0.025
Token y
Token ID331
Feature activation+6.01e-02

Pos logit contributions
gh+0.014
array+0.012
 purpose+0.012
 behaviour+0.012
omm+0.012
Neg logit contributions
fly-0.015
 Sky-0.015
 tweet-0.014
 globe-0.013
 Saturn-0.013
Tokenummy
Token ID13513
Feature activation+2.01e-02

Pos logit contributions
gh+0.089
David+0.084
 Sister+0.082
array+0.081
 virus+0.080
Neg logit contributions
fly-0.110
 Sky-0.103
 together-0.097
 tweet-0.096
 reef-0.096
Token dinner
Token ID8073
Feature activation+2.99e-03

Pos logit contributions
gh+0.034
array+0.031
omm+0.030
urance+0.030
em+0.030
Neg logit contributions
fly-0.037
 Sky-0.036
 tweet-0.034
 globe-0.033
 higher-0.031
Token together
Token ID1978
Feature activation+8.86e-03

Pos logit contributions
gh+0.006
 purpose+0.005
array+0.005
 behaviour+0.005
omm+0.005
Neg logit contributions
 Sky-0.005
fly-0.005
 tweet-0.004
 globe-0.004
 Saturn-0.004
Token.
Token ID13
Feature activation-1.89e-02

Pos logit contributions
gh+0.016
array+0.015
 purpose+0.015
omm+0.014
 Sister+0.014
Neg logit contributions
fly-0.015
 Sky-0.015
 tweet-0.014
 below-0.013
 globe-0.013
Token And
Token ID843
Feature activation-4.71e-02

Pos logit contributions
fly+0.033
 Sky+0.030
 tweet+0.029
 higher+0.028
 flap+0.027
Neg logit contributions
gh-0.033
 Sister-0.032
 purpose-0.032
omm-0.032
addy-0.031
Token it
Token ID340
Feature activation+1.02e-02

Pos logit contributions
gh+0.038
 purpose+0.035
array+0.035
omm+0.034
urance+0.034
Neg logit contributions
 Sky-0.023
fly-0.023
 tweet-0.021
 Saturn-0.019
 globe-0.019
Tokenâ
Token ID22940
Feature activation+3.88e-02

Pos logit contributions
gh+0.020
 purpose+0.019
array+0.018
 behaviour+0.018
omm+0.018
Neg logit contributions
fly-0.016
 Sky-0.016
 tweet-0.014
 globe-0.013
 Saturn-0.013
Token€
Token ID26391
Feature activation+4.00e-02

Pos logit contributions
gh+0.054
 purpose+0.052
 mischief+0.051
 forgotten+0.050
 behaviour+0.049
Neg logit contributions
fly-0.077
 Sky-0.076
 tweet-0.074
 Fly-0.070
 higher-0.069
Tokenâ„¢
Token ID8151
Feature activation+4.22e-02

Pos logit contributions
gh+0.059
 purpose+0.057
omm+0.056
 Sister+0.056
 behaviour+0.056
Neg logit contributions
 Sky-0.076
fly-0.072
 tweet-0.069
 globe-0.068
 Saturn-0.066
Tokens
Token ID82
Feature activation+5.20e-02

Pos logit contributions
gh+0.094
array+0.094
 mischief+0.093
 purpose+0.093
 behaviour+0.090
Neg logit contributions
 Sky-0.053
fly-0.052
 tweet-0.043
 globe-0.041
 below-0.040
Token too
Token ID1165
Feature activation+6.16e-02

Pos logit contributions
gh+0.095
array+0.090
omm+0.088
David+0.088
child+0.088
Neg logit contributions
 Sky-0.080
fly-0.078
 higher-0.073
 tweet-0.073
 Jupiter-0.071
Token far
Token ID1290
Feature activation+3.95e-03

Pos logit contributions
gh+0.129
omm+0.128
urance+0.124
array+0.117
 behaviour+0.116
Neg logit contributions
 Sky-0.081
 higher-0.081
 below-0.077
fly-0.076
 together-0.076
Token for
Token ID329
Feature activation+7.61e-02

Pos logit contributions
gh+0.009
 purpose+0.008
array+0.008
 behaviour+0.008
omm+0.008
Neg logit contributions
fly-0.005
 Sky-0.005
 tweet-0.004
 globe-0.004
 Saturn-0.004
Token you
Token ID345
Feature activation+3.70e-02

Pos logit contributions
gh+0.161
omm+0.159
array+0.155
urance+0.152
scar+0.151
Neg logit contributions
 Sky-0.095
 Saturn-0.079
 tweet-0.075
 Fly-0.074
 Jupiter-0.074
Token to
Token ID284
Feature activation+4.52e-02

Pos logit contributions
gh+0.074
omm+0.072
array+0.071
 Sister+0.069
 purpose+0.069
Neg logit contributions
fly-0.055
 Sky-0.054
 tweet-0.049
 below-0.047
 higher-0.046
Token go
Token ID467
Feature activation+8.12e-03

Pos logit contributions
omm+0.098
gh+0.096
array+0.093
 Sister+0.090
child+0.090
Neg logit contributions
 Sky-0.060
 soar-0.055
 flap-0.054
 higher-0.053
 tweet-0.053
Token likes
Token ID7832
Feature activation+1.86e-02

Pos logit contributions
gh+0.065
 purpose+0.061
omm+0.059
 behaviour+0.059
urance+0.058
Neg logit contributions
fly-0.035
 Sky-0.035
 tweet-0.030
 globe-0.029
 below-0.029
Token to
Token ID284
Feature activation+3.02e-02

Pos logit contributions
gh+0.038
urance+0.034
array+0.034
 purpose+0.034
omm+0.034
Neg logit contributions
 Sky-0.029
fly-0.027
 tweet-0.025
 Saturn-0.024
 below-0.024
Token print
Token ID3601
Feature activation-9.57e-03

Pos logit contributions
gh+0.068
omm+0.064
array+0.062
 forgotten+0.062
 behaviour+0.061
Neg logit contributions
 Sky-0.040
fly-0.038
 tweet-0.036
 soar-0.034
 flap-0.034
Token.
Token ID13
Feature activation+1.71e-02

Pos logit contributions
fly+0.017
 Sky+0.016
 tweet+0.015
 globe+0.014
 Saturn+0.014
Neg logit contributions
gh-0.018
 purpose-0.017
array-0.016
 behaviour-0.016
 Sister-0.015
Token She
Token ID1375
Feature activation-1.01e-02

Pos logit contributions
gh+0.039
 forgotten+0.034
 behaviour+0.034
 purpose+0.034
array+0.033
Neg logit contributions
 Sky-0.024
fly-0.023
 tweet-0.021
 Saturn-0.020
 Jupiter-0.020
Token has
Token ID468
Feature activation+2.31e-02

Pos logit contributions
 Sky+0.016
fly+0.015
 tweet+0.014
 higher+0.012
 below+0.012
Neg logit contributions
gh-0.021
 purpose-0.019
array-0.019
 forgotten-0.018
 Sister-0.018
Token a
Token ID257
Feature activation+1.89e-03

Pos logit contributions
gh+0.054
omm+0.051
urance+0.050
 behaviour+0.050
array+0.049
Neg logit contributions
 Sky-0.029
fly-0.026
 Saturn-0.025
 tweet-0.024
 planets-0.024
Token big
Token ID1263
Feature activation+3.18e-03

Pos logit contributions
gh+0.004
 purpose+0.003
array+0.003
 behaviour+0.003
omm+0.003
Neg logit contributions
 Sky-0.003
fly-0.003
 tweet-0.003
 globe-0.003
 Saturn-0.003
Token box
Token ID3091
Feature activation-1.31e-02

Pos logit contributions
gh+0.007
 purpose+0.006
array+0.006
 behaviour+0.006
omm+0.006
Neg logit contributions
fly-0.005
 Sky-0.005
 tweet-0.004
 globe-0.004
 Saturn-0.004
Token of
Token ID286
Feature activation+1.61e-02

Pos logit contributions
 Sky+0.020
fly+0.020
 tweet+0.019
 Saturn+0.017
 globe+0.017
Neg logit contributions
gh-0.026
 purpose-0.024
 forgotten-0.023
array-0.023
omm-0.023
Token paper
Token ID3348
Feature activation-2.60e-02

Pos logit contributions
gh+0.031
urance+0.028
omm+0.028
array+0.027
 purpose+0.027
Neg logit contributions
 Sky-0.026
fly-0.026
 tweet-0.024
 Saturn-0.023
 globe-0.023
Token know
Token ID760
Feature activation+6.92e-03

Pos logit contributions
gh+0.012
 purpose+0.011
array+0.011
omm+0.011
 Sister+0.011
Neg logit contributions
fly-0.010
 Sky-0.010
 tweet-0.010
 Saturn-0.009
 below-0.009
Token what
Token ID644
Feature activation+3.54e-02

Pos logit contributions
gh+0.015
 purpose+0.014
array+0.014
 behaviour+0.013
 Sister+0.013
Neg logit contributions
 Sky-0.010
fly-0.010
 tweet-0.009
 Saturn-0.008
 below-0.008
Token to
Token ID284
Feature activation+1.38e-02

Pos logit contributions
array+0.077
urance+0.074
omm+0.073
anked+0.072
gh+0.071
Neg logit contributions
 Sky-0.054
fly-0.047
 tweet-0.044
 Bloom-0.041
 Saturn-0.040
Token do
Token ID466
Feature activation+4.49e-02

Pos logit contributions
gh+0.027
omm+0.026
 Sister+0.025
array+0.025
 purpose+0.025
Neg logit contributions
fly-0.021
 Sky-0.021
 tweet-0.019
 flap-0.018
 globe-0.018
Token.
Token ID13
Feature activation+2.39e-02

Pos logit contributions
 Sister+0.102
omm+0.099
array+0.098
uf+0.097
addy+0.097
Neg logit contributions
 together-0.051
 Sky-0.049
 around-0.048
 below-0.046
fly-0.046
Token It
Token ID632
Feature activation+7.16e-03

Pos logit contributions
gh+0.055
 purpose+0.048
 behaviour+0.048
array+0.048
 forgotten+0.047
Neg logit contributions
 Sky-0.034
fly-0.031
 tweet-0.028
 Bloom-0.026
 below-0.026
Token was
Token ID373
Feature activation+3.39e-02

Pos logit contributions
gh+0.013
 purpose+0.012
array+0.012
 behaviour+0.012
omm+0.012
Neg logit contributions
fly-0.012
 Sky-0.012
 tweet-0.011
 below-0.010
 Saturn-0.010
Token so
Token ID523
Feature activation+1.22e-02

Pos logit contributions
gh+0.071
 behaviour+0.067
array+0.067
urance+0.066
kids+0.066
Neg logit contributions
 Sky-0.050
fly-0.049
 higher-0.043
 tweet-0.041
 Saturn-0.041
Token scary
Token ID14343
Feature activation+6.49e-03

Pos logit contributions
gh+0.021
array+0.018
 purpose+0.018
 behaviour+0.018
urance+0.018
Neg logit contributions
 Sky-0.023
fly-0.023
 tweet-0.021
 below-0.020
 higher-0.020
Token!
Token ID0
Feature activation+2.23e-02

Pos logit contributions
gh+0.014
 purpose+0.012
array+0.012
omm+0.012
 Sister+0.012
Neg logit contributions
 Sky-0.010
fly-0.010
 tweet-0.008
 below-0.008
 globe-0.008
Token She
Token ID1375
Feature activation+1.98e-02

Pos logit contributions
gh+0.050
 behaviour+0.045
 purpose+0.045
array+0.044
 forgotten+0.043
Neg logit contributions
 Sky-0.033
fly-0.030
 tweet-0.027
 Saturn-0.026
 Bloom-0.025
Token pay
Token ID1414
Feature activation+3.41e-02

Pos logit contributions
gh+0.074
omm+0.071
array+0.069
addy+0.067
 Sister+0.067
Neg logit contributions
 Sky-0.052
fly-0.051
 tweet-0.048
 flap-0.045
 soar-0.045
Token off
Token ID572
Feature activation+4.26e-02

Pos logit contributions
gh+0.080
array+0.076
omm+0.076
urance+0.075
 purpose+0.073
Neg logit contributions
 Sky-0.038
fly-0.035
 higher-0.033
 below-0.032
 tweet-0.031
Token.
Token ID13
Feature activation+1.16e-02

Pos logit contributions
gh+0.097
omm+0.093
array+0.090
urance+0.088
 purpose+0.087
Neg logit contributions
 Sky-0.054
fly-0.050
 below-0.044
 tweet-0.044
 higher-0.043
Token A
Token ID317
Feature activation+2.51e-02

Pos logit contributions
gh+0.024
 purpose+0.022
array+0.022
 behaviour+0.022
 forgotten+0.021
Neg logit contributions
 Sky-0.018
fly-0.017
 tweet-0.015
 Saturn-0.014
 Jupiter-0.014
Token few
Token ID1178
Feature activation+3.55e-02

Pos logit contributions
gh+0.046
array+0.043
urance+0.042
omm+0.042
anked+0.042
Neg logit contributions
 Sky-0.043
fly-0.043
 tweet-0.039
 deer-0.037
 globe-0.037
Token weeks
Token ID2745
Feature activation+3.83e-02

Pos logit contributions
gh+0.058
omm+0.057
 purpose+0.055
array+0.054
addy+0.054
Neg logit contributions
 Sky-0.066
fly-0.062
 tweet-0.060
 deer-0.058
 below-0.057
Token later
Token ID1568
Feature activation+5.57e-02

Pos logit contributions
gh+0.084
 purpose+0.080
omm+0.076
array+0.075
 mischief+0.074
Neg logit contributions
 Sky-0.052
fly-0.050
 below-0.047
 tweet-0.044
 together-0.043
Token,
Token ID11
Feature activation-6.77e-03

Pos logit contributions
gh+0.131
omm+0.127
array+0.126
 purpose+0.126
 supposed+0.125
Neg logit contributions
 Sky-0.060
fly-0.055
 below-0.053
 together-0.052
 tweet-0.046
Token Adam
Token ID7244
Feature activation+3.24e-02

Pos logit contributions
fly+0.010
 Sky+0.010
 tweet+0.009
 globe+0.008
 Saturn+0.008
Neg logit contributions
gh-0.014
 purpose-0.013
array-0.013
 Sister-0.013
 behaviour-0.013
Token heard
Token ID2982
Feature activation+1.53e-02

Pos logit contributions
gh+0.064
 purpose+0.062
array+0.059
omm+0.059
urance+0.058
Neg logit contributions
fly-0.051
 Sky-0.050
 tweet-0.044
 below-0.043
flies-0.041
Token about
Token ID546
Feature activation+6.07e-03

Pos logit contributions
gh+0.032
array+0.029
 purpose+0.029
omm+0.029
 Sister+0.028
Neg logit contributions
 Sky-0.023
fly-0.022
 tweet-0.020
 Saturn-0.019
 below-0.019
Token a
Token ID257
Feature activation-2.33e-02

Pos logit contributions
gh+0.029
array+0.026
omm+0.026
 behaviour+0.026
 purpose+0.026
Neg logit contributions
 Sky-0.021
fly-0.020
 tweet-0.018
 Saturn-0.018
 Jupiter-0.017
Token cr
Token ID1067
Feature activation+1.05e-02

Pos logit contributions
 Sky+0.042
fly+0.041
 tweet+0.037
flies+0.036
 below+0.036
Neg logit contributions
gh-0.042
 purpose-0.039
 forgotten-0.036
array-0.035
 punishment-0.035
Tokenay
Token ID323
Feature activation+1.37e-03

Pos logit contributions
gh+0.018
 purpose+0.016
array+0.016
 behaviour+0.016
 Sister+0.016
Neg logit contributions
fly-0.019
 Sky-0.019
 tweet-0.018
 Saturn-0.017
 globe-0.017
Tokenon
Token ID261
Feature activation-4.69e-03

Pos logit contributions
gh+0.003
 purpose+0.002
array+0.002
 behaviour+0.002
 Sister+0.002
Neg logit contributions
 Sky-0.002
fly-0.002
 tweet-0.002
 globe-0.002
 Saturn-0.002
Token.
Token ID13
Feature activation-7.63e-03

Pos logit contributions
 Sky+0.008
fly+0.008
 tweet+0.007
 Saturn+0.007
 globe+0.007
Neg logit contributions
gh-0.009
 purpose-0.008
array-0.008
 behaviour-0.007
omm-0.007
Token
Token ID198
Feature activation-1.65e-03

Pos logit contributions
fly+0.013
 Sky+0.012
 tweet+0.012
 globe+0.011
flies+0.011
Neg logit contributions
gh-0.014
 purpose-0.013
 Sister-0.013
omm-0.013
array-0.013
Token
Token ID198
Feature activation+1.38e-02

Pos logit contributions
fly+0.002
 Sky+0.002
 tweet+0.002
 globe+0.002
 Saturn+0.002
Neg logit contributions
gh-0.004
 purpose-0.003
array-0.003
omm-0.003
 behaviour-0.003
TokenShe
Token ID3347
Feature activation-3.30e-02

Pos logit contributions
gh+0.029
 behaviour+0.027
 purpose+0.026
urance+0.026
array+0.026
Neg logit contributions
 Sky-0.021
fly-0.020
 tweet-0.018
 higher-0.017
 below-0.017
Token is
Token ID318
Feature activation+5.77e-04

Pos logit contributions
fly+0.059
 Sky+0.058
 tweet+0.054
 globe+0.051
 deer+0.050
Neg logit contributions
gh-0.056
array-0.054
addy-0.052
 forgotten-0.051
omm-0.051
Token very
Token ID845
Feature activation-1.14e-03

Pos logit contributions
gh+0.001
 purpose+0.001
array+0.001
 behaviour+0.001
omm+0.001
Neg logit contributions
fly-0.001
 Sky-0.001
 tweet-0.001
 globe-0.001
 Saturn-0.001
Token proud
Token ID6613
Feature activation+7.38e-03

Pos logit contributions
fly+0.002
 Sky+0.002
 tweet+0.001
 globe+0.001
 Saturn+0.001
Neg logit contributions
gh-0.002
 purpose-0.002
array-0.002
 behaviour-0.002
 Sister-0.002
Token they
Token ID484
Feature activation+1.87e-02

Pos logit contributions
gh+0.128
omm+0.127
child+0.125
array+0.125
uf+0.123
Neg logit contributions
 Sky-0.034
 together-0.033
 higher-0.032
 below-0.031
 closer-0.029
Token both
Token ID1111
Feature activation+2.72e-02

Pos logit contributions
gh+0.036
 purpose+0.034
array+0.034
omm+0.033
 behaviour+0.033
Neg logit contributions
 Sky-0.030
fly-0.030
 tweet-0.027
 below-0.026
 higher-0.025
Token began
Token ID2540
Feature activation+6.04e-02

Pos logit contributions
gh+0.050
omm+0.047
 purpose+0.047
array+0.046
urance+0.044
Neg logit contributions
 Sky-0.046
fly-0.046
 tweet-0.042
 below-0.042
als-0.040
Token to
Token ID284
Feature activation+2.97e-02

Pos logit contributions
omm+0.150
uf+0.145
illy+0.142
innie+0.142
gh+0.141
Neg logit contributions
 higher-0.063
 together-0.063
 Sky-0.062
fly-0.058
 below-0.058
Token cry
Token ID3960
Feature activation+3.32e-02

Pos logit contributions
gh+0.055
omm+0.053
addy+0.050
 behaviour+0.049
child+0.049
Neg logit contributions
 Sky-0.050
fly-0.050
 tweet-0.047
 higher-0.047
 soar-0.047
Token.
Token ID13
Feature activation+4.03e-02

Pos logit contributions
gh+0.068
array+0.065
omm+0.063
kids+0.062
 purpose+0.061
Neg logit contributions
fly-0.049
 Sky-0.049
 tweet-0.043
 together-0.043
 higher-0.042
Token
Token ID198
Feature activation+3.61e-02

Pos logit contributions
gh+0.108
 behaviour+0.099
array+0.097
 purpose+0.096
 virus+0.095
Neg logit contributions
 Sky-0.038
fly-0.034
 below-0.030
 together-0.029
 Bloom-0.027
Token
Token ID198
Feature activation+6.01e-02

Pos logit contributions
gh+0.095
 behaviour+0.094
array+0.094
 purpose+0.090
urance+0.090
Neg logit contributions
 Sky-0.031
fly-0.026
 below-0.021
 together-0.021
 Saturn-0.021
TokenSuddenly
Token ID38582
Feature activation+3.62e-02

Pos logit contributions
 behaviour+0.131
urance+0.125
gh+0.124
array+0.123
omm+0.122
Neg logit contributions
 together-0.066
 Sky-0.065
fly-0.062
 below-0.056
 higher-0.056
Token,
Token ID11
Feature activation+2.36e-02

Pos logit contributions
gh+0.082
array+0.079
 supposed+0.078
omm+0.078
urance+0.078
Neg logit contributions
 Sky-0.046
fly-0.044
 tweet-0.040
 together-0.038
 higher-0.038
Token a
Token ID257
Feature activation+3.69e-02

Pos logit contributions
gh+0.055
omm+0.050
array+0.049
urance+0.049
 purpose+0.049
Neg logit contributions
 Sky-0.031
fly-0.030
 tweet-0.026
 below-0.026
 higher-0.025
Token be
Token ID307
Feature activation-9.56e-03

Pos logit contributions
gh+0.010
 purpose+0.009
array+0.009
 Sister+0.009
 behaviour+0.009
Neg logit contributions
 Sky-0.010
fly-0.010
 tweet-0.009
 globe-0.008
 higher-0.008
Token a
Token ID257
Feature activation+7.53e-03

Pos logit contributions
fly+0.017
 Sky+0.017
 tweet+0.015
 globe+0.015
 Saturn+0.015
Neg logit contributions
gh-0.018
 purpose-0.016
array-0.015
 forgotten-0.015
 behaviour-0.015
Token reminder
Token ID15438
Feature activation-4.87e-03

Pos logit contributions
gh+0.013
array+0.011
 purpose+0.011
 behaviour+0.011
omm+0.011
Neg logit contributions
 Sky-0.014
fly-0.014
 tweet-0.013
 globe-0.012
 higher-0.012
Token to
Token ID284
Feature activation+2.79e-03

Pos logit contributions
fly+0.008
 Sky+0.008
 tweet+0.007
 Saturn+0.007
 globe+0.007
Neg logit contributions
gh-0.010
 purpose-0.009
 behaviour-0.008
array-0.008
omm-0.008
Token listen
Token ID6004
Feature activation+2.76e-04

Pos logit contributions
gh+0.005
 purpose+0.005
array+0.005
 behaviour+0.005
omm+0.005
Neg logit contributions
 Sky-0.005
fly-0.005
 tweet-0.004
 Saturn-0.004
 globe-0.004
Token.
Token ID13
Feature activation-4.71e-02

Pos logit contributions
gh+0.001
 purpose+0.000
array+0.000
 behaviour+0.000
 Sister+0.000
Neg logit contributions
 Sky-0.000
fly-0.000
 tweet-0.000
 globe-0.000
 Saturn-0.000
Token If
Token ID1002
Feature activation-3.38e-03

Pos logit contributions
fly+0.086
fish+0.080
 Sky+0.079
 globe+0.078
 below+0.078
Neg logit contributions
omm-0.076
 Sister-0.075
urance-0.071
 purpose-0.068
gh-0.068
Token you
Token ID345
Feature activation-6.13e-03

Pos logit contributions
fly+0.005
 Sky+0.005
 tweet+0.004
 globe+0.004
 Saturn+0.004
Neg logit contributions
gh-0.007
 purpose-0.007
array-0.007
 behaviour-0.007
 Sister-0.007
Token listen
Token ID6004
Feature activation-4.18e-03

Pos logit contributions
 Sky+0.010
fly+0.010
 tweet+0.009
 globe+0.009
 Saturn+0.009
Neg logit contributions
gh-0.011
 purpose-0.011
 behaviour-0.010
array-0.010
 Sister-0.010
Token carefully
Token ID7773
Feature activation+8.92e-03

Pos logit contributions
 Sky+0.006
fly+0.006
 tweet+0.006
 globe+0.005
 Saturn+0.005
Neg logit contributions
gh-0.009
 purpose-0.008
array-0.008
 behaviour-0.008
 Sister-0.008
Token,
Token ID11
Feature activation-4.96e-02

Pos logit contributions
gh+0.019
 purpose+0.017
array+0.017
 behaviour+0.017
omm+0.017
Neg logit contributions
fly-0.013
 Sky-0.013
 tweet-0.012
 Saturn-0.011
 globe-0.011
Token
Token ID198
Feature activation+1.64e-02

Pos logit contributions
gh+0.019
 purpose+0.017
array+0.017
 behaviour+0.017
 forgotten+0.017
Neg logit contributions
 Sky-0.010
fly-0.010
 tweet-0.009
 Saturn-0.008
 below-0.008
Token"
Token ID1
Feature activation-8.76e-03

Pos logit contributions
gh+0.036
 behaviour+0.033
array+0.033
 purpose+0.033
urance+0.033
Neg logit contributions
 Sky-0.023
fly-0.022
 tweet-0.020
 Saturn-0.018
 Bloom-0.018
TokenBecause
Token ID8128
Feature activation-4.26e-02

Pos logit contributions
 Sky+0.016
fly+0.016
 tweet+0.015
 globe+0.014
 Saturn+0.014
Neg logit contributions
gh-0.015
 purpose-0.013
array-0.013
 Sister-0.013
omm-0.013
Token,"
Token ID553
Feature activation-3.37e-02

Pos logit contributions
fly+0.068
 Sky+0.065
 tweet+0.060
 globe+0.057
 higher+0.055
Neg logit contributions
gh-0.076
 purpose-0.074
 behaviour-0.071
 Sister-0.070
 punishment-0.070
Token the
Token ID262
Feature activation-2.71e-02

Pos logit contributions
fly+0.061
 Sky+0.057
 globe+0.055
 tweet+0.053
flies+0.052
Neg logit contributions
 Sister-0.054
gh-0.053
 purpose-0.052
 Samantha-0.050
child-0.050
Token little
Token ID1310
Feature activation-2.66e-02

Pos logit contributions
fly+0.036
 Sky+0.032
 tweet+0.030
flies+0.029
 Saturn+0.029
Neg logit contributions
gh-0.058
 purpose-0.056
omm-0.055
 behaviour-0.054
 punishment-0.053
Token boy
Token ID2933
Feature activation-5.96e-05

Pos logit contributions
fly+0.039
 Sky+0.038
 Saturn+0.035
 tweet+0.034
 Jupiter+0.033
Neg logit contributions
 purpose-0.049
gh-0.049
 behaviour-0.048
 Sister-0.047
omm-0.046
Token said
Token ID531
Feature activation+8.45e-03

Pos logit contributions
 Sky+0.000
fly+0.000
 tweet+0.000
 globe+0.000
 Saturn+0.000
Neg logit contributions
gh-0.000
 purpose-0.000
array-0.000
 behaviour-0.000
 Sister-0.000
Token,
Token ID11
Feature activation+7.84e-03

Pos logit contributions
gh+0.016
array+0.015
 purpose+0.015
 behaviour+0.014
omm+0.014
Neg logit contributions
 Sky-0.014
fly-0.014
 tweet-0.013
 below-0.012
 globe-0.012
Token "
Token ID366
Feature activation-3.15e-02

Pos logit contributions
gh+0.016
 purpose+0.015
array+0.015
 behaviour+0.014
omm+0.014
Neg logit contributions
 Sky-0.012
fly-0.012
 tweet-0.011
 below-0.010
 globe-0.010
Tokenang
Token ID648
Feature activation-1.05e-02

Pos logit contributions
 Sky+0.061
fly+0.057
 tweet+0.054
 Saturn+0.052
 flap+0.052
Neg logit contributions
gh-0.050
 purpose-0.046
David-0.043
em-0.043
array-0.043
Token He
Token ID679
Feature activation-1.27e-02

Pos logit contributions
gh+0.026
 purpose+0.024
 behaviour+0.024
array+0.024
omm+0.023
Neg logit contributions
 Sky-0.016
fly-0.015
 Saturn-0.013
 tweet-0.013
 Jupiter-0.013
Token hoped
Token ID10719
Feature activation-4.06e-04

Pos logit contributions
fly+0.024
 Sky+0.024
 tweet+0.022
 globe+0.021
 Saturn+0.021
Neg logit contributions
gh-0.021
 forgotten-0.019
 Sister-0.018
array-0.018
 behaviour-0.018
Token he
Token ID339
Feature activation-3.64e-02

Pos logit contributions
fly+0.001
 Sky+0.001
 tweet+0.001
 globe+0.000
 Saturn+0.000
Neg logit contributions
gh-0.001
 purpose-0.001
array-0.001
 behaviour-0.001
 Sister-0.001
Token could
Token ID714
Feature activation-3.04e-02

Pos logit contributions
 Sky+0.062
fly+0.060
 Saturn+0.057
 tweet+0.055
 Jupiter+0.054
Neg logit contributions
gh-0.065
 forgotten-0.060
 behaviour-0.057
 disob-0.056
 Sister-0.054
Token play
Token ID711
Feature activation-4.05e-02

Pos logit contributions
fly+0.057
 Sky+0.056
 globe+0.052
 Jupiter+0.051
als+0.051
Neg logit contributions
gh-0.051
 purpose-0.047
 punish-0.043
 forgotten-0.043
 disob-0.043
Token basketball
Token ID9669
Feature activation-4.89e-02

Pos logit contributions
fly+0.069
 Sky+0.063
flies+0.059
als+0.056
 tweet+0.056
Neg logit contributions
gh-0.084
 purpose-0.075
 mischief-0.069
 trouble-0.069
 behaviour-0.069
Token and
Token ID290
Feature activation-3.47e-02

Pos logit contributions
fly+0.074
 Sky+0.069
 tweet+0.065
 Saturn+0.063
 Jupiter+0.062
Neg logit contributions
gh-0.104
 purpose-0.092
 behaviour-0.087
 forgotten-0.086
 punishment-0.085
Token have
Token ID423
Feature activation+2.55e-02

Pos logit contributions
fly+0.061
 Sky+0.059
 globe+0.053
 tweet+0.053
 Jupiter+0.052
Neg logit contributions
gh-0.061
 purpose-0.057
 forgotten-0.056
 behaviour-0.055
 punishment-0.055
Token fun
Token ID1257
Feature activation-5.74e-03

Pos logit contributions
omm+0.046
gh+0.045
 Sister+0.042
array+0.042
 behaviour+0.042
Neg logit contributions
 Sky-0.045
fly-0.039
 tweet-0.038
 higher-0.038
 together-0.038
Token.
Token ID13
Feature activation+8.57e-03

Pos logit contributions
fly+0.010
 Sky+0.010
 tweet+0.009
 Saturn+0.009
 globe+0.009
Neg logit contributions
gh-0.011
 purpose-0.010
array-0.009
 behaviour-0.009
 forgotten-0.009
Token He
Token ID679
Feature activation-1.28e-02

Pos logit contributions
gh+0.020
 purpose+0.018
 behaviour+0.018
array+0.018
omm+0.017
Neg logit contributions
 Sky-0.011
fly-0.011
 tweet-0.010
 Saturn-0.009
 Jupiter-0.009
Token who
Token ID508
Feature activation-2.36e-02

Pos logit contributions
fly+0.003
 Sky+0.003
 tweet+0.003
 globe+0.003
 Saturn+0.003
Neg logit contributions
gh-0.004
 purpose-0.004
array-0.004
 Sister-0.004
 behaviour-0.004
Token seemed
Token ID3947
Feature activation-4.54e-04

Pos logit contributions
 Sky+0.046
fly+0.046
 tweet+0.044
 Saturn+0.041
 higher+0.041
Neg logit contributions
gh-0.035
 forgotten-0.032
 Sister-0.032
array-0.030
 disob-0.030
Token very
Token ID845
Feature activation+1.10e-02

Pos logit contributions
 Sky+0.001
fly+0.001
 tweet+0.001
 globe+0.001
 Saturn+0.001
Neg logit contributions
gh-0.001
 purpose-0.001
array-0.001
 behaviour-0.001
 Sister-0.001
Token friendly
Token ID8030
Feature activation+1.26e-02

Pos logit contributions
gh+0.022
array+0.020
 purpose+0.020
omm+0.020
urance+0.020
Neg logit contributions
 Sky-0.017
fly-0.017
 tweet-0.015
 below-0.015
 higher-0.014
Token.
Token ID13
Feature activation-1.48e-02

Pos logit contributions
gh+0.025
 purpose+0.023
array+0.023
omm+0.023
 Sister+0.022
Neg logit contributions
fly-0.020
 Sky-0.020
 tweet-0.018
 below-0.017
 globe-0.016
Token
Token ID198
Feature activation-1.67e-02

Pos logit contributions
fly+0.025
 Sky+0.023
 tweet+0.023
 globe+0.021
flies+0.021
Neg logit contributions
gh-0.027
 Sister-0.026
 purpose-0.026
array-0.026
urance-0.025
Token
Token ID198
Feature activation+7.90e-04

Pos logit contributions
fly+0.027
 Sky+0.025
 tweet+0.024
flies+0.023
 globe+0.022
Neg logit contributions
gh-0.034
 purpose-0.031
array-0.029
 Sister-0.029
addy-0.029
TokenHe
Token ID1544
Feature activation-7.49e-03

Pos logit contributions
gh+0.001
 purpose+0.001
array+0.001
 behaviour+0.001
 Sister+0.001
Neg logit contributions
fly-0.001
 Sky-0.001
 tweet-0.001
 globe-0.001
 Saturn-0.001
Token crou
Token ID33838
Feature activation-8.48e-03

Pos logit contributions
 Sky+0.015
fly+0.015
 tweet+0.013
 globe+0.013
 Saturn+0.013
Neg logit contributions
gh-0.012
array-0.010
 purpose-0.010
 forgotten-0.010
 Sister-0.010
Tokenched
Token ID1740
Feature activation-4.96e-02

Pos logit contributions
 Sky+0.017
fly+0.017
 tweet+0.016
 globe+0.016
 Saturn+0.015
Neg logit contributions
gh-0.013
array-0.011
 forgotten-0.011
 behaviour-0.011
 Sister-0.011
Token down
Token ID866
Feature activation-2.03e-02

Pos logit contributions
fly+0.068
 globe+0.061
 Sky+0.058
 Saturn+0.058
flies+0.058
Neg logit contributions
 Sister-0.101
gh-0.100
anked-0.095
 Tommy-0.095
 behaviour-0.093
Token smiles
Token ID21845
Feature activation-4.01e-03

Pos logit contributions
fly+0.025
 Sky+0.024
 tweet+0.022
 globe+0.021
 Saturn+0.020
Neg logit contributions
gh-0.032
 purpose-0.030
 behaviour-0.029
array-0.028
 Sister-0.028
Token and
Token ID290
Feature activation+4.02e-03

Pos logit contributions
 Sky+0.005
fly+0.005
 tweet+0.005
 Saturn+0.004
 globe+0.004
Neg logit contributions
gh-0.009
 purpose-0.008
array-0.008
 behaviour-0.008
 Sister-0.008
Token hugs
Token ID40657
Feature activation+6.25e-03

Pos logit contributions
gh+0.007
 purpose+0.007
array+0.006
omm+0.006
 behaviour+0.006
Neg logit contributions
 Sky-0.007
fly-0.007
 tweet-0.006
 Saturn-0.006
 globe-0.006
Token to
Token ID284
Feature activation+2.17e-02

Pos logit contributions
gh+0.012
 purpose+0.011
array+0.011
omm+0.011
 behaviour+0.011
Neg logit contributions
 Sky-0.010
fly-0.010
 tweet-0.009
 globe-0.008
 Saturn-0.008
Token spread
Token ID4104
Feature activation-1.54e-02

Pos logit contributions
gh+0.048
array+0.044
omm+0.044
 purpose+0.043
 behaviour+0.042
Neg logit contributions
 Sky-0.029
fly-0.027
 tweet-0.025
 Saturn-0.024
 below-0.024
Token peace
Token ID4167
Feature activation-3.16e-02

Pos logit contributions
fly+0.024
 Sky+0.023
 tweet+0.021
 globe+0.020
 Saturn+0.019
Neg logit contributions
gh-0.030
 purpose-0.029
 behaviour-0.028
 Sister-0.028
 mischief-0.027
Token.
Token ID13
Feature activation+5.41e-02

Pos logit contributions
 Sky+0.055
fly+0.052
 tweet+0.050
 Saturn+0.047
 Jupiter+0.046
Neg logit contributions
gh-0.055
 purpose-0.052
 behaviour-0.051
 forgotten-0.049
 Sister-0.048
Token Every
Token ID3887
Feature activation+1.56e-02

Pos logit contributions
gh+0.134
 behaviour+0.117
 disob+0.114
 purpose+0.112
 mischief+0.110
Neg logit contributions
 Sky-0.061
 together-0.057
 Saturn-0.052
 below-0.052
 Jupiter-0.051
Token day
Token ID1110
Feature activation+6.20e-02

Pos logit contributions
gh+0.024
array+0.022
omm+0.022
 purpose+0.021
urance+0.021
Neg logit contributions
fly-0.031
 Sky-0.031
 tweet-0.028
 globe-0.027
 flap-0.027
Token they
Token ID484
Feature activation-4.76e-02

Pos logit contributions
gh+0.160
array+0.153
 purpose+0.152
omm+0.151
addy+0.149
Neg logit contributions
 together-0.055
 Sky-0.050
 below-0.047
 around-0.044
 aboard-0.043
Token reminded
Token ID14516
Feature activation+5.33e-02

Pos logit contributions
 Sky+0.098
fly+0.089
 globe+0.084
 tweet+0.083
 higher+0.082
Neg logit contributions
 forgotten-0.066
 disob-0.063
gh-0.063
 Sister-0.061
 happened-0.058
Token trumpet
Token ID40334
Feature activation-5.28e-02

Pos logit contributions
fly+0.112
 tweet+0.099
bug+0.096
als+0.095
 Sky+0.091
Neg logit contributions
gh-0.116
 purpose-0.113
 business-0.102
kids-0.096
 laundry-0.091
Token.
Token ID13
Feature activation-3.47e-02

Pos logit contributions
fly+0.102
 tweet+0.099
 Sky+0.096
flies+0.087
 globe+0.087
Neg logit contributions
gh-0.087
 purpose-0.077
 business-0.066
 behaviour-0.065
 time-0.064
Token She
Token ID1375
Feature activation-2.25e-02

Pos logit contributions
fly+0.079
fish+0.071
flies+0.070
 Sky+0.069
 tweet+0.068
Neg logit contributions
 purpose-0.044
gh-0.042
urance-0.041
array-0.039
omm-0.038
Token would
Token ID561
Feature activation-4.02e-02

Pos logit contributions
fly+0.043
 Sky+0.042
 tweet+0.039
 below+0.038
flies+0.037
Neg logit contributions
gh-0.038
 forgotten-0.032
 Sister-0.031
 purpose-0.031
addy-0.030
Token play
Token ID711
Feature activation-8.55e-02

Pos logit contributions
fly+0.073
 Sky+0.069
flies+0.065
als+0.063
 globe+0.062
Neg logit contributions
gh-0.070
 purpose-0.060
 disob-0.058
 Sister-0.058
 forgotten-0.058
Token it
Token ID340
Feature activation-9.79e-02

Pos logit contributions
fly+0.191
flies+0.174
als+0.157
bug+0.151
ite+0.148
Neg logit contributions
gh-0.147
 purpose-0.115
 time-0.114
ggie-0.105
 accident-0.103
Token every
Token ID790
Feature activation-1.55e-02

Pos logit contributions
fly+0.199
flies+0.176
als+0.171
 globe+0.164
rots+0.164
Neg logit contributions
gh-0.168
 time-0.158
 purpose-0.124
 accident-0.118
 business-0.118
Token day
Token ID1110
Feature activation-1.73e-02

Pos logit contributions
fly+0.031
 Sky+0.031
 tweet+0.028
flies+0.028
 Saturn+0.027
Neg logit contributions
gh-0.024
 purpose-0.022
 behaviour-0.020
array-0.020
 Sister-0.020
Token and
Token ID290
Feature activation-6.01e-02

Pos logit contributions
fly+0.027
 Sky+0.026
 tweet+0.024
 globe+0.023
flies+0.022
Neg logit contributions
gh-0.035
 purpose-0.032
array-0.031
 forgotten-0.031
 behaviour-0.030
Token all
Token ID477
Feature activation-3.53e-02

Pos logit contributions
fly+0.107
 Sky+0.105
 globe+0.098
rots+0.097
flies+0.092
Neg logit contributions
gh-0.097
 forgotten-0.088
 purpose-0.087
 behaviour-0.084
 Sister-0.083
Token her
Token ID607
Feature activation-5.71e-03

Pos logit contributions
fly+0.074
 Sky+0.073
flies+0.067
 Jupiter+0.067
 Saturn+0.067
Neg logit contributions
gh-0.047
 purpose-0.042
 supposed-0.042
 time-0.041
 forgotten-0.039
Token.
Token ID13
Feature activation-1.68e-02

Pos logit contributions
fly+0.088
 Sky+0.085
 tweet+0.084
 Jupiter+0.079
 Saturn+0.078
Neg logit contributions
gh-0.066
 purpose-0.058
 forgotten-0.057
 punishment-0.056
anked-0.056
Token They
Token ID1119
Feature activation-3.71e-02

Pos logit contributions
fly+0.030
 Sky+0.028
 tweet+0.027
 globe+0.025
 flap+0.025
Neg logit contributions
gh-0.029
 purpose-0.028
urance-0.027
 Sister-0.027
array-0.027
Token make
Token ID787
Feature activation-2.25e-02

Pos logit contributions
fly+0.070
 Sky+0.067
flies+0.061
 Dragon+0.059
 Jupiter+0.058
Neg logit contributions
gh-0.061
 forgotten-0.058
 punish-0.057
 disob-0.055
 purpose-0.055
Token a
Token ID257
Feature activation-4.14e-02

Pos logit contributions
fly+0.040
 Sky+0.038
 tweet+0.035
 globe+0.033
flies+0.032
Neg logit contributions
gh-0.043
 purpose-0.040
 mischief-0.038
array-0.038
 behaviour-0.037
Token big
Token ID1263
Feature activation-3.18e-02

Pos logit contributions
fly+0.088
 Sky+0.085
als+0.080
flies+0.078
 below+0.077
Neg logit contributions
gh-0.063
 purpose-0.060
 punishment-0.056
 mistake-0.053
addy-0.050
Token rainbow
Token ID27223
Feature activation-7.72e-02

Pos logit contributions
 Sky+0.074
fly+0.073
flies+0.065
 Fly+0.065
als+0.065
Neg logit contributions
gh-0.040
 purpose-0.039
array-0.035
 behaviour-0.035
 punishment-0.034
Token with
Token ID351
Feature activation-2.19e-02

Pos logit contributions
 Sky+0.165
 Fly+0.154
 Silver+0.153
 tweet+0.152
fly+0.151
Neg logit contributions
gh-0.100
 purpose-0.086
 time-0.075
anked-0.071
 accident-0.071
Token all
Token ID477
Feature activation-1.86e-02

Pos logit contributions
fly+0.036
 Sky+0.036
 tweet+0.033
 Fly+0.030
flies+0.030
Neg logit contributions
gh-0.041
 purpose-0.040
 forgotten-0.037
 mischief-0.035
array-0.035
Token the
Token ID262
Feature activation-3.45e-04

Pos logit contributions
fly+0.031
 Sky+0.031
 tweet+0.029
flies+0.026
als+0.025
Neg logit contributions
gh-0.034
 purpose-0.034
array-0.031
em-0.031
 forgotten-0.031
Token colors
Token ID7577
Feature activation-3.49e-02

Pos logit contributions
 Sky+0.001
fly+0.001
 tweet+0.001
 Saturn+0.000
 globe+0.000
Neg logit contributions
gh-0.001
 purpose-0.001
array-0.001
 behaviour-0.001
 Sister-0.001
Token.
Token ID13
Feature activation-1.68e-02

Pos logit contributions
 Sky+0.079
fly+0.075
 tweet+0.073
 Fly+0.069
 Bloom+0.069
Neg logit contributions
gh-0.045
 purpose-0.040
 forgotten-0.036
anked-0.036
 behaviour-0.036
Token "
Token ID366
Feature activation+2.91e-03

Pos logit contributions
gh+0.055
 purpose+0.049
 behaviour+0.049
array+0.048
 supposed+0.048
Neg logit contributions
 Sky-0.028
fly-0.026
 tweet-0.022
 below-0.022
 Saturn-0.020
TokenDo
Token ID5211
Feature activation+4.38e-03

Pos logit contributions
gh+0.005
 purpose+0.005
array+0.005
 behaviour+0.005
omm+0.005
Neg logit contributions
 Sky-0.005
fly-0.005
 tweet-0.005
 globe-0.004
 Saturn-0.004
Token you
Token ID345
Feature activation+1.95e-02

Pos logit contributions
gh+0.010
 purpose+0.009
array+0.009
 behaviour+0.009
 Sister+0.009
Neg logit contributions
 Sky-0.006
fly-0.006
 tweet-0.005
 Saturn-0.005
 globe-0.005
Token want
Token ID765
Feature activation+1.05e-02

Pos logit contributions
gh+0.042
omm+0.041
 purpose+0.039
array+0.039
 Sister+0.039
Neg logit contributions
fly-0.027
 tweet-0.025
 Sky-0.025
 flap-0.023
 soar-0.023
Token to
Token ID284
Feature activation-2.96e-03

Pos logit contributions
gh+0.021
array+0.019
 behaviour+0.019
omm+0.018
 purpose+0.018
Neg logit contributions
 Sky-0.017
fly-0.017
 tweet-0.015
 Saturn-0.014
 higher-0.014
Token come
Token ID1282
Feature activation-7.78e-02

Pos logit contributions
fly+0.004
 Sky+0.004
 tweet+0.004
 globe+0.004
 Saturn+0.004
Neg logit contributions
gh-0.006
 purpose-0.006
array-0.006
 behaviour-0.006
 Sister-0.005
Token fly
Token ID6129
Feature activation-6.10e-02

Pos logit contributions
 Sky+0.128
 valleys+0.124
 planets+0.122
als+0.122
fly+0.120
Neg logit contributions
gh-0.118
 purpose-0.117
 Tommy-0.103
 punish-0.100
 supposed-0.099
Token my
Token ID616
Feature activation-4.17e-02

Pos logit contributions
 Sky+0.103
fly+0.101
 tweet+0.097
 globe+0.094
flies+0.093
Neg logit contributions
 purpose-0.108
gh-0.106
array-0.089
 time-0.086
 forgotten-0.085
Token k
Token ID479
Feature activation-3.60e-02

Pos logit contributions
fly+0.071
 Sky+0.066
 tweet+0.063
 Saturn+0.060
 Jupiter+0.060
Neg logit contributions
 purpose-0.077
gh-0.070
 behaviour-0.067
 business-0.067
 punishment-0.066
Tokenite
Token ID578
Feature activation-7.77e-02

Pos logit contributions
 Sky+0.042
fly+0.040
 Jupiter+0.039
 Saturn+0.038
 tweet+0.037
Neg logit contributions
gh-0.081
 purpose-0.078
omm-0.074
anked-0.073
array-0.073
Token with
Token ID351
Feature activation+1.87e-04

Pos logit contributions
 Sky+0.154
fly+0.145
 globe+0.142
 Saturn+0.142
 Venus+0.141
Neg logit contributions
gh-0.112
 purpose-0.109
array-0.093
 punish-0.088
 disob-0.087
Token you
Token ID345
Feature activation-2.30e-02

Pos logit contributions
fly+0.068
 Sky+0.062
 globe+0.060
flies+0.056
 below+0.055
Neg logit contributions
gh-0.063
 behaviour-0.055
 disob-0.055
 punishment-0.054
 purpose-0.053
Token can
Token ID460
Feature activation+7.64e-03

Pos logit contributions
 Sky+0.032
fly+0.032
flies+0.026
 below+0.025
 tweet+0.025
Neg logit contributions
gh-0.051
 forgotten-0.047
 purpose-0.046
 behaviour-0.045
urance-0.045
Token see
Token ID766
Feature activation-5.24e-02

Pos logit contributions
gh+0.015
 purpose+0.014
array+0.014
 behaviour+0.013
 Sister+0.013
Neg logit contributions
fly-0.012
 Sky-0.012
 tweet-0.011
 globe-0.010
 Saturn-0.010
Token more
Token ID517
Feature activation-2.05e-02

Pos logit contributions
fly+0.098
 Sky+0.093
stack+0.089
 globe+0.087
 tweet+0.087
Neg logit contributions
gh-0.078
 purpose-0.078
 mischief-0.072
 trouble-0.070
em-0.069
Token things
Token ID1243
Feature activation-3.79e-02

Pos logit contributions
fly+0.038
 Sky+0.037
 tweet+0.034
 globe+0.032
stack+0.031
Neg logit contributions
gh-0.035
 purpose-0.034
 mischief-0.033
array-0.033
 behaviour-0.032
Token.
Token ID13
Feature activation-7.54e-02

Pos logit contributions
 Sky+0.071
fly+0.071
 tweet+0.069
 globe+0.063
 Saturn+0.062
Neg logit contributions
gh-0.060
 purpose-0.055
 supposed-0.054
 forgotten-0.053
array-0.049
Token I
Token ID314
Feature activation-1.66e-02

Pos logit contributions
fish+0.166
fly+0.164
ls+0.153
 Sky+0.151
 globe+0.151
Neg logit contributions
 Sister-0.085
omm-0.085
urance-0.083
 purpose-0.077
array-0.076
Token have
Token ID423
Feature activation+7.87e-03

Pos logit contributions
 Sky+0.028
fly+0.026
 tweet+0.024
 Saturn+0.023
 globe+0.023
Neg logit contributions
gh-0.032
 forgotten-0.030
 purpose-0.029
 behaviour-0.029
array-0.028
Token some
Token ID617
Feature activation+1.37e-02

Pos logit contributions
gh+0.016
 purpose+0.014
array+0.014
omm+0.014
 behaviour+0.014
Neg logit contributions
 Sky-0.013
fly-0.012
 tweet-0.011
 Saturn-0.011
 Jupiter-0.010
Token slides
Token ID19392
Feature activation-2.02e-02

Pos logit contributions
gh+0.028
 purpose+0.025
array+0.025
omm+0.025
 Sister+0.024
Neg logit contributions
 Sky-0.021
fly-0.021
 tweet-0.019
 Saturn-0.018
 below-0.018
Token with
Token ID351
Feature activation-1.18e-02

Pos logit contributions
 Sky+0.034
fly+0.034
 tweet+0.031
 Saturn+0.029
 Jupiter+0.029
Neg logit contributions
gh-0.036
 purpose-0.034
array-0.032
 Sister-0.032
em-0.032
Token he
Token ID339
Feature activation-4.80e-02

Pos logit contributions
fly+0.038
 Sky+0.038
 tweet+0.034
 globe+0.033
 Saturn+0.032
Neg logit contributions
gh-0.049
 purpose-0.046
 behaviour-0.045
 Sister-0.044
 forgotten-0.044
Token can
Token ID460
Feature activation-4.73e-02

Pos logit contributions
 Sky+0.086
fly+0.082
 globe+0.075
 Saturn+0.074
 tweet+0.074
Neg logit contributions
gh-0.081
 forgotten-0.077
 behaviour-0.072
 disob-0.072
 Sister-0.071
Token keep
Token ID1394
Feature activation-2.36e-02

Pos logit contributions
 Sky+0.083
fly+0.083
 Saturn+0.077
 globe+0.076
 Jupiter+0.075
Neg logit contributions
gh-0.083
 purpose-0.073
 disob-0.069
 forgotten-0.068
 punish-0.068
Token flying
Token ID7348
Feature activation-6.33e-02

Pos logit contributions
fly+0.043
 Sky+0.042
 tweet+0.038
als+0.037
 globe+0.036
Neg logit contributions
gh-0.043
 purpose-0.039
 mischief-0.038
 behaviour-0.037
 Sister-0.037
Token up
Token ID510
Feature activation-2.60e-02

Pos logit contributions
 Sky+0.104
fly+0.102
 tweet+0.090
 Saturn+0.089
 globe+0.088
Neg logit contributions
gh-0.119
 purpose-0.112
 behaviour-0.101
 forgotten-0.098
 mischief-0.096
Token and
Token ID290
Feature activation-8.61e-02

Pos logit contributions
 Sky+0.038
fly+0.038
 tweet+0.034
 globe+0.032
 Saturn+0.031
Neg logit contributions
gh-0.054
 purpose-0.050
 forgotten-0.049
 behaviour-0.048
array-0.048
Token exploring
Token ID13504
Feature activation-6.23e-02

Pos logit contributions
 Sky+0.123
fly+0.113
 globe+0.108
rots+0.106
 Saturn+0.105
Neg logit contributions
 forgotten-0.158
 mischief-0.152
 purpose-0.151
 behaviour-0.149
gh-0.149
Token the
Token ID262
Feature activation-3.70e-02

Pos logit contributions
fly+0.125
 Sky+0.116
 tweet+0.107
flies+0.106
bug+0.102
Neg logit contributions
gh-0.098
 purpose-0.094
 mischief-0.086
 behaviour-0.082
 business-0.082
Token sky
Token ID6766
Feature activation-6.09e-02

Pos logit contributions
fly+0.070
 Sky+0.069
 tweet+0.060
flies+0.057
 Saturn+0.056
Neg logit contributions
 purpose-0.061
array-0.059
gh-0.058
 mischief-0.058
 forgotten-0.056
Token.
Token ID13
Feature activation-1.69e-02

Pos logit contributions
 Sky+0.125
fly+0.121
 tweet+0.118
 Saturn+0.113
 globe+0.112
Neg logit contributions
gh-0.086
 purpose-0.076
 forgotten-0.071
 behaviour-0.069
em-0.064
Token<|endoftext|>
Token ID50256
Feature activation-2.49e-02

Pos logit contributions
fly+0.029
 Sky+0.028
 tweet+0.026
flies+0.024
 globe+0.024
Neg logit contributions
gh-0.031
 Sister-0.029
 purpose-0.029
omm-0.028
urance-0.028