TOP ACTIVATIONS
MAX = 0.127

wants to play with us ." Tom comes back
But no biting , okay ?" Max bark ed and nodded
are playing with their toys . They have many toys of
have to say sorry first ." L ily did
It was an unknown mistake ." L ily felt
wants to hurt us !" A man hears Tom
ila and Ben are friends . They like to play outside
to say sorry first ." L ily did not
something else to play with ." L ily and
deserve to ride on me ." Tim and Mia
, there were two twins . They were named Jake and
look pretty and y ummy ." They add everything
to play with us ." Tom comes back .
the gas can . OK ?" S ara said
time there were two bears . One was big and the
want to make a fort ." M ia said
It is very y ummy ." L ily felt
to say hello to him ?" Tom and Mia
you . Are you okay ?" she says .
you unhappy and angry ." Tom and Sam looked

MIN ACTIVATIONS
MIN = -0.161

disturb us or take our gold !" L ily
to be gold is not gold , but something else that
. I don 't need gold ice cream . I have
Maybe the treasure is not gold or jewels ," L ila
light en up the wheel bar row . <|endoftext|> Once there
to help him keep his gold safe . They agreed and
umpy and he shared his gold with everyone who needed it
slide is not made of gold , but it is brilliant
he couldn 't keep his gold safe by himself . He
friendship are more precious than gold ." L ily
co op to keep the gold safe . From
Did you kids find my gold watch ?"
. I will trade my navy boat for your red boat
real treasure chest filled with gold coins and jewels .
, you will get more gold ." The boy was greedy
he had never found the gold . He learned a lesson
It will only cost one gold coin ". The dragon was
that it is not real gold . It is chocolate !
treasure chest full of shiny gold coins ! Tim , Sally
she would never see her gold coin again . <|endoftext|> Once

INTERVAL 0.055 - 0.127
CONTAINS 16.463%

hammer ing and saw ing . After a while , they
be and decided to investigate . M ia grabbed
They look at the bucket . The fish and frogs are
a little girl named Lily . She loved to swing in
castle with towers and walls . Lily suggested they work together

INTERVAL -0.017 - 0.055
CONTAINS 45.084%

said . They went home and got some paper
be friends with the pig , but the pig didn 't
H annah said ,
little girl was so excited that she shouted , " Y
The animals brought sticks and leaves to the stream .

INTERVAL -0.089 - -0.017
CONTAINS 34.349%

. Miss y knew that even if the sun went away
loved to slide down the big hill in the park .
a prayer of thanks for the delicious egg she ate that
gave her a big bag of peanuts . Lily was so
decided to fold leaves and use them as little boats to

INTERVAL -0.161 - -0.089
CONTAINS 4.105%

make a sweater for her doll , Ros ie . She
some paper and cr ay ons too . She cut the
a way to make the dragon go away . He put
and cute , with fluffy feathers and a tiny be ak
to go back to the zoo again . <|endoftext|> Once upon
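
The four INTERVAL ranges above look like equal-width bins between MIN and MAX. This is an inference from the listed numbers, not something the dashboard states. The sketch below recomputes the bin edges under that assumption and sums the CONTAINS percentages. The names `bin_edges`, `MIN_ACT`, `MAX_ACT` and `N_BINS` are hypothetical and chosen only for illustration.

```python
# Assumption: the dashboard splits [MIN, MAX] into four equal-width bins.
MIN_ACT = -0.161   # value shown under MIN ACTIVATIONS
MAX_ACT = 0.127    # value shown under TOP ACTIVATIONS
N_BINS = 4         # number of INTERVAL blocks in the listing


def bin_edges(lo, hi, n):
    """Return n + 1 equally spaced edges from lo to hi, rounded to 3 decimals."""
    width = (hi - lo) / n
    return [round(lo + i * width, 3) for i in range(n + 1)]


edges = bin_edges(MIN_ACT, MAX_ACT, N_BINS)

# Pair consecutive edges and list the highest interval first, as the dashboard does.
intervals = list(zip(edges[:-1], edges[1:]))[::-1]

# Percentages copied from the CONTAINS lines, highest interval first.
shares = [16.463, 45.084, 34.349, 4.105]
total = round(sum(shares), 3)

for (lo, hi), share in zip(intervals, shares):
    print(f"INTERVAL {lo} - {hi}  CONTAINS {share}%")
print("total share:", total)
```

Under this assumption each bin is 0.072 wide, and the recomputed edges match the four INTERVAL headers. The shares sum to roughly 100%, with the small excess consistent with rounding in the displayed percentages.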
Token wants
Token ID3382
Feature activation+1.81e-02

Pos logit contributions
ies+0.037
els+0.035
ishing+0.034
ins+0.033
ier+0.031
Neg logit contributions
 Curious-0.028
 Fasc-0.027
 shouldn-0.027
 Are-0.026
 Looking-0.026
Token to
Token ID284
Feature activation+2.93e-02

Pos logit contributions
 shouldn+0.033
 Curious+0.033
 Are+0.032
+0.032
 hadn+0.031
Neg logit contributions
ies-0.033
els-0.032
ishing-0.031
ins-0.030
ier-0.028
Token play
Token ID711
Feature activation-7.70e-03

Pos logit contributions
 Curious+0.048
 Fasc+0.046
 Looking+0.045
 shouldn+0.045
 Are+0.045
Neg logit contributions
ies-0.061
els-0.058
ishing-0.054
ins-0.053
 times-0.051
Token with
Token ID351
Feature activation+4.16e-03

Pos logit contributions
ies+0.015
els+0.014
ins+0.013
ishing+0.013
ier+0.012
Neg logit contributions
 Are-0.014
-0.014
 Curious-0.014
 It-0.014
 Exc-0.014
Token us
Token ID514
Feature activation-4.82e-02

Pos logit contributions
 Curious+0.005
 Are+0.005
 shouldn+0.005
 Exc+0.005
+0.005
Neg logit contributions
ies-0.010
ishing-0.009
els-0.009
ins-0.009
ier-0.009
Token."
Token ID526
Feature activation+1.27e-01

Pos logit contributions
els+0.072
ishing+0.066
ies+0.063
ins+0.062
ier+0.059
Neg logit contributions
-0.103
 shouldn-0.100
 hadn-0.094
 Curious-0.092
 Are-0.091
Token\n
Token ID198
Feature activation+1.24e-01

Pos logit contributions
 shouldn+0.174
 hadn+0.165
 Fasc+0.161
 Curious+0.159
 Looking+0.151
Neg logit contributions
ies-0.288
els-0.267
ins-0.256
 again-0.252
er-0.249
Token\n
Token ID198
Feature activation+1.15e-01

Pos logit contributions
 shouldn+0.181
 hadn+0.175
 Fasc+0.173
 Curious+0.172
 Are+0.162
Neg logit contributions
ies-0.270
els-0.256
ins-0.249
ock-0.234
--0.231
TokenTom
Token ID13787
Feature activation+3.96e-02

Pos logit contributions
 shouldn+0.213
 Curious+0.206
 Fasc+0.204
 hadn+0.203
 Terr+0.194
Neg logit contributions
ies-0.206
els-0.195
ins-0.187
 times-0.174
ock-0.173
Token comes
Token ID2058
Feature activation+1.90e-02

Pos logit contributions
 Curious+0.087
 Fasc+0.085
 shouldn+0.084
 Are+0.084
 Looking+0.084
Neg logit contributions
ies-0.060
els-0.056
ishing-0.052
ins-0.050
 times-0.047
Token back
Token ID736
Feature activation-5.18e-03

Pos logit contributions
 Curious+0.032
+0.031
 Looking+0.030
 Fasc+0.030
 Exc+0.029
Neg logit contributions
els-0.037
ies-0.037
ishing-0.036
 times-0.035
ier-0.035
Token But
Token ID887
Feature activation+5.33e-02

Pos logit contributions
 shouldn+0.108
 hadn+0.107
 Fasc+0.104
 Curious+0.102
 Terr+0.101
Neg logit contributions
els-0.216
ies-0.207
er-0.203
 blocks-0.197
ins-0.197
Token no
Token ID645
Feature activation-1.39e-02

Pos logit contributions
 Are+0.076
 Curious+0.074
 hadn+0.073
 shouldn+0.073
+0.071
Neg logit contributions
ies-0.119
els-0.115
ishing-0.112
ins-0.111
ier-0.104
Token biting
Token ID30791
Feature activation-6.02e-02

Pos logit contributions
ins+0.026
els+0.026
ishing+0.025
ies+0.023
ier+0.022
Neg logit contributions
 Are-0.025
-0.023
 I-0.022
 Would-0.021
 Where-0.020
Token,
Token ID11
Feature activation+3.88e-02

Pos logit contributions
ies+0.120
ishing+0.118
els+0.113
ins+0.105
ooming+0.104
Neg logit contributions
-0.093
 shouldn-0.092
 Are-0.092
 It-0.091
 Where-0.086
Token okay
Token ID8788
Feature activation-1.71e-03

Pos logit contributions
 Are+0.048
+0.047
 shouldn+0.047
 Curious+0.047
 hadn+0.046
Neg logit contributions
ies-0.093
ishing-0.089
els-0.089
ins-0.086
ier-0.082
Token?"
Token ID1701
Feature activation+1.25e-01

Pos logit contributions
ishing+0.003
els+0.003
ins+0.003
isher+0.003
ier+0.003
Neg logit contributions
 Are-0.003
-0.003
 shouldn-0.003
 I-0.003
 Where-0.003
Token Max
Token ID5436
Feature activation+3.22e-02

Pos logit contributions
 shouldn+0.186
 hadn+0.174
 Fasc+0.171
 Curious+0.170
've+0.164
Neg logit contributions
ies-0.254
 again-0.237
els-0.235
ins-0.225
er-0.223
Token bark
Token ID21405
Feature activation-9.89e-03

Pos logit contributions
 Curious+0.056
 Fasc+0.055
 shouldn+0.055
 hadn+0.055
Don+0.054
Neg logit contributions
ies-0.061
els-0.056
ins-0.052
ishing-0.052
er-0.051
Tokened
Token ID276
Feature activation+4.32e-03

Pos logit contributions
ies+0.020
els+0.019
ishing+0.018
ins+0.017
 times+0.017
Neg logit contributions
 Curious-0.018
 Looking-0.017
 Fasc-0.016
 Exc-0.016
Don-0.016
Token and
Token ID290
Feature activation+5.40e-02

Pos logit contributions
 Curious+0.010
+0.010
 It+0.010
 Looking+0.010
 Exc+0.010
Neg logit contributions
els-0.007
ies-0.006
ins-0.006
ishing-0.006
ries-0.005
Token nodded
Token ID14464
Feature activation+4.57e-03

Pos logit contributions
 shouldn+0.085
 Curious+0.085
 Fasc+0.085
 Where+0.081
 Looking+0.081
Neg logit contributions
ies-0.113
els-0.102
ishing-0.097
ins-0.095
 again-0.094
Token are
Token ID389
Feature activation+7.49e-02

Pos logit contributions
 Curious+0.185
 Fasc+0.181
 shouldn+0.179
 Are+0.179
 Looking+0.179
Neg logit contributions
ies-0.069
els-0.060
ishing-0.053
ins-0.051
 times-0.045
Token playing
Token ID2712
Feature activation+1.85e-02

Pos logit contributions
 Curious+0.144
+0.142
 Fasc+0.140
 shouldn+0.138
Don+0.137
Neg logit contributions
ies-0.126
els-0.125
ishing-0.117
er-0.111
ier-0.108
Token with
Token ID351
Feature activation+5.16e-02

Pos logit contributions
 Curious+0.033
 Fasc+0.033
 Exc+0.031
 Looking+0.031
+0.031
Neg logit contributions
ies-0.034
els-0.034
ier-0.032
reating-0.030
ock-0.030
Token their
Token ID511
Feature activation+1.22e-02

Pos logit contributions
 shouldn+0.121
 Curious+0.120
 hadn+0.118
 Are+0.117
 Fasc+0.116
Neg logit contributions
ies-0.074
els-0.069
ishing-0.059
 toys-0.058
ins-0.058
Token toys
Token ID14958
Feature activation-3.29e-02

Pos logit contributions
 Curious+0.028
 Fasc+0.027
 Looking+0.027
 Are+0.027
 Exc+0.027
Neg logit contributions
ies-0.017
els-0.016
ishing-0.015
ins-0.015
ier-0.014
Token.
Token ID13
Feature activation+1.25e-01

Pos logit contributions
ies+0.063
els+0.062
reating+0.060
ins+0.059
ishing+0.058
Neg logit contributions
-0.052
 couldn-0.047
 hadn-0.047
 Suddenly-0.047
Don-0.047
Token They
Token ID1119
Feature activation+4.65e-02

Pos logit contributions
 shouldn+0.195
 Curious+0.185
 Terr+0.185
 hadn+0.185
Don+0.182
Neg logit contributions
ies-0.261
els-0.248
 times-0.229
 blocks-0.221
ins-0.220
Token have
Token ID423
Feature activation+3.72e-02

Pos logit contributions
 Terr+0.080
 Curious+0.079
 Fasc+0.078
 Looking+0.077
 Int+0.073
Neg logit contributions
ies-0.096
els-0.085
 times-0.081
 toys-0.080
 blocks-0.079
Token many
Token ID867
Feature activation-4.83e-02

Pos logit contributions
 Curious+0.064
 Fasc+0.062
 Are+0.061
 shouldn+0.061
 Looking+0.061
Neg logit contributions
ies-0.074
els-0.069
ishing-0.065
ins-0.065
 times-0.061
Token toys
Token ID14958
Feature activation-4.76e-02

Pos logit contributions
ier+0.055
ishing+0.048
ies+0.047
els+0.043
ock+0.043
Neg logit contributions
 Curious-0.115
 Fasc-0.114
 Exc-0.112
-0.110
 hadn-0.109
Token of
Token ID286
Feature activation-1.34e-03

Pos logit contributions
ishing+0.084
ies+0.081
els+0.080
ins+0.078
ock+0.072
Neg logit contributions
-0.087
 shouldn-0.081
 hadn-0.080
 had-0.079
 It-0.079
Token have
Token ID423
Feature activation+1.12e-02

Pos logit contributions
 Curious+0.002
 Fasc+0.002
 Exc+0.002
 Looking+0.002
 Int+0.002
Neg logit contributions
ies-0.002
els-0.002
ishing-0.002
ins-0.002
 again-0.002
Token to
Token ID284
Feature activation+2.61e-02

Pos logit contributions
 shouldn+0.019
 hadn+0.018
 Curious+0.018
+0.018
 Fasc+0.018
Neg logit contributions
ies-0.022
els-0.022
ishing-0.020
ins-0.020
ier-0.018
Token say
Token ID910
Feature activation+1.11e-02

Pos logit contributions
 Curious+0.038
 shouldn+0.037
 Fasc+0.037
 Are+0.037
 hadn+0.036
Neg logit contributions
ies-0.058
els-0.055
ishing-0.053
ins-0.053
ier-0.049
Token sorry
Token ID7926
Feature activation-1.59e-02

Pos logit contributions
 Curious+0.015
 Are+0.014
 Fasc+0.014
 Looking+0.014
+0.014
Neg logit contributions
ies-0.026
els-0.025
ishing-0.025
ins-0.023
ier-0.022
Token first
Token ID717
Feature activation-1.04e-02

Pos logit contributions
els+0.033
ishing+0.029
ig+0.027
ooming+0.027
ries+0.026
Neg logit contributions
 Are-0.028
 It-0.028
-0.027
 Where-0.027
 shouldn-0.027
Token."
Token ID526
Feature activation+1.25e-01

Pos logit contributions
els+0.016
ishing+0.015
ies+0.014
ins+0.014
ooming+0.013
Neg logit contributions
 shouldn-0.024
-0.024
 hadn-0.023
 Are-0.022
 It-0.022
Token\n
Token ID198
Feature activation+1.25e-01

Pos logit contributions
 shouldn+0.158
 Fasc+0.150
 hadn+0.149
 Curious+0.149
 Looking+0.140
Neg logit contributions
ies-0.292
els-0.270
ins-0.261
 again-0.260
er-0.256
Token\n
Token ID198
Feature activation+1.15e-01

Pos logit contributions
 shouldn+0.190
 hadn+0.183
 Fasc+0.182
 Curious+0.180
 Looking+0.167
Neg logit contributions
ies-0.259
els-0.244
ins-0.241
 times-0.223
 again-0.222
TokenL
Token ID43
Feature activation+4.06e-02

Pos logit contributions
 shouldn+0.215
 Curious+0.209
 Fasc+0.208
 hadn+0.204
 Looking+0.197
Neg logit contributions
ies-0.200
els-0.186
ins-0.184
 times-0.172
ock-0.165
Tokenily
Token ID813
Feature activation+4.31e-02

Pos logit contributions
 Curious+0.051
 hadn+0.050
 Are+0.050
 shouldn+0.050
 Looking+0.050
Neg logit contributions
ies-0.093
els-0.089
ishing-0.089
 angels-0.082
 times-0.081
Token did
Token ID750
Feature activation+5.30e-02

Pos logit contributions
 Curious+0.078
 Fasc+0.077
 Are+0.076
 shouldn+0.075
 Looking+0.075
Neg logit contributions
ies-0.080
els-0.075
ishing-0.070
ins-0.069
er-0.066
Token It
Token ID632
Feature activation+9.56e-03

Pos logit contributions
 Curious+0.106
 Fasc+0.103
 hadn+0.103
 shouldn+0.100
 Terr+0.098
Neg logit contributions
els-0.227
ies-0.219
ins-0.206
er-0.205
ishing-0.201
Token was
Token ID373
Feature activation+1.64e-02

Pos logit contributions
 Curious+0.016
 shouldn+0.016
 Fasc+0.016
 Are+0.015
 Looking+0.015
Neg logit contributions
ies-0.019
els-0.018
ishing-0.017
ins-0.017
 times-0.016
Token an
Token ID281
Feature activation-3.09e-02

Pos logit contributions
 shouldn+0.026
 Curious+0.026
 Fasc+0.025
 hadn+0.025
 Are+0.025
Neg logit contributions
ies-0.034
els-0.032
ishing-0.031
ins-0.031
ier-0.029
Token unknown
Token ID6439
Feature activation-8.38e-02

Pos logit contributions
ies+0.072
ins+0.070
els+0.065
ishing+0.063
 angels+0.063
Neg logit contributions
 shouldn-0.048
-0.045
 Curious-0.040
LL-0.039
 Fasc-0.039
Token mistake
Token ID7457
Feature activation-6.20e-02

Pos logit contributions
els+0.152
ishing+0.148
ins+0.145
ies+0.139
ier+0.129
Neg logit contributions
 shouldn-0.161
 Are-0.161
 Where-0.153
 It-0.151
-0.150
Token."
Token ID526
Feature activation+1.25e-01

Pos logit contributions
ishing+0.094
els+0.090
ies+0.086
ooming+0.077
ig+0.077
Neg logit contributions
-0.138
 shouldn-0.134
 Are-0.126
 It-0.126
 hadn-0.125
Token\n
Token ID198
Feature activation+1.19e-01

Pos logit contributions
 shouldn+0.176
 hadn+0.165
 Fasc+0.165
 Curious+0.161
 Looking+0.152
Neg logit contributions
ies-0.265
els-0.248
 again-0.243
ins-0.236
er-0.233
Token\n
Token ID198
Feature activation+1.10e-01

Pos logit contributions
 shouldn+0.180
 hadn+0.174
 Fasc+0.174
 Curious+0.171
 Are+0.161
Neg logit contributions
ies-0.247
els-0.236
ins-0.227
--0.217
 again-0.215
TokenL
Token ID43
Feature activation+3.40e-02

Pos logit contributions
 shouldn+0.196
 Curious+0.192
 Fasc+0.192
 hadn+0.187
 Looking+0.182
Neg logit contributions
ies-0.203
els-0.192
ins-0.183
 times-0.172
ishing-0.172
Tokenily
Token ID813
Feature activation+3.93e-02

Pos logit contributions
 Curious+0.044
 shouldn+0.043
 hadn+0.043
 Looking+0.043
 Are+0.043
Neg logit contributions
ies-0.076
els-0.073
ishing-0.072
 angels-0.068
 times-0.066
Token felt
Token ID2936
Feature activation+3.26e-02

Pos logit contributions
 Curious+0.092
 Fasc+0.089
 shouldn+0.089
 Looking+0.088
 Are+0.088
Neg logit contributions
ies-0.054
els-0.050
ishing-0.047
ins-0.045
 times-0.041
Token wants
Token ID3382
Feature activation+1.81e-02

Pos logit contributions
ies+0.065
els+0.062
ishing+0.058
ins+0.058
 times+0.055
Neg logit contributions
 Curious-0.056
 Fasc-0.055
 shouldn-0.054
 Looking-0.053
 Exc-0.053
Token to
Token ID284
Feature activation+2.93e-02

Pos logit contributions
 shouldn+0.033
 Curious+0.033
 Are+0.032
+0.032
 hadn+0.031
Neg logit contributions
ies-0.033
els-0.032
ishing-0.031
ins-0.030
ier-0.028
Token hurt
Token ID5938
Feature activation-3.55e-02

Pos logit contributions
 Curious+0.048
 Fasc+0.046
 Looking+0.045
 shouldn+0.045
 Are+0.045
Neg logit contributions
ies-0.061
els-0.058
ishing-0.054
ins-0.053
 times-0.051
Token us
Token ID514
Feature activation-4.62e-02

Pos logit contributions
ies+0.077
ishing+0.074
ins+0.071
els+0.070
ier+0.069
Neg logit contributions
 Are-0.051
-0.051
 shouldn-0.050
 Curious-0.049
Don-0.049
Token!"
Token ID2474
Feature activation+1.22e-01

Pos logit contributions
els+0.075
ishing+0.070
ins+0.066
ies+0.065
isher+0.063
Neg logit contributions
-0.098
 shouldn-0.090
 Are-0.082
 Curious-0.081
 Something-0.081
Token\n
Token ID198
Feature activation+1.25e-01

Pos logit contributions
 shouldn+0.165
 hadn+0.160
 Fasc+0.159
 Curious+0.157
 Looking+0.147
Neg logit contributions
ies-0.272
els-0.255
ins-0.245
er-0.240
 again-0.238
Token\n
Token ID198
Feature activation+1.15e-01

Pos logit contributions
 shouldn+0.197
 hadn+0.192
 Fasc+0.186
 Curious+0.182
 Are+0.172
Neg logit contributions
ies-0.248
els-0.240
ins-0.234
--0.220
ock-0.216
TokenA
Token ID32
Feature activation+3.47e-02

Pos logit contributions
 shouldn+0.215
 hadn+0.204
 Curious+0.203
 Fasc+0.202
 Terr+0.191
Neg logit contributions
ies-0.197
els-0.190
ins-0.184
 times-0.169
ock-0.168
Token man
Token ID582
Feature activation+7.28e-03

Pos logit contributions
 Curious+0.065
 Are+0.063
 Fasc+0.062
 Looking+0.062
 Exc+0.062
Neg logit contributions
ies-0.063
els-0.062
ishing-0.059
ins-0.056
ier-0.052
Token hears
Token ID23180
Feature activation+4.04e-02

Pos logit contributions
 Curious+0.011
 Fasc+0.010
 shouldn+0.010
 Are+0.010
 Looking+0.010
Neg logit contributions
ies-0.016
els-0.016
ishing-0.015
ins-0.015
 times-0.014
Token Tom
Token ID4186
Feature activation-1.81e-02

Pos logit contributions
 Curious+0.067
 Fasc+0.064
 Looking+0.063
 Exc+0.063
 Are+0.063
Neg logit contributions
ies-0.082
els-0.078
ishing-0.076
ins-0.073
ier-0.070
Tokenila
Token ID10102
Feature activation+5.94e-02

Pos logit contributions
 Curious+0.107
+0.105
 Exc+0.105
 Fasc+0.105
 shouldn+0.104
Neg logit contributions
ies-0.113
els-0.106
ishing-0.104
ins-0.099
 angels-0.097
Token and
Token ID290
Feature activation+1.19e-01

Pos logit contributions
 Curious+0.147
+0.146
 shouldn+0.144
 Fasc+0.143
 hadn+0.143
Neg logit contributions
els-0.064
ies-0.064
ishing-0.060
ins-0.058
ier-0.057
Token Ben
Token ID3932
Feature activation+6.60e-02

Pos logit contributions
 Terr+0.250
 hadn+0.248
 Curious+0.244
 shouldn+0.243
 Could+0.243
Neg logit contributions
ies-0.170
els-0.151
 and-0.138
 angels-0.137
 again-0.135
Token are
Token ID389
Feature activation+7.41e-02

Pos logit contributions
 Curious+0.174
 Fasc+0.170
 shouldn+0.169
 Are+0.168
 Looking+0.168
Neg logit contributions
ies-0.070
els-0.064
ishing-0.057
ins-0.055
 times-0.048
Token friends
Token ID2460
Feature activation+4.55e-02

Pos logit contributions
+0.139
 Curious+0.139
 Fasc+0.136
 shouldn+0.134
Don+0.132
Neg logit contributions
ies-0.128
els-0.125
ishing-0.119
er-0.113
ier-0.110
Token.
Token ID13
Feature activation+1.25e-01

Pos logit contributions
 Curious+0.116
 Fasc+0.112
 shouldn+0.112
 hadn+0.112
Don+0.112
Neg logit contributions
ies-0.048
els-0.046
ishing-0.042
ins-0.040
 times-0.037
Token They
Token ID1119
Feature activation+5.11e-02

Pos logit contributions
 Curious+0.131
 Fasc+0.124
 shouldn+0.124
 Are+0.120
 Looking+0.119
Neg logit contributions
ies-0.330
els-0.316
ishing-0.300
ins-0.299
 times-0.288
Token like
Token ID588
Feature activation+5.77e-02

Pos logit contributions
 Curious+0.083
 Fasc+0.081
 Looking+0.079
 Exc+0.079
 Are+0.079
Neg logit contributions
ies-0.108
els-0.100
ishing-0.094
ins-0.092
 times-0.089
Token to
Token ID284
Feature activation+3.37e-02

Pos logit contributions
 Curious+0.119
 Fasc+0.116
 shouldn+0.115
 Are+0.114
 Looking+0.114
Neg logit contributions
ies-0.094
els-0.089
ishing-0.084
ins-0.081
ier-0.075
Token play
Token ID711
Feature activation+3.53e-03

Pos logit contributions
 Curious+0.052
 Fasc+0.050
 Looking+0.049
 Are+0.048
 Exc+0.048
Neg logit contributions
ies-0.073
els-0.070
ishing-0.068
ins-0.066
ier-0.063
Token outside
Token ID2354
Feature activation-7.30e-03

Pos logit contributions
 Fasc+0.006
 Curious+0.006
 Looking+0.006
 Something+0.006
 It+0.006
Neg logit contributions
ies-0.006
els-0.006
ier-0.006
iness-0.006
ins-0.006
Token to
Token ID284
Feature activation+2.61e-02

Pos logit contributions
 shouldn+0.019
 hadn+0.018
 Curious+0.018
+0.018
 Fasc+0.018
Neg logit contributions
ies-0.022
els-0.022
ishing-0.020
ins-0.020
ier-0.018
Token say
Token ID910
Feature activation+1.11e-02

Pos logit contributions
 Curious+0.038
 shouldn+0.037
 Fasc+0.037
 Are+0.037
 hadn+0.036
Neg logit contributions
ies-0.058
els-0.055
ishing-0.053
ins-0.053
ier-0.049
Token sorry
Token ID7926
Feature activation-1.59e-02

Pos logit contributions
 Curious+0.015
 Are+0.014
 Fasc+0.014
 Looking+0.014
+0.014
Neg logit contributions
ies-0.026
els-0.025
ishing-0.025
ins-0.023
ier-0.022
Token first
Token ID717
Feature activation-1.04e-02

Pos logit contributions
els+0.033
ishing+0.029
ig+0.027
ooming+0.027
ries+0.026
Neg logit contributions
 Are-0.028
 It-0.028
-0.027
 Where-0.027
 shouldn-0.027
Token."
Token ID526
Feature activation+1.25e-01

Pos logit contributions
els+0.016
ishing+0.015
ies+0.014
ins+0.014
ooming+0.013
Neg logit contributions
 shouldn-0.024
-0.024
 hadn-0.023
 Are-0.022
 It-0.022
Token\n
Token ID198
Feature activation+1.25e-01

Pos logit contributions
 shouldn+0.158
 Fasc+0.150
 hadn+0.149
 Curious+0.149
 Looking+0.140
Neg logit contributions
ies-0.292
els-0.270
ins-0.261
 again-0.260
er-0.256
Token\n
Token ID198
Feature activation+1.15e-01

Pos logit contributions
 shouldn+0.190
 hadn+0.183
 Fasc+0.182
 Curious+0.180
 Looking+0.167
Neg logit contributions
ies-0.259
els-0.244
ins-0.241
 times-0.223
 again-0.222
TokenL
Token ID43
Feature activation+4.06e-02

Pos logit contributions
 shouldn+0.215
 Curious+0.209
 Fasc+0.208
 hadn+0.204
 Looking+0.197
Neg logit contributions
ies-0.200
els-0.186
ins-0.184
 times-0.172
ock-0.165
Tokenily
Token ID813
Feature activation+4.31e-02

Pos logit contributions
 Curious+0.051
 hadn+0.050
 Are+0.050
 shouldn+0.050
 Looking+0.050
Neg logit contributions
ies-0.093
els-0.089
ishing-0.089
 angels-0.082
 times-0.081
Token did
Token ID750
Feature activation+5.30e-02

Pos logit contributions
 Curious+0.078
 Fasc+0.077
 Are+0.076
 shouldn+0.075
 Looking+0.075
Neg logit contributions
ies-0.080
els-0.075
ishing-0.070
ins-0.069
er-0.066
Token not
Token ID407
Feature activation+2.58e-02

Pos logit contributions
 Curious+0.078
 Are+0.075
 Fasc+0.075
 shouldn+0.074
 Exc+0.073
Neg logit contributions
ies-0.116
els-0.110
ishing-0.102
ins-0.101
 again-0.101
Token something
Token ID1223
Feature activation-3.47e-02

Pos logit contributions
ies+0.047
ishing+0.043
els+0.043
ins+0.041
ooming+0.040
Neg logit contributions
 shouldn-0.038
 Something-0.037
-0.036
 It-0.036
 Curious-0.036
Token else
Token ID2073
Feature activation-5.09e-02

Pos logit contributions
els+0.073
ies+0.073
ishing+0.070
ins+0.065
 times+0.063
Neg logit contributions
 shouldn-0.053
-0.051
 hadn-0.051
 Are-0.050
 Curious-0.050
Token to
Token ID284
Feature activation-2.54e-02

Pos logit contributions
els+0.085
ies+0.081
ishing+0.077
ins+0.072
ock+0.071
Neg logit contributions
 shouldn-0.102
-0.100
 hadn-0.098
 Are-0.095
've-0.094
Token play
Token ID711
Feature activation-4.09e-02

Pos logit contributions
ies+0.057
els+0.054
ishing+0.052
ins+0.052
ier+0.049
Neg logit contributions
 Curious-0.036
 Are-0.035
 shouldn-0.035
 Fasc-0.034
 Looking-0.034
Token with
Token ID351
Feature activation-3.69e-02

Pos logit contributions
ies+0.080
els+0.075
ishing+0.070
ins+0.066
ig+0.065
Neg logit contributions
 It-0.072
-0.071
 Something-0.071
 Are-0.070
 Where-0.069
Token."
Token ID526
Feature activation+1.24e-01

Pos logit contributions
ies+0.054
ishing+0.052
els+0.052
burn+0.049
ins+0.049
Neg logit contributions
 shouldn-0.076
-0.076
 hadn-0.071
 It-0.069
 The-0.068
Token\n
Token ID198
Feature activation+1.19e-01

Pos logit contributions
 shouldn+0.180
 hadn+0.173
 Fasc+0.171
 Curious+0.167
 Looking+0.157
Neg logit contributions
ies-0.277
els-0.249
 again-0.245
ins-0.243
er-0.238
Token\n
Token ID198
Feature activation+1.11e-01

Pos logit contributions
 Fasc+0.171
 Curious+0.170
 shouldn+0.170
 hadn+0.169
 Looking+0.160
Neg logit contributions
ies-0.267
els-0.247
ins-0.243
--0.228
 again-0.228
TokenL
Token ID43
Feature activation+3.45e-02

Pos logit contributions
 Curious+0.195
 Fasc+0.193
 shouldn+0.191
 hadn+0.185
 Looking+0.185
Neg logit contributions
ies-0.221
els-0.204
ins-0.198
 times-0.186
ishing-0.185
Tokenily
Token ID813
Feature activation+3.68e-02

Pos logit contributions
 Curious+0.042
 shouldn+0.042
 hadn+0.041
 Are+0.041
 Looking+0.041
Neg logit contributions
ies-0.079
ishing-0.076
els-0.075
 angels-0.070
 times-0.068
Token and
Token ID290
Feature activation+6.40e-02

Pos logit contributions
 Curious+0.092
 Fasc+0.090
 shouldn+0.089
 Are+0.089
 Looking+0.088
Neg logit contributions
ies-0.045
els-0.041
ishing-0.037
ins-0.036
 times-0.032
Token deserve
Token ID10925
Feature activation-2.21e-02

Pos logit contributions
 shouldn+0.030
 Are+0.029
 Curious+0.029
+0.029
 hadn+0.028
Neg logit contributions
els-0.052
ies-0.051
ishing-0.050
ins-0.048
er-0.045
Token to
Token ID284
Feature activation+6.53e-03

Pos logit contributions
ies+0.046
els+0.043
ishing+0.042
ins+0.041
ier+0.038
Neg logit contributions
 shouldn-0.035
 hadn-0.033
-0.033
 Are-0.033
 Curious-0.032
Token ride
Token ID6594
Feature activation-5.66e-02

Pos logit contributions
 Curious+0.009
 Are+0.009
 shouldn+0.009
+0.009
 hadn+0.009
Neg logit contributions
ies-0.015
els-0.014
ishing-0.014
ins-0.014
ier-0.013
Token on
Token ID319
Feature activation+2.20e-04

Pos logit contributions
ies+0.096
els+0.091
ins+0.088
ishing+0.086
ier+0.077
Neg logit contributions
 shouldn-0.107
-0.106
 Curious-0.105
 hadn-0.104
 Are-0.103
Token me
Token ID502
Feature activation-3.79e-02

Pos logit contributions
+0.000
 Curious+0.000
 shouldn+0.000
 Fasc+0.000
 Exc+0.000
Neg logit contributions
ies-0.001
ins-0.000
ishing-0.000
els-0.000
ier-0.000
Token."
Token ID526
Feature activation+1.24e-01

Pos logit contributions
ies+0.056
els+0.055
ishing+0.052
ins+0.051
ries+0.046
Neg logit contributions
-0.088
 Curious-0.079
 shouldn-0.079
've-0.078
 hadn-0.077
Token\n
Token ID198
Feature activation+1.22e-01

Pos logit contributions
 shouldn+0.154
 Fasc+0.148
 Curious+0.147
 hadn+0.146
 Looking+0.136
Neg logit contributions
ies-0.279
els-0.261
 again-0.254
ins-0.251
er-0.248
Token\n
Token ID198
Feature activation+1.13e-01

Pos logit contributions
 shouldn+0.185
 hadn+0.181
 Fasc+0.179
 Curious+0.178
 Are+0.174
Neg logit contributions
ies-0.240
els-0.230
ins-0.226
ock-0.216
 again-0.212
TokenTim
Token ID14967
Feature activation+2.68e-02

Pos logit contributions
 shouldn+0.192
 Curious+0.187
 Fasc+0.185
 hadn+0.183
 Are+0.178
Neg logit contributions
ies-0.205
els-0.195
ins-0.191
ock-0.180
 times-0.179
Token and
Token ID290
Feature activation+7.06e-02

Pos logit contributions
 Curious+0.049
 Fasc+0.047
 shouldn+0.047
 Looking+0.046
 Are+0.046
Neg logit contributions
ies-0.050
els-0.048
ishing-0.045
ins-0.045
ier-0.042
Token Mia
Token ID32189
Feature activation+3.15e-02

Pos logit contributions
 Fasc+0.107
 Curious+0.107
 shouldn+0.106
 Looking+0.102
 Exc+0.101
Neg logit contributions
ies-0.155
els-0.142
 times-0.133
ishing-0.131
 angels-0.129
Token,
Token ID11
Feature activation+1.02e-01

Pos logit contributions
 Curious+0.216
 shouldn+0.211
 Fasc+0.211
Don+0.209
 Terr+0.207
Neg logit contributions
ies-0.077
els-0.064
ishing-0.053
ins-0.052
 angels-0.050
Token there
Token ID612
Feature activation+6.25e-02

Pos logit contributions
 Curious+0.206
 Fasc+0.201
 shouldn+0.199
 Looking+0.197
 Exc+0.197
Neg logit contributions
ies-0.172
els-0.160
ishing-0.150
ins-0.146
ier-0.135
Token were
Token ID547
Feature activation+7.25e-02

Pos logit contributions
 Curious+0.118
 Terr+0.115
 Fasc+0.113
 Looking+0.111
 Exc+0.109
Neg logit contributions
ies-0.111
els-0.101
 again-0.098
ie-0.095
 times-0.095
Token two
Token ID734
Feature activation+1.34e-03

Pos logit contributions
 Curious+0.160
 Fasc+0.156
 shouldn+0.154
 Are+0.154
 Looking+0.153
Neg logit contributions
ies-0.108
els-0.101
ishing-0.094
ins-0.091
 times-0.084
Token twins
Token ID20345
Feature activation+3.28e-02

Pos logit contributions
 Curious+0.003
 shouldn+0.003
+0.003
 hadn+0.003
 Fasc+0.003
Neg logit contributions
ishing-0.002
els-0.002
ier-0.002
ies-0.001
reating-0.001
Token.
Token ID13
Feature activation+1.24e-01

Pos logit contributions
+0.043
 shouldn+0.041
 hadn+0.041
 Are+0.040
 Curious+0.040
Neg logit contributions
els-0.073
ies-0.073
ishing-0.072
ier-0.070
ins-0.070
Token They
Token ID1119
Feature activation+4.07e-02

Pos logit contributions
 Curious+0.158
 shouldn+0.155
 Fasc+0.151
 Terr+0.150
Don+0.149
Neg logit contributions
ies-0.286
els-0.285
 times-0.252
ishing-0.252
ins-0.249
Token were
Token ID547
Feature activation+3.96e-02

Pos logit contributions
 Curious+0.066
 Terr+0.066
 Fasc+0.065
 Looking+0.064
 Exc+0.064
Neg logit contributions
ies-0.089
els-0.078
 toys-0.073
ins-0.071
ie-0.071
Token named
Token ID3706
Feature activation+4.03e-02

Pos logit contributions
 Curious+0.084
 Looking+0.081
 Fasc+0.081
 Exc+0.081
 Are+0.081
Neg logit contributions
ies-0.064
els-0.059
ins-0.053
 angels-0.053
ishing-0.052
Token Jake
Token ID14757
Feature activation-2.15e-03

Pos logit contributions
 Curious+0.062
 Fasc+0.060
 Are+0.059
 shouldn+0.059
 Looking+0.059
Neg logit contributions
ies-0.088
els-0.084
ishing-0.078
ins-0.076
 times-0.074
Token and
Token ID290
Feature activation+6.00e-02

Pos logit contributions
els+0.003
ishing+0.003
ies+0.003
ins+0.003
ier+0.003
Neg logit contributions
-0.005
 hadn-0.005
 shouldn-0.004
 The-0.004
 Curious-0.004
Token look
Token ID804
Feature activation-1.03e-02

Pos logit contributions
els+0.083
ishing+0.078
ies+0.077
 times+0.065
ig+0.063
Neg logit contributions
-0.139
 shouldn-0.137
 hadn-0.134
 Curious-0.134
 Are-0.134
Token pretty
Token ID2495
Feature activation-6.66e-02

Pos logit contributions
ies+0.018
els+0.017
ishing+0.016
ins+0.016
 angels+0.015
Neg logit contributions
 Curious-0.019
 shouldn-0.019
 Fasc-0.019
 hadn-0.019
 Are-0.018
Token and
Token ID290
Feature activation+1.41e-02

Pos logit contributions
els+0.089
ishing+0.082
ooming+0.080
ies+0.075
ock+0.075
Neg logit contributions
-0.140
 shouldn-0.137
 Are-0.132
 hadn-0.131
 Curious-0.129
Token y
Token ID331
Feature activation-5.76e-02

Pos logit contributions
 Curious+0.025
 Fasc+0.025
 Looking+0.024
 Exc+0.024
 Terr+0.024
Neg logit contributions
ies-0.027
els-0.024
ins-0.023
ishing-0.023
 times-0.023
Tokenummy
Token ID13513
Feature activation-5.52e-02

Pos logit contributions
els+0.090
ies+0.088
 angels+0.085
ishing+0.083
 fires+0.082
Neg logit contributions
 shouldn-0.096
-0.095
 This-0.092
 Exc-0.092
 Ginny-0.091
Token."
Token ID526
Feature activation+1.24e-01

Pos logit contributions
els+0.085
ishing+0.080
ies+0.076
ooming+0.073
ins+0.070
Neg logit contributions
-0.117
 Are-0.108
 shouldn-0.107
 Curious-0.106
 Exc-0.104
Token\n
Token ID198
Feature activation+1.19e-01

Pos logit contributions
 shouldn+0.149
 Curious+0.144
 Fasc+0.143
 hadn+0.140
 Looking+0.135
Neg logit contributions
ies-0.303
els-0.279
ins-0.273
er-0.262
 again-0.262
Token\n
Token ID198
Feature activation+1.11e-01

Pos logit contributions
 shouldn+0.164
 Curious+0.158
 hadn+0.157
 Fasc+0.157
 Are+0.150
Neg logit contributions
ies-0.273
els-0.257
ins-0.255
ock-0.235
er-0.235
TokenThey
Token ID2990
Feature activation+4.13e-02

Pos logit contributions
 shouldn+0.187
 Curious+0.184
 Fasc+0.180
 hadn+0.178
 Are+0.174
Neg logit contributions
ies-0.219
els-0.207
ins-0.201
ishing-0.188
 times-0.185
Token add
Token ID751
Feature activation-4.70e-02

Pos logit contributions
 Curious+0.062
 Fasc+0.060
 Are+0.059
 Looking+0.059
 shouldn+0.059
Neg logit contributions
ies-0.090
els-0.086
ishing-0.081
ins-0.080
 times-0.076
Token everything
Token ID2279
Feature activation-3.81e-02

Pos logit contributions
ies+0.096
ishing+0.094
ooming+0.089
ins+0.087
ologist+0.086
Neg logit contributions
-0.068
 Exc-0.067
 It-0.067
 Curious-0.064
 Something-0.064
Token to
Token ID284
Feature activation+2.93e-02

Pos logit contributions
 shouldn+0.033
 Curious+0.033
 Are+0.032
+0.032
 hadn+0.031
Neg logit contributions
ies-0.033
els-0.032
ishing-0.031
ins-0.030
ier-0.028
Token play
Token ID711
Feature activation-7.70e-03

Pos logit contributions
 Curious+0.048
 Fasc+0.046
 Looking+0.045
 shouldn+0.045
 Are+0.045
Neg logit contributions
ies-0.061
els-0.058
ishing-0.054
ins-0.053
 times-0.051
Token with
Token ID351
Feature activation+4.16e-03

Pos logit contributions
ies+0.015
els+0.014
ins+0.013
ishing+0.013
ier+0.012
Neg logit contributions
 Are-0.014
-0.014
 Curious-0.014
 It-0.014
 Exc-0.014
Token us
Token ID514
Feature activation-4.82e-02

Pos logit contributions
 Curious+0.005
 Are+0.005
 shouldn+0.005
 Exc+0.005
+0.005
Neg logit contributions
ies-0.010
ishing-0.009
els-0.009
ins-0.009
ier-0.009
Token."
Token ID526
Feature activation+1.27e-01

Pos logit contributions
els+0.072
ishing+0.066
ies+0.063
ins+0.062
ier+0.059
Neg logit contributions
-0.103
 shouldn-0.100
 hadn-0.094
 Curious-0.092
 Are-0.091
Token
Token ID198
Feature activation+1.24e-01

Pos logit contributions
 shouldn+0.174
 hadn+0.165
 Fasc+0.161
 Curious+0.159
 Looking+0.151
Neg logit contributions
ies-0.288
els-0.267
ins-0.256
 again-0.252
er-0.249
Token
Token ID198
Feature activation+1.15e-01

Pos logit contributions
 shouldn+0.181
 hadn+0.175
 Fasc+0.173
 Curious+0.172
 Are+0.162
Neg logit contributions
ies-0.270
els-0.256
ins-0.249
ock-0.234
--0.231
TokenTom
Token ID13787
Feature activation+3.96e-02

Pos logit contributions
 shouldn+0.213
 Curious+0.206
 Fasc+0.204
 hadn+0.203
 Terr+0.194
Neg logit contributions
ies-0.206
els-0.195
ins-0.187
 times-0.174
ock-0.173
Token comes
Token ID2058
Feature activation+1.90e-02

Pos logit contributions
 Curious+0.087
 Fasc+0.085
 shouldn+0.084
 Are+0.084
 Looking+0.084
Neg logit contributions
ies-0.060
els-0.056
ishing-0.052
ins-0.050
 times-0.047
Token back
Token ID736
Feature activation-5.18e-03

Pos logit contributions
 Curious+0.032
+0.031
 Looking+0.030
 Fasc+0.030
 Exc+0.029
Neg logit contributions
els-0.037
ies-0.037
ishing-0.036
 times-0.035
ier-0.035
Token.
Token ID13
Feature activation+1.01e-01

Pos logit contributions
ishing+0.009
els+0.009
ins+0.008
isher+0.008
ies+0.008
Neg logit contributions
 Is-0.010
-0.010
 Curious-0.009
 Who-0.009
 Are-0.009
Token the
Token ID262
Feature activation-3.18e-02

Pos logit contributions
ies+0.143
ins+0.132
ishing+0.127
ier+0.125
els+0.124
Neg logit contributions
-0.081
 shouldn-0.075
 It-0.073
 hadn-0.069
 Are-0.069
Token gas
Token ID3623
Feature activation-1.27e-01

Pos logit contributions
ies+0.055
ins+0.052
els+0.051
ishing+0.051
ier+0.045
Neg logit contributions
 shouldn-0.063
-0.061
 Curious-0.060
 hadn-0.059
 Are-0.058
Token can
Token ID460
Feature activation-6.71e-03

Pos logit contributions
els+0.187
ies+0.175
ins+0.172
ishing+0.168
 angels+0.155
Neg logit contributions
-0.263
 shouldn-0.257
 hadn-0.237
 Are-0.234
 Something-0.232
Token.
Token ID13
Feature activation+1.03e-01

Pos logit contributions
ies+0.014
els+0.014
ishing+0.014
ins+0.013
ries+0.012
Neg logit contributions
-0.011
 shouldn-0.010
 hadn-0.010
 Curious-0.010
 Are-0.010
Token OK
Token ID7477
Feature activation+1.08e-02

Pos logit contributions
 hadn+0.114
 shouldn+0.111
 Fasc+0.110
 Curious+0.110
Don+0.104
Neg logit contributions
els-0.224
ies-0.221
ins-0.214
er-0.213
 again-0.208
Token?"
Token ID1701
Feature activation+1.24e-01

Pos logit contributions
 Are+0.020
 shouldn+0.020
+0.020
 Curious+0.019
 Where+0.019
Neg logit contributions
els-0.019
ies-0.018
ishing-0.018
ins-0.018
ier-0.017
Token
Token ID198
Feature activation+1.19e-01

Pos logit contributions
 shouldn+0.170
 Curious+0.160
 Fasc+0.160
 hadn+0.157
 Looking+0.151
Neg logit contributions
ies-0.269
els-0.252
 again-0.241
ins-0.238
er-0.233
Token
Token ID198
Feature activation+1.10e-01

Pos logit contributions
 shouldn+0.178
 Fasc+0.168
 hadn+0.168
 Curious+0.167
 Are+0.158
Neg logit contributions
ies-0.248
els-0.234
ins-0.227
--0.217
 again-0.215
TokenS
Token ID50
Feature activation+2.51e-02

Pos logit contributions
 shouldn+0.191
 Curious+0.185
 Fasc+0.183
 hadn+0.179
 Terr+0.175
Neg logit contributions
ies-0.207
els-0.193
ins-0.187
er-0.174
 times-0.173
Tokenara
Token ID3301
Feature activation+3.19e-02

Pos logit contributions
 Curious+0.045
 Fasc+0.045
 shouldn+0.044
 Looking+0.044
 hadn+0.044
Neg logit contributions
ies-0.045
els-0.043
ishing-0.040
ins-0.039
 times-0.039
Token said
Token ID531
Feature activation+3.89e-02

Pos logit contributions
 Curious+0.058
 Fasc+0.056
 shouldn+0.055
 Are+0.055
 Looking+0.055
Neg logit contributions
ies-0.060
els-0.057
ishing-0.054
ins-0.053
 times-0.050
Token time
Token ID640
Feature activation+7.91e-02

Pos logit contributions
 Curious+0.156
 Fasc+0.152
 shouldn+0.151
 Are+0.150
 Looking+0.150
Neg logit contributions
ies-0.115
els-0.108
ishing-0.100
ins-0.098
 times-0.090
Token there
Token ID612
Feature activation+6.30e-02

Pos logit contributions
 Curious+0.220
 shouldn+0.217
 Fasc+0.216
Don+0.215
 Terr+0.213
Neg logit contributions
ies-0.072
els-0.058
ishing-0.048
ins-0.045
 angels-0.044
Token were
Token ID547
Feature activation+7.03e-02

Pos logit contributions
 Curious+0.123
 Terr+0.121
 Fasc+0.120
 Looking+0.115
 Exc+0.115
Neg logit contributions
ies-0.104
 again-0.093
els-0.093
ie-0.093
--0.088
Token two
Token ID734
Feature activation-3.10e-03

Pos logit contributions
 Curious+0.152
 Fasc+0.148
 shouldn+0.146
 Are+0.146
 Looking+0.145
Neg logit contributions
ies-0.108
els-0.101
ishing-0.094
ins-0.091
er-0.084
Token bears
Token ID13062
Feature activation-1.34e-02

Pos logit contributions
ishing+0.004
els+0.004
ier+0.004
reating+0.003
isher+0.003
Neg logit contributions
 Curious-0.006
-0.006
 hadn-0.006
 shouldn-0.006
 relieved-0.006
Token.
Token ID13
Feature activation+1.24e-01

Pos logit contributions
els+0.024
ishing+0.024
ies+0.023
ins+0.022
 times+0.022
Neg logit contributions
 shouldn-0.023
-0.023
 hadn-0.022
 Curious-0.022
 Are-0.021
Token One
Token ID1881
Feature activation+2.15e-02

Pos logit contributions
 shouldn+0.160
 Curious+0.156
 hadn+0.154
Don+0.153
 Fasc+0.152
Neg logit contributions
ies-0.284
els-0.280
 angels-0.253
ins-0.251
 blocks-0.250
Token was
Token ID373
Feature activation+4.25e-02

Pos logit contributions
 Curious+0.047
 Fasc+0.047
 Are+0.046
 Looking+0.046
 Exc+0.046
Neg logit contributions
ies-0.033
els-0.029
ishing-0.027
ins-0.027
 times-0.025
Token big
Token ID1263
Feature activation-5.09e-02

Pos logit contributions
 Curious+0.085
 shouldn+0.084
 hadn+0.083
 Fasc+0.083
 Are+0.083
Neg logit contributions
ies-0.074
els-0.067
ins-0.062
 angels-0.061
 toys-0.060
Token and
Token ID290
Feature activation+4.23e-02

Pos logit contributions
els+0.073
ishing+0.072
ier+0.070
ins+0.069
ies+0.069
Neg logit contributions
 Are-0.097
 Curious-0.096
 shouldn-0.096
 Fasc-0.095
-0.094
Token the
Token ID262
Feature activation+1.06e-02

Pos logit contributions
 Looking+0.091
 shouldn+0.090
 Curious+0.090
 Terr+0.089
 Fasc+0.089
Neg logit contributions
ies-0.065
els-0.058
ins-0.053
 angels-0.053
 times-0.052
Token want
Token ID765
Feature activation+1.46e-02

Pos logit contributions
 shouldn+0.010
 Curious+0.010
 hadn+0.010
Don+0.010
 Are+0.010
Neg logit contributions
ies-0.022
ishing-0.021
els-0.021
ins-0.020
ier-0.019
Token to
Token ID284
Feature activation+2.60e-02

Pos logit contributions
 Curious+0.026
 shouldn+0.026
 Are+0.026
 Fasc+0.025
 hadn+0.025
Neg logit contributions
ies-0.027
els-0.026
ishing-0.025
ins-0.024
ier-0.023
Token make
Token ID787
Feature activation-1.26e-02

Pos logit contributions
 Curious+0.040
 Fasc+0.039
 shouldn+0.038
 Are+0.038
 Looking+0.038
Neg logit contributions
ies-0.056
els-0.053
ishing-0.051
ins-0.050
 times-0.047
Token a
Token ID257
Feature activation-2.02e-02

Pos logit contributions
ies+0.025
els+0.024
ishing+0.024
ier+0.022
ins+0.022
Neg logit contributions
-0.020
 Curious-0.019
 shouldn-0.019
 Exc-0.019
 Are-0.019
Token fort
Token ID6285
Feature activation-7.07e-02

Pos logit contributions
els+0.037
ishing+0.035
ins+0.034
ies+0.034
ier+0.032
Neg logit contributions
-0.037
 shouldn-0.036
 Are-0.036
 Curious-0.035
 Terr-0.035
Token."
Token ID526
Feature activation+1.24e-01

Pos logit contributions
els+0.119
ishing+0.113
ies+0.111
ins+0.107
ig+0.092
Neg logit contributions
-0.146
 shouldn-0.133
 It-0.129
 Are-0.128
 Exc-0.128
Token
Token ID198
Feature activation+1.19e-01

Pos logit contributions
 shouldn+0.155
 hadn+0.148
 Fasc+0.145
 Curious+0.145
LL+0.138
Neg logit contributions
ies-0.286
els-0.265
ins-0.256
 again-0.252
er-0.249
Token
Token ID198
Feature activation+1.10e-01

Pos logit contributions
 shouldn+0.173
 hadn+0.167
 Fasc+0.165
 Curious+0.164
 Are+0.158
Neg logit contributions
ies-0.256
els-0.242
ins-0.239
ock-0.225
--0.221
TokenM
Token ID44
Feature activation+5.52e-02

Pos logit contributions
 shouldn+0.196
 Curious+0.189
 Fasc+0.187
 hadn+0.186
 Terr+0.179
Neg logit contributions
ies-0.201
els-0.189
ins-0.184
 times-0.170
ock-0.170
Tokenia
Token ID544
Feature activation+4.41e-02

Pos logit contributions
 Curious+0.087
 shouldn+0.086
+0.086
 hadn+0.085
 Looking+0.085
Neg logit contributions
ishing-0.103
ies-0.101
els-0.101
ins-0.092
ier-0.090
Token said
Token ID531
Feature activation+4.77e-02

Pos logit contributions
 Curious+0.073
 Fasc+0.070
 shouldn+0.069
 Looking+0.069
 Exc+0.069
Neg logit contributions
ies-0.090
els-0.086
ishing-0.082
ins-0.080
ier-0.076
Token It
Token ID632
Feature activation+1.88e-02

Pos logit contributions
 shouldn+0.102
 hadn+0.101
 Fasc+0.099
 Curious+0.095
Don+0.094
Neg logit contributions
els-0.231
ies-0.230
er-0.223
ins-0.218
 again-0.217
Token is
Token ID318
Feature activation+2.57e-02

Pos logit contributions
 Fasc+0.033
 Curious+0.033
 Looking+0.032
 Are+0.031
 Exc+0.031
Neg logit contributions
ies-0.036
els-0.033
ins-0.031
ishing-0.031
er-0.031
Token very
Token ID845
Feature activation+2.27e-02

Pos logit contributions
 Curious+0.046
 Fasc+0.045
 Looking+0.044
 Are+0.044
 Exc+0.044
Neg logit contributions
ies-0.049
els-0.047
ishing-0.044
ins-0.043
 times-0.041
Token y
Token ID331
Feature activation-4.92e-02

Pos logit contributions
 Curious+0.045
 Fasc+0.044
 Looking+0.044
 Are+0.043
 Exc+0.043
Neg logit contributions
ies-0.041
els-0.035
ishing-0.034
ins-0.034
 times-0.033
Tokenummy
Token ID13513
Feature activation-5.17e-02

Pos logit contributions
ies+0.083
els+0.077
ins+0.071
 angels+0.071
ishing+0.070
Neg logit contributions
-0.094
 shouldn-0.093
 Exc-0.092
 Curious-0.091
 Looking-0.089
Token."
Token ID526
Feature activation+1.23e-01

Pos logit contributions
els+0.089
ishing+0.081
ies+0.079
ooming+0.075
ins+0.074
Neg logit contributions
-0.109
 shouldn-0.101
 Are-0.095
 Curious-0.092
 It-0.092
Token
Token ID198
Feature activation+1.20e-01

Pos logit contributions
 shouldn+0.180
 hadn+0.169
 Fasc+0.166
 Curious+0.160
LL+0.158
Neg logit contributions
ies-0.260
els-0.234
 again-0.234
ins-0.228
er-0.224
Token
Token ID198
Feature activation+1.11e-01

Pos logit contributions
 shouldn+0.185
 hadn+0.177
 Fasc+0.174
 Curious+0.171
 Are+0.163
Neg logit contributions
ies-0.250
els-0.235
ins-0.231
ock-0.216
 again-0.215
TokenL
Token ID43
Feature activation+3.52e-02

Pos logit contributions
 shouldn+0.203
 Curious+0.194
 Fasc+0.193
 hadn+0.193
 Terr+0.184
Neg logit contributions
ies-0.202
els-0.190
ins-0.183
 times-0.172
er-0.169
Tokenily
Token ID813
Feature activation+4.16e-02

Pos logit contributions
 Curious+0.045
 Looking+0.044
 hadn+0.043
 shouldn+0.043
 Are+0.043
Neg logit contributions
ies-0.080
els-0.077
ishing-0.076
 angels-0.071
 times-0.071
Token felt
Token ID2936
Feature activation+3.87e-02

Pos logit contributions
 Curious+0.081
 Fasc+0.079
 Are+0.079
 shouldn+0.078
 Looking+0.078
Neg logit contributions
ies-0.072
els-0.067
ishing-0.063
ins-0.062
er-0.059
Token to
Token ID284
Feature activation+1.41e-02

Pos logit contributions
 Curious+0.022
 shouldn+0.022
 Are+0.022
 Exc+0.022
+0.022
Neg logit contributions
ies-0.027
els-0.026
ishing-0.025
ins-0.025
ier-0.023
Token say
Token ID910
Feature activation+5.77e-03

Pos logit contributions
 Curious+0.022
 Fasc+0.021
 shouldn+0.021
 Are+0.021
 Looking+0.021
Neg logit contributions
ies-0.030
els-0.029
ishing-0.027
ins-0.027
 times-0.026
Token hello
Token ID23748
Feature activation-2.50e-02

Pos logit contributions
 Curious+0.006
 Are+0.006
 Fasc+0.006
 Looking+0.006
 Where+0.006
Neg logit contributions
ies-0.015
els-0.014
ishing-0.014
ins-0.014
ier-0.013
Token to
Token ID284
Feature activation+1.97e-02

Pos logit contributions
els+0.047
ies+0.045
ishing+0.045
ins+0.045
 times+0.044
Neg logit contributions
 Are-0.042
 Curious-0.041
 It-0.040
 shouldn-0.040
 Where-0.040
Token him
Token ID683
Feature activation-2.37e-02

Pos logit contributions
 Curious+0.032
 Fasc+0.030
 Are+0.030
 shouldn+0.030
 Exc+0.030
Neg logit contributions
ies-0.041
els-0.038
ishing-0.037
ins-0.037
ier-0.035
Token?"
Token ID1701
Feature activation+1.23e-01

Pos logit contributions
els+0.051
ies+0.050
ishing+0.048
 times+0.047
ins+0.046
Neg logit contributions
 shouldn-0.033
 Are-0.032
 Curious-0.032
-0.032
 Fasc-0.031
Token
Token ID198
Feature activation+1.21e-01

Pos logit contributions
 shouldn+0.186
 hadn+0.176
 Curious+0.168
 Fasc+0.167
 Terr+0.163
Neg logit contributions
ies-0.271
els-0.253
 again-0.238
ins-0.235
er-0.231
Token
Token ID198
Feature activation+1.13e-01

Pos logit contributions
 shouldn+0.186
 hadn+0.180
 Fasc+0.172
 Curious+0.172
 Looking+0.165
Neg logit contributions
ies-0.259
els-0.245
ins-0.234
--0.224
ock-0.224
TokenTom
Token ID13787
Feature activation+3.45e-02

Pos logit contributions
 shouldn+0.220
 hadn+0.210
 Curious+0.204
 Fasc+0.203
 Looking+0.198
Neg logit contributions
ies-0.203
els-0.189
ins-0.178
er-0.170
ock-0.168
Token and
Token ID290
Feature activation+5.99e-02

Pos logit contributions
 Curious+0.078
 Fasc+0.076
 Looking+0.075
 shouldn+0.075
 Are+0.075
Neg logit contributions
ies-0.050
els-0.047
ishing-0.043
ins-0.042
ier-0.039
Token Mia
Token ID32189
Feature activation+3.49e-02

Pos logit contributions
 shouldn+0.104
 Fasc+0.098
 hadn+0.097
LL+0.096
 Curious+0.095
Neg logit contributions
ies-0.120
els-0.107
 times-0.101
 angels-0.100
 again-0.099
Token you
Token ID345
Feature activation-3.20e-02

Pos logit contributions
ies+0.074
ishing+0.073
els+0.070
ins+0.068
ier+0.066
Neg logit contributions
 shouldn-0.049
 Are-0.048
-0.047
Don-0.046
 Curious-0.046
Token.
Token ID13
Feature activation+9.38e-02

Pos logit contributions
els+0.058
ishing+0.053
ins+0.051
ies+0.049
isher+0.049
Neg logit contributions
 shouldn-0.068
-0.067
 hadn-0.061
 couldn-0.060
 Curious-0.059
Token Are
Token ID4231
Feature activation+2.69e-02

Pos logit contributions
 hadn+0.126
 Fasc+0.120
 shouldn+0.119
 Curious+0.117
 hardly+0.115
Neg logit contributions
els-0.197
ies-0.179
er-0.176
ins-0.173
 toys-0.168
Token you
Token ID345
Feature activation-4.48e-03

Pos logit contributions
 Curious+0.052
 Fasc+0.050
 Terr+0.049
 Exc+0.049
 Looking+0.049
Neg logit contributions
ies-0.051
els-0.047
ins-0.043
ishing-0.043
 angels-0.042
Token okay
Token ID8788
Feature activation-4.96e-03

Pos logit contributions
els+0.010
ies+0.010
ishing+0.009
ins+0.009
 times+0.009
Neg logit contributions
 shouldn-0.006
 Curious-0.006
 hadn-0.006
 Are-0.006
-0.006
Token?"
Token ID1701
Feature activation+1.23e-01

Pos logit contributions
els+0.011
ies+0.010
ishing+0.010
ins+0.010
isher+0.010
Neg logit contributions
 Are-0.007
 shouldn-0.007
 Curious-0.007
-0.006
 Where-0.006
Token she
Token ID673
Feature activation+4.40e-02

Pos logit contributions
 shouldn+0.194
 hadn+0.187
 Fasc+0.179
've+0.176
 Curious+0.175
Neg logit contributions
ies-0.239
 again-0.223
els-0.222
.-0.218
 and-0.212
Token says
Token ID1139
Feature activation+3.80e-02

Pos logit contributions
 shouldn+0.057
 Fasc+0.057
 Curious+0.057
 Looking+0.054
 Where+0.054
Neg logit contributions
ies-0.104
els-0.097
ins-0.092
er-0.090
ishing-0.090
Token.
Token ID13
Feature activation+1.02e-01

Pos logit contributions
 It+0.080
 Curious+0.080
+0.079
 �+0.078
 The+0.077
Neg logit contributions
els-0.058
ies-0.057
ishing-0.056
 times-0.052
ins-0.052
Token
Token ID198
Feature activation+1.15e-01

Pos logit contributions
 shouldn+0.130
 hadn+0.124
 Curious+0.118
 Fasc+0.117
LL+0.111
Neg logit contributions
ies-0.227
els-0.222
ins-0.209
er-0.206
 again-0.206
Token
Token ID198
Feature activation+1.09e-01

Pos logit contributions
 shouldn+0.168
 hadn+0.163
 Curious+0.163
 Fasc+0.162
 Terr+0.153
Neg logit contributions
ies-0.240
els-0.237
ins-0.217
 again-0.215
er-0.210
Token you
Token ID345
Feature activation-2.62e-02

Pos logit contributions
ies+0.008
els+0.007
ishing+0.007
ins+0.007
ier+0.007
Neg logit contributions
-0.007
 shouldn-0.006
 Curious-0.006
 Are-0.006
 It-0.006
Token unhappy
Token ID19283
Feature activation-4.69e-02

Pos logit contributions
els+0.062
ies+0.059
ishing+0.058
ins+0.056
er+0.055
Neg logit contributions
-0.034
 Curious-0.033
 shouldn-0.032
 Are-0.031
 Fasc-0.031
Token and
Token ID290
Feature activation+2.80e-02

Pos logit contributions
els+0.075
ishing+0.069
ies+0.064
isher+0.063
ig+0.061
Neg logit contributions
 Are-0.098
 shouldn-0.095
-0.092
 It-0.090
 Where-0.089
Token angry
Token ID7954
Feature activation-2.45e-02

Pos logit contributions
 Curious+0.045
 Fasc+0.043
 Looking+0.042
 Exc+0.042
 shouldn+0.041
Neg logit contributions
ies-0.060
els-0.056
ishing-0.054
ins-0.053
 times-0.050
Token."
Token ID526
Feature activation+1.20e-01

Pos logit contributions
els+0.034
ishing+0.031
ies+0.029
ins+0.027
ig+0.026
Neg logit contributions
 Are-0.057
 shouldn-0.054
 Where-0.053
 It-0.053
 Curious-0.052
Token
Token ID198
Feature activation+1.23e-01

Pos logit contributions
 shouldn+0.159
 Fasc+0.148
 hadn+0.148
 Curious+0.147
 Looking+0.137
Neg logit contributions
ies-0.272
els-0.249
ins-0.244
 again-0.242
er-0.237
Token
Token ID198
Feature activation+1.14e-01

Pos logit contributions
 shouldn+0.192
 hadn+0.183
 Fasc+0.181
 Curious+0.180
 Are+0.167
Neg logit contributions
ies-0.250
els-0.236
ins-0.233
--0.217
ock-0.217
TokenTom
Token ID13787
Feature activation+3.29e-02

Pos logit contributions
 shouldn+0.214
 Curious+0.205
 Fasc+0.203
 hadn+0.202
 Terr+0.193
Neg logit contributions
ies-0.195
els-0.181
ins-0.180
ock-0.164
 times-0.163
Token and
Token ID290
Feature activation+6.04e-02

Pos logit contributions
 Curious+0.065
 Fasc+0.063
 shouldn+0.062
 Looking+0.062
 Are+0.062
Neg logit contributions
ies-0.057
els-0.054
ishing-0.051
ins-0.050
 times-0.047
Token Sam
Token ID3409
Feature activation+2.41e-02

Pos logit contributions
 shouldn+0.096
 Fasc+0.096
 Curious+0.094
 hadn+0.090
 Looking+0.090
Neg logit contributions
ies-0.126
els-0.114
 times-0.109
er-0.106
ishing-0.105
Token looked
Token ID3114
Feature activation+1.36e-02

Pos logit contributions
 Curious+0.035
 Fasc+0.033
 shouldn+0.033
 Are+0.033
 Looking+0.033
Neg logit contributions
ies-0.054
els-0.052
ishing-0.049
ins-0.048
 times-0.046
Token disturb
Token ID17037
Feature activation-6.47e-02

Pos logit contributions
 Curious+0.002
 Fasc+0.002
 shouldn+0.002
 Are+0.002
 Looking+0.002
Neg logit contributions
ies-0.004
els-0.004
ishing-0.004
ins-0.004
 times-0.003
Token us
Token ID514
Feature activation-7.54e-02

Pos logit contributions
ies+0.120
ishing+0.116
ins+0.110
ier+0.109
els+0.108
Neg logit contributions
-0.113
 shouldn-0.111
 Are-0.108
 Where-0.106
 Curious-0.105
Token or
Token ID393
Feature activation-1.32e-02

Pos logit contributions
ishing+0.141
els+0.139
isher+0.129
ins+0.126
ies+0.120
Neg logit contributions
-0.142
 shouldn-0.126
've-0.116
 This-0.112
 Are-0.112
Token take
Token ID1011
Feature activation-7.71e-02

Pos logit contributions
ies+0.028
els+0.026
ishing+0.025
ins+0.024
 times+0.024
Neg logit contributions
 Curious-0.021
 Fasc-0.020
 Looking-0.020
 Exc-0.019
 shouldn-0.019
Token our
Token ID674
Feature activation-6.39e-02

Pos logit contributions
ies+0.132
els+0.123
ishing+0.121
ins+0.120
ier+0.117
Neg logit contributions
 shouldn-0.143
-0.143
 Curious-0.142
 Are-0.141
 Where-0.141
Token gold
Token ID3869
Feature activation-1.61e-01

Pos logit contributions
ins+0.097
ier+0.097
ies+0.096
ock+0.093
els+0.092
Neg logit contributions
 shouldn-0.126
 hadn-0.125
 Are-0.122
 Curious-0.122
 Where-0.119
Token!"
Token ID2474
Feature activation+9.83e-02

Pos logit contributions
els+0.242
ins+0.242
ishing+0.235
ooming+0.230
ipping+0.228
Neg logit contributions
 shouldn-0.249
-0.239
 This-0.230
 It-0.229
 could-0.228
Token
Token ID198
Feature activation+1.00e-01

Pos logit contributions
 shouldn+0.148
 hadn+0.141
 Fasc+0.138
 Curious+0.135
Don+0.134
Neg logit contributions
ies-0.207
er-0.188
 again-0.187
ins-0.184
els-0.182
Token
Token ID198
Feature activation+9.06e-02

Pos logit contributions
 shouldn+0.147
 Fasc+0.146
 hadn+0.144
 Curious+0.144
 Terr+0.138
Neg logit contributions
ies-0.212
ins-0.200
er-0.196
--0.194
 again-0.194
TokenL
Token ID43
Feature activation+8.54e-03

Pos logit contributions
 Curious+0.147
 Fasc+0.143
 shouldn+0.143
 Are+0.141
 Looking+0.140
Neg logit contributions
ies-0.187
els-0.174
ins-0.167
ishing-0.167
er-0.161
Tokenily
Token ID813
Feature activation+2.14e-02

Pos logit contributions
 Curious+0.013
 Looking+0.013
 Fasc+0.013
 shouldn+0.013
 Exc+0.013
Neg logit contributions
ies-0.018
els-0.017
ishing-0.016
ins-0.016
 angels-0.015
Token to
Token ID284
Feature activation-3.26e-02

Pos logit contributions
ies+0.070
els+0.067
ishing+0.063
ins+0.062
er+0.057
Neg logit contributions
 shouldn-0.086
 hadn-0.082
 Are-0.082
-0.082
 Curious-0.081
Token be
Token ID307
Feature activation-3.30e-02

Pos logit contributions
ies+0.069
els+0.066
ishing+0.061
 times+0.060
ins+0.060
Neg logit contributions
 Curious-0.052
 Fasc-0.051
 Looking-0.049
 Exc-0.049
 Terr-0.048
Token gold
Token ID3869
Feature activation-1.47e-01

Pos logit contributions
ies+0.070
els+0.066
ishing+0.063
ins+0.062
 times+0.059
Neg logit contributions
 Curious-0.053
 Fasc-0.051
 shouldn-0.050
 Are-0.050
 Looking-0.050
Token is
Token ID318
Feature activation-4.51e-02

Pos logit contributions
els+0.202
ishing+0.193
ins+0.187
ies+0.182
 times+0.172
Neg logit contributions
 shouldn-0.306
-0.293
 hadn-0.281
 Are-0.269
 Something-0.266
Token not
Token ID407
Feature activation-4.76e-02

Pos logit contributions
ies+0.084
els+0.081
ishing+0.077
ins+0.075
 times+0.070
Neg logit contributions
 shouldn-0.081
 Curious-0.079
 hadn-0.078
 Fasc-0.078
 Are-0.077
Token gold
Token ID3869
Feature activation-1.61e-01

Pos logit contributions
els+0.093
ies+0.090
ooming+0.088
ishing+0.088
ins+0.086
Neg logit contributions
 shouldn-0.083
 hadn-0.077
-0.071
've-0.071
 Are-0.070
Token,
Token ID11
Feature activation-2.15e-02

Pos logit contributions
els+0.238
ins+0.232
ishing+0.229
ies+0.226
 angels+0.221
Neg logit contributions
 shouldn-0.314
-0.299
 It-0.270
 hadn-0.268
 Are-0.266
Token but
Token ID475
Feature activation+1.89e-02

Pos logit contributions
ies+0.035
els+0.033
ishing+0.032
ins+0.030
ier+0.028
Neg logit contributions
 shouldn-0.044
 Curious-0.043
 hadn-0.043
Don-0.043
 Are-0.043
Token something
Token ID1223
Feature activation-2.24e-02

Pos logit contributions
 shouldn+0.028
 Curious+0.027
 Are+0.027
 hadn+0.027
 Fasc+0.026
Neg logit contributions
ies-0.041
els-0.040
ishing-0.039
ins-0.038
ier-0.036
Token else
Token ID2073
Feature activation-4.53e-02

Pos logit contributions
ies+0.045
els+0.042
ishing+0.039
ins+0.039
 times+0.038
Neg logit contributions
 Curious-0.039
 Fasc-0.038
 Looking-0.037
 Exc-0.037
 Terr-0.036
Token that
Token ID326
Feature activation-1.77e-02

Pos logit contributions
ies+0.091
els+0.086
ishing+0.081
ins+0.080
 times+0.075
Neg logit contributions
 Curious-0.077
 Fasc-0.075
 Looking-0.073
 Are-0.073
 Exc-0.073
Token.
Token ID13
Feature activation+4.68e-02

Pos logit contributions
els+0.133
ies+0.130
ishing+0.123
ins+0.116
ist+0.108
Neg logit contributions
-0.136
 Curious-0.130
 Exc-0.129
 Are-0.129
 Fasc-0.126
Token I
Token ID314
Feature activation-1.97e-05

Pos logit contributions
 Curious+0.045
 hadn+0.042
 Fasc+0.042
 shouldn+0.042
 Looking+0.040
Neg logit contributions
ies-0.122
els-0.122
ishing-0.114
ins-0.114
er-0.112
Token don
Token ID836
Feature activation-1.69e-03

Pos logit contributions
ies+0.000
els+0.000
ishing+0.000
ins+0.000
 times+0.000
Neg logit contributions
 Curious-0.000
 Fasc-0.000
 Looking-0.000
 shouldn-0.000
 Are-0.000
Token't
Token ID470
Feature activation-6.21e-03

Pos logit contributions
ies+0.005
els+0.005
ishing+0.004
ins+0.004
 times+0.004
Neg logit contributions
 Curious-0.001
 Fasc-0.001
 shouldn-0.001
 Are-0.001
 Looking-0.001
Token need
Token ID761
Feature activation-5.45e-02

Pos logit contributions
ies+0.016
els+0.016
ishing+0.015
ins+0.015
 angels+0.014
Neg logit contributions
 shouldn-0.007
-0.007
 Curious-0.007
 Are-0.007
 hadn-0.007
Token gold
Token ID3869
Feature activation-1.60e-01

Pos logit contributions
ies+0.117
ishing+0.105
els+0.097
ins+0.097
iness+0.093
Neg logit contributions
-0.089
 Where-0.086
 It-0.084
 Are-0.084
 Exc-0.084
Token ice
Token ID4771
Feature activation-1.46e-01

Pos logit contributions
ishing+0.235
ooming+0.234
ies+0.221
els+0.217
ist+0.203
Neg logit contributions
 It-0.265
 He-0.258
 Where-0.257
-0.256
 Is-0.255
Token cream
Token ID8566
Feature activation-1.26e-01

Pos logit contributions
els+0.204
ooming+0.192
ies+0.181
eness+0.180
 pilots+0.174
Neg logit contributions
LL-0.251
 This-0.244
 shouldn-0.240
 could-0.240
 Where-0.237
Token.
Token ID13
Feature activation+4.96e-02

Pos logit contributions
els+0.216
ies+0.216
ishing+0.206
ooming+0.190
ig+0.182
Neg logit contributions
-0.239
 shouldn-0.225
 I-0.223
 Is-0.221
 Where-0.221
Token I
Token ID314
Feature activation-1.90e-03

Pos logit contributions
 Curious+0.046
 Fasc+0.044
 hadn+0.044
 shouldn+0.043
 Terr+0.042
Neg logit contributions
els-0.129
ies-0.129
ins-0.121
ishing-0.120
er-0.119
Token have
Token ID423
Feature activation-1.98e-02

Pos logit contributions
ies+0.004
els+0.004
ishing+0.004
ins+0.004
ier+0.004
Neg logit contributions
 Curious-0.003
 Fasc-0.003
 Looking-0.003
 Exc-0.003
 Terr-0.003
TokenMaybe
Token ID13300
Feature activation+2.17e-02

Pos logit contributions
 Curious+0.034
 Fasc+0.032
 Are+0.031
 shouldn+0.031
 Looking+0.031
Neg logit contributions
ies-0.105
els-0.101
ishing-0.097
ins-0.095
 times-0.092
Token the
Token ID262
Feature activation-2.60e-02

Pos logit contributions
 Are+0.026
 Curious+0.026
+0.025
 shouldn+0.025
 Fasc+0.024
Neg logit contributions
ies-0.053
els-0.052
ishing-0.051
ins-0.050
ier-0.047
Token treasure
Token ID14068
Feature activation-1.14e-01

Pos logit contributions
ies+0.048
els+0.046
ishing+0.045
ins+0.044
ier+0.041
Neg logit contributions
 Curious-0.050
 shouldn-0.048
 Are-0.048
 Fasc-0.047
 hadn-0.047
Token is
Token ID318
Feature activation-3.15e-02

Pos logit contributions
els+0.249
ies+0.248
ishing+0.247
ins+0.234
ock+0.220
Neg logit contributions
 shouldn-0.154
-0.148
 hadn-0.146
 Are-0.138
've-0.138
Token not
Token ID407
Feature activation-3.22e-02

Pos logit contributions
ies+0.063
els+0.061
ishing+0.058
ins+0.057
 times+0.053
Neg logit contributions
 shouldn-0.053
 hadn-0.051
 Curious-0.051
 Fasc-0.050
 Are-0.050
Token gold
Token ID3869
Feature activation-1.60e-01

Pos logit contributions
els+0.066
ies+0.064
ishing+0.062
ins+0.061
ooming+0.057
Neg logit contributions
 shouldn-0.053
 hadn-0.051
 Curious-0.049
 Are-0.048
 Fasc-0.048
Token or
Token ID393
Feature activation-2.74e-02

Pos logit contributions
els+0.271
ishing+0.254
ins+0.249
ooming+0.237
ies+0.235
Neg logit contributions
-0.280
 shouldn-0.278
 I-0.251
 It-0.250
 Are-0.241
Token jewels
Token ID42701
Feature activation-1.40e-01

Pos logit contributions
ies+0.052
els+0.049
ishing+0.047
ins+0.047
ier+0.043
Neg logit contributions
 Curious-0.048
 shouldn-0.048
 Are-0.047
 Fasc-0.047
 hadn-0.047
Token,"
Token ID553
Feature activation+7.71e-02

Pos logit contributions
els+0.262
ishing+0.241
ins+0.230
ies+0.224
isher+0.205
Neg logit contributions
-0.268
 shouldn-0.251
've-0.238
 hadn-0.235
 I-0.230
Token L
Token ID406
Feature activation+4.94e-03

Pos logit contributions
 shouldn+0.130
 Curious+0.127
 Fasc+0.125
 Terr+0.124
 hadn+0.124
Neg logit contributions
ies-0.158
els-0.149
 times-0.136
ins-0.133
er-0.130
Tokenila
Token ID10102
Feature activation+5.62e-05

Pos logit contributions
 Curious+0.009
 Fasc+0.009
 shouldn+0.009
 Are+0.009
 Looking+0.009
Neg logit contributions
ies-0.009
els-0.009
ins-0.008
ishing-0.008
ier-0.007
Token light
Token ID1657
Feature activation-1.08e-01

Pos logit contributions
ies+0.053
els+0.051
ins+0.050
ishing+0.048
ier+0.045
Neg logit contributions
 Curious-0.030
 shouldn-0.028
-0.028
 Fasc-0.028
 Are-0.028
Tokenen
Token ID268
Feature activation-1.05e-01

Pos logit contributions
ies+0.217
ishing+0.204
ins+0.204
els+0.201
ier+0.191
Neg logit contributions
-0.158
 Is-0.156
 Curious-0.153
 Fasc-0.152
 Exc-0.149
Token up
Token ID510
Feature activation-6.25e-02

Pos logit contributions
ies+0.220
els+0.206
ishing+0.203
ins+0.201
ier+0.192
Neg logit contributions
 Curious-0.154
-0.152
 Fasc-0.150
 Exc-0.148
 Looking-0.146
Token the
Token ID262
Feature activation-3.96e-02

Pos logit contributions
ies+0.098
ins+0.095
els+0.094
ishing+0.088
 angels+0.080
Neg logit contributions
-0.127
 hadn-0.126
 shouldn-0.125
 Curious-0.123
 Fasc-0.122
Token wheel
Token ID7825
Feature activation-1.31e-01

Pos logit contributions
ies+0.071
ins+0.069
els+0.068
ishing+0.063
ier+0.060
Neg logit contributions
 hadn-0.076
-0.075
 shouldn-0.074
 Curious-0.074
Don-0.071
Tokenbar
Token ID5657
Feature activation-1.60e-01

Pos logit contributions
ins+0.208
ies+0.204
els+0.197
ishing+0.185
 angels+0.163
Neg logit contributions
-0.299
 hadn-0.259
 Exc-0.249
 shouldn-0.247
 �-0.246
Tokenrow
Token ID808
Feature activation-1.07e-01

Pos logit contributions
 candles+0.264
 angels+0.257
 films+0.252
 rockets+0.252
 opera+0.252
Neg logit contributions
 He-0.211
 hadn-0.208
 couldn-0.203
-0.199
 -0.199
Token.
Token ID13
Feature activation+9.11e-02

Pos logit contributions
els+0.153
ies+0.148
ins+0.140
ishing+0.134
ries+0.118
Neg logit contributions
-0.247
 hadn-0.235
 Curious-0.231
 shouldn-0.230
 Exc-0.227
Token<|endoftext|>
Token ID50256
Feature activation+6.25e-02

Pos logit contributions
 shouldn+0.127
LL+0.116
 hadn+0.115
 Fasc+0.114
Don+0.111
Neg logit contributions
ies-0.182
ins-0.179
 again-0.176
er-0.173
els-0.173
TokenOnce
Token ID7454
Feature activation+9.44e-02

Pos logit contributions
 shouldn+0.083
 Curious+0.082
 Fasc+0.080
 hadn+0.079
LL+0.078
Neg logit contributions
ies-0.136
els-0.132
ins-0.126
 again-0.125
ishing-0.123
Token there
Token ID612
Feature activation+5.29e-02

Pos logit contributions
 Curious+0.167
 Fasc+0.162
 Looking+0.160
 Are+0.160
 shouldn+0.158
Neg logit contributions
ies-0.191
els-0.171
ishing-0.159
ins-0.158
 times-0.153
Token to
Token ID284
Feature activation-1.36e-02

Pos logit contributions
els+0.132
ies+0.129
ishing+0.127
ins+0.123
 times+0.116
Neg logit contributions
 shouldn-0.125
 hadn-0.123
 Curious-0.120
-0.117
 Fasc-0.114
Token help
Token ID1037
Feature activation-3.51e-02

Pos logit contributions
ies+0.027
els+0.026
ishing+0.024
ins+0.023
 times+0.023
Neg logit contributions
 Curious-0.023
 Fasc-0.023
 Are-0.022
 shouldn-0.022
 Looking-0.022
Token him
Token ID683
Feature activation-5.31e-02

Pos logit contributions
ies+0.064
els+0.061
ishing+0.058
ins+0.057
ier+0.053
Neg logit contributions
 Curious-0.065
 Fasc-0.063
 Are-0.062
 shouldn-0.062
 Looking-0.062
Token keep
Token ID1394
Feature activation-7.26e-02

Pos logit contributions
els+0.115
ies+0.113
ins+0.111
ishing+0.108
 times+0.105
Neg logit contributions
-0.075
 Curious-0.074
 Something-0.072
 shouldn-0.071
 Fasc-0.070
Token his
Token ID465
Feature activation-7.97e-02

Pos logit contributions
ies+0.147
els+0.140
ins+0.137
ock+0.132
ier+0.131
Neg logit contributions
 shouldn-0.111
 hadn-0.110
 Curious-0.108
 Fasc-0.107
-0.105
Token gold
Token ID3869
Feature activation-1.59e-01

Pos logit contributions
ins+0.157
els+0.150
ock+0.146
ies+0.140
ier+0.138
Neg logit contributions
 hadn-0.137
 shouldn-0.133
 Curious-0.126
've-0.125
 Are-0.124
Token safe
Token ID3338
Feature activation-1.20e-01

Pos logit contributions
ins+0.277
els+0.277
ries+0.251
ips+0.248
ud+0.245
Neg logit contributions
-0.264
 shouldn-0.247
 hadn-0.243
 ju-0.236
 could-0.234
Token.
Token ID13
Feature activation+1.07e-01

Pos logit contributions
els+0.218
ins+0.213
ries+0.183
ooming+0.180
ishing+0.180
Neg logit contributions
-0.232
 hadn-0.221
 shouldn-0.218
 couldn-0.200
 had-0.196
Token They
Token ID1119
Feature activation+9.29e-03

Pos logit contributions
 shouldn+0.127
 hadn+0.124
Don+0.120
 Curious+0.120
 Fasc+0.115
Neg logit contributions
ies-0.254
els-0.241
er-0.241
 again-0.238
ins-0.235
Token agreed
Token ID4987
Feature activation+3.02e-03

Pos logit contributions
 Curious+0.020
 shouldn+0.020
 Fasc+0.020
 hadn+0.019
 Exc+0.019
Neg logit contributions
ies-0.014
els-0.013
ishing-0.012
ins-0.012
 times-0.011
Token and
Token ID290
Feature activation+4.45e-02

Pos logit contributions
 Curious+0.006
 shouldn+0.006
 hadn+0.006
+0.006
 Looking+0.006
Neg logit contributions
els-0.005
ies-0.005
ins-0.004
ishing-0.004
ries-0.004
Tokenumpy
Token ID32152
Feature activation-8.13e-02

Pos logit contributions
ies+0.169
els+0.164
ishing+0.155
 angels+0.152
ins+0.147
Neg logit contributions
 shouldn-0.099
 Curious-0.098
 Fasc-0.094
 Looking-0.092
 Are-0.092
Token and
Token ID290
Feature activation+2.05e-02

Pos logit contributions
els+0.131
ies+0.116
ishing+0.114
ins+0.112
ig+0.102
Neg logit contributions
 shouldn-0.178
 hadn-0.170
 Are-0.170
 Where-0.167
-0.165
Token he
Token ID339
Feature activation-2.62e-03

Pos logit contributions
 Curious+0.029
 Exc+0.028
 Fasc+0.028
 Looking+0.028
 Int+0.027
Neg logit contributions
ies-0.047
els-0.043
ishing-0.042
 times-0.041
ins-0.041
Token shared
Token ID4888
Feature activation-5.54e-02

Pos logit contributions
ies+0.006
els+0.006
ishing+0.006
ins+0.006
er+0.006
Neg logit contributions
 Curious-0.003
 Fasc-0.003
 Looking-0.003
 Are-0.003
 Exc-0.003
Token his
Token ID465
Feature activation-5.76e-02

Pos logit contributions
ies+0.129
els+0.124
ishing+0.123
ins+0.119
ier+0.117
Neg logit contributions
 Curious-0.069
 It-0.068
 shouldn-0.068
 Something-0.067
 Are-0.067
Token gold
Token ID3869
Feature activation-1.58e-01

Pos logit contributions
ies+0.097
els+0.096
ier+0.091
ins+0.091
ock+0.089
Neg logit contributions
 Curious-0.111
 hadn-0.107
 Looking-0.106
 Are-0.106
 Fasc-0.106
Token with
Token ID351
Feature activation-2.80e-02

Pos logit contributions
els+0.241
 stations+0.221
ock+0.217
ishing+0.216
ins+0.210
Neg logit contributions
-0.258
 Exc-0.253
 It-0.248
 This-0.247
 hardly-0.247
Token everyone
Token ID2506
Feature activation-5.46e-02

Pos logit contributions
ies+0.059
els+0.057
ishing+0.055
ins+0.052
ig+0.049
Neg logit contributions
 Curious-0.044
 shouldn-0.043
 hadn-0.042
 Fasc-0.042
 Are-0.041
Token who
Token ID508
Feature activation-2.37e-02

Pos logit contributions
els+0.080
ies+0.080
ishing+0.073
ins+0.067
ock+0.064
Neg logit contributions
-0.118
 hadn-0.116
 shouldn-0.115
 Curious-0.115
 Are-0.114
Token needed
Token ID2622
Feature activation-5.22e-02

Pos logit contributions
ies+0.060
els+0.057
ishing+0.055
ins+0.055
 times+0.052
Neg logit contributions
 Curious-0.028
 Fasc-0.026
 Looking-0.026
 Exc-0.026
 Are-0.025
Token it
Token ID340
Feature activation-6.97e-02

Pos logit contributions
ies+0.121
els+0.115
ishing+0.112
ins+0.109
ooming+0.102
Neg logit contributions
 shouldn-0.071
 hadn-0.071
-0.070
Don-0.069
 It-0.069
Token slide
Token ID10649
Feature activation-6.99e-02

Pos logit contributions
ies+0.047
els+0.047
ishing+0.045
ins+0.045
ier+0.040
Neg logit contributions
 Curious-0.047
 Are-0.047
 hadn-0.046
 shouldn-0.046
Don-0.044
Token is
Token ID318
Feature activation-8.59e-03

Pos logit contributions
ies+0.158
els+0.153
ishing+0.148
ins+0.143
 angels+0.137
Neg logit contributions
-0.094
 shouldn-0.093
 hadn-0.091
 Are-0.089
 Curious-0.086
Token not
Token ID407
Feature activation-2.07e-02

Pos logit contributions
ies+0.018
els+0.017
ishing+0.016
ins+0.016
 times+0.015
Neg logit contributions
 Curious-0.014
 Fasc-0.014
 shouldn-0.014
 Are-0.013
 Looking-0.013
Token made
Token ID925
Feature activation-4.65e-02

Pos logit contributions
els+0.045
ies+0.043
ishing+0.041
ins+0.041
ooming+0.039
Neg logit contributions
 shouldn-0.031
 hadn-0.030
 Are-0.029
-0.029
 Curious-0.028
Token of
Token ID286
Feature activation-4.96e-02

Pos logit contributions
els+0.092
ies+0.091
ishing+0.091
ins+0.087
ologist+0.078
Neg logit contributions
-0.081
 shouldn-0.076
 It-0.073
 hadn-0.070
 Are-0.069
Token gold
Token ID3869
Feature activation-1.57e-01

Pos logit contributions
ies+0.090
ishing+0.084
ins+0.084
els+0.081
ier+0.077
Neg logit contributions
 shouldn-0.088
 Are-0.087
 hadn-0.084
 Something-0.083
-0.082
Token,
Token ID11
Feature activation-2.53e-02

Pos logit contributions
ags+0.245
ishing+0.243
els+0.238
ooming+0.228
ins+0.226
Neg logit contributions
-0.273
 shouldn-0.254
 could-0.244
 It-0.242
 I-0.235
Token but
Token ID475
Feature activation+1.40e-02

Pos logit contributions
ies+0.052
els+0.050
ishing+0.049
ins+0.047
ier+0.044
Neg logit contributions
 shouldn-0.041
 Are-0.041
 hadn-0.040
 Curious-0.040
-0.040
Token it
Token ID340
Feature activation-2.10e-02

Pos logit contributions
 shouldn+0.018
 Are+0.018
 Curious+0.018
 hadn+0.018
+0.017
Neg logit contributions
ies-0.033
els-0.031
ishing-0.031
ins-0.030
ier-0.028
Token is
Token ID318
Feature activation-1.52e-03

Pos logit contributions
ies+0.042
els+0.040
ishing+0.037
ins+0.037
 times+0.035
Neg logit contributions
 Curious-0.036
 Fasc-0.035
 Are-0.034
 Looking-0.034
 shouldn-0.034
Token brilliant
Token ID10457
Feature activation-8.93e-02

Pos logit contributions
ies+0.003
els+0.003
ishing+0.003
ins+0.003
 times+0.003
Neg logit contributions
 Curious-0.002
 shouldn-0.002
 Fasc-0.002
 hadn-0.002
 Are-0.002
Token he
Token ID339
Feature activation-1.44e-02

Pos logit contributions
 Curious+0.006
 shouldn+0.006
 Fasc+0.005
 Are+0.005
 hadn+0.005
Neg logit contributions
ies-0.007
els-0.007
ishing-0.007
ins-0.007
 times-0.006
Token couldn
Token ID3521
Feature activation+1.96e-02

Pos logit contributions
ies+0.037
els+0.034
ie+0.032
ishing+0.032
-+0.031
Neg logit contributions
 Curious-0.018
 Fasc-0.018
 Are-0.018
 Terr-0.018
 Exc-0.018
Token't
Token ID470
Feature activation-1.08e-02

Pos logit contributions
 Curious+0.024
 Fasc+0.023
Don+0.023
 Are+0.022
 Looking+0.022
Neg logit contributions
els-0.046
ies-0.045
er-0.043
 again-0.043
ier-0.042
Token keep
Token ID1394
Feature activation-6.01e-02

Pos logit contributions
ies+0.024
els+0.023
ishing+0.022
ins+0.021
 times+0.021
Neg logit contributions
 Curious-0.016
 Fasc-0.015
 Are-0.015
 shouldn-0.015
 Looking-0.015
Token his
Token ID465
Feature activation-6.21e-02

Pos logit contributions
ies+0.128
els+0.120
ins+0.116
ishing+0.115
ier+0.110
Neg logit contributions
 Curious-0.092
 Fasc-0.090
 shouldn-0.089
 Looking-0.087
 hadn-0.087
Token gold
Token ID3869
Feature activation-1.57e-01

Pos logit contributions
ins+0.120
els+0.112
ock+0.108
ies+0.106
ier+0.106
Neg logit contributions
 hadn-0.109
 shouldn-0.107
 Curious-0.105
've-0.101
 couldn-0.100
Token safe
Token ID3338
Feature activation-1.12e-01

Pos logit contributions
ins+0.302
els+0.279
ips+0.271
ucks+0.259
ries+0.256
Neg logit contributions
 shouldn-0.246
-0.239
 hadn-0.230
 couldn-0.220
 ju-0.218
Token by
Token ID416
Feature activation-4.65e-02

Pos logit contributions
ins+0.200
els+0.194
ooming+0.167
ishing+0.165
ries+0.163
Neg logit contributions
 hadn-0.210
-0.209
 shouldn-0.207
 couldn-0.193
 had-0.186
Token himself
Token ID2241
Feature activation-8.34e-02

Pos logit contributions
ies+0.101
ins+0.100
ishing+0.098
els+0.096
ier+0.093
Neg logit contributions
 hadn-0.064
 shouldn-0.059
 couldn-0.058
 Curious-0.057
 hardly-0.056
Token.
Token ID13
Feature activation+9.34e-02

Pos logit contributions
els+0.136
ins+0.131
ishing+0.124
ies+0.119
isher+0.117
Neg logit contributions
-0.189
 shouldn-0.170
 hadn-0.166
 couldn-0.160
 This-0.152
Token He
Token ID679
Feature activation+1.07e-02

Pos logit contributions
've+0.096
Don+0.093
 hadn+0.090
 shouldn+0.088
 Fasc+0.083
Neg logit contributions
er-0.232
ies-0.230
 again-0.220
els-0.219
ishing-0.215
Token friendship
Token ID14738
Feature activation-8.39e-02

Pos logit contributions
ies+0.001
els+0.001
ishing+0.001
ins+0.001
ier+0.001
Neg logit contributions
 Curious-0.001
 Are-0.001
 shouldn-0.001
 hadn-0.001
 Fasc-0.001
Token are
Token ID389
Feature activation-3.29e-02

Pos logit contributions
els+0.127
ishing+0.118
ies+0.115
ooming+0.105
ins+0.104
Neg logit contributions
 shouldn-0.179
-0.174
 Are-0.172
 hadn-0.168
 Something-0.167
Token more
Token ID517
Feature activation-5.63e-02

Pos logit contributions
els+0.064
ies+0.064
ishing+0.062
ins+0.058
ooming+0.057
Neg logit contributions
 shouldn-0.055
 hadn-0.053
-0.051
 Are-0.050
've-0.048
Token precious
Token ID14186
Feature activation-1.13e-01

Pos logit contributions
els+0.104
ies+0.099
ishing+0.093
ries+0.088
ins+0.087
Neg logit contributions
 shouldn-0.107
 Are-0.100
-0.100
 hadn-0.100
 Curious-0.096
Token than
Token ID621
Feature activation-3.98e-02

Pos logit contributions
els+0.210
ooming+0.191
ig+0.180
burn+0.175
ishing+0.173
Neg logit contributions
 shouldn-0.199
 Are-0.184
 Where-0.176
 I-0.176
 But-0.173
Token gold
Token ID3869
Feature activation-1.57e-01

Pos logit contributions
ies+0.076
els+0.075
ishing+0.072
ins+0.070
ig+0.070
Neg logit contributions
 shouldn-0.067
 hadn-0.066
 Are-0.063
've-0.062
-0.059
Token."
Token ID526
Feature activation+1.07e-01

Pos logit contributions
ooming+0.249
els+0.244
ishing+0.238
burn+0.233
ies+0.215
Neg logit contributions
 shouldn-0.265
 could-0.252
 It-0.235
 I-0.232
-0.225
Token
Token ID198
Feature activation+1.11e-01

Pos logit contributions
 Curious+0.141
 shouldn+0.140
 Fasc+0.140
 hadn+0.138
 Looking+0.134
Neg logit contributions
ies-0.247
els-0.222
 again-0.219
ins-0.218
ishing-0.209
Token
Token ID198
Feature activation+1.02e-01

Pos logit contributions
 Curious+0.175
 Fasc+0.174
 shouldn+0.167
 Looking+0.167
 hadn+0.166
Neg logit contributions
ies-0.234
ins-0.212
els-0.210
 again-0.200
ock-0.197
TokenL
Token ID43
Feature activation+2.66e-02

Pos logit contributions
 Curious+0.169
 Fasc+0.165
 shouldn+0.160
 Looking+0.160
 Exc+0.159
Neg logit contributions
ies-0.214
els-0.198
ins-0.192
ishing-0.188
 times-0.181
Tokenily
Token ID813
Feature activation+2.72e-02

Pos logit contributions
 shouldn+0.033
 hadn+0.032
 Curious+0.032
Don+0.032
+0.031
Neg logit contributions
ies-0.062
els-0.060
ishing-0.058
 angels-0.056
 times-0.053
Token co
Token ID763
Feature activation-9.35e-02

Pos logit contributions
els+0.184
ins+0.182
ies+0.176
ishing+0.168
ier+0.163
Neg logit contributions
 Curious-0.207
 Are-0.204
 hadn-0.204
 shouldn-0.204
 Fasc-0.199
Tokenop
Token ID404
Feature activation-1.04e-01

Pos logit contributions
els+0.157
 times+0.152
ies+0.151
ishing+0.137
 ways+0.137
Neg logit contributions
 Curious-0.163
-0.163
 Fasc-0.156
 Looking-0.155
 Are-0.155
Token to
Token ID284
Feature activation-3.05e-02

Pos logit contributions
els+0.142
ies+0.131
ins+0.127
ishing+0.127
 times+0.122
Neg logit contributions
-0.215
 hadn-0.209
 shouldn-0.203
 couldn-0.202
 Curious-0.195
Token keep
Token ID1394
Feature activation-6.36e-02

Pos logit contributions
ies+0.061
els+0.058
ishing+0.055
ins+0.054
ier+0.051
Neg logit contributions
 Curious-0.051
 Fasc-0.050
 shouldn-0.049
 Are-0.049
 Looking-0.049
Token the
Token ID262
Feature activation-4.61e-02

Pos logit contributions
ies+0.131
els+0.124
ins+0.118
ishing+0.117
ier+0.111
Neg logit contributions
 hadn-0.099
 shouldn-0.099
 Curious-0.098
 Fasc-0.095
Don-0.094
Token gold
Token ID3869
Feature activation-1.56e-01

Pos logit contributions
els+0.085
ies+0.083
ins+0.082
ishing+0.077
ier+0.073
Neg logit contributions
 shouldn-0.083
 hadn-0.082
 Curious-0.082
 Are-0.081
Don-0.078
Token safe
Token ID3338
Feature activation-1.13e-01

Pos logit contributions
els+0.254
ins+0.239
ishing+0.238
ips+0.236
ooming+0.231
Neg logit contributions
-0.288
 shouldn-0.267
 hadn-0.262
 could-0.261
 ju-0.254
Token.
Token ID13
Feature activation+1.09e-01

Pos logit contributions
els+0.177
ins+0.173
ishing+0.160
ooming+0.158
ies+0.149
Neg logit contributions
-0.232
 hadn-0.232
 shouldn-0.230
 It-0.206
 had-0.206
Token
Token ID198
Feature activation+1.03e-01

Pos logit contributions
 shouldn+0.146
 Curious+0.145
 hadn+0.144
Don+0.143
 Fasc+0.139
Neg logit contributions
ies-0.228
 again-0.224
ins-0.216
er-0.210
els-0.208
Token
Token ID198
Feature activation+9.16e-02

Pos logit contributions
 Curious+0.169
 Fasc+0.166
 Are+0.165
 Terr+0.164
 shouldn+0.163
Neg logit contributions
ies-0.195
 again-0.182
ins-0.180
--0.173
 and-0.171
TokenFrom
Token ID4863
Feature activation+1.00e-02

Pos logit contributions
 Curious+0.133
 Terr+0.128
 Fasc+0.126
 Looking+0.124
 shouldn+0.124
Neg logit contributions
ies-0.196
ins-0.187
els-0.180
 times-0.179
er-0.178
TokenDid
Token ID11633
Feature activation-9.77e-03

Pos logit contributions
 Curious+0.031
 Fasc+0.030
 shouldn+0.030
 Looking+0.029
 Exc+0.029
Neg logit contributions
ies-0.068
els-0.066
ins-0.063
ishing-0.063
er-0.060
Token you
Token ID345
Feature activation-2.43e-02

Pos logit contributions
ies+0.018
els+0.017
 times+0.015
ins+0.015
ishing+0.015
Neg logit contributions
 Curious-0.019
 Fasc-0.019
 Terr-0.018
 Looking-0.018
 Exc-0.018
Token kids
Token ID3988
Feature activation-7.55e-02

Pos logit contributions
ies+0.053
els+0.049
ishing+0.047
ins+0.044
ock+0.044
Neg logit contributions
 Curious-0.037
 Fasc-0.036
 Terr-0.036
 Exc-0.036
 Looking-0.035
Token find
Token ID1064
Feature activation-6.77e-02

Pos logit contributions
ies+0.167
els+0.161
ishing+0.153
ins+0.150
 times+0.144
Neg logit contributions
 Curious-0.110
 Are-0.106
 shouldn-0.106
 Fasc-0.106
 Looking-0.103
Token my
Token ID616
Feature activation-7.74e-02

Pos logit contributions
ies+0.142
els+0.130
ishing+0.129
ins+0.127
ier+0.120
Neg logit contributions
 shouldn-0.104
 Curious-0.104
 Are-0.102
 Fasc-0.099
 Exc-0.099
Token gold
Token ID3869
Feature activation-1.56e-01

Pos logit contributions
ies+0.137
ins+0.136
els+0.134
ock+0.130
ishing+0.128
Neg logit contributions
 shouldn-0.143
 hadn-0.138
 Curious-0.137
 Are-0.134
 Where-0.133
Token watch
Token ID2342
Feature activation-1.05e-01

Pos logit contributions
els+0.246
ins+0.246
ies+0.241
ishing+0.234
ock+0.232
Neg logit contributions
 shouldn-0.303
 hadn-0.279
've-0.265
 Are-0.263
 It-0.262
Token?"
Token ID1701
Feature activation+9.95e-02

Pos logit contributions
ies+0.211
els+0.205
ishing+0.203
ins+0.197
 angels+0.178
Neg logit contributions
-0.169
 shouldn-0.166
 Are-0.156
 hadn-0.156
 It-0.155
Token
Token ID220
Feature activation+6.28e-02

Pos logit contributions
 shouldn+0.135
LL+0.122
 Fasc+0.121
 hadn+0.120
Jump+0.120
Neg logit contributions
 again-0.220
er-0.206
.-0.196
 and-0.195
ies-0.194
Token
Token ID198
Feature activation+9.08e-02

Pos logit contributions
 hadn+0.085
 Fasc+0.084
 Curious+0.083
 Looking+0.080
 shouldn+0.080
Neg logit contributions
ies-0.140
ins-0.132
els-0.130
 again-0.127
er-0.122
Token
Token ID198
Feature activation+8.75e-02

Pos logit contributions
 shouldn+0.147
 hadn+0.141
 Fasc+0.140
 Curious+0.140
 Looking+0.133
Neg logit contributions
ies-0.176
 again-0.171
--0.171
er-0.168
ins-0.161
Token.
Token ID13
Feature activation+5.54e-02

Pos logit contributions
els+0.099
ries+0.095
ins+0.093
ies+0.091
ishing+0.087
Neg logit contributions
-0.093
've-0.080
 Something-0.080
 shouldn-0.078
 Curious-0.076
Token I
Token ID314
Feature activation+4.12e-03

Pos logit contributions
 hadn+0.064
 Fasc+0.062
 Curious+0.061
 shouldn+0.061
Don+0.058
Neg logit contributions
els-0.127
er-0.123
ies-0.122
ishing-0.118
 angels-0.113
Token will
Token ID481
Feature activation+4.59e-03

Pos logit contributions
 Curious+0.008
 Fasc+0.007
 Looking+0.007
 Exc+0.007
 Are+0.007
Neg logit contributions
ies-0.008
els-0.007
ishing-0.007
ins-0.007
er-0.006
Token trade
Token ID3292
Feature activation-9.35e-02

Pos logit contributions
 Curious+0.007
 shouldn+0.006
 Are+0.006
 Fasc+0.006
 Looking+0.006
Neg logit contributions
ies-0.010
els-0.010
ins-0.010
ishing-0.010
ier-0.009
Token my
Token ID616
Feature activation-6.10e-02

Pos logit contributions
ies+0.199
ishing+0.193
els+0.183
ins+0.182
ier+0.181
Neg logit contributions
-0.133
 It-0.128
 shouldn-0.121
 Are-0.121
Don-0.121
Token navy
Token ID23956
Feature activation-1.56e-01

Pos logit contributions
ies+0.109
ins+0.107
els+0.106
ier+0.106
ishing+0.101
Neg logit contributions
 Curious-0.108
 Fasc-0.101
 shouldn-0.099
 hadn-0.098
've-0.096
Token boat
Token ID8848
Feature activation-1.34e-01

Pos logit contributions
els+0.275
ries+0.259
ishing+0.258
ins+0.255
ies+0.250
Neg logit contributions
 It-0.239
-0.237
 The-0.225
 shouldn-0.223
 I-0.218
Token for
Token ID329
Feature activation-1.99e-02

Pos logit contributions
els+0.217
ies+0.199
ins+0.192
ishing+0.191
ries+0.180
Neg logit contributions
-0.266
 shouldn-0.248
 This-0.237
've-0.232
 could-0.231
Token your
Token ID534
Feature activation-6.39e-02

Pos logit contributions
ies+0.048
els+0.044
ishing+0.043
ins+0.043
ier+0.042
Neg logit contributions
 Are-0.024
 shouldn-0.024
-0.024
 Curious-0.024
 hadn-0.024
Token red
Token ID2266
Feature activation-1.11e-01

Pos logit contributions
ins+0.117
els+0.117
ier+0.113
ies+0.109
ock+0.107
Neg logit contributions
 hadn-0.108
-0.105
 Curious-0.102
 shouldn-0.101
 Are-0.101
Token boat
Token ID8848
Feature activation-1.29e-01

Pos logit contributions
ins+0.233
ier+0.231
els+0.218
ries+0.210
ishing+0.207
Neg logit contributions
 It-0.137
 The-0.125
-0.123
 A-0.117
 This-0.116
Token real
Token ID1103
Feature activation-1.02e-01

Pos logit contributions
ies+0.067
els+0.067
ishing+0.062
ins+0.061
 times+0.056
Neg logit contributions
 Curious-0.069
 shouldn-0.068
 Are-0.068
 hadn-0.067
 Fasc-0.067
Token treasure
Token ID14068
Feature activation-1.31e-01

Pos logit contributions
els+0.185
ins+0.176
ies+0.169
ishing+0.166
ries+0.155
Neg logit contributions
 Are-0.188
 shouldn-0.188
 hadn-0.188
 Curious-0.187
 Looking-0.176
Token chest
Token ID7721
Feature activation-1.05e-01

Pos logit contributions
els+0.191
ishing+0.190
ins+0.189
ies+0.188
ock+0.166
Neg logit contributions
 hadn-0.277
 shouldn-0.275
-0.261
've-0.252
 Are-0.246
Token filled
Token ID5901
Feature activation-5.45e-02

Pos logit contributions
els+0.146
ins+0.137
ies+0.135
ishing+0.133
ock+0.118
Neg logit contributions
 hadn-0.242
 shouldn-0.235
-0.227
've-0.225
 couldn-0.219
Token with
Token ID351
Feature activation-5.11e-02

Pos logit contributions
ies+0.081
els+0.076
ishing+0.071
ins+0.069
ier+0.065
Neg logit contributions
 Curious-0.118
 Fasc-0.116
 shouldn-0.115
 Exc-0.115
 Looking-0.114
Token gold
Token ID3869
Feature activation-1.56e-01

Pos logit contributions
ies+0.087
els+0.086
ishing+0.084
ins+0.082
ig+0.076
Neg logit contributions
 shouldn-0.095
 Curious-0.095
 hadn-0.094
-0.092
 Fasc-0.092
Token coins
Token ID10796
Feature activation-1.34e-01

Pos logit contributions
ishing+0.250
els+0.241
ock+0.235
ooming+0.234
ins+0.233
Neg logit contributions
-0.256
 shouldn-0.249
 hadn-0.238
 It-0.234
 had-0.233
Token and
Token ID290
Feature activation-1.60e-02

Pos logit contributions
els+0.187
ins+0.179
ishing+0.177
ock+0.171
ies+0.165
Neg logit contributions
-0.284
 hadn-0.281
 shouldn-0.279
've-0.266
 couldn-0.255
Token jewels
Token ID42701
Feature activation-1.16e-01

Pos logit contributions
ies+0.024
els+0.022
ishing+0.020
ins+0.020
 times+0.018
Neg logit contributions
 Curious-0.035
 Fasc-0.035
 shouldn-0.034
 Looking-0.034
 Are-0.034
Token.
Token ID13
Feature activation+1.06e-01

Pos logit contributions
els+0.167
ishing+0.161
ins+0.148
ies+0.142
ock+0.139
Neg logit contributions
 hadn-0.249
 shouldn-0.245
-0.245
've-0.237
 couldn-0.225
Token
Token ID220
Feature activation+6.43e-02

Pos logit contributions
 shouldn+0.152
 hadn+0.145
've+0.138
 hardly+0.136
LL+0.136
Neg logit contributions
ies-0.223
 again-0.214
er-0.211
 toys-0.205
 and-0.195
Token,
Token ID11
Feature activation+2.98e-02

Pos logit contributions
ies+0.148
ins+0.147
els+0.146
ishing+0.142
ooming+0.133
Neg logit contributions
 shouldn-0.138
-0.135
 It-0.127
 Curious-0.127
 Exc-0.127
Token you
Token ID345
Feature activation-2.92e-03

Pos logit contributions
 Are+0.038
 Curious+0.038
 shouldn+0.037
 hadn+0.036
+0.036
Neg logit contributions
ies-0.072
els-0.070
ishing-0.067
ins-0.066
ier-0.062
Token will
Token ID481
Feature activation+1.80e-03

Pos logit contributions
ies+0.007
els+0.007
ishing+0.006
ins+0.006
 times+0.006
Neg logit contributions
 Curious-0.004
 Fasc-0.004
 Looking-0.004
 Are-0.003
 Exc-0.003
Token get
Token ID651
Feature activation-4.69e-02

Pos logit contributions
 Curious+0.003
 Fasc+0.003
 Looking+0.003
 Exc+0.003
 Are+0.002
Neg logit contributions
ies-0.004
els-0.004
ishing-0.004
ins-0.003
 times-0.003
Token more
Token ID517
Feature activation-7.77e-02

Pos logit contributions
ies+0.090
els+0.085
ishing+0.080
ins+0.079
 times+0.074
Neg logit contributions
 Curious-0.083
 Fasc-0.081
 shouldn-0.080
 Are-0.080
 Looking-0.079
Token gold
Token ID3869
Feature activation-1.56e-01

Pos logit contributions
els+0.153
ies+0.149
ishing+0.143
 angels+0.141
ooming+0.141
Neg logit contributions
 shouldn-0.123
-0.119
 Curious-0.114
 Are-0.114
 Fasc-0.112
Token."
Token ID526
Feature activation+9.65e-02

Pos logit contributions
els+0.255
ooming+0.241
ishing+0.224
 stations+0.211
ators+0.209
Neg logit contributions
-0.275
 shouldn-0.254
 could-0.247
 I-0.245
 ju-0.241
Token The
Token ID383
Feature activation+4.72e-03

Pos logit contributions
 shouldn+0.119
 hadn+0.116
 Curious+0.114
 Fasc+0.113
Don+0.110
Neg logit contributions
ies-0.231
els-0.214
ins-0.213
 again-0.211
ishing-0.210
Token boy
Token ID2933
Feature activation-6.24e-04

Pos logit contributions
 Curious+0.008
 Fasc+0.007
 Are+0.007
 Looking+0.007
 Exc+0.007
Neg logit contributions
ies-0.010
els-0.010
ishing-0.009
ins-0.009
ier-0.009
Token was
Token ID373
Feature activation+2.44e-02

Pos logit contributions
ies+0.002
els+0.001
ishing+0.001
ins+0.001
 times+0.001
Neg logit contributions
 Curious-0.001
 Fasc-0.001
 shouldn-0.001
 Are-0.001
 Looking-0.001
Token greedy
Token ID31828
Feature activation-6.65e-02

Pos logit contributions
 Curious+0.042
 Are+0.042
 Exc+0.042
 Fasc+0.042
 shouldn+0.041
Neg logit contributions
ies-0.043
els-0.041
ins-0.039
ishing-0.038
 again-0.038
Token he
Token ID339
Feature activation-1.48e-02

Pos logit contributions
 Curious+0.002
 hadn+0.002
 Are+0.002
 shouldn+0.002
Don+0.002
Neg logit contributions
ies-0.009
els-0.008
ishing-0.008
ins-0.008
ier-0.008
Token had
Token ID550
Feature activation-1.88e-02

Pos logit contributions
ies+0.035
 again+0.033
er+0.032
els+0.032
ier+0.032
Neg logit contributions
 Fasc-0.019
 Exc-0.019
 Curious-0.018
 Terr-0.018
 Looking-0.018
Token never
Token ID1239
Feature activation-3.54e-02

Pos logit contributions
ies+0.042
els+0.040
ishing+0.038
ins+0.037
er+0.036
Neg logit contributions
 Curious-0.028
 Fasc-0.027
 Exc-0.026
 Are-0.026
 Looking-0.026
Token found
Token ID1043
Feature activation-7.01e-02

Pos logit contributions
ies+0.083
els+0.078
ishing+0.076
ins+0.074
 times+0.071
Neg logit contributions
 Curious-0.048
 Fasc-0.046
 Exc-0.046
 Looking-0.045
 Terr-0.045
Token the
Token ID262
Feature activation-4.96e-02

Pos logit contributions
ies+0.161
els+0.144
ins+0.144
ishing+0.143
ooming+0.135
Neg logit contributions
 shouldn-0.097
 Curious-0.093
 It-0.092
 Where-0.092
 Something-0.091
Token gold
Token ID3869
Feature activation-1.55e-01

Pos logit contributions
ies+0.088
els+0.084
ins+0.080
ishing+0.079
ier+0.072
Neg logit contributions
 Curious-0.096
 shouldn-0.095
 Are-0.093
 hadn-0.093
 Fasc-0.092
Token.
Token ID13
Feature activation+8.38e-02

Pos logit contributions
ins+0.262
ies+0.261
els+0.254
ishing+0.249
ooming+0.232
Neg logit contributions
 shouldn-0.299
 hadn-0.283
 It-0.282
-0.273
 Where-0.272
Token He
Token ID679
Feature activation+1.09e-02

Pos logit contributions
 hadn+0.103
Jump+0.103
 Fasc+0.100
've+0.099
 shouldn+0.096
Neg logit contributions
 ways-0.159
 again-0.159
.-0.158
 toys-0.155
els-0.155
Token learned
Token ID4499
Feature activation+7.70e-04

Pos logit contributions
 Fasc+0.013
 Curious+0.013
 Are+0.013
 Looking+0.013
 Exc+0.012
Neg logit contributions
ies-0.027
els-0.025
er-0.024
ishing-0.024
 times-0.023
Token a
Token ID257
Feature activation-9.73e-03

Pos logit contributions
 Curious+0.001
 Fasc+0.001
 Are+0.001
 shouldn+0.001
 Looking+0.001
Neg logit contributions
ies-0.001
els-0.001
ishing-0.001
ins-0.001
 times-0.001
Token lesson
Token ID11483
Feature activation-2.50e-02

Pos logit contributions
ies+0.017
els+0.016
ishing+0.015
ins+0.015
 times+0.014
Neg logit contributions
 Curious-0.019
 Fasc-0.018
 Are-0.018
 Exc-0.018
 Looking-0.018
Token It
Token ID632
Feature activation-2.08e-02

Pos logit contributions
 hadn+0.071
 Fasc+0.071
Don+0.068
liv+0.068
 hardly+0.067
Neg logit contributions
er-0.119
els-0.113
ies-0.108
 again-0.108
 angels-0.105
Token will
Token ID481
Feature activation-2.50e-03

Pos logit contributions
ies+0.040
els+0.038
ishing+0.035
ins+0.035
er+0.034
Neg logit contributions
 Curious-0.037
 Fasc-0.036
 Looking-0.035
 Are-0.034
Don-0.034
Token only
Token ID691
Feature activation-3.91e-02

Pos logit contributions
ies+0.005
els+0.004
 times+0.004
er+0.004
ishing+0.004
Neg logit contributions
 Curious-0.005
 Fasc-0.004
 Looking-0.004
 Terr-0.004
 Int-0.004
Token cost
Token ID1575
Feature activation-8.20e-02

Pos logit contributions
ies+0.090
els+0.087
ishing+0.083
ins+0.082
 times+0.077
Neg logit contributions
 Curious-0.053
 shouldn-0.053
 Are-0.052
 hadn-0.051
 Fasc-0.050
Token one
Token ID530
Feature activation-7.94e-02

Pos logit contributions
ies+0.177
els+0.168
ishing+0.159
ins+0.156
 angels+0.148
Neg logit contributions
 shouldn-0.124
-0.124
 hadn-0.120
 Are-0.117
 Curious-0.116
Token gold
Token ID3869
Feature activation-1.55e-01

Pos logit contributions
ies+0.146
els+0.140
ishing+0.135
ins+0.134
ooming+0.130
Neg logit contributions
-0.155
 shouldn-0.148
 Where-0.144
 Are-0.139
 Who-0.135
Token coin
Token ID10752
Feature activation-1.40e-01

Pos logit contributions
ooming+0.243
els+0.239
ishing+0.236
ies+0.234
ins+0.216
Neg logit contributions
 shouldn-0.288
 It-0.280
-0.273
 Where-0.271
 could-0.263
Token".
Token ID1911
Feature activation+1.03e-01

Pos logit contributions
els+0.220
ies+0.208
ishing+0.207
ooming+0.204
ins+0.199
Neg logit contributions
 shouldn-0.284
-0.273
 It-0.259
 hadn-0.256
've-0.256
Token The
Token ID383
Feature activation-2.67e-03

Pos logit contributions
 hadn+0.090
 shouldn+0.084
Don+0.082
 Curious+0.078
 Fasc+0.076
Neg logit contributions
ies-0.261
els-0.259
 again-0.253
er-0.247
ishing-0.245
Token dragon
Token ID10441
Feature activation-3.34e-02

Pos logit contributions
ies+0.006
els+0.006
ishing+0.005
ins+0.005
ock+0.005
Neg logit contributions
 Curious-0.004
 Are-0.004
 Fasc-0.004
 Looking-0.004
 Where-0.004
Token was
Token ID373
Feature activation+1.27e-02

Pos logit contributions
ies+0.085
els+0.083
ishing+0.080
ins+0.079
ier+0.075
Neg logit contributions
 Curious-0.037
 shouldn-0.036
 hadn-0.036
 Fasc-0.034
 Exc-0.034
Token that
Token ID326
Feature activation+1.28e-03

Pos logit contributions
 Curious+0.001
 shouldn+0.001
 Fasc+0.001
 Exc+0.001
 Are+0.001
Neg logit contributions
ies-0.002
ishing-0.002
els-0.002
ins-0.002
ier-0.002
Token it
Token ID340
Feature activation-4.00e-02

Pos logit contributions
 Curious+0.002
 shouldn+0.002
 Are+0.002
 Fasc+0.002
Don+0.002
Neg logit contributions
ies-0.003
els-0.003
ishing-0.003
ins-0.003
ier-0.003
Token is
Token ID318
Feature activation-9.69e-03

Pos logit contributions
ies+0.071
els+0.069
ishing+0.061
ins+0.060
er+0.058
Neg logit contributions
 Curious-0.078
 Fasc-0.077
 Looking-0.074
Don-0.074
 Are-0.074
Token not
Token ID407
Feature activation-2.12e-02

Pos logit contributions
ies+0.021
els+0.020
ishing+0.019
ins+0.019
 times+0.018
Neg logit contributions
 Curious-0.015
 Fasc-0.014
 Are-0.014
 shouldn-0.014
 Looking-0.014
Token real
Token ID1103
Feature activation-1.08e-01

Pos logit contributions
els+0.044
ies+0.042
ishing+0.041
ins+0.039
ooming+0.038
Neg logit contributions
 shouldn-0.032
 hadn-0.032
 Curious-0.031
 Are-0.031
 Fasc-0.030
Token gold
Token ID3869
Feature activation-1.55e-01

Pos logit contributions
ies+0.156
ins+0.154
ishing+0.152
ock+0.150
burn+0.148
Neg logit contributions
 shouldn-0.230
 Where-0.228
 Are-0.224
 Who-0.216
-0.212
Token.
Token ID13
Feature activation+8.14e-02

Pos logit contributions
ooming+0.229
ishing+0.229
els+0.229
ins+0.229
burn+0.225
Neg logit contributions
 shouldn-0.290
 It-0.270
 could-0.265
-0.264
 Where-0.251
Token It
Token ID632
Feature activation-7.68e-03

Pos logit contributions
 hadn+0.092
 shouldn+0.085
 Curious+0.079
 Fasc+0.078
 hardly+0.077
Neg logit contributions
els-0.189
ies-0.181
ins-0.177
er-0.176
 again-0.171
Token is
Token ID318
Feature activation+3.54e-03

Pos logit contributions
els+0.013
ies+0.012
er+0.011
ins+0.011
 again+0.010
Neg logit contributions
 Curious-0.015
 Fasc-0.015
 Looking-0.014
Don-0.014
 Are-0.014
Token chocolate
Token ID11311
Feature activation-1.04e-01

Pos logit contributions
 Curious+0.006
 Fasc+0.006
 Are+0.006
 Looking+0.006
 shouldn+0.006
Neg logit contributions
ies-0.007
els-0.007
ishing-0.006
ins-0.006
 times-0.006
Token!
Token ID0
Feature activation+8.49e-02

Pos logit contributions
ishing+0.158
ins+0.157
els+0.154
ies+0.149
ock+0.140
Neg logit contributions
-0.212
 shouldn-0.202
 hadn-0.197
 It-0.193
 Are-0.186
Token treasure
Token ID14068
Feature activation-1.21e-01

Pos logit contributions
ies+0.032
els+0.029
ishing+0.028
ins+0.027
 times+0.026
Neg logit contributions
 Curious-0.032
 Fasc-0.031
 shouldn-0.031
 Looking-0.031
 Exc-0.031
Token chest
Token ID7721
Feature activation-8.96e-02

Pos logit contributions
els+0.206
ies+0.205
ins+0.202
ishing+0.193
ock+0.170
Neg logit contributions
 hadn-0.233
 shouldn-0.233
-0.228
 Are-0.225
 Curious-0.224
Token full
Token ID1336
Feature activation-4.69e-02

Pos logit contributions
ies+0.169
els+0.169
ins+0.159
ishing+0.156
 times+0.144
Neg logit contributions
 hadn-0.156
 shouldn-0.155
 Curious-0.155
 Are-0.151
-0.151
Token of
Token ID286
Feature activation-5.78e-02

Pos logit contributions
ies+0.086
els+0.082
ishing+0.081
ins+0.079
 angels+0.070
Neg logit contributions
-0.088
 shouldn-0.088
 Curious-0.085
 hadn-0.085
 Are-0.085
Token shiny
Token ID22441
Feature activation-1.31e-01

Pos logit contributions
ies+0.092
els+0.087
ishing+0.085
ins+0.082
ier+0.077
Neg logit contributions
 Curious-0.119
 shouldn-0.117
 Are-0.117
 Fasc-0.116
 hadn-0.116
Token gold
Token ID3869
Feature activation-1.55e-01

Pos logit contributions
ins+0.207
burn+0.196
ock+0.193
ies+0.189
ishing+0.185
Neg logit contributions
 shouldn-0.242
 It-0.240
 Are-0.239
 Curious-0.235
 Fasc-0.231
Token coins
Token ID10796
Feature activation-1.32e-01

Pos logit contributions
ins+0.253
ishing+0.246
els+0.236
ooming+0.230
ies+0.221
Neg logit contributions
 shouldn-0.281
-0.272
 It-0.263
 hadn-0.260
 Under-0.251
Token!
Token ID0
Feature activation+9.63e-02

Pos logit contributions
els+0.193
ins+0.190
ies+0.183
ishing+0.180
 angels+0.166
Neg logit contributions
 shouldn-0.279
-0.277
 hadn-0.276
've-0.261
 couldn-0.253
Token Tim
Token ID5045
Feature activation-2.67e-03

Pos logit contributions
Isa+0.167
 hadn+0.166
Don+0.163
 grant+0.158
Jump+0.157
Neg logit contributions
er-0.153
ies-0.147
 toys-0.146
 and-0.141
 blocks-0.138
Token,
Token ID11
Feature activation+2.13e-02

Pos logit contributions
ies+0.004
els+0.003
ishing+0.003
ins+0.003
er+0.003
Neg logit contributions
 Curious-0.006
 Fasc-0.006
 Are-0.006
 Looking-0.006
Don-0.006
Token Sally
Token ID25737
Feature activation-1.86e-04

Pos logit contributions
 Curious+0.026
 Fasc+0.025
 Are+0.025
 shouldn+0.025
 hadn+0.024
Neg logit contributions
ies-0.053
els-0.051
ishing-0.049
ins-0.048
ier-0.046
Token she
Token ID673
Feature activation-1.88e-02

Pos logit contributions
 Curious+0.017
 shouldn+0.017
 hadn+0.016
 Fasc+0.016
 Are+0.016
Neg logit contributions
ies-0.029
els-0.028
ishing-0.027
ins-0.027
ier-0.025
Token would
Token ID561
Feature activation+6.28e-03

Pos logit contributions
ies+0.048
els+0.045
ishing+0.043
ie+0.043
-+0.042
Neg logit contributions
 Fasc-0.023
 Curious-0.023
 Exc-0.023
 Terr-0.022
 Are-0.022
Token never
Token ID1239
Feature activation-3.11e-02

Pos logit contributions
 Curious+0.009
 Fasc+0.009
 Exc+0.008
 Looking+0.008
 Are+0.008
Neg logit contributions
ies-0.015
els-0.014
ishing-0.013
 times-0.013
ins-0.012
Token see
Token ID766
Feature activation-3.78e-02

Pos logit contributions
ies+0.071
els+0.067
ishing+0.064
ins+0.062
 times+0.060
Neg logit contributions
 Curious-0.045
 Fasc-0.044
 Looking-0.042
 Exc-0.042
 shouldn-0.042
Token her
Token ID607
Feature activation-7.14e-02

Pos logit contributions
ies+0.091
ins+0.084
ishing+0.083
els+0.080
ooming+0.078
Neg logit contributions
 shouldn-0.048
 Where-0.046
 Are-0.046
 Curious-0.046
 hadn-0.045
Token gold
Token ID3869
Feature activation-1.55e-01

Pos logit contributions
ins+0.136
ock+0.128
els+0.122
ies+0.118
ishing+0.114
Neg logit contributions
 hadn-0.128
 shouldn-0.125
 Curious-0.122
 couldn-0.116
 Fasc-0.114
Token coin
Token ID10752
Feature activation-1.37e-01

Pos logit contributions
ins+0.249
els+0.241
ies+0.239
aters+0.218
 stations+0.215
Neg logit contributions
 hadn-0.289
 shouldn-0.286
 Is-0.263
 Curious-0.259
-0.259
Token again
Token ID757
Feature activation-5.69e-02

Pos logit contributions
els+0.201
ins+0.197
ies+0.181
ishing+0.159
 times+0.159
Neg logit contributions
 hadn-0.313
-0.301
 shouldn-0.298
 couldn-0.284
've-0.275
Token.
Token ID13
Feature activation+1.02e-01

Pos logit contributions
els+0.075
ins+0.075
ies+0.075
 angels+0.068
ishing+0.066
Neg logit contributions
-0.132
 shouldn-0.127
 hadn-0.127
Don-0.123
 Curious-0.122
Token<|endoftext|>
Token ID50256
Feature activation+8.02e-02

Pos logit contributions
Jump+0.187
 leapt+0.174
 hadn+0.174
've+0.172
 Fasc+0.170
Neg logit contributions
 again-0.154
.-0.150
 and-0.143
ies-0.142
 toys-0.130
TokenOnce
Token ID7454
Feature activation+8.80e-02

Pos logit contributions
 shouldn+0.106
 Curious+0.104
 Fasc+0.103
 hadn+0.101
 Are+0.101
Neg logit contributions
ies-0.174
 again-0.167
els-0.167
ins-0.161
er-0.156
Token hammer
Token ID15554
Feature activation-1.19e-01

Pos logit contributions
ies+0.020
els+0.018
ock+0.018
ier+0.018
ins+0.018
Neg logit contributions
 Curious-0.017
 Fasc-0.017
 Looking-0.016
-0.016
 Exc-0.016
Tokening
Token ID278
Feature activation-9.09e-02

Pos logit contributions
ies+0.175
 times+0.165
ishing+0.164
ins+0.157
 angels+0.155
Neg logit contributions
-0.222
 It-0.212
 shouldn-0.207
 Exc-0.203
Don-0.203
Token and
Token ID290
Feature activation-5.04e-03

Pos logit contributions
ins+0.165
ies+0.159
ock+0.158
ier+0.151
iness+0.150
Neg logit contributions
-0.152
 It-0.151
 Exc-0.141
 The-0.139
 Fasc-0.138
Token saw
Token ID2497
Feature activation-1.36e-02

Pos logit contributions
ies+0.009
ins+0.009
ishing+0.008
ier+0.008
els+0.008
Neg logit contributions
-0.009
 Are-0.009
 hadn-0.009
 Curious-0.009
Don-0.009
Tokening
Token ID278
Feature activation-9.07e-02

Pos logit contributions
ies+0.031
ishing+0.029
ins+0.027
els+0.027
ock+0.027
Neg logit contributions
 shouldn-0.018
 It-0.018
 Curious-0.018
 Something-0.018
 Where-0.018
Token.
Token ID13
Feature activation+9.63e-02

Pos logit contributions
ies+0.129
ishing+0.126
ins+0.126
ock+0.124
els+0.121
Neg logit contributions
-0.173
 Curious-0.167
 shouldn-0.167
 hadn-0.167
 It-0.167
Token After
Token ID2293
Feature activation+3.16e-02

Pos logit contributions
 shouldn+0.129
 hadn+0.122
've+0.116
Don+0.116
LL+0.111
Neg logit contributions
ies-0.205
els-0.201
er-0.197
 toys-0.191
 hearts-0.190
Token a
Token ID257
Feature activation-1.28e-02

Pos logit contributions
 Curious+0.046
 Fasc+0.045
 Looking+0.044
 hadn+0.044
 Are+0.044
Neg logit contributions
ies-0.071
els-0.067
ishing-0.067
ins-0.065
ock-0.061
Token while
Token ID981
Feature activation+1.15e-02

Pos logit contributions
ies+0.021
els+0.020
ishing+0.019
ins+0.018
ier+0.017
Neg logit contributions
 Curious-0.026
 Fasc-0.026
 Are-0.025
 shouldn-0.025
 Looking-0.025
Token,
Token ID11
Feature activation+6.15e-02

Pos logit contributions
 Curious+0.025
 Are+0.024
 Fasc+0.024
 hadn+0.024
 Exc+0.024
Neg logit contributions
ies-0.017
els-0.017
ishing-0.016
ins-0.016
ock-0.014
Token they
Token ID484
Feature activation+7.05e-03

Pos logit contributions
 Curious+0.100
 shouldn+0.097
 Fasc+0.096
 Looking+0.096
 Terr+0.095
Neg logit contributions
ies-0.128
els-0.118
ishing-0.110
 again-0.107
 times-0.107
Token be
Token ID307
Feature activation-5.46e-02

Pos logit contributions
 Curious+0.009
 shouldn+0.008
 Fasc+0.008
 Are+0.008
 hadn+0.008
Neg logit contributions
ies-0.014
els-0.014
ishing-0.013
ins-0.013
ier-0.012
Token and
Token ID290
Feature activation+6.69e-03

Pos logit contributions
ies+0.086
els+0.086
ishing+0.080
ins+0.080
 angels+0.071
Neg logit contributions
 shouldn-0.112
-0.111
 hadn-0.109
 Curious-0.106
 Fasc-0.105
Token decided
Token ID3066
Feature activation+2.15e-02

Pos logit contributions
 Curious+0.010
 Fasc+0.010
 Exc+0.010
 Looking+0.010
 shouldn+0.010
Neg logit contributions
ies-0.014
els-0.013
ishing-0.013
ins-0.012
 times-0.012
Token to
Token ID284
Feature activation+2.02e-02

Pos logit contributions
 Curious+0.043
 Fasc+0.041
 shouldn+0.041
 Are+0.041
 Looking+0.041
Neg logit contributions
ies-0.037
els-0.035
ishing-0.033
ins-0.032
 times-0.029
Token investigate
Token ID9161
Feature activation-3.55e-02

Pos logit contributions
 Curious+0.029
 shouldn+0.028
 Fasc+0.027
 Looking+0.027
 Are+0.027
Neg logit contributions
ies-0.046
els-0.044
ishing-0.043
ins-0.043
ier-0.040
Token.
Token ID13
Feature activation+1.02e-01

Pos logit contributions
ies+0.050
els+0.048
ishing+0.047
ins+0.047
ier+0.039
Neg logit contributions
-0.078
 shouldn-0.077
 Curious-0.077
 hadn-0.076
 Are-0.074
Token\n
Token ID198
Feature activation+9.79e-02

Pos logit contributions
 hadn+0.138
 shouldn+0.136
LL+0.133
've+0.123
 stole+0.122
Neg logit contributions
 again-0.196
ies-0.184
els-0.180
er-0.179
ins-0.175
Token\n
Token ID198
Feature activation+9.32e-02

Pos logit contributions
 shouldn+0.149
 Are+0.148
 Fasc+0.145
 hadn+0.143
 Curious+0.138
Neg logit contributions
ies-0.184
ins-0.183
els-0.177
 again-0.172
ock-0.166
TokenM
Token ID44
Feature activation+4.21e-02

Pos logit contributions
 shouldn+0.161
 Fasc+0.156
 hadn+0.155
 Are+0.155
 Curious+0.152
Neg logit contributions
ies-0.159
ins-0.158
els-0.150
 times-0.143
er-0.142
Tokenia
Token ID544
Feature activation+2.49e-02

Pos logit contributions
 Curious+0.067
 shouldn+0.066
+0.066
 Looking+0.065
 hadn+0.064
Neg logit contributions
ies-0.081
ishing-0.078
els-0.077
ins-0.072
 angels-0.070
Token grabbed
Token ID13176
Feature activation-2.09e-02

Pos logit contributions
 Curious+0.035
 hadn+0.033
 shouldn+0.033
+0.033
 Fasc+0.033
Neg logit contributions
ies-0.057
els-0.055
ishing-0.053
ins-0.052
ier-0.050
Token They
Token ID1119
Feature activation+2.04e-02

Pos logit contributions
 hadn+0.134
 shouldn+0.133
've+0.130
 Curious+0.128
 Fasc+0.127
Neg logit contributions
ies-0.168
els-0.166
 hearts-0.159
 toys-0.154
 times-0.154
Token look
Token ID804
Feature activation-1.06e-02

Pos logit contributions
 Curious+0.036
 Fasc+0.035
 Looking+0.035
 Are+0.035
 Exc+0.034
Neg logit contributions
ies-0.038
els-0.036
 times-0.033
 angels-0.032
ishing-0.031
Token at
Token ID379
Feature activation+1.29e-04

Pos logit contributions
els+0.016
ies+0.015
ins+0.015
ishing+0.014
ries+0.013
Neg logit contributions
 Curious-0.022
 Fasc-0.022
 shouldn-0.022
 Are-0.021
 relieved-0.021
Token the
Token ID262
Feature activation-3.52e-02

Pos logit contributions
 Curious+0.000
 shouldn+0.000
 Fasc+0.000
Don+0.000
 hadn+0.000
Neg logit contributions
ies-0.000
ins-0.000
ishing-0.000
els-0.000
ier-0.000
Token bucket
Token ID19236
Feature activation-9.59e-02

Pos logit contributions
ies+0.063
els+0.062
ishing+0.062
ins+0.060
ier+0.054
Neg logit contributions
 shouldn-0.066
 Curious-0.066
Don-0.064
 hadn-0.063
 Fasc-0.062
Token.
Token ID13
Feature activation+8.68e-02

Pos logit contributions
ins+0.157
ishing+0.156
els+0.152
ies+0.147
ities+0.136
Neg logit contributions
-0.175
 It-0.171
 hadn-0.164
 He-0.164
 had-0.161
Token The
Token ID383
Feature activation-1.22e-02

Pos logit contributions
 hadn+0.121
 shouldn+0.120
've+0.114
 Curious+0.114
 Terr+0.113
Neg logit contributions
els-0.174
ies-0.169
 hearts-0.162
 again-0.158
 toys-0.158
Token fish
Token ID5916
Feature activation-5.60e-02

Pos logit contributions
ies+0.022
els+0.022
ishing+0.022
ins+0.021
ier+0.020
Neg logit contributions
 Curious-0.022
 shouldn-0.022
Don-0.021
 hadn-0.021
 Something-0.021
Token and
Token ID290
Feature activation+1.87e-02

Pos logit contributions
ies+0.117
els+0.113
ishing+0.113
ins+0.107
ier+0.105
Neg logit contributions
 Curious-0.080
 shouldn-0.080
 hadn-0.080
 Are-0.079
-0.078
Token frogs
Token ID37475
Feature activation-5.79e-02

Pos logit contributions
 Curious+0.028
 Fasc+0.027
 Are+0.027
 shouldn+0.026
 Exc+0.026
Neg logit contributions
ies-0.041
els-0.039
ishing-0.038
ins-0.037
ier-0.036
Token are
Token ID389
Feature activation-5.67e-03

Pos logit contributions
ies+0.129
ishing+0.127
els+0.127
ins+0.122
isher+0.115
Neg logit contributions
-0.079
 hadn-0.076
 Curious-0.076
 shouldn-0.076
've-0.075
Token a
Token ID257
Feature activation+2.92e-02

Pos logit contributions
 Are+0.147
 Curious+0.146
 shouldn+0.145
 hadn+0.144
 Terr+0.143
Neg logit contributions
ies-0.119
els-0.114
ishing-0.103
 toys-0.103
ins-0.102
Token little
Token ID1310
Feature activation-6.53e-03

Pos logit contributions
 Curious+0.056
 Fasc+0.054
 shouldn+0.054
 Looking+0.053
 Are+0.053
Neg logit contributions
ies-0.052
els-0.049
ishing-0.046
ins-0.045
 times-0.042
Token girl
Token ID2576
Feature activation+1.15e-02

Pos logit contributions
els+0.013
ishing+0.012
ins+0.012
ies+0.012
ier+0.011
Neg logit contributions
 Are-0.011
 Curious-0.011
-0.010
 Fasc-0.010
 Exc-0.010
Token named
Token ID3706
Feature activation+4.89e-02

Pos logit contributions
 Are+0.009
+0.009
 Curious+0.009
 shouldn+0.008
 hadn+0.008
Neg logit contributions
ies-0.032
els-0.031
ishing-0.031
ins-0.031
ier-0.030
Token Lily
Token ID20037
Feature activation+2.38e-02

Pos logit contributions
 Curious+0.078
 Fasc+0.078
 shouldn+0.077
 Terr+0.075
 hadn+0.074
Neg logit contributions
ies-0.101
els-0.091
ishing-0.084
 angels-0.084
er-0.084
Token.
Token ID13
Feature activation+1.14e-01

Pos logit contributions
+0.059
 shouldn+0.059
 hadn+0.058
 Curious+0.058
Don+0.058
Neg logit contributions
ies-0.026
els-0.026
ishing-0.024
ins-0.024
ier-0.022
Token She
Token ID1375
Feature activation+3.50e-02

Pos logit contributions
 shouldn+0.117
 Curious+0.112
 hadn+0.111
 Terr+0.108
 Fasc+0.108
Neg logit contributions
ies-0.290
els-0.275
er-0.262
ishing-0.261
 times-0.259
Token loved
Token ID6151
Feature activation+7.72e-03

Pos logit contributions
 Exc+0.067
 Terr+0.067
 Curious+0.065
 Something+0.064
 Fasc+0.064
Neg logit contributions
ies-0.063
--0.061
 toys-0.055
 and-0.053
ie-0.052
Token to
Token ID284
Feature activation+1.36e-02

Pos logit contributions
 Curious+0.014
 Fasc+0.014
Don+0.013
 shouldn+0.013
 Are+0.013
Neg logit contributions
ies-0.014
els-0.014
ishing-0.013
ier-0.013
ins-0.013
Token swing
Token ID9628
Feature activation-7.11e-02

Pos logit contributions
 Curious+0.022
 Fasc+0.021
 Looking+0.020
 Are+0.020
 shouldn+0.020
Neg logit contributions
ies-0.029
els-0.028
ishing-0.026
ins-0.026
ier-0.025
Token in
Token ID287
Feature activation-6.47e-04

Pos logit contributions
ies+0.109
ishing+0.107
ins+0.105
els+0.101
ier+0.094
Neg logit contributions
-0.146
 Exc-0.145
 Curious-0.144
 Fasc-0.143
 It-0.140
Token castle
Token ID16669
Feature activation-1.12e-01

Pos logit contributions
els+0.131
ins+0.129
ier+0.119
ies+0.119
ishing+0.118
Neg logit contributions
 Are-0.141
-0.137
 Curious-0.135
 Exc-0.134
 shouldn-0.132
Token with
Token ID351
Feature activation-3.37e-02

Pos logit contributions
els+0.175
ishing+0.171
ins+0.169
ies+0.161
isher+0.143
Neg logit contributions
-0.237
 Exc-0.220
 Curious-0.213
 shouldn-0.212
 This-0.209
Token towers
Token ID18028
Feature activation-9.65e-02

Pos logit contributions
ies+0.063
ishing+0.060
ins+0.057
els+0.055
ier+0.055
Neg logit contributions
 Curious-0.057
 Fasc-0.057
 Exc-0.057
 It-0.056
 shouldn-0.056
Token and
Token ID290
Feature activation-6.37e-03

Pos logit contributions
ishing+0.160
ins+0.151
els+0.147
isher+0.143
ies+0.137
Neg logit contributions
-0.194
 It-0.173
 hadn-0.172
 hardly-0.172
 shouldn-0.171
Token walls
Token ID7714
Feature activation-9.36e-02

Pos logit contributions
ies+0.008
els+0.008
ishing+0.007
ins+0.007
ier+0.006
Neg logit contributions
 Curious-0.015
 Are-0.015
 Fasc-0.015
 shouldn-0.015
 Exc-0.015
Token.
Token ID13
Feature activation+9.93e-02

Pos logit contributions
ishing+0.151
ins+0.142
els+0.137
ies+0.136
ig+0.130
Neg logit contributions
-0.163
 It-0.162
 hadn-0.155
 hardly-0.152
 barely-0.151
Token Lily
Token ID20037
Feature activation+1.36e-02

Pos logit contributions
 shouldn+0.141
 hadn+0.139
 Terr+0.136
 Curious+0.131
 Fasc+0.129
Neg logit contributions
ies-0.206
els-0.196
 again-0.195
ins-0.187
 toys-0.184
Token suggested
Token ID5220
Feature activation+5.23e-02

Pos logit contributions
 Curious+0.019
 Fasc+0.018
 Looking+0.018
 Are+0.018
 Terr+0.018
Neg logit contributions
ies-0.032
els-0.029
ishing-0.028
ins-0.028
er-0.027
Token they
Token ID484
Feature activation+4.47e-03

Pos logit contributions
 Curious+0.085
 Fasc+0.082
 Terr+0.081
 Are+0.081
 shouldn+0.081
Neg logit contributions
ies-0.108
els-0.101
ins-0.094
ishing-0.094
 times-0.092
Token work
Token ID670
Feature activation-6.02e-02

Pos logit contributions
 Curious+0.006
 shouldn+0.006
+0.006
 Are+0.006
 Fasc+0.006
Neg logit contributions
ies-0.010
els-0.009
ishing-0.009
ins-0.009
ier-0.009
Token together
Token ID1978
Feature activation-4.26e-02

Pos logit contributions
ies+0.126
ins+0.115
ishing+0.105
iness+0.104
els+0.103
Neg logit contributions
-0.098
 �-0.094
 Exc-0.094
 Fasc-0.091
 hadn-0.091
Token said
Token ID531
Feature activation+3.27e-03

Pos logit contributions
 Curious+0.009
 Exc+0.008
+0.008
 Looking+0.008
 Fasc+0.008
Neg logit contributions
ies-0.029
els-0.028
ishing-0.028
 times-0.026
ins-0.026
Token.
Token ID13
Feature activation+9.61e-02

Pos logit contributions
 Curious+0.007
+0.007
 The+0.006
 Looking+0.006
 relieved+0.006
Neg logit contributions
els-0.005
ies-0.005
ock-0.005
ishing-0.005
 times-0.004
Token\n
Token ID198
Feature activation+1.07e-01

Pos logit contributions
 shouldn+0.116
 hadn+0.115
 Fasc+0.110
 Curious+0.110
 Looking+0.105
Neg logit contributions
ies-0.230
els-0.218
ins-0.212
 again-0.204
er-0.203
Token\n
Token ID198
Feature activation+1.01e-01

Pos logit contributions
 Curious+0.119
 Fasc+0.114
 shouldn+0.114
 Looking+0.110
 hadn+0.110
Neg logit contributions
ies-0.278
els-0.263
ins-0.254
ishing-0.246
 times-0.240
TokenThey
Token ID2990
Feature activation+2.43e-02

Pos logit contributions
 Curious+0.143
 Fasc+0.137
 Are+0.136
Don+0.135
 shouldn+0.135
Neg logit contributions
ies-0.230
els-0.221
ishing-0.212
ins-0.206
ier-0.198
Token went
Token ID1816
Feature activation-7.15e-03

Pos logit contributions
 Curious+0.028
 hadn+0.027
 shouldn+0.026
 Exc+0.026
 Fasc+0.026
Neg logit contributions
ies-0.061
els-0.059
ishing-0.058
ins-0.056
 times-0.054
Token home
Token ID1363
Feature activation-1.88e-02

Pos logit contributions
els+0.016
ies+0.015
iness+0.015
eness+0.015
 times+0.015
Neg logit contributions
 Curious-0.010
 Fasc-0.009
 Looking-0.009
 Where-0.009
 Exc-0.009
Token and
Token ID290
Feature activation+3.76e-02

Pos logit contributions
els+0.033
ishing+0.029
ies+0.029
ock+0.028
ins+0.027
Neg logit contributions
 relieved-0.037
 Curious-0.037
 Are-0.035
 Terr-0.035
 Fasc-0.035
Token got
Token ID1392
Feature activation-1.15e-03

Pos logit contributions
 Curious+0.050
 Fasc+0.048
 shouldn+0.047
 Looking+0.047
 Are+0.047
Neg logit contributions
ies-0.089
els-0.085
ishing-0.081
ins-0.080
 times-0.076
Token some
Token ID617
Feature activation-6.17e-02

Pos logit contributions
ies+0.002
els+0.002
ishing+0.002
ock+0.002
ins+0.002
Neg logit contributions
 Curious-0.002
 Fasc-0.002
 Exc-0.002
 Are-0.002
 shouldn-0.002
Token paper
Token ID3348
Feature activation-1.09e-01

Pos logit contributions
ishing+0.094
ies+0.093
els+0.084
en+0.083
iness+0.082
Neg logit contributions
-0.117
 Fasc-0.110
 Exc-0.109
 Curious-0.108
 Something-0.106
Token be
Token ID307
Feature activation-4.25e-02

Pos logit contributions
 Curious+0.002
 Fasc+0.002
 Looking+0.002
 Are+0.002
Don+0.002
Neg logit contributions
ies-0.002
els-0.002
ishing-0.002
 times-0.002
ins-0.002
Token friends
Token ID2460
Feature activation-5.56e-02

Pos logit contributions
ies+0.059
els+0.059
ishing+0.052
ins+0.050
 times+0.048
Neg logit contributions
 Curious-0.093
 Fasc-0.092
 shouldn-0.091
 Are-0.090
Don-0.089
Token with
Token ID351
Feature activation-3.33e-02

Pos logit contributions
els+0.115
ies+0.105
ishing+0.099
ins+0.094
ries+0.094
Neg logit contributions
-0.105
 shouldn-0.093
 Curious-0.092
 Are-0.090
 Would-0.089
Token the
Token ID262
Feature activation-3.92e-02

Pos logit contributions
ies+0.077
ishing+0.073
els+0.071
ier+0.069
ins+0.069
Neg logit contributions
 shouldn-0.045
 Curious-0.044
 Fasc-0.043
Don-0.042
-0.041
Token pig
Token ID12967
Feature activation-9.48e-02

Pos logit contributions
ies+0.067
els+0.067
ishing+0.063
ins+0.058
ier+0.056
Neg logit contributions
 Curious-0.077
 shouldn-0.074
Don-0.073
 Fasc-0.073
 Are-0.072
Token,
Token ID11
Feature activation-1.09e-02

Pos logit contributions
els+0.138
ishing+0.128
ins+0.114
isher+0.111
ies+0.110
Neg logit contributions
 shouldn-0.209
-0.207
 hadn-0.196
 couldn-0.193
 Curious-0.192
Token but
Token ID475
Feature activation+3.28e-02

Pos logit contributions
ies+0.019
els+0.018
ishing+0.017
ins+0.017
ier+0.016
Neg logit contributions
 Curious-0.021
 shouldn-0.021
 Fasc-0.021
Don-0.020
 Exc-0.020
Token the
Token ID262
Feature activation-1.49e-02

Pos logit contributions
 Curious+0.060
 Looking+0.058
 Fasc+0.058
 Terr+0.058
 Exc+0.058
Neg logit contributions
ies-0.064
els-0.058
 times-0.055
 angels-0.053
ishing-0.052
Token pig
Token ID12967
Feature activation-4.55e-02

Pos logit contributions
ies+0.027
els+0.026
ishing+0.025
ins+0.024
ier+0.023
Neg logit contributions
 Curious-0.029
Don-0.027
 Are-0.027
 shouldn-0.027
 Fasc-0.027
Token didn
Token ID1422
Feature activation+2.68e-02

Pos logit contributions
ies+0.118
els+0.116
ishing+0.112
ins+0.110
 times+0.105
Neg logit contributions
 Curious-0.046
 shouldn-0.046
 hadn-0.045
 Fasc-0.043
 Are-0.042
Token't
Token ID470
Feature activation-6.16e-03

Pos logit contributions
 Curious+0.034
 Are+0.034
 Fasc+0.033
 Terr+0.032
 hadn+0.032
Neg logit contributions
els-0.061
ies-0.059
 again-0.056
er-0.055
 times-0.055
Token\n
Token ID198
Feature activation+8.40e-02

Pos logit contributions
 Curious+0.048
 Fasc+0.048
 Looking+0.046
Don+0.046
 shouldn+0.046
Neg logit contributions
ies-0.066
els-0.057
ins-0.056
--0.055
ishing-0.054
Token\n
Token ID198
Feature activation+7.94e-02

Pos logit contributions
 shouldn+0.125
 hadn+0.119
 Curious+0.118
 Fasc+0.117
 Are+0.114
Neg logit contributions
ins-0.164
ies-0.161
els-0.152
 again-0.151
ock-0.151
TokenH
Token ID39
Feature activation+2.10e-04

Pos logit contributions
 shouldn+0.120
 Curious+0.119
 Fasc+0.116
 hadn+0.113
 Terr+0.113
Neg logit contributions
ies-0.162
ins-0.158
els-0.151
ock-0.146
 times-0.144
Tokenannah
Token ID25761
Feature activation+5.50e-03

Pos logit contributions
 Curious+0.000
 Are+0.000
 Fasc+0.000
Don+0.000
 shouldn+0.000
Neg logit contributions
ies-0.000
els-0.000
ishing-0.000
ins-0.000
ier-0.000
Token said
Token ID531
Feature activation+2.28e-02

Pos logit contributions
 Curious+0.008
 Fasc+0.008
 Are+0.007
Don+0.007
 Looking+0.007
Neg logit contributions
ies-0.013
els-0.012
ishing-0.011
ins-0.011
er-0.011
Token,
Token ID11
Feature activation+4.17e-02

Pos logit contributions
 Curious+0.053
 Fasc+0.052
 Are+0.051
 shouldn+0.051
 Looking+0.051
Neg logit contributions
ies-0.031
els-0.030
ishing-0.027
ins-0.026
 times-0.024
Token
Token ID6184
Feature activation-2.43e-02

Pos logit contributions
 Curious+0.047
 shouldn+0.046
 Fasc+0.045
 hadn+0.043
 Are+0.043
Neg logit contributions
ies-0.103
els-0.098
ins-0.093
ishing-0.093
er-0.091
Token
Token ID95
Feature activation-2.79e-02

Pos logit contributions
els+0.052
ies+0.052
ishing+0.051
er+0.048
ig+0.047
Neg logit contributions
 Curious-0.034
 shouldn-0.033
 hadn-0.033
Don-0.033
 Terr-0.032
Token
Token ID26391
Feature activation+1.41e-02

Pos logit contributions
els+0.059
er+0.059
ishing+0.058
ies+0.055
ock+0.054
Neg logit contributions
 Curious-0.037
 shouldn-0.036
 hadn-0.036
Don-0.035
 Are-0.035
Token
Token ID129
Feature activation-2.66e-02

Pos logit contributions
 Curious+0.020
 Fasc+0.020
 Looking+0.019
Don+0.019
 shouldn+0.019
Neg logit contributions
ies-0.031
els-0.028
ins-0.027
ishing-0.026
--0.026
Token
Token ID241
Feature activation+4.88e-02

Pos logit contributions
els+0.061
ies+0.059
er+0.057
ishing+0.057
-+0.055
Neg logit contributions
 Are-0.036
 Curious-0.034
 hadn-0.033
 Something-0.033
Don-0.032
Token little
Token ID1310
Feature activation+6.19e-03

Pos logit contributions
 Curious+0.020
+0.019
 hadn+0.019
 Fasc+0.019
Don+0.019
Neg logit contributions
ies-0.019
els-0.019
ins-0.017
ishing-0.017
ier-0.016
Token girl
Token ID2576
Feature activation+1.04e-02

Pos logit contributions
 Curious+0.011
+0.010
 Are+0.010
 Fasc+0.010
Don+0.010
Neg logit contributions
els-0.012
ins-0.011
ier-0.011
ies-0.011
ishing-0.011
Token was
Token ID373
Feature activation+2.79e-02

Pos logit contributions
 Curious+0.015
 Fasc+0.015
 Are+0.014
 shouldn+0.014
 Looking+0.014
Neg logit contributions
ies-0.023
els-0.022
ishing-0.021
ins-0.021
er-0.020
Token so
Token ID523
Feature activation+4.08e-02

Pos logit contributions
 Curious+0.046
 Are+0.045
 shouldn+0.045
 Fasc+0.045
Don+0.044
Neg logit contributions
ies-0.055
els-0.052
ishing-0.049
ins-0.049
er-0.047
Token excited
Token ID6568
Feature activation-2.76e-03

Pos logit contributions
 Curious+0.073
 shouldn+0.073
 Where+0.071
 Something+0.071
 Are+0.071
Neg logit contributions
ies-0.078
ins-0.067
els-0.065
ishing-0.064
--0.064
Token that
Token ID326
Feature activation+2.64e-02

Pos logit contributions
els+0.005
ies+0.004
ins+0.004
ishing+0.004
ig+0.004
Neg logit contributions
 Curious-0.005
 shouldn-0.005
 Looking-0.005
 hadn-0.005
-0.005
Token she
Token ID673
Feature activation+1.65e-02

Pos logit contributions
 Curious+0.036
 Fasc+0.035
 Looking+0.034
 shouldn+0.034
 Are+0.033
Neg logit contributions
ies-0.063
els-0.059
ishing-0.055
 times-0.054
ins-0.054
Token shouted
Token ID17293
Feature activation-1.83e-02

Pos logit contributions
 Where+0.024
 Something+0.024
 Are+0.023
 Curious+0.023
 Fasc+0.023
Neg logit contributions
ies-0.037
--0.033
ishing-0.032
er-0.032
els-0.031
Token,
Token ID11
Feature activation+3.47e-02

Pos logit contributions
ies+0.031
els+0.030
ishing+0.028
ins+0.028
 times+0.026
Neg logit contributions
 Curious-0.037
 Fasc-0.036
 Looking-0.036
 Exc-0.035
 Are-0.035
Token "
Token ID366
Feature activation+3.24e-02

Pos logit contributions
 shouldn+0.042
 Curious+0.041
 Fasc+0.040
 Are+0.039
 hadn+0.039
Neg logit contributions
ies-0.083
els-0.079
ishing-0.075
ins-0.074
er-0.073
TokenY
Token ID56
Feature activation-4.30e-02

Pos logit contributions
 Curious+0.037
 Fasc+0.035
 shouldn+0.035
 Looking+0.034
 Are+0.034
Neg logit contributions
ies-0.083
els-0.079
ishing-0.076
ins-0.075
 times-0.072
Token\n
Token ID198
Feature activation+9.01e-02

Pos logit contributions
 Are+0.150
 hadn+0.147
 Curious+0.146
 Fasc+0.146
 shouldn+0.146
Neg logit contributions
ies-0.206
els-0.194
ins-0.192
 again-0.185
ock-0.185
TokenThe
Token ID464
Feature activation+2.40e-04

Pos logit contributions
 Are+0.136
 Curious+0.134
 hadn+0.133
 Looking+0.131
 Terr+0.131
Neg logit contributions
ies-0.172
els-0.170
 times-0.168
ins-0.168
ock-0.166
Token animals
Token ID4695
Feature activation-1.54e-02

Pos logit contributions
 Curious+0.000
 shouldn+0.000
 Fasc+0.000
 Are+0.000
Don+0.000
Neg logit contributions
ies-0.000
els-0.000
ishing-0.000
ins-0.000
ier-0.000
Token brought
Token ID3181
Feature activation-3.78e-02

Pos logit contributions
ies+0.038
els+0.037
ishing+0.036
ins+0.035
ock+0.033
Neg logit contributions
 Curious-0.018
 shouldn-0.018
 hadn-0.018
Don-0.017
 Fasc-0.017
Token sticks
Token ID16461
Feature activation-9.83e-02

Pos logit contributions
ies+0.075
els+0.071
ishing+0.070
ins+0.069
ier+0.067
Neg logit contributions
 Curious-0.062
 Fasc-0.060
 shouldn-0.059
-0.059
 Exc-0.058
Token and
Token ID290
Feature activation+1.06e-03

Pos logit contributions
els+0.157
ins+0.157
ishing+0.154
ies+0.147
ities+0.138
Neg logit contributions
-0.187
 It-0.171
 shouldn-0.167
 Exc-0.165
 The-0.163
Token leaves
Token ID5667
Feature activation-7.77e-02

Pos logit contributions
 Curious+0.002
 Fasc+0.002
 shouldn+0.002
 Exc+0.002
 hadn+0.002
Neg logit contributions
ies-0.001
els-0.001
ishing-0.001
ins-0.001
ier-0.001
Token to
Token ID284
Feature activation-1.74e-02

Pos logit contributions
els+0.141
ins+0.137
ies+0.136
ishing+0.136
 times+0.122
Neg logit contributions
-0.131
 It-0.124
 shouldn-0.120
 hadn-0.120
 Exc-0.120
Token the
Token ID262
Feature activation-1.94e-02

Pos logit contributions
ies+0.039
els+0.038
ins+0.036
ishing+0.036
ier+0.034
Neg logit contributions
 Curious-0.025
 Fasc-0.024
 shouldn-0.024
 Exc-0.023
 Looking-0.023
Token stream
Token ID4269
Feature activation-9.26e-02

Pos logit contributions
ies+0.039
els+0.038
ishing+0.036
ins+0.036
ier+0.033
Neg logit contributions
 Curious-0.032
 shouldn-0.031
 Fasc-0.031
 Are-0.031
 hadn-0.030
Token.
Token ID13
Feature activation+8.15e-02

Pos logit contributions
els+0.120
ies+0.109
ishing+0.107
ins+0.096
ock+0.090
Neg logit contributions
 hadn-0.209
 Curious-0.207
 shouldn-0.206
-0.206
 Terr-0.202
Token.
Token ID13
Feature activation+8.90e-02

Pos logit contributions
ins+0.076
els+0.073
ies+0.070
isher+0.068
ishing+0.066
Neg logit contributions
-0.113
 hadn-0.106
 shouldn-0.105
 Are-0.102
 Curious-0.102
Token Miss
Token ID4544
Feature activation-2.45e-02

Pos logit contributions
 hadn+0.081
 shouldn+0.080
 Curious+0.076
 Fasc+0.076
've+0.073
Neg logit contributions
ies-0.225
els-0.218
er-0.213
 again-0.205
ishing-0.204
Tokeny
Token ID88
Feature activation-2.51e-02

Pos logit contributions
ies+0.039
els+0.038
 times+0.037
ishing+0.035
ins+0.034
Neg logit contributions
 Curious-0.048
 shouldn-0.047
 Are-0.047
 hadn-0.047
 Looking-0.047
Token knew
Token ID2993
Feature activation+1.25e-02

Pos logit contributions
ies+0.060
els+0.059
ishing+0.056
ins+0.055
 times+0.052
Neg logit contributions
 Curious-0.032
 shouldn-0.030
 Fasc-0.030
 hadn-0.030
 Are-0.030
Token that
Token ID326
Feature activation-5.92e-03

Pos logit contributions
 hadn+0.019
 shouldn+0.019
 Curious+0.019
Don+0.019
 Looking+0.018
Neg logit contributions
ies-0.027
els-0.026
ishing-0.025
ins-0.024
 times-0.022
Token even
Token ID772
Feature activation-3.30e-02

Pos logit contributions
ies+0.014
els+0.014
ishing+0.014
ins+0.013
ier+0.013
Neg logit contributions
Don-0.008
-0.008
 hadn-0.008
 Are-0.008
 shouldn-0.007
Token if
Token ID611
Feature activation+8.48e-03

Pos logit contributions
els+0.057
ies+0.056
ishing+0.051
ins+0.051
ooming+0.049
Neg logit contributions
Don-0.062
 Curious-0.061
 Are-0.061
 Fasc-0.060
 shouldn-0.060
Token the
Token ID262
Feature activation-3.97e-02

Pos logit contributions
 Curious+0.012
 Fasc+0.012
 Exc+0.011
 Looking+0.011
 Terr+0.011
Neg logit contributions
ies-0.020
els-0.018
ishing-0.017
ins-0.017
 times-0.017
Token sun
Token ID4252
Feature activation-4.65e-02

Pos logit contributions
els+0.076
ies+0.075
ishing+0.073
ins+0.073
ier+0.068
Neg logit contributions
 Curious-0.073
 Are-0.070
 shouldn-0.068
 Fasc-0.068
Don-0.068
Token went
Token ID1816
Feature activation-4.80e-02

Pos logit contributions
ies+0.117
els+0.114
ishing+0.111
ins+0.108
ier+0.104
Neg logit contributions
 Curious-0.053
 shouldn-0.051
 hadn-0.051
 Fasc-0.050
 Are-0.049
Token away
Token ID1497
Feature activation-5.24e-02

Pos logit contributions
ies+0.075
els+0.074
ins+0.072
ishing+0.065
 times+0.062
Neg logit contributions
 Curious-0.102
Don-0.101
 Fasc-0.100
 shouldn-0.100
 Are-0.099
Token loved
Token ID6151
Feature activation+5.62e-03

Pos logit contributions
 Curious+0.030
 Fasc+0.030
 Exc+0.029
 Terr+0.029
 Looking+0.028
Neg logit contributions
ies-0.053
els-0.050
ishing-0.047
er-0.046
 times-0.046
Token to
Token ID284
Feature activation+1.37e-02

Pos logit contributions
 Curious+0.009
 Fasc+0.009
 shouldn+0.009
 Are+0.009
Don+0.009
Neg logit contributions
ies-0.011
els-0.011
ier-0.010
ishing-0.010
ins-0.010
Token slide
Token ID10649
Feature activation-6.40e-02

Pos logit contributions
 Curious+0.020
 Fasc+0.020
 shouldn+0.019
 Are+0.019
 Looking+0.019
Neg logit contributions
ies-0.030
els-0.029
ishing-0.028
ins-0.028
ier-0.026
Token down
Token ID866
Feature activation-1.74e-02

Pos logit contributions
ies+0.107
ins+0.107
els+0.104
ishing+0.100
ier+0.096
Neg logit contributions
-0.118
 Exc-0.114
 Fasc-0.112
 Curious-0.110
 It-0.106
Token the
Token ID262
Feature activation+5.96e-03

Pos logit contributions
ies+0.046
ishing+0.042
ins+0.042
els+0.041
ities+0.039
Neg logit contributions
 Exc-0.019
 Fasc-0.019
-0.019
 Curious-0.018
Don-0.018
Token big
Token ID1263
Feature activation-4.70e-02

Pos logit contributions
 Curious+0.011
 Fasc+0.011
+0.011
 Exc+0.011
 hadn+0.010
Neg logit contributions
ies-0.011
els-0.011
ishing-0.010
ins-0.010
ier-0.009
Token hill
Token ID12788
Feature activation-6.84e-02

Pos logit contributions
ies+0.090
ins+0.090
els+0.086
ier+0.085
ishing+0.082
Neg logit contributions
-0.083
 Curious-0.081
 Are-0.081
 Exc-0.078
 Fasc-0.078
Token in
Token ID287
Feature activation+3.64e-03

Pos logit contributions
ies+0.107
els+0.106
ins+0.103
ishing+0.096
ries+0.086
Neg logit contributions
-0.145
 Exc-0.135
 Curious-0.134
 hadn-0.134
 shouldn-0.131
Token the
Token ID262
Feature activation-2.77e-03

Pos logit contributions
 Curious+0.005
+0.005
 Fasc+0.005
 Are+0.005
 Exc+0.005
Neg logit contributions
ies-0.009
ins-0.008
els-0.008
ishing-0.008
ier-0.008
Token park
Token ID3952
Feature activation-3.16e-02

Pos logit contributions
ies+0.005
els+0.005
ishing+0.004
ins+0.004
 times+0.004
Neg logit contributions
 Curious-0.005
 Fasc-0.005
 shouldn-0.005
 Are-0.005
 Looking-0.005
Token.
Token ID13
Feature activation+9.67e-02

Pos logit contributions
els+0.045
ies+0.044
ishing+0.042
ins+0.041
ier+0.036
Neg logit contributions
 Curious-0.070
-0.069
 hadn-0.067
 Fasc-0.067
 It-0.067
Token a
Token ID257
Feature activation+1.99e-03

Pos logit contributions
 Curious+0.049
 Fasc+0.047
 Are+0.047
 Looking+0.047
 shouldn+0.047
Neg logit contributions
ies-0.045
els-0.043
ishing-0.041
ins-0.039
 times-0.036
Token prayer
Token ID11443
Feature activation-5.43e-02

Pos logit contributions
 Curious+0.004
 Fasc+0.004
 shouldn+0.004
Don+0.004
 Are+0.003
Neg logit contributions
ies-0.004
els-0.003
 times-0.003
ins-0.003
ishing-0.003
Token of
Token ID286
Feature activation-3.91e-02

Pos logit contributions
els+0.090
ishing+0.082
ins+0.082
ies+0.079
ries+0.073
Neg logit contributions
 hadn-0.112
-0.112
 Is-0.105
 �-0.105
 Curious-0.104
Token thanks
Token ID5176
Feature activation-2.94e-02

Pos logit contributions
ies+0.085
els+0.081
ins+0.078
ishing+0.078
ier+0.072
Neg logit contributions
 Curious-0.061
 shouldn-0.058
 Fasc-0.058
 Looking-0.058
 Are-0.058
Token for
Token ID329
Feature activation-2.48e-02

Pos logit contributions
els+0.048
ies+0.048
ins+0.047
ishing+0.046
ier+0.039
Neg logit contributions
 Curious-0.057
 shouldn-0.056
 hadn-0.056
-0.055
 Looking-0.055
Token the
Token ID262
Feature activation-2.88e-02

Pos logit contributions
ies+0.059
els+0.055
ishing+0.054
ins+0.054
ier+0.048
Neg logit contributions
 hadn-0.033
 Curious-0.033
 shouldn-0.032
Don-0.032
 Looking-0.032
Token delicious
Token ID12625
Feature activation-9.43e-02

Pos logit contributions
els+0.055
ies+0.054
ishing+0.051
ins+0.050
ier+0.046
Neg logit contributions
 Curious-0.053
 shouldn-0.050
 hadn-0.050
-0.050
 Fasc-0.049
Token egg
Token ID5935
Feature activation-1.01e-01

Pos logit contributions
els+0.181
ies+0.167
ishing+0.166
ins+0.164
ier+0.155
Neg logit contributions
 Curious-0.163
 Are-0.160
 Looking-0.155
 Fasc-0.154
-0.153
Token she
Token ID673
Feature activation-2.77e-02

Pos logit contributions
els+0.142
ishing+0.134
ins+0.130
ies+0.129
ooming+0.110
Neg logit contributions
-0.231
 hadn-0.226
 shouldn-0.226
 Curious-0.216
 couldn-0.210
Token ate
Token ID15063
Feature activation-6.62e-02

Pos logit contributions
ies+0.071
els+0.067
ishing+0.066
ins+0.064
ier+0.062
Neg logit contributions
 Curious-0.031
 Fasc-0.030
 Exc-0.029
 Looking-0.029
 Are-0.028
Token that
Token ID326
Feature activation-3.55e-02

Pos logit contributions
ies+0.097
els+0.097
ishing+0.090
ins+0.087
ooming+0.079
Neg logit contributions
-0.145
 hadn-0.138
 shouldn-0.137
 It-0.134
Don-0.132
Token gave
Token ID2921
Feature activation-8.38e-03

Pos logit contributions
 Curious+0.004
 Fasc+0.004
 Looking+0.004
 Are+0.004
 Exc+0.004
Neg logit contributions
ies-0.007
els-0.007
ishing-0.006
ins-0.006
er-0.006
Token her
Token ID607
Feature activation-2.45e-02

Pos logit contributions
ies+0.017
els+0.016
ishing+0.016
ins+0.015
ier+0.015
Neg logit contributions
 Curious-0.013
 shouldn-0.013
Don-0.013
 Are-0.013
 Exc-0.013
Token a
Token ID257
Feature activation-5.00e-03

Pos logit contributions
ies+0.054
els+0.052
ishing+0.050
ins+0.049
ier+0.047
Neg logit contributions
 Curious-0.036
 Fasc-0.035
 shouldn-0.035
 Exc-0.034
-0.034
Token big
Token ID1263
Feature activation-3.75e-02

Pos logit contributions
ies+0.009
els+0.008
ishing+0.008
ins+0.008
 times+0.007
Neg logit contributions
 Curious-0.009
 Fasc-0.009
 shouldn-0.009
 Are-0.009
 Looking-0.009
Token bag
Token ID6131
Feature activation-6.97e-02

Pos logit contributions
ies+0.067
els+0.067
ins+0.064
ishing+0.062
ier+0.060
Neg logit contributions
 Curious-0.070
 Exc-0.068
 Fasc-0.067
 Are-0.067
 shouldn-0.066
Token of
Token ID286
Feature activation-3.46e-02

Pos logit contributions
els+0.116
ies+0.113
ishing+0.106
ins+0.103
 times+0.097
Neg logit contributions
-0.140
 Curious-0.137
 shouldn-0.136
 Exc-0.136
 hadn-0.132
Token peanuts
Token ID45612
Feature activation-8.48e-02

Pos logit contributions
ies+0.059
els+0.056
ishing+0.053
ins+0.051
ier+0.048
Neg logit contributions
 Curious-0.069
 Fasc-0.067
 shouldn-0.066
 Are-0.066
 Looking-0.065
Token.
Token ID13
Feature activation+1.02e-01

Pos logit contributions
els+0.120
ies+0.116
ishing+0.114
ins+0.102
 times+0.098
Neg logit contributions
-0.188
 shouldn-0.182
 Exc-0.175
 This-0.174
 hadn-0.169
Token Lily
Token ID20037
Feature activation+2.46e-02

Pos logit contributions
 shouldn+0.183
 hadn+0.182
've+0.172
 wond+0.163
Don+0.163
Neg logit contributions
er-0.167
ies-0.162
els-0.158
 again-0.156
 toys-0.153
Token was
Token ID373
Feature activation+3.35e-02

Pos logit contributions
 Are+0.040
 Where+0.040
 Curious+0.040
 Terr+0.040
 Fasc+0.040
Neg logit contributions
ies-0.053
els-0.045
ins-0.045
 toys-0.044
er-0.044
Token so
Token ID523
Feature activation+3.75e-02

Pos logit contributions
 Are+0.067
 Something+0.067
Don+0.067
 Curious+0.066
 Looking+0.066
Neg logit contributions
ies-0.053
els-0.047
 toys-0.046
 angels-0.045
ins-0.045
Token decided
Token ID3066
Feature activation+2.79e-02

Pos logit contributions
 Curious+0.039
 Fasc+0.038
 Are+0.037
 Looking+0.037
 shouldn+0.037
Neg logit contributions
ies-0.066
els-0.063
ishing-0.060
ins-0.059
 times-0.057
Token to
Token ID284
Feature activation+3.01e-02

Pos logit contributions
 Curious+0.058
 Fasc+0.056
 shouldn+0.056
 Are+0.055
 Looking+0.055
Neg logit contributions
ies-0.046
els-0.043
ishing-0.040
ins-0.039
 times-0.037
Token fold
Token ID5591
Feature activation-6.37e-02

Pos logit contributions
 Curious+0.047
 Fasc+0.046
 Are+0.045
 Looking+0.045
 shouldn+0.045
Neg logit contributions
ies-0.064
els-0.061
ishing-0.057
ins-0.056
 times-0.054
Token leaves
Token ID5667
Feature activation-8.08e-02

Pos logit contributions
ies+0.128
els+0.122
ins+0.119
ishing+0.119
ier+0.114
Neg logit contributions
 Curious-0.100
 Exc-0.099
-0.099
 Fasc-0.097
 Something-0.096
Token and
Token ID290
Feature activation+1.26e-02

Pos logit contributions
els+0.126
ies+0.123
ins+0.123
ishing+0.119
ities+0.104
Neg logit contributions
-0.163
 Exc-0.148
 shouldn-0.148
 hadn-0.147
 Fasc-0.146
Token use
Token ID779
Feature activation-4.49e-02

Pos logit contributions
 Curious+0.021
 Fasc+0.020
 shouldn+0.020
 Are+0.020
 Looking+0.020
Neg logit contributions
ies-0.026
els-0.025
ishing-0.023
ins-0.023
 times-0.022
Token them
Token ID606
Feature activation-5.30e-02

Pos logit contributions
ies+0.094
ins+0.090
ishing+0.087
ier+0.086
els+0.085
Neg logit contributions
-0.068
 Exc-0.065
 Curious-0.064
 Fasc-0.064
 hadn-0.064
Token as
Token ID355
Feature activation+1.95e-02

Pos logit contributions
els+0.102
ins+0.100
ies+0.099
ishing+0.093
ock+0.090
Neg logit contributions
-0.090
 hadn-0.087
 shouldn-0.084
 couldn-0.081
 Curious-0.079
Token little
Token ID1310
Feature activation-4.49e-02

Pos logit contributions
 shouldn+0.027
 Curious+0.026
 hadn+0.026
 Fasc+0.026
+0.025
Neg logit contributions
els-0.043
ies-0.043
ishing-0.041
ins-0.039
er-0.038
Token boats
Token ID16659
Feature activation-9.76e-02

Pos logit contributions
els+0.069
ies+0.067
ishing+0.064
ins+0.063
ock+0.059
Neg logit contributions
-0.091
 Curious-0.090
 Are-0.090
 shouldn-0.089
 hadn-0.088
Token to
Token ID284
Feature activation-1.62e-02

Pos logit contributions
els+0.162
ies+0.161
ishing+0.148
ins+0.147
 times+0.134
Neg logit contributions
 shouldn-0.187
 hadn-0.185
-0.183
 Curious-0.181
Don-0.177
Token make
Token ID787
Feature activation-2.89e-02

Pos logit contributions
 Curious+0.008
 Terr+0.007
 Fasc+0.007
Don+0.007
 hadn+0.007
Neg logit contributions
ies-0.008
els-0.007
ishing-0.007
er-0.006
 times-0.006
Token a
Token ID257
Feature activation-4.27e-02

Pos logit contributions
ies+0.061
els+0.057
ishing+0.057
ier+0.054
ins+0.054
Neg logit contributions
 Curious-0.044
 Exc-0.043
 Fasc-0.043
 Looking-0.042
-0.041
Token sweater
Token ID35685
Feature activation-8.70e-02

Pos logit contributions
els+0.079
ies+0.075
ishing+0.074
ins+0.073
ier+0.072
Neg logit contributions
 Curious-0.076
 Are-0.074
 Exc-0.074
-0.073
 Fasc-0.073
Token for
Token ID329
Feature activation-1.07e-02

Pos logit contributions
els+0.156
ishing+0.146
ins+0.145
ies+0.144
ries+0.125
Neg logit contributions
-0.177
 shouldn-0.164
 Exc-0.161
 It-0.161
 Curious-0.160
Token her
Token ID607
Feature activation-5.14e-02

Pos logit contributions
ies+0.023
ishing+0.022
ins+0.022
els+0.022
ier+0.021
Neg logit contributions
 hardly-0.015
 Curious-0.015
 hadn-0.015
 Looking-0.014
 Are-0.014
Token doll
Token ID3654
Feature activation-1.00e-01

Pos logit contributions
els+0.093
ins+0.090
ishing+0.089
ies+0.085
ier+0.083
Neg logit contributions
 hadn-0.096
 shouldn-0.095
-0.093
 Curious-0.091
 couldn-0.091
Token,
Token ID11
Feature activation-6.72e-03

Pos logit contributions
els+0.151
ins+0.148
ishing+0.145
ies+0.138
 times+0.128
Neg logit contributions
-0.216
 shouldn-0.202
 This-0.192
 hadn-0.191
 It-0.191
Token Ros
Token ID10018
Feature activation-6.92e-02

Pos logit contributions
ies+0.012
els+0.011
ishing+0.010
ins+0.010
 times+0.009
Neg logit contributions
 Curious-0.013
 Fasc-0.013
 shouldn-0.013
 Looking-0.013
 Are-0.013
Tokenie
Token ID494
Feature activation-3.73e-02

Pos logit contributions
els+0.075
 times+0.068
 angels+0.061
ishing+0.061
ries+0.061
Neg logit contributions
 hadn-0.165
 shouldn-0.158
 couldn-0.151
-0.147
 lived-0.147
Token.
Token ID13
Feature activation+9.69e-02

Pos logit contributions
ies+0.049
els+0.048
ishing+0.045
ins+0.044
ier+0.039
Neg logit contributions
 Curious-0.085
 shouldn-0.085
-0.084
 hadn-0.084
 Are-0.084
Token She
Token ID1375
Feature activation+2.64e-02

Pos logit contributions
 Curious+0.101
 Fasc+0.099
 shouldn+0.098
 hadn+0.097
 Looking+0.094
Neg logit contributions
ies-0.254
els-0.241
ishing-0.228
ins-0.225
er-0.224
Token some
Token ID617
Feature activation-6.50e-02

Pos logit contributions
ies+0.040
ishing+0.036
ins+0.035
els+0.035
 times+0.033
Neg logit contributions
 Curious-0.028
 Exc-0.027
 Fasc-0.027
 shouldn-0.027
 Where-0.026
Token paper
Token ID3348
Feature activation-1.08e-01

Pos logit contributions
ies+0.099
ishing+0.091
ins+0.090
els+0.086
ooming+0.082
Neg logit contributions
-0.128
 shouldn-0.124
 hadn-0.122
 Exc-0.119
 Fasc-0.118
Token and
Token ID290
Feature activation+8.54e-03

Pos logit contributions
ins+0.157
ock+0.151
ishing+0.151
ies+0.145
 times+0.138
Neg logit contributions
-0.216
 hadn-0.204
 It-0.204
 Exc-0.198
 shouldn-0.197
Token cr
Token ID1067
Feature activation-9.74e-02

Pos logit contributions
 Curious+0.017
 Exc+0.016
 Fasc+0.016
 hadn+0.016
 shouldn+0.016
Neg logit contributions
ies-0.015
els-0.014
ishing-0.013
ins-0.013
ier-0.012
Tokenay
Token ID323
Feature activation-6.56e-02

Pos logit contributions
ies+0.163
ishing+0.144
els+0.140
 times+0.137
ins+0.134
Neg logit contributions
 Exc-0.189
 Curious-0.187
 Looking-0.186
 It-0.185
 shouldn-0.184
Tokenons
Token ID684
Feature activation-9.02e-02

Pos logit contributions
ies+0.122
ishing+0.119
 size+0.110
burn+0.108
pit+0.107
Neg logit contributions
-0.107
 It-0.099
 shouldn-0.094
 hadn-0.088
 This-0.087
Token too
Token ID1165
Feature activation-2.27e-02

Pos logit contributions
els+0.140
ins+0.139
ies+0.138
ishing+0.136
 times+0.127
Neg logit contributions
-0.180
 hadn-0.169
 shouldn-0.162
 It-0.162
 couldn-0.161
Token.
Token ID13
Feature activation+9.35e-02

Pos logit contributions
ishing+0.040
els+0.039
ies+0.039
ins+0.038
isher+0.037
Neg logit contributions
-0.043
 It-0.041
 shouldn-0.039
 hadn-0.039
 The-0.039
Token She
Token ID1375
Feature activation+2.19e-02

Pos logit contributions
 hadn+0.089
 shouldn+0.089
've+0.081
 Curious+0.080
 Are+0.079
Neg logit contributions
ies-0.244
els-0.242
er-0.227
ins-0.226
ishing-0.226
Token cut
Token ID2005
Feature activation-6.26e-02

Pos logit contributions
 Curious+0.028
 Fasc+0.027
 shouldn+0.026
 Are+0.026
 Looking+0.026
Neg logit contributions
ies-0.053
els-0.051
ishing-0.049
ins-0.048
 times-0.046
Token the
Token ID262
Feature activation-3.14e-02

Pos logit contributions
ies+0.131
ishing+0.124
ins+0.121
els+0.117
 times+0.116
Neg logit contributions
 It-0.094
 Exc-0.089
 Curious-0.088
-0.087
 Fasc-0.085
Token a
Token ID257
Feature activation-2.86e-02

Pos logit contributions
ies+0.081
ins+0.077
ishing+0.075
els+0.074
ier+0.069
Neg logit contributions
 Curious-0.057
 shouldn-0.057
 Where-0.054
 Looking-0.054
 hadn-0.053
Token way
Token ID835
Feature activation-5.26e-02

Pos logit contributions
els+0.060
ies+0.057
ins+0.055
ishing+0.055
ries+0.051
Neg logit contributions
 Curious-0.049
 Are-0.049
 shouldn-0.048
 hadn-0.047
 Fasc-0.046
Token to
Token ID284
Feature activation-1.70e-02

Pos logit contributions
ins+0.082
els+0.082
ies+0.081
ishing+0.080
ries+0.068
Neg logit contributions
-0.114
 shouldn-0.109
 Curious-0.108
 It-0.105
 hadn-0.105
Token make
Token ID787
Feature activation-4.04e-02

Pos logit contributions
ies+0.040
els+0.038
ins+0.037
ishing+0.037
ier+0.034
Neg logit contributions
 Curious-0.023
 Are-0.022
 shouldn-0.022
 Fasc-0.022
 Looking-0.022
Token the
Token ID262
Feature activation-3.51e-02

Pos logit contributions
ies+0.093
els+0.090
ishing+0.088
ins+0.083
ier+0.079
Neg logit contributions
 Curious-0.056
 Fasc-0.052
 Exc-0.052
 shouldn-0.052
 Looking-0.052
Token dragon
Token ID10441
Feature activation-1.01e-01

Pos logit contributions
els+0.063
ies+0.059
ishing+0.058
ins+0.058
ries+0.051
Neg logit contributions
 Curious-0.069
 shouldn-0.065
 Are-0.064
 hardly-0.063
-0.063
Token go
Token ID467
Feature activation-6.20e-02

Pos logit contributions
els+0.233
ishing+0.222
ies+0.221
ins+0.216
isher+0.204
Neg logit contributions
-0.136
 Curious-0.135
 shouldn-0.128
 Terr-0.125
 Are-0.124
Token away
Token ID1497
Feature activation-6.33e-02

Pos logit contributions
els+0.098
ins+0.097
ies+0.095
ishing+0.088
 times+0.080
Neg logit contributions
-0.127
 Curious-0.126
 shouldn-0.125
 hadn-0.122
 Fasc-0.122
Token.
Token ID13
Feature activation+9.26e-02

Pos logit contributions
els+0.113
ishing+0.102
ins+0.101
ies+0.097
isher+0.093
Neg logit contributions
-0.133
 shouldn-0.124
 hadn-0.120
 Curious-0.118
 Are-0.116
Token He
Token ID679
Feature activation+1.15e-02

Pos logit contributions
 hadn+0.098
've+0.093
Don+0.091
 shouldn+0.089
LL+0.085
Neg logit contributions
er-0.218
ies-0.211
 again-0.205
els-0.204
ishing-0.202
Token put
Token ID1234
Feature activation-6.15e-02

Pos logit contributions
 Curious+0.014
 shouldn+0.013
 hadn+0.013
 Fasc+0.013
 Exc+0.012
Neg logit contributions
ies-0.029
els-0.028
ishing-0.027
ins-0.026
ier-0.025
Token and
Token ID290
Feature activation+2.89e-02

Pos logit contributions
ies+0.109
els+0.108
ins+0.104
ishing+0.095
ock+0.088
Neg logit contributions
 shouldn-0.142
-0.140
 hadn-0.138
 Are-0.136
 It-0.135
Token cute
Token ID13779
Feature activation-7.19e-02

Pos logit contributions
 Curious+0.048
 Looking+0.047
 Fasc+0.047
 Exc+0.046
 Something+0.046
Neg logit contributions
ies-0.060
els-0.054
ishing-0.051
 times-0.050
 angels-0.050
Token,
Token ID11
Feature activation+2.70e-03

Pos logit contributions
els+0.116
ies+0.107
ishing+0.104
ins+0.104
ock+0.093
Neg logit contributions
 shouldn-0.151
 Curious-0.151
-0.148
 It-0.148
 hadn-0.146
Token with
Token ID351
Feature activation-1.41e-02

Pos logit contributions
 Curious+0.005
 shouldn+0.005
 hadn+0.005
Don+0.005
 Are+0.005
Neg logit contributions
ies-0.005
els-0.005
ishing-0.004
ins-0.004
ier-0.004
Token fluffy
Token ID39145
Feature activation-1.00e-01

Pos logit contributions
ies+0.029
els+0.028
ishing+0.027
ins+0.026
ier+0.024
Neg logit contributions
 Curious-0.023
 shouldn-0.022
 Fasc-0.022
 Are-0.022
 Looking-0.022
Token feathers
Token ID27400
Feature activation-9.48e-02

Pos logit contributions
ies+0.156
ins+0.155
els+0.155
ishing+0.154
ier+0.154
Neg logit contributions
 Curious-0.205
 It-0.197
 Are-0.196
 Fasc-0.194
 Looking-0.193
Token and
Token ID290
Feature activation-1.17e-02

Pos logit contributions
els+0.130
ishing+0.130
ins+0.125
ies+0.122
isher+0.120
Neg logit contributions
 shouldn-0.206
-0.206
 It-0.197
 hadn-0.197
 Curious-0.192
Token a
Token ID257
Feature activation-3.27e-02

Pos logit contributions
ies+0.023
els+0.022
ishing+0.021
ins+0.020
 times+0.019
Neg logit contributions
 Curious-0.020
 Fasc-0.019
 Looking-0.019
 shouldn-0.019
 Are-0.019
Token tiny
Token ID7009
Feature activation-9.18e-02

Pos logit contributions
els+0.066
ies+0.066
ishing+0.062
ins+0.061
er+0.056
Neg logit contributions
 Curious-0.056
 Are-0.054
 Fasc-0.053
 shouldn-0.053
 hadn-0.052
Token be
Token ID307
Feature activation-5.19e-02

Pos logit contributions
els+0.184
ins+0.179
ies+0.170
ishing+0.169
ries+0.158
Neg logit contributions
 Curious-0.165
 Are-0.159
-0.157
 Looking-0.152
 It-0.151
Tokenak
Token ID461
Feature activation-9.38e-02

Pos logit contributions
ies+0.098
els+0.097
 times+0.093
ishing+0.091
 angels+0.087
Neg logit contributions
've-0.079
 Curious-0.078
 hadn-0.078
-0.077
 Looking-0.075
Token to
Token ID284
Feature activation-1.87e-02

Pos logit contributions
ies+0.096
els+0.096
ishing+0.093
ins+0.092
 times+0.083
Neg logit contributions
 shouldn-0.088
-0.088
 hadn-0.087
 Curious-0.084
 Are-0.083
Token go
Token ID467
Feature activation-4.26e-02

Pos logit contributions
ies+0.043
els+0.042
ins+0.042
ishing+0.041
ier+0.037
Neg logit contributions
-0.026
 Curious-0.026
 shouldn-0.026
 Are-0.026
 hadn-0.025
Token back
Token ID736
Feature activation-4.62e-02

Pos logit contributions
els+0.070
ins+0.069
ies+0.068
ishing+0.063
 times+0.057
Neg logit contributions
-0.086
 Curious-0.083
 Are-0.082
 shouldn-0.082
 Where-0.081
Token to
Token ID284
Feature activation-2.17e-02

Pos logit contributions
els+0.087
ies+0.086
ins+0.084
ishing+0.080
ier+0.078
Neg logit contributions
-0.079
 Curious-0.078
 shouldn-0.076
 Where-0.075
 Fasc-0.075
Token the
Token ID262
Feature activation-2.33e-02

Pos logit contributions
ies+0.045
els+0.043
ishing+0.041
ins+0.040
ier+0.038
Neg logit contributions
 Curious-0.035
 Fasc-0.034
 Are-0.034
 shouldn-0.034
 Looking-0.033
Token zoo
Token ID26626
Feature activation-9.42e-02

Pos logit contributions
ies+0.041
els+0.039
ishing+0.036
ins+0.036
ier+0.034
Neg logit contributions
 Curious-0.046
 shouldn-0.045
 Are-0.045
 Fasc-0.044
 hadn-0.044
Token again
Token ID757
Feature activation-4.11e-02

Pos logit contributions
els+0.145
ies+0.134
ishing+0.122
ins+0.121
ries+0.101
Neg logit contributions
-0.217
 hadn-0.209
 Curious-0.206
 shouldn-0.204
 Is-0.197
Token.
Token ID13
Feature activation+9.27e-02

Pos logit contributions
els+0.051
ies+0.048
ins+0.044
ishing+0.043
ier+0.038
Neg logit contributions
-0.101
 shouldn-0.099
 hadn-0.098
 Curious-0.098
 Are-0.098
Token<|endoftext|>
Token ID50256
Feature activation+6.16e-02

Pos logit contributions
Jump+0.142
 Barkley+0.138
 shouldn+0.137
 leapt+0.134
Don+0.133
Neg logit contributions
 again-0.153
ies-0.144
 ways-0.142
ins-0.141
.-0.140
TokenOnce
Token ID7454
Feature activation+8.78e-02

Pos logit contributions
 shouldn+0.077
 Curious+0.076
 Fasc+0.074
 hadn+0.074
 Are+0.072
Neg logit contributions
ies-0.139
els-0.135
ins-0.130
 again-0.127
ishing-0.126
Token upon
Token ID2402
Feature activation+9.43e-02

Pos logit contributions
 Curious+0.150
 Fasc+0.145
 Looking+0.143
 Are+0.143
 shouldn+0.141
Neg logit contributions
ies-0.182
els-0.165
ishing-0.155
ins-0.154
 times-0.147