TOP ACTIVATIONS
MAX = 0.172

disturb us or take our gold !" L ily
to be gold is not gold , but something else that
" Your dog took our gold !" Tim says . "
friendship are more precious than gold ." L ily
he couldn 't keep his gold safe by himself . He
. I will trade my navy boat for your red boat
how to make things with copper ?" Anna says .
You can 't keep this gold , you have to give
together . They cannot afford gold ice cream .
I did not share the laser with you . Do you
Maybe the treasure is not gold or jewels ," L ila
. I don 't need gold ice cream . I have
real treasure chest filled with gold coins and jewels .
He jumps and grabs some gold in his mouth . He
. I have never seen copper before . It is so
slide is not made of gold , but it is brilliant
prevent accident by not putting rocks in our mouths ."
It will only cost one gold coin ". The dragon was
to help him keep his gold safe . They agreed and
, you will get more gold ." The boy was greedy

MIN ACTIVATIONS
MIN = -0.125

until the sun went down . <|endoftext|> Once upon a time
. " What is that ?" Todd asked Ben . Ben
in the park all day . <|endoftext|> One day , a
drawing pictures . The end . <|endoftext|> Once upon a time
and became best friends again . From that day on ,
, laughing and having fun . When it was time to
day at the fun fair . <|endoftext|> Once upon a time
shared many more happy times . <|endoftext|> One day , a
hurt again . The end . <|endoftext|> Once upon a time
with her new friends again . Every day ,
to have fun swinging safely . <|endoftext|> Once upon a time
they played together until dinner . Later that night
promised to play again soon . <|endoftext|> Once upon a time
and had lots of fun ! <|endoftext|> Once upon a time
days together . The end . <|endoftext|> Once upon a time
fun together . The end . <|endoftext|> Once upon a time
and play with my umbrella ?" he asked .
until the sun went down . <|endoftext|> Once upon a time
stay down here and play ?" he asked .
havoc all over the ranch ." Sam my was

INTERVAL 0.097 - 0.172
CONTAINS 3.180%

Bob ate lots of y ummy treats and had a great
ever . She touched the wood and it felt nice and
them how he made the picture . He lets them touch
loves peanuts . Ben hates peanuts . He thinks they are
. It is big and shiny . It has a dragon

INTERVAL 0.023 - 0.097
CONTAINS 29.417%

oc ados and scatter them around his farm .
and she started to go higher and higher . Jenna laughed
there more stars in the sky ?" Ben wondered .
love our train and we love each other !" Their mom
Lily 's mom was very flexible and always knew what to

INTERVAL -0.051 - 0.023
CONTAINS 50.823%

sad and angry . She did not want to put her
eyes and soon , she was dreaming . In
She closed her eyes and imagined she was flying through the
artwork . From that day on , they were good friends
learned an important lesson that day . Even if it 's

INTERVAL -0.125 - -0.051
CONTAINS 16.580%

they loved her . Mom smiled . She said
upon a time , there was a phone . It liked
's good for you . From that day on
more juice . But Tommy did not know that
the mill again . <|endoftext|> Once upon a time , there
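
Note on the interval headers above: the four INTERVAL ranges look like equal-width bins between MIN and MAX, and the four CONTAINS percentages appear to account for all sampled activations. The sketch below checks both readings using only the numbers shown above. The equal-width binning and the helper name `equal_width_edges` are assumptions for illustration, not something the dashboard states.

```python
# Values copied from the MIN / MAX and CONTAINS lines above.
act_min, act_max = -0.125, 0.172
shares = [3.180, 29.417, 50.823, 16.580]  # percentages, top interval first


def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 evenly spaced edges from lo to hi (hypothetical helper)."""
    step = (hi - lo) / n_bins
    return [lo + i * step for i in range(n_bins + 1)]


# Assumption: the dashboard splits [MIN, MAX] into four equal-width bins.
edges = equal_width_edges(act_min, act_max, 4)
total_share = sum(shares)

if __name__ == "__main__":
    print([round(e, 3) for e in edges])
    print(round(total_share, 3))
```

Under that assumption the computed edges agree with the printed boundaries (-0.051, 0.023, 0.097) to within about 0.001, which is what one would expect if the displayed MIN and MAX are themselves rounded.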
Token disturb
Token ID17037
Feature activation+5.99e-02

Pos logit contributions
 Curious+0.000
Don+0.000
 Looking+0.000
 Are+0.000
 Fasc+0.000
Neg logit contributions
ies-0.000
els-0.000
ishing-0.000
ins-0.000
ig-0.000
Token us
Token ID514
Feature activation+8.11e-02

Pos logit contributions
ishing+0.121
ies+0.117
ins+0.111
ooming+0.110
ig+0.109
Neg logit contributions
 Who-0.095
-0.094
 It-0.094
 Are-0.094
 Where-0.093
Token or
Token ID393
Feature activation+1.55e-02

Pos logit contributions
ishing+0.164
els+0.157
isher+0.151
ig+0.143
ins+0.142
Neg logit contributions
-0.145
 shouldn-0.123
 This-0.120
 It-0.118
 The-0.118
Token take
Token ID1011
Feature activation+7.58e-02

Pos logit contributions
ies+0.034
els+0.033
ishing+0.032
ins+0.031
ig+0.030
Neg logit contributions
 Curious-0.023
Don-0.023
 Looking-0.022
 Fasc-0.022
 Are-0.022
Token our
Token ID674
Feature activation+6.06e-02

Pos logit contributions
ies+0.137
ishing+0.134
els+0.132
ins+0.130
ooming+0.125
Neg logit contributions
 It-0.134
 Where-0.133
 Who-0.132
 Curious-0.131
 Looking-0.131
Token gold
Token ID3869
Feature activation+1.72e-01

Pos logit contributions
ins+0.097
ier+0.095
ishing+0.093
ies+0.093
els+0.092
Neg logit contributions
 Are-0.114
 Curious-0.114
 hadn-0.113
 Where-0.112
 Who-0.111
Token!"
Token ID2474
Feature activation-1.03e-01

Pos logit contributions
ooming+0.277
ishing+0.272
ins+0.271
ipping+0.270
els+0.270
Neg logit contributions
 He-0.251
 It-0.245
 This-0.241
 shouldn-0.237
-0.231
Token
Token ID198
Feature activation-1.01e-01

Pos logit contributions
Don+0.144
 shouldn+0.141
 hadn+0.138
 Curious+0.136
 Fasc+0.135
Neg logit contributions
ies-0.222
 again-0.206
ins-0.202
er-0.199
els-0.198
Token
Token ID198
Feature activation-9.06e-02

Pos logit contributions
 Curious+0.138
 Fasc+0.135
Don+0.135
 hadn+0.133
 shouldn+0.132
Neg logit contributions
ies-0.218
ins-0.211
 again-0.204
--0.203
er-0.199
TokenL
Token ID43
Feature activation-8.42e-03

Pos logit contributions
 Curious+0.145
 Looking+0.141
 Are+0.138
Don+0.137
 Fasc+0.137
Neg logit contributions
ies-0.189
els-0.177
ishing-0.176
ins-0.174
er-0.162
Tokenily
Token ID813
Feature activation-1.56e-02

Pos logit contributions
 Curious+0.013
 Looking+0.013
Don+0.012
 Are+0.012
 Fasc+0.012
Neg logit contributions
ies-0.018
els-0.018
ishing-0.017
ins-0.016
ig-0.016
Token to
Token ID284
Feature activation+2.50e-02

Pos logit contributions
ies+0.061
els+0.061
ishing+0.059
ins+0.056
ig+0.055
Neg logit contributions
Don-0.068
 shouldn-0.067
 Are-0.067
 Curious-0.066
 Looking-0.065
Token be
Token ID307
Feature activation+2.68e-02

Pos logit contributions
ies+0.055
els+0.054
ishing+0.051
ins+0.048
ie+0.047
Neg logit contributions
 Curious-0.039
Don-0.037
 Looking-0.037
 Fasc-0.036
 Are-0.035
Token gold
Token ID3869
Feature activation+1.52e-01

Pos logit contributions
ies+0.059
els+0.058
ishing+0.056
ins+0.054
ig+0.052
Neg logit contributions
 Curious-0.040
Don-0.039
 Looking-0.039
 Are-0.038
 Fasc-0.037
Token is
Token ID318
Feature activation+4.31e-02

Pos logit contributions
ishing+0.227
els+0.226
ins+0.211
ooming+0.210
isher+0.204
Neg logit contributions
 shouldn-0.288
-0.279
 It-0.267
 hadn-0.266
 Are-0.262
Token not
Token ID407
Feature activation+4.71e-02

Pos logit contributions
ies+0.084
els+0.082
ishing+0.080
ins+0.076
ig+0.075
Neg logit contributions
 Curious-0.073
Don-0.072
 Looking-0.071
 Are-0.071
 shouldn-0.070
Token gold
Token ID3869
Feature activation+1.70e-01

Pos logit contributions
els+0.097
ooming+0.097
ishing+0.095
ig+0.094
ies+0.092
Neg logit contributions
 shouldn-0.074
 hadn-0.070
Don-0.067
 It-0.067
 Are-0.066
Token,
Token ID11
Feature activation+2.10e-02

Pos logit contributions
ishing+0.271
els+0.270
ooming+0.268
ins+0.265
burn+0.252
Neg logit contributions
 shouldn-0.301
-0.289
 It-0.281
 He-0.271
 Where-0.266
Token but
Token ID475
Feature activation-2.58e-02

Pos logit contributions
ies+0.035
ishing+0.035
els+0.034
ins+0.032
ig+0.031
Neg logit contributions
Don-0.042
 Curious-0.041
 Are-0.040
 shouldn-0.040
 hadn-0.040
Token something
Token ID1223
Feature activation+2.11e-02

Pos logit contributions
 Curious+0.036
Don+0.036
 Are+0.035
 Looking+0.035
 Who+0.033
Neg logit contributions
ies-0.058
els-0.057
ishing-0.057
ins-0.054
ig-0.053
Token else
Token ID2073
Feature activation+4.77e-02

Pos logit contributions
ies+0.043
els+0.042
ishing+0.040
ins+0.039
ig+0.037
Neg logit contributions
 Curious-0.036
Don-0.034
 Looking-0.034
 Fasc-0.033
 Are-0.033
Token that
Token ID326
Feature activation+1.26e-02

Pos logit contributions
ies+0.098
els+0.094
ishing+0.092
ins+0.088
ig+0.084
Neg logit contributions
 Curious-0.079
Don-0.077
 Looking-0.076
 Are-0.075
 Fasc-0.074
Token"
Token ID1
Feature activation-3.90e-02

Pos logit contributions
 Curious+0.139
 shouldn+0.137
 Looking+0.135
 hadn+0.133
 Fasc+0.131
Neg logit contributions
ies-0.191
els-0.190
ins-0.180
ishing-0.175
ooming-0.165
TokenYour
Token ID7120
Feature activation+4.73e-02

Pos logit contributions
 shouldn+0.042
 Curious+0.042
 Looking+0.041
 Fasc+0.040
 hadn+0.039
Neg logit contributions
els-0.090
ins-0.088
ies-0.086
 angels-0.081
ishing-0.081
Token dog
Token ID3290
Feature activation+6.53e-02

Pos logit contributions
ishing+0.094
ins+0.093
els+0.093
ies+0.089
ier+0.087
Neg logit contributions
 Are-0.076
 Curious-0.075
 Something-0.073
 Where-0.073
 It-0.073
Token took
Token ID1718
Feature activation+7.12e-02

Pos logit contributions
ies+0.132
els+0.129
ishing+0.126
ins+0.120
ig+0.115
Neg logit contributions
 Curious-0.108
Don-0.105
 Looking-0.105
 Are-0.103
 Fasc-0.100
Token our
Token ID674
Feature activation+6.64e-02

Pos logit contributions
ies+0.157
ishing+0.150
els+0.149
ins+0.146
ig+0.138
Neg logit contributions
 Curious-0.106
 Looking-0.104
 Are-0.103
Don-0.103
 It-0.101
Token gold
Token ID3869
Feature activation+1.69e-01

Pos logit contributions
ins+0.114
ier+0.113
ishing+0.110
ies+0.110
els+0.106
Neg logit contributions
 Curious-0.119
 Are-0.118
 Who-0.117
 shouldn-0.116
 Where-0.116
Token!"
Token ID2474
Feature activation-9.74e-02

Pos logit contributions
ins+0.275
ooming+0.272
ishing+0.270
ipping+0.251
els+0.250
Neg logit contributions
-0.263
 He-0.260
 Exc-0.253
 It-0.252
 shouldn-0.247
Token Tim
Token ID5045
Feature activation-7.37e-04

Pos logit contributions
 shouldn+0.141
 Curious+0.140
Don+0.139
 hadn+0.137
 Looking+0.136
Neg logit contributions
ies-0.203
els-0.200
 again-0.179
ins-0.177
ishing-0.176
Token says
Token ID1139
Feature activation-1.02e-02

Pos logit contributions
 Curious+0.001
 Looking+0.001
Don+0.001
 Are+0.001
 Fasc+0.001
Neg logit contributions
ies-0.002
els-0.002
ishing-0.002
ins-0.001
ig-0.001
Token.
Token ID13
Feature activation-8.12e-02

Pos logit contributions
 Curious+0.022
 relieved+0.022
 Looking+0.022
 It+0.021
+0.021
Neg logit contributions
els-0.017
ishing-0.016
ies-0.015
ig-0.015
ishes-0.015
Token "
Token ID366
Feature activation-3.45e-02

Pos logit contributions
Don+0.085
 shouldn+0.084
 Curious+0.083
 hadn+0.083
 Looking+0.080
Neg logit contributions
ies-0.201
els-0.196
ins-0.189
ishing-0.184
 again-0.181
Token friendship
Token ID14738
Feature activation+8.71e-02

Pos logit contributions
ies+0.001
els+0.001
ishing+0.001
ins+0.001
ig+0.001
Neg logit contributions
 Curious-0.001
 Are-0.001
Don-0.001
 Looking-0.001
 Who-0.001
Token are
Token ID389
Feature activation+3.49e-02

Pos logit contributions
els+0.140
ishing+0.135
ooming+0.125
ig+0.125
ies+0.123
Neg logit contributions
 Are-0.174
 It-0.173
 shouldn-0.172
-0.170
 Who-0.168
Token more
Token ID517
Feature activation+5.32e-02

Pos logit contributions
els+0.073
ishing+0.072
ies+0.072
ig+0.070
ooming+0.069
Neg logit contributions
 shouldn-0.051
 hadn-0.051
Don-0.050
 Are-0.049
 relieved-0.048
Token precious
Token ID14186
Feature activation+1.12e-01

Pos logit contributions
els+0.107
ies+0.101
ishing+0.100
ig+0.098
ooming+0.096
Neg logit contributions
 shouldn-0.088
 Are-0.087
Don-0.086
 Curious-0.084
 hadn-0.084
Token than
Token ID621
Feature activation+3.84e-02

Pos logit contributions
els+0.232
ooming+0.224
ig+0.218
burn+0.205
ishing+0.200
Neg logit contributions
 But-0.169
 shouldn-0.169
 Thank-0.166
 Are-0.166
 Where-0.161
Token gold
Token ID3869
Feature activation+1.68e-01

Pos logit contributions
ig+0.077
ishing+0.076
els+0.076
ies+0.076
ooming+0.072
Neg logit contributions
 hadn-0.059
 Are-0.058
 shouldn-0.058
Don-0.057
 Who-0.056
Token."
Token ID526
Feature activation-1.09e-01

Pos logit contributions
ooming+0.311
burn+0.290
ishing+0.288
els+0.284
ig+0.269
Neg logit contributions
 shouldn-0.250
 He-0.246
 It-0.244
 could-0.241
 I-0.234
Token
Token ID198
Feature activation-1.10e-01

Pos logit contributions
 Curious+0.131
 Looking+0.127
Don+0.124
 Fasc+0.124
 hardly+0.122
Neg logit contributions
ies-0.265
els-0.244
ins-0.240
 again-0.238
ishing-0.237
Token
Token ID198
Feature activation-1.03e-01

Pos logit contributions
 Curious+0.167
 Looking+0.163
 Fasc+0.161
 Are+0.158
 hardly+0.153
Neg logit contributions
ies-0.240
ins-0.222
els-0.219
ie-0.212
 again-0.208
TokenL
Token ID43
Feature activation-2.72e-02

Pos logit contributions
 Curious+0.168
 Looking+0.162
 Fasc+0.158
 Are+0.154
 Where+0.152
Neg logit contributions
ies-0.218
els-0.205
ins-0.201
ishing-0.200
ig-0.188
Tokenily
Token ID813
Feature activation-2.11e-02

Pos logit contributions
Don+0.032
 Curious+0.030
 Looking+0.029
 hadn+0.029
 Are+0.028
Neg logit contributions
ies-0.066
els-0.065
ishing-0.065
ishes-0.060
ig-0.059
Token he
Token ID339
Feature activation+1.54e-02

Pos logit contributions
 Curious+0.012
Don+0.012
 Looking+0.012
 Are+0.011
 Fasc+0.011
Neg logit contributions
ies-0.018
els-0.017
ishing-0.017
ins-0.016
ig-0.016
Token couldn
Token ID3521
Feature activation-1.86e-02

Pos logit contributions
ies+0.041
els+0.038
ishing+0.037
ie+0.037
ig+0.036
Neg logit contributions
 Curious-0.018
 Looking-0.018
 Are-0.018
 Fasc-0.017
Don-0.017
Token't
Token ID470
Feature activation+8.07e-03

Pos logit contributions
Don+0.022
 Curious+0.022
 Looking+0.021
 Are+0.021
 Fasc+0.020
Neg logit contributions
els-0.045
ies-0.044
ig-0.042
 again-0.041
ishing-0.041
Token keep
Token ID1394
Feature activation+5.61e-02

Pos logit contributions
ies+0.018
els+0.017
ishing+0.017
ins+0.016
ig+0.016
Neg logit contributions
 Curious-0.012
Don-0.012
 Looking-0.011
 Are-0.011
 Fasc-0.011
Token his
Token ID465
Feature activation+6.06e-02

Pos logit contributions
ies+0.125
els+0.119
ishing+0.118
ins+0.116
ig+0.109
Neg logit contributions
 Curious-0.081
 Looking-0.078
Don-0.077
 Fasc-0.076
 Are-0.075
Token gold
Token ID3869
Feature activation+1.68e-01

Pos logit contributions
ins+0.123
els+0.114
ier+0.107
ies+0.106
ries+0.106
Neg logit contributions
 hadn-0.100
 Curious-0.099
 hardly-0.095
 Looking-0.095
 shouldn-0.095
Token safe
Token ID3338
Feature activation+1.14e-01

Pos logit contributions
ins+0.342
els+0.313
ries+0.302
ips+0.300
ud+0.293
Neg logit contributions
 shouldn-0.237
-0.234
 hadn-0.226
 hardly-0.225
 He-0.223
Token by
Token ID416
Feature activation+3.93e-02

Pos logit contributions
ins+0.225
els+0.218
ooming+0.203
ries+0.194
ishing+0.193
Neg logit contributions
 hadn-0.191
-0.188
 shouldn-0.183
 had-0.181
 He-0.180
Token himself
Token ID2241
Feature activation+8.64e-02

Pos logit contributions
ishing+0.090
ins+0.089
ies+0.089
els+0.086
ig+0.085
Neg logit contributions
 hadn-0.048
 hardly-0.046
 Looking-0.045
 Curious-0.045
 It-0.044
Token.
Token ID13
Feature activation-9.69e-02

Pos logit contributions
els+0.157
ins+0.152
ishing+0.150
isher+0.142
ries+0.140
Neg logit contributions
-0.179
 shouldn-0.156
 hadn-0.156
 This-0.148
 couldn-0.145
Token He
Token ID679
Feature activation-1.56e-02

Pos logit contributions
Don+0.090
've+0.078
 hadn+0.074
 Curious+0.071
 Are+0.069
Neg logit contributions
ies-0.253
er-0.249
ishing-0.247
els-0.245
 again-0.243
Token.
Token ID13
Feature activation-5.71e-02

Pos logit contributions
ries+0.101
els+0.100
ins+0.095
ishing+0.091
ies+0.090
Neg logit contributions
-0.080
 Something-0.069
 see-0.069
 It-0.068
 have-0.067
Token I
Token ID314
Feature activation-5.54e-03

Pos logit contributions
Don+0.060
 Curious+0.060
 hadn+0.059
 hardly+0.059
 Fasc+0.057
Neg logit contributions
els-0.137
ishing-0.131
ies-0.131
er-0.128
ig-0.127
Token will
Token ID481
Feature activation-9.92e-03

Pos logit contributions
 Curious+0.010
Don+0.010
 Looking+0.010
 Fasc+0.010
 Are+0.010
Neg logit contributions
ies-0.010
els-0.010
ishing-0.010
ins-0.009
ig-0.009
Token trade
Token ID3292
Feature activation+9.40e-02

Pos logit contributions
 Curious+0.014
 Looking+0.013
Don+0.013
 Are+0.013
 Fasc+0.012
Neg logit contributions
ies-0.023
els-0.023
ishing-0.022
ins-0.022
ig-0.020
Token my
Token ID616
Feature activation+6.10e-02

Pos logit contributions
ishing+0.211
ies+0.207
ooming+0.199
ig+0.198
els+0.194
Neg logit contributions
 It-0.128
Don-0.121
-0.120
 The-0.118
 Are-0.114
Token navy
Token ID23956
Feature activation+1.66e-01

Pos logit contributions
ins+0.111
ies+0.110
els+0.110
ishing+0.108
ier+0.108
Neg logit contributions
 Curious-0.107
 It-0.097
 Fasc-0.097
 Looking-0.095
 Are-0.095
Token boat
Token ID8848
Feature activation+1.47e-01

Pos logit contributions
els+0.306
ries+0.301
ishing+0.298
ins+0.285
ies+0.272
Neg logit contributions
 It-0.256
 The-0.246
 Have-0.236
-0.231
 Win-0.231
Token for
Token ID329
Feature activation+1.62e-02

Pos logit contributions
els+0.260
ishing+0.240
ies+0.234
ins+0.233
ries+0.230
Neg logit contributions
-0.265
 The-0.249
 This-0.249
 He-0.244
 It-0.242
Token your
Token ID534
Feature activation+6.02e-02

Pos logit contributions
ies+0.039
ishing+0.037
els+0.037
ins+0.036
ig+0.035
Neg logit contributions
 Are-0.020
Don-0.019
 Curious-0.019
 It-0.019
 Looking-0.019
Token red
Token ID2266
Feature activation+1.10e-01

Pos logit contributions
ins+0.115
els+0.115
ier+0.108
ries+0.107
ishing+0.106
Neg logit contributions
 hadn-0.096
 Curious-0.094
 Are-0.094
-0.093
 Have-0.092
Token boat
Token ID8848
Feature activation+1.38e-01

Pos logit contributions
ins+0.243
ier+0.239
els+0.228
ries+0.227
ishing+0.223
Neg logit contributions
 It-0.136
 The-0.127
 Win-0.114
 A-0.113
 This-0.111
Token how
Token ID703
Feature activation-4.21e-03

Pos logit contributions
ies+0.116
ishing+0.115
els+0.111
ins+0.107
ooming+0.105
Neg logit contributions
 It-0.086
 The-0.082
 Curious-0.082
 Something-0.078
 This-0.078
Token to
Token ID284
Feature activation+2.21e-02

Pos logit contributions
 Curious+0.008
Don+0.008
 Are+0.008
 Looking+0.007
 Where+0.007
Neg logit contributions
ies-0.008
els-0.008
ishing-0.008
ins-0.007
ig-0.007
Token make
Token ID787
Feature activation+4.94e-02

Pos logit contributions
ies+0.050
els+0.048
ishing+0.048
ins+0.046
ig+0.044
Neg logit contributions
 Curious-0.031
Don-0.030
 Looking-0.030
 Are-0.030
 Fasc-0.029
Token things
Token ID1243
Feature activation+8.38e-02

Pos logit contributions
ies+0.110
ishing+0.107
els+0.103
ins+0.100
ig+0.100
Neg logit contributions
 Curious-0.067
 Are-0.067
 It-0.067
 Looking-0.065
 Exc-0.065
Token with
Token ID351
Feature activation+5.85e-02

Pos logit contributions
ishing+0.146
ies+0.145
els+0.145
ig+0.138
ins+0.136
Neg logit contributions
 Are-0.152
-0.149
 It-0.148
 Curious-0.145
Don-0.145
Token copper
Token ID15317
Feature activation+1.65e-01

Pos logit contributions
ies+0.135
ishing+0.132
ins+0.131
ooming+0.130
ig+0.127
Neg logit contributions
 It-0.075
 shouldn-0.073
 Are-0.071
Don-0.069
 The-0.068
Token?"
Token ID1701
Feature activation-1.07e-01

Pos logit contributions
ooming+0.250
ishing+0.228
 angels+0.223
ologist+0.217
ins+0.217
Neg logit contributions
 It-0.292
 shouldn-0.287
 Are-0.276
 This-0.269
-0.267
Token Anna
Token ID11735
Feature activation-1.51e-02

Pos logit contributions
 shouldn+0.216
LL+0.209
 hadn+0.208
 Terr+0.203
've+0.200
Neg logit contributions
ies-0.174
els-0.157
 again-0.156
ie-0.149
.-0.139
Token says
Token ID1139
Feature activation-4.24e-03

Pos logit contributions
 Curious+0.021
Don+0.020
 Looking+0.020
 Are+0.020
 Fasc+0.019
Neg logit contributions
ies-0.035
els-0.034
ishing-0.034
ins-0.032
ig-0.031
Token.
Token ID13
Feature activation-9.92e-02

Pos logit contributions
 Curious+0.009
+0.009
 Looking+0.008
 It+0.008
 relieved+0.008
Neg logit contributions
els-0.007
ishing-0.007
ins-0.007
ies-0.007
ishes-0.006
Token
Token ID198
Feature activation-1.01e-01

Pos logit contributions
 hadn+0.119
 shouldn+0.115
 Curious+0.105
LL+0.105
Don+0.104
Neg logit contributions
ies-0.231
els-0.226
ins-0.220
 again-0.210
ishing-0.208
TokenYou
Token ID1639
Feature activation-1.10e-02

Pos logit contributions
 Curious+0.027
 Looking+0.025
 Are+0.024
 Fasc+0.024
Don+0.024
Neg logit contributions
ies-0.079
els-0.078
ishing-0.077
ins-0.075
ig-0.071
Token can
Token ID460
Feature activation-1.72e-02

Pos logit contributions
 Curious+0.016
Don+0.016
 Looking+0.016
 Fasc+0.015
 Are+0.015
Neg logit contributions
ies-0.024
els-0.023
ishing-0.022
ins-0.021
ig-0.021
Token't
Token ID470
Feature activation+4.67e-03

Pos logit contributions
 Curious+0.020
 Looking+0.019
Don+0.019
 Are+0.019
 Who+0.018
Neg logit contributions
ies-0.043
els-0.042
ishing-0.042
ins-0.041
ig-0.038
Token keep
Token ID1394
Feature activation+5.64e-02

Pos logit contributions
ies+0.011
els+0.011
ishing+0.011
ins+0.010
ig+0.010
Neg logit contributions
 Curious-0.006
Don-0.006
 Looking-0.006
 Are-0.006
 Fasc-0.006
Token this
Token ID428
Feature activation+8.32e-02

Pos logit contributions
ies+0.125
els+0.117
ishing+0.116
ins+0.115
ier+0.111
Neg logit contributions
 Curious-0.080
 Looking-0.079
Don-0.077
 Are-0.076
 Fasc-0.075
Token gold
Token ID3869
Feature activation+1.64e-01

Pos logit contributions
els+0.178
ins+0.165
ishing+0.159
ies+0.157
ries+0.148
Neg logit contributions
-0.137
 Curious-0.127
 shouldn-0.126
LL-0.123
 hadn-0.122
Token,
Token ID11
Feature activation-2.29e-03

Pos logit contributions
els+0.290
ins+0.275
ooming+0.274
ishing+0.274
ries+0.251
Neg logit contributions
-0.298
 shouldn-0.279
 It-0.268
 I-0.258
 This-0.256
Token you
Token ID345
Feature activation+1.34e-02

Pos logit contributions
 Curious+0.004
Don+0.004
 Are+0.004
 Looking+0.004
 hadn+0.003
Neg logit contributions
ies-0.005
els-0.005
ishing-0.005
ins-0.004
ig-0.004
Token have
Token ID423
Feature activation+1.19e-02

Pos logit contributions
ies+0.033
els+0.032
ishing+0.031
ins+0.030
ig+0.029
Neg logit contributions
 Curious-0.017
Don-0.016
 Looking-0.016
 Are-0.016
 Fasc-0.015
Token to
Token ID284
Feature activation+4.57e-03

Pos logit contributions
ies+0.023
els+0.022
ishing+0.022
ins+0.021
ig+0.020
Neg logit contributions
 Curious-0.021
Don-0.021
 Looking-0.020
 Are-0.020
 hadn-0.020
Token give
Token ID1577
Feature activation+5.48e-02

Pos logit contributions
ies+0.011
els+0.011
ishing+0.010
ins+0.010
ig+0.009
Neg logit contributions
 Curious-0.006
 Looking-0.006
Don-0.006
 Are-0.006
 hadn-0.005
Token together
Token ID1978
Feature activation+4.26e-02

Pos logit contributions
ishing+0.172
els+0.169
ies+0.166
ig+0.159
ooming+0.155
Neg logit contributions
 shouldn-0.225
-0.217
 hadn-0.215
Don-0.215
 Are-0.214
Token.
Token ID13
Feature activation-9.42e-02

Pos logit contributions
els+0.065
ishing+0.063
ies+0.062
ins+0.058
ig+0.057
Neg logit contributions
-0.087
 hadn-0.086
 shouldn-0.086
 Are-0.086
Don-0.086
Token They
Token ID1119
Feature activation-2.58e-02

Pos logit contributions
Don+0.087
 Curious+0.087
 Fasc+0.082
 Looking+0.081
 hadn+0.080
Neg logit contributions
els-0.228
ies-0.224
ins-0.215
 again-0.208
ishing-0.208
Token cannot
Token ID2314
Feature activation-6.64e-03

Pos logit contributions
 Curious+0.036
 Looking+0.035
Don+0.035
 Are+0.034
 Fasc+0.034
Neg logit contributions
ies-0.059
els-0.057
ishing-0.056
ins-0.053
ig-0.052
Token afford
Token ID5368
Feature activation+5.73e-02

Pos logit contributions
 Curious+0.010
Don+0.009
 Looking+0.009
 Fasc+0.009
 Are+0.009
Neg logit contributions
ies-0.015
els-0.015
ishing-0.014
ins-0.013
ig-0.013
Token gold
Token ID3869
Feature activation+1.64e-01

Pos logit contributions
ies+0.135
ishing+0.131
ig+0.129
els+0.127
ins+0.125
Neg logit contributions
 It-0.077
 Where-0.073
Don-0.073
 Who-0.072
 shouldn-0.071
Token ice
Token ID4771
Feature activation+1.36e-01

Pos logit contributions
ooming+0.272
ishing+0.257
ins+0.249
burn+0.239
ies+0.233
Neg logit contributions
 shouldn-0.279
 It-0.276
 He-0.264
-0.255
 could-0.254
Token cream
Token ID8566
Feature activation+1.24e-01

Pos logit contributions
els+0.195
ooming+0.178
ins+0.174
eness+0.168
ologist+0.165
Neg logit contributions
LL-0.257
 had-0.248
 hadn-0.243
 shouldn-0.240
-0.232
Token.
Token ID13
Feature activation-9.86e-02

Pos logit contributions
els+0.212
ies+0.202
ishing+0.198
ins+0.194
ig+0.184
Neg logit contributions
-0.249
 shouldn-0.231
 hadn-0.227
LL-0.226
 This-0.221
Token
Token ID198
Feature activation-9.90e-02

Pos logit contributions
 Curious+0.117
 Fasc+0.112
 hadn+0.111
Don+0.111
 Looking+0.110
Neg logit contributions
els-0.213
ies-0.205
ins-0.198
 again-0.187
ishing-0.184
Token
Token ID198
Feature activation-9.64e-02

Pos logit contributions
 Curious+0.137
 Fasc+0.133
 Are+0.130
 Looking+0.130
 hadn+0.126
Neg logit contributions
ies-0.214
els-0.211
ins-0.206
 again-0.187
--0.187
Token I
Token ID314
Feature activation-8.66e-04

Pos logit contributions
ishing+0.024
els+0.024
ies+0.023
ins+0.022
ig+0.021
Neg logit contributions
 Are-0.020
 It-0.019
 Where-0.019
 Who-0.018
 Curious-0.018
Token did
Token ID750
Feature activation+9.24e-03

Pos logit contributions
 Curious+0.001
 Looking+0.001
Don+0.001
 Fasc+0.001
 Are+0.001
Neg logit contributions
ies-0.002
els-0.002
ishing-0.002
ins-0.002
ig-0.002
Token not
Token ID407
Feature activation+5.39e-03

Pos logit contributions
ishing+0.028
els+0.027
ies+0.027
ig+0.027
ins+0.026
Neg logit contributions
 hardly-0.005
Don-0.005
 shouldn-0.005
 It-0.005
 Are-0.005
Token share
Token ID2648
Feature activation+6.38e-02

Pos logit contributions
ies+0.013
els+0.013
ishing+0.013
ins+0.012
ig+0.012
Neg logit contributions
 Curious-0.006
 Are-0.006
Don-0.006
 Looking-0.006
 Who-0.006
Token the
Token ID262
Feature activation+4.60e-02

Pos logit contributions
ig+0.126
ishing+0.124
ooming+0.120
isher+0.109
ies+0.107
Neg logit contributions
 It-0.113
 The-0.108
 Who-0.105
 This-0.104
 Are-0.104
Token laser
Token ID12855
Feature activation+1.64e-01

Pos logit contributions
ishing+0.073
ies+0.073
els+0.072
ins+0.071
ier+0.070
Neg logit contributions
 Are-0.085
Don-0.083
 It-0.082
LL-0.082
 Something-0.082
Token with
Token ID351
Feature activation+2.88e-02

Pos logit contributions
els+0.322
ishing+0.316
ooming+0.290
ig+0.287
ies+0.283
Neg logit contributions
-0.274
 This-0.265
 It-0.248
 Who-0.245
 I-0.239
Token you
Token ID345
Feature activation+6.73e-02

Pos logit contributions
ies+0.065
ishing+0.064
els+0.062
ig+0.060
ins+0.060
Neg logit contributions
Don-0.038
 Are-0.038
 It-0.038
 Who-0.037
 Curious-0.037
Token.
Token ID13
Feature activation-7.29e-02

Pos logit contributions
ishing+0.132
els+0.127
isher+0.126
ig+0.122
ies+0.116
Neg logit contributions
-0.123
 It-0.111
 shouldn-0.111
 I-0.110
 The-0.110
Token Do
Token ID2141
Feature activation-4.45e-02

Pos logit contributions
 hadn+0.084
 Curious+0.083
Don+0.081
 hardly+0.081
 Fasc+0.079
Neg logit contributions
els-0.165
ies-0.154
er-0.152
ins-0.148
ishing-0.146
Token you
Token ID345
Feature activation-1.29e-02

Pos logit contributions
 Curious+0.100
 Terr+0.096
 Fasc+0.095
 Looking+0.095
 Exc+0.091
Neg logit contributions
ies-0.072
els-0.067
ie-0.060
ins-0.058
ries-0.057
TokenMaybe
Token ID13300
Feature activation-3.07e-02

Pos logit contributions
 Curious+0.038
 Looking+0.036
Don+0.035
 Are+0.034
 Fasc+0.033
Neg logit contributions
ies-0.120
els-0.118
ishing-0.115
ins-0.113
ig-0.108
Token the
Token ID262
Feature activation+8.62e-03

Pos logit contributions
 Are+0.036
 Curious+0.036
Don+0.035
 Who+0.034
 Where+0.034
Neg logit contributions
ishing-0.076
ies-0.075
els-0.075
ins-0.073
ig-0.069
Token treasure
Token ID14068
Feature activation+1.10e-01

Pos logit contributions
ies+0.016
els+0.016
ishing+0.015
ins+0.015
ig+0.014
Neg logit contributions
 Curious-0.016
Don-0.016
 Looking-0.016
 Are-0.016
 Fasc-0.015
Token is
Token ID318
Feature activation+2.44e-02

Pos logit contributions
ishing+0.255
els+0.250
ies+0.246
ins+0.236
ooming+0.229
Neg logit contributions
 shouldn-0.131
 hadn-0.129
-0.128
 Are-0.128
 It-0.127
Token not
Token ID407
Feature activation+2.60e-02

Pos logit contributions
ies+0.050
els+0.050
ishing+0.049
ins+0.047
ig+0.046
Neg logit contributions
 Curious-0.038
Don-0.038
 Looking-0.037
 Are-0.037
 hadn-0.037
Token gold
Token ID3869
Feature activation+1.64e-01

Pos logit contributions
els+0.056
ishing+0.054
ies+0.053
ig+0.052
ins+0.052
Neg logit contributions
 shouldn-0.038
 Curious-0.038
 hadn-0.038
Don-0.037
 Are-0.037
Token or
Token ID393
Feature activation+2.90e-02

Pos logit contributions
els+0.293
ishing+0.286
ooming+0.277
ins+0.272
burn+0.252
Neg logit contributions
-0.263
 He-0.256
 shouldn-0.255
 It-0.253
 I-0.248
Token jewels
Token ID42701
Feature activation+1.49e-01

Pos logit contributions
ies+0.057
els+0.055
ishing+0.055
ins+0.053
ig+0.050
Neg logit contributions
 Curious-0.049
Don-0.049
 Are-0.048
 Looking-0.048
 hadn-0.046
Token,"
Token ID553
Feature activation-8.56e-02

Pos logit contributions
els+0.293
ishing+0.279
ins+0.259
ies+0.244
ig+0.240
Neg logit contributions
-0.266
 shouldn-0.243
 I-0.239
've-0.234
 Are-0.232
Token L
Token ID406
Feature activation-8.02e-03

Pos logit contributions
 Curious+0.140
 Looking+0.139
 shouldn+0.137
 Terr+0.135
 hadn+0.134
Neg logit contributions
ies-0.175
els-0.169
ins-0.149
ries-0.144
 again-0.143
Tokenila
Token ID10102
Feature activation-3.05e-03

Pos logit contributions
 Curious+0.014
Don+0.014
 Looking+0.014
 Are+0.014
 Fasc+0.013
Neg logit contributions
ies-0.016
els-0.015
ishing-0.014
ins-0.014
ig-0.013
Token.
Token ID13
Feature activation-5.20e-02

Pos logit contributions
els+0.127
ies+0.122
ishing+0.122
ins+0.112
ist+0.108
Neg logit contributions
-0.116
 Curious-0.116
 Are-0.114
 �-0.112
 It-0.111
Token I
Token ID314
Feature activation-6.41e-03

Pos logit contributions
 Curious+0.047
Don+0.045
 Looking+0.043
 Fasc+0.041
 hadn+0.041
Neg logit contributions
els-0.141
ies-0.139
ishing-0.135
ins-0.132
ig-0.129
Token don
Token ID836
Feature activation-5.49e-03

Pos logit contributions
 Curious+0.008
Don+0.008
 Looking+0.008
 Are+0.007
 Fasc+0.007
Neg logit contributions
ies-0.016
els-0.015
ishing-0.015
ins-0.014
ig-0.014
Token't
Token ID470
Feature activation-3.22e-03

Pos logit contributions
 Curious+0.005
Don+0.005
 Looking+0.005
 Are+0.004
 Fasc+0.004
Neg logit contributions
ies-0.015
els-0.015
ishing-0.015
ins-0.014
ig-0.014
Token need
Token ID761
Feature activation+4.43e-02

Pos logit contributions
 Curious+0.003
 Are+0.003
Don+0.003
 Looking+0.003
 shouldn+0.003
Neg logit contributions
ies-0.008
els-0.008
ishing-0.008
ins-0.008
ig-0.008
Token gold
Token ID3869
Feature activation+1.63e-01

Pos logit contributions
ies+0.099
ishing+0.093
ins+0.084
ooming+0.084
els+0.084
Neg logit contributions
 It-0.068
 Where-0.067
-0.066
 Who-0.065
 Are-0.064
Token ice
Token ID4771
Feature activation+1.49e-01

Pos logit contributions
ooming+0.270
ishing+0.263
ig+0.241
els+0.235
ist+0.232
Neg logit contributions
 He-0.280
 It-0.271
 Where-0.257
 Who-0.257
 Is-0.256
Token cream
Token ID8566
Feature activation+1.31e-01

Pos logit contributions
ooming+0.226
els+0.222
ig+0.198
ishing+0.195
ologist+0.195
Neg logit contributions
 He-0.251
 But-0.251
LL-0.248
 This-0.244
 It-0.240
Token.
Token ID13
Feature activation-5.81e-02

Pos logit contributions
els+0.237
ishing+0.235
ies+0.232
ooming+0.224
ig+0.221
Neg logit contributions
-0.230
 Is-0.226
 Where-0.225
 But-0.225
 Who-0.225
Token I
Token ID314
Feature activation-5.69e-03

Pos logit contributions
 Curious+0.051
Don+0.049
 Looking+0.047
 Fasc+0.045
 hadn+0.045
Neg logit contributions
els-0.157
ies-0.154
ishing-0.150
ins-0.148
ig-0.143
Token have
Token ID423
Feature activation+1.15e-02

Pos logit contributions
 Curious+0.008
 Looking+0.008
Don+0.008
 Fasc+0.008
 Are+0.007
Neg logit contributions
ies-0.013
els-0.012
ishing-0.012
ins-0.012
ig-0.011
Token real
Token ID1103
Feature activation+1.00e-01

Pos logit contributions
els+0.080
ies+0.077
ishing+0.076
ins+0.073
ries+0.069
Neg logit contributions
 Curious-0.069
 Are-0.068
Don-0.067
 Looking-0.066
 hadn-0.065
Token treasure
Token ID14068
Feature activation+1.40e-01

Pos logit contributions
els+0.188
ins+0.179
ishing+0.174
ies+0.168
ries+0.165
Neg logit contributions
 Are-0.182
 Curious-0.181
 hadn-0.176
 Who-0.175
 Looking-0.173
Token chest
Token ID7721
Feature activation+1.12e-01

Pos logit contributions
ishing+0.225
els+0.218
ins+0.216
ies+0.209
ooming+0.192
Neg logit contributions
 hadn-0.280
 shouldn-0.272
-0.260
 Who-0.258
 Where-0.255
Token filled
Token ID5901
Feature activation+5.15e-02

Pos logit contributions
els+0.170
ishing+0.162
ins+0.160
ies+0.153
ishes+0.141
Neg logit contributions
 hadn-0.244
 shouldn-0.231
-0.225
've-0.224
 had-0.222
Token with
Token ID351
Feature activation+4.94e-02

Pos logit contributions
ies+0.080
els+0.078
ishing+0.076
ins+0.072
ig+0.070
Neg logit contributions
 Curious-0.107
Don-0.107
 Looking-0.105
 Are-0.103
 Fasc-0.102
Token gold
Token ID3869
Feature activation+1.63e-01

Pos logit contributions
ishing+0.090
els+0.088
ies+0.088
ig+0.086
ins+0.085
Neg logit contributions
 Curious-0.088
Don-0.087
 hadn-0.085
 Are-0.085
 shouldn-0.084
Token coins
Token ID10796
Feature activation+1.45e-01

Pos logit contributions
ishing+0.288
ooming+0.279
burn+0.275
els+0.267
ins+0.260
Neg logit contributions
 He-0.248
-0.244
 had-0.244
 It-0.243
 shouldn-0.231
Token and
Token ID290
Feature activation+2.16e-02

Pos logit contributions
els+0.221
ishing+0.220
ins+0.214
ooming+0.203
isher+0.195
Neg logit contributions
-0.283
 hadn-0.282
 shouldn-0.273
 had-0.268
've-0.264
Token jewels
Token ID42701
Feature activation+1.29e-01

Pos logit contributions
ies+0.032
els+0.031
ishing+0.030
ins+0.028
ig+0.027
Neg logit contributions
 Curious-0.048
Don-0.047
 Looking-0.046
 Are-0.046
 Fasc-0.045
Token.
Token ID13
Feature activation-1.12e-01

Pos logit contributions
ishing+0.202
els+0.201
ins+0.181
ooming+0.173
ig+0.173
Neg logit contributions
 hadn-0.258
-0.251
 shouldn-0.248
 had-0.245
've-0.242
Token (space)
Token ID220
Feature activation-7.12e-02

Pos logit contributions
 hardly+0.148
 shouldn+0.148
 hadn+0.145
Don+0.144
 barely+0.143
Neg logit contributions
ies-0.239
 again-0.231
er-0.221
ie-0.214
 toys-0.210
Token He
Token ID679
Feature activation-2.26e-03

Pos logit contributions
 hadn+0.081
Don+0.073
 hardly+0.072
 wond+0.071
Isa+0.070
Neg logit contributions
 again-0.170
els-0.163
er-0.159
 and-0.154
.-0.154
Token jumps
Token ID18045
Feature activation+6.41e-02

Pos logit contributions
 Curious+0.004
Don+0.004
 Are+0.004
 Looking+0.004
 Where+0.004
Neg logit contributions
ies-0.004
els-0.004
ishing-0.004
ins-0.004
ig-0.004
Token and
Token ID290
Feature activation-1.94e-02

Pos logit contributions
ishing+0.107
ies+0.107
els+0.106
ins+0.101
ig+0.094
Neg logit contributions
-0.131
Don-0.131
 Curious-0.130
 It-0.129
 Looking-0.123
Token grabs
Token ID22378
Feature activation+4.89e-02

Pos logit contributions
 Curious+0.037
Don+0.036
 hadn+0.036
 Looking+0.036
 Fasc+0.035
Neg logit contributions
els-0.033
ies-0.033
ishing-0.031
ins-0.028
ig-0.028
Token some
Token ID617
Feature activation+7.91e-02

Pos logit contributions
ies+0.120
ishing+0.114
els+0.113
ins+0.112
ig+0.108
Neg logit contributions
 Curious-0.062
 Looking-0.059
Don-0.059
 It-0.058
 Are-0.056
Token gold
Token ID3869
Feature activation+1.63e-01

Pos logit contributions
burn+0.139
ies+0.131
ishing+0.130
ators+0.127
ooming+0.126
Neg logit contributions
 It-0.138
-0.134
 Are-0.134
 Curious-0.131
 Someone-0.130
Token in
Token ID287
Feature activation+2.61e-02

Pos logit contributions
burn+0.283
ooming+0.272
ishing+0.271
ators+0.268
ins+0.250
Neg logit contributions
 It-0.275
 He-0.269
-0.240
 The-0.227
 could-0.223
Token his
Token ID465
Feature activation+6.68e-02

Pos logit contributions
ies+0.065
ishing+0.062
ins+0.061
els+0.060
ig+0.059
Neg logit contributions
 Curious-0.033
Don-0.032
 Looking-0.031
 It-0.031
 Exc-0.030
Token mouth
Token ID5422
Feature activation+9.10e-02

Pos logit contributions
ies+0.126
els+0.119
ishing+0.118
ins+0.115
ier+0.109
Neg logit contributions
 Curious-0.122
 Looking-0.120
 Are-0.117
Don-0.115
 Something-0.115
Token.
Token ID13
Feature activation-8.07e-02

Pos logit contributions
ishing+0.188
els+0.177
ig+0.174
ies+0.174
ishes+0.172
Neg logit contributions
 He-0.166
-0.163
 It-0.159
 The-0.151
 She-0.146
Token He
Token ID679
Feature activation-8.17e-04

Pos logit contributions
Don+0.073
 hadn+0.072
LL+0.066
+0.066
 Curious+0.064
Neg logit contributions
els-0.196
 again-0.194
ies-0.192
ins-0.188
er-0.187
Token.
Token ID13
Feature activation-5.05e-02

Pos logit contributions
els+0.006
ishing+0.006
ies+0.006
ins+0.005
ries+0.005
Neg logit contributions
 Are-0.008
 Curious-0.008
 Looking-0.008
 It-0.008
 Where-0.008
Token I
Token ID314
Feature activation-7.38e-03

Pos logit contributions
Don+0.048
 Curious+0.047
 hadn+0.044
 Fasc+0.043
 Looking+0.043
Neg logit contributions
els-0.126
ies-0.124
ishing-0.122
ig-0.121
ish-0.117
Token have
Token ID423
Feature activation+7.35e-03

Pos logit contributions
 Curious+0.013
 Looking+0.012
Don+0.012
 Fasc+0.012
 Are+0.011
Neg logit contributions
ies-0.015
ishing-0.014
els-0.014
ig-0.013
ins-0.013
Token never
Token ID1239
Feature activation+4.22e-02

Pos logit contributions
ies+0.015
els+0.015
ishing+0.014
ins+0.014
ig+0.013
Neg logit contributions
 Curious-0.012
Don-0.012
 Looking-0.012
 Are-0.011
 Fasc-0.011
Token seen
Token ID1775
Feature activation+7.20e-02

Pos logit contributions
ies+0.103
els+0.102
ishing+0.099
ins+0.096
ig+0.093
Neg logit contributions
 Curious-0.052
Don-0.050
 Looking-0.050
 Are-0.049
 Fasc-0.047
Token copper
Token ID15317
Feature activation+1.62e-01

Pos logit contributions
ies+0.155
ishing+0.145
ins+0.140
els+0.138
ooming+0.137
Neg logit contributions
 It-0.114
 Where-0.110
 Are-0.110
 Curious-0.110
 Would-0.106
Token before
Token ID878
Feature activation-5.28e-03

Pos logit contributions
ies+0.274
ins+0.269
els+0.264
ooming+0.262
packs+0.260
Neg logit contributions
 Where-0.273
 It-0.269
 Is-0.266
 Who-0.260
 This-0.258
Token.
Token ID13
Feature activation-5.65e-02

Pos logit contributions
 Curious+0.012
 Are+0.012
 Looking+0.012
 It+0.012
Don+0.012
Neg logit contributions
ies-0.007
ishing-0.007
els-0.007
ins-0.006
ig-0.006
Token It
Token ID632
Feature activation+2.59e-02

Pos logit contributions
Don+0.049
 hadn+0.044
 Curious+0.043
 Fasc+0.041
 Terr+0.041
Neg logit contributions
els-0.143
ies-0.135
er-0.134
ishing-0.131
 again-0.130
Token is
Token ID318
Feature activation+4.90e-03

Pos logit contributions
ies+0.053
els+0.051
ishing+0.050
ins+0.048
ig+0.046
Neg logit contributions
 Curious-0.043
Don-0.042
 Looking-0.042
 Are-0.041
 Fasc-0.040
Token so
Token ID523
Feature activation-4.72e-03

Pos logit contributions
ies+0.010
els+0.010
ishing+0.009
ins+0.009
ig+0.009
Neg logit contributions
 Curious-0.008
Don-0.008
 Looking-0.008
 Are-0.008
 Fasc-0.008
Token slide
Token ID10649
Feature activation+6.39e-02

Pos logit contributions
els+0.026
ishing+0.026
ies+0.025
ins+0.025
ig+0.023
Neg logit contributions
 Curious-0.023
 Are-0.023
Don-0.023
 Looking-0.021
 hadn-0.021
Token is
Token ID318
Feature activation-1.27e-03

Pos logit contributions
ies+0.149
els+0.147
ishing+0.147
ins+0.138
ig+0.130
Neg logit contributions
 Are-0.076
-0.076
 hadn-0.075
 It-0.075
Don-0.074
Token not
Token ID407
Feature activation+1.22e-02

Pos logit contributions
 Curious+0.002
Don+0.002
 Looking+0.002
 Are+0.002
 Fasc+0.002
Neg logit contributions
ies-0.003
els-0.003
ishing-0.003
ins-0.002
ig-0.002
Token made
Token ID925
Feature activation+3.64e-02

Pos logit contributions
els+0.028
ishing+0.026
ies+0.026
ooming+0.025
ins+0.025
Neg logit contributions
 hadn-0.016
 shouldn-0.016
Don-0.016
 Are-0.016
 Curious-0.016
Token of
Token ID286
Feature activation+3.81e-02

Pos logit contributions
ishing+0.077
els+0.076
ies+0.074
ins+0.073
ishes+0.071
Neg logit contributions
-0.058
 It-0.056
 shouldn-0.053
 This-0.052
 The-0.051
Token gold
Token ID3869
Feature activation+1.62e-01

Pos logit contributions
ies+0.072
ishing+0.070
ins+0.068
ooming+0.066
els+0.066
Neg logit contributions
 Are-0.064
 It-0.061
 shouldn-0.061
 Something-0.061
 hadn-0.060
Token,
Token ID11
Feature activation+1.81e-02

Pos logit contributions
ishing+0.282
ooming+0.275
els+0.267
burn+0.265
ags+0.258
Neg logit contributions
-0.253
 He-0.246
 It-0.242
 I-0.228
 shouldn-0.228
Token but
Token ID475
Feature activation-2.23e-02

Pos logit contributions
ies+0.038
ishing+0.038
els+0.038
ins+0.036
ig+0.034
Neg logit contributions
Don-0.029
 Are-0.028
 Curious-0.028
 It-0.027
 hadn-0.027
Token it
Token ID340
Feature activation+1.11e-02

Pos logit contributions
 Are+0.028
Don+0.028
 Curious+0.028
 Looking+0.027
 hadn+0.027
Neg logit contributions
ies-0.053
ishing-0.051
els-0.051
ins-0.049
ig-0.047
Token is
Token ID318
Feature activation-7.39e-03

Pos logit contributions
ies+0.023
els+0.022
ishing+0.022
ins+0.021
ig+0.020
Neg logit contributions
 Curious-0.018
Don-0.018
 Looking-0.017
 Are-0.017
 Fasc-0.017
Token brilliant
Token ID10457
Feature activation+8.41e-02

Pos logit contributions
 Curious+0.012
Don+0.011
 Looking+0.011
 Are+0.011
 Fasc+0.011
Neg logit contributions
ies-0.016
els-0.015
ishing-0.015
ins-0.014
ig-0.014
Token prevent
Token ID2948
Feature activation+5.16e-02

Pos logit contributions
 Curious+0.024
 hardly+0.024
+0.023
 Looking+0.023
 Are+0.023
Neg logit contributions
ies-0.055
ishing-0.053
els-0.052
ins-0.052
ier-0.048
Token accident
Token ID5778
Feature activation+8.97e-02

Pos logit contributions
ies+0.120
ishing+0.119
ins+0.114
els+0.113
ig+0.110
Neg logit contributions
Don-0.064
 Curious-0.063
 Are-0.062
 Looking-0.062
 Who-0.061
Token by
Token ID416
Feature activation+2.76e-02

Pos logit contributions
els+0.148
ishing+0.141
ins+0.126
ig+0.126
isher+0.125
Neg logit contributions
-0.183
 This-0.168
 Are-0.167
 Curious-0.167
 It-0.166
Token not
Token ID407
Feature activation+3.58e-02

Pos logit contributions
ies+0.058
ishing+0.057
els+0.055
ins+0.054
ig+0.053
Neg logit contributions
 Are-0.039
 Curious-0.039
 It-0.038
 Looking-0.038
 shouldn-0.037
Token putting
Token ID5137
Feature activation+1.10e-01

Pos logit contributions
els+0.082
ishing+0.080
ies+0.078
ins+0.076
ig+0.076
Neg logit contributions
 Are-0.048
 shouldn-0.048
Don-0.047
-0.046
 Curious-0.045
Token rocks
Token ID12586
Feature activation+1.62e-01

Pos logit contributions
ies+0.237
ins+0.220
ishing+0.211
ooming+0.211
ier+0.207
Neg logit contributions
 It-0.151
-0.146
Don-0.146
 shouldn-0.142
 Are-0.141
Token in
Token ID287
Feature activation+5.06e-02

Pos logit contributions
els+0.292
ins+0.276
ies+0.270
ishing+0.264
 outfits+0.252
Neg logit contributions
-0.301
 shouldn-0.251
Don-0.244
 I-0.237
 It-0.234
Token our
Token ID674
Feature activation+7.07e-02

Pos logit contributions
ies+0.131
ins+0.125
ishing+0.123
ier+0.119
ooming+0.118
Neg logit contributions
-0.057
Don-0.052
 Are-0.052
 shouldn-0.052
 It-0.051
Token mouths
Token ID28552
Feature activation+1.07e-01

Pos logit contributions
ies+0.130
ins+0.129
ier+0.128
els+0.126
ishing+0.113
Neg logit contributions
 Curious-0.125
 Are-0.122
 hadn-0.119
 Looking-0.119
Don-0.117
Token."
Token ID526
Feature activation-1.10e-01

Pos logit contributions
els+0.184
ies+0.178
ishing+0.178
ins+0.170
ig+0.163
Neg logit contributions
-0.212
 shouldn-0.193
 It-0.189
 Are-0.182
 The-0.182
Token (newline)
Token ID198
Feature activation-9.82e-02

Pos logit contributions
Don+0.130
 hadn+0.129
 shouldn+0.128
LL+0.127
 Are+0.122
Neg logit contributions
ies-0.259
els-0.244
ins-0.244
 again-0.235
ishing-0.229
Token It
Token ID632
Feature activation+1.21e-02

Pos logit contributions
Don+0.073
 hardly+0.070
 Fasc+0.068
 hadn+0.068
LL+0.067
Neg logit contributions
els-0.124
er-0.124
ish-0.119
ishing-0.118
ies-0.117
Token will
Token ID481
Feature activation-6.41e-03

Pos logit contributions
ies+0.024
els+0.023
ishing+0.022
ins+0.021
ig+0.020
Neg logit contributions
 Curious-0.021
Don-0.021
 Looking-0.020
 Fasc-0.020
 Are-0.020
Token only
Token ID691
Feature activation+2.88e-02

Pos logit contributions
 Curious+0.011
 Fasc+0.011
 Looking+0.011
Don+0.010
 Int+0.010
Neg logit contributions
ies-0.013
els-0.012
ishing-0.011
ie-0.011
ier-0.011
Token cost
Token ID1575
Feature activation+7.73e-02

Pos logit contributions
ies+0.069
els+0.068
ishing+0.067
ins+0.065
ig+0.062
Neg logit contributions
 Curious-0.036
 Are-0.036
Don-0.035
 Looking-0.035
 Who-0.034
Token one
Token ID530
Feature activation+7.75e-02

Pos logit contributions
ies+0.173
els+0.167
ishing+0.163
ins+0.156
ooming+0.155
Neg logit contributions
Don-0.109
-0.106
 It-0.105
 Are-0.104
 shouldn-0.103
Token gold
Token ID3869
Feature activation+1.62e-01

Pos logit contributions
ies+0.150
els+0.147
ishing+0.146
ooming+0.145
ins+0.141
Neg logit contributions
-0.139
 Where-0.136
 Who-0.131
 It-0.130
 shouldn-0.130
Token coin
Token ID10752
Feature activation+1.48e-01

Pos logit contributions
ooming+0.288
ishing+0.271
els+0.263
ies+0.251
burn+0.247
Neg logit contributions
 It-0.294
 Where-0.277
 shouldn-0.275
 He-0.273
 Who-0.270
Token".
Token ID1911
Feature activation-1.08e-01

Pos logit contributions
ooming+0.257
els+0.255
ishing+0.250
ies+0.233
ins+0.233
Neg logit contributions
 shouldn-0.273
 It-0.271
-0.265
 This-0.260
 Where-0.260
Token The
Token ID383
Feature activation-7.03e-03

Pos logit contributions
Don+0.084
 hadn+0.077
 Curious+0.072
 Are+0.069
 hardly+0.068
Neg logit contributions
els-0.287
ies-0.285
ishing-0.279
 again-0.278
ish-0.270
Token dragon
Token ID10441
Feature activation+3.60e-02

Pos logit contributions
 Curious+0.011
 Are+0.011
 Looking+0.010
Don+0.010
 Who+0.010
Neg logit contributions
ies-0.015
els-0.015
ishing-0.015
ig-0.014
ins-0.014
Token was
Token ID373
Feature activation-2.20e-02

Pos logit contributions
ies+0.093
els+0.093
ishing+0.091
ins+0.088
ig+0.084
Neg logit contributions
 Curious-0.039
Don-0.037
 Looking-0.036
 hadn-0.035
 Are-0.035
Token to
Token ID284
Feature activation+1.32e-02

Pos logit contributions
els+0.154
ishing+0.153
ies+0.146
ins+0.144
ig+0.137
Neg logit contributions
 hadn-0.127
 Curious-0.125
 shouldn-0.124
Don-0.124
 Looking-0.119
Token help
Token ID1037
Feature activation+3.48e-02

Pos logit contributions
ies+0.027
els+0.027
ishing+0.026
ins+0.024
ig+0.024
Neg logit contributions
 Curious-0.022
Don-0.021
 Looking-0.021
 Are-0.021
 Fasc-0.020
Token him
Token ID683
Feature activation+5.61e-02

Pos logit contributions
ies+0.065
els+0.064
ishing+0.063
ins+0.060
ig+0.058
Neg logit contributions
 Curious-0.061
Don-0.061
 Looking-0.060
 Are-0.060
 Who-0.058
Token keep
Token ID1394
Feature activation+6.74e-02

Pos logit contributions
els+0.129
ins+0.126
ishing+0.124
ies+0.124
ries+0.120
Neg logit contributions
 Curious-0.073
-0.071
 Looking-0.071
 Something-0.070
 It-0.069
Token his
Token ID465
Feature activation+7.39e-02

Pos logit contributions
ies+0.142
els+0.138
ins+0.136
ishing+0.132
ig+0.130
Neg logit contributions
Don-0.094
 Curious-0.094
 Looking-0.093
 hadn-0.093
 It-0.091
Token gold
Token ID3869
Feature activation+1.61e-01

Pos logit contributions
ins+0.153
els+0.146
ock+0.136
ries+0.134
ier+0.133
Neg logit contributions
 hadn-0.119
 Curious-0.113
 relieved-0.112
 shouldn-0.112
 Are-0.111
Token safe
Token ID3338
Feature activation+1.15e-01

Pos logit contributions
ins+0.298
els+0.296
ries+0.283
ud+0.270
ooming+0.266
Neg logit contributions
-0.247
 He-0.236
 full-0.229
 hadn-0.227
 had-0.224
Token.
Token ID13
Feature activation-1.10e-01

Pos logit contributions
els+0.230
ins+0.226
ooming+0.206
ries+0.205
ishing+0.199
Neg logit contributions
-0.200
 hadn-0.190
 He-0.184
 shouldn-0.182
 had-0.181
Token They
Token ID1119
Feature activation-1.21e-02

Pos logit contributions
Don+0.130
 Curious+0.120
 hadn+0.119
 shouldn+0.117
 Are+0.114
Neg logit contributions
ies-0.266
els-0.256
ishing-0.256
 again-0.253
ins-0.251
Token agreed
Token ID4987
Feature activation-6.87e-03

Pos logit contributions
 Curious+0.025
Don+0.025
 Looking+0.025
 Are+0.024
 Fasc+0.024
Neg logit contributions
ies-0.019
els-0.018
ishing-0.018
ins-0.017
ig-0.016
Token and
Token ID290
Feature activation-4.54e-02

Pos logit contributions
 Curious+0.014
Don+0.014
 Looking+0.014
 hardly+0.014
 hadn+0.013
Neg logit contributions
els-0.012
ies-0.011
ishing-0.011
ins-0.011
ries-0.010
Token,
Token ID11
Feature activation-3.23e-02

Pos logit contributions
ins+0.156
ishing+0.155
els+0.154
ies+0.153
ooming+0.151
Neg logit contributions
 It-0.131
 shouldn-0.129
-0.128
 This-0.127
 Curious-0.125
Token you
Token ID345
Feature activation+4.14e-03

Pos logit contributions
 Are+0.041
Don+0.040
 Curious+0.040
 Looking+0.038
 Who+0.037
Neg logit contributions
ies-0.079
els-0.078
ishing-0.077
ins-0.074
ig-0.070
Token will
Token ID481
Feature activation-5.93e-04

Pos logit contributions
ies+0.010
els+0.010
ishing+0.010
ins+0.009
ig+0.009
Neg logit contributions
 Curious-0.005
Don-0.005
 Looking-0.005
 Are-0.005
 Fasc-0.005
Token get
Token ID651
Feature activation+4.44e-02

Pos logit contributions
 Curious+0.001
 Looking+0.001
Don+0.001
 Fasc+0.001
 Are+0.001
Neg logit contributions
ies-0.001
els-0.001
ishing-0.001
ins-0.001
ig-0.001
Token more
Token ID517
Feature activation+7.18e-02

Pos logit contributions
ies+0.089
els+0.086
ishing+0.084
ins+0.081
ig+0.079
Neg logit contributions
 Curious-0.074
Don-0.073
 Looking-0.072
 Are-0.072
 Fasc-0.069
Token gold
Token ID3869
Feature activation+1.61e-01

Pos logit contributions
els+0.149
ooming+0.145
ishing+0.144
ies+0.142
ig+0.140
Neg logit contributions
 shouldn-0.102
Don-0.102
 Curious-0.101
 Are-0.101
-0.100
Token."
Token ID526
Feature activation-9.89e-02

Pos logit contributions
ooming+0.290
els+0.285
ishing+0.261
ipping+0.240
ators+0.239
Neg logit contributions
-0.254
 He-0.242
 It-0.240
 I-0.238
 This-0.227
Token The
Token ID383
Feature activation-1.39e-02

Pos logit contributions
Don+0.110
 Curious+0.107
 hadn+0.103
 Looking+0.102
 shouldn+0.101
Neg logit contributions
ies-0.248
ishing-0.235
els-0.234
ins-0.233
 again-0.229
Token boy
Token ID2933
Feature activation+6.12e-03

Pos logit contributions
 Curious+0.021
 Are+0.019
Don+0.019
 Looking+0.019
 Who+0.019
Neg logit contributions
els-0.032
ies-0.032
ishing-0.031
ins-0.030
ig-0.029
Token was
Token ID373
Feature activation-2.98e-02

Pos logit contributions
ies+0.015
els+0.015
ishing+0.015
ins+0.014
ig+0.014
Neg logit contributions
 Curious-0.007
Don-0.007
 Looking-0.007
 Are-0.007
 Fasc-0.007
Token greedy
Token ID31828
Feature activation+6.41e-02

Pos logit contributions
Don+0.052
 Are+0.051
 Curious+0.051
 Who+0.050
 Looking+0.050
Neg logit contributions
ies-0.053
els-0.052
ishing-0.050
ins-0.049
 again-0.048
Token until
Token ID1566
Feature activation-2.41e-02

Pos logit contributions
els+0.099
ins+0.095
ries+0.092
ishing+0.091
isher+0.088
Neg logit contributions
 �-0.093
 hadn-0.092
-0.090
 It-0.087
 The-0.087
Token the
Token ID262
Feature activation+6.61e-03

Pos logit contributions
Don+0.027
 Curious+0.027
 Are+0.026
 Looking+0.026
 hadn+0.026
Neg logit contributions
ies-0.062
els-0.061
ishing-0.060
ins-0.060
ig-0.055
Token sun
Token ID4252
Feature activation+4.80e-02

Pos logit contributions
els+0.013
ishing+0.013
ins+0.013
ies+0.013
ries+0.012
Neg logit contributions
 Curious-0.011
Don-0.011
 hadn-0.011
 Are-0.011
 Looking-0.011
Token went
Token ID1816
Feature activation+3.13e-02

Pos logit contributions
els+0.123
ies+0.122
ishing+0.122
ins+0.117
ig+0.113
Neg logit contributions
Don-0.049
 Curious-0.048
 hadn-0.047
 Looking-0.046
 Are-0.045
Token down
Token ID866
Feature activation+1.89e-02

Pos logit contributions
ies+0.051
els+0.051
ins+0.049
ishing+0.047
ig+0.045
Neg logit contributions
Don-0.064
 Curious-0.064
 Looking-0.063
 Fasc-0.062
 It-0.061
Token.
Token ID13
Feature activation-1.25e-01

Pos logit contributions
ies+0.031
els+0.030
ishing+0.029
ins+0.029
ig+0.028
Neg logit contributions
 The-0.040
 It-0.040
Don-0.039
 Curious-0.038
 Fasc-0.038
Token<|endoftext|>
Token ID50256
Feature activation-4.91e-02

Pos logit contributions
Copy+0.203
Isa+0.197
 intrigued+0.194
 Curious+0.192
Jump+0.187
Neg logit contributions
 again-0.200
 and-0.184
ies-0.183
.-0.181
 together-0.178
TokenOnce
Token ID7454
Feature activation-7.39e-02

Pos logit contributions
 Curious+0.049
Don+0.048
 Looking+0.046
 Are+0.046
 Fasc+0.044
Neg logit contributions
ies-0.126
els-0.124
ishing-0.120
ins-0.119
 again-0.115
Token upon
Token ID2402
Feature activation-8.71e-02

Pos logit contributions
 Curious+0.123
Don+0.120
 Looking+0.120
 Are+0.117
 Fasc+0.115
Neg logit contributions
ies-0.156
els-0.146
ishing-0.142
ins-0.136
ig-0.133
Token a
Token ID257
Feature activation-5.84e-02

Pos logit contributions
 Curious+0.155
 Looking+0.153
 Are+0.153
Don+0.151
 Fasc+0.148
Neg logit contributions
ies-0.157
els-0.144
ishing-0.140
ins-0.140
 again-0.131
Token time
Token ID640
Feature activation-4.99e-02

Pos logit contributions
 Curious+0.113
Don+0.111
 Looking+0.110
 Are+0.108
 Fasc+0.107
Neg logit contributions
ies-0.102
els-0.100
ishing-0.096
ins-0.091
ig-0.086
Token.
Token ID13
Feature activation-1.11e-01

Pos logit contributions
ishing+0.168
ies+0.158
els+0.155
ins+0.148
isher+0.141
Neg logit contributions
-0.189
 shouldn-0.183
 hadn-0.180
 Curious-0.179
Don-0.178
Token "
Token ID366
Feature activation-4.18e-02

Pos logit contributions
 hadn+0.167
Don+0.159
 shouldn+0.153
 Terr+0.150
Ell+0.148
Neg logit contributions
ies-0.212
ins-0.198
els-0.198
 and-0.193
ries-0.185
TokenWhat
Token ID2061
Feature activation-3.33e-02

Pos logit contributions
 Curious+0.044
 Looking+0.042
 Are+0.040
 Fasc+0.039
 Terr+0.038
Neg logit contributions
els-0.106
ies-0.106
ishing-0.102
ins-0.100
ig-0.095
Token is
Token ID318
Feature activation-7.61e-03

Pos logit contributions
 Curious+0.054
Don+0.052
 Looking+0.052
 Are+0.051
 Fasc+0.050
Neg logit contributions
ies-0.069
els-0.067
ishing-0.067
ins-0.063
ig-0.061
Token that
Token ID326
Feature activation+5.27e-02

Pos logit contributions
 Curious+0.014
Don+0.013
 Looking+0.013
 Are+0.013
 Fasc+0.013
Neg logit contributions
ies-0.015
els-0.014
ishing-0.014
ins-0.013
ig-0.013
Token?"
Token ID1701
Feature activation-1.25e-01

Pos logit contributions
ies+0.101
els+0.098
ishing+0.095
ins+0.090
ig+0.087
Neg logit contributions
 Curious-0.094
Don-0.092
 Looking-0.091
 Are-0.090
 Fasc-0.088
Token Todd
Token ID14377
Feature activation-6.61e-03

Pos logit contributions
 shouldn+0.190
 hadn+0.190
LL+0.190
Thanks+0.189
 Terr+0.189
Neg logit contributions
ies-0.263
els-0.248
ries-0.232
ins-0.231
 eggs-0.223
Token asked
Token ID1965
Feature activation-2.41e-02

Pos logit contributions
 Curious+0.007
Don+0.007
 Looking+0.007
 Are+0.007
 Fasc+0.007
Neg logit contributions
ies-0.017
els-0.017
ishing-0.016
ins-0.016
ig-0.015
Token Ben
Token ID3932
Feature activation+2.02e-02

Pos logit contributions
 Curious+0.057
 Looking+0.056
Don+0.055
 Are+0.055
 Fasc+0.054
Neg logit contributions
ishing-0.032
els-0.031
ies-0.031
ins-0.028
ig-0.027
Token.
Token ID13
Feature activation-1.05e-01

Pos logit contributions
ies+0.022
els+0.021
ishing+0.021
ins+0.019
ig+0.018
Neg logit contributions
 Curious-0.052
 Looking-0.051
Don-0.051
 Are-0.050
 Fasc-0.049
Token Ben
Token ID3932
Feature activation-6.23e-03

Pos logit contributions
 shouldn+0.143
Don+0.134
 hadn+0.133
LL+0.122
 betrayed+0.119
Neg logit contributions
ies-0.228
ins-0.210
els-0.201
 again-0.200
er-0.194
Token in
Token ID287
Feature activation+1.06e-02

Pos logit contributions
els+0.025
ies+0.024
ishing+0.022
ins+0.020
ries+0.020
Neg logit contributions
 Curious-0.031
 It-0.030
 Looking-0.030
Don-0.030
 Fasc-0.029
Token the
Token ID262
Feature activation+2.09e-02

Pos logit contributions
ies+0.027
els+0.027
ishing+0.027
ins+0.026
ig+0.025
Neg logit contributions
 Curious-0.012
Don-0.011
 Looking-0.011
 Fasc-0.011
 Are-0.011
Token park
Token ID3952
Feature activation+4.12e-02

Pos logit contributions
ies+0.036
els+0.035
ishing+0.034
ins+0.032
ig+0.031
Neg logit contributions
 Curious-0.041
Don-0.040
 Looking-0.040
 Are-0.040
 Fasc-0.039
Token all
Token ID477
Feature activation+5.45e-03

Pos logit contributions
els+0.078
ishing+0.073
ins+0.068
ies+0.068
isher+0.067
Neg logit contributions
 The-0.080
 It-0.075
 But-0.074
 This-0.074
 Thanks-0.073
Token day
Token ID1110
Feature activation+1.61e-02

Pos logit contributions
ies+0.009
els+0.009
ishing+0.009
ins+0.008
ig+0.008
Neg logit contributions
 Curious-0.011
Don-0.011
 Looking-0.010
 Are-0.010
 Fasc-0.010
Token.
Token ID13
Feature activation-1.24e-01

Pos logit contributions
els+0.031
ishing+0.029
ies+0.028
ins+0.028
isher+0.027
Neg logit contributions
-0.029
 hardly-0.028
 Curious-0.027
 The-0.026
 hadn-0.026
Token<|endoftext|>
Token ID50256
Feature activation-5.73e-02

Pos logit contributions
 gru+0.215
 intrigued+0.212
 Curious+0.208
Copy+0.204
Jump+0.201
Neg logit contributions
 again-0.198
ies-0.193
.-0.185
 toys-0.184
 and-0.183
TokenOne
Token ID3198
Feature activation-3.22e-02

Pos logit contributions
 Curious+0.060
Don+0.058
 Are+0.057
 Looking+0.057
 Fasc+0.055
Neg logit contributions
ies-0.143
els-0.141
ins-0.136
ishing-0.136
 again-0.133
Token day
Token ID1110
Feature activation-5.57e-02

Pos logit contributions
 Curious+0.069
Don+0.068
 Looking+0.068
 Are+0.068
 Fasc+0.066
Neg logit contributions
ies-0.051
els-0.046
ishing-0.046
ins-0.043
ig-0.042
Token,
Token ID11
Feature activation-9.09e-02

Pos logit contributions
 Curious+0.133
 Looking+0.131
Don+0.130
 Are+0.129
 Fasc+0.127
Neg logit contributions
ies-0.071
els-0.070
ishing-0.067
ins-0.062
ig-0.056
Token a
Token ID257
Feature activation-3.95e-02

Pos logit contributions
 Terr+0.197
 Curious+0.195
 Looking+0.189
 Fasc+0.188
 Would+0.187
Neg logit contributions
ies-0.134
--0.111
els-0.109
ishing-0.108
ie-0.108
Token drawing
Token ID8263
Feature activation+7.76e-02

Pos logit contributions
ies+0.014
ishing+0.013
els+0.013
ins+0.013
ig+0.012
Neg logit contributions
 Curious-0.014
 Looking-0.014
 Are-0.014
Don-0.014
 Who-0.014
Token pictures
Token ID5986
Feature activation+8.35e-02

Pos logit contributions
ishing+0.124
ig+0.115
ies+0.112
isher+0.111
els+0.109
Neg logit contributions
 It-0.162
 The-0.153
 Something-0.153
 Who-0.151
-0.150
Token.
Token ID13
Feature activation-1.20e-01

Pos logit contributions
ishing+0.148
ies+0.137
els+0.136
ig+0.131
isher+0.128
Neg logit contributions
-0.165
 It-0.158
 Thanks-0.153
 Who-0.148
 The-0.147
Token The
Token ID383
Feature activation-1.31e-02

Pos logit contributions
 Curious+0.159
 Looking+0.147
 intrigued+0.147
 Fasc+0.146
Don+0.144
Neg logit contributions
ies-0.265
els-0.252
 again-0.251
ins-0.247
ishing-0.237
Token end
Token ID886
Feature activation-3.57e-02

Pos logit contributions
 Curious+0.029
Don+0.029
 Looking+0.028
 Are+0.028
 Fasc+0.027
Neg logit contributions
ies-0.020
els-0.019
ishing-0.019
ins-0.018
ig-0.017
Token.
Token ID13
Feature activation-1.24e-01

Pos logit contributions
Don+0.085
 Curious+0.084
 Looking+0.084
 Are+0.083
 Who+0.083
Neg logit contributions
ies-0.045
els-0.045
ishing-0.043
ins-0.038
ig-0.037
Token<|endoftext|>
Token ID50256
Feature activation-6.94e-02

Pos logit contributions
 Curious+0.172
 Looking+0.162
 Fasc+0.161
 intrigued+0.157
 Are+0.152
Neg logit contributions
 again-0.257
ies-0.254
els-0.241
ins-0.239
--0.232
TokenOnce
Token ID7454
Feature activation-7.27e-02

Pos logit contributions
 Curious+0.071
Don+0.070
 Are+0.069
 Looking+0.068
 Fasc+0.065
Neg logit contributions
ies-0.174
els-0.171
ishing-0.165
ins-0.164
 again-0.161
Token upon
Token ID2402
Feature activation-8.75e-02

Pos logit contributions
 Curious+0.113
Don+0.110
 Looking+0.109
 Are+0.107
 Fasc+0.105
Neg logit contributions
ies-0.156
els-0.151
ishing-0.148
ins-0.141
ig-0.136
Token a
Token ID257
Feature activation-5.75e-02

Pos logit contributions
 Curious+0.151
 Looking+0.149
 Are+0.147
Don+0.146
 Fasc+0.143
Neg logit contributions
ies-0.167
els-0.155
ishing-0.151
ins-0.149
ie-0.139
Token time
Token ID640
Feature activation-5.76e-02

Pos logit contributions
 Curious+0.108
 Looking+0.105
Don+0.105
 Fasc+0.102
 Are+0.102
Neg logit contributions
ies-0.104
els-0.103
ishing-0.099
ins-0.094
ries-0.088
Token and
Token ID290
Feature activation-1.46e-02

Pos logit contributions
els+0.201
ishing+0.186
ins+0.184
ies+0.176
ries+0.166
Neg logit contributions
-0.231
 hardly-0.220
 hadn-0.217
 Curious-0.216
 relieved-0.216
Token became
Token ID2627
Feature activation-1.46e-02

Pos logit contributions
 Curious+0.022
Don+0.021
 Looking+0.021
 Are+0.020
 Fasc+0.020
Neg logit contributions
ies-0.032
els-0.031
ishing-0.031
ins-0.029
ig-0.028
Token best
Token ID1266
Feature activation+5.20e-02

Pos logit contributions
 Curious+0.030
 Looking+0.029
Don+0.028
 Are+0.028
 Fasc+0.028
Neg logit contributions
ies-0.025
els-0.023
ishing-0.023
ins-0.022
ig-0.020
Token friends
Token ID2460
Feature activation+3.53e-02

Pos logit contributions
els+0.080
ishing+0.072
ies+0.071
ie+0.068
ries+0.067
Neg logit contributions
 relieved-0.107
 Thanks-0.105
 Curious-0.104
 shouldn-0.104
 hardly-0.104
Token again
Token ID757
Feature activation+2.58e-02

Pos logit contributions
els+0.058
ies+0.055
ishing+0.053
ins+0.049
ries+0.046
Neg logit contributions
-0.076
 hadn-0.075
 It-0.074
 Thanks-0.074
Don-0.074
Token.
Token ID13
Feature activation-1.24e-01

Pos logit contributions
els+0.037
ies+0.034
ishing+0.033
isher+0.031
ins+0.031
Neg logit contributions
Don-0.058
 hadn-0.057
-0.057
 Are-0.056
 shouldn-0.056
Token From
Token ID3574
Feature activation-2.14e-02

Pos logit contributions
Copy+0.204
 Curious+0.192
 intrigued+0.192
Isa+0.189
Jump+0.184
Neg logit contributions
 again-0.202
ies-0.196
els-0.192
 and-0.186
.-0.186
Token that
Token ID326
Feature activation-2.95e-02

Pos logit contributions
 Curious+0.043
 Looking+0.041
Don+0.041
 Are+0.041
 Fasc+0.040
Neg logit contributions
ies-0.038
els-0.036
ishing-0.035
ig-0.032
ins-0.032
Token day
Token ID1110
Feature activation-3.43e-02

Pos logit contributions
 Curious+0.060
Don+0.059
 Looking+0.058
 Are+0.058
 Fasc+0.057
Neg logit contributions
ies-0.050
els-0.048
ishing-0.047
ins-0.044
ig-0.042
Token on
Token ID319
Feature activation-2.72e-02

Pos logit contributions
 Curious+0.084
Don+0.083
 Looking+0.082
 Are+0.082
 Fasc+0.080
Neg logit contributions
ies-0.044
els-0.041
ishing-0.039
ins-0.037
ries-0.031
Token,
Token ID11
Feature activation-7.70e-02

Pos logit contributions
 Curious+0.068
Don+0.067
 Looking+0.066
 Are+0.066
 Fasc+0.065
Neg logit contributions
ies-0.033
els-0.032
ishing-0.030
ins-0.028
ig-0.026
Token,
Token ID11
Feature activation-4.14e-02

Pos logit contributions
els+0.025
ies+0.023
ishing+0.022
ishes+0.022
ins+0.020
Neg logit contributions
-0.026
 hadn-0.025
 hardly-0.025
 had-0.024
 The-0.024
Token laughing
Token ID14376
Feature activation+2.89e-02

Pos logit contributions
Don+0.069
 Curious+0.068
 Are+0.066
 hadn+0.065
 Looking+0.065
Neg logit contributions
els-0.085
ies-0.082
ishing-0.080
ins-0.077
ig-0.074
Token and
Token ID290
Feature activation-3.87e-02

Pos logit contributions
els+0.047
ishing+0.039
ig+0.038
ies+0.038
ins+0.037
Neg logit contributions
 It-0.063
 Curious-0.063
Don-0.062
 The-0.061
 Looking-0.061
Token having
Token ID1719
Feature activation-6.94e-03

Pos logit contributions
 Curious+0.070
 Looking+0.068
Don+0.066
 Fasc+0.066
 Exc+0.065
Neg logit contributions
ies-0.076
ishing-0.069
els-0.069
ins-0.066
ig-0.063
Token fun
Token ID1257
Feature activation+3.22e-02

Pos logit contributions
 Curious+0.011
Don+0.011
 Looking+0.011
 Are+0.011
 Fasc+0.011
Neg logit contributions
ies-0.014
els-0.014
ishing-0.013
ins-0.013
ig-0.012
Token.
Token ID13
Feature activation-1.24e-01

Pos logit contributions
els+0.059
ies+0.053
ishes+0.053
ishing+0.051
isher+0.050
Neg logit contributions
-0.057
 hadn-0.056
 Are-0.056
Don-0.056
 But-0.054
Token When
Token ID1649
Feature activation-8.39e-02

Pos logit contributions
Don+0.147
 Curious+0.146
LL+0.142
Copy+0.142
 Fasc+0.139
Neg logit contributions
ies-0.255
 again-0.247
els-0.237
ie-0.234
ig-0.231
Token it
Token ID340
Feature activation-1.64e-02

Pos logit contributions
 Curious+0.116
Don+0.112
 Looking+0.111
 Are+0.108
 Fasc+0.106
Neg logit contributions
ies-0.196
els-0.191
ishing-0.187
ins-0.179
ig-0.174
Token was
Token ID373
Feature activation-2.52e-02

Pos logit contributions
 Curious+0.024
Don+0.024
 Looking+0.023
 Something+0.022
 Fasc+0.022
Neg logit contributions
ies-0.037
ishing-0.033
els-0.033
ie-0.033
 again-0.033
Token time
Token ID640
Feature activation-1.12e-02

Pos logit contributions
 Curious+0.046
 Looking+0.045
Don+0.044
 Fasc+0.044
 Are+0.043
Neg logit contributions
ies-0.047
els-0.047
ishing-0.045
ins-0.043
ig-0.041
Token to
Token ID284
Feature activation-2.19e-02

Pos logit contributions
 Curious+0.020
 Looking+0.020
Don+0.020
 Are+0.019
 It+0.019
Neg logit contributions
els-0.021
ies-0.020
ishing-0.020
ins-0.019
ries-0.018
Token day
Token ID1110
Feature activation+1.05e-02

Pos logit contributions
els+0.122
ies+0.114
ishing+0.110
ins+0.109
ries+0.108
Neg logit contributions
 Curious-0.079
 Looking-0.079
 Thanks-0.077
-0.075
Thanks-0.075
Token at
Token ID379
Feature activation+1.06e-02

Pos logit contributions
els+0.016
ies+0.016
ishing+0.015
isher+0.014
ins+0.014
Neg logit contributions
-0.022
 Thanks-0.021
 hadn-0.021
 relieved-0.021
 Have-0.021
Token the
Token ID262
Feature activation+1.78e-02

Pos logit contributions
ies+0.022
els+0.022
ishing+0.021
ins+0.020
ig+0.020
Neg logit contributions
 Curious-0.017
Don-0.017
 Looking-0.016
 Are-0.016
 Fasc-0.016
Token fun
Token ID1257
Feature activation+5.01e-02

Pos logit contributions
ies+0.028
els+0.027
ishing+0.027
ins+0.024
ig+0.024
Neg logit contributions
 Curious-0.037
Don-0.036
 Looking-0.036
 Are-0.036
 Fasc-0.035
Token fair
Token ID3148
Feature activation+8.78e-02

Pos logit contributions
els+0.100
ishing+0.092
ins+0.089
ies+0.087
ig+0.085
Neg logit contributions
 Who-0.080
-0.079
 hardly-0.078
 This-0.078
 But-0.078
Token.
Token ID13
Feature activation-1.24e-01

Pos logit contributions
els+0.156
ishing+0.135
ries+0.128
ig+0.122
ishes+0.122
Neg logit contributions
 Who-0.157
 Thanks-0.152
 He-0.152
-0.150
 Where-0.149
Token<|endoftext|>
Token ID50256
Feature activation-5.98e-02

Pos logit contributions
 Curious+0.208
 intrigued+0.203
 Fasc+0.196
Copy+0.196
Don+0.189
Neg logit contributions
ies-0.202
 again-0.201
els-0.196
.-0.193
 and-0.190
TokenOnce
Token ID7454
Feature activation-7.90e-02

Pos logit contributions
 Curious+0.061
Don+0.059
 Looking+0.058
 Are+0.057
 Fasc+0.055
Neg logit contributions
ies-0.152
els-0.150
ishing-0.145
ins-0.144
 again-0.140
Token upon
Token ID2402
Feature activation-9.18e-02

Pos logit contributions
 Curious+0.137
Don+0.134
 Looking+0.133
 Are+0.130
 Fasc+0.128
Neg logit contributions
ies-0.162
els-0.150
ishing-0.146
ins-0.140
ig-0.138
Token a
Token ID257
Feature activation-5.96e-02

Pos logit contributions
 Curious+0.168
 Looking+0.167
 Are+0.167
Don+0.163
 Fasc+0.161
Neg logit contributions
ies-0.160
els-0.144
ins-0.142
ishing-0.141
 again-0.133
Token time
Token ID640
Feature activation-5.56e-02

Pos logit contributions
 Curious+0.118
Don+0.115
 Looking+0.114
 Are+0.112
 Fasc+0.111
Neg logit contributions
ies-0.103
els-0.100
ishing-0.097
ins-0.091
ig-0.086
Token shared
Token ID4888
Feature activation+4.55e-02

Pos logit contributions
Don+0.011
 Curious+0.011
 Are+0.011
 hadn+0.010
 Looking+0.010
Neg logit contributions
els-0.033
ies-0.033
ishing-0.033
ins-0.032
ig-0.031
Token many
Token ID867
Feature activation+7.85e-02

Pos logit contributions
ishing+0.100
ig+0.096
ier+0.096
els+0.095
ooming+0.095
Neg logit contributions
 It-0.071
 The-0.067
 -0.062
 Curious-0.060
 This-0.059
Token more
Token ID517
Feature activation+7.23e-02

Pos logit contributions
burn+0.139
ier+0.122
ooming+0.121
els+0.120
ishing+0.120
Neg logit contributions
-0.144
Me-0.134
 wond-0.129
From-0.128
 relieved-0.125
Token happy
Token ID3772
Feature activation+6.14e-02

Pos logit contributions
burn+0.132
ishing+0.129
isher+0.128
ier+0.127
els+0.123
Neg logit contributions
-0.130
 It-0.116
-0.112
 But-0.112
 The-0.111
Token times
Token ID1661
Feature activation+5.63e-02

Pos logit contributions
els+0.102
ishing+0.094
ier+0.093
ig+0.089
ist+0.089
Neg logit contributions
 Thank-0.105
 Are-0.105
 It-0.104
 Thanks-0.104
 The-0.103
Token.
Token ID13
Feature activation-1.24e-01

Pos logit contributions
ishing+0.095
els+0.093
ies+0.089
ishes+0.085
ins+0.084
Neg logit contributions
-0.116
 hardly-0.105
 Thanks-0.102
 This-0.102
 It-0.101
Token<|endoftext|>
Token ID50256
Feature activation-6.08e-02

Pos logit contributions
 Curious+0.184
 intrigued+0.171
 Fasc+0.168
Don+0.168
 Looking+0.165
Neg logit contributions
ies-0.233
 again-0.231
els-0.228
ins-0.217
ie-0.215
TokenOne
Token ID3198
Feature activation-3.02e-02

Pos logit contributions
 Curious+0.063
Don+0.061
 Looking+0.059
 Are+0.059
 Fasc+0.057
Neg logit contributions
ies-0.154
els-0.152
ishing-0.146
ins-0.145
 again-0.141
Token day
Token ID1110
Feature activation-5.20e-02

Pos logit contributions
 Curious+0.062
Don+0.061
 Looking+0.061
 Are+0.060
 Fasc+0.059
Neg logit contributions
ies-0.050
els-0.047
ishing-0.046
ins-0.043
ig-0.042
Token,
Token ID11
Feature activation-9.04e-02

Pos logit contributions
 Curious+0.131
Don+0.128
 Looking+0.128
 Are+0.126
 Fasc+0.125
Neg logit contributions
ies-0.062
els-0.059
ishing-0.056
ins-0.052
ig-0.048
Token a
Token ID257
Feature activation-3.77e-02

Pos logit contributions
 Terr+0.192
 Curious+0.191
 Would+0.187
 Looking+0.186
 Are+0.184
Neg logit contributions
ies-0.131
ishing-0.116
els-0.112
--0.112
ie-0.104
Token hurt
Token ID5938
Feature activation+6.81e-02

Pos logit contributions
ies+0.062
els+0.062
ishing+0.060
ins+0.057
ig+0.056
Neg logit contributions
 Curious-0.046
Don-0.046
 Looking-0.044
 Are-0.043
 shouldn-0.043
Token again
Token ID757
Feature activation+5.12e-02

Pos logit contributions
els+0.112
ishing+0.108
ig+0.105
ins+0.104
ies+0.100
Neg logit contributions
 He-0.137
 It-0.135
 Where-0.135
 shouldn-0.134
Don-0.134
Token.
Token ID13
Feature activation-1.22e-01

Pos logit contributions
els+0.086
isher+0.084
ins+0.081
ies+0.079
ishing+0.079
Neg logit contributions
-0.099
 hadn-0.098
 shouldn-0.095
Don-0.094
 The-0.091
Token The
Token ID383
Feature activation-1.22e-02

Pos logit contributions
 Curious+0.168
Copy+0.163
 intrigued+0.163
Jump+0.158
Don+0.156
Neg logit contributions
 again-0.237
ies-0.226
ie-0.214
els-0.213
 cubes-0.210
Token end
Token ID886
Feature activation-3.58e-02

Pos logit contributions
 Curious+0.028
Don+0.027
 Looking+0.027
 Are+0.027
 Fasc+0.026
Neg logit contributions
ies-0.018
els-0.017
ishing-0.017
ins-0.016
ig-0.014
Token.
Token ID13
Feature activation-1.23e-01

Pos logit contributions
Don+0.087
 Curious+0.086
 Looking+0.086
 Are+0.085
 Who+0.085
Neg logit contributions
ies-0.045
els-0.044
ishing-0.042
ins-0.039
ig-0.036
Token<|endoftext|>
Token ID50256
Feature activation-6.77e-02

Pos logit contributions
 Curious+0.183
 intrigued+0.177
 Looking+0.173
Copy+0.172
 Fasc+0.172
Neg logit contributions
 again-0.233
ies-0.217
--0.209
els-0.202
ie-0.202
TokenOnce
Token ID7454
Feature activation-7.16e-02

Pos logit contributions
 Curious+0.075
Don+0.074
 Are+0.073
 Looking+0.072
 Fasc+0.069
Neg logit contributions
ies-0.161
els-0.158
 again-0.152
ins-0.152
ishing-0.151
Token upon
Token ID2402
Feature activation-8.47e-02

Pos logit contributions
 Curious+0.116
Don+0.113
 Looking+0.113
 Are+0.110
 Fasc+0.108
Neg logit contributions
ies-0.151
els-0.144
ishing-0.141
ins-0.135
ig-0.131
Token a
Token ID257
Feature activation-5.39e-02

Pos logit contributions
 Curious+0.147
 Looking+0.145
 Are+0.144
Don+0.143
 Fasc+0.139
Neg logit contributions
ies-0.158
els-0.147
ishing-0.143
ins-0.142
ie-0.132
Token time
Token ID640
Feature activation-5.15e-02

Pos logit contributions
 Curious+0.101
 Looking+0.099
Don+0.099
 Are+0.096
 Fasc+0.095
Neg logit contributions
ies-0.098
els-0.097
ishing-0.093
ins-0.088
ries-0.082
Token with
Token ID351
Feature activation+1.77e-02

Pos logit contributions
els+0.082
ies+0.080
ishing+0.073
ishes+0.072
ries+0.071
Neg logit contributions
 Curious-0.082
Don-0.081
 Who-0.080
 Are-0.079
 Looking-0.079
Token her
Token ID607
Feature activation+4.98e-02

Pos logit contributions
ishing+0.043
ies+0.042
els+0.041
ish+0.040
ig+0.040
Neg logit contributions
Don-0.020
 It-0.020
 Curious-0.019
 hardly-0.019
 Who-0.019
Token new
Token ID649
Feature activation+8.22e-02

Pos logit contributions
els+0.089
ins+0.083
ishing+0.082
ies+0.080
ries+0.077
Neg logit contributions
 hadn-0.094
 Curious-0.092
 Who-0.090
 Fasc-0.088
 shouldn-0.088
Token friends
Token ID2460
Feature activation+7.02e-02

Pos logit contributions
els+0.153
ishing+0.140
ier+0.140
ries+0.138
ins+0.137
Neg logit contributions
 Curious-0.153
 Who-0.143
 Looking-0.142
 Fasc-0.137
 Where-0.136
Token again
Token ID757
Feature activation+4.06e-02

Pos logit contributions
els+0.125
ies+0.120
ishing+0.116
ins+0.113
ries+0.106
Neg logit contributions
-0.144
 hadn-0.134
Don-0.127
 Who-0.127
 shouldn-0.127
Token.
Token ID13
Feature activation-1.23e-01

Pos logit contributions
els+0.064
ins+0.059
ies+0.059
ishing+0.058
isher+0.057
Neg logit contributions
-0.085
Don-0.084
 Are-0.082
 Who-0.081
 hadn-0.081
Token\n
Token ID198
Feature activation-1.07e-01

Pos logit contributions
 intrigued+0.204
Jump+0.201
 gru+0.200
Copy+0.199
 Curious+0.193
Neg logit contributions
 again-0.213
ies-0.201
.-0.191
 and-0.188
ie-0.181
Token\n
Token ID198
Feature activation-9.98e-02

Pos logit contributions
 Are+0.202
 Curious+0.199
 Fasc+0.192
 Looking+0.191
 Terr+0.187
Neg logit contributions
ies-0.176
ie-0.166
 again-0.166
--0.158
els-0.157
TokenEvery
Token ID6109
Feature activation-2.50e-02

Pos logit contributions
 Curious+0.127
 Are+0.124
 Looking+0.120
 Fasc+0.117
 Terr+0.115
Neg logit contributions
ies-0.225
els-0.213
ins-0.211
ie-0.211
ock-0.210
Token day
Token ID1110
Feature activation-3.09e-02

Pos logit contributions
 Curious+0.050
Don+0.049
 Looking+0.048
 Are+0.048
 Fasc+0.047
Neg logit contributions
ies-0.043
els-0.042
ishing-0.041
ins-0.038
ig-0.036
Token,
Token ID11
Feature activation-7.42e-02

Pos logit contributions
 Curious+0.075
Don+0.074
 Looking+0.073
 Are+0.073
 Fasc+0.072
Neg logit contributions
ies-0.039
els-0.038
ishing-0.036
ins-0.033
ig-0.031
Token to
Token ID284
Feature activation-4.92e-03

Pos logit contributions
 Curious+0.047
 Looking+0.045
 Fasc+0.044
 Are+0.042
 Int+0.042
Neg logit contributions
ies-0.068
els-0.061
ishing-0.061
 again-0.059
ins-0.058
Token have
Token ID423
Feature activation+7.75e-03

Pos logit contributions
 Curious+0.006
Don+0.006
 Looking+0.006
 Are+0.006
 Fasc+0.005
Neg logit contributions
ies-0.012
els-0.012
ishing-0.012
ins-0.011
ig-0.011
Token fun
Token ID1257
Feature activation+5.27e-02

Pos logit contributions
ies+0.015
els+0.014
ishing+0.014
ins+0.013
ig+0.013
Neg logit contributions
 Curious-0.014
Don-0.013
 Looking-0.013
 Are-0.013
 Fasc-0.013
Token swinging
Token ID25635
Feature activation+7.82e-02

Pos logit contributions
ies+0.101
els+0.100
isher+0.098
ishes+0.096
ishing+0.094
Neg logit contributions
-0.089
 Are-0.081
Don-0.081
 Curious-0.078
 Looking-0.077
Token safely
Token ID11512
Feature activation+6.70e-02

Pos logit contributions
ies+0.135
els+0.134
ishing+0.131
ins+0.127
ishes+0.115
Neg logit contributions
Don-0.157
 Curious-0.148
-0.147
 It-0.146
 Are-0.142
Token.
Token ID13
Feature activation-1.23e-01

Pos logit contributions
els+0.118
ins+0.108
ies+0.107
isher+0.101
ishing+0.101
Neg logit contributions
-0.141
Don-0.133
 Curious-0.128
 hadn-0.124
 shouldn-0.123
Token<|endoftext|>
Token ID50256
Feature activation-5.88e-02

Pos logit contributions
 Curious+0.194
 intrigued+0.194
Jump+0.191
Copy+0.188
 Fasc+0.187
Neg logit contributions
 again-0.223
ies-0.206
ie-0.193
.-0.192
 and-0.191
TokenOnce
Token ID7454
Feature activation-7.42e-02

Pos logit contributions
 Curious+0.066
Don+0.065
 Are+0.063
 Looking+0.062
 Fasc+0.061
Neg logit contributions
ies-0.139
els-0.137
 again-0.132
ins-0.132
ishing-0.131
Token upon
Token ID2402
Feature activation-8.70e-02

Pos logit contributions
 Curious+0.127
Don+0.124
 Looking+0.123
 Are+0.121
 Fasc+0.119
Neg logit contributions
ies-0.154
els-0.142
ishing-0.138
ins-0.133
ig-0.131
Token a
Token ID257
Feature activation-5.62e-02

Pos logit contributions
 Curious+0.157
 Are+0.157
 Looking+0.156
Don+0.152
 Fasc+0.151
Neg logit contributions
ies-0.152
els-0.139
ins-0.135
ishing-0.135
 again-0.128
Token time
Token ID640
Feature activation-5.48e-02

Pos logit contributions
 Curious+0.109
Don+0.107
 Looking+0.106
 Are+0.104
 Fasc+0.103
Neg logit contributions
ies-0.098
els-0.096
ishing-0.093
ins-0.088
ig-0.083
Token they
Token ID484
Feature activation-3.09e-02

Pos logit contributions
 Where+0.090
 Something+0.088
 Curious+0.087
 Looking+0.087
 Who+0.085
Neg logit contributions
ies-0.111
--0.088
er-0.088
ie-0.085
 toys-0.084
Token played
Token ID2826
Feature activation+8.52e-03

Pos logit contributions
 Curious+0.053
 Looking+0.051
 Are+0.050
 Where+0.049
Don+0.049
Neg logit contributions
ies-0.065
els-0.060
ishing-0.059
ie-0.056
ig-0.054
Token together
Token ID1978
Feature activation+3.29e-02

Pos logit contributions
els+0.012
ies+0.011
ishing+0.010
ins+0.010
ries+0.009
Neg logit contributions
 Curious-0.020
 Looking-0.020
 It-0.020
Don-0.020
 Fasc-0.019
Token until
Token ID1566
Feature activation-4.55e-02

Pos logit contributions
ins+0.072
els+0.072
ies+0.070
ishing+0.070
isher+0.069
Neg logit contributions
 hardly-0.047
-0.047
 barely-0.045
 �-0.045
 The-0.044
Token dinner
Token ID8073
Feature activation+2.45e-02

Pos logit contributions
 Curious+0.074
 Looking+0.071
Don+0.070
 Are+0.069
 Fasc+0.068
Neg logit contributions
ies-0.096
els-0.091
ishing-0.090
ins-0.083
ig-0.083
Token.
Token ID13
Feature activation-1.23e-01

Pos logit contributions
ies+0.060
els+0.060
ishing+0.058
ins+0.055
ig+0.054
Neg logit contributions
 Curious-0.030
Don-0.029
 Looking-0.029
 Are-0.028
 hadn-0.027
Token\n
Token ID198
Feature activation-1.03e-01

Pos logit contributions
Copy+0.183
 wond+0.178
 furious+0.171
 intrigued+0.170
 gru+0.169
Neg logit contributions
ies-0.212
 again-0.210
 and-0.190
 toys-0.189
ins-0.186
Token\n
Token ID198
Feature activation-9.85e-02

Pos logit contributions
 Are+0.161
 Curious+0.159
 Fasc+0.153
 Looking+0.152
 Terr+0.149
Neg logit contributions
ies-0.200
ins-0.192
ie-0.186
els-0.182
 again-0.180
TokenLater
Token ID18602
Feature activation-4.03e-02

Pos logit contributions
 Curious+0.155
 Are+0.149
 Looking+0.148
 Terr+0.144
 Fasc+0.143
Neg logit contributions
ies-0.193
ins-0.186
els-0.179
ock-0.175
ishing-0.174
Token that
Token ID326
Feature activation-4.22e-02

Pos logit contributions
 Curious+0.073
Don+0.071
 Looking+0.071
 Are+0.070
 Fasc+0.068
Neg logit contributions
ies-0.077
els-0.074
ishing-0.072
ins-0.069
ig-0.066
Token night
Token ID1755
Feature activation-3.10e-02

Pos logit contributions
 Curious+0.097
 Looking+0.095
Don+0.095
 Are+0.093
 Fasc+0.092
Neg logit contributions
ies-0.061
els-0.054
ishing-0.054
ins-0.049
ie-0.049
Token promised
Token ID8072
Feature activation-3.37e-02

Pos logit contributions
 Curious+0.053
 Looking+0.051
 Are+0.049
Don+0.049
 Fasc+0.049
Neg logit contributions
ies-0.075
ishing-0.069
els-0.069
ins-0.064
ig-0.064
Token to
Token ID284
Feature activation-5.64e-03

Pos logit contributions
 Curious+0.061
Don+0.061
 Looking+0.060
 Are+0.058
 Who+0.058
Neg logit contributions
els-0.063
ies-0.063
ishing-0.062
ins-0.060
ig-0.056
Token play
Token ID711
Feature activation+3.01e-02

Pos logit contributions
 Curious+0.008
Don+0.007
 Looking+0.007
 Are+0.007
 Fasc+0.007
Neg logit contributions
ies-0.013
els-0.013
ishing-0.013
ins-0.012
ig-0.012
Token again
Token ID757
Feature activation+3.06e-02

Pos logit contributions
ies+0.048
els+0.047
ishing+0.040
ins+0.039
ries+0.038
Neg logit contributions
 It-0.066
 Curious-0.064
 Something-0.064
 The-0.064
 Looking-0.063
Token soon
Token ID2582
Feature activation+3.52e-02

Pos logit contributions
els+0.065
ins+0.060
ies+0.060
ishing+0.058
ries+0.056
Neg logit contributions
-0.051
Don-0.049
 hadn-0.049
 Who-0.048
 The-0.048
Token.
Token ID13
Feature activation-1.23e-01

Pos logit contributions
els+0.060
ins+0.056
ishing+0.055
ies+0.055
isher+0.051
Neg logit contributions
-0.070
 hadn-0.068
 It-0.067
 The-0.066
 This-0.066
Token<|endoftext|>
Token ID50256
Feature activation-6.09e-02

Pos logit contributions
Isa+0.200
Don+0.195
Copy+0.189
 intrigued+0.189
 bust+0.186
Neg logit contributions
 again-0.202
ies-0.190
.-0.187
 and-0.186
 together-0.185
TokenOnce
Token ID7454
Feature activation-7.56e-02

Pos logit contributions
 Curious+0.066
Don+0.065
 Are+0.063
 Looking+0.063
 Fasc+0.060
Neg logit contributions
ies-0.149
els-0.146
ishing-0.141
ins-0.141
 again-0.139
Token upon
Token ID2402
Feature activation-8.66e-02

Pos logit contributions
 Curious+0.132
Don+0.129
 Looking+0.128
 Are+0.126
 Fasc+0.124
Neg logit contributions
ies-0.154
els-0.144
ishing-0.139
ins-0.134
ig-0.131
Token a
Token ID257
Feature activation-5.46e-02

Pos logit contributions
 Curious+0.154
 Are+0.153
 Looking+0.152
Don+0.150
 Fasc+0.147
Neg logit contributions
ies-0.157
els-0.144
ishing-0.140
ins-0.139
 again-0.131
Token time
Token ID640
Feature activation-5.23e-02

Pos logit contributions
 Curious+0.107
Don+0.105
 Looking+0.104
 Are+0.102
 Fasc+0.101
Neg logit contributions
ies-0.094
els-0.091
ishing-0.089
ins-0.084
ig-0.079
Token and
Token ID290
Feature activation-2.20e-02

Pos logit contributions
ins+0.075
els+0.074
ies+0.073
ishing+0.072
isher+0.072
Neg logit contributions
 hadn-0.078
 The-0.077
 In-0.077
 It-0.075
-0.075
Token had
Token ID550
Feature activation-5.14e-03

Pos logit contributions
 Curious+0.029
 Looking+0.027
 Fasc+0.026
 Exc+0.026
Don+0.025
Neg logit contributions
ies-0.055
ishing-0.050
els-0.050
ie-0.048
ig-0.047
Token lots
Token ID6041
Feature activation+5.89e-02

Pos logit contributions
 Are+0.010
 Curious+0.010
 Looking+0.010
Don+0.010
 Who+0.009
Neg logit contributions
ies-0.009
els-0.008
ishing-0.008
ins-0.008
ig-0.007
Token of
Token ID286
Feature activation+4.04e-02

Pos logit contributions
els+0.122
ies+0.116
ishing+0.112
ins+0.104
ries+0.104
Neg logit contributions
 Curious-0.101
 Looking-0.097
Don-0.096
 Thanks-0.096
-0.096
Token fun
Token ID1257
Feature activation+5.06e-02

Pos logit contributions
ies+0.108
els+0.107
ishing+0.106
ins+0.100
ig+0.099
Neg logit contributions
 Curious-0.042
 Looking-0.040
Don-0.039
 Fasc-0.038
 hadn-0.036
Token!
Token ID0
Feature activation-1.23e-01

Pos logit contributions
els+0.080
isher+0.075
ishing+0.073
ies+0.073
ish+0.072
Neg logit contributions
 Looking-0.093
 Are-0.093
 Curious-0.092
 hadn-0.092
Don-0.091
Token<|endoftext|>
Token ID50256
Feature activation-6.03e-02

Pos logit contributions
 intrigued+0.182
 Curious+0.180
 Are+0.168
 Fasc+0.168
Copy+0.168
Neg logit contributions
 again-0.239
ies-0.231
ie-0.216
ig-0.212
els-0.208
TokenOnce
Token ID7454
Feature activation-7.78e-02

Pos logit contributions
 Curious+0.064
Don+0.062
 Are+0.061
 Looking+0.061
 Fasc+0.059
Neg logit contributions
ies-0.148
els-0.146
ishing-0.141
ins-0.140
 again-0.139
Token upon
Token ID2402
Feature activation-8.83e-02

Pos logit contributions
 Curious+0.134
Don+0.130
 Looking+0.130
 Are+0.128
 Fasc+0.126
Neg logit contributions
ies-0.162
els-0.148
ishing-0.145
ins-0.139
ig-0.137
Token a
Token ID257
Feature activation-5.69e-02

Pos logit contributions
 Curious+0.158
 Are+0.157
 Looking+0.157
Don+0.154
 Fasc+0.151
Neg logit contributions
ies-0.157
els-0.144
ishing-0.140
ins-0.139
 again-0.131
Token time
Token ID640
Feature activation-6.03e-02

Pos logit contributions
 Curious+0.113
Don+0.110
 Looking+0.110
 Are+0.108
 Fasc+0.106
Neg logit contributions
ies-0.097
els-0.094
ishing-0.091
ins-0.086
ig-0.082
Token days
Token ID1528
Feature activation+3.01e-02

Pos logit contributions
els+0.078
ishing+0.070
ier+0.065
ins+0.065
ries+0.064
Neg logit contributions
Don-0.089
 Are-0.088
 Curious-0.087
 hadn-0.085
 lived-0.084
Token together
Token ID1978
Feature activation+3.65e-02

Pos logit contributions
els+0.054
ies+0.052
ishing+0.048
isher+0.047
ish+0.046
Neg logit contributions
-0.056
 hadn-0.052
Don-0.050
've-0.049
 relieved-0.049
Token.
Token ID13
Feature activation-1.21e-01

Pos logit contributions
isher+0.066
els+0.064
ies+0.063
ins+0.063
ishing+0.061
Neg logit contributions
 hadn-0.062
-0.060
 The-0.059
 shouldn-0.059
 had-0.058
Token The
Token ID383
Feature activation-1.72e-02

Pos logit contributions
 Curious+0.177
 intrigued+0.166
 Fasc+0.162
 Looking+0.162
 Are+0.158
Neg logit contributions
ies-0.241
 again-0.237
els-0.227
ie-0.223
ig-0.220
Token end
Token ID886
Feature activation-3.70e-02

Pos logit contributions
Don+0.035
 Curious+0.035
 Are+0.033
 Looking+0.033
 hadn+0.033
Neg logit contributions
ies-0.029
ishing-0.029
els-0.029
ins-0.028
ier-0.025
Token.
Token ID13
Feature activation-1.23e-01

Pos logit contributions
Don+0.092
 Curious+0.092
 Looking+0.091
 Are+0.090
 It+0.090
Neg logit contributions
ies-0.044
els-0.042
ishing-0.041
ins-0.037
ooming-0.032
Token<|endoftext|>
Token ID50256
Feature activation-6.11e-02

Pos logit contributions
 Curious+0.187
 Looking+0.175
 intrigued+0.175
 Fasc+0.175
 Are+0.166
Neg logit contributions
 again-0.240
ies-0.226
els-0.213
ins-0.209
ie-0.208
TokenOnce
Token ID7454
Feature activation-7.41e-02

Pos logit contributions
 Curious+0.060
Don+0.059
 Are+0.057
 Looking+0.057
 Fasc+0.054
Neg logit contributions
ies-0.157
els-0.155
ishing-0.150
ins-0.148
 again-0.144
Token upon
Token ID2402
Feature activation-9.01e-02

Pos logit contributions
 Curious+0.122
Don+0.118
 Looking+0.118
 Are+0.115
 Fasc+0.113
Neg logit contributions
ies-0.156
els-0.148
ishing-0.144
ins-0.138
ig-0.135
Token a
Token ID257
Feature activation-5.78e-02

Pos logit contributions
 Curious+0.158
 Looking+0.156
 Are+0.156
Don+0.154
 Fasc+0.150
Neg logit contributions
ies-0.166
els-0.154
ishing-0.150
ins-0.148
 again-0.138
Token time
Token ID640
Feature activation-5.45e-02

Pos logit contributions
 Curious+0.109
 Looking+0.106
Don+0.106
 Are+0.102
 Fasc+0.102
Neg logit contributions
ies-0.105
els-0.104
ishing-0.100
ins-0.095
ries-0.089
Token fun
Token ID1257
Feature activation+4.92e-02

Pos logit contributions
els+0.107
ies+0.106
ishing+0.103
ier+0.099
ins+0.097
Neg logit contributions
 Curious-0.043
 Looking-0.042
Don-0.041
 shouldn-0.040
 Fasc-0.039
Token together
Token ID1978
Feature activation+3.47e-02

Pos logit contributions
isher+0.077
els+0.075
ier+0.074
ock+0.073
reating+0.070
Neg logit contributions
 Are-0.086
Don-0.086
 hadn-0.086
 shouldn-0.086
 Looking-0.085
Token.
Token ID13
Feature activation-1.18e-01

Pos logit contributions
isher+0.058
ies+0.057
els+0.056
ins+0.056
ock+0.054
Neg logit contributions
 hadn-0.063
 The-0.062
 shouldn-0.061
 had-0.061
 In-0.061
Token The
Token ID383
Feature activation-1.33e-02

Pos logit contributions
 Curious+0.152
 Fasc+0.140
 Looking+0.138
 intrigued+0.136
 Are+0.132
Neg logit contributions
ies-0.272
 again-0.255
els-0.252
ie-0.244
ig-0.243
Token end
Token ID886
Feature activation-3.98e-02

Pos logit contributions
Don+0.028
 Curious+0.028
 Looking+0.027
 Are+0.027
 hadn+0.026
Neg logit contributions
ies-0.021
els-0.021
ishing-0.020
ins-0.019
ig-0.018
Token.
Token ID13
Feature activation-1.23e-01

Pos logit contributions
Don+0.098
 Curious+0.097
 Looking+0.097
 Are+0.095
 It+0.094
Neg logit contributions
ies-0.048
els-0.047
ishing-0.045
ins-0.040
ries-0.036
Token<|endoftext|>
Token ID50256
Feature activation-6.79e-02

Pos logit contributions
 Curious+0.173
 Fasc+0.162
 Looking+0.161
 intrigued+0.156
 Are+0.153
Neg logit contributions
 again-0.256
ies-0.254
els-0.235
ins-0.232
ie-0.229
TokenOnce
Token ID7454
Feature activation-7.53e-02

Pos logit contributions
 Curious+0.074
Don+0.073
 Are+0.072
 Looking+0.070
 Fasc+0.068
Neg logit contributions
ies-0.164
els-0.160
 again-0.155
ishing-0.154
ins-0.154
Token upon
Token ID2402
Feature activation-8.91e-02

Pos logit contributions
 Curious+0.125
Don+0.122
 Looking+0.121
 Are+0.119
 Fasc+0.117
Neg logit contributions
ies-0.158
els-0.149
ishing-0.145
ins-0.139
ig-0.136
Token a
Token ID257
Feature activation-5.79e-02

Pos logit contributions
 Curious+0.158
 Looking+0.156
 Are+0.156
Don+0.154
 Fasc+0.150
Neg logit contributions
ies-0.163
els-0.150
ishing-0.146
ins-0.145
 again-0.136
Token time
Token ID640
Feature activation-5.76e-02

Pos logit contributions
 Curious+0.112
Don+0.109
 Looking+0.109
 Are+0.106
 Fasc+0.105
Neg logit contributions
ies-0.102
els-0.100
ishing-0.096
ins-0.091
ig-0.086
Token and
Token ID290
Feature activation+4.98e-03

Pos logit contributions
els+0.045
ies+0.043
ins+0.042
ishing+0.041
ishes+0.037
Neg logit contributions
 Are-0.053
-0.052
 Curious-0.052
 The-0.051
 It-0.051
Token play
Token ID711
Feature activation+3.41e-02

Pos logit contributions
ies+0.009
els+0.009
ishing+0.009
 again+0.008
ie+0.008
Neg logit contributions
Don-0.008
 Looking-0.008
 Curious-0.008
LL-0.008
 Thanks-0.008
Token with
Token ID351
Feature activation+3.01e-02

Pos logit contributions
ies+0.065
els+0.063
ins+0.060
ishing+0.059
ier+0.054
Neg logit contributions
 Are-0.063
 Curious-0.062
 It-0.060
Don-0.060
 Looking-0.059
Token my
Token ID616
Feature activation+5.77e-02

Pos logit contributions
ies+0.075
ishing+0.071
ier+0.070
ins+0.069
els+0.068
Neg logit contributions
 It-0.034
 Are-0.034
 Curious-0.033
 The-0.032
-0.032
Token umbrella
Token ID25510
Feature activation+1.06e-01

Pos logit contributions
ishing+0.098
els+0.095
ins+0.095
ier+0.094
ies+0.089
Neg logit contributions
 Curious-0.110
 hadn-0.101
Don-0.101
 shouldn-0.100
 Fasc-0.100
Token?"
Token ID1701
Feature activation-1.23e-01

Pos logit contributions
els+0.225
ins+0.220
ies+0.211
ishing+0.206
ig+0.186
Neg logit contributions
 Are-0.155
 Curious-0.154
-0.150
Don-0.146
 shouldn-0.146
Token he
Token ID339
Feature activation-2.50e-02

Pos logit contributions
WR+0.224
 hadn+0.223
 shouldn+0.220
 Have+0.218
've+0.217
Neg logit contributions
ies-0.221
els-0.207
--0.181
 again-0.180
ie-0.178
Token asked
Token ID1965
Feature activation-2.46e-02

Pos logit contributions
 Have+0.033
 Where+0.032
Don+0.032
 shouldn+0.031
 Thanks+0.031
Neg logit contributions
ies-0.060
els-0.055
ishing-0.053
ins-0.052
--0.050
Token.
Token ID13
Feature activation-1.05e-01

Pos logit contributions
 Curious+0.056
 Looking+0.054
Don+0.054
+0.054
 Who+0.053
Neg logit contributions
ishing-0.034
els-0.032
ies-0.032
ins-0.030
ig-0.029
Token\n
Token ID198
Feature activation-9.82e-02

Pos logit contributions
 shouldn+0.154
 hadn+0.149
 wond+0.139
Don+0.138
 Have+0.137
Neg logit contributions
ies-0.208
els-0.201
 again-0.193
ins-0.190
--0.190
Token\n
Token ID198
Feature activation-9.60e-02

Pos logit contributions
 hadn+0.130
 shouldn+0.129
 Curious+0.129
Don+0.126
 Fasc+0.124
Neg logit contributions
ies-0.217
els-0.207
ins-0.202
--0.199
 again-0.196
Token until
Token ID1566
Feature activation-3.09e-02

Pos logit contributions
els+0.092
ins+0.089
ies+0.085
isher+0.085
ries+0.085
Neg logit contributions
-0.070
 hardly-0.067
 -0.066
 barely-0.066
 �-0.065
Token the
Token ID262
Feature activation-2.27e-03

Pos logit contributions
 Curious+0.051
 Looking+0.049
Don+0.049
 Are+0.048
 Fasc+0.047
Neg logit contributions
ies-0.063
els-0.060
ishing-0.059
ins-0.055
ig-0.055
Token sun
Token ID4252
Feature activation+3.90e-02

Pos logit contributions
 Curious+0.004
Don+0.004
 Looking+0.004
 Are+0.004
 hadn+0.004
Neg logit contributions
els-0.004
ies-0.004
ishing-0.004
ins-0.004
ist-0.004
Token went
Token ID1816
Feature activation+1.76e-02

Pos logit contributions
els+0.097
ies+0.096
ishing+0.094
ins+0.091
ig+0.087
Neg logit contributions
 Curious-0.046
Don-0.045
 Looking-0.044
 Are-0.042
 hadn-0.041
Token down
Token ID866
Feature activation+1.13e-02

Pos logit contributions
ies+0.026
els+0.026
ishing+0.024
ins+0.024
ig+0.022
Neg logit contributions
 Curious-0.038
Don-0.038
 Looking-0.038
 Are-0.037
 Fasc-0.037
Token.
Token ID13
Feature activation-1.23e-01

Pos logit contributions
els+0.016
ies+0.016
ishing+0.015
ins+0.015
ig+0.014
Neg logit contributions
 Curious-0.026
Don-0.026
 It-0.026
 Looking-0.025
 The-0.025
Token<|endoftext|>
Token ID50256
Feature activation-5.74e-02

Pos logit contributions
Copy+0.207
 furious+0.192
LL+0.191
 intrigued+0.188
Don+0.188
Neg logit contributions
 again-0.201
ies-0.185
 together-0.178
ie-0.176
eddy-0.174
TokenOnce
Token ID7454
Feature activation-8.03e-02

Pos logit contributions
 Curious+0.064
Don+0.063
 Are+0.062
 Looking+0.060
 Fasc+0.059
Neg logit contributions
ies-0.138
els-0.135
 again-0.130
ins-0.129
ishing-0.129
Token upon
Token ID2402
Feature activation-8.71e-02

Pos logit contributions
 Curious+0.142
Don+0.139
 Looking+0.138
 Are+0.136
 Fasc+0.133
Neg logit contributions
ies-0.165
els-0.149
ishing-0.145
ie-0.140
ins-0.139
Token a
Token ID257
Feature activation-5.91e-02

Pos logit contributions
 Curious+0.156
 Are+0.155
 Looking+0.155
Don+0.152
 Fasc+0.149
Neg logit contributions
ies-0.156
els-0.142
ishing-0.139
ins-0.138
 again-0.129
Token time
Token ID640
Feature activation-5.49e-02

Pos logit contributions
 Curious+0.117
Don+0.114
 Looking+0.114
 Are+0.111
 Fasc+0.110
Neg logit contributions
ies-0.102
els-0.099
ishing-0.096
ins-0.090
ig-0.086
Token stay
Token ID2652
Feature activation+6.27e-02

Pos logit contributions
ies+0.016
els+0.015
ishing+0.015
ins+0.014
ig+0.014
Neg logit contributions
 Curious-0.010
Don-0.009
 Looking-0.009
 Are-0.009
 Fasc-0.009
Token down
Token ID866
Feature activation+3.85e-02

Pos logit contributions
els+0.114
ies+0.107
ins+0.104
ier+0.097
ries+0.096
Neg logit contributions
 Curious-0.116
 Are-0.113
 relieved-0.112
 Fasc-0.112
 Looking-0.111
Token here
Token ID994
Feature activation+5.36e-02

Pos logit contributions
ies+0.082
els+0.082
ins+0.079
ishing+0.077
ig+0.073
Neg logit contributions
 Are-0.059
 The-0.058
 It-0.058
 Curious-0.057
Don-0.057
Token and
Token ID290
Feature activation+3.07e-03

Pos logit contributions
els+0.119
ies+0.108
ins+0.106
ishing+0.102
ig+0.100
Neg logit contributions
 Are-0.079
 Curious-0.079
-0.079
 It-0.077
 The-0.077
Token play
Token ID711
Feature activation+2.97e-02

Pos logit contributions
ies+0.006
ishing+0.006
els+0.005
 again+0.005
ie+0.005
Neg logit contributions
 Curious-0.005
 Looking-0.005
Don-0.005
 Fasc-0.005
 Terr-0.005
Token?"
Token ID1701
Feature activation-1.23e-01

Pos logit contributions
els+0.060
ies+0.060
ins+0.053
ishing+0.052
ries+0.051
Neg logit contributions
 Are-0.054
 It-0.052
 Curious-0.052
 Where-0.051
Don-0.050
Token he
Token ID339
Feature activation-2.74e-02

Pos logit contributions
 Have+0.200
LL+0.199
 Goodbye+0.193
 shouldn+0.190
 hadn+0.187
Neg logit contributions
ies-0.228
ie-0.213
els-0.208
 again-0.204
--0.202
Token asked
Token ID1965
Feature activation-2.16e-02

Pos logit contributions
 Have+0.039
 Where+0.037
 Looking+0.036
 Who+0.036
Don+0.035
Neg logit contributions
ies-0.065
els-0.057
ishing-0.056
ie-0.056
ins-0.055
Token.
Token ID13
Feature activation-1.08e-01

Pos logit contributions
 Curious+0.045
 The+0.044
 He+0.044
 Looking+0.043
 Are+0.043
Neg logit contributions
els-0.036
ishing-0.034
ins-0.034
ig-0.033
ies-0.032
Token\n
Token ID198
Feature activation-9.84e-02

Pos logit contributions
 shouldn+0.129
Don+0.128
 hadn+0.120
LL+0.116
 Curious+0.113
Neg logit contributions
ies-0.233
 again-0.228
els-0.218
er-0.218
ishing-0.217
Token\n
Token ID198
Feature activation-9.35e-02

Pos logit contributions
 Curious+0.109
Don+0.106
 Looking+0.104
 Are+0.103
 Fasc+0.101
Neg logit contributions
ies-0.246
els-0.234
ins-0.230
ishing-0.227
ie-0.221
Token havoc
Token ID29860
Feature activation+5.52e-02

Pos logit contributions
ies+0.152
els+0.149
ishing+0.142
ins+0.141
ig+0.135
Neg logit contributions
 Curious-0.141
 It-0.140
 shouldn-0.139
-0.139
Don-0.138
Token all
Token ID477
Feature activation+4.21e-02

Pos logit contributions
els+0.092
ies+0.089
ishing+0.087
ins+0.084
ig+0.083
Neg logit contributions
Don-0.109
 shouldn-0.107
 hadn-0.106
-0.106
 Curious-0.105
Token over
Token ID625
Feature activation+3.49e-02

Pos logit contributions
els+0.081
ies+0.080
ishing+0.074
ins+0.074
ig+0.072
Neg logit contributions
Don-0.071
 Are-0.070
 Curious-0.070
 Looking-0.069
 shouldn-0.068
Token the
Token ID262
Feature activation+4.10e-02

Pos logit contributions
ies+0.081
ishing+0.077
ig+0.076
els+0.075
ins+0.075
Neg logit contributions
 Where-0.044
-0.044
Don-0.044
 Are-0.043
 Who-0.043
Token ranch
Token ID28752
Feature activation+9.84e-02

Pos logit contributions
els+0.078
ies+0.078
ishing+0.077
ins+0.074
ig+0.071
Neg logit contributions
 Curious-0.071
Don-0.071
 Are-0.069
-0.068
 shouldn-0.068
Token."
Token ID526
Feature activation-1.22e-01

Pos logit contributions
els+0.157
ishing+0.150
ies+0.146
ins+0.138
ig+0.136
Neg logit contributions
 shouldn-0.206
-0.204
 Curious-0.202
 This-0.199
 It-0.199
Token
Token ID198
Feature activation-1.19e-01

Pos logit contributions
 hadn+0.146
Don+0.145
 Goodbye+0.145
 Is+0.144
 Curious+0.144
Neg logit contributions
ies-0.270
els-0.258
ishing-0.256
 again-0.250
ig-0.250
Token
Token ID198
Feature activation-1.05e-01

Pos logit contributions
 Are+0.161
 Curious+0.160
 Looking+0.154
 hadn+0.154
 Fasc+0.152
Neg logit contributions
ies-0.255
els-0.240
ig-0.237
ins-0.234
ie-0.232
TokenSam
Token ID16305
Feature activation-1.22e-02

Pos logit contributions
 Are+0.165
 Is+0.163
 Curious+0.162
 Who+0.162
 Terr+0.160
Neg logit contributions
ies-0.209
ishing-0.199
els-0.197
ig-0.185
ie-0.184
Tokenmy
Token ID1820
Feature activation-1.91e-02

Pos logit contributions
 Curious+0.029
Don+0.028
 Looking+0.028
 hadn+0.027
 shouldn+0.027
Neg logit contributions
els-0.016
ies-0.016
ishing-0.015
ins-0.015
ig-0.013
Token was
Token ID373
Feature activation-3.76e-02

Pos logit contributions
 Curious+0.024
 Are+0.023
Don+0.023
 Where+0.023
 Looking+0.023
Neg logit contributions
ies-0.046
els-0.044
ishing-0.043
ie-0.041
ins-0.041
Token Bob
Token ID5811
Feature activation+6.51e-03

Pos logit contributions
 Looking+0.057
 Curious+0.057
LL+0.055
Don+0.055
 Are+0.053
Neg logit contributions
ies-0.112
els-0.105
ishing-0.101
ins-0.099
 again-0.098
Token ate
Token ID15063
Feature activation+3.80e-02

Pos logit contributions
ies+0.016
els+0.015
ishing+0.015
ins+0.015
ig+0.014
Neg logit contributions
 Curious-0.008
Don-0.008
 Looking-0.008
 Are-0.008
 Fasc-0.007
Token lots
Token ID6041
Feature activation+6.93e-02

Pos logit contributions
els+0.094
ies+0.094
ishing+0.092
ig+0.091
ooming+0.087
Neg logit contributions
 It-0.047
 The-0.045
 He-0.044
 -0.043
 Curious-0.043
Token of
Token ID286
Feature activation+5.82e-02

Pos logit contributions
els+0.151
ishing+0.146
ies+0.141
ig+0.133
ators+0.132
Neg logit contributions
-0.100
 Thanks-0.097
 It-0.095
 Curious-0.094
 relieved-0.092
Token y
Token ID331
Feature activation+9.15e-02

Pos logit contributions
ies+0.089
els+0.088
ishing+0.087
ig+0.080
ins+0.077
Neg logit contributions
 Curious-0.123
 Looking-0.119
Don-0.118
 Fasc-0.117
 Are-0.117
Tokenummy
Token ID13513
Feature activation+1.06e-01

Pos logit contributions
els+0.141
ies+0.140
ishing+0.134
ist+0.132
ier+0.130
Neg logit contributions
 Looking-0.163
 This-0.161
 Thanks-0.160
 Curious-0.160
 It-0.158
Token treats
Token ID18432
Feature activation+1.03e-01

Pos logit contributions
els+0.173
ishing+0.167
ig+0.163
ist+0.158
ies+0.156
Neg logit contributions
 Curious-0.199
 Looking-0.192
 Fasc-0.191
 Are-0.190
 It-0.189
Token and
Token ID290
Feature activation-5.83e-03

Pos logit contributions
els+0.182
ishing+0.174
ies+0.166
ig+0.163
ins+0.160
Neg logit contributions
-0.198
 hardly-0.190
 It-0.188
 He-0.187
 hadn-0.185
Token had
Token ID550
Feature activation-3.67e-03

Pos logit contributions
 Curious+0.009
Don+0.008
 Looking+0.008
 Are+0.008
 Fasc+0.008
Neg logit contributions
ies-0.013
els-0.013
ishing-0.012
ins-0.012
ig-0.011
Token a
Token ID257
Feature activation+2.72e-02

Pos logit contributions
 Curious+0.006
Don+0.006
 Looking+0.006
 Are+0.006
 Fasc+0.006
Neg logit contributions
ies-0.007
els-0.007
ishing-0.007
ins-0.006
ig-0.006
Token great
Token ID1049
Feature activation+5.79e-02

Pos logit contributions
ies+0.061
els+0.061
ishing+0.059
ins+0.056
ig+0.054
Neg logit contributions
 Curious-0.039
Don-0.038
 Looking-0.038
 Are-0.037
 Fasc-0.036
Token ever
Token ID1683
Feature activation+3.10e-02

Pos logit contributions
ies+0.132
els+0.129
ishing+0.127
ins+0.122
ig+0.117
Neg logit contributions
 Curious-0.089
Don-0.088
 Looking-0.087
 Are-0.085
 Fasc-0.083
Token.
Token ID13
Feature activation-9.95e-02

Pos logit contributions
els+0.039
ishing+0.038
ies+0.036
ins+0.035
ig+0.033
Neg logit contributions
 hadn-0.072
-0.071
 shouldn-0.071
 Curious-0.071
Don-0.071
Token She
Token ID1375
Feature activation-2.84e-02

Pos logit contributions
Don+0.101
 hadn+0.092
 Curious+0.090
LL+0.088
 shouldn+0.088
Neg logit contributions
ies-0.245
els-0.238
ishing-0.227
 again-0.225
ins-0.224
Token touched
Token ID12615
Feature activation+2.99e-02

Pos logit contributions
 Curious+0.036
Don+0.035
 Are+0.035
 Looking+0.035
 Where+0.033
Neg logit contributions
ies-0.070
els-0.067
ishing-0.065
ins-0.062
ig-0.062
Token the
Token ID262
Feature activation+2.85e-02

Pos logit contributions
ies+0.077
ishing+0.076
ins+0.074
ooming+0.072
els+0.070
Neg logit contributions
 It-0.034
Don-0.030
 Curious-0.030
 The-0.030
-0.028
Token wood
Token ID4898
Feature activation+1.10e-01

Pos logit contributions
ies+0.051
ishing+0.050
ins+0.049
els+0.048
ie+0.044
Neg logit contributions
 Curious-0.054
 shouldn-0.052
 Are-0.052
 It-0.052
Don-0.052
Token and
Token ID290
Feature activation-2.31e-02

Pos logit contributions
ins+0.172
ishing+0.169
ies+0.163
els+0.155
ooming+0.145
Neg logit contributions
-0.231
 It-0.222
 �-0.217
 Curious-0.215
 hadn-0.214
Token it
Token ID340
Feature activation+2.48e-02

Pos logit contributions
 Curious+0.031
Don+0.030
 Looking+0.030
 Are+0.029
 Fasc+0.029
Neg logit contributions
ies-0.054
els-0.053
ishing-0.052
ins-0.050
ig-0.048
Token felt
Token ID2936
Feature activation+1.42e-03

Pos logit contributions
ies+0.063
els+0.061
ishing+0.060
ins+0.058
ig+0.056
Neg logit contributions
 Curious-0.029
Don-0.028
 Looking-0.028
 Are-0.027
 Fasc-0.026
Token nice
Token ID3621
Feature activation+5.74e-02

Pos logit contributions
ies+0.002
ishing+0.002
els+0.002
ier+0.002
 again+0.002
Neg logit contributions
 Curious-0.003
Don-0.002
 Looking-0.002
 Who-0.002
 Are-0.002
Token and
Token ID290
Feature activation-6.53e-03

Pos logit contributions
els+0.109
ins+0.099
ies+0.098
ishing+0.095
ishes+0.094
Neg logit contributions
-0.108
 hadn-0.105
 It-0.101
 �-0.100
 Thanks-0.098
Token them
Token ID606
Feature activation+3.02e-02

Pos logit contributions
ishing+0.021
ies+0.021
els+0.020
ig+0.020
ins+0.019
Neg logit contributions
 Curious-0.024
 Looking-0.024
Don-0.023
 It-0.023
 Who-0.023
Token how
Token ID703
Feature activation-1.89e-02

Pos logit contributions
ies+0.057
ishing+0.057
els+0.056
ins+0.052
ig+0.052
Neg logit contributions
 Curious-0.053
 Looking-0.051
Don-0.051
 Are-0.050
 Who-0.049
Token he
Token ID339
Feature activation+3.29e-02

Pos logit contributions
 Curious+0.038
 Looking+0.037
 Are+0.037
Don+0.036
 Fasc+0.036
Neg logit contributions
ies-0.033
els-0.031
ishing-0.030
ins-0.029
ig-0.026
Token made
Token ID925
Feature activation+3.23e-02

Pos logit contributions
ies+0.072
ishing+0.068
els+0.068
ins+0.065
ig+0.063
Neg logit contributions
 Curious-0.048
Don-0.047
 Looking-0.046
 shouldn-0.046
 Are-0.046
Token the
Token ID262
Feature activation+3.36e-02

Pos logit contributions
ies+0.081
ishing+0.080
ig+0.079
ooming+0.076
els+0.075
Neg logit contributions
 It-0.037
 Curious-0.034
 The-0.033
 Looking-0.033
 Exc-0.031
Token picture
Token ID4286
Feature activation+1.04e-01

Pos logit contributions
ishing+0.068
els+0.067
ies+0.067
ins+0.065
ig+0.062
Neg logit contributions
 Curious-0.059
 Are-0.057
 Who-0.055
Don-0.054
 Looking-0.054
Token.
Token ID13
Feature activation-8.85e-02

Pos logit contributions
ishing+0.188
els+0.172
ins+0.169
ooming+0.165
ies+0.163
Neg logit contributions
 It-0.206
-0.202
 He-0.201
 hardly-0.183
 The-0.183
Token He
Token ID679
Feature activation-7.12e-03

Pos logit contributions
Don+0.087
 hadn+0.087
LL+0.082
 Curious+0.079
 shouldn+0.078
Neg logit contributions
els-0.212
ies-0.210
 again-0.202
er-0.201
ins-0.201
Token lets
Token ID8781
Feature activation+1.43e-03

Pos logit contributions
 Curious+0.010
 Looking+0.010
Don+0.010
 Are+0.009
 Fasc+0.009
Neg logit contributions
ies-0.016
ishing-0.016
els-0.016
ins-0.015
ig-0.014
Token them
Token ID606
Feature activation+4.00e-02

Pos logit contributions
ies+0.003
ishing+0.003
els+0.002
ins+0.002
ig+0.002
Neg logit contributions
 Curious-0.003
 It-0.003
 Looking-0.003
 Who-0.003
Don-0.003
Token touch
Token ID3638
Feature activation+5.29e-02

Pos logit contributions
ishing+0.086
ins+0.084
ies+0.083
ig+0.079
isher+0.078
Neg logit contributions
-0.060
 It-0.060
 The-0.058
 Curious-0.057
 Who-0.055
Token loves
Token ID10408
Feature activation+7.97e-03

Pos logit contributions
ies+0.000
els+0.000
ishing+0.000
ins+0.000
ig+0.000
Neg logit contributions
 Curious-0.000
Don-0.000
 Looking-0.000
 Are-0.000
 Fasc-0.000
Token peanuts
Token ID45612
Feature activation+1.23e-01

Pos logit contributions
ishing+0.014
els+0.014
ier+0.014
ig+0.014
ooming+0.014
Neg logit contributions
Don-0.014
 Are-0.013
 Who-0.013
 Curious-0.013
 It-0.013
Token.
Token ID13
Feature activation-1.00e-01

Pos logit contributions
ishing+0.215
els+0.213
isher+0.192
ies+0.191
ooming+0.190
Neg logit contributions
-0.232
 shouldn-0.218
 Jerry-0.208
 had-0.207
 It-0.206
Token Ben
Token ID3932
Feature activation+5.94e-03

Pos logit contributions
 hadn+0.111
Don+0.106
 Curious+0.104
 shouldn+0.101
 Looking+0.101
Neg logit contributions
els-0.247
ies-0.243
ins-0.234
ishing-0.226
 again-0.221
Token hates
Token ID24702
Feature activation+3.00e-02

Pos logit contributions
ies+0.012
els+0.011
ishing+0.011
ins+0.011
ish+0.010
Neg logit contributions
 Curious-0.010
 Looking-0.010
Don-0.010
 Fasc-0.010
 Terr-0.009
Token peanuts
Token ID45612
Feature activation+1.21e-01

Pos logit contributions
ishing+0.054
ies+0.054
els+0.053
ig+0.051
ooming+0.049
Neg logit contributions
Don-0.053
 Curious-0.052
 Are-0.051
 shouldn-0.050
 Looking-0.050
Token.
Token ID13
Feature activation-9.57e-02

Pos logit contributions
ishing+0.209
els+0.203
ies+0.193
ins+0.187
isher+0.183
Neg logit contributions
-0.237
 shouldn-0.234
 It-0.222
 hadn-0.219
Don-0.217
Token He
Token ID679
Feature activation-8.94e-03

Pos logit contributions
 hadn+0.095
Don+0.092
 shouldn+0.090
 Curious+0.088
 Looking+0.086
Neg logit contributions
els-0.240
ies-0.239
ins-0.238
 again-0.224
ishing-0.223
Token thinks
Token ID6834
Feature activation+4.08e-03

Pos logit contributions
 Curious+0.014
 Looking+0.014
Don+0.014
 Fasc+0.013
 Are+0.013
Neg logit contributions
ies-0.019
els-0.019
ishing-0.018
ins-0.017
ie-0.017
Token they
Token ID484
Feature activation+1.85e-02

Pos logit contributions
ies+0.009
ishing+0.009
els+0.008
ins+0.008
ig+0.008
Neg logit contributions
 Curious-0.006
Don-0.006
 Are-0.006
 Looking-0.006
 It-0.006
Token are
Token ID389
Feature activation+2.37e-02

Pos logit contributions
ies+0.036
els+0.035
ishing+0.034
ins+0.032
ig+0.031
Neg logit contributions
 Curious-0.033
 Looking-0.032
Don-0.032
 Fasc-0.031
 Are-0.031
Token.
Token ID13
Feature activation-8.75e-02

Pos logit contributions
ishing+0.150
ins+0.148
ies+0.148
els+0.141
ishes+0.137
Neg logit contributions
 hadn-0.167
-0.164
 had-0.159
 He-0.156
 shouldn-0.156
Token It
Token ID632
Feature activation-2.05e-03

Pos logit contributions
Don+0.085
 hadn+0.080
've+0.071
 hardly+0.070
Thanks+0.069
Neg logit contributions
els-0.216
er-0.214
ies-0.212
ishing-0.210
 again-0.210
Token is
Token ID318
Feature activation-8.57e-03

Pos logit contributions
Thanks+0.005
Don+0.005
 Curious+0.004
 Fasc+0.004
 Looking+0.004
Neg logit contributions
els-0.003
ies-0.003
 again-0.002
ig-0.002
er-0.002
Token big
Token ID1263
Feature activation+6.58e-02

Pos logit contributions
 Curious+0.014
Don+0.013
 Looking+0.013
 Are+0.013
 Fasc+0.013
Neg logit contributions
ies-0.018
els-0.017
ishing-0.017
ins-0.017
ig-0.016
Token and
Token ID290
Feature activation-1.21e-02

Pos logit contributions
els+0.112
ins+0.111
ishing+0.106
ies+0.105
ooming+0.097
Neg logit contributions
 Are-0.129
 It-0.127
 He-0.125
-0.124
 Curious-0.120
Token shiny
Token ID22441
Feature activation+1.00e-01

Pos logit contributions
 Looking+0.022
 Curious+0.022
 Fasc+0.022
Don+0.022
 Terr+0.021
Neg logit contributions
ies-0.021
els-0.020
ishing-0.018
 toys-0.017
 again-0.017
Token.
Token ID13
Feature activation-8.97e-02

Pos logit contributions
ins+0.196
ishing+0.180
ooming+0.176
ies+0.175
burn+0.170
Neg logit contributions
 He-0.191
 It-0.190
 Are-0.169
 But-0.162
 Where-0.159
Token It
Token ID632
Feature activation-6.24e-03

Pos logit contributions
Don+0.096
 hadn+0.092
've+0.084
 shouldn+0.079
 hardly+0.079
Neg logit contributions
els-0.215
ies-0.212
ishing-0.209
er-0.207
ins-0.205
Token has
Token ID468
Feature activation+5.86e-03

Pos logit contributions
Don+0.011
Thanks+0.011
 Curious+0.011
 Fasc+0.011
 Looking+0.011
Neg logit contributions
els-0.011
ies-0.011
ig-0.010
 again-0.010
 blocks-0.009
Token a
Token ID257
Feature activation+3.75e-02

Pos logit contributions
ies+0.011
ishing+0.011
els+0.011
ins+0.010
ig+0.010
Neg logit contributions
 Curious-0.011
Don-0.010
 Looking-0.010
 Are-0.010
 Fasc-0.010
Token dragon
Token ID10441
Feature activation+1.09e-01

Pos logit contributions
ies+0.065
els+0.063
ishing+0.062
ins+0.058
ig+0.055
Neg logit contributions
 Curious-0.074
Don-0.072
 Looking-0.071
 Are-0.070
 Fasc-0.069
Tokenoc
Token ID420
Feature activation+1.14e-01

Pos logit contributions
ies+0.227
els+0.216
ishing+0.205
ins+0.203
ier+0.200
Neg logit contributions
-0.127
 Curious-0.124
Don-0.121
 Exc-0.120
 Fasc-0.117
Tokenados
Token ID22484
Feature activation+1.01e-01

Pos logit contributions
ies+0.178
ock+0.145
ishing+0.142
ig+0.138
burn+0.137
Neg logit contributions
 This-0.210
 Terr-0.204
 Luckily-0.204
 shouldn-0.203
 Where-0.203
Token and
Token ID290
Feature activation+6.77e-03

Pos logit contributions
els+0.195
ies+0.189
ishing+0.187
ins+0.180
ooming+0.173
Neg logit contributions
-0.175
Don-0.163
 shouldn-0.158
 �-0.155
 Curious-0.154
Token scatter
Token ID41058
Feature activation+8.47e-02

Pos logit contributions
ies+0.013
els+0.013
ishing+0.013
ins+0.012
ig+0.012
Neg logit contributions
 Curious-0.012
Don-0.011
 Are-0.011
 Looking-0.011
 Fasc-0.011
Token them
Token ID606
Feature activation+5.99e-02

Pos logit contributions
ies+0.157
ishing+0.149
ooming+0.145
ig+0.144
els+0.143
Neg logit contributions
 It-0.152
Don-0.146
 Curious-0.146
-0.145
 Exc-0.143
Token around
Token ID1088
Feature activation+3.05e-02

Pos logit contributions
els+0.106
ins+0.103
ishing+0.100
ies+0.099
ig+0.093
Neg logit contributions
-0.106
 hadn-0.102
 Curious-0.099
 shouldn-0.096
Don-0.094
Token his
Token ID465
Feature activation+4.98e-02

Pos logit contributions
ies+0.062
ishing+0.059
ins+0.058
els+0.058
ig+0.057
Neg logit contributions
 It-0.049
 Who-0.049
Don-0.048
 Curious-0.048
 The-0.048
Token farm
Token ID5318
Feature activation+9.48e-02

Pos logit contributions
ier+0.083
els+0.083
ies+0.081
ins+0.081
ock+0.077
Neg logit contributions
 Curious-0.092
 hadn-0.088
 The-0.087
 Are-0.086
 Luckily-0.086
Token.
Token ID13
Feature activation-1.04e-01

Pos logit contributions
els+0.136
ishing+0.130
ies+0.129
ins+0.122
ier+0.121
Neg logit contributions
-0.199
 Curious-0.195
 hadn-0.192
 shouldn-0.191
 The-0.188
Token
Token ID198
Feature activation-9.52e-02

Pos logit contributions
Don+0.094
 Curious+0.089
LL+0.085
 Looking+0.085
 hadn+0.083
Neg logit contributions
els-0.265
ies-0.264
ins-0.254
ishing-0.253
ig-0.246
Token
Token ID198
Feature activation-9.04e-02

Pos logit contributions
 Curious+0.105
 Are+0.105
 Looking+0.101
 Fasc+0.098
Don+0.097
Neg logit contributions
ies-0.226
els-0.224
ins-0.223
ishing-0.211
ig-0.209
Token and
Token ID290
Feature activation-3.54e-02

Pos logit contributions
ies+0.092
ins+0.091
ishing+0.084
els+0.084
ishes+0.079
Neg logit contributions
-0.116
 Curious-0.113
 Looking-0.111
 �-0.111
 Exc-0.108
Token she
Token ID673
Feature activation-9.44e-03

Pos logit contributions
 Curious+0.045
 Looking+0.043
Don+0.043
 Are+0.042
 Fasc+0.041
Neg logit contributions
ies-0.085
els-0.082
ishing-0.081
ins-0.077
ig-0.076
Token started
Token ID2067
Feature activation+1.79e-03

Pos logit contributions
 Curious+0.011
Don+0.011
 Looking+0.010
 Are+0.010
 Fasc+0.010
Neg logit contributions
ies-0.024
els-0.023
ishing-0.023
ins-0.022
ig-0.021
Token to
Token ID284
Feature activation-4.37e-03

Pos logit contributions
ies+0.004
els+0.003
ishing+0.003
ins+0.003
ier+0.003
Neg logit contributions
 Curious-0.003
 Looking-0.003
Don-0.003
 Fasc-0.003
 hardly-0.003
Token go
Token ID467
Feature activation+3.18e-02

Pos logit contributions
 Curious+0.007
Don+0.007
 Looking+0.007
 Fasc+0.007
 Are+0.007
Neg logit contributions
ies-0.009
els-0.009
ishing-0.009
ins-0.009
ier-0.008
Token higher
Token ID2440
Feature activation+8.74e-02

Pos logit contributions
els+0.055
ins+0.053
ies+0.053
ishing+0.052
ities+0.048
Neg logit contributions
 Curious-0.059
-0.058
Don-0.057
 Looking-0.057
 Fasc-0.056
Token and
Token ID290
Feature activation-5.62e-03

Pos logit contributions
ins+0.136
els+0.134
ishing+0.128
ies+0.123
isher+0.122
Neg logit contributions
-0.170
Don-0.160
 �-0.158
 He-0.155
 But-0.154
Token higher
Token ID2440
Feature activation+7.42e-02

Pos logit contributions
 Curious+0.014
 Looking+0.014
Don+0.014
 Fasc+0.013
 Are+0.013
Neg logit contributions
ies-0.007
ishing-0.006
els-0.006
 again-0.005
ins-0.005
Token.
Token ID13
Feature activation-1.04e-01

Pos logit contributions
els+0.105
ins+0.103
ishing+0.098
ies+0.095
isher+0.089
Neg logit contributions
Don-0.152
-0.147
 �-0.146
 Curious-0.145
 Are-0.145
Token Jenna
Token ID45377
Feature activation-6.50e-04

Pos logit contributions
 hadn+0.108
 Skip+0.107
 Are+0.105
Don+0.105
LL+0.102
Neg logit contributions
ies-0.236
els-0.228
ishing-0.214
 again-0.210
ing-0.209
Token laughed
Token ID13818
Feature activation+5.97e-03

Pos logit contributions
 Curious+0.001
Don+0.001
 Looking+0.001
 Are+0.001
 Fasc+0.001
Neg logit contributions
ies-0.002
els-0.002
ishing-0.002
ins-0.001
ig-0.001
Token there
Token ID612
Feature activation+3.46e-02

Pos logit contributions
 Curious+0.024
Don+0.023
 Looking+0.023
 Fasc+0.023
 Terr+0.022
Neg logit contributions
ies-0.023
els-0.022
ishing-0.021
ig-0.020
ins-0.020
Token more
Token ID517
Feature activation+5.54e-02

Pos logit contributions
ies+0.069
els+0.068
ishing+0.067
ins+0.064
ig+0.060
Neg logit contributions
 Curious-0.058
Don-0.056
 Are-0.056
 Looking-0.056
 Fasc-0.054
Token stars
Token ID5788
Feature activation+1.02e-01

Pos logit contributions
ishing+0.105
els+0.098
ies+0.097
ooming+0.095
ins+0.093
Neg logit contributions
 Are-0.099
-0.095
 Curious-0.091
 Who-0.091
 Where-0.089
Token in
Token ID287
Feature activation+1.77e-02

Pos logit contributions
els+0.187
ishing+0.173
ins+0.170
ies+0.169
ries+0.157
Neg logit contributions
-0.174
 Are-0.174
 hadn-0.170
've-0.169
 shouldn-0.168
Token the
Token ID262
Feature activation+2.04e-02

Pos logit contributions
ies+0.045
ishing+0.044
ins+0.043
els+0.042
ier+0.041
Neg logit contributions
 Curious-0.019
 Are-0.019
Don-0.019
 It-0.019
-0.019
Token sky
Token ID6766
Feature activation+6.94e-02

Pos logit contributions
ies+0.043
ins+0.041
els+0.041
ishing+0.041
ries+0.037
Neg logit contributions
 Curious-0.034
 Are-0.034
 Looking-0.033
Don-0.032
-0.032
Token?"
Token ID1701
Feature activation-1.01e-01

Pos logit contributions
els+0.131
ies+0.126
ishing+0.123
ins+0.121
ig+0.112
Neg logit contributions
 Are-0.117
-0.115
 Curious-0.115
Don-0.115
 It-0.112
Token Ben
Token ID3932
Feature activation-1.40e-02

Pos logit contributions
 Curious+0.144
 Looking+0.140
Don+0.139
LL+0.137
 shouldn+0.136
Neg logit contributions
ies-0.213
els-0.213
ishing-0.190
 again-0.188
ie-0.186
Token wondered
Token ID14028
Feature activation-7.65e-03

Pos logit contributions
 Curious+0.014
Don+0.014
 Looking+0.013
 Are+0.013
 Fasc+0.013
Neg logit contributions
ies-0.038
els-0.036
ishing-0.036
ins-0.035
ig-0.033
Token.
Token ID13
Feature activation-8.81e-02

Pos logit contributions
 �+0.016
Don+0.016
 hadn+0.016
 Curious+0.016
 Are+0.016
Neg logit contributions
els-0.011
ishing-0.011
ins-0.011
ies-0.011
ishes-0.010
Token
Token ID198
Feature activation-9.61e-02

Pos logit contributions
 shouldn+0.103
 hardly+0.101
LL+0.100
 hadn+0.099
Don+0.094
Neg logit contributions
ies-0.194
els-0.188
 again-0.187
ins-0.183
er-0.178
Token love
Token ID1842
Feature activation+1.25e-02

Pos logit contributions
 Curious+0.016
Don+0.015
 Looking+0.015
 Are+0.015
 Fasc+0.014
Neg logit contributions
ies-0.025
els-0.024
ishing-0.023
ins-0.022
ig-0.022
Token our
Token ID674
Feature activation+4.65e-02

Pos logit contributions
ies+0.029
ishing+0.028
els+0.028
ier+0.027
ins+0.027
Neg logit contributions
 Are-0.017
 It-0.016
 Who-0.016
 Where-0.016
Don-0.016
Token train
Token ID4512
Feature activation+1.22e-01

Pos logit contributions
els+0.083
ins+0.083
ier+0.079
ishing+0.077
ies+0.076
Neg logit contributions
 hadn-0.083
LL-0.079
 shouldn-0.079
 Curious-0.078
 Thanks-0.078
Token and
Token ID290
Feature activation-6.42e-03

Pos logit contributions
els+0.244
ishing+0.222
ins+0.211
ies+0.209
ries+0.198
Neg logit contributions
-0.225
 Thanks-0.206
 shouldn-0.205
 Have-0.205
 hadn-0.204
Token we
Token ID356
Feature activation+4.29e-03

Pos logit contributions
 Curious+0.012
 Looking+0.011
Don+0.011
 Fasc+0.011
 Exc+0.011
Neg logit contributions
ies-0.013
els-0.012
ishing-0.011
ins-0.011
ig-0.010
Token love
Token ID1842
Feature activation+2.80e-02

Pos logit contributions
ies+0.010
els+0.009
ishing+0.009
ins+0.009
ig+0.008
Neg logit contributions
 Curious-0.006
Don-0.006
 Looking-0.006
 Are-0.006
 Fasc-0.006
Token each
Token ID1123
Feature activation+4.51e-02

Pos logit contributions
ies+0.067
ishing+0.066
ins+0.065
ig+0.064
els+0.064
Neg logit contributions
 It-0.035
-0.034
 The-0.033
 Who-0.033
 Are-0.032
Token other
Token ID584
Feature activation+6.80e-02

Pos logit contributions
ishing+0.105
els+0.103
ig+0.097
ies+0.097
ooming+0.092
Neg logit contributions
 The-0.066
-0.066
 Have-0.062
 It-0.062
 Who-0.062
Token!"
Token ID2474
Feature activation-1.10e-01

Pos logit contributions
els+0.141
ishing+0.126
ries+0.117
ies+0.115
ig+0.115
Neg logit contributions
-0.134
 relieved-0.119
 It-0.115
 hadn-0.113
 shouldn-0.111
Token Their
Token ID5334
Feature activation-1.05e-02

Pos logit contributions
 Curious+0.097
Don+0.092
 Looking+0.090
 Are+0.086
 Fasc+0.084
Neg logit contributions
ies-0.312
els-0.304
ishing-0.295
ins-0.290
ig-0.280
Token mom
Token ID1995
Feature activation-3.66e-03

Pos logit contributions
 Curious+0.019
 Looking+0.018
Don+0.018
 Are+0.018
 hadn+0.018
Neg logit contributions
ishing-0.019
ies-0.019
els-0.019
ins-0.018
ier-0.018
Token Lily
Token ID20037
Feature activation-1.24e-02

Pos logit contributions
 shouldn+0.135
Don+0.132
 hadn+0.130
LL+0.126
 coy+0.126
Neg logit contributions
ies-0.208
 again-0.205
ins-0.200
els-0.199
er-0.198
Token's
Token ID338
Feature activation+1.20e-02

Pos logit contributions
 Where+0.019
 Are+0.019
Don+0.019
 Curious+0.019
 Looking+0.019
Neg logit contributions
ies-0.027
ishing-0.025
els-0.025
ins-0.024
ig-0.024
Token mom
Token ID1995
Feature activation+1.24e-02

Pos logit contributions
els+0.023
ins+0.023
ishing+0.023
ies+0.022
ier+0.022
Neg logit contributions
 Curious-0.022
 Looking-0.020
 hadn-0.020
Don-0.020
 Are-0.020
Token was
Token ID373
Feature activation-1.63e-02

Pos logit contributions
ies+0.031
els+0.029
ishing+0.029
ins+0.028
ig+0.027
Neg logit contributions
 Curious-0.015
Don-0.015
 Looking-0.015
 Are-0.014
 Fasc-0.014
Token very
Token ID845
Feature activation-2.03e-02

Pos logit contributions
 Curious+0.027
Don+0.027
 Looking+0.026
 Are+0.026
 Fasc+0.025
Neg logit contributions
ies-0.033
els-0.032
ishing-0.031
ins-0.030
ier-0.028
Token flexible
Token ID12846
Feature activation+7.27e-02

Pos logit contributions
 Curious+0.033
 Looking+0.032
Don+0.032
 Are+0.031
 Where+0.031
Neg logit contributions
ies-0.046
ins-0.041
ishing-0.040
els-0.039
ier-0.037
Token and
Token ID290
Feature activation-3.04e-02

Pos logit contributions
els+0.125
ins+0.117
ies+0.111
ishing+0.111
ries+0.103
Neg logit contributions
-0.141
 hadn-0.139
 shouldn-0.135
 Curious-0.131
 Are-0.130
Token always
Token ID1464
Feature activation+1.35e-02

Pos logit contributions
 Curious+0.045
 Looking+0.045
 Fasc+0.042
 Terr+0.042
 Are+0.042
Neg logit contributions
ies-0.070
ishing-0.062
els-0.061
ins-0.060
ie-0.058
Token knew
Token ID2993
Feature activation-1.26e-02

Pos logit contributions
ies+0.033
ishing+0.032
els+0.031
ins+0.030
ig+0.029
Neg logit contributions
 Curious-0.017
 Looking-0.016
Don-0.016
 Fasc-0.016
 Are-0.015
Token what
Token ID644
Feature activation-1.53e-02

Pos logit contributions
 hadn+0.019
Don+0.019
 shouldn+0.018
 Curious+0.018
 Looking+0.018
Neg logit contributions
ishing-0.026
ies-0.026
els-0.026
ins-0.025
ooming-0.024
Token to
Token ID284
Feature activation+2.30e-02

Pos logit contributions
 Curious+0.029
 Looking+0.028
Don+0.028
 Are+0.027
 Fasc+0.027
Neg logit contributions
ies-0.028
els-0.027
ishing-0.026
ins-0.025
ig-0.023
Token sad
Token ID6507
Feature activation+1.05e-02

Pos logit contributions
 Curious+0.060
Don+0.060
 Are+0.058
 Looking+0.058
 Who+0.057
Neg logit contributions
ies-0.086
els-0.084
ishing-0.082
ins-0.081
ier-0.076
Token and
Token ID290
Feature activation-5.25e-02

Pos logit contributions
els+0.021
ishing+0.019
ators+0.019
isher+0.019
ins+0.019
Neg logit contributions
 He-0.020
 But-0.020
 Are-0.019
 It-0.019
 She-0.019
Token angry
Token ID7954
Feature activation+1.39e-02

Pos logit contributions
 Where+0.091
 Curious+0.091
 Looking+0.090
 Who+0.090
 Fasc+0.090
Neg logit contributions
ies-0.103
ishing-0.087
els-0.087
ie-0.084
 again-0.084
Token.
Token ID13
Feature activation-9.34e-02

Pos logit contributions
els+0.026
ishing+0.026
ins+0.024
ig+0.024
ishes+0.024
Neg logit contributions
 It-0.028
 He-0.027
 Are-0.027
 Where-0.026
 But-0.026
Token She
Token ID1375
Feature activation-3.98e-02

Pos logit contributions
 hadn+0.088
 hardly+0.088
've+0.085
 Curious+0.084
Don+0.082
Neg logit contributions
els-0.234
ies-0.231
ishing-0.214
ins-0.211
 again-0.210
Token did
Token ID750
Feature activation-3.67e-02

Pos logit contributions
 Where+0.062
 Fasc+0.059
 Have+0.059
 Thanks+0.059
 This+0.059
Neg logit contributions
ies-0.082
els-0.075
--0.069
ishing-0.069
ins-0.068
Token not
Token ID407
Feature activation-1.99e-02

Pos logit contributions
 Curious+0.055
 Are+0.053
 Looking+0.052
Don+0.051
 Fasc+0.051
Neg logit contributions
ies-0.080
els-0.077
ishing-0.073
 again-0.071
ins-0.071
Token want
Token ID765
Feature activation+7.50e-04

Pos logit contributions
 Curious+0.021
Don+0.021
 Are+0.020
 Looking+0.020
 Who+0.019
Neg logit contributions
els-0.053
ishing-0.051
ies-0.051
ins-0.049
ig-0.049
Token to
Token ID284
Feature activation+3.20e-03

Pos logit contributions
ies+0.002
ishing+0.002
ins+0.002
els+0.002
ig+0.002
Neg logit contributions
 It-0.001
 Where-0.001
 Who-0.001
Don-0.001
 Curious-0.001
Token put
Token ID1234
Feature activation+7.45e-02

Pos logit contributions
ishing+0.008
els+0.008
ins+0.008
ies+0.008
ig+0.007
Neg logit contributions
 Are-0.004
 Curious-0.004
 Who-0.004
Don-0.004
 hardly-0.004
Token her
Token ID607
Feature activation+5.86e-02

Pos logit contributions
ies+0.196
ins+0.189
ishing+0.186
ooming+0.185
ologist+0.181
Neg logit contributions
 It-0.082
 The-0.074
 This-0.064
 Is-0.062
 Where-0.060
Token eyes
Token ID2951
Feature activation+6.07e-02

Pos logit contributions
ins+0.079
ies+0.078
els+0.074
ishing+0.074
ier+0.071
Neg logit contributions
 Curious-0.105
 Looking-0.103
 It-0.097
 Terr-0.097
 Something-0.097
Token and
Token ID290
Feature activation-2.74e-02

Pos logit contributions
ins+0.108
els+0.106
ies+0.104
ishing+0.098
ishes+0.094
Neg logit contributions
 Curious-0.132
-0.131
 Looking-0.125
 �-0.124
Don-0.124
Token soon
Token ID2582
Feature activation-1.52e-02

Pos logit contributions
 Curious+0.037
 Looking+0.036
Don+0.035
 Are+0.035
 Fasc+0.035
Neg logit contributions
ies-0.064
els-0.060
ishing-0.060
ins-0.056
ig-0.056
Token,
Token ID11
Feature activation-3.80e-02

Pos logit contributions
 Curious+0.022
Don+0.022
 Looking+0.022
 Are+0.021
 Fasc+0.021
Neg logit contributions
ies-0.035
els-0.032
ishing-0.032
ins-0.030
ig-0.030
Token she
Token ID673
Feature activation-1.05e-02

Pos logit contributions
 Curious+0.045
Don+0.043
 Looking+0.043
 Are+0.041
 Fasc+0.040
Neg logit contributions
ies-0.096
els-0.094
ishing-0.092
ins-0.089
ig-0.086
Token was
Token ID373
Feature activation-1.38e-02

Pos logit contributions
 Curious+0.012
Don+0.012
 Looking+0.012
 Are+0.012
 Fasc+0.011
Neg logit contributions
ies-0.027
els-0.026
ishing-0.025
ins-0.024
ig-0.024
Token dreaming
Token ID28368
Feature activation+2.53e-02

Pos logit contributions
 Curious+0.023
 Looking+0.023
Don+0.022
 Are+0.022
 Fasc+0.022
Neg logit contributions
ies-0.028
els-0.028
ishing-0.027
ins-0.026
ig-0.024
Token.
Token ID13
Feature activation-1.00e-01

Pos logit contributions
els+0.052
ies+0.048
ishing+0.048
ins+0.046
ishes+0.045
Neg logit contributions
 Curious-0.044
 Looking-0.043
Don-0.043
 hadn-0.042
 It-0.041
Token
Token ID198
Feature activation-9.72e-02

Pos logit contributions
LL+0.098
Don+0.093
 hadn+0.086
 Curious+0.086
Thanks+0.084
Neg logit contributions
els-0.243
ies-0.239
ishing-0.225
ins-0.218
 again-0.217
Token
Token ID198
Feature activation-9.10e-02

Pos logit contributions
 Are+0.120
 Curious+0.119
LL+0.118
 Fasc+0.114
Don+0.113
Neg logit contributions
ies-0.212
els-0.210
ins-0.199
ie-0.195
 again-0.195
TokenIn
Token ID818
Feature activation-1.45e-02

Pos logit contributions
 Curious+0.122
 Are+0.121
 Looking+0.118
LL+0.116
 Fasc+0.116
Neg logit contributions
ies-0.195
els-0.193
ins-0.180
ishing-0.179
ie-0.173
Token She
Token ID1375
Feature activation-1.96e-02

Pos logit contributions
Don+0.091
 hadn+0.086
 shouldn+0.083
've+0.081
LL+0.080
Neg logit contributions
ies-0.259
els-0.250
ishing-0.244
er-0.242
 again-0.242
Token closed
Token ID4838
Feature activation+3.91e-02

Pos logit contributions
 Curious+0.024
 Looking+0.023
Don+0.023
 Are+0.023
 Fasc+0.022
Neg logit contributions
ies-0.049
els-0.047
ishing-0.046
ins-0.044
ig-0.044
Token her
Token ID607
Feature activation+4.46e-02

Pos logit contributions
ies+0.085
ins+0.082
els+0.080
ishing+0.080
ier+0.075
Neg logit contributions
 Curious-0.059
 Looking-0.058
Don-0.057
 Fasc-0.057
 Exc-0.054
Token eyes
Token ID2951
Feature activation+6.20e-02

Pos logit contributions
ins+0.080
ier+0.078
ishing+0.074
ies+0.072
ock+0.070
Neg logit contributions
 Curious-0.088
 Looking-0.086
 Exc-0.082
 It-0.081
 Terr-0.081
Token and
Token ID290
Feature activation-2.20e-02

Pos logit contributions
ins+0.127
els+0.119
ies+0.118
ishes+0.111
ishing+0.110
Neg logit contributions
 Curious-0.125
-0.123
 Suddenly-0.118
 �-0.117
 It-0.117
Token imagined
Token ID15758
Feature activation+3.29e-03

Pos logit contributions
 Curious+0.029
 Looking+0.028
Don+0.028
 Are+0.027
 Fasc+0.027
Neg logit contributions
ies-0.052
els-0.050
ishing-0.049
ins-0.047
ig-0.046
Token she
Token ID673
Feature activation+2.06e-02

Pos logit contributions
ies+0.008
ishing+0.008
els+0.008
ins+0.007
ish+0.007
Neg logit contributions
 Curious-0.004
Don-0.004
 Looking-0.004
 hadn-0.004
 Are-0.004
Token was
Token ID373
Feature activation+6.28e-03

Pos logit contributions
ies+0.054
els+0.052
ishing+0.051
ins+0.050
ig+0.048
Neg logit contributions
 Curious-0.023
Don-0.022
 Looking-0.022
 Are-0.021
 Fasc-0.020
Token flying
Token ID7348
Feature activation+8.58e-02

Pos logit contributions
ies+0.012
els+0.012
ishing+0.012
ins+0.011
ig+0.011
Neg logit contributions
 Curious-0.011
Don-0.011
 Looking-0.010
 Are-0.010
 Fasc-0.010
Token through
Token ID832
Feature activation+1.50e-02

Pos logit contributions
els+0.157
ies+0.156
ins+0.151
ishing+0.149
ries+0.135
Neg logit contributions
 Curious-0.161
Don-0.160
 Looking-0.154
 Fasc-0.152
 It-0.147
Token the
Token ID262
Feature activation+3.51e-02

Pos logit contributions
ies+0.043
ishing+0.042
ins+0.042
els+0.040
ish+0.040
Neg logit contributions
 Curious-0.014
Don-0.013
 It-0.013
 Looking-0.013
 The-0.013
Token artwork
Token ID16257
Feature activation+8.27e-02

Pos logit contributions
els+0.041
ishing+0.040
ins+0.040
ies+0.040
ier+0.039
Neg logit contributions
 Curious-0.035
 Looking-0.034
 hadn-0.034
 Who-0.033
 Fasc-0.033
Token.
Token ID13
Feature activation-1.11e-01

Pos logit contributions
els+0.136
ishing+0.135
ies+0.124
ins+0.122
ishes+0.113
Neg logit contributions
-0.173
 hadn-0.170
 Thanks-0.166
 shouldn-0.163
 hardly-0.163
Token From
Token ID3574
Feature activation-1.83e-02

Pos logit contributions
Don+0.134
 Curious+0.124
LL+0.120
 Fasc+0.118
 hadn+0.115
Neg logit contributions
els-0.242
ies-0.233
ins-0.224
er-0.219
 again-0.219
Token that
Token ID326
Feature activation-2.86e-02

Pos logit contributions
 Curious+0.034
 Looking+0.033
Don+0.033
 Are+0.033
 Fasc+0.032
Neg logit contributions
ies-0.034
els-0.033
ishing-0.032
ins-0.030
ig-0.029
Token day
Token ID1110
Feature activation-3.26e-02

Pos logit contributions
 Curious+0.052
Don+0.052
 Are+0.051
 Looking+0.051
 hadn+0.049
Neg logit contributions
ies-0.053
els-0.053
ishing-0.051
ins-0.050
ries-0.046
Token on
Token ID319
Feature activation-2.80e-02

Pos logit contributions
 Curious+0.073
 Are+0.073
Don+0.073
 Looking+0.072
+0.072
Neg logit contributions
ies-0.046
ishing-0.044
els-0.043
ins-0.041
ries-0.035
Token,
Token ID11
Feature activation-7.12e-02

Pos logit contributions
 Curious+0.069
Don+0.068
 Looking+0.067
 Are+0.067
 Fasc+0.066
Neg logit contributions
ies-0.035
els-0.033
ishing-0.032
ins-0.030
ig-0.027
Token they
Token ID484
Feature activation-3.18e-02

Pos logit contributions
 Curious+0.087
Don+0.084
 Looking+0.084
 Are+0.081
 Fasc+0.079
Neg logit contributions
ies-0.176
els-0.171
ishing-0.168
ins-0.162
ig-0.157
Token were
Token ID547
Feature activation-2.55e-02

Pos logit contributions
 Curious+0.044
Don+0.042
 Looking+0.042
 Are+0.041
 Fasc+0.040
Neg logit contributions
ies-0.074
els-0.072
ishing-0.071
ins-0.068
ig-0.065
Token good
Token ID922
Feature activation+2.53e-02

Pos logit contributions
 Curious+0.043
Don+0.042
 Looking+0.041
 Are+0.040
 Fasc+0.040
Neg logit contributions
ies-0.052
els-0.050
ishing-0.049
ins-0.047
ig-0.045
Token friends
Token ID2460
Feature activation+2.24e-02

Pos logit contributions
els+0.047
ist+0.041
ishing+0.040
ock+0.040
pan+0.039
Neg logit contributions
 Thanks-0.039
 Thank-0.037
 shouldn-0.036
 From-0.036
Don-0.035
Token learned
Token ID4499
Feature activation-1.80e-02

Pos logit contributions
ies+0.012
els+0.012
ishing+0.011
ins+0.011
ig+0.011
Neg logit contributions
 Curious-0.008
Don-0.008
 Looking-0.008
 Are-0.008
 Fasc-0.007
Token an
Token ID281
Feature activation+1.56e-02

Pos logit contributions
 Curious+0.031
Don+0.030
 Looking+0.030
 Are+0.030
 Fasc+0.029
Neg logit contributions
ies-0.035
els-0.034
ishing-0.033
ins-0.032
ig-0.031
Token important
Token ID1593
Feature activation+3.99e-02

Pos logit contributions
ies+0.038
els+0.038
ishing+0.037
ins+0.036
ooming+0.035
Neg logit contributions
 Curious-0.019
 Looking-0.018
 shouldn-0.018
 Fasc-0.017
Don-0.017
Token lesson
Token ID11483
Feature activation+1.62e-02

Pos logit contributions
ies+0.076
els+0.073
ishing+0.072
ins+0.068
ig+0.065
Neg logit contributions
 Curious-0.071
Don-0.070
 Looking-0.069
 Are-0.068
 Fasc-0.067
Token that
Token ID326
Feature activation+2.81e-04

Pos logit contributions
ies+0.031
els+0.030
ishing+0.030
ins+0.028
ig+0.026
Neg logit contributions
 Curious-0.029
Don-0.029
 Looking-0.028
 Are-0.028
 shouldn-0.027
Token day
Token ID1110
Feature activation+2.24e-03

Pos logit contributions
ies+0.000
els+0.000
ishing+0.000
ins+0.000
ig+0.000
Neg logit contributions
 Curious-0.001
Don-0.001
 Are-0.001
 Looking-0.001
 Fasc-0.001
Token.
Token ID13
Feature activation-8.81e-02

Pos logit contributions
ies+0.004
els+0.003
ishing+0.003
ins+0.003
ig+0.003
Neg logit contributions
 Curious-0.005
Don-0.005
 Looking-0.005
 Are-0.005
 Fasc-0.005
Token Even
Token ID3412
Feature activation-3.96e-04

Pos logit contributions
Don+0.065
 Curious+0.063
 Fasc+0.058
LL+0.056
 Looking+0.055
Neg logit contributions
els-0.235
ies-0.232
ishing-0.229
ins-0.226
er-0.221
Token if
Token ID611
Feature activation-2.54e-02

Pos logit contributions
Don+0.001
 Are+0.001
 Curious+0.001
 Looking+0.001
 Fasc+0.001
Neg logit contributions
els-0.001
ies-0.001
ishing-0.001
ins-0.001
ig-0.001
Token it
Token ID340
Feature activation+8.71e-03

Pos logit contributions
 Curious+0.037
 Looking+0.036
Don+0.035
 Fasc+0.034
 Are+0.034
Neg logit contributions
ies-0.057
els-0.054
ishing-0.053
ins-0.051
ig-0.050
Token's
Token ID338
Feature activation+1.76e-02

Pos logit contributions
ies+0.020
els+0.020
ishing+0.019
ins+0.019
ig+0.018
Neg logit contributions
 Curious-0.012
Don-0.011
 Looking-0.011
 Are-0.011
 Fasc-0.011
Token they
Token ID484
Feature activation-3.98e-03

Pos logit contributions
Don+0.012
 Curious+0.012
 Are+0.011
 It+0.011
 Looking+0.011
Neg logit contributions
ishing-0.033
els-0.033
ies-0.032
ig-0.030
ins-0.030
Token loved
Token ID6151
Feature activation+9.06e-03

Pos logit contributions
 Curious+0.005
 Looking+0.005
Don+0.005
 Are+0.005
 Fasc+0.004
Neg logit contributions
ies-0.010
els-0.010
ishing-0.009
ins-0.009
ig-0.009
Token her
Token ID607
Feature activation+6.45e-02

Pos logit contributions
ishing+0.018
ig+0.018
ooming+0.017
ies+0.017
els+0.017
Neg logit contributions
Don-0.015
 It-0.014
 Who-0.014
 Where-0.014
 Curious-0.013
Token.
Token ID13
Feature activation-9.58e-02

Pos logit contributions
els+0.129
ishing+0.125
ig+0.115
ooming+0.112
ins+0.111
Neg logit contributions
-0.114
 relieved-0.114
 Curious-0.111
 shouldn-0.110
 The-0.109
Token\n
Token ID198
Feature activation-9.82e-02

Pos logit contributions
 Curious+0.113
 hadn+0.111
 Looking+0.109
 shouldn+0.108
LL+0.107
Neg logit contributions
ies-0.221
els-0.210
ins-0.203
 again-0.195
ishing-0.194
Token\n
Token ID198
Feature activation-9.90e-02

Pos logit contributions
 Curious+0.120
 Looking+0.114
 Fasc+0.111
 Are+0.111
Don+0.111
Neg logit contributions
ies-0.239
els-0.230
ins-0.222
ishing-0.212
 again-0.210
TokenMom
Token ID29252
Feature activation-1.02e-02

Pos logit contributions
 Curious+0.178
 Looking+0.173
 Fasc+0.168
Don+0.166
 Are+0.166
Neg logit contributions
ies-0.194
els-0.184
ishing-0.177
ins-0.174
ig-0.163
Token smiled
Token ID13541
Feature activation+4.61e-03

Pos logit contributions
 Curious+0.012
Don+0.011
 Looking+0.011
 hadn+0.011
 Exc+0.010
Neg logit contributions
els-0.026
ishing-0.026
ies-0.026
ins-0.024
ig-0.024
Token.
Token ID13
Feature activation-8.94e-02

Pos logit contributions
els+0.008
ishing+0.007
ries+0.007
ig+0.007
isher+0.007
Neg logit contributions
Don-0.010
 Curious-0.010
 He-0.010
 It-0.010
 relieved-0.009
Token She
Token ID1375
Feature activation-1.64e-02

Pos logit contributions
 hardly+0.090
LL+0.088
 hadn+0.087
 Curious+0.083
've+0.082
Neg logit contributions
ies-0.216
els-0.210
ins-0.203
 again-0.201
er-0.200
Token said
Token ID531
Feature activation-2.52e-02

Pos logit contributions
 Curious+0.024
 Where+0.024
 Are+0.023
 Looking+0.023
 Fasc+0.023
Neg logit contributions
ies-0.036
ins-0.033
els-0.033
ishing-0.032
 again-0.032
Token upon
Token ID2402
Feature activation-8.65e-02

Pos logit contributions
 Curious+0.128
Don+0.125
 Looking+0.125
 Are+0.122
 Fasc+0.120
Neg logit contributions
ies-0.147
els-0.133
ishing-0.129
ie-0.125
ins-0.124
Token a
Token ID257
Feature activation-5.61e-02

Pos logit contributions
 Curious+0.155
 Are+0.155
 Looking+0.154
Don+0.152
 Fasc+0.149
Neg logit contributions
ies-0.153
els-0.141
ishing-0.136
ins-0.135
 again-0.128
Token time
Token ID640
Feature activation-5.16e-02

Pos logit contributions
 Curious+0.113
Don+0.110
 Looking+0.109
 Are+0.108
 Fasc+0.106
Neg logit contributions
ies-0.095
els-0.091
ishing-0.089
ins-0.083
ig-0.080
Token,
Token ID11
Feature activation-8.04e-02

Pos logit contributions
 Curious+0.136
Don+0.135
 Looking+0.133
 Are+0.130
 Fasc+0.130
Neg logit contributions
ies-0.055
els-0.049
ishing-0.046
ig-0.042
ins-0.042
Token there
Token ID612
Feature activation-4.00e-02

Pos logit contributions
 Curious+0.171
 Looking+0.168
Don+0.168
 Fasc+0.164
 Terr+0.164
Neg logit contributions
ies-0.124
els-0.114
ishing-0.108
ig-0.103
ie-0.102
Token was
Token ID373
Feature activation-6.52e-02

Pos logit contributions
 Curious+0.073
Don+0.070
 Terr+0.069
 Looking+0.069
 Fasc+0.068
Neg logit contributions
ies-0.072
els-0.068
ie-0.068
 again-0.066
ig-0.063
Token a
Token ID257
Feature activation-2.13e-02

Pos logit contributions
 Curious+0.114
 Are+0.112
Don+0.112
 Looking+0.111
 Who+0.108
Neg logit contributions
ies-0.122
els-0.118
ishing-0.114
ins-0.109
ig-0.104
Token phone
Token ID3072
Feature activation+8.29e-02

Pos logit contributions
 Curious+0.038
Don+0.037
 Looking+0.037
 Are+0.037
 Fasc+0.036
Neg logit contributions
ies-0.041
els-0.039
ishing-0.038
ins-0.036
ig-0.035
Token.
Token ID13
Feature activation-9.07e-02

Pos logit contributions
ishing+0.155
ies+0.149
els+0.147
ins+0.146
ig+0.132
Neg logit contributions
 Are-0.152
 Curious-0.152
 Who-0.151
 Where-0.150
 Looking-0.148
Token It
Token ID632
Feature activation-1.43e-03

Pos logit contributions
Don+0.091
 hadn+0.079
Ag+0.075
Thanks+0.072
 shouldn+0.072
Neg logit contributions
els-0.225
ies-0.212
er-0.208
ries-0.205
--0.203
Token liked
Token ID8288
Feature activation-1.11e-02

Pos logit contributions
 Curious+0.002
 Fasc+0.002
Don+0.002
 Looking+0.002
 Something+0.002
Neg logit contributions
ies-0.003
els-0.002
--0.002
ie-0.002
er-0.002
Token's
Token ID338
Feature activation+3.94e-02

Pos logit contributions
els+0.126
ies+0.125
ishing+0.124
ins+0.116
ig+0.115
Neg logit contributions
 shouldn-0.080
Don-0.079
 Curious-0.078
 Are-0.078
 hadn-0.076
Token good
Token ID922
Feature activation+6.21e-02

Pos logit contributions
els+0.097
ooming+0.092
ishing+0.091
ig+0.089
ins+0.089
Neg logit contributions
 shouldn-0.050
 hadn-0.046
Don-0.046
 Are-0.041
-0.041
Token for
Token ID329
Feature activation+2.00e-02

Pos logit contributions
els+0.129
ishing+0.118
ooming+0.117
ig+0.115
ies+0.114
Neg logit contributions
 shouldn-0.106
 It-0.099
 Where-0.098
 Are-0.098
 hadn-0.096
Token you
Token ID345
Feature activation+4.92e-02

Pos logit contributions
els+0.044
ishing+0.044
ies+0.044
ig+0.043
ins+0.042
Neg logit contributions
Don-0.025
 hadn-0.025
 shouldn-0.024
 Are-0.024
 It-0.023
Token.
Token ID13
Feature activation-1.03e-01

Pos logit contributions
els+0.072
ishing+0.071
ies+0.065
ins+0.064
ig+0.062
Neg logit contributions
 shouldn-0.112
-0.111
 hadn-0.108
Don-0.107
 Curious-0.106
Token\n
Token ID198
Feature activation-1.05e-01

Pos logit contributions
 hardly+0.140
 barely+0.135
 Curious+0.134
Jump+0.131
 Fasc+0.129
Neg logit contributions
ies-0.205
 again-0.201
els-0.193
ins-0.189
ie-0.189
Token\n
Token ID198
Feature activation-9.80e-02

Pos logit contributions
 Are+0.162
 Curious+0.162
 Fasc+0.153
 Looking+0.152
Don+0.151
Neg logit contributions
ies-0.209
els-0.196
ie-0.194
ins-0.192
 again-0.188
TokenFrom
Token ID4863
Feature activation-1.65e-02

Pos logit contributions
 Curious+0.114
 Are+0.113
 Where+0.106
 Looking+0.106
 Fasc+0.102
Neg logit contributions
ies-0.237
els-0.227
ins-0.224
ie-0.222
ock-0.222
Token that
Token ID326
Feature activation-1.79e-02

Pos logit contributions
 Curious+0.031
 Looking+0.030
Don+0.030
 Are+0.029
 Fasc+0.029
Neg logit contributions
ies-0.031
els-0.030
ishing-0.029
ins-0.027
ig-0.026
Token day
Token ID1110
Feature activation-3.19e-02

Pos logit contributions
 Curious+0.035
Don+0.034
 Looking+0.034
 Are+0.033
 Fasc+0.032
Neg logit contributions
ies-0.032
els-0.032
ishing-0.031
ins-0.029
ries-0.027
Token on
Token ID319
Feature activation-2.47e-02

Pos logit contributions
 Curious+0.079
Don+0.078
 Looking+0.078
 Are+0.077
 Fasc+0.076
Neg logit contributions
ies-0.039
els-0.037
ishing-0.035
ins-0.033
ries-0.029
Token more
Token ID517
Feature activation+6.52e-02

Pos logit contributions
ies+0.021
els+0.021
ishing+0.020
ish+0.020
ins+0.019
Neg logit contributions
 hardly-0.017
 Who-0.017
Don-0.017
 Is-0.016
 Are-0.016
Token juice
Token ID13135
Feature activation+1.04e-01

Pos logit contributions
ishing+0.117
els+0.109
ooming+0.108
ies+0.104
burn+0.103
Neg logit contributions
 hardly-0.110
 Who-0.109
 Curious-0.107
 Are-0.107
 Is-0.105
Token.
Token ID13
Feature activation-1.08e-01

Pos logit contributions
els+0.186
ishing+0.177
ies+0.165
ins+0.160
ooming+0.159
Neg logit contributions
-0.184
 hadn-0.178
 shouldn-0.177
 He-0.176
Don-0.175
Token\n
Token ID198
Feature activation-1.00e-01

Pos logit contributions
Don+0.116
 Curious+0.112
 shouldn+0.109
 hadn+0.109
 Are+0.104
Neg logit contributions
els-0.242
ies-0.239
 again-0.236
ishing-0.236
ins-0.232
Token\n
Token ID198
Feature activation-9.08e-02

Pos logit contributions
 Curious+0.132
 Are+0.128
Don+0.126
 hadn+0.124
 Looking+0.123
Neg logit contributions
ies-0.214
els-0.213
ins-0.208
 again-0.203
--0.198
TokenBut
Token ID1537
Feature activation-6.29e-02

Pos logit contributions
 Curious+0.123
 Looking+0.117
 Are+0.115
 Terr+0.113
 Fasc+0.112
Neg logit contributions
ies-0.205
els-0.199
ishing-0.197
ins-0.193
ig-0.181
Token Tommy
Token ID19919
Feature activation+6.52e-03

Pos logit contributions
 Curious+0.104
Don+0.102
 Looking+0.101
 Are+0.101
 Fasc+0.097
Neg logit contributions
ies-0.129
els-0.125
ishing-0.122
ins-0.119
ig-0.112
Token did
Token ID750
Feature activation-3.30e-02

Pos logit contributions
ies+0.017
els+0.016
ishing+0.016
ins+0.015
ig+0.015
Neg logit contributions
 Curious-0.007
Don-0.007
 Looking-0.007
 Are-0.007
 Fasc-0.007
Token not
Token ID407
Feature activation-3.68e-03

Pos logit contributions
 Curious+0.041
Don+0.040
 Looking+0.039
 Are+0.038
 Fasc+0.038
Neg logit contributions
ies-0.081
els-0.079
ishing-0.078
ins-0.075
ig-0.072
Token know
Token ID760
Feature activation-5.58e-03

Pos logit contributions
 Curious+0.004
 Are+0.004
 Who+0.004
Don+0.004
 Looking+0.004
Neg logit contributions
els-0.009
ishing-0.009
ies-0.009
ins-0.009
ries-0.009
Token that
Token ID326
Feature activation+8.89e-03

Pos logit contributions
 hadn+0.007
 Is+0.007
 shouldn+0.007
+0.007
Don+0.007
Neg logit contributions
ishing-0.013
ies-0.012
ins-0.012
ooming-0.012
els-0.012
Token the
Token ID262
Feature activation+2.05e-02

Pos logit contributions
ies+0.039
els+0.037
ishing+0.037
ins+0.036
ig+0.034
Neg logit contributions
 Curious-0.019
Don-0.018
 Looking-0.018
 Are-0.018
 Who-0.018
Token mill
Token ID3939
Feature activation+1.13e-01

Pos logit contributions
ies+0.040
els+0.038
ishing+0.037
ins+0.036
ig+0.033
Neg logit contributions
 Curious-0.037
Don-0.036
 Looking-0.035
 Fasc-0.035
 Are-0.035
Token again
Token ID757
Feature activation+4.20e-02

Pos logit contributions
els+0.166
ies+0.148
ins+0.139
ishing+0.132
ishes+0.128
Neg logit contributions
-0.267
 Curious-0.260
 hadn-0.255
 Unfortunately-0.254
Don-0.254
Token.
Token ID13
Feature activation-9.06e-02

Pos logit contributions
els+0.061
ies+0.060
ins+0.056
ishing+0.056
isher+0.049
Neg logit contributions
-0.094
 Curious-0.091
Don-0.091
 Are-0.091
 hadn-0.090
Token<|endoftext|>
Token ID50256
Feature activation-4.29e-02

Pos logit contributions
Don+0.102
Jump+0.095
Copy+0.093
 Fasc+0.093
Grab+0.093
Neg logit contributions
 again-0.187
els-0.182
 ways-0.173
ies-0.172
ig-0.168
TokenOnce
Token ID7454
Feature activation-6.71e-02

Pos logit contributions
 Curious+0.052
Don+0.051
 Looking+0.049
 Are+0.049
 hardly+0.049
Neg logit contributions
ies-0.099
els-0.097
ins-0.093
ishing-0.093
 again-0.091
Token upon
Token ID2402
Feature activation-7.99e-02

Pos logit contributions
 Curious+0.121
Don+0.119
 Looking+0.118
 Are+0.116
 Fasc+0.113
Neg logit contributions
ies-0.136
els-0.123
ishing-0.119
ie-0.116
ig-0.115
Token a
Token ID257
Feature activation-5.45e-02

Pos logit contributions
 Are+0.149
 Curious+0.147
 Looking+0.146
Don+0.145
LL+0.143
Neg logit contributions
ies-0.133
els-0.121
ishing-0.116
ins-0.115
 again-0.114
Token time
Token ID640
Feature activation-4.72e-02

Pos logit contributions
 Curious+0.110
Don+0.107
 Looking+0.107
 Are+0.105
 Fasc+0.104
Neg logit contributions
ies-0.092
els-0.088
ishing-0.086
ins-0.081
ig-0.077
Token,
Token ID11
Feature activation-7.70e-02

Pos logit contributions
 Curious+0.124
Don+0.122
 Looking+0.120
 Are+0.118
 Fasc+0.118
Neg logit contributions
ies-0.051
els-0.047
ishing-0.044
ins-0.040
ig-0.039
Token there
Token ID612
Feature activation-3.52e-02

Pos logit contributions
 Curious+0.161
Don+0.159
 Looking+0.157
 Terr+0.154
 Fasc+0.154
Neg logit contributions
ies-0.120
els-0.111
ishing-0.107
ig-0.101
ie-0.099