TOP ACTIVATIONS
MAX = 9.602

. All that movie - watching had made him sleepy .
That is a whale - watching boat . They take people
and said , " Y uck !" But Charlie laughed and
looking , Tim my sn uck over to Lily 's play
. " Y uck , this water is bitter
on his mouth . Y uck !" she said .
showed him how to pl uck the strings and make music
be quiet . He sn uck up on the fish and
bites it . " Y uck !" she says . "
bad too . " Y uck !" Tom said . "
a face . " Y uck , this sauce is spicy
. Anna and Ben unb uck led and got off .
and said , " Y uck !" Mom heard
and said , " Y uck ! This is not y
. " Y uck ! What is that smell
. " Y uck ! Let 's get out
. " Y uck ! I don 't like
night , she quietly sn uck out of her house and
He said , " Y uck ." His mom
. " Y uck ! This lime is sour

MIN ACTIVATIONS
MIN = -13.809

as he opened the ch ocol ates and started eating them
a big box of ch ocol ates . It was so
a large selection of ch ocol ates . Ell a picked
was filled with delicious ch ocol ates . He was happy
not eat too many ch ocol ates and his friend learned
colorful cand ies and ch ocol ates . She said to
enjoyed her cool , ch ocol at y treat !
looking at all the ch ocol ates , Lex i chose
find a box of ch ocol ates . "
son is eating our ch ocol ates !"
and started eating the ch ocol ates . The shop keeper
she paid for the ch ocol ates and the two of
ates . Lily loved ch ocol ates . She thanked Ben
a big box of ch ocol ates !
he ate all the ch ocol ates and sweets . Then
box and share the ch ocol ates . They are happy
time playing with her or nam ents and making her special
was a box of ch ocol ates . " For me
and they shared the ch ocol ates with everyone in the
she had with her or nam ents . <|endoftext|> Once upon

INTERVAL 3.749 - 9.602
CONTAINS 8.567%

practiced and soon he could whistle just like the bird .
a time , there was a girl named Mary . Mary
whale . <|endoftext|> Once upon a time there was a little
back to where she found it . G illy smiled and
loud noises , and she smelled strange smells . She was

INTERVAL -2.104 - 3.749
CONTAINS 70.696%

don 't you dress up the dough ? You can make
go back to the park when the wind was not so
were flowing down her cheeks . Her mom was worried and
bring it back ." The man nodded .
could open the gate with his big key . As he

INTERVAL -7.956 - -2.104
CONTAINS 20.275%

he was always busy . One day , he decided to
. They lived happily ever after . The end . <|endoftext|>
so glad she could play in the sand without worrying about
apped her hands . She loved stuffed animals . She hugged
children . <|endoftext|> L ily was a little girl who liked

INTERVAL -13.809 - -7.956
CONTAINS 0.461%

there was a rh in oc eros who wanted to rock
ever after . <|endoftext|> Once upon a time , there was
very helpful . <|endoftext|> Once upon a time , there were
my love ." <|endoftext|> Once upon a time , there was
of bird and needed a lot of love and care .
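
The four INTERVAL ranges above look like equal-width bins spanning MIN = -13.809 to MAX = 9.602, and each CONTAINS figure looks like the share of activations falling in that bin. A minimal Python sketch under that assumption follows. The helper names `interval_edges` and `bin_fractions` are hypothetical and are not taken from the tool that produced this dump.

```python
def interval_edges(lo: float, hi: float, n_bins: int = 4) -> list[float]:
    """Return n_bins + 1 equally spaced edges from lo to hi (assumed binning)."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def bin_fractions(values: list[float], edges: list[float]) -> list[float]:
    """Percentage of values per bin; the last bin also includes its upper edge."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    return [100.0 * c / len(values) for c in counts]


# MIN and MAX as reported in the dump.
edges = interval_edges(-13.809, 9.602)

# Interval boundaries as printed in the dump, lowest first.
reported = [-13.809, -7.956, -2.104, 3.749, 9.602]

# The computed edges agree with the printed ones up to display rounding.
assert all(abs(e - r) < 1e-3 for e, r in zip(edges, reported))
```

Note that the dump lists the intervals from highest to lowest, while the sketch builds the edges from lowest to highest.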
Token.
Token ID13
Feature activation-7.05e-01

Pos logit contributions
 perspective+0.012
 habitats+0.012
stack+0.012
arians+0.011
uff+0.011
Neg logit contributions
 better-0.005
 closer-0.004
 said-0.004
's-0.004
 sandwich-0.004
Token All
Token ID1439
Feature activation-5.41e+00

Pos logit contributions
 habitats+0.035
stack+0.035
 perspective+0.035
arians+0.034
years+0.033
Neg logit contributions
 to-0.012
 better-0.011
's-0.011
 again-0.011
 said-0.011
Token that
Token ID326
Feature activation+4.44e+00

Pos logit contributions
 habitats+0.264
 perspective+0.256
stack+0.256
 meanings+0.242
years+0.241
Neg logit contributions
 better-0.079
 closer-0.071
 again-0.067
 said-0.064
 sandwich-0.062
Token movie
Token ID3807
Feature activation-1.45e+00

Pos logit contributions
 better+0.090
 again+0.081
 said+0.081
 to+0.079
's+0.079
Neg logit contributions
 habitats-0.207
stack-0.202
 perspective-0.199
arians-0.191
 meanings-0.187
Token-
Token ID12
Feature activation-2.58e+00

Pos logit contributions
 habitats+0.072
stack+0.070
 perspective+0.069
arians+0.066
ric+0.065
Neg logit contributions
 better-0.026
 again-0.023
 said-0.023
 to-0.023
's-0.022
Tokenwatching
Token ID50042
Feature activation+9.56e+00

Pos logit contributions
 habitats+0.110
stack+0.110
 perspective+0.107
arians+0.102
 meanings+0.100
Neg logit contributions
 better-0.060
 again-0.057
 to-0.057
 said-0.056
's-0.055
Token had
Token ID550
Feature activation-6.30e-01

Pos logit contributions
 better+0.169
 again+0.155
 said+0.154
 to+0.153
's+0.150
Neg logit contributions
 habitats-0.474
stack-0.468
 perspective-0.460
arians-0.444
years-0.443
Token made
Token ID925
Feature activation+2.62e+00

Pos logit contributions
 habitats+0.029
stack+0.028
 perspective+0.028
arians+0.026
 meanings+0.026
Neg logit contributions
 better-0.013
 again-0.011
 closer-0.011
 said-0.011
's-0.011
Token him
Token ID683
Feature activation+1.94e+00

Pos logit contributions
 better+0.053
 closer+0.049
 said+0.048
 again+0.048
 to+0.046
Neg logit contributions
 habitats-0.117
stack-0.113
 perspective-0.113
 meanings-0.107
arians-0.106
Token sleepy
Token ID42023
Feature activation+1.27e-01

Pos logit contributions
 better+0.032
 said+0.030
 closer+0.029
 again+0.028
's+0.027
Neg logit contributions
 habitats-0.089
 perspective-0.087
stack-0.085
ric-0.084
 meanings-0.081
Token.
Token ID13
Feature activation-6.67e-01

Pos logit contributions
 better+0.002
 again+0.002
 said+0.002
 closer+0.002
 to+0.002
Neg logit contributions
 habitats-0.007
stack-0.006
 perspective-0.006
arians-0.006
 meanings-0.006
TokenThat
Token ID2504
Feature activation-1.91e+00

Pos logit contributions
 habitats+0.225
 perspective+0.217
stack+0.208
 meanings+0.185
ric+0.184
Neg logit contributions
 better-0.286
 again-0.268
 to-0.260
 closer-0.260
 said-0.258
Token is
Token ID318
Feature activation-1.78e+00

Pos logit contributions
stack+0.073
 habitats+0.073
 perspective+0.073
ric+0.070
arians+0.064
Neg logit contributions
 better-0.046
 again-0.043
 closer-0.042
 said-0.040
 kiss-0.040
Token a
Token ID257
Feature activation+4.06e+00

Pos logit contributions
 habitats+0.086
stack+0.084
 perspective+0.082
 meanings+0.079
ric+0.079
Neg logit contributions
 better-0.029
 again-0.028
 said-0.025
's-0.025
 closer-0.025
Token whale
Token ID22206
Feature activation+1.57e+00

Pos logit contributions
 again+0.079
 better+0.079
's+0.074
 said+0.074
 to+0.072
Neg logit contributions
 habitats-0.178
 perspective-0.175
stack-0.174
ric-0.161
arians-0.159
Token-
Token ID12
Feature activation-2.49e+00

Pos logit contributions
 better+0.025
 to+0.023
's+0.022
 said+0.022
 again+0.022
Neg logit contributions
 habitats-0.082
 perspective-0.079
stack-0.079
 meanings-0.076
years-0.075
Tokenwatching
Token ID50042
Feature activation+9.48e+00

Pos logit contributions
 habitats+0.085
stack+0.080
 perspective+0.078
years+0.077
ric+0.075
Neg logit contributions
 better-0.080
 again-0.077
 said-0.072
's-0.071
 sigh-0.070
Token boat
Token ID8848
Feature activation+1.31e+00

Pos logit contributions
 better+0.234
 again+0.216
 to+0.214
 said+0.214
 closer+0.212
Neg logit contributions
 habitats-0.395
stack-0.385
 perspective-0.379
arians-0.361
 meanings-0.358
Token.
Token ID13
Feature activation-6.05e-01

Pos logit contributions
 better+0.022
 said+0.019
 to+0.019
 again+0.019
's+0.019
Neg logit contributions
 habitats-0.066
stack-0.064
 perspective-0.064
arians-0.061
 meanings-0.061
Token They
Token ID1119
Feature activation-8.54e-01

Pos logit contributions
 habitats+0.024
stack+0.023
 perspective+0.022
arians+0.021
ric+0.021
Neg logit contributions
 better-0.016
 again-0.015
 said-0.015
 closer-0.015
 to-0.015
Token take
Token ID1011
Feature activation+1.95e+00

Pos logit contributions
 habitats+0.039
stack+0.038
 perspective+0.038
arians+0.036
 meanings+0.036
Neg logit contributions
 better-0.018
 said-0.016
 to-0.016
's-0.016
 again-0.016
Token people
Token ID661
Feature activation-5.03e+00

Pos logit contributions
 better+0.035
 again+0.032
 said+0.031
 closer+0.030
's+0.030
Neg logit contributions
 habitats-0.093
stack-0.089
 perspective-0.089
 meanings-0.085
arians-0.084
Token and
Token ID290
Feature activation+4.65e+00

Pos logit contributions
 better+0.059
 closer+0.052
 again+0.050
 said+0.049
's+0.049
Neg logit contributions
 habitats-0.182
stack-0.178
 perspective-0.175
 meanings-0.166
arians-0.165
Token said
Token ID531
Feature activation+6.74e+00

Pos logit contributions
 better+0.077
 said+0.069
 again+0.067
 to+0.067
 closer+0.065
Neg logit contributions
 habitats-0.233
stack-0.228
 perspective-0.227
arians-0.217
 meanings-0.212
Token,
Token ID11
Feature activation-5.93e-01

Pos logit contributions
 better+0.109
 again+0.097
 to+0.096
 said+0.095
 closer+0.092
Neg logit contributions
 habitats-0.343
stack-0.334
 perspective-0.332
arians-0.318
 meanings-0.313
Token "
Token ID366
Feature activation-7.02e+00

Pos logit contributions
 habitats+0.025
stack+0.024
 perspective+0.024
arians+0.023
 meanings+0.023
Neg logit contributions
 better-0.014
 again-0.013
 to-0.013
 said-0.013
 closer-0.013
TokenY
Token ID56
Feature activation-2.51e+00

Pos logit contributions
 habitats+0.266
stack+0.259
 perspective+0.254
arians+0.240
 meanings+0.237
Neg logit contributions
 better-0.201
 again-0.188
 said-0.187
 closer-0.185
 to-0.184
Tokenuck
Token ID1347
Feature activation+8.96e+00

Pos logit contributions
 habitats+0.080
stack+0.079
 perspective+0.077
uff+0.074
arians+0.073
Neg logit contributions
 better-0.081
 closer-0.077
 again-0.076
 said-0.076
 to-0.075
Token!"
Token ID2474
Feature activation+2.44e+00

Pos logit contributions
 better+0.154
 again+0.143
 to+0.141
 said+0.137
's+0.136
Neg logit contributions
 habitats-0.478
stack-0.456
 perspective-0.453
 meanings-0.434
arians-0.431
Token But
Token ID887
Feature activation-2.20e+00

Pos logit contributions
 better+0.038
 closer+0.033
 again+0.032
's+0.030
 to+0.030
Neg logit contributions
 habitats-0.122
stack-0.121
 perspective-0.118
arians-0.113
 meanings-0.112
Token Charlie
Token ID11526
Feature activation+1.03e+00

Pos logit contributions
 habitats+0.128
 perspective+0.123
 meanings+0.118
arians+0.118
stack+0.117
Neg logit contributions
 better-0.028
 again-0.027
 to-0.026
 said-0.025
 they-0.025
Token laughed
Token ID13818
Feature activation+4.05e+00

Pos logit contributions
 better+0.017
 said+0.016
 again+0.016
's+0.015
 to+0.015
Neg logit contributions
 habitats-0.052
 perspective-0.050
stack-0.050
arians-0.048
ric-0.047
Token and
Token ID290
Feature activation+3.11e+00

Pos logit contributions
 better+0.062
 closer+0.055
 said+0.055
 say+0.053
's+0.050
Neg logit contributions
 habitats-0.190
stack-0.188
 perspective-0.184
 meanings-0.176
arians-0.176
Token looking
Token ID2045
Feature activation+5.24e+00

Pos logit contributions
 better+0.090
's+0.089
 again+0.087
 closer+0.085
 said+0.081
Neg logit contributions
 perspective-0.211
 habitats-0.209
stack-0.204
years-0.196
arians-0.196
Token,
Token ID11
Feature activation-7.79e-01

Pos logit contributions
 better+0.089
 said+0.078
 again+0.078
 closer+0.076
's+0.076
Neg logit contributions
 habitats-0.256
stack-0.250
 perspective-0.248
arians-0.236
 meanings-0.234
Token Tim
Token ID5045
Feature activation+5.34e-02

Pos logit contributions
 habitats+0.040
 perspective+0.039
stack+0.039
arians+0.037
 meanings+0.037
Neg logit contributions
 better-0.012
 said-0.011
 again-0.011
 to-0.011
's-0.010
Tokenmy
Token ID1820
Feature activation+1.20e+00

Pos logit contributions
 better+0.001
 closer+0.001
 again+0.001
 to+0.001
 they+0.001
Neg logit contributions
 habitats-0.002
 perspective-0.002
stack-0.002
ric-0.002
 meanings-0.002
Token sn
Token ID3013
Feature activation-8.88e-01

Pos logit contributions
 better+0.018
 closer+0.017
 again+0.015
 bite+0.014
 sandwich+0.014
Neg logit contributions
 habitats-0.054
stack-0.054
 perspective-0.053
ric-0.052
 meanings-0.051
Tokenuck
Token ID1347
Feature activation+8.78e+00

Pos logit contributions
 perspective+0.036
 habitats+0.036
stack+0.034
arians+0.034
 meanings+0.034
Neg logit contributions
 better-0.023
 again-0.023
 to-0.022
 said-0.022
 closer-0.021
Token over
Token ID625
Feature activation+2.86e+00

Pos logit contributions
 better+0.161
 closer+0.141
 said+0.140
 again+0.140
's+0.138
Neg logit contributions
 habitats-0.410
stack-0.403
 perspective-0.393
arians-0.372
 meanings-0.371
Token to
Token ID284
Feature activation+1.40e+00

Pos logit contributions
 better+0.024
 again+0.021
 to+0.021
 said+0.018
 closer+0.018
Neg logit contributions
 habitats-0.168
stack-0.165
 perspective-0.163
arians-0.160
 meanings-0.158
Token Lily
Token ID20037
Feature activation+9.08e-01

Pos logit contributions
 better+0.021
 again+0.018
 closer+0.018
 said+0.018
's+0.018
Neg logit contributions
 habitats-0.071
stack-0.070
 perspective-0.069
arians-0.065
ric-0.065
Token's
Token ID338
Feature activation+3.29e-01

Pos logit contributions
 better+0.009
 closer+0.007
 again+0.007
 said+0.007
 to+0.007
Neg logit contributions
 habitats-0.050
stack-0.050
 perspective-0.049
arians-0.047
ric-0.046
Token play
Token ID711
Feature activation+1.70e+00

Pos logit contributions
 better+0.008
 to+0.008
 again+0.008
's+0.008
 said+0.007
Neg logit contributions
 habitats-0.014
 perspective-0.013
stack-0.013
arians-0.013
years-0.013
Token.
Token ID13
Feature activation-5.10e-01

Pos logit contributions
 better+0.024
 again+0.021
 said+0.021
 closer+0.021
 to+0.020
Neg logit contributions
 habitats-0.087
stack-0.085
 perspective-0.084
arians-0.080
 meanings-0.080
Token\n
Token ID198
Feature activation+3.32e+00

Pos logit contributions
 habitats+0.022
stack+0.021
 perspective+0.020
arians+0.019
 meanings+0.019
Neg logit contributions
 better-0.010
 closer-0.010
 bite-0.009
 again-0.009
's-0.009
Token\n
Token ID198
Feature activation+3.28e+00

Pos logit contributions
 better+0.059
 sandwich+0.058
 again+0.057
 closer+0.056
's+0.054
Neg logit contributions
 habitats-0.137
stack-0.136
 perspective-0.129
ric-0.127
uff-0.126
Token"
Token ID1
Feature activation-4.98e+00

Pos logit contributions
 better+0.079
 again+0.075
 to+0.072
 closer+0.072
 said+0.071
Neg logit contributions
 habitats-0.134
stack-0.132
 perspective-0.129
arians-0.124
years-0.121
TokenY
Token ID56
Feature activation-2.91e+00

Pos logit contributions
 habitats+0.170
stack+0.164
 perspective+0.162
arians+0.151
 meanings+0.149
Neg logit contributions
 better-0.152
 to-0.150
 again-0.146
's-0.145
 closer-0.143
Tokenuck
Token ID1347
Feature activation+8.76e+00

Pos logit contributions
uff+0.101
 habitats+0.096
stack+0.096
 perspective+0.095
 meanings+0.094
Neg logit contributions
 better-0.077
 closer-0.076
 said-0.073
 they-0.072
 again-0.071
Token,
Token ID11
Feature activation-1.29e+00

Pos logit contributions
 better+0.155
 again+0.141
 said+0.140
 to+0.139
's+0.133
Neg logit contributions
 habitats-0.441
stack-0.427
 perspective-0.421
arians-0.405
 meanings-0.401
Token this
Token ID428
Feature activation+2.99e+00

Pos logit contributions
stack+0.054
 habitats+0.052
 perspective+0.050
ric+0.049
arians+0.049
Neg logit contributions
 better-0.026
 again-0.025
 closer-0.024
 to-0.023
 said-0.023
Token water
Token ID1660
Feature activation+2.41e+00

Pos logit contributions
 better+0.068
 closer+0.064
 again+0.064
 to+0.061
 said+0.059
Neg logit contributions
 habitats-0.125
stack-0.122
 perspective-0.120
ric-0.112
 meanings-0.112
Token is
Token ID318
Feature activation-1.40e+00

Pos logit contributions
 better+0.049
 again+0.044
 said+0.043
 to+0.043
's+0.043
Neg logit contributions
 habitats-0.113
stack-0.109
 perspective-0.109
arians-0.103
 meanings-0.102
Token bitter
Token ID12922
Feature activation+1.89e+00

Pos logit contributions
 habitats+0.066
stack+0.065
 perspective+0.063
 meanings+0.061
arians+0.061
Neg logit contributions
 better-0.024
 again-0.022
 said-0.022
 closer-0.022
's-0.022
Token on
Token ID319
Feature activation-1.26e+00

Pos logit contributions
 habitats+0.068
stack+0.068
 perspective+0.067
arians+0.064
 meanings+0.064
Neg logit contributions
 better-0.031
 again-0.031
 to-0.031
 said-0.030
's-0.029
Token his
Token ID465
Feature activation-1.65e+00

Pos logit contributions
 habitats+0.070
stack+0.069
 perspective+0.069
arians+0.066
 meanings+0.065
Neg logit contributions
 better-0.015
 to-0.013
 again-0.013
's-0.013
 said-0.012
Token mouth
Token ID5422
Feature activation+1.29e+00

Pos logit contributions
 habitats+0.058
stack+0.056
 perspective+0.055
arians+0.052
 meanings+0.051
Neg logit contributions
 better-0.052
 again-0.049
 to-0.049
 said-0.049
's-0.049
Token.
Token ID13
Feature activation-2.32e-01

Pos logit contributions
 better+0.020
 closer+0.018
 said+0.017
 again+0.017
's+0.017
Neg logit contributions
 habitats-0.062
stack-0.061
 perspective-0.060
 meanings-0.056
arians-0.056
Token Y
Token ID575
Feature activation-4.51e+00

Pos logit contributions
 habitats+0.010
stack+0.009
 perspective+0.009
arians+0.009
 meanings+0.009
Neg logit contributions
 better-0.006
 again-0.005
 closer-0.005
 said-0.005
 to-0.005
Tokenuck
Token ID1347
Feature activation+8.76e+00

Pos logit contributions
 habitats+0.173
stack+0.170
 perspective+0.165
arians+0.158
 meanings+0.155
Neg logit contributions
 better-0.125
 again-0.115
 said-0.115
 closer-0.115
 to-0.114
Token!"
Token ID2474
Feature activation+3.56e+00

Pos logit contributions
 better+0.139
 to+0.130
 said+0.127
 again+0.124
's+0.121
Neg logit contributions
 habitats-0.463
stack-0.452
 perspective-0.449
arians-0.430
years-0.425
Token she
Token ID673
Feature activation+1.87e+00

Pos logit contributions
ade+0.071
 closer+0.070
 again+0.069
 better+0.068
 sandwich+0.065
Neg logit contributions
stack-0.131
 meanings-0.126
 habitats-0.124
nam-0.123
 perspective-0.119
Token said
Token ID531
Feature activation+5.69e+00

Pos logit contributions
 better+0.035
 again+0.034
ade+0.030
 closer+0.028
 to+0.025
Neg logit contributions
 meanings-0.073
years-0.070
stack-0.070
nam-0.070
arians-0.069
Token.
Token ID13
Feature activation+8.54e-03

Pos logit contributions
ade+0.122
 better+0.120
 sandwich+0.116
 bite+0.109
 said+0.108
Neg logit contributions
stack-0.210
years-0.193
 perspective-0.193
uff-0.191
arians-0.189
Token\n
Token ID198
Feature activation+3.73e+00

Pos logit contributions
 better+0.000
 sandwich+0.000
 closer+0.000
ade+0.000
 again+0.000
Neg logit contributions
stack-0.000
 habitats-0.000
 perspective-0.000
arians-0.000
 meanings-0.000
Token showed
Token ID3751
Feature activation+1.82e+00

Pos logit contributions
 habitats+0.111
stack+0.109
 perspective+0.108
arians+0.103
 meanings+0.103
Neg logit contributions
 better-0.026
 again-0.022
 said-0.021
 closer-0.021
 to-0.021
Token him
Token ID683
Feature activation+2.19e+00

Pos logit contributions
 better+0.033
 again+0.030
 to+0.029
 closer+0.028
 said+0.028
Neg logit contributions
 habitats-0.089
stack-0.088
 perspective-0.087
arians-0.083
ric-0.082
Token how
Token ID703
Feature activation+3.45e-01

Pos logit contributions
 better+0.044
 to+0.043
 again+0.042
's+0.038
 closer+0.037
Neg logit contributions
stack-0.106
 habitats-0.106
 perspective-0.104
arians-0.101
ric-0.096
Token to
Token ID284
Feature activation+7.98e-01

Pos logit contributions
 better+0.002
 again+0.001
 said+0.001
 closer+0.001
's+0.001
Neg logit contributions
 habitats-0.021
 perspective-0.020
stack-0.020
 meanings-0.019
arians-0.019
Token pl
Token ID458
Feature activation-2.21e+00

Pos logit contributions
 better+0.012
 say+0.010
 said+0.010
 again+0.010
 to+0.009
Neg logit contributions
 habitats-0.043
stack-0.042
 perspective-0.041
years-0.040
 meanings-0.040
Tokenuck
Token ID1347
Feature activation+8.72e+00

Pos logit contributions
 habitats+0.065
stack+0.061
arians+0.061
 perspective+0.060
ric+0.058
Neg logit contributions
 better-0.079
 said-0.076
 closer-0.076
 again-0.075
 to-0.075
Token the
Token ID262
Feature activation+2.90e-01

Pos logit contributions
 better+0.106
 again+0.091
 to+0.087
 closer+0.083
 said+0.078
Neg logit contributions
 habitats-0.492
stack-0.482
 perspective-0.477
arians-0.458
years-0.454
Token strings
Token ID13042
Feature activation+5.56e-01

Pos logit contributions
 better+0.009
 again+0.008
 said+0.008
 closer+0.008
 to+0.008
Neg logit contributions
 habitats-0.010
stack-0.010
 perspective-0.010
arians-0.009
 meanings-0.009
Token and
Token ID290
Feature activation+2.87e+00

Pos logit contributions
 better+0.008
 again+0.007
 to+0.006
 said+0.006
 closer+0.006
Neg logit contributions
 habitats-0.030
stack-0.029
 perspective-0.029
arians-0.028
 meanings-0.028
Token make
Token ID787
Feature activation+3.27e+00

Pos logit contributions
 better+0.047
 said+0.042
 to+0.042
 again+0.041
's+0.039
Neg logit contributions
 habitats-0.148
 perspective-0.145
stack-0.145
arians-0.138
years-0.138
Token music
Token ID2647
Feature activation+2.04e+00

Pos logit contributions
 better+0.051
 said+0.045
 again+0.045
 closer+0.045
 to+0.044
Neg logit contributions
 habitats-0.161
 perspective-0.156
stack-0.156
 meanings-0.149
arians-0.149
Token be
Token ID307
Feature activation-2.65e-01

Pos logit contributions
 better+0.014
 again+0.013
 said+0.013
's+0.013
 happens+0.013
Neg logit contributions
stack-0.034
 perspective-0.033
 habitats-0.033
arians-0.031
ric-0.030
Token quiet
Token ID5897
Feature activation+3.72e-01

Pos logit contributions
 habitats+0.011
 perspective+0.011
stack+0.011
arians+0.010
years+0.010
Neg logit contributions
 better-0.005
 again-0.005
 said-0.005
's-0.005
 to-0.005
Token.
Token ID13
Feature activation-1.46e-01

Pos logit contributions
 again+0.005
 better+0.005
 to+0.005
 closer+0.004
's+0.003
Neg logit contributions
 habitats-0.021
stack-0.021
 perspective-0.020
ric-0.020
 meanings-0.020
Token He
Token ID679
Feature activation-2.98e+00

Pos logit contributions
 habitats+0.005
stack+0.005
 meanings+0.005
 perspective+0.005
arians+0.005
Neg logit contributions
 better-0.004
 closer-0.004
 again-0.004
 bite-0.003
 say-0.003
Token sn
Token ID3013
Feature activation-1.83e+00

Pos logit contributions
 habitats+0.135
stack+0.130
 perspective+0.126
arians+0.122
 meanings+0.121
Neg logit contributions
 better-0.060
 again-0.053
 closer-0.053
 to-0.052
 say-0.050
Tokenuck
Token ID1347
Feature activation+8.68e+00

Pos logit contributions
 habitats+0.069
stack+0.067
 perspective+0.067
arians+0.063
 meanings+0.063
Neg logit contributions
 better-0.054
 again-0.051
 to-0.050
 closer-0.049
 said-0.049
Token up
Token ID510
Feature activation+4.24e+00

Pos logit contributions
 better+0.174
 said+0.158
 say+0.154
's+0.153
 closer+0.149
Neg logit contributions
 habitats-0.382
stack-0.374
 perspective-0.361
years-0.350
arians-0.343
Token on
Token ID319
Feature activation-1.41e+00

Pos logit contributions
 better+0.074
's+0.068
 said+0.066
 says+0.064
 say+0.064
Neg logit contributions
stack-0.193
 habitats-0.189
 perspective-0.183
years-0.179
uff-0.174
Token the
Token ID262
Feature activation+2.70e-01

Pos logit contributions
 habitats+0.082
 perspective+0.081
stack+0.080
ric+0.078
arians+0.078
Neg logit contributions
 to-0.015
 better-0.015
 again-0.014
's-0.011
 closer-0.010
Token fish
Token ID5916
Feature activation+5.51e-01

Pos logit contributions
 better+0.007
 said+0.007
 again+0.007
 suggested+0.007
 closer+0.007
Neg logit contributions
stack-0.010
 habitats-0.010
 perspective-0.009
ric-0.009
arians-0.009
Token and
Token ID290
Feature activation+2.77e+00

Pos logit contributions
 better+0.006
 again+0.005
 to+0.005
's+0.005
 said+0.005
Neg logit contributions
 habitats-0.031
 perspective-0.030
stack-0.030
 meanings-0.029
arians-0.029
Token bites
Token ID26081
Feature activation+5.13e+00

Pos logit contributions
 better+0.056
 again+0.049
 closer+0.047
 to+0.046
 said+0.046
Neg logit contributions
 habitats-0.185
stack-0.182
 perspective-0.178
 meanings-0.171
arians-0.170
Token it
Token ID340
Feature activation+7.67e+00

Pos logit contributions
 better+0.070
 again+0.059
 said+0.058
 closer+0.057
 to+0.057
Neg logit contributions
 habitats-0.270
stack-0.265
 perspective-0.261
arians-0.250
 meanings-0.248
Token.
Token ID13
Feature activation-4.92e-01

Pos logit contributions
 better+0.122
 said+0.106
 again+0.104
 closer+0.103
's+0.101
Neg logit contributions
 habitats-0.372
stack-0.371
 perspective-0.361
 meanings-0.344
arians-0.343
Token "
Token ID366
Feature activation-7.51e+00

Pos logit contributions
 habitats+0.023
 perspective+0.023
stack+0.022
arians+0.022
 meanings+0.021
Neg logit contributions
 better-0.009
 said-0.009
 to-0.009
 again-0.008
's-0.008
TokenY
Token ID56
Feature activation-2.70e+00

Pos logit contributions
 habitats+0.288
stack+0.280
 perspective+0.275
arians+0.259
 meanings+0.257
Neg logit contributions
 better-0.210
 again-0.196
 said-0.194
 to-0.193
 closer-0.192
Tokenuck
Token ID1347
Feature activation+8.67e+00

Pos logit contributions
stack+0.096
 habitats+0.095
 perspective+0.093
 meanings+0.091
uff+0.091
Neg logit contributions
 better-0.074
 closer-0.070
 said-0.069
 again-0.067
 to-0.066
Token!"
Token ID2474
Feature activation+2.41e+00

Pos logit contributions
 better+0.164
 again+0.150
 to+0.148
 said+0.147
's+0.141
Neg logit contributions
 habitats-0.431
stack-0.412
 perspective-0.409
arians-0.392
 meanings-0.389
Token she
Token ID673
Feature activation+1.72e+00

Pos logit contributions
 better+0.046
 closer+0.040
 again+0.040
 bigger+0.034
 bite+0.034
Neg logit contributions
stack-0.111
 habitats-0.106
 perspective-0.100
 meanings-0.098
arians-0.097
Token says
Token ID1139
Feature activation+6.53e+00

Pos logit contributions
 better+0.026
 again+0.025
 closer+0.023
 to+0.021
 late+0.018
Neg logit contributions
stack-0.079
 habitats-0.076
 meanings-0.075
arians-0.073
ric-0.072
Token.
Token ID13
Feature activation-7.07e-01

Pos logit contributions
 better+0.130
 closer+0.113
 said+0.110
's+0.108
 say+0.107
Neg logit contributions
stack-0.272
 perspective-0.250
 habitats-0.246
uff-0.244
years-0.244
Token "
Token ID366
Feature activation-7.59e+00

Pos logit contributions
 habitats+0.030
stack+0.030
 perspective+0.029
arians+0.027
 meanings+0.027
Neg logit contributions
 better-0.017
 again-0.015
 closer-0.015
 said-0.015
's-0.014
Token bad
Token ID2089
Feature activation+1.98e-01

Pos logit contributions
 better+0.073
 again+0.067
 said+0.065
 closer+0.065
 to+0.064
Neg logit contributions
 habitats-0.197
stack-0.193
 perspective-0.191
 meanings-0.181
arians-0.180
Token too
Token ID1165
Feature activation+1.40e+00

Pos logit contributions
 better+0.003
 to+0.002
 again+0.002
 said+0.002
's+0.002
Neg logit contributions
 habitats-0.011
stack-0.011
 perspective-0.011
arians-0.010
ric-0.010
Token.
Token ID13
Feature activation-5.35e-02

Pos logit contributions
 better+0.022
 again+0.019
 said+0.019
 closer+0.019
 to+0.018
Neg logit contributions
 habitats-0.071
stack-0.070
 perspective-0.069
arians-0.066
 meanings-0.065
Token "
Token ID366
Feature activation-7.30e+00

Pos logit contributions
 habitats+0.003
stack+0.003
 perspective+0.002
arians+0.002
 meanings+0.002
Neg logit contributions
 better-0.001
 again-0.001
 closer-0.001
 to-0.001
 said-0.001
TokenY
Token ID56
Feature activation-3.04e+00

Pos logit contributions
 habitats+0.282
stack+0.274
 perspective+0.270
arians+0.254
 meanings+0.251
Neg logit contributions
 better-0.203
 again-0.189
 said-0.187
 to-0.186
 closer-0.185
Tokenuck
Token ID1347
Feature activation+8.67e+00

Pos logit contributions
uff+0.105
 habitats+0.103
stack+0.103
 perspective+0.102
ric+0.099
Neg logit contributions
 better-0.086
 closer-0.081
 said-0.079
 again-0.079
 to-0.077
Token!"
Token ID2474
Feature activation+2.88e+00

Pos logit contributions
 better+0.166
 again+0.151
 said+0.150
 to+0.149
 closer+0.145
Neg logit contributions
 habitats-0.418
stack-0.406
 perspective-0.399
arians-0.384
 meanings-0.379
Token Tom
Token ID4186
Feature activation+1.94e+00

Pos logit contributions
 better+0.043
 again+0.035
 closer+0.033
's+0.027
 bite+0.026
Neg logit contributions
 habitats-0.143
stack-0.143
 meanings-0.137
 perspective-0.136
arians-0.132
Token said
Token ID531
Feature activation+5.70e+00

Pos logit contributions
 better+0.022
 closer+0.017
 again+0.015
 kiss+0.014
 come+0.013
Neg logit contributions
stack-0.096
 habitats-0.095
 perspective-0.092
arians-0.092
years-0.091
Token.
Token ID13
Feature activation-2.90e-02

Pos logit contributions
 better+0.109
's+0.095
 said+0.094
 say+0.092
 closer+0.092
Neg logit contributions
stack-0.234
 perspective-0.216
 habitats-0.215
years-0.207
nam-0.207
Token "
Token ID366
Feature activation-7.26e+00

Pos logit contributions
 habitats+0.001
stack+0.001
 perspective+0.001
arians+0.001
 meanings+0.001
Neg logit contributions
 better-0.001
 again-0.001
 closer-0.001
 said-0.001
's-0.001
Token a
Token ID257
Feature activation+4.24e+00

Pos logit contributions
 better+0.032
 closer+0.030
 again+0.029
 said+0.028
 suggested+0.027
Neg logit contributions
 habitats-0.117
 perspective-0.111
stack-0.111
 meanings-0.110
ric-0.110
Token face
Token ID1986
Feature activation+1.48e+00

Pos logit contributions
 better+0.079
 again+0.077
 to+0.074
 said+0.074
 closer+0.073
Neg logit contributions
 habitats-0.182
 perspective-0.181
stack-0.179
 meanings-0.172
ric-0.171
Token.
Token ID13
Feature activation-4.78e-01

Pos logit contributions
 better+0.023
 said+0.021
 closer+0.020
 again+0.020
 to+0.019
Neg logit contributions
 habitats-0.072
stack-0.071
 perspective-0.070
ric-0.067
 meanings-0.066
Token "
Token ID366
Feature activation-7.44e+00

Pos logit contributions
stack+0.019
 habitats+0.019
 meanings+0.018
 perspective+0.017
uff+0.017
Neg logit contributions
 better-0.012
 closer-0.010
 again-0.010
's-0.010
 say-0.009
TokenY
Token ID56
Feature activation-3.04e+00

Pos logit contributions
 habitats+0.260
stack+0.250
 perspective+0.249
 meanings+0.234
ric+0.232
Neg logit contributions
 better-0.219
 again-0.213
 to-0.211
 said-0.207
 closer-0.206
Tokenuck
Token ID1347
Feature activation+8.66e+00

Pos logit contributions
 habitats+0.104
uff+0.103
stack+0.102
 perspective+0.102
 meanings+0.100
Neg logit contributions
 better-0.082
 closer-0.080
 again-0.077
 said-0.076
 to-0.076
Token,
Token ID11
Feature activation-1.45e+00

Pos logit contributions
 better+0.165
 again+0.150
 said+0.149
 to+0.149
's+0.142
Neg logit contributions
 habitats-0.425
stack-0.412
 perspective-0.404
arians-0.389
 meanings-0.383
Token this
Token ID428
Feature activation+2.79e+00

Pos logit contributions
stack+0.064
 habitats+0.064
 perspective+0.061
 meanings+0.060
ric+0.059
Neg logit contributions
 better-0.028
 again-0.026
 closer-0.025
 said-0.024
 to-0.024
Token sauce
Token ID10746
Feature activation+1.22e+00

Pos logit contributions
 better+0.060
 closer+0.057
 again+0.057
 to+0.053
 said+0.053
Neg logit contributions
 habitats-0.118
stack-0.117
 perspective-0.114
arians-0.109
 meanings-0.108
Token is
Token ID318
Feature activation-1.55e+00

Pos logit contributions
 better+0.026
 again+0.024
 closer+0.023
 said+0.023
 to+0.023
Neg logit contributions
 habitats-0.054
stack-0.053
 perspective-0.052
arians-0.050
 meanings-0.049
Token spicy
Token ID26880
Feature activation+2.40e+00

Pos logit contributions
 habitats+0.072
stack+0.070
 perspective+0.068
 meanings+0.066
arians+0.066
Neg logit contributions
 better-0.029
 again-0.027
 closer-0.026
 said-0.026
's-0.025
Token.
Token ID13
Feature activation-1.34e-01

Pos logit contributions
 better+0.003
 again+0.003
 to+0.003
 said+0.003
 closer+0.003
Neg logit contributions
 habitats-0.017
stack-0.016
 perspective-0.016
arians-0.015
 meanings-0.015
Token Anna
Token ID11735
Feature activation+7.54e-01

Pos logit contributions
 habitats+0.006
stack+0.006
 perspective+0.005
arians+0.005
 meanings+0.005
Neg logit contributions
 better-0.003
 again-0.003
 closer-0.003
 said-0.003
 to-0.003
Token and
Token ID290
Feature activation+2.93e+00

Pos logit contributions
 better+0.009
 again+0.008
 closer+0.008
 to+0.007
 said+0.007
Neg logit contributions
 habitats-0.040
stack-0.039
 perspective-0.039
arians-0.038
 meanings-0.037
Token Ben
Token ID3932
Feature activation+2.74e+00

Pos logit contributions
 better+0.061
 again+0.055
 closer+0.055
 said+0.051
 to+0.050
Neg logit contributions
 habitats-0.124
stack-0.122
 perspective-0.114
 meanings-0.113
arians-0.113
Token unb
Token ID22619
Feature activation-1.40e+00

Pos logit contributions
 better+0.053
 again+0.048
 said+0.047
 to+0.046
 closer+0.046
Neg logit contributions
 habitats-0.129
stack-0.126
 perspective-0.125
arians-0.119
 meanings-0.118
Tokenuck
Token ID1347
Feature activation+8.66e+00

Pos logit contributions
 habitats+0.043
stack+0.041
 perspective+0.040
arians+0.039
years+0.037
Neg logit contributions
 better-0.050
 closer-0.046
 again-0.046
 to-0.046
 said-0.045
Tokenled
Token ID992
Feature activation+3.19e+00

Pos logit contributions
 better+0.227
 say+0.212
 sandwich+0.209
 bite+0.200
 apologize+0.200
Neg logit contributions
 habitats-0.264
years-0.252
stack-0.243
vered-0.237
 meanings-0.229
Token and
Token ID290
Feature activation+2.91e+00

Pos logit contributions
 better+0.046
 said+0.042
's+0.039
 to+0.038
 again+0.037
Neg logit contributions
 habitats-0.157
 perspective-0.152
stack-0.150
 meanings-0.144
arians-0.141
Token got
Token ID1392
Feature activation-3.38e-02

Pos logit contributions
 better+0.053
 again+0.048
 closer+0.045
's+0.045
 to+0.043
Neg logit contributions
 habitats-0.135
stack-0.135
 perspective-0.126
arians-0.124
 meanings-0.124
Token off
Token ID572
Feature activation+2.42e-01

Pos logit contributions
nam+0.001
 habitats+0.001
years+0.001
stack+0.001
 perspective+0.001
Neg logit contributions
 again-0.000
 sandwich-0.000
 bite-0.000
 better-0.000
ade-0.000
Token.
Token ID13
Feature activation-1.54e-01

Pos logit contributions
 better+0.003
 again+0.003
 said+0.003
 closer+0.003
 to+0.003
Neg logit contributions
 habitats-0.013
stack-0.012
 perspective-0.012
arians-0.012
 meanings-0.012
Token and
Token ID290
Feature activation+2.90e+00

Pos logit contributions
 said+0.044
 suggested+0.043
 forg+0.042
 closer+0.041
 sandwich+0.041
Neg logit contributions
stack-0.126
 perspective-0.121
uff-0.119
 habitats-0.118
 meanings-0.117
Token said
Token ID531
Feature activation+5.69e+00

Pos logit contributions
 better+0.061
 closer+0.057
 again+0.055
 to+0.055
's+0.054
Neg logit contributions
 habitats-0.125
stack-0.121
 perspective-0.121
 meanings-0.116
ric-0.113
Token,
Token ID11
Feature activation-1.42e+00

Pos logit contributions
 better+0.089
 again+0.078
 to+0.077
 said+0.076
's+0.073
Neg logit contributions
 habitats-0.295
stack-0.289
 perspective-0.286
arians-0.273
 meanings-0.269
Token "
Token ID366
Feature activation-7.19e+00

Pos logit contributions
 habitats+0.062
 perspective+0.060
stack+0.059
arians+0.056
 meanings+0.055
Neg logit contributions
 better-0.034
 again-0.031
 to-0.031
 said-0.031
 closer-0.030
TokenY
Token ID56
Feature activation-3.01e+00

Pos logit contributions
 habitats+0.275
stack+0.267
 perspective+0.263
arians+0.248
 meanings+0.245
Neg logit contributions
 better-0.203
 again-0.189
 said-0.187
 to-0.187
 closer-0.186
Tokenuck
Token ID1347
Feature activation+8.65e+00

Pos logit contributions
 habitats+0.110
uff+0.109
 perspective+0.108
stack+0.108
 meanings+0.106
Neg logit contributions
 better-0.078
 closer-0.075
 again-0.072
 said-0.072
 to-0.069
Token!"
Token ID2474
Feature activation+3.12e+00

Pos logit contributions
 better+0.151
 to+0.138
 again+0.137
 said+0.134
's+0.131
Neg logit contributions
 habitats-0.449
stack-0.436
 perspective-0.426
arians-0.410
years-0.407
Token
Token ID198
Feature activation+3.41e+00

Pos logit contributions
 better+0.056
 closer+0.055
 bite+0.051
's+0.047
 shot+0.046
Neg logit contributions
stack-0.134
 habitats-0.132
 perspective-0.128
uff-0.128
arians-0.120
Token
Token ID198
Feature activation+3.34e+00

Pos logit contributions
 better+0.058
 closer+0.056
 bite+0.053
 sandwich+0.052
's+0.049
Neg logit contributions
 habitats-0.152
 perspective-0.149
stack-0.147
uff-0.144
ric-0.142
TokenMom
Token ID29252
Feature activation+8.28e-01

Pos logit contributions
 better+0.065
 said+0.059
 again+0.058
's+0.057
 to+0.057
Neg logit contributions
 habitats-0.160
stack-0.158
 perspective-0.154
 meanings-0.148
arians-0.147
Token heard
Token ID2982
Feature activation+1.50e+00

Pos logit contributions
 better+0.015
 closer+0.012
 sandwich+0.011
 to+0.011
 they+0.011
Neg logit contributions
 habitats-0.035
stack-0.035
years-0.035
 meanings-0.034
 perspective-0.034
Token and
Token ID290
Feature activation+2.71e+00

Pos logit contributions
 better+0.012
 again+0.010
 said+0.010
 closer+0.009
 to+0.009
Neg logit contributions
 habitats-0.049
stack-0.048
 perspective-0.048
arians-0.046
 meanings-0.046
Token said
Token ID531
Feature activation+5.57e+00

Pos logit contributions
 better+0.039
 closer+0.033
 again+0.033
 to+0.032
's+0.032
Neg logit contributions
 habitats-0.136
stack-0.135
 perspective-0.131
 meanings-0.126
arians-0.125
Token,
Token ID11
Feature activation-1.48e+00

Pos logit contributions
 better+0.096
 said+0.084
 again+0.084
 closer+0.084
's+0.082
Neg logit contributions
 habitats-0.269
stack-0.264
 perspective-0.261
 meanings-0.248
arians-0.248
Token "
Token ID366
Feature activation-7.16e+00

Pos logit contributions
 habitats+0.070
 perspective+0.067
stack+0.065
arians+0.063
ric+0.062
Neg logit contributions
 better-0.032
 to-0.031
 again-0.030
 said-0.030
's-0.029
TokenY
Token ID56
Feature activation-3.05e+00

Pos logit contributions
 habitats+0.280
stack+0.273
 perspective+0.267
arians+0.252
 meanings+0.249
Neg logit contributions
 better-0.198
 again-0.184
 said-0.183
 to-0.180
 closer-0.180
Tokenuck
Token ID1347
Feature activation+8.65e+00

Pos logit contributions
 habitats+0.106
 perspective+0.103
stack+0.102
uff+0.101
 meanings+0.100
Neg logit contributions
 better-0.086
 closer-0.083
 said-0.080
 again-0.080
 to-0.078
Token!
Token ID0
Feature activation+2.37e-01

Pos logit contributions
 better+0.155
 to+0.147
 again+0.146
's+0.137
 said+0.137
Neg logit contributions
 habitats-0.452
stack-0.438
 perspective-0.422
arians-0.417
 meanings-0.406
Token This
Token ID770
Feature activation+8.30e-02

Pos logit contributions
 better+0.006
 again+0.006
 to+0.006
's+0.006
 said+0.006
Neg logit contributions
 habitats-0.010
stack-0.009
 perspective-0.009
arians-0.009
 meanings-0.009
Token is
Token ID318
Feature activation-1.46e+00

Pos logit contributions
 better+0.002
 again+0.002
 closer+0.002
 said+0.002
 to+0.002
Neg logit contributions
 habitats-0.003
stack-0.003
 perspective-0.003
 meanings-0.003
ric-0.003
Token not
Token ID407
Feature activation+3.94e+00

Pos logit contributions
 habitats+0.068
stack+0.067
 perspective+0.066
arians+0.063
 meanings+0.063
Neg logit contributions
 better-0.027
 again-0.025
 said-0.024
 closer-0.024
 to-0.024
Token y
Token ID331
Feature activation-2.71e+00

Pos logit contributions
 better+0.067
 to+0.056
 again+0.056
 said+0.056
's+0.054
Neg logit contributions
 habitats-0.203
stack-0.199
 perspective-0.195
arians-0.189
 meanings-0.183
Token.
Token ID13
Feature activation-4.29e-01

Pos logit contributions
stack+0.025
 habitats+0.025
 perspective+0.025
years+0.024
ric+0.024
Neg logit contributions
 to-0.004
 again-0.004
 better-0.004
's-0.003
 said-0.003
Token
Token ID198
Feature activation+3.28e+00

Pos logit contributions
 habitats+0.021
stack+0.020
 perspective+0.020
arians+0.019
 meanings+0.019
Neg logit contributions
 better-0.008
 again-0.007
 said-0.007
 closer-0.007
 to-0.007
Token
Token ID198
Feature activation+3.24e+00

Pos logit contributions
 better+0.056
 again+0.049
 sandwich+0.048
 closer+0.046
's+0.046
Neg logit contributions
 habitats-0.149
 perspective-0.148
stack-0.147
ric-0.140
uff-0.135
Token"
Token ID1
Feature activation-5.39e+00

Pos logit contributions
 better+0.069
 said+0.063
 closer+0.063
 to+0.062
 again+0.062
Neg logit contributions
 habitats-0.149
stack-0.145
 perspective-0.141
arians-0.137
 meanings-0.135
TokenY
Token ID56
Feature activation-2.99e+00

Pos logit contributions
 habitats+0.202
stack+0.197
 perspective+0.193
arians+0.182
 meanings+0.180
Neg logit contributions
 better-0.157
 again-0.146
 said-0.145
 closer-0.144
 to-0.144
Tokenuck
Token ID1347
Feature activation+8.65e+00

Pos logit contributions
 habitats+0.120
 perspective+0.118
stack+0.116
uff+0.115
 meanings+0.113
Neg logit contributions
 better-0.071
 closer-0.067
 said-0.065
 again-0.064
 to-0.062
Token!
Token ID0
Feature activation+2.18e-01

Pos logit contributions
 better+0.150
 said+0.140
 again+0.139
 to+0.138
's+0.130
Neg logit contributions
 habitats-0.443
stack-0.429
 perspective-0.420
arians-0.406
years-0.402
Token What
Token ID1867
Feature activation-2.21e+00

Pos logit contributions
 better+0.005
's+0.004
 again+0.004
 to+0.004
 said+0.004
Neg logit contributions
 habitats-0.010
stack-0.009
 perspective-0.009
arians-0.009
 meanings-0.009
Token is
Token ID318
Feature activation-1.55e+00

Pos logit contributions
 perspective+0.085
 meanings+0.083
 habitats+0.083
stack+0.078
nam+0.077
Neg logit contributions
 better-0.048
 closer-0.048
 again-0.047
 said-0.045
 say-0.043
Token that
Token ID326
Feature activation+4.66e+00

Pos logit contributions
 habitats+0.075
stack+0.073
 perspective+0.072
arians+0.069
 meanings+0.068
Neg logit contributions
 better-0.029
 again-0.026
 said-0.025
 to-0.025
 closer-0.025
Token smell
Token ID8508
Feature activation+4.37e+00

Pos logit contributions
 better+0.072
 closer+0.070
 again+0.068
's+0.060
 they+0.059
Neg logit contributions
stack-0.209
 habitats-0.208
 perspective-0.207
 meanings-0.199
arians-0.192
Token.
Token ID13
Feature activation-4.97e-01

Pos logit contributions
 habitats+0.149
stack+0.142
 perspective+0.140
ric+0.139
arians+0.138
Neg logit contributions
 to-0.025
 better-0.025
 again-0.024
 closer-0.022
 said-0.019
Token
Token ID198
Feature activation+3.07e+00

Pos logit contributions
 habitats+0.027
stack+0.027
 perspective+0.026
arians+0.025
 meanings+0.025
Neg logit contributions
 better-0.006
 again-0.005
 closer-0.004
's-0.004
 said-0.004
Token
Token ID198
Feature activation+3.03e+00

Pos logit contributions
 better+0.051
 again+0.047
 closer+0.046
 sandwich+0.046
's+0.041
Neg logit contributions
 habitats-0.140
stack-0.139
 perspective-0.134
ric-0.132
uff-0.132
Token"
Token ID1
Feature activation-5.20e+00

Pos logit contributions
 better+0.055
 said+0.052
's+0.050
 to+0.049
 closer+0.048
Neg logit contributions
 habitats-0.151
stack-0.147
 perspective-0.145
 meanings-0.138
arians-0.138
TokenY
Token ID56
Feature activation-3.03e+00

Pos logit contributions
 habitats+0.183
stack+0.177
 perspective+0.174
arians+0.163
 meanings+0.161
Neg logit contributions
 better-0.164
 again-0.154
 said-0.153
 closer-0.152
 to-0.152
Tokenuck
Token ID1347
Feature activation+8.65e+00

Pos logit contributions
stack+0.098
 perspective+0.096
 habitats+0.096
uff+0.096
 meanings+0.093
Neg logit contributions
 better-0.091
 closer-0.085
 again-0.084
 said-0.083
 they-0.083
Token!
Token ID0
Feature activation+3.05e-01

Pos logit contributions
 better+0.172
 again+0.157
 said+0.156
 to+0.154
 closer+0.151
Neg logit contributions
 habitats-0.411
stack-0.398
 perspective-0.392
arians-0.377
 meanings-0.372
Token Let
Token ID3914
Feature activation+1.48e+00

Pos logit contributions
 better+0.008
 again+0.007
 said+0.007
 to+0.007
 closer+0.007
Neg logit contributions
 habitats-0.012
stack-0.012
 perspective-0.012
arians-0.011
 meanings-0.011
Token's
Token ID338
Feature activation+1.47e-01

Pos logit contributions
 opener+0.032
 happier+0.031
 closer+0.030
 kiss+0.030
 late+0.030
Neg logit contributions
stack-0.051
 perspective-0.047
opes-0.045
arians-0.045
ric-0.045
Token get
Token ID651
Feature activation+1.51e+00

Pos logit contributions
 better+0.002
 said+0.002
 again+0.002
 closer+0.002
's+0.002
Neg logit contributions
 habitats-0.008
 perspective-0.008
stack-0.008
years-0.007
arians-0.007
Token out
Token ID503
Feature activation+9.99e-01

Pos logit contributions
 again+0.026
 suggested+0.025
 better+0.025
's+0.025
 said+0.024
Neg logit contributions
stack-0.061
years-0.060
 habitats-0.057
nam-0.057
ocol-0.056
Token.
Token ID13
Feature activation-7.46e-01

Pos logit contributions
 better+0.025
 closer+0.024
 again+0.022
 said+0.022
 to+0.022
Neg logit contributions
stack-0.072
 habitats-0.072
 perspective-0.069
arians-0.068
 meanings-0.066
Token
Token ID198
Feature activation+3.15e+00

Pos logit contributions
 habitats+0.034
stack+0.034
 perspective+0.033
arians+0.032
 meanings+0.031
Neg logit contributions
 better-0.015
 again-0.014
 said-0.014
 to-0.013
 closer-0.013
Token
Token ID198
Feature activation+3.10e+00

Pos logit contributions
 better+0.053
 again+0.049
 closer+0.047
 shot+0.044
 sandwich+0.044
Neg logit contributions
stack-0.143
 habitats-0.142
 perspective-0.138
uff-0.137
ric-0.134
Token"
Token ID1
Feature activation-5.14e+00

Pos logit contributions
 better+0.063
 said+0.057
's+0.056
 to+0.056
 closer+0.055
Neg logit contributions
 habitats-0.147
stack-0.142
 perspective-0.141
 meanings-0.134
arians-0.134
TokenY
Token ID56
Feature activation-2.98e+00

Pos logit contributions
 habitats+0.176
stack+0.170
 perspective+0.168
arians+0.156
 meanings+0.156
Neg logit contributions
 better-0.165
 again-0.155
 to-0.154
 said-0.153
 closer-0.153
Tokenuck
Token ID1347
Feature activation+8.64e+00

Pos logit contributions
uff+0.105
 habitats+0.105
 perspective+0.104
 meanings+0.103
stack+0.101
Neg logit contributions
 better-0.077
 closer-0.075
 said-0.071
 again-0.070
 they-0.070
Token!
Token ID0
Feature activation+1.68e-01

Pos logit contributions
 better+0.161
 said+0.147
 again+0.146
 to+0.145
's+0.139
Neg logit contributions
 habitats-0.426
stack-0.414
 perspective-0.406
arians-0.391
 meanings-0.385
Token I
Token ID314
Feature activation+9.44e-01

Pos logit contributions
 better+0.005
 again+0.004
 said+0.004
 to+0.004
 closer+0.004
Neg logit contributions
 habitats-0.007
stack-0.006
 perspective-0.006
arians-0.006
 meanings-0.006
Token don
Token ID836
Feature activation-3.02e+00

Pos logit contributions
 better+0.017
 again+0.017
 closer+0.016
 bigger+0.015
 they+0.015
Neg logit contributions
 habitats-0.041
 perspective-0.039
stack-0.038
 meanings-0.038
ric-0.037
Token't
Token ID470
Feature activation+4.04e+00

Pos logit contributions
stack+0.108
arians+0.100
 perspective+0.098
uff+0.098
ric+0.096
Neg logit contributions
 better-0.078
 closer-0.076
 again-0.074
 said-0.073
 kiss-0.070
Token like
Token ID588
Feature activation-3.13e+00

Pos logit contributions
 better+0.093
 again+0.091
 closer+0.090
 said+0.087
's+0.087
Neg logit contributions
 habitats-0.158
 perspective-0.157
stack-0.147
 meanings-0.147
years-0.145
Token night
Token ID1755
Feature activation+8.20e-01

Pos logit contributions
 better+0.076
 again+0.067
 said+0.065
 closer+0.065
 to+0.065
Neg logit contributions
 habitats-0.237
stack-0.232
 perspective-0.229
arians-0.219
 meanings-0.217
Token,
Token ID11
Feature activation-1.52e+00

Pos logit contributions
 better+0.015
 closer+0.014
 again+0.013
 said+0.013
's+0.013
Neg logit contributions
 habitats-0.038
stack-0.038
 perspective-0.037
arians-0.035
 meanings-0.035
Token she
Token ID673
Feature activation+1.48e+00

Pos logit contributions
 habitats+0.087
stack+0.084
 perspective+0.084
arians+0.081
ric+0.081
Neg logit contributions
 better-0.017
 to-0.015
 again-0.014
 said-0.014
's-0.013
Token quietly
Token ID12703
Feature activation+6.51e-01

Pos logit contributions
 better+0.027
 again+0.024
 closer+0.024
 to+0.023
 said+0.023
Neg logit contributions
 habitats-0.070
stack-0.069
 perspective-0.067
arians-0.065
 meanings-0.064
Token sn
Token ID3013
Feature activation-1.82e+00

Pos logit contributions
 better+0.013
 again+0.011
 said+0.011
 closer+0.011
 to+0.011
Neg logit contributions
 habitats-0.031
stack-0.030
 perspective-0.029
arians-0.028
 meanings-0.028
Tokenuck
Token ID1347
Feature activation+8.64e+00

Pos logit contributions
 habitats+0.078
stack+0.076
 perspective+0.076
arians+0.072
 meanings+0.071
Neg logit contributions
 better-0.045
 again-0.041
 to-0.041
 said-0.040
 closer-0.040
Token out
Token ID503
Feature activation+9.87e-01

Pos logit contributions
 better+0.138
 said+0.120
 again+0.119
 closer+0.117
's+0.115
Neg logit contributions
 habitats-0.432
stack-0.424
 perspective-0.418
arians-0.398
 meanings-0.395
Token of
Token ID286
Feature activation-2.98e+00

Pos logit contributions
 better+0.015
 again+0.013
 said+0.013
 closer+0.013
 to+0.013
Neg logit contributions
 habitats-0.050
stack-0.049
 perspective-0.049
arians-0.046
 meanings-0.046
Token her
Token ID607
Feature activation-1.65e+00

Pos logit contributions
 habitats+0.178
arians+0.174
 perspective+0.168
stack+0.166
ric+0.166
Neg logit contributions
 better-0.035
 to-0.030
 again-0.027
 some-0.026
 closer-0.024
Token house
Token ID2156
Feature activation-3.29e+00

Pos logit contributions
 habitats+0.067
years+0.067
 perspective+0.066
stack+0.065
arians+0.062
Neg logit contributions
 to-0.045
 better-0.044
's-0.041
 again-0.041
 said-0.039
Token and
Token ID290
Feature activation+2.73e+00

Pos logit contributions
 habitats+0.204
 perspective+0.197
ric+0.196
stack+0.194
arians+0.193
Neg logit contributions
 to-0.031
 again-0.025
 better-0.020
's-0.018
 said-0.016
Token He
Token ID679
Feature activation-2.72e+00

Pos logit contributions
 habitats+0.020
stack+0.019
 meanings+0.019
 perspective+0.018
ric+0.018
Neg logit contributions
 better-0.012
 closer-0.011
 again-0.011
's-0.011
 said-0.010
Token said
Token ID531
Feature activation+5.49e+00

Pos logit contributions
 habitats+0.130
stack+0.126
 perspective+0.121
 meanings+0.121
arians+0.118
Neg logit contributions
 better-0.045
 closer-0.041
 again-0.040
 to-0.039
's-0.036
Token,
Token ID11
Feature activation-1.67e+00

Pos logit contributions
 better+0.106
 said+0.099
 closer+0.097
's+0.094
 say+0.093
Neg logit contributions
 habitats-0.234
stack-0.232
 perspective-0.226
 meanings-0.217
years-0.215
Token "
Token ID366
Feature activation-7.11e+00

Pos logit contributions
 habitats+0.071
stack+0.068
 perspective+0.068
arians+0.064
 meanings+0.063
Neg logit contributions
 better-0.041
 again-0.038
 said-0.037
 to-0.037
 closer-0.037
TokenY
Token ID56
Feature activation-2.92e+00

Pos logit contributions
 habitats+0.266
stack+0.257
 perspective+0.254
 meanings+0.238
arians+0.238
Neg logit contributions
 better-0.203
 again-0.191
 to-0.190
 closer-0.189
 said-0.187
Tokenuck
Token ID1347
Feature activation+8.64e+00

Pos logit contributions
 habitats+0.100
 perspective+0.098
stack+0.097
 meanings+0.095
uff+0.095
Neg logit contributions
 better-0.084
 closer-0.081
 said-0.079
 again-0.078
 to-0.077
Token."
Token ID526
Feature activation+2.13e+00

Pos logit contributions
 better+0.139
 to+0.131
 again+0.131
's+0.121
 said+0.120
Neg logit contributions
 habitats-0.477
stack-0.461
 perspective-0.448
arians-0.439
 meanings-0.433
Token
Token ID198
Feature activation+3.48e+00

Pos logit contributions
 better+0.045
 closer+0.041
's+0.039
 bite+0.038
 kiss+0.035
Neg logit contributions
 habitats-0.087
stack-0.084
uff-0.080
 perspective-0.079
 meanings-0.077
Token
Token ID198
Feature activation+3.40e+00

Pos logit contributions
 better+0.069
 closer+0.064
 sandwich+0.063
 nicer+0.061
 again+0.061
Neg logit contributions
 habitats-0.142
stack-0.139
 perspective-0.136
ric-0.135
uff-0.134
TokenHis
Token ID6653
Feature activation-4.00e+00

Pos logit contributions
 better+0.092
 again+0.087
 closer+0.081
 to+0.080
 said+0.080
Neg logit contributions
 habitats-0.129
stack-0.125
 perspective-0.123
years-0.115
uff-0.114
Token mom
Token ID1995
Feature activation+6.34e-01

Pos logit contributions
 habitats+0.184
stack+0.179
 perspective+0.175
 meanings+0.172
arians+0.167
Neg logit contributions
 better-0.075
 closer-0.068
 again-0.067
 said-0.066
's-0.066
Token.
Token ID13
Feature activation-4.13e-01

Pos logit contributions
 better+0.014
 again+0.012
 said+0.011
 to+0.011
 closer+0.011
Neg logit contributions
 habitats-0.050
stack-0.049
 perspective-0.048
arians-0.046
 meanings-0.046
Token
Token ID198
Feature activation+3.31e+00

Pos logit contributions
stack+0.020
 habitats+0.020
 perspective+0.019
 meanings+0.019
arians+0.018
Neg logit contributions
 better-0.007
 again-0.006
 closer-0.006
's-0.005
 said-0.005
Token
Token ID198
Feature activation+3.24e+00

Pos logit contributions
 better+0.059
 sandwich+0.054
 again+0.053
 closer+0.052
 nicer+0.049
Neg logit contributions
stack-0.150
 habitats-0.143
 perspective-0.140
uff-0.133
arians-0.131
Token"
Token ID1
Feature activation-5.16e+00

Pos logit contributions
 better+0.068
 said+0.063
's+0.062
 closer+0.062
 to+0.062
Neg logit contributions
 habitats-0.150
 perspective-0.144
stack-0.143
ric-0.138
 meanings-0.136
TokenY
Token ID56
Feature activation-2.99e+00

Pos logit contributions
 habitats+0.189
stack+0.185
 perspective+0.180
 meanings+0.170
arians+0.168
Neg logit contributions
 better-0.150
 to-0.143
 again-0.142
 closer-0.140
's-0.139
Tokenuck
Token ID1347
Feature activation+8.63e+00

Pos logit contributions
 habitats+0.117
 perspective+0.114
stack+0.113
 meanings+0.112
uff+0.111
Neg logit contributions
 better-0.072
 closer-0.068
 again-0.066
 said-0.065
 to-0.064
Token!
Token ID0
Feature activation+3.37e-01

Pos logit contributions
 better+0.150
 said+0.140
 to+0.139
 again+0.138
's+0.129
Neg logit contributions
 habitats-0.445
stack-0.429
 perspective-0.424
arians-0.410
 meanings-0.403
Token This
Token ID770
Feature activation+7.27e-02

Pos logit contributions
 better+0.009
 again+0.009
 said+0.008
 to+0.008
 closer+0.008
Neg logit contributions
 habitats-0.013
stack-0.013
 perspective-0.013
arians-0.012
 meanings-0.012
Token lime
Token ID28738
Feature activation-9.45e-01

Pos logit contributions
 better+0.002
 again+0.001
 said+0.001
 to+0.001
 closer+0.001
Neg logit contributions
 habitats-0.003
stack-0.003
 perspective-0.003
arians-0.003
 meanings-0.003
Token is
Token ID318
Feature activation-1.44e+00

Pos logit contributions
 habitats+0.046
 perspective+0.044
stack+0.044
arians+0.042
 meanings+0.041
Neg logit contributions
 better-0.018
 again-0.016
's-0.016
 said-0.016
 to-0.016
Token sour
Token ID11348
Feature activation+2.92e+00

Pos logit contributions
 habitats+0.066
stack+0.064
 perspective+0.063
 meanings+0.060
arians+0.060
Neg logit contributions
 better-0.028
 again-0.026
 closer-0.025
 said-0.025
 to-0.024
Token as
Token ID355
Feature activation-1.06e-01

Pos logit contributions
 habitats+0.034
stack+0.033
years+0.033
 perspective+0.032
ric+0.032
Neg logit contributions
 better-0.007
 again-0.007
 to-0.007
's-0.006
 said-0.006
Token he
Token ID339
Feature activation+2.15e+00

Pos logit contributions
 habitats+0.006
stack+0.006
 perspective+0.006
arians+0.006
ric+0.006
Neg logit contributions
 better-0.001
 to-0.001
 they-0.001
 again-0.001
 closer-0.001
Token opened
Token ID4721
Feature activation+7.19e-01

Pos logit contributions
 said+0.039
 better+0.037
's+0.035
 again+0.034
 to+0.033
Neg logit contributions
 perspective-0.107
stack-0.107
 habitats-0.106
ric-0.100
uff-0.098
Token the
Token ID262
Feature activation+6.10e-02

Pos logit contributions
 better+0.009
 to+0.008
 again+0.008
 closer+0.007
 said+0.007
Neg logit contributions
 habitats-0.041
stack-0.040
 perspective-0.039
arians-0.038
 meanings-0.038
Token ch
Token ID442
Feature activation-1.73e+00

Pos logit contributions
 better+0.002
 to+0.001
 closer+0.001
 again+0.001
 bigger+0.001
Neg logit contributions
 habitats-0.003
 perspective-0.003
stack-0.002
years-0.002
arians-0.002
Tokenocol
Token ID4668
Feature activation-1.38e+01

Pos logit contributions
stack+0.038
 habitats+0.036
 perspective+0.033
arians+0.031
 meanings+0.031
Neg logit contributions
 better-0.077
 again-0.076
 to-0.075
 closer-0.075
 said-0.073
Tokenates
Token ID689
Feature activation-3.34e+00

Pos logit contributions
 habitats+0.231
 perspective+0.231
stack+0.230
 meanings+0.200
arians+0.190
Neg logit contributions
 to-0.639
 again-0.636
 better-0.624
's-0.611
 closer-0.610
Token and
Token ID290
Feature activation+2.68e+00

Pos logit contributions
 habitats+0.183
 perspective+0.173
stack+0.172
 meanings+0.170
arians+0.166
Neg logit contributions
 better-0.046
 again-0.045
 to-0.042
's-0.041
 closer-0.040
Token started
Token ID2067
Feature activation-4.44e+00

Pos logit contributions
 better+0.047
 said+0.044
 again+0.042
 to+0.042
 closer+0.041
Neg logit contributions
 habitats-0.134
 perspective-0.131
stack-0.128
arians-0.123
ric-0.122
Token eating
Token ID6600
Feature activation+1.15e+00

Pos logit contributions
 habitats+0.238
stack+0.234
 perspective+0.231
arians+0.224
 meanings+0.221
Neg logit contributions
 better-0.061
 to-0.060
 again-0.054
 said-0.052
 closer-0.052
Token them
Token ID606
Feature activation+4.48e+00

Pos logit contributions
 better+0.018
 again+0.016
 to+0.015
 closer+0.014
 said+0.013
Neg logit contributions
 habitats-0.063
stack-0.061
 perspective-0.061
arians-0.060
years-0.059
Token a
Token ID257
Feature activation+3.98e+00

Pos logit contributions
 better+0.007
 said+0.006
 again+0.006
 to+0.006
 closer+0.005
Neg logit contributions
 habitats-0.033
stack-0.032
 perspective-0.032
arians-0.031
ric-0.030
Token big
Token ID1263
Feature activation+1.94e+00

Pos logit contributions
 better+0.074
 closer+0.063
 again+0.061
 to+0.061
 said+0.061
Neg logit contributions
 habitats-0.204
stack-0.196
 perspective-0.189
arians-0.186
years-0.185
Token box
Token ID3091
Feature activation-1.98e+00

Pos logit contributions
 better+0.045
 again+0.042
 to+0.040
 said+0.040
 closer+0.039
Neg logit contributions
 habitats-0.092
stack-0.085
 perspective-0.084
arians-0.083
years-0.083
Token of
Token ID286
Feature activation-2.96e+00

Pos logit contributions
stack+0.094
 habitats+0.094
 perspective+0.092
arians+0.087
 meanings+0.086
Neg logit contributions
 better-0.035
 again-0.030
 said-0.029
 closer-0.029
's-0.029
Token ch
Token ID442
Feature activation-1.75e+00

Pos logit contributions
 habitats+0.132
stack+0.131
arians+0.123
 perspective+0.123
ric+0.123
Neg logit contributions
 better-0.078
 said-0.069
 to-0.068
 closer-0.066
 again-0.064
Tokenocol
Token ID4668
Feature activation-1.38e+01

Pos logit contributions
stack+0.041
 habitats+0.036
 perspective+0.033
arians+0.031
 meanings+0.030
Neg logit contributions
 better-0.078
 again-0.077
 to-0.077
 closer-0.077
's-0.075
Tokenates
Token ID689
Feature activation-3.17e+00

Pos logit contributions
 habitats+0.225
stack+0.225
 perspective+0.218
 meanings+0.183
arians+0.176
Neg logit contributions
 to-0.646
 again-0.642
 better-0.635
's-0.633
 said-0.621
Token.
Token ID13
Feature activation-6.10e-01

Pos logit contributions
 habitats+0.182
 perspective+0.173
stack+0.171
 meanings+0.169
arians+0.166
Neg logit contributions
 better-0.037
 again-0.037
 to-0.036
's-0.033
 said-0.032
Token It
Token ID632
Feature activation+2.08e+00

Pos logit contributions
 habitats+0.030
 perspective+0.030
stack+0.029
ric+0.029
arians+0.028
Neg logit contributions
 to-0.010
 said-0.010
 better-0.009
 again-0.009
's-0.009
Token was
Token ID373
Feature activation-2.95e+00

Pos logit contributions
 better+0.032
 again+0.028
 said+0.027
 closer+0.027
 to+0.027
Neg logit contributions
 habitats-0.106
stack-0.104
 perspective-0.103
arians-0.098
 meanings-0.098
Token so
Token ID523
Feature activation-8.46e-01

Pos logit contributions
 perspective+0.172
 habitats+0.171
arians+0.164
stack+0.163
ric+0.157
Neg logit contributions
 better-0.037
 to-0.034
 said-0.031
 bigger-0.029
 closer-0.028
Token a
Token ID257
Feature activation+3.74e+00

Pos logit contributions
stack+0.039
 habitats+0.039
 perspective+0.038
arians+0.037
ric+0.036
Neg logit contributions
 better-0.012
 to-0.011
 said-0.010
 again-0.010
's-0.010
Token large
Token ID1588
Feature activation+1.15e-01

Pos logit contributions
 better+0.082
 closer+0.071
 to+0.070
 said+0.067
 bigger+0.066
Neg logit contributions
 habitats-0.186
stack-0.174
 perspective-0.172
years-0.171
arians-0.169
Token selection
Token ID6356
Feature activation-5.59e+00

Pos logit contributions
 better+0.003
 to+0.003
 look+0.002
 again+0.002
 sandwich+0.002
Neg logit contributions
 habitats-0.006
years-0.005
arians-0.005
ric-0.005
 perspective-0.005
Token of
Token ID286
Feature activation-3.00e+00

Pos logit contributions
 habitats+0.371
 perspective+0.350
stack+0.348
ric+0.348
 meanings+0.337
Neg logit contributions
 better-0.031
 to-0.031
 again-0.026
 said-0.021
's-0.021
Token ch
Token ID442
Feature activation-1.99e+00

Pos logit contributions
stack+0.148
 habitats+0.140
 perspective+0.136
ric+0.134
arians+0.134
Neg logit contributions
 better-0.071
 to-0.067
 said-0.065
 some-0.063
 say-0.059
Tokenocol
Token ID4668
Feature activation-1.37e+01

Pos logit contributions
stack+0.051
 habitats+0.038
 perspective+0.037
 meanings+0.031
arians+0.031
Neg logit contributions
 to-0.090
 better-0.087
 of-0.085
 again-0.085
's-0.084
Tokenates
Token ID689
Feature activation-3.27e+00

Pos logit contributions
stack+0.253
 perspective+0.240
 habitats+0.240
 meanings+0.200
arians+0.192
Neg logit contributions
 to-0.633
 again-0.622
 better-0.614
's-0.607
 closer-0.592
Token.
Token ID13
Feature activation-9.81e-01

Pos logit contributions
 habitats+0.197
 perspective+0.190
stack+0.186
 meanings+0.183
ric+0.180
Neg logit contributions
 to-0.035
 better-0.030
 again-0.030
's-0.028
 said-0.027
Token Ell
Token ID7122
Feature activation-4.85e+00

Pos logit contributions
 habitats+0.046
 perspective+0.045
stack+0.044
 meanings+0.043
arians+0.042
Neg logit contributions
 to-0.019
-0.019
 better-0.019
 said-0.018
 they-0.018
Tokena
Token ID64
Feature activation-4.94e-01

Pos logit contributions
 habitats+0.210
stack+0.207
 perspective+0.199
 meanings+0.190
arians+0.188
Neg logit contributions
 better-0.115
 said-0.114
's-0.114
 again-0.110
 to-0.108
Token picked
Token ID6497
Feature activation+2.25e+00

Pos logit contributions
 habitats+0.027
 perspective+0.027
stack+0.027
ric+0.025
 meanings+0.025
Neg logit contributions
 said-0.006
's-0.006
 to-0.006
 again-0.006
 better-0.006
Token was
Token ID373
Feature activation-3.13e+00

Pos logit contributions
 better+0.115
 said+0.102
 closer+0.102
 to+0.098
 again+0.098
Neg logit contributions
 habitats-0.332
stack-0.329
 perspective-0.323
arians-0.309
 meanings-0.305
Token filled
Token ID5901
Feature activation-1.56e-01

Pos logit contributions
 habitats+0.142
stack+0.141
 perspective+0.137
arians+0.129
 meanings+0.129
Neg logit contributions
 better-0.059
 again-0.053
 said-0.053
 closer-0.053
's-0.052
Token with
Token ID351
Feature activation-1.51e+00

Pos logit contributions
 habitats+0.008
stack+0.008
 perspective+0.008
arians+0.007
 meanings+0.007
Neg logit contributions
 better-0.002
 said-0.002
 again-0.002
 closer-0.002
 to-0.002
Token delicious
Token ID12625
Feature activation-8.31e-01

Pos logit contributions
stack+0.073
 habitats+0.072
arians+0.070
 perspective+0.069
ric+0.067
Neg logit contributions
 better-0.032
 to-0.027
 again-0.027
 said-0.027
 closer-0.027
Token ch
Token ID442
Feature activation-2.09e+00

Pos logit contributions
stack+0.038
 habitats+0.037
 perspective+0.035
arians+0.035
ric+0.035
Neg logit contributions
 better-0.019
 to-0.018
 again-0.018
 said-0.017
's-0.017
Tokenocol
Token ID4668
Feature activation-1.37e+01

Pos logit contributions
stack+0.054
 habitats+0.046
nam+0.043
 perspective+0.043
 meanings+0.041
Neg logit contributions
 better-0.087
 to-0.084
 again-0.083
 bigger-0.082
 come-0.082
Tokenates
Token ID689
Feature activation-3.13e+00

Pos logit contributions
stack+0.236
 habitats+0.234
 perspective+0.231
 meanings+0.190
arians+0.181
Neg logit contributions
 again-0.636
 to-0.636
 better-0.630
's-0.624
 said-0.613
Token.
Token ID13
Feature activation-7.86e-01

Pos logit contributions
 habitats+0.165
stack+0.159
 perspective+0.158
 meanings+0.152
arians+0.152
Neg logit contributions
 better-0.046
 again-0.042
 to-0.041
 said-0.040
 closer-0.039
Token He
Token ID679
Feature activation-3.32e+00

Pos logit contributions
 habitats+0.039
 perspective+0.038
stack+0.037
arians+0.036
 meanings+0.036
Neg logit contributions
 better-0.014
 said-0.013
 to-0.013
 again-0.013
's-0.012
Token was
Token ID373
Feature activation-3.16e+00

Pos logit contributions
 habitats+0.169
 perspective+0.166
stack+0.164
arians+0.157
ric+0.155
Neg logit contributions
 better-0.055
 said-0.053
 again-0.050
's-0.049
 to-0.048
Token happy
Token ID3772
Feature activation+1.33e+00

Pos logit contributions
 habitats+0.152
 perspective+0.146
stack+0.145
arians+0.141
 meanings+0.139
Neg logit contributions
 better-0.066
 to-0.059
 again-0.058
 said-0.058
's-0.057
Token not
Token ID407
Feature activation+4.18e+00

Pos logit contributions
 habitats+0.123
 perspective+0.118
stack+0.118
years+0.115
arians+0.114
Neg logit contributions
 better-0.056
 said-0.050
 to-0.049
 again-0.049
's-0.048
Token eat
Token ID4483
Feature activation+1.36e+00

Pos logit contributions
 better+0.065
 to+0.058
 again+0.054
 said+0.054
's+0.053
Neg logit contributions
 habitats-0.219
stack-0.217
 perspective-0.209
arians-0.207
years-0.205
Token too
Token ID1165
Feature activation+1.50e+00

Pos logit contributions
 better+0.028
 again+0.025
 to+0.025
 closer+0.023
 said+0.022
Neg logit contributions
stack-0.066
 perspective-0.065
 habitats-0.065
arians-0.062
years-0.061
Token many
Token ID867
Feature activation-5.12e+00

Pos logit contributions
 better+0.022
 to+0.020
 closer+0.018
 again+0.018
 said+0.017
Neg logit contributions
stack-0.083
 habitats-0.081
arians-0.079
 perspective-0.078
ric-0.077
Token ch
Token ID442
Feature activation-1.85e+00

Pos logit contributions
stack+0.239
 habitats+0.214
 perspective+0.211
arians+0.207
ric+0.206
Neg logit contributions
 better-0.137
 again-0.125
 to-0.125
 said-0.122
's-0.118
Tokenocol
Token ID4668
Feature activation-1.37e+01

Pos logit contributions
stack+0.038
 habitats+0.030
 perspective+0.030
arians+0.029
 meanings+0.026
Neg logit contributions
 to-0.089
 better-0.088
 again-0.086
 closer-0.084
 bite-0.082
Tokenates
Token ID689
Feature activation-3.17e+00

Pos logit contributions
stack+0.233
 perspective+0.229
 habitats+0.222
 meanings+0.182
arians+0.177
Neg logit contributions
 to-0.643
 again-0.642
 better-0.632
's-0.615
 said-0.609
Token and
Token ID290
Feature activation+2.84e+00

Pos logit contributions
 habitats+0.175
 perspective+0.170
stack+0.168
 meanings+0.162
arians+0.161
Neg logit contributions
 better-0.047
 to-0.043
 again-0.043
's-0.038
 said-0.037
Token his
Token ID465
Feature activation-1.77e+00

Pos logit contributions
 better+0.052
 to+0.049
 said+0.048
 again+0.044
's+0.043
Neg logit contributions
stack-0.141
 perspective-0.141
 habitats-0.140
arians-0.136
uff-0.134
Token friend
Token ID1545
Feature activation-1.83e+00

Pos logit contributions
stack+0.074
 perspective+0.072
arians+0.071
 habitats+0.070
years+0.069
Neg logit contributions
 to-0.044
 better-0.044
 said-0.042
 again-0.040
 sandwich-0.039
Token learned
Token ID4499
Feature activation+1.04e-01

Pos logit contributions
 habitats+0.094
 perspective+0.093
stack+0.093
arians+0.088
ric+0.088
Neg logit contributions
 better-0.032
 said-0.031
's-0.030
 to-0.030
 again-0.028
Token colorful
Token ID20239
Feature activation+1.50e+00

Pos logit contributions
stack+0.269
ric+0.244
 habitats+0.244
 perspective+0.237
arians+0.227
Neg logit contributions
 better-0.133
 said-0.123
's-0.119
 closer-0.114
 to-0.114
Token cand
Token ID2658
Feature activation-4.79e+00

Pos logit contributions
 to+0.041
 better+0.041
 said+0.040
's+0.040
 again+0.038
Neg logit contributions
stack-0.064
ric-0.057
 habitats-0.056
 perspective-0.054
years-0.054
Tokenies
Token ID444
Feature activation-4.43e-01

Pos logit contributions
 habitats+0.195
stack+0.191
 perspective+0.186
arians+0.175
 meanings+0.174
Neg logit contributions
 better-0.124
 said-0.116
 again-0.116
 to-0.115
's-0.113
Token and
Token ID290
Feature activation+2.53e+00

Pos logit contributions
 habitats+0.023
stack+0.023
 perspective+0.022
 meanings+0.022
arians+0.022
Neg logit contributions
 better-0.007
 again-0.006
 to-0.006
 said-0.006
's-0.006
Token ch
Token ID442
Feature activation-2.07e+00

Pos logit contributions
 said+0.049
 better+0.046
 to+0.044
 they+0.042
's+0.042
Neg logit contributions
 habitats-0.123
stack-0.123
 perspective-0.122
ric-0.116
arians-0.116
Tokenocol
Token ID4668
Feature activation-1.36e+01

Pos logit contributions
stack+0.054
 perspective+0.044
 habitats+0.041
arians+0.041
 meanings+0.036
Neg logit contributions
 to-0.086
 said-0.084
 better-0.083
 closer-0.083
 again-0.082
Tokenates
Token ID689
Feature activation-3.27e+00

Pos logit contributions
stack+0.237
 perspective+0.235
 habitats+0.225
 meanings+0.182
arians+0.175
Neg logit contributions
 to-0.631
 again-0.626
 better-0.618
 said-0.614
's-0.605
Token.
Token ID13
Feature activation-7.79e-01

Pos logit contributions
 habitats+0.176
stack+0.171
 perspective+0.169
arians+0.163
 meanings+0.162
Neg logit contributions
 better-0.045
 to-0.041
 again-0.041
 said-0.040
 closer-0.038
Token She
Token ID1375
Feature activation-3.98e+00

Pos logit contributions
 habitats+0.034
 perspective+0.033
stack+0.033
ric+0.031
arians+0.031
Neg logit contributions
 to-0.018
 better-0.017
 said-0.017
's-0.017
 again-0.017
Token said
Token ID531
Feature activation+5.38e+00

Pos logit contributions
 habitats+0.206
stack+0.202
 perspective+0.200
arians+0.191
ric+0.188
Neg logit contributions
 better-0.059
 said-0.054
 again-0.052
 to-0.051
's-0.051
Token to
Token ID284
Feature activation+5.89e-01

Pos logit contributions
 better+0.097
 said+0.084
 again+0.084
 closer+0.083
's+0.081
Neg logit contributions
 habitats-0.253
stack-0.251
 perspective-0.245
arians-0.233
 meanings-0.233
Token enjoyed
Token ID8359
Feature activation-3.42e+00

Pos logit contributions
 better+0.040
 said+0.040
 again+0.036
 to+0.036
's+0.035
Neg logit contributions
 habitats-0.133
 perspective-0.131
stack-0.126
arians-0.124
ric-0.124
Token her
Token ID607
Feature activation-2.06e+00

Pos logit contributions
 habitats+0.196
 perspective+0.194
arians+0.192
stack+0.191
 meanings+0.184
Neg logit contributions
 better-0.042
 to-0.042
 again-0.036
 they-0.036
 said-0.034
Token cool
Token ID3608
Feature activation+2.25e+00

Pos logit contributions
 habitats+0.095
stack+0.093
 perspective+0.091
arians+0.087
 meanings+0.086
Neg logit contributions
 better-0.041
 again-0.037
 said-0.037
 closer-0.036
 to-0.036
Token,
Token ID11
Feature activation-1.88e+00

Pos logit contributions
 better+0.040
 to+0.038
 again+0.037
 said+0.035
's+0.035
Neg logit contributions
 habitats-0.111
stack-0.110
 perspective-0.107
arians-0.105
years-0.103
Token ch
Token ID442
Feature activation-2.03e+00

Pos logit contributions
 habitats+0.088
stack+0.086
 perspective+0.085
arians+0.082
 meanings+0.080
Neg logit contributions
 better-0.038
 again-0.034
 to-0.034
 said-0.034
 closer-0.033
Tokenocol
Token ID4668
Feature activation-1.36e+01

Pos logit contributions
stack+0.064
 habitats+0.057
arians+0.056
 perspective+0.054
 meanings+0.054
Neg logit contributions
 again-0.070
 to-0.070
 better-0.070
 said-0.069
 closer-0.069
Tokenat
Token ID265
Feature activation+1.50e+00

Pos logit contributions
stack+0.260
 perspective+0.259
 habitats+0.255
 meanings+0.220
arians+0.215
Neg logit contributions
 to-0.598
 again-0.595
 said-0.576
 better-0.575
's-0.575
Tokeny
Token ID88
Feature activation-2.39e+00

Pos logit contributions
 better+0.032
 again+0.031
 to+0.031
's+0.030
 said+0.029
Neg logit contributions
 habitats-0.070
 perspective-0.067
stack-0.065
 meanings-0.064
arians-0.063
Token treat
Token ID2190
Feature activation-1.19e-01

Pos logit contributions
 habitats+0.118
stack+0.116
 perspective+0.114
arians+0.113
years+0.112
Neg logit contributions
 to-0.046
 again-0.043
 better-0.042
's-0.042
 said-0.040
Token!
Token ID0
Feature activation-3.77e-01

Pos logit contributions
 habitats+0.006
 perspective+0.006
stack+0.006
arians+0.006
ric+0.006
Neg logit contributions
 better-0.002
 to-0.002
 said-0.002
 again-0.002
 they-0.002
Token
Token ID198
Feature activation+2.39e+00

Pos logit contributions
 habitats+0.022
 perspective+0.022
stack+0.021
arians+0.021
 meanings+0.021
Neg logit contributions
 better-0.003
 said-0.003
-0.003
 to-0.003
's-0.002
Token looking
Token ID2045
Feature activation+4.34e+00

Pos logit contributions
 habitats+0.459
 perspective+0.450
stack+0.445
ric+0.429
arians+0.425
Neg logit contributions
 they-0.113
 to-0.107
 some-0.097
 better-0.089
 said-0.083
Token at
Token ID379
Feature activation-3.22e+00

Pos logit contributions
 better+0.089
 said+0.083
 again+0.080
's+0.079
 say+0.077
Neg logit contributions
 habitats-0.189
stack-0.184
 perspective-0.183
 meanings-0.171
years-0.171
Token all
Token ID477
Feature activation-4.69e+00

Pos logit contributions
 habitats+0.182
arians+0.180
 perspective+0.174
stack+0.173
years+0.170
Neg logit contributions
 better-0.038
 to-0.037
 again-0.033
 said-0.032
 some-0.032
Token the
Token ID262
Feature activation-2.07e-01

Pos logit contributions
 habitats+0.281
stack+0.269
arians+0.269
 perspective+0.264
 meanings+0.256
Neg logit contributions
 to-0.050
 better-0.050
's-0.043
 said-0.043
 again-0.038
Token ch
Token ID442
Feature activation-2.08e+00

Pos logit contributions
stack+0.010
 habitats+0.010
 perspective+0.009
arians+0.009
ric+0.009
Neg logit contributions
 better-0.005
 to-0.004
 said-0.004
's-0.004
 again-0.004
Tokenocol
Token ID4668
Feature activation-1.36e+01

Pos logit contributions
stack+0.041
 perspective+0.036
 habitats+0.036
arians+0.030
 meanings+0.026
Neg logit contributions
 better-0.101
 to-0.099
 again-0.096
's-0.096
 said-0.096
Tokenates
Token ID689
Feature activation-3.11e+00

Pos logit contributions
 perspective+0.254
stack+0.249
 habitats+0.238
arians+0.203
 meanings+0.202
Neg logit contributions
 to-0.611
 again-0.605
 better-0.598
's-0.593
 said-0.586
Token,
Token ID11
Feature activation-1.60e+00

Pos logit contributions
 habitats+0.178
 perspective+0.169
stack+0.166
arians+0.161
 meanings+0.160
Neg logit contributions
 better-0.042
 to-0.039
 again-0.038
 said-0.037
's-0.036
Token Lex
Token ID17210
Feature activation-3.35e+00

Pos logit contributions
 habitats+0.066
 perspective+0.061
arians+0.061
stack+0.060
years+0.059
Neg logit contributions
 better-0.044
 to-0.044
 they-0.044
 said-0.043
 closer-0.043
Tokeni
Token ID72
Feature activation+5.05e-01

Pos logit contributions
 habitats+0.093
stack+0.092
 perspective+0.087
arians+0.083
 meanings+0.081
Neg logit contributions
 better-0.131
 said-0.129
's-0.128
 again-0.127
 to-0.127
Token chose
Token ID7690
Feature activation-1.07e+00

Pos logit contributions
 said+0.007
 better+0.007
's+0.007
 again+0.007
 to+0.006
Neg logit contributions
 habitats-0.027
 perspective-0.026
stack-0.026
ric-0.025
arians-0.025
Token find
Token ID1064
Feature activation+2.28e+00

Pos logit contributions
 habitats+0.036
 perspective+0.035
stack+0.034
arians+0.033
 meanings+0.032
Neg logit contributions
 better-0.010
 said-0.009
 say-0.009
 closer-0.009
's-0.009
Token a
Token ID257
Feature activation+4.09e+00

Pos logit contributions
 better+0.030
 again+0.027
 said+0.026
's+0.025
 closer+0.024
Neg logit contributions
 habitats-0.117
stack-0.115
 perspective-0.112
 meanings-0.109
arians-0.107
Token box
Token ID3091
Feature activation-2.01e+00

Pos logit contributions
 better+0.098
 to+0.086
 closer+0.086
 said+0.081
 again+0.080
Neg logit contributions
 habitats-0.190
years-0.179
stack-0.177
 perspective-0.175
arians-0.174
Token of
Token ID286
Feature activation-3.26e+00

Pos logit contributions
stack+0.095
 habitats+0.091
 perspective+0.089
arians+0.084
uff+0.083
Neg logit contributions
 better-0.035
 again-0.029
 said-0.028
 closer-0.028
's-0.028
Token ch
Token ID442
Feature activation-1.97e+00

Pos logit contributions
stack+0.146
 habitats+0.144
 perspective+0.138
arians+0.137
ric+0.133
Neg logit contributions
 better-0.086
 said-0.076
 to-0.075
 closer-0.074
 say-0.072
Tokenocol
Token ID4668
Feature activation-1.36e+01

Pos logit contributions
stack+0.048
 habitats+0.042
 perspective+0.041
arians+0.037
 meanings+0.034
Neg logit contributions
 better-0.086
 to-0.085
 again-0.084
 closer-0.083
's-0.082
Tokenates
Token ID689
Feature activation-2.73e+00

Pos logit contributions
stack+0.231
 habitats+0.229
 perspective+0.226
 meanings+0.179
arians+0.177
Neg logit contributions
 better-0.636
 to-0.634
 again-0.628
's-0.627
 said-0.616
Token.
Token ID13
Feature activation-5.55e-01

Pos logit contributions
 habitats+0.157
 perspective+0.152
stack+0.150
 meanings+0.146
arians+0.145
Neg logit contributions
 better-0.029
 to-0.028
 again-0.027
 said-0.025
's-0.025
Token
Token ID198
Feature activation+3.16e+00

Pos logit contributions
 habitats+0.023
stack+0.023
 perspective+0.021
 meanings+0.020
arians+0.020
Neg logit contributions
 better-0.013
 again-0.012
's-0.012
 closer-0.012
 bite-0.012
Token
Token ID198
Feature activation+3.09e+00

Pos logit contributions
 better+0.054
 sandwich+0.053
 bite+0.048
 closer+0.048
 again+0.048
Neg logit contributions
 habitats-0.137
stack-0.136
 perspective-0.135
uff-0.129
ric-0.127
Token"
Token ID1
Feature activation-5.29e+00

Pos logit contributions
 better+0.072
 again+0.066
 said+0.064
 closer+0.064
 to+0.064
Neg logit contributions
 habitats-0.132
stack-0.130
 perspective-0.127
arians-0.121
 meanings-0.119
Token son
Token ID3367
Feature activation-2.26e+00

Pos logit contributions
stack+0.130
 habitats+0.125
 perspective+0.125
years+0.118
arians+0.116
Neg logit contributions
 to-0.076
 better-0.073
 again-0.069
 said-0.068
 sandwich-0.067
Token is
Token ID318
Feature activation-1.62e+00

Pos logit contributions
 habitats+0.117
 perspective+0.113
stack+0.113
years+0.107
arians+0.106
Neg logit contributions
 better-0.038
's-0.036
 said-0.035
 to-0.035
 again-0.034
Token eating
Token ID6600
Feature activation+1.18e+00

Pos logit contributions
 habitats+0.076
stack+0.074
 perspective+0.074
arians+0.070
 meanings+0.069
Neg logit contributions
 better-0.035
 to-0.031
 again-0.031
 said-0.030
 closer-0.030
Token our
Token ID674
Feature activation-2.00e+00

Pos logit contributions
 better+0.018
 again+0.016
 to+0.015
 closer+0.013
's+0.013
Neg logit contributions
 habitats-0.066
stack-0.064
 perspective-0.064
arians-0.062
 meanings-0.061
Token ch
Token ID442
Feature activation-2.05e+00

Pos logit contributions
stack+0.093
 habitats+0.093
 perspective+0.092
arians+0.086
years+0.084
Neg logit contributions
 better-0.043
's-0.039
 to-0.038
 again-0.038
 said-0.038
Tokenocol
Token ID4668
Feature activation-1.36e+01

Pos logit contributions
stack+0.042
 habitats+0.034
 perspective+0.034
nam+0.027
 meanings+0.026
Neg logit contributions
 better-0.096
 to-0.095
 bite-0.093
 again-0.093
's-0.092
Tokenates
Token ID689
Feature activation-3.10e+00

Pos logit contributions
 perspective+0.234
stack+0.229
 habitats+0.225
arians+0.178
 meanings+0.175
Neg logit contributions
 to-0.634
 better-0.622
 again-0.622
's-0.620
 said-0.609
Token!"
Token ID2474
Feature activation+2.15e+00

Pos logit contributions
 habitats+0.160
 perspective+0.153
stack+0.150
 meanings+0.148
arians+0.145
Neg logit contributions
 better-0.055
 again-0.053
 to-0.052
's-0.049
 said-0.047
Token
Token ID220
Feature activation-9.24e-01

Pos logit contributions
 better+0.034
 again+0.030
 said+0.030
 to+0.029
 closer+0.029
Neg logit contributions
 habitats-0.108
stack-0.106
 perspective-0.105
arians-0.100
 meanings-0.099
Token
Token ID198
Feature activation+2.64e+00

Pos logit contributions
stack+0.048
 habitats+0.047
 perspective+0.046
ric+0.046
uff+0.045
Neg logit contributions
 better-0.011
 closer-0.010
 again-0.009
 bite-0.008
 said-0.008
Token
Token ID198
Feature activation+2.53e+00

Pos logit contributions
 better+0.050
 closer+0.046
 again+0.045
 to+0.043
 said+0.043
Neg logit contributions
 habitats-0.120
stack-0.118
 perspective-0.116
arians-0.112
uff-0.109
Token and
Token ID290
Feature activation+2.44e+00

Pos logit contributions
 habitats+0.110
stack+0.109
 perspective+0.107
arians+0.102
 meanings+0.101
Neg logit contributions
 better-0.028
 again-0.022
 closer-0.022
 said-0.022
 to-0.021
Token started
Token ID2067
Feature activation-4.54e+00

Pos logit contributions
 better+0.037
 again+0.031
 closer+0.030
 to+0.029
's+0.029
Neg logit contributions
stack-0.121
 habitats-0.120
 perspective-0.115
arians-0.112
years-0.111
Token eating
Token ID6600
Feature activation+1.22e+00

Pos logit contributions
 habitats+0.269
stack+0.264
 perspective+0.261
arians+0.252
 meanings+0.249
Neg logit contributions
 better-0.034
 again-0.025
 to-0.025
 said-0.024
 closer-0.023
Token the
Token ID262
Feature activation-7.41e-03

Pos logit contributions
 better+0.016
 again+0.014
 to+0.013
 closer+0.012
 said+0.012
Neg logit contributions
 habitats-0.070
stack-0.068
 perspective-0.067
arians-0.066
ric-0.064
Token ch
Token ID442
Feature activation-1.91e+00

Pos logit contributions
 habitats+0.000
stack+0.000
 perspective+0.000
arians+0.000
 meanings+0.000
Neg logit contributions
 better-0.000
 again-0.000
 said-0.000
 to-0.000
 closer-0.000
Tokenocol
Token ID4668
Feature activation-1.36e+01

Pos logit contributions
stack+0.032
 habitats+0.031
 perspective+0.029
 meanings+0.021
arians+0.021
Neg logit contributions
 better-0.095
 to-0.092
 again-0.092
's-0.091
 bite-0.091
Tokenates
Token ID689
Feature activation-2.97e+00

Pos logit contributions
 perspective+0.228
 habitats+0.227
stack+0.214
 meanings+0.178
arians+0.177
Neg logit contributions
 to-0.634
 better-0.633
 again-0.630
's-0.624
 said-0.614
Token.
Token ID13
Feature activation-9.63e-01

Pos logit contributions
 habitats+0.159
stack+0.153
 perspective+0.153
arians+0.146
 meanings+0.146
Neg logit contributions
 better-0.042
 again-0.038
 to-0.037
 said-0.036
's-0.035
Token The
Token ID383
Feature activation-1.94e+00

Pos logit contributions
 habitats+0.046
 perspective+0.045
stack+0.044
arians+0.042
ric+0.042
Neg logit contributions
 better-0.019
 to-0.019
 said-0.018
 again-0.018
's-0.017
Token shop
Token ID6128
Feature activation-6.08e+00

Pos logit contributions
 habitats+0.091
 perspective+0.091
stack+0.086
ric+0.082
opes+0.080
Neg logit contributions
 better-0.050
 said-0.044
 bigger-0.042
 sandwich-0.042
's-0.041
Tokenkeeper
Token ID13884
Feature activation-7.86e-01

Pos logit contributions
 habitats+0.286
stack+0.278
 perspective+0.276
arians+0.262
 meanings+0.259
Neg logit contributions
 better-0.119
 again-0.108
 said-0.108
 to-0.106
's-0.105
Token she
Token ID673
Feature activation+1.04e+00

Pos logit contributions
 habitats+0.185
stack+0.181
 perspective+0.179
arians+0.172
 meanings+0.170
Neg logit contributions
 better-0.044
 again-0.038
 said-0.037
 to-0.036
 closer-0.036
Token paid
Token ID3432
Feature activation-1.57e+00

Pos logit contributions
 better+0.016
 again+0.014
 closer+0.014
 to+0.013
's+0.012
Neg logit contributions
stack-0.050
 habitats-0.050
arians-0.047
 perspective-0.047
 meanings-0.046
Token for
Token ID329
Feature activation-2.53e+00

Pos logit contributions
 habitats+0.080
stack+0.078
 perspective+0.077
arians+0.074
 meanings+0.073
Neg logit contributions
 better-0.024
 again-0.021
 said-0.021
 closer-0.021
 to-0.021
Token the
Token ID262
Feature activation-1.16e-01

Pos logit contributions
 habitats+0.150
 perspective+0.147
stack+0.145
ric+0.140
arians+0.138
Neg logit contributions
 better-0.030
 to-0.024
 again-0.023
 said-0.021
 closer-0.019
Token ch
Token ID442
Feature activation-2.12e+00

Pos logit contributions
 habitats+0.006
 perspective+0.005
stack+0.005
ric+0.005
years+0.005
Neg logit contributions
 better-0.003
 to-0.002
 said-0.002
 closer-0.002
 again-0.002
Tokenocol
Token ID4668
Feature activation-1.36e+01

Pos logit contributions
 habitats+0.035
stack+0.035
 perspective+0.031
 meanings+0.025
nam+0.024
Neg logit contributions
 better-0.105
 to-0.103
 again-0.102
 bite-0.101
 closer-0.099
Tokenates
Token ID689
Feature activation-3.20e+00

Pos logit contributions
 perspective+0.259
 habitats+0.247
stack+0.242
 meanings+0.199
arians+0.195
Neg logit contributions
 to-0.614
 again-0.609
 better-0.604
's-0.596
 said-0.585
Token and
Token ID290
Feature activation+2.26e+00

Pos logit contributions
 habitats+0.176
 perspective+0.168
stack+0.167
 meanings+0.163
arians+0.162
Neg logit contributions
 better-0.042
 again-0.039
 to-0.039
 closer-0.036
 said-0.036
Token the
Token ID262
Feature activation-9.37e-02

Pos logit contributions
 better+0.033
 said+0.029
 again+0.029
 to+0.028
 closer+0.028
Neg logit contributions
 habitats-0.118
 perspective-0.115
stack-0.115
arians-0.109
ric-0.108
Token two
Token ID734
Feature activation+5.51e-01

Pos logit contributions
 habitats+0.004
 perspective+0.004
stack+0.004
ric+0.004
years+0.004
Neg logit contributions
 better-0.002
 said-0.002
 to-0.002
 again-0.002
 bigger-0.002
Token of
Token ID286
Feature activation-3.17e+00

Pos logit contributions
 better+0.008
 said+0.007
 again+0.007
 to+0.007
's+0.007
Neg logit contributions
 habitats-0.029
stack-0.029
 perspective-0.028
ric-0.027
arians-0.027
Tokenates
Token ID689
Feature activation-2.86e+00

Pos logit contributions
 habitats+0.228
stack+0.224
 perspective+0.221
 meanings+0.182
arians+0.169
Neg logit contributions
 to-0.628
 again-0.625
 better-0.624
's-0.619
 said-0.604
Token.
Token ID13
Feature activation-6.05e-01

Pos logit contributions
 habitats+0.166
 perspective+0.158
stack+0.154
 meanings+0.153
years+0.151
Neg logit contributions
 to-0.035
 better-0.034
's-0.034
 again-0.033
 said-0.030
Token Lily
Token ID20037
Feature activation+7.14e-01

Pos logit contributions
 habitats+0.026
stack+0.026
 perspective+0.025
arians+0.024
 meanings+0.024
Neg logit contributions
 better-0.014
 again-0.012
 closer-0.012
 said-0.012
 to-0.012
Token loved
Token ID6151
Feature activation-4.23e+00

Pos logit contributions
 said+0.011
 better+0.011
's+0.011
 to+0.010
 again+0.010
Neg logit contributions
 habitats-0.036
 perspective-0.036
stack-0.035
arians-0.033
ric-0.033
Token ch
Token ID442
Feature activation-1.72e+00

Pos logit contributions
stack+0.204
 perspective+0.202
 habitats+0.198
arians+0.191
uff+0.181
Neg logit contributions
 to-0.089
 better-0.083
 said-0.077
 again-0.075
 they-0.072
Tokenocol
Token ID4668
Feature activation-1.35e+01

Pos logit contributions
stack+0.038
 habitats+0.034
 perspective+0.033
 meanings+0.026
arians+0.026
Neg logit contributions
 better-0.079
 to-0.078
 again-0.076
 bite-0.075
 closer-0.075
Tokenates
Token ID689
Feature activation-3.06e+00

Pos logit contributions
 perspective+0.251
stack+0.246
 habitats+0.242
 meanings+0.196
arians+0.187
Neg logit contributions
 to-0.614
 again-0.611
's-0.606
 better-0.605
 said-0.591
Token.
Token ID13
Feature activation-6.10e-01

Pos logit contributions
 habitats+0.170
 perspective+0.164
stack+0.164
arians+0.157
 meanings+0.156
Neg logit contributions
 better-0.040
 again-0.034
 to-0.034
's-0.033
 said-0.032
Token She
Token ID1375
Feature activation-3.69e+00

Pos logit contributions
 habitats+0.026
stack+0.025
 perspective+0.025
arians+0.023
 meanings+0.023
Neg logit contributions
 better-0.015
 to-0.014
 again-0.014
 said-0.014
's-0.014
Token thanked
Token ID26280
Feature activation+1.80e+00

Pos logit contributions
 habitats+0.174
stack+0.170
 perspective+0.169
arians+0.160
 meanings+0.158
Neg logit contributions
 better-0.071
 said-0.065
 again-0.064
 to-0.063
's-0.062
Token Ben
Token ID3932
Feature activation+2.48e+00

Pos logit contributions
 better+0.029
 again+0.026
 to+0.026
 said+0.026
 closer+0.025
Neg logit contributions
 habitats-0.091
stack-0.089
 perspective-0.089
arians-0.084
 meanings-0.083
Token a
Token ID257
Feature activation+3.92e+00

Pos logit contributions
 habitats+0.168
 perspective+0.159
stack+0.156
arians+0.156
ric+0.155
Neg logit contributions
 better-0.030
 to-0.027
 closer-0.025
 again-0.024
's-0.021
Token big
Token ID1263
Feature activation+1.84e+00

Pos logit contributions
 better+0.080
 closer+0.069
 to+0.067
 said+0.065
 bigger+0.065
Neg logit contributions
 habitats-0.203
stack-0.187
arians-0.185
ric-0.184
 perspective-0.184
Token box
Token ID3091
Feature activation-2.07e+00

Pos logit contributions
 better+0.041
 again+0.037
 to+0.036
 closer+0.036
 said+0.035
Neg logit contributions
 habitats-0.091
arians-0.083
ric-0.083
years-0.082
 perspective-0.082
Token of
Token ID286
Feature activation-3.28e+00

Pos logit contributions
 habitats+0.102
 perspective+0.097
stack+0.097
arians+0.093
 meanings+0.093
Neg logit contributions
 better-0.038
 again-0.035
 to-0.035
 said-0.034
's-0.034
Token ch
Token ID442
Feature activation-2.01e+00

Pos logit contributions
 habitats+0.158
stack+0.153
ric+0.152
arians+0.151
 perspective+0.145
Neg logit contributions
 better-0.081
 said-0.070
 to-0.069
 closer-0.068
 say-0.065
Tokenocol
Token ID4668
Feature activation-1.35e+01

Pos logit contributions
stack+0.040
 habitats+0.036
 perspective+0.031
arians+0.030
 meanings+0.026
Neg logit contributions
 better-0.098
 to-0.096
 again-0.095
 closer-0.095
's-0.093
Tokenates
Token ID689
Feature activation-2.89e+00

Pos logit contributions
 habitats+0.239
stack+0.236
 perspective+0.231
arians+0.188
 meanings+0.185
Neg logit contributions
 to-0.619
 better-0.619
 again-0.613
's-0.611
 said-0.604
Token!
Token ID0
Feature activation-8.16e-02

Pos logit contributions
 habitats+0.158
 perspective+0.148
 meanings+0.144
stack+0.144
arians+0.142
Neg logit contributions
 to-0.046
 again-0.044
 better-0.044
's-0.042
 said-0.041
Token
Token ID220
Feature activation-8.01e-01

Pos logit contributions
 habitats+0.004
 perspective+0.004
ric+0.004
stack+0.004
arians+0.004
Neg logit contributions
 to-0.002
's-0.002
 said-0.002
 better-0.002
 again-0.002
Token
Token ID198
Feature activation+2.95e+00

Pos logit contributions
stack+0.042
 habitats+0.041
 perspective+0.040
uff+0.039
ric+0.039
Neg logit contributions
 better-0.010
 closer-0.009
 again-0.008
 bite-0.008
 said-0.007
Token
Token ID198
Feature activation+2.85e+00

Pos logit contributions
 better+0.048
 sandwich+0.041
 worse+0.041
 closer+0.040
 again+0.040
Neg logit contributions
stack-0.137
 perspective-0.137
 habitats-0.135
years-0.123
uff-0.123
Token he
Token ID339
Feature activation+1.93e+00

Pos logit contributions
 better+0.030
 again+0.021
 to+0.021
 said+0.021
's+0.018
Neg logit contributions
 habitats-0.284
stack-0.276
 perspective-0.275
arians-0.268
 meanings-0.264
Token ate
Token ID15063
Feature activation+1.80e+00

Pos logit contributions
 said+0.037
 better+0.036
 to+0.034
's+0.034
 again+0.033
Neg logit contributions
 habitats-0.094
 perspective-0.093
stack-0.092
ric-0.087
arians-0.086
Token all
Token ID477
Feature activation-4.57e+00

Pos logit contributions
 better+0.025
 again+0.022
 to+0.021
 closer+0.020
 said+0.020
Neg logit contributions
 habitats-0.097
stack-0.094
 perspective-0.094
arians-0.091
 meanings-0.089
Token the
Token ID262
Feature activation-1.94e-02

Pos logit contributions
 habitats+0.229
stack+0.225
 perspective+0.223
ric+0.209
 meanings+0.209
Neg logit contributions
 better-0.070
 again-0.061
 closer-0.060
 said-0.060
's-0.057
Token ch
Token ID442
Feature activation-1.83e+00

Pos logit contributions
 habitats+0.001
stack+0.001
 perspective+0.001
arians+0.001
ric+0.001
Neg logit contributions
 better-0.000
 said-0.000
 again-0.000
 to-0.000
's-0.000
Tokenocol
Token ID4668
Feature activation-1.35e+01

Pos logit contributions
stack+0.039
 perspective+0.033
 habitats+0.032
arians+0.029
 meanings+0.026
Neg logit contributions
 better-0.083
 to-0.082
 said-0.081
 again-0.081
 bite-0.080
Tokenates
Token ID689
Feature activation-3.25e+00

Pos logit contributions
 perspective+0.250
stack+0.243
 habitats+0.242
 meanings+0.204
arians+0.199
Neg logit contributions
 again-0.605
 to-0.605
 better-0.594
 said-0.584
's-0.581
Token and
Token ID290
Feature activation+2.60e+00

Pos logit contributions
 habitats+0.170
 perspective+0.163
stack+0.163
arians+0.156
 meanings+0.156
Neg logit contributions
 better-0.050
 again-0.046
 to-0.045
 said-0.043
 closer-0.043
Token sweets
Token ID42402
Feature activation-4.61e+00

Pos logit contributions
 said+0.046
 better+0.044
 to+0.042
 again+0.041
's+0.038
Neg logit contributions
 perspective-0.132
 habitats-0.132
stack-0.128
arians-0.126
uff-0.124
Token.
Token ID13
Feature activation-7.57e-01

Pos logit contributions
 habitats+0.253
 perspective+0.242
stack+0.237
 meanings+0.233
ric+0.232
Neg logit contributions
 better-0.067
 again-0.064
 to-0.063
 said-0.059
's-0.055
Token Then
Token ID3244
Feature activation-3.27e+00

Pos logit contributions
 perspective+0.038
 habitats+0.038
arians+0.036
stack+0.036
years+0.036
Neg logit contributions
 to-0.013
-0.012
 better-0.012
 said-0.012
 again-0.011
Token box
Token ID3091
Feature activation-1.99e+00

Pos logit contributions
 better+0.004
 to+0.003
's+0.003
 again+0.003
 closer+0.003
Neg logit contributions
 habitats-0.006
stack-0.006
 perspective-0.006
arians-0.006
years-0.006
Token and
Token ID290
Feature activation+2.77e+00

Pos logit contributions
stack+0.098
 habitats+0.097
 perspective+0.096
 meanings+0.090
arians+0.089
Neg logit contributions
 better-0.030
 said-0.024
 closer-0.023
 say-0.023
 again-0.023
Token share
Token ID2648
Feature activation+4.40e-01

Pos logit contributions
 better+0.037
 again+0.033
 said+0.031
 to+0.030
 suggested+0.030
Neg logit contributions
stack-0.138
 habitats-0.135
 perspective-0.129
 meanings-0.127
arians-0.126
Token the
Token ID262
Feature activation+2.10e-01

Pos logit contributions
 better+0.007
 said+0.006
 closer+0.006
 again+0.006
's+0.006
Neg logit contributions
 habitats-0.022
stack-0.021
 perspective-0.021
arians-0.020
 meanings-0.020
Token ch
Token ID442
Feature activation-1.95e+00

Pos logit contributions
 better+0.006
 again+0.005
 said+0.005
 closer+0.005
's+0.005
Neg logit contributions
 habitats-0.009
stack-0.008
 perspective-0.008
arians-0.008
years-0.008
Tokenocol
Token ID4668
Feature activation-1.35e+01

Pos logit contributions
stack+0.035
 habitats+0.033
 perspective+0.030
arians+0.026
years+0.025
Neg logit contributions
 better-0.093
 to-0.091
 again-0.090
's-0.089
 closer-0.088
Tokenates
Token ID689
Feature activation-2.73e+00

Pos logit contributions
 perspective+0.228
 habitats+0.226
stack+0.222
 meanings+0.178
arians+0.177
Neg logit contributions
 better-0.634
 to-0.627
 again-0.623
's-0.622
 said-0.610
Token.
Token ID13
Feature activation-5.35e-01

Pos logit contributions
 habitats+0.132
stack+0.132
 perspective+0.129
arians+0.124
ric+0.122
Neg logit contributions
 better-0.043
 said-0.037
 closer-0.036
 again-0.035
 to-0.034
Token They
Token ID1119
Feature activation-6.14e-01

Pos logit contributions
 habitats+0.020
stack+0.019
 perspective+0.018
 meanings+0.017
arians+0.017
Neg logit contributions
 better-0.016
 again-0.015
's-0.015
 closer-0.014
 said-0.014
Token are
Token ID389
Feature activation-4.69e+00

Pos logit contributions
 habitats+0.033
 perspective+0.032
stack+0.031
arians+0.030
years+0.030
Neg logit contributions
 better-0.009
 said-0.008
 say-0.008
's-0.008
 closer-0.008
Token happy
Token ID3772
Feature activation+1.59e+00

Pos logit contributions
 habitats+0.247
 perspective+0.241
stack+0.241
arians+0.230
 meanings+0.225
Neg logit contributions
 better-0.074
 said-0.062
 to-0.061
 again-0.061
 closer-0.059
Token time
Token ID640
Feature activation+2.50e-01

Pos logit contributions
 habitats+0.058
stack+0.057
 perspective+0.054
arians+0.054
years+0.053
Neg logit contributions
 better-0.029
 again-0.027
 to-0.027
 said-0.026
 closer-0.026
Token playing
Token ID2712
Feature activation-2.25e-01

Pos logit contributions
 to+0.005
 again+0.005
 better+0.004
 said+0.004
's+0.004
Neg logit contributions
 habitats-0.013
stack-0.012
 meanings-0.012
arians-0.012
 perspective-0.012
Token with
Token ID351
Feature activation-9.33e-01

Pos logit contributions
 habitats+0.012
stack+0.012
 perspective+0.012
 meanings+0.011
arians+0.011
Neg logit contributions
 better-0.003
 again-0.003
 to-0.003
 closer-0.002
 said-0.002
Token her
Token ID607
Feature activation-2.08e+00

Pos logit contributions
 habitats+0.050
stack+0.050
 perspective+0.048
arians+0.048
ric+0.047
Neg logit contributions
 better-0.013
 to-0.012
 again-0.012
 said-0.011
 closer-0.011
Token or
Token ID393
Feature activation+3.90e-01

Pos logit contributions
 habitats+0.096
stack+0.092
 perspective+0.089
arians+0.089
years+0.088
Neg logit contributions
 better-0.047
 to-0.046
 again-0.045
 said-0.043
 closer-0.041
Tokennam
Token ID7402
Feature activation-1.35e+01

Pos logit contributions
 better+0.009
 to+0.009
 again+0.008
 said+0.008
 closer+0.008
Neg logit contributions
 habitats-0.018
stack-0.017
 perspective-0.017
arians-0.017
uff-0.016
Tokenents
Token ID658
Feature activation-9.54e-01

Pos logit contributions
 habitats+0.497
stack+0.486
 perspective+0.472
arians+0.445
 meanings+0.439
Neg logit contributions
 better-0.395
 again-0.370
 said-0.367
 to-0.367
's-0.364
Token and
Token ID290
Feature activation+2.87e+00

Pos logit contributions
 habitats+0.052
 perspective+0.049
stack+0.049
arians+0.048
 meanings+0.048
Neg logit contributions
 better-0.013
 again-0.012
 to-0.012
 said-0.011
's-0.011
Token making
Token ID1642
Feature activation+1.80e+00

Pos logit contributions
 said+0.053
 better+0.052
 to+0.050
 again+0.046
 they+0.046
Neg logit contributions
 habitats-0.144
stack-0.141
 perspective-0.141
ric-0.138
uff-0.133
Token her
Token ID607
Feature activation-1.97e+00

Pos logit contributions
 better+0.032
 again+0.028
 to+0.028
 said+0.028
's+0.027
Neg logit contributions
 habitats-0.089
stack-0.088
 perspective-0.086
arians-0.082
ric-0.081
Token special
Token ID2041
Feature activation+8.65e-01

Pos logit contributions
 habitats+0.095
stack+0.093
 perspective+0.092
arians+0.088
 meanings+0.087
Neg logit contributions
 better-0.036
 again-0.032
 said-0.032
 to-0.031
 closer-0.031
Token was
Token ID373
Feature activation-3.00e+00

Pos logit contributions
 habitats+0.186
stack+0.183
 perspective+0.180
arians+0.172
 meanings+0.171
Neg logit contributions
 better-0.054
 again-0.046
 said-0.045
 closer-0.045
 to-0.045
Token a
Token ID257
Feature activation+3.93e+00

Pos logit contributions
 habitats+0.185
 perspective+0.181
stack+0.181
arians+0.178
ric+0.172
Neg logit contributions
 better-0.022
 to-0.018
 said-0.016
 again-0.015
 closer-0.013
Token box
Token ID3091
Feature activation-1.95e+00

Pos logit contributions
 better+0.078
 closer+0.066
 to+0.063
 said+0.062
 bigger+0.061
Neg logit contributions
 habitats-0.208
stack-0.193
arians-0.189
years-0.187
 perspective-0.184
Token of
Token ID286
Feature activation-3.02e+00

Pos logit contributions
 habitats+0.098
stack+0.096
 perspective+0.095
arians+0.090
 meanings+0.089
Neg logit contributions
 better-0.031
 again-0.027
 said-0.026
 closer-0.026
 to-0.026
Token ch
Token ID442
Feature activation-1.86e+00

Pos logit contributions
 habitats+0.144
stack+0.142
arians+0.137
 perspective+0.134
ric+0.133
Neg logit contributions
 better-0.072
 said-0.066
 to-0.063
 closer-0.061
 say-0.061
Tokenocol
Token ID4668
Feature activation-1.35e+01

Pos logit contributions
stack+0.046
 habitats+0.041
 perspective+0.036
arians+0.036
 meanings+0.032
Neg logit contributions
 better-0.082
 closer-0.081
 to-0.081
 again-0.079
 bigger-0.078
Tokenates
Token ID689
Feature activation-2.76e+00

Pos logit contributions
 habitats+0.256
stack+0.255
 perspective+0.246
 meanings+0.207
arians+0.203
Neg logit contributions
 to-0.603
 again-0.595
 better-0.592
's-0.591
 said-0.580
Token.
Token ID13
Feature activation-7.66e-01

Pos logit contributions
 habitats+0.153
 perspective+0.145
stack+0.143
 meanings+0.141
arians+0.139
Neg logit contributions
 better-0.039
 to-0.038
 again-0.038
's-0.036
 said-0.036
Token "
Token ID366
Feature activation-7.97e+00

Pos logit contributions
 habitats+0.039
 perspective+0.038
stack+0.038
arians+0.036
 meanings+0.036
Neg logit contributions
 better-0.012
 said-0.011
 to-0.011
 again-0.011
's-0.011
TokenFor
Token ID1890
Feature activation-6.44e+00

Pos logit contributions
 habitats+0.327
stack+0.323
 perspective+0.308
arians+0.300
 meanings+0.294
Neg logit contributions
 better-0.206
 closer-0.193
 again-0.188
 said-0.187
's-0.187
Token me
Token ID502
Feature activation-9.22e-01

Pos logit contributions
stack+0.342
 habitats+0.338
 perspective+0.336
arians+0.320
ric+0.309
Neg logit contributions
 better-0.105
 to-0.098
 again-0.091
 come-0.091
 said-0.089
Token and
Token ID290
Feature activation+2.54e+00

Pos logit contributions
 better+0.031
 closer+0.026
 said+0.025
 again+0.025
's+0.024
Neg logit contributions
 habitats-0.110
stack-0.109
 perspective-0.108
arians-0.103
 meanings-0.102
Token they
Token ID484
Feature activation+4.10e+00

Pos logit contributions
 better+0.024
 again+0.020
 said+0.020
 to+0.019
 closer+0.018
Neg logit contributions
 habitats-0.145
stack-0.142
 perspective-0.141
arians-0.135
 meanings-0.134
Token shared
Token ID4888
Feature activation+7.46e-01

Pos logit contributions
 said+0.086
 better+0.085
 again+0.079
's+0.079
 to+0.076
Neg logit contributions
 habitats-0.200
 perspective-0.194
stack-0.188
arians-0.184
ric-0.183
Token the
Token ID262
Feature activation-2.36e-01

Pos logit contributions
 better+0.010
 again+0.009
 said+0.009
 to+0.009
 closer+0.008
Neg logit contributions
 habitats-0.040
stack-0.039
 perspective-0.038
arians-0.037
 meanings-0.036
Token ch
Token ID442
Feature activation-2.05e+00

Pos logit contributions
 habitats+0.009
stack+0.009
 perspective+0.009
arians+0.009
 meanings+0.009
Neg logit contributions
 better-0.006
 again-0.006
 said-0.006
 closer-0.006
 to-0.006
Tokenocol
Token ID4668
Feature activation-1.35e+01

Pos logit contributions
 habitats+0.034
stack+0.034
 perspective+0.031
arians+0.026
 meanings+0.025
Neg logit contributions
 better-0.103
 again-0.099
 to-0.099
 said-0.097
 closer-0.097
Tokenates
Token ID689
Feature activation-2.83e+00

Pos logit contributions
 perspective+0.245
 habitats+0.243
stack+0.235
arians+0.199
 meanings+0.193
Neg logit contributions
 better-0.617
 to-0.613
 again-0.611
 said-0.602
's-0.597
Token with
Token ID351
Feature activation-1.39e+00

Pos logit contributions
 habitats+0.144
stack+0.142
 perspective+0.140
arians+0.134
 meanings+0.132
Neg logit contributions
 better-0.042
 again-0.037
 said-0.036
 closer-0.036
 to-0.035
Token everyone
Token ID2506
Feature activation-1.79e+00

Pos logit contributions
 habitats+0.071
stack+0.070
 perspective+0.070
arians+0.067
ric+0.066
Neg logit contributions
 better-0.023
 again-0.021
 said-0.021
 to-0.021
's-0.020
Token in
Token ID287
Feature activation-4.58e+00

Pos logit contributions
 habitats+0.096
stack+0.094
 perspective+0.093
arians+0.089
 meanings+0.089
Neg logit contributions
 better-0.024
 again-0.021
 said-0.020
 to-0.020
's-0.020
Token the
Token ID262
Feature activation-2.15e-01

Pos logit contributions
 habitats+0.277
stack+0.274
arians+0.270
opes+0.262
 perspective+0.262
Neg logit contributions
 to-0.037
 better-0.035
 again-0.032
 said-0.030
 they-0.029
Token she
Token ID673
Feature activation+1.01e+00

Pos logit contributions
 better+0.017
 to+0.016
 again+0.016
's+0.015
 said+0.015
Neg logit contributions
stack-0.068
 habitats-0.068
 perspective-0.065
ric-0.064
arians-0.063
Token had
Token ID550
Feature activation-9.02e-01

Pos logit contributions
 better+0.017
 said+0.015
 again+0.015
's+0.015
 to+0.014
Neg logit contributions
 habitats-0.052
 perspective-0.051
stack-0.050
ric-0.048
arians-0.047
Token with
Token ID351
Feature activation-1.38e+00

Pos logit contributions
stack+0.045
 habitats+0.045
 perspective+0.043
ric+0.042
arians+0.042
Neg logit contributions
 better-0.016
 to-0.015
 again-0.015
 said-0.015
's-0.014
Token her
Token ID607
Feature activation-2.20e+00

Pos logit contributions
stack+0.079
 habitats+0.077
 perspective+0.076
arians+0.074
ric+0.074
Neg logit contributions
 better-0.019
 to-0.018
 again-0.016
 said-0.015
 some-0.014
Token or
Token ID393
Feature activation-7.33e-02

Pos logit contributions
 habitats+0.107
stack+0.105
ric+0.102
 perspective+0.100
years+0.099
Neg logit contributions
 to-0.047
 better-0.046
 again-0.043
 said-0.039
's-0.038
Tokennam
Token ID7402
Feature activation-1.35e+01

Pos logit contributions
stack+0.004
 habitats+0.004
 perspective+0.003
arians+0.003
uff+0.003
Neg logit contributions
 better-0.001
 to-0.001
 said-0.001
 again-0.001
's-0.001
Tokenents
Token ID658
Feature activation-1.43e+00

Pos logit contributions
 habitats+0.517
stack+0.512
 perspective+0.486
arians+0.463
 meanings+0.456
Neg logit contributions
 better-0.374
 to-0.353
 again-0.353
's-0.351
 said-0.347
Token.
Token ID13
Feature activation-6.54e-01

Pos logit contributions
 habitats+0.078
stack+0.075
 perspective+0.075
arians+0.073
 meanings+0.073
Neg logit contributions
 better-0.019
 again-0.018
 to-0.018
 said-0.016
 closer-0.015
Token<|endoftext|>
Token ID50256
Feature activation-3.82e+00

Pos logit contributions
 habitats+0.040
 perspective+0.039
stack+0.039
arians+0.038
ric+0.037
Neg logit contributions
-0.004
 better-0.004
 said-0.003
 again-0.003
 to-0.003
TokenOnce
Token ID7454
Feature activation-4.20e+00

Pos logit contributions
 habitats+0.209
stack+0.206
 perspective+0.203
arians+0.195
 meanings+0.193
Neg logit contributions
 better-0.043
 again-0.036
 said-0.035
 closer-0.034
 to-0.034
Token upon
Token ID2402
Feature activation-8.37e+00

Pos logit contributions
 perspective+0.131
stack+0.124
 habitats+0.123
uff+0.117
ric+0.114
Neg logit contributions
 better-0.135
 shot-0.123
 kiss-0.121
 closer-0.121
 say-0.120
Token practiced
Token ID19893
Feature activation-9.83e-01

Pos logit contributions
 better+0.046
 again+0.042
 said+0.041
 to+0.040
's+0.039
Neg logit contributions
 habitats-0.125
stack-0.122
 perspective-0.120
arians-0.115
 meanings-0.114
Token and
Token ID290
Feature activation+2.55e+00

Pos logit contributions
 habitats+0.050
stack+0.048
 perspective+0.046
arians+0.045
 meanings+0.045
Neg logit contributions
 better-0.020
 again-0.019
 to-0.019
 closer-0.016
 they-0.016
Token soon
Token ID2582
Feature activation+1.33e+00

Pos logit contributions
 better+0.052
 again+0.048
 said+0.045
 to+0.045
's+0.044
Neg logit contributions
 habitats-0.123
stack-0.119
 perspective-0.117
arians-0.113
ric-0.112
Token he
Token ID339
Feature activation+1.92e+00

Pos logit contributions
 better+0.017
 closer+0.016
 say+0.015
 said+0.014
's+0.014
Neg logit contributions
 habitats-0.067
 perspective-0.066
stack-0.064
 meanings-0.061
arians-0.060
Token could
Token ID714
Feature activation-3.32e+00

Pos logit contributions
 better+0.034
 closer+0.031
 to+0.030
 again+0.030
 said+0.029
Neg logit contributions
 habitats-0.091
stack-0.088
 perspective-0.088
arians-0.084
 meanings-0.083
Token whistle
Token ID16121
Feature activation+4.92e+00

Pos logit contributions
stack+0.153
 habitats+0.153
 perspective+0.153
arians+0.144
ric+0.144
Neg logit contributions
 better-0.054
 said-0.051
 closer-0.048
's-0.047
 again-0.045
Token just
Token ID655
Feature activation-8.71e-01

Pos logit contributions
 better+0.083
 again+0.075
 to+0.071
 closer+0.067
's+0.066
Neg logit contributions
 habitats-0.253
stack-0.250
 perspective-0.243
arians-0.235
ric-0.232
Token like
Token ID588
Feature activation-3.29e+00

Pos logit contributions
 habitats+0.043
stack+0.042
 perspective+0.041
arians+0.041
years+0.040
Neg logit contributions
 better-0.017
 again-0.016
 to-0.016
 said-0.014
's-0.014
Token the
Token ID262
Feature activation+4.87e-02

Pos logit contributions
stack+0.185
 habitats+0.184
 perspective+0.181
arians+0.177
ric+0.172
Neg logit contributions
 better-0.040
 to-0.039
 again-0.035
 said-0.030
 they-0.029
Token bird
Token ID6512
Feature activation+8.59e-01

Pos logit contributions
 better+0.001
 again+0.001
 said+0.001
 closer+0.001
 to+0.001
Neg logit contributions
 habitats-0.002
stack-0.002
 perspective-0.002
arians-0.002
 meanings-0.002
Token.
Token ID13
Feature activation-5.25e-01

Pos logit contributions
 better+0.013
 again+0.011
's+0.011
 to+0.011
 said+0.011
Neg logit contributions
 habitats-0.045
stack-0.044
 perspective-0.044
arians-0.042
 meanings-0.042
Token a
Token ID257
Feature activation+3.98e+00

Pos logit contributions
 habitats+0.363
 perspective+0.359
stack+0.342
years+0.334
 meanings+0.331
Neg logit contributions
 better-0.133
 said-0.126
 again-0.125
 closer-0.123
 says-0.120
Token time
Token ID640
Feature activation-7.16e-01

Pos logit contributions
 better+0.103
 closer+0.097
 said+0.096
 again+0.096
 say+0.093
Neg logit contributions
 habitats-0.154
 perspective-0.150
uff-0.135
stack-0.135
 meanings-0.132
Token,
Token ID11
Feature activation-2.07e+00

Pos logit contributions
 habitats+0.033
stack+0.033
 perspective+0.033
arians+0.031
ric+0.031
Neg logit contributions
 better-0.013
 said-0.012
 again-0.012
 closer-0.011
's-0.011
Token there
Token ID612
Feature activation+2.55e+00

Pos logit contributions
 habitats+0.105
stack+0.104
 perspective+0.103
ric+0.097
arians+0.096
Neg logit contributions
 better-0.030
 again-0.025
 said-0.025
 to-0.024
 closer-0.024
Token was
Token ID373
Feature activation-2.96e+00

Pos logit contributions
 better+0.035
 again+0.031
 said+0.030
 to+0.030
's+0.030
Neg logit contributions
 habitats-0.135
stack-0.132
 perspective-0.130
arians-0.127
 meanings-0.125
Token a
Token ID257
Feature activation+3.96e+00

Pos logit contributions
 habitats+0.144
stack+0.140
 perspective+0.140
years+0.134
 meanings+0.131
Neg logit contributions
 better-0.043
 again-0.041
 said-0.038
 closer-0.038
 sigh-0.036
Token girl
Token ID2576
Feature activation-2.55e+00

Pos logit contributions
 better+0.077
 again+0.069
 said+0.068
 closer+0.067
 to+0.067
Neg logit contributions
 habitats-0.186
stack-0.182
 perspective-0.179
arians-0.171
 meanings-0.170
Token named
Token ID3706
Feature activation-2.21e+00

Pos logit contributions
 habitats+0.098
stack+0.095
 perspective+0.093
arians+0.089
 meanings+0.088
Neg logit contributions
 better-0.072
 to-0.068
 said-0.067
 again-0.067
's-0.067
Token Mary
Token ID5335
Feature activation+4.46e-01

Pos logit contributions
 habitats+0.112
stack+0.111
 perspective+0.109
 meanings+0.104
years+0.104
Neg logit contributions
 better-0.031
 again-0.026
 closer-0.026
 said-0.024
 shot-0.023
Token.
Token ID13
Feature activation-2.14e-01

Pos logit contributions
 better+0.008
 closer+0.007
 again+0.007
 said+0.006
 to+0.006
Neg logit contributions
 habitats-0.021
 perspective-0.021
stack-0.021
ric-0.019
 meanings-0.019
Token Mary
Token ID5335
Feature activation+4.35e-01

Pos logit contributions
 habitats+0.009
stack+0.008
 perspective+0.008
arians+0.008
ric+0.008
Neg logit contributions
 better-0.006
 said-0.005
 to-0.005
 again-0.005
's-0.005
Token whale
Token ID22206
Feature activation+1.54e+00

Pos logit contributions
 better+0.037
 said+0.034
 closer+0.034
's+0.034
 again+0.033
Neg logit contributions
 habitats-0.051
 perspective-0.050
stack-0.049
 meanings-0.044
arians-0.043
Token.
Token ID13
Feature activation-5.08e-01

Pos logit contributions
 better+0.027
 again+0.024
 to+0.024
's+0.023
 said+0.023
Neg logit contributions
 habitats-0.077
 perspective-0.074
stack-0.074
 meanings-0.071
arians-0.071
Token<|endoftext|>
Token ID50256
Feature activation-3.91e+00

Pos logit contributions
 habitats+0.029
 perspective+0.028
stack+0.028
arians+0.028
 meanings+0.027
Neg logit contributions
 better-0.005
 said-0.004
's-0.004
 again-0.004
 to-0.004
TokenOnce
Token ID7454
Feature activation-4.24e+00

Pos logit contributions
 habitats+0.191
stack+0.189
 perspective+0.185
arians+0.175
ric+0.174
Neg logit contributions
 better-0.064
 closer-0.056
 said-0.055
 again-0.055
 to-0.053
Token upon
Token ID2402
Feature activation-8.27e+00

Pos logit contributions
 perspective+0.132
stack+0.125
 habitats+0.124
uff+0.118
ric+0.116
Neg logit contributions
 better-0.137
 kiss-0.125
 shot-0.124
 sandwich-0.123
 say-0.121
Token a
Token ID257
Feature activation+4.11e+00

Pos logit contributions
 habitats+0.345
 perspective+0.339
stack+0.318
years+0.317
 meanings+0.313
Neg logit contributions
 better-0.144
 said-0.137
 again-0.136
 closer-0.133
 suggested-0.133
Token time
Token ID640
Feature activation-4.54e-01

Pos logit contributions
 better+0.105
 again+0.099
 closer+0.099
 said+0.097
 to+0.095
Neg logit contributions
 habitats-0.162
 perspective-0.157
stack-0.143
uff-0.142
 meanings-0.140
Token there
Token ID612
Feature activation+2.46e+00

Pos logit contributions
stack+0.020
 habitats+0.019
 perspective+0.019
ric+0.018
uff+0.018
Neg logit contributions
 better-0.010
 said-0.008
 again-0.008
 closer-0.008
 say-0.008
Token was
Token ID373
Feature activation-2.79e+00

Pos logit contributions
 better+0.040
 again+0.035
 closer+0.035
 to+0.034
 said+0.034
Neg logit contributions
 habitats-0.121
stack-0.120
 perspective-0.118
 meanings-0.111
arians-0.111
Token a
Token ID257
Feature activation+4.08e+00

Pos logit contributions
 habitats+0.142
stack+0.138
 perspective+0.137
years+0.131
uff+0.129
Neg logit contributions
 better-0.037
 again-0.036
 closer-0.032
 said-0.032
's-0.030
Token little
Token ID1310
Feature activation-3.32e+00

Pos logit contributions
 better+0.078
 again+0.068
 said+0.068
 closer+0.067
 to+0.067
Neg logit contributions
 habitats-0.196
stack-0.191
 perspective-0.188
arians-0.180
 meanings-0.179
Token back
Token ID736
Feature activation-2.45e-01

Pos logit contributions
 better+0.084
 said+0.079
 again+0.070
's+0.070
 say+0.069
Neg logit contributions
 habitats-0.373
 perspective-0.356
stack-0.353
 meanings-0.342
years-0.336
Token to
Token ID284
Feature activation+5.78e-01

Pos logit contributions
 habitats+0.015
stack+0.015
arians+0.015
 perspective+0.014
 meanings+0.014
Neg logit contributions
 to-0.002
 better-0.002
 again-0.002
 closer-0.001
 said-0.001
Token where
Token ID810
Feature activation+5.80e-01

Pos logit contributions
 better+0.009
 to+0.008
 said+0.008
 again+0.008
 say+0.008
Neg logit contributions
 habitats-0.030
 perspective-0.029
stack-0.029
arians-0.029
 meanings-0.028
Token she
Token ID673
Feature activation+1.47e+00

Pos logit contributions
 better+0.007
 again+0.006
 closer+0.006
 said+0.006
's+0.006
Neg logit contributions
 habitats-0.031
stack-0.030
 perspective-0.030
 meanings-0.028
arians-0.028
Token found
Token ID1043
Feature activation+2.41e+00

Pos logit contributions
 better+0.029
 closer+0.026
 again+0.025
 to+0.025
 said+0.023
Neg logit contributions
 habitats-0.067
stack-0.064
 perspective-0.063
arians-0.061
 meanings-0.061
Token it
Token ID340
Feature activation+7.26e+00

Pos logit contributions
 better+0.025
 again+0.021
 said+0.020
 to+0.020
 closer+0.020
Neg logit contributions
 habitats-0.135
stack-0.132
 perspective-0.131
arians-0.126
 meanings-0.124
Token.
Token ID13
Feature activation-3.80e-01

Pos logit contributions
 better+0.116
 again+0.101
 said+0.099
 closer+0.099
 to+0.098
Neg logit contributions
 habitats-0.363
stack-0.356
 perspective-0.351
arians-0.336
 meanings-0.332
Token G
Token ID402
Feature activation-2.80e+00

Pos logit contributions
 habitats+0.017
stack+0.017
 perspective+0.016
arians+0.016
 meanings+0.016
Neg logit contributions
 better-0.008
 to-0.008
 said-0.008
's-0.007
 again-0.007
Tokenilly
Token ID6548
Feature activation+1.16e+00

Pos logit contributions
 habitats+0.106
stack+0.104
 perspective+0.101
arians+0.095
 meanings+0.094
Neg logit contributions
 better-0.080
 again-0.075
 said-0.074
 to-0.074
 closer-0.073
Token smiled
Token ID13541
Feature activation+1.73e+00

Pos logit contributions
 better+0.018
 said+0.017
's+0.016
 again+0.016
 to+0.016
Neg logit contributions
 habitats-0.060
 perspective-0.058
stack-0.058
arians-0.055
ric-0.055
Token and
Token ID290
Feature activation+2.62e+00

Pos logit contributions
 better+0.023
 again+0.020
 to+0.019
 said+0.019
 closer+0.019
Neg logit contributions
 habitats-0.093
stack-0.091
 perspective-0.090
arians-0.086
 meanings-0.085
Token loud
Token ID7812
Feature activation+1.47e+00

Pos logit contributions
 better+0.024
 again+0.021
 said+0.021
 closer+0.021
's+0.020
Neg logit contributions
 habitats-0.059
stack-0.057
 perspective-0.057
 meanings-0.054
arians-0.054
Token noises
Token ID26782
Feature activation+1.53e+00

Pos logit contributions
 better+0.027
 again+0.025
 to+0.025
 said+0.024
's+0.024
Neg logit contributions
stack-0.071
 habitats-0.071
 perspective-0.070
arians-0.066
ric-0.065
Token,
Token ID11
Feature activation-1.49e+00

Pos logit contributions
 better+0.024
 said+0.021
 again+0.020
 closer+0.020
's+0.020
Neg logit contributions
 habitats-0.075
stack-0.074
 perspective-0.073
arians-0.069
 meanings-0.068
Token and
Token ID290
Feature activation+3.04e+00

Pos logit contributions
 habitats+0.072
stack+0.069
 perspective+0.067
 meanings+0.065
ric+0.065
Neg logit contributions
 better-0.026
 again-0.022
 said-0.021
 closer-0.021
's-0.021
Token she
Token ID673
Feature activation+1.78e+00

Pos logit contributions
 better+0.061
 again+0.056
 said+0.056
 to+0.055
 closer+0.055
Neg logit contributions
 habitats-0.141
stack-0.139
 perspective-0.137
arians-0.132
ric-0.129
Token smelled
Token ID37609
Feature activation+4.10e+00

Pos logit contributions
 better+0.024
 said+0.022
 again+0.021
 to+0.021
's+0.021
Neg logit contributions
 habitats-0.094
stack-0.094
 perspective-0.093
arians-0.089
 meanings-0.087
Token strange
Token ID6283
Feature activation+6.71e-01

Pos logit contributions
 better+0.062
 again+0.054
 said+0.053
 closer+0.052
 to+0.052
Neg logit contributions
 habitats-0.210
stack-0.205
 perspective-0.203
arians-0.194
 meanings-0.192
Token smells
Token ID25760
Feature activation+4.13e+00

Pos logit contributions
 better+0.008
 again+0.008
 to+0.008
's+0.007
 said+0.007
Neg logit contributions
stack-0.038
 habitats-0.037
 perspective-0.036
arians-0.036
years-0.035
Token.
Token ID13
Feature activation-3.16e-02

Pos logit contributions
 better+0.065
 said+0.058
 again+0.058
 closer+0.058
 to+0.056
Neg logit contributions
 habitats-0.200
stack-0.197
 perspective-0.194
arians-0.184
 meanings-0.184
Token She
Token ID1375
Feature activation-3.11e+00

Pos logit contributions
 habitats+0.001
stack+0.001
 perspective+0.001
arians+0.001
 meanings+0.001
Neg logit contributions
 better-0.001
 again-0.001
 said-0.001
 to-0.001
 closer-0.001
Token was
Token ID373
Feature activation-2.81e+00

Pos logit contributions
 habitats+0.154
stack+0.150
 perspective+0.148
arians+0.142
 meanings+0.140
Neg logit contributions
 better-0.053
 again-0.047
 to-0.046
 said-0.046
 closer-0.045
Token don
Token ID836
Feature activation-3.58e+00

Pos logit contributions
 habitats+0.075
stack+0.074
 perspective+0.072
years+0.069
arians+0.068
Neg logit contributions
 better-0.035
 said-0.034
 to-0.033
's-0.033
 again-0.031
Token't
Token ID470
Feature activation+3.58e+00

Pos logit contributions
stack+0.126
arians+0.123
 perspective+0.114
ric+0.113
 habitats+0.113
Neg logit contributions
 better-0.095
 closer-0.092
 again-0.091
 sandwich-0.088
 said-0.087
Token you
Token ID345
Feature activation+2.17e+00

Pos logit contributions
 better+0.084
 closer+0.077
 again+0.074
 said+0.071
 shot+0.069
Neg logit contributions
 perspective-0.144
stack-0.143
 habitats-0.141
arians-0.138
 meanings-0.133
Token dress
Token ID6576
Feature activation+1.31e+00

Pos logit contributions
 better+0.033
 said+0.029
 to+0.029
's+0.028
 say+0.028
Neg logit contributions
 habitats-0.115
stack-0.111
 perspective-0.110
arians-0.105
 meanings-0.104
Token up
Token ID510
Feature activation+3.67e+00

Pos logit contributions
 better+0.019
 said+0.018
 again+0.016
 suggested+0.016
 closer+0.016
Neg logit contributions
 habitats-0.060
stack-0.060
arians-0.059
ric-0.058
 perspective-0.058
Token the
Token ID262
Feature activation-5.19e-02

Pos logit contributions
 better+0.060
 again+0.054
 to+0.054
 said+0.051
 closer+0.051
Neg logit contributions
 habitats-0.188
stack-0.185
 perspective-0.181
arians-0.174
 meanings-0.173
Token dough
Token ID15756
Feature activation-7.00e-01

Pos logit contributions
 habitats+0.002
 perspective+0.002
stack+0.002
years+0.002
ric+0.002
Neg logit contributions
 better-0.001
 to-0.001
's-0.001
 again-0.001
 closer-0.001
Token?
Token ID30
Feature activation-2.66e-02

Pos logit contributions
 habitats+0.036
stack+0.035
 perspective+0.034
arians+0.033
 meanings+0.033
Neg logit contributions
 better-0.011
 again-0.010
 said-0.009
 to-0.009
's-0.009
Token You
Token ID921
Feature activation-1.42e+00

Pos logit contributions
 habitats+0.001
 perspective+0.001
stack+0.001
 meanings+0.001
arians+0.001
Neg logit contributions
 better-0.001
's-0.001
 said-0.001
 to-0.001
 again-0.001
Token can
Token ID460
Feature activation-2.17e+00

Pos logit contributions
 habitats+0.065
stack+0.063
 perspective+0.062
arians+0.058
 meanings+0.058
Neg logit contributions
 better-0.031
 said-0.028
 to-0.028
's-0.028
 again-0.028
Token make
Token ID787
Feature activation+3.14e+00

Pos logit contributions
 habitats+0.104
stack+0.103
 perspective+0.101
arians+0.097
 meanings+0.095
Neg logit contributions
 better-0.038
 again-0.034
 said-0.033
 closer-0.033
 to-0.033
Token go
Token ID467
Feature activation+1.90e+00

Pos logit contributions
 habitats+0.161
stack+0.158
 perspective+0.155
arians+0.149
 meanings+0.147
Neg logit contributions
 better-0.055
 again-0.049
 said-0.048
 closer-0.048
 to-0.047
Token back
Token ID736
Feature activation-2.04e-01

Pos logit contributions
 happens+0.039
 suggested+0.035
 said+0.034
 says+0.034
 sandwich+0.033
Neg logit contributions
 habitats-0.074
stack-0.073
 perspective-0.073
ric-0.072
ind-0.068
Token to
Token ID284
Feature activation+6.92e-01

Pos logit contributions
 habitats+0.011
 perspective+0.011
stack+0.011
arians+0.010
ric+0.010
Neg logit contributions
 better-0.002
 said-0.002
's-0.002
 closer-0.002
 say-0.002
Token the
Token ID262
Feature activation-2.70e-02

Pos logit contributions
 better+0.012
 again+0.011
 said+0.011
 to+0.010
 closer+0.010
Neg logit contributions
 habitats-0.034
stack-0.033
 perspective-0.033
arians-0.031
 meanings-0.031
Token park
Token ID3952
Feature activation-2.52e+00

Pos logit contributions
 habitats+0.001
stack+0.001
 perspective+0.001
arians+0.001
 meanings+0.001
Neg logit contributions
 better-0.001
 said-0.001
 again-0.001
 closer-0.001
 to-0.001
Token when
Token ID618
Feature activation-2.07e+00

Pos logit contributions
 habitats+0.134
stack+0.131
 perspective+0.129
arians+0.124
 meanings+0.122
Neg logit contributions
 better-0.032
 said-0.026
 closer-0.026
 again-0.026
's-0.025
Token the
Token ID262
Feature activation-3.73e-02

Pos logit contributions
 habitats+0.119
stack+0.117
 perspective+0.115
arians+0.112
 meanings+0.110
Neg logit contributions
 better-0.021
 again-0.018
 to-0.017
 said-0.016
 closer-0.016
Token wind
Token ID2344
Feature activation+2.57e+00

Pos logit contributions
 habitats+0.001
stack+0.001
 perspective+0.001
 meanings+0.001
arians+0.001
Neg logit contributions
 better-0.001
 again-0.001
 said-0.001
's-0.001
 to-0.001
Token was
Token ID373
Feature activation-2.91e+00

Pos logit contributions
 better+0.054
 closer+0.048
 to+0.046
 again+0.045
's+0.045
Neg logit contributions
 habitats-0.112
stack-0.109
 perspective-0.105
arians-0.101
ric-0.101
Token not
Token ID407
Feature activation+3.88e+00

Pos logit contributions
 habitats+0.123
stack+0.121
 perspective+0.120
ric+0.112
arians+0.111
Neg logit contributions
 better-0.062
's-0.059
 said-0.058
 again-0.058
 to-0.055
Token so
Token ID523
Feature activation-8.60e-01

Pos logit contributions
 better+0.084
 said+0.078
's+0.078
 again+0.078
 closer+0.077
Neg logit contributions
 habitats-0.163
 perspective-0.161
stack-0.160
arians-0.149
ric-0.149
Token were
Token ID547
Feature activation-4.70e+00

Pos logit contributions
 better+0.055
 again+0.050
 said+0.049
 to+0.049
 closer+0.047
Neg logit contributions
 habitats-0.141
stack-0.137
 perspective-0.135
arians-0.129
 meanings-0.127
Token flowing
Token ID17609
Feature activation+6.35e-01

Pos logit contributions
 habitats+0.238
 perspective+0.229
stack+0.226
arians+0.220
ric+0.208
Neg logit contributions
 better-0.097
 to-0.088
 said-0.088
 again-0.084
 closer-0.082
Token down
Token ID866
Feature activation-1.64e+00

Pos logit contributions
 better+0.009
 again+0.009
 to+0.008
's+0.007
 closer+0.007
Neg logit contributions
 habitats-0.036
stack-0.035
 perspective-0.034
ric-0.033
arians-0.033
Token her
Token ID607
Feature activation-1.61e+00

Pos logit contributions
 habitats+0.095
stack+0.092
 perspective+0.092
arians+0.091
ric+0.090
Neg logit contributions
 to-0.020
 again-0.019
 better-0.018
 said-0.015
 closer-0.014
Token cheeks
Token ID25839
Feature activation+6.67e-01

Pos logit contributions
 habitats+0.061
stack+0.059
 perspective+0.059
years+0.057
arians+0.056
Neg logit contributions
 better-0.046
 to-0.045
's-0.044
 again-0.044
 said-0.044
Token.
Token ID13
Feature activation-2.56e-01

Pos logit contributions
 better+0.009
 again+0.008
 to+0.008
 said+0.008
's+0.007
Neg logit contributions
 habitats-0.036
stack-0.035
 perspective-0.034
arians-0.033
 meanings-0.033
Token Her
Token ID2332
Feature activation-4.05e+00

Pos logit contributions
 habitats+0.014
 perspective+0.013
stack+0.013
arians+0.013
 meanings+0.013
Neg logit contributions
 better-0.003
 said-0.003
 to-0.003
 again-0.003
's-0.003
Token mom
Token ID1995
Feature activation+3.80e-01

Pos logit contributions
stack+0.215
 habitats+0.210
 perspective+0.209
arians+0.200
years+0.199
Neg logit contributions
 better-0.060
 said-0.058
 to-0.057
 again-0.056
's-0.054
Token was
Token ID373
Feature activation-2.74e+00

Pos logit contributions
 better+0.005
 said+0.005
 again+0.005
's+0.005
 to+0.005
Neg logit contributions
 habitats-0.020
stack-0.020
 perspective-0.020
arians-0.019
 meanings-0.018
Token worried
Token ID7960
Feature activation+1.59e+00

Pos logit contributions
 habitats+0.119
stack+0.117
 perspective+0.115
arians+0.109
 meanings+0.107
Neg logit contributions
 better-0.061
 again-0.056
 said-0.055
 closer-0.055
 to-0.055
Token and
Token ID290
Feature activation+3.17e+00

Pos logit contributions
 better+0.022
 again+0.019
 said+0.019
 to+0.019
 closer+0.018
Neg logit contributions
 habitats-0.083
stack-0.082
 perspective-0.081
arians-0.077
 meanings-0.077
Token bring
Token ID2222
Feature activation+4.24e+00

Pos logit contributions
 better+0.068
 to+0.061
 said+0.060
 again+0.056
 closer+0.055
Neg logit contributions
 habitats-0.198
 perspective-0.194
stack-0.192
arians-0.185
years-0.183
Token it
Token ID340
Feature activation+7.40e+00

Pos logit contributions
 better+0.073
 again+0.065
 said+0.065
 closer+0.064
's+0.062
Neg logit contributions
 habitats-0.205
stack-0.200
 perspective-0.197
arians-0.188
 meanings-0.188
Token back
Token ID736
Feature activation+1.34e-01

Pos logit contributions
 said+0.115
 better+0.112
 say+0.102
's+0.100
 suggested+0.100
Neg logit contributions
 habitats-0.349
 perspective-0.337
stack-0.336
arians-0.311
 meanings-0.309
Token."
Token ID526
Feature activation+1.12e+00

Pos logit contributions
 better+0.002
 to+0.001
 again+0.001
 closer+0.001
 said+0.001
Neg logit contributions
 habitats-0.007
stack-0.007
 perspective-0.007
arians-0.007
 meanings-0.007
Token
Token ID198
Feature activation+2.61e+00

Pos logit contributions
 better+0.010
 said+0.009
 again+0.009
 to+0.008
's+0.008
Neg logit contributions
 habitats-0.064
stack-0.062
 perspective-0.062
arians-0.060
 meanings-0.059
Token
Token ID198
Feature activation+2.49e+00

Pos logit contributions
 better+0.030
 again+0.025
 closer+0.024
's+0.023
 to+0.022
Neg logit contributions
 habitats-0.141
stack-0.139
 perspective-0.137
ric-0.130
arians-0.129
TokenThe
Token ID464
Feature activation-2.99e+00

Pos logit contributions
 better+0.049
 said+0.046
's+0.044
 again+0.044
 closer+0.044
Neg logit contributions
 habitats-0.116
stack-0.113
 perspective-0.110
 meanings-0.107
ric-0.106
Token man
Token ID582
Feature activation-5.38e-01

Pos logit contributions
 habitats+0.142
stack+0.137
 perspective+0.135
arians+0.132
ric+0.131
Neg logit contributions
 better-0.067
 closer-0.057
 said-0.055
 again-0.055
 bigger-0.054
Token nodded
Token ID14464
Feature activation+2.09e+00

Pos logit contributions
 habitats+0.027
stack+0.026
 perspective+0.026
arians+0.025
 meanings+0.025
Neg logit contributions
 better-0.009
 again-0.007
 closer-0.007
 to-0.007
 said-0.007
Token.
Token ID13
Feature activation-4.92e-01

Pos logit contributions
 better+0.031
 said+0.026
's+0.025
 closer+0.025
 again+0.025
Neg logit contributions
 habitats-0.105
stack-0.103
 perspective-0.102
arians-0.096
 meanings-0.096
Token\n
Token ID198
Feature activation+2.63e+00

Pos logit contributions
 habitats+0.024
 perspective+0.024
stack+0.024
arians+0.023
ric+0.023
Neg logit contributions
 better-0.008
 to-0.008
 said-0.008
 again-0.008
-0.008
Token could
Token ID714
Feature activation-3.17e+00

Pos logit contributions
 better+0.040
 again+0.036
 said+0.035
 to+0.035
 closer+0.035
Neg logit contributions
 habitats-0.102
stack-0.100
 perspective-0.098
arians-0.094
 meanings-0.093
Token open
Token ID1280
Feature activation+5.72e-01

Pos logit contributions
stack+0.156
 habitats+0.154
 perspective+0.150
arians+0.146
 meanings+0.143
Neg logit contributions
 better-0.051
 again-0.045
 said-0.044
 closer-0.043
's-0.042
Token the
Token ID262
Feature activation+1.98e-02

Pos logit contributions
 again+0.005
 better+0.005
 to+0.005
 closer+0.004
 said+0.003
Neg logit contributions
 habitats-0.035
stack-0.033
 perspective-0.033
arians-0.033
 meanings-0.033
Token gate
Token ID8946
Feature activation-1.26e+00

Pos logit contributions
 better+0.001
 to+0.001
 closer+0.001
's+0.001
 again+0.001
Neg logit contributions
 habitats-0.001
stack-0.001
 perspective-0.001
years-0.001
ric-0.001
Token with
Token ID351
Feature activation-1.21e+00

Pos logit contributions
 habitats+0.069
stack+0.066
 perspective+0.066
arians+0.063
 meanings+0.063
Neg logit contributions
 better-0.016
 again-0.015
 to-0.014
 closer-0.014
 said-0.014
Token his
Token ID465
Feature activation-1.98e+00

Pos logit contributions
 habitats+0.069
 perspective+0.068
stack+0.067
arians+0.066
ric+0.065
Neg logit contributions
 better-0.014
 to-0.012
 said-0.012
 closer-0.011
 again-0.011
Token big
Token ID1263
Feature activation+1.89e+00

Pos logit contributions
 habitats+0.087
stack+0.083
 perspective+0.082
years+0.080
arians+0.078
Neg logit contributions
 to-0.051
 better-0.049
 said-0.048
's-0.047
 closer-0.046
Token key
Token ID1994
Feature activation-1.51e+00

Pos logit contributions
 better+0.049
 to+0.048
 said+0.048
 closer+0.047
 again+0.046
Neg logit contributions
 habitats-0.078
stack-0.076
 perspective-0.072
ric-0.072
arians-0.071
Token.
Token ID13
Feature activation-2.85e-01

Pos logit contributions
 habitats+0.080
stack+0.076
 perspective+0.076
 meanings+0.073
arians+0.073
Neg logit contributions
 better-0.022
 to-0.022
 again-0.021
's-0.020
 said-0.020
Token As
Token ID1081
Feature activation-3.80e+00

Pos logit contributions
 habitats+0.012
stack+0.012
 perspective+0.011
years+0.011
 meanings+0.011
Neg logit contributions
 better-0.007
 closer-0.006
 bite-0.006
's-0.005
 again-0.005
Token he
Token ID339
Feature activation+1.97e+00

Pos logit contributions
 habitats+0.216
stack+0.212
 perspective+0.210
arians+0.202
 meanings+0.200
Neg logit contributions
 better-0.038
 again-0.031
 said-0.030
 to-0.029
 closer-0.029
Token he
Token ID339
Feature activation+1.86e+00

Pos logit contributions
 habitats+0.076
stack+0.074
 perspective+0.073
arians+0.073
ric+0.070
Neg logit contributions
 better-0.010
 they-0.009
 said-0.008
 to-0.008
 he-0.007
Token was
Token ID373
Feature activation-2.96e+00

Pos logit contributions
 better+0.035
 said+0.035
 to+0.033
's+0.032
 again+0.031
Neg logit contributions
 habitats-0.090
 perspective-0.090
stack-0.088
ric-0.084
arians-0.083
Token always
Token ID1464
Feature activation-3.54e+00

Pos logit contributions
 habitats+0.137
 perspective+0.132
stack+0.132
arians+0.126
 meanings+0.123
Neg logit contributions
 better-0.071
 to-0.066
 said-0.062
 closer-0.060
 again-0.060
Token busy
Token ID8179
Feature activation+4.78e-01

Pos logit contributions
 habitats+0.177
stack+0.172
 perspective+0.168
arians+0.162
 meanings+0.160
Neg logit contributions
 better-0.073
 to-0.066
 again-0.062
 said-0.062
 closer-0.062
Token.
Token ID13
Feature activation-1.58e-01

Pos logit contributions
 to+0.010
 better+0.009
 again+0.009
's+0.008
 said+0.007
Neg logit contributions
stack-0.025
 habitats-0.024
 perspective-0.023
arians-0.022
 meanings-0.022
Token One
Token ID1881
Feature activation-4.41e+00

Pos logit contributions
 habitats+0.009
stack+0.009
 perspective+0.009
arians+0.009
years+0.009
Neg logit contributions
 better-0.001
 again-0.001
 to-0.001
 said-0.001
's-0.001
Token day
Token ID1110
Feature activation-1.94e-01

Pos logit contributions
 habitats+0.178
stack+0.173
 perspective+0.170
arians+0.162
 meanings+0.159
Neg logit contributions
 better-0.118
 again-0.109
 said-0.108
 to-0.107
 closer-0.107
Token,
Token ID11
Feature activation-1.79e+00

Pos logit contributions
 habitats+0.011
stack+0.010
 perspective+0.010
arians+0.010
ric+0.010
Neg logit contributions
 better-0.003
 to-0.002
 again-0.002
 said-0.002
's-0.002
Token he
Token ID339
Feature activation+1.85e+00

Pos logit contributions
 habitats+0.110
 perspective+0.105
stack+0.105
arians+0.103
ric+0.102
Neg logit contributions
 better-0.015
 to-0.013
 said-0.012
 again-0.012
 they-0.011
Token decided
Token ID3066
Feature activation-1.09e-01

Pos logit contributions
 better+0.034
 closer+0.029
 again+0.029
 to+0.028
 said+0.027
Neg logit contributions
 habitats-0.088
stack-0.085
 perspective-0.084
arians-0.080
 meanings-0.079
Token to
Token ID284
Feature activation+4.79e-01

Pos logit contributions
 habitats+0.007
stack+0.006
 perspective+0.006
 meanings+0.006
arians+0.006
Neg logit contributions
 better-0.001
 said-0.000
 closer-0.000
 again-0.000
's-0.000
Token.
Token ID13
Feature activation-4.42e-01

Pos logit contributions
 better+0.022
 again+0.021
 to+0.020
 said+0.018
's+0.017
Neg logit contributions
 habitats-0.115
stack-0.115
 perspective-0.112
ric-0.109
 meanings-0.108
Token They
Token ID1119
Feature activation-8.89e-01

Pos logit contributions
 habitats+0.016
stack+0.015
 perspective+0.015
 meanings+0.014
years+0.013
Neg logit contributions
 better-0.013
 closer-0.013
 again-0.012
 bite-0.012
 say-0.012
Token lived
Token ID5615
Feature activation-8.70e+00

Pos logit contributions
 habitats+0.042
 perspective+0.041
stack+0.040
arians+0.039
ric+0.038
Neg logit contributions
 better-0.019
 said-0.018
's-0.017
 again-0.017
 to-0.016
Token happily
Token ID18177
Feature activation-4.35e+00

Pos logit contributions
 habitats+0.385
stack+0.375
 perspective+0.371
arians+0.353
 meanings+0.350
Neg logit contributions
 better-0.199
 again-0.183
 to-0.179
 said-0.177
 closer-0.177
Token ever
Token ID1683
Feature activation+4.75e-02

Pos logit contributions
 habitats+0.190
stack+0.186
 perspective+0.185
 meanings+0.174
arians+0.173
Neg logit contributions
 better-0.107
 to-0.103
 again-0.102
 said-0.099
's-0.099
Token after
Token ID706
Feature activation-5.71e+00

Pos logit contributions
 better+0.001
 again+0.001
 to+0.001
's+0.001
 said+0.001
Neg logit contributions
stack-0.002
 habitats-0.002
 perspective-0.002
arians-0.002
ric-0.002
Token.
Token ID13
Feature activation-4.77e-01

Pos logit contributions
 habitats+0.321
stack+0.314
 perspective+0.310
arians+0.300
 meanings+0.299
Neg logit contributions
 better-0.070
 again-0.067
 to-0.063
 said-0.058
 they-0.053
Token The
Token ID383
Feature activation-1.71e+00

Pos logit contributions
 habitats+0.023
stack+0.022
 perspective+0.022
 meanings+0.021
ric+0.021
Neg logit contributions
 closer-0.008
 better-0.008
 bigger-0.007
 shot-0.007
 stronger-0.006
Token end
Token ID886
Feature activation-4.35e+00

Pos logit contributions
 habitats+0.070
stack+0.069
 perspective+0.067
arians+0.065
ric+0.064
Neg logit contributions
 better-0.048
 again-0.043
 said-0.042
 to-0.041
's-0.041
Token.
Token ID13
Feature activation-5.72e-01

Pos logit contributions
 habitats+0.181
 perspective+0.175
stack+0.171
 meanings+0.166
ric+0.164
Neg logit contributions
 better-0.084
 closer-0.083
 said-0.078
 shot-0.077
 say-0.077
Token<|endoftext|>
Token ID50256
Feature activation-3.21e+00

Pos logit contributions
 habitats+0.025
stack+0.024
 perspective+0.024
ric+0.023
 meanings+0.022
Neg logit contributions
 closer-0.011
up-0.010
 better-0.010
 shot-0.009
 worse-0.009
Token so
Token ID523
Feature activation-1.10e+00

Pos logit contributions
 habitats+0.134
stack+0.130
 perspective+0.129
arians+0.123
 meanings+0.122
Neg logit contributions
 better-0.065
 again-0.059
 to-0.058
 said-0.058
 closer-0.058
Token glad
Token ID9675
Feature activation+9.90e-01

Pos logit contributions
 habitats+0.052
stack+0.050
 perspective+0.050
years+0.048
 meanings+0.047
Neg logit contributions
 better-0.020
 again-0.019
's-0.018
 to-0.017
 closer-0.017
Token she
Token ID673
Feature activation+1.34e+00

Pos logit contributions
 better+0.010
 again+0.008
 said+0.008
 to+0.008
 closer+0.008
Neg logit contributions
 habitats-0.056
stack-0.055
 perspective-0.054
arians-0.052
 meanings-0.052
Token could
Token ID714
Feature activation-3.37e+00

Pos logit contributions
 better+0.031
 again+0.027
 closer+0.027
 say+0.026
 to+0.026
Neg logit contributions
stack-0.054
 habitats-0.053
years-0.050
arians-0.050
 perspective-0.049
Token play
Token ID711
Feature activation+1.08e+00

Pos logit contributions
stack+0.155
 habitats+0.149
 perspective+0.147
ric+0.139
arians+0.138
Neg logit contributions
 better-0.063
 again-0.056
 said-0.056
's-0.056
 closer-0.055
Token in
Token ID287
Feature activation-4.51e+00

Pos logit contributions
 better+0.021
 said+0.021
's+0.019
 says+0.019
 suggested+0.019
Neg logit contributions
stack-0.046
 habitats-0.045
 perspective-0.045
years-0.041
arians-0.041
Token the
Token ID262
Feature activation-8.80e-02

Pos logit contributions
 habitats+0.250
stack+0.244
 perspective+0.240
arians+0.235
 meanings+0.233
Neg logit contributions
 better-0.055
 again-0.049
 to-0.046
 closer-0.044
 said-0.043
Token sand
Token ID6450
Feature activation-1.14e-02

Pos logit contributions
 habitats+0.003
stack+0.003
 perspective+0.003
ric+0.003
arians+0.003
Neg logit contributions
 better-0.003
 said-0.002
 again-0.002
 to-0.002
 closer-0.002
Token without
Token ID1231
Feature activation-2.43e+00

Pos logit contributions
 habitats+0.001
stack+0.001
 perspective+0.001
 meanings+0.001
arians+0.001
Neg logit contributions
 better-0.000
 again-0.000
 to-0.000
 said-0.000
 closer-0.000
Token worrying
Token ID18916
Feature activation+9.21e-01

Pos logit contributions
stack+0.105
 habitats+0.104
 perspective+0.098
arians+0.098
 meanings+0.094
Neg logit contributions
 better-0.064
 again-0.060
 to-0.058
 closer-0.057
 said-0.057
Token about
Token ID546
Feature activation-8.45e-01

Pos logit contributions
 better+0.013
 again+0.012
 to+0.012
's+0.011
 said+0.010
Neg logit contributions
 habitats-0.050
stack-0.050
 perspective-0.048
arians-0.047
years-0.046
Tokenapped
Token ID6320
Feature activation+3.56e+00

Pos logit contributions
 better+0.020
 again+0.019
 said+0.019
's+0.019
 to+0.019
Neg logit contributions
 habitats-0.016
stack-0.016
 perspective-0.015
arians-0.014
 meanings-0.014
Token her
Token ID607
Feature activation-1.78e+00

Pos logit contributions
 better+0.058
 said+0.051
 again+0.050
's+0.050
 closer+0.049
Neg logit contributions
 habitats-0.174
stack-0.168
 perspective-0.168
 meanings-0.159
arians-0.157
Token hands
Token ID2832
Feature activation+1.39e-01

Pos logit contributions
 habitats+0.050
stack+0.048
 perspective+0.047
 meanings+0.044
arians+0.043
Neg logit contributions
 better-0.068
 again-0.065
 said-0.064
 closer-0.063
 to-0.063
Token.
Token ID13
Feature activation-4.40e-01

Pos logit contributions
 better+0.002
 again+0.002
 said+0.002
 to+0.002
 closer+0.002
Neg logit contributions
 habitats-0.007
stack-0.007
 perspective-0.007
arians-0.007
 meanings-0.007
Token She
Token ID1375
Feature activation-3.41e+00

Pos logit contributions
 habitats+0.020
 perspective+0.020
stack+0.019
arians+0.019
years+0.018
Neg logit contributions
 to-0.010
 better-0.009
 said-0.009
-0.009
 they-0.009
Token loved
Token ID6151
Feature activation-4.23e+00

Pos logit contributions
 habitats+0.170
stack+0.167
 perspective+0.165
arians+0.157
 meanings+0.155
Neg logit contributions
 better-0.056
 said-0.050
 again-0.050
 to-0.048
 closer-0.048
Token stuffed
Token ID22259
Feature activation-5.01e+00

Pos logit contributions
stack+0.212
 perspective+0.211
 habitats+0.205
arians+0.202
uff+0.192
Neg logit contributions
 to-0.079
 better-0.073
 said-0.066
 again-0.066
 they-0.064
Token animals
Token ID4695
Feature activation-4.01e+00

Pos logit contributions
 perspective+0.179
stack+0.179
 habitats+0.177
ric+0.167
 meanings+0.162
Neg logit contributions
 again-0.152
 better-0.152
 to-0.150
 said-0.145
's-0.144
Token.
Token ID13
Feature activation-4.13e-01

Pos logit contributions
 habitats+0.234
stack+0.226
 perspective+0.226
arians+0.217
 meanings+0.217
Neg logit contributions
 better-0.048
 again-0.040
 to-0.040
's-0.038
 said-0.035
Token She
Token ID1375
Feature activation-3.37e+00

Pos logit contributions
 perspective+0.019
 habitats+0.019
stack+0.019
arians+0.017
ric+0.017
Neg logit contributions
 better-0.009
 to-0.009
 said-0.008
-0.008
 they-0.008
Token hugged
Token ID41130
Feature activation+2.78e+00

Pos logit contributions
 habitats+0.154
stack+0.151
 perspective+0.148
arians+0.142
 meanings+0.140
Neg logit contributions
 better-0.069
 again-0.062
 said-0.061
 closer-0.061
 to-0.061
Token children
Token ID1751
Feature activation-3.61e+00

Pos logit contributions
stack+0.114
 habitats+0.113
ric+0.111
 perspective+0.109
 meanings+0.107
Neg logit contributions
 again-0.021
 to-0.020
 better-0.018
's-0.012
 closer-0.010
Token.
Token ID13
Feature activation-3.80e-01

Pos logit contributions
 habitats+0.179
stack+0.175
 perspective+0.173
arians+0.165
 meanings+0.164
Neg logit contributions
 better-0.061
 again-0.055
 said-0.053
 to-0.053
's-0.052
Token<|endoftext|>
Token ID50256
Feature activation-3.83e+00

Pos logit contributions
 habitats+0.018
stack+0.017
 perspective+0.017
arians+0.016
 meanings+0.016
Neg logit contributions
 better-0.007
 again-0.007
 said-0.007
 closer-0.007
 to-0.006
TokenL
Token ID43
Feature activation-5.61e+00

Pos logit contributions
 habitats+0.147
stack+0.144
 perspective+0.141
arians+0.132
ric+0.131
Neg logit contributions
 better-0.102
 closer-0.095
 said-0.093
 again-0.092
 say-0.090
Tokenily
Token ID813
Feature activation+3.68e-01

Pos logit contributions
stack+0.189
 habitats+0.187
uff+0.181
 perspective+0.180
years+0.171
Neg logit contributions
 better-0.162
 shot-0.147
 closer-0.145
 stronger-0.143
 kiss-0.137
Token was
Token ID373
Feature activation-3.04e+00

Pos logit contributions
 better+0.005
 again+0.005
 said+0.005
 to+0.005
's+0.004
Neg logit contributions
 habitats-0.019
stack-0.019
 perspective-0.019
arians-0.018
 meanings-0.018
Token a
Token ID257
Feature activation+4.21e+00

Pos logit contributions
 habitats+0.123
stack+0.122
 perspective+0.116
years+0.115
uff+0.112
Neg logit contributions
 better-0.071
 again-0.069
 closer-0.065
 said-0.065
's-0.064
Token little
Token ID1310
Feature activation-3.59e+00

Pos logit contributions
 better+0.082
 to+0.072
 said+0.071
 closer+0.071
's+0.069
Neg logit contributions
 habitats-0.208
stack-0.203
 perspective-0.199
arians-0.192
 meanings-0.191
Token girl
Token ID2576
Feature activation-2.59e+00

Pos logit contributions
 habitats+0.181
stack+0.176
 perspective+0.168
 meanings+0.167
arians+0.162
Neg logit contributions
 better-0.076
 to-0.065
 closer-0.065
 again-0.065
 said-0.064
Token who
Token ID508
Feature activation-2.20e+00

Pos logit contributions
 habitats+0.141
stack+0.134
 perspective+0.133
 meanings+0.133
opes+0.128
Neg logit contributions
's-0.047
 to-0.046
 better-0.043
 again-0.043
 said-0.039
Token liked
Token ID8288
Feature activation-5.05e+00

Pos logit contributions
 perspective+0.095
stack+0.093
 habitats+0.089
arians+0.089
ric+0.083
Neg logit contributions
 better-0.060
 said-0.059
's-0.057
 to-0.054
 say-0.053
Token there
Token ID612
Feature activation+2.59e+00

Pos logit contributions
 habitats+0.029
stack+0.029
 perspective+0.029
arians+0.027
 meanings+0.027
Neg logit contributions
 better-0.011
 again-0.009
 said-0.009
 closer-0.009
 to-0.009
Token was
Token ID373
Feature activation-2.88e+00

Pos logit contributions
 better+0.032
 said+0.029
 again+0.028
's+0.028
 to+0.026
Neg logit contributions
 habitats-0.143
stack-0.138
 perspective-0.138
arians-0.135
 meanings-0.132
Token a
Token ID257
Feature activation+4.01e+00

Pos logit contributions
 habitats+0.147
 perspective+0.143
stack+0.142
years+0.136
 meanings+0.134
Neg logit contributions
 better-0.038
 again-0.035
 said-0.033
 closer-0.033
 to-0.031
Token rh
Token ID9529
Feature activation-5.75e+00

Pos logit contributions
 better+0.075
 said+0.066
 again+0.065
 closer+0.065
 to+0.064
Neg logit contributions
 habitats-0.195
stack-0.190
 perspective-0.187
arians-0.179
 meanings-0.179
Tokenin
Token ID259
Feature activation-1.18e+00

Pos logit contributions
 habitats+0.275
stack+0.267
 meanings+0.252
 perspective+0.252
arians+0.243
Neg logit contributions
 better-0.121
 again-0.109
 said-0.105
's-0.105
 closer-0.105
Tokenoc
Token ID420
Feature activation-1.17e+01

Pos logit contributions
 habitats+0.041
stack+0.039
 perspective+0.038
 meanings+0.038
arians+0.038
Neg logit contributions
 better-0.040
's-0.038
 said-0.038
 to-0.038
 again-0.037
Tokeneros
Token ID27498
Feature activation-1.60e+00

Pos logit contributions
 perspective+0.254
 habitats+0.254
stack+0.239
nam+0.228
ric+0.226
Neg logit contributions
 again-0.513
 to-0.507
 better-0.488
's-0.463
 said-0.460
Token who
Token ID508
Feature activation-2.28e+00

Pos logit contributions
 habitats+0.075
stack+0.073
 perspective+0.072
arians+0.071
 meanings+0.070
Neg logit contributions
 to-0.033
 better-0.032
 said-0.032
's-0.032
 again-0.031
Token wanted
Token ID2227
Feature activation-7.77e-01

Pos logit contributions
 perspective+0.082
stack+0.082
arians+0.082
 habitats+0.080
uff+0.076
Neg logit contributions
 better-0.075
 said-0.074
's-0.074
 to-0.070
 shot-0.067
Token to
Token ID284
Feature activation+4.76e-01

Pos logit contributions
 habitats+0.046
 perspective+0.044
stack+0.044
 meanings+0.042
ric+0.041
Neg logit contributions
 better-0.005
 said-0.004
 again-0.003
 say-0.003
 closer-0.003
Token rock
Token ID3881
Feature activation+4.04e-01

Pos logit contributions
 better+0.008
 again+0.008
 closer+0.008
 said+0.007
 to+0.007
Neg logit contributions
stack-0.022
 habitats-0.022
 perspective-0.022
 meanings-0.020
arians-0.020
Token ever
Token ID1683
Feature activation+1.11e-01

Pos logit contributions
 habitats+0.184
stack+0.180
 perspective+0.177
 meanings+0.169
ric+0.166
Neg logit contributions
 better-0.107
 again-0.105
 to-0.104
's-0.099
 said-0.099
Token after
Token ID706
Feature activation-5.65e+00

Pos logit contributions
 better+0.002
 again+0.002
 to+0.002
 said+0.002
's+0.002
Neg logit contributions
 habitats-0.005
stack-0.005
 perspective-0.005
arians-0.005
ric-0.005
Token.
Token ID13
Feature activation-3.09e-01

Pos logit contributions
 habitats+0.323
stack+0.316
 perspective+0.310
arians+0.301
 meanings+0.300
Neg logit contributions
 better-0.064
 again-0.062
 to-0.059
 said-0.053
's-0.048
Token<|endoftext|>
Token ID50256
Feature activation-3.30e+00

Pos logit contributions
 habitats+0.016
 meanings+0.015
stack+0.015
 perspective+0.015
ric+0.014
Neg logit contributions
 closer-0.004
 better-0.004
 again-0.004
 worse-0.004
 bigger-0.003
TokenOnce
Token ID7454
Feature activation-4.04e+00

Pos logit contributions
 habitats+0.144
stack+0.143
 perspective+0.141
ric+0.131
uff+0.126
Neg logit contributions
 better-0.056
 closer-0.055
 worse-0.050
 nicer-0.050
 say-0.050
Token upon
Token ID2402
Feature activation-8.06e+00

Pos logit contributions
 perspective+0.131
 habitats+0.121
uff+0.119
stack+0.119
ric+0.114
Neg logit contributions
 better-0.125
 stronger-0.113
 shot-0.112
 nicer-0.112
 closer-0.111
Token a
Token ID257
Feature activation+4.28e+00

Pos logit contributions
 habitats+0.305
 perspective+0.299
nam+0.282
ric+0.279
stack+0.279
Neg logit contributions
 closer-0.154
 better-0.149
 again-0.148
 suggested-0.147
 said-0.146
Token time
Token ID640
Feature activation-5.87e-01

Pos logit contributions
 better+0.124
 closer+0.122
 again+0.119
 bigger+0.114
 said+0.113
Neg logit contributions
 habitats-0.152
 perspective-0.145
uff-0.131
 years-0.127
 meanings-0.127
Token,
Token ID11
Feature activation-1.75e+00

Pos logit contributions
stack+0.024
 perspective+0.024
 habitats+0.024
uff+0.023
ric+0.022
Neg logit contributions
 better-0.013
 said-0.011
 closer-0.011
 say-0.011
 again-0.011
Token there
Token ID612
Feature activation+2.83e+00

Pos logit contributions
stack+0.075
 perspective+0.072
 habitats+0.071
uff+0.068
ric+0.067
Neg logit contributions
 better-0.034
 again-0.030
 say-0.030
 kiss-0.030
 said-0.029
Token was
Token ID373
Feature activation-2.70e+00

Pos logit contributions
 better+0.042
 again+0.037
 said+0.036
 to+0.036
 closer+0.035
Neg logit contributions
 habitats-0.146
stack-0.143
 perspective-0.141
arians-0.136
 meanings-0.134
Token very
Token ID845
Feature activation-3.89e+00

Pos logit contributions
 better+0.007
 to+0.006
 again+0.006
 said+0.006
's+0.006
Neg logit contributions
 habitats-0.016
stack-0.016
 perspective-0.015
 meanings-0.014
arians-0.014
Token helpful
Token ID7613
Feature activation+5.56e-01

Pos logit contributions
stack+0.198
 habitats+0.196
 perspective+0.193
ric+0.186
arians+0.183
Neg logit contributions
 better-0.079
 said-0.068
 to-0.068
 closer-0.065
 bigger-0.062
Token.
Token ID13
Feature activation-8.27e-01

Pos logit contributions
 to+0.006
 again+0.005
 better+0.004
's+0.003
 said+0.003
Neg logit contributions
ric-0.034
stack-0.034
 habitats-0.034
 perspective-0.033
 meanings-0.032
Token<|endoftext|>
Token ID50256
Feature activation-3.92e+00

Pos logit contributions
 habitats+0.043
stack+0.042
 perspective+0.041
arians+0.040
ric+0.039
Neg logit contributions
 better-0.013
 again-0.011
 said-0.011
 to-0.011
's-0.011
TokenOnce
Token ID7454
Feature activation-4.35e+00

Pos logit contributions
 habitats+0.181
stack+0.176
 perspective+0.174
arians+0.166
 meanings+0.165
Neg logit contributions
 better-0.080
 again-0.073
 said-0.071
 to-0.071
 closer-0.070
Token upon
Token ID2402
Feature activation-8.34e+00

Pos logit contributions
 perspective+0.133
stack+0.128
 habitats+0.127
uff+0.119
ric+0.118
Neg logit contributions
 better-0.144
 kiss-0.128
 shot-0.128
 closer-0.127
 stronger-0.127
Token a
Token ID257
Feature activation+3.91e+00

Pos logit contributions
 habitats+0.359
 perspective+0.352
stack+0.338
years+0.335
 meanings+0.330
Neg logit contributions
 better-0.136
 again-0.130
 closer-0.128
 said-0.128
 says-0.122
Token time
Token ID640
Feature activation-6.44e-01

Pos logit contributions
 better+0.101
 closer+0.095
 again+0.094
 said+0.093
 to+0.090
Neg logit contributions
 habitats-0.152
 perspective-0.147
uff-0.133
stack-0.133
 meanings-0.131
Token,
Token ID11
Feature activation-2.12e+00

Pos logit contributions
 habitats+0.029
stack+0.029
 perspective+0.029
ric+0.027
arians+0.027
Neg logit contributions
 better-0.013
 said-0.011
 again-0.011
 closer-0.011
's-0.010
Token there
Token ID612
Feature activation+2.44e+00

Pos logit contributions
 habitats+0.109
stack+0.109
 perspective+0.107
arians+0.101
ric+0.101
Neg logit contributions
 better-0.030
 again-0.025
 said-0.025
 to-0.024
 closer-0.024
Token were
Token ID547
Feature activation-5.14e+00

Pos logit contributions
 better+0.034
 again+0.030
 said+0.029
's+0.029
 to+0.029
Neg logit contributions
 habitats-0.129
stack-0.126
 perspective-0.124
arians-0.120
 meanings-0.119
Token my
Token ID616
Feature activation-3.31e+00

Pos logit contributions
 habitats+0.078
stack+0.077
 perspective+0.076
arians+0.072
 meanings+0.072
Neg logit contributions
 better-0.028
 again-0.025
 said-0.024
 to-0.024
 closer-0.024
Token love
Token ID1842
Feature activation-1.52e+00

Pos logit contributions
 habitats+0.153
stack+0.153
 perspective+0.144
arians+0.143
ric+0.142
Neg logit contributions
 better-0.076
 to-0.069
 again-0.068
 said-0.066
 closer-0.063
Token."
Token ID526
Feature activation+1.38e+00

Pos logit contributions
 habitats+0.084
stack+0.083
years+0.082
 perspective+0.082
ric+0.080
Neg logit contributions
 to-0.022
 again-0.019
 better-0.019
 said-0.016
's-0.015
Token<|endoftext|>
Token ID50256
Feature activation-3.94e+00

Pos logit contributions
 better+0.019
 closer+0.016
 again+0.016
 to+0.015
 said+0.015
Neg logit contributions
 habitats-0.072
stack-0.071
 perspective-0.070
arians-0.067
 meanings-0.066
TokenOnce
Token ID7454
Feature activation-4.19e+00

Pos logit contributions
 habitats+0.194
stack+0.193
 perspective+0.189
ric+0.179
arians+0.176
Neg logit contributions
 better-0.056
 closer-0.051
 say-0.046
 said-0.044
 again-0.043
Token upon
Token ID2402
Feature activation-8.27e+00

Pos logit contributions
 perspective+0.126
 habitats+0.118
stack+0.117
uff+0.112
ric+0.111
Neg logit contributions
 better-0.138
 shot-0.126
 kiss-0.125
 closer-0.124
 sandwich-0.124
Token a
Token ID257
Feature activation+4.04e+00

Pos logit contributions
 habitats+0.344
 perspective+0.335
years+0.318
 meanings+0.317
stack+0.313
Neg logit contributions
 better-0.144
 closer-0.141
 again-0.139
 said-0.135
 suggested-0.130
Token time
Token ID640
Feature activation-4.63e-01

Pos logit contributions
 better+0.110
 closer+0.108
 again+0.105
 said+0.101
 to+0.101
Neg logit contributions
 habitats-0.145
 perspective-0.141
uff-0.129
 years-0.124
 meanings-0.124
Token,
Token ID11
Feature activation-1.79e+00

Pos logit contributions
 perspective+0.019
stack+0.019
 habitats+0.019
uff+0.018
ric+0.018
Neg logit contributions
 better-0.010
 said-0.009
 closer-0.009
 say-0.009
 again-0.009
Token there
Token ID612
Feature activation+2.41e+00

Pos logit contributions
stack+0.089
 habitats+0.088
 perspective+0.087
ric+0.082
years+0.082
Neg logit contributions
 better-0.027
 said-0.023
 again-0.023
 to-0.022
 say-0.022
Token was
Token ID373
Feature activation-2.84e+00

Pos logit contributions
 better+0.036
 again+0.031
 said+0.031
 to+0.030
 closer+0.030
Neg logit contributions
 habitats-0.125
stack-0.122
 perspective-0.120
arians-0.115
 meanings-0.114
Token of
Token ID286
Feature activation-3.29e+00

Pos logit contributions
 habitats+0.118
stack+0.114
 perspective+0.114
ric+0.110
 meanings+0.109
Neg logit contributions
 better-0.022
 to-0.021
 again-0.020
 said-0.018
's-0.017
Token bird
Token ID6512
Feature activation+7.23e-01

Pos logit contributions
 habitats+0.154
arians+0.151
stack+0.149
 perspective+0.148
years+0.142
Neg logit contributions
 better-0.078
 to-0.072
 said-0.068
 say-0.064
 some-0.062
Token and
Token ID290
Feature activation+2.58e+00

Pos logit contributions
 better+0.011
 again+0.009
 said+0.009
 to+0.009
 closer+0.009
Neg logit contributions
 habitats-0.037
stack-0.036
 perspective-0.036
arians-0.034
 meanings-0.034
Token needed
Token ID2622
Feature activation-1.50e+00

Pos logit contributions
 better+0.036
 said+0.033
 to+0.032
 again+0.028
's+0.028
Neg logit contributions
 habitats-0.140
 perspective-0.139
stack-0.136
arians-0.132
ric-0.129
Token a
Token ID257
Feature activation+3.89e+00

Pos logit contributions
 habitats+0.085
stack+0.083
 perspective+0.082
arians+0.079
 meanings+0.079
Neg logit contributions
 better-0.015
 again-0.012
 said-0.011
 to-0.011
 closer-0.011
Token lot
Token ID1256
Feature activation-9.87e+00

Pos logit contributions
 better+0.077
 again+0.071
 said+0.071
's+0.068
 closer+0.068
Neg logit contributions
 habitats-0.176
stack-0.173
 perspective-0.172
arians-0.162
 meanings-0.160
Token of
Token ID286
Feature activation-3.28e+00

Pos logit contributions
 habitats+0.535
stack+0.522
 perspective+0.521
arians+0.496
 meanings+0.494
Neg logit contributions
 better-0.112
 again-0.095
 said-0.094
 closer-0.091
's-0.090
Token love
Token ID1842
Feature activation-1.69e+00

Pos logit contributions
stack+0.151
 habitats+0.151
 perspective+0.145
arians+0.141
ric+0.138
Neg logit contributions
 better-0.073
 to-0.065
 said-0.065
 again-0.062
 closer-0.061
Token and
Token ID290
Feature activation+2.56e+00

Pos logit contributions
stack+0.101
 habitats+0.101
years+0.096
 perspective+0.096
arians+0.095
Neg logit contributions
 to-0.019
 better-0.019
 again-0.016
 said-0.013
's-0.013
Token care
Token ID1337
Feature activation+3.30e+00

Pos logit contributions
 better+0.057
 to+0.054
 said+0.052
 again+0.049
's+0.048
Neg logit contributions
stack-0.118
 habitats-0.116
 perspective-0.115
arians-0.113
uff-0.110
Token.
Token ID13
Feature activation-4.12e-01

Pos logit contributions
 better+0.044
 to+0.038
 again+0.036
's+0.032
 said+0.032
Neg logit contributions
 habitats-0.186
stack-0.184
 perspective-0.178
arians-0.174
ric-0.169