TOP ACTIVATIONS
MAX = 0.089

to the pool . We want to stay here ."
the island . They just want the bird to be okay
play . They did not need a missile or a rich
. Everywhere it went it attracted lots of attention because of
not want squash . We want pizza . Pizza is y
up their things . They want to play more . They
Fin liked to dive deep into the water . One day
. A big rock was blocking the road , and no
very competitive . They always want to win .
They said they did not need to be wealthy to be
hurt yourself . And I need to go to the university
to the lake . They want to swim and play .
, " But I really want to play chess ."
and tape . And we need to be gentle ."
anyone . We do not need school ." Their
sand castle . We just need to add some water to
day , Tim my accidentally dropped his little metal car and
and cars back . They want their songs back .
scream and cry . They need help . But
You are children . You need school . You need to

MIN ACTIVATIONS
MIN = -0.142

from the tank . Hel io was thankful and thanked T
as Hel io . Hel io explained that their oxygen was
." The princess thanked the fairy again and
him to lock it . Gerry was excited and he carefully
said goodbye . The p anda was so happy that someone
I 'm lost ." The prince felt so sorry for the
a reward . The p anda refused and said , "
brown man e ." The princess was so happy and thanked
not very nice ." F ido felt very bad . He
wish !" The princess thought for a moment ,
the lovely adventure . The prince said that anytime the par
On Tuesday morning , Mar iah woke up early . She
snakes ." And the p anda waved goodbye with his paw
a little while . The princess said yes and gave the
's very mild ." The pig was relieved , he knew
way she could . Mar iah found many ways to help
and kiss . P up il played with his new friend
will hurt you !" The prince was scared but he was
for his daughter . The prince was sad and left the
him a big hug . Gerry was very happy . <|endoftext|>

INTERVAL 0.031 - 0.089
CONTAINS 17.588%

had no friends because he looked scary . The hipp o
to other kids who might want to play with him .
of the park . She found a nice lady who had
was wrong , but he did not know how to tell
used all his strength to lift it . With a loud

INTERVAL -0.027 - 0.031
CONTAINS 63.174%

food . He asked his friend , a mouse , if
, change him , and rock him . L
them . They closed the door and locked it . They
the doll in the street . She does not see the
were playing in the park . They saw a huge figure

INTERVAL -0.084 - -0.027
CONTAINS 16.785%

she finally found another cr umb . Once again
very excited and couldn 't believe his luck
She liked her dress . She thought it was beautiful .
, who loved fashion . She had lots of clothes and
to build her nest . She was very happy and grateful

INTERVAL -0.142 - -0.084
CONTAINS 2.453%

warned . L ila and Ben took a bite
the fox left , the princess noticed the bottle of poison
rope around the tree . John got very excited and jumped
met a rabbit . The rabbit was very hungry and asked
his very first race . Tim my was so glad that
Token to
Token ID284
Feature activation+5.04e-02

Pos logit contributions
wered+0.009
 something+0.008
 nature+0.008
 sunrise+0.008
 outer+0.007
Neg logit contributions
bo-0.023
ave-0.022
ience-0.021
mo-0.021
anda-0.021
Token the
Token ID262
Feature activation+3.43e-02

Pos logit contributions
wered+0.059
uttering+0.050
 sunrise+0.047
 soil+0.040
 view+0.039
Neg logit contributions
bo-0.114
 forgot-0.108
 rode-0.104
anda-0.099
 missed-0.098
Token pool
Token ID5933
Feature activation-4.58e-03

Pos logit contributions
wered+0.054
 outer+0.045
 nature+0.044
 sunrise+0.043
 an+0.042
Neg logit contributions
bo-0.065
 forgot-0.063
anda-0.062
ave-0.059
ience-0.058
Token.
Token ID13
Feature activation-6.82e-03

Pos logit contributions
bo+0.011
ave+0.011
ience+0.010
anda+0.010
mo+0.010
Neg logit contributions
wered-0.006
 nature-0.006
 something-0.006
 sunrise-0.006
 outer-0.006
Token We
Token ID775
Feature activation-4.60e-02

Pos logit contributions
bo+0.012
ave+0.012
ience+0.011
mo+0.011
anda+0.011
Neg logit contributions
wered-0.013
 something-0.013
 nature-0.013
 sunrise-0.013
 outer-0.012
Token want
Token ID765
Feature activation+8.87e-02

Pos logit contributions
bo+0.103
ience+0.103
ave+0.102
mo+0.100
jo+0.099
Neg logit contributions
 something-0.066
 a-0.063
wered-0.062
 nature-0.062
 an-0.062
Token to
Token ID284
Feature activation+2.26e-02

Pos logit contributions
wered+0.157
 outer+0.132
 view+0.125
 sunrise+0.122
 sight+0.121
Neg logit contributions
bo-0.164
o-0.151
ave-0.147
 forgot-0.145
 fet-0.143
Token stay
Token ID2652
Feature activation+1.99e-02

Pos logit contributions
wered+0.039
 outer+0.039
 sunrise+0.039
 sight+0.038
 something+0.038
Neg logit contributions
bo-0.043
 forgot-0.040
 rode-0.039
ave-0.038
o-0.036
Token here
Token ID994
Feature activation+1.31e-02

Pos logit contributions
wered+0.028
 sunrise+0.024
 outer+0.023
 nature+0.022
 something+0.021
Neg logit contributions
bo-0.049
ave-0.046
 forgot-0.044
ience-0.044
o-0.044
Token."
Token ID526
Feature activation+7.13e-03

Pos logit contributions
wered+0.021
 nature+0.019
 sunrise+0.019
 outer+0.019
 something+0.019
Neg logit contributions
bo-0.028
ave-0.027
ience-0.026
lets-0.025
 forgot-0.025
Token
Token ID198
Feature activation+2.43e-02

Pos logit contributions
wered+0.012
 outer+0.011
 sunrise+0.011
 nature+0.011
 something+0.011
Neg logit contributions
bo-0.015
ave-0.014
 forgot-0.013
o-0.013
ience-0.013
Token the
Token ID262
Feature activation+3.37e-02

Pos logit contributions
wered+0.064
uttering+0.053
 sunrise+0.052
 outer+0.046
 soil+0.045
Neg logit contributions
 forgot-0.121
bo-0.121
 rode-0.117
 missed-0.115
 forg-0.111
Token island
Token ID7022
Feature activation+4.10e-03

Pos logit contributions
wered+0.046
 outer+0.041
 an+0.037
 something+0.036
 sunrise+0.035
Neg logit contributions
bo-0.072
 forgot-0.070
anda-0.069
 rode-0.065
ience-0.064
Token.
Token ID13
Feature activation+5.35e-03

Pos logit contributions
wered+0.007
 sunrise+0.006
 outer+0.006
 nature+0.006
 something+0.006
Neg logit contributions
bo-0.009
ave-0.008
 forgot-0.008
anda-0.008
mo-0.008
Token They
Token ID1119
Feature activation-5.37e-02

Pos logit contributions
wered+0.010
 something+0.009
 sunrise+0.009
 nature+0.009
 outer+0.009
Neg logit contributions
bo-0.010
ave-0.010
ience-0.009
 forgot-0.009
mo-0.009
Token just
Token ID655
Feature activation-3.68e-02

Pos logit contributions
ave+0.129
bo+0.128
oped+0.124
lets+0.123
ience+0.123
Neg logit contributions
wered-0.072
 something-0.069
 a-0.067
 nature-0.067
 an-0.065
Token want
Token ID765
Feature activation+8.78e-02

Pos logit contributions
ave+0.068
bo+0.067
anda+0.067
ience+0.066
oped+0.065
Neg logit contributions
 something-0.068
wered-0.066
 a-0.066
 nature-0.065
 an-0.065
Token the
Token ID262
Feature activation+1.97e-02

Pos logit contributions
wered+0.152
 outer+0.133
 view+0.125
 sunrise+0.124
 attic+0.119
Neg logit contributions
 forgot-0.146
 cared-0.142
bo-0.142
 apologized-0.139
 forg-0.139
Token bird
Token ID6512
Feature activation-4.21e-02

Pos logit contributions
wered+0.037
 something+0.036
 outer+0.036
 nature+0.035
 sunrise+0.034
Neg logit contributions
bo-0.035
ave-0.034
ience-0.032
anda-0.032
mo-0.032
Token to
Token ID284
Feature activation+2.94e-02

Pos logit contributions
ave+0.124
bo+0.122
ience+0.121
oped+0.117
mo+0.116
Neg logit contributions
 something-0.035
 nature-0.032
 a-0.030
 an-0.029
 lunch-0.028
Token be
Token ID307
Feature activation+2.61e-02

Pos logit contributions
 sunrise+0.051
 outer+0.050
 something+0.050
 sight+0.049
wered+0.049
Neg logit contributions
bo-0.053
 forgot-0.049
 rode-0.049
ave-0.045
 roared-0.044
Token okay
Token ID8788
Feature activation+4.59e-03

Pos logit contributions
wered+0.033
 something+0.033
 nature+0.032
 sunrise+0.031
 outer+0.030
Neg logit contributions
bo-0.063
ave-0.063
ience-0.060
mo-0.059
anda-0.058
Token play
Token ID711
Feature activation+4.11e-02

Pos logit contributions
wered+0.015
 something+0.015
 nature+0.015
 sunrise+0.014
 lunch+0.013
Neg logit contributions
ave-0.028
bo-0.027
ience-0.026
anda-0.026
mo-0.026
Token.
Token ID13
Feature activation+7.52e-03

Pos logit contributions
wered+0.069
 outer+0.064
 sunrise+0.063
 sight+0.061
 nature+0.060
Neg logit contributions
ave-0.081
bo-0.079
 rode-0.073
oc-0.073
lets-0.073
Token They
Token ID1119
Feature activation-5.46e-02

Pos logit contributions
wered+0.015
 something+0.014
 sunrise+0.014
 outer+0.014
 nature+0.014
Neg logit contributions
bo-0.013
ave-0.013
ience-0.012
 forgot-0.012
mo-0.012
Token did
Token ID750
Feature activation+4.45e-02

Pos logit contributions
anna+0.109
bo+0.102
oke+0.100
anda+0.099
jo+0.099
Neg logit contributions
 a-0.108
 the-0.104
 an-0.102
wered-0.100
 something-0.098
Token not
Token ID407
Feature activation+5.78e-03

Pos logit contributions
wered+0.058
 sunrise+0.055
 nature+0.055
 something+0.053
 outer+0.053
Neg logit contributions
bo-0.104
ave-0.103
ience-0.100
 forgot-0.099
lets-0.097
Token need
Token ID761
Feature activation+8.70e-02

Pos logit contributions
wered+0.009
 something+0.009
 nature+0.009
 a+0.008
 an+0.008
Neg logit contributions
ave-0.013
ience-0.013
oped-0.012
anda-0.012
bo-0.012
Token a
Token ID257
Feature activation+4.06e-02

Pos logit contributions
wered+0.116
 sunrise+0.094
 outer+0.094
 view+0.085
 basement+0.085
Neg logit contributions
bo-0.191
ave-0.186
 rode-0.182
 fet-0.181
 forgot-0.180
Token missile
Token ID10105
Feature activation-3.16e-02

Pos logit contributions
wered+0.054
 something+0.048
 nature+0.047
 sunrise+0.045
 outer+0.045
Neg logit contributions
bo-0.098
ave-0.097
ience-0.094
 forgot-0.093
anda-0.092
Token or
Token ID393
Feature activation+1.67e-02

Pos logit contributions
ave+0.086
bo+0.086
ience+0.085
anna+0.082
mo+0.082
Neg logit contributions
 something-0.032
 nature-0.028
 lunch-0.028
 a-0.028
 dinner-0.027
Token a
Token ID257
Feature activation+3.89e-02

Pos logit contributions
wered+0.015
 sunrise+0.014
 nature+0.014
 sight+0.013
 something+0.013
Neg logit contributions
bo-0.046
ave-0.046
ience-0.044
 forgot-0.044
 rode-0.043
Token rich
Token ID5527
Feature activation+2.12e-02

Pos logit contributions
wered+0.057
 something+0.055
 nature+0.054
 sunrise+0.053
 outer+0.052
Neg logit contributions
bo-0.086
ave-0.086
ience-0.082
mo-0.080
anda-0.080
Token.
Token ID13
Feature activation+2.24e-03

Pos logit contributions
ave+0.020
bo+0.019
ience+0.019
lets+0.018
anda+0.018
Neg logit contributions
 something-0.006
 nature-0.005
wered-0.005
 a-0.005
 an-0.005
Token Everywhere
Token ID48201
Feature activation+8.98e-03

Pos logit contributions
 something+0.004
wered+0.004
 nature+0.004
 outer+0.004
 sunrise+0.004
Neg logit contributions
bo-0.004
ave-0.004
ience-0.004
 rode-0.004
mo-0.004
Token it
Token ID340
Feature activation-7.26e-02

Pos logit contributions
 something+0.009
wered+0.009
 nature+0.009
 sunrise+0.008
 outer+0.008
Neg logit contributions
ave-0.024
bo-0.024
ience-0.023
mo-0.023
anda-0.023
Token went
Token ID1816
Feature activation+2.07e-02

Pos logit contributions
lets+0.120
ave+0.115
ience+0.115
bo+0.112
anna+0.112
Neg logit contributions
 nature-0.151
 the-0.150
 something-0.148
 a-0.145
wered-0.143
Token it
Token ID340
Feature activation-3.09e-02

Pos logit contributions
wered+0.023
 something+0.023
 nature+0.022
 sunrise+0.021
 outer+0.021
Neg logit contributions
bo-0.053
ave-0.053
ience-0.051
mo-0.050
anda-0.050
Token attracted
Token ID12725
Feature activation+8.66e-02

Pos logit contributions
lets+0.050
bo+0.049
mo+0.049
ience+0.048
osaurus+0.048
Neg logit contributions
 the-0.067
 nature-0.066
 something-0.066
 a-0.065
 an-0.063
Token lots
Token ID6041
Feature activation+3.28e-02

Pos logit contributions
wered+0.108
 sunrise+0.088
 outer+0.085
 basement+0.084
 dinner+0.077
Neg logit contributions
ave-0.189
lets-0.186
 rode-0.183
bo-0.182
 forgot-0.179
Token of
Token ID286
Feature activation+4.23e-02

Pos logit contributions
wered+0.038
 something+0.036
 nature+0.034
 sunrise+0.034
 outer+0.033
Neg logit contributions
ave-0.084
bo-0.083
ience-0.081
anda-0.079
mo-0.078
Token attention
Token ID3241
Feature activation+2.58e-02

Pos logit contributions
wered+0.048
 something+0.045
 sunrise+0.043
 outer+0.043
 nature+0.042
Neg logit contributions
ave-0.107
bo-0.105
ience-0.102
anda-0.100
mo-0.100
Token because
Token ID780
Feature activation+4.79e-02

Pos logit contributions
wered+0.033
 something+0.029
 sunrise+0.029
 nature+0.029
 outer+0.028
Neg logit contributions
ave-0.062
bo-0.062
ience-0.059
lets-0.058
 forgot-0.057
Token of
Token ID286
Feature activation+4.76e-02

Pos logit contributions
wered+0.054
 lunch+0.050
 purpose+0.048
 sight+0.047
 dinner+0.047
Neg logit contributions
bo-0.108
 rode-0.107
ience-0.106
ave-0.104
 forgot-0.103
Token not
Token ID407
Feature activation+7.28e-03

Pos logit contributions
wered+0.048
 outer+0.046
 sunrise+0.044
 nature+0.042
 sight+0.042
Neg logit contributions
bo-0.061
ience-0.059
 forgot-0.059
ave-0.057
lets-0.055
Token want
Token ID765
Feature activation+7.92e-02

Pos logit contributions
wered+0.012
 something+0.012
 nature+0.011
 outer+0.011
 sunrise+0.011
Neg logit contributions
bo-0.015
ave-0.015
ience-0.014
 forgot-0.014
mo-0.013
Token squash
Token ID34613
Feature activation-1.93e-02

Pos logit contributions
wered+0.120
 outer+0.106
 basement+0.104
 attic+0.096
 sunrise+0.090
Neg logit contributions
bo-0.180
ave-0.166
ience-0.163
lets-0.158
o-0.155
Token.
Token ID13
Feature activation-1.49e-02

Pos logit contributions
bo+0.050
ave+0.050
ience+0.048
mo+0.046
anda+0.046
Neg logit contributions
wered-0.022
 something-0.021
 nature-0.020
 sunrise-0.020
 outer-0.019
Token We
Token ID775
Feature activation-4.92e-02

Pos logit contributions
ave+0.031
bo+0.030
ience+0.028
mo+0.028
anda+0.028
Neg logit contributions
 something-0.025
 nature-0.024
wered-0.024
 an-0.023
 sunrise-0.023
Token want
Token ID765
Feature activation+8.60e-02

Pos logit contributions
bo+0.108
ave+0.107
jo+0.106
mo+0.105
ience+0.105
Neg logit contributions
 something-0.074
 a-0.070
 an-0.069
 nature-0.068
wered-0.068
Token pizza
Token ID14256
Feature activation-2.13e-02

Pos logit contributions
wered+0.140
 outer+0.125
 basement+0.115
 attic+0.106
 view+0.101
Neg logit contributions
bo-0.182
lets-0.167
ave-0.166
ience-0.164
 forgot-0.162
Token.
Token ID13
Feature activation-1.74e-02

Pos logit contributions
bo+0.054
ave+0.054
ience+0.052
mo+0.051
anda+0.051
Neg logit contributions
 something-0.024
wered-0.024
 nature-0.023
 sunrise-0.022
 lunch-0.021
Token Pizza
Token ID20952
Feature activation-5.68e-02

Pos logit contributions
ave+0.034
bo+0.033
ience+0.032
anda+0.031
mo+0.031
Neg logit contributions
 something-0.031
wered-0.030
 nature-0.030
 an-0.029
 sunrise-0.029
Token is
Token ID318
Feature activation+2.66e-02

Pos logit contributions
ave+0.132
mo+0.123
bo+0.123
ience+0.120
oped+0.120
Neg logit contributions
 something-0.089
 nature-0.083
 lunch-0.081
 dinner-0.081
 a-0.081
Token y
Token ID331
Feature activation-1.65e-02

Pos logit contributions
wered+0.034
 nature+0.031
 outer+0.030
 sunrise+0.030
 something+0.030
Neg logit contributions
bo-0.065
ave-0.063
ience-0.062
 forgot-0.060
lets-0.059
Token up
Token ID510
Feature activation+5.47e-02

Pos logit contributions
wered+0.074
 outer+0.062
 sunrise+0.058
 nature+0.055
 university+0.053
Neg logit contributions
bo-0.133
mo-0.118
lers-0.116
o-0.114
anda-0.113
Token their
Token ID511
Feature activation+7.76e-03

Pos logit contributions
wered+0.056
 sunrise+0.047
 outer+0.045
 sight+0.041
 basement+0.039
Neg logit contributions
bo-0.142
ience-0.141
ave-0.136
lets-0.135
mo-0.135
Token things
Token ID1243
Feature activation-1.67e-02

Pos logit contributions
wered+0.011
 something+0.010
 sunrise+0.010
 nature+0.010
 outer+0.009
Neg logit contributions
ave-0.018
bo-0.018
anda-0.017
ience-0.017
mo-0.017
Token.
Token ID13
Feature activation+1.16e-02

Pos logit contributions
bo+0.038
ave+0.035
ience+0.035
 forgot+0.034
anda+0.033
Neg logit contributions
wered-0.027
 outer-0.023
 sunrise-0.023
 nature-0.023
 sight-0.021
Token They
Token ID1119
Feature activation-5.61e-02

Pos logit contributions
wered+0.027
 outer+0.025
 sunrise+0.025
 nature+0.024
 something+0.024
Neg logit contributions
ave-0.016
bo-0.016
 forgot-0.014
lets-0.014
ience-0.014
Token want
Token ID765
Feature activation+8.56e-02

Pos logit contributions
ave+0.134
bo+0.132
ience+0.127
lets+0.126
anda+0.125
Neg logit contributions
wered-0.077
 something-0.075
 nature-0.072
 a-0.070
 an-0.070
Token to
Token ID284
Feature activation+1.40e-02

Pos logit contributions
wered+0.152
 outer+0.140
 view+0.133
 sunrise+0.132
 sight+0.125
Neg logit contributions
bo-0.144
 forgot-0.143
ave-0.137
o-0.137
 cared-0.134
Token play
Token ID711
Feature activation+3.42e-02

Pos logit contributions
 outer+0.024
 something+0.023
wered+0.023
 sunrise+0.023
 sight+0.023
Neg logit contributions
bo-0.028
 forgot-0.026
 rode-0.025
ave-0.024
anda-0.023
Token more
Token ID517
Feature activation+2.35e-02

Pos logit contributions
wered+0.059
 sunrise+0.054
 outer+0.054
 nature+0.051
 sight+0.051
Neg logit contributions
ave-0.063
bo-0.063
 rode-0.058
 forgot-0.058
 pretended-0.057
Token.
Token ID13
Feature activation+7.63e-03

Pos logit contributions
wered+0.034
 sunrise+0.027
 outer+0.027
 nature+0.026
 something+0.026
Neg logit contributions
ave-0.055
bo-0.055
anda-0.053
ience-0.053
lets-0.052
Token They
Token ID1119
Feature activation-5.69e-02

Pos logit contributions
wered+0.015
 sunrise+0.014
 outer+0.014
 something+0.014
 nature+0.014
Neg logit contributions
bo-0.013
ave-0.013
ience-0.012
 forgot-0.012
lets-0.012
Token Fin
Token ID4463
Feature activation-6.25e-02

Pos logit contributions
wered+0.026
 something+0.026
 nature+0.026
 sunrise+0.025
 outer+0.025
Neg logit contributions
bo-0.028
ave-0.028
ience-0.027
mo-0.026
anda-0.025
Token liked
Token ID8288
Feature activation+6.80e-02

Pos logit contributions
ience+0.119
ave+0.117
olded+0.112
mo+0.111
oped+0.111
Neg logit contributions
 a-0.125
 nature-0.124
 the-0.122
 something-0.118
 an-0.116
Token to
Token ID284
Feature activation+2.43e-02

Pos logit contributions
wered+0.108
 outer+0.101
 basement+0.095
 attic+0.090
 something+0.090
Neg logit contributions
o-0.138
bo-0.130
ave-0.128
 rode-0.128
 fet-0.127
Token dive
Token ID15647
Feature activation+4.40e-02

Pos logit contributions
 something+0.039
wered+0.039
 nature+0.038
 sunrise+0.038
 outer+0.037
Neg logit contributions
bo-0.050
ave-0.049
ience-0.047
mo-0.046
anda-0.045
Token deep
Token ID2769
Feature activation+4.22e-02

Pos logit contributions
wered+0.069
 sunrise+0.066
 outer+0.060
 laundry+0.060
 basement+0.059
Neg logit contributions
bo-0.092
ave-0.089
o-0.086
 forgot-0.084
anda-0.084
Token into
Token ID656
Feature activation+8.54e-02

Pos logit contributions
wered+0.072
 sunrise+0.061
 something+0.060
 nature+0.059
 outer+0.059
Neg logit contributions
bo-0.088
ave-0.082
lers-0.081
ience-0.080
mo-0.080
Token the
Token ID262
Feature activation+4.00e-02

Pos logit contributions
wered+0.106
uttering+0.095
 basement+0.093
 sunrise+0.091
 tornado+0.089
Neg logit contributions
bo-0.165
 rode-0.160
 roared-0.156
oc-0.156
o-0.155
Token water
Token ID1660
Feature activation+1.26e-02

Pos logit contributions
wered+0.043
 something+0.039
 outer+0.037
 nature+0.036
 an+0.035
Neg logit contributions
bo-0.108
ave-0.098
anda-0.097
o-0.096
ience-0.094
Token.
Token ID13
Feature activation+3.82e-03

Pos logit contributions
wered+0.014
 something+0.013
 nature+0.013
 sunrise+0.013
 outer+0.012
Neg logit contributions
bo-0.033
ave-0.032
ience-0.030
mo-0.030
anda-0.030
Token One
Token ID1881
Feature activation+1.84e-02

Pos logit contributions
wered+0.008
 something+0.008
 nature+0.007
 sunrise+0.007
 outer+0.007
Neg logit contributions
bo-0.006
ave-0.006
ience-0.006
mo-0.006
anda-0.006
Token day
Token ID1110
Feature activation+4.38e-02

Pos logit contributions
wered+0.026
 outer+0.025
 something+0.024
 sunrise+0.024
 sight+0.023
Neg logit contributions
bo-0.042
o-0.040
ave-0.040
 forgot-0.038
mo-0.037
Token.
Token ID13
Feature activation-2.18e-03

Pos logit contributions
wered+0.060
 sunrise+0.056
 sight+0.051
 nature+0.051
 outer+0.050
Neg logit contributions
bo-0.086
ave-0.083
o-0.082
lets-0.080
anda-0.080
Token A
Token ID317
Feature activation+3.33e-02

Pos logit contributions
bo+0.005
ave+0.005
ience+0.005
mo+0.005
anda+0.004
Neg logit contributions
 something-0.003
wered-0.003
 nature-0.003
 an-0.003
 sunrise-0.003
Token big
Token ID1263
Feature activation+2.94e-02

Pos logit contributions
wered+0.039
 something+0.036
 nature+0.035
 sunrise+0.035
 outer+0.034
Neg logit contributions
bo-0.084
ave-0.083
ience-0.080
anda-0.079
 forgot-0.078
Token rock
Token ID3881
Feature activation-5.01e-02

Pos logit contributions
wered+0.046
 something+0.044
 outer+0.043
 nature+0.042
 sunrise+0.042
Neg logit contributions
bo-0.061
ave-0.059
ience-0.059
anda-0.057
mo-0.057
Token was
Token ID373
Feature activation+3.97e-02

Pos logit contributions
bo+0.090
anna+0.086
mo+0.085
ience+0.085
ave+0.083
Neg logit contributions
 something-0.100
 the-0.097
 a-0.096
 nature-0.093
 an-0.093
Token blocking
Token ID12013
Feature activation+8.52e-02

Pos logit contributions
wered+0.056
 nature+0.054
 sunrise+0.053
 something+0.052
 outer+0.050
Neg logit contributions
bo-0.091
ave-0.090
ience-0.085
mo-0.084
 forgot-0.083
Token the
Token ID262
Feature activation+4.26e-02

Pos logit contributions
wered+0.089
 sunrise+0.084
 lunch+0.078
 laundry+0.074
 dinner+0.072
Neg logit contributions
bo-0.202
o-0.200
lets-0.193
ny-0.189
mo-0.186
Token road
Token ID2975
Feature activation+1.59e-02

Pos logit contributions
wered+0.066
 something+0.063
 outer+0.062
 nature+0.060
 sunrise+0.059
Neg logit contributions
bo-0.088
ave-0.080
 forgot-0.078
anda-0.077
ience-0.077
Token,
Token ID11
Feature activation+1.34e-02

Pos logit contributions
wered+0.019
 sunrise+0.018
 nature+0.017
 something+0.017
 outer+0.017
Neg logit contributions
bo-0.040
ave-0.040
ience-0.038
 forgot-0.037
anda-0.037
Token and
Token ID290
Feature activation+8.38e-03

Pos logit contributions
wered+0.020
 nature+0.019
 something+0.019
 sunrise+0.019
 outer+0.018
Neg logit contributions
bo-0.029
ave-0.029
ience-0.028
 forgot-0.027
mo-0.027
Token no
Token ID645
Feature activation+4.00e-02

Pos logit contributions
 something+0.007
wered+0.007
 nature+0.007
 an+0.006
 sunrise+0.006
Neg logit contributions
bo-0.024
ave-0.024
ience-0.023
mo-0.023
anda-0.023
Token very
Token ID845
Feature activation+3.09e-02

Pos logit contributions
 something+0.030
wered+0.030
 nature+0.029
 sunrise+0.028
 an+0.027
Neg logit contributions
ave-0.042
bo-0.041
ience-0.040
mo-0.038
lets-0.038
Token competitive
Token ID7606
Feature activation+2.22e-02

Pos logit contributions
 something+0.043
wered+0.043
 nature+0.043
 sunrise+0.040
 an+0.040
Neg logit contributions
ave-0.071
bo-0.070
ience-0.067
mo-0.065
lets-0.065
Token.
Token ID13
Feature activation+8.78e-03

Pos logit contributions
wered+0.021
 something+0.021
 nature+0.020
 sunrise+0.019
 outer+0.019
Neg logit contributions
bo-0.061
ave-0.061
ience-0.058
mo-0.057
anda-0.057
Token They
Token ID1119
Feature activation-5.41e-02

Pos logit contributions
wered+0.014
 something+0.014
 nature+0.014
 sunrise+0.013
 outer+0.013
Neg logit contributions
ave-0.018
bo-0.018
ience-0.018
mo-0.017
anda-0.017
Token always
Token ID1464
Feature activation-2.52e-02

Pos logit contributions
bo+0.128
ave+0.127
ience+0.123
anna+0.122
oped+0.122
Neg logit contributions
wered-0.076
 nature-0.073
 the-0.073
 a-0.072
 an-0.069
Token want
Token ID765
Feature activation+8.50e-02

Pos logit contributions
bo+0.054
ave+0.054
mo+0.052
ience+0.052
anda+0.050
Neg logit contributions
 nature-0.042
 the-0.041
 a-0.040
 something-0.040
 an-0.038
Token to
Token ID284
Feature activation+1.60e-02

Pos logit contributions
wered+0.142
 sunrise+0.117
 outer+0.115
 view+0.105
 basement+0.105
Neg logit contributions
ave-0.159
lets-0.159
 forgot-0.152
 pads-0.151
 fet-0.151
Token win
Token ID1592
Feature activation+5.06e-02

Pos logit contributions
wered+0.028
 something+0.027
 nature+0.027
 sunrise+0.027
 outer+0.026
Neg logit contributions
bo-0.031
ave-0.030
 forgot-0.028
ience-0.028
lets-0.028
Token.
Token ID13
Feature activation+4.52e-03

Pos logit contributions
wered+0.049
 sunrise+0.045
 lunch+0.039
 dinner+0.038
 nature+0.038
Neg logit contributions
bo-0.136
ave-0.132
ience-0.131
lets-0.129
 forgot-0.129
Token
Token ID198
Feature activation+2.06e-02

Pos logit contributions
wered+0.007
 something+0.007
 nature+0.006
 sunrise+0.006
 outer+0.006
Neg logit contributions
ave-0.010
bo-0.010
ience-0.010
mo-0.009
anda-0.009
Token
Token ID198
Feature activation+1.85e-02

Pos logit contributions
wered+0.033
 something+0.032
 nature+0.031
 sunrise+0.031
 outer+0.031
Neg logit contributions
bo-0.043
ave-0.042
o-0.040
mo-0.039
anda-0.039
Token They
Token ID1119
Feature activation-5.91e-02

Pos logit contributions
bo+0.011
ave+0.010
ience+0.010
mo+0.009
anda+0.009
Neg logit contributions
wered-0.012
 something-0.011
 nature-0.011
 sunrise-0.011
 outer-0.011
Token said
Token ID531
Feature activation+2.18e-02

Pos logit contributions
anna+0.095
mo+0.095
bo+0.094
jo+0.089
anda+0.089
Neg logit contributions
 a-0.133
 the-0.128
 something-0.127
 an-0.127
wered-0.121
Token they
Token ID484
Feature activation-5.09e-02

Pos logit contributions
wered+0.033
 something+0.032
 nature+0.032
 sunrise+0.031
 outer+0.030
Neg logit contributions
bo-0.048
ave-0.048
ience-0.046
mo-0.044
anda-0.044
Token did
Token ID750
Feature activation+4.36e-02

Pos logit contributions
mo+0.094
anna+0.093
bo+0.093
anda+0.092
ience+0.092
Neg logit contributions
 a-0.099
 something-0.096
 an-0.093
 the-0.088
 nature-0.081
Token not
Token ID407
Feature activation+1.06e-02

Pos logit contributions
wered+0.076
 sunrise+0.073
 soil+0.072
 outer+0.071
 basement+0.071
Neg logit contributions
bo-0.077
 forgot-0.077
ience-0.072
lets-0.069
 rode-0.068
Token need
Token ID761
Feature activation+8.47e-02

Pos logit contributions
 something+0.017
 nature+0.017
wered+0.016
 a+0.016
 an+0.015
Neg logit contributions
ave-0.022
ience-0.022
bo-0.022
oped-0.021
anda-0.021
Token to
Token ID284
Feature activation+2.54e-02

Pos logit contributions
wered+0.124
 outer+0.113
 basement+0.099
 attic+0.099
 east+0.093
Neg logit contributions
bo-0.188
ave-0.175
ience-0.173
o-0.168
lers-0.168
Token be
Token ID307
Feature activation+2.83e-02

Pos logit contributions
wered+0.042
 something+0.041
 sunrise+0.040
 nature+0.040
 outer+0.040
Neg logit contributions
bo-0.053
ave-0.050
 forgot-0.048
ience-0.047
lets-0.046
Token wealthy
Token ID11574
Feature activation+1.21e-02

Pos logit contributions
 something+0.047
 a+0.043
 nature+0.043
 an+0.043
 the+0.042
Neg logit contributions
bo-0.060
ave-0.060
ience-0.058
oped-0.058
jo-0.056
Token to
Token ID284
Feature activation+2.90e-02

Pos logit contributions
 something+0.014
wered+0.014
 nature+0.014
 sunrise+0.013
 sight+0.013
Neg logit contributions
bo-0.030
ave-0.030
ience-0.029
mo-0.028
anda-0.028
Token be
Token ID307
Feature activation+2.24e-02

Pos logit contributions
wered+0.038
 something+0.036
 sunrise+0.036
 nature+0.035
 outer+0.035
Neg logit contributions
bo-0.069
ave-0.067
 forgot-0.064
ience-0.063
mo-0.063
Token hurt
Token ID5938
Feature activation+4.36e-02

Pos logit contributions
bo+0.020
ave+0.020
ience+0.019
 forgot+0.019
mo+0.018
Neg logit contributions
wered-0.013
 something-0.013
 nature-0.012
 sunrise-0.012
 sight-0.012
Token yourself
Token ID3511
Feature activation-1.07e-02

Pos logit contributions
wered+0.055
 sunrise+0.050
 outer+0.046
 sight+0.046
 view+0.044
Neg logit contributions
bo-0.114
ave-0.104
o-0.100
mo-0.100
lets-0.100
Token.
Token ID13
Feature activation-1.01e-02

Pos logit contributions
bo+0.029
ave+0.028
ience+0.027
mo+0.027
anda+0.027
Neg logit contributions
wered-0.011
 something-0.011
 nature-0.010
 sunrise-0.010
 outer-0.010
Token And
Token ID843
Feature activation-5.57e-03

Pos logit contributions
ave+0.018
bo+0.018
ience+0.017
mo+0.016
anda+0.016
Neg logit contributions
wered-0.019
 something-0.019
 nature-0.019
 sunrise-0.018
 outer-0.018
Token I
Token ID314
Feature activation-3.75e-02

Pos logit contributions
bo+0.015
ave+0.015
ience+0.014
mo+0.014
anda+0.014
Neg logit contributions
 something-0.006
wered-0.006
 nature-0.006
 sunrise-0.005
 an-0.005
Token need
Token ID761
Feature activation+8.46e-02

Pos logit contributions
bo+0.082
ience+0.081
lets+0.081
ave+0.081
anna+0.080
Neg logit contributions
 something-0.056
 a-0.054
 an-0.053
 nature-0.051
 the-0.050
Token to
Token ID284
Feature activation+2.46e-02

Pos logit contributions
wered+0.138
 sunrise+0.116
 view+0.111
 outer+0.106
 sight+0.102
Neg logit contributions
ave-0.170
bo-0.170
olded-0.163
 rode-0.158
 forgot-0.152
Token go
Token ID467
Feature activation+4.33e-03

Pos logit contributions
 sunrise+0.040
 sight+0.040
wered+0.040
 outer+0.039
 nature+0.038
Neg logit contributions
bo-0.050
 forgot-0.047
ave-0.045
 rode-0.045
anna-0.042
Token to
Token ID284
Feature activation+5.06e-02

Pos logit contributions
wered+0.004
 nature+0.004
 something+0.004
 sunrise+0.004
 outer+0.004
Neg logit contributions
bo-0.012
ave-0.012
ience-0.011
anda-0.011
mo-0.011
Token the
Token ID262
Feature activation+2.62e-02

Pos logit contributions
wered+0.060
 sunrise+0.052
uttering+0.048
 view+0.046
 sight+0.046
Neg logit contributions
 forgot-0.109
 rode-0.108
bo-0.107
anna-0.104
's-0.099
Token university
Token ID6403
Feature activation-1.32e-02

Pos logit contributions
wered+0.041
 nature+0.037
 sunrise+0.036
 sight+0.035
 outer+0.035
Neg logit contributions
bo-0.052
ave-0.050
 forgot-0.049
anda-0.048
 rode-0.047
Token to
Token ID284
Feature activation+5.62e-02

Pos logit contributions
wered+0.012
 sunrise+0.011
 outer+0.010
 something+0.010
 nature+0.010
Neg logit contributions
bo-0.022
ave-0.021
o-0.020
anda-0.020
mo-0.020
Token the
Token ID262
Feature activation+2.79e-02

Pos logit contributions
wered+0.071
uttering+0.068
eways+0.061
 soil+0.059
 sunrise+0.057
Neg logit contributions
 rode-0.105
 forgot-0.102
bo-0.099
 missed-0.098
 drove-0.096
Token lake
Token ID13546
Feature activation+9.07e-03

Pos logit contributions
wered+0.041
 outer+0.036
 something+0.036
 an+0.035
 nature+0.033
Neg logit contributions
bo-0.055
anda-0.051
 rode-0.051
ience-0.051
 Driver-0.050
Token.
Token ID13
Feature activation+7.18e-03

Pos logit contributions
wered+0.012
 sunrise+0.011
 outer+0.010
 nature+0.010
 something+0.010
Neg logit contributions
bo-0.022
ave-0.021
o-0.020
ience-0.020
anda-0.020
Token They
Token ID1119
Feature activation-5.37e-02

Pos logit contributions
wered+0.014
 sunrise+0.013
 something+0.013
 outer+0.013
 nature+0.013
Neg logit contributions
bo-0.013
ave-0.013
o-0.011
 forgot-0.011
anda-0.011
Token want
Token ID765
Feature activation+8.45e-02

Pos logit contributions
bo+0.122
ave+0.122
lets+0.115
ience+0.114
anda+0.113
Neg logit contributions
wered-0.080
 something-0.075
 nature-0.074
 a-0.073
 an-0.073
Token to
Token ID284
Feature activation+1.82e-02

Pos logit contributions
wered+0.157
 outer+0.148
 view+0.137
 sunrise+0.130
 sight+0.129
Neg logit contributions
bo-0.132
o-0.129
 forgot-0.127
's-0.126
ave-0.124
Token swim
Token ID9422
Feature activation+9.58e-03

Pos logit contributions
 something+0.030
wered+0.030
 outer+0.030
 sunrise+0.029
 sight+0.029
Neg logit contributions
bo-0.036
 forgot-0.033
 rode-0.033
ave-0.032
o-0.030
Token and
Token ID290
Feature activation+7.46e-03

Pos logit contributions
wered+0.012
 sunrise+0.011
 nature+0.011
 something+0.011
 outer+0.011
Neg logit contributions
bo-0.024
ave-0.024
ience-0.022
anda-0.022
mo-0.022
Token play
Token ID711
Feature activation+4.63e-02

Pos logit contributions
wered+0.012
 sunrise+0.011
 something+0.011
 nature+0.011
 outer+0.011
Neg logit contributions
bo-0.015
ave-0.015
 forgot-0.015
 rode-0.014
ience-0.014
Token.
Token ID13
Feature activation+6.91e-03

Pos logit contributions
wered+0.082
 sunrise+0.077
 outer+0.077
 sight+0.071
 nature+0.070
Neg logit contributions
bo-0.081
ave-0.081
 rode-0.079
 roared-0.074
oc-0.073
Token,
Token ID11
Feature activation-9.19e-03

Pos logit contributions
wered+0.001
 nature+0.001
 sunrise+0.001
 something+0.001
 outer+0.001
Neg logit contributions
bo-0.002
ave-0.002
ience-0.002
anda-0.002
mo-0.002
Token "
Token ID366
Feature activation-1.69e-02

Pos logit contributions
ave+0.023
bo+0.023
ience+0.023
anda+0.022
mo+0.022
Neg logit contributions
 something-0.012
wered-0.011
 a-0.011
 nature-0.010
 an-0.010
TokenBut
Token ID1537
Feature activation+1.08e-02

Pos logit contributions
ience+0.038
anda+0.036
mo+0.036
ave+0.036
bo+0.035
Neg logit contributions
 nature-0.026
 something-0.026
wered-0.025
 sight-0.023
 sunrise-0.023
Token I
Token ID314
Feature activation-3.04e-02

Pos logit contributions
wered+0.016
 something+0.015
 sunrise+0.015
 nature+0.015
 outer+0.014
Neg logit contributions
bo-0.024
ave-0.024
ience-0.023
mo-0.022
anda-0.022
Token really
Token ID1107
Feature activation-3.92e-03

Pos logit contributions
bo+0.062
ave+0.061
ience+0.060
mo+0.059
anna+0.058
Neg logit contributions
 something-0.053
 an-0.050
 a-0.049
 nature-0.049
 the-0.046
Token want
Token ID765
Feature activation+8.43e-02

Pos logit contributions
bo+0.007
ave+0.007
mo+0.007
anda+0.007
ience+0.007
Neg logit contributions
 something-0.007
 nature-0.007
 a-0.007
 an-0.007
wered-0.007
Token to
Token ID284
Feature activation+1.77e-02

Pos logit contributions
wered+0.145
 sunrise+0.116
 basement+0.113
 outer+0.110
 view+0.109
Neg logit contributions
ave-0.170
bo-0.167
ience-0.163
 pads-0.160
 rode-0.160
Token play
Token ID711
Feature activation+3.72e-02

Pos logit contributions
wered+0.028
 something+0.027
 sunrise+0.027
 nature+0.027
 outer+0.026
Neg logit contributions
bo-0.037
ave-0.036
 forgot-0.034
ience-0.034
 rode-0.033
Token chess
Token ID19780
Feature activation-1.37e-02

Pos logit contributions
wered+0.051
 sunrise+0.046
 nature+0.045
 outer+0.044
 sight+0.043
Neg logit contributions
bo-0.087
ave-0.086
ience-0.082
 forgot-0.080
 rode-0.079
Token."
Token ID526
Feature activation+2.58e-03

Pos logit contributions
bo+0.034
ave+0.034
ience+0.033
mo+0.032
anda+0.031
Neg logit contributions
wered-0.017
 something-0.016
 nature-0.016
 sunrise-0.015
 outer-0.015
Token
Token ID198
Feature activation+1.26e-02

Pos logit contributions
 something+0.004
wered+0.004
 nature+0.004
 sunrise+0.004
 an+0.003
Neg logit contributions
bo-0.006
ave-0.006
ience-0.006
mo-0.005
anda-0.005
Token and
Token ID290
Feature activation-1.66e-03

Pos logit contributions
wered+0.012
 sunrise+0.011
 something+0.010
 nature+0.010
 outer+0.010
Neg logit contributions
bo-0.021
ave-0.021
ience-0.020
lets-0.020
anda-0.020
Token tape
Token ID9154
Feature activation+8.96e-03

Pos logit contributions
bo+0.004
ave+0.004
ience+0.004
lets+0.004
 forgot+0.003
Neg logit contributions
wered-0.002
 sunrise-0.002
 nature-0.002
 sight-0.002
 outer-0.002
Token.
Token ID13
Feature activation-9.13e-03

Pos logit contributions
wered+0.011
 something+0.010
 nature+0.010
 sunrise+0.010
 outer+0.010
Neg logit contributions
bo-0.022
ave-0.022
ience-0.021
mo-0.021
anda-0.021
Token And
Token ID843
Feature activation+8.94e-03

Pos logit contributions
bo+0.015
ave+0.014
ience+0.014
lets+0.013
mo+0.013
Neg logit contributions
wered-0.019
 something-0.019
 nature-0.018
 sunrise-0.018
 sight-0.018
Token we
Token ID356
Feature activation-3.22e-02

Pos logit contributions
wered+0.015
 sunrise+0.014
 nature+0.014
 sight+0.014
 outer+0.014
Neg logit contributions
bo-0.017
ience-0.017
ave-0.017
lets-0.017
 forgot-0.017
Token need
Token ID761
Feature activation+8.42e-02

Pos logit contributions
ave+0.067
bo+0.066
anna+0.066
jo+0.065
mo+0.063
Neg logit contributions
 something-0.053
 a-0.051
 an-0.050
wered-0.050
 nature-0.049
Token to
Token ID284
Feature activation+1.91e-02

Pos logit contributions
wered+0.148
 sunrise+0.116
 outer+0.114
 view+0.112
 basement+0.108
Neg logit contributions
bo-0.159
ave-0.150
 fet-0.147
lets-0.145
o-0.143
Token be
Token ID307
Feature activation+2.78e-02

Pos logit contributions
wered+0.032
 sunrise+0.031
 sight+0.031
 outer+0.030
 nature+0.030
Neg logit contributions
bo-0.039
ave-0.036
 forgot-0.035
 rode-0.035
lets-0.034
Token gentle
Token ID10296
Feature activation+6.69e-03

Pos logit contributions
wered+0.041
 sunrise+0.039
 nature+0.038
 something+0.038
 outer+0.038
Neg logit contributions
bo-0.063
ave-0.063
ience-0.059
 forgot-0.057
lets-0.057
Token."
Token ID526
Feature activation+2.13e-03

Pos logit contributions
wered+0.011
 outer+0.010
 sunrise+0.009
 sight+0.009
 nature+0.009
Neg logit contributions
bo-0.015
ave-0.015
lets-0.014
ience-0.014
o-0.014
Token
Token ID198
Feature activation+1.78e-02

Pos logit contributions
wered+0.003
 something+0.003
 nature+0.003
 sunrise+0.003
 outer+0.003
Neg logit contributions
bo-0.005
ave-0.005
ience-0.005
mo-0.005
anda-0.004
Token anyone
Token ID2687
Feature activation-3.11e-02

Pos logit contributions
wered+0.045
 nature+0.041
 something+0.041
 sunrise+0.041
 outer+0.040
Neg logit contributions
bo-0.082
ave-0.081
ience-0.078
lets-0.075
 forgot-0.075
Token.
Token ID13
Feature activation-3.93e-03

Pos logit contributions
ave+0.080
bo+0.078
ience+0.075
mo+0.074
oped+0.074
Neg logit contributions
 something-0.038
 nature-0.036
 a-0.034
 the-0.034
 sunrise-0.033
Token We
Token ID775
Feature activation-3.89e-02

Pos logit contributions
bo+0.006
ave+0.006
ience+0.006
lets+0.005
mo+0.005
Neg logit contributions
wered-0.008
 something-0.008
 nature-0.008
 sunrise-0.008
 sight-0.008
Token do
Token ID466
Feature activation+3.94e-02

Pos logit contributions
anna+0.088
jo+0.088
bo+0.088
mo+0.087
ave+0.087
Neg logit contributions
 a-0.056
 something-0.054
 an-0.052
 the-0.052
 nature-0.051
Token not
Token ID407
Feature activation+1.41e-02

Pos logit contributions
wered+0.054
 sunrise+0.050
 nature+0.049
 sight+0.048
 outer+0.048
Neg logit contributions
bo-0.090
ience-0.089
ave-0.088
 forgot-0.086
lets-0.084
Token need
Token ID761
Feature activation+8.42e-02

Pos logit contributions
 something+0.020
wered+0.020
 nature+0.019
 a+0.018
 sunrise+0.018
Neg logit contributions
ave-0.033
ience-0.032
bo-0.031
anda-0.031
mo-0.031
Token school
Token ID1524
Feature activation-3.03e-06

Pos logit contributions
wered+0.116
 sunrise+0.093
 attic+0.086
 outer+0.086
 basement+0.085
Neg logit contributions
bo-0.190
ave-0.187
ience-0.178
 rode-0.177
lets-0.175
Token."
Token ID526
Feature activation+1.73e-03

Feature not active on previous token; no predictions were affected.
Token
Token ID198
Feature activation+2.24e-02

Pos logit contributions
 something+0.002
wered+0.002
 nature+0.002
 sunrise+0.002
 an+0.002
Neg logit contributions
ave-0.004
bo-0.004
ience-0.004
mo-0.004
anda-0.004
Token
Token ID198
Feature activation+1.19e-02

Pos logit contributions
wered+0.038
 sunrise+0.036
 outer+0.036
 something+0.036
 nature+0.035
Neg logit contributions
bo-0.046
ave-0.044
o-0.042
ience-0.041
anda-0.041
TokenTheir
Token ID14574
Feature activation+1.58e-02

Pos logit contributions
wered+0.016
 something+0.016
 nature+0.016
 sunrise+0.015
 outer+0.015
Neg logit contributions
ave-0.028
bo-0.028
ience-0.027
mo-0.026
anda-0.026
Token sand
Token ID6450
Feature activation-7.60e-03

Pos logit contributions
wered+0.031
 something+0.027
 outer+0.026
 nature+0.026
 an+0.025
Neg logit contributions
bo-0.063
anda-0.062
ave-0.060
 forgot-0.060
ience-0.059
Token castle
Token ID16669
Feature activation-1.60e-02

Pos logit contributions
bo+0.016
ave+0.016
ience+0.015
mo+0.015
anda+0.015
Neg logit contributions
wered-0.012
 something-0.012
 nature-0.012
 sunrise-0.012
 outer-0.011
Token.
Token ID13
Feature activation-6.26e-03

Pos logit contributions
bo+0.039
ave+0.039
ience+0.037
anda+0.036
lets+0.036
Neg logit contributions
wered-0.021
 something-0.019
 nature-0.019
 sunrise-0.018
 outer-0.018
Token We
Token ID775
Feature activation-4.02e-02

Pos logit contributions
bo+0.009
ave+0.009
ience+0.008
lets+0.008
olded+0.008
Neg logit contributions
wered-0.014
 something-0.013
 sight-0.013
 nature-0.013
 sunrise-0.013
Token just
Token ID655
Feature activation-2.38e-02

Pos logit contributions
jo+0.086
anna+0.086
bo+0.084
ience+0.081
ave+0.081
Neg logit contributions
 a-0.062
 something-0.060
 an-0.059
 the-0.058
 nature-0.056
Token need
Token ID761
Feature activation+8.40e-02

Pos logit contributions
bo+0.044
ave+0.043
anda+0.043
ience+0.043
mo+0.041
Neg logit contributions
 something-0.045
wered-0.043
 a-0.043
 nature-0.043
 an-0.042
Token to
Token ID284
Feature activation+2.14e-02

Pos logit contributions
wered+0.130
 sunrise+0.102
 outer+0.100
 basement+0.094
 avail+0.094
Neg logit contributions
bo-0.158
ave-0.156
 could-0.150
 fet-0.149
lets-0.147
Token add
Token ID751
Feature activation+5.40e-02

Pos logit contributions
wered+0.034
 sight+0.034
 sunrise+0.034
 outer+0.033
 something+0.033
Neg logit contributions
bo-0.043
 forgot-0.040
ave-0.039
 rode-0.038
lets-0.037
Token some
Token ID617
Feature activation+4.66e-02

Pos logit contributions
wered+0.078
 basement+0.058
 outer+0.057
 sunrise+0.056
 university+0.053
Neg logit contributions
bo-0.121
lets-0.114
ave-0.113
 pads-0.113
ience-0.112
Token water
Token ID1660
Feature activation-1.11e-02

Pos logit contributions
wered+0.063
 outer+0.049
 sight+0.048
 sunrise+0.047
 nature+0.046
Neg logit contributions
ave-0.106
bo-0.104
anda-0.103
ience-0.102
lets-0.100
Token to
Token ID284
Feature activation+3.94e-02

Pos logit contributions
bo+0.028
ave+0.027
ience+0.026
anda+0.025
mo+0.025
Neg logit contributions
wered-0.014
 something-0.013
 nature-0.013
 sunrise-0.012
 outer-0.012
Token day
Token ID1110
Feature activation+4.08e-02

Pos logit contributions
wered+0.021
 something+0.020
 nature+0.019
 sunrise+0.019
 outer+0.019
Neg logit contributions
bo-0.039
ave-0.038
ience-0.036
mo-0.036
o-0.036
Token,
Token ID11
Feature activation+4.66e-02

Pos logit contributions
wered+0.071
 outer+0.068
 sunrise+0.067
 nature+0.067
 basement+0.062
Neg logit contributions
ave-0.081
bo-0.078
o-0.077
 rode-0.077
 forgot-0.075
Token Tim
Token ID5045
Feature activation-8.01e-02

Pos logit contributions
 outer+0.089
wered+0.085
 nature+0.084
 sunrise+0.083
 basement+0.082
Neg logit contributions
 rode-0.077
o-0.072
 forgot-0.071
ave-0.070
bo-0.070
Tokenmy
Token ID1820
Feature activation-7.24e-02

Pos logit contributions
ience+0.163
olded+0.153
zel+0.150
oped+0.149
uer+0.144
Neg logit contributions
 a-0.145
 the-0.140
 something-0.136
 nature-0.135
 an-0.134
Token accidentally
Token ID14716
Feature activation+1.81e-02

Pos logit contributions
ience+0.121
oped+0.116
mo+0.115
anda+0.111
oke+0.111
Neg logit contributions
 a-0.153
 something-0.151
 the-0.149
 an-0.144
 dinner-0.140
Token dropped
Token ID5710
Feature activation+8.38e-02

Pos logit contributions
 something+0.042
wered+0.041
 an+0.040
 a+0.040
 nature+0.040
Neg logit contributions
ave-0.025
bo-0.025
ience-0.024
anda-0.023
mo-0.023
Token his
Token ID465
Feature activation+1.97e-02

Pos logit contributions
wered+0.113
 outer+0.103
 sunrise+0.100
 basement+0.095
 university+0.095
Neg logit contributions
bo-0.201
 rode-0.177
o-0.171
lers-0.171
mo-0.167
Token little
Token ID1310
Feature activation+1.22e-02

Pos logit contributions
wered+0.036
 something+0.035
 nature+0.035
 sunrise+0.034
 outer+0.033
Neg logit contributions
bo-0.037
ave-0.037
ience-0.035
anda-0.034
mo-0.034
Token metal
Token ID6147
Feature activation+1.26e-02

Pos logit contributions
 something+0.020
 nature+0.019
wered+0.019
 sight+0.019
 lunch+0.019
Neg logit contributions
ave-0.026
bo-0.025
ience-0.024
mo-0.023
 forgot-0.023
Token car
Token ID1097
Feature activation-3.61e-02

Pos logit contributions
wered+0.023
 nature+0.022
 sunrise+0.022
 something+0.022
 outer+0.021
Neg logit contributions
bo-0.024
ave-0.023
anda-0.022
ience-0.022
mo-0.021
Token and
Token ID290
Feature activation-3.38e-02

Pos logit contributions
bo+0.091
ave+0.090
ience+0.087
mo+0.084
anda+0.084
Neg logit contributions
wered-0.043
 something-0.042
 nature-0.041
 sunrise-0.040
 outer-0.038
Token and
Token ID290
Feature activation+3.40e-03

Pos logit contributions
bo+0.031
ave+0.031
ience+0.030
lets+0.029
anda+0.029
Neg logit contributions
wered-0.012
 something-0.012
 nature-0.011
 sunrise-0.011
 outer-0.011
Token cars
Token ID5006
Feature activation-2.38e-02

Pos logit contributions
wered+0.004
 something+0.004
 nature+0.004
 sunrise+0.004
 outer+0.004
Neg logit contributions
bo-0.008
ave-0.008
ience-0.008
mo-0.008
 forgot-0.008
Token back
Token ID736
Feature activation+4.14e-03

Pos logit contributions
bo+0.067
ave+0.067
ience+0.065
mo+0.064
anda+0.063
Neg logit contributions
 something-0.021
wered-0.020
 nature-0.019
 sunrise-0.018
 lunch-0.017
Token.
Token ID13
Feature activation+2.17e-03

Pos logit contributions
wered+0.004
 something+0.004
 nature+0.004
 sunrise+0.004
 outer+0.004
Neg logit contributions
bo-0.011
ave-0.011
ience-0.011
mo-0.010
anda-0.010
Token They
Token ID1119
Feature activation-5.42e-02

Pos logit contributions
 something+0.004
wered+0.004
 nature+0.004
 sunrise+0.003
 outer+0.003
Neg logit contributions
bo-0.004
ave-0.004
ience-0.004
mo-0.004
anda-0.004
Token want
Token ID765
Feature activation+8.36e-02

Pos logit contributions
bo+0.119
ave+0.118
ience+0.111
mo+0.111
lets+0.109
Neg logit contributions
wered-0.085
 something-0.083
 nature-0.080
 an-0.078
 a-0.077
Token their
Token ID511
Feature activation+9.11e-03

Pos logit contributions
wered+0.110
 outer+0.103
 sunrise+0.101
 view+0.098
 basement+0.090
Neg logit contributions
bo-0.189
ave-0.180
 forgot-0.177
 apologized-0.173
o-0.170
Token songs
Token ID7259
Feature activation-1.97e-02

Pos logit contributions
 something+0.013
 nature+0.013
wered+0.013
 sunrise+0.012
 sight+0.012
Neg logit contributions
bo-0.021
ave-0.021
ience-0.020
mo-0.020
oped-0.019
Token back
Token ID736
Feature activation-4.64e-03

Pos logit contributions
bo+0.052
ave+0.052
ience+0.050
mo+0.049
anda+0.049
Neg logit contributions
wered-0.020
 something-0.020
 nature-0.019
 sunrise-0.019
 outer-0.018
Token.
Token ID13
Feature activation+1.90e-03

Pos logit contributions
ave+0.013
bo+0.013
ience+0.013
mo+0.013
anda+0.013
Neg logit contributions
 something-0.004
 nature-0.004
wered-0.004
 sunrise-0.003
 sight-0.003
Token
Token ID198
Feature activation+2.23e-02

Pos logit contributions
 something+0.003
wered+0.003
 nature+0.003
 sunrise+0.003
 outer+0.003
Neg logit contributions
bo-0.004
ave-0.004
ience-0.004
mo-0.004
anda-0.004
Token scream
Token ID8196
Feature activation+1.23e-02

Pos logit contributions
ave+0.093
bo+0.090
ience+0.089
mo+0.086
anna+0.086
Neg logit contributions
 something-0.068
wered-0.067
 nature-0.064
 a-0.063
 an-0.062
Token and
Token ID290
Feature activation-2.73e-02

Pos logit contributions
wered+0.019
 nature+0.018
 something+0.018
 sunrise+0.018
 outer+0.017
Neg logit contributions
ave-0.027
bo-0.027
ience-0.026
mo-0.025
 forgot-0.025
Token cry
Token ID3960
Feature activation+9.27e-03

Pos logit contributions
bo+0.045
ave+0.042
ience+0.040
 forgot+0.040
 rode+0.039
Neg logit contributions
wered-0.056
 sunrise-0.055
 something-0.054
 nature-0.054
 outer-0.053
Token.
Token ID13
Feature activation+1.44e-02

Pos logit contributions
wered+0.011
 nature+0.010
 sunrise+0.010
 outer+0.010
 something+0.010
Neg logit contributions
bo-0.024
ave-0.024
ience-0.023
lets-0.022
 forgot-0.022
Token They
Token ID1119
Feature activation-4.83e-02

Pos logit contributions
wered+0.025
 something+0.023
 outer+0.023
 sunrise+0.023
 nature+0.023
Neg logit contributions
bo-0.029
ave-0.029
 forgot-0.026
ience-0.026
o-0.026
Token need
Token ID761
Feature activation+8.36e-02

Pos logit contributions
ave+0.108
bo+0.106
ience+0.104
mo+0.102
lets+0.101
Neg logit contributions
wered-0.072
 something-0.071
 nature-0.068
 a-0.066
 an-0.066
Token help
Token ID1037
Feature activation+2.17e-02

Pos logit contributions
wered+0.126
 outer+0.113
 sunrise+0.112
 view+0.105
 university+0.101
Neg logit contributions
bo-0.158
ave-0.155
o-0.148
 rode-0.148
 cared-0.147
Token.
Token ID13
Feature activation+1.13e-02

Pos logit contributions
wered+0.032
 outer+0.031
 sunrise+0.029
 view+0.027
 sight+0.027
Neg logit contributions
bo-0.052
ave-0.049
 forgot-0.047
o-0.047
 rode-0.045
Token
Token ID198
Feature activation+3.16e-02

Pos logit contributions
wered+0.018
 something+0.017
 sunrise+0.017
 nature+0.017
 outer+0.017
Neg logit contributions
bo-0.024
ave-0.024
ience-0.022
 forgot-0.021
anda-0.021
Token
Token ID198
Feature activation+1.81e-02

Pos logit contributions
wered+0.056
 sunrise+0.055
 outer+0.054
 sight+0.054
 something+0.053
Neg logit contributions
bo-0.060
ave-0.057
o-0.056
anda-0.052
 rode-0.052
TokenBut
Token ID1537
Feature activation+1.20e-02

Pos logit contributions
wered+0.029
 sunrise+0.029
 outer+0.029
 sight+0.028
 nature+0.028
Neg logit contributions
bo-0.040
ave-0.038
o-0.035
 forgot-0.035
 rode-0.035
Token You
Token ID921
Feature activation-2.72e-02

Pos logit contributions
ave+0.002
bo+0.002
ience+0.002
anda+0.002
mo+0.002
Neg logit contributions
 something-0.002
wered-0.002
 nature-0.002
 an-0.002
 sunrise-0.002
Token are
Token ID389
Feature activation+2.20e-02

Pos logit contributions
anna+0.064
anda+0.063
mo+0.063
bo+0.062
oped+0.062
Neg logit contributions
 a-0.038
 something-0.037
 the-0.036
 an-0.035
 nature-0.034
Token children
Token ID1751
Feature activation-2.25e-02

Pos logit contributions
wered+0.030
 sunrise+0.028
 something+0.028
 nature+0.027
 outer+0.027
Neg logit contributions
bo-0.052
ave-0.052
ience-0.049
 forgot-0.048
mo-0.048
Token.
Token ID13
Feature activation-1.36e-03

Pos logit contributions
ave+0.062
bo+0.061
mo+0.059
ience+0.058
oped+0.058
Neg logit contributions
 something-0.023
 nature-0.021
 a-0.020
 the-0.019
 an-0.019
Token You
Token ID921
Feature activation-2.70e-02

Pos logit contributions
ave+0.003
bo+0.003
ience+0.002
anda+0.002
mo+0.002
Neg logit contributions
 something-0.002
 nature-0.002
wered-0.002
 an-0.002
 sunrise-0.002
Token need
Token ID761
Feature activation+8.36e-02

Pos logit contributions
anna+0.062
anda+0.061
mo+0.061
bo+0.061
oped+0.060
Neg logit contributions
 a-0.039
 something-0.039
 an-0.037
 the-0.037
 nature-0.036
Token school
Token ID1524
Feature activation+9.17e-03

Pos logit contributions
wered+0.116
 sunrise+0.101
 outer+0.091
 view+0.090
 sight+0.084
Neg logit contributions
ave-0.198
bo-0.185
 forgot-0.178
olded-0.177
 cared-0.176
Token.
Token ID13
Feature activation-2.22e-03

Pos logit contributions
 something+0.010
wered+0.010
 nature+0.010
 sunrise+0.010
 outer+0.009
Neg logit contributions
bo-0.023
ave-0.023
ience-0.022
mo-0.022
anda-0.022
Token You
Token ID921
Feature activation-2.46e-02

Pos logit contributions
bo+0.004
ave+0.004
ience+0.004
mo+0.004
anda+0.004
Neg logit contributions
 something-0.004
wered-0.004
 nature-0.004
 sunrise-0.004
 outer-0.004
Token need
Token ID761
Feature activation+8.48e-02

Pos logit contributions
mo+0.056
anna+0.056
anda+0.055
bo+0.055
ience+0.054
Neg logit contributions
 a-0.036
 something-0.036
 the-0.034
 an-0.034
 nature-0.033
Token to
Token ID284
Feature activation+3.40e-02

Pos logit contributions
wered+0.117
 sunrise+0.101
 outer+0.091
 view+0.090
 sight+0.085
Neg logit contributions
ave-0.200
bo-0.186
olded-0.179
 forgot-0.179
 cared-0.178
Token from
Token ID422
Feature activation+2.98e-02

Pos logit contributions
ave+0.073
bo+0.072
ience+0.069
mo+0.068
anda+0.067
Neg logit contributions
 something-0.031
 nature-0.029
 a-0.027
 sunrise-0.027
 an-0.026
Token the
Token ID262
Feature activation+3.01e-02

Pos logit contributions
wered+0.022
 sunrise+0.020
 something+0.020
 nature+0.019
 lunch+0.018
Neg logit contributions
bo-0.086
ave-0.083
ience-0.082
mo-0.081
 rode-0.080
Token tank
Token ID6873
Feature activation-2.18e-02

Pos logit contributions
wered+0.039
 something+0.036
 nature+0.035
 sunrise+0.034
 lunch+0.033
Neg logit contributions
bo-0.071
ience-0.069
ave-0.069
mo-0.066
 forgot-0.065
Token.
Token ID13
Feature activation-1.95e-02

Pos logit contributions
bo+0.063
ave+0.062
ience+0.060
mo+0.059
anda+0.059
Neg logit contributions
 something-0.019
 nature-0.017
wered-0.016
 a-0.016
 an-0.016
Token Hel
Token ID5053
Feature activation-5.71e-02

Pos logit contributions
bo+0.045
ave+0.045
mo+0.044
ience+0.043
 forgot+0.043
Neg logit contributions
 something-0.028
 an-0.028
 nature-0.027
 a-0.027
 the-0.027
Tokenio
Token ID952
Feature activation-1.42e-01

Pos logit contributions
ave+0.090
bo+0.087
ience+0.083
 forgot+0.081
oped+0.078
Neg logit contributions
wered-0.114
 something-0.114
 nature-0.111
 an-0.110
 the-0.109
Token was
Token ID373
Feature activation+1.28e-02

Pos logit contributions
trained+0.244
oped+0.231
amboo+0.229
oke+0.222
mo+0.221
Neg logit contributions
 a-0.272
 the-0.254
 an-0.240
 something-0.235
 nature-0.220
Token thankful
Token ID25535
Feature activation+1.61e-02

Pos logit contributions
 something+0.019
 a+0.018
 an+0.018
 the+0.017
 nature+0.017
Neg logit contributions
ave-0.030
ience-0.029
bo-0.029
anda-0.028
lets-0.028
Token and
Token ID290
Feature activation-3.53e-02

Pos logit contributions
wered+0.017
 something+0.017
 nature+0.016
 sunrise+0.015
 outer+0.015
Neg logit contributions
bo-0.043
ave-0.042
ience-0.041
mo-0.040
anda-0.040
Token thanked
Token ID26280
Feature activation+2.63e-02

Pos logit contributions
bo+0.059
mo+0.057
ave+0.056
anda+0.055
osaurus+0.053
Neg logit contributions
 something-0.077
 the-0.076
 a-0.076
 an-0.073
 nature-0.073
Token T
Token ID309
Feature activation-6.04e-02

Pos logit contributions
wered+0.035
 something+0.034
 nature+0.033
 sunrise+0.032
 outer+0.031
Neg logit contributions
bo-0.062
ave-0.060
ience-0.058
mo-0.057
anda-0.056
Token as
Token ID355
Feature activation+4.01e-02

Pos logit contributions
bo+0.013
ave+0.013
ience+0.012
mo+0.012
anda+0.012
Neg logit contributions
 something-0.004
 nature-0.004
wered-0.004
 sunrise-0.003
 an-0.003
Token Hel
Token ID5053
Feature activation-4.44e-02

Pos logit contributions
wered+0.037
 sunrise+0.032
 something+0.032
 nature+0.032
 lunch+0.029
Neg logit contributions
ave-0.109
bo-0.109
ience-0.107
 rode-0.103
 forgot-0.103
Tokenio
Token ID952
Feature activation-1.03e-01

Pos logit contributions
ave+0.062
bo+0.058
 forgot+0.055
o+0.052
anda+0.052
Neg logit contributions
 outer-0.097
 something-0.097
wered-0.097
 nature-0.095
 an-0.093
Token.
Token ID13
Feature activation-1.81e-02

Pos logit contributions
bo+0.134
 forgot+0.131
ience+0.128
oped+0.126
mo+0.126
Neg logit contributions
 a-0.232
 something-0.219
 the-0.214
 an-0.212
 nature-0.194
Token Hel
Token ID5053
Feature activation-5.84e-02

Pos logit contributions
ave+0.043
ience+0.042
bo+0.041
 forgot+0.041
anda+0.040
Neg logit contributions
 something-0.025
 the-0.025
 an-0.025
 a-0.024
 nature-0.024
Tokenio
Token ID952
Feature activation-1.40e-01

Pos logit contributions
ave+0.091
bo+0.086
 forgot+0.084
ience+0.080
oped+0.078
Neg logit contributions
 something-0.115
wered-0.113
 the-0.112
 an-0.111
 nature-0.111
Token explained
Token ID4893
Feature activation+2.27e-02

Pos logit contributions
trained+0.235
oke+0.218
amboo+0.216
oped+0.208
mo+0.205
Neg logit contributions
 a-0.273
 the-0.254
 an-0.241
 something-0.234
 nature-0.216
Token that
Token ID326
Feature activation+2.03e-02

Pos logit contributions
wered+0.022
 sunrise+0.021
 nature+0.020
 something+0.019
 outer+0.019
Neg logit contributions
ave-0.061
bo-0.061
ience-0.060
anda-0.057
mo-0.057
Token their
Token ID511
Feature activation+6.60e-03

Pos logit contributions
 something+0.028
 nature+0.026
wered+0.026
 an+0.025
 sunrise+0.024
Neg logit contributions
ave-0.050
bo-0.050
ience-0.047
anda-0.046
mo-0.046
Token oxygen
Token ID11863
Feature activation-4.52e-02

Pos logit contributions
 something+0.009
 nature+0.008
wered+0.008
 sunrise+0.008
 sight+0.008
Neg logit contributions
bo-0.016
ave-0.016
ience-0.015
mo-0.015
oped-0.015
Token was
Token ID373
Feature activation+3.37e-02

Pos logit contributions
ave+0.081
bo+0.081
mo+0.074
anna+0.074
ience+0.074
Neg logit contributions
 something-0.090
 nature-0.087
 a-0.083
 the-0.082
 sight-0.081
Token."
Token ID526
Feature activation-2.31e-02

Pos logit contributions
 something+0.016
wered+0.015
 nature+0.015
 sunrise+0.014
 outer+0.014
Neg logit contributions
ave-0.038
bo-0.038
ience-0.036
mo-0.036
anda-0.035
Token
Token ID220
Feature activation-8.88e-03

Pos logit contributions
ave+0.053
bo+0.053
ience+0.051
anda+0.050
mo+0.050
Neg logit contributions
 something-0.034
 nature-0.033
 a-0.032
 the-0.032
 an-0.031
Token
Token ID198
Feature activation-2.84e-03

Pos logit contributions
ave+0.023
bo+0.022
ience+0.022
mo+0.021
 forgot+0.021
Neg logit contributions
 something-0.010
 an-0.010
wered-0.010
 nature-0.010
 a-0.009
Token
Token ID198
Feature activation-9.97e-03

Pos logit contributions
ave+0.007
bo+0.007
ience+0.007
mo+0.007
anda+0.007
Neg logit contributions
 something-0.003
wered-0.003
 nature-0.003
 sunrise-0.003
 an-0.003
TokenThe
Token ID464
Feature activation-4.23e-03

Pos logit contributions
ave+0.026
bo+0.026
ience+0.025
mo+0.025
anda+0.024
Neg logit contributions
wered-0.011
 something-0.011
 nature-0.011
 sunrise-0.010
 outer-0.010
Token princess
Token ID21752
Feature activation-1.30e-01

Pos logit contributions
ave+0.009
bo+0.009
ience+0.009
mo+0.009
lets+0.009
Neg logit contributions
 something-0.006
wered-0.006
 nature-0.006
 sunrise-0.006
 outer-0.006
Token thanked
Token ID26280
Feature activation+3.23e-02

Pos logit contributions
amboo+0.240
 pads+0.240
mo+0.236
ave+0.234
oke+0.232
Neg logit contributions
 the-0.245
 a-0.240
 something-0.223
 an-0.212
 nature-0.209
Token the
Token ID262
Feature activation-2.41e-03

Pos logit contributions
 basement+0.027
wered+0.024
 lunch+0.024
 purpose+0.023
 soil+0.022
Neg logit contributions
bo-0.083
ience-0.079
 rode-0.077
anna-0.077
mo-0.075
Token fairy
Token ID25607
Feature activation-5.75e-02

Pos logit contributions
bo+0.005
ave+0.005
ience+0.005
mo+0.005
anda+0.005
Neg logit contributions
wered-0.004
 something-0.004
 nature-0.004
 sunrise-0.004
 outer-0.003
Token again
Token ID757
Feature activation+7.49e-03

Pos logit contributions
ave+0.157
bo+0.154
 forgot+0.146
oped+0.145
anda+0.144
Neg logit contributions
 something-0.060
 nature-0.056
 a-0.054
 the-0.053
 sunrise-0.052
Token and
Token ID290
Feature activation-3.82e-02

Pos logit contributions
wered+0.010
 something+0.010
 nature+0.010
 sunrise+0.010
 outer+0.009
Neg logit contributions
bo-0.017
ave-0.017
ience-0.017
mo-0.016
anda-0.016
Token him
Token ID683
Feature activation-2.55e-02

Pos logit contributions
 outer+0.076
 basement+0.074
wered+0.071
 sunrise+0.071
 sight+0.070
Neg logit contributions
bo-0.087
ave-0.074
anna-0.074
 rode-0.072
mo-0.071
Token to
Token ID284
Feature activation+1.54e-02

Pos logit contributions
bo+0.071
ave+0.071
ience+0.068
mo+0.066
anda+0.066
Neg logit contributions
 something-0.024
wered-0.023
 nature-0.022
 sunrise-0.021
 an-0.020
Token lock
Token ID5793
Feature activation+3.42e-02

Pos logit contributions
 outer+0.024
 sunrise+0.024
wered+0.024
 sight+0.024
 something+0.023
Neg logit contributions
bo-0.031
ave-0.030
 forgot-0.029
 rode-0.028
anna-0.027
Token it
Token ID340
Feature activation-5.86e-03

Pos logit contributions
wered+0.030
 sunrise+0.026
 nature+0.023
 lunch+0.023
 sight+0.021
Neg logit contributions
bo-0.099
ave-0.095
ience-0.093
mo-0.092
anda-0.090
Token.
Token ID13
Feature activation-9.27e-04

Pos logit contributions
ave+0.015
bo+0.015
ience+0.014
mo+0.014
anda+0.014
Neg logit contributions
wered-0.007
 nature-0.006
 sunrise-0.006
 something-0.006
 outer-0.006
Token Gerry
Token ID42196
Feature activation-1.30e-01

Pos logit contributions
bo+0.002
ave+0.002
ience+0.002
mo+0.002
anda+0.002
Neg logit contributions
wered-0.002
 something-0.002
 nature-0.002
 sunrise-0.002
 outer-0.002
Token was
Token ID373
Feature activation+1.27e-02

Pos logit contributions
oke+0.273
oped+0.253
zel+0.247
ience+0.245
trained+0.243
Neg logit contributions
 a-0.216
 the-0.207
 an-0.195
 something-0.189
 her-0.181
Token excited
Token ID6568
Feature activation+1.14e-02

Pos logit contributions
 something+0.022
wered+0.022
 nature+0.021
 an+0.021
 sunrise+0.021
Neg logit contributions
bo-0.025
ave-0.025
ience-0.024
mo-0.023
anda-0.023
Token and
Token ID290
Feature activation-4.35e-02

Pos logit contributions
wered+0.016
 something+0.015
 nature+0.015
 outer+0.015
 sunrise+0.015
Neg logit contributions
bo-0.026
ave-0.026
ience-0.025
mo-0.024
anda-0.024
Token he
Token ID339
Feature activation-8.81e-02

Pos logit contributions
bo+0.081
mo+0.079
osaurus+0.075
anda+0.074
oke+0.073
Neg logit contributions
 something-0.096
 a-0.096
 the-0.095
 an-0.092
 her-0.084
Token carefully
Token ID7773
Feature activation-3.10e-02

Pos logit contributions
oke+0.183
mo+0.181
bo+0.165
osaurus+0.162
ience+0.157
Neg logit contributions
 a-0.173
 an-0.161
 the-0.153
 something-0.151
 big-0.137
Token said
Token ID531
Feature activation+7.47e-03

Pos logit contributions
bo+0.052
ave+0.051
mo+0.049
anda+0.048
ience+0.048
Neg logit contributions
 something-0.064
 a-0.060
 nature-0.060
 an-0.060
 the-0.058
Token goodbye
Token ID24829
Feature activation-4.04e-03

Pos logit contributions
wered+0.010
 something+0.009
 nature+0.009
 sunrise+0.009
 outer+0.009
Neg logit contributions
bo-0.018
ave-0.018
ience-0.017
mo-0.017
anda-0.017
Token.
Token ID13
Feature activation-1.75e-02

Pos logit contributions
ave+0.011
bo+0.011
ience+0.011
mo+0.011
anda+0.011
Neg logit contributions
 something-0.003
wered-0.003
 nature-0.003
 sunrise-0.003
 an-0.003
Token The
Token ID383
Feature activation-2.03e-03

Pos logit contributions
bo+0.039
ave+0.039
ience+0.038
anda+0.038
mo+0.037
Neg logit contributions
 nature-0.026
 something-0.026
 the-0.025
wered-0.025
 an-0.025
Token p
Token ID279
Feature activation-6.97e-02

Pos logit contributions
ave+0.004
bo+0.004
ience+0.004
mo+0.004
lets+0.004
Neg logit contributions
wered-0.003
 something-0.003
 nature-0.003
 sunrise-0.003
 sight-0.003
Tokenanda
Token ID5282
Feature activation-1.29e-01

Pos logit contributions
ave+0.064
bo+0.064
ience+0.064
 forgot+0.061
ppy+0.061
Neg logit contributions
 nature-0.170
 something-0.168
 sight-0.168
 new-0.167
wered-0.163
Token was
Token ID373
Feature activation+1.62e-02

Pos logit contributions
oke+0.244
azz+0.235
 pads+0.233
oped+0.233
ingo+0.228
Neg logit contributions
 a-0.241
 the-0.238
 something-0.212
 an-0.204
 nature-0.196
Token so
Token ID523
Feature activation+1.27e-02

Pos logit contributions
wered+0.024
 something+0.024
 nature+0.024
 sunrise+0.023
 outer+0.022
Neg logit contributions
bo-0.035
ave-0.035
ience-0.034
mo-0.032
anda-0.032
Token happy
Token ID3772
Feature activation+9.67e-03

Pos logit contributions
 something+0.018
 nature+0.017
 a+0.017
wered+0.017
 an+0.016
Neg logit contributions
ave-0.030
ience-0.029
bo-0.029
lets-0.029
mo-0.029
Token that
Token ID326
Feature activation+1.87e-02

Pos logit contributions
wered+0.010
 something+0.010
 nature+0.010
 sunrise+0.010
 outer+0.009
Neg logit contributions
bo-0.025
ave-0.025
ience-0.024
mo-0.024
anda-0.024
Token someone
Token ID2130
Feature activation-7.72e-02

Pos logit contributions
wered+0.031
 something+0.031
 nature+0.031
 sunrise+0.030
 outer+0.029
Neg logit contributions
bo-0.038
ave-0.038
ience-0.036
mo-0.035
anda-0.035
Token I
Token ID314
Feature activation-5.30e-02

Pos logit contributions
ave+0.032
bo+0.031
mo+0.029
ience+0.029
oped+0.029
Neg logit contributions
 something-0.016
 nature-0.015
wered-0.014
 an-0.014
 sunrise-0.014
Token'm
Token ID1101
Feature activation+1.44e-02

Pos logit contributions
ave+0.106
lets+0.105
 pads+0.104
ience+0.104
osaurus+0.103
Neg logit contributions
 something-0.086
 a-0.084
 an-0.082
 nature-0.077
 the-0.077
Token lost
Token ID2626
Feature activation+3.59e-02

Pos logit contributions
wered+0.024
 something+0.024
 nature+0.024
 sunrise+0.023
 outer+0.022
Neg logit contributions
bo-0.029
ave-0.029
ience-0.027
mo-0.026
anda-0.026
Token."
Token ID526
Feature activation-2.36e-02

Pos logit contributions
wered+0.058
 sunrise+0.049
 outer+0.048
 nature+0.047
 something+0.046
Neg logit contributions
bo-0.080
ave-0.075
ience-0.074
mo-0.073
anda-0.071
Token The
Token ID383
Feature activation+9.07e-03

Pos logit contributions
ave+0.063
ience+0.062
mo+0.060
anda+0.060
bo+0.059
Neg logit contributions
 the-0.028
 something-0.028
 a-0.026
 an-0.025
 nature-0.025
Token prince
Token ID19716
Feature activation-1.29e-01

Pos logit contributions
 something+0.017
wered+0.016
 nature+0.016
 sunrise+0.016
 sight+0.015
Neg logit contributions
ave-0.017
bo-0.017
ience-0.016
mo-0.016
anda-0.016
Token felt
Token ID2936
Feature activation+2.38e-02

Pos logit contributions
 pads+0.243
amboo+0.235
oped+0.229
oke+0.225
ophone+0.220
Neg logit contributions
 a-0.249
 the-0.243
 something-0.232
 an-0.216
 her-0.195
Token so
Token ID523
Feature activation+2.42e-03

Pos logit contributions
wered+0.041
 something+0.040
 nature+0.040
 sunrise+0.039
 outer+0.038
Neg logit contributions
bo-0.047
ave-0.047
ience-0.045
mo-0.043
anda-0.043
Token sorry
Token ID7926
Feature activation+1.16e-03

Pos logit contributions
 something+0.004
wered+0.004
 nature+0.004
 a+0.004
 an+0.004
Neg logit contributions
ave-0.005
lets-0.005
bo-0.005
mo-0.005
ience-0.004
Token for
Token ID329
Feature activation+2.90e-02

Pos logit contributions
 something+0.001
 a+0.001
 an+0.001
 nature+0.001
 the+0.001
Neg logit contributions
ave-0.003
bo-0.003
ience-0.003
mo-0.003
oped-0.003
Token the
Token ID262
Feature activation+1.58e-02

Pos logit contributions
 sunrise+0.027
wered+0.026
 view+0.026
 sight+0.025
 east+0.025
Neg logit contributions
bo-0.068
mo-0.068
ience-0.066
 forgot-0.065
 roared-0.065
Token a
Token ID257
Feature activation+1.29e-02

Pos logit contributions
wered+0.018
 sunrise+0.013
 nature+0.013
 something+0.013
 basement+0.012
Neg logit contributions
bo-0.070
ave-0.069
ience-0.068
 forgot-0.066
 rode-0.066
Token reward
Token ID6721
Feature activation-9.46e-03

Pos logit contributions
wered+0.016
 something+0.016
 nature+0.015
 sunrise+0.015
 outer+0.014
Neg logit contributions
bo-0.031
ave-0.031
ience-0.030
anda-0.029
mo-0.029
Token.
Token ID13
Feature activation-1.64e-02

Pos logit contributions
bo+0.026
ave+0.026
ience+0.025
mo+0.025
anda+0.025
Neg logit contributions
 something-0.008
wered-0.008
 nature-0.008
 sunrise-0.008
 outer-0.007
Token The
Token ID383
Feature activation+4.33e-03

Pos logit contributions
ave+0.035
bo+0.035
anda+0.034
 forgot+0.034
ience+0.034
Neg logit contributions
 something-0.026
 the-0.026
 nature-0.026
 a-0.026
 an-0.025
Token p
Token ID279
Feature activation-6.94e-02

Pos logit contributions
 nature+0.007
 something+0.007
 sunrise+0.007
wered+0.007
 sight+0.006
Neg logit contributions
ave-0.010
bo-0.009
ience-0.009
anna-0.009
oped-0.009
Tokenanda
Token ID5282
Feature activation-1.29e-01

Pos logit contributions
ave+0.061
bo+0.059
 forgot+0.058
ppy+0.056
ience+0.055
Neg logit contributions
 nature-0.173
 sight-0.171
 something-0.170
 view-0.167
 new-0.166
Token refused
Token ID6520
Feature activation+3.98e-02

Pos logit contributions
oke+0.238
amboo+0.231
azz+0.231
mo+0.227
ave+0.221
Neg logit contributions
 a-0.246
 the-0.242
 something-0.219
 an-0.213
 nature-0.203
Token and
Token ID290
Feature activation-2.36e-02

Pos logit contributions
wered+0.045
 something+0.043
 nature+0.042
 sunrise+0.041
 outer+0.040
Neg logit contributions
bo-0.102
ave-0.102
ience-0.098
mo-0.095
anda-0.095
Token said
Token ID531
Feature activation+1.13e-02

Pos logit contributions
ave+0.044
anda+0.044
bo+0.044
mo+0.043
anna+0.041
Neg logit contributions
 the-0.048
 something-0.048
 a-0.046
 an-0.044
 dinner-0.043
Token,
Token ID11
Feature activation+3.43e-03

Pos logit contributions
wered+0.016
 nature+0.015
 something+0.015
 sunrise+0.015
 outer+0.014
Neg logit contributions
bo-0.026
ave-0.026
ience-0.025
mo-0.024
anda-0.024
Token "
Token ID366
Feature activation-2.94e-02

Pos logit contributions
 something+0.004
 an+0.004
 a+0.004
 the+0.004
 nature+0.004
Neg logit contributions
ave-0.009
bo-0.009
anda-0.009
mo-0.008
ience-0.008
Token brown
Token ID7586
Feature activation+7.26e-03

Pos logit contributions
 something+0.011
 nature+0.011
wered+0.011
 sunrise+0.011
 sight+0.011
Neg logit contributions
ave-0.012
bo-0.012
ience-0.011
mo-0.011
oped-0.011
Token man
Token ID582
Feature activation-5.87e-02

Pos logit contributions
wered+0.014
 something+0.014
 nature+0.013
 sunrise+0.013
 outer+0.013
Neg logit contributions
bo-0.013
ave-0.013
ience-0.012
mo-0.012
anda-0.012
Tokene
Token ID68
Feature activation-3.51e-02

Pos logit contributions
ave+0.156
ience+0.148
bo+0.145
 forgot+0.144
oped+0.140
Neg logit contributions
 something-0.067
 a-0.063
 the-0.059
 an-0.058
 nature-0.058
Token."
Token ID526
Feature activation-2.83e-02

Pos logit contributions
ave+0.089
bo+0.087
ience+0.086
 forgot+0.083
mo+0.083
Neg logit contributions
 something-0.042
 a-0.038
 sunrise-0.038
 nature-0.038
 an-0.037
Token The
Token ID383
Feature activation+8.78e-03

Pos logit contributions
ience+0.065
ave+0.062
mo+0.061
bo+0.061
anda+0.060
Neg logit contributions
 the-0.045
 a-0.043
 something-0.043
 lunch-0.041
 dinner-0.040
Token princess
Token ID21752
Feature activation-1.29e-01

Pos logit contributions
 something+0.016
wered+0.015
 nature+0.015
 sunrise+0.015
 lunch+0.015
Neg logit contributions
ave-0.017
bo-0.017
ience-0.016
oped-0.016
mo-0.016
Token was
Token ID373
Feature activation+8.82e-03

Pos logit contributions
trained+0.260
 pads+0.258
mo+0.255
amboo+0.253
oke+0.251
Neg logit contributions
 a-0.232
 the-0.229
 something-0.196
 an-0.193
 her-0.177
Token so
Token ID523
Feature activation+8.74e-03

Pos logit contributions
 something+0.012
 a+0.012
 an+0.012
 the+0.011
wered+0.011
Neg logit contributions
ave-0.022
ience-0.022
bo-0.021
oped-0.020
lets-0.020
Token happy
Token ID3772
Feature activation+1.52e-02

Pos logit contributions
 a+0.011
 the+0.010
 an+0.010
 something+0.010
 nature+0.010
Neg logit contributions
ience-0.023
mo-0.023
ave-0.023
bo-0.022
lets-0.021
Token and
Token ID290
Feature activation-4.55e-02

Pos logit contributions
wered+0.019
 something+0.017
 nature+0.017
 sunrise+0.017
 outer+0.017
Neg logit contributions
bo-0.038
ave-0.038
ience-0.036
anda-0.035
mo-0.035
Token thanked
Token ID26280
Feature activation+2.97e-02

Pos logit contributions
mo+0.091
bo+0.086
ave+0.079
anda+0.078
osaurus+0.078
Neg logit contributions
 the-0.100
 a-0.094
 something-0.089
 nature-0.086
 an-0.086
Token not
Token ID407
Feature activation+3.89e-02

Pos logit contributions
wered+0.022
 nature+0.021
 sunrise+0.021
 something+0.020
 outer+0.020
Neg logit contributions
bo-0.047
ave-0.046
ience-0.045
mo-0.043
 forgot-0.043
Token very
Token ID845
Feature activation+1.23e-02

Pos logit contributions
wered+0.040
 something+0.038
 nature+0.038
 sunrise+0.037
 outer+0.035
Neg logit contributions
bo-0.104
ave-0.103
ience-0.100
mo-0.097
anda-0.096
Token nice
Token ID3621
Feature activation+1.01e-02

Pos logit contributions
 something+0.013
wered+0.013
 nature+0.012
 sunrise+0.012
 outer+0.011
Neg logit contributions
ave-0.033
bo-0.033
ience-0.031
mo-0.031
anda-0.031
Token."
Token ID526
Feature activation-1.48e-02

Pos logit contributions
 something+0.011
 nature+0.011
wered+0.010
 sunrise+0.010
 lunch+0.010
Neg logit contributions
bo-0.026
ave-0.026
ience-0.025
mo-0.025
anda-0.024
Token F
Token ID376
Feature activation-5.88e-02

Pos logit contributions
ave+0.037
bo+0.036
ience+0.036
lets+0.035
mo+0.034
Neg logit contributions
 something-0.020
 an-0.019
 nature-0.018
 a-0.018
 the-0.018
Tokenido
Token ID17305
Feature activation-1.29e-01

Pos logit contributions
ave+0.122
bo+0.120
ience+0.118
lets+0.115
 forgot+0.113
Neg logit contributions
wered-0.092
 nature-0.091
 something-0.090
 an-0.084
 outer-0.084
Token felt
Token ID2936
Feature activation+2.54e-02

Pos logit contributions
mo+0.248
ophone+0.242
 pads+0.241
azz+0.234
lets+0.232
Neg logit contributions
 a-0.249
 the-0.244
 an-0.221
 something-0.218
 dinner-0.173
Token very
Token ID845
Feature activation+5.09e-03

Pos logit contributions
wered+0.045
 nature+0.044
 something+0.043
 sunrise+0.043
 outer+0.042
Neg logit contributions
bo-0.049
ave-0.049
ience-0.046
mo-0.044
 forgot-0.044
Token bad
Token ID2089
Feature activation-8.54e-04

Pos logit contributions
 something+0.009
wered+0.009
 nature+0.009
 an+0.008
 sunrise+0.008
Neg logit contributions
bo-0.010
ave-0.010
lets-0.009
ience-0.009
mo-0.009
Token.
Token ID13
Feature activation-2.75e-02

Pos logit contributions
bo+0.002
ave+0.002
ience+0.002
mo+0.002
anda+0.002
Neg logit contributions
 something-0.001
 nature-0.001
wered-0.001
 sunrise-0.001
 an-0.001
Token He
Token ID679
Feature activation-7.98e-02

Pos logit contributions
ave+0.056
ience+0.055
lets+0.054
bo+0.053
 forgot+0.051
Neg logit contributions
 something-0.048
 nature-0.047
 an-0.047
 the-0.046
 a-0.046
Token wish
Token ID4601
Feature activation-8.30e-03

Pos logit contributions
 something+0.020
wered+0.020
 nature+0.020
 sunrise+0.019
 sight+0.019
Neg logit contributions
ave-0.028
bo-0.028
ience-0.026
mo-0.026
anda-0.026
Token!"
Token ID2474
Feature activation-2.36e-02

Pos logit contributions
bo+0.021
ave+0.021
ience+0.021
mo+0.020
anda+0.020
Neg logit contributions
 something-0.009
wered-0.009
 nature-0.009
 sunrise-0.008
 an-0.008
Token\n
Token ID198
Feature activation+4.58e-04

Pos logit contributions
ave+0.063
ience+0.063
anda+0.062
mo+0.062
bo+0.062
Neg logit contributions
 something-0.027
 the-0.026
 a-0.026
 an-0.025
 nature-0.023
Token\n
Token ID198
Feature activation-8.31e-03

Pos logit contributions
 something+0.001
wered+0.001
 nature+0.001
 sunrise+0.001
 an+0.001
Neg logit contributions
ave-0.001
bo-0.001
ience-0.001
mo-0.001
 forgot-0.001
TokenThe
Token ID464
Feature activation-7.05e-04

Pos logit contributions
ave+0.024
bo+0.023
mo+0.023
ience+0.023
anda+0.023
Neg logit contributions
 something-0.007
 nature-0.007
wered-0.007
 a-0.006
 an-0.006
Token princess
Token ID21752
Feature activation-1.29e-01

Pos logit contributions
ave+0.002
bo+0.001
ience+0.001
lets+0.001
oped+0.001
Neg logit contributions
 something-0.001
wered-0.001
 nature-0.001
 sunrise-0.001
 sight-0.001
Token thought
Token ID1807
Feature activation+1.09e-02

Pos logit contributions
amboo+0.235
trained+0.234
 pads+0.233
oke+0.231
ave+0.231
Neg logit contributions
 a-0.238
 the-0.232
 something-0.221
 an-0.211
 nature-0.197
Token for
Token ID329
Feature activation+1.11e-02

Pos logit contributions
wered+0.012
 something+0.011
 nature+0.011
 sunrise+0.011
 sight+0.011
Neg logit contributions
ave-0.028
bo-0.027
ience-0.027
mo-0.026
anda-0.026
Token a
Token ID257
Feature activation+1.67e-02

Pos logit contributions
wered+0.009
 wheat+0.009
 sunrise+0.009
 sand+0.008
 squash+0.008
Neg logit contributions
bo-0.027
 cared-0.026
ave-0.026
'd-0.026
ience-0.025
Token moment
Token ID2589
Feature activation-4.53e-04

Pos logit contributions
 something+0.023
wered+0.022
 nature+0.022
 sunrise+0.021
 outer+0.021
Neg logit contributions
bo-0.039
ave-0.039
ience-0.037
anda-0.037
mo-0.037
Token,
Token ID11
Feature activation-1.10e-02

Pos logit contributions
ave+0.001
bo+0.001
ience+0.001
mo+0.001
lets+0.001
Neg logit contributions
wered-0.001
 nature-0.001
 something-0.001
 sunrise-0.001
 outer-0.001
Token the
Token ID262
Feature activation+9.30e-03

Pos logit contributions
wered+0.019
 sunrise+0.019
 sight+0.017
 outer+0.017
 nature+0.017
Neg logit contributions
 rode-0.034
bo-0.034
ave-0.033
 forgot-0.032
ience-0.032
Token lovely
Token ID14081
Feature activation+3.85e-03

Pos logit contributions
wered+0.011
 something+0.011
 nature+0.011
 sunrise+0.010
 outer+0.010
Neg logit contributions
bo-0.023
ave-0.023
ience-0.022
mo-0.022
anda-0.021
Token adventure
Token ID8855
Feature activation-2.40e-02

Pos logit contributions
wered+0.006
 something+0.006
 nature+0.006
 sunrise+0.006
 outer+0.005
Neg logit contributions
bo-0.008
ave-0.008
ience-0.008
mo-0.008
anda-0.008
Token.
Token ID13
Feature activation-1.83e-02

Pos logit contributions
ave+0.069
bo+0.068
ience+0.066
mo+0.065
anda+0.065
Neg logit contributions
 something-0.020
 nature-0.019
wered-0.017
 sunrise-0.017
 a-0.017
Token The
Token ID383
Feature activation+1.18e-03

Pos logit contributions
ave+0.040
bo+0.039
ience+0.038
mo+0.038
anda+0.038
Neg logit contributions
 something-0.029
 nature-0.028
 the-0.028
 a-0.028
 an-0.028
Token prince
Token ID19716
Feature activation-1.28e-01

Pos logit contributions
 something+0.002
 nature+0.002
wered+0.002
 sunrise+0.002
 a+0.002
Neg logit contributions
ave-0.002
bo-0.002
mo-0.002
oped-0.002
ience-0.002
Token said
Token ID531
Feature activation+6.12e-03

Pos logit contributions
mo+0.273
ave+0.270
bo+0.267
ksh+0.265
amboo+0.261
Neg logit contributions
 the-0.215
 a-0.205
 something-0.179
 nature-0.174
 an-0.173
Token that
Token ID326
Feature activation+2.17e-02

Pos logit contributions
 something+0.010
wered+0.010
 nature+0.009
 sunrise+0.009
 an+0.009
Neg logit contributions
bo-0.013
ave-0.013
ience-0.012
mo-0.012
anda-0.012
Token anytime
Token ID17949
Feature activation+1.14e-02

Pos logit contributions
 something+0.028
 nature+0.026
wered+0.026
 sunrise+0.025
 an+0.025
Neg logit contributions
ave-0.055
bo-0.054
ience-0.052
anda-0.051
mo-0.051
Token the
Token ID262
Feature activation+1.05e-02

Pos logit contributions
 something+0.014
 nature+0.013
 a+0.012
 the+0.012
 an+0.012
Neg logit contributions
bo-0.030
ave-0.030
mo-0.029
ience-0.028
anda-0.028
Token par
Token ID1582
Feature activation-7.71e-02

Pos logit contributions
 something+0.018
 nature+0.018
wered+0.018
 sunrise+0.017
 sight+0.017
Neg logit contributions
ave-0.021
bo-0.021
ience-0.020
mo-0.020
oped-0.020
TokenOn
Token ID2202
Feature activation+3.95e-02

Pos logit contributions
ave+0.016
bo+0.016
ience+0.015
mo+0.015
anda+0.015
Neg logit contributions
wered-0.009
 something-0.009
 nature-0.009
 sunrise-0.009
 an-0.008
Token Tuesday
Token ID3431
Feature activation-1.08e-02

Pos logit contributions
wered+0.024
 nature+0.022
 sunrise+0.021
 something+0.019
 outer+0.019
Neg logit contributions
bo-0.117
ience-0.110
ave-0.110
mo-0.109
 rode-0.107
Token morning
Token ID3329
Feature activation+2.06e-02

Pos logit contributions
ave+0.030
bo+0.030
ience+0.029
mo+0.028
lets+0.028
Neg logit contributions
 something-0.011
 nature-0.010
 sunrise-0.010
wered-0.010
 lunch-0.009
Token,
Token ID11
Feature activation+1.82e-02

Pos logit contributions
wered+0.028
 nature+0.026
 something+0.025
 outer+0.025
 sunrise+0.024
Neg logit contributions
bo-0.048
ave-0.048
anna-0.045
ience-0.045
 forgot-0.044
Token Mar
Token ID1526
Feature activation-7.76e-02

Pos logit contributions
wered+0.021
 something+0.020
 nature+0.020
 sunrise+0.019
 outer+0.019
Neg logit contributions
bo-0.046
ave-0.046
ience-0.044
mo-0.043
 forgot-0.043
Tokeniah
Token ID9520
Feature activation-1.28e-01

Pos logit contributions
ave+0.110
 forgot+0.099
ience+0.098
bo+0.097
 pads+0.093
Neg logit contributions
 nature-0.174
 outer-0.167
 something-0.167
wered-0.165
 sunrise-0.164
Token woke
Token ID19092
Feature activation+3.30e-02

Pos logit contributions
oke+0.210
ience+0.208
ave+0.207
amboo+0.206
zel+0.203
Neg logit contributions
 the-0.245
 a-0.244
 an-0.223
 something-0.223
 dinner-0.212
Token up
Token ID510
Feature activation+2.51e-02

Pos logit contributions
wered+0.048
 nature+0.044
 basement+0.040
 sunrise+0.039
 outer+0.039
Neg logit contributions
bo-0.073
anna-0.071
ience-0.066
mo-0.066
ave-0.065
Token early
Token ID1903
Feature activation+1.12e-02

Pos logit contributions
wered+0.038
 nature+0.034
 sunrise+0.032
 basement+0.032
 outer+0.032
Neg logit contributions
bo-0.055
ave-0.053
ience-0.051
 rode-0.050
 forgot-0.050
Token.
Token ID13
Feature activation-1.11e-02

Pos logit contributions
wered+0.015
 nature+0.014
 something+0.014
 sunrise+0.013
 outer+0.013
Neg logit contributions
bo-0.026
ave-0.026
ience-0.025
 rode-0.024
anna-0.024
Token She
Token ID1375
Feature activation-7.30e-02

Pos logit contributions
bo+0.021
ave+0.021
ience+0.020
mo+0.019
anda+0.019
Neg logit contributions
wered-0.020
 something-0.020
 nature-0.020
 sunrise-0.019
 outer-0.019
Token snakes
Token ID26042
Feature activation-4.86e-02

Pos logit contributions
wered+0.017
 something+0.016
 outer+0.015
 sunrise+0.015
 nature+0.014
Neg logit contributions
bo-0.032
ave-0.032
ience-0.031
anda-0.031
mo-0.030
Token."
Token ID526
Feature activation-1.30e-02

Pos logit contributions
ave+0.127
bo+0.126
ience+0.119
anda+0.118
oped+0.118
Neg logit contributions
 something-0.054
 nature-0.051
 a-0.048
 sunrise-0.048
 an-0.047
Token And
Token ID843
Feature activation-3.25e-03

Pos logit contributions
ave+0.036
ience+0.035
anda+0.035
mo+0.035
bo+0.035
Neg logit contributions
 something-0.013
 the-0.013
 nature-0.013
 a-0.013
 an-0.013
Token the
Token ID262
Feature activation+4.75e-03

Pos logit contributions
ave+0.010
bo+0.010
ience+0.009
mo+0.009
anda+0.009
Neg logit contributions
 something-0.003
 the-0.003
 a-0.003
 an-0.003
 nature-0.003
Token p
Token ID279
Feature activation-7.09e-02

Pos logit contributions
 something+0.005
wered+0.005
 nature+0.005
 sight+0.004
 sunrise+0.004
Neg logit contributions
ave-0.013
bo-0.013
ience-0.013
lets-0.013
oped-0.013
Tokenanda
Token ID5282
Feature activation-1.28e-01

Pos logit contributions
ave+0.081
ience+0.076
 pads+0.074
bo+0.073
ppy+0.071
Neg logit contributions
 nature-0.163
wered-0.159
 new-0.158
 sight-0.157
 something-0.156
Token waved
Token ID26834
Feature activation+1.21e-02

Pos logit contributions
oped+0.235
ave+0.233
 pads+0.226
mo+0.223
oke+0.221
Neg logit contributions
 the-0.242
 a-0.232
 an-0.207
 something-0.203
 nature-0.198
Token goodbye
Token ID24829
Feature activation+4.41e-04

Pos logit contributions
 something+0.018
wered+0.018
 nature+0.017
 sunrise+0.017
 an+0.016
Neg logit contributions
bo-0.027
ave-0.027
ience-0.026
anda-0.025
mo-0.025
Token with
Token ID351
Feature activation+3.73e-02

Pos logit contributions
 something+0.000
wered+0.000
 nature+0.000
 sunrise+0.000
 sight+0.000
Neg logit contributions
bo-0.001
ave-0.001
ience-0.001
mo-0.001
anda-0.001
Token his
Token ID465
Feature activation+6.05e-03

Pos logit contributions
wered+0.030
 basement+0.029
 sunrise+0.029
 outer+0.028
 laundry+0.028
Neg logit contributions
bo-0.099
 rode-0.093
ave-0.092
 cared-0.091
ez-0.090
Token paw
Token ID41827
Feature activation-2.78e-02

Pos logit contributions
 something+0.006
 nature+0.006
wered+0.006
 sunrise+0.006
 sight+0.006
Neg logit contributions
bo-0.016
ave-0.016
ience-0.015
mo-0.015
anda-0.015
Token a
Token ID257
Feature activation+1.29e-02

Pos logit contributions
wered+0.017
 sunrise+0.015
 nature+0.015
 something+0.014
 outer+0.013
Neg logit contributions
bo-0.064
ave-0.062
ience-0.061
anna-0.059
mo-0.059
Token little
Token ID1310
Feature activation-1.41e-02

Pos logit contributions
 something+0.013
 nature+0.012
 lunch+0.012
 sight+0.012
 dinner+0.012
Neg logit contributions
ave-0.035
bo-0.034
ience-0.033
oped-0.032
olded-0.032
Token while
Token ID981
Feature activation+1.07e-02

Pos logit contributions
ave+0.034
bo+0.033
ience+0.032
oped+0.031
mo+0.031
Neg logit contributions
 something-0.020
 sight-0.018
 nature-0.018
 lunch-0.018
 a-0.018
Token.
Token ID13
Feature activation-2.75e-02

Pos logit contributions
 something+0.010
wered+0.009
 nature+0.009
 sunrise+0.009
 an+0.008
Neg logit contributions
ave-0.030
bo-0.030
ience-0.029
mo-0.028
anda-0.028
Token The
Token ID383
Feature activation-4.21e-04

Pos logit contributions
ave+0.074
ience+0.073
bo+0.072
 forgot+0.069
mo+0.069
Neg logit contributions
 the-0.030
 something-0.029
 a-0.029
 nature-0.028
 an-0.027
Token princess
Token ID21752
Feature activation-1.28e-01

Pos logit contributions
ave+0.001
bo+0.001
oped+0.001
ience+0.001
lets+0.001
Neg logit contributions
 something-0.001
 nature-0.001
 sunrise-0.001
 a-0.001
wered-0.001
Token said
Token ID531
Feature activation-6.53e-04

Pos logit contributions
ksh+0.244
bo+0.234
oke+0.233
mo+0.229
amboo+0.227
Neg logit contributions
 a-0.243
 the-0.240
 something-0.220
 an-0.209
 her-0.199
Token yes
Token ID3763
Feature activation-2.58e-02

Pos logit contributions
bo+0.001
ave+0.001
ience+0.001
mo+0.001
anda+0.001
Neg logit contributions
 something-0.001
wered-0.001
 nature-0.001
 sunrise-0.001
 an-0.001
Token and
Token ID290
Feature activation-3.00e-02

Pos logit contributions
ave+0.076
mo+0.075
bo+0.075
oped+0.074
ience+0.073
Neg logit contributions
 something-0.022
 a-0.020
 the-0.019
 an-0.017
 dinner-0.016
Token gave
Token ID2921
Feature activation+7.54e-03

Pos logit contributions
bo+0.074
ave+0.072
mo+0.072
anda+0.071
jo+0.069
Neg logit contributions
 the-0.043
 a-0.042
 something-0.042
 an-0.038
 dinner-0.037
Token the
Token ID262
Feature activation-1.69e-03

Pos logit contributions
wered+0.004
 something+0.004
 nature+0.004
 sunrise+0.003
 outer+0.003
Neg logit contributions
bo-0.024
ave-0.024
ience-0.023
mo-0.023
anda-0.023
Token's
Token ID338
Feature activation+4.66e-03

Pos logit contributions
oped+0.122
ave+0.117
lets+0.116
jo+0.114
mo+0.113
Neg logit contributions
 a-0.117
 something-0.114
 the-0.109
 an-0.109
 nature-0.101
Token very
Token ID845
Feature activation-1.15e-02

Pos logit contributions
 something+0.006
 nature+0.006
 a+0.006
 an+0.005
 the+0.005
Neg logit contributions
ave-0.012
bo-0.012
oped-0.011
ience-0.011
mo-0.011
Token mild
Token ID11607
Feature activation-1.20e-02

Pos logit contributions
ave+0.030
oped+0.029
bo+0.029
lets+0.028
mo+0.028
Neg logit contributions
 something-0.014
 nature-0.013
 a-0.013
 an-0.012
 the-0.012
Token."
Token ID526
Feature activation-2.33e-02

Pos logit contributions
bo+0.030
ave+0.030
ience+0.029
oped+0.028
mo+0.028
Neg logit contributions
 something-0.015
 nature-0.014
 sunrise-0.013
 sight-0.013
 a-0.013
Token The
Token ID383
Feature activation+1.87e-03

Pos logit contributions
ave+0.067
ience+0.066
bo+0.063
mo+0.063
anda+0.063
Neg logit contributions
 the-0.022
 something-0.022
 a-0.021
 nature-0.021
 an-0.020
Token pig
Token ID12967
Feature activation-1.28e-01

Pos logit contributions
 nature+0.003
 something+0.003
 sunrise+0.003
 a+0.003
 sight+0.003
Neg logit contributions
ave-0.004
bo-0.004
oped-0.004
ophone-0.004
jo-0.004
Token was
Token ID373
Feature activation+1.62e-02

Pos logit contributions
 pads+0.218
oped+0.217
oke+0.213
ophone+0.211
azz+0.209
Neg logit contributions
 a-0.246
 the-0.244
 something-0.223
 an-0.218
 nature-0.194
Token relieved
Token ID22598
Feature activation+2.39e-02

Pos logit contributions
 something+0.026
 a+0.025
 an+0.025
wered+0.024
 the+0.024
Neg logit contributions
ave-0.036
ience-0.035
bo-0.034
lets-0.034
oped-0.033
Token,
Token ID11
Feature activation+4.50e-03

Pos logit contributions
wered+0.033
 something+0.032
 nature+0.032
 sunrise+0.031
 outer+0.031
Neg logit contributions
bo-0.055
ave-0.055
ience-0.053
mo-0.051
anda-0.051
Token he
Token ID339
Feature activation-7.65e-02

Pos logit contributions
 something+0.005
 the+0.005
 nature+0.005
 a+0.005
 an+0.005
Neg logit contributions
ave-0.012
bo-0.012
anda-0.011
ience-0.011
mo-0.011
Token knew
Token ID2993
Feature activation+2.97e-02

Pos logit contributions
mo+0.143
anda+0.135
osaurus+0.133
amboo+0.130
ologist+0.130
Neg logit contributions
 a-0.155
 the-0.154
 something-0.137
 an-0.133
 that-0.123
Token way
Token ID835
Feature activation+1.17e-02

Pos logit contributions
wered+0.038
 sunrise+0.031
 something+0.030
 outer+0.030
 an+0.029
Neg logit contributions
ience-0.070
bo-0.070
ave-0.069
 forgot-0.067
mo-0.064
Token she
Token ID673
Feature activation-5.80e-02

Pos logit contributions
wered+0.015
 sunrise+0.013
 something+0.012
 nature+0.012
 outer+0.012
Neg logit contributions
bo-0.030
ave-0.029
ience-0.028
anna-0.027
anda-0.027
Token could
Token ID714
Feature activation+1.67e-02

Pos logit contributions
amboo+0.085
anda+0.084
bo+0.084
mo+0.083
osaurus+0.083
Neg logit contributions
 a-0.123
 something-0.123
 an-0.120
 the-0.118
 nature-0.117
Token.
Token ID13
Feature activation-1.66e-02

Pos logit contributions
wered+0.024
 something+0.023
 nature+0.023
 sunrise+0.022
 outer+0.022
Neg logit contributions
bo-0.038
ave-0.037
ience-0.035
 forgot-0.034
mo-0.034
Token Mar
Token ID1526
Feature activation-7.67e-02

Pos logit contributions
bo+0.036
ave+0.036
ience+0.034
mo+0.033
anda+0.033
Neg logit contributions
wered-0.025
 something-0.025
 nature-0.024
 sunrise-0.024
 outer-0.023
Tokeniah
Token ID9520
Feature activation-1.28e-01

Pos logit contributions
ave+0.147
bo+0.140
ience+0.136
lets+0.133
 forgot+0.132
Neg logit contributions
 something-0.132
 nature-0.132
wered-0.128
 sunrise-0.125
 an-0.125
Token found
Token ID1043
Feature activation+3.30e-02

Pos logit contributions
oped+0.222
 pads+0.221
azz+0.218
amboo+0.218
oke+0.209
Neg logit contributions
 a-0.243
 the-0.240
 an-0.232
 something-0.229
 dinner-0.209
Token many
Token ID867
Feature activation+3.17e-02

Pos logit contributions
wered+0.021
 sunrise+0.018
 nature+0.018
 outer+0.016
 something+0.015
Neg logit contributions
ave-0.100
bo-0.099
ience-0.097
mo-0.095
anda-0.093
Token ways
Token ID2842
Feature activation+1.29e-02

Pos logit contributions
wered+0.037
 something+0.034
 sunrise+0.033
 nature+0.033
 outer+0.032
Neg logit contributions
ave-0.080
ience-0.079
bo-0.079
 forgot-0.075
mo-0.075
Token to
Token ID284
Feature activation+1.70e-02

Pos logit contributions
wered+0.018
 sunrise+0.016
 nature+0.015
 something+0.015
 outer+0.015
Neg logit contributions
ave-0.031
bo-0.030
ience-0.029
mo-0.029
 forgot-0.028
Token help
Token ID1037
Feature activation+1.88e-02

Pos logit contributions
 sight+0.025
 sunrise+0.025
wered+0.024
 outer+0.024
 something+0.023
Neg logit contributions
bo-0.033
 forgot-0.033
ience-0.032
ave-0.032
 rode-0.032
Token and
Token ID290
Feature activation-1.97e-02

Pos logit contributions
ave+0.083
bo+0.083
ience+0.081
anda+0.079
mo+0.079
Neg logit contributions
 something-0.024
 nature-0.022
wered-0.021
 sunrise-0.021
 an-0.020
Token kiss
Token ID9245
Feature activation+7.36e-03

Pos logit contributions
bo+0.043
ave+0.043
mo+0.041
anda+0.041
ience+0.041
Neg logit contributions
 something-0.032
 a-0.030
 nature-0.030
 an-0.029
 lunch-0.029
Token.
Token ID13
Feature activation-5.29e-03

Pos logit contributions
wered+0.007
 something+0.007
 nature+0.006
 sunrise+0.006
 outer+0.006
Neg logit contributions
bo-0.020
ave-0.020
ience-0.020
mo-0.019
anda-0.019
Token P
Token ID350
Feature activation-6.63e-02

Pos logit contributions
ave+0.011
bo+0.011
ience+0.011
mo+0.011
anda+0.011
Neg logit contributions
 something-0.009
 nature-0.008
wered-0.008
 an-0.008
 sunrise-0.008
Tokenup
Token ID929
Feature activation-9.81e-02

Pos logit contributions
ience+0.109
bo+0.106
ave+0.101
oped+0.095
 roared+0.095
Neg logit contributions
 something-0.131
wered-0.128
 an-0.126
 nature-0.125
 sand-0.122
Tokenil
Token ID346
Feature activation-1.28e-01

Pos logit contributions
ience+0.194
oped+0.192
oke+0.191
 pads+0.188
mo+0.187
Neg logit contributions
 a-0.180
 the-0.175
 something-0.165
 an-0.163
 dinner-0.140
Token played
Token ID2826
Feature activation+2.69e-02

Pos logit contributions
olded+0.204
ksh+0.202
oke+0.201
anda+0.199
amboo+0.199
Neg logit contributions
 the-0.241
 a-0.235
 an-0.214
 something-0.204
 nature-0.200
Token with
Token ID351
Feature activation+4.31e-02

Pos logit contributions
wered+0.038
 sunrise+0.035
 outer+0.035
 nature+0.035
 sight+0.034
Neg logit contributions
ave-0.057
bo-0.056
 rode-0.054
ience-0.053
zel-0.051
Token his
Token ID465
Feature activation+1.20e-03

Pos logit contributions
wered+0.040
 sunrise+0.033
 basement+0.031
 outer+0.031
 nature+0.030
Neg logit contributions
bo-0.118
ave-0.113
ience-0.112
 rode-0.111
mo-0.109
Token new
Token ID649
Feature activation+1.46e-02

Pos logit contributions
 something+0.002
 nature+0.002
 sunrise+0.002
wered+0.002
 sight+0.002
Neg logit contributions
ave-0.003
bo-0.003
ience-0.003
mo-0.003
anna-0.003
Token friend
Token ID1545
Feature activation-2.55e-02

Pos logit contributions
 something+0.024
wered+0.024
 nature+0.023
 sunrise+0.023
 sight+0.022
Neg logit contributions
bo-0.030
ave-0.030
ience-0.029
mo-0.028
anda-0.027
Token will
Token ID481
Feature activation+2.34e-02

Pos logit contributions
ience+0.050
lets+0.047
bo+0.047
anna+0.047
 pads+0.046
Neg logit contributions
 something-0.044
 a-0.043
 an-0.042
 nature-0.041
 the-0.041
Token hurt
Token ID5938
Feature activation+4.11e-02

Pos logit contributions
wered+0.035
 something+0.035
 nature+0.034
 sunrise+0.033
 outer+0.033
Neg logit contributions
bo-0.051
ave-0.050
ience-0.048
 forgot-0.046
mo-0.046
Token you
Token ID345
Feature activation-3.16e-02

Pos logit contributions
wered+0.058
 sunrise+0.053
 outer+0.052
 lunch+0.050
 sight+0.050
Neg logit contributions
bo-0.098
ave-0.088
ience-0.088
mo-0.086
o-0.085
Token!"
Token ID2474
Feature activation-1.38e-02

Pos logit contributions
ave+0.080
bo+0.079
ience+0.076
mo+0.074
anda+0.074
Neg logit contributions
 something-0.038
wered-0.036
 nature-0.036
 sunrise-0.034
 an-0.034
Token The
Token ID383
Feature activation+1.53e-02

Pos logit contributions
ience+0.036
ave+0.035
mo+0.035
anda+0.035
bo+0.034
Neg logit contributions
 something-0.018
 a-0.017
 the-0.017
 an-0.016
 nature-0.015
Token prince
Token ID19716
Feature activation-1.28e-01

Pos logit contributions
 something+0.029
 nature+0.028
wered+0.027
 a+0.027
 sunrise+0.026
Neg logit contributions
ave-0.029
bo-0.029
oped-0.028
ience-0.028
mo-0.027
Token was
Token ID373
Feature activation+1.68e-02

Pos logit contributions
 pads+0.267
amboo+0.243
anna+0.239
trained+0.237
oke+0.234
Neg logit contributions
 a-0.235
 the-0.225
 something-0.210
 an-0.206
 all-0.167
Token scared
Token ID12008
Feature activation+2.41e-02

Pos logit contributions
 something+0.029
 a+0.028
 an+0.028
wered+0.027
 nature+0.027
Neg logit contributions
ave-0.035
ience-0.035
bo-0.034
lets-0.033
mo-0.032
Token but
Token ID475
Feature activation-2.73e-02

Pos logit contributions
wered+0.033
 something+0.032
 nature+0.032
 sunrise+0.031
 outer+0.031
Neg logit contributions
bo-0.056
ave-0.056
ience-0.053
mo-0.052
anda-0.052
Token he
Token ID339
Feature activation-7.87e-02

Pos logit contributions
ave+0.058
lets+0.057
anda+0.057
bo+0.057
mo+0.056
Neg logit contributions
 the-0.050
 something-0.049
 a-0.048
 an-0.045
 nature-0.043
Token was
Token ID373
Feature activation+1.04e-02

Pos logit contributions
 pads+0.143
mo+0.140
jo+0.139
ophone+0.137
anna+0.136
Neg logit contributions
 a-0.160
 something-0.156
 the-0.152
 an-0.150
wered-0.132
Token for
Token ID329
Feature activation+4.13e-02

Pos logit contributions
wered+0.053
 sunrise+0.045
 outer+0.044
 something+0.043
 nature+0.043
Neg logit contributions
ave-0.103
bo-0.100
ience-0.098
 forgot-0.097
lets-0.095
Token his
Token ID465
Feature activation+7.00e-03

Pos logit contributions
wered+0.044
 sunrise+0.040
 nature+0.039
 something+0.039
 outer+0.038
Neg logit contributions
bo-0.104
ience-0.101
ave-0.101
mo-0.098
 forgot-0.097
Token daughter
Token ID4957
Feature activation-4.39e-02

Pos logit contributions
 something+0.009
 nature+0.009
 lunch+0.008
 sight+0.008
 sunrise+0.008
Neg logit contributions
bo-0.018
ave-0.017
ience-0.016
oped-0.016
mo-0.016
Token.
Token ID13
Feature activation-2.27e-02

Pos logit contributions
ave+0.127
bo+0.126
ience+0.124
oped+0.121
anda+0.120
Neg logit contributions
 something-0.040
 a-0.035
 lunch-0.033
 an-0.033
 nature-0.033
Token The
Token ID383
Feature activation+8.69e-03

Pos logit contributions
ave+0.056
ience+0.055
bo+0.054
anda+0.052
mo+0.052
Neg logit contributions
 the-0.030
 something-0.029
 a-0.029
 an-0.028
 nature-0.028
Token prince
Token ID19716
Feature activation-1.28e-01

Pos logit contributions
 something+0.016
 nature+0.015
wered+0.015
 sunrise+0.015
 lunch+0.015
Neg logit contributions
bo-0.017
ave-0.017
oped-0.016
ience-0.016
lets-0.015
Token was
Token ID373
Feature activation+1.64e-02

Pos logit contributions
ksh+0.235
amboo+0.235
 pads+0.233
bo+0.226
mo+0.225
Neg logit contributions
 the-0.238
 a-0.237
 something-0.220
 an-0.208
 dinner-0.186
Token sad
Token ID6507
Feature activation+1.90e-02

Pos logit contributions
 something+0.025
 a+0.024
 an+0.023
 the+0.022
wered+0.022
Neg logit contributions
ave-0.039
ience-0.038
bo-0.037
lets-0.036
oped-0.036
Token and
Token ID290
Feature activation-4.56e-02

Pos logit contributions
 something+0.022
 nature+0.020
wered+0.019
 an+0.019
 a+0.019
Neg logit contributions
bo-0.050
ave-0.050
ience-0.048
mo-0.047
anda-0.047
Token left
Token ID1364
Feature activation+3.43e-02

Pos logit contributions
ave+0.086
lets+0.086
oped+0.083
osaurus+0.083
anda+0.082
Neg logit contributions
 the-0.092
 something-0.090
 a-0.088
 an-0.084
 nature-0.081
Token the
Token ID262
Feature activation+2.28e-02

Pos logit contributions
wered+0.031
 something+0.029
 nature+0.029
 sunrise+0.028
 outer+0.026
Neg logit contributions
bo-0.096
ave-0.096
ience-0.092
mo-0.090
anda-0.089
Token him
Token ID683
Feature activation-6.57e-03

Pos logit contributions
wered+0.046
 sunrise+0.042
 outer+0.042
 something+0.041
 nature+0.041
Neg logit contributions
bo-0.086
ave-0.083
ience-0.079
mo-0.077
 rode-0.077
Token a
Token ID257
Feature activation+1.87e-02

Pos logit contributions
bo+0.021
ave+0.021
ience+0.020
mo+0.020
anda+0.020
Neg logit contributions
wered-0.003
 sunrise-0.003
 nature-0.002
 sight-0.002
 something-0.002
Token big
Token ID1263
Feature activation+2.05e-02

Pos logit contributions
wered+0.022
 nature+0.020
 something+0.020
 sunrise+0.019
 outer+0.018
Neg logit contributions
ave-0.048
bo-0.048
ience-0.046
anda-0.044
 forgot-0.044
Token hug
Token ID16225
Feature activation-1.80e-02

Pos logit contributions
wered+0.034
 something+0.033
 nature+0.032
 sunrise+0.031
 outer+0.031
Neg logit contributions
bo-0.041
ave-0.041
ience-0.040
anda-0.038
mo-0.038
Token.
Token ID13
Feature activation-1.44e-03

Pos logit contributions
bo+0.052
ave+0.052
ience+0.051
anda+0.050
mo+0.050
Neg logit contributions
 something-0.015
 nature-0.013
wered-0.013
 sunrise-0.012
 an-0.012
Token Gerry
Token ID42196
Feature activation-1.28e-01

Pos logit contributions
bo+0.003
ave+0.003
ience+0.003
mo+0.003
anda+0.003
Neg logit contributions
wered-0.002
 something-0.002
 nature-0.002
 sunrise-0.002
 outer-0.002
Token was
Token ID373
Feature activation+1.23e-02

Pos logit contributions
oke+0.268
oped+0.256
mo+0.240
trained+0.238
zel+0.226
Neg logit contributions
 a-0.229
 the-0.219
 an-0.202
 something-0.186
 her-0.184
Token very
Token ID845
Feature activation+8.49e-03

Pos logit contributions
 something+0.019
wered+0.018
 nature+0.017
 an+0.017
 sunrise+0.017
Neg logit contributions
ave-0.028
bo-0.028
ience-0.027
mo-0.026
anda-0.026
Token happy
Token ID3772
Feature activation+1.86e-02

Pos logit contributions
 something+0.013
wered+0.012
 nature+0.012
 an+0.012
 a+0.012
Neg logit contributions
bo-0.019
ave-0.019
ience-0.018
mo-0.018
lets-0.018
Token.
Token ID13
Feature activation-1.41e-02

Pos logit contributions
wered+0.023
 something+0.021
 nature+0.021
 sunrise+0.020
 outer+0.020
Neg logit contributions
bo-0.047
ave-0.046
ience-0.044
mo-0.043
 forgot-0.043
Token<|endoftext|>
Token ID50256
Feature activation-8.16e-03

Pos logit contributions
ave+0.023
bo+0.023
ience+0.023
mo+0.022
anda+0.021
Neg logit contributions
 something-0.029
 nature-0.029
wered-0.028
 sunrise-0.028
 an-0.028
Token had
Token ID550
Feature activation+4.47e-02

Pos logit contributions
ave+0.095
ience+0.093
lets+0.091
oke+0.091
olded+0.090
Neg logit contributions
 a-0.096
 the-0.094
 nature-0.093
 an-0.090
 something-0.090
Token no
Token ID645
Feature activation+5.23e-02

Pos logit contributions
wered+0.029
 sunrise+0.026
 outer+0.025
 nature+0.022
 sight+0.021
Neg logit contributions
bo-0.135
ave-0.134
 rode-0.130
 forgot-0.128
o-0.127
Token friends
Token ID2460
Feature activation-1.57e-02

Pos logit contributions
wered+0.077
 something+0.074
 nature+0.072
 sunrise+0.071
 outer+0.071
Neg logit contributions
bo-0.115
ave-0.114
ience-0.108
anda-0.107
mo-0.107
Token because
Token ID780
Feature activation+2.58e-02

Pos logit contributions
bo+0.047
ience+0.046
ave+0.046
mo+0.044
oped+0.043
Neg logit contributions
 something-0.013
 nature-0.012
 a-0.011
 an-0.010
 the-0.010
Token he
Token ID339
Feature activation-6.02e-02

Pos logit contributions
wered+0.052
 something+0.050
 sight+0.049
 sunrise+0.049
 outer+0.049
Neg logit contributions
bo-0.040
ave-0.037
mo-0.036
 rode-0.036
ience-0.036
Token looked
Token ID3114
Feature activation+3.59e-02

Pos logit contributions
ience+0.123
lets+0.122
anna+0.120
oke+0.117
ave+0.115
Neg logit contributions
 a-0.109
 the-0.106
 nature-0.104
 an-0.101
 something-0.098
Token scary
Token ID14343
Feature activation+1.19e-02

Pos logit contributions
wered+0.058
 sunrise+0.054
 nature+0.052
 outer+0.051
 purpose+0.050
Neg logit contributions
ave-0.071
bo-0.070
 forgot-0.068
ience-0.068
mo-0.066
Token.
Token ID13
Feature activation-8.77e-03

Pos logit contributions
wered+0.011
 something+0.011
 nature+0.011
 sunrise+0.010
 outer+0.010
Neg logit contributions
bo-0.032
ave-0.032
ience-0.031
mo-0.030
anda-0.030
Token The
Token ID383
Feature activation+1.92e-02

Pos logit contributions
ave+0.020
bo+0.019
ience+0.019
mo+0.018
 forgot+0.018
Neg logit contributions
 something-0.013
wered-0.013
 nature-0.013
 sunrise-0.012
 an-0.012
Token hipp
Token ID18568
Feature activation-8.36e-02

Pos logit contributions
wered+0.027
 something+0.027
 nature+0.027
 sunrise+0.026
 sight+0.025
Neg logit contributions
ave-0.044
bo-0.044
ience-0.042
mo-0.041
anda-0.040
Tokeno
Token ID78
Feature activation-8.84e-02

Pos logit contributions
ience+0.133
anna+0.099
ave+0.098
 forgot+0.090
 pads+0.085
Neg logit contributions
 nature-0.198
 the-0.191
 a-0.188
 something-0.187
 an-0.178
Token to
Token ID284
Feature activation+5.28e-02

Pos logit contributions
bo+0.016
ave+0.015
ience+0.015
mo+0.015
anda+0.015
Neg logit contributions
 something-0.004
 nature-0.004
wered-0.004
 an-0.004
 a-0.003
Token other
Token ID584
Feature activation+2.19e-02

Pos logit contributions
wered+0.057
 sunrise+0.055
 outer+0.049
 nature+0.048
 sight+0.048
Neg logit contributions
ave-0.130
bo-0.128
 rode-0.127
ience-0.126
 forgot-0.124
Token kids
Token ID3988
Feature activation-2.13e-02

Pos logit contributions
 something+0.025
wered+0.025
 nature+0.024
 sunrise+0.023
 an+0.022
Neg logit contributions
bo-0.056
ave-0.056
ience-0.054
mo-0.052
anda-0.052
Token who
Token ID508
Feature activation-4.93e-02

Pos logit contributions
bo+0.061
ave+0.060
ience+0.060
anda+0.058
oped+0.057
Neg logit contributions
 something-0.019
 nature-0.017
 a-0.016
 an-0.016
 the-0.015
Token might
Token ID1244
Feature activation+1.55e-02

Pos logit contributions
bo+0.085
anda+0.083
ave+0.083
anna+0.081
mo+0.081
Neg logit contributions
 the-0.101
 something-0.100
 nature-0.099
 a-0.098
 an-0.095
Token want
Token ID765
Feature activation+6.09e-02

Pos logit contributions
 something+0.023
wered+0.023
 nature+0.022
 sunrise+0.021
 an+0.021
Neg logit contributions
ave-0.035
bo-0.035
ience-0.034
anda-0.033
mo-0.033
Token to
Token ID284
Feature activation+1.50e-02

Pos logit contributions
wered+0.063
 sunrise+0.057
 something+0.057
 nature+0.056
 outer+0.054
Neg logit contributions
bo-0.163
ave-0.163
ience-0.155
mo-0.150
 forgot-0.149
Token play
Token ID711
Feature activation+2.72e-02

Pos logit contributions
wered+0.022
 something+0.021
 nature+0.021
 sunrise+0.020
 outer+0.020
Neg logit contributions
ave-0.034
bo-0.034
ience-0.032
mo-0.032
anda-0.031
Token with
Token ID351
Feature activation+4.35e-02

Pos logit contributions
wered+0.032
 sunrise+0.029
 nature+0.028
 sight+0.027
 something+0.027
Neg logit contributions
bo-0.069
ave-0.068
ience-0.066
 rode-0.063
 forgot-0.063
Token him
Token ID683
Feature activation-2.17e-02

Pos logit contributions
wered+0.053
 sunrise+0.049
 nature+0.047
 something+0.047
 outer+0.046
Neg logit contributions
bo-0.108
ave-0.106
ience-0.104
mo-0.100
 rode-0.099
Token.
Token ID13
Feature activation-1.89e-03

Pos logit contributions
bo+0.066
ave+0.066
anda+0.063
ience+0.063
mo+0.063
Neg logit contributions
 something-0.016
 a-0.014
 nature-0.013
 an-0.013
 the-0.013
Token of
Token ID286
Feature activation+4.79e-02

Pos logit contributions
wered+0.004
 something+0.004
 nature+0.003
 sunrise+0.003
 outer+0.003
Neg logit contributions
bo-0.010
ave-0.010
ience-0.009
mo-0.009
anda-0.009
Token the
Token ID262
Feature activation+2.71e-02

Pos logit contributions
wered+0.055
uttering+0.049
 sunrise+0.048
 soil+0.044
 outer+0.044
Neg logit contributions
bo-0.095
 rode-0.092
o-0.088
 drove-0.085
 argued-0.083
Token park
Token ID3952
Feature activation-3.36e-03

Pos logit contributions
wered+0.044
 something+0.038
 an+0.036
 outer+0.036
 nature+0.032
Neg logit contributions
bo-0.055
 rode-0.053
anda-0.052
ience-0.049
 forgot-0.048
Token.
Token ID13
Feature activation-9.32e-03

Pos logit contributions
bo+0.010
ave+0.010
ience+0.010
mo+0.009
anda+0.009
Neg logit contributions
 something-0.003
 nature-0.002
wered-0.002
 sight-0.002
 lunch-0.002
Token She
Token ID1375
Feature activation-6.28e-02

Pos logit contributions
bo+0.020
ave+0.020
ience+0.019
mo+0.018
anda+0.018
Neg logit contributions
wered-0.015
 something-0.015
 nature-0.014
 sunrise-0.014
 outer-0.014
Token found
Token ID1043
Feature activation+3.52e-02

Pos logit contributions
anna+0.124
lets+0.124
ience+0.124
osaurus+0.121
mo+0.121
Neg logit contributions
 a-0.117
 the-0.116
 something-0.114
 an-0.111
 school-0.094
Token a
Token ID257
Feature activation+3.96e-02

Pos logit contributions
wered+0.026
 sunrise+0.025
 outer+0.024
 nature+0.024
 sight+0.020
Neg logit contributions
bo-0.102
ave-0.102
mo-0.096
anda-0.095
o-0.095
Token nice
Token ID3621
Feature activation+3.37e-02

Pos logit contributions
wered+0.053
 nature+0.046
 something+0.046
 outer+0.045
 sunrise+0.044
Neg logit contributions
bo-0.093
ave-0.091
anda-0.089
ience-0.089
 forgot-0.088
Token lady
Token ID10846
Feature activation-4.68e-02

Pos logit contributions
wered+0.055
 outer+0.048
 something+0.044
 nature+0.043
 sunrise+0.043
Neg logit contributions
bo-0.067
anda-0.065
 forgot-0.065
 rode-0.064
ave-0.064
Token who
Token ID508
Feature activation-6.11e-02

Pos logit contributions
ave+0.107
bo+0.105
ience+0.102
mo+0.097
 forgot+0.097
Neg logit contributions
 something-0.068
 nature-0.065
wered-0.064
 a-0.061
 sunrise-0.061
Token had
Token ID550
Feature activation+3.17e-02

Pos logit contributions
bo+0.096
anna+0.094
ave+0.092
mo+0.092
anda+0.090
Neg logit contributions
 the-0.137
 a-0.134
 something-0.132
 nature-0.131
 an-0.128
Token was
Token ID373
Feature activation+1.27e-02

Pos logit contributions
wered+0.016
 something+0.016
 nature+0.015
 sunrise+0.015
 outer+0.015
Neg logit contributions
bo-0.020
ave-0.020
ience-0.019
mo-0.018
anda-0.018
Token wrong
Token ID2642
Feature activation+2.04e-02

Pos logit contributions
wered+0.018
 nature+0.016
 sunrise+0.016
 view+0.015
 outer+0.015
Neg logit contributions
bo-0.029
ave-0.028
ience-0.028
mo-0.027
anda-0.027
Token,
Token ID11
Feature activation+4.00e-03

Pos logit contributions
 something+0.022
 nature+0.021
wered+0.020
 sunrise+0.020
 an+0.019
Neg logit contributions
ave-0.054
bo-0.054
ience-0.052
mo-0.050
anda-0.050
Token but
Token ID475
Feature activation-5.00e-03

Pos logit contributions
 something+0.005
wered+0.005
 nature+0.005
 sunrise+0.005
 outer+0.005
Neg logit contributions
bo-0.009
ave-0.009
ience-0.009
anda-0.009
mo-0.009
Token he
Token ID339
Feature activation-6.86e-02

Pos logit contributions
bo+0.011
ave+0.011
ience+0.011
mo+0.010
anda+0.010
Neg logit contributions
wered-0.007
 something-0.007
 nature-0.007
 sunrise-0.007
 outer-0.007
Token did
Token ID750
Feature activation+3.20e-02

Pos logit contributions
amboo+0.123
 pads+0.122
anna+0.121
ave+0.119
lets+0.118
Neg logit contributions
 something-0.130
 a-0.130
 the-0.126
 an-0.125
 nature-0.110
Token not
Token ID407
Feature activation+1.28e-02

Pos logit contributions
wered+0.041
 sunrise+0.040
 nature+0.039
 outer+0.038
 sight+0.038
Neg logit contributions
bo-0.076
ave-0.073
ience-0.073
 forgot-0.071
 rode-0.069
Token know
Token ID760
Feature activation+3.78e-02

Pos logit contributions
 something+0.019
wered+0.018
 nature+0.018
 sunrise+0.017
 an+0.017
Neg logit contributions
ave-0.029
bo-0.028
ience-0.028
anda-0.027
oped-0.027
Token how
Token ID703
Feature activation+3.03e-02

Pos logit contributions
wered+0.047
 sunrise+0.041
 sight+0.041
 nature+0.040
 lunch+0.040
Neg logit contributions
bo-0.092
mo-0.089
ience-0.089
ave-0.088
anda-0.086
Token to
Token ID284
Feature activation+7.32e-03

Pos logit contributions
wered+0.036
 sunrise+0.030
 nature+0.030
 something+0.030
 lunch+0.029
Neg logit contributions
bo-0.079
ave-0.077
ience-0.075
mo-0.072
 forgot-0.072
Token tell
Token ID1560
Feature activation+2.91e-02

Pos logit contributions
wered+0.011
 something+0.011
 sight+0.011
 outer+0.010
 nature+0.010
Neg logit contributions
bo-0.016
ave-0.015
 forgot-0.014
ience-0.014
 rode-0.014
Token used
Token ID973
Feature activation+5.89e-02

Pos logit contributions
anda+0.044
bo+0.044
ave+0.044
anna+0.043
mo+0.042
Neg logit contributions
 something-0.053
 a-0.052
 the-0.051
 an-0.050
 nature-0.049
Token all
Token ID477
Feature activation+3.18e-02

Pos logit contributions
wered+0.058
 sunrise+0.053
 nature+0.053
 something+0.052
 outer+0.049
Neg logit contributions
bo-0.158
ave-0.156
ience-0.150
lets-0.147
mo-0.147
Token his
Token ID465
Feature activation+4.32e-03

Pos logit contributions
wered+0.032
 something+0.030
 sunrise+0.029
 nature+0.028
 outer+0.027
Neg logit contributions
bo-0.085
ave-0.083
mo-0.079
ience-0.079
o-0.078
Token strength
Token ID4202
Feature activation+4.99e-03

Pos logit contributions
 something+0.006
 nature+0.006
 a+0.006
 an+0.006
 sight+0.006
Neg logit contributions
bo-0.010
ave-0.010
ience-0.009
oped-0.009
mo-0.009
Token to
Token ID284
Feature activation+1.37e-02

Pos logit contributions
 something+0.004
wered+0.004
 nature+0.004
 sunrise+0.004
 an+0.004
Neg logit contributions
bo-0.014
ave-0.014
ience-0.014
mo-0.013
anda-0.013
Token lift
Token ID10303
Feature activation+4.81e-02

Pos logit contributions
wered+0.021
 something+0.021
 sunrise+0.020
 nature+0.020
 sight+0.020
Neg logit contributions
bo-0.029
ave-0.028
ience-0.027
 rode-0.026
 forgot-0.026
Token it
Token ID340
Feature activation-5.23e-03

Pos logit contributions
wered+0.042
 sunrise+0.036
 nature+0.034
 something+0.033
 outer+0.032
Neg logit contributions
bo-0.138
ave-0.132
ience-0.131
mo-0.129
lets-0.128
Token.
Token ID13
Feature activation-9.03e-03

Pos logit contributions
bo+0.014
ave+0.014
ience+0.013
mo+0.013
anda+0.013
Neg logit contributions
wered-0.005
 something-0.005
 nature-0.005
 sunrise-0.005
 outer-0.005
Token With
Token ID2080
Feature activation+4.24e-02

Pos logit contributions
ience+0.020
ave+0.020
bo+0.020
 forgot+0.019
anda+0.019
Neg logit contributions
 something-0.014
 a-0.014
 an-0.013
 nature-0.013
wered-0.013
Token a
Token ID257
Feature activation+1.10e-02

Pos logit contributions
wered+0.033
 something+0.031
 nature+0.030
 sunrise+0.030
 outer+0.027
Neg logit contributions
bo-0.121
ave-0.120
ience-0.118
mo-0.114
 rode-0.113
Token loud
Token ID7812
Feature activation+7.91e-03

Pos logit contributions
 something+0.013
 nature+0.011
 sight+0.011
 sunrise+0.011
wered+0.011
Neg logit contributions
ave-0.029
bo-0.029
mo-0.027
ience-0.027
anda-0.027
Token food
Token ID2057
Feature activation-1.53e-02

Pos logit contributions
wered+0.044
 outer+0.040
 basement+0.036
 sunrise+0.035
 nature+0.034
Neg logit contributions
ave-0.080
bo-0.078
 pads-0.075
ience-0.073
anda-0.073
Token.
Token ID13
Feature activation-1.39e-02

Pos logit contributions
ave+0.039
bo+0.038
ience+0.036
 forgot+0.036
anda+0.036
Neg logit contributions
wered-0.018
 sunrise-0.016
 outer-0.016
 nature-0.016
 something-0.015
Token He
Token ID679
Feature activation-7.11e-02

Pos logit contributions
ience+0.032
ave+0.032
bo+0.031
mo+0.031
anda+0.029
Neg logit contributions
 something-0.021
 nature-0.020
 an-0.020
 a-0.020
wered-0.019
Token asked
Token ID1965
Feature activation+2.09e-02

Pos logit contributions
oke+0.148
ophone+0.128
ience+0.127
oped+0.127
ave+0.126
Neg logit contributions
 a-0.145
 the-0.134
 something-0.134
 an-0.133
 nature-0.126
Token his
Token ID465
Feature activation+2.33e-03

Pos logit contributions
wered+0.025
 sunrise+0.025
 sight+0.023
 something+0.023
 nature+0.023
Neg logit contributions
ave-0.050
bo-0.050
 rode-0.048
ience-0.047
mo-0.046
Token friend
Token ID1545
Feature activation-1.80e-02

Pos logit contributions
 something+0.004
wered+0.004
 nature+0.004
 sunrise+0.004
 lunch+0.003
Neg logit contributions
ave-0.005
bo-0.005
ience-0.005
mo-0.005
oped-0.005
Token,
Token ID11
Feature activation+1.17e-02

Pos logit contributions
bo+0.050
ave+0.049
ience+0.048
oped+0.046
mo+0.046
Neg logit contributions
 something-0.019
 nature-0.017
wered-0.016
 an-0.016
 a-0.016
Token a
Token ID257
Feature activation+3.82e-02

Pos logit contributions
wered+0.009
 sunrise+0.008
 outer+0.008
 nature+0.008
 sight+0.008
Neg logit contributions
bo-0.033
ave-0.032
o-0.032
 forgot-0.031
mo-0.031
Token mouse
Token ID10211
Feature activation-3.20e-02

Pos logit contributions
wered+0.048
 something+0.047
 nature+0.045
 sunrise+0.044
 outer+0.043
Neg logit contributions
bo-0.093
ave-0.093
ience-0.089
mo-0.087
anda-0.086
Token,
Token ID11
Feature activation-1.04e-02

Pos logit contributions
bo+0.058
ave+0.058
ience+0.054
mo+0.052
anda+0.052
Neg logit contributions
wered-0.060
 something-0.060
 nature-0.059
 sunrise-0.057
 outer-0.056
Token if
Token ID611
Feature activation+1.35e-02

Pos logit contributions
ave+0.028
bo+0.028
ience+0.027
mo+0.026
anda+0.026
Neg logit contributions
wered-0.010
 something-0.009
 nature-0.009
 sunrise-0.009
 outer-0.009
Token,
Token ID11
Feature activation+6.69e-03

Pos logit contributions
ave+0.081
bo+0.081
ience+0.078
mo+0.076
anda+0.076
Neg logit contributions
 something-0.020
wered-0.019
 nature-0.018
 sunrise-0.017
 lunch-0.016
Token change
Token ID1487
Feature activation+2.78e-02

Pos logit contributions
 something+0.009
 nature+0.008
wered+0.008
 an+0.008
 lunch+0.008
Neg logit contributions
ave-0.017
bo-0.017
ience-0.016
anda-0.016
oped-0.016
Token him
Token ID683
Feature activation-2.27e-02

Pos logit contributions
 something+0.026
 nature+0.024
wered+0.024
 sunrise+0.022
 outer+0.021
Neg logit contributions
ave-0.078
bo-0.078
ience-0.075
mo-0.073
anda-0.073
Token,
Token ID11
Feature activation-5.80e-03

Pos logit contributions
ave+0.063
bo+0.062
ience+0.059
oped+0.059
mo+0.059
Neg logit contributions
 something-0.023
 a-0.021
 lunch-0.021
 nature-0.021
 dinner-0.020
Token and
Token ID290
Feature activation-1.80e-02

Pos logit contributions
ave+0.015
bo+0.015
ience+0.015
anda+0.014
mo+0.014
Neg logit contributions
 something-0.007
 nature-0.006
 a-0.006
 an-0.006
wered-0.006
Token rock
Token ID3881
Feature activation-1.33e-02

Pos logit contributions
ave+0.044
bo+0.043
anda+0.042
oped+0.042
ience+0.041
Neg logit contributions
 something-0.027
 a-0.024
 the-0.024
 nature-0.023
 an-0.023
Token him
Token ID683
Feature activation-4.58e-02

Pos logit contributions
ave+0.037
bo+0.035
ience+0.035
oped+0.034
 forgot+0.034
Neg logit contributions
 something-0.014
 nature-0.013
 a-0.013
 the-0.012
 an-0.012
Token.
Token ID13
Feature activation-1.66e-02

Pos logit contributions
ave+0.105
oke+0.104
bo+0.102
mo+0.100
ience+0.099
Neg logit contributions
 something-0.070
 a-0.068
 the-0.062
 lunch-0.062
 an-0.062
Token
Token ID198
Feature activation+1.35e-02

Pos logit contributions
ave+0.043
bo+0.043
ience+0.042
mo+0.040
lets+0.040
Neg logit contributions
 something-0.020
wered-0.019
 nature-0.018
 an-0.018
 lunch-0.017
Token
Token ID198
Feature activation+2.53e-03

Pos logit contributions
wered+0.020
 something+0.020
 nature+0.019
 sunrise+0.019
 outer+0.018
Neg logit contributions
bo-0.030
ave-0.029
ience-0.028
anda-0.027
mo-0.027
TokenL
Token ID43
Feature activation-3.89e-02

Pos logit contributions
wered+0.004
 something+0.004
 nature+0.003
 sunrise+0.003
 sight+0.003
Neg logit contributions
bo-0.006
ave-0.006
ience-0.005
mo-0.005
anda-0.005
Token them
Token ID606
Feature activation-2.56e-02

Pos logit contributions
wered+0.032
 sunrise+0.027
 nature+0.026
 outer+0.025
 something+0.024
Neg logit contributions
bo-0.098
ave-0.096
ience-0.093
mo-0.092
anda-0.091
Token.
Token ID13
Feature activation-1.57e-04

Pos logit contributions
bo+0.073
ave+0.073
ience+0.070
mo+0.068
anda+0.068
Neg logit contributions
 something-0.022
wered-0.021
 nature-0.020
 sunrise-0.020
 lunch-0.019
Token They
Token ID1119
Feature activation-6.33e-02

Pos logit contributions
bo+0.000
ave+0.000
ience+0.000
mo+0.000
anda+0.000
Neg logit contributions
wered-0.000
 something-0.000
 nature-0.000
 sunrise-0.000
 outer-0.000
Token closed
Token ID4838
Feature activation+5.12e-02

Pos logit contributions
bo+0.116
ave+0.110
anna+0.109
ience+0.105
jo+0.104
Neg logit contributions
 the-0.125
 a-0.124
wered-0.123
 something-0.121
 an-0.121
Token the
Token ID262
Feature activation+2.21e-02

Pos logit contributions
wered+0.054
 sunrise+0.049
 outer+0.042
 basement+0.042
 university+0.041
Neg logit contributions
bo-0.126
mo-0.123
ience-0.118
lets-0.117
ave-0.116
Token door
Token ID3420
Feature activation+2.20e-03

Pos logit contributions
wered+0.028
 something+0.026
 nature+0.026
 sunrise+0.025
 outer+0.025
Neg logit contributions
bo-0.053
ave-0.052
anda-0.051
ience-0.051
mo-0.049
Token and
Token ID290
Feature activation-4.36e-02

Pos logit contributions
wered+0.003
 something+0.003
 nature+0.003
 sunrise+0.003
 outer+0.003
Neg logit contributions
ave-0.005
bo-0.005
ience-0.005
mo-0.004
anda-0.004
Token locked
Token ID8970
Feature activation+4.03e-02

Pos logit contributions
bo+0.068
ave+0.068
ience+0.063
anda+0.062
mo+0.062
Neg logit contributions
 something-0.095
wered-0.093
 nature-0.091
 an-0.088
 sunrise-0.088
Token it
Token ID340
Feature activation-1.42e-02

Pos logit contributions
wered+0.039
 sunrise+0.033
 nature+0.028
 lunch+0.027
 outer+0.027
Neg logit contributions
bo-0.111
mo-0.106
ience-0.105
ave-0.104
anda-0.104
Token.
Token ID13
Feature activation+2.51e-03

Pos logit contributions
bo+0.039
ave+0.039
ience+0.038
mo+0.037
anda+0.037
Neg logit contributions
 something-0.014
wered-0.013
 nature-0.012
 an-0.012
 a-0.012
Token They
Token ID1119
Feature activation-6.42e-02

Pos logit contributions
wered+0.004
 something+0.004
 nature+0.004
 sunrise+0.004
 outer+0.004
Neg logit contributions
bo-0.005
ave-0.005
ience-0.005
mo-0.005
anda-0.005
Token the
Token ID262
Feature activation+6.73e-03

Pos logit contributions
 outer+0.049
wered+0.047
 basement+0.047
 sunrise+0.047
 east+0.043
Neg logit contributions
 forgot-0.106
bo-0.105
ave-0.104
 cared-0.102
 forg-0.101
Token doll
Token ID3654
Feature activation-5.22e-02

Pos logit contributions
wered+0.012
 something+0.012
 nature+0.012
 sunrise+0.012
 outer+0.011
Neg logit contributions
bo-0.013
ave-0.013
ience-0.012
anda-0.011
mo-0.011
Token in
Token ID287
Feature activation+2.93e-02

Pos logit contributions
bo+0.160
ave+0.159
ience+0.153
oped+0.151
mo+0.150
Neg logit contributions
 something-0.040
 a-0.039
 the-0.036
 lunch-0.033
 an-0.031
Token the
Token ID262
Feature activation+1.50e-02

Pos logit contributions
wered+0.026
 sunrise+0.025
 outer+0.019
 soil+0.019
 wheat+0.019
Neg logit contributions
bo-0.076
ave-0.070
 forgot-0.070
 rode-0.069
lets-0.069
Token street
Token ID4675
Feature activation+6.48e-03

Pos logit contributions
wered+0.020
 something+0.019
 sunrise+0.019
 nature+0.019
 outer+0.018
Neg logit contributions
bo-0.036
ave-0.035
ience-0.033
anda-0.032
mo-0.032
Token.
Token ID13
Feature activation-6.62e-04

Pos logit contributions
 something+0.005
 nature+0.004
wered+0.004
 a+0.004
 the+0.004
Neg logit contributions
bo-0.019
ave-0.019
ience-0.019
mo-0.018
anda-0.018
Token She
Token ID1375
Feature activation-6.69e-02

Pos logit contributions
ave+0.001
bo+0.001
 forgot+0.001
o+0.001
lets+0.001
Neg logit contributions
wered-0.001
 something-0.001
 outer-0.001
 sunrise-0.001
 nature-0.001
Token does
Token ID857
Feature activation+3.49e-02

Pos logit contributions
ave+0.131
bo+0.129
jo+0.128
oped+0.124
anna+0.124
Neg logit contributions
 a-0.124
 the-0.124
 something-0.120
 an-0.118
 nature-0.116
Token not
Token ID407
Feature activation+6.68e-03

Pos logit contributions
 sunrise+0.053
wered+0.053
 nature+0.052
 outer+0.050
 soil+0.049
Neg logit contributions
 forgot-0.069
bo-0.069
ave-0.067
ience-0.066
 rode-0.064
Token see
Token ID766
Feature activation+5.35e-02

Pos logit contributions
 something+0.009
wered+0.009
 nature+0.009
 a+0.008
 sunrise+0.008
Neg logit contributions
ave-0.016
ience-0.016
bo-0.016
anda-0.015
mo-0.015
Token the
Token ID262
Feature activation+9.06e-03

Pos logit contributions
wered+0.044
 sunrise+0.040
 outer+0.038
 nature+0.036
 lunch+0.035
Neg logit contributions
bo-0.150
ave-0.145
mo-0.141
anda-0.140
ience-0.139
Token were
Token ID547
Feature activation+2.99e-02

Pos logit contributions
ave+0.128
ience+0.128
anda+0.125
bo+0.124
mo+0.124
Neg logit contributions
 a-0.122
 something-0.122
 the-0.121
 an-0.118
 nature-0.118
Token playing
Token ID2712
Feature activation+4.71e-02

Pos logit contributions
 something+0.039
wered+0.038
 nature+0.037
 sunrise+0.036
 lunch+0.036
Neg logit contributions
ave-0.072
bo-0.072
ience-0.070
lets-0.067
mo-0.067
Token in
Token ID287
Feature activation+4.89e-02

Pos logit contributions
wered+0.072
 outer+0.068
 sunrise+0.067
 nature+0.064
 sight+0.064
Neg logit contributions
bo-0.095
ave-0.093
 rode-0.091
o-0.089
 fet-0.089
Token the
Token ID262
Feature activation+4.40e-02

Pos logit contributions
wered+0.049
 sunrise+0.041
 something+0.040
 east+0.036
 somewhere+0.035
Neg logit contributions
bo-0.118
 rode-0.116
o-0.114
mo-0.107
jo-0.105
Token park
Token ID3952
Feature activation+1.69e-02

Pos logit contributions
wered+0.054
 something+0.045
 outer+0.042
 sunrise+0.041
 sight+0.041
Neg logit contributions
bo-0.111
o-0.100
 rode-0.100
anda-0.099
mo-0.099
Token.
Token ID13
Feature activation-3.26e-03

Pos logit contributions
wered+0.022
 sunrise+0.021
 outer+0.021
 nature+0.020
 something+0.019
Neg logit contributions
bo-0.039
ave-0.039
o-0.037
lets-0.037
 rode-0.036
Token They
Token ID1119
Feature activation-4.36e-02

Pos logit contributions
ave+0.007
bo+0.007
ience+0.007
anda+0.006
 forgot+0.006
Neg logit contributions
 something-0.005
wered-0.005
 nature-0.005
 an-0.005
 a-0.005
Token saw
Token ID2497
Feature activation+4.54e-02

Pos logit contributions
ience+0.077
bo+0.077
ave+0.075
anda+0.070
anna+0.069
Neg logit contributions
 the-0.090
 a-0.089
 nature-0.087
 something-0.087
 an-0.087
Token a
Token ID257
Feature activation+3.59e-02

Pos logit contributions
wered+0.030
 outer+0.027
 sunrise+0.025
 nature+0.023
 purpose+0.022
Neg logit contributions
bo-0.131
ave-0.130
 rode-0.126
o-0.125
anda-0.123
Token huge
Token ID3236
Feature activation+3.25e-02

Pos logit contributions
wered+0.041
 something+0.036
 nature+0.035
 outer+0.035
 an+0.034
Neg logit contributions
bo-0.091
anda-0.088
ience-0.088
ave-0.086
 forgot-0.086
Token figure
Token ID3785
Feature activation-1.60e-02

Pos logit contributions
wered+0.055
 outer+0.050
 something+0.049
 an+0.047
 nature+0.046
Neg logit contributions
bo-0.063
anda-0.062
ience-0.061
mo-0.058
lets-0.058
Token she
Token ID673
Feature activation-7.10e-02

Pos logit contributions
wered+0.047
 outer+0.044
 nature+0.042
 sunrise+0.041
 purpose+0.041
Neg logit contributions
ience-0.055
 forgot-0.055
bo-0.055
ave-0.053
 rode-0.052
Token finally
Token ID3443
Feature activation-1.28e-02

Pos logit contributions
oke+0.133
amboo+0.129
jo+0.124
lets+0.124
anda+0.124
Neg logit contributions
 something-0.140
 a-0.137
 the-0.128
 an-0.128
 dinner-0.120
Token found
Token ID1043
Feature activation+4.07e-02

Pos logit contributions
bo+0.019
ave+0.018
mo+0.018
anda+0.018
ience+0.017
Neg logit contributions
 something-0.029
wered-0.028
 an-0.027
 a-0.027
 nature-0.027
Token another
Token ID1194
Feature activation+2.04e-02

Pos logit contributions
wered+0.038
 sunrise+0.032
 outer+0.030
 nature+0.029
 basement+0.028
Neg logit contributions
bo-0.107
ave-0.106
mo-0.101
ience-0.099
 forgot-0.098
Token cr
Token ID1067
Feature activation-2.84e-02

Pos logit contributions
wered+0.029
 something+0.026
 nature+0.026
 outer+0.025
 sunrise+0.024
Neg logit contributions
bo-0.047
ave-0.046
ience-0.045
mo-0.044
anda-0.044
Tokenumb
Token ID2178
Feature activation-4.09e-02

Pos logit contributions
ave+0.055
bo+0.054
anda+0.047
oped+0.047
mo+0.047
Neg logit contributions
 something-0.054
 nature-0.054
wered-0.052
 sunrise-0.051
 outer-0.050
Token.
Token ID13
Feature activation-1.59e-02

Pos logit contributions
bo+0.096
ave+0.095
ience+0.091
mo+0.089
anda+0.088
Neg logit contributions
wered-0.057
 something-0.054
 nature-0.054
 sunrise-0.052
 outer-0.051
Token
Token ID198
Feature activation+6.78e-03

Pos logit contributions
bo+0.033
ave+0.033
ience+0.031
mo+0.030
anda+0.030
Neg logit contributions
 something-0.026
wered-0.026
 nature-0.025
 sunrise-0.025
 an-0.024
Token
Token ID198
Feature activation-6.50e-03

Pos logit contributions
wered+0.010
 something+0.010
 nature+0.009
 sunrise+0.009
 outer+0.009
Neg logit contributions
bo-0.015
ave-0.015
ience-0.014
anda-0.014
mo-0.014
TokenOnce
Token ID7454
Feature activation+1.48e-02

Pos logit contributions
bo+0.015
ave+0.015
ience+0.014
mo+0.014
anda+0.014
Neg logit contributions
wered-0.009
 something-0.009
 nature-0.009
 sunrise-0.008
 outer-0.008
Token again
Token ID757
Feature activation+9.83e-03

Pos logit contributions
wered+0.020
 nature+0.018
 sunrise+0.018
 outer+0.018
 sight+0.017
Neg logit contributions
ave-0.035
bo-0.034
ience-0.033
 forgot-0.033
anda-0.032
Token very
Token ID845
Feature activation+2.10e-02

Pos logit contributions
 something+0.036
 a+0.035
 an+0.035
 the+0.033
 nature+0.032
Neg logit contributions
ave-0.071
ience-0.070
bo-0.068
lets-0.068
anda-0.067
Token excited
Token ID6568
Feature activation+1.14e-02

Pos logit contributions
 something+0.030
 a+0.029
 an+0.029
 the+0.029
 nature+0.029
Neg logit contributions
ave-0.049
bo-0.048
lets-0.048
ience-0.048
mo-0.046
Token and
Token ID290
Feature activation-3.64e-02

Pos logit contributions
wered+0.013
 something+0.012
 nature+0.012
 sunrise+0.012
 outer+0.012
Neg logit contributions
bo-0.029
ave-0.029
ience-0.028
mo-0.027
anda-0.027
Token couldn
Token ID3521
Feature activation+1.99e-02

Pos logit contributions
bo+0.071
mo+0.070
ave+0.069
anda+0.067
osaurus+0.067
Neg logit contributions
 the-0.072
 a-0.069
 something-0.068
 an-0.065
 nature-0.063
Tokenâ
Token ID22940
Feature activation-5.20e-02

Pos logit contributions
wered+0.031
 something+0.030
 nature+0.030
 sunrise+0.029
 outer+0.028
Neg logit contributions
bo-0.043
ave-0.043
ience-0.041
mo-0.039
anda-0.039
Token€
Token ID26391
Feature activation-4.31e-02

Pos logit contributions
lets+0.101
anda+0.100
ience+0.099
mo+0.098
ave+0.097
Neg logit contributions
wered-0.094
 nature-0.084
 something-0.084
 an-0.080
 sunrise-0.080
Token™
Token ID8151
Feature activation-3.39e-02

Pos logit contributions
mo+0.083
ave+0.082
ience+0.080
bo+0.078
anda+0.077
Neg logit contributions
 something-0.077
 an-0.075
 a-0.075
 the-0.072
 lunch-0.071
Tokent
Token ID83
Feature activation+1.10e-02

Pos logit contributions
ience+0.056
ave+0.053
bo+0.053
lets+0.051
mo+0.049
Neg logit contributions
 something-0.068
wered-0.065
 an-0.064
 a-0.064
 the-0.064
Token believe
Token ID1975
Feature activation+2.04e-02

Pos logit contributions
wered+0.016
 something+0.016
 nature+0.016
 sunrise+0.016
 outer+0.015
Neg logit contributions
bo-0.024
ave-0.024
ience-0.023
mo-0.022
anda-0.022
Token his
Token ID465
Feature activation+1.00e-03

Pos logit contributions
wered+0.026
 sunrise+0.024
 nature+0.023
 lunch+0.023
 sight+0.023
Neg logit contributions
bo-0.048
mo-0.046
ience-0.046
ave-0.045
anda-0.045
Token luck
Token ID8458
Feature activation-3.00e-02

Pos logit contributions
wered+0.001
 something+0.001
 nature+0.001
 sunrise+0.001
 outer+0.001
Neg logit contributions
bo-0.002
ave-0.002
ience-0.002
anda-0.002
mo-0.002
Token She
Token ID1375
Feature activation-7.33e-02

Pos logit contributions
bo+0.028
ave+0.027
ience+0.026
anda+0.025
mo+0.025
Neg logit contributions
wered-0.028
 something-0.027
 nature-0.027
 sunrise-0.026
 outer-0.026
Token liked
Token ID8288
Feature activation+5.56e-02

Pos logit contributions
mo+0.140
oped+0.136
oke+0.132
jo+0.130
anna+0.128
Neg logit contributions
 a-0.151
 something-0.142
 an-0.138
 the-0.134
 her-0.119
Token her
Token ID607
Feature activation+5.53e-03

Pos logit contributions
wered+0.050
 something+0.045
 nature+0.043
 outer+0.042
 sunrise+0.042
Neg logit contributions
bo-0.155
ave-0.152
ience-0.148
mo-0.145
anda-0.145
Token dress
Token ID6576
Feature activation-3.23e-02

Pos logit contributions
 something+0.007
wered+0.007
 nature+0.007
 sunrise+0.007
 sight+0.006
Neg logit contributions
bo-0.013
ave-0.013
ience-0.013
mo-0.012
anda-0.012
Token.
Token ID13
Feature activation-1.50e-02

Pos logit contributions
ave+0.089
bo+0.088
ience+0.086
mo+0.085
oped+0.084
Neg logit contributions
 something-0.035
 a-0.031
 an-0.029
 nature-0.029
 the-0.028
Token She
Token ID1375
Feature activation-7.52e-02

Pos logit contributions
bo+0.026
ave+0.025
ience+0.023
anda+0.023
lets+0.023
Neg logit contributions
wered-0.030
 something-0.028
 outer-0.028
 nature-0.028
 sunrise-0.028
Token thought
Token ID1807
Feature activation+4.30e-02

Pos logit contributions
mo+0.150
oped+0.144
oke+0.140
jo+0.139
anna+0.138
Neg logit contributions
 a-0.150
 something-0.140
 an-0.137
 the-0.135
 her-0.119
Token it
Token ID340
Feature activation-5.42e-02

Pos logit contributions
wered+0.055
 outer+0.050
 nature+0.049
 sight+0.049
 sunrise+0.048
Neg logit contributions
bo-0.099
ience-0.098
ave-0.098
mo-0.095
anda-0.094
Token was
Token ID373
Feature activation+1.36e-02

Pos logit contributions
anna+0.134
ience+0.130
oke+0.129
anda+0.125
olded+0.123
Neg logit contributions
 a-0.085
 something-0.084
 an-0.076
 the-0.073
 nature-0.064
Token beautiful
Token ID4950
Feature activation+3.77e-03

Pos logit contributions
wered+0.014
 something+0.013
 nature+0.013
 sunrise+0.013
 outer+0.012
Neg logit contributions
bo-0.036
ave-0.036
ience-0.035
anda-0.034
mo-0.034
Token.
Token ID13
Feature activation-1.51e-02

Pos logit contributions
wered+0.004
 something+0.004
 nature+0.004
 sunrise+0.004
 outer+0.003
Neg logit contributions
bo-0.010
ave-0.010
ience-0.010
mo-0.009
anda-0.009
Token,
Token ID11
Feature activation+4.80e-03

Pos logit contributions
bo+0.050
ave+0.049
ience+0.045
anda+0.044
mo+0.043
Neg logit contributions
wered-0.078
 something-0.076
 sunrise-0.074
 nature-0.074
 outer-0.073
Token who
Token ID508
Feature activation-4.40e-02

Pos logit contributions
wered+0.007
 sunrise+0.007
 something+0.007
 sight+0.006
 lunch+0.006
Neg logit contributions
bo-0.011
ave-0.010
mo-0.010
o-0.010
 rode-0.010
Token loved
Token ID6151
Feature activation+5.24e-02

Pos logit contributions
ience+0.079
ave+0.073
lets+0.073
mo+0.072
oke+0.072
Neg logit contributions
 a-0.090
 something-0.089
 nature-0.087
 an-0.085
 the-0.084
Token fashion
Token ID6977
Feature activation-1.08e-02

Pos logit contributions
wered+0.069
 something+0.060
 outer+0.057
 nature+0.054
 sight+0.053
Neg logit contributions
bo-0.120
 rode-0.118
ave-0.115
o-0.114
lets-0.113
Token.
Token ID13
Feature activation-1.32e-02

Pos logit contributions
bo+0.028
ave+0.028
ience+0.027
mo+0.026
anda+0.026
Neg logit contributions
wered-0.012
 something-0.012
 nature-0.011
 sunrise-0.011
 outer-0.010
Token She
Token ID1375
Feature activation-5.95e-02

Pos logit contributions
bo+0.021
ave+0.021
lets+0.019
anna+0.019
anda+0.019
Neg logit contributions
 something-0.027
wered-0.026
 outer-0.025
 sight-0.025
 lunch-0.025
Token had
Token ID550
Feature activation+3.48e-02

Pos logit contributions
ience+0.111
osaurus+0.110
bo+0.107
mo+0.104
ave+0.104
Neg logit contributions
 a-0.114
 the-0.114
 an-0.107
 nature-0.106
 school-0.105
Token lots
Token ID6041
Feature activation+3.34e-02

Pos logit contributions
wered+0.022
 sunrise+0.019
 nature+0.018
 something+0.017
 outer+0.016
Neg logit contributions
bo-0.107
ave-0.106
 rode-0.101
 forgot-0.100
lets-0.099
Token of
Token ID286
Feature activation+4.94e-02

Pos logit contributions
wered+0.038
 something+0.036
 nature+0.035
 sunrise+0.034
 outer+0.033
Neg logit contributions
ave-0.087
bo-0.085
ience-0.082
anda-0.081
mo-0.080
Token clothes
Token ID8242
Feature activation-1.91e-02

Pos logit contributions
wered+0.061
 sunrise+0.056
 something+0.055
 nature+0.053
 outer+0.053
Neg logit contributions
ave-0.120
bo-0.117
anda-0.113
ience-0.111
 rode-0.111
Token and
Token ID290
Feature activation-1.11e-02

Pos logit contributions
bo+0.048
ave+0.048
ience+0.046
anda+0.045
mo+0.045
Neg logit contributions
wered-0.022
 something-0.022
 nature-0.021
 sunrise-0.021
 outer-0.020
Token to
Token ID284
Feature activation+1.69e-02

Pos logit contributions
wered+0.052
 something+0.044
 nature+0.044
 sunrise+0.044
 outer+0.043
Neg logit contributions
bo-0.112
ave-0.108
ience-0.106
 rode-0.104
lets-0.104
Token build
Token ID1382
Feature activation+4.97e-02

Pos logit contributions
wered+0.025
 something+0.025
 nature+0.024
 sight+0.024
 sunrise+0.024
Neg logit contributions
bo-0.036
ave-0.034
ience-0.033
 forgot-0.033
 rode-0.033
Token her
Token ID607
Feature activation-3.48e-03

Pos logit contributions
wered+0.047
 sunrise+0.037
 nature+0.035
 outer+0.034
 sight+0.031
Neg logit contributions
bo-0.132
lets-0.123
 pads-0.123
ience-0.122
 rode-0.121
Token nest
Token ID16343
Feature activation+6.44e-03

Pos logit contributions
ave+0.008
bo+0.008
ience+0.007
mo+0.007
anda+0.007
Neg logit contributions
wered-0.005
 something-0.005
 nature-0.005
 sunrise-0.005
 outer-0.005
Token.
Token ID13
Feature activation-7.55e-03

Pos logit contributions
wered+0.008
 sunrise+0.007
 nature+0.007
 outer+0.007
 something+0.007
Neg logit contributions
bo-0.016
ave-0.015
lets-0.015
anda-0.015
mo-0.015
Token She
Token ID1375
Feature activation-6.60e-02

Pos logit contributions
bo+0.016
ave+0.016
ience+0.015
mo+0.015
anda+0.015
Neg logit contributions
wered-0.012
 something-0.012
 nature-0.012
 sunrise-0.012
 an-0.011
Token was
Token ID373
Feature activation+1.99e-02

Pos logit contributions
anna+0.129
bo+0.123
ave+0.122
jo+0.119
mo+0.118
Neg logit contributions
 a-0.129
 the-0.127
 an-0.118
 something-0.117
 nature-0.104
Token very
Token ID845
Feature activation+1.44e-02

Pos logit contributions
 something+0.030
wered+0.030
 nature+0.029
 an+0.028
 sunrise+0.028
Neg logit contributions
ave-0.044
bo-0.044
ience-0.043
anda-0.041
mo-0.041
Token happy
Token ID3772
Feature activation+1.85e-02

Pos logit contributions
 something+0.022
wered+0.021
 nature+0.021
 an+0.020
 a+0.020
Neg logit contributions
bo-0.032
ave-0.032
ience-0.031
mo-0.030
lets-0.030
Token and
Token ID290
Feature activation-3.07e-02

Pos logit contributions
wered+0.029
 outer+0.026
 nature+0.026
 something+0.026
 sunrise+0.026
Neg logit contributions
ave-0.040
bo-0.040
ience-0.037
lets-0.037
 forgot-0.037
Token grateful
Token ID14066
Feature activation+1.55e-02

Pos logit contributions
mo+0.062
bo+0.059
anda+0.056
ave+0.055
ience+0.055
Neg logit contributions
 something-0.059
 the-0.058
 a-0.058
 an-0.056
 nature-0.055
Token warned
Token ID7728
Feature activation+2.05e-02

Pos logit contributions
ience+0.129
lets+0.112
ave+0.109
oc+0.109
uer+0.109
Neg logit contributions
 a-0.152
 the-0.150
 something-0.148
 an-0.139
 lunch-0.135
Token.
Token ID13
Feature activation+3.30e-03

Pos logit contributions
wered+0.024
 sunrise+0.023
 nature+0.023
 outer+0.023
 something+0.021
Neg logit contributions
bo-0.053
ave-0.052
mo-0.049
anda-0.048
ience-0.048
Token\n
Token ID198
Feature activation+1.33e-02

Pos logit contributions
wered+0.006
 outer+0.006
 nature+0.006
 sunrise+0.005
 something+0.005
Neg logit contributions
bo-0.006
ave-0.006
 forgot-0.006
ience-0.006
oped-0.006
Token\n
Token ID198
Feature activation+2.07e-03

Pos logit contributions
wered+0.025
 sunrise+0.023
 outer+0.023
 something+0.022
 sight+0.022
Neg logit contributions
bo-0.026
ave-0.024
o-0.023
anda-0.023
oped-0.022
TokenL
Token ID43
Feature activation-3.89e-02

Pos logit contributions
wered+0.004
 something+0.004
 nature+0.003
 sunrise+0.003
 outer+0.003
Neg logit contributions
bo-0.004
ave-0.004
ience-0.004
anda-0.004
mo-0.004
Tokenila
Token ID10102
Feature activation-9.18e-02

Pos logit contributions
ave+0.067
bo+0.067
anda+0.062
mo+0.060
ience+0.060
Neg logit contributions
wered-0.077
 outer-0.077
 something-0.076
 sunrise-0.076
 nature-0.076
Token and
Token ID290
Feature activation-1.65e-03

Pos logit contributions
bo+0.237
mo+0.233
jo+0.233
anda+0.230
ophone+0.227
Neg logit contributions
 something-0.101
 a-0.101
 the-0.096
 an-0.091
 nature-0.081
Token Ben
Token ID3932
Feature activation-7.36e-02

Pos logit contributions
bo+0.004
ave+0.003
 forgot+0.003
ience+0.003
 rode+0.003
Neg logit contributions
wered-0.003
 outer-0.003
 nature-0.002
 sunrise-0.002
 something-0.002
Token took
Token ID1718
Feature activation+3.67e-02

Pos logit contributions
bo+0.109
jo+0.104
anda+0.104
mo+0.103
ience+0.103
Neg logit contributions
 something-0.168
 a-0.163
 an-0.158
 the-0.158
wered-0.157
Token a
Token ID257
Feature activation+2.30e-02

Pos logit contributions
wered+0.034
 outer+0.030
 sunrise+0.029
 nature+0.026
 sight+0.023
Neg logit contributions
bo-0.107
ave-0.099
ience-0.095
mo-0.093
 fet-0.093
Token bite
Token ID13197
Feature activation+1.23e-02

Pos logit contributions
wered+0.041
 nature+0.038
 outer+0.036
 something+0.036
 sunrise+0.036
Neg logit contributions
bo-0.046
ave-0.045
ience-0.043
anda-0.042
 forgot-0.041
Token the
Token ID262
Feature activation+1.03e-02

Pos logit contributions
wered+0.036
 outer+0.032
 nature+0.032
 something+0.031
 purpose+0.030
Neg logit contributions
ience-0.055
bo-0.055
 forgot-0.054
 rode-0.054
ave-0.051
Token fox
Token ID21831
Feature activation-1.01e-01

Pos logit contributions
 something+0.018
wered+0.018
 nature+0.017
 sunrise+0.017
 sight+0.017
Neg logit contributions
ave-0.021
bo-0.021
ience-0.020
lets-0.019
oped-0.019
Token left
Token ID1364
Feature activation+2.15e-02

Pos logit contributions
lets+0.183
bo+0.176
ave+0.174
mo+0.171
ophone+0.168
Neg logit contributions
 the-0.203
 a-0.192
 something-0.191
 dinner-0.181
 lunch-0.178
Token,
Token ID11
Feature activation+1.34e-02

Pos logit contributions
wered+0.019
 sunrise+0.017
 something+0.016
 nature+0.016
 outer+0.016
Neg logit contributions
bo-0.060
ave-0.060
ience-0.058
mo-0.057
 forgot-0.056
Token the
Token ID262
Feature activation+9.74e-03

Pos logit contributions
wered+0.013
 something+0.012
 nature+0.012
 sunrise+0.012
 outer+0.012
Neg logit contributions
bo-0.036
ave-0.036
ience-0.035
 forgot-0.034
mo-0.034
Token princess
Token ID21752
Feature activation-1.13e-01

Pos logit contributions
 something+0.016
 nature+0.016
wered+0.016
 sunrise+0.016
 sight+0.015
Neg logit contributions
ave-0.021
bo-0.020
ience-0.019
oped-0.019
lets-0.018
Token noticed
Token ID6810
Feature activation+3.79e-02

Pos logit contributions
bo+0.199
amboo+0.196
ave+0.193
mo+0.189
oke+0.188
Neg logit contributions
 the-0.227
 a-0.215
 something-0.206
 an-0.193
 nature-0.190
Token the
Token ID262
Feature activation+1.72e-02

Pos logit contributions
wered+0.035
 sunrise+0.028
 nature+0.028
 outer+0.028
 lunch+0.026
Neg logit contributions
ave-0.103
bo-0.102
ience-0.101
mo-0.097
 rode-0.097
Token bottle
Token ID9294
Feature activation-4.42e-02

Pos logit contributions
wered+0.026
 something+0.025
 nature+0.024
 outer+0.024
 sunrise+0.023
Neg logit contributions
bo-0.037
ave-0.037
ience-0.036
anda-0.035
mo-0.035
Token of
Token ID286
Feature activation+3.20e-02

Pos logit contributions
bo+0.103
ave+0.103
ience+0.099
anda+0.097
mo+0.097
Neg logit contributions
 something-0.061
wered-0.058
 nature-0.058
 sunrise-0.056
 an-0.054
Token poison
Token ID8764
Feature activation-2.46e-02

Pos logit contributions
wered+0.043
 outer+0.040
 sunrise+0.038
 something+0.037
 sight+0.035
Neg logit contributions
ience-0.070
bo-0.070
ave-0.069
anda-0.068
lets-0.066
Token rope
Token ID17182
Feature activation-2.01e-02

Pos logit contributions
wered+0.016
 outer+0.015
 something+0.014
 nature+0.014
 sunrise+0.014
Neg logit contributions
bo-0.027
lets-0.027
 pads-0.027
ave-0.027
ience-0.027
Token around
Token ID1088
Feature activation+3.47e-02

Pos logit contributions
bo+0.055
ave+0.055
ience+0.053
mo+0.052
anda+0.051
Neg logit contributions
 something-0.019
wered-0.018
 nature-0.018
 an-0.017
 sunrise-0.016
Token the
Token ID262
Feature activation+1.32e-02

Pos logit contributions
wered+0.038
 outer+0.035
 sunrise+0.034
 something+0.034
 nature+0.034
Neg logit contributions
bo-0.088
ave-0.085
mo-0.083
 rode-0.081
ience-0.080
Token tree
Token ID5509
Feature activation-1.34e-02

Pos logit contributions
wered+0.021
 outer+0.019
 something+0.018
 nature+0.018
 sunrise+0.017
Neg logit contributions
bo-0.028
ience-0.027
ave-0.027
anda-0.026
lets-0.026
Token.
Token ID13
Feature activation-3.36e-03

Pos logit contributions
ave+0.039
bo+0.039
ience+0.038
mo+0.037
anda+0.036
Neg logit contributions
 something-0.011
 nature-0.010
wered-0.010
 sunrise-0.009
 an-0.009
Token John
Token ID1757
Feature activation-1.03e-01

Pos logit contributions
bo+0.007
ave+0.006
ience+0.006
mo+0.006
anda+0.006
Neg logit contributions
wered-0.006
 something-0.006
 nature-0.006
 sunrise-0.006
 outer-0.005
Token got
Token ID1392
Feature activation+2.41e-02

Pos logit contributions
oke+0.196
mo+0.195
amboo+0.194
 Driver+0.193
bo+0.189
Neg logit contributions
 the-0.191
 a-0.181
 an-0.161
 something-0.155
 nature-0.144
Token very
Token ID845
Feature activation+1.76e-02

Pos logit contributions
 something+0.022
wered+0.021
 nature+0.020
 sunrise+0.019
 an+0.018
Neg logit contributions
bo-0.068
ave-0.068
ience-0.066
mo-0.064
anda-0.064
Token excited
Token ID6568
Feature activation+1.48e-02

Pos logit contributions
wered+0.032
 something+0.031
 nature+0.031
 sunrise+0.030
 outer+0.030
Neg logit contributions
bo-0.034
ave-0.033
ience-0.032
anda-0.030
mo-0.030
Token and
Token ID290
Feature activation-3.99e-02

Pos logit contributions
wered+0.026
 outer+0.023
 nature+0.022
 sunrise+0.022
 something+0.021
Neg logit contributions
ave-0.030
bo-0.030
anna-0.028
ience-0.028
lets-0.028
Token jumped
Token ID11687
Feature activation+2.55e-02

Pos logit contributions
bo+0.063
mo+0.060
ave+0.057
anda+0.057
ience+0.055
Neg logit contributions
 something-0.091
 a-0.088
 the-0.088
 an-0.087
 nature-0.086
Token met
Token ID1138
Feature activation+2.85e-02

Pos logit contributions
zel+0.174
bo+0.173
osaurus+0.172
amboo+0.171
jo+0.171
Neg logit contributions
 the-0.155
 a-0.151
 something-0.147
 an-0.140
 nature-0.135
Token a
Token ID257
Feature activation+3.39e-02

Pos logit contributions
wered+0.018
 sunrise+0.017
 outer+0.016
 laundry+0.015
 basement+0.014
Neg logit contributions
ave-0.086
bo-0.085
o-0.083
 rode-0.079
lets-0.078
Token rabbit
Token ID22746
Feature activation-3.21e-02

Pos logit contributions
wered+0.044
 something+0.042
 nature+0.040
 sunrise+0.039
 outer+0.039
Neg logit contributions
bo-0.081
ave-0.080
ience-0.078
anda-0.076
 forgot-0.076
Token.
Token ID13
Feature activation-9.75e-03

Pos logit contributions
ave+0.065
bo+0.064
ience+0.061
mo+0.058
anda+0.058
Neg logit contributions
 something-0.055
 nature-0.054
wered-0.054
 sunrise-0.051
 an-0.051
Token The
Token ID383
Feature activation+1.65e-02

Pos logit contributions
ave+0.025
ience+0.024
bo+0.024
mo+0.024
anda+0.023
Neg logit contributions
 nature-0.012
 something-0.012
 the-0.012
 a-0.012
wered-0.012
Token rabbit
Token ID22746
Feature activation-8.78e-02

Pos logit contributions
 nature+0.031
 something+0.030
wered+0.030
 sunrise+0.030
 sight+0.029
Neg logit contributions
ave-0.032
bo-0.032
ience-0.030
mo-0.029
oped-0.029
Token was
Token ID373
Feature activation+3.62e-02

Pos logit contributions
mo+0.200
oped+0.197
ksh+0.194
bo+0.194
ave+0.194
Neg logit contributions
 the-0.156
 a-0.152
 an-0.131
 something-0.131
 nature-0.122
Token very
Token ID845
Feature activation+2.83e-02

Pos logit contributions
 something+0.053
wered+0.051
 nature+0.050
 an+0.050
 a+0.050
Neg logit contributions
ave-0.084
bo-0.083
ience-0.082
mo-0.077
lets-0.077
Token hungry
Token ID14720
Feature activation+1.34e-02

Pos logit contributions
 nature+0.034
 something+0.033
 a+0.032
wered+0.031
 an+0.031
Neg logit contributions
ave-0.072
bo-0.071
ience-0.071
mo-0.069
lets-0.067
Token and
Token ID290
Feature activation-3.08e-02

Pos logit contributions
wered+0.029
 outer+0.027
 sunrise+0.025
 nature+0.023
 basement+0.023
Neg logit contributions
bo-0.022
ave-0.021
 pads-0.021
o-0.021
 forgot-0.021
Token asked
Token ID1965
Feature activation+2.28e-02

Pos logit contributions
ave+0.050
bo+0.049
osaurus+0.048
jo+0.048
mo+0.047
Neg logit contributions
 the-0.075
 a-0.070
 something-0.069
 nature-0.068
 an-0.067
Token his
Token ID465
Feature activation+1.21e-02

Pos logit contributions
wered+0.033
 nature+0.029
 outer+0.028
 sunrise+0.028
 something+0.027
Neg logit contributions
bo-0.104
ave-0.100
ience-0.098
 forgot-0.096
 rode-0.095
Token very
Token ID845
Feature activation+3.77e-03

Pos logit contributions
 something+0.015
 nature+0.014
 an+0.014
wered+0.014
 a+0.014
Neg logit contributions
ave-0.030
bo-0.030
mo-0.029
anda-0.029
ience-0.028
Token first
Token ID717
Feature activation+1.04e-02

Pos logit contributions
wered+0.004
 something+0.004
 nature+0.004
 sunrise+0.004
 outer+0.004
Neg logit contributions
ave-0.009
bo-0.009
ience-0.009
mo-0.009
anda-0.009
Token race
Token ID3234
Feature activation+5.49e-03

Pos logit contributions
 something+0.013
wered+0.013
 nature+0.012
 sunrise+0.012
 sight+0.012
Neg logit contributions
ave-0.026
bo-0.026
ience-0.025
anda-0.024
mo-0.024
Token.
Token ID13
Feature activation-1.01e-02

Pos logit contributions
 something+0.005
wered+0.005
 nature+0.005
 an+0.005
 sunrise+0.005
Neg logit contributions
ave-0.015
bo-0.015
ience-0.015
mo-0.014
anda-0.014
Token Tim
Token ID5045
Feature activation-9.08e-02

Pos logit contributions
bo+0.022
ience+0.021
ave+0.021
anda+0.020
mo+0.020
Neg logit contributions
 something-0.016
wered-0.016
 nature-0.016
 an-0.015
 sunrise-0.015
Tokenmy
Token ID1820
Feature activation-8.51e-02

Pos logit contributions
olded+0.218
uer+0.210
zel+0.207
ience+0.207
cases+0.206
Neg logit contributions
 the-0.120
 a-0.106
 nature-0.101
 school-0.087
 her-0.086
Token was
Token ID373
Feature activation+1.90e-02

Pos logit contributions
mo+0.206
ppy+0.195
bo+0.189
anna+0.186
anda+0.184
Neg logit contributions
 the-0.145
 a-0.129
 an-0.113
 something-0.108
 new-0.105
Token so
Token ID523
Feature activation+2.14e-02

Pos logit contributions
 something+0.024
 a+0.024
 an+0.023
 the+0.023
 nature+0.021
Neg logit contributions
ave-0.049
ience-0.048
bo-0.047
anda-0.046
lets-0.046
Token glad
Token ID9675
Feature activation+3.80e-02

Pos logit contributions
 something+0.027
 a+0.027
 the+0.027
 an+0.026
wered+0.026
Neg logit contributions
ience-0.053
ave-0.052
bo-0.052
mo-0.052
anna-0.051
Token that
Token ID326
Feature activation+3.22e-02

Pos logit contributions
wered+0.057
 nature+0.052
 something+0.052
 sunrise+0.052
 outer+0.052
Neg logit contributions
bo-0.085
ave-0.083
ience-0.080
 forgot-0.077
mo-0.077