TOP ACTIVATIONS
MAX = 0.143

canned and jar red and bottled and vacuum - se aled
the store and buy some bottled water . Lily was happy
sign that says " T oxic Waste ". "
his head . James started actively pointing out the sock and
head and said he " didn 't know ".
y . He is a bel uga whale . He got
little bird was sad , bec as ue he didn 't
everything went off without a hitch . One
inside and sque aked " aunt 's coming for dinner !'
by and imported by and exported by and sold by and
go near something that looks potentially dangerous and suspect , as
The kind fox sympath ized with him and said
birds and squirrel s -- were frozen in place .
He watched it in amaz ement as it went higher and
then shouted , " I ORDER A WH ALE !"
To her amaz ement , the clothes came out
it . To his amaz ement , he could fit inside
to prevent the bad guys ( the toys on the floor
," the bird said . ( Or , if you think
big smile , making a ( p onding ) noise of

MIN ACTIVATIONS
MIN = -0.136

a great he ave - ho , they gained the avocado
worked hard . The grass ho pper was spoiled and didn
ried ly , the grass ho pper rushed over and tried
and picked up a grass ho pper ! Both friends were
an ant and a grass ho pper . The ant was
busy working when the grass ho pper saw a switch next
. It was a grass ho pper ! Tom was so
Gr ass ho pper , let
and saw a huge grass ho pper , who had been
introduced to a smooth grass ho pper called Jim . Ted
The nos y grass ho pper hopped over to Tom
so happy that the grass ho pper had said yes .
in circles with the grass ho pper . They jumped and
and squ ished the grass ho ppers with her fingers to
lady bug and a grass ho pper . One
?" The grass ho pper said , " I
." The grass ho pper sighed and decided to
Then Baby saw some grass ho ppers in the garden .
and pointed at the grass ho pper . He said ,
It is not a grass ho pper or a cricket .

INTERVAL 0.073 - 0.143
CONTAINS 0.030%

it ? You can take turns using it . Or you
's open it and see ," Mom said . She cut
with . It was the best day ever ! <|endoftext|> Once
have to go to work ," said the man .
<|endoftext|> Sam and Ben were best friends . They liked to

INTERVAL 0.003 - 0.073
CONTAINS 35.073%

she had learned today . <|endoftext|> Jim was a fire man
. <|endoftext|> Once upon a time , there was a boy
the coin and they see that it is not shiny anymore
She thanked the boy and put her hat back on .
to start . The opera was very loud and

INTERVAL -0.066 - 0.003
CONTAINS 64.815%

lovely !" M argo smiled and said goodbye .
day , Anna and Ben hear a noise in the box
who was lost . Bob asked the little girl , "
. Tim was very sad . Tim went to
excited to see what was inside ! L ily

INTERVAL -0.136 - -0.066
CONTAINS 0.082%

they decided to play tag instead . They had so much
and eat it . Big mistake ! When he bit down
was lost and didn 't know how to get back home
sad and he didn 't know why .
're welcome , Rh in oc eros ," the unicorn said
Token canned
Token ID32530
Feature activation+8.32e-03

Pos logit contributions
ths+0.007
 skies+0.007
 two+0.006
 next+0.006
 such+0.006
Neg logit contributions
 habit-0.008
oc-0.008
alk-0.008
 agree-0.008
orse-0.008
Token and
Token ID290
Feature activation+4.36e-03

Pos logit contributions
 skies+0.011
ths+0.011
ting+0.011
 two+0.011
 next+0.010
Neg logit contributions
 habit-0.017
oc-0.016
alk-0.016
 agree-0.015
orse-0.015
Token jar
Token ID17379
Feature activation+1.25e-02

Pos logit contributions
ths+0.007
 skies+0.007
 two+0.006
 such+0.006
 next+0.006
Neg logit contributions
oc-0.008
 habit-0.008
alk-0.008
 agree-0.008
orse-0.007
Tokenred
Token ID445
Feature activation+2.19e-02

Pos logit contributions
 skies+0.017
ths+0.017
 next+0.016
 two+0.016
 such+0.016
Neg logit contributions
oc-0.025
 habit-0.025
orse-0.024
alk-0.023
 agree-0.023
Token and
Token ID290
Feature activation+4.23e-03

Pos logit contributions
ths+0.030
 skies+0.030
ting+0.027
 highest+0.027
 two+0.026
Neg logit contributions
 habit-0.044
oc-0.043
 agree-0.041
alk-0.041
orse-0.040
Token bottled
Token ID34123
Feature activation+1.43e-01

Pos logit contributions
ths+0.006
 skies+0.006
 two+0.006
 such+0.006
 next+0.006
Neg logit contributions
 habit-0.008
oc-0.008
alk-0.008
 agree-0.007
orse-0.007
Token and
Token ID290
Feature activation+4.30e-03

Pos logit contributions
ths+0.141
 skies+0.139
ting+0.127
 two+0.123
 such+0.123
Neg logit contributions
 habit-0.339
oc-0.334
alk-0.323
 agree-0.320
orse-0.319
Token vacuum
Token ID17076
Feature activation+5.62e-03

Pos logit contributions
ths+0.006
 skies+0.006
 two+0.006
ting+0.006
 such+0.006
Neg logit contributions
 habit-0.008
oc-0.008
alk-0.008
 agree-0.008
orse-0.008
Token-
Token ID12
Feature activation+5.80e-02

Pos logit contributions
ths+0.006
 skies+0.006
 highest+0.005
There+0.005
 two+0.005
Neg logit contributions
 habit-0.013
oc-0.013
 agree-0.012
alk-0.012
orse-0.012
Tokense
Token ID325
Feature activation+3.81e-02

Pos logit contributions
ths+0.085
 skies+0.084
ting+0.077
 highest+0.077
 two+0.076
Neg logit contributions
 habit-0.110
oc-0.110
 agree-0.104
alk-0.103
orse-0.102
Tokenaled
Token ID3021
Feature activation+2.46e-02

Pos logit contributions
ths+0.055
 skies+0.055
ting+0.053
 two+0.051
 such+0.051
Neg logit contributions
 habit-0.073
oc-0.070
alk-0.069
 agree-0.068
orse-0.067
Token the
Token ID262
Feature activation-3.28e-04

Pos logit contributions
oc+0.003
 habit+0.003
orse+0.003
alk+0.003
 agree+0.003
Neg logit contributions
 skies-0.001
ths-0.001
 such-0.001
 two-0.001
 next-0.001
Token store
Token ID3650
Feature activation-2.12e-02

Pos logit contributions
 habit+0.001
oc+0.001
 agree+0.001
alk+0.001
orse+0.001
Neg logit contributions
ths-0.001
 skies-0.000
ting-0.000
ips-0.000
 such-0.000
Token and
Token ID290
Feature activation-3.47e-03

Pos logit contributions
 habit+0.042
oc+0.042
alk+0.040
 agree+0.039
orse+0.039
Neg logit contributions
ths-0.030
 skies-0.029
ting-0.027
 highest-0.027
 next-0.027
Token buy
Token ID2822
Feature activation-1.67e-02

Pos logit contributions
 habit+0.005
oc+0.005
alk+0.005
 agree+0.005
orse+0.005
Neg logit contributions
ths-0.006
 skies-0.006
 next-0.006
ting-0.006
 two-0.006
Token some
Token ID617
Feature activation-3.17e-02

Pos logit contributions
 habit+0.038
oc+0.036
 agree+0.036
 understand+0.035
alk+0.035
Neg logit contributions
ths-0.019
 skies-0.017
ting-0.016
 highest-0.016
ips-0.015
Token bottled
Token ID34123
Feature activation+1.31e-01

Pos logit contributions
oc+0.068
 habit+0.067
orse+0.064
alk+0.064
 agree+0.062
Neg logit contributions
 skies-0.041
 two-0.040
 next-0.039
 such-0.038
 highest-0.037
Token water
Token ID1660
Feature activation-9.70e-03

Pos logit contributions
 skies+0.177
ths+0.174
ting+0.164
 two+0.163
 next+0.162
Neg logit contributions
 habit-0.265
oc-0.259
 agree-0.249
alk-0.247
orse-0.243
Token.
Token ID13
Feature activation-1.20e-02

Pos logit contributions
 habit+0.019
oc+0.019
orse+0.018
 agree+0.018
alk+0.018
Neg logit contributions
 skies-0.013
ths-0.013
 next-0.013
ting-0.013
 two-0.012
Token Lily
Token ID20037
Feature activation-8.84e-03

Pos logit contributions
 habit+0.021
oc+0.021
 agree+0.020
alk+0.019
orse+0.019
Neg logit contributions
ths-0.020
 skies-0.019
 highest-0.018
ting-0.018
ips-0.018
Token was
Token ID373
Feature activation-3.92e-03

Pos logit contributions
alk+0.015
 habit+0.015
oc+0.015
 policy+0.014
orse+0.014
Neg logit contributions
ths-0.015
 skies-0.014
 two-0.014
 next-0.014
 both-0.014
Token happy
Token ID3772
Feature activation-4.43e-03

Pos logit contributions
 habit+0.007
 agree+0.007
 understand+0.007
orse+0.006
oc+0.006
Neg logit contributions
 skies-0.006
ths-0.006
ting-0.006
 highest-0.006
ips-0.006
Token sign
Token ID1051
Feature activation-4.69e-03

Pos logit contributions
 two+0.001
 highest+0.001
 skies+0.001
 back+0.001
ths+0.001
Neg logit contributions
alk-0.002
 agree-0.002
 habit-0.002
oc-0.002
orse-0.002
Token that
Token ID326
Feature activation+8.38e-03

Pos logit contributions
oc+0.011
 habit+0.011
 agree+0.010
alk+0.010
orse+0.010
Neg logit contributions
ths-0.005
 skies-0.005
ips-0.004
ting-0.004
 highest-0.004
Token says
Token ID1139
Feature activation+3.47e-03

Pos logit contributions
ths+0.014
 skies+0.014
ting+0.013
 highest+0.013
 two+0.013
Neg logit contributions
 habit-0.014
oc-0.014
 agree-0.013
alk-0.013
orse-0.012
Token "
Token ID366
Feature activation+1.34e-03

Pos logit contributions
 skies+0.006
ths+0.006
 two+0.005
ting+0.005
 next+0.005
Neg logit contributions
 habit-0.006
oc-0.006
 agree-0.006
alk-0.006
orse-0.005
TokenT
Token ID51
Feature activation+1.06e-02

Pos logit contributions
ths+0.002
 skies+0.002
 highest+0.002
 such+0.002
ting+0.002
Neg logit contributions
oc-0.002
 habit-0.002
alk-0.002
 agree-0.002
orse-0.002
Tokenoxic
Token ID18047
Feature activation+1.23e-01

Pos logit contributions
 skies+0.019
ths+0.018
ting+0.018
 next+0.018
ips+0.017
Neg logit contributions
 habit-0.017
oc-0.016
 agree-0.016
orse-0.015
alk-0.015
Token Waste
Token ID35304
Feature activation-3.35e-02

Pos logit contributions
ths+0.197
 skies+0.197
ting+0.186
 highest+0.183
 two+0.182
Neg logit contributions
 habit-0.217
oc-0.212
 agree-0.202
alk-0.201
orse-0.199
Token".
Token ID1911
Feature activation+3.00e-03

Pos logit contributions
 habit+0.062
oc+0.061
 agree+0.058
orse+0.058
alk+0.057
Neg logit contributions
 skies-0.051
ths-0.050
ting-0.048
 two-0.048
 next-0.046
Token\n
Token ID198
Feature activation-5.23e-03

Pos logit contributions
ths+0.004
 skies+0.004
ting+0.003
 highest+0.003
 next+0.003
Neg logit contributions
 habit-0.006
oc-0.006
 agree-0.006
alk-0.006
orse-0.006
Token\n
Token ID198
Feature activation-4.15e-03

Pos logit contributions
 habit+0.013
alk+0.011
oc+0.011
 connection+0.011
 policy+0.011
Neg logit contributions
 two-0.005
There-0.005
 next-0.005
 skies-0.005
ths-0.005
Token"
Token ID1
Feature activation+4.21e-02

Pos logit contributions
 habit+0.008
alk+0.008
oc+0.008
orse+0.007
 policy+0.007
Neg logit contributions
ths-0.006
 skies-0.006
There-0.006
 two-0.006
 next-0.006
Token his
Token ID465
Feature activation-8.05e-03

Pos logit contributions
 habit+0.054
oc+0.053
alk+0.050
 agree+0.050
orse+0.049
Neg logit contributions
ths-0.038
 skies-0.037
ting-0.034
 highest-0.034
ips-0.033
Token head
Token ID1182
Feature activation-2.16e-02

Pos logit contributions
 habit+0.015
oc+0.015
 agree+0.014
alk+0.014
orse+0.013
Neg logit contributions
ths-0.012
 skies-0.012
ting-0.011
 highest-0.011
 two-0.011
Token.
Token ID13
Feature activation-7.76e-03

Pos logit contributions
 habit+0.041
oc+0.041
alk+0.039
 agree+0.039
orse+0.037
Neg logit contributions
 skies-0.032
ths-0.031
ting-0.030
 two-0.029
 next-0.029
Token James
Token ID3700
Feature activation+1.44e-02

Pos logit contributions
oc+0.012
 habit+0.012
 agree+0.011
alk+0.011
orse+0.011
Neg logit contributions
ths-0.014
 skies-0.014
ting-0.013
 highest-0.013
ips-0.013
Token started
Token ID2067
Feature activation-8.73e-03

Pos logit contributions
ths+0.026
 skies+0.025
ting+0.024
 two+0.024
 next+0.023
Neg logit contributions
 habit-0.023
oc-0.023
alk-0.022
 agree-0.021
orse-0.021
Token actively
Token ID10630
Feature activation+1.19e-01

Pos logit contributions
 habit+0.019
oc+0.019
alk+0.018
 agree+0.018
orse+0.017
Neg logit contributions
ths-0.011
 skies-0.010
ting-0.010
 highest-0.009
 two-0.009
Token pointing
Token ID10609
Feature activation+9.25e-03

Pos logit contributions
 skies+0.158
ths+0.158
 highest+0.149
ips+0.144
There+0.141
Neg logit contributions
oc-0.242
 habit-0.237
 agree-0.231
orse-0.221
 understand-0.218
Token out
Token ID503
Feature activation-1.40e-02

Pos logit contributions
ths+0.011
 skies+0.011
ips+0.010
 highest+0.010
There+0.010
Neg logit contributions
 habit-0.020
oc-0.019
 agree-0.019
alk-0.018
 understand-0.018
Token the
Token ID262
Feature activation-8.79e-03

Pos logit contributions
oc+0.029
 habit+0.029
 agree+0.027
alk+0.027
 understand+0.026
Neg logit contributions
ths-0.018
 skies-0.017
 highest-0.016
ips-0.016
There-0.015
Token sock
Token ID32263
Feature activation+1.06e-02

Pos logit contributions
oc+0.016
 habit+0.015
alk+0.015
 agree+0.015
orse+0.015
Neg logit contributions
 skies-0.014
ths-0.014
 two-0.013
 highest-0.013
ting-0.013
Token and
Token ID290
Feature activation-5.34e-03

Pos logit contributions
ths+0.014
 skies+0.014
ting+0.012
 highest+0.012
 two+0.012
Neg logit contributions
 habit-0.022
oc-0.022
 agree-0.021
alk-0.021
orse-0.020
Token head
Token ID1182
Feature activation-1.25e-02

Pos logit contributions
 habit+0.004
oc+0.004
 agree+0.004
alk+0.004
orse+0.003
Neg logit contributions
ths-0.004
 skies-0.004
ting-0.003
 highest-0.003
 two-0.003
Token and
Token ID290
Feature activation-9.86e-04

Pos logit contributions
 habit+0.024
oc+0.023
alk+0.022
 agree+0.022
orse+0.022
Neg logit contributions
ths-0.019
 skies-0.018
ting-0.017
 highest-0.017
 two-0.017
Token said
Token ID531
Feature activation+5.29e-03

Pos logit contributions
 habit+0.002
oc+0.002
alk+0.002
 agree+0.002
orse+0.002
Neg logit contributions
ths-0.002
 skies-0.002
ting-0.002
 two-0.001
 next-0.001
Token he
Token ID339
Feature activation+4.38e-03

Pos logit contributions
ths+0.008
 skies+0.008
 two+0.007
 next+0.007
 such+0.007
Neg logit contributions
oc-0.010
 habit-0.010
alk-0.010
 agree-0.009
orse-0.009
Token "
Token ID366
Feature activation+1.35e-03

Pos logit contributions
ths+0.008
 skies+0.008
 two+0.008
 next+0.007
ting+0.007
Neg logit contributions
 habit-0.007
oc-0.007
alk-0.006
 agree-0.006
orse-0.006
Tokendidn
Token ID45168
Feature activation+1.16e-01

Pos logit contributions
 skies+0.002
ths+0.002
ting+0.002
 highest+0.002
ips+0.002
Neg logit contributions
oc-0.003
 habit-0.003
 agree-0.003
orse-0.003
alk-0.003
Token't
Token ID470
Feature activation+1.81e-03

Pos logit contributions
ths+0.107
 skies+0.104
ting+0.091
 highest+0.090
 such+0.086
Neg logit contributions
 habit-0.285
oc-0.282
alk-0.271
 agree-0.271
orse-0.268
Token know
Token ID760
Feature activation-6.79e-02

Pos logit contributions
 two+0.003
 both+0.003
 next+0.003
 back+0.003
 such+0.003
Neg logit contributions
oc-0.003
 habit-0.003
alk-0.002
orse-0.002
onest-0.002
Token".
Token ID1911
Feature activation+4.10e-03

Pos logit contributions
 habit+0.118
oc+0.117
orse+0.110
 agree+0.110
alk+0.108
Neg logit contributions
ths-0.109
 skies-0.108
 highest-0.101
ips-0.099
There-0.097
Token
Token ID220
Feature activation+2.30e-03

Pos logit contributions
ths+0.007
 skies+0.007
ting+0.006
 highest+0.006
ips+0.006
Neg logit contributions
 habit-0.007
oc-0.007
 agree-0.007
alk-0.007
orse-0.007
Token\n
Token ID198
Feature activation-4.96e-03

Pos logit contributions
ths+0.003
 skies+0.003
 two+0.003
 next+0.003
ips+0.003
Neg logit contributions
 habit-0.005
oc-0.005
alk-0.004
 agree-0.004
orse-0.004
Tokeny
Token ID88
Feature activation-3.67e-02

Pos logit contributions
 next+0.017
ths+0.016
 two+0.016
 such+0.015
 skies+0.015
Neg logit contributions
 habit-0.025
alk-0.024
oc-0.023
 agree-0.023
 understand-0.021
Token.
Token ID13
Feature activation-6.73e-03

Pos logit contributions
 habit+0.066
alk+0.062
oc+0.061
orse+0.060
 agree+0.059
Neg logit contributions
 next-0.058
 two-0.057
 skies-0.055
 back-0.055
 such-0.054
Token He
Token ID679
Feature activation-2.11e-03

Pos logit contributions
 habit+0.013
oc+0.013
 agree+0.013
alk+0.013
orse+0.013
Neg logit contributions
ths-0.009
 skies-0.009
ting-0.008
 highest-0.008
 two-0.008
Token is
Token ID318
Feature activation+1.48e-02

Pos logit contributions
 habit+0.004
alk+0.004
oc+0.004
 agree+0.004
orse+0.004
Neg logit contributions
 skies-0.003
 two-0.003
ths-0.003
 such-0.003
 next-0.003
Token a
Token ID257
Feature activation-4.76e-03

Pos logit contributions
ths+0.019
 skies+0.017
ips+0.016
ting+0.016
There+0.015
Neg logit contributions
 habit-0.032
oc-0.031
orse-0.031
 agree-0.030
alk-0.030
Token bel
Token ID894
Feature activation+1.14e-01

Pos logit contributions
 habit+0.009
oc+0.009
alk+0.009
 agree+0.008
orse+0.008
Neg logit contributions
ths-0.007
 skies-0.007
 highest-0.007
 next-0.007
 two-0.006
Tokenuga
Token ID30302
Feature activation-4.11e-02

Pos logit contributions
ths+0.172
 skies+0.169
ting+0.155
 highest+0.155
 two+0.152
Neg logit contributions
 habit-0.216
oc-0.215
alk-0.203
 agree-0.203
orse-0.199
Token whale
Token ID22206
Feature activation-1.11e-02

Pos logit contributions
 habit+0.062
oc+0.059
 agree+0.057
alk+0.057
orse+0.054
Neg logit contributions
 skies-0.077
ths-0.077
ting-0.073
 next-0.072
 two-0.072
Token.
Token ID13
Feature activation-7.41e-03

Pos logit contributions
 habit+0.022
oc+0.021
alk+0.020
 agree+0.020
orse+0.019
Neg logit contributions
 skies-0.016
ths-0.016
 next-0.015
ting-0.015
 two-0.015
Token He
Token ID679
Feature activation-3.54e-03

Pos logit contributions
oc+0.013
 habit+0.013
 agree+0.012
alk+0.012
orse+0.012
Neg logit contributions
ths-0.012
 skies-0.012
ting-0.011
 highest-0.011
ips-0.011
Token got
Token ID1392
Feature activation+1.73e-02

Pos logit contributions
 habit+0.007
alk+0.006
oc+0.006
 agree+0.006
orse+0.006
Neg logit contributions
 skies-0.005
ths-0.005
 two-0.005
 next-0.005
 highest-0.005
Token little
Token ID1310
Feature activation+6.45e-03

Pos logit contributions
ths+0.011
ips+0.011
ting+0.011
 such+0.010
med+0.010
Neg logit contributions
 habit-0.012
orse-0.011
alk-0.011
oc-0.011
 plan-0.011
Token bird
Token ID6512
Feature activation+1.42e-02

Pos logit contributions
ths+0.011
ips+0.010
ting+0.010
 skies+0.010
 such+0.010
Neg logit contributions
 habit-0.011
oc-0.011
alk-0.010
orse-0.010
 agree-0.010
Token was
Token ID373
Feature activation-3.12e-03

Pos logit contributions
ths+0.023
 next+0.022
 skies+0.022
 two+0.022
 both+0.021
Neg logit contributions
 habit-0.024
 agree-0.023
alk-0.023
oc-0.022
 policy-0.022
Token sad
Token ID6507
Feature activation-2.02e-02

Pos logit contributions
 habit+0.006
orse+0.005
alk+0.005
oc+0.005
 understand+0.005
Neg logit contributions
ths-0.005
 skies-0.005
ting-0.005
ople-0.004
ips-0.004
Token,
Token ID11
Feature activation-3.74e-03

Pos logit contributions
oc+0.043
 habit+0.042
orse+0.041
 agree+0.040
alk+0.040
Neg logit contributions
 skies-0.025
ths-0.025
 next-0.024
 two-0.023
 such-0.023
Token bec
Token ID639
Feature activation+1.11e-01

Pos logit contributions
 habit+0.007
oc+0.007
alk+0.007
 agree+0.007
orse+0.007
Neg logit contributions
ths-0.005
 skies-0.005
ting-0.005
 highest-0.005
ips-0.005
Tokenas
Token ID292
Feature activation-5.44e-03

Pos logit contributions
ths+0.194
 skies+0.194
ting+0.183
 such+0.181
 two+0.175
Neg logit contributions
 habit-0.172
oc-0.167
alk-0.166
 agree-0.161
orse-0.155
Tokenue
Token ID518
Feature activation+7.47e-03

Pos logit contributions
oc+0.009
 habit+0.008
 agree+0.008
orse+0.008
alk+0.008
Neg logit contributions
 skies-0.009
ting-0.009
ths-0.009
 both-0.009
 two-0.009
Token he
Token ID339
Feature activation+1.33e-02

Pos logit contributions
ths+0.011
 skies+0.011
ting+0.010
 highest+0.010
 two+0.010
Neg logit contributions
 habit-0.014
oc-0.014
alk-0.013
 agree-0.013
orse-0.013
Token didn
Token ID1422
Feature activation+6.10e-03

Pos logit contributions
ths+0.023
 skies+0.023
 next+0.022
 highest+0.021
 two+0.021
Neg logit contributions
 habit-0.022
oc-0.021
alk-0.021
 agree-0.020
orse-0.019
Token't
Token ID470
Feature activation+2.56e-03

Pos logit contributions
ths+0.006
 skies+0.006
ting+0.006
 highest+0.005
ips+0.005
Neg logit contributions
 habit-0.014
oc-0.014
alk-0.014
 agree-0.013
orse-0.013
Token everything
Token ID2279
Feature activation+1.78e-02

Pos logit contributions
 skies+0.004
 next+0.004
ths+0.004
 two+0.004
 such+0.003
Neg logit contributions
oc-0.006
 habit-0.006
alk-0.006
 agree-0.006
orse-0.006
Token went
Token ID1816
Feature activation-2.06e-03

Pos logit contributions
ths+0.033
 skies+0.032
 next+0.030
 highest+0.030
ting+0.030
Neg logit contributions
oc-0.028
 habit-0.028
alk-0.026
orse-0.026
 agree-0.025
Token off
Token ID572
Feature activation+7.57e-03

Pos logit contributions
 habit+0.005
oc+0.005
alk+0.005
 agree+0.005
orse+0.005
Neg logit contributions
ths-0.002
 skies-0.002
 highest-0.002
ips-0.002
ting-0.002
Token without
Token ID1231
Feature activation-2.24e-03

Pos logit contributions
 skies+0.009
ths+0.009
 next+0.009
ting+0.009
 such+0.009
Neg logit contributions
 habit-0.016
oc-0.016
 agree-0.015
orse-0.015
alk-0.015
Token a
Token ID257
Feature activation-3.97e-03

Pos logit contributions
 habit+0.005
oc+0.005
 agree+0.004
alk+0.004
orse+0.004
Neg logit contributions
ths-0.003
 skies-0.003
 highest-0.003
ting-0.003
ips-0.002
Token hitch
Token ID35996
Feature activation+1.05e-01

Pos logit contributions
oc+0.007
ruits+0.007
orse+0.007
alk+0.007
grand+0.007
Neg logit contributions
 back-0.006
 two-0.006
 highest-0.005
 next-0.005
 too-0.005
Token.
Token ID13
Feature activation+3.66e-03

Pos logit contributions
ting+0.143
 skies+0.133
 next+0.132
 such+0.131
ths+0.129
Neg logit contributions
 habit-0.218
alk-0.197
 agree-0.196
orse-0.193
oc-0.191
Token
Token ID220
Feature activation+4.00e-03

Pos logit contributions
 skies+0.005
ths+0.005
 two+0.005
ting+0.005
 such+0.005
Neg logit contributions
 habit-0.007
oc-0.007
 agree-0.007
orse-0.007
alk-0.007
Token\n
Token ID198
Feature activation-2.10e-03

Pos logit contributions
 skies+0.005
ths+0.005
 two+0.005
ips+0.005
 such+0.005
Neg logit contributions
 habit-0.008
oc-0.008
alk-0.008
 agree-0.008
orse-0.007
Token\n
Token ID198
Feature activation-2.03e-03

Pos logit contributions
 habit+0.005
oc+0.004
 policy+0.004
alk+0.004
 agree+0.004
Neg logit contributions
 next-0.002
 skies-0.002
 two-0.002
There-0.002
 both-0.002
TokenOne
Token ID3198
Feature activation+9.60e-03

Pos logit contributions
 habit+0.004
oc+0.004
alk+0.004
 agree+0.004
orse+0.004
Neg logit contributions
ths-0.003
 skies-0.003
ting-0.002
 highest-0.002
ips-0.002
Token inside
Token ID2641
Feature activation-1.16e-02

Pos logit contributions
ths+0.016
 skies+0.015
ays+0.014
ips+0.014
 yourselves+0.014
Neg logit contributions
 agree-0.023
 habit-0.022
 believe-0.021
oc-0.021
 understand-0.021
Token and
Token ID290
Feature activation+8.30e-03

Pos logit contributions
oc+0.027
 habit+0.027
orse+0.026
alk+0.025
 agree+0.024
Neg logit contributions
 next-0.013
 such-0.012
 skies-0.012
 two-0.012
ths-0.012
Token sque
Token ID35776
Feature activation+2.91e-02

Pos logit contributions
ths+0.014
 skies+0.014
ting+0.013
 highest+0.013
 two+0.013
Neg logit contributions
 habit-0.014
oc-0.014
 agree-0.013
alk-0.013
orse-0.013
Tokenaked
Token ID4335
Feature activation+1.38e-02

Pos logit contributions
ths+0.034
 skies+0.034
ting+0.031
 highest+0.031
 next+0.030
Neg logit contributions
 habit-0.065
oc-0.064
alk-0.061
 agree-0.061
orse-0.060
Token "
Token ID366
Feature activation+1.23e-03

Pos logit contributions
ths+0.019
 skies+0.018
 highest+0.017
ting+0.017
ips+0.016
Neg logit contributions
 habit-0.028
oc-0.028
 agree-0.026
alk-0.026
orse-0.025
Tokenaunt
Token ID12968
Feature activation+1.02e-01

Pos logit contributions
ths+0.002
 skies+0.002
 highest+0.002
ting+0.002
 such+0.002
Neg logit contributions
oc-0.002
 habit-0.002
 agree-0.002
orse-0.002
alk-0.002
Token's
Token ID338
Feature activation+5.11e-03

Pos logit contributions
ths+0.132
 skies+0.129
 highest+0.116
ting+0.115
ips+0.113
Neg logit contributions
oc-0.211
 habit-0.211
 agree-0.200
alk-0.199
orse-0.195
Token coming
Token ID2406
Feature activation-4.43e-02

Pos logit contributions
ths+0.008
 skies+0.008
ting+0.007
 two+0.007
 next+0.007
Neg logit contributions
 habit-0.010
oc-0.009
alk-0.009
 agree-0.009
orse-0.009
Token for
Token ID329
Feature activation-2.10e-03

Pos logit contributions
oc+0.093
 habit+0.093
 agree+0.089
alk+0.087
orse+0.085
Neg logit contributions
ths-0.056
 skies-0.055
ips-0.050
 highest-0.048
ting-0.048
Token dinner
Token ID8073
Feature activation-9.27e-03

Pos logit contributions
 habit+0.005
oc+0.005
 agree+0.005
alk+0.005
orse+0.005
Neg logit contributions
ths-0.002
 skies-0.002
 highest-0.002
ting-0.002
ips-0.002
Token!'
Token ID13679
Feature activation+9.41e-03

Pos logit contributions
 habit+0.017
oc+0.016
alk+0.016
 agree+0.016
orse+0.015
Neg logit contributions
ths-0.015
 skies-0.014
ting-0.013
 highest-0.013
 two-0.013
Token by
Token ID416
Feature activation-5.45e-03

Pos logit contributions
ths+0.060
 skies+0.059
ting+0.054
 such+0.053
 two+0.053
Neg logit contributions
 habit-0.108
oc-0.106
alk-0.102
 agree-0.101
orse-0.100
Token and
Token ID290
Feature activation+3.77e-03

Pos logit contributions
 habit+0.013
oc+0.012
alk+0.012
 agree+0.012
orse+0.012
Neg logit contributions
 skies-0.006
ths-0.006
 two-0.006
 such-0.005
 next-0.005
Token imported
Token ID17392
Feature activation+8.57e-02

Pos logit contributions
ths+0.005
 skies+0.005
 such+0.005
 next+0.005
ting+0.005
Neg logit contributions
oc-0.007
 habit-0.007
alk-0.007
orse-0.007
 agree-0.007
Token by
Token ID416
Feature activation-6.44e-03

Pos logit contributions
ths+0.106
 highest+0.103
 skies+0.099
There+0.091
 tallest+0.086
Neg logit contributions
 habit-0.170
 agree-0.169
oc-0.167
alk-0.166
 policy-0.163
Token and
Token ID290
Feature activation-3.28e-03

Pos logit contributions
 habit+0.014
oc+0.014
 agree+0.013
alk+0.013
orse+0.013
Neg logit contributions
ths-0.008
 skies-0.008
 two-0.007
 next-0.007
 such-0.007
Token exported
Token ID29050
Feature activation+1.00e-01

Pos logit contributions
 habit+0.006
oc+0.006
alk+0.006
orse+0.006
 agree+0.006
Neg logit contributions
ths-0.005
 skies-0.005
ting-0.004
 such-0.004
 next-0.004
Token by
Token ID416
Feature activation-1.39e-02

Pos logit contributions
ths+0.191
 skies+0.180
 highest+0.177
There+0.170
 such+0.162
Neg logit contributions
 habit-0.148
oc-0.147
alk-0.140
 agree-0.140
onest-0.127
Token and
Token ID290
Feature activation-1.11e-02

Pos logit contributions
 habit+0.030
oc+0.030
 agree+0.029
alk+0.028
orse+0.028
Neg logit contributions
 skies-0.017
ths-0.017
 two-0.016
 next-0.016
ting-0.016
Token sold
Token ID2702
Feature activation+1.04e-02

Pos logit contributions
 habit+0.020
oc+0.020
 agree+0.019
orse+0.018
alk+0.018
Neg logit contributions
ths-0.017
 skies-0.016
ips-0.016
 highest-0.015
 two-0.015
Token by
Token ID416
Feature activation-1.32e-02

Pos logit contributions
 skies+0.013
ths+0.013
ting+0.012
 two+0.012
 next+0.012
Neg logit contributions
 habit-0.022
oc-0.022
orse-0.021
 agree-0.021
alk-0.021
Token and
Token ID290
Feature activation-1.02e-02

Pos logit contributions
 habit+0.029
oc+0.029
 agree+0.028
alk+0.027
orse+0.027
Neg logit contributions
 skies-0.015
ths-0.015
 two-0.015
 next-0.014
ting-0.014
Token go
Token ID467
Feature activation-1.74e-02

Pos logit contributions
oc+0.015
orse+0.013
 habit+0.013
alk+0.013
ool+0.012
Neg logit contributions
 two-0.015
 such-0.015
 next-0.015
ths-0.015
 skies-0.015
Token near
Token ID1474
Feature activation-4.70e-02

Pos logit contributions
 habit+0.035
 agree+0.034
 understand+0.033
alk+0.033
oc+0.032
Neg logit contributions
ths-0.022
 skies-0.021
ays-0.020
ips-0.019
efully-0.019
Token something
Token ID1223
Feature activation-2.11e-03

Pos logit contributions
 habit+0.108
oc+0.107
alk+0.103
 agree+0.103
orse+0.101
Neg logit contributions
ths-0.050
 skies-0.049
ting-0.044
 highest-0.043
ips-0.043
Token that
Token ID326
Feature activation+2.98e-03

Pos logit contributions
 habit+0.004
oc+0.004
 agree+0.004
alk+0.004
orse+0.004
Neg logit contributions
ths-0.003
 skies-0.003
ting-0.003
 highest-0.003
 two-0.003
Token looks
Token ID3073
Feature activation+3.99e-02

Pos logit contributions
ths+0.005
 skies+0.005
 next+0.005
 highest+0.004
 two+0.004
Neg logit contributions
oc-0.005
 habit-0.005
orse-0.005
alk-0.005
 agree-0.005
Token potentially
Token ID6196
Feature activation+9.85e-02

Pos logit contributions
ths+0.053
 skies+0.053
 highest+0.050
 such+0.050
 two+0.049
Neg logit contributions
oc-0.080
 habit-0.080
alk-0.076
 agree-0.075
orse-0.075
Token dangerous
Token ID4923
Feature activation+1.59e-02

Pos logit contributions
ths+0.126
 skies+0.124
 such+0.121
 two+0.120
 next+0.120
Neg logit contributions
oc-0.205
 habit-0.200
orse-0.194
alk-0.192
 agree-0.189
Token and
Token ID290
Feature activation-1.95e-03

Pos logit contributions
ths+0.025
 skies+0.024
 highest+0.022
ting+0.022
ips+0.022
Neg logit contributions
 habit-0.029
oc-0.029
 agree-0.027
alk-0.027
orse-0.026
Token suspect
Token ID4099
Feature activation+4.17e-02

Pos logit contributions
oc+0.004
 habit+0.004
alk+0.004
orse+0.004
 agree+0.003
Neg logit contributions
ths-0.003
 skies-0.003
 next-0.003
 such-0.003
 two-0.003
Token,
Token ID11
Feature activation-3.19e-03

Pos logit contributions
ths+0.053
 skies+0.053
ting+0.049
 two+0.048
 such+0.048
Neg logit contributions
 habit-0.087
oc-0.087
alk-0.082
 agree-0.082
orse-0.081
Token as
Token ID355
Feature activation-2.77e-02

Pos logit contributions
 habit+0.006
oc+0.006
alk+0.006
 agree+0.006
orse+0.006
Neg logit contributions
ths-0.004
 skies-0.004
ting-0.004
 such-0.004
 next-0.004
Token\n
Token ID198
Feature activation-8.82e-03

Pos logit contributions
ths+0.023
 skies+0.023
ting+0.022
 two+0.022
 next+0.021
Neg logit contributions
 habit-0.030
oc-0.029
alk-0.028
 agree-0.028
orse-0.027
Token\n
Token ID198
Feature activation-8.01e-03

Pos logit contributions
 habit+0.019
oc+0.018
alk+0.018
 agree+0.017
 policy+0.017
Neg logit contributions
 skies-0.011
ths-0.011
 next-0.010
 two-0.010
There-0.010
TokenThe
Token ID464
Feature activation+1.15e-02

Pos logit contributions
 habit+0.019
oc+0.018
alk+0.018
 agree+0.018
orse+0.017
Neg logit contributions
ths-0.009
 skies-0.008
ting-0.007
 next-0.007
 highest-0.007
Token kind
Token ID1611
Feature activation-1.79e-02

Pos logit contributions
ths+0.021
 skies+0.021
 two+0.020
ting+0.020
 next+0.020
Neg logit contributions
 habit-0.018
oc-0.018
alk-0.017
 agree-0.017
orse-0.016
Token fox
Token ID21831
Feature activation+5.41e-03

Pos logit contributions
orse+0.035
 habit+0.034
oc+0.032
alk+0.032
onest+0.032
Neg logit contributions
 two-0.028
 skies-0.026
 next-0.026
ting-0.025
 me-0.025
Token sympath
Token ID23860
Feature activation+9.55e-02

Pos logit contributions
ths+0.009
 skies+0.009
 two+0.009
 next+0.009
ting+0.008
Neg logit contributions
 habit-0.009
alk-0.009
oc-0.009
 policy-0.008
 agree-0.008
Tokenized
Token ID1143
Feature activation+3.94e-02

Pos logit contributions
 skies+0.143
ips+0.137
 highest+0.127
ays+0.126
ting+0.124
Neg logit contributions
oc-0.195
 agree-0.180
 habit-0.174
 understand-0.160
orse-0.159
Token with
Token ID351
Feature activation-1.08e-02

Pos logit contributions
 skies+0.054
ths+0.053
ips+0.050
 highest+0.049
There+0.048
Neg logit contributions
oc-0.082
 habit-0.077
 agree-0.073
alk-0.072
orse-0.068
Token him
Token ID683
Feature activation+1.37e-02

Pos logit contributions
 habit+0.025
oc+0.025
alk+0.024
 agree+0.024
orse+0.024
Neg logit contributions
ths-0.011
 skies-0.011
 highest-0.010
ting-0.010
ips-0.009
Token and
Token ID290
Feature activation-7.68e-03

Pos logit contributions
ths+0.017
 skies+0.017
ting+0.016
 next+0.016
 two+0.016
Neg logit contributions
 habit-0.028
oc-0.028
alk-0.027
 agree-0.027
orse-0.027
Token said
Token ID531
Feature activation+3.12e-03

Pos logit contributions
 habit+0.013
oc+0.013
alk+0.012
 agree+0.012
orse+0.012
Neg logit contributions
ths-0.013
 skies-0.013
ting-0.012
 highest-0.012
 two-0.012
Tokenbirds
Token ID32002
Feature activation+1.23e-02

Pos logit contributions
ths+0.071
 skies+0.070
ting+0.066
 highest+0.065
 such+0.065
Neg logit contributions
 habit-0.088
oc-0.086
alk-0.082
 agree-0.082
orse-0.081
Token and
Token ID290
Feature activation+3.52e-03

Pos logit contributions
 skies+0.021
ths+0.021
 both+0.020
ting+0.020
 two+0.020
Neg logit contributions
 habit-0.020
oc-0.020
alk-0.019
 policy-0.018
 agree-0.018
Token squirrel
Token ID33039
Feature activation+8.03e-03

Pos logit contributions
ths+0.004
 skies+0.004
ting+0.004
 two+0.004
 such+0.004
Neg logit contributions
 habit-0.008
oc-0.008
alk-0.007
 agree-0.007
orse-0.007
Tokens
Token ID82
Feature activation-2.51e-03

Pos logit contributions
ths+0.014
 skies+0.013
ting+0.013
 two+0.013
 both+0.013
Neg logit contributions
 habit-0.014
oc-0.013
alk-0.012
 policy-0.012
 agree-0.012
Token--
Token ID438
Feature activation+4.78e-02

Pos logit contributions
oc+0.004
 habit+0.004
alk+0.004
orse+0.004
 policy+0.004
Neg logit contributions
 skies-0.004
ths-0.004
 two-0.004
 both-0.004
 next-0.004
Tokenwere
Token ID22474
Feature activation+9.47e-02

Pos logit contributions
ths+0.075
 skies+0.074
ting+0.069
 highest+0.068
 two+0.067
Neg logit contributions
 habit-0.086
oc-0.085
 agree-0.081
alk-0.081
orse-0.079
Token frozen
Token ID12912
Feature activation+1.07e-02

Pos logit contributions
ths+0.136
 skies+0.134
ting+0.125
 highest+0.121
Why+0.119
Neg logit contributions
 habit-0.179
oc-0.175
alk-0.171
 agree-0.170
orse-0.170
Token in
Token ID287
Feature activation-1.35e-02

Pos logit contributions
ths+0.015
 skies+0.015
ting+0.014
 highest+0.014
 two+0.014
Neg logit contributions
 habit-0.020
oc-0.020
 agree-0.019
alk-0.019
orse-0.019
Token place
Token ID1295
Feature activation-1.32e-02

Pos logit contributions
 habit+0.032
oc+0.032
alk+0.030
 agree+0.030
orse+0.030
Neg logit contributions
ths-0.014
 skies-0.013
ting-0.012
 highest-0.012
ips-0.011
Token.
Token ID13
Feature activation-1.46e-03

Pos logit contributions
 habit+0.026
oc+0.026
alk+0.024
 agree+0.024
orse+0.024
Neg logit contributions
 skies-0.019
ths-0.019
 such-0.018
 next-0.017
ting-0.017
Token
Token ID198
Feature activation-3.40e-03

Pos logit contributions
 habit+0.003
oc+0.003
 agree+0.002
alk+0.002
orse+0.002
Neg logit contributions
ths-0.002
 skies-0.002
ting-0.002
 highest-0.002
 next-0.002
Token He
Token ID679
Feature activation+1.08e-02

Pos logit contributions
 skies+0.015
ths+0.015
ting+0.014
 two+0.014
 such+0.014
Neg logit contributions
 habit-0.024
oc-0.023
 agree-0.022
alk-0.022
orse-0.022
Token watched
Token ID7342
Feature activation-1.25e-03

Pos logit contributions
ths+0.018
 skies+0.018
 two+0.017
 next+0.017
 both+0.016
Neg logit contributions
 habit-0.019
alk-0.018
oc-0.018
 policy-0.018
 agree-0.017
Token it
Token ID340
Feature activation-6.75e-03

Pos logit contributions
 habit+0.003
oc+0.003
 agree+0.003
alk+0.003
orse+0.003
Neg logit contributions
ths-0.001
 skies-0.001
 highest-0.001
ips-0.001
ting-0.001
Token in
Token ID287
Feature activation-6.00e-03

Pos logit contributions
 habit+0.012
oc+0.011
 policy+0.011
alk+0.011
orse+0.011
Neg logit contributions
 skies-0.011
ths-0.011
 next-0.011
 highest-0.010
 two-0.010
Token amaz
Token ID40642
Feature activation+2.69e-04

Pos logit contributions
 habit+0.014
oc+0.014
 agree+0.013
alk+0.013
orse+0.013
Neg logit contributions
ths-0.007
 skies-0.006
ting-0.006
 highest-0.006
ips-0.006
Tokenement
Token ID972
Feature activation+9.40e-02

Pos logit contributions
 highest+0.000
 fastest+0.000
ting+0.000
There+0.000
Why+0.000
Neg logit contributions
oc-0.000
alk-0.000
 choice-0.000
ool-0.000
 agree-0.000
Token as
Token ID355
Feature activation-1.82e-02

Pos logit contributions
ths+0.137
 skies+0.136
ting+0.126
 highest+0.124
 two+0.123
Neg logit contributions
 habit-0.180
oc-0.178
alk-0.169
 agree-0.169
orse-0.165
Token it
Token ID340
Feature activation-1.07e-02

Pos logit contributions
 habit+0.040
oc+0.040
alk+0.038
 agree+0.038
orse+0.036
Neg logit contributions
 skies-0.021
ths-0.021
 two-0.021
 next-0.020
ting-0.020
Token went
Token ID1816
Feature activation+5.85e-03

Pos logit contributions
 habit+0.018
oc+0.017
alk+0.017
 agree+0.017
orse+0.016
Neg logit contributions
ths-0.018
 skies-0.018
ting-0.017
 highest-0.017
 two-0.017
Token higher
Token ID2440
Feature activation-1.41e-02

Pos logit contributions
 skies+0.005
ths+0.005
ting+0.005
 highest+0.005
 next+0.005
Neg logit contributions
 habit-0.014
oc-0.014
 agree-0.014
alk-0.014
orse-0.014
Token and
Token ID290
Feature activation+8.16e-03

Pos logit contributions
oc+0.022
 habit+0.022
alk+0.022
 agree+0.021
orse+0.020
Neg logit contributions
ths-0.025
 skies-0.024
ips-0.023
ting-0.023
There-0.023
Token then
Token ID788
Feature activation-9.24e-03

Pos logit contributions
 habit+0.020
oc+0.019
 agree+0.019
orse+0.018
 understand+0.018
Neg logit contributions
ths-0.015
 skies-0.015
 highest-0.015
ips-0.014
ays-0.014
Token shouted
Token ID17293
Feature activation-1.37e-02

Pos logit contributions
 habit+0.015
oc+0.015
 agree+0.014
alk+0.014
orse+0.014
Neg logit contributions
ths-0.016
 skies-0.016
ting-0.015
 highest-0.015
 such-0.014
Token,
Token ID11
Feature activation-1.16e-02

Pos logit contributions
 habit+0.025
oc+0.024
alk+0.023
 agree+0.023
orse+0.023
Neg logit contributions
ths-0.022
 skies-0.021
ting-0.020
 highest-0.020
 two-0.020
Token "
Token ID366
Feature activation-6.04e-03

Pos logit contributions
 habit+0.016
oc+0.016
 agree+0.015
alk+0.015
orse+0.015
Neg logit contributions
ths-0.023
 skies-0.022
 highest-0.021
ting-0.021
ips-0.021
TokenI
Token ID40
Feature activation-1.92e-02

Pos logit contributions
oc+0.011
 habit+0.011
 agree+0.011
orse+0.010
 understand+0.010
Neg logit contributions
ips-0.008
ths-0.008
 skies-0.008
ays-0.008
 back-0.008
Token ORDER
Token ID38678
Feature activation+9.40e-02

Pos logit contributions
alk+0.033
 habit+0.031
 policy+0.029
onest+0.028
kin+0.028
Neg logit contributions
 two-0.031
 only-0.031
 least-0.031
 both-0.031
 right-0.029
Token A
Token ID317
Feature activation-2.84e-02

Pos logit contributions
ths+0.145
 skies+0.130
 highest+0.119
ips+0.119
There+0.118
Neg logit contributions
oc-0.179
 habit-0.171
alk-0.169
 agree-0.166
onest-0.160
Token WH
Token ID7655
Feature activation+4.02e-02

Pos logit contributions
 habit+0.052
alk+0.051
oc+0.050
 agree+0.049
 policy+0.049
Neg logit contributions
 two-0.041
 highest-0.040
 skies-0.038
 next-0.038
ths-0.037
TokenALE
Token ID21358
Feature activation+6.00e-04

Pos logit contributions
ths+0.070
 skies+0.069
 two+0.066
 highest+0.066
 such+0.065
Neg logit contributions
 habit-0.065
 agree-0.061
oc-0.061
alk-0.060
 believe-0.056
Token!"
Token ID2474
Feature activation+1.33e-02

Pos logit contributions
 skies+0.001
 next+0.001
 two+0.001
ths+0.001
 highest+0.001
Neg logit contributions
 habit-0.001
 agree-0.001
oc-0.001
 understand-0.001
orse-0.001
Token
Token ID198
Feature activation-1.38e-02

Pos logit contributions
ths+0.023
 skies+0.023
ting+0.022
 highest+0.022
 such+0.021
Neg logit contributions
 habit-0.022
oc-0.021
 agree-0.020
alk-0.020
orse-0.020
Token
Token ID198
Feature activation+9.61e-05

Pos logit contributions
ths+0.012
 skies+0.012
ting+0.011
 highest+0.011
 next+0.011
Neg logit contributions
 habit-0.016
oc-0.016
alk-0.015
 agree-0.015
orse-0.015
Token
Token ID198
Feature activation+1.53e-03

Pos logit contributions
ths+0.000
 skies+0.000
 two+0.000
There+0.000
 next+0.000
Neg logit contributions
 habit-0.000
oc-0.000
alk-0.000
 agree-0.000
 policy-0.000
TokenTo
Token ID2514
Feature activation+4.24e-02

Pos logit contributions
ths+0.002
 skies+0.002
 two+0.002
 next+0.002
There+0.002
Neg logit contributions
 habit-0.003
oc-0.003
alk-0.003
 agree-0.003
orse-0.003
Token her
Token ID607
Feature activation+1.47e-02

Pos logit contributions
ths+0.044
 skies+0.043
 two+0.040
 such+0.040
ting+0.040
Neg logit contributions
 habit-0.098
oc-0.098
alk-0.093
 agree-0.093
orse-0.092
Token amaz
Token ID40642
Feature activation-1.48e-04

Pos logit contributions
ting+0.025
 skies+0.025
ips+0.024
ths+0.024
 yourselves+0.023
Neg logit contributions
 habit-0.025
oc-0.024
 agree-0.023
 plan-0.023
 understand-0.022
Tokenement
Token ID972
Feature activation+9.33e-02

Pos logit contributions
 habit+0.000
 choice+0.000
oc+0.000
alk+0.000
 agree+0.000
Neg logit contributions
ting-0.000
There-0.000
 highest-0.000
 two-0.000
Why-0.000
Token,
Token ID11
Feature activation+1.03e-02

Pos logit contributions
ths+0.151
 skies+0.150
 next+0.140
 two+0.140
ting+0.139
Neg logit contributions
 habit-0.164
oc-0.163
alk-0.152
 agree-0.151
orse-0.148
Token the
Token ID262
Feature activation+7.66e-04

Pos logit contributions
ths+0.012
 skies+0.012
ting+0.011
 highest+0.011
ips+0.011
Neg logit contributions
 habit-0.022
oc-0.022
 agree-0.021
alk-0.021
orse-0.021
Token clothes
Token ID8242
Feature activation-4.64e-03

Pos logit contributions
 skies+0.001
ths+0.001
 two+0.001
ting+0.001
 next+0.001
Neg logit contributions
 habit-0.001
oc-0.001
alk-0.001
 agree-0.001
orse-0.001
Token came
Token ID1625
Feature activation+2.51e-03

Pos logit contributions
oc+0.008
 habit+0.008
orse+0.007
alk+0.007
 policy+0.007
Neg logit contributions
 skies-0.008
ths-0.008
 two-0.008
 next-0.008
 both-0.008
Token out
Token ID503
Feature activation-8.37e-03

Pos logit contributions
ths+0.002
 skies+0.002
ting+0.002
ips+0.002
 highest+0.002
Neg logit contributions
 habit-0.006
oc-0.006
alk-0.006
 agree-0.006
orse-0.005
Token it
Token ID340
Feature activation-1.66e-02

Pos logit contributions
ths+0.010
 skies+0.009
ips+0.008
ting+0.008
 highest+0.008
Neg logit contributions
 habit-0.023
oc-0.023
alk-0.022
 agree-0.022
orse-0.021
Token.
Token ID13
Feature activation+6.43e-03

Pos logit contributions
 habit+0.031
oc+0.031
alk+0.030
 agree+0.029
orse+0.029
Neg logit contributions
 skies-0.025
ths-0.024
 highest-0.023
 next-0.023
ting-0.023
Token To
Token ID1675
Feature activation+2.06e-02

Pos logit contributions
ths+0.011
 skies+0.010
ting+0.010
 highest+0.009
 next+0.009
Neg logit contributions
 habit-0.011
oc-0.011
alk-0.010
 agree-0.010
orse-0.010
Token his
Token ID465
Feature activation+2.70e-03

Pos logit contributions
ths+0.029
 skies+0.029
ting+0.027
 such+0.027
 two+0.026
Neg logit contributions
 habit-0.040
oc-0.040
alk-0.038
 agree-0.038
orse-0.037
Token amaz
Token ID40642
Feature activation+8.00e-04

Pos logit contributions
ting+0.005
 yourselves+0.005
 skies+0.005
ips+0.005
ths+0.005
Neg logit contributions
 habit-0.004
oc-0.004
 plan-0.004
 agree-0.004
 believe-0.004
Tokenement
Token ID972
Feature activation+9.33e-02

Pos logit contributions
ting+0.001
 highest+0.001
There+0.001
 two+0.001
Why+0.001
Neg logit contributions
 habit-0.001
 choice-0.001
alk-0.001
kin-0.001
ool-0.001
Token,
Token ID11
Feature activation-3.12e-04

Pos logit contributions
ths+0.158
 skies+0.156
ting+0.146
 two+0.145
 next+0.145
Neg logit contributions
 habit-0.158
oc-0.156
alk-0.147
 agree-0.146
orse-0.140
Token he
Token ID339
Feature activation+1.03e-02

Pos logit contributions
 habit+0.001
oc+0.001
 agree+0.001
alk+0.001
orse+0.001
Neg logit contributions
ths-0.000
 skies-0.000
ting-0.000
 highest-0.000
ips-0.000
Token could
Token ID714
Feature activation-1.95e-02

Pos logit contributions
ths+0.016
 skies+0.016
 two+0.015
 next+0.015
ting+0.015
Neg logit contributions
 habit-0.019
oc-0.018
alk-0.018
 agree-0.017
 policy-0.017
Token fit
Token ID4197
Feature activation+2.48e-02

Pos logit contributions
oc+0.037
 habit+0.036
alk+0.033
 policy+0.033
 agree+0.032
Neg logit contributions
 skies-0.030
ths-0.029
 highest-0.029
 two-0.028
 next-0.028
Token inside
Token ID2641
Feature activation-1.30e-02

Pos logit contributions
ths+0.035
 skies+0.033
clock+0.030
ople+0.030
ips+0.030
Neg logit contributions
 habit-0.044
oc-0.044
 agree-0.044
 believe-0.044
orse-0.043
Token to
Token ID284
Feature activation-1.88e-03

Pos logit contributions
 habit+0.018
oc+0.018
 agree+0.017
alk+0.017
orse+0.017
Neg logit contributions
ths-0.010
 skies-0.010
ting-0.009
 highest-0.009
ips-0.009
Token prevent
Token ID2948
Feature activation-1.08e-03

Pos logit contributions
oc+0.003
 habit+0.003
alk+0.003
orse+0.003
 policy+0.003
Neg logit contributions
 two-0.003
 next-0.003
 skies-0.003
 both-0.003
ths-0.003
Token the
Token ID262
Feature activation+1.37e-04

Pos logit contributions
 habit+0.002
 agree+0.002
 understand+0.002
oc+0.002
orse+0.002
Neg logit contributions
 skies-0.001
ths-0.001
 highest-0.001
ips-0.001
ays-0.001
Token bad
Token ID2089
Feature activation+2.34e-02

Pos logit contributions
ths+0.000
 skies+0.000
ting+0.000
ips+0.000
 highest+0.000
Neg logit contributions
 habit-0.000
oc-0.000
 agree-0.000
alk-0.000
orse-0.000
Token guys
Token ID3730
Feature activation+2.05e-02

Pos logit contributions
 skies+0.034
ths+0.033
 next+0.033
 two+0.033
ting+0.032
Neg logit contributions
oc-0.044
 habit-0.043
alk-0.042
orse-0.042
 agree-0.042
Token (
Token ID357
Feature activation+9.30e-02

Pos logit contributions
ths+0.027
 skies+0.026
 next+0.025
ting+0.025
 two+0.025
Neg logit contributions
 habit-0.043
oc-0.041
alk-0.039
orse-0.039
 agree-0.039
Tokenthe
Token ID1169
Feature activation-1.92e-02

Pos logit contributions
 skies+0.159
ths+0.157
 highest+0.143
ting+0.142
 two+0.140
Neg logit contributions
oc-0.158
 habit-0.151
 agree-0.145
alk-0.144
orse-0.140
Token toys
Token ID14958
Feature activation+1.22e-02

Pos logit contributions
 habit+0.036
oc+0.035
alk+0.033
 agree+0.033
orse+0.033
Neg logit contributions
ths-0.029
 skies-0.028
ting-0.027
 highest-0.026
 such-0.026
Token on
Token ID319
Feature activation-2.21e-02

Pos logit contributions
ths+0.015
 skies+0.014
 next+0.014
ting+0.014
 two+0.014
Neg logit contributions
 habit-0.026
oc-0.026
orse-0.025
alk-0.025
 agree-0.025
Token the
Token ID262
Feature activation-3.00e-03

Pos logit contributions
 habit+0.052
oc+0.052
alk+0.050
 agree+0.050
orse+0.049
Neg logit contributions
ths-0.022
 skies-0.022
ting-0.019
 highest-0.019
 two-0.018
Token floor
Token ID4314
Feature activation-1.75e-02

Pos logit contributions
 habit+0.005
oc+0.005
alk+0.004
 agree+0.004
 plan+0.004
Neg logit contributions
ths-0.005
ips-0.005
 skies-0.005
ting-0.005
ays-0.005
Token,"
Token ID553
Feature activation+5.77e-02

Pos logit contributions
ths+0.032
 skies+0.031
ting+0.029
 highest+0.029
ips+0.029
Neg logit contributions
oc-0.019
 habit-0.018
 agree-0.018
alk-0.017
orse-0.016
Token the
Token ID262
Feature activation+5.41e-03

Pos logit contributions
ths+0.065
 skies+0.062
ting+0.058
 highest+0.056
ips+0.054
Neg logit contributions
 habit-0.131
oc-0.130
 agree-0.128
orse-0.124
alk-0.123
Token bird
Token ID6512
Feature activation+1.06e-02

Pos logit contributions
ths+0.009
ting+0.009
 skies+0.009
ips+0.009
 such+0.009
Neg logit contributions
 habit-0.009
oc-0.009
orse-0.008
 agree-0.008
alk-0.008
Token said
Token ID531
Feature activation+6.74e-03

Pos logit contributions
ths+0.020
 skies+0.019
ting+0.018
 highest+0.018
 such+0.018
Neg logit contributions
 habit-0.016
oc-0.016
 agree-0.015
alk-0.015
orse-0.015
Token.
Token ID13
Feature activation-1.45e-02

Pos logit contributions
 next+0.009
 skies+0.009
ths+0.009
 back+0.009
ting+0.009
Neg logit contributions
 habit-0.014
oc-0.014
alk-0.013
 agree-0.013
orse-0.012
Token (
Token ID357
Feature activation+9.28e-02

Pos logit contributions
oc+0.024
 habit+0.023
 agree+0.023
alk+0.023
orse+0.022
Neg logit contributions
ths-0.025
 skies-0.024
ting-0.023
 highest-0.023
 such-0.022
TokenOr
Token ID5574
Feature activation-9.08e-03

Pos logit contributions
ths+0.115
 skies+0.113
ting+0.102
 highest+0.101
 such+0.097
Neg logit contributions
oc-0.200
 habit-0.198
 agree-0.190
alk-0.187
orse-0.185
Token,
Token ID11
Feature activation+1.96e-03

Pos logit contributions
 habit+0.013
 agree+0.013
oc+0.013
 understand+0.012
 policy+0.012
Neg logit contributions
 skies-0.017
ths-0.017
ting-0.016
ips-0.016
ays-0.015
Token if
Token ID611
Feature activation-1.23e-02

Pos logit contributions
ths+0.003
 skies+0.003
ting+0.002
 highest+0.002
ips+0.002
Neg logit contributions
 habit-0.004
oc-0.004
 agree-0.004
alk-0.004
orse-0.004
Token you
Token ID345
Feature activation+1.32e-02

Pos logit contributions
 habit+0.031
oc+0.031
 agree+0.030
orse+0.030
alk+0.030
Neg logit contributions
ths-0.010
 skies-0.009
ting-0.009
ips-0.008
 highest-0.008
Token think
Token ID892
Feature activation-1.69e-02

Pos logit contributions
 both+0.023
 two+0.022
 next+0.022
 least+0.022
 back+0.022
Neg logit contributions
 habit-0.022
oc-0.021
 policy-0.020
orse-0.019
onest-0.019
Token big
Token ID1263
Feature activation-2.35e-03

Pos logit contributions
grand+0.019
alk+0.019
ruits+0.019
oc+0.019
 policy+0.019
Neg logit contributions
 back-0.013
 two-0.012
 highest-0.012
 so-0.011
 too-0.011
Token smile
Token ID8212
Feature activation-1.55e-02

Pos logit contributions
grand+0.004
oc+0.004
 habit+0.003
ruits+0.003
abs+0.003
Neg logit contributions
 back-0.004
 two-0.004
 me-0.004
 too-0.004
 skies-0.004
Token,
Token ID11
Feature activation-1.15e-02

Pos logit contributions
 habit+0.039
oc+0.038
alk+0.037
 agree+0.037
 policy+0.036
Neg logit contributions
ting-0.014
 skies-0.013
 two-0.013
ths-0.013
 next-0.013
Token making
Token ID1642
Feature activation-2.00e-02

Pos logit contributions
 habit+0.022
oc+0.021
alk+0.021
 agree+0.020
orse+0.019
Neg logit contributions
ths-0.017
 skies-0.017
ting-0.016
 next-0.016
 two-0.016
Token a
Token ID257
Feature activation-1.25e-02

Pos logit contributions
 habit+0.045
oc+0.045
 agree+0.043
alk+0.043
orse+0.042
Neg logit contributions
ths-0.022
 skies-0.022
ting-0.020
 highest-0.019
 next-0.019
Token (
Token ID357
Feature activation+9.28e-02

Pos logit contributions
alk+0.027
oc+0.026
ruits+0.026
 policy+0.026
onest+0.025
Neg logit contributions
 back-0.015
 highest-0.015
 two-0.014
 both-0.014
 me-0.014
Tokenp
Token ID79
Feature activation-2.19e-02

Pos logit contributions
ths+0.151
 skies+0.150
ting+0.137
 highest+0.136
 two+0.134
Neg logit contributions
oc-0.163
 habit-0.160
 agree-0.151
alk-0.150
orse-0.149
Tokenonding
Token ID42703
Feature activation+3.97e-02

Pos logit contributions
 habit+0.039
oc+0.038
 policy+0.037
 agree+0.037
alk+0.036
Neg logit contributions
 skies-0.034
 two-0.033
ths-0.032
ting-0.032
 both-0.032
Token)
Token ID8
Feature activation+2.41e-02

Pos logit contributions
 skies+0.065
ths+0.064
ting+0.061
 two+0.060
 highest+0.060
Neg logit contributions
 habit-0.069
oc-0.068
 agree-0.065
alk-0.064
orse-0.063
Token noise
Token ID7838
Feature activation+7.46e-03

Pos logit contributions
 skies+0.037
 two+0.037
 next+0.036
ting+0.036
 such+0.035
Neg logit contributions
 habit-0.044
oc-0.042
alk-0.041
 agree-0.041
orse-0.039
Token of
Token ID286
Feature activation+2.63e-03

Pos logit contributions
ths+0.011
 skies+0.011
ting+0.010
 highest+0.010
 two+0.010
Neg logit contributions
 habit-0.014
oc-0.014
 agree-0.013
alk-0.013
orse-0.013
Token a
Token ID257
Feature activation-8.01e-03

Pos logit contributions
 habit+0.028
oc+0.027
 agree+0.027
alk+0.026
orse+0.026
Neg logit contributions
ths-0.011
 skies-0.011
ting-0.010
 highest-0.010
ips-0.009
Token great
Token ID1049
Feature activation+1.92e-02

Pos logit contributions
alk+0.016
oc+0.016
 policy+0.015
 habit+0.015
ruits+0.015
Neg logit contributions
 two-0.011
 highest-0.011
 back-0.011
 skies-0.010
ths-0.010
Token he
Token ID339
Feature activation+7.47e-03

Pos logit contributions
 two+0.030
 back+0.030
 skies+0.028
 right+0.028
 next+0.028
Neg logit contributions
oc-0.036
grand-0.034
orse-0.033
ruits-0.032
ees-0.031
Tokenave
Token ID1015
Feature activation+1.05e-02

Pos logit contributions
ths+0.013
 skies+0.013
ting+0.012
 highest+0.012
 two+0.012
Neg logit contributions
 habit-0.013
oc-0.012
alk-0.012
 agree-0.012
orse-0.011
Token-
Token ID12
Feature activation+5.25e-02

Pos logit contributions
 skies+0.011
ths+0.011
ting+0.010
 two+0.009
 such+0.009
Neg logit contributions
 habit-0.025
oc-0.025
alk-0.024
 agree-0.024
orse-0.023
Tokenho
Token ID8873
Feature activation-1.36e-01

Pos logit contributions
ths+0.090
 skies+0.089
 two+0.088
 next+0.088
There+0.088
Neg logit contributions
 habit-0.087
alk-0.080
oc-0.078
 agree-0.077
 policy-0.075
Token,
Token ID11
Feature activation-7.43e-03

Pos logit contributions
 habit+0.265
alk+0.249
 bond+0.243
 policy+0.240
oc+0.229
Neg logit contributions
 next-0.200
 back-0.190
 two-0.187
ting-0.182
 highest-0.179
Token they
Token ID484
Feature activation+5.36e-03

Pos logit contributions
 habit+0.015
oc+0.015
 agree+0.014
alk+0.014
orse+0.014
Neg logit contributions
ths-0.010
 skies-0.010
ting-0.009
 highest-0.009
 two-0.009
Token gained
Token ID8618
Feature activation+1.19e-02

Pos logit contributions
 both+0.007
ths+0.007
 skies+0.007
 two+0.007
 back+0.007
Neg logit contributions
 habit-0.011
oc-0.011
 policy-0.010
alk-0.010
orse-0.010
Token the
Token ID262
Feature activation-3.90e-03

Pos logit contributions
ths+0.010
 skies+0.010
ting+0.009
 highest+0.008
 two+0.008
Neg logit contributions
 habit-0.030
oc-0.030
 agree-0.029
alk-0.029
orse-0.028
Token avocado
Token ID40377
Feature activation+3.84e-03

Pos logit contributions
oc+0.008
 habit+0.008
orse+0.008
alk+0.008
 agree+0.008
Neg logit contributions
 skies-0.005
ths-0.005
 highest-0.005
 two-0.005
 next-0.005
Token worked
Token ID3111
Feature activation+1.62e-02

Pos logit contributions
 skies+0.010
ths+0.010
 two+0.010
 next+0.009
 such+0.009
Neg logit contributions
oc-0.014
 habit-0.014
alk-0.013
 agree-0.013
orse-0.013
Token hard
Token ID1327
Feature activation-1.10e-02

Pos logit contributions
ths+0.024
 skies+0.023
ting+0.022
 highest+0.021
ips+0.021
Neg logit contributions
 habit-0.031
oc-0.030
 agree-0.029
alk-0.029
orse-0.028
Token.
Token ID13
Feature activation+1.36e-02

Pos logit contributions
 habit+0.020
oc+0.019
alk+0.019
 agree+0.019
orse+0.018
Neg logit contributions
ths-0.017
 skies-0.017
ting-0.016
 highest-0.016
ips-0.015
Token The
Token ID383
Feature activation-7.53e-04

Pos logit contributions
ths+0.020
 skies+0.020
ting+0.019
 highest+0.018
 two+0.018
Neg logit contributions
 habit-0.026
oc-0.025
 agree-0.024
alk-0.024
orse-0.023
Token grass
Token ID8701
Feature activation-2.47e-02

Pos logit contributions
oc+0.001
 habit+0.001
alk+0.001
 agree+0.001
orse+0.001
Neg logit contributions
 skies-0.001
ths-0.001
 two-0.001
 next-0.001
 highest-0.001
Tokenho
Token ID8873
Feature activation-1.34e-01

Pos logit contributions
 habit+0.044
oc+0.044
 agree+0.042
orse+0.040
alk+0.040
Neg logit contributions
 skies-0.042
ths-0.040
 next-0.039
 two-0.037
 highest-0.037
Tokenpper
Token ID2848
Feature activation-4.24e-02

Pos logit contributions
 habit+0.298
oc+0.289
 agree+0.280
alk+0.280
orse+0.273
Neg logit contributions
ths-0.157
 skies-0.154
ting-0.144
 highest-0.140
 next-0.139
Token was
Token ID373
Feature activation-1.84e-03

Pos logit contributions
 habit+0.071
oc+0.069
alk+0.066
 agree+0.065
orse+0.063
Neg logit contributions
ths-0.073
 skies-0.072
ting-0.067
 highest-0.067
 two-0.067
Token spoiled
Token ID34781
Feature activation+9.61e-03

Pos logit contributions
 habit+0.004
oc+0.004
 agree+0.004
alk+0.004
orse+0.004
Neg logit contributions
ths-0.002
 skies-0.002
ting-0.002
 highest-0.002
ips-0.002
Token and
Token ID290
Feature activation+1.08e-02

Pos logit contributions
ths+0.016
 skies+0.016
 highest+0.015
ting+0.014
ips+0.014
Neg logit contributions
 habit-0.016
oc-0.016
 agree-0.015
alk-0.015
orse-0.015
Token didn
Token ID1422
Feature activation+1.19e-02

Pos logit contributions
ths+0.016
 skies+0.015
 two+0.015
 next+0.015
 both+0.015
Neg logit contributions
oc-0.021
 habit-0.021
alk-0.020
 agree-0.020
orse-0.019
Tokenried
Token ID2228
Feature activation+4.54e-03

Pos logit contributions
 habit+0.015
oc+0.014
 agree+0.014
orse+0.013
alk+0.013
Neg logit contributions
ths-0.028
 skies-0.028
ting-0.027
 next-0.026
 highest-0.026
Tokenly
Token ID306
Feature activation+4.99e-02

Pos logit contributions
ths+0.005
 skies+0.005
ips+0.004
There+0.004
ople+0.004
Neg logit contributions
 habit-0.010
oc-0.010
alk-0.010
 agree-0.009
 policy-0.009
Token,
Token ID11
Feature activation-4.43e-03

Pos logit contributions
ths+0.080
 skies+0.078
 highest+0.072
ting+0.071
ips+0.070
Neg logit contributions
 habit-0.088
oc-0.087
 agree-0.083
alk-0.082
orse-0.080
Token the
Token ID262
Feature activation-2.49e-03

Pos logit contributions
 habit+0.010
oc+0.010
alk+0.009
 agree+0.009
 policy+0.009
Neg logit contributions
 skies-0.005
 two-0.005
ths-0.005
ting-0.005
 next-0.005
Token grass
Token ID8701
Feature activation-3.14e-02

Pos logit contributions
oc+0.004
 habit+0.004
alk+0.004
 agree+0.004
orse+0.004
Neg logit contributions
 skies-0.004
ths-0.004
 two-0.004
 highest-0.004
 next-0.004
Tokenho
Token ID8873
Feature activation-1.34e-01

Pos logit contributions
 habit+0.048
oc+0.047
 agree+0.045
alk+0.043
orse+0.043
Neg logit contributions
 skies-0.060
ths-0.059
ting-0.056
 next-0.056
 two-0.055
Tokenpper
Token ID2848
Feature activation-5.02e-02

Pos logit contributions
 habit+0.228
oc+0.217
alk+0.213
 agree+0.212
 policy+0.201
Neg logit contributions
ths-0.227
 skies-0.222
ting-0.218
 next-0.214
 highest-0.211
Token rushed
Token ID13999
Feature activation+1.06e-02

Pos logit contributions
 habit+0.086
oc+0.084
alk+0.080
 agree+0.080
orse+0.078
Neg logit contributions
ths-0.083
 skies-0.083
ting-0.078
 highest-0.077
 two-0.076
Token over
Token ID625
Feature activation-2.35e-02

Pos logit contributions
ths+0.012
 skies+0.011
ips+0.009
ople+0.009
 highest+0.009
Neg logit contributions
 habit-0.024
oc-0.023
alk-0.023
 agree-0.023
 policy-0.022
Token and
Token ID290
Feature activation-3.37e-04

Pos logit contributions
 habit+0.050
oc+0.050
 agree+0.048
alk+0.048
orse+0.047
Neg logit contributions
ths-0.029
 skies-0.029
ting-0.026
 highest-0.026
 two-0.025
Token tried
Token ID3088
Feature activation-2.39e-02

Pos logit contributions
 habit+0.001
oc+0.001
 agree+0.001
orse+0.001
alk+0.001
Neg logit contributions
 skies-0.000
ths-0.000
ips-0.000
 highest-0.000
 such-0.000
Token and
Token ID290
Feature activation-9.84e-04

Pos logit contributions
oc+0.046
 habit+0.043
 agree+0.043
alk+0.042
orse+0.040
Neg logit contributions
ths-0.047
 skies-0.045
 highest-0.043
ting-0.043
ips-0.043
Token picked
Token ID6497
Feature activation+1.11e-02

Pos logit contributions
oc+0.001
 agree+0.001
 habit+0.001
orse+0.001
 believe+0.001
Neg logit contributions
 skies-0.002
ths-0.002
ips-0.002
 highest-0.002
 yourselves-0.002
Token up
Token ID510
Feature activation+3.62e-02

Pos logit contributions
ths+0.011
 skies+0.010
ips+0.009
 highest+0.008
ting+0.008
Neg logit contributions
oc-0.027
 habit-0.027
 agree-0.026
alk-0.025
 understand-0.025
Token a
Token ID257
Feature activation-3.83e-03

Pos logit contributions
ths+0.045
 skies+0.040
ips+0.040
ting+0.037
 highest+0.037
Neg logit contributions
oc-0.080
 habit-0.076
 agree-0.074
 understand-0.072
alk-0.071
Token grass
Token ID8701
Feature activation-2.96e-02

Pos logit contributions
alk+0.008
 agree+0.008
 habit+0.008
orse+0.008
oc+0.008
Neg logit contributions
 two-0.005
 highest-0.004
 back-0.004
 next-0.004
 on-0.004
Tokenho
Token ID8873
Feature activation-1.34e-01

Pos logit contributions
 habit+0.057
oc+0.055
 agree+0.052
orse+0.052
alk+0.052
Neg logit contributions
 skies-0.044
ths-0.043
 next-0.043
ting-0.042
 two-0.041
Tokenpper
Token ID2848
Feature activation-5.21e-02

Pos logit contributions
 habit+0.254
alk+0.233
 agree+0.227
 plan+0.225
 policy+0.223
Neg logit contributions
 next-0.206
ting-0.204
 back-0.198
ths-0.193
 highest-0.187
Token!
Token ID0
Feature activation+2.21e-02

Pos logit contributions
 habit+0.112
oc+0.110
alk+0.106
 agree+0.105
orse+0.103
Neg logit contributions
 skies-0.063
ths-0.062
 two-0.060
 next-0.060
ting-0.060
Token Both
Token ID5747
Feature activation+2.96e-02

Pos logit contributions
 skies+0.031
ths+0.030
 two+0.028
ting+0.028
 highest+0.028
Neg logit contributions
 habit-0.045
oc-0.044
alk-0.042
 agree-0.042
orse-0.041
Token friends
Token ID2460
Feature activation-5.25e-03

Pos logit contributions
ths+0.049
 skies+0.047
 highest+0.046
ips+0.045
ting+0.045
Neg logit contributions
 habit-0.049
 agree-0.047
oc-0.047
alk-0.044
orse-0.043
Token were
Token ID547
Feature activation-1.06e-02

Pos logit contributions
oc+0.010
 habit+0.010
alk+0.009
 policy+0.009
onest+0.009
Neg logit contributions
 both-0.008
 two-0.008
ths-0.008
 skies-0.008
ting-0.008
Token an
Token ID281
Feature activation-4.04e-02

Pos logit contributions
 habit+0.010
oc+0.010
 agree+0.009
orse+0.009
alk+0.009
Neg logit contributions
ths-0.005
 skies-0.004
ting-0.004
ips-0.004
 highest-0.004
Token ant
Token ID1885
Feature activation+8.07e-03

Pos logit contributions
 habit+0.082
oc+0.081
 agree+0.077
alk+0.076
orse+0.075
Neg logit contributions
ths-0.055
 skies-0.054
ting-0.050
 highest-0.049
ips-0.049
Token and
Token ID290
Feature activation+3.17e-03

Pos logit contributions
 next+0.013
ths+0.013
 skies+0.013
 two+0.012
ting+0.012
Neg logit contributions
 habit-0.014
alk-0.013
 agree-0.013
oc-0.013
onest-0.013
Token a
Token ID257
Feature activation+1.99e-03

Pos logit contributions
 two+0.004
 skies+0.004
ths+0.004
 next+0.004
 both+0.004
Neg logit contributions
oc-0.007
 habit-0.007
alk-0.007
 agree-0.007
 policy-0.006
Token grass
Token ID8701
Feature activation-2.66e-02

Pos logit contributions
 two+0.002
 next+0.002
 highest+0.002
 skies+0.002
Why+0.002
Neg logit contributions
alk-0.004
oc-0.004
onest-0.004
ool-0.004
 agree-0.004
Tokenho
Token ID8873
Feature activation-1.33e-01

Pos logit contributions
oc+0.043
 habit+0.043
 agree+0.042
alk+0.040
orse+0.040
Neg logit contributions
 skies-0.047
ths-0.044
 next-0.044
ting-0.044
 two-0.043
Tokenpper
Token ID2848
Feature activation-4.95e-02

Pos logit contributions
 habit+0.244
alk+0.228
 agree+0.224
 policy+0.222
oc+0.221
Neg logit contributions
 next-0.203
ting-0.203
ths-0.199
 skies-0.194
 back-0.191
Token.
Token ID13
Feature activation+2.28e-03

Pos logit contributions
 habit+0.105
alk+0.105
 agree+0.103
oc+0.101
 policy+0.101
Neg logit contributions
 two-0.059
 next-0.058
 skies-0.057
ting-0.057
 both-0.056
Token The
Token ID383
Feature activation-3.47e-03

Pos logit contributions
 skies+0.003
ths+0.003
 two+0.002
ting+0.002
 highest+0.002
Neg logit contributions
 habit-0.005
oc-0.005
 agree-0.005
orse-0.005
alk-0.005
Token ant
Token ID1885
Feature activation+8.23e-03

Pos logit contributions
oc+0.007
orse+0.006
alk+0.006
 habit+0.006
 agree+0.006
Neg logit contributions
 skies-0.005
 two-0.005
 next-0.005
 highest-0.005
 back-0.005
Token was
Token ID373
Feature activation-3.09e-03

Pos logit contributions
 skies+0.015
 next+0.015
ths+0.015
 two+0.015
 both+0.014
Neg logit contributions
 habit-0.012
oc-0.012
alk-0.012
 agree-0.012
 policy-0.011
Token busy
Token ID8179
Feature activation+2.26e-02

Pos logit contributions
 habit+0.003
orse+0.003
 agree+0.003
 understand+0.003
oc+0.003
Neg logit contributions
ths-0.002
 skies-0.002
ting-0.002
 highest-0.002
There-0.002
Token working
Token ID1762
Feature activation-1.26e-02

Pos logit contributions
ths+0.028
 skies+0.028
ting+0.025
 next+0.025
 two+0.025
Neg logit contributions
oc-0.048
 habit-0.048
alk-0.046
 agree-0.046
orse-0.045
Token when
Token ID618
Feature activation+2.13e-03

Pos logit contributions
 habit+0.029
oc+0.029
 agree+0.028
alk+0.028
orse+0.027
Neg logit contributions
ths-0.014
 skies-0.013
ting-0.012
 highest-0.012
 two-0.012
Token the
Token ID262
Feature activation+2.71e-03

Pos logit contributions
ths+0.003
 skies+0.003
ting+0.003
 highest+0.003
ips+0.003
Neg logit contributions
 habit-0.004
oc-0.004
alk-0.004
 agree-0.004
orse-0.004
Token grass
Token ID8701
Feature activation-2.68e-02

Pos logit contributions
ths+0.004
 skies+0.004
ting+0.003
ips+0.003
 such+0.003
Neg logit contributions
 habit-0.005
oc-0.005
 agree-0.005
alk-0.005
orse-0.005
Tokenho
Token ID8873
Feature activation-1.33e-01

Pos logit contributions
 habit+0.049
oc+0.048
 agree+0.047
alk+0.045
orse+0.045
Neg logit contributions
 skies-0.044
ths-0.042
 next-0.040
ting-0.040
 two-0.039
Tokenpper
Token ID2848
Feature activation-4.53e-02

Pos logit contributions
 habit+0.268
oc+0.264
 agree+0.253
alk+0.252
orse+0.247
Neg logit contributions
ths-0.181
 skies-0.178
ting-0.166
 highest-0.163
 two-0.161
Token saw
Token ID2497
Feature activation+7.03e-03

Pos logit contributions
 habit+0.074
oc+0.073
alk+0.069
 agree+0.069
orse+0.067
Neg logit contributions
ths-0.078
 skies-0.078
ting-0.073
 highest-0.072
 two-0.072
Token a
Token ID257
Feature activation-5.83e-03

Pos logit contributions
ths+0.006
 skies+0.006
ting+0.005
 highest+0.005
 two+0.005
Neg logit contributions
 habit-0.018
oc-0.018
 agree-0.017
alk-0.017
orse-0.017
Token switch
Token ID5078
Feature activation+1.78e-02

Pos logit contributions
alk+0.013
oc+0.012
 agree+0.012
orse+0.012
 habit+0.012
Neg logit contributions
 two-0.007
 skies-0.007
 highest-0.007
 back-0.007
ths-0.007
Token next
Token ID1306
Feature activation+3.74e-02

Pos logit contributions
ths+0.024
 skies+0.023
ting+0.021
 highest+0.021
 two+0.021
Neg logit contributions
 habit-0.036
oc-0.035
alk-0.034
 agree-0.034
orse-0.033
Token.
Token ID13
Feature activation+5.23e-03

Pos logit contributions
 habit+0.046
oc+0.045
 agree+0.043
alk+0.043
orse+0.042
Neg logit contributions
ths-0.049
 skies-0.048
ting-0.045
 highest-0.045
 two-0.044
Token It
Token ID632
Feature activation-8.33e-03

Pos logit contributions
ths+0.007
 skies+0.007
ting+0.007
 highest+0.007
 two+0.007
Neg logit contributions
 habit-0.010
oc-0.010
alk-0.010
 agree-0.010
orse-0.009
Token was
Token ID373
Feature activation-2.85e-03

Pos logit contributions
 habit+0.012
oc+0.012
 agree+0.011
alk+0.011
orse+0.011
Neg logit contributions
ths-0.016
 skies-0.016
ting-0.015
 highest-0.015
 such-0.015
Token a
Token ID257
Feature activation-3.68e-03

Pos logit contributions
 habit+0.007
oc+0.007
 agree+0.007
alk+0.007
orse+0.007
Neg logit contributions
ths-0.002
 skies-0.002
ting-0.002
 highest-0.002
 two-0.002
Token grass
Token ID8701
Feature activation-2.75e-02

Pos logit contributions
alk+0.008
 agree+0.007
 habit+0.007
oc+0.007
orse+0.007
Neg logit contributions
 two-0.005
 skies-0.005
 highest-0.005
ths-0.004
 next-0.004
Tokenho
Token ID8873
Feature activation-1.33e-01

Pos logit contributions
 habit+0.051
oc+0.049
 agree+0.048
alk+0.047
orse+0.046
Neg logit contributions
 skies-0.043
ths-0.042
ting-0.040
 next-0.040
 highest-0.039
Tokenpper
Token ID2848
Feature activation-4.72e-02

Pos logit contributions
 habit+0.241
oc+0.223
alk+0.220
 agree+0.219
 policy+0.211
Neg logit contributions
ting-0.213
 next-0.210
ths-0.208
 skies-0.203
 highest-0.200
Token!
Token ID0
Feature activation+2.73e-02

Pos logit contributions
 habit+0.099
oc+0.097
alk+0.097
 agree+0.096
onest+0.090
Neg logit contributions
 skies-0.059
 next-0.059
 two-0.058
ting-0.058
ths-0.057
Token Tom
Token ID4186
Feature activation+1.21e-02

Pos logit contributions
ths+0.035
 skies+0.034
ting+0.032
 highest+0.031
 next+0.030
Neg logit contributions
oc-0.057
 habit-0.057
 agree-0.055
alk-0.054
orse-0.053
Token was
Token ID373
Feature activation-3.66e-03

Pos logit contributions
ths+0.021
 skies+0.021
 two+0.020
ting+0.020
 next+0.020
Neg logit contributions
 habit-0.020
oc-0.019
alk-0.019
 agree-0.018
orse-0.018
Token so
Token ID523
Feature activation-6.54e-03

Pos logit contributions
 habit+0.007
 understand+0.007
 agree+0.007
orse+0.007
 plan+0.007
Neg logit contributions
ths-0.005
 skies-0.005
ips-0.004
ople-0.004
ays-0.004
Token€
Token ID26391
Feature activation+5.67e-03

Pos logit contributions
ths+0.003
 skies+0.003
ting+0.002
 highest+0.002
 two+0.002
Neg logit contributions
 habit-0.004
oc-0.003
 agree-0.003
alk-0.003
orse-0.003
Token�
Token ID129
Feature activation+4.73e-02

Pos logit contributions
ths+0.009
 skies+0.009
 highest+0.008
ting+0.008
 two+0.008
Neg logit contributions
 habit-0.010
oc-0.010
alk-0.009
 agree-0.009
orse-0.009
Token�
Token ID241
Feature activation-2.01e-02

Pos logit contributions
ths+0.074
 skies+0.073
ting+0.067
 two+0.066
 highest+0.066
Neg logit contributions
 habit-0.086
oc-0.085
 agree-0.081
alk-0.080
orse-0.079
TokenGr
Token ID8642
Feature activation-3.91e-03

Pos logit contributions
 habit+0.045
oc+0.044
alk+0.043
 agree+0.042
orse+0.041
Neg logit contributions
 skies-0.023
ths-0.023
ting-0.021
 two-0.021
 highest-0.021
Tokenass
Token ID562
Feature activation+2.81e-02

Pos logit contributions
 habit+0.008
 agree+0.007
oc+0.007
orse+0.007
 plan+0.007
Neg logit contributions
ips-0.005
 skies-0.005
 such-0.005
ags-0.005
 two-0.005
Tokenho
Token ID8873
Feature activation-1.33e-01

Pos logit contributions
ths+0.046
 skies+0.045
 two+0.043
ting+0.042
 next+0.042
Neg logit contributions
 habit-0.050
oc-0.048
 agree-0.046
alk-0.046
orse-0.044
Tokenpper
Token ID2848
Feature activation-4.78e-02

Pos logit contributions
 habit+0.281
alk+0.267
oc+0.266
 agree+0.263
orse+0.250
Neg logit contributions
ths-0.171
 skies-0.164
ting-0.160
 next-0.158
 two-0.156
Token,
Token ID11
Feature activation-3.90e-03

Pos logit contributions
 habit+0.095
alk+0.092
oc+0.090
 agree+0.090
orse+0.086
Neg logit contributions
ths-0.065
 skies-0.064
 two-0.063
 next-0.062
 such-0.061
Token let
Token ID1309
Feature activation+5.00e-03

Pos logit contributions
 habit+0.009
oc+0.009
alk+0.008
 agree+0.008
orse+0.008
Neg logit contributions
 skies-0.004
ths-0.004
 two-0.004
 next-0.004
 such-0.004
Tokenâ
Token ID22940
Feature activation-1.40e-02

Pos logit contributions
ths+0.006
 skies+0.006
 highest+0.005
ips+0.005
ting+0.005
Neg logit contributions
oc-0.011
 habit-0.011
 agree-0.011
orse-0.010
alk-0.010
Token€
Token ID26391
Feature activation+5.27e-03

Pos logit contributions
 habit+0.029
oc+0.028
alk+0.027
orse+0.026
 policy+0.026
Neg logit contributions
ths-0.018
 skies-0.018
ting-0.017
 next-0.017
 such-0.016
Token and
Token ID290
Feature activation-3.79e-03

Pos logit contributions
ths+0.001
 skies+0.001
ting+0.001
 highest+0.001
 next+0.001
Neg logit contributions
 habit-0.001
oc-0.001
alk-0.001
 agree-0.001
orse-0.001
Token saw
Token ID2497
Feature activation-1.99e-03

Pos logit contributions
oc+0.005
 habit+0.005
 agree+0.005
orse+0.005
 understand+0.004
Neg logit contributions
 skies-0.008
ths-0.007
ips-0.007
 highest-0.007
ags-0.007
Token a
Token ID257
Feature activation-6.31e-03

Pos logit contributions
 habit+0.005
oc+0.005
 agree+0.005
alk+0.005
orse+0.005
Neg logit contributions
ths-0.002
 skies-0.002
ting-0.001
 highest-0.001
ips-0.001
Token huge
Token ID3236
Feature activation-1.73e-02

Pos logit contributions
alk+0.014
 habit+0.013
 agree+0.013
oc+0.013
orse+0.013
Neg logit contributions
 two-0.008
 skies-0.007
 back-0.007
 highest-0.007
 next-0.007
Token grass
Token ID8701
Feature activation-3.09e-02

Pos logit contributions
 habit+0.033
alk+0.032
oc+0.031
orse+0.031
 agree+0.031
Neg logit contributions
 skies-0.025
 two-0.024
ths-0.024
 back-0.024
 next-0.023
Tokenho
Token ID8873
Feature activation-1.33e-01

Pos logit contributions
 habit+0.045
oc+0.043
 agree+0.042
alk+0.040
orse+0.039
Neg logit contributions
 skies-0.062
ths-0.061
ting-0.059
 next-0.058
 two-0.058
Tokenpper
Token ID2848
Feature activation-5.04e-02

Pos logit contributions
 habit+0.240
 agree+0.220
alk+0.219
oc+0.219
 policy+0.210
Neg logit contributions
ting-0.212
 next-0.210
ths-0.209
 skies-0.201
 highest-0.198
Token,
Token ID11
Feature activation-7.34e-03

Pos logit contributions
 habit+0.107
oc+0.106
alk+0.103
 agree+0.102
onest+0.098
Neg logit contributions
 skies-0.062
ths-0.060
 next-0.060
 two-0.060
ting-0.059
Token who
Token ID508
Feature activation+1.68e-02

Pos logit contributions
 habit+0.015
oc+0.015
alk+0.015
 agree+0.015
orse+0.014
Neg logit contributions
 skies-0.010
ths-0.009
 two-0.009
 next-0.009
ting-0.009
Token had
Token ID550
Feature activation-1.00e-02

Pos logit contributions
ths+0.028
 skies+0.027
 two+0.026
 both+0.026
 next+0.025
Neg logit contributions
 habit-0.031
alk-0.029
 agree-0.028
oc-0.028
 policy-0.027
Token been
Token ID587
Feature activation+6.92e-03

Pos logit contributions
 habit+0.021
oc+0.021
alk+0.020
 agree+0.020
orse+0.020
Neg logit contributions
ths-0.013
 skies-0.012
ting-0.012
 highest-0.011
ips-0.011
Token introduced
Token ID5495
Feature activation-2.63e-02

Pos logit contributions
 habit+0.007
orse+0.007
 agree+0.007
alk+0.007
 understand+0.007
Neg logit contributions
ths-0.006
 skies-0.006
ting-0.006
There-0.005
 highest-0.005
Token to
Token ID284
Feature activation-1.85e-03

Pos logit contributions
oc+0.043
 agree+0.043
 habit+0.043
alk+0.043
 policy+0.041
Neg logit contributions
ths-0.042
ays-0.039
ting-0.038
 skies-0.038
ips-0.037
Token a
Token ID257
Feature activation-1.20e-03

Pos logit contributions
 habit+0.004
oc+0.004
alk+0.004
 agree+0.004
orse+0.004
Neg logit contributions
ths-0.002
 skies-0.002
ting-0.002
 highest-0.001
 two-0.001
Token smooth
Token ID7209
Feature activation-6.93e-03

Pos logit contributions
oc+0.002
alk+0.002
 habit+0.002
orse+0.002
 agree+0.002
Neg logit contributions
 highest-0.002
 skies-0.002
ths-0.002
 next-0.002
 two-0.002
Token grass
Token ID8701
Feature activation-2.83e-02

Pos logit contributions
oc+0.012
orse+0.012
 agree+0.012
 understand+0.011
onest+0.011
Neg logit contributions
 skies-0.011
 me-0.011
 next-0.011
 two-0.010
 highest-0.010
Tokenho
Token ID8873
Feature activation-1.32e-01

Pos logit contributions
 habit+0.052
oc+0.051
 agree+0.049
alk+0.049
orse+0.048
Neg logit contributions
ths-0.043
 skies-0.043
ting-0.040
 highest-0.040
 next-0.039
Tokenpper
Token ID2848
Feature activation-5.17e-02

Pos logit contributions
 habit+0.242
alk+0.227
 agree+0.222
 policy+0.217
 bond+0.216
Neg logit contributions
 next-0.207
ting-0.199
 back-0.191
ths-0.191
 highest-0.187
Token called
Token ID1444
Feature activation-2.22e-02

Pos logit contributions
 habit+0.101
oc+0.099
alk+0.096
 agree+0.095
orse+0.093
Neg logit contributions
 skies-0.071
ths-0.069
 next-0.069
ting-0.068
 two-0.067
Token Jim
Token ID5395
Feature activation-8.40e-03

Pos logit contributions
 habit+0.046
oc+0.046
 agree+0.044
alk+0.043
orse+0.042
Neg logit contributions
 skies-0.028
ths-0.027
ting-0.027
 two-0.027
 such-0.026
Token.
Token ID13
Feature activation-2.37e-03

Pos logit contributions
 habit+0.016
oc+0.015
alk+0.015
 agree+0.015
orse+0.014
Neg logit contributions
ths-0.012
 skies-0.012
ting-0.012
 next-0.012
 two-0.012
Token Ted
Token ID11396
Feature activation+1.51e-02

Pos logit contributions
 habit+0.005
oc+0.005
 policy+0.004
alk+0.004
orse+0.004
Neg logit contributions
 skies-0.003
 two-0.003
ths-0.003
 next-0.003
 both-0.003
Token
Token ID198
Feature activation-4.15e-03

Pos logit contributions
 habit+0.012
oc+0.011
alk+0.011
 agree+0.010
 policy+0.010
Neg logit contributions
 skies-0.006
There-0.006
 two-0.006
ths-0.006
 next-0.006
TokenThe
Token ID464
Feature activation+1.51e-02

Pos logit contributions
oc+0.008
 habit+0.008
 agree+0.007
alk+0.007
orse+0.007
Neg logit contributions
ths-0.006
 skies-0.006
ting-0.006
 highest-0.006
 such-0.006
Token nos
Token ID43630
Feature activation+4.64e-02

Pos logit contributions
 skies+0.021
ths+0.020
 two+0.019
 next+0.019
 highest+0.018
Neg logit contributions
 habit-0.031
oc-0.031
alk-0.030
 agree-0.029
orse-0.029
Tokeny
Token ID88
Feature activation-3.32e-02

Pos logit contributions
 skies+0.086
ths+0.084
ting+0.083
 two+0.080
 such+0.079
Neg logit contributions
 habit-0.073
oc-0.067
 agree-0.066
orse-0.065
alk-0.065
Token grass
Token ID8701
Feature activation-2.65e-02

Pos logit contributions
alk+0.058
 habit+0.057
 agree+0.057
oc+0.057
orse+0.055
Neg logit contributions
 skies-0.055
 next-0.052
 two-0.052
ting-0.050
 me-0.050
Tokenho
Token ID8873
Feature activation-1.31e-01

Pos logit contributions
 habit+0.041
oc+0.039
 agree+0.039
alk+0.037
 understand+0.036
Neg logit contributions
 skies-0.052
ths-0.050
 next-0.049
 two-0.048
ting-0.048
Tokenpper
Token ID2848
Feature activation-4.39e-02

Pos logit contributions
 habit+0.202
oc+0.198
 agree+0.186
alk+0.186
orse+0.181
Neg logit contributions
ths-0.242
 skies-0.240
ting-0.226
 highest-0.224
 two-0.222
Token hopped
Token ID45230
Feature activation+8.99e-03

Pos logit contributions
 habit+0.071
oc+0.069
alk+0.066
 agree+0.065
orse+0.063
Neg logit contributions
ths-0.078
 skies-0.077
ting-0.073
 two-0.072
 highest-0.072
Token over
Token ID625
Feature activation-2.24e-02

Pos logit contributions
ths+0.008
 skies+0.008
ips+0.006
 highest+0.006
ting+0.006
Neg logit contributions
 habit-0.022
oc-0.022
alk-0.021
 agree-0.021
orse-0.021
Token to
Token ID284
Feature activation-9.44e-04

Pos logit contributions
oc+0.044
 habit+0.044
 agree+0.043
alk+0.042
orse+0.041
Neg logit contributions
ths-0.031
 skies-0.028
 highest-0.027
ips-0.026
ting-0.025
Token Tom
Token ID4186
Feature activation+1.18e-02

Pos logit contributions
 habit+0.002
oc+0.002
alk+0.002
orse+0.002
 agree+0.002
Neg logit contributions
 skies-0.001
ting-0.001
ths-0.001
 two-0.001
 such-0.001
Token so
Token ID523
Feature activation-6.33e-03

Pos logit contributions
 habit+0.008
 agree+0.007
orse+0.007
 understand+0.007
oc+0.007
Neg logit contributions
ths-0.005
 skies-0.005
ting-0.004
There-0.004
ips-0.004
Token happy
Token ID3772
Feature activation-6.76e-03

Pos logit contributions
 habit+0.011
oc+0.011
alk+0.010
 agree+0.010
orse+0.010
Neg logit contributions
ths-0.010
 skies-0.010
 two-0.010
 highest-0.010
 next-0.010
Token that
Token ID326
Feature activation+5.45e-03

Pos logit contributions
 habit+0.013
oc+0.013
 agree+0.012
alk+0.012
orse+0.012
Neg logit contributions
ths-0.010
 skies-0.010
 highest-0.009
ting-0.009
 two-0.009
Token the
Token ID262
Feature activation-1.01e-03

Pos logit contributions
ths+0.010
 skies+0.010
 two+0.009
 next+0.009
ting+0.009
Neg logit contributions
 habit-0.009
oc-0.009
alk-0.008
 agree-0.008
orse-0.008
Token grass
Token ID8701
Feature activation-2.67e-02

Pos logit contributions
 habit+0.002
oc+0.002
alk+0.002
 agree+0.002
orse+0.002
Neg logit contributions
ths-0.001
 skies-0.001
ting-0.001
 highest-0.001
 two-0.001
Tokenho
Token ID8873
Feature activation-1.30e-01

Pos logit contributions
 habit+0.048
oc+0.047
 agree+0.046
alk+0.044
orse+0.042
Neg logit contributions
 skies-0.045
ths-0.043
 next-0.043
ting-0.042
 two-0.041
Tokenpper
Token ID2848
Feature activation-4.50e-02

Pos logit contributions
oc+0.216
 habit+0.215
 agree+0.199
alk+0.199
orse+0.198
Neg logit contributions
ths-0.223
 skies-0.221
 highest-0.204
 such-0.202
ting-0.202
Token had
Token ID550
Feature activation-9.63e-03

Pos logit contributions
 habit+0.069
oc+0.068
alk+0.063
 agree+0.063
orse+0.062
Neg logit contributions
ths-0.083
 skies-0.082
ting-0.077
 highest-0.077
 two-0.076
Token said
Token ID531
Feature activation+3.30e-03

Pos logit contributions
 habit+0.018
alk+0.017
 agree+0.017
orse+0.017
oc+0.017
Neg logit contributions
ths-0.014
 skies-0.013
ting-0.013
ips-0.012
 highest-0.012
Token yes
Token ID3763
Feature activation+5.42e-03

Pos logit contributions
 skies+0.005
ths+0.005
 two+0.005
 such+0.005
ting+0.005
Neg logit contributions
 habit-0.006
oc-0.006
alk-0.005
 agree-0.005
orse-0.005
Token.
Token ID13
Feature activation-1.08e-02

Pos logit contributions
ths+0.009
 skies+0.009
ting+0.008
 highest+0.008
ips+0.008
Neg logit contributions
 habit-0.009
oc-0.009
 agree-0.009
alk-0.009
orse-0.009
Token in
Token ID287
Feature activation-1.15e-02

Pos logit contributions
ths+0.006
 skies+0.006
 highest+0.006
ips+0.006
There+0.005
Neg logit contributions
oc-0.008
 habit-0.008
 agree-0.008
alk-0.007
orse-0.007
Token circles
Token ID13332
Feature activation+6.77e-03

Pos logit contributions
oc+0.024
 habit+0.024
 agree+0.024
alk+0.022
 plan+0.022
Neg logit contributions
ths-0.015
 skies-0.012
ips-0.012
ays-0.011
ople-0.011
Token with
Token ID351
Feature activation-1.19e-02

Pos logit contributions
ths+0.011
 skies+0.011
 highest+0.010
ting+0.010
ips+0.010
Neg logit contributions
 habit-0.011
oc-0.011
 agree-0.011
alk-0.010
orse-0.010
Token the
Token ID262
Feature activation-1.19e-03

Pos logit contributions
 habit+0.025
oc+0.025
 agree+0.024
orse+0.023
alk+0.023
Neg logit contributions
ths-0.016
 skies-0.014
 highest-0.013
ips-0.013
ting-0.013
Token grass
Token ID8701
Feature activation-2.80e-02

Pos logit contributions
oc+0.003
 habit+0.003
alk+0.003
 agree+0.003
orse+0.003
Neg logit contributions
 skies-0.001
 two-0.001
 highest-0.001
ths-0.001
ting-0.001
Tokenho
Token ID8873
Feature activation-1.30e-01

Pos logit contributions
 habit+0.050
oc+0.049
alk+0.047
 agree+0.047
orse+0.046
Neg logit contributions
ths-0.044
 skies-0.043
ting-0.040
 highest-0.040
 such-0.040
Tokenpper
Token ID2848
Feature activation-4.60e-02

Pos logit contributions
 habit+0.212
oc+0.204
alk+0.196
 agree+0.195
 policy+0.185
Neg logit contributions
ths-0.228
 skies-0.223
ting-0.218
 next-0.213
 highest-0.213
Token.
Token ID13
Feature activation-4.11e-03

Pos logit contributions
 habit+0.085
oc+0.083
 agree+0.079
alk+0.079
orse+0.078
Neg logit contributions
ths-0.071
 skies-0.069
ting-0.063
 highest-0.063
ips-0.062
Token They
Token ID1119
Feature activation-5.20e-03

Pos logit contributions
 habit+0.007
oc+0.007
 agree+0.007
alk+0.007
orse+0.007
Neg logit contributions
ths-0.007
 skies-0.006
ting-0.006
 highest-0.006
 next-0.006
Token jumped
Token ID11687
Feature activation+4.79e-03

Pos logit contributions
oc+0.009
 habit+0.009
alk+0.009
 agree+0.008
orse+0.008
Neg logit contributions
ths-0.008
 skies-0.008
ting-0.008
 highest-0.008
 two-0.008
Token and
Token ID290
Feature activation+3.22e-03

Pos logit contributions
ths+0.008
 skies+0.007
ople+0.007
ting+0.007
 yourselves+0.007
Neg logit contributions
 habit-0.008
oc-0.008
 agree-0.008
alk-0.008
orse-0.007
Token and
Token ID290
Feature activation-4.32e-03

Pos logit contributions
ths+0.075
 skies+0.074
 highest+0.069
ting+0.068
ips+0.067
Neg logit contributions
 habit-0.087
oc-0.086
 agree-0.082
alk-0.080
orse-0.079
Token squ
Token ID2809
Feature activation+2.13e-02

Pos logit contributions
 habit+0.008
oc+0.007
alk+0.007
 agree+0.007
orse+0.007
Neg logit contributions
ths-0.007
 skies-0.007
ting-0.007
 highest-0.007
 two-0.007
Tokenished
Token ID1348
Feature activation+1.38e-02

Pos logit contributions
ths+0.035
 skies+0.035
 highest+0.033
ting+0.032
 next+0.032
Neg logit contributions
 habit-0.037
oc-0.037
 agree-0.035
alk-0.034
orse-0.034
Token the
Token ID262
Feature activation-1.52e-03

Pos logit contributions
ths+0.016
 skies+0.016
 highest+0.014
ting+0.014
 two+0.014
Neg logit contributions
 habit-0.030
oc-0.030
 agree-0.029
alk-0.029
orse-0.028
Token grass
Token ID8701
Feature activation-2.78e-02

Pos logit contributions
oc+0.003
 habit+0.003
alk+0.003
 agree+0.003
orse+0.003
Neg logit contributions
 skies-0.002
ths-0.002
 two-0.002
 highest-0.002
ting-0.002
Tokenho
Token ID8873
Feature activation-1.28e-01

Pos logit contributions
 habit+0.062
oc+0.062
 agree+0.059
alk+0.059
orse+0.058
Neg logit contributions
ths-0.031
 skies-0.031
ting-0.029
 highest-0.028
 next-0.028
Tokenppers
Token ID11799
Feature activation-1.61e-02

Pos logit contributions
 habit+0.208
oc+0.198
alk+0.194
 agree+0.192
orse+0.182
Neg logit contributions
ths-0.228
 skies-0.220
ting-0.219
 next-0.213
 highest-0.211
Token with
Token ID351
Feature activation-1.20e-02

Pos logit contributions
 habit+0.042
oc+0.042
 agree+0.040
alk+0.040
orse+0.039
Neg logit contributions
ths-0.012
 skies-0.012
ting-0.010
 highest-0.010
 two-0.010
Token her
Token ID607
Feature activation+4.89e-04

Pos logit contributions
 habit+0.028
oc+0.028
 agree+0.027
alk+0.027
orse+0.026
Neg logit contributions
ths-0.012
 skies-0.012
ting-0.011
 highest-0.011
 next-0.010
Token fingers
Token ID9353
Feature activation+2.23e-02

Pos logit contributions
 skies+0.001
ths+0.001
 two+0.001
 highest+0.001
 next+0.001
Neg logit contributions
oc-0.001
 habit-0.001
alk-0.001
 agree-0.001
orse-0.001
Token to
Token ID284
Feature activation-2.88e-03

Pos logit contributions
ths+0.032
 skies+0.032
ting+0.030
 two+0.030
 next+0.030
Neg logit contributions
oc-0.043
 habit-0.043
alk-0.040
 agree-0.040
orse-0.039
Token lady
Token ID10846
Feature activation-1.53e-02

Pos logit contributions
oc+0.004
 habit+0.004
 agree+0.004
alk+0.004
 policy+0.004
Neg logit contributions
 two-0.003
 highest-0.003
 skies-0.003
 next-0.003
 back-0.003
Tokenbug
Token ID25456
Feature activation-2.64e-02

Pos logit contributions
 habit+0.029
oc+0.027
alk+0.027
 agree+0.026
orse+0.025
Neg logit contributions
ths-0.023
 skies-0.023
ting-0.022
 next-0.021
 two-0.021
Token and
Token ID290
Feature activation+2.07e-03

Pos logit contributions
 habit+0.049
oc+0.047
 agree+0.046
alk+0.046
orse+0.044
Neg logit contributions
 skies-0.040
ths-0.040
ting-0.037
 next-0.037
 two-0.037
Token a
Token ID257
Feature activation-2.69e-03

Pos logit contributions
ths+0.002
 skies+0.002
ting+0.001
 highest+0.001
 two+0.001
Neg logit contributions
 habit-0.005
oc-0.005
alk-0.005
 agree-0.005
orse-0.005
Token grass
Token ID8701
Feature activation-2.71e-02

Pos logit contributions
oc+0.005
alk+0.005
 habit+0.005
 agree+0.005
onest+0.005
Neg logit contributions
 two-0.004
 highest-0.004
 skies-0.004
 back-0.004
 next-0.004
Tokenho
Token ID8873
Feature activation-1.28e-01

Pos logit contributions
 habit+0.043
oc+0.042
 agree+0.040
alk+0.039
orse+0.038
Neg logit contributions
 skies-0.049
ths-0.048
ting-0.047
 next-0.046
 two-0.046
Tokenpper
Token ID2848
Feature activation-4.63e-02

Pos logit contributions
 habit+0.225
alk+0.208
 agree+0.206
oc+0.206
 policy+0.200
Neg logit contributions
ting-0.209
ths-0.205
 next-0.202
 skies-0.201
 two-0.194
Token.
Token ID13
Feature activation+2.77e-03

Pos logit contributions
 habit+0.094
oc+0.092
 agree+0.089
alk+0.088
orse+0.086
Neg logit contributions
 skies-0.061
ting-0.059
 two-0.059
 next-0.059
ths-0.058
Token
Token ID198
Feature activation+1.12e-03

Pos logit contributions
ths+0.004
 skies+0.004
ting+0.004
 highest+0.004
 two+0.004
Neg logit contributions
 habit-0.005
oc-0.005
 agree-0.005
alk-0.005
orse-0.005
Token
Token ID198
Feature activation+1.80e-03

Pos logit contributions
 next+0.001
There+0.001
 two+0.001
 skies+0.001
Why+0.001
Neg logit contributions
 habit-0.003
oc-0.002
alk-0.002
 policy-0.002
 connection-0.002
TokenOne
Token ID3198
Feature activation+1.27e-02

Pos logit contributions
 skies+0.002
 next+0.002
ths+0.002
 two+0.002
There+0.002
Neg logit contributions
 habit-0.004
oc-0.004
alk-0.004
 policy-0.004
 agree-0.004
Token?"
Token ID1701
Feature activation+1.42e-02

Pos logit contributions
ths+0.026
 skies+0.026
ting+0.024
 highest+0.024
 two+0.023
Neg logit contributions
 habit-0.032
oc-0.032
 agree-0.030
alk-0.030
orse-0.029
Token
Token ID198
Feature activation-2.68e-03

Pos logit contributions
ths+0.022
 skies+0.022
ting+0.020
 highest+0.020
ips+0.020
Neg logit contributions
 habit-0.026
oc-0.026
 agree-0.024
alk-0.024
orse-0.024
Token
Token ID198
Feature activation-2.76e-03

Pos logit contributions
 habit+0.005
oc+0.005
alk+0.005
 agree+0.005
orse+0.005
Neg logit contributions
ths-0.004
 skies-0.004
 two-0.004
 next-0.003
There-0.003
TokenThe
Token ID464
Feature activation+1.55e-02

Pos logit contributions
 habit+0.006
oc+0.006
alk+0.006
 agree+0.006
orse+0.006
Neg logit contributions
ths-0.003
 skies-0.003
ting-0.003
 highest-0.003
 two-0.003
Token grass
Token ID8701
Feature activation-2.38e-02

Pos logit contributions
ths+0.023
 skies+0.023
ting+0.021
 highest+0.021
 two+0.021
Neg logit contributions
 habit-0.029
oc-0.029
alk-0.028
 agree-0.028
orse-0.027
Tokenho
Token ID8873
Feature activation-1.28e-01

Pos logit contributions
 habit+0.029
oc+0.027
 agree+0.026
alk+0.025
orse+0.024
Neg logit contributions
ths-0.053
 skies-0.053
 next-0.051
 two-0.050
ting-0.049
Tokenpper
Token ID2848
Feature activation-4.53e-02

Pos logit contributions
 habit+0.214
oc+0.204
alk+0.198
 agree+0.197
orse+0.189
Neg logit contributions
ths-0.223
 skies-0.218
ting-0.210
 next-0.205
 highest-0.204
Token said
Token ID531
Feature activation+6.91e-03

Pos logit contributions
 habit+0.076
oc+0.073
alk+0.071
 agree+0.069
 policy+0.069
Neg logit contributions
ths-0.078
 skies-0.077
 two-0.073
 next-0.072
ting-0.071
Token,
Token ID11
Feature activation-1.40e-03

Pos logit contributions
ths+0.011
 skies+0.011
 next+0.010
 two+0.010
ting+0.010
Neg logit contributions
oc-0.012
 habit-0.012
alk-0.012
 agree-0.012
orse-0.011
Token "
Token ID366
Feature activation+1.47e-04

Pos logit contributions
 habit+0.002
oc+0.002
alk+0.002
 agree+0.002
orse+0.002
Neg logit contributions
 skies-0.002
ths-0.002
ting-0.002
 next-0.002
 two-0.002
TokenI
Token ID40
Feature activation-1.39e-02

Pos logit contributions
 skies+0.000
ths+0.000
ips+0.000
 back+0.000
 such+0.000
Neg logit contributions
oc-0.000
 habit-0.000
orse-0.000
 agree-0.000
 understand-0.000
Token."
Token ID526
Feature activation-3.30e-03

Pos logit contributions
 habit+0.019
oc+0.018
 agree+0.017
alk+0.017
orse+0.017
Neg logit contributions
ths-0.016
 skies-0.016
ting-0.015
 highest-0.015
 two-0.014
Token
Token ID198
Feature activation-8.22e-03

Pos logit contributions
 habit+0.006
oc+0.006
 agree+0.006
alk+0.006
orse+0.005
Neg logit contributions
ths-0.005
 skies-0.005
ting-0.005
 highest-0.005
 two-0.005
Token
Token ID198
Feature activation-7.57e-03

Pos logit contributions
 habit+0.017
oc+0.016
alk+0.016
 policy+0.016
 agree+0.016
Neg logit contributions
 skies-0.010
ths-0.010
 two-0.010
 next-0.010
There-0.010
TokenThe
Token ID464
Feature activation+1.27e-02

Pos logit contributions
 habit+0.018
oc+0.017
alk+0.017
 agree+0.017
orse+0.016
Neg logit contributions
ths-0.008
 skies-0.008
ting-0.007
 highest-0.007
 two-0.007
Token grass
Token ID8701
Feature activation-2.46e-02

Pos logit contributions
ths+0.018
 skies+0.018
 two+0.018
 next+0.017
 highest+0.017
Neg logit contributions
oc-0.025
 habit-0.024
alk-0.024
 agree-0.023
orse-0.023
Tokenho
Token ID8873
Feature activation-1.28e-01

Pos logit contributions
 habit+0.031
oc+0.030
 agree+0.028
alk+0.027
orse+0.026
Neg logit contributions
ths-0.054
 skies-0.054
 next-0.051
 two-0.051
ting-0.050
Tokenpper
Token ID2848
Feature activation-4.65e-02

Pos logit contributions
 habit+0.201
oc+0.198
 agree+0.186
alk+0.186
orse+0.182
Neg logit contributions
ths-0.230
 skies-0.228
ting-0.214
 highest-0.213
 two-0.210
Token sighed
Token ID21893
Feature activation+8.83e-03

Pos logit contributions
 habit+0.074
oc+0.072
alk+0.068
 agree+0.068
orse+0.066
Neg logit contributions
ths-0.083
 skies-0.082
ting-0.077
 highest-0.077
 two-0.076
Token and
Token ID290
Feature activation-1.67e-03

Pos logit contributions
ths+0.012
 skies+0.012
 next+0.012
 two+0.011
 back+0.011
Neg logit contributions
oc-0.018
 habit-0.017
alk-0.017
orse-0.016
 agree-0.016
Token decided
Token ID3066
Feature activation-1.04e-02

Pos logit contributions
 habit+0.003
oc+0.003
alk+0.002
 agree+0.002
orse+0.002
Neg logit contributions
ths-0.003
 skies-0.003
ting-0.003
 highest-0.003
 two-0.003
Token to
Token ID284
Feature activation+1.11e-03

Pos logit contributions
 habit+0.022
oc+0.021
 agree+0.021
alk+0.020
orse+0.020
Neg logit contributions
ths-0.013
 skies-0.013
 highest-0.012
ting-0.012
ips-0.012
TokenThen
Token ID6423
Feature activation+3.36e-02

Pos logit contributions
 habit+0.002
oc+0.002
 agree+0.002
alk+0.002
orse+0.002
Neg logit contributions
ths-0.001
 skies-0.001
ting-0.001
 highest-0.001
 such-0.001
Token Baby
Token ID14801
Feature activation-1.12e-02

Pos logit contributions
ths+0.064
 highest+0.064
 skies+0.062
ips+0.062
ting+0.061
Neg logit contributions
 habit-0.048
oc-0.045
 agree-0.042
orse-0.042
 understand-0.039
Token saw
Token ID2497
Feature activation+4.18e-03

Pos logit contributions
 habit+0.017
oc+0.017
 agree+0.016
orse+0.016
alk+0.015
Neg logit contributions
ths-0.020
 skies-0.020
 highest-0.019
ips-0.019
 such-0.019
Token some
Token ID617
Feature activation-2.74e-02

Pos logit contributions
ths+0.004
 skies+0.004
 highest+0.004
ting+0.003
ips+0.003
Neg logit contributions
 habit-0.010
oc-0.010
 agree-0.010
alk-0.010
orse-0.009
Token grass
Token ID8701
Feature activation-2.39e-02

Pos logit contributions
oc+0.062
 habit+0.060
alk+0.058
orse+0.057
 agree+0.057
Neg logit contributions
 skies-0.033
ths-0.031
 two-0.031
ting-0.029
 next-0.029
Tokenho
Token ID8873
Feature activation-1.27e-01

Pos logit contributions
oc+0.046
 habit+0.046
 agree+0.043
alk+0.043
orse+0.043
Neg logit contributions
 skies-0.035
ths-0.035
ting-0.033
 next-0.032
 two-0.032
Tokenppers
Token ID11799
Feature activation-1.12e-02

Pos logit contributions
 habit+0.202
oc+0.195
alk+0.188
 agree+0.185
orse+0.177
Neg logit contributions
ths-0.229
 skies-0.224
ting-0.219
 next-0.213
 highest-0.212
Token in
Token ID287
Feature activation-8.86e-03

Pos logit contributions
 habit+0.028
oc+0.028
 agree+0.026
alk+0.026
orse+0.026
Neg logit contributions
ths-0.010
 skies-0.010
ting-0.009
 next-0.009
 two-0.009
Token the
Token ID262
Feature activation+1.87e-03

Pos logit contributions
 habit+0.024
oc+0.024
orse+0.023
alk+0.023
 agree+0.023
Neg logit contributions
 skies-0.006
ths-0.006
 two-0.006
 such-0.005
ting-0.005
Token garden
Token ID11376
Feature activation-1.12e-02

Pos logit contributions
ths+0.003
 skies+0.003
ips+0.003
ting+0.003
 such+0.003
Neg logit contributions
 habit-0.003
oc-0.003
alk-0.003
 agree-0.003
orse-0.003
Token.
Token ID13
Feature activation+2.34e-03

Pos logit contributions
oc+0.023
 habit+0.023
orse+0.022
alk+0.022
 agree+0.021
Neg logit contributions
 skies-0.015
 next-0.014
ths-0.014
 two-0.014
 such-0.014
Token and
Token ID290
Feature activation+3.87e-03

Pos logit contributions
ths+0.010
 skies+0.010
 highest+0.009
 next+0.009
ting+0.009
Neg logit contributions
 habit-0.021
oc-0.020
 agree-0.019
alk-0.019
orse-0.019
Token pointed
Token ID6235
Feature activation+1.23e-02

Pos logit contributions
ths+0.007
 skies+0.007
 highest+0.007
ips+0.007
ting+0.006
Neg logit contributions
oc-0.006
 habit-0.006
 agree-0.006
orse-0.005
alk-0.005
Token at
Token ID379
Feature activation+1.40e-02

Pos logit contributions
ths+0.017
 skies+0.016
ting+0.014
ips+0.014
 highest+0.014
Neg logit contributions
oc-0.025
 habit-0.025
 agree-0.025
alk-0.024
 understand-0.024
Token the
Token ID262
Feature activation-3.85e-04

Pos logit contributions
ths+0.020
 skies+0.019
ips+0.017
ting+0.017
 highest+0.017
Neg logit contributions
oc-0.028
 habit-0.028
alk-0.026
 agree-0.026
orse-0.025
Token grass
Token ID8701
Feature activation-2.65e-02

Pos logit contributions
oc+0.001
 habit+0.001
alk+0.001
 agree+0.001
orse+0.001
Neg logit contributions
 skies-0.001
ths-0.001
 two-0.001
 highest-0.001
 next-0.001
Tokenho
Token ID8873
Feature activation-1.26e-01

Pos logit contributions
 habit+0.047
oc+0.046
 agree+0.044
alk+0.044
orse+0.043
Neg logit contributions
ths-0.042
 skies-0.042
ting-0.039
 highest-0.039
 two-0.038
Tokenpper
Token ID2848
Feature activation-4.54e-02

Pos logit contributions
 habit+0.215
oc+0.203
alk+0.197
 agree+0.194
 policy+0.188
Neg logit contributions
ths-0.210
 skies-0.205
ting-0.203
 next-0.202
 highest-0.198
Token.
Token ID13
Feature activation-1.41e-02

Pos logit contributions
 habit+0.084
oc+0.083
alk+0.079
 agree+0.079
orse+0.077
Neg logit contributions
ths-0.069
 skies-0.068
ting-0.064
 highest-0.063
 two-0.063
Token He
Token ID679
Feature activation-1.08e-02

Pos logit contributions
 habit+0.026
oc+0.026
 agree+0.024
alk+0.024
orse+0.024
Neg logit contributions
 skies-0.021
ths-0.021
ting-0.020
 highest-0.020
 two-0.020
Token said
Token ID531
Feature activation+6.41e-03

Pos logit contributions
 habit+0.018
oc+0.017
alk+0.017
 agree+0.016
orse+0.016
Neg logit contributions
ths-0.019
 skies-0.018
 two-0.017
ting-0.017
 highest-0.017
Token,
Token ID11
Feature activation-3.09e-03

Pos logit contributions
ths+0.012
 skies+0.012
ting+0.011
 highest+0.011
ips+0.011
Neg logit contributions
 habit-0.010
oc-0.010
 agree-0.009
alk-0.009
orse-0.009
TokenIt
Token ID1026
Feature activation-1.83e-02

Pos logit contributions
ths+0.042
 skies+0.037
ips+0.035
 highest+0.034
 such+0.034
Neg logit contributions
oc-0.079
 agree-0.076
 habit-0.074
orse-0.073
 understand-0.073
Token is
Token ID318
Feature activation+1.67e-02

Pos logit contributions
oc+0.036
 habit+0.036
 agree+0.034
alk+0.034
orse+0.033
Neg logit contributions
ths-0.026
 skies-0.025
ting-0.023
ips-0.023
 highest-0.022
Token not
Token ID407
Feature activation+7.64e-03

Pos logit contributions
ths+0.017
 skies+0.016
ting+0.015
ips+0.014
 highest+0.014
Neg logit contributions
 habit-0.040
oc-0.039
 agree-0.038
alk-0.038
orse-0.038
Token a
Token ID257
Feature activation+1.42e-04

Pos logit contributions
ths+0.007
 skies+0.007
 two+0.006
 such+0.006
ting+0.006
Neg logit contributions
oc-0.019
 habit-0.019
alk-0.018
 agree-0.018
orse-0.018
Token grass
Token ID8701
Feature activation-2.39e-02

Pos logit contributions
 two+0.000
 highest+0.000
 next+0.000
 skies+0.000
ting+0.000
Neg logit contributions
oc-0.000
alk-0.000
 habit-0.000
 agree-0.000
orse-0.000
Tokenho
Token ID8873
Feature activation-1.25e-01

Pos logit contributions
 habit+0.047
 agree+0.043
oc+0.043
alk+0.043
orse+0.042
Neg logit contributions
 skies-0.034
 next-0.033
ting-0.033
 two-0.032
 highest-0.032
Tokenpper
Token ID2848
Feature activation-4.11e-02

Pos logit contributions
 habit+0.206
oc+0.194
 agree+0.188
alk+0.188
orse+0.178
Neg logit contributions
ths-0.214
 skies-0.211
ting-0.209
 next-0.206
 highest-0.204
Token or
Token ID393
Feature activation+2.49e-02

Pos logit contributions
 habit+0.075
oc+0.071
alk+0.071
 agree+0.069
orse+0.068
Neg logit contributions
 skies-0.062
ths-0.061
 next-0.060
ting-0.060
 two-0.060
Token a
Token ID257
Feature activation-9.82e-04

Pos logit contributions
 skies+0.023
ths+0.022
 two+0.020
ting+0.020
 highest+0.020
Neg logit contributions
 habit-0.061
oc-0.060
 agree-0.058
alk-0.058
orse-0.057
Token cricket
Token ID18836
Feature activation+3.85e-03

Pos logit contributions
oc+0.002
alk+0.002
 agree+0.002
 habit+0.002
orse+0.002
Neg logit contributions
 two-0.002
 skies-0.001
 next-0.001
 highest-0.001
 back-0.001
Token.
Token ID13
Feature activation-9.12e-03

Pos logit contributions
 skies+0.006
ths+0.006
 next+0.006
ting+0.006
 two+0.006
Neg logit contributions
 habit-0.006
oc-0.006
alk-0.006
 agree-0.006
orse-0.006
Token it
Token ID340
Feature activation-1.00e-02

Pos logit contributions
 habit+0.009
oc+0.009
 agree+0.009
alk+0.009
orse+0.009
Neg logit contributions
ths-0.006
 skies-0.006
ting-0.006
 highest-0.006
 two-0.005
Token?
Token ID30
Feature activation+3.66e-02

Pos logit contributions
 habit+0.021
oc+0.020
alk+0.019
 agree+0.019
orse+0.019
Neg logit contributions
ths-0.013
 skies-0.013
 next-0.012
 highest-0.012
 such-0.012
Token You
Token ID921
Feature activation+4.29e-03

Pos logit contributions
ths+0.053
 skies+0.051
ting+0.046
 highest+0.045
 next+0.044
Neg logit contributions
oc-0.074
 habit-0.070
 agree-0.069
orse-0.068
alk-0.067
Token can
Token ID460
Feature activation+1.81e-02

Pos logit contributions
 skies+0.008
ths+0.008
 two+0.007
 highest+0.007
 such+0.007
Neg logit contributions
 habit-0.007
oc-0.007
alk-0.006
orse-0.006
 agree-0.006
Token take
Token ID1011
Feature activation+1.85e-02

Pos logit contributions
ths+0.032
 skies+0.032
 highest+0.030
ting+0.030
 two+0.030
Neg logit contributions
 habit-0.029
oc-0.028
alk-0.026
 agree-0.026
orse-0.026
Token turns
Token ID4962
Feature activation+7.81e-02

Pos logit contributions
ths+0.036
 skies+0.034
ting+0.033
ips+0.032
Why+0.032
Neg logit contributions
oc-0.028
 agree-0.026
 habit-0.025
 understand-0.025
alk-0.024
Token using
Token ID1262
Feature activation-2.69e-02

Pos logit contributions
ths+0.111
 skies+0.109
ting+0.100
 highest+0.099
 two+0.097
Neg logit contributions
 habit-0.154
oc-0.152
 agree-0.145
alk-0.145
orse-0.142
Token it
Token ID340
Feature activation-1.01e-02

Pos logit contributions
 habit+0.054
oc+0.053
alk+0.051
 agree+0.051
orse+0.050
Neg logit contributions
ths-0.037
 skies-0.036
ting-0.034
 highest-0.032
ips-0.032
Token.
Token ID13
Feature activation-3.99e-04

Pos logit contributions
 habit+0.018
oc+0.018
alk+0.018
 agree+0.018
orse+0.017
Neg logit contributions
ths-0.016
 skies-0.015
ting-0.014
ips-0.013
There-0.013
Token Or
Token ID1471
Feature activation+1.94e-02

Pos logit contributions
oc+0.001
 agree+0.001
 habit+0.001
alk+0.001
 understand+0.001
Neg logit contributions
ths-0.001
 skies-0.001
ting-0.001
 highest-0.001
ips-0.001
Token you
Token ID345
Feature activation+1.59e-02

Pos logit contributions
 skies+0.024
ths+0.024
ting+0.023
ips+0.022
ays+0.021
Neg logit contributions
 agree-0.040
 habit-0.040
oc-0.039
 understand-0.038
orse-0.038
Token's
Token ID338
Feature activation+3.64e-03

Pos logit contributions
 habit+0.035
oc+0.035
alk+0.033
 agree+0.033
orse+0.033
Neg logit contributions
ths-0.017
 skies-0.017
 two-0.015
ting-0.015
 next-0.015
Token open
Token ID1280
Feature activation+2.22e-03

Pos logit contributions
ths+0.007
 skies+0.007
ting+0.006
 highest+0.006
 two+0.006
Neg logit contributions
 habit-0.005
oc-0.005
alk-0.005
 agree-0.005
orse-0.005
Token it
Token ID340
Feature activation-1.90e-02

Pos logit contributions
ths+0.004
 skies+0.003
ips+0.003
 highest+0.003
ting+0.003
Neg logit contributions
oc-0.004
alk-0.004
 agree-0.003
 habit-0.003
orse-0.003
Token and
Token ID290
Feature activation+9.44e-03

Pos logit contributions
 habit+0.034
oc+0.034
 agree+0.032
alk+0.032
orse+0.030
Neg logit contributions
ths-0.030
 skies-0.030
ting-0.028
ips-0.027
 highest-0.027
Token see
Token ID766
Feature activation-5.25e-05

Pos logit contributions
ths+0.017
 skies+0.017
ting+0.016
ips+0.016
 highest+0.016
Neg logit contributions
 habit-0.014
oc-0.014
 agree-0.014
alk-0.013
orse-0.013
Token,"
Token ID553
Feature activation+7.99e-02

Pos logit contributions
oc+0.000
 habit+0.000
 agree+0.000
alk+0.000
orse+0.000
Neg logit contributions
ths-0.000
 skies-0.000
ips-0.000
 highest-0.000
ting-0.000
Token Mom
Token ID11254
Feature activation+8.75e-04

Pos logit contributions
ths+0.112
 skies+0.109
 two+0.106
 next+0.103
 such+0.101
Neg logit contributions
 habit-0.159
oc-0.156
alk-0.151
 agree-0.144
orse-0.143
Token said
Token ID531
Feature activation+1.09e-02

Pos logit contributions
ths+0.001
 skies+0.001
 two+0.001
 next+0.001
ting+0.001
Neg logit contributions
 habit-0.002
oc-0.002
alk-0.001
orse-0.001
 agree-0.001
Token.
Token ID13
Feature activation-6.42e-04

Pos logit contributions
ths+0.014
 two+0.014
 next+0.014
ting+0.014
 skies+0.014
Neg logit contributions
oc-0.022
 habit-0.022
alk-0.022
orse-0.020
 agree-0.020
Token She
Token ID1375
Feature activation-5.48e-03

Pos logit contributions
 habit+0.001
oc+0.001
alk+0.001
 agree+0.001
orse+0.001
Neg logit contributions
ths-0.001
 skies-0.001
ting-0.001
 highest-0.001
 two-0.001
Token cut
Token ID2005
Feature activation+2.14e-02

Pos logit contributions
 habit+0.011
alk+0.011
oc+0.010
 policy+0.010
onest+0.010
Neg logit contributions
ths-0.008
 two-0.008
 next-0.008
 skies-0.008
 both-0.007
Token with
Token ID351
Feature activation+5.21e-03

Pos logit contributions
ths+0.009
 skies+0.009
ting+0.008
 highest+0.008
 two+0.007
Neg logit contributions
 habit-0.015
oc-0.015
 agree-0.014
alk-0.014
orse-0.014
Token.
Token ID13
Feature activation+1.44e-02

Pos logit contributions
ths+0.007
 skies+0.007
ting+0.007
 highest+0.007
 two+0.006
Neg logit contributions
 habit-0.010
oc-0.010
 agree-0.010
alk-0.010
orse-0.010
Token It
Token ID632
Feature activation+6.92e-03

Pos logit contributions
ths+0.022
 skies+0.022
ting+0.020
 highest+0.020
 two+0.020
Neg logit contributions
 habit-0.027
oc-0.026
alk-0.025
 agree-0.025
orse-0.024
Token was
Token ID373
Feature activation+1.17e-02

Pos logit contributions
ths+0.012
 skies+0.012
ting+0.011
 highest+0.011
ips+0.011
Neg logit contributions
 habit-0.012
oc-0.012
 agree-0.011
alk-0.011
orse-0.011
Token the
Token ID262
Feature activation+1.15e-02

Pos logit contributions
ths+0.012
 skies+0.012
ting+0.011
 highest+0.011
ips+0.010
Neg logit contributions
 habit-0.027
oc-0.027
alk-0.026
 agree-0.026
orse-0.025
Token best
Token ID1266
Feature activation+7.78e-02

Pos logit contributions
ths+0.011
 skies+0.011
ting+0.010
 highest+0.010
 such+0.009
Neg logit contributions
 habit-0.027
oc-0.027
 agree-0.026
alk-0.026
orse-0.025
Token day
Token ID1110
Feature activation+7.97e-04

Pos logit contributions
 skies+0.125
ths+0.124
 next+0.124
 two+0.122
 such+0.117
Neg logit contributions
oc-0.133
 habit-0.130
alk-0.126
 agree-0.124
orse-0.123
Token ever
Token ID1683
Feature activation+6.24e-03

Pos logit contributions
ths+0.001
 skies+0.001
ting+0.001
 next+0.001
 such+0.001
Neg logit contributions
 habit-0.002
oc-0.002
alk-0.002
 agree-0.002
orse-0.002
Token!
Token ID0
Feature activation+3.23e-02

Pos logit contributions
 next+0.008
 skies+0.008
ths+0.008
 two+0.008
 such+0.008
Neg logit contributions
 habit-0.013
oc-0.012
alk-0.012
orse-0.012
 agree-0.012
Token<|endoftext|>
Token ID50256
Feature activation+2.82e-02

Pos logit contributions
ths+0.048
 skies+0.048
ting+0.045
 highest+0.044
 two+0.044
Neg logit contributions
 habit-0.061
oc-0.059
alk-0.057
 agree-0.056
orse-0.055
TokenOnce
Token ID7454
Feature activation+1.84e-02

Pos logit contributions
ths+0.047
 skies+0.044
ting+0.042
 highest+0.042
 such+0.040
Neg logit contributions
oc-0.049
 habit-0.049
alk-0.047
 agree-0.046
orse-0.045
Token have
Token ID423
Feature activation+3.24e-02

Pos logit contributions
 both+0.009
 only+0.009
 two+0.009
 next+0.009
 too+0.009
Neg logit contributions
alk-0.011
 habit-0.011
onest-0.010
orse-0.010
kin-0.010
Token to
Token ID284
Feature activation+1.25e-02

Pos logit contributions
ting+0.042
ths+0.042
 skies+0.037
ips+0.036
 highest+0.035
Neg logit contributions
 habit-0.065
orse-0.063
 agree-0.062
alk-0.062
 understand-0.062
Token go
Token ID467
Feature activation-2.45e-03

Pos logit contributions
 two+0.021
ths+0.021
 next+0.020
 skies+0.020
 both+0.020
Neg logit contributions
oc-0.022
 habit-0.021
alk-0.019
orse-0.019
onest-0.018
Token to
Token ID284
Feature activation+1.12e-02

Pos logit contributions
 habit+0.005
 agree+0.005
oc+0.005
alk+0.005
orse+0.004
Neg logit contributions
ths-0.004
 skies-0.003
ting-0.003
ips-0.003
 highest-0.003
Token work
Token ID670
Feature activation-3.16e-03

Pos logit contributions
ths+0.012
 skies+0.011
ting+0.010
 highest+0.010
ips+0.010
Neg logit contributions
 habit-0.026
oc-0.026
 agree-0.025
alk-0.025
orse-0.025
Token,"
Token ID553
Feature activation+7.67e-02

Pos logit contributions
 habit+0.005
oc+0.005
 agree+0.005
alk+0.005
orse+0.005
Neg logit contributions
ths-0.005
 skies-0.005
ting-0.005
 highest-0.005
ips-0.005
Token said
Token ID531
Feature activation+1.40e-02

Pos logit contributions
ths+0.101
 skies+0.098
ting+0.090
 highest+0.090
ips+0.088
Neg logit contributions
 habit-0.158
oc-0.157
 agree-0.152
orse-0.149
alk-0.149
Token the
Token ID262
Feature activation+9.22e-03

Pos logit contributions
ths+0.019
 skies+0.018
 highest+0.017
ips+0.016
ting+0.016
Neg logit contributions
 habit-0.028
oc-0.028
orse-0.027
 agree-0.027
alk-0.027
Token man
Token ID582
Feature activation-5.82e-03

Pos logit contributions
 skies+0.015
ths+0.014
 two+0.014
 next+0.014
 highest+0.014
Neg logit contributions
oc-0.017
 habit-0.016
alk-0.016
 agree-0.015
orse-0.015
Token.
Token ID13
Feature activation-2.54e-03

Pos logit contributions
 habit+0.012
alk+0.012
oc+0.011
 policy+0.011
 agree+0.011
Neg logit contributions
ths-0.008
 next-0.008
ting-0.007
 skies-0.007
 two-0.007
Token
Token ID198
Feature activation-4.94e-03

Pos logit contributions
 habit+0.004
oc+0.004
alk+0.004
 agree+0.004
orse+0.004
Neg logit contributions
ths-0.004
 skies-0.004
ting-0.004
 two-0.004
 highest-0.004
Token<|endoftext|>
Token ID50256
Feature activation+2.76e-02

Pos logit contributions
oc+0.015
 habit+0.015
 agree+0.015
alk+0.014
orse+0.014
Neg logit contributions
ths-0.020
 skies-0.019
ting-0.019
 highest-0.018
ips-0.018
TokenSam
Token ID16305
Feature activation-2.63e-03

Pos logit contributions
ths+0.040
 skies+0.033
 highest+0.033
ting+0.031
ople+0.030
Neg logit contributions
oc-0.056
alk-0.055
 habit-0.054
 agree-0.052
orse-0.051
Token and
Token ID290
Feature activation+4.51e-03

Pos logit contributions
 habit+0.005
oc+0.005
orse+0.005
alk+0.005
 agree+0.005
Neg logit contributions
 skies-0.004
ths-0.004
 two-0.004
 next-0.004
 such-0.004
Token Ben
Token ID3932
Feature activation-9.18e-03

Pos logit contributions
 two+0.006
 skies+0.006
 such+0.006
ths+0.006
 both+0.006
Neg logit contributions
oc-0.009
 habit-0.009
alk-0.008
orse-0.008
 agree-0.008
Token were
Token ID547
Feature activation-4.68e-04

Pos logit contributions
oc+0.016
 habit+0.016
orse+0.016
alk+0.016
 policy+0.016
Neg logit contributions
 both-0.015
 two-0.014
 back-0.014
 skies-0.014
 next-0.013
Token best
Token ID1266
Feature activation+8.08e-02

Pos logit contributions
 habit+0.001
 plan+0.001
 agree+0.001
orse+0.001
 understand+0.001
Neg logit contributions
ths-0.001
ting-0.001
 skies-0.001
Why-0.001
 highest-0.001
Token friends
Token ID2460
Feature activation+1.13e-02

Pos logit contributions
ting+0.131
ths+0.128
 skies+0.124
Why+0.123
 highest+0.118
Neg logit contributions
 agree-0.147
 habit-0.143
orse-0.138
oc-0.136
 understand-0.133
Token.
Token ID13
Feature activation+1.87e-03

Pos logit contributions
ths+0.020
 skies+0.019
 highest+0.018
ting+0.018
There+0.017
Neg logit contributions
 habit-0.019
oc-0.018
 agree-0.018
orse-0.017
alk-0.017
Token They
Token ID1119
Feature activation-2.11e-03

Pos logit contributions
 skies+0.003
ths+0.003
 two+0.003
 next+0.003
Why+0.003
Neg logit contributions
 habit-0.003
oc-0.003
alk-0.003
orse-0.003
 agree-0.003
Token liked
Token ID8288
Feature activation+1.76e-02

Pos logit contributions
oc+0.004
 policy+0.004
alk+0.004
 habit+0.004
ool+0.004
Neg logit contributions
 both-0.004
 two-0.003
 back-0.003
 next-0.003
 four-0.003
Token to
Token ID284
Feature activation+7.67e-03

Pos logit contributions
ths+0.026
ips+0.025
 skies+0.025
 highest+0.024
ting+0.024
Neg logit contributions
 habit-0.032
 agree-0.031
alk-0.030
orse-0.030
oc-0.029
Token she
Token ID673
Feature activation+3.21e-04

Pos logit contributions
ths+0.010
 skies+0.010
 highest+0.009
ting+0.009
ips+0.009
Neg logit contributions
 habit-0.012
oc-0.011
 agree-0.011
alk-0.011
orse-0.011
Token had
Token ID550
Feature activation-1.83e-02

Pos logit contributions
ths+0.001
 skies+0.001
ting+0.001
 highest+0.001
 two+0.001
Neg logit contributions
 habit-0.000
oc-0.000
 agree-0.000
alk-0.000
orse-0.000
Token learned
Token ID4499
Feature activation-5.76e-02

Pos logit contributions
 habit+0.034
oc+0.032
 agree+0.032
alk+0.031
orse+0.031
Neg logit contributions
ths-0.028
 skies-0.027
ting-0.026
 highest-0.026
ips-0.025
Token today
Token ID1909
Feature activation-3.17e-02

Pos logit contributions
 habit+0.128
oc+0.125
 agree+0.121
alk+0.120
orse+0.117
Neg logit contributions
ths-0.067
 skies-0.065
ting-0.060
 highest-0.059
ips-0.057
Token.
Token ID13
Feature activation-1.61e-02

Pos logit contributions
 habit+0.057
oc+0.056
 agree+0.054
alk+0.054
orse+0.052
Neg logit contributions
ths-0.049
 skies-0.049
ting-0.046
 highest-0.045
 two-0.044
Token<|endoftext|>
Token ID50256
Feature activation+1.67e-02

Pos logit contributions
 habit+0.031
oc+0.030
 agree+0.029
alk+0.029
orse+0.028
Neg logit contributions
ths-0.024
 skies-0.024
ting-0.022
 highest-0.022
 two-0.021
TokenJim
Token ID18050
Feature activation-9.55e-03

Pos logit contributions
ths+0.026
 skies+0.025
ting+0.023
 highest+0.023
 such+0.022
Neg logit contributions
oc-0.031
 habit-0.031
alk-0.029
 agree-0.029
orse-0.028
Token was
Token ID373
Feature activation-8.28e-03

Pos logit contributions
 habit+0.018
oc+0.017
alk+0.017
 policy+0.017
 agree+0.017
Neg logit contributions
 skies-0.014
 two-0.014
 next-0.013
ths-0.013
 both-0.013
Token a
Token ID257
Feature activation-6.91e-03

Pos logit contributions
 habit+0.018
oc+0.017
orse+0.016
 agree+0.016
alk+0.016
Neg logit contributions
ths-0.011
 skies-0.010
ting-0.010
 highest-0.009
ople-0.009
Token fire
Token ID2046
Feature activation-4.84e-03

Pos logit contributions
 habit+0.013
oc+0.013
 agree+0.013
orse+0.012
alk+0.012
Neg logit contributions
ths-0.010
 skies-0.010
ting-0.009
ips-0.009
 such-0.009
Tokenman
Token ID805
Feature activation-2.86e-02

Pos logit contributions
 habit+0.008
oc+0.008
alk+0.007
 agree+0.007
orse+0.007
Neg logit contributions
ths-0.009
 skies-0.008
 highest-0.008
ting-0.008
There-0.008
Token.
Token ID13
Feature activation-9.17e-03

Pos logit contributions
ths+0.025
 skies+0.025
 highest+0.023
 yourselves+0.023
ips+0.023
Neg logit contributions
 habit-0.027
oc-0.026
 agree-0.026
alk-0.024
orse-0.024
Token<|endoftext|>
Token ID50256
Feature activation+2.43e-02

Pos logit contributions
 habit+0.019
oc+0.018
 agree+0.018
alk+0.018
orse+0.018
Neg logit contributions
 skies-0.012
ths-0.012
ting-0.011
 two-0.011
 such-0.011
TokenOnce
Token ID7454
Feature activation+1.18e-02

Pos logit contributions
ths+0.045
 skies+0.042
 highest+0.040
ting+0.040
ips+0.038
Neg logit contributions
oc-0.038
 habit-0.037
alk-0.036
 agree-0.036
orse-0.034
Token upon
Token ID2402
Feature activation-5.59e-03

Pos logit contributions
 next+0.021
 two+0.021
ths+0.021
 skies+0.021
 both+0.020
Neg logit contributions
oc-0.019
 habit-0.018
alk-0.016
 agree-0.016
orse-0.016
Token a
Token ID257
Feature activation-1.00e-03

Pos logit contributions
 habit+0.016
oc+0.016
alk+0.015
 agree+0.015
orse+0.015
Neg logit contributions
 skies-0.003
ths-0.003
 two-0.003
 such-0.003
 next-0.003
Token time
Token ID640
Feature activation+1.64e-02

Pos logit contributions
oc+0.002
 habit+0.002
alk+0.002
orse+0.002
 agree+0.002
Neg logit contributions
ths-0.001
 skies-0.001
 next-0.001
 two-0.001
There-0.001
Token,
Token ID11
Feature activation-4.94e-03

Pos logit contributions
ths+0.028
 skies+0.027
 next+0.026
 two+0.026
ting+0.025
Neg logit contributions
 habit-0.028
oc-0.027
 agree-0.026
alk-0.026
orse-0.025
Token there
Token ID612
Feature activation+5.81e-03

Pos logit contributions
 habit+0.012
oc+0.012
alk+0.012
 agree+0.012
orse+0.012
Neg logit contributions
ths-0.005
 skies-0.004
ting-0.004
 highest-0.004
ips-0.004
Token was
Token ID373
Feature activation-5.63e-03

Pos logit contributions
 skies+0.008
 two+0.008
ths+0.008
 such+0.008
 next+0.008
Neg logit contributions
 habit-0.011
alk-0.011
oc-0.010
 agree-0.010
orse-0.010
Token a
Token ID257
Feature activation-1.68e-03

Pos logit contributions
 habit+0.013
oc+0.012
orse+0.012
 agree+0.012
alk+0.012
Neg logit contributions
ths-0.007
 skies-0.006
ting-0.006
ips-0.006
ople-0.006
Token boy
Token ID2933
Feature activation-1.58e-02

Pos logit contributions
alk+0.004
 habit+0.003
oc+0.003
 policy+0.003
 agree+0.003
Neg logit contributions
ths-0.002
 next-0.002
 two-0.002
 skies-0.002
 highest-0.002
Token the
Token ID262
Feature activation+4.19e-03

Pos logit contributions
 agree+0.020
 habit+0.019
 understand+0.019
alk+0.018
 believe+0.018
Neg logit contributions
ths-0.015
 skies-0.014
 highest-0.013
ting-0.013
ips-0.013
Token coin
Token ID10752
Feature activation+2.85e-02

Pos logit contributions
 skies+0.009
ths+0.008
 two+0.008
 highest+0.008
 next+0.008
Neg logit contributions
 habit-0.006
oc-0.005
alk-0.005
 agree-0.005
orse-0.005
Token and
Token ID290
Feature activation+4.03e-05

Pos logit contributions
ths+0.037
 skies+0.037
 next+0.035
 two+0.034
ting+0.034
Neg logit contributions
 habit-0.059
oc-0.058
 agree-0.055
alk-0.055
orse-0.054
Token they
Token ID484
Feature activation+2.62e-03

Pos logit contributions
ths+0.000
 skies+0.000
ting+0.000
 highest+0.000
ips+0.000
Neg logit contributions
 habit-0.000
 agree-0.000
oc-0.000
alk-0.000
 understand-0.000
Token see
Token ID766
Feature activation-5.03e-03

Pos logit contributions
ths+0.004
 skies+0.004
ting+0.004
 highest+0.004
 two+0.004
Neg logit contributions
 habit-0.004
oc-0.004
alk-0.004
orse-0.004
 agree-0.004
Token that
Token ID326
Feature activation+1.43e-02

Pos logit contributions
 habit+0.012
oc+0.012
 agree+0.012
alk+0.011
orse+0.011
Neg logit contributions
ths-0.005
 skies-0.005
 highest-0.004
ting-0.004
ips-0.004
Token it
Token ID340
Feature activation-1.33e-02

Pos logit contributions
ths+0.016
 skies+0.016
 two+0.015
 next+0.015
ting+0.015
Neg logit contributions
 habit-0.032
oc-0.031
alk-0.030
 agree-0.030
orse-0.029
Token is
Token ID318
Feature activation+1.91e-02

Pos logit contributions
oc+0.021
 habit+0.020
 agree+0.019
alk+0.019
orse+0.018
Neg logit contributions
ths-0.024
 skies-0.024
ting-0.023
There-0.022
 highest-0.022
Token not
Token ID407
Feature activation+8.03e-03

Pos logit contributions
ths+0.025
 skies+0.024
ting+0.023
 highest+0.022
ips+0.021
Neg logit contributions
 habit-0.040
oc-0.038
 agree-0.038
alk-0.037
orse-0.037
Token shiny
Token ID22441
Feature activation+5.45e-03

Pos logit contributions
ths+0.009
 skies+0.009
 two+0.008
 such+0.008
ting+0.008
Neg logit contributions
oc-0.018
 habit-0.018
alk-0.017
 agree-0.017
orse-0.017
Token anymore
Token ID7471
Feature activation+4.13e-02

Pos logit contributions
ths+0.009
 skies+0.008
There+0.007
ting+0.007
 highest+0.007
Neg logit contributions
 habit-0.010
oc-0.010
 agree-0.010
alk-0.009
 policy-0.009
Token She
Token ID1375
Feature activation-5.24e-03

Pos logit contributions
 habit+0.010
oc+0.010
 agree+0.009
alk+0.009
orse+0.009
Neg logit contributions
ths-0.007
 skies-0.007
 highest-0.007
ting-0.007
 such-0.007
Token thanked
Token ID26280
Feature activation+4.43e-03

Pos logit contributions
 habit+0.010
alk+0.010
oc+0.010
 policy+0.010
onest+0.009
Neg logit contributions
ths-0.008
 both-0.007
 two-0.007
 skies-0.007
 next-0.007
Token the
Token ID262
Feature activation+2.16e-03

Pos logit contributions
ths+0.005
 skies+0.005
 highest+0.005
ips+0.005
ays+0.005
Neg logit contributions
 habit-0.009
 agree-0.009
oc-0.009
alk-0.009
orse-0.009
Token boy
Token ID2933
Feature activation-1.85e-02

Pos logit contributions
 skies+0.004
ths+0.004
 two+0.003
 next+0.003
 highest+0.003
Neg logit contributions
 habit-0.004
oc-0.004
alk-0.003
 agree-0.003
orse-0.003
Token and
Token ID290
Feature activation-1.93e-03

Pos logit contributions
 habit+0.035
alk+0.034
oc+0.034
orse+0.033
 agree+0.032
Neg logit contributions
ths-0.026
 next-0.025
 skies-0.025
ting-0.025
 two-0.025
Token put
Token ID1234
Feature activation+4.69e-02

Pos logit contributions
 habit+0.004
oc+0.003
alk+0.003
orse+0.003
 agree+0.003
Neg logit contributions
ths-0.003
 skies-0.003
 two-0.003
ting-0.003
 next-0.003
Token her
Token ID607
Feature activation+7.83e-03

Pos logit contributions
ths+0.076
 skies+0.072
clock+0.067
 highest+0.066
 fastest+0.061
Neg logit contributions
 habit-0.080
oc-0.078
 agree-0.077
kin-0.075
 choice-0.072
Token hat
Token ID6877
Feature activation+1.10e-02

Pos logit contributions
ths+0.018
 skies+0.017
ting+0.016
ips+0.016
Why+0.016
Neg logit contributions
 habit-0.010
orse-0.009
oc-0.009
 agree-0.008
kin-0.008
Token back
Token ID736
Feature activation+1.63e-02

Pos logit contributions
ths+0.008
 skies+0.007
 yourselves+0.005
ting+0.005
Why+0.005
Neg logit contributions
 habit-0.030
oc-0.029
 agree-0.028
alk-0.028
orse-0.028
Token on
Token ID319
Feature activation-1.83e-02

Pos logit contributions
ths+0.018
 skies+0.017
There+0.014
Why+0.014
 highest+0.014
Neg logit contributions
 habit-0.036
oc-0.036
alk-0.035
orse-0.034
 agree-0.034
Token.
Token ID13
Feature activation-8.72e-03

Pos logit contributions
 habit+0.036
oc+0.035
 agree+0.034
alk+0.034
 policy+0.032
Neg logit contributions
ths-0.027
 skies-0.025
ips-0.023
 highest-0.023
ting-0.023
Token to
Token ID284
Feature activation+3.35e-03

Pos logit contributions
ths+0.022
 skies+0.020
ting+0.020
 highest+0.019
ips+0.019
Neg logit contributions
 habit-0.045
oc-0.043
 agree-0.043
alk-0.042
orse-0.041
Token start
Token ID923
Feature activation+1.64e-03

Pos logit contributions
 skies+0.006
ths+0.006
 next+0.005
 two+0.005
 highest+0.005
Neg logit contributions
 habit-0.006
oc-0.006
alk-0.005
orse-0.005
 agree-0.005
Token.
Token ID13
Feature activation-2.14e-03

Pos logit contributions
ths+0.003
 skies+0.003
ting+0.003
 highest+0.003
There+0.003
Neg logit contributions
 habit-0.003
oc-0.003
 agree-0.003
alk-0.003
orse-0.002
Token
Token ID198
Feature activation-4.07e-03

Pos logit contributions
 habit+0.004
oc+0.004
alk+0.004
 agree+0.004
orse+0.003
Neg logit contributions
ths-0.004
 skies-0.003
ting-0.003
 highest-0.003
 next-0.003
Token
Token ID198
Feature activation-3.72e-03

Pos logit contributions
 habit+0.010
oc+0.009
alk+0.009
 policy+0.009
 agree+0.009
Neg logit contributions
There-0.004
 two-0.004
 next-0.004
 skies-0.004
 both-0.004
TokenThe
Token ID464
Feature activation+1.52e-02

Pos logit contributions
 habit+0.007
oc+0.006
alk+0.006
 agree+0.006
orse+0.006
Neg logit contributions
ths-0.006
 skies-0.006
There-0.006
 next-0.006
 two-0.006
Token opera
Token ID27296
Feature activation-1.74e-04

Pos logit contributions
ths+0.021
ting+0.020
ips+0.020
ays+0.020
 such+0.020
Neg logit contributions
 habit-0.031
 plan-0.029
 agree-0.028
oc-0.028
 choice-0.027
Token was
Token ID373
Feature activation+6.85e-04

Pos logit contributions
 habit+0.000
oc+0.000
 agree+0.000
orse+0.000
 plan+0.000
Neg logit contributions
ths-0.000
 skies-0.000
ting-0.000
 highest-0.000
ips-0.000
Token very
Token ID845
Feature activation+5.84e-03

Pos logit contributions
ths+0.001
ting+0.001
 skies+0.001
ople+0.001
ips+0.001
Neg logit contributions
 habit-0.001
 understand-0.001
 agree-0.001
 plan-0.001
orse-0.001
Token loud
Token ID7812
Feature activation-5.59e-03

Pos logit contributions
ths+0.008
 skies+0.008
ting+0.008
 highest+0.008
ips+0.008
Neg logit contributions
 habit-0.011
oc-0.011
 agree-0.011
alk-0.010
orse-0.010
Token and
Token ID290
Feature activation+5.29e-03

Pos logit contributions
oc+0.008
 habit+0.008
 agree+0.007
alk+0.007
orse+0.007
Neg logit contributions
ths-0.011
 skies-0.010
 highest-0.010
There-0.010
ips-0.010
Token lovely
Token ID14081
Feature activation-3.11e-05

Pos logit contributions
 habit+0.023
 believe+0.022
 understand+0.022
 plan+0.021
 agree+0.021
Neg logit contributions
ths-0.014
ays-0.013
 skies-0.013
ting-0.012
ips-0.012
Token!"
Token ID2474
Feature activation+8.91e-03

Pos logit contributions
 habit+0.000
oc+0.000
 agree+0.000
alk+0.000
orse+0.000
Neg logit contributions
ths-0.000
 skies-0.000
There-0.000
 highest-0.000
ips-0.000
Token M
Token ID337
Feature activation+2.17e-02

Pos logit contributions
 skies+0.012
ths+0.012
 two+0.011
ting+0.011
 such+0.011
Neg logit contributions
 habit-0.018
alk-0.017
oc-0.017
 policy-0.017
 agree-0.017
Tokenargo
Token ID9448
Feature activation-3.85e-02

Pos logit contributions
ths+0.035
 skies+0.034
 highest+0.032
ting+0.032
 such+0.031
Neg logit contributions
 habit-0.039
oc-0.038
 agree-0.036
alk-0.036
orse-0.035
Token smiled
Token ID13541
Feature activation-4.54e-03

Pos logit contributions
 habit+0.061
oc+0.059
alk+0.058
 policy+0.055
 agree+0.054
Neg logit contributions
ths-0.069
 skies-0.069
 two-0.067
 next-0.066
 back-0.064
Token and
Token ID290
Feature activation-1.69e-02

Pos logit contributions
 habit+0.008
oc+0.008
 agree+0.008
alk+0.007
orse+0.007
Neg logit contributions
ths-0.007
 skies-0.007
 highest-0.007
ting-0.007
ips-0.007
Token said
Token ID531
Feature activation-2.26e-03

Pos logit contributions
 habit+0.028
oc+0.028
alk+0.027
 agree+0.026
orse+0.025
Neg logit contributions
ths-0.029
 skies-0.029
ting-0.028
 two-0.027
 such-0.027
Token goodbye
Token ID24829
Feature activation+1.34e-02

Pos logit contributions
oc+0.004
 habit+0.004
alk+0.004
orse+0.004
 agree+0.004
Neg logit contributions
ths-0.003
 two-0.003
 back-0.003
 next-0.003
 skies-0.003
Token.
Token ID13
Feature activation-1.90e-02

Pos logit contributions
ths+0.019
 skies+0.019
ting+0.018
 highest+0.018
ips+0.017
Neg logit contributions
 habit-0.026
oc-0.025
 agree-0.024
alk-0.024
orse-0.024
Token
Token ID220
Feature activation-7.75e-03

Pos logit contributions
 habit+0.038
oc+0.037
alk+0.036
 agree+0.036
orse+0.035
Neg logit contributions
 skies-0.026
ths-0.026
ting-0.024
 two-0.024
 such-0.024
Token
Token ID198
Feature activation-1.74e-02

Pos logit contributions
 habit+0.014
oc+0.014
 agree+0.013
alk+0.013
orse+0.013
Neg logit contributions
ths-0.012
 skies-0.012
ting-0.011
 highest-0.011
 two-0.011
Token day
Token ID1110
Feature activation+1.01e-02

Pos logit contributions
 next+0.023
 two+0.022
 back+0.019
 skies+0.019
 me+0.019
Neg logit contributions
oc-0.032
alk-0.031
 agree-0.031
 habit-0.030
onest-0.030
Token,
Token ID11
Feature activation+2.95e-03

Pos logit contributions
ths+0.018
 skies+0.018
ting+0.017
 highest+0.017
 such+0.017
Neg logit contributions
 habit-0.016
oc-0.015
 agree-0.015
alk-0.014
orse-0.014
Token Anna
Token ID11735
Feature activation-4.02e-03

Pos logit contributions
ths+0.004
 skies+0.004
 highest+0.004
ips+0.004
ting+0.004
Neg logit contributions
 habit-0.006
oc-0.006
 agree-0.006
orse-0.005
alk-0.005
Token and
Token ID290
Feature activation+4.29e-03

Pos logit contributions
 habit+0.008
oc+0.008
alk+0.008
 agree+0.008
orse+0.008
Neg logit contributions
 skies-0.005
 next-0.005
 two-0.005
ths-0.005
ting-0.005
Token Ben
Token ID3932
Feature activation-1.31e-02

Pos logit contributions
ths+0.006
 skies+0.006
 highest+0.006
ips+0.005
There+0.005
Neg logit contributions
 habit-0.008
 agree-0.008
oc-0.008
orse-0.007
 understand-0.007
Token hear
Token ID3285
Feature activation-9.93e-03

Pos logit contributions
oc+0.024
 habit+0.023
orse+0.022
ool+0.022
 policy+0.022
Neg logit contributions
 both-0.022
 back-0.021
 two-0.021
 skies-0.020
 me-0.020
Token a
Token ID257
Feature activation+1.16e-03

Pos logit contributions
oc+0.022
 habit+0.022
 agree+0.021
orse+0.021
 understand+0.020
Neg logit contributions
ths-0.013
 highest-0.011
 skies-0.011
ips-0.010
ting-0.010
Token noise
Token ID7838
Feature activation+1.52e-02

Pos logit contributions
ths+0.001
 skies+0.001
 next+0.001
 highest+0.001
 back+0.001
Neg logit contributions
alk-0.002
 habit-0.002
oc-0.002
 policy-0.002
orse-0.002
Token in
Token ID287
Feature activation-8.49e-03

Pos logit contributions
ths+0.021
 skies+0.021
 highest+0.019
ips+0.018
 two+0.018
Neg logit contributions
 habit-0.030
oc-0.030
 agree-0.029
alk-0.028
orse-0.028
Token the
Token ID262
Feature activation+2.80e-03

Pos logit contributions
oc+0.016
 habit+0.016
 agree+0.016
alk+0.015
 plan+0.015
Neg logit contributions
ths-0.012
 highest-0.012
 skies-0.011
ips-0.011
ople-0.010
Token box
Token ID3091
Feature activation+6.29e-03

Pos logit contributions
ths+0.005
ips+0.005
med+0.005
ople+0.005
ting+0.004
Neg logit contributions
 habit-0.005
oc-0.004
alk-0.004
 plan-0.004
 agree-0.004
Token who
Token ID508
Feature activation+1.91e-02

Pos logit contributions
 habit+0.021
oc+0.021
alk+0.020
 agree+0.020
orse+0.019
Neg logit contributions
ths-0.020
 skies-0.020
ting-0.019
 next-0.019
 two-0.019
Token was
Token ID373
Feature activation-5.68e-03

Pos logit contributions
ths+0.033
 skies+0.032
 two+0.031
 next+0.031
ting+0.030
Neg logit contributions
 habit-0.032
alk-0.031
oc-0.030
 agree-0.030
orse-0.029
Token lost
Token ID2626
Feature activation+2.53e-02

Pos logit contributions
 habit+0.010
oc+0.010
 agree+0.009
 understand+0.009
orse+0.009
Neg logit contributions
ths-0.009
 skies-0.009
 highest-0.008
ips-0.008
There-0.008
Token.
Token ID13
Feature activation-3.90e-04

Pos logit contributions
ths+0.027
 skies+0.027
 next+0.025
ting+0.025
 such+0.024
Neg logit contributions
 habit-0.058
oc-0.058
 agree-0.056
alk-0.056
orse-0.056
Token Bob
Token ID5811
Feature activation+8.40e-03

Pos logit contributions
 habit+0.001
oc+0.001
alk+0.001
 agree+0.001
orse+0.001
Neg logit contributions
 skies-0.000
ths-0.000
 two-0.000
 next-0.000
There-0.000
Token asked
Token ID1965
Feature activation-1.80e-02

Pos logit contributions
ths+0.014
 skies+0.014
 two+0.014
 next+0.014
ting+0.013
Neg logit contributions
 habit-0.014
alk-0.013
oc-0.013
 agree-0.013
 policy-0.013
Token the
Token ID262
Feature activation+4.19e-03

Pos logit contributions
 habit+0.049
alk+0.048
orse+0.048
oc+0.047
 agree+0.045
Neg logit contributions
 two-0.013
 next-0.013
 back-0.013
 such-0.012
 skies-0.012
Token little
Token ID1310
Feature activation+6.54e-03

Pos logit contributions
ths+0.007
 skies+0.007
ting+0.006
ips+0.006
 highest+0.006
Neg logit contributions
 habit-0.007
oc-0.007
 agree-0.007
alk-0.006
orse-0.006
Token girl
Token ID2576
Feature activation-1.16e-02

Pos logit contributions
ths+0.011
 skies+0.011
ting+0.010
 two+0.010
 next+0.010
Neg logit contributions
 habit-0.011
oc-0.011
alk-0.011
orse-0.011
 agree-0.011
Token,
Token ID11
Feature activation-3.92e-03

Pos logit contributions
 habit+0.022
oc+0.021
alk+0.021
orse+0.020
 agree+0.020
Neg logit contributions
ths-0.017
 skies-0.017
 two-0.016
 next-0.016
ting-0.016
Token "
Token ID366
Feature activation+2.91e-03

Pos logit contributions
 habit+0.007
oc+0.007
alk+0.007
orse+0.006
 agree+0.006
Neg logit contributions
 next-0.006
ths-0.006
 skies-0.006
 two-0.006
ting-0.006
Token.
Token ID13
Feature activation+3.70e-03

Pos logit contributions
 habit+0.015
oc+0.015
alk+0.014
orse+0.014
 agree+0.013
Neg logit contributions
 next-0.009
 two-0.008
ths-0.008
 skies-0.008
 such-0.008
Token Tim
Token ID5045
Feature activation+6.03e-03

Pos logit contributions
ths+0.005
 skies+0.005
ting+0.005
 highest+0.005
 two+0.005
Neg logit contributions
 habit-0.007
oc-0.007
 agree-0.007
alk-0.007
orse-0.007
Token was
Token ID373
Feature activation-3.82e-03

Pos logit contributions
ths+0.011
 skies+0.011
ting+0.010
 highest+0.010
 such+0.010
Neg logit contributions
 habit-0.010
oc-0.009
 agree-0.009
alk-0.009
orse-0.009
Token very
Token ID845
Feature activation+3.81e-03

Pos logit contributions
 habit+0.005
 understand+0.005
orse+0.005
 plan+0.005
 agree+0.004
Neg logit contributions
ths-0.008
 skies-0.008
ays-0.007
ting-0.007
 highest-0.007
Token sad
Token ID6507
Feature activation-1.90e-02

Pos logit contributions
ths+0.005
 next+0.005
 two+0.005
 skies+0.005
 both+0.005
Neg logit contributions
oc-0.008
 habit-0.008
alk-0.008
 agree-0.007
orse-0.007
Token.
Token ID13
Feature activation+1.22e-03

Pos logit contributions
 habit+0.038
oc+0.038
alk+0.036
 agree+0.036
orse+0.036
Neg logit contributions
ths-0.026
 skies-0.026
 next-0.024
ting-0.024
 two-0.024
Token↵
Token ID198
Feature activation+4.00e-04

Pos logit contributions
ths+0.002
 skies+0.002
ting+0.002
 highest+0.002
ips+0.002
Neg logit contributions
 habit-0.002
oc-0.002
 agree-0.002
alk-0.002
orse-0.002
Token↵
Token ID198
Feature activation+1.11e-04

Pos logit contributions
ths+0.000
 next+0.000
 skies+0.000
 two+0.000
There+0.000
Neg logit contributions
 habit-0.001
oc-0.001
alk-0.001
 agree-0.001
orse-0.001
TokenTim
Token ID14967
Feature activation-1.25e-02

Pos logit contributions
ths+0.000
 skies+0.000
 next+0.000
 two+0.000
There+0.000
Neg logit contributions
 habit-0.000
alk-0.000
oc-0.000
 agree-0.000
orse-0.000
Token went
Token ID1816
Feature activation+2.01e-03

Pos logit contributions
 habit+0.021
oc+0.021
 agree+0.020
orse+0.019
alk+0.019
Neg logit contributions
ths-0.021
 skies-0.021
ips-0.019
 highest-0.019
ting-0.019
Token to
Token ID284
Feature activation+1.63e-03

Pos logit contributions
ths+0.003
 skies+0.003
 highest+0.002
ting+0.002
 such+0.002
Neg logit contributions
 habit-0.004
oc-0.004
 agree-0.004
alk-0.004
 policy-0.004
Token excited
Token ID6568
Feature activation+1.04e-03

Pos logit contributions
ths+0.003
 skies+0.003
 next+0.003
 two+0.003
 both+0.003
Neg logit contributions
alk-0.003
oc-0.003
 habit-0.003
orse-0.003
 agree-0.003
Token to
Token ID284
Feature activation-1.57e-03

Pos logit contributions
ths+0.002
 highest+0.002
ips+0.002
 skies+0.002
There+0.002
Neg logit contributions
 habit-0.002
 agree-0.002
 policy-0.001
alk-0.001
oc-0.001
Token see
Token ID766
Feature activation-8.66e-03

Pos logit contributions
oc+0.003
 habit+0.003
alk+0.002
orse+0.002
 agree+0.002
Neg logit contributions
 skies-0.003
 two-0.002
ths-0.002
 next-0.002
 both-0.002
Token what
Token ID644
Feature activation+2.37e-02

Pos logit contributions
 habit+0.017
oc+0.017
 agree+0.017
 understand+0.016
 believe+0.015
Neg logit contributions
ths-0.012
 skies-0.012
ips-0.011
 highest-0.011
There-0.010
Token was
Token ID373
Feature activation-4.22e-03

Pos logit contributions
ths+0.042
 skies+0.041
ting+0.039
 highest+0.039
ips+0.038
Neg logit contributions
 habit-0.038
oc-0.037
 agree-0.035
alk-0.035
orse-0.034
Token inside
Token ID2641
Feature activation-1.77e-02

Pos logit contributions
 policy+0.008
 habit+0.008
ool+0.007
 believe+0.007
oc+0.007
Neg logit contributions
 skies-0.006
ths-0.006
efully-0.006
ips-0.006
 highest-0.005
Token!
Token ID0
Feature activation+2.55e-02

Pos logit contributions
 habit+0.032
oc+0.031
 agree+0.030
alk+0.029
orse+0.028
Neg logit contributions
ths-0.028
 skies-0.028
 highest-0.026
ting-0.025
ips-0.025
Token↵
Token ID198
Feature activation+3.61e-03

Pos logit contributions
ths+0.039
 skies+0.039
ting+0.036
 highest+0.035
 two+0.035
Neg logit contributions
 habit-0.048
oc-0.047
alk-0.044
 agree-0.044
orse-0.044
Token↵
Token ID198
Feature activation+3.60e-03

Pos logit contributions
 skies+0.004
 next+0.004
There+0.004
 two+0.004
ths+0.004
Neg logit contributions
 habit-0.008
oc-0.008
alk-0.007
 agree-0.007
orse-0.007
TokenL
Token ID43
Feature activation+1.55e-03

Pos logit contributions
ths+0.005
 skies+0.005
 next+0.005
There+0.005
 two+0.005
Neg logit contributions
 habit-0.007
oc-0.007
alk-0.007
orse-0.007
 agree-0.006
Tokenily
Token ID813
Feature activation+3.59e-04

Pos logit contributions
ths+0.002
 skies+0.002
 next+0.002
 two+0.002
ips+0.002
Neg logit contributions
oc-0.003
 habit-0.003
orse-0.003
 agree-0.003
alk-0.003
Token they
Token ID484
Feature activation+5.83e-04

Pos logit contributions
 habit+0.006
oc+0.006
alk+0.006
 policy+0.005
 agree+0.005
Neg logit contributions
ths-0.006
 skies-0.005
 two-0.005
 next-0.005
ting-0.005
Token decided
Token ID3066
Feature activation-1.35e-02

Pos logit contributions
ths+0.001
 skies+0.001
ting+0.001
 highest+0.001
ips+0.001
Neg logit contributions
 habit-0.001
oc-0.001
 agree-0.001
alk-0.001
orse-0.001
Token to
Token ID284
Feature activation-1.60e-03

Pos logit contributions
 habit+0.026
 agree+0.026
oc+0.024
orse+0.024
alk+0.024
Neg logit contributions
 skies-0.019
ths-0.019
 highest-0.018
ips-0.018
ting-0.018
Token play
Token ID711
Feature activation-4.12e-03

Pos logit contributions
 habit+0.003
oc+0.003
alk+0.002
orse+0.002
 policy+0.002
Neg logit contributions
 skies-0.003
ths-0.003
 two-0.003
 highest-0.003
 both-0.003
Token tag
Token ID7621
Feature activation-2.75e-03

Pos logit contributions
 habit+0.009
oc+0.009
 agree+0.009
alk+0.009
orse+0.009
Neg logit contributions
ths-0.005
 skies-0.005
ting-0.004
 highest-0.004
ips-0.004
Token instead
Token ID2427
Feature activation-8.05e-02

Pos logit contributions
 habit+0.006
oc+0.006
alk+0.006
 agree+0.006
orse+0.006
Neg logit contributions
ths-0.003
 skies-0.003
ting-0.003
 highest-0.003
 two-0.003
Token.
Token ID13
Feature activation-9.71e-03

Pos logit contributions
 habit+0.144
oc+0.142
alk+0.134
 agree+0.134
orse+0.131
Neg logit contributions
ths-0.127
 skies-0.126
ting-0.118
 next-0.117
 two-0.117
Token They
Token ID1119
Feature activation-8.02e-03

Pos logit contributions
 habit+0.016
oc+0.016
 agree+0.015
alk+0.015
orse+0.015
Neg logit contributions
ths-0.017
 skies-0.016
ting-0.016
 highest-0.015
ips-0.015
Token had
Token ID550
Feature activation-5.94e-03

Pos logit contributions
oc+0.015
 habit+0.015
alk+0.014
 policy+0.014
orse+0.014
Neg logit contributions
ths-0.012
 skies-0.012
 both-0.011
 two-0.011
ting-0.011
Token so
Token ID523
Feature activation-6.92e-03

Pos logit contributions
 habit+0.013
 agree+0.013
oc+0.012
 understand+0.012
orse+0.012
Neg logit contributions
ths-0.007
ting-0.007
ays-0.006
ips-0.006
 skies-0.006
Token much
Token ID881
Feature activation-2.89e-05

Pos logit contributions
 habit+0.014
alk+0.013
oc+0.013
 policy+0.013
orse+0.012
Neg logit contributions
 two-0.009
 both-0.009
There-0.009
ths-0.009
 skies-0.009
Token and
Token ID290
Feature activation-2.54e-03

Pos logit contributions
 habit+0.040
oc+0.039
 agree+0.037
alk+0.037
orse+0.037
Neg logit contributions
ths-0.022
 skies-0.022
ting-0.020
 highest-0.020
 two-0.020
Token eat
Token ID4483
Feature activation-1.63e-02

Pos logit contributions
 habit+0.004
oc+0.004
alk+0.004
 policy+0.004
orse+0.003
Neg logit contributions
 skies-0.005
ths-0.005
 two-0.004
 next-0.004
 such-0.004
Token it
Token ID340
Feature activation-1.87e-02

Pos logit contributions
 habit+0.041
oc+0.040
alk+0.039
 agree+0.039
orse+0.039
Neg logit contributions
 skies-0.014
ths-0.014
 such-0.013
 two-0.013
ting-0.012
Token.
Token ID13
Feature activation-4.53e-03

Pos logit contributions
 habit+0.034
oc+0.033
 agree+0.032
alk+0.032
orse+0.031
Neg logit contributions
ths-0.029
 skies-0.029
ting-0.027
ips-0.026
 highest-0.026
Token Big
Token ID4403
Feature activation-1.69e-02

Pos logit contributions
 habit+0.008
oc+0.007
 agree+0.007
alk+0.007
orse+0.007
Neg logit contributions
ths-0.008
 skies-0.008
ting-0.007
 highest-0.007
 such-0.007
Token mistake
Token ID7457
Feature activation-6.86e-02

Pos logit contributions
oc+0.030
 habit+0.029
alk+0.027
 agree+0.027
orse+0.026
Neg logit contributions
ths-0.028
 skies-0.027
 highest-0.025
ips-0.025
ting-0.025
Token!
Token ID0
Feature activation+1.90e-02

Pos logit contributions
 habit+0.118
oc+0.115
alk+0.110
 agree+0.109
orse+0.107
Neg logit contributions
ths-0.115
 skies-0.113
ting-0.107
 two-0.105
 highest-0.104
Token When
Token ID1649
Feature activation+1.73e-02

Pos logit contributions
ths+0.033
 skies+0.032
ting+0.030
 highest+0.030
 next+0.030
Neg logit contributions
oc-0.032
 habit-0.032
alk-0.029
 agree-0.029
orse-0.029
Token he
Token ID339
Feature activation+1.09e-02

Pos logit contributions
ths+0.033
 skies+0.033
 highest+0.032
ting+0.032
ips+0.031
Neg logit contributions
 habit-0.025
oc-0.025
orse-0.023
 agree-0.023
alk-0.022
Token bit
Token ID1643
Feature activation+2.94e-03

Pos logit contributions
ths+0.018
 skies+0.018
 next+0.017
 two+0.017
 such+0.017
Neg logit contributions
 habit-0.019
oc-0.018
alk-0.018
 agree-0.017
orse-0.017
Token down
Token ID866
Feature activation+3.56e-02

Pos logit contributions
ths+0.003
 skies+0.003
 highest+0.003
Why+0.002
ting+0.002
Neg logit contributions
alk-0.007
oc-0.007
 habit-0.007
orse-0.006
 agree-0.006
Token was
Token ID373
Feature activation-3.55e-03

Pos logit contributions
ths+0.013
 skies+0.012
 next+0.012
 two+0.012
ting+0.012
Neg logit contributions
 habit-0.010
oc-0.010
alk-0.009
 agree-0.009
orse-0.009
Token lost
Token ID2626
Feature activation+2.62e-02

Pos logit contributions
 habit+0.007
orse+0.007
 agree+0.007
 understand+0.007
alk+0.006
Neg logit contributions
ths-0.005
 skies-0.005
ays-0.004
ips-0.004
 highest-0.004
Token and
Token ID290
Feature activation+3.26e-03

Pos logit contributions
ths+0.037
 skies+0.037
ting+0.034
 highest+0.033
 next+0.033
Neg logit contributions
 habit-0.051
oc-0.051
 agree-0.049
alk-0.048
orse-0.048
Token didn
Token ID1422
Feature activation+7.84e-03

Pos logit contributions
ths+0.006
 skies+0.006
ting+0.005
 two+0.005
 such+0.005
Neg logit contributions
oc-0.005
 habit-0.005
alk-0.005
 agree-0.005
orse-0.005
Token't
Token ID470
Feature activation+2.77e-03

Pos logit contributions
ths+0.010
 skies+0.010
ting+0.009
ople+0.009
ips+0.008
Neg logit contributions
 habit-0.016
alk-0.016
oc-0.016
 understand-0.015
orse-0.015
Token know
Token ID760
Feature activation-6.69e-02

Pos logit contributions
 back+0.006
 right+0.006
 two+0.006
 too+0.006
 both+0.006
Neg logit contributions
oc-0.004
 habit-0.003
orse-0.003
kin-0.003
alk-0.003
Token how
Token ID703
Feature activation-5.52e-03

Pos logit contributions
 habit+0.122
oc+0.120
orse+0.114
 agree+0.113
alk+0.111
Neg logit contributions
ths-0.102
 skies-0.099
 highest-0.094
ips-0.093
There-0.089
Token to
Token ID284
Feature activation-1.75e-04

Pos logit contributions
oc+0.013
 habit+0.013
 agree+0.012
alk+0.012
orse+0.012
Neg logit contributions
 skies-0.006
ths-0.006
 next-0.006
 two-0.005
 such-0.005
Token get
Token ID651
Feature activation-2.58e-02

Pos logit contributions
oc+0.000
 habit+0.000
orse+0.000
onest+0.000
alk+0.000
Neg logit contributions
 next-0.000
 both-0.000
 two-0.000
 back-0.000
 me-0.000
Token back
Token ID736
Feature activation+1.80e-02

Pos logit contributions
 habit+0.054
 agree+0.051
 understand+0.050
orse+0.050
 policy+0.049
Neg logit contributions
ths-0.035
 skies-0.032
ips-0.029
 highest-0.028
ting-0.027
Token home
Token ID1363
Feature activation-5.79e-03

Pos logit contributions
ths+0.031
 skies+0.029
 highest+0.027
ting+0.027
There+0.027
Neg logit contributions
 habit-0.031
oc-0.030
alk-0.029
 agree-0.029
 policy-0.028
Token sad
Token ID6507
Feature activation-2.26e-02

Pos logit contributions
 habit+0.004
oc+0.004
alk+0.003
 agree+0.003
orse+0.003
Neg logit contributions
 skies-0.002
ths-0.002
ting-0.002
 two-0.002
 highest-0.002
Token and
Token ID290
Feature activation+1.60e-03

Pos logit contributions
oc+0.044
 habit+0.044
alk+0.043
orse+0.042
 agree+0.042
Neg logit contributions
 skies-0.030
ths-0.030
ting-0.030
 next-0.029
 such-0.029
Token he
Token ID339
Feature activation+1.00e-02

Pos logit contributions
ths+0.003
 skies+0.003
 highest+0.003
ting+0.003
ips+0.003
Neg logit contributions
 habit-0.003
oc-0.002
 agree-0.002
alk-0.002
orse-0.002
Token didn
Token ID1422
Feature activation+5.34e-03

Pos logit contributions
ths+0.019
 skies+0.018
 two+0.017
 next+0.017
ting+0.017
Neg logit contributions
 habit-0.015
oc-0.015
alk-0.015
 agree-0.014
orse-0.014
Token't
Token ID470
Feature activation+7.22e-04

Pos logit contributions
ths+0.006
 skies+0.006
ting+0.006
ople+0.005
ips+0.005
Neg logit contributions
 habit-0.012
alk-0.011
oc-0.011
 agree-0.011
orse-0.011
Token know
Token ID760
Feature activation-7.07e-02

Pos logit contributions
 back+0.002
 both+0.002
 two+0.002
 right+0.001
 too+0.001
Neg logit contributions
oc-0.001
 habit-0.001
 policy-0.001
alk-0.001
onest-0.001
Token why
Token ID1521
Feature activation+1.18e-02

Pos logit contributions
 habit+0.126
oc+0.123
orse+0.117
 agree+0.117
alk+0.113
Neg logit contributions
ths-0.111
 skies-0.110
ips-0.104
 highest-0.103
There-0.099
Token.
Token ID13
Feature activation+4.97e-03

Pos logit contributions
 next+0.013
 skies+0.013
 such+0.013
 two+0.013
ths+0.013
Neg logit contributions
oc-0.026
 habit-0.026
 agree-0.025
alk-0.024
 bond-0.023
Token␣
Token ID220
Feature activation+5.01e-03

Pos logit contributions
ths+0.008
 skies+0.008
ting+0.007
 highest+0.007
 two+0.007
Neg logit contributions
 habit-0.009
oc-0.009
alk-0.008
 agree-0.008
orse-0.008
Token↵
Token ID198
Feature activation-9.75e-04

Pos logit contributions
 skies+0.006
ths+0.006
 two+0.006
 next+0.006
There+0.006
Neg logit contributions
 habit-0.011
oc-0.010
alk-0.010
 agree-0.010
orse-0.010
Token↵
Token ID198
Feature activation-7.95e-04

Pos logit contributions
 habit+0.002
oc+0.002
alk+0.002
 agree+0.002
 policy+0.002
Neg logit contributions
 skies-0.001
ths-0.001
 two-0.001
 next-0.001
There-0.001
Token're
Token ID821
Feature activation-1.42e-03

Pos logit contributions
 habit+0.046
oc+0.046
alk+0.042
 agree+0.042
orse+0.041
Neg logit contributions
 skies-0.064
ths-0.064
 two-0.061
 highest-0.061
 such-0.060
Token welcome
Token ID7062
Feature activation-5.68e-02

Pos logit contributions
 habit+0.002
 understand+0.001
 agree+0.001
 believe+0.001
 plan+0.001
Neg logit contributions
ths-0.003
 skies-0.003
ays-0.003
ting-0.003
ips-0.003
Token,
Token ID11
Feature activation-7.47e-03

Pos logit contributions
oc+0.066
 agree+0.062
 habit+0.057
orse+0.056
 believe+0.054
Neg logit contributions
ths-0.127
 skies-0.124
ips-0.123
 highest-0.119
There-0.116
Token Rh
Token ID10323
Feature activation+3.07e-03

Pos logit contributions
 habit+0.014
oc+0.014
alk+0.013
 agree+0.013
orse+0.013
Neg logit contributions
ths-0.011
 skies-0.011
ting-0.010
 highest-0.010
 two-0.010
Tokenin
Token ID259
Feature activation+2.24e-02

Pos logit contributions
ths+0.003
 skies+0.003
 next+0.003
 two+0.003
ting+0.003
Neg logit contributions
 habit-0.007
alk-0.007
 agree-0.007
oc-0.007
orse-0.006
Tokenoc
Token ID420
Feature activation-9.94e-02

Pos logit contributions
 next+0.061
 two+0.061
ths+0.059
 skies+0.058
ting+0.057
Neg logit contributions
alk-0.015
 habit-0.014
 policy-0.011
 agree-0.011
ruits-0.009
Tokeneros
Token ID27498
Feature activation+2.39e-02

Pos logit contributions
oc+0.217
 habit+0.200
alk+0.189
 agree+0.189
orse+0.187
Neg logit contributions
There-0.122
 skies-0.122
ths-0.121
 highest-0.116
 such-0.112
Token,"
Token ID553
Feature activation+6.17e-02

Pos logit contributions
 next+0.036
 two+0.035
 skies+0.035
ths+0.035
 such+0.035
Neg logit contributions
 habit-0.044
alk-0.042
 agree-0.040
oc-0.039
 policy-0.039
Token the
Token ID262
Feature activation+4.43e-04

Pos logit contributions
ths+0.094
 skies+0.092
ting+0.086
 highest+0.085
 two+0.084
Neg logit contributions
 habit-0.115
oc-0.113
alk-0.108
 agree-0.108
orse-0.105
Token unicorn
Token ID44986
Feature activation-4.63e-03

Pos logit contributions
ths+0.001
 skies+0.001
ting+0.001
 highest+0.001
 such+0.001
Neg logit contributions
 habit-0.001
oc-0.001
 agree-0.001
alk-0.001
orse-0.001
Token said
Token ID531
Feature activation+1.14e-03

Pos logit contributions
 habit+0.008
alk+0.007
oc+0.007
 policy+0.007
 agree+0.007
Neg logit contributions
ths-0.008
 two-0.008
 skies-0.008
 next-0.008
 both-0.007