TOP ACTIVATIONS
MAX = 6.397

of green and lots of tiny , colorful fish ! Molly
big pond with lots of blue water . The sun was
, forests and lots of green grass . She even saw
of water with lots of green plants around it . Ellie
could see lots of av oc ados that the supplier had
a place with lots of green plants and cool , refreshing
yellow leaves and lots of tiny flowers . He ran to
It has red spots and blue eyes ." Tom
only eat blue berries or black berries . They are good
bush with green leaves and red flowers . She thought Ben
stripes , and some had orange and black stripes .
look and found lots of little stitches with coloured thread .
lake , and lots of furry animals . Lucy started running
center with lots of little black dots . The butterfly was
big trees with lots of green leaves . He also saw
Some bees had yellow and black stripes , and some had
full of tall trees and green plants . He put on
he met lots of other furry friends and finally he felt
frog with w arts and red eyes . It was sitting
and some had orange and black stripes . "

MIN ACTIVATIONS
MIN = -4.828

and make it look new again ?" Tim
do to be a good doctor ?" The old
help me to build a books he lf ?" he asked
! Can I please take lessons ?" M um
would fly up to the tre et op and give the
you like to take singing lessons ?" Lu cy
How will we make it heal to our wall ?". Jerry
! Can I mail something please ?" The man
. Can we come back again ?" His parents smiled and
You are a very good writer . Your uncle will be
can make it look new again ." They went
. Can I write a story about you ?"
broken . What should we do ?" mom my said .
make my dress look new again ?" May
book . He was a writer and a traveler . He
! You are a great writer !" she says .
. Can we come back again ?" Mom and
How about we make a deal ? If you eat one
! Can we buy popcorn again some day ?"
And you are a good writer . But next time ,

INTERVAL 3.591 - 6.397
CONTAINS 0.029%

a hen . He uses brown , pink , white and
a strange creature with long green hair and big brown eyes
my some hot dogs with k etchup and mustard . He
had long yellow hair and blue eyes . She was playing
perfect bush with lots of juicy berries . Little Tim was

INTERVAL 0.785 - 3.591
CONTAINS 16.135%

My name is Tim , nice to meet ya !" <|endoftext|>
top . A little boy named Tim loved to climb the
and the book are not as important as you and me
. They say : " Maybe you can have
Once upon a time there was a boy . He was

INTERVAL -2.022 - 0.785
CONTAINS 82.700%

When she was ready , she said , " We ee
play . The grass was wet , but the sun was
it had paid off . Now she had a new skill
water all around the pool . Tim my
. " Go away , bad dog ! Leave

INTERVAL -4.828 - -2.022
CONTAINS 1.136%

hands . Then lift it up above your head .
tank . The tank was filled with water , and it
, it was a black man . The girl
street , she heard a sound . It was not a
room . There were some books , some clothes , and
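
The four INTERVAL boundaries above are consistent with splitting the range from MIN = -4.828 to MAX = 6.397 into four equal-width bins, and the CONTAINS percentages sum to 100%. The sketch below shows one way such bins and percentages could be computed. It is an illustration under stated assumptions, not the dashboard's actual code: the function names `interval_edges` and `interval_percentages` are hypothetical, and the choice to count a value sitting exactly on an edge in the upper bin is a guess.

```python
from typing import List, Sequence, Tuple


def interval_edges(lo: float, hi: float, n_bins: int = 4) -> List[float]:
    """Return n_bins + 1 equally spaced edges running from lo to hi."""
    width = (hi - lo) / n_bins
    # Use hi itself as the last edge to avoid floating-point drift.
    return [lo + i * width for i in range(n_bins)] + [hi]


def interval_percentages(
    activations: Sequence[float], n_bins: int = 4
) -> List[Tuple[float, float, float]]:
    """Bucket a non-empty sequence of activations into equal-width bins.

    Returns one (lower_edge, upper_edge, percent_of_values) tuple per bin,
    ordered from the lowest bin to the highest.
    """
    lo, hi = min(activations), max(activations)
    edges = interval_edges(lo, hi, n_bins)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for a in activations:
        if width == 0:
            idx = n_bins - 1  # all values identical: put them in the top bin
        else:
            # Assumption: edge values go to the upper bin; the maximum is
            # clamped into the last bin.
            idx = min(int((a - lo) / width), n_bins - 1)
        counts[idx] += 1
    total = len(activations)
    return [
        (edges[i], edges[i + 1], 100.0 * counts[i] / total)
        for i in range(n_bins)
    ]


if __name__ == "__main__":
    # Edges implied by the MIN and MAX values reported in this dump.
    print([round(e, 3) for e in interval_edges(-4.828, 6.397)])
    # Toy data only; the real per-token activations are not all listed here.
    for low, high, pct in interval_percentages([0.0, 1.0, 2.0, 3.0, 4.0]):
        print(f"INTERVAL {low:.3f} - {high:.3f} CONTAINS {pct:.3f}%")
```

Passing the full list of activations for this feature to `interval_percentages` would, under these assumptions, yield figures comparable to the CONTAINS lines above. The listing order here is lowest bin first, whereas the dump lists the highest interval first.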
Token of
Token ID286
Feature activation+5.36e-01

Pos logit contributions
handed+0.062
 exhibit+0.061
unch+0.060
anger+0.059
 exits+0.059
Neg logit contributions
 Amen-0.039
 their-0.037
 very-0.035
 smile-0.035
 the-0.031
Token green
Token ID4077
Feature activation+4.40e+00

Pos logit contributions
 exits+0.016
handed+0.015
 exhibit+0.015
 worms+0.015
orns+0.015
Neg logit contributions
 Amen-0.013
 smile-0.011
 their-0.011
te-0.011
 very-0.011
Token and
Token ID290
Feature activation+1.04e-01

Pos logit contributions
handed+0.137
 exhibit+0.124
lor+0.123
unch+0.123
anger+0.119
Neg logit contributions
 their-0.111
 very-0.109
 smile-0.105
 like-0.102
 the-0.100
Token lots
Token ID6041
Feature activation+1.53e+00

Pos logit contributions
 exits+0.004
 worms+0.003
handed+0.003
 exhibit+0.003
 discovery+0.003
Neg logit contributions
 Amen-0.002
 smile-0.002
 their-0.002
te-0.001
 very-0.001
Token of
Token ID286
Feature activation+2.23e+00

Pos logit contributions
handed+0.059
 exhibit+0.056
lor+0.056
anger+0.055
unch+0.054
Neg logit contributions
 their-0.027
 very-0.026
 smile-0.025
 like-0.024
 the-0.023
Token tiny
Token ID7009
Feature activation+6.40e+00

Pos logit contributions
handed+0.061
 exhibit+0.058
 exits+0.057
unch+0.056
orns+0.055
Neg logit contributions
 Amen-0.060
 their-0.058
 very-0.056
 smile-0.055
te-0.050
Token,
Token ID11
Feature activation+4.64e-01

Pos logit contributions
handed+0.186
lor+0.175
unch+0.166
anger+0.164
 exhibit+0.160
Neg logit contributions
 their-0.185
 very-0.182
 smile-0.174
 like-0.171
 the-0.171
Token colorful
Token ID20239
Feature activation+4.49e+00

Pos logit contributions
handed+0.016
 exits+0.016
 exhibit+0.016
unch+0.016
 worms+0.016
Neg logit contributions
 Amen-0.009
 their-0.007
 smile-0.007
 very-0.006
te-0.006
Token fish
Token ID5916
Feature activation+2.59e+00

Pos logit contributions
handed+0.125
lor+0.112
 exhibit+0.111
unch+0.111
anger+0.108
Neg logit contributions
 their-0.130
 very-0.127
 smile-0.125
 the-0.119
 like-0.118
Token!
Token ID0
Feature activation+4.78e-01

Pos logit contributions
handed+0.116
lor+0.110
anger+0.109
 exits+0.107
 exhibit+0.106
Neg logit contributions
 very-0.035
 their-0.034
 like-0.029
 smile-0.027
 the-0.027
Token Molly
Token ID30236
Feature activation+5.49e-01

Pos logit contributions
handed+0.018
 exhibit+0.017
unch+0.017
 exits+0.017
anger+0.017
Neg logit contributions
 their-0.008
 Amen-0.008
 very-0.008
 smile-0.008
 the-0.007
Token big
Token ID1263
Feature activation+8.73e-01

Pos logit contributions
 Amen+0.004
 their+0.004
te+0.003
 smile+0.003
 like+0.003
Neg logit contributions
 exhibit-0.007
handed-0.007
unch-0.007
 exits-0.006
asis-0.006
Token pond
Token ID16723
Feature activation+3.34e-01

Pos logit contributions
 exhibit+0.028
 discovery+0.026
 exits+0.026
handed+0.026
asis+0.026
Neg logit contributions
 Amen-0.019
te-0.017
 their-0.017
 very-0.014
bell-0.014
Token with
Token ID351
Feature activation+1.23e+00

Pos logit contributions
handed+0.013
 exhibit+0.013
unch+0.013
 exits+0.013
anger+0.013
Neg logit contributions
 their-0.005
 Amen-0.005
 very-0.004
 smile-0.004
 like-0.004
Token lots
Token ID6041
Feature activation+1.09e+00

Pos logit contributions
handed+0.044
 exhibit+0.043
unch+0.041
 exits+0.041
anger+0.041
Neg logit contributions
 Amen-0.023
 their-0.022
 very-0.022
 smile-0.021
 like-0.019
Token of
Token ID286
Feature activation+2.83e+00

Pos logit contributions
handed+0.042
lor+0.040
 exhibit+0.040
anger+0.039
unch+0.038
Neg logit contributions
 their-0.019
 very-0.018
 smile-0.017
 the-0.016
 like-0.016
Token blue
Token ID4171
Feature activation+6.24e+00

Pos logit contributions
handed+0.078
lor+0.073
 exhibit+0.072
anger+0.071
unch+0.071
Neg logit contributions
 their-0.082
 very-0.080
 smile-0.076
 the-0.074
 like-0.074
Token water
Token ID1660
Feature activation+1.67e+00

Pos logit contributions
handed+0.209
lor+0.191
 exhibit+0.182
anger+0.180
unch+0.173
Neg logit contributions
 their-0.161
 very-0.148
 the-0.145
 smile-0.145
 like-0.143
Token.
Token ID13
Feature activation-4.72e-01

Pos logit contributions
handed+0.072
 exhibit+0.068
lor+0.068
unch+0.067
 exits+0.067
Neg logit contributions
 their-0.021
 very-0.020
 smile-0.018
 like-0.017
 the-0.017
Token The
Token ID383
Feature activation+5.90e-01

Pos logit contributions
 Amen+0.009
 their+0.008
 smile+0.008
 very+0.008
 like+0.007
Neg logit contributions
handed-0.017
 exhibit-0.016
unch-0.016
 exits-0.016
anger-0.016
Token sun
Token ID4252
Feature activation+1.85e-01

Pos logit contributions
handed+0.016
lor+0.015
 exhibit+0.014
unch+0.014
anger+0.014
Neg logit contributions
 very-0.018
 their-0.018
 smile-0.017
 the-0.016
 like-0.016
Token was
Token ID373
Feature activation-5.31e-01

Pos logit contributions
 exhibit+0.006
handed+0.006
unch+0.006
oles+0.006
asis+0.006
Neg logit contributions
 Amen-0.004
 their-0.003
 very-0.003
 smile-0.003
 low-0.003
Token,
Token ID11
Feature activation-3.17e-01

Pos logit contributions
handed+0.038
 exhibit+0.037
unch+0.036
 exits+0.036
anger+0.036
Neg logit contributions
 their-0.018
 very-0.017
 Amen-0.017
 smile-0.017
 like-0.015
Token forests
Token ID17039
Feature activation+5.45e-01

Pos logit contributions
 Amen+0.006
 their+0.005
 smile+0.004
 very+0.004
te+0.004
Neg logit contributions
handed-0.011
 exhibit-0.011
 exits-0.011
unch-0.011
 discovery-0.011
Token and
Token ID290
Feature activation+8.80e-01

Pos logit contributions
handed+0.021
 exhibit+0.020
unch+0.020
anger+0.020
 exits+0.020
Neg logit contributions
 Amen-0.010
 their-0.008
 very-0.007
 smile-0.007
te-0.006
Token lots
Token ID6041
Feature activation+1.61e+00

Pos logit contributions
handed+0.027
 exhibit+0.026
 exits+0.025
unch+0.024
orns+0.024
Neg logit contributions
 Amen-0.021
 their-0.020
 smile-0.019
 very-0.019
te-0.018
Token of
Token ID286
Feature activation+2.46e+00

Pos logit contributions
handed+0.060
 exhibit+0.058
lor+0.058
anger+0.056
unch+0.056
Neg logit contributions
 their-0.029
 very-0.028
 smile-0.027
 the-0.026
 like-0.025
Token green
Token ID4077
Feature activation+6.03e+00

Pos logit contributions
handed+0.068
 exhibit+0.065
unch+0.064
lor+0.063
anger+0.063
Neg logit contributions
 their-0.067
 very-0.066
 Amen-0.065
 smile-0.064
 the-0.060
Token grass
Token ID8701
Feature activation+3.22e+00

Pos logit contributions
handed+0.176
lor+0.165
unch+0.158
anger+0.152
 exhibit+0.149
Neg logit contributions
 their-0.174
 very-0.167
 the-0.163
 like-0.161
 smile-0.161
Token.
Token ID13
Feature activation-5.92e-01

Pos logit contributions
handed+0.128
 exhibit+0.123
unch+0.120
 exits+0.120
anger+0.119
Neg logit contributions
 their-0.048
 very-0.046
 Amen-0.045
 smile-0.044
 like-0.039
Token She
Token ID1375
Feature activation+7.88e-02

Pos logit contributions
 Amen+0.009
 their+0.009
 very+0.009
 smile+0.009
 like+0.007
Neg logit contributions
handed-0.023
 exhibit-0.022
unch-0.021
 exits-0.021
anger-0.021
Token even
Token ID772
Feature activation-3.66e-01

Pos logit contributions
handed+0.003
 exhibit+0.003
unch+0.003
 exits+0.003
anger+0.003
Neg logit contributions
 Amen-0.001
 their-0.001
 very-0.001
 smile-0.001
 like-0.001
Token saw
Token ID2497
Feature activation+2.20e+00

Pos logit contributions
 Amen+0.008
 their+0.008
 very+0.007
 smile+0.007
 like+0.006
Neg logit contributions
handed-0.012
 exhibit-0.012
unch-0.011
 exits-0.011
anger-0.011
Token of
Token ID286
Feature activation-2.39e-01

Pos logit contributions
handed+0.012
 exhibit+0.012
unch+0.012
 exits+0.012
lor+0.012
Neg logit contributions
 their-0.004
 very-0.004
 Amen-0.004
 smile-0.004
 like-0.003
Token water
Token ID1660
Feature activation+5.47e-01

Pos logit contributions
 Amen+0.007
 smile+0.007
 their+0.007
 very+0.007
te+0.006
Neg logit contributions
handed-0.006
 exhibit-0.006
 exits-0.005
unch-0.005
orns-0.005
Token with
Token ID351
Feature activation+7.19e-01

Pos logit contributions
handed+0.023
lor+0.022
 exhibit+0.022
 exits+0.022
unch+0.022
Neg logit contributions
 their-0.007
 very-0.007
 smile-0.006
 like-0.006
 the-0.006
Token lots
Token ID6041
Feature activation+1.19e+00

Pos logit contributions
handed+0.027
lor+0.026
anger+0.026
unch+0.026
 exhibit+0.026
Neg logit contributions
 their-0.014
 very-0.014
 the-0.012
 like-0.012
 smile-0.011
Token of
Token ID286
Feature activation+1.87e+00

Pos logit contributions
handed+0.052
lor+0.052
 exhibit+0.050
anger+0.048
uding+0.048
Neg logit contributions
 very-0.015
 their-0.015
 the-0.014
 like-0.014
 smile-0.013
Token green
Token ID4077
Feature activation+5.98e+00

Pos logit contributions
handed+0.054
lor+0.051
unch+0.049
 exhibit+0.049
anger+0.049
Neg logit contributions
 their-0.052
 very-0.051
 smile-0.048
 the-0.048
 like-0.047
Token plants
Token ID6134
Feature activation+1.36e+00

Pos logit contributions
handed+0.198
lor+0.178
unch+0.166
anger+0.161
 exhibit+0.159
Neg logit contributions
 their-0.163
 very-0.152
 the-0.152
 like-0.146
 blue-0.146
Token around
Token ID1088
Feature activation+1.47e+00

Pos logit contributions
handed+0.062
lor+0.060
anger+0.059
 exhibit+0.059
unch+0.058
Neg logit contributions
 very-0.015
 their-0.014
 like-0.012
 smile-0.012
 the-0.011
Token it
Token ID340
Feature activation+1.82e-01

Pos logit contributions
handed+0.062
 exhibit+0.059
unch+0.059
anger+0.058
lor+0.058
Neg logit contributions
 their-0.020
 very-0.019
 Amen-0.018
 smile-0.017
 the-0.016
Token.
Token ID13
Feature activation-4.68e-01

Pos logit contributions
handed+0.008
unch+0.008
lor+0.008
 exhibit+0.008
anger+0.008
Neg logit contributions
 very-0.002
 their-0.002
 the-0.002
 like-0.002
 smile-0.002
Token Ellie
Token ID44801
Feature activation+3.03e-01

Pos logit contributions
 very+0.007
 their+0.006
 Amen+0.006
 the+0.006
 smile+0.005
Neg logit contributions
handed-0.020
asis-0.019
lor-0.019
unch-0.019
anger-0.019
Token could
Token ID714
Feature activation+8.92e-03

Pos logit contributions
handed+0.063
 exhibit+0.060
unch+0.059
 exits+0.059
anger+0.058
Neg logit contributions
 Amen-0.027
 their-0.027
 very-0.026
 smile-0.025
 like-0.022
Token see
Token ID766
Feature activation+1.23e+00

Pos logit contributions
handed+0.000
 exhibit+0.000
unch+0.000
 exits+0.000
anger+0.000
Neg logit contributions
 Amen-0.000
 their-0.000
 very-0.000
 smile-0.000
 like-0.000
Token lots
Token ID6041
Feature activation+3.90e-01

Pos logit contributions
handed+0.053
anger+0.052
unch+0.052
 exhibit+0.052
 exits+0.051
Neg logit contributions
 their-0.016
 very-0.015
 the-0.013
 like-0.012
 smile-0.012
Token of
Token ID286
Feature activation+1.39e+00

Pos logit contributions
handed+0.015
 exhibit+0.015
lor+0.014
anger+0.014
 exits+0.014
Neg logit contributions
 very-0.006
 their-0.006
 smile-0.006
 like-0.006
 the-0.006
Token av
Token ID1196
Feature activation-3.11e-01

Pos logit contributions
anger+0.046
handed+0.045
lor+0.045
 exhibit+0.045
unch+0.043
Neg logit contributions
 very-0.031
 their-0.030
 smile-0.030
 the-0.028
 Amen-0.028
Tokenoc
Token ID420
Feature activation+5.96e+00

Pos logit contributions
 very+0.009
 Amen+0.009
 smile+0.009
 like+0.008
 their+0.008
Neg logit contributions
anger-0.008
handed-0.008
 exits-0.007
 exhibit-0.007
lor-0.007
Tokenados
Token ID22484
Feature activation+3.96e-01

Pos logit contributions
 exits+0.134
anger+0.130
lor+0.130
 exhibit+0.130
eway+0.125
Neg logit contributions
 the-0.159
 very-0.156
 smile-0.154
 Amen-0.152
 their-0.151
Token that
Token ID326
Feature activation+4.45e-01

Pos logit contributions
handed+0.017
anger+0.016
 exhibit+0.016
 exits+0.016
lor+0.016
Neg logit contributions
 very-0.005
 their-0.005
 like-0.005
 smile-0.004
 the-0.004
Token the
Token ID262
Feature activation+3.21e-02

Pos logit contributions
handed+0.018
 exhibit+0.017
unch+0.017
 exits+0.017
anger+0.017
Neg logit contributions
 Amen-0.006
 their-0.006
 very-0.006
 smile-0.006
 like-0.005
Token supplier
Token ID22693
Feature activation-5.31e-01

Pos logit contributions
handed+0.001
anger+0.001
lor+0.001
 exhibit+0.001
unch+0.001
Neg logit contributions
 very-0.001
 smile-0.001
 their-0.001
 Amen-0.001
 the-0.001
Token had
Token ID550
Feature activation+1.95e-01

Pos logit contributions
 their+0.007
 Amen+0.007
 very+0.007
 smile+0.007
 like+0.006
Neg logit contributions
handed-0.021
 exhibit-0.021
unch-0.020
 exits-0.020
anger-0.020
Token a
Token ID257
Feature activation-4.57e-01

Pos logit contributions
 their+0.006
 very+0.006
 smile+0.005
 Amen+0.005
 like+0.005
Neg logit contributions
handed-0.015
 exhibit-0.014
lor-0.014
anger-0.014
unch-0.014
Token place
Token ID1295
Feature activation-4.46e-01

Pos logit contributions
 Amen+0.009
 their+0.009
 very+0.009
 smile+0.009
 like+0.008
Neg logit contributions
handed-0.015
 exhibit-0.015
unch-0.014
 exits-0.014
anger-0.014
Token with
Token ID351
Feature activation+5.40e-01

Pos logit contributions
 Amen+0.009
 their+0.008
 smile+0.008
 very+0.008
te+0.007
Neg logit contributions
handed-0.015
 exhibit-0.015
 exits-0.014
anger-0.014
unch-0.014
Token lots
Token ID6041
Feature activation+1.33e+00

Pos logit contributions
handed+0.021
 exhibit+0.020
unch+0.020
anger+0.020
lor+0.020
Neg logit contributions
 their-0.009
 very-0.008
 Amen-0.008
 smile-0.008
 the-0.007
Token of
Token ID286
Feature activation+2.55e+00

Pos logit contributions
lor+0.059
handed+0.058
 exhibit+0.056
anger+0.054
asis+0.054
Neg logit contributions
 very-0.017
 the-0.016
 their-0.016
 smile-0.016
 like-0.015
Token green
Token ID4077
Feature activation+5.95e+00

Pos logit contributions
handed+0.080
lor+0.080
anger+0.076
 exhibit+0.076
unch+0.075
Neg logit contributions
 very-0.067
 their-0.067
 the-0.062
 smile-0.061
 like-0.060
Token plants
Token ID6134
Feature activation+1.55e+00

Pos logit contributions
handed+0.182
lor+0.166
unch+0.155
 exhibit+0.153
anger+0.151
Neg logit contributions
 their-0.172
 very-0.164
 the-0.161
 like-0.158
 smile-0.155
Token and
Token ID290
Feature activation+1.38e+00

Pos logit contributions
handed+0.072
lor+0.070
anger+0.069
unch+0.068
 exhibit+0.068
Neg logit contributions
 very-0.017
 their-0.016
 like-0.014
 the-0.014
 smile-0.013
Token cool
Token ID3608
Feature activation+3.74e+00

Pos logit contributions
handed+0.028
 exhibit+0.026
 exits+0.026
orns+0.024
 worms+0.024
Neg logit contributions
 Amen-0.047
 their-0.044
 smile-0.044
 very-0.043
te-0.041
Token,
Token ID11
Feature activation+3.68e-01

Pos logit contributions
handed+0.101
lor+0.095
unch+0.094
 exhibit+0.092
anger+0.089
Neg logit contributions
 their-0.113
 smile-0.107
 very-0.106
 the-0.102
 like-0.102
Token refreshing
Token ID23056
Feature activation+2.82e+00

Pos logit contributions
handed+0.014
 exhibit+0.014
 exits+0.013
unch+0.013
anger+0.013
Neg logit contributions
 Amen-0.006
 their-0.005
 smile-0.005
 very-0.005
te-0.004
Token yellow
Token ID7872
Feature activation+3.96e+00

Pos logit contributions
handed+0.065
 exhibit+0.061
unch+0.059
anger+0.058
lor+0.058
Neg logit contributions
 their-0.066
 Amen-0.065
 very-0.065
 smile-0.064
 the-0.060
Token leaves
Token ID5667
Feature activation+1.53e-01

Pos logit contributions
handed+0.108
 exhibit+0.099
lor+0.096
unch+0.093
 exits+0.093
Neg logit contributions
 very-0.111
 their-0.111
 smile-0.111
 like-0.104
 Amen-0.104
Token and
Token ID290
Feature activation+5.00e-01

Pos logit contributions
handed+0.006
 exhibit+0.006
unch+0.006
 exits+0.006
anger+0.006
Neg logit contributions
 Amen-0.002
 their-0.002
 very-0.002
 smile-0.002
 like-0.002
Token lots
Token ID6041
Feature activation+1.39e+00

Pos logit contributions
handed+0.015
 exits+0.015
 exhibit+0.015
 worms+0.015
orns+0.014
Neg logit contributions
 Amen-0.012
 their-0.010
 smile-0.010
te-0.010
 very-0.010
Token of
Token ID286
Feature activation+1.84e+00

Pos logit contributions
handed+0.053
 exhibit+0.051
lor+0.051
anger+0.049
asis+0.049
Neg logit contributions
 their-0.024
 very-0.024
 smile-0.023
 like-0.022
 the-0.021
Token tiny
Token ID7009
Feature activation+5.94e+00

Pos logit contributions
handed+0.048
 exhibit+0.045
unch+0.044
lor+0.043
anger+0.043
Neg logit contributions
 their-0.053
 very-0.052
 Amen-0.052
 smile-0.051
 like-0.048
Token flowers
Token ID12734
Feature activation+1.27e+00

Pos logit contributions
handed+0.159
lor+0.145
 exhibit+0.143
unch+0.140
anger+0.139
Neg logit contributions
 very-0.177
 their-0.176
 smile-0.174
 the-0.167
 like-0.165
Token.
Token ID13
Feature activation-4.84e-01

Pos logit contributions
handed+0.054
 exhibit+0.051
unch+0.050
lor+0.050
 exits+0.050
Neg logit contributions
 their-0.016
 very-0.016
 smile-0.015
 Amen-0.014
 like-0.013
Token He
Token ID679
Feature activation+2.33e-01

Pos logit contributions
 Amen+0.008
 their+0.007
 very+0.007
 smile+0.007
 like+0.006
Neg logit contributions
handed-0.019
 exhibit-0.018
unch-0.018
 exits-0.018
anger-0.017
Token ran
Token ID4966
Feature activation+5.69e-01

Pos logit contributions
handed+0.009
 exhibit+0.008
unch+0.008
anger+0.008
 exits+0.008
Neg logit contributions
 their-0.004
 Amen-0.004
 very-0.004
 smile-0.004
 the-0.003
Token to
Token ID284
Feature activation-3.94e-01

Pos logit contributions
handed+0.023
 exhibit+0.022
unch+0.022
 exits+0.021
anger+0.021
Neg logit contributions
 Amen-0.008
 their-0.008
 very-0.008
 smile-0.007
 like-0.006
Token It
Token ID632
Feature activation+9.33e-01

Pos logit contributions
handed+0.042
unch+0.041
 exhibit+0.041
anger+0.040
lor+0.039
Neg logit contributions
 their-0.026
 very-0.026
 smile-0.026
 Amen-0.025
 low-0.022
Token has
Token ID468
Feature activation+8.39e-01

Pos logit contributions
handed+0.036
unch+0.034
 exhibit+0.034
anger+0.033
 exits+0.033
Neg logit contributions
 their-0.016
 very-0.015
 Amen-0.015
 smile-0.014
 like-0.013
Token red
Token ID2266
Feature activation+5.41e+00

Pos logit contributions
handed+0.032
 exhibit+0.031
unch+0.030
anger+0.030
 exits+0.030
Neg logit contributions
 their-0.014
 very-0.013
 Amen-0.013
 smile-0.013
 like-0.011
Token spots
Token ID10222
Feature activation-1.28e-01

Pos logit contributions
handed+0.181
 exhibit+0.165
lor+0.162
 exits+0.158
unch+0.157
Neg logit contributions
 their-0.126
 very-0.123
 smile-0.120
 like-0.118
 the-0.114
Token and
Token ID290
Feature activation+1.41e+00

Pos logit contributions
 Amen+0.002
 their+0.002
 smile+0.002
 very+0.002
 smiled+0.001
Neg logit contributions
handed-0.005
 exhibit-0.005
unch-0.005
anger-0.005
orns-0.005
Token blue
Token ID4171
Feature activation+5.80e+00

Pos logit contributions
 worms+0.043
 exhibit+0.042
orns+0.041
 territory+0.041
 exits+0.041
Neg logit contributions
 Amen-0.033
 their-0.027
te-0.027
 smile-0.026
bell-0.026
Token eyes
Token ID2951
Feature activation-1.07e+00

Pos logit contributions
handed+0.207
 exhibit+0.193
lor+0.187
 exits+0.184
anger+0.184
Neg logit contributions
 their-0.120
 very-0.116
 smile-0.113
 like-0.109
 Amen-0.109
Token."
Token ID526
Feature activation+7.19e-01

Pos logit contributions
 Amen+0.021
 their+0.019
 smile+0.019
 very+0.018
 smiled+0.016
Neg logit contributions
handed-0.036
 exhibit-0.035
unch-0.034
orns-0.034
lor-0.034
Token\n
Token ID198
Feature activation+8.46e-01

Pos logit contributions
handed+0.029
 exhibit+0.028
lor+0.028
 exits+0.027
asis+0.027
Neg logit contributions
 their-0.012
 very-0.011
 the-0.010
 smile-0.010
 Amen-0.010
Token\n
Token ID198
Feature activation+7.83e-01

Pos logit contributions
handed+0.037
 exhibit+0.036
 exits+0.036
asis+0.035
lor+0.035
Neg logit contributions
 their-0.012
 the-0.010
 very-0.010
 Amen-0.008
 smile-0.008
TokenTom
Token ID13787
Feature activation+6.09e-01

Pos logit contributions
handed+0.026
 exhibit+0.026
anger+0.025
lor+0.025
 exits+0.024
Neg logit contributions
 their-0.017
 very-0.016
 Amen-0.016
 smile-0.015
 the-0.015
Token only
Token ID691
Feature activation+1.90e+00

Pos logit contributions
 Amen+0.000
 their+0.000
 very+0.000
 smile+0.000
 low+0.000
Neg logit contributions
handed-0.001
 exhibit-0.001
unch-0.001
 exits-0.001
anger-0.001
Token eat
Token ID4483
Feature activation+1.57e+00

Pos logit contributions
handed+0.065
 exhibit+0.063
unch+0.061
 exits+0.060
anger+0.060
Neg logit contributions
 Amen-0.037
 their-0.037
 very-0.036
 smile-0.035
 like-0.031
Token blue
Token ID4171
Feature activation+5.35e+00

Pos logit contributions
handed+0.052
 exhibit+0.050
 exits+0.049
unch+0.049
orns+0.049
Neg logit contributions
 Amen-0.033
 smile-0.030
 their-0.030
 very-0.029
te-0.026
Token berries
Token ID36322
Feature activation+6.81e-01

Pos logit contributions
handed+0.111
 exhibit+0.102
lor+0.097
anger+0.096
unch+0.096
Neg logit contributions
 their-0.184
 very-0.180
 Amen-0.176
 smile-0.176
 like-0.171
Token or
Token ID393
Feature activation+2.15e+00

Pos logit contributions
handed+0.025
 exhibit+0.023
unch+0.023
orns+0.023
anger+0.023
Neg logit contributions
 Amen-0.013
 smile-0.011
 their-0.011
 very-0.010
 smiled-0.010
Token black
Token ID2042
Feature activation+5.73e+00

Pos logit contributions
 worms+0.047
handed+0.046
 exhibit+0.045
 exits+0.044
orns+0.043
Neg logit contributions
 Amen-0.069
te-0.062
 their-0.060
 very-0.060
 smile-0.059
Token berries
Token ID36322
Feature activation+6.21e-01

Pos logit contributions
handed+0.139
 exhibit+0.129
lor+0.126
 exits+0.122
anger+0.122
Neg logit contributions
 their-0.178
 very-0.176
 like-0.170
 smile-0.170
 Amen-0.169
Token.
Token ID13
Feature activation+2.68e-01

Pos logit contributions
handed+0.023
 exhibit+0.022
unch+0.022
anger+0.021
 exits+0.021
Neg logit contributions
 Amen-0.011
 their-0.010
 smile-0.010
 very-0.009
 smiled-0.008
Token They
Token ID1119
Feature activation+1.63e-01

Pos logit contributions
handed+0.009
 exhibit+0.009
 exits+0.009
lor+0.009
unch+0.009
Neg logit contributions
 smile-0.005
 their-0.005
 very-0.005
 Amen-0.004
 smiled-0.004
Token are
Token ID389
Feature activation-6.77e-01

Pos logit contributions
handed+0.006
 exhibit+0.006
 exits+0.005
unch+0.005
orns+0.005
Neg logit contributions
 Amen-0.003
 their-0.003
 very-0.003
 smile-0.003
 low-0.002
Token good
Token ID922
Feature activation+2.47e-01

Pos logit contributions
 Amen+0.013
 smile+0.011
 their+0.010
te+0.010
bell+0.010
Neg logit contributions
 worms-0.022
 exits-0.022
handed-0.022
 exhibit-0.021
unch-0.021
Token bush
Token ID24484
Feature activation+5.93e-02

Pos logit contributions
 Amen+0.001
 their+0.001
 very+0.001
 smile+0.001
te+0.001
Neg logit contributions
 exhibit-0.001
handed-0.001
 exits-0.001
anger-0.001
unch-0.001
Token with
Token ID351
Feature activation+3.38e-01

Pos logit contributions
handed+0.002
 exhibit+0.002
unch+0.002
 exits+0.002
anger+0.002
Neg logit contributions
 Amen-0.001
 their-0.001
 smile-0.001
 very-0.001
 the-0.001
Token green
Token ID4077
Feature activation+4.16e+00

Pos logit contributions
handed+0.014
 exhibit+0.013
unch+0.013
anger+0.013
lor+0.013
Neg logit contributions
 their-0.005
 Amen-0.005
 very-0.004
 smile-0.004
 the-0.004
Token leaves
Token ID5667
Feature activation-1.59e-01

Pos logit contributions
handed+0.119
lor+0.104
unch+0.104
 exhibit+0.103
asis+0.098
Neg logit contributions
 their-0.117
 very-0.116
 smile-0.113
 like-0.111
 the-0.110
Token and
Token ID290
Feature activation+8.59e-01

Pos logit contributions
 Amen+0.002
 their+0.002
 very+0.002
 smile+0.002
 like+0.001
Neg logit contributions
handed-0.007
 exhibit-0.006
unch-0.006
 exits-0.006
anger-0.006
Token red
Token ID2266
Feature activation+5.69e+00

Pos logit contributions
handed+0.029
 exhibit+0.029
 exits+0.028
 worms+0.028
orns+0.027
Neg logit contributions
 Amen-0.018
 their-0.016
 smile-0.015
 very-0.014
te-0.014
Token flowers
Token ID12734
Feature activation+5.18e-01

Pos logit contributions
handed+0.165
lor+0.145
 exhibit+0.140
unch+0.138
asis+0.135
Neg logit contributions
 their-0.169
 very-0.162
 smile-0.160
 like-0.158
 the-0.155
Token.
Token ID13
Feature activation-4.63e-02

Pos logit contributions
handed+0.023
 exhibit+0.022
lor+0.021
unch+0.021
 exits+0.021
Neg logit contributions
 their-0.006
 very-0.006
 smile-0.005
 like-0.005
 Amen-0.005
Token She
Token ID1375
Feature activation-2.99e-01

Pos logit contributions
 their+0.001
 very+0.001
 the+0.000
 smile+0.000
 Amen+0.000
Neg logit contributions
handed-0.002
 exhibit-0.002
asis-0.002
 exits-0.002
lor-0.002
Token thought
Token ID1807
Feature activation+7.56e-01

Pos logit contributions
 their+0.005
 very+0.005
 Amen+0.005
 smile+0.005
 the+0.004
Neg logit contributions
handed-0.011
 exhibit-0.011
unch-0.011
anger-0.011
lor-0.011
Token Ben
Token ID3932
Feature activation-1.20e-01

Pos logit contributions
handed+0.027
 exhibit+0.027
unch+0.026
anger+0.026
 exits+0.026
Neg logit contributions
 Amen-0.013
 their-0.012
 very-0.012
 smile-0.012
 like-0.010
Token stripes
Token ID28806
Feature activation+3.74e-01

Pos logit contributions
handed+0.184
 exhibit+0.169
lor+0.166
unch+0.162
 exits+0.160
Neg logit contributions
 their-0.135
 very-0.130
 smile-0.126
 like-0.123
 the-0.118
Token,
Token ID11
Feature activation+7.64e-01

Pos logit contributions
handed+0.014
 exhibit+0.014
unch+0.014
anger+0.014
orns+0.013
Neg logit contributions
 Amen-0.006
 their-0.005
 smile-0.005
 very-0.005
 low-0.004
Token and
Token ID290
Feature activation+2.13e-01

Pos logit contributions
 worms+0.026
oles+0.024
 footprints+0.024
orns+0.023
 exits+0.023
Neg logit contributions
 Amen-0.017
 smile-0.012
 forward-0.012
te-0.012
 their-0.011
Token some
Token ID617
Feature activation+6.90e-01

Pos logit contributions
 worms+0.007
 exits+0.006
 footprints+0.006
oles+0.006
 exhibit+0.006
Neg logit contributions
 Amen-0.005
 smile-0.004
te-0.003
 forward-0.003
 very-0.003
Token had
Token ID550
Feature activation+7.01e-01

Pos logit contributions
handed+0.022
 exhibit+0.022
unch+0.021
 exits+0.021
anger+0.021
Neg logit contributions
 Amen-0.015
 their-0.015
 very-0.014
 smile-0.014
 like-0.013
Token orange
Token ID10912
Feature activation+5.67e+00

Pos logit contributions
handed+0.023
 exhibit+0.023
unch+0.022
 exits+0.022
anger+0.022
Neg logit contributions
 their-0.015
 Amen-0.014
 very-0.014
 smile-0.014
 like-0.012
Token and
Token ID290
Feature activation+6.33e-01

Pos logit contributions
handed+0.169
 exhibit+0.155
unch+0.150
lor+0.150
 exits+0.147
Neg logit contributions
 their-0.150
 very-0.148
 smile-0.141
 Amen-0.138
 like-0.137
Token black
Token ID2042
Feature activation+5.49e+00

Pos logit contributions
 worms+0.021
oles+0.020
 exits+0.019
orns+0.019
 exhibit+0.019
Neg logit contributions
 Amen-0.014
 smile-0.011
 forward-0.010
 their-0.010
te-0.010
Token stripes
Token ID28806
Feature activation+7.23e-01

Pos logit contributions
handed+0.179
 exhibit+0.166
lor+0.160
unch+0.160
 exits+0.158
Neg logit contributions
 their-0.130
 very-0.127
 smile-0.122
 Amen-0.119
 like-0.118
Token.
Token ID13
Feature activation+2.75e-01

Pos logit contributions
handed+0.027
 exhibit+0.026
unch+0.026
anger+0.026
orns+0.025
Neg logit contributions
 Amen-0.013
 their-0.011
 smile-0.011
 very-0.011
 low-0.009
Token\n
Token ID198
Feature activation+8.88e-01

Pos logit contributions
handed+0.010
 exhibit+0.010
unch+0.010
 exits+0.009
anger+0.009
Neg logit contributions
 Amen-0.005
 smile-0.004
 very-0.004
 their-0.004
 smiled-0.004
Token look
Token ID804
Feature activation-4.33e-01

Pos logit contributions
handed+0.051
orns+0.048
 exhibit+0.048
unch+0.047
lor+0.047
Neg logit contributions
 very-0.020
 their-0.020
 smile-0.020
 Amen-0.019
 low-0.017
Token and
Token ID290
Feature activation-3.35e-01

Pos logit contributions
 Amen+0.007
 their+0.006
 smile+0.006
 very+0.006
 low+0.005
Neg logit contributions
handed-0.017
 exhibit-0.016
unch-0.016
 exits-0.016
anger-0.016
Token found
Token ID1043
Feature activation+1.66e+00

Pos logit contributions
 their+0.006
 very+0.006
 Amen+0.006
 smile+0.006
 the+0.005
Neg logit contributions
handed-0.012
unch-0.012
 exhibit-0.012
anger-0.012
lor-0.012
Token lots
Token ID6041
Feature activation+9.66e-01

Pos logit contributions
handed+0.081
unch+0.079
 exhibit+0.079
asis+0.077
lor+0.076
Neg logit contributions
 their-0.013
 very-0.013
 the-0.011
 like-0.008
 smile-0.007
Token of
Token ID286
Feature activation+2.37e+00

Pos logit contributions
handed+0.038
lor+0.038
 exhibit+0.037
anger+0.036
unch+0.035
Neg logit contributions
 very-0.015
 their-0.015
 smile-0.014
 like-0.014
 the-0.014
Token little
Token ID1310
Feature activation+5.67e+00

Pos logit contributions
handed+0.082
lor+0.080
 exhibit+0.077
anger+0.077
unch+0.076
Neg logit contributions
 very-0.052
 their-0.052
 smile-0.050
 like-0.047
 the-0.047
Token stitches
Token ID28096
Feature activation-1.17e+00

Pos logit contributions
lor+0.154
handed+0.154
anger+0.146
unch+0.143
 exhibit+0.139
Neg logit contributions
 very-0.166
 their-0.165
 smile-0.164
 like-0.158
 the-0.157
Token with
Token ID351
Feature activation+1.05e+00

Pos logit contributions
 Amen+0.019
 their+0.018
 very+0.017
 smile+0.017
 the+0.014
Neg logit contributions
handed-0.045
 exhibit-0.043
unch-0.042
anger-0.042
 exits-0.042
Token coloured
Token ID34746
Feature activation+3.65e+00

Pos logit contributions
handed+0.038
 exhibit+0.037
 exits+0.036
unch+0.036
anger+0.036
Neg logit contributions
 Amen-0.019
 their-0.017
 very-0.017
 smile-0.017
 like-0.014
Token thread
Token ID4704
Feature activation+1.09e+00

Pos logit contributions
handed+0.098
 exhibit+0.094
 exits+0.092
anger+0.091
unch+0.091
Neg logit contributions
 Amen-0.099
 their-0.095
 very-0.092
 smile-0.091
 low-0.084
Token.
Token ID13
Feature activation-2.95e-01

Pos logit contributions
handed+0.044
 exhibit+0.043
unch+0.042
 exits+0.042
lor+0.041
Neg logit contributions
 their-0.015
 very-0.014
 Amen-0.014
 smile-0.014
 like-0.012
Token lake
Token ID13546
Feature activation+1.44e+00

Pos logit contributions
 exhibit+0.012
handed+0.012
anger+0.011
 exits+0.011
unch+0.011
Neg logit contributions
 Amen-0.014
 their-0.013
 very-0.013
 smile-0.012
te-0.012
Token,
Token ID11
Feature activation-5.34e-01

Pos logit contributions
handed+0.056
 exhibit+0.053
unch+0.053
 exits+0.052
lor+0.052
Neg logit contributions
 their-0.023
 very-0.022
 Amen-0.021
 smile-0.021
 like-0.019
Token and
Token ID290
Feature activation-7.02e-01

Pos logit contributions
 Amen+0.012
 their+0.009
 smile+0.009
te+0.008
 forward+0.007
Neg logit contributions
 exhibit-0.017
 exits-0.017
 worms-0.017
 territory-0.016
 footprints-0.016
Token lots
Token ID6041
Feature activation+9.53e-01

Pos logit contributions
 Amen+0.013
 smile+0.011
te+0.010
 their+0.010
 very+0.009
Neg logit contributions
 exits-0.024
handed-0.024
 exhibit-0.023
 worms-0.023
 discovery-0.023
Token of
Token ID286
Feature activation+1.90e+00

Pos logit contributions
handed+0.037
lor+0.036
 exhibit+0.035
anger+0.034
 exits+0.034
Neg logit contributions
 their-0.016
 very-0.016
 smile-0.015
 like-0.014
 the-0.014
Token furry
Token ID40857
Feature activation+5.66e+00

Pos logit contributions
handed+0.055
 exhibit+0.052
unch+0.051
 exits+0.051
anger+0.050
Neg logit contributions
 Amen-0.048
 their-0.048
 very-0.046
 smile-0.046
 like-0.042
Token animals
Token ID4695
Feature activation+1.21e+00

Pos logit contributions
handed+0.200
lor+0.190
unch+0.171
anger+0.171
ilst+0.163
Neg logit contributions
 their-0.145
 very-0.142
 like-0.133
 smile-0.132
 blue-0.130
Token.
Token ID13
Feature activation-3.22e-01

Pos logit contributions
handed+0.054
lor+0.051
unch+0.051
anger+0.051
 exhibit+0.050
Neg logit contributions
 very-0.014
 their-0.014
 like-0.013
 smile-0.012
 the-0.011
Token Lucy
Token ID22162
Feature activation+7.34e-01

Pos logit contributions
 their+0.005
 very+0.005
 Amen+0.005
 smile+0.004
 the+0.004
Neg logit contributions
handed-0.013
 exhibit-0.013
orns-0.012
anger-0.012
lor-0.012
Token started
Token ID2067
Feature activation+1.53e+00

Pos logit contributions
handed+0.027
 exhibit+0.026
unch+0.025
 exits+0.025
anger+0.025
Neg logit contributions
 Amen-0.013
 their-0.012
 very-0.011
 smile-0.011
 low-0.010
Token running
Token ID2491
Feature activation+8.78e-01

Pos logit contributions
handed+0.055
 exhibit+0.055
unch+0.053
 exits+0.053
anger+0.051
Neg logit contributions
 Amen-0.028
 their-0.024
 very-0.024
 smile-0.024
 like-0.021
Token center
Token ID3641
Feature activation-1.13e+00

Pos logit contributions
handed+0.082
 exhibit+0.077
unch+0.075
 exits+0.074
anger+0.074
Neg logit contributions
 their-0.081
 Amen-0.081
 very-0.079
 smile-0.078
 like-0.073
Token with
Token ID351
Feature activation+7.37e-01

Pos logit contributions
 their+0.016
 Amen+0.015
 very+0.015
 smile+0.015
 like+0.013
Neg logit contributions
handed-0.046
 exhibit-0.044
unch-0.043
 exits-0.043
anger-0.043
Token lots
Token ID6041
Feature activation+7.45e-01

Pos logit contributions
handed+0.027
 exhibit+0.026
unch+0.025
 exits+0.025
anger+0.025
Neg logit contributions
 Amen-0.013
 their-0.013
 very-0.012
 smile-0.012
 like-0.010
Token of
Token ID286
Feature activation+1.68e+00

Pos logit contributions
handed+0.027
 exhibit+0.027
lor+0.026
anger+0.026
unch+0.026
Neg logit contributions
 very-0.013
 their-0.013
 smile-0.013
 Amen-0.012
 like-0.012
Token little
Token ID1310
Feature activation+4.52e+00

Pos logit contributions
handed+0.053
 exhibit+0.050
unch+0.049
anger+0.049
lor+0.048
Neg logit contributions
 their-0.039
 very-0.038
 Amen-0.038
 smile-0.037
 like-0.034
Token black
Token ID2042
Feature activation+5.65e+00

Pos logit contributions
handed+0.124
lor+0.116
 exhibit+0.115
unch+0.114
anger+0.113
Neg logit contributions
 very-0.127
 their-0.126
 smile-0.125
 the-0.118
 Amen-0.117
Token dots
Token ID22969
Feature activation+5.73e-01

Pos logit contributions
handed+0.185
 exhibit+0.164
lor+0.162
unch+0.157
asis+0.156
Neg logit contributions
 very-0.138
 smile-0.136
 their-0.134
 like-0.132
 the-0.129
Token.
Token ID13
Feature activation-2.83e-01

Pos logit contributions
handed+0.023
 exhibit+0.022
unch+0.022
 exits+0.021
anger+0.021
Neg logit contributions
 Amen-0.008
 their-0.008
 very-0.008
 smile-0.007
 like-0.006
Token The
Token ID383
Feature activation-7.60e-01

Pos logit contributions
 their+0.003
 the+0.003
 very+0.003
 The+0.003
 Amen+0.003
Neg logit contributions
handed-0.012
 exits-0.012
 exhibit-0.012
asis-0.012
anger-0.012
Token butterfly
Token ID35113
Feature activation+3.46e-01

Pos logit contributions
 very+0.020
 smile+0.019
 their+0.019
 the+0.018
 Amen+0.018
Neg logit contributions
handed-0.023
lor-0.022
unch-0.021
 exhibit-0.021
anger-0.021
Token was
Token ID373
Feature activation-7.89e-01

Pos logit contributions
handed+0.013
 exhibit+0.013
unch+0.013
anger+0.013
 exits+0.013
Neg logit contributions
 their-0.005
 very-0.005
 Amen-0.005
 smile-0.005
 the-0.004
Token big
Token ID1263
Feature activation+1.63e+00

Pos logit contributions
handed+0.067
 exhibit+0.065
unch+0.064
anger+0.063
 exits+0.063
Neg logit contributions
 their-0.015
 Amen-0.014
 very-0.014
 smile-0.013
 the-0.010
Token trees
Token ID7150
Feature activation+3.48e-01

Pos logit contributions
 exhibit+0.047
handed+0.045
 exits+0.044
 territory+0.043
anger+0.042
Neg logit contributions
 Amen-0.044
 their-0.039
 very-0.037
te-0.036
 smile-0.036
Token with
Token ID351
Feature activation+4.00e-01

Pos logit contributions
handed+0.013
 exhibit+0.013
unch+0.012
 exits+0.012
anger+0.012
Neg logit contributions
 Amen-0.006
 their-0.005
 very-0.005
 smile-0.005
te-0.004
Token lots
Token ID6041
Feature activation+1.36e+00

Pos logit contributions
handed+0.014
 exhibit+0.014
 exits+0.014
orns+0.013
unch+0.013
Neg logit contributions
 Amen-0.007
 their-0.006
 smile-0.006
 very-0.006
te-0.005
Token of
Token ID286
Feature activation+1.66e+00

Pos logit contributions
handed+0.051
lor+0.050
 exhibit+0.049
anger+0.048
unch+0.048
Neg logit contributions
 their-0.023
 very-0.023
 smile-0.022
 like-0.021
 the-0.020
Token green
Token ID4077
Feature activation+5.63e+00

Pos logit contributions
handed+0.047
 exhibit+0.045
unch+0.044
lor+0.044
anger+0.043
Neg logit contributions
 their-0.044
 very-0.043
 Amen-0.043
 smile-0.042
 like-0.039
Token leaves
Token ID5667
Feature activation+7.58e-01

Pos logit contributions
handed+0.160
lor+0.144
unch+0.140
 exhibit+0.138
anger+0.134
Neg logit contributions
 their-0.163
 very-0.158
 smile-0.155
 the-0.149
 like-0.148
Token.
Token ID13
Feature activation-1.11e+00

Pos logit contributions
handed+0.030
 exhibit+0.029
unch+0.029
 exits+0.028
anger+0.028
Neg logit contributions
 Amen-0.011
 their-0.011
 very-0.010
 smile-0.010
 like-0.008
Token He
Token ID679
Feature activation+1.59e-01

Pos logit contributions
 their+0.016
 very+0.015
 Amen+0.015
 the+0.014
 smile+0.014
Neg logit contributions
handed-0.045
orns-0.044
anger-0.044
lor-0.044
unch-0.044
Token also
Token ID635
Feature activation-5.33e-01

Pos logit contributions
handed+0.005
 exhibit+0.005
 exits+0.005
unch+0.005
anger+0.005
Neg logit contributions
 Amen-0.003
 their-0.003
 smile-0.003
 very-0.003
 like-0.002
Token saw
Token ID2497
Feature activation+1.76e+00

Pos logit contributions
 Amen+0.015
 their+0.012
 smile+0.011
te+0.011
 very+0.011
Neg logit contributions
handed-0.015
 exhibit-0.014
 exits-0.013
unch-0.013
oles-0.013
Token Some
Token ID2773
Feature activation+1.58e+00

Pos logit contributions
handed+0.004
 exhibit+0.004
unch+0.003
 exits+0.003
anger+0.003
Neg logit contributions
 Amen-0.001
 very-0.001
 smile-0.001
 their-0.001
 like-0.001
Token bees
Token ID17002
Feature activation+7.38e-02

Pos logit contributions
handed+0.047
lor+0.043
 exhibit+0.043
anger+0.042
unch+0.042
Neg logit contributions
 their-0.043
 very-0.042
 smile-0.040
 the-0.039
 like-0.038
Token had
Token ID550
Feature activation+9.86e-01

Pos logit contributions
handed+0.003
 exhibit+0.003
anger+0.003
lor+0.003
 exits+0.003
Neg logit contributions
 their-0.001
 very-0.001
 smile-0.001
 like-0.001
 the-0.001
Token yellow
Token ID7872
Feature activation+5.43e+00

Pos logit contributions
handed+0.032
 exhibit+0.031
unch+0.030
anger+0.030
lor+0.029
Neg logit contributions
 their-0.022
 very-0.022
 Amen-0.021
 smile-0.021
 like-0.019
Token and
Token ID290
Feature activation+1.23e+00

Pos logit contributions
handed+0.162
 exhibit+0.152
unch+0.148
lor+0.146
 exits+0.145
Neg logit contributions
 their-0.138
 very-0.133
 Amen-0.132
 smile-0.130
 like-0.123
Token black
Token ID2042
Feature activation+5.59e+00

Pos logit contributions
 worms+0.038
oles+0.036
 exits+0.035
 exhibit+0.034
orns+0.034
Neg logit contributions
 Amen-0.030
 smile-0.026
 forward-0.023
 their-0.023
te-0.023
Token stripes
Token ID28806
Feature activation+3.74e-01

Pos logit contributions
handed+0.184
 exhibit+0.169
lor+0.166
unch+0.162
 exits+0.160
Neg logit contributions
 their-0.135
 very-0.130
 smile-0.126
 like-0.123
 the-0.118
Token,
Token ID11
Feature activation+7.64e-01

Pos logit contributions
handed+0.014
 exhibit+0.014
unch+0.014
anger+0.014
orns+0.013
Neg logit contributions
 Amen-0.006
 their-0.005
 smile-0.005
 very-0.005
 low-0.004
Token and
Token ID290
Feature activation+2.13e-01

Pos logit contributions
 worms+0.026
oles+0.024
 footprints+0.024
orns+0.023
 exits+0.023
Neg logit contributions
 Amen-0.017
 smile-0.012
 forward-0.012
te-0.012
 their-0.011
Token some
Token ID617
Feature activation+6.90e-01

Pos logit contributions
 worms+0.007
 exits+0.006
 footprints+0.006
oles+0.006
 exhibit+0.006
Neg logit contributions
 Amen-0.005
 smile-0.004
te-0.003
 forward-0.003
 very-0.003
Token had
Token ID550
Feature activation+7.01e-01

Pos logit contributions
handed+0.022
 exhibit+0.022
unch+0.021
 exits+0.021
anger+0.021
Neg logit contributions
 Amen-0.015
 their-0.015
 very-0.014
 smile-0.014
 like-0.013
Token full
Token ID1336
Feature activation+2.37e-01

Pos logit contributions
 Amen+0.007
 smile+0.005
 their+0.004
te+0.004
bell+0.004
Neg logit contributions
 exits-0.015
 worms-0.015
unch-0.015
handed-0.015
 dam-0.015
Token of
Token ID286
Feature activation+2.42e+00

Pos logit contributions
handed+0.009
 exhibit+0.008
lor+0.008
anger+0.008
 exits+0.008
Neg logit contributions
 their-0.004
 very-0.004
 smile-0.004
 Amen-0.004
 like-0.004
Token tall
Token ID7331
Feature activation+3.50e+00

Pos logit contributions
handed+0.073
lor+0.071
 exhibit+0.069
anger+0.068
unch+0.066
Neg logit contributions
 their-0.065
 very-0.065
 smile-0.061
 the-0.059
 like-0.059
Token trees
Token ID7150
Feature activation+6.09e-01

Pos logit contributions
handed+0.092
lor+0.084
unch+0.083
 exhibit+0.083
anger+0.082
Neg logit contributions
 their-0.103
 very-0.101
 smile-0.098
 Amen-0.096
 the-0.094
Token and
Token ID290
Feature activation+9.48e-01

Pos logit contributions
handed+0.025
 exhibit+0.024
unch+0.023
 exits+0.023
anger+0.023
Neg logit contributions
 Amen-0.008
 their-0.008
 very-0.008
 smile-0.008
 like-0.006
Token green
Token ID4077
Feature activation+5.58e+00

Pos logit contributions
 worms+0.027
 exits+0.027
 exhibit+0.026
handed+0.026
 footprints+0.026
Neg logit contributions
 Amen-0.024
te-0.021
 smile-0.020
 their-0.020
 forward-0.019
Token plants
Token ID6134
Feature activation+1.50e+00

Pos logit contributions
handed+0.166
lor+0.151
 exhibit+0.146
unch+0.146
anger+0.143
Neg logit contributions
 their-0.155
 very-0.153
 like-0.144
 smile-0.143
 the-0.142
Token.
Token ID13
Feature activation+1.12e-01

Pos logit contributions
handed+0.066
 exhibit+0.063
lor+0.062
anger+0.062
unch+0.061
Neg logit contributions
 very-0.018
 their-0.018
 smile-0.016
 like-0.015
 the-0.014
Token He
Token ID679
Feature activation+1.17e-01

Pos logit contributions
handed+0.005
 exhibit+0.005
lor+0.004
unch+0.004
 exits+0.004
Neg logit contributions
 their-0.002
 very-0.001
 the-0.001
 smile-0.001
 Amen-0.001
Token put
Token ID1234
Feature activation+5.15e-03

Pos logit contributions
handed+0.005
anger+0.004
 exhibit+0.004
unch+0.004
lor+0.004
Neg logit contributions
 their-0.002
 very-0.002
 Amen-0.002
 smile-0.002
 smiled-0.002
Token on
Token ID319
Feature activation-8.66e-01

Pos logit contributions
orns+0.000
 exhibit+0.000
handed+0.000
 exits+0.000
 worms+0.000
Neg logit contributions
 Amen-0.000
 smile-0.000
 their-0.000
 very-0.000
te-0.000
Token he
Token ID339
Feature activation+5.33e-01

Pos logit contributions
 exhibit+0.012
handed+0.012
 exits+0.012
anger+0.011
lor+0.011
Neg logit contributions
 Amen-0.005
 smile-0.004
 their-0.004
 very-0.004
te-0.004
Token met
Token ID1138
Feature activation+2.03e+00

Pos logit contributions
handed+0.019
 exhibit+0.018
unch+0.018
 exits+0.018
anger+0.018
Neg logit contributions
 Amen-0.010
 their-0.010
 very-0.009
 smile-0.009
 like-0.008
Token lots
Token ID6041
Feature activation+2.38e-01

Pos logit contributions
handed+0.093
 exhibit+0.090
unch+0.090
anger+0.089
lor+0.088
Neg logit contributions
 their-0.018
 very-0.017
 Amen-0.016
 smile-0.015
 the-0.013
Token of
Token ID286
Feature activation+2.17e+00

Pos logit contributions
handed+0.008
 exhibit+0.008
unch+0.008
lor+0.008
anger+0.008
Neg logit contributions
 their-0.005
 very-0.004
 Amen-0.004
 smile-0.004
 the-0.004
Token other
Token ID584
Feature activation+2.58e+00

Pos logit contributions
lor+0.068
handed+0.067
anger+0.065
unch+0.063
 exhibit+0.061
Neg logit contributions
 very-0.058
 their-0.056
 smile-0.053
 the-0.053
 like-0.052
Token furry
Token ID40857
Feature activation+5.56e+00

Pos logit contributions
lor+0.081
handed+0.079
anger+0.074
unch+0.074
utch+0.071
Neg logit contributions
 very-0.072
 their-0.072
 the-0.070
 like-0.069
 smile-0.066
Token friends
Token ID2460
Feature activation+1.82e-01

Pos logit contributions
lor+0.192
handed+0.191
anger+0.174
unch+0.167
asis+0.163
Neg logit contributions
 their-0.142
 very-0.141
 the-0.134
 like-0.133
 smile-0.131
Token and
Token ID290
Feature activation+4.57e-01

Pos logit contributions
handed+0.007
 exhibit+0.007
unch+0.007
anger+0.007
 exits+0.007
Neg logit contributions
 their-0.003
 Amen-0.003
 very-0.003
 smile-0.002
 like-0.002
Token finally
Token ID3443
Feature activation+4.25e-01

Pos logit contributions
handed+0.016
 exhibit+0.015
 exits+0.015
unch+0.014
 discovery+0.014
Neg logit contributions
 Amen-0.009
 their-0.008
 smile-0.008
 very-0.008
 low-0.007
Token he
Token ID339
Feature activation+9.07e-01

Pos logit contributions
handed+0.014
 exhibit+0.013
unch+0.013
 exits+0.013
anger+0.013
Neg logit contributions
 Amen-0.009
 their-0.009
 very-0.009
 smile-0.009
 like-0.008
Token felt
Token ID2936
Feature activation-8.00e-01

Pos logit contributions
handed+0.033
 exhibit+0.032
unch+0.031
 exits+0.031
anger+0.030
Neg logit contributions
 Amen-0.016
 their-0.016
 very-0.016
 smile-0.015
 like-0.014
Token frog
Token ID21264
Feature activation+2.10e-02

Pos logit contributions
 exhibit+0.018
handed+0.018
unch+0.017
anger+0.017
 exits+0.017
Neg logit contributions
 Amen-0.022
 their-0.021
 very-0.020
 smile-0.020
te-0.019
Token with
Token ID351
Feature activation+1.07e+00

Pos logit contributions
handed+0.001
 exits+0.001
anger+0.001
lor+0.001
 endings+0.001
Neg logit contributions
 very-0.000
 their-0.000
 like-0.000
 the-0.000
 smile-0.000
Token w
Token ID266
Feature activation+2.26e+00

Pos logit contributions
handed+0.043
 exhibit+0.042
unch+0.041
 exits+0.040
anger+0.040
Neg logit contributions
 Amen-0.015
 their-0.015
 very-0.014
 smile-0.014
 like-0.012
Tokenarts
Token ID5889
Feature activation-7.02e-01

Pos logit contributions
handed+0.063
 exhibit+0.060
lor+0.057
 doorstep+0.054
 exits+0.052
Neg logit contributions
 very-0.063
 smile-0.061
 like-0.060
 their-0.060
 too-0.052
Token and
Token ID290
Feature activation+1.80e+00

Pos logit contributions
 their+0.008
 very+0.008
 Amen+0.008
 smile+0.007
 like+0.007
Neg logit contributions
handed-0.030
 exhibit-0.029
 exits-0.029
anger-0.028
lor-0.028
Token red
Token ID2266
Feature activation+5.55e+00

Pos logit contributions
handed+0.059
 exhibit+0.057
 exits+0.055
orns+0.055
 worms+0.054
Neg logit contributions
 Amen-0.039
 their-0.034
 smile-0.034
 very-0.033
te-0.031
Token eyes
Token ID2951
Feature activation-9.02e-01

Pos logit contributions
handed+0.170
 exhibit+0.158
lor+0.155
unch+0.154
 exits+0.152
Neg logit contributions
 their-0.140
 very-0.134
 smile-0.133
 Amen-0.131
 like-0.124
Token.
Token ID13
Feature activation+4.55e-03

Pos logit contributions
 Amen+0.016
 their+0.014
 smile+0.014
 very+0.013
 the+0.011
Neg logit contributions
handed-0.033
 exhibit-0.032
unch-0.032
anger-0.031
 exits-0.031
Token It
Token ID632
Feature activation+5.40e-01

Pos logit contributions
handed+0.000
unch+0.000
 exhibit+0.000
anger+0.000
lor+0.000
Neg logit contributions
 Amen-0.000
 smile-0.000
 very-0.000
 their-0.000
te-0.000
Token was
Token ID373
Feature activation-3.13e-01

Pos logit contributions
 exhibit+0.016
handed+0.016
 worms+0.016
oles+0.016
 footprints+0.016
Neg logit contributions
 Amen-0.012
 their-0.010
 smile-0.010
 forward-0.010
te-0.009
Token sitting
Token ID5586
Feature activation+3.38e-01

Pos logit contributions
 Amen+0.005
 smile+0.004
 their+0.004
te+0.004
 very+0.004
Neg logit contributions
handed-0.011
 exhibit-0.011
 exits-0.011
unch-0.011
 territory-0.011
Token and
Token ID290
Feature activation+2.13e-01

Pos logit contributions
 worms+0.026
oles+0.024
 footprints+0.024
orns+0.023
 exits+0.023
Neg logit contributions
 Amen-0.017
 smile-0.012
 forward-0.012
te-0.012
 their-0.011
Token some
Token ID617
Feature activation+6.90e-01

Pos logit contributions
 worms+0.007
 exits+0.006
 footprints+0.006
oles+0.006
 exhibit+0.006
Neg logit contributions
 Amen-0.005
 smile-0.004
te-0.003
 forward-0.003
 very-0.003
Token had
Token ID550
Feature activation+7.01e-01

Pos logit contributions
handed+0.022
 exhibit+0.022
unch+0.021
 exits+0.021
anger+0.021
Neg logit contributions
 Amen-0.015
 their-0.015
 very-0.014
 smile-0.014
 like-0.013
Token orange
Token ID10912
Feature activation+5.67e+00

Pos logit contributions
handed+0.023
 exhibit+0.023
unch+0.022
 exits+0.022
anger+0.022
Neg logit contributions
 their-0.015
 Amen-0.014
 very-0.014
 smile-0.014
 like-0.012
Token and
Token ID290
Feature activation+6.33e-01

Pos logit contributions
handed+0.169
 exhibit+0.155
unch+0.150
lor+0.150
 exits+0.147
Neg logit contributions
 their-0.150
 very-0.148
 smile-0.141
 Amen-0.138
 like-0.137
Token black
Token ID2042
Feature activation+5.49e+00

Pos logit contributions
 worms+0.021
oles+0.020
 exits+0.019
orns+0.019
 exhibit+0.019
Neg logit contributions
 Amen-0.014
 smile-0.011
 forward-0.010
 their-0.010
te-0.010
Token stripes
Token ID28806
Feature activation+7.23e-01

Pos logit contributions
handed+0.179
 exhibit+0.166
lor+0.160
unch+0.160
 exits+0.158
Neg logit contributions
 their-0.130
 very-0.127
 smile-0.122
 Amen-0.119
 like-0.118
Token.
Token ID13
Feature activation+2.75e-01

Pos logit contributions
handed+0.027
 exhibit+0.026
unch+0.026
anger+0.026
orns+0.025
Neg logit contributions
 Amen-0.013
 their-0.011
 smile-0.011
 very-0.011
 low-0.009
Token
Token ID198
Feature activation+8.88e-01

Pos logit contributions
handed+0.010
 exhibit+0.010
unch+0.010
 exits+0.009
anger+0.009
Neg logit contributions
 Amen-0.005
 smile-0.004
 very-0.004
 their-0.004
 smiled-0.004
Token
Token ID198
Feature activation+7.68e-01

Pos logit contributions
handed+0.038
anger+0.036
 exhibit+0.036
 exits+0.036
asis+0.035
Neg logit contributions
 their-0.014
 very-0.012
 Amen-0.011
 the-0.011
 smile-0.010
Token"
Token ID1
Feature activation-5.40e-01

Pos logit contributions
handed+0.025
 exhibit+0.024
unch+0.023
 exits+0.023
anger+0.023
Neg logit contributions
 Amen-0.017
 their-0.017
 very-0.016
 smile-0.016
 like-0.014
Token and
Token ID290
Feature activation-2.77e-01

Pos logit contributions
 Amen+0.011
 their+0.010
 smile+0.009
 very+0.009
 like+0.008
Neg logit contributions
handed-0.022
 exhibit-0.022
anger-0.021
unch-0.021
 exits-0.021
Token make
Token ID787
Feature activation-2.71e+00

Pos logit contributions
 Amen+0.006
te+0.005
 their+0.005
 very+0.005
 low+0.005
Neg logit contributions
handed-0.009
 discovery-0.008
 exits-0.008
 discoveries-0.008
 exhibit-0.008
Token it
Token ID340
Feature activation-1.57e+00

Pos logit contributions
 their+0.048
 very+0.045
 Amen+0.044
 smile+0.042
 the+0.040
Neg logit contributions
handed-0.102
 exhibit-0.099
unch-0.096
anger-0.096
 exits-0.096
Token look
Token ID804
Feature activation-2.22e+00

Pos logit contributions
 Amen+0.038
te+0.030
hy+0.029
ifying+0.028
 seven+0.028
Neg logit contributions
 discovery-0.050
handed-0.049
ade-0.047
uddle-0.046
 worms-0.045
Token new
Token ID649
Feature activation-2.22e+00

Pos logit contributions
 Amen+0.062
hy+0.047
te+0.046
 whine+0.045
 smile+0.044
Neg logit contributions
 discovery-0.057
handed-0.057
 exits-0.056
uddle-0.055
 worms-0.053
Token again
Token ID757
Feature activation-4.66e+00

Pos logit contributions
 Amen+0.056
hy+0.050
 their+0.049
 low+0.048
 smile+0.047
Neg logit contributions
handed-0.063
 exhibit-0.059
 discovery-0.058
 exits-0.058
anger-0.057
Token?"
Token ID1701
Feature activation-1.04e+00

Pos logit contributions
 Amen+0.103
 smile+0.093
 very+0.090
hy+0.089
 their+0.089
Neg logit contributions
handed-0.143
 exhibit-0.140
 exits-0.135
 discovery-0.134
unch-0.134
Token
Token ID220
Feature activation-1.40e+00

Pos logit contributions
 Amen+0.021
 their+0.020
 very+0.020
 smile+0.019
 like+0.017
Neg logit contributions
handed-0.036
 exhibit-0.035
unch-0.034
 exits-0.033
anger-0.033
Token
Token ID198
Feature activation-4.04e-01

Pos logit contributions
 their+0.017
 very+0.015
 the+0.014
 Amen+0.012
 smile+0.012
Neg logit contributions
 exhibit-0.061
 exits-0.061
handed-0.061
asis-0.060
 ounces-0.059
Token
Token ID198
Feature activation-3.55e-01

Pos logit contributions
 their+0.007
 very+0.007
 Amen+0.006
 smile+0.006
 the+0.006
Neg logit contributions
handed-0.015
 exhibit-0.015
 exits-0.015
anger-0.015
lor-0.014
TokenTim
Token ID14967
Feature activation+1.05e-01

Pos logit contributions
 their+0.010
 Amen+0.010
 very+0.010
 smile+0.010
 like+0.009
Neg logit contributions
handed-0.009
 exhibit-0.009
unch-0.008
anger-0.008
 exits-0.008
Token do
Token ID466
Feature activation-3.28e+00

Pos logit contributions
handed+0.007
 exhibit+0.007
unch+0.007
 exits+0.007
anger+0.006
Neg logit contributions
 Amen-0.004
 their-0.004
 very-0.004
 smile-0.004
 like-0.003
Token to
Token ID284
Feature activation-1.34e-01

Pos logit contributions
 Amen+0.056
 their+0.054
 smile+0.052
 very+0.050
 low+0.044
Neg logit contributions
handed-0.121
 exhibit-0.118
 exits-0.115
unch-0.114
anger-0.114
Token be
Token ID307
Feature activation-1.44e+00

Pos logit contributions
 Amen+0.003
 their+0.003
 very+0.003
 smile+0.003
 low+0.002
Neg logit contributions
handed-0.004
 exhibit-0.004
unch-0.004
 exits-0.004
anger-0.004
Token a
Token ID257
Feature activation-2.15e-01

Pos logit contributions
 Amen+0.028
 their+0.025
 smile+0.023
te+0.021
 very+0.021
Neg logit contributions
handed-0.050
 exits-0.047
 exhibit-0.047
 discovery-0.045
 territory-0.045
Token good
Token ID922
Feature activation-1.03e+00

Pos logit contributions
 very+0.005
 Amen+0.005
 their+0.005
 smile+0.005
 like+0.004
Neg logit contributions
handed-0.007
 exhibit-0.006
unch-0.006
orns-0.006
lor-0.006
Token doctor
Token ID6253
Feature activation-4.64e+00

Pos logit contributions
 Amen+0.027
 their+0.027
 smile+0.027
 very+0.027
 like+0.024
Neg logit contributions
handed-0.029
 exhibit-0.028
unch-0.027
lor-0.027
 exits-0.027
Token?"
Token ID1701
Feature activation-6.37e-01

Pos logit contributions
 their+0.095
 Amen+0.091
 smile+0.088
 low+0.082
 very+0.082
Neg logit contributions
 exhibit-0.149
handed-0.148
 exits-0.136
lor-0.136
asis-0.136
Token
Token ID198
Feature activation-2.65e-01

Pos logit contributions
 Amen+0.007
 their+0.007
 very+0.006
 smile+0.006
 like+0.005
Neg logit contributions
handed-0.027
 exhibit-0.027
unch-0.025
 exits-0.025
anger-0.025
Token
Token ID198
Feature activation-2.76e-01

Pos logit contributions
 Amen+0.005
 their+0.005
 very+0.005
 smile+0.005
 like+0.004
Neg logit contributions
handed-0.009
 exhibit-0.009
unch-0.009
 exits-0.009
anger-0.009
TokenThe
Token ID464
Feature activation-2.02e+00

Pos logit contributions
 their+0.005
 Amen+0.005
 smile+0.004
 very+0.004
 like+0.004
Neg logit contributions
handed-0.011
 exhibit-0.010
unch-0.010
lor-0.010
 exits-0.009
Token old
Token ID1468
Feature activation+4.08e-01

Pos logit contributions
 Amen+0.051
 their+0.050
 smile+0.046
te+0.045
 like+0.045
Neg logit contributions
handed-0.061
 exhibit-0.053
 exits-0.051
 discovery-0.051
unch-0.049
Token help
Token ID1037
Feature activation-1.78e+00

Pos logit contributions
handed+0.012
 exhibit+0.012
unch+0.011
 exits+0.011
anger+0.011
Neg logit contributions
 Amen-0.008
 their-0.008
 very-0.008
 smile-0.008
 the-0.007
Token me
Token ID502
Feature activation+5.20e-01

Pos logit contributions
 their+0.045
 the+0.044
 very+0.044
 smile+0.040
 like+0.039
Neg logit contributions
handed-0.056
lor-0.054
unch-0.054
anger-0.054
 exhibit-0.053
Token to
Token ID284
Feature activation+4.49e-01

Pos logit contributions
handed+0.019
 exhibit+0.018
unch+0.018
anger+0.018
 exits+0.018
Neg logit contributions
 their-0.009
 very-0.009
 Amen-0.009
 smile-0.009
 the-0.008
Token build
Token ID1382
Feature activation-2.10e+00

Pos logit contributions
handed+0.016
 exhibit+0.016
unch+0.016
anger+0.016
lor+0.016
Neg logit contributions
 their-0.008
 very-0.008
 smile-0.008
 Amen-0.008
 the-0.007
Token a
Token ID257
Feature activation-6.94e-01

Pos logit contributions
 Amen+0.024
 their+0.021
 smile+0.020
 very+0.019
 like+0.015
Neg logit contributions
handed-0.091
 exhibit-0.089
 exits-0.086
anger-0.085
unch-0.085
Token books
Token ID3835
Feature activation-4.63e+00

Pos logit contributions
 very+0.018
 smile+0.016
 Amen+0.016
 their+0.016
 low+0.015
Neg logit contributions
handed-0.022
unch-0.021
orns-0.021
lor-0.021
oles-0.020
Tokenhe
Token ID258
Feature activation-1.27e+00

Pos logit contributions
 Amen+0.125
 their+0.121
 smile+0.113
 smiled+0.110
 curly+0.106
Neg logit contributions
handed-0.121
 exhibit-0.118
unch-0.112
anger-0.110
asis-0.107
Tokenlf
Token ID1652
Feature activation-2.15e+00

Pos logit contributions
 Amen+0.046
 their+0.045
 very+0.044
 smile+0.044
 like+0.042
Neg logit contributions
handed-0.024
 exhibit-0.022
unch-0.021
 exits-0.020
anger-0.020
Token?"
Token ID1701
Feature activation+1.44e-01

Pos logit contributions
 Amen+0.048
 their+0.044
 smile+0.043
 smiled+0.039
 very+0.038
Neg logit contributions
handed-0.069
 exhibit-0.068
 exits-0.062
asis-0.062
unch-0.061
Token he
Token ID339
Feature activation-1.19e+00

Pos logit contributions
handed+0.005
 exhibit+0.005
unch+0.005
 exits+0.005
anger+0.005
Neg logit contributions
 Amen-0.002
 their-0.002
 very-0.002
 smile-0.002
 like-0.002
Token asked
Token ID1965
Feature activation+3.51e-01

Pos logit contributions
 Amen+0.027
 their+0.026
 smile+0.025
 very+0.025
 like+0.023
Neg logit contributions
handed-0.036
 exhibit-0.035
 exits-0.034
unch-0.033
anger-0.033
Token!
Token ID0
Feature activation+1.84e-01

Pos logit contributions
 Amen+0.007
 smile+0.007
 their+0.007
 whine+0.006
bell+0.006
Neg logit contributions
lor-0.009
 exits-0.009
 exhibit-0.009
anger-0.008
 endings-0.008
Token Can
Token ID1680
Feature activation-3.15e-01

Pos logit contributions
unch+0.005
 exhibit+0.005
lor+0.005
anger+0.005
 exits+0.005
Neg logit contributions
 their-0.004
 smile-0.004
 very-0.004
 low-0.004
 smiled-0.004
Token I
Token ID314
Feature activation-5.70e-03

Pos logit contributions
 their+0.009
 smile+0.008
 very+0.008
 low+0.008
 Amen+0.008
Neg logit contributions
 exhibit-0.007
unch-0.007
lor-0.007
handed-0.007
anger-0.007
Token please
Token ID3387
Feature activation-1.46e+00

Pos logit contributions
 Amen+0.000
 their+0.000
nam+0.000
 low+0.000
hy+0.000
Neg logit contributions
 exhibit-0.000
unch-0.000
 discovery-0.000
anger-0.000
 exits-0.000
Token take
Token ID1011
Feature activation-1.58e+00

Pos logit contributions
 Amen+0.032
 their+0.031
 very+0.027
 low+0.027
nam+0.025
Neg logit contributions
 exhibit-0.042
unch-0.042
handed-0.041
 discovery-0.041
 exits-0.040
Token lessons
Token ID11658
Feature activation-4.60e+00

Pos logit contributions
 Amen+0.022
 their+0.020
 smile+0.019
 very+0.019
 low+0.016
Neg logit contributions
handed-0.063
 exhibit-0.061
 exits-0.060
anger-0.059
orns-0.059
Token?"
Token ID1701
Feature activation-2.76e-01

Pos logit contributions
 Amen+0.092
 their+0.083
 smile+0.081
 very+0.071
 low+0.070
Neg logit contributions
 exhibit-0.154
handed-0.153
anger-0.148
 exits-0.148
lor-0.147
Token
Token ID198
Feature activation-2.10e-03

Pos logit contributions
 their+0.004
 Amen+0.004
 very+0.004
 smile+0.004
 the+0.003
Neg logit contributions
handed-0.011
 exits-0.011
 exhibit-0.011
anger-0.011
lor-0.011
Token
Token ID198
Feature activation+5.12e-02

Pos logit contributions
 their+0.000
 very+0.000
 Amen+0.000
 smile+0.000
 the+0.000
Neg logit contributions
handed-0.000
 exhibit-0.000
 exits-0.000
anger-0.000
orns-0.000
TokenM
Token ID44
Feature activation+8.93e-02

Pos logit contributions
handed+0.002
 exhibit+0.002
anger+0.002
orns+0.002
 exits+0.002
Neg logit contributions
 their-0.001
 Amen-0.001
 very-0.001
 smile-0.001
 the-0.001
Tokenum
Token ID388
Feature activation-6.14e-01

Pos logit contributions
handed+0.003
 exhibit+0.003
orns+0.003
 exits+0.003
 territory+0.002
Neg logit contributions
 Amen-0.002
 very-0.002
 smile-0.002
 their-0.002
te-0.002
Token would
Token ID561
Feature activation-1.45e+00

Pos logit contributions
 exhibit+0.002
orns+0.002
asis+0.002
 exits+0.002
lor+0.002
Neg logit contributions
 Amen-0.001
 their-0.001
 smile-0.001
hy-0.001
 very-0.001
Token fly
Token ID6129
Feature activation-6.83e-01

Pos logit contributions
 Amen+0.028
 their+0.027
 very+0.025
te+0.022
 seven+0.022
Neg logit contributions
 exhibit-0.046
handed-0.046
unch-0.045
lor-0.044
 exits-0.044
Token up
Token ID510
Feature activation-1.67e+00

Pos logit contributions
 Amen+0.018
ifying+0.013
nam+0.012
hy+0.012
 whine+0.012
Neg logit contributions
 exhibit-0.019
orns-0.019
 territory-0.018
rows-0.018
 discovery-0.018
Token to
Token ID284
Feature activation-7.85e-01

Pos logit contributions
 Amen+0.037
 smile+0.027
 their+0.025
hy+0.024
te+0.023
Neg logit contributions
 exhibit-0.055
orns-0.052
handed-0.052
 exits-0.050
 discovery-0.050
Token the
Token ID262
Feature activation-3.80e-02

Pos logit contributions
 Amen+0.005
 their+0.004
 very+0.004
 smile+0.004
 like+0.002
Neg logit contributions
handed-0.037
 exhibit-0.037
 exits-0.035
unch-0.035
orns-0.035
Token tre
Token ID2054
Feature activation-4.51e+00

Pos logit contributions
 Amen+0.001
 their+0.001
 very+0.001
 smile+0.001
 like+0.001
Neg logit contributions
handed-0.001
 exhibit-0.001
unch-0.001
 exits-0.001
anger-0.001
Tokenet
Token ID316
Feature activation+4.95e-01

Pos logit contributions
 their+0.119
 Amen+0.118
 smile+0.116
 very+0.115
 like+0.105
Neg logit contributions
handed-0.125
 exhibit-0.118
 exits-0.114
orns-0.113
unch-0.112
Tokenop
Token ID404
Feature activation+3.98e-01

Pos logit contributions
handed+0.019
 exhibit+0.019
unch+0.018
anger+0.018
lor+0.018
Neg logit contributions
 Amen-0.008
 their-0.008
 very-0.007
 smile-0.007
 like-0.006
Token and
Token ID290
Feature activation+3.90e-01

Pos logit contributions
handed+0.015
 exhibit+0.015
anger+0.014
unch+0.014
orns+0.014
Neg logit contributions
 Amen-0.007
 their-0.006
 smile-0.005
 very-0.005
 like-0.005
Token give
Token ID1577
Feature activation-2.93e-03

Pos logit contributions
handed+0.013
 exhibit+0.013
 exits+0.013
unch+0.012
 territory+0.012
Neg logit contributions
 Amen-0.007
 their-0.007
 very-0.006
 smile-0.006
te-0.006
Token the
Token ID262
Feature activation-8.00e-02

Pos logit contributions
 their+0.000
 Amen+0.000
 very+0.000
 smile+0.000
 the+0.000
Neg logit contributions
handed-0.000
 exhibit-0.000
unch-0.000
anger-0.000
 exits-0.000
Token you
Token ID345
Feature activation-1.37e+00

Pos logit contributions
 Amen+0.013
 their+0.013
 smile+0.013
 very+0.013
 low+0.012
Neg logit contributions
handed-0.011
 exhibit-0.010
unch-0.010
 exits-0.010
lor-0.010
Token like
Token ID588
Feature activation-1.46e+00

Pos logit contributions
 Amen+0.023
 their+0.023
spe+0.020
lights+0.019
nam+0.019
Neg logit contributions
 exhibit-0.041
 discovery-0.041
unch-0.041
 exits-0.040
 hunt-0.039
Token to
Token ID284
Feature activation-1.91e-01

Pos logit contributions
 Amen+0.026
 their+0.024
 smile+0.023
 very+0.022
 low+0.021
Neg logit contributions
 exhibit-0.050
handed-0.050
 exits-0.049
unch-0.048
lor-0.048
Token take
Token ID1011
Feature activation-1.54e+00

Pos logit contributions
 Amen+0.004
 their+0.004
nam+0.004
 very+0.004
 low+0.004
Neg logit contributions
 exhibit-0.005
unch-0.005
handed-0.005
 discovery-0.005
 exits-0.005
Token singing
Token ID13777
Feature activation-1.39e+00

Pos logit contributions
 Amen+0.019
 their+0.019
 very+0.018
 smile+0.017
 like+0.014
Neg logit contributions
handed-0.064
 exhibit-0.062
unch-0.060
 exits-0.060
anger-0.060
Token lessons
Token ID11658
Feature activation-4.51e+00

Pos logit contributions
 Amen+0.026
hy+0.023
 smile+0.023
 their+0.022
 had+0.021
Neg logit contributions
 exhibit-0.042
 exits-0.042
anger-0.041
lor-0.041
 discovery-0.041
Token?"
Token ID1701
Feature activation-7.98e-01

Pos logit contributions
 Amen+0.094
 their+0.083
 smile+0.083
 very+0.072
 low+0.069
Neg logit contributions
 exhibit-0.147
handed-0.143
 exits-0.143
anger-0.142
lor-0.139
Token
Token ID198
Feature activation-3.29e-01

Pos logit contributions
 their+0.013
 Amen+0.013
 very+0.012
 smile+0.012
 the+0.011
Neg logit contributions
handed-0.031
 exhibit-0.030
unch-0.029
anger-0.029
 exits-0.029
Token
Token ID198
Feature activation-3.05e-01

Pos logit contributions
 their+0.006
 Amen+0.006
 very+0.006
 smile+0.005
 the+0.005
Neg logit contributions
handed-0.012
 exhibit-0.012
 exits-0.012
anger-0.012
unch-0.011
TokenLu
Token ID25596
Feature activation-5.18e-01

Pos logit contributions
 Amen+0.008
 their+0.008
 very+0.008
 smile+0.008
 like+0.007
Neg logit contributions
handed-0.008
 exhibit-0.008
unch-0.007
 exits-0.007
lor-0.007
Tokency
Token ID948
Feature activation+3.89e-01

Pos logit contributions
 Amen+0.009
 their+0.009
 very+0.009
 smile+0.009
 like+0.008
Neg logit contributions
handed-0.019
 exhibit-0.018
orns-0.017
lor-0.017
unch-0.017
TokenHow
Token ID2437
Feature activation-1.07e+00

Pos logit contributions
 Amen+0.011
 their+0.010
 very+0.010
 like+0.009
 smile+0.009
Neg logit contributions
handed-0.031
anger-0.031
 exhibit-0.031
unch-0.030
oles-0.030
Token will
Token ID481
Feature activation-2.68e+00

Pos logit contributions
 Amen+0.026
 their+0.022
 smile+0.021
 very+0.019
 low+0.018
Neg logit contributions
 exhibit-0.033
unch-0.033
anger-0.032
 exits-0.031
handed-0.031
Token we
Token ID356
Feature activation-3.60e-01

Pos logit contributions
 Amen+0.059
 very+0.055
 smile+0.055
 their+0.054
 low+0.051
Neg logit contributions
handed-0.082
 exhibit-0.080
 exits-0.077
unch-0.076
 discovery-0.076
Token make
Token ID787
Feature activation-2.48e+00

Pos logit contributions
 Amen+0.007
 their+0.007
 very+0.006
 smile+0.006
 smiled+0.005
Neg logit contributions
handed-0.012
 exhibit-0.012
unch-0.012
 exits-0.012
anger-0.012
Token it
Token ID340
Feature activation-1.84e+00

Pos logit contributions
 their+0.029
 very+0.026
 Amen+0.025
 the+0.023
 smile+0.022
Neg logit contributions
handed-0.110
 exhibit-0.106
unch-0.105
anger-0.104
 exits-0.104
Token heal
Token ID12035
Feature activation-4.48e+00

Pos logit contributions
 Amen+0.039
 their+0.031
hy+0.030
te+0.029
 smile+0.029
Neg logit contributions
handed-0.059
 exhibit-0.059
 discovery-0.058
 exits-0.057
asis-0.056
Token to
Token ID284
Feature activation-1.13e+00

Pos logit contributions
hy+0.102
 Amen+0.099
 smile+0.094
 curly+0.091
bell+0.087
Neg logit contributions
handed-0.122
 exhibit-0.119
 discovery-0.115
 territory-0.115
asis-0.114
Token our
Token ID674
Feature activation+3.09e-01

Pos logit contributions
 Amen+0.020
 their+0.019
 very+0.018
 smile+0.018
te+0.016
Neg logit contributions
handed-0.040
 exhibit-0.038
 exits-0.037
unch-0.037
 discovery-0.036
Token wall
Token ID3355
Feature activation-1.66e+00

Pos logit contributions
lor+0.010
handed+0.010
unch+0.010
orns+0.009
anger+0.009
Neg logit contributions
 very-0.008
 their-0.008
 smile-0.007
 the-0.007
 Amen-0.007
Token?".
Token ID43634
Feature activation-8.42e-01

Pos logit contributions
 Amen+0.037
 their+0.035
 smile+0.031
te+0.030
hy+0.030
Neg logit contributions
 exhibit-0.051
handed-0.050
 exits-0.047
asis-0.047
anger-0.046
Token Jerry
Token ID13075
Feature activation-7.15e-02

Pos logit contributions
 their+0.021
 Amen+0.021
 very+0.020
 smile+0.020
 the+0.019
Neg logit contributions
handed-0.025
anger-0.024
unch-0.023
 exhibit-0.023
lor-0.023
Token!
Token ID0
Feature activation+3.18e-01

Pos logit contributions
handed+0.027
unch+0.025
orns+0.025
 exhibit+0.024
lor+0.024
Neg logit contributions
 very-0.009
 their-0.009
 the-0.008
 smile-0.008
 Amen-0.008
Token Can
Token ID1680
Feature activation-5.10e-01

Pos logit contributions
handed+0.010
 exhibit+0.009
unch+0.009
anger+0.009
lor+0.009
Neg logit contributions
 their-0.008
 very-0.007
 smile-0.007
 Amen-0.007
 low-0.006
Token I
Token ID314
Feature activation-1.08e+00

Pos logit contributions
 their+0.012
 smile+0.012
 Amen+0.012
 very+0.012
 low+0.011
Neg logit contributions
handed-0.014
 exhibit-0.014
unch-0.013
lor-0.013
 exits-0.013
Token mail
Token ID6920
Feature activation-1.00e+00

Pos logit contributions
 their+0.026
 Amen+0.025
 very+0.023
 smile+0.023
 low+0.022
Neg logit contributions
handed-0.030
 exhibit-0.030
unch-0.029
anger-0.028
 discovery-0.027
Token something
Token ID1223
Feature activation-2.12e+00

Pos logit contributions
 Amen+0.017
 smile+0.016
 their+0.016
 very+0.015
 low+0.014
Neg logit contributions
 exhibit-0.035
handed-0.035
anger-0.034
 exits-0.034
lor-0.033
Token please
Token ID3387
Feature activation-4.48e+00

Pos logit contributions
 their+0.034
 very+0.034
 Amen+0.033
 smile+0.031
 like+0.029
Neg logit contributions
handed-0.083
 exhibit-0.079
unch-0.078
orns-0.077
lor-0.077
Token?"
Token ID1701
Feature activation-3.12e-01

Pos logit contributions
 Amen+0.100
 their+0.096
 smile+0.092
 very+0.089
 low+0.088
Neg logit contributions
 exhibit-0.127
unch-0.125
handed-0.124
 discovery-0.121
anger-0.119
Token
Token ID198
Feature activation-2.47e-02

Pos logit contributions
 their+0.005
 Amen+0.005
 very+0.005
 smile+0.005
 the+0.004
Neg logit contributions
handed-0.012
 exhibit-0.011
unch-0.011
 exits-0.011
lor-0.011
Token
Token ID198
Feature activation+9.90e-03

Pos logit contributions
 their+0.000
 Amen+0.000
 very+0.000
 smile+0.000
 the+0.000
Neg logit contributions
handed-0.001
 exhibit-0.001
unch-0.001
 exits-0.001
anger-0.001
TokenThe
Token ID464
Feature activation-1.53e+00

Pos logit contributions
handed+0.000
 exhibit+0.000
unch+0.000
 exits+0.000
anger+0.000
Neg logit contributions
 Amen-0.000
 their-0.000
 smile-0.000
 very-0.000
 like-0.000
Token man
Token ID582
Feature activation-2.10e+00

Pos logit contributions
 very+0.037
 the+0.035
 blue+0.033
 pink+0.032
 their+0.031
Neg logit contributions
unch-0.054
handed-0.052
oles-0.051
 endings-0.051
orns-0.051
Token.
Token ID13
Feature activation-2.46e-01

Pos logit contributions
 Amen+0.013
 their+0.012
 smile+0.012
 very+0.012
 smiled+0.011
Neg logit contributions
handed-0.020
 exhibit-0.020
 exits-0.019
lor-0.019
anger-0.019
Token Can
Token ID1680
Feature activation-6.24e-01

Pos logit contributions
 their+0.005
 smile+0.005
 very+0.005
 Amen+0.005
 low+0.004
Neg logit contributions
handed-0.008
 exhibit-0.008
unch-0.008
 exits-0.008
lor-0.008
Token we
Token ID356
Feature activation+2.19e-01

Pos logit contributions
 their+0.015
 Amen+0.014
 smile+0.014
 very+0.014
 low+0.013
Neg logit contributions
handed-0.018
 exhibit-0.018
unch-0.017
lor-0.017
 exits-0.017
Token come
Token ID1282
Feature activation-9.74e-01

Pos logit contributions
handed+0.008
 exhibit+0.008
unch+0.007
 exits+0.007
anger+0.007
Neg logit contributions
 Amen-0.004
 their-0.004
 very-0.004
 smile-0.003
 low-0.003
Token back
Token ID736
Feature activation-2.48e+00

Pos logit contributions
 Amen+0.019
 their+0.017
hy+0.017
 curly+0.016
 cloudy+0.016
Neg logit contributions
 exhibit-0.032
 discovery-0.030
 exits-0.029
handed-0.029
 territory-0.029
Token again
Token ID757
Feature activation-4.46e+00

Pos logit contributions
 Amen+0.049
 smile+0.043
 their+0.043
 very+0.043
 low+0.038
Neg logit contributions
 exhibit-0.086
handed-0.086
 exits-0.081
anger-0.080
unch-0.079
Token?"
Token ID1701
Feature activation-7.75e-01

Pos logit contributions
 Amen+0.089
 their+0.082
 very+0.080
 smile+0.080
te+0.070
Neg logit contributions
 exhibit-0.148
handed-0.147
 exits-0.142
unch-0.140
 discovery-0.138
Token His
Token ID2399
Feature activation-1.02e+00

Pos logit contributions
 their+0.013
 Amen+0.012
 very+0.012
 smile+0.012
 the+0.010
Neg logit contributions
handed-0.030
 exhibit-0.029
 exits-0.028
unch-0.028
anger-0.028
Token parents
Token ID3397
Feature activation-2.40e+00

Pos logit contributions
 very+0.015
 smile+0.015
 their+0.015
 Amen+0.015
 the+0.012
Neg logit contributions
handed-0.042
unch-0.040
 exhibit-0.040
orns-0.040
lor-0.040
Token smiled
Token ID13541
Feature activation-2.64e+00

Pos logit contributions
 Amen+0.051
 their+0.046
 very+0.041
 low+0.040
 cloudy+0.040
Neg logit contributions
handed-0.078
unch-0.076
 exhibit-0.075
 discovery-0.071
anger-0.071
Token and
Token ID290
Feature activation-8.54e-01

Pos logit contributions
 Amen+0.074
nam+0.069
hy+0.067
 whine+0.066
 Chess+0.066
Neg logit contributions
 discovery-0.057
unch-0.053
 footprints-0.052
 bur-0.050
handed-0.050
Token You
Token ID921
Feature activation-1.17e+00

Pos logit contributions
handed+0.014
 exhibit+0.014
unch+0.013
anger+0.013
lor+0.013
Neg logit contributions
 their-0.010
 very-0.010
 Amen-0.009
 smile-0.009
 low-0.008
Token are
Token ID389
Feature activation+1.15e-02

Pos logit contributions
 their+0.020
 Amen+0.020
 very+0.017
 smile+0.017
 low+0.017
Neg logit contributions
 exhibit-0.040
handed-0.040
 exits-0.038
lor-0.038
anger-0.038
Token a
Token ID257
Feature activation+8.60e-03

Pos logit contributions
handed+0.000
 exits+0.000
 exhibit+0.000
 burying+0.000
 discovery+0.000
Neg logit contributions
bell-0.000
 Amen-0.000
 smile-0.000
 cloudy-0.000
 whine-0.000
Token very
Token ID845
Feature activation-7.19e-01

Pos logit contributions
handed+0.000
 exhibit+0.000
ager+0.000
 discovery+0.000
 exits+0.000
Neg logit contributions
bell-0.000
 their-0.000
 Amen-0.000
 smiled-0.000
 smile-0.000
Token good
Token ID922
Feature activation-1.20e+00

Pos logit contributions
 Amen+0.014
 their+0.013
 smile+0.012
 very+0.011
 the+0.011
Neg logit contributions
handed-0.025
 exhibit-0.023
 exits-0.022
unch-0.022
anger-0.022
Token writer
Token ID6260
Feature activation-4.45e+00

Pos logit contributions
 Amen+0.029
 their+0.029
 very+0.027
 smile+0.027
 low+0.027
Neg logit contributions
handed-0.034
 exhibit-0.033
anger-0.032
unch-0.031
 exits-0.030
Token.
Token ID13
Feature activation+3.00e-01

Pos logit contributions
 cloudy+0.102
 low+0.101
 smile+0.101
 their+0.095
 curly+0.093
Neg logit contributions
handed-0.127
lor-0.124
 exhibit-0.124
anger-0.115
orns-0.111
Token Your
Token ID3406
Feature activation-1.70e+00

Pos logit contributions
handed+0.009
 exhibit+0.008
lor+0.008
unch+0.008
orns+0.008
Neg logit contributions
 very-0.007
 their-0.007
 smile-0.007
 Amen-0.007
 low-0.007
Token uncle
Token ID7711
Feature activation-3.91e-01

Pos logit contributions
 Amen+0.034
 their+0.034
 very+0.032
 smile+0.032
 low+0.029
Neg logit contributions
handed-0.058
 exhibit-0.056
 exits-0.054
unch-0.053
anger-0.053
Token will
Token ID481
Feature activation-2.54e+00

Pos logit contributions
 Amen+0.008
 their+0.007
 very+0.007
 smile+0.007
 low+0.006
Neg logit contributions
handed-0.013
 exhibit-0.013
orns-0.012
unch-0.012
 exits-0.012
Token be
Token ID307
Feature activation-1.79e+00

Pos logit contributions
 Amen+0.048
 their+0.046
 cloudy+0.043
 smiled+0.043
nam+0.042
Neg logit contributions
handed-0.074
lor-0.069
 discovery-0.068
 hunt-0.066
 exhibit-0.066
Token can
Token ID460
Feature activation-1.93e+00

Pos logit contributions
 very+0.011
 their+0.011
 smile+0.010
 Amen+0.010
 like+0.009
Neg logit contributions
handed-0.033
unch-0.031
lor-0.031
 exhibit-0.030
 exits-0.030
Token make
Token ID787
Feature activation-1.96e+00

Pos logit contributions
 Amen+0.043
 their+0.043
 very+0.041
 smile+0.039
 low+0.036
Neg logit contributions
handed-0.060
 exhibit-0.057
unch-0.055
anger-0.054
 exits-0.054
Token it
Token ID340
Feature activation-1.64e+00

Pos logit contributions
 their+0.024
 the+0.022
 very+0.022
 smile+0.014
 a+0.014
Neg logit contributions
handed-0.090
 exhibit-0.088
lor-0.088
unch-0.088
asis-0.087
Token look
Token ID804
Feature activation-2.33e+00

Pos logit contributions
 Amen+0.033
 their+0.026
 smile+0.025
te+0.024
 smiled+0.023
Neg logit contributions
handed-0.056
 exhibit-0.055
 exits-0.054
 discovery-0.053
anger-0.053
Token new
Token ID649
Feature activation-1.89e+00

Pos logit contributions
 Amen+0.066
 whine+0.051
hy+0.050
te+0.050
 smile+0.049
Neg logit contributions
 exits-0.058
 discovery-0.057
 clues-0.054
 worms-0.053
 territory-0.053
Token again
Token ID757
Feature activation-4.43e+00

Pos logit contributions
 Amen+0.038
 their+0.035
 smile+0.033
 very+0.030
 low+0.030
Neg logit contributions
handed-0.065
 exhibit-0.063
 exits-0.062
anger-0.059
 territory-0.059
Token."
Token ID526
Feature activation-5.60e-01

Pos logit contributions
 Amen+0.099
 smile+0.091
 their+0.088
 very+0.082
hy+0.078
Neg logit contributions
handed-0.136
 exhibit-0.136
 exits-0.132
orns-0.129
asis-0.127
Token
Token ID198
Feature activation-2.44e-01

Pos logit contributions
 their+0.008
 very+0.007
 smile+0.007
 Amen+0.007
 the+0.007
Neg logit contributions
handed-0.023
 exhibit-0.022
 exits-0.022
unch-0.022
lor-0.022
Token
Token ID198
Feature activation-3.40e-01

Pos logit contributions
 their+0.004
 very+0.004
 Amen+0.004
 smile+0.004
 the+0.003
Neg logit contributions
handed-0.010
 exhibit-0.009
 exits-0.009
anger-0.009
lor-0.009
TokenThey
Token ID2990
Feature activation-6.83e-01

Pos logit contributions
 Amen+0.007
 their+0.007
 smile+0.007
 very+0.007
 the+0.006
Neg logit contributions
handed-0.011
 exhibit-0.011
 exits-0.010
unch-0.010
lor-0.010
Token went
Token ID1816
Feature activation+2.08e-02

Pos logit contributions
 Amen+0.016
 their+0.014
 smile+0.014
 very+0.013
 like+0.012
Neg logit contributions
 exhibit-0.020
handed-0.020
 exits-0.019
unch-0.019
aded-0.019
Token.
Token ID13
Feature activation+5.62e-01

Pos logit contributions
handed+0.010
 exhibit+0.010
 exits+0.010
unch+0.010
orns+0.010
Neg logit contributions
 Amen-0.006
 their-0.005
 smile-0.005
 very-0.005
 low-0.005
Token Can
Token ID1680
Feature activation-3.78e-01

Pos logit contributions
unch+0.016
 exhibit+0.016
handed+0.016
lor+0.015
 exits+0.015
Neg logit contributions
 smile-0.013
 their-0.013
 very-0.013
 low-0.011
 smiled-0.011
Token I
Token ID314
Feature activation-8.11e-01

Pos logit contributions
 smile+0.011
 their+0.011
 very+0.011
te+0.010
 Amen+0.010
Neg logit contributions
 exhibit-0.008
unch-0.007
con-0.007
handed-0.007
lor-0.007
Token write
Token ID3551
Feature activation-3.01e+00

Pos logit contributions
 their+0.018
 Amen+0.018
 very+0.016
 cloudy+0.015
 low+0.015
Neg logit contributions
 exhibit-0.023
handed-0.023
unch-0.022
 hunt-0.021
asis-0.021
Token a
Token ID257
Feature activation-1.29e+00

Pos logit contributions
 Amen+0.064
 smile+0.055
 whine+0.050
hy+0.050
spe+0.049
Neg logit contributions
 exhibit-0.089
orns-0.087
anger-0.087
 exits-0.086
 worms-0.085
Token story
Token ID1621
Feature activation-4.42e+00

Pos logit contributions
 Amen+0.033
 very+0.032
 their+0.032
 smile+0.032
 like+0.029
Neg logit contributions
handed-0.038
 exhibit-0.035
unch-0.035
 exits-0.035
orns-0.034
Token about
Token ID546
Feature activation-8.25e-01

Pos logit contributions
 Amen+0.104
 smile+0.093
 their+0.092
 smiled+0.084
 very+0.083
Neg logit contributions
 exhibit-0.131
anger-0.129
unch-0.122
lor-0.122
handed-0.122
Token you
Token ID345
Feature activation-1.18e+00

Pos logit contributions
 their+0.009
 Amen+0.009
 very+0.009
 smile+0.008
 the+0.007
Neg logit contributions
handed-0.036
 exhibit-0.034
unch-0.034
anger-0.034
 exits-0.034
Token?"
Token ID1701
Feature activation-2.91e-01

Pos logit contributions
 Amen+0.024
 their+0.024
 smile+0.022
 very+0.021
hy+0.019
Neg logit contributions
 exhibit-0.038
unch-0.036
handed-0.036
 exits-0.035
orns-0.034
Token
Token ID198
Feature activation+2.30e-01

Pos logit contributions
 Amen+0.007
 very+0.007
 smile+0.007
 their+0.007
te+0.007
Neg logit contributions
handed-0.008
unch-0.008
 exhibit-0.008
orns-0.007
con-0.007
Token
Token ID198
Feature activation+3.02e-01

Pos logit contributions
handed+0.007
unch+0.007
 exhibit+0.006
orns+0.006
lor+0.006
Neg logit contributions
 Amen-0.005
 very-0.005
 smile-0.005
 their-0.005
 like-0.005
Token broken
Token ID5445
Feature activation-4.37e-01

Pos logit contributions
 Amen+0.044
 their+0.040
 smile+0.039
La+0.036
 smiled+0.036
Neg logit contributions
handed-0.044
 exits-0.043
 exhibit-0.042
 discovery-0.041
anger-0.041
Token.
Token ID13
Feature activation+2.89e-01

Pos logit contributions
 Amen+0.011
 smile+0.010
 their+0.010
 smiled+0.009
 low+0.009
Neg logit contributions
handed-0.012
 exits-0.012
 exhibit-0.012
orns-0.011
anger-0.011
Token What
Token ID1867
Feature activation-1.41e+00

Pos logit contributions
handed+0.009
 exhibit+0.009
unch+0.009
 exits+0.009
anger+0.009
Neg logit contributions
 their-0.006
 smile-0.006
 very-0.005
 Amen-0.005
 low-0.005
Token should
Token ID815
Feature activation-1.59e+00

Pos logit contributions
 Amen+0.031
 their+0.027
 smile+0.027
 seven+0.025
 very+0.025
Neg logit contributions
 exhibit-0.044
handed-0.044
unch-0.042
 exits-0.042
 endings-0.041
Token we
Token ID356
Feature activation-9.35e-01

Pos logit contributions
 Amen+0.036
 smile+0.035
 very+0.034
 like+0.034
 low+0.033
Neg logit contributions
handed-0.047
 exhibit-0.046
anger-0.043
 discovery-0.043
 exits-0.043
Token do
Token ID466
Feature activation-4.40e+00

Pos logit contributions
 their+0.019
 Amen+0.019
 smile+0.016
 very+0.016
te+0.015
Neg logit contributions
handed-0.031
 exhibit-0.030
unch-0.029
 exits-0.029
anger-0.028
Token?"
Token ID1701
Feature activation-3.82e-01

Pos logit contributions
 Amen+0.082
 their+0.077
 smile+0.076
 very+0.071
 low+0.068
Neg logit contributions
handed-0.152
 exhibit-0.148
 exits-0.143
unch-0.142
anger-0.141
Token mom
Token ID1995
Feature activation-1.93e+00

Pos logit contributions
 Amen+0.008
 their+0.008
 very+0.008
 smile+0.008
te+0.008
Neg logit contributions
handed-0.011
 exhibit-0.011
unch-0.010
 exits-0.010
 discovery-0.010
Tokenmy
Token ID1820
Feature activation-1.06e+00

Pos logit contributions
 their+0.029
 Amen+0.029
 very+0.027
 smile+0.027
 like+0.023
Neg logit contributions
handed-0.075
 exhibit-0.073
unch-0.071
 exits-0.071
anger-0.070
Token said
Token ID531
Feature activation-1.34e-01

Pos logit contributions
 Amen+0.021
 their+0.021
 very+0.020
 smile+0.020
 like+0.018
Neg logit contributions
handed-0.036
 exhibit-0.035
unch-0.034
 exits-0.034
anger-0.034
Token.
Token ID13
Feature activation-6.85e-01

Pos logit contributions
 Amen+0.003
 seven+0.003
 merry+0.003
La+0.003
ifying+0.003
Neg logit contributions
handed-0.004
unch-0.004
 exhibit-0.003
anger-0.003
oles-0.003
Token make
Token ID787
Feature activation-2.17e+00

Pos logit contributions
handed+0.018
 exhibit+0.017
unch+0.017
 exits+0.016
anger+0.016
Neg logit contributions
 Amen-0.010
 their-0.010
 very-0.009
 smile-0.009
 like-0.008
Token my
Token ID616
Feature activation-7.99e-01

Pos logit contributions
 their+0.028
 very+0.027
 Amen+0.024
 the+0.023
 smile+0.023
Neg logit contributions
handed-0.092
 exhibit-0.090
unch-0.088
 exits-0.087
anger-0.086
Token dress
Token ID6576
Feature activation-1.85e+00

Pos logit contributions
 Amen+0.023
 their+0.021
 very+0.019
 smile+0.019
te+0.019
Neg logit contributions
handed-0.021
anger-0.020
 exhibit-0.020
 exits-0.020
 discovery-0.020
Token look
Token ID804
Feature activation-2.53e+00

Pos logit contributions
 Amen+0.053
 their+0.040
te+0.038
 whine+0.038
 seven+0.037
Neg logit contributions
anger-0.052
 discovery-0.049
ade-0.049
uddle-0.048
handed-0.048
Token new
Token ID649
Feature activation-1.72e+00

Pos logit contributions
 Amen+0.076
 whine+0.059
hy+0.057
te+0.055
 their+0.055
Neg logit contributions
 discovery-0.059
 exits-0.058
handed-0.056
ade-0.056
uddle-0.052
Token again
Token ID757
Feature activation-4.39e+00

Pos logit contributions
 Amen+0.040
 their+0.034
 low+0.030
 smile+0.029
hy+0.028
Neg logit contributions
handed-0.054
 exhibit-0.054
 discovery-0.054
anger-0.054
 exits-0.053
Token?
Token ID30
Feature activation-1.42e+00

Pos logit contributions
 Amen+0.105
 their+0.093
 smile+0.091
 very+0.088
hy+0.084
Neg logit contributions
handed-0.132
 exhibit-0.127
 exits-0.124
asis-0.122
 discovery-0.122
Tokenâ
Token ID22940
Feature activation-2.66e+00

Pos logit contributions
 their+0.031
 smile+0.030
 Amen+0.029
 very+0.029
 low+0.027
Neg logit contributions
handed-0.044
 exhibit-0.041
unch-0.041
lor-0.041
anger-0.041
Token€
Token ID26391
Feature activation-1.45e+00

Pos logit contributions
 their+0.067
 very+0.061
 smile+0.059
 Amen+0.059
 the+0.054
Neg logit contributions
handed-0.084
 exhibit-0.078
lor-0.075
anger-0.075
asis-0.075
Token
Token ID198
Feature activation-3.38e-01

Pos logit contributions
 their+0.032
 Amen+0.031
 very+0.030
 smile+0.029
 like+0.027
Neg logit contributions
handed-0.046
 exhibit-0.045
lor-0.043
unch-0.043
anger-0.042
TokenMay
Token ID6747
Feature activation-1.36e+00

Pos logit contributions
 their+0.006
 Amen+0.006
 very+0.006
 smile+0.005
 the+0.005
Neg logit contributions
handed-0.013
 exhibit-0.012
 exits-0.012
anger-0.012
unch-0.012
Token book
Token ID1492
Feature activation-7.73e-01

Pos logit contributions
 very+0.016
 smile+0.016
 their+0.016
 Amen+0.015
 like+0.014
Neg logit contributions
handed-0.023
unch-0.022
orns-0.022
 exhibit-0.022
lor-0.022
Token.
Token ID13
Feature activation-2.13e-01

Pos logit contributions
 Amen+0.015
 their+0.013
 smile+0.012
 very+0.012
 the+0.010
Neg logit contributions
handed-0.028
 exhibit-0.027
anger-0.026
 exits-0.026
unch-0.026
Token He
Token ID679
Feature activation-5.11e-01

Pos logit contributions
 Amen+0.004
 smile+0.003
 their+0.003
 very+0.003
 smiled+0.003
Neg logit contributions
handed-0.008
 exhibit-0.007
unch-0.007
anger-0.007
lor-0.007
Token was
Token ID373
Feature activation-8.66e-01

Pos logit contributions
 Amen+0.010
 their+0.009
 smile+0.008
 forward+0.008
 cloudy+0.008
Neg logit contributions
 exhibit-0.017
handed-0.017
 exits-0.016
unch-0.015
 discovery-0.015
Token a
Token ID257
Feature activation-9.36e-01

Pos logit contributions
 Amen+0.014
 smile+0.010
 their+0.010
 cloudy+0.009
te+0.008
Neg logit contributions
handed-0.031
 exits-0.030
 exhibit-0.030
 burying-0.029
unch-0.029
Token writer
Token ID6260
Feature activation-4.37e+00

Pos logit contributions
 their+0.018
 Amen+0.018
 smile+0.016
 very+0.015
 the+0.015
Neg logit contributions
handed-0.031
 exhibit-0.031
 exits-0.029
 discovery-0.029
anger-0.029
Token and
Token ID290
Feature activation-1.02e+00

Pos logit contributions
 Amen+0.083
 their+0.080
 smile+0.077
 low+0.075
 cloudy+0.071
Neg logit contributions
handed-0.145
 exhibit-0.144
lor-0.137
anger-0.134
unch-0.133
Token a
Token ID257
Feature activation-1.72e+00

Pos logit contributions
 Amen+0.019
 smile+0.017
 low+0.016
 their+0.015
 cloudy+0.015
Neg logit contributions
handed-0.033
 exhibit-0.032
 exits-0.031
lor-0.030
anger-0.029
Token traveler
Token ID40168
Feature activation-1.27e+00

Pos logit contributions
 their+0.039
 Amen+0.039
 smile+0.033
te+0.033
 like+0.033
Neg logit contributions
 exhibit-0.049
handed-0.049
 discovery-0.046
 exits-0.046
anger-0.045
Token.
Token ID13
Feature activation-1.24e-01

Pos logit contributions
 Amen+0.021
 their+0.020
 very+0.019
 smile+0.019
 low+0.017
Neg logit contributions
handed-0.047
 exhibit-0.046
anger-0.044
 exits-0.044
lor-0.044
Token He
Token ID679
Feature activation-5.73e-01

Pos logit contributions
 Amen+0.002
 smile+0.002
 their+0.002
 very+0.002
 smiled+0.002
Neg logit contributions
handed-0.004
 exhibit-0.004
unch-0.004
lor-0.004
 exits-0.004
Token!
Token ID0
Feature activation+6.88e-01

Pos logit contributions
 Amen+0.039
 their+0.038
 very+0.035
 smile+0.035
 low+0.034
Neg logit contributions
handed-0.039
 exhibit-0.038
 discovery-0.035
orns-0.035
asis-0.035
Token You
Token ID921
Feature activation-1.05e+00

Pos logit contributions
handed+0.021
 exhibit+0.021
unch+0.020
anger+0.020
lor+0.020
Neg logit contributions
 their-0.016
 very-0.015
 smile-0.015
 Amen-0.015
 low-0.014
Token are
Token ID389
Feature activation-1.15e-01

Pos logit contributions
 Amen+0.018
 their+0.018
 very+0.015
 smile+0.015
 low+0.015
Neg logit contributions
handed-0.037
 exhibit-0.037
 exits-0.035
unch-0.035
anger-0.035
Token a
Token ID257
Feature activation+1.54e-01

Pos logit contributions
 Amen+0.002
bell+0.001
 smile+0.001
 their+0.001
 low+0.001
Neg logit contributions
handed-0.004
 exhibit-0.004
 exits-0.004
 burying-0.004
 discovery-0.004
Token great
Token ID1049
Feature activation-7.34e-01

Pos logit contributions
handed+0.005
 exhibit+0.005
 discovery+0.005
asis+0.004
opus+0.004
Neg logit contributions
 their-0.003
 Amen-0.003
bell-0.002
 smiled-0.002
 smile-0.002
Token writer
Token ID6260
Feature activation-4.36e+00

Pos logit contributions
 Amen+0.018
 their+0.018
 very+0.017
 smile+0.016
 low+0.016
Neg logit contributions
handed-0.021
 exhibit-0.021
anger-0.020
unch-0.020
 exits-0.019
Token!"
Token ID2474
Feature activation+3.62e-01

Pos logit contributions
 their+0.105
 low+0.102
 smile+0.099
 Amen+0.098
 cloudy+0.097
Neg logit contributions
handed-0.122
 exhibit-0.120
lor-0.115
anger-0.112
unch-0.110
Token she
Token ID673
Feature activation-1.32e+00

Pos logit contributions
handed+0.012
 exhibit+0.012
unch+0.011
 exits+0.011
emy+0.011
Neg logit contributions
 Amen-0.007
te-0.006
 very-0.006
 smile-0.006
ifying-0.005
Token says
Token ID1139
Feature activation+5.76e-01

Pos logit contributions
 Amen+0.040
 their+0.040
ifying+0.039
 very+0.038
 low+0.037
Neg logit contributions
 exhibit-0.025
 exits-0.024
anger-0.022
handed-0.022
 discovery-0.022
Token.
Token ID13
Feature activation+8.62e-02

Pos logit contributions
unch+0.015
handed+0.015
 exhibit+0.015
oles+0.015
 discovery+0.015
Neg logit contributions
 Amen-0.014
 seven-0.012
 cloudy-0.012
 merry-0.011
ifying-0.011
Token
Token ID198
Feature activation+6.92e-01

Pos logit contributions
handed+0.004
 exhibit+0.003
 exits+0.003
asis+0.003
anger+0.003
Neg logit contributions
 their-0.001
 Amen-0.001
 very-0.001
 smile-0.001
 the-0.001
Token.
Token ID13
Feature activation+8.43e-01

Pos logit contributions
 Amen+0.023
 their+0.022
 very+0.021
 smile+0.021
 like+0.019
Neg logit contributions
handed-0.033
 exhibit-0.032
unch-0.031
 exits-0.031
anger-0.030
Token Can
Token ID1680
Feature activation+2.23e-01

Pos logit contributions
handed+0.026
 exhibit+0.025
unch+0.025
lor+0.024
 exits+0.024
Neg logit contributions
 very-0.019
 smile-0.019
 their-0.018
 low-0.017
 Amen-0.017
Token we
Token ID356
Feature activation+1.06e+00

Pos logit contributions
 exhibit+0.005
lor+0.005
handed+0.005
unch+0.005
anger+0.004
Neg logit contributions
 smile-0.006
 Amen-0.006
 very-0.006
 silent-0.006
 their-0.006
Token come
Token ID1282
Feature activation-5.21e-01

Pos logit contributions
 exhibit+0.035
handed+0.034
asis+0.033
unch+0.033
lor+0.032
Neg logit contributions
 Amen-0.021
 their-0.020
 very-0.019
 low-0.017
 smile-0.017
Token back
Token ID736
Feature activation-2.09e+00

Pos logit contributions
 Amen+0.010
hy+0.009
 curly+0.009
 cloudy+0.009
nam+0.008
Neg logit contributions
 exhibit-0.017
handed-0.016
 discovery-0.015
 territory-0.015
asis-0.015
Token again
Token ID757
Feature activation-4.36e+00

Pos logit contributions
 Amen+0.044
 smile+0.039
 very+0.037
 their+0.036
 low+0.033
Neg logit contributions
 exhibit-0.072
handed-0.070
 exits-0.065
unch-0.065
anger-0.064
Token?"
Token ID1701
Feature activation-3.34e-01

Pos logit contributions
 Amen+0.096
 smile+0.086
 very+0.084
 their+0.083
hy+0.081
Neg logit contributions
 exhibit-0.136
handed-0.132
unch-0.128
asis-0.127
 exits-0.127
Token
Token ID198
Feature activation+2.39e-01

Pos logit contributions
 Amen+0.008
 very+0.007
 smile+0.007
 their+0.007
 like+0.007
Neg logit contributions
handed-0.010
 exhibit-0.010
unch-0.010
anger-0.009
orns-0.009
Token
Token ID198
Feature activation+3.19e-01

Pos logit contributions
handed+0.008
 exhibit+0.007
unch+0.007
 exits+0.007
lor+0.007
Neg logit contributions
 Amen-0.005
 their-0.005
 very-0.005
 smile-0.005
 like-0.005
TokenMom
Token ID29252
Feature activation-1.11e+00

Pos logit contributions
handed+0.011
 exhibit+0.010
unch+0.010
lor+0.010
 exits+0.009
Neg logit contributions
 Amen-0.007
 smile-0.006
 very-0.006
 like-0.006
 their-0.006
Token and
Token ID290
Feature activation-6.48e-01

Pos logit contributions
 Amen+0.014
 their+0.013
 very+0.012
 smile+0.011
 like+0.010
Neg logit contributions
handed-0.046
 exhibit-0.045
unch-0.044
anger-0.044
 exits-0.044
Token How
Token ID1374
Feature activation-1.18e+00

Pos logit contributions
 their+0.005
 Amen+0.005
 very+0.005
 smile+0.005
 the+0.004
Neg logit contributions
handed-0.008
 exhibit-0.008
unch-0.007
 exits-0.007
anger-0.007
Token about
Token ID546
Feature activation-6.53e-01

Pos logit contributions
 Amen+0.029
 their+0.024
 smile+0.023
 very+0.021
 smiled+0.020
Neg logit contributions
 exhibit-0.036
unch-0.036
anger-0.036
 exits-0.035
 endings-0.034
Token we
Token ID356
Feature activation+2.45e-01

Pos logit contributions
 their+0.013
 Amen+0.012
 very+0.012
 smile+0.011
 the+0.011
Neg logit contributions
handed-0.024
anger-0.022
 exhibit-0.022
unch-0.022
 exits-0.022
Token make
Token ID787
Feature activation-2.37e+00

Pos logit contributions
handed+0.009
 exhibit+0.008
unch+0.008
 exits+0.008
anger+0.008
Neg logit contributions
 Amen-0.005
 their-0.004
 very-0.004
 smile-0.004
 low-0.004
Token a
Token ID257
Feature activation-1.20e-01

Pos logit contributions
 their+0.047
 very+0.045
 Amen+0.045
 smile+0.042
 the+0.041
Neg logit contributions
handed-0.085
 exhibit-0.082
anger-0.080
 exits-0.079
unch-0.079
Token deal
Token ID1730
Feature activation-4.34e+00

Pos logit contributions
 very+0.003
 Amen+0.003
 their+0.003
 smile+0.003
 like+0.003
Neg logit contributions
handed-0.004
 exhibit-0.003
anger-0.003
unch-0.003
orns-0.003
Token?
Token ID30
Feature activation-2.16e-01

Pos logit contributions
 Amen+0.083
 smile+0.081
 their+0.077
 low+0.076
 very+0.073
Neg logit contributions
handed-0.143
 exhibit-0.142
unch-0.141
anger-0.135
orns-0.133
Token If
Token ID1002
Feature activation-5.12e-01

Pos logit contributions
 their+0.005
 very+0.004
 smile+0.004
 Amen+0.004
 low+0.004
Neg logit contributions
handed-0.007
 exhibit-0.007
unch-0.007
lor-0.006
 exits-0.006
Token you
Token ID345
Feature activation+1.16e-01

Pos logit contributions
 Amen+0.013
 their+0.013
 very+0.013
 smile+0.013
 like+0.012
Neg logit contributions
handed-0.014
 exhibit-0.014
unch-0.013
 exits-0.013
anger-0.013
Token eat
Token ID4483
Feature activation+5.95e-01

Pos logit contributions
handed+0.004
 exhibit+0.004
unch+0.004
 exits+0.004
anger+0.004
Neg logit contributions
 Amen-0.002
 their-0.002
 very-0.002
 smile-0.002
 the-0.002
Token one
Token ID530
Feature activation-3.70e-01

Pos logit contributions
handed+0.020
 exhibit+0.019
unch+0.019
 exits+0.019
orns+0.018
Neg logit contributions
 Amen-0.012
 smile-0.011
 their-0.010
 very-0.010
te-0.010
Token!
Token ID0
Feature activation+2.60e-01

Pos logit contributions
 Amen+0.035
 their+0.034
 smile+0.033
 very+0.032
 low+0.031
Neg logit contributions
handed-0.042
 exhibit-0.042
unch-0.040
 exits-0.039
anger-0.039
Token Can
Token ID1680
Feature activation-1.12e-01

Pos logit contributions
unch+0.007
lor+0.007
 exhibit+0.006
handed+0.006
anger+0.006
Neg logit contributions
 their-0.007
 smile-0.007
 very-0.006
 low-0.006
nam-0.006
Token we
Token ID356
Feature activation+6.33e-01

Pos logit contributions
 smile+0.003
 their+0.003
 very+0.003
 low+0.003
 Amen+0.003
Neg logit contributions
unch-0.002
lor-0.002
handed-0.002
 exhibit-0.002
anger-0.002
Token buy
Token ID2822
Feature activation-8.83e-01

Pos logit contributions
 exhibit+0.020
handed+0.020
unch+0.020
asis+0.019
lor+0.019
Neg logit contributions
 Amen-0.012
 their-0.012
 very-0.011
 low-0.011
 smile-0.010
Token popcorn
Token ID38914
Feature activation-9.75e-01

Pos logit contributions
 Amen+0.016
 smile+0.015
 their+0.014
 very+0.014
 low+0.013
Neg logit contributions
handed-0.032
 exhibit-0.031
 exits-0.030
unch-0.029
orns-0.029
Token again
Token ID757
Feature activation-4.33e+00

Pos logit contributions
 Amen+0.021
 their+0.018
 smile+0.018
 curly+0.017
 low+0.017
Neg logit contributions
 exhibit-0.032
handed-0.030
anger-0.030
unch-0.029
asis-0.029
Token some
Token ID617
Feature activation-1.98e+00

Pos logit contributions
 Amen+0.096
 smile+0.086
 very+0.085
 their+0.085
hy+0.081
Neg logit contributions
 exhibit-0.135
handed-0.132
 exits-0.127
unch-0.127
 discovery-0.126
Token day
Token ID1110
Feature activation-3.34e+00

Pos logit contributions
 Amen+0.037
 their+0.037
 very+0.035
 smile+0.035
 like+0.031
Neg logit contributions
handed-0.070
 exhibit-0.068
unch-0.066
 exits-0.066
anger-0.065
Token?"
Token ID1701
Feature activation-4.23e-01

Pos logit contributions
 Amen+0.071
 their+0.065
 smile+0.062
 very+0.060
 low+0.054
Neg logit contributions
 exhibit-0.108
handed-0.108
unch-0.106
anger-0.103
 exits-0.103
Token
Token ID220
Feature activation-8.29e-01

Pos logit contributions
 Amen+0.009
 very+0.008
 like+0.008
 seven+0.008
 their+0.008
Neg logit contributions
handed-0.013
 exhibit-0.013
unch-0.013
lor-0.012
alking-0.012
Token
Token ID198
Feature activation+2.95e-02

Pos logit contributions
 their+0.013
 very+0.012
 Amen+0.012
 smile+0.011
 the+0.011
Neg logit contributions
handed-0.033
 exhibit-0.032
 exits-0.032
orns-0.031
asis-0.031
Token And
Token ID843
Feature activation-3.09e-01

Pos logit contributions
handed+0.002
 exhibit+0.002
 exits+0.002
orns+0.002
lor+0.002
Neg logit contributions
 smile-0.002
 very-0.002
 their-0.002
 Amen-0.002
 low-0.002
Token you
Token ID345
Feature activation-4.65e-01

Pos logit contributions
 Amen+0.008
 their+0.007
 smile+0.007
 very+0.007
 low+0.007
Neg logit contributions
handed-0.008
 exhibit-0.008
 exits-0.008
unch-0.008
 discovery-0.008
Token are
Token ID389
Feature activation-1.84e-01

Pos logit contributions
 their+0.008
 Amen+0.008
 very+0.006
 smile+0.006
 low+0.006
Neg logit contributions
handed-0.017
 exhibit-0.017
 exits-0.016
unch-0.016
orns-0.016
Token a
Token ID257
Feature activation+4.15e-01

Pos logit contributions
bell+0.003
 Amen+0.003
 smile+0.003
hy+0.002
 low+0.002
Neg logit contributions
handed-0.006
 exits-0.006
 burying-0.006
 discovery-0.006
 exhibit-0.006
Token good
Token ID922
Feature activation-9.33e-02

Pos logit contributions
handed+0.013
 exhibit+0.012
 discovery+0.012
 exits+0.011
ager+0.011
Neg logit contributions
 their-0.008
 Amen-0.008
bell-0.007
hy-0.007
 smile-0.007
Token writer
Token ID6260
Feature activation-4.32e+00

Pos logit contributions
 Amen+0.002
 their+0.002
 very+0.002
 smile+0.002
 low+0.002
Neg logit contributions
handed-0.003
 exhibit-0.003
anger-0.003
unch-0.003
 exits-0.003
Token.
Token ID13
Feature activation-3.51e-02

Pos logit contributions
 low+0.100
 smile+0.099
 cloudy+0.098
 their+0.096
 curly+0.092
Neg logit contributions
handed-0.117
 exhibit-0.117
lor-0.115
anger-0.110
orns-0.108
Token But
Token ID887
Feature activation-9.37e-02

Pos logit contributions
 smile+0.001
 very+0.001
 their+0.001
 low+0.001
 smiled+0.001
Neg logit contributions
handed-0.001
 exhibit-0.001
lor-0.001
unch-0.001
 exits-0.001
Token next
Token ID1306
Feature activation-2.11e+00

Pos logit contributions
 their+0.002
 Amen+0.002
 very+0.002
 smile+0.002
 the+0.002
Neg logit contributions
handed-0.003
 exhibit-0.003
unch-0.003
anger-0.003
lor-0.003
Token time
Token ID640
Feature activation-7.79e-01

Pos logit contributions
 Amen+0.060
 their+0.060
 smile+0.056
nam+0.054
 very+0.054
Neg logit contributions
 exhibit-0.051
handed-0.049
 exits-0.047
unch-0.047
anger-0.045
Token,
Token ID11
Feature activation-4.84e-01

Pos logit contributions
 Amen+0.013
 their+0.012
 very+0.011
 smile+0.011
 low+0.010
Neg logit contributions
handed-0.029
 exhibit-0.029
unch-0.028
 exits-0.028
anger-0.028
Token a
Token ID257
Feature activation-3.99e-01

Pos logit contributions
 Amen+0.008
 smile+0.008
te+0.007
 low+0.007
 very+0.007
Neg logit contributions
 exhibit-0.018
 exits-0.018
 worms-0.018
handed-0.018
orns-0.018
Token hen
Token ID30963
Feature activation-1.49e-01

Pos logit contributions
 Amen+0.011
 their+0.011
 very+0.010
 smile+0.010
 like+0.010
Neg logit contributions
handed-0.010
 exhibit-0.010
unch-0.009
 exits-0.009
anger-0.009
Token.
Token ID13
Feature activation-1.18e+00

Pos logit contributions
 Amen+0.002
 their+0.002
 very+0.002
 smile+0.002
 low+0.002
Neg logit contributions
handed-0.006
 exhibit-0.006
unch-0.005
 exits-0.005
anger-0.005
Token He
Token ID679
Feature activation-3.71e-01

Pos logit contributions
 Amen+0.024
 smile+0.022
 very+0.022
 their+0.020
te+0.020
Neg logit contributions
handed-0.038
 exhibit-0.037
anger-0.036
 exits-0.036
unch-0.035
Token uses
Token ID3544
Feature activation-7.92e-01

Pos logit contributions
 Amen+0.008
 their+0.007
 very+0.007
 smile+0.007
 low+0.006
Neg logit contributions
 exhibit-0.012
handed-0.012
 exits-0.011
orns-0.011
unch-0.011
Token brown
Token ID7586
Feature activation+4.43e+00

Pos logit contributions
 Amen+0.012
 their+0.011
 very+0.011
 smile+0.010
 like+0.009
Neg logit contributions
handed-0.031
 exhibit-0.030
 exits-0.029
unch-0.029
anger-0.029
Token,
Token ID11
Feature activation-1.20e-01

Pos logit contributions
handed+0.137
 exhibit+0.126
unch+0.123
lor+0.122
 exits+0.121
Neg logit contributions
 their-0.112
 very-0.108
 Amen-0.106
 smile-0.105
 the-0.099
Token pink
Token ID11398
Feature activation+3.40e+00

Pos logit contributions
 Amen+0.002
te+0.002
 their+0.002
 smile+0.002
 forward+0.002
Neg logit contributions
 worms-0.004
con-0.004
emy-0.004
 exits-0.004
 exhibit-0.004
Token,
Token ID11
Feature activation-5.21e-01

Pos logit contributions
handed+0.110
 exhibit+0.107
unch+0.104
 exits+0.103
anger+0.103
Neg logit contributions
 Amen-0.073
 their-0.073
 very-0.070
 smile-0.069
 like-0.062
Token white
Token ID2330
Feature activation+3.49e+00

Pos logit contributions
 Amen+0.012
 smile+0.009
te+0.008
 forward+0.008
 whine+0.008
Neg logit contributions
 worms-0.017
con-0.016
emy-0.015
 vines-0.015
 compost-0.015
Token and
Token ID290
Feature activation-1.27e+00

Pos logit contributions
handed+0.110
 exhibit+0.103
unch+0.101
 exits+0.100
lor+0.099
Neg logit contributions
 their-0.082
 Amen-0.080
 very-0.079
 smile-0.078
 the-0.071
Token a
Token ID257
Feature activation-2.26e-01

Pos logit contributions
handed+0.050
unch+0.050
 exhibit+0.049
anger+0.049
lor+0.049
Neg logit contributions
 their-0.008
 very-0.007
 Amen-0.006
 the-0.005
 smile-0.005
Token strange
Token ID6283
Feature activation-5.36e-02

Pos logit contributions
 Amen+0.005
 their+0.005
 very+0.005
 smile+0.005
 like+0.004
Neg logit contributions
handed-0.007
 exhibit-0.007
unch-0.007
 exits-0.007
anger-0.007
Token creature
Token ID7185
Feature activation+7.00e-01

Pos logit contributions
 Amen+0.001
 their+0.001
 very+0.001
 smile+0.001
 like+0.001
Neg logit contributions
handed-0.001
 exhibit-0.001
unch-0.001
 exits-0.001
anger-0.001
Token with
Token ID351
Feature activation+2.58e-03

Pos logit contributions
handed+0.028
lor+0.026
anger+0.026
unch+0.026
 exhibit+0.026
Neg logit contributions
 their-0.011
 very-0.011
 smile-0.010
 Amen-0.010
 like-0.009
Token long
Token ID890
Feature activation+2.27e+00

Pos logit contributions
handed+0.000
 exhibit+0.000
unch+0.000
anger+0.000
 exits+0.000
Neg logit contributions
 their-0.000
 Amen-0.000
 very-0.000
 smile-0.000
 the-0.000
Token green
Token ID4077
Feature activation+3.89e+00

Pos logit contributions
handed+0.074
 exhibit+0.068
unch+0.067
anger+0.065
lor+0.065
Neg logit contributions
 their-0.053
 very-0.053
 smile-0.053
 Amen-0.050
 the-0.048
Token hair
Token ID4190
Feature activation-8.08e-01

Pos logit contributions
handed+0.122
 exhibit+0.111
unch+0.109
lor+0.108
 exits+0.105
Neg logit contributions
 their-0.096
 very-0.095
 smile-0.094
 Amen-0.089
 the-0.087
Token and
Token ID290
Feature activation+2.17e-01

Pos logit contributions
 Amen+0.015
 their+0.012
 smile+0.011
 very+0.011
 low+0.009
Neg logit contributions
handed-0.030
 exhibit-0.030
anger-0.029
unch-0.029
lor-0.029
Token big
Token ID1263
Feature activation+3.08e+00

Pos logit contributions
handed+0.008
 exhibit+0.008
 exits+0.008
unch+0.007
oles+0.007
Neg logit contributions
 Amen-0.004
 their-0.003
 smile-0.003
 very-0.003
te-0.003
Token brown
Token ID7586
Feature activation+4.11e+00

Pos logit contributions
handed+0.082
 exhibit+0.080
 exits+0.078
unch+0.076
anger+0.076
Neg logit contributions
 Amen-0.084
 their-0.081
 very-0.078
 smile-0.075
 like-0.071
Token eyes
Token ID2951
Feature activation-9.46e-01

Pos logit contributions
handed+0.132
 exhibit+0.122
unch+0.119
lor+0.119
 exits+0.117
Neg logit contributions
 their-0.097
 very-0.095
 smile-0.094
 Amen-0.091
 like-0.086
Tokenmy
Token ID1820
Feature activation-7.30e-01

Pos logit contributions
handed+0.022
 exhibit+0.021
orns+0.020
unch+0.019
ilst+0.019
Neg logit contributions
 Amen-0.015
 smile-0.012
 low-0.011
 their-0.011
 very-0.011
Token some
Token ID617
Feature activation-6.10e-01

Pos logit contributions
 Amen+0.014
 smile+0.011
 low+0.010
te+0.009
 very+0.009
Neg logit contributions
handed-0.028
 exhibit-0.027
 discovery-0.025
 exits-0.025
orns-0.025
Token hot
Token ID3024
Feature activation+1.09e+00

Pos logit contributions
 their+0.016
 very+0.015
 Amen+0.015
 smile+0.015
 like+0.014
Neg logit contributions
handed-0.018
 exhibit-0.017
unch-0.016
lor-0.016
anger-0.016
Token dogs
Token ID6844
Feature activation+8.30e-01

Pos logit contributions
handed+0.028
 exhibit+0.027
unch+0.027
 exits+0.026
anger+0.026
Neg logit contributions
 Amen-0.031
 their-0.030
 very-0.029
 smile-0.029
 low-0.027
Token with
Token ID351
Feature activation+2.23e-02

Pos logit contributions
handed+0.032
 exhibit+0.031
unch+0.030
 exits+0.030
anger+0.029
Neg logit contributions
 Amen-0.014
 their-0.012
 smile-0.012
 very-0.012
 low-0.010
Token k
Token ID479
Feature activation+4.14e+00

Pos logit contributions
handed+0.001
 exhibit+0.001
 exits+0.001
unch+0.001
orns+0.001
Neg logit contributions
 Amen-0.000
 smile-0.000
 their-0.000
 very-0.000
te-0.000
Tokenetchup
Token ID47132
Feature activation+9.35e-01

Pos logit contributions
handed+0.110
 exhibit+0.104
 exits+0.100
lor+0.100
anger+0.099
Neg logit contributions
 their-0.118
 very-0.112
 Amen-0.110
 smile-0.110
 the-0.103
Token and
Token ID290
Feature activation+4.32e-01

Pos logit contributions
handed+0.037
 exhibit+0.036
unch+0.035
anger+0.035
 exits+0.035
Neg logit contributions
 Amen-0.013
 their-0.013
 very-0.012
 smile-0.012
 like-0.010
Token mustard
Token ID33694
Feature activation+7.27e-01

Pos logit contributions
handed+0.010
 worms+0.010
 exits+0.010
 exhibit+0.009
con+0.009
Neg logit contributions
 Amen-0.013
te-0.011
 very-0.011
 smile-0.011
 low-0.011
Token.
Token ID13
Feature activation-1.48e-01

Pos logit contributions
handed+0.030
 exhibit+0.029
 exits+0.028
unch+0.028
anger+0.028
Neg logit contributions
 their-0.010
 very-0.009
 Amen-0.009
 smile-0.009
 like-0.008
Token He
Token ID679
Feature activation-6.80e-02

Pos logit contributions
 their+0.003
 very+0.002
 Amen+0.002
 smile+0.002
 the+0.002
Neg logit contributions
handed-0.006
lor-0.005
 exhibit-0.005
anger-0.005
unch-0.005
Token had
Token ID550
Feature activation-8.42e-02

Pos logit contributions
 Amen+0.024
 their+0.020
 forward+0.020
 smile+0.019
 like+0.018
Neg logit contributions
 exhibit-0.031
handed-0.030
 exits-0.028
 discovery-0.028
alking-0.028
Token long
Token ID890
Feature activation+1.71e+00

Pos logit contributions
 Amen+0.001
 smile+0.001
 their+0.001
te+0.001
 very+0.001
Neg logit contributions
handed-0.003
 exits-0.003
 exhibit-0.003
lor-0.003
orns-0.003
Token yellow
Token ID7872
Feature activation+3.23e+00

Pos logit contributions
 exhibit+0.056
handed+0.055
 exits+0.054
orns+0.053
anger+0.052
Neg logit contributions
 Amen-0.037
 their-0.030
 very-0.029
 smile-0.029
te-0.027
Token hair
Token ID4190
Feature activation-1.33e+00

Pos logit contributions
handed+0.107
 exhibit+0.107
unch+0.103
anger+0.103
 exits+0.103
Neg logit contributions
 Amen-0.067
 their-0.060
 very-0.059
 smile-0.057
 low-0.053
Token and
Token ID290
Feature activation-2.89e-01

Pos logit contributions
 Amen+0.037
 their+0.025
 smile+0.024
 whine+0.021
nam+0.021
Neg logit contributions
anger-0.043
handed-0.040
 exhibit-0.040
unch-0.040
con-0.039
Token blue
Token ID4171
Feature activation+3.64e+00

Pos logit contributions
 Amen+0.006
 forward+0.004
 smile+0.004
 their+0.004
te+0.004
Neg logit contributions
con-0.009
 worms-0.009
 exits-0.009
handed-0.009
 exhibit-0.009
Token eyes
Token ID2951
Feature activation-1.32e+00

Pos logit contributions
 exhibit+0.111
handed+0.111
anger+0.110
unch+0.109
orns+0.109
Neg logit contributions
 Amen-0.086
 their-0.076
 very-0.074
 smile-0.071
 low-0.067
Token.
Token ID13
Feature activation-5.96e-01

Pos logit contributions
 Amen+0.033
 their+0.026
 seven+0.024
 smile+0.024
nam+0.022
Neg logit contributions
handed-0.039
unch-0.039
 exhibit-0.038
anger-0.038
con-0.036
Token She
Token ID1375
Feature activation-9.43e-01

Pos logit contributions
 Amen+0.011
 very+0.010
 their+0.010
 smile+0.009
te+0.008
Neg logit contributions
handed-0.022
 exhibit-0.021
unch-0.021
anger-0.020
 exits-0.020
Token was
Token ID373
Feature activation-4.42e-01

Pos logit contributions
 Amen+0.023
 their+0.019
 forward+0.018
 smile+0.018
 very+0.017
Neg logit contributions
 exhibit-0.027
handed-0.026
 exits-0.025
alking-0.025
unch-0.025
Token playing
Token ID2712
Feature activation+4.80e-01

Pos logit contributions
 Amen+0.009
 their+0.008
 smile+0.007
te+0.007
ifying+0.006
Neg logit contributions
handed-0.014
 exhibit-0.014
 exits-0.014
unch-0.013
 burying-0.013
Token perfect
Token ID2818
Feature activation+1.11e+00

Pos logit contributions
handed+0.035
unch+0.033
lor+0.032
 exhibit+0.032
anger+0.032
Neg logit contributions
 very-0.038
 like-0.037
 smile-0.035
 their-0.034
 the-0.033
Token bush
Token ID24484
Feature activation+2.06e+00

Pos logit contributions
handed+0.029
lor+0.027
unch+0.027
anger+0.025
 exhibit+0.025
Neg logit contributions
 very-0.035
 smile-0.035
 their-0.034
 like-0.033
 the-0.032
Token with
Token ID351
Feature activation+6.69e-01

Pos logit contributions
handed+0.083
 exhibit+0.080
unch+0.079
 exits+0.078
anger+0.078
Neg logit contributions
 their-0.029
 very-0.028
 Amen-0.027
 smile-0.026
 like-0.023
Token lots
Token ID6041
Feature activation+1.01e+00

Pos logit contributions
handed+0.026
 exhibit+0.025
unch+0.025
anger+0.025
lor+0.025
Neg logit contributions
 their-0.011
 very-0.010
 Amen-0.010
 smile-0.009
 the-0.008
Token of
Token ID286
Feature activation+2.59e+00

Pos logit contributions
handed+0.041
lor+0.039
 exhibit+0.039
anger+0.038
asis+0.038
Neg logit contributions
 their-0.016
 very-0.015
 smile-0.015
 the-0.014
 like-0.014
Token juicy
Token ID35075
Feature activation+4.92e+00

Pos logit contributions
handed+0.067
lor+0.064
anger+0.061
 exhibit+0.061
unch+0.058
Neg logit contributions
 very-0.082
 their-0.082
 smile-0.079
 the-0.077
 like-0.076
Token berries
Token ID36322
Feature activation+1.74e+00

Pos logit contributions
handed+0.128
lor+0.114
 exhibit+0.108
anger+0.104
alking+0.097
Neg logit contributions
 very-0.165
 their-0.165
 the-0.163
 smile-0.161
 like-0.159
Token.
Token ID13
Feature activation-6.29e-01

Pos logit contributions
handed+0.077
 exhibit+0.075
lor+0.073
 exits+0.073
anger+0.072
Neg logit contributions
 their-0.021
 very-0.020
 like-0.017
 smile-0.016
 the-0.016
Token Little
Token ID7703
Feature activation+7.52e-01

Pos logit contributions
 their+0.008
 Amen+0.008
 very+0.008
 smile+0.007
 the+0.006
Neg logit contributions
handed-0.026
 exhibit-0.025
unch-0.025
 exits-0.025
anger-0.025
Token Tim
Token ID5045
Feature activation+7.95e-01

Pos logit contributions
handed+0.026
 exhibit+0.025
unch+0.024
lor+0.024
 exits+0.024
Neg logit contributions
 their-0.015
 Amen-0.015
 very-0.015
 smile-0.014
 the-0.013
Token was
Token ID373
Feature activation-5.45e-01

Pos logit contributions
handed+0.031
 exhibit+0.030
unch+0.030
 exits+0.030
anger+0.030
Neg logit contributions
 their-0.012
 Amen-0.012
 very-0.011
 smile-0.011
 the-0.009
TokenMy
Token ID3666
Feature activation+5.02e-01

Pos logit contributions
 Amen+0.006
 their+0.006
 very+0.005
 smile+0.005
 like+0.005
Neg logit contributions
handed-0.011
 exhibit-0.011
anger-0.011
 exits-0.011
unch-0.011
Token name
Token ID1438
Feature activation+5.00e-01

Pos logit contributions
handed+0.019
unch+0.018
orns+0.018
 exhibit+0.017
lor+0.017
Neg logit contributions
 very-0.010
 smile-0.009
 their-0.009
 Amen-0.009
 like-0.008
Token is
Token ID318
Feature activation+7.43e-01

Pos logit contributions
handed+0.017
 exhibit+0.017
unch+0.016
anger+0.016
lor+0.016
Neg logit contributions
 Amen-0.010
 their-0.010
 very-0.009
 smile-0.009
 like-0.008
Token Tim
Token ID5045
Feature activation-4.72e-02

Pos logit contributions
handed+0.024
 exhibit+0.023
unch+0.022
anger+0.022
 exits+0.022
Neg logit contributions
 their-0.017
 Amen-0.016
 very-0.016
 smile-0.016
 the-0.015
Token,
Token ID11
Feature activation+4.59e-01

Pos logit contributions
 Amen+0.001
 their+0.001
 smile+0.001
 low+0.001
 very+0.001
Neg logit contributions
 exhibit-0.001
handed-0.001
unch-0.001
orns-0.001
lor-0.001
Token nice
Token ID3621
Feature activation+2.02e+00

Pos logit contributions
handed+0.016
 exhibit+0.016
unch+0.015
 exits+0.015
anger+0.015
Neg logit contributions
 Amen-0.009
 their-0.009
 very-0.008
 smile-0.008
 low-0.007
Token to
Token ID284
Feature activation+4.51e-01

Pos logit contributions
handed+0.064
lor+0.059
unch+0.057
anger+0.057
 exhibit+0.056
Neg logit contributions
 smile-0.052
 their-0.049
 very-0.048
 the-0.048
 like-0.047
Token meet
Token ID1826
Feature activation+1.36e+00

Pos logit contributions
handed+0.016
 exhibit+0.015
unch+0.015
anger+0.015
 exits+0.015
Neg logit contributions
 their-0.009
 Amen-0.009
 very-0.009
 smile-0.008
 the-0.008
Token ya
Token ID21349
Feature activation-1.23e+00

Pos logit contributions
handed+0.042
 exhibit+0.040
unch+0.040
anger+0.039
 exits+0.039
Neg logit contributions
 their-0.032
 Amen-0.031
 very-0.031
 smile-0.030
 the-0.028
Token!"
Token ID2474
Feature activation-1.98e-01

Pos logit contributions
 their+0.028
 Amen+0.028
 very+0.028
 smile+0.026
 the+0.025
Neg logit contributions
handed-0.039
 exhibit-0.036
unch-0.036
anger-0.036
 exits-0.035
Token<|endoftext|>
Token ID50256
Feature activation+1.68e+00

Pos logit contributions
 their+0.003
 very+0.003
 Amen+0.003
 smile+0.003
 the+0.003
Neg logit contributions
handed-0.007
 exhibit-0.007
 exits-0.007
anger-0.007
lor-0.007
Token top
Token ID1353
Feature activation-1.75e+00

Pos logit contributions
 very+0.011
 smile+0.011
 low+0.010
 the+0.010
 their+0.010
Neg logit contributions
handed-0.014
 exhibit-0.013
lor-0.013
anger-0.013
unch-0.013
Token.
Token ID13
Feature activation-4.69e-01

Pos logit contributions
 Amen+0.031
 their+0.030
 very+0.029
 smile+0.028
 like+0.025
Neg logit contributions
handed-0.064
 exhibit-0.062
unch-0.061
 exits-0.060
anger-0.060
Token A
Token ID317
Feature activation-5.78e-01

Pos logit contributions
 their+0.005
 Amen+0.004
 very+0.004
 the+0.004
 smile+0.004
Neg logit contributions
handed-0.022
lor-0.021
 exhibit-0.021
anger-0.021
 exits-0.020
Token little
Token ID1310
Feature activation+8.00e-01

Pos logit contributions
 very+0.012
 their+0.012
 Amen+0.011
 smile+0.011
 the+0.010
Neg logit contributions
handed-0.020
 exhibit-0.019
unch-0.019
lor-0.019
orns-0.018
Token boy
Token ID2933
Feature activation-1.66e-01

Pos logit contributions
 exhibit+0.025
handed+0.025
 exits+0.024
unch+0.024
anger+0.024
Neg logit contributions
 Amen-0.018
 their-0.017
 very-0.015
 smile-0.015
te-0.014
Token named
Token ID3706
Feature activation+8.54e-01

Pos logit contributions
 Amen+0.003
 their+0.003
 very+0.002
 smile+0.002
 like+0.002
Neg logit contributions
handed-0.006
 exhibit-0.006
unch-0.006
 exits-0.006
anger-0.006
Token Tim
Token ID5045
Feature activation+5.92e-01

Pos logit contributions
handed+0.034
lor+0.033
unch+0.033
anger+0.032
orns+0.032
Neg logit contributions
 their-0.015
 the-0.014
 very-0.013
 Amen-0.012
 like-0.011
Token loved
Token ID6151
Feature activation+1.53e-01

Pos logit contributions
handed+0.021
 exhibit+0.021
unch+0.020
orns+0.020
 exits+0.020
Neg logit contributions
 Amen-0.011
 their-0.010
 very-0.010
 smile-0.010
 like-0.008
Token to
Token ID284
Feature activation+1.01e+00

Pos logit contributions
handed+0.007
lor+0.007
anger+0.007
unch+0.007
 exhibit+0.007
Neg logit contributions
 their-0.002
 very-0.002
 the-0.002
 smile-0.001
 Amen-0.001
Token climb
Token ID12080
Feature activation-7.28e-02

Pos logit contributions
handed+0.030
 exhibit+0.029
unch+0.028
 exits+0.028
anger+0.028
Neg logit contributions
 Amen-0.025
 their-0.024
 very-0.023
 smile-0.023
 like-0.021
Token the
Token ID262
Feature activation+9.42e-01

Pos logit contributions
 Amen+0.001
 smile+0.001
te+0.001
 their+0.001
 smiled+0.001
Neg logit contributions
 exhibit-0.003
orns-0.003
 exits-0.003
handed-0.003
 territory-0.003
Token and
Token ID290
Feature activation-1.37e+00

Pos logit contributions
 Amen+0.014
 their+0.010
 smile+0.010
 very+0.010
te+0.010
Neg logit contributions
 exhibit-0.030
handed-0.030
 exits-0.030
orns-0.029
anger-0.029
Token the
Token ID262
Feature activation-1.45e+00

Pos logit contributions
 Amen+0.023
te+0.017
 smile+0.016
 their+0.016
bell+0.015
Neg logit contributions
 exits-0.049
 exhibit-0.049
 worms-0.049
handed-0.048
 discovery-0.047
Token book
Token ID1492
Feature activation-1.94e-01

Pos logit contributions
 Amen+0.038
 their+0.038
 very+0.037
 smile+0.036
 like+0.033
Neg logit contributions
handed-0.041
 exhibit-0.039
unch-0.038
 exits-0.037
anger-0.037
Token are
Token ID389
Feature activation-6.30e-01

Pos logit contributions
 their+0.003
 Amen+0.003
 very+0.003
 smile+0.003
 like+0.003
Neg logit contributions
handed-0.007
 exhibit-0.007
unch-0.007
 exits-0.007
anger-0.007
Token not
Token ID407
Feature activation-1.04e-01

Pos logit contributions
 Amen+0.012
 their+0.010
 smile+0.010
 very+0.010
te+0.009
Neg logit contributions
handed-0.022
 exhibit-0.022
 exits-0.022
unch-0.021
orns-0.021
Token as
Token ID355
Feature activation+1.46e+00

Pos logit contributions
 Amen+0.002
 their+0.002
 smile+0.002
 very+0.002
te+0.002
Neg logit contributions
handed-0.004
 exhibit-0.003
 exits-0.003
unch-0.003
anger-0.003
Token important
Token ID1593
Feature activation+7.63e-01

Pos logit contributions
handed+0.050
 exhibit+0.048
 exits+0.048
unch+0.047
anger+0.047
Neg logit contributions
 Amen-0.029
 their-0.026
 smile-0.025
 very-0.024
te-0.022
Token as
Token ID355
Feature activation+1.33e+00

Pos logit contributions
handed+0.033
 exhibit+0.031
unch+0.031
lor+0.030
asis+0.030
Neg logit contributions
 their-0.010
 very-0.010
 Amen-0.009
 smile-0.008
 like-0.008
Token you
Token ID345
Feature activation+1.20e+00

Pos logit contributions
handed+0.051
 exhibit+0.048
unch+0.048
lor+0.048
anger+0.047
Neg logit contributions
 their-0.026
 very-0.024
 the-0.023
 smile-0.022
 like-0.021
Token and
Token ID290
Feature activation+1.24e+00

Pos logit contributions
handed+0.049
 exhibit+0.047
unch+0.047
lor+0.047
anger+0.046
Neg logit contributions
 their-0.016
 very-0.016
 smile-0.016
 Amen-0.015
 like-0.015
Token me
Token ID502
Feature activation+9.76e-01

Pos logit contributions
handed+0.043
 exhibit+0.042
 exits+0.042
 territory+0.041
 endings+0.040
Neg logit contributions
 Amen-0.023
 very-0.019
 their-0.019
te-0.018
 smile-0.018
Token.
Token ID13
Feature activation+2.06e-01

Pos logit contributions
handed+0.031
 exhibit+0.030
unch+0.030
 exits+0.030
anger+0.030
Neg logit contributions
 their-0.011
 Amen-0.011
 very-0.010
 smile-0.010
 like-0.008
Token They
Token ID1119
Feature activation+9.29e-02

Pos logit contributions
 exhibit+0.007
handed+0.007
 exits+0.007
 territory+0.007
anger+0.007
Neg logit contributions
 Amen-0.003
 smile-0.003
 very-0.003
 smiled-0.003
te-0.003
Token say
Token ID910
Feature activation+1.53e+00

Pos logit contributions
handed+0.003
 exhibit+0.003
 exits+0.003
unch+0.003
anger+0.003
Neg logit contributions
 Amen-0.002
 their-0.002
 very-0.001
te-0.001
 smile-0.001
Token:
Token ID25
Feature activation+4.77e-01

Pos logit contributions
oles+0.042
lor+0.041
 endings+0.040
handed+0.040
unch+0.039
Neg logit contributions
 Amen-0.032
 cloudy-0.030
 low-0.029
ifying-0.029
 smile-0.028
Token
Token ID198
Feature activation+9.38e-01

Pos logit contributions
handed+0.016
 exhibit+0.016
unch+0.016
lor+0.015
anger+0.015
Neg logit contributions
 Amen-0.009
 their-0.009
 very-0.008
 smile-0.008
 like-0.007
Token
Token ID198
Feature activation+1.11e+00

Pos logit contributions
handed+0.034
 exhibit+0.033
 exits+0.032
anger+0.032
unch+0.032
Neg logit contributions
 their-0.018
 Amen-0.017
 very-0.017
 smile-0.016
 the-0.015
Token"
Token ID1
Feature activation-4.59e-01

Pos logit contributions
handed+0.032
 exhibit+0.030
 exits+0.028
lor+0.028
unch+0.028
Neg logit contributions
 smile-0.028
 Amen-0.027
 very-0.026
 their-0.026
 like-0.026
TokenMaybe
Token ID13300
Feature activation-1.17e+00

Pos logit contributions
 Amen+0.009
 their+0.009
 very+0.009
 smile+0.009
 like+0.008
Neg logit contributions
handed-0.016
 exhibit-0.015
unch-0.015
 exits-0.015
anger-0.015
Token you
Token ID345
Feature activation-3.93e-01

Pos logit contributions
 Amen+0.024
 smile+0.023
 very+0.022
 their+0.021
 low+0.020
Neg logit contributions
handed-0.038
 exhibit-0.038
 exits-0.037
unch-0.036
 discovery-0.035
Token can
Token ID460
Feature activation-7.51e-01

Pos logit contributions
 Amen+0.007
 their+0.006
 very+0.005
 smile+0.005
 low+0.005
Neg logit contributions
 exhibit-0.015
handed-0.015
 exits-0.014
unch-0.014
orns-0.013
Token have
Token ID423
Feature activation-2.95e-01

Pos logit contributions
 Amen+0.012
 their+0.012
 very+0.012
 smile+0.012
 like+0.010
Neg logit contributions
handed-0.029
 exhibit-0.028
unch-0.027
anger-0.027
 exits-0.027
TokenOnce
Token ID7454
Feature activation-6.51e-01

Pos logit contributions
handed+0.056
 exhibit+0.054
 exits+0.053
unch+0.053
anger+0.053
Neg logit contributions
 their-0.029
 Amen-0.028
 very-0.028
 smile-0.026
 the-0.025
Token upon
Token ID2402
Feature activation-5.26e-01

Pos logit contributions
 Amen+0.015
 their+0.015
 very+0.014
 smile+0.014
 like+0.013
Neg logit contributions
handed-0.020
 exhibit-0.019
unch-0.019
anger-0.018
 exits-0.018
Token a
Token ID257
Feature activation+1.50e-02

Pos logit contributions
 Amen+0.008
 smile+0.007
 their+0.006
bell+0.005
 very+0.005
Neg logit contributions
handed-0.021
unch-0.020
 exhibit-0.020
 exits-0.019
 discovery-0.019
Token time
Token ID640
Feature activation-9.64e-01

Pos logit contributions
handed+0.001
anger+0.000
 exhibit+0.000
lor+0.000
 exits+0.000
Neg logit contributions
 their-0.000
 very-0.000
 Amen-0.000
 smile-0.000
 like-0.000
Token there
Token ID612
Feature activation+2.01e-01

Pos logit contributions
 their+0.011
 very+0.011
 Amen+0.010
 smile+0.010
 the+0.009
Neg logit contributions
handed-0.043
 exhibit-0.040
lor-0.040
 exits-0.040
orns-0.040
Token was
Token ID373
Feature activation+8.16e-01

Pos logit contributions
handed+0.008
 exhibit+0.007
unch+0.007
 exits+0.007
anger+0.007
Neg logit contributions
 their-0.003
 Amen-0.003
 very-0.003
 smile-0.003
 the-0.003
Token a
Token ID257
Feature activation-5.06e-01

Pos logit contributions
handed+0.037
 exhibit+0.035
unch+0.035
 exits+0.035
orns+0.034
Neg logit contributions
 Amen-0.008
 their-0.007
 smile-0.007
 very-0.006
 smiled-0.004
Token boy
Token ID2933
Feature activation-8.39e-01

Pos logit contributions
 Amen+0.014
 their+0.013
 smile+0.012
te+0.012
 very+0.012
Neg logit contributions
 exhibit-0.012
handed-0.012
unch-0.012
emy-0.012
asis-0.011
Token.
Token ID13
Feature activation-3.68e-01

Pos logit contributions
 Amen+0.015
 their+0.014
 smile+0.014
 very+0.013
 low+0.012
Neg logit contributions
handed-0.030
 exhibit-0.030
unch-0.029
 exits-0.028
orns-0.028
Token He
Token ID679
Feature activation-2.50e-01

Pos logit contributions
 Amen+0.007
 very+0.006
 their+0.006
 smile+0.006
 smiled+0.005
Neg logit contributions
 exhibit-0.013
handed-0.013
unch-0.013
orns-0.012
 exits-0.012
Token was
Token ID373
Feature activation+5.95e-02

Pos logit contributions
 Amen+0.006
 forward+0.005
 smile+0.005
ifying+0.005
bell+0.005
Neg logit contributions
 exhibit-0.007
asis-0.007
handed-0.007
orns-0.007
emy-0.007
Token When
Token ID1649
Feature activation+7.15e-03

Pos logit contributions
 their+0.004
 Amen+0.004
 very+0.004
 smile+0.004
 the+0.003
Neg logit contributions
handed-0.010
 exhibit-0.010
 exits-0.010
unch-0.010
lor-0.009
Token she
Token ID673
Feature activation+1.64e-01

Pos logit contributions
anger+0.000
handed+0.000
 exhibit+0.000
 exits+0.000
 discovery+0.000
Neg logit contributions
 Amen-0.000
 smile-0.000
te-0.000
 their-0.000
 very-0.000
Token was
Token ID373
Feature activation+7.93e-01

Pos logit contributions
handed+0.006
 exhibit+0.006
 exits+0.006
unch+0.006
anger+0.006
Neg logit contributions
 their-0.002
 very-0.002
 Amen-0.002
 smile-0.002
 the-0.002
Token ready
Token ID3492
Feature activation+5.76e-01

Pos logit contributions
handed+0.025
 exhibit+0.024
unch+0.024
 exits+0.023
anger+0.023
Neg logit contributions
 Amen-0.018
 their-0.017
 very-0.016
 smile-0.016
 like-0.015
Token,
Token ID11
Feature activation+4.96e-01

Pos logit contributions
handed+0.021
 exhibit+0.020
anger+0.020
unch+0.020
 exits+0.019
Neg logit contributions
 Amen-0.011
 their-0.009
 very-0.009
 smile-0.008
te-0.008
Token she
Token ID673
Feature activation+1.11e-01

Pos logit contributions
anger+0.016
handed+0.016
 exits+0.015
 discovery+0.015
 exhibit+0.015
Neg logit contributions
 Amen-0.012
 smile-0.009
 their-0.008
te-0.008
 very-0.008
Token said
Token ID531
Feature activation+2.11e+00

Pos logit contributions
handed+0.004
 exhibit+0.004
unch+0.004
 exits+0.004
anger+0.004
Neg logit contributions
 Amen-0.002
 their-0.002
 very-0.002
 smile-0.002
 like-0.002
Token,
Token ID11
Feature activation+2.02e-01

Pos logit contributions
anger+0.068
unch+0.067
lor+0.065
oles+0.064
 discovery+0.063
Neg logit contributions
 Amen-0.044
 cloudy-0.033
 their-0.033
ifying-0.032
bell-0.032
Token "
Token ID366
Feature activation-2.52e-01

Pos logit contributions
unch+0.006
handed+0.006
lor+0.006
anger+0.006
 exhibit+0.006
Neg logit contributions
 Amen-0.004
 their-0.004
 smile-0.004
 very-0.004
 like-0.003
TokenWe
Token ID1135
Feature activation+8.64e-01

Pos logit contributions
 their+0.005
 Amen+0.005
 very+0.005
 smile+0.005
 like+0.004
Neg logit contributions
handed-0.009
 exhibit-0.008
unch-0.008
 exits-0.008
lor-0.008
Tokenee
Token ID1453
Feature activation+9.85e-01

Pos logit contributions
handed+0.032
 exhibit+0.031
 exits+0.030
unch+0.030
anger+0.030
Neg logit contributions
 Amen-0.014
 their-0.014
 very-0.013
 smile-0.012
 low-0.011
Token play
Token ID711
Feature activation+2.35e-03

Pos logit contributions
handed+0.009
 exhibit+0.009
 exits+0.008
 discovery+0.008
asis+0.008
Neg logit contributions
 Amen-0.006
 very-0.005
 their-0.005
te-0.005
 smile-0.005
Token.
Token ID13
Feature activation-2.19e-01

Pos logit contributions
handed+0.000
 exhibit+0.000
 exits+0.000
anger+0.000
orns+0.000
Neg logit contributions
 Amen-0.000
 smile-0.000
 their-0.000
 very-0.000
 smiled-0.000
Token The
Token ID383
Feature activation-1.30e-01

Pos logit contributions
 very+0.002
 their+0.002
 smile+0.002
 the+0.002
 Amen+0.002
Neg logit contributions
handed-0.010
lor-0.010
 exhibit-0.009
orns-0.009
asis-0.009
Token grass
Token ID8701
Feature activation+2.08e+00

Pos logit contributions
 their+0.003
 Amen+0.003
 very+0.003
 smile+0.003
 like+0.002
Neg logit contributions
handed-0.004
 exhibit-0.004
unch-0.004
 exits-0.004
anger-0.004
Token was
Token ID373
Feature activation-6.07e-01

Pos logit contributions
handed+0.074
 exhibit+0.073
anger+0.070
unch+0.070
 exits+0.070
Neg logit contributions
 Amen-0.040
 their-0.036
 smile-0.034
 very-0.034
 like-0.030
Token wet
Token ID9583
Feature activation+4.67e-01

Pos logit contributions
 Amen+0.013
 smile+0.009
te+0.009
 their+0.008
bell+0.008
Neg logit contributions
 exits-0.020
unch-0.019
 dam-0.019
 footprints-0.019
handed-0.019
Token,
Token ID11
Feature activation-2.50e-01

Pos logit contributions
handed+0.016
 exhibit+0.015
orns+0.015
unch+0.015
anger+0.015
Neg logit contributions
 Amen-0.010
 smile-0.008
 their-0.008
te-0.007
 smiled-0.007
Token but
Token ID475
Feature activation+1.18e+00

Pos logit contributions
 Amen+0.007
te+0.006
bell+0.005
ifying+0.005
 smile+0.005
Neg logit contributions
 footprints-0.007
 worms-0.007
 exits-0.007
handed-0.007
unch-0.007
Token the
Token ID262
Feature activation+3.51e-01

Pos logit contributions
handed+0.037
 exits+0.036
 exhibit+0.036
unch+0.035
 discovery+0.035
Neg logit contributions
 Amen-0.028
 smile-0.023
 their-0.022
te-0.021
 very-0.021
Token sun
Token ID4252
Feature activation+1.89e-01

Pos logit contributions
handed+0.011
 exhibit+0.011
unch+0.011
 exits+0.010
anger+0.010
Neg logit contributions
 Amen-0.008
 their-0.008
 very-0.007
 smile-0.007
 like-0.007
Token was
Token ID373
Feature activation-2.17e-01

Pos logit contributions
 exhibit+0.006
asis+0.006
anger+0.006
handed+0.006
unch+0.005
Neg logit contributions
 Amen-0.005
 their-0.004
 smile-0.003
 very-0.003
nam-0.003
Token it
Token ID340
Feature activation+3.38e-01

Pos logit contributions
handed+0.001
 exits+0.001
 exhibit+0.001
 discovery+0.001
unch+0.001
Neg logit contributions
 Amen-0.000
 their-0.000
 smile-0.000
 very-0.000
te-0.000
Token had
Token ID550
Feature activation+6.76e-01

Pos logit contributions
handed+0.012
 exhibit+0.012
 exits+0.012
anger+0.011
lor+0.011
Neg logit contributions
 Amen-0.006
 their-0.006
 smile-0.005
 very-0.005
 like-0.005
Token paid
Token ID3432
Feature activation+1.70e-01

Pos logit contributions
handed+0.023
 exhibit+0.022
 exits+0.021
unch+0.021
anger+0.021
Neg logit contributions
 Amen-0.015
 their-0.013
 smile-0.013
 very-0.012
 low-0.011
Token off
Token ID572
Feature activation-1.42e-01

Pos logit contributions
handed+0.006
anger+0.006
 exits+0.006
 exhibit+0.006
lor+0.005
Neg logit contributions
 Amen-0.004
 their-0.003
 smile-0.003
 very-0.003
 smiled-0.003
Token.
Token ID13
Feature activation-2.70e-01

Pos logit contributions
 Amen+0.003
 their+0.003
 smile+0.003
 very+0.002
 low+0.002
Neg logit contributions
handed-0.005
 exhibit-0.005
 exits-0.004
anger-0.004
lor-0.004
Token Now
Token ID2735
Feature activation-1.59e-01

Pos logit contributions
 Amen+0.004
 their+0.004
 very+0.004
 smile+0.004
 the+0.004
Neg logit contributions
handed-0.010
 exhibit-0.010
 exits-0.010
unch-0.010
anger-0.010
Token she
Token ID673
Feature activation+4.86e-01

Pos logit contributions
 Amen+0.003
 their+0.003
 very+0.002
 smile+0.002
 low+0.002
Neg logit contributions
handed-0.006
 exhibit-0.006
anger-0.006
lor-0.006
 exits-0.006
Token had
Token ID550
Feature activation+9.06e-01

Pos logit contributions
handed+0.018
 exhibit+0.018
 exits+0.017
unch+0.017
anger+0.017
Neg logit contributions
 Amen-0.008
 their-0.008
 very-0.007
 smile-0.007
 low-0.006
Token a
Token ID257
Feature activation+1.61e+00

Pos logit contributions
handed+0.033
 exhibit+0.032
 exits+0.032
lor+0.031
orns+0.031
Neg logit contributions
 Amen-0.018
 their-0.014
 smile-0.013
 very-0.013
te-0.011
Token new
Token ID649
Feature activation+1.66e+00

Pos logit contributions
 exhibit+0.052
handed+0.052
 discovery+0.050
 exits+0.049
unch+0.049
Neg logit contributions
 Amen-0.035
 their-0.032
 very-0.027
 smile-0.027
te-0.027
Token skill
Token ID5032
Feature activation-1.62e+00

Pos logit contributions
handed+0.043
 exhibit+0.041
unch+0.040
 exits+0.040
anger+0.039
Neg logit contributions
 Amen-0.047
 their-0.047
 very-0.045
 smile-0.044
 like-0.041
Token water
Token ID1660
Feature activation+2.58e-01

Pos logit contributions
 very+0.007
 their+0.007
 smile+0.007
 the+0.007
 like+0.007
Neg logit contributions
handed-0.010
lor-0.009
 exhibit-0.009
unch-0.009
 exits-0.009
Token all
Token ID477
Feature activation-1.13e+00

Pos logit contributions
 exhibit+0.009
handed+0.009
anger+0.009
unch+0.009
 exits+0.009
Neg logit contributions
 Amen-0.005
 smile-0.004
 their-0.003
 very-0.003
te-0.003
Token around
Token ID1088
Feature activation+3.31e-01

Pos logit contributions
 Amen+0.029
 smile+0.022
bell+0.020
 curly+0.020
te+0.020
Neg logit contributions
 exhibit-0.035
orns-0.035
handed-0.034
asis-0.033
ilst-0.032
Token the
Token ID262
Feature activation+6.68e-01

Pos logit contributions
handed+0.014
 exhibit+0.014
 exits+0.014
unch+0.013
orns+0.013
Neg logit contributions
 Amen-0.004
 smile-0.003
 their-0.003
 very-0.003
 smiled-0.002
Token pool
Token ID5933
Feature activation+8.01e-01

Pos logit contributions
lor+0.022
handed+0.021
unch+0.021
anger+0.020
orns+0.020
Neg logit contributions
 very-0.017
 their-0.017
 the-0.016
 like-0.015
 smile-0.015
Token.
Token ID13
Feature activation-5.78e-01

Pos logit contributions
handed+0.031
 exhibit+0.030
unch+0.029
 exits+0.029
anger+0.029
Neg logit contributions
 Amen-0.013
 their-0.012
 smile-0.011
 very-0.011
 the-0.009
Token
Token ID220
Feature activation-7.61e-01

Pos logit contributions
 their+0.007
 very+0.007
 the+0.007
 The+0.006
 Amen+0.006
Neg logit contributions
handed-0.026
lor-0.025
 exits-0.025
anger-0.024
 exhibit-0.024
Token
Token ID198
Feature activation+1.39e-01

Pos logit contributions
 their+0.011
 very+0.010
 the+0.009
 Amen+0.009
 smile+0.008
Neg logit contributions
handed-0.031
 exhibit-0.031
 exits-0.030
anger-0.030
asis-0.030
Token
Token ID198
Feature activation+1.87e-01

Pos logit contributions
handed+0.006
 exits+0.005
anger+0.005
 endings+0.005
 exhibit+0.005
Neg logit contributions
 their-0.002
 very-0.002
 Amen-0.002
 the-0.002
 smile-0.002
TokenTim
Token ID14967
Feature activation+5.30e-01

Pos logit contributions
handed+0.006
anger+0.005
lor+0.005
 exhibit+0.005
orns+0.005
Neg logit contributions
 very-0.005
 their-0.005
 Amen-0.004
 like-0.004
 the-0.004
Tokenmy
Token ID1820
Feature activation+2.19e-01

Pos logit contributions
handed+0.020
 exhibit+0.019
orns+0.018
 exits+0.018
unch+0.018
Neg logit contributions
 Amen-0.009
 their-0.009
 very-0.008
 smile-0.008
 like-0.007
Token.
Token ID13
Feature activation+5.62e-02

Pos logit contributions
handed+0.040
 exhibit+0.039
unch+0.038
 exits+0.038
anger+0.038
Neg logit contributions
 their-0.014
 Amen-0.014
 very-0.014
 smile-0.013
 like-0.011
Token
Token ID198
Feature activation+7.07e-01

Pos logit contributions
handed+0.002
 exhibit+0.002
unch+0.002
 territory+0.002
 endings+0.002
Neg logit contributions
 smile-0.001
 very-0.001
 Amen-0.001
 smiled-0.001
te-0.001
Token
Token ID198
Feature activation+7.33e-01

Pos logit contributions
handed+0.027
 exhibit+0.026
anger+0.026
 exits+0.025
unch+0.025
Neg logit contributions
 their-0.013
 Amen-0.012
 very-0.012
 smile-0.011
 the-0.010
Token"
Token ID1
Feature activation-5.20e-01

Pos logit contributions
handed+0.025
 exhibit+0.024
unch+0.023
 exits+0.023
lor+0.023
Neg logit contributions
 Amen-0.014
 smile-0.014
 their-0.014
 very-0.014
 like-0.012
TokenGo
Token ID5247
Feature activation-1.31e-01

Pos logit contributions
 Amen+0.011
 their+0.011
 very+0.011
 smile+0.011
 like+0.010
Neg logit contributions
handed-0.017
 exhibit-0.016
unch-0.016
 exits-0.016
anger-0.016
Token away
Token ID1497
Feature activation-3.90e-01

Pos logit contributions
 Amen+0.002
 their+0.002
 smile+0.002
 very+0.002
 smiled+0.002
Neg logit contributions
handed-0.005
 exhibit-0.005
anger-0.004
unch-0.004
 exits-0.004
Token,
Token ID11
Feature activation+6.29e-01

Pos logit contributions
 their+0.005
 Amen+0.005
 very+0.005
 smile+0.005
 the+0.004
Neg logit contributions
handed-0.016
 exhibit-0.016
unch-0.016
 exits-0.015
lor-0.015
Token bad
Token ID2089
Feature activation+8.10e-01

Pos logit contributions
handed+0.020
 exhibit+0.019
unch+0.018
 exits+0.018
anger+0.018
Neg logit contributions
 their-0.015
 Amen-0.014
 very-0.014
 smile-0.014
 the-0.013
Token dog
Token ID3290
Feature activation+1.23e+00

Pos logit contributions
handed+0.028
 exhibit+0.025
lor+0.025
unch+0.025
 exits+0.025
Neg logit contributions
 their-0.019
 Amen-0.018
 very-0.017
 smile-0.017
 the-0.017
Token!
Token ID0
Feature activation+1.17e+00

Pos logit contributions
handed+0.043
unch+0.041
 exits+0.040
 exhibit+0.040
anger+0.040
Neg logit contributions
 their-0.025
 very-0.024
 Amen-0.024
 smile-0.023
 like-0.022
Token Leave
Token ID17446
Feature activation+3.09e-01

Pos logit contributions
handed+0.037
 exhibit+0.036
unch+0.035
anger+0.034
lor+0.034
Neg logit contributions
 their-0.026
 Amen-0.026
 very-0.026
 smile-0.025
 low-0.023
Token hands
Token ID2832
Feature activation+1.48e-01

Pos logit contributions
handed+0.059
 exhibit+0.056
lor+0.056
unch+0.056
anger+0.055
Neg logit contributions
 their-0.036
 very-0.035
 smile-0.034
 the-0.033
 like-0.030
Token.
Token ID13
Feature activation-4.24e-01

Pos logit contributions
handed+0.005
 exhibit+0.005
unch+0.005
anger+0.005
 exits+0.005
Neg logit contributions
 Amen-0.003
 their-0.002
 smile-0.002
 very-0.002
 low-0.002
Token Then
Token ID3244
Feature activation-1.72e-01

Pos logit contributions
 their+0.007
 Amen+0.007
 very+0.007
 smile+0.007
 like+0.006
Neg logit contributions
handed-0.016
 exhibit-0.015
unch-0.015
 exits-0.015
anger-0.015
Token lift
Token ID10303
Feature activation-7.74e-01

Pos logit contributions
 Amen+0.003
 very+0.002
 smile+0.002
 their+0.002
 low+0.002
Neg logit contributions
handed-0.006
 exhibit-0.006
anger-0.006
 exits-0.006
lor-0.006
Token it
Token ID340
Feature activation+6.30e-02

Pos logit contributions
 Amen+0.015
 smile+0.012
 their+0.011
 very+0.011
 like+0.009
Neg logit contributions
handed-0.029
 exhibit-0.028
anger-0.028
 exits-0.027
lor-0.027
Token up
Token ID510
Feature activation-2.08e+00

Pos logit contributions
handed+0.003
unch+0.002
 exhibit+0.002
lor+0.002
orns+0.002
Neg logit contributions
 very-0.001
 their-0.001
 smile-0.001
 like-0.001
 the-0.001
Token above
Token ID2029
Feature activation-1.44e+00

Pos logit contributions
 Amen+0.040
 smile+0.029
 their+0.029
 very+0.027
te+0.025
Neg logit contributions
handed-0.075
 exhibit-0.074
 exits-0.073
anger-0.073
unch-0.071
Token your
Token ID534
Feature activation-1.24e+00

Pos logit contributions
 Amen+0.021
 smile+0.015
 very+0.015
 their+0.013
 low+0.011
Neg logit contributions
handed-0.059
 exhibit-0.058
anger-0.056
 exits-0.056
lor-0.055
Token head
Token ID1182
Feature activation-1.55e+00

Pos logit contributions
 very+0.027
 their+0.026
 smile+0.026
 Amen+0.024
 the+0.023
Neg logit contributions
handed-0.042
unch-0.040
 exhibit-0.039
orns-0.039
lor-0.039
Token.
Token ID13
Feature activation-6.05e-01

Pos logit contributions
 Amen+0.024
 their+0.022
 very+0.021
 smile+0.020
 the+0.017
Neg logit contributions
handed-0.061
 exhibit-0.059
unch-0.058
anger-0.058
 exits-0.057
Tokenâ
Token ID22940
Feature activation-1.65e+00

Pos logit contributions
 their+0.011
 Amen+0.010
 very+0.010
 smile+0.010
 the+0.009
Neg logit contributions
handed-0.022
 exhibit-0.022
 exits-0.021
unch-0.021
anger-0.021
Token tank
Token ID6873
Feature activation+4.00e-01

Pos logit contributions
 Amen+0.021
 their+0.021
 smile+0.019
te+0.019
 like+0.018
Neg logit contributions
 exhibit-0.020
handed-0.019
unch-0.019
 discovery-0.018
asis-0.018
Token.
Token ID13
Feature activation-3.94e-01

Pos logit contributions
handed+0.015
 exhibit+0.015
unch+0.015
 exits+0.014
anger+0.014
Neg logit contributions
 Amen-0.006
 their-0.006
 very-0.006
 smile-0.006
 like-0.005
Token The
Token ID383
Feature activation-1.22e+00

Pos logit contributions
 Amen+0.006
 very+0.006
 their+0.006
 smile+0.005
 smiled+0.004
Neg logit contributions
 exhibit-0.015
handed-0.015
unch-0.015
 exits-0.014
anger-0.014
Token tank
Token ID6873
Feature activation+1.30e+00

Pos logit contributions
 their+0.029
 very+0.029
 Amen+0.029
 smile+0.027
 the+0.025
Neg logit contributions
handed-0.039
unch-0.037
lor-0.037
 exhibit-0.037
anger-0.036
Token was
Token ID373
Feature activation-7.14e-01

Pos logit contributions
handed+0.049
 exhibit+0.049
unch+0.047
 exits+0.046
anger+0.046
Neg logit contributions
 Amen-0.021
 their-0.020
 smile-0.018
 very-0.018
 like-0.015
Token filled
Token ID5901
Feature activation-2.22e+00

Pos logit contributions
 Amen+0.010
 smile+0.007
te+0.007
 their+0.006
bell+0.006
Neg logit contributions
 burying-0.027
 exhibit-0.027
unch-0.026
 exits-0.026
 dam-0.026
Token with
Token ID351
Feature activation-1.00e-01

Pos logit contributions
 Amen+0.042
 smile+0.039
 their+0.037
 very+0.034
 smiled+0.032
Neg logit contributions
handed-0.079
 exhibit-0.076
unch-0.074
 exits-0.072
 discovery-0.071
Token water
Token ID1660
Feature activation+7.10e-01

Pos logit contributions
 Amen+0.003
 their+0.003
 smile+0.002
 very+0.002
 like+0.002
Neg logit contributions
handed-0.003
 exhibit-0.003
 exits-0.003
unch-0.003
orns-0.002
Token,
Token ID11
Feature activation-5.39e-02

Pos logit contributions
handed+0.027
 exhibit+0.027
unch+0.026
anger+0.026
 exits+0.026
Neg logit contributions
 Amen-0.012
 their-0.010
 smile-0.010
 very-0.009
te-0.008
Token and
Token ID290
Feature activation-4.70e-01

Pos logit contributions
 Amen+0.001
te+0.001
 smile+0.001
 their+0.001
bell+0.001
Neg logit contributions
 worms-0.002
 exits-0.002
 exhibit-0.002
unch-0.002
handed-0.002
Token it
Token ID340
Feature activation-4.15e-03

Pos logit contributions
 Amen+0.009
 smile+0.007
te+0.007
bell+0.006
 their+0.006
Neg logit contributions
 worms-0.016
 exits-0.016
 exhibit-0.016
handed-0.016
 footprints-0.015
Token,
Token ID11
Feature activation+5.53e-01

Pos logit contributions
 their+0.011
 very+0.010
 Amen+0.010
 smile+0.010
 the+0.009
Neg logit contributions
handed-0.024
 exhibit-0.023
unch-0.023
lor-0.023
 exits-0.023
Token it
Token ID340
Feature activation+2.15e-01

Pos logit contributions
handed+0.018
 exits+0.017
 exhibit+0.017
 discovery+0.017
 worms+0.016
Neg logit contributions
 Amen-0.012
 smile-0.010
te-0.009
 their-0.009
 very-0.009
Token was
Token ID373
Feature activation-4.92e-01

Pos logit contributions
handed+0.008
 exhibit+0.008
unch+0.008
 exits+0.008
anger+0.007
Neg logit contributions
 Amen-0.004
 their-0.004
 very-0.003
 smile-0.003
 like-0.003
Token a
Token ID257
Feature activation-7.05e-01

Pos logit contributions
 Amen+0.008
 smile+0.006
 their+0.006
te+0.005
La+0.005
Neg logit contributions
 exits-0.017
handed-0.017
 discovery-0.016
unch-0.016
 burying-0.016
Token black
Token ID2042
Feature activation+1.24e+00

Pos logit contributions
 Amen+0.016
 their+0.016
 smile+0.014
 very+0.014
 the+0.014
Neg logit contributions
handed-0.022
 exhibit-0.021
 exits-0.020
anger-0.020
unch-0.020
Token man
Token ID582
Feature activation-2.11e+00

Pos logit contributions
handed+0.039
 exhibit+0.038
anger+0.037
unch+0.037
 exits+0.036
Neg logit contributions
 Amen-0.028
 their-0.028
 very-0.026
 smile-0.025
 low-0.024
Token.
Token ID13
Feature activation-4.41e-01

Pos logit contributions
 Amen+0.039
 their+0.038
 smile+0.036
 very+0.036
 low+0.032
Neg logit contributions
handed-0.075
 exhibit-0.073
unch-0.070
 exits-0.068
lor-0.068
Token
Token ID198
Feature activation+5.86e-01

Pos logit contributions
 Amen+0.006
 their+0.006
 very+0.006
 smile+0.005
 like+0.005
Neg logit contributions
handed-0.018
 exhibit-0.017
unch-0.017
 exits-0.017
anger-0.017
Token
Token ID198
Feature activation+4.93e-01

Pos logit contributions
handed+0.022
 exhibit+0.021
anger+0.021
 exits+0.021
 territory+0.021
Neg logit contributions
 Amen-0.010
 their-0.010
 very-0.009
 smile-0.009
 the-0.008
TokenThe
Token ID464
Feature activation-2.30e+00

Pos logit contributions
handed+0.018
 exhibit+0.017
unch+0.016
lor+0.016
 exits+0.016
Neg logit contributions
 their-0.009
 Amen-0.009
 smile-0.009
 very-0.009
 like-0.008
Token girl
Token ID2576
Feature activation-2.91e-01

Pos logit contributions
 Amen+0.052
 their+0.051
 very+0.045
te+0.044
 smile+0.043
Neg logit contributions
handed-0.073
 exhibit-0.068
 exits-0.065
 discovery-0.065
anger-0.062
Token street
Token ID4675
Feature activation+8.54e-01

Pos logit contributions
handed+0.006
unch+0.006
 exhibit+0.005
lor+0.005
orns+0.005
Neg logit contributions
 very-0.010
 their-0.010
 smile-0.009
 the-0.009
 Amen-0.009
Token,
Token ID11
Feature activation+4.91e-01

Pos logit contributions
handed+0.036
unch+0.035
 exhibit+0.035
lor+0.035
orns+0.034
Neg logit contributions
 their-0.010
 very-0.010
 smile-0.009
 Amen-0.009
 the-0.008
Token she
Token ID673
Feature activation+6.98e-02

Pos logit contributions
handed+0.018
 exhibit+0.018
 exits+0.017
unch+0.017
anger+0.017
Neg logit contributions
 Amen-0.008
 their-0.008
 smile-0.007
 very-0.007
 like-0.006
Token heard
Token ID2982
Feature activation-5.78e-02

Pos logit contributions
handed+0.002
 exhibit+0.002
unch+0.002
 exits+0.002
anger+0.002
Neg logit contributions
 their-0.001
 very-0.001
 Amen-0.001
 smile-0.001
 the-0.001
Token a
Token ID257
Feature activation-6.61e-01

Pos logit contributions
 Amen+0.001
 their+0.001
 smile+0.001
 very+0.001
 like+0.000
Neg logit contributions
handed-0.002
 exhibit-0.002
 exits-0.002
anger-0.002
unch-0.002
Token sound
Token ID2128
Feature activation-2.36e+00

Pos logit contributions
 Amen+0.012
 their+0.012
 very+0.011
 smile+0.011
 like+0.010
Neg logit contributions
handed-0.023
 exhibit-0.023
unch-0.022
 exits-0.022
anger-0.022
Token.
Token ID13
Feature activation-5.80e-01

Pos logit contributions
 Amen+0.049
 their+0.047
 smile+0.044
 very+0.043
 seven+0.038
Neg logit contributions
handed-0.078
 exhibit-0.076
unch-0.074
anger-0.073
 exits-0.072
Token It
Token ID632
Feature activation+5.51e-01

Pos logit contributions
 their+0.007
 Amen+0.007
 very+0.007
 smile+0.007
 the+0.006
Neg logit contributions
handed-0.025
 exhibit-0.024
unch-0.023
 exits-0.023
orns-0.023
Token was
Token ID373
Feature activation-5.00e-01

Pos logit contributions
 exhibit+0.018
handed+0.018
 exits+0.017
oles+0.017
lor+0.017
Neg logit contributions
 Amen-0.011
 their-0.010
 smile-0.010
 very-0.009
 like-0.009
Token not
Token ID407
Feature activation-3.69e-01

Pos logit contributions
 Amen+0.008
 smile+0.006
 their+0.005
te+0.005
 smiled+0.005
Neg logit contributions
handed-0.019
 exits-0.018
 exhibit-0.018
 footprints-0.018
unch-0.018
Token a
Token ID257
Feature activation+8.76e-01

Pos logit contributions
 Amen+0.006
 smile+0.005
 their+0.004
 smiled+0.004
te+0.004
Neg logit contributions
handed-0.014
 exits-0.013
 exhibit-0.013
orns-0.013
 footprints-0.013
Token room
Token ID2119
Feature activation-3.74e-01

Pos logit contributions
 Amen+0.007
 their+0.007
 very+0.007
 smile+0.007
 like+0.007
Neg logit contributions
handed-0.006
 exhibit-0.005
unch-0.005
 exits-0.005
anger-0.005
Token.
Token ID13
Feature activation+4.26e-02

Pos logit contributions
 Amen+0.006
 their+0.006
 very+0.006
 smile+0.006
 like+0.005
Neg logit contributions
handed-0.014
 exhibit-0.014
anger-0.013
 exits-0.013
unch-0.013
Token There
Token ID1318
Feature activation+1.11e+00

Pos logit contributions
handed+0.002
 exhibit+0.002
unch+0.002
 exits+0.002
anger+0.002
Neg logit contributions
 Amen-0.001
 their-0.001
 very-0.001
 smile-0.001
 like-0.001
Token were
Token ID547
Feature activation+8.67e-01

Pos logit contributions
handed+0.045
 exhibit+0.043
unch+0.042
 exits+0.042
lor+0.042
Neg logit contributions
 their-0.016
 Amen-0.015
 very-0.015
 smile-0.014
 the-0.013
Token some
Token ID617
Feature activation-1.83e-01

Pos logit contributions
handed+0.025
 exits+0.024
 exhibit+0.024
anger+0.022
 footprints+0.022
Neg logit contributions
 Amen-0.023
 smile-0.021
 their-0.020
 very-0.019
te-0.018
Token books
Token ID3835
Feature activation-2.26e+00

Pos logit contributions
 Amen+0.005
 their+0.005
 very+0.005
 smile+0.005
 like+0.004
Neg logit contributions
handed-0.005
 exhibit-0.005
unch-0.005
 exits-0.005
anger-0.005
Token,
Token ID11
Feature activation-2.37e-01

Pos logit contributions
 Amen+0.036
 their+0.031
 smile+0.029
 very+0.028
 low+0.023
Neg logit contributions
handed-0.088
 exhibit-0.087
unch-0.084
anger-0.084
 exits-0.083
Token some
Token ID617
Feature activation-5.64e-01

Pos logit contributions
 Amen+0.007
 smile+0.005
te+0.005
 forward+0.005
 their+0.005
Neg logit contributions
 exits-0.007
 worms-0.007
 footprints-0.006
 exhibit-0.006
handed-0.006
Token clothes
Token ID8242
Feature activation-6.15e-01

Pos logit contributions
 Amen+0.015
 their+0.014
 very+0.013
 smile+0.013
 low+0.012
Neg logit contributions
handed-0.016
 exhibit-0.015
 exits-0.015
anger-0.015
unch-0.014
Token,
Token ID11
Feature activation-3.42e-01

Pos logit contributions
 Amen+0.009
 their+0.008
 very+0.008
 smile+0.008
 low+0.006
Neg logit contributions
handed-0.024
 exhibit-0.024
unch-0.023
anger-0.023
 exits-0.023
Token and
Token ID290
Feature activation-3.36e-01

Pos logit contributions
 Amen+0.009
 smile+0.007
te+0.007
 forward+0.006
 very+0.006
Neg logit contributions
 exits-0.010
 worms-0.010
 footprints-0.010
 exhibit-0.010
orns-0.009