TOP ACTIVATIONS
MAX = 15.027

could be and never mis beh aved . <|endoftext|> Once upon
cakes and cookies and ch ocol ates and ice cream .
very naughty and always mis beh aving . One day she
listen . He always mis beh aved and caused trouble wherever
be happy and well - beh aved . The end .
giving him too many ch ocol ates ." The king learned
reward . It was ch ocol ates and sweets !
't have all these ch ocol ates ! Now , go
oll ip ops , ch ocol ates , g ummies ,
not eat too many ch ocol ates and his friend learned
ates . Lily loved ch ocol ates . She thanked Ben
, cup cakes and ch ocol ates .
became friendly and well - beh aved , just like Buddy
and laughed and ate ch ocol ates . They had a
was a box of ch ocol ates . Lily loved ch
saw some other kids mis beh aving and running around .
for being so well - beh aved and Jack was delighted
It was sweet , ch ocol at y and warm ,
colorful cand ies and ch ocol ates . She said to
enjoyed her cool , ch ocol at y treat !

MIN ACTIVATIONS
MIN = -9.055

a big pond with ducks and fish . Tom wants to
. <|endoftext|> Max was walking through the park . He saw
. <|endoftext|> John was walking through the park one sunny day
and her dad were walking through a field on a sunny
. Suddenly , the sky darkened and split open . Rain
<|endoftext|> Mike y was walking through the park . He noticed
andy and Jane were walking through the park . Jane had
walking through a forest . Suddenly , she saw a small
! <|endoftext|> Mary was walking through the park one day .
of flowers , trees , and animals . Lily liked to
and her mom were walking through the park one afternoon .
charming little pony was walking through a me adow . It
. <|endoftext|> Jim was walking through the forest . He felt
walking around . Suddenly , Tim my saw a
. <|endoftext|> Jack was walking through the park , with his
<|endoftext|> T ina was walking through the garden with her mom
with many trees , flowers and animals . "
a big pond with ducks and fish . They wanted to
saw some frogs , fish and bugs in the water .
<|endoftext|> M ia was walking through the park one day .

INTERVAL 9.007 - 15.027
CONTAINS 0.027%

for helping her and she wasn 't ashamed to ask for
started eating the delicious ch ocol ates . She was so
, es bon ita y su ave !" Lily says ,
and never taking things that weren 't his .
hard . " Cr ack !" The ball broke

INTERVAL 2.986 - 9.007
CONTAINS 12.125%

job well , and he felt glad . He knew he
to a special safe !
there was a boy named Tim my . Tim my loved
is that big blue elephant doing in our lawn ?"
said , " I don 't wanna exercise . This weight

INTERVAL -3.034 - 2.986
CONTAINS 76.117%

so shiny and bright . Is it original ?"
her teacher , and her friends . But nobody seemed to
to always hug your friends . <|endoftext|> Once there was a
tired . Her mom told her it was time for a
in a park on a warm summer afternoon . Suddenly ,

INTERVAL -9.055 - -3.034
CONTAINS 11.730%

The boat floated on the water and looked for
Tom took the trophy to the bathroom and filled the sink
. She decided to pick up some of the junk and
They came back to the kitchen and set up their stage
stepped into the wheat field , something bit him on his

Token could
Token ID714
Feature activation+4.89e+00

Pos logit contributions
its+0.075
ases+0.071
beh+0.069
 supposed+0.066
ws+0.065
Neg logit contributions
 they-0.032
 he-0.028
 she-0.026
 The-0.026
 while-0.025
Token be
Token ID307
Feature activation+6.35e+00

Pos logit contributions
its+0.140
ases+0.132
beh+0.129
 supposed+0.124
ws+0.121
Neg logit contributions
 they-0.091
 he-0.083
 she-0.079
 The-0.077
 while-0.076
Token and
Token ID290
Feature activation-2.59e+00

Pos logit contributions
its+0.172
ases+0.165
beh+0.161
 supposed+0.159
asing+0.148
Neg logit contributions
 they-0.119
 he-0.106
 roof-0.105
 moon-0.103
 The-0.103
Token never
Token ID1239
Feature activation+2.46e+00

Pos logit contributions
 they+0.038
 Pole+0.038
 landscape+0.036
 fireplace+0.036
 moon+0.036
Neg logit contributions
its-0.074
beh-0.071
ases-0.070
 supposed-0.070
 consequences-0.065
Token mis
Token ID2984
Feature activation+1.31e+01

Pos logit contributions
its+0.063
ases+0.059
beh+0.058
 supposed+0.056
 consequences+0.053
Neg logit contributions
 they-0.051
 he-0.046
 The-0.045
 she-0.045
 roof-0.044
Tokenbeh
Token ID20709
Feature activation+1.50e+01

Pos logit contributions
its+0.219
 vitamins+0.214
vern+0.193
ws+0.182
 supposed+0.179
Neg logit contributions
 while-0.345
 they-0.339
 down-0.338
 Big-0.333
 The-0.330
Tokenaved
Token ID9586
Feature activation+2.78e+00

Pos logit contributions
its+0.366
vern+0.328
 consequences+0.317
 vitamins+0.312
ins+0.298
Neg logit contributions
 The-0.302
 Tim-0.286
 down-0.285
 they-0.285
 While-0.275
Token.
Token ID13
Feature activation+5.35e-01

Pos logit contributions
its+0.087
ases+0.084
beh+0.081
 supposed+0.080
 consequences+0.078
Neg logit contributions
 they-0.039
 roof-0.033
 The-0.033
 moon-0.033
 Ally-0.033
Token<|endoftext|>
Token ID50256
Feature activation+1.36e+00

Pos logit contributions
its+0.020
ases+0.019
beh+0.019
 supposed+0.018
 consequences+0.018
Neg logit contributions
 they-0.005
 he-0.004
 roof-0.004
 she-0.004
 while-0.004
TokenOnce
Token ID7454
Feature activation-1.65e+00

Pos logit contributions
its+0.046
ases+0.044
 supposed+0.042
beh+0.041
ump+0.040
Neg logit contributions
 they-0.016
 roof-0.014
 he-0.013
 raft-0.013
 she-0.012
Token upon
Token ID2402
Feature activation-2.05e+00

Pos logit contributions
 they+0.021
 roof+0.020
 moon+0.020
 fireplace+0.019
 wagon+0.019
Neg logit contributions
its-0.052
ases-0.048
 supposed-0.047
vern-0.045
beh-0.045
Token cakes
Token ID30849
Feature activation-7.69e-01

Pos logit contributions
its+0.027
ases+0.025
 supposed+0.024
beh+0.024
 consequences+0.023
Neg logit contributions
 they-0.018
 he-0.016
 Ally-0.016
 The-0.016
 while-0.015
Token and
Token ID290
Feature activation-4.23e+00

Pos logit contributions
 Big+0.008
 Ally+0.008
 Sun+0.008
 The+0.007
 fireplace+0.007
Neg logit contributions
its-0.025
ases-0.024
 supposed-0.023
arts-0.023
ump-0.023
Token cookies
Token ID14746
Feature activation-6.21e-01

Pos logit contributions
 they+0.084
 he+0.076
 The+0.076
 Ally+0.076
 Big+0.075
Neg logit contributions
its-0.107
ases-0.099
beh-0.095
 supposed-0.095
 consequences-0.092
Token and
Token ID290
Feature activation-4.64e+00

Pos logit contributions
 The+0.007
 Big+0.007
 they+0.007
 Ally+0.007
 Sun+0.007
Neg logit contributions
its-0.020
ases-0.019
 supposed-0.019
arts-0.018
beh-0.018
Token ch
Token ID442
Feature activation+5.73e+00

Pos logit contributions
 they+0.091
 Ally+0.086
 The+0.085
 Big+0.083
 he+0.083
Neg logit contributions
its-0.114
ases-0.106
 supposed-0.103
beh-0.102
 consequences-0.101
Tokenocol
Token ID4668
Feature activation+1.45e+01

Pos logit contributions
 supposed+0.102
its+0.102
beh+0.099
vern+0.099
 consequences+0.094
Neg logit contributions
 they-0.143
 The-0.133
 Tim-0.129
 Ally-0.127
 roof-0.127
Tokenates
Token ID689
Feature activation+4.54e-01

Pos logit contributions
its+0.217
beh+0.182
 supposed+0.172
vern+0.167
 consequences+0.163
Neg logit contributions
 they-0.410
 moon-0.389
 Mill-0.388
 fields-0.387
 Sun-0.385
Token and
Token ID290
Feature activation-4.63e+00

Pos logit contributions
its+0.014
ases+0.013
 supposed+0.012
arts+0.012
beh+0.012
Neg logit contributions
 Ally-0.006
 Big-0.006
 The-0.006
 roof-0.006
 fireplace-0.006
Token ice
Token ID4771
Feature activation+2.19e+00

Pos logit contributions
 they+0.085
 Ally+0.082
 The+0.080
 Pole+0.079
 Big+0.079
Neg logit contributions
its-0.116
ases-0.108
 supposed-0.107
beh-0.104
 consequences-0.103
Token cream
Token ID8566
Feature activation-4.81e-01

Pos logit contributions
its+0.048
ases+0.044
beh+0.042
 supposed+0.041
ws+0.039
Neg logit contributions
 they-0.054
 he-0.050
 The-0.049
 she-0.048
 roof-0.048
Token.
Token ID13
Feature activation+4.67e-01

Pos logit contributions
 Ally+0.008
 The+0.008
 Big+0.007
 Sun+0.007
 roof+0.007
Neg logit contributions
its-0.013
ases-0.013
 supposed-0.012
arts-0.012
beh-0.012
Token very
Token ID845
Feature activation+1.80e+00

Pos logit contributions
its+0.093
ases+0.088
beh+0.085
 supposed+0.083
ws+0.080
Neg logit contributions
 they-0.053
 he-0.048
 she-0.045
 The-0.045
 roof-0.044
Token naughty
Token ID45128
Feature activation-3.56e-01

Pos logit contributions
its+0.049
ases+0.046
beh+0.044
ws+0.043
 consequences+0.041
Neg logit contributions
 they-0.037
 he-0.034
 she-0.033
 The-0.032
 Tim-0.031
Token and
Token ID290
Feature activation-2.52e+00

Pos logit contributions
 they+0.004
 Ally+0.004
 roof+0.004
 raft+0.004
 he+0.004
Neg logit contributions
its-0.012
ases-0.011
 supposed-0.011
beh-0.011
 consequences-0.011
Token always
Token ID1464
Feature activation+2.87e-01

Pos logit contributions
 they+0.046
 roof+0.042
 landscape+0.042
 moon+0.042
 fireplace+0.042
Neg logit contributions
its-0.068
beh-0.063
ases-0.063
 supposed-0.063
 consequences-0.058
Token mis
Token ID2984
Feature activation+1.17e+01

Pos logit contributions
its+0.008
ases+0.007
beh+0.007
 supposed+0.007
 consequences+0.007
Neg logit contributions
 they-0.005
 he-0.005
 The-0.005
 roof-0.005
 she-0.005
Tokenbeh
Token ID20709
Feature activation+1.45e+01

Pos logit contributions
 vitamins+0.196
its+0.193
 fault+0.180
ws+0.171
vern+0.166
Neg logit contributions
 while-0.297
 down-0.292
 they-0.285
 Ally-0.285
 The-0.280
Tokenaving
Token ID2703
Feature activation-6.49e-01

Pos logit contributions
its+0.408
 consequences+0.377
vern+0.370
 mistakes+0.346
 wounds+0.346
Neg logit contributions
 The-0.227
 While-0.219
 Tim-0.217
 down-0.212
 Mill-0.211
Token.
Token ID13
Feature activation-1.31e+00

Pos logit contributions
 they+0.009
 he+0.008
 moon+0.007
 raft+0.007
 roof+0.007
Neg logit contributions
its-0.021
ases-0.020
beh-0.019
 supposed-0.019
 consequences-0.019
Token One
Token ID1881
Feature activation-2.09e+00

Pos logit contributions
 they+0.015
 he+0.012
 raft+0.011
 while+0.011
 wagon+0.011
Neg logit contributions
its-0.046
ases-0.044
beh-0.042
 supposed-0.042
 consequences-0.041
Token day
Token ID1110
Feature activation-5.31e+00

Pos logit contributions
 they+0.032
 Ally+0.028
 Pole+0.028
 Mill+0.027
 roof+0.027
Neg logit contributions
its-0.060
 supposed-0.059
beh-0.056
ases-0.056
 consequences-0.054
Token she
Token ID673
Feature activation-1.53e+00

Pos logit contributions
 raft+0.090
 landscape+0.087
 horizon+0.087
stack+0.086
 Pole+0.085
Neg logit contributions
its-0.125
 consequences-0.125
ases-0.125
 supposed-0.123
olds-0.118
Token listen
Token ID6004
Feature activation+9.15e-02

Pos logit contributions
its+0.049
ases+0.046
beh+0.044
 supposed+0.043
vern+0.043
Neg logit contributions
 they-0.033
 he-0.031
 she-0.029
 while-0.029
 The-0.029
Token.
Token ID13
Feature activation-1.59e+00

Pos logit contributions
its+0.003
ases+0.003
 supposed+0.003
beh+0.003
 consequences+0.003
Neg logit contributions
 they-0.001
 roof-0.001
 landscape-0.001
 Ally-0.001
 wagon-0.001
Token He
Token ID679
Feature activation-8.18e-01

Pos logit contributions
 they+0.020
 she+0.016
 roof+0.016
 while+0.016
 moon+0.016
Neg logit contributions
its-0.052
ases-0.049
beh-0.047
 supposed-0.047
 consequences-0.046
Token always
Token ID1464
Feature activation+2.41e-01

Pos logit contributions
 they+0.013
 he+0.012
 The+0.011
 she+0.011
 while+0.011
Neg logit contributions
its-0.025
ases-0.024
beh-0.023
 supposed-0.022
vern-0.021
Token mis
Token ID2984
Feature activation+1.15e+01

Pos logit contributions
its+0.007
ases+0.007
beh+0.006
 supposed+0.006
vern+0.006
Neg logit contributions
 they-0.004
 he-0.004
 she-0.004
 The-0.004
 while-0.004
Tokenbeh
Token ID20709
Feature activation+1.44e+01

Pos logit contributions
its+0.159
 vitamins+0.148
 fault+0.143
vern+0.139
 consequences+0.125
Neg logit contributions
 while-0.333
 they-0.322
 Ally-0.321
 down-0.313
 Big-0.310
Tokenaved
Token ID9586
Feature activation+1.52e-02

Pos logit contributions
its+0.361
vern+0.327
 consequences+0.321
 mistakes+0.299
ins+0.297
Neg logit contributions
 The-0.266
 While-0.257
 Tim-0.255
 they-0.251
 down-0.247
Token and
Token ID290
Feature activation-3.46e+00

Pos logit contributions
its+0.000
ases+0.000
beh+0.000
 supposed+0.000
 consequences+0.000
Neg logit contributions
 they-0.000
 roof-0.000
 landscape-0.000
 fireplace-0.000
 moon-0.000
Token caused
Token ID4073
Feature activation+2.71e+00

Pos logit contributions
 they+0.069
 he+0.062
 The+0.061
 roof+0.060
 she+0.060
Neg logit contributions
its-0.090
ases-0.085
beh-0.083
 supposed-0.081
 consequences-0.077
Token trouble
Token ID5876
Feature activation+3.21e+00

Pos logit contributions
its+0.089
ases+0.084
beh+0.082
 supposed+0.080
asing+0.078
Neg logit contributions
 they-0.038
 he-0.034
 The-0.032
 she-0.031
 Big-0.031
Token wherever
Token ID14530
Feature activation-2.34e+00

Pos logit contributions
its+0.107
ases+0.101
beh+0.098
 supposed+0.096
ws+0.093
Neg logit contributions
 they-0.044
 he-0.038
 she-0.035
 The-0.035
 while-0.035
Token be
Token ID307
Feature activation+7.63e+00

Pos logit contributions
its+0.052
ases+0.049
beh+0.048
 supposed+0.045
asing+0.044
Neg logit contributions
 they-0.037
 he-0.034
 she-0.032
 while-0.031
 The-0.031
Token happy
Token ID3772
Feature activation+8.33e-01

Pos logit contributions
its+0.187
ases+0.173
beh+0.166
 supposed+0.160
ws+0.153
Neg logit contributions
 they-0.173
 he-0.159
 The-0.153
 she-0.153
 while-0.150
Token and
Token ID290
Feature activation-4.01e-01

Pos logit contributions
its+0.025
ases+0.024
 supposed+0.023
beh+0.023
 consequences+0.022
Neg logit contributions
 they-0.012
 Ally-0.011
 roof-0.011
 The-0.011
 Pole-0.011
Token well
Token ID880
Feature activation+2.09e+00

Pos logit contributions
 they+0.008
 he+0.008
 The+0.007
 roof+0.007
 Ally+0.007
Neg logit contributions
its-0.010
ases-0.009
beh-0.009
 supposed-0.009
 consequences-0.008
Token-
Token ID12
Feature activation+5.58e+00

Pos logit contributions
its+0.064
ases+0.061
beh+0.059
 supposed+0.058
 consequences+0.056
Neg logit contributions
 they-0.030
 roof-0.028
 Ally-0.028
 The-0.027
 moon-0.027
Tokenbeh
Token ID20709
Feature activation+1.44e+01

Pos logit contributions
its+0.100
ases+0.091
beh+0.087
 supposed+0.083
asing+0.076
Neg logit contributions
 they-0.161
 he-0.150
 she-0.146
 The-0.146
 roof-0.145
Tokenaved
Token ID9586
Feature activation+3.52e-01

Pos logit contributions
its+0.394
vern+0.343
 consequences+0.343
ases+0.326
 supposed+0.320
Neg logit contributions
 The-0.246
 they-0.233
 Tim-0.228
 While-0.224
 Mill-0.222
Token.
Token ID13
Feature activation+3.86e-01

Pos logit contributions
its+0.011
ases+0.010
beh+0.010
 supposed+0.010
 consequences+0.009
Neg logit contributions
 they-0.005
 Ally-0.005
 roof-0.005
 moon-0.005
 The-0.005
Token The
Token ID383
Feature activation-9.63e-01

Pos logit contributions
its+0.012
ases+0.012
 supposed+0.011
beh+0.011
ump+0.011
Neg logit contributions
 they-0.004
 roof-0.004
 raft-0.004
 wagon-0.004
 landscape-0.004
Token end
Token ID886
Feature activation+1.68e+00

Pos logit contributions
 they+0.017
 he+0.015
 The+0.014
 she+0.014
 roof+0.014
Neg logit contributions
its-0.029
ases-0.027
beh-0.026
 supposed-0.026
ws-0.025
Token.
Token ID13
Feature activation+7.32e-01

Pos logit contributions
its+0.048
ases+0.044
 supposed+0.043
beh+0.043
vern+0.042
Neg logit contributions
 they-0.029
 roof-0.027
 Ally-0.026
 Pole-0.026
 raft-0.026
Token giving
Token ID3501
Feature activation+2.96e+00

Pos logit contributions
its+0.019
ases+0.018
beh+0.018
 supposed+0.017
asing+0.017
Neg logit contributions
 they-0.015
 he-0.014
 The-0.014
 she-0.013
 Tim-0.013
Token him
Token ID683
Feature activation+7.43e-01

Pos logit contributions
its+0.092
ases+0.086
beh+0.084
 supposed+0.082
 consequences+0.080
Neg logit contributions
 they-0.047
 he-0.040
 roof-0.039
 she-0.038
 The-0.038
Token too
Token ID1165
Feature activation+2.96e+00

Pos logit contributions
its+0.023
ases+0.022
beh+0.021
 supposed+0.020
 consequences+0.020
Neg logit contributions
 they-0.012
 he-0.010
 roof-0.010
 she-0.010
 Ally-0.010
Token many
Token ID867
Feature activation+5.38e-01

Pos logit contributions
its+0.086
ases+0.085
beh+0.081
asing+0.079
ief+0.077
Neg logit contributions
 they-0.052
 he-0.047
 while-0.046
 she-0.046
 The-0.045
Token ch
Token ID442
Feature activation+5.81e+00

Pos logit contributions
its+0.015
ases+0.014
beh+0.014
 supposed+0.013
ws+0.013
Neg logit contributions
 they-0.010
 he-0.009
 she-0.009
 The-0.009
 roof-0.009
Tokenocol
Token ID4668
Feature activation+1.43e+01

Pos logit contributions
 supposed+0.099
its+0.097
beh+0.090
vern+0.084
 consequences+0.082
Neg logit contributions
 they-0.147
 he-0.144
 The-0.140
,-0.137
 she-0.135
Tokenates
Token ID689
Feature activation+7.35e-01

Pos logit contributions
its+0.212
 supposed+0.166
beh+0.162
ases+0.152
 consequences+0.149
Neg logit contributions
 they-0.418
 he-0.396
 The-0.390
 moon-0.386
 while-0.384
Token."
Token ID526
Feature activation-4.74e-01

Pos logit contributions
its+0.020
ases+0.019
beh+0.019
 supposed+0.019
arts+0.017
Neg logit contributions
 they-0.013
 Ally-0.012
 The-0.012
 roof-0.012
 Big-0.012
Token The
Token ID383
Feature activation-1.02e+00

Pos logit contributions
 landscape+0.008
 roof+0.007
 moon+0.007
 raft+0.007
 valley+0.006
Neg logit contributions
its-0.012
ases-0.012
olds-0.012
beh-0.012
 humor-0.011
Token king
Token ID5822
Feature activation+1.35e+00

Pos logit contributions
 they+0.017
 he+0.015
 The+0.014
 Ally+0.014
 she+0.014
Neg logit contributions
its-0.030
ases-0.028
beh-0.027
 supposed-0.027
 consequences-0.026
Token learned
Token ID4499
Feature activation+8.47e-01

Pos logit contributions
its+0.043
ases+0.040
beh+0.039
 supposed+0.038
ws+0.037
Neg logit contributions
 they-0.020
 he-0.017
 roof-0.017
 The-0.017
 she-0.016
Token reward
Token ID6721
Feature activation-1.07e+00

Pos logit contributions
 they+0.045
 The+0.038
 he+0.038
 Ally+0.038
 she+0.037
Neg logit contributions
its-0.082
ases-0.079
 consequences-0.076
 supposed-0.073
beh-0.072
Token.
Token ID13
Feature activation-1.92e+00

Pos logit contributions
 roof+0.015
 they+0.015
 landscape+0.015
 fireplace+0.015
 moon+0.014
Neg logit contributions
its-0.031
ases-0.030
 supposed-0.029
beh-0.029
 consequences-0.028
Token It
Token ID632
Feature activation-1.15e+00

Pos logit contributions
 raft+0.027
 landscape+0.026
 roof+0.026
 they+0.024
 valley+0.024
Neg logit contributions
ases-0.051
its-0.050
beh-0.049
olds-0.047
ump-0.046
Token was
Token ID373
Feature activation+3.64e+00

Pos logit contributions
 they+0.017
 roof+0.014
 The+0.014
 he+0.014
 Big+0.014
Neg logit contributions
its-0.036
ases-0.034
beh-0.033
 supposed-0.032
ws-0.031
Token ch
Token ID442
Feature activation+5.89e+00

Pos logit contributions
its+0.105
 supposed+0.101
ases+0.098
beh+0.096
 consequences+0.092
Neg logit contributions
 they-0.060
 Pole-0.052
 he-0.051
 roof-0.051
 landscape-0.051
Tokenocol
Token ID4668
Feature activation+1.41e+01

Pos logit contributions
its+0.094
 supposed+0.086
beh+0.081
ases+0.077
vern+0.073
Neg logit contributions
 they-0.169
 The-0.166
 he-0.161
 Ally-0.158
 she-0.158
Tokenates
Token ID689
Feature activation+5.70e-01

Pos logit contributions
its+0.191
beh+0.148
ases+0.144
 supposed+0.143
 consequences+0.132
Neg logit contributions
 they-0.442
 he-0.423
 The-0.414
 moon-0.411
 while-0.410
Token and
Token ID290
Feature activation-3.67e+00

Pos logit contributions
its+0.016
ases+0.015
 supposed+0.015
beh+0.014
arts+0.014
Neg logit contributions
 they-0.009
 Ally-0.009
 landscape-0.008
 Big-0.008
 roof-0.008
Token sweets
Token ID42402
Feature activation-2.93e-01

Pos logit contributions
 they+0.052
 The+0.051
 Pole+0.050
 wagon+0.049
 landscape+0.049
Neg logit contributions
its-0.104
ases-0.098
 supposed-0.098
beh-0.096
 vitamins-0.092
Token!
Token ID0
Feature activation-7.69e-01

Pos logit contributions
 Ally+0.004
 they+0.004
 The+0.004
 Big+0.004
 Sun+0.004
Neg logit contributions
its-0.008
 supposed-0.008
ases-0.008
beh-0.007
arts-0.007
Token
Token ID198
Feature activation+2.37e+00

Pos logit contributions
 landscape+0.010
 roof+0.010
 raft+0.010
 they+0.009
 moon+0.009
Neg logit contributions
ases-0.022
its-0.022
beh-0.021
ump-0.020
olds-0.020
Token't
Token ID470
Feature activation+3.99e+00

Pos logit contributions
its+0.120
ases+0.114
beh+0.112
 supposed+0.107
 consequences+0.104
Neg logit contributions
 they-0.093
 he-0.084
 she-0.082
 The-0.081
 roof-0.079
Token have
Token ID423
Feature activation+5.26e+00

Pos logit contributions
its+0.113
ases+0.108
beh+0.104
 supposed+0.101
ws+0.099
Neg logit contributions
 they-0.074
 he-0.067
 she-0.064
 while-0.063
 The-0.062
Token all
Token ID477
Feature activation+2.21e+00

Pos logit contributions
its+0.153
ases+0.144
beh+0.139
 supposed+0.134
asing+0.131
Neg logit contributions
 they-0.096
 he-0.085
 she-0.082
 The-0.080
 while-0.079
Token these
Token ID777
Feature activation+2.24e+00

Pos logit contributions
its+0.075
ases+0.072
beh+0.070
 supposed+0.069
 consequences+0.066
Neg logit contributions
 they-0.028
 he-0.024
 The-0.022
 she-0.022
 roof-0.022
Token ch
Token ID442
Feature activation+5.67e+00

Pos logit contributions
its+0.057
ases+0.053
beh+0.051
 supposed+0.050
 consequences+0.048
Neg logit contributions
 they-0.048
 he-0.043
 The-0.042
 Ally-0.042
 roof-0.042
Tokenocol
Token ID4668
Feature activation+1.40e+01

Pos logit contributions
its+0.093
beh+0.092
 supposed+0.091
vern+0.091
 consequences+0.084
Neg logit contributions
 they-0.139
 The-0.130
 he-0.124
 Mama-0.124
 roof-0.124
Tokenates
Token ID689
Feature activation+1.60e+00

Pos logit contributions
its+0.208
beh+0.164
 supposed+0.162
 consequences+0.157
ases+0.155
Neg logit contributions
 they-0.406
 The-0.371
 Mill-0.371
 he-0.370
 while-0.368
Token!
Token ID0
Feature activation+1.37e+00

Pos logit contributions
its+0.046
ases+0.043
beh+0.042
 supposed+0.042
asing+0.040
Neg logit contributions
 they-0.027
 The-0.024
 Big-0.024
 Ally-0.024
 he-0.024
Token Now
Token ID2735
Feature activation-2.28e+00

Pos logit contributions
its+0.036
ases+0.034
beh+0.033
 supposed+0.032
 consequences+0.030
Neg logit contributions
 they-0.028
 he-0.025
 she-0.024
 The-0.024
 roof-0.024
Token,
Token ID11
Feature activation-4.02e+00

Pos logit contributions
 they+0.034
 fireplace+0.034
 moon+0.034
 landscape+0.033
 Pole+0.033
Neg logit contributions
its-0.064
ases-0.059
beh-0.059
 supposed-0.058
 consequences-0.057
Token go
Token ID467
Feature activation+1.46e+00

Pos logit contributions
 Pole+0.076
 fireplace+0.075
 landscape+0.075
 they+0.074
 wagon+0.074
Neg logit contributions
its-0.094
beh-0.089
ases-0.086
 supposed-0.084
 consequences-0.084
Tokenoll
Token ID692
Feature activation+6.09e+00

Pos logit contributions
its+0.097
beh+0.088
 supposed+0.078
 preferences+0.078
ws+0.075
Neg logit contributions
 she-0.183
 he-0.182
 The-0.179
 they-0.177
 �-0.171
Tokenip
Token ID541
Feature activation+3.42e+00

Pos logit contributions
its+0.160
ases+0.150
beh+0.147
 supposed+0.141
 consequences+0.135
Neg logit contributions
 they-0.129
 he-0.122
 she-0.115
 while-0.109
 The-0.104
Tokenops
Token ID2840
Feature activation+1.05e+00

Pos logit contributions
its+0.071
ases+0.066
beh+0.064
 supposed+0.061
ws+0.057
Neg logit contributions
 they-0.089
 he-0.084
 she-0.082
 while-0.079
 The-0.079
Token,
Token ID11
Feature activation-4.67e+00

Pos logit contributions
its+0.033
ases+0.031
 supposed+0.031
beh+0.030
arts+0.029
Neg logit contributions
 they-0.014
 Big-0.013
 The-0.013
 Ally-0.013
 Sun-0.012
Token ch
Token ID442
Feature activation+4.28e+00

Pos logit contributions
 they+0.059
 landscape+0.058
 roof+0.058
 Pole+0.058
 Ally+0.057
Neg logit contributions
its-0.135
ases-0.128
 supposed-0.124
beh-0.121
 vitamins-0.119
Tokenocol
Token ID4668
Feature activation+1.39e+01

Pos logit contributions
its+0.070
 supposed+0.069
beh+0.066
vern+0.063
 consequences+0.060
Neg logit contributions
 they-0.118
 he-0.117
 The-0.114
 she-0.112
 Tim-0.110
Tokenates
Token ID689
Feature activation-2.55e-01

Pos logit contributions
its+0.208
beh+0.164
 supposed+0.159
vern+0.153
 consequences+0.153
Neg logit contributions
 they-0.399
 he-0.388
 The-0.382
 Tim-0.377
 moon-0.376
Token,
Token ID11
Feature activation-3.87e+00

Pos logit contributions
 they+0.003
 Ally+0.002
 Big+0.002
 The+0.002
 landscape+0.002
Neg logit contributions
its-0.009
ases-0.008
 supposed-0.008
arts-0.008
beh-0.008
Token g
Token ID308
Feature activation+4.69e+00

Pos logit contributions
 Ally+0.051
 they+0.050
 landscape+0.049
 Mill+0.048
 Big+0.048
Neg logit contributions
its-0.115
ases-0.108
 supposed-0.102
ws-0.101
 vitamins-0.101
Tokenummies
Token ID39578
Feature activation+1.51e+00

Pos logit contributions
its+0.105
ases+0.099
beh+0.096
 supposed+0.094
 consequences+0.092
Neg logit contributions
 they-0.108
 he-0.101
 The-0.098
 she-0.098
 Tim-0.096
Token,
Token ID11
Feature activation-2.98e+00

Pos logit contributions
its+0.049
ases+0.047
 supposed+0.046
arts+0.045
ump+0.044
Neg logit contributions
 they-0.017
 Ally-0.016
 landscape-0.015
 The-0.015
 roof-0.015
Token not
Token ID407
Feature activation+4.21e+00

Pos logit contributions
its+0.067
ases+0.065
beh+0.064
 supposed+0.061
asing+0.059
Neg logit contributions
 they-0.049
 he-0.045
 she-0.043
 The-0.041
 roof-0.041
Token eat
Token ID4483
Feature activation+2.19e+00

Pos logit contributions
its+0.120
ases+0.116
beh+0.109
 supposed+0.107
vern+0.106
Neg logit contributions
 they-0.077
 he-0.070
 she-0.067
 The-0.066
 while-0.065
Token too
Token ID1165
Feature activation+3.27e+00

Pos logit contributions
its+0.061
ases+0.057
beh+0.054
 supposed+0.054
 consequences+0.053
Neg logit contributions
 they-0.038
 roof-0.034
 The-0.034
 Ally-0.034
 �-0.033
Token many
Token ID867
Feature activation+2.87e-01

Pos logit contributions
its+0.090
ases+0.086
beh+0.084
 supposed+0.080
asing+0.079
Neg logit contributions
 they-0.066
 he-0.061
 she-0.058
 The-0.056
 while-0.055
Token ch
Token ID442
Feature activation+5.45e+00

Pos logit contributions
its+0.008
ases+0.007
beh+0.007
 supposed+0.007
ws+0.006
Neg logit contributions
 they-0.006
 he-0.005
 she-0.005
 The-0.005
 while-0.005
Tokenocol
Token ID4668
Feature activation+1.39e+01

Pos logit contributions
its+0.089
 supposed+0.089
beh+0.084
 consequences+0.078
vern+0.077
Neg logit contributions
 they-0.147
 he-0.143
 The-0.138
 she-0.135
 Tim-0.132
Tokenates
Token ID689
Feature activation+1.14e+00

Pos logit contributions
its+0.204
 supposed+0.157
beh+0.157
ases+0.147
 consequences+0.145
Neg logit contributions
 they-0.415
 he-0.395
 The-0.387
 moon-0.383
 while-0.381
Token and
Token ID290
Feature activation-2.91e+00

Pos logit contributions
its+0.035
ases+0.033
 supposed+0.032
beh+0.032
asing+0.031
Neg logit contributions
 they-0.015
 Ally-0.015
 landscape-0.015
 roof-0.015
 The-0.015
Token his
Token ID465
Feature activation+1.80e+00

Pos logit contributions
 they+0.048
 he+0.042
 The+0.042
 roof+0.041
 she+0.041
Neg logit contributions
its-0.086
ases-0.081
beh-0.079
 supposed-0.078
 consequences-0.075
Token friend
Token ID1545
Feature activation-8.21e-01

Pos logit contributions
its+0.046
ases+0.043
beh+0.041
 supposed+0.041
 consequences+0.040
Neg logit contributions
 they-0.037
 he-0.033
 she-0.032
 The-0.032
 Ally-0.032
Token learned
Token ID4499
Feature activation+1.93e+00

Pos logit contributions
 they+0.012
 roof+0.011
 landscape+0.010
 moon+0.010
 Ally+0.010
Neg logit contributions
its-0.025
ases-0.024
beh-0.023
 supposed-0.023
 consequences-0.022
Tokenates
Token ID689
Feature activation-8.95e-01

Pos logit contributions
its+0.201
beh+0.162
 supposed+0.156
ases+0.148
 consequences+0.146
Neg logit contributions
 they-0.412
 moon-0.387
 he-0.384
 fields-0.382
 roof-0.382
Token.
Token ID13
Feature activation+6.12e-02

Pos logit contributions
 Ally+0.014
 they+0.014
 The+0.013
 Big+0.013
 Sun+0.013
Neg logit contributions
its-0.025
 supposed-0.024
ases-0.024
arts-0.023
beh-0.023
Token Lily
Token ID20037
Feature activation+2.03e+00

Pos logit contributions
its+0.002
ases+0.002
ws+0.002
 vitamins+0.002
olds+0.002
Neg logit contributions
 landscape-0.001
 roof-0.001
 they-0.001
 raft-0.001
 Ally-0.001
Token loved
Token ID6151
Feature activation+1.92e-02

Pos logit contributions
its+0.061
ases+0.057
 supposed+0.055
beh+0.055
ws+0.053
Neg logit contributions
 they-0.032
 he-0.028
 Ally-0.027
 roof-0.027
 Big-0.027
Token ch
Token ID442
Feature activation+5.33e+00

Pos logit contributions
its+0.001
ases+0.001
 supposed+0.001
beh+0.001
ws+0.001
Neg logit contributions
 they-0.000
 roof-0.000
 The-0.000
 he-0.000
 wagon-0.000
Tokenocol
Token ID4668
Feature activation+1.39e+01

Pos logit contributions
 supposed+0.098
beh+0.097
 consequences+0.091
its+0.090
vern+0.086
Neg logit contributions
 they-0.137
 he-0.132
 The-0.131
 she-0.125
 the-0.125
Tokenates
Token ID689
Feature activation+4.62e-01

Pos logit contributions
its+0.201
beh+0.166
 supposed+0.158
 consequences+0.152
ases+0.149
Neg logit contributions
 they-0.415
 he-0.392
 moon-0.390
 The-0.390
 Sun-0.381
Token.
Token ID13
Feature activation+6.34e-01

Pos logit contributions
its+0.014
ases+0.013
 supposed+0.013
beh+0.013
arts+0.012
Neg logit contributions
 they-0.006
 Ally-0.006
 roof-0.006
 Big-0.006
 landscape-0.006
Token She
Token ID1375
Feature activation-9.43e-01

Pos logit contributions
its+0.019
ases+0.019
ws+0.017
 vitamins+0.017
olds+0.017
Neg logit contributions
 they-0.008
 roof-0.008
 raft-0.008
 landscape-0.008
 train-0.008
Token thanked
Token ID26280
Feature activation-2.10e+00

Pos logit contributions
 they+0.015
 he+0.014
 The+0.013
 roof+0.013
 Ally+0.013
Neg logit contributions
its-0.028
ases-0.026
beh-0.025
 supposed-0.025
ws-0.024
Token Ben
Token ID3932
Feature activation+1.96e+00

Pos logit contributions
 roof+0.034
 they+0.033
 landscape+0.033
 wagon+0.032
 while+0.032
Neg logit contributions
ases-0.055
its-0.055
beh-0.049
arts-0.048
ief-0.048
Token,
Token ID11
Feature activation-4.27e+00

Pos logit contributions
 Ally+0.010
 Sun+0.009
 Big+0.009
 The+0.009
 landscape+0.009
Neg logit contributions
its-0.017
 supposed-0.016
ases-0.015
ump-0.015
beh-0.015
Token cup
Token ID6508
Feature activation-9.65e-02

Pos logit contributions
 landscape+0.059
 Ally+0.058
 Pole+0.058
 roof+0.056
 valley+0.056
Neg logit contributions
its-0.117
 supposed-0.111
ases-0.108
beh-0.107
 vitamins-0.107
Tokencakes
Token ID37263
Feature activation-1.09e+00

Pos logit contributions
 Ally+0.002
 Big+0.001
 Sun+0.001
 they+0.001
 The+0.001
Neg logit contributions
its-0.003
ases-0.002
arts-0.002
ump-0.002
 supposed-0.002
Token and
Token ID290
Feature activation-4.91e+00

Pos logit contributions
 Ally+0.014
 Big+0.014
 Thunder+0.014
 fireplace+0.013
 Sun+0.013
Neg logit contributions
its-0.033
 supposed-0.030
ases-0.030
arts-0.029
ump-0.029
Token ch
Token ID442
Feature activation+4.63e+00

Pos logit contributions
 landscape+0.081
 Ally+0.080
 fireplace+0.080
 Pole+0.079
 valley+0.079
Neg logit contributions
its-0.122
beh-0.115
 supposed-0.115
 vitamins-0.112
ases-0.112
Tokenocol
Token ID4668
Feature activation+1.39e+01

Pos logit contributions
its+0.071
 supposed+0.064
beh+0.062
ases+0.061
vern+0.058
Neg logit contributions
 they-0.142
 he-0.132
 The-0.130
 Tim-0.128
 she-0.128
Tokenates
Token ID689
Feature activation-1.50e-01

Pos logit contributions
its+0.189
beh+0.150
 supposed+0.145
ases+0.144
 consequences+0.136
Neg logit contributions
 they-0.435
 he-0.405
 The-0.401
 Tim-0.399
 Mill-0.398
Token.
Token ID13
Feature activation-7.60e-01

Pos logit contributions
 Ally+0.002
 landscape+0.002
 fireplace+0.002
 Big+0.002
 The+0.002
Neg logit contributions
its-0.004
ases-0.004
 supposed-0.004
beh-0.004
arts-0.004
Token
Token ID220
Feature activation+6.99e+00

Pos logit contributions
 they+0.009
 moon+0.009
 landscape+0.009
 roof+0.009
 valley+0.008
Neg logit contributions
its-0.024
ases-0.022
beh-0.022
 supposed-0.021
olds-0.021
Token
Token ID198
Feature activation+2.34e+00

Pos logit contributions
its+0.254
ases+0.240
beh+0.231
 supposed+0.230
vern+0.222
Neg logit contributions
 they-0.069
 he-0.056
 roof-0.056
 moon-0.055
 landscape-0.055
Token
Token ID198
Feature activation+1.21e+00

Pos logit contributions
its+0.089
ases+0.084
beh+0.083
 supposed+0.081
 consequences+0.078
Neg logit contributions
 they-0.021
 he-0.016
 she-0.015
 The-0.014
 roof-0.014
Token became
Token ID2627
Feature activation+3.67e+00

Pos logit contributions
 Pole+0.006
 they+0.006
 roof+0.006
 fireplace+0.006
 moon+0.006
Neg logit contributions
its-0.010
ases-0.010
beh-0.010
 supposed-0.009
ws-0.009
Token friendly
Token ID8030
Feature activation+1.26e-01

Pos logit contributions
its+0.088
ases+0.081
beh+0.079
 supposed+0.076
ws+0.072
Neg logit contributions
 they-0.085
 he-0.078
 she-0.075
 The-0.075
 while-0.073
Token and
Token ID290
Feature activation-2.26e+00

Pos logit contributions
its+0.004
ases+0.004
beh+0.004
 supposed+0.004
 consequences+0.004
Neg logit contributions
 they-0.001
 roof-0.001
 he-0.001
 Ally-0.001
 fireplace-0.001
Token well
Token ID880
Feature activation+7.07e-01

Pos logit contributions
 roof+0.038
 Pole+0.038
 fireplace+0.038
 they+0.037
 landscape+0.037
Neg logit contributions
beh-0.058
its-0.057
ases-0.056
 supposed-0.056
asing-0.051
Token-
Token ID12
Feature activation+4.67e+00

Pos logit contributions
its+0.023
ases+0.022
beh+0.021
 supposed+0.021
 consequences+0.020
Neg logit contributions
 they-0.009
 roof-0.008
 Ally-0.008
 he-0.008
 moon-0.008
Tokenbeh
Token ID20709
Feature activation+1.39e+01

Pos logit contributions
its+0.089
ases+0.081
beh+0.077
 supposed+0.074
 consequences+0.069
Neg logit contributions
 they-0.130
 he-0.121
 she-0.118
 The-0.118
 while-0.116
Tokenaved
Token ID9586
Feature activation-4.44e-01

Pos logit contributions
its+0.387
 consequences+0.339
vern+0.331
 supposed+0.315
ases+0.315
Neg logit contributions
 The-0.238
 they-0.233
 While-0.217
 Tim-0.217
 Mill-0.213
Token,
Token ID11
Feature activation-3.03e+00

Pos logit contributions
 they+0.006
 moon+0.006
 roof+0.005
 fireplace+0.005
 Ally+0.005
Neg logit contributions
its-0.014
ases-0.013
beh-0.013
 consequences-0.012
 supposed-0.012
Token just
Token ID655
Feature activation+5.04e-01

Pos logit contributions
 fireplace+0.042
 Ally+0.042
 roof+0.041
 Pole+0.041
 moon+0.041
Neg logit contributions
its-0.084
 supposed-0.082
ases-0.081
beh-0.080
asing-0.075
Token like
Token ID588
Feature activation-8.38e-01

Pos logit contributions
its+0.012
 supposed+0.012
ases+0.012
beh+0.012
 consequences+0.011
Neg logit contributions
 they-0.010
 roof-0.009
 Pole-0.009
 wagon-0.009
 Ally-0.009
Token Buddy
Token ID36896
Feature activation+2.13e+00

Pos logit contributions
 Pole+0.009
 roof+0.009
 landscape+0.009
 they+0.009
 fireplace+0.009
Neg logit contributions
its-0.026
 supposed-0.025
ases-0.025
beh-0.024
 consequences-0.024
Token and
Token ID290
Feature activation-3.06e+00

Pos logit contributions
 landscape+0.019
 Pole+0.018
 roof+0.018
stack+0.017
 valley+0.017
Neg logit contributions
ases-0.037
its-0.036
arts-0.033
 mistakes-0.033
ps-0.032
Token laughed
Token ID13818
Feature activation+2.36e+00

Pos logit contributions
 wagon+0.061
 raft+0.057
 Ally+0.057
 Pole+0.055
 fireplace+0.055
Neg logit contributions
its-0.068
ases-0.067
ws-0.064
ks-0.062
beh-0.062
Token and
Token ID290
Feature activation-2.54e+00

Pos logit contributions
its+0.050
ases+0.048
 consequences+0.044
arts+0.043
 mistakes+0.043
Neg logit contributions
 Pole-0.050
stack-0.049
 Mill-0.048
 Ally-0.047
 roof-0.047
Token ate
Token ID15063
Feature activation-3.22e-01

Pos logit contributions
 Pole+0.050
 wagon+0.050
 roof+0.049
 fireplace+0.048
 landscape+0.048
Neg logit contributions
ases-0.055
its-0.054
beh-0.053
 supposed-0.052
ws-0.050
Token ch
Token ID442
Feature activation+5.49e+00

Pos logit contributions
 Ally+0.005
 Pole+0.004
 roof+0.004
 they+0.004
 Mill+0.004
Neg logit contributions
its-0.010
ases-0.009
ws-0.008
 consequences-0.008
arts-0.008
Tokenocol
Token ID4668
Feature activation+1.39e+01

Pos logit contributions
 supposed+0.094
beh+0.090
its+0.087
 consequences+0.081
vern+0.080
Neg logit contributions
 they-0.151
 The-0.141
 he-0.138
 Big-0.136
 Tim-0.134
Tokenates
Token ID689
Feature activation+7.00e-01

Pos logit contributions
its+0.206
beh+0.174
 supposed+0.169
ases+0.159
 consequences+0.157
Neg logit contributions
 they-0.416
 he-0.389
 moon-0.388
 The-0.385
 Big-0.381
Token.
Token ID13
Feature activation+4.83e-01

Pos logit contributions
its+0.020
ases+0.018
beh+0.017
 supposed+0.017
arts+0.017
Neg logit contributions
 Ally-0.011
 roof-0.011
 landscape-0.011
 fireplace-0.011
 Mill-0.011
Token They
Token ID1119
Feature activation-8.98e-01

Pos logit contributions
its+0.013
ases+0.013
 vitamins+0.012
 preferences+0.012
ws+0.012
Neg logit contributions
 mountain-0.007
 raft-0.007
 vehicle-0.007
 wagon-0.007
 roof-0.007
Token had
Token ID550
Feature activation+3.58e+00

Pos logit contributions
 they+0.015
 Ally+0.015
 wagon+0.015
 Pole+0.015
 fireplace+0.014
Neg logit contributions
its-0.024
ases-0.023
beh-0.021
 supposed-0.021
arts-0.021
Token a
Token ID257
Feature activation+1.31e+00

Pos logit contributions
its+0.115
ases+0.109
beh+0.106
 supposed+0.105
 consequences+0.100
Neg logit contributions
 they-0.052
 he-0.045
 she-0.043
 The-0.043
 roof-0.042
Token was
Token ID373
Feature activation+4.23e+00

Pos logit contributions
 they+0.016
 Ally+0.015
 The+0.014
 while+0.014
 Big+0.014
Neg logit contributions
its-0.039
ases-0.036
beh-0.035
 supposed-0.035
ws-0.034
Token a
Token ID257
Feature activation+6.97e-01

Pos logit contributions
its+0.123
 supposed+0.118
ases+0.115
ws+0.109
 consequences+0.109
Neg logit contributions
 they-0.065
 Ally-0.057
 The-0.057
 landscape-0.056
 he-0.055
Token box
Token ID3091
Feature activation+8.75e-01

Pos logit contributions
its+0.017
 supposed+0.016
vern+0.015
 consequences+0.015
 liar+0.015
Neg logit contributions
 they-0.013
 The-0.013
 Ally-0.012
 Little-0.012
 Mill-0.012
Token of
Token ID286
Feature activation-1.02e+00

Pos logit contributions
its+0.026
ases+0.024
 supposed+0.024
beh+0.023
arts+0.022
Neg logit contributions
 they-0.013
 Ally-0.012
 Big-0.012
 The-0.012
 Sun-0.012
Token ch
Token ID442
Feature activation+4.48e+00

Pos logit contributions
 they+0.020
 The+0.018
 he+0.018
 Ally+0.018
 she+0.017
Neg logit contributions
its-0.025
ases-0.024
 consequences-0.023
 supposed-0.023
beh-0.022
Tokenocol
Token ID4668
Feature activation+1.38e+01

Pos logit contributions
its+0.075
 supposed+0.069
beh+0.068
vern+0.062
ases+0.061
Neg logit contributions
 they-0.129
 The-0.121
 he-0.119
 roof-0.118
 moon-0.117
Tokenates
Token ID689
Feature activation-8.95e-01

Pos logit contributions
its+0.201
beh+0.162
 supposed+0.156
ases+0.148
 consequences+0.146
Neg logit contributions
 they-0.412
 moon-0.387
 he-0.384
 fields-0.382
 roof-0.382
Token.
Token ID13
Feature activation+6.12e-02

Pos logit contributions
 Ally+0.014
 they+0.014
 The+0.013
 Big+0.013
 Sun+0.013
Neg logit contributions
its-0.025
 supposed-0.024
ases-0.024
arts-0.023
beh-0.023
Token Lily
Token ID20037
Feature activation+2.03e+00

Pos logit contributions
its+0.002
ases+0.002
ws+0.002
 vitamins+0.002
olds+0.002
Neg logit contributions
 landscape-0.001
 roof-0.001
 they-0.001
 raft-0.001
 Ally-0.001
Token loved
Token ID6151
Feature activation+1.92e-02

Pos logit contributions
its+0.061
ases+0.057
 supposed+0.055
beh+0.055
ws+0.053
Neg logit contributions
 they-0.032
 he-0.028
 Ally-0.027
 roof-0.027
 Big-0.027
Token ch
Token ID442
Feature activation+5.33e+00

Pos logit contributions
its+0.001
ases+0.001
 supposed+0.001
beh+0.001
ws+0.001
Neg logit contributions
 they-0.000
 roof-0.000
 The-0.000
 he-0.000
 wagon-0.000
Token saw
Token ID2497
Feature activation-2.13e+00

Pos logit contributions
its+0.051
ases+0.048
 supposed+0.047
beh+0.047
ws+0.044
Neg logit contributions
 they-0.027
 roof-0.023
 Mill-0.023
 Ally-0.023
 �-0.022
Token some
Token ID617
Feature activation-1.37e+00

Pos logit contributions
 they+0.031
 Pole+0.029
 landscape+0.029
 Ally+0.029
 roof+0.029
Neg logit contributions
its-0.061
ases-0.060
 consequences-0.057
 supposed-0.057
beh-0.055
Token other
Token ID584
Feature activation-1.56e+00

Pos logit contributions
 they+0.028
 he+0.025
 The+0.025
 she+0.024
 Ally+0.024
Neg logit contributions
its-0.036
ases-0.034
beh-0.032
 supposed-0.032
 consequences-0.031
Token kids
Token ID3988
Feature activation+5.58e-01

Pos logit contributions
 they+0.038
 he+0.035
 The+0.034
 she+0.034
 roof+0.033
Neg logit contributions
its-0.035
ases-0.033
beh-0.031
 supposed-0.030
 consequences-0.029
Token mis
Token ID2984
Feature activation+1.11e+01

Pos logit contributions
its+0.015
ases+0.014
beh+0.014
 supposed+0.014
asing+0.013
Neg logit contributions
 they-0.010
 The-0.010
 Ally-0.010
 moon-0.009
 Big-0.009
Tokenbeh
Token ID20709
Feature activation+1.38e+01

Pos logit contributions
its+0.214
 vitamins+0.198
 fault+0.190
ws+0.189
 supposed+0.181
Neg logit contributions
 while-0.268
 they-0.261
 Ally-0.240
 Tim-0.239
 raft-0.239
Tokenaving
Token ID2703
Feature activation-4.44e-01

Pos logit contributions
its+0.367
 consequences+0.329
vern+0.318
 vitamins+0.306
 preferences+0.306
Neg logit contributions
 they-0.242
 The-0.241
 Tim-0.236
 While-0.236
 Mill-0.227
Token and
Token ID290
Feature activation-4.50e+00

Pos logit contributions
 they+0.006
 roof+0.006
 moon+0.005
 fireplace+0.005
 he+0.005
Neg logit contributions
its-0.014
ases-0.013
beh-0.013
 supposed-0.013
 consequences-0.012
Token running
Token ID2491
Feature activation+1.33e+00

Pos logit contributions
 fireplace+0.060
 they+0.059
 wagon+0.059
 landscape+0.058
 moon+0.058
Neg logit contributions
its-0.131
beh-0.128
ases-0.127
 supposed-0.125
 consequences-0.118
Token around
Token ID1088
Feature activation-3.22e+00

Pos logit contributions
its+0.040
ases+0.038
beh+0.037
 supposed+0.036
 consequences+0.035
Neg logit contributions
 they-0.020
 moon-0.018
 fireplace-0.018
 roof-0.018
 he-0.017
Token.
Token ID13
Feature activation-6.70e-01

Pos logit contributions
 they+0.042
 roof+0.041
 moon+0.040
 Pole+0.040
 landscape+0.039
Neg logit contributions
its-0.097
ases-0.096
 supposed-0.090
beh-0.089
 consequences-0.088
Token for
Token ID329
Feature activation-8.75e-01

Pos logit contributions
 landscape+0.004
 roof+0.004
 Ally+0.004
 fireplace+0.004
 valley+0.004
Neg logit contributions
its-0.007
ases-0.007
 supposed-0.007
beh-0.007
asing-0.006
Token being
Token ID852
Feature activation+5.38e+00

Pos logit contributions
 they+0.012
 he+0.010
 roof+0.010
 The+0.010
 she+0.010
Neg logit contributions
its-0.028
ases-0.027
beh-0.026
 supposed-0.026
 consequences-0.025
Token so
Token ID523
Feature activation+1.90e+00

Pos logit contributions
its+0.164
ases+0.156
beh+0.152
 supposed+0.149
 consequences+0.142
Neg logit contributions
 they-0.086
 he-0.075
 roof-0.072
 she-0.072
 The-0.071
Token well
Token ID880
Feature activation+1.15e+00

Pos logit contributions
its+0.044
ases+0.041
beh+0.040
 supposed+0.038
 consequences+0.036
Neg logit contributions
 they-0.044
 he-0.041
 The-0.039
 she-0.039
 roof-0.039
Token-
Token ID12
Feature activation+5.11e+00

Pos logit contributions
its+0.035
ases+0.033
beh+0.033
 supposed+0.033
 consequences+0.031
Neg logit contributions
 they-0.016
 roof-0.016
 fireplace-0.015
 moon-0.015
 The-0.015
Tokenbeh
Token ID20709
Feature activation+1.37e+01

Pos logit contributions
its+0.078
beh+0.071
ases+0.069
 supposed+0.063
asing+0.057
Neg logit contributions
 they-0.160
 he-0.149
 roof-0.148
 moon-0.147
 fireplace-0.147
Tokenaved
Token ID9586
Feature activation-8.20e-01

Pos logit contributions
its+0.357
 consequences+0.310
ases+0.300
vern+0.298
 supposed+0.289
Neg logit contributions
 they-0.263
 The-0.261
 Tim-0.240
 while-0.239
 �-0.238
Token and
Token ID290
Feature activation-3.64e+00

Pos logit contributions
 they+0.011
 roof+0.010
 moon+0.010
 fireplace+0.010
 The+0.010
Neg logit contributions
its-0.025
ases-0.024
beh-0.023
 supposed-0.023
 consequences-0.023
Token Jack
Token ID3619
Feature activation+2.06e+00

Pos logit contributions
 they+0.065
 roof+0.059
 The+0.058
 Ally+0.058
 wagon+0.058
Neg logit contributions
its-0.099
beh-0.093
ases-0.093
 supposed-0.092
 consequences-0.086
Token was
Token ID373
Feature activation+4.68e+00

Pos logit contributions
its+0.062
ases+0.058
beh+0.058
 supposed+0.057
arts+0.054
Neg logit contributions
 they-0.030
 roof-0.027
 landscape-0.027
 Ally-0.026
 fireplace-0.026
Token delighted
Token ID20707
Feature activation+1.11e+00

Pos logit contributions
its+0.130
ases+0.122
 supposed+0.122
beh+0.121
vern+0.111
Neg logit contributions
 they-0.084
 he-0.074
 roof-0.074
 landscape-0.073
 wagon-0.072
Token It
Token ID632
Feature activation-6.83e-01

Pos logit contributions
 mountain+0.013
 Pole+0.013
 he+0.013
 yacht+0.013
 while+0.013
Neg logit contributions
its-0.024
 supposed-0.023
icks-0.022
beh-0.022
olds-0.022
Token was
Token ID373
Feature activation+3.74e+00

Pos logit contributions
 they+0.010
 Pole+0.010
 roof+0.010
 wagon+0.009
 Ally+0.009
Neg logit contributions
its-0.021
ases-0.018
 supposed-0.018
beh-0.018
 consequences-0.017
Token sweet
Token ID6029
Feature activation+2.11e+00

Pos logit contributions
its+0.108
 supposed+0.100
ases+0.098
beh+0.095
 consequences+0.093
Neg logit contributions
 they-0.063
 Pole-0.057
 roof-0.056
 he-0.056
 landscape-0.056
Token,
Token ID11
Feature activation-2.36e+00

Pos logit contributions
its+0.073
ases+0.068
 supposed+0.067
beh+0.066
 consequences+0.065
Neg logit contributions
 they-0.023
 roof-0.020
 Ally-0.019
 The-0.019
 Big-0.019
Token ch
Token ID442
Feature activation+4.55e+00

Pos logit contributions
 Pole+0.043
 Ally+0.041
 valley+0.040
 landscape+0.040
 wagon+0.039
Neg logit contributions
its-0.058
 supposed-0.054
 vitamins-0.053
ases-0.051
 preferences-0.050
Tokenocol
Token ID4668
Feature activation+1.37e+01

Pos logit contributions
its+0.077
 supposed+0.077
beh+0.075
ases+0.068
vern+0.068
Neg logit contributions
 they-0.122
 The-0.117
 he-0.114
 she-0.113
 Tim-0.112
Tokenat
Token ID265
Feature activation+1.90e+00

Pos logit contributions
its+0.189
beh+0.158
 supposed+0.146
ases+0.146
 consequences+0.137
Neg logit contributions
 they-0.423
 moon-0.397
 The-0.395
 he-0.394
 Mill-0.392
Tokeny
Token ID88
Feature activation-4.91e-02

Pos logit contributions
its+0.055
ases+0.050
arts+0.049
ump+0.048
beh+0.048
Neg logit contributions
 Ally-0.030
 they-0.030
 landscape-0.028
 wagon-0.027
 valley-0.027
Token and
Token ID290
Feature activation-4.25e+00

Pos logit contributions
 they+0.001
 Ally+0.001
 Pole+0.001
 Big+0.001
 landscape+0.001
Neg logit contributions
its-0.002
ases-0.001
 supposed-0.001
arts-0.001
beh-0.001
Token warm
Token ID5814
Feature activation+1.27e-02

Pos logit contributions
 Pole+0.074
 landscape+0.067
 wagon+0.066
 roof+0.065
 valley+0.065
Neg logit contributions
its-0.108
 supposed-0.103
 vitamins-0.099
beh-0.097
ases-0.095
Token,
Token ID11
Feature activation-2.98e+00

Pos logit contributions
its+0.000
 supposed+0.000
 consequences+0.000
arts+0.000
ases+0.000
Neg logit contributions
 Pole-0.000
 roof-0.000
 Ally-0.000
 they-0.000
 landscape-0.000
Token colorful
Token ID20239
Feature activation-2.35e+00

Pos logit contributions
 they+0.048
 he+0.043
 she+0.041
 while+0.040
 The+0.039
Neg logit contributions
its-0.078
ases-0.074
beh-0.072
 supposed-0.070
ws-0.067
Token cand
Token ID2658
Feature activation+3.87e+00

Pos logit contributions
 they+0.046
 The+0.042
 he+0.042
 �+0.040
 Big+0.040
Neg logit contributions
its-0.061
ases-0.057
beh-0.054
 supposed-0.053
 consequences-0.052
Tokenies
Token ID444
Feature activation-2.12e+00

Pos logit contributions
its+0.101
ases+0.099
beh+0.095
 supposed+0.091
 consequences+0.089
Neg logit contributions
 they-0.081
 he-0.072
 she-0.071
 while-0.068
 roof-0.065
Token and
Token ID290
Feature activation-5.82e+00

Pos logit contributions
 The+0.027
 Big+0.026
 roof+0.025
 Ally+0.025
 Sun+0.025
Neg logit contributions
its-0.065
 supposed-0.062
ases-0.060
beh-0.059
arts-0.059
Token ch
Token ID442
Feature activation+4.75e+00

Pos logit contributions
 they+0.084
 The+0.083
 Pole+0.082
 wagon+0.082
 Big+0.081
Neg logit contributions
its-0.161
 supposed-0.150
ases-0.147
beh-0.147
 vitamins-0.144
Tokenocol
Token ID4668
Feature activation+1.36e+01

Pos logit contributions
its+0.084
 supposed+0.081
beh+0.077
 consequences+0.074
vern+0.074
Neg logit contributions
 they-0.129
 Ally-0.118
 The-0.117
 Mill-0.116
 he-0.115
Tokenates
Token ID689
Feature activation-1.38e-02

Pos logit contributions
its+0.201
beh+0.162
 supposed+0.155
ases+0.152
 consequences+0.150
Neg logit contributions
 they-0.402
 Mill-0.378
 moon-0.372
 The-0.370
 he-0.369
Token.
Token ID13
Feature activation-1.19e+00

Pos logit contributions
 they+0.000
 The+0.000
 Big+0.000
 roof+0.000
 fireplace+0.000
Neg logit contributions
its-0.000
ases-0.000
 supposed-0.000
beh-0.000
arts-0.000
Token She
Token ID1375
Feature activation-2.25e+00

Pos logit contributions
 raft+0.021
 distance+0.019
 mountain+0.019
 jet+0.018
 boat+0.018
Neg logit contributions
its-0.026
 vitamins-0.026
olds-0.025
 preferences-0.025
ases-0.024
Token said
Token ID531
Feature activation-5.81e-01

Pos logit contributions
 they+0.037
 roof+0.035
 fireplace+0.035
 wagon+0.034
 Big+0.034
Neg logit contributions
its-0.062
ases-0.055
beh-0.055
 supposed-0.055
arts-0.053
Token to
Token ID284
Feature activation-1.47e+00

Pos logit contributions
 landscape+0.007
 fireplace+0.006
 moon+0.006
 roof+0.006
 distance+0.006
Neg logit contributions
its-0.017
ases-0.017
arts-0.016
icks-0.016
vern-0.016
Token enjoyed
Token ID8359
Feature activation-7.93e-01

Pos logit contributions
 Pole+0.068
 fireplace+0.064
 landscape+0.063
 roof+0.062
 moon+0.059
Neg logit contributions
beh-0.110
ases-0.109
its-0.108
ws-0.100
 supposed-0.099
Token her
Token ID607
Feature activation+2.34e+00

Pos logit contributions
 they+0.008
 roof+0.007
 Pole+0.007
 landscape+0.006
 fireplace+0.006
Neg logit contributions
its-0.028
ases-0.027
 supposed-0.026
beh-0.026
arts-0.025
Token cool
Token ID3608
Feature activation+1.10e+00

Pos logit contributions
its+0.062
ases+0.057
 supposed+0.056
beh+0.055
 consequences+0.053
Neg logit contributions
 they-0.045
 he-0.041
 Ally-0.040
 The-0.040
 she-0.040
Token,
Token ID11
Feature activation-2.57e+00

Pos logit contributions
its+0.030
ases+0.027
beh+0.026
 supposed+0.025
ws+0.025
Neg logit contributions
 they-0.021
 Ally-0.019
 he-0.018
 Big-0.018
 she-0.018
Token ch
Token ID442
Feature activation+4.73e+00

Pos logit contributions
 Pole+0.045
 Ally+0.041
 they+0.041
 roof+0.041
 landscape+0.041
Neg logit contributions
 supposed-0.061
its-0.061
beh-0.058
ases-0.057
 vitamins-0.056
Tokenocol
Token ID4668
Feature activation+1.35e+01

Pos logit contributions
 supposed+0.085
beh+0.080
its+0.078
 consequences+0.075
vern+0.072
Neg logit contributions
 they-0.124
 he-0.115
 The-0.112
 she-0.111
 while-0.108
Tokenat
Token ID265
Feature activation+1.24e+00

Pos logit contributions
its+0.192
beh+0.156
 supposed+0.153
 consequences+0.147
ases+0.143
Neg logit contributions
 they-0.405
 he-0.378
 moon-0.376
 The-0.372
 Mill-0.371
Tokeny
Token ID88
Feature activation-7.60e-01

Pos logit contributions
its+0.033
ases+0.032
arts+0.031
beh+0.031
ump+0.030
Neg logit contributions
 Ally-0.020
 they-0.019
 The-0.019
 While-0.019
 landscape-0.018
Token treat
Token ID2190
Feature activation-1.66e-02

Pos logit contributions
 they+0.012
 Ally+0.011
 The+0.010
 Pole+0.010
 Big+0.010
Neg logit contributions
its-0.022
ases-0.021
 supposed-0.020
beh-0.020
arts-0.020
Token!
Token ID0
Feature activation-5.07e-01

Pos logit contributions
 roof+0.000
 they+0.000
 Pole+0.000
 landscape+0.000
 The+0.000
Neg logit contributions
its-0.001
ases-0.000
beh-0.000
 supposed-0.000
arts-0.000
Token
Token ID198
Feature activation+2.26e+00

Pos logit contributions
 landscape+0.007
 roof+0.007
 raft+0.007
 moon+0.007
 train+0.007
Neg logit contributions
its-0.014
ases-0.014
ump-0.013
beh-0.013
olds-0.013
Token a
Token ID257
Feature activation-2.72e+00

Pos logit contributions
 Pole+0.038
 Mill+0.038
 fireplace+0.037
 they+0.037
 Ally+0.036
Neg logit contributions
its-0.061
ases-0.060
 consequences-0.058
 supposed-0.055
 vitamins-0.052
Token big
Token ID1263
Feature activation-3.37e+00

Pos logit contributions
 they+0.048
 The+0.047
 Mill+0.045
 he+0.043
 Ally+0.043
Neg logit contributions
its-0.069
 supposed-0.063
 liar-0.059
 consequences-0.059
vern-0.058
Token pond
Token ID16723
Feature activation-3.26e+00

Pos logit contributions
 they+0.053
 The+0.046
 Ally+0.046
 Mill+0.045
 he+0.045
Neg logit contributions
its-0.090
ases-0.086
 supposed-0.082
 cheating-0.081
 consequences-0.080
Token with
Token ID351
Feature activation-3.16e+00

Pos logit contributions
 they+0.048
 Ally+0.044
 Big+0.044
 Mill+0.043
 he+0.043
Neg logit contributions
its-0.100
ases-0.094
beh-0.090
 supposed-0.089
arts-0.084
Token ducks
Token ID39694
Feature activation-1.66e+00

Pos logit contributions
 they+0.054
 he+0.048
 she+0.047
 �+0.046
 Ally+0.045
Neg logit contributions
its-0.091
ases-0.087
beh-0.082
 supposed-0.082
 consequences-0.079
Token and
Token ID290
Feature activation-7.92e+00

Pos logit contributions
 they+0.024
 Mill+0.022
 The+0.022
 Ally+0.022
 he+0.022
Neg logit contributions
its-0.052
ases-0.049
 supposed-0.047
beh-0.047
arts-0.044
Token fish
Token ID5916
Feature activation-8.86e-01

Pos logit contributions
 �+0.144
 fireplace+0.143
 they+0.139
 Pole+0.139
 Mill+0.137
Neg logit contributions
its-0.184
ases-0.183
beh-0.169
 supposed-0.167
 consequences-0.167
Token.
Token ID13
Feature activation-1.20e+00

Pos logit contributions
 they+0.012
 he+0.010
 she+0.009
 The+0.009
 while+0.009
Neg logit contributions
its-0.030
ases-0.028
beh-0.028
 supposed-0.027
 consequences-0.026
Token Tom
Token ID4186
Feature activation+1.43e+00

Pos logit contributions
 fireplace+0.016
 moon+0.016
 Peter+0.015
 Ally+0.015
 valley+0.014
Neg logit contributions
its-0.038
ases-0.035
olds-0.032
beh-0.031
 supposed-0.031
Token wants
Token ID3382
Feature activation+1.51e-01

Pos logit contributions
its+0.033
ases+0.031
beh+0.030
 supposed+0.029
ws+0.027
Neg logit contributions
 they-0.033
 he-0.031
 she-0.030
 The-0.030
 roof-0.030
Token to
Token ID284
Feature activation+1.58e+00

Pos logit contributions
its+0.005
ases+0.005
 supposed+0.004
beh+0.004
 consequences+0.004
Neg logit contributions
 they-0.002
 fireplace-0.002
 roof-0.002
 wagon-0.002
 landscape-0.002
Token.
Token ID13
Feature activation+6.47e-03

Pos logit contributions
 roof+0.005
 raft+0.005
 While+0.005
 landscape+0.005
 wagon+0.005
Neg logit contributions
 supposed-0.007
its-0.007
beh-0.007
ases-0.007
asing-0.007
Token<|endoftext|>
Token ID50256
Feature activation+1.05e+00

Pos logit contributions
its+0.000
ases+0.000
beh+0.000
 supposed+0.000
olds+0.000
Neg logit contributions
 raft-0.000
 they-0.000
 roof-0.000
 landscape-0.000
 wagon-0.000
TokenMax
Token ID11518
Feature activation+2.28e+00

Pos logit contributions
its+0.031
ases+0.030
ump+0.028
 supposed+0.028
beh+0.027
Neg logit contributions
 they-0.016
 roof-0.015
 raft-0.014
 wagon-0.014
 landscape-0.013
Token was
Token ID373
Feature activation+1.52e+00

Pos logit contributions
its+0.062
ases+0.061
 supposed+0.059
ws+0.058
beh+0.056
Neg logit contributions
 they-0.036
 roof-0.035
 Pole-0.032
 while-0.031
 raft-0.031
Token walking
Token ID6155
Feature activation-2.41e+00

Pos logit contributions
its+0.042
 supposed+0.041
ases+0.040
beh+0.039
ump+0.037
Neg logit contributions
 they-0.026
 he-0.023
 Pole-0.023
 landscape-0.023
 roof-0.023
Token through
Token ID832
Feature activation-7.82e+00

Pos logit contributions
 Pole+0.052
 chim+0.047
 fireplace+0.044
 roof+0.043
 moon+0.041
Neg logit contributions
ases-0.046
beh-0.041
 supposed-0.040
 consequences-0.040
asured-0.039
Token the
Token ID262
Feature activation-5.22e+00

Pos logit contributions
 Pole+0.114
 they+0.098
 roof+0.092
stack+0.091
 Sure+0.089
Neg logit contributions
ases-0.206
its-0.205
 consequences-0.203
 supposed-0.199
 trouble-0.194
Token park
Token ID3952
Feature activation-4.86e+00

Pos logit contributions
 they+0.102
stack+0.090
 while+0.089
 Sure+0.085
 he+0.084
Neg logit contributions
 supposed-0.106
its-0.105
 consequences-0.103
ases-0.102
 mistakes-0.100
Token.
Token ID13
Feature activation-2.21e+00

Pos logit contributions
 Pole+0.076
 fireplace+0.076
 Ally+0.074
 Big+0.072
 Sun+0.071
Neg logit contributions
its-0.124
ases-0.122
 supposed-0.119
ump-0.117
beh-0.113
Token He
Token ID679
Feature activation-1.74e+00

Pos logit contributions
 they+0.033
 raft+0.029
 jet+0.029
 valley+0.029
 landscape+0.028
Neg logit contributions
its-0.060
 supposed-0.058
ases-0.057
 consequences-0.053
ump-0.053
Token saw
Token ID2497
Feature activation-3.70e+00

Pos logit contributions
 fireplace+0.029
 Mill+0.027
 roof+0.027
 they+0.027
 wagon+0.027
Neg logit contributions
its-0.043
ases-0.041
 supposed-0.039
oken-0.038
 tricked-0.038
Token.
Token ID13
Feature activation+5.28e-01

Pos logit contributions
its+0.041
ases+0.039
beh+0.038
 supposed+0.038
 consequences+0.036
Neg logit contributions
 they-0.018
 roof-0.016
 The-0.016
 he-0.015
 Ally-0.015
Token<|endoftext|>
Token ID50256
Feature activation+9.65e-01

Pos logit contributions
its+0.018
ases+0.017
beh+0.016
 supposed+0.016
ump+0.016
Neg logit contributions
 they-0.006
 raft-0.006
 wagon-0.005
 roof-0.005
 train-0.005
TokenJohn
Token ID7554
Feature activation+5.94e-01

Pos logit contributions
its+0.032
ases+0.030
 supposed+0.028
beh+0.028
ump+0.028
Neg logit contributions
 they-0.012
 roof-0.011
 raft-0.010
 he-0.010
 wagon-0.010
Token was
Token ID373
Feature activation+1.54e+00

Pos logit contributions
its+0.017
ases+0.016
 supposed+0.016
ws+0.015
beh+0.015
Neg logit contributions
 they-0.010
 roof-0.009
 Ally-0.009
 landscape-0.008
 raft-0.008
Token walking
Token ID6155
Feature activation-2.38e+00

Pos logit contributions
its+0.043
 supposed+0.041
ases+0.041
beh+0.040
 consequences+0.037
Neg logit contributions
 they-0.027
 he-0.024
 landscape-0.023
 roof-0.023
 Ally-0.023
Token through
Token ID832
Feature activation-7.81e+00

Pos logit contributions
 Pole+0.046
 chim+0.044
 fireplace+0.042
 roof+0.041
stack+0.039
Neg logit contributions
ases-0.051
beh-0.045
 supposed-0.044
asured-0.044
its-0.043
Token the
Token ID262
Feature activation-5.18e+00

Pos logit contributions
 Pole+0.107
 they+0.097
stack+0.090
 roof+0.089
 Sure+0.087
Neg logit contributions
ases-0.211
its-0.206
 supposed-0.201
 consequences-0.201
ief-0.194
Token park
Token ID3952
Feature activation-4.93e+00

Pos logit contributions
 they+0.104
stack+0.093
 while+0.091
 �+0.087
 Sure+0.087
Neg logit contributions
 supposed-0.105
its-0.102
ases-0.100
 consequences-0.099
 mistakes-0.097
Token one
Token ID530
Feature activation-1.40e+00

Pos logit contributions
 Ally+0.078
 Pole+0.077
 fireplace+0.077
 Mill+0.075
 Big+0.074
Neg logit contributions
its-0.125
ases-0.125
 supposed-0.120
ump-0.117
beh-0.115
Token sunny
Token ID27737
Feature activation-4.06e+00

Pos logit contributions
 they+0.024
 Mill+0.023
 Ally+0.022
 Pole+0.022
 fireplace+0.022
Neg logit contributions
its-0.038
ases-0.037
beh-0.034
 supposed-0.034
asing-0.033
Token day
Token ID1110
Feature activation-4.55e+00

Pos logit contributions
 they+0.063
 fireplace+0.063
 �+0.062
 Pole+0.062
 Big+0.059
Neg logit contributions
ases-0.100
its-0.100
ump-0.095
 supposed-0.093
 consequences-0.090
Token and
Token ID290
Feature activation-4.97e+00

Pos logit contributions
its+0.024
ases+0.023
 supposed+0.023
ws+0.022
beh+0.021
Neg logit contributions
 they-0.013
 roof-0.013
 raft-0.012
 Pole-0.012
 Ally-0.012
Token her
Token ID607
Feature activation-4.34e-01

Pos logit contributions
 Pole+0.091
 chim+0.080
 fireplace+0.079
 roof+0.078
 mountain+0.076
Neg logit contributions
 supposed-0.120
ases-0.120
its-0.115
beh-0.115
 humor-0.107
Token dad
Token ID9955
Feature activation-4.31e-01

Pos logit contributions
 they+0.007
 Mill+0.007
 Ally+0.006
 he+0.006
 while+0.006
Neg logit contributions
its-0.011
 supposed-0.011
ases-0.011
 consequences-0.010
 fault-0.010
Token were
Token ID547
Feature activation+2.10e+00

Pos logit contributions
 roof+0.006
 they+0.006
 Ally+0.006
 Sun+0.006
 Pole+0.006
Neg logit contributions
its-0.012
ases-0.012
 supposed-0.011
ws-0.011
beh-0.011
Token walking
Token ID6155
Feature activation-2.54e+00

Pos logit contributions
its+0.055
 supposed+0.053
ases+0.052
beh+0.051
ump+0.048
Neg logit contributions
 they-0.038
 he-0.036
 roof-0.035
 landscape-0.035
 moon-0.034
Token through
Token ID832
Feature activation-7.78e+00

Pos logit contributions
 Pole+0.055
 chim+0.049
 fireplace+0.048
 roof+0.046
 moon+0.046
Neg logit contributions
ases-0.051
beh-0.047
 supposed-0.042
ac-0.041
ips-0.041
Token a
Token ID257
Feature activation-4.66e+00

Pos logit contributions
 Pole+0.109
stack+0.092
 roof+0.087
 Sure+0.086
 they+0.084
Neg logit contributions
ases-0.210
its-0.207
 consequences-0.201
 supposed-0.200
ief-0.193
Token field
Token ID2214
Feature activation-5.56e+00

Pos logit contributions
stack+0.084
 they+0.077
 Wil+0.076
 Mill+0.074
 Pole+0.073
Neg logit contributions
its-0.102
 supposed-0.091
vern-0.089
unts-0.089
ases-0.089
Token on
Token ID319
Feature activation-4.32e+00

Pos logit contributions
 Pole+0.101
 Big+0.093
 Bobby+0.091
 wagon+0.091
 fireplace+0.091
Neg logit contributions
its-0.132
ases-0.126
ump-0.125
 supposed-0.121
beh-0.114
Token a
Token ID257
Feature activation-3.67e+00

Pos logit contributions
 Pole+0.059
 Sure+0.050
 they+0.050
 he+0.050
 �+0.049
Neg logit contributions
its-0.120
 consequences-0.115
ases-0.115
 supposed-0.107
 vitamins-0.105
Token sunny
Token ID27737
Feature activation-4.07e+00

Pos logit contributions
 they+0.060
 Tim+0.057
 The+0.057
 Mill+0.055
 he+0.055
Neg logit contributions
its-0.090
 supposed-0.084
vern-0.083
 consequences-0.077
beh-0.075
Token.
Token ID13
Feature activation-1.88e+00

Pos logit contributions
 Pole+0.096
 Ally+0.095
 Mill+0.089
 wagon+0.088
 Big+0.087
Neg logit contributions
its-0.129
ases-0.128
 supposed-0.122
ump-0.122
beh-0.115
Token Suddenly
Token ID24975
Feature activation-7.13e+00

Pos logit contributions
 raft+0.029
 helicopter+0.029
 valley+0.029
 jet+0.029
 yacht+0.028
Neg logit contributions
ases-0.042
its-0.041
 supposed-0.039
beh-0.038
olds-0.037
Token,
Token ID11
Feature activation-6.44e+00

Pos logit contributions
stack+0.106
 horizon+0.100
 ranch+0.097
 While+0.097
 landscape+0.096
Neg logit contributions
its-0.159
 consequences-0.154
ases-0.152
ump-0.147
arts-0.146
Token the
Token ID262
Feature activation-2.70e+00

Pos logit contributions
 Pole+0.077
 exp+0.069
 horizon+0.069
 roof+0.066
 landscape+0.066
Neg logit contributions
 trouble-0.154
its-0.153
 supposed-0.151
ases-0.147
 consequences-0.144
Token sky
Token ID6766
Feature activation-3.63e+00

Pos logit contributions
 they+0.047
 he+0.042
 The+0.041
 while+0.039
 Tim+0.038
Neg logit contributions
its-0.074
 supposed-0.072
 consequences-0.071
ases-0.070
beh-0.068
Token darkened
Token ID44261
Feature activation-7.76e+00

Pos logit contributions
stack+0.057
 Sure+0.057
 Ally+0.055
 wagon+0.055
 Higher+0.054
Neg logit contributions
its-0.102
arts-0.082
 supposed-0.078
ump-0.077
 consequences-0.077
Token and
Token ID290
Feature activation-6.85e+00

Pos logit contributions
 Pole+0.145
 Sure+0.143
 While+0.142
 Ally+0.139
 Higher+0.139
Neg logit contributions
 consequences-0.156
its-0.155
ases-0.145
ps-0.145
 supposed-0.144
Token split
Token ID6626
Feature activation-1.51e+00

Pos logit contributions
 ranch+0.097
 Pole+0.095
 wagon+0.094
 yacht+0.094
 while+0.091
Neg logit contributions
its-0.154
 consequences-0.147
 trouble-0.146
ps-0.143
 tricked-0.139
Token open
Token ID1280
Feature activation-4.09e+00

Pos logit contributions
 they+0.026
 Pole+0.025
 Sure+0.025
 he+0.024
 Ally+0.024
Neg logit contributions
its-0.041
ases-0.037
arts-0.036
 supposed-0.035
ump-0.035
Token.
Token ID13
Feature activation-2.04e+00

Pos logit contributions
 they+0.082
 Ally+0.077
 Pole+0.075
 Sure+0.074
 he+0.072
Neg logit contributions
its-0.102
ases-0.092
arts-0.086
 consequences-0.085
 supposed-0.084
Token Rain
Token ID10301
Feature activation-4.41e-01

Pos logit contributions
 yacht+0.026
 raft+0.025
 valley+0.025
 landscape+0.024
 while+0.024
Neg logit contributions
its-0.054
 supposed-0.050
ump-0.048
ases-0.047
 trouble-0.047
Token<|endoftext|>
Token ID50256
Feature activation+1.29e+00

Pos logit contributions
 they+0.001
 roof+0.001
 landscape+0.001
 train+0.001
 raft+0.001
Neg logit contributions
its-0.002
ases-0.002
beh-0.002
ump-0.002
 supposed-0.002
TokenMike
Token ID16073
Feature activation+3.38e+00

Pos logit contributions
its+0.039
ases+0.037
ump+0.035
 supposed+0.034
beh+0.034
Neg logit contributions
 they-0.019
 roof-0.018
 raft-0.017
 wagon-0.016
 landscape-0.016
Tokeny
Token ID88
Feature activation-1.42e+00

Pos logit contributions
its+0.100
ases+0.096
 supposed+0.091
beh+0.090
ws+0.088
Neg logit contributions
 they-0.053
 roof-0.050
 Ally-0.047
 Sun-0.047
 �-0.046
Token was
Token ID373
Feature activation+1.71e+00

Pos logit contributions
 roof+0.020
 they+0.020
 wagon+0.019
 Ally+0.019
 raft+0.019
Neg logit contributions
its-0.038
ases-0.036
 supposed-0.036
arts-0.034
beh-0.033
Token walking
Token ID6155
Feature activation-2.12e+00

Pos logit contributions
its+0.045
 supposed+0.044
ases+0.043
beh+0.042
ump+0.039
Neg logit contributions
 they-0.032
 he-0.028
 landscape-0.028
 Pole-0.028
 roof-0.027
Token through
Token ID832
Feature activation-7.76e+00

Pos logit contributions
 Pole+0.041
 chim+0.039
 fireplace+0.038
 roof+0.037
stack+0.035
Neg logit contributions
ases-0.044
beh-0.040
its-0.039
 supposed-0.039
 consequences-0.037
Token the
Token ID262
Feature activation-5.20e+00

Pos logit contributions
 Pole+0.105
 they+0.090
stack+0.089
 roof+0.088
 Sure+0.083
Neg logit contributions
ases-0.210
its-0.206
 consequences-0.200
 supposed-0.199
ief-0.191
Token park
Token ID3952
Feature activation-4.92e+00

Pos logit contributions
 they+0.103
stack+0.093
 while+0.091
 Sure+0.086
 he+0.085
Neg logit contributions
its-0.103
 supposed-0.103
ases-0.103
 consequences-0.100
 mistakes-0.098
Token.
Token ID13
Feature activation-2.08e+00

Pos logit contributions
 Pole+0.079
 fireplace+0.079
 Ally+0.079
 Big+0.076
 Mill+0.076
Neg logit contributions
its-0.125
ases-0.122
 supposed-0.117
ump-0.115
beh-0.113
Token He
Token ID679
Feature activation-1.86e+00

Pos logit contributions
 they+0.027
 jet+0.025
 Ally+0.024
 moon+0.023
 raft+0.023
Neg logit contributions
its-0.065
 supposed-0.060
ases-0.059
 consequences-0.055
beh-0.055
Token noticed
Token ID6810
Feature activation-4.57e+00

Pos logit contributions
 fireplace+0.031
 Mill+0.030
 roof+0.029
 wagon+0.029
 Ally+0.029
Neg logit contributions
its-0.044
ases-0.043
 supposed-0.040
 tricked-0.039
oken-0.039
Tokenandy
Token ID10757
Feature activation+1.40e+00

Pos logit contributions
its+0.007
ases+0.007
beh+0.006
 supposed+0.006
vern+0.006
Neg logit contributions
 they-0.007
 he-0.007
 The-0.006
 while-0.006
 she-0.006
Token and
Token ID290
Feature activation-4.72e+00

Pos logit contributions
its+0.037
ases+0.035
 supposed+0.034
arts+0.033
ump+0.032
Neg logit contributions
 roof-0.020
 they-0.020
 Sun-0.019
 Ally-0.019
 fireplace-0.019
Token Jane
Token ID12091
Feature activation+1.62e+00

Pos logit contributions
 Pole+0.079
 roof+0.068
 chim+0.067
 while+0.066
 fireplace+0.066
Neg logit contributions
ases-0.116
 supposed-0.114
beh-0.112
its-0.112
 humor-0.103
Token were
Token ID547
Feature activation+1.61e+00

Pos logit contributions
its+0.049
ases+0.046
 supposed+0.044
beh+0.043
ws+0.042
Neg logit contributions
 they-0.025
 he-0.023
 raft-0.022
 Ally-0.022
 Big-0.022
Token walking
Token ID6155
Feature activation-2.07e+00

Pos logit contributions
its+0.037
 supposed+0.037
ases+0.036
beh+0.035
asing+0.032
Neg logit contributions
 they-0.033
 he-0.032
 landscape-0.031
 roof-0.031
 Pole-0.031
Token through
Token ID832
Feature activation-7.75e+00

Pos logit contributions
 Pole+0.042
 chim+0.037
 fireplace+0.037
 roof+0.034
 moon+0.034
Neg logit contributions
ases-0.047
beh-0.044
 supposed-0.039
asured-0.039
its-0.039
Token the
Token ID262
Feature activation-5.22e+00

Pos logit contributions
 Pole+0.118
stack+0.092
 he+0.091
 they+0.089
 raft+0.089
Neg logit contributions
ases-0.205
its-0.200
 supposed-0.199
 consequences-0.193
ief-0.187
Token park
Token ID3952
Feature activation-4.70e+00

Pos logit contributions
 they+0.097
stack+0.093
 while+0.091
 he+0.090
 �+0.088
Neg logit contributions
 supposed-0.104
its-0.103
ases-0.099
beh-0.098
 consequences-0.097
Token.
Token ID13
Feature activation-2.26e+00

Pos logit contributions
 fireplace+0.080
 Pole+0.080
 Big+0.077
 moon+0.077
 Ally+0.075
Neg logit contributions
its-0.119
ases-0.114
 supposed-0.111
beh-0.111
ump-0.105
Token Jane
Token ID12091
Feature activation+2.64e-01

Pos logit contributions
 helicopter+0.031
 jet+0.030
 train+0.029
 raft+0.029
 plane+0.028
Neg logit contributions
its-0.066
ases-0.061
 supposed-0.060
arts-0.054
 consequences-0.054
Token had
Token ID550
Feature activation+1.60e+00

Pos logit contributions
its+0.007
 supposed+0.006
ases+0.006
beh+0.006
ump+0.006
Neg logit contributions
 fireplace-0.004
 roof-0.004
 raft-0.004
 wagon-0.004
 moon-0.004
Token walking
Token ID6155
Feature activation-1.74e+00

Pos logit contributions
its+0.096
ases+0.091
 supposed+0.089
beh+0.088
 consequences+0.083
Neg logit contributions
 they-0.057
 he-0.049
 roof-0.049
 she-0.048
 The-0.048
Token through
Token ID832
Feature activation-6.73e+00

Pos logit contributions
 roof+0.032
 fireplace+0.032
 Pole+0.031
 they+0.030
 chim+0.030
Neg logit contributions
ases-0.039
beh-0.036
its-0.036
ump-0.034
 consequences-0.034
Token a
Token ID257
Feature activation-3.66e+00

Pos logit contributions
 they+0.078
 Pole+0.075
 roof+0.070
 raft+0.064
 wagon+0.064
Neg logit contributions
its-0.207
ases-0.202
 consequences-0.198
 supposed-0.189
beh-0.185
Token forest
Token ID8222
Feature activation-4.15e+00

Pos logit contributions
 they+0.061
 he+0.049
 Tim+0.049
 she+0.048
 Mill+0.048
Neg logit contributions
its-0.099
vern-0.089
 supposed-0.089
ases-0.088
beh-0.087
Token.
Token ID13
Feature activation-2.39e+00

Pos logit contributions
 they+0.066
 Ally+0.062
 wagon+0.062
 Big+0.062
 fireplace+0.062
Neg logit contributions
its-0.114
ases-0.107
beh-0.104
 supposed-0.104
ump-0.101
Token Suddenly
Token ID24975
Feature activation-7.73e+00

Pos logit contributions
 they+0.032
 roof+0.031
 jet+0.030
 landscape+0.030
 raft+0.030
Neg logit contributions
ases-0.059
its-0.059
ump-0.056
 supposed-0.055
beh-0.054
Token,
Token ID11
Feature activation-6.37e+00

Pos logit contributions
 porch+0.107
 fireplace+0.100
 roof+0.098
 landscape+0.096
 lawn+0.094
Neg logit contributions
 consequences-0.201
 tricked-0.191
 trouble-0.180
its-0.180
 intentions-0.177
Token she
Token ID673
Feature activation-1.88e+00

Pos logit contributions
 landscape+0.075
 fireplace+0.074
 porch+0.074
 roof+0.073
 wagon+0.073
Neg logit contributions
 consequences-0.158
 tricked-0.155
 trouble-0.152
ief-0.149
its-0.146
Token saw
Token ID2497
Feature activation-3.15e+00

Pos logit contributions
 they+0.036
 wagon+0.034
 fireplace+0.033
 The+0.033
 roof+0.033
Neg logit contributions
its-0.046
ases-0.042
 supposed-0.041
arts-0.041
beh-0.041
Token a
Token ID257
Feature activation-1.97e+00

Pos logit contributions
 they+0.052
 Pole+0.050
 roof+0.049
 wagon+0.048
 fireplace+0.047
Neg logit contributions
its-0.080
 consequences-0.077
ases-0.075
 supposed-0.073
beh-0.071
Token small
Token ID1402
Feature activation-2.55e+00

Pos logit contributions
 they+0.035
 The+0.031
 he+0.029
 she+0.029
 Tim+0.029
Neg logit contributions
its-0.051
 supposed-0.048
vern-0.046
 liar-0.046
 consequences-0.045
Token!
Token ID0
Feature activation+6.44e-02

Pos logit contributions
its+0.016
 supposed+0.014
ases+0.014
beh+0.014
vern+0.014
Neg logit contributions
 they-0.010
 roof-0.009
 Ally-0.009
 raft-0.009
 landscape-0.009
Token<|endoftext|>
Token ID50256
Feature activation+1.23e+00

Pos logit contributions
its+0.002
ases+0.002
 supposed+0.002
beh+0.002
ump+0.002
Neg logit contributions
 roof-0.001
 landscape-0.001
 raft-0.001
 wagon-0.001
 they-0.001
TokenMary
Token ID24119
Feature activation+5.61e-01

Pos logit contributions
its+0.046
ases+0.044
beh+0.043
 supposed+0.042
 consequences+0.041
Neg logit contributions
 they-0.011
 he-0.009
 she-0.008
 roof-0.008
 The-0.008
Token was
Token ID373
Feature activation+1.62e+00

Pos logit contributions
its+0.017
ases+0.017
 supposed+0.016
ws+0.016
beh+0.016
Neg logit contributions
 they-0.008
 roof-0.007
 �-0.007
 Ally-0.007
 raft-0.007
Token walking
Token ID6155
Feature activation-2.27e+00

Pos logit contributions
its+0.047
 supposed+0.045
ases+0.045
beh+0.044
 consequences+0.041
Neg logit contributions
 they-0.027
 he-0.024
 roof-0.022
 landscape-0.022
 she-0.022
Token through
Token ID832
Feature activation-7.73e+00

Pos logit contributions
 Pole+0.050
 chim+0.043
 �+0.042
 fireplace+0.041
 Big+0.041
Neg logit contributions
ases-0.046
beh-0.039
 supposed-0.039
its-0.039
 consequences-0.038
Token the
Token ID262
Feature activation-5.11e+00

Pos logit contributions
 Pole+0.111
 they+0.094
 raft+0.093
stack+0.091
 Sure+0.090
Neg logit contributions
ases-0.204
its-0.203
 consequences-0.202
 supposed-0.202
ief-0.192
Token park
Token ID3952
Feature activation-4.94e+00

Pos logit contributions
 they+0.101
 �+0.094
 while+0.092
stack+0.091
 Sure+0.089
Neg logit contributions
 supposed-0.100
its-0.098
 consequences-0.094
ases-0.093
 mistakes-0.093
Token one
Token ID530
Feature activation-1.69e+00

Pos logit contributions
 Pole+0.071
 wagon+0.070
 Big+0.069
 fireplace+0.069
 raft+0.069
Neg logit contributions
its-0.136
ases-0.132
 supposed-0.130
ump-0.125
beh-0.125
Token day
Token ID1110
Feature activation-4.63e+00

Pos logit contributions
 they+0.030
 Pole+0.028
 Mill+0.028
 Ally+0.028
 Bobby+0.027
Neg logit contributions
its-0.044
ases-0.043
beh-0.038
ac-0.038
 supposed-0.038
Token.
Token ID13
Feature activation-1.98e+00

Pos logit contributions
stack+0.079
 wagon+0.076
 fireplace+0.075
 helicopter+0.072
 vehicle+0.072
Neg logit contributions
its-0.112
 supposed-0.112
ump-0.104
ases-0.103
 consequences-0.100
Token of
Token ID286
Feature activation-1.99e+00

Pos logit contributions
 Pole+0.062
stack+0.062
 wagon+0.061
 landscape+0.061
+0.061
Neg logit contributions
ases-0.031
its-0.029
 supposed-0.029
ump-0.028
 consequences-0.027
Token flowers
Token ID12734
Feature activation-3.18e+00

Pos logit contributions
 they+0.032
 he+0.028
 �+0.028
 Pole+0.027
 The+0.027
Neg logit contributions
its-0.058
ases-0.054
 supposed-0.052
beh-0.052
 consequences-0.051
Token,
Token ID11
Feature activation-4.47e+00

Pos logit contributions
 Ally+0.044
 Pole+0.043
 Sure+0.042
 Big+0.042
 wagon+0.042
Neg logit contributions
 supposed-0.087
its-0.084
ases-0.080
ump-0.079
arts-0.076
Token trees
Token ID7150
Feature activation-3.98e+00

Pos logit contributions
 �+0.064
 Sure+0.063
 they+0.060
 Peter+0.060
 Mill+0.058
Neg logit contributions
its-0.120
 vitamins-0.114
 supposed-0.113
ases-0.112
 consequences-0.111
Token,
Token ID11
Feature activation-4.91e+00

Pos logit contributions
 Ally+0.056
 Peter+0.056
 Mill+0.053
 Bobby+0.053
 Pole+0.053
Neg logit contributions
its-0.105
 supposed-0.101
ases-0.099
ump-0.096
arts-0.092
Token and
Token ID290
Feature activation-7.72e+00

Pos logit contributions
 �+0.078
 Peter+0.074
 Mill+0.073
 Bobby+0.073
 Sure+0.073
Neg logit contributions
its-0.120
ases-0.117
 supposed-0.116
 consequences-0.114
 vitamins-0.112
Token animals
Token ID4695
Feature activation-1.66e+00

Pos logit contributions
 �+0.129
stack+0.127
 Pole+0.124
 they+0.123
 wagon+0.122
Neg logit contributions
its-0.184
ases-0.182
 consequences-0.181
 supposed-0.178
beh-0.178
Token.
Token ID13
Feature activation-9.02e-01

Pos logit contributions
 they+0.025
 Ally+0.024
 Mill+0.023
 The+0.023
 Big+0.023
Neg logit contributions
its-0.048
ases-0.046
 supposed-0.045
beh-0.044
arts-0.042
Token Lily
Token ID20037
Feature activation+1.31e+00

Pos logit contributions
 raft+0.011
 landscape+0.010
 Mill+0.010
 they+0.010
 moon+0.010
Neg logit contributions
its-0.027
ases-0.025
 supposed-0.025
 consequences-0.024
ws-0.024
Token liked
Token ID8288
Feature activation-7.58e-01

Pos logit contributions
its+0.036
 supposed+0.034
ases+0.034
beh+0.033
asing+0.032
Neg logit contributions
 they-0.021
 roof-0.019
 he-0.019
 wagon-0.019
 raft-0.018
Token to
Token ID284
Feature activation+7.37e-01

Pos logit contributions
 Pole+0.009
 they+0.009
 Sure+0.009
 �+0.009
 roof+0.009
Neg logit contributions
 supposed-0.022
ases-0.021
its-0.021
beh-0.021
 vitamins-0.020
Token and
Token ID290
Feature activation-5.06e+00

Pos logit contributions
its+0.021
ases+0.020
 supposed+0.019
arts+0.019
ump+0.018
Neg logit contributions
 roof-0.011
 they-0.011
 Sun-0.011
 wagon-0.011
 fireplace-0.010
Token her
Token ID607
Feature activation-6.98e-01

Pos logit contributions
 Pole+0.081
 fireplace+0.067
 roof+0.067
 chim+0.066
 while+0.065
Neg logit contributions
ases-0.135
 supposed-0.134
beh-0.130
its-0.127
 humor-0.122
Token mom
Token ID1995
Feature activation-4.63e-01

Pos logit contributions
 they+0.010
 while+0.010
 Pole+0.010
 Ally+0.010
 he+0.010
Neg logit contributions
its-0.018
ases-0.017
 supposed-0.017
vern-0.016
 consequences-0.016
Token were
Token ID547
Feature activation+1.74e+00

Pos logit contributions
 roof+0.008
 they+0.008
 while+0.008
 Ally+0.007
 Pole+0.007
Neg logit contributions
its-0.012
ases-0.011
 supposed-0.011
beh-0.010
ws-0.010
Token walking
Token ID6155
Feature activation-2.24e+00

Pos logit contributions
its+0.047
 supposed+0.046
ases+0.045
beh+0.044
asing+0.040
Neg logit contributions
 they-0.030
 he-0.029
 landscape-0.028
 moon-0.028
 roof-0.028
Token through
Token ID832
Feature activation-7.72e+00

Pos logit contributions
 Pole+0.049
 chim+0.042
 fireplace+0.042
 moon+0.041
 roof+0.040
Neg logit contributions
ases-0.045
beh-0.043
 supposed-0.038
asured-0.037
ac-0.036
Token the
Token ID262
Feature activation-4.98e+00

Pos logit contributions
 Pole+0.107
 he+0.089
stack+0.088
 they+0.085
 Sure+0.085
Neg logit contributions
its-0.211
ases-0.210
 supposed-0.204
 consequences-0.203
beh-0.195
Token park
Token ID3952
Feature activation-4.66e+00

Pos logit contributions
 they+0.095
stack+0.092
 while+0.089
 �+0.088
 he+0.087
Neg logit contributions
 supposed-0.097
its-0.096
beh-0.090
 consequences-0.089
 mistakes-0.088
Token one
Token ID530
Feature activation-1.45e+00

Pos logit contributions
 fireplace+0.086
 Pole+0.086
 moon+0.083
 Big+0.082
 wagon+0.081
Neg logit contributions
its-0.111
ases-0.106
 supposed-0.103
beh-0.103
ump-0.098
Token afternoon
Token ID6672
Feature activation-5.46e+00

Pos logit contributions
 they+0.025
 Pole+0.024
 fireplace+0.024
 Ally+0.023
 Mill+0.023
Neg logit contributions
its-0.039
ases-0.037
beh-0.034
 supposed-0.033
asing-0.032
Token.
Token ID13
Feature activation-2.66e+00

Pos logit contributions
 fireplace+0.100
stack+0.098
 wagon+0.093
 Ally+0.092
 helicopter+0.090
Neg logit contributions
its-0.124
 supposed-0.120
ases-0.115
ump-0.113
beh-0.113
Token charming
Token ID23332
Feature activation-2.26e+00

Pos logit contributions
 they+0.036
 Mill+0.034
 he+0.033
 Ally+0.032
stack+0.032
Neg logit contributions
its-0.075
 supposed-0.069
ases-0.068
vern-0.065
 consequences-0.064
Token little
Token ID1310
Feature activation-3.49e+00

Pos logit contributions
 they+0.040
 he+0.037
neath+0.036
stack+0.036
 �+0.034
Neg logit contributions
its-0.048
 liar-0.045
 mistake-0.044
arts-0.044
irk-0.043
Token pony
Token ID26902
Feature activation-1.21e+00

Pos logit contributions
 �+0.056
 they+0.056
 he+0.053
stack+0.052
 roof+0.051
Neg logit contributions
ases-0.083
its-0.078
ump-0.076
irk-0.075
 liar-0.073
Token was
Token ID373
Feature activation+1.75e+00

Pos logit contributions
 �+0.024
 raft+0.024
 wagon+0.024
 roof+0.023
 Pole+0.023
Neg logit contributions
ases-0.026
its-0.024
ez-0.024
 supposed-0.023
arts-0.023
Token walking
Token ID6155
Feature activation-2.21e+00

Pos logit contributions
its+0.053
ases+0.050
beh+0.049
 supposed+0.049
 consequences+0.046
Neg logit contributions
 they-0.028
 he-0.025
 roof-0.023
 she-0.023
 The-0.023
Token through
Token ID832
Feature activation-7.71e+00

Pos logit contributions
 Pole+0.043
 they+0.041
 fireplace+0.038
 roof+0.038
 �+0.037
Neg logit contributions
ases-0.056
beh-0.051
its-0.048
ump-0.047
aked-0.046
Token a
Token ID257
Feature activation-4.24e+00

Pos logit contributions
 they+0.082
 Pole+0.074
 raft+0.064
 he+0.062
 roof+0.062
Neg logit contributions
its-0.251
ases-0.246
 supposed-0.238
beh-0.234
 consequences-0.232
Token me
Token ID502
Feature activation+1.41e+00

Pos logit contributions
 they+0.078
 Mill+0.070
stack+0.068
 he+0.068
 Sure+0.066
Neg logit contributions
its-0.100
 supposed-0.090
ases-0.090
unts-0.086
vern-0.086
Tokenadow
Token ID4584
Feature activation-5.97e+00

Pos logit contributions
its+0.039
ases+0.035
beh+0.035
 consequences+0.033
ief+0.033
Neg logit contributions
 they-0.026
 The-0.024
 she-0.024
 roof-0.024
 he-0.024
Token.
Token ID13
Feature activation-2.44e+00

Pos logit contributions
 they+0.109
 wagon+0.108
 Ally+0.108
 Pole+0.107
 Bobby+0.107
Neg logit contributions
its-0.140
ump-0.132
ases-0.132
 supposed-0.130
beh-0.122
Token It
Token ID632
Feature activation-1.66e+00

Pos logit contributions
 they+0.039
 raft+0.039
 jet+0.036
 yacht+0.035
 train+0.035
Neg logit contributions
 supposed-0.061
its-0.059
 vitamins-0.057
ases-0.056
ump-0.055
Token.
Token ID13
Feature activation+6.78e-01

Pos logit contributions
its+0.075
ases+0.070
beh+0.069
 supposed+0.068
 consequences+0.065
Neg logit contributions
 they-0.035
 he-0.030
 The-0.029
 roof-0.029
 she-0.029
Token<|endoftext|>
Token ID50256
Feature activation+1.27e+00

Pos logit contributions
its+0.025
ases+0.023
beh+0.023
 supposed+0.023
vern+0.022
Neg logit contributions
 they-0.006
 he-0.005
 roof-0.005
 she-0.005
 while-0.005
TokenJim
Token ID18050
Feature activation+2.05e+00

Pos logit contributions
its+0.038
ases+0.036
 supposed+0.034
ump+0.034
beh+0.033
Neg logit contributions
 they-0.018
 roof-0.017
 raft-0.016
 wagon-0.016
 landscape-0.015
Token was
Token ID373
Feature activation+1.57e+00

Pos logit contributions
its+0.061
ases+0.058
 supposed+0.056
beh+0.055
ws+0.055
Neg logit contributions
 they-0.032
 roof-0.029
 while-0.027
 Ally-0.026
 Big-0.026
Token walking
Token ID6155
Feature activation-2.11e+00

Pos logit contributions
its+0.043
 supposed+0.042
ases+0.041
beh+0.040
ump+0.038
Neg logit contributions
 they-0.028
 landscape-0.025
 roof-0.024
 Pole-0.024
 he-0.024
Token through
Token ID832
Feature activation-7.71e+00

Pos logit contributions
 Pole+0.042
 chim+0.040
 fireplace+0.038
 roof+0.038
 they+0.037
Neg logit contributions
ases-0.043
beh-0.037
asured-0.037
ac-0.037
 supposed-0.037
Token the
Token ID262
Feature activation-5.09e+00

Pos logit contributions
 Pole+0.108
 they+0.099
stack+0.089
 Ally+0.088
 roof+0.088
Neg logit contributions
ases-0.202
its-0.199
 consequences-0.197
 supposed-0.194
 trouble-0.189
Token forest
Token ID8222
Feature activation-5.14e+00

Pos logit contributions
 they+0.101
stack+0.087
 while+0.087
 Sure+0.084
 �+0.083
Neg logit contributions
 supposed-0.105
its-0.103
 consequences-0.100
 mistakes-0.100
ases-0.099
Token.
Token ID13
Feature activation-2.14e+00

Pos logit contributions
 Ally+0.068
 Pole+0.067
 they+0.067
 Big+0.065
 fireplace+0.065
Neg logit contributions
its-0.147
ases-0.141
 supposed-0.138
ump-0.136
beh-0.133
Token He
Token ID679
Feature activation-1.76e+00

Pos logit contributions
 they+0.027
 train+0.024
 jet+0.024
 Ally+0.024
 fireplace+0.022
Neg logit contributions
its-0.069
ases-0.065
 supposed-0.063
ws-0.061
 consequences-0.060
Token felt
Token ID2936
Feature activation+2.55e+00

Pos logit contributions
 fireplace+0.029
 Mill+0.029
 wagon+0.029
 Ally+0.028
 they+0.028
Neg logit contributions
its-0.041
ases-0.039
 supposed-0.038
 tricked-0.037
oken-0.037
Token walking
Token ID6155
Feature activation-4.04e+00

Pos logit contributions
its+0.031
ases+0.029
beh+0.029
 supposed+0.029
ws+0.027
Neg logit contributions
 they-0.018
 he-0.016
 roof-0.016
 fireplace-0.016
 Big-0.015
Token around
Token ID1088
Feature activation-5.15e+00

Pos logit contributions
 fireplace+0.075
 roof+0.071
 chim+0.070
 Pole+0.067
 moon+0.065
Neg logit contributions
ases-0.086
its-0.083
aked-0.082
 consequences-0.080
ks-0.078
Token.
Token ID13
Feature activation-2.46e+00

Pos logit contributions
 Pole+0.074
 roof+0.070
 fireplace+0.064
 chim+0.062
 wagon+0.061
Neg logit contributions
its-0.137
ases-0.136
 consequences-0.129
 supposed-0.123
arts-0.121
Token
Token ID198
Feature activation+8.95e-01

Pos logit contributions
 raft+0.034
 they+0.032
 roof+0.032
 landscape+0.031
 train+0.030
Neg logit contributions
its-0.064
ases-0.064
beh-0.059
 supposed-0.059
ump-0.059
Token
Token ID198
Feature activation+5.56e-01

Pos logit contributions
its+0.031
ases+0.030
beh+0.029
 supposed+0.028
vern+0.028
Neg logit contributions
 they-0.010
 roof-0.008
 he-0.008
 wagon-0.007
 fireplace-0.007
TokenSuddenly
Token ID38582
Feature activation-7.70e+00

Pos logit contributions
its+0.015
ases+0.014
beh+0.014
 supposed+0.014
vern+0.013
Neg logit contributions
 they-0.010
 roof-0.009
 he-0.009
 fireplace-0.009
 moon-0.009
Token,
Token ID11
Feature activation-6.53e+00

Pos logit contributions
 landscape+0.103
 fireplace+0.101
 Pole+0.101
 roof+0.100
 porch+0.096
Neg logit contributions
 consequences-0.197
its-0.192
ases-0.187
 tricked-0.183
render-0.179
Token Tim
Token ID5045
Feature activation+1.68e+00

Pos logit contributions
 Pole+0.085
 landscape+0.084
 fireplace+0.080
 roof+0.078
 ranch+0.074
Neg logit contributions
 consequences-0.155
its-0.155
beh-0.155
ases-0.153
ief-0.152
Tokenmy
Token ID1820
Feature activation-3.19e-01

Pos logit contributions
its+0.040
 supposed+0.038
ases+0.037
beh+0.036
asing+0.034
Neg logit contributions
 they-0.037
 roof-0.032
 wagon-0.032
 he-0.031
 Ally-0.031
Token saw
Token ID2497
Feature activation-2.77e+00

Pos logit contributions
 they+0.006
 roof+0.005
 wagon+0.005
 fireplace+0.005
 landscape+0.005
Neg logit contributions
its-0.008
 supposed-0.007
ases-0.007
beh-0.007
arts-0.007
Token a
Token ID257
Feature activation-1.82e+00

Pos logit contributions
 they+0.045
 landscape+0.041
 roof+0.041
 Pole+0.041
 wagon+0.039
Neg logit contributions
its-0.075
ases-0.072
 consequences-0.070
 supposed-0.068
beh-0.066
Token.
Token ID13
Feature activation+1.63e-01

Pos logit contributions
 Ally+0.028
 wagon+0.028
 raft+0.027
 they+0.027
 landscape+0.027
Neg logit contributions
 supposed-0.046
its-0.045
ases-0.045
beh-0.043
ump-0.043
Token<|endoftext|>
Token ID50256
Feature activation+1.03e+00

Pos logit contributions
its+0.005
 supposed+0.005
ases+0.005
ump+0.004
beh+0.004
Neg logit contributions
 roof-0.002
 raft-0.002
 jet-0.002
 train-0.002
 landscape-0.002
TokenJack
Token ID14295
Feature activation+2.06e+00

Pos logit contributions
its+0.031
ases+0.029
 supposed+0.027
ump+0.027
beh+0.026
Neg logit contributions
 they-0.015
 roof-0.015
 raft-0.014
 wagon-0.013
 while-0.013
Token was
Token ID373
Feature activation+1.79e+00

Pos logit contributions
its+0.057
ases+0.055
 supposed+0.053
ws+0.053
beh+0.050
Neg logit contributions
 they-0.032
 roof-0.031
 Pole-0.028
 Ally-0.028
 landscape-0.028
Token walking
Token ID6155
Feature activation-2.34e+00

Pos logit contributions
its+0.050
 supposed+0.048
ases+0.048
beh+0.046
ump+0.043
Neg logit contributions
 they-0.031
 he-0.028
 landscape-0.027
 roof-0.027
 Pole-0.026
Token through
Token ID832
Feature activation-7.70e+00

Pos logit contributions
 Pole+0.048
 chim+0.043
 fireplace+0.041
 roof+0.040
 moon+0.039
Neg logit contributions
ases-0.049
beh-0.044
 consequences-0.042
its-0.042
 supposed-0.041
Token the
Token ID262
Feature activation-4.92e+00

Pos logit contributions
 Pole+0.109
 they+0.096
 roof+0.091
stack+0.091
 Sure+0.088
Neg logit contributions
ases-0.200
its-0.197
 consequences-0.196
 supposed-0.191
 trouble-0.188
Token park
Token ID3952
Feature activation-4.74e+00

Pos logit contributions
 they+0.097
stack+0.082
 while+0.082
 he+0.080
 Sure+0.080
Neg logit contributions
 supposed-0.103
its-0.101
 consequences-0.100
ases-0.097
beh-0.097
Token,
Token ID11
Feature activation-4.82e+00

Pos logit contributions
 fireplace+0.079
 Pole+0.079
 Ally+0.079
 Sun+0.076
 Mill+0.076
Neg logit contributions
its-0.113
ases-0.111
 supposed-0.109
ump-0.108
beh-0.104
Token with
Token ID351
Feature activation-2.92e+00

Pos logit contributions
 fireplace+0.071
 chim+0.069
 wagon+0.068
 landscape+0.065
 roof+0.065
Neg logit contributions
 supposed-0.122
its-0.116
ump-0.109
ases-0.104
 consequences-0.102
Token his
Token ID465
Feature activation-8.44e-01

Pos logit contributions
 they+0.036
 he+0.029
 roof+0.029
 The+0.028
 while+0.028
Neg logit contributions
its-0.097
ases-0.094
beh-0.091
 supposed-0.091
 consequences-0.087
Token<|endoftext|>
Token ID50256
Feature activation+1.46e+00

Pos logit contributions
its+0.021
ases+0.021
beh+0.020
 supposed+0.019
ump+0.019
Neg logit contributions
 roof-0.008
 they-0.008
 raft-0.008
 landscape-0.007
 wagon-0.007
TokenT
Token ID51
Feature activation+4.34e+00

Pos logit contributions
its+0.045
ases+0.043
 supposed+0.040
beh+0.039
ump+0.039
Neg logit contributions
 they-0.021
 roof-0.020
 raft-0.018
 wagon-0.018
 while-0.018
Tokenina
Token ID1437
Feature activation+2.18e+00

Pos logit contributions
its+0.111
ases+0.104
beh+0.101
 supposed+0.098
 consequences+0.093
Neg logit contributions
 they-0.092
 he-0.084
 The-0.081
 she-0.080
 roof-0.079
Token was
Token ID373
Feature activation+1.35e+00

Pos logit contributions
its+0.062
ases+0.060
ws+0.055
 supposed+0.055
beh+0.055
Neg logit contributions
 they-0.030
 roof-0.029
 Ally-0.028
 Big-0.028
 fireplace-0.028
Token walking
Token ID6155
Feature activation-1.85e+00

Pos logit contributions
its+0.038
 supposed+0.037
ases+0.036
beh+0.035
ump+0.033
Neg logit contributions
 they-0.023
 he-0.021
 Pole-0.020
 landscape-0.020
 roof-0.020
Token through
Token ID832
Feature activation-7.69e+00

Pos logit contributions
 Pole+0.040
 chim+0.036
 fireplace+0.036
 roof+0.035
 raft+0.033
Neg logit contributions
ases-0.036
beh-0.034
aked-0.032
ump-0.031
its-0.031
Token the
Token ID262
Feature activation-4.82e+00

Pos logit contributions
 Pole+0.106
 they+0.091
 raft+0.090
 roof+0.083
 wagon+0.082
Neg logit contributions
its-0.214
ases-0.214
 supposed-0.208
 consequences-0.208
beh-0.198
Token garden
Token ID11376
Feature activation-6.29e+00

Pos logit contributions
 they+0.096
 while+0.084
stack+0.081
 he+0.080
 �+0.079
Neg logit contributions
its-0.098
 supposed-0.097
ases-0.096
 consequences-0.095
beh-0.092
Token with
Token ID351
Feature activation-3.49e+00

Pos logit contributions
 Ally+0.084
 Pole+0.081
 wagon+0.080
 Big+0.080
 Bobby+0.079
Neg logit contributions
its-0.172
ump-0.168
ases-0.166
 supposed-0.165
 consequences-0.157
Token her
Token ID607
Feature activation-9.67e-01

Pos logit contributions
 they+0.045
 he+0.038
 roof+0.036
 The+0.036
 Pole+0.035
Neg logit contributions
its-0.115
ases-0.110
beh-0.107
 supposed-0.106
 consequences-0.102
Token mom
Token ID1995
Feature activation-2.75e-01

Pos logit contributions
 Pole+0.015
 they+0.014
 he+0.014
 The+0.013
 while+0.013
Neg logit contributions
ases-0.025
its-0.024
 supposed-0.023
arts-0.022
 consequences-0.022
Token with
Token ID351
Feature activation-3.20e+00

Pos logit contributions
 they+0.050
 Big+0.049
 Sun+0.047
 fireplace+0.047
 Ally+0.046
Neg logit contributions
its-0.106
ases-0.102
 supposed-0.100
beh-0.098
arts-0.093
Token many
Token ID867
Feature activation-2.65e+00

Pos logit contributions
 they+0.055
 he+0.049
 roof+0.047
 The+0.047
 she+0.047
Neg logit contributions
its-0.092
ases-0.088
beh-0.083
 supposed-0.083
 consequences-0.079
Token trees
Token ID7150
Feature activation-3.65e+00

Pos logit contributions
 they+0.050
 he+0.045
 she+0.043
 The+0.042
 roof+0.042
Neg logit contributions
its-0.075
ases-0.070
beh-0.069
 supposed-0.067
ws-0.064
Token,
Token ID11
Feature activation-5.01e+00

Pos logit contributions
 they+0.046
 Big+0.045
 Ally+0.044
 Little+0.044
 Sun+0.043
Neg logit contributions
its-0.113
ases-0.107
 supposed-0.105
beh-0.101
arts-0.099
Token flowers
Token ID12734
Feature activation-3.18e+00

Pos logit contributions
 they+0.071
 Mill+0.068
 Peter+0.067
 Bobby+0.067
 �+0.066
Neg logit contributions
its-0.145
ases-0.139
 consequences-0.132
ws-0.129
 vitamins-0.129
Token and
Token ID290
Feature activation-7.69e+00

Pos logit contributions
 Big+0.040
 Little+0.039
 Bobby+0.038
 Ally+0.037
 Peter+0.037
Neg logit contributions
its-0.095
ases-0.091
 supposed-0.088
arts-0.085
icks-0.082
Token animals
Token ID4695
Feature activation-2.02e+00

Pos logit contributions
 they+0.123
 �+0.116
 Mill+0.112
 The+0.112
 Ally+0.111
Neg logit contributions
its-0.204
ases-0.199
beh-0.192
 consequences-0.192
 supposed-0.189
Token.
Token ID13
Feature activation-7.69e-01

Pos logit contributions
 they+0.030
 The+0.027
 Big+0.027
 he+0.026
 Ally+0.026
Neg logit contributions
its-0.061
ases-0.059
 supposed-0.057
beh-0.056
arts-0.053
Token
Token ID198
Feature activation+2.26e+00

Pos logit contributions
 moon+0.010
 roof+0.009
 landscape+0.009
 fireplace+0.009
 mountain+0.009
Neg logit contributions
its-0.024
ases-0.023
ump-0.020
olds-0.020
ws-0.020
Token
Token ID198
Feature activation+1.13e+00

Pos logit contributions
its+0.090
ases+0.088
beh+0.087
 supposed+0.085
 consequences+0.083
Neg logit contributions
 they-0.017
 he-0.012
 The-0.011
 she-0.010
 while-0.009
Token"
Token ID1
Feature activation+1.25e-01

Pos logit contributions
its+0.032
ases+0.030
beh+0.029
 supposed+0.028
ws+0.027
Neg logit contributions
 they-0.021
 he-0.019
 The-0.018
 she-0.018
 roof-0.018
Token a
Token ID257
Feature activation-2.37e+00

Pos logit contributions
 they+0.049
 Pole+0.045
 fireplace+0.045
 he+0.044
 Mill+0.044
Neg logit contributions
its-0.101
ases-0.099
 consequences-0.092
 supposed-0.091
beh-0.090
Token big
Token ID1263
Feature activation-2.91e+00

Pos logit contributions
 they+0.040
 The+0.040
 Mill+0.036
 he+0.036
 Ally+0.036
Neg logit contributions
its-0.064
 supposed-0.058
ases-0.055
vern-0.055
beh-0.054
Token pond
Token ID16723
Feature activation-3.25e+00

Pos logit contributions
 they+0.043
 The+0.038
 he+0.037
 Big+0.037
 Ally+0.037
Neg logit contributions
its-0.083
ases-0.078
 supposed-0.074
arts-0.072
 consequences-0.072
Token with
Token ID351
Feature activation-2.87e+00

Pos logit contributions
 they+0.047
 Big+0.041
 he+0.041
 Ally+0.041
 The+0.041
Neg logit contributions
its-0.101
ases-0.095
beh-0.091
 supposed-0.090
arts-0.086
Token ducks
Token ID39694
Feature activation-1.54e+00

Pos logit contributions
 they+0.049
 he+0.044
 she+0.042
 Ally+0.041
 �+0.041
Neg logit contributions
its-0.083
ases-0.078
beh-0.074
 supposed-0.074
 consequences-0.072
Token and
Token ID290
Feature activation-7.68e+00

Pos logit contributions
 they+0.018
 he+0.016
 The+0.016
 Ally+0.015
 Mill+0.015
Neg logit contributions
its-0.052
ases-0.050
beh-0.047
 supposed-0.047
arts-0.046
Token fish
Token ID5916
Feature activation-7.50e-01

Pos logit contributions
 they+0.137
 �+0.133
 fireplace+0.132
 Ally+0.129
 Mill+0.129
Neg logit contributions
its-0.191
ases-0.183
beh-0.173
 supposed-0.171
 consequences-0.169
Token.
Token ID13
Feature activation-1.10e+00

Pos logit contributions
 they+0.010
 he+0.008
 she+0.008
 The+0.008
 while+0.007
Neg logit contributions
its-0.026
ases-0.024
beh-0.024
 supposed-0.023
 consequences-0.023
Token They
Token ID1119
Feature activation-1.73e+00

Pos logit contributions
 fireplace+0.016
 roof+0.016
 raft+0.015
 moon+0.015
 Ally+0.015
Neg logit contributions
its-0.032
ases-0.031
olds-0.027
 supposed-0.027
ws-0.026
Token wanted
Token ID2227
Feature activation+4.13e-01

Pos logit contributions
 fireplace+0.034
 they+0.033
 Pole+0.033
 roof+0.033
 The+0.032
Neg logit contributions
its-0.041
ases-0.037
 supposed-0.035
beh-0.035
arts-0.033
Token to
Token ID284
Feature activation+4.64e-01

Pos logit contributions
its+0.014
ases+0.013
 supposed+0.013
beh+0.013
 consequences+0.012
Neg logit contributions
 they-0.005
 fireplace-0.004
 roof-0.004
 he-0.004
 wagon-0.004
Token saw
Token ID2497
Feature activation-3.37e+00

Pos logit contributions
 fireplace+0.033
 Pole+0.033
 moon+0.032
 Sun+0.032
 roof+0.032
Neg logit contributions
its-0.049
ases-0.048
 supposed-0.047
beh-0.045
 tricked-0.045
Token some
Token ID617
Feature activation-2.49e+00

Pos logit contributions
 Pole+0.054
 they+0.050
 roof+0.050
 while+0.049
+0.049
Neg logit contributions
its-0.088
ases-0.088
 consequences-0.084
 supposed-0.080
 vitamins-0.078
Token frogs
Token ID37475
Feature activation-1.75e+00

Pos logit contributions
 they+0.043
 he+0.038
 The+0.037
 roof+0.036
 she+0.036
Neg logit contributions
its-0.073
ases-0.068
beh-0.066
 supposed-0.064
 consequences-0.063
Token,
Token ID11
Feature activation-4.78e+00

Pos logit contributions
 they+0.019
 The+0.017
 Big+0.016
 roof+0.016
 he+0.016
Neg logit contributions
its-0.060
ases-0.057
 supposed-0.055
beh-0.054
arts-0.052
Token fish
Token ID5916
Feature activation-5.16e-01

Pos logit contributions
 they+0.093
 Pole+0.089
 fireplace+0.088
 roof+0.088
 Mill+0.087
Neg logit contributions
its-0.114
ases-0.110
beh-0.101
 supposed-0.099
 consequences-0.098
Token and
Token ID290
Feature activation-7.66e+00

Pos logit contributions
 they+0.004
 The+0.003
 roof+0.003
 moon+0.003
 Big+0.003
Neg logit contributions
its-0.020
ases-0.019
beh-0.018
 supposed-0.018
arts-0.017
Token bugs
Token ID11316
Feature activation+7.82e-01

Pos logit contributions
 Pole+0.125
 they+0.125
 fireplace+0.122
 wagon+0.121
 �+0.121
Neg logit contributions
its-0.193
ases-0.189
beh-0.179
 consequences-0.178
 supposed-0.176
Token in
Token ID287
Feature activation-3.20e+00

Pos logit contributions
its+0.025
ases+0.024
beh+0.023
 supposed+0.023
ws+0.022
Neg logit contributions
 they-0.011
 he-0.009
 The-0.009
 Big-0.009
 roof-0.009
Token the
Token ID262
Feature activation-3.41e+00

Pos logit contributions
 they+0.020
 he+0.015
 roof+0.013
 she+0.012
 The+0.012
Neg logit contributions
its-0.128
ases-0.123
beh-0.117
 supposed-0.116
 consequences-0.113
Token water
Token ID1660
Feature activation-2.13e+00

Pos logit contributions
 they+0.053
 Mill+0.049
 Ally+0.045
 Aunt+0.045
 Pole+0.044
Neg logit contributions
its-0.096
 consequences-0.094
ases-0.092
 vitamins-0.086
oken-0.084
Token.
Token ID13
Feature activation-1.00e+00

Pos logit contributions
 they+0.032
 Ally+0.030
 The+0.030
 Big+0.029
 Mill+0.029
Neg logit contributions
its-0.065
ases-0.061
beh-0.057
 supposed-0.056
arts-0.055
Token<|endoftext|>
Token ID50256
Feature activation+1.09e+00

Pos logit contributions
its+0.010
ases+0.009
beh+0.009
 supposed+0.009
ump+0.009
Neg logit contributions
 raft-0.004
 they-0.004
 roof-0.004
 train-0.004
 wagon-0.004
TokenM
Token ID44
Feature activation+2.37e-01

Pos logit contributions
its+0.034
ases+0.033
 supposed+0.031
ump+0.030
beh+0.030
Neg logit contributions
 they-0.014
 roof-0.014
 raft-0.013
 wagon-0.012
 landscape-0.012
Tokenia
Token ID544
Feature activation+7.07e-01

Pos logit contributions
its+0.006
ases+0.005
beh+0.005
 supposed+0.005
ws+0.005
Neg logit contributions
 they-0.005
 he-0.005
 she-0.005
 The-0.005
 roof-0.005
Token was
Token ID373
Feature activation+1.59e+00

Pos logit contributions
its+0.022
ases+0.021
 supposed+0.020
beh+0.019
ws+0.019
Neg logit contributions
 they-0.010
 roof-0.009
 Ally-0.009
 Big-0.008
 �-0.008
Token walking
Token ID6155
Feature activation-2.03e+00

Pos logit contributions
its+0.047
ases+0.045
 supposed+0.043
beh+0.043
 consequences+0.040
Neg logit contributions
 they-0.027
 he-0.024
 she-0.022
 roof-0.022
 The-0.022
Token through
Token ID832
Feature activation-7.66e+00

Pos logit contributions
 Pole+0.040
 chim+0.037
 fireplace+0.036
 Big+0.034
 roof+0.034
Neg logit contributions
ases-0.044
beh-0.040
its-0.039
 supposed-0.038
arts-0.037
Token the
Token ID262
Feature activation-4.91e+00

Pos logit contributions
 Pole+0.101
 they+0.090
 raft+0.088
 Sure+0.079
 wagon+0.079
Neg logit contributions
ases-0.216
its-0.215
 supposed-0.209
 consequences-0.208
ief-0.198
Token park
Token ID3952
Feature activation-4.71e+00

Pos logit contributions
 they+0.098
 while+0.086
stack+0.083
 he+0.082
 �+0.082
Neg logit contributions
its-0.101
 supposed-0.099
ases-0.098
 consequences-0.096
beh-0.092
Token one
Token ID530
Feature activation-1.23e+00

Pos logit contributions
 fireplace+0.069
 Pole+0.068
 Big+0.067
 wagon+0.066
 Ally+0.066
Neg logit contributions
its-0.128
ases-0.124
 supposed-0.121
ump-0.119
beh-0.116
Token day
Token ID1110
Feature activation-4.37e+00

Pos logit contributions
 they+0.022
 Pole+0.021
 Ally+0.021
 Mill+0.020
 fireplace+0.020
Neg logit contributions
its-0.033
ases-0.031
beh-0.029
 supposed-0.029
asing-0.028
Token.
Token ID13
Feature activation-2.19e+00

Pos logit contributions
 fireplace+0.069
stack+0.069
 wagon+0.066
 helicopter+0.064
 Ally+0.063
Neg logit contributions
its-0.111
 supposed-0.109
ump-0.103
ases-0.102
 consequences-0.099
Token for
Token ID329
Feature activation+1.31e+00

Pos logit contributions
its+0.048
 supposed+0.046
ases+0.046
beh+0.046
asing+0.044
Neg logit contributions
 they-0.020
 roof-0.018
 he-0.018
 landscape-0.018
 raft-0.018
Token helping
Token ID5742
Feature activation+3.04e-01

Pos logit contributions
its+0.033
ases+0.031
beh+0.029
 supposed+0.028
ws+0.027
Neg logit contributions
 they-0.029
 he-0.027
 she-0.026
 The-0.026
 while-0.025
Token her
Token ID607
Feature activation+2.08e-01

Pos logit contributions
its+0.011
ases+0.010
beh+0.010
 supposed+0.010
 consequences+0.009
Neg logit contributions
 they-0.004
 he-0.003
 The-0.003
 she-0.003
 roof-0.003
Token and
Token ID290
Feature activation-3.05e+00

Pos logit contributions
its+0.007
ases+0.006
beh+0.006
 supposed+0.006
 consequences+0.006
Neg logit contributions
 they-0.003
 he-0.003
 she-0.003
 The-0.003
 roof-0.003
Token she
Token ID673
Feature activation+6.49e-01

Pos logit contributions
 they+0.058
 he+0.052
 The+0.051
 roof+0.050
 she+0.050
Neg logit contributions
its-0.084
ases-0.079
beh-0.077
 supposed-0.075
 consequences-0.071
Token wasn
Token ID2492
Feature activation+9.26e+00

Pos logit contributions
its+0.019
ases+0.018
beh+0.018
 supposed+0.017
 consequences+0.017
Neg logit contributions
 they-0.011
 he-0.010
 she-0.009
 The-0.009
 roof-0.009
Token't
Token ID470
Feature activation+6.57e+00

Pos logit contributions
its+0.249
ases+0.237
beh+0.227
 consequences+0.219
ws+0.218
Neg logit contributions
 they-0.189
 he-0.171
 The-0.166
 she-0.165
 Tim-0.159
Token ashamed
Token ID22461
Feature activation+5.04e+00

Pos logit contributions
its+0.184
ases+0.173
ws+0.156
beh+0.154
vern+0.152
Neg logit contributions
 they-0.135
 he-0.125
 she-0.123
 Tim-0.122
 while-0.119
Token to
Token ID284
Feature activation+5.08e+00

Pos logit contributions
its+0.160
ases+0.155
ws+0.152
beh+0.150
vern+0.147
Neg logit contributions
 they-0.082
 she-0.077
 he-0.075
 while-0.073
,-0.067
Token ask
Token ID1265
Feature activation+4.04e-01

Pos logit contributions
its+0.130
ases+0.122
beh+0.117
 supposed+0.113
ws+0.110
Neg logit contributions
 they-0.111
 he-0.102
 she-0.098
 The-0.097
 Tim-0.096
Token for
Token ID329
Feature activation+9.23e-01

Pos logit contributions
its+0.012
ases+0.012
 supposed+0.011
beh+0.011
 consequences+0.011
Neg logit contributions
 they-0.006
 landscape-0.006
 roof-0.005
 raft-0.005
 wagon-0.005
Token started
Token ID2067
Feature activation-8.10e-01

Pos logit contributions
 fireplace+0.089
 landscape+0.088
 hurricane+0.085
 roof+0.085
 chim+0.084
Neg logit contributions
its-0.110
ases-0.109
beh-0.105
 supposed-0.103
ws-0.103
Token eating
Token ID6600
Feature activation-2.38e+00

Pos logit contributions
 they+0.012
 moon+0.011
 landscape+0.011
 Ally+0.011
 roof+0.011
Neg logit contributions
its-0.023
ases-0.022
 supposed-0.022
beh-0.022
arts-0.021
Token the
Token ID262
Feature activation-3.00e-01

Pos logit contributions
 Ally+0.029
 Pole+0.028
 roof+0.028
 Big+0.028
 they+0.028
Neg logit contributions
its-0.075
ases-0.068
ws-0.065
 supposed-0.064
arts-0.064
Token delicious
Token ID12625
Feature activation-1.27e+00

Pos logit contributions
 they+0.006
 he+0.005
 she+0.005
 The+0.005
 roof+0.005
Neg logit contributions
its-0.009
ases-0.008
beh-0.008
 supposed-0.008
 consequences-0.007
Token ch
Token ID442
Feature activation+3.62e+00

Pos logit contributions
 they+0.025
 he+0.023
 she+0.022
 The+0.022
 roof+0.021
Neg logit contributions
its-0.034
ases-0.032
beh-0.031
 supposed-0.030
 consequences-0.029
Tokenocol
Token ID4668
Feature activation+1.16e+01

Pos logit contributions
beh+0.064
its+0.063
 supposed+0.061
vern+0.057
ases+0.057
Neg logit contributions
 they-0.088
 The-0.085
 roof-0.085
 he-0.084
 she-0.083
Tokenates
Token ID689
Feature activation-6.65e-01

Pos logit contributions
its+0.157
beh+0.131
ases+0.125
 supposed+0.121
 consequences+0.115
Neg logit contributions
 they-0.365
 The-0.342
 he-0.339
 moon-0.338
 she-0.336
Token.
Token ID13
Feature activation-9.00e-01

Pos logit contributions
 Big+0.010
 Ally+0.010
 The+0.010
 fireplace+0.010
 moon+0.010
Neg logit contributions
its-0.019
ases-0.018
 supposed-0.018
beh-0.017
arts-0.017
Token She
Token ID1375
Feature activation-1.80e+00

Pos logit contributions
 raft+0.010
 he+0.009
 they+0.009
 landscape+0.009
 moon+0.009
Neg logit contributions
its-0.030
 supposed-0.027
ases-0.027
beh-0.026
olds-0.026
Token was
Token ID373
Feature activation+3.63e+00

Pos logit contributions
 they+0.027
 Ally+0.024
 roof+0.024
 landscape+0.024
 Big+0.024
Neg logit contributions
its-0.053
ases-0.050
 supposed-0.048
beh-0.047
arts-0.046
Token so
Token ID523
Feature activation+1.52e+00

Pos logit contributions
its+0.112
ases+0.106
 supposed+0.104
beh+0.104
 consequences+0.097
Neg logit contributions
 they-0.056
 he-0.049
 roof-0.046
 The-0.046
 she-0.046
Token,
Token ID11
Feature activation-3.29e+00

Pos logit contributions
its+0.020
ases+0.019
beh+0.017
 supposed+0.017
asing+0.017
Neg logit contributions
 they-0.013
 Mill-0.012
 roof-0.012
 fireplace-0.012
 Ally-0.012
Token es
Token ID1658
Feature activation+4.22e+00

Pos logit contributions
 they+0.054
 Mill+0.052
 fireplace+0.052
 wagon+0.050
 Ally+0.050
Neg logit contributions
its-0.091
ases-0.086
beh-0.080
asing-0.079
ief-0.077
Token bon
Token ID5351
Feature activation+3.70e+00

Pos logit contributions
its+0.097
ases+0.089
beh+0.087
 supposed+0.084
vern+0.079
Neg logit contributions
 they-0.101
 he-0.093
 The-0.090
 she-0.089
 while-0.088
Tokenita
Token ID5350
Feature activation+2.65e-01

Pos logit contributions
its+0.090
 supposed+0.082
beh+0.082
ases+0.081
vern+0.076
Neg logit contributions
 they-0.081
 he-0.073
 The-0.073
 while-0.073
 Big-0.071
Token y
Token ID331
Feature activation+6.69e+00

Pos logit contributions
its+0.008
ases+0.007
beh+0.007
 supposed+0.007
ws+0.007
Neg logit contributions
 they-0.004
 he-0.004
 Ally-0.004
 she-0.004
 The-0.004
Token su
Token ID424
Feature activation+1.17e+01

Pos logit contributions
 supposed+0.180
its+0.176
beh+0.172
vern+0.166
asing+0.163
Neg logit contributions
 they-0.107
,-0.101
 Tim-0.099
 he-0.098
 The-0.098
Tokenave
Token ID1015
Feature activation-7.18e-01

Pos logit contributions
its+0.281
 supposed+0.267
 consequences+0.264
 vitamins+0.256
 fault+0.250
Neg logit contributions
 they-0.232
 while-0.212
 he-0.206
 she-0.201
 train-0.197
Token!"
Token ID2474
Feature activation-1.94e+00

Pos logit contributions
 they+0.014
 he+0.012
 roof+0.012
 landscape+0.012
 The+0.012
Neg logit contributions
its-0.020
ases-0.019
beh-0.018
 supposed-0.017
 consequences-0.017
Token Lily
Token ID20037
Feature activation+3.32e+00

Pos logit contributions
 roof+0.030
 landscape+0.029
 fireplace+0.028
 wagon+0.028
 moon+0.027
Neg logit contributions
its-0.053
ases-0.051
 humor-0.047
beh-0.047
olds-0.046
Token says
Token ID1139
Feature activation+2.55e+00

Pos logit contributions
its+0.072
ases+0.067
beh+0.063
 supposed+0.060
ws+0.058
Neg logit contributions
 they-0.081
 roof-0.075
 he-0.074
 Ally-0.073
 she-0.073
Token,
Token ID11
Feature activation-2.04e+00

Pos logit contributions
its+0.085
ases+0.082
beh+0.081
 supposed+0.078
asing+0.077
Neg logit contributions
 landscape-0.027
 valley-0.025
 moon-0.025
 they-0.024
 fireplace-0.023
Token and
Token ID290
Feature activation-2.47e+00

Pos logit contributions
its+0.017
beh+0.016
 supposed+0.016
ases+0.016
asing+0.015
Neg logit contributions
 they-0.008
 roof-0.008
 Ally-0.008
 landscape-0.008
 wagon-0.008
Token never
Token ID1239
Feature activation+1.96e+00

Pos logit contributions
 they+0.052
 The+0.048
 he+0.048
 roof+0.047
 she+0.047
Neg logit contributions
its-0.059
ases-0.056
beh-0.055
 supposed-0.054
 consequences-0.051
Token taking
Token ID2263
Feature activation+5.34e+00

Pos logit contributions
its+0.054
ases+0.051
beh+0.049
 supposed+0.048
 consequences+0.046
Neg logit contributions
 they-0.037
 he-0.034
 she-0.032
 The-0.032
 roof-0.032
Token things
Token ID1243
Feature activation+1.75e+00

Pos logit contributions
its+0.138
ases+0.130
beh+0.126
 supposed+0.122
 consequences+0.117
Neg logit contributions
 they-0.110
 he-0.100
 she-0.097
 The-0.096
 roof-0.095
Token that
Token ID326
Feature activation-5.87e-01

Pos logit contributions
its+0.055
ases+0.052
beh+0.051
 supposed+0.051
vern+0.049
Neg logit contributions
 they-0.023
 The-0.021
 Ally-0.021
 roof-0.021
 moon-0.021
Token weren
Token ID6304
Feature activation+1.02e+01

Pos logit contributions
 they+0.010
 roof+0.009
 The+0.009
 he+0.009
 Ally+0.009
Neg logit contributions
its-0.016
ases-0.015
beh-0.015
 supposed-0.015
 consequences-0.014
Token't
Token ID470
Feature activation+7.38e+00

Pos logit contributions
its+0.278
ases+0.269
beh+0.259
ws+0.249
asing+0.247
Neg logit contributions
 they-0.198
 he-0.185
 she-0.175
 The-0.171
 Tim-0.163
Token his
Token ID465
Feature activation+4.00e+00

Pos logit contributions
its+0.166
 supposed+0.155
ases+0.151
beh+0.151
 consequences+0.140
Neg logit contributions
 they-0.170
 The-0.157
 he-0.156
 roof-0.155
 she-0.153
Token.
Token ID13
Feature activation+1.59e-02

Pos logit contributions
its+0.118
ases+0.112
 supposed+0.109
beh+0.108
 consequences+0.105
Neg logit contributions
 they-0.065
 he-0.057
 The-0.056
 she-0.056
 Ally-0.055
Token
Token ID198
Feature activation+2.73e+00

Pos logit contributions
ases+0.000
its+0.000
olds+0.000
beh+0.000
ump+0.000
Neg logit contributions
 shining-0.000
 while-0.000
 raft-0.000
 they-0.000
 roof-0.000
Token
Token ID198
Feature activation+1.75e+00

Pos logit contributions
its+0.100
ases+0.095
beh+0.092
 supposed+0.091
vern+0.088
Neg logit contributions
 they-0.026
 he-0.021
 roof-0.020
 she-0.020
 while-0.019
Token hard
Token ID1327
Feature activation+7.17e-01

Pos logit contributions
 Ally+0.000
 they+0.000
 Mill+0.000
 Pole+0.000
 Big+0.000
Neg logit contributions
its-0.001
ases-0.001
beh-0.001
 supposed-0.001
ws-0.001
Token.
Token ID13
Feature activation+5.33e-01

Pos logit contributions
its+0.026
ases+0.025
beh+0.024
 supposed+0.024
 consequences+0.023
Neg logit contributions
 they-0.008
 he-0.006
 Ally-0.006
 The-0.006
 roof-0.006
Token
Token ID198
Feature activation+4.01e+00

Pos logit contributions
its+0.017
ases+0.016
ws+0.015
 supposed+0.015
 vitamins+0.014
Neg logit contributions
 landscape-0.006
 they-0.006
 Ally-0.006
 jet-0.006
 valley-0.006
Token
Token ID198
Feature activation+2.63e+00

Pos logit contributions
its+0.151
ases+0.144
beh+0.140
 supposed+0.138
ws+0.133
Neg logit contributions
 they-0.035
 he-0.028
 she-0.025
 roof-0.025
 The-0.024
Token"
Token ID1
Feature activation+1.51e+00

Pos logit contributions
its+0.071
ases+0.066
beh+0.063
 supposed+0.061
ws+0.059
Neg logit contributions
 they-0.051
 roof-0.046
 he-0.046
 Ally-0.045
 Mill-0.045
TokenCr
Token ID13916
Feature activation+1.04e+01

Pos logit contributions
its+0.044
ases+0.041
beh+0.040
 supposed+0.040
vern+0.038
Neg logit contributions
 they-0.027
 he-0.024
 The-0.023
 she-0.023
 while-0.022
Tokenack
Token ID441
Feature activation+2.20e+00

Pos logit contributions
its+0.218
ases+0.218
beh+0.218
 supposed+0.211
 humor+0.200
Neg logit contributions
 they-0.227
 while-0.204
 landscape-0.203
 The-0.203
 he-0.200
Token!"
Token ID2474
Feature activation-1.59e+00

Pos logit contributions
its+0.072
ases+0.068
beh+0.065
 supposed+0.063
ws+0.062
Neg logit contributions
 they-0.031
 he-0.027
 she-0.026
 roof-0.026
 The-0.026
Token The
Token ID383
Feature activation-1.73e+00

Pos logit contributions
 fireplace+0.021
 landscape+0.020
 Ally+0.020
 moon+0.019
 roof+0.019
Neg logit contributions
its-0.048
ases-0.046
 consequences-0.043
olds-0.043
ws-0.041
Token ball
Token ID2613
Feature activation+3.76e-01

Pos logit contributions
 they+0.025
 he+0.021
 she+0.021
 The+0.020
 Ally+0.020
Neg logit contributions
its-0.056
ases-0.053
beh-0.050
 supposed-0.050
 consequences-0.049
Token broke
Token ID6265
Feature activation+1.46e+00

Pos logit contributions
its+0.011
ases+0.011
beh+0.010
ws+0.009
 consequences+0.009
Neg logit contributions
 they-0.006
 Mill-0.006
 Ally-0.006
 fireplace-0.006
 Big-0.005
Token job
Token ID1693
Feature activation+1.20e+00

Pos logit contributions
its+0.088
ases+0.084
beh+0.080
 supposed+0.075
ws+0.074
Neg logit contributions
 they-0.070
 he-0.064
 while-0.061
 The-0.061
 she-0.060
Token well
Token ID880
Feature activation+1.29e+00

Pos logit contributions
its+0.037
 supposed+0.035
ases+0.035
beh+0.034
 consequences+0.033
Neg logit contributions
 they-0.016
 Ally-0.015
 The-0.014
 moon-0.014
 wagon-0.014
Token,
Token ID11
Feature activation-2.68e+00

Pos logit contributions
its+0.041
ases+0.039
 supposed+0.038
beh+0.038
 consequences+0.036
Neg logit contributions
 they-0.016
 roof-0.015
 Ally-0.015
 moon-0.015
 The-0.014
Token and
Token ID290
Feature activation-4.28e+00

Pos logit contributions
 Pole+0.035
 Ally+0.033
 they+0.033
 roof+0.033
 fireplace+0.033
Neg logit contributions
its-0.077
 supposed-0.077
ases-0.074
beh-0.073
 consequences-0.071
Token he
Token ID339
Feature activation-3.37e-01

Pos logit contributions
 fireplace+0.039
 Pole+0.039
 moon+0.038
 roof+0.037
 landscape+0.037
Neg logit contributions
its-0.142
beh-0.139
ases-0.137
 supposed-0.137
asing-0.129
Token felt
Token ID2936
Feature activation+3.67e+00

Pos logit contributions
 they+0.005
 he+0.005
 roof+0.005
 The+0.005
 Ally+0.005
Neg logit contributions
its-0.010
ases-0.010
beh-0.009
 supposed-0.009
ws-0.009
Token glad
Token ID9675
Feature activation-1.92e+00

Pos logit contributions
its+0.103
ases+0.097
beh+0.095
 supposed+0.093
 consequences+0.089
Neg logit contributions
 they-0.067
 he-0.059
 roof-0.057
 she-0.057
 The-0.057
Token.
Token ID13
Feature activation-1.90e-01

Pos logit contributions
 Ally+0.028
 Pole+0.027
 landscape+0.026
 wagon+0.026
 roof+0.026
Neg logit contributions
its-0.053
 supposed-0.051
ases-0.051
 consequences-0.051
beh-0.048
Token He
Token ID679
Feature activation-2.09e-01

Pos logit contributions
 they+0.002
 while+0.002
 roof+0.002
 he+0.002
 moon+0.002
Neg logit contributions
its-0.007
ases-0.006
beh-0.006
 supposed-0.006
asing-0.006
Token knew
Token ID2993
Feature activation-4.89e-01

Pos logit contributions
 they+0.004
 roof+0.003
 while+0.003
 Ally+0.003
 he+0.003
Neg logit contributions
its-0.006
ases-0.005
 supposed-0.005
beh-0.005
ws-0.005
Token he
Token ID339
Feature activation+1.99e+00

Pos logit contributions
 landscape+0.005
 fireplace+0.005
 roof+0.005
 Pole+0.005
 moon+0.005
Neg logit contributions
its-0.015
 supposed-0.015
beh-0.015
ases-0.015
vern-0.014
Token to
Token ID284
Feature activation-2.05e+00

Pos logit contributions
 they+0.013
 Sun+0.012
 Big+0.012
 Ally+0.012
 The+0.012
Neg logit contributions
its-0.026
 supposed-0.024
ases-0.024
beh-0.023
arts-0.022
Token a
Token ID257
Feature activation-2.79e+00

Pos logit contributions
 they+0.035
 he+0.031
 raft+0.029
 wagon+0.029
 roof+0.029
Neg logit contributions
its-0.059
ases-0.056
beh-0.054
 supposed-0.054
ws-0.051
Token special
Token ID2041
Feature activation-1.83e+00

Pos logit contributions
 they+0.055
 The+0.051
 Little+0.048
 Ally+0.048
 he+0.047
Neg logit contributions
 supposed-0.062
its-0.061
 liar-0.058
vern-0.058
 consequences-0.058
Token safe
Token ID3338
Feature activation-2.62e+00

Pos logit contributions
 they+0.040
 The+0.037
 he+0.035
 Big+0.035
 Little+0.034
Neg logit contributions
its-0.039
 supposed-0.038
 consequences-0.037
vern-0.035
ases-0.034
Token!
Token ID0
Feature activation-1.16e+00

Pos logit contributions
 they+0.044
 The+0.040
 Big+0.038
 Little+0.038
 Pole+0.038
Neg logit contributions
its-0.068
 supposed-0.064
 consequences-0.062
ases-0.060
 mistake-0.059
Token
Token ID220
Feature activation+6.00e+00

Pos logit contributions
 they+0.015
 raft+0.015
 yacht+0.014
 wagon+0.014
 roof+0.014
Neg logit contributions
its-0.036
ases-0.035
 supposed-0.032
beh-0.031
ump-0.031
Token
Token ID198
Feature activation+1.99e+00

Pos logit contributions
its+0.224
ases+0.213
beh+0.206
 supposed+0.205
ws+0.197
Neg logit contributions
 they-0.055
 he-0.044
 roof-0.039
 she-0.039
 while-0.039
Token
Token ID198
Feature activation+1.09e+00

Pos logit contributions
its+0.073
ases+0.069
 supposed+0.066
beh+0.066
vern+0.064
Neg logit contributions
 they-0.019
 he-0.015
 roof-0.015
 wagon-0.014
 raft-0.014
Tokenâ
Token ID22940
Feature activation+5.30e+00

Pos logit contributions
its+0.028
ases+0.026
beh+0.024
 supposed+0.024
asing+0.023
Neg logit contributions
 they-0.022
 roof-0.020
 he-0.020
 raft-0.019
 Pole-0.019
Token
Token ID26391
Feature activation+3.23e+00

Pos logit contributions
its+0.113
ases+0.104
beh+0.101
 supposed+0.098
asing+0.094
Neg logit contributions
 they-0.137
 he-0.125
 she-0.120
 roof-0.119
 while-0.118
Token
Token ID129
Feature activation+3.62e+00

Pos logit contributions
its+0.090
ases+0.084
 supposed+0.083
beh+0.082
vern+0.077
Neg logit contributions
 they-0.057
 he-0.052
 raft-0.049
 wagon-0.049
 she-0.049
Token there
Token ID612
Feature activation-2.67e+00

Pos logit contributions
 roof+0.057
 Pole+0.057
 fireplace+0.056
 landscape+0.055
 they+0.055
Neg logit contributions
its-0.114
ases-0.105
 supposed-0.104
beh-0.098
 consequences-0.096
Token was
Token ID373
Feature activation-5.57e-01

Pos logit contributions
 they+0.034
 roof+0.029
 Ally+0.029
 he+0.029
 wagon+0.028
Neg logit contributions
its-0.088
ases-0.082
 supposed-0.079
beh-0.077
 consequences-0.075
Token a
Token ID257
Feature activation-2.14e+00

Pos logit contributions
 they+0.008
 he+0.007
 Pole+0.007
 roof+0.007
 landscape+0.007
Neg logit contributions
its-0.016
ases-0.016
 supposed-0.016
beh-0.015
 consequences-0.015
Token boy
Token ID2933
Feature activation+9.65e-01

Pos logit contributions
 they+0.032
 Mill+0.030
 Ally+0.029
 he+0.029
stack+0.029
Neg logit contributions
its-0.056
 supposed-0.052
 liar-0.052
ases-0.050
vern-0.050
Token named
Token ID3706
Feature activation-5.95e-01

Pos logit contributions
its+0.033
ases+0.032
beh+0.031
 supposed+0.030
 consequences+0.029
Neg logit contributions
 they-0.012
 he-0.010
 she-0.009
 The-0.009
 roof-0.009
Token Tim
Token ID5045
Feature activation+3.32e+00

Pos logit contributions
 landscape+0.007
 valley+0.007
 Ally+0.006
 jet+0.006
 roof+0.006
Neg logit contributions
 supposed-0.018
ases-0.017
its-0.017
beh-0.016
ws-0.016
Tokenmy
Token ID1820
Feature activation+1.67e+00

Pos logit contributions
its+0.107
ases+0.101
beh+0.100
 supposed+0.095
vern+0.093
Neg logit contributions
 they-0.049
 he-0.045
 The-0.043
 she-0.041
 Tim-0.041
Token.
Token ID13
Feature activation-1.10e+00

Pos logit contributions
its+0.056
ases+0.053
beh+0.052
 supposed+0.051
 consequences+0.049
Neg logit contributions
 they-0.022
 he-0.019
 The-0.018
 she-0.018
 while-0.017
Token Tim
Token ID5045
Feature activation+3.62e+00

Pos logit contributions
 they+0.010
 Ally+0.008
 she+0.008
 jet+0.008
 valley+0.008
Neg logit contributions
its-0.038
ases-0.037
 supposed-0.035
beh-0.034
ws-0.033
Tokenmy
Token ID1820
Feature activation+1.82e+00

Pos logit contributions
its+0.118
ases+0.112
beh+0.111
vern+0.104
 supposed+0.103
Neg logit contributions
 they-0.051
 he-0.047
 The-0.046
 Tim-0.044
 she-0.043
Token loved
Token ID6151
Feature activation-7.43e-01

Pos logit contributions
its+0.052
 supposed+0.049
ases+0.049
beh+0.047
ws+0.044
Neg logit contributions
 they-0.030
 Ally-0.026
 he-0.026
 roof-0.026
 �-0.026
Token is
Token ID318
Feature activation+3.48e+00

Pos logit contributions
its+0.007
ases+0.006
beh+0.006
 supposed+0.006
ws+0.006
Neg logit contributions
 they-0.005
 he-0.005
 she-0.004
 The-0.004
 roof-0.004
Token that
Token ID326
Feature activation-3.24e-01

Pos logit contributions
 supposed+0.094
its+0.088
beh+0.083
ases+0.083
 consequences+0.082
Neg logit contributions
 While-0.054
 while-0.052
 Sure-0.052
 roof-0.051
 landscape-0.050
Token big
Token ID1263
Feature activation-1.11e+00

Pos logit contributions
 they+0.008
 Sure+0.007
 wagon+0.006
 While+0.006
 while+0.006
Neg logit contributions
 supposed-0.007
its-0.006
 consequences-0.006
ases-0.005
 decisions-0.005
Token blue
Token ID4171
Feature activation-5.79e-01

Pos logit contributions
 they+0.020
 Sure+0.017
 The+0.017
 While+0.017
 Ally+0.017
Neg logit contributions
its-0.030
ases-0.028
 supposed-0.028
 consequences-0.027
beh-0.025
Token elephant
Token ID20950
Feature activation+5.21e-01

Pos logit contributions
 they+0.012
 Ally+0.011
 �+0.011
 she+0.010
 he+0.010
Neg logit contributions
its-0.014
ases-0.013
 supposed-0.012
arts-0.012
ump-0.012
Token doing
Token ID1804
Feature activation+3.49e+00

Pos logit contributions
its+0.012
 supposed+0.012
ases+0.012
beh+0.011
asing+0.011
Neg logit contributions
 they-0.011
 Ally-0.010
 roof-0.010
 wagon-0.010
 The-0.010
Token in
Token ID287
Feature activation-1.05e-01

Pos logit contributions
its+0.092
 supposed+0.087
ases+0.087
beh+0.083
 consequences+0.080
Neg logit contributions
 they-0.065
 roof-0.058
 The-0.058
 he-0.057
 Ally-0.057
Token our
Token ID674
Feature activation+5.98e-01

Pos logit contributions
 they+0.001
 he+0.001
 she+0.001
 The+0.001
 while+0.001
Neg logit contributions
its-0.004
ases-0.004
beh-0.003
 supposed-0.003
 consequences-0.003
Token lawn
Token ID21133
Feature activation-4.10e+00

Pos logit contributions
ases+0.016
its+0.015
 supposed+0.015
 consequences+0.015
beh+0.015
Neg logit contributions
 they-0.011
 he-0.010
 she-0.009
 while-0.009
 Ally-0.009
Token?"
Token ID1701
Feature activation-1.09e+00

Pos logit contributions
 they+0.091
 Ally+0.083
 he+0.080
 Big+0.079
 she+0.078
Neg logit contributions
its-0.089
ases-0.085
 supposed-0.081
beh-0.080
ump-0.076
Token
Token ID220
Feature activation+6.53e+00

Pos logit contributions
 raft+0.018
 Pole+0.018
 landscape+0.018
 train+0.017
 wagon+0.017
Neg logit contributions
its-0.030
olds-0.027
ases-0.026
 supposed-0.026
rums-0.025
Token said
Token ID531
Feature activation-3.84e-01

Pos logit contributions
 roof+0.081
 Pole+0.081
 landscape+0.081
 fireplace+0.080
 wagon+0.077
Neg logit contributions
its-0.077
beh-0.072
 tricked-0.072
 fooled-0.069
 supposed-0.067
Token,
Token ID11
Feature activation-3.95e+00

Pos logit contributions
 landscape+0.004
 fireplace+0.004
 moon+0.004
 raft+0.004
 Pole+0.004
Neg logit contributions
its-0.012
ases-0.012
vern-0.011
 consequences-0.011
beh-0.011
Token "
Token ID366
Feature activation+2.96e+00

Pos logit contributions
 landscape+0.065
 valley+0.064
 raft+0.061
 jet+0.059
 wagon+0.057
Neg logit contributions
its-0.097
 supposed-0.096
vern-0.085
vel-0.085
ps-0.085
TokenI
Token ID40
Feature activation+3.17e+00

Pos logit contributions
its+0.079
ases+0.073
beh+0.073
 supposed+0.070
asing+0.068
Neg logit contributions
 they-0.058
 he-0.052
 roof-0.052
 fireplace-0.050
 raft-0.050
Token don
Token ID836
Feature activation+7.58e+00

Pos logit contributions
its+0.081
ases+0.080
beh+0.077
 consequences+0.071
ief+0.070
Neg logit contributions
 they-0.070
 he-0.065
 while-0.062
 she-0.062
 The-0.059
Token't
Token ID470
Feature activation+5.85e+00

Pos logit contributions
its+0.217
ases+0.209
beh+0.208
ws+0.203
 consequences+0.200
Neg logit contributions
 they-0.145
 he-0.130
 she-0.126
 The-0.122
 "-0.117
Token wanna
Token ID18869
Feature activation+5.43e+00

Pos logit contributions
its+0.154
ases+0.143
ins+0.135
beh+0.132
vern+0.130
Neg logit contributions
 they-0.127
 he-0.125
 while-0.123
 she-0.121
 Tim-0.119
Token exercise
Token ID5517
Feature activation+3.91e-01

Pos logit contributions
its+0.168
ases+0.157
beh+0.156
 supposed+0.149
vern+0.147
Neg logit contributions
 they-0.089
 he-0.080
 she-0.075
 The-0.073
 while-0.073
Token.
Token ID13
Feature activation+3.89e-01

Pos logit contributions
its+0.013
ases+0.012
beh+0.012
 supposed+0.012
asing+0.011
Neg logit contributions
 they-0.006
 he-0.005
 while-0.005
 she-0.005
 The-0.004
Token This
Token ID770
Feature activation+4.84e-01

Pos logit contributions
its+0.010
ases+0.010
beh+0.010
asing+0.010
 supposed+0.009
Neg logit contributions
 they-0.007
 landscape-0.006
 raft-0.006
 valley-0.006
 he-0.006
Token weight
Token ID3463
Feature activation-1.12e-01

Pos logit contributions
its+0.013
ases+0.012
beh+0.012
 supposed+0.011
 consequences+0.011
Neg logit contributions
 they-0.010
 he-0.009
 she-0.008
 The-0.008
 roof-0.008
Token so
Token ID523
Feature activation+1.85e+00

Pos logit contributions
its+0.116
 supposed+0.109
ases+0.108
beh+0.106
 consequences+0.102
Neg logit contributions
 they-0.063
 he-0.055
 roof-0.053
 she-0.053
 The-0.052
Token shiny
Token ID22441
Feature activation+1.60e+00

Pos logit contributions
its+0.054
ases+0.051
beh+0.050
 supposed+0.048
asing+0.046
Neg logit contributions
 they-0.033
 he-0.030
 she-0.028
 The-0.028
 while-0.027
Token and
Token ID290
Feature activation-1.42e+00

Pos logit contributions
its+0.050
ases+0.047
 supposed+0.045
beh+0.045
 consequences+0.044
Neg logit contributions
 they-0.022
 roof-0.020
 Ally-0.020
 The-0.020
 Big-0.019
Token bright
Token ID6016
Feature activation+4.71e-01

Pos logit contributions
 they+0.024
 he+0.021
 The+0.021
 roof+0.020
 she+0.020
Neg logit contributions
its-0.041
ases-0.038
beh-0.038
 supposed-0.037
 consequences-0.035
Token.
Token ID13
Feature activation+1.71e-01

Pos logit contributions
its+0.011
ases+0.010
 supposed+0.010
beh+0.010
arts+0.009
Neg logit contributions
 they-0.010
 roof-0.009
 raft-0.009
 Ally-0.009
 Big-0.009
Token Is
Token ID1148
Feature activation+2.75e+00

Pos logit contributions
its+0.004
ases+0.004
beh+0.004
 vitamins+0.004
asing+0.004
Neg logit contributions
 they-0.003
 raft-0.003
 he-0.003
 landscape-0.003
 roof-0.003
Token it
Token ID340
Feature activation+1.68e+00

Pos logit contributions
its+0.088
 supposed+0.083
beh+0.081
ases+0.081
 consequences+0.078
Neg logit contributions
 fireplace-0.033
 roof-0.033
 they-0.033
 wagon-0.032
 landscape-0.032
Token original
Token ID2656
Feature activation+1.22e+00

Pos logit contributions
 supposed+0.043
its+0.040
beh+0.039
vern+0.037
 vitamins+0.036
Neg logit contributions
 wagon-0.028
 roof-0.028
 landscape-0.028
 they-0.027
 While-0.027
Token?"
Token ID1701
Feature activation-7.47e-01

Pos logit contributions
its+0.031
ases+0.029
 supposed+0.028
beh+0.027
 consequences+0.026
Neg logit contributions
 they-0.024
 The-0.022
 roof-0.022
 he-0.021
 Big-0.021
Token
Token ID198
Feature activation+3.04e+00

Pos logit contributions
 roof+0.011
 landscape+0.011
 Pole+0.011
 train+0.011
 raft+0.011
Neg logit contributions
its-0.021
ases-0.019
olds-0.018
 supposed-0.018
vern-0.018
Token
Token ID198
Feature activation+2.12e+00

Pos logit contributions
its+0.099
ases+0.091
vern+0.088
 supposed+0.087
beh+0.085
Neg logit contributions
 they-0.038
 roof-0.035
 landscape-0.035
 wagon-0.034
 he-0.033
Token her
Token ID607
Feature activation+1.78e+00

Pos logit contributions
 landscape+0.053
 valley+0.049
 roof+0.049
 Pole+0.046
 raft+0.046
Neg logit contributions
 supposed-0.115
its-0.110
render-0.106
ases-0.104
olds-0.102
Token teacher
Token ID4701
Feature activation+3.31e-01

Pos logit contributions
its+0.051
ases+0.047
 supposed+0.047
 consequences+0.046
beh+0.044
Neg logit contributions
 they-0.030
 he-0.026
 Ally-0.026
 The-0.025
 Tim-0.024
Token,
Token ID11
Feature activation-4.13e+00

Pos logit contributions
its+0.011
ases+0.011
 supposed+0.010
beh+0.010
asing+0.010
Neg logit contributions
 roof-0.003
 landscape-0.003
 raft-0.003
 they-0.003
 valley-0.003
Token and
Token ID290
Feature activation-5.07e+00

Pos logit contributions
 landscape+0.058
 roof+0.056
 valley+0.056
 Pole+0.055
 raft+0.054
Neg logit contributions
 supposed-0.115
its-0.113
ases-0.107
render-0.104
asing-0.102
Token her
Token ID607
Feature activation+1.64e+00

Pos logit contributions
 landscape+0.058
 Pole+0.057
 roof+0.056
 fireplace+0.054
 valley+0.052
Neg logit contributions
its-0.155
beh-0.154
 supposed-0.153
ases-0.153
asing-0.144
Token friends
Token ID2460
Feature activation+2.64e-01

Pos logit contributions
its+0.046
ases+0.043
 supposed+0.042
beh+0.041
 consequences+0.040
Neg logit contributions
 they-0.029
 Ally-0.026
 he-0.026
 The-0.025
 roof-0.024
Token.
Token ID13
Feature activation-1.13e+00

Pos logit contributions
its+0.007
ases+0.007
beh+0.007
 supposed+0.007
asing+0.007
Neg logit contributions
 they-0.004
 moon-0.004
 landscape-0.004
 fireplace-0.004
 roof-0.004
Token But
Token ID887
Feature activation-3.29e+00

Pos logit contributions
 raft+0.018
 landscape+0.017
 jet+0.017
 mountain+0.016
 roof+0.016
Neg logit contributions
ases-0.028
olds-0.027
its-0.027
ump-0.025
beh-0.025
Token nobody
Token ID8168
Feature activation+6.39e-01

Pos logit contributions
 Pole+0.042
 landscape+0.042
 fireplace+0.040
 roof+0.040
 moon+0.040
Neg logit contributions
its-0.088
beh-0.086
 supposed-0.086
ases-0.085
 cheating-0.084
Token seemed
Token ID3947
Feature activation+2.97e+00

Pos logit contributions
its+0.019
ases+0.018
beh+0.018
 supposed+0.018
asing+0.017
Neg logit contributions
 they-0.010
 he-0.009
 roof-0.009
 The-0.009
 fireplace-0.009
Token to
Token ID284
Feature activation+4.45e+00

Pos logit contributions
its+0.097
 supposed+0.094
ases+0.092
beh+0.091
 consequences+0.089
Neg logit contributions
 they-0.036
 roof-0.032
 fireplace-0.031
 wagon-0.030
 moon-0.030
Token to
Token ID284
Feature activation+5.63e-01

Pos logit contributions
 roof+0.021
 landscape+0.020
 fireplace+0.020
 wagon+0.020
 they+0.019
Neg logit contributions
its-0.084
ases-0.081
beh-0.080
 supposed-0.079
asing-0.076
Token always
Token ID1464
Feature activation+6.28e-01

Pos logit contributions
its+0.015
ases+0.015
beh+0.014
 supposed+0.014
 consequences+0.013
Neg logit contributions
 they-0.011
 he-0.010
 The-0.009
 she-0.009
 roof-0.009
Token hug
Token ID16225
Feature activation+1.47e+00

Pos logit contributions
its+0.017
ases+0.016
beh+0.016
 supposed+0.015
vern+0.015
Neg logit contributions
 they-0.013
 he-0.012
 she-0.011
 The-0.011
 while-0.011
Token your
Token ID534
Feature activation+2.67e+00

Pos logit contributions
its+0.043
ases+0.042
 consequences+0.040
beh+0.040
asing+0.039
Neg logit contributions
 roof-0.020
 valley-0.019
 raft-0.019
 landscape-0.019
 chim-0.018
Token friends
Token ID2460
Feature activation-1.09e-01

Pos logit contributions
its+0.074
ases+0.069
beh+0.068
 supposed+0.067
ws+0.063
Neg logit contributions
 they-0.053
 The-0.047
 he-0.047
 while-0.046
 Big-0.045
Token.
Token ID13
Feature activation+7.22e-01

Pos logit contributions
 roof+0.001
 landscape+0.001
 moon+0.001
 Ally+0.001
 fireplace+0.001
Neg logit contributions
its-0.003
ases-0.003
 supposed-0.003
beh-0.003
asing-0.003
Token<|endoftext|>
Token ID50256
Feature activation+8.74e-01

Pos logit contributions
its+0.023
ases+0.022
 supposed+0.022
beh+0.022
vern+0.021
Neg logit contributions
 raft-0.008
 roof-0.008
 wagon-0.008
 they-0.008
 landscape-0.008
TokenOnce
Token ID7454
Feature activation-1.60e+00

Pos logit contributions
its+0.030
ases+0.028
 supposed+0.027
beh+0.026
 consequences+0.026
Neg logit contributions
 they-0.010
 roof-0.009
 he-0.008
 raft-0.008
 wagon-0.008
Token there
Token ID612
Feature activation-2.98e+00

Pos logit contributions
 roof+0.020
 moon+0.020
 wagon+0.020
 they+0.020
 fireplace+0.019
Neg logit contributions
its-0.049
ases-0.045
 supposed-0.044
vern-0.043
beh-0.041
Token was
Token ID373
Feature activation-9.48e-01

Pos logit contributions
 roof+0.050
 raft+0.049
 Ally+0.048
 moon+0.047
 tractor+0.047
Neg logit contributions
its-0.079
ases-0.070
 supposed-0.068
 consequences-0.065
ump-0.065
Token a
Token ID257
Feature activation-2.32e+00

Pos logit contributions
 they+0.014
 Pole+0.013
 he+0.013
 roof+0.013
 landscape+0.012
Neg logit contributions
its-0.027
ases-0.027
 supposed-0.026
beh-0.025
 consequences-0.024
Token tired
Token ID10032
Feature activation-2.22e+00

Pos logit contributions
its+0.129
ases+0.121
beh+0.112
ws+0.111
asing+0.110
Neg logit contributions
 they-0.070
 he-0.063
 she-0.061
 The-0.057
 while-0.057
Token.
Token ID13
Feature activation-1.11e+00

Pos logit contributions
 they+0.033
 Ally+0.032
 The+0.030
 Mill+0.030
 Big+0.030
Neg logit contributions
its-0.065
ases-0.062
 supposed-0.060
beh-0.059
 consequences-0.058
Token Her
Token ID2332
Feature activation+7.47e-01

Pos logit contributions
 raft+0.019
 they+0.018
 mountain+0.017
 boat+0.016
 he+0.016
Neg logit contributions
its-0.028
 preferences-0.026
 vitamins-0.026
 consequences-0.025
olds-0.025
Token mom
Token ID1995
Feature activation+7.63e-01

Pos logit contributions
its+0.025
ases+0.023
 supposed+0.023
beh+0.022
 consequences+0.022
Neg logit contributions
 they-0.009
 he-0.008
 Ally-0.008
 The-0.008
 �-0.008
Token told
Token ID1297
Feature activation-1.15e+00

Pos logit contributions
its+0.021
ases+0.019
beh+0.019
 supposed+0.019
asing+0.018
Neg logit contributions
 they-0.013
 roof-0.012
 landscape-0.012
 The-0.012
 moon-0.012
Token her
Token ID607
Feature activation+4.83e-02

Pos logit contributions
 they+0.014
 he+0.012
 The+0.011
 roof+0.011
 she+0.011
Neg logit contributions
its-0.040
ases-0.038
beh-0.037
 supposed-0.036
 consequences-0.035
Token it
Token ID340
Feature activation+2.55e-02

Pos logit contributions
its+0.002
ases+0.002
beh+0.002
 supposed+0.002
 consequences+0.001
Neg logit contributions
 they-0.001
 he-0.000
 The-0.000
 moon-0.000
 roof-0.000
Token was
Token ID373
Feature activation+5.25e+00

Pos logit contributions
its+0.001
 supposed+0.001
beh+0.001
ases+0.001
 consequences+0.001
Neg logit contributions
 they-0.000
 he-0.000
 moon-0.000
 The-0.000
 roof-0.000
Token time
Token ID640
Feature activation+5.41e-01

Pos logit contributions
its+0.137
ases+0.129
beh+0.124
 supposed+0.118
asing+0.115
Neg logit contributions
 they-0.111
 he-0.100
 she-0.097
 The-0.096
 while-0.095
Token for
Token ID329
Feature activation-3.38e-01

Pos logit contributions
its+0.017
 supposed+0.016
ases+0.016
beh+0.015
 consequences+0.015
Neg logit contributions
 they-0.007
 landscape-0.007
 moon-0.007
 valley-0.007
 Ally-0.007
Token a
Token ID257
Feature activation+5.18e-01

Pos logit contributions
 they+0.004
 he+0.004
 she+0.003
 The+0.003
 while+0.003
Neg logit contributions
its-0.012
ases-0.011
beh-0.011
 supposed-0.011
asing-0.010
Token in
Token ID287
Feature activation-3.29e+00

Pos logit contributions
 they+0.006
 roof+0.005
 �+0.005
 Mill+0.005
 Ally+0.005
Neg logit contributions
its-0.010
ases-0.010
beh-0.010
 supposed-0.010
 consequences-0.009
Token a
Token ID257
Feature activation-3.44e+00

Pos logit contributions
 they+0.038
 Pole+0.028
 roof+0.028
 he+0.028
 Ally+0.028
Neg logit contributions
its-0.113
ases-0.110
 supposed-0.105
beh-0.103
 consequences-0.101
Token park
Token ID3952
Feature activation-4.17e+00

Pos logit contributions
 they+0.056
 Mill+0.050
 he+0.049
 Bobby+0.047
 Little+0.047
Neg logit contributions
its-0.099
ases-0.093
 supposed-0.090
beh-0.086
vern-0.085
Token on
Token ID319
Feature activation-3.64e+00

Pos logit contributions
 Ally+0.073
 they+0.071
 Mill+0.070
 wagon+0.068
 Big+0.067
Neg logit contributions
its-0.107
ases-0.104
 supposed-0.101
beh-0.097
ump-0.094
Token a
Token ID257
Feature activation-2.87e+00

Pos logit contributions
 they+0.057
 �+0.056
 Ally+0.054
 Mill+0.054
 Pole+0.053
Neg logit contributions
ases-0.095
its-0.094
 consequences-0.087
 supposed-0.084
 vitamins-0.084
Token warm
Token ID5814
Feature activation-2.31e+00

Pos logit contributions
 they+0.047
 Mill+0.044
 Bobby+0.042
 Little+0.041
 Tim+0.040
Neg logit contributions
its-0.078
 supposed-0.073
vern-0.071
ases-0.069
arts-0.067
Token summer
Token ID3931
Feature activation-3.41e+00

Pos logit contributions
 they+0.036
 �+0.033
 Sure+0.031
 Little+0.029
 he+0.028
Neg logit contributions
its-0.057
ump-0.054
arts-0.052
ases-0.051
ands-0.051
Token afternoon
Token ID6672
Feature activation-6.00e+00

Pos logit contributions
 �+0.049
 they+0.048
 Ally+0.047
 Sure+0.046
 Mill+0.046
Neg logit contributions
its-0.082
 supposed-0.081
ump-0.081
iling-0.077
arts-0.076
Token.
Token ID13
Feature activation-1.98e+00

Pos logit contributions
 Mill+0.116
 Ally+0.116
 �+0.114
 wagon+0.112
 Sure+0.112
Neg logit contributions
its-0.119
 supposed-0.117
ump-0.117
ases-0.115
 consequences-0.112
Token Suddenly
Token ID24975
Feature activation-6.93e+00

Pos logit contributions
 they+0.031
 jet+0.031
 raft+0.030
 train+0.027
 Ally+0.027
Neg logit contributions
its-0.049
ases-0.047
 vitamins-0.047
 supposed-0.046
ump-0.044
Token,
Token ID11
Feature activation-6.19e+00

Pos logit contributions
stack+0.119
 porch+0.111
 ranch+0.106
 landscape+0.105
 fireplace+0.104
Neg logit contributions
 consequences-0.157
ases-0.142
 tricked-0.142
 trouble-0.142
 cheating-0.140
Token\n
Token ID198
Feature activation+1.39e+00

Pos logit contributions
 they+0.021
 raft+0.019
 roof+0.019
 while+0.018
 she+0.018
Neg logit contributions
its-0.041
ases-0.040
 vitamins-0.035
ump-0.035
 consequences-0.034
Token\n
Token ID198
Feature activation+3.72e-01

Pos logit contributions
its+0.053
ases+0.051
beh+0.050
 supposed+0.049
 consequences+0.047
Neg logit contributions
 they-0.012
 he-0.009
 The-0.008
 she-0.008
 roof-0.007
TokenThe
Token ID464
Feature activation-2.10e+00

Pos logit contributions
its+0.010
ases+0.010
beh+0.009
 supposed+0.009
vern+0.009
Neg logit contributions
 they-0.007
 he-0.006
 Mill-0.006
 roof-0.006
 she-0.006
Token boat
Token ID8848
Feature activation-1.82e+00

Pos logit contributions
 they+0.018
 he+0.014
 The+0.014
 she+0.014
 Mill+0.013
Neg logit contributions
its-0.079
ases-0.075
beh-0.072
 supposed-0.072
 consequences-0.071
Token floated
Token ID29742
Feature activation-2.10e-01

Pos logit contributions
 fireplace+0.033
 �+0.032
 Mill+0.031
 Aunt+0.031
 Little+0.031
Neg logit contributions
its-0.039
 tricked-0.036
ases-0.036
beh-0.036
 supposed-0.035
Token on
Token ID319
Feature activation-3.99e+00

Pos logit contributions
 chim+0.004
 roof+0.004
 Pole+0.004
 fireplace+0.004
 she+0.003
Neg logit contributions
its-0.005
ases-0.005
ac-0.005
ws-0.004
aked-0.004
Token the
Token ID262
Feature activation-3.09e+00

Pos logit contributions
 Pole+0.058
 she+0.054
+0.051
 they+0.051
lit+0.050
Neg logit contributions
ases-0.106
its-0.101
 consequences-0.095
 vitamins-0.093
ws-0.092
Token water
Token ID1660
Feature activation-4.23e+00

Pos logit contributions
 they+0.047
 she+0.046
 Mill+0.045
 Tim+0.042
 Sure+0.042
Neg logit contributions
its-0.085
 consequences-0.081
 supposed-0.079
ases-0.078
arts-0.076
Token and
Token ID290
Feature activation-5.51e+00

Pos logit contributions
 Mill+0.063
 Ally+0.059
 Aunt+0.058
 wagon+0.057
 Bobby+0.057
Neg logit contributions
its-0.109
ases-0.108
ump-0.103
 supposed-0.100
arts-0.098
Token looked
Token ID3114
Feature activation+5.81e-01

Pos logit contributions
 fireplace+0.090
 wagon+0.090
 �+0.087
 roof+0.086
 Mill+0.084
Neg logit contributions
ws-0.119
 tricked-0.118
 supposed-0.116
ases-0.115
 fooled-0.112
Token for
Token ID329
Feature activation-2.10e+00

Pos logit contributions
its+0.014
 supposed+0.013
ws+0.013
ases+0.013
 consequences+0.012
Neg logit contributions
 roof-0.010
 fireplace-0.010
stack-0.010
 they-0.010
 chim-0.010
TokenTom
Token ID13787
Feature activation+2.79e+00

Pos logit contributions
its+0.029
ases+0.028
beh+0.027
 supposed+0.026
 consequences+0.025
Neg logit contributions
 they-0.021
 he-0.019
 The-0.019
 she-0.018
 while-0.018
Token took
Token ID1718
Feature activation+2.44e+00

Pos logit contributions
its+0.089
ases+0.084
beh+0.081
 supposed+0.080
asing+0.077
Neg logit contributions
 they-0.041
 he-0.035
 roof-0.034
 she-0.034
 Ally-0.033
Token the
Token ID262
Feature activation-1.44e+00

Pos logit contributions
its+0.097
ases+0.092
beh+0.091
 supposed+0.089
 consequences+0.086
Neg logit contributions
 they-0.018
 he-0.014
 The-0.012
 she-0.012
 while-0.011
Token trophy
Token ID16383
Feature activation-2.68e+00

Pos logit contributions
 they+0.027
 he+0.025
 she+0.024
 The+0.024
 roof+0.023
Neg logit contributions
its-0.040
ases-0.038
beh-0.037
 supposed-0.036
ws-0.034
Token to
Token ID284
Feature activation-2.76e+00

Pos logit contributions
 Big+0.037
 Ally+0.037
 Mill+0.035
 they+0.035
 Pole+0.035
Neg logit contributions
ases-0.077
its-0.076
 supposed-0.072
beh-0.071
asing-0.070
Token the
Token ID262
Feature activation-3.37e+00

Pos logit contributions
 they+0.029
 he+0.022
 Ally+0.022
 landscape+0.022
 Pole+0.022
Neg logit contributions
its-0.098
ases-0.094
 supposed-0.091
beh-0.089
ws-0.087
Token bathroom
Token ID12436
Feature activation-3.11e+00

Pos logit contributions
 they+0.057
 Ally+0.051
 Mill+0.050
 he+0.049
 The+0.048
Neg logit contributions
its-0.098
ases-0.093
 supposed-0.089
 consequences-0.088
beh-0.087
Token and
Token ID290
Feature activation-4.99e+00

Pos logit contributions
 Ally+0.037
 they+0.034
 Big+0.033
 Mill+0.033
 Sun+0.033
Neg logit contributions
its-0.099
ases-0.096
 supposed-0.092
 consequences-0.088
beh-0.088
Token filled
Token ID5901
Feature activation-2.20e+00

Pos logit contributions
 Pole+0.102
 landscape+0.099
 Ally+0.099
 valley+0.099
 Sun+0.096
Neg logit contributions
its-0.103
ases-0.096
ws-0.096
 supposed-0.096
 tricked-0.095
Token the
Token ID262
Feature activation-1.38e+00

Pos logit contributions
 they+0.019
 he+0.016
 The+0.013
 she+0.013
 Tim+0.012
Neg logit contributions
its-0.084
ases-0.081
beh-0.081
 supposed-0.079
vern-0.077
Token sink
Token ID14595
Feature activation-2.31e+00

Pos logit contributions
 they+0.023
 he+0.020
 Ally+0.020
 she+0.020
 Big+0.019
Neg logit contributions
its-0.041
ases-0.039
beh-0.037
 supposed-0.037
 consequences-0.036
Token.
Token ID13
Feature activation+1.22e-01

Pos logit contributions
 they+0.022
 Ally+0.021
 Sun+0.021
 Big+0.020
 moon+0.020
Neg logit contributions
its-0.039
ases-0.036
 supposed-0.035
beh-0.034
 consequences-0.033
Token She
Token ID1375
Feature activation-1.27e+00

Pos logit contributions
its+0.004
ases+0.004
olds+0.003
beh+0.003
 supposed+0.003
Neg logit contributions
 raft-0.002
 moon-0.001
 landscape-0.001
 roof-0.001
 they-0.001
Token decided
Token ID3066
Feature activation-2.25e+00

Pos logit contributions
 they+0.021
 he+0.018
 she+0.018
 The+0.018
 roof+0.017
Neg logit contributions
its-0.038
ases-0.036
beh-0.035
 supposed-0.034
ws-0.033
Token to
Token ID284
Feature activation+3.87e-01

Pos logit contributions
 fireplace+0.037
 moon+0.036
 Sun+0.035
 Big+0.034
 landscape+0.034
Neg logit contributions
 supposed-0.060
its-0.058
beh-0.056
 consequences-0.055
ases-0.055
Token pick
Token ID2298
Feature activation+3.86e-01

Pos logit contributions
its+0.011
ases+0.010
beh+0.010
 supposed+0.009
ws+0.009
Neg logit contributions
 they-0.007
 he-0.007
 she-0.006
 The-0.006
 while-0.006
Token up
Token ID510
Feature activation-4.22e+00

Pos logit contributions
its+0.013
ases+0.012
beh+0.012
 supposed+0.011
ws+0.011
Neg logit contributions
 they-0.005
 he-0.005
 roof-0.004
 The-0.004
 she-0.004
Token some
Token ID617
Feature activation-1.35e+00

Pos logit contributions
 they+0.043
 Pole+0.043
 roof+0.043
 fireplace+0.042
 moon+0.041
Neg logit contributions
its-0.140
ases-0.135
beh-0.130
 consequences-0.126
 supposed-0.125
Token of
Token ID286
Feature activation-2.35e+00

Pos logit contributions
 they+0.025
 he+0.022
 The+0.022
 she+0.021
 roof+0.021
Neg logit contributions
its-0.038
ases-0.036
beh-0.034
 supposed-0.034
 consequences-0.032
Token the
Token ID262
Feature activation-1.43e+00

Pos logit contributions
 they+0.021
 Pole+0.020
 roof+0.018
 he+0.017
 moon+0.017
Neg logit contributions
its-0.084
ases-0.080
beh-0.076
 supposed-0.076
 consequences-0.074
Token junk
Token ID18556
Feature activation-3.17e+00

Pos logit contributions
 they+0.028
 he+0.025
 she+0.024
 roof+0.024
 The+0.024
Neg logit contributions
its-0.040
ases-0.038
beh-0.037
 supposed-0.036
vern-0.034
Token and
Token ID290
Feature activation-4.55e+00

Pos logit contributions
 Big+0.035
 they+0.035
 Ally+0.033
 Sun+0.033
 The+0.033
Neg logit contributions
its-0.103
ases-0.100
beh-0.095
 supposed-0.095
 consequences-0.092
Token They
Token ID1119
Feature activation-1.15e+00

Pos logit contributions
its+0.022
ases+0.021
olds+0.019
ws+0.019
 vitamins+0.019
Neg logit contributions
 raft-0.010
 landscape-0.009
 he-0.009
 Ally-0.009
 moon-0.009
Token came
Token ID1625
Feature activation+3.11e+00

Pos logit contributions
 they+0.019
 he+0.017
 The+0.017
 she+0.016
 Ally+0.016
Neg logit contributions
its-0.034
ases-0.032
beh-0.031
 supposed-0.030
ws-0.029
Token back
Token ID736
Feature activation-2.00e+00

Pos logit contributions
its+0.097
ases+0.096
 supposed+0.092
beh+0.090
asing+0.085
Neg logit contributions
 they-0.040
 he-0.038
 Pole-0.036
 landscape-0.035
 The-0.035
Token to
Token ID284
Feature activation-2.12e+00

Pos logit contributions
 raft+0.030
 Ally+0.029
 Pole+0.028
 landscape+0.028
 moon+0.028
Neg logit contributions
ases-0.056
its-0.056
 supposed-0.055
beh-0.051
asured-0.049
Token the
Token ID262
Feature activation-2.38e+00

Pos logit contributions
 Pole+0.025
 wagon+0.023
 landscape+0.023
 raft+0.023
 he+0.023
Neg logit contributions
ases-0.066
its-0.064
ws-0.061
 supposed-0.060
 consequences-0.060
Token kitchen
Token ID9592
Feature activation-4.22e+00

Pos logit contributions
 they+0.042
 Ally+0.042
 Mill+0.039
 he+0.038
 Big+0.037
Neg logit contributions
its-0.065
 supposed-0.060
 consequences-0.059
ases-0.059
arts-0.055
Token and
Token ID290
Feature activation-4.49e+00

Pos logit contributions
 Ally+0.052
 they+0.048
 Mill+0.047
 Big+0.046
 raft+0.044
Neg logit contributions
its-0.136
ases-0.127
 supposed-0.123
arts-0.119
 consequences-0.118
Token set
Token ID900
Feature activation+1.11e+00

Pos logit contributions
 Ally+0.081
 landscape+0.081
 Pole+0.077
 valley+0.077
 raft+0.076
Neg logit contributions
its-0.106
ases-0.105
ws-0.103
beh-0.100
 supposed-0.099
Token up
Token ID510
Feature activation-2.97e+00

Pos logit contributions
its+0.036
ases+0.034
 supposed+0.032
ws+0.031
beh+0.031
Neg logit contributions
 landscape-0.013
 they-0.013
 Ally-0.013
 he-0.013
 moon-0.012
Token their
Token ID511
Feature activation+1.03e+00

Pos logit contributions
 Ally+0.035
 Pole+0.033
 wagon+0.033
 landscape+0.032
 they+0.031
Neg logit contributions
its-0.096
ases-0.093
 supposed-0.088
beh-0.086
ws-0.085
Token stage
Token ID3800
Feature activation-7.75e-01

Pos logit contributions
its+0.026
ases+0.024
 supposed+0.023
 consequences+0.023
beh+0.022
Neg logit contributions
 they-0.021
 he-0.020
 Ally-0.020
 �-0.019
 The-0.019
Token stepped
Token ID10764
Feature activation+2.73e+00

Pos logit contributions
 they+0.010
 roof+0.009
 fireplace+0.009
 The+0.009
 Ally+0.009
Neg logit contributions
its-0.024
ases-0.023
beh-0.023
ws-0.021
 supposed-0.021
Token into
Token ID656
Feature activation-4.69e+00

Pos logit contributions
ases+0.066
its+0.065
oken+0.063
 consequences+0.061
vern+0.060
Neg logit contributions
 they-0.050
 roof-0.048
 landscape-0.045
 chim-0.045
 Pole-0.044
Token the
Token ID262
Feature activation-3.26e+00

Pos logit contributions
 they+0.043
 Pole+0.035
 roof+0.034
 Ally+0.032
 landscape+0.031
Neg logit contributions
its-0.166
ases-0.161
 consequences-0.152
 supposed-0.150
beh-0.149
Token wheat
Token ID17135
Feature activation-5.68e+00

Pos logit contributions
 Bobby+0.047
 they+0.046
 Tim+0.043
 he+0.043
 Peter+0.042
Neg logit contributions
its-0.094
 consequences-0.092
 supposed-0.086
 vitamins-0.082
 decisions-0.082
Token field
Token ID2214
Feature activation-5.50e+00

Pos logit contributions
 While+0.070
 Ally+0.067
 K+0.067
 Mill+0.067
 Aunt+0.065
Neg logit contributions
its-0.134
ump-0.130
ases-0.130
 consequences-0.130
 vitamins-0.126
Token,
Token ID11
Feature activation-5.82e+00

Pos logit contributions
 moon+0.070
 landscape+0.069
 fireplace+0.068
 Ally+0.068
 wagon+0.067
Neg logit contributions
ases-0.140
its-0.140
ump-0.134
 consequences-0.132
olds-0.127
Token something
Token ID1223
Feature activation-7.15e-01

Pos logit contributions
 landscape+0.074
 ranch+0.069
 Ally+0.067
 porch+0.067
 fireplace+0.065
Neg logit contributions
 consequences-0.148
 trouble-0.144
its-0.141
 tricked-0.141
 cheating-0.138
Token bit
Token ID1643
Feature activation+1.91e-01

Pos logit contributions
 they+0.014
 The+0.014
 fireplace+0.014
 roof+0.014
 Pole+0.014
Neg logit contributions
its-0.017
 supposed-0.015
 consequences-0.015
beh-0.015
ases-0.015
Token him
Token ID683
Feature activation+1.37e+00

Pos logit contributions
its+0.005
ases+0.005
beh+0.004
 supposed+0.004
 consequences+0.004
Neg logit contributions
 they-0.004
 roof-0.004
 The-0.003
 Pole-0.003
 landscape-0.003
Token on
Token ID319
Feature activation-1.15e+00

Pos logit contributions
its+0.038
 supposed+0.036
 consequences+0.034
ases+0.034
beh+0.033
Neg logit contributions
 they-0.022
 Ally-0.022
 Sun-0.021
 roof-0.021
 Mill-0.021
Token his
Token ID465
Feature activation+6.15e-01

Pos logit contributions
 they+0.012
 Pole+0.012
 Ally+0.011
 Sure+0.011
 roof+0.010
Neg logit contributions
its-0.038
ases-0.035
 consequences-0.035
 supposed-0.033
ws-0.033