TOP ACTIVATIONS
MAX = 11.193

pulled out a handful of penn ies and gave them to
The man gives them four penn ies . He takes the
her pick up all the penn ies . He realized that
penny was not like other penn ies . It was unusual
" We are looking for penn ies ," Tom said .
on it than the other penn ies she had . She
gy bank and all her penn ies scattered on the floor
ies . They have two penn ies . They are happy
, hoping to find more penn ies . She had noticed
Ben and Lily count their penn ies . They have two
help reun ite the two penn ies . She would never
it shiny like my other penn ies ?" Her mom said
find some more of these penn ies , I can be
't shine like her other penn ies . L
did not find any more penn ies . They
back to playing with the penn ies . They forgot about
buy any candy with his penn ies , no matter how
choices . He had two penn ies
can still play with my penn ies . I have many
She loved to collect shiny penn ies . One day ,

MIN ACTIVATIONS
MIN = -12.260

her hair in a pony tail so she could look pretty
, Emma decided a pony tail wasn 't so important after
with glasses and a pony tail . She is reading a
along to the hy m ns , said prayers and listened
trying to keep the he ns in line . Billy watched
little shelter for the he ns . Billy couldn 't believe
was playing with some cor ns he found in his garden
was a little girl named Luna . She was only three
mist to go away . Luna knew something good would come
, I 'm sorry ," Luna says . " I was
bad dog . Anna gave Luna water and food , but
floor . They talk to Luna in soft voices . They
water and food , but Luna did not want to play
From then on , whenever Luna found herself in a beautiful
clay balls together !" But Luna ignored Max . She did
lab and make sure that Luna was safe .
He was very happy . Luna was a little sad ,
One day , Luna found a ball that could
" We are sorry , Luna . We want to be
's house . We found Luna in a box in the

INTERVAL 5.330 - 11.193
CONTAINS 0.604%

Why are you not playing nicely ?" " Ben
and thanked him . The officer smiled and said " It
be careful ." Jimmy understood and he kept walking
says . L ily and Ben walk to the
rock was not worth any money , but it was still

INTERVAL -0.533 - 5.330
CONTAINS 69.542%

While he was playing , a big dog came and took
so happy she could go to school after all . She
to help her . She wants her er aser back .
either . One day , Tim my was playing
was a little girl named Lily . She loved to read

INTERVAL -6.396 - -0.533
CONTAINS 29.672%

man ran away into the trees . The z
got there , the girl saw so many zo oms and
do you want to come on our raft ?" The duck
not know what the tide was , but they wanted to
felt proud . Later that day , Tim my

INTERVAL -12.260 - -6.396
CONTAINS 0.182%

and stopped in front of Lisa . It made a funny
and said , " By e bye ! I had a
Rex . Rex smelled the bisc uit and ran to get
. So he continued to nib ble away at the cheese
there was a pig named Pat . Pat loved to eat
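
Note on the intervals above: the four interval boundaries look like equal-width bins spanning MIN to MAX, since (11.193 - (-12.260)) / 4 is about 5.863 and the listed edges match that spacing to within rounding. The sketch below reproduces the edges under that assumption and shows one way the CONTAINS percentages could be computed. The function names `equal_width_edges` and `bin_shares` are hypothetical and are not taken from the tool that produced this dump.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 evenly spaced edges from lo to hi (assumed binning scheme)."""
    step = (hi - lo) / n_bins
    edges = [lo + i * step for i in range(n_bins + 1)]
    edges[-1] = hi  # pin the last edge so the maximum always lands in the top bin
    return edges


def bin_shares(values, edges):
    """Percentage of values in each bin; bins are half-open except the last, which is closed."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    total = len(values)
    return [100.0 * c / total for c in counts]


# MIN and MAX as reported in this dump; the edges come out lowest first,
# whereas the dump lists the intervals from highest to lowest.
edges = equal_width_edges(-12.260, 11.193, 4)

# Toy activations for illustration only, not data from the dump.
shares = bin_shares([-12.26, 0.0, 1.0, 11.193], edges)
```

Because MIN and MAX are themselves rounded to three decimals, the recomputed edges can differ from the listed ones in the last digit.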
Token pulled
Token ID5954
Feature activation+3.38e-01

Pos logit contributions
 disob+0.039
 kids+0.038
ying+0.037
ily+0.036
asure+0.035
Neg logit contributions
elope-0.071
oc-0.070
tail-0.064
ns-0.061
 den-0.059
Token out
Token ID503
Feature activation-2.28e+00

Pos logit contributions
 disob+0.018
 kids+0.017
ying+0.017
ily+0.016
asure+0.016
Neg logit contributions
elope-0.030
oc-0.030
tail-0.028
tails-0.025
ns-0.025
Token a
Token ID257
Feature activation+3.47e-01

Pos logit contributions
oc+0.192
elope+0.190
tail+0.173
tails+0.162
 Louise+0.161
Neg logit contributions
 disob-0.135
ying-0.124
asure-0.118
 kids-0.116
ily-0.113
Token handful
Token ID10089
Feature activation+1.41e+00

Pos logit contributions
 disob+0.020
ying+0.020
 kids+0.019
ily+0.019
asure+0.019
Neg logit contributions
oc-0.030
elope-0.029
tail-0.027
 den-0.024
tails-0.024
Token of
Token ID286
Feature activation-7.08e-01

Pos logit contributions
 disob+0.091
asure+0.086
 kids+0.079
ateful+0.077
ily+0.076
Neg logit contributions
oc-0.111
elope-0.102
tail-0.093
 tal-0.092
ns-0.089
Token penn
Token ID22429
Feature activation+1.12e+01

Pos logit contributions
oc+0.051
tail+0.046
 tal+0.044
ns+0.043
 newfound+0.043
Neg logit contributions
ying-0.050
 disob-0.045
ust-0.043
asure-0.042
aund-0.041
Tokenies
Token ID444
Feature activation+3.88e-01

Pos logit contributions
 disob+0.718
ily+0.698
 truth+0.658
 kids+0.655
 ma+0.637
Neg logit contributions
oc-0.891
elope-0.843
tails-0.809
tail-0.771
 tal-0.758
Token and
Token ID290
Feature activation+1.34e-01

Pos logit contributions
 disob+0.020
 kids+0.020
ily+0.019
ying+0.018
asure+0.018
Neg logit contributions
elope-0.035
oc-0.034
tail-0.032
tails-0.030
 den-0.029
Token gave
Token ID2921
Feature activation+9.60e-01

Pos logit contributions
 disob+0.007
 kids+0.006
ying+0.006
ily+0.006
asure+0.006
Neg logit contributions
elope-0.012
oc-0.012
tail-0.011
tails-0.011
ns-0.011
Token them
Token ID606
Feature activation+1.04e+00

Pos logit contributions
 disob+0.053
ying+0.048
 kids+0.047
asure+0.046
ateful+0.044
Neg logit contributions
elope-0.084
oc-0.082
tail-0.077
ns-0.072
tails-0.071
Token to
Token ID284
Feature activation+4.55e-01

Pos logit contributions
ying+0.066
 disob+0.066
ateful+0.061
ailed+0.060
 kids+0.059
Neg logit contributions
oc-0.076
elope-0.074
tail-0.071
tails-0.067
ns-0.067
TokenThe
Token ID464
Feature activation-1.72e-01

Pos logit contributions
oc+0.017
elope+0.016
tail+0.014
ns+0.013
Bet+0.013
Neg logit contributions
 disob-0.014
ying-0.014
 kids-0.013
asure-0.013
ily-0.013
Token man
Token ID582
Feature activation+1.94e+00

Pos logit contributions
oc+0.009
 herd+0.008
 roaring+0.007
tail+0.007
ns+0.007
Neg logit contributions
ust-0.015
ying-0.015
ily-0.014
asure-0.014
nes-0.013
Token gives
Token ID3607
Feature activation+1.95e+00

Pos logit contributions
 disob+0.129
 kids+0.122
ying+0.121
ily+0.119
asure+0.119
Neg logit contributions
elope-0.146
oc-0.145
tail-0.132
ns-0.121
 den-0.117
Token them
Token ID606
Feature activation+1.04e+00

Pos logit contributions
 disob+0.104
asure+0.092
 kids+0.091
ying+0.090
ily+0.085
Neg logit contributions
elope-0.177
oc-0.172
tail-0.161
ns-0.150
 Louise-0.150
Token four
Token ID1440
Feature activation+3.55e+00

Pos logit contributions
 disob+0.061
ying+0.059
asure+0.054
ateful+0.053
 kids+0.053
Neg logit contributions
oc-0.082
elope-0.081
tail-0.078
tails-0.072
ns-0.071
Token penn
Token ID22429
Feature activation+1.12e+01

Pos logit contributions
asure+0.212
 disob+0.207
ying+0.200
ily+0.190
ateful+0.190
Neg logit contributions
oc-0.311
elope-0.279
tail-0.268
 tal-0.253
ns-0.242
Tokenies
Token ID444
Feature activation+3.85e-01

Pos logit contributions
 disob+0.702
ily+0.638
 kids+0.633
 ma+0.632
 truth+0.630
Neg logit contributions
oc-0.887
elope-0.844
tails-0.817
tail-0.788
 tal-0.752
Token.
Token ID13
Feature activation+1.69e-01

Pos logit contributions
 disob+0.022
 kids+0.020
ying+0.020
ateful+0.019
asure+0.019
Neg logit contributions
elope-0.032
oc-0.032
tail-0.030
tails-0.028
ns-0.027
Token He
Token ID679
Feature activation+7.65e-01

Pos logit contributions
 disob+0.012
ying+0.011
asure+0.011
 kids+0.011
ateful+0.010
Neg logit contributions
oc-0.012
elope-0.012
tail-0.011
 Louise-0.010
 den-0.010
Token takes
Token ID2753
Feature activation+3.13e+00

Pos logit contributions
 disob+0.043
 kids+0.041
ying+0.040
ily+0.039
asure+0.038
Neg logit contributions
elope-0.066
oc-0.065
tail-0.060
ns-0.056
 den-0.055
Token the
Token ID262
Feature activation+2.56e-01

Pos logit contributions
 disob+0.157
 kids+0.144
ying+0.142
asure+0.138
ily+0.136
Neg logit contributions
elope-0.290
oc-0.287
tail-0.269
ns-0.249
tails-0.247
Token her
Token ID607
Feature activation-4.86e-01

Pos logit contributions
asure+0.030
 disob+0.028
ying+0.027
 penn+0.027
ateful+0.027
Neg logit contributions
elope-0.039
oc-0.037
tail-0.036
 Louise-0.034
 Luna-0.034
Token pick
Token ID2298
Feature activation+3.31e+00

Pos logit contributions
elope+0.043
oc+0.041
tail+0.038
tails+0.037
 den+0.035
Neg logit contributions
ily-0.027
 kids-0.026
asure-0.026
 disob-0.025
ateful-0.025
Token up
Token ID510
Feature activation+8.49e-01

Pos logit contributions
 disob+0.189
 kids+0.172
asure+0.172
ying+0.165
ateful+0.162
Neg logit contributions
elope-0.283
oc-0.281
tail-0.265
ns-0.239
tails-0.239
Token all
Token ID477
Feature activation+4.61e+00

Pos logit contributions
 disob+0.044
 kids+0.040
asure+0.040
ying+0.039
ily+0.038
Neg logit contributions
elope-0.077
oc-0.076
tail-0.072
 Louise-0.066
tails-0.065
Token the
Token ID262
Feature activation+2.55e-01

Pos logit contributions
 disob+0.289
asure+0.289
 truth+0.243
 kids+0.243
asuring+0.241
Neg logit contributions
elope-0.356
oc-0.355
tail-0.308
 harmony-0.305
 Louise-0.300
Token penn
Token ID22429
Feature activation+1.12e+01

Pos logit contributions
asure+0.016
ily+0.014
ateful+0.014
 sir+0.014
ying+0.014
Neg logit contributions
oc-0.021
elope-0.018
 den-0.018
ns-0.017
 tal-0.017
Tokenies
Token ID444
Feature activation+3.87e-01

Pos logit contributions
 disob+0.729
ily+0.691
asure+0.674
 truth+0.668
 ma+0.666
Neg logit contributions
oc-0.862
elope-0.842
tails-0.792
tail-0.752
eds-0.729
Token.
Token ID13
Feature activation+1.69e-01

Pos logit contributions
 disob+0.023
 kids+0.022
asure+0.021
ily+0.021
ateful+0.021
Neg logit contributions
elope-0.031
oc-0.031
tail-0.029
 den-0.026
tails-0.026
Token He
Token ID679
Feature activation+7.66e-01

Pos logit contributions
 disob+0.010
 kids+0.010
ily+0.010
ying+0.009
asure+0.009
Neg logit contributions
elope-0.014
oc-0.014
tail-0.013
tails-0.012
ns-0.012
Token realized
Token ID6939
Feature activation-8.30e-01

Pos logit contributions
 disob+0.038
 kids+0.037
ying+0.035
ily+0.035
asure+0.034
Neg logit contributions
elope-0.071
oc-0.071
tail-0.065
 den-0.060
tails-0.060
Token that
Token ID326
Feature activation+1.50e+00

Pos logit contributions
oc+0.058
elope+0.058
 Patch+0.052
tail+0.052
 Louise+0.051
Neg logit contributions
 disob-0.054
asure-0.051
ateful-0.049
 kids-0.049
ying-0.049
Token penny
Token ID26369
Feature activation+7.81e+00

Pos logit contributions
ying+0.070
 disob+0.068
ily+0.067
ateful+0.067
 kids+0.064
Neg logit contributions
oc-0.103
tail-0.093
elope-0.090
 tal-0.089
cone-0.088
Token was
Token ID373
Feature activation-8.58e-01

Pos logit contributions
 disob+0.529
 kids+0.504
ateful+0.502
ying+0.490
 ma+0.481
Neg logit contributions
oc-0.570
elope-0.570
tail-0.525
tails-0.483
ns-0.457
Token not
Token ID407
Feature activation+1.48e+00

Pos logit contributions
oc+0.052
elope+0.052
tail+0.051
 den+0.047
tails+0.046
Neg logit contributions
ying-0.061
 disob-0.058
 kids-0.057
 reasons-0.055
ateful-0.054
Token like
Token ID588
Feature activation-1.06e+00

Pos logit contributions
 disob+0.101
ying+0.099
ateful+0.095
ople+0.095
 salon+0.095
Neg logit contributions
oc-0.086
elope-0.082
tail-0.081
 tal-0.080
tails-0.075
Token other
Token ID584
Feature activation-1.58e+00

Pos logit contributions
oc+0.068
elope+0.063
tail+0.062
 tal+0.062
 Patch+0.061
Neg logit contributions
ying-0.069
ailed-0.066
ily-0.066
 disob-0.065
umber-0.061
Token penn
Token ID22429
Feature activation+1.12e+01

Pos logit contributions
oc+0.134
tail+0.127
elope+0.117
 tal+0.115
tails+0.110
Neg logit contributions
ying-0.095
ily-0.090
asure-0.082
arl-0.079
 disob-0.078
Tokenies
Token ID444
Feature activation+3.86e-01

Pos logit contributions
 disob+0.710
ily+0.700
 ma+0.668
 kids+0.662
 truth+0.634
Neg logit contributions
oc-0.868
elope-0.819
tails-0.813
tail-0.789
eds-0.713
Token.
Token ID13
Feature activation+1.78e-01

Pos logit contributions
 disob+0.022
asure+0.021
 kids+0.021
ily+0.020
ateful+0.020
Neg logit contributions
oc-0.031
elope-0.031
tail-0.029
tails-0.027
ns-0.026
Token It
Token ID632
Feature activation+7.26e-02

Pos logit contributions
 disob+0.014
ying+0.013
ateful+0.013
 penn+0.013
ily+0.012
Neg logit contributions
oc-0.013
elope-0.011
tail-0.010
Bet-0.009
 Sure-0.009
Token was
Token ID373
Feature activation-8.57e-01

Pos logit contributions
ying+0.005
 kids+0.005
 disob+0.005
ateful+0.004
asure+0.004
Neg logit contributions
elope-0.006
oc-0.005
tail-0.005
ns-0.005
tails-0.005
Token unusual
Token ID8468
Feature activation+7.61e-01

Pos logit contributions
elope+0.057
oc+0.054
tail+0.053
 den+0.051
 tal+0.050
Neg logit contributions
ying-0.058
 disob-0.053
 kids-0.052
ateful-0.049
 reasons-0.049
Token"
Token ID1
Feature activation+7.92e-01

Pos logit contributions
oc+0.018
elope+0.018
tail+0.016
Bet+0.015
ns+0.015
Neg logit contributions
 disob-0.013
 kids-0.012
ying-0.012
asure-0.011
ily-0.011
TokenWe
Token ID1135
Feature activation-1.72e+00

Pos logit contributions
ying+0.056
 disob+0.050
 kids+0.050
asure+0.049
 idea+0.049
Neg logit contributions
oc-0.055
elope-0.054
Bet-0.051
June-0.051
Patch-0.049
Token are
Token ID389
Feature activation+2.93e+00

Pos logit contributions
oc+0.140
elope+0.130
tail+0.121
ns+0.116
tails+0.114
Neg logit contributions
ying-0.108
ateful-0.097
rier-0.096
 ma-0.096
outing-0.091
Token looking
Token ID2045
Feature activation-9.93e-01

Pos logit contributions
 disob+0.170
 kids+0.162
ying+0.158
ily+0.158
asure+0.151
Neg logit contributions
elope-0.247
oc-0.241
tail-0.226
tails-0.207
 den-0.206
Token for
Token ID329
Feature activation-3.39e+00

Pos logit contributions
elope+0.086
oc+0.084
tail+0.078
tails+0.071
 den+0.071
Neg logit contributions
 disob-0.057
 kids-0.054
ily-0.052
ying-0.051
asure-0.050
Token penn
Token ID22429
Feature activation+1.12e+01

Pos logit contributions
elope+0.301
oc+0.300
tail+0.281
ns+0.260
 tal+0.258
Neg logit contributions
 disob-0.173
ying-0.172
ily-0.161
 kids-0.156
asure-0.154
Tokenies
Token ID444
Feature activation+3.86e-01

Pos logit contributions
 disob+0.681
ily+0.668
 truth+0.619
 kids+0.616
 ma+0.594
Neg logit contributions
oc-0.926
elope-0.890
tails-0.852
tail-0.819
 tal-0.781
Token,"
Token ID553
Feature activation+4.20e+00

Pos logit contributions
 disob+0.018
 kids+0.016
ily+0.016
ying+0.016
ateful+0.016
Neg logit contributions
elope-0.037
oc-0.037
tail-0.035
tails-0.032
ns-0.032
Token Tom
Token ID4186
Feature activation-2.80e-01

Pos logit contributions
ying+0.290
 disob+0.278
asure+0.268
ust+0.251
ateful+0.250
Neg logit contributions
oc-0.309
elope-0.308
tail-0.269
 Patch-0.263
 Louise-0.253
Token said
Token ID531
Feature activation+1.03e+00

Pos logit contributions
elope+0.024
oc+0.024
tail+0.021
 Louise+0.020
tails+0.020
Neg logit contributions
ying-0.016
ateful-0.015
 disob-0.015
 penn-0.014
asure-0.014
Token.
Token ID13
Feature activation+1.67e-01

Pos logit contributions
asure+0.066
 disob+0.065
ying+0.062
ateful+0.062
 kids+0.061
Neg logit contributions
elope-0.080
oc-0.076
tail-0.073
 Louise-0.066
 Patch-0.064
Token on
Token ID319
Feature activation-1.86e+00

Pos logit contributions
elope+0.296
oc+0.291
tail+0.271
 den+0.252
tails+0.252
Neg logit contributions
 disob-0.117
 kids-0.117
ily-0.111
ying-0.106
asure-0.102
Token it
Token ID340
Feature activation-1.26e+00

Pos logit contributions
elope+0.139
oc+0.131
tail+0.128
 Louise+0.112
tails+0.111
Neg logit contributions
 disob-0.130
ying-0.123
 kids-0.121
asure-0.117
ily-0.115
Token than
Token ID621
Feature activation-3.93e+00

Pos logit contributions
elope+0.121
oc+0.119
tail+0.112
tails+0.103
 den+0.103
Neg logit contributions
 disob-0.059
 kids-0.057
ily-0.054
ying-0.053
asure-0.051
Token the
Token ID262
Feature activation+2.56e-01

Pos logit contributions
elope+0.318
tail+0.290
oc+0.288
ns+0.270
 Louise+0.267
Neg logit contributions
 disob-0.230
 kids-0.212
ying-0.207
asure-0.206
ateful-0.196
Token other
Token ID584
Feature activation-1.58e+00

Pos logit contributions
ying+0.016
 disob+0.013
asure+0.013
ily+0.013
aund+0.012
Neg logit contributions
elope-0.020
oc-0.019
tail-0.019
 lifetime-0.018
 den-0.018
Token penn
Token ID22429
Feature activation+1.12e+01

Pos logit contributions
oc+0.117
tail+0.116
elope+0.116
 lifetime+0.099
 tal+0.098
Neg logit contributions
ying-0.101
 disob-0.095
asure-0.089
arl-0.087
ily-0.086
Tokenies
Token ID444
Feature activation+3.94e-01

Pos logit contributions
 disob+0.682
ily+0.642
 kids+0.619
asure+0.613
 ma+0.609
Neg logit contributions
oc-0.888
elope-0.884
tails-0.824
tail-0.821
ns-0.746
Token she
Token ID673
Feature activation-1.04e+00

Pos logit contributions
 disob+0.020
 kids+0.019
ily+0.018
ying+0.018
asure+0.018
Neg logit contributions
elope-0.036
oc-0.035
tail-0.033
tails-0.031
ns-0.030
Token had
Token ID550
Feature activation-1.02e+00

Pos logit contributions
elope+0.095
oc+0.093
tail+0.087
ns+0.080
tails+0.080
Neg logit contributions
 disob-0.053
 kids-0.052
ying-0.050
ily-0.049
asure-0.047
Token.
Token ID13
Feature activation+1.61e-01

Pos logit contributions
elope+0.075
oc+0.074
tail+0.068
ns+0.063
tails+0.062
Neg logit contributions
 disob-0.067
 kids-0.066
ying-0.065
ateful-0.061
ily-0.061
Token She
Token ID1375
Feature activation+3.43e-01

Pos logit contributions
 disob+0.012
ying+0.011
 kids+0.010
asure+0.010
ateful+0.010
Neg logit contributions
elope-0.012
oc-0.012
tail-0.011
 pasture-0.010
 Louise-0.009
Tokengy
Token ID1360
Feature activation-1.66e+00

Pos logit contributions
elope+0.026
oc+0.024
tail+0.023
 lifetime+0.021
 harmony+0.021
Neg logit contributions
 disob-0.035
 kids-0.035
asure-0.033
ily-0.032
 children-0.032
Token bank
Token ID3331
Feature activation+3.72e+00

Pos logit contributions
elope+0.129
oc+0.128
tail+0.118
tails+0.108
 den+0.107
Neg logit contributions
 disob-0.109
ying-0.106
ily-0.105
 kids-0.105
ateful-0.101
Token and
Token ID290
Feature activation+1.05e-01

Pos logit contributions
 disob+0.241
ateful+0.233
 kids+0.228
asure+0.222
ily+0.222
Neg logit contributions
oc-0.270
elope-0.251
tail-0.233
 den-0.220
 tal-0.218
Token all
Token ID477
Feature activation+4.61e+00

Pos logit contributions
 disob+0.006
ying+0.006
ily+0.006
 kids+0.006
asure+0.006
Neg logit contributions
elope-0.009
oc-0.009
tail-0.008
tails-0.007
 Louise-0.007
Token her
Token ID607
Feature activation-4.97e-01

Pos logit contributions
 disob+0.259
asure+0.257
 kids+0.234
ying+0.228
 truth+0.225
Neg logit contributions
elope-0.380
oc-0.373
tail-0.342
 Louise-0.326
 harmony-0.325
Token penn
Token ID22429
Feature activation+1.12e+01

Pos logit contributions
oc+0.048
elope+0.048
tail+0.044
ns+0.043
 tal+0.042
Neg logit contributions
ying-0.022
 disob-0.020
ily-0.020
asure-0.020
ateful-0.019
Tokenies
Token ID444
Feature activation+3.73e-01

Pos logit contributions
 disob+0.736
 truth+0.718
ily+0.716
 ma+0.708
 kids+0.675
Neg logit contributions
oc-0.795
tails-0.790
elope-0.785
eds-0.731
 tal-0.713
Token scattered
Token ID16830
Feature activation+2.44e+00

Pos logit contributions
ily+0.021
 disob+0.021
asure+0.021
 kids+0.021
ateful+0.021
Neg logit contributions
elope-0.029
oc-0.029
tail-0.027
tails-0.025
ns-0.025
Token on
Token ID319
Feature activation-1.87e+00

Pos logit contributions
 disob+0.170
asure+0.157
 kids+0.157
ying+0.156
ily+0.153
Neg logit contributions
elope-0.170
oc-0.170
tail-0.154
 tal-0.143
ns-0.142
Token the
Token ID262
Feature activation+2.57e-01

Pos logit contributions
elope+0.182
oc+0.179
tail+0.168
tails+0.155
 den+0.155
Neg logit contributions
 disob-0.086
 kids-0.083
ily-0.078
ying-0.076
asure-0.074
Token floor
Token ID4314
Feature activation+3.31e+00

Pos logit contributions
ateful+0.016
ying+0.016
asure+0.015
ily+0.015
 disob+0.014
Neg logit contributions
elope-0.020
oc-0.020
 den-0.019
ns-0.019
tail-0.018
Tokenies
Token ID444
Feature activation+3.79e-01

Pos logit contributions
 disob+0.739
ily+0.692
 truth+0.664
 ma+0.655
asure+0.654
Neg logit contributions
elope-0.830
oc-0.819
tails-0.801
tail-0.754
eds-0.730
Token.
Token ID13
Feature activation+1.60e-01

Pos logit contributions
 disob+0.018
 kids+0.018
ily+0.017
ying+0.016
asure+0.016
Neg logit contributions
elope-0.036
oc-0.035
tail-0.033
tails-0.031
 den-0.030
Token They
Token ID1119
Feature activation-1.20e-01

Pos logit contributions
 disob+0.011
 kids+0.010
ying+0.010
ily+0.010
asure+0.010
Neg logit contributions
elope-0.012
oc-0.012
tail-0.011
 den-0.010
tails-0.010
Token have
Token ID423
Feature activation-7.26e-01

Pos logit contributions
elope+0.010
oc+0.010
tail+0.010
tails+0.009
ns+0.009
Neg logit contributions
 disob-0.006
ying-0.006
 kids-0.006
ily-0.006
asure-0.006
Token two
Token ID734
Feature activation+5.56e+00

Pos logit contributions
elope+0.060
oc+0.058
tail+0.055
ns+0.051
tails+0.051
Neg logit contributions
 disob-0.042
ying-0.042
 kids-0.040
ily-0.038
asure-0.038
Token penn
Token ID22429
Feature activation+1.12e+01

Pos logit contributions
 disob+0.271
ying+0.259
asure+0.251
ily+0.247
 kids+0.236
Neg logit contributions
oc-0.517
elope-0.508
tail-0.477
ns-0.456
tails-0.445
Tokenies
Token ID444
Feature activation+3.78e-01

Pos logit contributions
 disob+0.683
ily+0.637
 truth+0.615
 kids+0.608
 ma+0.604
Neg logit contributions
oc-0.895
elope-0.880
tails-0.846
tail-0.807
ns-0.781
Token.
Token ID13
Feature activation+1.58e-01

Pos logit contributions
 disob+0.019
 kids+0.018
ily+0.017
ying+0.017
asure+0.016
Neg logit contributions
elope-0.035
oc-0.035
tail-0.032
tails-0.030
 den-0.030
Token They
Token ID1119
Feature activation-1.24e-01

Pos logit contributions
 disob+0.010
 kids+0.010
ily+0.009
ying+0.009
asure+0.009
Neg logit contributions
elope-0.013
oc-0.013
tail-0.012
tails-0.011
 den-0.011
Token are
Token ID389
Feature activation+2.92e+00

Pos logit contributions
elope+0.011
oc+0.011
tail+0.010
ns+0.009
tails+0.009
Neg logit contributions
ying-0.007
 disob-0.007
 kids-0.006
ily-0.006
asure-0.006
Token happy
Token ID3772
Feature activation-1.89e-01

Pos logit contributions
 disob+0.155
 kids+0.146
ying+0.141
ily+0.139
asure+0.135
Neg logit contributions
elope-0.262
oc-0.254
tail-0.241
tails-0.223
 den-0.221
Token,
Token ID11
Feature activation+3.76e-01

Pos logit contributions
 kids+0.079
 disob+0.074
ily+0.071
 right+0.069
 children+0.067
Neg logit contributions
oc-0.200
elope-0.198
tail-0.185
ns-0.176
 tal-0.176
Token hoping
Token ID7725
Feature activation-2.81e+00

Pos logit contributions
 kids+0.021
 disob+0.021
 right+0.020
 en+0.019
 children+0.019
Neg logit contributions
oc-0.033
elope-0.032
tail-0.030
tails-0.028
 tal-0.028
Token to
Token ID284
Feature activation+4.30e-01

Pos logit contributions
elope+0.267
oc+0.266
tail+0.246
tails+0.227
 den+0.226
Neg logit contributions
 disob-0.138
 kids-0.136
ily-0.127
ying-0.124
 children-0.118
Token find
Token ID1064
Feature activation+2.86e-02

Pos logit contributions
ying+0.026
 disob+0.024
ily+0.024
 kids+0.024
ateful+0.023
Neg logit contributions
elope-0.034
oc-0.033
tail-0.033
 den-0.029
 herd-0.029
Token more
Token ID517
Feature activation-3.97e+00

Pos logit contributions
 disob+0.001
 kids+0.001
ily+0.001
ying+0.001
asure+0.001
Neg logit contributions
elope-0.003
oc-0.003
tail-0.002
tails-0.002
 den-0.002
Token penn
Token ID22429
Feature activation+1.12e+01

Pos logit contributions
oc+0.333
elope+0.315
tail+0.298
 harmony+0.284
 tal+0.282
Neg logit contributions
 disob-0.224
asure-0.213
ily-0.209
ying-0.203
ailed-0.201
Tokenies
Token ID444
Feature activation+3.80e-01

Pos logit contributions
 disob+0.630
ily+0.585
 kids+0.580
asure+0.541
 ma+0.538
Neg logit contributions
oc-0.971
elope-0.953
tail-0.870
tails-0.869
ns-0.827
Token.
Token ID13
Feature activation+1.56e-01

Pos logit contributions
 disob+0.019
 kids+0.019
ily+0.018
ying+0.017
asure+0.017
Neg logit contributions
elope-0.035
oc-0.034
tail-0.032
tails-0.030
 den-0.030
Token She
Token ID1375
Feature activation+3.27e-01

Pos logit contributions
ily+0.008
 kids+0.008
 disob+0.007
 truth+0.007
ying+0.007
Neg logit contributions
oc-0.015
elope-0.014
tail-0.013
tails-0.013
ns-0.013
Token had
Token ID550
Feature activation-1.05e+00

Pos logit contributions
 disob+0.017
 kids+0.016
ily+0.015
ying+0.015
asure+0.015
Neg logit contributions
elope-0.030
oc-0.029
tail-0.028
tails-0.025
 den-0.025
Token noticed
Token ID6810
Feature activation-2.04e+00

Pos logit contributions
elope+0.099
oc+0.098
tail+0.091
tails+0.083
 den+0.083
Neg logit contributions
 disob-0.053
 kids-0.051
ily-0.048
ying-0.047
asure-0.046
TokenBen
Token ID11696
Feature activation+1.02e+00

Pos logit contributions
elope+0.022
oc+0.021
tail+0.020
 den+0.018
tails+0.018
Neg logit contributions
 disob-0.011
 kids-0.011
ily-0.011
ying-0.010
asure-0.010
Token and
Token ID290
Feature activation+1.06e-01

Pos logit contributions
 disob+0.052
 kids+0.050
ily+0.048
ying+0.048
asure+0.046
Neg logit contributions
elope-0.093
oc-0.091
tail-0.085
 den-0.078
tails-0.078
Token Lily
Token ID20037
Feature activation+1.58e+00

Pos logit contributions
 disob+0.005
asure+0.005
ying+0.005
ateful+0.004
 kids+0.004
Neg logit contributions
elope-0.010
oc-0.010
tail-0.009
 Luna-0.009
 Louise-0.009
Token count
Token ID954
Feature activation-2.92e+00

Pos logit contributions
ying+0.091
ily+0.087
ateful+0.084
 disob+0.081
 kids+0.080
Neg logit contributions
elope-0.133
tail-0.128
oc-0.128
tails-0.116
ns-0.112
Token their
Token ID511
Feature activation-1.17e+00

Pos logit contributions
elope+0.236
oc+0.226
tail+0.223
tails+0.206
 pasture+0.202
Neg logit contributions
 disob-0.173
 kids-0.155
ily-0.154
asure-0.152
ying-0.150
Token penn
Token ID22429
Feature activation+1.12e+01

Pos logit contributions
elope+0.099
oc+0.096
tail+0.093
 den+0.090
tails+0.088
Neg logit contributions
ying-0.062
ily-0.061
 disob-0.060
asure-0.057
ateful-0.053
Tokenies
Token ID444
Feature activation+3.79e-01

Pos logit contributions
 disob+0.739
ily+0.692
 truth+0.664
 ma+0.655
asure+0.654
Neg logit contributions
elope-0.830
oc-0.819
tails-0.801
tail-0.754
eds-0.730
Token.
Token ID13
Feature activation+1.60e-01

Pos logit contributions
 disob+0.018
 kids+0.018
ily+0.017
ying+0.016
asure+0.016
Neg logit contributions
elope-0.036
oc-0.035
tail-0.033
tails-0.031
 den-0.030
Token They
Token ID1119
Feature activation-1.20e-01

Pos logit contributions
 disob+0.011
 kids+0.010
ying+0.010
ily+0.010
asure+0.010
Neg logit contributions
elope-0.012
oc-0.012
tail-0.011
 den-0.010
tails-0.010
Token have
Token ID423
Feature activation-7.26e-01

Pos logit contributions
elope+0.010
oc+0.010
tail+0.010
tails+0.009
ns+0.009
Neg logit contributions
 disob-0.006
ying-0.006
 kids-0.006
ily-0.006
asure-0.006
Token two
Token ID734
Feature activation+5.56e+00

Pos logit contributions
elope+0.060
oc+0.058
tail+0.055
ns+0.051
tails+0.051
Neg logit contributions
 disob-0.042
ying-0.042
 kids-0.040
ily-0.038
asure-0.038
Token help
Token ID1037
Feature activation-3.54e-03

Pos logit contributions
elope+0.189
oc+0.186
tail+0.185
tails+0.160
 den+0.159
Neg logit contributions
 kids-0.126
ying-0.124
ateful-0.122
 disob-0.116
ily-0.116
Token reun
Token ID15398
Feature activation+1.92e+00

Pos logit contributions
elope+0.000
oc+0.000
tail+0.000
tails+0.000
 den+0.000
Neg logit contributions
 disob-0.000
 kids-0.000
ily-0.000
ying-0.000
asure-0.000
Tokenite
Token ID578
Feature activation+1.07e+00

Pos logit contributions
 disob+0.134
 kids+0.131
ily+0.126
ying+0.125
asure+0.122
Neg logit contributions
elope-0.141
oc-0.139
tail-0.128
tails-0.114
 den-0.113
Token the
Token ID262
Feature activation+2.49e-01

Pos logit contributions
 disob+0.052
 kids+0.050
ying+0.046
ily+0.046
 children+0.043
Neg logit contributions
elope-0.102
oc-0.099
tail-0.092
 den-0.086
icol-0.086
Token two
Token ID734
Feature activation+5.57e+00

Pos logit contributions
ying+0.013
ily+0.012
asure+0.012
aund+0.011
ailed+0.011
Neg logit contributions
elope-0.022
oc-0.021
tail-0.020
 den-0.020
 lifetime-0.020
Token penn
Token ID22429
Feature activation+1.12e+01

Pos logit contributions
asure+0.252
ily+0.220
 disob+0.211
ying+0.209
ateful+0.205
Neg logit contributions
oc-0.510
elope-0.489
tail-0.488
 lifetime-0.471
 newfound-0.466
Tokenies
Token ID444
Feature activation+3.87e-01

Pos logit contributions
 disob+0.662
ily+0.634
 truth+0.612
 kids+0.611
asure+0.601
Neg logit contributions
oc-0.903
elope-0.897
tails-0.843
tail-0.820
 lifetime-0.765
Token.
Token ID13
Feature activation+1.67e-01

Pos logit contributions
 disob+0.019
 kids+0.019
ily+0.018
ying+0.017
asure+0.017
Neg logit contributions
elope-0.036
oc-0.035
tail-0.033
tails-0.030
 den-0.030
Token She
Token ID1375
Feature activation+3.32e-01

Pos logit contributions
 kids+0.009
ily+0.009
 disob+0.009
ying+0.008
asure+0.008
Neg logit contributions
elope-0.014
oc-0.014
tail-0.013
ns-0.013
tails-0.012
Token would
Token ID561
Feature activation-2.84e+00

Pos logit contributions
 disob+0.017
 kids+0.016
ily+0.015
asure+0.014
 children+0.014
Neg logit contributions
elope-0.031
oc-0.030
tail-0.029
tails-0.027
 den-0.027
Token never
Token ID1239
Feature activation-1.00e+00

Pos logit contributions
elope+0.258
oc+0.253
tail+0.241
tails+0.220
 den+0.218
Neg logit contributions
 kids-0.144
 disob-0.143
ily-0.141
ying-0.141
ateful-0.132
Token it
Token ID340
Feature activation-1.28e+00

Pos logit contributions
oc+0.143
elope+0.141
tail+0.136
tails+0.123
 tal+0.119
Neg logit contributions
 disob-0.117
ying-0.113
ily-0.110
 kids-0.109
asure-0.107
Token shiny
Token ID22441
Feature activation+1.72e+00

Pos logit contributions
tail+0.081
oc+0.080
elope+0.073
 den+0.071
 harmony+0.067
Neg logit contributions
 disob-0.089
ily-0.087
ying-0.087
ailed-0.086
ateful-0.085
Token like
Token ID588
Feature activation-1.06e+00

Pos logit contributions
 disob+0.096
asure+0.083
 kids+0.082
ailed+0.082
ily+0.081
Neg logit contributions
oc-0.152
tail-0.142
elope-0.136
 den-0.125
 tal-0.121
Token my
Token ID616
Feature activation-9.63e-01

Pos logit contributions
elope+0.083
oc+0.083
tail+0.082
 den+0.073
 pasture+0.073
Neg logit contributions
 disob-0.061
ying-0.060
 kids-0.056
ateful-0.055
ily-0.055
Token other
Token ID584
Feature activation-1.59e+00

Pos logit contributions
oc+0.084
tail+0.081
elope+0.081
 tal+0.076
 lifetime+0.076
Neg logit contributions
ying-0.053
ily-0.050
 disob-0.046
asure-0.043
ailed-0.042
Token penn
Token ID22429
Feature activation+1.12e+01

Pos logit contributions
oc+0.133
tail+0.126
 tal+0.115
elope+0.113
 herd+0.110
Neg logit contributions
ily-0.094
ying-0.093
ailed-0.082
arl-0.080
asure-0.079
Tokenies
Token ID444
Feature activation+3.66e-01

Pos logit contributions
ily+0.759
 disob+0.731
 ma+0.687
 kids+0.685
 truth+0.660
Neg logit contributions
oc-0.795
elope-0.737
tails-0.735
tail-0.731
eds-0.702
Token?"
Token ID1701
Feature activation+2.66e+00

Pos logit contributions
ily+0.018
ailed+0.017
ateful+0.017
ying+0.017
 reasons+0.016
Neg logit contributions
tail-0.030
oc-0.029
elope-0.028
 den-0.026
tails-0.026
Token Her
Token ID2332
Feature activation+2.94e-01

Pos logit contributions
 disob+0.141
 kids+0.136
ily+0.129
ying+0.127
asure+0.124
Neg logit contributions
elope-0.242
oc-0.238
tail-0.223
tails-0.204
 den-0.204
Token mom
Token ID1995
Feature activation-9.61e-01

Pos logit contributions
 disob+0.015
ying+0.014
asure+0.013
ily+0.013
 kids+0.013
Neg logit contributions
elope-0.027
oc-0.025
tail-0.024
 den-0.023
tails-0.023
Token said
Token ID531
Feature activation+1.03e+00

Pos logit contributions
elope+0.079
oc+0.078
tail+0.070
tails+0.065
June+0.065
Neg logit contributions
ying-0.056
 kids-0.055
 disob-0.055
 penn-0.055
ateful-0.054
Token find
Token ID1064
Feature activation+2.30e-02

Pos logit contributions
 kids+0.023
 disob+0.023
ying+0.022
ateful+0.021
ily+0.021
Neg logit contributions
oc-0.038
elope-0.037
tail-0.035
ns-0.031
 den-0.031
Token some
Token ID617
Feature activation+7.14e-01

Pos logit contributions
 disob+0.001
 kids+0.001
ily+0.001
ying+0.001
asure+0.001
Neg logit contributions
elope-0.002
oc-0.002
tail-0.002
 den-0.002
tails-0.002
Token more
Token ID517
Feature activation-3.98e+00

Pos logit contributions
 disob+0.038
ying+0.036
asure+0.035
ateful+0.034
ily+0.034
Neg logit contributions
oc-0.061
elope-0.058
tail-0.057
 tal-0.053
 den-0.051
Token of
Token ID286
Feature activation-7.30e-01

Pos logit contributions
oc+0.345
elope+0.338
tail+0.323
 tal+0.297
tails+0.291
Neg logit contributions
 disob-0.214
asure-0.195
 kids-0.192
ily-0.191
ying-0.191
Token these
Token ID777
Feature activation+2.30e+00

Pos logit contributions
elope+0.057
oc+0.057
tail+0.055
 harmony+0.050
 tal+0.050
Neg logit contributions
 disob-0.041
ying-0.041
ily-0.039
asure-0.039
 kids-0.038
Token penn
Token ID22429
Feature activation+1.12e+01

Pos logit contributions
 disob+0.124
asure+0.118
ily+0.115
ying+0.115
ailed+0.114
Neg logit contributions
oc-0.186
tail-0.175
elope-0.172
 tal-0.162
ns-0.160
Tokenies
Token ID444
Feature activation+3.76e-01

Pos logit contributions
 disob+0.651
ily+0.610
 kids+0.591
asure+0.566
 truth+0.558
Neg logit contributions
oc-0.929
elope-0.922
tail-0.859
tails-0.847
 lifetime-0.790
Token,
Token ID11
Feature activation+3.65e-01

Pos logit contributions
 disob+0.021
 kids+0.020
ily+0.019
ateful+0.019
asure+0.019
Neg logit contributions
elope-0.032
oc-0.031
tail-0.029
 Louise-0.026
 den-0.026
Token I
Token ID314
Feature activation+4.26e-01

Pos logit contributions
 disob+0.019
ying+0.018
 kids+0.017
asure+0.017
ily+0.017
Neg logit contributions
elope-0.032
oc-0.031
tail-0.030
 Louise-0.028
tails-0.027
Token can
Token ID460
Feature activation+3.29e-01

Pos logit contributions
 kids+0.025
 disob+0.025
ying+0.024
ily+0.023
asure+0.023
Neg logit contributions
oc-0.036
elope-0.035
tail-0.033
 den-0.029
tails-0.029
Token be
Token ID307
Feature activation+1.65e+00

Pos logit contributions
ying+0.019
 kids+0.018
ateful+0.018
 disob+0.018
ily+0.017
Neg logit contributions
oc-0.027
tail-0.027
elope-0.027
tails-0.023
 den-0.022
Token't
Token ID470
Feature activation+9.88e-01

Pos logit contributions
 kids+0.157
 disob+0.154
ying+0.149
asure+0.141
ily+0.141
Neg logit contributions
elope-0.227
oc-0.223
tail-0.215
tails-0.192
June-0.191
Token shine
Token ID18340
Feature activation+1.86e+00

Pos logit contributions
ying+0.064
outed+0.060
ateful+0.057
ailed+0.056
 reasons+0.055
Neg logit contributions
tail-0.072
oc-0.069
elope-0.063
tails-0.059
 harmony-0.058
Token like
Token ID588
Feature activation-1.07e+00

Pos logit contributions
 disob+0.102
 kids+0.097
ily+0.092
asure+0.092
ying+0.089
Neg logit contributions
elope-0.162
oc-0.160
tail-0.152
tails-0.139
 den-0.135
Token her
Token ID607
Feature activation-5.21e-01

Pos logit contributions
elope+0.093
oc+0.089
tail+0.087
tails+0.079
 den+0.078
Neg logit contributions
 disob-0.057
ying-0.054
 kids-0.053
ily-0.051
asure-0.051
Token other
Token ID584
Feature activation-1.60e+00

Pos logit contributions
elope+0.043
oc+0.041
tail+0.041
 tal+0.038
tails+0.037
Neg logit contributions
ying-0.030
ily-0.030
 disob-0.028
asure-0.026
 kids-0.026
Token penn
Token ID22429
Feature activation+1.12e+01

Pos logit contributions
oc+0.134
tail+0.131
elope+0.125
 tal+0.118
 lifetime+0.117
Neg logit contributions
ying-0.091
asure-0.085
ily-0.085
arl-0.079
 disob-0.077
Tokenies
Token ID444
Feature activation+3.63e-01

Pos logit contributions
 disob+0.698
ily+0.686
 kids+0.664
 ma+0.658
asure+0.631
Neg logit contributions
elope-0.860
oc-0.851
tails-0.830
tail-0.802
 lifetime-0.723
Token.
Token ID13
Feature activation+1.53e-01

Pos logit contributions
ily+0.020
 kids+0.020
 disob+0.020
asure+0.020
ateful+0.020
Neg logit contributions
elope-0.029
tail-0.028
oc-0.028
tails-0.025
ns-0.024
Token\n
Token ID198
Feature activation-2.28e-01

Pos logit contributions
 disob+0.010
 kids+0.010
ying+0.010
ily+0.010
asure+0.009
Neg logit contributions
elope-0.012
oc-0.011
tail-0.010
tails-0.009
 den-0.009
Token\n
Token ID198
Feature activation-2.30e-01

Pos logit contributions
elope+0.020
oc+0.020
tail+0.019
tails+0.018
 den+0.018
Neg logit contributions
 kids-0.012
 disob-0.012
ily-0.012
ying-0.011
 truth-0.011
TokenL
Token ID43
Feature activation+1.67e+00

Pos logit contributions
elope+0.021
oc+0.021
tail+0.019
tails+0.018
Bet+0.017
Neg logit contributions
 disob-0.012
 kids-0.012
ying-0.011
ily-0.011
asure-0.011
Token did
Token ID750
Feature activation+3.29e+00

Pos logit contributions
elope+0.075
oc+0.071
tail+0.066
ns+0.063
Bet+0.062
Neg logit contributions
ying-0.044
 kids-0.043
 disob-0.042
asure-0.042
ateful-0.041
Token not
Token ID407
Feature activation+1.46e+00

Pos logit contributions
 disob+0.106
 kids+0.101
ily+0.091
ying+0.088
asure+0.084
Neg logit contributions
elope-0.367
oc-0.363
tail-0.342
 den-0.320
tails-0.319
Token find
Token ID1064
Feature activation+1.46e-02

Pos logit contributions
ying+0.118
asure+0.118
outed+0.118
ateful+0.110
 penn+0.110
Neg logit contributions
elope-0.083
tail-0.078
oc-0.072
ns-0.064
 Louise-0.064
Token any
Token ID597
Feature activation+8.44e-01

Pos logit contributions
 kids+0.001
 disob+0.001
ily+0.001
ying+0.001
 children+0.001
Neg logit contributions
elope-0.001
oc-0.001
tail-0.001
 den-0.001
tails-0.001
Token more
Token ID517
Feature activation-3.98e+00

Pos logit contributions
 disob+0.044
asure+0.043
ying+0.043
ily+0.041
ateful+0.039
Neg logit contributions
oc-0.072
elope-0.070
tail-0.067
 den-0.062
ns-0.061
Token penn
Token ID22429
Feature activation+1.12e+01

Pos logit contributions
oc+0.353
elope+0.337
tail+0.331
ns+0.305
 harmony+0.302
Neg logit contributions
asure-0.191
 disob-0.191
ying-0.178
ily-0.173
ailed-0.172
Tokenies
Token ID444
Feature activation+3.71e-01

Pos logit contributions
 disob+0.731
ily+0.699
 truth+0.668
 ma+0.666
asure+0.650
Neg logit contributions
oc-0.826
elope-0.788
tails-0.766
tail-0.726
ns-0.719
Token.
Token ID13
Feature activation+1.52e-01

Pos logit contributions
 disob+0.019
 kids+0.018
ily+0.017
asure+0.017
ying+0.017
Neg logit contributions
elope-0.034
oc-0.033
tail-0.031
ns-0.028
tails-0.028
Token\n
Token ID198
Feature activation-2.30e-01

Pos logit contributions
 kids+0.009
 disob+0.009
ily+0.009
ying+0.008
asure+0.008
Neg logit contributions
elope-0.012
oc-0.012
tail-0.011
tails-0.010
 den-0.010
Token\n
Token ID198
Feature activation-2.32e-01

Pos logit contributions
tail+0.021
elope+0.021
oc+0.021
tails+0.019
ns+0.019
Neg logit contributions
ily-0.011
 kids-0.011
 disob-0.010
 truth-0.009
 children-0.009
TokenThey
Token ID2990
Feature activation-7.67e-01

Pos logit contributions
elope+0.021
oc+0.020
tail+0.019
 den+0.018
tails+0.018
Neg logit contributions
 disob-0.012
 kids-0.012
ily-0.012
asure-0.011
ying-0.011
Token back
Token ID736
Feature activation-2.81e-01

Pos logit contributions
 kids+0.063
 disob+0.061
ily+0.057
ying+0.054
 right+0.053
Neg logit contributions
elope-0.145
oc-0.144
tail-0.130
 tal-0.124
ns-0.123
Token to
Token ID284
Feature activation+4.26e-01

Pos logit contributions
elope+0.028
oc+0.027
tail+0.026
ns+0.024
 lifetime+0.023
Neg logit contributions
 disob-0.013
 kids-0.013
ily-0.012
ying-0.011
 children-0.011
Token playing
Token ID2712
Feature activation-1.94e+00

Pos logit contributions
 kids+0.022
 disob+0.020
 children+0.018
ily+0.017
 right+0.017
Neg logit contributions
oc-0.043
elope-0.041
tail-0.038
ns-0.037
 lifetime-0.036
Token with
Token ID351
Feature activation-1.47e-01

Pos logit contributions
oc+0.138
tail+0.125
elope+0.125
 harmony+0.110
 pasture+0.106
Neg logit contributions
 disob-0.142
asure-0.136
ying-0.134
ust-0.127
ateful-0.127
Token the
Token ID262
Feature activation+2.34e-01

Pos logit contributions
elope+0.012
oc+0.012
tail+0.011
tails+0.010
ns+0.010
Neg logit contributions
 disob-0.009
ying-0.008
 kids-0.008
asure-0.008
ily-0.008
Token penn
Token ID22429
Feature activation+1.12e+01

Pos logit contributions
ying+0.015
asure+0.015
aund+0.013
ily+0.013
ailed+0.013
Neg logit contributions
 newfound-0.015
 lifetime-0.014
 den-0.014
oc-0.014
 sle-0.014
Tokenies
Token ID444
Feature activation+3.64e-01

Pos logit contributions
 disob+0.779
ily+0.753
asure+0.750
 ma+0.726
 truth+0.715
Neg logit contributions
oc-0.800
elope-0.757
tails-0.731
eds-0.691
tail-0.687
Token.
Token ID13
Feature activation+1.46e-01

Pos logit contributions
 disob+0.018
 kids+0.017
ily+0.016
ying+0.016
asure+0.016
Neg logit contributions
elope-0.034
oc-0.033
tail-0.031
tails-0.029
 den-0.029
Token They
Token ID1119
Feature activation-1.31e-01

Pos logit contributions
 kids+0.007
ily+0.006
 truth+0.006
 children+0.006
 right+0.006
Neg logit contributions
oc-0.014
elope-0.014
tail-0.014
tails-0.013
ns-0.013
Token forgot
Token ID16453
Feature activation-1.17e+00

Pos logit contributions
elope+0.010
oc+0.010
tail+0.009
ns+0.009
tails+0.009
Neg logit contributions
ying-0.008
 disob-0.008
 kids-0.008
asure-0.008
ily-0.007
Token about
Token ID546
Feature activation+2.28e+00

Pos logit contributions
elope+0.094
oc+0.090
tail+0.083
 den+0.082
 Louise+0.082
Neg logit contributions
 disob-0.067
asure-0.061
ily-0.060
ateful-0.059
 kids-0.058
Token buy
Token ID2822
Feature activation+5.07e+00

Pos logit contributions
ying+0.062
ateful+0.058
 kids+0.057
outed+0.056
asure+0.055
Neg logit contributions
elope-0.080
tail-0.078
oc-0.078
ns-0.065
 den-0.063
Token any
Token ID597
Feature activation+8.41e-01

Pos logit contributions
 disob+0.281
ying+0.266
ily+0.261
asure+0.255
ailed+0.246
Neg logit contributions
oc-0.417
elope-0.415
tail-0.388
ns-0.366
tails-0.357
Token candy
Token ID18550
Feature activation+2.53e+00

Pos logit contributions
 disob+0.039
asure+0.037
ying+0.036
ily+0.036
ateful+0.034
Neg logit contributions
oc-0.081
elope-0.076
tail-0.073
 den-0.068
 tal-0.068
Token with
Token ID351
Feature activation-1.56e-01

Pos logit contributions
 disob+0.152
 kids+0.144
ily+0.142
ateful+0.140
asure+0.136
Neg logit contributions
oc-0.206
elope-0.197
tail-0.189
ns-0.170
 den-0.166
Token his
Token ID465
Feature activation+2.95e-01

Pos logit contributions
elope+0.014
oc+0.014
tail+0.013
tails+0.012
ns+0.012
Neg logit contributions
 disob-0.008
 kids-0.007
ying-0.007
ily-0.007
asure-0.007
Token penn
Token ID22429
Feature activation+1.12e+01

Pos logit contributions
 disob+0.015
ying+0.014
ily+0.014
 kids+0.014
asure+0.013
Neg logit contributions
elope-0.027
oc-0.027
tail-0.025
 den-0.023
tails-0.023
Tokenies
Token ID444
Feature activation+3.66e-01

Pos logit contributions
 disob+0.678
ily+0.672
 kids+0.631
 truth+0.603
 ma+0.602
Neg logit contributions
oc-0.917
elope-0.888
tails-0.828
tail-0.808
 tal-0.771
Token,
Token ID11
Feature activation+3.60e-01

Pos logit contributions
 disob+0.020
 kids+0.019
ily+0.019
ying+0.018
asure+0.018
Neg logit contributions
elope-0.032
oc-0.031
tail-0.029
 den-0.027
tails-0.027
Token no
Token ID645
Feature activation-1.35e+00

Pos logit contributions
 disob+0.017
 kids+0.017
ily+0.016
ying+0.016
asure+0.015
Neg logit contributions
elope-0.034
oc-0.033
tail-0.031
 den-0.029
tails-0.029
Token matter
Token ID2300
Feature activation-1.03e+00

Pos logit contributions
oc+0.115
elope+0.114
tail+0.107
 den+0.098
 lifetime+0.097
Neg logit contributions
 disob-0.075
ying-0.073
ily-0.072
asure-0.070
 kids-0.067
Token how
Token ID703
Feature activation+4.53e+00

Pos logit contributions
elope+0.091
oc+0.090
tail+0.081
tails+0.076
 den+0.076
Neg logit contributions
 disob-0.055
ily-0.049
 kids-0.048
asure-0.047
ying-0.046
Token choices
Token ID7747
Feature activation+1.79e+00

Pos logit contributions
elope+0.171
oc+0.171
tail+0.157
tails+0.147
 den+0.146
Neg logit contributions
 disob-0.098
ily-0.089
asure-0.089
ying-0.087
 kids-0.086
Token.
Token ID13
Feature activation+1.30e-01

Pos logit contributions
 disob+0.096
 kids+0.089
ily+0.087
ying+0.084
asure+0.084
Neg logit contributions
elope-0.159
oc-0.158
tail-0.149
 den-0.136
tails-0.135
Token He
Token ID679
Feature activation+7.31e-01

Pos logit contributions
 disob+0.008
 kids+0.008
ying+0.007
ily+0.007
asure+0.007
Neg logit contributions
elope-0.011
oc-0.011
tail-0.010
 den-0.009
tails-0.009
Token had
Token ID550
Feature activation-1.07e+00

Pos logit contributions
 disob+0.041
 kids+0.040
ily+0.038
ying+0.037
asure+0.036
Neg logit contributions
elope-0.064
oc-0.063
tail-0.059
tails-0.054
 den-0.053
Token two
Token ID734
Feature activation+5.55e+00

Pos logit contributions
elope+0.095
oc+0.092
tail+0.087
ns+0.081
tails+0.081
Neg logit contributions
 disob-0.056
ying-0.055
 kids-0.053
ily-0.051
asure-0.051
Token penn
Token ID22429
Feature activation+1.11e+01

Pos logit contributions
 disob+0.278
 kids+0.277
ily+0.254
ying+0.248
 children+0.240
Neg logit contributions
elope-0.517
oc-0.506
tail-0.478
 den-0.435
tails-0.435
Tokenies
Token ID444
Feature activation+3.47e-01

Pos logit contributions
 disob+0.648
ily+0.625
 truth+0.590
 kids+0.589
 ma+0.582
Neg logit contributions
oc-0.933
elope-0.910
tails-0.868
tail-0.828
ns-0.800
Tokenâ
Token ID22940
Feature activation+4.31e+00

Pos logit contributions
 disob+0.018
 kids+0.017
ily+0.016
ying+0.016
asure+0.015
Neg logit contributions
elope-0.032
oc-0.031
tail-0.029
tails-0.027
 den-0.027
Token
Token ID26391
Feature activation+2.17e+00

Pos logit contributions
 kids+0.280
 disob+0.280
ying+0.264
ateful+0.261
 penn+0.258
Neg logit contributions
elope-0.342
oc-0.326
tail-0.315
ns-0.276
 pasture-0.267
Token
Token ID447
Feature activation+7.23e-02

Pos logit contributions
 kids+0.159
 disob+0.148
ily+0.140
 children+0.138
asure+0.135
Neg logit contributions
elope-0.142
oc-0.135
tail-0.126
 den-0.125
 clearing-0.121
Token
Token ID251
Feature activation-1.71e+00

Pos logit contributions
 disob+0.006
 kids+0.006
ily+0.006
ying+0.006
asure+0.006
Neg logit contributions
elope-0.004
oc-0.004
tail-0.004
 den-0.003
tails-0.003
Token can
Token ID460
Feature activation+3.04e-01

Pos logit contributions
oc+0.005
elope+0.005
tail+0.005
ns+0.004
tails+0.004
Neg logit contributions
ying-0.004
ateful-0.004
ily-0.004
 kids-0.004
 penn-0.004
Token still
Token ID991
Feature activation+5.97e-01

Pos logit contributions
ying+0.017
 kids+0.017
 disob+0.016
ily+0.016
ateful+0.016
Neg logit contributions
elope-0.026
oc-0.026
tail-0.025
tails-0.023
 den-0.022
Token play
Token ID711
Feature activation-1.89e+00

Pos logit contributions
ying+0.037
 disob+0.034
 kids+0.034
ily+0.034
ateful+0.033
Neg logit contributions
oc-0.049
elope-0.049
tail-0.047
tails-0.041
ns-0.040
Token with
Token ID351
Feature activation-1.72e-01

Pos logit contributions
oc+0.168
elope+0.167
tail+0.157
 den+0.140
ns+0.140
Neg logit contributions
 disob-0.104
 kids-0.097
ying-0.095
asure-0.092
ily-0.092
Token my
Token ID616
Feature activation-9.95e-01

Pos logit contributions
elope+0.017
oc+0.016
tail+0.015
tails+0.014
 den+0.014
Neg logit contributions
 disob-0.008
 kids-0.008
ily-0.008
ying-0.007
asure-0.007
Token penn
Token ID22429
Feature activation+1.11e+01

Pos logit contributions
oc+0.084
elope+0.076
tail+0.075
 tal+0.074
 den+0.074
Neg logit contributions
ying-0.055
ily-0.053
 disob-0.052
asure-0.051
ailed-0.047
Tokenies
Token ID444
Feature activation+3.53e-01

Pos logit contributions
 disob+0.699
ily+0.664
asure+0.626
 ma+0.624
 truth+0.618
Neg logit contributions
oc-0.897
elope-0.864
tails-0.817
tail-0.810
eds-0.761
Token.
Token ID13
Feature activation+1.33e-01

Pos logit contributions
 disob+0.020
ateful+0.018
ying+0.018
asure+0.018
ailed+0.018
Neg logit contributions
oc-0.030
elope-0.030
tail-0.028
ns-0.025
 den-0.025
Token I
Token ID314
Feature activation+4.14e-01

Pos logit contributions
ily+0.008
 kids+0.008
 disob+0.007
asure+0.007
 right+0.007
Neg logit contributions
elope-0.011
oc-0.011
tail-0.010
tails-0.010
 lifetime-0.010
Token have
Token ID423
Feature activation-7.57e-01

Pos logit contributions
 disob+0.024
 kids+0.024
ying+0.022
ily+0.022
asure+0.022
Neg logit contributions
elope-0.035
oc-0.035
tail-0.032
tails-0.029
 den-0.029
Token many
Token ID867
Feature activation+7.58e-01

Pos logit contributions
elope+0.067
oc+0.065
tail+0.062
ns+0.057
tails+0.057
Neg logit contributions
 disob-0.040
ying-0.039
 kids-0.038
ily-0.037
asure-0.036
Token She
Token ID1375
Feature activation+3.09e-01

Pos logit contributions
 disob+0.007
 kids+0.007
ily+0.007
ying+0.006
asure+0.006
Neg logit contributions
elope-0.014
oc-0.014
tail-0.013
tails-0.012
ns-0.012
Token loved
Token ID6151
Feature activation-3.31e-01

Pos logit contributions
 disob+0.017
 kids+0.017
ily+0.016
ying+0.015
asure+0.015
Neg logit contributions
elope-0.028
oc-0.027
tail-0.025
 den-0.023
tails-0.023
Token to
Token ID284
Feature activation+4.19e-01

Pos logit contributions
elope+0.030
oc+0.029
tail+0.027
tails+0.025
 den+0.025
Neg logit contributions
 disob-0.018
 kids-0.017
ily-0.016
ying-0.016
asure-0.016
Token collect
Token ID2824
Feature activation+3.50e+00

Pos logit contributions
 disob+0.027
ying+0.027
 kids+0.026
ily+0.026
ateful+0.025
Neg logit contributions
elope-0.032
oc-0.032
tail-0.031
tails-0.027
 pasture-0.027
Token shiny
Token ID22441
Feature activation+1.71e+00

Pos logit contributions
 disob+0.212
asure+0.194
ying+0.183
ateful+0.180
ily+0.176
Neg logit contributions
oc-0.292
elope-0.261
tail-0.259
tails-0.243
ns-0.242
Token penn
Token ID22429
Feature activation+1.11e+01

Pos logit contributions
 disob+0.104
asure+0.090
ily+0.088
ying+0.088
 ma+0.084
Neg logit contributions
oc-0.148
elope-0.130
tail-0.130
 den-0.121
tails-0.121
Tokenies
Token ID444
Feature activation+3.58e-01

Pos logit contributions
 disob+0.673
ily+0.620
 kids+0.610
 ma+0.598
 truth+0.570
Neg logit contributions
oc-0.923
elope-0.890
tails-0.837
tail-0.820
ns-0.771
Token.
Token ID13
Feature activation+1.43e-01

Pos logit contributions
 disob+0.018
 kids+0.017
ily+0.016
asure+0.016
ying+0.016
Neg logit contributions
elope-0.033
oc-0.033
tail-0.031
tails-0.028
 den-0.028
Token One
Token ID1881
Feature activation+5.91e-01

Pos logit contributions
 disob+0.009
 kids+0.009
ying+0.008
ily+0.008
asure+0.008
Neg logit contributions
elope-0.011
oc-0.011
tail-0.010
 den-0.009
 pasture-0.009
Token day
Token ID1110
Feature activation-3.28e-01

Pos logit contributions
 disob+0.037
 kids+0.036
ily+0.034
ying+0.034
asure+0.033
Neg logit contributions
elope-0.048
oc-0.047
tail-0.044
 den-0.039
tails-0.039
Token,
Token ID11
Feature activation+3.60e-01

Pos logit contributions
elope+0.035
oc+0.034
tail+0.032
ns+0.031
 den+0.030
Neg logit contributions
 kids-0.013
 disob-0.012
ily-0.012
ying-0.011
 children-0.010
Token her
Token ID607
Feature activation-5.33e-01

Pos logit contributions
 disob+0.173
 kids+0.171
ily+0.161
ying+0.156
asure+0.151
Neg logit contributions
elope-0.342
oc-0.337
tail-0.312
 den-0.289
tails-0.289
Token hair
Token ID4190
Feature activation+1.86e+00

Pos logit contributions
elope+0.048
tail+0.046
oc+0.046
tails+0.042
ns+0.042
Neg logit contributions
 disob-0.027
ying-0.026
ateful-0.025
ily-0.025
asure-0.024
Token in
Token ID287
Feature activation+2.72e+00

Pos logit contributions
 disob+0.130
 kids+0.129
ateful+0.126
 truth+0.124
ying+0.121
Neg logit contributions
tail-0.134
elope-0.120
oc-0.117
tails-0.110
light-0.107
Token a
Token ID257
Feature activation+2.89e-01

Pos logit contributions
 disob+0.124
 kids+0.119
ily+0.112
ying+0.111
asure+0.107
Neg logit contributions
elope-0.266
oc-0.261
tail-0.247
tails-0.227
 den-0.227
Token pony
Token ID26902
Feature activation-1.70e+00

Pos logit contributions
 disob+0.014
 kids+0.014
 iron+0.013
ily+0.012
 right+0.012
Neg logit contributions
elope-0.028
oc-0.027
tail-0.024
tails-0.023
Bet-0.023
Tokentail
Token ID13199
Feature activation-1.23e+01

Pos logit contributions
elope+0.154
oc+0.151
Bet+0.133
icol+0.132
tail+0.131
Neg logit contributions
 disob-0.087
 kids-0.085
ily-0.078
asure-0.078
ying-0.075
Token so
Token ID523
Feature activation+7.18e-01

Pos logit contributions
elope+1.102
oc+1.080
tail+1.029
 den+0.925
ns+0.925
Neg logit contributions
 disob-0.641
 kids-0.620
ying-0.601
ateful-0.592
asure-0.574
Token she
Token ID673
Feature activation-1.07e+00

Pos logit contributions
 disob+0.041
 kids+0.039
ily+0.037
ying+0.037
asure+0.036
Neg logit contributions
elope-0.062
oc-0.061
tail-0.057
tails-0.052
 den-0.051
Token could
Token ID714
Feature activation-2.26e+00

Pos logit contributions
elope+0.093
oc+0.090
tail+0.085
ns+0.078
 Louise+0.077
Neg logit contributions
 kids-0.058
 disob-0.057
ying-0.056
ateful-0.055
ily-0.053
Token look
Token ID804
Feature activation+3.69e+00

Pos logit contributions
elope+0.200
tail+0.195
oc+0.195
tails+0.172
 pasture+0.169
Neg logit contributions
 kids-0.116
ying-0.115
 disob-0.114
ateful-0.111
ily-0.109
Token pretty
Token ID2495
Feature activation+1.28e+00

Pos logit contributions
 disob+0.199
 kids+0.191
ily+0.181
ying+0.179
asure+0.177
Neg logit contributions
elope-0.328
oc-0.324
tail-0.304
 den-0.277
ns-0.275
Token,
Token ID11
Feature activation+3.62e-01

Pos logit contributions
 disob+0.081
 kids+0.078
ily+0.076
asure+0.074
ying+0.073
Neg logit contributions
elope-0.132
oc-0.128
tail-0.121
 Louise-0.110
 den-0.109
Token Emma
Token ID18966
Feature activation+4.83e-01

Pos logit contributions
 disob+0.022
 kids+0.021
ily+0.021
ying+0.021
asure+0.020
Neg logit contributions
elope-0.030
oc-0.029
tail-0.027
 den-0.024
 Louise-0.024
Token decided
Token ID3066
Feature activation+2.99e+00

Pos logit contributions
 disob+0.025
 kids+0.025
ily+0.024
ying+0.024
asure+0.023
Neg logit contributions
elope-0.044
oc-0.043
tail-0.040
tails-0.036
 den-0.036
Token a
Token ID257
Feature activation+3.17e-01

Pos logit contributions
ying+0.194
 kids+0.194
ateful+0.193
ily+0.191
 disob+0.190
Neg logit contributions
elope-0.208
tail-0.194
oc-0.192
 Louise-0.176
tails-0.170
Token pony
Token ID26902
Feature activation-1.67e+00

Pos logit contributions
 disob+0.016
 kids+0.015
 truth+0.013
 children+0.013
 ma+0.012
Neg logit contributions
elope-0.031
oc-0.031
tail-0.029
ns-0.027
tails-0.026
Tokentail
Token ID13199
Feature activation-1.22e+01

Pos logit contributions
elope+0.153
oc+0.151
tail+0.138
 den+0.130
icol+0.129
Neg logit contributions
 disob-0.087
 kids-0.084
ily-0.080
asure-0.076
ying-0.075
Token wasn
Token ID2492
Feature activation+3.75e-01

Pos logit contributions
elope+0.987
oc+0.964
tail+0.909
ns+0.822
 den+0.805
Neg logit contributions
 disob-0.709
ateful-0.693
 kids-0.693
ying-0.682
asure-0.670
Token't
Token ID470
Feature activation+9.90e-01

Pos logit contributions
 kids+0.024
ateful+0.024
ying+0.024
 disob+0.023
 reasons+0.023
Neg logit contributions
elope-0.027
tail-0.026
oc-0.025
June-0.023
Bet-0.023
Token so
Token ID523
Feature activation+7.27e-01

Pos logit contributions
ateful+0.058
asure+0.057
ying+0.057
ily+0.055
ailed+0.055
Neg logit contributions
elope-0.073
tail-0.068
 roaring-0.063
oc-0.062
tails-0.062
Token important
Token ID1593
Feature activation+2.11e+00

Pos logit contributions
asure+0.039
ying+0.037
 kids+0.036
 disob+0.036
 dishes+0.034
Neg logit contributions
elope-0.058
oc-0.054
tail-0.054
 Louise-0.051
ns-0.050
Token after
Token ID706
Feature activation-4.09e+00

Pos logit contributions
 disob+0.113
 kids+0.108
ily+0.105
asure+0.102
ying+0.101
Neg logit contributions
elope-0.187
oc-0.184
tail-0.174
tails-0.158
 den-0.156
Token with
Token ID351
Feature activation-1.29e-01

Pos logit contributions
 kids+0.077
 disob+0.075
ily+0.070
 children+0.065
 right+0.063
Neg logit contributions
elope-0.159
oc-0.156
 tal-0.144
icol-0.144
Bet-0.143
Token glasses
Token ID15232
Feature activation-3.67e+00

Pos logit contributions
oc+0.014
elope+0.014
tail+0.013
 den+0.012
 lifetime+0.012
Neg logit contributions
 kids-0.005
 disob-0.004
ily-0.004
 children-0.004
 boys-0.004
Token and
Token ID290
Feature activation+1.20e-01

Pos logit contributions
elope+0.290
oc+0.289
tail+0.289
 den+0.254
tails+0.251
Neg logit contributions
 disob-0.217
ateful-0.203
 kids-0.199
ying-0.198
asure-0.187
Token a
Token ID257
Feature activation+3.45e-01

Pos logit contributions
 kids+0.005
 disob+0.004
 children+0.004
ily+0.004
 boys+0.003
Neg logit contributions
oc-0.013
elope-0.013
tail-0.012
Bet-0.011
 lifetime-0.011
Token pony
Token ID26902
Feature activation-1.66e+00

Pos logit contributions
 kids+0.018
 disob+0.016
 children+0.015
 iron+0.015
 truth+0.015
Neg logit contributions
elope-0.033
oc-0.032
tail-0.028
ns-0.028
Bet-0.027
Tokentail
Token ID13199
Feature activation-1.22e+01

Pos logit contributions
elope+0.179
oc+0.174
Bet+0.160
icol+0.160
 lifetime+0.160
Neg logit contributions
 kids-0.059
 disob-0.053
ily-0.049
 children-0.049
 right-0.046
Token.
Token ID13
Feature activation+1.76e-01

Pos logit contributions
elope+1.189
oc+1.169
tail+1.099
tails+1.014
 den+1.011
Neg logit contributions
 disob-0.562
 kids-0.543
ily-0.509
ying-0.501
asure-0.486
Token She
Token ID1375
Feature activation+3.46e-01

Pos logit contributions
 kids+0.008
ily+0.008
 disob+0.007
 children+0.007
 en+0.007
Neg logit contributions
elope-0.017
oc-0.017
tail-0.016
tails-0.015
 lifetime-0.015
Token is
Token ID318
Feature activation-1.10e+00

Pos logit contributions
 kids+0.019
 disob+0.018
ily+0.017
 children+0.017
asure+0.016
Neg logit contributions
elope-0.031
oc-0.031
tail-0.029
Bet-0.027
 lifetime-0.027
Token reading
Token ID3555
Feature activation-4.31e+00

Pos logit contributions
oc+0.100
elope+0.098
tail+0.091
 den+0.083
 tal+0.082
Neg logit contributions
 disob-0.060
 kids-0.060
ily-0.056
 children-0.053
asure-0.053
Token a
Token ID257
Feature activation+3.47e-01

Pos logit contributions
elope+0.445
oc+0.440
tail+0.408
 den+0.385
tails+0.380
Neg logit contributions
 kids-0.172
 disob-0.169
ily-0.162
 children-0.145
asure-0.145
Token along
Token ID1863
Feature activation+1.36e+00

Pos logit contributions
elope+0.245
oc+0.242
tail+0.223
 den+0.209
ns+0.208
Neg logit contributions
 kids-0.110
 disob-0.104
ily-0.102
 children-0.096
ying-0.096
Token to
Token ID284
Feature activation+4.18e-01

Pos logit contributions
 kids+0.058
 disob+0.057
ily+0.052
 children+0.050
ying+0.048
Neg logit contributions
elope-0.137
oc-0.135
tail-0.126
 den-0.118
tails-0.118
Token the
Token ID262
Feature activation+2.29e-01

Pos logit contributions
 kids+0.017
 disob+0.017
ily+0.015
 children+0.015
ying+0.014
Neg logit contributions
oc-0.043
elope-0.043
tail-0.040
Bet-0.038
 den-0.037
Token hy
Token ID2537
Feature activation-2.88e+00

Pos logit contributions
 disob+0.012
 kids+0.012
ily+0.011
ying+0.011
asure+0.011
Neg logit contributions
elope-0.020
oc-0.020
tail-0.019
 den-0.017
tails-0.017
Tokenm
Token ID76
Feature activation+2.52e+00

Pos logit contributions
elope+0.212
oc+0.193
 pasture+0.174
tail+0.171
 harmony+0.161
Neg logit contributions
 disob-0.212
ying-0.192
ily-0.190
 kids-0.188
 penn-0.186
Tokenns
Token ID5907
Feature activation-1.21e+01

Pos logit contributions
 disob+0.166
asure+0.156
ily+0.156
 kids+0.149
ying+0.145
Neg logit contributions
elope-0.192
oc-0.185
tail-0.163
 pasture-0.156
tails-0.154
Token,
Token ID11
Feature activation+3.66e-01

Pos logit contributions
elope+1.223
oc+1.189
tail+1.113
Bet+1.057
icol+1.056
Neg logit contributions
 disob-0.513
 kids-0.510
ying-0.455
ily-0.451
 children-0.437
Token said
Token ID531
Feature activation+1.01e+00

Pos logit contributions
 kids+0.019
 disob+0.018
 children+0.016
ily+0.016
asure+0.016
Neg logit contributions
elope-0.034
oc-0.034
tail-0.031
tails-0.029
 den-0.029
Token prayers
Token ID15707
Feature activation-2.47e-01

Pos logit contributions
 disob+0.060
 kids+0.058
ily+0.056
ying+0.055
asure+0.054
Neg logit contributions
elope-0.083
oc-0.081
tail-0.077
tails-0.070
 Louise-0.069
Token and
Token ID290
Feature activation+9.57e-02

Pos logit contributions
elope+0.022
oc+0.022
tail+0.021
tails+0.019
 den+0.019
Neg logit contributions
 disob-0.013
 kids-0.012
ily-0.012
ying-0.011
asure-0.011
Token listened
Token ID16399
Feature activation+3.32e+00

Pos logit contributions
 disob+0.005
 kids+0.005
ily+0.004
asure+0.004
ying+0.004
Neg logit contributions
elope-0.009
oc-0.009
tail-0.008
 den-0.008
Bet-0.008
Token trying
Token ID2111
Feature activation-1.79e+00

Pos logit contributions
oc+0.080
elope+0.078
tail+0.073
 den+0.065
tails+0.065
Neg logit contributions
 disob-0.049
 kids-0.048
ily-0.044
ying-0.043
asure-0.043
Token to
Token ID284
Feature activation+4.24e-01

Pos logit contributions
oc+0.196
elope+0.196
tail+0.185
ns+0.169
icol+0.168
Neg logit contributions
 disob-0.064
 kids-0.062
 right-0.059
 truth-0.056
ying-0.055
Token keep
Token ID1394
Feature activation-1.05e+00

Pos logit contributions
ying+0.025
 kids+0.025
 disob+0.025
ily+0.024
ateful+0.024
Neg logit contributions
elope-0.034
oc-0.033
tail-0.032
ns-0.029
tails-0.029
Token the
Token ID262
Feature activation+2.34e-01

Pos logit contributions
oc+0.105
elope+0.103
tail+0.096
 lifetime+0.088
icol+0.088
Neg logit contributions
 kids-0.050
 disob-0.048
 children-0.043
 right-0.043
 iron-0.042
Token he
Token ID339
Feature activation+9.53e-02

Pos logit contributions
 kids+0.015
 disob+0.014
 children+0.013
 iron+0.013
 right+0.012
Neg logit contributions
elope-0.020
oc-0.019
tail-0.018
Bet-0.017
June-0.017
Tokenns
Token ID5907
Feature activation-1.21e+01

Pos logit contributions
 disob+0.005
 kids+0.005
ily+0.004
ying+0.004
asure+0.004
Neg logit contributions
elope-0.009
oc-0.009
tail-0.008
tails-0.008
 den-0.008
Token in
Token ID287
Feature activation+2.74e+00

Pos logit contributions
elope+1.080
oc+1.065
tail+1.000
 den+0.913
tails+0.911
Neg logit contributions
 disob-0.632
 kids-0.614
ily-0.588
ying-0.576
asure-0.564
Token line
Token ID1627
Feature activation-6.89e-01

Pos logit contributions
 disob+0.125
 kids+0.122
ily+0.113
ying+0.109
asure+0.106
Neg logit contributions
elope-0.267
oc-0.263
tail-0.248
tails-0.229
ns-0.228
Token.
Token ID13
Feature activation+1.51e-01

Pos logit contributions
elope+0.057
oc+0.056
tail+0.053
ns+0.048
tails+0.048
Neg logit contributions
 disob-0.040
 kids-0.038
ateful-0.036
ily-0.035
asure-0.035
Token Billy
Token ID15890
Feature activation-3.77e-01

Pos logit contributions
 kids+0.008
 disob+0.008
asure+0.007
ily+0.007
 children+0.007
Neg logit contributions
elope-0.014
oc-0.013
tail-0.013
tails-0.012
ns-0.012
Token watched
Token ID7342
Feature activation-2.34e+00

Pos logit contributions
elope+0.036
oc+0.035
tail+0.033
 den+0.030
tails+0.030
Neg logit contributions
 disob-0.018
 kids-0.018
ily-0.016
ying-0.016
asure-0.016
Token little
Token ID1310
Feature activation+3.94e+00

Pos logit contributions
 kids+0.018
 little+0.016
 right+0.016
 children+0.016
 fair+0.016
Neg logit contributions
elope-0.029
tail-0.027
June-0.026
oc-0.026
 Louise-0.025
Token shelter
Token ID11772
Feature activation-2.33e+00

Pos logit contributions
 kids+0.232
 children+0.204
 right+0.200
 disob+0.197
 boy+0.194
Neg logit contributions
elope-0.334
June-0.310
Bet-0.298
oc-0.293
tail-0.292
Token for
Token ID329
Feature activation-3.38e+00

Pos logit contributions
elope+0.227
oc+0.223
tail+0.210
tails+0.193
 den+0.193
Neg logit contributions
 disob-0.106
 kids-0.103
ily-0.096
ying-0.095
asure-0.091
Token the
Token ID262
Feature activation+2.42e-01

Pos logit contributions
elope+0.327
oc+0.322
tail+0.302
tails+0.278
 den+0.277
Neg logit contributions
 disob-0.159
 kids-0.155
ily-0.144
ying-0.141
asure-0.137
Token he
Token ID339
Feature activation+1.11e-01

Pos logit contributions
 kids+0.014
 disob+0.014
ily+0.013
 children+0.013
ying+0.012
Neg logit contributions
elope-0.021
oc-0.020
tail-0.019
tails-0.017
Bet-0.017
Tokenns
Token ID5907
Feature activation-1.21e+01

Pos logit contributions
 disob+0.006
 kids+0.006
ily+0.005
 children+0.005
asure+0.005
Neg logit contributions
elope-0.010
oc-0.010
tail-0.009
tails-0.009
Bet-0.009
Token.
Token ID13
Feature activation+1.58e-01

Pos logit contributions
elope+1.106
oc+1.087
tail+1.019
tails+0.934
 den+0.932
Neg logit contributions
 disob-0.613
 kids-0.595
ily-0.563
ying-0.555
asure-0.541
Token Billy
Token ID15890
Feature activation-3.62e-01

Pos logit contributions
 kids+0.008
asure+0.007
 children+0.007
ily+0.007
 disob+0.007
Neg logit contributions
elope-0.014
tail-0.014
oc-0.013
ns-0.013
tails-0.012
Token couldn
Token ID3521
Feature activation-1.30e+00

Pos logit contributions
elope+0.034
oc+0.033
tail+0.032
 den+0.029
 pasture+0.029
Neg logit contributions
 disob-0.018
 kids-0.017
ily-0.015
 children-0.015
ying-0.015
Token't
Token ID470
Feature activation+9.91e-01

Pos logit contributions
elope+0.115
oc+0.113
tail+0.107
tails+0.095
ns+0.095
Neg logit contributions
 disob-0.070
 kids-0.068
ying-0.065
ily-0.065
asure-0.064
Token believe
Token ID1975
Feature activation+1.42e+00

Pos logit contributions
ying+0.058
 kids+0.056
asure+0.055
ateful+0.055
 disob+0.054
Neg logit contributions
elope-0.084
oc-0.082
tail-0.079
ns-0.069
 den-0.067
Token was
Token ID373
Feature activation-8.49e-01

Pos logit contributions
 disob+0.006
 kids+0.006
ying+0.006
ily+0.006
asure+0.006
Neg logit contributions
elope-0.011
oc-0.011
tail-0.010
 den-0.009
ns-0.009
Token playing
Token ID2712
Feature activation-1.91e+00

Pos logit contributions
elope+0.071
oc+0.070
tail+0.064
tails+0.058
 den+0.058
Neg logit contributions
 disob-0.053
 kids-0.052
ily-0.049
ying-0.048
asure-0.047
Token with
Token ID351
Feature activation-1.18e-01

Pos logit contributions
elope+0.178
oc+0.176
tail+0.165
 den+0.151
tails+0.150
Neg logit contributions
 disob-0.096
 kids-0.093
ily-0.087
ying-0.087
asure-0.084
Token some
Token ID617
Feature activation+7.53e-01

Pos logit contributions
elope+0.011
oc+0.011
tail+0.010
tails+0.010
 den+0.010
Neg logit contributions
 disob-0.006
 kids-0.005
ying-0.005
ily-0.005
asure-0.005
Token cor
Token ID1162
Feature activation-3.68e+00

Pos logit contributions
 disob+0.035
 kids+0.033
ily+0.032
ying+0.032
asure+0.031
Neg logit contributions
elope-0.072
oc-0.071
tail-0.067
 den-0.062
tails-0.062
Tokenns
Token ID5907
Feature activation-1.20e+01

Pos logit contributions
elope+0.232
oc+0.197
 harmony+0.174
tail+0.173
 den+0.166
Neg logit contributions
 kids-0.313
 disob-0.303
 children-0.287
 penn-0.287
ily-0.286
Token he
Token ID339
Feature activation+1.21e-01

Pos logit contributions
elope+1.202
oc+1.139
tail+1.074
icol+1.018
June+1.003
Neg logit contributions
 disob-0.584
 kids-0.581
ily-0.532
ying-0.514
 children-0.498
Token found
Token ID1043
Feature activation-9.60e-01

Pos logit contributions
 disob+0.006
 kids+0.006
ily+0.005
 children+0.005
 ignored+0.005
Neg logit contributions
elope-0.012
oc-0.011
tail-0.011
 tal-0.010
oing-0.010
Token in
Token ID287
Feature activation+2.77e+00

Pos logit contributions
elope+0.098
oc+0.096
tail+0.091
tails+0.084
 den+0.084
Neg logit contributions
 disob-0.039
 kids-0.039
ily-0.037
ying-0.035
asure-0.034
Token his
Token ID465
Feature activation+3.29e-01

Pos logit contributions
 kids+0.115
 disob+0.114
ily+0.102
 children+0.096
asure+0.095
Neg logit contributions
elope-0.283
oc-0.278
tail-0.262
tails-0.243
ns-0.241
Token garden
Token ID11376
Feature activation-1.56e+00

Pos logit contributions
 kids+0.019
 disob+0.016
 children+0.015
 iron+0.015
 dad+0.015
Neg logit contributions
oc-0.031
elope-0.028
oing-0.028
tail-0.027
odic-0.027
Token was
Token ID373
Feature activation-1.01e+00

Pos logit contributions
 disob+0.275
 kids+0.257
ying+0.251
ily+0.247
asure+0.245
Neg logit contributions
elope-0.410
oc-0.399
tail-0.367
tails-0.336
 den-0.334
Token a
Token ID257
Feature activation+1.76e-01

Pos logit contributions
elope+0.105
oc+0.105
tail+0.098
tails+0.092
 den+0.091
Neg logit contributions
 kids-0.040
 disob-0.040
ily-0.038
asure-0.034
 children-0.033
Token little
Token ID1310
Feature activation+3.73e+00

Pos logit contributions
 kids+0.006
 disob+0.006
 children+0.005
 boss+0.004
 truth+0.004
Neg logit contributions
elope-0.019
oc-0.019
tail-0.019
tails-0.017
ns-0.017
Token girl
Token ID2576
Feature activation+1.46e+00

Pos logit contributions
 disob+0.121
 kids+0.117
ily+0.104
ying+0.101
asure+0.096
Neg logit contributions
elope-0.413
oc-0.406
tail-0.386
tails-0.360
 den-0.358
Token named
Token ID3706
Feature activation-6.16e+00

Pos logit contributions
 disob+0.094
 kids+0.093
ily+0.086
 children+0.085
 iron+0.081
Neg logit contributions
elope-0.108
oc-0.107
tail-0.099
icol-0.097
Fel-0.096
Token Luna
Token ID23694
Feature activation-1.18e+01

Pos logit contributions
elope+0.564
oc+0.532
tail+0.513
 Louise+0.476
tails+0.473
Neg logit contributions
 disob-0.324
 kids-0.302
ying-0.293
asure-0.284
ateful-0.280
Token.
Token ID13
Feature activation-5.72e-02

Pos logit contributions
elope+1.116
oc+1.080
tail+1.013
tails+0.936
ns+0.928
Neg logit contributions
 disob-0.586
 kids-0.564
ying-0.540
ily-0.522
asure-0.521
Token She
Token ID1375
Feature activation+1.11e-01

Pos logit contributions
 lifetime+0.005
oc+0.005
icol+0.004
 tal+0.004
elope+0.004
Neg logit contributions
ily-0.004
 kids-0.004
 children-0.003
 two-0.003
 Lily-0.003
Token was
Token ID373
Feature activation-1.08e+00

Pos logit contributions
 disob+0.006
 kids+0.005
ily+0.005
asure+0.005
 children+0.005
Neg logit contributions
oc-0.010
elope-0.010
tail-0.010
 den-0.009
tails-0.009
Token only
Token ID691
Feature activation+2.30e+00

Pos logit contributions
oc+0.106
elope+0.106
tail+0.098
 den+0.090
tails+0.090
Neg logit contributions
 disob-0.050
 kids-0.049
ily-0.045
asure-0.043
ying-0.043
Token three
Token ID1115
Feature activation+4.28e+00

Pos logit contributions
 disob+0.077
 kids+0.070
ying+0.068
ily+0.066
asure+0.063
Neg logit contributions
elope-0.253
oc-0.246
tail-0.237
tails-0.220
 den-0.219
Token mist
Token ID4020
Feature activation-1.96e+00

Pos logit contributions
 disob+0.006
 kids+0.005
ying+0.005
ily+0.005
asure+0.005
Neg logit contributions
elope-0.008
oc-0.008
tail-0.008
 den-0.007
ns-0.007
Token to
Token ID284
Feature activation+2.40e-01

Pos logit contributions
elope+0.174
oc+0.169
tail+0.159
tails+0.146
 den+0.145
Neg logit contributions
 disob-0.107
 kids-0.104
ily-0.098
ying-0.096
asure-0.095
Token go
Token ID467
Feature activation+3.19e-01

Pos logit contributions
ying+0.013
 disob+0.013
 kids+0.013
ily+0.013
ateful+0.013
Neg logit contributions
elope-0.021
oc-0.019
tail-0.019
 den-0.017
 pasture-0.017
Token away
Token ID1497
Feature activation-1.04e-02

Pos logit contributions
 disob+0.018
 kids+0.017
ily+0.016
ying+0.016
asure+0.016
Neg logit contributions
elope-0.028
oc-0.027
tail-0.026
tails-0.024
 den-0.024
Token.
Token ID13
Feature activation-5.72e-02

Pos logit contributions
elope+0.001
oc+0.001
tail+0.001
 den+0.001
tails+0.001
Neg logit contributions
 disob-0.001
 kids-0.001
ily-0.000
ying-0.000
asure-0.000
Token Luna
Token ID23694
Feature activation-1.18e+01

Pos logit contributions
elope+0.005
oc+0.005
tail+0.004
tails+0.004
ns+0.004
Neg logit contributions
 kids-0.003
ily-0.003
 disob-0.003
asure-0.003
 children-0.003
Token knew
Token ID2993
Feature activation-9.35e-01

Pos logit contributions
elope+1.064
oc+1.018
tail+0.947
ns+0.878
tails+0.871
Neg logit contributions
 disob-0.620
ying-0.616
 kids-0.609
ateful-0.594
asure-0.585
Token something
Token ID1223
Feature activation-9.04e-01

Pos logit contributions
oc+0.095
elope+0.091
tail+0.086
ns+0.080
oing+0.080
Neg logit contributions
 disob-0.043
 kids-0.042
ily-0.041
ying-0.037
 children-0.036
Token good
Token ID922
Feature activation+9.09e-01

Pos logit contributions
elope+0.082
oc+0.081
tail+0.076
tails+0.069
 den+0.069
Neg logit contributions
 disob-0.048
 kids-0.047
ily-0.045
ying-0.044
asure-0.042
Token would
Token ID561
Feature activation-3.03e+00

Pos logit contributions
 disob+0.054
 kids+0.052
ateful+0.051
asure+0.051
ying+0.048
Neg logit contributions
elope-0.073
oc-0.068
tail-0.065
 Louise-0.061
 den-0.060
Token come
Token ID1282
Feature activation-6.92e-01

Pos logit contributions
elope+0.274
oc+0.260
tail+0.247
 den+0.228
 Louise+0.227
Neg logit contributions
 kids-0.151
 disob-0.150
ying-0.150
ateful-0.146
asure-0.145
Token,
Token ID11
Feature activation+1.78e-01

Pos logit contributions
elope+0.268
oc+0.263
tail+0.244
tails+0.224
ns+0.222
Neg logit contributions
 disob-0.153
 kids-0.147
ying-0.142
ily-0.139
ateful-0.136
Token I
Token ID314
Feature activation+2.35e-01

Pos logit contributions
 disob+0.009
ateful+0.009
ying+0.009
asure+0.008
 kids+0.008
Neg logit contributions
elope-0.016
oc-0.015
tail-0.014
 Louise-0.014
 Luna-0.013
Token'm
Token ID1101
Feature activation+3.73e-01

Pos logit contributions
 kids+0.012
 disob+0.012
ying+0.012
asure+0.012
ily+0.011
Neg logit contributions
oc-0.021
elope-0.021
tail-0.019
ns-0.017
tails-0.017
Token sorry
Token ID7926
Feature activation+1.53e+00

Pos logit contributions
 disob+0.017
 kids+0.016
ying+0.016
ily+0.015
asure+0.015
Neg logit contributions
elope-0.036
oc-0.035
tail-0.033
ns-0.031
 den-0.031
Token,"
Token ID553
Feature activation+3.97e+00

Pos logit contributions
 disob+0.079
 kids+0.075
ily+0.072
ying+0.071
asure+0.071
Neg logit contributions
elope-0.141
oc-0.138
tail-0.130
tails-0.119
 den-0.119
Token Luna
Token ID23694
Feature activation-1.18e+01

Pos logit contributions
 disob+0.220
ying+0.213
 kids+0.206
asure+0.202
ily+0.199
Neg logit contributions
elope-0.349
oc-0.333
tail-0.316
 Louise-0.293
tails-0.287
Token says
Token ID1139
Feature activation-1.59e+00

Pos logit contributions
elope+0.988
oc+0.907
tail+0.860
 Louise+0.852
ns+0.803
Neg logit contributions
ying-0.676
 disob-0.670
asure-0.662
ateful-0.659
 kids-0.632
Token.
Token ID13
Feature activation-2.98e-02

Pos logit contributions
elope+0.133
oc+0.126
tail+0.122
 Louise+0.112
tails+0.110
Neg logit contributions
 disob-0.090
 kids-0.088
asure-0.087
ateful-0.086
ying-0.085
Token "
Token ID366
Feature activation+1.33e-01

Pos logit contributions
oc+0.003
elope+0.003
 lifetime+0.003
tail+0.003
 den+0.003
Neg logit contributions
 kids-0.001
ily-0.001
 disob-0.001
 right-0.001
 children-0.001
TokenI
Token ID40
Feature activation+8.83e-01

Pos logit contributions
 disob+0.008
ying+0.008
 kids+0.008
asure+0.008
 penn+0.008
Neg logit contributions
elope-0.010
oc-0.010
tail-0.009
June-0.009
Bet-0.009
Token was
Token ID373
Feature activation-1.02e+00

Pos logit contributions
ying+0.044
 kids+0.044
asure+0.042
 disob+0.041
ily+0.040
Neg logit contributions
oc-0.083
elope-0.081
tail-0.074
ns-0.068
tails-0.067
Token bad
Token ID2089
Feature activation+2.11e+00

Pos logit contributions
 kids+0.041
 disob+0.040
ily+0.035
 truth+0.035
 children+0.035
Neg logit contributions
elope-0.078
oc-0.076
tail-0.072
tails-0.067
ns-0.066
Token dog
Token ID3290
Feature activation-3.84e+00

Pos logit contributions
 disob+0.121
 kids+0.115
ying+0.115
ily+0.112
asure+0.111
Neg logit contributions
elope-0.180
oc-0.177
tail-0.164
 den-0.152
tails-0.148
Token.
Token ID13
Feature activation+7.36e-03

Pos logit contributions
elope+0.342
oc+0.336
tail+0.314
tails+0.287
 den+0.286
Neg logit contributions
 disob-0.206
 kids-0.200
ily-0.190
ying-0.187
asure-0.182
Token Anna
Token ID11735
Feature activation-1.55e+00

Pos logit contributions
 disob+0.000
 kids+0.000
ily+0.000
ying+0.000
asure+0.000
Neg logit contributions
elope-0.001
oc-0.001
tail-0.001
tails-0.001
 den-0.001
Token gave
Token ID2921
Feature activation+7.52e-01

Pos logit contributions
elope+0.140
oc+0.137
tail+0.128
tails+0.117
ns+0.117
Neg logit contributions
 disob-0.081
 kids-0.079
ying-0.077
ily-0.076
asure-0.075
Token Luna
Token ID23694
Feature activation-1.18e+01

Pos logit contributions
 kids+0.034
 disob+0.032
ily+0.030
 children+0.029
ying+0.027
Neg logit contributions
oc-0.078
elope-0.077
tail-0.070
 lifetime-0.067
 den-0.066
Token water
Token ID1660
Feature activation+2.09e+00

Pos logit contributions
elope+1.140
oc+1.120
tail+1.054
tails+0.971
 den+0.968
Neg logit contributions
 disob-0.550
 kids-0.530
ily-0.498
ying-0.493
asure-0.477
Token and
Token ID290
Feature activation-1.07e-01

Pos logit contributions
 disob+0.118
 kids+0.109
ying+0.108
asure+0.106
ateful+0.105
Neg logit contributions
elope-0.176
oc-0.174
tail-0.163
 den-0.151
tails-0.151
Token food
Token ID2057
Feature activation-3.39e+00

Pos logit contributions
elope+0.010
oc+0.010
tail+0.009
 lifetime+0.008
 den+0.008
Neg logit contributions
 kids-0.006
 disob-0.005
ily-0.005
 children-0.005
 boys-0.004
Token,
Token ID11
Feature activation+1.61e-01

Pos logit contributions
elope+0.301
oc+0.295
tail+0.278
 den+0.257
tails+0.255
Neg logit contributions
 disob-0.179
 kids-0.169
ying-0.163
ateful-0.159
ily-0.158
Token but
Token ID475
Feature activation-3.10e-01

Pos logit contributions
 disob+0.007
 kids+0.007
ily+0.007
ying+0.006
 children+0.006
Neg logit contributions
elope-0.016
oc-0.016
tail-0.015
tails-0.014
 den-0.014
Token floor
Token ID4314
Feature activation+3.81e+00

Pos logit contributions
 disob+0.010
 kids+0.010
ily+0.010
ying+0.009
asure+0.009
Neg logit contributions
elope-0.015
oc-0.015
tail-0.014
tails-0.013
 den-0.012
Token.
Token ID13
Feature activation+1.35e-01

Pos logit contributions
 disob+0.252
ateful+0.236
 kids+0.234
asure+0.230
ying+0.225
Neg logit contributions
elope-0.260
oc-0.251
tail-0.242
light-0.228
 den-0.224
Token They
Token ID1119
Feature activation-2.64e-01

Pos logit contributions
 disob+0.008
 kids+0.008
ily+0.007
ying+0.007
asure+0.007
Neg logit contributions
elope-0.012
oc-0.011
tail-0.011
tails-0.010
 den-0.010
Token talk
Token ID1561
Feature activation-3.73e+00

Pos logit contributions
elope+0.023
oc+0.022
tail+0.021
ns+0.020
tails+0.020
Neg logit contributions
ying-0.014
 disob-0.013
asure-0.013
 kids-0.013
ily-0.013
Token to
Token ID284
Feature activation+2.36e-01

Pos logit contributions
elope+0.351
oc+0.345
tail+0.324
tails+0.297
 den+0.296
Neg logit contributions
 disob-0.184
 kids-0.179
ily-0.168
ying-0.166
asure-0.161
Token Luna
Token ID23694
Feature activation-1.18e+01

Pos logit contributions
 kids+0.010
 disob+0.010
ily+0.009
 children+0.009
ying+0.008
Neg logit contributions
elope-0.024
oc-0.023
tail-0.022
 den-0.021
tails-0.020
Token in
Token ID287
Feature activation+2.55e+00

Pos logit contributions
elope+1.149
oc+1.134
tail+1.061
 den+0.979
tails+0.979
Neg logit contributions
 disob-0.537
 kids-0.527
ily-0.491
ying-0.478
asure-0.460
Token soft
Token ID2705
Feature activation-1.38e+00

Pos logit contributions
 kids+0.100
 disob+0.083
 children+0.083
ily+0.077
 truth+0.077
Neg logit contributions
oc-0.278
elope-0.276
tail-0.255
 lifetime-0.253
ns-0.248
Token voices
Token ID10839
Feature activation-1.05e-01

Pos logit contributions
oc+0.138
elope+0.136
tail+0.123
 lifetime+0.123
ns+0.120
Neg logit contributions
 kids-0.068
 disob-0.058
 children-0.058
ily-0.054
 dishes-0.053
Token.
Token ID13
Feature activation-6.39e-02

Pos logit contributions
elope+0.010
oc+0.010
tail+0.009
tails+0.009
 den+0.008
Neg logit contributions
 disob-0.005
 kids-0.005
ily-0.005
ying-0.005
asure-0.004
Token They
Token ID1119
Feature activation-3.30e-01

Pos logit contributions
oc+0.007
 lifetime+0.006
elope+0.006
 den+0.006
tail+0.006
Neg logit contributions
 kids-0.003
ily-0.003
 children-0.002
 truth-0.002
 ma-0.002
Token water
Token ID1660
Feature activation+2.09e+00

Pos logit contributions
elope+1.140
oc+1.120
tail+1.054
tails+0.971
 den+0.968
Neg logit contributions
 disob-0.550
 kids-0.530
ily-0.498
ying-0.493
asure-0.477
Token and
Token ID290
Feature activation-1.07e-01

Pos logit contributions
 disob+0.118
 kids+0.109
ying+0.108
asure+0.106
ateful+0.105
Neg logit contributions
elope-0.176
oc-0.174
tail-0.163
 den-0.151
tails-0.151
Token food
Token ID2057
Feature activation-3.39e+00

Pos logit contributions
elope+0.010
oc+0.010
tail+0.009
 lifetime+0.008
 den+0.008
Neg logit contributions
 kids-0.006
 disob-0.005
ily-0.005
 children-0.005
 boys-0.004
Token,
Token ID11
Feature activation+1.61e-01

Pos logit contributions
elope+0.301
oc+0.295
tail+0.278
 den+0.257
tails+0.255
Neg logit contributions
 disob-0.179
 kids-0.169
ying-0.163
ateful-0.159
ily-0.158
Token but
Token ID475
Feature activation-3.10e-01

Pos logit contributions
 disob+0.007
 kids+0.007
ily+0.007
ying+0.006
 children+0.006
Neg logit contributions
elope-0.016
oc-0.016
tail-0.015
tails-0.014
 den-0.014
Token Luna
Token ID23694
Feature activation-1.18e+01

Pos logit contributions
oc+0.030
elope+0.029
tail+0.027
icol+0.026
 lifetime+0.026
Neg logit contributions
 kids-0.016
 disob-0.015
ily-0.014
 children-0.013
ying-0.013
Token did
Token ID750
Feature activation+3.11e+00

Pos logit contributions
elope+1.137
oc+1.083
tail+1.009
ns+0.964
tails+0.936
Neg logit contributions
ying-0.529
 kids-0.520
asure-0.515
 disob-0.511
ateful-0.511
Token not
Token ID407
Feature activation+1.30e+00

Pos logit contributions
 disob+0.086
 kids+0.079
ily+0.072
ying+0.063
 truth+0.059
Neg logit contributions
oc-0.367
elope-0.366
tail-0.341
 den-0.326
 pasture-0.321
Token want
Token ID765
Feature activation+1.03e+00

Pos logit contributions
ying+0.068
 disob+0.067
 kids+0.066
asure+0.066
ily+0.065
Neg logit contributions
elope-0.111
oc-0.108
tail-0.104
tails-0.096
ns-0.095
Token to
Token ID284
Feature activation+3.02e-01

Pos logit contributions
 kids+0.038
ily+0.037
 disob+0.036
 truth+0.033
 children+0.031
Neg logit contributions
oc-0.123
elope-0.117
tail-0.108
 lifetime-0.106
oing-0.104
Token play
Token ID711
Feature activation-1.99e+00

Pos logit contributions
ying+0.020
ily+0.018
asure+0.018
outed+0.017
aund+0.017
Neg logit contributions
oc-0.023
elope-0.023
tail-0.022
tails-0.020
 harmony-0.019
TokenFrom
Token ID4863
Feature activation-1.24e+00

Pos logit contributions
elope+0.037
oc+0.036
tail+0.034
 den+0.031
tails+0.031
Neg logit contributions
 disob-0.026
 kids-0.025
ily-0.025
ying-0.023
asure-0.023
Token then
Token ID788
Feature activation-2.72e+00

Pos logit contributions
elope+0.099
oc+0.093
tail+0.089
 Louise+0.079
tails+0.079
Neg logit contributions
 disob-0.078
 kids-0.076
ying-0.073
ily-0.073
asure-0.071
Token on
Token ID319
Feature activation-2.07e+00

Pos logit contributions
elope+0.256
oc+0.246
tail+0.226
 den+0.220
ns+0.219
Neg logit contributions
 disob-0.137
 kids-0.137
ily-0.129
ying-0.129
 right-0.117
Token,
Token ID11
Feature activation+1.86e-01

Pos logit contributions
elope+0.206
oc+0.206
tail+0.193
ns+0.181
 den+0.179
Neg logit contributions
 kids-0.087
 disob-0.087
ily-0.081
ying-0.079
 children-0.073
Token whenever
Token ID8797
Feature activation-3.10e+00

Pos logit contributions
 kids+0.009
 disob+0.009
ily+0.008
 children+0.008
ying+0.008
Neg logit contributions
elope-0.018
oc-0.017
tail-0.016
 den-0.015
ns-0.015
Token Luna
Token ID23694
Feature activation-1.18e+01

Pos logit contributions
oc+0.301
elope+0.298
tail+0.277
ns+0.261
tails+0.260
Neg logit contributions
 kids-0.147
 disob-0.142
ily-0.130
 children-0.124
ying-0.121
Token found
Token ID1043
Feature activation-1.13e+00

Pos logit contributions
elope+1.022
oc+0.999
tail+0.918
ns+0.848
 pasture+0.836
Neg logit contributions
 disob-0.638
 kids-0.637
ying-0.628
asure-0.610
ily-0.610
Token herself
Token ID5223
Feature activation+4.35e+00

Pos logit contributions
oc+0.113
elope+0.112
tail+0.105
tails+0.096
 den+0.096
Neg logit contributions
 kids-0.049
 disob-0.049
ily-0.046
ying-0.044
 children-0.042
Token in
Token ID287
Feature activation+2.62e+00

Pos logit contributions
 disob+0.228
 kids+0.218
ily+0.210
ying+0.205
asure+0.203
Neg logit contributions
elope-0.395
oc-0.383
tail-0.360
 den-0.332
tails-0.331
Token a
Token ID257
Feature activation+2.07e-01

Pos logit contributions
 disob+0.117
 kids+0.114
ily+0.105
ying+0.100
asure+0.098
Neg logit contributions
elope-0.259
oc-0.257
tail-0.240
tails-0.222
ns-0.222
Token beautiful
Token ID4950
Feature activation-1.24e+00

Pos logit contributions
 disob+0.010
 kids+0.010
 right+0.009
 truth+0.009
 boss+0.009
Neg logit contributions
elope-0.019
oc-0.019
tail-0.017
icol-0.017
tails-0.016
Token clay
Token ID21558
Feature activation+1.32e+00

Pos logit contributions
elope+0.289
oc+0.285
tail+0.269
Bet+0.250
 Louise+0.248
Neg logit contributions
 kids-0.102
 disob-0.092
 children-0.082
ily-0.081
 boys-0.073
Token balls
Token ID11333
Feature activation-1.14e+00

Pos logit contributions
 kids+0.068
 disob+0.066
ily+0.061
asure+0.059
ying+0.059
Neg logit contributions
elope-0.123
oc-0.118
tail-0.112
ns-0.103
 pasture-0.103
Token together
Token ID1978
Feature activation-6.84e-01

Pos logit contributions
oc+0.088
elope+0.086
tail+0.083
 den+0.074
tails+0.073
Neg logit contributions
 disob-0.074
ateful-0.065
 kids-0.065
ying-0.064
ily-0.062
Token!"
Token ID2474
Feature activation-2.18e+00

Pos logit contributions
elope+0.058
oc+0.057
tail+0.053
tails+0.049
 den+0.048
Neg logit contributions
 disob-0.040
 kids-0.038
ily-0.037
ying-0.036
asure-0.035
Token But
Token ID887
Feature activation-1.72e+00

Pos logit contributions
elope+0.178
oc+0.176
 lifetime+0.162
tail+0.160
 harmony+0.158
Neg logit contributions
 kids-0.135
ily-0.124
 children-0.117
 disob-0.113
 boys-0.108
Token Luna
Token ID23694
Feature activation-1.17e+01

Pos logit contributions
oc+0.144
elope+0.137
icol+0.129
oing+0.128
 lifetime+0.127
Neg logit contributions
 kids-0.109
ily-0.098
 children-0.094
 disob-0.094
 first-0.093
Token ignored
Token ID9514
Feature activation+4.87e+00

Pos logit contributions
elope+1.148
oc+1.099
tail+1.014
ns+0.962
tails+0.943
Neg logit contributions
ying-0.530
 kids-0.512
ateful-0.509
 disob-0.495
asure-0.481
Token Max
Token ID5436
Feature activation-4.33e+00

Pos logit contributions
 kids+0.187
 disob+0.168
ily+0.166
 children+0.159
 right+0.145
Neg logit contributions
oc-0.528
elope-0.514
 lifetime-0.495
tail-0.467
 den-0.462
Token.
Token ID13
Feature activation+4.81e-02

Pos logit contributions
elope+0.391
oc+0.382
tail+0.361
tails+0.333
 den+0.325
Neg logit contributions
 disob-0.226
 kids-0.221
ily-0.211
ying-0.207
asure-0.204
Token She
Token ID1375
Feature activation+2.30e-01

Pos logit contributions
 kids+0.002
ily+0.002
 children+0.002
 boys+0.002
 right+0.002
Neg logit contributions
oc-0.005
elope-0.005
 lifetime-0.005
 tal-0.004
 harmony-0.004
Token did
Token ID750
Feature activation+3.20e+00

Pos logit contributions
 disob+0.010
 kids+0.009
ily+0.009
ying+0.008
asure+0.008
Neg logit contributions
elope-0.023
oc-0.023
tail-0.022
 den-0.020
tails-0.020
Token lab
Token ID2248
Feature activation-3.15e+00

Pos logit contributions
 kids+0.004
 disob+0.004
ily+0.004
 children+0.004
ying+0.004
Neg logit contributions
elope-0.006
oc-0.006
tail-0.005
tails-0.005
Bet-0.005
Token and
Token ID290
Feature activation-4.69e-02

Pos logit contributions
elope+0.283
oc+0.280
tail+0.262
 den+0.241
tails+0.239
Neg logit contributions
 disob-0.165
 kids-0.158
ily-0.153
ying-0.150
asure-0.147
Token make
Token ID787
Feature activation-2.00e+00

Pos logit contributions
elope+0.004
oc+0.004
tail+0.004
ns+0.004
 pasture+0.004
Neg logit contributions
 disob-0.002
 kids-0.002
ily-0.002
asure-0.002
 children-0.002
Token sure
Token ID1654
Feature activation+2.02e+00

Pos logit contributions
elope+0.201
oc+0.198
tail+0.186
 den+0.171
Bet+0.170
Neg logit contributions
 kids-0.088
 disob-0.087
ily-0.080
ying-0.076
 children-0.074
Token that
Token ID326
Feature activation+1.37e+00

Pos logit contributions
 disob+0.107
 kids+0.104
ily+0.098
ying+0.096
asure+0.093
Neg logit contributions
elope-0.184
oc-0.182
tail-0.169
 den-0.155
tails-0.155
Token Luna
Token ID23694
Feature activation-1.17e+01

Pos logit contributions
 kids+0.072
 disob+0.070
ily+0.063
 children+0.061
ying+0.060
Neg logit contributions
oc-0.129
elope-0.128
tail-0.117
oing-0.110
icol-0.110
Token was
Token ID373
Feature activation-9.77e-01

Pos logit contributions
elope+1.029
oc+0.986
tail+0.930
ns+0.856
tails+0.854
Neg logit contributions
 kids-0.623
 disob-0.623
ying-0.605
ily-0.598
asure-0.598
Token safe
Token ID3338
Feature activation-2.98e+00

Pos logit contributions
elope+0.087
oc+0.083
tail+0.079
tails+0.073
 den+0.073
Neg logit contributions
 disob-0.052
 kids-0.051
ying-0.049
ily-0.048
asure-0.047
Token.
Token ID13
Feature activation+6.37e-02

Pos logit contributions
elope+0.278
oc+0.272
tail+0.255
 den+0.235
tails+0.234
Neg logit contributions
 disob-0.149
 kids-0.142
ily-0.135
ying-0.133
asure-0.132
Token\n
Token ID198
Feature activation-3.12e-01

Pos logit contributions
 kids+0.003
ily+0.002
 children+0.002
 disob+0.002
asure+0.002
Neg logit contributions
elope-0.006
 lifetime-0.006
oc-0.006
 harmony-0.006
 hovering-0.006
Token\n
Token ID198
Feature activation-3.07e-01

Pos logit contributions
elope+0.031
tail+0.030
icol+0.030
oc+0.030
 den+0.028
Neg logit contributions
ily-0.013
 kids-0.012
 disob-0.011
 truth-0.011
ying-0.010
Token He
Token ID679
Feature activation+5.91e-01

Pos logit contributions
elope+0.002
oc+0.002
tail+0.002
tails+0.002
ns+0.002
Neg logit contributions
 kids-0.001
ily-0.001
 disob-0.001
asure-0.001
 truth-0.001
Token was
Token ID373
Feature activation-1.02e+00

Pos logit contributions
 disob+0.029
 kids+0.028
ily+0.026
 children+0.024
asure+0.024
Neg logit contributions
elope-0.057
oc-0.055
tail-0.053
 pasture-0.048
 den-0.048
Token very
Token ID845
Feature activation-3.83e+00

Pos logit contributions
oc+0.107
elope+0.105
tail+0.097
 den+0.089
tails+0.089
Neg logit contributions
 disob-0.045
 kids-0.044
ily-0.040
 right-0.037
 children-0.037
Token happy
Token ID3772
Feature activation-3.08e-01

Pos logit contributions
elope+0.342
oc+0.336
tail+0.314
tails+0.287
 den+0.286
Neg logit contributions
 disob-0.208
 kids-0.202
ily-0.192
ying-0.191
asure-0.186
Token.
Token ID13
Feature activation+3.85e-02

Pos logit contributions
elope+0.031
oc+0.030
tail+0.029
ns+0.027
tails+0.027
Neg logit contributions
 kids-0.014
 disob-0.013
ily-0.012
 children-0.012
ying-0.012
Token Luna
Token ID23694
Feature activation-1.17e+01

Pos logit contributions
 kids+0.002
ily+0.002
 disob+0.002
asure+0.001
 children+0.001
Neg logit contributions
elope-0.004
oc-0.004
tail-0.004
tails-0.003
ns-0.003
Token was
Token ID373
Feature activation-9.77e-01

Pos logit contributions
elope+1.020
oc+0.990
tail+0.918
tails+0.844
ns+0.843
Neg logit contributions
 disob-0.641
ying-0.622
 kids-0.620
ily-0.602
ateful-0.593
Token a
Token ID257
Feature activation+2.24e-01

Pos logit contributions
elope+0.090
oc+0.089
tail+0.082
 den+0.075
tails+0.075
Neg logit contributions
 disob-0.052
 kids-0.050
ily-0.048
ying-0.046
asure-0.045
Token little
Token ID1310
Feature activation+3.84e+00

Pos logit contributions
 kids+0.008
 little+0.008
 disob+0.008
 boss+0.008
 fair+0.007
Neg logit contributions
elope-0.024
tail-0.023
tails-0.023
oc-0.023
ns-0.022
Token sad
Token ID6507
Feature activation+2.53e+00

Pos logit contributions
 disob+0.207
 kids+0.203
ily+0.179
 children+0.176
 right+0.174
Neg logit contributions
elope-0.349
oc-0.340
tail-0.314
tails-0.301
Bet-0.298
Token,
Token ID11
Feature activation+2.89e-01

Pos logit contributions
 disob+0.134
 kids+0.128
ying+0.124
asure+0.124
ily+0.123
Neg logit contributions
elope-0.229
oc-0.224
tail-0.209
 den-0.193
 Louise-0.189
Token\n
Token ID198
Feature activation-4.02e-01

Pos logit contributions
oc+0.003
elope+0.003
tail+0.003
tails+0.003
 lifetime+0.003
Neg logit contributions
 kids-0.001
ily-0.001
 children-0.001
 disob-0.001
 truth-0.001
Token\n
Token ID198
Feature activation-3.88e-01

Pos logit contributions
oc+0.037
elope+0.037
tail+0.036
icol+0.034
tails+0.033
Neg logit contributions
ily-0.020
 kids-0.019
 disob-0.018
 truth-0.017
ying-0.017
TokenOne
Token ID3198
Feature activation-8.03e-01

Pos logit contributions
elope+0.032
oc+0.032
tail+0.030
 tal+0.028
icol+0.028
Neg logit contributions
 disob-0.023
 kids-0.022
ily-0.022
ying-0.020
asure-0.020
Token day
Token ID1110
Feature activation-4.48e-01

Pos logit contributions
elope+0.064
oc+0.062
tail+0.057
 den+0.052
tails+0.051
Neg logit contributions
 disob-0.052
 kids-0.050
asure-0.048
ying-0.048
ily-0.048
Token,
Token ID11
Feature activation+2.49e-01

Pos logit contributions
elope+0.043
oc+0.043
tail+0.040
 den+0.037
tails+0.037
Neg logit contributions
 disob-0.021
 kids-0.021
ily-0.019
ying-0.019
asure-0.018
Token Luna
Token ID23694
Feature activation-1.17e+01

Pos logit contributions
 kids+0.011
 disob+0.011
ily+0.010
 children+0.009
asure+0.009
Neg logit contributions
oc-0.025
elope-0.025
tail-0.023
 den-0.021
ns-0.021
Token found
Token ID1043
Feature activation-1.10e+00

Pos logit contributions
elope+1.032
oc+0.995
tail+0.913
ns+0.855
 den+0.852
Neg logit contributions
 disob-0.628
ying-0.622
 kids-0.608
asure-0.601
ateful-0.592
Token a
Token ID257
Feature activation+2.17e-01

Pos logit contributions
oc+0.132
elope+0.126
tail+0.117
 lifetime+0.116
oing+0.112
Neg logit contributions
ily-0.039
 kids-0.035
 disob-0.034
 children-0.029
 right-0.029
Token ball
Token ID2613
Feature activation+1.22e+00

Pos logit contributions
 disob+0.010
 kids+0.010
 truth+0.009
 iron+0.009
 children+0.008
Neg logit contributions
elope-0.021
oc-0.021
tail-0.020
ns-0.018
tails-0.018
Token that
Token ID326
Feature activation+1.40e+00

Pos logit contributions
 disob+0.059
 kids+0.057
ying+0.054
ateful+0.053
asure+0.052
Neg logit contributions
oc-0.112
elope-0.112
tail-0.106
 den-0.096
ns-0.095
Token could
Token ID714
Feature activation-2.32e+00

Pos logit contributions
 disob+0.069
 kids+0.067
ily+0.066
ying+0.059
asure+0.058
Neg logit contributions
oc-0.134
elope-0.131
tail-0.122
 den-0.114
 pasture-0.114
Token "
Token ID366
Feature activation+1.33e-01

Pos logit contributions
 kids+0.009
 disob+0.009
ily+0.008
asure+0.008
 children+0.008
Neg logit contributions
elope-0.018
oc-0.018
tail-0.016
 den-0.015
 tal-0.015
TokenWe
Token ID1135
Feature activation-1.90e+00

Pos logit contributions
 kids+0.007
 disob+0.007
ily+0.006
 children+0.006
asure+0.006
Neg logit contributions
elope-0.012
oc-0.012
 den-0.011
tail-0.011
 tal-0.011
Token are
Token ID389
Feature activation+2.77e+00

Pos logit contributions
elope+0.167
oc+0.164
tail+0.158
tails+0.148
ns+0.143
Neg logit contributions
ying-0.101
 disob-0.092
 kids-0.088
ily-0.088
ateful-0.087
Token sorry
Token ID7926
Feature activation+1.61e+00

Pos logit contributions
 kids+0.135
 disob+0.130
ily+0.119
 children+0.116
 right+0.110
Neg logit contributions
oc-0.286
elope-0.277
tail-0.256
 tal-0.236
 den-0.235
Token,
Token ID11
Feature activation+2.49e-01

Pos logit contributions
 kids+0.071
 disob+0.066
ily+0.061
 children+0.060
ying+0.057
Neg logit contributions
elope-0.168
oc-0.165
tail-0.155
ns-0.148
 pasture-0.145
Token Luna
Token ID23694
Feature activation-1.17e+01

Pos logit contributions
 kids+0.023
 children+0.021
ily+0.021
 disob+0.020
 boys+0.020
Neg logit contributions
oc-0.016
elope-0.015
 lifetime-0.014
tail-0.013
icol-0.013
Token.
Token ID13
Feature activation+3.82e-02

Pos logit contributions
elope+1.141
oc+1.129
tail+1.065
 den+0.979
ns+0.976
Neg logit contributions
 disob-0.531
 kids-0.527
ily-0.489
ying-0.466
 children-0.448
Token We
Token ID775
Feature activation-3.34e+00

Pos logit contributions
 kids+0.002
ily+0.002
 disob+0.002
 children+0.002
 boys+0.002
Neg logit contributions
oc-0.003
elope-0.003
 lifetime-0.003
 pasture-0.003
 den-0.003
Token want
Token ID765
Feature activation+1.08e+00

Pos logit contributions
elope+0.271
oc+0.270
tail+0.252
tails+0.243
ns+0.225
Neg logit contributions
ying-0.199
 ma-0.171
ateful-0.169
asure-0.169
rier-0.166
Token to
Token ID284
Feature activation+3.42e-01

Pos logit contributions
 kids+0.048
 disob+0.047
 children+0.041
ily+0.041
 truth+0.040
Neg logit contributions
oc-0.116
elope-0.113
tail-0.104
ns-0.096
 pasture-0.096
Token be
Token ID307
Feature activation+1.58e+00

Pos logit contributions
ying+0.021
 disob+0.020
 kids+0.020
ily+0.020
asure+0.019
Neg logit contributions
elope-0.028
oc-0.027
tail-0.026
tails-0.023
 den-0.023
Token's
Token ID338
Feature activation-1.54e+00

Pos logit contributions
elope+0.143
oc+0.141
tail+0.132
tails+0.122
 den+0.121
Neg logit contributions
 disob-0.073
 kids-0.070
ily-0.066
ying-0.066
asure-0.064
Token house
Token ID2156
Feature activation-5.92e-02

Pos logit contributions
elope+0.134
oc+0.131
tail+0.123
 den+0.112
tails+0.112
Neg logit contributions
 disob-0.086
 kids-0.083
ily-0.080
ying-0.079
asure-0.077
Token.
Token ID13
Feature activation+3.85e-02

Pos logit contributions
elope+0.005
oc+0.005
tail+0.005
tails+0.004
 den+0.004
Neg logit contributions
 disob-0.003
ily-0.003
ying-0.003
 kids-0.003
asure-0.003
Token We
Token ID775
Feature activation-3.35e+00

Pos logit contributions
 kids+0.003
ily+0.002
 disob+0.002
asure+0.002
 children+0.002
Neg logit contributions
elope-0.003
oc-0.003
tail-0.003
tails-0.003
 den-0.003
Token found
Token ID1043
Feature activation-1.09e+00

Pos logit contributions
elope+0.280
oc+0.279
tail+0.261
ns+0.243
tails+0.239
Neg logit contributions
ying-0.192
ateful-0.181
 kids-0.174
 disob-0.172
 ma-0.171
Token Luna
Token ID23694
Feature activation-1.17e+01

Pos logit contributions
oc+0.111
elope+0.110
tail+0.103
oing+0.095
 den+0.095
Neg logit contributions
 kids-0.049
 disob-0.044
ily-0.044
 children-0.042
asure-0.039
Token in
Token ID287
Feature activation+2.65e+00

Pos logit contributions
elope+1.095
oc+1.054
tail+1.014
 den+0.930
tails+0.929
Neg logit contributions
 disob-0.588
 kids-0.542
ying-0.530
ily-0.526
ateful-0.515
Token a
Token ID257
Feature activation+2.37e-01

Pos logit contributions
 kids+0.108
 disob+0.104
ily+0.094
 children+0.090
 truth+0.086
Neg logit contributions
elope-0.273
oc-0.272
tail-0.253
ns-0.239
tails-0.237
Token box
Token ID3091
Feature activation+1.59e+00

Pos logit contributions
 kids+0.011
 store+0.010
 fair+0.010
 little+0.010
 tidy+0.010
Neg logit contributions
elope-0.022
tail-0.021
oc-0.021
icol-0.020
 tal-0.020
Token in
Token ID287
Feature activation+2.66e+00

Pos logit contributions
 disob+0.091
ying+0.084
ateful+0.084
 kids+0.083
ily+0.082
Neg logit contributions
elope-0.133
oc-0.131
tail-0.124
 den-0.113
light-0.110
Token the
Token ID262
Feature activation+1.68e-01

Pos logit contributions
 kids+0.106
 disob+0.099
ily+0.093
 children+0.088
ying+0.082
Neg logit contributions
oc-0.276
elope-0.275
tail-0.257
ns-0.244
tails-0.241
TokenWhy
Token ID5195
Feature activation-2.97e-02

Pos logit contributions
 disob+0.020
ying+0.020
asure+0.019
 kids+0.019
 penn+0.018
Neg logit contributions
oc-0.024
elope-0.023
tail-0.021
Bet-0.020
June-0.019
Token are
Token ID389
Feature activation+2.92e+00

Pos logit contributions
oc+0.002
elope+0.002
 Luna+0.002
tail+0.002
tails+0.002
Neg logit contributions
 disob-0.001
 penn-0.001
asure-0.001
ter-0.001
ying-0.001
Token you
Token ID345
Feature activation+6.57e-01

Pos logit contributions
ying+0.177
 disob+0.170
ateful+0.162
 truth+0.162
iness+0.161
Neg logit contributions
elope-0.199
tail-0.193
 Luna-0.188
 Patch-0.188
tails-0.185
Token not
Token ID407
Feature activation+1.47e+00

Pos logit contributions
ily+0.038
ying+0.038
 reasons+0.035
iness+0.034
ateful+0.034
Neg logit contributions
 roaring-0.048
elope-0.046
 barking-0.045
 herd-0.043
oc-0.043
Token playing
Token ID2712
Feature activation-1.92e+00

Pos logit contributions
ying+0.094
ily+0.092
ailed+0.090
 penn+0.090
 disob+0.088
Neg logit contributions
oc-0.090
elope-0.089
 roaring-0.088
tail-0.082
 herd-0.080
Token nicely
Token ID16576
Feature activation+5.70e+00

Pos logit contributions
oc+0.158
tail+0.144
elope+0.144
 harmony+0.127
 Patch+0.125
Neg logit contributions
 disob-0.120
ying-0.108
asure-0.104
 penn-0.102
ateful-0.101
Token?"
Token ID1701
Feature activation+2.65e+00

Pos logit contributions
 disob+0.258
ily+0.219
 kids+0.215
asure+0.208
ying+0.208
Neg logit contributions
oc-0.565
elope-0.556
tail-0.523
 den-0.476
 Louise-0.468
Token
Token ID198
Feature activation-2.28e-01

Pos logit contributions
 disob+0.174
 kids+0.168
ily+0.161
ying+0.161
asure+0.158
Neg logit contributions
elope-0.207
oc-0.203
tail-0.188
 den-0.169
tails-0.168
Token
Token ID198
Feature activation-2.17e-01

Pos logit contributions
elope+0.019
oc+0.019
tail+0.018
tails+0.016
ns+0.016
Neg logit contributions
 kids-0.013
 disob-0.013
ily-0.013
ying-0.012
 children-0.012
Token"
Token ID1
Feature activation+8.02e-01

Pos logit contributions
elope+0.020
oc+0.019
tail+0.018
tails+0.016
 den+0.016
Neg logit contributions
 disob-0.012
 kids-0.011
ying-0.011
ily-0.011
asure-0.010
TokenBen
Token ID11696
Feature activation+1.01e+00

Pos logit contributions
ying+0.055
 disob+0.052
asure+0.050
 kids+0.050
 dishes+0.048
Neg logit contributions
oc-0.055
elope-0.054
tail-0.050
June-0.049
Patch-0.048
Token and
Token ID290
Feature activation+1.11e-01

Pos logit contributions
 disob+0.339
 kids+0.336
ily+0.306
ying+0.302
asure+0.295
Neg logit contributions
elope-0.767
oc-0.745
tail-0.704
Bet-0.662
tails-0.661
Token thanked
Token ID26280
Feature activation+1.02e+00

Pos logit contributions
 disob+0.005
 kids+0.005
ily+0.005
ying+0.005
asure+0.005
Neg logit contributions
elope-0.010
oc-0.010
tail-0.010
tails-0.009
 den-0.009
Token him
Token ID683
Feature activation-1.08e+00

Pos logit contributions
 disob+0.059
ying+0.057
asure+0.056
ily+0.055
 kids+0.055
Neg logit contributions
elope-0.085
oc-0.085
tail-0.078
 Louise-0.071
 harmony-0.070
Token.
Token ID13
Feature activation+1.62e-01

Pos logit contributions
elope+0.091
oc+0.088
tail+0.083
tails+0.076
 den+0.075
Neg logit contributions
 disob-0.063
 kids-0.062
ily-0.059
ying-0.059
asure-0.057
Token The
Token ID383
Feature activation-1.11e+00

Pos logit contributions
 disob+0.012
 kids+0.011
ily+0.011
ying+0.011
asure+0.011
Neg logit contributions
elope-0.012
oc-0.011
tail-0.010
tails-0.009
ns-0.009
Token officer
Token ID3818
Feature activation+7.79e+00

Pos logit contributions
oc+0.101
elope+0.088
 herd+0.082
 lifetime+0.080
 den+0.078
Neg logit contributions
ying-0.068
asure-0.057
 disob-0.056
ily-0.054
ateful-0.054
Token smiled
Token ID13541
Feature activation+1.16e+00

Pos logit contributions
 disob+0.436
ying+0.431
 kids+0.429
ily+0.422
asure+0.405
Neg logit contributions
elope-0.671
oc-0.671
tail-0.603
ns-0.552
tails-0.551
Token and
Token ID290
Feature activation+1.19e-01

Pos logit contributions
 disob+0.063
 kids+0.060
ying+0.057
ily+0.057
asure+0.055
Neg logit contributions
elope-0.104
oc-0.103
tail-0.095
 den-0.088
tails-0.086
Token said
Token ID531
Feature activation+1.03e+00

Pos logit contributions
 disob+0.005
 kids+0.005
ily+0.004
 children+0.004
asure+0.004
Neg logit contributions
elope-0.012
oc-0.012
tail-0.012
 den-0.011
ns-0.011
Token "
Token ID366
Feature activation+3.12e-01

Pos logit contributions
 disob+0.076
 kids+0.072
ateful+0.070
ailed+0.068
asure+0.068
Neg logit contributions
oc-0.071
elope-0.064
tail-0.058
 Louise-0.054
 Patch-0.054
TokenIt
Token ID1026
Feature activation+2.13e+00

Pos logit contributions
 disob+0.017
 kids+0.017
ily+0.016
ying+0.016
asure+0.015
Neg logit contributions
elope-0.028
oc-0.027
tail-0.025
 den-0.023
tails-0.023
Token be
Token ID307
Feature activation+1.63e+00

Pos logit contributions
ying+0.024
ily+0.022
ateful+0.021
outed+0.021
 kids+0.021
Neg logit contributions
elope-0.033
oc-0.031
tail-0.031
tails-0.028
 den-0.027
Token careful
Token ID8161
Feature activation+7.35e-01

Pos logit contributions
 disob+0.085
 kids+0.082
ily+0.079
ying+0.078
asure+0.076
Neg logit contributions
elope-0.148
oc-0.145
tail-0.136
tails-0.125
 den-0.125
Token."
Token ID526
Feature activation+1.54e+00

Pos logit contributions
 disob+0.041
ily+0.038
 kids+0.037
ateful+0.037
 penn+0.036
Neg logit contributions
elope-0.063
oc-0.061
tail-0.057
tails-0.053
 den-0.052
Token
Token ID198
Feature activation-2.57e-01

Pos logit contributions
 disob+0.108
 kids+0.101
ying+0.101
asure+0.099
ily+0.099
Neg logit contributions
elope-0.116
oc-0.113
tail-0.100
 den-0.092
 Louise-0.091
Token
Token ID198
Feature activation-2.67e-01

Pos logit contributions
elope+0.021
oc+0.020
tail+0.019
tails+0.017
 den+0.017
Neg logit contributions
 disob-0.016
 kids-0.016
ily-0.015
ying-0.015
asure-0.015
TokenJimmy
Token ID40335
Feature activation+5.95e+00

Pos logit contributions
oc+0.023
elope+0.022
tail+0.019
 lifetime+0.019
 pasture+0.018
Neg logit contributions
 disob-0.015
 kids-0.015
asure-0.015
ying-0.014
ily-0.014
Token understood
Token ID7247
Feature activation+2.14e+00

Pos logit contributions
 disob+0.309
 kids+0.309
ying+0.304
ily+0.297
ateful+0.289
Neg logit contributions
elope-0.546
oc-0.532
tail-0.484
tails-0.447
ns-0.446
Token and
Token ID290
Feature activation+7.90e-02

Pos logit contributions
 disob+0.130
asure+0.128
 kids+0.127
ateful+0.125
ily+0.122
Neg logit contributions
oc-0.168
elope-0.166
tail-0.142
 Louise-0.135
 den-0.132
Token he
Token ID339
Feature activation+6.70e-02

Pos logit contributions
 disob+0.004
 kids+0.004
asure+0.004
ying+0.004
ily+0.004
Neg logit contributions
elope-0.007
oc-0.007
tail-0.006
 Louise-0.006
 den-0.006
Token kept
Token ID4030
Feature activation+1.90e+00

Pos logit contributions
 kids+0.003
 disob+0.003
ying+0.003
ily+0.003
asure+0.003
Neg logit contributions
elope-0.006
oc-0.006
tail-0.006
ns-0.005
tails-0.005
Token walking
Token ID6155
Feature activation-2.46e+00

Pos logit contributions
 disob+0.113
 kids+0.108
ying+0.100
ily+0.099
asure+0.096
Neg logit contributions
elope-0.164
oc-0.163
tail-0.150
 den-0.136
ns-0.136
Token says
Token ID1139
Feature activation-1.40e+00

Pos logit contributions
elope+0.064
oc+0.064
tail+0.058
 Louise+0.053
 Luna+0.052
Neg logit contributions
ying-0.062
 disob-0.062
ateful-0.062
asure-0.061
ily-0.058
Token.
Token ID13
Feature activation+1.46e-01

Pos logit contributions
elope+0.114
oc+0.109
tail+0.108
tails+0.094
 Louise+0.093
Neg logit contributions
 disob-0.083
asure-0.080
 kids-0.080
ateful-0.078
ying-0.077
Token
Token ID198
Feature activation-2.36e-01

Pos logit contributions
 kids+0.008
 disob+0.008
ily+0.008
ying+0.008
 children+0.007
Neg logit contributions
elope-0.012
oc-0.012
tail-0.011
tails-0.011
 den-0.010
Token
Token ID198
Feature activation-2.31e-01

Pos logit contributions
elope+0.021
oc+0.021
tail+0.020
tails+0.019
ns+0.018
Neg logit contributions
 kids-0.012
ily-0.012
 disob-0.012
ying-0.011
 truth-0.011
TokenL
Token ID43
Feature activation+1.68e+00

Pos logit contributions
oc+0.018
elope+0.018
tail+0.016
Bet+0.015
June+0.015
Neg logit contributions
 disob-0.015
ying-0.014
 kids-0.014
asure-0.014
 truth-0.013
Tokenily
Token ID813
Feature activation+5.74e+00

Pos logit contributions
nam+0.094
 disob+0.080
 worry+0.072
asure+0.070
 penn+0.069
Neg logit contributions
elope-0.138
oc-0.133
 Luna-0.132
 Louise-0.123
tail-0.123
Token and
Token ID290
Feature activation+9.95e-02

Pos logit contributions
 disob+0.350
 kids+0.333
ying+0.328
ily+0.327
asure+0.324
Neg logit contributions
elope-0.469
oc-0.464
tail-0.423
 den-0.388
ns-0.387
Token Ben
Token ID3932
Feature activation-9.00e-01

Pos logit contributions
 disob+0.007
ying+0.006
asure+0.006
ateful+0.006
 truth+0.006
Neg logit contributions
elope-0.007
tail-0.007
oc-0.007
 Louise-0.007
 Luna-0.007
Token walk
Token ID2513
Feature activation+4.79e-01

Pos logit contributions
elope+0.067
oc+0.065
tail+0.064
ns+0.058
tails+0.056
Neg logit contributions
ying-0.060
ateful-0.059
 disob-0.057
ily-0.057
 kids-0.057
Token to
Token ID284
Feature activation+4.31e-01

Pos logit contributions
 kids+0.020
 right+0.018
ily+0.018
 children+0.017
 disob+0.017
Neg logit contributions
oc-0.051
elope-0.050
tail-0.046
 lifetime-0.045
icol-0.045
Token the
Token ID262
Feature activation+2.38e-01

Pos logit contributions
 kids+0.017
 disob+0.016
ily+0.014
 children+0.014
 right+0.013
Neg logit contributions
oc-0.046
elope-0.046
tail-0.043
 lifetime-0.041
tails-0.041
Token rock
Token ID3881
Feature activation+3.98e+00

Pos logit contributions
ateful+0.018
asure+0.018
ying+0.018
outed+0.018
ily+0.017
Neg logit contributions
 den-0.014
oc-0.012
 lifetime-0.012
 clearing-0.012
 herd-0.012
Token was
Token ID373
Feature activation-8.77e-01

Pos logit contributions
 disob+0.246
 kids+0.245
ying+0.237
ily+0.237
ateful+0.236
Neg logit contributions
elope-0.312
oc-0.306
tail-0.276
ns-0.253
tails-0.253
Token not
Token ID407
Feature activation+1.46e+00

Pos logit contributions
elope+0.078
oc+0.075
tail+0.071
 den+0.066
tails+0.066
Neg logit contributions
 disob-0.046
 kids-0.045
ying-0.043
ily-0.042
asure-0.041
Token worth
Token ID2861
Feature activation+1.37e+00

Pos logit contributions
ying+0.093
ailed+0.091
 disob+0.091
ily+0.089
ateful+0.089
Neg logit contributions
elope-0.100
tail-0.090
oc-0.090
 tal-0.086
tails-0.086
Token any
Token ID597
Feature activation+8.28e-01

Pos logit contributions
ying+0.085
asure+0.085
 disob+0.083
 kids+0.082
ily+0.081
Neg logit contributions
elope-0.102
oc-0.099
tail-0.096
ns-0.085
 tal-0.084
Token money
Token ID1637
Feature activation+6.71e+00

Pos logit contributions
asure+0.043
 disob+0.041
ying+0.041
ily+0.040
ateful+0.038
Neg logit contributions
oc-0.072
elope-0.069
tail-0.065
 den-0.062
 tal-0.061
Token,
Token ID11
Feature activation+3.64e-01

Pos logit contributions
 disob+0.389
 kids+0.371
ateful+0.364
ily+0.363
asure+0.357
Neg logit contributions
elope-0.524
oc-0.517
tail-0.494
ns-0.451
tails-0.444
Token but
Token ID475
Feature activation-1.22e-01

Pos logit contributions
 disob+0.021
asure+0.020
ying+0.019
 kids+0.019
ily+0.018
Neg logit contributions
oc-0.029
elope-0.029
tail-0.028
 Louise-0.025
 harmony-0.025
Token it
Token ID340
Feature activation-1.28e+00

Pos logit contributions
elope+0.011
oc+0.010
tail+0.010
 Louise+0.009
 harmony+0.009
Neg logit contributions
 disob-0.007
ying-0.006
 kids-0.006
asure-0.006
ily-0.006
Token was
Token ID373
Feature activation-8.74e-01

Pos logit contributions
elope+0.112
oc+0.109
tail+0.099
ns+0.092
 den+0.091
Neg logit contributions
 kids-0.070
 disob-0.069
ily-0.067
ying-0.067
asure-0.065
Token still
Token ID991
Feature activation+6.16e-01

Pos logit contributions
elope+0.078
oc+0.075
tail+0.071
 den+0.066
ns+0.065
Neg logit contributions
 disob-0.046
 kids-0.045
ying-0.044
ily-0.042
asure-0.041
Token While
Token ID2893
Feature activation-3.26e+00

Pos logit contributions
 kids+0.009
 disob+0.009
ily+0.009
asure+0.008
ying+0.008
Neg logit contributions
elope-0.014
oc-0.013
tail-0.013
tails-0.012
ns-0.011
Token he
Token ID339
Feature activation+1.00e-01

Pos logit contributions
elope+0.339
oc+0.321
tail+0.311
ns+0.293
tails+0.292
Neg logit contributions
 kids-0.152
 disob-0.129
ily-0.125
 children-0.123
 penn-0.111
Token was
Token ID373
Feature activation-8.57e-01

Pos logit contributions
 disob+0.006
 kids+0.006
ying+0.005
ily+0.005
asure+0.005
Neg logit contributions
oc-0.009
elope-0.009
tail-0.008
 den-0.007
ns-0.007
Token playing
Token ID2712
Feature activation-1.92e+00

Pos logit contributions
elope+0.050
tail+0.046
oc+0.046
 pasture+0.040
tails+0.040
Neg logit contributions
 disob-0.068
ateful-0.066
ying-0.065
 kids-0.065
ily-0.065
Token,
Token ID11
Feature activation+3.73e-01

Pos logit contributions
oc+0.169
elope+0.168
tail+0.157
 den+0.143
 pasture+0.140
Neg logit contributions
 disob-0.105
 kids-0.097
asure-0.095
ying-0.094
ateful-0.093
Token a
Token ID257
Feature activation+3.36e-01

Pos logit contributions
 disob+0.019
 kids+0.019
ily+0.018
ying+0.018
asure+0.017
Neg logit contributions
elope-0.034
oc-0.034
tail-0.031
tails-0.029
 den-0.029
Token big
Token ID1263
Feature activation-1.64e+00

Pos logit contributions
 disob+0.012
 kids+0.012
ily+0.010
 children+0.010
 truth+0.010
Neg logit contributions
elope-0.037
oc-0.036
tail-0.034
tails-0.032
ns-0.032
Token dog
Token ID3290
Feature activation-3.68e+00

Pos logit contributions
elope+0.143
oc+0.140
tail+0.130
tails+0.119
 den+0.118
Neg logit contributions
 disob-0.093
 kids-0.091
ily-0.085
ying-0.084
asure-0.082
Token came
Token ID1625
Feature activation-1.06e+00

Pos logit contributions
elope+0.389
 lifetime+0.353
tail+0.345
 harmony+0.343
 tal+0.342
Neg logit contributions
 disob-0.167
 kids-0.154
ily-0.133
 children-0.121
 officer-0.119
Token and
Token ID290
Feature activation+1.09e-01

Pos logit contributions
elope+0.098
oc+0.095
tail+0.090
ns+0.083
 lifetime+0.082
Neg logit contributions
 disob-0.056
 kids-0.054
ily-0.050
ying-0.049
asure-0.049
Token took
Token ID1718
Feature activation+2.96e+00

Pos logit contributions
 disob+0.005
 kids+0.005
ily+0.004
 ignored+0.004
 children+0.004
Neg logit contributions
elope-0.011
oc-0.011
tail-0.010
 den-0.010
 pasture-0.009
Token so
Token ID523
Feature activation+7.14e-01

Pos logit contributions
oc+0.090
elope+0.088
tail+0.082
icol+0.076
tails+0.075
Neg logit contributions
 disob-0.045
 kids-0.043
ily-0.040
 children-0.036
 right-0.036
Token happy
Token ID3772
Feature activation-2.17e-01

Pos logit contributions
 disob+0.044
 kids+0.043
ily+0.041
ying+0.041
asure+0.040
Neg logit contributions
elope-0.059
oc-0.057
tail-0.053
tails-0.048
 den-0.048
Token she
Token ID673
Feature activation-1.08e+00

Pos logit contributions
elope+0.022
oc+0.022
tail+0.020
ns+0.019
tails+0.019
Neg logit contributions
 kids-0.010
 disob-0.009
ily-0.009
ying-0.008
 children-0.008
Token could
Token ID714
Feature activation-2.28e+00

Pos logit contributions
elope+0.098
oc+0.096
tail+0.090
 den+0.083
 pasture+0.082
Neg logit contributions
 disob-0.060
 kids-0.057
ily-0.053
asure-0.050
ying-0.050
Token go
Token ID467
Feature activation+4.84e-01

Pos logit contributions
oc+0.205
elope+0.203
tail+0.179
ns+0.171
 den+0.169
Neg logit contributions
 disob-0.133
 kids-0.123
ily-0.119
ying-0.109
asure-0.108
Token to
Token ID284
Feature activation+4.06e-01

Pos logit contributions
 kids+0.022
 disob+0.022
 right+0.021
ily+0.020
ying+0.019
Neg logit contributions
elope-0.050
oc-0.049
tail-0.043
ns-0.042
light-0.042
Token school
Token ID1524
Feature activation+1.07e+00

Pos logit contributions
 kids+0.016
 disob+0.015
 children+0.013
 right+0.013
ily+0.012
Neg logit contributions
oc-0.046
elope-0.045
ns-0.040
oing-0.040
tail-0.040
Token after
Token ID706
Feature activation-4.12e+00

Pos logit contributions
 disob+0.052
 kids+0.051
ily+0.047
ying+0.046
asure+0.044
Neg logit contributions
elope-0.102
oc-0.100
tail-0.093
tails-0.087
 den-0.086
Token all
Token ID477
Feature activation+4.58e+00

Pos logit contributions
elope+0.375
oc+0.369
tail+0.344
tails+0.315
 den+0.315
Neg logit contributions
 disob-0.218
 kids-0.213
ily-0.201
ying-0.197
asure-0.192
Token.
Token ID13
Feature activation+1.35e-01

Pos logit contributions
 disob+0.207
 kids+0.205
ily+0.187
ying+0.186
 children+0.175
Neg logit contributions
elope-0.454
oc-0.449
tail-0.420
ns-0.391
tails-0.389
Token She
Token ID1375
Feature activation+3.02e-01

Pos logit contributions
 kids+0.006
 disob+0.006
ily+0.006
 children+0.005
asure+0.005
Neg logit contributions
oc-0.014
elope-0.013
tail-0.012
ns-0.012
 den-0.012
Token to
Token ID284
Feature activation+2.90e-01

Pos logit contributions
 disob+0.142
 kids+0.134
ily+0.133
ying+0.132
asure+0.128
Neg logit contributions
elope-0.241
oc-0.240
tail-0.218
ns-0.205
tails-0.205
Token help
Token ID1037
Feature activation-1.50e-01

Pos logit contributions
ying+0.016
 disob+0.015
ily+0.014
asure+0.014
 kids+0.014
Neg logit contributions
elope-0.025
oc-0.024
tail-0.023
 pasture-0.021
ns-0.021
Token her
Token ID607
Feature activation-6.99e-01

Pos logit contributions
elope+0.013
oc+0.013
tail+0.012
ns+0.011
tails+0.011
Neg logit contributions
 disob-0.008
asure-0.008
ily-0.007
ying-0.007
ateful-0.007
Token.
Token ID13
Feature activation-3.83e-02

Pos logit contributions
elope+0.059
oc+0.058
tail+0.056
tails+0.050
 den+0.049
Neg logit contributions
 disob-0.038
ily-0.038
 kids-0.037
ateful-0.037
ying-0.037
Token She
Token ID1375
Feature activation+1.34e-01

Pos logit contributions
elope+0.003
oc+0.003
tail+0.003
 den+0.003
tails+0.003
Neg logit contributions
 disob-0.002
 kids-0.002
ily-0.002
ying-0.002
asure-0.002
Token wants
Token ID3382
Feature activation-1.92e-01

Pos logit contributions
 disob+0.007
 kids+0.007
ying+0.006
ily+0.006
asure+0.006
Neg logit contributions
elope-0.012
oc-0.012
tail-0.011
 den-0.010
ns-0.010
Token her
Token ID607
Feature activation-7.14e-01

Pos logit contributions
elope+0.016
oc+0.015
tail+0.015
tails+0.013
 Louise+0.013
Neg logit contributions
 disob-0.011
ying-0.011
 kids-0.010
ily-0.010
asure-0.010
Token er
Token ID1931
Feature activation+3.19e+00

Pos logit contributions
elope+0.063
oc+0.061
tail+0.059
tails+0.054
 den+0.054
Neg logit contributions
 disob-0.038
ying-0.036
ily-0.036
 kids-0.035
asure-0.034
Tokenaser
Token ID6005
Feature activation+4.69e+00

Pos logit contributions
 disob+0.106
 kids+0.098
ying+0.086
ily+0.083
asure+0.083
Neg logit contributions
oc-0.346
elope-0.340
tail-0.332
tails-0.308
ns-0.301
Token back
Token ID736
Feature activation-4.66e-01

Pos logit contributions
 disob+0.263
 kids+0.250
ying+0.245
ateful+0.241
asure+0.236
Neg logit contributions
elope-0.394
oc-0.388
tail-0.379
tails-0.339
 den-0.334
Token.
Token ID13
Feature activation-2.98e-02

Pos logit contributions
elope+0.044
oc+0.043
tail+0.041
tails+0.037
 den+0.037
Neg logit contributions
 disob-0.023
 kids-0.022
ily-0.021
ying-0.021
asure-0.020
Token either
Token ID2035
Feature activation+2.73e+00

Pos logit contributions
 disob+0.118
 kids+0.109
asure+0.106
ily+0.105
ying+0.102
Neg logit contributions
elope-0.201
oc-0.199
tail-0.186
tails-0.169
 den-0.167
Token.
Token ID13
Feature activation+9.25e-02

Pos logit contributions
 disob+0.133
 kids+0.127
ily+0.119
ying+0.118
asure+0.116
Neg logit contributions
elope-0.260
oc-0.256
tail-0.240
tails-0.220
 den-0.219
Token
Token ID198
Feature activation-3.00e-01

Pos logit contributions
 disob+0.006
 kids+0.006
ily+0.005
asure+0.005
ying+0.005
Neg logit contributions
elope-0.007
oc-0.007
tail-0.007
tails-0.006
ns-0.006
Token
Token ID198
Feature activation-2.91e-01

Pos logit contributions
elope+0.025
oc+0.025
tail+0.024
 den+0.022
tails+0.022
Neg logit contributions
 disob-0.017
 kids-0.017
ily-0.017
ying-0.016
asure-0.015
TokenOne
Token ID3198
Feature activation-7.14e-01

Pos logit contributions
elope+0.023
oc+0.022
tail+0.021
tails+0.019
 den+0.019
Neg logit contributions
 disob-0.019
 kids-0.018
ily-0.018
ying-0.017
asure-0.017
Token day
Token ID1110
Feature activation-3.68e-01

Pos logit contributions
elope+0.058
oc+0.057
tail+0.052
 den+0.047
tails+0.047
Neg logit contributions
 disob-0.045
 kids-0.044
ily-0.042
ying-0.042
asure-0.041
Token,
Token ID11
Feature activation+3.14e-01

Pos logit contributions
elope+0.034
oc+0.033
tail+0.031
tails+0.029
 den+0.029
Neg logit contributions
 disob-0.019
 kids-0.018
ily-0.017
ying-0.017
asure-0.017
Token Tim
Token ID5045
Feature activation+1.45e+00

Pos logit contributions
 disob+0.015
 kids+0.015
ily+0.014
ying+0.013
asure+0.013
Neg logit contributions
elope-0.030
oc-0.030
tail-0.028
tails-0.026
 den-0.026
Tokenmy
Token ID1820
Feature activation+1.95e-01

Pos logit contributions
 disob+0.078
ying+0.078
 penn+0.075
 kids+0.075
ateful+0.075
Neg logit contributions
elope-0.127
oc-0.124
tail-0.112
 den-0.107
tails-0.106
Token was
Token ID373
Feature activation-9.34e-01

Pos logit contributions
 disob+0.012
ying+0.011
 kids+0.011
ily+0.011
asure+0.011
Neg logit contributions
elope-0.016
oc-0.016
tail-0.014
 den-0.013
tails-0.013
Token playing
Token ID2712
Feature activation-1.99e+00

Pos logit contributions
elope+0.068
oc+0.065
tail+0.063
 den+0.056
tails+0.056
Neg logit contributions
 disob-0.061
 kids-0.061
ying-0.061
ily-0.060
ateful-0.059
Token was
Token ID373
Feature activation-8.86e-01

Pos logit contributions
 disob+0.262
 kids+0.253
ily+0.240
ying+0.238
asure+0.233
Neg logit contributions
elope-0.422
oc-0.414
tail-0.385
tails-0.352
 den-0.351
Token a
Token ID257
Feature activation+3.26e-01

Pos logit contributions
elope+0.091
oc+0.090
tail+0.085
tails+0.079
 den+0.079
Neg logit contributions
 kids-0.036
 disob-0.035
ily-0.034
asure-0.030
ying-0.030
Token little
Token ID1310
Feature activation+3.92e+00

Pos logit contributions
 kids+0.008
 disob+0.007
ily+0.006
 children+0.006
 iron+0.005
Neg logit contributions
elope-0.039
oc-0.038
tail-0.037
tails-0.035
 tal-0.035
Token girl
Token ID2576
Feature activation+1.65e+00

Pos logit contributions
 disob+0.141
 kids+0.139
ily+0.125
ying+0.120
asure+0.115
Neg logit contributions
elope-0.421
oc-0.413
tail-0.392
tails-0.365
 den-0.362
Token named
Token ID3706
Feature activation-5.97e+00

Pos logit contributions
 disob+0.109
 kids+0.109
ily+0.108
 children+0.097
 iron+0.096
Neg logit contributions
elope-0.123
oc-0.118
 lifetime-0.113
tail-0.112
 tal-0.109
Token Lily
Token ID20037
Feature activation+1.59e+00

Pos logit contributions
elope+0.588
oc+0.550
tail+0.541
 Louise+0.504
Bet+0.500
Neg logit contributions
 disob-0.272
ying-0.253
 kids-0.242
ateful-0.238
asure-0.230
Token.
Token ID13
Feature activation+1.56e-01

Pos logit contributions
 disob+0.062
 kids+0.061
ily+0.057
ying+0.051
asure+0.051
Neg logit contributions
elope-0.164
oc-0.160
tail-0.154
 den-0.144
ns-0.144
Token She
Token ID1375
Feature activation+3.25e-01

Pos logit contributions
 disob+0.010
 kids+0.009
ying+0.009
asure+0.009
ateful+0.008
Neg logit contributions
elope-0.013
oc-0.013
tail-0.012
 den-0.011
 Louise-0.011
Token loved
Token ID6151
Feature activation-3.10e-01

Pos logit contributions
 disob+0.019
 kids+0.018
ily+0.017
ying+0.017
asure+0.017
Neg logit contributions
elope-0.028
oc-0.028
tail-0.026
tails-0.023
 den-0.023
Token to
Token ID284
Feature activation+4.33e-01

Pos logit contributions
elope+0.028
oc+0.027
tail+0.026
tails+0.023
 den+0.023
Neg logit contributions
 disob-0.017
 kids-0.016
ily-0.015
ying-0.015
asure-0.015
Token read
Token ID1100
Feature activation-3.12e+00

Pos logit contributions
 disob+0.028
ying+0.028
 kids+0.027
ily+0.027
ateful+0.026
Neg logit contributions
elope-0.033
oc-0.032
tail-0.031
tails-0.027
 den-0.027
Token man
Token ID582
Feature activation+1.88e+00

Pos logit contributions
 disob+0.013
 kids+0.013
ily+0.012
 children+0.012
ying+0.012
Neg logit contributions
elope-0.014
oc-0.014
tail-0.012
tails-0.011
Bet-0.011
Token ran
Token ID4966
Feature activation-3.97e-01

Pos logit contributions
 disob+0.096
 kids+0.094
ily+0.089
ying+0.089
asure+0.087
Neg logit contributions
elope-0.173
oc-0.171
tail-0.158
ns-0.145
tails-0.145
Token away
Token ID1497
Feature activation+1.66e-01

Pos logit contributions
elope+0.035
oc+0.034
tail+0.032
tails+0.029
 den+0.029
Neg logit contributions
 disob-0.022
 kids-0.021
ily-0.020
ying-0.020
asure-0.020
Token into
Token ID656
Feature activation-1.03e+00

Pos logit contributions
 disob+0.008
 kids+0.007
ily+0.007
ying+0.007
asure+0.007
Neg logit contributions
elope-0.016
oc-0.016
tail-0.015
tails-0.014
 den-0.014
Token the
Token ID262
Feature activation+2.09e-01

Pos logit contributions
elope+0.104
oc+0.102
tail+0.097
tails+0.089
ns+0.089
Neg logit contributions
 kids-0.044
 disob-0.044
ily-0.039
 children-0.037
ying-0.036
Token trees
Token ID7150
Feature activation-4.13e+00

Pos logit contributions
 kids+0.013
 children+0.012
 disob+0.012
 right+0.012
 iron+0.011
Neg logit contributions
elope-0.017
oc-0.017
oing-0.015
tail-0.015
Bet-0.015
Token.
Token ID13
Feature activation+1.19e-01

Pos logit contributions
elope+0.411
oc+0.403
tail+0.378
tails+0.350
 den+0.349
Neg logit contributions
 disob-0.179
 kids-0.173
ily-0.159
ying-0.157
asure-0.152
Token
Token ID198
Feature activation-2.58e-01

Pos logit contributions
 kids+0.007
 disob+0.007
ily+0.007
asure+0.006
ying+0.006
Neg logit contributions
elope-0.010
oc-0.010
tail-0.009
tails-0.009
 den-0.008
Token
Token ID198
Feature activation-2.60e-01

Pos logit contributions
elope+0.023
oc+0.022
tail+0.022
tails+0.020
 den+0.020
Neg logit contributions
 disob-0.014
 kids-0.013
ily-0.013
ying-0.012
 truth-0.012
TokenThe
Token ID464
Feature activation-2.27e-01

Pos logit contributions
elope+0.022
oc+0.021
tail+0.020
 den+0.018
tails+0.018
Neg logit contributions
 disob-0.016
 kids-0.015
ily-0.015
ying-0.014
asure-0.014
Token z
Token ID1976
Feature activation+9.13e-02

Pos logit contributions
elope+0.020
oc+0.020
tail+0.019
 den+0.017
ns+0.017
Neg logit contributions
 disob-0.012
ying-0.012
asure-0.012
ily-0.011
 kids-0.011
Token got
Token ID1392
Feature activation+7.28e-01

Pos logit contributions
oc+0.063
elope+0.062
tail+0.055
Bet+0.052
tails+0.051
Neg logit contributions
ying-0.053
 kids-0.052
ily-0.052
ateful-0.052
 disob-0.052
Token there
Token ID612
Feature activation+4.78e+00

Pos logit contributions
 disob+0.029
 kids+0.028
ily+0.026
ying+0.025
asure+0.025
Neg logit contributions
elope-0.076
oc-0.074
tail-0.071
tails-0.066
 den-0.065
Token,
Token ID11
Feature activation+3.70e-01

Pos logit contributions
 disob+0.267
 kids+0.248
ily+0.242
asure+0.241
ying+0.237
Neg logit contributions
elope-0.422
oc-0.401
tail-0.383
tails-0.350
 Louise-0.349
Token the
Token ID262
Feature activation+2.48e-01

Pos logit contributions
 disob+0.019
 kids+0.018
ily+0.018
ying+0.017
asure+0.017
Neg logit contributions
elope-0.034
oc-0.033
tail-0.031
tails-0.029
 den-0.029
Token girl
Token ID2576
Feature activation+1.67e+00

Pos logit contributions
ying+0.011
 disob+0.010
asure+0.010
ily+0.009
ateful+0.009
Neg logit contributions
elope-0.025
oc-0.024
tail-0.023
 herd-0.022
 den-0.022
Token saw
Token ID2497
Feature activation-3.57e+00

Pos logit contributions
 disob+0.091
 kids+0.089
ily+0.083
ying+0.081
asure+0.079
Neg logit contributions
elope-0.148
oc-0.145
tail-0.137
tails-0.124
 den-0.124
Token so
Token ID523
Feature activation+7.47e-01

Pos logit contributions
elope+0.340
oc+0.332
tail+0.312
tails+0.290
 den+0.286
Neg logit contributions
 disob-0.174
 kids-0.163
ying-0.156
ily-0.154
asure-0.152
Token many
Token ID867
Feature activation+7.70e-01

Pos logit contributions
 disob+0.048
asure+0.046
 kids+0.045
ying+0.044
ily+0.043
Neg logit contributions
elope-0.059
oc-0.057
tail-0.052
tails-0.047
Bet-0.047
Token zo
Token ID40565
Feature activation-1.61e-01

Pos logit contributions
 disob+0.038
 kids+0.037
ily+0.035
ying+0.034
asure+0.033
Neg logit contributions
elope-0.073
oc-0.071
tail-0.067
tails-0.062
 den-0.062
Tokenoms
Token ID3150
Feature activation+3.84e+00

Pos logit contributions
oc+0.013
elope+0.013
tail+0.012
ns+0.011
tails+0.011
Neg logit contributions
 disob-0.009
 kids-0.009
asure-0.009
 truth-0.009
 iron-0.008
Token and
Token ID290
Feature activation+1.08e-01

Pos logit contributions
 disob+0.200
 kids+0.192
ily+0.184
ying+0.180
asure+0.178
Neg logit contributions
elope-0.348
oc-0.339
tail-0.321
tails-0.296
 den-0.292
Token do
Token ID466
Feature activation+1.32e+00

Pos logit contributions
ying+0.018
ateful+0.018
 disob+0.017
ily+0.017
 penn+0.017
Neg logit contributions
oc-0.030
elope-0.029
tail-0.026
 Luna-0.026
 Louise-0.025
Token you
Token ID345
Feature activation+6.39e-01

Pos logit contributions
 disob+0.053
 kids+0.051
ily+0.045
ying+0.044
asure+0.043
Neg logit contributions
elope-0.140
oc-0.138
tail-0.129
 den-0.122
 pasture-0.121
Token want
Token ID765
Feature activation+1.18e+00

Pos logit contributions
ying+0.043
ateful+0.041
 kids+0.040
ter+0.039
asure+0.039
Neg logit contributions
oc-0.046
tail-0.044
elope-0.039
ns-0.036
tails-0.036
Token to
Token ID284
Feature activation+4.28e-01

Pos logit contributions
ying+0.068
 disob+0.067
ily+0.066
 kids+0.065
ateful+0.063
Neg logit contributions
elope-0.093
oc-0.092
tail-0.088
 den-0.081
tails-0.080
Token come
Token ID1282
Feature activation-5.25e-01

Pos logit contributions
ying+0.029
ateful+0.027
ily+0.027
outed+0.026
 kids+0.025
Neg logit contributions
tail-0.032
oc-0.031
elope-0.031
tails-0.029
 den-0.028
Token on
Token ID319
Feature activation-1.87e+00

Pos logit contributions
elope+0.052
oc+0.050
tail+0.047
ns+0.044
tails+0.043
Neg logit contributions
 kids-0.024
 disob-0.024
ily-0.022
asure-0.021
ying-0.021
Token our
Token ID674
Feature activation-2.47e+00

Pos logit contributions
elope+0.171
oc+0.170
tail+0.161
 den+0.149
 pasture+0.147
Neg logit contributions
 disob-0.093
ying-0.088
 kids-0.088
ily-0.088
asure-0.083
Token raft
Token ID31812
Feature activation+1.76e+00

Pos logit contributions
elope+0.172
oc+0.167
tail+0.152
 den+0.145
 pasture+0.140
Neg logit contributions
 disob-0.176
ily-0.175
ying-0.170
 kids-0.167
asure-0.166
Token?"
Token ID1701
Feature activation+2.66e+00

Pos logit contributions
 disob+0.062
 kids+0.058
ily+0.056
ying+0.055
ateful+0.050
Neg logit contributions
oc-0.188
elope-0.187
tail-0.178
 den-0.165
tails-0.164
Token The
Token ID383
Feature activation-1.11e+00

Pos logit contributions
ying+0.201
 disob+0.199
ily+0.194
 kids+0.192
asure+0.185
Neg logit contributions
elope-0.181
oc-0.180
tail-0.161
 den-0.139
 pasture-0.139
Token duck
Token ID22045
Feature activation+2.21e+00

Pos logit contributions
elope+0.080
oc+0.078
 herd+0.071
tail+0.071
 den+0.068
Neg logit contributions
ying-0.077
asure-0.073
 disob-0.072
ily-0.071
ateful-0.068
Token not
Token ID407
Feature activation+1.48e+00

Pos logit contributions
 kids+0.080
 disob+0.080
ily+0.066
 right+0.063
 children+0.058
Neg logit contributions
oc-0.395
elope-0.394
tail-0.369
Bet-0.353
 den-0.351
Token know
Token ID760
Feature activation+7.65e-02

Pos logit contributions
ying+0.093
asure+0.088
 penn+0.084
outed+0.081
ateful+0.081
Neg logit contributions
elope-0.117
oc-0.115
tail-0.111
ns-0.100
tails-0.098
Token what
Token ID644
Feature activation-1.14e-01

Pos logit contributions
 kids+0.003
 disob+0.003
ily+0.003
ying+0.003
 children+0.003
Neg logit contributions
elope-0.008
oc-0.008
tail-0.007
tails-0.007
 den-0.007
Token the
Token ID262
Feature activation+2.56e-01

Pos logit contributions
oc+0.012
elope+0.011
tail+0.011
icol+0.010
 pasture+0.010
Neg logit contributions
 disob-0.005
 kids-0.005
ily-0.005
 truth-0.005
 children-0.004
Token tide
Token ID20013
Feature activation+5.27e+00

Pos logit contributions
asure+0.018
ying+0.016
 disob+0.015
ily+0.015
ussed+0.014
Neg logit contributions
elope-0.018
oc-0.016
 den-0.016
 roaring-0.016
 herd-0.015
Token was
Token ID373
Feature activation-8.60e-01

Pos logit contributions
ily+0.309
 disob+0.307
 kids+0.304
ying+0.299
ateful+0.298
Neg logit contributions
elope-0.422
oc-0.417
tail-0.366
ns-0.357
tails-0.340
Token,
Token ID11
Feature activation+3.91e-01

Pos logit contributions
elope+0.068
oc+0.066
tail+0.060
tails+0.058
 roaring+0.057
Neg logit contributions
 disob-0.051
ily-0.050
ateful-0.048
 kids-0.047
ying-0.046
Token but
Token ID475
Feature activation-1.01e-01

Pos logit contributions
 disob+0.018
 kids+0.018
ily+0.016
ying+0.016
asure+0.016
Neg logit contributions
elope-0.038
oc-0.037
tail-0.035
tails-0.033
 den-0.032
Token they
Token ID484
Feature activation-8.25e-01

Pos logit contributions
oc+0.010
elope+0.010
tail+0.009
 pasture+0.009
icol+0.009
Neg logit contributions
 kids-0.005
ily-0.005
 disob-0.005
 children-0.004
 right-0.004
Token wanted
Token ID2227
Feature activation+2.76e+00

Pos logit contributions
elope+0.079
oc+0.076
tail+0.070
ns+0.067
tails+0.066
Neg logit contributions
ying-0.039
 kids-0.037
 disob-0.036
ily-0.034
 dishes-0.034
Token to
Token ID284
Feature activation+4.46e-01

Pos logit contributions
 kids+0.097
 disob+0.092
 truth+0.089
ily+0.083
 children+0.082
Neg logit contributions
oc-0.331
elope-0.316
tail-0.301
icol-0.279
Bet-0.279
Token felt
Token ID2936
Feature activation+1.24e+00

Pos logit contributions
 disob+0.004
 kids+0.004
ily+0.003
ying+0.003
asure+0.003
Neg logit contributions
elope-0.009
oc-0.009
tail-0.008
ns-0.008
 den-0.008
Token proud
Token ID6613
Feature activation-1.51e+00

Pos logit contributions
 disob+0.073
 kids+0.070
ying+0.068
ily+0.067
asure+0.066
Neg logit contributions
elope-0.103
oc-0.102
tail-0.095
tails-0.085
 den-0.085
Token.
Token ID13
Feature activation+1.44e-01

Pos logit contributions
elope+0.141
oc+0.140
tail+0.131
 den+0.121
tails+0.119
Neg logit contributions
 disob-0.074
 kids-0.069
ying-0.066
ily-0.066
asure-0.065
Token
Token ID198
Feature activation-2.27e-01

Pos logit contributions
 disob+0.009
 kids+0.009
ily+0.009
ying+0.008
asure+0.008
Neg logit contributions
elope-0.012
oc-0.011
tail-0.010
 den-0.009
tails-0.009
Token
Token ID198
Feature activation-2.28e-01

Pos logit contributions
elope+0.020
tail+0.020
oc+0.020
tails+0.018
ns+0.018
Neg logit contributions
 kids-0.012
ily-0.012
 disob-0.011
ying-0.011
 truth-0.010
TokenLater
Token ID18602
Feature activation-1.16e+00

Pos logit contributions
oc+0.014
elope+0.014
tail+0.012
Bet+0.011
tails+0.011
Neg logit contributions
 disob-0.019
 kids-0.018
ying-0.018
ily-0.018
asure-0.017
Token that
Token ID326
Feature activation+1.48e+00

Pos logit contributions
elope+0.097
oc+0.093
tail+0.090
 den+0.082
 Louise+0.081
Neg logit contributions
 disob-0.065
asure-0.061
ying-0.060
 kids-0.060
ateful-0.060
Token day
Token ID1110
Feature activation-3.32e-01

Pos logit contributions
 disob+0.081
 kids+0.080
ying+0.073
ily+0.073
 children+0.070
Neg logit contributions
elope-0.131
oc-0.127
tail-0.123
ns-0.114
tails-0.113
Token,
Token ID11
Feature activation+3.73e-01

Pos logit contributions
elope+0.031
oc+0.030
tail+0.028
tails+0.026
 den+0.026
Neg logit contributions
 disob-0.017
 kids-0.016
ily-0.015
ying-0.015
asure-0.015
Token Tim
Token ID5045
Feature activation+1.50e+00

Pos logit contributions
 disob+0.017
 kids+0.016
ily+0.015
ying+0.015
asure+0.014
Neg logit contributions
elope-0.037
oc-0.036
tail-0.034
tails-0.032
 den-0.032
Tokenmy
Token ID1820
Feature activation+2.45e-01

Pos logit contributions
 disob+0.081
ying+0.078
 kids+0.076
asure+0.075
ily+0.075
Neg logit contributions
elope-0.131
oc-0.130
tail-0.117
 den-0.113
tails-0.110
Token and
Token ID290
Feature activation-3.86e-02

Pos logit contributions
elope+0.027
oc+0.027
tail+0.025
ns+0.023
Bet+0.023
Neg logit contributions
 disob-0.014
 kids-0.014
ily-0.013
ying-0.013
asure-0.012
Token stopped
Token ID5025
Feature activation+3.31e+00

Pos logit contributions
elope+0.003
oc+0.003
tail+0.003
ns+0.003
 pasture+0.003
Neg logit contributions
 disob-0.002
 kids-0.002
ily-0.002
 children-0.002
ying-0.002
Token in
Token ID287
Feature activation+2.55e+00

Pos logit contributions
 disob+0.180
 kids+0.176
ily+0.161
ying+0.160
 children+0.155
Neg logit contributions
elope-0.296
oc-0.294
tail-0.272
ns-0.250
 den-0.247
Token front
Token ID2166
Feature activation-5.52e+00

Pos logit contributions
 disob+0.128
 kids+0.125
ily+0.117
ying+0.113
asure+0.111
Neg logit contributions
elope-0.238
oc-0.234
tail-0.219
tails-0.202
ns-0.201
Token of
Token ID286
Feature activation-9.17e-01

Pos logit contributions
elope+0.480
oc+0.469
tail+0.446
 den+0.410
 pasture+0.404
Neg logit contributions
 disob-0.293
asure-0.275
 kids-0.270
ying-0.268
ily-0.263
Token Lisa
Token ID15378
Feature activation-6.97e+00

Pos logit contributions
oc+0.094
elope+0.093
tail+0.087
tails+0.081
 den+0.080
Neg logit contributions
 kids-0.039
 disob-0.038
ily-0.035
 children-0.033
ying-0.032
Token.
Token ID13
Feature activation-5.72e-02

Pos logit contributions
elope+0.676
oc+0.674
tail+0.623
Bet+0.590
 den+0.581
Neg logit contributions
 disob-0.313
 kids-0.307
ily-0.283
ying-0.272
 children-0.265
Token It
Token ID632
Feature activation-1.57e-01

Pos logit contributions
oc+0.005
elope+0.005
tail+0.005
tails+0.004
ns+0.004
Neg logit contributions
 kids-0.003
ily-0.003
 disob-0.003
asure-0.003
 children-0.003
Token made
Token ID925
Feature activation-1.39e+00

Pos logit contributions
oc+0.015
elope+0.015
tail+0.014
 pasture+0.013
tails+0.013
Neg logit contributions
 disob-0.007
 kids-0.007
ily-0.007
 children-0.006
ying-0.006
Token a
Token ID257
Feature activation+1.26e-01

Pos logit contributions
oc+0.124
elope+0.121
tail+0.111
 pasture+0.104
 den+0.103
Neg logit contributions
 kids-0.082
ily-0.078
 disob-0.074
 children-0.073
 right-0.069
Token funny
Token ID8258
Feature activation-2.03e+00

Pos logit contributions
 kids+0.006
 children+0.005
 little+0.005
 right+0.005
 disob+0.005
Neg logit contributions
elope-0.012
tail-0.012
oc-0.012
ns-0.011
June-0.011
Token and
Token ID290
Feature activation+1.18e-01

Pos logit contributions
 disob+0.242
 kids+0.236
ying+0.214
ily+0.210
 right+0.209
Neg logit contributions
elope-0.583
oc-0.579
tail-0.527
Bet-0.518
ns-0.516
Token said
Token ID531
Feature activation+1.04e+00

Pos logit contributions
 disob+0.006
 kids+0.006
ily+0.005
 children+0.005
asure+0.005
Neg logit contributions
oc-0.011
elope-0.011
tail-0.010
ns-0.010
 pasture-0.010
Token,
Token ID11
Feature activation+3.82e-01

Pos logit contributions
 disob+0.060
 kids+0.058
ily+0.054
ying+0.054
ateful+0.053
Neg logit contributions
elope-0.089
oc-0.086
tail-0.083
tails-0.075
 Louise-0.075
Token "
Token ID366
Feature activation+3.24e-01

Pos logit contributions
 disob+0.022
 kids+0.021
ily+0.020
ying+0.020
asure+0.020
Neg logit contributions
elope-0.033
oc-0.032
tail-0.030
tails-0.027
 den-0.027
TokenBy
Token ID3886
Feature activation-1.12e+00

Pos logit contributions
 disob+0.016
 kids+0.016
ily+0.016
asure+0.015
 children+0.014
Neg logit contributions
elope-0.029
oc-0.029
tail-0.027
 den-0.026
tails-0.026
Tokene
Token ID68
Feature activation-8.61e+00

Pos logit contributions
elope+0.051
oc+0.050
tail+0.043
tails+0.035
 den+0.035
Neg logit contributions
 disob-0.109
 kids-0.108
ily-0.104
ying-0.103
asure-0.102
Token bye
Token ID33847
Feature activation-5.47e-01

Pos logit contributions
elope+0.753
oc+0.732
tail+0.680
ns+0.639
 den+0.630
Neg logit contributions
 kids-0.479
 disob-0.478
ily-0.442
asure-0.425
ying-0.421
Token!
Token ID0
Feature activation-5.14e+00

Pos logit contributions
elope+0.049
oc+0.048
tail+0.045
ns+0.042
June+0.041
Neg logit contributions
 kids-0.030
 disob-0.029
ily-0.026
asure-0.026
ying-0.026
Token I
Token ID314
Feature activation+4.38e-01

Pos logit contributions
elope+0.427
oc+0.418
tail+0.389
tails+0.356
 den+0.356
Neg logit contributions
 disob-0.308
 kids-0.301
ily-0.288
ying-0.280
asure-0.279
Token had
Token ID550
Feature activation-1.03e+00

Pos logit contributions
 disob+0.025
 kids+0.024
ily+0.023
ying+0.022
asure+0.022
Neg logit contributions
elope-0.038
oc-0.037
tail-0.035
tails-0.032
 den-0.032
Token a
Token ID257
Feature activation+3.32e-01

Pos logit contributions
oc+0.113
elope+0.111
tail+0.102
oing+0.096
 pasture+0.096
Neg logit contributions
 kids-0.041
 disob-0.040
ily-0.036
 children-0.034
 right-0.034
Token Rex
Token ID17853
Feature activation-8.55e+00

Pos logit contributions
 kids+0.011
 disob+0.010
ily+0.009
 children+0.009
 right+0.008
Neg logit contributions
oc-0.031
elope-0.030
tail-0.027
 lifetime-0.027
oing-0.027
Token.
Token ID13
Feature activation+3.28e-02

Pos logit contributions
elope+0.847
oc+0.826
tail+0.773
Bet+0.729
June+0.723
Neg logit contributions
 disob-0.372
 kids-0.363
ily-0.329
ying-0.325
asure-0.318
Token Rex
Token ID17853
Feature activation-8.54e+00

Pos logit contributions
 kids+0.001
ily+0.001
 disob+0.001
 ma+0.001
 truth+0.001
Neg logit contributions
elope-0.003
oc-0.003
 lifetime-0.003
tail-0.003
 harmony-0.003
Token smelled
Token ID37609
Feature activation-4.70e+00

Pos logit contributions
elope+0.785
oc+0.760
tail+0.728
 lifetime+0.684
 den+0.677
Neg logit contributions
 disob-0.443
 kids-0.425
ily-0.391
 children-0.365
asure-0.363
Token the
Token ID262
Feature activation+1.38e-01

Pos logit contributions
elope+0.460
oc+0.450
tail+0.425
tails+0.393
 den+0.390
Neg logit contributions
 disob-0.219
 kids-0.215
ily-0.200
ying-0.195
asure-0.185
Token bisc
Token ID36228
Feature activation-7.38e+00

Pos logit contributions
 kids+0.009
 disob+0.009
ily+0.008
 children+0.008
 truth+0.008
Neg logit contributions
elope-0.011
oc-0.011
tail-0.010
June-0.009
 Louise-0.009
Tokenuit
Token ID5013
Feature activation+1.15e-01

Pos logit contributions
elope+0.636
oc+0.606
tail+0.563
 harmony+0.523
 lifetime+0.518
Neg logit contributions
 kids-0.424
ily-0.417
 disob-0.413
asure-0.395
ying-0.392
Token and
Token ID290
Feature activation+1.87e-02

Pos logit contributions
 disob+0.006
 kids+0.005
ily+0.005
ying+0.005
asure+0.005
Neg logit contributions
elope-0.011
oc-0.011
tail-0.010
tails-0.009
 den-0.009
Token ran
Token ID4966
Feature activation-4.34e-01

Pos logit contributions
 kids+0.001
 disob+0.001
 children+0.001
ily+0.001
 en+0.001
Neg logit contributions
elope-0.002
oc-0.002
tail-0.002
 lifetime-0.002
June-0.002
Token to
Token ID284
Feature activation+3.49e-01

Pos logit contributions
elope+0.042
oc+0.040
tail+0.038
tails+0.035
 lifetime+0.035
Neg logit contributions
 disob-0.022
 kids-0.021
ily-0.020
ying-0.019
 right-0.018
Token get
Token ID651
Feature activation-1.10e-01

Pos logit contributions
 kids+0.013
 disob+0.012
ily+0.010
 children+0.010
 right+0.010
Neg logit contributions
oc-0.039
elope-0.039
 lifetime-0.037
oing-0.036
light-0.036
Token.
Token ID13
Feature activation+9.97e-02

Pos logit contributions
 disob+0.438
 kids+0.423
asure+0.391
ily+0.390
ying+0.386
Neg logit contributions
elope-0.636
oc-0.615
tail-0.552
 den-0.528
tails-0.517
Token So
Token ID1406
Feature activation+1.83e+00

Pos logit contributions
 disob+0.007
 kids+0.007
ily+0.006
ying+0.006
asure+0.006
Neg logit contributions
elope-0.008
oc-0.007
tail-0.007
tails-0.006
 den-0.006
Token he
Token ID339
Feature activation+3.80e-02

Pos logit contributions
 disob+0.091
 kids+0.088
ily+0.084
ying+0.082
asure+0.080
Neg logit contributions
elope-0.172
oc-0.168
tail-0.158
tails-0.145
 den-0.145
Token continued
Token ID3767
Feature activation-2.31e+00

Pos logit contributions
 kids+0.002
 disob+0.002
ying+0.002
ily+0.002
ateful+0.002
Neg logit contributions
elope-0.004
oc-0.003
tail-0.003
ns-0.003
tails-0.003
Token to
Token ID284
Feature activation+3.75e-01

Pos logit contributions
elope+0.226
oc+0.223
tail+0.207
ns+0.191
June+0.189
Neg logit contributions
 disob-0.110
 kids-0.103
ying-0.092
 iron-0.092
asure-0.091
Token nib
Token ID33272
Feature activation-6.71e+00

Pos logit contributions
 disob+0.021
 kids+0.020
ily+0.020
ying+0.020
asure+0.019
Neg logit contributions
elope-0.033
oc-0.032
tail-0.030
tails-0.028
 den-0.028
Tokenble
Token ID903
Feature activation+6.05e+00

Pos logit contributions
elope+0.678
oc+0.655
tail+0.633
 Louise+0.593
tails+0.590
Neg logit contributions
 disob-0.287
 kids-0.254
asure-0.220
 truth-0.218
 officer-0.217
Token away
Token ID1497
Feature activation+1.66e-01

Pos logit contributions
 kids+0.280
 disob+0.278
ying+0.263
ily+0.259
 children+0.241
Neg logit contributions
elope-0.587
oc-0.562
tail-0.531
tails-0.497
 den-0.494
Token at
Token ID379
Feature activation+1.03e+00

Pos logit contributions
 disob+0.008
 kids+0.008
ily+0.007
ying+0.007
asure+0.007
Neg logit contributions
elope-0.016
oc-0.015
tail-0.014
 den-0.013
tails-0.013
Token the
Token ID262
Feature activation+2.09e-01

Pos logit contributions
 kids+0.044
 disob+0.044
ily+0.040
ying+0.039
 children+0.037
Neg logit contributions
elope-0.102
oc-0.102
tail-0.096
tails-0.088
Bet-0.088
Token cheese
Token ID9891
Feature activation-6.09e+00

Pos logit contributions
 kids+0.013
 disob+0.013
 children+0.012
ily+0.012
 truth+0.012
Neg logit contributions
elope-0.017
oc-0.017
tail-0.016
tails-0.014
Bet-0.014
Token there
Token ID612
Feature activation+4.71e+00

Pos logit contributions
 disob+0.042
 kids+0.038
ying+0.037
ateful+0.037
ily+0.037
Neg logit contributions
elope-0.059
oc-0.058
tail-0.052
 Louise-0.049
Bet-0.049
Token was
Token ID373
Feature activation-9.18e-01

Pos logit contributions
 disob+0.275
 kids+0.262
ying+0.252
asure+0.250
ily+0.250
Neg logit contributions
elope-0.405
oc-0.397
tail-0.362
 den-0.329
tails-0.329
Token a
Token ID257
Feature activation+2.78e-01

Pos logit contributions
oc+0.099
elope+0.098
tail+0.092
tails+0.088
 den+0.086
Neg logit contributions
 kids-0.034
ily-0.033
 disob-0.031
 children-0.027
asure-0.026
Token pig
Token ID12967
Feature activation-4.76e-01

Pos logit contributions
 kids+0.005
 disob+0.004
 children+0.003
ily+0.003
 boss+0.002
Neg logit contributions
elope-0.035
oc-0.034
tail-0.034
tails-0.033
 tal-0.032
Token named
Token ID3706
Feature activation-6.01e+00

Pos logit contributions
elope+0.033
oc+0.032
tail+0.030
icol+0.027
ns+0.027
Neg logit contributions
 disob-0.034
 kids-0.033
asure-0.032
ily-0.032
 children-0.030
Token Pat
Token ID3208
Feature activation-8.33e+00

Pos logit contributions
elope+0.524
oc+0.514
tail+0.479
tails+0.437
 den+0.436
Neg logit contributions
 disob-0.342
 kids-0.332
ily-0.315
ying-0.312
asure-0.304
Token.
Token ID13
Feature activation+1.20e-01

Pos logit contributions
elope+0.786
oc+0.774
tail+0.713
ns+0.663
tails+0.657
Neg logit contributions
 disob-0.415
 kids-0.400
ying-0.376
ily-0.369
asure-0.368
Token Pat
Token ID3208
Feature activation-8.33e+00

Pos logit contributions
 kids+0.008
ily+0.008
 right+0.007
asure+0.007
 disob+0.007
Neg logit contributions
elope-0.009
oc-0.008
tail-0.008
ns-0.008
 lifetime-0.008
Token loved
Token ID6151
Feature activation-3.44e-01

Pos logit contributions
elope+0.632
oc+0.622
tail+0.528
ns+0.514
tails+0.471
Neg logit contributions
ying-0.568
 disob-0.566
 kids-0.564
ateful-0.552
asure-0.544
Token to
Token ID284
Feature activation+3.99e-01

Pos logit contributions
elope+0.033
oc+0.032
tail+0.030
ns+0.027
tails+0.027
Neg logit contributions
 disob-0.017
 kids-0.017
ily-0.015
asure-0.015
 children-0.015
Token eat
Token ID4483
Feature activation-2.82e+00

Pos logit contributions
 disob+0.024
 kids+0.023
ying+0.023
ily+0.022
asure+0.022
Neg logit contributions
elope-0.033
oc-0.032
tail-0.031
tails-0.028
 den-0.027