TOP ACTIVATIONS
MAX = 0.132

the wonderful things the l oll ipop had tasted like .
he chose a purple l oll ipop . He couldn 't
it was the biggest l oll ipop Peter had ever seen
enjoy eating their colorful l oll ipop ! <|endoftext|> Once upon
Peter started eating his l oll ipop . It was difficult
they were eating their l oll ip ops , they saw
to buy an ordinary l oll ipop but she saw a
Comfort ably eating his l oll ipop , John happily went
Can we have a l oll ipop , mom ?" Ben
It was an icy l oll ipop ! Jack reached out
She saw a big l oll ipop and asked her mom
gives them each a l oll ipop . Tom gets a
money to buy her l oll ipop ! She was so
because he lost his l oll ipop . The little girl
not find a rare l oll ipop . Finally , he
a sticker and a l oll ipop . You can choose
she recognized - a l oll ipop ! It was sitting
happened : a modern l oll ipop machine came to their
and she dropped her l oll ipop ! It flew out
, a sour lemon l oll ipop . Jimmy laughed and

MIN ACTIVATIONS
MIN = -0.089

mom . Mom hears Sam crying . She sees
cries . Mom hears the noise and comes to
cry . Mom hears the noise and comes to
too . Mom hears the noise and comes to
!" A man hears them and comes to the
floor . Mom hears the noise . She comes
mess . Mom hears the noise and comes back
who works in the park hears them . She runs to
rooms . Mom hears the noise and comes to
uffy ." Mom hears the noise and comes to
cries . Mom hears the noise . She comes
!" Mom my hears Mia crying . She comes
doll !" Mom hears them and comes to the
. Their mom hears the noise and comes to
Lily !" Mom hears Sara and runs to the
away . Mom hears the noise and comes to
too . Mom hears them . She comes to
happy . Mom hears them . She comes to
cries . Mom hears Ben cry . She comes
cry . Mom hears them and comes to see

INTERVAL 0.077 - 0.132
CONTAINS 0.083%

L izzy went to her bur row and found her skirt
fluffy and lived in a bur row in the forest .
she was proud of what she had done ! <|endoftext|> Once
a police car with a s iren and a light .
soon became Joy 's most tre asured toy , and she

INTERVAL 0.022 - 0.077
CONTAINS 14.372%

hello to his neighbours , w inking and waving to them
. B enny looked around and saw his friend
faster than yours !" " No , it 's
John and his mom were walking through the park . John
a violin . Finally , her mom said yes , and

INTERVAL -0.034 - 0.022
CONTAINS 80.318%

fascinated and said , " Mom , you 're so smart
and said to her : " Hello there ! Would you
had been with her . <|endoftext|> Once upon a time ,
you today ? ” Ryan replied ,
a time , there was a little boy named Tim my

INTERVAL -0.089 - -0.034
CONTAINS 5.227%

You should not run away from me ! Come back now
playing together in the garden when they saw something interesting in
the blocks . He reached for them and pulled them down
Fin was happy to learn something new . He told his
a noise . It sounded like a pretty bird singing .
Token the
Token ID262
Feature activation+2.26e-03

Pos logit contributions
belt+0.012
amas+0.011
uster+0.011
immers+0.011
ippers+0.011
Neg logit contributions
od-0.003
 written-0.003
 Leo-0.002
 says-0.002
 how-0.002
Token wonderful
Token ID7932
Feature activation+2.92e-02

Pos logit contributions
belt+0.005
uster+0.005
amas+0.005
ippers+0.005
ider+0.005
Neg logit contributions
 a-0.003
 that-0.003
 how-0.003
 says-0.003
 the-0.003
Token things
Token ID1243
Feature activation-1.17e-02

Pos logit contributions
belt+0.062
uster+0.061
amas+0.059
ivery+0.059
ider+0.057
Neg logit contributions
 that-0.042
 how-0.041
 a-0.040
 says-0.039
 the-0.038
Token the
Token ID262
Feature activation+1.73e-02

Pos logit contributions
od+0.009
 Leo+0.009
 Ma+0.009
 how+0.009
 says+0.009
Neg logit contributions
belt-0.028
amas-0.027
uster-0.027
ippers-0.027
immers-0.027
Token l
Token ID300
Feature activation+4.03e-02

Pos logit contributions
uster+0.035
belt+0.035
amas+0.034
ippers+0.034
ivery+0.034
Neg logit contributions
 a-0.025
 that-0.025
 the-0.023
 how-0.022
's-0.022
Tokenoll
Token ID692
Feature activation+1.32e-01

Pos logit contributions
belt+0.079
amas+0.079
immers+0.077
uster+0.076
ivery+0.074
Neg logit contributions
 that-0.055
od-0.054
 written-0.053
:-0.052
 how-0.051
Tokenipop
Token ID42800
Feature activation+4.71e-02

Pos logit contributions
uster+0.220
ivery+0.218
amas+0.218
immers+0.211
belt+0.202
Neg logit contributions
 that-0.239
 how-0.223
:-0.213
 a-0.212
 says-0.212
Token had
Token ID550
Feature activation+2.90e-02

Pos logit contributions
belt+0.118
amas+0.116
uster+0.116
immers+0.115
ippers+0.115
Neg logit contributions
 that-0.038
 how-0.037
 a-0.035
 written-0.035
 Leo-0.034
Token tasted
Token ID29187
Feature activation-3.11e-02

Pos logit contributions
belt+0.068
amas+0.066
uster+0.065
ippers+0.065
immers+0.064
Neg logit contributions
 how-0.027
 that-0.027
 Leo-0.026
 says-0.026
od-0.026
Token like
Token ID588
Feature activation-2.06e-02

Pos logit contributions
 how+0.023
 Leo+0.023
 says+0.023
od+0.023
 that+0.023
Neg logit contributions
belt-0.077
amas-0.076
uster-0.074
immers-0.074
ippers-0.074
Token.
Token ID13
Feature activation+2.19e-03

Pos logit contributions
 how+0.015
 that+0.014
od+0.014
 says+0.014
 written+0.014
Neg logit contributions
belt-0.052
amas-0.051
uster-0.050
ippers-0.050
immers-0.050
Token he
Token ID339
Feature activation+1.85e-02

Pos logit contributions
belt+0.019
amas+0.019
uster+0.018
immers+0.018
ippers+0.018
Neg logit contributions
 that-0.006
 a-0.006
 how-0.006
 Leo-0.005
 Mark-0.005
Token chose
Token ID7690
Feature activation-8.13e-03

Pos logit contributions
belt+0.043
amas+0.042
uster+0.042
ippers+0.041
immers+0.041
Neg logit contributions
 that-0.019
 how-0.019
 a-0.019
 written-0.019
 says-0.019
Token a
Token ID257
Feature activation-5.77e-03

Pos logit contributions
 how+0.004
 that+0.004
od+0.003
 written+0.003
 Leo+0.003
Neg logit contributions
belt-0.023
amas-0.023
uster-0.022
ippers-0.022
immers-0.022
Token purple
Token ID14032
Feature activation+4.44e-02

Pos logit contributions
 how+0.007
 that+0.007
 a+0.007
 written+0.007
 Leo+0.007
Neg logit contributions
belt-0.012
amas-0.012
uster-0.012
ippers-0.011
immers-0.011
Token l
Token ID300
Feature activation+4.30e-02

Pos logit contributions
immers+0.090
amas+0.090
belt+0.090
uster+0.089
ivery+0.088
Neg logit contributions
 that-0.059
 a-0.058
 how-0.056
 written-0.053
:-0.053
Tokenoll
Token ID692
Feature activation+1.31e-01

Pos logit contributions
belt+0.074
amas+0.072
uster+0.070
ippers+0.070
immers+0.069
Neg logit contributions
 how-0.068
 that-0.068
 a-0.066
 written-0.066
 Leo-0.066
Tokenipop
Token ID42800
Feature activation+3.47e-02

Pos logit contributions
ivery+0.227
amas+0.227
immers+0.218
uster+0.212
ital+0.206
Neg logit contributions
 that-0.228
 how-0.215
 a-0.209
:-0.208
 written-0.201
Token.
Token ID13
Feature activation+5.96e-03

Pos logit contributions
belt+0.085
amas+0.082
uster+0.080
ippers+0.080
ider+0.080
Neg logit contributions
 how-0.029
od-0.028
 Leo-0.028
 says-0.028
 Ma-0.028
Token He
Token ID679
Feature activation+1.92e-02

Pos logit contributions
immers+0.013
belt+0.013
ivery+0.013
ippers+0.012
igs+0.012
Neg logit contributions
 how-0.007
 a-0.007
 Mark-0.007
 Leo-0.007
 that-0.007
Token couldn
Token ID3521
Feature activation+8.25e-03

Pos logit contributions
belt+0.045
amas+0.044
uster+0.043
ippers+0.043
immers+0.043
Neg logit contributions
 how-0.018
 that-0.018
 a-0.017
 written-0.017
od-0.017
Token't
Token ID470
Feature activation-1.83e-02

Pos logit contributions
belt+0.018
uster+0.017
immers+0.017
amas+0.017
ippers+0.017
Neg logit contributions
 that-0.010
 a-0.010
 how-0.010
's-0.009
 Leo-0.009
Token it
Token ID340
Feature activation-6.68e-04

Pos logit contributions
 how+0.007
 that+0.007
od+0.006
 written+0.006
 Leo+0.006
Neg logit contributions
belt-0.021
amas-0.020
uster-0.020
ippers-0.020
immers-0.020
Token was
Token ID373
Feature activation-2.24e-02

Pos logit contributions
 how+0.001
 Leo+0.001
od+0.001
 that+0.001
 Ma+0.001
Neg logit contributions
belt-0.002
amas-0.002
immers-0.001
uster-0.001
ippers-0.001
Token the
Token ID262
Feature activation-2.88e-03

Pos logit contributions
 how+0.015
 that+0.015
od+0.015
 says+0.015
 Leo+0.015
Neg logit contributions
belt-0.057
amas-0.056
uster-0.055
ippers-0.054
immers-0.054
Token biggest
Token ID4094
Feature activation+1.13e-02

Pos logit contributions
 how+0.004
od+0.004
 that+0.004
 Leo+0.004
 says+0.004
Neg logit contributions
belt-0.006
amas-0.006
ippers-0.005
uster-0.005
immers-0.005
Token l
Token ID300
Feature activation+4.11e-02

Pos logit contributions
belt+0.025
amas+0.025
uster+0.025
immers+0.024
ippers+0.024
Neg logit contributions
 that-0.012
 how-0.012
 a-0.012
 written-0.012
 says-0.012
Tokenoll
Token ID692
Feature activation+1.30e-01

Pos logit contributions
amas+0.077
belt+0.076
immers+0.075
uster+0.074
ivery+0.074
Neg logit contributions
 that-0.059
od-0.057
 written-0.057
 a-0.056
 how-0.055
Tokenipop
Token ID42800
Feature activation+3.81e-02

Pos logit contributions
ivery+0.240
amas+0.236
uster+0.232
immers+0.228
+0.216
Neg logit contributions
 that-0.214
 how-0.199
 a-0.194
:-0.191
 says-0.189
Token Peter
Token ID5613
Feature activation+1.91e-02

Pos logit contributions
belt+0.090
amas+0.086
ippers+0.084
uster+0.084
immers+0.084
Neg logit contributions
 how-0.035
 says-0.034
od-0.033
 Leo-0.033
 Ma-0.033
Token had
Token ID550
Feature activation+2.13e-02

Pos logit contributions
belt+0.044
amas+0.043
uster+0.043
ippers+0.042
immers+0.042
Neg logit contributions
 that-0.019
 how-0.019
 a-0.018
 written-0.018
 says-0.018
Token ever
Token ID1683
Feature activation+1.57e-02

Pos logit contributions
belt+0.050
amas+0.048
ippers+0.047
immers+0.047
uster+0.047
Neg logit contributions
 how-0.019
 Leo-0.019
 says-0.019
 that-0.019
od-0.018
Token seen
Token ID1775
Feature activation-1.59e-02

Pos logit contributions
belt+0.031
amas+0.030
immers+0.030
ippers+0.030
uster+0.029
Neg logit contributions
 Leo-0.019
 Mark-0.019
 that-0.019
 how-0.018
 Ma-0.018
Token enjoy
Token ID2883
Feature activation+5.00e-03

Pos logit contributions
 written+0.011
 says+0.010
od+0.010
 that+0.010
 how+0.010
Neg logit contributions
belt-0.044
uster-0.043
amas-0.043
ippers-0.043
immers-0.043
Token eating
Token ID6600
Feature activation+4.17e-03

Pos logit contributions
belt+0.015
amas+0.015
uster+0.015
immers+0.015
ippers+0.015
Neg logit contributions
 how-0.002
 that-0.002
 a-0.002
 written-0.001
 Leo-0.001
Token their
Token ID511
Feature activation+3.67e-02

Pos logit contributions
belt+0.010
amas+0.010
uster+0.010
ippers+0.010
immers+0.010
Neg logit contributions
 how-0.004
 that-0.003
 a-0.003
 written-0.003
 Leo-0.003
Token colorful
Token ID20239
Feature activation+5.06e-02

Pos logit contributions
uster+0.085
belt+0.085
amas+0.081
ivery+0.080
immers+0.079
Neg logit contributions
 a-0.041
 that-0.041
 how-0.040
's-0.039
 says-0.038
Token l
Token ID300
Feature activation+4.98e-02

Pos logit contributions
uster+0.127
belt+0.124
ivery+0.121
immers+0.119
amas+0.117
Neg logit contributions
 a-0.052
 that-0.050
 how-0.050
's-0.048
 the-0.046
Tokenoll
Token ID692
Feature activation+1.28e-01

Pos logit contributions
belt+0.088
amas+0.087
uster+0.084
immers+0.084
ivery+0.082
Neg logit contributions
 that-0.076
od-0.075
 how-0.074
 written-0.074
 a-0.074
Tokenipop
Token ID42800
Feature activation+3.12e-02

Pos logit contributions
uster+0.179
ivery+0.177
amas+0.176
belt+0.175
immers+0.172
Neg logit contributions
 that-0.251
 how-0.245
 a-0.235
od-0.233
 says-0.233
Token!
Token ID0
Feature activation-4.83e-03

Pos logit contributions
belt+0.083
amas+0.081
uster+0.080
ippers+0.080
immers+0.079
Neg logit contributions
 how-0.020
 that-0.020
 a-0.019
 written-0.018
 Leo-0.018
Token<|endoftext|>
Token ID50256
Feature activation+1.84e-02

Pos logit contributions
 how+0.005
 written+0.005
 that+0.005
od+0.005
 says+0.005
Neg logit contributions
belt-0.010
amas-0.010
uster-0.010
ippers-0.010
-0.010
TokenOnce
Token ID7454
Feature activation+1.88e-02

Pos logit contributions
belt+0.040
uster+0.038
amas+0.038
immers+0.038
ippers+0.037
Neg logit contributions
 a-0.022
 that-0.022
 Leo-0.021
 how-0.021
 says-0.021
Token upon
Token ID2402
Feature activation-3.72e-02

Pos logit contributions
belt+0.025
ippers+0.023
uster+0.023
amas+0.023
ittens+0.022
Neg logit contributions
 written-0.036
 how-0.036
od-0.035
 that-0.035
 a-0.035
Token Peter
Token ID5613
Feature activation-2.31e-02

Pos logit contributions
belt+0.004
amas+0.003
uster+0.003
ippers+0.003
immers+0.003
Neg logit contributions
 how-0.001
od-0.001
 written-0.001
 says-0.001
 that-0.001
Token started
Token ID2067
Feature activation+6.46e-03

Pos logit contributions
 how+0.021
 that+0.021
 a+0.020
 written+0.020
 Leo+0.020
Neg logit contributions
belt-0.055
amas-0.054
uster-0.053
ippers-0.053
immers-0.053
Token eating
Token ID6600
Feature activation-1.56e-03

Pos logit contributions
belt+0.016
amas+0.015
ippers+0.015
uster+0.015
immers+0.015
Neg logit contributions
od-0.005
 Leo-0.005
 says-0.005
 how-0.005
 that-0.005
Token his
Token ID465
Feature activation+5.63e-02

Pos logit contributions
 how+0.001
 that+0.001
od+0.001
 written+0.001
 Leo+0.001
Neg logit contributions
belt-0.004
amas-0.004
uster-0.004
ippers-0.004
immers-0.004
Token l
Token ID300
Feature activation+5.02e-02

Pos logit contributions
uster+0.121
belt+0.120
amas+0.116
immers+0.116
ivery+0.115
Neg logit contributions
 a-0.070
 how-0.068
 that-0.068
's-0.064
 says-0.064
Tokenoll
Token ID692
Feature activation+1.28e-01

Pos logit contributions
amas+0.096
belt+0.096
immers+0.094
ivery+0.092
uster+0.091
Neg logit contributions
 that-0.070
 how-0.068
 written-0.067
 a-0.067
od-0.066
Tokenipop
Token ID42800
Feature activation+3.15e-02

Pos logit contributions
amas+0.199
ivery+0.199
uster+0.198
immers+0.194
belt+0.193
Neg logit contributions
 that-0.232
 how-0.227
 a-0.218
 says-0.215
:-0.213
Token.
Token ID13
Feature activation+4.63e-03

Pos logit contributions
belt+0.077
amas+0.075
uster+0.074
ippers+0.074
immers+0.074
Neg logit contributions
 how-0.027
 that-0.027
 a-0.025
 written-0.025
 Leo-0.025
Token It
Token ID632
Feature activation+9.08e-04

Pos logit contributions
belt+0.010
immers+0.009
ivery+0.009
ippers+0.009
ider+0.009
Neg logit contributions
 a-0.006
 how-0.006
 that-0.006
 Mark-0.006
 Ma-0.006
Token was
Token ID373
Feature activation-2.43e-02

Pos logit contributions
belt+0.002
amas+0.002
immers+0.002
ro+0.002
+0.002
Neg logit contributions
 Leo-0.001
od-0.001
 Ma-0.001
 how-0.001
 Mark-0.001
Token difficult
Token ID2408
Feature activation+2.78e-02

Pos logit contributions
od+0.015
 says+0.014
 Leo+0.014
 how+0.014
 that+0.014
Neg logit contributions
belt-0.064
amas-0.063
uster-0.062
ippers-0.061
immers-0.061
Token they
Token ID484
Feature activation+2.44e-02

Pos logit contributions
 that+0.011
 how+0.011
 a+0.011
 Leo+0.010
 written+0.010
Neg logit contributions
belt-0.041
amas-0.041
uster-0.040
ippers-0.040
immers-0.040
Token were
Token ID547
Feature activation+2.97e-03

Pos logit contributions
belt+0.051
amas+0.050
uster+0.049
+0.049
ittens+0.048
Neg logit contributions
od-0.027
 how-0.027
 that-0.027
 Leo-0.027
 written-0.026
Token eating
Token ID6600
Feature activation+6.58e-03

Pos logit contributions
belt+0.006
amas+0.006
uster+0.006
ippers+0.006
immers+0.006
Neg logit contributions
 how-0.004
 that-0.004
od-0.004
 a-0.004
 written-0.004
Token their
Token ID511
Feature activation+4.09e-02

Pos logit contributions
belt+0.018
amas+0.017
uster+0.017
ippers+0.017
immers+0.017
Neg logit contributions
 that-0.004
 how-0.004
 a-0.004
 Leo-0.004
 written-0.004
Token l
Token ID300
Feature activation+4.93e-02

Pos logit contributions
belt+0.099
amas+0.097
ippers+0.095
uster+0.095
immers+0.094
Neg logit contributions
 how-0.035
 that-0.035
 written-0.034
od-0.034
 Leo-0.034
Tokenoll
Token ID692
Feature activation+1.27e-01

Pos logit contributions
belt+0.092
amas+0.091
immers+0.088
ivery+0.086
uster+0.086
Neg logit contributions
 that-0.071
od-0.069
 how-0.069
 a-0.068
 written-0.068
Tokenip
Token ID541
Feature activation+8.53e-02

Pos logit contributions
amas+0.171
ivery+0.169
uster+0.167
belt+0.166
immers+0.164
Neg logit contributions
 that-0.256
 how-0.251
 says-0.242
 a-0.241
's-0.237
Tokenops
Token ID2840
Feature activation+1.49e-02

Pos logit contributions
belt+0.108
uster+0.104
immers+0.101
amas+0.100
ippers+0.098
Neg logit contributions
 that-0.178
 how-0.174
 a-0.172
's-0.172
 says-0.171
Token,
Token ID11
Feature activation-1.19e-02

Pos logit contributions
belt+0.041
uster+0.040
amas+0.040
immers+0.040
ippers+0.039
Neg logit contributions
 that-0.010
 how-0.009
 a-0.009
's-0.009
:-0.008
Token they
Token ID484
Feature activation+1.33e-04

Pos logit contributions
 says+0.010
od+0.010
 written+0.010
 how+0.009
 explains+0.009
Neg logit contributions
belt-0.028
amas-0.027
uster-0.027
immers-0.027
ippers-0.027
Token saw
Token ID2497
Feature activation-5.48e-02

Pos logit contributions
belt+0.000
amas+0.000
uster+0.000
immers+0.000
ivery+0.000
Neg logit contributions
 how-0.000
od-0.000
 that-0.000
 Leo-0.000
 says-0.000
Token to
Token ID284
Feature activation+8.41e-03

Pos logit contributions
uster+0.003
belt+0.003
amas+0.003
ippers+0.003
immers+0.003
Neg logit contributions
 that-0.001
 a-0.001
 how-0.001
's-0.001
 says-0.001
Token buy
Token ID2822
Feature activation-1.42e-02

Pos logit contributions
amas+0.018
belt+0.018
uster+0.018
ider+0.018
immers+0.018
Neg logit contributions
 a-0.011
 that-0.011
 how-0.011
 the-0.010
 an-0.010
Token an
Token ID281
Feature activation-1.53e-02

Pos logit contributions
 how+0.010
 that+0.009
od+0.009
 Leo+0.009
 written+0.009
Neg logit contributions
belt-0.037
amas-0.036
uster-0.035
ippers-0.035
immers-0.035
Token ordinary
Token ID8850
Feature activation+3.47e-02

Pos logit contributions
 that+0.019
 a+0.019
 how+0.019
 written+0.018
 about+0.017
Neg logit contributions
amas-0.032
belt-0.032
immers-0.032
ippers-0.032
uster-0.031
Token l
Token ID300
Feature activation+3.09e-02

Pos logit contributions
ivery+0.071
uster+0.070
immers+0.070
amas+0.069
belt+0.067
Neg logit contributions
 a-0.052
 that-0.050
 how-0.046
 an-0.045
 the-0.045
Tokenoll
Token ID692
Feature activation+1.27e-01

Pos logit contributions
amas+0.059
belt+0.058
immers+0.057
uster+0.056
ivery+0.056
Neg logit contributions
 that-0.044
 written-0.043
od-0.042
 how-0.042
 a-0.042
Tokenipop
Token ID42800
Feature activation+2.90e-02

Pos logit contributions
ivery+0.215
amas+0.213
uster+0.208
immers+0.207
belt+0.196
Neg logit contributions
 that-0.224
 how-0.212
 a-0.204
:-0.204
 says-0.203
Token but
Token ID475
Feature activation-4.96e-03

Pos logit contributions
belt+0.079
amas+0.078
uster+0.078
immers+0.077
ippers+0.077
Neg logit contributions
 that-0.018
 how-0.017
 a-0.016
 written-0.015
's-0.015
Token she
Token ID673
Feature activation+1.48e-02

Pos logit contributions
 how+0.004
 that+0.004
 a+0.004
 Leo+0.004
 written+0.004
Neg logit contributions
amas-0.013
uster-0.012
immers-0.012
belt-0.012
ippers-0.012
Token saw
Token ID2497
Feature activation-4.47e-02

Pos logit contributions
uster+0.037
amas+0.036
immers+0.036
ippers+0.036
belt+0.036
Neg logit contributions
 that-0.014
 how-0.014
's-0.014
 a-0.013
 says-0.013
Token a
Token ID257
Feature activation-2.99e-02

Pos logit contributions
od+0.021
 Leo+0.021
 says+0.020
 Ma+0.019
 Mark+0.018
Neg logit contributions
belt-0.124
amas-0.119
uster-0.117
ippers-0.116
ittens-0.115
Token Comfort
Token ID45769
Feature activation-5.42e-03

Pos logit contributions
belt+0.003
amas+0.003
uster+0.003
ippers+0.003
immers+0.002
Neg logit contributions
 that-0.001
 how-0.001
od-0.001
 written-0.001
 a-0.001
Tokenably
Token ID1346
Feature activation-8.82e-03

Pos logit contributions
 that+0.007
 how+0.007
 a+0.007
 written+0.007
's+0.007
Neg logit contributions
belt-0.011
amas-0.011
ippers-0.011
uster-0.011
immers-0.011
Token eating
Token ID6600
Feature activation-3.95e-03

Pos logit contributions
 that+0.008
 how+0.007
 a+0.007
 written+0.007
 says+0.007
Neg logit contributions
belt-0.022
amas-0.022
uster-0.021
ippers-0.021
immers-0.021
Token his
Token ID465
Feature activation+4.78e-02

Pos logit contributions
od+0.002
 how+0.002
 written+0.002
 says+0.002
 that+0.002
Neg logit contributions
belt-0.011
amas-0.010
uster-0.010
ippers-0.010
immers-0.010
Token l
Token ID300
Feature activation+4.53e-02

Pos logit contributions
uster+0.105
belt+0.104
immers+0.103
amas+0.102
ivery+0.101
Neg logit contributions
 a-0.059
 that-0.056
 how-0.056
's-0.053
 says-0.052
Tokenoll
Token ID692
Feature activation+1.26e-01

Pos logit contributions
amas+0.083
belt+0.083
immers+0.081
ivery+0.078
uster+0.078
Neg logit contributions
 that-0.067
 how-0.065
 written-0.065
od-0.064
 a-0.064
Tokenipop
Token ID42800
Feature activation+3.07e-02

Pos logit contributions
amas+0.196
ivery+0.192
immers+0.188
uster+0.188
belt+0.182
Neg logit contributions
 that-0.234
 how-0.229
 a-0.217
:-0.216
 says-0.216
Token,
Token ID11
Feature activation+5.38e-03

Pos logit contributions
belt+0.073
uster+0.073
amas+0.072
immers+0.071
ippers+0.071
Neg logit contributions
 that-0.029
 how-0.028
 a-0.027
's-0.026
 written-0.026
Token John
Token ID1757
Feature activation-3.41e-02

Pos logit contributions
belt+0.012
amas+0.012
uster+0.012
immers+0.012
ippers+0.012
Neg logit contributions
 that-0.006
 how-0.006
 a-0.005
 written-0.005
 Leo-0.005
Token happily
Token ID18177
Feature activation+4.52e-02

Pos logit contributions
 how+0.031
 that+0.031
 written+0.030
od+0.030
 a+0.030
Neg logit contributions
belt-0.080
amas-0.078
uster-0.077
ippers-0.077
immers-0.076
Token went
Token ID1816
Feature activation+3.96e-02

Pos logit contributions
amas+0.097
belt+0.097
uster+0.095
immers+0.093
ippers+0.093
Neg logit contributions
 that-0.054
 how-0.053
 a-0.052
 written-0.051
 says-0.050
TokenCan
Token ID6090
Feature activation+1.59e-02

Pos logit contributions
belt+0.036
amas+0.035
uster+0.034
immers+0.034
ippers+0.034
Neg logit contributions
 a-0.028
 how-0.028
 that-0.028
 written-0.028
 Leo-0.028
Token we
Token ID356
Feature activation+1.08e-02

Pos logit contributions
belt+0.035
amas+0.034
uster+0.034
ippers+0.034
immers+0.034
Neg logit contributions
 that-0.018
 how-0.018
 a-0.017
 Leo-0.017
 says-0.017
Token have
Token ID423
Feature activation-1.32e-02

Pos logit contributions
belt+0.023
amas+0.023
uster+0.022
immers+0.022
ippers+0.022
Neg logit contributions
 that-0.013
 how-0.013
 a-0.013
 says-0.012
 Leo-0.012
Token a
Token ID257
Feature activation+8.63e-03

Pos logit contributions
od+0.013
 Ma+0.011
 Leo+0.011
 Lou+0.011
 written+0.011
Neg logit contributions
belt-0.029
amas-0.029
ittens-0.029
uster-0.028
igs-0.028
Token l
Token ID300
Feature activation+3.84e-02

Pos logit contributions
belt+0.017
amas+0.017
immers+0.017
uster+0.016
ippers+0.016
Neg logit contributions
 a-0.012
 that-0.012
 how-0.012
 says-0.011
's-0.011
Tokenoll
Token ID692
Feature activation+1.26e-01

Pos logit contributions
amas+0.071
belt+0.071
immers+0.068
uster+0.067
ivery+0.066
Neg logit contributions
 that-0.057
 how-0.055
 written-0.055
 a-0.054
od-0.054
Tokenipop
Token ID42800
Feature activation+3.08e-02

Pos logit contributions
amas+0.210
ivery+0.206
uster+0.204
immers+0.203
belt+0.196
Neg logit contributions
 that-0.220
 how-0.211
 says-0.205
 a-0.202
:-0.199
Token,
Token ID11
Feature activation-1.26e-04

Pos logit contributions
uster+0.075
belt+0.074
amas+0.074
immers+0.073
ippers+0.073
Neg logit contributions
 that-0.030
 a-0.028
 how-0.028
 says-0.026
's-0.025
Token mom
Token ID1995
Feature activation-3.39e-02

Pos logit contributions
od+0.000
 written+0.000
 how+0.000
 that+0.000
 says+0.000
Neg logit contributions
belt-0.000
amas-0.000
uster-0.000
ittens-0.000
ippers-0.000
Token?"
Token ID1701
Feature activation+3.19e-03

Pos logit contributions
 that+0.038
 how+0.038
 says+0.037
 a+0.036
 written+0.035
Neg logit contributions
belt-0.077
amas-0.075
uster-0.075
ippers-0.074
immers-0.073
Token Ben
Token ID3932
Feature activation-2.54e-02

Pos logit contributions
belt+0.007
amas+0.007
uster+0.007
ippers+0.007
immers+0.007
Neg logit contributions
 how-0.003
 that-0.003
 written-0.003
 a-0.003
od-0.003
Token It
Token ID632
Feature activation-6.82e-03

Pos logit contributions
belt+0.013
immers+0.012
amas+0.012
ippers+0.012
uster+0.012
Neg logit contributions
 a-0.006
 how-0.006
 that-0.006
 written-0.006
 Leo-0.006
Token was
Token ID373
Feature activation-2.15e-02

Pos logit contributions
 Leo+0.008
icks+0.008
 Sof+0.008
od+0.008
 Ma+0.008
Neg logit contributions
belt-0.012
amas-0.011
ro-0.011
-0.011
suit-0.011
Token an
Token ID281
Feature activation-3.82e-03

Pos logit contributions
 says+0.013
od+0.012
's+0.011
 explains+0.010
 Leo+0.009
Neg logit contributions
belt-0.055
uster-0.052
amas-0.052
ippers-0.051
ro-0.051
Token icy
Token ID30284
Feature activation+4.39e-02

Pos logit contributions
 Leo+0.005
 Ma+0.005
 Mark+0.005
 how+0.005
 Luc+0.005
Neg logit contributions
belt-0.007
uster-0.007
amas-0.007
agons-0.007
immers-0.007
Token l
Token ID300
Feature activation+3.85e-02

Pos logit contributions
uster+0.088
belt+0.088
immers+0.088
ivery+0.087
ippers+0.087
Neg logit contributions
 that-0.063
 a-0.059
 how-0.055
:-0.054
's-0.054
Tokenoll
Token ID692
Feature activation+1.26e-01

Pos logit contributions
belt+0.064
uster+0.062
amas+0.061
ippers+0.061
ider+0.061
Neg logit contributions
 how-0.065
 says-0.063
 Leo-0.061
 that-0.061
 Mark-0.061
Tokenipop
Token ID42800
Feature activation+3.28e-02

Pos logit contributions
immers+0.194
amas+0.194
ivery+0.193
belt+0.188
uster+0.188
Neg logit contributions
 that-0.228
 how-0.216
 a-0.215
 written-0.214
od-0.209
Token!
Token ID0
Feature activation-8.66e-03

Pos logit contributions
belt+0.087
amas+0.085
uster+0.082
ippers+0.082
ider+0.082
Neg logit contributions
 how-0.021
 says-0.020
 Leo-0.020
 Ma-0.019
od-0.019
Token Jack
Token ID3619
Feature activation-2.75e-02

Pos logit contributions
 a+0.009
 that+0.008
 how+0.008
 Leo+0.008
 an+0.008
Neg logit contributions
belt-0.021
immers-0.020
ippers-0.020
ivery-0.020
ittens-0.020
Token reached
Token ID4251
Feature activation+7.44e-03

Pos logit contributions
 that+0.025
 how+0.025
 Leo+0.025
od+0.024
 written+0.024
Neg logit contributions
belt-0.063
amas-0.061
uster-0.060
ippers-0.059
immers-0.059
Token out
Token ID503
Feature activation-2.60e-02

Pos logit contributions
belt+0.018
uster+0.018
amas+0.017
ippers+0.017
immers+0.017
Neg logit contributions
od-0.006
 written-0.006
 how-0.006
 that-0.006
 says-0.006
Token She
Token ID1375
Feature activation+2.03e-02

Pos logit contributions
immers+0.013
agons+0.012
igs+0.012
belt+0.012
ishing+0.012
Neg logit contributions
 Mrs-0.004
 a-0.004
's-0.004
 that-0.004
 Mary-0.004
Token saw
Token ID2497
Feature activation-4.05e-02

Pos logit contributions
uster+0.049
immers+0.048
ippers+0.048
amas+0.048
belt+0.048
Neg logit contributions
 how-0.019
 a-0.019
 that-0.019
's-0.019
 says-0.018
Token a
Token ID257
Feature activation-2.22e-02

Pos logit contributions
od+0.027
 Lou+0.026
 Leo+0.026
 says+0.024
 Ma+0.024
Neg logit contributions
belt-0.102
amas-0.098
uster-0.096
ittens-0.096
angers-0.096
Token big
Token ID1263
Feature activation+8.07e-03

Pos logit contributions
 that+0.024
 a+0.024
 written+0.023
 how+0.022
 Mrs+0.022
Neg logit contributions
immers-0.050
belt-0.050
uster-0.049
amas-0.049
ippers-0.049
Token l
Token ID300
Feature activation+3.57e-02

Pos logit contributions
immers+0.017
uster+0.017
belt+0.017
ippers+0.016
ivery+0.016
Neg logit contributions
 that-0.011
 a-0.011
 the-0.010
 how-0.010
 says-0.010
Tokenoll
Token ID692
Feature activation+1.25e-01

Pos logit contributions
belt+0.065
amas+0.063
immers+0.062
uster+0.062
ippers+0.061
Neg logit contributions
 that-0.053
 written-0.052
 how-0.052
 a-0.051
od-0.051
Tokenipop
Token ID42800
Feature activation+2.68e-02

Pos logit contributions
amas+0.209
immers+0.207
uster+0.203
ivery+0.203
belt+0.196
Neg logit contributions
 that-0.222
 how-0.209
 a-0.203
:-0.202
 says-0.200
Token and
Token ID290
Feature activation+1.98e-03

Pos logit contributions
belt+0.062
amas+0.060
uster+0.059
ippers+0.058
immers+0.058
Neg logit contributions
 how-0.027
 that-0.026
 Leo-0.026
od-0.026
 says-0.025
Token asked
Token ID1965
Feature activation+4.55e-03

Pos logit contributions
belt+0.004
amas+0.004
uster+0.004
ippers+0.004
ittens+0.004
Neg logit contributions
od-0.002
 how-0.002
 Leo-0.002
 that-0.002
 Ma-0.002
Token her
Token ID607
Feature activation-1.43e-02

Pos logit contributions
belt+0.012
amas+0.011
uster+0.011
ippers+0.011
ittens+0.011
Neg logit contributions
od-0.003
 written-0.003
 Leo-0.003
 says-0.003
 that-0.003
Token mom
Token ID1995
Feature activation-3.73e-02

Pos logit contributions
od+0.020
 written+0.020
 Leo+0.020
 Lou+0.019
 Ray+0.019
Neg logit contributions
belt-0.024
ippers-0.023
aj-0.022
amas-0.022
ittens-0.022
Token gives
Token ID3607
Feature activation-2.15e-02

Pos logit contributions
 Ma+0.036
od+0.035
 Jan+0.033
 Mrs+0.033
ful+0.033
Neg logit contributions
ider-0.071
ippers-0.070
ro-0.069
belt-0.069
immers-0.068
Token them
Token ID606
Feature activation-4.28e-02

Pos logit contributions
od+0.031
 written+0.030
 explains+0.028
 that+0.028
 how+0.028
Neg logit contributions
belt-0.036
immers-0.035
amas-0.034
ippers-0.034
uster-0.034
Token each
Token ID1123
Feature activation+8.21e-03

Pos logit contributions
od+0.042
 written+0.041
 explains+0.039
 Lou+0.038
 Joseph+0.036
Neg logit contributions
ippers-0.087
immers-0.086
uster-0.086
ro-0.082
belt-0.082
Token a
Token ID257
Feature activation+2.49e-03

Pos logit contributions
ro+0.019
uster+0.019
ippers+0.019
ittens+0.018
belt+0.018
Neg logit contributions
od-0.007
 how-0.006
 Lou-0.006
 Mark-0.006
 written-0.006
Token l
Token ID300
Feature activation+4.47e-02

Pos logit contributions
belt+0.005
amas+0.005
uster+0.005
ippers+0.005
ittens+0.005
Neg logit contributions
 how-0.003
od-0.003
 that-0.003
 Leo-0.003
 Mark-0.003
Tokenoll
Token ID692
Feature activation+1.25e-01

Pos logit contributions
belt+0.073
amas+0.070
uster+0.069
ippers+0.069
immers+0.067
Neg logit contributions
 how-0.075
 that-0.074
 a-0.073
 written-0.072
 Leo-0.072
Tokenipop
Token ID42800
Feature activation+3.53e-02

Pos logit contributions
amas+0.188
belt+0.182
ivery+0.181
immers+0.179
uster+0.179
Neg logit contributions
 that-0.231
 how-0.228
 says-0.224
 a-0.222
 Leo-0.217
Token.
Token ID13
Feature activation-2.42e-03

Pos logit contributions
belt+0.083
ider+0.079
amas+0.079
ippers+0.079
ro+0.077
Neg logit contributions
od-0.032
 Ma-0.031
 Mark-0.031
 how-0.031
 Leo-0.030
Token Tom
Token ID4186
Feature activation-4.32e-02

Pos logit contributions
od+0.003
 how+0.003
 written+0.003
 that+0.003
's+0.003
Neg logit contributions
belt-0.005
amas-0.005
ippers-0.004
uster-0.004
ro-0.004
Token gets
Token ID3011
Feature activation-2.75e-02

Pos logit contributions
od+0.049
 how+0.048
 that+0.048
 written+0.048
 Ma+0.047
Neg logit contributions
belt-0.088
amas-0.088
ippers-0.086
uster-0.085
immers-0.085
Token a
Token ID257
Feature activation-1.40e-02

Pos logit contributions
od+0.025
 Lou+0.024
 Mark+0.023
 Luc+0.022
 how+0.022
Neg logit contributions
ippers-0.059
belt-0.059
uster-0.058
amas-0.057
immers-0.057
Token money
Token ID1637
Feature activation+9.76e-03

Pos logit contributions
belt+0.142
ivery+0.142
amas+0.139
immers+0.136
uster+0.135
Neg logit contributions
 that-0.099
 a-0.098
 how-0.095
 the-0.092
 what-0.088
Token to
Token ID284
Feature activation+3.41e-02

Pos logit contributions
belt+0.026
amas+0.026
uster+0.025
ippers+0.025
immers+0.025
Neg logit contributions
 that-0.006
 how-0.006
 a-0.006
 written-0.006
 says-0.005
Token buy
Token ID2822
Feature activation-1.54e-02

Pos logit contributions
belt+0.074
amas+0.073
uster+0.072
angers+0.071
ider+0.071
Neg logit contributions
 that-0.048
 a-0.048
 how-0.045
 the-0.045
 say-0.044
Token her
Token ID607
Feature activation+2.14e-02

Pos logit contributions
 Leo+0.011
od+0.011
 how+0.011
 written+0.011
 says+0.011
Neg logit contributions
belt-0.038
amas-0.037
ippers-0.037
uster-0.037
immers-0.037
Token l
Token ID300
Feature activation+4.58e-02

Pos logit contributions
uster+0.048
amas+0.047
belt+0.047
immers+0.046
ivery+0.045
Neg logit contributions
 a-0.026
 how-0.026
 that-0.025
's-0.024
 an-0.024
Tokenoll
Token ID692
Feature activation+1.25e-01

Pos logit contributions
amas+0.096
immers+0.093
ivery+0.093
belt+0.092
ital+0.089
Neg logit contributions
 that-0.058
 how-0.055
 a-0.055
 written-0.055
od-0.054
Tokenipop
Token ID42800
Feature activation+2.47e-02

Pos logit contributions
amas+0.199
ivery+0.198
uster+0.192
immers+0.190
belt+0.187
Neg logit contributions
 that-0.225
 how-0.222
 a-0.210
 says-0.208
:-0.208
Token!
Token ID0
Feature activation-3.17e-03

Pos logit contributions
belt+0.069
uster+0.068
amas+0.068
immers+0.067
ippers+0.067
Neg logit contributions
 that-0.014
 how-0.013
 a-0.013
's-0.011
 written-0.011
Token She
Token ID1375
Feature activation+2.15e-02

Pos logit contributions
 that+0.003
 how+0.003
 a+0.003
 says+0.003
 Ma+0.003
Neg logit contributions
belt-0.008
immers-0.007
amas-0.007
ider-0.007
ippers-0.007
Token was
Token ID373
Feature activation-1.39e-02

Pos logit contributions
belt+0.053
amas+0.052
uster+0.051
ippers+0.051
immers+0.051
Neg logit contributions
 how-0.018
 that-0.018
 a-0.017
 written-0.017
 Leo-0.017
Token so
Token ID523
Feature activation-1.29e-02

Pos logit contributions
 says+0.010
 Leo+0.009
od+0.009
 explains+0.009
 Luc+0.009
Neg logit contributions
belt-0.036
amas-0.034
uster-0.034
ippers-0.033
ivery-0.033
Token because
Token ID780
Feature activation-1.95e-02

Pos logit contributions
belt+0.071
amas+0.069
uster+0.067
ippers+0.067
ivery+0.066
Neg logit contributions
od-0.028
 Leo-0.028
 says-0.027
 how-0.027
 written-0.027
Token he
Token ID339
Feature activation+2.81e-02

Pos logit contributions
 how+0.018
 that+0.018
od+0.017
 written+0.017
 says+0.017
Neg logit contributions
belt-0.046
amas-0.045
uster-0.044
ippers-0.044
immers-0.044
Token lost
Token ID2626
Feature activation+1.30e-02

Pos logit contributions
belt+0.054
amas+0.052
ittens+0.051
ivery+0.050
uster+0.048
Neg logit contributions
 Ma-0.036
 Leo-0.035
 how-0.035
od-0.035
 Mary-0.035
Token his
Token ID465
Feature activation+5.05e-02

Pos logit contributions
belt+0.033
amas+0.033
uster+0.032
ippers+0.032
immers+0.032
Neg logit contributions
 says-0.009
od-0.009
 written-0.009
 how-0.009
 explains-0.008
Token l
Token ID300
Feature activation+4.54e-02

Pos logit contributions
belt+0.096
amas+0.094
uster+0.093
immers+0.092
ippers+0.092
Neg logit contributions
 that-0.071
 how-0.070
 a-0.069
 written-0.067
 says-0.067
Tokenoll
Token ID692
Feature activation+1.25e-01

Pos logit contributions
amas+0.083
belt+0.083
immers+0.081
uster+0.079
ippers+0.078
Neg logit contributions
 that-0.067
 a-0.064
 written-0.064
od-0.064
 how-0.064
Tokenipop
Token ID42800
Feature activation+3.06e-02

Pos logit contributions
amas+0.190
immers+0.189
uster+0.186
ivery+0.186
belt+0.182
Neg logit contributions
 that-0.227
 how-0.219
 a-0.214
 says-0.207
's-0.207
Token.
Token ID13
Feature activation-6.73e-03

Pos logit contributions
belt+0.079
amas+0.077
uster+0.076
ippers+0.076
immers+0.076
Neg logit contributions
 that-0.022
 how-0.022
 a-0.021
 written-0.021
 Leo-0.021
Token The
Token ID383
Feature activation-7.48e-03

Pos logit contributions
 how+0.007
 that+0.007
 written+0.007
od+0.007
 a+0.007
Neg logit contributions
belt-0.015
amas-0.015
uster-0.015
ippers-0.014
immers-0.014
Token little
Token ID1310
Feature activation+2.02e-02

Pos logit contributions
 a+0.010
 that+0.010
 how+0.010
 says+0.010
's+0.010
Neg logit contributions
belt-0.015
amas-0.015
immers-0.015
uster-0.015
ippers-0.015
Token girl
Token ID2576
Feature activation-2.00e-02

Pos logit contributions
belt+0.036
amas+0.035
immers+0.034
uster+0.034
ippers+0.033
Neg logit contributions
 a-0.034
 says-0.033
 that-0.033
's-0.033
 how-0.032
Token not
Token ID407
Feature activation-3.06e-02

Pos logit contributions
belt+0.078
amas+0.076
ippers+0.075
uster+0.075
immers+0.074
Neg logit contributions
 how-0.020
od-0.020
 written-0.020
 that-0.020
 says-0.020
Token find
Token ID1064
Feature activation-3.15e-02

Pos logit contributions
 how+0.051
 says+0.050
 that+0.050
 Leo+0.050
 Mark+0.050
Neg logit contributions
belt-0.048
uster-0.045
ippers-0.045
amas-0.045
immers-0.044
Token a
Token ID257
Feature activation-1.77e-02

Pos logit contributions
od+0.035
 explains+0.035
 says+0.035
 written+0.034
 Lou+0.033
Neg logit contributions
belt-0.063
uster-0.063
ippers-0.062
ittens-0.060
igs-0.060
Token rare
Token ID4071
Feature activation+1.69e-02

Pos logit contributions
 how+0.019
 that+0.019
 a+0.018
 written+0.018
 Leo+0.018
Neg logit contributions
belt-0.039
amas-0.038
uster-0.038
ippers-0.037
immers-0.037
Token l
Token ID300
Feature activation+3.96e-02

Pos logit contributions
belt+0.039
amas+0.038
uster+0.038
ippers+0.037
immers+0.037
Neg logit contributions
 how-0.017
 that-0.017
 a-0.016
 written-0.016
 Leo-0.016
Tokenoll
Token ID692
Feature activation+1.24e-01

Pos logit contributions
belt+0.067
amas+0.066
uster+0.064
immers+0.063
ippers+0.063
Neg logit contributions
 that-0.063
 how-0.063
 a-0.061
 written-0.061
od-0.061
Tokenipop
Token ID42800
Feature activation+2.73e-02

Pos logit contributions
amas+0.197
immers+0.192
ivery+0.191
belt+0.185
uster+0.185
Neg logit contributions
 that-0.226
 how-0.218
 a-0.212
 written-0.209
 says-0.206
Token.
Token ID13
Feature activation-6.52e-03

Pos logit contributions
belt+0.070
amas+0.067
uster+0.067
ippers+0.066
ider+0.066
Neg logit contributions
 how-0.019
 Leo-0.019
 says-0.019
 Ma-0.019
od-0.019
Token Finally
Token ID9461
Feature activation+1.68e-02

Pos logit contributions
 how+0.006
 that+0.006
od+0.006
 written+0.006
 says+0.006
Neg logit contributions
belt-0.015
amas-0.015
uster-0.014
ippers-0.014
immers-0.014
Token,
Token ID11
Feature activation+2.81e-03

Pos logit contributions
belt+0.044
amas+0.043
uster+0.043
ippers+0.043
immers+0.042
Neg logit contributions
 that-0.012
 how-0.011
 a-0.011
 Leo-0.011
 written-0.011
Token he
Token ID339
Feature activation+2.05e-02

Pos logit contributions
belt+0.006
amas+0.006
+0.005
uster+0.005
ippers+0.005
Neg logit contributions
od-0.003
 says-0.003
 explains-0.003
icks-0.003
 Ma-0.003
Token a
Token ID257
Feature activation-1.03e-02

Pos logit contributions
od+0.007
 how+0.007
 that+0.007
 written+0.007
 Ma+0.007
Neg logit contributions
belt-0.022
amas-0.022
immers-0.022
ippers-0.022
uster-0.022
Token sticker
Token ID27735
Feature activation-1.35e-02

Pos logit contributions
 a+0.013
 that+0.012
 how+0.012
 says+0.012
 Leo+0.012
Neg logit contributions
belt-0.022
amas-0.022
immers-0.021
uster-0.021
ippers-0.021
Token and
Token ID290
Feature activation+6.16e-03

Pos logit contributions
 how+0.010
 that+0.010
 a+0.010
 written+0.010
od+0.010
Neg logit contributions
belt-0.034
amas-0.033
uster-0.033
ippers-0.033
immers-0.033
Token a
Token ID257
Feature activation-6.63e-03

Pos logit contributions
belt+0.018
amas+0.017
uster+0.017
ippers+0.017
immers+0.017
Neg logit contributions
od-0.002
 how-0.002
 that-0.002
 Leo-0.002
 written-0.002
Token l
Token ID300
Feature activation+3.06e-02

Pos logit contributions
 that+0.008
 a+0.008
 how+0.008
 written+0.008
 says+0.008
Neg logit contributions
belt-0.014
amas-0.014
uster-0.014
immers-0.014
ippers-0.013
Tokenoll
Token ID692
Feature activation+1.24e-01

Pos logit contributions
belt+0.058
amas+0.058
immers+0.056
uster+0.055
ippers+0.054
Neg logit contributions
 that-0.044
 how-0.043
 written-0.042
 a-0.042
od-0.042
Tokenipop
Token ID42800
Feature activation+3.31e-02

Pos logit contributions
ivery+0.218
amas+0.216
immers+0.208
uster+0.207
+0.198
Neg logit contributions
 that-0.213
 how-0.204
 says-0.198
 a-0.197
:-0.195
Token.
Token ID13
Feature activation+2.20e-03

Pos logit contributions
belt+0.087
amas+0.086
uster+0.086
immers+0.085
ippers+0.084
Neg logit contributions
 that-0.023
 how-0.022
 a-0.021
 written-0.021
 says-0.020
Token You
Token ID921
Feature activation+5.71e-03

Pos logit contributions
belt+0.005
amas+0.005
uster+0.005
immers+0.005
ippers+0.005
Neg logit contributions
 how-0.003
 that-0.002
 a-0.002
 Leo-0.002
 says-0.002
Token can
Token ID460
Feature activation+2.22e-02

Pos logit contributions
belt+0.013
amas+0.012
uster+0.012
ippers+0.012
immers+0.012
Neg logit contributions
 how-0.006
 that-0.006
 Leo-0.006
od-0.006
 written-0.006
Token choose
Token ID3853
Feature activation-3.43e-02

Pos logit contributions
belt+0.047
amas+0.046
uster+0.045
immers+0.045
ippers+0.045
Neg logit contributions
 that-0.027
 how-0.026
 a-0.026
 written-0.025
 says-0.025
Token she
Token ID673
Feature activation+5.70e-02

Pos logit contributions
 Ma+0.069
 Leo+0.067
 Mark+0.064
 Joseph+0.062
 Lou+0.062
Neg logit contributions
belt-0.104
ippers-0.096
ider-0.095
ittens-0.094
suit-0.094
Token recognized
Token ID8018
Feature activation-1.59e-02

Pos logit contributions
belt+0.131
amas+0.129
uster+0.127
ippers+0.126
immers+0.126
Neg logit contributions
 how-0.057
 that-0.057
 a-0.056
 written-0.055
 says-0.054
Token -
Token ID532
Feature activation-2.97e-02

Pos logit contributions
 says+0.014
 Leo+0.014
 Ma+0.013
 Mark+0.013
od+0.013
Neg logit contributions
belt-0.037
ippers-0.035
uster-0.035
amas-0.034
angers-0.034
Token a
Token ID257
Feature activation-9.67e-03

Pos logit contributions
 says+0.014
od+0.013
 Leo+0.013
 how+0.013
 written+0.012
Neg logit contributions
belt-0.081
amas-0.078
uster-0.077
ippers-0.077
-0.076
Token l
Token ID300
Feature activation+3.59e-02

Pos logit contributions
 that+0.010
 how+0.010
 a+0.010
 written+0.010
 Leo+0.010
Neg logit contributions
belt-0.021
amas-0.021
uster-0.021
ippers-0.020
immers-0.020
Tokenoll
Token ID692
Feature activation+1.24e-01

Pos logit contributions
belt+0.062
uster+0.059
ippers+0.059
amas+0.059
ider+0.058
Neg logit contributions
 how-0.058
 says-0.056
 that-0.056
 Leo-0.055
 Mark-0.055
Tokenipop
Token ID42800
Feature activation+2.69e-02

Pos logit contributions
amas+0.203
immers+0.197
ivery+0.195
uster+0.192
belt+0.189
Neg logit contributions
 that-0.224
 how-0.211
 a-0.208
 written-0.207
:-0.206
Token!
Token ID0
Feature activation-7.62e-03

Pos logit contributions
belt+0.072
amas+0.071
uster+0.069
ippers+0.069
immers+0.068
Neg logit contributions
 how-0.016
 Leo-0.015
od-0.015
 says-0.015
 Ma-0.014
Token It
Token ID632
Feature activation-4.78e-03

Pos logit contributions
 how+0.009
 that+0.009
od+0.008
 says+0.008
 written+0.008
Neg logit contributions
belt-0.016
amas-0.016
uster-0.016
ippers-0.015
-0.015
Token was
Token ID373
Feature activation-2.27e-02

Pos logit contributions
 Leo+0.005
od+0.005
 Ma+0.005
 how+0.005
 Mark+0.005
Neg logit contributions
belt-0.010
-0.009
amas-0.009
ro-0.009
ittens-0.009
Token sitting
Token ID5586
Feature activation+5.04e-02

Pos logit contributions
 says+0.017
od+0.016
 explains+0.015
 Leo+0.015
 Mark+0.014
Neg logit contributions
belt-0.055
amas-0.052
uster-0.052
ittens-0.052
ro-0.051
Token happened
Token ID3022
Feature activation+1.27e-02

Pos logit contributions
belt+0.031
amas+0.029
ider+0.029
ital+0.028
uster+0.028
Neg logit contributions
 Leo-0.012
 Ma-0.012
od-0.011
 Luc-0.011
 how-0.011
Token:
Token ID25
Feature activation-4.07e-02

Pos logit contributions
belt+0.032
amas+0.031
ippers+0.031
uster+0.031
immers+0.031
Neg logit contributions
 says-0.009
 Leo-0.009
od-0.009
 how-0.009
 Ma-0.008
Token a
Token ID257
Feature activation-7.10e-03

Pos logit contributions
 says+0.031
 how+0.030
 Leo+0.030
od+0.030
 that+0.029
Neg logit contributions
belt-0.098
amas-0.097
uster-0.096
-0.096
ippers-0.094
Token modern
Token ID3660
Feature activation+2.58e-02

Pos logit contributions
 how+0.008
 that+0.008
od+0.008
 Leo+0.008
 says+0.008
Neg logit contributions
belt-0.015
amas-0.015
uster-0.014
ippers-0.014
immers-0.014
Token l
Token ID300
Feature activation+3.74e-02

Pos logit contributions
belt+0.054
immers+0.054
uster+0.053
amas+0.053
ippers+0.053
Neg logit contributions
 that-0.032
 a-0.031
 how-0.031
 written-0.030
 says-0.030
Tokenoll
Token ID692
Feature activation+1.24e-01

Pos logit contributions
belt+0.065
amas+0.063
uster+0.062
ippers+0.061
immers+0.061
Neg logit contributions
 how-0.058
 that-0.058
 a-0.057
 Leo-0.056
 written-0.056
Tokenipop
Token ID42800
Feature activation+2.67e-02

Pos logit contributions
immers+0.206
amas+0.206
ivery+0.200
uster+0.197
belt+0.193
Neg logit contributions
 that-0.219
 how-0.206
 a-0.204
 written-0.200
:-0.199
Token machine
Token ID4572
Feature activation-8.93e-03

Pos logit contributions
belt+0.066
amas+0.064
uster+0.062
ippers+0.062
ider+0.062
Neg logit contributions
 how-0.022
 Leo-0.021
od-0.021
 Ma-0.021
 says-0.021
Token came
Token ID1625
Feature activation+4.97e-03

Pos logit contributions
 Leo+0.007
 how+0.007
 Ma+0.007
od+0.007
 Luc+0.007
Neg logit contributions
belt-0.022
amas-0.021
uster-0.020
ider-0.020
ippers-0.020
Token to
Token ID284
Feature activation-8.80e-03

Pos logit contributions
belt+0.012
amas+0.012
uster+0.012
ippers+0.011
immers+0.011
Neg logit contributions
 says-0.004
 Leo-0.004
 explains-0.004
 how-0.004
 that-0.004
Token their
Token ID511
Feature activation-2.16e-02

Pos logit contributions
 how+0.005
 that+0.005
 written+0.005
 a+0.005
od+0.005
Neg logit contributions
belt-0.024
amas-0.023
uster-0.023
ippers-0.023
immers-0.023
Token and
Token ID290
Feature activation-4.23e-03

Pos logit contributions
belt+0.018
amas+0.017
uster+0.017
ippers+0.017
immers+0.017
Neg logit contributions
 that-0.009
 how-0.008
 a-0.008
 Leo-0.008
 written-0.008
Token she
Token ID673
Feature activation+1.78e-02

Pos logit contributions
 how+0.003
od+0.003
 that+0.003
 written+0.003
 says+0.003
Neg logit contributions
belt-0.011
amas-0.010
uster-0.010
immers-0.010
ippers-0.010
Token dropped
Token ID5710
Feature activation-6.22e-03

Pos logit contributions
belt+0.039
amas+0.039
uster+0.039
ippers+0.038
immers+0.038
Neg logit contributions
 that-0.020
 how-0.019
 a-0.019
's-0.019
 written-0.019
Token her
Token ID607
Feature activation+5.70e-02

Pos logit contributions
od+0.004
 how+0.004
 says+0.004
 written+0.004
 Leo+0.003
Neg logit contributions
belt-0.016
amas-0.016
uster-0.016
ippers-0.016
immers-0.015
Token l
Token ID300
Feature activation+4.65e-02

Pos logit contributions
belt+0.122
amas+0.120
uster+0.120
immers+0.119
ivery+0.117
Neg logit contributions
 that-0.068
 a-0.066
 how-0.066
 says-0.063
 written-0.062
Tokenoll
Token ID692
Feature activation+1.24e-01

Pos logit contributions
belt+0.085
amas+0.084
immers+0.082
uster+0.080
ivery+0.079
Neg logit contributions
 that-0.068
 how-0.066
 written-0.066
 a-0.066
od-0.066
Tokenipop
Token ID42800
Feature activation+2.96e-02

Pos logit contributions
amas+0.191
ivery+0.189
immers+0.188
uster+0.183
belt+0.183
Neg logit contributions
 that-0.227
 how-0.220
 a-0.214
 written-0.207
:-0.207
Token!
Token ID0
Feature activation-8.36e-03

Pos logit contributions
belt+0.072
amas+0.070
uster+0.070
immers+0.069
ippers+0.069
Neg logit contributions
 that-0.027
 how-0.026
 a-0.025
 written-0.025
 Leo-0.024
Token It
Token ID632
Feature activation-5.41e-03

Pos logit contributions
 a+0.007
 that+0.007
 how+0.007
 Leo+0.006
 Ma+0.006
Neg logit contributions
belt-0.021
immers-0.021
amas-0.020
ippers-0.020
uster-0.020
Token flew
Token ID13112
Feature activation+4.12e-02

Pos logit contributions
 how+0.006
od+0.006
 that+0.006
 Leo+0.006
 written+0.006
Neg logit contributions
belt-0.012
amas-0.011
uster-0.011
immers-0.011
ro-0.011
Token out
Token ID503
Feature activation-1.21e-02

Pos logit contributions
belt+0.084
amas+0.081
uster+0.081
ippers+0.079
immers+0.079
Neg logit contributions
 how-0.050
od-0.050
 says-0.049
 that-0.049
 Leo-0.049
Token,
Token ID11
Feature activation+1.16e-02

Pos logit contributions
belt+0.010
amas+0.010
uster+0.010
ippers+0.010
immers+0.010
Neg logit contributions
 how-0.003
 that-0.003
 a-0.003
 written-0.003
 Leo-0.003
Token a
Token ID257
Feature activation-9.27e-03

Pos logit contributions
belt+0.027
amas+0.027
immers+0.026
ippers+0.026
uster+0.026
Neg logit contributions
 that-0.012
 how-0.012
 a-0.012
 written-0.011
 says-0.011
Token sour
Token ID11348
Feature activation+1.40e-02

Pos logit contributions
 that+0.011
 a+0.011
 how+0.011
 written+0.011
 says+0.010
Neg logit contributions
belt-0.020
amas-0.019
immers-0.019
uster-0.019
ippers-0.019
Token lemon
Token ID18873
Feature activation+1.56e-02

Pos logit contributions
belt+0.029
immers+0.028
uster+0.028
amas+0.028
ippers+0.028
Neg logit contributions
 that-0.019
 a-0.018
 how-0.017
 written-0.017
 says-0.017
Token l
Token ID300
Feature activation+3.53e-02

Pos logit contributions
belt+0.036
amas+0.035
uster+0.035
ippers+0.035
immers+0.034
Neg logit contributions
 that-0.016
 how-0.015
 a-0.015
 written-0.015
 says-0.015
Tokenoll
Token ID692
Feature activation+1.24e-01

Pos logit contributions
belt+0.066
amas+0.065
uster+0.063
immers+0.063
ippers+0.062
Neg logit contributions
 that-0.051
 how-0.050
od-0.049
 written-0.049
 a-0.049
Tokenipop
Token ID42800
Feature activation+3.18e-02

Pos logit contributions
amas+0.195
ivery+0.193
uster+0.188
immers+0.186
belt+0.180
Neg logit contributions
 that-0.234
 how-0.224
:-0.214
 a-0.212
 says-0.211
Token.
Token ID13
Feature activation+1.85e-04

Pos logit contributions
belt+0.077
amas+0.074
uster+0.073
ippers+0.072
immers+0.072
Neg logit contributions
 how-0.027
od-0.027
 Leo-0.027
 says-0.026
 Ma-0.026
Token Jimmy
Token ID12963
Feature activation-1.76e-02

Pos logit contributions
belt+0.000
immers+0.000
ivery+0.000
ippers+0.000
igs+0.000
Neg logit contributions
 a-0.000
 how-0.000
 that-0.000
 Mrs-0.000
 Leo-0.000
Token laughed
Token ID13818
Feature activation-1.81e-02

Pos logit contributions
 how+0.013
 that+0.013
 a+0.013
 written+0.012
 says+0.012
Neg logit contributions
belt-0.045
amas-0.044
uster-0.043
ippers-0.043
immers-0.043
Token and
Token ID290
Feature activation+1.46e-02

Pos logit contributions
 how+0.019
 that+0.019
od+0.018
 written+0.018
 says+0.018
Neg logit contributions
belt-0.040
amas-0.040
uster-0.039
ippers-0.039
immers-0.038
Token mom
Token ID1995
Feature activation-3.72e-02

Pos logit contributions
 written+0.003
od+0.003
 how+0.003
 that+0.003
 a+0.003
Neg logit contributions
belt-0.006
amas-0.005
ippers-0.005
immers-0.005
uster-0.005
Token.
Token ID13
Feature activation-1.01e-02

Pos logit contributions
od+0.027
 how+0.027
 written+0.026
 that+0.026
 Leo+0.025
Neg logit contributions
belt-0.093
amas-0.091
uster-0.091
ippers-0.089
immers-0.089
Token\n
Token ID198
Feature activation-7.62e-03

Pos logit contributions
od+0.014
 written+0.013
 how+0.013
 ze+0.012
 Ma+0.012
Neg logit contributions
belt-0.020
amas-0.019
ippers-0.019
ittens-0.019
uster-0.018
Token\n
Token ID198
Feature activation+8.42e-03

Pos logit contributions
 written+0.009
od+0.009
 how+0.008
 Ma+0.008
 Luc+0.008
Neg logit contributions
amas-0.016
belt-0.016
ippers-0.015
uster-0.015
ittens-0.015
TokenMom
Token ID29252
Feature activation-4.54e-02

Pos logit contributions
belt+0.014
amas+0.014
uster+0.014
immers+0.013
ippers+0.013
Neg logit contributions
 that-0.014
 how-0.014
 a-0.013
 Leo-0.013
 says-0.013
Token hears
Token ID23180
Feature activation-8.90e-02

Pos logit contributions
od+0.049
 written+0.047
 Luc+0.046
 Sof+0.043
 how+0.043
Neg logit contributions
amas-0.099
uster-0.094
ippers-0.093
belt-0.092
immers-0.091
Token Sam
Token ID3409
Feature activation-4.96e-02

Pos logit contributions
 written+0.079
od+0.072
's+0.067
 Luc+0.065
 explains+0.061
Neg logit contributions
uster-0.205
ittens-0.203
amas-0.202
ippers-0.198
igs-0.195
Token crying
Token ID13774
Feature activation+9.64e-03

Pos logit contributions
 Luc+0.066
 written+0.065
 Ma+0.065
od+0.065
 Leo+0.061
Neg logit contributions
amas-0.093
ivery-0.088
ivering-0.088
uster-0.087
immers-0.085
Token.
Token ID13
Feature activation-5.61e-03

Pos logit contributions
amas+0.023
belt+0.023
ippers+0.023
uster+0.023
immers+0.023
Neg logit contributions
 written-0.008
 how-0.007
od-0.007
 that-0.007
 Leo-0.007
Token She
Token ID1375
Feature activation+1.62e-02

Pos logit contributions
od+0.009
 written+0.008
 how+0.008
 Ma+0.007
 whose+0.007
Neg logit contributions
belt-0.010
amas-0.009
-0.009
-0.009
ittens-0.009
Token sees
Token ID7224
Feature activation-6.52e-02

Pos logit contributions
amas+0.032
belt+0.031
ippers+0.031
uster+0.031
immers+0.031
Neg logit contributions
od-0.020
 written-0.019
 that-0.018
 Ma-0.018
 Luc-0.018
Token cries
Token ID24691
Feature activation+2.70e-03

Pos logit contributions
od+0.008
 written+0.007
 Luc+0.006
 Steve+0.006
 Ma+0.006
Neg logit contributions
belt-0.006
amas-0.006
ro-0.006
 sleeves-0.006
igs-0.006
Token.
Token ID13
Feature activation-1.32e-02

Pos logit contributions
uster+0.006
belt+0.006
ippers+0.006
amas+0.006
ider+0.006
Neg logit contributions
od-0.002
 written-0.002
 how-0.002
 Ma-0.002
 Luc-0.002
Token\n
Token ID198
Feature activation-3.54e-03

Pos logit contributions
 written+0.017
od+0.016
 how+0.016
 that+0.015
's+0.015
Neg logit contributions
-0.025
amas-0.025
ippers-0.025
ittens-0.024
ro-0.024
Token\n
Token ID198
Feature activation+1.26e-02

Pos logit contributions
 written+0.005
od+0.005
 how+0.004
 Luc+0.004
 that+0.004
Neg logit contributions
ittens-0.007
amas-0.007
ippers-0.007
belt-0.007
uster-0.006
TokenMom
Token ID29252
Feature activation-4.33e-02

Pos logit contributions
belt+0.023
amas+0.022
uster+0.022
ippers+0.022
immers+0.021
Neg logit contributions
 how-0.019
 that-0.019
 written-0.019
od-0.018
 a-0.018
Token hears
Token ID23180
Feature activation-8.84e-02

Pos logit contributions
od+0.043
 Luc+0.042
 written+0.040
 Sof+0.038
 how+0.038
Neg logit contributions
amas-0.098
uster-0.094
belt-0.092
immers-0.092
ippers-0.092
Token the
Token ID262
Feature activation-4.00e-02

Pos logit contributions
 written+0.101
od+0.083
's+0.081
 explains+0.075
opot+0.075
Neg logit contributions
ittens-0.180
uster-0.178
igs-0.174
amas-0.174
ippers-0.172
Token noise
Token ID7838
Feature activation-4.14e-02

Pos logit contributions
 how+0.056
od+0.054
 Ma+0.054
 written+0.053
 Luc+0.053
Neg logit contributions
amas-0.078
belt-0.073
ippers-0.071
immers-0.071
uster-0.071
Token and
Token ID290
Feature activation-1.38e-03

Pos logit contributions
 Ma+0.050
od+0.049
 Luc+0.049
 written+0.049
 Leo+0.048
Neg logit contributions
amas-0.081
belt-0.080
uster-0.079
ippers-0.078
immers-0.078
Token comes
Token ID2058
Feature activation-1.32e-02

Pos logit contributions
od+0.002
 written+0.002
 Luc+0.002
 Ma+0.002
 Sof+0.002
Neg logit contributions
amas-0.002
immers-0.002
belt-0.002
uster-0.002
igs-0.002
Token to
Token ID284
Feature activation-2.98e-02

Pos logit contributions
 how+0.008
 that+0.008
 written+0.008
od+0.007
 says+0.007
Neg logit contributions
belt-0.035
amas-0.035
uster-0.034
ippers-0.034
immers-0.034
Token cry
Token ID3960
Feature activation-4.95e-03

Pos logit contributions
amas+0.027
belt+0.027
ivery+0.026
uster+0.026
immers+0.026
Neg logit contributions
 written-0.024
od-0.024
 Leo-0.022
 that-0.022
 Luc-0.022
Token.
Token ID13
Feature activation-5.78e-03

Pos logit contributions
 how+0.003
 that+0.003
 a+0.003
 written+0.003
 Leo+0.003
Neg logit contributions
belt-0.013
amas-0.013
uster-0.013
ippers-0.013
immers-0.013
Token\n
Token ID198
Feature activation+2.64e-03

Pos logit contributions
 how+0.005
 that+0.005
 a+0.005
 written+0.005
 Leo+0.005
Neg logit contributions
belt-0.014
amas-0.014
uster-0.014
ippers-0.014
immers-0.014
Token\n
Token ID198
Feature activation+1.88e-02

Pos logit contributions
amas+0.006
ittens+0.006
ippers+0.006
belt+0.005
uster+0.005
Neg logit contributions
 written-0.003
od-0.003
 how-0.003
 Luc-0.003
 Ma-0.003
TokenMom
Token ID29252
Feature activation-4.29e-02

Pos logit contributions
belt+0.038
uster+0.036
amas+0.036
immers+0.035
ippers+0.034
Neg logit contributions
 that-0.027
 how-0.026
 a-0.026
 Leo-0.025
's-0.025
Token hears
Token ID23180
Feature activation-8.84e-02

Pos logit contributions
od+0.042
 Luc+0.041
 written+0.041
 how+0.039
 that+0.039
Neg logit contributions
amas-0.096
uster-0.093
belt-0.092
immers-0.091
ippers-0.091
Token the
Token ID262
Feature activation-4.28e-02

Pos logit contributions
 written+0.067
od+0.064
 says+0.060
 Luc+0.060
 how+0.059
Neg logit contributions
amas-0.215
uster-0.214
ippers-0.210
ittens-0.209
belt-0.208
Token noise
Token ID7838
Feature activation-3.93e-02

Pos logit contributions
 how+0.057
od+0.057
 written+0.056
 Leo+0.056
 Ma+0.055
Neg logit contributions
amas-0.084
belt-0.080
uster-0.078
ippers-0.077
ivery-0.077
Token and
Token ID290
Feature activation-3.30e-03

Pos logit contributions
 Ma+0.048
 written+0.048
od+0.048
 Luc+0.048
 Leo+0.048
Neg logit contributions
amas-0.076
belt-0.075
uster-0.074
ippers-0.073
ittens-0.072
Token comes
Token ID2058
Feature activation-1.73e-02

Pos logit contributions
od+0.004
 written+0.004
 Luc+0.004
 Ma+0.004
 Mark+0.004
Neg logit contributions
belt-0.006
amas-0.006
uster-0.006
immers-0.006
ippers-0.006
Token to
Token ID284
Feature activation-3.13e-02

Pos logit contributions
 how+0.010
 that+0.009
 a+0.009
 Leo+0.009
 says+0.008
Neg logit contributions
belt-0.048
amas-0.047
uster-0.047
ippers-0.046
immers-0.046
Token too
Token ID1165
Feature activation-1.93e-02

Pos logit contributions
amas+0.018
belt+0.018
ippers+0.018
immers+0.018
ider+0.018
Neg logit contributions
 written-0.006
od-0.006
 Ma-0.006
 Luc-0.006
 Mark-0.006
Token.
Token ID13
Feature activation-8.87e-03

Pos logit contributions
 Ma+0.017
 how+0.017
od+0.017
 Luc+0.016
 written+0.016
Neg logit contributions
amas-0.045
belt-0.044
ivery-0.044
ippers-0.044
ittens-0.044
Token\n
Token ID198
Feature activation-3.45e-03

Pos logit contributions
od+0.010
 written+0.010
 how+0.010
's+0.010
 that+0.009
Neg logit contributions
amas-0.018
-0.018
belt-0.018
ippers-0.017
ittens-0.017
Token\n
Token ID198
Feature activation+1.11e-02

Pos logit contributions
 written+0.004
od+0.004
 how+0.004
 Luc+0.004
 Ma+0.004
Neg logit contributions
amas-0.007
ittens-0.007
ippers-0.007
belt-0.006
uster-0.006
TokenMom
Token ID29252
Feature activation-4.32e-02

Pos logit contributions
belt+0.018
amas+0.018
ippers+0.017
uster+0.017
ittens+0.017
Neg logit contributions
 written-0.018
 how-0.018
od-0.018
 that-0.017
 says-0.017
Token hears
Token ID23180
Feature activation-8.77e-02

Pos logit contributions
 Luc+0.046
od+0.046
 Sof+0.044
 written+0.044
 Ma+0.041
Neg logit contributions
amas-0.094
immers-0.087
uster-0.087
acking-0.087
ippers-0.086
Token the
Token ID262
Feature activation-4.29e-02

Pos logit contributions
 written+0.104
od+0.095
's+0.090
 Luc+0.087
 explains+0.084
Neg logit contributions
ittens-0.171
uster-0.171
amas-0.171
-0.163
ippers-0.163
Token noise
Token ID7838
Feature activation-3.94e-02

Pos logit contributions
 how+0.062
od+0.062
 Ma+0.060
 written+0.060
 Luc+0.060
Neg logit contributions
amas-0.080
belt-0.074
ippers-0.073
immers-0.072
ivery-0.072
Token and
Token ID290
Feature activation-1.78e-03

Pos logit contributions
 Ma+0.049
 Luc+0.048
 written+0.048
od+0.048
 Leo+0.047
Neg logit contributions
amas-0.075
uster-0.073
belt-0.073
ippers-0.073
ider-0.072
Token comes
Token ID2058
Feature activation-1.41e-02

Pos logit contributions
od+0.003
 written+0.003
 Luc+0.002
 Ma+0.002
 Sof+0.002
Neg logit contributions
amas-0.003
belt-0.003
immers-0.003
uster-0.003
igs-0.003
Token to
Token ID284
Feature activation-3.01e-02

Pos logit contributions
 how+0.009
 that+0.009
 written+0.009
od+0.009
 says+0.009
Neg logit contributions
belt-0.037
amas-0.036
uster-0.036
ippers-0.036
immers-0.036
Token!"
Token ID2474
Feature activation-6.66e-03

Pos logit contributions
 how+0.013
od+0.013
 that+0.013
 written+0.013
 Leo+0.013
Neg logit contributions
belt-0.029
amas-0.028
uster-0.027
immers-0.027
ippers-0.027
Token\n
Token ID198
Feature activation-1.62e-03

Pos logit contributions
 that+0.005
 a+0.005
 how+0.005
 says+0.005
 Leo+0.005
Neg logit contributions
belt-0.017
amas-0.017
immers-0.017
uster-0.017
ippers-0.017
Token\n
Token ID198
Feature activation+1.34e-02

Pos logit contributions
 how+0.001
 that+0.001
 written+0.001
od+0.001
 Leo+0.001
Neg logit contributions
belt-0.004
amas-0.004
uster-0.004
ippers-0.004
immers-0.004
TokenA
Token ID32
Feature activation-1.62e-02

Pos logit contributions
belt+0.034
amas+0.033
immers+0.033
uster+0.032
ippers+0.032
Neg logit contributions
 a-0.013
 that-0.013
 how-0.011
 the-0.011
's-0.011
Token man
Token ID582
Feature activation-3.80e-02

Pos logit contributions
 that+0.020
 how+0.020
 a+0.020
 says+0.019
 Leo+0.019
Neg logit contributions
belt-0.033
amas-0.033
uster-0.032
immers-0.032
ippers-0.032
Token hears
Token ID23180
Feature activation-8.74e-02

Pos logit contributions
od+0.047
 Ma+0.046
 Luc+0.045
 written+0.043
 Sof+0.043
Neg logit contributions
belt-0.072
ider-0.072
ippers-0.068
amas-0.068
immers-0.068
Token them
Token ID606
Feature activation-2.59e-02

Pos logit contributions
 written+0.081
od+0.078
 how+0.075
 Ma+0.073
 Luc+0.073
Neg logit contributions
amas-0.198
uster-0.197
belt-0.193
ippers-0.192
immers-0.191
Token and
Token ID290
Feature activation-2.06e-03

Pos logit contributions
 Ma+0.034
 written+0.033
 Luc+0.033
od+0.033
 Sof+0.031
Neg logit contributions
amas-0.048
uster-0.046
immers-0.045
ivery-0.045
ippers-0.045
Token comes
Token ID2058
Feature activation-1.64e-02

Pos logit contributions
od+0.003
 written+0.003
 Luc+0.003
 Ma+0.003
 about+0.003
Neg logit contributions
belt-0.003
amas-0.003
uster-0.003
immers-0.003
igs-0.003
Token to
Token ID284
Feature activation-3.20e-02

Pos logit contributions
 written+0.015
od+0.014
 explains+0.013
 says+0.013
 that+0.012
Neg logit contributions
uster-0.038
belt-0.037
immers-0.037
ippers-0.037
amas-0.037
Token the
Token ID262
Feature activation-4.60e-02

Pos logit contributions
od+0.022
 written+0.022
 how+0.020
 that+0.020
's+0.019
Neg logit contributions
belt-0.079
uster-0.079
amas-0.078
immers-0.078
ippers-0.077
Token floor
Token ID4314
Feature activation-2.10e-02

Pos logit contributions
 that+0.058
 how+0.057
 a+0.057
 says+0.056
 written+0.056
Neg logit contributions
belt-0.043
amas-0.041
uster-0.041
ippers-0.041
immers-0.040
Token.
Token ID13
Feature activation-1.64e-02

Pos logit contributions
 how+0.014
 Leo+0.013
od+0.013
 that+0.013
 written+0.013
Neg logit contributions
belt-0.055
amas-0.053
ippers-0.052
uster-0.052
immers-0.052
Token\n
Token ID198
Feature activation-9.66e-03

Pos logit contributions
 that+0.018
 a+0.018
 how+0.018
 Mark+0.017
 says+0.017
Neg logit contributions
belt-0.037
immers-0.035
amas-0.035
uster-0.035
ippers-0.035
Token\n
Token ID198
Feature activation+6.93e-03

Pos logit contributions
 written+0.010
 how+0.010
od+0.009
 that+0.009
 Leo+0.009
Neg logit contributions
belt-0.021
amas-0.021
uster-0.021
ippers-0.021
ittens-0.021
TokenMom
Token ID29252
Feature activation-4.65e-02

Pos logit contributions
belt+0.013
amas+0.013
immers+0.013
uster+0.012
ippers+0.012
Neg logit contributions
 that-0.011
 a-0.010
 how-0.010
 Mark-0.010
:-0.010
Token hears
Token ID23180
Feature activation-8.73e-02

Pos logit contributions
od+0.053
 Luc+0.052
 Sof+0.051
 written+0.049
 Ma+0.047
Neg logit contributions
amas-0.097
uster-0.090
acking-0.089
immers-0.088
ippers-0.088
Token the
Token ID262
Feature activation-4.23e-02

Pos logit contributions
 written+0.078
od+0.071
's+0.071
 says+0.068
 explains+0.066
Neg logit contributions
amas-0.199
uster-0.199
ittens-0.197
ippers-0.190
angers-0.189
Token noise
Token ID7838
Feature activation-4.19e-02

Pos logit contributions
 how+0.051
 that+0.050
od+0.049
 written+0.049
 Leo+0.049
Neg logit contributions
belt-0.088
amas-0.087
uster-0.085
ippers-0.084
immers-0.084
Token.
Token ID13
Feature activation-8.86e-03

Pos logit contributions
 Ma+0.045
 Luc+0.044
od+0.043
 written+0.043
 Leo+0.043
Neg logit contributions
amas-0.088
belt-0.086
uster-0.085
ippers-0.084
immers-0.084
Token She
Token ID1375
Feature activation+8.49e-03

Pos logit contributions
od+0.012
 how+0.012
 written+0.012
 Leo+0.012
 whose+0.011
Neg logit contributions
amas-0.016
belt-0.016
-0.015
angers-0.015
ippers-0.015
Token comes
Token ID2058
Feature activation-1.91e-02

Pos logit contributions
amas+0.015
belt+0.015
uster+0.015
agons+0.015
ippers+0.015
Neg logit contributions
od-0.011
 written-0.011
 Ma-0.010
 that-0.010
 Luc-0.010
Token mess
Token ID2085
Feature activation+1.34e-02

Pos logit contributions
 how+0.050
 that+0.049
 written+0.048
od+0.048
 Leo+0.048
Neg logit contributions
belt-0.058
amas-0.057
uster-0.055
ippers-0.055
immers-0.055
Token.
Token ID13
Feature activation-1.43e-02

Pos logit contributions
belt+0.038
amas+0.037
uster+0.037
ippers+0.037
immers+0.036
Neg logit contributions
 that-0.007
 how-0.007
 a-0.006
od-0.006
 written-0.006
Token
Token ID198
Feature activation-5.52e-03

Pos logit contributions
 written+0.018
 how+0.018
 that+0.018
od+0.017
's+0.017
Neg logit contributions
amas-0.027
ippers-0.027
-0.027
belt-0.026
angers-0.026
Token
Token ID198
Feature activation+1.11e-02

Pos logit contributions
 written+0.007
od+0.007
 how+0.007
 Luc+0.006
 Ma+0.006
Neg logit contributions
ittens-0.010
ippers-0.010
amas-0.010
-0.010
belt-0.010
TokenMom
Token ID29252
Feature activation-4.35e-02

Pos logit contributions
belt+0.020
amas+0.019
uster+0.019
ippers+0.019
immers+0.019
Neg logit contributions
 how-0.017
 that-0.016
 written-0.016
od-0.016
 a-0.016
Token hears
Token ID23180
Feature activation-8.73e-02

Pos logit contributions
od+0.045
 Luc+0.044
 written+0.042
 Sof+0.041
 how+0.040
Neg logit contributions
amas-0.095
acking-0.089
uster-0.089
immers-0.088
ippers-0.088
Token the
Token ID262
Feature activation-4.03e-02

Pos logit contributions
 written+0.117
od+0.099
 Luc+0.099
's+0.099
 explains+0.095
Neg logit contributions
uster-0.157
amas-0.156
ippers-0.155
ittens-0.154
immers-0.151
Token noise
Token ID7838
Feature activation-4.10e-02

Pos logit contributions
 how+0.049
 that+0.048
od+0.048
 written+0.047
 Leo+0.047
Neg logit contributions
belt-0.083
amas-0.083
uster-0.080
ippers-0.080
immers-0.079
Token and
Token ID290
Feature activation-1.74e-03

Pos logit contributions
 Ma+0.054
 Luc+0.054
 written+0.053
od+0.053
 Leo+0.051
Neg logit contributions
amas-0.075
belt-0.073
uster-0.073
ippers-0.072
immers-0.072
Token comes
Token ID2058
Feature activation-1.27e-02

Pos logit contributions
 written+0.002
od+0.002
 Luc+0.002
 whose+0.002
 Ma+0.002
Neg logit contributions
amas-0.003
immers-0.003
uster-0.003
belt-0.003
ippers-0.003
Token back
Token ID736
Feature activation-1.59e-02

Pos logit contributions
 how+0.007
 that+0.007
 written+0.007
od+0.007
 says+0.007
Neg logit contributions
belt-0.034
amas-0.033
uster-0.033
ippers-0.033
immers-0.033
Token who
Token ID508
Feature activation+9.03e-03

Pos logit contributions
od+0.052
 Luc+0.050
 written+0.049
 how+0.049
 Ma+0.049
Neg logit contributions
belt-0.072
ider-0.072
ippers-0.069
amas-0.068
ivering-0.067
Token works
Token ID2499
Feature activation-3.28e-04

Pos logit contributions
belt+0.018
amas+0.017
ital+0.017
+0.017
ippers+0.016
Neg logit contributions
od-0.011
 Ma-0.010
 Luc-0.010
 how-0.010
 written-0.010
Token in
Token ID287
Feature activation-1.14e-02

Pos logit contributions
 how+0.000
 that+0.000
 a+0.000
 written+0.000
 says+0.000
Neg logit contributions
belt-0.001
amas-0.001
uster-0.001
ippers-0.001
immers-0.001
Token the
Token ID262
Feature activation-9.70e-03

Pos logit contributions
 how+0.003
 that+0.003
 a+0.003
 written+0.003
 Leo+0.003
Neg logit contributions
belt-0.034
amas-0.034
uster-0.034
ippers-0.033
immers-0.033
Token park
Token ID3952
Feature activation-1.41e-03

Pos logit contributions
 that+0.014
 how+0.014
 a+0.014
 says+0.013
 written+0.013
Neg logit contributions
belt-0.018
amas-0.018
uster-0.018
immers-0.017
ippers-0.017
Token hears
Token ID23180
Feature activation-8.72e-02

Pos logit contributions
 how+0.001
od+0.001
 written+0.001
 Leo+0.001
 that+0.001
Neg logit contributions
belt-0.004
amas-0.003
ippers-0.003
immers-0.003
ider-0.003
Token them
Token ID606
Feature activation-2.90e-02

Pos logit contributions
 written+0.084
od+0.077
 explains+0.074
's+0.071
 Luc+0.069
Neg logit contributions
ittens-0.185
amas-0.184
uster-0.181
igs-0.177
ippers-0.177
Token.
Token ID13
Feature activation-8.16e-03

Pos logit contributions
 Leo+0.026
 written+0.026
 Ma+0.026
od+0.026
 Luc+0.026
Neg logit contributions
amas-0.065
ivery-0.063
belt-0.061
ippers-0.061
immers-0.061
Token She
Token ID1375
Feature activation+1.14e-02

Pos logit contributions
od+0.013
 written+0.013
 how+0.012
's+0.012
 Leo+0.011
Neg logit contributions
belt-0.013
ro-0.012
-0.012
amas-0.012
angers-0.012
Token runs
Token ID4539
Feature activation+1.68e-02

Pos logit contributions
belt+0.024
amas+0.023
ippers+0.023
immers+0.023
uster+0.022
Neg logit contributions
od-0.013
 written-0.012
 that-0.011
 Ma-0.011
 Luc-0.011
Token to
Token ID284
Feature activation-3.70e-02

Pos logit contributions
belt+0.047
amas+0.046
uster+0.045
ippers+0.045
immers+0.045
Neg logit contributions
 that-0.009
 how-0.009
 a-0.009
 Leo-0.008
 written-0.008
Token rooms
Token ID9519
Feature activation-1.70e-02

Pos logit contributions
 written+0.020
od+0.020
 how+0.019
 Ma+0.019
 explains+0.019
Neg logit contributions
belt-0.016
amas-0.016
immers-0.015
ippers-0.015
angers-0.015
Token.
Token ID13
Feature activation-1.33e-02

Pos logit contributions
 how+0.011
 that+0.011
 Leo+0.010
 written+0.010
od+0.010
Neg logit contributions
belt-0.045
amas-0.044
uster-0.043
ippers-0.043
immers-0.043
Token
Token ID198
Feature activation-2.86e-03

Pos logit contributions
 that+0.014
 how+0.014
 a+0.014
 Leo+0.014
 says+0.013
Neg logit contributions
belt-0.030
amas-0.029
uster-0.029
immers-0.029
ippers-0.029
Token
Token ID198
Feature activation+1.41e-02

Pos logit contributions
 how+0.003
 written+0.002
 that+0.002
od+0.002
 Leo+0.002
Neg logit contributions
belt-0.007
amas-0.007
uster-0.007
ippers-0.007
immers-0.007
TokenMom
Token ID29252
Feature activation-4.17e-02

Pos logit contributions
amas+0.033
belt+0.032
immers+0.032
uster+0.031
angers+0.031
Neg logit contributions
 that-0.017
 a-0.017
 how-0.015
 the-0.015
's-0.015
Token hears
Token ID23180
Feature activation-8.71e-02

Pos logit contributions
od+0.039
 Luc+0.038
 written+0.037
 how+0.035
 Sof+0.035
Neg logit contributions
amas-0.096
belt-0.092
uster-0.092
immers-0.090
ippers-0.090
Token the
Token ID262
Feature activation-4.18e-02

Pos logit contributions
 written+0.121
od+0.115
 Luc+0.111
's+0.110
 Mark+0.109
Neg logit contributions
uster-0.159
amas-0.155
ittens-0.153
ippers-0.153
immers-0.149
Token noise
Token ID7838
Feature activation-3.88e-02

Pos logit contributions
 how+0.060
od+0.058
 Leo+0.058
 Luc+0.057
 Ma+0.057
Neg logit contributions
amas-0.079
belt-0.074
ippers-0.071
uster-0.071
immers-0.071
Token and
Token ID290
Feature activation-2.85e-03

Pos logit contributions
 Ma+0.048
od+0.048
 written+0.047
 Luc+0.047
 Leo+0.047
Neg logit contributions
amas-0.075
belt-0.075
uster-0.074
ippers-0.072
immers-0.072
Token comes
Token ID2058
Feature activation-1.74e-02

Pos logit contributions
od+0.004
 written+0.004
 Luc+0.003
 Ma+0.003
 that+0.003
Neg logit contributions
uster-0.005
belt-0.005
amas-0.005
immers-0.005
ippers-0.005
Token to
Token ID284
Feature activation-3.24e-02

Pos logit contributions
 how+0.009
 that+0.009
 a+0.009
 written+0.009
 Leo+0.008
Neg logit contributions
belt-0.048
amas-0.047
uster-0.047
ippers-0.046
immers-0.046
Tokenuffy
Token ID15352
Feature activation-2.21e-02

Pos logit contributions
belt+0.032
uster+0.029
amas+0.028
immers+0.027
ital+0.026
Neg logit contributions
 Leo-0.059
 that-0.059
 says-0.058
od-0.058
 a-0.058
Token."
Token ID526
Feature activation-1.05e-02

Pos logit contributions
 that+0.016
 how+0.016
 a+0.015
 says+0.015
 Leo+0.015
Neg logit contributions
belt-0.058
amas-0.056
uster-0.056
ippers-0.056
immers-0.055
Token
Token ID198
Feature activation-2.66e-03

Pos logit contributions
 written+0.010
 how+0.010
od+0.010
 that+0.009
 a+0.009
Neg logit contributions
belt-0.024
amas-0.024
ippers-0.024
uster-0.024
ittens-0.023
Token
Token ID198
Feature activation+1.51e-02

Pos logit contributions
 written+0.003
od+0.003
 how+0.003
 Ma+0.003
 Luc+0.003
Neg logit contributions
amas-0.006
belt-0.006
ippers-0.006
ittens-0.006
uster-0.006
TokenMom
Token ID29252
Feature activation-4.25e-02

Pos logit contributions
belt+0.032
amas+0.031
uster+0.031
immers+0.030
ippers+0.030
Neg logit contributions
 that-0.020
 how-0.019
 a-0.019
 Leo-0.019
 says-0.018
Token hears
Token ID23180
Feature activation-8.71e-02

Pos logit contributions
od+0.048
 Luc+0.046
 written+0.045
 Sof+0.043
 Steve+0.041
Neg logit contributions
amas-0.089
aj-0.085
uster-0.085
ippers-0.084
immers-0.084
Token the
Token ID262
Feature activation-3.88e-02

Pos logit contributions
 written+0.131
od+0.121
 Luc+0.117
's+0.116
 says+0.113
Neg logit contributions
uster-0.151
ittens-0.146
amas-0.146
ippers-0.145
-0.140
Token noise
Token ID7838
Feature activation-4.05e-02

Pos logit contributions
 how+0.052
od+0.050
 written+0.049
 that+0.049
 Leo+0.048
Neg logit contributions
amas-0.076
belt-0.075
ippers-0.073
uster-0.072
immers-0.072
Token and
Token ID290
Feature activation-1.12e-03

Pos logit contributions
 Ma+0.044
 Luc+0.043
od+0.043
 written+0.043
 Leo+0.042
Neg logit contributions
amas-0.084
belt-0.083
ippers-0.082
uster-0.082
ider-0.081
Token comes
Token ID2058
Feature activation-9.77e-03

Pos logit contributions
od+0.002
 written+0.002
 Luc+0.001
 Ma+0.001
 Mark+0.001
Neg logit contributions
uster-0.002
amas-0.002
immers-0.002
ippers-0.002
igs-0.002
Token to
Token ID284
Feature activation-2.79e-02

Pos logit contributions
 how+0.005
 that+0.005
 written+0.005
od+0.005
 a+0.005
Neg logit contributions
belt-0.027
amas-0.026
uster-0.026
ippers-0.026
immers-0.026
Token cries
Token ID24691
Feature activation+8.36e-03

Pos logit contributions
igs+0.014
irmed+0.014
ro+0.014
amas+0.014
agons+0.013
Neg logit contributions
od-0.017
 Mary-0.016
 Ma-0.016
 written-0.016
 Luc-0.016
Token.
Token ID13
Feature activation-1.54e-02

Pos logit contributions
uster+0.020
amas+0.020
belt+0.020
ippers+0.020
immers+0.019
Neg logit contributions
od-0.007
 written-0.007
 how-0.007
 Ma-0.006
 Leo-0.006
Token
Token ID198
Feature activation-7.55e-03

Pos logit contributions
 written+0.018
 how+0.018
od+0.018
's+0.017
 that+0.017
Neg logit contributions
amas-0.032
belt-0.031
ippers-0.031
-0.031
ittens-0.030
Token
Token ID198
Feature activation+8.91e-03

Pos logit contributions
 written+0.009
od+0.009
 how+0.008
's+0.008
 Ma+0.008
Neg logit contributions
belt-0.015
amas-0.015
ittens-0.015
ippers-0.015
uster-0.015
TokenMom
Token ID29252
Feature activation-4.60e-02

Pos logit contributions
belt+0.016
amas+0.016
uster+0.016
immers+0.016
ippers+0.016
Neg logit contributions
 that-0.014
 a-0.014
 how-0.013
 Leo-0.013
 Mark-0.013
Token hears
Token ID23180
Feature activation-8.71e-02

Pos logit contributions
od+0.055
 Luc+0.053
 written+0.051
 Sof+0.050
 Ma+0.049
Neg logit contributions
amas-0.094
acking-0.088
uster-0.088
agons-0.087
igs-0.087
Token the
Token ID262
Feature activation-4.26e-02

Pos logit contributions
 written+0.080
od+0.073
's+0.068
 Luc+0.066
 explains+0.065
Neg logit contributions
amas-0.196
ittens-0.196
uster-0.196
ippers-0.189
igs-0.187
Token noise
Token ID7838
Feature activation-4.03e-02

Pos logit contributions
 that+0.048
 how+0.046
 a+0.046
 says+0.045
 written+0.044
Neg logit contributions
belt-0.096
uster-0.092
amas-0.092
ippers-0.091
immers-0.091
Token.
Token ID13
Feature activation-4.87e-03

Pos logit contributions
 Ma+0.037
 Luc+0.036
od+0.036
 written+0.035
 Leo+0.034
Neg logit contributions
amas-0.091
belt-0.089
uster-0.089
ippers-0.088
immers-0.088
Token She
Token ID1375
Feature activation+1.16e-02

Pos logit contributions
od+0.008
 written+0.008
 how+0.007
 whose+0.007
 ze+0.007
Neg logit contributions
amas-0.008
-0.008
belt-0.007
ittens-0.007
ro-0.007
Token comes
Token ID2058
Feature activation-1.69e-02

Pos logit contributions
amas+0.020
agons+0.020
igs+0.020
ippers+0.020
immers+0.020
Neg logit contributions
od-0.016
 written-0.016
 Ma-0.014
 Luc-0.014
 how-0.014
Token!"
Token ID2474
Feature activation-5.42e-03

Pos logit contributions
 how+0.000
 that+0.000
 written+0.000
od+0.000
 Leo+0.000
Neg logit contributions
belt-0.001
amas-0.001
uster-0.001
ippers-0.001
immers-0.001
Token
Token ID198
Feature activation+9.53e-03

Pos logit contributions
od+0.007
 written+0.006
 how+0.006
 ze+0.005
 Luc+0.005
Neg logit contributions
amas-0.011
ittens-0.011
belt-0.011
-0.011
ippers-0.010
Token
Token ID198
Feature activation+2.65e-02

Pos logit contributions
belt+0.024
amas+0.023
uster+0.023
ippers+0.023
immers+0.023
Neg logit contributions
 how-0.008
 that-0.008
 a-0.007
 written-0.007
 Leo-0.007
TokenMom
Token ID29252
Feature activation-3.86e-02

Pos logit contributions
belt+0.051
uster+0.050
immers+0.050
amas+0.050
ippers+0.048
Neg logit contributions
 that-0.039
 how-0.039
 a-0.038
 Leo-0.037
 Mark-0.037
Tokenmy
Token ID1820
Feature activation-1.88e-02

Pos logit contributions
od+0.040
 written+0.039
 Luc+0.039
 Sof+0.038
 that+0.037
Neg logit contributions
amas-0.084
uster-0.079
ider-0.079
immers-0.078
belt-0.078
Token hears
Token ID23180
Feature activation-8.71e-02

Pos logit contributions
od+0.023
 written+0.022
 Luc+0.022
 Sof+0.021
 Ma+0.020
Neg logit contributions
amas-0.038
ider-0.035
immers-0.035
belt-0.034
ippers-0.034
Token Mia
Token ID32189
Feature activation-3.73e-02

Pos logit contributions
 written+0.083
od+0.073
's+0.070
 explains+0.065
 Luc+0.065
Neg logit contributions
amas-0.192
uster-0.191
ittens-0.191
-0.184
igs-0.181
Token crying
Token ID13774
Feature activation+1.22e-02

Pos logit contributions
 Luc+0.046
od+0.044
 Ma+0.044
 written+0.043
 Leo+0.043
Neg logit contributions
amas-0.072
ivering-0.067
ivery-0.067
uster-0.067
ippers-0.066
Token.
Token ID13
Feature activation+5.50e-04

Pos logit contributions
amas+0.030
belt+0.030
ippers+0.029
uster+0.029
immers+0.029
Neg logit contributions
 how-0.009
 written-0.009
 Leo-0.009
 that-0.009
od-0.009
Token She
Token ID1375
Feature activation+1.84e-02

Pos logit contributions
amas+0.001
belt+0.001
+0.001
ittens+0.001
ippers+0.001
Neg logit contributions
od-0.001
 written-0.001
 how-0.001
 Ma-0.001
 that-0.001
Token comes
Token ID2058
Feature activation-1.98e-02

Pos logit contributions
amas+0.037
belt+0.036
uster+0.036
immers+0.035
ippers+0.035
Neg logit contributions
od-0.022
 written-0.021
 that-0.021
 Leo-0.020
 how-0.020
Token doll
Token ID3654
Feature activation-7.89e-03

Pos logit contributions
uster+0.049
belt+0.047
amas+0.046
ivery+0.045
ippers+0.044
Neg logit contributions
 that-0.027
 a-0.027
 how-0.026
 says-0.026
's-0.024
Token!"
Token ID2474
Feature activation-4.67e-03

Pos logit contributions
 how+0.008
 written+0.008
 Leo+0.008
od+0.008
 Ma+0.008
Neg logit contributions
belt-0.018
amas-0.018
immers-0.017
ippers-0.017
uster-0.017
Token
Token ID198
Feature activation+2.55e-03

Pos logit contributions
 how+0.004
 written+0.004
 that+0.004
od+0.004
 Leo+0.004
Neg logit contributions
belt-0.011
amas-0.011
uster-0.011
ippers-0.011
immers-0.011
Token
Token ID198
Feature activation+1.94e-02

Pos logit contributions
belt+0.006
amas+0.006
ippers+0.006
uster+0.006
immers+0.005
Neg logit contributions
 written-0.002
 how-0.002
od-0.002
 that-0.002
 Leo-0.002
TokenMom
Token ID29252
Feature activation-4.07e-02

Pos logit contributions
belt+0.041
amas+0.040
uster+0.038
immers+0.037
ippers+0.037
Neg logit contributions
 that-0.026
 a-0.026
 how-0.025
's-0.024
 Leo-0.024
Token hears
Token ID23180
Feature activation-8.71e-02

Pos logit contributions
 Luc+0.040
od+0.040
 written+0.039
 Sof+0.037
 how+0.036
Neg logit contributions
amas-0.091
uster-0.087
immers-0.086
ippers-0.085
belt-0.085
Token them
Token ID606
Feature activation-2.39e-02

Pos logit contributions
 written+0.097
's+0.087
od+0.084
 explains+0.081
 Luc+0.081
Neg logit contributions
uster-0.182
ippers-0.176
ittens-0.174
amas-0.173
angers-0.173
Token and
Token ID290
Feature activation-9.18e-04

Pos logit contributions
 Ma+0.032
 Luc+0.032
 written+0.031
od+0.031
 Leo+0.031
Neg logit contributions
amas-0.045
ivery-0.042
uster-0.042
ippers-0.041
immers-0.041
Token comes
Token ID2058
Feature activation-1.91e-02

Pos logit contributions
 written+0.001
od+0.001
 Luc+0.001
 Ma+0.001
 Mark+0.001
Neg logit contributions
immers-0.001
uster-0.001
ippers-0.001
amas-0.001
belt-0.001
Token to
Token ID284
Feature activation-3.18e-02

Pos logit contributions
 written+0.012
od+0.012
 that+0.011
 how+0.011
 says+0.011
Neg logit contributions
belt-0.050
amas-0.049
uster-0.049
ippers-0.049
immers-0.049
Token the
Token ID262
Feature activation-5.30e-02

Pos logit contributions
 written+0.021
od+0.021
 that+0.020
 how+0.020
 says+0.019
Neg logit contributions
belt-0.081
amas-0.080
uster-0.080
ippers-0.079
immers-0.079
Token.
Token ID13
Feature activation-7.45e-03

Pos logit contributions
belt+0.028
amas+0.028
ippers+0.027
ider+0.027
immers+0.027
Neg logit contributions
 written-0.009
od-0.009
 how-0.009
 that-0.008
 Ma-0.008
Token
Token ID198
Feature activation-9.10e-04

Pos logit contributions
od+0.008
 written+0.008
 how+0.008
 that+0.008
's+0.007
Neg logit contributions
belt-0.016
amas-0.016
ippers-0.015
-0.015
uster-0.015
Token
Token ID198
Feature activation+1.50e-02

Pos logit contributions
 written+0.001
od+0.001
 how+0.001
 that+0.001
 Luc+0.001
Neg logit contributions
amas-0.002
belt-0.002
ippers-0.002
uster-0.002
ittens-0.002
TokenTheir
Token ID14574
Feature activation+1.61e-02

Pos logit contributions
belt+0.026
amas+0.025
uster+0.025
ippers+0.025
immers+0.024
Neg logit contributions
 how-0.024
 that-0.024
 a-0.023
 written-0.023
od-0.023
Token mom
Token ID1995
Feature activation-4.03e-02

Pos logit contributions
belt+0.033
amas+0.032
ippers+0.031
immers+0.031
ider+0.031
Neg logit contributions
 written-0.020
 how-0.020
od-0.019
 that-0.019
 Ma-0.019
Token hears
Token ID23180
Feature activation-8.70e-02

Pos logit contributions
 Luc+0.040
od+0.040
 written+0.040
 Sof+0.037
 Ma+0.036
Neg logit contributions
amas-0.089
immers-0.084
belt-0.083
uster-0.083
ider-0.083
Token the
Token ID262
Feature activation-3.96e-02

Pos logit contributions
 written+0.115
od+0.106
's+0.103
 Luc+0.102
 says+0.097
Neg logit contributions
uster-0.162
amas-0.160
ittens-0.160
ippers-0.154
immers-0.152
Token noise
Token ID7838
Feature activation-4.03e-02

Pos logit contributions
od+0.058
 how+0.058
 Ma+0.057
 written+0.057
 Luc+0.056
Neg logit contributions
amas-0.073
belt-0.067
ivery-0.066
ider-0.066
immers-0.065
Token and
Token ID290
Feature activation-2.87e-03

Pos logit contributions
 Ma+0.047
 written+0.047
od+0.047
 Luc+0.047
 Leo+0.046
Neg logit contributions
amas-0.080
belt-0.079
uster-0.078
ippers-0.077
immers-0.077
Token comes
Token ID2058
Feature activation-1.39e-02

Pos logit contributions
 written+0.004
od+0.004
 Luc+0.004
 Ma+0.004
 Sof+0.003
Neg logit contributions
amas-0.005
immers-0.005
belt-0.005
uster-0.005
ippers-0.005
Token to
Token ID284
Feature activation-2.96e-02

Pos logit contributions
 how+0.011
 that+0.011
 written+0.011
od+0.010
 says+0.010
Neg logit contributions
belt-0.035
amas-0.034
uster-0.034
ippers-0.033
immers-0.033
Token Lily
Token ID20037
Feature activation-2.12e-02

Pos logit contributions
od+0.007
 written+0.006
 how+0.006
 explains+0.006
 Ma+0.006
Neg logit contributions
amas-0.013
uster-0.012
belt-0.012
ippers-0.012
immers-0.012
Token!"
Token ID2474
Feature activation-2.05e-03

Pos logit contributions
 how+0.020
 that+0.020
 written+0.020
od+0.019
 Leo+0.019
Neg logit contributions
belt-0.049
amas-0.048
uster-0.047
ippers-0.047
immers-0.047
Token
Token ID198
Feature activation+6.83e-03

Pos logit contributions
 how+0.002
 that+0.002
 written+0.002
od+0.002
 a+0.002
Neg logit contributions
belt-0.005
amas-0.005
uster-0.005
ippers-0.005
immers-0.005
Token
Token ID198
Feature activation+2.18e-02

Pos logit contributions
belt+0.016
amas+0.016
uster+0.016
ippers+0.016
ittens+0.015
Neg logit contributions
 written-0.006
 how-0.006
od-0.006
 that-0.006
 Leo-0.006
TokenMom
Token ID29252
Feature activation-3.91e-02

Pos logit contributions
belt+0.046
uster+0.045
immers+0.044
amas+0.044
angers+0.043
Neg logit contributions
 that-0.030
 a-0.030
 Leo-0.028
 the-0.028
 how-0.028
Token hears
Token ID23180
Feature activation-8.69e-02

Pos logit contributions
od+0.045
 written+0.044
 Luc+0.042
 Sof+0.041
 how+0.040
Neg logit contributions
amas-0.083
immers-0.078
uster-0.077
ippers-0.077
-0.076
Token Sara
Token ID24799
Feature activation-3.12e-02

Pos logit contributions
 written+0.091
od+0.082
's+0.078
 explains+0.075
 Luc+0.074
Neg logit contributions
uster-0.185
amas-0.184
ittens-0.184
-0.178
ippers-0.178
Token and
Token ID290
Feature activation+6.94e-03

Pos logit contributions
 written+0.035
 Luc+0.033
 Ma+0.033
 how+0.032
od+0.032
Neg logit contributions
amas-0.066
uster-0.062
immers-0.062
ippers-0.062
ivering-0.061
Token runs
Token ID4539
Feature activation+2.11e-02

Pos logit contributions
amas+0.012
uster+0.012
immers+0.012
ippers+0.012
belt+0.012
Neg logit contributions
od-0.010
 written-0.010
 Luc-0.008
 Steve-0.008
 Ma-0.008
Token to
Token ID284
Feature activation-3.20e-02

Pos logit contributions
belt+0.058
amas+0.056
uster+0.056
ippers+0.055
immers+0.055
Neg logit contributions
 how-0.012
 that-0.011
od-0.011
 written-0.011
 Leo-0.011
Token the
Token ID262
Feature activation-4.36e-02

Pos logit contributions
 written+0.020
od+0.019
 how+0.015
 that+0.015
 says+0.014
Neg logit contributions
amas-0.083
belt-0.083
uster-0.082
ippers-0.081
immers-0.081
Token away
Token ID1497
Feature activation+6.03e-03

Pos logit contributions
od+0.025
 written+0.024
 Ma+0.024
 Leo+0.024
 says+0.024
Neg logit contributions
belt-0.029
amas-0.026
ippers-0.026
immers-0.026
uster-0.026
Token.
Token ID13
Feature activation-4.73e-03

Pos logit contributions
belt+0.015
amas+0.015
uster+0.014
ippers+0.014
ider+0.014
Neg logit contributions
od-0.005
 says-0.004
 Leo-0.004
 written-0.004
 Ma-0.004
Token
Token ID198
Feature activation-6.99e-03

Pos logit contributions
 that+0.004
 a+0.004
 Leo+0.004
 how+0.004
 Mark+0.004
Neg logit contributions
belt-0.012
immers-0.011
uster-0.011
amas-0.011
ippers-0.011
Token
Token ID198
Feature activation+1.13e-02

Pos logit contributions
 written+0.007
od+0.007
 how+0.007
 that+0.006
 Luc+0.006
Neg logit contributions
belt-0.016
amas-0.015
uster-0.015
ittens-0.015
ippers-0.015
TokenMom
Token ID29252
Feature activation-4.04e-02

Pos logit contributions
belt+0.023
amas+0.023
immers+0.023
ippers+0.023
uster+0.023
Neg logit contributions
 that-0.016
 a-0.015
's-0.015
 how-0.014
 Leo-0.014
Token hears
Token ID23180
Feature activation-8.69e-02

Pos logit contributions
od+0.044
 Luc+0.043
 written+0.041
 Sof+0.040
 how+0.039
Neg logit contributions
amas-0.088
uster-0.085
belt-0.085
immers-0.082
ippers-0.082
Token the
Token ID262
Feature activation-4.13e-02

Pos logit contributions
 written+0.061
od+0.061
 how+0.057
 Luc+0.056
 says+0.056
Neg logit contributions
amas-0.217
uster-0.215
belt-0.213
ippers-0.209
ittens-0.209
Token noise
Token ID7838
Feature activation-3.75e-02

Pos logit contributions
 how+0.050
od+0.049
 that+0.048
 written+0.048
 Leo+0.048
Neg logit contributions
belt-0.085
amas-0.085
uster-0.082
ippers-0.082
immers-0.081
Token and
Token ID290
Feature activation-2.83e-03

Pos logit contributions
 Ma+0.047
 written+0.046
od+0.046
 Luc+0.046
 Leo+0.046
Neg logit contributions
amas-0.072
belt-0.071
uster-0.071
ippers-0.069
immers-0.069
Token comes
Token ID2058
Feature activation-1.35e-02

Pos logit contributions
od+0.004
 written+0.004
 Luc+0.003
 Ma+0.003
 Sof+0.003
Neg logit contributions
belt-0.005
amas-0.005
uster-0.005
immers-0.005
igs-0.005
Token to
Token ID284
Feature activation-3.15e-02

Pos logit contributions
 how+0.008
 that+0.008
 written+0.007
od+0.007
 says+0.007
Neg logit contributions
belt-0.036
amas-0.036
uster-0.036
ippers-0.035
immers-0.035
Token too
Token ID1165
Feature activation-1.54e-02

Pos logit contributions
belt+0.015
amas+0.015
uster+0.015
ippers+0.015
immers+0.015
Neg logit contributions
od-0.004
 how-0.004
 written-0.004
 Leo-0.004
 Ma-0.004
Token.
Token ID13
Feature activation-1.46e-02

Pos logit contributions
 Ma+0.015
od+0.014
 Luc+0.014
 how+0.014
 written+0.014
Neg logit contributions
amas-0.035
belt-0.034
ivery-0.034
ippers-0.034
ittens-0.034
Token
Token ID198
Feature activation-6.86e-03

Pos logit contributions
 written+0.017
od+0.016
 how+0.016
 that+0.015
 Luc+0.015
Neg logit contributions
amas-0.031
-0.030
ippers-0.030
belt-0.030
ittens-0.029
Token
Token ID198
Feature activation+1.10e-02

Pos logit contributions
 written+0.010
od+0.009
 how+0.009
 Luc+0.008
's+0.008
Neg logit contributions
ittens-0.012
amas-0.012
-0.012
ippers-0.012
belt-0.012
TokenMom
Token ID29252
Feature activation-4.23e-02

Pos logit contributions
belt+0.021
uster+0.020
amas+0.020
immers+0.019
ippers+0.019
Neg logit contributions
 that-0.017
 a-0.016
 Leo-0.016
 how-0.016
's-0.016
Token hears
Token ID23180
Feature activation-8.69e-02

Pos logit contributions
od+0.044
 Luc+0.044
 written+0.041
 Sof+0.041
 Ma+0.040
Neg logit contributions
amas-0.094
uster-0.088
acking-0.087
immers-0.087
ippers-0.087
Token them
Token ID606
Feature activation-2.56e-02

Pos logit contributions
 written+0.086
od+0.081
 Luc+0.076
's+0.072
 Ma+0.069
Neg logit contributions
amas-0.192
uster-0.191
ittens-0.185
ippers-0.183
angers-0.178
Token.
Token ID13
Feature activation-4.97e-03

Pos logit contributions
 Ma+0.026
 Luc+0.026
 Leo+0.025
od+0.025
 written+0.025
Neg logit contributions
amas-0.056
ivery-0.055
immers-0.053
ippers-0.053
uster-0.053
Token She
Token ID1375
Feature activation+1.23e-02

Pos logit contributions
od+0.007
 written+0.007
 how+0.007
 Ma+0.006
 Luc+0.006
Neg logit contributions
amas-0.009
belt-0.009
ippers-0.009
uster-0.009
-0.008
Token comes
Token ID2058
Feature activation-2.10e-02

Pos logit contributions
amas+0.024
belt+0.023
ippers+0.023
immers+0.023
uster+0.023
Neg logit contributions
od-0.015
 written-0.015
 Ma-0.014
 Luc-0.014
 Leo-0.014
Token to
Token ID284
Feature activation-3.70e-02

Pos logit contributions
 how+0.012
 that+0.012
 a+0.011
 written+0.011
 Leo+0.011
Neg logit contributions
belt-0.058
amas-0.057
uster-0.056
ippers-0.056
immers-0.055
Token happy
Token ID3772
Feature activation-1.52e-02

Pos logit contributions
 how+0.024
 a+0.024
 that+0.024
 written+0.023
od+0.023
Neg logit contributions
belt-0.048
amas-0.047
uster-0.047
ippers-0.045
immers-0.045
Token.
Token ID13
Feature activation-1.34e-02

Pos logit contributions
 Ma+0.012
 Luc+0.012
od+0.012
 written+0.012
 Leo+0.012
Neg logit contributions
belt-0.035
amas-0.035
ippers-0.035
immers-0.035
ider-0.035
Token
Token ID198
Feature activation-4.74e-03

Pos logit contributions
 how+0.018
 written+0.018
 that+0.018
od+0.018
 Ma+0.017
Neg logit contributions
amas-0.025
ippers-0.024
belt-0.024
uster-0.024
-0.024
Token
Token ID198
Feature activation+1.21e-02

Pos logit contributions
od+0.005
 written+0.005
 how+0.005
 Luc+0.005
 Ma+0.005
Neg logit contributions
belt-0.009
amas-0.009
ippers-0.009
ittens-0.009
uster-0.009
TokenMom
Token ID29252
Feature activation-4.39e-02

Pos logit contributions
belt+0.024
amas+0.023
uster+0.023
immers+0.022
ippers+0.022
Neg logit contributions
 that-0.018
 a-0.017
 how-0.017
 Leo-0.017
's-0.017
Token hears
Token ID23180
Feature activation-8.69e-02

Pos logit contributions
od+0.042
 Luc+0.042
 written+0.040
 Sof+0.039
 how+0.039
Neg logit contributions
amas-0.098
belt-0.094
uster-0.093
immers-0.093
ippers-0.093
Token them
Token ID606
Feature activation-2.78e-02

Pos logit contributions
 written+0.114
od+0.109
 Luc+0.107
 Ma+0.102
's+0.101
Neg logit contributions
uster-0.162
amas-0.161
ippers-0.157
ittens-0.155
immers-0.154
Token.
Token ID13
Feature activation-8.23e-03

Pos logit contributions
 Ma+0.030
 Luc+0.029
 Leo+0.029
od+0.028
 written+0.028
Neg logit contributions
amas-0.060
ivery-0.057
immers-0.057
belt-0.057
ippers-0.057
Token She
Token ID1375
Feature activation+1.09e-02

Pos logit contributions
od+0.012
 how+0.011
 written+0.011
 Ma+0.010
 whose+0.010
Neg logit contributions
amas-0.015
belt-0.014
-0.014
ippers-0.014
-0.014
Token comes
Token ID2058
Feature activation-2.29e-02

Pos logit contributions
belt+0.022
amas+0.022
immers+0.021
ippers+0.021
uster+0.021
Neg logit contributions
od-0.013
 written-0.012
 that-0.012
 Leo-0.012
 Ma-0.012
Token to
Token ID284
Feature activation-3.86e-02

Pos logit contributions
 a+0.010
 how+0.010
 that+0.010
 Leo+0.008
's+0.008
Neg logit contributions
belt-0.068
amas-0.066
uster-0.065
ippers-0.065
immers-0.064
Token cries
Token ID24691
Feature activation+9.17e-03

Pos logit contributions
igs+0.022
ro+0.022
amas+0.022
belt+0.022
irmed+0.021
Neg logit contributions
od-0.023
 Mary-0.022
 written-0.021
 Ma-0.021
 Luc-0.021
Token.
Token ID13
Feature activation-1.27e-02

Pos logit contributions
belt+0.022
uster+0.022
amas+0.022
ippers+0.022
immers+0.021
Neg logit contributions
od-0.007
 how-0.007
 written-0.007
 that-0.007
 Leo-0.007
Token
Token ID198
Feature activation-2.44e-03

Pos logit contributions
 how+0.012
 written+0.012
 that+0.012
od+0.012
 says+0.012
Neg logit contributions
belt-0.029
amas-0.029
ippers-0.028
uster-0.028
immers-0.028
Token
Token ID198
Feature activation+1.43e-02

Pos logit contributions
 written+0.003
od+0.003
 how+0.003
's+0.003
 Luc+0.003
Neg logit contributions
ittens-0.005
ippers-0.005
belt-0.005
amas-0.005
-0.005
TokenMom
Token ID29252
Feature activation-4.10e-02

Pos logit contributions
belt+0.028
amas+0.027
uster+0.027
immers+0.026
ippers+0.026
Neg logit contributions
 that-0.021
 a-0.020
 Leo-0.020
 how-0.020
 says-0.020
Token hears
Token ID23180
Feature activation-8.69e-02

Pos logit contributions
od+0.042
 Luc+0.041
 written+0.039
 Sof+0.038
 Ma+0.037
Neg logit contributions
amas-0.091
uster-0.086
belt-0.085
ippers-0.085
immers-0.085
Token Ben
Token ID3932
Feature activation-4.27e-02

Pos logit contributions
 written+0.097
od+0.089
's+0.085
 Luc+0.082
 explains+0.080
Neg logit contributions
ittens-0.180
uster-0.180
amas-0.179
ippers-0.176
-0.171
Token cry
Token ID3960
Feature activation-7.19e-04

Pos logit contributions
 Luc+0.051
 Ma+0.051
 written+0.049
od+0.049
 how+0.049
Neg logit contributions
amas-0.086
ivery-0.082
immers-0.081
uster-0.081
ippers-0.081
Token.
Token ID13
Feature activation-4.68e-03

Pos logit contributions
 that+0.000
 how+0.000
 a+0.000
 says+0.000
 written+0.000
Neg logit contributions
belt-0.002
amas-0.002
uster-0.002
immers-0.002
ippers-0.002
Token She
Token ID1375
Feature activation+1.66e-02

Pos logit contributions
od+0.006
 written+0.006
 how+0.006
 Mark+0.006
 Luc+0.006
Neg logit contributions
amas-0.009
belt-0.009
ippers-0.009
-0.008
ittens-0.008
Token comes
Token ID2058
Feature activation-1.99e-02

Pos logit contributions
belt+0.033
amas+0.033
uster+0.032
ippers+0.032
immers+0.032
Neg logit contributions
od-0.020
 written-0.019
 Ma-0.019
 that-0.019
 Leo-0.019
Token cry
Token ID3960
Feature activation-4.25e-03

Pos logit contributions
belt+0.013
amas+0.012
ippers+0.012
uster+0.012
immers+0.012
Neg logit contributions
 that-0.008
 a-0.007
 how-0.007
 says-0.007
 Leo-0.007
Token.
Token ID13
Feature activation-1.71e-02

Pos logit contributions
 how+0.003
 that+0.003
 written+0.003
od+0.003
 Leo+0.003
Neg logit contributions
belt-0.011
amas-0.011
uster-0.011
ippers-0.011
immers-0.011
Token
Token ID198
Feature activation-9.53e-03

Pos logit contributions
 how+0.017
 written+0.016
 that+0.016
od+0.016
 says+0.015
Neg logit contributions
belt-0.039
amas-0.039
ippers-0.038
uster-0.038
immers-0.037
Token
Token ID198
Feature activation+8.79e-03

Pos logit contributions
 written+0.010
od+0.010
 how+0.010
 that+0.009
 Ma+0.009
Neg logit contributions
belt-0.020
amas-0.020
ittens-0.020
uster-0.020
ippers-0.020
TokenMom
Token ID29252
Feature activation-4.09e-02

Pos logit contributions
belt+0.018
amas+0.017
uster+0.017
immers+0.017
ippers+0.017
Neg logit contributions
 that-0.013
 a-0.012
 how-0.012
 Leo-0.012
's-0.011
Token hears
Token ID23180
Feature activation-8.67e-02

Pos logit contributions
od+0.043
 Luc+0.042
 Sof+0.040
 written+0.039
 how+0.037
Neg logit contributions
amas-0.089
uster-0.085
acking-0.083
immers-0.083
belt-0.083
Token them
Token ID606
Feature activation-2.65e-02

Pos logit contributions
 written+0.090
od+0.084
 Luc+0.081
's+0.080
 says+0.078
Neg logit contributions
amas-0.184
uster-0.183
ittens-0.180
ippers-0.177
immers-0.175
Token and
Token ID290
Feature activation-6.59e-04

Pos logit contributions
 Ma+0.033
 Luc+0.032
 written+0.032
od+0.032
 Leo+0.031
Neg logit contributions
amas-0.053
ivery-0.050
immers-0.049
ittens-0.049
ippers-0.049
Token comes
Token ID2058
Feature activation-1.64e-02

Pos logit contributions
od+0.001
 written+0.001
 Luc+0.001
 Sof+0.001
 Ma+0.001
Neg logit contributions
uster-0.001
immers-0.001
amas-0.001
igs-0.001
belt-0.001
Token to
Token ID284
Feature activation-3.14e-02

Pos logit contributions
 that+0.009
 how+0.009
 written+0.009
od+0.009
 says+0.008
Neg logit contributions
belt-0.044
amas-0.044
uster-0.043
ippers-0.043
immers-0.043
Token see
Token ID766
Feature activation-6.39e-02

Pos logit contributions
 how+0.024
 that+0.024
 written+0.023
 a+0.023
od+0.023
Neg logit contributions
belt-0.079
amas-0.077
uster-0.077
ippers-0.076
immers-0.076
TokenL
Token ID43
Feature activation-6.21e-03

Pos logit contributions
belt+0.029
amas+0.029
uster+0.028
immers+0.028
ippers+0.028
Neg logit contributions
 that-0.028
 how-0.028
 a-0.027
 written-0.027
 says-0.027
Tokenizzy
Token ID40593
Feature activation-8.22e-04

Pos logit contributions
 how+0.011
 that+0.010
 a+0.010
od+0.010
 says+0.010
Neg logit contributions
belt-0.010
amas-0.010
immers-0.009
uster-0.009
ippers-0.009
Token went
Token ID1816
Feature activation+3.68e-02

Pos logit contributions
 that+0.001
 how+0.001
's+0.001
 a+0.001
 says+0.001
Neg logit contributions
belt-0.002
uster-0.002
amas-0.002
ippers-0.002
immers-0.002
Token to
Token ID284
Feature activation-1.57e-02

Pos logit contributions
belt+0.101
amas+0.100
uster+0.099
ippers+0.098
immers+0.098
Neg logit contributions
 that-0.022
 a-0.021
 how-0.021
 written-0.019
 Leo-0.019
Token her
Token ID607
Feature activation+1.35e-03

Pos logit contributions
od+0.012
 written+0.011
 explains+0.010
 that+0.009
:+0.009
Neg logit contributions
amas-0.037
belt-0.036
ippers-0.036
 lifestyle-0.036
ital-0.036
Token bur
Token ID4356
Feature activation+8.81e-02

Pos logit contributions
belt+0.002
amas+0.002
ippers+0.002
aj+0.002
immers+0.002
Neg logit contributions
 written-0.002
od-0.002
 how-0.002
 explains-0.002
 says-0.002
Tokenrow
Token ID808
Feature activation+2.47e-02

Pos logit contributions
amas+0.095
belt+0.095
+0.091
uster+0.088
ippers+0.086
Neg logit contributions
 how-0.195
 that-0.192
 a-0.189
 Ma-0.188
 Leo-0.187
Token and
Token ID290
Feature activation-7.51e-03

Pos logit contributions
belt+0.058
amas+0.056
uster+0.056
ippers+0.055
immers+0.055
Neg logit contributions
 that-0.024
 how-0.024
 a-0.023
 written-0.023
 Leo-0.022
Token found
Token ID1043
Feature activation-2.79e-02

Pos logit contributions
od+0.009
 written+0.008
 how+0.008
 that+0.008
 explains+0.008
Neg logit contributions
belt-0.015
amas-0.014
uster-0.014
angers-0.014
ippers-0.014
Token her
Token ID607
Feature activation+1.69e-02

Pos logit contributions
od+0.030
opot+0.028
 explains+0.027
 says+0.026
 listens+0.025
Neg logit contributions
uster-0.057
belt-0.055
angers-0.053
robe-0.053
igs-0.052
Token skirt
Token ID23967
Feature activation+2.14e-02

Pos logit contributions
belt+0.030
amas+0.029
ippers+0.029
immers+0.028
ittens+0.028
Neg logit contributions
od-0.025
 written-0.025
 how-0.024
 says-0.024
 that-0.024
Token fluffy
Token ID39145
Feature activation+3.05e-02

Pos logit contributions
belt+0.049
amas+0.047
uster+0.046
ippers+0.046
immers+0.045
Neg logit contributions
 how-0.025
od-0.025
 written-0.024
 that-0.024
 says-0.024
Token and
Token ID290
Feature activation+9.74e-03

Pos logit contributions
belt+0.073
amas+0.073
uster+0.071
ippers+0.071
immers+0.071
Neg logit contributions
 how-0.025
 written-0.025
od-0.025
 says-0.025
 that-0.024
Token lived
Token ID5615
Feature activation+1.05e-02

Pos logit contributions
belt+0.021
amas+0.019
ittens+0.019
ippers+0.019
uster+0.019
Neg logit contributions
od-0.011
 written-0.011
 says-0.011
 explains-0.011
 how-0.010
Token in
Token ID287
Feature activation+8.18e-04

Pos logit contributions
belt+0.025
amas+0.024
uster+0.024
immers+0.023
ippers+0.023
Neg logit contributions
 that-0.010
 a-0.010
 how-0.010
 Leo-0.010
 written-0.009
Token a
Token ID257
Feature activation-5.18e-03

Pos logit contributions
belt+0.002
amas+0.002
uster+0.002
ro+0.002
ippers+0.002
Neg logit contributions
od-0.000
 says-0.000
 explains-0.000
 written-0.000
:-0.000
Token bur
Token ID4356
Feature activation+1.09e-01

Pos logit contributions
 that+0.005
 a+0.005
 how+0.005
's+0.005
 Leo+0.005
Neg logit contributions
belt-0.012
amas-0.012
uster-0.012
ippers-0.012
immers-0.012
Tokenrow
Token ID808
Feature activation+3.94e-02

Pos logit contributions
belt+0.135
amas+0.134
+0.133
+0.122
uster+0.121
Neg logit contributions
 that-0.228
 how-0.219
 a-0.215
's-0.213
 says-0.211
Token in
Token ID287
Feature activation-6.48e-03

Pos logit contributions
belt+0.102
amas+0.100
uster+0.099
ippers+0.098
immers+0.098
Neg logit contributions
 that-0.028
 how-0.028
 a-0.027
 written-0.026
od-0.025
Token the
Token ID262
Feature activation-1.60e-02

Pos logit contributions
od+0.003
 written+0.003
 says+0.003
 explains+0.003
 how+0.002
Neg logit contributions
belt-0.017
amas-0.017
ippers-0.017
uster-0.017
immers-0.017
Token forest
Token ID8222
Feature activation+2.50e-03

Pos logit contributions
 a+0.027
 that+0.027
's+0.026
 how+0.026
 Leo+0.026
Neg logit contributions
belt-0.027
uster-0.026
ippers-0.025
amas-0.025
immers-0.025
Token.
Token ID13
Feature activation-2.15e-03

Pos logit contributions
belt+0.006
amas+0.006
immers+0.006
ippers+0.006
ider+0.006
Neg logit contributions
 written-0.002
od-0.002
 how-0.002
 says-0.002
 Leo-0.002
Token she
Token ID673
Feature activation+4.24e-02

Pos logit contributions
belt+0.042
immers+0.041
ippers+0.041
amas+0.041
uster+0.041
Neg logit contributions
 how-0.021
 that-0.021
 a-0.020
 written-0.019
's-0.019
Token was
Token ID373
Feature activation-8.87e-04

Pos logit contributions
belt+0.099
amas+0.096
uster+0.095
ippers+0.093
immers+0.093
Neg logit contributions
 Leo-0.039
od-0.038
 how-0.037
 Mark-0.037
 that-0.037
Token proud
Token ID6613
Feature activation-3.03e-02

Pos logit contributions
 says+0.001
 Leo+0.001
od+0.001
 Mark+0.001
 written+0.001
Neg logit contributions
belt-0.002
amas-0.002
uster-0.002
ippers-0.002
ivery-0.002
Token of
Token ID286
Feature activation+2.96e-03

Pos logit contributions
od+0.029
 Leo+0.028
 Mark+0.028
 Ma+0.027
 Luc+0.027
Neg logit contributions
uster-0.066
belt-0.066
ippers-0.064
amas-0.064
ittens-0.064
Token what
Token ID644
Feature activation-3.01e-02

Pos logit contributions
belt+0.007
ippers+0.007
amas+0.007
uster+0.007
+0.007
Neg logit contributions
 written-0.002
 says-0.002
od-0.002
's-0.002
 Ma-0.002
Token she
Token ID673
Feature activation+8.33e-02

Pos logit contributions
 Mark+0.032
od+0.032
 ze+0.032
 Ma+0.031
 Ray+0.031
Neg logit contributions
ippers-0.056
-0.055
-0.055
amas-0.055
belt-0.055
Token had
Token ID550
Feature activation+4.75e-02

Pos logit contributions
belt+0.185
amas+0.181
uster+0.178
ippers+0.176
immers+0.176
Neg logit contributions
 how-0.089
 that-0.089
 a-0.086
 Leo-0.086
od-0.085
Token done
Token ID1760
Feature activation+3.91e-03

Pos logit contributions
belt+0.101
amas+0.100
ippers+0.098
uster+0.097
immers+0.096
Neg logit contributions
 Leo-0.053
 how-0.052
 Mark-0.051
 Ma-0.051
 that-0.050
Token!
Token ID0
Feature activation+6.76e-05

Pos logit contributions
belt+0.010
amas+0.010
uster+0.010
ippers+0.010
immers+0.010
Neg logit contributions
 how-0.003
 that-0.003
 Leo-0.003
 a-0.003
 written-0.003
Token<|endoftext|>
Token ID50256
Feature activation+1.49e-02

Pos logit contributions
amas+0.000
+0.000
ippers+0.000
belt+0.000
uster+0.000
Neg logit contributions
 written-0.000
od-0.000
 how-0.000
 that-0.000
 says-0.000
TokenOnce
Token ID7454
Feature activation+1.63e-02

Pos logit contributions
belt+0.035
immers+0.034
uster+0.034
amas+0.033
ippers+0.033
Neg logit contributions
 a-0.016
 that-0.016
 Leo-0.015
's-0.015
 how-0.015
Token a
Token ID257
Feature activation-2.69e-02

Pos logit contributions
od+0.040
 written+0.040
's+0.037
 explains+0.035
 rhy+0.034
Neg logit contributions
belt-0.110
uster-0.105
ippers-0.102
igs-0.102
immers-0.101
Token police
Token ID1644
Feature activation-4.65e-03

Pos logit contributions
 that+0.031
 a+0.031
 written+0.030
 how+0.030
 says+0.029
Neg logit contributions
belt-0.058
amas-0.057
immers-0.057
uster-0.056
ippers-0.056
Token car
Token ID1097
Feature activation+1.13e-02

Pos logit contributions
 that+0.007
 how+0.007
 a+0.007
 says+0.007
 written+0.007
Neg logit contributions
belt-0.008
amas-0.008
uster-0.008
immers-0.008
ippers-0.008
Token with
Token ID351
Feature activation-3.16e-02

Pos logit contributions
belt+0.027
amas+0.026
uster+0.026
ippers+0.026
immers+0.026
Neg logit contributions
 how-0.010
 Leo-0.009
od-0.009
 Ma-0.009
 says-0.009
Token a
Token ID257
Feature activation-8.62e-03

Pos logit contributions
od+0.019
 Ma+0.017
 Leo+0.017
 says+0.017
 Luc+0.016
Neg logit contributions
belt-0.084
uster-0.081
ippers-0.080
amas-0.080
immers-0.079
Token s
Token ID264
Feature activation+7.70e-02

Pos logit contributions
 a+0.011
 that+0.011
 written+0.010
 says+0.010
's+0.010
Neg logit contributions
amas-0.018
immers-0.018
belt-0.018
uster-0.017
angers-0.017
Tokeniren
Token ID24080
Feature activation-2.35e-02

Pos logit contributions
belt+0.106
amas+0.101
uster+0.097
immers+0.097
ippers+0.096
Neg logit contributions
 that-0.149
 how-0.146
od-0.145
 written-0.145
 says-0.144
Token and
Token ID290
Feature activation-3.94e-03

Pos logit contributions
 how+0.019
od+0.019
 Leo+0.018
 Ma+0.018
 Luc+0.018
Neg logit contributions
belt-0.058
amas-0.056
uster-0.055
ippers-0.055
immers-0.055
Token a
Token ID257
Feature activation-7.50e-03

Pos logit contributions
od+0.002
 how+0.001
 Leo+0.001
 that+0.001
 Ma+0.001
Neg logit contributions
belt-0.011
amas-0.011
uster-0.011
ippers-0.011
immers-0.011
Token light
Token ID1657
Feature activation-3.81e-03

Pos logit contributions
 a+0.009
 that+0.009
 written+0.009
 says+0.009
 how+0.009
Neg logit contributions
belt-0.015
amas-0.015
immers-0.015
uster-0.015
ippers-0.015
Token.
Token ID13
Feature activation-1.21e-02

Pos logit contributions
 that+0.004
 how+0.004
 a+0.004
 written+0.004
 says+0.004
Neg logit contributions
belt-0.008
amas-0.008
uster-0.008
ippers-0.008
immers-0.008
Token soon
Token ID2582
Feature activation+1.98e-02

Pos logit contributions
belt+0.006
amas+0.005
uster+0.005
ippers+0.005
immers+0.005
Neg logit contributions
 how-0.002
 that-0.002
 Leo-0.002
od-0.002
 written-0.002
Token became
Token ID2627
Feature activation-3.87e-03

Pos logit contributions
belt+0.037
amas+0.037
uster+0.036
ro+0.036
ippers+0.036
Neg logit contributions
 Mark-0.024
 Leo-0.024
 how-0.023
 whose-0.023
od-0.023
Token Joy
Token ID14087
Feature activation-1.13e-02

Pos logit contributions
 that+0.003
 how+0.003
 says+0.003
od+0.003
 Leo+0.003
Neg logit contributions
belt-0.009
amas-0.009
uster-0.009
ippers-0.009
immers-0.009
Token's
Token ID338
Feature activation+3.73e-02

Pos logit contributions
 how+0.008
 says+0.008
 Leo+0.007
 Mark+0.007
 explains+0.007
Neg logit contributions
belt-0.028
amas-0.028
uster-0.027
ippers-0.027
immers-0.027
Token most
Token ID749
Feature activation-1.58e-02

Pos logit contributions
belt+0.075
amas+0.074
uster+0.072
immers+0.072
ippers+0.071
Neg logit contributions
 that-0.049
 how-0.048
 a-0.047
 written-0.047
od-0.046
Token tre
Token ID2054
Feature activation+9.41e-02

Pos logit contributions
 Leo+0.021
 Mark+0.020
 says+0.020
 Ma+0.019
 how+0.019
Neg logit contributions
belt-0.031
uster-0.030
ippers-0.030
ider-0.030
immers-0.029
Tokenasured
Token ID34006
Feature activation+6.98e-02

Pos logit contributions
amas+0.122
uster+0.113
belt+0.113
+0.109
immers+0.109
Neg logit contributions
 that-0.194
 how-0.194
 written-0.189
od-0.188
 a-0.185
Token toy
Token ID13373
Feature activation+2.44e-02

Pos logit contributions
amas+0.122
belt+0.121
uster+0.117
immers+0.116
ippers+0.116
Neg logit contributions
 that-0.112
 how-0.110
 a-0.108
 written-0.106
od-0.104
Token,
Token ID11
Feature activation+9.84e-03

Pos logit contributions
belt+0.062
amas+0.059
ippers+0.059
uster+0.058
immers+0.058
Neg logit contributions
 Leo-0.017
 how-0.017
 says-0.017
 Ma-0.017
od-0.017
Token and
Token ID290
Feature activation-1.20e-02

Pos logit contributions
belt+0.022
amas+0.021
uster+0.020
ippers+0.020
ittens+0.020
Neg logit contributions
od-0.010
 says-0.010
 Leo-0.010
 how-0.010
 Ma-0.010
Token she
Token ID673
Feature activation+2.16e-02

Pos logit contributions
od+0.014
 written+0.013
 Ma+0.012
opot+0.012
 explains+0.011
Neg logit contributions
belt-0.022
ider-0.021
 lifestyle-0.021
ro-0.021
irmed-0.021
Token hello
Token ID23748
Feature activation-1.44e-02

Pos logit contributions
 written+0.008
 says+0.008
 how+0.008
od+0.008
 explains+0.008
Neg logit contributions
belt-0.020
uster-0.019
ippers-0.019
amas-0.019
immers-0.019
Token to
Token ID284
Feature activation-9.51e-03

Pos logit contributions
 explains+0.015
 says+0.015
od+0.015
 written+0.014
icks+0.014
Neg logit contributions
belt-0.031
amas-0.030
ippers-0.029
ro-0.029
uster-0.029
Token his
Token ID465
Feature activation+5.54e-04

Pos logit contributions
od+0.006
 written+0.006
 says+0.006
 Leo+0.005
 Ma+0.005
Neg logit contributions
belt-0.025
uster-0.024
amas-0.024
immers-0.023
ippers-0.023
Token neighbours
Token ID23788
Feature activation-4.09e-02

Pos logit contributions
belt+0.001
ippers+0.001
amas+0.001
immers+0.001
uster+0.001
Neg logit contributions
 written-0.001
od-0.001
 how-0.001
 that-0.001
 Leo-0.001
Token,
Token ID11
Feature activation-5.26e-03

Pos logit contributions
od+0.032
 Leo+0.032
 says+0.032
 written+0.032
 Ma+0.032
Neg logit contributions
belt-0.096
uster-0.092
ippers-0.092
amas-0.091
immers-0.090
Token w
Token ID266
Feature activation+3.45e-02

Pos logit contributions
od+0.005
 written+0.005
 says+0.005
 how+0.005
 that+0.005
Neg logit contributions
belt-0.011
uster-0.011
amas-0.011
ippers-0.011
immers-0.010
Tokeninking
Token ID8040
Feature activation-1.68e-02

Pos logit contributions
belt+0.052
amas+0.048
uster+0.047
+0.045
ro+0.045
Neg logit contributions
od-0.065
 that-0.064
:-0.064
 how-0.062
 about-0.062
Token and
Token ID290
Feature activation-2.18e-02

Pos logit contributions
 written+0.015
od+0.015
 how+0.015
 that+0.015
 Leo+0.014
Neg logit contributions
belt-0.039
ippers-0.038
immers-0.038
uster-0.037
amas-0.037
Token waving
Token ID25849
Feature activation+3.92e-03

Pos logit contributions
od+0.022
 how+0.022
 that+0.022
 written+0.021
 Leo+0.021
Neg logit contributions
belt-0.048
amas-0.046
immers-0.045
ippers-0.045
uster-0.045
Token to
Token ID284
Feature activation-2.72e-02

Pos logit contributions
belt+0.010
ippers+0.009
uster+0.009
immers+0.009
amas+0.009
Neg logit contributions
od-0.003
 how-0.003
 that-0.003
 written-0.003
 says-0.003
Token them
Token ID606
Feature activation-2.87e-02

Pos logit contributions
 written+0.029
od+0.028
 Luc+0.027
 Ma+0.027
 says+0.026
Neg logit contributions
immers-0.056
belt-0.055
amas-0.055
uster-0.054
ippers-0.054
Token.
Token ID13
Feature activation-9.96e-04

Pos logit contributions
belt+0.018
amas+0.018
uster+0.017
ippers+0.017
immers+0.017
Neg logit contributions
 how-0.004
 that-0.004
 Leo-0.004
 written-0.004
 a-0.004
Token
Token ID198
Feature activation-1.65e-03

Pos logit contributions
 a+0.001
 that+0.001
's+0.001
 how+0.001
 Leo+0.001
Neg logit contributions
belt-0.003
immers-0.003
uster-0.003
ippers-0.003
ider-0.003
Token
Token ID198
Feature activation+1.73e-02

Pos logit contributions
 that+0.001
 a+0.001
 how+0.001
 Leo+0.001
 says+0.001
Neg logit contributions
belt-0.005
amas-0.004
immers-0.004
uster-0.004
ippers-0.004
TokenB
Token ID33
Feature activation+3.33e-02

Pos logit contributions
angers+0.038
ippers+0.038
belt+0.037
amas+0.037
immers+0.037
Neg logit contributions
 that-0.024
's-0.023
 a-0.023
-0.022
 else-0.021
Tokenenny
Token ID11870
Feature activation-3.00e-04

Pos logit contributions
belt+0.069
amas+0.063
angers+0.062
igs+0.062
agons+0.061
Neg logit contributions
's-0.041
 a-0.039
od-0.039
:-0.038
 Pat-0.037
Token looked
Token ID3114
Feature activation+2.53e-02

Pos logit contributions
 how+0.000
 that+0.000
od+0.000
 Leo+0.000
 written+0.000
Neg logit contributions
belt-0.001
amas-0.001
uster-0.001
immers-0.001
ippers-0.001
Token around
Token ID1088
Feature activation-5.73e-03

Pos logit contributions
belt+0.054
amas+0.053
uster+0.052
ippers+0.052
immers+0.052
Neg logit contributions
 how-0.028
 that-0.028
od-0.028
 written-0.028
 Leo-0.028
Token and
Token ID290
Feature activation-1.06e-03

Pos logit contributions
od+0.005
 written+0.005
 how+0.005
 says+0.005
 that+0.005
Neg logit contributions
belt-0.013
amas-0.013
uster-0.013
ippers-0.013
immers-0.013
Token saw
Token ID2497
Feature activation-4.58e-02

Pos logit contributions
od+0.001
 how+0.001
 written+0.001
 says+0.001
 that+0.001
Neg logit contributions
belt-0.002
amas-0.002
uster-0.002
immers-0.002
ippers-0.002
Token his
Token ID465
Feature activation+2.91e-02

Pos logit contributions
od+0.036
 says+0.033
 written+0.032
 explains+0.030
 Mary+0.030
Neg logit contributions
igs-0.104
belt-0.101
uster-0.100
ittens-0.100
-0.098
Token friend
Token ID1545
Feature activation-2.79e-02

Pos logit contributions
belt+0.061
amas+0.059
uster+0.059
immers+0.057
ippers+0.057
Neg logit contributions
 that-0.037
 how-0.036
 a-0.036
 Leo-0.035
 says-0.035
Token faster
Token ID5443
Feature activation+1.90e-02

Pos logit contributions
od+0.026
 how+0.025
 says+0.025
 Leo+0.025
 that+0.025
Neg logit contributions
belt-0.052
amas-0.050
uster-0.050
immers-0.049
ippers-0.049
Token than
Token ID621
Feature activation+6.45e-04

Pos logit contributions
belt+0.042
amas+0.042
uster+0.041
ivery+0.040
ippers+0.040
Neg logit contributions
 that-0.022
 how-0.021
 a-0.021
 says-0.020
's-0.020
Token yours
Token ID12431
Feature activation+2.66e-02

Pos logit contributions
belt+0.002
amas+0.001
uster+0.001
ippers+0.001
immers+0.001
Neg logit contributions
 how-0.001
 that-0.001
 written-0.001
od-0.001
 says-0.001
Token!"
Token ID2474
Feature activation-3.49e-03

Pos logit contributions
belt+0.059
amas+0.058
uster+0.057
ippers+0.056
immers+0.056
Neg logit contributions
 how-0.028
 that-0.028
 a-0.027
 written-0.027
od-0.027
Token
Token ID198
Feature activation+4.00e-03

Pos logit contributions
 how+0.003
 written+0.003
 that+0.003
od+0.003
 Mark+0.003
Neg logit contributions
belt-0.008
amas-0.008
ippers-0.008
uster-0.008
immers-0.008
Token
Token ID198
Feature activation+2.28e-02

Pos logit contributions
ippers+0.008
amas+0.008
belt+0.008
uster+0.008
ittens+0.008
Neg logit contributions
 written-0.005
 how-0.004
od-0.004
 that-0.004
 Luc-0.004
Token"
Token ID1
Feature activation+1.07e-02

Pos logit contributions
belt+0.041
amas+0.040
uster+0.039
immers+0.038
ippers+0.038
Neg logit contributions
 that-0.035
 how-0.035
 a-0.035
 Leo-0.035
 says-0.034
TokenNo
Token ID2949
Feature activation+5.56e-03

Pos logit contributions
belt+0.021
amas+0.021
uster+0.020
ippers+0.020
immers+0.020
Neg logit contributions
 that-0.015
 how-0.015
 a-0.015
 Leo-0.014
 written-0.014
Token,
Token ID11
Feature activation-4.57e-03

Pos logit contributions
belt+0.013
amas+0.013
immers+0.012
ippers+0.012
ittens+0.012
Neg logit contributions
 written-0.005
od-0.005
 how-0.005
 Leo-0.005
 that-0.005
Token it
Token ID340
Feature activation-8.88e-04

Pos logit contributions
 written+0.005
od+0.004
 how+0.004
 Leo+0.004
 explains+0.004
Neg logit contributions
belt-0.010
uster-0.010
ippers-0.010
amas-0.010
ittens-0.010
Token's
Token ID338
Feature activation-2.81e-02

Pos logit contributions
 Ma+0.001
 Mrs+0.001
 Mary+0.001
 written+0.001
od+0.001
Neg logit contributions
belt-0.002
amas-0.002
immers-0.002
ider-0.002
ippers-0.002
TokenJohn
Token ID7554
Feature activation-2.97e-02

Pos logit contributions
belt+0.028
uster+0.027
immers+0.027
amas+0.027
ippers+0.027
Neg logit contributions
 a-0.010
 that-0.010
 Leo-0.010
 how-0.010
 says-0.010
Token and
Token ID290
Feature activation+2.59e-02

Pos logit contributions
 that+0.029
 how+0.028
 a+0.028
's+0.028
 says+0.027
Neg logit contributions
belt-0.073
amas-0.071
uster-0.071
ippers-0.070
immers-0.070
Token his
Token ID465
Feature activation+2.42e-02

Pos logit contributions
belt+0.058
ippers+0.054
uster+0.054
amas+0.054
immers+0.053
Neg logit contributions
od-0.025
 written-0.023
 Ma-0.022
 Luc-0.022
 Leo-0.021
Token mom
Token ID1995
Feature activation-9.38e-03

Pos logit contributions
belt+0.047
amas+0.045
ippers+0.045
uster+0.045
ittens+0.045
Neg logit contributions
od-0.031
 how-0.031
 written-0.031
 Ma-0.030
 says-0.029
Token were
Token ID547
Feature activation-9.97e-03

Pos logit contributions
 how+0.011
 that+0.011
od+0.011
 Leo+0.011
 written+0.011
Neg logit contributions
belt-0.019
amas-0.019
uster-0.019
ippers-0.018
immers-0.018
Token walking
Token ID6155
Feature activation+4.52e-02

Pos logit contributions
 that+0.010
 a+0.010
 how+0.010
's+0.010
 written+0.009
Neg logit contributions
belt-0.024
uster-0.023
amas-0.023
immers-0.023
ro-0.022
Token through
Token ID832
Feature activation-4.42e-02

Pos logit contributions
amas+0.111
ittens+0.109
belt+0.109
ivery+0.108
uster+0.107
Neg logit contributions
 a-0.051
 that-0.047
 the-0.046
 how-0.045
 an-0.043
Token the
Token ID262
Feature activation-2.77e-02

Pos logit contributions
od+0.017
 says+0.017
 written+0.017
 explains+0.014
 Luc+0.014
Neg logit contributions
belt-0.122
amas-0.120
uster-0.118
ippers-0.118
immers-0.117
Token park
Token ID3952
Feature activation+1.46e-02

Pos logit contributions
 how+0.048
od+0.048
 written+0.048
 Leo+0.047
 that+0.047
Neg logit contributions
belt-0.042
amas-0.041
uster-0.039
immers-0.039
ippers-0.039
Token.
Token ID13
Feature activation+5.93e-03

Pos logit contributions
belt+0.035
ippers+0.034
amas+0.034
immers+0.033
ider+0.033
Neg logit contributions
 Leo-0.012
 says-0.012
od-0.012
 how-0.012
 written-0.012
Token John
Token ID1757
Feature activation-2.92e-02

Pos logit contributions
belt+0.013
uster+0.013
ippers+0.013
immers+0.013
ittens+0.013
Neg logit contributions
 a-0.007
 how-0.007
 Mark-0.007
 Mary-0.007
 that-0.006
Token a
Token ID257
Feature activation-1.14e-02

Pos logit contributions
od+0.011
 that+0.011
 how+0.011
 says+0.011
 Leo+0.011
Neg logit contributions
belt-0.056
amas-0.054
uster-0.054
ippers-0.054
immers-0.053
Token violin
Token ID38283
Feature activation+2.51e-03

Pos logit contributions
 that+0.016
 how+0.016
 a+0.016
 written+0.016
 says+0.016
Neg logit contributions
belt-0.021
amas-0.021
uster-0.020
immers-0.020
ippers-0.020
Token.
Token ID13
Feature activation+1.05e-02

Pos logit contributions
belt+0.007
amas+0.006
uster+0.006
ippers+0.006
immers+0.006
Neg logit contributions
 how-0.002
 Leo-0.002
od-0.002
 says-0.002
 Ma-0.002
Token Finally
Token ID9461
Feature activation+2.13e-02

Pos logit contributions
belt+0.025
immers+0.024
uster+0.024
ippers+0.024
amas+0.023
Neg logit contributions
 how-0.011
 that-0.010
 a-0.010
 Mark-0.010
 Leo-0.010
Token,
Token ID11
Feature activation+9.51e-03

Pos logit contributions
belt+0.056
amas+0.055
uster+0.054
ippers+0.054
immers+0.054
Neg logit contributions
 that-0.015
 how-0.015
 a-0.014
 written-0.014
 Leo-0.013
Token her
Token ID607
Feature activation+2.37e-02

Pos logit contributions
belt+0.025
amas+0.025
ippers+0.024
uster+0.024
+0.024
Neg logit contributions
od-0.006
 says-0.006
 how-0.006
 written-0.006
 Leo-0.005
Token mom
Token ID1995
Feature activation-1.57e-02

Pos logit contributions
belt+0.055
amas+0.054
uster+0.054
immers+0.054
ippers+0.053
Neg logit contributions
 that-0.024
 a-0.024
 how-0.023
 written-0.022
 says-0.022
Token said
Token ID531
Feature activation-3.30e-03

Pos logit contributions
 Leo+0.014
 how+0.013
od+0.013
 Luc+0.012
 Ma+0.012
Neg logit contributions
belt-0.037
uster-0.035
amas-0.035
ittens-0.034
ider-0.034
Token yes
Token ID3763
Feature activation+8.76e-03

Pos logit contributions
 says+0.002
od+0.002
 Leo+0.002
 written+0.002
 Ma+0.002
Neg logit contributions
belt-0.008
amas-0.008
uster-0.008
ippers-0.008
ider-0.008
Token,
Token ID11
Feature activation+4.90e-03

Pos logit contributions
belt+0.018
amas+0.018
uster+0.018
ittens+0.017
ro+0.017
Neg logit contributions
 Leo-0.010
 Ma-0.010
 Luc-0.010
od-0.009
 how-0.009
Token and
Token ID290
Feature activation-3.34e-03

Pos logit contributions
belt+0.012
immers+0.012
ippers+0.012
amas+0.012
uster+0.012
Neg logit contributions
 that-0.005
 how-0.005
 a-0.005
 written-0.004
's-0.004
Token fascinated
Token ID30103
Feature activation-1.98e-02

Pos logit contributions
 how+0.006
 that+0.006
 a+0.006
 written+0.006
 Leo+0.005
Neg logit contributions
belt-0.018
amas-0.018
uster-0.017
ippers-0.017
immers-0.017
Token and
Token ID290
Feature activation+5.05e-03

Pos logit contributions
 Leo+0.022
 says+0.021
 Ma+0.021
od+0.021
 Luc+0.021
Neg logit contributions
belt-0.043
amas-0.041
uster-0.041
immers-0.041
ippers-0.040
Token said
Token ID531
Feature activation+7.57e-03

Pos logit contributions
belt+0.011
amas+0.011
uster+0.011
ippers+0.011
immers+0.011
Neg logit contributions
 how-0.005
 that-0.005
 a-0.005
 written-0.005
 says-0.005
Token,
Token ID11
Feature activation+4.03e-03

Pos logit contributions
belt+0.016
ippers+0.015
ider+0.015
amas+0.014
uster+0.014
Neg logit contributions
 says-0.008
 written-0.008
 explains-0.008
 Leo-0.008
 Ma-0.008
Token "
Token ID366
Feature activation+2.74e-02

Pos logit contributions
belt+0.010
amas+0.010
uster+0.010
ippers+0.009
immers+0.009
Neg logit contributions
 how-0.003
 that-0.003
 a-0.003
 written-0.003
 Leo-0.003
TokenMom
Token ID29252
Feature activation-2.41e-02

Pos logit contributions
belt+0.056
amas+0.055
ippers+0.053
immers+0.053
uster+0.053
Neg logit contributions
 how-0.035
 written-0.035
 that-0.034
od-0.034
 a-0.034
Token,
Token ID11
Feature activation+1.49e-02

Pos logit contributions
 that+0.031
 Leo+0.031
od+0.031
 how+0.031
 written+0.031
Neg logit contributions
belt-0.046
amas-0.046
uster-0.044
ippers-0.044
immers-0.043
Token you
Token ID345
Feature activation+1.44e-02

Pos logit contributions
belt+0.033
uster+0.032
amas+0.032
ippers+0.031
immers+0.031
Neg logit contributions
 written-0.015
od-0.015
 says-0.014
 Leo-0.014
 how-0.014
Token're
Token ID821
Feature activation-3.16e-04

Pos logit contributions
belt+0.035
amas+0.034
uster+0.034
ippers+0.034
immers+0.034
Neg logit contributions
 how-0.012
 that-0.012
 Leo-0.012
 written-0.012
od-0.012
Token so
Token ID523
Feature activation-6.30e-03

Pos logit contributions
 how+0.000
 that+0.000
od+0.000
 says+0.000
 written+0.000
Neg logit contributions
belt-0.001
amas-0.001
uster-0.001
ippers-0.001
immers-0.001
Token smart
Token ID4451
Feature activation-1.92e-02

Pos logit contributions
 that+0.007
 how+0.007
 a+0.007
 says+0.007
od+0.007
Neg logit contributions
belt-0.014
amas-0.013
uster-0.013
ippers-0.013
immers-0.013
Token and
Token ID290
Feature activation+7.96e-03

Pos logit contributions
belt+0.049
amas+0.048
uster+0.047
ippers+0.047
immers+0.047
Neg logit contributions
 how-0.023
 that-0.023
 a-0.022
 written-0.022
 Leo-0.022
Token said
Token ID531
Feature activation+3.41e-03

Pos logit contributions
belt+0.018
amas+0.017
uster+0.017
ippers+0.017
immers+0.017
Neg logit contributions
od-0.007
 written-0.007
 how-0.007
 says-0.006
 that-0.006
Token to
Token ID284
Feature activation-7.78e-04

Pos logit contributions
belt+0.007
ippers+0.007
ider+0.007
ro+0.007
amas+0.007
Neg logit contributions
 written-0.003
 says-0.003
od-0.003
 Ma-0.003
 explains-0.003
Token her
Token ID607
Feature activation-1.55e-04

Pos logit contributions
 how+0.000
 that+0.000
 a+0.000
 Leo+0.000
od+0.000
Neg logit contributions
belt-0.002
amas-0.002
uster-0.002
ippers-0.002
immers-0.002
Token:
Token ID25
Feature activation-4.97e-02

Pos logit contributions
 written+0.000
od+0.000
 says+0.000
 Leo+0.000
 Mark+0.000
Neg logit contributions
belt-0.000
ippers-0.000
uster-0.000
immers-0.000
ittens-0.000
Token "
Token ID366
Feature activation+2.16e-02

Pos logit contributions
 that+0.045
 how+0.045
 written+0.044
 says+0.044
od+0.044
Neg logit contributions
belt-0.115
amas-0.112
ippers-0.110
uster-0.110
immers-0.108
TokenHello
Token ID15496
Feature activation+5.15e-03

Pos logit contributions
belt+0.043
amas+0.042
ippers+0.041
uster+0.040
immers+0.040
Neg logit contributions
 how-0.030
 written-0.029
 that-0.029
 a-0.029
od-0.029
Token there
Token ID612
Feature activation+1.86e-02

Pos logit contributions
belt+0.011
amas+0.011
ippers+0.011
uster+0.011
immers+0.011
Neg logit contributions
 written-0.005
 says-0.005
 explains-0.005
od-0.005
 how-0.005
Token!
Token ID0
Feature activation+2.18e-02

Pos logit contributions
amas+0.040
immers+0.039
belt+0.039
ippers+0.038
+0.037
Neg logit contributions
 written-0.018
 how-0.016
od-0.016
 Ma-0.016
 explains-0.016
Token Would
Token ID10928
Feature activation-7.66e-04

Pos logit contributions
belt+0.048
amas+0.047
immers+0.046
uster+0.046
ippers+0.046
Neg logit contributions
 how-0.024
 that-0.024
 a-0.023
 Leo-0.023
 written-0.023
Token you
Token ID345
Feature activation+1.10e-02

Pos logit contributions
od+0.001
 written+0.001
 Ma+0.001
 explains+0.001
 says+0.001
Neg logit contributions
belt-0.001
ittens-0.001
amas-0.001
ider-0.001
-0.001
Token had
Token ID550
Feature activation+1.34e-02

Pos logit contributions
 how+0.021
 that+0.021
 written+0.020
 a+0.019
od+0.019
Neg logit contributions
belt-0.095
amas-0.093
uster-0.091
ippers-0.091
immers-0.091
Token been
Token ID587
Feature activation+3.02e-02

Pos logit contributions
belt+0.033
ippers+0.032
amas+0.032
immers+0.032
uster+0.031
Neg logit contributions
od-0.009
 Ma-0.009
 Leo-0.009
 Luc-0.008
 Mark-0.008
Token with
Token ID351
Feature activation-2.25e-02

Pos logit contributions
belt+0.069
amas+0.067
uster+0.066
ippers+0.066
immers+0.066
Neg logit contributions
 how-0.030
 that-0.030
 Leo-0.029
 says-0.029
 written-0.029
Token her
Token ID607
Feature activation+3.61e-02

Pos logit contributions
od+0.014
 says+0.014
 written+0.014
 Ma+0.013
 how+0.013
Neg logit contributions
belt-0.057
uster-0.056
amas-0.056
ippers-0.055
immers-0.055
Token.
Token ID13
Feature activation-1.07e-03

Pos logit contributions
belt+0.092
amas+0.092
uster+0.090
immers+0.088
ippers+0.088
Neg logit contributions
 that-0.028
 how-0.028
 a-0.027
 written-0.026
 Leo-0.026
Token<|endoftext|>
Token ID50256
Feature activation+9.79e-03

Pos logit contributions
 written+0.001
od+0.001
 how+0.001
 explains+0.001
 says+0.001
Neg logit contributions
-0.002
amas-0.002
-0.002
ittens-0.002
belt-0.002
TokenOnce
Token ID7454
Feature activation+1.83e-02

Pos logit contributions
amas+0.022
belt+0.022
ippers+0.021
uster+0.021
+0.021
Neg logit contributions
 how-0.010
od-0.009
 that-0.009
 written-0.009
 says-0.009
Token upon
Token ID2402
Feature activation-3.82e-02

Pos logit contributions
belt+0.024
ippers+0.023
uster+0.023
amas+0.023
ittens+0.022
Neg logit contributions
 written-0.035
 how-0.034
od-0.034
 Leo-0.034
 that-0.033
Token a
Token ID257
Feature activation-3.67e-03

Pos logit contributions
od+0.015
 says+0.015
 Ma+0.015
 Luc+0.015
 explains+0.014
Neg logit contributions
amas-0.104
belt-0.102
ippers-0.100
ittens-0.100
ivery-0.100
Token time
Token ID640
Feature activation+2.24e-02

Pos logit contributions
 how+0.007
 that+0.007
 written+0.007
od+0.007
 a+0.007
Neg logit contributions
belt-0.005
amas-0.005
uster-0.005
ippers-0.005
immers-0.005
Token,
Token ID11
Feature activation+1.57e-02

Pos logit contributions
belt+0.047
ippers+0.047
+0.047
ittens+0.046
amas+0.046
Neg logit contributions
 written-0.024
 says-0.022
 explains-0.022
od-0.021
 how-0.021
Token you
Token ID345
Feature activation+1.31e-02

Pos logit contributions
belt+0.024
uster+0.024
immers+0.024
ivery+0.024
amas+0.023
Neg logit contributions
 that-0.008
 a-0.008
 how-0.007
 the-0.007
 an-0.006
Token today
Token ID1909
Feature activation+1.36e-02

Pos logit contributions
belt+0.033
uster+0.033
amas+0.033
immers+0.032
ittens+0.032
Neg logit contributions
 that-0.012
 a-0.012
 how-0.012
's-0.011
 written-0.011
Token?
Token ID30
Feature activation+1.08e-02

Pos logit contributions
uster+0.036
belt+0.035
immers+0.034
ittens+0.034
amas+0.034
Neg logit contributions
 that-0.012
 a-0.012
 how-0.011
's-0.011
:-0.010
Tokenâ
Token ID22940
Feature activation+1.10e-02

Pos logit contributions
immers+0.025
belt+0.025
ivery+0.024
igs+0.024
amas+0.024
Neg logit contributions
 a-0.011
 how-0.011
 that-0.011
's-0.011
 says-0.011
Token
Token ID26391
Feature activation-3.76e-04

Pos logit contributions
ippers+0.019
belt+0.019
uster+0.019
immers+0.018
angers+0.018
Neg logit contributions
 that-0.017
 says-0.017
 written-0.017
's-0.017
 how-0.017
Token\n
Token ID198
Feature activation+6.43e-03

Pos logit contributions
 that+0.000
 how+0.000
 a+0.000
's+0.000
 written+0.000
Neg logit contributions
belt-0.001
uster-0.001
immers-0.001
ippers-0.001
amas-0.001
Token\n
Token ID198
Feature activation+2.08e-02

Pos logit contributions
belt+0.017
immers+0.017
amas+0.017
ippers+0.017
ider+0.017
Neg logit contributions
 that-0.004
 a-0.004
 how-0.004
 Leo-0.004
 says-0.004
TokenRyan
Token ID21868
Feature activation-2.37e-02

Pos logit contributions
immers+0.048
belt+0.048
ippers+0.047
angers+0.047
amas+0.047
Neg logit contributions
 that-0.022
 a-0.022
's-0.021
 how-0.020
 written-0.020
Token replied
Token ID8712
Feature activation+5.09e-03

Pos logit contributions
 how+0.015
 that+0.015
 a+0.014
 written+0.014
 Leo+0.014
Neg logit contributions
belt-0.063
amas-0.062
uster-0.061
ippers-0.060
immers-0.060
Token,
Token ID11
Feature activation-1.82e-03

Pos logit contributions
belt+0.012
amas+0.012
uster+0.012
ippers+0.012
immers+0.011
Neg logit contributions
 written-0.004
 says-0.004
od-0.004
 Leo-0.004
 how-0.004
Token
Token ID6184
Feature activation+4.06e-02

Pos logit contributions
 that+0.001
 how+0.001
 a+0.001
 written+0.001
 Leo+0.001
Neg logit contributions
belt-0.005
amas-0.005
immers-0.004
uster-0.004
ippers-0.004
Token a
Token ID257
Feature activation-6.62e-04

Pos logit contributions
od+0.018
 Ma+0.017
 Luc+0.017
 explains+0.017
 says+0.017
Neg logit contributions
amas-0.097
belt-0.094
ivery-0.092
ippers-0.092
ittens-0.092
Token time
Token ID640
Feature activation+2.78e-02

Pos logit contributions
 how+0.001
 that+0.001
 written+0.001
 a+0.001
od+0.001
Neg logit contributions
belt-0.001
amas-0.001
uster-0.001
ippers-0.001
immers-0.001
Token,
Token ID11
Feature activation+1.72e-02

Pos logit contributions
belt+0.059
ippers+0.059
+0.058
amas+0.058
ittens+0.058
Neg logit contributions
 written-0.030
 says-0.027
 explains-0.027
 how-0.027
od-0.027
Token there
Token ID612
Feature activation+1.58e-02

Pos logit contributions
belt+0.034
amas+0.033
uster+0.032
ippers+0.032
immers+0.032
Neg logit contributions
 how-0.023
 that-0.023
 a-0.023
 written-0.022
 Leo-0.022
Token was
Token ID373
Feature activation-2.35e-02

Pos logit contributions
+0.030
amas+0.030
belt+0.030
ro+0.029
ippers+0.028
Neg logit contributions
 written-0.020
od-0.018
 explains-0.018
 ze-0.018
 Ma-0.018
Token a
Token ID257
Feature activation-6.58e-03

Pos logit contributions
 says+0.009
od+0.008
 explains+0.008
 how+0.008
 written+0.008
Neg logit contributions
amas-0.065
ittens-0.063
-0.063
belt-0.063
uster-0.063
Token little
Token ID1310
Feature activation+3.99e-02

Pos logit contributions
 that+0.008
 a+0.008
 how+0.008
 Leo+0.008
 says+0.007
Neg logit contributions
belt-0.014
amas-0.014
uster-0.014
immers-0.013
ippers-0.013
Token boy
Token ID2933
Feature activation-2.06e-02

Pos logit contributions
belt+0.073
amas+0.071
immers+0.069
ippers+0.067
uster+0.067
Neg logit contributions
 that-0.064
 a-0.063
 says-0.060
 Leo-0.060
 how-0.059
Token named
Token ID3706
Feature activation+1.76e-02

Pos logit contributions
 written+0.036
 explains+0.035
 says+0.034
 Luc+0.034
od+0.033
Neg logit contributions
ittens-0.031
belt-0.029
ippers-0.028
ivering-0.027
ider-0.027
Token Tim
Token ID5045
Feature activation-8.49e-03

Pos logit contributions
belt+0.036
amas+0.035
uster+0.034
ittens+0.034
immers+0.034
Neg logit contributions
 written-0.022
od-0.021
 how-0.021
 explains-0.020
 that-0.020
Tokenmy
Token ID1820
Feature activation-6.38e-03

Pos logit contributions
 Luc+0.014
od+0.014
 Leo+0.014
 written+0.014
 Sof+0.014
Neg logit contributions
belt-0.012
ishing-0.012
ivery-0.012
ittens-0.012
uster-0.011
TokenYou
Token ID1639
Feature activation+1.19e-02

Pos logit contributions
belt+0.003
amas+0.002
uster+0.002
immers+0.002
ippers+0.002
Neg logit contributions
 that-0.001
 a-0.001
 how-0.001
 Leo-0.001
od-0.001
Token should
Token ID815
Feature activation+1.74e-03

Pos logit contributions
belt+0.028
amas+0.027
uster+0.027
ippers+0.027
immers+0.027
Neg logit contributions
 how-0.011
 that-0.010
 Leo-0.010
od-0.010
 written-0.010
Token not
Token ID407
Feature activation-3.37e-02

Pos logit contributions
belt+0.004
amas+0.004
uster+0.004
ippers+0.004
immers+0.004
Neg logit contributions
 that-0.002
 how-0.001
 a-0.001
 Leo-0.001
 says-0.001
Token run
Token ID1057
Feature activation+2.96e-02

Pos logit contributions
 how+0.037
od+0.036
 that+0.036
 Leo+0.036
 written+0.035
Neg logit contributions
belt-0.074
amas-0.072
uster-0.070
ippers-0.070
immers-0.070
Token away
Token ID1497
Feature activation+1.67e-02

Pos logit contributions
belt+0.076
amas+0.076
ippers+0.074
uster+0.074
immers+0.073
Neg logit contributions
 that-0.025
 a-0.023
 how-0.022
's-0.020
 says-0.020
Token from
Token ID422
Feature activation-3.94e-02

Pos logit contributions
belt+0.042
amas+0.041
immers+0.041
ippers+0.041
uster+0.040
Neg logit contributions
 written-0.012
od-0.011
 says-0.011
 how-0.011
 Leo-0.010
Token me
Token ID502
Feature activation+3.06e-03

Pos logit contributions
 that+0.013
 how+0.012
 a+0.012
 Leo+0.010
 the+0.009
Neg logit contributions
belt-0.123
uster-0.121
amas-0.120
ippers-0.119
immers-0.118
Token!
Token ID0
Feature activation-4.44e-03

Pos logit contributions
belt+0.008
amas+0.007
ippers+0.007
immers+0.007
uster+0.007
Neg logit contributions
 how-0.002
 written-0.002
od-0.002
 that-0.002
 Leo-0.002
Token Come
Token ID7911
Feature activation+3.01e-02

Pos logit contributions
 that+0.005
 Leo+0.005
 how+0.005
 says+0.005
 Mark+0.005
Neg logit contributions
belt-0.010
immers-0.010
amas-0.010
uster-0.010
ippers-0.009
Token back
Token ID736
Feature activation-2.10e-03

Pos logit contributions
belt+0.073
amas+0.072
immers+0.071
uster+0.070
ivery+0.070
Neg logit contributions
 that-0.031
 a-0.030
 how-0.028
 Leo-0.027
's-0.027
Token now
Token ID783
Feature activation+5.36e-03

Pos logit contributions
 that+0.002
 a+0.002
 how+0.002
 says+0.002
 Leo+0.002
Neg logit contributions
amas-0.005
belt-0.005
uster-0.005
ippers-0.005
ittens-0.005
Token playing
Token ID2712
Feature activation+2.61e-02

Pos logit contributions
 written+0.008
 how+0.008
 says+0.008
 explains+0.007
od+0.007
Neg logit contributions
belt-0.014
amas-0.014
ippers-0.014
angers-0.014
immers-0.014
Token together
Token ID1978
Feature activation-3.05e-03

Pos logit contributions
belt+0.062
uster+0.060
amas+0.060
ippers+0.059
immers+0.058
Neg logit contributions
 a-0.026
 that-0.026
 how-0.026
 Leo-0.024
od-0.024
Token in
Token ID287
Feature activation-1.22e-02

Pos logit contributions
 says+0.004
 Ma+0.003
 written+0.003
 Mrs+0.003
 Luc+0.003
Neg logit contributions
belt-0.006
ippers-0.006
amas-0.006
immers-0.006
ider-0.006
Token the
Token ID262
Feature activation-2.17e-02

Pos logit contributions
 written+0.005
 says+0.005
od+0.004
 that+0.004
 how+0.004
Neg logit contributions
belt-0.034
ippers-0.033
amas-0.033
immers-0.033
uster-0.033
Token garden
Token ID11376
Feature activation-5.94e-03

Pos logit contributions
 how+0.036
 that+0.036
 written+0.035
od+0.035
 Leo+0.035
Neg logit contributions
belt-0.035
amas-0.034
uster-0.033
immers-0.033
ippers-0.033
Token when
Token ID618
Feature activation-3.59e-02

Pos logit contributions
 written+0.006
 says+0.006
 explains+0.006
od+0.006
 Leo+0.006
Neg logit contributions
belt-0.013
ippers-0.013
ider-0.012
immers-0.012
amas-0.012
Token they
Token ID484
Feature activation+8.77e-03

Pos logit contributions
 how+0.024
 that+0.024
od+0.023
 written+0.023
 says+0.023
Neg logit contributions
belt-0.093
amas-0.091
uster-0.090
ippers-0.090
immers-0.089
Token saw
Token ID2497
Feature activation-4.74e-02

Pos logit contributions
belt+0.016
ivery+0.015
+0.015
amas+0.015
ippers+0.015
Neg logit contributions
 how-0.012
 that-0.011
 says-0.011
 Mark-0.011
 explains-0.011
Token something
Token ID1223
Feature activation-5.26e-02

Pos logit contributions
od+0.042
 says+0.041
 explains+0.041
 written+0.040
opot+0.039
Neg logit contributions
belt-0.087
 rink-0.085
 tub-0.085
ippers-0.084
immers-0.083
Token interesting
Token ID3499
Feature activation-1.44e-03

Pos logit contributions
 Ma+0.079
 Leo+0.072
 Lou+0.070
 Joseph+0.069
 Mark+0.069
Neg logit contributions
belt-0.088
suit-0.084
ippers-0.082
 rink-0.081
ider-0.081
Token in
Token ID287
Feature activation+2.37e-02

Pos logit contributions
 Ma+0.002
 Leo+0.002
 Mrs+0.002
 Luc+0.002
 Lou+0.002
Neg logit contributions
belt-0.003
ider-0.003
ippers-0.003
ro-0.003
ishing-0.003
Token the
Token ID262
Feature activation-1.52e-02

Pos logit contributions
 written+0.007
od+0.007
 explains+0.007
 says+0.007
's+0.007
Neg logit contributions
belt-0.023
amas-0.023
angers-0.023
igs-0.023
immers-0.022
Token blocks
Token ID7021
Feature activation-6.22e-03

Pos logit contributions
 how+0.021
 that+0.021
 written+0.020
 a+0.020
 Leo+0.020
Neg logit contributions
belt-0.029
amas-0.028
uster-0.028
ippers-0.028
immers-0.028
Token.
Token ID13
Feature activation-6.52e-03

Pos logit contributions
 Ma+0.007
od+0.007
 Luc+0.006
 written+0.006
 Leo+0.006
Neg logit contributions
belt-0.013
amas-0.012
ippers-0.012
immers-0.012
ider-0.012
Token He
Token ID679
Feature activation+1.58e-02

Pos logit contributions
 a+0.008
 that+0.008
 Leo+0.008
 how+0.008
 says+0.007
Neg logit contributions
belt-0.015
immers-0.014
amas-0.014
uster-0.014
ippers-0.013
Token reached
Token ID4251
Feature activation+4.78e-03

Pos logit contributions
belt+0.032
ro+0.030
amas+0.030
ider+0.030
ittens+0.029
Neg logit contributions
od-0.018
 that-0.018
 how-0.018
 Ma-0.017
 Mark-0.017
Token for
Token ID329
Feature activation-4.23e-02

Pos logit contributions
belt+0.012
amas+0.011
uster+0.011
ippers+0.011
immers+0.011
Neg logit contributions
 how-0.004
 that-0.004
 a-0.004
 Leo-0.004
 says-0.004
Token them
Token ID606
Feature activation-3.65e-02

Pos logit contributions
 how+0.027
 written+0.027
od+0.027
 that+0.027
 says+0.026
Neg logit contributions
belt-0.109
amas-0.108
uster-0.107
ippers-0.106
immers-0.106
Token and
Token ID290
Feature activation-5.39e-03

Pos logit contributions
od+0.041
 Ma+0.040
 says+0.040
 Luc+0.039
 written+0.039
Neg logit contributions
belt-0.073
amas-0.072
ider-0.070
ippers-0.069
immers-0.069
Token pulled
Token ID5954
Feature activation-5.32e-03

Pos logit contributions
od+0.007
 written+0.007
 that+0.006
 Ma+0.006
 Luc+0.006
Neg logit contributions
belt-0.010
ro-0.010
ider-0.010
amas-0.009
-0.009
Token them
Token ID606
Feature activation-2.04e-02

Pos logit contributions
 how+0.005
 that+0.005
 written+0.005
od+0.005
 says+0.005
Neg logit contributions
belt-0.012
amas-0.012
uster-0.012
ippers-0.012
immers-0.012
Token down
Token ID866
Feature activation+1.22e-02

Pos logit contributions
od+0.024
 written+0.024
 says+0.024
 how+0.024
 Ma+0.023
Neg logit contributions
belt-0.041
amas-0.040
ippers-0.039
immers-0.039
uster-0.039
Token Fin
Token ID4463
Feature activation-2.43e-02

Pos logit contributions
 how+0.009
 that+0.009
 written+0.009
od+0.009
 a+0.009
Neg logit contributions
belt-0.017
amas-0.016
uster-0.016
ippers-0.016
immers-0.016
Token was
Token ID373
Feature activation-1.97e-02

Pos logit contributions
 how+0.021
 that+0.021
 a+0.020
 written+0.020
 Leo+0.020
Neg logit contributions
belt-0.059
amas-0.057
uster-0.057
ippers-0.056
immers-0.056
Token happy
Token ID3772
Feature activation-1.34e-02

Pos logit contributions
od+0.021
 Ma+0.020
 says+0.020
 Luc+0.020
 Leo+0.019
Neg logit contributions
belt-0.041
amas-0.039
uster-0.038
ider-0.038
immers-0.038
Token to
Token ID284
Feature activation+7.70e-03

Pos logit contributions
od+0.011
 written+0.011
 Ma+0.011
 says+0.011
 explains+0.011
Neg logit contributions
belt-0.030
amas-0.030
ippers-0.030
immers-0.030
ider-0.029
Token learn
Token ID2193
Feature activation-6.22e-02

Pos logit contributions
belt+0.018
amas+0.017
uster+0.017
immers+0.017
ippers+0.017
Neg logit contributions
 that-0.008
 how-0.008
 a-0.008
 says-0.008
 Leo-0.008
Token something
Token ID1223
Feature activation-4.39e-02

Pos logit contributions
 Ma+0.061
od+0.059
 Leo+0.058
 Luc+0.058
 Lou+0.057
Neg logit contributions
belt-0.131
igs-0.125
immers-0.125
ippers-0.125
uster-0.124
Token new
Token ID649
Feature activation+7.76e-03

Pos logit contributions
 Ma+0.068
 Leo+0.064
 Lou+0.061
 written+0.060
 Mary+0.060
Neg logit contributions
immers-0.067
igs-0.066
belt-0.066
ippers-0.064
 rink-0.064
Token.
Token ID13
Feature activation+1.23e-04

Pos logit contributions
belt+0.014
ippers+0.014
immers+0.013
ro+0.013
amas+0.013
Neg logit contributions
 Ma-0.010
 Mrs-0.010
 Leo-0.009
 Luc-0.009
 Sof-0.009
Token He
Token ID679
Feature activation+2.54e-02

Pos logit contributions
belt+0.000
amas+0.000
uster+0.000
immers+0.000
ippers+0.000
Neg logit contributions
 that-0.000
 how-0.000
 Leo-0.000
 a-0.000
 says-0.000
Token told
Token ID1297
Feature activation-2.18e-02

Pos logit contributions
belt+0.054
amas+0.052
immers+0.051
uster+0.051
ippers+0.051
Neg logit contributions
od-0.027
 how-0.026
 that-0.026
 Mark-0.026
 written-0.026
Token his
Token ID465
Feature activation-1.16e-02

Pos logit contributions
 written+0.020
od+0.019
 Ma+0.018
 says+0.018
 how+0.018
Neg logit contributions
belt-0.050
amas-0.050
uster-0.049
immers-0.049
ippers-0.049
Token a
Token ID257
Feature activation-4.05e-02

Pos logit contributions
 says+0.038
od+0.036
 explains+0.035
opot+0.035
 whose+0.031
Neg logit contributions
amas-0.143
belt-0.143
ro-0.143
uster-0.141
 lifestyle-0.135
Token noise
Token ID7838
Feature activation-2.62e-02

Pos logit contributions
 written+0.055
 how+0.054
od+0.054
 Leo+0.051
 Ma+0.051
Neg logit contributions
amas-0.074
belt-0.072
uster-0.069
ivery-0.069
ittens-0.068
Token.
Token ID13
Feature activation-4.15e-03

Pos logit contributions
 Ma+0.032
 Luc+0.032
od+0.031
 Sof+0.031
 Leo+0.030
Neg logit contributions
uster-0.046
ro-0.045
ivering-0.045
amas-0.045
belt-0.045
Token It
Token ID632
Feature activation-6.67e-03

Pos logit contributions
 how+0.004
 that+0.004
 written+0.004
od+0.004
 says+0.004
Neg logit contributions
belt-0.009
amas-0.009
uster-0.009
ippers-0.009
immers-0.009
Token sounded
Token ID14846
Feature activation-2.49e-02

Pos logit contributions
od+0.008
 Ma+0.008
 how+0.008
 whose+0.008
 Leo+0.007
Neg logit contributions
belt-0.012
amas-0.012
ivery-0.011
ro-0.011
ivering-0.011
Token like
Token ID588
Feature activation-4.37e-02

Pos logit contributions
od+0.033
 says+0.031
 Leo+0.029
 Luc+0.029
 Ma+0.029
Neg logit contributions
belt-0.046
uster-0.045
coat-0.043
ivery-0.043
ro-0.043
Token a
Token ID257
Feature activation-1.69e-02

Pos logit contributions
od+0.022
 says+0.020
 how+0.019
's+0.018
 explains+0.018
Neg logit contributions
belt-0.119
amas-0.115
uster-0.114
ippers-0.113
immers-0.112
Token pretty
Token ID2495
Feature activation+2.19e-02

Pos logit contributions
 how+0.020
 that+0.019
od+0.019
 written+0.019
 says+0.019
Neg logit contributions
belt-0.035
amas-0.034
uster-0.034
ippers-0.033
immers-0.033
Token bird
Token ID6512
Feature activation-4.03e-03

Pos logit contributions
belt+0.047
amas+0.046
uster+0.045
immers+0.045
ippers+0.045
Neg logit contributions
 that-0.026
 how-0.026
 a-0.025
 written-0.025
 says-0.025
Token singing
Token ID13777
Feature activation+3.18e-03

Pos logit contributions
 how+0.003
 Ma+0.003
 Leo+0.003
od+0.003
 Luc+0.003
Neg logit contributions
belt-0.010
amas-0.009
ider-0.009
ippers-0.009
immers-0.009
Token.
Token ID13
Feature activation-1.42e-02

Pos logit contributions
belt+0.008
amas+0.007
uster+0.007
ro+0.007
ippers+0.007
Neg logit contributions
 says-0.003
od-0.003
 explains-0.003
 Luc-0.003
 Leo-0.002