TOP ACTIVATIONS
MAX = 3.149

Tommy wanted a very special toy , but he did not
, he tri pped on an old brick and all his
. They both wanted a toy they saw in the store
request . She wanted a toy with a handle so she
a small bird perched on an old crate . The bird
, he managed to jump an extra long , far distance
" That 's a nice toy ," she said . "
je ep is not a toy . We might break it
She wanted to make a toy missile . She asked her
lost or sick or hurt ," mom said .
, it 's my favorite toy !" Lily replied .
. <|endoftext|> John wanted a toy truck . His mom said
<|endoftext|> Mark wanted a new toy but couldn â
him it was not a toy . But Tim did not
mom gave her a new toy . It was a little
and get to the water ." Sam whispered , trying to
the door and there was an old lady .
. He requested a soft toy , but the shop keeper
usual area . She reached an expensive part of town and
thinks it is a shiny toy that makes his dad 's

MIN ACTIVATIONS
MIN = -3.811

not played with for a long time . It was a
red bike with a com fy seat and shiny silver bell
Billy was wearing a really cool shirt with a dinosaur on
the beach and build sand cast les .
play . Lily saw a merry - go - round and
a strong and brave bear on it . Benny wanted to
things inside , but the brave little girl saw him .
big eyes and a w agging tail . The young boy
man walking on a tight ro pe . He had to
with his favorite cartoon character on the cover . He asked
room with a pretty tut oring desk .
around and saw a big do ggy smile at her .
it ! The white all igator was scared so
to have found such a beautiful treasure . <|endoftext|> Once upon
had gone away for a long time . Lucy sighed sadly
Ben smiled and hugged its neck . Mom took
hug ." The quiet all igator did not know
not played with for a long time . It is a
. When the white all igator got out of
Lily picked the apples and shared them with her friends .

INTERVAL 1.409 - 3.149
CONTAINS 1.095%

a chair . She had an idea . " Ben ,
find a place to play at sunset . He walked and
, Mama said it was time to go to the store
a lot of fun . They are good friends . <|endoftext|>
They had made it to the other side of their journey

INTERVAL -0.331 - 1.409
CONTAINS 70.431%

her mom 's message . Later that day ,
dirt . They dug it out and saw that it was
away and hid in a shelter . The little
playing . " I don 't want to come
, " It sounds like it is coming from the river

INTERVAL -2.071 - -0.331
CONTAINS 28.383%

only nice and was not going to scare the town .
them and said , " You naughty boys ! How dare
try new things . Ros ie asked Benny if
smiled . They took their new toys to the water ,
big , white caul iflower . He was so proud of

INTERVAL -3.811 - -2.071
CONTAINS 0.091%

." So , Tim my shared his cars with the little
they played pretend for a long time . They had a
did not know how to share . " I
treasure map and after a long walk , she finally found
. You never play with a gun . You understand ?"
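
The four INTERVAL ranges above appear to be equal-width bins between MIN (-3.811) and MAX (3.149), and the four CONTAINS percentages sum to 100%. A minimal sketch that reproduces the bin edges under that assumption is given here; the function name `interval_edges` and the default of four bins are illustrative choices, not something the dashboard itself states.

```python
def interval_edges(lo: float, hi: float, n_bins: int = 4) -> list[float]:
    """Return n_bins + 1 equally spaced edges from lo to hi, rounded to 3 decimals.

    Assumption: the dashboard splits [MIN, MAX] into equal-width intervals.
    """
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]


# MIN and MAX as reported in the MIN ACTIVATIONS / TOP ACTIVATIONS headers.
edges = interval_edges(-3.811, 3.149)

# CONTAINS percentages copied from the four INTERVAL blocks, top to bottom.
shares = [1.095, 70.431, 28.383, 0.091]
total_share = sum(shares)

print(edges)
print(total_share)
```

Consecutive pairs of `edges` correspond to the INTERVAL ranges listed above, read from the lowest bin to the highest.
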
Token Tommy
Token ID19919
Feature activation+1.14e+00

Pos logit contributions
les+0.010
lights+0.010
ork+0.010
rily+0.010
adoes+0.009
Neg logit contributions
 least-0.009
oh-0.007
 say-0.007
 kidding-0.007
 oh-0.007
Token wanted
Token ID2227
Feature activation+5.92e-01

Pos logit contributions
 least+0.056
 kidding+0.045
 oh+0.044
oh+0.043
 ourselves+0.043
Neg logit contributions
les-0.036
ork-0.033
lights-0.032
lf-0.031
ened-0.030
Token a
Token ID257
Feature activation+3.38e-03

Pos logit contributions
 least+0.027
oh+0.023
 kidding+0.021
 say+0.021
 ourselves+0.020
Neg logit contributions
les-0.019
ork-0.019
lights-0.019
adoes-0.018
rily-0.017
Token very
Token ID845
Feature activation-9.88e-01

Pos logit contributions
 least+0.000
oh+0.000
 say+0.000
 oh+0.000
 kidding+0.000
Neg logit contributions
les-0.000
lights-0.000
ork-0.000
rily-0.000
adoes-0.000
Token special
Token ID2041
Feature activation+3.96e-01

Pos logit contributions
les+0.047
lights+0.046
adoes+0.045
rily+0.045
ened+0.042
Neg logit contributions
 least-0.033
 say-0.027
oh-0.025
 wait-0.025
 oh-0.025
Token toy
Token ID13373
Feature activation+3.15e+00

Pos logit contributions
 least+0.019
oh+0.016
 say+0.015
 kidding+0.015
 oh+0.014
Neg logit contributions
les-0.013
lights-0.012
ork-0.012
rily-0.012
adoes-0.011
Token,
Token ID11
Feature activation+1.17e+00

Pos logit contributions
oh+0.149
 least+0.148
Ta+0.138
 kidding+0.137
 elbows+0.133
Neg logit contributions
ork-0.073
lights-0.069
 soldier-0.067
les-0.067
 novels-0.065
Token but
Token ID475
Feature activation-5.24e-01

Pos logit contributions
 least+0.047
oh+0.039
 ourselves+0.033
 kidding+0.033
 am+0.032
Neg logit contributions
lights-0.046
les-0.045
ork-0.044
rily-0.043
adoes-0.042
Token he
Token ID339
Feature activation+1.81e-01

Pos logit contributions
les+0.023
lights+0.022
ork+0.021
rily+0.021
adoes+0.021
Neg logit contributions
 least-0.021
 oh-0.016
oh-0.015
 say-0.015
 kidding-0.015
Token did
Token ID750
Feature activation-5.07e-01

Pos logit contributions
 least+0.007
oh+0.006
 say+0.006
 kidding+0.005
 oh+0.005
Neg logit contributions
les-0.007
lights-0.007
ork-0.007
rily-0.007
adoes-0.007
Token not
Token ID407
Feature activation-3.73e-01

Pos logit contributions
les+0.025
lights+0.025
rily+0.024
adoes+0.023
ork+0.023
Neg logit contributions
 least-0.016
 say-0.012
 oh-0.012
oh-0.012
 am-0.012
Token,
Token ID11
Feature activation-8.02e-02

Pos logit contributions
 least+0.005
oh+0.004
 say+0.004
 oh+0.004
 kidding+0.004
Neg logit contributions
les-0.007
lights-0.006
ork-0.006
rily-0.006
adoes-0.006
Token he
Token ID339
Feature activation-5.22e-01

Pos logit contributions
les+0.003
lights+0.003
ork+0.003
rily+0.003
adoes+0.003
Neg logit contributions
 least-0.003
oh-0.002
 say-0.002
 oh-0.002
 kidding-0.002
Token tri
Token ID1333
Feature activation+3.71e-02

Pos logit contributions
lights+0.023
les+0.022
rily+0.022
adoes+0.022
ork+0.022
Neg logit contributions
 least-0.019
oh-0.015
 say-0.014
 oh-0.013
 kidding-0.013
Tokenpped
Token ID1496
Feature activation+6.36e-01

Pos logit contributions
 least+0.002
oh+0.002
 say+0.002
 oh+0.002
 kidding+0.002
Neg logit contributions
les-0.001
lights-0.001
ork-0.001
rily-0.001
adoes-0.001
Token on
Token ID319
Feature activation-4.82e-01

Pos logit contributions
 least+0.031
oh+0.025
 kidding+0.025
 oh+0.024
 say+0.024
Neg logit contributions
les-0.021
lights-0.019
ork-0.019
rily-0.019
adoes-0.018
Token an
Token ID281
Feature activation+3.14e+00

Pos logit contributions
les+0.026
lights+0.026
rily+0.025
ork+0.025
adoes+0.024
Neg logit contributions
 least-0.013
oh-0.008
 say-0.008
 oh-0.008
 kidding-0.008
Token old
Token ID1468
Feature activation+1.03e+00

Pos logit contributions
 least+0.130
oh+0.122
 am+0.120
'm+0.114
 ourselves+0.107
Neg logit contributions
les-0.112
lf-0.105
rily-0.103
lights-0.096
ork-0.090
Token brick
Token ID17214
Feature activation+4.87e-01

Pos logit contributions
 least+0.048
oh+0.039
 kidding+0.036
 Mir+0.035
 ourselves+0.035
Neg logit contributions
les-0.036
ork-0.035
lights-0.034
adoes-0.033
rily-0.032
Token and
Token ID290
Feature activation-8.70e-02

Pos logit contributions
 least+0.021
oh+0.017
 kidding+0.016
 say+0.016
 oh+0.016
Neg logit contributions
les-0.018
lights-0.018
ork-0.017
rily-0.017
adoes-0.016
Token all
Token ID477
Feature activation+2.96e-01

Pos logit contributions
les+0.004
lights+0.004
ork+0.003
rily+0.003
adoes+0.003
Neg logit contributions
 least-0.003
oh-0.003
 say-0.003
 oh-0.002
 kidding-0.002
Token his
Token ID465
Feature activation+2.96e-01

Pos logit contributions
 least+0.013
oh+0.010
 oh+0.010
 say+0.010
 kidding+0.010
Neg logit contributions
les-0.011
lights-0.011
rily-0.010
ork-0.010
adoes-0.010
Token.
Token ID13
Feature activation+1.18e-01

Pos logit contributions
 least+0.042
oh+0.035
 oh+0.034
 kidding+0.033
 worst+0.032
Neg logit contributions
les-0.029
lights-0.027
ork-0.025
rily-0.024
ened-0.023
Token They
Token ID1119
Feature activation+9.20e-01

Pos logit contributions
 least+0.005
oh+0.004
 say+0.004
 oh+0.004
 kidding+0.004
Neg logit contributions
les-0.004
lights-0.004
ork-0.004
rily-0.004
adoes-0.004
Token both
Token ID1111
Feature activation+1.47e+00

Pos logit contributions
 least+0.041
oh+0.034
 oh+0.034
 kidding+0.032
 Mir+0.032
Neg logit contributions
les-0.034
lights-0.029
ened-0.028
adoes-0.028
rily-0.028
Token wanted
Token ID2227
Feature activation+4.23e-01

Pos logit contributions
 least+0.075
 oh+0.061
 ourselves+0.060
oh+0.060
 kidding+0.058
Neg logit contributions
les-0.046
ened-0.038
lights-0.038
ork-0.037
rily-0.035
Token a
Token ID257
Feature activation-2.27e-01

Pos logit contributions
 least+0.019
oh+0.016
 kidding+0.014
 say+0.014
 oh+0.014
Neg logit contributions
les-0.015
lights-0.015
ork-0.014
adoes-0.014
rily-0.013
Token toy
Token ID13373
Feature activation+3.11e+00

Pos logit contributions
les+0.008
lights+0.008
rily+0.008
ork+0.008
adoes+0.008
Neg logit contributions
 least-0.010
oh-0.008
 say-0.008
 oh-0.008
 kidding-0.008
Token they
Token ID484
Feature activation+5.68e-01

Pos logit contributions
oh+0.130
 least+0.128
 elbows+0.122
Ta+0.121
 kidding+0.120
Neg logit contributions
lights-0.086
ork-0.084
les-0.080
 soldier-0.079
rily-0.078
Token saw
Token ID2497
Feature activation+8.62e-01

Pos logit contributions
 least+0.022
oh+0.018
 kidding+0.017
 Mir+0.017
 oh+0.017
Neg logit contributions
les-0.023
lights-0.022
ork-0.021
rily-0.021
adoes-0.020
Token in
Token ID287
Feature activation-1.70e-01

Pos logit contributions
 least+0.038
oh+0.031
 say+0.029
 kidding+0.029
 Mir+0.028
Neg logit contributions
les-0.033
lights-0.031
ork-0.030
rily-0.030
adoes-0.029
Token the
Token ID262
Feature activation+1.01e+00

Pos logit contributions
les+0.009
lights+0.009
ork+0.008
rily+0.008
adoes+0.008
Neg logit contributions
 least-0.005
oh-0.004
 say-0.004
 kidding-0.003
 oh-0.003
Token store
Token ID3650
Feature activation+8.59e-01

Pos logit contributions
 least+0.043
 oh+0.038
oh+0.038
 kidding+0.037
 say+0.036
Neg logit contributions
les-0.034
ork-0.032
lights-0.031
adoes-0.029
 tant-0.028
Token request
Token ID2581
Feature activation+1.66e+00

Pos logit contributions
 least+0.017
oh+0.014
 say+0.013
 kidding+0.013
 oh+0.013
Neg logit contributions
les-0.013
lights-0.013
ork-0.013
rily-0.012
adoes-0.012
Token.
Token ID13
Feature activation-6.42e-01

Pos logit contributions
 least+0.078
oh+0.065
 kidding+0.062
 elbows+0.062
 ourselves+0.060
Neg logit contributions
ork-0.052
les-0.052
lights-0.048
adoes-0.045
bell-0.045
Token She
Token ID1375
Feature activation-4.22e-01

Pos logit contributions
les+0.034
lights+0.033
rily+0.032
ork+0.031
adoes+0.030
Neg logit contributions
 least-0.018
 say-0.014
oh-0.013
 am-0.012
 oh-0.012
Token wanted
Token ID2227
Feature activation+2.21e-01

Pos logit contributions
lights+0.018
les+0.018
rily+0.017
adoes+0.017
ork+0.017
Neg logit contributions
 least-0.016
oh-0.013
 say-0.013
 oh-0.012
 am-0.012
Token a
Token ID257
Feature activation-3.43e-01

Pos logit contributions
 least+0.009
oh+0.008
 say+0.007
 kidding+0.007
 oh+0.007
Neg logit contributions
les-0.008
lights-0.008
ork-0.008
rily-0.008
adoes-0.008
Token toy
Token ID13373
Feature activation+3.09e+00

Pos logit contributions
les+0.013
lights+0.013
rily+0.012
adoes+0.011
ork+0.011
Neg logit contributions
 least-0.015
oh-0.012
 say-0.012
 oh-0.012
 kidding-0.011
Token with
Token ID351
Feature activation-7.75e-02

Pos logit contributions
 least+0.132
oh+0.128
 elbows+0.121
 kidding+0.119
Ta+0.116
Neg logit contributions
ork-0.088
les-0.082
lights-0.081
bell-0.081
 soldier-0.080
Token a
Token ID257
Feature activation-1.18e+00

Pos logit contributions
les+0.004
lights+0.004
ork+0.004
rily+0.004
adoes+0.004
Neg logit contributions
 least-0.002
oh-0.002
 say-0.002
 oh-0.002
 kidding-0.002
Token handle
Token ID5412
Feature activation+9.34e-01

Pos logit contributions
lights+0.047
les+0.047
rily+0.044
adoes+0.043
ork+0.041
Neg logit contributions
 least-0.048
 say-0.040
oh-0.038
 oh-0.038
 am-0.037
Token so
Token ID523
Feature activation+1.23e-01

Pos logit contributions
 least+0.042
oh+0.033
 kidding+0.032
 oh+0.032
 ourselves+0.031
Neg logit contributions
les-0.034
lights-0.033
ork-0.031
rily-0.031
bell-0.030
Token she
Token ID673
Feature activation-1.15e+00

Pos logit contributions
 least+0.004
oh+0.003
 say+0.003
 kidding+0.002
 oh+0.002
Neg logit contributions
les-0.006
lights-0.006
ork-0.006
rily-0.006
adoes-0.006
Token a
Token ID257
Feature activation+4.37e-01

Pos logit contributions
 least+0.035
oh+0.028
 kidding+0.024
 say+0.024
 wait+0.023
Neg logit contributions
les-0.043
rily-0.041
lights-0.041
adoes-0.041
ork-0.041
Token small
Token ID1402
Feature activation+4.41e-01

Pos logit contributions
 least+0.020
oh+0.016
 say+0.015
 kidding+0.015
 oh+0.015
Neg logit contributions
les-0.015
lights-0.015
ork-0.015
rily-0.014
adoes-0.014
Token bird
Token ID6512
Feature activation-2.45e-01

Pos logit contributions
 least+0.020
oh+0.016
 say+0.015
 oh+0.015
 kidding+0.015
Neg logit contributions
les-0.016
lights-0.015
ork-0.015
rily-0.015
adoes-0.014
Token perched
Token ID49264
Feature activation+2.03e-01

Pos logit contributions
lights+0.010
les+0.010
ork+0.010
rily+0.010
adoes+0.010
Neg logit contributions
 least-0.009
oh-0.007
 say-0.007
 oh-0.007
 kidding-0.007
Token on
Token ID319
Feature activation+4.04e-01

Pos logit contributions
 least+0.008
oh+0.006
 say+0.006
 oh+0.006
 kidding+0.006
Neg logit contributions
les-0.008
lights-0.008
rily-0.008
ork-0.008
adoes-0.008
Token an
Token ID281
Feature activation+3.02e+00

Pos logit contributions
 least+0.014
oh+0.010
 say+0.010
 kidding+0.010
 oh+0.009
Neg logit contributions
les-0.019
lights-0.018
ork-0.018
rily-0.018
adoes-0.017
Token old
Token ID1468
Feature activation+8.69e-01

Pos logit contributions
 least+0.136
 am+0.130
oh+0.124
'm+0.120
 kidding+0.118
Neg logit contributions
les-0.095
rily-0.094
lf-0.085
lights-0.080
bell-0.080
Token crate
Token ID27021
Feature activation+3.08e-01

Pos logit contributions
 least+0.041
oh+0.032
 kidding+0.031
 say+0.030
 ourselves+0.030
Neg logit contributions
les-0.030
ork-0.030
rily-0.028
lights-0.028
adoes-0.027
Token.
Token ID13
Feature activation+2.08e-01

Pos logit contributions
 least+0.013
oh+0.010
 say+0.010
 kidding+0.010
 oh+0.010
Neg logit contributions
les-0.012
lights-0.012
ork-0.011
rily-0.011
adoes-0.011
Token The
Token ID383
Feature activation+2.68e-01

Pos logit contributions
 least+0.008
oh+0.006
 say+0.006
 kidding+0.006
 oh+0.006
Neg logit contributions
les-0.009
lights-0.009
ork-0.008
rily-0.008
adoes-0.008
Token bird
Token ID6512
Feature activation-1.00e+00

Pos logit contributions
 least+0.014
oh+0.011
 kidding+0.011
 oh+0.011
 say+0.011
Neg logit contributions
les-0.008
lights-0.008
ork-0.008
rily-0.007
adoes-0.007
Token,
Token ID11
Feature activation-4.42e-01

Pos logit contributions
 least+0.020
oh+0.015
 say+0.014
 oh+0.014
 kidding+0.014
Neg logit contributions
les-0.023
lights-0.022
ork-0.022
rily-0.021
adoes-0.021
Token he
Token ID339
Feature activation-2.01e-01

Pos logit contributions
les+0.020
lights+0.019
ork+0.019
rily+0.019
adoes+0.018
Neg logit contributions
 least-0.017
oh-0.012
 oh-0.012
 say-0.012
 kidding-0.012
Token managed
Token ID5257
Feature activation-2.93e-01

Pos logit contributions
lights+0.008
les+0.008
rily+0.008
ork+0.008
adoes+0.008
Neg logit contributions
 least-0.008
oh-0.006
 say-0.006
 oh-0.006
 kidding-0.006
Token to
Token ID284
Feature activation+2.54e-01

Pos logit contributions
lights+0.013
les+0.012
rily+0.012
ork+0.012
adoes+0.012
Neg logit contributions
 least-0.011
oh-0.009
 say-0.009
 oh-0.008
 kidding-0.008
Token jump
Token ID4391
Feature activation+1.04e+00

Pos logit contributions
 least+0.009
oh+0.007
 oh+0.007
 say+0.007
 kidding+0.007
Neg logit contributions
les-0.011
lights-0.011
ork-0.010
rily-0.010
adoes-0.010
Token an
Token ID281
Feature activation+3.01e+00

Pos logit contributions
 least+0.047
oh+0.038
 oh+0.037
 kidding+0.036
 say+0.036
Neg logit contributions
les-0.038
lights-0.036
ork-0.035
rily-0.035
adoes-0.033
Token extra
Token ID3131
Feature activation-5.45e-02

Pos logit contributions
 am+0.117
oh+0.117
 least+0.114
 elbows+0.109
'm+0.107
Neg logit contributions
les-0.103
rily-0.097
lf-0.087
en-0.087
lights-0.083
Token long
Token ID890
Feature activation-5.23e-01

Pos logit contributions
les+0.002
lights+0.002
ork+0.002
rily+0.002
adoes+0.002
Neg logit contributions
 least-0.002
oh-0.002
 say-0.002
 oh-0.002
 kidding-0.002
Token,
Token ID11
Feature activation+9.42e-01

Pos logit contributions
lights+0.018
les+0.018
rily+0.017
adoes+0.017
ork+0.017
Neg logit contributions
 least-0.024
oh-0.020
 say-0.019
 kidding-0.018
 oh-0.018
Token far
Token ID1290
Feature activation-2.99e-02

Pos logit contributions
 least+0.035
oh+0.028
 ourselves+0.025
 say+0.025
 kidding+0.025
Neg logit contributions
les-0.039
lights-0.038
ork-0.037
rily-0.037
adoes-0.036
Token distance
Token ID5253
Feature activation+2.25e-01

Pos logit contributions
les+0.001
lights+0.001
ork+0.001
rily+0.001
adoes+0.001
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token"
Token ID1
Feature activation+1.81e-01

Pos logit contributions
 least+0.036
 kidding+0.029
 worst+0.028
oh+0.028
 oh+0.028
Neg logit contributions
les-0.019
ork-0.018
lights-0.018
rily-0.016
adoes-0.016
TokenThat
Token ID2504
Feature activation+9.50e-01

Pos logit contributions
 least+0.007
oh+0.006
 say+0.005
 kidding+0.005
 oh+0.005
Neg logit contributions
les-0.007
lights-0.007
ork-0.007
rily-0.007
adoes-0.007
Token's
Token ID338
Feature activation+1.17e+00

Pos logit contributions
 least+0.041
 kidding+0.033
 elbows+0.032
 Mir+0.031
 ourselves+0.031
Neg logit contributions
les-0.036
ork-0.034
adoes-0.032
lights-0.032
ened-0.030
Token a
Token ID257
Feature activation+6.46e-01

Pos logit contributions
 least+0.043
oh+0.036
 elbows+0.034
 Mir+0.031
 say+0.031
Neg logit contributions
adoes-0.044
ork-0.042
lights-0.042
lf-0.039
les-0.038
Token nice
Token ID3621
Feature activation-2.06e-01

Pos logit contributions
 least+0.025
oh+0.020
 say+0.019
 ourselves+0.019
 Mir+0.019
Neg logit contributions
ork-0.025
les-0.025
lights-0.024
adoes-0.023
rily-0.023
Token toy
Token ID13373
Feature activation+2.99e+00

Pos logit contributions
lights+0.008
les+0.008
rily+0.008
ork+0.007
adoes+0.007
Neg logit contributions
 least-0.009
oh-0.007
 say-0.007
 oh-0.007
 kidding-0.007
Token,"
Token ID553
Feature activation+1.78e+00

Pos logit contributions
 least+0.134
 elbows+0.115
 kidding+0.111
oh+0.105
 Mir+0.104
Neg logit contributions
les-0.099
lights-0.089
ork-0.087
adoes-0.087
bell-0.083
Token she
Token ID673
Feature activation+2.05e-01

Pos logit contributions
 least+0.089
oh+0.073
 kidding+0.068
 ourselves+0.064
 tallest+0.064
Neg logit contributions
lights-0.054
les-0.050
ork-0.046
bell-0.045
lf-0.045
Token said
Token ID531
Feature activation+2.10e-01

Pos logit contributions
 least+0.005
oh+0.004
 kidding+0.003
 say+0.003
 oh+0.003
Neg logit contributions
les-0.011
lights-0.011
ork-0.011
rily-0.011
adoes-0.011
Token.
Token ID13
Feature activation-1.85e-01

Pos logit contributions
 least+0.009
oh+0.007
 say+0.007
 kidding+0.007
 oh+0.007
Neg logit contributions
les-0.008
lights-0.008
rily-0.008
ork-0.008
adoes-0.007
Token "
Token ID366
Feature activation+3.24e-01

Pos logit contributions
les+0.008
lights+0.007
rily+0.007
ork+0.007
adoes+0.007
Neg logit contributions
 least-0.007
oh-0.006
 say-0.006
 oh-0.006
 kidding-0.005
Token je
Token ID11223
Feature activation-6.12e-01

Pos logit contributions
 least+0.054
oh+0.047
 oh+0.044
 kidding+0.044
 Mir+0.041
Neg logit contributions
ork-0.047
les-0.046
lights-0.046
adoes-0.044
 tant-0.043
Tokenep
Token ID538
Feature activation+6.80e-01

Pos logit contributions
les+0.022
lights+0.022
rily+0.022
ork+0.020
adoes+0.020
Neg logit contributions
 least-0.027
oh-0.022
 say-0.020
 oh-0.020
 am-0.020
Token is
Token ID318
Feature activation-6.10e-02

Pos logit contributions
 least+0.024
oh+0.018
 oh+0.018
 kidding+0.017
 say+0.016
Neg logit contributions
les-0.031
lights-0.030
ork-0.029
adoes-0.028
rily-0.028
Token not
Token ID407
Feature activation+6.92e-01

Pos logit contributions
les+0.003
lights+0.003
ork+0.003
rily+0.003
adoes+0.003
Neg logit contributions
 least-0.002
oh-0.002
 say-0.002
 oh-0.001
 kidding-0.001
Token a
Token ID257
Feature activation+3.75e-01

Pos logit contributions
 least+0.027
oh+0.021
 am+0.018
 say+0.018
 kidding+0.018
Neg logit contributions
les-0.029
lights-0.028
adoes-0.027
ork-0.027
rily-0.027
Token toy
Token ID13373
Feature activation+2.98e+00

Pos logit contributions
 least+0.018
oh+0.015
 say+0.014
 kidding+0.014
 oh+0.014
Neg logit contributions
les-0.012
ork-0.011
lights-0.011
rily-0.011
adoes-0.011
Token.
Token ID13
Feature activation+6.23e-01

Pos logit contributions
 least+0.151
 kidding+0.139
 elbows+0.135
 Fre+0.134
Ta+0.134
Neg logit contributions
les-0.073
ork-0.066
lights-0.063
bell-0.061
adoes-0.059
Token We
Token ID775
Feature activation-1.16e-01

Pos logit contributions
 least+0.030
oh+0.024
 say+0.023
 oh+0.023
 kidding+0.023
Neg logit contributions
lights-0.020
les-0.019
ork-0.019
adoes-0.019
rily-0.018
Token might
Token ID1244
Feature activation+3.54e-01

Pos logit contributions
les+0.005
lights+0.005
rily+0.005
ork+0.005
adoes+0.005
Neg logit contributions
 least-0.004
oh-0.003
 say-0.003
 oh-0.003
 kidding-0.003
Token break
Token ID2270
Feature activation+8.00e-01

Pos logit contributions
 least+0.014
oh+0.011
 oh+0.011
 kidding+0.011
 say+0.010
Neg logit contributions
les-0.014
ork-0.013
lights-0.013
adoes-0.013
rily-0.012
Token it
Token ID340
Feature activation+9.52e-01

Pos logit contributions
 least+0.036
oh+0.029
 kidding+0.028
 say+0.027
 oh+0.027
Neg logit contributions
les-0.030
lights-0.029
adoes-0.027
ork-0.027
rily-0.026
Token She
Token ID1375
Feature activation-1.86e-01

Pos logit contributions
 least+0.019
oh+0.014
 kidding+0.013
 oh+0.013
 say+0.013
Neg logit contributions
les-0.020
lights-0.020
ork-0.019
rily-0.019
adoes-0.019
Token wanted
Token ID2227
Feature activation+3.44e-01

Pos logit contributions
lights+0.008
les+0.008
rily+0.007
ork+0.007
adoes+0.007
Neg logit contributions
 least-0.007
oh-0.006
 say-0.005
 oh-0.005
 kidding-0.005
Token to
Token ID284
Feature activation-3.01e-01

Pos logit contributions
 least+0.015
oh+0.012
 kidding+0.012
 say+0.012
 oh+0.011
Neg logit contributions
les-0.012
lights-0.012
ork-0.012
adoes-0.011
rily-0.011
Token make
Token ID787
Feature activation+6.65e-01

Pos logit contributions
les+0.014
lights+0.014
rily+0.014
ork+0.014
adoes+0.013
Neg logit contributions
 least-0.010
 say-0.008
oh-0.008
 am-0.007
 oh-0.007
Token a
Token ID257
Feature activation+6.02e-01

Pos logit contributions
 least+0.024
oh+0.019
 am+0.017
 kidding+0.017
 say+0.016
Neg logit contributions
les-0.030
lights-0.029
ork-0.028
rily-0.028
adoes-0.028
Token toy
Token ID13373
Feature activation+2.97e+00

Pos logit contributions
 least+0.028
oh+0.023
 kidding+0.021
 oh+0.021
 ourselves+0.021
Neg logit contributions
les-0.020
ork-0.020
lights-0.020
rily-0.019
adoes-0.018
Token missile
Token ID10105
Feature activation+5.52e-01

Pos logit contributions
 least+0.129
oh+0.123
 elbows+0.110
 kidding+0.109
Ta+0.106
Neg logit contributions
lights-0.091
les-0.088
ork-0.087
adoes-0.085
rily-0.084
Token.
Token ID13
Feature activation+5.51e-01

Pos logit contributions
 least+0.022
oh+0.018
 kidding+0.017
 oh+0.017
 say+0.016
Neg logit contributions
les-0.022
lights-0.021
ork-0.020
rily-0.020
adoes-0.020
Token She
Token ID1375
Feature activation-6.64e-01

Pos logit contributions
 least+0.023
 kidding+0.017
oh+0.017
 worst+0.016
 oh+0.016
Neg logit contributions
les-0.022
lights-0.021
ork-0.021
rily-0.020
adoes-0.020
Token asked
Token ID1965
Feature activation+6.10e-01

Pos logit contributions
lights+0.029
les+0.028
rily+0.028
adoes+0.028
ork+0.027
Neg logit contributions
 least-0.024
oh-0.020
 say-0.019
 oh-0.018
 Mir-0.017
Token her
Token ID607
Feature activation-1.47e-01

Pos logit contributions
 least+0.025
oh+0.019
 kidding+0.017
 am+0.017
 say+0.017
Neg logit contributions
les-0.025
lights-0.024
adoes-0.024
ork-0.023
rily-0.023
Token lost
Token ID2626
Feature activation+1.73e-01

Pos logit contributions
lights+0.055
les+0.054
ork+0.053
rily+0.050
 tant+0.049
Neg logit contributions
 least-0.032
 say-0.023
 oh-0.022
 kidding-0.022
 saying-0.020
Token or
Token ID393
Feature activation-1.62e-01

Pos logit contributions
 least+0.007
oh+0.006
 say+0.005
 kidding+0.005
 oh+0.005
Neg logit contributions
les-0.007
lights-0.007
ork-0.006
rily-0.006
adoes-0.006
Token sick
Token ID6639
Feature activation-1.68e-02

Pos logit contributions
les+0.007
lights+0.007
ork+0.007
rily+0.007
adoes+0.007
Neg logit contributions
 least-0.006
oh-0.004
 say-0.004
 oh-0.004
 kidding-0.004
Token or
Token ID393
Feature activation-5.77e-01

Pos logit contributions
les+0.001
lights+0.001
ork+0.001
rily+0.001
adoes+0.001
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.000
 kidding-0.000
Token hurt
Token ID5938
Feature activation+2.04e-01

Pos logit contributions
les+0.027
lights+0.027
ork+0.026
rily+0.026
adoes+0.025
Neg logit contributions
 least-0.020
 say-0.015
oh-0.015
 oh-0.014
 kidding-0.014
Token,"
Token ID553
Feature activation+2.96e+00

Pos logit contributions
 least+0.007
oh+0.006
 say+0.005
 kidding+0.005
 oh+0.005
Neg logit contributions
les-0.009
lights-0.009
ork-0.009
rily-0.009
adoes-0.008
Token mom
Token ID1995
Feature activation+8.57e-01

Pos logit contributions
 least+0.150
oh+0.138
 longest+0.135
 tallest+0.133
 fastest+0.131
Neg logit contributions
lf-0.049
 novels-0.041
bell-0.039
lights-0.038
ork-0.036
Token said
Token ID531
Feature activation+9.33e-01

Pos logit contributions
 least+0.027
 ourselves+0.019
oh+0.018
 kidding+0.018
 am+0.018
Neg logit contributions
les-0.042
lights-0.040
ork-0.039
rily-0.038
bell-0.038
Token.
Token ID13
Feature activation+1.14e+00

Pos logit contributions
 least+0.043
oh+0.034
 am+0.031
 kidding+0.031
 say+0.030
Neg logit contributions
les-0.033
lights-0.033
rily-0.032
ork-0.030
adoes-0.030
Token
Token ID198
Feature activation+1.40e+00

Pos logit contributions
 least+0.058
oh+0.045
 kidding+0.045
 wait+0.043
 ourselves+0.043
Neg logit contributions
ork-0.034
lights-0.034
les-0.032
adoes-0.032
rily-0.029
Token
Token ID198
Feature activation+1.25e+00

Pos logit contributions
 least+0.072
 kidding+0.059
 wait+0.057
 oh+0.056
oh+0.056
Neg logit contributions
ork-0.043
lights-0.042
les-0.042
adoes-0.036
rily-0.035
Token,
Token ID11
Feature activation-1.26e+00

Pos logit contributions
les+0.002
lights+0.002
ork+0.002
rily+0.002
adoes+0.002
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token it
Token ID340
Feature activation+3.79e-01

Pos logit contributions
les+0.061
ork+0.057
lights+0.057
rily+0.055
 tant+0.052
Neg logit contributions
 least-0.041
 oh-0.038
 say-0.037
oh-0.032
 wait-0.031
Token's
Token ID338
Feature activation+1.09e+00

Pos logit contributions
 least+0.014
oh+0.010
 oh+0.010
 kidding+0.010
 say+0.010
Neg logit contributions
les-0.017
lights-0.016
ork-0.016
adoes-0.015
rily-0.015
Token my
Token ID616
Feature activation+8.73e-01

Pos logit contributions
 least+0.041
oh+0.033
 elbows+0.030
 kidding+0.029
 Mir+0.029
Neg logit contributions
rily-0.043
lights-0.042
les-0.042
ork-0.042
adoes-0.041
Token favorite
Token ID4004
Feature activation-1.70e-01

Pos logit contributions
 least+0.040
oh+0.034
 Mir+0.031
 kidding+0.031
 elbows+0.031
Neg logit contributions
les-0.030
ork-0.028
lights-0.027
rily-0.027
adoes-0.026
Token toy
Token ID13373
Feature activation+2.95e+00

Pos logit contributions
les+0.006
lights+0.006
ork+0.005
rily+0.005
adoes+0.005
Neg logit contributions
 least-0.008
oh-0.007
 say-0.006
 oh-0.006
 kidding-0.006
Token!"
Token ID2474
Feature activation-2.86e-01

Pos logit contributions
 least+0.140
 kidding+0.128
 elbows+0.122
 Mir+0.117
 Mil+0.114
Neg logit contributions
les-0.095
ork-0.082
lights-0.081
adoes-0.079
rily-0.074
Token Lily
Token ID20037
Feature activation+1.29e+00

Pos logit contributions
les+0.011
lights+0.011
rily+0.010
ork+0.010
adoes+0.010
Neg logit contributions
 least-0.012
 say-0.010
oh-0.010
 oh-0.010
 am-0.009
Token replied
Token ID8712
Feature activation-6.07e-01

Pos logit contributions
 least+0.050
 worst+0.037
 kidding+0.036
 ourselves+0.036
oh+0.035
Neg logit contributions
les-0.054
ork-0.050
rily-0.049
ened-0.048
lf-0.048
Token.
Token ID13
Feature activation-2.52e-02

Pos logit contributions
les+0.028
lights+0.027
ork+0.027
adoes+0.026
rily+0.025
Neg logit contributions
 least-0.020
 say-0.016
oh-0.016
 oh-0.016
 kidding-0.015
Token
Token ID220
Feature activation+3.66e-01

Pos logit contributions
les+0.001
lights+0.001
ork+0.001
rily+0.001
adoes+0.001
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token.
Token ID13
Feature activation+4.63e-01

Pos logit contributions
les+0.001
lights+0.001
ork+0.001
rily+0.001
adoes+0.001
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token<|endoftext|>
Token ID50256
Feature activation-3.44e-01

Pos logit contributions
 least+0.019
oh+0.015
 oh+0.015
 say+0.015
 kidding+0.015
Neg logit contributions
les-0.018
lights-0.017
ork-0.017
rily-0.016
adoes-0.016
TokenJohn
Token ID7554
Feature activation+1.02e-01

Pos logit contributions
lights+0.016
les+0.016
rily+0.016
adoes+0.015
ork+0.015
Neg logit contributions
 least-0.011
oh-0.009
 say-0.008
 am-0.008
 oh-0.008
Token wanted
Token ID2227
Feature activation-4.64e-04

Pos logit contributions
 least+0.004
oh+0.003
 say+0.003
 oh+0.003
 kidding+0.003
Neg logit contributions
les-0.004
lights-0.004
ork-0.004
rily-0.004
adoes-0.004
Token a
Token ID257
Feature activation-6.68e-02

Pos logit contributions
les+0.000
lights+0.000
ork+0.000
rily+0.000
adoes+0.000
Neg logit contributions
 least-0.000
oh-0.000
 say-0.000
 oh-0.000
 kidding-0.000
Token toy
Token ID13373
Feature activation+2.94e+00

Pos logit contributions
les+0.002
lights+0.002
rily+0.002
ork+0.002
adoes+0.002
Neg logit contributions
 least-0.003
oh-0.002
 say-0.002
 oh-0.002
 kidding-0.002
Token truck
Token ID7779
Feature activation+1.36e+00

Pos logit contributions
 least+0.126
oh+0.125
 elbows+0.117
 kidding+0.115
Ta+0.114
Neg logit contributions
ork-0.079
lights-0.077
les-0.077
rily-0.075
 soldier-0.074
Token.
Token ID13
Feature activation+4.91e-02

Pos logit contributions
 least+0.061
oh+0.051
 elbows+0.050
 kidding+0.049
 Mir+0.048
Neg logit contributions
les-0.043
lights-0.043
ork-0.041
rily-0.040
bell-0.039
Token His
Token ID2399
Feature activation+2.68e-01

Pos logit contributions
 least+0.002
oh+0.002
 say+0.002
 oh+0.002
 kidding+0.002
Neg logit contributions
les-0.002
lights-0.002
ork-0.002
rily-0.002
adoes-0.001
Token mom
Token ID1995
Feature activation+9.31e-02

Pos logit contributions
 least+0.012
oh+0.010
 say+0.009
 kidding+0.009
 oh+0.009
Neg logit contributions
lights-0.009
les-0.009
ork-0.009
rily-0.009
adoes-0.009
Token said
Token ID531
Feature activation+6.83e-01

Pos logit contributions
 least+0.003
oh+0.002
 say+0.002
 oh+0.002
 kidding+0.002
Neg logit contributions
les-0.004
lights-0.004
ork-0.004
rily-0.004
adoes-0.004
Token<|endoftext|>
Token ID50256
Feature activation-5.25e-01

Pos logit contributions
les+0.032
lights+0.031
rily+0.030
adoes+0.030
ork+0.029
Neg logit contributions
 least-0.021
oh-0.015
 say-0.014
 am-0.014
 oh-0.014
TokenMark
Token ID9704
Feature activation-2.72e-01

Pos logit contributions
lights+0.028
les+0.027
rily+0.027
adoes+0.027
ork+0.026
Neg logit contributions
 least-0.014
oh-0.011
 say-0.009
 am-0.009
 Mir-0.008
Token wanted
Token ID2227
Feature activation-1.19e-01

Pos logit contributions
lights+0.011
les+0.011
ork+0.010
rily+0.010
adoes+0.010
Neg logit contributions
 least-0.011
oh-0.009
 say-0.008
 kidding-0.008
 oh-0.008
Token a
Token ID257
Feature activation-3.89e-01

Pos logit contributions
les+0.005
lights+0.005
rily+0.005
ork+0.005
adoes+0.005
Neg logit contributions
 least-0.005
oh-0.004
 say-0.004
 oh-0.003
 kidding-0.003
Token new
Token ID649
Feature activation+6.07e-01

Pos logit contributions
les+0.014
lights+0.014
rily+0.013
adoes+0.013
ork+0.013
Neg logit contributions
 least-0.018
oh-0.014
 say-0.014
 oh-0.014
 kidding-0.013
Token toy
Token ID13373
Feature activation+2.93e+00

Pos logit contributions
 least+0.031
oh+0.026
 kidding+0.024
 ourselves+0.024
 say+0.024
Neg logit contributions
les-0.017
lights-0.017
ork-0.017
rily-0.016
adoes-0.016
Token but
Token ID475
Feature activation-3.26e-01

Pos logit contributions
 least+0.129
oh+0.126
 elbows+0.117
 kidding+0.115
Ta+0.112
Neg logit contributions
lights-0.080
les-0.079
ork-0.079
rily-0.074
bell-0.074
Token couldn
Token ID3521
Feature activation-6.23e-01

Pos logit contributions
les+0.014
lights+0.014
rily+0.013
ork+0.013
adoes+0.013
Neg logit contributions
 least-0.013
oh-0.010
 say-0.010
 oh-0.009
 kidding-0.009
Tokenâ
Token ID22940
Feature activation+1.61e+00

Pos logit contributions
les+0.026
lights+0.026
rily+0.025
adoes+0.024
ork+0.023
Neg logit contributions
 least-0.025
oh-0.020
 say-0.019
 am-0.019
 oh-0.018
Token
Token ID26391
Feature activation-5.70e-01

Pos logit contributions
 least+0.049
 oh+0.042
 Mir+0.039
oh+0.037
 ourselves+0.036
Neg logit contributions
les-0.071
ork-0.070
bell-0.064
lights-0.063
lf-0.063
Token
Token ID8151
Feature activation-6.12e-01

Pos logit contributions
les+0.025
lights+0.024
rily+0.024
adoes+0.023
ork+0.022
Neg logit contributions
 least-0.022
oh-0.017
 say-0.016
 am-0.016
 oh-0.015
Token him
Token ID683
Feature activation-8.00e-01

Pos logit contributions
 least+0.003
oh+0.002
 say+0.002
 oh+0.002
 kidding+0.002
Neg logit contributions
les-0.003
lights-0.003
ork-0.003
rily-0.003
adoes-0.003
Token it
Token ID340
Feature activation+5.61e-01

Pos logit contributions
les+0.038
lights+0.038
rily+0.037
adoes+0.035
ork+0.035
Neg logit contributions
 least-0.026
 say-0.019
oh-0.019
 oh-0.019
 am-0.018
Token was
Token ID373
Feature activation+1.14e-02

Pos logit contributions
 least+0.024
oh+0.018
 kidding+0.018
 say+0.018
 oh+0.017
Neg logit contributions
les-0.022
ork-0.021
lights-0.021
adoes-0.020
rily-0.020
Token not
Token ID407
Feature activation+3.93e-02

Pos logit contributions
 least+0.000
oh+0.000
 say+0.000
 oh+0.000
 kidding+0.000
Neg logit contributions
les-0.000
lights-0.000
ork-0.000
rily-0.000
adoes-0.000
Token a
Token ID257
Feature activation+1.07e+00

Pos logit contributions
 least+0.001
oh+0.001
 say+0.001
 oh+0.001
 kidding+0.001
Neg logit contributions
les-0.002
lights-0.002
ork-0.002
rily-0.002
adoes-0.002
Token toy
Token ID13373
Feature activation+2.93e+00

Pos logit contributions
 least+0.045
oh+0.035
 ourselves+0.033
 elbows+0.033
 kidding+0.033
Neg logit contributions
ork-0.041
les-0.040
lights-0.037
adoes-0.036
 tant-0.036
Token.
Token ID13
Feature activation+6.03e-01

Pos logit contributions
Ta+0.143
 least+0.143
 elbows+0.135
oh+0.134
 kidding+0.129
Neg logit contributions
les-0.072
lights-0.062
ork-0.057
adoes-0.053
 anymore-0.050
Token But
Token ID887
Feature activation+9.06e-01

Pos logit contributions
 least+0.027
oh+0.021
 kidding+0.020
 say+0.020
 worst+0.019
Neg logit contributions
lights-0.022
les-0.022
adoes-0.021
ork-0.021
rily-0.020
Token Tim
Token ID5045
Feature activation+3.27e-01

Pos logit contributions
 least+0.041
oh+0.036
 am+0.032
 say+0.031
 kidding+0.031
Neg logit contributions
lights-0.028
adoes-0.027
ened-0.026
les-0.026
lf-0.025
Token did
Token ID750
Feature activation-4.34e-01

Pos logit contributions
 least+0.013
oh+0.010
 oh+0.010
 kidding+0.010
 say+0.010
Neg logit contributions
les-0.013
ork-0.012
lights-0.012
rily-0.012
adoes-0.012
Token not
Token ID407
Feature activation-5.27e-01

Pos logit contributions
lights+0.021
les+0.021
rily+0.021
ork+0.020
adoes+0.019
Neg logit contributions
 least-0.014
 say-0.011
oh-0.011
 oh-0.010
 kidding-0.010
Token mom
Token ID1995
Feature activation-2.61e-01

Pos logit contributions
 least+0.006
oh+0.005
 say+0.005
 kidding+0.005
 oh+0.005
Neg logit contributions
les-0.005
lights-0.005
ork-0.005
rily-0.004
adoes-0.004
Token gave
Token ID2921
Feature activation+1.16e-01

Pos logit contributions
lights+0.011
les+0.011
ork+0.011
rily+0.011
adoes+0.010
Neg logit contributions
 least-0.010
oh-0.008
 say-0.008
 oh-0.007
 kidding-0.007
Token her
Token ID607
Feature activation+4.73e-01

Pos logit contributions
 least+0.004
oh+0.003
 say+0.003
 kidding+0.003
 oh+0.003
Neg logit contributions
les-0.006
lights-0.005
ork-0.005
rily-0.005
adoes-0.005
Token a
Token ID257
Feature activation+7.28e-01

Pos logit contributions
 least+0.014
oh+0.011
 kidding+0.010
 say+0.009
 oh+0.009
Neg logit contributions
les-0.024
lights-0.023
ork-0.022
rily-0.022
adoes-0.022
Token new
Token ID649
Feature activation+1.51e+00

Pos logit contributions
 least+0.034
oh+0.028
 ourselves+0.026
 kidding+0.026
 oh+0.026
Neg logit contributions
les-0.025
ork-0.024
lights-0.023
rily-0.022
 tant-0.022
Token toy
Token ID13373
Feature activation+2.92e+00

Pos logit contributions
 least+0.073
oh+0.063
 ourselves+0.058
 elbows+0.057
 kidding+0.056
Neg logit contributions
les-0.047
ork-0.046
lights-0.043
rily-0.041
 tant-0.039
Token.
Token ID13
Feature activation-6.38e-01

Pos logit contributions
 least+0.136
oh+0.125
 kidding+0.121
 elbows+0.118
 ourselves+0.112
Neg logit contributions
les-0.083
ork-0.079
lights-0.078
rily-0.073
bell-0.070
Token It
Token ID632
Feature activation+1.09e+00

Pos logit contributions
les+0.027
lights+0.026
rily+0.025
adoes+0.024
ork+0.024
Neg logit contributions
 least-0.025
oh-0.020
 say-0.020
 oh-0.019
 Mir-0.019
Token was
Token ID373
Feature activation+1.27e+00

Pos logit contributions
 least+0.053
 kidding+0.045
 ourselves+0.042
 elbows+0.042
oh+0.041
Neg logit contributions
les-0.036
lights-0.033
ork-0.032
rily-0.031
ened-0.031
Token a
Token ID257
Feature activation+8.55e-01

Pos logit contributions
 least+0.048
oh+0.040
 kidding+0.034
 ourselves+0.034
 wait+0.033
Neg logit contributions
lights-0.053
rily-0.051
ork-0.050
les-0.050
adoes-0.048
Token little
Token ID1310
Feature activation+1.89e+00

Pos logit contributions
 least+0.039
oh+0.031
 ourselves+0.030
 kidding+0.030
 say+0.029
Neg logit contributions
ork-0.029
les-0.028
lights-0.028
rily-0.027
pron-0.025
Token and
Token ID290
Feature activation-3.97e-01

Pos logit contributions
 least+0.024
oh+0.018
 oh+0.017
 kidding+0.017
 say+0.017
Neg logit contributions
les-0.026
lights-0.025
ork-0.024
rily-0.024
adoes-0.024
Token get
Token ID651
Feature activation+5.86e-01

Pos logit contributions
les+0.019
lights+0.019
ork+0.018
rily+0.018
adoes+0.017
Neg logit contributions
 least-0.014
 say-0.010
oh-0.010
 oh-0.009
 kidding-0.009
Token to
Token ID284
Feature activation+5.74e-01

Pos logit contributions
 least+0.021
oh+0.017
 kidding+0.015
 say+0.015
 am+0.015
Neg logit contributions
les-0.026
lights-0.025
rily-0.024
ork-0.024
adoes-0.024
Token the
Token ID262
Feature activation+1.79e+00

Pos logit contributions
 least+0.020
oh+0.016
 oh+0.014
 kidding+0.014
 am+0.013
Neg logit contributions
les-0.025
ork-0.024
lights-0.024
rily-0.024
adoes-0.024
Token water
Token ID1660
Feature activation+4.00e-01

Pos logit contributions
 least+0.070
oh+0.061
 kidding+0.061
 elbows+0.061
 oh+0.058
Neg logit contributions
ork-0.057
les-0.056
rily-0.051
lights-0.048
adoes-0.046
Token."
Token ID526
Feature activation+2.89e+00

Pos logit contributions
 least+0.016
oh+0.012
 oh+0.012
 kidding+0.012
 say+0.012
Neg logit contributions
les-0.016
lights-0.016
rily-0.015
ork-0.015
adoes-0.015
Token Sam
Token ID3409
Feature activation+7.98e-01

Pos logit contributions
 least+0.143
oh+0.125
 longest+0.124
 tallest+0.124
 worst+0.124
Neg logit contributions
ork-0.066
 One-0.052
 All-0.050
les-0.048
bell-0.047
Token whispered
Token ID25029
Feature activation+3.74e-01

Pos logit contributions
 least+0.026
oh+0.018
 ourselves+0.018
 kidding+0.017
 am+0.017
Neg logit contributions
les-0.040
lights-0.037
rily-0.036
ork-0.036
adoes-0.035
Token,
Token ID11
Feature activation-8.71e-01

Pos logit contributions
 least+0.015
oh+0.011
 say+0.010
 kidding+0.010
 am+0.010
Neg logit contributions
les-0.016
lights-0.016
rily-0.015
ork-0.015
adoes-0.015
Token trying
Token ID2111
Feature activation+4.58e-02

Pos logit contributions
les+0.044
lights+0.043
ork+0.043
adoes+0.040
rily+0.039
Neg logit contributions
 least-0.027
 say-0.021
 oh-0.020
oh-0.020
 kidding-0.019
Token to
Token ID284
Feature activation-2.29e-01

Pos logit contributions
 least+0.002
oh+0.001
 say+0.001
 oh+0.001
 kidding+0.001
Neg logit contributions
les-0.002
lights-0.002
ork-0.002
rily-0.002
adoes-0.002
Token the
Token ID262
Feature activation+4.77e-01

Pos logit contributions
les+0.019
lights+0.019
ork+0.019
rily+0.018
adoes+0.018
Neg logit contributions
 least-0.013
oh-0.010
 say-0.009
 Mir-0.009
 oh-0.009
Token door
Token ID3420
Feature activation+5.54e-01

Pos logit contributions
 least+0.024
oh+0.020
 oh+0.019
 kidding+0.019
 say+0.019
Neg logit contributions
les-0.014
ork-0.014
lights-0.013
rily-0.012
adoes-0.012
Token and
Token ID290
Feature activation+9.52e-02

Pos logit contributions
 least+0.027
oh+0.022
 kidding+0.022
 oh+0.021
 say+0.021
Neg logit contributions
lights-0.018
les-0.018
ork-0.017
rily-0.016
adoes-0.015
Token there
Token ID612
Feature activation+9.46e-01

Pos logit contributions
 least+0.004
oh+0.003
 say+0.003
 kidding+0.003
 oh+0.003
Neg logit contributions
les-0.004
lights-0.004
ork-0.004
rily-0.004
adoes-0.004
Token was
Token ID373
Feature activation+9.37e-01

Pos logit contributions
 least+0.044
 kidding+0.033
oh+0.032
 ourselves+0.031
 say+0.031
Neg logit contributions
les-0.035
ork-0.034
lights-0.032
rily-0.032
ened-0.030
Token an
Token ID281
Feature activation+2.89e+00

Pos logit contributions
 least+0.032
oh+0.025
 say+0.023
 kidding+0.022
 ourselves+0.022
Neg logit contributions
lights-0.043
ork-0.041
adoes-0.041
les-0.041
rily-0.040
Token old
Token ID1468
Feature activation+9.65e-01

Pos logit contributions
 least+0.133
oh+0.129
 am+0.122
'm+0.115
 ourselves+0.114
Neg logit contributions
les-0.085
rily-0.084
lf-0.083
adoes-0.076
lights-0.072
Token lady
Token ID10846
Feature activation+4.56e-01

Pos logit contributions
 least+0.048
oh+0.039
 ourselves+0.037
 kidding+0.035
 oh+0.034
Neg logit contributions
les-0.031
ork-0.029
lights-0.029
adoes-0.027
rily-0.027
Token.
Token ID13
Feature activation-8.28e-02

Pos logit contributions
 least+0.019
oh+0.014
 kidding+0.014
 oh+0.014
 say+0.014
Neg logit contributions
les-0.018
lights-0.017
ork-0.017
rily-0.017
adoes-0.016
Token
Token ID198
Feature activation+2.33e-01

Pos logit contributions
les+0.004
lights+0.004
ork+0.004
rily+0.004
adoes+0.004
Neg logit contributions
 least-0.003
oh-0.002
 say-0.002
 oh-0.002
 kidding-0.002
Token
Token ID198
Feature activation+1.19e-01

Pos logit contributions
 least+0.009
oh+0.007
 kidding+0.007
 oh+0.007
 say+0.007
Neg logit contributions
les-0.010
lights-0.009
ork-0.009
rily-0.009
adoes-0.009
Token.
Token ID13
Feature activation-3.81e-01

Pos logit contributions
 least+0.004
oh+0.003
 say+0.003
 kidding+0.003
 oh+0.003
Neg logit contributions
les-0.003
lights-0.003
ork-0.003
rily-0.003
adoes-0.003
Token He
Token ID679
Feature activation+8.02e-02

Pos logit contributions
les+0.017
lights+0.016
rily+0.015
ork+0.015
adoes+0.015
Neg logit contributions
 least-0.014
oh-0.011
 say-0.011
 oh-0.010
 Mir-0.010
Token requested
Token ID9167
Feature activation-1.12e-01

Pos logit contributions
 least+0.003
oh+0.002
 say+0.002
 kidding+0.002
 oh+0.002
Neg logit contributions
les-0.003
lights-0.003
ork-0.003
rily-0.003
adoes-0.003
Token a
Token ID257
Feature activation-1.61e-02

Pos logit contributions
les+0.005
lights+0.005
rily+0.005
ork+0.005
adoes+0.004
Neg logit contributions
 least-0.004
oh-0.003
 say-0.003
 oh-0.003
 kidding-0.003
Token soft
Token ID2705
Feature activation-5.78e-01

Pos logit contributions
les+0.001
lights+0.001
ork+0.001
rily+0.001
adoes+0.001
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token toy
Token ID13373
Feature activation+2.87e+00

Pos logit contributions
lights+0.023
les+0.022
rily+0.021
ork+0.021
adoes+0.020
Neg logit contributions
 least-0.024
 say-0.019
oh-0.019
 oh-0.018
 kidding-0.018
Token,
Token ID11
Feature activation+2.11e-01

Pos logit contributions
 least+0.140
oh+0.131
 kidding+0.122
 elbows+0.121
 ourselves+0.115
Neg logit contributions
ork-0.074
les-0.072
adoes-0.068
 soldier-0.067
bell-0.065
Token but
Token ID475
Feature activation+6.75e-02

Pos logit contributions
 least+0.007
oh+0.005
 say+0.005
 kidding+0.004
 oh+0.004
Neg logit contributions
les-0.010
lights-0.010
ork-0.010
rily-0.010
adoes-0.009
Token the
Token ID262
Feature activation-6.31e-03

Pos logit contributions
 least+0.003
oh+0.002
 say+0.002
 kidding+0.002
 oh+0.002
Neg logit contributions
les-0.003
lights-0.003
ork-0.002
rily-0.002
adoes-0.002
Token shop
Token ID6128
Feature activation-1.50e-02

Pos logit contributions
les+0.000
lights+0.000
ork+0.000
rily+0.000
adoes+0.000
Neg logit contributions
 least-0.000
oh-0.000
 say-0.000
 oh-0.000
 kidding-0.000
Tokenkeeper
Token ID13884
Feature activation-1.55e-01

Pos logit contributions
les+0.000
lights+0.000
ork+0.000
rily+0.000
adoes+0.000
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token usual
Token ID6678
Feature activation-5.09e-01

Pos logit contributions
 least+0.055
oh+0.046
 kidding+0.044
 oh+0.044
 say+0.043
Neg logit contributions
les-0.041
lights-0.039
ork-0.038
adoes-0.036
rily-0.034
Token area
Token ID1989
Feature activation+5.90e-01

Pos logit contributions
les+0.020
lights+0.020
rily+0.019
ork+0.019
adoes+0.018
Neg logit contributions
 least-0.021
oh-0.017
 say-0.016
 oh-0.015
 kidding-0.015
Token.
Token ID13
Feature activation-4.22e-01

Pos logit contributions
 least+0.026
oh+0.021
 kidding+0.020
 say+0.020
 oh+0.019
Neg logit contributions
les-0.022
lights-0.022
rily-0.020
ork-0.020
adoes-0.020
Token She
Token ID1375
Feature activation+4.50e-02

Pos logit contributions
les+0.019
lights+0.019
ork+0.018
rily+0.018
adoes+0.018
Neg logit contributions
 least-0.015
oh-0.012
 say-0.011
 oh-0.010
 Mir-0.010
Token reached
Token ID4251
Feature activation+1.28e+00

Pos logit contributions
 least+0.002
oh+0.001
 say+0.001
 oh+0.001
 kidding+0.001
Neg logit contributions
les-0.002
lights-0.002
ork-0.002
rily-0.002
adoes-0.002
Token an
Token ID281
Feature activation+2.87e+00

Pos logit contributions
 least+0.055
oh+0.049
 oh+0.044
 kidding+0.043
 worst+0.042
Neg logit contributions
les-0.047
lights-0.046
rily-0.044
ork-0.042
ened-0.041
Token expensive
Token ID5789
Feature activation+6.36e-01

Pos logit contributions
oh+0.126
 least+0.125
 am+0.123
'm+0.116
 ourselves+0.114
Neg logit contributions
les-0.085
rily-0.081
bell-0.079
ened-0.075
lights-0.075
Token part
Token ID636
Feature activation+7.65e-01

Pos logit contributions
 least+0.029
oh+0.023
 say+0.022
 kidding+0.022
 oh+0.022
Neg logit contributions
les-0.023
lights-0.022
ork-0.021
rily-0.021
adoes-0.020
Token of
Token ID286
Feature activation+1.57e-01

Pos logit contributions
 least+0.041
oh+0.034
 kidding+0.033
 oh+0.033
 say+0.033
Neg logit contributions
les-0.021
lights-0.021
ork-0.019
rily-0.019
adoes-0.018
Token town
Token ID3240
Feature activation+5.71e-01

Pos logit contributions
 least+0.005
oh+0.004
 say+0.004
 oh+0.004
 kidding+0.004
Neg logit contributions
les-0.007
lights-0.007
ork-0.007
rily-0.007
adoes-0.007
Token and
Token ID290
Feature activation-6.56e-02

Pos logit contributions
 least+0.024
oh+0.019
 say+0.018
 kidding+0.018
 oh+0.018
Neg logit contributions
les-0.022
lights-0.022
ork-0.021
rily-0.021
adoes-0.020
Token thinks
Token ID6834
Feature activation-1.94e-01

Pos logit contributions
 least+0.034
oh+0.028
 oh+0.027
 kidding+0.026
 ourselves+0.026
Neg logit contributions
les-0.031
lights-0.028
ork-0.027
rily-0.026
ened-0.026
Token it
Token ID340
Feature activation+8.35e-01

Pos logit contributions
les+0.007
lights+0.007
ork+0.006
rily+0.006
adoes+0.006
Neg logit contributions
 least-0.009
oh-0.007
 say-0.007
 oh-0.007
 kidding-0.007
Token is
Token ID318
Feature activation+1.32e+00

Pos logit contributions
 least+0.035
 kidding+0.028
 oh+0.027
oh+0.027
 say+0.026
Neg logit contributions
les-0.033
lights-0.031
ork-0.030
adoes-0.029
rily-0.028
Token a
Token ID257
Feature activation+1.14e+00

Pos logit contributions
 least+0.051
oh+0.046
 elbows+0.038
 kidding+0.037
 oh+0.036
Neg logit contributions
lights-0.053
les-0.050
rily-0.048
adoes-0.048
ork-0.046
Token shiny
Token ID22441
Feature activation+3.65e-02

Pos logit contributions
 least+0.048
oh+0.039
 kidding+0.037
 ourselves+0.037
 elbows+0.036
Neg logit contributions
ork-0.041
les-0.039
lights-0.039
pron-0.038
 tant-0.037
Token toy
Token ID13373
Feature activation+2.86e+00

Pos logit contributions
 least+0.002
oh+0.001
 say+0.001
 oh+0.001
 kidding+0.001
Neg logit contributions
les-0.001
lights-0.001
ork-0.001
rily-0.001
adoes-0.001
Token that
Token ID326
Feature activation-2.59e-01

Pos logit contributions
 least+0.138
oh+0.128
 kidding+0.125
 elbows+0.124
 ourselves+0.116
Neg logit contributions
les-0.078
lights-0.075
ork-0.068
rily-0.065
bell-0.062
Token makes
Token ID1838
Feature activation+1.67e-01

Pos logit contributions
les+0.011
lights+0.011
rily+0.010
ork+0.010
adoes+0.010
Neg logit contributions
 least-0.010
oh-0.008
 say-0.008
 oh-0.007
 am-0.007
Token his
Token ID465
Feature activation+8.30e-01

Pos logit contributions
 least+0.007
oh+0.005
 say+0.005
 kidding+0.005
 oh+0.005
Neg logit contributions
les-0.007
lights-0.007
ork-0.006
rily-0.006
adoes-0.006
Token dad
Token ID9955
Feature activation+1.18e-01

Pos logit contributions
 least+0.038
oh+0.033
 kidding+0.030
 oh+0.029
 Mir+0.029
Neg logit contributions
les-0.028
lights-0.028
rily-0.026
ork-0.026
 tant-0.026
Token's
Token ID338
Feature activation+3.60e-01

Pos logit contributions
 least+0.004
oh+0.003
 say+0.003
 oh+0.003
 kidding+0.003
Neg logit contributions
les-0.005
lights-0.005
ork-0.005
rily-0.005
adoes-0.005
Token not
Token ID407
Feature activation-4.77e-01

Pos logit contributions
les+0.014
lights+0.014
rily+0.013
ork+0.013
adoes+0.013
Neg logit contributions
 least-0.010
oh-0.008
 say-0.007
 oh-0.007
 kidding-0.007
Token played
Token ID2826
Feature activation-1.24e+00

Pos logit contributions
lights+0.022
les+0.022
rily+0.021
ork+0.021
adoes+0.020
Neg logit contributions
 least-0.017
 say-0.013
oh-0.012
 oh-0.012
 kidding-0.012
Token with
Token ID351
Feature activation-1.17e+00

Pos logit contributions
lights+0.054
les+0.054
rily+0.049
ened+0.048
ork+0.047
Neg logit contributions
 least-0.042
 oh-0.036
oh-0.035
 very-0.034
 am-0.034
Token for
Token ID329
Feature activation-2.05e-01

Pos logit contributions
lights+0.052
les+0.052
rily+0.051
ork+0.049
adoes+0.048
Neg logit contributions
 least-0.040
oh-0.030
 say-0.029
 oh-0.029
 ourselves-0.029
Token a
Token ID257
Feature activation-2.05e+00

Pos logit contributions
les+0.010
lights+0.010
rily+0.010
ork+0.010
adoes+0.010
Neg logit contributions
 least-0.006
oh-0.004
 say-0.004
 oh-0.004
 kidding-0.004
Token long
Token ID890
Feature activation-3.81e+00

Pos logit contributions
lights+0.087
adoes+0.087
les+0.081
rily+0.079
ened+0.076
Neg logit contributions
 least-0.077
 very-0.066
oh-0.064
 wait-0.064
 say-0.062
Token time
Token ID640
Feature activation-2.98e-01

Pos logit contributions
rily+0.170
lights+0.164
adoes+0.157
ocked+0.149
aining+0.139
Neg logit contributions
 least-0.107
 wait-0.102
 say-0.099
 very-0.098
 am-0.093
Token.
Token ID13
Feature activation-2.83e-01

Pos logit contributions
les+0.012
lights+0.012
ork+0.012
rily+0.012
adoes+0.012
Neg logit contributions
 least-0.012
oh-0.009
 say-0.009
 oh-0.008
 kidding-0.008
Token It
Token ID632
Feature activation-1.82e-01

Pos logit contributions
les+0.013
lights+0.013
rily+0.013
ork+0.013
adoes+0.012
Neg logit contributions
 least-0.010
oh-0.007
 say-0.007
 oh-0.007
 kidding-0.006
Token was
Token ID373
Feature activation+2.72e-01

Pos logit contributions
les+0.008
lights+0.008
rily+0.008
ork+0.008
adoes+0.007
Neg logit contributions
 least-0.007
oh-0.005
 say-0.005
 oh-0.005
 am-0.005
Token a
Token ID257
Feature activation+2.61e-01

Pos logit contributions
 least+0.009
oh+0.007
 say+0.006
 kidding+0.006
 oh+0.006
Neg logit contributions
les-0.013
lights-0.013
ork-0.012
rily-0.012
adoes-0.012
Token red
Token ID2266
Feature activation-4.14e-01

Pos logit contributions
les+0.002
lights+0.002
ork+0.002
rily+0.002
adoes+0.002
Neg logit contributions
 least-0.002
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token bike
Token ID7161
Feature activation+1.91e-01

Pos logit contributions
lights+0.015
les+0.015
rily+0.014
ork+0.014
adoes+0.014
Neg logit contributions
 least-0.018
oh-0.015
 say-0.014
 oh-0.014
 kidding-0.014
Token with
Token ID351
Feature activation-5.26e-01

Pos logit contributions
 least+0.008
oh+0.007
 say+0.006
 kidding+0.006
 oh+0.006
Neg logit contributions
les-0.007
lights-0.007
ork-0.007
rily-0.007
adoes-0.007
Token a
Token ID257
Feature activation-1.51e+00

Pos logit contributions
lights+0.027
les+0.026
ork+0.026
rily+0.025
adoes+0.025
Neg logit contributions
 least-0.015
oh-0.011
 say-0.011
 oh-0.010
 Mir-0.010
Token com
Token ID401
Feature activation-1.41e+00

Pos logit contributions
lights+0.067
les+0.066
rily+0.064
adoes+0.062
ened+0.058
Neg logit contributions
 least-0.049
 say-0.044
 very-0.042
oh-0.041
 am-0.040
Tokenfy
Token ID24928
Feature activation-3.75e+00

Pos logit contributions
adoes+0.053
lights+0.053
lf+0.050
rily+0.049
les+0.047
Neg logit contributions
 least-0.056
oh-0.049
 say-0.048
 wait-0.045
 kidding-0.043
Token seat
Token ID5852
Feature activation-7.65e-01

Pos logit contributions
lights+0.188
rily+0.177
ust+0.167
les+0.166
ide+0.166
Neg logit contributions
 least-0.078
 say-0.075
 wait-0.073
 h-0.072
 very-0.072
Token and
Token ID290
Feature activation-1.74e-01

Pos logit contributions
les+0.033
lights+0.032
adoes+0.031
rily+0.031
ork+0.030
Neg logit contributions
 least-0.028
oh-0.022
 say-0.021
 oh-0.020
 am-0.019
Token shiny
Token ID22441
Feature activation-1.80e+00

Pos logit contributions
les+0.009
lights+0.009
ork+0.008
rily+0.008
adoes+0.008
Neg logit contributions
 least-0.005
oh-0.004
 say-0.004
 oh-0.003
 kidding-0.003
Token silver
Token ID8465
Feature activation-1.38e+00

Pos logit contributions
ork+0.077
lights+0.076
ide+0.074
rily+0.073
les+0.072
Neg logit contributions
 least-0.051
 say-0.044
 oh-0.042
 very-0.041
 most-0.040
Token bell
Token ID8966
Feature activation-8.08e-01

Pos logit contributions
lights+0.055
les+0.054
ork+0.053
rily+0.052
adoes+0.051
Neg logit contributions
 least-0.049
 say-0.039
 oh-0.038
oh-0.038
 am-0.037
Token Billy
Token ID15890
Feature activation-1.41e-01

Pos logit contributions
les+0.063
rily+0.062
lights+0.061
ork+0.059
urbed+0.056
Neg logit contributions
 least-0.041
 say-0.036
oh-0.031
 am-0.031
 oh-0.030
Token was
Token ID373
Feature activation-1.08e-01

Pos logit contributions
lights+0.006
les+0.006
ork+0.006
rily+0.006
adoes+0.006
Neg logit contributions
 least-0.005
oh-0.004
 say-0.004
 oh-0.004
 kidding-0.004
Token wearing
Token ID5762
Feature activation-1.90e+00

Pos logit contributions
les+0.005
lights+0.005
ork+0.005
rily+0.005
adoes+0.005
Neg logit contributions
 least-0.003
oh-0.002
 say-0.002
 oh-0.002
 kidding-0.002
Token a
Token ID257
Feature activation-3.04e+00

Pos logit contributions
les+0.128
lights+0.126
ork+0.125
rily+0.124
anna+0.120
Neg logit contributions
 least-0.022
 very-0.015
 most-0.010
 oh-0.010
 best-0.010
Token really
Token ID1107
Feature activation-1.79e+00

Pos logit contributions
lights+0.161
adoes+0.155
urbed+0.154
les+0.152
rily+0.152
Neg logit contributions
 very-0.076
 say-0.069
 best-0.065
 least-0.061
 oh-0.057
Token cool
Token ID3608
Feature activation-3.66e+00

Pos logit contributions
lights+0.088
rily+0.088
les+0.086
adoes+0.086
urbed+0.081
Neg logit contributions
 least-0.044
 say-0.042
 very-0.042
 am-0.040
 oh-0.039
Token shirt
Token ID10147
Feature activation-8.52e-01

Pos logit contributions
rily+0.187
lights+0.187
urbed+0.183
ide+0.171
ust+0.170
Neg logit contributions
 say-0.074
 very-0.074
 least-0.059
 waiting-0.058
 best-0.057
Token with
Token ID351
Feature activation-1.32e+00

Pos logit contributions
lights+0.038
les+0.037
adoes+0.036
ork+0.036
rily+0.036
Neg logit contributions
 least-0.031
oh-0.023
 say-0.023
 oh-0.021
 kidding-0.021
Token a
Token ID257
Feature activation-2.27e+00

Pos logit contributions
lights+0.068
ork+0.067
les+0.065
rily+0.065
adoes+0.062
Neg logit contributions
 least-0.037
 say-0.028
 oh-0.026
 very-0.026
oh-0.024
Token dinosaur
Token ID28257
Feature activation-1.95e+00

Pos logit contributions
lights+0.109
les+0.105
rily+0.103
adoes+0.102
ide+0.101
Neg logit contributions
 say-0.063
 least-0.063
 very-0.061
 am-0.055
 best-0.054
Token on
Token ID319
Feature activation-3.07e+00

Pos logit contributions
lights+0.095
rily+0.087
adoes+0.087
les+0.084
ened+0.082
Neg logit contributions
 least-0.062
oh-0.048
 say-0.048
 very-0.046
 oh-0.043
Token the
Token ID262
Feature activation+1.15e+00

Pos logit contributions
 least+0.005
oh+0.004
 say+0.004
 oh+0.004
 kidding+0.004
Neg logit contributions
les-0.007
lights-0.007
ork-0.007
rily-0.007
adoes-0.007
Token beach
Token ID10481
Feature activation-1.01e-01

Pos logit contributions
 least+0.049
oh+0.043
 kidding+0.040
 say+0.039
 oh+0.039
Neg logit contributions
les-0.039
ork-0.038
lights-0.038
adoes-0.036
rily-0.036
Token and
Token ID290
Feature activation-7.63e-01

Pos logit contributions
les+0.004
lights+0.004
ork+0.004
rily+0.004
adoes+0.004
Neg logit contributions
 least-0.004
oh-0.003
 say-0.003
 oh-0.003
 kidding-0.003
Token build
Token ID1382
Feature activation+2.53e-01

Pos logit contributions
les+0.036
lights+0.035
ork+0.034
rily+0.034
adoes+0.033
Neg logit contributions
 least-0.028
 say-0.021
oh-0.020
 wait-0.019
 am-0.019
Token sand
Token ID6450
Feature activation-8.80e-01

Pos logit contributions
 least+0.009
oh+0.007
 say+0.006
 oh+0.006
 kidding+0.006
Neg logit contributions
les-0.012
lights-0.011
ork-0.011
rily-0.011
adoes-0.011
Tokencast
Token ID2701
Feature activation-3.65e+00

Pos logit contributions
les+0.030
ork+0.030
lights+0.029
adoes+0.029
rily+0.028
Neg logit contributions
 least-0.038
 say-0.031
oh-0.031
 ourselves-0.029
 am-0.028
Tokenles
Token ID829
Feature activation-9.51e-01

Pos logit contributions
adoes+0.082
 former+0.068
lights+0.067
 aids+0.062
igsaw+0.058
Neg logit contributions
 least-0.178
oh-0.165
 are-0.165
ila-0.161
 say-0.155
Token.
Token ID13
Feature activation-3.85e-01

Pos logit contributions
lights+0.040
les+0.040
ork+0.039
adoes+0.038
rily+0.038
Neg logit contributions
 least-0.034
oh-0.027
 say-0.026
 oh-0.024
 ourselves-0.024
Token
Token ID220
Feature activation-7.85e-01

Pos logit contributions
les+0.016
lights+0.015
ork+0.015
rily+0.015
adoes+0.014
Neg logit contributions
 least-0.015
oh-0.012
 say-0.012
 kidding-0.011
 oh-0.011
Token
Token ID198
Feature activation-2.75e-01

Pos logit contributions
les+0.038
lights+0.037
adoes+0.036
rily+0.036
ork+0.035
Neg logit contributions
 least-0.025
oh-0.019
 say-0.018
 am-0.017
 Mir-0.017
Token
Token ID198
Feature activation-3.32e-01

Pos logit contributions
les+0.012
lights+0.012
rily+0.012
adoes+0.012
ork+0.012
Neg logit contributions
 least-0.010
oh-0.008
 say-0.007
 oh-0.007
 kidding-0.007
Token play
Token ID711
Feature activation+1.72e-02

Pos logit contributions
lights+0.037
les+0.036
rily+0.035
ork+0.034
adoes+0.034
Neg logit contributions
 least-0.025
 say-0.020
 wait-0.017
 am-0.017
oh-0.017
Token.
Token ID13
Feature activation-1.01e+00

Pos logit contributions
 least+0.001
oh+0.001
 say+0.001
 oh+0.001
 kidding+0.001
Neg logit contributions
les-0.001
lights-0.001
ork-0.001
rily-0.001
adoes-0.001
Token Lily
Token ID20037
Feature activation+5.17e-01

Pos logit contributions
les+0.047
lights+0.046
rily+0.044
adoes+0.043
ork+0.042
Neg logit contributions
 least-0.034
 say-0.028
 Mir-0.026
oh-0.026
 oh-0.025
Token saw
Token ID2497
Feature activation+8.56e-01

Pos logit contributions
 least+0.022
oh+0.017
 kidding+0.016
 ourselves+0.016
 oh+0.016
Neg logit contributions
les-0.021
lights-0.019
ork-0.019
rily-0.018
adoes-0.018
Token a
Token ID257
Feature activation-3.03e-01

Pos logit contributions
 least+0.031
oh+0.025
 kidding+0.022
 am+0.021
 say+0.021
Neg logit contributions
les-0.038
lights-0.037
adoes-0.035
ork-0.035
rily-0.034
Token merry
Token ID44570
Feature activation-3.64e+00

Pos logit contributions
les+0.012
lights+0.012
rily+0.011
ork+0.011
adoes+0.011
Neg logit contributions
 least-0.012
oh-0.010
 say-0.010
 oh-0.010
 kidding-0.009
Token-
Token ID12
Feature activation-7.20e-01

Pos logit contributions
lights+0.174
rily+0.174
urbed+0.159
adoes+0.155
 journals+0.150
Neg logit contributions
 say-0.093
 wait-0.090
 least-0.089
 oh-0.086
oh-0.081
Tokengo
Token ID2188
Feature activation-2.07e-01

Pos logit contributions
lights+0.030
les+0.029
adoes+0.028
rily+0.028
ork+0.027
Neg logit contributions
 least-0.027
oh-0.024
 say-0.022
 oh-0.021
 wait-0.020
Token-
Token ID12
Feature activation-8.20e-01

Pos logit contributions
lights+0.005
les+0.005
rily+0.005
ork+0.005
adoes+0.005
Neg logit contributions
 least-0.011
oh-0.010
 say-0.009
 oh-0.009
 kidding-0.009
Tokenround
Token ID744
Feature activation+2.06e-01

Pos logit contributions
lights+0.025
les+0.023
rily+0.023
adoes+0.023
ened+0.021
Neg logit contributions
 least-0.040
oh-0.037
 say-0.035
 oh-0.033
 wait-0.032
Token and
Token ID290
Feature activation+5.09e-02

Pos logit contributions
 least+0.009
oh+0.007
 say+0.007
 oh+0.007
 kidding+0.007
Neg logit contributions
les-0.008
lights-0.007
ork-0.007
rily-0.007
adoes-0.007
Token a
Token ID257
Feature activation-1.04e+00

Pos logit contributions
les+0.012
lights+0.012
ork+0.012
rily+0.012
adoes+0.012
Neg logit contributions
 least-0.007
oh-0.004
 say-0.004
 oh-0.004
 kidding-0.004
Token strong
Token ID1913
Feature activation-2.16e+00

Pos logit contributions
lights+0.041
les+0.040
rily+0.038
adoes+0.037
ened+0.036
Neg logit contributions
 least-0.043
 say-0.035
oh-0.034
 oh-0.033
 am-0.032
Token and
Token ID290
Feature activation-3.87e-01

Pos logit contributions
lights+0.104
rily+0.092
les+0.090
ork+0.088
adoes+0.084
Neg logit contributions
 least-0.072
 say-0.067
 saying-0.058
 doing-0.056
 very-0.056
Token brave
Token ID14802
Feature activation-3.15e+00

Pos logit contributions
les+0.018
lights+0.017
ork+0.017
rily+0.017
adoes+0.016
Neg logit contributions
 least-0.013
 say-0.010
 oh-0.010
oh-0.010
 kidding-0.009
Token bear
Token ID6842
Feature activation-8.27e-01

Pos logit contributions
lights+0.147
ork+0.129
rily+0.128
lf+0.126
ide+0.124
Neg logit contributions
 say-0.089
 least-0.088
 very-0.079
 shout-0.076
ses-0.075
Token on
Token ID319
Feature activation-3.60e+00

Pos logit contributions
lights+0.039
les+0.037
rily+0.035
adoes+0.035
ork+0.034
Neg logit contributions
 least-0.028
oh-0.023
 say-0.022
 am-0.021
 oh-0.020
Token it
Token ID340
Feature activation-4.23e-01

Pos logit contributions
les+0.203
lights+0.203
anna+0.197
arre+0.196
aff+0.196
Neg logit contributions
 very-0.060
 a-0.048
 an-0.044
 the-0.043
 so-0.037
Token.
Token ID13
Feature activation-1.11e+00

Pos logit contributions
lights+0.018
les+0.018
ork+0.017
rily+0.017
adoes+0.017
Neg logit contributions
 least-0.016
oh-0.012
 say-0.012
 oh-0.011
 kidding-0.011
Token Benny
Token ID44275
Feature activation+7.16e-03

Pos logit contributions
lights+0.059
les+0.056
rily+0.054
ork+0.054
adoes+0.054
Neg logit contributions
 least-0.031
 say-0.024
oh-0.023
 oh-0.021
 am-0.020
Token wanted
Token ID2227
Feature activation-2.97e-01

Pos logit contributions
 least+0.000
oh+0.000
 say+0.000
 oh+0.000
 kidding+0.000
Neg logit contributions
les-0.000
lights-0.000
ork-0.000
rily-0.000
adoes-0.000
Token to
Token ID284
Feature activation-1.65e-01

Pos logit contributions
lights+0.013
les+0.013
rily+0.013
ork+0.012
adoes+0.012
Neg logit contributions
 least-0.011
oh-0.008
 say-0.008
 oh-0.008
 kidding-0.008
Token things
Token ID1243
Feature activation+1.01e+00

Pos logit contributions
lights+0.058
rily+0.057
les+0.057
ork+0.054
ocked+0.049
Neg logit contributions
 least-0.064
 worst-0.052
 best-0.052
 say-0.051
 oh-0.051
Token inside
Token ID2641
Feature activation+7.18e-01

Pos logit contributions
 least+0.041
oh+0.034
 kidding+0.033
 Mir+0.032
 oh+0.032
Neg logit contributions
les-0.037
lights-0.037
rily-0.036
ork-0.035
adoes-0.034
Token,
Token ID11
Feature activation+5.73e-01

Pos logit contributions
 least+0.030
oh+0.025
 kidding+0.023
 oh+0.023
 say+0.022
Neg logit contributions
les-0.027
lights-0.027
rily-0.026
ork-0.025
adoes-0.024
Token but
Token ID475
Feature activation+2.66e-02

Pos logit contributions
 least+0.021
oh+0.017
 kidding+0.014
 am+0.014
 ourselves+0.014
Neg logit contributions
lights-0.025
les-0.025
rily-0.024
ork-0.023
adoes-0.023
Token the
Token ID262
Feature activation-1.84e+00

Pos logit contributions
 least+0.001
oh+0.001
 say+0.001
 oh+0.001
 kidding+0.001
Neg logit contributions
les-0.001
lights-0.001
ork-0.001
rily-0.001
adoes-0.001
Token brave
Token ID14802
Feature activation-3.58e+00

Pos logit contributions
les+0.071
rily+0.069
lights+0.069
ork+0.063
en+0.062
Neg logit contributions
 least-0.082
 worst-0.068
 oh-0.068
 very-0.067
 best-0.067
Token little
Token ID1310
Feature activation-1.53e+00

Pos logit contributions
rily+0.202
lights+0.198
lor+0.193
pron+0.189
ork+0.187
Neg logit contributions
 least-0.059
 very-0.044
 best-0.043
 oh-0.042
 say-0.042
Token girl
Token ID2576
Feature activation-7.98e-02

Pos logit contributions
lights+0.084
ork+0.083
rily+0.080
les+0.080
ocked+0.075
Neg logit contributions
 least-0.040
 oh-0.028
 say-0.027
 worst-0.025
 kidding-0.024
Token saw
Token ID2497
Feature activation-1.90e-02

Pos logit contributions
les+0.003
lights+0.003
ork+0.003
rily+0.003
adoes+0.003
Neg logit contributions
 least-0.003
oh-0.003
 say-0.002
 oh-0.002
 kidding-0.002
Token him
Token ID683
Feature activation+5.39e-01

Pos logit contributions
les+0.001
lights+0.001
ork+0.001
rily+0.001
adoes+0.001
Neg logit contributions
 least-0.001
oh-0.001
 say-0.000
 oh-0.000
 kidding-0.000
Token.
Token ID13
Feature activation+1.18e-01

Pos logit contributions
 least+0.023
oh+0.019
 kidding+0.018
 oh+0.018
 say+0.017
Neg logit contributions
les-0.020
rily-0.019
ork-0.019
lights-0.019
adoes-0.018
Token big
Token ID1263
Feature activation+3.50e-01

Pos logit contributions
les+0.003
lights+0.003
ork+0.003
rily+0.003
adoes+0.003
Neg logit contributions
 least-0.002
oh-0.002
 say-0.001
 oh-0.001
 kidding-0.001
Token eyes
Token ID2951
Feature activation-2.30e+00

Pos logit contributions
 least+0.017
oh+0.014
 oh+0.013
 say+0.013
 kidding+0.013
Neg logit contributions
les-0.012
lights-0.011
ork-0.011
rily-0.011
adoes-0.011
Token and
Token ID290
Feature activation-1.61e-01

Pos logit contributions
lights+0.118
adoes+0.116
 tant+0.114
ork+0.109
les+0.109
Neg logit contributions
 least-0.059
oh-0.050
 saying-0.050
 say-0.048
 very-0.046
Token a
Token ID257
Feature activation-7.62e-01

Pos logit contributions
les+0.008
lights+0.007
ork+0.007
rily+0.007
adoes+0.007
Neg logit contributions
 least-0.005
oh-0.004
 say-0.004
 oh-0.004
 kidding-0.004
Token w
Token ID266
Feature activation-5.40e-01

Pos logit contributions
lights+0.031
les+0.031
rily+0.029
adoes+0.029
ork+0.028
Neg logit contributions
 least-0.030
 say-0.024
oh-0.024
 oh-0.023
 am-0.022
Tokenagging
Token ID16406
Feature activation-3.48e+00

Pos logit contributions
lights+0.018
les+0.017
rily+0.017
adoes+0.016
ork+0.016
Neg logit contributions
 least-0.026
oh-0.021
 say-0.020
 kidding-0.020
 am-0.020
Token tail
Token ID7894
Feature activation-2.06e+00

Pos logit contributions
lights+0.160
les+0.150
vable+0.145
ilantro+0.144
oons+0.142
Neg logit contributions
 say-0.098
 very-0.094
 saying-0.087
 shout-0.082
ses-0.082
Token.
Token ID13
Feature activation-5.60e-01

Pos logit contributions
lights+0.105
les+0.099
adoes+0.098
ork+0.098
 tant+0.097
Neg logit contributions
 least-0.057
 say-0.045
oh-0.044
 saying-0.042
 very-0.041
Token The
Token ID383
Feature activation-1.47e+00

Pos logit contributions
les+0.025
lights+0.025
ork+0.024
rily+0.024
adoes+0.023
Neg logit contributions
 least-0.020
oh-0.016
 say-0.015
 Mir-0.015
 oh-0.014
Token young
Token ID1862
Feature activation-1.83e+00

Pos logit contributions
ork+0.065
les+0.064
rily+0.063
lights+0.061
anne+0.058
Neg logit contributions
 least-0.057
 best-0.046
 say-0.045
 most-0.045
 very-0.045
Token boy
Token ID2933
Feature activation+3.12e-01

Pos logit contributions
ork+0.113
lights+0.106
les+0.102
rily+0.099
ches+0.096
Neg logit contributions
 least-0.038
 say-0.027
 oh-0.026
 best-0.025
 most-0.025
Token man
Token ID582
Feature activation-2.16e-01

Pos logit contributions
les+0.003
lights+0.003
rily+0.003
ork+0.003
adoes+0.003
Neg logit contributions
 least-0.004
oh-0.003
 say-0.003
 oh-0.003
 kidding-0.003
Token walking
Token ID6155
Feature activation-4.62e-01

Pos logit contributions
les+0.010
lights+0.010
ork+0.009
rily+0.009
adoes+0.009
Neg logit contributions
 least-0.008
oh-0.006
 say-0.006
 oh-0.005
 kidding-0.005
Token on
Token ID319
Feature activation-5.60e-01

Pos logit contributions
lights+0.021
les+0.021
ork+0.020
rily+0.020
adoes+0.019
Neg logit contributions
 least-0.016
oh-0.013
 say-0.012
 oh-0.012
 kidding-0.011
Token a
Token ID257
Feature activation+8.84e-02

Pos logit contributions
les+0.031
lights+0.030
rily+0.029
ork+0.029
adoes+0.028
Neg logit contributions
 least-0.015
 say-0.010
oh-0.010
 oh-0.009
 Mir-0.009
Token tight
Token ID5381
Feature activation-1.33e+00

Pos logit contributions
 least+0.004
oh+0.003
 say+0.003
 oh+0.003
 kidding+0.003
Neg logit contributions
les-0.003
lights-0.003
ork-0.003
rily-0.003
adoes-0.003
Tokenro
Token ID305
Feature activation-3.46e+00

Pos logit contributions
lights+0.056
les+0.055
rily+0.052
adoes+0.052
ork+0.049
Neg logit contributions
 least-0.049
 say-0.040
oh-0.039
 wait-0.037
 am-0.036
Tokenpe
Token ID431
Feature activation-4.67e-01

Pos logit contributions
lights+0.165
les+0.145
rily+0.144
 journals+0.143
adoes+0.139
Neg logit contributions
oh-0.102
 least-0.087
 oh-0.085
 am-0.084
 very-0.080
Token.
Token ID13
Feature activation-6.41e-01

Pos logit contributions
lights+0.019
les+0.019
ork+0.018
rily+0.018
adoes+0.018
Neg logit contributions
 least-0.018
oh-0.014
 say-0.014
 oh-0.013
 kidding-0.013
Token He
Token ID679
Feature activation-6.87e-01

Pos logit contributions
lights+0.030
les+0.030
rily+0.029
ork+0.029
adoes+0.028
Neg logit contributions
 least-0.020
 say-0.016
oh-0.016
 Mir-0.015
 oh-0.014
Token had
Token ID550
Feature activation+1.14e-01

Pos logit contributions
lights+0.033
les+0.032
rily+0.032
adoes+0.031
ork+0.031
Neg logit contributions
 least-0.023
oh-0.018
 say-0.017
 oh-0.016
 Mir-0.015
Token to
Token ID284
Feature activation+2.91e-01

Pos logit contributions
 least+0.004
oh+0.003
 say+0.002
 kidding+0.002
 oh+0.002
Neg logit contributions
les-0.006
lights-0.006
ork-0.005
rily-0.005
adoes-0.005
Token with
Token ID351
Feature activation-9.61e-01

Pos logit contributions
lights+0.019
les+0.019
rily+0.018
adoes+0.018
ork+0.018
Neg logit contributions
 least-0.016
oh-0.012
 say-0.012
 oh-0.012
 kidding-0.011
Token his
Token ID465
Feature activation-4.42e-02

Pos logit contributions
lights+0.048
les+0.047
ork+0.046
rily+0.045
adoes+0.044
Neg logit contributions
 least-0.029
 say-0.021
 oh-0.021
oh-0.020
 very-0.019
Token favorite
Token ID4004
Feature activation-2.58e+00

Pos logit contributions
les+0.002
lights+0.002
ork+0.001
rily+0.001
adoes+0.001
Neg logit contributions
 least-0.002
oh-0.002
 say-0.002
 oh-0.002
 kidding-0.002
Token cartoon
Token ID16251
Feature activation-1.01e+00

Pos logit contributions
lights+0.121
rily+0.117
urbed+0.106
ide+0.104
ackers+0.101
Neg logit contributions
 say-0.076
 very-0.070
 am-0.068
 are-0.067
 least-0.064
Token character
Token ID2095
Feature activation-4.69e-01

Pos logit contributions
lights+0.042
les+0.041
rily+0.041
adoes+0.039
ork+0.038
Neg logit contributions
 least-0.037
 say-0.030
oh-0.030
 oh-0.029
 am-0.028
Token on
Token ID319
Feature activation-3.45e+00

Pos logit contributions
lights+0.020
les+0.020
ork+0.019
rily+0.019
adoes+0.019
Neg logit contributions
 least-0.017
oh-0.013
 say-0.013
 oh-0.013
 kidding-0.012
Token the
Token ID262
Feature activation+2.86e-01

Pos logit contributions
lights+0.207
les+0.192
anna+0.192
anne+0.190
ust+0.186
Neg logit contributions
 very-0.055
 a-0.050
 least-0.048
 the-0.045
 an-0.044
Token cover
Token ID3002
Feature activation-1.40e+00

Pos logit contributions
 least+0.014
oh+0.012
 say+0.011
 kidding+0.011
 oh+0.011
Neg logit contributions
les-0.009
lights-0.009
ork-0.009
rily-0.008
adoes-0.008
Token.
Token ID13
Feature activation-1.09e+00

Pos logit contributions
lights+0.063
les+0.063
adoes+0.062
ork+0.062
rily+0.060
Neg logit contributions
 least-0.043
 say-0.033
oh-0.033
 oh-0.029
 very-0.029
Token He
Token ID679
Feature activation-4.82e-01

Pos logit contributions
les+0.052
lights+0.049
rily+0.046
ork+0.044
 tant+0.043
Neg logit contributions
 least-0.036
 oh-0.030
 say-0.029
oh-0.028
 Mir-0.028
Token asked
Token ID1965
Feature activation-3.31e-01

Pos logit contributions
lights+0.021
les+0.021
rily+0.020
ork+0.019
adoes+0.019
Neg logit contributions
 least-0.018
oh-0.015
 say-0.014
 oh-0.014
 Mir-0.013
Token room
Token ID2119
Feature activation-4.79e-01

Pos logit contributions
les+0.110
lights+0.108
rily+0.101
adoes+0.097
ocked+0.094
Neg logit contributions
 least-0.078
 very-0.064
 say-0.062
 oh-0.060
oh-0.059
Token with
Token ID351
Feature activation+2.16e-02

Pos logit contributions
les+0.020
lights+0.020
rily+0.019
adoes+0.019
ork+0.019
Neg logit contributions
 least-0.018
oh-0.015
 say-0.014
 oh-0.014
 am-0.013
Token a
Token ID257
Feature activation-4.95e-01

Pos logit contributions
 least+0.001
oh+0.001
 say+0.000
 oh+0.000
 kidding+0.000
Neg logit contributions
les-0.001
lights-0.001
ork-0.001
rily-0.001
adoes-0.001
Token pretty
Token ID2495
Feature activation-2.23e+00

Pos logit contributions
les+0.020
lights+0.019
rily+0.019
adoes+0.018
ork+0.018
Neg logit contributions
 least-0.020
 say-0.016
oh-0.016
 oh-0.015
 am-0.015
Token tut
Token ID9732
Feature activation-2.35e+00

Pos logit contributions
lights+0.095
les+0.093
rily+0.091
adoes+0.083
urbed+0.081
Neg logit contributions
 least-0.072
 say-0.064
 very-0.063
 oh-0.061
 am-0.059
Tokenoring
Token ID3255
Feature activation-3.45e+00

Pos logit contributions
lights+0.086
adoes+0.085
 tant+0.077
emo+0.074
rily+0.073
Neg logit contributions
oh-0.083
 least-0.080
 say-0.067
-0.064
 very-0.064
Token desk
Token ID6915
Feature activation-1.28e+00

Pos logit contributions
lights+0.140
emo+0.135
rily+0.133
ackers+0.128
ilantro+0.126
Neg logit contributions
 very-0.092
 h-0.085
 least-0.079
 say-0.079
 doing-0.076
Token.
Token ID13
Feature activation-3.04e-01

Pos logit contributions
les+0.058
lights+0.057
adoes+0.056
rily+0.055
 tant+0.054
Neg logit contributions
 least-0.042
oh-0.033
 say-0.032
 oh-0.030
 saying-0.029
Token
Token ID220
Feature activation-2.05e-02

Pos logit contributions
les+0.015
lights+0.014
rily+0.014
ork+0.014
adoes+0.013
Neg logit contributions
 least-0.010
oh-0.008
 say-0.007
 oh-0.007
 Mir-0.007
Token
Token ID198
Feature activation+2.03e-01

Pos logit contributions
les+0.001
lights+0.001
ork+0.001
rily+0.001
adoes+0.001
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token
Token ID198
Feature activation-1.99e-01

Pos logit contributions
 least+0.008
oh+0.006
 say+0.006
 kidding+0.006
 oh+0.006
Neg logit contributions
les-0.008
lights-0.008
ork-0.008
rily-0.008
adoes-0.008
Token around
Token ID1088
Feature activation+6.09e-01

Pos logit contributions
lights+0.010
les+0.010
ork+0.009
rily+0.009
adoes+0.009
Neg logit contributions
 least-0.011
oh-0.008
 say-0.008
 kidding-0.008
 oh-0.008
Token and
Token ID290
Feature activation+5.53e-01

Pos logit contributions
 least+0.028
oh+0.023
 kidding+0.021
 say+0.021
 oh+0.021
Neg logit contributions
les-0.021
lights-0.021
ork-0.020
rily-0.020
ened-0.019
Token saw
Token ID2497
Feature activation+2.80e-01

Pos logit contributions
 least+0.022
oh+0.017
 kidding+0.016
 ourselves+0.016
 worst+0.016
Neg logit contributions
les-0.022
lights-0.021
ork-0.021
rily-0.021
ened-0.020
Token a
Token ID257
Feature activation-7.24e-01

Pos logit contributions
 least+0.010
oh+0.008
 say+0.007
 kidding+0.007
 oh+0.007
Neg logit contributions
les-0.013
lights-0.012
ork-0.012
rily-0.012
adoes-0.012
Token big
Token ID1263
Feature activation-4.96e-01

Pos logit contributions
les+0.029
lights+0.029
rily+0.028
adoes+0.027
ork+0.026
Neg logit contributions
 least-0.029
oh-0.023
 say-0.023
 oh-0.022
 kidding-0.022
Token do
Token ID466
Feature activation-3.44e+00

Pos logit contributions
les+0.020
lights+0.020
rily+0.019
ork+0.019
adoes+0.019
Neg logit contributions
 least-0.020
oh-0.015
 say-0.015
 kidding-0.014
 oh-0.014
Tokenggy
Token ID19970
Feature activation+2.23e-01

Pos logit contributions
lights+0.142
rily+0.133
les+0.128
adoes+0.122
ened+0.122
Neg logit contributions
 am-0.127
 very-0.119
oh-0.115
 doing-0.115
 least-0.114
Token smile
Token ID8212
Feature activation-4.09e-01

Pos logit contributions
 least+0.009
oh+0.007
 say+0.007
 oh+0.007
 kidding+0.007
Neg logit contributions
les-0.009
lights-0.009
ork-0.008
rily-0.008
adoes-0.008
Token at
Token ID379
Feature activation-4.47e-01

Pos logit contributions
lights+0.017
les+0.017
ork+0.017
rily+0.016
adoes+0.016
Neg logit contributions
 least-0.015
oh-0.012
 say-0.012
 oh-0.011
 kidding-0.011
Token her
Token ID607
Feature activation+5.02e-01

Pos logit contributions
les+0.023
lights+0.023
ork+0.023
rily+0.022
adoes+0.022
Neg logit contributions
 least-0.013
oh-0.009
 say-0.009
 oh-0.008
 ourselves-0.008
Token.
Token ID13
Feature activation-3.47e-01

Pos logit contributions
 least+0.022
oh+0.018
 oh+0.017
 say+0.017
 kidding+0.017
Neg logit contributions
les-0.018
lights-0.018
ork-0.017
rily-0.017
adoes-0.016
Token it
Token ID340
Feature activation+6.52e-01

Pos logit contributions
 least+0.004
oh+0.003
 say+0.003
 oh+0.003
 kidding+0.003
Neg logit contributions
les-0.004
lights-0.004
ork-0.004
rily-0.004
adoes-0.004
Token!
Token ID0
Feature activation+4.89e-01

Pos logit contributions
 least+0.027
oh+0.022
 kidding+0.021
 say+0.021
 oh+0.021
Neg logit contributions
les-0.025
lights-0.024
rily-0.023
ork-0.023
adoes-0.022
Token
Token ID198
Feature activation+6.43e-01

Pos logit contributions
 least+0.021
oh+0.016
 say+0.016
 oh+0.016
 kidding+0.015
Neg logit contributions
les-0.018
lights-0.018
ork-0.018
rily-0.017
adoes-0.017
Token
Token ID198
Feature activation-4.69e-02

Pos logit contributions
 least+0.028
 kidding+0.022
oh+0.021
 oh+0.021
 say+0.021
Neg logit contributions
les-0.023
ork-0.023
lights-0.023
rily-0.022
adoes-0.020
TokenThe
Token ID464
Feature activation-1.70e+00

Pos logit contributions
les+0.002
lights+0.002
ork+0.002
rily+0.002
adoes+0.002
Neg logit contributions
 least-0.002
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token white
Token ID2330
Feature activation-3.41e+00

Pos logit contributions
ork+0.087
les+0.075
lights+0.074
aff+0.072
rily+0.071
Neg logit contributions
 least-0.063
 best-0.052
 most-0.050
 very-0.049
 worst-0.047
Token all
Token ID477
Feature activation-3.74e-01

Pos logit contributions
ork+0.163
lights+0.154
isdom+0.147
azz+0.146
url+0.144
Neg logit contributions
 very-0.073
 m-0.072
 kay-0.071
 least-0.071
 most-0.067
Tokenigator
Token ID23823
Feature activation+8.53e-02

Pos logit contributions
ork+0.013
les+0.013
lights+0.013
rily+0.012
adoes+0.012
Neg logit contributions
 least-0.017
oh-0.014
 say-0.014
 oh-0.013
 kidding-0.013
Token was
Token ID373
Feature activation+1.54e-01

Pos logit contributions
 least+0.003
oh+0.003
 say+0.002
 oh+0.002
 kidding+0.002
Neg logit contributions
les-0.004
lights-0.003
ork-0.003
rily-0.003
adoes-0.003
Token scared
Token ID12008
Feature activation+1.79e-01

Pos logit contributions
 least+0.004
oh+0.003
 say+0.002
 oh+0.002
 kidding+0.002
Neg logit contributions
les-0.008
lights-0.008
rily-0.008
ork-0.008
adoes-0.008
Token so
Token ID523
Feature activation+1.37e-01

Pos logit contributions
 least+0.008
oh+0.006
 say+0.006
 kidding+0.006
 oh+0.006
Neg logit contributions
les-0.007
lights-0.007
rily-0.006
ork-0.006
adoes-0.006
Token to
Token ID284
Feature activation+8.25e-02

Pos logit contributions
les+0.014
lights+0.014
ork+0.013
rily+0.013
adoes+0.013
Neg logit contributions
 least-0.013
oh-0.010
 say-0.009
 oh-0.009
 kidding-0.009
Token have
Token ID423
Feature activation-7.03e-01

Pos logit contributions
 least+0.003
oh+0.002
 say+0.002
 oh+0.002
 kidding+0.002
Neg logit contributions
les-0.004
lights-0.004
ork-0.004
rily-0.004
adoes-0.004
Token found
Token ID1043
Feature activation-4.68e-01

Pos logit contributions
lights+0.034
les+0.033
rily+0.033
adoes+0.032
ork+0.031
Neg logit contributions
 least-0.024
oh-0.017
 oh-0.017
 say-0.017
 am-0.016
Token such
Token ID884
Feature activation-4.47e-02

Pos logit contributions
lights+0.023
les+0.022
ork+0.022
rily+0.021
adoes+0.021
Neg logit contributions
 least-0.015
 say-0.011
oh-0.011
 oh-0.010
 ourselves-0.010
Token a
Token ID257
Feature activation-1.29e+00

Pos logit contributions
les+0.002
lights+0.002
ork+0.002
rily+0.002
adoes+0.002
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token beautiful
Token ID4950
Feature activation-3.41e+00

Pos logit contributions
lights+0.067
les+0.066
adoes+0.062
rily+0.061
ened+0.059
Neg logit contributions
 least-0.042
 very-0.031
 say-0.030
 best-0.030
oh-0.029
Token treasure
Token ID14068
Feature activation-1.33e+00

Pos logit contributions
lights+0.157
cases+0.143
rily+0.141
les+0.139
oons+0.134
Neg logit contributions
 least-0.106
 best-0.101
 very-0.100
 most-0.095
 doing-0.093
Token.
Token ID13
Feature activation-2.02e-02

Pos logit contributions
lights+0.063
les+0.062
adoes+0.060
 tant+0.059
rily+0.058
Neg logit contributions
 least-0.045
 say-0.032
oh-0.032
 oh-0.030
 wait-0.029
Token<|endoftext|>
Token ID50256
Feature activation-1.03e+00

Pos logit contributions
les+0.001
lights+0.001
ork+0.001
rily+0.001
adoes+0.001
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
TokenOnce
Token ID7454
Feature activation-1.12e+00

Pos logit contributions
lights+0.054
les+0.052
adoes+0.052
rily+0.052
ork+0.049
Neg logit contributions
 least-0.029
oh-0.022
 am-0.021
 say-0.021
 Mir-0.019
Token upon
Token ID2402
Feature activation-5.82e-01

Pos logit contributions
les+0.044
adoes+0.044
lights+0.043
rily+0.043
ork+0.040
Neg logit contributions
 least-0.045
oh-0.037
 say-0.036
 am-0.035
 Mir-0.033
Token had
Token ID550
Feature activation-1.05e+00

Pos logit contributions
lights+0.039
les+0.038
rily+0.038
adoes+0.036
ork+0.036
Neg logit contributions
 least-0.029
oh-0.023
 say-0.022
 am-0.021
 oh-0.020
Token gone
Token ID3750
Feature activation+1.65e-01

Pos logit contributions
les+0.058
lights+0.058
rily+0.056
adoes+0.053
ork+0.053
Neg logit contributions
 least-0.028
 oh-0.018
oh-0.018
 say-0.017
 very-0.016
Token away
Token ID1497
Feature activation+4.15e-01

Pos logit contributions
 least+0.007
oh+0.005
 say+0.005
 oh+0.005
 kidding+0.005
Neg logit contributions
les-0.007
lights-0.007
ork-0.006
rily-0.006
adoes-0.006
Token for
Token ID329
Feature activation+1.25e-01

Pos logit contributions
 least+0.017
oh+0.014
 say+0.013
 kidding+0.013
 oh+0.013
Neg logit contributions
les-0.016
lights-0.016
ork-0.015
rily-0.015
adoes-0.015
Token a
Token ID257
Feature activation-1.09e+00

Pos logit contributions
 least+0.004
oh+0.003
 say+0.003
 kidding+0.003
 oh+0.003
Neg logit contributions
les-0.006
lights-0.006
ork-0.006
rily-0.006
adoes-0.006
Token long
Token ID890
Feature activation-3.41e+00

Pos logit contributions
les+0.047
lights+0.046
adoes+0.045
rily+0.045
ened+0.041
Neg logit contributions
 least-0.044
 say-0.034
 wait-0.034
oh-0.034
 oh-0.033
Token time
Token ID640
Feature activation-7.55e-02

Pos logit contributions
rily+0.156
lights+0.150
adoes+0.139
les+0.137
ocked+0.126
Neg logit contributions
 least-0.116
 wait-0.105
 say-0.101
 very-0.099
 h-0.093
Token.
Token ID13
Feature activation-6.02e-01

Pos logit contributions
les+0.003
lights+0.003
ork+0.003
rily+0.003
adoes+0.003
Neg logit contributions
 least-0.003
oh-0.002
 say-0.002
 oh-0.002
 kidding-0.002
Token Lucy
Token ID22162
Feature activation+9.95e-01

Pos logit contributions
les+0.029
lights+0.028
rily+0.027
ork+0.026
adoes+0.026
Neg logit contributions
 least-0.020
oh-0.016
 say-0.015
 oh-0.015
 Mir-0.015
Token sighed
Token ID21893
Feature activation-4.17e-01

Pos logit contributions
 least+0.044
 kidding+0.037
 ourselves+0.036
 oh+0.036
oh+0.035
Neg logit contributions
les-0.036
ork-0.032
ened-0.030
rily-0.030
lights-0.030
Token sadly
Token ID21098
Feature activation-6.79e-02

Pos logit contributions
les+0.017
lights+0.017
ork+0.016
rily+0.016
adoes+0.016
Neg logit contributions
 least-0.017
oh-0.013
 say-0.013
 oh-0.012
 kidding-0.012
Token Ben
Token ID3932
Feature activation-2.58e-01

Pos logit contributions
les+0.049
lights+0.048
adoes+0.045
ork+0.045
lf+0.044
Neg logit contributions
 least-0.035
 Mir-0.031
 say-0.029
 Mil-0.028
 kidding-0.028
Token smiled
Token ID13541
Feature activation-4.86e-01

Pos logit contributions
lights+0.011
les+0.011
ork+0.011
rily+0.010
adoes+0.010
Neg logit contributions
 least-0.010
oh-0.008
 say-0.007
 oh-0.007
 kidding-0.007
Token and
Token ID290
Feature activation-9.78e-03

Pos logit contributions
les+0.020
lights+0.020
ork+0.020
rily+0.019
adoes+0.018
Neg logit contributions
 least-0.019
 say-0.015
oh-0.014
 oh-0.014
 kidding-0.014
Token hugged
Token ID41130
Feature activation-1.06e+00

Pos logit contributions
les+0.000
lights+0.000
ork+0.000
rily+0.000
adoes+0.000
Neg logit contributions
 least-0.000
oh-0.000
 say-0.000
 oh-0.000
 kidding-0.000
Token its
Token ID663
Feature activation-2.29e-01

Pos logit contributions
lights+0.048
ork+0.047
les+0.046
adoes+0.046
lf+0.045
Neg logit contributions
 least-0.032
 Mir-0.028
 say-0.028
 Mil-0.025
 oh-0.025
Token neck
Token ID7393
Feature activation-3.40e+00

Pos logit contributions
les+0.008
lights+0.008
ork+0.008
rily+0.008
adoes+0.008
Neg logit contributions
 least-0.010
 say-0.008
oh-0.008
 oh-0.008
 kidding-0.008
Token.
Token ID13
Feature activation-1.06e+00

Pos logit contributions
adoes+0.188
lights+0.185
lf+0.184
igsaw+0.179
ened+0.171
Neg logit contributions
 very-0.068
 say-0.052
 so-0.046
 right-0.037
 least-0.034
Token
Token ID198
Feature activation-1.18e-02

Pos logit contributions
les+0.052
lights+0.049
adoes+0.048
ork+0.048
ened+0.048
Neg logit contributions
 least-0.030
 Mir-0.027
 say-0.025
 Mil-0.023
 Roc-0.023
Token
Token ID198
Feature activation-2.92e-01

Pos logit contributions
les+0.001
lights+0.001
ork+0.000
rily+0.000
adoes+0.000
Neg logit contributions
 least-0.000
oh-0.000
 say-0.000
 oh-0.000
 kidding-0.000
TokenMom
Token ID29252
Feature activation-5.15e-01

Pos logit contributions
lights+0.011
les+0.011
rily+0.010
adoes+0.010
ork+0.010
Neg logit contributions
 least-0.012
oh-0.011
 say-0.010
 oh-0.009
 am-0.009
Token took
Token ID1718
Feature activation-1.70e-01

Pos logit contributions
lights+0.022
ork+0.021
rily+0.021
adoes+0.021
les+0.021
Neg logit contributions
 least-0.020
oh-0.016
 say-0.016
 Mir-0.014
 oh-0.014
Token hug
Token ID16225
Feature activation+3.34e-01

Pos logit contributions
lights+0.027
les+0.027
adoes+0.025
rily+0.025
ork+0.024
Neg logit contributions
 least-0.027
 say-0.021
oh-0.021
 am-0.020
 kidding-0.019
Token."
Token ID526
Feature activation+7.67e-01

Pos logit contributions
 least+0.014
oh+0.011
 say+0.011
 kidding+0.011
 oh+0.011
Neg logit contributions
les-0.013
lights-0.012
ork-0.012
rily-0.012
adoes-0.012
Token
Token ID198
Feature activation+1.05e+00

Pos logit contributions
 least+0.034
oh+0.027
 oh+0.026
 worst+0.026
 kidding+0.026
Neg logit contributions
les-0.028
lights-0.027
ork-0.026
rily-0.026
adoes-0.025
Token
Token ID198
Feature activation+8.79e-01

Pos logit contributions
 least+0.052
 kidding+0.042
 worst+0.040
 oh+0.040
oh+0.039
Neg logit contributions
les-0.034
ork-0.034
lights-0.032
rily-0.031
lf-0.029
TokenThe
Token ID464
Feature activation-1.59e+00

Pos logit contributions
 least+0.040
 kidding+0.031
 worst+0.030
ses+0.029
 say+0.029
Neg logit contributions
les-0.031
ork-0.030
lights-0.029
lf-0.029
rily-0.029
Token quiet
Token ID5897
Feature activation-3.38e+00

Pos logit contributions
ork+0.077
lights+0.068
les+0.065
rily+0.064
bell+0.062
Neg logit contributions
 least-0.062
 best-0.048
 very-0.048
 most-0.046
 worst-0.044
Token all
Token ID477
Feature activation-2.64e-01

Pos logit contributions
ork+0.182
pron+0.154
ast+0.149
ues+0.147
url+0.147
Neg logit contributions
 very-0.075
 so-0.072
 are-0.071
 welcome-0.071
 best-0.071
Tokenigator
Token ID23823
Feature activation-7.28e-02

Pos logit contributions
lights+0.008
les+0.008
ork+0.008
rily+0.008
adoes+0.008
Neg logit contributions
 least-0.013
oh-0.011
 say-0.010
 oh-0.010
 kidding-0.010
Token did
Token ID750
Feature activation-7.61e-01

Pos logit contributions
les+0.003
lights+0.003
ork+0.003
rily+0.003
adoes+0.003
Neg logit contributions
 least-0.003
oh-0.002
 say-0.002
 oh-0.002
 kidding-0.002
Token not
Token ID407
Feature activation-3.68e-01

Pos logit contributions
lights+0.039
les+0.038
adoes+0.036
ork+0.035
 tant+0.035
Neg logit contributions
 least-0.023
 say-0.020
 ourselves-0.017
 am-0.016
 wait-0.016
Token know
Token ID760
Feature activation-9.59e-02

Pos logit contributions
lights+0.019
les+0.019
ork+0.018
rily+0.018
adoes+0.018
Neg logit contributions
 least-0.011
 say-0.008
oh-0.008
 kidding-0.007
 wait-0.007
Token not
Token ID407
Feature activation-5.93e-01

Pos logit contributions
les+0.000
lights+0.000
ork+0.000
rily+0.000
adoes+0.000
Neg logit contributions
 least-0.000
oh-0.000
 say-0.000
 oh-0.000
 kidding-0.000
Token played
Token ID2826
Feature activation-1.32e+00

Pos logit contributions
lights+0.029
les+0.029
rily+0.028
ork+0.027
adoes+0.027
Neg logit contributions
 least-0.019
 say-0.014
oh-0.013
 kidding-0.013
 am-0.013
Token with
Token ID351
Feature activation-1.79e+00

Pos logit contributions
lights+0.058
les+0.057
rily+0.054
ork+0.053
adoes+0.051
Neg logit contributions
 least-0.044
 oh-0.036
oh-0.036
 am-0.035
 say-0.035
Token for
Token ID329
Feature activation-3.34e-01

Pos logit contributions
lights+0.078
rily+0.077
les+0.076
ork+0.074
adoes+0.073
Neg logit contributions
 least-0.059
 oh-0.044
 very-0.044
 say-0.043
oh-0.043
Token a
Token ID257
Feature activation-2.18e+00

Pos logit contributions
les+0.015
lights+0.015
rily+0.015
ork+0.015
adoes+0.014
Neg logit contributions
 least-0.012
oh-0.009
 say-0.008
 oh-0.008
 kidding-0.008
Token long
Token ID890
Feature activation-3.38e+00

Pos logit contributions
lights+0.096
adoes+0.093
rily+0.090
les+0.090
ocked+0.084
Neg logit contributions
 least-0.072
 very-0.065
 say-0.063
 oh-0.060
 wait-0.060
Token time
Token ID640
Feature activation-7.71e-01

Pos logit contributions
rily+0.149
lights+0.148
ocked+0.133
adoes+0.131
les+0.124
Neg logit contributions
 h-0.097
 say-0.095
 am-0.092
 least-0.092
 wait-0.091
Token.
Token ID13
Feature activation-4.24e-01

Pos logit contributions
les+0.035
lights+0.034
rily+0.033
ork+0.033
adoes+0.033
Neg logit contributions
 least-0.027
oh-0.020
 say-0.020
 kidding-0.019
 oh-0.019
Token It
Token ID632
Feature activation+4.37e-01

Pos logit contributions
les+0.018
lights+0.017
rily+0.016
ork+0.016
adoes+0.015
Neg logit contributions
 least-0.017
oh-0.014
 say-0.013
 oh-0.013
 Mir-0.013
Token is
Token ID318
Feature activation+4.23e-01

Pos logit contributions
 least+0.016
oh+0.012
 oh+0.012
 kidding+0.011
 ourselves+0.011
Neg logit contributions
les-0.020
lights-0.019
ork-0.019
rily-0.018
adoes-0.018
Token a
Token ID257
Feature activation-2.18e-01

Pos logit contributions
 least+0.016
oh+0.013
 say+0.011
 kidding+0.011
 oh+0.011
Neg logit contributions
lights-0.018
les-0.018
ork-0.017
rily-0.017
adoes-0.017
Token.
Token ID13
Feature activation+2.07e-01

Pos logit contributions
les+0.010
lights+0.010
ork+0.009
rily+0.009
adoes+0.009
Neg logit contributions
 least-0.009
oh-0.007
 say-0.007
 oh-0.007
 kidding-0.007
Token
Token ID198
Feature activation+7.04e-01

Pos logit contributions
 least+0.009
oh+0.007
 say+0.007
 oh+0.007
 kidding+0.006
Neg logit contributions
les-0.008
lights-0.008
ork-0.007
rily-0.007
adoes-0.007
Token
Token ID198
Feature activation-1.70e-01

Pos logit contributions
 least+0.031
 kidding+0.024
oh+0.024
 oh+0.024
 say+0.023
Neg logit contributions
ork-0.025
les-0.025
lights-0.024
rily-0.023
adoes-0.022
TokenWhen
Token ID2215
Feature activation+4.07e-01

Pos logit contributions
les+0.008
lights+0.008
rily+0.007
ork+0.007
adoes+0.007
Neg logit contributions
 least-0.006
oh-0.005
 say-0.004
 oh-0.004
 kidding-0.004
Token the
Token ID262
Feature activation-1.47e+00

Pos logit contributions
 least+0.016
oh+0.013
 say+0.012
 kidding+0.012
 oh+0.012
Neg logit contributions
les-0.016
lights-0.016
ork-0.016
rily-0.015
adoes-0.015
Token white
Token ID2330
Feature activation-3.37e+00

Pos logit contributions
ork+0.072
les+0.064
rily+0.062
lights+0.060
aff+0.060
Neg logit contributions
 least-0.055
 best-0.043
 worst-0.041
 most-0.040
 wait-0.040
Token all
Token ID477
Feature activation-4.07e-01

Pos logit contributions
ork+0.167
ides+0.152
isdom+0.151
hell+0.150
lights+0.150
Neg logit contributions
 kay-0.071
 m-0.069
 least-0.069
 very-0.067
 most-0.063
Tokenigator
Token ID23823
Feature activation+2.35e-01

Pos logit contributions
ork+0.015
les+0.015
lights+0.015
rily+0.014
adoes+0.014
Neg logit contributions
 least-0.018
oh-0.014
 say-0.014
 oh-0.013
 kidding-0.013
Token got
Token ID1392
Feature activation-4.04e-02

Pos logit contributions
 least+0.010
oh+0.008
 say+0.007
 oh+0.007
 kidding+0.007
Neg logit contributions
les-0.009
lights-0.009
ork-0.009
rily-0.008
adoes-0.008
Token out
Token ID503
Feature activation+1.11e-01

Pos logit contributions
les+0.002
lights+0.002
ork+0.002
rily+0.002
adoes+0.002
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token of
Token ID286
Feature activation-3.25e-01

Pos logit contributions
 least+0.005
oh+0.004
 say+0.004
 oh+0.003
 kidding+0.003
Neg logit contributions
les-0.004
lights-0.004
ork-0.004
rily-0.004
adoes-0.004
Token Lily
Token ID20037
Feature activation+4.44e-01

Pos logit contributions
les+0.034
lights+0.033
rily+0.032
adoes+0.031
ork+0.030
Neg logit contributions
 least-0.027
oh-0.021
 say-0.021
 Mir-0.020
 oh-0.019
Token picked
Token ID6497
Feature activation-1.18e+00

Pos logit contributions
 least+0.018
oh+0.014
 kidding+0.014
 oh+0.014
 say+0.014
Neg logit contributions
les-0.018
ork-0.016
lights-0.016
rily-0.016
adoes-0.016
Token the
Token ID262
Feature activation+1.01e-01

Pos logit contributions
lights+0.061
ork+0.054
lf+0.054
les+0.052
adoes+0.052
Neg logit contributions
 least-0.039
 say-0.029
oh-0.029
 oh-0.027
 Mir-0.026
Token apples
Token ID22514
Feature activation+3.09e-01

Pos logit contributions
 least+0.005
oh+0.004
 say+0.004
 oh+0.004
 kidding+0.004
Neg logit contributions
les-0.003
lights-0.003
ork-0.003
rily-0.003
adoes-0.003
Token and
Token ID290
Feature activation-6.77e-01

Pos logit contributions
 least+0.014
oh+0.011
 oh+0.011
 kidding+0.011
 say+0.011
Neg logit contributions
les-0.011
lights-0.011
rily-0.011
ork-0.010
adoes-0.010
Token shared
Token ID4888
Feature activation-3.36e+00

Pos logit contributions
lights+0.029
ork+0.028
les+0.028
rily+0.028
adoes+0.027
Neg logit contributions
 least-0.026
 say-0.020
oh-0.020
 oh-0.019
 Mir-0.019
Token them
Token ID606
Feature activation-5.43e-01

Pos logit contributions
lights+0.198
acket+0.176
les+0.174
adoes+0.173
lf+0.173
Neg logit contributions
 very-0.070
 say-0.066
 so-0.062
 herself-0.057
 are-0.057
Token with
Token ID351
Feature activation-6.19e-01

Pos logit contributions
les+0.021
lights+0.021
adoes+0.019
ork+0.019
rily+0.019
Neg logit contributions
 least-0.023
oh-0.018
 say-0.018
 ourselves-0.017
 oh-0.017
Token her
Token ID607
Feature activation-8.94e-01

Pos logit contributions
lights+0.033
les+0.033
rily+0.032
ork+0.031
adoes+0.031
Neg logit contributions
 least-0.017
 say-0.011
oh-0.011
 oh-0.010
 ourselves-0.010
Token friends
Token ID2460
Feature activation-5.00e-01

Pos logit contributions
les+0.041
lights+0.040
ork+0.038
rily+0.037
adoes+0.036
Neg logit contributions
 least-0.033
 say-0.024
 best-0.024
 cle-0.023
 oh-0.023
Token.
Token ID13
Feature activation-1.34e-01

Pos logit contributions
les+0.022
lights+0.022
ork+0.021
adoes+0.021
rily+0.021
Neg logit contributions
 least-0.018
oh-0.014
 say-0.014
 oh-0.013
 kidding-0.013
Token a
Token ID257
Feature activation+3.62e-01

Pos logit contributions
 least+0.013
oh+0.010
 say+0.009
 kidding+0.009
 oh+0.009
Neg logit contributions
les-0.016
lights-0.016
ork-0.015
rily-0.015
adoes-0.015
Token chair
Token ID5118
Feature activation-2.72e-01

Pos logit contributions
 least+0.017
oh+0.014
 kidding+0.014
 say+0.014
 oh+0.013
Neg logit contributions
les-0.012
lights-0.012
ork-0.011
rily-0.011
adoes-0.010
Token.
Token ID13
Feature activation-6.40e-02

Pos logit contributions
les+0.011
lights+0.011
rily+0.011
ork+0.011
adoes+0.010
Neg logit contributions
 least-0.011
oh-0.008
 say-0.008
 oh-0.008
 kidding-0.008
Token She
Token ID1375
Feature activation-6.71e-01

Pos logit contributions
les+0.003
lights+0.003
ork+0.003
rily+0.003
adoes+0.003
Neg logit contributions
 least-0.002
oh-0.002
 say-0.002
 oh-0.002
 kidding-0.002
Token had
Token ID550
Feature activation+9.04e-01

Pos logit contributions
lights+0.031
les+0.030
rily+0.029
adoes+0.029
ork+0.029
Neg logit contributions
 least-0.023
oh-0.019
 say-0.018
 oh-0.018
 Mir-0.017
Token an
Token ID281
Feature activation+1.64e+00

Pos logit contributions
 least+0.030
oh+0.025
 kidding+0.022
 say+0.022
 oh+0.021
Neg logit contributions
ork-0.040
les-0.038
lights-0.038
adoes-0.037
ened-0.037
Token idea
Token ID2126
Feature activation+6.04e-01

Pos logit contributions
 least+0.081
oh+0.077
 am+0.067
 ourselves+0.066
 Mir+0.065
Neg logit contributions
les-0.047
rily-0.047
adoes-0.043
lights-0.042
ened-0.042
Token.
Token ID13
Feature activation+5.16e-01

Pos logit contributions
 least+0.028
oh+0.022
 kidding+0.021
 say+0.020
 ourselves+0.020
Neg logit contributions
les-0.021
lights-0.020
ork-0.020
rily-0.019
adoes-0.018
Token "
Token ID366
Feature activation-2.13e-01

Pos logit contributions
 least+0.022
oh+0.017
 kidding+0.016
 say+0.016
 oh+0.016
Neg logit contributions
les-0.020
lights-0.019
ork-0.019
rily-0.019
adoes-0.018
TokenBen
Token ID11696
Feature activation+2.46e-01

Pos logit contributions
les+0.010
lights+0.010
rily+0.009
ork+0.009
adoes+0.009
Neg logit contributions
 least-0.008
oh-0.006
 say-0.005
 oh-0.005
 kidding-0.005
Token,
Token ID11
Feature activation+5.60e-02

Pos logit contributions
 least+0.009
oh+0.006
 say+0.006
 oh+0.006
 kidding+0.006
Neg logit contributions
les-0.011
lights-0.011
ork-0.011
rily-0.011
adoes-0.010
Token find
Token ID1064
Feature activation+5.52e-01

Pos logit contributions
lights+0.078
les+0.077
rily+0.074
 tant+0.069
ork+0.068
Neg logit contributions
 least-0.042
 say-0.035
 wait-0.031
 am-0.030
oh-0.029
Token a
Token ID257
Feature activation+1.01e-01

Pos logit contributions
 least+0.019
oh+0.016
 kidding+0.013
 say+0.013
 am+0.013
Neg logit contributions
les-0.025
lights-0.024
ork-0.024
adoes-0.023
rily-0.023
Token place
Token ID1295
Feature activation+1.34e+00

Pos logit contributions
 least+0.004
oh+0.004
 say+0.003
 oh+0.003
 kidding+0.003
Neg logit contributions
les-0.004
lights-0.004
ork-0.003
rily-0.003
adoes-0.003
Token to
Token ID284
Feature activation+3.45e-01

Pos logit contributions
 least+0.065
oh+0.056
 kidding+0.052
 Mir+0.050
 elbows+0.050
Neg logit contributions
lights-0.042
les-0.039
ork-0.039
rily-0.036
adoes-0.035
Token play
Token ID711
Feature activation+6.80e-02

Pos logit contributions
 least+0.012
oh+0.010
 kidding+0.009
 oh+0.008
 say+0.008
Neg logit contributions
les-0.015
lights-0.015
ork-0.015
rily-0.014
adoes-0.014
Token at
Token ID379
Feature activation+1.71e+00

Pos logit contributions
 least+0.003
oh+0.002
 say+0.002
 kidding+0.002
 oh+0.002
Neg logit contributions
les-0.003
lights-0.003
ork-0.002
rily-0.002
adoes-0.002
Token sunset
Token ID26428
Feature activation-3.17e-01

Pos logit contributions
 least+0.064
 oh+0.061
 kidding+0.059
oh+0.058
 elbows+0.055
Neg logit contributions
les-0.058
ork-0.057
lights-0.054
lf-0.053
adoes-0.050
Token.
Token ID13
Feature activation-4.65e-01

Pos logit contributions
les+0.013
lights+0.013
rily+0.013
ork+0.013
adoes+0.012
Neg logit contributions
 least-0.012
oh-0.010
 say-0.009
 oh-0.009
 am-0.009
Token He
Token ID679
Feature activation-4.45e-01

Pos logit contributions
les+0.020
lights+0.020
rily+0.019
adoes+0.019
ork+0.019
Neg logit contributions
 least-0.018
oh-0.014
 say-0.014
 am-0.013
 Mir-0.013
Token walked
Token ID6807
Feature activation+3.77e-01

Pos logit contributions
lights+0.019
les+0.019
rily+0.018
ork+0.018
adoes+0.018
Neg logit contributions
 least-0.017
oh-0.013
 say-0.013
 oh-0.012
 am-0.012
Token and
Token ID290
Feature activation-2.27e-01

Pos logit contributions
 least+0.018
oh+0.015
 kidding+0.014
 oh+0.014
 say+0.014
Neg logit contributions
les-0.013
lights-0.012
rily-0.012
ork-0.012
adoes-0.011
Token,
Token ID11
Feature activation-5.92e-01

Pos logit contributions
lights+0.048
les+0.047
rily+0.046
adoes+0.046
ork+0.045
Neg logit contributions
 least-0.022
oh-0.018
 say-0.017
 am-0.015
 oh-0.015
Token Mama
Token ID40084
Feature activation+4.70e-01

Pos logit contributions
les+0.023
lights+0.023
ork+0.021
adoes+0.021
rily+0.021
Neg logit contributions
 least-0.025
oh-0.021
 say-0.020
 oh-0.019
 am-0.019
Token said
Token ID531
Feature activation+5.35e-01

Pos logit contributions
 least+0.020
oh+0.016
 oh+0.016
 kidding+0.015
 say+0.015
Neg logit contributions
les-0.018
lights-0.017
ork-0.016
rily-0.016
adoes-0.016
Token it
Token ID340
Feature activation-3.70e-01

Pos logit contributions
 least+0.021
oh+0.015
 say+0.014
 kidding+0.014
 oh+0.014
Neg logit contributions
les-0.023
ork-0.023
lights-0.022
rily-0.022
adoes-0.021
Token was
Token ID373
Feature activation+1.35e+00

Pos logit contributions
les+0.017
lights+0.017
rily+0.016
ork+0.016
adoes+0.016
Neg logit contributions
 least-0.013
oh-0.010
 say-0.009
 oh-0.009
 am-0.009
Token time
Token ID640
Feature activation+1.71e+00

Pos logit contributions
 least+0.083
oh+0.075
 ourselves+0.071
 am+0.068
 kidding+0.067
Neg logit contributions
lights-0.021
les-0.020
ork-0.019
rily-0.018
pron-0.016
Token to
Token ID284
Feature activation-3.64e-01

Pos logit contributions
 least+0.088
oh+0.072
 worst+0.069
 cle+0.068
 kidding+0.067
Neg logit contributions
ork-0.046
les-0.045
rily-0.044
lights-0.044
bell-0.039
Token go
Token ID467
Feature activation+6.69e-01

Pos logit contributions
les+0.017
lights+0.017
ork+0.016
rily+0.016
adoes+0.016
Neg logit contributions
 least-0.012
 say-0.009
oh-0.009
 kidding-0.008
 am-0.008
Token to
Token ID284
Feature activation+7.69e-01

Pos logit contributions
 least+0.027
oh+0.023
 oh+0.021
 kidding+0.021
 Mir+0.019
Neg logit contributions
les-0.027
lights-0.026
ork-0.025
rily-0.024
adoes-0.023
Token the
Token ID262
Feature activation+1.11e+00

Pos logit contributions
 least+0.028
oh+0.024
 oh+0.022
 kidding+0.021
 worst+0.020
Neg logit contributions
les-0.031
ork-0.030
lights-0.030
rily-0.029
adoes-0.028
Token store
Token ID3650
Feature activation+7.70e-01

Pos logit contributions
 least+0.047
oh+0.040
 kidding+0.039
 oh+0.038
 elbows+0.037
Neg logit contributions
ork-0.037
les-0.036
lights-0.035
rily-0.034
adoes-0.033
Token a
Token ID257
Feature activation-3.78e-01

Pos logit contributions
 least+0.033
oh+0.028
 oh+0.025
 kidding+0.025
 say+0.023
Neg logit contributions
les-0.037
lights-0.036
ork-0.035
rily-0.033
adoes-0.033
Token lot
Token ID1256
Feature activation-1.69e-01

Pos logit contributions
les+0.013
lights+0.013
rily+0.013
adoes+0.013
ork+0.012
Neg logit contributions
 least-0.017
 say-0.014
oh-0.014
 am-0.013
 kidding-0.013
Token of
Token ID286
Feature activation+2.96e-02

Pos logit contributions
les+0.005
lights+0.005
rily+0.005
ork+0.005
adoes+0.005
Neg logit contributions
 least-0.009
oh-0.007
 say-0.007
 kidding-0.007
 am-0.007
Token fun
Token ID1257
Feature activation-7.65e-01

Pos logit contributions
 least+0.001
oh+0.001
 say+0.001
 oh+0.001
 kidding+0.001
Neg logit contributions
les-0.001
lights-0.001
ork-0.001
rily-0.001
adoes-0.001
Token.
Token ID13
Feature activation-9.50e-02

Pos logit contributions
les+0.035
rily+0.035
lights+0.034
ork+0.034
adoes+0.033
Neg logit contributions
 least-0.025
 say-0.019
oh-0.018
 ourselves-0.018
 oh-0.018
Token They
Token ID1119
Feature activation+1.60e+00

Pos logit contributions
les+0.004
lights+0.004
rily+0.004
ork+0.004
adoes+0.004
Neg logit contributions
 least-0.004
oh-0.003
 say-0.003
 oh-0.003
 kidding-0.003
Token are
Token ID389
Feature activation+2.64e-01

Pos logit contributions
 least+0.044
 oh+0.043
oh+0.038
 kidding+0.035
 elbows+0.034
Neg logit contributions
les-0.075
ork-0.067
lights-0.066
 tant-0.063
bell-0.063
Token good
Token ID922
Feature activation-1.35e-01

Pos logit contributions
 least+0.009
oh+0.007
 say+0.007
 oh+0.006
 kidding+0.006
Neg logit contributions
les-0.012
lights-0.012
rily-0.011
ork-0.011
adoes-0.011
Token friends
Token ID2460
Feature activation-5.95e-01

Pos logit contributions
les+0.005
lights+0.005
ork+0.005
rily+0.005
adoes+0.005
Neg logit contributions
 least-0.006
oh-0.004
 say-0.004
 oh-0.004
 kidding-0.004
Token.
Token ID13
Feature activation+3.25e-01

Pos logit contributions
les+0.027
lights+0.026
ork+0.026
adoes+0.026
rily+0.025
Neg logit contributions
 least-0.021
oh-0.016
 say-0.016
 oh-0.015
 am-0.015
Token<|endoftext|>
Token ID50256
Feature activation-2.50e-03

Pos logit contributions
 least+0.014
oh+0.011
 oh+0.010
 kidding+0.010
 say+0.010
Neg logit contributions
lights-0.012
les-0.012
ork-0.012
rily-0.011
adoes-0.011
Token They
Token ID1119
Feature activation+8.41e-01

Pos logit contributions
les+0.012
lights+0.011
ork+0.011
rily+0.011
adoes+0.011
Neg logit contributions
 least-0.011
oh-0.009
 say-0.008
 kidding-0.008
 oh-0.008
Token had
Token ID550
Feature activation+7.72e-01

Pos logit contributions
 least+0.033
oh+0.027
 kidding+0.026
 oh+0.026
 Mir+0.025
Neg logit contributions
les-0.034
lights-0.031
ork-0.030
rily-0.030
ened-0.029
Token made
Token ID925
Feature activation+5.15e-01

Pos logit contributions
 least+0.028
oh+0.022
 say+0.021
 oh+0.020
 kidding+0.020
Neg logit contributions
les-0.033
ork-0.031
lights-0.031
rily-0.030
ened-0.030
Token it
Token ID340
Feature activation+5.79e-01

Pos logit contributions
 least+0.020
oh+0.016
 say+0.014
 kidding+0.014
 am+0.014
Neg logit contributions
les-0.022
lights-0.021
ork-0.021
rily-0.021
adoes-0.020
Token to
Token ID284
Feature activation-5.84e-02

Pos logit contributions
 least+0.023
oh+0.018
 say+0.017
 kidding+0.017
 oh+0.017
Neg logit contributions
lights-0.024
les-0.024
ork-0.023
rily-0.022
adoes-0.022
Token the
Token ID262
Feature activation+1.49e+00

Pos logit contributions
les+0.003
lights+0.003
ork+0.003
rily+0.003
adoes+0.003
Neg logit contributions
 least-0.002
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token other
Token ID584
Feature activation+1.30e+00

Pos logit contributions
 least+0.052
oh+0.049
 elbows+0.046
 oh+0.045
 say+0.044
Neg logit contributions
les-0.054
ork-0.052
lights-0.050
adoes-0.048
rily-0.045
Token side
Token ID1735
Feature activation+2.45e-01

Pos logit contributions
 least+0.054
oh+0.048
 oh+0.045
 kidding+0.043
 elbows+0.043
Neg logit contributions
les-0.046
lights-0.045
adoes-0.039
ork-0.039
rily-0.037
Token of
Token ID286
Feature activation-4.05e-01

Pos logit contributions
 least+0.011
oh+0.009
 say+0.009
 oh+0.009
 kidding+0.009
Neg logit contributions
les-0.009
lights-0.008
ork-0.008
rily-0.008
adoes-0.008
Token their
Token ID511
Feature activation+1.02e-01

Pos logit contributions
les+0.021
lights+0.021
ork+0.020
rily+0.020
adoes+0.019
Neg logit contributions
 least-0.013
 say-0.008
oh-0.008
 oh-0.008
 kidding-0.008
Token journey
Token ID7002
Feature activation+3.24e-01

Pos logit contributions
 least+0.004
oh+0.003
 say+0.003
 oh+0.003
 kidding+0.003
Neg logit contributions
les-0.004
lights-0.004
ork-0.004
rily-0.004
adoes-0.004
Token her
Token ID607
Feature activation-1.76e-01

Pos logit contributions
lights+0.021
les+0.020
ork+0.020
rily+0.020
adoes+0.019
Neg logit contributions
 least-0.014
 say-0.010
oh-0.010
 oh-0.010
 kidding-0.009
Token mom
Token ID1995
Feature activation-1.05e-01

Pos logit contributions
les+0.007
lights+0.007
ork+0.006
rily+0.006
adoes+0.006
Neg logit contributions
 least-0.007
oh-0.006
 say-0.006
 oh-0.006
 kidding-0.005
Token's
Token ID338
Feature activation-5.05e-01

Pos logit contributions
les+0.004
lights+0.004
ork+0.004
rily+0.004
adoes+0.004
Neg logit contributions
 least-0.004
oh-0.003
 say-0.003
 oh-0.003
 kidding-0.003
Token message
Token ID3275
Feature activation-8.97e-01

Pos logit contributions
les+0.021
lights+0.021
rily+0.020
ork+0.020
adoes+0.019
Neg logit contributions
 least-0.020
 say-0.015
 oh-0.015
oh-0.015
 am-0.014
Token.
Token ID13
Feature activation-8.25e-01

Pos logit contributions
lights+0.039
les+0.039
rily+0.038
ork+0.037
adoes+0.037
Neg logit contributions
 least-0.032
 say-0.024
oh-0.024
 oh-0.023
 am-0.022
Token\n
Token ID198
Feature activation-1.82e-01

Pos logit contributions
les+0.040
lights+0.039
ork+0.036
rily+0.036
adoes+0.036
Neg logit contributions
 least-0.028
oh-0.020
 say-0.020
 Mir-0.019
 oh-0.019
Token\n
Token ID198
Feature activation-6.04e-01

Pos logit contributions
les+0.008
lights+0.008
rily+0.008
adoes+0.008
ork+0.008
Neg logit contributions
 least-0.006
oh-0.005
 say-0.005
 oh-0.005
 am-0.004
TokenLater
Token ID18602
Feature activation+2.66e-02

Pos logit contributions
lights+0.025
les+0.025
rily+0.023
adoes+0.023
ened+0.022
Neg logit contributions
 least-0.024
oh-0.020
 say-0.019
 oh-0.019
 am-0.018
Token that
Token ID326
Feature activation+8.23e-02

Pos logit contributions
 least+0.001
oh+0.001
 say+0.001
 oh+0.001
 kidding+0.001
Neg logit contributions
les-0.001
lights-0.001
ork-0.001
rily-0.001
adoes-0.001
Token day
Token ID1110
Feature activation-4.89e-01

Pos logit contributions
 least+0.003
oh+0.002
 say+0.002
 oh+0.002
 kidding+0.002
Neg logit contributions
les-0.004
lights-0.003
ork-0.003
rily-0.003
adoes-0.003
Token,
Token ID11
Feature activation-9.85e-01

Pos logit contributions
lights+0.026
les+0.025
rily+0.025
adoes+0.025
ork+0.024
Neg logit contributions
 least-0.013
oh-0.010
 say-0.009
 am-0.009
 oh-0.009
Token dirt
Token ID13647
Feature activation-6.66e-01

Pos logit contributions
 least+0.026
oh+0.022
 oh+0.021
 kidding+0.021
 say+0.021
Neg logit contributions
les-0.021
lights-0.020
ork-0.020
rily-0.018
adoes-0.018
Token.
Token ID13
Feature activation-7.80e-01

Pos logit contributions
les+0.028
lights+0.028
ork+0.027
adoes+0.027
rily+0.026
Neg logit contributions
 least-0.025
oh-0.019
 say-0.018
 oh-0.017
 am-0.017
Token They
Token ID1119
Feature activation+4.53e-01

Pos logit contributions
les+0.032
lights+0.030
rily+0.028
adoes+0.028
ork+0.027
Neg logit contributions
 least-0.031
oh-0.026
 Mir-0.025
 say-0.025
 oh-0.025
Token dug
Token ID18735
Feature activation+4.45e-01

Pos logit contributions
 least+0.018
oh+0.014
 kidding+0.013
 oh+0.013
 say+0.013
Neg logit contributions
les-0.019
lights-0.018
rily-0.017
ork-0.017
adoes-0.017
Token it
Token ID340
Feature activation+5.73e-01

Pos logit contributions
 least+0.018
oh+0.014
 say+0.013
 kidding+0.013
 oh+0.013
Neg logit contributions
les-0.018
lights-0.018
ork-0.018
rily-0.017
adoes-0.017
Token out
Token ID503
Feature activation+2.34e-01

Pos logit contributions
 least+0.027
oh+0.022
 kidding+0.021
 oh+0.021
 say+0.021
Neg logit contributions
les-0.020
lights-0.020
rily-0.019
ork-0.019
adoes-0.018
Token and
Token ID290
Feature activation+3.22e-01

Pos logit contributions
 least+0.011
oh+0.009
 say+0.008
 oh+0.008
 kidding+0.008
Neg logit contributions
les-0.008
lights-0.008
rily-0.008
ork-0.008
adoes-0.008
Token saw
Token ID2497
Feature activation+9.14e-01

Pos logit contributions
 least+0.014
oh+0.011
 kidding+0.010
 say+0.010
 oh+0.010
Neg logit contributions
les-0.012
lights-0.012
rily-0.012
ork-0.012
adoes-0.011
Token that
Token ID326
Feature activation-1.73e-01

Pos logit contributions
 least+0.037
oh+0.031
 kidding+0.028
 say+0.027
 am+0.026
Neg logit contributions
lights-0.037
les-0.036
ork-0.034
rily-0.034
adoes-0.034
Token it
Token ID340
Feature activation+1.56e-01

Pos logit contributions
les+0.006
lights+0.006
ork+0.006
rily+0.006
adoes+0.006
Neg logit contributions
 least-0.008
oh-0.006
 say-0.006
 oh-0.006
 kidding-0.006
Token was
Token ID373
Feature activation+7.14e-01

Pos logit contributions
 least+0.006
oh+0.005
 say+0.005
 kidding+0.005
 oh+0.005
Neg logit contributions
les-0.006
lights-0.006
ork-0.006
rily-0.006
adoes-0.006
Token away
Token ID1497
Feature activation+1.19e-01

Pos logit contributions
les+0.008
lights+0.008
rily+0.008
ork+0.008
adoes+0.007
Neg logit contributions
 least-0.008
oh-0.006
 say-0.006
 oh-0.006
 kidding-0.006
Token and
Token ID290
Feature activation+4.70e-01

Pos logit contributions
 least+0.005
oh+0.004
 say+0.004
 oh+0.004
 kidding+0.004
Neg logit contributions
les-0.005
lights-0.005
ork-0.004
rily-0.004
adoes-0.004
Token hid
Token ID24519
Feature activation+1.26e-01

Pos logit contributions
 least+0.019
oh+0.015
 kidding+0.014
 say+0.014
 ourselves+0.014
Neg logit contributions
les-0.018
lights-0.018
ork-0.017
adoes-0.017
rily-0.017
Token in
Token ID287
Feature activation-8.89e-02

Pos logit contributions
 least+0.005
oh+0.004
 say+0.004
 kidding+0.004
 oh+0.004
Neg logit contributions
les-0.005
lights-0.005
rily-0.005
ork-0.005
adoes-0.005
Token a
Token ID257
Feature activation-3.61e-01

Pos logit contributions
les+0.005
lights+0.004
ork+0.004
rily+0.004
adoes+0.004
Neg logit contributions
 least-0.003
oh-0.002
 say-0.002
 oh-0.002
 kidding-0.002
Token shelter
Token ID11772
Feature activation+2.17e-01

Pos logit contributions
les+0.015
lights+0.015
rily+0.014
adoes+0.014
ork+0.014
Neg logit contributions
 least-0.014
oh-0.011
 say-0.011
 oh-0.010
 am-0.010
Token.
Token ID13
Feature activation+1.31e-01

Pos logit contributions
 least+0.009
oh+0.007
 say+0.007
 kidding+0.007
 oh+0.007
Neg logit contributions
les-0.008
lights-0.008
ork-0.008
rily-0.008
adoes-0.008
Token\n
Token ID198
Feature activation+1.09e-02

Pos logit contributions
 least+0.005
oh+0.004
 say+0.004
 oh+0.004
 kidding+0.004
Neg logit contributions
les-0.005
lights-0.005
ork-0.005
rily-0.005
adoes-0.005
Token\n
Token ID198
Feature activation-8.59e-01

Pos logit contributions
 least+0.000
oh+0.000
 say+0.000
 oh+0.000
 kidding+0.000
Neg logit contributions
les-0.000
lights-0.000
ork-0.000
rily-0.000
adoes-0.000
TokenThe
Token ID464
Feature activation-1.15e+00

Pos logit contributions
lights+0.042
les+0.042
rily+0.041
ened+0.039
adoes+0.038
Neg logit contributions
 least-0.027
oh-0.024
 oh-0.021
 say-0.020
 am-0.020
Token little
Token ID1310
Feature activation-2.89e-01

Pos logit contributions
rily+0.043
lights+0.043
les+0.043
ened+0.040
ork+0.040
Neg logit contributions
 least-0.049
 say-0.039
 wait-0.039
 oh-0.038
 best-0.037
Token playing
Token ID2712
Feature activation-1.20e-01

Pos logit contributions
 least+0.030
oh+0.024
 am+0.022
 kidding+0.022
 cle+0.021
Neg logit contributions
les-0.035
lights-0.034
adoes-0.033
rily-0.032
ork-0.031
Token.
Token ID13
Feature activation-4.27e-01

Pos logit contributions
les+0.005
lights+0.005
ork+0.005
rily+0.005
adoes+0.005
Neg logit contributions
 least-0.005
oh-0.004
 say-0.003
 oh-0.003
 kidding-0.003
Token\n
Token ID198
Feature activation-1.85e-01

Pos logit contributions
les+0.021
lights+0.021
rily+0.020
ork+0.019
adoes+0.019
Neg logit contributions
 least-0.014
oh-0.011
 say-0.010
 oh-0.009
 am-0.009
Token\n
Token ID198
Feature activation-7.71e-01

Pos logit contributions
les+0.008
lights+0.008
rily+0.008
ork+0.008
adoes+0.008
Neg logit contributions
 least-0.007
oh-0.005
 say-0.005
 oh-0.005
 am-0.005
Token"
Token ID1
Feature activation-2.74e-01

Pos logit contributions
lights+0.036
rily+0.035
les+0.035
ened+0.033
adoes+0.033
Neg logit contributions
 least-0.024
oh-0.023
 oh-0.020
 say-0.019
 am-0.019
TokenI
Token ID40
Feature activation+2.36e-01

Pos logit contributions
lights+0.013
les+0.013
rily+0.012
ork+0.012
adoes+0.012
Neg logit contributions
 least-0.010
oh-0.008
 oh-0.007
 say-0.007
 am-0.007
Token don
Token ID836
Feature activation-3.35e-01

Pos logit contributions
 least+0.007
oh+0.005
 oh+0.005
 say+0.005
 kidding+0.005
Neg logit contributions
les-0.012
lights-0.011
ork-0.011
rily-0.011
adoes-0.011
Token't
Token ID470
Feature activation+6.27e-01

Pos logit contributions
les+0.013
lights+0.013
rily+0.012
ork+0.012
adoes+0.012
Neg logit contributions
 least-0.014
oh-0.011
 say-0.011
 oh-0.010
 am-0.010
Token want
Token ID765
Feature activation+1.27e-01

Pos logit contributions
 least+0.021
oh+0.016
 Mir+0.014
 oh+0.014
 ourselves+0.014
Neg logit contributions
les-0.028
ork-0.027
lights-0.026
adoes-0.025
rily-0.025
Token to
Token ID284
Feature activation+3.05e-01

Pos logit contributions
 least+0.005
oh+0.004
 say+0.004
 kidding+0.004
 oh+0.004
Neg logit contributions
les-0.005
lights-0.005
ork-0.005
rily-0.005
adoes-0.005
Token come
Token ID1282
Feature activation+1.04e-01

Pos logit contributions
 least+0.012
oh+0.009
 oh+0.009
 kidding+0.009
 say+0.008
Neg logit contributions
les-0.013
lights-0.012
ork-0.012
adoes-0.012
rily-0.012
Token,
Token ID11
Feature activation-6.65e-01

Pos logit contributions
 least+0.002
oh+0.001
 say+0.001
 oh+0.001
 kidding+0.001
Neg logit contributions
les-0.002
lights-0.002
ork-0.002
rily-0.002
adoes-0.002
Token "
Token ID366
Feature activation+6.60e-02

Pos logit contributions
lights+0.029
les+0.029
ork+0.027
rily+0.026
adoes+0.026
Neg logit contributions
 least-0.025
oh-0.020
 say-0.019
 oh-0.019
 kidding-0.018
TokenIt
Token ID1026
Feature activation-4.82e-01

Pos logit contributions
 least+0.002
oh+0.002
 say+0.002
 oh+0.002
 kidding+0.002
Neg logit contributions
les-0.003
lights-0.003
ork-0.003
rily-0.003
adoes-0.003
Token sounds
Token ID5238
Feature activation+8.55e-01

Pos logit contributions
lights+0.024
les+0.024
ork+0.023
rily+0.023
adoes+0.022
Neg logit contributions
 least-0.015
oh-0.012
 say-0.010
 am-0.010
 oh-0.010
Token like
Token ID588
Feature activation+8.14e-01

Pos logit contributions
 least+0.027
oh+0.021
 Mir+0.018
 kidding+0.018
 oh+0.018
Neg logit contributions
les-0.042
lights-0.041
ork-0.039
adoes-0.038
rily-0.038
Token it
Token ID340
Feature activation+3.90e-01

Pos logit contributions
 least+0.027
oh+0.022
 say+0.019
 kidding+0.019
 am+0.018
Neg logit contributions
lights-0.036
les-0.036
ork-0.035
adoes-0.035
rily-0.034
Token is
Token ID318
Feature activation-4.84e-02

Pos logit contributions
 least+0.014
oh+0.011
 say+0.010
 oh+0.010
 kidding+0.010
Neg logit contributions
les-0.017
lights-0.017
ork-0.016
rily-0.016
adoes-0.016
Token coming
Token ID2406
Feature activation+6.85e-01

Pos logit contributions
les+0.002
lights+0.002
ork+0.002
rily+0.002
adoes+0.002
Neg logit contributions
 least-0.002
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token from
Token ID422
Feature activation+7.01e-01

Pos logit contributions
 least+0.028
oh+0.021
 kidding+0.021
 oh+0.021
 say+0.020
Neg logit contributions
les-0.027
lights-0.026
ork-0.026
rily-0.026
adoes-0.025
Token the
Token ID262
Feature activation+5.58e-01

Pos logit contributions
 least+0.024
oh+0.020
 say+0.019
 am+0.018
 kidding+0.018
Neg logit contributions
les-0.030
lights-0.029
ork-0.029
adoes-0.029
rily-0.028
Token river
Token ID7850
Feature activation-4.47e-01

Pos logit contributions
 least+0.024
oh+0.020
 oh+0.019
 say+0.019
 kidding+0.019
Neg logit contributions
les-0.020
lights-0.020
ork-0.019
adoes-0.018
rily-0.018
Token only
Token ID691
Feature activation+1.21e-01

Pos logit contributions
les+0.035
lights+0.035
ork+0.034
rily+0.033
adoes+0.032
Neg logit contributions
 least-0.023
 say-0.016
oh-0.016
 oh-0.016
 kidding-0.015
Token nice
Token ID3621
Feature activation-2.23e-01

Pos logit contributions
 least+0.004
oh+0.003
 say+0.003
 oh+0.003
 kidding+0.003
Neg logit contributions
les-0.006
lights-0.005
rily-0.005
ork-0.005
adoes-0.005
Token and
Token ID290
Feature activation-6.21e-03

Pos logit contributions
les+0.009
lights+0.009
ork+0.008
rily+0.008
adoes+0.008
Neg logit contributions
 least-0.009
oh-0.007
 say-0.007
 oh-0.007
 kidding-0.007
Token was
Token ID373
Feature activation-4.15e-01

Pos logit contributions
les+0.000
lights+0.000
ork+0.000
rily+0.000
adoes+0.000
Neg logit contributions
 least-0.000
oh-0.000
 say-0.000
 oh-0.000
 kidding-0.000
Token not
Token ID407
Feature activation-4.57e-01

Pos logit contributions
les+0.020
lights+0.020
ork+0.019
rily+0.019
adoes+0.018
Neg logit contributions
 least-0.014
oh-0.010
 say-0.010
 oh-0.010
 kidding-0.009
Token going
Token ID1016
Feature activation-3.69e-01

Pos logit contributions
les+0.024
lights+0.023
ork+0.023
rily+0.022
adoes+0.022
Neg logit contributions
 least-0.014
oh-0.009
 say-0.009
 oh-0.009
 kidding-0.009
Token to
Token ID284
Feature activation+4.47e-01

Pos logit contributions
les+0.016
lights+0.016
rily+0.015
ork+0.015
adoes+0.015
Neg logit contributions
 least-0.014
oh-0.011
 say-0.010
 oh-0.010
 kidding-0.010
Token scare
Token ID19437
Feature activation+1.11e-01

Pos logit contributions
 least+0.016
oh+0.013
 oh+0.012
 kidding+0.012
 say+0.012
Neg logit contributions
les-0.019
lights-0.018
ork-0.018
adoes-0.018
rily-0.018
Token the
Token ID262
Feature activation-7.27e-01

Pos logit contributions
 least+0.004
oh+0.003
 say+0.003
 oh+0.003
 kidding+0.003
Neg logit contributions
les-0.005
lights-0.005
ork-0.004
rily-0.004
adoes-0.004
Token town
Token ID3240
Feature activation-4.07e-01

Pos logit contributions
lights+0.026
les+0.026
rily+0.026
ork+0.025
adoes+0.024
Neg logit contributions
 least-0.034
oh-0.026
 say-0.026
 oh-0.025
 worst-0.025
Token.
Token ID13
Feature activation+3.64e-01

Pos logit contributions
les+0.016
lights+0.016
ork+0.015
rily+0.015
adoes+0.015
Neg logit contributions
 least-0.017
oh-0.013
 say-0.013
 oh-0.012
 kidding-0.012
Token them
Token ID606
Feature activation-5.21e-01

Pos logit contributions
lights+0.019
les+0.019
ork+0.019
rily+0.018
adoes+0.017
Neg logit contributions
 least-0.022
oh-0.018
 say-0.017
 oh-0.017
 kidding-0.016
Token and
Token ID290
Feature activation-3.17e-01

Pos logit contributions
lights+0.021
les+0.021
ork+0.019
rily+0.019
adoes+0.019
Neg logit contributions
 least-0.021
oh-0.017
 say-0.016
 oh-0.015
 kidding-0.015
Token said
Token ID531
Feature activation-1.70e-01

Pos logit contributions
les+0.014
lights+0.014
rily+0.013
ork+0.013
adoes+0.013
Neg logit contributions
 least-0.012
 say-0.009
oh-0.009
 oh-0.009
 kidding-0.008
Token,
Token ID11
Feature activation-2.47e-01

Pos logit contributions
les+0.008
lights+0.008
rily+0.008
ork+0.008
adoes+0.008
Neg logit contributions
 least-0.005
oh-0.004
 say-0.004
 oh-0.004
 kidding-0.004
Token "
Token ID366
Feature activation+1.23e-01

Pos logit contributions
les+0.010
lights+0.010
rily+0.009
ork+0.009
adoes+0.009
Neg logit contributions
 least-0.010
oh-0.008
 say-0.008
 oh-0.008
 kidding-0.008
TokenYou
Token ID1639
Feature activation-5.60e-01

Pos logit contributions
 least+0.004
oh+0.003
 say+0.002
 kidding+0.002
 oh+0.002
Neg logit contributions
les-0.006
lights-0.006
ork-0.006
rily-0.006
adoes-0.006
Token naughty
Token ID45128
Feature activation-5.92e-01

Pos logit contributions
lights+0.031
les+0.030
rily+0.030
ork+0.029
adoes+0.028
Neg logit contributions
 least-0.017
 say-0.011
oh-0.011
 am-0.010
 kidding-0.010
Token boys
Token ID6510
Feature activation+8.69e-01

Pos logit contributions
lights+0.027
rily+0.026
les+0.026
ork+0.025
adoes+0.024
Neg logit contributions
 least-0.021
oh-0.016
 say-0.016
 oh-0.015
 am-0.015
Token!
Token ID0
Feature activation-1.15e-01

Pos logit contributions
 least+0.025
oh+0.017
 oh+0.017
 Mir+0.016
 kidding+0.016
Neg logit contributions
les-0.045
ork-0.042
lights-0.041
adoes-0.040
rily-0.040
Token How
Token ID1374
Feature activation-2.55e-01

Pos logit contributions
les+0.006
lights+0.006
ork+0.006
rily+0.006
adoes+0.006
Neg logit contributions
 least-0.003
oh-0.002
 say-0.002
 oh-0.002
 kidding-0.002
Token dare
Token ID16498
Feature activation+2.76e-01

Pos logit contributions
les+0.012
lights+0.012
rily+0.012
adoes+0.011
ork+0.011
Neg logit contributions
 least-0.009
oh-0.006
 say-0.006
 am-0.006
 oh-0.006
Token try
Token ID1949
Feature activation-1.18e+00

Pos logit contributions
lights+0.035
les+0.034
rily+0.034
ork+0.032
adoes+0.032
Neg logit contributions
 least-0.024
 say-0.020
 am-0.017
 wait-0.017
oh-0.017
Token new
Token ID649
Feature activation-7.38e-01

Pos logit contributions
lights+0.057
rily+0.054
les+0.053
ork+0.052
adoes+0.051
Neg logit contributions
 least-0.037
 say-0.030
 doing-0.028
 very-0.028
 saying-0.028
Token things
Token ID1243
Feature activation+4.39e-01

Pos logit contributions
lights+0.029
ork+0.029
rily+0.028
les+0.028
bell+0.027
Neg logit contributions
 least-0.029
 say-0.024
 ourselves-0.023
 oh-0.023
 am-0.022
Token.
Token ID13
Feature activation-2.63e-01

Pos logit contributions
 least+0.019
oh+0.016
 kidding+0.015
 oh+0.015
 say+0.014
Neg logit contributions
les-0.016
lights-0.016
rily-0.015
ork-0.015
adoes-0.014
Token\n
Token ID198
Feature activation-5.51e-01

Pos logit contributions
les+0.012
lights+0.012
ork+0.011
rily+0.011
adoes+0.011
Neg logit contributions
 least-0.010
oh-0.007
 say-0.007
 oh-0.007
 kidding-0.007
Token\n
Token ID198
Feature activation-6.35e-01

Pos logit contributions
lights+0.026
les+0.026
adoes+0.025
rily+0.025
ork+0.024
Neg logit contributions
 least-0.018
oh-0.014
 say-0.013
 am-0.013
 oh-0.012
TokenRos
Token ID35740
Feature activation-9.55e-01

Pos logit contributions
lights+0.025
les+0.023
rily+0.023
adoes+0.022
ork+0.022
Neg logit contributions
 least-0.027
oh-0.022
 say-0.021
 oh-0.020
 am-0.020
Tokenie
Token ID494
Feature activation-4.12e-01

Pos logit contributions
lights+0.026
ork+0.026
adoes+0.025
rily+0.024
les+0.023
Neg logit contributions
 least-0.051
oh-0.042
 say-0.040
 ourselves-0.037
 am-0.037
Token asked
Token ID1965
Feature activation+9.43e-02

Pos logit contributions
lights+0.018
les+0.017
rily+0.017
ork+0.017
adoes+0.017
Neg logit contributions
 least-0.016
oh-0.012
 say-0.012
 oh-0.011
 kidding-0.011
Token Benny
Token ID44275
Feature activation-2.35e-01

Pos logit contributions
 least+0.004
oh+0.003
 say+0.003
 kidding+0.003
 oh+0.003
Neg logit contributions
les-0.004
lights-0.004
ork-0.004
rily-0.004
adoes-0.004
Token if
Token ID611
Feature activation+2.93e-01

Pos logit contributions
lights+0.011
les+0.011
ork+0.011
rily+0.011
adoes+0.011
Neg logit contributions
 least-0.008
oh-0.006
 say-0.006
 oh-0.005
 kidding-0.005
Token smiled
Token ID13541
Feature activation-5.33e-01

Pos logit contributions
 least+0.014
oh+0.011
 kidding+0.010
 oh+0.010
 am+0.010
Neg logit contributions
les-0.014
lights-0.013
ork-0.012
rily-0.012
adoes-0.012
Token.
Token ID13
Feature activation-6.13e-01

Pos logit contributions
les+0.023
ork+0.023
lights+0.023
rily+0.022
adoes+0.022
Neg logit contributions
 least-0.019
 say-0.015
oh-0.014
 ourselves-0.013
 oh-0.013
Token They
Token ID1119
Feature activation+5.91e-01

Pos logit contributions
les+0.028
rily+0.027
lights+0.026
ork+0.025
adoes+0.025
Neg logit contributions
 least-0.022
 say-0.017
 Mir-0.016
oh-0.016
 kidding-0.016
Token took
Token ID1718
Feature activation-5.73e-01

Pos logit contributions
 least+0.023
oh+0.019
 oh+0.017
 kidding+0.017
 ourselves+0.017
Neg logit contributions
les-0.025
lights-0.023
ork-0.022
rily-0.022
adoes-0.021
Token their
Token ID511
Feature activation-5.78e-01

Pos logit contributions
lights+0.028
les+0.027
ork+0.027
rily+0.027
lf+0.026
Neg logit contributions
 least-0.019
 say-0.013
oh-0.013
 oh-0.012
 ourselves-0.012
Token new
Token ID649
Feature activation-8.33e-01

Pos logit contributions
les+0.023
rily+0.022
lights+0.022
ork+0.022
adoes+0.021
Neg logit contributions
 least-0.024
 say-0.019
oh-0.018
 am-0.017
 ourselves-0.017
Token toys
Token ID14958
Feature activation+1.51e-01

Pos logit contributions
les+0.034
lights+0.034
ork+0.033
rily+0.033
adoes+0.032
Neg logit contributions
 least-0.032
 say-0.026
 oh-0.024
 am-0.023
 kidding-0.023
Token to
Token ID284
Feature activation+2.02e-02

Pos logit contributions
 least+0.007
oh+0.005
 say+0.005
 oh+0.005
 kidding+0.005
Neg logit contributions
les-0.005
lights-0.005
ork-0.005
rily-0.005
adoes-0.005
Token the
Token ID262
Feature activation+3.34e-01

Pos logit contributions
 least+0.001
oh+0.001
 say+0.000
 oh+0.000
 kidding+0.000
Neg logit contributions
les-0.001
lights-0.001
ork-0.001
rily-0.001
adoes-0.001
Token water
Token ID1660
Feature activation-2.52e-01

Pos logit contributions
 least+0.014
oh+0.011
 say+0.010
 oh+0.010
 kidding+0.010
Neg logit contributions
les-0.013
lights-0.013
ork-0.012
rily-0.012
adoes-0.012
Token,
Token ID11
Feature activation-2.08e-01

Pos logit contributions
les+0.010
lights+0.010
ork+0.009
rily+0.009
adoes+0.009
Neg logit contributions
 least-0.011
oh-0.008
 say-0.008
 oh-0.008
 kidding-0.008
Token big
Token ID1263
Feature activation+4.09e-01

Pos logit contributions
 least+0.037
oh+0.031
 kidding+0.030
 say+0.029
 oh+0.029
Neg logit contributions
les-0.027
rily-0.026
lights-0.024
ork-0.023
adoes-0.022
Token,
Token ID11
Feature activation-2.61e-01

Pos logit contributions
 least+0.018
oh+0.014
 say+0.013
 oh+0.013
 kidding+0.013
Neg logit contributions
les-0.016
lights-0.015
rily-0.015
ork-0.014
adoes-0.014
Token white
Token ID2330
Feature activation-5.83e-01

Pos logit contributions
les+0.011
lights+0.011
ork+0.010
rily+0.010
adoes+0.010
Neg logit contributions
 least-0.010
oh-0.008
 say-0.008
 oh-0.008
 kidding-0.007
Token caul
Token ID46087
Feature activation-8.90e-01

Pos logit contributions
lights+0.022
les+0.021
ork+0.021
adoes+0.021
rily+0.020
Neg logit contributions
 least-0.024
oh-0.020
 say-0.019
 oh-0.018
 am-0.018
Tokeniflower
Token ID42642
Feature activation+1.09e+00

Pos logit contributions
lights+0.030
adoes+0.029
les+0.027
ork+0.026
ened+0.026
Neg logit contributions
 least-0.041
oh-0.037
 ourselves-0.032
 am-0.032
 say-0.031
Token.
Token ID13
Feature activation-4.09e-01

Pos logit contributions
 least+0.048
oh+0.040
 kidding+0.039
 oh+0.038
 say+0.037
Neg logit contributions
les-0.039
lights-0.037
rily-0.037
ork-0.036
adoes-0.034
Token He
Token ID679
Feature activation-2.44e-01

Pos logit contributions
les+0.017
lights+0.016
ork+0.016
rily+0.015
adoes+0.015
Neg logit contributions
 least-0.017
oh-0.013
 say-0.013
 kidding-0.012
 oh-0.012
Token was
Token ID373
Feature activation+4.70e-01

Pos logit contributions
lights+0.010
les+0.010
ork+0.009
rily+0.009
adoes+0.009
Neg logit contributions
 least-0.010
oh-0.008
 say-0.007
 oh-0.007
 kidding-0.007
Token so
Token ID523
Feature activation+3.28e-01

Pos logit contributions
 least+0.012
oh+0.008
 say+0.007
 ourselves+0.007
 kidding+0.006
Neg logit contributions
les-0.025
lights-0.025
rily-0.025
ork-0.024
adoes-0.024
Token proud
Token ID6613
Feature activation-1.30e-01

Pos logit contributions
 least+0.012
oh+0.009
 oh+0.008
 kidding+0.008
 say+0.008
Neg logit contributions
les-0.015
lights-0.014
rily-0.014
ork-0.014
adoes-0.013
Token of
Token ID286
Feature activation-6.31e-01

Pos logit contributions
les+0.004
lights+0.004
ork+0.004
rily+0.004
adoes+0.004
Neg logit contributions
 least-0.007
oh-0.005
 say-0.005
 oh-0.005
 kidding-0.005
Token."
Token ID526
Feature activation+5.05e-01

Pos logit contributions
lights+0.033
les+0.033
ork+0.032
rily+0.031
adoes+0.030
Neg logit contributions
 least-0.030
oh-0.023
 say-0.022
 oh-0.021
 am-0.021
Token So
Token ID1406
Feature activation+2.29e-01

Pos logit contributions
 least+0.022
oh+0.017
 kidding+0.016
 worst+0.016
 oh+0.016
Neg logit contributions
les-0.019
lights-0.019
ork-0.018
rily-0.018
adoes-0.018
Token,
Token ID11
Feature activation-1.25e-01

Pos logit contributions
 least+0.008
oh+0.006
 say+0.006
 kidding+0.006
 oh+0.006
Neg logit contributions
les-0.010
lights-0.010
ork-0.010
rily-0.010
adoes-0.009
Token Tim
Token ID5045
Feature activation-2.25e-01

Pos logit contributions
les+0.006
lights+0.005
ork+0.005
rily+0.005
adoes+0.005
Neg logit contributions
 least-0.005
oh-0.003
 say-0.003
 oh-0.003
 kidding-0.003
Tokenmy
Token ID1820
Feature activation-8.30e-01

Pos logit contributions
lights+0.010
les+0.010
ork+0.009
rily+0.009
adoes+0.009
Neg logit contributions
 least-0.008
oh-0.007
 say-0.006
 kidding-0.006
 oh-0.006
Token shared
Token ID4888
Feature activation-2.09e+00

Pos logit contributions
lights+0.037
rily+0.034
les+0.034
adoes+0.033
ork+0.033
Neg logit contributions
 least-0.032
 say-0.025
oh-0.025
 oh-0.022
 wait-0.022
Token his
Token ID465
Feature activation-4.31e-01

Pos logit contributions
lights+0.124
les+0.121
ork+0.118
anna+0.117
rily+0.116
Neg logit contributions
 say-0.036
 very-0.036
 least-0.035
 so-0.031
 a-0.028
Token cars
Token ID5006
Feature activation-4.06e-01

Pos logit contributions
les+0.017
rily+0.016
lights+0.016
ork+0.015
adoes+0.015
Neg logit contributions
 least-0.018
 say-0.014
oh-0.014
 oh-0.014
 kidding-0.014
Token with
Token ID351
Feature activation-9.76e-01

Pos logit contributions
lights+0.015
les+0.015
rily+0.014
ork+0.014
adoes+0.014
Neg logit contributions
 least-0.017
oh-0.014
 say-0.014
 oh-0.013
 kidding-0.013
Token the
Token ID262
Feature activation+5.53e-02

Pos logit contributions
les+0.045
lights+0.045
rily+0.045
ork+0.044
adoes+0.042
Neg logit contributions
 least-0.034
 say-0.024
 oh-0.023
 very-0.022
oh-0.021
Token little
Token ID1310
Feature activation-4.69e-02

Pos logit contributions
 least+0.003
oh+0.002
 say+0.002
 oh+0.002
 kidding+0.002
Neg logit contributions
les-0.002
lights-0.002
ork-0.002
rily-0.002
adoes-0.002
Token they
Token ID484
Feature activation-5.58e-01

Pos logit contributions
 least+0.016
oh+0.013
 kidding+0.012
 oh+0.012
 say+0.012
Neg logit contributions
les-0.016
lights-0.015
ork-0.015
rily-0.014
adoes-0.014
Token played
Token ID2826
Feature activation-6.31e-01

Pos logit contributions
lights+0.024
les+0.023
ork+0.023
rily+0.023
adoes+0.023
Neg logit contributions
 least-0.022
 say-0.017
oh-0.016
 wait-0.015
 kidding-0.015
Token pretend
Token ID16614
Feature activation-2.32e-01

Pos logit contributions
lights+0.027
les+0.027
ork+0.025
rily+0.025
adoes+0.025
Neg logit contributions
 least-0.023
 say-0.019
 oh-0.017
 ourselves-0.017
 am-0.017
Token for
Token ID329
Feature activation-2.52e-01

Pos logit contributions
les+0.009
lights+0.009
rily+0.009
ork+0.009
adoes+0.008
Neg logit contributions
 least-0.009
oh-0.007
 say-0.007
 oh-0.007
 kidding-0.007
Token a
Token ID257
Feature activation-8.27e-01

Pos logit contributions
les+0.013
lights+0.013
rily+0.013
ork+0.013
adoes+0.013
Neg logit contributions
 least-0.007
oh-0.005
 say-0.004
 oh-0.004
 kidding-0.004
Token long
Token ID890
Feature activation-2.94e+00

Pos logit contributions
les+0.031
lights+0.030
adoes+0.030
rily+0.029
ened+0.027
Neg logit contributions
 least-0.036
oh-0.030
 say-0.029
 wait-0.028
 oh-0.028
Token time
Token ID640
Feature activation-2.24e-01

Pos logit contributions
rily+0.115
adoes+0.115
lights+0.112
ocked+0.109
les+0.105
Neg logit contributions
 least-0.108
 wait-0.099
 say-0.094
 very-0.090
oh-0.087
Token.
Token ID13
Feature activation-1.04e+00

Pos logit contributions
les+0.009
lights+0.009
rily+0.009
ork+0.009
adoes+0.009
Neg logit contributions
 least-0.009
oh-0.007
 say-0.007
 kidding-0.006
 oh-0.006
Token They
Token ID1119
Feature activation+7.99e-01

Pos logit contributions
les+0.053
rily+0.051
adoes+0.049
lights+0.049
ork+0.049
Neg logit contributions
 least-0.031
 say-0.024
 Mir-0.023
 am-0.022
 kidding-0.021
Token had
Token ID550
Feature activation+7.69e-01

Pos logit contributions
 least+0.034
oh+0.029
 oh+0.028
 kidding+0.027
 ourselves+0.027
Neg logit contributions
les-0.029
lights-0.027
ork-0.026
rily-0.025
lf-0.025
Token a
Token ID257
Feature activation+1.60e-01

Pos logit contributions
 least+0.027
oh+0.022
 say+0.019
 kidding+0.019
 oh+0.019
Neg logit contributions
les-0.034
lights-0.033
ork-0.032
rily-0.031
adoes-0.031
Token did
Token ID750
Feature activation-9.70e-01

Pos logit contributions
 least+0.013
oh+0.010
 kidding+0.009
 oh+0.009
 say+0.009
Neg logit contributions
les-0.015
lights-0.015
ork-0.015
rily-0.014
adoes-0.014
Token not
Token ID407
Feature activation-3.70e-01

Pos logit contributions
les+0.055
lights+0.053
rily+0.052
adoes+0.050
ened+0.049
Neg logit contributions
 least-0.026
 say-0.018
 am-0.018
 ourselves-0.017
 oh-0.017
Token know
Token ID760
Feature activation-8.42e-01

Pos logit contributions
lights+0.022
les+0.021
rily+0.021
adoes+0.020
ork+0.020
Neg logit contributions
 least-0.009
 say-0.006
oh-0.006
 am-0.005
 kidding-0.005
Token how
Token ID703
Feature activation-1.27e+00

Pos logit contributions
les+0.036
ork+0.034
lights+0.034
rily+0.034
adoes+0.032
Neg logit contributions
 least-0.033
 say-0.025
 oh-0.024
oh-0.024
 ourselves-0.023
Token to
Token ID284
Feature activation-1.83e-01

Pos logit contributions
les+0.063
lights+0.061
adoes+0.059
rily+0.059
lf+0.055
Neg logit contributions
 least-0.036
 say-0.029
 very-0.029
 ourselves-0.027
 am-0.027
Token share
Token ID2648
Feature activation-2.21e+00

Pos logit contributions
les+0.008
lights+0.008
rily+0.008
ork+0.008
adoes+0.007
Neg logit contributions
 least-0.007
oh-0.005
 say-0.005
 oh-0.005
 kidding-0.005
Token.
Token ID13
Feature activation+4.26e-01

Pos logit contributions
lights+0.118
les+0.116
urbed+0.112
rily+0.111
pron+0.108
Neg logit contributions
 least-0.049
 say-0.047
 ourselves-0.045
 very-0.044
 are-0.038
Token\n
Token ID198
Feature activation+1.77e-01

Pos logit contributions
 least+0.018
oh+0.014
 kidding+0.013
 oh+0.013
 say+0.013
Neg logit contributions
les-0.016
lights-0.016
ork-0.016
rily-0.015
adoes-0.015
Token\n
Token ID198
Feature activation+1.81e-01

Pos logit contributions
 least+0.007
oh+0.005
 say+0.005
 kidding+0.005
 oh+0.005
Neg logit contributions
les-0.007
lights-0.007
ork-0.007
rily-0.007
adoes-0.007
Token"
Token ID1
Feature activation-2.21e-01

Pos logit contributions
 least+0.008
oh+0.006
 say+0.006
 kidding+0.006
 oh+0.006
Neg logit contributions
les-0.007
lights-0.007
ork-0.006
rily-0.006
adoes-0.006
TokenI
Token ID40
Feature activation+1.49e-01

Pos logit contributions
les+0.010
lights+0.010
rily+0.010
ork+0.009
adoes+0.009
Neg logit contributions
 least-0.008
oh-0.006
 oh-0.006
 say-0.006
 kidding-0.005
Token treasure
Token ID14068
Feature activation-3.69e-01

Pos logit contributions
 least+0.006
oh+0.005
 say+0.004
 oh+0.004
 kidding+0.004
Neg logit contributions
les-0.004
lights-0.004
ork-0.004
rily-0.004
adoes-0.004
Token map
Token ID3975
Feature activation-5.99e-01

Pos logit contributions
les+0.014
lights+0.014
ork+0.013
rily+0.013
adoes+0.013
Neg logit contributions
 least-0.015
oh-0.012
 say-0.012
 oh-0.011
 kidding-0.011
Token and
Token ID290
Feature activation-2.37e-01

Pos logit contributions
les+0.024
lights+0.024
ork+0.023
adoes+0.023
rily+0.023
Neg logit contributions
 least-0.024
oh-0.018
 say-0.018
 oh-0.017
 kidding-0.016
Token after
Token ID706
Feature activation+4.15e-01

Pos logit contributions
les+0.010
lights+0.010
ork+0.010
rily+0.010
adoes+0.009
Neg logit contributions
 least-0.009
oh-0.007
 say-0.007
 oh-0.007
 kidding-0.006
Token a
Token ID257
Feature activation-7.82e-01

Pos logit contributions
 least+0.014
oh+0.011
 say+0.010
 kidding+0.010
 oh+0.010
Neg logit contributions
lights-0.019
les-0.019
ork-0.019
rily-0.018
adoes-0.018
Token long
Token ID890
Feature activation-2.65e+00

Pos logit contributions
les+0.034
lights+0.034
adoes+0.034
rily+0.032
ork+0.031
Neg logit contributions
 least-0.030
oh-0.024
 say-0.022
 wait-0.022
 oh-0.022
Token walk
Token ID2513
Feature activation+4.81e-01

Pos logit contributions
adoes+0.104
rily+0.101
lights+0.099
les+0.092
ork+0.088
Neg logit contributions
 least-0.102
 wait-0.092
 say-0.091
 saying-0.082
 doing-0.081
Token,
Token ID11
Feature activation+1.65e-02

Pos logit contributions
 least+0.018
oh+0.014
 say+0.013
 kidding+0.013
 oh+0.013
Neg logit contributions
les-0.021
lights-0.021
ork-0.020
rily-0.020
adoes-0.019
Token she
Token ID673
Feature activation-1.02e-01

Pos logit contributions
 least+0.000
oh+0.000
 say+0.000
 oh+0.000
 kidding+0.000
Neg logit contributions
les-0.001
lights-0.001
ork-0.001
rily-0.001
adoes-0.001
Token finally
Token ID3443
Feature activation-4.87e-01

Pos logit contributions
les+0.004
lights+0.004
ork+0.004
rily+0.004
adoes+0.004
Neg logit contributions
 least-0.004
oh-0.003
 say-0.003
 oh-0.003
 kidding-0.003
Token found
Token ID1043
Feature activation+4.54e-01

Pos logit contributions
lights+0.021
les+0.020
adoes+0.020
rily+0.020
ork+0.019
Neg logit contributions
 least-0.019
oh-0.015
 say-0.014
 am-0.014
 kidding-0.013
Token.
Token ID13
Feature activation+6.35e-01

Pos logit contributions
lights+0.035
les+0.032
ork+0.031
adoes+0.031
rily+0.031
Neg logit contributions
 least-0.042
oh-0.034
 say-0.033
 oh-0.031
 wait-0.031
Token You
Token ID921
Feature activation-8.45e-01

Pos logit contributions
 least+0.023
oh+0.018
 oh+0.017
 say+0.017
 kidding+0.016
Neg logit contributions
lights-0.026
les-0.026
ork-0.025
adoes-0.025
rily-0.025
Token never
Token ID1239
Feature activation-3.81e-01

Pos logit contributions
lights+0.046
rily+0.045
les+0.044
ork+0.043
adoes+0.042
Neg logit contributions
 least-0.025
 say-0.017
oh-0.017
 am-0.016
 wait-0.015
Token play
Token ID711
Feature activation-1.42e+00

Pos logit contributions
lights+0.017
les+0.017
rily+0.017
ork+0.016
adoes+0.016
Neg logit contributions
 least-0.014
 say-0.011
oh-0.011
 am-0.010
 kidding-0.010
Token with
Token ID351
Feature activation-9.08e-01

Pos logit contributions
lights+0.066
les+0.063
rily+0.063
ened+0.060
anne+0.058
Neg logit contributions
 least-0.050
 oh-0.042
 say-0.038
 am-0.038
 very-0.037
Token a
Token ID257
Feature activation-2.20e+00

Pos logit contributions
lights+0.040
les+0.040
rily+0.039
ork+0.038
adoes+0.036
Neg logit contributions
 least-0.033
 oh-0.024
oh-0.024
 say-0.024
 ourselves-0.024
Token gun
Token ID2485
Feature activation-1.41e+00

Pos logit contributions
lights+0.099
rily+0.095
les+0.087
ened+0.084
adoes+0.084
Neg logit contributions
 least-0.082
 say-0.068
 oh-0.067
 am-0.065
 very-0.065
Token.
Token ID13
Feature activation+7.84e-01

Pos logit contributions
lights+0.052
les+0.046
adoes+0.045
rily+0.045
ork+0.045
Neg logit contributions
 least-0.067
oh-0.054
 say-0.052
 wait-0.050
 oh-0.050
Token You
Token ID921
Feature activation-8.40e-01

Pos logit contributions
 least+0.029
oh+0.022
 oh+0.021
 say+0.020
 kidding+0.020
Neg logit contributions
lights-0.032
les-0.032
ork-0.032
adoes-0.031
rily-0.031
Token understand
Token ID1833
Feature activation+1.43e-01

Pos logit contributions
lights+0.047
rily+0.045
les+0.045
ork+0.044
adoes+0.043
Neg logit contributions
 least-0.024
 say-0.017
 am-0.016
oh-0.016
 wait-0.014
Token?"
Token ID1701
Feature activation+1.41e+00

Pos logit contributions
 least+0.006
oh+0.004
 say+0.004
 kidding+0.004
 oh+0.004
Neg logit contributions
les-0.006
lights-0.006
ork-0.006
rily-0.005
adoes-0.005