TOP ACTIVATIONS
MAX = 0.078

Tommy wanted a very special toy , but he did not
, he tri pped on an old brick and all his
. They both wanted a toy they saw in the store
request . She wanted a toy with a handle so she
a small bird perched on an old crate . The bird
, he managed to jump an extra long , far distance
" That 's a nice toy ," she said . "
je ep is not a toy . We might break it
She wanted to make a toy missile . She asked her
lost or sick or hurt ," mom said .
, it 's my favorite toy !" Lily replied .
. <|endoftext|> John wanted a toy truck . His mom said
<|endoftext|> Mark wanted a new toy but couldn â
him it was not a toy . But Tim did not
mom gave her a new toy . It was a little
and get to the water ." Sam whispered , trying to
the door and there was an old lady .
. He requested a soft toy , but the shop keeper
usual area . She reached an expensive part of town and
thinks it is a shiny toy that makes his dad 's

MIN ACTIVATIONS
MIN = -0.094

not played with for a long time . It was a
red bike with a com fy seat and shiny silver bell
Billy was wearing a really cool shirt with a dinosaur on
the beach and build sand cast les .
play . Lily saw a merry - go - round and
a strong and brave bear on it . Benny wanted to
things inside , but the brave little girl saw him .
big eyes and a w agging tail . The young boy
man walking on a tight ro pe . He had to
with his favorite cartoon character on the cover . He asked
room with a pretty tut oring desk .
around and saw a big do ggy smile at her .
it ! The white all igator was scared so
to have found such a beautiful treasure . <|endoftext|> Once upon
had gone away for a long time . Lucy sighed sadly
Ben smiled and hugged its neck . Mom took
hug ." The quiet all igator did not know
not played with for a long time . It is a
. When the white all igator got out of
Lily picked the apples and shared them with her friends .

INTERVAL 0.035 - 0.078
CONTAINS 1.096%

toys . He had a toy box with a lid that
play together and their favourite thing to do was swing .
called Sally . She found something very strange in her garden
, he got a new toy car . It was big
" Oh honey , I 'm sorry but we don 't

INTERVAL -0.008 - 0.035
CONTAINS 70.454%

mom would come in and say , " L ily ,
saw the open freezer and said " Uh oh !" He
L ily and Ben hug each other and say sorry .
it suddenly jumped right onto Madd ie and pin ched her
it was doing . The squirrel said ,

INTERVAL -0.051 - -0.008
CONTAINS 28.359%

to do . Suddenly , he heard a ringing sound coming
with the water . The little girl said , " You
Tom asks . He reaches for the necklace , but Sara
a tube in her toy box . It was hard and
jealous , so he decided to share the diamond with them

INTERVAL -0.094 - -0.051
CONTAINS 0.091%

Lily whispered , holding a paper hat as her hat .
outside and play in her white hut .
his special spot . The first all igator wished he had
. They had fun and shared their pictures .
rest . She found a cozy spot under a tree and
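
The four INTERVAL blocks above have edges evenly spaced between MIN (-0.094) and MAX (0.078), each 0.043 wide, and their CONTAINS values sum to 100%. That is consistent with equal-width binning of the activations. Below is a minimal sketch of how such a summary could be computed, assuming CONTAINS is the share of token activations that fall in each interval. The function name `bin_shares` and the sample values are hypothetical and are not taken from the dashboard.

```python
def bin_shares(activations, n_bins=4):
    """Split [min, max] into n_bins equal-width intervals.

    Returns a list of (low, high, percent) tuples, ordered from the
    highest interval to the lowest, as in the INTERVAL blocks above.
    """
    lo, hi = min(activations), max(activations)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for a in activations:
        if width > 0:
            # Clamp so the maximum value lands in the top bin.
            idx = min(int((a - lo) / width), n_bins - 1)
        else:
            # All activations identical: everything goes in bin 0.
            idx = 0
        counts[idx] += 1
    total = len(activations)
    rows = []
    for i in reversed(range(n_bins)):
        low = lo + i * width
        high = lo + (i + 1) * width
        rows.append((low, high, 100.0 * counts[i] / total))
    return rows


# Hypothetical activations spanning the same MIN/MAX as the dashboard.
sample = [-0.094, -0.06, -0.02, 0.0, 0.01, 0.03, 0.05, 0.078]
for low, high, pct in bin_shares(sample):
    print(f"INTERVAL {low:.3f} - {high:.3f}  CONTAINS {pct:.3f}%")
```

With the real activations, the two outer intervals would hold only a small fraction of tokens (1.096% and 0.091% above), because most activations sit near zero.
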
Token Tommy
Token ID19919
Feature activation+2.80e-02

Pos logit contributions
les+0.010
lights+0.010
ork+0.010
rily+0.010
adoes+0.009
Neg logit contributions
 least-0.009
oh-0.007
 say-0.007
 kidding-0.007
 oh-0.007
Token wanted
Token ID2227
Feature activation+1.46e-02

Pos logit contributions
 least+0.056
 kidding+0.045
 oh+0.044
oh+0.043
 ourselves+0.043
Neg logit contributions
les-0.036
ork-0.033
lights-0.032
lf-0.031
ened-0.030
Token a
Token ID257
Feature activation+1.00e-04

Pos logit contributions
 least+0.027
oh+0.023
 kidding+0.021
 say+0.021
 ourselves+0.020
Neg logit contributions
les-0.019
ork-0.019
lights-0.019
adoes-0.018
rily-0.017
Token very
Token ID845
Feature activation-2.43e-02

Pos logit contributions
 least+0.000
oh+0.000
 say+0.000
 oh+0.000
 kidding+0.000
Neg logit contributions
les-0.000
lights-0.000
ork-0.000
rily-0.000
adoes-0.000
Token special
Token ID2041
Feature activation+9.78e-03

Pos logit contributions
les+0.047
lights+0.046
adoes+0.045
rily+0.045
ened+0.042
Neg logit contributions
 least-0.033
 say-0.027
oh-0.025
 wait-0.025
 oh-0.025
Token toy
Token ID13373
Feature activation+7.76e-02

Pos logit contributions
 least+0.019
oh+0.016
 say+0.015
 kidding+0.015
 oh+0.014
Neg logit contributions
les-0.013
lights-0.012
ork-0.012
rily-0.012
adoes-0.011
Token,
Token ID11
Feature activation+2.89e-02

Pos logit contributions
oh+0.149
 least+0.148
Ta+0.138
 kidding+0.137
 elbows+0.133
Neg logit contributions
ork-0.073
lights-0.069
 soldier-0.067
les-0.067
 novels-0.065
Token but
Token ID475
Feature activation-1.29e-02

Pos logit contributions
 least+0.047
oh+0.039
 ourselves+0.033
 kidding+0.033
 am+0.032
Neg logit contributions
lights-0.046
les-0.045
ork-0.044
rily-0.043
adoes-0.042
Token he
Token ID339
Feature activation+4.47e-03

Pos logit contributions
les+0.023
lights+0.022
ork+0.021
rily+0.021
adoes+0.021
Neg logit contributions
 least-0.021
 oh-0.016
oh-0.015
 say-0.015
 kidding-0.015
Token did
Token ID750
Feature activation-1.25e-02

Pos logit contributions
 least+0.007
oh+0.006
 say+0.006
 kidding+0.005
 oh+0.005
Neg logit contributions
les-0.007
lights-0.007
ork-0.007
rily-0.007
adoes-0.007
Token not
Token ID407
Feature activation-9.18e-03

Pos logit contributions
les+0.025
lights+0.025
rily+0.024
adoes+0.023
ork+0.023
Neg logit contributions
 least-0.016
 say-0.012
 oh-0.012
oh-0.012
 am-0.012
Token,
Token ID11
Feature activation-1.98e-03

Pos logit contributions
 least+0.005
oh+0.004
 say+0.004
 oh+0.004
 kidding+0.004
Neg logit contributions
les-0.007
lights-0.006
ork-0.006
rily-0.006
adoes-0.006
Token he
Token ID339
Feature activation-1.28e-02

Pos logit contributions
les+0.003
lights+0.003
ork+0.003
rily+0.003
adoes+0.003
Neg logit contributions
 least-0.003
oh-0.002
 say-0.002
 oh-0.002
 kidding-0.002
Token tri
Token ID1333
Feature activation+9.33e-04

Pos logit contributions
lights+0.023
les+0.022
rily+0.022
adoes+0.022
ork+0.022
Neg logit contributions
 least-0.019
oh-0.015
 say-0.014
 oh-0.013
 kidding-0.013
Tokenpped
Token ID1496
Feature activation+1.57e-02

Pos logit contributions
 least+0.002
oh+0.002
 say+0.002
 oh+0.002
 kidding+0.002
Neg logit contributions
les-0.001
lights-0.001
ork-0.001
rily-0.001
adoes-0.001
Token on
Token ID319
Feature activation-1.19e-02

Pos logit contributions
 least+0.031
oh+0.025
 kidding+0.025
 oh+0.024
 say+0.024
Neg logit contributions
les-0.021
lights-0.019
ork-0.019
rily-0.019
adoes-0.018
Token an
Token ID281
Feature activation+7.74e-02

Pos logit contributions
les+0.026
lights+0.026
rily+0.025
ork+0.025
adoes+0.024
Neg logit contributions
 least-0.013
oh-0.008
 say-0.008
 oh-0.008
 kidding-0.008
Token old
Token ID1468
Feature activation+2.55e-02

Pos logit contributions
 least+0.130
oh+0.122
 am+0.120
'm+0.114
 ourselves+0.107
Neg logit contributions
les-0.112
lf-0.105
rily-0.103
lights-0.096
ork-0.090
Token brick
Token ID17214
Feature activation+1.20e-02

Pos logit contributions
 least+0.048
oh+0.039
 kidding+0.036
 Mir+0.035
 ourselves+0.035
Neg logit contributions
les-0.036
ork-0.035
lights-0.034
adoes-0.033
rily-0.032
Token and
Token ID290
Feature activation-2.13e-03

Pos logit contributions
 least+0.021
oh+0.017
 kidding+0.016
 say+0.016
 oh+0.016
Neg logit contributions
les-0.018
lights-0.018
ork-0.017
rily-0.017
adoes-0.016
Token all
Token ID477
Feature activation+7.30e-03

Pos logit contributions
les+0.004
lights+0.004
ork+0.003
rily+0.003
adoes+0.003
Neg logit contributions
 least-0.003
oh-0.003
 say-0.003
 oh-0.002
 kidding-0.002
Token his
Token ID465
Feature activation+7.30e-03

Pos logit contributions
 least+0.013
oh+0.010
 oh+0.010
 kidding+0.010
 say+0.010
Neg logit contributions
les-0.011
lights-0.011
rily-0.010
ork-0.010
adoes-0.010
Token.
Token ID13
Feature activation+2.92e-03

Pos logit contributions
 least+0.042
oh+0.035
 oh+0.034
 kidding+0.033
 worst+0.032
Neg logit contributions
les-0.029
lights-0.027
ork-0.025
rily-0.024
ened-0.023
Token They
Token ID1119
Feature activation+2.27e-02

Pos logit contributions
 least+0.005
oh+0.004
 say+0.004
 oh+0.004
 kidding+0.004
Neg logit contributions
les-0.004
lights-0.004
ork-0.004
rily-0.004
adoes-0.004
Token both
Token ID1111
Feature activation+3.63e-02

Pos logit contributions
 least+0.041
oh+0.034
 oh+0.034
 kidding+0.032
 Mir+0.032
Neg logit contributions
les-0.034
lights-0.029
ened-0.028
adoes-0.028
rily-0.027
Token wanted
Token ID2227
Feature activation+1.04e-02

Pos logit contributions
 least+0.075
 oh+0.061
 ourselves+0.060
oh+0.060
 kidding+0.058
Neg logit contributions
les-0.046
ened-0.038
lights-0.038
ork-0.037
rily-0.035
Token a
Token ID257
Feature activation-5.57e-03

Pos logit contributions
 least+0.019
oh+0.016
 kidding+0.014
 say+0.014
 oh+0.014
Neg logit contributions
les-0.015
lights-0.015
ork-0.014
adoes-0.014
rily-0.014
Token toy
Token ID13373
Feature activation+7.65e-02

Pos logit contributions
les+0.008
lights+0.008
rily+0.008
ork+0.008
adoes+0.008
Neg logit contributions
 least-0.010
oh-0.008
 say-0.008
 oh-0.008
 kidding-0.008
Token they
Token ID484
Feature activation+1.40e-02

Pos logit contributions
oh+0.130
 least+0.128
 elbows+0.122
Ta+0.121
 kidding+0.120
Neg logit contributions
lights-0.086
ork-0.084
les-0.080
 soldier-0.079
rily-0.078
Token saw
Token ID2497
Feature activation+2.12e-02

Pos logit contributions
 least+0.022
oh+0.018
 kidding+0.017
 Mir+0.017
 oh+0.017
Neg logit contributions
les-0.023
lights-0.022
ork-0.021
rily-0.021
adoes-0.020
Token in
Token ID287
Feature activation-4.18e-03

Pos logit contributions
 least+0.038
oh+0.031
 say+0.029
 kidding+0.029
 Mir+0.028
Neg logit contributions
les-0.033
lights-0.031
ork-0.030
rily-0.030
adoes-0.029
Token the
Token ID262
Feature activation+2.48e-02

Pos logit contributions
les+0.009
lights+0.008
ork+0.008
rily+0.008
adoes+0.008
Neg logit contributions
 least-0.005
oh-0.004
 say-0.004
 kidding-0.003
 oh-0.003
Token store
Token ID3650
Feature activation+2.12e-02

Pos logit contributions
 least+0.043
 oh+0.038
oh+0.038
 kidding+0.038
 say+0.036
Neg logit contributions
les-0.034
ork-0.032
lights-0.031
adoes-0.029
 tant-0.028
Token request
Token ID2581
Feature activation+4.08e-02

Pos logit contributions
 least+0.017
oh+0.014
 say+0.013
 kidding+0.013
 oh+0.013
Neg logit contributions
les-0.013
lights-0.013
ork-0.013
rily-0.012
adoes-0.012
Token.
Token ID13
Feature activation-1.58e-02

Pos logit contributions
 least+0.078
oh+0.065
 kidding+0.063
 elbows+0.062
 ourselves+0.060
Neg logit contributions
ork-0.052
les-0.052
lights-0.048
adoes-0.045
bell-0.045
Token She
Token ID1375
Feature activation-1.04e-02

Pos logit contributions
les+0.034
lights+0.033
rily+0.032
ork+0.031
adoes+0.030
Neg logit contributions
 least-0.018
 say-0.014
oh-0.013
 am-0.012
 oh-0.012
Token wanted
Token ID2227
Feature activation+5.45e-03

Pos logit contributions
lights+0.018
les+0.018
rily+0.017
adoes+0.017
ork+0.017
Neg logit contributions
 least-0.016
oh-0.013
 say-0.013
 oh-0.012
 am-0.012
Token a
Token ID257
Feature activation-8.43e-03

Pos logit contributions
 least+0.009
oh+0.008
 say+0.007
 kidding+0.007
 oh+0.007
Neg logit contributions
les-0.008
lights-0.008
ork-0.008
rily-0.008
adoes-0.008
Token toy
Token ID13373
Feature activation+7.61e-02

Pos logit contributions
les+0.013
lights+0.013
rily+0.012
adoes+0.011
ork+0.011
Neg logit contributions
 least-0.015
oh-0.012
 say-0.012
 oh-0.012
 kidding-0.011
Token with
Token ID351
Feature activation-1.92e-03

Pos logit contributions
 least+0.132
oh+0.128
 elbows+0.121
 kidding+0.119
Ta+0.116
Neg logit contributions
ork-0.088
les-0.082
lights-0.081
bell-0.081
 soldier-0.080
Token a
Token ID257
Feature activation-2.91e-02

Pos logit contributions
les+0.004
lights+0.004
ork+0.004
rily+0.004
adoes+0.004
Neg logit contributions
 least-0.002
oh-0.002
 say-0.002
 oh-0.002
 kidding-0.002
Token handle
Token ID5412
Feature activation+2.30e-02

Pos logit contributions
lights+0.047
les+0.047
rily+0.044
adoes+0.042
ork+0.041
Neg logit contributions
 least-0.048
 say-0.040
oh-0.038
 oh-0.038
 am-0.037
Token so
Token ID523
Feature activation+3.03e-03

Pos logit contributions
 least+0.042
oh+0.033
 kidding+0.032
 oh+0.032
 ourselves+0.031
Neg logit contributions
les-0.034
lights-0.033
ork-0.031
rily-0.031
bell-0.030
Token she
Token ID673
Feature activation-2.83e-02

Pos logit contributions
 least+0.004
oh+0.003
 say+0.003
 kidding+0.002
 oh+0.002
Neg logit contributions
les-0.006
lights-0.006
ork-0.006
rily-0.006
adoes-0.006
Token a
Token ID257
Feature activation+1.08e-02

Pos logit contributions
 least+0.035
oh+0.028
 kidding+0.024
 say+0.024
 wait+0.023
Neg logit contributions
les-0.043
rily-0.041
lights-0.041
adoes-0.041
ork-0.041
Token small
Token ID1402
Feature activation+1.09e-02

Pos logit contributions
 least+0.020
oh+0.016
 say+0.015
 kidding+0.015
 oh+0.015
Neg logit contributions
les-0.015
lights-0.015
ork-0.015
rily-0.014
adoes-0.014
Token bird
Token ID6512
Feature activation-6.03e-03

Pos logit contributions
 least+0.020
oh+0.016
 say+0.015
 oh+0.015
 kidding+0.015
Neg logit contributions
les-0.016
lights-0.015
ork-0.015
rily-0.015
adoes-0.014
Token perched
Token ID49264
Feature activation+5.00e-03

Pos logit contributions
lights+0.010
les+0.010
ork+0.010
rily+0.010
adoes+0.010
Neg logit contributions
 least-0.009
oh-0.007
 say-0.007
 oh-0.007
 kidding-0.007
Token on
Token ID319
Feature activation+9.95e-03

Pos logit contributions
 least+0.008
oh+0.006
 say+0.006
 kidding+0.006
 oh+0.006
Neg logit contributions
les-0.008
lights-0.008
rily-0.008
ork-0.008
adoes-0.008
Token an
Token ID281
Feature activation+7.45e-02

Pos logit contributions
 least+0.014
oh+0.010
 say+0.010
 kidding+0.010
 oh+0.009
Neg logit contributions
les-0.019
lights-0.018
ork-0.018
rily-0.018
adoes-0.017
Token old
Token ID1468
Feature activation+2.14e-02

Pos logit contributions
 least+0.136
 am+0.130
oh+0.124
'm+0.120
 kidding+0.118
Neg logit contributions
les-0.096
rily-0.094
lf-0.085
lights-0.080
bell-0.080
Token crate
Token ID27021
Feature activation+7.58e-03

Pos logit contributions
 least+0.041
oh+0.032
 kidding+0.031
 say+0.030
 ourselves+0.030
Neg logit contributions
les-0.030
ork-0.030
rily-0.028
lights-0.028
adoes-0.027
Token.
Token ID13
Feature activation+5.14e-03

Pos logit contributions
 least+0.013
oh+0.010
 say+0.010
 kidding+0.010
 oh+0.010
Neg logit contributions
les-0.012
lights-0.012
ork-0.011
rily-0.011
adoes-0.011
Token The
Token ID383
Feature activation+6.58e-03

Pos logit contributions
 least+0.008
oh+0.006
 say+0.006
 kidding+0.006
 oh+0.006
Neg logit contributions
les-0.009
lights-0.009
ork-0.008
rily-0.008
adoes-0.008
Token bird
Token ID6512
Feature activation-2.46e-02

Pos logit contributions
 least+0.013
oh+0.011
 kidding+0.011
 oh+0.011
 say+0.011
Neg logit contributions
les-0.008
lights-0.008
ork-0.008
rily-0.007
adoes-0.007
Token,
Token ID11
Feature activation-1.09e-02

Pos logit contributions
 least+0.020
oh+0.015
 say+0.014
 oh+0.014
 kidding+0.014
Neg logit contributions
les-0.023
lights-0.022
ork-0.022
rily-0.021
adoes-0.021
Token he
Token ID339
Feature activation-4.94e-03

Pos logit contributions
les+0.020
lights+0.019
ork+0.019
rily+0.019
adoes+0.018
Neg logit contributions
 least-0.017
oh-0.012
 oh-0.012
 say-0.012
 kidding-0.012
Token managed
Token ID5257
Feature activation-7.22e-03

Pos logit contributions
lights+0.008
les+0.008
rily+0.008
ork+0.008
adoes+0.008
Neg logit contributions
 least-0.008
oh-0.006
 say-0.006
 oh-0.006
 kidding-0.006
Token to
Token ID284
Feature activation+6.28e-03

Pos logit contributions
lights+0.012
les+0.012
rily+0.012
ork+0.012
adoes+0.012
Neg logit contributions
 least-0.011
oh-0.009
 say-0.009
 oh-0.008
 kidding-0.008
Token jump
Token ID4391
Feature activation+2.57e-02

Pos logit contributions
 least+0.009
oh+0.007
 oh+0.007
 say+0.007
 kidding+0.007
Neg logit contributions
les-0.011
lights-0.011
ork-0.010
rily-0.010
adoes-0.010
Token an
Token ID281
Feature activation+7.41e-02

Pos logit contributions
 least+0.047
oh+0.038
 oh+0.037
 kidding+0.036
 say+0.036
Neg logit contributions
les-0.038
lights-0.036
ork-0.035
rily-0.035
adoes-0.033
Token extra
Token ID3131
Feature activation-1.33e-03

Pos logit contributions
 am+0.117
oh+0.117
 least+0.115
 elbows+0.109
'm+0.107
Neg logit contributions
les-0.103
rily-0.097
lf-0.087
en-0.087
lights-0.083
Token long
Token ID890
Feature activation-1.29e-02

Pos logit contributions
les+0.002
lights+0.002
ork+0.002
rily+0.002
adoes+0.002
Neg logit contributions
 least-0.002
oh-0.002
 say-0.002
 oh-0.002
 kidding-0.002
Token,
Token ID11
Feature activation+2.32e-02

Pos logit contributions
lights+0.018
les+0.018
rily+0.017
adoes+0.017
ork+0.017
Neg logit contributions
 least-0.024
oh-0.020
 say-0.019
 kidding-0.018
 oh-0.018
Token far
Token ID1290
Feature activation-7.48e-04

Pos logit contributions
 least+0.035
oh+0.028
 ourselves+0.025
 say+0.025
 kidding+0.025
Neg logit contributions
les-0.039
lights-0.038
ork-0.037
rily-0.037
adoes-0.036
Token distance
Token ID5253
Feature activation+5.55e-03

Pos logit contributions
les+0.001
lights+0.001
ork+0.001
rily+0.001
adoes+0.001
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token"
Token ID1
Feature activation+4.46e-03

Pos logit contributions
 least+0.036
 kidding+0.029
 worst+0.028
oh+0.028
 oh+0.028
Neg logit contributions
les-0.019
ork-0.018
lights-0.018
rily-0.016
adoes-0.016
TokenThat
Token ID2504
Feature activation+2.34e-02

Pos logit contributions
 least+0.007
oh+0.006
 say+0.005
 kidding+0.005
 oh+0.005
Neg logit contributions
les-0.007
lights-0.007
ork-0.007
rily-0.007
adoes-0.007
Token's
Token ID338
Feature activation+2.87e-02

Pos logit contributions
 least+0.041
 kidding+0.033
 elbows+0.032
 Mir+0.031
 ourselves+0.031
Neg logit contributions
les-0.036
ork-0.034
adoes-0.032
lights-0.032
ened-0.030
Token a
Token ID257
Feature activation+1.59e-02

Pos logit contributions
 least+0.043
oh+0.036
 elbows+0.034
 Mir+0.031
 say+0.031
Neg logit contributions
adoes-0.044
ork-0.042
lights-0.042
lf-0.039
les-0.038
Token nice
Token ID3621
Feature activation-5.06e-03

Pos logit contributions
 least+0.025
oh+0.020
 say+0.019
 ourselves+0.019
 Mir+0.019
Neg logit contributions
ork-0.025
les-0.025
lights-0.024
adoes-0.023
rily-0.023
Token toy
Token ID13373
Feature activation+7.36e-02

Pos logit contributions
lights+0.008
les+0.008
rily+0.008
ork+0.007
adoes+0.007
Neg logit contributions
 least-0.009
oh-0.007
 say-0.007
 oh-0.007
 kidding-0.007
Token,"
Token ID553
Feature activation+4.37e-02

Pos logit contributions
 least+0.134
 elbows+0.116
 kidding+0.111
oh+0.105
 Mir+0.104
Neg logit contributions
les-0.099
lights-0.089
ork-0.087
adoes-0.087
bell-0.083
Token she
Token ID673
Feature activation+5.04e-03

Pos logit contributions
 least+0.089
oh+0.073
 kidding+0.068
 ourselves+0.064
 tallest+0.064
Neg logit contributions
lights-0.054
les-0.050
ork-0.046
bell-0.045
lf-0.045
Token said
Token ID531
Feature activation+5.18e-03

Pos logit contributions
 least+0.005
oh+0.004
 kidding+0.003
 say+0.003
 oh+0.003
Neg logit contributions
les-0.011
lights-0.011
ork-0.011
rily-0.011
adoes-0.011
Token.
Token ID13
Feature activation-4.56e-03

Pos logit contributions
 least+0.009
oh+0.007
 say+0.007
 kidding+0.007
 oh+0.007
Neg logit contributions
les-0.008
lights-0.008
rily-0.008
ork-0.008
adoes-0.007
Token "
Token ID366
Feature activation+8.00e-03

Pos logit contributions
les+0.008
lights+0.007
rily+0.007
ork+0.007
adoes+0.007
Neg logit contributions
 least-0.007
oh-0.006
 say-0.006
 oh-0.006
 kidding-0.005
Token je
Token ID11223
Feature activation-1.51e-02

Pos logit contributions
 least+0.054
oh+0.047
 oh+0.044
 kidding+0.044
 Mir+0.041
Neg logit contributions
ork-0.047
les-0.046
lights-0.045
adoes-0.044
 tant-0.043
Tokenep
Token ID538
Feature activation+1.67e-02

Pos logit contributions
les+0.022
lights+0.022
rily+0.022
ork+0.020
adoes+0.020
Neg logit contributions
 least-0.027
oh-0.022
 say-0.020
 oh-0.020
 am-0.020
Token is
Token ID318
Feature activation-1.49e-03

Pos logit contributions
 least+0.024
oh+0.018
 oh+0.017
 kidding+0.017
 say+0.016
Neg logit contributions
les-0.031
lights-0.030
ork-0.029
adoes-0.028
rily-0.028
Token not
Token ID407
Feature activation+1.70e-02

Pos logit contributions
les+0.003
lights+0.003
ork+0.003
rily+0.003
adoes+0.003
Neg logit contributions
 least-0.002
oh-0.002
 say-0.002
 oh-0.001
 kidding-0.001
Token a
Token ID257
Feature activation+9.22e-03

Pos logit contributions
 least+0.027
oh+0.021
 am+0.018
 say+0.018
 kidding+0.018
Neg logit contributions
les-0.029
lights-0.028
adoes-0.027
ork-0.027
rily-0.027
Token toy
Token ID13373
Feature activation+7.34e-02

Pos logit contributions
 least+0.018
oh+0.015
 say+0.014
 kidding+0.014
 oh+0.014
Neg logit contributions
les-0.012
ork-0.011
lights-0.011
rily-0.011
adoes-0.010
Token.
Token ID13
Feature activation+1.54e-02

Pos logit contributions
 least+0.151
 kidding+0.139
 elbows+0.135
 Fre+0.134
Ta+0.134
Neg logit contributions
les-0.074
ork-0.066
lights-0.063
bell-0.061
adoes-0.059
Token We
Token ID775
Feature activation-2.84e-03

Pos logit contributions
 least+0.030
oh+0.024
 say+0.023
 oh+0.023
 kidding+0.023
Neg logit contributions
lights-0.020
les-0.019
ork-0.019
adoes-0.019
rily-0.018
Token might
Token ID1244
Feature activation+8.72e-03

Pos logit contributions
les+0.005
lights+0.005
rily+0.005
ork+0.005
adoes+0.005
Neg logit contributions
 least-0.004
oh-0.003
 say-0.003
 oh-0.003
 kidding-0.003
Token break
Token ID2270
Feature activation+1.97e-02

Pos logit contributions
 least+0.014
oh+0.011
 oh+0.011
 kidding+0.011
 say+0.010
Neg logit contributions
les-0.014
ork-0.013
lights-0.013
adoes-0.013
rily-0.012
Token it
Token ID340
Feature activation+2.34e-02

Pos logit contributions
 least+0.036
oh+0.029
 kidding+0.028
 say+0.027
 oh+0.027
Neg logit contributions
les-0.030
lights-0.029
adoes-0.027
ork-0.027
rily-0.026
Token She
Token ID1375
Feature activation-4.56e-03

Pos logit contributions
 least+0.019
oh+0.014
 kidding+0.013
 oh+0.013
 say+0.013
Neg logit contributions
les-0.020
lights-0.020
ork-0.019
rily-0.019
adoes-0.019
Token wanted
Token ID2227
Feature activation+8.48e-03

Pos logit contributions
lights+0.008
les+0.008
rily+0.007
ork+0.007
adoes+0.007
Neg logit contributions
 least-0.007
oh-0.006
 say-0.005
 oh-0.005
 kidding-0.005
Token to
Token ID284
Feature activation-7.43e-03

Pos logit contributions
 least+0.015
oh+0.012
 kidding+0.012
 say+0.012
 oh+0.011
Neg logit contributions
les-0.013
lights-0.012
ork-0.012
adoes-0.011
rily-0.011
Token make
Token ID787
Feature activation+1.64e-02

Pos logit contributions
les+0.014
lights+0.014
rily+0.014
ork+0.014
adoes+0.013
Neg logit contributions
 least-0.010
 say-0.008
oh-0.008
 am-0.007
 oh-0.007
Token a
Token ID257
Feature activation+1.48e-02

Pos logit contributions
 least+0.024
oh+0.019
 am+0.017
 kidding+0.017
 say+0.016
Neg logit contributions
les-0.030
lights-0.029
ork-0.028
rily-0.028
adoes-0.028
Token toy
Token ID13373
Feature activation+7.32e-02

Pos logit contributions
 least+0.028
oh+0.023
 kidding+0.021
 oh+0.021
 ourselves+0.021
Neg logit contributions
les-0.020
ork-0.020
lights-0.020
rily-0.019
adoes-0.018
Token missile
Token ID10105
Feature activation+1.36e-02

Pos logit contributions
 least+0.129
oh+0.123
 elbows+0.110
 kidding+0.110
Ta+0.106
Neg logit contributions
lights-0.091
les-0.088
ork-0.086
adoes-0.084
rily-0.084
Token.
Token ID13
Feature activation+1.36e-02

Pos logit contributions
 least+0.022
oh+0.018
 kidding+0.017
 oh+0.017
 say+0.016
Neg logit contributions
les-0.022
lights-0.021
ork-0.020
rily-0.020
adoes-0.020
Token She
Token ID1375
Feature activation-1.63e-02

Pos logit contributions
 least+0.023
 kidding+0.017
oh+0.017
 worst+0.016
 oh+0.016
Neg logit contributions
les-0.022
lights-0.021
ork-0.021
rily-0.020
adoes-0.020
Token asked
Token ID1965
Feature activation+1.50e-02

Pos logit contributions
lights+0.029
les+0.028
rily+0.028
adoes+0.028
ork+0.027
Neg logit contributions
 least-0.024
oh-0.020
 say-0.019
 oh-0.018
 Mir-0.017
Token her
Token ID607
Feature activation-3.62e-03

Pos logit contributions
 least+0.025
oh+0.019
 kidding+0.017
 am+0.017
 say+0.017
Neg logit contributions
les-0.025
lights-0.024
adoes-0.024
ork-0.023
rily-0.023
Token lost
Token ID2626
Feature activation+4.27e-03

Pos logit contributions
lights+0.055
les+0.055
ork+0.054
rily+0.050
 tant+0.049
Neg logit contributions
 least-0.032
 say-0.023
 oh-0.022
 kidding-0.022
 saying-0.020
Token or
Token ID393
Feature activation-3.97e-03

Pos logit contributions
 least+0.007
oh+0.006
 say+0.005
 kidding+0.005
 oh+0.005
Neg logit contributions
les-0.007
lights-0.007
ork-0.006
rily-0.006
adoes-0.006
Token sick
Token ID6639
Feature activation-4.10e-04

Pos logit contributions
les+0.007
lights+0.007
ork+0.007
rily+0.007
adoes+0.007
Neg logit contributions
 least-0.006
oh-0.004
 say-0.004
 oh-0.004
 kidding-0.004
Token or
Token ID393
Feature activation-1.42e-02

Pos logit contributions
les+0.001
lights+0.001
ork+0.001
rily+0.001
adoes+0.001
Neg logit contributions
 least-0.001
oh-0.001
 say-0.000
 oh-0.000
 kidding-0.000
Token hurt
Token ID5938
Feature activation+5.03e-03

Pos logit contributions
les+0.027
lights+0.027
ork+0.026
rily+0.026
adoes+0.025
Neg logit contributions
 least-0.020
 say-0.015
oh-0.015
 oh-0.014
 kidding-0.014
Token,"
Token ID553
Feature activation+7.30e-02

Pos logit contributions
 least+0.007
oh+0.006
 say+0.005
 kidding+0.005
 oh+0.005
Neg logit contributions
les-0.009
lights-0.009
ork-0.009
rily-0.009
adoes-0.008
Token mom
Token ID1995
Feature activation+2.11e-02

Pos logit contributions
 least+0.150
oh+0.138
 longest+0.135
 tallest+0.133
 fastest+0.131
Neg logit contributions
lf-0.049
 novels-0.041
bell-0.039
lights-0.038
ork-0.036
Token said
Token ID531
Feature activation+2.30e-02

Pos logit contributions
 least+0.027
 ourselves+0.019
oh+0.018
 kidding+0.018
 am+0.018
Neg logit contributions
les-0.042
lights-0.040
ork-0.039
rily-0.038
bell-0.038
Token.
Token ID13
Feature activation+2.80e-02

Pos logit contributions
 least+0.043
oh+0.034
 am+0.031
 kidding+0.031
 say+0.030
Neg logit contributions
les-0.033
lights-0.033
rily-0.032
ork-0.030
adoes-0.030
Token\n
Token ID198
Feature activation+3.45e-02

Pos logit contributions
 least+0.058
oh+0.045
 kidding+0.045
 wait+0.043
 ourselves+0.043
Neg logit contributions
ork-0.034
lights-0.034
les-0.032
adoes-0.032
rily-0.029
Token\n
Token ID198
Feature activation+3.08e-02

Pos logit contributions
 least+0.072
 kidding+0.059
 wait+0.057
 oh+0.056
oh+0.056
Neg logit contributions
ork-0.043
lights-0.042
les-0.042
adoes-0.036
rily-0.035
Token,
Token ID11
Feature activation-3.09e-02

Pos logit contributions
les+0.002
lights+0.002
ork+0.002
rily+0.002
adoes+0.002
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token it
Token ID340
Feature activation+9.31e-03

Pos logit contributions
les+0.061
ork+0.057
lights+0.057
rily+0.055
 tant+0.052
Neg logit contributions
 least-0.041
 oh-0.038
 say-0.037
oh-0.032
 wait-0.031
Token's
Token ID338
Feature activation+2.68e-02

Pos logit contributions
 least+0.014
oh+0.010
 oh+0.010
 kidding+0.010
 say+0.010
Neg logit contributions
les-0.017
lights-0.016
ork-0.016
adoes-0.015
rily-0.015
Token my
Token ID616
Feature activation+2.15e-02

Pos logit contributions
 least+0.041
oh+0.033
 elbows+0.030
 kidding+0.029
 Mir+0.029
Neg logit contributions
rily-0.043
les-0.042
lights-0.042
ork-0.042
adoes-0.041
Token favorite
Token ID4004
Feature activation-4.21e-03

Pos logit contributions
 least+0.040
oh+0.034
 Mir+0.031
 kidding+0.031
 elbows+0.031
Neg logit contributions
les-0.030
ork-0.028
lights-0.027
rily-0.027
adoes-0.026
Token toy
Token ID13373
Feature activation+7.26e-02

Pos logit contributions
les+0.006
lights+0.006
ork+0.005
rily+0.005
adoes+0.005
Neg logit contributions
 least-0.008
oh-0.007
 say-0.006
 oh-0.006
 kidding-0.006
Token!"
Token ID2474
Feature activation-7.06e-03

Pos logit contributions
 least+0.140
 kidding+0.128
 elbows+0.122
 Mir+0.117
 Mil+0.114
Neg logit contributions
les-0.095
ork-0.082
lights-0.081
adoes-0.079
rily-0.074
Token Lily
Token ID20037
Feature activation+3.17e-02

Pos logit contributions
les+0.011
lights+0.011
rily+0.010
ork+0.010
adoes+0.010
Neg logit contributions
 least-0.012
 say-0.010
oh-0.010
 oh-0.010
 kidding-0.009
Token replied
Token ID8712
Feature activation-1.49e-02

Pos logit contributions
 least+0.050
 worst+0.037
 kidding+0.036
 ourselves+0.036
oh+0.035
Neg logit contributions
les-0.054
ork-0.050
rily-0.049
ened-0.048
lf-0.048
Token.
Token ID13
Feature activation-6.23e-04

Pos logit contributions
les+0.028
lights+0.027
ork+0.026
adoes+0.026
rily+0.025
Neg logit contributions
 least-0.020
 say-0.016
oh-0.016
 oh-0.016
 kidding-0.015
Token␣
Token ID220
Feature activation+9.00e-03

Pos logit contributions
les+0.001
lights+0.001
ork+0.001
rily+0.001
adoes+0.001
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token.
Token ID13
Feature activation+1.14e-02

Pos logit contributions
les+0.001
lights+0.001
ork+0.001
rily+0.001
adoes+0.001
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token<|endoftext|>
Token ID50256
Feature activation-8.47e-03

Pos logit contributions
 least+0.019
oh+0.015
 oh+0.015
 say+0.015
 kidding+0.015
Neg logit contributions
les-0.018
lights-0.017
ork-0.017
rily-0.016
adoes-0.016
TokenJohn
Token ID7554
Feature activation+2.48e-03

Pos logit contributions
lights+0.016
les+0.016
rily+0.016
adoes+0.015
ork+0.015
Neg logit contributions
 least-0.011
oh-0.009
 say-0.008
 am-0.008
 oh-0.008
Token wanted
Token ID2227
Feature activation-1.08e-06

Pos logit contributions
 least+0.004
oh+0.003
 say+0.003
 oh+0.003
 kidding+0.003
Neg logit contributions
les-0.004
lights-0.004
ork-0.004
rily-0.004
adoes-0.004
Token a
Token ID257
Feature activation-1.62e-03

Feature not active on prev token;
no predictions were affected.
Token toy
Token ID13373
Feature activation+7.25e-02

Pos logit contributions
les+0.002
lights+0.002
ork+0.002
rily+0.002
adoes+0.002
Neg logit contributions
 least-0.003
oh-0.002
 say-0.002
 oh-0.002
 kidding-0.002
Token truck
Token ID7779
Feature activation+3.36e-02

Pos logit contributions
 least+0.126
oh+0.125
 elbows+0.117
 kidding+0.115
Ta+0.114
Neg logit contributions
ork-0.079
les-0.077
lights-0.077
rily-0.075
 soldier-0.074
Token.
Token ID13
Feature activation+1.20e-03

Pos logit contributions
 least+0.061
oh+0.051
 elbows+0.050
 kidding+0.049
 Mir+0.048
Neg logit contributions
les-0.043
lights-0.043
ork-0.041
rily-0.040
bell-0.039
Token His
Token ID2399
Feature activation+6.58e-03

Pos logit contributions
 least+0.002
oh+0.002
 say+0.002
 oh+0.002
 kidding+0.002
Neg logit contributions
les-0.002
lights-0.002
ork-0.002
rily-0.002
adoes-0.001
Token mom
Token ID1995
Feature activation+2.28e-03

Pos logit contributions
 least+0.012
oh+0.010
 say+0.009
 kidding+0.009
 oh+0.009
Neg logit contributions
les-0.009
lights-0.009
ork-0.009
rily-0.009
adoes-0.009
Token said
Token ID531
Feature activation+1.68e-02

Pos logit contributions
 least+0.003
oh+0.002
 say+0.002
 oh+0.002
 kidding+0.002
Neg logit contributions
les-0.004
lights-0.004
ork-0.004
rily-0.004
adoes-0.004
Token<|endoftext|>
Token ID50256
Feature activation-1.29e-02

Pos logit contributions
les+0.032
lights+0.031
rily+0.030
adoes+0.030
ork+0.029
Neg logit contributions
 least-0.021
oh-0.015
 say-0.014
 am-0.014
 oh-0.014
TokenMark
Token ID9704
Feature activation-6.71e-03

Pos logit contributions
lights+0.028
les+0.027
rily+0.027
adoes+0.027
ork+0.026
Neg logit contributions
 least-0.014
oh-0.011
 say-0.009
 am-0.009
 Mir-0.008
Token wanted
Token ID2227
Feature activation-2.92e-03

Pos logit contributions
lights+0.011
les+0.011
ork+0.010
rily+0.010
adoes+0.010
Neg logit contributions
 least-0.011
oh-0.009
 say-0.008
 kidding-0.008
 oh-0.008
Token a
Token ID257
Feature activation-9.56e-03

Pos logit contributions
les+0.005
lights+0.005
rily+0.005
ork+0.005
adoes+0.005
Neg logit contributions
 least-0.005
oh-0.004
 say-0.004
 oh-0.003
 kidding-0.003
Token new
Token ID649
Feature activation+1.50e-02

Pos logit contributions
les+0.014
lights+0.014
rily+0.013
adoes+0.013
ork+0.013
Neg logit contributions
 least-0.018
oh-0.014
 say-0.014
 oh-0.013
 kidding-0.013
Token toy
Token ID13373
Feature activation+7.22e-02

Pos logit contributions
 least+0.031
oh+0.026
 kidding+0.024
 ourselves+0.024
 say+0.024
Neg logit contributions
les-0.017
lights-0.017
ork-0.017
rily-0.016
adoes-0.016
Token but
Token ID475
Feature activation-8.02e-03

Pos logit contributions
 least+0.129
oh+0.126
 elbows+0.117
 kidding+0.115
Ta+0.113
Neg logit contributions
lights-0.080
les-0.079
ork-0.079
rily-0.074
bell-0.074
Token couldn
Token ID3521
Feature activation-1.53e-02

Pos logit contributions
les+0.014
lights+0.014
rily+0.013
ork+0.013
adoes+0.013
Neg logit contributions
 least-0.013
oh-0.010
 say-0.009
 oh-0.009
 kidding-0.009
Tokenâ
Token ID22940
Feature activation+3.97e-02

Pos logit contributions
les+0.026
lights+0.026
rily+0.025
adoes+0.024
ork+0.023
Neg logit contributions
 least-0.025
oh-0.020
 say-0.019
 am-0.019
 oh-0.018
Token
Token ID26391
Feature activation-1.40e-02

Pos logit contributions
 least+0.049
 oh+0.042
 Mir+0.039
oh+0.037
 ourselves+0.036
Neg logit contributions
les-0.071
ork-0.070
bell-0.064
lights-0.063
lf-0.063
Token
Token ID8151
Feature activation-1.51e-02

Pos logit contributions
les+0.025
lights+0.024
rily+0.024
adoes+0.023
ork+0.022
Neg logit contributions
 least-0.022
oh-0.017
 say-0.016
 am-0.016
 oh-0.015
Token him
Token ID683
Feature activation-1.97e-02

Pos logit contributions
 least+0.003
oh+0.002
 say+0.002
 oh+0.002
 kidding+0.002
Neg logit contributions
les-0.003
lights-0.003
ork-0.003
rily-0.003
adoes-0.003
Token it
Token ID340
Feature activation+1.38e-02

Pos logit contributions
les+0.038
lights+0.038
rily+0.037
adoes+0.035
ork+0.035
Neg logit contributions
 least-0.026
 say-0.019
oh-0.019
 oh-0.019
 am-0.018
Token was
Token ID373
Feature activation+2.89e-04

Pos logit contributions
 least+0.024
oh+0.018
 kidding+0.018
 say+0.018
 oh+0.017
Neg logit contributions
les-0.022
ork-0.021
lights-0.021
adoes-0.020
rily-0.020
Token not
Token ID407
Feature activation+9.78e-04

Pos logit contributions
 least+0.000
oh+0.000
 say+0.000
 oh+0.000
 kidding+0.000
Neg logit contributions
les-0.001
lights-0.001
ork-0.000
rily-0.000
adoes-0.000
Token a
Token ID257
Feature activation+2.64e-02

Pos logit contributions
 least+0.001
oh+0.001
 say+0.001
 oh+0.001
 kidding+0.001
Neg logit contributions
les-0.002
lights-0.002
ork-0.002
rily-0.002
adoes-0.002
Token toy
Token ID13373
Feature activation+7.20e-02

Pos logit contributions
 least+0.045
oh+0.035
 ourselves+0.033
 elbows+0.033
 kidding+0.033
Neg logit contributions
ork-0.041
les-0.040
lights-0.037
adoes-0.036
 tant-0.036
Token.
Token ID13
Feature activation+1.48e-02

Pos logit contributions
Ta+0.143
 least+0.143
 elbows+0.135
oh+0.134
 kidding+0.129
Neg logit contributions
les-0.072
lights-0.061
ork-0.057
adoes-0.053
 anymore-0.050
Token But
Token ID887
Feature activation+2.23e-02

Pos logit contributions
 least+0.027
oh+0.021
 kidding+0.020
 say+0.020
 worst+0.019
Neg logit contributions
lights-0.022
les-0.022
adoes-0.021
ork-0.021
rily-0.020
Token Tim
Token ID5045
Feature activation+8.04e-03

Pos logit contributions
 least+0.041
oh+0.036
 am+0.032
 say+0.031
 kidding+0.031
Neg logit contributions
lights-0.028
adoes-0.027
ened-0.026
les-0.026
lf-0.025
Token did
Token ID750
Feature activation-1.07e-02

Pos logit contributions
 least+0.013
oh+0.010
 oh+0.010
 kidding+0.010
 say+0.010
Neg logit contributions
les-0.013
ork-0.012
lights-0.012
rily-0.012
adoes-0.012
Token not
Token ID407
Feature activation-1.30e-02

Pos logit contributions
lights+0.021
les+0.021
rily+0.021
ork+0.020
adoes+0.019
Neg logit contributions
 least-0.014
 say-0.011
oh-0.011
 oh-0.010
 kidding-0.010
Token mom
Token ID1995
Feature activation-6.44e-03

Pos logit contributions
 least+0.006
oh+0.005
 say+0.005
 kidding+0.005
 oh+0.005
Neg logit contributions
les-0.005
lights-0.005
ork-0.005
rily-0.004
adoes-0.004
Token gave
Token ID2921
Feature activation+2.86e-03

Pos logit contributions
lights+0.011
les+0.011
ork+0.011
rily+0.011
adoes+0.010
Neg logit contributions
 least-0.010
oh-0.008
 say-0.008
 oh-0.007
 kidding-0.007
Token her
Token ID607
Feature activation+1.16e-02

Pos logit contributions
 least+0.004
oh+0.003
 say+0.003
 kidding+0.003
 oh+0.003
Neg logit contributions
les-0.006
lights-0.005
ork-0.005
rily-0.005
adoes-0.005
Token a
Token ID257
Feature activation+1.79e-02

Pos logit contributions
 least+0.014
oh+0.011
 kidding+0.010
 say+0.009
 oh+0.009
Neg logit contributions
les-0.024
lights-0.023
ork-0.022
rily-0.022
adoes-0.022
Token new
Token ID649
Feature activation+3.71e-02

Pos logit contributions
 least+0.034
oh+0.028
 ourselves+0.026
 kidding+0.026
 oh+0.026
Neg logit contributions
les-0.025
ork-0.024
lights-0.023
rily-0.022
 tant-0.022
Token toy
Token ID13373
Feature activation+7.18e-02

Pos logit contributions
 least+0.073
oh+0.063
 ourselves+0.058
 elbows+0.057
 kidding+0.056
Neg logit contributions
les-0.047
ork-0.046
lights-0.043
rily-0.041
 tant-0.039
Token.
Token ID13
Feature activation-1.57e-02

Pos logit contributions
 least+0.136
oh+0.125
 kidding+0.121
 elbows+0.118
 ourselves+0.112
Neg logit contributions
les-0.083
ork-0.079
lights-0.078
rily-0.073
bell-0.070
Token It
Token ID632
Feature activation+2.69e-02

Pos logit contributions
les+0.027
lights+0.026
rily+0.025
adoes+0.024
ork+0.024
Neg logit contributions
 least-0.025
oh-0.020
 say-0.020
 oh-0.019
 Mir-0.019
Token was
Token ID373
Feature activation+3.13e-02

Pos logit contributions
 least+0.053
 kidding+0.045
 ourselves+0.042
 elbows+0.042
oh+0.041
Neg logit contributions
les-0.036
lights-0.033
ork-0.032
rily-0.031
ened-0.031
Token a
Token ID257
Feature activation+2.11e-02

Pos logit contributions
 least+0.048
oh+0.040
 kidding+0.034
 ourselves+0.034
 am+0.033
Neg logit contributions
lights-0.053
rily-0.051
ork-0.050
les-0.050
adoes-0.048
Token little
Token ID1310
Feature activation+4.65e-02

Pos logit contributions
 least+0.039
oh+0.031
 ourselves+0.030
 kidding+0.030
 say+0.030
Neg logit contributions
ork-0.029
les-0.028
lights-0.028
rily-0.027
pron-0.025
Token and
Token ID290
Feature activation-9.77e-03

Pos logit contributions
 least+0.024
oh+0.018
 oh+0.017
 kidding+0.017
 say+0.017
Neg logit contributions
les-0.026
lights-0.025
ork-0.024
rily-0.024
adoes-0.024
Token get
Token ID651
Feature activation+1.44e-02

Pos logit contributions
les+0.019
lights+0.019
ork+0.018
rily+0.018
adoes+0.017
Neg logit contributions
 least-0.014
 say-0.010
oh-0.010
 oh-0.009
 kidding-0.009
Token to
Token ID284
Feature activation+1.41e-02

Pos logit contributions
 least+0.021
oh+0.017
 kidding+0.015
 say+0.015
 am+0.015
Neg logit contributions
les-0.026
lights-0.025
rily-0.024
ork-0.024
adoes-0.024
Token the
Token ID262
Feature activation+4.40e-02

Pos logit contributions
 least+0.020
oh+0.016
 oh+0.014
 kidding+0.014
 am+0.013
Neg logit contributions
les-0.025
ork-0.024
lights-0.024
rily-0.024
adoes-0.024
Token water
Token ID1660
Feature activation+9.88e-03

Pos logit contributions
 least+0.070
oh+0.061
 kidding+0.061
 elbows+0.061
 oh+0.058
Neg logit contributions
ork-0.057
les-0.056
rily-0.051
lights-0.048
adoes-0.046
Token."
Token ID526
Feature activation+7.12e-02

Pos logit contributions
 least+0.016
oh+0.012
 oh+0.012
 kidding+0.012
 say+0.012
Neg logit contributions
les-0.016
lights-0.016
rily-0.015
ork-0.015
adoes-0.015
Token Sam
Token ID3409
Feature activation+1.97e-02

Pos logit contributions
 least+0.143
oh+0.125
 longest+0.124
 tallest+0.124
 worst+0.124
Neg logit contributions
ork-0.066
 One-0.052
 All-0.050
les-0.048
bell-0.047
Token whispered
Token ID25029
Feature activation+9.23e-03

Pos logit contributions
 least+0.026
oh+0.018
 ourselves+0.018
 kidding+0.017
 am+0.017
Neg logit contributions
les-0.040
lights-0.037
rily-0.036
ork-0.036
adoes-0.035
Token,
Token ID11
Feature activation-2.15e-02

Pos logit contributions
 least+0.015
oh+0.011
 say+0.010
 kidding+0.010
 am+0.010
Neg logit contributions
les-0.016
lights-0.016
rily-0.015
ork-0.015
adoes-0.015
Token trying
Token ID2111
Feature activation+1.15e-03

Pos logit contributions
les+0.044
lights+0.043
ork+0.043
adoes+0.040
rily+0.039
Neg logit contributions
 least-0.027
 say-0.021
 oh-0.020
oh-0.020
 kidding-0.019
Token to
Token ID284
Feature activation-5.64e-03

Pos logit contributions
 least+0.002
oh+0.002
 say+0.001
 oh+0.001
 kidding+0.001
Neg logit contributions
les-0.002
lights-0.002
ork-0.002
rily-0.002
adoes-0.002
Token the
Token ID262
Feature activation+1.18e-02

Pos logit contributions
les+0.019
lights+0.019
ork+0.019
rily+0.018
adoes+0.018
Neg logit contributions
 least-0.013
oh-0.010
 say-0.009
 Mir-0.009
 oh-0.009
Token door
Token ID3420
Feature activation+1.36e-02

Pos logit contributions
 least+0.024
oh+0.020
 oh+0.019
 kidding+0.019
 say+0.019
Neg logit contributions
les-0.014
ork-0.014
lights-0.013
rily-0.012
adoes-0.012
Token and
Token ID290
Feature activation+2.35e-03

Pos logit contributions
 least+0.027
oh+0.022
 kidding+0.022
 oh+0.021
 say+0.021
Neg logit contributions
lights-0.018
les-0.018
ork-0.017
rily-0.016
adoes-0.015
Token there
Token ID612
Feature activation+2.33e-02

Pos logit contributions
 least+0.004
oh+0.003
 say+0.003
 kidding+0.003
 oh+0.003
Neg logit contributions
les-0.004
lights-0.004
ork-0.004
rily-0.004
adoes-0.004
Token was
Token ID373
Feature activation+2.31e-02

Pos logit contributions
 least+0.044
 kidding+0.033
oh+0.032
 ourselves+0.031
 say+0.031
Neg logit contributions
les-0.035
ork-0.034
lights-0.032
rily-0.032
ened-0.030
Token an
Token ID281
Feature activation+7.11e-02

Pos logit contributions
 least+0.032
oh+0.025
 say+0.023
 kidding+0.022
 ourselves+0.022
Neg logit contributions
lights-0.043
ork-0.041
les-0.041
adoes-0.041
rily-0.040
Token old
Token ID1468
Feature activation+2.38e-02

Pos logit contributions
 least+0.133
oh+0.129
 am+0.122
'm+0.115
 ourselves+0.114
Neg logit contributions
les-0.085
rily-0.084
lf-0.083
adoes-0.076
lights-0.072
Token lady
Token ID10846
Feature activation+1.12e-02

Pos logit contributions
 least+0.048
oh+0.039
 ourselves+0.037
 kidding+0.036
 oh+0.034
Neg logit contributions
les-0.031
ork-0.029
lights-0.029
adoes-0.027
rily-0.027
Token.
Token ID13
Feature activation-2.03e-03

Pos logit contributions
 least+0.019
oh+0.014
 kidding+0.014
 oh+0.014
 say+0.014
Neg logit contributions
les-0.018
lights-0.017
ork-0.017
rily-0.017
adoes-0.016
Token
Token ID198
Feature activation+5.74e-03

Pos logit contributions
les+0.004
lights+0.004
ork+0.004
rily+0.004
adoes+0.004
Neg logit contributions
 least-0.003
oh-0.002
 say-0.002
 oh-0.002
 kidding-0.002
Token
Token ID198
Feature activation+2.93e-03

Pos logit contributions
 least+0.009
oh+0.007
 kidding+0.007
 oh+0.007
 say+0.007
Neg logit contributions
les-0.010
lights-0.009
ork-0.009
rily-0.009
adoes-0.009
Token.
Token ID13
Feature activation-9.37e-03

Pos logit contributions
 least+0.004
oh+0.003
 say+0.003
 kidding+0.003
 oh+0.003
Neg logit contributions
les-0.003
lights-0.003
ork-0.003
rily-0.003
adoes-0.003
Token He
Token ID679
Feature activation+1.97e-03

Pos logit contributions
les+0.017
lights+0.016
rily+0.015
ork+0.015
adoes+0.015
Neg logit contributions
 least-0.014
oh-0.011
 say-0.011
 oh-0.010
 Mir-0.010
Token requested
Token ID9167
Feature activation-2.76e-03

Pos logit contributions
 least+0.003
oh+0.002
 say+0.002
 kidding+0.002
 oh+0.002
Neg logit contributions
les-0.003
lights-0.003
ork-0.003
rily-0.003
adoes-0.003
Token a
Token ID257
Feature activation-3.89e-04

Pos logit contributions
les+0.005
lights+0.005
rily+0.005
ork+0.005
adoes+0.004
Neg logit contributions
 least-0.004
oh-0.003
 say-0.003
 oh-0.003
 kidding-0.003
Token soft
Token ID2705
Feature activation-1.42e-02

Pos logit contributions
les+0.001
lights+0.001
ork+0.001
rily+0.001
adoes+0.001
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token toy
Token ID13373
Feature activation+7.07e-02

Pos logit contributions
lights+0.023
les+0.022
rily+0.021
ork+0.021
adoes+0.020
Neg logit contributions
 least-0.024
 say-0.019
oh-0.019
 oh-0.018
 kidding-0.018
Token,
Token ID11
Feature activation+5.20e-03

Pos logit contributions
 least+0.140
oh+0.131
 kidding+0.122
 elbows+0.121
 ourselves+0.115
Neg logit contributions
ork-0.074
les-0.072
adoes-0.068
 soldier-0.067
bell-0.065
Token but
Token ID475
Feature activation+1.67e-03

Pos logit contributions
 least+0.007
oh+0.005
 say+0.005
 kidding+0.004
 oh+0.004
Neg logit contributions
les-0.010
lights-0.010
ork-0.010
rily-0.010
adoes-0.009
Token the
Token ID262
Feature activation-1.62e-04

Pos logit contributions
 least+0.003
oh+0.002
 say+0.002
 kidding+0.002
 oh+0.002
Neg logit contributions
les-0.003
lights-0.003
ork-0.003
rily-0.002
adoes-0.002
Token shop
Token ID6128
Feature activation-3.65e-04

Pos logit contributions
les+0.000
lights+0.000
ork+0.000
rily+0.000
adoes+0.000
Neg logit contributions
 least-0.000
oh-0.000
 say-0.000
 oh-0.000
 kidding-0.000
Tokenkeeper
Token ID13884
Feature activation-3.82e-03

Pos logit contributions
les+0.000
lights+0.000
ork+0.000
rily+0.000
adoes+0.000
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token usual
Token ID6678
Feature activation-1.25e-02

Pos logit contributions
 least+0.055
oh+0.046
 kidding+0.044
 oh+0.044
 say+0.043
Neg logit contributions
les-0.041
lights-0.039
ork-0.038
adoes-0.036
rily-0.034
Token area
Token ID1989
Feature activation+1.45e-02

Pos logit contributions
les+0.020
lights+0.020
rily+0.019
ork+0.019
adoes+0.018
Neg logit contributions
 least-0.021
oh-0.017
 say-0.016
 oh-0.015
 kidding-0.015
Token.
Token ID13
Feature activation-1.04e-02

Pos logit contributions
 least+0.026
oh+0.021
 kidding+0.020
 say+0.020
 oh+0.019
Neg logit contributions
les-0.022
lights-0.022
rily-0.020
ork-0.020
adoes-0.020
Token She
Token ID1375
Feature activation+1.11e-03

Pos logit contributions
les+0.019
lights+0.019
ork+0.018
rily+0.018
adoes+0.018
Neg logit contributions
 least-0.015
oh-0.012
 say-0.011
 oh-0.010
 Mir-0.010
Token reached
Token ID4251
Feature activation+3.15e-02

Pos logit contributions
 least+0.002
oh+0.001
 say+0.001
 oh+0.001
 kidding+0.001
Neg logit contributions
les-0.002
lights-0.002
ork-0.002
rily-0.002
adoes-0.002
Token an
Token ID281
Feature activation+7.06e-02

Pos logit contributions
 least+0.055
oh+0.049
 oh+0.044
 kidding+0.043
 ourselves+0.042
Neg logit contributions
les-0.047
lights-0.046
rily-0.044
ork-0.042
ened-0.041
Token expensive
Token ID5789
Feature activation+1.57e-02

Pos logit contributions
oh+0.126
 least+0.125
 am+0.123
'm+0.116
 ourselves+0.114
Neg logit contributions
les-0.085
rily-0.081
bell-0.079
ened-0.075
lights-0.075
Token part
Token ID636
Feature activation+1.89e-02

Pos logit contributions
 least+0.029
oh+0.023
 say+0.022
 kidding+0.022
 oh+0.022
Neg logit contributions
les-0.023
lights-0.022
ork-0.021
rily-0.021
adoes-0.020
Token of
Token ID286
Feature activation+3.88e-03

Pos logit contributions
 least+0.041
oh+0.034
 kidding+0.033
 oh+0.033
 say+0.033
Neg logit contributions
les-0.021
lights-0.021
ork-0.019
rily-0.019
adoes-0.018
Token town
Token ID3240
Feature activation+1.41e-02

Pos logit contributions
 least+0.005
oh+0.004
 say+0.004
 oh+0.004
 kidding+0.004
Neg logit contributions
les-0.007
lights-0.007
ork-0.007
rily-0.007
adoes-0.007
Token and
Token ID290
Feature activation-1.61e-03

Pos logit contributions
 least+0.024
oh+0.019
 say+0.018
 kidding+0.018
 oh+0.018
Neg logit contributions
les-0.022
lights-0.022
ork-0.021
rily-0.021
adoes-0.020
Token thinks
Token ID6834
Feature activation-4.78e-03

Pos logit contributions
 least+0.034
oh+0.028
 oh+0.027
 kidding+0.026
 ourselves+0.026
Neg logit contributions
les-0.031
lights-0.028
ork-0.027
rily-0.026
ened-0.026
Token it
Token ID340
Feature activation+2.06e-02

Pos logit contributions
les+0.007
lights+0.007
ork+0.006
rily+0.006
adoes+0.006
Neg logit contributions
 least-0.009
oh-0.007
 say-0.007
 oh-0.007
 kidding-0.007
Token is
Token ID318
Feature activation+3.26e-02

Pos logit contributions
 least+0.035
 kidding+0.028
 oh+0.027
oh+0.027
 say+0.026
Neg logit contributions
les-0.033
lights-0.031
ork-0.030
adoes-0.029
rily-0.028
Token a
Token ID257
Feature activation+2.80e-02

Pos logit contributions
 least+0.051
oh+0.046
 elbows+0.038
 kidding+0.037
 oh+0.036
Neg logit contributions
lights-0.053
les-0.050
rily-0.048
adoes-0.048
ork-0.046
Token shiny
Token ID22441
Feature activation+8.98e-04

Pos logit contributions
 least+0.048
oh+0.039
 kidding+0.037
 ourselves+0.037
 elbows+0.036
Neg logit contributions
ork-0.041
les-0.039
lights-0.039
pron-0.038
 tant-0.037
Token toy
Token ID13373
Feature activation+7.04e-02

Pos logit contributions
 least+0.002
oh+0.001
 say+0.001
 oh+0.001
 kidding+0.001
Neg logit contributions
les-0.001
lights-0.001
ork-0.001
rily-0.001
adoes-0.001
Token that
Token ID326
Feature activation-6.39e-03

Pos logit contributions
 least+0.138
oh+0.128
 kidding+0.125
 elbows+0.124
 ourselves+0.116
Neg logit contributions
les-0.078
lights-0.074
ork-0.068
rily-0.065
bell-0.062
Token makes
Token ID1838
Feature activation+4.10e-03

Pos logit contributions
les+0.011
lights+0.011
rily+0.010
ork+0.010
adoes+0.010
Neg logit contributions
 least-0.010
oh-0.008
 say-0.008
 oh-0.007
 am-0.007
Token his
Token ID465
Feature activation+2.05e-02

Pos logit contributions
 least+0.007
oh+0.005
 say+0.005
 kidding+0.005
 oh+0.005
Neg logit contributions
les-0.007
lights-0.007
ork-0.006
rily-0.006
adoes-0.006
Token dad
Token ID9955
Feature activation+2.92e-03

Pos logit contributions
 least+0.038
oh+0.033
 kidding+0.030
 oh+0.029
 Mir+0.029
Neg logit contributions
les-0.028
lights-0.028
rily-0.026
ork-0.026
 tant-0.026
Token's
Token ID338
Feature activation+8.85e-03

Pos logit contributions
 least+0.004
oh+0.003
 say+0.003
 oh+0.003
 kidding+0.003
Neg logit contributions
les-0.005
lights-0.005
ork-0.005
rily-0.005
adoes-0.005
Token not
Token ID407
Feature activation-1.17e-02

Pos logit contributions
les+0.014
lights+0.014
rily+0.013
ork+0.013
adoes+0.013
Neg logit contributions
 least-0.010
oh-0.008
 say-0.007
 oh-0.007
 kidding-0.007
Token played
Token ID2826
Feature activation-3.05e-02

Pos logit contributions
lights+0.022
les+0.022
rily+0.021
ork+0.020
adoes+0.020
Neg logit contributions
 least-0.017
 say-0.013
oh-0.012
 oh-0.012
 kidding-0.012
Token with
Token ID351
Feature activation-2.89e-02

Pos logit contributions
lights+0.054
les+0.054
rily+0.049
ened+0.048
ork+0.047
Neg logit contributions
 least-0.042
 oh-0.036
oh-0.035
 very-0.034
 am-0.034
Token for
Token ID329
Feature activation-5.04e-03

Pos logit contributions
lights+0.052
les+0.052
rily+0.051
ork+0.049
adoes+0.048
Neg logit contributions
 least-0.040
oh-0.030
 say-0.029
 oh-0.029
 ourselves-0.029
Token a
Token ID257
Feature activation-5.05e-02

Pos logit contributions
les+0.010
lights+0.010
rily+0.010
ork+0.010
adoes+0.010
Neg logit contributions
 least-0.006
oh-0.004
 say-0.004
 oh-0.004
 kidding-0.004
Token long
Token ID890
Feature activation-9.39e-02

Pos logit contributions
lights+0.087
adoes+0.087
les+0.081
rily+0.079
ened+0.076
Neg logit contributions
 least-0.077
 very-0.066
oh-0.064
 wait-0.064
 say-0.062
Token time
Token ID640
Feature activation-7.34e-03

Pos logit contributions
rily+0.170
lights+0.164
adoes+0.157
ocked+0.148
aining+0.139
Neg logit contributions
 least-0.107
 wait-0.102
 say-0.099
 very-0.098
 am-0.093
Token.
Token ID13
Feature activation-6.95e-03

Pos logit contributions
les+0.012
lights+0.012
ork+0.012
rily+0.012
adoes+0.012
Neg logit contributions
 least-0.012
oh-0.009
 say-0.009
 oh-0.008
 kidding-0.008
Token It
Token ID632
Feature activation-4.48e-03

Pos logit contributions
les+0.013
lights+0.013
rily+0.013
ork+0.013
adoes+0.012
Neg logit contributions
 least-0.010
oh-0.007
 say-0.007
 oh-0.007
 kidding-0.006
Token was
Token ID373
Feature activation+6.71e-03

Pos logit contributions
les+0.008
lights+0.008
rily+0.008
ork+0.008
adoes+0.007
Neg logit contributions
 least-0.007
oh-0.005
 say-0.005
 oh-0.005
 am-0.005
Token a
Token ID257
Feature activation+6.44e-03

Pos logit contributions
 least+0.009
oh+0.007
 say+0.006
 kidding+0.006
 oh+0.006
Neg logit contributions
les-0.013
lights-0.013
ork-0.012
rily-0.012
adoes-0.012
Token red
Token ID2266
Feature activation-1.02e-02

Pos logit contributions
les+0.002
lights+0.002
ork+0.002
rily+0.002
adoes+0.002
Neg logit contributions
 least-0.002
oh-0.002
 say-0.001
 oh-0.001
 kidding-0.001
Token bike
Token ID7161
Feature activation+4.68e-03

Pos logit contributions
lights+0.015
les+0.015
rily+0.014
ork+0.014
adoes+0.014
Neg logit contributions
 least-0.018
oh-0.015
 say-0.014
 oh-0.014
 kidding-0.014
Token with
Token ID351
Feature activation-1.30e-02

Pos logit contributions
 least+0.008
oh+0.007
 say+0.006
 kidding+0.006
 oh+0.006
Neg logit contributions
les-0.007
lights-0.007
ork-0.007
rily-0.007
adoes-0.006
Token a
Token ID257
Feature activation-3.72e-02

Pos logit contributions
lights+0.027
les+0.026
ork+0.026
rily+0.025
adoes+0.025
Neg logit contributions
 least-0.015
oh-0.011
 say-0.011
 oh-0.010
 Mir-0.010
Token com
Token ID401
Feature activation-3.48e-02

Pos logit contributions
lights+0.067
les+0.066
rily+0.064
adoes+0.062
ened+0.058
Neg logit contributions
 least-0.049
 say-0.044
 very-0.042
oh-0.041
 am-0.040
Tokenfy
Token ID24928
Feature activation-9.23e-02

Pos logit contributions
adoes+0.053
lights+0.053
lf+0.050
rily+0.049
les+0.047
Neg logit contributions
 least-0.056
oh-0.049
 say-0.048
 wait-0.045
 kidding-0.043
Token seat
Token ID5852
Feature activation-1.89e-02

Pos logit contributions
lights+0.188
rily+0.177
ust+0.167
les+0.167
ide+0.166
Neg logit contributions
 least-0.078
 say-0.075
 wait-0.073
 h-0.072
 very-0.072
Token and
Token ID290
Feature activation-4.30e-03

Pos logit contributions
les+0.033
lights+0.032
adoes+0.031
rily+0.031
ork+0.030
Neg logit contributions
 least-0.028
oh-0.022
 say-0.021
 oh-0.020
 am-0.020
Token shiny
Token ID22441
Feature activation-4.44e-02

Pos logit contributions
les+0.009
lights+0.009
ork+0.008
rily+0.008
adoes+0.008
Neg logit contributions
 least-0.005
oh-0.004
 say-0.004
 oh-0.003
 kidding-0.003
Token silver
Token ID8465
Feature activation-3.40e-02

Pos logit contributions
ork+0.077
lights+0.076
ide+0.074
rily+0.073
les+0.072
Neg logit contributions
 least-0.051
 say-0.044
 oh-0.042
 very-0.041
 most-0.040
Token bell
Token ID8966
Feature activation-1.99e-02

Pos logit contributions
lights+0.055
les+0.054
ork+0.053
rily+0.052
adoes+0.051
Neg logit contributions
 least-0.049
 say-0.039
 oh-0.038
oh-0.038
 am-0.037
Token Billy
Token ID15890
Feature activation-3.48e-03

Pos logit contributions
les+0.063
rily+0.062
lights+0.061
ork+0.059
urbed+0.056
Neg logit contributions
 least-0.041
 say-0.036
oh-0.031
 am-0.031
 oh-0.030
Token was
Token ID373
Feature activation-2.66e-03

Pos logit contributions
lights+0.006
les+0.006
ork+0.006
rily+0.006
adoes+0.006
Neg logit contributions
 least-0.005
oh-0.004
 say-0.004
 oh-0.004
 kidding-0.004
Token wearing
Token ID5762
Feature activation-4.68e-02

Pos logit contributions
les+0.005
lights+0.005
ork+0.005
rily+0.005
adoes+0.005
Neg logit contributions
 least-0.003
oh-0.002
 say-0.002
 oh-0.002
 kidding-0.002
Token a
Token ID257
Feature activation-7.49e-02

Pos logit contributions
les+0.128
lights+0.126
ork+0.125
rily+0.124
anna+0.120
Neg logit contributions
 least-0.022
 very-0.015
 most-0.010
 oh-0.010
 say-0.010
Token really
Token ID1107
Feature activation-4.41e-02

Pos logit contributions
lights+0.161
adoes+0.155
urbed+0.154
les+0.152
rily+0.152
Neg logit contributions
 very-0.076
 say-0.069
 best-0.065
 least-0.061
 oh-0.057
Token cool
Token ID3608
Feature activation-9.01e-02

Pos logit contributions
lights+0.088
rily+0.088
les+0.086
adoes+0.086
urbed+0.081
Neg logit contributions
 least-0.044
 say-0.042
 very-0.042
 am-0.040
 oh-0.039
Token shirt
Token ID10147
Feature activation-2.10e-02

Pos logit contributions
rily+0.187
lights+0.187
urbed+0.183
ide+0.171
ust+0.170
Neg logit contributions
 say-0.074
 very-0.074
 least-0.059
 waiting-0.058
 am-0.057
Token with
Token ID351
Feature activation-3.24e-02

Pos logit contributions
lights+0.038
les+0.037
adoes+0.036
ork+0.036
rily+0.036
Neg logit contributions
 least-0.031
oh-0.023
 say-0.023
 oh-0.021
 kidding-0.021
Token a
Token ID257
Feature activation-5.60e-02

Pos logit contributions
lights+0.068
ork+0.067
les+0.066
rily+0.065
adoes+0.062
Neg logit contributions
 least-0.037
 say-0.028
 oh-0.026
 very-0.026
oh-0.024
Token dinosaur
Token ID28257
Feature activation-4.80e-02

Pos logit contributions
lights+0.109
les+0.105
rily+0.103
adoes+0.102
ide+0.101
Neg logit contributions
 say-0.063
 least-0.063
 very-0.061
 am-0.055
 best-0.054
Token on
Token ID319
Feature activation-7.56e-02

Pos logit contributions
lights+0.095
rily+0.087
adoes+0.087
les+0.084
ened+0.082
Neg logit contributions
 least-0.062
oh-0.048
 say-0.048
 very-0.046
 oh-0.043
Token the
Token ID262
Feature activation+2.83e-02

Pos logit contributions
 least+0.006
oh+0.004
 say+0.004
 oh+0.004
 kidding+0.004
Neg logit contributions
les-0.007
lights-0.007
ork-0.007
rily-0.007
adoes-0.007
Token beach
Token ID10481
Feature activation-2.47e-03

Pos logit contributions
 least+0.049
oh+0.043
 kidding+0.040
 say+0.039
 oh+0.039
Neg logit contributions
les-0.039
ork-0.038
lights-0.038
adoes-0.036
rily-0.036
Token and
Token ID290
Feature activation-1.88e-02

Pos logit contributions
les+0.004
lights+0.004
ork+0.004
rily+0.004
adoes+0.004
Neg logit contributions
 least-0.004
oh-0.003
 say-0.003
 oh-0.003
 kidding-0.003
Token build
Token ID1382
Feature activation+6.23e-03

Pos logit contributions
les+0.036
lights+0.035
ork+0.034
rily+0.034
adoes+0.033
Neg logit contributions
 least-0.028
 say-0.021
oh-0.020
 wait-0.019
 am-0.019
Token sand
Token ID6450
Feature activation-2.17e-02

Pos logit contributions
 least+0.009
oh+0.007
 say+0.006
 oh+0.006
 kidding+0.006
Neg logit contributions
les-0.012
lights-0.011
ork-0.011
rily-0.011
adoes-0.011
Tokencast
Token ID2701
Feature activation-8.99e-02

Pos logit contributions
les+0.030
ork+0.030
lights+0.029
adoes+0.029
rily+0.028
Neg logit contributions
 least-0.038
oh-0.031
 say-0.031
 ourselves-0.029
 am-0.028
Tokenles
Token ID829
Feature activation-2.34e-02

Pos logit contributions
adoes+0.082
 former+0.068
lights+0.067
 aids+0.062
igsaw+0.058
Neg logit contributions
 least-0.178
oh-0.166
 are-0.166
ila-0.161
 say-0.156
Token.
Token ID13
Feature activation-9.47e-03

Pos logit contributions
lights+0.040
les+0.040
ork+0.039
adoes+0.038
rily+0.038
Neg logit contributions
 least-0.034
oh-0.027
 say-0.026
 oh-0.024
 ourselves-0.024
Token
Token ID220
Feature activation-1.93e-02

Pos logit contributions
les+0.016
lights+0.015
ork+0.015
rily+0.015
adoes+0.014
Neg logit contributions
 least-0.015
oh-0.012
 say-0.012
 kidding-0.011
 oh-0.011
Token
Token ID198
Feature activation-6.77e-03

Pos logit contributions
les+0.038
lights+0.037
adoes+0.036
rily+0.036
ork+0.035
Neg logit contributions
 least-0.025
oh-0.019
 say-0.018
 am-0.017
 Mir-0.017
Token
Token ID198
Feature activation-8.17e-03

Pos logit contributions
les+0.012
lights+0.012
rily+0.012
adoes+0.012
ork+0.012
Neg logit contributions
 least-0.010
oh-0.008
 say-0.007
 oh-0.007
 kidding-0.007
Token play
Token ID711
Feature activation+4.27e-04

Pos logit contributions
lights+0.037
les+0.036
rily+0.035
ork+0.034
adoes+0.034
Neg logit contributions
 least-0.025
 say-0.020
 wait-0.017
 am-0.017
oh-0.017
Token.
Token ID13
Feature activation-2.50e-02

Pos logit contributions
 least+0.001
oh+0.001
 say+0.001
 oh+0.001
 kidding+0.001
Neg logit contributions
les-0.001
lights-0.001
ork-0.001
rily-0.001
adoes-0.001
Token Lily
Token ID20037
Feature activation+1.27e-02

Pos logit contributions
les+0.047
lights+0.046
rily+0.044
adoes+0.043
ork+0.043
Neg logit contributions
 least-0.034
 say-0.028
 Mir-0.026
oh-0.026
 oh-0.025
Token saw
Token ID2497
Feature activation+2.11e-02

Pos logit contributions
 least+0.022
oh+0.017
 kidding+0.016
 ourselves+0.016
 oh+0.016
Neg logit contributions
les-0.021
lights-0.019
ork-0.019
rily-0.018
adoes-0.018
Token a
Token ID257
Feature activation-7.45e-03

Pos logit contributions
 least+0.031
oh+0.025
 kidding+0.022
 am+0.021
 say+0.021
Neg logit contributions
les-0.038
lights-0.037
adoes-0.035
ork-0.035
rily-0.034
Token merry
Token ID44570
Feature activation-8.96e-02

Pos logit contributions
les+0.012
lights+0.012
rily+0.011
ork+0.011
adoes+0.011
Neg logit contributions
 least-0.012
oh-0.010
 say-0.010
 oh-0.009
 kidding-0.009
Token-
Token ID12
Feature activation-1.77e-02

Pos logit contributions
lights+0.175
rily+0.174
urbed+0.159
adoes+0.155
 journals+0.150
Neg logit contributions
 say-0.093
 wait-0.090
 least-0.089
 oh-0.087
oh-0.081
Tokengo
Token ID2188
Feature activation-5.08e-03

Pos logit contributions
lights+0.030
les+0.029
adoes+0.028
rily+0.028
ork+0.027
Neg logit contributions
 least-0.027
oh-0.024
 say-0.022
 oh-0.021
 wait-0.020
Token-
Token ID12
Feature activation-2.02e-02

Pos logit contributions
lights+0.005
les+0.005
rily+0.005
ork+0.005
adoes+0.005
Neg logit contributions
 least-0.011
oh-0.010
 say-0.009
 oh-0.009
 kidding-0.009
Tokenround
Token ID744
Feature activation+5.06e-03

Pos logit contributions
lights+0.025
les+0.023
rily+0.023
adoes+0.023
ened+0.021
Neg logit contributions
 least-0.040
oh-0.037
 say-0.035
 oh-0.033
 wait-0.032
Token and
Token ID290
Feature activation+1.25e-03

Pos logit contributions
 least+0.009
oh+0.007
 say+0.007
 oh+0.007
 kidding+0.007
Neg logit contributions
les-0.008
lights-0.007
ork-0.007
rily-0.007
adoes-0.007
Token a
Token ID257
Feature activation-2.56e-02

Pos logit contributions
les+0.012
lights+0.012
ork+0.012
rily+0.012
adoes+0.012
Neg logit contributions
 least-0.007
oh-0.004
 say-0.004
 oh-0.004
 kidding-0.004
Token strong
Token ID1913
Feature activation-5.32e-02

Pos logit contributions
lights+0.041
les+0.040
rily+0.038
adoes+0.037
ened+0.036
Neg logit contributions
 least-0.043
 say-0.035
oh-0.034
 oh-0.033
 am-0.032
Token and
Token ID290
Feature activation-9.53e-03

Pos logit contributions
lights+0.104
rily+0.092
les+0.090
ork+0.088
adoes+0.084
Neg logit contributions
 least-0.072
 say-0.067
 saying-0.058
 doing-0.056
 very-0.056
Token brave
Token ID14802
Feature activation-7.75e-02

Pos logit contributions
les+0.018
lights+0.017
ork+0.017
rily+0.017
adoes+0.016
Neg logit contributions
 least-0.013
 say-0.010
 oh-0.010
oh-0.010
 kidding-0.009
Token bear
Token ID6842
Feature activation-2.04e-02

Pos logit contributions
lights+0.147
ork+0.129
rily+0.128
lf+0.126
ide+0.124
Neg logit contributions
 say-0.089
 least-0.088
 very-0.079
 shout-0.076
ses-0.075
Token on
Token ID319
Feature activation-8.87e-02

Pos logit contributions
lights+0.039
les+0.037
rily+0.035
adoes+0.035
ork+0.034
Neg logit contributions
 least-0.028
oh-0.023
 say-0.022
 am-0.021
 oh-0.020
Token it
Token ID340
Feature activation-1.04e-02

Pos logit contributions
les+0.203
lights+0.203
anna+0.197
arre+0.196
aff+0.196
Neg logit contributions
 very-0.060
 a-0.048
 an-0.044
 the-0.043
 so-0.037
Token.
Token ID13
Feature activation-2.73e-02

Pos logit contributions
lights+0.018
les+0.018
ork+0.017
rily+0.017
adoes+0.017
Neg logit contributions
 least-0.016
oh-0.012
 say-0.012
 oh-0.011
 kidding-0.011
Token Benny
Token ID44275
Feature activation+1.81e-04

Pos logit contributions
lights+0.059
les+0.056
rily+0.054
ork+0.054
adoes+0.054
Neg logit contributions
 least-0.031
 say-0.024
oh-0.023
 oh-0.021
 am-0.020
Token wanted
Token ID2227
Feature activation-7.32e-03

Pos logit contributions
 least+0.000
oh+0.000
 say+0.000
 oh+0.000
 kidding+0.000
Neg logit contributions
les-0.000
lights-0.000
ork-0.000
rily-0.000
adoes-0.000
Token to
Token ID284
Feature activation-4.07e-03

Pos logit contributions
lights+0.013
les+0.013
rily+0.013
ork+0.012
adoes+0.012
Neg logit contributions
 least-0.011
oh-0.008
 say-0.008
 oh-0.008
 kidding-0.008
Token things
Token ID1243
Feature activation+2.49e-02

Pos logit contributions
lights+0.058
rily+0.057
les+0.057
ork+0.054
ocked+0.049
Neg logit contributions
 least-0.064
 worst-0.052
 best-0.052
 say-0.051
 oh-0.051
Token inside
Token ID2641
Feature activation+1.77e-02

Pos logit contributions
 least+0.041
oh+0.034
 kidding+0.033
 Mir+0.032
 oh+0.032
Neg logit contributions
les-0.037
lights-0.037
rily-0.036
ork-0.035
adoes-0.034
Token,
Token ID11
Feature activation+1.41e-02

Pos logit contributions
 least+0.030
oh+0.025
 kidding+0.023
 oh+0.023
 say+0.022
Neg logit contributions
les-0.027
lights-0.027
rily-0.026
ork-0.025
adoes-0.024
Token but
Token ID475
Feature activation+6.57e-04

Pos logit contributions
 least+0.021
oh+0.017
 kidding+0.014
 am+0.014
 ourselves+0.014
Neg logit contributions
lights-0.025
les-0.025
rily-0.024
ork-0.023
adoes-0.023
Token the
Token ID262
Feature activation-4.53e-02

Pos logit contributions
 least+0.001
oh+0.001
 say+0.001
 oh+0.001
 kidding+0.001
Neg logit contributions
les-0.001
lights-0.001
ork-0.001
rily-0.001
adoes-0.001
Token brave
Token ID14802
Feature activation-8.81e-02

Pos logit contributions
les+0.071
rily+0.069
lights+0.069
ork+0.063
en+0.062
Neg logit contributions
 least-0.082
 worst-0.068
 oh-0.068
 very-0.067
 best-0.067
Token little
Token ID1310
Feature activation-3.76e-02

Pos logit contributions
rily+0.202
lights+0.198
lor+0.193
pron+0.189
ork+0.187
Neg logit contributions
 least-0.059
 very-0.044
 best-0.043
 oh-0.042
 say-0.042
Token girl
Token ID2576
Feature activation-1.97e-03

Pos logit contributions
lights+0.084
ork+0.083
rily+0.080
les+0.080
ocked+0.075
Neg logit contributions
 least-0.040
 oh-0.028
 say-0.027
 worst-0.025
 kidding-0.024
Token saw
Token ID2497
Feature activation-4.81e-04

Pos logit contributions
les+0.003
lights+0.003
ork+0.003
rily+0.003
adoes+0.003
Neg logit contributions
 least-0.003
oh-0.003
 say-0.002
 oh-0.002
 kidding-0.002
Token him
Token ID683
Feature activation+1.33e-02

Pos logit contributions
les+0.001
lights+0.001
ork+0.001
rily+0.001
adoes+0.001
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.000
 kidding-0.000
Token.
Token ID13
Feature activation+2.90e-03

Pos logit contributions
 least+0.023
oh+0.019
 kidding+0.018
 oh+0.018
 say+0.017
Neg logit contributions
les-0.020
rily-0.019
ork-0.019
lights-0.019
adoes-0.018
Token big
Token ID1263
Feature activation+8.62e-03

Pos logit contributions
les+0.003
lights+0.003
ork+0.003
rily+0.003
adoes+0.003
Neg logit contributions
 least-0.002
oh-0.002
 say-0.001
 oh-0.001
 kidding-0.001
Token eyes
Token ID2951
Feature activation-5.65e-02

Pos logit contributions
 least+0.017
oh+0.014
 oh+0.013
 say+0.013
 kidding+0.013
Neg logit contributions
les-0.012
lights-0.011
ork-0.011
rily-0.011
adoes-0.011
Token and
Token ID290
Feature activation-3.96e-03

Pos logit contributions
lights+0.118
adoes+0.116
 tant+0.114
ork+0.109
les+0.109
Neg logit contributions
 least-0.059
oh-0.050
 saying-0.050
 say-0.048
 very-0.046
Token a
Token ID257
Feature activation-1.88e-02

Pos logit contributions
les+0.008
lights+0.007
ork+0.007
rily+0.007
adoes+0.007
Neg logit contributions
 least-0.005
oh-0.004
 say-0.004
 oh-0.004
 kidding-0.004
Token w
Token ID266
Feature activation-1.33e-02

Pos logit contributions
les+0.031
lights+0.031
rily+0.029
adoes+0.029
ork+0.028
Neg logit contributions
 least-0.030
 say-0.024
oh-0.024
 oh-0.023
 am-0.022
Tokenagging
Token ID16406
Feature activation-8.57e-02

Pos logit contributions
lights+0.018
les+0.017
rily+0.017
adoes+0.016
ork+0.016
Neg logit contributions
 least-0.026
oh-0.021
 say-0.020
 kidding-0.020
 am-0.020
Token tail
Token ID7894
Feature activation-5.08e-02

Pos logit contributions
lights+0.160
les+0.150
vable+0.145
ilantro+0.144
oons+0.142
Neg logit contributions
 say-0.098
 very-0.094
 saying-0.087
 shout-0.082
ses-0.082
Token.
Token ID13
Feature activation-1.38e-02

Pos logit contributions
lights+0.104
les+0.099
adoes+0.098
ork+0.098
 tant+0.096
Neg logit contributions
 least-0.057
 say-0.045
oh-0.044
 saying-0.042
 very-0.041
Token The
Token ID383
Feature activation-3.62e-02

Pos logit contributions
les+0.025
lights+0.025
ork+0.024
rily+0.024
adoes+0.023
Neg logit contributions
 least-0.020
oh-0.016
 say-0.015
 Mir-0.015
 oh-0.014
Token young
Token ID1862
Feature activation-4.51e-02

Pos logit contributions
ork+0.065
les+0.064
rily+0.063
lights+0.061
anne+0.058
Neg logit contributions
 least-0.057
 best-0.046
 say-0.045
 most-0.045
 very-0.045
Token boy
Token ID2933
Feature activation+7.70e-03

Pos logit contributions
ork+0.113
lights+0.106
les+0.102
rily+0.099
ches+0.096
Neg logit contributions
 least-0.038
 say-0.027
 oh-0.026
 best-0.025
 most-0.025
Token man
Token ID582
Feature activation-5.30e-03

Pos logit contributions
les+0.003
lights+0.003
rily+0.003
ork+0.003
adoes+0.003
Neg logit contributions
 least-0.004
oh-0.003
 say-0.003
 oh-0.003
 kidding-0.003
Token walking
Token ID6155
Feature activation-1.14e-02

Pos logit contributions
les+0.010
lights+0.010
ork+0.009
rily+0.009
adoes+0.009
Neg logit contributions
 least-0.008
oh-0.006
 say-0.006
 oh-0.005
 kidding-0.005
Token on
Token ID319
Feature activation-1.38e-02

Pos logit contributions
lights+0.021
les+0.021
ork+0.020
rily+0.020
adoes+0.019
Neg logit contributions
 least-0.016
oh-0.013
 say-0.012
 oh-0.011
 kidding-0.011
Token a
Token ID257
Feature activation+2.19e-03

Pos logit contributions
les+0.031
lights+0.030
rily+0.029
ork+0.029
adoes+0.028
Neg logit contributions
 least-0.015
 say-0.010
oh-0.010
 oh-0.009
 Mir-0.009
Token tight
Token ID5381
Feature activation-3.27e-02

Pos logit contributions
 least+0.004
oh+0.003
 say+0.003
 oh+0.003
 kidding+0.003
Neg logit contributions
les-0.003
lights-0.003
ork-0.003
rily-0.003
adoes-0.003
Tokenro
Token ID305
Feature activation-8.53e-02

Pos logit contributions
lights+0.056
les+0.055
rily+0.052
adoes+0.052
ork+0.049
Neg logit contributions
 least-0.049
 say-0.040
oh-0.039
 wait-0.037
 am-0.036
Tokenpe
Token ID431
Feature activation-1.15e-02

Pos logit contributions
lights+0.165
les+0.145
rily+0.144
 journals+0.143
adoes+0.139
Neg logit contributions
oh-0.102
 least-0.087
 oh-0.085
 am-0.084
 very-0.080
Token.
Token ID13
Feature activation-1.58e-02

Pos logit contributions
lights+0.019
les+0.019
ork+0.018
rily+0.018
adoes+0.018
Neg logit contributions
 least-0.018
oh-0.014
 say-0.014
 oh-0.013
 kidding-0.013
Token He
Token ID679
Feature activation-1.69e-02

Pos logit contributions
lights+0.030
les+0.030
rily+0.029
ork+0.029
adoes+0.028
Neg logit contributions
 least-0.020
 say-0.016
oh-0.016
 Mir-0.015
 oh-0.014
Token had
Token ID550
Feature activation+2.83e-03

Pos logit contributions
lights+0.033
les+0.032
rily+0.032
adoes+0.031
ork+0.031
Neg logit contributions
 least-0.023
oh-0.018
 say-0.017
 oh-0.016
 Mir-0.015
Token to
Token ID284
Feature activation+7.17e-03

Pos logit contributions
 least+0.004
oh+0.003
 say+0.002
 kidding+0.002
 oh+0.002
Neg logit contributions
les-0.006
lights-0.006
ork-0.006
rily-0.005
adoes-0.005
Token with
Token ID351
Feature activation-2.37e-02

Pos logit contributions
lights+0.019
les+0.019
rily+0.018
adoes+0.018
ork+0.018
Neg logit contributions
 least-0.016
oh-0.012
 say-0.012
 oh-0.012
 kidding-0.011
Token his
Token ID465
Feature activation-1.10e-03

Pos logit contributions
lights+0.048
les+0.047
ork+0.046
rily+0.045
adoes+0.044
Neg logit contributions
 least-0.029
 say-0.021
 oh-0.021
oh-0.020
 very-0.019
Token favorite
Token ID4004
Feature activation-6.35e-02

Pos logit contributions
les+0.002
lights+0.002
ork+0.001
rily+0.001
adoes+0.001
Neg logit contributions
 least-0.002
oh-0.002
 say-0.002
 oh-0.002
 kidding-0.002
Token cartoon
Token ID16251
Feature activation-2.49e-02

Pos logit contributions
lights+0.121
rily+0.117
urbed+0.106
ide+0.104
ackers+0.101
Neg logit contributions
 say-0.076
 very-0.070
 am-0.068
 are-0.067
 least-0.064
Token character
Token ID2095
Feature activation-1.16e-02

Pos logit contributions
lights+0.042
les+0.041
rily+0.041
adoes+0.039
ork+0.038
Neg logit contributions
 least-0.037
 say-0.030
oh-0.030
 oh-0.029
 am-0.028
Token on
Token ID319
Feature activation-8.50e-02

Pos logit contributions
lights+0.020
les+0.020
ork+0.019
rily+0.019
adoes+0.019
Neg logit contributions
 least-0.017
oh-0.013
 say-0.013
 oh-0.013
 kidding-0.012
Token the
Token ID262
Feature activation+7.03e-03

Pos logit contributions
lights+0.207
les+0.192
anna+0.192
anne+0.190
ust+0.186
Neg logit contributions
 very-0.055
 a-0.050
 least-0.048
 the-0.045
 an-0.044
Token cover
Token ID3002
Feature activation-3.45e-02

Pos logit contributions
 least+0.014
oh+0.012
 say+0.011
 kidding+0.011
 oh+0.011
Neg logit contributions
les-0.009
lights-0.009
ork-0.008
rily-0.008
adoes-0.008
Token.
Token ID13
Feature activation-2.70e-02

Pos logit contributions
les+0.064
lights+0.063
adoes+0.062
ork+0.062
rily+0.060
Neg logit contributions
 least-0.043
 say-0.033
oh-0.033
 oh-0.029
 very-0.029
Token He
Token ID679
Feature activation-1.19e-02

Pos logit contributions
les+0.052
lights+0.049
rily+0.046
ork+0.044
urbed+0.043
Neg logit contributions
 least-0.036
 oh-0.030
 say-0.029
oh-0.028
 Mir-0.028
Token asked
Token ID1965
Feature activation-8.16e-03

Pos logit contributions
lights+0.021
les+0.021
rily+0.020
ork+0.019
adoes+0.019
Neg logit contributions
 least-0.018
oh-0.015
 say-0.014
 oh-0.014
 Mir-0.013
Token room
Token ID2119
Feature activation-1.18e-02

Pos logit contributions
les+0.110
lights+0.108
rily+0.101
adoes+0.097
ocked+0.094
Neg logit contributions
 least-0.078
 very-0.064
 say-0.062
 oh-0.060
oh-0.059
Token with
Token ID351
Feature activation+5.32e-04

Pos logit contributions
les+0.020
lights+0.020
rily+0.019
adoes+0.019
ork+0.019
Neg logit contributions
 least-0.018
oh-0.015
 say-0.014
 oh-0.014
 am-0.013
Token a
Token ID257
Feature activation-1.22e-02

Pos logit contributions
 least+0.001
oh+0.001
 say+0.000
 oh+0.000
 kidding+0.000
Neg logit contributions
les-0.001
lights-0.001
ork-0.001
rily-0.001
adoes-0.001
Token pretty
Token ID2495
Feature activation-5.49e-02

Pos logit contributions
les+0.020
lights+0.019
rily+0.019
adoes+0.018
ork+0.018
Neg logit contributions
 least-0.020
 say-0.016
oh-0.016
 oh-0.015
 am-0.015
Token tut
Token ID9732
Feature activation-5.78e-02

Pos logit contributions
lights+0.095
les+0.093
rily+0.091
adoes+0.083
urbed+0.081
Neg logit contributions
 least-0.072
 say-0.064
 very-0.063
 oh-0.061
 am-0.059
Tokenoring
Token ID3255
Feature activation-8.49e-02

Pos logit contributions
lights+0.086
adoes+0.085
 tant+0.077
emo+0.074
rily+0.073
Neg logit contributions
oh-0.083
 least-0.080
 say-0.067
-0.064
 very-0.064
Token desk
Token ID6915
Feature activation-3.15e-02

Pos logit contributions
lights+0.140
emo+0.135
rily+0.133
ackers+0.128
ilantro+0.126
Neg logit contributions
 very-0.092
 h-0.085
 least-0.079
 say-0.079
 doing-0.076
Token.
Token ID13
Feature activation-7.50e-03

Pos logit contributions
les+0.058
lights+0.057
adoes+0.056
rily+0.055
 tant+0.054
Neg logit contributions
 least-0.042
oh-0.033
 say-0.032
 oh-0.030
 saying-0.029
Token
Token ID220
Feature activation-5.11e-04

Pos logit contributions
les+0.015
lights+0.014
rily+0.014
ork+0.014
adoes+0.013
Neg logit contributions
 least-0.010
oh-0.008
 say-0.007
 oh-0.007
 Mir-0.007
Token
Token ID198
Feature activation+5.01e-03

Pos logit contributions
les+0.001
lights+0.001
ork+0.001
rily+0.001
adoes+0.001
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token
Token ID198
Feature activation-4.91e-03

Pos logit contributions
 least+0.008
oh+0.006
 say+0.006
 kidding+0.006
 oh+0.006
Neg logit contributions
les-0.008
lights-0.008
ork-0.008
rily-0.008
adoes-0.008
Token around
Token ID1088
Feature activation+1.50e-02

Pos logit contributions
lights+0.010
les+0.010
ork+0.009
rily+0.009
adoes+0.009
Neg logit contributions
 least-0.011
oh-0.008
 say-0.008
 kidding-0.008
 oh-0.008
Token and
Token ID290
Feature activation+1.36e-02

Pos logit contributions
 least+0.028
oh+0.023
 kidding+0.021
 say+0.021
 oh+0.021
Neg logit contributions
les-0.021
lights-0.021
ork-0.020
rily-0.020
ened-0.019
Token saw
Token ID2497
Feature activation+6.89e-03

Pos logit contributions
 least+0.022
oh+0.017
 kidding+0.016
 ourselves+0.016
 worst+0.016
Neg logit contributions
les-0.022
lights-0.021
ork-0.021
rily-0.021
ened-0.020
Token a
Token ID257
Feature activation-1.78e-02

Pos logit contributions
 least+0.010
oh+0.008
 say+0.007
 kidding+0.007
 oh+0.007
Neg logit contributions
les-0.013
lights-0.012
ork-0.012
rily-0.012
adoes-0.012
Token big
Token ID1263
Feature activation-1.22e-02

Pos logit contributions
les+0.029
lights+0.029
rily+0.028
adoes+0.027
ork+0.026
Neg logit contributions
 least-0.029
oh-0.023
 say-0.023
 oh-0.022
 kidding-0.022
Token do
Token ID466
Feature activation-8.46e-02

Pos logit contributions
les+0.020
lights+0.020
rily+0.019
ork+0.019
adoes+0.019
Neg logit contributions
 least-0.020
oh-0.015
 say-0.015
 kidding-0.014
 oh-0.014
Tokenggy
Token ID19970
Feature activation+5.49e-03

Pos logit contributions
lights+0.142
rily+0.133
les+0.128
adoes+0.122
ened+0.122
Neg logit contributions
 am-0.127
 very-0.119
oh-0.115
 doing-0.115
 least-0.114
Token smile
Token ID8212
Feature activation-1.01e-02

Pos logit contributions
 least+0.009
oh+0.007
 say+0.007
 oh+0.007
 kidding+0.007
Neg logit contributions
les-0.009
lights-0.009
ork-0.008
rily-0.008
adoes-0.008
Token at
Token ID379
Feature activation-1.10e-02

Pos logit contributions
lights+0.017
les+0.017
ork+0.017
rily+0.016
adoes+0.016
Neg logit contributions
 least-0.015
oh-0.012
 say-0.012
 oh-0.011
 kidding-0.011
Token her
Token ID607
Feature activation+1.24e-02

Pos logit contributions
les+0.023
lights+0.023
ork+0.023
rily+0.022
adoes+0.022
Neg logit contributions
 least-0.013
oh-0.009
 say-0.009
 oh-0.008
 ourselves-0.008
Token.
Token ID13
Feature activation-8.55e-03

Pos logit contributions
 least+0.022
oh+0.018
 oh+0.017
 say+0.017
 kidding+0.017
Neg logit contributions
les-0.018
lights-0.018
ork-0.017
rily-0.017
adoes-0.016
Token it
Token ID340
Feature activation+1.61e-02

Pos logit contributions
 least+0.004
oh+0.003
 say+0.003
 oh+0.003
 kidding+0.003
Neg logit contributions
les-0.004
lights-0.004
ork-0.004
rily-0.004
adoes-0.004
Token!
Token ID0
Feature activation+1.21e-02

Pos logit contributions
 least+0.027
oh+0.022
 kidding+0.021
 say+0.021
 oh+0.021
Neg logit contributions
les-0.025
lights-0.024
rily-0.023
ork-0.023
adoes-0.022
Token
Token ID198
Feature activation+1.58e-02

Pos logit contributions
 least+0.021
oh+0.016
 say+0.016
 oh+0.016
 kidding+0.015
Neg logit contributions
les-0.018
lights-0.018
ork-0.018
rily-0.017
adoes-0.017
Token
Token ID198
Feature activation-1.15e-03

Pos logit contributions
 least+0.028
 kidding+0.022
oh+0.021
 oh+0.021
 say+0.021
Neg logit contributions
les-0.023
ork-0.023
lights-0.023
rily-0.022
adoes-0.020
TokenThe
Token ID464
Feature activation-4.18e-02

Pos logit contributions
les+0.002
lights+0.002
ork+0.002
rily+0.002
adoes+0.002
Neg logit contributions
 least-0.002
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token white
Token ID2330
Feature activation-8.40e-02

Pos logit contributions
ork+0.087
les+0.075
lights+0.074
aff+0.072
rily+0.071
Neg logit contributions
 least-0.064
 best-0.052
 most-0.050
 very-0.049
 worst-0.047
Token all
Token ID477
Feature activation-9.23e-03

Pos logit contributions
ork+0.162
lights+0.154
isdom+0.147
azz+0.146
url+0.144
Neg logit contributions
 very-0.073
 m-0.072
 kay-0.071
 least-0.071
 most-0.067
Tokenigator
Token ID23823
Feature activation+2.09e-03

Pos logit contributions
ork+0.013
les+0.013
lights+0.013
rily+0.012
adoes+0.012
Neg logit contributions
 least-0.017
oh-0.014
 say-0.014
 oh-0.013
 kidding-0.013
Token was
Token ID373
Feature activation+3.78e-03

Pos logit contributions
 least+0.003
oh+0.003
 say+0.002
 oh+0.002
 kidding+0.002
Neg logit contributions
les-0.004
lights-0.003
ork-0.003
rily-0.003
adoes-0.003
Token scared
Token ID12008
Feature activation+4.39e-03

Pos logit contributions
 least+0.004
oh+0.003
 say+0.002
 oh+0.002
 kidding+0.002
Neg logit contributions
les-0.008
lights-0.008
rily-0.008
ork-0.008
adoes-0.008
Token so
Token ID523
Feature activation+3.36e-03

Pos logit contributions
 least+0.008
oh+0.006
 say+0.006
 kidding+0.006
 oh+0.006
Neg logit contributions
les-0.007
lights-0.007
rily-0.006
ork-0.006
adoes-0.006
Token to
Token ID284
Feature activation+2.04e-03

Pos logit contributions
les+0.014
lights+0.014
ork+0.013
rily+0.013
adoes+0.013
Neg logit contributions
 least-0.013
oh-0.010
 say-0.009
 oh-0.009
 kidding-0.009
Token have
Token ID423
Feature activation-1.73e-02

Pos logit contributions
 least+0.003
oh+0.002
 say+0.002
 oh+0.002
 kidding+0.002
Neg logit contributions
les-0.004
lights-0.004
ork-0.004
rily-0.004
adoes-0.004
Token found
Token ID1043
Feature activation-1.15e-02

Pos logit contributions
lights+0.034
les+0.033
rily+0.033
adoes+0.032
ork+0.031
Neg logit contributions
 least-0.024
oh-0.017
 oh-0.017
 say-0.017
 am-0.016
Token such
Token ID884
Feature activation-1.11e-03

Pos logit contributions
lights+0.023
les+0.022
ork+0.022
rily+0.021
adoes+0.021
Neg logit contributions
 least-0.015
 say-0.011
oh-0.011
 oh-0.010
 ourselves-0.010
Token a
Token ID257
Feature activation-3.19e-02

Pos logit contributions
les+0.002
lights+0.002
ork+0.002
rily+0.002
adoes+0.002
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token beautiful
Token ID4950
Feature activation-8.39e-02

Pos logit contributions
lights+0.067
les+0.066
adoes+0.062
rily+0.061
ened+0.059
Neg logit contributions
 least-0.042
 very-0.031
 say-0.030
 best-0.030
oh-0.029
Token treasure
Token ID14068
Feature activation-3.26e-02

Pos logit contributions
lights+0.157
cases+0.143
rily+0.141
les+0.140
oons+0.134
Neg logit contributions
 least-0.106
 best-0.101
 very-0.100
 most-0.095
 doing-0.093
Token.
Token ID13
Feature activation-5.04e-04

Pos logit contributions
lights+0.063
les+0.062
adoes+0.060
 tant+0.058
rily+0.058
Neg logit contributions
 least-0.045
 say-0.032
oh-0.032
 oh-0.030
 wait-0.029
Token<|endoftext|>
Token ID50256
Feature activation-2.54e-02

Pos logit contributions
les+0.001
lights+0.001
ork+0.001
rily+0.001
adoes+0.001
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
TokenOnce
Token ID7454
Feature activation-2.77e-02

Pos logit contributions
lights+0.053
les+0.052
adoes+0.052
rily+0.052
ork+0.049
Neg logit contributions
 least-0.029
oh-0.022
 am-0.021
 say-0.021
 Mir-0.019
Token upon
Token ID2402
Feature activation-1.43e-02

Pos logit contributions
les+0.044
adoes+0.044
lights+0.043
rily+0.043
ork+0.040
Neg logit contributions
 least-0.045
oh-0.037
 say-0.036
 am-0.035
 Mir-0.033
Token had
Token ID550
Feature activation-2.59e-02

Pos logit contributions
lights+0.039
les+0.038
rily+0.038
adoes+0.036
ork+0.036
Neg logit contributions
 least-0.029
oh-0.023
 say-0.022
 am-0.021
 oh-0.020
Token gone
Token ID3750
Feature activation+4.08e-03

Pos logit contributions
les+0.058
lights+0.058
rily+0.056
adoes+0.053
ork+0.053
Neg logit contributions
 least-0.028
 oh-0.018
oh-0.018
 say-0.017
 very-0.016
Token away
Token ID1497
Feature activation+1.02e-02

Pos logit contributions
 least+0.007
oh+0.005
 say+0.005
 oh+0.005
 kidding+0.005
Neg logit contributions
les-0.007
lights-0.007
ork-0.006
rily-0.006
adoes-0.006
Token for
Token ID329
Feature activation+3.08e-03

Pos logit contributions
 least+0.017
oh+0.014
 say+0.013
 kidding+0.013
 oh+0.013
Neg logit contributions
les-0.016
lights-0.016
ork-0.015
rily-0.015
adoes-0.015
Token a
Token ID257
Feature activation-2.69e-02

Pos logit contributions
 least+0.004
oh+0.003
 say+0.003
 kidding+0.003
 oh+0.003
Neg logit contributions
les-0.006
lights-0.006
ork-0.006
rily-0.006
adoes-0.006
Token long
Token ID890
Feature activation-8.39e-02

Pos logit contributions
les+0.047
lights+0.046
adoes+0.045
rily+0.045
ened+0.041
Neg logit contributions
 least-0.044
 say-0.034
 wait-0.034
oh-0.034
 oh-0.033
Token time
Token ID640
Feature activation-1.86e-03

Pos logit contributions
rily+0.156
lights+0.150
adoes+0.139
les+0.137
ocked+0.125
Neg logit contributions
 least-0.117
 wait-0.105
 say-0.101
 very-0.099
 h-0.093
Token.
Token ID13
Feature activation-1.48e-02

Pos logit contributions
les+0.003
lights+0.003
ork+0.003
rily+0.003
adoes+0.003
Neg logit contributions
 least-0.003
oh-0.002
 say-0.002
 oh-0.002
 kidding-0.002
Token Lucy
Token ID22162
Feature activation+2.45e-02

Pos logit contributions
les+0.029
lights+0.028
rily+0.027
ork+0.026
adoes+0.026
Neg logit contributions
 least-0.020
oh-0.016
 say-0.015
 oh-0.015
 Mir-0.015
Token sighed
Token ID21893
Feature activation-1.03e-02

Pos logit contributions
 least+0.044
 kidding+0.037
 ourselves+0.036
 oh+0.036
oh+0.035
Neg logit contributions
les-0.036
ork-0.032
ened-0.030
rily-0.030
lights-0.030
Token sadly
Token ID21098
Feature activation-1.68e-03

Pos logit contributions
les+0.017
lights+0.017
ork+0.016
rily+0.016
adoes+0.016
Neg logit contributions
 least-0.017
oh-0.013
 say-0.013
 oh-0.012
 kidding-0.012
Token Ben
Token ID3932
Feature activation-6.35e-03

Pos logit contributions
les+0.049
lights+0.048
adoes+0.045
ork+0.045
lf+0.044
Neg logit contributions
 least-0.035
 Mir-0.031
 say-0.029
 Mil-0.028
 kidding-0.028
Token smiled
Token ID13541
Feature activation-1.20e-02

Pos logit contributions
lights+0.011
les+0.011
ork+0.011
rily+0.010
adoes+0.010
Neg logit contributions
 least-0.010
oh-0.008
 say-0.007
 oh-0.007
 kidding-0.007
Token and
Token ID290
Feature activation-2.40e-04

Pos logit contributions
les+0.020
lights+0.020
ork+0.020
rily+0.019
adoes+0.018
Neg logit contributions
 least-0.019
 say-0.015
oh-0.014
 oh-0.014
 kidding-0.014
Token hugged
Token ID41130
Feature activation-2.60e-02

Pos logit contributions
les+0.000
lights+0.000
ork+0.000
rily+0.000
adoes+0.000
Neg logit contributions
 least-0.000
oh-0.000
 say-0.000
 oh-0.000
 kidding-0.000
Token its
Token ID663
Feature activation-5.64e-03

Pos logit contributions
lights+0.048
ork+0.047
les+0.046
adoes+0.045
lf+0.045
Neg logit contributions
 least-0.032
 Mir-0.028
 say-0.028
 Mil-0.025
 oh-0.025
Token neck
Token ID7393
Feature activation-8.38e-02

Pos logit contributions
les+0.008
lights+0.008
ork+0.008
rily+0.008
adoes+0.008
Neg logit contributions
 least-0.010
 say-0.008
oh-0.008
 oh-0.008
 kidding-0.008
Token.
Token ID13
Feature activation-2.60e-02

Pos logit contributions
adoes+0.188
lights+0.185
lf+0.184
igsaw+0.179
ened+0.171
Neg logit contributions
 very-0.068
 say-0.052
 so-0.046
 right-0.037
 least-0.034
Token
Token ID198
Feature activation-2.96e-04

Pos logit contributions
les+0.052
lights+0.049
adoes+0.048
ork+0.048
ened+0.048
Neg logit contributions
 least-0.030
 Mir-0.027
 say-0.025
 Mil-0.023
 Roc-0.023
Token
Token ID198
Feature activation-7.20e-03

Pos logit contributions
les+0.001
lights+0.001
ork+0.000
rily+0.000
adoes+0.000
Neg logit contributions
 least-0.000
oh-0.000
 say-0.000
 oh-0.000
 kidding-0.000
TokenMom
Token ID29252
Feature activation-1.27e-02

Pos logit contributions
lights+0.011
les+0.011
rily+0.010
adoes+0.010
ork+0.010
Neg logit contributions
 least-0.012
oh-0.011
 say-0.010
 oh-0.010
 am-0.009
Token took
Token ID1718
Feature activation-4.21e-03

Pos logit contributions
lights+0.022
ork+0.021
rily+0.021
adoes+0.021
les+0.021
Neg logit contributions
 least-0.020
oh-0.016
 say-0.016
 Mir-0.014
 oh-0.014
Token hug
Token ID16225
Feature activation+8.24e-03

Pos logit contributions
lights+0.027
les+0.026
adoes+0.025
rily+0.025
ork+0.024
Neg logit contributions
 least-0.027
 say-0.021
oh-0.021
 am-0.020
 kidding-0.019
Token."
Token ID526
Feature activation+1.89e-02

Pos logit contributions
 least+0.014
oh+0.011
 say+0.011
 kidding+0.011
 oh+0.011
Neg logit contributions
les-0.013
lights-0.012
ork-0.012
rily-0.012
adoes-0.012
Token
Token ID198
Feature activation+2.57e-02

Pos logit contributions
 least+0.034
oh+0.027
 oh+0.026
 worst+0.026
 kidding+0.026
Neg logit contributions
les-0.028
lights-0.027
ork-0.026
rily-0.026
adoes-0.025
Token
Token ID198
Feature activation+2.16e-02

Pos logit contributions
 least+0.052
 kidding+0.042
 worst+0.040
 oh+0.040
oh+0.039
Neg logit contributions
les-0.034
ork-0.034
lights-0.032
rily-0.031
lf-0.029
TokenThe
Token ID464
Feature activation-3.92e-02

Pos logit contributions
 least+0.040
 kidding+0.031
 worst+0.030
ses+0.029
 say+0.029
Neg logit contributions
les-0.031
ork-0.030
lights-0.029
lf-0.029
rily-0.029
Token quiet
Token ID5897
Feature activation-8.34e-02

Pos logit contributions
ork+0.077
lights+0.068
les+0.065
rily+0.064
bell+0.062
Neg logit contributions
 least-0.062
 best-0.048
 very-0.048
 most-0.046
 worst-0.044
Token all
Token ID477
Feature activation-6.51e-03

Pos logit contributions
ork+0.182
pron+0.154
ast+0.149
ues+0.147
url+0.147
Neg logit contributions
 very-0.075
 so-0.072
 are-0.071
 welcome-0.071
 best-0.071
Tokenigator
Token ID23823
Feature activation-1.80e-03

Pos logit contributions
lights+0.008
les+0.008
ork+0.008
rily+0.008
adoes+0.008
Neg logit contributions
 least-0.013
oh-0.011
 say-0.010
 oh-0.010
 kidding-0.010
Token did
Token ID750
Feature activation-1.88e-02

Pos logit contributions
les+0.003
lights+0.003
ork+0.003
rily+0.003
adoes+0.003
Neg logit contributions
 least-0.003
oh-0.002
 say-0.002
 oh-0.002
 kidding-0.002
Token not
Token ID407
Feature activation-9.06e-03

Pos logit contributions
lights+0.039
les+0.038
adoes+0.036
ork+0.035
 tant+0.035
Neg logit contributions
 least-0.023
 say-0.020
 ourselves-0.017
 am-0.016
 wait-0.016
Token know
Token ID760
Feature activation-2.37e-03

Pos logit contributions
lights+0.019
les+0.019
ork+0.018
rily+0.018
adoes+0.018
Neg logit contributions
 least-0.011
 say-0.008
oh-0.008
 kidding-0.007
 wait-0.007
Token not
Token ID407
Feature activation-1.46e-02

Pos logit contributions
les+0.000
lights+0.000
ork+0.000
rily+0.000
adoes+0.000
Neg logit contributions
 least-0.000
oh-0.000
 say-0.000
 oh-0.000
 kidding-0.000
Token played
Token ID2826
Feature activation-3.25e-02

Pos logit contributions
lights+0.029
les+0.029
rily+0.028
ork+0.027
adoes+0.027
Neg logit contributions
 least-0.019
 say-0.014
oh-0.013
 kidding-0.013
 am-0.013
Token with
Token ID351
Feature activation-4.41e-02

Pos logit contributions
lights+0.058
les+0.057
rily+0.054
ork+0.053
adoes+0.051
Neg logit contributions
 least-0.044
 oh-0.036
oh-0.036
 am-0.035
 say-0.035
Token for
Token ID329
Feature activation-8.21e-03

Pos logit contributions
lights+0.078
rily+0.077
les+0.076
ork+0.074
adoes+0.073
Neg logit contributions
 least-0.059
 oh-0.044
 very-0.044
 say-0.043
oh-0.043
Token a
Token ID257
Feature activation-5.38e-02

Pos logit contributions
les+0.015
lights+0.015
rily+0.015
ork+0.015
adoes+0.014
Neg logit contributions
 least-0.012
oh-0.009
 say-0.008
 oh-0.008
 kidding-0.008
Token long
Token ID890
Feature activation-8.32e-02

Pos logit contributions
lights+0.096
adoes+0.093
rily+0.090
les+0.090
ocked+0.084
Neg logit contributions
 least-0.072
 very-0.065
 say-0.063
 oh-0.060
 wait-0.060
Token time
Token ID640
Feature activation-1.90e-02

Pos logit contributions
rily+0.149
lights+0.148
ocked+0.133
adoes+0.131
les+0.124
Neg logit contributions
 h-0.097
 say-0.095
 am-0.093
 least-0.092
 wait-0.091
Token.
Token ID13
Feature activation-1.04e-02

Pos logit contributions
les+0.035
lights+0.034
rily+0.033
ork+0.033
adoes+0.033
Neg logit contributions
 least-0.027
oh-0.020
 say-0.020
 kidding-0.019
 oh-0.019
Token It
Token ID632
Feature activation+1.08e-02

Pos logit contributions
les+0.018
lights+0.017
rily+0.016
ork+0.016
adoes+0.015
Neg logit contributions
 least-0.017
oh-0.014
 say-0.013
 oh-0.013
 Mir-0.013
Token is
Token ID318
Feature activation+1.04e-02

Pos logit contributions
 least+0.016
oh+0.012
 oh+0.012
 kidding+0.011
 ourselves+0.011
Neg logit contributions
les-0.020
lights-0.019
ork-0.019
rily-0.018
adoes-0.018
Token a
Token ID257
Feature activation-5.36e-03

Pos logit contributions
 least+0.016
oh+0.013
 say+0.011
 kidding+0.011
 oh+0.011
Neg logit contributions
lights-0.018
les-0.018
ork-0.017
rily-0.017
adoes-0.017
Token.
Token ID13
Feature activation+5.09e-03

Pos logit contributions
les+0.010
lights+0.010
ork+0.009
rily+0.009
adoes+0.009
Neg logit contributions
 least-0.009
oh-0.007
 say-0.007
 oh-0.007
 kidding-0.007
Token
Token ID198
Feature activation+1.73e-02

Pos logit contributions
 least+0.009
oh+0.007
 say+0.007
 oh+0.007
 kidding+0.006
Neg logit contributions
les-0.008
lights-0.008
ork-0.007
rily-0.007
adoes-0.007
Token
Token ID198
Feature activation-4.20e-03

Pos logit contributions
 least+0.031
 kidding+0.024
oh+0.024
 oh+0.024
 say+0.023
Neg logit contributions
ork-0.025
les-0.025
lights-0.024
rily-0.023
adoes-0.022
TokenWhen
Token ID2215
Feature activation+1.00e-02

Pos logit contributions
les+0.008
lights+0.008
rily+0.007
ork+0.007
adoes+0.007
Neg logit contributions
 least-0.006
oh-0.005
 say-0.004
 oh-0.004
 kidding-0.004
Token the
Token ID262
Feature activation-3.62e-02

Pos logit contributions
 least+0.016
oh+0.013
 say+0.012
 kidding+0.012
 oh+0.012
Neg logit contributions
les-0.016
lights-0.016
ork-0.016
rily-0.015
adoes-0.015
Token white
Token ID2330
Feature activation-8.30e-02

Pos logit contributions
ork+0.072
les+0.064
rily+0.062
lights+0.060
aff+0.060
Neg logit contributions
 least-0.055
 best-0.043
 worst-0.041
 most-0.040
 wait-0.040
Token all
Token ID477
Feature activation-1.00e-02

Pos logit contributions
ork+0.167
ides+0.152
isdom+0.152
hell+0.150
lights+0.150
Neg logit contributions
 kay-0.071
 m-0.069
 least-0.069
 very-0.067
 most-0.063
Tokenigator
Token ID23823
Feature activation+5.78e-03

Pos logit contributions
ork+0.015
les+0.015
lights+0.015
rily+0.014
adoes+0.014
Neg logit contributions
 least-0.018
oh-0.014
 say-0.014
 oh-0.013
 kidding-0.013
Token got
Token ID1392
Feature activation-9.80e-04

Pos logit contributions
 least+0.010
oh+0.008
 say+0.007
 oh+0.007
 kidding+0.007
Neg logit contributions
les-0.009
lights-0.009
ork-0.009
rily-0.008
adoes-0.008
Token out
Token ID503
Feature activation+2.75e-03

Pos logit contributions
les+0.002
lights+0.002
ork+0.002
rily+0.002
adoes+0.002
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token of
Token ID286
Feature activation-7.99e-03

Pos logit contributions
 least+0.005
oh+0.004
 say+0.004
 oh+0.004
 kidding+0.003
Neg logit contributions
les-0.004
lights-0.004
ork-0.004
rily-0.004
adoes-0.004
Token Lily
Token ID20037
Feature activation+1.10e-02

Pos logit contributions
les+0.034
lights+0.033
rily+0.032
adoes+0.031
ork+0.030
Neg logit contributions
 least-0.027
oh-0.021
 say-0.021
 Mir-0.020
 oh-0.019
Token picked
Token ID6497
Feature activation-2.90e-02

Pos logit contributions
 least+0.018
oh+0.014
 kidding+0.014
 oh+0.014
 say+0.014
Neg logit contributions
les-0.018
ork-0.016
lights-0.016
rily-0.016
adoes-0.016
Token the
Token ID262
Feature activation+2.48e-03

Pos logit contributions
lights+0.061
ork+0.054
lf+0.054
les+0.052
adoes+0.052
Neg logit contributions
 least-0.039
 say-0.029
oh-0.029
 oh-0.027
 Mir-0.026
Token apples
Token ID22514
Feature activation+7.62e-03

Pos logit contributions
 least+0.005
oh+0.004
 say+0.004
 oh+0.004
 kidding+0.004
Neg logit contributions
les-0.003
lights-0.003
ork-0.003
rily-0.003
adoes-0.003
Token and
Token ID290
Feature activation-1.67e-02

Pos logit contributions
 least+0.014
oh+0.011
 oh+0.011
 kidding+0.011
 say+0.011
Neg logit contributions
les-0.011
lights-0.011
rily-0.011
ork-0.010
adoes-0.010
Token shared
Token ID4888
Feature activation-8.27e-02

Pos logit contributions
lights+0.029
ork+0.028
les+0.028
rily+0.028
adoes+0.027
Neg logit contributions
 least-0.026
 say-0.020
oh-0.020
 oh-0.019
 Mir-0.019
Token them
Token ID606
Feature activation-1.34e-02

Pos logit contributions
lights+0.198
acket+0.176
les+0.175
adoes+0.173
lf+0.173
Neg logit contributions
 very-0.070
 say-0.066
 so-0.062
 herself-0.057
 are-0.057
Token with
Token ID351
Feature activation-1.52e-02

Pos logit contributions
les+0.021
lights+0.021
adoes+0.019
ork+0.019
rily+0.019
Neg logit contributions
 least-0.023
oh-0.018
 say-0.018
 ourselves-0.017
 oh-0.017
Token her
Token ID607
Feature activation-2.20e-02

Pos logit contributions
lights+0.033
les+0.033
rily+0.032
ork+0.031
adoes+0.031
Neg logit contributions
 least-0.017
 say-0.011
oh-0.011
 oh-0.010
 ourselves-0.010
Token friends
Token ID2460
Feature activation-1.23e-02

Pos logit contributions
les+0.041
lights+0.040
ork+0.038
rily+0.037
adoes+0.036
Neg logit contributions
 least-0.033
 say-0.024
 best-0.024
 cle-0.023
 oh-0.023
Token.
Token ID13
Feature activation-3.30e-03

Pos logit contributions
les+0.022
lights+0.022
ork+0.021
adoes+0.021
rily+0.021
Neg logit contributions
 least-0.018
oh-0.014
 say-0.014
 oh-0.013
 kidding-0.013
Token toys
Token ID14958
Feature activation+2.05e-03

Pos logit contributions
 least+0.007
oh+0.005
 say+0.005
 kidding+0.005
 oh+0.005
Neg logit contributions
les-0.005
lights-0.005
ork-0.004
rily-0.004
adoes-0.004
Token.
Token ID13
Feature activation-2.62e-02

Pos logit contributions
 least+0.003
oh+0.003
 say+0.003
 oh+0.003
 kidding+0.003
Neg logit contributions
les-0.003
lights-0.003
ork-0.003
rily-0.003
adoes-0.003
Token He
Token ID679
Feature activation+9.88e-03

Pos logit contributions
les+0.052
rily+0.049
lights+0.048
ork+0.044
adoes+0.043
Neg logit contributions
 least-0.033
 oh-0.027
 say-0.026
oh-0.026
 Mir-0.025
Token had
Token ID550
Feature activation-1.02e-03

Pos logit contributions
 least+0.017
oh+0.014
 ourselves+0.013
 kidding+0.013
 say+0.013
Neg logit contributions
les-0.016
lights-0.015
ork-0.014
rily-0.014
adoes-0.013
Token a
Token ID257
Feature activation-9.65e-03

Pos logit contributions
les+0.002
lights+0.002
ork+0.002
rily+0.002
adoes+0.002
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token toy
Token ID13373
Feature activation+3.71e-02

Pos logit contributions
les+0.015
lights+0.015
rily+0.014
adoes+0.014
ork+0.013
Neg logit contributions
 least-0.017
 say-0.014
oh-0.013
 oh-0.013
 am-0.013
Token box
Token ID3091
Feature activation-2.75e-02

Pos logit contributions
 least+0.075
oh+0.069
 kidding+0.065
 ourselves+0.064
 oh+0.062
Neg logit contributions
les-0.041
lights-0.040
ork-0.039
adoes-0.037
rily-0.034
Token with
Token ID351
Feature activation-1.73e-02

Pos logit contributions
les+0.048
lights+0.046
adoes+0.046
rily+0.045
ork+0.045
Neg logit contributions
 least-0.042
oh-0.033
 say-0.032
 very-0.030
 oh-0.029
Token a
Token ID257
Feature activation-3.48e-02

Pos logit contributions
lights+0.034
les+0.034
ork+0.033
rily+0.032
adoes+0.032
Neg logit contributions
 least-0.022
oh-0.016
 say-0.015
 oh-0.015
 Mir-0.014
Token lid
Token ID19789
Feature activation-3.46e-02

Pos logit contributions
les+0.061
lights+0.061
rily+0.057
adoes+0.056
ened+0.053
Neg logit contributions
 least-0.048
 say-0.043
 very-0.041
oh-0.040
 am-0.040
Token that
Token ID326
Feature activation+8.08e-03

Pos logit contributions
adoes+0.065
lights+0.064
les+0.062
ork+0.062
rily+0.061
Neg logit contributions
 least-0.048
oh-0.038
 say-0.036
 very-0.032
 saying-0.032
Token play
Token ID711
Feature activation-2.27e-03

Pos logit contributions
 least+0.003
oh+0.003
 say+0.002
 oh+0.002
 kidding+0.002
Neg logit contributions
les-0.003
lights-0.003
ork-0.003
rily-0.003
adoes-0.003
Token together
Token ID1978
Feature activation-2.23e-03

Pos logit contributions
les+0.004
lights+0.004
rily+0.004
ork+0.004
adoes+0.004
Neg logit contributions
 least-0.003
oh-0.003
 say-0.002
 oh-0.002
 kidding-0.002
Token and
Token ID290
Feature activation-2.10e-02

Pos logit contributions
les+0.004
lights+0.004
ork+0.003
rily+0.003
adoes+0.003
Neg logit contributions
 least-0.004
oh-0.003
 say-0.003
 oh-0.003
 kidding-0.003
Token their
Token ID511
Feature activation+1.09e-02

Pos logit contributions
les+0.042
lights+0.040
rily+0.040
ork+0.039
adoes+0.038
Neg logit contributions
 least-0.029
 say-0.024
 wait-0.022
oh-0.021
 am-0.021
Token favourite
Token ID12507
Feature activation-2.20e-02

Pos logit contributions
 least+0.020
oh+0.017
 say+0.016
 kidding+0.016
 oh+0.016
Neg logit contributions
lights-0.015
les-0.015
ork-0.014
adoes-0.014
rily-0.014
Token thing
Token ID1517
Feature activation+3.73e-02

Pos logit contributions
lights+0.034
rily+0.033
les+0.033
ork+0.030
adoes+0.030
Neg logit contributions
 least-0.038
 oh-0.030
oh-0.030
 am-0.030
 say-0.030
Token to
Token ID284
Feature activation+7.25e-03

Pos logit contributions
 least+0.073
 kidding+0.061
oh+0.060
 ourselves+0.055
 say+0.055
Neg logit contributions
lights-0.051
les-0.048
ork-0.047
adoes-0.044
ened-0.041
Token do
Token ID466
Feature activation-2.30e-03

Pos logit contributions
 least+0.010
oh+0.008
 oh+0.007
 kidding+0.007
 say+0.007
Neg logit contributions
les-0.013
lights-0.013
ork-0.013
rily-0.012
adoes-0.012
Token was
Token ID373
Feature activation-1.93e-04

Pos logit contributions
les+0.004
lights+0.004
rily+0.004
ork+0.004
adoes+0.004
Neg logit contributions
 least-0.004
oh-0.003
 say-0.003
 oh-0.003
 kidding-0.003
Token swing
Token ID9628
Feature activation-1.92e-04

Pos logit contributions
les+0.000
lights+0.000
ork+0.000
rily+0.000
adoes+0.000
Neg logit contributions
 least-0.000
oh-0.000
 say-0.000
 oh-0.000
 kidding-0.000
Token.
Token ID13
Feature activation-6.46e-03

Pos logit contributions
les+0.000
lights+0.000
ork+0.000
rily+0.000
adoes+0.000
Neg logit contributions
 least-0.000
oh-0.000
 say-0.000
 oh-0.000
 kidding-0.000
Token called
Token ID1444
Feature activation+4.84e-04

Pos logit contributions
 least+0.029
oh+0.023
 kidding+0.023
 oh+0.022
 say+0.022
Neg logit contributions
les-0.027
lights-0.026
ork-0.025
rily-0.025
adoes-0.024
Token Sally
Token ID25737
Feature activation+9.51e-03

Pos logit contributions
 least+0.001
oh+0.001
 say+0.001
 oh+0.001
 kidding+0.001
Neg logit contributions
les-0.001
lights-0.001
ork-0.001
rily-0.001
adoes-0.001
Token.
Token ID13
Feature activation-3.69e-03

Pos logit contributions
 least+0.017
oh+0.013
 oh+0.013
 kidding+0.013
 say+0.013
Neg logit contributions
les-0.014
lights-0.014
rily-0.013
ork-0.013
adoes-0.013
Token She
Token ID1375
Feature activation+5.22e-03

Pos logit contributions
les+0.006
lights+0.006
ork+0.006
rily+0.006
adoes+0.006
Neg logit contributions
 least-0.006
oh-0.005
 say-0.004
 oh-0.004
 kidding-0.004
Token found
Token ID1043
Feature activation+3.17e-02

Pos logit contributions
 least+0.009
oh+0.007
 say+0.007
 kidding+0.006
 oh+0.006
Neg logit contributions
les-0.008
lights-0.008
ork-0.008
rily-0.008
adoes-0.008
Token something
Token ID1223
Feature activation+4.36e-02

Pos logit contributions
 least+0.047
oh+0.037
 kidding+0.035
 say+0.034
 wait+0.033
Neg logit contributions
les-0.055
ork-0.054
lights-0.052
rily-0.052
adoes-0.050
Token very
Token ID845
Feature activation-5.01e-03

Pos logit contributions
 least+0.083
oh+0.068
 kidding+0.066
 elbows+0.063
 ourselves+0.062
Neg logit contributions
ork-0.060
lights-0.060
les-0.059
rily-0.054
bell-0.053
Token strange
Token ID6283
Feature activation+1.76e-02

Pos logit contributions
les+0.009
lights+0.008
ork+0.008
rily+0.008
adoes+0.008
Neg logit contributions
 least-0.008
oh-0.006
 say-0.006
 oh-0.006
 kidding-0.006
Token in
Token ID287
Feature activation-9.25e-03

Pos logit contributions
 least+0.031
oh+0.025
 kidding+0.023
 say+0.023
 oh+0.023
Neg logit contributions
les-0.027
lights-0.026
ork-0.025
rily-0.025
adoes-0.024
Token her
Token ID607
Feature activation+1.02e-02

Pos logit contributions
les+0.019
lights+0.019
rily+0.018
ork+0.018
adoes+0.018
Neg logit contributions
 least-0.011
oh-0.008
 say-0.008
 oh-0.007
 kidding-0.007
Token garden
Token ID11376
Feature activation+1.86e-04

Pos logit contributions
 least+0.018
oh+0.015
 say+0.014
 kidding+0.014
 oh+0.014
Neg logit contributions
les-0.015
lights-0.015
ork-0.015
rily-0.014
adoes-0.014
Token,
Token ID11
Feature activation+2.95e-03

Pos logit contributions
 least+0.014
oh+0.010
 say+0.009
 kidding+0.009
 oh+0.009
Neg logit contributions
les-0.016
lights-0.015
ork-0.015
adoes-0.015
rily-0.015
Token he
Token ID339
Feature activation-8.50e-03

Pos logit contributions
 least+0.005
oh+0.004
 say+0.004
 kidding+0.004
 oh+0.003
Neg logit contributions
les-0.005
lights-0.005
ork-0.005
rily-0.004
adoes-0.004
Token got
Token ID1392
Feature activation+2.24e-02

Pos logit contributions
lights+0.015
les+0.015
rily+0.015
ork+0.014
adoes+0.014
Neg logit contributions
 least-0.013
oh-0.010
 say-0.010
 oh-0.009
 am-0.009
Token a
Token ID257
Feature activation-5.51e-03

Pos logit contributions
 least+0.031
oh+0.027
 Mir+0.022
 say+0.022
 ourselves+0.022
Neg logit contributions
les-0.039
lights-0.037
ork-0.037
 tant-0.036
adoes-0.036
Token new
Token ID649
Feature activation-3.06e-03

Pos logit contributions
les+0.008
lights+0.008
rily+0.008
adoes+0.007
ork+0.007
Neg logit contributions
 least-0.010
oh-0.008
 say-0.008
 oh-0.008
 kidding-0.007
Token toy
Token ID13373
Feature activation+5.78e-02

Pos logit contributions
les+0.004
lights+0.004
rily+0.004
ork+0.004
adoes+0.004
Neg logit contributions
 least-0.006
oh-0.005
 say-0.005
 oh-0.005
 kidding-0.005
Token car
Token ID1097
Feature activation+1.70e-02

Pos logit contributions
 least+0.118
oh+0.109
 kidding+0.106
 ourselves+0.104
 Mir+0.102
Neg logit contributions
les-0.058
ork-0.057
lights-0.053
adoes-0.051
lf-0.050
Token.
Token ID13
Feature activation-1.52e-02

Pos logit contributions
 least+0.031
oh+0.025
 kidding+0.024
 Mir+0.024
 ourselves+0.024
Neg logit contributions
les-0.024
lights-0.023
ork-0.023
adoes-0.022
rily-0.021
Token It
Token ID632
Feature activation+1.30e-02

Pos logit contributions
les+0.026
lights+0.026
rily+0.024
ork+0.024
adoes+0.023
Neg logit contributions
 least-0.024
 say-0.020
oh-0.020
 oh-0.018
 Mir-0.018
Token was
Token ID373
Feature activation+1.94e-02

Pos logit contributions
 least+0.023
oh+0.018
 kidding+0.018
 oh+0.017
 ourselves+0.017
Neg logit contributions
les-0.020
lights-0.020
ork-0.019
adoes-0.018
rily-0.018
Token big
Token ID1263
Feature activation-1.96e-03

Pos logit contributions
 least+0.029
oh+0.024
 ourselves+0.020
 kidding+0.020
 am+0.020
Neg logit contributions
lights-0.034
les-0.033
ork-0.032
rily-0.032
adoes-0.032
Token "
Token ID366
Feature activation+1.30e-02

Pos logit contributions
les+0.022
lights+0.022
ork+0.021
adoes+0.020
rily+0.020
Neg logit contributions
 least-0.020
oh-0.016
 say-0.015
 oh-0.015
 Mir-0.015
TokenOh
Token ID5812
Feature activation+1.70e-02

Pos logit contributions
 least+0.019
oh+0.014
 say+0.014
 kidding+0.013
 oh+0.013
Neg logit contributions
les-0.023
lights-0.022
rily-0.022
ork-0.022
adoes-0.021
Token honey
Token ID12498
Feature activation+3.60e-02

Pos logit contributions
 least+0.026
oh+0.020
 say+0.019
 elbows+0.019
 kidding+0.018
Neg logit contributions
les-0.028
lights-0.028
ork-0.027
rily-0.027
adoes-0.027
Token,
Token ID11
Feature activation-2.46e-02

Pos logit contributions
 least+0.060
 elbows+0.045
oh+0.044
 kidding+0.044
 oh+0.044
Neg logit contributions
les-0.057
lights-0.056
ork-0.054
rily-0.052
adoes-0.051
Token I
Token ID314
Feature activation+2.01e-02

Pos logit contributions
les+0.050
lights+0.050
ork+0.048
rily+0.047
adoes+0.045
Neg logit contributions
 least-0.032
 oh-0.023
 say-0.023
oh-0.022
 ourselves-0.021
Token'm
Token ID1101
Feature activation+3.67e-02

Pos logit contributions
 least+0.027
 oh+0.022
 ourselves+0.021
oh+0.021
 elbows+0.021
Neg logit contributions
les-0.037
ork-0.035
lights-0.032
rily-0.031
bell-0.031
Token sorry
Token ID7926
Feature activation+5.77e-03

Pos logit contributions
 least+0.051
 elbows+0.044
oh+0.041
 ourselves+0.038
 am+0.037
Neg logit contributions
rily-0.060
lights-0.057
les-0.057
ork-0.055
lf-0.053
Token but
Token ID475
Feature activation-2.20e-03

Pos logit contributions
 least+0.008
oh+0.006
 say+0.006
 kidding+0.006
 oh+0.006
Neg logit contributions
les-0.011
lights-0.010
ork-0.010
rily-0.010
adoes-0.010
Token we
Token ID356
Feature activation-3.12e-02

Pos logit contributions
les+0.004
lights+0.004
ork+0.004
rily+0.004
adoes+0.004
Neg logit contributions
 least-0.003
oh-0.002
 say-0.002
 oh-0.002
 kidding-0.002
Token don
Token ID836
Feature activation-1.75e-02

Pos logit contributions
lights+0.069
rily+0.066
les+0.065
ork+0.064
adoes+0.063
Neg logit contributions
 least-0.042
 say-0.030
oh-0.027
 am-0.027
 wait-0.025
Token't
Token ID470
Feature activation-2.56e-02

Pos logit contributions
les+0.029
lights+0.029
rily+0.028
ork+0.027
adoes+0.027
Neg logit contributions
 least-0.028
oh-0.023
 say-0.021
 am-0.020
 ourselves-0.020
Token mom
Token ID1995
Feature activation-2.28e-03

Pos logit contributions
les+0.029
rily+0.029
lights+0.028
ork+0.026
urbed+0.025
Neg logit contributions
 least-0.025
 oh-0.019
 say-0.019
oh-0.019
 am-0.019
Token would
Token ID561
Feature activation+1.36e-02

Pos logit contributions
les+0.004
lights+0.004
ork+0.004
rily+0.004
adoes+0.004
Neg logit contributions
 least-0.004
oh-0.003
 say-0.003
 oh-0.003
 kidding-0.003
Token come
Token ID1282
Feature activation-1.41e-02

Pos logit contributions
 least+0.021
oh+0.017
 oh+0.016
 kidding+0.016
 Mir+0.015
Neg logit contributions
les-0.021
lights-0.021
ork-0.021
adoes-0.020
rily-0.020
Token in
Token ID287
Feature activation+2.94e-02

Pos logit contributions
les+0.025
lights+0.025
ork+0.023
rily+0.023
adoes+0.023
Neg logit contributions
 least-0.022
 say-0.017
oh-0.017
 oh-0.016
 am-0.015
Token and
Token ID290
Feature activation+2.81e-03

Pos logit contributions
 least+0.062
oh+0.055
 kidding+0.052
 elbows+0.052
 worst+0.052
Neg logit contributions
ork-0.029
les-0.028
lights-0.028
rily-0.028
 tant-0.025
Token say
Token ID910
Feature activation+8.64e-03

Pos logit contributions
 least+0.004
oh+0.003
 say+0.003
 oh+0.003
 kidding+0.003
Neg logit contributions
les-0.005
lights-0.005
ork-0.005
rily-0.005
adoes-0.005
Token,
Token ID11
Feature activation-1.94e-03

Pos logit contributions
 least+0.013
oh+0.010
 say+0.009
 kidding+0.009
 oh+0.009
Neg logit contributions
les-0.015
lights-0.015
ork-0.015
rily-0.015
adoes-0.014
Token "
Token ID366
Feature activation+2.79e-04

Pos logit contributions
les+0.003
lights+0.003
ork+0.003
rily+0.003
adoes+0.003
Neg logit contributions
 least-0.003
oh-0.003
 say-0.003
 oh-0.003
 kidding-0.002
TokenL
Token ID43
Feature activation-1.57e-02

Pos logit contributions
 least+0.000
oh+0.000
 say+0.000
 oh+0.000
 kidding+0.000
Neg logit contributions
les-0.000
lights-0.000
ork-0.000
rily-0.000
adoes-0.000
Tokenily
Token ID813
Feature activation-2.68e-03

Pos logit contributions
lights+0.034
rily+0.033
les+0.032
adoes+0.032
ork+0.031
Neg logit contributions
 least-0.018
oh-0.013
 say-0.011
 oh-0.011
 kidding-0.011
Token,
Token ID11
Feature activation-5.66e-03

Pos logit contributions
les+0.005
lights+0.005
rily+0.005
ork+0.005
adoes+0.005
Neg logit contributions
 least-0.003
oh-0.003
 say-0.002
 oh-0.002
 kidding-0.002
Token saw
Token ID2497
Feature activation-5.42e-03

Pos logit contributions
 least+0.008
oh+0.006
 say+0.006
 kidding+0.006
 oh+0.006
Neg logit contributions
les-0.009
lights-0.009
ork-0.008
rily-0.008
adoes-0.008
Token the
Token ID262
Feature activation-8.43e-03

Pos logit contributions
les+0.010
lights+0.010
ork+0.010
rily+0.010
adoes+0.009
Neg logit contributions
 least-0.008
oh-0.006
 say-0.005
 oh-0.005
 kidding-0.005
Token open
Token ID1280
Feature activation-1.77e-02

Pos logit contributions
les+0.014
lights+0.014
ork+0.013
rily+0.013
adoes+0.013
Neg logit contributions
 least-0.014
oh-0.011
 say-0.011
 oh-0.010
 kidding-0.010
Token freezer
Token ID30967
Feature activation-6.87e-03

Pos logit contributions
les+0.028
lights+0.028
ork+0.027
rily+0.027
adoes+0.026
Neg logit contributions
 least-0.029
oh-0.022
 say-0.022
 oh-0.021
 am-0.021
Token and
Token ID290
Feature activation-2.91e-03

Pos logit contributions
les+0.011
lights+0.011
ork+0.010
rily+0.010
adoes+0.010
Neg logit contributions
 least-0.011
oh-0.009
 say-0.009
 oh-0.008
 kidding-0.008
Token said
Token ID531
Feature activation-5.92e-04

Pos logit contributions
les+0.005
lights+0.005
ork+0.005
rily+0.005
adoes+0.005
Neg logit contributions
 least-0.004
oh-0.003
 say-0.003
 oh-0.003
 kidding-0.003
Token "
Token ID366
Feature activation+3.68e-03

Pos logit contributions
les+0.001
lights+0.001
ork+0.001
rily+0.001
adoes+0.001
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
TokenUh
Token ID34653
Feature activation+9.07e-03

Pos logit contributions
 least+0.006
oh+0.004
 say+0.004
 kidding+0.004
 oh+0.004
Neg logit contributions
les-0.007
lights-0.006
ork-0.006
rily-0.006
adoes-0.006
Token oh
Token ID11752
Feature activation+1.88e-02

Pos logit contributions
 least+0.017
oh+0.013
 say+0.012
 kidding+0.012
 am+0.012
Neg logit contributions
les-0.013
lights-0.013
ork-0.012
rily-0.012
adoes-0.012
Token!"
Token ID2474
Feature activation+5.68e-03

Pos logit contributions
 least+0.028
oh+0.020
 say+0.019
 elbows+0.019
 ourselves+0.018
Neg logit contributions
les-0.032
ork-0.032
lights-0.032
rily-0.031
adoes-0.030
Token He
Token ID679
Feature activation-1.38e-02

Pos logit contributions
 least+0.009
oh+0.007
 say+0.007
 kidding+0.007
 oh+0.007
Neg logit contributions
les-0.009
lights-0.009
ork-0.009
rily-0.009
adoes-0.009
TokenL
Token ID43
Feature activation-1.29e-02

Pos logit contributions
 least+0.033
 kidding+0.026
oh+0.026
 oh+0.026
 say+0.025
Neg logit contributions
les-0.020
lights-0.019
ork-0.019
rily-0.017
adoes-0.017
Tokenily
Token ID813
Feature activation-1.49e-02

Pos logit contributions
lights+0.028
les+0.027
rily+0.027
ork+0.026
adoes+0.026
Neg logit contributions
 least-0.015
oh-0.010
 kidding-0.009
 say-0.009
 oh-0.009
Token and
Token ID290
Feature activation+7.80e-03

Pos logit contributions
lights+0.025
rily+0.024
ork+0.024
les+0.023
adoes+0.023
Neg logit contributions
 least-0.025
 say-0.019
oh-0.019
 kidding-0.017
 am-0.017
Token Ben
Token ID3932
Feature activation+9.86e-03

Pos logit contributions
 least+0.010
oh+0.008
 oh+0.007
 kidding+0.007
 say+0.007
Neg logit contributions
les-0.015
lights-0.015
ork-0.014
adoes-0.014
rily-0.014
Token hug
Token ID16225
Feature activation-3.85e-04

Pos logit contributions
 least+0.015
oh+0.012
 oh+0.011
 kidding+0.011
 Mir+0.010
Neg logit contributions
les-0.017
lights-0.016
ork-0.016
rily-0.016
adoes-0.016
Token each
Token ID1123
Feature activation-1.33e-03

Pos logit contributions
les+0.001
lights+0.001
ork+0.001
rily+0.001
adoes+0.001
Neg logit contributions
 least-0.001
oh-0.000
 say-0.000
 oh-0.000
 kidding-0.000
Token other
Token ID584
Feature activation-2.56e-02

Pos logit contributions
les+0.003
lights+0.003
ork+0.003
rily+0.003
adoes+0.003
Neg logit contributions
 least-0.002
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token and
Token ID290
Feature activation+1.28e-02

Pos logit contributions
lights+0.048
les+0.047
ork+0.046
rily+0.046
adoes+0.046
Neg logit contributions
 least-0.034
 say-0.027
 very-0.025
oh-0.024
 ourselves-0.024
Token say
Token ID910
Feature activation+1.82e-03

Pos logit contributions
 least+0.017
oh+0.014
 oh+0.013
 kidding+0.012
 am+0.011
Neg logit contributions
les-0.024
lights-0.024
adoes-0.022
ork-0.022
rily-0.022
Token sorry
Token ID7926
Feature activation-6.66e-03

Pos logit contributions
 least+0.002
oh+0.002
 say+0.002
 oh+0.002
 kidding+0.002
Neg logit contributions
les-0.004
lights-0.003
ork-0.003
rily-0.003
adoes-0.003
Token.
Token ID13
Feature activation-1.11e-02

Pos logit contributions
les+0.011
lights+0.011
rily+0.011
ork+0.011
adoes+0.010
Neg logit contributions
 least-0.010
oh-0.008
 say-0.008
 oh-0.008
 am-0.007
Token it
Token ID340
Feature activation-7.88e-03

Pos logit contributions
les+0.003
lights+0.003
ork+0.003
rily+0.003
adoes+0.003
Neg logit contributions
 least-0.003
oh-0.002
 say-0.002
 oh-0.002
 kidding-0.002
Token suddenly
Token ID6451
Feature activation-8.25e-03

Pos logit contributions
lights+0.014
les+0.014
rily+0.013
ork+0.013
adoes+0.013
Neg logit contributions
 least-0.012
oh-0.010
 say-0.009
 oh-0.009
 kidding-0.009
Token jumped
Token ID11687
Feature activation+6.83e-03

Pos logit contributions
lights+0.014
les+0.014
rily+0.013
ork+0.013
adoes+0.013
Neg logit contributions
 least-0.013
oh-0.010
 say-0.010
 oh-0.010
 kidding-0.009
Token right
Token ID826
Feature activation-6.64e-04

Pos logit contributions
 least+0.011
oh+0.009
 say+0.009
 oh+0.008
 kidding+0.008
Neg logit contributions
les-0.011
lights-0.011
ork-0.011
rily-0.011
adoes-0.010
Token onto
Token ID4291
Feature activation+1.79e-02

Pos logit contributions
les+0.001
lights+0.001
ork+0.001
rily+0.001
adoes+0.001
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token Madd
Token ID21457
Feature activation+1.56e-02

Pos logit contributions
 least+0.027
oh+0.022
 say+0.020
 kidding+0.020
 oh+0.019
Neg logit contributions
les-0.030
ork-0.030
lights-0.029
rily-0.029
adoes-0.028
Tokenie
Token ID494
Feature activation+6.29e-03

Pos logit contributions
 least+0.041
oh+0.036
 kidding+0.035
 oh+0.035
 am+0.035
Neg logit contributions
les-0.011
rily-0.008
lights-0.008
ork-0.007
adoes-0.007
Token and
Token ID290
Feature activation+5.37e-03

Pos logit contributions
 least+0.010
oh+0.008
 say+0.007
 kidding+0.007
 oh+0.007
Neg logit contributions
les-0.011
lights-0.010
ork-0.010
rily-0.010
adoes-0.010
Token pin
Token ID6757
Feature activation-4.35e-03

Pos logit contributions
 least+0.009
oh+0.007
 say+0.007
 kidding+0.006
 oh+0.006
Neg logit contributions
les-0.009
lights-0.008
rily-0.008
ork-0.008
adoes-0.008
Tokenched
Token ID1740
Feature activation+3.36e-02

Pos logit contributions
lights+0.006
les+0.006
ork+0.005
rily+0.005
adoes+0.005
Neg logit contributions
 least-0.008
oh-0.007
 say-0.007
 oh-0.007
 kidding-0.006
Token her
Token ID607
Feature activation+1.43e-02

Pos logit contributions
 least+0.060
oh+0.045
 kidding+0.044
 say+0.043
 worst+0.043
Neg logit contributions
les-0.051
rily-0.049
ened-0.047
adoes-0.047
ork-0.045
Token it
Token ID340
Feature activation+9.27e-03

Pos logit contributions
 least+0.007
oh+0.006
 say+0.005
 kidding+0.005
 oh+0.005
Neg logit contributions
les-0.006
lights-0.006
ork-0.006
rily-0.006
adoes-0.006
Token was
Token ID373
Feature activation+4.45e-02

Pos logit contributions
 least+0.016
oh+0.012
 kidding+0.012
 ourselves+0.011
 say+0.011
Neg logit contributions
les-0.015
lights-0.014
ork-0.014
rily-0.014
adoes-0.013
Token doing
Token ID1804
Feature activation+8.46e-03

Pos logit contributions
 least+0.085
oh+0.075
 Mir+0.068
 elbows+0.068
Ta+0.067
Neg logit contributions
les-0.056
rily-0.050
ork-0.049
adoes-0.044
ened-0.044
Token.
Token ID13
Feature activation+1.61e-02

Pos logit contributions
 least+0.015
oh+0.012
 say+0.011
 kidding+0.011
 oh+0.011
Neg logit contributions
les-0.013
lights-0.013
ork-0.012
rily-0.012
adoes-0.012
Token
Token ID220
Feature activation+3.31e-03

Pos logit contributions
 least+0.028
oh+0.023
 kidding+0.022
 worst+0.021
 oh+0.021
Neg logit contributions
les-0.024
ork-0.023
lights-0.023
adoes-0.022
rily-0.022
Token
Token ID198
Feature activation+1.36e-02

Pos logit contributions
 least+0.005
oh+0.004
 say+0.004
 oh+0.004
 kidding+0.004
Neg logit contributions
les-0.006
lights-0.006
ork-0.005
rily-0.005
adoes-0.005
Token
Token ID198
Feature activation+2.45e-03

Pos logit contributions
 least+0.024
 kidding+0.018
oh+0.018
 oh+0.018
 worst+0.017
Neg logit contributions
les-0.021
ork-0.020
lights-0.020
rily-0.019
adoes-0.019
TokenThe
Token ID464
Feature activation-1.45e-02

Pos logit contributions
 least+0.004
oh+0.004
 say+0.003
 kidding+0.003
 oh+0.003
Neg logit contributions
les-0.004
lights-0.004
ork-0.003
rily-0.003
adoes-0.003
Token squirrel
Token ID33039
Feature activation-2.21e-02

Pos logit contributions
lights+0.026
les+0.025
rily+0.025
ork+0.023
adoes+0.023
Neg logit contributions
 least-0.023
oh-0.018
 say-0.018
 oh-0.018
 am-0.017
Token said
Token ID531
Feature activation+1.57e-02

Pos logit contributions
lights+0.054
les+0.052
rily+0.051
adoes+0.051
ork+0.051
Neg logit contributions
 least-0.019
oh-0.014
 say-0.013
 oh-0.011
 am-0.009
Token,
Token ID11
Feature activation+1.88e-02

Pos logit contributions
 least+0.026
oh+0.019
 say+0.018
 kidding+0.018
 ourselves+0.018
Neg logit contributions
ork-0.026
les-0.026
lights-0.025
rily-0.025
adoes-0.024
Token to
Token ID284
Feature activation+1.54e-02

Pos logit contributions
les+0.015
lights+0.014
rily+0.014
ork+0.014
adoes+0.013
Neg logit contributions
 least-0.013
oh-0.010
 say-0.010
 oh-0.010
 am-0.009
Token do
Token ID466
Feature activation+1.67e-02

Pos logit contributions
 least+0.019
oh+0.014
 kidding+0.013
 Mir+0.012
 oh+0.012
Neg logit contributions
les-0.031
lights-0.031
ork-0.030
lf-0.029
rily-0.029
Token.
Token ID13
Feature activation-3.42e-03

Pos logit contributions
 least+0.030
oh+0.025
 kidding+0.024
 say+0.023
 Mir+0.023
Neg logit contributions
les-0.024
lights-0.023
ork-0.022
rily-0.021
adoes-0.021
Token Suddenly
Token ID24975
Feature activation+6.61e-03

Pos logit contributions
les+0.006
lights+0.006
rily+0.005
ork+0.005
adoes+0.005
Neg logit contributions
 least-0.006
oh-0.004
 say-0.004
 oh-0.004
 kidding-0.004
Token,
Token ID11
Feature activation+6.40e-03

Pos logit contributions
 least+0.010
oh+0.007
 say+0.007
 kidding+0.007
 oh+0.006
Neg logit contributions
les-0.012
lights-0.012
ork-0.012
adoes-0.011
rily-0.011
Token he
Token ID339
Feature activation-1.45e-02

Pos logit contributions
 least+0.011
oh+0.009
 say+0.008
 kidding+0.008
 oh+0.008
Neg logit contributions
les-0.010
lights-0.010
ork-0.010
rily-0.010
adoes-0.009
Token heard
Token ID2982
Feature activation-2.47e-02

Pos logit contributions
lights+0.027
rily+0.026
les+0.026
adoes+0.026
ork+0.025
Neg logit contributions
 least-0.021
oh-0.016
 say-0.015
 oh-0.015
 kidding-0.014
Token a
Token ID257
Feature activation+3.33e-02

Pos logit contributions
les+0.059
rily+0.058
lights+0.058
ork+0.056
adoes+0.055
Neg logit contributions
 least-0.020
 say-0.013
 oh-0.013
oh-0.011
 Mir-0.011
Token ringing
Token ID32333
Feature activation-1.26e-02

Pos logit contributions
 least+0.074
 kidding+0.059
oh+0.058
 ourselves+0.056
 worst+0.055
Neg logit contributions
ork-0.036
les-0.035
lights-0.035
bell-0.033
lf-0.031
Token sound
Token ID2128
Feature activation-3.48e-03

Pos logit contributions
les+0.023
lights+0.022
rily+0.022
ork+0.022
adoes+0.021
Neg logit contributions
 least-0.018
oh-0.014
 say-0.014
 oh-0.013
 am-0.013
Token coming
Token ID2406
Feature activation+2.15e-02

Pos logit contributions
les+0.006
lights+0.006
ork+0.006
rily+0.005
adoes+0.005
Neg logit contributions
 least-0.006
oh-0.004
 say-0.004
 oh-0.004
 kidding-0.004
Token with
Token ID351
Feature activation-7.59e-03

Pos logit contributions
 least+0.015
oh+0.012
 say+0.011
 kidding+0.010
 oh+0.010
Neg logit contributions
les-0.015
lights-0.014
ork-0.014
rily-0.014
adoes-0.014
Token the
Token ID262
Feature activation-9.28e-03

Pos logit contributions
les+0.016
lights+0.015
ork+0.015
rily+0.015
adoes+0.014
Neg logit contributions
 least-0.009
 say-0.007
oh-0.006
 oh-0.006
 ourselves-0.006
Token water
Token ID1660
Feature activation-9.17e-03

Pos logit contributions
les+0.016
lights+0.015
ork+0.015
rily+0.015
adoes+0.014
Neg logit contributions
 least-0.015
 say-0.011
oh-0.011
 oh-0.011
 kidding-0.011
Token.
Token ID13
Feature activation-1.17e-02

Pos logit contributions
les+0.015
lights+0.015
ork+0.015
rily+0.015
adoes+0.014
Neg logit contributions
 least-0.014
oh-0.011
 say-0.011
 kidding-0.010
 oh-0.010
Token The
Token ID383
Feature activation-2.88e-02

Pos logit contributions
les+0.021
lights+0.020
ork+0.020
rily+0.019
adoes+0.019
Neg logit contributions
 least-0.018
oh-0.014
 say-0.014
 Mir-0.013
 kidding-0.013
Token little
Token ID1310
Feature activation-1.31e-02

Pos logit contributions
rily+0.044
ork+0.043
lights+0.040
les+0.040
ened+0.039
Neg logit contributions
 least-0.052
 say-0.043
 best-0.041
 most-0.040
 worst-0.039
Token girl
Token ID2576
Feature activation+1.66e-02

Pos logit contributions
ork+0.025
lights+0.025
les+0.025
rily+0.024
adoes+0.023
Neg logit contributions
 least-0.018
 say-0.013
oh-0.013
 oh-0.012
 kidding-0.012
Token said
Token ID531
Feature activation+5.33e-03

Pos logit contributions
 least+0.028
oh+0.022
 kidding+0.021
 ourselves+0.021
 oh+0.021
Neg logit contributions
les-0.027
lights-0.025
ork-0.024
rily-0.024
adoes-0.023
Token,
Token ID11
Feature activation-8.47e-03

Pos logit contributions
 least+0.008
oh+0.006
 say+0.006
 kidding+0.006
 oh+0.006
Neg logit contributions
les-0.009
lights-0.009
rily-0.009
ork-0.009
adoes-0.009
Token "
Token ID366
Feature activation-1.38e-02

Pos logit contributions
les+0.014
lights+0.014
ork+0.013
rily+0.013
adoes+0.013
Neg logit contributions
 least-0.014
oh-0.011
 say-0.011
 oh-0.010
 kidding-0.010
TokenYou
Token ID1639
Feature activation-4.19e-03

Pos logit contributions
lights+0.028
les+0.028
ork+0.027
rily+0.027
adoes+0.026
Neg logit contributions
 least-0.018
oh-0.014
 oh-0.012
 say-0.012
 kidding-0.011
Token Tom
Token ID4186
Feature activation+6.20e-03

Pos logit contributions
 least+0.023
oh+0.018
 kidding+0.016
 ourselves+0.016
 worst+0.015
Neg logit contributions
les-0.024
lights-0.023
ork-0.023
rily-0.023
adoes-0.022
Token asks
Token ID7893
Feature activation-6.32e-04

Pos logit contributions
 least+0.011
oh+0.008
 oh+0.008
 kidding+0.008
 say+0.008
Neg logit contributions
les-0.010
lights-0.009
rily-0.009
ork-0.009
adoes-0.009
Token.
Token ID13
Feature activation-1.68e-02

Pos logit contributions
les+0.001
lights+0.001
ork+0.001
rily+0.001
adoes+0.001
Neg logit contributions
 least-0.001
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token He
Token ID679
Feature activation-1.76e-02

Pos logit contributions
les+0.030
lights+0.030
adoes+0.028
rily+0.028
ork+0.027
Neg logit contributions
 least-0.023
oh-0.020
 Mir-0.020
 say-0.020
 oh-0.019
Token reaches
Token ID12229
Feature activation+1.81e-02

Pos logit contributions
lights+0.034
les+0.032
ork+0.032
rily+0.032
adoes+0.032
Neg logit contributions
 least-0.024
oh-0.018
 say-0.018
 Mir-0.017
 kidding-0.017
Token for
Token ID329
Feature activation-1.51e-02

Pos logit contributions
 least+0.027
oh+0.022
 oh+0.020
 ourselves+0.019
 kidding+0.019
Neg logit contributions
les-0.032
rily-0.031
lights-0.031
ork-0.030
ened-0.029
Token the
Token ID262
Feature activation+9.86e-03

Pos logit contributions
les+0.031
lights+0.031
adoes+0.029
rily+0.029
ork+0.029
Neg logit contributions
 least-0.019
 say-0.014
 oh-0.014
oh-0.014
 Mir-0.013
Token necklace
Token ID38987
Feature activation+9.18e-04

Pos logit contributions
 least+0.020
oh+0.016
 kidding+0.016
 say+0.015
 oh+0.015
Neg logit contributions
les-0.013
ork-0.012
lights-0.012
rily-0.011
adoes-0.011
Token,
Token ID11
Feature activation+1.14e-02

Pos logit contributions
 least+0.002
oh+0.001
 say+0.001
 oh+0.001
 kidding+0.001
Neg logit contributions
les-0.002
lights-0.001
ork-0.001
rily-0.001
adoes-0.001
Token but
Token ID475
Feature activation+4.46e-03

Pos logit contributions
 least+0.017
oh+0.013
 am+0.012
 ourselves+0.012
 say+0.012
Neg logit contributions
les-0.020
lights-0.019
rily-0.019
ork-0.019
adoes-0.018
Token Sara
Token ID24799
Feature activation+9.92e-03

Pos logit contributions
 least+0.008
oh+0.007
 say+0.006
 kidding+0.006
 oh+0.006
Neg logit contributions
les-0.006
lights-0.006
ork-0.006
rily-0.006
adoes-0.006
Token a
Token ID257
Feature activation-5.78e-03

Pos logit contributions
 least+0.036
oh+0.030
 say+0.027
 kidding+0.027
 wait+0.026
Neg logit contributions
les-0.043
ork-0.041
lights-0.040
adoes-0.040
rily-0.039
Token tube
Token ID12403
Feature activation-1.24e-02

Pos logit contributions
les+0.009
lights+0.009
rily+0.009
ork+0.008
adoes+0.008
Neg logit contributions
 least-0.010
oh-0.008
 say-0.007
 oh-0.007
 kidding-0.007
Token in
Token ID287
Feature activation-9.18e-03

Pos logit contributions
lights+0.021
les+0.021
rily+0.021
adoes+0.020
ork+0.020
Neg logit contributions
 least-0.019
oh-0.015
 say-0.014
 oh-0.013
 kidding-0.013
Token her
Token ID607
Feature activation+9.77e-03

Pos logit contributions
les+0.020
lights+0.020
ork+0.019
rily+0.019
adoes+0.018
Neg logit contributions
 least-0.011
oh-0.007
 say-0.007
 oh-0.007
 Mir-0.006
Token toy
Token ID13373
Feature activation+2.46e-02

Pos logit contributions
 least+0.018
oh+0.015
 say+0.014
 kidding+0.014
 oh+0.014
Neg logit contributions
les-0.013
lights-0.013
ork-0.013
rily-0.012
adoes-0.012
Token box
Token ID3091
Feature activation-1.55e-02

Pos logit contributions
 least+0.045
 kidding+0.038
oh+0.038
 say+0.037
 Mir+0.036
Neg logit contributions
les-0.033
ork-0.032
lights-0.030
adoes-0.030
rily-0.028
Token.
Token ID13
Feature activation-2.45e-02

Pos logit contributions
lights+0.028
les+0.028
adoes+0.027
ork+0.026
rily+0.026
Neg logit contributions
 least-0.023
oh-0.017
 say-0.017
 oh-0.016
 am-0.015
Token It
Token ID632
Feature activation+1.78e-02

Pos logit contributions
lights+0.056
les+0.054
rily+0.051
adoes+0.050
ork+0.049
Neg logit contributions
 least-0.023
 say-0.019
oh-0.019
 Mir-0.017
 am-0.017
Token was
Token ID373
Feature activation+2.49e-02

Pos logit contributions
 least+0.034
 kidding+0.028
 ourselves+0.027
oh+0.026
 oh+0.026
Neg logit contributions
les-0.025
lights-0.024
ork-0.023
rily-0.021
adoes-0.021
Token hard
Token ID1327
Feature activation+4.38e-03

Pos logit contributions
 least+0.040
oh+0.033
 ourselves+0.028
 kidding+0.028
 say+0.028
Neg logit contributions
lights-0.041
ork-0.040
rily-0.039
les-0.039
adoes-0.039
Token and
Token ID290
Feature activation+4.46e-03

Pos logit contributions
 least+0.008
oh+0.006
 say+0.006
 oh+0.006
 kidding+0.006
Neg logit contributions
les-0.007
lights-0.007
ork-0.006
rily-0.006
adoes-0.006
Token jealous
Token ID19354
Feature activation-5.02e-03

Pos logit contributions
 least+0.017
oh+0.013
 oh+0.012
 kidding+0.012
 say+0.012
Neg logit contributions
les-0.020
lights-0.020
rily-0.019
ork-0.019
adoes-0.019
Token,
Token ID11
Feature activation+9.35e-03

Pos logit contributions
les+0.008
lights+0.008
ork+0.008
rily+0.008
adoes+0.007
Neg logit contributions
 least-0.008
oh-0.007
 say-0.006
 oh-0.006
 kidding-0.006
Token so
Token ID523
Feature activation-5.94e-03

Pos logit contributions
 least+0.013
oh+0.010
 say+0.009
 kidding+0.009
 oh+0.009
Neg logit contributions
les-0.017
lights-0.017
ork-0.016
rily-0.016
adoes-0.016
Token he
Token ID339
Feature activation-2.20e-02

Pos logit contributions
les+0.010
lights+0.010
ork+0.010
rily+0.010
adoes+0.010
Neg logit contributions
 least-0.009
oh-0.007
 say-0.007
 oh-0.007
 kidding-0.006
Token decided
Token ID3066
Feature activation-1.59e-02

Pos logit contributions
lights+0.042
les+0.040
ork+0.039
adoes+0.038
rily+0.038
Neg logit contributions
 least-0.032
oh-0.024
 say-0.024
 oh-0.022
 wait-0.021
Token to
Token ID284
Feature activation-4.42e-02

Pos logit contributions
lights+0.030
les+0.030
rily+0.028
adoes+0.027
ork+0.027
Neg logit contributions
 least-0.023
 say-0.017
oh-0.017
 oh-0.017
 kidding-0.016
Token share
Token ID2648
Feature activation-4.37e-02

Pos logit contributions
lights+0.097
les+0.093
rily+0.090
ork+0.086
adoes+0.084
Neg logit contributions
 least-0.060
 say-0.049
 wait-0.042
oh-0.039
 am-0.039
Token the
Token ID262
Feature activation-2.68e-02

Pos logit contributions
lights+0.092
ork+0.087
les+0.087
adoes+0.084
bell+0.084
Neg logit contributions
 least-0.052
 say-0.043
 ourselves-0.040
 very-0.038
 saying-0.036
Token diamond
Token ID15291
Feature activation-1.75e-02

Pos logit contributions
lights+0.037
les+0.035
ork+0.034
adoes+0.032
rily+0.031
Neg logit contributions
 least-0.053
 say-0.043
 oh-0.042
 am-0.042
 best-0.041
Token with
Token ID351
Feature activation-1.43e-02

Pos logit contributions
les+0.028
lights+0.028
ork+0.027
adoes+0.026
rily+0.026
Neg logit contributions
 least-0.029
oh-0.023
 say-0.023
 ourselves-0.021
 oh-0.021
Token them
Token ID606
Feature activation+3.09e-03

Pos logit contributions
lights+0.025
les+0.025
rily+0.024
ork+0.024
adoes+0.023
Neg logit contributions
 least-0.023
oh-0.017
 say-0.017
 oh-0.016
 kidding-0.016
Token Lily
Token ID20037
Feature activation+2.01e-02

Pos logit contributions
 least+0.014
oh+0.012
 say+0.011
 oh+0.011
 kidding+0.011
Neg logit contributions
les-0.010
lights-0.010
ork-0.009
rily-0.009
adoes-0.009
Token whispered
Token ID25029
Feature activation+1.20e-02

Pos logit contributions
 least+0.026
oh+0.018
 ourselves+0.017
 kidding+0.017
 oh+0.016
Neg logit contributions
les-0.041
lights-0.038
ork-0.038
rily-0.037
adoes-0.037
Token,
Token ID11
Feature activation-1.84e-02

Pos logit contributions
 least+0.020
oh+0.016
 say+0.015
 kidding+0.014
 am+0.014
Neg logit contributions
les-0.020
lights-0.019
rily-0.019
ork-0.018
adoes-0.018
Token holding
Token ID4769
Feature activation-1.59e-02

Pos logit contributions
les+0.037
lights+0.036
ork+0.036
adoes+0.033
rily+0.033
Neg logit contributions
 least-0.024
 say-0.018
 oh-0.017
oh-0.017
 kidding-0.016
Token a
Token ID257
Feature activation-2.62e-02

Pos logit contributions
lights+0.031
les+0.030
ork+0.030
rily+0.029
adoes+0.028
Neg logit contributions
 least-0.021
oh-0.015
 say-0.015
 oh-0.015
 Mir-0.014
Token paper
Token ID3348
Feature activation-5.54e-02

Pos logit contributions
lights+0.044
les+0.044
adoes+0.041
rily+0.041
ork+0.038
Neg logit contributions
 least-0.039
 say-0.034
oh-0.032
 am-0.031
 oh-0.031
Token hat
Token ID6877
Feature activation-3.24e-02

Pos logit contributions
les+0.092
lights+0.091
ork+0.086
adoes+0.085
ened+0.084
Neg logit contributions
 least-0.081
 say-0.067
 very-0.059
oh-0.058
 wait-0.058
Token as
Token ID355
Feature activation-1.78e-02

Pos logit contributions
lights+0.060
les+0.059
ork+0.057
adoes+0.056
rily+0.055
Neg logit contributions
 least-0.044
 say-0.037
oh-0.034
 saying-0.031
 wait-0.031
Token her
Token ID607
Feature activation+1.06e-02

Pos logit contributions
les+0.035
lights+0.034
ork+0.033
rily+0.033
adoes+0.032
Neg logit contributions
 least-0.023
oh-0.017
 say-0.016
 oh-0.016
 Mir-0.016
Token hat
Token ID6877
Feature activation-1.47e-02

Pos logit contributions
 least+0.020
oh+0.017
 oh+0.016
 say+0.016
 kidding+0.016
Neg logit contributions
les-0.014
lights-0.014
ork-0.013
rily-0.013
adoes-0.013
Token.
Token ID13
Feature activation+3.94e-03

Pos logit contributions
lights+0.025
les+0.025
ork+0.024
rily+0.024
adoes+0.023
Neg logit contributions
 least-0.022
oh-0.018
 say-0.017
 oh-0.016
 am-0.016
Token outside
Token ID2354
Feature activation+1.85e-02

Pos logit contributions
les+0.015
lights+0.015
rily+0.014
ork+0.014
adoes+0.014
Neg logit contributions
 least-0.012
oh-0.009
 say-0.009
 am-0.008
 kidding-0.008
Token and
Token ID290
Feature activation-3.32e-03

Pos logit contributions
 least+0.035
oh+0.030
 kidding+0.028
 oh+0.028
 say+0.028
Neg logit contributions
les-0.024
lights-0.024
rily-0.022
ork-0.022
adoes-0.021
Token play
Token ID711
Feature activation-1.06e-02

Pos logit contributions
les+0.006
lights+0.006
ork+0.006
rily+0.006
adoes+0.005
Neg logit contributions
 least-0.005
oh-0.004
 say-0.004
 oh-0.003
 kidding-0.003
Token in
Token ID287
Feature activation+7.26e-03

Pos logit contributions
les+0.017
lights+0.017
ork+0.016
rily+0.016
adoes+0.016
Neg logit contributions
 least-0.017
oh-0.014
 say-0.013
 oh-0.013
 am-0.013
Token her
Token ID607
Feature activation-2.00e-02

Pos logit contributions
 least+0.010
oh+0.008
 say+0.007
 kidding+0.007
 oh+0.007
Neg logit contributions
les-0.013
lights-0.013
ork-0.013
rily-0.013
adoes-0.012
Token white
Token ID2330
Feature activation-6.00e-02

Pos logit contributions
les+0.035
rily+0.032
ork+0.032
lights+0.032
adoes+0.031
Neg logit contributions
 least-0.032
 say-0.023
 best-0.023
 am-0.022
 ourselves-0.022
Token hut
Token ID40812
Feature activation-2.72e-05

Pos logit contributions
rily+0.106
les+0.105
ocked+0.101
aff+0.101
lights+0.100
Neg logit contributions
 least-0.073
 am-0.058
 very-0.056
 best-0.056
 say-0.055
Token.
Token ID13
Feature activation-5.95e-03

Pos logit contributions
les+0.000
lights+0.000
ork+0.000
rily+0.000
adoes+0.000
Neg logit contributions
 least-0.000
oh-0.000
 say-0.000
 oh-0.000
 kidding-0.000
Token
Token ID220
Feature activation-1.29e-03

Pos logit contributions
les+0.011
lights+0.010
ork+0.010
rily+0.010
adoes+0.010
Neg logit contributions
 least-0.009
oh-0.007
 say-0.007
 oh-0.006
 am-0.006
Token
Token ID198
Feature activation+7.77e-03

Pos logit contributions
les+0.002
lights+0.002
ork+0.002
rily+0.002
adoes+0.002
Neg logit contributions
 least-0.002
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token
Token ID198
Feature activation-1.21e-02

Pos logit contributions
 least+0.013
oh+0.010
 kidding+0.010
 say+0.010
 oh+0.009
Neg logit contributions
les-0.013
lights-0.012
ork-0.012
rily-0.012
adoes-0.011
Token his
Token ID465
Feature activation-1.12e-02

Pos logit contributions
les+0.084
lights+0.080
rily+0.078
ork+0.078
pron+0.077
Neg logit contributions
 least-0.044
 say-0.037
 very-0.036
 ourselves-0.034
 saying-0.033
Token special
Token ID2041
Feature activation-3.96e-02

Pos logit contributions
les+0.018
lights+0.018
ork+0.017
rily+0.017
adoes+0.017
Neg logit contributions
 least-0.018
 say-0.014
oh-0.014
 oh-0.014
 kidding-0.013
Token spot
Token ID4136
Feature activation-1.69e-02

Pos logit contributions
lights+0.063
les+0.063
rily+0.061
ork+0.059
adoes+0.058
Neg logit contributions
 least-0.061
 say-0.051
 am-0.050
 best-0.047
 cle-0.047
Token.
Token ID13
Feature activation-1.22e-02

Pos logit contributions
les+0.029
lights+0.028
ork+0.027
rily+0.027
adoes+0.027
Neg logit contributions
 least-0.026
oh-0.020
 say-0.020
 oh-0.018
 am-0.018
Token The
Token ID383
Feature activation-4.40e-02

Pos logit contributions
les+0.021
lights+0.020
ork+0.019
rily+0.019
adoes+0.019
Neg logit contributions
 least-0.020
oh-0.015
 say-0.015
 kidding-0.015
 Mir-0.014
Token first
Token ID717
Feature activation-6.90e-02

Pos logit contributions
ork+0.090
lights+0.077
rily+0.075
les+0.074
bell+0.073
Neg logit contributions
 least-0.067
 best-0.054
 most-0.054
 very-0.052
 worst-0.049
Token all
Token ID477
Feature activation-1.99e-02

Pos logit contributions
ork+0.123
lights+0.118
pron+0.109
bell+0.107
les+0.107
Neg logit contributions
 least-0.094
 best-0.091
 most-0.087
 very-0.084
 so-0.081
Tokenigator
Token ID23823
Feature activation+3.51e-03

Pos logit contributions
ork+0.029
lights+0.028
les+0.028
rily+0.027
adoes+0.026
Neg logit contributions
 least-0.036
oh-0.030
 say-0.029
 oh-0.027
 am-0.027
Token wished
Token ID16555
Feature activation+7.75e-03

Pos logit contributions
 least+0.006
oh+0.004
 say+0.004
 oh+0.004
 kidding+0.004
Neg logit contributions
les-0.006
lights-0.006
ork-0.006
rily-0.006
adoes-0.005
Token he
Token ID339
Feature activation-2.29e-02

Pos logit contributions
 least+0.013
oh+0.010
 say+0.010
 kidding+0.009
 oh+0.009
Neg logit contributions
les-0.013
lights-0.012
ork-0.012
rily-0.012
adoes-0.012
Token had
Token ID550
Feature activation-1.85e-02

Pos logit contributions
lights+0.045
les+0.044
adoes+0.042
rily+0.042
ork+0.042
Neg logit contributions
 least-0.031
oh-0.024
 say-0.022
 oh-0.021
 am-0.021
Token.
Token ID13
Feature activation+1.20e-02

Pos logit contributions
lights+0.032
les+0.031
rily+0.030
ork+0.030
adoes+0.029
Neg logit contributions
 least-0.025
 say-0.020
oh-0.020
 am-0.018
 Mir-0.017
Token They
Token ID1119
Feature activation+1.22e-02

Pos logit contributions
 least+0.022
oh+0.018
 oh+0.017
 kidding+0.017
 say+0.017
Neg logit contributions
lights-0.017
les-0.017
ork-0.016
adoes-0.016
rily-0.015
Token had
Token ID550
Feature activation+2.16e-02

Pos logit contributions
 least+0.021
oh+0.017
 oh+0.016
 kidding+0.016
 ourselves+0.016
Neg logit contributions
les-0.019
lights-0.018
ork-0.017
rily-0.017
adoes-0.016
Token fun
Token ID1257
Feature activation-1.33e-02

Pos logit contributions
 least+0.032
oh+0.027
 kidding+0.024
 oh+0.024
 say+0.023
Neg logit contributions
lights-0.036
les-0.036
ork-0.036
rily-0.034
adoes-0.033
Token and
Token ID290
Feature activation+4.75e-03

Pos logit contributions
les+0.026
lights+0.026
rily+0.025
ork+0.025
adoes+0.024
Neg logit contributions
 least-0.017
oh-0.012
 say-0.012
 oh-0.012
 ourselves-0.011
Token shared
Token ID4888
Feature activation-5.76e-02

Pos logit contributions
 least+0.008
oh+0.006
 say+0.006
 oh+0.006
 kidding+0.006
Neg logit contributions
les-0.008
lights-0.008
ork-0.007
rily-0.007
adoes-0.007
Token their
Token ID511
Feature activation+4.05e-03

Pos logit contributions
urbed+0.118
les+0.116
lights+0.115
emaker+0.113
rily+0.112
Neg logit contributions
 say-0.054
 very-0.053
 least-0.046
 ourselves-0.046
 Mir-0.044
Token pictures
Token ID5986
Feature activation-2.09e-02

Pos logit contributions
 least+0.008
oh+0.006
 say+0.006
 oh+0.006
 kidding+0.006
Neg logit contributions
les-0.005
lights-0.005
ork-0.005
rily-0.005
adoes-0.005
Token.
Token ID13
Feature activation+1.21e-02

Pos logit contributions
les+0.037
adoes+0.035
lights+0.035
rily+0.034
ened+0.034
Neg logit contributions
 least-0.031
 say-0.024
oh-0.024
 ourselves-0.022
 very-0.021
Token
Token ID198
Feature activation+1.32e-03

Pos logit contributions
 least+0.021
oh+0.017
 oh+0.016
 kidding+0.016
 say+0.016
Neg logit contributions
lights-0.018
les-0.018
ork-0.018
adoes-0.017
rily-0.016
Token
Token ID198
Feature activation+5.80e-04

Pos logit contributions
 least+0.002
oh+0.002
 say+0.001
 oh+0.001
 kidding+0.001
Neg logit contributions
les-0.002
lights-0.002
ork-0.002
rily-0.002
adoes-0.002
Token rest
Token ID1334
Feature activation+7.91e-03

Pos logit contributions
lights+0.032
les+0.031
rily+0.029
ork+0.029
adoes+0.029
Neg logit contributions
 least-0.022
 say-0.017
oh-0.016
 wait-0.016
 am-0.015
Token.
Token ID13
Feature activation-1.43e-03

Pos logit contributions
 least+0.014
oh+0.011
 say+0.010
 kidding+0.010
 oh+0.010
Neg logit contributions
les-0.012
lights-0.012
rily-0.012
ork-0.011
adoes-0.011
Token She
Token ID1375
Feature activation-1.52e-02

Pos logit contributions
les+0.003
lights+0.003
ork+0.003
rily+0.003
adoes+0.003
Neg logit contributions
 least-0.002
oh-0.001
 say-0.001
 oh-0.001
 kidding-0.001
Token found
Token ID1043
Feature activation+2.05e-02

Pos logit contributions
lights+0.027
les+0.026
adoes+0.026
ork+0.025
rily+0.025
Neg logit contributions
 least-0.024
oh-0.018
 say-0.018
 oh-0.017
 wait-0.016
Token a
Token ID257
Feature activation+8.46e-03

Pos logit contributions
 least+0.029
oh+0.024
 kidding+0.022
 say+0.021
 am+0.021
Neg logit contributions
les-0.037
rily-0.035
lights-0.035
ork-0.034
adoes-0.033
Token cozy
Token ID37438
Feature activation-5.14e-02

Pos logit contributions
 least+0.016
oh+0.013
 say+0.012
 kidding+0.012
 oh+0.012
Neg logit contributions
les-0.012
lights-0.012
ork-0.011
rily-0.011
adoes-0.011
Token spot
Token ID4136
Feature activation+7.54e-03

Pos logit contributions
lights+0.095
les+0.090
adoes+0.087
rily+0.085
ocked+0.077
Neg logit contributions
 least-0.076
oh-0.059
 say-0.059
 very-0.058
 oh-0.056
Token under
Token ID739
Feature activation-1.48e-02

Pos logit contributions
 least+0.013
oh+0.010
 say+0.010
 kidding+0.009
 oh+0.009
Neg logit contributions
les-0.012
lights-0.012
ork-0.011
rily-0.011
adoes-0.011
Token a
Token ID257
Feature activation-3.70e-03

Pos logit contributions
les+0.035
lights+0.035
rily+0.033
adoes+0.033
ork+0.033
Neg logit contributions
 least-0.014
oh-0.008
 say-0.008
 oh-0.008
 Mir-0.008
Token tree
Token ID5509
Feature activation-1.07e-02

Pos logit contributions
les+0.006
lights+0.006
rily+0.005
ork+0.005
adoes+0.005
Neg logit contributions
 least-0.006
oh-0.005
 say-0.005
 oh-0.005
 kidding-0.005
Token and
Token ID290
Feature activation-3.50e-03

Pos logit contributions
lights+0.018
les+0.018
rily+0.017
ork+0.017
adoes+0.017
Neg logit contributions
 least-0.017
oh-0.014
 say-0.013
 oh-0.012
 kidding-0.012