TOP ACTIVATIONS
MAX = 0.064

". The puppy w agged his tail and l icked her
but be careful . Ost ric hes can be gr umpy
. The puppy w agged its tail and l icked her
The dog w agged his tail and l icked the
F uzz y w agged his tail and pointed to the
should not tease the ost ric hes . They are not
house . Max w agged his tail and agreed to help
The puppy w agged his tail . He rushed up
. The dog w agged his tail and l icked their
friendly . Max w agged his tail and felt happy that
. The dog w agged its tail and became friendly .
The puppy w agged its tail and followed Lily back
. The dog w agged his tail and said , "
The dog w agged his tail and walked with Sally
. The dog w agged his tail and l icked her
. The dog w agged his tail and l icked Fl
walk . Spot w agged his tail and bark ed happily
The puppy w agged his tail . He l icked
." The dog w agged his tail and said , "
Max w agged his tail and ran around happily

MIN ACTIVATIONS
MIN = -0.051

them and make a y ummy salad for lunch !"
bite and makes a y ummy sound . "
kitchen and made a y ummy cake . They also made
fast it made a bub bly sound . The little boy
pepper and made a y ummy snack . L
together and made a y ummy shrimp dish .
and they made a y ummy apple pie . But when
berries and made a y ummy b erry pie . It
so surprised that they started tor oll away .
early and make a y ummy breakfast for her family .
up to reveal a y ummy surprise .
fridge and found a y ummy ice cream cake that was
strawberries and made a y ummy strawberry cake . She smiled
boom and see a fire work shoot up in the air
pie turn brown and bub bly . " D
they were making a y ummy and pretty cake for their
too cold for a re ind eer like you !"
. He made a y ummy sound too .
bread and made a y ummy sandwich . But
surrender and make a y ummy snack instead ," she said

INTERVAL 0.035 - 0.064
CONTAINS 0.021%

Yes , they are ost ric hes ," mom said .
" Ben " on the box with a marker . Then
s eyes filled with tears . He had been tricked
it . He w agged his tail and bark ed .
They sit on the grass . They talk and share

INTERVAL 0.006 - 0.035
CONTAINS 19.807%

Dad said , " Let 's record it !"
play with it . The girl seemed spoiled and told Johnny
" I want to change my home ." So , the
a veterinarian when she grew up . She liked to read
used cookie cut ters to cut out different shapes and put

INTERVAL -0.022 - 0.006
CONTAINS 79.734%

! It hopped and hopped until it reached the waterfall .
, I was right ! There are jewels and coins and
beautiful gem that he wanted to sell . He went to
. She hugged her bear tightly and said , " I
. I 'll take care of it ." Her mom went

INTERVAL -0.051 - -0.022
CONTAINS 0.438%

, and they were also very happy for him .
outside to the garden to wait for the puppy . She
saw a toy that he really wanted . He asked his
Lily and Tim my went back outside to gather more wood
he had to do the job properly or he would get
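
The four INTERVAL ranges above are consistent, up to three-decimal rounding, with splitting the span from MIN (-0.051) to MAX (0.064) into four equal-width bins, and the four CONTAINS percentages sum to 100%. The following is a minimal sketch under that equal-width assumption; the dashboard's actual binning method is not stated here, and the names `interval_edges`, `bin_index`, and `bin_percentages` are hypothetical helpers introduced only for illustration.

```python
def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced edges from lo to hi (assumed binning)."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def bin_index(value, lo, hi, n_bins=4):
    """Index of the equal-width bin containing value, clamped to [0, n_bins - 1]."""
    width = (hi - lo) / n_bins
    idx = int((value - lo) / width)
    return max(0, min(idx, n_bins - 1))


def bin_percentages(values, lo, hi, n_bins=4):
    """Percentage of values that fall in each bin, ordered from lowest to highest."""
    counts = [0] * n_bins
    for v in values:
        counts[bin_index(v, lo, hi, n_bins)] += 1
    return [100.0 * c / len(values) for c in counts]


# MIN and MAX as reported in the listing above.
edges = interval_edges(-0.051, 0.064)

# CONTAINS percentages as reported, ordered from the lowest interval to the highest.
reported = [0.438, 79.734, 19.807, 0.021]
```

With these helpers, an activation such as +6.36e-02 maps to the highest bin (index 3), which matches the 0.035 - 0.064 interval.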

PER-TOKEN DETAILS

Token".
Token ID1911
Feature activation+2.46e-03

Pos logit contributions
Mom+0.002
Everyone+0.002
har+0.002
um+0.002
 doorstep+0.002
Neg logit contributions
 pret-0.001
 Stri-0.001
 enough-0.001
 sy-0.001
 asleep-0.001
Token The
Token ID383
Feature activation-3.88e-03

Pos logit contributions
 pret+0.004
 Stri+0.004
 asleep+0.004
rob+0.004
 sy+0.003
Neg logit contributions
Mom-0.004
um-0.004
 year-0.004
Everyone-0.004
ini-0.004
Token puppy
Token ID26188
Feature activation+1.05e-02

Pos logit contributions
Everyone+0.007
har+0.007
ator+0.007
ooks+0.006
um+0.006
Neg logit contributions
rob-0.005
 sy-0.005
 pret-0.005
 scar-0.005
 Stri-0.004
Token w
Token ID266
Feature activation+8.85e-03

Pos logit contributions
rob+0.018
 pret+0.018
 Stri+0.017
 sy+0.016
 Fro+0.015
Neg logit contributions
Mom-0.017
Everyone-0.015
um-0.015
 year-0.015
Daddy-0.014
Tokenagged
Token ID14655
Feature activation+1.33e-02

Pos logit contributions
rob+0.015
 pret+0.014
 Stri+0.014
 sy+0.013
 Fro+0.012
Neg logit contributions
Mom-0.015
Everyone-0.013
 year-0.013
um-0.013
Daddy-0.012
Token his
Token ID465
Feature activation+6.36e-02

Pos logit contributions
rob+0.020
 pret+0.017
 Stri+0.017
 sy+0.016
 Roc+0.014
Neg logit contributions
Mom-0.026
Everyone-0.024
um-0.023
 year-0.023
ally-0.022
Token tail
Token ID7894
Feature activation+2.59e-02

Pos logit contributions
rob+0.089
 pret+0.082
 Stri+0.079
 sy+0.075
 Fro+0.067
Neg logit contributions
Mom-0.124
Everyone-0.118
 year-0.113
um-0.113
ally-0.106
Token and
Token ID290
Feature activation+7.91e-03

Pos logit contributions
rob+0.048
 pret+0.043
 Stri+0.042
 sy+0.040
 Roc+0.037
Neg logit contributions
Mom-0.041
Everyone-0.038
um-0.037
 year-0.037
ally-0.033
Token l
Token ID300
Feature activation-5.90e-04

Pos logit contributions
rob+0.015
 pret+0.013
 Stri+0.013
 sy+0.013
 Roc+0.011
Neg logit contributions
Mom-0.012
Everyone-0.011
 year-0.011
um-0.010
ally-0.010
Tokenicked
Token ID9484
Feature activation-2.21e-03

Pos logit contributions
Mom+0.001
um+0.001
Everyone+0.001
 year+0.001
ally+0.001
Neg logit contributions
rob-0.001
 pret-0.001
 Stri-0.001
 sy-0.001
 scar-0.001
Token her
Token ID607
Feature activation+1.89e-02

Pos logit contributions
Mom+0.004
Everyone+0.004
 year+0.004
um+0.004
har+0.004
Neg logit contributions
rob-0.003
 pret-0.003
 Stri-0.003
 sy-0.003
 Fro-0.002
Token but
Token ID475
Feature activation-4.17e-04

Pos logit contributions
rob+0.012
 pret+0.011
 sy+0.010
 Stri+0.010
 asleep+0.010
Neg logit contributions
Mom-0.006
Everyone-0.005
um-0.005
 year-0.005
ally-0.004
Token be
Token ID307
Feature activation+1.09e-02

Pos logit contributions
Mom+0.001
Everyone+0.000
um+0.000
 year+0.000
ally+0.000
Neg logit contributions
rob-0.001
 pret-0.001
 Stri-0.001
 sy-0.001
 Roc-0.001
Token careful
Token ID8161
Feature activation+6.13e-03

Pos logit contributions
rob+0.021
 pret+0.019
 Stri+0.018
 sy+0.018
 Roc+0.016
Neg logit contributions
Mom-0.017
Everyone-0.015
um-0.015
 year-0.015
har-0.013
Token.
Token ID13
Feature activation+9.28e-04

Pos logit contributions
rob+0.010
 Stri+0.009
 pret+0.009
 sy+0.009
 Roc+0.008
Neg logit contributions
Mom-0.010
Everyone-0.010
 year-0.009
ally-0.009
har-0.008
Token Ost
Token ID38919
Feature activation-4.57e-03

Pos logit contributions
rob+0.002
 pret+0.002
 Stri+0.002
 sy+0.001
 scar+0.001
Neg logit contributions
Mom-0.001
Everyone-0.001
um-0.001
 year-0.001
ally-0.001
Tokenric
Token ID1173
Feature activation+6.32e-02

Pos logit contributions
Mom+0.009
Everyone+0.009
um+0.008
 year+0.008
ally+0.008
Neg logit contributions
rob-0.006
 pret-0.006
 Stri-0.005
 sy-0.005
 scar-0.005
Tokenhes
Token ID956
Feature activation+3.24e-02

Pos logit contributions
rob+0.108
 pret+0.098
 sy+0.095
 Stri+0.094
 scar+0.086
Neg logit contributions
Mom-0.106
Everyone-0.102
 year-0.095
um-0.094
 Everyone-0.086
Token can
Token ID460
Feature activation+1.43e-02

Pos logit contributions
rob+0.059
 pret+0.052
 Stri+0.051
 sy+0.048
 Roc+0.045
Neg logit contributions
Mom-0.053
Everyone-0.050
 year-0.048
um-0.047
 Everyone-0.042
Token be
Token ID307
Feature activation+2.49e-02

Pos logit contributions
rob+0.028
 pret+0.025
 Stri+0.023
 sy+0.023
 asleep+0.021
Neg logit contributions
Mom-0.023
Everyone-0.020
um-0.019
 year-0.019
Daddy-0.017
Token gr
Token ID1036
Feature activation+1.51e-02

Pos logit contributions
rob+0.050
 pret+0.043
 Stri+0.042
 sy+0.040
 Roc+0.037
Neg logit contributions
Mom-0.038
Everyone-0.034
 year-0.033
um-0.032
har-0.030
Tokenumpy
Token ID32152
Feature activation+7.60e-03

Pos logit contributions
rob+0.028
 pret+0.026
 Stri+0.025
 sy+0.024
 Roc+0.022
Neg logit contributions
Mom-0.023
Everyone-0.022
um-0.021
 year-0.021
ally-0.019
Token.
Token ID13
Feature activation+1.20e-03

Pos logit contributions
Mom+0.010
Everyone+0.009
um+0.009
har+0.008
VP+0.008
Neg logit contributions
rob-0.009
 Stri-0.008
 pret-0.008
 sy-0.007
 asleep-0.007
Token The
Token ID383
Feature activation-9.94e-03

Pos logit contributions
rob+0.002
 Stri+0.002
 pret+0.002
 sy+0.002
 Fro+0.002
Neg logit contributions
Mom-0.002
 year-0.002
Everyone-0.002
um-0.002
ator-0.002
Token puppy
Token ID26188
Feature activation+3.70e-03

Pos logit contributions
Everyone+0.017
har+0.017
Mom+0.017
um+0.017
ator+0.016
Neg logit contributions
rob-0.015
 sy-0.013
 pret-0.013
 scar-0.013
 Stri-0.013
Token w
Token ID266
Feature activation+1.39e-03

Pos logit contributions
rob+0.007
 pret+0.007
 Stri+0.007
 sy+0.007
 Fro+0.006
Neg logit contributions
Mom-0.005
Everyone-0.005
um-0.004
 year-0.004
ator-0.004
Tokenagged
Token ID14655
Feature activation+1.04e-02

Pos logit contributions
rob+0.003
 pret+0.003
 Stri+0.003
 sy+0.002
 Roc+0.002
Neg logit contributions
Mom-0.002
Everyone-0.002
um-0.002
 year-0.002
ally-0.002
Token its
Token ID663
Feature activation+6.12e-02

Pos logit contributions
rob+0.016
 pret+0.014
 Stri+0.014
 sy+0.013
 Roc+0.012
Neg logit contributions
Mom-0.020
Everyone-0.018
um-0.018
 year-0.018
ally-0.017
Token tail
Token ID7894
Feature activation+2.26e-02

Pos logit contributions
rob+0.065
 pret+0.063
 Stri+0.060
 sy+0.058
 scar+0.054
Neg logit contributions
Mom-0.127
Everyone-0.127
 year-0.121
um-0.118
ator-0.116
Token and
Token ID290
Feature activation+1.10e-02

Pos logit contributions
rob+0.039
 pret+0.036
 Stri+0.035
 sy+0.033
 Roc+0.031
Neg logit contributions
Mom-0.038
Everyone-0.035
um-0.033
 year-0.033
ally-0.031
Token l
Token ID300
Feature activation+2.72e-05

Pos logit contributions
rob+0.017
 pret+0.015
 Stri+0.015
 sy+0.014
 Roc+0.013
Neg logit contributions
Mom-0.020
Everyone-0.019
 year-0.018
um-0.017
ally-0.016
Tokenicked
Token ID9484
Feature activation-4.74e-03

Pos logit contributions
rob+0.000
 pret+0.000
 Stri+0.000
 sy+0.000
 scar+0.000
Neg logit contributions
Mom-0.000
um-0.000
Everyone-0.000
 year-0.000
ally-0.000
Token her
Token ID607
Feature activation+2.08e-02

Pos logit contributions
Mom+0.008
Everyone+0.008
 year+0.008
um+0.008
 doorstep+0.007
Neg logit contributions
rob-0.007
 pret-0.007
 Stri-0.007
 sy-0.006
 Fro-0.006
Token
Token ID26391
Feature activation-6.35e-03

Pos logit contributions
Mom+0.029
Everyone+0.026
um+0.025
 year+0.025
 doorstep+0.024
Neg logit contributions
rob-0.036
 pret-0.032
 Stri-0.031
 sy-0.029
 Roc-0.027
Token The
Token ID383
Feature activation-8.93e-03

Pos logit contributions
Mom+0.009
Everyone+0.008
um+0.008
 year+0.008
ally+0.007
Neg logit contributions
rob-0.012
 Stri-0.012
 pret-0.011
 sy-0.010
 Roc-0.010
Token dog
Token ID3290
Feature activation+7.72e-03

Pos logit contributions
ator+0.016
Everyone+0.016
Mom+0.015
um+0.015
ally+0.015
Neg logit contributions
rob-0.013
 Stri-0.012
 pret-0.012
 sy-0.012
 Fro-0.011
Token w
Token ID266
Feature activation+7.31e-03

Pos logit contributions
rob+0.014
 pret+0.013
 Stri+0.013
 sy+0.012
 ro+0.011
Neg logit contributions
Mom-0.013
Everyone-0.012
um-0.011
 year-0.010
Daddy-0.010
Tokenagged
Token ID14655
Feature activation+9.71e-03

Pos logit contributions
rob+0.013
 pret+0.012
 Stri+0.012
 sy+0.011
 Fro+0.010
Neg logit contributions
Mom-0.012
Everyone-0.011
 year-0.010
um-0.010
 doorstep-0.010
Token his
Token ID465
Feature activation+6.11e-02

Pos logit contributions
rob+0.011
 pret+0.010
 Stri+0.009
 sy+0.008
 Fro+0.008
Neg logit contributions
Mom-0.022
Everyone-0.021
um-0.020
 year-0.020
 doorstep-0.019
Token tail
Token ID7894
Feature activation+2.28e-02

Pos logit contributions
rob+0.093
 pret+0.084
 Stri+0.082
 sy+0.077
 Roc+0.070
Neg logit contributions
Mom-0.115
Everyone-0.107
um-0.104
 year-0.103
ally-0.096
Token and
Token ID290
Feature activation+6.95e-03

Pos logit contributions
rob+0.032
 pret+0.031
 Stri+0.030
 sy+0.028
 asleep+0.027
Neg logit contributions
Mom-0.042
Everyone-0.039
um-0.037
Daddy-0.037
ally-0.036
Token l
Token ID300
Feature activation-2.85e-03

Pos logit contributions
rob+0.011
 pret+0.010
 Stri+0.009
 sy+0.009
 Roc+0.008
Neg logit contributions
Mom-0.013
Everyone-0.012
um-0.012
 year-0.012
ally-0.011
Tokenicked
Token ID9484
Feature activation-6.57e-03

Pos logit contributions
Mom+0.006
Everyone+0.006
um+0.005
 year+0.005
ally+0.005
Neg logit contributions
rob-0.004
 pret-0.003
 Stri-0.003
 sy-0.003
 scar-0.003
Token the
Token ID262
Feature activation+1.23e-02

Pos logit contributions
Mom+0.013
Everyone+0.012
 year+0.012
um+0.011
 doorstep+0.011
Neg logit contributions
rob-0.009
 Stri-0.008
 pret-0.008
 sy-0.008
 Fro-0.007
TokenF
Token ID37
Feature activation-1.06e-02

Pos logit contributions
rob+0.000
 pret+0.000
 Stri+0.000
 sy+0.000
 Roc+0.000
Neg logit contributions
Mom-0.000
Everyone-0.000
um-0.000
 year-0.000
ally-0.000
Tokenuzz
Token ID4715
Feature activation-4.38e-03

Pos logit contributions
Mom+0.020
Everyone+0.020
um+0.018
Daddy+0.017
 year+0.017
Neg logit contributions
rob-0.015
 pret-0.015
 Stri-0.014
 sy-0.013
 Fro-0.012
Tokeny
Token ID88
Feature activation+5.62e-03

Pos logit contributions
Mom+0.009
Everyone+0.008
um+0.008
 year+0.008
ally+0.007
Neg logit contributions
rob-0.006
 pret-0.005
 Stri-0.005
 sy-0.005
 Roc-0.004
Token w
Token ID266
Feature activation+9.24e-03

Pos logit contributions
 pret+0.008
 Stri+0.008
rob+0.008
 Fro+0.007
 ro+0.007
Neg logit contributions
Mom-0.010
Everyone-0.009
 year-0.009
ini-0.008
ator-0.008
Tokenagged
Token ID14655
Feature activation+8.64e-03

Pos logit contributions
rob+0.016
 pret+0.014
 Stri+0.014
 sy+0.013
 Fro+0.012
Neg logit contributions
Mom-0.016
Everyone-0.014
 year-0.014
um-0.013
 doorstep-0.013
Token his
Token ID465
Feature activation+6.11e-02

Pos logit contributions
rob+0.012
 pret+0.011
 Stri+0.010
 sy+0.010
 Roc+0.008
Neg logit contributions
Mom-0.018
Everyone-0.016
um-0.016
 year-0.016
ally-0.015
Token tail
Token ID7894
Feature activation+2.12e-02

Pos logit contributions
rob+0.096
 pret+0.086
 Stri+0.084
 sy+0.079
 Roc+0.071
Neg logit contributions
Mom-0.113
Everyone-0.105
um-0.102
 year-0.101
ally-0.094
Token and
Token ID290
Feature activation+1.25e-02

Pos logit contributions
rob+0.035
 pret+0.032
 Stri+0.032
 sy+0.030
 Fro+0.027
Neg logit contributions
Mom-0.036
Everyone-0.034
um-0.032
 year-0.032
ally-0.030
Token pointed
Token ID6235
Feature activation-1.87e-02

Pos logit contributions
rob+0.021
 pret+0.018
 sy+0.017
 Stri+0.017
robat+0.015
Neg logit contributions
Mom-0.022
Everyone-0.020
um-0.019
 year-0.018
ally-0.018
Token to
Token ID284
Feature activation-7.36e-04

Pos logit contributions
Mom+0.035
Everyone+0.033
 year+0.031
um+0.031
ally+0.029
Neg logit contributions
rob-0.028
 pret-0.026
 Stri-0.025
 sy-0.023
 Roc-0.022
Token the
Token ID262
Feature activation-1.08e-02

Pos logit contributions
Mom+0.001
Everyone+0.001
 year+0.001
 Everyone+0.001
um+0.001
Neg logit contributions
rob-0.001
 pret-0.001
 sy-0.001
 Stri-0.001
robat-0.001
Token should
Token ID815
Feature activation+2.89e-04

Pos logit contributions
rob+0.007
 pret+0.007
 Stri+0.007
 sy+0.006
 ro+0.006
Neg logit contributions
Mom-0.009
um-0.008
 year-0.008
Everyone-0.008
ally-0.007
Token not
Token ID407
Feature activation+5.95e-03

Pos logit contributions
 pret+0.000
 Stri+0.000
rob+0.000
 sy+0.000
 Fro+0.000
Neg logit contributions
 year-0.000
Mom-0.000
urious-0.000
Everyone-0.000
har-0.000
Token tease
Token ID34691
Feature activation+1.92e-02

Pos logit contributions
 pret+0.007
 Stri+0.007
 sy+0.006
rob+0.006
ah+0.006
Neg logit contributions
ator-0.011
Mom-0.011
um-0.011
 year-0.010
Everyone-0.010
Token the
Token ID262
Feature activation+8.41e-03

Pos logit contributions
rob+0.034
 pret+0.032
 Stri+0.031
 sy+0.030
 Roc+0.028
Neg logit contributions
Mom-0.030
Everyone-0.028
um-0.027
 year-0.027
 doorstep-0.026
Token ost
Token ID23619
Feature activation+2.01e-02

Pos logit contributions
rob+0.012
 pret+0.011
 Stri+0.010
 sy+0.010
 Roc+0.009
Neg logit contributions
Mom-0.017
Everyone-0.016
um-0.015
 year-0.015
ally-0.014
Tokenric
Token ID1173
Feature activation+6.09e-02

Pos logit contributions
rob+0.028
 pret+0.025
 Stri+0.024
 sy+0.023
 Roc+0.020
Neg logit contributions
Mom-0.041
Everyone-0.038
um-0.037
 year-0.037
ally-0.035
Tokenhes
Token ID956
Feature activation+2.88e-02

Pos logit contributions
rob+0.095
 pret+0.086
 Stri+0.083
 sy+0.080
 scar+0.072
Neg logit contributions
Mom-0.112
Everyone-0.105
 year-0.101
um-0.101
ally-0.091
Token.
Token ID13
Feature activation-1.76e-03

Pos logit contributions
rob+0.059
 pret+0.054
 Stri+0.053
 sy+0.051
 Roc+0.047
Neg logit contributions
Mom-0.040
Everyone-0.036
 year-0.035
um-0.034
ally-0.030
Token They
Token ID1119
Feature activation+9.09e-03

Pos logit contributions
Mom+0.003
Everyone+0.002
 year+0.002
um+0.002
 doorstep+0.002
Neg logit contributions
rob-0.003
 Stri-0.003
 pret-0.003
 sy-0.003
 Roc-0.003
Token are
Token ID389
Feature activation+1.31e-02

Pos logit contributions
rob+0.012
 pret+0.011
 Stri+0.011
 sy+0.010
 Roc+0.009
Neg logit contributions
Mom-0.019
Everyone-0.017
um-0.017
 year-0.017
ally-0.016
Token not
Token ID407
Feature activation+2.76e-03

Pos logit contributions
rob+0.027
 pret+0.025
 Stri+0.024
 sy+0.023
 Roc+0.021
Neg logit contributions
Mom-0.018
Everyone-0.016
um-0.016
 year-0.016
ally-0.014
Token house
Token ID2156
Feature activation-2.62e-03

Pos logit contributions
Mom+0.001
Everyone+0.001
ally+0.001
um+0.001
 year+0.001
Neg logit contributions
 pret-0.001
 sy-0.001
rob-0.001
 asleep-0.001
 Stri-0.001
Token.
Token ID13
Feature activation+1.59e-03

Pos logit contributions
Mom+0.004
Everyone+0.004
um+0.003
 year+0.003
ally+0.003
Neg logit contributions
rob-0.005
 pret-0.005
 Stri-0.004
 sy-0.004
 asleep-0.004
Token Max
Token ID5436
Feature activation+1.91e-04

Pos logit contributions
rob+0.003
 Stri+0.003
 pret+0.002
 sy+0.002
 Fro+0.002
Neg logit contributions
Mom-0.003
 year-0.002
Everyone-0.002
um-0.002
ator-0.002
Token w
Token ID266
Feature activation+5.87e-03

Pos logit contributions
rob+0.000
 pret+0.000
 Stri+0.000
 sy+0.000
 Roc+0.000
Neg logit contributions
Mom-0.000
Everyone-0.000
 year-0.000
um-0.000
ally-0.000
Tokenagged
Token ID14655
Feature activation+1.17e-02

Pos logit contributions
rob+0.010
 pret+0.009
 Stri+0.009
 sy+0.008
 Fro+0.008
Neg logit contributions
Mom-0.010
Everyone-0.009
 year-0.009
um-0.009
Daddy-0.008
Token his
Token ID465
Feature activation+6.07e-02

Pos logit contributions
rob+0.016
 pret+0.013
 Stri+0.013
 sy+0.013
 Roc+0.011
Neg logit contributions
Mom-0.025
Everyone-0.023
 year-0.023
um-0.023
ally-0.021
Token tail
Token ID7894
Feature activation+2.41e-02

Pos logit contributions
rob+0.068
 pret+0.067
 Stri+0.061
 sy+0.059
 scar+0.054
Neg logit contributions
Everyone-0.127
Mom-0.127
 year-0.119
um-0.118
Daddy-0.114
Token and
Token ID290
Feature activation+1.14e-02

Pos logit contributions
rob+0.047
 pret+0.042
 Stri+0.042
 sy+0.039
 Roc+0.036
Neg logit contributions
Mom-0.037
Everyone-0.033
um-0.033
 year-0.033
ally-0.029
Token agreed
Token ID4987
Feature activation-1.85e-02

Pos logit contributions
rob+0.019
 Stri+0.016
 sy+0.015
 pret+0.015
robat+0.014
Neg logit contributions
Mom-0.020
Everyone-0.018
 year-0.017
um-0.017
 tidy-0.016
Token to
Token ID284
Feature activation+4.62e-03

Pos logit contributions
Mom+0.032
Everyone+0.030
um+0.029
 year+0.029
ally+0.027
Neg logit contributions
rob-0.031
 pret-0.028
 Stri-0.027
 sy-0.026
 Roc-0.023
Token help
Token ID1037
Feature activation-1.03e-04

Pos logit contributions
 pret+0.007
rob+0.007
 Stri+0.007
 sy+0.006
 Fro+0.006
Neg logit contributions
Mom-0.008
um-0.007
Everyone-0.007
 year-0.007
urious-0.007
Token
Token ID198
Feature activation-4.42e-04

Pos logit contributions
 rightful+0.006
Mom+0.006
um+0.006
 doorstep+0.006
 year+0.006
Neg logit contributions
rob-0.009
 Stri-0.008
 pret-0.008
 Roc-0.007
 Fro-0.007
TokenThe
Token ID464
Feature activation-1.28e-02

Pos logit contributions
um+0.000
 doorstep+0.000
 year+0.000
ator+0.000
 rightful+0.000
Neg logit contributions
rob-0.001
 Stri-0.001
 pret-0.001
 sy-0.001
 Roc-0.001
Token puppy
Token ID26188
Feature activation+9.30e-03

Pos logit contributions
Mom+0.017
um+0.017
Everyone+0.017
har+0.017
ator+0.017
Neg logit contributions
rob-0.023
 pret-0.022
 sy-0.021
 Stri-0.020
 scar-0.020
Token w
Token ID266
Feature activation+4.42e-03

Pos logit contributions
rob+0.017
 pret+0.016
 Stri+0.016
 sy+0.015
 Fro+0.014
Neg logit contributions
Mom-0.014
Everyone-0.013
um-0.012
 year-0.012
ally-0.011
Tokenagged
Token ID14655
Feature activation+8.81e-03

Pos logit contributions
rob+0.008
 pret+0.007
 Stri+0.007
 sy+0.007
 Fro+0.006
Neg logit contributions
Mom-0.007
Everyone-0.007
 year-0.006
um-0.006
 doorstep-0.006
Token his
Token ID465
Feature activation+6.06e-02

Pos logit contributions
rob+0.013
 pret+0.011
 Stri+0.011
 sy+0.010
 Roc+0.009
Neg logit contributions
Mom-0.018
Everyone-0.017
um-0.016
 year-0.016
ally-0.015
Token tail
Token ID7894
Feature activation+2.34e-02

Pos logit contributions
rob+0.089
 pret+0.082
 Stri+0.079
 sy+0.075
 Roc+0.067
Neg logit contributions
Mom-0.115
Everyone-0.108
 year-0.104
um-0.104
ally-0.097
Token.
Token ID13
Feature activation+4.09e-03

Pos logit contributions
rob+0.042
 pret+0.038
 Stri+0.037
 sy+0.035
 Roc+0.032
Neg logit contributions
Mom-0.039
Everyone-0.035
um-0.034
 year-0.034
ally-0.031
Token He
Token ID679
Feature activation+2.79e-03

Pos logit contributions
rob+0.006
 Stri+0.006
 pret+0.006
 sy+0.005
 Fro+0.005
Neg logit contributions
Mom-0.007
 year-0.007
Everyone-0.007
um-0.007
ator-0.006
Token rushed
Token ID13999
Feature activation+1.21e-03

Pos logit contributions
rob+0.005
 pret+0.005
 Stri+0.005
 sy+0.005
 Fro+0.004
Neg logit contributions
Mom-0.004
Everyone-0.004
um-0.004
 year-0.003
ator-0.003
Token up
Token ID510
Feature activation+1.95e-02

Pos logit contributions
rob+0.002
 pret+0.002
 Stri+0.002
 sy+0.002
 Roc+0.002
Neg logit contributions
Mom-0.002
Everyone-0.002
um-0.002
 year-0.002
ally-0.001
Token.
Token ID13
Feature activation-3.03e-03

Pos logit contributions
Mom+0.015
um+0.012
Everyone+0.012
 year+0.012
 Mom+0.011
Neg logit contributions
rob-0.031
 pret-0.026
 Stri-0.026
 sy-0.025
pill-0.024
Token The
Token ID383
Feature activation-1.08e-02

Pos logit contributions
Mom+0.005
 year+0.005
Everyone+0.005
um+0.005
ator+0.004
Neg logit contributions
rob-0.005
 Stri-0.005
 pret-0.005
 Fro-0.004
 sy-0.004
Token dog
Token ID3290
Feature activation+6.24e-03

Pos logit contributions
Mom+0.018
Everyone+0.016
um+0.016
 year+0.016
Daddy+0.014
Neg logit contributions
rob-0.020
 pret-0.018
 Stri-0.018
 sy-0.017
 Roc-0.015
Token w
Token ID266
Feature activation+4.72e-03

Pos logit contributions
rob+0.012
 pret+0.010
 Stri+0.010
 sy+0.010
 Roc+0.009
Neg logit contributions
Mom-0.010
Everyone-0.009
 year-0.009
um-0.008
ally-0.008
Tokenagged
Token ID14655
Feature activation+9.46e-03

Pos logit contributions
rob+0.010
 pret+0.009
 Stri+0.009
 sy+0.009
 Roc+0.008
Neg logit contributions
Mom-0.006
Everyone-0.006
um-0.006
 year-0.005
ally-0.005
Token his
Token ID465
Feature activation+6.06e-02

Pos logit contributions
rob+0.014
 pret+0.012
 Stri+0.012
 sy+0.012
 Roc+0.010
Neg logit contributions
Mom-0.019
Everyone-0.017
um-0.017
 year-0.017
ally-0.016
Token tail
Token ID7894
Feature activation+2.20e-02

Pos logit contributions
rob+0.119
 Stri+0.103
 pret+0.101
 sy+0.096
pill+0.089
Neg logit contributions
Mom-0.098
um-0.085
Everyone-0.084
 year-0.082
har-0.075
Token and
Token ID290
Feature activation+1.04e-02

Pos logit contributions
rob+0.044
 pret+0.039
 Stri+0.039
 sy+0.036
 Roc+0.034
Neg logit contributions
Mom-0.033
Everyone-0.030
 year-0.030
um-0.029
ally-0.026
Token l
Token ID300
Feature activation+4.10e-04

Pos logit contributions
rob+0.018
 sy+0.015
 pret+0.015
 Stri+0.014
robat+0.014
Neg logit contributions
Mom-0.017
Everyone-0.015
 year-0.014
I-0.014
 Mom-0.013
Tokenicked
Token ID9484
Feature activation-3.60e-03

Pos logit contributions
rob+0.001
 pret+0.001
 Stri+0.001
 sy+0.001
 scar+0.001
Neg logit contributions
Mom-0.001
um-0.001
Everyone-0.001
 year-0.001
 Everyone-0.001
Token their
Token ID511
Feature activation+2.56e-02

Pos logit contributions
Mom+0.007
Everyone+0.007
um+0.006
 year+0.006
ally+0.006
Neg logit contributions
rob-0.005
 pret-0.005
 Stri-0.004
 sy-0.004
 Roc-0.004
Token friendly
Token ID8030
Feature activation-3.97e-03

Pos logit contributions
rob+0.006
 Stri+0.006
 pret+0.006
 sy+0.006
 Roc+0.006
Neg logit contributions
Mom-0.004
um-0.003
Everyone-0.003
 year-0.003
ally-0.003
Token.
Token ID13
Feature activation-1.69e-03

Pos logit contributions
Mom+0.006
Everyone+0.006
um+0.005
 year+0.005
ally+0.005
Neg logit contributions
rob-0.007
 pret-0.007
 Stri-0.007
 sy-0.006
 Roc-0.006
Token Max
Token ID5436
Feature activation-1.24e-03

Pos logit contributions
Mom+0.003
 year+0.003
um+0.003
Everyone+0.003
ator+0.003
Neg logit contributions
rob-0.003
 Stri-0.003
 pret-0.002
 sy-0.002
 Fro-0.002
Token w
Token ID266
Feature activation+4.88e-03

Pos logit contributions
Mom+0.002
Everyone+0.002
um+0.002
 year+0.002
ally+0.001
Neg logit contributions
rob-0.002
 pret-0.002
 Stri-0.002
 sy-0.002
 Roc-0.002
Tokenagged
Token ID14655
Feature activation+1.26e-02

Pos logit contributions
rob+0.008
 pret+0.007
 Stri+0.007
 sy+0.007
 Fro+0.006
Neg logit contributions
Mom-0.008
Everyone-0.008
 year-0.007
um-0.007
Daddy-0.007
Token his
Token ID465
Feature activation+6.01e-02

Pos logit contributions
rob+0.016
 pret+0.014
 Stri+0.013
 sy+0.013
 Roc+0.011
Neg logit contributions
Mom-0.027
Everyone-0.026
um-0.025
 year-0.025
ally-0.023
Token tail
Token ID7894
Feature activation+2.58e-02

Pos logit contributions
rob+0.065
 pret+0.063
 Stri+0.057
 sy+0.054
 ro+0.050
Neg logit contributions
Mom-0.129
Everyone-0.127
 year-0.119
um-0.118
Daddy-0.117
Token and
Token ID290
Feature activation+1.35e-02

Pos logit contributions
rob+0.044
 pret+0.040
 Stri+0.039
 sy+0.038
 Roc+0.034
Neg logit contributions
Mom-0.044
Everyone-0.040
um-0.039
 year-0.038
ally-0.036
Token felt
Token ID2936
Feature activation+3.54e-03

Pos logit contributions
rob+0.025
 sy+0.022
 pret+0.022
 Stri+0.022
 Roc+0.019
Neg logit contributions
Mom-0.020
Everyone-0.018
 year-0.017
um-0.017
ally-0.016
Token happy
Token ID3772
Feature activation+1.37e-02

Pos logit contributions
rob+0.007
 pret+0.006
 Stri+0.006
 sy+0.006
 Roc+0.005
Neg logit contributions
Mom-0.006
Everyone-0.005
um-0.005
 year-0.005
ally-0.004
Token that
Token ID326
Feature activation-1.58e-03

Pos logit contributions
rob+0.025
 pret+0.023
 Stri+0.022
 sy+0.021
 Roc+0.019
Neg logit contributions
Mom-0.022
Everyone-0.020
um-0.020
 year-0.019
ally-0.018
Token.
Token ID13
Feature activation-1.62e-03

Pos logit contributions
rob+0.019
 pret+0.018
 Stri+0.018
 sy+0.017
 Roc+0.016
Neg logit contributions
Mom-0.013
Everyone-0.012
um-0.012
 year-0.011
ally-0.010
Token The
Token ID383
Feature activation-1.18e-02

Pos logit contributions
Mom+0.003
 year+0.003
ator+0.002
um+0.002
Everyone+0.002
Neg logit contributions
 Stri-0.003
rob-0.002
 pret-0.002
 Fro-0.002
 sy-0.002
Token dog
Token ID3290
Feature activation+4.92e-03

Pos logit contributions
Mom+0.023
ator+0.023
um+0.023
Everyone+0.023
ally+0.022
Neg logit contributions
rob-0.013
 pret-0.013
 sy-0.012
 Stri-0.012
 scar-0.011
Token w
Token ID266
Feature activation+5.16e-03

Pos logit contributions
rob+0.009
 pret+0.008
 Stri+0.008
 sy+0.008
 Fro+0.007
Neg logit contributions
Mom-0.008
Everyone-0.007
um-0.007
 year-0.007
ally-0.006
Tokenagged
Token ID14655
Feature activation+1.28e-02

Pos logit contributions
rob+0.010
 pret+0.009
 Stri+0.009
 sy+0.008
 Roc+0.007
Neg logit contributions
Mom-0.008
Everyone-0.007
 year-0.007
um-0.007
ally-0.006
Token its
Token ID663
Feature activation+6.00e-02

Pos logit contributions
rob+0.019
 pret+0.017
 Stri+0.016
 sy+0.016
 Roc+0.014
Neg logit contributions
Mom-0.025
Everyone-0.023
um-0.023
 year-0.023
ally-0.021
Token tail
Token ID7894
Feature activation+2.45e-02

Pos logit contributions
rob+0.064
 pret+0.063
 Stri+0.059
 sy+0.058
 Fro+0.053
Neg logit contributions
Mom-0.127
Everyone-0.123
 year-0.118
um-0.115
Daddy-0.114
Token and
Token ID290
Feature activation+1.09e-02

Pos logit contributions
rob+0.044
 pret+0.040
 Stri+0.039
 sy+0.037
 Roc+0.034
Neg logit contributions
Mom-0.040
Everyone-0.036
um-0.035
 year-0.035
ally-0.032
Token became
Token ID2627
Feature activation+5.51e-03

Pos logit contributions
rob+0.018
 pret+0.016
 Stri+0.016
 sy+0.015
 Roc+0.014
Neg logit contributions
Mom-0.019
Everyone-0.017
 year-0.016
um-0.016
ally-0.015
Token friendly
Token ID8030
Feature activation+2.45e-03

Pos logit contributions
rob+0.010
 pret+0.009
 Stri+0.009
 sy+0.009
 Fro+0.008
Neg logit contributions
Mom-0.009
Everyone-0.008
um-0.008
 year-0.008
 doorstep-0.007
Token.
Token ID13
Feature activation-2.22e-04

Pos logit contributions
rob+0.004
 pret+0.004
 Stri+0.004
 sy+0.004
 Fro+0.004
Neg logit contributions
Mom-0.004
Everyone-0.004
um-0.003
 year-0.003
ally-0.003
Token
Token ID198
Feature activation+2.97e-03

Pos logit contributions
 doorstep+0.000
um+0.000
 year+0.000
ator+0.000
urious+0.000
Neg logit contributions
rob-0.000
 Stri-0.000
 Fro-0.000
 pret-0.000
 sy-0.000
TokenThe
Token ID464
Feature activation-1.15e-02

Pos logit contributions
 Stri+0.005
rob+0.005
 sy+0.005
 pret+0.005
 Fro+0.004
Neg logit contributions
um-0.004
ator-0.004
 doorstep-0.004
 year-0.004
urious-0.003
Token puppy
Token ID26188
Feature activation+8.64e-03

Pos logit contributions
har+0.021
ooks+0.020
Everyone+0.020
ator+0.020
Jane+0.020
Neg logit contributions
rob-0.015
 sy-0.014
 pret-0.013
 scar-0.013
 Stri-0.012
Token w
Token ID266
Feature activation+4.96e-03

Pos logit contributions
rob+0.016
 pret+0.015
 Stri+0.015
 sy+0.014
 Fro+0.013
Neg logit contributions
Mom-0.013
Everyone-0.012
um-0.012
 year-0.011
Daddy-0.010
Tokenagged
Token ID14655
Feature activation+7.83e-03

Pos logit contributions
rob+0.008
 pret+0.007
 Stri+0.007
 sy+0.007
ez+0.006
Neg logit contributions
Mom-0.009
Everyone-0.008
 year-0.008
um-0.007
Daddy-0.007
Token its
Token ID663
Feature activation+5.99e-02

Pos logit contributions
rob+0.013
 pret+0.011
 Stri+0.010
 sy+0.010
 Roc+0.009
Neg logit contributions
Mom-0.015
Everyone-0.013
um-0.013
 year-0.013
ally-0.012
Token tail
Token ID7894
Feature activation+2.14e-02

Pos logit contributions
rob+0.062
 pret+0.061
 Stri+0.059
 sy+0.057
 scar+0.052
Neg logit contributions
Mom-0.127
Everyone-0.125
 year-0.120
um-0.116
Daddy-0.115
Token and
Token ID290
Feature activation+1.10e-02

Pos logit contributions
rob+0.045
 pret+0.039
 Stri+0.039
 sy+0.036
 Roc+0.034
Neg logit contributions
Mom-0.031
 year-0.028
um-0.027
Everyone-0.027
 doorstep-0.024
Token followed
Token ID3940
Feature activation-1.04e-02

Pos logit contributions
rob+0.019
 pret+0.017
 Stri+0.016
 sy+0.016
 Roc+0.014
Neg logit contributions
Mom-0.019
Everyone-0.017
 year-0.016
um-0.016
ally-0.015
Token Lily
Token ID20037
Feature activation-8.15e-03

Pos logit contributions
Mom+0.021
urious+0.020
Everyone+0.019
um+0.019
ator+0.019
Neg logit contributions
 Stri-0.012
 pret-0.012
rob-0.012
 Fro-0.011
 sy-0.011
Token back
Token ID736
Feature activation-1.72e-02

Pos logit contributions
Mom+0.013
Everyone+0.012
 year+0.011
um+0.011
 doorstep+0.011
Neg logit contributions
rob-0.015
 pret-0.014
 Stri-0.013
 sy-0.013
 Roc-0.012
Token.
Token ID13
Feature activation-3.39e-03

Pos logit contributions
Mom+0.000
 year+0.000
um+0.000
Everyone+0.000
 doorstep+0.000
Neg logit contributions
rob-0.000
 pret-0.000
 Stri-0.000
 sy-0.000
 scar-0.000
Token The
Token ID383
Feature activation-1.11e-02

Pos logit contributions
Mom+0.006
 year+0.006
Daddy+0.006
Everyone+0.006
um+0.006
Neg logit contributions
 Stri-0.005
rob-0.005
 pret-0.004
 Roc-0.004
 Fro-0.004
Token dog
Token ID3290
Feature activation+6.27e-03

Pos logit contributions
Mom+0.021
um+0.021
Everyone+0.021
ator+0.020
har+0.020
Neg logit contributions
rob-0.014
 pret-0.013
 Stri-0.013
 sy-0.012
 scar-0.012
Token w
Token ID266
Feature activation+2.36e-03

Pos logit contributions
rob+0.012
 pret+0.011
 Stri+0.011
 sy+0.010
 Roc+0.010
Neg logit contributions
Mom-0.009
Everyone-0.009
um-0.008
 year-0.008
ally-0.007
Tokenagged
Token ID14655
Feature activation+5.92e-03

Pos logit contributions
rob+0.004
 pret+0.004
 Stri+0.004
 sy+0.004
 Roc+0.003
Neg logit contributions
Mom-0.004
Everyone-0.003
um-0.003
 year-0.003
ally-0.003
Token his
Token ID465
Feature activation+5.99e-02

Pos logit contributions
rob+0.009
 pret+0.008
 Stri+0.008
 sy+0.008
 Roc+0.006
Neg logit contributions
Mom-0.011
Everyone-0.011
 year-0.010
um-0.010
ally-0.010
Token tail
Token ID7894
Feature activation+2.12e-02

Pos logit contributions
rob+0.110
 pret+0.096
 Stri+0.096
 sy+0.091
 Roc+0.082
Neg logit contributions
Mom-0.101
Everyone-0.089
um-0.089
 year-0.087
ally-0.080
Token and
Token ID290
Feature activation+4.85e-03

Pos logit contributions
rob+0.039
 pret+0.035
 Stri+0.035
 sy+0.033
 Roc+0.030
Neg logit contributions
Mom-0.034
Everyone-0.031
 year-0.030
um-0.030
ally-0.027
Token said
Token ID531
Feature activation-3.86e-03

Pos logit contributions
rob+0.009
 pret+0.008
 sy+0.008
 Stri+0.007
 asleep+0.007
Neg logit contributions
Mom-0.008
Everyone-0.007
 year-0.007
um-0.006
ally-0.006
Token,
Token ID11
Feature activation-1.01e-03

Pos logit contributions
Mom+0.006
Everyone+0.006
um+0.006
 year+0.005
 Everyone+0.005
Neg logit contributions
rob-0.008
 pret-0.006
 Stri-0.006
 sy-0.006
 scar-0.006
Token "
Token ID366
Feature activation-2.21e-03

Pos logit contributions
Mom+0.001
Everyone+0.001
um+0.001
 year+0.001
ally+0.001
Neg logit contributions
rob-0.002
 pret-0.002
 Stri-0.002
 sy-0.002
 Roc-0.002
Token
Token ID198
Feature activation-3.26e-04

Pos logit contributions
 doorstep+0.007
 mant+0.006
ator+0.006
um+0.006
VP+0.006
Neg logit contributions
 Stri-0.008
rob-0.008
 Fro-0.007
 pret-0.007
 Roc-0.007
TokenThe
Token ID464
Feature activation-1.28e-02

Pos logit contributions
um+0.000
 doorstep+0.000
ator+0.000
 year+0.000
Mom+0.000
Neg logit contributions
rob-0.001
 Stri-0.001
 pret-0.001
 sy-0.001
 Fro-0.001
Token dog
Token ID3290
Feature activation+7.53e-03

Pos logit contributions
Mom+0.020
Everyone+0.019
um+0.019
ally+0.019
ator+0.019
Neg logit contributions
 pret-0.021
rob-0.021
 Stri-0.020
 sy-0.020
 Fro-0.018
Token w
Token ID266
Feature activation+3.29e-03

Pos logit contributions
rob+0.013
 pret+0.011
 Stri+0.011
 sy+0.011
 Roc+0.010
Neg logit contributions
Mom-0.013
Everyone-0.012
um-0.012
 year-0.012
ally-0.011
Tokenagged
Token ID14655
Feature activation+8.77e-03

Pos logit contributions
rob+0.006
 pret+0.005
 Stri+0.005
 sy+0.005
 Fro+0.005
Neg logit contributions
Mom-0.005
Everyone-0.005
 year-0.005
um-0.005
 doorstep-0.004
Token his
Token ID465
Feature activation+5.98e-02

Pos logit contributions
rob+0.011
 pret+0.009
 Stri+0.009
 sy+0.008
 Roc+0.007
Neg logit contributions
Mom-0.019
Everyone-0.018
um-0.018
 year-0.018
ally-0.017
Token tail
Token ID7894
Feature activation+2.17e-02

Pos logit contributions
rob+0.076
 pret+0.072
 Stri+0.068
 sy+0.064
 Fro+0.058
Neg logit contributions
Mom-0.122
Everyone-0.118
 year-0.112
um-0.111
Daddy-0.106
Token and
Token ID290
Feature activation+1.55e-02

Pos logit contributions
rob+0.040
 pret+0.037
 Stri+0.036
 sy+0.034
 Roc+0.032
Neg logit contributions
Mom-0.034
Everyone-0.031
um-0.030
 year-0.030
ally-0.027
Token walked
Token ID6807
Feature activation+3.97e-03

Pos logit contributions
rob+0.024
 Stri+0.020
 pret+0.020
 sy+0.020
 asleep+0.017
Neg logit contributions
Mom-0.028
Everyone-0.027
 year-0.025
um-0.025
ally-0.024
Token with
Token ID351
Feature activation-9.23e-03

Pos logit contributions
rob+0.006
 pret+0.006
 Stri+0.006
 sy+0.005
 Fro+0.005
Neg logit contributions
Mom-0.007
Everyone-0.007
um-0.006
 year-0.006
 doorstep-0.006
Token Sally
Token ID25737
Feature activation-2.29e-03

Pos logit contributions
Mom+0.016
Everyone+0.015
 year+0.014
um+0.014
 Everyone+0.013
Neg logit contributions
rob-0.016
 pret-0.014
 Stri-0.014
 sy-0.013
 scar-0.011
Token.
Token ID13
Feature activation+1.15e-03

Pos logit contributions
Mom+0.024
um+0.021
Everyone+0.021
 year+0.020
ally+0.018
Neg logit contributions
rob-0.040
 pret-0.037
 Stri-0.036
 sy-0.034
 Roc-0.032
Token The
Token ID383
Feature activation-9.60e-03

Pos logit contributions
rob+0.002
 pret+0.002
 Stri+0.002
 sy+0.002
 Roc+0.002
Neg logit contributions
Mom-0.002
Everyone-0.002
um-0.002
 year-0.002
ally-0.001
Token dog
Token ID3290
Feature activation+6.54e-03

Pos logit contributions
Mom+0.019
Everyone+0.018
um+0.017
 year+0.017
ally+0.017
Neg logit contributions
rob-0.013
 pret-0.012
 Stri-0.012
 sy-0.011
 Fro-0.010
Token w
Token ID266
Feature activation+4.99e-03

Pos logit contributions
rob+0.011
 pret+0.010
 Stri+0.010
 sy+0.009
 Roc+0.009
Neg logit contributions
Mom-0.011
Everyone-0.010
 year-0.010
um-0.010
ally-0.009
Tokenagged
Token ID14655
Feature activation+1.02e-02

Pos logit contributions
rob+0.010
 pret+0.010
 Stri+0.009
 sy+0.009
 Roc+0.008
Neg logit contributions
Mom-0.007
Everyone-0.006
um-0.006
 year-0.006
ally-0.005
Token his
Token ID465
Feature activation+5.97e-02

Pos logit contributions
rob+0.014
 pret+0.013
 Stri+0.012
 sy+0.012
 Roc+0.010
Neg logit contributions
Mom-0.021
Everyone-0.019
um-0.019
 year-0.019
ally-0.018
Token tail
Token ID7894
Feature activation+2.11e-02

Pos logit contributions
rob+0.075
 pret+0.071
 Stri+0.067
 sy+0.065
 Fro+0.059
Neg logit contributions
Mom-0.121
Everyone-0.117
 year-0.112
um-0.110
ally-0.105
Token and
Token ID290
Feature activation+9.56e-03

Pos logit contributions
rob+0.037
 pret+0.034
 Stri+0.034
 sy+0.032
 Roc+0.029
Neg logit contributions
Mom-0.035
Everyone-0.032
um-0.031
 year-0.030
ally-0.028
Token l
Token ID300
Feature activation-1.91e-04

Pos logit contributions
rob+0.014
 pret+0.012
 Stri+0.012
 sy+0.012
 Roc+0.010
Neg logit contributions
Mom-0.018
Everyone-0.017
 year-0.016
um-0.015
 Everyone-0.015
Tokenicked
Token ID9484
Feature activation-4.38e-03

Pos logit contributions
Mom+0.000
um+0.000
Everyone+0.000
 year+0.000
 Everyone+0.000
Neg logit contributions
rob-0.000
 pret-0.000
 Stri-0.000
 sy-0.000
 scar-0.000
Token her
Token ID607
Feature activation+2.08e-02

Pos logit contributions
Mom+0.008
Everyone+0.007
 year+0.007
um+0.007
 doorstep+0.007
Neg logit contributions
rob-0.006
 Stri-0.006
 pret-0.006
 sy-0.006
 Fro-0.005
Token.
Token ID13
Feature activation-4.53e-03

Pos logit contributions
Mom+0.006
um+0.005
 year+0.005
Everyone+0.004
 nicely+0.004
Neg logit contributions
rob-0.016
 pret-0.014
 Stri-0.014
 sy-0.013
 Roc-0.013
Token The
Token ID383
Feature activation-1.14e-02

Pos logit contributions
Mom+0.007
Everyone+0.007
um+0.006
 year+0.006
ally+0.006
Neg logit contributions
rob-0.008
 pret-0.008
 Stri-0.007
 sy-0.007
 Roc-0.007
Token dog
Token ID3290
Feature activation+5.48e-03

Pos logit contributions
Mom+0.020
Everyone+0.019
um+0.018
 year+0.018
ally+0.017
Neg logit contributions
rob-0.019
 pret-0.017
 Stri-0.017
 sy-0.016
 Roc-0.014
Token w
Token ID266
Feature activation+3.31e-03

Pos logit contributions
rob+0.011
 Stri+0.009
 pret+0.009
 sy+0.009
 Roc+0.008
Neg logit contributions
Mom-0.008
Everyone-0.007
 year-0.007
um-0.007
ally-0.006
Tokenagged
Token ID14655
Feature activation+7.50e-03

Pos logit contributions
rob+0.007
 pret+0.007
 Stri+0.007
 sy+0.006
 Roc+0.006
Neg logit contributions
Mom-0.004
Everyone-0.004
um-0.004
 year-0.004
ally-0.003
Token his
Token ID465
Feature activation+5.97e-02

Pos logit contributions
rob+0.012
 pret+0.010
 Stri+0.010
 sy+0.009
pill+0.008
Neg logit contributions
Mom-0.015
Everyone-0.013
 year-0.013
um-0.013
har-0.012
Token tail
Token ID7894
Feature activation+2.06e-02

Pos logit contributions
rob+0.114
 pret+0.099
 Stri+0.099
 sy+0.091
 Roc+0.085
Neg logit contributions
Mom-0.099
um-0.085
Everyone-0.085
 year-0.083
har-0.077
Token and
Token ID290
Feature activation+6.69e-03

Pos logit contributions
rob+0.043
 Stri+0.037
 pret+0.037
 sy+0.035
 Roc+0.033
Neg logit contributions
Mom-0.030
 year-0.027
Everyone-0.026
um-0.026
 doorstep-0.022
Token l
Token ID300
Feature activation-1.52e-03

Pos logit contributions
rob+0.011
 sy+0.009
 pret+0.009
 Stri+0.009
robat+0.009
Neg logit contributions
Mom-0.011
Everyone-0.009
 Mom-0.009
 year-0.009
I-0.009
Tokenicked
Token ID9484
Feature activation-6.07e-03

Pos logit contributions
Mom+0.003
um+0.002
Everyone+0.002
 year+0.002
ally+0.002
Neg logit contributions
rob-0.003
 pret-0.002
 Stri-0.002
 sy-0.002
 scar-0.002
Token Fl
Token ID1610
Feature activation-1.40e-03

Pos logit contributions
Mom+0.012
Everyone+0.011
um+0.011
 year+0.010
Daddy+0.010
Neg logit contributions
rob-0.010
 pret-0.008
 Stri-0.008
 sy-0.008
 scar-0.007
Token walk
Token ID2513
Feature activation+5.81e-03

Pos logit contributions
Mom+0.001
Everyone+0.001
 year+0.001
um+0.001
ally+0.001
Neg logit contributions
rob-0.001
 pret-0.001
 Stri-0.001
 sy-0.001
 Roc-0.001
Token.
Token ID13
Feature activation+1.39e-03

Pos logit contributions
rob+0.010
 Stri+0.009
 pret+0.009
 sy+0.009
 Fro+0.008
Neg logit contributions
Mom-0.009
Everyone-0.009
um-0.008
ally-0.007
VP-0.007
Token Spot
Token ID15899
Feature activation-2.15e-03

Pos logit contributions
rob+0.002
 pret+0.002
 Stri+0.002
 sy+0.002
 Roc+0.002
Neg logit contributions
Mom-0.002
Everyone-0.002
um-0.002
 year-0.002
ally-0.002
Token w
Token ID266
Feature activation+4.20e-03

Pos logit contributions
Mom+0.003
Everyone+0.003
 year+0.003
um+0.003
ally+0.002
Neg logit contributions
rob-0.004
 pret-0.004
 Stri-0.004
 sy-0.004
 Roc-0.003
Tokenagged
Token ID14655
Feature activation+9.92e-03

Pos logit contributions
rob+0.007
 pret+0.007
 Stri+0.006
 sy+0.006
 Fro+0.006
Neg logit contributions
Mom-0.007
Everyone-0.006
 year-0.006
um-0.006
Daddy-0.006
Token his
Token ID465
Feature activation+5.96e-02

Pos logit contributions
rob+0.014
 pret+0.012
 Stri+0.011
 sy+0.011
 Roc+0.009
Neg logit contributions
Mom-0.021
Everyone-0.019
 year-0.019
um-0.019
ally-0.018
Token tail
Token ID7894
Feature activation+2.44e-02

Pos logit contributions
rob+0.082
 pret+0.075
 Stri+0.072
 sy+0.068
 Fro+0.062
Neg logit contributions
Mom-0.118
Everyone-0.112
 year-0.107
um-0.107
Daddy-0.100
Token and
Token ID290
Feature activation+1.21e-02

Pos logit contributions
rob+0.044
 pret+0.040
 Stri+0.039
 sy+0.037
 Roc+0.034
Neg logit contributions
Mom-0.040
Everyone-0.036
um-0.035
 year-0.035
ally-0.032
Token bark
Token ID21405
Feature activation+6.77e-03

Pos logit contributions
rob+0.019
 pret+0.017
 Stri+0.017
 sy+0.016
 Roc+0.014
Neg logit contributions
Mom-0.022
Everyone-0.020
 year-0.019
um-0.019
ally-0.018
Tokened
Token ID276
Feature activation+9.14e-03

Pos logit contributions
rob+0.011
 Stri+0.010
 pret+0.010
 sy+0.010
 scar+0.009
Neg logit contributions
Mom-0.012
Everyone-0.011
um-0.011
 year-0.011
 Everyone-0.010
Token happily
Token ID18177
Feature activation+6.66e-03

Pos logit contributions
 pret+0.015
rob+0.015
 Stri+0.014
 sy+0.014
 Fro+0.013
Neg logit contributions
Mom-0.015
Everyone-0.014
um-0.013
ator-0.013
 year-0.013
Token
Token ID198
Feature activation-3.23e-03

Pos logit contributions
 year+0.011
 doorstep+0.011
um+0.010
Mom+0.010
ator+0.010
Neg logit contributions
 Stri-0.013
rob-0.013
 pret-0.012
 Roc-0.011
 sy-0.011
TokenThe
Token ID464
Feature activation-1.09e-02

Pos logit contributions
 year+0.003
um+0.003
 doorstep+0.003
Mom+0.003
ator+0.003
Neg logit contributions
rob-0.007
 Stri-0.007
 pret-0.006
 sy-0.006
 Roc-0.006
Token puppy
Token ID26188
Feature activation+6.17e-03

Pos logit contributions
Everyone+0.018
har+0.017
Mom+0.017
um+0.017
ally+0.017
Neg logit contributions
rob-0.017
 sy-0.016
 pret-0.016
 Stri-0.015
 scar-0.014
Token w
Token ID266
Feature activation+2.05e-03

Pos logit contributions
rob+0.010
 pret+0.010
 Stri+0.009
 sy+0.009
 Fro+0.008
Neg logit contributions
Mom-0.010
Everyone-0.010
um-0.010
 year-0.009
ator-0.009
Tokenagged
Token ID14655
Feature activation+5.32e-03

Pos logit contributions
rob+0.004
 pret+0.004
 Stri+0.004
 sy+0.003
 Roc+0.003
Neg logit contributions
Mom-0.003
Everyone-0.003
um-0.003
 year-0.003
ally-0.002
Token his
Token ID465
Feature activation+5.96e-02

Pos logit contributions
rob+0.007
 pret+0.006
 Stri+0.006
 sy+0.006
 Roc+0.005
Neg logit contributions
Mom-0.011
Everyone-0.010
um-0.010
 year-0.010
ally-0.009
Token tail
Token ID7894
Feature activation+1.88e-02

Pos logit contributions
rob+0.092
 pret+0.083
 Stri+0.081
 sy+0.076
 Roc+0.069
Neg logit contributions
Mom-0.112
Everyone-0.104
um-0.100
 year-0.100
ally-0.093
Token.
Token ID13
Feature activation+1.14e-03

Pos logit contributions
rob+0.038
 pret+0.035
 Stri+0.034
 sy+0.033
 Roc+0.030
Neg logit contributions
Mom-0.027
Everyone-0.024
um-0.023
 year-0.023
ally-0.021
Token He
Token ID679
Feature activation+3.05e-03

Pos logit contributions
rob+0.002
 pret+0.002
 Stri+0.002
 sy+0.002
 Roc+0.002
Neg logit contributions
Mom-0.002
Everyone-0.002
um-0.002
 year-0.002
ally-0.001
Token l
Token ID300
Feature activation+3.75e-03

Pos logit contributions
rob+0.005
 pret+0.005
 Stri+0.005
 sy+0.005
 Fro+0.004
Neg logit contributions
Mom-0.005
Everyone-0.005
um-0.004
 year-0.004
ator-0.004
Tokenicked
Token ID9484
Feature activation-6.26e-04

Pos logit contributions
rob+0.006
 pret+0.005
 Stri+0.005
 sy+0.005
 scar+0.004
Neg logit contributions
Mom-0.007
um-0.007
Everyone-0.007
 year-0.006
ally-0.006
Token."
Token ID526
Feature activation+9.00e-05

Pos logit contributions
Mom+0.000
Everyone+0.000
um+0.000
ally+0.000
 year+0.000
Neg logit contributions
rob-0.001
 pret-0.001
 Stri-0.001
 sy-0.001
 asleep-0.001
Token The
Token ID383
Feature activation-5.40e-03

Pos logit contributions
rob+0.000
 pret+0.000
 Stri+0.000
 sy+0.000
 scar+0.000
Neg logit contributions
Mom-0.000
Everyone-0.000
um-0.000
 year-0.000
ally-0.000
Token dog
Token ID3290
Feature activation+1.03e-02

Pos logit contributions
Mom+0.009
Everyone+0.009
um+0.008
 year+0.008
ally+0.008
Neg logit contributions
rob-0.009
 pret-0.008
 Stri-0.008
 sy-0.008
 Roc-0.007
Token w
Token ID266
Feature activation+8.62e-03

Pos logit contributions
rob+0.018
 pret+0.017
 Stri+0.016
 sy+0.015
 Fro+0.014
Neg logit contributions
Mom-0.017
Everyone-0.015
um-0.015
 year-0.015
Daddy-0.014
Tokenagged
Token ID14655
Feature activation+9.75e-03

Pos logit contributions
rob+0.014
 pret+0.012
 Stri+0.012
 sy+0.011
ez+0.011
Neg logit contributions
Mom-0.015
Everyone-0.014
 year-0.013
um-0.013
 doorstep-0.013
Token his
Token ID465
Feature activation+5.95e-02

Pos logit contributions
rob+0.013
 pret+0.011
 Stri+0.011
 sy+0.011
 Roc+0.009
Neg logit contributions
Mom-0.020
Everyone-0.019
um-0.019
 year-0.018
ally-0.017
Token tail
Token ID7894
Feature activation+2.25e-02

Pos logit contributions
rob+0.086
 pret+0.078
 Stri+0.076
 sy+0.072
 Fro+0.064
Neg logit contributions
Mom-0.116
Everyone-0.109
um-0.105
 year-0.104
ally-0.098
Token and
Token ID290
Feature activation+9.87e-03

Pos logit contributions
rob+0.040
 pret+0.036
 Stri+0.036
 sy+0.034
 Roc+0.031
Neg logit contributions
Mom-0.037
Everyone-0.034
um-0.033
 year-0.033
ally-0.030
Token said
Token ID531
Feature activation-1.12e-03

Pos logit contributions
rob+0.018
 sy+0.016
 pret+0.016
 Stri+0.016
robat+0.014
Neg logit contributions
Mom-0.014
Everyone-0.014
 year-0.012
um-0.012
 Everyone-0.012
Token,
Token ID11
Feature activation-4.50e-04

Pos logit contributions
Mom+0.002
Everyone+0.002
um+0.002
 year+0.002
ally+0.002
Neg logit contributions
rob-0.002
 pret-0.002
 Stri-0.002
 sy-0.002
 scar-0.001
Token "
Token ID366
Feature activation-2.79e-03

Pos logit contributions
Mom+0.001
Everyone+0.001
um+0.001
 year+0.001
Daddy+0.000
Neg logit contributions
rob-0.001
 pret-0.001
 Stri-0.001
 sy-0.001
 Roc-0.001
Token
Token ID198
Feature activation-2.66e-03

Pos logit contributions
Mom+0.003
 year+0.002
Everyone+0.002
um+0.002
har+0.002
Neg logit contributions
rob-0.004
 Stri-0.004
 pret-0.004
 sy-0.003
 Roc-0.003
Token
Token ID198
Feature activation+9.41e-04

Pos logit contributions
Mom+0.004
 rightful+0.004
um+0.004
har+0.004
 year+0.004
Neg logit contributions
rob-0.005
 Stri-0.004
 pret-0.004
 Roc-0.004
 Fro-0.004
TokenMax
Token ID11518
Feature activation-1.19e-03

Pos logit contributions
rob+0.001
 Stri+0.001
 pret+0.001
Aud+0.001
 scar+0.001
Neg logit contributions
 rightful-0.002
um-0.002
ator-0.002
 year-0.002
ini-0.002
Token w
Token ID266
Feature activation+6.70e-03

Pos logit contributions
Mom+0.002
Everyone+0.002
um+0.002
 year+0.002
ally+0.001
Neg logit contributions
rob-0.002
 pret-0.002
 Stri-0.002
 sy-0.002
 Roc-0.002
Tokenagged
Token ID14655
Feature activation+8.35e-03

Pos logit contributions
rob+0.011
 pret+0.010
 Stri+0.010
 sy+0.009
 Fro+0.008
Neg logit contributions
Mom-0.012
Everyone-0.011
 year-0.010
Daddy-0.010
um-0.010
Token his
Token ID465
Feature activation+5.95e-02

Pos logit contributions
rob+0.011
 pret+0.010
 Stri+0.009
 sy+0.009
 Fro+0.008
Neg logit contributions
Mom-0.018
Everyone-0.016
 year-0.016
um-0.016
ally-0.015
Token tail
Token ID7894
Feature activation+2.48e-02

Pos logit contributions
rob+0.097
 pret+0.087
 Stri+0.085
 sy+0.080
 Roc+0.073
Neg logit contributions
Mom-0.108
Everyone-0.099
um-0.096
 year-0.096
ally-0.089
Token and
Token ID290
Feature activation+1.10e-02

Pos logit contributions
rob+0.045
 pret+0.040
 Stri+0.040
 sy+0.037
 Roc+0.034
Neg logit contributions
Mom-0.041
Everyone-0.037
um-0.036
 year-0.036
ally-0.033
Token ran
Token ID4966
Feature activation-3.90e-03

Pos logit contributions
rob+0.021
 pret+0.019
 Stri+0.019
 sy+0.019
robat+0.016
Neg logit contributions
Mom-0.016
Everyone-0.015
 year-0.014
um-0.013
ally-0.013
Token around
Token ID1088
Feature activation+2.65e-03

Pos logit contributions
Mom+0.007
Everyone+0.006
um+0.006
 year+0.006
ally+0.006
Neg logit contributions
rob-0.006
 pret-0.006
 Stri-0.006
 sy-0.005
 Roc-0.005
Token happily
Token ID18177
Feature activation+2.60e-04

Pos logit contributions
rob+0.006
 Stri+0.005
 pret+0.005
 sy+0.005
 scar+0.005
Neg logit contributions
Mom-0.003
 year-0.003
Everyone-0.003
um-0.003
 nicely-0.003
Token them
Token ID606
Feature activation-2.24e-03

Pos logit contributions
Mom+0.012
Everyone+0.011
 year+0.011
 doorstep+0.011
ally+0.011
Neg logit contributions
rob-0.009
 pret-0.009
 Stri-0.008
 sy-0.008
 scar-0.007
Token and
Token ID290
Feature activation-4.92e-03

Pos logit contributions
Mom+0.003
Everyone+0.003
 year+0.003
um+0.003
ally+0.003
Neg logit contributions
rob-0.004
 pret-0.004
 Stri-0.004
 sy-0.004
 scar-0.003
Token make
Token ID787
Feature activation-8.65e-03

Pos logit contributions
Mom+0.008
um+0.008
 year+0.008
Everyone+0.008
ally+0.008
Neg logit contributions
 pret-0.008
rob-0.007
 Stri-0.007
 sy-0.007
 scar-0.007
Token a
Token ID257
Feature activation-9.93e-03

Pos logit contributions
Mom+0.015
 year+0.014
Everyone+0.014
um+0.014
ally+0.013
Neg logit contributions
rob-0.014
 pret-0.013
 Stri-0.013
 sy-0.012
 scar-0.011
Token y
Token ID331
Feature activation-2.74e-02

Pos logit contributions
Mom+0.017
Everyone+0.015
ally+0.015
um+0.015
Jane+0.014
Neg logit contributions
 pret-0.016
 scar-0.015
rob-0.014
 sy-0.014
 ro-0.014
Tokenummy
Token ID13513
Feature activation-5.08e-02

Pos logit contributions
Mom+0.027
Everyone+0.024
 year+0.019
har+0.019
VP+0.018
Neg logit contributions
rob-0.062
 Stri-0.057
 pret-0.056
 sy-0.055
 Fro-0.053
Token salad
Token ID20698
Feature activation-2.44e-02

Pos logit contributions
Mom+0.097
Everyone+0.088
um+0.082
ally+0.081
har+0.079
Neg logit contributions
rob-0.076
 pret-0.073
 sy-0.066
 Stri-0.062
 ro-0.061
Token for
Token ID329
Feature activation+4.15e-04

Pos logit contributions
Mom+0.033
Everyone+0.029
um+0.028
 year+0.028
ally+0.025
Neg logit contributions
rob-0.051
 pret-0.047
 Stri-0.046
 sy-0.044
 Roc-0.041
Token lunch
Token ID9965
Feature activation-1.53e-02

Pos logit contributions
rob+0.001
 Stri+0.001
 pret+0.001
 sy+0.001
 scar+0.001
Neg logit contributions
Mom-0.001
um-0.001
Everyone-0.001
 year-0.001
urious-0.001
Token!"
Token ID2474
Feature activation-5.31e-03

Pos logit contributions
Mom+0.021
Everyone+0.020
um+0.018
 year+0.017
ally+0.017
Neg logit contributions
rob-0.030
 pret-0.028
 Stri-0.027
 sy-0.026
 asleep-0.024
Token
Token ID198
Feature activation-9.80e-04

Pos logit contributions
Mom+0.008
 year+0.007
um+0.007
Everyone+0.007
ator+0.007
Neg logit contributions
rob-0.009
 Stri-0.009
 pret-0.009
 Fro-0.009
 sy-0.008
Token bite
Token ID13197
Feature activation+5.64e-03

Pos logit contributions
Mom+0.044
Everyone+0.041
um+0.039
 year+0.037
ally+0.037
Neg logit contributions
rob-0.035
 pret-0.034
 sy-0.032
 Stri-0.031
 scar-0.027
Token and
Token ID290
Feature activation-1.44e-04

Pos logit contributions
rob+0.012
 pret+0.011
 Stri+0.011
 sy+0.010
 Roc+0.009
Neg logit contributions
Mom-0.008
um-0.007
Everyone-0.007
 year-0.007
ally-0.006
Token makes
Token ID1838
Feature activation+4.01e-04

Pos logit contributions
Mom+0.000
Everyone+0.000
um+0.000
 year+0.000
 tidy+0.000
Neg logit contributions
rob-0.000
 Stri-0.000
 sy-0.000
 pret-0.000
 Fro-0.000
Token a
Token ID257
Feature activation-2.84e-03

Pos logit contributions
rob+0.001
 pret+0.001
 Stri+0.001
 sy+0.001
 Roc+0.001
Neg logit contributions
Mom-0.001
Everyone-0.001
um-0.001
 year-0.001
 Everyone-0.001
Token y
Token ID331
Feature activation-2.53e-02

Pos logit contributions
Mom+0.004
Everyone+0.004
um+0.004
 year+0.004
ally+0.003
Neg logit contributions
rob-0.005
 pret-0.005
 sy-0.005
 Stri-0.005
 scar-0.005
Tokenummy
Token ID13513
Feature activation-4.89e-02

Pos logit contributions
Mom+0.020
Everyone+0.017
 year+0.015
um+0.015
ally+0.012
Neg logit contributions
rob-0.066
 pret-0.062
 Stri-0.061
 sy-0.059
 Roc-0.056
Token sound
Token ID2128
Feature activation-9.95e-03

Pos logit contributions
Mom+0.081
Everyone+0.075
um+0.071
 year+0.071
ally+0.066
Neg logit contributions
rob-0.085
 pret-0.078
 Stri-0.075
 sy-0.073
 Roc-0.066
Token.
Token ID13
Feature activation-3.76e-03

Pos logit contributions
Mom+0.010
Everyone+0.009
 year+0.009
um+0.009
 Everyone+0.007
Neg logit contributions
rob-0.025
 Stri-0.022
 pret-0.022
 sy-0.021
 Fro-0.020
Token
Token ID198
Feature activation-4.47e-03

Pos logit contributions
Mom+0.006
Everyone+0.005
um+0.005
 year+0.005
ally+0.004
Neg logit contributions
rob-0.007
 pret-0.007
 Stri-0.007
 sy-0.006
 Roc-0.006
Token
Token ID198
Feature activation-5.17e-04

Pos logit contributions
Mom+0.006
 year+0.006
Everyone+0.006
har+0.006
um+0.006
Neg logit contributions
rob-0.008
 Stri-0.008
 pret-0.008
 Roc-0.007
 sy-0.007
Token"
Token ID1
Feature activation+5.73e-03

Pos logit contributions
 year+0.001
 doorstep+0.001
um+0.001
ator+0.001
ally+0.001
Neg logit contributions
rob-0.001
 Stri-0.001
 pret-0.001
 sy-0.001
 Roc-0.001
Token kitchen
Token ID9592
Feature activation-1.14e-02

Pos logit contributions
Mom+0.024
um+0.022
Everyone+0.022
 year+0.022
ally+0.021
Neg logit contributions
rob-0.031
 pret-0.029
 Stri-0.028
 sy-0.027
 scar-0.025
Token and
Token ID290
Feature activation-7.55e-03

Pos logit contributions
Mom+0.017
Everyone+0.016
 year+0.015
um+0.015
 doorstep+0.014
Neg logit contributions
rob-0.022
 pret-0.020
 Stri-0.019
 sy-0.019
 Roc-0.017
Token made
Token ID925
Feature activation-1.48e-02

Pos logit contributions
Mom+0.012
Everyone+0.011
um+0.011
 year+0.011
ally+0.010
Neg logit contributions
rob-0.013
 pret-0.013
 Stri-0.012
 sy-0.012
 Roc-0.011
Token a
Token ID257
Feature activation-1.20e-02

Pos logit contributions
Mom+0.025
Everyone+0.024
um+0.023
 year+0.022
ally+0.021
Neg logit contributions
rob-0.026
 pret-0.022
 Stri-0.022
 sy-0.022
 Roc-0.019
Token y
Token ID331
Feature activation-3.02e-02

Pos logit contributions
Mom+0.018
Everyone+0.017
um+0.016
 year+0.016
ally+0.015
Neg logit contributions
rob-0.021
 pret-0.021
 Stri-0.020
 sy-0.019
 scar-0.018
Tokenummy
Token ID13513
Feature activation-4.85e-02

Pos logit contributions
Mom+0.025
Everyone+0.021
 year+0.018
um+0.018
Daddy+0.015
Neg logit contributions
rob-0.077
 pret-0.072
 Stri-0.072
 sy-0.069
 Fro-0.066
Token cake
Token ID12187
Feature activation-2.92e-02

Pos logit contributions
Mom+0.081
Everyone+0.075
um+0.072
 year+0.071
ally+0.066
Neg logit contributions
rob-0.084
 pret-0.077
 Stri-0.075
 sy-0.072
 Roc-0.065
Token.
Token ID13
Feature activation-1.43e-03

Pos logit contributions
Mom+0.042
Everyone+0.038
um+0.037
 year+0.036
 doorstep+0.033
Neg logit contributions
rob-0.058
 pret-0.053
 Stri-0.053
 sy-0.050
 Fro-0.047
Token They
Token ID1119
Feature activation+7.25e-03

Pos logit contributions
Mom+0.002
Everyone+0.002
um+0.002
 year+0.002
ally+0.001
Neg logit contributions
rob-0.003
 pret-0.003
 Stri-0.003
 sy-0.003
 Roc-0.002
Token also
Token ID635
Feature activation+6.09e-03

Pos logit contributions
rob+0.012
 pret+0.012
 Stri+0.011
 sy+0.010
 scar+0.010
Neg logit contributions
Mom-0.012
Everyone-0.011
um-0.011
 year-0.011
VP-0.010
Token made
Token ID925
Feature activation-5.41e-03

Pos logit contributions
rob+0.011
 pret+0.010
 Stri+0.009
 sy+0.009
 scar+0.008
Neg logit contributions
Mom-0.010
Everyone-0.009
um-0.009
 year-0.009
 doorstep-0.008
Token fast
Token ID3049
Feature activation-8.81e-03

Pos logit contributions
Mom+0.009
um+0.009
Everyone+0.009
 doorstep+0.009
 year+0.009
Neg logit contributions
rob-0.008
 Stri-0.008
 pret-0.007
 Roc-0.007
 sy-0.006
Token it
Token ID340
Feature activation-3.40e-03

Pos logit contributions
Mom+0.010
 year+0.010
Everyone+0.009
um+0.009
 Everyone+0.008
Neg logit contributions
rob-0.019
 pret-0.017
 sy-0.017
 Stri-0.017
 Fro-0.015
Token made
Token ID925
Feature activation-5.23e-03

Pos logit contributions
Mom+0.006
Everyone+0.005
um+0.005
 year+0.005
ally+0.004
Neg logit contributions
rob-0.006
 pret-0.006
 Stri-0.005
 sy-0.005
 Roc-0.005
Token a
Token ID257
Feature activation-4.62e-03

Pos logit contributions
Mom+0.010
Everyone+0.009
um+0.009
 year+0.009
 Everyone+0.008
Neg logit contributions
rob-0.009
 pret-0.008
 sy-0.007
 Stri-0.007
 scar-0.006
Token bub
Token ID10015
Feature activation-1.54e-02

Pos logit contributions
Mom+0.007
Everyone+0.006
um+0.006
 year+0.006
ally+0.006
Neg logit contributions
rob-0.008
 pret-0.008
 Stri-0.008
 sy-0.008
 scar-0.007
Tokenbly
Token ID36874
Feature activation-4.84e-02

Pos logit contributions
Mom+0.022
Everyone+0.020
um+0.019
 year+0.019
ally+0.018
Neg logit contributions
rob-0.029
 pret-0.026
 Stri-0.026
 sy-0.026
 Roc-0.023
Token sound
Token ID2128
Feature activation-4.81e-03

Pos logit contributions
Mom+0.073
Everyone+0.064
um+0.062
Daddy+0.058
ally+0.058
Neg logit contributions
 pret-0.087
rob-0.087
 sy-0.083
 Stri-0.082
 ro-0.077
Token.
Token ID13
Feature activation+1.19e-03

Pos logit contributions
Mom+0.006
Everyone+0.005
 year+0.005
um+0.005
 tidy+0.004
Neg logit contributions
rob-0.011
 pret-0.010
 Stri-0.010
 sy-0.009
 Fro-0.009
Token The
Token ID383
Feature activation-8.83e-03

Pos logit contributions
 Stri+0.002
rob+0.002
 pret+0.002
 sy+0.002
 Fro+0.002
Neg logit contributions
Mom-0.002
Everyone-0.002
um-0.002
 year-0.002
ator-0.002
Token little
Token ID1310
Feature activation+1.16e-02

Pos logit contributions
um+0.014
abby+0.014
ator+0.013
urious+0.013
ally+0.013
Neg logit contributions
 pret-0.012
 ro-0.012
 scar-0.012
 Stri-0.012
 sy-0.012
Token boy
Token ID2933
Feature activation+4.19e-03

Pos logit contributions
rob+0.025
 pret+0.023
 Stri+0.023
 sy+0.022
 Fro+0.021
Neg logit contributions
Mom-0.015
Everyone-0.013
um-0.013
 year-0.012
 doorstep-0.011
Token pepper
Token ID13385
Feature activation-1.05e-03

Pos logit contributions
rob+0.006
 pret+0.005
 Stri+0.005
 sy+0.005
 ro+0.005
Neg logit contributions
Mom-0.005
Everyone-0.005
 year-0.005
um-0.005
har-0.004
Token and
Token ID290
Feature activation-5.46e-03

Pos logit contributions
Mom+0.001
 year+0.001
um+0.001
Everyone+0.001
 doorstep+0.001
Neg logit contributions
rob-0.002
 pret-0.002
 Stri-0.002
 sy-0.002
 Roc-0.002
Token made
Token ID925
Feature activation-1.11e-02

Pos logit contributions
Mom+0.009
Everyone+0.008
um+0.008
 year+0.008
ally+0.007
Neg logit contributions
rob-0.010
 pret-0.009
 Stri-0.009
 sy-0.008
 Roc-0.008
Token a
Token ID257
Feature activation-7.71e-03

Pos logit contributions
Mom+0.019
Everyone+0.018
um+0.017
 year+0.016
ally+0.015
Neg logit contributions
rob-0.020
 pret-0.017
 Stri-0.017
 sy-0.017
 Roc-0.015
Token y
Token ID331
Feature activation-2.92e-02

Pos logit contributions
Mom+0.012
Everyone+0.011
um+0.010
 year+0.010
ally+0.010
Neg logit contributions
 pret-0.013
rob-0.013
 Stri-0.013
 sy-0.012
 scar-0.012
Tokenummy
Token ID13513
Feature activation-4.82e-02

Pos logit contributions
Mom+0.027
Everyone+0.024
 year+0.020
har+0.018
um+0.018
Neg logit contributions
rob-0.068
 Stri-0.064
 pret-0.063
 sy-0.062
 Fro-0.059
Token snack
Token ID26906
Feature activation-1.89e-02

Pos logit contributions
Mom+0.085
Everyone+0.078
um+0.073
 year+0.073
ally+0.069
Neg logit contributions
rob-0.079
 pret-0.073
 Stri-0.069
 sy-0.068
 Fro-0.061
Token.
Token ID13
Feature activation-7.20e-04

Pos logit contributions
Mom+0.026
Everyone+0.023
um+0.022
 year+0.022
ally+0.020
Neg logit contributions
rob-0.039
 pret-0.036
 Stri-0.035
 sy-0.034
 Roc-0.031
Token
Token ID198
Feature activation-2.61e-03

Pos logit contributions
Mom+0.001
Everyone+0.001
 year+0.001
um+0.001
 doorstep+0.001
Neg logit contributions
rob-0.001
 Stri-0.001
 pret-0.001
 sy-0.001
 Roc-0.001
Token
Token ID198
Feature activation+1.74e-03

Pos logit contributions
Mom+0.004
um+0.003
ator+0.003
 year+0.003
Everyone+0.003
Neg logit contributions
rob-0.005
 Stri-0.005
 pret-0.005
 Fro-0.004
 Roc-0.004
TokenL
Token ID43
Feature activation+9.50e-03

Pos logit contributions
 Stri+0.003
rob+0.003
 pret+0.003
 sy+0.003
 scar+0.003
Neg logit contributions
ator-0.002
 doorstep-0.002
um-0.002
Mom-0.002
 year-0.002
Token together
Token ID1978
Feature activation-6.75e-03

Pos logit contributions
Mom+0.000
um+0.000
Everyone+0.000
 year+0.000
ally+0.000
Neg logit contributions
rob-0.001
 pret-0.001
 Stri-0.001
 sy-0.001
 ze-0.000
Token and
Token ID290
Feature activation-4.51e-03

Pos logit contributions
Mom+0.011
Everyone+0.010
um+0.010
 year+0.010
ally+0.009
Neg logit contributions
rob-0.013
 pret-0.011
 Stri-0.011
 sy-0.011
 Roc-0.010
Token made
Token ID925
Feature activation-9.37e-03

Pos logit contributions
Mom+0.007
Everyone+0.007
um+0.007
 year+0.007
ally+0.006
Neg logit contributions
rob-0.008
 pret-0.007
 Stri-0.007
 sy-0.007
 Roc-0.006
Token a
Token ID257
Feature activation-8.71e-03

Pos logit contributions
Mom+0.016
Everyone+0.015
um+0.015
 year+0.015
ally+0.014
Neg logit contributions
rob-0.016
 pret-0.014
 Stri-0.014
 sy-0.013
 Roc-0.012
Token y
Token ID331
Feature activation-2.58e-02

Pos logit contributions
Mom+0.014
Everyone+0.013
ally+0.012
um+0.012
ator+0.012
Neg logit contributions
 pret-0.014
 sy-0.013
 scar-0.013
rob-0.013
 Stri-0.012
Tokenummy
Token ID13513
Feature activation-4.72e-02

Pos logit contributions
Mom+0.024
Everyone+0.021
 year+0.018
har+0.016
um+0.016
Neg logit contributions
rob-0.062
 pret-0.057
 Stri-0.057
 sy-0.055
 Fro-0.053
Token shrimp
Token ID30060
Feature activation-2.52e-02

Pos logit contributions
Mom+0.085
Everyone+0.078
 year+0.072
um+0.072
har+0.070
Neg logit contributions
rob-0.073
 pret-0.071
 sy-0.064
 Stri-0.063
 ro-0.060
Token dish
Token ID9433
Feature activation-1.68e-02

Pos logit contributions
Mom+0.036
Everyone+0.033
um+0.032
 year+0.031
ally+0.028
Neg logit contributions
rob-0.050
 pret-0.046
 Stri-0.045
 sy-0.043
 Roc-0.040
Token.
Token ID13
Feature activation-1.24e-04

Pos logit contributions
Mom+0.025
Everyone+0.023
um+0.022
 year+0.022
ally+0.020
Neg logit contributions
rob-0.032
 pret-0.029
 Stri-0.029
 sy-0.027
 Roc-0.025
Token
Token ID220
Feature activation-1.42e-03

Pos logit contributions
Mom+0.000
 year+0.000
Everyone+0.000
um+0.000
har+0.000
Neg logit contributions
 Stri-0.000
rob-0.000
 pret-0.000
 Fro-0.000
 Roc-0.000
Token
Token ID198
Feature activation-2.10e-03

Pos logit contributions
Mom+0.002
 year+0.002
Everyone+0.002
um+0.002
 doorstep+0.002
Neg logit contributions
rob-0.003
 Stri-0.003
 pret-0.003
 sy-0.002
 Fro-0.002
Token and
Token ID290
Feature activation-6.18e-03

Pos logit contributions
rob+0.005
 pret+0.005
 Stri+0.005
 sy+0.005
 Roc+0.004
Neg logit contributions
Mom-0.003
Everyone-0.003
um-0.003
 year-0.003
ally-0.002
Token they
Token ID484
Feature activation-5.76e-03

Pos logit contributions
Mom+0.010
Everyone+0.009
um+0.009
 year+0.009
ally+0.008
Neg logit contributions
rob-0.011
 pret-0.010
 Stri-0.010
 sy-0.009
 Roc-0.009
Token made
Token ID925
Feature activation-1.51e-02

Pos logit contributions
Mom+0.009
Everyone+0.009
um+0.008
 year+0.008
ally+0.008
Neg logit contributions
rob-0.010
 pret-0.009
 Stri-0.009
 sy-0.009
 Roc-0.008
Token a
Token ID257
Feature activation-7.66e-03

Pos logit contributions
Mom+0.028
Everyone+0.026
 year+0.025
um+0.025
 doorstep+0.023
Neg logit contributions
rob-0.023
 pret-0.021
 Stri-0.021
 sy-0.019
 Fro-0.018
Token y
Token ID331
Feature activation-2.75e-02

Pos logit contributions
Mom+0.011
Everyone+0.010
um+0.010
 year+0.010
ally+0.009
Neg logit contributions
rob-0.014
 pret-0.014
 Stri-0.013
 sy-0.013
 scar-0.012
Tokenummy
Token ID13513
Feature activation-4.66e-02

Pos logit contributions
Mom+0.025
Everyone+0.021
 year+0.018
um+0.016
Daddy+0.016
Neg logit contributions
rob-0.066
 pret-0.061
 Stri-0.061
 sy-0.059
 Fro-0.057
Token apple
Token ID17180
Feature activation-2.00e-02

Pos logit contributions
Mom+0.080
Everyone+0.074
um+0.071
 year+0.070
ally+0.065
Neg logit contributions
rob-0.079
 pret-0.072
 Stri-0.070
 sy-0.067
 Roc-0.060
Token pie
Token ID2508
Feature activation-1.99e-02

Pos logit contributions
Mom+0.027
Everyone+0.024
um+0.024
 year+0.023
ally+0.021
Neg logit contributions
rob-0.042
 pret-0.038
 Stri-0.038
 sy-0.036
 Roc-0.033
Token.
Token ID13
Feature activation+1.66e-04

Pos logit contributions
Mom+0.027
Everyone+0.024
um+0.024
 year+0.023
ally+0.021
Neg logit contributions
rob-0.041
 pret-0.038
 Stri-0.037
 sy-0.036
 Roc-0.033
Token But
Token ID887
Feature activation-4.97e-03

Pos logit contributions
 pret+0.000
 Stri+0.000
 sy+0.000
rob+0.000
 Fro+0.000
Neg logit contributions
Mom-0.000
 year-0.000
ator-0.000
har-0.000
Everyone-0.000
Token when
Token ID618
Feature activation+1.13e-03

Pos logit contributions
Mom+0.008
Everyone+0.007
um+0.007
 year+0.007
 doorstep+0.006
Neg logit contributions
rob-0.009
 pret-0.008
 Stri-0.008
 sy-0.008
 Fro-0.007
Token berries
Token ID36322
Feature activation+2.87e-04

Pos logit contributions
Mom+0.008
Everyone+0.008
um+0.008
 year+0.008
har+0.007
Neg logit contributions
rob-0.006
 pret-0.006
 Stri-0.005
 sy-0.005
 Fro-0.005
Token and
Token ID290
Feature activation-4.28e-03

Pos logit contributions
rob+0.001
 pret+0.001
 Stri+0.001
 sy+0.001
 ze+0.000
Neg logit contributions
Mom-0.000
um-0.000
 year-0.000
Everyone-0.000
ally-0.000
Token made
Token ID925
Feature activation-1.13e-02

Pos logit contributions
Mom+0.007
Everyone+0.006
um+0.006
 year+0.006
ally+0.005
Neg logit contributions
rob-0.008
 pret-0.007
 Stri-0.007
 sy-0.007
 Roc-0.006
Token a
Token ID257
Feature activation-5.48e-03

Pos logit contributions
Mom+0.019
Everyone+0.018
um+0.017
 year+0.017
ally+0.016
Neg logit contributions
rob-0.020
 pret-0.018
 Stri-0.017
 sy-0.017
 Roc-0.015
Token y
Token ID331
Feature activation-2.77e-02

Pos logit contributions
Mom+0.009
Everyone+0.008
um+0.008
 year+0.008
ator+0.008
Neg logit contributions
rob-0.009
 pret-0.009
 Stri-0.008
 sy-0.008
 scar-0.008
Tokenummy
Token ID13513
Feature activation-4.62e-02

Pos logit contributions
Mom+0.025
Everyone+0.022
 year+0.018
har+0.017
um+0.016
Neg logit contributions
rob-0.066
 pret-0.062
 Stri-0.061
 sy-0.060
 Fro-0.057
Token b
Token ID275
Feature activation-2.35e-02

Pos logit contributions
Mom+0.082
Everyone+0.076
um+0.072
 year+0.072
Daddy+0.067
Neg logit contributions
rob-0.075
 pret-0.069
 Stri-0.066
 sy-0.064
 Fro-0.058
Tokenerry
Token ID6996
Feature activation-1.71e-02

Pos logit contributions
Mom+0.045
um+0.042
Everyone+0.042
 year+0.041
ally+0.039
Neg logit contributions
rob-0.036
 pret-0.033
 Stri-0.031
 sy-0.029
 Roc-0.026
Token pie
Token ID2508
Feature activation-2.30e-02

Pos logit contributions
Mom+0.025
Everyone+0.022
um+0.022
 year+0.022
ally+0.020
Neg logit contributions
rob-0.034
 pret-0.031
 Stri-0.030
 sy-0.029
 Roc-0.026
Token.
Token ID13
Feature activation+7.02e-04

Pos logit contributions
Mom+0.029
um+0.026
Everyone+0.026
 year+0.025
ally+0.023
Neg logit contributions
rob-0.050
 pret-0.046
 Stri-0.045
 sy-0.043
 Roc-0.040
Token It
Token ID632
Feature activation-4.78e-03

Pos logit contributions
 Stri+0.001
 pret+0.001
rob+0.001
 Roc+0.001
 Fro+0.001
Neg logit contributions
Mom-0.001
 year-0.001
Everyone-0.001
Daddy-0.001
har-0.001
Token so
Token ID523
Feature activation-2.48e-03

Pos logit contributions
rob+0.021
 pret+0.019
 Stri+0.018
 sy+0.017
 Roc+0.016
Neg logit contributions
Mom-0.016
Everyone-0.014
um-0.014
 year-0.014
ally-0.012
Token surprised
Token ID6655
Feature activation+7.12e-03

Pos logit contributions
Mom+0.004
Everyone+0.004
 year+0.003
um+0.003
 doorstep+0.003
Neg logit contributions
rob-0.004
 pret-0.004
 Stri-0.004
 sy-0.004
 Fro-0.004
Token that
Token ID326
Feature activation-5.39e-03

Pos logit contributions
rob+0.012
 pret+0.011
 Stri+0.011
 sy+0.010
 Fro+0.009
Neg logit contributions
Mom-0.012
Everyone-0.012
um-0.011
 year-0.011
 doorstep-0.010
Token they
Token ID484
Feature activation-9.00e-03

Pos logit contributions
Mom+0.012
Everyone+0.011
 year+0.011
um+0.011
 doorstep+0.010
Neg logit contributions
rob-0.006
 Stri-0.006
 pret-0.006
 sy-0.005
 Fro-0.005
Token started
Token ID2067
Feature activation-1.62e-03

Pos logit contributions
Mom+0.015
um+0.014
Everyone+0.014
 year+0.013
ator+0.013
Neg logit contributions
rob-0.014
 pret-0.014
 Stri-0.014
 sy-0.013
 ro-0.012
Token tor
Token ID7332
Feature activation-4.60e-02

Pos logit contributions
Mom+0.003
Everyone+0.003
um+0.003
 year+0.003
ally+0.003
Neg logit contributions
rob-0.002
 pret-0.002
 Stri-0.002
 sy-0.002
 Roc-0.002
Tokenoll
Token ID692
Feature activation-1.90e-03

Pos logit contributions
Mom+0.081
Everyone+0.075
um+0.072
 year+0.072
ally+0.067
Neg logit contributions
rob-0.077
 pret-0.070
 Stri-0.068
 sy-0.065
 Roc-0.059
Token away
Token ID1497
Feature activation-1.75e-02

Pos logit contributions
um+0.004
Mom+0.003
 year+0.003
ally+0.003
Everyone+0.003
Neg logit contributions
rob-0.003
 pret-0.002
 Stri-0.002
 sy-0.002
 Roc-0.002
Token.
Token ID13
Feature activation-6.55e-04

Pos logit contributions
Mom+0.024
Everyone+0.022
 year+0.022
um+0.021
har+0.020
Neg logit contributions
rob-0.036
 pret-0.032
 Stri-0.031
 sy-0.030
 Roc-0.027
Token
Token ID220
Feature activation-2.37e-03

Pos logit contributions
Mom+0.001
 year+0.001
um+0.001
Everyone+0.001
 doorstep+0.001
Neg logit contributions
 Stri-0.001
rob-0.001
 Fro-0.001
 pret-0.001
 Roc-0.001
Token
Token ID198
Feature activation-2.01e-03

Pos logit contributions
Mom+0.003
 year+0.003
Everyone+0.003
um+0.003
 doorstep+0.003
Neg logit contributions
rob-0.004
 Stri-0.004
 pret-0.004
 sy-0.004
 Fro-0.004
Token early
Token ID1903
Feature activation+1.82e-03

Pos logit contributions
Mom+0.012
Everyone+0.011
 year+0.010
ally+0.010
um+0.010
Neg logit contributions
rob-0.014
 pret-0.014
 Stri-0.013
 sy-0.013
 asleep-0.012
Token and
Token ID290
Feature activation-5.29e-03

Pos logit contributions
rob+0.004
 Stri+0.003
 pret+0.003
 sy+0.003
 Fro+0.003
Neg logit contributions
Mom-0.003
 year-0.002
Everyone-0.002
um-0.002
 doorstep-0.002
Token make
Token ID787
Feature activation-9.09e-03

Pos logit contributions
Mom+0.009
Everyone+0.008
um+0.008
 year+0.008
ally+0.008
Neg logit contributions
rob-0.009
 pret-0.008
 Stri-0.008
 sy-0.008
 scar-0.007
Token a
Token ID257
Feature activation-1.22e-02

Pos logit contributions
Mom+0.015
Everyone+0.014
 year+0.014
um+0.014
 Everyone+0.012
Neg logit contributions
rob-0.016
 pret-0.014
 Stri-0.014
 sy-0.013
 Fro-0.012
Token y
Token ID331
Feature activation-2.16e-02

Pos logit contributions
Mom+0.019
Everyone+0.017
um+0.017
ally+0.017
ator+0.016
Neg logit contributions
 pret-0.020
 scar-0.019
rob-0.019
 sy-0.019
 Stri-0.018
Tokenummy
Token ID13513
Feature activation-4.56e-02

Pos logit contributions
Mom+0.019
Everyone+0.016
 year+0.014
um+0.013
har+0.011
Neg logit contributions
rob-0.054
 pret-0.050
 Stri-0.050
 sy-0.049
 Roc-0.046
Token breakfast
Token ID12607
Feature activation-1.68e-02

Pos logit contributions
Mom+0.082
Everyone+0.074
um+0.070
 year+0.068
ally+0.068
Neg logit contributions
rob-0.074
 pret-0.070
 sy-0.064
 Stri-0.063
 scar-0.059
Token for
Token ID329
Feature activation+4.36e-03

Pos logit contributions
Mom+0.021
Everyone+0.019
 year+0.018
um+0.018
 doorstep+0.015
Neg logit contributions
rob-0.036
 pret-0.034
 Stri-0.033
 sy-0.032
 Fro-0.030
Token her
Token ID607
Feature activation-2.04e-03

Pos logit contributions
rob+0.006
 pret+0.006
 Stri+0.006
 sy+0.005
 Roc+0.005
Neg logit contributions
Mom-0.008
Everyone-0.008
um-0.008
 year-0.008
ally-0.007
Token family
Token ID1641
Feature activation-1.01e-02

Pos logit contributions
Mom+0.003
Everyone+0.002
um+0.002
ally+0.002
har+0.002
Neg logit contributions
 pret-0.004
rob-0.004
 Stri-0.004
 sy-0.004
 asleep-0.004
Token.
Token ID13
Feature activation+3.75e-03

Pos logit contributions
Mom+0.015
Everyone+0.013
um+0.013
 year+0.012
ally+0.011
Neg logit contributions
rob-0.020
 pret-0.018
 Stri-0.018
 sy-0.017
 Roc-0.016
Token up
Token ID510
Feature activation+9.84e-03

Pos logit contributions
rob+0.003
 pret+0.003
 Stri+0.003
 sy+0.003
 enough+0.003
Neg logit contributions
Mom-0.004
Everyone-0.004
um-0.003
 year-0.003
ally-0.003
Token to
Token ID284
Feature activation-4.78e-03

Pos logit contributions
rob+0.015
 pret+0.014
 Stri+0.014
 sy+0.013
 enough+0.013
Neg logit contributions
Mom-0.018
Everyone-0.016
um-0.015
 year-0.015
ally-0.015
Token reveal
Token ID7766
Feature activation-1.68e-02

Pos logit contributions
Mom+0.009
um+0.009
Everyone+0.009
 year+0.008
ally+0.008
Neg logit contributions
rob-0.006
 pret-0.006
 Stri-0.006
 sy-0.005
 Fro-0.005
Token a
Token ID257
Feature activation-1.77e-02

Pos logit contributions
Mom+0.034
Everyone+0.031
um+0.029
 year+0.029
ally+0.029
Neg logit contributions
rob-0.021
 pret-0.021
 Stri-0.021
 enough-0.019
 sy-0.019
Token y
Token ID331
Feature activation-3.00e-02

Pos logit contributions
Mom+0.030
Everyone+0.028
um+0.027
ally+0.026
ator+0.026
Neg logit contributions
 pret-0.028
 sy-0.027
 ro-0.025
 scar-0.025
 Stri-0.024
Tokenummy
Token ID13513
Feature activation-4.53e-02

Pos logit contributions
Mom+0.035
Everyone+0.032
Daddy+0.025
VP+0.024
har+0.024
Neg logit contributions
rob-0.061
 pret-0.058
 enough-0.057
 Stri-0.057
 sy-0.057
Token surprise
Token ID5975
Feature activation-1.08e-02

Pos logit contributions
Mom+0.096
Everyone+0.090
Daddy+0.081
har+0.080
ator+0.079
Neg logit contributions
 pret-0.057
rob-0.057
 sy-0.052
 ro-0.050
 scar-0.046
Token.
Token ID13
Feature activation-4.63e-04

Pos logit contributions
Mom+0.019
Everyone+0.018
um+0.016
 year+0.016
urious+0.016
Neg logit contributions
rob-0.018
 pret-0.017
 Stri-0.016
 sy-0.016
 enough-0.014
Token
Token ID220
Feature activation-2.08e-04

Pos logit contributions
Mom+0.001
 year+0.001
Everyone+0.001
um+0.001
 doorstep+0.001
Neg logit contributions
rob-0.001
 Stri-0.001
 pret-0.001
 Fro-0.001
 Roc-0.001
Token
Token ID198
Feature activation-9.48e-04

Pos logit contributions
Mom+0.000
 year+0.000
Everyone+0.000
um+0.000
 doorstep+0.000
Neg logit contributions
rob-0.000
 Stri-0.000
 pret-0.000
 sy-0.000
 Fro-0.000
Token
Token ID198
Feature activation+2.76e-03

Pos logit contributions
Mom+0.001
um+0.001
 year+0.001
 doorstep+0.001
 rightful+0.001
Neg logit contributions
rob-0.002
 Stri-0.002
 pret-0.002
 Fro-0.002
 Roc-0.002
Token fridge
Token ID25772
Feature activation-3.95e-03

Pos logit contributions
ally+0.011
Everyone+0.011
um+0.010
ini+0.010
 behold+0.010
Neg logit contributions
 asleep-0.008
 pret-0.008
 sy-0.008
 ro-0.007
 pill-0.007
Token and
Token ID290
Feature activation-5.98e-03

Pos logit contributions
Mom+0.006
 year+0.006
um+0.006
Everyone+0.006
 doorstep+0.005
Neg logit contributions
rob-0.007
 pret-0.007
 Stri-0.007
 sy-0.006
 Roc-0.006
Token found
Token ID1043
Feature activation-2.26e-02

Pos logit contributions
Mom+0.010
Everyone+0.010
um+0.009
 year+0.009
ally+0.008
Neg logit contributions
rob-0.010
 pret-0.010
 Stri-0.009
 sy-0.009
 Roc-0.008
Token a
Token ID257
Feature activation-1.25e-02

Pos logit contributions
Mom+0.042
Everyone+0.039
 year+0.037
um+0.037
ally+0.035
Neg logit contributions
rob-0.035
 pret-0.032
 Stri-0.031
 sy-0.029
 Roc-0.027
Token y
Token ID331
Feature activation-3.01e-02

Pos logit contributions
Mom+0.020
Everyone+0.020
um+0.019
ally+0.019
urious+0.018
Neg logit contributions
 pret-0.020
 sy-0.018
rob-0.018
 ro-0.017
 scar-0.017
Tokenummy
Token ID13513
Feature activation-4.50e-02

Pos logit contributions
Mom+0.033
Everyone+0.031
har+0.024
VP+0.023
Daddy+0.023
Neg logit contributions
rob-0.063
 pret-0.059
 enough-0.057
 sy-0.057
 Stri-0.057
Token ice
Token ID4771
Feature activation-1.92e-02

Pos logit contributions
Mom+0.084
Everyone+0.081
um+0.072
urious+0.072
har+0.072
Neg logit contributions
rob-0.066
 pret-0.065
 sy-0.057
 ro-0.054
 scar-0.051
Token cream
Token ID8566
Feature activation-2.72e-02

Pos logit contributions
Mom+0.033
Everyone+0.031
um+0.028
 year+0.027
urious+0.027
Neg logit contributions
rob-0.033
 pret-0.030
 Stri-0.027
 sy-0.027
 ro-0.024
Token cake
Token ID12187
Feature activation-1.88e-02

Pos logit contributions
Mom+0.047
Everyone+0.045
um+0.042
 year+0.042
ally+0.039
Neg logit contributions
rob-0.045
 pret-0.041
 Stri-0.039
 sy-0.038
 Roc-0.034
Token that
Token ID326
Feature activation+6.32e-04

Pos logit contributions
Mom+0.028
Everyone+0.026
um+0.024
 year+0.024
ally+0.022
Neg logit contributions
rob-0.036
 pret-0.033
 Stri-0.032
 sy-0.031
 Roc-0.029
Token was
Token ID373
Feature activation+5.61e-03

Pos logit contributions
rob+0.001
 pret+0.001
 Stri+0.001
 sy+0.001
 Roc+0.001
Neg logit contributions
Mom-0.001
Everyone-0.001
um-0.001
 year-0.001
ally-0.001
Token strawberries
Token ID36973
Feature activation+5.89e-03

Pos logit contributions
rob+0.000
 pret+0.000
 sy+0.000
 Stri+0.000
 scar+0.000
Neg logit contributions
Mom-0.001
Everyone-0.000
um-0.000
 year-0.000
Daddy-0.000
Token and
Token ID290
Feature activation+2.36e-03

Pos logit contributions
rob+0.012
 pret+0.011
 Stri+0.010
 sy+0.010
 Roc+0.009
Neg logit contributions
Mom-0.009
um-0.008
 year-0.008
Everyone-0.008
ally-0.007
Token made
Token ID925
Feature activation-8.47e-03

Pos logit contributions
rob+0.004
 pret+0.004
 Stri+0.004
 sy+0.004
 scar+0.003
Neg logit contributions
Mom-0.004
Everyone-0.004
um-0.003
 year-0.003
 doorstep-0.003
Token a
Token ID257
Feature activation-7.52e-03

Pos logit contributions
Mom+0.015
Everyone+0.013
um+0.013
 year+0.013
ally+0.012
Neg logit contributions
rob-0.014
 pret-0.013
 Stri-0.013
 sy-0.012
 Roc-0.011
Token y
Token ID331
Feature activation-2.44e-02

Pos logit contributions
Mom+0.012
Everyone+0.011
um+0.010
ator+0.010
 year+0.010
Neg logit contributions
 pret-0.013
 sy-0.012
 scar-0.012
rob-0.012
 Stri-0.011
Tokenummy
Token ID13513
Feature activation-4.50e-02

Pos logit contributions
Mom+0.024
Everyone+0.021
 year+0.017
Daddy+0.016
har+0.016
Neg logit contributions
rob-0.056
 pret-0.052
 Stri-0.052
 sy-0.051
 Fro-0.049
Token strawberry
Token ID41236
Feature activation-2.36e-02

Pos logit contributions
Mom+0.081
Everyone+0.074
um+0.069
 year+0.067
Daddy+0.066
Neg logit contributions
rob-0.071
 pret-0.068
 sy-0.062
 Stri-0.061
 ro-0.057
Token cake
Token ID12187
Feature activation-1.87e-02

Pos logit contributions
Mom+0.037
Everyone+0.033
um+0.032
 year+0.032
ally+0.029
Neg logit contributions
rob-0.044
 pret-0.040
 Stri-0.039
 sy-0.038
 Roc-0.035
Token.
Token ID13
Feature activation+4.01e-04

Pos logit contributions
Mom+0.026
Everyone+0.024
um+0.023
 year+0.023
ally+0.021
Neg logit contributions
rob-0.037
 pret-0.034
 Stri-0.034
 sy-0.032
 Roc-0.030
Token She
Token ID1375
Feature activation-1.74e-03

Pos logit contributions
 Stri+0.001
rob+0.001
 pret+0.001
 Fro+0.001
 Roc+0.001
Neg logit contributions
Mom-0.001
 year-0.001
um-0.001
ator-0.001
Everyone-0.001
Token smiled
Token ID13541
Feature activation+4.08e-03

Pos logit contributions
Mom+0.003
Everyone+0.003
um+0.002
 year+0.002
VP+0.002
Neg logit contributions
 pret-0.003
rob-0.003
 Stri-0.003
 sy-0.003
 ro-0.002
Token boom
Token ID14166
Feature activation+5.05e-03

Pos logit contributions
rob+0.005
 Stri+0.004
 pret+0.004
 sy+0.004
 Fro+0.004
Neg logit contributions
Mom-0.004
Everyone-0.003
 year-0.003
 tidy-0.003
um-0.003
Token and
Token ID290
Feature activation-1.25e-02

Pos logit contributions
rob+0.010
 Stri+0.009
 pret+0.009
 sy+0.009
 Fro+0.008
Neg logit contributions
Mom-0.007
Everyone-0.007
 year-0.006
um-0.006
ally-0.005
Token see
Token ID766
Feature activation-2.08e-02

Pos logit contributions
Mom+0.023
Everyone+0.021
um+0.021
 year+0.020
ally+0.019
Neg logit contributions
rob-0.020
 pret-0.018
 Stri-0.018
 sy-0.017
 Roc-0.015
Token a
Token ID257
Feature activation-1.69e-02

Pos logit contributions
Mom+0.039
um+0.036
Everyone+0.036
 year+0.035
ally+0.034
Neg logit contributions
rob-0.031
 pret-0.029
 Stri-0.028
 sy-0.026
 Roc-0.024
Token fire
Token ID2046
Feature activation-2.65e-02

Pos logit contributions
Mom+0.026
um+0.026
ally+0.024
Everyone+0.024
urious+0.023
Neg logit contributions
 pret-0.029
rob-0.028
 sy-0.027
 Stri-0.026
 ro-0.025
Tokenwork
Token ID1818
Feature activation-4.49e-02

Pos logit contributions
Mom+0.041
Everyone+0.038
um+0.036
 year+0.036
ally+0.033
Neg logit contributions
rob-0.049
 pret-0.045
 Stri-0.044
 sy-0.042
 Roc-0.039
Token shoot
Token ID2686
Feature activation-9.38e-03

Pos logit contributions
Mom+0.079
Everyone+0.071
um+0.070
ally+0.069
Daddy+0.065
Neg logit contributions
rob-0.073
 pret-0.069
 sy-0.063
 Stri-0.063
 ro-0.060
Token up
Token ID510
Feature activation+1.02e-02

Pos logit contributions
Mom+0.017
Everyone+0.015
um+0.015
ally+0.014
 year+0.014
Neg logit contributions
rob-0.015
 pret-0.014
 Stri-0.014
 sy-0.013
 ro-0.012
Token in
Token ID287
Feature activation-1.85e-02

Pos logit contributions
rob+0.015
 pret+0.015
 Stri+0.014
 asleep+0.013
 enough+0.013
Neg logit contributions
Mom-0.018
Everyone-0.017
um-0.017
ally-0.017
 doorstep-0.016
Token the
Token ID262
Feature activation-3.04e-03

Pos logit contributions
Mom+0.038
um+0.037
Everyone+0.036
ally+0.035
 doorstep+0.034
Neg logit contributions
rob-0.022
 pret-0.021
 Stri-0.020
 sy-0.018
 asleep-0.017
Token air
Token ID1633
Feature activation-3.42e-03

Pos logit contributions
ally+0.005
Mom+0.005
Everyone+0.005
um+0.005
becca+0.005
Neg logit contributions
rob-0.005
 pret-0.004
 sy-0.004
 asleep-0.004
 ro-0.004
Token pie
Token ID2508
Feature activation-2.19e-02

Pos logit contributions
Mom+0.024
Everyone+0.024
har+0.023
urious+0.022
ally+0.022
Neg logit contributions
 pret-0.016
 ro-0.015
 sy-0.015
rob-0.015
 enough-0.014
Token turn
Token ID1210
Feature activation-9.88e-03

Pos logit contributions
Mom+0.029
um+0.026
Everyone+0.025
 year+0.025
 doorstep+0.022
Neg logit contributions
rob-0.047
 Stri-0.042
 pret-0.042
 sy-0.040
 Fro-0.038
Token brown
Token ID7586
Feature activation-9.71e-03

Pos logit contributions
Mom+0.016
Everyone+0.014
um+0.014
 year+0.014
ally+0.013
Neg logit contributions
rob-0.018
 pret-0.016
 Stri-0.016
 sy-0.015
 Roc-0.014
Token and
Token ID290
Feature activation+4.82e-03

Pos logit contributions
Mom+0.014
um+0.012
 year+0.012
Everyone+0.012
ally+0.010
Neg logit contributions
rob-0.020
 Stri-0.018
 pret-0.018
 sy-0.017
 Roc-0.016
Token bub
Token ID10015
Feature activation-1.62e-02

Pos logit contributions
rob+0.007
 pret+0.007
 Stri+0.007
 sy+0.007
 scar+0.006
Neg logit contributions
Mom-0.008
Everyone-0.008
um-0.008
 year-0.008
ally-0.007
Tokenbly
Token ID36874
Feature activation-4.48e-02

Pos logit contributions
Mom+0.024
Everyone+0.022
 year+0.021
um+0.020
ally+0.020
Neg logit contributions
rob-0.030
 pret-0.028
 Stri-0.027
 sy-0.026
 ro-0.024
Token.
Token ID13
Feature activation-2.91e-03

Pos logit contributions
Mom+0.059
Everyone+0.053
um+0.052
 year+0.051
ally+0.045
Neg logit contributions
rob-0.095
 Stri-0.086
 pret-0.086
 sy-0.081
 Roc-0.076
Token
Token ID198
Feature activation-7.98e-03

Pos logit contributions
Mom+0.005
 year+0.005
 doorstep+0.005
Everyone+0.005
um+0.005
Neg logit contributions
 Stri-0.004
 pret-0.004
rob-0.004
 Roc-0.004
 Fro-0.004
Token
Token ID198
Feature activation-3.58e-03

Pos logit contributions
 rightful+0.013
 doorstep+0.013
 mant+0.013
erie+0.013
har+0.013
Neg logit contributions
 Stri-0.012
rob-0.011
 pret-0.010
 Roc-0.010
 Fro-0.010
Token"
Token ID1
Feature activation+6.47e-03

Pos logit contributions
 doorstep+0.005
 mant+0.004
 rightful+0.004
 year+0.004
ator+0.004
Neg logit contributions
 Stri-0.007
rob-0.006
 Roc-0.006
 pret-0.006
 Fro-0.005
TokenD
Token ID35
Feature activation-5.81e-03

Pos logit contributions
rob+0.015
 pret+0.014
 Stri+0.014
 sy+0.014
 Roc+0.013
Neg logit contributions
um-0.006
 year-0.006
Mom-0.006
Everyone-0.005
 doorstep-0.005
Token they
Token ID484
Feature activation+8.49e-03

Pos logit contributions
Mom+0.021
Everyone+0.020
um+0.019
 year+0.019
ally+0.018
Neg logit contributions
rob-0.021
 pret-0.019
 Stri-0.018
 sy-0.017
 Roc-0.016
Token were
Token ID547
Feature activation+4.06e-03

Pos logit contributions
rob+0.015
 pret+0.014
 Stri+0.013
 sy+0.012
 Roc+0.011
Neg logit contributions
Mom-0.014
Everyone-0.013
um-0.013
 year-0.012
ally-0.012
Token making
Token ID1642
Feature activation-2.07e-03

Pos logit contributions
rob+0.007
 pret+0.007
 Stri+0.007
 sy+0.006
 Roc+0.006
Neg logit contributions
Mom-0.006
Everyone-0.006
um-0.006
 year-0.006
ally-0.005
Token a
Token ID257
Feature activation-2.85e-03

Pos logit contributions
Mom+0.004
Everyone+0.003
 year+0.003
um+0.003
ally+0.003
Neg logit contributions
rob-0.003
 pret-0.003
 Stri-0.003
 sy-0.003
 Fro-0.003
Token y
Token ID331
Feature activation-1.99e-02

Pos logit contributions
Mom+0.005
Everyone+0.004
um+0.004
ally+0.004
 year+0.004
Neg logit contributions
 pret-0.005
rob-0.005
 sy-0.004
 Stri-0.004
 scar-0.004
Tokenummy
Token ID13513
Feature activation-4.47e-02

Pos logit contributions
Mom+0.018
Everyone+0.016
 year+0.013
um+0.012
har+0.012
Neg logit contributions
rob-0.048
 pret-0.044
 Stri-0.044
 sy-0.043
 Fro-0.041
Token and
Token ID290
Feature activation+4.10e-03

Pos logit contributions
Mom+0.076
Everyone+0.069
um+0.065
 year+0.063
Daddy+0.061
Neg logit contributions
rob-0.077
 pret-0.073
 sy-0.067
 Stri-0.067
 scar-0.060
Token pretty
Token ID2495
Feature activation-2.38e-02

Pos logit contributions
rob+0.007
 pret+0.007
 Stri+0.007
 sy+0.006
 scar+0.006
Neg logit contributions
Mom-0.007
Everyone-0.006
um-0.006
 year-0.006
ally-0.005
Token cake
Token ID12187
Feature activation-2.61e-02

Pos logit contributions
Mom+0.044
Everyone+0.040
um+0.038
Daddy+0.037
har+0.036
Neg logit contributions
 pret-0.037
rob-0.036
 sy-0.032
 scar-0.030
 Stri-0.030
Token for
Token ID329
Feature activation-7.72e-04

Pos logit contributions
Mom+0.033
Everyone+0.029
um+0.028
 year+0.028
ally+0.025
Neg logit contributions
rob-0.056
 pret-0.051
 Stri-0.051
 sy-0.049
 Fro-0.046
Token their
Token ID511
Feature activation-2.70e-03

Pos logit contributions
Mom+0.001
Everyone+0.001
um+0.001
 year+0.001
ally+0.001
Neg logit contributions
rob-0.001
 pret-0.001
 Stri-0.001
 sy-0.001
 Fro-0.001
Token too
Token ID1165
Feature activation-1.17e-02

Pos logit contributions
rob+0.015
 pret+0.014
 Stri+0.014
 sy+0.013
 Roc+0.012
Neg logit contributions
Mom-0.015
Everyone-0.013
um-0.013
 year-0.013
ally-0.012
Token cold
Token ID4692
Feature activation+5.65e-03

Pos logit contributions
Mom+0.020
Everyone+0.018
um+0.017
 doorstep+0.017
ally+0.016
Neg logit contributions
rob-0.019
 pret-0.018
 Stri-0.018
 sy-0.017
 ro-0.016
Token for
Token ID329
Feature activation+3.69e-03

Pos logit contributions
rob+0.011
 pret+0.010
 Stri+0.010
 sy+0.010
 Fro+0.009
Neg logit contributions
Mom-0.007
um-0.007
 year-0.007
Everyone-0.006
 doorstep-0.006
Token a
Token ID257
Feature activation-1.22e-03

Pos logit contributions
rob+0.005
 pret+0.005
 Stri+0.005
 sy+0.005
 Roc+0.004
Neg logit contributions
Mom-0.007
Everyone-0.007
um-0.006
 year-0.006
ally-0.006
Token re
Token ID302
Feature activation-2.90e-02

Pos logit contributions
Mom+0.002
um+0.002
Everyone+0.002
ator+0.002
urious+0.002
Neg logit contributions
 pret-0.002
rob-0.002
 sy-0.002
 Stri-0.002
 scar-0.002
Tokenind
Token ID521
Feature activation-4.47e-02

Pos logit contributions
Mom+0.045
Everyone+0.043
um+0.039
 Everyone+0.038
ally+0.036
Neg logit contributions
rob-0.053
 Stri-0.048
 pret-0.047
 sy-0.047
ah-0.044
Tokeneer
Token ID28153
Feature activation-1.60e-02

Pos logit contributions
Mom+0.093
 year+0.090
Everyone+0.087
um+0.086
 doorstep+0.080
Neg logit contributions
rob-0.055
 pret-0.049
 Stri-0.049
 sy-0.045
 Fro-0.040
Token like
Token ID588
Feature activation-1.11e-02

Pos logit contributions
Mom+0.022
 year+0.020
Everyone+0.020
um+0.019
 doorstep+0.018
Neg logit contributions
rob-0.032
 pret-0.029
 Stri-0.028
 sy-0.027
 Roc-0.025
Token you
Token ID345
Feature activation-3.97e-03

Pos logit contributions
Mom+0.020
Everyone+0.018
 year+0.018
um+0.018
ally+0.016
Neg logit contributions
rob-0.018
 pret-0.016
 Stri-0.016
 sy-0.015
 Roc-0.014
Token!"
Token ID2474
Feature activation+9.01e-04

Pos logit contributions
Mom+0.006
Everyone+0.005
 year+0.005
um+0.005
ally+0.005
Neg logit contributions
rob-0.008
 pret-0.007
 Stri-0.007
 sy-0.007
 Roc-0.006
Token
Token ID198
Feature activation+2.35e-03

Pos logit contributions
 Stri+0.001
rob+0.001
 pret+0.001
 Fro+0.001
 sy+0.001
Neg logit contributions
Mom-0.001
ator-0.001
um-0.001
 doorstep-0.001
ini-0.001
Token.
Token ID13
Feature activation-4.63e-03

Pos logit contributions
rob+0.016
 pret+0.014
 Stri+0.014
 sy+0.014
 Roc+0.013
Neg logit contributions
Mom-0.009
Everyone-0.008
um-0.008
 year-0.008
ally-0.007
Token He
Token ID679
Feature activation-2.22e-03

Pos logit contributions
Mom+0.008
 year+0.008
Everyone+0.007
um+0.007
 doorstep+0.007
Neg logit contributions
rob-0.007
 pret-0.007
 Stri-0.007
 sy-0.006
 Fro-0.006
Token made
Token ID925
Feature activation-8.67e-03

Pos logit contributions
Mom+0.004
Everyone+0.003
um+0.003
 year+0.003
 doorstep+0.003
Neg logit contributions
rob-0.004
 pret-0.004
 Stri-0.004
 sy-0.003
 Fro-0.003
Token a
Token ID257
Feature activation-3.14e-03

Pos logit contributions
Mom+0.015
Everyone+0.014
um+0.013
 year+0.013
ally+0.012
Neg logit contributions
rob-0.015
 pret-0.013
 Stri-0.013
 sy-0.012
 Roc-0.011
Token y
Token ID331
Feature activation-2.22e-02

Pos logit contributions
Mom+0.004
Everyone+0.004
um+0.003
 year+0.003
ally+0.003
Neg logit contributions
rob-0.007
 pret-0.006
 Stri-0.006
 sy-0.006
 scar-0.005
Tokenummy
Token ID13513
Feature activation-4.45e-02

Pos logit contributions
Mom+0.016
Everyone+0.013
um+0.013
 year+0.012
ally+0.010
Neg logit contributions
rob-0.060
 pret-0.056
 Stri-0.056
 sy-0.054
 Roc-0.051
Token sound
Token ID2128
Feature activation-5.55e-03

Pos logit contributions
Mom+0.074
Everyone+0.068
um+0.067
 year+0.067
ally+0.059
Neg logit contributions
rob-0.079
 Stri-0.072
 pret-0.069
 sy-0.065
 Roc-0.061
Token too
Token ID1165
Feature activation-9.32e-03

Pos logit contributions
Mom+0.005
Everyone+0.004
 year+0.004
um+0.004
 nicely+0.003
Neg logit contributions
rob-0.015
 Stri-0.013
 pret-0.013
 sy-0.012
 scar-0.012
Token.
Token ID13
Feature activation-4.79e-03

Pos logit contributions
Mom+0.009
um+0.008
Everyone+0.007
 year+0.007
 tidy+0.006
Neg logit contributions
rob-0.024
 pret-0.022
 Stri-0.022
 sy-0.021
 scar-0.020
Token
Token ID198
Feature activation-7.12e-03

Pos logit contributions
Mom+0.008
 year+0.008
Everyone+0.008
um+0.008
 doorstep+0.008
Neg logit contributions
rob-0.007
 Stri-0.007
 pret-0.007
 sy-0.007
 Fro-0.006
Token
Token ID198
Feature activation-3.15e-03

Pos logit contributions
 doorstep+0.011
har+0.011
 year+0.011
Mom+0.010
ator+0.010
Neg logit contributions
 Stri-0.011
rob-0.011
 pret-0.010
 Fro-0.010
 Roc-0.010
Token bread
Token ID8509
Feature activation+7.11e-03

Pos logit contributions
rob+0.034
 pret+0.031
 Stri+0.030
 sy+0.028
 scar+0.025
Neg logit contributions
Mom-0.048
Everyone-0.044
um-0.043
 year-0.043
 doorstep-0.041
Token and
Token ID290
Feature activation+2.68e-03

Pos logit contributions
rob+0.014
 pret+0.013
 Stri+0.013
 sy+0.012
 Roc+0.011
Neg logit contributions
Mom-0.010
Everyone-0.009
um-0.009
 year-0.009
ally-0.008
Token made
Token ID925
Feature activation-9.23e-03

Pos logit contributions
rob+0.004
 pret+0.004
 Stri+0.004
 sy+0.004
 scar+0.003
Neg logit contributions
Mom-0.005
Everyone-0.004
 year-0.004
um-0.004
ally-0.004
Token a
Token ID257
Feature activation-9.15e-03

Pos logit contributions
Mom+0.017
Everyone+0.016
 year+0.015
um+0.015
ally+0.014
Neg logit contributions
rob-0.014
 pret-0.013
 Stri-0.013
 sy-0.012
 Roc-0.011
Token y
Token ID331
Feature activation-1.85e-02

Pos logit contributions
Mom+0.014
Everyone+0.012
um+0.012
 year+0.012
ally+0.011
Neg logit contributions
rob-0.017
 pret-0.016
 Stri-0.016
 sy-0.015
 scar-0.014
Tokenummy
Token ID13513
Feature activation-4.45e-02

Pos logit contributions
Mom+0.014
Everyone+0.012
 year+0.011
um+0.010
ally+0.008
Neg logit contributions
rob-0.048
 pret-0.045
 Stri-0.045
 sy-0.043
 Roc-0.041
Token sandwich
Token ID20433
Feature activation-1.27e-02

Pos logit contributions
Mom+0.080
Everyone+0.073
um+0.070
 year+0.069
ally+0.066
Neg logit contributions
rob-0.071
 pret-0.066
 Stri-0.063
 sy-0.061
 scar-0.054
Token.
Token ID13
Feature activation-3.18e-04

Pos logit contributions
Mom+0.019
Everyone+0.017
um+0.016
 year+0.016
ally+0.015
Neg logit contributions
rob-0.025
 pret-0.023
 Stri-0.022
 sy-0.021
 Roc-0.020
Token
Token ID198
Feature activation-3.66e-03

Pos logit contributions
Mom+0.001
 year+0.000
har+0.000
 doorstep+0.000
ator+0.000
Neg logit contributions
 Stri-0.001
rob-0.000
 pret-0.000
 Fro-0.000
 Rib-0.000
Token
Token ID198
Feature activation+5.04e-04

Pos logit contributions
 doorstep+0.006
 mant+0.006
 rightful+0.006
har+0.006
ator+0.006
Neg logit contributions
 Stri-0.006
rob-0.005
 Fro-0.005
 Roc-0.005
 pret-0.004
TokenBut
Token ID1537
Feature activation-8.10e-03

Pos logit contributions
 Stri+0.001
rob+0.001
 Fro+0.001
ez+0.001
 pret+0.001
Neg logit contributions
 doorstep-0.001
 mant-0.001
ator-0.001
 rightful-0.001
um-0.001
Token surrender
Token ID16908
Feature activation-7.78e-04

Pos logit contributions
 pret+0.009
 Stri+0.009
 sy+0.008
 ro+0.008
 Fro+0.008
Neg logit contributions
Mom-0.015
 year-0.015
urious-0.015
Really-0.015
ally-0.015
Token and
Token ID290
Feature activation+2.40e-03

Pos logit contributions
Mom+0.001
Everyone+0.001
um+0.001
 year+0.001
ally+0.001
Neg logit contributions
rob-0.001
 pret-0.001
 Stri-0.001
 sy-0.001
 Roc-0.001
Token make
Token ID787
Feature activation-4.48e-03

Pos logit contributions
 pret+0.004
rob+0.003
 Stri+0.003
 sy+0.003
 ro+0.003
Neg logit contributions
Mom-0.004
 year-0.004
um-0.004
Everyone-0.004
ally-0.004
Token a
Token ID257
Feature activation-7.51e-03

Pos logit contributions
Mom+0.008
Everyone+0.007
 year+0.007
um+0.007
ally+0.007
Neg logit contributions
rob-0.007
 pret-0.007
 Stri-0.007
 sy-0.006
 Roc-0.006
Token y
Token ID331
Feature activation-1.72e-02

Pos logit contributions
Mom+0.012
Everyone+0.011
um+0.011
ally+0.010
 year+0.010
Neg logit contributions
 pret-0.013
rob-0.012
 sy-0.012
 Stri-0.012
 scar-0.011
Tokenummy
Token ID13513
Feature activation-4.44e-02

Pos logit contributions
Mom+0.015
Everyone+0.013
 year+0.011
um+0.010
Daddy+0.009
Neg logit contributions
rob-0.043
 pret-0.040
 Stri-0.040
 sy-0.039
 Fro-0.037
Token snack
Token ID26906
Feature activation-1.13e-02

Pos logit contributions
Mom+0.080
Everyone+0.073
um+0.069
 year+0.067
ally+0.066
Neg logit contributions
rob-0.072
 pret-0.068
 sy-0.063
 Stri-0.062
 ro-0.056
Token instead
Token ID2427
Feature activation+2.56e-04

Pos logit contributions
Mom+0.018
Everyone+0.016
um+0.015
 year+0.015
ally+0.014
Neg logit contributions
rob-0.020
 pret-0.019
 Stri-0.018
 sy-0.018
 Roc-0.016
Token,"
Token ID553
Feature activation+7.71e-03

Pos logit contributions
rob+0.000
 Stri+0.000
 pret+0.000
 sy+0.000
 ro+0.000
Neg logit contributions
Mom-0.000
Everyone-0.000
har-0.000
ally-0.000
um-0.000
Token she
Token ID673
Feature activation+1.65e-02

Pos logit contributions
rob+0.010
 Stri+0.010
 pret+0.009
 Fro+0.008
 sy+0.008
Neg logit contributions
Mom-0.015
um-0.014
 doorstep-0.014
Everyone-0.014
 year-0.014
Token said
Token ID531
Feature activation+8.48e-03

Pos logit contributions
rob+0.025
 pret+0.023
 Stri+0.023
 sy+0.021
 Fro+0.019
Neg logit contributions
Mom-0.031
Everyone-0.028
 year-0.028
um-0.027
 doorstep-0.026
TokenYes
Token ID5297
Feature activation+1.35e-03

Pos logit contributions
rob+0.027
 pret+0.024
 Stri+0.024
 sy+0.023
 ze+0.022
Neg logit contributions
um-0.011
Mom-0.011
 year-0.011
 doorstep-0.011
Everyone-0.010
Token,
Token ID11
Feature activation+8.24e-03

Pos logit contributions
rob+0.002
 pret+0.002
 Stri+0.002
 sy+0.001
 Fro+0.001
Neg logit contributions
Mom-0.003
Everyone-0.003
 year-0.002
um-0.002
har-0.002
Token they
Token ID484
Feature activation+2.12e-02

Pos logit contributions
rob+0.014
 pret+0.013
 Stri+0.013
 sy+0.012
 Roc+0.011
Neg logit contributions
Mom-0.014
Everyone-0.013
um-0.013
 year-0.012
 doorstep-0.012
Token are
Token ID389
Feature activation+1.14e-02

Pos logit contributions
rob+0.029
 Stri+0.025
 pret+0.025
 sy+0.024
 Fro+0.020
Neg logit contributions
Mom-0.044
Everyone-0.042
um-0.040
 year-0.040
ally-0.038
Token ost
Token ID23619
Feature activation+3.92e-03

Pos logit contributions
rob+0.022
 pret+0.021
 Stri+0.020
 sy+0.019
 Roc+0.018
Neg logit contributions
Mom-0.017
Everyone-0.015
um-0.015
 year-0.014
ally-0.013
Tokenric
Token ID1173
Feature activation+4.51e-02

Pos logit contributions
rob+0.006
 pret+0.005
 Stri+0.005
 sy+0.005
 Roc+0.004
Neg logit contributions
Mom-0.008
um-0.007
Everyone-0.007
 year-0.007
ally-0.007
Tokenhes
Token ID956
Feature activation+2.02e-02

Pos logit contributions
rob+0.067
 pret+0.060
 Stri+0.059
 sy+0.055
 Roc+0.050
Neg logit contributions
Mom-0.086
Everyone-0.080
um-0.078
 year-0.077
ally-0.072
Token,"
Token ID553
Feature activation+1.17e-02

Pos logit contributions
rob+0.038
 pret+0.034
 Stri+0.034
 sy+0.032
 Roc+0.030
Neg logit contributions
Mom-0.031
Everyone-0.029
um-0.028
 year-0.028
ally-0.025
Token mom
Token ID1995
Feature activation+3.24e-03

Pos logit contributions
rob+0.019
 pret+0.017
 Stri+0.017
 sy+0.016
 Roc+0.015
Neg logit contributions
Mom-0.021
Everyone-0.019
um-0.019
 year-0.019
ally-0.017
Token said
Token ID531
Feature activation+1.12e-02

Pos logit contributions
rob+0.005
 Stri+0.004
 pret+0.004
 sy+0.004
 Roc+0.004
Neg logit contributions
Mom-0.006
Everyone-0.006
 year-0.006
um-0.006
 doorstep-0.005
Token.
Token ID13
Feature activation+1.97e-03

Pos logit contributions
rob+0.025
 pret+0.022
 Stri+0.022
 sy+0.021
 scar+0.019
Neg logit contributions
Mom-0.014
Everyone-0.013
um-0.013
 year-0.012
ally-0.011
Token "
Token ID366
Feature activation+3.91e-03

Pos logit contributions
Mom+0.009
Everyone+0.009
um+0.008
 year+0.008
Daddy+0.007
Neg logit contributions
rob-0.012
 pret-0.010
 Stri-0.010
 sy-0.010
 Fro-0.009
TokenBen
Token ID11696
Feature activation+1.50e-02

Pos logit contributions
rob+0.008
 pret+0.007
 Stri+0.007
 sy+0.007
 Roc+0.006
Neg logit contributions
Mom-0.005
um-0.005
Everyone-0.005
 year-0.004
 doorstep-0.004
Token"
Token ID1
Feature activation+1.21e-02

Pos logit contributions
rob+0.031
 pret+0.028
 Stri+0.028
 sy+0.027
 Roc+0.025
Neg logit contributions
Mom-0.020
Everyone-0.018
um-0.018
 year-0.017
ally-0.016
Token on
Token ID319
Feature activation+1.51e-02

Pos logit contributions
rob+0.022
 pret+0.020
 Stri+0.019
 sy+0.019
 Roc+0.017
Neg logit contributions
Mom-0.020
Everyone-0.018
um-0.017
 year-0.017
ally-0.016
Token the
Token ID262
Feature activation+2.28e-02

Pos logit contributions
rob+0.026
 pret+0.023
 Stri+0.022
 sy+0.022
 Fro+0.020
Neg logit contributions
Mom-0.026
Everyone-0.024
 year-0.023
um-0.023
Daddy-0.021
Token box
Token ID3091
Feature activation+3.63e-02

Pos logit contributions
rob+0.042
 pret+0.039
 Stri+0.038
 sy+0.037
 scar+0.034
Neg logit contributions
Mom-0.035
Everyone-0.033
um-0.031
 year-0.031
ally-0.029
Token with
Token ID351
Feature activation+3.62e-03

Pos logit contributions
rob+0.058
 Stri+0.054
 pret+0.054
 sy+0.051
 Fro+0.046
Neg logit contributions
Mom-0.062
Everyone-0.057
um-0.053
 year-0.052
har-0.050
Token a
Token ID257
Feature activation+1.65e-02

Pos logit contributions
rob+0.007
 pret+0.006
 Stri+0.005
 sy+0.005
ah+0.005
Neg logit contributions
Mom-0.006
Everyone-0.006
um-0.005
 year-0.005
 tidy-0.005
Token marker
Token ID18364
Feature activation+5.23e-03

Pos logit contributions
rob+0.027
 pret+0.026
 Stri+0.025
 sy+0.025
 scar+0.022
Neg logit contributions
Mom-0.028
Everyone-0.026
um-0.025
 year-0.025
ally-0.024
Token.
Token ID13
Feature activation+4.21e-03

Pos logit contributions
rob+0.011
 pret+0.010
 Stri+0.010
 sy+0.009
 Fro+0.009
Neg logit contributions
Mom-0.007
um-0.006
Everyone-0.006
 year-0.006
 doorstep-0.005
Token Then
Token ID3244
Feature activation-3.31e-03

Pos logit contributions
rob+0.007
 pret+0.007
 Stri+0.007
 sy+0.006
 Roc+0.006
Neg logit contributions
Mom-0.007
Everyone-0.006
 year-0.006
um-0.006
ally-0.006
Token
Token ID8151
Feature activation+6.79e-03

Pos logit contributions
Mom+0.008
Everyone+0.007
um+0.007
 year+0.007
ally+0.007
Neg logit contributions
rob-0.009
 pret-0.008
 Stri-0.008
 sy-0.007
 Roc-0.007
Tokens
Token ID82
Feature activation+1.33e-02

Pos logit contributions
rob+0.011
 Stri+0.010
 pret+0.010
 sy+0.009
 Fro+0.009
Neg logit contributions
Mom-0.012
Everyone-0.011
 year-0.010
um-0.010
VP-0.010
Token eyes
Token ID2951
Feature activation+1.53e-02

Pos logit contributions
 pret+0.023
 Stri+0.022
 sy+0.022
rob+0.022
 Fro+0.021
Neg logit contributions
Mom-0.021
Everyone-0.018
ally-0.017
um-0.017
VP-0.017
Token filled
Token ID5901
Feature activation-4.84e-03

Pos logit contributions
rob+0.028
 pret+0.025
 Stri+0.025
 sy+0.024
 Roc+0.022
Neg logit contributions
Mom-0.024
Everyone-0.022
 year-0.021
um-0.021
ally-0.019
Token with
Token ID351
Feature activation+4.25e-03

Pos logit contributions
Mom+0.008
Everyone+0.007
 doorstep+0.007
um+0.007
VP+0.007
Neg logit contributions
rob-0.008
 Stri-0.007
 pret-0.007
 Fro-0.007
 sy-0.006
Token tears
Token ID10953
Feature activation+3.86e-02

Pos logit contributions
rob+0.007
 pret+0.007
 Stri+0.007
 sy+0.007
 scar+0.006
Neg logit contributions
Mom-0.007
Everyone-0.006
um-0.006
 year-0.006
ally-0.006
Token.
Token ID13
Feature activation+2.26e-03

Pos logit contributions
rob+0.079
 pret+0.072
 Stri+0.071
 sy+0.067
 Roc+0.063
Neg logit contributions
Mom-0.054
Everyone-0.049
 year-0.047
um-0.047
ally-0.041
Token He
Token ID679
Feature activation+2.05e-03

Pos logit contributions
 Stri+0.003
rob+0.003
 pret+0.003
 sy+0.003
 Fro+0.003
Neg logit contributions
Mom-0.004
 year-0.004
Everyone-0.004
um-0.004
 doorstep-0.003
Token had
Token ID550
Feature activation-1.22e-03

Pos logit contributions
rob+0.004
 pret+0.004
 Stri+0.003
 sy+0.003
 ro+0.003
Neg logit contributions
Mom-0.003
Everyone-0.003
um-0.003
 year-0.003
ally-0.003
Token been
Token ID587
Feature activation+1.95e-03

Pos logit contributions
Mom+0.002
Everyone+0.002
 year+0.002
um+0.002
ally+0.002
Neg logit contributions
rob-0.002
 pret-0.002
 Stri-0.002
 sy-0.002
 Roc-0.002
Token tricked
Token ID42608
Feature activation+2.12e-02

Pos logit contributions
rob+0.003
 pret+0.003
 Stri+0.003
 sy+0.003
 Roc+0.003
Neg logit contributions
Mom-0.003
Everyone-0.003
 year-0.003
um-0.003
ally-0.003
Token it
Token ID340
Feature activation+5.18e-03

Pos logit contributions
Mom+0.009
Everyone+0.008
um+0.008
 year+0.008
ally+0.007
Neg logit contributions
rob-0.010
 pret-0.009
 Stri-0.009
 sy-0.008
 Roc-0.008
Token.
Token ID13
Feature activation-4.89e-03

Pos logit contributions
rob+0.012
 Stri+0.011
 pret+0.011
 sy+0.010
 Roc+0.009
Neg logit contributions
Mom-0.006
Everyone-0.005
 year-0.005
um-0.005
ally-0.005
Token He
Token ID679
Feature activation-1.30e-03

Pos logit contributions
Mom+0.008
Everyone+0.007
 year+0.007
um+0.007
ally+0.006
Neg logit contributions
rob-0.009
 Stri-0.008
 pret-0.008
 sy-0.008
 Roc-0.007
Token w
Token ID266
Feature activation-3.48e-03

Pos logit contributions
Mom+0.002
Everyone+0.002
um+0.002
 year+0.002
ally+0.002
Neg logit contributions
rob-0.002
 pret-0.002
 Stri-0.002
 sy-0.002
 Roc-0.002
Tokenagged
Token ID14655
Feature activation+2.09e-03

Pos logit contributions
Mom+0.006
Everyone+0.005
 year+0.005
um+0.005
 doorstep+0.005
Neg logit contributions
rob-0.006
 pret-0.006
 Stri-0.006
 sy-0.005
 Fro-0.005
Token his
Token ID465
Feature activation+5.27e-02

Pos logit contributions
rob+0.003
 pret+0.002
 Stri+0.002
 sy+0.002
 Roc+0.002
Neg logit contributions
Mom-0.004
Everyone-0.004
um-0.004
 year-0.004
ally-0.004
Token tail
Token ID7894
Feature activation+1.59e-02

Pos logit contributions
rob+0.080
 pret+0.072
 Stri+0.071
 sy+0.067
 Roc+0.060
Neg logit contributions
Mom-0.100
Everyone-0.093
um-0.090
 year-0.090
ally-0.084
Token and
Token ID290
Feature activation+1.06e-02

Pos logit contributions
rob+0.030
 pret+0.027
 Stri+0.027
 sy+0.025
 Roc+0.023
Neg logit contributions
Mom-0.025
Everyone-0.022
 year-0.022
um-0.022
 doorstep-0.019
Token bark
Token ID21405
Feature activation+4.48e-03

Pos logit contributions
rob+0.017
 sy+0.014
 Stri+0.014
 pret+0.014
robat+0.013
Neg logit contributions
Mom-0.018
Everyone-0.016
 year-0.015
um-0.015
 Mom-0.015
Tokened
Token ID276
Feature activation+5.57e-03

Pos logit contributions
rob+0.009
 Stri+0.008
 pret+0.007
 sy+0.007
 scar+0.007
Neg logit contributions
Mom-0.007
Everyone-0.007
um-0.007
 year-0.006
 Everyone-0.006
Token.
Token ID13
Feature activation-3.68e-03

Pos logit contributions
rob+0.010
 pret+0.010
 Stri+0.009
 sy+0.009
 Roc+0.008
Neg logit contributions
Mom-0.009
Everyone-0.008
um-0.007
 year-0.007
ally-0.007
Token
Token ID198
Feature activation-1.77e-03

Pos logit contributions
 doorstep+0.011
 rightful+0.010
ator+0.010
ally+0.010
um+0.010
Neg logit contributions
 Stri-0.008
rob-0.008
 pret-0.007
 Roc-0.007
 Fro-0.006
TokenThey
Token ID2990
Feature activation+9.05e-03

Pos logit contributions
 doorstep+0.003
ator+0.003
 rightful+0.003
um+0.003
ally+0.003
Neg logit contributions
rob-0.003
 Stri-0.003
 pret-0.002
 Roc-0.002
 sy-0.002
Token sit
Token ID1650
Feature activation+1.04e-02

Pos logit contributions
 pret+0.012
rob+0.012
 Stri+0.011
 sy+0.010
 scar+0.010
Neg logit contributions
Mom-0.018
um-0.017
Everyone-0.016
 year-0.016
ally-0.016
Token on
Token ID319
Feature activation+1.14e-02

Pos logit contributions
rob+0.019
 pret+0.017
 Stri+0.017
 sy+0.016
 Fro+0.014
Neg logit contributions
Mom-0.018
um-0.015
Everyone-0.015
 year-0.015
har-0.014
Token the
Token ID262
Feature activation+1.27e-02

Pos logit contributions
rob+0.018
 pret+0.016
 Stri+0.016
 sy+0.015
 Roc+0.014
Neg logit contributions
Mom-0.021
Everyone-0.019
um-0.018
 year-0.018
ally-0.017
Token grass
Token ID8701
Feature activation+3.59e-02

Pos logit contributions
rob+0.026
 pret+0.024
 Stri+0.023
 sy+0.022
 Roc+0.021
Neg logit contributions
Mom-0.018
Everyone-0.016
um-0.015
 year-0.015
 doorstep-0.014
Token.
Token ID13
Feature activation+1.33e-03

Pos logit contributions
rob+0.080
 pret+0.073
 Stri+0.071
 sy+0.070
 Roc+0.064
Neg logit contributions
Mom-0.046
 year-0.041
um-0.040
Everyone-0.037
 doorstep-0.034
Token They
Token ID1119
Feature activation+1.10e-02

Pos logit contributions
rob+0.002
 pret+0.002
 Stri+0.002
 sy+0.002
 Roc+0.002
Neg logit contributions
Mom-0.002
Everyone-0.002
 year-0.002
um-0.002
ally-0.002
Token talk
Token ID1561
Feature activation-1.95e-03

Pos logit contributions
rob+0.018
 pret+0.016
 Stri+0.016
 sy+0.015
 Roc+0.014
Neg logit contributions
Mom-0.019
Everyone-0.018
um-0.017
 year-0.017
ally-0.016
Token and
Token ID290
Feature activation+1.18e-02

Pos logit contributions
Mom+0.003
Everyone+0.003
 year+0.003
um+0.003
 Everyone+0.002
Neg logit contributions
rob-0.004
 Stri-0.003
 pret-0.003
 sy-0.003
 Fro-0.003
Token share
Token ID2648
Feature activation-9.87e-03

Pos logit contributions
rob+0.021
 Stri+0.019
 pret+0.019
 sy+0.018
 Roc+0.016
Neg logit contributions
Mom-0.020
Everyone-0.018
 year-0.017
um-0.017
 doorstep-0.016
Token Dad
Token ID17415
Feature activation-6.85e-03

Pos logit contributions
 Stri+0.005
 pret+0.005
rob+0.005
 Fro+0.004
 sy+0.004
Neg logit contributions
Mom-0.004
Everyone-0.004
har-0.004
ator-0.004
 year-0.004
Token said
Token ID531
Feature activation+5.26e-03

Pos logit contributions
Mom+0.012
Everyone+0.012
 year+0.011
um+0.011
 doorstep+0.010
Neg logit contributions
rob-0.011
 pret-0.010
 Stri-0.010
 sy-0.009
 Fro-0.009
Token,
Token ID11
Feature activation+4.71e-03

Pos logit contributions
rob+0.007
 pret+0.007
 Stri+0.006
 sy+0.006
 Fro+0.005
Neg logit contributions
Mom-0.010
Everyone-0.010
um-0.009
 year-0.009
 doorstep-0.009
Token "
Token ID366
Feature activation+5.61e-04

Pos logit contributions
 pret+0.007
rob+0.007
 Stri+0.006
 sy+0.006
 ro+0.006
Neg logit contributions
Mom-0.008
Everyone-0.008
um-0.008
 year-0.008
 doorstep-0.008
TokenLet
Token ID5756
Feature activation+5.60e-03

Pos logit contributions
rob+0.001
 Stri+0.001
 pret+0.001
 sy+0.001
pill+0.001
Neg logit contributions
um-0.001
 doorstep-0.001
Mom-0.001
 year-0.001
ator-0.001
Token's
Token ID338
Feature activation+8.66e-03

Pos logit contributions
rob+0.007
 pret+0.007
 Stri+0.006
 sy+0.006
 ro+0.006
Neg logit contributions
Mom-0.011
Everyone-0.010
har-0.010
 year-0.010
ator-0.010
Token record
Token ID1700
Feature activation-1.94e-03

Pos logit contributions
 pret+0.011
 Stri+0.010
 ro+0.009
 sy+0.009
 Roc+0.008
Neg logit contributions
Mom-0.016
 year-0.016
Everyone-0.016
um-0.015
Okay-0.015
Token it
Token ID340
Feature activation+7.52e-03

Pos logit contributions
Mom+0.004
Everyone+0.003
um+0.003
 year+0.003
ally+0.003
Neg logit contributions
rob-0.003
 pret-0.003
 Stri-0.003
 sy-0.003
 scar-0.002
Token!"
Token ID2474
Feature activation-2.79e-03

Pos logit contributions
rob+0.015
 pret+0.014
 Stri+0.014
 sy+0.013
 Fro+0.012
Neg logit contributions
Mom-0.010
Everyone-0.010
um-0.009
 year-0.009
 doorstep-0.008
Token
Token ID198
Feature activation-2.34e-03

Pos logit contributions
Mom+0.004
Everyone+0.004
um+0.004
 year+0.004
ally+0.003
Neg logit contributions
rob-0.005
 pret-0.005
 Stri-0.005
 sy-0.004
 Roc-0.004
Token
Token ID198
Feature activation+1.34e-03

Pos logit contributions
Mom+0.003
 doorstep+0.003
VP+0.003
 rightful+0.003
 year+0.003
Neg logit contributions
 Stri-0.004
rob-0.004
 pret-0.004
 Fro-0.004
 Roc-0.003
Token play
Token ID711
Feature activation-8.86e-03

Pos logit contributions
Mom+0.014
urious+0.014
Everyone+0.013
 year+0.013
um+0.013
Neg logit contributions
 pret-0.010
 Stri-0.010
 sy-0.010
 Fro-0.009
rob-0.009
Token with
Token ID351
Feature activation-1.15e-02

Pos logit contributions
Mom+0.011
Everyone+0.009
um+0.009
 year+0.009
ally+0.008
Neg logit contributions
rob-0.020
 pret-0.019
 Stri-0.018
 sy-0.017
 Fro-0.016
Token it
Token ID340
Feature activation+2.38e-03

Pos logit contributions
Mom+0.019
Everyone+0.017
 year+0.016
um+0.016
ally+0.015
Neg logit contributions
rob-0.021
 pret-0.019
 Stri-0.018
 sy-0.017
 Roc-0.016
Token.
Token ID13
Feature activation-2.99e-03

Pos logit contributions
rob+0.005
 pret+0.005
 Stri+0.004
 sy+0.004
 Roc+0.004
Neg logit contributions
Mom-0.003
Everyone-0.003
 year-0.003
um-0.003
ally-0.002
Token The
Token ID383
Feature activation-9.92e-03

Pos logit contributions
Mom+0.005
Everyone+0.004
 year+0.004
um+0.004
 doorstep+0.004
Neg logit contributions
rob-0.005
 Stri-0.005
 pret-0.005
 sy-0.005
 Fro-0.004
Token girl
Token ID2576
Feature activation+8.90e-03

Pos logit contributions
Everyone+0.014
har+0.013
Mom+0.013
ator+0.013
um+0.013
Neg logit contributions
rob-0.017
 pret-0.017
 sy-0.016
 Stri-0.015
 scar-0.015
Token seemed
Token ID3947
Feature activation-2.69e-04

Pos logit contributions
rob+0.014
 pret+0.014
 Stri+0.013
 sy+0.012
 Fro+0.012
Neg logit contributions
Mom-0.015
Everyone-0.014
um-0.013
 year-0.013
ator-0.013
Token spoiled
Token ID34781
Feature activation+8.28e-03

Pos logit contributions
Mom+0.000
Everyone+0.000
 year+0.000
um+0.000
 tidy+0.000
Neg logit contributions
rob-0.001
 Stri-0.000
 pret-0.000
 sy-0.000
 Roc-0.000
Token and
Token ID290
Feature activation+6.95e-03

Pos logit contributions
rob+0.016
 pret+0.014
 Stri+0.014
 sy+0.014
 Roc+0.013
Neg logit contributions
Mom-0.012
 year-0.011
um-0.011
Everyone-0.011
ally-0.010
Token told
Token ID1297
Feature activation+2.39e-03

Pos logit contributions
rob+0.013
 pret+0.011
 Stri+0.011
 sy+0.011
 Roc+0.010
Neg logit contributions
Mom-0.011
Everyone-0.010
um-0.010
 year-0.010
ally-0.009
Token Johnny
Token ID15470
Feature activation-1.21e-02

Pos logit contributions
rob+0.004
 pret+0.003
 sy+0.003
 Stri+0.003
 Fro+0.003
Neg logit contributions
Mom-0.005
Everyone-0.004
um-0.004
 year-0.004
ally-0.004
Token "
Token ID366
Feature activation+5.63e-03

Pos logit contributions
rob+0.021
 pret+0.018
 sy+0.017
 Stri+0.017
robat+0.016
Neg logit contributions
Mom-0.011
Everyone-0.009
 year-0.009
um-0.008
 Everyone-0.008
TokenI
Token ID40
Feature activation+7.31e-03

Pos logit contributions
rob+0.013
 pret+0.012
 Stri+0.011
 sy+0.011
ah+0.010
Neg logit contributions
um-0.006
 doorstep-0.005
 year-0.005
Everyone-0.005
ally-0.005
Token want
Token ID765
Feature activation-5.85e-03

Pos logit contributions
rob+0.012
 pret+0.011
 Stri+0.011
 sy+0.010
 Roc+0.009
Neg logit contributions
Mom-0.013
Everyone-0.012
um-0.011
 year-0.011
ally-0.010
Token to
Token ID284
Feature activation+1.13e-02

Pos logit contributions
Mom+0.011
Everyone+0.011
um+0.010
 year+0.010
ally+0.010
Neg logit contributions
rob-0.008
 pret-0.008
 Stri-0.008
 sy-0.007
 Fro-0.006
Token change
Token ID1487
Feature activation+8.26e-03

Pos logit contributions
 pret+0.013
 Stri+0.013
 Fro+0.012
 sy+0.011
 ro+0.011
Neg logit contributions
Mom-0.022
urious-0.021
Really-0.020
 year-0.020
um-0.020
Token my
Token ID616
Feature activation+1.06e-02

Pos logit contributions
 pret+0.010
 Stri+0.010
rob+0.010
 scar+0.009
 sy+0.008
Neg logit contributions
Mom-0.017
Everyone-0.016
 year-0.015
um-0.015
ally-0.015
Token home
Token ID1363
Feature activation+1.44e-03

Pos logit contributions
 pret+0.016
rob+0.014
 ro+0.014
 scar+0.013
 Fro+0.013
Neg logit contributions
Mom-0.020
Everyone-0.019
Lu-0.018
um-0.017
Daddy-0.017
Token."
Token ID526
Feature activation+6.56e-03

Pos logit contributions
rob+0.003
 pret+0.002
 Stri+0.002
 sy+0.002
 asleep+0.002
Neg logit contributions
Mom-0.002
Everyone-0.002
um-0.002
 year-0.002
ally-0.002
Token So
Token ID1406
Feature activation-1.03e-02

Pos logit contributions
rob+0.011
 Stri+0.011
 pret+0.011
 sy+0.010
 Roc+0.010
Neg logit contributions
Mom-0.010
Everyone-0.009
um-0.009
 year-0.009
ally-0.009
Token,
Token ID11
Feature activation+1.72e-04

Pos logit contributions
Mom+0.021
Everyone+0.019
um+0.019
 year+0.018
 doorstep+0.018
Neg logit contributions
rob-0.014
 pret-0.013
 Stri-0.013
 sy-0.011
 Roc-0.011
Token the
Token ID262
Feature activation-2.17e-04

Pos logit contributions
rob+0.000
 pret+0.000
 sy+0.000
 Stri+0.000
 scar+0.000
Neg logit contributions
Mom-0.000
Everyone-0.000
 year-0.000
um-0.000
 Everyone-0.000
Token a
Token ID257
Feature activation-1.23e-03

Pos logit contributions
rob+0.004
 pret+0.004
 Stri+0.004
 sy+0.003
 Roc+0.003
Neg logit contributions
Mom-0.004
Everyone-0.003
um-0.003
 year-0.003
ally-0.003
Token veterinarian
Token ID41404
Feature activation-1.09e-02

Pos logit contributions
Mom+0.002
um+0.002
Everyone+0.002
ally+0.002
urious+0.002
Neg logit contributions
 pret-0.002
rob-0.002
 Stri-0.002
 sy-0.002
 Roc-0.002
Token when
Token ID618
Feature activation-7.06e-03

Pos logit contributions
Mom+0.014
 year+0.012
Everyone+0.012
um+0.011
 doorstep+0.010
Neg logit contributions
rob-0.024
 pret-0.021
 Stri-0.021
 sy-0.020
 Roc-0.019
Token she
Token ID673
Feature activation-1.17e-04

Pos logit contributions
Mom+0.015
Everyone+0.014
um+0.014
 year+0.013
har+0.013
Neg logit contributions
rob-0.008
 Stri-0.008
 pret-0.008
 sy-0.007
 Roc-0.006
Token grew
Token ID6348
Feature activation-1.13e-03

Pos logit contributions
Mom+0.000
Everyone+0.000
um+0.000
 year+0.000
ally+0.000
Neg logit contributions
rob-0.000
 pret-0.000
 Stri-0.000
 sy-0.000
 Roc-0.000
Token up
Token ID510
Feature activation+1.11e-02

Pos logit contributions
Mom+0.002
Everyone+0.002
um+0.001
 year+0.001
ally+0.001
Neg logit contributions
rob-0.002
 pret-0.002
 Stri-0.002
 sy-0.002
 Roc-0.002
Token.
Token ID13
Feature activation+1.07e-03

Pos logit contributions
rob+0.018
 pret+0.018
 Stri+0.018
 sy+0.017
 Roc+0.015
Neg logit contributions
Mom-0.018
Everyone-0.017
um-0.015
ally-0.015
 doorstep-0.015
Token She
Token ID1375
Feature activation-2.66e-03

Pos logit contributions
rob+0.002
 pret+0.002
 Stri+0.002
 sy+0.001
 Roc+0.001
Neg logit contributions
Mom-0.002
Everyone-0.002
 year-0.002
um-0.002
har-0.002
Token liked
Token ID8288
Feature activation-1.15e-02

Pos logit contributions
Mom+0.004
Everyone+0.004
um+0.004
 year+0.003
ator+0.003
Neg logit contributions
rob-0.005
 pret-0.005
 Stri-0.005
 sy-0.004
 scar-0.004
Token to
Token ID284
Feature activation+8.06e-03

Pos logit contributions
Mom+0.017
 year+0.016
Everyone+0.015
um+0.014
 tidy+0.014
Neg logit contributions
rob-0.023
 pret-0.019
 Stri-0.019
 sy-0.018
ah-0.017
Token read
Token ID1100
Feature activation+1.88e-03

Pos logit contributions
rob+0.013
 pret+0.013
 Stri+0.012
 sy+0.012
 Roc+0.011
Neg logit contributions
Mom-0.014
Everyone-0.012
um-0.012
 year-0.012
ally-0.012
Token used
Token ID973
Feature activation-3.46e-03

Pos logit contributions
Mom+0.001
Everyone+0.000
um+0.000
 year+0.000
ally+0.000
Neg logit contributions
rob-0.001
 pret-0.001
 Stri-0.001
 sy-0.001
 Roc-0.001
Token cookie
Token ID19751
Feature activation+1.31e-03

Pos logit contributions
Mom+0.005
Everyone+0.005
um+0.005
 year+0.005
 tidy+0.004
Neg logit contributions
rob-0.007
 pret-0.006
 Stri-0.006
 sy-0.005
 Roc-0.005
Token cut
Token ID2005
Feature activation+1.40e-02

Pos logit contributions
rob+0.003
 pret+0.002
 Stri+0.002
 sy+0.002
 Roc+0.002
Neg logit contributions
Mom-0.002
um-0.002
Everyone-0.002
 year-0.002
 nicely-0.001
Tokenters
Token ID1010
Feature activation+8.35e-03

Pos logit contributions
rob+0.027
 pret+0.024
 Stri+0.023
 sy+0.022
 Fro+0.021
Neg logit contributions
Mom-0.022
um-0.020
Everyone-0.020
 year-0.019
 Everyone-0.017
Token to
Token ID284
Feature activation+1.01e-02

Pos logit contributions
rob+0.018
 Stri+0.015
 pret+0.015
 sy+0.014
 Roc+0.013
Neg logit contributions
Mom-0.012
Everyone-0.011
um-0.011
 year-0.010
 tidy-0.009
Token cut
Token ID2005
Feature activation+1.13e-02

Pos logit contributions
rob+0.017
 pret+0.016
 Stri+0.016
 sy+0.015
 Roc+0.014
Neg logit contributions
Mom-0.017
Everyone-0.015
um-0.015
 year-0.015
ally-0.014
Token out
Token ID503
Feature activation+1.34e-02

Pos logit contributions
rob+0.021
 pret+0.019
 Stri+0.018
 sy+0.018
 Fro+0.016
Neg logit contributions
Mom-0.018
Everyone-0.016
um-0.016
 year-0.015
 Everyone-0.014
Token different
Token ID1180
Feature activation+1.13e-02

Pos logit contributions
rob+0.023
 pret+0.021
 Stri+0.021
 sy+0.020
 Roc+0.018
Neg logit contributions
Mom-0.022
Everyone-0.020
um-0.019
 year-0.019
ally-0.018
Token shapes
Token ID15268
Feature activation+3.07e-03

Pos logit contributions
rob+0.024
 pret+0.022
 Stri+0.022
 sy+0.021
 Roc+0.020
Neg logit contributions
Mom-0.014
Everyone-0.013
um-0.012
 year-0.012
ally-0.011
Token and
Token ID290
Feature activation+4.16e-03

Pos logit contributions
rob+0.007
 Stri+0.006
 pret+0.006
 sy+0.006
 Roc+0.005
Neg logit contributions
Mom-0.004
um-0.004
Everyone-0.004
 year-0.003
ally-0.003
Token put
Token ID1234
Feature activation-1.17e-02

Pos logit contributions
rob+0.008
 pret+0.007
 Stri+0.007
 sy+0.007
 Roc+0.006
Neg logit contributions
Mom-0.007
Everyone-0.006
um-0.006
 year-0.006
ally-0.005
Token!
Token ID0
Feature activation-4.35e-04

Pos logit contributions
Mom+0.002
Everyone+0.002
 year+0.002
um+0.002
ally+0.001
Neg logit contributions
rob-0.002
 pret-0.002
 Stri-0.002
 sy-0.002
 Roc-0.002
Token It
Token ID632
Feature activation-5.02e-05

Pos logit contributions
Mom+0.001
Everyone+0.001
 year+0.001
um+0.001
ator+0.001
Neg logit contributions
 Stri-0.001
rob-0.001
 pret-0.001
 sy-0.001
 Fro-0.001
Token hopped
Token ID45230
Feature activation+7.70e-03

Pos logit contributions
Mom+0.000
Everyone+0.000
 year+0.000
um+0.000
ally+0.000
Neg logit contributions
rob-0.000
 pret-0.000
 Stri-0.000
 sy-0.000
 Roc-0.000
Token and
Token ID290
Feature activation+1.31e-02

Pos logit contributions
rob+0.014
 pret+0.013
 Stri+0.013
 sy+0.012
 Roc+0.011
Neg logit contributions
Mom-0.012
Everyone-0.011
um-0.010
 year-0.010
ally-0.009
Token hopped
Token ID45230
Feature activation+1.72e-03

Pos logit contributions
rob+0.024
 sy+0.023
 pret+0.022
 Stri+0.021
bread+0.018
Neg logit contributions
Mom-0.020
Everyone-0.018
 year-0.017
um-0.016
har-0.015
Token until
Token ID1566
Feature activation+3.80e-03

Pos logit contributions
rob+0.003
 pret+0.003
 Stri+0.003
 sy+0.002
 Fro+0.002
Neg logit contributions
Mom-0.003
Everyone-0.003
um-0.003
 year-0.003
ally-0.003
Token it
Token ID340
Feature activation-3.68e-03

Pos logit contributions
rob+0.008
 pret+0.007
 Stri+0.007
 sy+0.007
 scar+0.006
Neg logit contributions
Mom-0.005
Everyone-0.005
um-0.005
 year-0.005
ally-0.004
Token reached
Token ID4251
Feature activation+9.35e-03

Pos logit contributions
Mom+0.005
Everyone+0.005
 year+0.004
um+0.004
ally+0.004
Neg logit contributions
rob-0.008
 pret-0.007
 sy-0.007
 Stri-0.007
 scar-0.006
Token the
Token ID262
Feature activation-6.19e-03

Pos logit contributions
rob+0.020
 sy+0.017
 pret+0.017
 Stri+0.016
robat+0.016
Neg logit contributions
Mom-0.013
Everyone-0.012
um-0.012
 year-0.011
ally-0.010
Token waterfall
Token ID42467
Feature activation+8.14e-03

Pos logit contributions
Mom+0.008
Everyone+0.007
um+0.007
 year+0.007
ally+0.007
Neg logit contributions
rob-0.013
 pret-0.012
 Stri-0.012
 sy-0.011
 ro-0.011
Token.
Token ID13
Feature activation+2.17e-04

Pos logit contributions
rob+0.017
 pret+0.015
 Stri+0.015
 sy+0.015
 Roc+0.013
Neg logit contributions
Mom-0.011
Everyone-0.010
um-0.010
 year-0.009
ally-0.009
Token,
Token ID11
Feature activation-5.03e-03

Pos logit contributions
rob+0.017
 pret+0.017
 Stri+0.016
 sy+0.015
 Fro+0.014
Neg logit contributions
Mom-0.017
Everyone-0.016
um-0.015
 year-0.015
ator-0.015
Token I
Token ID314
Feature activation-8.33e-03

Pos logit contributions
Mom+0.009
um+0.009
Everyone+0.009
urious+0.009
 year+0.009
Neg logit contributions
 pret-0.007
 Stri-0.007
rob-0.007
 sy-0.006
 Fro-0.006
Token was
Token ID373
Feature activation+2.26e-03

Pos logit contributions
Mom+0.016
Everyone+0.015
um+0.015
ator+0.015
 doorstep+0.014
Neg logit contributions
 pret-0.011
rob-0.011
 Stri-0.010
 sy-0.010
 Fro-0.009
Token right
Token ID826
Feature activation+4.37e-04

Pos logit contributions
rob+0.004
 pret+0.003
 Stri+0.003
 sy+0.003
 Fro+0.003
Neg logit contributions
Mom-0.004
Everyone-0.004
um-0.004
 year-0.004
ally-0.003
Token!
Token ID0
Feature activation-8.43e-03

Pos logit contributions
rob+0.001
 Stri+0.001
 pret+0.001
 asleep+0.001
 enough+0.001
Neg logit contributions
Everyone-0.001
Mom-0.001
ally-0.001
 year-0.001
um-0.001
Token There
Token ID1318
Feature activation+3.07e-03

Pos logit contributions
um+0.014
 year+0.014
Mom+0.014
ator+0.014
 doorstep+0.014
Neg logit contributions
 Stri-0.013
 pret-0.012
rob-0.012
 sy-0.011
 Fro-0.011
Token are
Token ID389
Feature activation+6.23e-03

Pos logit contributions
rob+0.005
 pret+0.005
 Stri+0.005
 sy+0.005
 Fro+0.004
Neg logit contributions
Mom-0.005
Everyone-0.005
um-0.004
 year-0.004
har-0.004
Token jewels
Token ID42701
Feature activation+4.80e-03

Pos logit contributions
rob+0.010
 pret+0.009
 Stri+0.009
 sy+0.008
 scar+0.008
Neg logit contributions
Mom-0.011
um-0.010
Everyone-0.010
 year-0.010
ally-0.010
Token and
Token ID290
Feature activation+3.41e-03

Pos logit contributions
rob+0.009
 pret+0.009
 Stri+0.009
 sy+0.008
 Roc+0.008
Neg logit contributions
Mom-0.007
 year-0.006
Everyone-0.006
um-0.006
 doorstep-0.006
Token coins
Token ID10796
Feature activation-7.63e-03

Pos logit contributions
rob+0.005
 pret+0.005
 Stri+0.004
 sy+0.004
 scar+0.004
Neg logit contributions
Mom-0.006
Everyone-0.006
um-0.006
 year-0.006
ally-0.006
Token and
Token ID290
Feature activation-4.59e-04

Pos logit contributions
Mom+0.011
Everyone+0.010
 year+0.010
um+0.010
ally+0.009
Neg logit contributions
rob-0.015
 pret-0.013
 Stri-0.013
 sy-0.013
 Roc-0.012
Token beautiful
Token ID4950
Feature activation-8.89e-03

Pos logit contributions
rob+0.011
 pret+0.011
 sy+0.010
 Stri+0.010
 scar+0.010
Neg logit contributions
Mom-0.009
Everyone-0.009
um-0.009
ally-0.008
 year-0.008
Token gem
Token ID16840
Feature activation+4.36e-03

Pos logit contributions
Mom+0.017
Everyone+0.016
um+0.015
 year+0.014
har+0.014
Neg logit contributions
rob-0.013
 pret-0.013
 sy-0.012
 Stri-0.011
 scar-0.010
Token that
Token ID326
Feature activation+6.60e-04

Pos logit contributions
rob+0.009
 pret+0.008
 Stri+0.008
 sy+0.008
 Roc+0.007
Neg logit contributions
Mom-0.006
Everyone-0.006
um-0.005
 year-0.005
ally-0.005
Token he
Token ID339
Feature activation-8.23e-03

Pos logit contributions
rob+0.001
 pret+0.001
 Stri+0.001
 sy+0.001
 Roc+0.001
Neg logit contributions
Mom-0.001
Everyone-0.001
um-0.001
 year-0.001
ally-0.001
Token wanted
Token ID2227
Feature activation-8.63e-03

Pos logit contributions
Mom+0.012
um+0.011
Everyone+0.011
 doorstep+0.010
ally+0.010
Neg logit contributions
rob-0.016
 pret-0.016
 Stri-0.015
 ro-0.014
 sy-0.014
Token to
Token ID284
Feature activation+6.98e-04

Pos logit contributions
Mom+0.018
Everyone+0.016
um+0.016
 year+0.015
ally+0.015
Neg logit contributions
rob-0.011
 pret-0.011
 Stri-0.010
 sy-0.010
 Roc-0.009
Token sell
Token ID3677
Feature activation-2.55e-03

Pos logit contributions
 pret+0.001
 Stri+0.001
rob+0.001
 sy+0.001
 Fro+0.001
Neg logit contributions
Mom-0.001
 year-0.001
um-0.001
urious-0.001
har-0.001
Token.
Token ID13
Feature activation+1.51e-03

Pos logit contributions
Mom+0.005
Everyone+0.005
ally+0.005
har+0.004
urious+0.004
Neg logit contributions
rob-0.003
 pret-0.003
 Stri-0.003
 enough-0.003
 sy-0.003
Token He
Token ID679
Feature activation-1.09e-03

Pos logit contributions
rob+0.002
 Stri+0.002
 pret+0.002
 Roc+0.002
 sy+0.002
Neg logit contributions
Mom-0.003
har-0.002
um-0.002
Everyone-0.002
 year-0.002
Token went
Token ID1816
Feature activation+3.69e-03

Pos logit contributions
Everyone+0.002
Mom+0.002
um+0.002
ally+0.002
har+0.002
Neg logit contributions
 pret-0.002
rob-0.002
 Stri-0.002
 ro-0.002
 sy-0.002
Token to
Token ID284
Feature activation+1.48e-04

Pos logit contributions
rob+0.007
 pret+0.006
 Stri+0.006
 sy+0.006
ez+0.006
Neg logit contributions
Mom-0.005
 year-0.005
Everyone-0.005
um-0.005
 tidy-0.004
Token.
Token ID13
Feature activation-2.16e-03

Pos logit contributions
rob+0.009
 pret+0.008
 Stri+0.008
 sy+0.008
 Fro+0.007
Neg logit contributions
Mom-0.008
Everyone-0.007
um-0.007
 year-0.007
ally-0.006
Token She
Token ID1375
Feature activation-3.71e-03

Pos logit contributions
 year+0.004
Mom+0.004
Everyone+0.003
ator+0.003
har+0.003
Neg logit contributions
 Stri-0.003
 pret-0.003
rob-0.003
 Fro-0.003
 sy-0.003
Token hugged
Token ID41130
Feature activation-3.25e-03

Pos logit contributions
Mom+0.006
Everyone+0.006
um+0.006
 year+0.006
ator+0.005
Neg logit contributions
 pret-0.006
rob-0.006
 Stri-0.006
 sy-0.005
 scar-0.005
Token her
Token ID607
Feature activation+2.20e-02

Pos logit contributions
Mom+0.007
Everyone+0.006
 year+0.006
um+0.006
 doorstep+0.006
Neg logit contributions
rob-0.004
 pret-0.004
 Stri-0.004
 sy-0.003
 Fro-0.003
Token bear
Token ID6842
Feature activation+2.05e-02

Pos logit contributions
rob+0.028
 pret+0.026
 Stri+0.025
 sy+0.024
 scar+0.021
Neg logit contributions
Mom-0.046
Everyone-0.044
um-0.042
 year-0.042
ally-0.040
Token tightly
Token ID17707
Feature activation+5.78e-03

Pos logit contributions
rob+0.035
 pret+0.033
 Stri+0.032
 sy+0.031
 scar+0.029
Neg logit contributions
Mom-0.033
Everyone-0.031
um-0.029
 year-0.028
Daddy-0.027
Token and
Token ID290
Feature activation+1.62e-03

Pos logit contributions
rob+0.010
 pret+0.009
 Stri+0.009
 sy+0.008
 Roc+0.007
Neg logit contributions
Mom-0.010
Everyone-0.009
um-0.009
 year-0.009
ally-0.008
Token said
Token ID531
Feature activation-2.90e-03

Pos logit contributions
rob+0.003
 pret+0.003
 Stri+0.003
 sy+0.003
 Roc+0.002
Neg logit contributions
Mom-0.002
Everyone-0.002
um-0.002
 year-0.002
ally-0.002
Token,
Token ID11
Feature activation-3.70e-04

Pos logit contributions
Mom+0.005
Everyone+0.005
um+0.005
 year+0.005
ally+0.004
Neg logit contributions
rob-0.005
 pret-0.004
 Stri-0.004
 sy-0.004
 Roc-0.003
Token "
Token ID366
Feature activation-3.41e-04

Pos logit contributions
Mom+0.001
Everyone+0.001
um+0.001
 year+0.000
ally+0.000
Neg logit contributions
rob-0.001
 pret-0.001
 Stri-0.001
 sy-0.001
 Roc-0.001
TokenI
Token ID40
Feature activation+4.51e-03

Pos logit contributions
 doorstep+0.000
 year+0.000
um+0.000
ator+0.000
 structure+0.000
Neg logit contributions
rob-0.001
 Stri-0.001
 sy-0.001
 pret-0.001
ah-0.001
Token.
Token ID13
Feature activation+1.34e-03

Pos logit contributions
rob+0.015
 pret+0.014
 Stri+0.014
 sy+0.013
 Roc+0.012
Neg logit contributions
Mom-0.013
Everyone-0.012
um-0.011
 year-0.011
 doorstep-0.010
Token I
Token ID314
Feature activation-4.31e-04

Pos logit contributions
 Stri+0.002
rob+0.002
 pret+0.002
 sy+0.002
 Roc+0.002
Neg logit contributions
Mom-0.002
 year-0.002
Everyone-0.002
um-0.002
 doorstep-0.002
Token'll
Token ID1183
Feature activation-1.03e-02

Pos logit contributions
Mom+0.001
um+0.001
Everyone+0.001
 year+0.001
Daddy+0.001
Neg logit contributions
rob-0.001
 pret-0.001
 Stri-0.001
 sy-0.001
 Fro-0.001
Token take
Token ID1011
Feature activation+6.95e-03

Pos logit contributions
 year+0.020
Mom+0.020
um+0.020
urious+0.020
Really+0.019
Neg logit contributions
 pret-0.011
 Stri-0.011
 sy-0.010
rob-0.010
 ro-0.009
Token care
Token ID1337
Feature activation+2.06e-03

Pos logit contributions
rob+0.014
 pret+0.013
 Stri+0.013
 sy+0.012
 Roc+0.011
Neg logit contributions
Mom-0.010
Everyone-0.009
um-0.009
 year-0.008
ally-0.008
Token of
Token ID286
Feature activation+1.56e-03

Pos logit contributions
rob+0.004
 Stri+0.003
 pret+0.003
 sy+0.003
 Roc+0.003
Neg logit contributions
Mom-0.003
Everyone-0.003
 year-0.003
um-0.003
ally-0.003
Token it
Token ID340
Feature activation+1.89e-02

Pos logit contributions
rob+0.003
 pret+0.003
 Stri+0.003
 sy+0.003
 Roc+0.003
Neg logit contributions
Mom-0.002
Everyone-0.002
um-0.002
 year-0.002
ally-0.002
Token."
Token ID526
Feature activation+1.20e-03

Pos logit contributions
rob+0.037
 pret+0.034
 Stri+0.033
 sy+0.032
 Roc+0.030
Neg logit contributions
Mom-0.027
Everyone-0.025
 year-0.024
um-0.024
ally-0.021
Token Her
Token ID2332
Feature activation-7.66e-03

Pos logit contributions
rob+0.002
 pret+0.002
 Stri+0.002
 sy+0.002
 Fro+0.002
Neg logit contributions
Mom-0.002
 year-0.002
Everyone-0.002
ator-0.002
um-0.002
Token mom
Token ID1995
Feature activation+1.39e-03

Pos logit contributions
Everyone+0.012
 year+0.011
ator+0.010
 Everyone+0.010
Twe+0.010
Neg logit contributions
 pret-0.014
 scar-0.013
 sy-0.012
 Fro-0.011
 Stri-0.011
Token went
Token ID1816
Feature activation+4.41e-03

Pos logit contributions
rob+0.002
 pret+0.002
 Stri+0.002
 sy+0.002
 Fro+0.002
Neg logit contributions
Mom-0.002
Everyone-0.002
 year-0.002
um-0.002
 doorstep-0.002
Token,
Token ID11
Feature activation+3.50e-03

Pos logit contributions
Mom+0.009
Everyone+0.008
 year+0.008
um+0.008
ally+0.007
Neg logit contributions
rob-0.010
 pret-0.009
 Stri-0.009
 sy-0.009
 Roc-0.008
Token and
Token ID290
Feature activation-3.37e-03

Pos logit contributions
rob+0.006
 pret+0.005
 Stri+0.005
 sy+0.005
 Fro+0.004
Neg logit contributions
Mom-0.006
Everyone-0.006
um-0.006
 year-0.005
ally-0.005
Token they
Token ID484
Feature activation-5.33e-03

Pos logit contributions
Mom+0.006
Everyone+0.006
um+0.006
 year+0.006
ally+0.005
Neg logit contributions
rob-0.005
 pret-0.005
 Stri-0.005
 sy-0.004
 Roc-0.004
Token were
Token ID547
Feature activation+2.71e-03

Pos logit contributions
Mom+0.008
Everyone+0.008
um+0.007
 year+0.007
har+0.007
Neg logit contributions
rob-0.010
 pret-0.009
 Stri-0.009
 sy-0.008
 Fro-0.008
Token also
Token ID635
Feature activation-6.29e-03

Pos logit contributions
rob+0.005
 pret+0.004
 Stri+0.004
 sy+0.004
 Fro+0.004
Neg logit contributions
Mom-0.005
Everyone-0.004
um-0.004
 year-0.004
ally-0.004
Token very
Token ID845
Feature activation-2.29e-02

Pos logit contributions
Mom+0.009
Everyone+0.008
um+0.008
 year+0.008
 doorstep+0.007
Neg logit contributions
rob-0.012
 pret-0.011
 Stri-0.011
 sy-0.010
 Fro-0.010
Token happy
Token ID3772
Feature activation+3.50e-03

Pos logit contributions
Mom+0.037
Everyone+0.034
 year+0.033
 doorstep+0.033
um+0.032
Neg logit contributions
rob-0.036
 pret-0.035
 Stri-0.035
 Fro-0.033
 sy-0.032
Token for
Token ID329
Feature activation+8.53e-03

Pos logit contributions
rob+0.005
 pret+0.005
 Stri+0.005
 asleep+0.004
 enough+0.004
Neg logit contributions
Mom-0.007
Everyone-0.006
um-0.006
 year-0.005
ally-0.005
Token him
Token ID683
Feature activation-1.49e-02

Pos logit contributions
rob+0.011
 pret+0.010
 Stri+0.010
 sy+0.009
 Fro+0.008
Neg logit contributions
Mom-0.018
Everyone-0.017
um-0.016
 year-0.016
 doorstep-0.015
Token.
Token ID13
Feature activation-8.43e-04

Pos logit contributions
Mom+0.021
Everyone+0.019
um+0.018
 year+0.018
ally+0.016
Neg logit contributions
rob-0.030
 pret-0.027
 Stri-0.027
 sy-0.026
 Roc-0.024
Token\n
Token ID198
Feature activation-1.85e-03

Pos logit contributions
Mom+0.001
har+0.001
 year+0.001
Everyone+0.001
ini+0.001
Neg logit contributions
rob-0.001
 Stri-0.001
 pret-0.001
 Fro-0.001
 Roc-0.001
Token outside
Token ID2354
Feature activation-7.58e-03

Pos logit contributions
Mom+0.013
Everyone+0.012
um+0.012
 year+0.012
ally+0.011
Neg logit contributions
rob-0.013
 pret-0.012
 Stri-0.011
 sy-0.011
 Roc-0.010
Token to
Token ID284
Feature activation-9.50e-03

Pos logit contributions
Mom+0.012
 year+0.011
Everyone+0.010
um+0.010
 doorstep+0.009
Neg logit contributions
rob-0.015
 pret-0.013
 Stri-0.013
 sy-0.012
 scar-0.011
Token the
Token ID262
Feature activation-1.08e-02

Pos logit contributions
Mom+0.015
um+0.014
Everyone+0.014
 year+0.013
ally+0.012
Neg logit contributions
rob-0.017
 pret-0.016
 Stri-0.016
 sy-0.015
 Fro-0.014
Token garden
Token ID11376
Feature activation-6.87e-03

Pos logit contributions
Mom+0.015
um+0.015
Everyone+0.015
 year+0.015
ally+0.014
Neg logit contributions
rob-0.019
 pret-0.018
 Stri-0.018
 sy-0.018
 asleep-0.017
Token to
Token ID284
Feature activation-1.18e-02

Pos logit contributions
Mom+0.010
Everyone+0.009
um+0.009
 year+0.009
ally+0.008
Neg logit contributions
rob-0.013
 pret-0.012
 Stri-0.012
 sy-0.012
 Fro-0.011
Token wait
Token ID4043
Feature activation-2.55e-02

Pos logit contributions
Mom+0.020
um+0.019
 year+0.018
Everyone+0.018
ally+0.016
Neg logit contributions
rob-0.019
 pret-0.019
 Stri-0.018
 sy-0.016
 Fro-0.016
Token for
Token ID329
Feature activation-3.12e-03

Pos logit contributions
Mom+0.047
Everyone+0.044
um+0.042
 year+0.042
ally+0.039
Neg logit contributions
rob-0.039
 pret-0.036
 Stri-0.035
 sy-0.033
 Roc-0.030
Token the
Token ID262
Feature activation-1.58e-02

Pos logit contributions
Mom+0.006
Everyone+0.005
um+0.005
 year+0.005
ally+0.004
Neg logit contributions
rob-0.005
 pret-0.005
 Stri-0.004
 sy-0.004
ah-0.004
Token puppy
Token ID26188
Feature activation-1.29e-02

Pos logit contributions
Mom+0.027
um+0.027
Everyone+0.027
urious+0.026
ally+0.026
Neg logit contributions
 sy-0.023
rob-0.023
 pret-0.021
 scar-0.021
 Stri-0.020
Token.
Token ID13
Feature activation-1.20e-03

Pos logit contributions
Mom+0.021
Everyone+0.019
 year+0.018
um+0.018
 doorstep+0.017
Neg logit contributions
rob-0.024
 pret-0.021
 Stri-0.021
 sy-0.020
 Roc-0.018
Token She
Token ID1375
Feature activation-2.30e-03

Pos logit contributions
Mom+0.002
Everyone+0.002
 year+0.002
um+0.002
ally+0.001
Neg logit contributions
rob-0.002
 Stri-0.002
 pret-0.002
 sy-0.002
 Roc-0.002
Token saw
Token ID2497
Feature activation-9.18e-03

Pos logit contributions
Mom+0.023
Everyone+0.021
um+0.020
 year+0.020
ally+0.019
Neg logit contributions
rob-0.028
 pret-0.026
 Stri-0.025
 sy-0.024
 Roc-0.022
Token a
Token ID257
Feature activation+3.06e-03

Pos logit contributions
Mom+0.017
Everyone+0.015
 year+0.014
um+0.014
Daddy+0.013
Neg logit contributions
rob-0.015
 pret-0.014
 Stri-0.013
 sy-0.013
 Roc-0.011
Token toy
Token ID13373
Feature activation+1.08e-03

Pos logit contributions
rob+0.005
 pret+0.005
 sy+0.005
 Stri+0.005
 scar+0.005
Neg logit contributions
Mom-0.005
Everyone-0.004
um-0.004
ally-0.004
urious-0.004
Token that
Token ID326
Feature activation-2.35e-04

Pos logit contributions
rob+0.002
 pret+0.002
 Stri+0.002
 sy+0.002
 Fro+0.002
Neg logit contributions
Mom-0.002
Everyone-0.001
 year-0.001
um-0.001
 doorstep-0.001
Token he
Token ID339
Feature activation-7.42e-03

Pos logit contributions
Mom+0.000
Everyone+0.000
um+0.000
 year+0.000
ally+0.000
Neg logit contributions
rob-0.000
 pret-0.000
 Stri-0.000
 sy-0.000
 Roc-0.000
Token really
Token ID1107
Feature activation-2.34e-02

Pos logit contributions
Mom+0.011
Everyone+0.011
um+0.010
ator+0.010
 doorstep+0.009
Neg logit contributions
 pret-0.013
 Stri-0.013
rob-0.013
 ro-0.012
 scar-0.011
Token wanted
Token ID2227
Feature activation-6.98e-03

Pos logit contributions
Mom+0.029
Everyone+0.026
um+0.025
 year+0.023
ator+0.023
Neg logit contributions
rob-0.050
 pret-0.048
 Stri-0.047
 sy-0.044
 Fro-0.042
Token.
Token ID13
Feature activation+2.20e-03

Pos logit contributions
Mom+0.012
Everyone+0.011
um+0.010
 year+0.010
ally+0.010
Neg logit contributions
rob-0.012
 pret-0.011
 Stri-0.011
 sy-0.010
 Roc-0.009
Token He
Token ID679
Feature activation-1.93e-03

Pos logit contributions
 Stri+0.002
 pret+0.002
 sy+0.002
rob+0.002
 Roc+0.002
Neg logit contributions
Mom-0.005
Everyone-0.005
 year-0.004
um-0.004
ator-0.004
Token asked
Token ID1965
Feature activation+1.21e-02

Pos logit contributions
Everyone+0.003
Mom+0.003
ator+0.003
um+0.003
 year+0.003
Neg logit contributions
 pret-0.003
 Stri-0.003
rob-0.003
 scar-0.003
 ro-0.003
Token his
Token ID465
Feature activation+1.06e-03

Pos logit contributions
rob+0.014
 pret+0.012
 Stri+0.012
 sy+0.011
 Roc+0.009
Neg logit contributions
Mom-0.028
Everyone-0.026
um-0.025
 year-0.025
ally-0.024
Token Lily
Token ID20037
Feature activation+5.46e-04

Pos logit contributions
rob+0.022
 pret+0.020
 Stri+0.020
 sy+0.019
 Roc+0.017
Neg logit contributions
Mom-0.023
Everyone-0.021
um-0.021
 year-0.020
ally-0.019
Token and
Token ID290
Feature activation+4.90e-03

Pos logit contributions
rob+0.001
 pret+0.001
 Stri+0.001
 sy+0.001
 Fro+0.001
Neg logit contributions
Mom-0.001
um-0.001
Everyone-0.001
 year-0.001
ally-0.001
Token Tim
Token ID5045
Feature activation-7.53e-03

Pos logit contributions
rob+0.010
 pret+0.009
 Stri+0.009
 sy+0.009
 Roc+0.008
Neg logit contributions
Mom-0.007
Everyone-0.006
um-0.006
 year-0.006
ally-0.005
Tokenmy
Token ID1820
Feature activation-1.46e-02

Pos logit contributions
Mom+0.009
um+0.007
ally+0.007
 year+0.007
Everyone+0.007
Neg logit contributions
rob-0.018
 pret-0.017
 sy-0.016
 Stri-0.016
 Roc-0.015
Token went
Token ID1816
Feature activation-2.88e-03

Pos logit contributions
Mom+0.018
um+0.015
 year+0.015
Everyone+0.015
ally+0.013
Neg logit contributions
rob-0.034
 pret-0.030
 Stri-0.030
 sy-0.029
 Fro-0.028
Token back
Token ID736
Feature activation-2.38e-02

Pos logit contributions
Mom+0.003
 year+0.003
Everyone+0.003
um+0.003
 tidy+0.002
Neg logit contributions
rob-0.007
 pret-0.006
 Stri-0.006
 sy-0.006
pill-0.006
Token outside
Token ID2354
Feature activation-7.99e-03

Pos logit contributions
Mom+0.027
 year+0.024
 neighbourhood+0.023
 doorstep+0.022
 tidy+0.022
Neg logit contributions
rob-0.054
 Stri-0.048
 pret-0.048
 sy-0.047
ah-0.044
Token to
Token ID284
Feature activation-1.23e-02

Pos logit contributions
Mom+0.011
 year+0.011
Everyone+0.010
um+0.010
 doorstep+0.009
Neg logit contributions
rob-0.017
 pret-0.014
 Stri-0.014
 sy-0.014
 scar-0.013
Token gather
Token ID6431
Feature activation-1.07e-02

Pos logit contributions
Mom+0.016
Everyone+0.014
 year+0.013
um+0.013
 doorstep+0.012
Neg logit contributions
rob-0.027
 pret-0.024
 Stri-0.024
 sy-0.023
 scar-0.021
Token more
Token ID517
Feature activation-5.58e-03

Pos logit contributions
Mom+0.018
Everyone+0.017
um+0.016
 year+0.016
ally+0.015
Neg logit contributions
rob-0.018
 pret-0.017
 Stri-0.016
 sy-0.016
 Roc-0.014
Token wood
Token ID4898
Feature activation+1.45e-02

Pos logit contributions
Mom+0.009
Everyone+0.009
 year+0.009
um+0.008
 tidy+0.008
Neg logit contributions
rob-0.010
 pret-0.009
 Stri-0.008
 sy-0.008
 Fro-0.008
Token he
Token ID339
Feature activation-6.21e-03

Pos logit contributions
Mom+0.001
Everyone+0.001
um+0.001
 year+0.001
ally+0.001
Neg logit contributions
rob-0.001
 pret-0.000
 Stri-0.000
 sy-0.000
 Roc-0.000
Token had
Token ID550
Feature activation-7.91e-03

Pos logit contributions
Mom+0.010
Everyone+0.009
um+0.009
 year+0.008
VP+0.008
Neg logit contributions
rob-0.011
 pret-0.010
 Stri-0.010
 sy-0.009
 ro-0.009
Token to
Token ID284
Feature activation-3.09e-03

Pos logit contributions
Mom+0.014
Everyone+0.013
um+0.013
 year+0.012
ally+0.012
Neg logit contributions
rob-0.013
 pret-0.012
 Stri-0.012
 sy-0.011
 Roc-0.010
Token do
Token ID466
Feature activation-6.21e-03

Pos logit contributions
Mom+0.005
Everyone+0.005
um+0.005
ini+0.005
urious+0.005
Neg logit contributions
 pret-0.004
 Stri-0.004
rob-0.004
 sy-0.004
 enough-0.004
Token the
Token ID262
Feature activation-1.78e-02

Pos logit contributions
Mom+0.011
Everyone+0.010
um+0.010
 year+0.010
ally+0.009
Neg logit contributions
rob-0.010
 pret-0.009
 Stri-0.009
 sy-0.009
 Roc-0.008
Token job
Token ID1693
Feature activation-2.56e-02

Pos logit contributions
Mom+0.029
Everyone+0.026
um+0.025
 year+0.025
ally+0.024
Neg logit contributions
rob-0.032
 pret-0.030
 Stri-0.029
 sy-0.028
 Roc-0.025
Token properly
Token ID6105
Feature activation-3.99e-03

Pos logit contributions
Mom+0.041
Everyone+0.037
um+0.036
 year+0.035
ally+0.033
Neg logit contributions
rob-0.046
 pret-0.043
 Stri-0.041
 sy-0.040
 Roc-0.036
Token or
Token ID393
Feature activation-6.93e-03

Pos logit contributions
Mom+0.005
Everyone+0.004
 year+0.004
um+0.004
 doorstep+0.004
Neg logit contributions
rob-0.009
 pret-0.008
 Stri-0.008
 sy-0.008
 Roc-0.007
Token he
Token ID339
Feature activation-4.99e-03

Pos logit contributions
Mom+0.012
um+0.011
Everyone+0.011
 year+0.011
ally+0.011
Neg logit contributions
 pret-0.010
rob-0.010
 Stri-0.010
 sy-0.009
 Fro-0.009
Token would
Token ID561
Feature activation-1.11e-02

Pos logit contributions
Mom+0.008
Everyone+0.007
um+0.007
 year+0.006
ally+0.006
Neg logit contributions
rob-0.009
 pret-0.009
 Stri-0.008
 sy-0.008
 ro-0.007
Token get
Token ID651
Feature activation-1.15e-02

Pos logit contributions
Mom+0.018
um+0.016
Everyone+0.016
 year+0.016
ally+0.015
Neg logit contributions
 pret-0.018
rob-0.017
 Stri-0.017
 sy-0.016
 ro-0.015