TOP ACTIVATIONS
MAX = 0.070

wiped away his z ig zag and went on with his
and clean your boo - bo o . Then you can
squirrel stood up on its hind legs , grabbed the peanut
to carry . They picked out some l oll ip ops
He stood up on his hind legs and cl apped his
is that we should respect everyone ’ s belongings
ily and her mom picked out a pretty card with flowers
a rabbit standing on its hind legs and hopping around .
my cleaned her boo - bo o with a wet cloth
He stood up on his hind legs and waved his front
and her mom my picked out some books and sat down
him back and they all held each other tightly .
flags up on the flag pole . Look , you can
L ily and her mom picked out a pretty card with
and upset , John fet ched a hose and washed away
. The vet looked the owl over , but sadly told
They put Lily on a stret cher and carried her to
shared ." They picked up the pieces of the
something moving in the wild fl owers . It was a
He stood up on his hind legs and held the bone

MIN ACTIVATIONS
MIN = -0.056

p uddle and jumps in it . He gets wet and
Sue laughs and ch ases it . Tim laughs too .
uddle . He jumps in it . He spl ashes .
the swing and jumps on it . He holds the ropes
a chair and climbs on it . She reaches for the
He hit a ball and it fell to the ground .
a tall tower and named it " The Big Tower ".
bear . She shouts at it . " Go away ,
the swing and jumped on it . He pushed with his
the bird and jumps on it . But the bird is
threw the ball up and it came down . It fell
first . She jumps on it and says , " This
the bench and climbs on it too . He also reaches
the bird and jumps on it . But the bird is
a dough . He rolls it with a rolling pin .
, or toys , or books ," Max said .
catches the ball and throws it back to Lily . They
She pressed the whistle and it made a loud sound .
the piano and climbs on it . Then she reaches for
a doll , and a book . They were

INTERVAL 0.038 - 0.070
CONTAINS 0.031%

next time . Now , let us take the jar down
. He said , " Let 's splash the water and
should never be selfish . When we share with our friends
park every day . They picked peanuts and shared them with
and stay away from people who make us feel uncomfortable .

INTERVAL 0.007 - 0.038
CONTAINS 23.193%

she made it to the beach safe and sound . She
Ben 's car lying on the floor , next to a
happy home . <|endoftext|> Once upon a time , there was
's why E amon always tried to be patient and enjoy
good to help your friends when they need it . Shooting

INTERVAL -0.024 - 0.007
CONTAINS 75.649%

saw a pretty dress . It was long and made of
L ily , that is a good idea . We can
he praised them . He said , " You made a
lots of different things to eat - but nothing that Alice
Together they cooked the soft , white rice . The kingdom

INTERVAL -0.056 - -0.024
CONTAINS 1.127%

Will they be loud ?" Ben asks . "
more funny and sm elly pictures . And everyone loved the
the beach together . <|endoftext|> Once upon a time , there
" No , thank you ," the witch said ,
first , let me introduce you to my other friends ."
Token wiped
Token ID21122
Feature activation+1.50e-02

Pos logit contributions
 Mom+0.011
 Sara+0.011
 sometimes+0.011
ily+0.011
 pand+0.010
Neg logit contributions
lor-0.015
izer-0.015
 invention-0.015
zed-0.015
 unnecessary-0.015
Token away
Token ID1497
Feature activation+2.54e-02

Pos logit contributions
 unnecessary+0.031
lor+0.031
+0.031
izer+0.031
bek+0.030
Neg logit contributions
 Mom-0.015
 sometimes-0.015
 from-0.014
 L-0.014
 Sara-0.014
Token his
Token ID465
Feature activation+2.37e-02

Pos logit contributions
 unnecessary+0.055
izer+0.054
lor+0.053
aman+0.052
 invention+0.052
Neg logit contributions
 Mom-0.024
 sometimes-0.024
 from-0.022
 Sara-0.022
 L-0.021
Token z
Token ID1976
Feature activation+1.75e-02

Pos logit contributions
 unnecessary+0.043
lor+0.043
izer+0.042
aman+0.042
bek+0.042
Neg logit contributions
 Mom-0.029
 sometimes-0.028
 from-0.027
 L-0.026
 Sara-0.026
Tokenig
Token ID328
Feature activation+2.95e-02

Pos logit contributions
 unnecessary+0.018
Matthew+0.017
 avail+0.017
aman+0.017
Michelle+0.016
Neg logit contributions
ily-0.035
 L-0.034
 Mom-0.033
 Z-0.033
 sometimes-0.032
Tokenzag
Token ID50183
Feature activation+5.90e-02

Pos logit contributions
 unnecessary+0.033
izer+0.029
 avail+0.028
 invention+0.028
bek+0.028
Neg logit contributions
 sometimes-0.059
 Mom-0.058
 from-0.057
 Sara-0.055
 L-0.055
Token and
Token ID290
Feature activation+1.05e-03

Pos logit contributions
izer+0.096
 unnecessary+0.095
 invention+0.091
lor+0.091
 lookout+0.089
Neg logit contributions
 sometimes-0.083
 Mom-0.081
 Sara-0.078
 L-0.076
ily-0.075
Token went
Token ID1816
Feature activation-6.77e-03

Pos logit contributions
 unnecessary+0.002
izer+0.002
lor+0.002
 invention+0.002
bek+0.002
Neg logit contributions
 sometimes-0.001
 Mom-0.001
 Sara-0.001
 from-0.001
 L-0.001
Token on
Token ID319
Feature activation+6.37e-03

Pos logit contributions
 sometimes+0.007
 Mom+0.007
 Sara+0.007
 from+0.007
 L+0.007
Neg logit contributions
 unnecessary-0.013
izer-0.013
 invention-0.013
lor-0.013
bek-0.013
Token with
Token ID351
Feature activation-1.25e-04

Pos logit contributions
 unnecessary+0.012
izer+0.012
lor+0.012
 invention+0.012
bek+0.012
Neg logit contributions
 sometimes-0.007
 Mom-0.007
 Sara-0.007
 from-0.006
 L-0.006
Token his
Token ID465
Feature activation-4.89e-03

Pos logit contributions
 sometimes+0.000
 Mom+0.000
 from+0.000
 Sara+0.000
ily+0.000
Neg logit contributions
 unnecessary-0.000
izer-0.000
 avail-0.000
 invention-0.000
lor-0.000
Token and
Token ID290
Feature activation+1.40e-02

Pos logit contributions
 unnecessary+0.013
izer+0.013
lor+0.012
 invention+0.012
bek+0.012
Neg logit contributions
 sometimes-0.004
 Mom-0.004
 from-0.004
 Sara-0.004
 L-0.004
Token clean
Token ID3424
Feature activation+1.87e-03

Pos logit contributions
 unnecessary+0.024
izer+0.024
lor+0.023
 invention+0.023
bek+0.022
Neg logit contributions
 sometimes-0.019
 Mom-0.019
 Sara-0.018
 from-0.017
 L-0.017
Token your
Token ID534
Feature activation+3.15e-03

Pos logit contributions
 unnecessary+0.004
izer+0.004
lor+0.004
bek+0.004
 invention+0.004
Neg logit contributions
 sometimes-0.002
 Mom-0.002
 Sara-0.002
 from-0.002
 L-0.002
Token boo
Token ID22513
Feature activation+3.44e-02

Pos logit contributions
asha+0.006
bek+0.006
aman+0.006
andra+0.006
lor+0.006
Neg logit contributions
 Mom-0.004
 sometimes-0.003
 alone-0.003
 from-0.003
 Sara-0.003
Token-
Token ID12
Feature activation+1.22e-02

Pos logit contributions
 unnecessary+0.047
izer+0.047
aman+0.045
 invention+0.044
lor+0.044
Neg logit contributions
 sometimes-0.059
 Mom-0.057
 from-0.055
 Sara-0.054
 L-0.053
Tokenbo
Token ID2127
Feature activation+5.73e-02

Pos logit contributions
 unnecessary+0.021
izer+0.020
lor+0.020
 invention+0.019
bek+0.019
Neg logit contributions
 sometimes-0.017
 Mom-0.016
 Sara-0.015
 from-0.015
 L-0.015
Tokeno
Token ID78
Feature activation+4.99e-02

Pos logit contributions
 unnecessary+0.110
izer+0.109
lor+0.107
bek+0.103
 invention+0.102
Neg logit contributions
 sometimes-0.062
 Mom-0.061
 Sara-0.057
 Z-0.056
 L-0.055
Token.
Token ID13
Feature activation+1.89e-02

Pos logit contributions
 unnecessary+0.104
izer+0.102
lor+0.100
 invention+0.098
bek+0.097
Neg logit contributions
 sometimes-0.046
 Mom-0.046
 Sara-0.043
 L-0.042
 Z-0.042
Token Then
Token ID3244
Feature activation+2.89e-02

Pos logit contributions
 unnecessary+0.032
izer+0.031
lor+0.030
 invention+0.030
bek+0.030
Neg logit contributions
 sometimes-0.025
 Mom-0.024
 from-0.023
 Sara-0.023
ily-0.023
Token you
Token ID345
Feature activation-7.30e-03

Pos logit contributions
izer+0.060
 unnecessary+0.059
lor+0.058
 invention+0.057
bek+0.057
Neg logit contributions
 sometimes-0.030
 Mom-0.030
 Sara-0.027
 from-0.026
 L-0.026
Token can
Token ID460
Feature activation-1.46e-02

Pos logit contributions
 sometimes+0.011
 Mom+0.011
 Sara+0.010
 Z+0.010
 L+0.010
Neg logit contributions
 unnecessary-0.011
izer-0.011
 invention-0.011
lor-0.011
bek-0.010
Token squirrel
Token ID33039
Feature activation+2.49e-02

Pos logit contributions
izer+0.011
+0.011
 unnecessary+0.011
 lookout+0.011
 newfound+0.011
Neg logit contributions
 sometimes-0.011
 Anna-0.011
 L-0.011
 Sara-0.011
 Z-0.011
Token stood
Token ID6204
Feature activation+2.19e-02

Pos logit contributions
bek+0.040
zed+0.040
lor+0.040
aman+0.040
 invention+0.039
Neg logit contributions
 Mom-0.032
 Sara-0.031
ily-0.031
 sometimes-0.030
 L-0.029
Token up
Token ID510
Feature activation+1.10e-02

Pos logit contributions
izer+0.042
 unnecessary+0.041
aman+0.040
 invention+0.040
 avail+0.040
Neg logit contributions
 sometimes-0.023
ily-0.022
 Mom-0.021
 Sara-0.021
 pand-0.021
Token on
Token ID319
Feature activation+1.01e-02

Pos logit contributions
 unnecessary+0.023
izer+0.023
 invention+0.023
lor+0.022
aman+0.022
Neg logit contributions
 sometimes-0.010
 Mom-0.009
 Sara-0.009
ily-0.008
 L-0.008
Token its
Token ID663
Feature activation+1.65e-02

Pos logit contributions
 unnecessary+0.024
lor+0.023
izer+0.023
bek+0.022
isting+0.022
Neg logit contributions
 Mom-0.009
 sometimes-0.008
 from-0.008
 Sara-0.008
 Tom-0.008
Token hind
Token ID16222
Feature activation+5.58e-02

Pos logit contributions
 lookout+0.025
 unnecessary+0.025
izer+0.024
 invention+0.024
lor+0.022
Neg logit contributions
 sometimes-0.026
ily-0.025
 Sara-0.024
 L-0.024
 Mom-0.024
Token legs
Token ID7405
Feature activation+1.56e-02

Pos logit contributions
 unnecessary+0.103
izer+0.102
 invention+0.099
lor+0.098
aman+0.096
Neg logit contributions
 sometimes-0.066
 Mom-0.065
 Sara-0.062
 L-0.060
 from-0.060
Token,
Token ID11
Feature activation+1.62e-02

Pos logit contributions
izer+0.031
aman+0.030
 unnecessary+0.030
 invention+0.030
lor+0.030
Neg logit contributions
 Mom-0.015
 sometimes-0.015
 Sara-0.014
ily-0.014
 L-0.014
Token grabbed
Token ID13176
Feature activation+3.04e-02

Pos logit contributions
 unnecessary+0.033
izer+0.033
lor+0.032
 invention+0.031
bek+0.031
Neg logit contributions
 sometimes-0.017
 Mom-0.016
 Sara-0.015
 from-0.015
 L-0.014
Token the
Token ID262
Feature activation+1.24e-02

Pos logit contributions
 unnecessary+0.062
izer+0.062
lor+0.060
 invention+0.059
bek+0.059
Neg logit contributions
 sometimes-0.031
 Mom-0.030
 Sara-0.028
 from-0.028
 L-0.027
Token peanut
Token ID26636
Feature activation+8.85e-03

Pos logit contributions
izer+0.019
 unnecessary+0.019
 lookout+0.018
lor+0.017
 invention+0.017
Neg logit contributions
 sometimes-0.019
 Mom-0.018
 Sara-0.017
 Z-0.017
 Anna-0.017
Token to
Token ID284
Feature activation-3.33e-03

Pos logit contributions
 sometimes+0.012
ily+0.012
 Sara+0.011
 Z+0.011
 L+0.011
Neg logit contributions
 unnecessary-0.027
 invention-0.026
 lookout-0.026
 tool-0.025
izer-0.025
Token carry
Token ID3283
Feature activation+5.73e-03

Pos logit contributions
 sometimes+0.005
 Mom+0.005
 Sara+0.004
 from+0.004
 L+0.004
Neg logit contributions
 unnecessary-0.005
izer-0.005
lor-0.005
 invention-0.005
bek-0.005
Token.
Token ID13
Feature activation-9.99e-03

Pos logit contributions
 unnecessary+0.012
izer+0.012
 precautions+0.011
 invention+0.011
lor+0.011
Neg logit contributions
 sometimes-0.005
 Mom-0.005
 Z-0.005
 Sara-0.004
ily-0.004
Token They
Token ID1119
Feature activation+3.21e-03

Pos logit contributions
 sometimes+0.010
 from+0.009
 Mom+0.009
 pengu+0.009
 Z+0.009
Neg logit contributions
 unnecessary-0.019
izer-0.018
aman-0.018
-0.018
 invention-0.018
Token picked
Token ID6497
Feature activation+4.39e-02

Pos logit contributions
bek+0.005
izer+0.005
 unnecessary+0.005
 invention+0.005
aman+0.005
Neg logit contributions
 sometimes-0.004
 Mom-0.004
 Sara-0.004
 from-0.004
 Z-0.004
Token out
Token ID503
Feature activation+5.56e-02

Pos logit contributions
 unnecessary+0.089
izer+0.088
lor+0.085
 invention+0.085
bek+0.083
Neg logit contributions
 sometimes-0.045
 Mom-0.043
 Sara-0.040
 from-0.039
 L-0.039
Token some
Token ID617
Feature activation+2.79e-04

Pos logit contributions
 unnecessary+0.113
izer+0.112
 invention+0.108
lor+0.107
aman+0.106
Neg logit contributions
 sometimes-0.052
 Mom-0.048
ily-0.045
 Sara-0.045
 from-0.045
Token l
Token ID300
Feature activation-5.25e-04

Pos logit contributions
 unnecessary+0.000
izer+0.000
 invention+0.000
 tool+0.000
 lookout+0.000
Neg logit contributions
 Mom-0.000
 sometimes-0.000
 Z-0.000
 L-0.000
 Sara-0.000
Tokenoll
Token ID692
Feature activation+2.00e-02

Pos logit contributions
 sometimes+0.001
 Mom+0.001
 Z+0.000
 from+0.000
 Sara+0.000
Neg logit contributions
lor-0.001
 unnecessary-0.001
izer-0.001
-0.001
ipping-0.001
Tokenip
Token ID541
Feature activation+2.00e-02

Pos logit contributions
izer+0.026
 unnecessary+0.026
 lookout+0.025
 invention+0.024
lor+0.024
Neg logit contributions
 sometimes-0.036
 Mom-0.035
 Sara-0.034
 from-0.034
 L-0.033
Tokenops
Token ID2840
Feature activation+2.22e-02

Pos logit contributions
 unnecessary+0.034
izer+0.033
bek+0.032
 invention+0.032
 avail+0.032
Neg logit contributions
 sometimes-0.028
 Mom-0.027
 from-0.027
 Sara-0.026
 L-0.025
Token He
Token ID679
Feature activation-1.22e-02

Pos logit contributions
 sometimes+0.010
 from+0.008
 Mom+0.008
 pand+0.008
ily+0.008
Neg logit contributions
izer-0.019
 unnecessary-0.019
lor-0.019
-0.018
aman-0.018
Token stood
Token ID6204
Feature activation+8.13e-03

Pos logit contributions
 sometimes+0.016
 Mom+0.015
 Sara+0.014
ily+0.014
 from+0.014
Neg logit contributions
izer-0.020
 unnecessary-0.020
lor-0.019
 invention-0.019
bek-0.019
Token up
Token ID510
Feature activation+1.95e-03

Pos logit contributions
izer+0.015
ipping+0.015
 unnecessary+0.015
 avail+0.014
antly+0.014
Neg logit contributions
 sometimes-0.008
 pand-0.008
ily-0.008
 Mom-0.008
 Sara-0.008
Token on
Token ID319
Feature activation-8.60e-04

Pos logit contributions
izer+0.004
aman+0.003
+0.003
 invention+0.003
 unnecessary+0.003
Neg logit contributions
 sometimes-0.002
ily-0.002
 Mom-0.002
 pand-0.002
 Z-0.002
Token his
Token ID465
Feature activation+2.28e-02

Pos logit contributions
 sometimes+0.001
 Mom+0.001
 Sara+0.001
ily+0.001
 from+0.001
Neg logit contributions
izer-0.002
 unnecessary-0.002
lor-0.002
 invention-0.002
aman-0.002
Token hind
Token ID16222
Feature activation+5.54e-02

Pos logit contributions
 unnecessary+0.039
lor+0.038
bek+0.037
izer+0.037
aman+0.037
Neg logit contributions
 Mom-0.032
 sometimes-0.030
 from-0.029
 Sara-0.029
 Anna-0.027
Token legs
Token ID7405
Feature activation+5.93e-03

Pos logit contributions
 unnecessary+0.109
izer+0.108
lor+0.104
 invention+0.103
bek+0.103
Neg logit contributions
 sometimes-0.060
 Mom-0.059
 Sara-0.055
 from-0.054
 L-0.053
Token and
Token ID290
Feature activation-9.54e-03

Pos logit contributions
izer+0.010
aman+0.010
lor+0.010
 lookout+0.010
+0.010
Neg logit contributions
 pand-0.007
 Mom-0.007
 Sara-0.007
 Z-0.007
 L-0.006
Token cl
Token ID537
Feature activation-3.68e-03

Pos logit contributions
 sometimes+0.012
 Mom+0.012
ily+0.012
 Sara+0.011
 from+0.011
Neg logit contributions
 unnecessary-0.016
izer-0.016
 invention-0.015
lor-0.015
bek-0.015
Tokenapped
Token ID6320
Feature activation+1.58e-02

Pos logit contributions
 sometimes+0.005
 Mom+0.005
 Sara+0.004
 from+0.004
 L+0.004
Neg logit contributions
 unnecessary-0.006
izer-0.006
lor-0.006
 invention-0.006
bek-0.006
Token his
Token ID465
Feature activation+1.33e-02

Pos logit contributions
 unnecessary+0.021
izer+0.021
 avail+0.021
ously+0.020
zed+0.020
Neg logit contributions
 sometimes-0.023
 pand-0.022
ents-0.021
ily-0.021
 pengu-0.020
Token is
Token ID318
Feature activation+1.28e-02

Pos logit contributions
 unnecessary+0.011
lor+0.011
izer+0.011
 invention+0.011
aman+0.011
Neg logit contributions
 sometimes-0.008
 Mom-0.008
 Sara-0.007
 Z-0.007
 L-0.007
Token that
Token ID326
Feature activation+1.54e-02

Pos logit contributions
izer+0.033
 unnecessary+0.032
lor+0.031
 avail+0.031
bek+0.031
Neg logit contributions
 sometimes-0.008
 Mom-0.007
 from-0.007
 Sara-0.006
 L-0.006
Token we
Token ID356
Feature activation+3.34e-02

Pos logit contributions
 unnecessary+0.029
izer+0.029
 invention+0.028
lor+0.028
bek+0.027
Neg logit contributions
 sometimes-0.017
 Mom-0.017
 Sara-0.016
ily-0.015
 L-0.015
Token should
Token ID815
Feature activation+2.84e-02

Pos logit contributions
 unnecessary+0.063
izer+0.062
lor+0.060
aman+0.059
 invention+0.059
Neg logit contributions
 sometimes-0.040
 Mom-0.038
 from-0.036
 Sara-0.036
ily-0.035
Token respect
Token ID2461
Feature activation+2.08e-02

Pos logit contributions
 unnecessary+0.056
izer+0.055
lor+0.052
bek+0.052
 invention+0.052
Neg logit contributions
 sometimes-0.034
 Mom-0.032
 from-0.029
 Sara-0.029
ily-0.028
Token everyone
Token ID2506
Feature activation+5.48e-02

Pos logit contributions
izer+0.044
lor+0.044
+0.043
 unnecessary+0.043
bek+0.042
Neg logit contributions
 sometimes-0.021
 Mom-0.021
 from-0.020
 Sara-0.018
 Tom-0.018
Tokenâ
Token ID22940
Feature activation+3.51e-02

Pos logit contributions
izer+0.117
 unnecessary+0.114
lor+0.113
aman+0.111
bek+0.111
Neg logit contributions
 sometimes-0.054
 Mom-0.051
 from-0.048
 Sara-0.046
 L-0.044
Token
Token ID26391
Feature activation+4.67e-02

Pos logit contributions
 unnecessary+0.050
lor+0.049
izer+0.047
aman+0.047
 invention+0.046
Neg logit contributions
 sometimes-0.058
ily-0.054
 like-0.052
 Mom-0.052
 Tom-0.052
Token
Token ID8151
Feature activation+3.59e-02

Pos logit contributions
 unnecessary+0.063
izer+0.062
 avail+0.058
 invention+0.058
aman+0.058
Neg logit contributions
 sometimes-0.080
 Mom-0.079
 Sara-0.075
 Tom-0.075
 L-0.074
Tokens
Token ID82
Feature activation+2.08e-02

Pos logit contributions
 unnecessary+0.071
izer+0.070
lor+0.068
 invention+0.068
bek+0.067
Neg logit contributions
 sometimes-0.037
 Mom-0.036
 Sara-0.034
 from-0.033
 L-0.033
Token belongings
Token ID30985
Feature activation+9.56e-03

Pos logit contributions
izer+0.035
 unnecessary+0.034
lor+0.034
 avail+0.034
aman+0.033
Neg logit contributions
 sometimes-0.030
 Mom-0.028
 from-0.027
 Sara-0.025
 alone-0.025
Tokenily
Token ID813
Feature activation-3.85e-03

Pos logit contributions
 sometimes+0.001
ily+0.000
 Mom-0.000
 Sara-0.001
 from-0.001
Neg logit contributions
 unnecessary-0.050
izer-0.049
 invention-0.048
 avail-0.048
lor-0.048
Token and
Token ID290
Feature activation+1.73e-02

Pos logit contributions
 sometimes+0.004
 Mom+0.004
 from+0.003
 Sara+0.003
 L+0.003
Neg logit contributions
 unnecessary-0.009
 avail-0.008
izer-0.008
bek-0.008
 lookout-0.008
Token her
Token ID607
Feature activation+5.10e-03

Pos logit contributions
izer+0.033
aman+0.031
 avail+0.031
 unnecessary+0.031
 invention+0.031
Neg logit contributions
 sometimes-0.018
ily-0.016
 Mom-0.016
 Z-0.015
 from-0.015
Token mom
Token ID1995
Feature activation+5.12e-03

Pos logit contributions
 invention+0.009
izer+0.009
+0.009
 newfound+0.009
 unnecessary+0.009
Neg logit contributions
 sometimes-0.005
ily-0.005
 from-0.005
 Sara-0.005
 thirteen-0.005
Token picked
Token ID6497
Feature activation+5.35e-02

Pos logit contributions
 unnecessary+0.009
izer+0.009
lor+0.008
 invention+0.008
bek+0.008
Neg logit contributions
 sometimes-0.007
 Mom-0.007
 Sara-0.006
 from-0.006
 L-0.006
Token out
Token ID503
Feature activation+5.47e-02

Pos logit contributions
 unnecessary+0.116
izer+0.116
lor+0.116
bek+0.114
 lookout+0.113
Neg logit contributions
 Mom-0.048
 sometimes-0.046
 Sara-0.044
 from-0.043
 Tom-0.041
Token a
Token ID257
Feature activation+7.42e-03

Pos logit contributions
 unnecessary+0.120
izer+0.118
 invention+0.114
lor+0.114
bek+0.113
Neg logit contributions
 sometimes-0.045
 Mom-0.042
 Sara-0.039
ily-0.038
 from-0.038
Token pretty
Token ID2495
Feature activation+6.93e-03

Pos logit contributions
 unnecessary+0.014
izer+0.013
lor+0.013
 invention+0.013
bek+0.013
Neg logit contributions
 sometimes-0.009
 Mom-0.009
 Sara-0.009
 from-0.009
 L-0.008
Token card
Token ID2657
Feature activation+1.67e-02

Pos logit contributions
 unnecessary+0.013
izer+0.012
lor+0.012
bek+0.012
 invention+0.012
Neg logit contributions
 sometimes-0.009
 Mom-0.009
 Sara-0.008
 from-0.008
 L-0.008
Token with
Token ID351
Feature activation+3.87e-03

Pos logit contributions
izer+0.031
lor+0.030
 invention+0.030
 tool+0.029
 tractor+0.029
Neg logit contributions
 sometimes-0.017
 Mom-0.016
 Sara-0.016
 Mr-0.016
 Qu-0.016
Token flowers
Token ID12734
Feature activation+2.19e-02

Pos logit contributions
 unnecessary+0.006
 invention+0.006
izer+0.006
 newfound+0.006
andra+0.006
Neg logit contributions
 sometimes-0.005
 alone-0.005
ily-0.005
 Sara-0.004
 Mom-0.004
Token a
Token ID257
Feature activation+7.41e-03

Pos logit contributions
 unnecessary+0.001
izer+0.001
lor+0.001
aman+0.001
 invention+0.001
Neg logit contributions
 sometimes-0.000
 Mom-0.000
 Sara-0.000
 Z-0.000
ily-0.000
Token rabbit
Token ID22746
Feature activation+1.01e-02

Pos logit contributions
 unnecessary+0.013
izer+0.013
 invention+0.012
bek+0.012
lor+0.012
Neg logit contributions
 sometimes-0.009
 Mom-0.009
 Sara-0.009
ily-0.009
 from-0.008
Token standing
Token ID5055
Feature activation+3.69e-04

Pos logit contributions
izer+0.018
lor+0.017
 unnecessary+0.017
 invention+0.017
aman+0.016
Neg logit contributions
 Mom-0.012
 sometimes-0.012
 Sara-0.012
ily-0.011
 L-0.011
Token on
Token ID319
Feature activation-8.50e-03

Pos logit contributions
izer+0.001
 unnecessary+0.001
lor+0.001
bek+0.001
aman+0.001
Neg logit contributions
 sometimes-0.000
 Mom-0.000
 Sara-0.000
 pand-0.000
 L-0.000
Token its
Token ID663
Feature activation+7.73e-03

Pos logit contributions
 sometimes+0.008
 Mom+0.007
 Sara+0.007
ily+0.007
 L+0.007
Neg logit contributions
 unnecessary-0.018
izer-0.018
 invention-0.017
lor-0.017
aman-0.017
Token hind
Token ID16222
Feature activation+5.42e-02

Pos logit contributions
 lookout+0.011
 unnecessary+0.011
izer+0.010
 invention+0.010
lor+0.010
Neg logit contributions
 sometimes-0.013
ily-0.012
 Sara-0.012
 L-0.012
 Mom-0.012
Token legs
Token ID7405
Feature activation-1.34e-03

Pos logit contributions
 unnecessary+0.098
izer+0.097
 invention+0.094
lor+0.093
 avail+0.091
Neg logit contributions
 sometimes-0.066
 Mom-0.065
 Sara-0.062
 L-0.060
ily-0.060
Token and
Token ID290
Feature activation-1.71e-03

Pos logit contributions
 Mom+0.001
 sometimes+0.001
 Sara+0.001
 L+0.001
ily+0.001
Neg logit contributions
izer-0.003
lor-0.003
aman-0.003
 unnecessary-0.003
 invention-0.003
Token hopping
Token ID47153
Feature activation+6.44e-03

Pos logit contributions
 sometimes+0.002
ily+0.002
 Mom+0.002
 Sara+0.002
 pand+0.002
Neg logit contributions
 unnecessary-0.003
 lookout-0.003
zed-0.003
 invention-0.003
izer-0.003
Token around
Token ID1088
Feature activation-1.44e-02

Pos logit contributions
izer+0.013
 unnecessary+0.013
ously+0.012
 precautions+0.012
 lookout+0.012
Neg logit contributions
 Mom-0.006
 sometimes-0.006
 Sara-0.006
 pand-0.005
 L-0.005
Token.
Token ID13
Feature activation-4.98e-03

Pos logit contributions
 Mom+0.011
 sometimes+0.011
 Sara+0.010
ily+0.010
 Z+0.010
Neg logit contributions
 unnecessary-0.031
izer-0.030
 invention-0.030
 lookout-0.030
 avail-0.029
Tokenmy
Token ID1820
Feature activation+5.43e-03

Pos logit contributions
 sometimes+0.009
 Mom+0.008
 Sara+0.008
 Z+0.008
 from+0.008
Neg logit contributions
izer-0.023
 unnecessary-0.022
 invention-0.022
aman-0.022
lor-0.022
Token cleaned
Token ID20750
Feature activation+6.26e-03

Pos logit contributions
 unnecessary+0.010
izer+0.010
lor+0.009
 invention+0.009
bek+0.009
Neg logit contributions
 sometimes-0.007
 Mom-0.007
 Sara-0.006
 from-0.006
 L-0.006
Token her
Token ID607
Feature activation+2.64e-02

Pos logit contributions
 unnecessary+0.014
izer+0.014
lor+0.014
bek+0.014
isting+0.013
Neg logit contributions
 Mom-0.005
 sometimes-0.005
 Sara-0.005
 from-0.005
 L-0.005
Token boo
Token ID22513
Feature activation+3.16e-02

Pos logit contributions
 unnecessary+0.052
lor+0.051
izer+0.050
bek+0.050
asha+0.050
Neg logit contributions
 Mom-0.032
 sometimes-0.031
 from-0.030
 Sara-0.027
 alone-0.027
Token-
Token ID12
Feature activation+1.29e-02

Pos logit contributions
 unnecessary+0.042
izer+0.042
 invention+0.040
aman+0.040
lor+0.040
Neg logit contributions
 sometimes-0.055
 Mom-0.054
 from-0.052
 Sara-0.051
 L-0.050
Tokenbo
Token ID2127
Feature activation+5.42e-02

Pos logit contributions
 unnecessary+0.021
izer+0.021
 invention+0.021
 avail+0.020
lor+0.020
Neg logit contributions
 sometimes-0.018
 Mom-0.018
 Sara-0.017
 from-0.017
 L-0.016
Tokeno
Token ID78
Feature activation+4.98e-02

Pos logit contributions
 unnecessary+0.105
izer+0.105
lor+0.102
 invention+0.100
bek+0.099
Neg logit contributions
 sometimes-0.058
 Mom-0.056
 Sara-0.053
 L-0.051
 from-0.051
Token with
Token ID351
Feature activation+1.14e-02

Pos logit contributions
 unnecessary+0.096
izer+0.095
lor+0.092
 invention+0.091
bek+0.090
Neg logit contributions
 sometimes-0.054
 Mom-0.053
 Sara-0.051
 L-0.049
 Z-0.049
Token a
Token ID257
Feature activation-3.29e-03

Pos logit contributions
 unnecessary+0.023
izer+0.022
 invention+0.021
aman+0.021
 avail+0.021
Neg logit contributions
 sometimes-0.012
 Mom-0.011
ily-0.011
 Sara-0.011
 Z-0.011
Token wet
Token ID9583
Feature activation+6.38e-03

Pos logit contributions
 Mom+0.004
 from+0.004
 Mr+0.004
 Sara+0.004
 sometimes+0.004
Neg logit contributions
 unnecessary-0.006
 avail-0.006
izer-0.006
bek-0.006
lor-0.006
Token cloth
Token ID16270
Feature activation+1.12e-02

Pos logit contributions
 unnecessary+0.010
bek+0.009
izer+0.009
lor+0.009
isting+0.009
Neg logit contributions
 sometimes-0.011
 Mom-0.011
 from-0.010
 Sara-0.010
 L-0.010
Token He
Token ID679
Feature activation+9.03e-03

Pos logit contributions
 unnecessary+0.004
izer+0.004
 invention+0.004
lor+0.004
bek+0.004
Neg logit contributions
 sometimes-0.002
 Mom-0.002
 from-0.002
 Sara-0.002
ily-0.002
Token stood
Token ID6204
Feature activation+1.86e-02

Pos logit contributions
 unnecessary+0.018
izer+0.018
 avail+0.017
lor+0.017
+0.016
Neg logit contributions
 sometimes-0.011
 Mom-0.010
 from-0.010
 Sara-0.009
 L-0.009
Token up
Token ID510
Feature activation+5.79e-03

Pos logit contributions
 unnecessary+0.040
izer+0.040
lor+0.039
 invention+0.038
bek+0.038
Neg logit contributions
 sometimes-0.016
 Mom-0.016
 Sara-0.015
 from-0.014
 L-0.014
Token on
Token ID319
Feature activation+6.68e-03

Pos logit contributions
 unnecessary+0.013
izer+0.013
lor+0.012
 invention+0.012
bek+0.012
Neg logit contributions
 sometimes-0.005
 Mom-0.005
 Sara-0.004
 from-0.004
 L-0.004
Token his
Token ID465
Feature activation+1.90e-02

Pos logit contributions
 unnecessary+0.017
izer+0.016
lor+0.016
bek+0.016
isting+0.016
Neg logit contributions
 sometimes-0.004
 Mom-0.004
 from-0.004
 Sara-0.004
 L-0.003
Token hind
Token ID16222
Feature activation+5.42e-02

Pos logit contributions
 unnecessary+0.032
lor+0.031
bek+0.031
izer+0.030
 avail+0.030
Neg logit contributions
 Mom-0.029
 sometimes-0.026
 from-0.026
 alone-0.025
 Z-0.024
Token legs
Token ID7405
Feature activation+1.30e-02

Pos logit contributions
izer+0.114
 unnecessary+0.113
lor+0.110
bek+0.109
aman+0.107
Neg logit contributions
 sometimes-0.053
 Mom-0.053
 from-0.047
 Sara-0.047
 L-0.046
Token and
Token ID290
Feature activation+2.60e-03

Pos logit contributions
 unnecessary+0.027
izer+0.027
lor+0.026
 invention+0.026
aman+0.026
Neg logit contributions
 sometimes-0.012
 Mom-0.011
 Sara-0.011
 L-0.010
ily-0.010
Token waved
Token ID26834
Feature activation+1.86e-02

Pos logit contributions
 unnecessary+0.005
izer+0.005
lor+0.004
 invention+0.004
bek+0.004
Neg logit contributions
 sometimes-0.003
 Mom-0.003
 Sara-0.003
 from-0.003
 L-0.003
Token his
Token ID465
Feature activation+1.04e-02

Pos logit contributions
 unnecessary+0.036
izer+0.035
 invention+0.034
lor+0.034
 avail+0.033
Neg logit contributions
 sometimes-0.021
 Mom-0.020
 Sara-0.019
ily-0.019
 from-0.018
Token front
Token ID2166
Feature activation+1.18e-03

Pos logit contributions
lor+0.021
aman+0.020
achers+0.020
bek+0.020
izer+0.020
Neg logit contributions
 Mom-0.013
 from-0.011
 sometimes-0.011
 Z-0.010
 L-0.010
Token and
Token ID290
Feature activation+1.37e-02

Pos logit contributions
 sometimes+0.002
 Mom+0.002
 Sara+0.001
 from+0.001
 L+0.001
Neg logit contributions
 unnecessary-0.002
izer-0.002
 invention-0.002
lor-0.002
bek-0.002
Token her
Token ID607
Feature activation+3.88e-03

Pos logit contributions
 avail+0.024
 unnecessary+0.024
 invention+0.024
aman+0.024
izer+0.024
Neg logit contributions
 sometimes-0.015
ily-0.014
ents-0.014
 from-0.014
 Z-0.013
Token mom
Token ID1995
Feature activation+1.02e-03

Pos logit contributions
 invention+0.007
 newfound+0.007
 tool+0.006
+0.006
pron+0.006
Neg logit contributions
 sometimes-0.005
 thirteen-0.005
 from-0.004
ily-0.004
 like-0.004
Tokenmy
Token ID1820
Feature activation+1.49e-02

Pos logit contributions
bek+0.002
aman+0.002
andra+0.002
ously+0.002
 cast+0.002
Neg logit contributions
 Flor-0.001
 Qu-0.001
 Sara-0.001
ily-0.001
 Z-0.001
Token picked
Token ID6497
Feature activation+5.03e-02

Pos logit contributions
 unnecessary+0.023
bek+0.022
aman+0.022
izer+0.022
 invention+0.022
Neg logit contributions
 sometimes-0.020
 Mom-0.020
 Sara-0.020
ily-0.019
 from-0.019
Token out
Token ID503
Feature activation+5.39e-02

Pos logit contributions
lor+0.116
izer+0.113
bek+0.113
isting+0.113
 lookout+0.112
Neg logit contributions
 Mom-0.045
 from-0.040
 sometimes-0.040
 Sara-0.040
 Tom-0.039
Token some
Token ID617
Feature activation+3.83e-03

Pos logit contributions
 unnecessary+0.118
izer+0.116
 invention+0.112
lor+0.112
bek+0.111
Neg logit contributions
 sometimes-0.046
 Mom-0.044
 Sara-0.041
 from-0.040
 L-0.039
Token books
Token ID3835
Feature activation-2.69e-03

Pos logit contributions
 unnecessary+0.007
izer+0.007
 invention+0.007
aman+0.006
 lookout+0.006
Neg logit contributions
 sometimes-0.005
 Mom-0.004
 Sara-0.004
 Z-0.004
 L-0.004
Token and
Token ID290
Feature activation+2.51e-02

Pos logit contributions
 Mom+0.003
 sometimes+0.002
 Sara+0.002
ily+0.002
 L+0.002
Neg logit contributions
izer-0.005
aman-0.005
 unnecessary-0.005
lor-0.005
 invention-0.005
Token sat
Token ID3332
Feature activation+2.56e-02

Pos logit contributions
 unnecessary+0.047
izer+0.046
 invention+0.044
lor+0.044
bek+0.044
Neg logit contributions
 sometimes-0.030
 Mom-0.029
 Sara-0.027
 from-0.027
 L-0.026
Token down
Token ID866
Feature activation-3.86e-04

Pos logit contributions
 unnecessary+0.055
izer+0.054
lor+0.053
 invention+0.052
 avail+0.052
Neg logit contributions
 Mom-0.025
 sometimes-0.025
 Sara-0.023
 from-0.023
 L-0.022
Token him
Token ID683
Feature activation+9.54e-03

Pos logit contributions
 unnecessary+0.080
lor+0.080
bek+0.079
 invention+0.079
izer+0.078
Neg logit contributions
 from-0.022
 sometimes-0.021
 Mom-0.021
 like-0.017
 Sara-0.017
Token back
Token ID736
Feature activation+1.51e-02

Pos logit contributions
 unnecessary+0.020
izer+0.019
lor+0.019
 invention+0.019
bek+0.019
Neg logit contributions
 sometimes-0.009
 Mom-0.009
 Sara-0.008
 L-0.008
ily-0.008
Token and
Token ID290
Feature activation+2.49e-02

Pos logit contributions
 unnecessary+0.033
izer+0.033
lor+0.032
 invention+0.032
bek+0.032
Neg logit contributions
 sometimes-0.013
 Mom-0.012
 Sara-0.011
 from-0.011
 L-0.011
Token they
Token ID484
Feature activation+1.39e-02

Pos logit contributions
 unnecessary+0.057
izer+0.057
lor+0.055
 invention+0.055
isting+0.054
Neg logit contributions
 sometimes-0.020
 Mom-0.020
 from-0.018
 Sara-0.018
 L-0.017
Token all
Token ID477
Feature activation+1.28e-02

Pos logit contributions
bek+0.026
aman+0.026
zed+0.026
andra+0.026
asha+0.026
Neg logit contributions
 sometimes-0.012
 Sara-0.012
ily-0.012
 pand-0.012
 Mom-0.012
Token held
Token ID2714
Feature activation+5.38e-02

Pos logit contributions
zed+0.020
bek+0.019
ability+0.018
andra+0.018
aman+0.018
Neg logit contributions
 sometimes-0.016
ily-0.015
 pand-0.015
 Mom-0.015
ents-0.015
Token each
Token ID1123
Feature activation+2.84e-02

Pos logit contributions
lor+0.121
izer+0.118
bek+0.116
 unnecessary+0.116
 invention+0.115
Neg logit contributions
 Mom-0.051
 from-0.046
 sometimes-0.046
 Sara-0.044
 Tom-0.042
Token other
Token ID584
Feature activation+2.05e-02

Pos logit contributions
lor+0.069
+0.067
bek+0.067
 unnecessary+0.067
+0.066
Neg logit contributions
 Mom-0.021
 sometimes-0.020
 from-0.019
 Sara-0.019
 Anna-0.018
Token tightly
Token ID17707
Feature activation+8.91e-03

Pos logit contributions
 unnecessary+0.039
izer+0.037
lor+0.037
 invention+0.037
bek+0.036
Neg logit contributions
 sometimes-0.022
 Mom-0.022
 Sara-0.021
 L-0.021
ily-0.021
Token.
Token ID13
Feature activation+2.19e-02

Pos logit contributions
 unnecessary+0.019
izer+0.018
 invention+0.018
 lookout+0.018
lor+0.018
Neg logit contributions
 Mom-0.007
 sometimes-0.007
 Sara-0.007
ily-0.007
 L-0.007
Token
Token ID220
Feature activation+1.62e-02

Pos logit contributions
 unnecessary+0.043
izer+0.042
 avail+0.041
bek+0.040
 invention+0.040
Neg logit contributions
 Mom-0.029
 sometimes-0.026
 Sara-0.025
 Tom-0.025
 Z-0.025
Token flags
Token ID9701
Feature activation+1.09e-02

Pos logit contributions
 unnecessary+0.035
izer+0.035
lor+0.034
 invention+0.034
bek+0.033
Neg logit contributions
 sometimes-0.020
 Mom-0.020
 Sara-0.019
 from-0.018
 L-0.018
Token up
Token ID510
Feature activation+1.31e-02

Pos logit contributions
izer+0.021
 unnecessary+0.021
 invention+0.021
lor+0.021
isting+0.020
Neg logit contributions
 Mom-0.010
 sometimes-0.010
 Sara-0.010
 L-0.010
ily-0.010
Token on
Token ID319
Feature activation+4.40e-03

Pos logit contributions
 unnecessary+0.027
izer+0.027
 invention+0.026
bek+0.026
lor+0.026
Neg logit contributions
 sometimes-0.011
 Mom-0.011
 Sara-0.011
ily-0.010
 L-0.010
Token the
Token ID262
Feature activation+1.54e-02

Pos logit contributions
 unnecessary+0.010
izer+0.010
 invention+0.009
lor+0.009
aman+0.009
Neg logit contributions
 sometimes-0.004
 Mom-0.003
 Sara-0.003
 from-0.003
ily-0.003
Token flag
Token ID6056
Feature activation+2.11e-02

Pos logit contributions
 unnecessary+0.026
izer+0.026
lor+0.025
 invention+0.025
bek+0.024
Neg logit contributions
 sometimes-0.021
 Mom-0.021
 Sara-0.019
 from-0.019
 L-0.019
Tokenpole
Token ID36869
Feature activation+5.35e-02

Pos logit contributions
izer+0.046
 unnecessary+0.046
lor+0.045
 invention+0.045
aman+0.044
Neg logit contributions
 sometimes-0.016
 Mom-0.016
 Sara-0.015
ily-0.014
 L-0.014
Token.
Token ID13
Feature activation+2.27e-04

Pos logit contributions
 unnecessary+0.130
izer+0.127
lor+0.124
bek+0.123
 avail+0.123
Neg logit contributions
 sometimes-0.036
 Mom-0.034
 from-0.032
 Sara-0.029
 L-0.028
Token Look
Token ID6803
Feature activation-1.32e-02

Pos logit contributions
izer+0.000
 unnecessary+0.000
lor+0.000
 invention+0.000
bek+0.000
Neg logit contributions
 sometimes-0.000
ily-0.000
 alone-0.000
 from-0.000
 Sara-0.000
Token,
Token ID11
Feature activation+4.60e-03

Pos logit contributions
 sometimes+0.012
 Mom+0.011
 pand+0.011
 Sara+0.011
ily+0.011
Neg logit contributions
 unnecessary-0.027
 invention-0.026
izer-0.026
isting-0.025
aman-0.025
Token you
Token ID345
Feature activation-1.26e-02

Pos logit contributions
 unnecessary+0.009
izer+0.008
 invention+0.008
lor+0.008
bek+0.008
Neg logit contributions
 sometimes-0.005
 Mom-0.005
ily-0.005
 Sara-0.005
 from-0.004
Token can
Token ID460
Feature activation-1.34e-02

Pos logit contributions
 sometimes+0.015
 Mom+0.014
 Sara+0.013
 from+0.013
 L+0.013
Neg logit contributions
 unnecessary-0.024
izer-0.024
lor-0.023
 invention-0.023
bek-0.023
TokenL
Token ID43
Feature activation-1.63e-02

Pos logit contributions
 unnecessary+0.011
izer+0.011
 invention+0.011
lor+0.011
 avail+0.011
Neg logit contributions
 sometimes-0.008
 Mom-0.007
 Sara-0.007
 from-0.007
 L-0.007
Tokenily
Token ID813
Feature activation-3.85e-03

Pos logit contributions
 sometimes+0.001
ily+0.000
 Mom-0.000
 Sara-0.001
 from-0.001
Neg logit contributions
 unnecessary-0.050
izer-0.049
 invention-0.048
 avail-0.048
lor-0.048
Token and
Token ID290
Feature activation+1.73e-02

Pos logit contributions
 sometimes+0.004
 Mom+0.004
 from+0.003
 Sara+0.003
 L+0.003
Neg logit contributions
 unnecessary-0.009
 avail-0.008
izer-0.008
bek-0.008
 lookout-0.008
Token her
Token ID607
Feature activation+5.10e-03

Pos logit contributions
izer+0.033
aman+0.031
 avail+0.031
 unnecessary+0.031
 invention+0.031
Neg logit contributions
 sometimes-0.018
ily-0.016
 Mom-0.016
 Z-0.015
 from-0.015
Token mom
Token ID1995
Feature activation+5.12e-03

Pos logit contributions
 invention+0.009
izer+0.009
+0.009
 newfound+0.009
 unnecessary+0.009
Neg logit contributions
 sometimes-0.005
ily-0.005
 from-0.005
 Sara-0.005
 thirteen-0.005
Token picked
Token ID6497
Feature activation+5.35e-02

Pos logit contributions
 unnecessary+0.009
izer+0.009
lor+0.008
 invention+0.008
bek+0.008
Neg logit contributions
 sometimes-0.007
 Mom-0.007
 Sara-0.006
 from-0.006
 L-0.006
Token out
Token ID503
Feature activation+5.47e-02

Pos logit contributions
 unnecessary+0.116
izer+0.116
lor+0.116
bek+0.114
 lookout+0.113
Neg logit contributions
 Mom-0.048
 sometimes-0.046
 Sara-0.044
 from-0.043
 Tom-0.041
Token a
Token ID257
Feature activation+7.42e-03

Pos logit contributions
 unnecessary+0.120
izer+0.118
 invention+0.114
lor+0.114
bek+0.113
Neg logit contributions
 sometimes-0.045
 Mom-0.042
 Sara-0.039
ily-0.038
 from-0.038
Token pretty
Token ID2495
Feature activation+6.93e-03

Pos logit contributions
 unnecessary+0.014
izer+0.013
lor+0.013
 invention+0.013
bek+0.013
Neg logit contributions
 sometimes-0.009
 Mom-0.009
 Sara-0.009
 from-0.009
 L-0.008
Token card
Token ID2657
Feature activation+1.67e-02

Pos logit contributions
 unnecessary+0.013
izer+0.012
lor+0.012
bek+0.012
 invention+0.012
Neg logit contributions
 sometimes-0.009
 Mom-0.009
 Sara-0.008
 from-0.008
 L-0.008
Token with
Token ID351
Feature activation+3.87e-03

Pos logit contributions
izer+0.031
lor+0.030
 invention+0.030
 tool+0.029
 tractor+0.029
Neg logit contributions
 sometimes-0.017
 Mom-0.016
 Sara-0.016
 Mr-0.016
 Qu-0.016
Token and
Token ID290
Feature activation+2.97e-02

Pos logit contributions
 sometimes+0.000
 Mom+0.000
 Sara+0.000
 pand+0.000
 Flor+0.000
Neg logit contributions
aman-0.000
zed-0.000
izer-0.000
 invention-0.000
andra-0.000
Token upset
Token ID9247
Feature activation+2.01e-04

Pos logit contributions
 unnecessary+0.058
izer+0.057
aman+0.057
 invention+0.056
bek+0.056
Neg logit contributions
 sometimes-0.030
 Mom-0.028
ily-0.027
 Sara-0.027
 pand-0.026
Token,
Token ID11
Feature activation+1.55e-02

Pos logit contributions
 unnecessary+0.000
izer+0.000
 invention+0.000
lor+0.000
aman+0.000
Neg logit contributions
 sometimes-0.000
 Mom-0.000
 Sara-0.000
ily-0.000
 L-0.000
Token John
Token ID1757
Feature activation+1.85e-02

Pos logit contributions
 unnecessary+0.031
izer+0.031
lor+0.030
 invention+0.030
bek+0.029
Neg logit contributions
 Mom-0.017
 sometimes-0.017
 Sara-0.016
 from-0.015
 L-0.015
Token fet
Token ID11351
Feature activation+5.84e-03

Pos logit contributions
 unnecessary+0.032
izer+0.032
lor+0.031
 invention+0.031
bek+0.030
Neg logit contributions
 sometimes-0.023
 Mom-0.023
 Sara-0.022
ily-0.021
 L-0.021
Tokenched
Token ID1740
Feature activation+5.31e-02

Pos logit contributions
 avail+0.009
 unnecessary+0.009
aman+0.009
 lookout+0.009
izer+0.008
Neg logit contributions
 sometimes-0.009
 L-0.009
ily-0.009
 from-0.009
 Mom-0.009
Token a
Token ID257
Feature activation+3.96e-03

Pos logit contributions
lor+0.127
 unnecessary+0.124
izer+0.124
isting+0.122
bek+0.121
Neg logit contributions
 Mom-0.045
 sometimes-0.039
 Sara-0.039
 from-0.037
 Tom-0.036
Token hose
Token ID31489
Feature activation+1.19e-02

Pos logit contributions
lor+0.007
 unnecessary+0.007
 avail+0.007
izer+0.007
asha+0.007
Neg logit contributions
 Mom-0.006
 Sara-0.005
 from-0.005
 L-0.005
 sometimes-0.005
Token and
Token ID290
Feature activation+4.99e-03

Pos logit contributions
 unnecessary+0.026
izer+0.026
 invention+0.025
lor+0.025
aman+0.024
Neg logit contributions
 sometimes-0.010
 Mom-0.010
 Sara-0.009
 L-0.009
ily-0.009
Token washed
Token ID18989
Feature activation+1.12e-02

Pos logit contributions
izer+0.009
 unnecessary+0.009
lor+0.009
bek+0.009
 avail+0.009
Neg logit contributions
 sometimes-0.006
 Mom-0.006
 Sara-0.006
 L-0.006
 from-0.006
Token away
Token ID1497
Feature activation+1.50e-02

Pos logit contributions
 unnecessary+0.024
izer+0.024
lor+0.023
 invention+0.023
bek+0.023
Neg logit contributions
 sometimes-0.011
 Mom-0.010
 Sara-0.010
 from-0.009
 L-0.009
Token.
Token ID13
Feature activation+3.02e-03

Pos logit contributions
 unnecessary+0.075
izer+0.074
lor+0.072
 invention+0.071
bek+0.071
Neg logit contributions
 sometimes-0.022
 Mom-0.021
 from-0.018
 Sara-0.018
 L-0.017
Token The
Token ID383
Feature activation+4.98e-03

Pos logit contributions
 unnecessary+0.008
izer+0.007
isting+0.007
 avail+0.007
lor+0.007
Neg logit contributions
 Mom-0.002
 Sara-0.002
 sometimes-0.002
 from-0.002
 L-0.002
Token vet
Token ID20202
Feature activation+2.16e-02

Pos logit contributions
 unnecessary+0.008
izer+0.008
lor+0.008
bek+0.008
 invention+0.008
Neg logit contributions
 sometimes-0.007
 Mom-0.007
 from-0.006
ily-0.006
 Sara-0.006
Token looked
Token ID3114
Feature activation+1.46e-02

Pos logit contributions
izer+0.038
 invention+0.038
 unnecessary+0.037
lor+0.037
bek+0.037
Neg logit contributions
 Mom-0.024
 sometimes-0.024
 Sara-0.023
 Z-0.022
ily-0.022
Token the
Token ID262
Feature activation+2.00e-02

Pos logit contributions
 unnecessary+0.030
izer+0.028
 invention+0.027
aman+0.027
 newfound+0.027
Neg logit contributions
 sometimes-0.014
ily-0.013
 Mom-0.013
 Sara-0.013
 Z-0.013
Token owl
Token ID39610
Feature activation+5.29e-02

Pos logit contributions
izer+0.035
 unnecessary+0.034
bek+0.034
lor+0.034
isting+0.034
Neg logit contributions
 sometimes-0.026
 Mom-0.025
 from-0.025
 alone-0.023
 Sara-0.023
Token over
Token ID625
Feature activation+5.70e-03

Pos logit contributions
 unnecessary+0.108
izer+0.108
 avail+0.103
lor+0.103
 invention+0.103
Neg logit contributions
 sometimes-0.056
 Mom-0.051
 from-0.050
 Sara-0.047
 alone-0.045
Token,
Token ID11
Feature activation+1.31e-02

Pos logit contributions
izer+0.014
 unnecessary+0.014
lor+0.014
 invention+0.013
isting+0.013
Neg logit contributions
 sometimes-0.004
 Mom-0.004
 from-0.004
 Sara-0.004
 L-0.003
Token but
Token ID475
Feature activation+1.00e-02

Pos logit contributions
 unnecessary+0.027
izer+0.027
lor+0.026
 invention+0.026
bek+0.026
Neg logit contributions
 sometimes-0.013
 Mom-0.013
 Sara-0.012
 from-0.012
 L-0.011
Token sadly
Token ID21098
Feature activation+9.23e-03

Pos logit contributions
 unnecessary+0.019
bek+0.018
izer+0.018
aman+0.018
 avail+0.018
Neg logit contributions
 sometimes-0.010
 Mom-0.010
ily-0.010
 Z-0.009
 Sara-0.009
Token told
Token ID1297
Feature activation+3.52e-03

Pos logit contributions
 unnecessary+0.020
izer+0.019
lor+0.019
 invention+0.019
bek+0.018
Neg logit contributions
 sometimes-0.009
 Mom-0.008
 Sara-0.008
 from-0.007
 L-0.007
Token They
Token ID1119
Feature activation-6.57e-03

Pos logit contributions
 sometimes+0.001
 pand+0.001
 pengu+0.001
 from+0.001
 Z+0.001
Neg logit contributions
-0.001
izer-0.001
 unnecessary-0.001
Michelle-0.001
andra-0.001
Token put
Token ID1234
Feature activation+3.30e-02

Pos logit contributions
 sometimes+0.008
 Mom+0.008
 Sara+0.007
 pand+0.007
ily+0.007
Neg logit contributions
izer-0.011
bek-0.011
 unnecessary-0.011
aman-0.011
lor-0.011
Token Lily
Token ID20037
Feature activation+1.95e-02

Pos logit contributions
 unnecessary+0.064
izer+0.063
 invention+0.060
lor+0.059
 avail+0.059
Neg logit contributions
 sometimes-0.036
 Mom-0.033
 Sara-0.031
 from-0.031
 Z-0.031
Token on
Token ID319
Feature activation+3.12e-02

Pos logit contributions
 unnecessary+0.038
izer+0.037
lor+0.036
 invention+0.035
bek+0.035
Neg logit contributions
 sometimes-0.021
 Mom-0.021
 Sara-0.020
 L-0.019
ily-0.019
Token a
Token ID257
Feature activation+1.89e-02

Pos logit contributions
 unnecessary+0.069
izer+0.069
 invention+0.066
lor+0.066
aman+0.066
Neg logit contributions
 sometimes-0.026
 Mom-0.024
 Sara-0.022
 from-0.022
 L-0.021
Token stret
Token ID5759
Feature activation+5.27e-02

Pos logit contributions
 unnecessary+0.022
lor+0.021
isting+0.021
asha+0.021
 avail+0.021
Neg logit contributions
 Mom-0.036
 Sara-0.035
 Anna-0.035
 sometimes-0.035
 alone-0.035
Tokencher
Token ID2044
Feature activation+9.19e-03

Pos logit contributions
 unnecessary+0.057
izer+0.056
lor+0.053
 invention+0.052
bek+0.052
Neg logit contributions
 sometimes-0.103
 Mom-0.101
 Sara-0.098
 from-0.097
 L-0.096
Token and
Token ID290
Feature activation+4.44e-03

Pos logit contributions
izer+0.018
lor+0.018
 invention+0.017
 unnecessary+0.017
 lookout+0.017
Neg logit contributions
 Mom-0.009
 sometimes-0.009
 Sara-0.009
 L-0.008
 pand-0.008
Token carried
Token ID5281
Feature activation+1.86e-02

Pos logit contributions
 unnecessary+0.009
izer+0.009
lor+0.009
 invention+0.009
isting+0.009
Neg logit contributions
 Mom-0.005
 sometimes-0.005
 Sara-0.005
 L-0.004
 from-0.004
Token her
Token ID607
Feature activation+6.19e-03

Pos logit contributions
 unnecessary+0.036
izer+0.035
 precautions+0.034
 avail+0.034
bek+0.033
Neg logit contributions
 sometimes-0.019
 Mom-0.017
 pengu-0.016
 pand-0.016
 Z-0.015
Token to
Token ID284
Feature activation+7.06e-04

Pos logit contributions
 unnecessary+0.013
izer+0.013
lor+0.012
 invention+0.012
bek+0.012
Neg logit contributions
 sometimes-0.006
 Mom-0.006
 Sara-0.006
ily-0.005
 L-0.005
Token shared
Token ID4888
Feature activation+1.02e-02

Pos logit contributions
 sometimes+0.014
 Mom+0.014
 Sara+0.013
 L+0.013
ily+0.013
Neg logit contributions
 unnecessary-0.019
bek-0.018
izer-0.018
 invention-0.018
 lookout-0.017
Token."
Token ID526
Feature activation+5.76e-03

Pos logit contributions
 unnecessary+0.016
izer+0.014
 precautions+0.014
 invention+0.014
bek+0.014
Neg logit contributions
 pand-0.013
ily-0.012
 sometimes-0.012
 Z-0.012
 Qu-0.012
Token
Token ID198
Feature activation+1.50e-02

Pos logit contributions
 unnecessary+0.012
izer+0.011
+0.011
 invention+0.011
 lookout+0.011
Neg logit contributions
 sometimes-0.005
 from-0.005
 Mom-0.005
 pand-0.005
ily-0.005
Token
Token ID198
Feature activation+1.25e-02

Pos logit contributions
izer+0.030
 unnecessary+0.029
lor+0.028
isting+0.028
 invention+0.027
Neg logit contributions
 sometimes-0.016
 pand-0.014
 Mom-0.014
 from-0.013
 Sara-0.013
TokenThey
Token ID2990
Feature activation-4.20e-03

Pos logit contributions
 unnecessary+0.022
izer+0.022
 invention+0.021
lor+0.021
bek+0.021
Neg logit contributions
 sometimes-0.016
 Mom-0.016
 Sara-0.015
 from-0.015
 L-0.014
Token picked
Token ID6497
Feature activation+5.27e-02

Pos logit contributions
 sometimes+0.005
 Mom+0.005
 Sara+0.005
 pand+0.005
 Z+0.005
Neg logit contributions
izer-0.007
bek-0.007
andra-0.007
 unnecessary-0.007
aman-0.007
Token up
Token ID510
Feature activation+4.85e-02

Pos logit contributions
lor+0.131
bek+0.127
 unnecessary+0.126
aman+0.125
 lookout+0.124
Neg logit contributions
 Mom-0.040
 from-0.037
 sometimes-0.037
 Sara-0.036
 Tom-0.035
Token the
Token ID262
Feature activation+9.56e-03

Pos logit contributions
 unnecessary+0.109
izer+0.108
 invention+0.104
lor+0.103
bek+0.103
Neg logit contributions
 sometimes-0.038
 Mom-0.035
 Sara-0.032
 from-0.031
ily-0.031
Token pieces
Token ID5207
Feature activation+1.46e-02

Pos logit contributions
 unnecessary+0.014
izer+0.013
 invention+0.013
lor+0.013
bek+0.013
Neg logit contributions
 sometimes-0.015
 Mom-0.015
 Sara-0.015
 L-0.014
 from-0.014
Token of
Token ID286
Feature activation+1.15e-02

Pos logit contributions
izer+0.026
 invention+0.025
ously+0.025
ipping+0.024
 unnecessary+0.024
Neg logit contributions
 Z-0.016
 Mom-0.015
 L-0.015
 pand-0.015
 Sara-0.015
Token the
Token ID262
Feature activation+1.10e-03

Pos logit contributions
 invention+0.015
izer+0.014
 unnecessary+0.014
 tool+0.014
 avail+0.014
Neg logit contributions
 sometimes-0.019
ily-0.018
 Qu-0.018
 Mom-0.017
 Sara-0.017
Token something
Token ID1223
Feature activation-5.99e-03

Pos logit contributions
 sometimes+0.000
 Mom+0.000
 Sara+0.000
 Z+0.000
 from+0.000
Neg logit contributions
 unnecessary-0.000
izer-0.000
lor-0.000
aman-0.000
asha-0.000
Token moving
Token ID3867
Feature activation-2.44e-04

Pos logit contributions
 sometimes+0.007
 Mom+0.007
 Sara+0.007
ily+0.007
 L+0.006
Neg logit contributions
 unnecessary-0.011
 invention-0.010
 lookout-0.010
izer-0.010
 tool-0.010
Token in
Token ID287
Feature activation-1.02e-03

Pos logit contributions
 Mom+0.000
 sometimes+0.000
 Sara+0.000
 L+0.000
ily+0.000
Neg logit contributions
 unnecessary-0.001
izer-0.001
 invention-0.000
bek-0.000
lor-0.000
Token the
Token ID262
Feature activation+1.56e-02

Pos logit contributions
 sometimes+0.001
 Mom+0.001
 Sara+0.001
ily+0.001
 L+0.001
Neg logit contributions
izer-0.002
 unnecessary-0.002
 invention-0.002
aman-0.002
 avail-0.002
Token wild
Token ID4295
Feature activation+1.52e-02

Pos logit contributions
 unnecessary+0.027
izer+0.027
lor+0.026
 invention+0.026
bek+0.026
Neg logit contributions
 sometimes-0.021
 Mom-0.020
 Sara-0.019
 from-0.019
 L-0.018
Tokenfl
Token ID2704
Feature activation+5.26e-02

Pos logit contributions
 unnecessary+0.028
izer+0.028
lor+0.027
 invention+0.027
bek+0.027
Neg logit contributions
 sometimes-0.018
 Mom-0.018
 Sara-0.017
 from-0.016
 L-0.016
Tokenowers
Token ID3618
Feature activation+2.50e-02

Pos logit contributions
 unnecessary+0.101
bek+0.097
 avail+0.091
 invention+0.087
zed+0.087
Neg logit contributions
 from-0.062
 sometimes-0.062
 Mom-0.060
 pand-0.056
ily-0.055
Token.
Token ID13
Feature activation+2.85e-03

Pos logit contributions
 unnecessary+0.058
izer+0.058
lor+0.056
 invention+0.056
bek+0.055
Neg logit contributions
 sometimes-0.018
 Mom-0.017
 Sara-0.015
 from-0.015
 L-0.014
Token It
Token ID632
Feature activation-9.24e-03

Pos logit contributions
 unnecessary+0.006
izer+0.005
lor+0.005
 invention+0.005
bek+0.005
Neg logit contributions
 sometimes-0.003
 Mom-0.003
 Sara-0.003
 from-0.003
 L-0.003
Token was
Token ID373
Feature activation+2.48e-03

Pos logit contributions
 Mom+0.012
 sometimes+0.012
 Sara+0.011
ily+0.011
 Z+0.011
Neg logit contributions
 unnecessary-0.015
izer-0.015
 invention-0.015
lor-0.015
asha-0.015
Token a
Token ID257
Feature activation+7.41e-03

Pos logit contributions
 unnecessary+0.004
izer+0.004
aman+0.004
 invention+0.004
 lookout+0.004
Neg logit contributions
 sometimes-0.003
ily-0.003
 Mom-0.003
 Sara-0.002
 Z-0.002
Token He
Token ID679
Feature activation-6.38e-03

Pos logit contributions
 sometimes+0.009
 Mom+0.009
 Sara+0.008
 from+0.008
 L+0.008
Neg logit contributions
 unnecessary-0.018
izer-0.018
lor-0.018
 invention-0.018
bek-0.017
Token stood
Token ID6204
Feature activation+1.18e-02

Pos logit contributions
 Mom+0.008
 sometimes+0.008
ily+0.008
 from+0.008
 Sara+0.008
Neg logit contributions
izer-0.010
 invention-0.010
 unnecessary-0.010
zed-0.010
bek-0.010
Token up
Token ID510
Feature activation+3.93e-03

Pos logit contributions
izer+0.024
 unnecessary+0.023
 invention+0.023
bek+0.023
 avail+0.022
Neg logit contributions
 sometimes-0.011
ily-0.010
 Mom-0.010
 Sara-0.010
 pand-0.010
Token on
Token ID319
Feature activation+6.51e-04

Pos logit contributions
izer+0.008
 unnecessary+0.008
 invention+0.008
aman+0.008
bek+0.008
Neg logit contributions
 sometimes-0.004
 Mom-0.004
ily-0.003
 Sara-0.003
 L-0.003
Token his
Token ID465
Feature activation+1.47e-02

Pos logit contributions
 unnecessary+0.001
izer+0.001
lor+0.001
bek+0.001
 invention+0.001
Neg logit contributions
 sometimes-0.001
 Mom-0.001
 Sara-0.000
 from-0.000
 L-0.000
Token hind
Token ID16222
Feature activation+5.26e-02

Pos logit contributions
 unnecessary+0.024
izer+0.024
 invention+0.023
lor+0.023
bek+0.022
Neg logit contributions
 sometimes-0.021
 Mom-0.020
 Sara-0.019
 from-0.019
 L-0.019
Token legs
Token ID7405
Feature activation+3.10e-03

Pos logit contributions
 unnecessary+0.096
izer+0.095
 invention+0.092
lor+0.091
 avail+0.090
Neg logit contributions
 sometimes-0.064
 Mom-0.061
 Sara-0.059
 from-0.057
ily-0.057
Token and
Token ID290
Feature activation-3.13e-03

Pos logit contributions
izer+0.006
aman+0.006
 invention+0.006
 lookout+0.006
ously+0.005
Neg logit contributions
 Mom-0.003
 sometimes-0.003
 Sara-0.003
 pand-0.003
ily-0.003
Token held
Token ID2714
Feature activation+2.51e-02

Pos logit contributions
 sometimes+0.004
 Mom+0.004
ily+0.004
 Sara+0.003
 from+0.003
Neg logit contributions
 unnecessary-0.005
izer-0.005
 invention-0.005
bek-0.005
 avail-0.005
Token the
Token ID262
Feature activation+2.71e-03

Pos logit contributions
 unnecessary+0.051
izer+0.050
lor+0.050
aman+0.049
bek+0.049
Neg logit contributions
 Mom-0.027
 sometimes-0.026
 Sara-0.024
 from-0.024
 L-0.024
Token bone
Token ID9970
Feature activation+8.33e-03

Pos logit contributions
 unnecessary+0.004
izer+0.004
 invention+0.004
lor+0.004
bek+0.004
Neg logit contributions
 sometimes-0.004
 Mom-0.004
 Sara-0.004
 from-0.004
 L-0.004
Token p
Token ID279
Feature activation-4.80e-04

Pos logit contributions
 Mom+0.028
 sometimes+0.028
 Sara+0.027
 from+0.026
 L+0.026
Neg logit contributions
 unnecessary-0.025
izer-0.023
lor-0.023
bek-0.022
 avail-0.022
Tokenuddle
Token ID24500
Feature activation-1.14e-02

Pos logit contributions
 Mom+0.001
 sometimes+0.001
 L+0.001
 Sara+0.001
 from+0.001
Neg logit contributions
 unnecessary-0.001
izer-0.001
 avail-0.001
lor-0.001
 lookout-0.000
Token and
Token ID290
Feature activation-1.24e-02

Pos logit contributions
 pand+0.012
 Z+0.012
 Mom+0.011
 sometimes+0.011
 Flor+0.011
Neg logit contributions
izer-0.021
ously-0.020
 invention-0.020
-0.019
 tool-0.019
Token jumps
Token ID18045
Feature activation-2.51e-02

Pos logit contributions
 sometimes+0.015
 Mom+0.015
 Sara+0.014
 L+0.014
 Anna+0.013
Neg logit contributions
lor-0.025
izer-0.025
 unnecessary-0.025
bek-0.024
 avail-0.023
Token in
Token ID287
Feature activation-2.20e-02

Pos logit contributions
 sometimes+0.027
 pand+0.026
 Mom+0.026
 pengu+0.025
 Mr+0.025
Neg logit contributions
izer-0.043
 unnecessary-0.042
 precautions-0.041
ipping-0.041
antly-0.041
Token it
Token ID340
Feature activation-5.56e-02

Pos logit contributions
 sometimes+0.018
 Mom+0.017
ily+0.016
 Sara+0.016
 pand+0.016
Neg logit contributions
izer-0.047
 unnecessary-0.046
 invention-0.045
 avail-0.045
-0.044
Token.
Token ID13
Feature activation-1.77e-02

Pos logit contributions
 pand+0.056
 Z+0.054
ily+0.053
 Mom+0.053
 Mr+0.052
Neg logit contributions
 invention-0.098
 lookout-0.096
antly-0.096
izer-0.095
 tool-0.095
Token He
Token ID679
Feature activation-1.33e-02

Pos logit contributions
 sometimes+0.019
 from+0.016
 pand+0.016
ily+0.016
 alone+0.015
Neg logit contributions
-0.033
izer-0.033
 unnecessary-0.032
zed-0.032
isel-0.031
Token gets
Token ID3011
Feature activation-1.42e-02

Pos logit contributions
 sometimes+0.018
 Mom+0.018
ily+0.017
 from+0.017
 Sara+0.017
Neg logit contributions
 unnecessary-0.021
izer-0.021
 invention-0.020
isting-0.019
 lookout-0.019
Token wet
Token ID9583
Feature activation-8.02e-03

Pos logit contributions
hy+0.018
 Z+0.017
 sometimes+0.017
 Clown+0.017
 Flor+0.016
Neg logit contributions
 unnecessary-0.023
 tool-0.021
aded-0.020
 cast-0.020
isel-0.020
Token and
Token ID290
Feature activation-1.99e-03

Pos logit contributions
 Mom+0.007
 sometimes+0.007
 Z+0.007
 Sara+0.007
ily+0.007
Neg logit contributions
izer-0.016
 unnecessary-0.016
ipping-0.015
 invention-0.015
aman-0.015
Token Sue
Token ID26089
Feature activation-7.05e-03

Pos logit contributions
 sometimes+0.011
 from+0.010
 pand+0.009
 alone+0.009
 like+0.009
Neg logit contributions
izer-0.016
lor-0.016
-0.016
 unnecessary-0.015
-0.015
Token laughs
Token ID22051
Feature activation-1.80e-02

Pos logit contributions
 Mom+0.009
ily+0.008
 sometimes+0.008
 Sara+0.008
 L+0.008
Neg logit contributions
izer-0.011
 unnecessary-0.011
achers-0.011
 invention-0.011
isting-0.011
Token and
Token ID290
Feature activation-1.31e-02

Pos logit contributions
 pand+0.024
ily+0.023
 Nib+0.022
uzz+0.022
 pengu+0.022
Neg logit contributions
ously-0.023
 unnecessary-0.023
izer-0.023
ipping-0.023
 invention-0.022
Token ch
Token ID442
Feature activation-1.88e-02

Pos logit contributions
 sometimes+0.013
 Mom+0.013
ily+0.012
 from+0.012
 Sara+0.011
Neg logit contributions
 unnecessary-0.025
izer-0.025
 invention-0.024
 lookout-0.024
 avail-0.023
Tokenases
Token ID1386
Feature activation-1.84e-02

Pos logit contributions
 sometimes+0.027
 Mom+0.026
 from+0.024
 Mr+0.023
 Sara+0.023
Neg logit contributions
 unnecessary-0.031
izer-0.031
isel-0.030
isting-0.029
 invention-0.029
Token it
Token ID340
Feature activation-5.48e-02

Pos logit contributions
 sometimes+0.014
 pengu+0.012
 Mom+0.012
 pand+0.012
ily+0.011
Neg logit contributions
 unnecessary-0.040
izer-0.039
 avail-0.038
aman-0.038
 precautions-0.037
Token.
Token ID13
Feature activation-1.28e-02

Pos logit contributions
 Mom+0.057
 Mr+0.054
 Sara+0.052
ily+0.052
 pand+0.051
Neg logit contributions
 unnecessary-0.095
 invention-0.092
 lookout-0.092
izer-0.091
zed-0.090
Token Tim
Token ID5045
Feature activation-1.27e-03

Pos logit contributions
 sometimes+0.013
 from+0.013
 pengu+0.012
 alone+0.012
 pand+0.011
Neg logit contributions
-0.024
izer-0.023
aman-0.023
 unnecessary-0.022
-0.022
Token laughs
Token ID22051
Feature activation-1.94e-02

Pos logit contributions
 Mom+0.002
 sometimes+0.001
 Sara+0.001
ily+0.001
 from+0.001
Neg logit contributions
izer-0.002
 unnecessary-0.002
 invention-0.002
isting-0.002
achers-0.002
Token too
Token ID1165
Feature activation-2.49e-02

Pos logit contributions
 pand+0.027
uzz+0.025
ily+0.025
 Mom+0.025
 Ell+0.025
Neg logit contributions
 unnecessary-0.025
izer-0.025
ously-0.025
ipping-0.024
antly-0.023
Token.
Token ID13
Feature activation-1.48e-02

Pos logit contributions
 sometimes+0.021
 Mom+0.020
 pand+0.019
ily+0.019
 Sara+0.019
Neg logit contributions
 unnecessary-0.051
izer-0.049
 lookout-0.048
zed-0.048
 invention-0.048
Tokenuddle
Token ID24500
Feature activation-1.36e-02

Pos logit contributions
 unnecessary+0.007
izer+0.006
lor+0.006
 invention+0.006
bek+0.006
Neg logit contributions
 sometimes-0.013
 Mom-0.013
 Sara-0.012
 L-0.012
 from-0.012
Token.
Token ID13
Feature activation-5.78e-03

Pos logit contributions
 pand+0.015
 Z+0.014
 Mr+0.014
 Flor+0.013
nam+0.013
Neg logit contributions
ously-0.023
izer-0.023
 tool-0.022
ipping-0.022
antly-0.022
Token He
Token ID679
Feature activation-1.37e-02

Pos logit contributions
 sometimes+0.007
 pand+0.006
 from+0.006
 alone+0.006
 pengu+0.006
Neg logit contributions
izer-0.010
isel-0.010
 tool-0.010
-0.010
 unnecessary-0.009
Token jumps
Token ID18045
Feature activation-2.35e-02

Pos logit contributions
 sometimes+0.020
 Mom+0.019
 from+0.018
 Sara+0.018
ily+0.018
Neg logit contributions
 unnecessary-0.021
izer-0.021
 invention-0.020
lor-0.020
 lookout-0.019
Token in
Token ID287
Feature activation-1.73e-02

Pos logit contributions
 pand+0.030
 Mom+0.030
 pengu+0.030
 sometimes+0.029
 Z+0.028
Neg logit contributions
 unnecessary-0.035
antly-0.034
izer-0.033
 precautions-0.033
ipping-0.033
Token it
Token ID340
Feature activation-5.42e-02

Pos logit contributions
 pand+0.017
 sometimes+0.016
ily+0.016
ents+0.016
 Flor+0.015
Neg logit contributions
izer-0.032
 tool-0.031
 avail-0.031
 invention-0.030
 cast-0.030
Token.
Token ID13
Feature activation-9.97e-03

Pos logit contributions
 pand+0.062
 Z+0.059
 Mom+0.058
 Mr+0.058
 pengu+0.058
Neg logit contributions
antly-0.086
 lookout-0.084
 tool-0.083
ig-0.083
 technique-0.083
Token He
Token ID679
Feature activation-1.50e-02

Pos logit contributions
 sometimes+0.012
 pand+0.011
 from+0.011
 alone+0.011
ents+0.010
Neg logit contributions
izer-0.016
isel-0.016
-0.016
zed-0.016
 tool-0.016
Token spl
Token ID4328
Feature activation-7.96e-03

Pos logit contributions
 Mom+0.021
 sometimes+0.021
ily+0.021
 from+0.020
 Sara+0.020
Neg logit contributions
 unnecessary-0.022
izer-0.021
zed-0.021
 invention-0.020
 lookout-0.020
Tokenashes
Token ID7465
Feature activation-2.21e-02

Pos logit contributions
 sometimes+0.011
 Mom+0.010
 Sara+0.010
 from+0.010
 L+0.010
Neg logit contributions
 unnecessary-0.014
izer-0.014
lor-0.013
 invention-0.013
bek-0.013
Token.
Token ID13
Feature activation-7.93e-03

Pos logit contributions
 pand+0.023
 Z+0.023
 sometimes+0.022
 Mom+0.022
 Flor+0.021
Neg logit contributions
 unnecessary-0.038
ipping-0.037
izer-0.037
isel-0.036
 tool-0.036
Token the
Token ID262
Feature activation-5.64e-03

Pos logit contributions
 Mom+0.013
 Sara+0.012
 Tom+0.012
 L+0.011
 sometimes+0.011
Neg logit contributions
 unnecessary-0.051
lor-0.051
izer-0.049
bek-0.049
isting-0.049
Token swing
Token ID9628
Feature activation-1.06e-02

Pos logit contributions
 sometimes+0.008
 Mom+0.008
 Sara+0.008
 from+0.008
 L+0.008
Neg logit contributions
 unnecessary-0.009
izer-0.009
lor-0.009
 invention-0.008
bek-0.008
Token and
Token ID290
Feature activation-9.91e-03

Pos logit contributions
 pand+0.011
 Mom+0.011
 sometimes+0.010
 Flor+0.010
 pengu+0.010
Neg logit contributions
izer-0.018
aman-0.018
-0.018
lor-0.018
 invention-0.018
Token jumps
Token ID18045
Feature activation-2.50e-02

Pos logit contributions
 sometimes+0.014
 Mom+0.014
 Sara+0.014
 L+0.013
 Tom+0.013
Neg logit contributions
izer-0.017
 unnecessary-0.017
lor-0.016
bek-0.016
 invention-0.016
Token on
Token ID319
Feature activation-9.21e-03

Pos logit contributions
 Mom+0.026
 sometimes+0.026
 pengu+0.025
 pand+0.025
 Mr+0.023
Neg logit contributions
izer-0.044
 unnecessary-0.044
 AW-0.043
ipping-0.042
-0.041
Token it
Token ID340
Feature activation-5.39e-02

Pos logit contributions
 Mom+0.009
 sometimes+0.009
 Sara+0.009
 L+0.008
 Tom+0.008
Neg logit contributions
 unnecessary-0.020
izer-0.019
lor-0.019
bek-0.018
 invention-0.018
Token.
Token ID13
Feature activation-1.70e-02

Pos logit contributions
 pand+0.058
 pengu+0.057
 Mom+0.053
ily+0.052
 Nib+0.052
Neg logit contributions
 technique-0.089
 invention-0.089
antly-0.087
 lookout-0.087
 vantage-0.087
Token He
Token ID679
Feature activation-8.49e-03

Pos logit contributions
 sometimes+0.016
 from+0.014
 pand+0.013
ily+0.013
 Mom+0.013
Neg logit contributions
izer-0.035
 unnecessary-0.034
-0.034
lor-0.033
aman-0.033
Token holds
Token ID6622
Feature activation+1.45e-02

Pos logit contributions
 Mom+0.012
 sometimes+0.012
ily+0.012
 from+0.012
 Sara+0.011
Neg logit contributions
 unnecessary-0.012
 invention-0.012
izer-0.011
isting-0.011
 vantage-0.011
Token the
Token ID262
Feature activation-1.58e-03

Pos logit contributions
 unnecessary+0.033
izer+0.032
lor+0.032
 invention+0.032
aman+0.031
Neg logit contributions
 sometimes-0.012
 Mom-0.012
 Sara-0.011
 L-0.010
 from-0.010
Token ropes
Token ID31001
Feature activation+5.12e-03

Pos logit contributions
 Mom+0.002
 sometimes+0.002
 Sara+0.002
ily+0.002
 Tom+0.002
Neg logit contributions
lor-0.003
izer-0.003
 unnecessary-0.003
bek-0.003
achers-0.003
Token a
Token ID257
Feature activation-4.04e-03

Pos logit contributions
 unnecessary+0.005
 precautions+0.005
izer+0.005
 avail+0.005
 tool+0.004
Neg logit contributions
 pengu-0.003
 pand-0.003
 Qu-0.003
 sometimes-0.003
 Flor-0.003
Token chair
Token ID5118
Feature activation-2.27e-02

Pos logit contributions
 Mom+0.006
 sometimes+0.006
 Sara+0.005
 from+0.005
 L+0.005
Neg logit contributions
 unnecessary-0.007
izer-0.007
lor-0.006
 invention-0.006
aman-0.006
Token and
Token ID290
Feature activation+1.50e-03

Pos logit contributions
 Qu+0.023
 Sara+0.023
 Mom+0.023
 pand+0.023
 Flor+0.022
Neg logit contributions
izer-0.042
 tool-0.040
 invention-0.040
aman-0.040
 unnecessary-0.039
Token climbs
Token ID34332
Feature activation-1.31e-02

Pos logit contributions
izer+0.003
lor+0.003
 unnecessary+0.003
bek+0.003
 invention+0.003
Neg logit contributions
 sometimes-0.002
 Mom-0.002
 Tom-0.002
 Sara-0.002
 L-0.002
Token on
Token ID319
Feature activation-1.33e-02

Pos logit contributions
 sometimes+0.013
 Mom+0.013
 Sara+0.012
ily+0.012
 L+0.012
Neg logit contributions
 unnecessary-0.026
izer-0.025
 invention-0.025
lor-0.024
aman-0.024
Token it
Token ID340
Feature activation-5.34e-02

Pos logit contributions
 sometimes+0.012
 Mom+0.012
 Sara+0.011
 from+0.011
ily+0.011
Neg logit contributions
 unnecessary-0.028
izer-0.027
lor-0.027
 invention-0.027
 avail-0.026
Token.
Token ID13
Feature activation-1.41e-02

Pos logit contributions
 pengu+0.059
 pand+0.058
 Qu+0.055
 Mom+0.053
 Sara+0.053
Neg logit contributions
 invention-0.091
 lookout-0.090
-0.090
 tool-0.089
 technique-0.089
Token She
Token ID1375
Feature activation-4.94e-03

Pos logit contributions
 sometimes+0.017
 pand+0.016
 pengu+0.015
 from+0.015
 alone+0.014
Neg logit contributions
zed-0.025
-0.024
 lookout-0.023
-0.023
 unnecessary-0.023
Token reaches
Token ID12229
Feature activation-1.33e-02

Pos logit contributions
 sometimes+0.008
 Mom+0.008
 from+0.007
ily+0.007
 Sara+0.007
Neg logit contributions
 unnecessary-0.007
 invention-0.006
izer-0.006
zed-0.006
lor-0.006
Token for
Token ID329
Feature activation-1.09e-02

Pos logit contributions
 sometimes+0.010
 Mom+0.009
 Sara+0.009
ily+0.008
 L+0.008
Neg logit contributions
 unnecessary-0.030
izer-0.030
 invention-0.029
 avail-0.028
lor-0.028
Token the
Token ID262
Feature activation-9.46e-03

Pos logit contributions
 sometimes+0.009
 Mom+0.009
 Sara+0.008
 from+0.008
 L+0.008
Neg logit contributions
 unnecessary-0.024
izer-0.024
lor-0.023
 invention-0.023
bek-0.023
Token He
Token ID679
Feature activation-1.49e-02

Pos logit contributions
 sometimes+0.010
 Mom+0.010
 from+0.009
 Sara+0.009
ily+0.009
Neg logit contributions
 unnecessary-0.023
izer-0.023
lor-0.022
 invention-0.022
bek-0.021
Token hit
Token ID2277
Feature activation+1.51e-03

Pos logit contributions
 Mom+0.021
 sometimes+0.021
 pand+0.021
ily+0.020
 from+0.020
Neg logit contributions
 cast-0.021
izer-0.020
 invention-0.019
bek-0.019
zed-0.019
Token a
Token ID257
Feature activation-3.10e-03

Pos logit contributions
izer+0.003
 unnecessary+0.003
 cast+0.003
 tool+0.003
 avail+0.003
Neg logit contributions
 sometimes-0.002
 pand-0.002
ily-0.001
 Mom-0.001
ents-0.001
Token ball
Token ID2613
Feature activation-1.97e-02

Pos logit contributions
 sometimes+0.004
 Mom+0.004
 Sara+0.004
 from+0.004
 L+0.004
Neg logit contributions
 unnecessary-0.005
izer-0.005
lor-0.005
 invention-0.005
bek-0.005
Token and
Token ID290
Feature activation-1.19e-02

Pos logit contributions
 Mom+0.021
 Sara+0.020
 Z+0.019
 pand+0.019
 L+0.018
Neg logit contributions
izer-0.037
ipping-0.034
 lookout-0.034
 tool-0.034
 cast-0.034
Token it
Token ID340
Feature activation-5.28e-02

Pos logit contributions
 sometimes+0.015
 Mom+0.015
 pand+0.014
 from+0.014
ily+0.014
Neg logit contributions
 unnecessary-0.020
izer-0.019
 invention-0.018
bek-0.018
zed-0.018
Token fell
Token ID3214
Feature activation-3.32e-02

Pos logit contributions
 Mom+0.083
 Sara+0.080
 pand+0.079
 Z+0.077
 Mr+0.077
Neg logit contributions
zed-0.069
 cast-0.068
 unnecessary-0.065
antly-0.065
 AW-0.063
Token to
Token ID284
Feature activation-1.50e-02

Pos logit contributions
 Mom+0.043
 sometimes+0.042
 Sara+0.041
ily+0.041
 Z+0.039
Neg logit contributions
 unnecessary-0.054
izer-0.052
antly-0.049
 invention-0.049
ipping-0.049
Token the
Token ID262
Feature activation-1.57e-02

Pos logit contributions
 sometimes+0.011
 Mom+0.011
 Sara+0.010
 from+0.009
 L+0.009
Neg logit contributions
 unnecessary-0.035
izer-0.035
lor-0.034
 invention-0.034
bek-0.033
Token ground
Token ID2323
Feature activation-2.41e-02

Pos logit contributions
 sometimes+0.024
 Mom+0.023
 Z+0.022
 Sara+0.022
 L+0.022
Neg logit contributions
izer-0.024
 unnecessary-0.023
 lookout-0.022
 tool-0.022
 invention-0.022
Token.
Token ID13
Feature activation-9.83e-03

Pos logit contributions
 Mom+0.027
ily+0.026
 pand+0.026
 Sara+0.025
 Z+0.025
Neg logit contributions
 lookout-0.040
 cast-0.040
izer-0.039
ously-0.039
-0.039
Token a
Token ID257
Feature activation-1.96e-02

Pos logit contributions
 sometimes+0.008
 Qu+0.007
 Mom+0.007
 alone+0.007
 Z+0.007
Neg logit contributions
 unnecessary-0.012
 invention-0.012
zed-0.011
izer-0.011
 cast-0.011
Token tall
Token ID7331
Feature activation-1.90e-02

Pos logit contributions
 sometimes+0.026
ily+0.024
 Sara+0.023
 Mom+0.023
 from+0.023
Neg logit contributions
 unnecessary-0.033
izer-0.033
 invention-0.033
 lookout-0.032
bek-0.032
Token tower
Token ID10580
Feature activation-2.48e-02

Pos logit contributions
 sometimes+0.030
ily+0.029
 Qu+0.028
 Sara+0.028
 Z+0.027
Neg logit contributions
 invention-0.027
izer-0.027
 tool-0.025
 lookout-0.025
 tractor-0.025
Token and
Token ID290
Feature activation-1.66e-02

Pos logit contributions
 Qu+0.020
 Mom+0.019
 Sara+0.018
 Z+0.018
ily+0.018
Neg logit contributions
izer-0.051
 invention-0.051
 tool-0.049
 lookout-0.048
ipping-0.048
Token named
Token ID3706
Feature activation-1.39e-02

Pos logit contributions
 sometimes+0.020
 Mom+0.019
 from+0.018
 Sara+0.018
ily+0.018
Neg logit contributions
 unnecessary-0.027
 invention-0.027
bek-0.026
zed-0.026
izer-0.026
Token it
Token ID340
Feature activation-5.20e-02

Pos logit contributions
 sometimes+0.009
 from+0.008
 Mom+0.007
 alone+0.007
 pengu+0.007
Neg logit contributions
 unnecessary-0.032
izer-0.031
 invention-0.031
 avail-0.030
aman-0.030
Token "
Token ID366
Feature activation-2.97e-02

Pos logit contributions
 sometimes+0.064
 Mom+0.064
 Sara+0.063
 Qu+0.061
 Z+0.061
Neg logit contributions
 invention-0.082
zed-0.081
antly-0.081
izer-0.081
 tool-0.080
TokenThe
Token ID464
Feature activation-8.17e-03

Pos logit contributions
 sometimes+0.045
 Mom+0.045
 Sara+0.043
 from+0.043
 L+0.042
Neg logit contributions
izer-0.043
 unnecessary-0.042
lor-0.041
bek-0.041
 invention-0.040
Token Big
Token ID4403
Feature activation+1.64e-02

Pos logit contributions
 sometimes+0.012
 Sara+0.011
 Z+0.011
 from+0.011
 L+0.011
Neg logit contributions
 invention-0.012
 unnecessary-0.012
izer-0.012
 tool-0.012
 lookout-0.012
Token Tower
Token ID8765
Feature activation-1.09e-02

Pos logit contributions
 unnecessary+0.027
izer+0.026
lor+0.025
 invention+0.025
bek+0.025
Neg logit contributions
 sometimes-0.023
 Mom-0.023
 Sara-0.022
 from-0.021
 L-0.021
Token".
Token ID1911
Feature activation-1.13e-02

Pos logit contributions
 sometimes+0.012
 Mom+0.012
 Sara+0.011
 L+0.011
 Z+0.011
Neg logit contributions
izer-0.020
 unnecessary-0.020
lor-0.020
 invention-0.019
aman-0.019
Token bear
Token ID6842
Feature activation+7.49e-03

Pos logit contributions
 sometimes+0.016
 Mom+0.014
ily+0.014
 Z+0.014
 L+0.014
Neg logit contributions
izer-0.014
 lookout-0.013
 tool-0.013
 unnecessary-0.013
 tractor-0.013
Token.
Token ID13
Feature activation-7.14e-03

Pos logit contributions
izer+0.017
lor+0.016
 invention+0.016
 unnecessary+0.016
bek+0.015
Neg logit contributions
 Mom-0.006
 sometimes-0.006
 Sara-0.005
ily-0.005
 Z-0.005
Token She
Token ID1375
Feature activation-4.69e-03

Pos logit contributions
 sometimes+0.007
 Mom+0.006
 from+0.006
 Sara+0.006
ily+0.006
Neg logit contributions
 unnecessary-0.015
izer-0.015
lor-0.014
 invention-0.014
aman-0.014
Token shouts
Token ID34757
Feature activation-7.76e-03

Pos logit contributions
 sometimes+0.006
 Mom+0.006
 Sara+0.006
 from+0.006
 L+0.006
Neg logit contributions
 unnecessary-0.008
izer-0.008
lor-0.008
bek-0.008
 invention-0.008
Token at
Token ID379
Feature activation-1.26e-02

Pos logit contributions
 sometimes+0.007
 Mom+0.006
 pand+0.006
 pengu+0.006
 Z+0.006
Neg logit contributions
 unnecessary-0.016
izer-0.015
zed-0.015
ipping-0.015
 AW-0.015
Token it
Token ID340
Feature activation-5.19e-02

Pos logit contributions
 sometimes+0.008
 Mom+0.008
 Sara+0.007
 from+0.006
 L+0.006
Neg logit contributions
 unnecessary-0.031
izer-0.031
lor-0.030
 invention-0.030
bek-0.029
Token.
Token ID13
Feature activation-1.21e-02

Pos logit contributions
 sometimes+0.044
 pand+0.044
 Mom+0.044
ily+0.043
 pengu+0.042
Neg logit contributions
izer-0.098
 invention-0.096
 unnecessary-0.095
zed-0.094
 tool-0.093
Token "
Token ID366
Feature activation-3.79e-03

Pos logit contributions
 sometimes+0.010
 Mom+0.010
 from+0.009
 Sara+0.009
ily+0.008
Neg logit contributions
 unnecessary-0.026
izer-0.026
lor-0.025
 invention-0.025
aman-0.025
TokenGo
Token ID5247
Feature activation-1.26e-02

Pos logit contributions
 sometimes+0.006
 Mom+0.006
 Sara+0.006
 from+0.006
 L+0.006
Neg logit contributions
 unnecessary-0.005
izer-0.005
lor-0.005
bek-0.005
 invention-0.005
Token away
Token ID1497
Feature activation-2.69e-02

Pos logit contributions
 sometimes+0.019
 from+0.018
 Mom+0.018
 Sara+0.018
 like+0.018
Neg logit contributions
 unnecessary-0.020
 invention-0.020
 avail-0.020
lor-0.020
aman-0.019
Token,
Token ID11
Feature activation-2.06e-02

Pos logit contributions
 sometimes+0.024
 Mom+0.022
ily+0.021
 Sara+0.020
 Z+0.020
Neg logit contributions
izer-0.057
 unnecessary-0.056
lor-0.055
 invention-0.055
 avail-0.054
Token the
Token ID262
Feature activation-8.50e-03

Pos logit contributions
 Mom+0.013
 Sara+0.011
 Tom+0.011
 L+0.011
 from+0.010
Neg logit contributions
 unnecessary-0.046
lor-0.045
isting-0.043
bek-0.043
izer-0.043
Token swing
Token ID9628
Feature activation-1.02e-02

Pos logit contributions
 Mom+0.011
 sometimes+0.011
 Sara+0.010
 Tom+0.010
 L+0.010
Neg logit contributions
lor-0.016
izer-0.016
 unnecessary-0.016
bek-0.015
 invention-0.015
Token and
Token ID290
Feature activation-1.67e-02

Pos logit contributions
 Mom+0.011
 sometimes+0.011
ily+0.011
 pand+0.011
 Sara+0.010
Neg logit contributions
aman-0.017
izer-0.017
ipping-0.017
-0.017
 invention-0.017
Token jumped
Token ID11687
Feature activation-1.47e-02

Pos logit contributions
 sometimes+0.022
 Mom+0.022
 Sara+0.021
 L+0.021
 from+0.020
Neg logit contributions
izer-0.030
 unnecessary-0.030
lor-0.029
 invention-0.028
isting-0.028
Token on
Token ID319
Feature activation-4.02e-03

Pos logit contributions
 sometimes+0.017
 Mom+0.016
 pand+0.016
 pengu+0.016
ily+0.015
Neg logit contributions
 unnecessary-0.024
izer-0.024
ipping-0.024
 AW-0.024
-0.023
Token it
Token ID340
Feature activation-5.15e-02

Pos logit contributions
 Mom+0.004
 sometimes+0.004
 Sara+0.004
 L+0.004
 Tom+0.003
Neg logit contributions
 unnecessary-0.009
izer-0.009
lor-0.009
bek-0.008
 invention-0.008
Token.
Token ID13
Feature activation-1.82e-02

Pos logit contributions
 pand+0.053
ily+0.052
 Mom+0.051
 pengu+0.051
 Sara+0.049
Neg logit contributions
 invention-0.087
 lookout-0.087
 technique-0.086
ipping-0.085
-0.084
Token He
Token ID679
Feature activation-1.60e-02

Pos logit contributions
 sometimes+0.018
 Mom+0.017
 from+0.016
 Sara+0.016
ily+0.015
Neg logit contributions
izer-0.037
 unnecessary-0.037
 invention-0.035
lor-0.035
-0.035
Token pushed
Token ID7121
Feature activation+5.50e-04

Pos logit contributions
 sometimes+0.022
 Mom+0.022
 from+0.021
ily+0.021
 Sara+0.021
Neg logit contributions
 unnecessary-0.023
 invention-0.022
zed-0.022
izer-0.022
bek-0.022
Token with
Token ID351
Feature activation+5.10e-03

Pos logit contributions
 unnecessary+0.001
izer+0.001
isel+0.001
 invention+0.001
 precautions+0.001
Neg logit contributions
 sometimes-0.001
 pand-0.001
 Mom-0.001
 pengu-0.001
 Flor-0.001
Token his
Token ID465
Feature activation+1.19e-03

Pos logit contributions
 unnecessary+0.011
izer+0.011
lor+0.011
 invention+0.010
bek+0.010
Neg logit contributions
 sometimes-0.005
 Mom-0.005
 Sara-0.004
 from-0.004
 L-0.004
Token the
Token ID262
Feature activation-9.79e-03

Pos logit contributions
 sometimes+0.019
 Mom+0.017
ily+0.016
 from+0.016
 L+0.016
Neg logit contributions
izer-0.038
 unnecessary-0.038
 avail-0.036
 invention-0.036
aman-0.035
Token bird
Token ID6512
Feature activation+7.06e-03

Pos logit contributions
 sometimes+0.012
 Mom+0.011
 Sara+0.011
 from+0.011
 L+0.010
Neg logit contributions
 unnecessary-0.019
izer-0.018
lor-0.018
bek-0.018
 invention-0.018
Token and
Token ID290
Feature activation-8.08e-03

Pos logit contributions
 unnecessary+0.016
izer+0.016
lor+0.015
 invention+0.015
bek+0.015
Neg logit contributions
 sometimes-0.006
 Mom-0.005
 Sara-0.005
 from-0.005
 L-0.005
Token jumps
Token ID18045
Feature activation-2.41e-02

Pos logit contributions
 sometimes+0.011
 Mom+0.011
 Sara+0.010
 L+0.010
 Tom+0.010
Neg logit contributions
izer-0.014
 unnecessary-0.014
lor-0.014
bek-0.014
 invention-0.014
Token on
Token ID319
Feature activation-6.50e-03

Pos logit contributions
 Mom+0.028
 pengu+0.027
 pand+0.027
 sometimes+0.025
ents+0.025
Neg logit contributions
 unnecessary-0.039
 AW-0.038
izer-0.038
 precautions-0.037
antly-0.037
Token it
Token ID340
Feature activation-5.15e-02

Pos logit contributions
 sometimes+0.007
 Mom+0.006
ily+0.006
 from+0.006
 Sara+0.006
Neg logit contributions
izer-0.013
 unnecessary-0.013
 invention-0.012
lor-0.012
 avail-0.012
Token.
Token ID13
Feature activation-9.84e-03

Pos logit contributions
 pand+0.058
 pengu+0.057
 Mom+0.057
 Sara+0.054
 Mr+0.054
Neg logit contributions
 lookout-0.084
 technique-0.081
antly-0.080
zed-0.080
 invention-0.079
Token But
Token ID887
Feature activation-4.07e-03

Pos logit contributions
 sometimes+0.010
ily+0.009
 pand+0.009
 from+0.009
 Flor+0.008
Neg logit contributions
izer-0.018
 unnecessary-0.018
aman-0.018
-0.018
zed-0.018
Token the
Token ID262
Feature activation-8.04e-04

Pos logit contributions
 pand+0.004
ily+0.004
 sometimes+0.004
 Flor+0.004
 pengu+0.004
Neg logit contributions
 unnecessary-0.007
zed-0.007
 avail-0.007
izer-0.007
aman-0.007
Token bird
Token ID6512
Feature activation+2.51e-02

Pos logit contributions
 sometimes+0.001
 Mom+0.001
 Sara+0.001
 from+0.001
 L+0.001
Neg logit contributions
 unnecessary-0.001
izer-0.001
lor-0.001
 invention-0.001
bek-0.001
Token is
Token ID318
Feature activation-1.85e-02

Pos logit contributions
 unnecessary+0.045
izer+0.044
lor+0.042
 invention+0.042
bek+0.042
Neg logit contributions
 sometimes-0.031
 Mom-0.031
 Sara-0.029
 from-0.029
 L-0.028
Token threw
Token ID9617
Feature activation+4.51e-03

Pos logit contributions
 Mom+0.001
 sometimes+0.001
 Sara+0.001
 from+0.001
 Z+0.001
Neg logit contributions
izer-0.002
 unnecessary-0.002
 invention-0.002
lor-0.002
bek-0.002
Token the
Token ID262
Feature activation-3.98e-03

Pos logit contributions
 unnecessary+0.008
izer+0.008
 tool+0.007
isel+0.007
 cast+0.007
Neg logit contributions
 sometimes-0.005
ily-0.005
 pand-0.005
 Flor-0.005
 Mom-0.005
Token ball
Token ID2613
Feature activation-1.02e-02

Pos logit contributions
 sometimes+0.006
 Mom+0.006
 Sara+0.006
 L+0.005
 from+0.005
Neg logit contributions
 unnecessary-0.006
izer-0.006
 invention-0.006
lor-0.006
bek-0.006
Token up
Token ID510
Feature activation-3.83e-03

Pos logit contributions
 Sara+0.011
 Mom+0.011
 Z+0.011
 pand+0.011
 L+0.011
Neg logit contributions
 unnecessary-0.017
 vantage-0.017
 tool-0.017
izer-0.017
 lookout-0.016
Token and
Token ID290
Feature activation-2.17e-02

Pos logit contributions
 sometimes+0.004
 Mom+0.003
 Sara+0.003
ily+0.003
 Z+0.003
Neg logit contributions
izer-0.008
 unnecessary-0.008
 invention-0.007
bek-0.007
aman-0.007
Token it
Token ID340
Feature activation-5.14e-02

Pos logit contributions
ily+0.030
 sometimes+0.030
 Mom+0.029
 pand+0.028
 Sara+0.028
Neg logit contributions
 unnecessary-0.032
izer-0.032
 invention-0.031
 avail-0.031
bek-0.031
Token came
Token ID1625
Feature activation-1.14e-02

Pos logit contributions
 Mom+0.083
 Sara+0.079
 sometimes+0.077
 Z+0.077
 pand+0.076
Neg logit contributions
zed-0.066
 unnecessary-0.063
 lookout-0.063
 cast-0.063
izer-0.062
Token down
Token ID866
Feature activation-2.34e-02

Pos logit contributions
 sometimes+0.016
 Mom+0.015
 Sara+0.015
ily+0.014
 L+0.014
Neg logit contributions
 unnecessary-0.019
izer-0.018
 invention-0.017
 lookout-0.017
bek-0.017
Token.
Token ID13
Feature activation-1.38e-02

Pos logit contributions
 sometimes+0.018
 Mom+0.018
ily+0.017
 Sara+0.017
 pand+0.016
Neg logit contributions
izer-0.048
 unnecessary-0.048
-0.047
bek-0.046
 invention-0.046
Token It
Token ID632
Feature activation-2.78e-02

Pos logit contributions
 sometimes+0.016
 from+0.014
ily+0.014
 pand+0.014
 alone+0.013
Neg logit contributions
izer-0.025
-0.024
 invention-0.024
isel-0.024
 unnecessary-0.024
Token fell
Token ID3214
Feature activation-2.57e-02

Pos logit contributions
 Mom+0.040
 sometimes+0.038
 Sara+0.038
ily+0.038
 pand+0.037
Neg logit contributions
izer-0.039
 cast-0.038
zed-0.038
 invention-0.037
 unnecessary-0.036
Token first
Token ID717
Feature activation-7.83e-03

Pos logit contributions
izer+0.019
 unnecessary+0.018
 invention+0.018
aman+0.018
lor+0.018
Neg logit contributions
 Mom-0.009
 sometimes-0.008
 Sara-0.008
 Z-0.008
ily-0.008
Token.
Token ID13
Feature activation-7.79e-03

Pos logit contributions
 sometimes+0.005
 Mom+0.005
 Sara+0.004
 L+0.004
 Z+0.004
Neg logit contributions
 unnecessary-0.018
izer-0.018
 invention-0.017
lor-0.017
aman-0.017
Token She
Token ID1375
Feature activation+1.64e-04

Pos logit contributions
 sometimes+0.007
 from+0.006
ily+0.006
 pengu+0.006
 pand+0.006
Neg logit contributions
 unnecessary-0.016
izer-0.015
 invention-0.015
-0.015
-0.015
Token jumps
Token ID18045
Feature activation-2.02e-02

Pos logit contributions
 unnecessary+0.000
izer+0.000
 invention+0.000
aman+0.000
zed+0.000
Neg logit contributions
 sometimes-0.000
 Mom-0.000
 from-0.000
ily-0.000
 Sara-0.000
Token on
Token ID319
Feature activation-2.55e-03

Pos logit contributions
 pengu+0.026
 pand+0.025
 Mom+0.025
 sometimes+0.025
 Sara+0.023
Neg logit contributions
 unnecessary-0.030
izer-0.030
antly-0.029
zed-0.029
 precautions-0.029
Token it
Token ID340
Feature activation-5.13e-02

Pos logit contributions
 Mom+0.002
 sometimes+0.002
 Sara+0.002
 Tom+0.001
 Anna+0.001
Neg logit contributions
 unnecessary-0.006
izer-0.006
lor-0.006
bek-0.006
 invention-0.006
Token and
Token ID290
Feature activation-8.38e-03

Pos logit contributions
 pand+0.061
 pengu+0.059
ily+0.059
 Mom+0.059
 Sara+0.058
Neg logit contributions
 invention-0.080
 lookout-0.080
 technique-0.079
zed-0.078
lor-0.078
Token says
Token ID1139
Feature activation+3.95e-03

Pos logit contributions
 sometimes+0.012
 Mom+0.012
ily+0.011
 from+0.011
 Sara+0.011
Neg logit contributions
 unnecessary-0.013
izer-0.012
 invention-0.012
 avail-0.012
lor-0.012
Token,
Token ID11
Feature activation-6.79e-03

Pos logit contributions
 unnecessary+0.008
zed+0.008
lor+0.008
+0.007
ously+0.007
Neg logit contributions
 sometimes-0.004
 Mom-0.003
 alone-0.003
 from-0.003
 L-0.003
Token "
Token ID366
Feature activation-1.27e-02

Pos logit contributions
 sometimes+0.008
 Mom+0.008
ily+0.008
 from+0.008
 Sara+0.008
Neg logit contributions
 unnecessary-0.011
izer-0.011
-0.011
-0.011
 lookout-0.011
TokenThis
Token ID1212
Feature activation-8.91e-03

Pos logit contributions
 sometimes+0.019
 Mom+0.019
 from+0.018
 Sara+0.018
 L+0.018
Neg logit contributions
 unnecessary-0.020
izer-0.019
lor-0.018
bek-0.018
aman-0.018
Token the
Token ID262
Feature activation-1.08e-02

Pos logit contributions
 sometimes+0.015
 Mom+0.013
 from+0.012
ily+0.012
 Sara+0.012
Neg logit contributions
izer-0.039
 unnecessary-0.038
 avail-0.037
 invention-0.037
aman-0.037
Token bench
Token ID7624
Feature activation-1.73e-02

Pos logit contributions
 sometimes+0.017
 Mom+0.016
 from+0.016
ily+0.016
 L+0.016
Neg logit contributions
 unnecessary-0.015
izer-0.014
 lookout-0.014
lor-0.014
aman-0.014
Token and
Token ID290
Feature activation-1.73e-02

Pos logit contributions
 pand+0.022
 Qu+0.020
 unic+0.020
 Z+0.020
 pengu+0.019
Neg logit contributions
ously-0.026
-0.026
izer-0.025
 cast-0.025
 tool-0.024
Token climbs
Token ID34332
Feature activation-7.95e-03

Pos logit contributions
 sometimes+0.024
 Mom+0.023
 Sara+0.023
 L+0.022
 from+0.022
Neg logit contributions
izer-0.030
 unnecessary-0.030
lor-0.030
bek-0.029
 invention-0.029
Token on
Token ID319
Feature activation-7.38e-03

Pos logit contributions
 sometimes+0.008
 Mom+0.008
 Sara+0.007
ily+0.007
 L+0.007
Neg logit contributions
izer-0.016
 unnecessary-0.016
lor-0.015
 invention-0.015
aman-0.015
Token it
Token ID340
Feature activation-5.13e-02

Pos logit contributions
 sometimes+0.007
 Mom+0.006
ily+0.006
 from+0.006
 Sara+0.006
Neg logit contributions
izer-0.015
 unnecessary-0.015
 avail-0.014
lor-0.014
 invention-0.014
Token too
Token ID1165
Feature activation-2.95e-02

Pos logit contributions
 pengu+0.060
 pand+0.060
 Mom+0.057
 Mr+0.055
 Qu+0.053
Neg logit contributions
-0.078
lor-0.077
 technique-0.076
 lookout-0.076
 precautions-0.076
Token.
Token ID13
Feature activation-1.03e-02

Pos logit contributions
 pand+0.029
 pengu+0.028
 sometimes+0.028
 Qu+0.027
 Mom+0.027
Neg logit contributions
 unnecessary-0.054
 precautions-0.053
-0.051
 avail-0.051
izer-0.051
Token He
Token ID679
Feature activation-7.48e-03

Pos logit contributions
 sometimes+0.012
 pand+0.012
 from+0.011
 alone+0.010
 pengu+0.010
Neg logit contributions
izer-0.017
zed-0.017
-0.017
isel-0.017
 unnecessary-0.016
Token also
Token ID635
Feature activation-1.16e-02

Pos logit contributions
 sometimes+0.011
 Mom+0.011
ily+0.010
 from+0.010
 Sara+0.010
Neg logit contributions
 unnecessary-0.011
izer-0.011
 invention-0.011
lor-0.011
aman-0.010
Token reaches
Token ID12229
Feature activation-1.28e-02

Pos logit contributions
 sometimes+0.015
 Mom+0.014
 Sara+0.014
 L+0.013
 from+0.013
Neg logit contributions
izer-0.022
 unnecessary-0.021
lor-0.021
bek-0.021
 invention-0.020
Token the
Token ID262
Feature activation-1.08e-02

Pos logit contributions
 sometimes+0.020
 Mom+0.018
ily+0.017
 from+0.017
 Sara+0.016
Neg logit contributions
izer-0.039
 unnecessary-0.039
 avail-0.037
 invention-0.036
aman-0.036
Token bird
Token ID6512
Feature activation+7.68e-03

Pos logit contributions
 sometimes+0.013
 Mom+0.013
 Sara+0.012
 from+0.012
 L+0.012
Neg logit contributions
 unnecessary-0.020
izer-0.020
lor-0.019
 invention-0.019
bek-0.019
Token and
Token ID290
Feature activation-8.88e-03

Pos logit contributions
 unnecessary+0.017
izer+0.017
lor+0.016
 invention+0.016
bek+0.016
Neg logit contributions
 sometimes-0.006
 Mom-0.006
 Sara-0.006
 from-0.005
 L-0.005
Token jumps
Token ID18045
Feature activation-2.37e-02

Pos logit contributions
 sometimes+0.012
 Mom+0.012
 Sara+0.012
 L+0.012
 Tom+0.011
Neg logit contributions
izer-0.016
 unnecessary-0.015
lor-0.015
bek-0.015
 invention-0.015
Token on
Token ID319
Feature activation-5.70e-03

Pos logit contributions
 Mom+0.028
 pengu+0.027
 pand+0.027
 sometimes+0.025
ents+0.025
Neg logit contributions
 unnecessary-0.038
izer-0.037
 AW-0.037
antly-0.036
 precautions-0.036
Token it
Token ID340
Feature activation-5.11e-02

Pos logit contributions
 sometimes+0.006
 Mom+0.006
 from+0.005
 Sara+0.005
ily+0.005
Neg logit contributions
izer-0.011
 unnecessary-0.011
 invention-0.011
lor-0.011
 avail-0.011
Token.
Token ID13
Feature activation-9.61e-03

Pos logit contributions
 pand+0.058
 pengu+0.057
 Mom+0.056
 Sara+0.054
 Mr+0.054
Neg logit contributions
 lookout-0.082
 technique-0.080
antly-0.080
 vantage-0.079
 invention-0.079
Token But
Token ID887
Feature activation-5.71e-03

Pos logit contributions
 sometimes+0.010
ily+0.009
 pand+0.009
 from+0.009
 alone+0.008
Neg logit contributions
izer-0.018
 unnecessary-0.018
zed-0.018
aman-0.018
-0.018
Token the
Token ID262
Feature activation-1.74e-03

Pos logit contributions
 pand+0.006
ily+0.006
 sometimes+0.005
 pengu+0.005
 Flor+0.005
Neg logit contributions
 unnecessary-0.010
zed-0.010
 avail-0.010
izer-0.010
 cast-0.010
Token bird
Token ID6512
Feature activation+2.55e-02

Pos logit contributions
 sometimes+0.002
 Mom+0.002
 Sara+0.002
 from+0.002
 L+0.002
Neg logit contributions
 unnecessary-0.003
izer-0.003
 invention-0.003
lor-0.003
 lookout-0.003
Token is
Token ID318
Feature activation-1.83e-02

Pos logit contributions
 unnecessary+0.045
izer+0.044
lor+0.042
 invention+0.042
bek+0.041
Neg logit contributions
 Mom-0.032
 sometimes-0.032
 Sara-0.030
 from-0.030
 L-0.029
Token a
Token ID257
Feature activation-1.77e-02

Pos logit contributions
 sometimes+0.006
 Z+0.006
 alone+0.005
 Mom+0.005
ily+0.005
Neg logit contributions
 unnecessary-0.010
izer-0.010
 invention-0.009
zed-0.009
asha-0.009
Token dough
Token ID15756
Feature activation-2.30e-02

Pos logit contributions
 sometimes+0.023
ily+0.021
 Mom+0.020
 Z+0.020
 Sara+0.020
Neg logit contributions
izer-0.031
 unnecessary-0.031
 invention-0.030
 lookout-0.029
bek-0.029
Token.
Token ID13
Feature activation-9.26e-03

Pos logit contributions
 Z+0.022
 Mom+0.021
 sometimes+0.021
ily+0.021
 Sara+0.020
Neg logit contributions
izer-0.044
 invention-0.043
 tool-0.043
 cast-0.041
 lookout-0.041
Token He
Token ID679
Feature activation-1.38e-02

Pos logit contributions
 sometimes+0.010
 from+0.008
 pand+0.008
 alone+0.008
 Mom+0.008
Neg logit contributions
izer-0.018
-0.018
 unnecessary-0.018
lor-0.017
aman-0.017
Token rolls
Token ID14088
Feature activation-1.98e-02

Pos logit contributions
 Mom+0.019
 sometimes+0.019
 from+0.019
 alone+0.019
ily+0.019
Neg logit contributions
 invention-0.019
 unnecessary-0.019
izer-0.019
isting-0.019
 cast-0.019
Token it
Token ID340
Feature activation-5.11e-02

Pos logit contributions
 sometimes+0.022
 Mom+0.021
 pand+0.021
ily+0.021
 Z+0.020
Neg logit contributions
izer-0.036
 unnecessary-0.034
 invention-0.033
antly-0.033
-0.033
Token with
Token ID351
Feature activation-1.91e-02

Pos logit contributions
 Mom+0.061
 Z+0.058
 pand+0.058
 Mr+0.057
 Sara+0.057
Neg logit contributions
 vantage-0.081
 lookout-0.079
ig-0.079
antly-0.079
 technique-0.079
Token a
Token ID257
Feature activation-2.03e-02

Pos logit contributions
 sometimes+0.025
 pand+0.023
 Z+0.023
ily+0.023
 from+0.022
Neg logit contributions
 unnecessary-0.029
 tool-0.029
izer-0.029
isel-0.029
 cast-0.028
Token rolling
Token ID10708
Feature activation+5.11e-03

Pos logit contributions
 Mom+0.025
 sometimes+0.024
 Sara+0.024
 from+0.023
 L+0.023
Neg logit contributions
 unnecessary-0.038
izer-0.037
lor-0.036
bek-0.036
 avail-0.035
Token pin
Token ID6757
Feature activation-1.13e-02

Pos logit contributions
 unnecessary+0.009
lor+0.009
bek+0.009
 avail+0.008
Matthew+0.008
Neg logit contributions
 Mom-0.007
 sometimes-0.007
 Sara-0.007
 from-0.007
 L-0.007
Token.
Token ID13
Feature activation-1.02e-02

Pos logit contributions
 Mom+0.010
 sometimes+0.009
 Sara+0.009
 Mr+0.008
 Z+0.008
Neg logit contributions
 unnecessary-0.023
izer-0.023
lor-0.023
 invention-0.022
zed-0.022
Token,
Token ID11
Feature activation-2.09e-02

Pos logit contributions
 Mom+0.009
 sometimes+0.009
 Z+0.008
 L+0.008
 Sara+0.008
Neg logit contributions
izer-0.022
 invention-0.022
 unnecessary-0.022
lor-0.021
-0.021
Token or
Token ID393
Feature activation-1.91e-02

Pos logit contributions
 alone+0.023
 Z+0.022
uzz+0.022
 pand+0.022
ily+0.021
Neg logit contributions
 unnecessary-0.034
izer-0.033
 invention-0.033
 tool-0.033
 tractor-0.032
Token toys
Token ID14958
Feature activation-2.94e-02

Pos logit contributions
ily+0.021
 Z+0.020
 Mom+0.020
 L+0.019
 Sara+0.019
Neg logit contributions
 unnecessary-0.035
 invention-0.031
 tool-0.031
 lookout-0.030
 avail-0.030
Token,
Token ID11
Feature activation-2.31e-03

Pos logit contributions
 Mom+0.028
 L+0.027
 Z+0.027
 Qu+0.026
 Flor+0.025
Neg logit contributions
 unnecessary-0.054
-0.054
 tool-0.053
 invention-0.053
lor-0.053
Token or
Token ID393
Feature activation-7.37e-03

Pos logit contributions
 Mom+0.002
 sometimes+0.002
ily+0.002
 Z+0.002
 Sara+0.002
Neg logit contributions
 unnecessary-0.004
izer-0.004
 invention-0.004
lor-0.004
-0.004
Token books
Token ID3835
Feature activation-5.08e-02

Pos logit contributions
ily+0.008
 Mom+0.008
 Z+0.008
 L+0.008
 sometimes+0.008
Neg logit contributions
 unnecessary-0.013
 invention-0.012
 tool-0.012
earned-0.012
izer-0.012
Token,"
Token ID553
Feature activation-4.20e-03

Pos logit contributions
 pand+0.054
 Mom+0.054
 Qu+0.053
 L+0.053
 Z+0.053
Neg logit contributions
-0.089
lor-0.088
izer-0.088
isting-0.088
aman-0.085
Token Max
Token ID5436
Feature activation-2.16e-02

Pos logit contributions
 from+0.005
 sometimes+0.005
ily+0.005
 pand+0.004
asp+0.004
Neg logit contributions
 avail-0.007
 unnecessary-0.007
 invention-0.007
-0.007
izer-0.007
Token said
Token ID531
Feature activation+8.19e-03

Pos logit contributions
 pand+0.024
 alone+0.024
 Sara+0.022
 pengu+0.022
 Mom+0.022
Neg logit contributions
andra-0.038
asha-0.038
 invention-0.036
Matthew-0.035
aman-0.035
Token.
Token ID13
Feature activation-7.14e-03

Pos logit contributions
 unnecessary+0.016
izer+0.016
ously+0.016
 invention+0.016
ipping+0.015
Neg logit contributions
 pand-0.007
 sometimes-0.007
 Z-0.007
 pengu-0.007
 Mom-0.006
Token\n
Token ID198
Feature activation+6.55e-03

Pos logit contributions
 sometimes+0.008
 Mom+0.007
 pand+0.007
 from+0.007
 Sara+0.007
Neg logit contributions
izer-0.013
 unnecessary-0.013
lor-0.013
-0.013
 invention-0.013
Token catches
Token ID17591
Feature activation-1.68e-02

Pos logit contributions
 Mom+0.011
ily+0.011
uzz+0.010
 Sara+0.010
 pand+0.010
Neg logit contributions
 unnecessary-0.010
 cast-0.010
izer-0.010
isting-0.009
 AW-0.009
Token the
Token ID262
Feature activation-1.19e-02

Pos logit contributions
 sometimes+0.022
ily+0.021
 pand+0.020
 Mom+0.020
uzz+0.020
Neg logit contributions
 unnecessary-0.027
izer-0.026
 invention-0.024
 avail-0.024
ipping-0.023
Token ball
Token ID2613
Feature activation-2.63e-02

Pos logit contributions
 sometimes+0.019
 Mom+0.018
 Sara+0.017
 Z+0.017
ily+0.017
Neg logit contributions
izer-0.017
 unnecessary-0.017
 invention-0.016
 tool-0.015
 lookout-0.015
Token and
Token ID290
Feature activation-1.54e-02

Pos logit contributions
 Z+0.028
 Mom+0.028
 pand+0.027
 Sara+0.027
 Flor+0.027
Neg logit contributions
izer-0.045
 cast-0.044
 tool-0.043
ipping-0.043
-0.043
Token throws
Token ID12542
Feature activation-8.14e-03

Pos logit contributions
 sometimes+0.023
 Mom+0.022
 Sara+0.021
 from+0.021
 L+0.021
Neg logit contributions
 unnecessary-0.024
izer-0.024
lor-0.023
 invention-0.023
bek-0.022
Token it
Token ID340
Feature activation-5.07e-02

Pos logit contributions
 sometimes+0.011
 Mom+0.011
 pand+0.011
 Z+0.010
 from+0.010
Neg logit contributions
 unnecessary-0.012
izer-0.012
 invention-0.011
isel-0.011
 avail-0.011
Token back
Token ID736
Feature activation-1.26e-02

Pos logit contributions
 Mom+0.056
 pand+0.052
 Sara+0.052
 Mr+0.052
 Z+0.051
Neg logit contributions
 unnecessary-0.088
 AW-0.085
 vantage-0.083
izer-0.083
 lookout-0.082
Token to
Token ID284
Feature activation-2.39e-02

Pos logit contributions
 Mom+0.011
 sometimes+0.011
 Sara+0.010
ily+0.010
 L+0.010
Neg logit contributions
 unnecessary-0.026
izer-0.025
 invention-0.024
aman-0.024
lor-0.024
Token Lily
Token ID20037
Feature activation-2.74e-02

Pos logit contributions
 sometimes+0.013
 Mom+0.011
 from+0.010
ily+0.009
 Sara+0.009
Neg logit contributions
izer-0.059
 unnecessary-0.059
 avail-0.057
 invention-0.057
 lookout-0.056
Token.
Token ID13
Feature activation-2.07e-02

Pos logit contributions
 Mom+0.025
 sometimes+0.023
ily+0.023
 Sara+0.023
 pand+0.022
Neg logit contributions
izer-0.056
 unnecessary-0.054
 invention-0.053
ipping-0.051
 lookout-0.051
Token They
Token ID1119
Feature activation-1.13e-02

Pos logit contributions
 sometimes+0.021
 pand+0.020
 from+0.019
 pengu+0.018
ily+0.018
Neg logit contributions
-0.039
izer-0.037
-0.035
 unnecessary-0.035
lor-0.035
Token She
Token ID1375
Feature activation-1.33e-02

Pos logit contributions
 sometimes+0.008
 pengu+0.007
 pand+0.007
 from+0.007
 alone+0.007
Neg logit contributions
-0.016
 unnecessary-0.016
izer-0.015
-0.015
Michelle-0.015
Token pressed
Token ID12070
Feature activation-3.10e-04

Pos logit contributions
 sometimes+0.016
 Mom+0.016
 Sara+0.015
 from+0.015
ily+0.014
Neg logit contributions
izer-0.023
 unnecessary-0.023
bek-0.022
 invention-0.022
lor-0.022
Token the
Token ID262
Feature activation+2.83e-03

Pos logit contributions
 pand+0.000
 sometimes+0.000
 pengu+0.000
ily+0.000
 Qu+0.000
Neg logit contributions
 unnecessary-0.001
antly-0.001
 precautions-0.000
 invention-0.000
 tool-0.000
Token whistle
Token ID16121
Feature activation+9.51e-03

Pos logit contributions
 unnecessary+0.005
izer+0.005
lor+0.005
bek+0.005
aman+0.005
Neg logit contributions
 sometimes-0.003
 Mom-0.003
 from-0.003
 Sara-0.003
 L-0.003
Token and
Token ID290
Feature activation-2.01e-02

Pos logit contributions
izer+0.019
 unnecessary+0.019
 invention+0.018
lor+0.018
zed+0.018
Neg logit contributions
 Mom-0.009
 sometimes-0.009
 Sara-0.008
 pand-0.008
ily-0.008
Token it
Token ID340
Feature activation-5.02e-02

Pos logit contributions
 sometimes+0.024
 Mom+0.024
ily+0.023
 pand+0.022
 Sara+0.022
Neg logit contributions
 unnecessary-0.034
izer-0.033
bek-0.032
 invention-0.032
 avail-0.032
Token made
Token ID925
Feature activation-8.24e-03

Pos logit contributions
 Mom+0.071
 Sara+0.069
 sometimes+0.067
 Mr+0.066
ily+0.066
Neg logit contributions
zed-0.072
izer-0.071
 unnecessary-0.071
antly-0.070
 invention-0.070
Token a
Token ID257
Feature activation-1.83e-02

Pos logit contributions
 sometimes+0.010
 alone+0.008
 Qu+0.008
 pand+0.008
 Mom+0.008
Neg logit contributions
 unnecessary-0.014
izer-0.013
 invention-0.013
zed-0.013
 avail-0.013
Token loud
Token ID7812
Feature activation-1.11e-02

Pos logit contributions
 sometimes+0.020
 Mom+0.019
 Sara+0.018
ily+0.018
 from+0.018
Neg logit contributions
 unnecessary-0.036
izer-0.035
 invention-0.034
lor-0.034
bek-0.033
Token sound
Token ID2128
Feature activation-5.68e-03

Pos logit contributions
 sometimes+0.013
 Mom+0.013
ily+0.012
 Sara+0.012
 Z+0.011
Neg logit contributions
izer-0.021
 unnecessary-0.020
 invention-0.020
 lookout-0.019
lor-0.019
Token.
Token ID13
Feature activation-9.97e-03

Pos logit contributions
 Mom+0.005
 sometimes+0.005
 Sara+0.005
ily+0.005
 Qu+0.004
Neg logit contributions
 unnecessary-0.011
izer-0.011
zed-0.011
 invention-0.010
 lookout-0.010
Token the
Token ID262
Feature activation-8.18e-03

Pos logit contributions
 sometimes+0.014
 Mom+0.014
 Sara+0.012
 from+0.012
 L+0.012
Neg logit contributions
 unnecessary-0.050
izer-0.050
lor-0.049
 invention-0.048
bek-0.048
Token piano
Token ID19132
Feature activation-1.30e-02

Pos logit contributions
 sometimes+0.014
 Mom+0.014
 from+0.014
 Sara+0.013
 Anna+0.013
Neg logit contributions
izer-0.011
 unnecessary-0.011
lor-0.010
bek-0.010
 invention-0.010
Token and
Token ID290
Feature activation-8.46e-03

Pos logit contributions
 Mom+0.009
 sometimes+0.009
 Sara+0.009
ily+0.008
 pand+0.008
Neg logit contributions
izer-0.028
 invention-0.028
 unnecessary-0.028
lor-0.027
aman-0.027
Token climbs
Token ID34332
Feature activation-5.59e-03

Pos logit contributions
 sometimes+0.010
 Mom+0.010
 Sara+0.009
 from+0.009
 L+0.009
Neg logit contributions
izer-0.016
 unnecessary-0.016
lor-0.015
 invention-0.015
bek-0.015
Token on
Token ID319
Feature activation-1.08e-02

Pos logit contributions
 sometimes+0.005
 Mom+0.005
 Sara+0.005
 from+0.005
 L+0.004
Neg logit contributions
 unnecessary-0.012
izer-0.012
lor-0.012
 invention-0.011
bek-0.011
Token it
Token ID340
Feature activation-5.01e-02

Pos logit contributions
 sometimes+0.010
 Mom+0.010
 Sara+0.009
 from+0.009
 L+0.008
Neg logit contributions
 unnecessary-0.023
izer-0.023
lor-0.022
 invention-0.022
bek-0.022
Token.
Token ID13
Feature activation-1.39e-02

Pos logit contributions
 pand+0.047
 pengu+0.047
 Mom+0.046
 sometimes+0.045
 Qu+0.044
Neg logit contributions
 invention-0.092
 lookout-0.090
lor-0.089
-0.089
 technique-0.089
Token Then
Token ID3244
Feature activation+5.29e-03

Pos logit contributions
 sometimes+0.015
 pand+0.014
 from+0.014
 pengu+0.013
 alone+0.012
Neg logit contributions
zed-0.025
-0.025
 unnecessary-0.024
-0.024
 lookout-0.024
Token she
Token ID673
Feature activation-1.57e-02

Pos logit contributions
 unnecessary+0.010
 avail+0.010
zed+0.009
+0.009
 precautions+0.009
Neg logit contributions
 sometimes-0.005
ily-0.005
 pengu-0.005
 from-0.005
uzz-0.005
Token reaches
Token ID12229
Feature activation-1.56e-02

Pos logit contributions
 sometimes+0.024
 Mom+0.023
 Sara+0.022
 from+0.022
ily+0.022
Neg logit contributions
 unnecessary-0.023
izer-0.023
 invention-0.022
lor-0.022
aman-0.021
Token for
Token ID329
Feature activation-1.49e-02

Pos logit contributions
 sometimes+0.013
 Mom+0.012
 Sara+0.011
 from+0.011
 L+0.011
Neg logit contributions
 unnecessary-0.035
izer-0.035
lor-0.034
 invention-0.034
bek-0.033
Token a
Token ID257
Feature activation-2.61e-02

Pos logit contributions
 sometimes+0.006
 Mom+0.006
ily+0.005
 Sara+0.005
 from+0.005
Neg logit contributions
izer-0.011
 unnecessary-0.010
 invention-0.010
lor-0.010
 lookout-0.010
Token doll
Token ID3654
Feature activation-2.28e-02

Pos logit contributions
 sometimes+0.028
 Mom+0.026
ily+0.026
 Sara+0.025
 from+0.025
Neg logit contributions
izer-0.051
 unnecessary-0.050
 invention-0.049
 lookout-0.048
bek-0.048
Token,
Token ID11
Feature activation-2.99e-03

Pos logit contributions
 Mom+0.022
 sometimes+0.022
ily+0.021
 Sara+0.020
 Mr+0.019
Neg logit contributions
izer-0.042
 invention-0.041
lor-0.041
 tool-0.041
-0.040
Token and
Token ID290
Feature activation-1.73e-02

Pos logit contributions
 sometimes+0.003
 Mom+0.003
 Sara+0.002
 from+0.002
 L+0.002
Neg logit contributions
 unnecessary-0.006
izer-0.006
lor-0.006
 invention-0.006
bek-0.006
Token a
Token ID257
Feature activation-2.19e-02

Pos logit contributions
 sometimes+0.016
 pand+0.016
 alone+0.014
ily+0.014
 Mom+0.014
Neg logit contributions
 unnecessary-0.032
 invention-0.031
izer-0.031
zed-0.031
 tool-0.031
Token book
Token ID1492
Feature activation-5.01e-02

Pos logit contributions
 sometimes+0.027
ily+0.025
 Mom+0.024
 from+0.024
 Sara+0.024
Neg logit contributions
izer-0.039
 unnecessary-0.038
 invention-0.038
 lookout-0.037
bek-0.037
Token.
Token ID13
Feature activation-1.22e-02

Pos logit contributions
 Qu+0.053
 Mom+0.053
 pand+0.052
 Nib+0.051
 sometimes+0.050
Neg logit contributions
lor-0.081
 tool-0.080
 invention-0.078
izer-0.078
 cast-0.076
Token
Token ID198
Feature activation-4.93e-03

Pos logit contributions
 sometimes+0.011
 Mom+0.010
 Sara+0.009
 from+0.009
 L+0.009
Neg logit contributions
 unnecessary-0.026
izer-0.026
lor-0.025
 invention-0.025
aman-0.025
Token
Token ID198
Feature activation-1.32e-03

Pos logit contributions
 sometimes+0.005
 pand+0.005
 alone+0.005
 from+0.005
 Mom+0.005
Neg logit contributions
izer-0.009
lor-0.009
 unnecessary-0.009
isting-0.009
aman-0.009
TokenThey
Token ID2990
Feature activation-1.68e-02

Pos logit contributions
 sometimes+0.002
 Mom+0.002
 Sara+0.001
ily+0.001
 from+0.001
Neg logit contributions
 unnecessary-0.003
izer-0.002
 invention-0.002
lor-0.002
 avail-0.002
Token were
Token ID547
Feature activation-1.61e-02

Pos logit contributions
 pand+0.024
 Sara+0.024
 from+0.023
 Mom+0.023
 sometimes+0.022
Neg logit contributions
bek-0.023
 managed-0.022
 cast-0.022
isel-0.021
 invention-0.021
Token next
Token ID1306
Feature activation-3.10e-03

Pos logit contributions
 pand+0.022
 Z+0.020
 Flor+0.020
 L+0.020
 Nib+0.020
Neg logit contributions
 precautions-0.032
 unnecessary-0.031
antly-0.030
ously-0.030
 situations-0.030
Token time
Token ID640
Feature activation+1.01e-02

Pos logit contributions
 sometimes+0.005
 Mom+0.005
 Sara+0.005
 from+0.005
ily+0.005
Neg logit contributions
 unnecessary-0.004
izer-0.004
lor-0.004
 invention-0.004
bek-0.004
Token.
Token ID13
Feature activation+2.22e-02

Pos logit contributions
 unnecessary+0.018
 precautions+0.017
isting+0.017
 invention+0.017
izer+0.016
Neg logit contributions
 pand-0.011
 Flor-0.010
 pengu-0.010
ily-0.010
 Nib-0.009
Token Now
Token ID2735
Feature activation+8.74e-04

Pos logit contributions
 unnecessary+0.040
izer+0.040
lor+0.039
 invention+0.038
bek+0.038
Neg logit contributions
 sometimes-0.027
 Mom-0.025
 from-0.024
 Sara-0.024
ily-0.024
Token,
Token ID11
Feature activation+4.87e-03

Pos logit contributions
 unnecessary+0.002
izer+0.002
 avail+0.002
aman+0.002
lor+0.002
Neg logit contributions
 sometimes-0.001
ily-0.001
 Mom-0.001
 Z-0.001
 L-0.001
Token let
Token ID1309
Feature activation+3.93e-02

Pos logit contributions
 unnecessary+0.008
izer+0.008
bek+0.008
aman+0.008
 invention+0.008
Neg logit contributions
 sometimes-0.006
 Z-0.006
ily-0.006
 Mom-0.006
 pand-0.005
Token us
Token ID514
Feature activation+1.16e-02

Pos logit contributions
 unnecessary+0.074
izer+0.073
lor+0.071
 invention+0.070
bek+0.069
Neg logit contributions
 sometimes-0.046
 Mom-0.045
 Sara-0.042
 from-0.041
 L-0.041
Token take
Token ID1011
Feature activation+2.69e-02

Pos logit contributions
 unnecessary+0.022
izer+0.022
bek+0.022
lor+0.021
aman+0.021
Neg logit contributions
 sometimes-0.014
 Mom-0.014
 from-0.014
 like-0.013
 Sara-0.012
Token the
Token ID262
Feature activation+6.06e-03

Pos logit contributions
izer+0.055
 unnecessary+0.055
 invention+0.053
lor+0.053
bek+0.053
Neg logit contributions
 Mom-0.027
 sometimes-0.027
 Sara-0.025
 from-0.024
 L-0.024
Token jar
Token ID17379
Feature activation+1.01e-02

Pos logit contributions
 unnecessary+0.009
izer+0.009
lor+0.009
 invention+0.009
bek+0.009
Neg logit contributions
 sometimes-0.009
 Mom-0.009
 Sara-0.009
 L-0.009
ily-0.009
Token down
Token ID866
Feature activation+5.78e-03

Pos logit contributions
izer+0.021
 unnecessary+0.021
lor+0.021
 invention+0.020
aman+0.020
Neg logit contributions
 sometimes-0.008
 Mom-0.008
 Z-0.008
 Sara-0.008
 L-0.008
Token.
Token ID13
Feature activation-4.77e-03

Pos logit contributions
izer+0.004
 unnecessary+0.004
 invention+0.004
lor+0.004
aman+0.004
Neg logit contributions
 sometimes-0.001
 Mom-0.001
 Sara-0.001
 L-0.001
ily-0.001
Token He
Token ID679
Feature activation+6.91e-03

Pos logit contributions
 sometimes+0.005
 from+0.005
ily+0.005
 pand+0.005
 Mom+0.005
Neg logit contributions
izer-0.009
 unnecessary-0.008
aman-0.008
-0.008
 invention-0.008
Token said
Token ID531
Feature activation+1.17e-02

Pos logit contributions
 unnecessary+0.015
izer+0.015
lor+0.014
 invention+0.014
bek+0.014
Neg logit contributions
 sometimes-0.006
 Mom-0.006
 Sara-0.006
 from-0.005
 L-0.005
Token,
Token ID11
Feature activation+4.54e-03

Pos logit contributions
 unnecessary+0.021
ously+0.020
zed+0.020
+0.020
antly+0.019
Neg logit contributions
 sometimes-0.012
 sill-0.011
 Mom-0.011
 Z-0.011
ents-0.011
Token "
Token ID366
Feature activation+2.47e-03

Pos logit contributions
izer+0.008
 unnecessary+0.008
lor+0.008
 invention+0.008
+0.008
Neg logit contributions
 sometimes-0.005
 Mom-0.005
ily-0.005
 Sara-0.005
 from-0.005
TokenLet
Token ID5756
Feature activation+3.92e-02

Pos logit contributions
 unnecessary+0.004
izer+0.004
lor+0.004
 invention+0.004
bek+0.004
Neg logit contributions
 sometimes-0.004
 Mom-0.004
 Sara-0.003
 from-0.003
 L-0.003
Token's
Token ID338
Feature activation+8.85e-03

Pos logit contributions
bek+0.091
 unnecessary+0.090
+0.089
izer+0.089
isel+0.088
Neg logit contributions
 Mom-0.035
 sometimes-0.034
 from-0.031
 Tom-0.031
 Sara-0.029
Token splash
Token ID22870
Feature activation+7.09e-03

Pos logit contributions
 unnecessary+0.016
izer+0.015
lor+0.015
bek+0.015
 invention+0.015
Neg logit contributions
 sometimes-0.012
 Mom-0.011
 from-0.011
 Sara-0.011
 L-0.010
Token the
Token ID262
Feature activation+9.28e-05

Pos logit contributions
 unnecessary+0.015
izer+0.015
 invention+0.014
isting+0.014
 precautions+0.014
Neg logit contributions
 Mom-0.006
 sometimes-0.006
 Sara-0.006
 Z-0.005
ily-0.005
Token water
Token ID1660
Feature activation+5.65e-03

Pos logit contributions
izer+0.000
 unnecessary+0.000
 lookout+0.000
 invention+0.000
isting+0.000
Neg logit contributions
 Mom-0.000
 sometimes-0.000
 Tom-0.000
 Anna-0.000
 Sara-0.000
Token and
Token ID290
Feature activation-8.53e-04

Pos logit contributions
izer+0.011
 unnecessary+0.011
 invention+0.011
lor+0.010
aman+0.010
Neg logit contributions
 Mom-0.006
 sometimes-0.005
 Sara-0.005
ily-0.005
 Z-0.005
Token should
Token ID815
Feature activation+2.49e-02

Pos logit contributions
 unnecessary+0.048
izer+0.046
 invention+0.046
isting+0.045
lor+0.045
Neg logit contributions
 Mom-0.036
 Sara-0.035
 sometimes-0.035
 L-0.034
 Z-0.034
Token never
Token ID1239
Feature activation+1.99e-02

Pos logit contributions
 unnecessary+0.048
izer+0.047
lor+0.045
 invention+0.045
bek+0.045
Neg logit contributions
 sometimes-0.029
 Mom-0.028
 Sara-0.026
 from-0.026
 L-0.025
Token be
Token ID307
Feature activation+1.06e-02

Pos logit contributions
 unnecessary+0.033
izer+0.032
 invention+0.031
 avail+0.031
 lookout+0.030
Neg logit contributions
 sometimes-0.027
 Mom-0.027
 Sara-0.026
 L-0.025
ily-0.025
Token selfish
Token ID20363
Feature activation-1.90e-04

Pos logit contributions
izer+0.020
 unnecessary+0.020
lor+0.020
 invention+0.019
aman+0.019
Neg logit contributions
 sometimes-0.012
 Mom-0.012
 Sara-0.011
 from-0.011
 L-0.011
Token.
Token ID13
Feature activation+3.95e-02

Pos logit contributions
 Mom+0.000
 sometimes+0.000
 Sara+0.000
 Z+0.000
 L+0.000
Neg logit contributions
 unnecessary-0.000
izer-0.000
 invention-0.000
lor-0.000
bek-0.000
Token When
Token ID1649
Feature activation+4.37e-02

Pos logit contributions
 unnecessary+0.078
izer+0.077
 invention+0.074
bek+0.073
aman+0.073
Neg logit contributions
 sometimes-0.044
 Mom-0.044
 Sara-0.040
 L-0.039
 from-0.039
Token we
Token ID356
Feature activation+4.60e-03

Pos logit contributions
izer+0.099
lor+0.095
aman+0.093
 unnecessary+0.091
achers+0.091
Neg logit contributions
 sometimes-0.046
 Mom-0.045
 from-0.039
 Sara-0.037
 L-0.036
Token share
Token ID2648
Feature activation+6.57e-03

Pos logit contributions
 unnecessary+0.008
izer+0.008
 invention+0.008
lor+0.008
bek+0.008
Neg logit contributions
 sometimes-0.006
 Mom-0.006
 Sara-0.005
 L-0.005
 Z-0.005
Token with
Token ID351
Feature activation+2.39e-02

Pos logit contributions
 unnecessary+0.013
izer+0.013
lor+0.013
 invention+0.013
bek+0.013
Neg logit contributions
 sometimes-0.007
 Mom-0.007
 Sara-0.006
 from-0.006
 L-0.006
Token our
Token ID674
Feature activation+2.67e-04

Pos logit contributions
izer+0.051
lor+0.050
aman+0.049
bek+0.049
+0.049
Neg logit contributions
 sometimes-0.025
 Mom-0.025
 from-0.022
 Sara-0.021
 L-0.020
Token friends
Token ID2460
Feature activation+2.62e-02

Pos logit contributions
izer+0.000
 unnecessary+0.000
lor+0.000
bek+0.000
 avail+0.000
Neg logit contributions
 sometimes-0.000
 Mom-0.000
 Sara-0.000
 L-0.000
 from-0.000
Token park
Token ID3952
Feature activation+1.42e-02

Pos logit contributions
 unnecessary+0.022
bek+0.022
 invention+0.021
 avail+0.021
ously+0.021
Neg logit contributions
 Mom-0.013
 sometimes-0.013
 Sara-0.012
 Tom-0.012
 alone-0.012
Token every
Token ID790
Feature activation+7.74e-03

Pos logit contributions
izer+0.027
 unnecessary+0.027
 lookout+0.027
zed+0.027
 invention+0.027
Neg logit contributions
 Mom-0.013
 sometimes-0.013
 Sara-0.013
ily-0.012
 L-0.012
Token day
Token ID1110
Feature activation-9.74e-04

Pos logit contributions
 unnecessary+0.017
lor+0.017
izer+0.017
 invention+0.016
bek+0.016
Neg logit contributions
 sometimes-0.008
 Mom-0.007
 Sara-0.006
 from-0.006
 alone-0.006
Token.
Token ID13
Feature activation-1.43e-02

Pos logit contributions
 sometimes+0.001
 Sara+0.001
 Mom+0.001
 Z+0.001
 pand+0.001
Neg logit contributions
 unnecessary-0.002
-0.002
izer-0.002
 invention-0.002
 lookout-0.002
Token They
Token ID1119
Feature activation-1.12e-02

Pos logit contributions
 sometimes+0.016
 from+0.016
 alone+0.015
 pengu+0.014
 angry+0.014
Neg logit contributions
-0.025
izer-0.025
zed-0.024
 unnecessary-0.024
-0.024
Token picked
Token ID6497
Feature activation+4.06e-02

Pos logit contributions
 Mom+0.015
 Sara+0.014
 from+0.014
ily+0.014
 sometimes+0.014
Neg logit contributions
zed-0.017
 lookout-0.017
izer-0.017
bek-0.016
asha-0.016
Token peanuts
Token ID45612
Feature activation+2.52e-02

Pos logit contributions
 unnecessary+0.087
izer+0.087
bek+0.087
lor+0.087
 invention+0.085
Neg logit contributions
 sometimes-0.037
 Mom-0.036
 Sara-0.034
 from-0.034
 L-0.032
Token and
Token ID290
Feature activation+9.30e-03

Pos logit contributions
izer+0.047
lor+0.045
+0.045
 unnecessary+0.045
asha+0.044
Neg logit contributions
 Mom-0.027
 sometimes-0.025
 L-0.025
 Sara-0.024
 Z-0.024
Token shared
Token ID4888
Feature activation+1.11e-02

Pos logit contributions
 unnecessary+0.018
izer+0.018
lor+0.018
 invention+0.017
aman+0.017
Neg logit contributions
 sometimes-0.011
 Mom-0.010
 Sara-0.010
 from-0.009
 L-0.009
Token them
Token ID606
Feature activation+1.79e-02

Pos logit contributions
izer+0.020
 unnecessary+0.020
lor+0.019
 invention+0.019
bek+0.019
Neg logit contributions
 sometimes-0.012
 Mom-0.012
ily-0.011
 Sara-0.011
 Z-0.011
Token with
Token ID351
Feature activation-4.89e-03

Pos logit contributions
izer+0.036
zed+0.035
 invention+0.035
lor+0.034
 unnecessary+0.034
Neg logit contributions
 sometimes-0.014
 L-0.014
 Mom-0.014
 Sara-0.014
ily-0.013
Token and
Token ID290
Feature activation+1.73e-02

Pos logit contributions
 unnecessary+0.051
izer+0.050
 invention+0.049
lor+0.048
isting+0.048
Neg logit contributions
 Mom-0.020
 Sara-0.020
 sometimes-0.020
 L-0.019
 Z-0.019
Token stay
Token ID2652
Feature activation+2.93e-02

Pos logit contributions
izer+0.036
lor+0.035
 unnecessary+0.034
aman+0.034
+0.033
Neg logit contributions
 sometimes-0.020
 Mom-0.019
 from-0.018
 Sara-0.017
 like-0.016
Token away
Token ID1497
Feature activation-5.92e-03

Pos logit contributions
lor+0.062
+0.062
aman+0.061
asha+0.059
bek+0.058
Neg logit contributions
 sometimes-0.033
 from-0.033
 alone-0.032
 like-0.031
 Mom-0.031
Token from
Token ID422
Feature activation-3.09e-03

Pos logit contributions
ily+0.003
 Mom+0.003
 Sara+0.003
 sometimes+0.002
 Z+0.002
Neg logit contributions
 unnecessary-0.015
 invention-0.014
izer-0.014
 precautions-0.013
 solution-0.013
Token people
Token ID661
Feature activation+2.57e-02

Pos logit contributions
 sometimes+0.003
 Mom+0.003
ily+0.003
 Sara+0.003
 from+0.003
Neg logit contributions
 unnecessary-0.006
izer-0.006
 avail-0.005
 invention-0.005
 lookout-0.005
Token who
Token ID508
Feature activation+3.86e-02

Pos logit contributions
 unnecessary+0.053
izer+0.052
lor+0.050
 invention+0.050
bek+0.049
Neg logit contributions
 Mom-0.024
 sometimes-0.024
 Sara-0.023
 L-0.022
ily-0.022
Token make
Token ID787
Feature activation+1.89e-02

Pos logit contributions
izer+0.079
lor+0.077
bek+0.075
 avail+0.075
aman+0.075
Neg logit contributions
 sometimes-0.047
 Mom-0.041
 from-0.040
 alone-0.039
 like-0.038
Token us
Token ID514
Feature activation+2.19e-02

Pos logit contributions
izer+0.040
+0.040
bek+0.040
lor+0.039
+0.038
Neg logit contributions
 Mom-0.020
 sometimes-0.020
 Sara-0.019
 from-0.019
 Lily-0.018
Token feel
Token ID1254
Feature activation+1.63e-02

Pos logit contributions
 unnecessary+0.042
izer+0.040
 invention+0.039
 lookout+0.038
isting+0.037
Neg logit contributions
 Mom-0.024
 sometimes-0.024
 Sara-0.023
ily-0.023
 L-0.023
Token uncomfortable
Token ID12916
Feature activation+3.26e-03

Pos logit contributions
izer+0.033
 unnecessary+0.032
lor+0.032
bek+0.031
 invention+0.031
Neg logit contributions
 sometimes-0.018
 Mom-0.017
 from-0.016
 Sara-0.016
 L-0.015
Token.
Token ID13
Feature activation+4.54e-03

Pos logit contributions
 unnecessary+0.006
 invention+0.006
 situations+0.006
ipping+0.006
izer+0.006
Neg logit contributions
 Z-0.003
 Mom-0.003
 Sara-0.003
 L-0.003
 Flor-0.003
Token she
Token ID673
Feature activation-1.46e-03

Pos logit contributions
 sometimes+0.004
 Mom+0.004
 Sara+0.004
 from+0.004
 L+0.004
Neg logit contributions
izer-0.019
 unnecessary-0.019
 invention-0.018
lor-0.018
bek-0.018
Token made
Token ID925
Feature activation+1.75e-03

Pos logit contributions
 sometimes+0.002
 Mom+0.002
 Sara+0.002
 from+0.002
 L+0.002
Neg logit contributions
 unnecessary-0.002
izer-0.002
 invention-0.002
lor-0.002
bek-0.002
Token it
Token ID340
Feature activation-1.32e-02

Pos logit contributions
bek+0.004
izer+0.004
lor+0.004
+0.004
earned+0.004
Neg logit contributions
 Mom-0.002
 sometimes-0.001
 Sara-0.001
 Tom-0.001
 L-0.001
Token to
Token ID284
Feature activation-1.78e-03

Pos logit contributions
 sometimes+0.013
 Mom+0.013
 from+0.013
 Sara+0.012
 L+0.012
Neg logit contributions
izer-0.027
 unnecessary-0.027
lor-0.026
bek-0.026
 invention-0.026
Token the
Token ID262
Feature activation-1.09e-02

Pos logit contributions
 Mom+0.001
 sometimes+0.001
 Sara+0.001
 from+0.001
 L+0.001
Neg logit contributions
 unnecessary-0.004
izer-0.004
lor-0.004
bek-0.004
 invention-0.004
Token beach
Token ID10481
Feature activation+1.82e-02

Pos logit contributions
 sometimes+0.015
 Mom+0.014
 Sara+0.013
 from+0.013
ily+0.013
Neg logit contributions
izer-0.019
 unnecessary-0.019
lor-0.019
bek-0.018
aman-0.018
Token safe
Token ID3338
Feature activation-8.37e-03

Pos logit contributions
 unnecessary+0.041
izer+0.041
lor+0.039
 invention+0.039
bek+0.039
Neg logit contributions
 sometimes-0.014
 Mom-0.013
 Sara-0.012
 from-0.012
 L-0.012
Token and
Token ID290
Feature activation+6.30e-03

Pos logit contributions
 sometimes+0.006
 Mom+0.006
 Sara+0.005
 from+0.005
 L+0.005
Neg logit contributions
 unnecessary-0.019
izer-0.019
lor-0.019
 invention-0.019
bek-0.018
Token sound
Token ID2128
Feature activation-8.13e-03

Pos logit contributions
 unnecessary+0.012
izer+0.012
lor+0.011
 invention+0.011
bek+0.011
Neg logit contributions
 sometimes-0.008
 Mom-0.007
 Sara-0.007
 from-0.007
 L-0.007
Token.
Token ID13
Feature activation+1.08e-04

Pos logit contributions
 sometimes+0.007
 Mom+0.007
 Sara+0.006
 L+0.006
 Z+0.006
Neg logit contributions
 unnecessary-0.018
izer-0.017
lor-0.017
 invention-0.017
aman-0.017
Token She
Token ID1375
Feature activation+6.40e-03

Pos logit contributions
 unnecessary+0.000
izer+0.000
lor+0.000
 invention+0.000
bek+0.000
Neg logit contributions
 sometimes-0.000
 Mom-0.000
 Sara-0.000
 from-0.000
 L-0.000
Token Ben
Token ID3932
Feature activation+1.96e-03

Pos logit contributions
 sometimes+0.002
asp+0.002
baby+0.002
uzz+0.002
 pand+0.002
Neg logit contributions
 unnecessary-0.003
 tool-0.003
 invention-0.003
ously-0.003
 cast-0.003
Token's
Token ID338
Feature activation-8.91e-03

Pos logit contributions
izer+0.003
 unnecessary+0.003
ipping+0.003
 invention+0.003
zed+0.003
Neg logit contributions
 Mom-0.002
ily-0.002
 sometimes-0.002
 Sara-0.002
 L-0.002
Token car
Token ID1097
Feature activation+6.93e-04

Pos logit contributions
 sometimes+0.013
ily+0.013
 Sara+0.013
 L+0.012
 Qu+0.012
Neg logit contributions
 invention-0.014
izer-0.013
 tool-0.013
 lookout-0.013
 unnecessary-0.012
Token lying
Token ID9105
Feature activation+2.03e-03

Pos logit contributions
izer+0.001
 unnecessary+0.001
 invention+0.001
lor+0.001
ipping+0.001
Neg logit contributions
 Mom-0.001
 sometimes-0.001
 Sara-0.001
ily-0.001
 L-0.001
Token on
Token ID319
Feature activation-3.04e-03

Pos logit contributions
 unnecessary+0.004
izer+0.004
lor+0.004
 invention+0.004
aman+0.004
Neg logit contributions
 sometimes-0.002
 Mom-0.002
 Sara-0.002
 L-0.002
 Z-0.002
Token the
Token ID262
Feature activation+1.23e-02

Pos logit contributions
 sometimes+0.003
ily+0.003
 Mom+0.003
 pand+0.002
 Z+0.002
Neg logit contributions
izer-0.006
 avail-0.006
 unnecessary-0.006
 invention-0.006
aman-0.006
Token floor
Token ID4314
Feature activation-1.56e-02

Pos logit contributions
 lookout+0.020
izer+0.018
 unnecessary+0.017
 tool+0.017
 invention+0.017
Neg logit contributions
 sometimes-0.018
 L-0.018
 Sara-0.017
 Mom-0.017
ents-0.017
Token,
Token ID11
Feature activation+5.60e-03

Pos logit contributions
 Mom+0.016
 pand+0.015
 Z+0.015
 Qu+0.015
 Sara+0.015
Neg logit contributions
ipping-0.028
-0.027
 lookout-0.027
ously-0.027
lor-0.027
Token next
Token ID1306
Feature activation+3.31e-03

Pos logit contributions
izer+0.011
 unnecessary+0.010
 lookout+0.010
aman+0.010
 invention+0.010
Neg logit contributions
ily-0.006
 Mom-0.005
 sometimes-0.005
 Sara-0.005
 pand-0.005
Token to
Token ID284
Feature activation-1.19e-02

Pos logit contributions
 unnecessary+0.006
izer+0.006
lor+0.006
 invention+0.006
bek+0.006
Neg logit contributions
 sometimes-0.004
 Mom-0.004
 Sara-0.003
 from-0.003
 L-0.003
Token a
Token ID257
Feature activation+6.29e-03

Pos logit contributions
 sometimes+0.009
 Mom+0.008
 from+0.008
 pand+0.008
 Z+0.008
Neg logit contributions
izer-0.025
 unnecessary-0.025
 lookout-0.024
 avail-0.024
zed-0.024
Token happy
Token ID3772
Feature activation+1.78e-03

Pos logit contributions
 Mom+0.006
 sometimes+0.006
 Sara+0.006
 from+0.006
 L+0.006
Neg logit contributions
 unnecessary-0.010
izer-0.010
lor-0.010
 invention-0.010
 avail-0.010
Token home
Token ID1363
Feature activation-1.00e-03

Pos logit contributions
 unnecessary+0.003
izer+0.003
lor+0.003
 invention+0.003
bek+0.003
Neg logit contributions
 sometimes-0.003
 Mom-0.002
 Sara-0.002
 from-0.002
 L-0.002
Token.
Token ID13
Feature activation-3.05e-03

Pos logit contributions
 sometimes+0.001
 Mom+0.001
 Sara+0.001
ily+0.001
 Z+0.001
Neg logit contributions
izer-0.002
 invention-0.002
 unnecessary-0.002
bek-0.002
lor-0.002
Token<|endoftext|>
Token ID50256
Feature activation+1.77e-03

Pos logit contributions
 sometimes+0.003
 Mom+0.003
 from+0.003
ily+0.002
 Sara+0.002
Neg logit contributions
izer-0.006
 unnecessary-0.006
lor-0.006
-0.006
aman-0.006
TokenOnce
Token ID7454
Feature activation-2.91e-02

Pos logit contributions
izer+0.003
lor+0.003
+0.003
 unnecessary+0.003
antly+0.003
Neg logit contributions
 sometimes-0.002
 Mom-0.002
 pand-0.002
 Sara-0.002
 from-0.002
Token upon
Token ID2402
Feature activation+1.16e-02

Pos logit contributions
 sometimes+0.043
 pand+0.040
 Z+0.039
 from+0.039
 sill+0.039
Neg logit contributions
aman-0.038
izer-0.037
 unnecessary-0.037
ig-0.037
 invention-0.037
Token a
Token ID257
Feature activation+2.28e-02

Pos logit contributions
 unnecessary+0.025
izer+0.025
lor+0.024
 invention+0.024
bek+0.023
Neg logit contributions
 sometimes-0.010
 Mom-0.010
 Sara-0.009
 from-0.009
 L-0.009
Token time
Token ID640
Feature activation+1.19e-02

Pos logit contributions
 unnecessary+0.035
lor+0.034
 invention+0.034
 avail+0.034
bek+0.034
Neg logit contributions
 Mom-0.036
 sometimes-0.035
 Sara-0.033
 Tom-0.033
 Anna-0.033
Token,
Token ID11
Feature activation-9.57e-03

Pos logit contributions
+0.022
izer+0.021
isting+0.021
aman+0.021
 unnecessary+0.021
Neg logit contributions
 sometimes-0.012
ents-0.011
 Mom-0.011
 Sara-0.011
 Z-0.011
Token there
Token ID612
Feature activation+1.02e-03

Pos logit contributions
 pand+0.010
 sometimes+0.010
ents+0.010
 Z+0.010
 Mom+0.009
Neg logit contributions
izer-0.017
 unnecessary-0.017
isting-0.016
 tool-0.016
antly-0.016
Token was
Token ID373
Feature activation+8.73e-03

Pos logit contributions
izer+0.002
zed+0.002
+0.002
lor+0.002
aman+0.002
Neg logit contributions
 sometimes-0.001
ents-0.001
 Mom-0.001
 from-0.001
 Sara-0.001
Token's
Token ID338
Feature activation+4.11e-03

Pos logit contributions
 sometimes+0.006
 Mom+0.005
 from+0.005
 Sara+0.005
 L+0.005
Neg logit contributions
 unnecessary-0.010
izer-0.010
lor-0.009
 invention-0.009
bek-0.009
Token why
Token ID1521
Feature activation-2.76e-03

Pos logit contributions
 unnecessary+0.008
izer+0.008
lor+0.008
 invention+0.008
bek+0.008
Neg logit contributions
 sometimes-0.004
 Mom-0.004
 Sara-0.004
 from-0.004
 L-0.004
Token E
Token ID412
Feature activation-4.75e-03

Pos logit contributions
 sometimes+0.003
 Mom+0.003
 Sara+0.003
 from+0.003
 L+0.003
Neg logit contributions
 unnecessary-0.005
izer-0.005
lor-0.005
 invention-0.005
bek-0.005
Tokenamon
Token ID16487
Feature activation+3.98e-03

Pos logit contributions
 sometimes+0.008
ily+0.007
 Tom+0.007
 Mom+0.007
 like+0.007
Neg logit contributions
 unnecessary-0.007
 lookout-0.007
 avail-0.007
aman-0.007
izer-0.007
Token always
Token ID1464
Feature activation+6.05e-04

Pos logit contributions
 unnecessary+0.008
izer+0.008
lor+0.007
 avail+0.007
bek+0.007
Neg logit contributions
 sometimes-0.005
 Mom-0.004
 from-0.004
 Sara-0.004
ily-0.004
Token tried
Token ID3088
Feature activation+7.65e-03

Pos logit contributions
izer+0.001
lor+0.001
 unnecessary+0.001
aman+0.001
asha+0.001
Neg logit contributions
 sometimes-0.001
 Mom-0.001
 from-0.001
 alone-0.001
 like-0.001
Token to
Token ID284
Feature activation+1.96e-02

Pos logit contributions
izer+0.017
lor+0.016
bek+0.016
+0.016
aman+0.016
Neg logit contributions
 sometimes-0.008
 Mom-0.008
 from-0.007
 Sara-0.007
 L-0.007
Token be
Token ID307
Feature activation+9.20e-03

Pos logit contributions
 unnecessary+0.038
izer+0.037
lor+0.037
bek+0.036
aman+0.036
Neg logit contributions
 sometimes-0.024
 Mom-0.023
 from-0.022
 like-0.020
 Sara-0.020
Token patient
Token ID5827
Feature activation-1.68e-02

Pos logit contributions
lor+0.021
izer+0.021
aman+0.021
+0.021
bek+0.020
Neg logit contributions
 sometimes-0.009
 Mom-0.008
 like-0.008
 from-0.008
 alone-0.008
Token and
Token ID290
Feature activation-2.53e-03

Pos logit contributions
 sometimes+0.014
 Mom+0.014
 Sara+0.013
 L+0.012
 from+0.012
Neg logit contributions
 unnecessary-0.037
izer-0.036
 invention-0.035
lor-0.035
bek-0.035
Token enjoy
Token ID2883
Feature activation+3.11e-03

Pos logit contributions
 sometimes+0.004
 Mom+0.003
 from+0.003
 like+0.003
 Tom+0.003
Neg logit contributions
izer-0.005
lor-0.005
aman-0.004
 unnecessary-0.004
achers-0.004
Token good
Token ID922
Feature activation-3.66e-03

Pos logit contributions
lor+0.065
izer+0.064
+0.064
aman+0.063
achers+0.062
Neg logit contributions
 sometimes-0.030
 from-0.027
 Mom-0.026
 like-0.026
 alone-0.023
Token to
Token ID284
Feature activation+1.78e-02

Pos logit contributions
 sometimes+0.003
 from+0.003
 Mom+0.003
 like+0.003
 Tom+0.003
Neg logit contributions
izer-0.009
lor-0.009
aman-0.009
bek-0.008
 unnecessary-0.008
Token help
Token ID1037
Feature activation+1.30e-02

Pos logit contributions
lor+0.035
bek+0.035
 unnecessary+0.035
aman+0.034
izer+0.033
Neg logit contributions
 sometimes-0.022
 Mom-0.021
 from-0.019
 like-0.019
 Tom-0.019
Token your
Token ID534
Feature activation+9.66e-03

Pos logit contributions
lor+0.028
+0.028
bek+0.028
izer+0.027
aman+0.027
Neg logit contributions
 sometimes-0.013
 from-0.013
 Mom-0.013
 Sara-0.011
 Tom-0.011
Token friends
Token ID2460
Feature activation+2.91e-02

Pos logit contributions
izer+0.020
lor+0.020
aman+0.020
bek+0.020
 unnecessary+0.020
Neg logit contributions
 Mom-0.010
 sometimes-0.010
 from-0.009
 Sara-0.008
 L-0.008
Token when
Token ID618
Feature activation+1.94e-02

Pos logit contributions
 unnecessary+0.067
izer+0.067
lor+0.065
bek+0.065
aman+0.064
Neg logit contributions
 sometimes-0.023
 Mom-0.021
 from-0.020
 Sara-0.018
 L-0.017
Token they
Token ID484
Feature activation+1.01e-03

Pos logit contributions
izer+0.040
lor+0.038
 unnecessary+0.038
bek+0.037
aman+0.037
Neg logit contributions
 sometimes-0.022
 Mom-0.020
 from-0.019
 Sara-0.019
 L-0.018
Token need
Token ID761
Feature activation+4.82e-03

Pos logit contributions
 unnecessary+0.002
izer+0.002
 invention+0.002
 lookout+0.002
isting+0.002
Neg logit contributions
 Mom-0.001
 sometimes-0.001
 Sara-0.001
 L-0.001
 Z-0.001
Token it
Token ID340
Feature activation+1.04e-02

Pos logit contributions
bek+0.011
izer+0.011
lor+0.010
aman+0.010
andra+0.010
Neg logit contributions
 sometimes-0.005
 Mom-0.005
 from-0.005
 Sara-0.004
 like-0.004
Token.
Token ID13
Feature activation+1.35e-02

Pos logit contributions
izer+0.027
 unnecessary+0.026
bek+0.026
lor+0.026
 avail+0.026
Neg logit contributions
 sometimes-0.006
 from-0.005
 Mom-0.005
 Sara-0.004
 alone-0.004
Token Shooting
Token ID25360
Feature activation-1.46e-02

Pos logit contributions
izer+0.029
 unnecessary+0.028
lor+0.028
 invention+0.027
bek+0.027
Neg logit contributions
 sometimes-0.012
 Mom-0.011
 Sara-0.011
 from-0.010
ily-0.010
Token saw
Token ID2497
Feature activation+4.73e-03

Pos logit contributions
 unnecessary+0.009
 lookout+0.009
zed+0.009
andra+0.009
 avail+0.009
Neg logit contributions
 pand-0.009
 sometimes-0.009
ily-0.009
 Z-0.009
 pengu-0.009
Token a
Token ID257
Feature activation+7.45e-03

Pos logit contributions
 unnecessary+0.008
aman+0.007
 tool+0.007
 invention+0.007
 avail+0.007
Neg logit contributions
 sometimes-0.006
ily-0.005
 Z-0.005
 alone-0.005
 pand-0.005
Token pretty
Token ID2495
Feature activation+8.49e-03

Pos logit contributions
 unnecessary+0.013
izer+0.013
 invention+0.012
 lookout+0.012
bek+0.012
Neg logit contributions
 sometimes-0.010
 Mom-0.009
ily-0.009
 Sara-0.009
 from-0.009
Token dress
Token ID6576
Feature activation+1.84e-03

Pos logit contributions
 unnecessary+0.013
izer+0.013
lor+0.013
 invention+0.013
bek+0.013
Neg logit contributions
 sometimes-0.012
 Mom-0.012
 Sara-0.011
 from-0.011
 L-0.011
Token.
Token ID13
Feature activation-5.49e-04

Pos logit contributions
izer+0.004
lor+0.004
 unnecessary+0.004
 invention+0.004
aman+0.004
Neg logit contributions
 Mom-0.001
 sometimes-0.001
 Sara-0.001
 L-0.001
ily-0.001
Token It
Token ID632
Feature activation-1.34e-02

Pos logit contributions
 Mom+0.000
 sometimes+0.000
 Sara+0.000
 L+0.000
 Tom+0.000
Neg logit contributions
 unnecessary-0.001
izer-0.001
lor-0.001
 invention-0.001
bek-0.001
Token was
Token ID373
Feature activation-7.14e-03

Pos logit contributions
ily+0.020
 Sara+0.020
 Z+0.019
uzz+0.019
 pand+0.019
Neg logit contributions
zed-0.016
antly-0.016
asha-0.016
 lookout-0.015
 cast-0.015
Token long
Token ID890
Feature activation-1.42e-02

Pos logit contributions
 sometimes+0.008
ily+0.007
ents+0.007
 alone+0.007
baby+0.007
Neg logit contributions
aman-0.011
izer-0.011
 avail-0.011
pron-0.011
 tool-0.011
Token and
Token ID290
Feature activation-1.03e-02

Pos logit contributions
 Mom+0.016
 sometimes+0.015
 Sara+0.015
ily+0.015
 L+0.015
Neg logit contributions
izer-0.025
 lookout-0.025
 invention-0.024
lor-0.024
aman-0.024
Token made
Token ID925
Feature activation-2.41e-03

Pos logit contributions
 sometimes+0.011
 pand+0.011
ily+0.011
 from+0.011
ents+0.010
Neg logit contributions
 unnecessary-0.017
 avail-0.017
pron-0.016
zed-0.016
 newfound-0.016
Token of
Token ID286
Feature activation-5.77e-03

Pos logit contributions
 sometimes+0.003
 pengu+0.002
 Z+0.002
 pand+0.002
 alone+0.002
Neg logit contributions
 unnecessary-0.004
 avail-0.004
izer-0.004
 invention-0.004
antly-0.004
TokenL
Token ID43
Feature activation-9.48e-03

Pos logit contributions
izer+0.007
 unnecessary+0.007
lor+0.007
bek+0.007
 invention+0.007
Neg logit contributions
 Mom-0.006
 sometimes-0.006
 from-0.006
 Sara-0.006
 L-0.006
Tokenily
Token ID813
Feature activation-1.29e-02

Pos logit contributions
ily+0.001
 sometimes-0.002
 Sara-0.002
 L-0.003
 Tom-0.003
Neg logit contributions
 unnecessary-0.033
 invention-0.031
 avail-0.031
 lifetime-0.031
bek-0.031
Token,
Token ID11
Feature activation-3.70e-03

Pos logit contributions
 sometimes+0.012
 Mom+0.012
ily+0.011
 from+0.011
 Sara+0.011
Neg logit contributions
izer-0.025
 invention-0.024
 unnecessary-0.024
lor-0.024
-0.024
Token that
Token ID326
Feature activation-1.10e-03

Pos logit contributions
 sometimes+0.004
 Mom+0.004
 from+0.003
 Z+0.003
 alone+0.003
Neg logit contributions
 unnecessary-0.007
izer-0.007
 invention-0.007
 lookout-0.007
lor-0.006
Token is
Token ID318
Feature activation-1.28e-02

Pos logit contributions
 sometimes+0.001
 Mom+0.001
 Sara+0.001
ily+0.001
 L+0.001
Neg logit contributions
 unnecessary-0.002
izer-0.002
lor-0.002
 invention-0.002
bek-0.002
Token a
Token ID257
Feature activation-4.75e-03

Pos logit contributions
 sometimes+0.014
ily+0.013
ents+0.012
 pengu+0.012
 pand+0.012
Neg logit contributions
 unnecessary-0.023
izer-0.022
 avail-0.021
 tool-0.020
Michelle-0.020
Token good
Token ID922
Feature activation-5.80e-03

Pos logit contributions
 sometimes+0.006
 Mom+0.006
 Sara+0.005
 from+0.005
 L+0.005
Neg logit contributions
 unnecessary-0.009
izer-0.009
lor-0.008
 invention-0.008
bek-0.008
Token idea
Token ID2126
Feature activation+1.14e-02

Pos logit contributions
 sometimes+0.009
 Mom+0.008
 Sara+0.008
ily+0.008
 from+0.008
Neg logit contributions
izer-0.009
 unnecessary-0.009
 invention-0.008
aman-0.008
 avail-0.008
Token.
Token ID13
Feature activation+1.31e-02

Pos logit contributions
izer+0.020
 invention+0.020
 tool+0.020
 precautions+0.019
lor+0.019
Neg logit contributions
 sometimes-0.011
 Mom-0.011
 Flor-0.011
 pand-0.010
 Qu-0.010
Token We
Token ID775
Feature activation+8.20e-03

Pos logit contributions
izer+0.020
+0.020
 unnecessary+0.019
lor+0.019
 tool+0.019
Neg logit contributions
 sometimes-0.017
 from-0.017
 pand-0.016
 alone-0.016
ents-0.016
Token can
Token ID460
Feature activation-1.15e-02

Pos logit contributions
isel+0.010
 unnecessary+0.010
 invention+0.010
isting+0.010
 tool+0.009
Neg logit contributions
 Sara-0.013
 Z-0.013
 Mom-0.013
 from-0.013
 Anna-0.013
Token he
Token ID339
Feature activation+5.05e-04

Pos logit contributions
 unnecessary+0.002
izer+0.002
lor+0.002
 invention+0.002
bek+0.002
Neg logit contributions
 sometimes-0.001
 Mom-0.001
 Sara-0.001
 from-0.001
 L-0.001
Token praised
Token ID15342
Feature activation+7.40e-03

Pos logit contributions
zed+0.001
izer+0.001
 cast+0.001
 invention+0.001
andra+0.001
Neg logit contributions
 sometimes-0.001
 Mom-0.001
ily-0.001
 Sara-0.001
 pand-0.001
Token them
Token ID606
Feature activation-1.97e-02

Pos logit contributions
 avail+0.011
 unnecessary+0.010
izer+0.010
 invention+0.010
asha+0.009
Neg logit contributions
 sometimes-0.011
 pengu-0.010
 alone-0.010
ents-0.010
 pand-0.010
Token.
Token ID13
Feature activation+2.14e-04

Pos logit contributions
 sometimes+0.019
ily+0.019
 pand+0.017
 Sara+0.017
 Mom+0.016
Neg logit contributions
 unnecessary-0.034
 invention-0.034
izer-0.034
ipping-0.033
 tractor-0.032
Token He
Token ID679
Feature activation-3.01e-04

Pos logit contributions
izer+0.000
andra+0.000
+0.000
lor+0.000
 tool+0.000
Neg logit contributions
 sometimes-0.000
 alone-0.000
 pand-0.000
 from-0.000
 like-0.000
Token said
Token ID531
Feature activation+5.77e-03

Pos logit contributions
 sometimes+0.000
 Mom+0.000
 Sara+0.000
 alone+0.000
 pand+0.000
Neg logit contributions
izer-0.001
andra-0.001
 invention-0.001
lor-0.001
 cast-0.001
Token,
Token ID11
Feature activation+2.61e-03

Pos logit contributions
zed+0.009
 unnecessary+0.009
ously+0.009
lor+0.008
 invention+0.008
Neg logit contributions
 sill-0.007
 sometimes-0.006
ents-0.006
 alone-0.006
ocular-0.006
Token "
Token ID366
Feature activation-2.73e-03

Pos logit contributions
izer+0.004
+0.004
+0.004
zed+0.004
lor+0.004
Neg logit contributions
 sometimes-0.003
ily-0.003
 Mom-0.003
 Sara-0.003
 alone-0.003
TokenYou
Token ID1639
Feature activation-1.79e-03

Pos logit contributions
 Mom+0.004
 sometimes+0.004
 from+0.004
 Sara+0.004
 L+0.004
Neg logit contributions
izer-0.004
 unnecessary-0.004
lor-0.004
bek-0.004
 invention-0.004
Token made
Token ID925
Feature activation-2.92e-03

Pos logit contributions
 sometimes+0.002
 Mom+0.002
 Sara+0.002
 from+0.002
 L+0.002
Neg logit contributions
 unnecessary-0.003
izer-0.003
 invention-0.003
lor-0.003
isting-0.003
Token a
Token ID257
Feature activation-1.03e-02

Pos logit contributions
 sometimes+0.004
ily+0.003
 alone+0.003
 from+0.003
 Sara+0.003
Neg logit contributions
 unnecessary-0.005
izer-0.005
 invention-0.005
lor-0.005
 avail-0.005
Token lots
Token ID6041
Feature activation-1.10e-02

Pos logit contributions
 sometimes+0.005
 Mom+0.005
 from+0.005
 Sara+0.005
 L+0.005
Neg logit contributions
 unnecessary-0.009
izer-0.009
lor-0.009
 invention-0.009
aman-0.009
Token of
Token ID286
Feature activation+1.28e-02

Pos logit contributions
 sometimes+0.010
 Mom+0.010
 Z+0.010
 Flor+0.009
 Qu+0.009
Neg logit contributions
 unnecessary-0.021
 invention-0.021
earned-0.020
izer-0.020
 precautions-0.020
Token different
Token ID1180
Feature activation+1.06e-02

Pos logit contributions
 unnecessary+0.018
 lookout+0.017
 invention+0.017
 newfound+0.017
 tool+0.016
Neg logit contributions
 sometimes-0.020
ily-0.018
 Flor-0.018
ents-0.017
uzz-0.017
Token things
Token ID1243
Feature activation+1.25e-03

Pos logit contributions
 unnecessary+0.014
 invention+0.013
izer+0.013
 lookout+0.012
 avail+0.012
Neg logit contributions
 sometimes-0.018
 Mom-0.018
ily-0.017
 Sara-0.017
 Z-0.017
Token to
Token ID284
Feature activation+9.94e-03

Pos logit contributions
 unnecessary+0.003
izer+0.003
 invention+0.003
lor+0.003
 lookout+0.003
Neg logit contributions
 Mom-0.001
 sometimes-0.001
 Sara-0.001
 L-0.001
ily-0.001
Token eat
Token ID4483
Feature activation+1.10e-03

Pos logit contributions
 lookout+0.016
izer+0.016
 avail+0.015
 unnecessary+0.015
 invention+0.015
Neg logit contributions
 sometimes-0.014
 Mom-0.013
 from-0.013
ily-0.012
 Z-0.012
Token -
Token ID532
Feature activation+4.58e-03

Pos logit contributions
 unnecessary+0.002
izer+0.002
 invention+0.002
lor+0.002
 lookout+0.002
Neg logit contributions
 sometimes-0.001
 Mom-0.001
 Z-0.001
 L-0.001
 Sara-0.001
Token but
Token ID475
Feature activation+1.22e-02

Pos logit contributions
 unnecessary+0.009
izer+0.008
 invention+0.008
lor+0.008
aman+0.008
Neg logit contributions
 sometimes-0.005
 Mom-0.005
ily-0.005
 Sara-0.005
 L-0.005
Token nothing
Token ID2147
Feature activation+4.13e-03

Pos logit contributions
 unnecessary+0.022
 avail+0.021
aman+0.021
 invention+0.021
bek+0.021
Neg logit contributions
ily-0.013
 L-0.013
 sometimes-0.013
 pand-0.013
 Z-0.012
Token that
Token ID326
Feature activation+7.15e-03

Pos logit contributions
izer+0.006
 unnecessary+0.006
lor+0.006
bek+0.006
aman+0.006
Neg logit contributions
 sometimes-0.007
 from-0.006
 Mom-0.006
 Sara-0.006
ily-0.006
Token Alice
Token ID14862
Feature activation+1.69e-02

Pos logit contributions
 unnecessary+0.013
izer+0.013
 invention+0.012
lor+0.012
bek+0.012
Neg logit contributions
 sometimes-0.009
 Mom-0.009
 Sara-0.008
 L-0.008
 from-0.008
Token Together
Token ID17083
Feature activation+4.02e-03

Pos logit contributions
 Mom+0.006
 sometimes+0.005
 Sara+0.005
 from+0.005
 L+0.005
Neg logit contributions
 unnecessary-0.015
izer-0.015
bek-0.014
 avail-0.014
 invention-0.014
Token they
Token ID484
Feature activation+3.53e-03

Pos logit contributions
 unnecessary+0.008
izer+0.008
 invention+0.008
aman+0.008
 avail+0.008
Neg logit contributions
 sometimes-0.004
ily-0.003
 Mom-0.003
 Sara-0.003
 L-0.003
Token cooked
Token ID15847
Feature activation+2.33e-02

Pos logit contributions
 unnecessary+0.006
izer+0.006
lor+0.006
 invention+0.006
bek+0.006
Neg logit contributions
 sometimes-0.005
 Mom-0.005
 Sara-0.004
 from-0.004
 L-0.004
Token the
Token ID262
Feature activation+9.10e-04

Pos logit contributions
 unnecessary+0.054
izer+0.053
lor+0.053
bek+0.052
 invention+0.051
Neg logit contributions
 Mom-0.018
 sometimes-0.018
 from-0.016
 Sara-0.016
 L-0.014
Token soft
Token ID2705
Feature activation+1.37e-02

Pos logit contributions
 unnecessary+0.001
izer+0.001
bek+0.001
lor+0.001
aman+0.001
Neg logit contributions
 sometimes-0.001
 Mom-0.001
 from-0.001
 Sara-0.001
 alone-0.001
Token,
Token ID11
Feature activation+4.01e-03

Pos logit contributions
 unnecessary+0.021
izer+0.020
lor+0.019
bek+0.019
 invention+0.019
Neg logit contributions
 sometimes-0.022
 Mom-0.021
 from-0.020
 Sara-0.020
 L-0.019
Token white
Token ID2330
Feature activation+4.19e-03

Pos logit contributions
 unnecessary+0.008
izer+0.008
lor+0.008
 invention+0.007
bek+0.007
Neg logit contributions
 sometimes-0.005
 Mom-0.005
 from-0.004
 Sara-0.004
 L-0.004
Token rice
Token ID11464
Feature activation-5.10e-03

Pos logit contributions
 unnecessary+0.006
izer+0.006
bek+0.006
lor+0.006
aman+0.006
Neg logit contributions
 sometimes-0.007
 Mom-0.007
 from-0.006
 Sara-0.006
 L-0.006
Token.
Token ID13
Feature activation-6.83e-03

Pos logit contributions
 sometimes+0.004
 Mom+0.004
 from+0.004
 Sara+0.004
 L+0.003
Neg logit contributions
 unnecessary-0.012
izer-0.011
lor-0.011
bek-0.011
 invention-0.011
Token The
Token ID383
Feature activation-3.30e-03

Pos logit contributions
 Mom+0.008
 sometimes+0.007
 Sara+0.007
 from+0.007
 L+0.007
Neg logit contributions
 unnecessary-0.014
izer-0.013
bek-0.013
 avail-0.013
 invention-0.013
Token kingdom
Token ID13239
Feature activation+7.78e-03

Pos logit contributions
 sometimes+0.005
 Mom+0.004
 Sara+0.004
 L+0.004
 Z+0.004
Neg logit contributions
izer-0.005
 unnecessary-0.005
 invention-0.005
 lookout-0.005
lor-0.005
TokenWill
Token ID8743
Feature activation+5.45e-03

Pos logit contributions
 unnecessary+0.001
izer+0.001
lor+0.001
 invention+0.001
bek+0.000
Neg logit contributions
 sometimes-0.001
 Mom-0.000
 Sara-0.000
 from-0.000
 L-0.000
Token they
Token ID484
Feature activation-3.11e-03

Pos logit contributions
 unnecessary+0.010
izer+0.010
+0.010
zed+0.009
lor+0.009
Neg logit contributions
ily-0.006
 sometimes-0.006
 Mom-0.005
 L-0.005
 from-0.005
Token be
Token ID307
Feature activation-1.01e-05

Pos logit contributions
ily+0.004
 sometimes+0.004
 Sara+0.004
 Mom+0.004
 L+0.004
Neg logit contributions
 unnecessary-0.004
izer-0.004
-0.004
 invention-0.004
 lookout-0.004
Token loud
Token ID7812
Feature activation-1.27e-02

Pos logit contributions
 sometimes+0.000
ily+0.000
 Mom+0.000
 Sara+0.000
 L+0.000
Neg logit contributions
 unnecessary-0.000
izer-0.000
 invention-0.000
isting-0.000
 avail-0.000
Token?"
Token ID1701
Feature activation-1.10e-02

Pos logit contributions
 Mom+0.013
ily+0.012
 Sara+0.012
 Z+0.012
 pand+0.012
Neg logit contributions
izer-0.024
 unnecessary-0.023
 invention-0.023
aman-0.022
isting-0.022
Token Ben
Token ID3932
Feature activation-2.76e-02

Pos logit contributions
ocular+0.009
uzz+0.007
asp+0.007
 sometimes+0.007
 sill+0.007
Neg logit contributions
 avail-0.023
-0.022
Michelle-0.021
aman-0.021
 Donna-0.021
Token asks
Token ID7893
Feature activation-6.61e-03

Pos logit contributions
 pand+0.032
 Z+0.032
 Mom+0.031
 pengu+0.031
ily+0.031
Neg logit contributions
 invention-0.045
izer-0.044
asha-0.043
andra-0.043
Michelle-0.042
Token.
Token ID13
Feature activation-1.79e-02

Pos logit contributions
 sometimes+0.007
 pengu+0.007
 pand+0.006
 Z+0.006
 thirteen+0.006
Neg logit contributions
 avail-0.012
antly-0.011
 tool-0.011
 unnecessary-0.011
ously-0.011
Token
Token ID198
Feature activation-3.02e-03

Pos logit contributions
 sometimes+0.020
 pand+0.018
 Z+0.017
 pengu+0.017
 alone+0.017
Neg logit contributions
izer-0.032
-0.032
 tool-0.030
Michelle-0.030
 unnecessary-0.030
Token
Token ID198
Feature activation-3.29e-03

Pos logit contributions
 sometimes+0.004
 pand+0.003
 like+0.003
 from+0.003
 pengu+0.003
Neg logit contributions
izer-0.005
aman-0.005
lor-0.005
Michelle-0.005
Matthew-0.005
Token"
Token ID1
Feature activation-3.74e-03

Pos logit contributions
 sometimes+0.004
 Mom+0.004
 from+0.004
 Z+0.004
 pand+0.004
Neg logit contributions
izer-0.006
 unnecessary-0.005
bek-0.005
lor-0.005
aman-0.005
Token more
Token ID517
Feature activation-1.76e-02

Pos logit contributions
 Mom+0.010
 sometimes+0.010
 Sara+0.010
 Z+0.010
ily+0.009
Neg logit contributions
 unnecessary-0.018
izer-0.017
 invention-0.016
 avail-0.016
bek-0.016
Token funny
Token ID8258
Feature activation+2.32e-03

Pos logit contributions
 sometimes+0.017
 Mom+0.017
 Sara+0.015
 L+0.015
 from+0.015
Neg logit contributions
 unnecessary-0.037
izer-0.036
 invention-0.035
lor-0.035
aman-0.034
Token and
Token ID290
Feature activation-6.87e-03

Pos logit contributions
 unnecessary+0.004
izer+0.004
 invention+0.004
lor+0.004
 avail+0.004
Neg logit contributions
 sometimes-0.003
 Mom-0.003
 Sara-0.003
ily-0.003
 Z-0.003
Token sm
Token ID895
Feature activation-2.52e-02

Pos logit contributions
 Mom+0.009
 sometimes+0.008
 Sara+0.008
ily+0.008
 Z+0.008
Neg logit contributions
 unnecessary-0.012
izer-0.011
 invention-0.011
bek-0.011
 avail-0.011
Tokenelly
Token ID6148
Feature activation-1.16e-02

Pos logit contributions
 sometimes+0.035
 Mom+0.035
 Sara+0.033
 Z+0.033
 from+0.032
Neg logit contributions
 unnecessary-0.040
izer-0.040
ipping-0.038
aman-0.038
lor-0.038
Token pictures
Token ID5986
Feature activation-2.51e-02

Pos logit contributions
 sometimes+0.009
 Mom+0.009
 Sara+0.009
ily+0.008
 Z+0.008
Neg logit contributions
 unnecessary-0.025
 invention-0.024
izer-0.024
 avail-0.023
ipping-0.023
Token.
Token ID13
Feature activation-7.28e-03

Pos logit contributions
 Mom+0.023
 Sara+0.021
 pengu+0.020
 sometimes+0.020
 Qu+0.020
Neg logit contributions
lor-0.048
-0.046
 invention-0.046
 unnecessary-0.046
ipping-0.046
Token And
Token ID843
Feature activation+1.76e-03

Pos logit contributions
 sometimes+0.008
 from+0.007
 alone+0.007
ily+0.007
 like+0.007
Neg logit contributions
-0.013
aman-0.013
lor-0.013
izer-0.013
 unnecessary-0.013
Token everyone
Token ID2506
Feature activation-2.86e-03

Pos logit contributions
 unnecessary+0.003
bek+0.003
aman+0.003
zed+0.003
 avail+0.003
Neg logit contributions
ily-0.002
ents-0.002
 Mom-0.002
 pand-0.002
 sometimes-0.002
Token loved
Token ID6151
Feature activation+6.32e-03

Pos logit contributions
 Mom+0.003
 Sara+0.003
 Z+0.003
 sometimes+0.003
 L+0.003
Neg logit contributions
 invention-0.005
zed-0.005
bek-0.005
izer-0.005
isting-0.004
Token the
Token ID262
Feature activation-6.27e-03

Pos logit contributions
 unnecessary+0.013
izer+0.012
 invention+0.012
aman+0.012
bek+0.012
Neg logit contributions
 sometimes-0.006
 Mom-0.006
ily-0.006
 from-0.006
 Sara-0.006
Token the
Token ID262
Feature activation+1.78e-02

Pos logit contributions
izer+0.044
 unnecessary+0.044
lor+0.043
bek+0.043
 invention+0.043
Neg logit contributions
 sometimes-0.011
 Mom-0.010
 Sara-0.009
 from-0.009
 L-0.009
Token beach
Token ID10481
Feature activation+3.19e-02

Pos logit contributions
 unnecessary+0.023
izer+0.022
 lookout+0.021
 invention+0.021
aman+0.021
Neg logit contributions
 sometimes-0.031
 Mom-0.030
 Sara-0.029
ily-0.029
 L-0.029
Token together
Token ID1978
Feature activation-1.12e-02

Pos logit contributions
 unnecessary+0.075
izer+0.075
lor+0.073
 invention+0.073
aman+0.072
Neg logit contributions
 sometimes-0.020
 Mom-0.019
 Sara-0.017
ily-0.016
 L-0.016
Token.
Token ID13
Feature activation+4.86e-03

Pos logit contributions
 sometimes+0.007
 Mom+0.007
 Sara+0.006
 L+0.005
ily+0.005
Neg logit contributions
 unnecessary-0.027
izer-0.027
lor-0.026
 invention-0.026
aman-0.026
Token<|endoftext|>
Token ID50256
Feature activation+7.65e-03

Pos logit contributions
 unnecessary+0.012
izer+0.011
 invention+0.011
 avail+0.011
bek+0.011
Neg logit contributions
 Mom-0.004
 Sara-0.004
 L-0.004
 sometimes-0.004
 Tom-0.003
TokenOnce
Token ID7454
Feature activation-3.18e-02

Pos logit contributions
 unnecessary+0.012
izer+0.012
lor+0.012
 invention+0.012
bek+0.011
Neg logit contributions
 sometimes-0.011
 Mom-0.011
 Sara-0.010
 from-0.010
 L-0.010
Token upon
Token ID2402
Feature activation+1.56e-02

Pos logit contributions
 sometimes+0.046
 from+0.042
 Mom+0.041
 pand+0.041
 Z+0.041
Neg logit contributions
aman-0.046
 unnecessary-0.046
izer-0.044
 invention-0.044
lor-0.043
Token a
Token ID257
Feature activation+2.65e-02

Pos logit contributions
 unnecessary+0.035
izer+0.035
 invention+0.034
lor+0.034
bek+0.034
Neg logit contributions
 sometimes-0.013
 Mom-0.012
 Sara-0.011
 from-0.011
 L-0.011
Token time
Token ID640
Feature activation+1.08e-02

Pos logit contributions
 unnecessary+0.042
lor+0.041
 invention+0.040
izer+0.040
 avail+0.040
Neg logit contributions
 Mom-0.040
 sometimes-0.040
 Sara-0.038
 Tom-0.037
 Anna-0.037
Token,
Token ID11
Feature activation-6.23e-03

Pos logit contributions
+0.019
aman+0.019
+0.018
 unnecessary+0.018
izer+0.018
Neg logit contributions
 sometimes-0.011
 sill-0.010
 Mom-0.010
ents-0.010
 pengu-0.010
Token there
Token ID612
Feature activation+1.79e-03

Pos logit contributions
 pand+0.008
ents+0.008
 sometimes+0.007
 Z+0.007
 Mom+0.007
Neg logit contributions
izer-0.009
 unnecessary-0.009
-0.009
isting-0.009
antly-0.009
Token
Token ID198
Feature activation+9.68e-03

Pos logit contributions
izer+0.017
lor+0.017
Michelle+0.017
 unnecessary+0.017
aman+0.016
Neg logit contributions
 sometimes-0.011
 pand-0.010
 from-0.010
 like-0.009
 Mom-0.009
Token"
Token ID1
Feature activation-5.83e-03

Pos logit contributions
izer+0.014
 unnecessary+0.013
lor+0.013
bek+0.013
+0.013
Neg logit contributions
 sometimes-0.015
 Mom-0.015
 from-0.014
 Sara-0.014
 L-0.014
TokenNo
Token ID2949
Feature activation+5.96e-03

Pos logit contributions
 sometimes+0.008
 Mom+0.008
 from+0.008
 Sara+0.008
 L+0.008
Neg logit contributions
 unnecessary-0.009
izer-0.009
bek-0.009
lor-0.009
-0.009
Token,
Token ID11
Feature activation-4.19e-03

Pos logit contributions
 unnecessary+0.014
izer+0.014
lor+0.014
 invention+0.014
bek+0.014
Neg logit contributions
 sometimes-0.004
 Mom-0.004
 Sara-0.003
 from-0.003
 L-0.003
Token thank
Token ID5875
Feature activation+9.34e-04

Pos logit contributions
 sometimes+0.004
 Mom+0.004
 from+0.004
ily+0.004
 alone+0.004
Neg logit contributions
 unnecessary-0.008
izer-0.008
lor-0.008
aman-0.008
bek-0.008
Token you
Token ID345
Feature activation-2.70e-02

Pos logit contributions
 unnecessary+0.001
 avail+0.001
+0.001
Michelle+0.001
asha+0.001
Neg logit contributions
 sometimes-0.001
 sill-0.001
uzz-0.001
ily-0.001
 pengu-0.001
Token,"
Token ID553
Feature activation+1.94e-03

Pos logit contributions
 Qu+0.029
uzz+0.028
 Flor+0.027
 Z+0.027
 pengu+0.027
Neg logit contributions
aman-0.043
 unnecessary-0.043
 route-0.042
 tractor-0.041
 avail-0.040
Token the
Token ID262
Feature activation-9.00e-04

Pos logit contributions
 avail+0.003
aman+0.003
asha+0.003
 cast+0.003
Michelle+0.003
Neg logit contributions
 sometimes-0.002
ily-0.002
uzz-0.002
asp-0.002
 pand-0.002
Token witch
Token ID16365
Feature activation-7.78e-03

Pos logit contributions
 sometimes+0.002
 Mom+0.002
 from+0.002
 Sara+0.001
 Z+0.001
Neg logit contributions
 unnecessary-0.001
izer-0.001
lor-0.001
 invention-0.001
asha-0.001
Token said
Token ID531
Feature activation+3.44e-03

Pos logit contributions
 Mom+0.008
 pand+0.008
 sometimes+0.008
 Z+0.007
 Sara+0.007
Neg logit contributions
 cast-0.014
asha-0.014
izer-0.014
andra-0.014
lor-0.014
Token,
Token ID11
Feature activation-1.22e-03

Pos logit contributions
izer+0.007
ously+0.007
 unnecessary+0.007
 invention+0.006
ipping+0.006
Neg logit contributions
 sometimes-0.003
 pand-0.003
 Mom-0.003
 Z-0.003
 pengu-0.003
Token first
Token ID717
Feature activation+3.86e-03

Pos logit contributions
ily+0.000
 sometimes+0.000
 Z+0.000
 Mom+0.000
 alone+0.000
Neg logit contributions
 unnecessary-0.000
 avail-0.000
izer-0.000
 invention-0.000
-0.000
Token,
Token ID11
Feature activation-5.40e-03

Pos logit contributions
 unnecessary+0.009
izer+0.008
 avail+0.008
 invention+0.008
lor+0.008
Neg logit contributions
 sometimes-0.003
 Mom-0.003
ily-0.003
 from-0.003
 Sara-0.003
Token let
Token ID1309
Feature activation+1.66e-02

Pos logit contributions
 sometimes+0.005
 Mom+0.005
ily+0.005
 Sara+0.005
 from+0.005
Neg logit contributions
 unnecessary-0.011
izer-0.010
 invention-0.010
 avail-0.010
lor-0.010
Token me
Token ID502
Feature activation+3.74e-03

Pos logit contributions
 unnecessary+0.033
izer+0.033
 invention+0.032
bek+0.031
lor+0.031
Neg logit contributions
 sometimes-0.018
 Mom-0.018
 Sara-0.017
 from-0.016
 L-0.016
Token introduce
Token ID10400
Feature activation-4.13e-03

Pos logit contributions
 unnecessary+0.007
izer+0.007
lor+0.007
aman+0.007
bek+0.007
Neg logit contributions
 Mom-0.005
 sometimes-0.005
 Sara-0.004
 from-0.004
 L-0.004
Token you
Token ID345
Feature activation-2.59e-02

Pos logit contributions
 sometimes+0.005
 pand+0.004
 pengu+0.004
 alone+0.004
 from+0.004
Neg logit contributions
 unnecessary-0.007
lor-0.007
antly-0.007
izer-0.007
 avail-0.007
Token to
Token ID284
Feature activation-1.47e-02

Pos logit contributions
 Mom+0.031
 pand+0.030
 Z+0.030
ily+0.030
 L+0.029
Neg logit contributions
 unnecessary-0.044
lor-0.041
izer-0.040
 precautions-0.039
 invention-0.039
Token my
Token ID616
Feature activation-1.44e-02

Pos logit contributions
 sometimes+0.015
ily+0.013
 from+0.013
 Mom+0.013
 Z+0.013
Neg logit contributions
 unnecessary-0.029
izer-0.028
 avail-0.028
aman-0.027
 invention-0.027
Token other
Token ID584
Feature activation-7.75e-03

Pos logit contributions
 sometimes+0.020
ily+0.020
 from+0.018
 Z+0.018
 Sara+0.018
Neg logit contributions
 invention-0.023
 unnecessary-0.023
izer-0.022
 tool-0.021
 lookout-0.021
Token friends
Token ID2460
Feature activation+6.42e-03

Pos logit contributions
 sometimes+0.009
 Z+0.009
ily+0.009
 Mom+0.008
 Sara+0.008
Neg logit contributions
izer-0.013
 unnecessary-0.013
 invention-0.013
 tool-0.013
 lookout-0.013
Token."
Token ID526
Feature activation-7.53e-03

Pos logit contributions
lor+0.012
+0.012
izer+0.012
 unnecessary+0.012
 invention+0.012
Neg logit contributions
 Mom-0.006
 pand-0.005
 Z-0.005
 Sara-0.005
 L-0.005