TOP ACTIVATIONS
MAX = 0.061

Dave looked at Ms . Sarah with a smile
. Her tutor was called Ms Daisy and she was very
Cl ara was welcomed by Ms Daisy and sat down at
he called to his friend Ms . Beaver . " Look
Dave asked his teacher , Ms . Sarah , " Why
beach when she saw a gentle jelly fish in the water
more sa us ages for Mrs Smith . He quickly returned
looking around and enjoying the peaceful view .
this and said in a soothing voice ,
special . She saw a gentle glow coming from the small
Sam turned , and saw Ms . Smith , his neighbour
at him and with a gentle gesture , he pushed him
a birthday party !" said Ms . Beaver . Dave felt
it up . With a gentle smile , he rubbed its
had not listened to the wise frog , he would have
She saw her teacher , Ms . Lee , and said
down at the desk . Ms Daisy was a very brilliant
and scoop ed up the little frog , carrying it away
She never forgot the kind bees , who showed her
Mum , when he saw Mrs Smith . He felt ashamed

MIN ACTIVATIONS
MIN = -0.073

the broken toy in his je ep . They went to
ride in the big red je ep . <|endoftext|> Once upon
Tim ride in your red je ep ?" The man said
They run back to their je ep . They get in
, Tim my drove his je ep slowly and carefully .
to put on your special je ep clothes ."
ep . This is my je ep !" L
to go out in the je ep . Tommy was so
riding in a big blue je ep . One
go on adventures in your je ep . <|endoftext|> Once upon
special in the big red je ep . <|endoftext|> Once upon
. He gets in his je ep and follows them .
it was the best ash tr ay they had ever seen
You cannot ride in my je ep . Get out now
and ride in his toy je ep . One day ,
together . With her new je ep , Beth any could
Sam to stay in the je ep and be quiet .
a ride in the red je ep . But first ,
like to play with their je ep . Their je ep
Thank you , Mr . Mechan ic !" Tim my said

INTERVAL 0.028 - 0.061
CONTAINS 0.174%

and Lily took a nice warm bath . She felt much
" What about if we pretend there is a cave
make a face as he tasted the bitter medicine .
She did not want to taste the broccoli . She looked
apped and cheered for the little girl as she proudly held

INTERVAL -0.006 - 0.028
CONTAINS 74.395%

with a big chim ney and a bright star on the
Anna and Ben are playing in the garden . They see
asked , " Do you have something for me ?" Lily
so big and so round that it was like a mountain
asked his mom for help and they found lots of creative

INTERVAL -0.039 - -0.006
CONTAINS 25.372%

the flowers . Jane spent the whole day in
One day , Lily was playing outside when she saw a
pic n ics together . One day ,
!" she said . " You did not listen . You
. All the people in the town were very busy each

INTERVAL -0.073 - -0.039
CONTAINS 0.059%

ite . They tied the rib bons and the feathers to
put on her pretty new jewelry and went to bed .
. She loved to eat av oc ados because they were
a wealthy girl who had many toys and clothes . She
mean . This is my cr ay on . Go away
Token\n
Token ID198
Feature activation+1.32e-02

Pos logit contributions
 dip+0.010
 stare+0.009
 look+0.009
 touch+0.009
 pet+0.009
Neg logit contributions
rob-0.018
eh-0.016
aj-0.015
ross-0.014
three-0.014
Token\n
Token ID198
Feature activation+1.21e-02

Pos logit contributions
 dip+0.014
 look+0.013
 stare+0.013
 pet+0.013
 touch+0.013
Neg logit contributions
rob-0.028
eh-0.025
aj-0.023
ross-0.023
xy-0.022
TokenDave
Token ID27984
Feature activation-1.65e-03

Pos logit contributions
 dip+0.015
 pet+0.015
 look+0.015
 touch+0.015
 stare+0.014
Neg logit contributions
rob-0.023
aj-0.019
eh-0.019
xy-0.018
 nonsense-0.018
Token looked
Token ID3114
Feature activation-9.47e-03

Pos logit contributions
rob+0.003
eh+0.003
aj+0.002
ross+0.002
lf+0.002
Neg logit contributions
 dip-0.002
 stare-0.002
 look-0.002
 gaze-0.002
 touch-0.002
Token at
Token ID379
Feature activation+2.33e-02

Pos logit contributions
rob+0.012
eh+0.012
aj+0.011
 nonsense+0.010
ibr+0.010
Neg logit contributions
 dip-0.015
 stare-0.014
 Seal-0.014
 Swan-0.014
 gaze-0.014
Token Ms
Token ID6997
Feature activation+6.15e-02

Pos logit contributions
 dip+0.033
 stare+0.032
 gaze+0.031
 touch+0.029
 try+0.029
Neg logit contributions
rob-0.041
lf-0.031
eh-0.031
 nonsense-0.029
 puzzles-0.029
Token.
Token ID13
Feature activation+1.17e-02

Pos logit contributions
 dip+0.079
 stare+0.075
 look+0.073
 touch+0.072
 gaze+0.072
Neg logit contributions
rob-0.115
eh-0.100
aj-0.096
ross-0.093
 nonsense-0.092
Token Sarah
Token ID10490
Feature activation+1.53e-04

Pos logit contributions
 dip+0.016
 look+0.015
 stare+0.015
 pet+0.015
 touch+0.015
Neg logit contributions
rob-0.021
eh-0.018
aj-0.017
ross-0.017
xy-0.016
Token with
Token ID351
Feature activation-1.61e-03

Pos logit contributions
 dip+0.000
 crab+0.000
 Swan+0.000
 care+0.000
 Seal+0.000
Neg logit contributions
rob-0.000
lf-0.000
es-0.000
aj-0.000
eh-0.000
Token a
Token ID257
Feature activation-8.29e-04

Pos logit contributions
rob+0.002
 nonsense+0.002
 shields+0.002
lf+0.002
eh+0.002
Neg logit contributions
 dip-0.002
 Swan-0.002
 stare-0.002
 gaze-0.002
 crab-0.002
Token smile
Token ID8212
Feature activation-4.07e-03

Pos logit contributions
rob+0.001
 nonsense+0.001
eh+0.001
aj+0.001
 AW+0.001
Neg logit contributions
 Seal-0.001
 dip-0.001
 Swan-0.001
 Goose-0.001
 Ned-0.001
Token.
Token ID13
Feature activation+9.22e-03

Pos logit contributions
rob+0.010
eh+0.009
aj+0.008
ross+0.008
xy+0.008
Neg logit contributions
 dip-0.006
 look-0.006
 stare-0.005
 pet-0.005
 touch-0.005
Token Her
Token ID2332
Feature activation+1.26e-02

Pos logit contributions
 dip+0.012
 stare+0.011
 look+0.011
 pet+0.011
 touch+0.011
Neg logit contributions
rob-0.017
eh-0.015
aj-0.014
 nonsense-0.013
ross-0.013
Token tutor
Token ID48680
Feature activation+4.74e-03

Pos logit contributions
 look+0.019
 pet+0.019
 try+0.019
 stare+0.018
 touch+0.018
Neg logit contributions
rob-0.023
eh-0.019
aj-0.017
xy-0.017
ross-0.016
Token was
Token ID373
Feature activation+8.13e-03

Pos logit contributions
 dip+0.007
 stare+0.006
 gaze+0.006
 touch+0.006
 look+0.006
Neg logit contributions
rob-0.008
eh-0.007
lf-0.006
 nonsense-0.006
aj-0.006
Token called
Token ID1444
Feature activation+1.18e-02

Pos logit contributions
 dip+0.010
 stare+0.009
 look+0.009
 touch+0.009
 gaze+0.009
Neg logit contributions
rob-0.016
eh-0.014
aj-0.013
ross-0.013
 nonsense-0.013
Token Ms
Token ID6997
Feature activation+5.88e-02

Pos logit contributions
 dip+0.017
 look+0.017
 stare+0.017
 pet+0.017
 touch+0.017
Neg logit contributions
rob-0.020
eh-0.017
aj-0.016
ross-0.015
xy-0.015
Token Daisy
Token ID40355
Feature activation+1.09e-02

Pos logit contributions
 dip+0.071
 stare+0.068
 look+0.067
 touch+0.066
 pet+0.065
Neg logit contributions
rob-0.115
eh-0.099
aj-0.095
ross-0.093
xy-0.091
Token and
Token ID290
Feature activation-6.11e-03

Pos logit contributions
 dip+0.014
 stare+0.013
 look+0.013
 touch+0.013
 gaze+0.013
Neg logit contributions
rob-0.020
eh-0.018
aj-0.017
ross-0.016
 nonsense-0.016
Token she
Token ID673
Feature activation-6.98e-03

Pos logit contributions
rob+0.013
eh+0.011
aj+0.011
ross+0.010
aces+0.010
Neg logit contributions
 look-0.007
 pet-0.007
 stare-0.007
 dip-0.007
 touch-0.007
Token was
Token ID373
Feature activation+7.31e-03

Pos logit contributions
rob+0.013
eh+0.011
aj+0.010
ross+0.010
xy+0.010
Neg logit contributions
 dip-0.009
 look-0.009
 stare-0.009
 touch-0.009
 pet-0.009
Token very
Token ID845
Feature activation-5.27e-04

Pos logit contributions
 dip+0.010
 stare+0.009
 look+0.009
 touch+0.009
 gaze+0.009
Neg logit contributions
rob-0.014
eh-0.012
aj-0.011
ross-0.011
 nonsense-0.011
TokenCl
Token ID2601
Feature activation+1.66e-03

Pos logit contributions
 pet+0.011
 look+0.010
 dip+0.010
 touch+0.010
 wonder+0.010
Neg logit contributions
rob-0.015
eh-0.013
aj-0.013
aces-0.012
lich-0.012
Tokenara
Token ID3301
Feature activation-3.65e-03

Pos logit contributions
 dip+0.002
 pet+0.002
 look+0.002
 touch+0.002
 stare+0.002
Neg logit contributions
rob-0.003
eh-0.002
 nonsense-0.002
lf-0.002
lich-0.002
Token was
Token ID373
Feature activation+5.62e-03

Pos logit contributions
rob+0.008
eh+0.007
aj+0.006
ross+0.006
xy+0.006
Neg logit contributions
 look-0.004
 pet-0.004
 dip-0.004
 stare-0.004
 touch-0.004
Token welcomed
Token ID15846
Feature activation+6.91e-04

Pos logit contributions
 dip+0.007
 stare+0.006
 touch+0.006
 crab+0.006
 gaze+0.006
Neg logit contributions
rob-0.011
eh-0.010
aj-0.009
ross-0.009
 nonsense-0.009
Token by
Token ID416
Feature activation+2.37e-02

Pos logit contributions
 dip+0.001
 stare+0.001
 look+0.001
 gaze+0.001
 wonder+0.001
Neg logit contributions
rob-0.001
eh-0.001
aj-0.001
ross-0.001
 nonsense-0.001
Token Ms
Token ID6997
Feature activation+5.74e-02

Pos logit contributions
 dip+0.033
 stare+0.031
 gaze+0.030
 crab+0.029
 look+0.029
Neg logit contributions
rob-0.039
eh-0.034
aj-0.033
 nonsense-0.033
lf-0.033
Token Daisy
Token ID40355
Feature activation+6.69e-03

Pos logit contributions
 dip+0.064
 look+0.062
 stare+0.062
 pet+0.061
 touch+0.060
Neg logit contributions
rob-0.117
eh-0.103
aj-0.097
ross-0.094
xy-0.093
Token and
Token ID290
Feature activation-1.34e-02

Pos logit contributions
 dip+0.008
 stare+0.008
 look+0.008
 gaze+0.008
 wonder+0.008
Neg logit contributions
rob-0.012
eh-0.011
aj-0.010
ross-0.010
 nonsense-0.010
Token sat
Token ID3332
Feature activation+6.32e-03

Pos logit contributions
rob+0.028
eh+0.025
aj+0.023
lich+0.022
aces+0.022
Neg logit contributions
 pet-0.016
 look-0.015
 touch-0.015
 stare-0.015
 dip-0.015
Token down
Token ID866
Feature activation+2.41e-03

Pos logit contributions
 look+0.005
 pet+0.005
 touch+0.005
 wonder+0.005
 stare+0.005
Neg logit contributions
rob-0.015
eh-0.013
 AW-0.013
 nonsense-0.012
ross-0.012
Token at
Token ID379
Feature activation+4.51e-03

Pos logit contributions
 look+0.002
 touch+0.002
 pet+0.002
 try+0.002
 stare+0.002
Neg logit contributions
rob-0.006
eh-0.005
aj-0.005
cod-0.005
ross-0.005
Token he
Token ID339
Feature activation-1.88e-03

Pos logit contributions
 dip+0.010
 stare+0.009
 gaze+0.008
 look+0.008
 wonder+0.008
Neg logit contributions
rob-0.015
eh-0.013
ross-0.013
 nonsense-0.013
 AW-0.013
Token called
Token ID1444
Feature activation+1.70e-02

Pos logit contributions
rob+0.003
eh+0.003
aj+0.003
ross+0.003
lf+0.003
Neg logit contributions
 dip-0.002
 stare-0.002
 look-0.002
 gaze-0.002
 touch-0.002
Token to
Token ID284
Feature activation-7.30e-03

Pos logit contributions
 dip+0.021
 stare+0.018
 pet+0.018
 wonder+0.018
 gaze+0.018
Neg logit contributions
rob-0.031
eh-0.028
ross-0.027
 nonsense-0.026
lich-0.025
Token his
Token ID465
Feature activation+9.89e-03

Pos logit contributions
rob+0.010
lf+0.008
eh+0.008
ross+0.008
lich+0.008
Neg logit contributions
 dip-0.012
 fragrance-0.010
 stare-0.010
 crab-0.010
 scent-0.010
Token friend
Token ID1545
Feature activation+1.06e-02

Pos logit contributions
 dip+0.013
 stare+0.013
 gaze+0.011
 look+0.011
 touch+0.011
Neg logit contributions
rob-0.017
aj-0.016
eh-0.015
ibr-0.014
 nonsense-0.014
Token Ms
Token ID6997
Feature activation+5.67e-02

Pos logit contributions
 dip+0.015
 stare+0.015
 wonder+0.014
 gaze+0.014
 care+0.014
Neg logit contributions
rob-0.016
eh-0.015
aj-0.014
es-0.014
ibr-0.013
Token.
Token ID13
Feature activation+3.50e-03

Pos logit contributions
 dip+0.068
 stare+0.065
 look+0.065
 touch+0.063
 pet+0.063
Neg logit contributions
rob-0.111
eh-0.096
aj-0.092
ross-0.090
xy-0.088
Token Beaver
Token ID42746
Feature activation+2.03e-02

Pos logit contributions
 dip+0.005
 look+0.005
 stare+0.005
 wonder+0.005
 touch+0.005
Neg logit contributions
rob-0.006
eh-0.005
aj-0.005
ross-0.005
 nonsense-0.005
Token.
Token ID13
Feature activation-1.49e-05

Pos logit contributions
 dip+0.028
 wonder+0.026
 stare+0.026
 gaze+0.025
 smell+0.025
Neg logit contributions
rob-0.032
eh-0.028
es-0.027
xy-0.027
lich-0.026
Token "
Token ID366
Feature activation+1.13e-02

Pos logit contributions
rob+0.000
eh+0.000
aj+0.000
ross+0.000
xy+0.000
Neg logit contributions
 dip-0.000
 look-0.000
 stare-0.000
 gaze-0.000
 touch-0.000
TokenLook
Token ID8567
Feature activation-1.04e-02

Pos logit contributions
 dip+0.015
 look+0.015
 stare+0.014
 gaze+0.014
 try+0.014
Neg logit contributions
rob-0.022
eh-0.018
three-0.016
ross-0.016
lich-0.016
TokenDave
Token ID27984
Feature activation+4.90e-03

Pos logit contributions
 dip+0.023
 look+0.022
 stare+0.022
 pet+0.021
 touch+0.021
Neg logit contributions
rob-0.030
eh-0.026
aj-0.024
ross-0.023
xy-0.023
Token asked
Token ID1965
Feature activation-2.11e-03

Pos logit contributions
 dip+0.006
 stare+0.006
 look+0.006
 gaze+0.006
 try+0.006
Neg logit contributions
rob-0.008
eh-0.008
aj-0.007
lf-0.007
ep-0.007
Token his
Token ID465
Feature activation+6.24e-03

Pos logit contributions
rob+0.004
eh+0.004
aj+0.003
xy+0.003
ross+0.003
Neg logit contributions
 look-0.003
 pet-0.003
 touch-0.003
 dip-0.003
 stare-0.003
Token teacher
Token ID4701
Feature activation+2.21e-02

Pos logit contributions
 dip+0.008
 stare+0.008
 look+0.007
 gaze+0.007
 touch+0.007
Neg logit contributions
rob-0.010
eh-0.010
aj-0.009
ibr-0.009
 nonsense-0.009
Token,
Token ID11
Feature activation+1.52e-02

Pos logit contributions
 dip+0.028
 stare+0.026
 pet+0.026
 touch+0.026
 gaze+0.026
Neg logit contributions
rob-0.041
eh-0.036
aj-0.034
 nonsense-0.033
ross-0.033
Token Ms
Token ID6997
Feature activation+5.54e-02

Pos logit contributions
 dip+0.030
 stare+0.028
 gaze+0.028
 care+0.028
 wonder+0.027
Neg logit contributions
rob-0.015
eh-0.013
 nonsense-0.012
lich-0.012
ibr-0.012
Token.
Token ID13
Feature activation+1.47e-02

Pos logit contributions
 dip+0.071
 stare+0.067
 look+0.065
 gaze+0.064
 touch+0.064
Neg logit contributions
rob-0.103
eh-0.090
aj-0.085
ross-0.084
xy-0.083
Token Sarah
Token ID10490
Feature activation+5.85e-03

Pos logit contributions
 dip+0.021
 look+0.020
 stare+0.020
 pet+0.020
 touch+0.020
Neg logit contributions
rob-0.026
eh-0.022
aj-0.021
ross-0.020
xy-0.020
Token,
Token ID11
Feature activation+1.03e-02

Pos logit contributions
 dip+0.008
 stare+0.007
 touch+0.007
 gaze+0.007
 crab+0.007
Neg logit contributions
rob-0.010
eh-0.009
aj-0.008
xy-0.008
 nonsense-0.008
Token "
Token ID366
Feature activation+1.61e-02

Pos logit contributions
 dip+0.014
 stare+0.013
 gaze+0.013
 clam+0.013
 pet+0.013
Neg logit contributions
rob-0.017
eh-0.015
 nonsense-0.014
aj-0.014
lich-0.014
TokenWhy
Token ID5195
Feature activation-8.54e-03

Pos logit contributions
 dip+0.019
 look+0.018
 stare+0.018
 gaze+0.017
 try+0.017
Neg logit contributions
rob-0.034
eh-0.029
three-0.027
lich-0.026
xy-0.026
Token beach
Token ID10481
Feature activation+8.52e-03

Pos logit contributions
rob+0.007
eh+0.006
aj+0.006
lf+0.006
xy+0.006
Neg logit contributions
 dip-0.005
 stare-0.005
 look-0.005
 try-0.005
 wonder-0.005
Token when
Token ID618
Feature activation+1.62e-02

Pos logit contributions
 dip+0.011
 stare+0.011
 wonder+0.011
 try+0.011
 pet+0.010
Neg logit contributions
rob-0.015
eh-0.013
aj-0.012
lf-0.012
ross-0.012
Token she
Token ID673
Feature activation+1.37e-02

Pos logit contributions
 dip+0.023
 stare+0.021
 look+0.021
 pet+0.021
 touch+0.021
Neg logit contributions
rob-0.027
eh-0.023
aj-0.021
ross-0.021
xy-0.021
Token saw
Token ID2497
Feature activation-3.79e-03

Pos logit contributions
 dip+0.019
 stare+0.018
 care+0.018
 look+0.018
 try+0.018
Neg logit contributions
rob-0.021
eh-0.021
aj-0.019
lf-0.017
CR-0.017
Token a
Token ID257
Feature activation+1.23e-02

Pos logit contributions
rob+0.006
eh+0.006
 nonsense+0.005
ibr+0.005
aj+0.005
Neg logit contributions
 dip-0.005
 stare-0.005
 try-0.005
 gaze-0.005
 touch-0.005
Token gentle
Token ID10296
Feature activation+5.22e-02

Pos logit contributions
 care+0.018
 stare+0.018
 dip+0.017
 gaze+0.017
 stroke+0.017
Neg logit contributions
rob-0.016
eh-0.016
lf-0.015
aj-0.014
 fort-0.013
Token jelly
Token ID27644
Feature activation+1.79e-02

Pos logit contributions
 care+0.054
Yes+0.051
 dip+0.051
 try+0.051
 stare+0.050
Neg logit contributions
eh-0.083
rob-0.079
lf-0.076
 fort-0.069
ross-0.068
Tokenfish
Token ID11084
Feature activation+1.74e-02

Pos logit contributions
 stare+0.023
 wonder+0.023
 dip+0.023
 gaze+0.022
 care+0.022
Neg logit contributions
rob-0.033
eh-0.029
ross-0.026
ider-0.026
lf-0.026
Token in
Token ID287
Feature activation+1.11e-02

Pos logit contributions
 dip+0.023
 stare+0.023
 care+0.023
 wonder+0.023
 look+0.023
Neg logit contributions
rob-0.029
eh-0.026
aj-0.024
ider-0.023
xy-0.023
Token the
Token ID262
Feature activation+4.10e-03

Pos logit contributions
 dip+0.016
 stare+0.016
mit+0.015
 gaze+0.015
 Swan+0.015
Neg logit contributions
rob-0.016
eh-0.014
 nonsense-0.013
lf-0.012
ross-0.012
Token water
Token ID1660
Feature activation+2.93e-02

Pos logit contributions
 dip+0.004
 stare+0.004
 Swan+0.004
 Seal+0.004
 try+0.004
Neg logit contributions
rob-0.008
eh-0.007
aj-0.006
CR-0.006
 fort-0.006
Token more
Token ID517
Feature activation-8.40e-03

Pos logit contributions
rob+0.002
eh+0.001
aj+0.001
 infected+0.001
lich+0.001
Neg logit contributions
 stare-0.002
 gaze-0.002
 dip-0.002
 Swan-0.002
 look-0.002
Token sa
Token ID473
Feature activation-1.31e-02

Pos logit contributions
rob+0.014
eh+0.012
aj+0.011
xy+0.010
 nonsense+0.010
Neg logit contributions
 stare-0.012
 gaze-0.012
 dip-0.011
 look-0.011
 Swan-0.011
Tokenus
Token ID385
Feature activation+5.71e-03

Pos logit contributions
rob+0.016
eh+0.011
xy+0.011
aj+0.011
ross+0.010
Neg logit contributions
 stare-0.027
 dip-0.027
 gaze-0.027
 look-0.026
 pet-0.026
Tokenages
Token ID1095
Feature activation+1.15e-02

Pos logit contributions
 dip+0.014
 stare+0.014
 wonder+0.014
 gaze+0.014
 touch+0.014
Neg logit contributions
rob-0.003
eh-0.002
aj-0.002
es-0.002
xy-0.001
Token for
Token ID329
Feature activation-8.58e-03

Pos logit contributions
 stare+0.017
 gaze+0.017
 dip+0.017
 wonder+0.016
 look+0.016
Neg logit contributions
rob-0.018
eh-0.016
aj-0.015
xy-0.014
aces-0.014
Token Mrs
Token ID9074
Feature activation+5.17e-02

Pos logit contributions
rob+0.015
eh+0.013
lf+0.013
aj+0.012
 nonsense+0.012
Neg logit contributions
 dip-0.011
 stare-0.011
 gaze-0.011
 wonder-0.010
 look-0.010
Token Smith
Token ID4176
Feature activation+3.03e-03

Pos logit contributions
 dip+0.064
 stare+0.061
 look+0.060
 touch+0.059
 gaze+0.059
Neg logit contributions
rob-0.100
eh-0.086
aj-0.082
ross-0.080
xy-0.080
Token.
Token ID13
Feature activation+1.85e-04

Pos logit contributions
 stare+0.004
 dip+0.004
 gaze+0.004
 wonder+0.004
 look+0.004
Neg logit contributions
rob-0.005
eh-0.005
aj-0.004
aces-0.004
es-0.004
Token He
Token ID679
Feature activation-7.15e-03

Pos logit contributions
 dip+0.000
 stare+0.000
 look+0.000
 touch+0.000
 pet+0.000
Neg logit contributions
rob-0.000
eh-0.000
aj-0.000
ross-0.000
lich-0.000
Token quickly
Token ID2952
Feature activation-1.93e-02

Pos logit contributions
rob+0.013
eh+0.012
aj+0.011
ross+0.011
xy+0.011
Neg logit contributions
 dip-0.009
 stare-0.009
 look-0.009
 gaze-0.008
 touch-0.008
Token returned
Token ID4504
Feature activation-1.05e-03

Pos logit contributions
rob+0.030
eh+0.029
aj+0.027
ross+0.025
lf+0.025
Neg logit contributions
 Seal-0.025
 dip-0.025
 Swan-0.025
 stare-0.024
 crab-0.024
Token looking
Token ID2045
Feature activation-1.06e-02

Pos logit contributions
rob+0.011
eh+0.010
aj+0.009
ross+0.009
 nonsense+0.008
Neg logit contributions
 dip-0.008
 stare-0.008
 look-0.008
 clam-0.008
 try-0.008
Token around
Token ID1088
Feature activation-3.04e-03

Pos logit contributions
rob+0.018
eh+0.016
aj+0.015
ross+0.014
 nonsense+0.014
Neg logit contributions
 dip-0.015
 stare-0.015
 look-0.014
 touch-0.014
 pet-0.014
Token and
Token ID290
Feature activation+7.67e-03

Pos logit contributions
rob+0.005
eh+0.004
aj+0.004
ross+0.004
 nonsense+0.004
Neg logit contributions
 dip-0.004
 stare-0.004
 look-0.004
 gaze-0.004
 pet-0.004
Token enjoying
Token ID13226
Feature activation+1.56e-02

Pos logit contributions
 dip+0.010
 look+0.009
 stare+0.009
 Seal+0.009
 crab+0.009
Neg logit contributions
rob-0.014
eh-0.012
aj-0.011
ross-0.011
 nonsense-0.011
Token the
Token ID262
Feature activation+2.34e-02

Pos logit contributions
 dip+0.020
 stare+0.019
 touch+0.018
 look+0.018
 Swan+0.018
Neg logit contributions
rob-0.029
eh-0.024
ross-0.022
 nonsense-0.022
aj-0.022
Token peaceful
Token ID12309
Feature activation+5.15e-02

Pos logit contributions
 Seal+0.032
 Swan+0.031
 stare+0.031
 dip+0.031
 try+0.030
Neg logit contributions
rob-0.035
eh-0.032
three-0.027
 nonsense-0.027
ooky-0.026
Token view
Token ID1570
Feature activation+1.66e-03

Pos logit contributions
 dip+0.060
 Seal+0.060
 try+0.058
 Swan+0.058
 reel+0.057
Neg logit contributions
rob-0.083
eh-0.075
lf-0.070
ross-0.066
 fort-0.065
Token.
Token ID13
Feature activation-3.46e-03

Pos logit contributions
 stare+0.002
 dip+0.002
 gaze+0.002
 wonder+0.002
 pet+0.002
Neg logit contributions
rob-0.003
eh-0.003
ross-0.002
xy-0.002
aj-0.002
Token
Token ID220
Feature activation-9.68e-03

Pos logit contributions
rob+0.006
eh+0.006
aj+0.006
xy+0.005
three+0.005
Neg logit contributions
 dip-0.004
 look-0.004
 crab-0.004
 stare-0.004
 clam-0.004
Token\n
Token ID198
Feature activation-4.90e-03

Pos logit contributions
rob+0.022
eh+0.020
aj+0.018
ross+0.018
 nonsense+0.018
Neg logit contributions
 dip-0.008
 look-0.008
 stare-0.008
 pet-0.008
 touch-0.008
Token\n
Token ID198
Feature activation-4.99e-03

Pos logit contributions
rob+0.011
eh+0.010
aj+0.009
ross+0.009
xy+0.009
Neg logit contributions
 dip-0.005
 look-0.005
 stare-0.004
 pet-0.004
 touch-0.004
Token this
Token ID428
Feature activation+9.10e-03

Pos logit contributions
rob+0.017
eh+0.014
aj+0.014
 nonsense+0.014
lich+0.014
Neg logit contributions
 dip-0.010
 stare-0.009
 gaze-0.009
 pet-0.009
 wonder-0.009
Token and
Token ID290
Feature activation+7.01e-03

Pos logit contributions
 dip+0.013
 stare+0.012
 gaze+0.012
 touch+0.012
 pet+0.012
Neg logit contributions
rob-0.015
eh-0.013
 nonsense-0.012
aj-0.012
lf-0.012
Token said
Token ID531
Feature activation+2.05e-02

Pos logit contributions
 dip+0.011
 gaze+0.010
 stare+0.010
 wonder+0.010
 look+0.010
Neg logit contributions
rob-0.011
eh-0.009
aj-0.008
lf-0.008
ross-0.008
Token in
Token ID287
Feature activation-2.93e-03

Pos logit contributions
 dip+0.025
 stare+0.023
 wonder+0.023
 gaze+0.023
 pet+0.023
Neg logit contributions
rob-0.039
eh-0.034
aj-0.033
ross-0.032
 nonsense-0.032
Token a
Token ID257
Feature activation+1.39e-03

Pos logit contributions
rob+0.004
 nonsense+0.004
lf+0.004
aj+0.003
lich+0.003
Neg logit contributions
 dip-0.005
 stare-0.004
 gaze-0.004
 crab-0.004
 poppy-0.004
Token soothing
Token ID40836
Feature activation+5.09e-02

Pos logit contributions
 dip+0.002
 Seal+0.002
 Swan+0.002
mit+0.002
 poppy+0.002
Neg logit contributions
rob-0.002
eh-0.002
 nonsense-0.002
 AW-0.002
aj-0.001
Token voice
Token ID3809
Feature activation+2.10e-02

Pos logit contributions
 Seal+0.049
 dip+0.049
 crab+0.048
mit+0.048
 Swan+0.047
Neg logit contributions
rob-0.091
eh-0.087
 nonsense-0.083
lf-0.083
 AW-0.078
Token,
Token ID11
Feature activation+3.96e-03

Pos logit contributions
 dip+0.029
 wonder+0.027
 gaze+0.027
 stare+0.026
 pet+0.026
Neg logit contributions
rob-0.036
 nonsense-0.030
eh-0.030
ross-0.030
lf-0.029
Token
Token ID6184
Feature activation+6.67e-03

Pos logit contributions
 dip+0.007
 stare+0.006
 look+0.006
 gaze+0.006
 touch+0.006
Neg logit contributions
rob-0.005
eh-0.004
aj-0.004
 nonsense-0.004
xy-0.004
Token
Token ID95
Feature activation+9.08e-03

Pos logit contributions
 dip+0.010
 wonder+0.010
 pet+0.010
 look+0.010
 touch+0.010
Neg logit contributions
rob-0.011
eh-0.009
xy-0.009
aj-0.008
ider-0.008
Token
Token ID26391
Feature activation-5.12e-03

Pos logit contributions
 dip+0.014
 look+0.014
 stare+0.014
 touch+0.013
 pet+0.013
Neg logit contributions
rob-0.015
eh-0.012
aj-0.012
ross-0.011
xy-0.011
Token special
Token ID2041
Feature activation+7.55e-03

Pos logit contributions
 gaze+0.025
 stare+0.025
 Seal+0.025
 care+0.024
 dip+0.023
Neg logit contributions
rob-0.020
eh-0.020
 nonsense-0.018
chy-0.017
lf-0.017
Token.
Token ID13
Feature activation+6.80e-03

Pos logit contributions
 dip+0.011
 stare+0.011
 gaze+0.010
 Seal+0.010
 try+0.010
Neg logit contributions
rob-0.012
eh-0.010
 nonsense-0.010
xy-0.010
aces-0.010
Token She
Token ID1375
Feature activation+6.54e-03

Pos logit contributions
 dip+0.010
 stare+0.010
 care+0.009
 crab+0.009
 look+0.009
Neg logit contributions
rob-0.009
eh-0.009
ibr-0.008
three-0.008
aj-0.008
Token saw
Token ID2497
Feature activation-5.45e-03

Pos logit contributions
 dip+0.008
 stare+0.008
 look+0.008
 gaze+0.008
 try+0.008
Neg logit contributions
rob-0.011
eh-0.011
aj-0.010
lf-0.010
ross-0.009
Token a
Token ID257
Feature activation+3.72e-03

Pos logit contributions
rob+0.008
eh+0.007
 nonsense+0.006
es+0.006
xy+0.006
Neg logit contributions
 dip-0.008
 stare-0.008
 gaze-0.008
 try-0.007
 touch-0.007
Token gentle
Token ID10296
Feature activation+5.07e-02

Pos logit contributions
 stare+0.006
 gaze+0.006
 dip+0.006
 care+0.006
 touch+0.005
Neg logit contributions
rob-0.005
eh-0.005
lf-0.004
 nonsense-0.004
aj-0.004
Token glow
Token ID19634
Feature activation+7.50e-03

Pos logit contributions
 dip+0.058
 stare+0.057
 gaze+0.054
 touch+0.054
 care+0.053
Neg logit contributions
eh-0.080
rob-0.079
lf-0.073
ross-0.067
 nonsense-0.064
Token coming
Token ID2406
Feature activation-6.43e-04

Pos logit contributions
 stare+0.011
 care+0.010
 gaze+0.010
 dip+0.010
 investigate+0.010
Neg logit contributions
rob-0.011
eh-0.011
lf-0.010
ross-0.009
 nonsense-0.009
Token from
Token ID422
Feature activation+5.63e-03

Pos logit contributions
rob+0.001
eh+0.001
ross+0.001
 nonsense+0.001
xy+0.001
Neg logit contributions
 dip-0.001
 stare-0.001
 gaze-0.001
 look-0.001
 try-0.001
Token the
Token ID262
Feature activation+7.50e-03

Pos logit contributions
 dip+0.007
 stare+0.007
 look+0.006
 gaze+0.006
 try+0.006
Neg logit contributions
rob-0.011
eh-0.009
 nonsense-0.008
aj-0.008
xy-0.008
Token small
Token ID1402
Feature activation+1.93e-02

Pos logit contributions
 stare+0.010
 dip+0.009
 gaze+0.009
 touch+0.009
 wonder+0.008
Neg logit contributions
rob-0.014
eh-0.012
lf-0.010
 fort-0.010
 reminds-0.010
Token Sam
Token ID3409
Feature activation-3.07e-03

Pos logit contributions
rob+0.014
eh+0.012
aj+0.011
 nonsense+0.011
xy+0.011
Neg logit contributions
 dip-0.012
 stare-0.011
 look-0.011
 gaze-0.011
 clam-0.011
Token turned
Token ID2900
Feature activation+1.92e-02

Pos logit contributions
rob+0.004
eh+0.004
es+0.004
lich+0.004
lf+0.004
Neg logit contributions
 dip-0.005
 poppy-0.004
 stare-0.004
 crab-0.004
 wonder-0.004
Token,
Token ID11
Feature activation-7.63e-03

Pos logit contributions
 dip+0.027
 gaze+0.026
 stare+0.026
mit+0.024
 crab+0.024
Neg logit contributions
rob-0.030
eh-0.027
lich-0.025
 infected-0.025
 AW-0.025
Token and
Token ID290
Feature activation-4.18e-03

Pos logit contributions
rob+0.012
eh+0.011
lich+0.010
aj+0.010
 nonsense+0.010
Neg logit contributions
 dip-0.011
 crab-0.010
 clam-0.010
 Seal-0.010
 stare-0.010
Token saw
Token ID2497
Feature activation-8.01e-03

Pos logit contributions
rob+0.008
eh+0.007
aj+0.006
lich+0.006
ross+0.006
Neg logit contributions
 dip-0.005
 stare-0.005
 gaze-0.005
 look-0.004
 try-0.004
Token Ms
Token ID6997
Feature activation+5.04e-02

Pos logit contributions
rob+0.015
eh+0.012
 nonsense+0.012
ross+0.011
ibr+0.011
Neg logit contributions
 dip-0.011
 stare-0.010
 gaze-0.010
 pet-0.010
 touch-0.010
Token.
Token ID13
Feature activation+8.73e-03

Pos logit contributions
 dip+0.069
 stare+0.064
 look+0.063
 wonder+0.063
 gaze+0.062
Neg logit contributions
rob-0.090
eh-0.078
aj-0.074
ross-0.073
xy-0.072
Token Smith
Token ID4176
Feature activation-4.70e-03

Pos logit contributions
 dip+0.014
 look+0.013
 stare+0.013
 wonder+0.013
 care+0.012
Neg logit contributions
rob-0.014
eh-0.012
aj-0.011
 nonsense-0.010
ross-0.010
Token,
Token ID11
Feature activation+6.68e-03

Pos logit contributions
es+0.006
rob+0.006
lf+0.006
eh+0.006
lich+0.005
Neg logit contributions
 dip-0.007
 care-0.007
 stare-0.007
 Seal-0.007
 wonder-0.007
Token his
Token ID465
Feature activation-7.23e-03

Pos logit contributions
 dip+0.010
 stare+0.008
 care+0.008
 try+0.008
 gaze+0.008
Neg logit contributions
rob-0.010
eh-0.010
lich-0.010
es-0.010
lf-0.009
Token neighbour
Token ID12250
Feature activation-8.24e-05

Pos logit contributions
rob+0.012
eh+0.011
aj+0.010
 nonsense+0.010
lich+0.009
Neg logit contributions
 dip-0.010
 stare-0.010
 look-0.010
 touch-0.009
 gaze-0.009
Token at
Token ID379
Feature activation+2.10e-02

Pos logit contributions
rob+0.006
eh+0.005
aj+0.005
 nonsense+0.005
 AW+0.005
Neg logit contributions
 dip-0.005
 stare-0.005
 gaze-0.005
 Seal-0.005
 touch-0.005
Token him
Token ID683
Feature activation+1.36e-02

Pos logit contributions
 dip+0.028
 stare+0.027
 gaze+0.026
 pet+0.024
 touch+0.024
Neg logit contributions
rob-0.038
eh-0.030
xy-0.029
 nonsense-0.028
lich-0.028
Token and
Token ID290
Feature activation+3.01e-03

Pos logit contributions
 care+0.023
 dip+0.023
 stare+0.022
 distance+0.022
 wonder+0.022
Neg logit contributions
es-0.014
aj-0.013
lich-0.012
rob-0.012
eh-0.012
Token with
Token ID351
Feature activation+6.57e-03

Pos logit contributions
 dip+0.004
 stare+0.004
 gaze+0.004
 look+0.004
 try+0.004
Neg logit contributions
rob-0.005
eh-0.004
aj-0.004
lf-0.004
 nonsense-0.004
Token a
Token ID257
Feature activation+5.69e-03

Pos logit contributions
 dip+0.009
 Swan+0.009
 stare+0.009
ory+0.008
 crab+0.008
Neg logit contributions
rob-0.010
eh-0.009
 nonsense-0.008
 AW-0.008
 shields-0.008
Token gentle
Token ID10296
Feature activation+5.04e-02

Pos logit contributions
 dip+0.008
 Abby+0.008
 Seal+0.008
 stare+0.008
 Ned+0.007
Neg logit contributions
eh-0.007
 AW-0.007
rob-0.007
aj-0.006
 nonsense-0.006
Token gesture
Token ID18342
Feature activation+2.01e-02

Pos logit contributions
Where+0.041
Yes+0.040
What+0.040
 dip+0.038
 Seal+0.038
Neg logit contributions
eh-0.087
rob-0.077
 AW-0.077
ross-0.077
 nonsense-0.073
Token,
Token ID11
Feature activation-2.27e-03

Pos logit contributions
 dip+0.032
 gaze+0.031
 stare+0.031
 wonder+0.031
 investigate+0.030
Neg logit contributions
rob-0.028
eh-0.023
ross-0.023
es-0.023
xy-0.022
Token he
Token ID339
Feature activation-8.10e-03

Pos logit contributions
rob+0.004
eh+0.003
aj+0.003
 nonsense+0.003
ross+0.003
Neg logit contributions
 dip-0.003
 stare-0.003
 gaze-0.003
 look-0.003
 touch-0.003
Token pushed
Token ID7121
Feature activation+6.31e-03

Pos logit contributions
rob+0.015
eh+0.013
aj+0.012
ross+0.012
xy+0.012
Neg logit contributions
 dip-0.010
 stare-0.010
 look-0.010
 gaze-0.010
 touch-0.010
Token him
Token ID683
Feature activation+9.12e-03

Pos logit contributions
 dip+0.008
 stare+0.008
 gaze+0.007
 look+0.007
 wonder+0.007
Neg logit contributions
rob-0.011
eh-0.010
lich-0.009
 nonsense-0.009
xy-0.009
Token a
Token ID257
Feature activation-3.33e-04

Pos logit contributions
 dip+0.014
 stare+0.013
 gaze+0.013
 crab+0.013
 touch+0.013
Neg logit contributions
rob-0.016
eh-0.015
aj-0.014
 nonsense-0.014
ross-0.013
Token birthday
Token ID10955
Feature activation-6.24e-03

Pos logit contributions
rob+0.001
eh+0.000
 nonsense+0.000
aj+0.000
lf+0.000
Neg logit contributions
 dip-0.000
 stare-0.000
 touch-0.000
 gaze-0.000
 Seal-0.000
Token party
Token ID2151
Feature activation-6.40e-03

Pos logit contributions
rob+0.011
eh+0.009
aj+0.009
ross+0.008
 nonsense+0.008
Neg logit contributions
 dip-0.009
 stare-0.009
 look-0.008
 pet-0.008
 touch-0.008
Token!"
Token ID2474
Feature activation+1.38e-03

Pos logit contributions
rob+0.010
eh+0.009
aj+0.008
lich+0.008
ross+0.008
Neg logit contributions
 look-0.010
 dip-0.009
 pet-0.009
 touch-0.009
 try-0.009
Token said
Token ID531
Feature activation+6.94e-03

Pos logit contributions
 dip+0.002
 stare+0.002
 look+0.002
 gaze+0.002
 touch+0.002
Neg logit contributions
rob-0.002
eh-0.002
aj-0.002
 nonsense-0.002
ross-0.002
Token Ms
Token ID6997
Feature activation+5.01e-02

Pos logit contributions
 dip+0.009
 stare+0.009
 look+0.009
 gaze+0.009
 wonder+0.009
Neg logit contributions
rob-0.012
eh-0.011
ross-0.010
aj-0.009
 nonsense-0.009
Token.
Token ID13
Feature activation+5.50e-03

Pos logit contributions
 dip+0.057
 look+0.055
 stare+0.055
 pet+0.053
 touch+0.053
Neg logit contributions
rob-0.101
eh-0.088
aj-0.084
ross-0.082
xy-0.080
Token Beaver
Token ID42746
Feature activation+1.44e-02

Pos logit contributions
 dip+0.007
 look+0.007
 stare+0.007
 wonder+0.007
 gaze+0.006
Neg logit contributions
rob-0.010
eh-0.009
aj-0.008
ross-0.008
three-0.008
Token.
Token ID13
Feature activation-4.17e-03

Pos logit contributions
 dip+0.021
 wonder+0.020
 stare+0.020
 poppy+0.019
 crab+0.019
Neg logit contributions
rob-0.020
eh-0.019
es-0.018
aj-0.018
lich-0.017
Token Dave
Token ID9935
Feature activation+3.52e-04

Pos logit contributions
rob+0.007
eh+0.006
aj+0.006
ross+0.006
xy+0.006
Neg logit contributions
 dip-0.006
 look-0.006
 stare-0.006
 touch-0.005
 gaze-0.005
Token felt
Token ID2936
Feature activation+2.00e-02

Pos logit contributions
 dip+0.001
 stare+0.000
 look+0.000
 wonder+0.000
 gaze+0.000
Neg logit contributions
rob-0.001
eh-0.000
aj-0.000
es-0.000
lf-0.000
Token it
Token ID340
Feature activation+9.30e-03

Pos logit contributions
 stare+0.019
 dip+0.019
 gaze+0.018
 Swan+0.017
 distance+0.017
Neg logit contributions
rob-0.036
eh-0.031
 nonsense-0.029
xy-0.029
three-0.028
Token up
Token ID510
Feature activation-1.02e-03

Pos logit contributions
 dip+0.013
 stare+0.013
 gaze+0.013
 Seal+0.013
 wonder+0.012
Neg logit contributions
rob-0.015
eh-0.013
aj-0.012
xy-0.012
 nonsense-0.012
Token.
Token ID13
Feature activation+1.44e-02

Pos logit contributions
rob+0.002
eh+0.002
aj+0.001
ross+0.001
ider+0.001
Neg logit contributions
 dip-0.001
 stare-0.001
 gaze-0.001
 look-0.001
 Seal-0.001
Token With
Token ID2080
Feature activation+6.61e-03

Pos logit contributions
 dip+0.020
 poppy+0.019
 pet+0.018
 crab+0.018
 stare+0.018
Neg logit contributions
rob-0.024
aj-0.020
eh-0.020
three-0.020
ross-0.019
Token a
Token ID257
Feature activation+2.53e-03

Pos logit contributions
 dip+0.010
 stare+0.010
 gaze+0.010
 Swan+0.009
 pet+0.009
Neg logit contributions
rob-0.010
eh-0.008
 nonsense-0.008
ross-0.007
xy-0.007
Token gentle
Token ID10296
Feature activation+5.00e-02

Pos logit contributions
 Seal+0.003
 Abby+0.003
 Ned+0.003
 Swan+0.003
 Nan+0.003
Neg logit contributions
eh-0.004
 AW-0.003
br-0.003
rob-0.003
aj-0.003
Token smile
Token ID8212
Feature activation+3.17e-03

Pos logit contributions
 Seal+0.045
 Ned+0.045
 Swan+0.042
 harbour+0.041
 dip+0.039
Neg logit contributions
eh-0.090
 AW-0.081
ross-0.080
 nonsense-0.075
EEP-0.072
Token,
Token ID11
Feature activation+1.45e-03

Pos logit contributions
 dip+0.005
 stare+0.005
 wonder+0.005
 gaze+0.005
 investigate+0.005
Neg logit contributions
rob-0.005
eh-0.004
ross-0.004
aces-0.004
es-0.004
Token he
Token ID339
Feature activation-1.19e-02

Pos logit contributions
 dip+0.002
 pet+0.002
 stare+0.002
 gaze+0.002
 poppy+0.002
Neg logit contributions
rob-0.002
eh-0.002
ross-0.002
 nonsense-0.002
aj-0.002
Token rubbed
Token ID31862
Feature activation+1.84e-02

Pos logit contributions
rob+0.024
eh+0.021
aj+0.020
ross+0.019
xy+0.019
Neg logit contributions
 dip-0.014
 stare-0.013
 look-0.013
 gaze-0.013
 touch-0.013
Token its
Token ID663
Feature activation+8.16e-03

Pos logit contributions
 dip+0.025
 stare+0.024
 pet+0.023
 look+0.023
 gaze+0.023
Neg logit contributions
rob-0.033
eh-0.028
aj-0.027
ross-0.026
lich-0.025
Token had
Token ID550
Feature activation+1.06e-02

Pos logit contributions
 dip+0.018
 stare+0.017
 look+0.017
 gaze+0.017
 touch+0.016
Neg logit contributions
rob-0.020
eh-0.017
aj-0.017
xy-0.016
ross-0.016
Token not
Token ID407
Feature activation+6.46e-03

Pos logit contributions
 Swan+0.013
 dip+0.013
 stare+0.013
 gaze+0.013
 Seal+0.013
Neg logit contributions
rob-0.017
eh-0.017
 nonsense-0.015
aj-0.015
ross-0.015
Token listened
Token ID16399
Feature activation+2.03e-02

Pos logit contributions
 Swan+0.008
 dip+0.008
 Seal+0.008
 gaze+0.008
 stare+0.008
Neg logit contributions
eh-0.010
rob-0.010
ross-0.009
 nonsense-0.009
lich-0.009
Token to
Token ID284
Feature activation+4.11e-03

Pos logit contributions
 dip+0.021
 stare+0.020
 look+0.020
 pet+0.020
 gaze+0.019
Neg logit contributions
rob-0.042
eh-0.037
aj-0.036
ross-0.035
xy-0.034
Token the
Token ID262
Feature activation+1.97e-02

Pos logit contributions
 dip+0.006
 stare+0.005
 gaze+0.005
 pet+0.005
 Swan+0.005
Neg logit contributions
rob-0.007
eh-0.006
 nonsense-0.006
ross-0.005
xy-0.005
Token wise
Token ID10787
Feature activation+4.97e-02

Pos logit contributions
 dip+0.030
 stare+0.029
 gaze+0.028
 look+0.028
 touch+0.028
Neg logit contributions
rob-0.030
eh-0.027
aj-0.024
 nonsense-0.024
lich-0.023
Token frog
Token ID21264
Feature activation+2.16e-02

Pos logit contributions
 dip+0.060
 stare+0.058
 gaze+0.056
 look+0.056
 wonder+0.055
Neg logit contributions
rob-0.088
eh-0.081
ross-0.074
 nonsense-0.074
aj-0.074
Token,
Token ID11
Feature activation-3.20e-03

Pos logit contributions
 dip+0.026
 stare+0.025
 gaze+0.024
 pet+0.024
 look+0.024
Neg logit contributions
rob-0.039
eh-0.036
aj-0.033
 nonsense-0.033
ross-0.033
Token he
Token ID339
Feature activation-8.27e-03

Pos logit contributions
rob+0.005
eh+0.004
 nonsense+0.004
ross+0.004
xy+0.004
Neg logit contributions
 dip-0.005
 pet-0.004
 stare-0.004
 gaze-0.004
 touch-0.004
Token would
Token ID561
Feature activation-2.55e-03

Pos logit contributions
rob+0.015
eh+0.013
aj+0.012
xy+0.012
ross+0.012
Neg logit contributions
 dip-0.011
 stare-0.011
 look-0.011
 gaze-0.010
 touch-0.010
Token have
Token ID423
Feature activation+1.85e-02

Pos logit contributions
rob+0.003
 nonsense+0.003
eh+0.003
 reminds+0.003
ross+0.003
Neg logit contributions
 Swan-0.003
 crab-0.003
 Seal-0.003
 dip-0.003
 clam-0.003
Token She
Token ID1375
Feature activation+1.51e-03

Pos logit contributions
 dip+0.010
 crab+0.009
 clam+0.009
 look+0.009
 stare+0.009
Neg logit contributions
rob-0.009
aj-0.008
eh-0.008
 nonsense-0.008
ibr-0.008
Token saw
Token ID2497
Feature activation-1.51e-02

Pos logit contributions
 look+0.002
 pet+0.001
 touch+0.001
 dip+0.001
 stare+0.001
Neg logit contributions
rob-0.003
eh-0.003
ross-0.003
xy-0.003
aj-0.003
Token her
Token ID607
Feature activation+1.10e-02

Pos logit contributions
rob+0.025
eh+0.020
aj+0.019
 nonsense+0.019
ibr+0.018
Neg logit contributions
 dip-0.023
 stare-0.022
 gaze-0.022
 look-0.021
 touch-0.021
Token teacher
Token ID4701
Feature activation+2.46e-03

Pos logit contributions
 dip+0.015
 stare+0.014
 look+0.014
 touch+0.014
 pet+0.014
Neg logit contributions
rob-0.020
eh-0.017
aj-0.016
ross-0.015
 nonsense-0.015
Token,
Token ID11
Feature activation+1.10e-02

Pos logit contributions
 dip+0.004
 stare+0.003
 look+0.003
 crab+0.003
 wonder+0.003
Neg logit contributions
rob-0.004
eh-0.003
aj-0.003
lf-0.003
 nonsense-0.003
Token Ms
Token ID6997
Feature activation+4.95e-02

Pos logit contributions
 dip+0.019
 clam+0.018
 stare+0.017
 gaze+0.017
 care+0.017
Neg logit contributions
rob-0.015
eh-0.012
aj-0.012
 nonsense-0.011
lich-0.011
Token.
Token ID13
Feature activation+2.16e-02

Pos logit contributions
 dip+0.058
 look+0.055
 stare+0.055
 touch+0.054
 pet+0.054
Neg logit contributions
rob-0.099
eh-0.086
aj-0.082
ross-0.080
xy-0.078
Token Lee
Token ID5741
Feature activation+1.14e-02

Pos logit contributions
 dip+0.035
 look+0.034
 stare+0.034
 touch+0.033
 pet+0.033
Neg logit contributions
rob-0.033
eh-0.027
aj-0.026
ross-0.024
 nonsense-0.024
Token,
Token ID11
Feature activation+1.01e-02

Pos logit contributions
 dip+0.014
 stare+0.013
 look+0.013
 gaze+0.013
 wonder+0.013
Neg logit contributions
rob-0.021
eh-0.018
aj-0.018
lf-0.017
 nonsense-0.017
Token and
Token ID290
Feature activation+2.90e-03

Pos logit contributions
 dip+0.014
 stare+0.013
 clam+0.013
 gaze+0.013
 crab+0.012
Neg logit contributions
rob-0.017
eh-0.015
aj-0.015
lich-0.014
 nonsense-0.014
Token said
Token ID531
Feature activation+9.46e-03

Pos logit contributions
 dip+0.003
 look+0.003
 pet+0.003
 stare+0.003
 touch+0.003
Neg logit contributions
rob-0.006
eh-0.005
aj-0.005
ross-0.005
xy-0.005
Token down
Token ID866
Feature activation+2.41e-03

Pos logit contributions
 look+0.005
 pet+0.005
 touch+0.005
 wonder+0.005
 stare+0.005
Neg logit contributions
rob-0.015
eh-0.013
 AW-0.013
 nonsense-0.012
ross-0.012
Token at
Token ID379
Feature activation+4.51e-03

Pos logit contributions
 look+0.002
 touch+0.002
 pet+0.002
 try+0.002
 stare+0.002
Neg logit contributions
rob-0.006
eh-0.005
aj-0.005
cod-0.005
ross-0.005
Token the
Token ID262
Feature activation+5.11e-03

Pos logit contributions
 look+0.005
 pet+0.004
 dip+0.004
 touch+0.004
 try+0.004
Neg logit contributions
rob-0.010
eh-0.010
aj-0.009
ross-0.009
ider-0.009
Token desk
Token ID6915
Feature activation-1.65e-02

Pos logit contributions
 look+0.007
 pet+0.006
 try+0.006
 touch+0.006
 gaze+0.006
Neg logit contributions
rob-0.011
eh-0.009
xy-0.009
aj-0.009
ross-0.009
Token.
Token ID13
Feature activation+3.43e-03

Pos logit contributions
rob+0.039
eh+0.033
aj+0.032
 AW+0.032
lich+0.031
Neg logit contributions
 look-0.015
 dip-0.014
 touch-0.014
 pet-0.014
 stare-0.014
Token Ms
Token ID6997
Feature activation+4.93e-02

Pos logit contributions
 dip+0.005
 look+0.004
 stare+0.004
 touch+0.004
 pet+0.004
Neg logit contributions
rob-0.006
eh-0.005
aj-0.005
ross-0.005
 nonsense-0.005
Token Daisy
Token ID40355
Feature activation-2.23e-03

Pos logit contributions
 dip+0.060
 stare+0.057
 look+0.056
 touch+0.055
 gaze+0.055
Neg logit contributions
rob-0.096
eh-0.083
aj-0.080
ross-0.077
xy-0.076
Token was
Token ID373
Feature activation+8.89e-03

Pos logit contributions
rob+0.004
eh+0.003
aj+0.003
xy+0.003
ross+0.003
Neg logit contributions
 dip-0.003
 stare-0.003
 look-0.003
 gaze-0.003
 try-0.003
Token a
Token ID257
Feature activation+4.46e-03

Pos logit contributions
 dip+0.010
 stare+0.010
 look+0.010
 touch+0.010
 gaze+0.010
Neg logit contributions
rob-0.018
eh-0.016
aj-0.015
ross-0.015
 nonsense-0.014
Token very
Token ID845
Feature activation-2.09e-02

Pos logit contributions
 dip+0.006
 stare+0.006
 touch+0.006
 gaze+0.006
 look+0.006
Neg logit contributions
rob-0.007
eh-0.007
 nonsense-0.006
aj-0.006
lf-0.006
Token brilliant
Token ID10457
Feature activation+6.46e-03

Pos logit contributions
rob+0.040
eh+0.033
aj+0.031
xy+0.031
lich+0.030
Neg logit contributions
 look-0.028
 dip-0.028
 pet-0.028
 stare-0.027
 touch-0.027
Token and
Token ID290
Feature activation+9.63e-03

Pos logit contributions
 dip+0.020
 stare+0.019
 look+0.018
 gaze+0.018
 wonder+0.018
Neg logit contributions
rob-0.032
eh-0.028
aj-0.026
ross-0.026
lich-0.025
Token scoop
Token ID23529
Feature activation+1.78e-02

Pos logit contributions
 stare+0.011
 wonder+0.011
 gaze+0.011
 Seal+0.011
 dip+0.011
Neg logit contributions
rob-0.017
eh-0.016
lich-0.014
aj-0.013
xy-0.013
Tokened
Token ID276
Feature activation+2.99e-02

Pos logit contributions
 dip+0.032
 look+0.031
 stare+0.031
 pet+0.031
 touch+0.031
Neg logit contributions
rob-0.024
eh-0.020
aj-0.018
ross-0.017
xy-0.017
Token up
Token ID510
Feature activation+2.13e-02

Pos logit contributions
 stare+0.041
 gaze+0.041
 dip+0.041
 Swan+0.038
 contain+0.038
Neg logit contributions
rob-0.044
eh-0.040
lich-0.038
es-0.037
 nonsense-0.035
Token the
Token ID262
Feature activation+2.84e-02

Pos logit contributions
 dip+0.029
 stare+0.028
 gaze+0.027
 wonder+0.026
 look+0.026
Neg logit contributions
rob-0.037
eh-0.033
lich-0.029
aj-0.029
three-0.029
Token little
Token ID1310
Feature activation+4.92e-02

Pos logit contributions
 stare+0.034
 gaze+0.033
 dip+0.032
 wonder+0.031
 Gem+0.031
Neg logit contributions
rob-0.046
eh-0.043
ider-0.037
ls-0.036
 nonsense-0.036
Token frog
Token ID21264
Feature activation+2.10e-02

Pos logit contributions
 stare+0.071
 gaze+0.069
 contain+0.068
 investigate+0.067
 sample+0.067
Neg logit contributions
rob-0.062
eh-0.053
 fort-0.049
ider-0.048
 knights-0.048
Token,
Token ID11
Feature activation-2.79e-03

Pos logit contributions
 stare+0.030
 wonder+0.030
 dip+0.030
 Seal+0.030
 gaze+0.029
Neg logit contributions
eh-0.030
rob-0.028
es-0.026
lich-0.025
aj-0.024
Token carrying
Token ID6872
Feature activation+3.62e-03

Pos logit contributions
rob+0.004
eh+0.004
aj+0.004
lich+0.004
 nonsense+0.004
Neg logit contributions
 dip-0.004
 Seal-0.004
 stare-0.004
 crab-0.004
 clam-0.003
Token it
Token ID340
Feature activation+3.14e-03

Pos logit contributions
 stare+0.005
 dip+0.005
 gaze+0.005
 look+0.004
 Swan+0.004
Neg logit contributions
rob-0.006
eh-0.006
lich-0.005
 nonsense-0.005
es-0.005
Token away
Token ID1497
Feature activation-4.19e-04

Pos logit contributions
 stare+0.004
 dip+0.004
 Seal+0.004
 gaze+0.004
 wonder+0.004
Neg logit contributions
eh-0.005
rob-0.005
aj-0.005
aces-0.004
ross-0.004
Token\n
Token ID198
Feature activation-7.82e-03

Pos logit contributions
rob+0.019
eh+0.017
aj+0.017
ross+0.016
 nonsense+0.016
Neg logit contributions
 dip-0.007
 look-0.007
 pet-0.007
 stare-0.007
 touch-0.007
TokenShe
Token ID3347
Feature activation-1.03e-02

Pos logit contributions
rob+0.016
aj+0.013
eh+0.013
ross+0.013
 nonsense+0.012
Neg logit contributions
 dip-0.009
 pet-0.009
 look-0.009
 touch-0.009
 stare-0.009
Token never
Token ID1239
Feature activation-9.04e-03

Pos logit contributions
rob+0.020
eh+0.018
aj+0.017
ross+0.016
 nonsense+0.016
Neg logit contributions
 dip-0.012
 look-0.012
 pet-0.012
 stare-0.012
 touch-0.012
Token forgot
Token ID16453
Feature activation+1.04e-02

Pos logit contributions
rob+0.013
eh+0.011
ross+0.011
aj+0.010
 nonsense+0.010
Neg logit contributions
 dip-0.013
 Swan-0.013
 Seal-0.013
 poppy-0.013
 clam-0.012
Token the
Token ID262
Feature activation+1.52e-02

Pos logit contributions
 dip+0.014
 stare+0.013
 gaze+0.013
 wonder+0.012
 Swan+0.012
Neg logit contributions
rob-0.017
eh-0.016
 nonsense-0.015
aj-0.015
ross-0.015
Token kind
Token ID1611
Feature activation+4.91e-02

Pos logit contributions
 dip+0.021
 stare+0.021
 Seal+0.020
 touch+0.020
 look+0.020
Neg logit contributions
rob-0.023
eh-0.022
 nonsense-0.020
aj-0.019
 reminds-0.018
Token bees
Token ID17002
Feature activation+2.40e-02

Pos logit contributions
 dip+0.068
 paddle+0.066
 stare+0.065
 investigate+0.063
 gaze+0.062
Neg logit contributions
eh-0.063
ross-0.063
rob-0.062
 nonsense-0.061
 teaches-0.059
Token,
Token ID11
Feature activation+1.90e-03

Pos logit contributions
 dip+0.031
 stare+0.029
 wonder+0.029
 gaze+0.029
 Swan+0.029
Neg logit contributions
rob-0.039
eh-0.035
es-0.033
xy-0.032
ross-0.032
Token who
Token ID508
Feature activation+2.74e-03

Pos logit contributions
 dip+0.002
 stare+0.002
 look+0.002
 gaze+0.002
 touch+0.002
Neg logit contributions
rob-0.004
eh-0.003
aj-0.003
ross-0.003
 nonsense-0.003
Token showed
Token ID3751
Feature activation-3.76e-03

Pos logit contributions
 dip+0.004
 stare+0.004
 look+0.004
 touch+0.004
 gaze+0.004
Neg logit contributions
rob-0.005
eh-0.004
xy-0.004
aj-0.004
ross-0.004
Token her
Token ID607
Feature activation+1.53e-02

Pos logit contributions
rob+0.006
eh+0.005
es+0.005
 nonsense+0.005
ibr+0.005
Neg logit contributions
 dip-0.005
 stare-0.005
 gaze-0.005
 crab-0.005
 Swan-0.005
Token Mum
Token ID38367
Feature activation+1.83e-02

Pos logit contributions
 dip+0.008
 stare+0.007
 gaze+0.007
 look+0.007
 distance+0.007
Neg logit contributions
aj-0.007
rob-0.007
lf-0.006
 homework-0.006
eh-0.006
Token,
Token ID11
Feature activation+1.61e-02

Pos logit contributions
 dip+0.025
 stare+0.025
 wonder+0.024
 gaze+0.024
 look+0.023
Neg logit contributions
rob-0.031
eh-0.027
aj-0.026
xy-0.025
 nonsense-0.024
Token when
Token ID618
Feature activation+2.60e-02

Pos logit contributions
 dip+0.021
 stare+0.020
 care+0.019
 gaze+0.019
 look+0.018
Neg logit contributions
rob-0.026
eh-0.026
aj-0.024
lich-0.023
 nonsense-0.022
Token he
Token ID339
Feature activation+7.50e-03

Pos logit contributions
 dip+0.043
 stare+0.041
 look+0.040
 gaze+0.040
 pet+0.040
Neg logit contributions
rob-0.036
eh-0.031
lf-0.029
ross-0.028
 nonsense-0.028
Token saw
Token ID2497
Feature activation-2.82e-03

Pos logit contributions
 dip+0.010
 stare+0.010
 gaze+0.009
 look+0.009
 wonder+0.009
Neg logit contributions
rob-0.013
eh-0.012
aj-0.011
lf-0.011
lich-0.010
Token Mrs
Token ID9074
Feature activation+4.90e-02

Pos logit contributions
rob+0.005
eh+0.004
 nonsense+0.004
lf+0.004
ross+0.004
Neg logit contributions
 dip-0.004
 stare-0.004
 gaze-0.003
 Swan-0.003
 Seal-0.003
Token Smith
Token ID4176
Feature activation+6.87e-03

Pos logit contributions
 dip+0.069
 stare+0.065
 wonder+0.063
 try+0.063
 look+0.062
Neg logit contributions
rob-0.085
eh-0.072
aj-0.070
xy-0.068
ross-0.067
Token.
Token ID13
Feature activation+1.28e-02

Pos logit contributions
 wonder+0.010
 stare+0.010
 dip+0.010
 gaze+0.010
 care+0.010
Neg logit contributions
rob-0.010
eh-0.009
aj-0.008
es-0.008
lf-0.008
Token He
Token ID679
Feature activation-1.41e-03

Pos logit contributions
 dip+0.024
 stare+0.022
 care+0.021
 look+0.021
 crab+0.021
Neg logit contributions
rob-0.015
eh-0.013
aj-0.013
 nonsense-0.012
 Cere-0.011
Token felt
Token ID2936
Feature activation+2.06e-02

Pos logit contributions
rob+0.002
eh+0.002
lf+0.002
aj+0.002
 nonsense+0.002
Neg logit contributions
 dip-0.002
 stare-0.002
 look-0.002
 gaze-0.002
 touch-0.002
Token ashamed
Token ID22461
Feature activation+2.24e-02

Pos logit contributions
 dip+0.034
 Swan+0.033
 gaze+0.032
 stare+0.031
 Seal+0.031
Neg logit contributions
eh-0.023
rob-0.022
 infected-0.022
 nonsense-0.022
 important-0.021
Token the
Token ID262
Feature activation-1.24e-02

Pos logit contributions
rob+0.004
eh+0.003
aj+0.003
ross+0.003
xy+0.003
Neg logit contributions
 look-0.002
 dip-0.002
 pet-0.002
 touch-0.002
 try-0.002
Token broken
Token ID5445
Feature activation-2.13e-02

Pos logit contributions
rob+0.020
eh+0.017
aj+0.016
 nonsense+0.015
lich+0.015
Neg logit contributions
 dip-0.019
 stare-0.018
 look-0.018
 pet-0.018
 touch-0.017
Token toy
Token ID13373
Feature activation-2.79e-02

Pos logit contributions
rob+0.039
eh+0.033
aj+0.031
ross+0.031
xy+0.030
Neg logit contributions
 look-0.029
 dip-0.029
 pet-0.028
 touch-0.028
 stare-0.028
Token in
Token ID287
Feature activation-5.22e-03

Pos logit contributions
rob+0.050
eh+0.042
aj+0.040
xy+0.038
ross+0.038
Neg logit contributions
 dip-0.038
 stare-0.037
 look-0.036
 gaze-0.036
 pet-0.036
Token his
Token ID465
Feature activation-1.75e-02

Pos logit contributions
rob+0.008
eh+0.006
lf+0.006
 nonsense+0.006
aj+0.006
Neg logit contributions
 dip-0.008
 stare-0.008
 gaze-0.008
 pet-0.007
 Swan-0.007
Token je
Token ID11223
Feature activation-7.27e-02

Pos logit contributions
rob+0.029
eh+0.026
aj+0.025
lich+0.022
lf+0.022
Neg logit contributions
 dip-0.025
 stare-0.024
 touch-0.023
 Swan-0.023
 Seal-0.023
Tokenep
Token ID538
Feature activation-2.49e-02

Pos logit contributions
rob+0.076
 nonsense+0.061
ross+0.060
eh+0.053
ider+0.051
Neg logit contributions
 look-0.138
 pet-0.134
 try-0.133
 touch-0.133
 peek-0.128
Token.
Token ID13
Feature activation-7.72e-04

Pos logit contributions
rob+0.052
eh+0.045
aj+0.043
ross+0.042
xy+0.041
Neg logit contributions
 dip-0.027
 look-0.027
 stare-0.026
 pet-0.026
 touch-0.026
Token They
Token ID1119
Feature activation-1.06e-02

Pos logit contributions
rob+0.001
eh+0.001
aj+0.001
xy+0.001
ross+0.001
Neg logit contributions
 dip-0.001
 stare-0.001
 look-0.001
 gaze-0.001
 pet-0.001
Token went
Token ID1816
Feature activation-4.32e-03

Pos logit contributions
rob+0.020
eh+0.017
aj+0.016
xy+0.016
lf+0.016
Neg logit contributions
 dip-0.013
 stare-0.012
 look-0.012
 pet-0.012
 gaze-0.012
Token to
Token ID284
Feature activation-1.03e-02

Pos logit contributions
rob+0.009
eh+0.007
aj+0.007
xy+0.007
lich+0.007
Neg logit contributions
 dip-0.005
 stare-0.005
 gaze-0.005
 wonder-0.005
 touch-0.004
Token ride
Token ID6594
Feature activation-1.53e-02

Pos logit contributions
 stare+0.004
 dip+0.004
 clam+0.004
 wonder+0.003
 Seal+0.003
Neg logit contributions
rob-0.005
eh-0.004
lf-0.004
xy-0.003
 AW-0.003
Token in
Token ID287
Feature activation-3.75e-03

Pos logit contributions
rob+0.032
eh+0.029
aj+0.028
ross+0.026
lich+0.025
Neg logit contributions
 look-0.017
 dip-0.016
 pet-0.016
 touch-0.016
 stare-0.015
Token the
Token ID262
Feature activation-1.58e-02

Pos logit contributions
rob+0.007
eh+0.006
lf+0.006
 nonsense+0.006
aj+0.006
Neg logit contributions
 dip-0.005
 stare-0.005
 gaze-0.004
 clam-0.004
 crab-0.004
Token big
Token ID1263
Feature activation-1.29e-02

Pos logit contributions
rob+0.026
eh+0.022
aj+0.020
lich+0.019
 nonsense+0.019
Neg logit contributions
 dip-0.024
 stare-0.023
 look-0.022
 touch-0.022
 gaze-0.022
Token red
Token ID2266
Feature activation-1.85e-02

Pos logit contributions
rob+0.021
eh+0.017
aj+0.016
lf+0.015
opl+0.015
Neg logit contributions
 dip-0.019
 stare-0.019
 gaze-0.018
 Seal-0.018
 touch-0.018
Token je
Token ID11223
Feature activation-7.23e-02

Pos logit contributions
rob+0.027
eh+0.022
aj+0.019
lich+0.019
opl+0.018
Neg logit contributions
 dip-0.031
 stare-0.030
 gaze-0.029
 Seal-0.029
 clam-0.029
Tokenep
Token ID538
Feature activation-2.04e-02

Pos logit contributions
rob+0.068
 nonsense+0.063
ross+0.062
 teaches+0.053
eh+0.053
Neg logit contributions
 look-0.139
 pet-0.135
 touch-0.134
 try-0.131
 peek-0.129
Token.
Token ID13
Feature activation-2.69e-03

Pos logit contributions
rob+0.043
eh+0.038
aj+0.036
ross+0.035
xy+0.035
Neg logit contributions
 dip-0.021
 look-0.021
 stare-0.020
 touch-0.020
 pet-0.020
Token<|endoftext|>
Token ID50256
Feature activation+3.63e-04

Pos logit contributions
rob+0.006
eh+0.005
aj+0.005
ross+0.004
xy+0.004
Neg logit contributions
 dip-0.003
 look-0.003
 stare-0.003
 pet-0.003
 touch-0.003
TokenOnce
Token ID7454
Feature activation-5.18e-03

Pos logit contributions
 dip+0.000
 look+0.000
 pet+0.000
 touch+0.000
 stare+0.000
Neg logit contributions
rob-0.001
eh-0.001
aj-0.001
ross-0.001
 nonsense-0.001
Token upon
Token ID2402
Feature activation-2.40e-02

Pos logit contributions
rob+0.011
eh+0.009
ider+0.008
ross+0.008
lich+0.008
Neg logit contributions
 look-0.006
 touch-0.006
 dip-0.006
 try-0.006
 stare-0.006
Token Tim
Token ID5045
Feature activation-1.94e-03

Pos logit contributions
rob+0.012
eh+0.010
aj+0.009
ross+0.009
 nonsense+0.009
Neg logit contributions
 dip-0.008
 look-0.008
 touch-0.008
 try-0.008
 stare-0.008
Token ride
Token ID6594
Feature activation-8.01e-03

Pos logit contributions
rob+0.004
eh+0.004
aj+0.003
lich+0.003
 nonsense+0.003
Neg logit contributions
 dip-0.002
 wonder-0.002
 stare-0.002
 clam-0.002
 gaze-0.002
Token in
Token ID287
Feature activation-5.23e-03

Pos logit contributions
rob+0.018
eh+0.016
aj+0.015
ross+0.015
ibr+0.014
Neg logit contributions
 look-0.008
 touch-0.008
 dip-0.008
 try-0.007
 pet-0.007
Token your
Token ID534
Feature activation-7.54e-03

Pos logit contributions
rob+0.011
eh+0.009
aj+0.009
ross+0.009
xy+0.009
Neg logit contributions
 dip-0.006
 look-0.006
 stare-0.005
 pet-0.005
 touch-0.005
Token red
Token ID2266
Feature activation-2.03e-02

Pos logit contributions
rob+0.013
eh+0.012
aj+0.011
ross+0.010
xy+0.010
Neg logit contributions
 dip-0.010
 stare-0.010
 look-0.010
 touch-0.010
 pet-0.010
Token je
Token ID11223
Feature activation-7.05e-02

Pos logit contributions
rob+0.034
eh+0.029
aj+0.028
lich+0.025
xy+0.025
Neg logit contributions
 dip-0.030
 stare-0.029
 look-0.028
 wonder-0.028
 pet-0.028
Tokenep
Token ID538
Feature activation-2.21e-02

Pos logit contributions
rob+0.069
ross+0.063
 nonsense+0.062
eh+0.054
chy+0.052
Neg logit contributions
 look-0.132
 pet-0.128
 touch-0.126
 try-0.126
 gaze-0.122
Token?"
Token ID1701
Feature activation+7.87e-03

Pos logit contributions
rob+0.039
eh+0.034
ross+0.031
aj+0.031
xy+0.030
Neg logit contributions
 look-0.032
 touch-0.030
 dip-0.030
 try-0.030
 stare-0.029
Token The
Token ID383
Feature activation+2.69e-03

Pos logit contributions
 dip+0.014
 clam+0.013
 stare+0.013
 look+0.013
 gaze+0.013
Neg logit contributions
rob-0.010
eh-0.008
aj-0.008
lich-0.007
xy-0.007
Token man
Token ID582
Feature activation+1.51e-03

Pos logit contributions
 dip+0.004
 stare+0.004
 look+0.004
 touch+0.004
 gaze+0.004
Neg logit contributions
rob-0.004
eh-0.003
aj-0.003
lf-0.003
lich-0.003
Token said
Token ID531
Feature activation+1.28e-02

Pos logit contributions
 dip+0.002
 stare+0.002
 gaze+0.002
 look+0.002
 poppy+0.002
Neg logit contributions
rob-0.003
eh-0.002
aj-0.002
lf-0.002
xy-0.002
Token They
Token ID1119
Feature activation-7.09e-03

Pos logit contributions
 dip+0.009
 clam+0.009
 crab+0.008
 stare+0.008
 gaze+0.008
Neg logit contributions
rob-0.006
 nonsense-0.005
xy-0.005
EEP-0.005
 AW-0.005
Token run
Token ID1057
Feature activation+2.31e-07

Pos logit contributions
rob+0.012
 nonsense+0.011
 AW+0.011
eh+0.010
lich+0.010
Neg logit contributions
 Seal-0.009
 Swan-0.009
 crab-0.008
 poppy-0.008
 clam-0.008
Token back
Token ID736
Feature activation-1.51e-02

Feature not active on previous token; no predictions were affected.
Token to
Token ID284
Feature activation-5.75e-03

Pos logit contributions
rob+0.033
eh+0.029
aj+0.028
ross+0.026
xy+0.026
Neg logit contributions
 dip-0.015
 stare-0.015
 look-0.014
 gaze-0.014
 try-0.014
Token their
Token ID511
Feature activation-1.06e-02

Pos logit contributions
rob+0.007
 nonsense+0.006
lich+0.005
 homework+0.005
eh+0.005
Neg logit contributions
 dip-0.009
 clam-0.009
mit-0.009
 crab-0.009
 Swan-0.009
Token je
Token ID11223
Feature activation-7.02e-02

Pos logit contributions
rob+0.020
eh+0.017
aj+0.016
ross+0.015
xy+0.015
Neg logit contributions
 dip-0.014
 look-0.014
 pet-0.014
 stare-0.013
 touch-0.013
Tokenep
Token ID538
Feature activation-1.96e-02

Pos logit contributions
rob+0.073
 nonsense+0.066
ross+0.062
chy+0.058
eh+0.058
Neg logit contributions
 look-0.127
 pet-0.126
 touch-0.121
 try-0.121
 crab-0.117
Token.
Token ID13
Feature activation+2.18e-03

Pos logit contributions
rob+0.040
eh+0.035
aj+0.034
ross+0.033
xy+0.032
Neg logit contributions
 dip-0.022
 look-0.021
 stare-0.021
 pet-0.020
 touch-0.020
Token They
Token ID1119
Feature activation-8.66e-03

Pos logit contributions
 dip+0.004
 clam+0.003
 crab+0.003
 stare+0.003
 look+0.003
Neg logit contributions
rob-0.003
aj-0.003
 nonsense-0.003
xy-0.002
CR-0.002
Token get
Token ID651
Feature activation+5.59e-03

Pos logit contributions
rob+0.016
eh+0.014
 AW+0.014
 nonsense+0.014
aj+0.014
Neg logit contributions
 Swan-0.009
 Seal-0.009
 crab-0.009
 poppy-0.009
 clam-0.008
Token in
Token ID287
Feature activation-3.35e-03

Pos logit contributions
 stare+0.008
 gaze+0.008
 dip+0.008
 Seal+0.007
 clam+0.007
Neg logit contributions
rob-0.009
eh-0.008
 infected-0.007
lich-0.007
aj-0.007
Token,
Token ID11
Feature activation-1.65e-02

Pos logit contributions
rob+0.047
eh+0.042
aj+0.040
xy+0.038
ross+0.038
Neg logit contributions
 look-0.013
 dip-0.012
 try-0.012
 pet-0.012
 touch-0.012
Token Tim
Token ID5045
Feature activation-1.51e-02

Pos logit contributions
rob+0.029
eh+0.025
aj+0.024
ross+0.023
xy+0.022
Neg logit contributions
 dip-0.023
 look-0.022
 stare-0.022
 pet-0.022
 touch-0.022
Tokenmy
Token ID1820
Feature activation-1.55e-02

Pos logit contributions
rob+0.029
eh+0.025
aj+0.024
ider+0.023
ross+0.023
Neg logit contributions
 look-0.019
 dip-0.018
 pet-0.018
 touch-0.018
 try-0.018
Token drove
Token ID10357
Feature activation-1.42e-02

Pos logit contributions
rob+0.029
eh+0.025
aj+0.024
ross+0.023
xy+0.023
Neg logit contributions
 dip-0.020
 look-0.019
 stare-0.019
 pet-0.019
 touch-0.019
Token his
Token ID465
Feature activation-1.17e-02

Pos logit contributions
rob+0.029
eh+0.025
aj+0.024
ross+0.023
 nonsense+0.023
Neg logit contributions
 dip-0.016
 stare-0.016
 look-0.015
 pet-0.015
 touch-0.015
Token je
Token ID11223
Feature activation-6.98e-02

Pos logit contributions
rob+0.020
eh+0.017
aj+0.017
ross+0.015
 nonsense+0.015
Neg logit contributions
 dip-0.017
 look-0.016
 stare-0.016
 touch-0.016
 pet-0.016
Tokenep
Token ID538
Feature activation-2.14e-02

Pos logit contributions
rob+0.076
 nonsense+0.063
ross+0.055
 Cere+0.053
 AW+0.053
Neg logit contributions
 look-0.128
 pet-0.127
 try-0.126
 touch-0.124
 dip-0.120
Token slowly
Token ID6364
Feature activation-1.55e-02

Pos logit contributions
rob+0.044
eh+0.039
aj+0.038
xy+0.036
ross+0.035
Neg logit contributions
 look-0.024
 dip-0.023
 pet-0.023
 touch-0.023
 try-0.022
Token and
Token ID290
Feature activation-1.38e-02

Pos logit contributions
rob+0.028
eh+0.024
aj+0.023
aces+0.022
ross+0.022
Neg logit contributions
 dip-0.019
 crab-0.019
 stare-0.019
 clam-0.018
 wonder-0.018
Token carefully
Token ID7773
Feature activation-9.95e-03

Pos logit contributions
rob+0.028
eh+0.025
aj+0.024
ider+0.023
ross+0.022
Neg logit contributions
 look-0.016
 dip-0.016
 pet-0.015
 stare-0.015
 touch-0.015
Token.
Token ID13
Feature activation-7.31e-03

Pos logit contributions
rob+0.017
eh+0.014
aj+0.013
aces+0.012
ross+0.012
Neg logit contributions
 dip-0.014
 wonder-0.014
 stare-0.014
 crab-0.014
 Swan-0.014
Token to
Token ID284
Feature activation+2.55e-03

Pos logit contributions
 crab+0.007
 clam+0.007
 stare+0.007
 dip+0.007
 Seal+0.007
Neg logit contributions
rob-0.010
aj-0.009
eh-0.009
 nonsense-0.009
three-0.008
Token put
Token ID1234
Feature activation-2.67e-03

Pos logit contributions
 crab+0.003
 clam+0.002
 Seal+0.002
 Swan+0.002
 poppy+0.002
Neg logit contributions
rob-0.005
aj-0.004
lf-0.004
eh-0.004
ep-0.004
Token on
Token ID319
Feature activation+1.15e-03

Pos logit contributions
rob+0.004
eh+0.004
aj+0.004
xy+0.003
 nonsense+0.003
Neg logit contributions
 dip-0.004
 stare-0.004
 gaze-0.004
 pet-0.004
 look-0.004
Token your
Token ID534
Feature activation+4.73e-03

Pos logit contributions
 dip+0.002
 look+0.002
 pet+0.002
 stare+0.002
 touch+0.002
Neg logit contributions
rob-0.002
eh-0.002
aj-0.001
ross-0.001
xy-0.001
Token special
Token ID2041
Feature activation-3.00e-04

Pos logit contributions
 dip+0.006
 stare+0.006
 look+0.006
 touch+0.006
 pet+0.006
Neg logit contributions
rob-0.009
eh-0.008
aj-0.007
lich-0.007
ross-0.007
Token je
Token ID11223
Feature activation-6.97e-02

Pos logit contributions
rob+0.001
eh+0.000
aj+0.000
 nonsense+0.000
lf+0.000
Neg logit contributions
 dip-0.000
 stare-0.000
 look-0.000
 touch-0.000
 pet-0.000
Tokenep
Token ID538
Feature activation-1.46e-02

Pos logit contributions
rob+0.074
 nonsense+0.065
ross+0.062
eh+0.057
 Cere+0.056
Neg logit contributions
 look-0.130
 try-0.123
 touch-0.121
 pet-0.121
 peek-0.118
Token clothes
Token ID8242
Feature activation+1.89e-03

Pos logit contributions
rob+0.027
eh+0.023
aj+0.021
ross+0.021
 nonsense+0.020
Neg logit contributions
 look-0.020
 dip-0.020
 touch-0.019
 try-0.019
 stare-0.019
Token."
Token ID526
Feature activation+1.89e-03

Pos logit contributions
 look+0.002
 dip+0.002
 touch+0.002
 pet+0.002
 try+0.002
Neg logit contributions
rob-0.004
eh-0.003
aj-0.003
ross-0.003
lich-0.003
Token
Token ID220
Feature activation-9.51e-03

Pos logit contributions
 dip+0.003
 stare+0.002
 clam+0.002
 look+0.002
 pet+0.002
Neg logit contributions
rob-0.003
aj-0.003
eh-0.003
lf-0.003
lich-0.003
Token\n
Token ID198
Feature activation+1.15e-03

Pos logit contributions
rob+0.024
eh+0.022
ross+0.020
 nonsense+0.020
ider+0.020
Neg logit contributions
 look-0.006
 touch-0.006
 try-0.006
 dip-0.006
 stare-0.005
Tokenep
Token ID538
Feature activation-2.42e-02

Pos logit contributions
rob+0.064
 nonsense+0.054
ross+0.052
eh+0.051
aj+0.048
Neg logit contributions
 look-0.121
 try-0.115
 pet-0.113
 touch-0.112
 crab-0.108
Token.
Token ID13
Feature activation-8.81e-04

Pos logit contributions
rob+0.047
eh+0.040
aj+0.038
ibr+0.035
xy+0.035
Neg logit contributions
 look-0.033
 try-0.031
 touch-0.031
 dip-0.030
 stare-0.030
Token This
Token ID770
Feature activation-1.44e-02

Pos logit contributions
rob+0.002
eh+0.001
aj+0.001
ross+0.001
xy+0.001
Neg logit contributions
 dip-0.001
 look-0.001
 stare-0.001
 pet-0.001
 touch-0.001
Token is
Token ID318
Feature activation+8.34e-04

Pos logit contributions
rob+0.024
eh+0.021
aj+0.020
ross+0.020
ider+0.019
Neg logit contributions
 look-0.022
 dip-0.021
 try-0.021
 touch-0.020
 pet-0.020
Token my
Token ID616
Feature activation-1.26e-02

Pos logit contributions
 look+0.001
 dip+0.001
 try+0.001
 touch+0.001
 pet+0.001
Neg logit contributions
rob-0.001
eh-0.001
aj-0.001
ross-0.001
xy-0.001
Token je
Token ID11223
Feature activation-6.93e-02

Pos logit contributions
rob+0.022
eh+0.018
aj+0.017
 nonsense+0.016
xy+0.016
Neg logit contributions
 dip-0.018
 stare-0.017
 look-0.017
 touch-0.017
 pet-0.017
Tokenep
Token ID538
Feature activation-2.87e-02

Pos logit contributions
rob+0.071
 nonsense+0.061
ross+0.059
eh+0.055
aj+0.053
Neg logit contributions
 look-0.132
 try-0.125
 pet-0.123
 touch-0.121
 crab-0.119
Token!"
Token ID2474
Feature activation+4.43e-03

Pos logit contributions
rob+0.056
eh+0.047
aj+0.045
ibr+0.042
three+0.041
Neg logit contributions
 look-0.040
 try-0.037
 touch-0.037
 dip-0.035
 stare-0.035
Token
Token ID198
Feature activation+4.77e-03

Pos logit contributions
 dip+0.004
 look+0.004
 stare+0.004
 pet+0.004
 touch+0.004
Neg logit contributions
rob-0.010
eh-0.009
aj-0.008
ross-0.008
xy-0.008
Token
Token ID198
Feature activation+3.64e-03

Pos logit contributions
 look+0.004
 dip+0.004
 touch+0.004
 pet+0.004
 stare+0.004
Neg logit contributions
rob-0.011
eh-0.010
aj-0.010
ross-0.009
 nonsense-0.009
TokenL
Token ID43
Feature activation+2.34e-04

Pos logit contributions
 dip+0.006
 look+0.006
 stare+0.006
 pet+0.006
 touch+0.006
Neg logit contributions
rob-0.005
eh-0.004
aj-0.004
ross-0.004
xy-0.004
Token to
Token ID284
Feature activation+1.04e-04

Pos logit contributions
 dip+0.022
 stare+0.021
 pet+0.021
 wonder+0.021
 Seal+0.021
Neg logit contributions
rob-0.027
eh-0.023
aj-0.023
ross-0.021
 nonsense-0.021
Token go
Token ID467
Feature activation+1.80e-03

Pos logit contributions
 crab+0.000
 clam+0.000
 Seal+0.000
mit+0.000
 Swan+0.000
Neg logit contributions
lf-0.000
rob-0.000
eh-0.000
CR-0.000
 homework-0.000
Token out
Token ID503
Feature activation+8.76e-03

Pos logit contributions
 dip+0.002
 stare+0.002
 wonder+0.002
 clam+0.002
 crab+0.002
Neg logit contributions
rob-0.003
eh-0.003
aj-0.003
xy-0.003
 nonsense-0.003
Token in
Token ID287
Feature activation+7.39e-03

Pos logit contributions
 dip+0.009
 look+0.009
 stare+0.009
 pet+0.009
 touch+0.009
Neg logit contributions
rob-0.019
eh-0.016
aj-0.015
ross-0.015
xy-0.015
Token the
Token ID262
Feature activation-2.38e-03

Pos logit contributions
 dip+0.009
 stare+0.009
 clam+0.009
 crab+0.009
 Seal+0.009
Neg logit contributions
rob-0.012
aj-0.010
eh-0.010
lf-0.010
 nonsense-0.009
Token je
Token ID11223
Feature activation-6.90e-02

Pos logit contributions
rob+0.004
eh+0.004
aj+0.003
 nonsense+0.003
 fort+0.003
Neg logit contributions
 dip-0.003
 stare-0.003
 clam-0.003
 touch-0.003
 look-0.003
Tokenep
Token ID538
Feature activation-1.94e-02

Pos logit contributions
rob+0.064
 nonsense+0.059
ross+0.057
ses+0.051
 teaches+0.049
Neg logit contributions
 look-0.135
 touch-0.131
 pet-0.129
 gaze-0.129
 distance-0.128
Token.
Token ID13
Feature activation+2.96e-03

Pos logit contributions
rob+0.039
eh+0.034
aj+0.032
ross+0.031
xy+0.031
Neg logit contributions
 dip-0.022
 look-0.022
 stare-0.021
 pet-0.021
 touch-0.021
Token Tommy
Token ID19919
Feature activation+5.96e-03

Pos logit contributions
 dip+0.004
 clam+0.004
 poppy+0.004
 crab+0.004
 stare+0.004
Neg logit contributions
rob-0.004
aj-0.004
eh-0.004
three-0.004
 nonsense-0.004
Token was
Token ID373
Feature activation+4.37e-03

Pos logit contributions
 dip+0.007
 stare+0.007
 look+0.007
 pet+0.007
 touch+0.007
Neg logit contributions
rob-0.012
eh-0.010
aj-0.010
ross-0.009
xy-0.009
Token so
Token ID523
Feature activation+1.48e-03

Pos logit contributions
 dip+0.005
 stare+0.005
 touch+0.005
 gaze+0.005
 pet+0.005
Neg logit contributions
rob-0.008
eh-0.008
aj-0.007
lich-0.007
ross-0.007
Token riding
Token ID10311
Feature activation-1.62e-02

Pos logit contributions
rob+0.001
eh+0.000
aj+0.000
ross+0.000
ider+0.000
Neg logit contributions
 look-0.000
 dip-0.000
 pet-0.000
 stare-0.000
 touch-0.000
Token in
Token ID287
Feature activation-4.45e-04

Pos logit contributions
rob+0.036
aj+0.032
eh+0.032
ross+0.031
ider+0.031
Neg logit contributions
 look-0.016
 pet-0.016
 touch-0.015
 stare-0.014
 dip-0.014
Token a
Token ID257
Feature activation-1.20e-02

Pos logit contributions
rob+0.001
eh+0.001
aj+0.001
ross+0.001
xy+0.001
Neg logit contributions
 dip-0.000
 look-0.000
 pet-0.000
 stare-0.000
 touch-0.000
Token big
Token ID1263
Feature activation+3.38e-03

Pos logit contributions
rob+0.023
eh+0.020
aj+0.019
ross+0.019
aces+0.018
Neg logit contributions
 dip-0.015
 look-0.015
 pet-0.015
 touch-0.014
 stare-0.014
Token blue
Token ID4171
Feature activation+5.35e-03

Pos logit contributions
 dip+0.004
 stare+0.004
 look+0.004
 touch+0.004
 pet+0.004
Neg logit contributions
rob-0.006
eh-0.005
aj-0.005
lf-0.005
 nonsense-0.005
Token je
Token ID11223
Feature activation-6.88e-02

Pos logit contributions
 dip+0.007
 stare+0.007
 try+0.006
 look+0.006
 clam+0.006
Neg logit contributions
rob-0.010
eh-0.009
aj-0.008
lf-0.008
 fort-0.008
Tokenep
Token ID538
Feature activation-2.30e-02

Pos logit contributions
rob+0.062
 nonsense+0.059
ross+0.054
 teaches+0.051
ses+0.049
Neg logit contributions
 look-0.132
 pet-0.131
 touch-0.129
 distance-0.128
 dip-0.127
Token.
Token ID13
Feature activation-1.07e-03

Pos logit contributions
rob+0.047
eh+0.041
aj+0.038
ross+0.037
xy+0.037
Neg logit contributions
 dip-0.026
 look-0.025
 stare-0.025
 pet-0.025
 touch-0.024
Token
Token ID198
Feature activation+5.52e-03

Pos logit contributions
rob+0.002
eh+0.002
aj+0.002
ross+0.002
xy+0.002
Neg logit contributions
 dip-0.001
 look-0.001
 stare-0.001
 pet-0.001
 touch-0.001
Token
Token ID198
Feature activation+1.53e-03

Pos logit contributions
 dip+0.005
 look+0.005
 pet+0.004
 touch+0.004
 stare+0.004
Neg logit contributions
rob-0.013
eh-0.011
aj-0.011
ross-0.011
lich-0.010
TokenOne
Token ID3198
Feature activation-2.54e-03

Pos logit contributions
 pet+0.002
 dip+0.002
 touch+0.002
 look+0.002
 try+0.002
Neg logit contributions
rob-0.003
aj-0.003
drivers-0.003
ross-0.003
lich-0.003
Token go
Token ID467
Feature activation+2.50e-03

Pos logit contributions
 clam+0.013
 Swan+0.013
 poppy+0.013
 Seal+0.013
 dip+0.013
Neg logit contributions
rob-0.026
eh-0.024
lich-0.023
aj-0.023
aces-0.022
Token on
Token ID319
Feature activation-2.02e-03

Pos logit contributions
 dip+0.003
 stare+0.003
 gaze+0.003
 wonder+0.003
 clam+0.003
Neg logit contributions
rob-0.005
eh-0.004
aj-0.004
ider-0.004
xy-0.004
Token adventures
Token ID17545
Feature activation-1.11e-02

Pos logit contributions
rob+0.004
eh+0.004
ross+0.003
xy+0.003
ibr+0.003
Neg logit contributions
 look-0.002
 try-0.002
 touch-0.002
 pet-0.002
 care-0.002
Token in
Token ID287
Feature activation-6.58e-03

Pos logit contributions
rob+0.026
eh+0.024
aj+0.023
ross+0.023
lich+0.022
Neg logit contributions
 look-0.009
 dip-0.009
 touch-0.008
 pet-0.008
 try-0.008
Token your
Token ID534
Feature activation-7.92e-03

Pos logit contributions
rob+0.014
eh+0.012
aj+0.011
ross+0.011
 nonsense+0.011
Neg logit contributions
 dip-0.007
 stare-0.007
 look-0.007
 pet-0.007
 touch-0.007
Token je
Token ID11223
Feature activation-6.87e-02

Pos logit contributions
rob+0.015
eh+0.013
aj+0.013
ross+0.012
 nonsense+0.012
Neg logit contributions
 dip-0.010
 look-0.009
 stare-0.009
 touch-0.009
 pet-0.009
Tokenep
Token ID538
Feature activation-2.38e-02

Pos logit contributions
rob+0.068
 nonsense+0.057
ross+0.053
 Cere+0.050
eh+0.048
Neg logit contributions
 look-0.134
 try-0.130
 touch-0.129
 pet-0.129
 gaze-0.127
Token.
Token ID13
Feature activation-9.30e-03

Pos logit contributions
rob+0.051
eh+0.046
aj+0.042
ross+0.042
xy+0.041
Neg logit contributions
 look-0.025
 touch-0.023
 try-0.023
 dip-0.023
 stare-0.023
Token<|endoftext|>
Token ID50256
Feature activation-3.29e-03

Pos logit contributions
rob+0.019
eh+0.017
aj+0.016
ross+0.015
xy+0.015
Neg logit contributions
 dip-0.011
 look-0.010
 stare-0.010
 pet-0.010
 touch-0.010
TokenOnce
Token ID7454
Feature activation-9.15e-03

Pos logit contributions
rob+0.007
eh+0.006
aj+0.006
ross+0.006
xy+0.006
Neg logit contributions
 dip-0.003
 look-0.003
 stare-0.003
 pet-0.003
 touch-0.003
Token upon
Token ID2402
Feature activation-2.48e-02

Pos logit contributions
rob+0.018
eh+0.015
ross+0.014
ider+0.013
 AW+0.013
Neg logit contributions
 look-0.012
 dip-0.012
 touch-0.012
 try-0.012
 stare-0.012
Token special
Token ID2041
Feature activation+6.11e-04

Pos logit contributions
rob+0.017
eh+0.015
aj+0.015
ross+0.014
 nonsense+0.014
Neg logit contributions
 dip-0.016
 pet-0.015
 stare-0.015
 look-0.015
 Swan-0.015
Token in
Token ID287
Feature activation+1.46e-03

Pos logit contributions
 dip+0.001
 stare+0.001
 gaze+0.001
 try+0.001
 touch+0.001
Neg logit contributions
rob-0.001
ider-0.001
aj-0.001
eh-0.001
es-0.001
Token the
Token ID262
Feature activation-1.32e-02

Pos logit contributions
 dip+0.003
 stare+0.002
 gaze+0.002
 look+0.002
 pet+0.002
Neg logit contributions
rob-0.002
eh-0.002
 nonsense-0.002
lf-0.001
ross-0.001
Token big
Token ID1263
Feature activation-5.32e-03

Pos logit contributions
rob+0.022
eh+0.019
aj+0.018
opl+0.017
lich+0.016
Neg logit contributions
 dip-0.019
 stare-0.017
 clam-0.017
 look-0.017
 Seal-0.017
Token red
Token ID2266
Feature activation-1.83e-02

Pos logit contributions
rob+0.009
eh+0.007
aj+0.007
opl+0.007
lf+0.007
Neg logit contributions
 dip-0.008
 stare-0.007
 investigate-0.007
 Seal-0.007
 gaze-0.007
Token je
Token ID11223
Feature activation-6.84e-02

Pos logit contributions
rob+0.029
eh+0.023
aj+0.021
lich+0.021
opl+0.020
Neg logit contributions
 dip-0.029
 stare-0.028
 wonder-0.027
 Seal-0.027
 look-0.027
Tokenep
Token ID538
Feature activation-2.06e-02

Pos logit contributions
rob+0.061
 nonsense+0.052
ross+0.051
eh+0.046
aj+0.044
Neg logit contributions
 look-0.139
 pet-0.135
 touch-0.135
 try-0.131
 stare-0.131
Token.
Token ID13
Feature activation-4.37e-04

Pos logit contributions
rob+0.041
eh+0.036
aj+0.034
ross+0.033
xy+0.033
Neg logit contributions
 dip-0.024
 stare-0.023
 look-0.023
 pet-0.022
 touch-0.022
Token<|endoftext|>
Token ID50256
Feature activation+1.94e-03

Pos logit contributions
rob+0.001
eh+0.001
aj+0.001
ross+0.001
xy+0.001
Neg logit contributions
 dip-0.001
 look-0.001
 stare-0.001
 pet-0.001
 touch-0.001
TokenOnce
Token ID7454
Feature activation-3.08e-03

Pos logit contributions
 dip+0.002
 look+0.002
 stare+0.002
 pet+0.002
 touch+0.002
Neg logit contributions
rob-0.004
eh-0.004
aj-0.004
ross-0.003
xy-0.003
Token upon
Token ID2402
Feature activation-2.31e-02

Pos logit contributions
rob+0.006
eh+0.005
ross+0.005
ider+0.005
aj+0.005
Neg logit contributions
 look-0.004
 touch-0.004
 dip-0.004
 try-0.004
 stare-0.004
Token.
Token ID13
Feature activation+1.90e-03

Pos logit contributions
 dip+0.021
 stare+0.021
 gaze+0.020
 wonder+0.020
 Seal+0.020
Neg logit contributions
rob-0.021
 nonsense-0.018
aj-0.018
eh-0.017
ross-0.017
Token He
Token ID679
Feature activation-4.10e-04

Pos logit contributions
 look+0.002
 dip+0.002
 stare+0.002
 touch+0.002
 pet+0.002
Neg logit contributions
rob-0.004
eh-0.003
aj-0.003
ross-0.003
xy-0.003
Token gets
Token ID3011
Feature activation+1.06e-02

Pos logit contributions
rob+0.001
eh+0.001
ross+0.001
aj+0.001
xy+0.000
Neg logit contributions
 look-0.001
 pet-0.001
 dip-0.001
 try-0.001
 touch-0.001
Token in
Token ID287
Feature activation-3.09e-03

Pos logit contributions
 dip+0.016
 stare+0.016
 gaze+0.016
 wonder+0.015
 Swan+0.015
Neg logit contributions
rob-0.016
eh-0.014
 infected-0.013
lich-0.013
aj-0.013
Token his
Token ID465
Feature activation-1.66e-02

Pos logit contributions
rob+0.005
eh+0.004
aj+0.004
 nonsense+0.004
ross+0.004
Neg logit contributions
 dip-0.005
 stare-0.004
 gaze-0.004
 look-0.004
 pet-0.004
Token je
Token ID11223
Feature activation-6.81e-02

Pos logit contributions
rob+0.031
eh+0.026
aj+0.025
ross+0.024
xy+0.024
Neg logit contributions
 dip-0.022
 look-0.022
 stare-0.021
 pet-0.021
 touch-0.021
Tokenep
Token ID538
Feature activation-2.10e-02

Pos logit contributions
rob+0.069
 nonsense+0.064
ross+0.063
chy+0.057
eh+0.056
Neg logit contributions
 look-0.122
 pet-0.117
 touch-0.116
 try-0.115
 gaze-0.113
Token and
Token ID290
Feature activation-3.62e-03

Pos logit contributions
rob+0.048
eh+0.043
aj+0.040
ross+0.039
xy+0.039
Neg logit contributions
 look-0.020
 touch-0.019
 try-0.018
 stare-0.018
 pet-0.018
Token follows
Token ID5679
Feature activation-2.45e-03

Pos logit contributions
rob+0.006
eh+0.005
ross+0.005
ooky+0.005
xy+0.005
Neg logit contributions
 look-0.006
 pet-0.006
 touch-0.006
 try-0.006
 dip-0.006
Token them
Token ID606
Feature activation+1.46e-03

Pos logit contributions
rob+0.006
eh+0.005
aj+0.005
ross+0.005
 nonsense+0.005
Neg logit contributions
 dip-0.002
 stare-0.002
 look-0.002
 gaze-0.002
 wonder-0.002
Token.
Token ID13
Feature activation-4.18e-03

Pos logit contributions
 dip+0.002
 stare+0.002
 gaze+0.002
 crab+0.002
 pet+0.002
Neg logit contributions
rob-0.003
eh-0.002
aj-0.002
 nonsense-0.002
ross-0.002
Token it
Token ID340
Feature activation-3.59e-03

Pos logit contributions
 dip+0.012
 stare+0.011
 touch+0.011
 pet+0.011
 gaze+0.011
Neg logit contributions
rob-0.019
 nonsense-0.017
ross-0.016
eh-0.016
lf-0.016
Token was
Token ID373
Feature activation+1.12e-02

Pos logit contributions
rob+0.007
eh+0.006
aj+0.005
ross+0.005
xy+0.005
Neg logit contributions
 dip-0.005
 stare-0.005
 look-0.004
 pet-0.004
 touch-0.004
Token the
Token ID262
Feature activation-9.26e-03

Pos logit contributions
 dip+0.014
 stare+0.014
 gaze+0.013
 touch+0.013
 pet+0.013
Neg logit contributions
rob-0.020
eh-0.019
aj-0.017
 nonsense-0.017
ross-0.017
Token best
Token ID1266
Feature activation-4.95e-04

Pos logit contributions
rob+0.008
eh+0.006
 nonsense+0.006
 reminds+0.005
aj+0.005
Neg logit contributions
 dip-0.019
 stare-0.019
 touch-0.019
 clam-0.018
 paddle-0.018
Token ash
Token ID12530
Feature activation-3.64e-02

Pos logit contributions
rob+0.001
eh+0.001
aj+0.001
 nonsense+0.001
ross+0.001
Neg logit contributions
 dip-0.001
 stare-0.001
 Seal-0.001
 pet-0.001
 touch-0.001
Tokentr
Token ID2213
Feature activation-6.79e-02

Pos logit contributions
rob+0.032
eh+0.022
aj+0.020
ross+0.016
 nonsense+0.015
Neg logit contributions
 look-0.085
 dip-0.084
 pet-0.083
 touch-0.083
 stare-0.082
Tokenay
Token ID323
Feature activation-2.36e-02

Pos logit contributions
rob+0.147
eh+0.140
cod+0.123
 nonsense+0.122
aj+0.121
Neg logit contributions
 look-0.070
 touch-0.057
 try-0.057
 dip-0.057
 stare-0.050
Token they
Token ID484
Feature activation+1.57e-03

Pos logit contributions
rob+0.050
eh+0.045
aj+0.044
ross+0.040
ider+0.039
Neg logit contributions
 look-0.027
 touch-0.024
 pet-0.024
 try-0.024
 dip-0.023
Token had
Token ID550
Feature activation-8.73e-03

Pos logit contributions
 look+0.002
 dip+0.002
 touch+0.002
 pet+0.002
 stare+0.002
Neg logit contributions
rob-0.003
eh-0.003
aj-0.003
ross-0.002
 nonsense-0.002
Token ever
Token ID1683
Feature activation-1.64e-02

Pos logit contributions
rob+0.016
eh+0.014
aj+0.013
ross+0.013
xy+0.012
Neg logit contributions
 look-0.012
 dip-0.011
 pet-0.011
 touch-0.011
 try-0.011
Token seen
Token ID1775
Feature activation-2.73e-02

Pos logit contributions
rob+0.033
eh+0.029
aj+0.027
ross+0.026
xy+0.026
Neg logit contributions
 dip-0.019
 look-0.018
 stare-0.018
 pet-0.018
 touch-0.018
TokenYou
Token ID1639
Feature activation-6.44e-03

Pos logit contributions
 dip+0.005
 look+0.004
 stare+0.004
 pet+0.004
 touch+0.004
Neg logit contributions
rob-0.007
eh-0.006
aj-0.006
ross-0.006
xy-0.006
Token cannot
Token ID2314
Feature activation-2.31e-03

Pos logit contributions
rob+0.010
eh+0.009
 nonsense+0.009
lich+0.009
ibr+0.009
Neg logit contributions
 dip-0.009
 gaze-0.008
 stare-0.008
 distance-0.008
 Seal-0.008
Token ride
Token ID6594
Feature activation-8.57e-03

Pos logit contributions
rob+0.004
eh+0.003
 AW+0.003
ross+0.003
CR+0.003
Neg logit contributions
 toddler-0.003
 dip-0.002
 Swan-0.002
 Mum-0.002
 clam-0.002
Token in
Token ID287
Feature activation-8.19e-03

Pos logit contributions
rob+0.018
eh+0.017
aj+0.016
ross+0.015
ibr+0.015
Neg logit contributions
 look-0.009
 pet-0.009
 touch-0.009
 try-0.008
 dip-0.008
Token my
Token ID616
Feature activation-9.45e-03

Pos logit contributions
rob+0.017
eh+0.014
aj+0.014
ross+0.013
xy+0.013
Neg logit contributions
 dip-0.009
 look-0.009
 stare-0.009
 pet-0.009
 touch-0.009
Token je
Token ID11223
Feature activation-6.79e-02

Pos logit contributions
rob+0.017
eh+0.014
aj+0.014
ross+0.013
xy+0.012
Neg logit contributions
 dip-0.013
 stare-0.013
 look-0.013
 touch-0.012
 gaze-0.012
Tokenep
Token ID538
Feature activation-2.06e-02

Pos logit contributions
rob+0.062
ross+0.055
 nonsense+0.054
eh+0.047
ider+0.045
Neg logit contributions
 look-0.133
 pet-0.131
 touch-0.128
 try-0.127
 crab-0.124
Token.
Token ID13
Feature activation+4.59e-04

Pos logit contributions
rob+0.041
eh+0.036
aj+0.033
xy+0.032
ross+0.032
Neg logit contributions
 look-0.026
 touch-0.024
 try-0.024
 dip-0.024
 pet-0.023
Token Get
Token ID3497
Feature activation-2.09e-03

Pos logit contributions
 dip+0.001
 look+0.001
 stare+0.001
 pet+0.001
 touch+0.001
Neg logit contributions
rob-0.001
eh-0.001
aj-0.001
ross-0.001
xy-0.001
Token out
Token ID503
Feature activation-2.20e-02

Pos logit contributions
rob+0.005
eh+0.004
aj+0.004
ross+0.004
xy+0.004
Neg logit contributions
 dip-0.002
 look-0.002
 pet-0.002
 touch-0.002
 stare-0.002
Token now
Token ID783
Feature activation-9.81e-03

Pos logit contributions
rob+0.055
eh+0.046
xy+0.045
aj+0.044
ross+0.042
Neg logit contributions
 look-0.019
 touch-0.018
 dip-0.018
 try-0.018
 pet-0.018
Token and
Token ID290
Feature activation+5.54e-03

Pos logit contributions
rob+0.015
eh+0.014
aj+0.014
ross+0.013
lich+0.013
Neg logit contributions
 look-0.007
 dip-0.007
 pet-0.007
 touch-0.007
 try-0.006
Token ride
Token ID6594
Feature activation-1.26e-02

Pos logit contributions
 dip+0.006
 stare+0.006
 Seal+0.006
 Swan+0.006
 look+0.006
Neg logit contributions
rob-0.011
eh-0.009
aj-0.009
lf-0.009
 nonsense-0.009
Token in
Token ID287
Feature activation+2.51e-03

Pos logit contributions
rob+0.024
eh+0.021
aj+0.021
ross+0.020
ider+0.020
Neg logit contributions
 look-0.016
 pet-0.016
 touch-0.016
 dip-0.015
 try-0.015
Token his
Token ID465
Feature activation-9.37e-03

Pos logit contributions
 dip+0.003
 stare+0.003
 gaze+0.003
 pet+0.003
 look+0.003
Neg logit contributions
rob-0.005
eh-0.004
aj-0.004
lf-0.004
 nonsense-0.004
Token toy
Token ID13373
Feature activation-3.14e-02

Pos logit contributions
rob+0.016
eh+0.014
aj+0.013
xy+0.012
lich+0.012
Neg logit contributions
 dip-0.014
 stare-0.013
 look-0.013
 touch-0.013
 gaze-0.012
Token je
Token ID11223
Feature activation-6.79e-02

Pos logit contributions
rob+0.059
eh+0.050
aj+0.047
xy+0.045
ross+0.044
Neg logit contributions
 dip-0.042
 stare-0.041
 look-0.040
 gaze-0.039
 touch-0.039
Tokenep
Token ID538
Feature activation-2.32e-02

Pos logit contributions
rob+0.061
 nonsense+0.054
ross+0.052
 AW+0.048
 teaches+0.044
Neg logit contributions
 look-0.132
 pet-0.132
 touch-0.128
 try-0.126
 crab-0.125
Token.
Token ID13
Feature activation-1.31e-03

Pos logit contributions
rob+0.049
eh+0.043
aj+0.042
ross+0.040
xy+0.040
Neg logit contributions
 dip-0.024
 look-0.024
 touch-0.023
 pet-0.023
 stare-0.023
Token One
Token ID1881
Feature activation-3.85e-03

Pos logit contributions
rob+0.002
eh+0.002
aj+0.002
ross+0.002
xy+0.002
Neg logit contributions
 dip-0.002
 look-0.002
 stare-0.002
 pet-0.002
 touch-0.002
Token day
Token ID1110
Feature activation+3.22e-03

Pos logit contributions
rob+0.007
eh+0.006
aj+0.006
ross+0.006
xy+0.006
Neg logit contributions
 dip-0.005
 look-0.005
 stare-0.005
 pet-0.005
 touch-0.005
Token,
Token ID11
Feature activation-2.63e-03

Pos logit contributions
 dip+0.003
 look+0.003
 stare+0.003
 pet+0.003
 touch+0.003
Neg logit contributions
rob-0.007
eh-0.006
aj-0.006
ross-0.006
xy-0.006
Token together
Token ID1978
Feature activation+5.34e-03

Pos logit contributions
 dip+0.020
 stare+0.019
 gaze+0.018
 look+0.018
 try+0.018
Neg logit contributions
rob-0.024
eh-0.021
aj-0.020
ross-0.020
 nonsense-0.019
Token.
Token ID13
Feature activation-6.34e-03

Pos logit contributions
 dip+0.009
 stare+0.008
 crab+0.008
 gaze+0.008
 Swan+0.008
Neg logit contributions
rob-0.007
es-0.006
ross-0.006
lf-0.006
eh-0.006
Token With
Token ID2080
Feature activation-4.48e-03

Pos logit contributions
rob+0.011
aj+0.009
eh+0.009
xy+0.009
ross+0.009
Neg logit contributions
 dip-0.009
 crab-0.009
 stare-0.008
 clam-0.008
 look-0.008
Token her
Token ID607
Feature activation-1.17e-02

Pos logit contributions
rob+0.007
eh+0.006
 nonsense+0.006
 puzzles+0.005
lf+0.005
Neg logit contributions
 stare-0.006
 dip-0.006
 gaze-0.006
 crab-0.006
 clam-0.006
Token new
Token ID649
Feature activation-1.43e-02

Pos logit contributions
rob+0.021
eh+0.018
aj+0.018
ross+0.016
xy+0.016
Neg logit contributions
 dip-0.016
 stare-0.015
 look-0.015
 touch-0.015
 pet-0.015
Token je
Token ID11223
Feature activation-6.77e-02

Pos logit contributions
rob+0.023
eh+0.020
aj+0.019
lich+0.017
aces+0.017
Neg logit contributions
 dip-0.021
 stare-0.020
 look-0.020
 clam-0.020
 gaze-0.020
Tokenep
Token ID538
Feature activation-2.66e-02

Pos logit contributions
rob+0.072
 nonsense+0.056
eh+0.054
ross+0.052
aj+0.051
Neg logit contributions
 look-0.134
 try-0.129
 pet-0.129
 dip-0.127
 care-0.127
Token,
Token ID11
Feature activation-1.69e-02

Pos logit contributions
rob+0.050
eh+0.043
aj+0.040
ross+0.039
xy+0.038
Neg logit contributions
 dip-0.035
 look-0.034
 stare-0.034
 pet-0.033
 touch-0.033
Token Beth
Token ID19223
Feature activation-1.32e-02

Pos logit contributions
rob+0.031
eh+0.026
aj+0.025
ross+0.024
xy+0.024
Neg logit contributions
 dip-0.022
 stare-0.022
 look-0.021
 gaze-0.021
 touch-0.021
Tokenany
Token ID1092
Feature activation-7.47e-03

Pos logit contributions
rob+0.024
eh+0.021
aj+0.019
 nonsense+0.019
ross+0.018
Neg logit contributions
 look-0.018
 dip-0.018
 pet-0.017
 stare-0.017
 touch-0.017
Token could
Token ID714
Feature activation-1.39e-03

Pos logit contributions
rob+0.014
eh+0.013
aj+0.012
ross+0.011
 nonsense+0.011
Neg logit contributions
 dip-0.009
 look-0.009
 pet-0.009
 stare-0.009
 touch-0.009
Token Sam
Token ID3409
Feature activation+8.75e-03

Pos logit contributions
 dip+0.000
 look+0.000
 pet+0.000
 stare+0.000
 touch+0.000
Neg logit contributions
rob-0.001
eh-0.001
aj-0.001
ross-0.001
xy-0.000
Token to
Token ID284
Feature activation-1.16e-02

Pos logit contributions
 dip+0.010
 stare+0.010
 look+0.010
 gaze+0.009
 touch+0.009
Neg logit contributions
rob-0.017
eh-0.015
aj-0.014
 nonsense-0.014
ross-0.014
Token stay
Token ID2652
Feature activation+5.40e-03

Pos logit contributions
rob+0.023
eh+0.021
aj+0.020
 nonsense+0.019
ross+0.019
Neg logit contributions
 dip-0.012
 clam-0.011
 stare-0.011
 wonder-0.011
 Seal-0.011
Token in
Token ID287
Feature activation+6.30e-04

Pos logit contributions
 dip+0.007
 stare+0.007
 crab+0.006
 gaze+0.006
 clam+0.006
Neg logit contributions
rob-0.009
eh-0.009
aj-0.009
ross-0.008
ider-0.007
Token the
Token ID262
Feature activation+1.02e-03

Pos logit contributions
 dip+0.001
 look+0.001
 stare+0.001
 pet+0.001
 touch+0.001
Neg logit contributions
rob-0.001
eh-0.001
aj-0.001
ross-0.001
xy-0.001
Token je
Token ID11223
Feature activation-6.76e-02

Pos logit contributions
 dip+0.001
 stare+0.001
 look+0.001
 touch+0.001
 gaze+0.001
Neg logit contributions
rob-0.002
eh-0.002
aj-0.002
 nonsense-0.001
ross-0.001
Tokenep
Token ID538
Feature activation-1.63e-02

Pos logit contributions
 nonsense+0.074
rob+0.073
ross+0.069
ccoli+0.065
 Cere+0.064
Neg logit contributions
 look-0.114
 distance-0.112
 pet-0.109
 touch-0.109
 try-0.107
Token and
Token ID290
Feature activation+6.76e-03

Pos logit contributions
rob+0.037
eh+0.032
aj+0.031
xy+0.030
ross+0.030
Neg logit contributions
 look-0.015
 dip-0.015
 touch-0.014
 stare-0.014
 pet-0.014
Token be
Token ID307
Feature activation-6.33e-03

Pos logit contributions
 dip+0.007
 stare+0.007
 look+0.007
 pet+0.007
 touch+0.007
Neg logit contributions
rob-0.014
eh-0.012
aj-0.012
ross-0.011
 nonsense-0.011
Token quiet
Token ID5897
Feature activation+7.11e-04

Pos logit contributions
rob+0.011
eh+0.009
aj+0.009
xy+0.009
ross+0.008
Neg logit contributions
 dip-0.009
 look-0.009
 pet-0.009
 stare-0.009
 touch-0.009
Token.
Token ID13
Feature activation-4.81e-03

Pos logit contributions
 dip+0.001
 look+0.001
 pet+0.001
 touch+0.001
 stare+0.001
Neg logit contributions
rob-0.002
eh-0.001
aj-0.001
ross-0.001
xy-0.001
Token a
Token ID257
Feature activation+1.62e-03

Pos logit contributions
rob+0.012
eh+0.011
aj+0.011
ross+0.011
xy+0.010
Neg logit contributions
 look-0.007
 dip-0.007
 pet-0.007
 touch-0.007
 try-0.006
Token ride
Token ID6594
Feature activation-1.33e-02

Pos logit contributions
 wonder+0.002
 clam+0.002
 stare+0.002
 Seal+0.002
 dip+0.002
Neg logit contributions
rob-0.002
eh-0.002
aj-0.002
lf-0.002
 AW-0.002
Token in
Token ID287
Feature activation-3.55e-03

Pos logit contributions
rob+0.028
eh+0.026
aj+0.025
ross+0.024
ibr+0.023
Neg logit contributions
 look-0.014
 touch-0.013
 dip-0.013
 try-0.013
 pet-0.012
Token the
Token ID262
Feature activation-1.47e-02

Pos logit contributions
rob+0.008
eh+0.007
aj+0.007
ross+0.007
xy+0.007
Neg logit contributions
 look-0.003
 dip-0.003
 touch-0.003
 pet-0.003
 try-0.003
Token red
Token ID2266
Feature activation-1.24e-02

Pos logit contributions
rob+0.025
eh+0.021
aj+0.020
 nonsense+0.018
lich+0.018
Neg logit contributions
 dip-0.022
 stare-0.021
 look-0.020
 touch-0.020
 gaze-0.020
Token je
Token ID11223
Feature activation-6.72e-02

Pos logit contributions
rob+0.018
eh+0.014
aj+0.013
lf+0.012
opl+0.012
Neg logit contributions
 dip-0.021
 stare-0.021
 wonder-0.020
 gaze-0.020
 Seal-0.019
Tokenep
Token ID538
Feature activation-2.10e-02

Pos logit contributions
 nonsense+0.062
rob+0.062
ross+0.061
 teaches+0.053
chy+0.051
Neg logit contributions
 look-0.125
 pet-0.119
 touch-0.118
 try-0.117
 peek-0.115
Token.
Token ID13
Feature activation+1.58e-03

Pos logit contributions
rob+0.046
eh+0.041
aj+0.038
ross+0.037
xy+0.037
Neg logit contributions
 look-0.022
 touch-0.020
 dip-0.020
 try-0.020
 stare-0.019
Token But
Token ID887
Feature activation+1.30e-02

Pos logit contributions
 dip+0.002
 look+0.002
 stare+0.002
 pet+0.002
 touch+0.002
Neg logit contributions
rob-0.003
eh-0.002
aj-0.002
 nonsense-0.002
xy-0.002
Token first
Token ID717
Feature activation-1.35e-02

Pos logit contributions
 dip+0.018
 pet+0.018
 stare+0.017
 gaze+0.017
 wonder+0.017
Neg logit contributions
rob-0.021
eh-0.018
aj-0.018
 nonsense-0.017
ross-0.017
Token,
Token ID11
Feature activation+2.60e-03

Pos logit contributions
rob+0.026
eh+0.022
aj+0.022
 nonsense+0.021
ross+0.021
Neg logit contributions
 dip-0.016
 stare-0.015
 pet-0.015
 gaze-0.015
 wonder-0.014
Token like
Token ID588
Feature activation-7.80e-03

Pos logit contributions
 Seal+0.010
 Swan+0.010
 clam+0.010
 dip+0.010
 crab+0.010
Neg logit contributions
rob-0.015
eh-0.014
 nonsense-0.012
aces-0.012
aj-0.012
Token to
Token ID284
Feature activation-1.07e-03

Pos logit contributions
rob+0.015
eh+0.013
xy+0.012
aj+0.012
 nonsense+0.012
Neg logit contributions
 dip-0.009
 clam-0.009
 stare-0.009
 gaze-0.008
 Swan-0.008
Token play
Token ID711
Feature activation-2.24e-02

Pos logit contributions
rob+0.002
eh+0.002
lf+0.002
aj+0.002
aces+0.002
Neg logit contributions
 clam-0.001
 Seal-0.001
 Swan-0.001
 dip-0.001
 distance-0.001
Token with
Token ID351
Feature activation-1.11e-02

Pos logit contributions
rob+0.049
eh+0.044
aj+0.043
ross+0.042
three+0.040
Neg logit contributions
 dip-0.023
 look-0.022
 pet-0.022
 touch-0.021
 try-0.020
Token their
Token ID511
Feature activation-1.90e-03

Pos logit contributions
rob+0.022
eh+0.019
aj+0.018
ross+0.018
xy+0.017
Neg logit contributions
 dip-0.013
 look-0.013
 stare-0.013
 pet-0.013
 touch-0.013
Token je
Token ID11223
Feature activation-6.70e-02

Pos logit contributions
rob+0.003
eh+0.002
aj+0.002
xy+0.002
 nonsense+0.002
Neg logit contributions
 dip-0.003
 stare-0.003
 look-0.003
 gaze-0.003
 touch-0.003
Tokenep
Token ID538
Feature activation-2.29e-02

Pos logit contributions
rob+0.065
ross+0.060
 nonsense+0.057
eh+0.054
aj+0.051
Neg logit contributions
 look-0.131
 dip-0.123
 pet-0.123
 touch-0.122
 try-0.122
Token.
Token ID13
Feature activation+6.46e-03

Pos logit contributions
rob+0.049
eh+0.043
aj+0.041
ross+0.040
xy+0.039
Neg logit contributions
 look-0.024
 dip-0.024
 touch-0.023
 pet-0.023
 stare-0.023
Token Their
Token ID5334
Feature activation+7.69e-04

Pos logit contributions
 clam+0.011
 care+0.010
 stare+0.010
 crab+0.010
 look+0.010
Neg logit contributions
rob-0.008
aj-0.007
eh-0.006
 nonsense-0.006
 Cere-0.006
Token je
Token ID11223
Feature activation-6.04e-02

Pos logit contributions
 dip+0.001
 stare+0.001
 look+0.001
 touch+0.001
 gaze+0.001
Neg logit contributions
rob-0.001
aj-0.001
eh-0.001
xy-0.001
lich-0.001
Tokenep
Token ID538
Feature activation-2.52e-02

Pos logit contributions
rob+0.060
ross+0.052
eh+0.052
 nonsense+0.049
aj+0.047
Neg logit contributions
 look-0.116
 try-0.109
 pet-0.108
 crab-0.107
 touch-0.107
TokenThank
Token ID10449
Feature activation-1.02e-02

Pos logit contributions
 dip+0.001
 look+0.001
 stare+0.001
 pet+0.001
 touch+0.001
Neg logit contributions
rob-0.001
eh-0.001
aj-0.001
ross-0.001
xy-0.001
Token you
Token ID345
Feature activation-6.42e-03

Pos logit contributions
rob+0.021
aj+0.020
eh+0.019
ross+0.018
 nonsense+0.017
Neg logit contributions
 look-0.011
 touch-0.010
 pet-0.010
 try-0.010
 care-0.010
Token,
Token ID11
Feature activation-1.54e-02

Pos logit contributions
rob+0.015
xy+0.013
eh+0.013
CR+0.012
aces+0.012
Neg logit contributions
 look-0.007
 Mum-0.006
 Dad-0.006
 mum-0.006
 try-0.005
Token Mr
Token ID1770
Feature activation+8.66e-03

Pos logit contributions
rob+0.032
eh+0.028
aj+0.026
ross+0.026
xy+0.025
Neg logit contributions
 look-0.018
 dip-0.017
 try-0.017
 touch-0.017
 pet-0.017
Token.
Token ID13
Feature activation-5.23e-03

Pos logit contributions
 look+0.008
 touch+0.008
 try+0.007
 pet+0.007
 care+0.007
Neg logit contributions
rob-0.020
eh-0.018
aj-0.017
ibr-0.016
ross-0.016
Token Mechan
Token ID13438
Feature activation-6.70e-02

Pos logit contributions
rob+0.007
eh+0.006
aj+0.005
ross+0.005
 nonsense+0.005
Neg logit contributions
 Seal-0.010
 try-0.009
 look-0.009
 Swan-0.009
 touch-0.009
Tokenic
Token ID291
Feature activation-2.36e-02

Pos logit contributions
eh+0.108
rob+0.107
 nonsense+0.102
aj+0.101
sed+0.101
Neg logit contributions
 care-0.077
 look-0.073
 dip-0.070
 try-0.067
Yes-0.066
Token!"
Token ID2474
Feature activation-3.09e-03

Pos logit contributions
rob+0.040
aj+0.038
eh+0.038
lich+0.033
ibr+0.033
Neg logit contributions
 look-0.032
 care-0.031
 try-0.029
 wave-0.028
 touch-0.028
Token Tim
Token ID5045
Feature activation-2.52e-02

Pos logit contributions
rob+0.006
eh+0.005
aj+0.005
ross+0.004
xy+0.004
Neg logit contributions
 dip-0.004
 look-0.004
 stare-0.004
 pet-0.004
 touch-0.004
Tokenmy
Token ID1820
Feature activation-1.49e-02

Pos logit contributions
rob+0.054
eh+0.046
xy+0.044
 AW+0.044
ross+0.044
Neg logit contributions
 look-0.028
 pet-0.027
 touch-0.027
 try-0.026
 dip-0.025
Token said
Token ID531
Feature activation+1.74e-03

Pos logit contributions
rob+0.033
eh+0.028
ross+0.027
xy+0.027
aj+0.027
Neg logit contributions
 look-0.016
 pet-0.015
 dip-0.015
 touch-0.015
 try-0.014
Token and
Token ID290
Feature activation-7.66e-03

Pos logit contributions
rob+0.046
eh+0.039
ross+0.037
xy+0.037
 nonsense+0.036
Neg logit contributions
 look-0.017
 pet-0.016
 touch-0.015
 dip-0.015
 stare-0.014
Token Lily
Token ID20037
Feature activation-9.86e-03

Pos logit contributions
rob+0.016
eh+0.014
aj+0.013
ross+0.013
ider+0.013
Neg logit contributions
 look-0.009
 dip-0.009
 pet-0.009
 touch-0.008
 stare-0.008
Token took
Token ID1718
Feature activation-7.17e-03

Pos logit contributions
rob+0.020
eh+0.018
aj+0.017
ross+0.016
 nonsense+0.016
Neg logit contributions
 dip-0.011
 look-0.011
 pet-0.011
 stare-0.011
 touch-0.011
Token a
Token ID257
Feature activation+1.07e-02

Pos logit contributions
rob+0.012
eh+0.010
aj+0.009
ibr+0.009
aces+0.008
Neg logit contributions
 stare-0.011
 gaze-0.010
 clam-0.010
 crab-0.010
 Swan-0.010
Token nice
Token ID3621
Feature activation+3.15e-02

Pos logit contributions
 Seal+0.015
 Swan+0.014
 wonder+0.014
 clam+0.014
 stare+0.013
Neg logit contributions
eh-0.017
rob-0.017
aj-0.015
 AW-0.014
 fort-0.013
Token warm
Token ID5814
Feature activation+3.13e-02

Pos logit contributions
 Seal+0.037
 clam+0.034
 Swan+0.034
 wonder+0.034
 stare+0.032
Neg logit contributions
eh-0.057
rob-0.056
aj-0.047
lf-0.046
 AW-0.045
Token bath
Token ID7837
Feature activation+5.76e-03

Pos logit contributions
 Seal+0.038
 clam+0.037
 wonder+0.034
 stare+0.034
 Swan+0.034
Neg logit contributions
eh-0.056
rob-0.056
aj-0.049
 fort-0.044
lf-0.044
Token.
Token ID13
Feature activation-9.16e-03

Pos logit contributions
 dip+0.007
 look+0.006
 pet+0.006
 stare+0.006
 touch+0.006
Neg logit contributions
rob-0.012
eh-0.010
aj-0.010
ross-0.009
 nonsense-0.009
Token She
Token ID1375
Feature activation-7.24e-03

Pos logit contributions
rob+0.021
eh+0.017
ross+0.016
xy+0.016
aj+0.016
Neg logit contributions
 look-0.010
 pet-0.010
 dip-0.010
 touch-0.010
 stare-0.010
Token felt
Token ID2936
Feature activation+1.12e-02

Pos logit contributions
rob+0.017
eh+0.014
ross+0.014
ider+0.014
 AW+0.014
Neg logit contributions
 pet-0.007
 dip-0.007
 look-0.007
 wonder-0.007
 touch-0.007
Token much
Token ID881
Feature activation-5.24e-04

Pos logit contributions
 Seal+0.017
 stare+0.017
 Swan+0.017
 gaze+0.017
 dip+0.017
Neg logit contributions
rob-0.014
eh-0.014
 infected-0.012
aj-0.012
 nonsense-0.012
Token
Token ID198
Feature activation-6.13e-03

Pos logit contributions
rob+0.010
eh+0.009
aj+0.008
ross+0.008
aces+0.008
Neg logit contributions
 look-0.003
 try-0.003
 touch-0.003
 pet-0.003
 dip-0.003
Token"
Token ID1
Feature activation+1.29e-02

Pos logit contributions
rob+0.010
eh+0.008
aj+0.008
aces+0.007
xy+0.007
Neg logit contributions
 look-0.009
 touch-0.009
 pet-0.009
 dip-0.009
 try-0.009
TokenWhat
Token ID2061
Feature activation-8.05e-03

Pos logit contributions
 stare+0.019
 look+0.019
 gaze+0.019
 dip+0.019
 clam+0.018
Neg logit contributions
rob-0.021
three-0.017
eh-0.017
xy-0.017
ider-0.016
Token about
Token ID546
Feature activation+8.89e-03

Pos logit contributions
rob+0.016
eh+0.014
aj+0.013
ross+0.013
lich+0.012
Neg logit contributions
 look-0.010
 dip-0.010
 try-0.010
 touch-0.010
 stare-0.010
Token if
Token ID611
Feature activation+1.41e-02

Pos logit contributions
 stare+0.013
 gaze+0.012
 dip+0.012
 distance+0.012
 wonder+0.012
Neg logit contributions
rob-0.014
eh-0.012
three-0.012
xy-0.011
aj-0.011
Token we
Token ID356
Feature activation+3.89e-02

Pos logit contributions
 stare+0.021
 dip+0.021
 gaze+0.020
 pet+0.020
 wonder+0.020
Neg logit contributions
rob-0.021
eh-0.018
aj-0.018
 nonsense-0.017
xy-0.017
Token pretend
Token ID16614
Feature activation+9.57e-03

Pos logit contributions
 distance+0.047
 Seal+0.046
 Swan+0.046
 Goose+0.043
 pet+0.042
Neg logit contributions
 nonsense-0.054
 reminds-0.053
 weighs-0.053
xy-0.053
 costs-0.053
Token there
Token ID612
Feature activation+4.41e-03

Pos logit contributions
 stare+0.013
 gaze+0.013
 pet+0.013
 dip+0.013
 wonder+0.013
Neg logit contributions
rob-0.015
eh-0.013
aj-0.013
 nonsense-0.013
lf-0.013
Token is
Token ID318
Feature activation+7.98e-03

Pos logit contributions
 dip+0.006
 stare+0.006
 pet+0.006
 gaze+0.006
 look+0.006
Neg logit contributions
rob-0.008
eh-0.006
aj-0.006
xy-0.006
 nonsense-0.006
Token a
Token ID257
Feature activation-1.30e-03

Pos logit contributions
 dip+0.010
 stare+0.010
 look+0.010
 pet+0.009
 touch+0.009
Neg logit contributions
rob-0.015
eh-0.013
aj-0.012
ross-0.012
 nonsense-0.012
Token cave
Token ID11527
Feature activation-1.09e-02

Pos logit contributions
rob+0.002
eh+0.002
 nonsense+0.002
lf+0.002
aj+0.002
Neg logit contributions
 stare-0.002
 dip-0.002
 gaze-0.002
 touch-0.002
 look-0.002
Token make
Token ID787
Feature activation+1.15e-02

Pos logit contributions
 infected+0.007
chy+0.007
rob+0.006
eh+0.006
lf+0.006
Neg logit contributions
 Swan-0.007
 Seal-0.007
 crab-0.007
 clam-0.007
mit-0.006
Token a
Token ID257
Feature activation+3.67e-03

Pos logit contributions
 dip+0.014
 stare+0.014
 pet+0.014
 gaze+0.014
 touch+0.013
Neg logit contributions
rob-0.021
eh-0.018
aj-0.018
 nonsense-0.017
ross-0.017
Token face
Token ID1986
Feature activation+7.51e-03

Pos logit contributions
 Seal+0.005
 Swan+0.005
 dip+0.005
 stare+0.005
 gaze+0.005
Neg logit contributions
rob-0.005
eh-0.005
aj-0.004
 nonsense-0.004
 infected-0.004
Token as
Token ID355
Feature activation+3.45e-03

Pos logit contributions
 Seal+0.010
 dip+0.010
 Swan+0.010
 wonder+0.009
 gaze+0.009
Neg logit contributions
rob-0.011
eh-0.011
 nonsense-0.010
es-0.010
aj-0.010
Token he
Token ID339
Feature activation-7.95e-03

Pos logit contributions
 dip+0.005
 look+0.005
 stare+0.004
 touch+0.004
 pet+0.004
Neg logit contributions
rob-0.007
eh-0.006
aj-0.005
ross-0.005
xy-0.005
Token tasted
Token ID29187
Feature activation+2.83e-02

Pos logit contributions
rob+0.016
eh+0.014
aj+0.013
ross+0.013
xy+0.012
Neg logit contributions
 dip-0.009
 look-0.009
 stare-0.009
 pet-0.009
 touch-0.009
Token the
Token ID262
Feature activation+2.59e-04

Pos logit contributions
 Swan+0.042
 gaze+0.041
 stare+0.041
 Seal+0.040
 crab+0.038
Neg logit contributions
rob-0.037
 infected-0.036
eh-0.036
 nonsense-0.035
aj-0.034
Token bitter
Token ID12922
Feature activation+1.96e-02

Pos logit contributions
 stare+0.000
 dip+0.000
 Swan+0.000
 Seal+0.000
 gaze+0.000
Neg logit contributions
rob-0.000
eh-0.000
aj-0.000
 nonsense-0.000
 infected-0.000
Token medicine
Token ID9007
Feature activation+1.51e-02

Pos logit contributions
 Seal+0.021
 Swan+0.021
 stare+0.020
 investigate+0.020
 gaze+0.020
Neg logit contributions
rob-0.036
eh-0.032
aj-0.031
 nonsense-0.031
ops-0.031
Token.
Token ID13
Feature activation-1.16e-02

Pos logit contributions
 dip+0.019
 stare+0.018
 gaze+0.018
 pet+0.018
 wonder+0.018
Neg logit contributions
rob-0.029
eh-0.025
aj-0.023
ross-0.023
xy-0.023
Token
Token ID198
Feature activation-8.60e-03

Pos logit contributions
rob+0.025
eh+0.022
aj+0.020
ross+0.020
xy+0.020
Neg logit contributions
 look-0.012
 dip-0.012
 stare-0.012
 try-0.012
 touch-0.012
Token She
Token ID1375
Feature activation-3.52e-03

Pos logit contributions
 dip+0.012
 Seal+0.011
 stare+0.011
 crab+0.011
 clam+0.011
Neg logit contributions
rob-0.012
aj-0.011
eh-0.011
aces-0.010
 nonsense-0.010
Token did
Token ID750
Feature activation+8.22e-03

Pos logit contributions
rob+0.007
eh+0.006
aj+0.006
ross+0.005
xy+0.005
Neg logit contributions
 dip-0.004
 look-0.004
 stare-0.004
 pet-0.004
 touch-0.004
Token not
Token ID407
Feature activation+1.57e-03

Pos logit contributions
 dip+0.013
 look+0.013
 stare+0.013
 pet+0.013
 touch+0.013
Neg logit contributions
rob-0.013
eh-0.011
aj-0.010
ross-0.010
xy-0.009
Token want
Token ID765
Feature activation+9.64e-03

Pos logit contributions
 Swan+0.002
 Seal+0.002
 poppy+0.002
 Ned+0.002
 clam+0.002
Neg logit contributions
eh-0.002
rob-0.002
 nonsense-0.002
ross-0.002
aces-0.002
Token to
Token ID284
Feature activation+9.44e-03

Pos logit contributions
 Swan+0.016
 paddle+0.015
 gaze+0.015
 stare+0.015
 dip+0.014
Neg logit contributions
 homework-0.012
 dragons-0.012
 nonsense-0.012
 puzzles-0.011
rob-0.011
Token taste
Token ID6938
Feature activation+3.18e-02

Pos logit contributions
 Swan+0.014
 Seal+0.013
 toddler+0.013
 harbour+0.012
 Ned+0.012
Neg logit contributions
 homework-0.011
 excuse-0.011
 nonsense-0.010
chy-0.010
lf-0.010
Token the
Token ID262
Feature activation+1.20e-02

Pos logit contributions
 Swan+0.046
 gaze+0.045
 stare+0.045
 Seal+0.044
 dip+0.043
Neg logit contributions
rob-0.044
 nonsense-0.043
eh-0.042
 infected-0.040
aj-0.039
Token broccoli
Token ID44653
Feature activation+1.94e-02

Pos logit contributions
 dip+0.016
 stare+0.016
 look+0.015
 gaze+0.015
 Seal+0.015
Neg logit contributions
rob-0.021
eh-0.018
aj-0.017
 nonsense-0.017
ross-0.016
Token.
Token ID13
Feature activation+4.31e-03

Pos logit contributions
 dip+0.024
 stare+0.024
 gaze+0.023
 Seal+0.023
 pet+0.023
Neg logit contributions
rob-0.035
eh-0.031
aj-0.029
 nonsense-0.029
ross-0.028
Token She
Token ID1375
Feature activation-1.05e-03

Pos logit contributions
 dip+0.005
 stare+0.005
 look+0.005
 pet+0.005
 Seal+0.005
Neg logit contributions
rob-0.008
eh-0.007
aj-0.007
 nonsense-0.006
ross-0.006
Token looked
Token ID3114
Feature activation-1.45e-02

Pos logit contributions
rob+0.002
eh+0.002
aj+0.002
ross+0.002
 nonsense+0.002
Neg logit contributions
 dip-0.001
 look-0.001
 stare-0.001
 touch-0.001
 pet-0.001
Tokenapped
Token ID6320
Feature activation+3.26e-03

Pos logit contributions
rob+0.003
aj+0.002
ross+0.002
eh+0.002
ider+0.002
Neg logit contributions
 dip-0.004
 look-0.003
 distance-0.003
 stare-0.003
 try-0.003
Token and
Token ID290
Feature activation-6.69e-03

Pos logit contributions
 dip+0.004
 stare+0.004
 clam+0.004
 gaze+0.004
 Seal+0.004
Neg logit contributions
rob-0.005
eh-0.005
aj-0.005
ross-0.005
 nonsense-0.004
Token cheered
Token ID34767
Feature activation-5.28e-04

Pos logit contributions
rob+0.014
eh+0.012
aj+0.012
ross+0.011
xy+0.011
Neg logit contributions
 dip-0.007
 look-0.007
 stare-0.007
 pet-0.007
 touch-0.007
Token for
Token ID329
Feature activation-5.56e-03

Pos logit contributions
rob+0.001
eh+0.001
aj+0.001
ross+0.001
 nonsense+0.001
Neg logit contributions
 dip-0.001
 stare-0.001
 look-0.001
 pet-0.001
 touch-0.001
Token the
Token ID262
Feature activation+3.97e-03

Pos logit contributions
rob+0.009
eh+0.007
ross+0.007
 nonsense+0.007
ibr+0.007
Neg logit contributions
 dip-0.008
 stare-0.008
 gaze-0.008
 touch-0.008
 crab-0.008
Token little
Token ID1310
Feature activation+2.82e-02

Pos logit contributions
 dip+0.005
 stare+0.005
 touch+0.005
 look+0.005
 gaze+0.005
Neg logit contributions
rob-0.007
eh-0.006
aj-0.006
 nonsense-0.006
lf-0.006
Token girl
Token ID2576
Feature activation+1.42e-02

Pos logit contributions
 stare+0.039
 dip+0.039
 gaze+0.037
 touch+0.037
 Seal+0.036
Neg logit contributions
rob-0.044
eh-0.038
ider-0.035
 nonsense-0.035
aj-0.035
Token as
Token ID355
Feature activation+8.92e-04

Pos logit contributions
 dip+0.021
 stare+0.020
 crab+0.020
 gaze+0.020
 clam+0.019
Neg logit contributions
rob-0.020
eh-0.019
es-0.018
aces-0.018
aj-0.017
Token she
Token ID673
Feature activation-1.36e-02

Pos logit contributions
 dip+0.001
 look+0.001
 stare+0.001
 pet+0.001
 touch+0.001
Neg logit contributions
rob-0.002
eh-0.002
aj-0.001
ross-0.001
xy-0.001
Token proudly
Token ID23017
Feature activation-1.21e-02

Pos logit contributions
rob+0.028
eh+0.025
aj+0.024
ross+0.023
 nonsense+0.023
Neg logit contributions
 pet-0.015
 dip-0.015
 touch-0.014
 look-0.014
 try-0.014
Token held
Token ID2714
Feature activation+3.95e-03

Pos logit contributions
rob+0.023
eh+0.020
aj+0.019
ross+0.018
xy+0.018
Neg logit contributions
 dip-0.015
 look-0.014
 stare-0.014
 pet-0.014
 touch-0.014
Token with
Token ID351
Feature activation-2.39e-03

Pos logit contributions
rob+0.002
eh+0.002
ross+0.002
lf+0.002
aj+0.002
Neg logit contributions
 dip-0.002
 stare-0.001
 pet-0.001
 Seal-0.001
 gaze-0.001
Token a
Token ID257
Feature activation-8.55e-03

Pos logit contributions
rob+0.004
eh+0.003
aj+0.003
 nonsense+0.003
lich+0.003
Neg logit contributions
 dip-0.004
 stare-0.003
 gaze-0.003
 look-0.003
 touch-0.003
Token big
Token ID1263
Feature activation+5.50e-03

Pos logit contributions
rob+0.012
eh+0.011
aj+0.011
lf+0.010
 nonsense+0.010
Neg logit contributions
 dip-0.013
 stare-0.012
 Seal-0.012
 gaze-0.012
 care-0.012
Token chim
Token ID18205
Feature activation-2.09e-02

Pos logit contributions
 dip+0.007
 stare+0.007
 Seal+0.007
 touch+0.007
 gaze+0.007
Neg logit contributions
rob-0.010
eh-0.008
aj-0.008
lf-0.008
 nonsense-0.008
Tokenney
Token ID1681
Feature activation-4.29e-03

Pos logit contributions
rob+0.029
eh+0.024
aj+0.022
ross+0.021
xy+0.020
Neg logit contributions
 dip-0.037
 look-0.036
 stare-0.036
 pet-0.036
 touch-0.036
Token and
Token ID290
Feature activation+4.47e-03

Pos logit contributions
rob+0.008
eh+0.007
aj+0.006
lf+0.006
ross+0.006
Neg logit contributions
 dip-0.005
 stare-0.005
 gaze-0.005
 Seal-0.005
 try-0.005
Token a
Token ID257
Feature activation-1.19e-02

Pos logit contributions
 look+0.005
 dip+0.005
 pet+0.005
 touch+0.005
 stare+0.005
Neg logit contributions
rob-0.009
eh-0.008
aj-0.008
ross-0.008
xy-0.007
Token bright
Token ID6016
Feature activation+1.17e-02

Pos logit contributions
rob+0.018
eh+0.016
aj+0.015
 nonsense+0.015
lf+0.015
Neg logit contributions
 dip-0.017
 stare-0.016
 Seal-0.016
 care-0.016
 touch-0.016
Token star
Token ID3491
Feature activation-1.02e-02

Pos logit contributions
 dip+0.014
 Seal+0.014
mit+0.013
 stare+0.013
 crab+0.013
Neg logit contributions
rob-0.019
eh-0.018
lf-0.017
aj-0.016
 fort-0.016
Token on
Token ID319
Feature activation-3.76e-03

Pos logit contributions
rob+0.017
eh+0.015
aj+0.014
lf+0.014
ider+0.014
Neg logit contributions
 dip-0.014
 stare-0.014
 wonder-0.014
 look-0.013
 try-0.013
Token the
Token ID262
Feature activation-7.75e-03

Pos logit contributions
rob+0.007
eh+0.006
lf+0.006
 nonsense+0.006
ross+0.006
Neg logit contributions
 dip-0.004
 stare-0.004
 pet-0.004
 crab-0.004
 gaze-0.004
TokenAnna
Token ID31160
Feature activation+5.06e-03

Pos logit contributions
 dip+0.007
 stare+0.007
 look+0.007
 pet+0.007
 touch+0.007
Neg logit contributions
rob-0.010
eh-0.008
aj-0.008
ross-0.007
xy-0.007
Token and
Token ID290
Feature activation+6.73e-03

Pos logit contributions
 pet+0.005
 look+0.005
 dip+0.005
 touch+0.005
 stare+0.005
Neg logit contributions
rob-0.011
eh-0.010
aj-0.009
ross-0.009
xy-0.009
Token Ben
Token ID3932
Feature activation+1.83e-02

Pos logit contributions
 dip+0.010
 clam+0.010
 Swan+0.009
 Seal+0.009
 stare+0.009
Neg logit contributions
rob-0.010
eh-0.009
lich-0.008
aj-0.008
 nonsense-0.008
Token are
Token ID389
Feature activation-3.50e-03

Pos logit contributions
 dip+0.023
 stare+0.021
 look+0.021
 Seal+0.021
 gaze+0.021
Neg logit contributions
rob-0.034
eh-0.029
aj-0.028
lich-0.026
 nonsense-0.026
Token playing
Token ID2712
Feature activation-1.88e-02

Pos logit contributions
rob+0.008
 AW+0.006
ross+0.006
xy+0.006
eh+0.006
Neg logit contributions
 pet-0.004
 look-0.004
 dip-0.004
 try-0.004
 wonder-0.004
Token in
Token ID287
Feature activation+4.07e-03

Pos logit contributions
rob+0.043
aj+0.038
eh+0.037
ross+0.036
xy+0.036
Neg logit contributions
 dip-0.017
 look-0.017
 pet-0.017
 touch-0.016
 try-0.016
Token the
Token ID262
Feature activation-2.60e-03

Pos logit contributions
 dip+0.005
 stare+0.005
 pet+0.004
 look+0.004
 gaze+0.004
Neg logit contributions
rob-0.008
eh-0.007
lf-0.006
aj-0.006
xy-0.006
Token garden
Token ID11376
Feature activation+2.97e-03

Pos logit contributions
rob+0.005
eh+0.005
aj+0.004
ross+0.004
xy+0.004
Neg logit contributions
 dip-0.003
 stare-0.003
 look-0.003
 touch-0.003
 pet-0.003
Token.
Token ID13
Feature activation+1.30e-02

Pos logit contributions
 dip+0.004
 stare+0.004
 look+0.003
 pet+0.003
 touch+0.003
Neg logit contributions
rob-0.006
eh-0.005
aj-0.005
ross-0.004
xy-0.004
Token They
Token ID1119
Feature activation+8.92e-03

Pos logit contributions
 dip+0.018
 stare+0.017
 look+0.017
 clam+0.017
 poppy+0.017
Neg logit contributions
rob-0.020
eh-0.018
aj-0.017
 nonsense-0.016
 Cere-0.016
Token see
Token ID766
Feature activation-1.20e-02

Pos logit contributions
 Seal+0.006
 dip+0.006
 Swan+0.006
 clam+0.006
 stare+0.005
Neg logit contributions
rob-0.021
eh-0.019
aj-0.017
 nonsense-0.017
ross-0.017
Token asked
Token ID1965
Feature activation-7.32e-03

Pos logit contributions
rob+0.012
eh+0.010
aj+0.009
ross+0.009
xy+0.009
Neg logit contributions
 dip-0.007
 look-0.007
 stare-0.007
 pet-0.007
 touch-0.007
Token,
Token ID11
Feature activation+4.15e-03

Pos logit contributions
rob+0.019
eh+0.016
aj+0.015
xy+0.015
ider+0.015
Neg logit contributions
 look-0.008
 pet-0.007
 try-0.007
 touch-0.007
 care-0.007
Token "
Token ID366
Feature activation+1.47e-02

Pos logit contributions
 dip+0.007
 stare+0.007
 look+0.007
 gaze+0.006
 touch+0.006
Neg logit contributions
rob-0.006
eh-0.005
aj-0.005
ross-0.005
 nonsense-0.004
TokenDo
Token ID5211
Feature activation+4.10e-03

Pos logit contributions
 dip+0.017
 look+0.016
 stare+0.016
 pet+0.016
 touch+0.016
Neg logit contributions
rob-0.030
eh-0.026
aj-0.025
ross-0.024
xy-0.024
Token you
Token ID345
Feature activation+1.82e-02

Pos logit contributions
 look+0.005
 try+0.005
 dip+0.005
 stare+0.004
 touch+0.004
Neg logit contributions
rob-0.009
eh-0.008
aj-0.007
ross-0.007
ider-0.007
Token have
Token ID423
Feature activation+1.87e-02

Pos logit contributions
 Swan+0.021
 Seal+0.021
 dip+0.020
 poppy+0.019
 pet+0.019
Neg logit contributions
eh-0.031
rob-0.030
aj-0.028
 nonsense-0.027
aces-0.027
Token something
Token ID1223
Feature activation+1.16e-02

Pos logit contributions
 dip+0.025
 stare+0.025
 gaze+0.024
 wonder+0.024
 look+0.024
Neg logit contributions
rob-0.032
eh-0.029
aj-0.027
 nonsense-0.025
ross-0.025
Token for
Token ID329
Feature activation-1.20e-02

Pos logit contributions
 stare+0.017
 gaze+0.017
 dip+0.017
 wonder+0.017
 Seal+0.017
Neg logit contributions
rob-0.016
eh-0.015
aj-0.015
 nonsense-0.014
xy-0.013
Token me
Token ID502
Feature activation+6.02e-04

Pos logit contributions
rob+0.023
eh+0.020
aj+0.019
ross+0.018
xy+0.018
Neg logit contributions
 dip-0.015
 look-0.014
 stare-0.014
 pet-0.014
 touch-0.014
Token?"
Token ID1701
Feature activation+5.70e-03

Pos logit contributions
 dip+0.001
 stare+0.001
 wonder+0.001
 gaze+0.001
 pet+0.001
Neg logit contributions
rob-0.001
eh-0.001
aj-0.001
 nonsense-0.001
ider-0.001
Token Lily
Token ID20037
Feature activation+5.44e-03

Pos logit contributions
 dip+0.008
 look+0.008
 stare+0.008
 pet+0.008
 touch+0.008
Neg logit contributions
rob-0.010
eh-0.008
aj-0.008
ross-0.007
xy-0.007
Token so
Token ID523
Feature activation-3.30e-03

Pos logit contributions
rob+0.001
eh+0.001
aj+0.001
 nonsense+0.001
ross+0.001
Neg logit contributions
 dip-0.001
 stare-0.001
 gaze-0.001
 look-0.001
 touch-0.001
Token big
Token ID1263
Feature activation-3.09e-03

Pos logit contributions
rob+0.006
eh+0.005
aj+0.005
ross+0.004
xy+0.004
Neg logit contributions
 dip-0.005
 stare-0.005
 look-0.005
 gaze-0.004
 pet-0.004
Token and
Token ID290
Feature activation-1.05e-02

Pos logit contributions
rob+0.007
eh+0.006
aj+0.006
ross+0.006
xy+0.006
Neg logit contributions
 dip-0.003
 stare-0.003
 look-0.003
 pet-0.003
 touch-0.003
Token so
Token ID523
Feature activation-1.09e-02

Pos logit contributions
rob+0.020
eh+0.017
aj+0.016
ross+0.015
ibr+0.015
Neg logit contributions
 dip-0.014
 look-0.014
 pet-0.013
 touch-0.013
 stare-0.013
Token round
Token ID2835
Feature activation-2.82e-03

Pos logit contributions
rob+0.023
eh+0.020
aj+0.019
lich+0.019
ross+0.018
Neg logit contributions
 look-0.012
 dip-0.012
 pet-0.012
 touch-0.012
 wonder-0.011
Token that
Token ID326
Feature activation-1.05e-03

Pos logit contributions
rob+0.006
eh+0.006
aj+0.005
ross+0.005
xy+0.005
Neg logit contributions
 dip-0.003
 look-0.003
 stare-0.002
 pet-0.002
 touch-0.002
Token it
Token ID340
Feature activation+4.54e-03

Pos logit contributions
rob+0.003
eh+0.002
aj+0.002
ross+0.002
xy+0.002
Neg logit contributions
 dip-0.001
 look-0.001
 stare-0.001
 pet-0.001
 touch-0.001
Token was
Token ID373
Feature activation+3.61e-03

Pos logit contributions
 dip+0.005
 look+0.005
 pet+0.005
 stare+0.005
 touch+0.005
Neg logit contributions
rob-0.010
eh-0.008
aj-0.008
ross-0.008
 nonsense-0.008
Token like
Token ID588
Feature activation+6.43e-03

Pos logit contributions
 dip+0.004
 stare+0.004
 gaze+0.004
 look+0.004
 touch+0.004
Neg logit contributions
rob-0.007
eh-0.006
aj-0.006
ross-0.006
 nonsense-0.006
Token a
Token ID257
Feature activation-5.93e-04

Pos logit contributions
 dip+0.008
 stare+0.008
 gaze+0.008
 touch+0.008
 pet+0.008
Neg logit contributions
rob-0.011
eh-0.010
 nonsense-0.009
lf-0.009
xy-0.009
Token mountain
Token ID8598
Feature activation-7.19e-03

Pos logit contributions
rob+0.001
eh+0.001
lf+0.001
 nonsense+0.001
aj+0.001
Neg logit contributions
 stare-0.001
 dip-0.001
 gaze-0.001
 look-0.001
 touch-0.001
Token asked
Token ID1965
Feature activation-1.83e-02

Pos logit contributions
 dip+0.003
 pet+0.003
 look+0.003
 touch+0.003
 stare+0.003
Neg logit contributions
rob-0.006
eh-0.005
aj-0.005
ross-0.005
 nonsense-0.005
Token his
Token ID465
Feature activation-2.22e-03

Pos logit contributions
rob+0.043
eh+0.038
aj+0.038
ider+0.036
xy+0.034
Neg logit contributions
 look-0.018
 try-0.016
 pet-0.016
 touch-0.016
 care-0.015
Token mom
Token ID1995
Feature activation+6.31e-03

Pos logit contributions
rob+0.005
eh+0.004
aj+0.004
ross+0.004
xy+0.004
Neg logit contributions
 dip-0.003
 look-0.002
 stare-0.002
 pet-0.002
 touch-0.002
Token for
Token ID329
Feature activation+2.16e-03

Pos logit contributions
 dip+0.010
 stare+0.010
 gaze+0.009
 pet+0.009
 wonder+0.009
Neg logit contributions
rob-0.010
eh-0.008
aj-0.008
ross-0.008
 nonsense-0.007
Token help
Token ID1037
Feature activation-1.19e-02

Pos logit contributions
 stare+0.003
 dip+0.003
 gaze+0.003
 Swan+0.003
 pet+0.003
Neg logit contributions
rob-0.004
eh-0.003
aj-0.003
lf-0.003
 nonsense-0.003
Token and
Token ID290
Feature activation+8.02e-03

Pos logit contributions
rob+0.017
eh+0.016
lf+0.015
lich+0.015
es+0.015
Neg logit contributions
 dip-0.017
 stare-0.017
 Swan-0.017
 crab-0.017
 Seal-0.017
Token they
Token ID484
Feature activation+5.28e-03

Pos logit contributions
 dip+0.010
 look+0.009
 stare+0.009
 pet+0.009
 touch+0.009
Neg logit contributions
rob-0.016
eh-0.014
aj-0.013
ross-0.013
xy-0.012
Token found
Token ID1043
Feature activation-9.85e-03

Pos logit contributions
 dip+0.006
 stare+0.006
 Seal+0.006
 gaze+0.006
 touch+0.006
Neg logit contributions
rob-0.009
eh-0.009
aj-0.008
lf-0.008
xy-0.008
Token lots
Token ID6041
Feature activation-1.48e-03

Pos logit contributions
rob+0.017
eh+0.015
xy+0.014
 nonsense+0.013
ross+0.013
Neg logit contributions
 stare-0.014
 dip-0.013
 gaze-0.013
 touch-0.012
 crab-0.012
Token of
Token ID286
Feature activation-2.10e-02

Pos logit contributions
rob+0.002
eh+0.002
aj+0.002
ross+0.002
xy+0.002
Neg logit contributions
 stare-0.002
 gaze-0.002
 look-0.002
 Seal-0.002
 dip-0.002
Token creative
Token ID7325
Feature activation+3.57e-04

Pos logit contributions
rob+0.034
eh+0.029
aj+0.027
 nonsense+0.025
xy+0.025
Neg logit contributions
 dip-0.032
 stare-0.032
 look-0.031
 gaze-0.031
 touch-0.031
Token the
Token ID262
Feature activation+3.34e-03

Pos logit contributions
 dip+0.006
 Seal+0.006
 stare+0.006
 crab+0.006
 Swan+0.006
Neg logit contributions
rob-0.005
eh-0.005
lf-0.005
es-0.005
ross-0.004
Token flowers
Token ID12734
Feature activation+1.59e-02

Pos logit contributions
 dip+0.005
 stare+0.005
 try+0.005
 Seal+0.005
 touch+0.005
Neg logit contributions
rob-0.005
eh-0.005
aj-0.005
lf-0.004
ross-0.004
Token.
Token ID13
Feature activation+3.28e-03

Pos logit contributions
 stare+0.022
 Seal+0.021
 dip+0.021
 wonder+0.021
 crab+0.021
Neg logit contributions
rob-0.025
eh-0.022
aces-0.021
es-0.021
xy-0.020
Token
Token ID198
Feature activation+8.69e-04

Pos logit contributions
 dip+0.004
 crab+0.004
 look+0.004
 stare+0.004
 pet+0.004
Neg logit contributions
rob-0.005
eh-0.005
aj-0.005
three-0.005
ross-0.005
Token
Token ID198
Feature activation+2.36e-04

Pos logit contributions
 dip+0.001
 look+0.001
 stare+0.001
 pet+0.001
 gaze+0.001
Neg logit contributions
rob-0.002
eh-0.002
aj-0.001
ross-0.001
xy-0.001
TokenJane
Token ID41083
Feature activation-6.54e-03

Pos logit contributions
 dip+0.000
 look+0.000
 stare+0.000
 pet+0.000
 touch+0.000
Neg logit contributions
rob-0.000
eh-0.000
aj-0.000
ross-0.000
xy-0.000
Token spent
Token ID3377
Feature activation-9.95e-03

Pos logit contributions
rob+0.012
eh+0.010
aj+0.010
ross+0.009
lf+0.009
Neg logit contributions
 dip-0.009
 look-0.008
 stare-0.008
 touch-0.008
 gaze-0.008
Token the
Token ID262
Feature activation-1.42e-02

Pos logit contributions
rob+0.020
eh+0.017
aj+0.017
ross+0.016
xy+0.016
Neg logit contributions
 dip-0.012
 look-0.011
 stare-0.011
 pet-0.011
 touch-0.011
Token whole
Token ID2187
Feature activation-4.70e-03

Pos logit contributions
rob+0.030
aj+0.026
eh+0.026
ross+0.025
xy+0.025
Neg logit contributions
 dip-0.015
 look-0.015
 pet-0.015
 stare-0.015
 gaze-0.015
Token day
Token ID1110
Feature activation+3.13e-04

Pos logit contributions
rob+0.007
eh+0.006
aj+0.006
ross+0.006
lf+0.005
Neg logit contributions
 dip-0.007
 look-0.007
 stare-0.007
 touch-0.007
 pet-0.007
Token in
Token ID287
Feature activation+6.70e-03

Pos logit contributions
 dip+0.000
 look+0.000
 stare+0.000
 pet+0.000
 touch+0.000
Neg logit contributions
rob-0.001
eh-0.000
aj-0.000
ross-0.000
xy-0.000
Token One
Token ID1881
Feature activation-3.25e-03

Pos logit contributions
rob+0.011
eh+0.009
aj+0.009
ross+0.009
xy+0.009
Neg logit contributions
 dip-0.006
 look-0.006
 pet-0.006
 stare-0.006
 touch-0.006
Token day
Token ID1110
Feature activation-6.97e-04

Pos logit contributions
rob+0.006
eh+0.005
aj+0.005
ross+0.004
 nonsense+0.004
Neg logit contributions
 dip-0.005
 stare-0.004
 look-0.004
 pet-0.004
 gaze-0.004
Token,
Token ID11
Feature activation+1.11e-03

Pos logit contributions
rob+0.002
eh+0.001
aj+0.001
ross+0.001
xy+0.001
Neg logit contributions
 dip-0.001
 look-0.001
 pet-0.001
 touch-0.001
 stare-0.001
Token Lily
Token ID20037
Feature activation+8.99e-04

Pos logit contributions
 dip+0.002
 stare+0.002
 look+0.002
 gaze+0.002
 wonder+0.002
Neg logit contributions
rob-0.002
eh-0.001
aj-0.001
ross-0.001
 nonsense-0.001
Token was
Token ID373
Feature activation-9.46e-03

Pos logit contributions
 pet+0.001
 look+0.001
 dip+0.001
 touch+0.001
 stare+0.001
Neg logit contributions
rob-0.002
eh-0.002
ross-0.002
aj-0.002
xy-0.002
Token playing
Token ID2712
Feature activation-2.00e-02

Pos logit contributions
rob+0.017
eh+0.014
aj+0.014
xy+0.013
ross+0.013
Neg logit contributions
 dip-0.013
 look-0.013
 pet-0.013
 stare-0.013
 touch-0.013
Token outside
Token ID2354
Feature activation-3.57e-03

Pos logit contributions
rob+0.046
aj+0.040
eh+0.039
xy+0.038
ross+0.038
Neg logit contributions
 look-0.019
 dip-0.018
 pet-0.018
 touch-0.018
 try-0.017
Token when
Token ID618
Feature activation+1.61e-02

Pos logit contributions
rob+0.008
aj+0.007
eh+0.007
xy+0.007
ross+0.007
Neg logit contributions
 dip-0.003
 look-0.003
 touch-0.003
 pet-0.003
 stare-0.003
Token she
Token ID673
Feature activation+6.26e-03

Pos logit contributions
 dip+0.022
 look+0.021
 stare+0.021
 pet+0.020
 touch+0.020
Neg logit contributions
rob-0.029
eh-0.025
aj-0.024
ross-0.023
xy-0.022
Token saw
Token ID2497
Feature activation-1.45e-02

Pos logit contributions
 dip+0.008
 stare+0.008
 look+0.008
 gaze+0.007
 pet+0.007
Neg logit contributions
rob-0.012
eh-0.011
aj-0.010
ross-0.009
lf-0.009
Token a
Token ID257
Feature activation-4.22e-03

Pos logit contributions
rob+0.024
eh+0.021
 nonsense+0.019
aj+0.018
ross+0.018
Neg logit contributions
 dip-0.021
 stare-0.021
 gaze-0.020
 look-0.020
 try-0.020
Token pic
Token ID8301
Feature activation+4.65e-03

Pos logit contributions
 dip+0.007
 stare+0.007
 touch+0.007
 gaze+0.007
 look+0.007
Neg logit contributions
rob-0.007
eh-0.006
aj-0.005
 nonsense-0.005
xy-0.005
Tokenn
Token ID77
Feature activation-1.65e-03

Pos logit contributions
 dip+0.007
 look+0.006
 stare+0.006
 pet+0.006
 touch+0.006
Neg logit contributions
rob-0.008
eh-0.007
aj-0.007
ross-0.006
 nonsense-0.006
Tokenics
Token ID873
Feature activation+1.03e-02

Pos logit contributions
rob+0.003
aj+0.003
eh+0.003
 nonsense+0.003
oc+0.003
Neg logit contributions
 look-0.002
 try-0.002
 touch-0.002
 dip-0.002
 stare-0.002
Token together
Token ID1978
Feature activation+1.10e-03

Pos logit contributions
 dip+0.012
 stare+0.012
 look+0.012
 touch+0.012
 pet+0.012
Neg logit contributions
rob-0.020
eh-0.018
aj-0.017
ross-0.016
xy-0.016
Token.
Token ID13
Feature activation+3.65e-04

Pos logit contributions
 dip+0.001
 stare+0.001
 look+0.001
 gaze+0.001
 wonder+0.001
Neg logit contributions
rob-0.002
eh-0.002
aj-0.002
xy-0.002
ross-0.002
Token
Token ID220
Feature activation-1.43e-02

Pos logit contributions
 clam+0.001
 dip+0.000
 crab+0.000
 stare+0.000
 look+0.000
Neg logit contributions
rob-0.001
eh-0.001
aj-0.000
xy-0.000
three-0.000
Token
Token ID198
Feature activation+2.80e-04

Pos logit contributions
rob+0.034
eh+0.030
aj+0.028
ross+0.028
 nonsense+0.027
Neg logit contributions
 look-0.012
 dip-0.012
 touch-0.011
 stare-0.011
 pet-0.011
Token
Token ID198
Feature activation+9.22e-05

Pos logit contributions
 dip+0.000
 look+0.000
 stare+0.000
 pet+0.000
 touch+0.000
Neg logit contributions
rob-0.001
eh-0.001
aj-0.001
ross-0.001
 nonsense-0.001
TokenOne
Token ID3198
Feature activation+1.09e-03

Pos logit contributions
 dip+0.000
 pet+0.000
 look+0.000
 touch+0.000
 stare+0.000
Neg logit contributions
rob-0.000
eh-0.000
aj-0.000
ross-0.000
aces-0.000
Token day
Token ID1110
Feature activation+3.76e-03

Pos logit contributions
 dip+0.002
 stare+0.001
 look+0.001
 gaze+0.001
 try+0.001
Neg logit contributions
rob-0.002
eh-0.002
aj-0.002
ross-0.001
 nonsense-0.001
Token,
Token ID11
Feature activation+5.93e-03

Pos logit contributions
 dip+0.004
 stare+0.004
 gaze+0.004
 look+0.004
 wonder+0.004
Neg logit contributions
rob-0.007
eh-0.006
aj-0.006
ross-0.006
lich-0.006
Token!"
Token ID2474
Feature activation+6.31e-04

Pos logit contributions
rob+0.009
eh+0.008
aj+0.008
ross+0.007
xy+0.007
Neg logit contributions
 dip-0.004
 look-0.004
 stare-0.004
 pet-0.004
 touch-0.004
Token she
Token ID673
Feature activation-1.79e-02

Pos logit contributions
 dip+0.001
 stare+0.001
 look+0.001
 gaze+0.001
 pet+0.001
Neg logit contributions
rob-0.001
eh-0.001
aj-0.001
 nonsense-0.001
ross-0.001
Token said
Token ID531
Feature activation-4.94e-03

Pos logit contributions
rob+0.038
eh+0.032
ross+0.032
aj+0.032
xy+0.031
Neg logit contributions
 look-0.021
 pet-0.021
 touch-0.020
 try-0.020
 dip-0.019
Token.
Token ID13
Feature activation-3.90e-03

Pos logit contributions
rob+0.008
eh+0.007
aj+0.007
ross+0.007
ider+0.007
Neg logit contributions
 dip-0.007
 Seal-0.006
 wonder-0.006
 pet-0.006
 stare-0.006
Token "
Token ID366
Feature activation-8.44e-04

Pos logit contributions
rob+0.006
eh+0.005
aj+0.005
 nonsense+0.005
ross+0.005
Neg logit contributions
 dip-0.006
 stare-0.006
 gaze-0.006
 look-0.006
 clam-0.006
TokenYou
Token ID1639
Feature activation-1.11e-02

Pos logit contributions
rob+0.002
aj+0.001
eh+0.001
ross+0.001
xy+0.001
Neg logit contributions
 touch-0.001
 pet-0.001
 dip-0.001
 look-0.001
 wonder-0.001
Token did
Token ID750
Feature activation+2.56e-03

Pos logit contributions
rob+0.020
eh+0.017
aj+0.016
ross+0.015
 nonsense+0.015
Neg logit contributions
 dip-0.015
 look-0.015
 stare-0.015
 pet-0.015
 touch-0.015
Token not
Token ID407
Feature activation-8.87e-03

Pos logit contributions
 look+0.004
 dip+0.004
 touch+0.004
 try+0.004
 pet+0.004
Neg logit contributions
rob-0.005
eh-0.004
aj-0.004
ross-0.003
xy-0.003
Token listen
Token ID6004
Feature activation+8.43e-03

Pos logit contributions
rob+0.018
eh+0.017
 nonsense+0.016
ross+0.015
lich+0.015
Neg logit contributions
 dip-0.009
 Seal-0.008
 gaze-0.008
 stare-0.008
 clam-0.008
Token.
Token ID13
Feature activation-2.90e-03

Pos logit contributions
 look+0.007
 dip+0.007
 touch+0.007
 pet+0.007
 try+0.007
Neg logit contributions
rob-0.020
eh-0.017
aj-0.016
xy-0.016
ross-0.016
Token You
Token ID921
Feature activation-9.70e-03

Pos logit contributions
rob+0.005
eh+0.004
aj+0.003
ross+0.003
xy+0.003
Neg logit contributions
 look-0.005
 stare-0.005
 dip-0.005
 touch-0.005
 pet-0.005
Token.
Token ID13
Feature activation+8.34e-03

Pos logit contributions
rob+0.014
eh+0.013
lf+0.012
ider+0.011
ross+0.011
Neg logit contributions
 dip-0.013
 stare-0.012
 try-0.012
 wonder-0.011
 gaze-0.011
Token All
Token ID1439
Feature activation+4.57e-03

Pos logit contributions
 dip+0.012
 pet+0.011
 clam+0.011
 crab+0.011
 look+0.011
Neg logit contributions
rob-0.014
eh-0.012
aj-0.011
 nonsense-0.011
aces-0.011
Token the
Token ID262
Feature activation-3.08e-02

Pos logit contributions
 dip+0.006
 pet+0.006
 gaze+0.006
 stare+0.006
 Seal+0.006
Neg logit contributions
rob-0.008
eh-0.007
 nonsense-0.006
xy-0.006
lich-0.006
Token people
Token ID661
Feature activation+1.07e-02

Pos logit contributions
rob+0.059
eh+0.051
aj+0.048
ross+0.047
xy+0.046
Neg logit contributions
 dip-0.038
 look-0.037
 stare-0.037
 pet-0.036
 touch-0.036
Token in
Token ID287
Feature activation-1.95e-03

Pos logit contributions
 dip+0.017
 Seal+0.017
 gaze+0.017
 Swan+0.016
 Mum+0.016
Neg logit contributions
rob-0.013
es-0.011
eh-0.011
 has-0.011
lf-0.011
Token the
Token ID262
Feature activation-1.93e-02

Pos logit contributions
rob+0.003
lf+0.003
 nonsense+0.003
 puzzles+0.003
eh+0.003
Neg logit contributions
 dip-0.003
 clam-0.003
 stare-0.003
 crab-0.002
 gaze-0.002
Token town
Token ID3240
Feature activation-6.66e-03

Pos logit contributions
rob+0.040
eh+0.035
ibr+0.033
lf+0.033
aj+0.033
Neg logit contributions
 dip-0.020
 stare-0.018
 touch-0.017
 gaze-0.017
 look-0.017
Token were
Token ID547
Feature activation-1.23e-03

Pos logit contributions
rob+0.011
eh+0.009
aj+0.009
xy+0.008
lf+0.008
Neg logit contributions
 dip-0.010
 pet-0.009
 stare-0.009
 Seal-0.009
 clam-0.009
Token very
Token ID845
Feature activation-8.72e-03

Pos logit contributions
rob+0.002
eh+0.002
aj+0.002
 nonsense+0.002
ross+0.002
Neg logit contributions
 dip-0.002
 stare-0.002
 gaze-0.002
 touch-0.002
 look-0.001
Token busy
Token ID8179
Feature activation-3.40e-03

Pos logit contributions
rob+0.015
eh+0.013
aj+0.012
ross+0.012
xy+0.012
Neg logit contributions
 dip-0.012
 look-0.012
 stare-0.012
 pet-0.012
 touch-0.012
Token each
Token ID1123
Feature activation-2.45e-02

Pos logit contributions
rob+0.006
aj+0.005
eh+0.005
lf+0.005
 nonsense+0.005
Neg logit contributions
 dip-0.005
 stare-0.005
 gaze-0.004
 pet-0.004
 Seal-0.004
Tokenite
Token ID578
Feature activation-1.30e-02

Pos logit contributions
rob+0.041
eh+0.036
aj+0.034
xy+0.033
ross+0.031
Neg logit contributions
 stare-0.029
 dip-0.029
 look-0.029
 gaze-0.028
 wonder-0.028
Token.
Token ID13
Feature activation-1.16e-03

Pos logit contributions
rob+0.023
eh+0.020
xy+0.019
aj+0.018
aces+0.018
Neg logit contributions
 stare-0.017
 dip-0.017
 gaze-0.016
 wonder-0.016
 Seal-0.016
Token They
Token ID1119
Feature activation-7.76e-03

Pos logit contributions
rob+0.002
aj+0.002
eh+0.002
xy+0.002
 nonsense+0.002
Neg logit contributions
 dip-0.002
 stare-0.002
 look-0.002
 clam-0.001
 gaze-0.001
Token tied
Token ID8165
Feature activation+4.83e-03

Pos logit contributions
rob+0.015
eh+0.013
aj+0.013
ross+0.012
xy+0.012
Neg logit contributions
 dip-0.009
 look-0.009
 stare-0.009
 pet-0.009
 touch-0.009
Token the
Token ID262
Feature activation-2.92e-03

Pos logit contributions
 stare+0.006
 gaze+0.006
 dip+0.006
 clam+0.006
 crab+0.006
Neg logit contributions
rob-0.009
eh-0.007
three-0.007
xy-0.007
 nonsense-0.007
Token rib
Token ID12183
Feature activation-3.93e-02

Pos logit contributions
rob+0.005
eh+0.005
aj+0.004
ross+0.004
xy+0.004
Neg logit contributions
 dip-0.004
 look-0.004
 pet-0.004
 stare-0.004
 touch-0.004
Tokenbons
Token ID23461
Feature activation-3.34e-03

Pos logit contributions
rob+0.057
eh+0.047
aj+0.047
ider+0.045
 nonsense+0.044
Neg logit contributions
 dip-0.065
 pet-0.064
 look-0.064
 touch-0.063
 care-0.062
Token and
Token ID290
Feature activation-1.57e-02

Pos logit contributions
rob+0.006
eh+0.005
aj+0.005
ross+0.005
xy+0.005
Neg logit contributions
 stare-0.004
 dip-0.004
 gaze-0.004
 wonder-0.004
 pet-0.004
Token the
Token ID262
Feature activation-1.74e-02

Pos logit contributions
rob+0.031
eh+0.028
aj+0.026
ross+0.026
xy+0.025
Neg logit contributions
 dip-0.020
 look-0.020
 pet-0.019
 touch-0.019
 try-0.019
Token feathers
Token ID27400
Feature activation+3.46e-03

Pos logit contributions
rob+0.031
eh+0.027
aj+0.025
ross+0.024
xy+0.024
Neg logit contributions
 dip-0.024
 look-0.024
 stare-0.023
 pet-0.023
 touch-0.023
Token to
Token ID284
Feature activation-7.03e-03

Pos logit contributions
 stare+0.005
 Seal+0.005
 clam+0.005
 dip+0.005
 wonder+0.005
Neg logit contributions
rob-0.006
eh-0.005
aces-0.005
xy-0.005
aj-0.005
Token put
Token ID1234
Feature activation+1.22e-03

Pos logit contributions
 dip+0.004
 look+0.004
 stare+0.004
 pet+0.004
 touch+0.004
Neg logit contributions
rob-0.006
eh-0.005
aj-0.005
ross-0.005
xy-0.005
Token on
Token ID319
Feature activation+3.00e-03

Pos logit contributions
 dip+0.002
 stare+0.002
 gaze+0.002
 touch+0.002
 look+0.002
Neg logit contributions
rob-0.002
eh-0.002
aj-0.002
ross-0.002
xy-0.002
Token her
Token ID607
Feature activation+1.12e-02

Pos logit contributions
 dip+0.004
 stare+0.004
 crab+0.004
 touch+0.004
 Swan+0.004
Neg logit contributions
rob-0.005
eh-0.004
aj-0.004
lich-0.004
xy-0.004
Token pretty
Token ID2495
Feature activation-1.88e-03

Pos logit contributions
 dip+0.015
 look+0.015
 pet+0.015
 stare+0.015
 touch+0.015
Neg logit contributions
rob-0.021
eh-0.017
ross-0.016
aj-0.016
xy-0.016
Token new
Token ID649
Feature activation+1.76e-03

Pos logit contributions
rob+0.003
eh+0.003
aj+0.003
ross+0.003
xy+0.003
Neg logit contributions
 dip-0.002
 look-0.002
 stare-0.002
 pet-0.002
 touch-0.002
Token jewelry
Token ID22634
Feature activation-3.95e-02

Pos logit contributions
 dip+0.002
 stare+0.002
 touch+0.002
 look+0.002
 try+0.002
Neg logit contributions
rob-0.003
eh-0.003
aj-0.003
lf-0.002
aces-0.002
Token and
Token ID290
Feature activation-8.28e-03

Pos logit contributions
rob+0.079
eh+0.068
aj+0.064
ross+0.063
 nonsense+0.062
Neg logit contributions
 dip-0.047
 look-0.046
 pet-0.045
 touch-0.044
 stare-0.044
Token went
Token ID1816
Feature activation-1.93e-03

Pos logit contributions
rob+0.018
eh+0.016
aj+0.015
ross+0.014
xy+0.014
Neg logit contributions
 dip-0.009
 look-0.009
 pet-0.009
 stare-0.008
 touch-0.008
Token to
Token ID284
Feature activation-7.69e-04

Pos logit contributions
rob+0.004
eh+0.003
aj+0.003
ross+0.003
xy+0.003
Neg logit contributions
 dip-0.002
 stare-0.002
 look-0.002
 touch-0.002
 pet-0.002
Token bed
Token ID3996
Feature activation+9.55e-03

Pos logit contributions
rob+0.001
eh+0.001
aj+0.001
 nonsense+0.001
lf+0.001
Neg logit contributions
 dip-0.001
 crab-0.001
 stare-0.001
 Seal-0.001
 clam-0.001
Token.
Token ID13
Feature activation+2.43e-03

Pos logit contributions
 look+0.010
 dip+0.010
 pet+0.010
 stare+0.009
 touch+0.009
Neg logit contributions
rob-0.021
eh-0.018
lich-0.017
 AW-0.017
 nonsense-0.017
Token.
Token ID13
Feature activation+9.01e-04

Pos logit contributions
rob+0.000
eh+0.000
aj+0.000
ross+0.000
xy+0.000
Neg logit contributions
 dip-0.000
 look-0.000
 stare-0.000
 pet-0.000
 touch-0.000
Token She
Token ID1375
Feature activation+1.92e-03

Pos logit contributions
 look+0.001
 dip+0.001
 stare+0.001
 pet+0.001
 touch+0.001
Neg logit contributions
rob-0.002
eh-0.002
aj-0.002
ross-0.002
xy-0.002
Token loved
Token ID6151
Feature activation-1.86e-02

Pos logit contributions
 pet+0.002
 look+0.002
 touch+0.002
 stare+0.002
 wonder+0.002
Neg logit contributions
rob-0.004
aj-0.004
ross-0.004
three-0.004
ider-0.004
Token to
Token ID284
Feature activation+2.85e-03

Pos logit contributions
rob+0.040
eh+0.035
aj+0.033
xy+0.032
 nonsense+0.032
Neg logit contributions
 dip-0.019
 stare-0.018
 look-0.018
 gaze-0.018
 touch-0.017
Token eat
Token ID4483
Feature activation-1.83e-03

Pos logit contributions
 Seal+0.003
 Swan+0.003
 clam+0.003
 distance+0.003
 Goose+0.003
Neg logit contributions
rob-0.005
eh-0.005
lf-0.005
aj-0.005
 homework-0.004
Token av
Token ID1196
Feature activation-4.18e-02

Pos logit contributions
rob+0.003
eh+0.003
aj+0.002
ross+0.002
xy+0.002
Neg logit contributions
 dip-0.003
 look-0.003
 stare-0.003
 pet-0.003
 touch-0.003
Tokenoc
Token ID420
Feature activation-1.91e-02

Pos logit contributions
rob+0.039
aj+0.032
ider+0.032
 nonsense+0.029
eh+0.028
Neg logit contributions
 pet-0.088
 look-0.087
 care-0.087
 dip-0.086
 touch-0.085
Tokenados
Token ID22484
Feature activation-7.85e-03

Pos logit contributions
rob+0.030
eh+0.026
aj+0.025
 nonsense+0.023
ross+0.023
Neg logit contributions
 pet-0.029
 look-0.028
 dip-0.028
 try-0.028
 touch-0.028
Token because
Token ID780
Feature activation+6.03e-03

Pos logit contributions
rob+0.015
eh+0.013
aj+0.012
xy+0.012
ross+0.012
Neg logit contributions
 dip-0.009
 stare-0.009
 gaze-0.009
 look-0.009
 pet-0.009
Token they
Token ID484
Feature activation+4.74e-03

Pos logit contributions
 look+0.006
 dip+0.006
 pet+0.005
 touch+0.005
 try+0.005
Neg logit contributions
rob-0.014
eh-0.012
aj-0.012
ross-0.011
lich-0.011
Token were
Token ID547
Feature activation+5.08e-04

Pos logit contributions
 dip+0.006
 stare+0.005
 gaze+0.005
 Seal+0.005
 touch+0.005
Neg logit contributions
rob-0.009
eh-0.008
lf-0.007
xy-0.007
lich-0.007
Token a
Token ID257
Feature activation-8.55e-04

Pos logit contributions
rob+0.017
xy+0.013
ross+0.013
aj+0.013
eh+0.013
Neg logit contributions
 look-0.010
 pet-0.009
 dip-0.009
 touch-0.009
 stare-0.009
Token wealthy
Token ID11574
Feature activation-4.75e-03

Pos logit contributions
rob+0.001
eh+0.001
aj+0.001
lf+0.001
ross+0.001
Neg logit contributions
 dip-0.001
 stare-0.001
 look-0.001
 touch-0.001
 gaze-0.001
Token girl
Token ID2576
Feature activation+8.14e-03

Pos logit contributions
rob+0.008
eh+0.007
lf+0.007
aj+0.007
ross+0.006
Neg logit contributions
 dip-0.006
 stare-0.006
 gaze-0.006
 look-0.006
 try-0.006
Token who
Token ID508
Feature activation+4.98e-03

Pos logit contributions
 dip+0.012
 stare+0.011
 gaze+0.011
 look+0.011
 try+0.011
Neg logit contributions
rob-0.013
eh-0.012
aj-0.011
ider-0.011
lf-0.010
Token had
Token ID550
Feature activation-1.24e-02

Pos logit contributions
 dip+0.007
 stare+0.007
 look+0.007
 touch+0.007
 pet+0.007
Neg logit contributions
rob-0.008
eh-0.007
aj-0.006
xy-0.006
ross-0.006
Token many
Token ID867
Feature activation-4.12e-02

Pos logit contributions
rob+0.025
eh+0.021
aj+0.020
ross+0.019
xy+0.019
Neg logit contributions
 look-0.015
 dip-0.015
 pet-0.015
 touch-0.015
 stare-0.015
Token toys
Token ID14958
Feature activation-9.82e-03

Pos logit contributions
rob+0.069
eh+0.058
aj+0.054
ross+0.054
xy+0.052
Neg logit contributions
 dip-0.063
 pet-0.062
 look-0.061
 stare-0.060
 touch-0.060
Token and
Token ID290
Feature activation-6.43e-03

Pos logit contributions
rob+0.016
eh+0.015
aj+0.014
aces+0.014
ider+0.013
Neg logit contributions
 dip-0.013
 stare-0.013
 Seal-0.013
 gaze-0.013
 Swan-0.013
Token clothes
Token ID8242
Feature activation-9.12e-04

Pos logit contributions
rob+0.012
eh+0.010
aj+0.010
ross+0.010
xy+0.010
Neg logit contributions
 look-0.009
 dip-0.008
 pet-0.008
 stare-0.008
 touch-0.008
Token.
Token ID13
Feature activation+1.30e-02

Pos logit contributions
rob+0.002
aj+0.001
eh+0.001
aces+0.001
xy+0.001
Neg logit contributions
 dip-0.001
 stare-0.001
 wonder-0.001
 Seal-0.001
 gaze-0.001
Token She
Token ID1375
Feature activation+5.06e-03

Pos logit contributions
 dip+0.020
 Seal+0.020
 Swan+0.020
 crab+0.020
 clam+0.019
Neg logit contributions
rob-0.017
eh-0.015
aj-0.015
three-0.014
 nonsense-0.014
Token mean
Token ID1612
Feature activation-1.01e-02

Pos logit contributions
rob+0.010
eh+0.009
aj+0.008
ross+0.008
xy+0.008
Neg logit contributions
 dip-0.009
 look-0.009
 pet-0.008
 stare-0.008
 touch-0.008
Token.
Token ID13
Feature activation-4.60e-03

Pos logit contributions
rob+0.022
eh+0.019
aj+0.018
xy+0.017
ross+0.017
Neg logit contributions
 look-0.011
 touch-0.010
 pet-0.010
 try-0.010
 dip-0.010
Token This
Token ID770
Feature activation-4.40e-03

Pos logit contributions
rob+0.007
eh+0.006
aj+0.005
ross+0.005
xy+0.005
Neg logit contributions
 dip-0.008
 look-0.008
 pet-0.008
 touch-0.008
 stare-0.008
Token is
Token ID318
Feature activation-2.98e-03

Pos logit contributions
rob+0.008
eh+0.007
aj+0.006
ross+0.006
ider+0.006
Neg logit contributions
 look-0.007
 try-0.006
 dip-0.006
 touch-0.006
 pet-0.006
Token my
Token ID616
Feature activation-5.17e-03

Pos logit contributions
rob+0.005
eh+0.004
aj+0.004
ross+0.004
xy+0.004
Neg logit contributions
 look-0.005
 dip-0.005
 pet-0.004
 touch-0.004
 try-0.004
Token cr
Token ID1067
Feature activation-4.66e-02

Pos logit contributions
rob+0.009
eh+0.007
aj+0.007
ross+0.007
xy+0.007
Neg logit contributions
 dip-0.008
 look-0.008
 stare-0.008
 pet-0.007
 touch-0.007
Tokenay
Token ID323
Feature activation-2.72e-02

Pos logit contributions
rob+0.094
eh+0.080
 nonsense+0.073
 AW+0.072
aj+0.072
Neg logit contributions
 look-0.054
 touch-0.051
 try-0.051
 dip-0.050
 stare-0.047
Tokenon
Token ID261
Feature activation-2.54e-02

Pos logit contributions
rob+0.053
eh+0.047
aj+0.046
ross+0.045
ider+0.044
Neg logit contributions
 look-0.034
 try-0.032
 touch-0.031
 care-0.029
 dip-0.029
Token.
Token ID13
Feature activation-1.68e-04

Pos logit contributions
rob+0.055
eh+0.046
aj+0.043
ross+0.043
lich+0.042
Neg logit contributions
 look-0.029
 dip-0.027
 touch-0.027
 try-0.027
 pet-0.027
Token Go
Token ID1514
Feature activation+8.27e-03

Pos logit contributions
rob+0.000
eh+0.000
aj+0.000
ross+0.000
xy+0.000
Neg logit contributions
 dip-0.000
 look-0.000
 stare-0.000
 pet-0.000
 touch-0.000
Token away
Token ID1497
Feature activation-1.79e-02

Pos logit contributions
 dip+0.009
 stare+0.009
 Swan+0.008
 crab+0.008
 clam+0.008
Neg logit contributions
rob-0.016
eh-0.014
CR-0.013
lich-0.013
aj-0.013