TOP ACTIVATIONS
MAX = 6.285

away . Her heart was beating fast and she was so
scared and her heart was beating faster . She wanted to
was the sound of a beating heart ! He
are amazing . You are beating the monster ."
and Amy 's heart was beating even faster . She was
. His little heart was beating very fast and he felt
and could feel her heart beating faster . A
because the other bunny kept beating him on price . All
Molly could hear her heart beating inside . It was very
slowly . Her heart was beating fast and she was shaking
in awe , his heart beating fast . When it was
she could feel her heart beating fast . Suddenly , she
exercises . B eth leaned forward and tried to
she could , her heart beating faster and faster .
. B eth ’ s friends
bathroom ; the sun was beating down on the windows .
board . His heart was beating fast , but he was
vious . B eth smiled . She knew that
. Milo 's heart started beating fast and he was very
with the sound of rain beating on the window .

MIN ACTIVATIONS
MIN = -8.091

necklace . They found some seaw eed to use as string
had lots of y ummy seaw eed and a big ,
home . He grabbed some seaw eed and started to build
He also saw some tasty seaw eed and decided to give
It was hiding under a seaw eed . Lily said ,
see shells , rocks , seaw eed , crabs and fish
so many little pieces of seaw eed . It sw ished
like fish and shells and seaw eed . One
from the rocks and the seaw eed , but the net
decorated it with shells and seaw eed . They had a
ugg led up in some seaw eed and felt very cozy
She also picked some green seaw eed to show her friends
of shells , rocks and seaw eed , but Jane was
shore , collecting shells and seaw eed . Jimmy took time
tentacles to reach for the seaw eed , and he wrapped
She told him it was seaw eed . Despite the smell
ate it with shells and seaw eed . They find a
started to chew on some seaw eed . Mary was so
Like shells and rocks and seaw eed ." M
found it stuck in some seaw eed . Tim was so

INTERVAL 2.691 - 6.285
CONTAINS 0.436%

The little girl and her grandparents spent the whole day playing
wider and wider . The wind was roaring and it felt
thanked the little one for being his friend and gave them
so sad they started to cry . Suddenly ,
just me and Mr . Sn ugg les . We 're

INTERVAL -0.903 - 2.691
CONTAINS 85.144%

you , Uncle Jack !" Uncle Jack smiled and said ,
noticed the same shiny object . She quickly realized that it
couldn 't do it . His dream of succeeding had ended
L ily thought the bee looked heavy because it was carrying
across the lake she called out to her mum , '

INTERVAL -4.497 - -0.903
CONTAINS 14.411%

would frown and hide whenever he saw something scary . One
, she didn 't know what to do . But she
and pulled out his own sword . He was about to
the game . <|endoftext|> Once upon a time , there were
. He looked outside and saw that his shiny toy car

INTERVAL -8.091 - -4.497
CONTAINS 0.008%

O ll ie . O ll ie lived in a safe
shore , collecting shells and seaw eed . Jimmy took time
pretty and she asked O ll ie a question . "
picture of the yellow lab rador . She recognized it as
Jake checked on his k iw i to make sure it
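
The four INTERVAL ranges above appear to be equal-width bins spanning MIN to MAX. The sketch below reproduces the listed bounds under that assumption. The equal-width binning and the helper name interval_edges are assumptions made for illustration; the dashboard itself does not state how the intervals were chosen.

```python
def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 edges of equal-width bins covering [lo, hi].

    Assumption: the dashboard splits the activation range into equal-width
    bins. This matches the listed bounds but is not stated in the source.
    """
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]


# MIN and MAX as reported in the dump.
edges = interval_edges(-8.091, 6.285)

# Share of tokens per interval as reported (CONTAINS lines), highest first.
shares = [0.436, 85.144, 14.411, 0.008]
total_share = round(sum(shares), 3)

print(edges)
print(total_share)
```

Under this assumption each bin is (6.285 - (-8.091)) / 4 = 3.594 wide, and the reported shares add up to just under 100%, the remainder being rounding in the displayed percentages.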
Token away
Token ID1497
Feature activation+1.12e+00

Pos logit contributions
aw+0.011
 pasta+0.010
Dan+0.010
 seaw+0.010
 crabs+0.010
Neg logit contributions
 Hatt-0.008
tered-0.008
ened-0.008
 Ellen-0.008
 Eleanor-0.008
Token.
Token ID13
Feature activation-4.44e-02

Pos logit contributions
tered+0.038
 Ellen+0.038
 Hatt+0.038
 Eleanor+0.037
ened+0.036
Neg logit contributions
aw-0.039
 pasta-0.039
 seaw-0.039
olo-0.039
ins-0.038
Token Her
Token ID2332
Feature activation+1.16e+00

Pos logit contributions
ins+0.002
 seaw+0.002
 pasta+0.002
 coconut+0.002
iles+0.002
Neg logit contributions
 Ellen-0.001
 Beth-0.001
 Eleanor-0.001
 Grace-0.001
 Hatt-0.001
Token heart
Token ID2612
Feature activation+1.79e+00

Pos logit contributions
 Ellen+0.037
tered+0.034
 Eleanor+0.034
 Grace+0.033
 Beth+0.033
Neg logit contributions
ins-0.047
aw-0.047
olo-0.047
 seaw-0.046
 pasta-0.046
Token was
Token ID373
Feature activation+3.26e-01

Pos logit contributions
 Ellen+0.068
 Hatt+0.067
 Eleanor+0.065
tered+0.064
 Grace+0.063
Neg logit contributions
aw-0.059
ins-0.058
 seaw-0.055
ll-0.055
actus-0.055
Token beating
Token ID11226
Feature activation+6.29e+00

Pos logit contributions
 Hatt+0.013
 Ellen+0.013
tered+0.013
 Eleanor+0.013
hered+0.012
Neg logit contributions
aw-0.009
 pasta-0.009
 seaw-0.009
ins-0.009
olo-0.009
Token fast
Token ID3049
Feature activation+1.12e+00

Pos logit contributions
 Hatt+0.239
ened+0.239
 Eleanor+0.238
hered+0.237
 Ellen+0.236
Neg logit contributions
aw-0.174
rou-0.165
 greedy-0.149
He-0.137
oni-0.136
Token and
Token ID290
Feature activation+1.35e-01

Pos logit contributions
 Hatt+0.043
 Ellen+0.042
 Eleanor+0.042
tered+0.041
 Beth+0.039
Neg logit contributions
aw-0.037
 pasta-0.034
 seaw-0.034
ins-0.033
rou-0.031
Token she
Token ID673
Feature activation-5.06e-01

Pos logit contributions
 Ellen+0.005
tered+0.005
 Hatt+0.005
 Eleanor+0.005
hered+0.005
Neg logit contributions
aw-0.005
 pasta-0.004
 seaw-0.004
ins-0.004
olo-0.004
Token was
Token ID373
Feature activation+2.64e-01

Pos logit contributions
 pasta+0.019
aw+0.019
ins+0.018
olo+0.018
 seaw+0.018
Neg logit contributions
tered-0.017
 Ellen-0.017
 Hatt-0.016
 Eleanor-0.016
 Beth-0.016
Token so
Token ID523
Feature activation-1.47e+00

Pos logit contributions
tered+0.010
 Ellen+0.010
 Hatt+0.010
 Eleanor+0.010
 Beth+0.010
Neg logit contributions
aw-0.008
ins-0.008
olo-0.008
 pasta-0.008
 seaw-0.008
Token scared
Token ID12008
Feature activation-2.41e-02

Pos logit contributions
olo+0.045
 pasta+0.044
oni+0.043
ins+0.043
ades+0.043
Neg logit contributions
 Ellen-0.038
tered-0.036
ened-0.036
 as-0.035
 Beth-0.035
Token and
Token ID290
Feature activation+2.96e-01

Pos logit contributions
 pasta+0.001
olo+0.001
ins+0.001
 seaw+0.001
aw+0.001
Neg logit contributions
 Ellen-0.001
tered-0.001
 Hatt-0.001
 Eleanor-0.001
 Beth-0.001
Token her
Token ID607
Feature activation+7.95e-01

Pos logit contributions
 Ellen+0.010
tered+0.010
 Hatt+0.009
 Eleanor+0.009
 Beth+0.009
Neg logit contributions
 pasta-0.011
aw-0.011
olo-0.011
ins-0.011
 seaw-0.011
Token heart
Token ID2612
Feature activation+1.76e+00

Pos logit contributions
 Ellen+0.027
tered+0.027
 Hatt+0.026
 Eleanor+0.026
 Beth+0.025
Neg logit contributions
aw-0.029
 pasta-0.029
 seaw-0.028
olo-0.028
ins-0.028
Token was
Token ID373
Feature activation+3.61e-01

Pos logit contributions
 Ellen+0.065
 Hatt+0.063
 Eleanor+0.061
tered+0.061
 Grace+0.059
Neg logit contributions
aw-0.062
ins-0.060
 seaw-0.058
 pasta-0.057
actus-0.057
Token beating
Token ID11226
Feature activation+6.23e+00

Pos logit contributions
 Ellen+0.014
tered+0.014
 Hatt+0.013
 Eleanor+0.013
 Beth+0.013
Neg logit contributions
aw-0.012
 pasta-0.012
 seaw-0.011
ins-0.011
olo-0.011
Token faster
Token ID5443
Feature activation+8.17e-01

Pos logit contributions
 Ellen+0.223
 Hatt+0.220
 Eleanor+0.220
hered+0.216
ened+0.213
Neg logit contributions
aw-0.198
rou-0.189
 seaw-0.168
oni-0.164
 coconut-0.163
Token.
Token ID13
Feature activation+7.74e-02

Pos logit contributions
 Ellen+0.034
 Hatt+0.034
 Eleanor+0.033
tered+0.033
 Beth+0.032
Neg logit contributions
aw-0.025
 seaw-0.023
ins-0.022
 pasta-0.022
olo-0.021
Token She
Token ID1375
Feature activation+9.56e-01

Pos logit contributions
 Ellen+0.003
tered+0.003
 Hatt+0.003
 Eleanor+0.003
 Beth+0.003
Neg logit contributions
 pasta-0.003
 seaw-0.003
aw-0.003
ins-0.003
olo-0.003
Token wanted
Token ID2227
Feature activation-3.65e-01

Pos logit contributions
 Ellen+0.034
tered+0.033
 Hatt+0.033
 Eleanor+0.032
 Beth+0.031
Neg logit contributions
aw-0.035
 seaw-0.033
olo-0.033
ins-0.033
 pasta-0.033
Token to
Token ID284
Feature activation-1.48e-01

Pos logit contributions
ins+0.014
aw+0.014
olo+0.014
 seaw+0.014
 pasta+0.014
Neg logit contributions
 Ellen-0.012
tered-0.012
 Hatt-0.012
 Beth-0.011
 Eleanor-0.011
Token was
Token ID373
Feature activation+6.28e-02

Pos logit contributions
tered+0.028
 Ellen+0.027
ened+0.026
 Hatt+0.026
 Grace+0.026
Neg logit contributions
olo-0.034
ins-0.033
Dan-0.033
 pasta-0.033
aw-0.033
Token the
Token ID262
Feature activation-1.85e-01

Pos logit contributions
tered+0.002
 Ellen+0.002
 Beth+0.002
 Grace+0.002
 Hatt+0.002
Neg logit contributions
ins-0.002
olo-0.002
aw-0.002
actus-0.002
 seaw-0.002
Token sound
Token ID2128
Feature activation+2.52e+00

Pos logit contributions
ins+0.007
olo+0.007
aw+0.007
Dan+0.007
ades+0.006
Neg logit contributions
 Ellen-0.006
tered-0.006
 Hatt-0.006
 Grace-0.006
 Eleanor-0.006
Token of
Token ID286
Feature activation+4.02e-01

Pos logit contributions
 Ellen+0.087
tered+0.085
 Hatt+0.084
 Eleanor+0.083
 Beth+0.080
Neg logit contributions
aw-0.092
 pasta-0.091
ins-0.090
olo-0.089
 seaw-0.089
Token a
Token ID257
Feature activation+1.28e+00

Pos logit contributions
 Ellen+0.013
tered+0.012
 Beth+0.012
 Hatt+0.012
 Grace+0.012
Neg logit contributions
ins-0.016
olo-0.016
aw-0.016
actus-0.015
ll-0.015
Token beating
Token ID11226
Feature activation+6.21e+00

Pos logit contributions
 Ellen+0.045
tered+0.044
 Hatt+0.043
 Eleanor+0.042
 Beth+0.042
Neg logit contributions
aw-0.046
ins-0.045
olo-0.045
 pasta-0.045
 seaw-0.045
Token heart
Token ID2612
Feature activation+1.55e+00

Pos logit contributions
 Ellen+0.202
 Hatt+0.201
 Eleanor+0.197
tered+0.196
ened+0.184
Neg logit contributions
aw-0.234
 seaw-0.219
 pasta-0.215
rou-0.213
 coconut-0.211
Token!
Token ID0
Feature activation-8.41e-01

Pos logit contributions
 Ellen+0.059
 Hatt+0.057
 Eleanor+0.057
tered+0.056
 Beth+0.055
Neg logit contributions
aw-0.051
 pasta-0.050
ins-0.050
olo-0.048
oni-0.048
Token
Token ID198
Feature activation-5.13e-01

Pos logit contributions
 seaw+0.035
ins+0.035
 pasta+0.034
actus+0.034
olo+0.033
Neg logit contributions
 Ellen-0.028
 Beth-0.026
 Eleanor-0.026
 Grace-0.026
tered-0.026
Token
Token ID198
Feature activation-4.32e-01

Pos logit contributions
 seaw+0.022
 pasta+0.022
sters+0.021
actus+0.021
ins+0.021
Neg logit contributions
 Ellen-0.016
 Beth-0.016
 Eleanor-0.016
 Grace-0.015
 Hatt-0.015
TokenHe
Token ID1544
Feature activation-9.10e-01

Pos logit contributions
 pasta+0.016
 seaw+0.016
actus+0.016
oni+0.015
 oy+0.015
Neg logit contributions
 Ellen-0.016
tered-0.015
 Eleanor-0.015
 Beth-0.015
 Hatt-0.015
Token are
Token ID389
Feature activation-4.85e-01

Pos logit contributions
 Ellen+0.054
tered+0.054
 Hatt+0.052
 Eleanor+0.052
 Beth+0.051
Neg logit contributions
aw-0.052
 pasta-0.052
 seaw-0.051
olo-0.051
ins-0.051
Token amazing
Token ID4998
Feature activation-1.54e+00

Pos logit contributions
actus+0.020
ins+0.020
aw+0.020
 seaw+0.019
olo+0.019
Neg logit contributions
 Ellen-0.016
tered-0.015
 Hatt-0.015
 Grace-0.015
 Eleanor-0.014
Token.
Token ID13
Feature activation+3.85e-01

Pos logit contributions
 coconut+0.058
 seaw+0.058
olo+0.057
aw+0.057
Dan+0.057
Neg logit contributions
 Ellen-0.050
tered-0.050
 Grace-0.048
 Eleanor-0.048
 Beth-0.048
Token You
Token ID921
Feature activation+1.54e+00

Pos logit contributions
 Ellen+0.015
tered+0.015
 Hatt+0.014
 Eleanor+0.014
 Beth+0.014
Neg logit contributions
ins-0.013
aw-0.012
 pasta-0.012
 seaw-0.012
olo-0.012
Token are
Token ID389
Feature activation-4.60e-01

Pos logit contributions
 Ellen+0.061
tered+0.060
 Eleanor+0.059
 Hatt+0.058
 Beth+0.057
Neg logit contributions
aw-0.048
 seaw-0.048
 pasta-0.047
olo-0.047
ins-0.046
Token beating
Token ID11226
Feature activation+6.18e+00

Pos logit contributions
aw+0.018
ins+0.018
actus+0.018
 seaw+0.018
olo+0.017
Neg logit contributions
 Ellen-0.016
tered-0.015
 Hatt-0.015
 Eleanor-0.014
 Grace-0.014
Token the
Token ID262
Feature activation-2.05e-01

Pos logit contributions
 Hatt+0.215
 Ellen+0.214
tered+0.213
ened+0.209
 Eleanor+0.208
Neg logit contributions
aw-0.213
rou-0.198
ins-0.196
 pasta-0.196
Dan-0.195
Token monster
Token ID9234
Feature activation-5.83e-01

Pos logit contributions
ins+0.008
olo+0.008
aw+0.008
 seaw+0.008
 pasta+0.008
Neg logit contributions
 Ellen-0.007
tered-0.007
 Hatt-0.006
 Eleanor-0.006
 Beth-0.006
Token."
Token ID526
Feature activation-8.20e-01

Pos logit contributions
aw+0.021
 pasta+0.020
olo+0.020
 seaw+0.020
ins+0.020
Neg logit contributions
 Ellen-0.021
tered-0.020
 Hatt-0.020
 Eleanor-0.019
 Beth-0.019
Token
Token ID198
Feature activation-3.41e-01

Pos logit contributions
iles+0.035
ins+0.035
 seaw+0.035
 coconut+0.034
olo+0.034
Neg logit contributions
 Beth-0.023
 Grace-0.023
 Ellen-0.023
 as-0.023
 Hatt-0.023
Token
Token ID198
Feature activation-3.26e-01

Pos logit contributions
 seaw+0.014
olo+0.014
 coconut+0.014
 pasta+0.014
 Brussels+0.014
Neg logit contributions
 Ellen-0.010
 Beth-0.010
tered-0.010
 Grace-0.010
 Eleanor-0.010
Token and
Token ID290
Feature activation+3.05e-01

Pos logit contributions
 Ellen+0.023
tered+0.022
 Hatt+0.022
 Eleanor+0.021
 Beth+0.021
Neg logit contributions
aw-0.028
 pasta-0.028
ins-0.028
olo-0.028
 seaw-0.028
Token Amy
Token ID14235
Feature activation-2.24e-01

Pos logit contributions
 Hatt+0.011
 Ellen+0.011
tered+0.011
 Eleanor+0.010
hered+0.010
Neg logit contributions
 pasta-0.011
 seaw-0.011
aw-0.010
 coconut-0.010
ins-0.010
Token's
Token ID338
Feature activation-1.02e+00

Pos logit contributions
aw+0.008
 pasta+0.008
 seaw+0.008
olo+0.007
ins+0.007
Neg logit contributions
 Ellen-0.008
tered-0.008
 Hatt-0.008
 Eleanor-0.008
 Beth-0.008
Token heart
Token ID2612
Feature activation+1.72e+00

Pos logit contributions
aw+0.039
ins+0.039
olo+0.039
 pasta+0.038
 seaw+0.038
Neg logit contributions
 Ellen-0.033
tered-0.033
 Hatt-0.032
 Eleanor-0.031
 Beth-0.031
Token was
Token ID373
Feature activation+2.30e-01

Pos logit contributions
 Ellen+0.065
 Hatt+0.064
 Eleanor+0.062
 Grace+0.060
tered+0.059
Neg logit contributions
aw-0.059
ins-0.056
 seaw-0.056
 pasta-0.055
oni-0.054
Token beating
Token ID11226
Feature activation+6.15e+00

Pos logit contributions
 Ellen+0.010
tered+0.010
 Hatt+0.009
 Eleanor+0.009
 Beth+0.009
Neg logit contributions
aw-0.007
 pasta-0.007
ins-0.006
 seaw-0.006
olo-0.006
Token even
Token ID772
Feature activation-9.64e-01

Pos logit contributions
 Ellen+0.219
 Hatt+0.215
 Eleanor+0.214
 Grace+0.206
ened+0.205
Neg logit contributions
aw-0.200
rou-0.191
 seaw-0.167
oni-0.165
 pasta-0.165
Token faster
Token ID5443
Feature activation+6.93e-01

Pos logit contributions
aw+0.038
 seaw+0.037
 pasta+0.037
ins+0.036
 coconut+0.035
Neg logit contributions
 Ellen-0.032
 Hatt-0.031
tered-0.031
 Eleanor-0.029
 Beth-0.029
Token.
Token ID13
Feature activation+1.92e-01

Pos logit contributions
 Ellen+0.027
 Hatt+0.027
tered+0.027
 Eleanor+0.026
 Beth+0.026
Neg logit contributions
aw-0.023
 seaw-0.021
 pasta-0.021
ins-0.021
olo-0.021
Token She
Token ID1375
Feature activation+1.00e+00

Pos logit contributions
 Ellen+0.006
tered+0.006
 Hatt+0.006
 Eleanor+0.006
 Beth+0.006
Neg logit contributions
 seaw-0.007
aw-0.007
ins-0.007
 pasta-0.007
olo-0.007
Token was
Token ID373
Feature activation+2.20e-01

Pos logit contributions
 Ellen+0.036
tered+0.035
 Hatt+0.034
 Eleanor+0.034
 Beth+0.033
Neg logit contributions
aw-0.036
 seaw-0.035
olo-0.034
 pasta-0.034
ins-0.034
Token.
Token ID13
Feature activation-5.56e-02

Pos logit contributions
aw+0.028
olo+0.027
ins+0.027
 pasta+0.027
 seaw+0.027
Neg logit contributions
tered-0.025
 Ellen-0.025
 Hatt-0.024
 Eleanor-0.024
 Beth-0.023
Token His
Token ID2399
Feature activation+2.26e-01

Pos logit contributions
ins+0.002
olo+0.002
 seaw+0.002
Dan+0.002
aw+0.002
Neg logit contributions
tered-0.002
 Ellen-0.002
 Hatt-0.002
 Beth-0.002
 Eleanor-0.002
Token little
Token ID1310
Feature activation+4.78e-01

Pos logit contributions
 Ellen+0.009
tered+0.009
 Hatt+0.009
 Eleanor+0.009
 Beth+0.008
Neg logit contributions
aw-0.007
 pasta-0.007
ins-0.007
olo-0.007
 seaw-0.007
Token heart
Token ID2612
Feature activation+1.77e+00

Pos logit contributions
 Ellen+0.018
tered+0.018
 Hatt+0.017
 Eleanor+0.017
 Beth+0.016
Neg logit contributions
 pasta-0.016
 seaw-0.016
aw-0.016
ins-0.016
 coconut-0.016
Token was
Token ID373
Feature activation+2.17e-01

Pos logit contributions
 Ellen+0.068
 Hatt+0.065
 Eleanor+0.064
tered+0.062
 Beth+0.062
Neg logit contributions
aw-0.060
ins-0.059
 pasta-0.058
actus-0.057
 seaw-0.057
Token beating
Token ID11226
Feature activation+6.12e+00

Pos logit contributions
 Ellen+0.009
 Hatt+0.009
 Eleanor+0.009
tered+0.009
 Grace+0.008
Neg logit contributions
 pasta-0.006
aw-0.006
 seaw-0.006
ins-0.006
 coconut-0.006
Token very
Token ID845
Feature activation-1.62e+00

Pos logit contributions
 Ellen+0.236
 Eleanor+0.231
 Hatt+0.228
hered+0.220
 Grace+0.219
Neg logit contributions
aw-0.187
rou-0.179
oni-0.167
 pasta-0.167
 greedy-0.165
Token fast
Token ID3049
Feature activation+1.10e+00

Pos logit contributions
aw+0.063
 pasta+0.063
ins+0.063
olo+0.063
 seaw+0.062
Neg logit contributions
tered-0.052
 Ellen-0.051
 Hatt-0.049
 Eleanor-0.048
 Beth-0.047
Token and
Token ID290
Feature activation+1.23e-01

Pos logit contributions
 Ellen+0.038
 Hatt+0.038
 Eleanor+0.037
tered+0.037
 Beth+0.035
Neg logit contributions
aw-0.040
 pasta-0.039
 seaw-0.038
ins-0.038
oni-0.036
Token he
Token ID339
Feature activation-1.41e+00

Pos logit contributions
tered+0.005
 Ellen+0.005
 Hatt+0.005
 Eleanor+0.005
 Beth+0.005
Neg logit contributions
aw-0.004
ins-0.004
olo-0.004
 pasta-0.004
 seaw-0.003
Token felt
Token ID2936
Feature activation+4.13e-01

Pos logit contributions
olo+0.057
 pasta+0.057
ins+0.057
aw+0.057
Dan+0.056
Neg logit contributions
tered-0.044
 Ellen-0.042
 Hatt-0.041
 Eleanor-0.040
ened-0.040
Token and
Token ID290
Feature activation+2.94e-01

Pos logit contributions
aw+0.006
 pasta+0.005
ins+0.005
olo+0.005
 seaw+0.005
Neg logit contributions
tered-0.005
 Hatt-0.005
 Ellen-0.005
 Eleanor-0.005
tail-0.005
Token could
Token ID714
Feature activation-1.17e+00

Pos logit contributions
 Ellen+0.010
 Hatt+0.010
tered+0.010
 Eleanor+0.010
hered+0.010
Neg logit contributions
aw-0.010
 pasta-0.010
 seaw-0.010
olo-0.010
ins-0.009
Token feel
Token ID1254
Feature activation+1.30e+00

Pos logit contributions
aw+0.022
olo+0.021
 pasta+0.021
Dan+0.021
 seaw+0.021
Neg logit contributions
tered-0.060
 Ellen-0.060
 Hatt-0.059
 Eleanor-0.058
ened-0.058
Token her
Token ID607
Feature activation+7.42e-01

Pos logit contributions
tered+0.042
 Ellen+0.042
 Hatt+0.041
 Eleanor+0.039
 Beth+0.038
Neg logit contributions
aw-0.050
 seaw-0.049
 pasta-0.048
ins-0.048
olo-0.048
Token heart
Token ID2612
Feature activation+1.73e+00

Pos logit contributions
tered+0.025
 Ellen+0.025
 Hatt+0.024
 Eleanor+0.024
 Beth+0.023
Neg logit contributions
 pasta-0.028
 seaw-0.027
aw-0.027
 coconut-0.026
olo-0.026
Token beating
Token ID11226
Feature activation+6.10e+00

Pos logit contributions
 Hatt+0.061
 Ellen+0.061
 Eleanor+0.059
tered+0.056
 Grace+0.055
Neg logit contributions
aw-0.061
ins-0.059
 pasta-0.059
rou-0.058
 seaw-0.058
Token faster
Token ID5443
Feature activation+6.11e-01

Pos logit contributions
 Hatt+0.212
 Ellen+0.210
 Eleanor+0.206
ened+0.204
 Grace+0.202
Neg logit contributions
aw-0.192
rou-0.174
 greedy-0.159
ll-0.158
 yourself-0.158
Token.
Token ID13
Feature activation+2.09e-01

Pos logit contributions
 Hatt+0.026
 Ellen+0.026
 Eleanor+0.026
tered+0.025
 Beth+0.025
Neg logit contributions
aw-0.018
 seaw-0.016
ins-0.016
 pasta-0.016
olo-0.016
Token
Token ID198
Feature activation-6.32e-02

Pos logit contributions
tered+0.007
 Ellen+0.007
 Hatt+0.007
ened+0.007
 Eleanor+0.006
Neg logit contributions
aw-0.008
 pasta-0.007
olo-0.007
ins-0.007
Dan-0.007
Token
Token ID198
Feature activation-5.32e-02

Pos logit contributions
aw+0.002
 pasta+0.002
ins+0.002
olo+0.002
 seaw+0.002
Neg logit contributions
 Ellen-0.003
tered-0.003
 Hatt-0.003
 Eleanor-0.003
 Beth-0.003
TokenA
Token ID32
Feature activation+1.87e+00

Pos logit contributions
aw+0.002
ins+0.002
 pasta+0.002
olo+0.002
 seaw+0.002
Neg logit contributions
 Ellen-0.002
tered-0.002
 Hatt-0.002
 Eleanor-0.002
 Beth-0.002
Token because
Token ID780
Feature activation+1.90e+00

Pos logit contributions
olo+0.000
 pasta+0.000
 seaw+0.000
 coconut+0.000
Dan+0.000
Neg logit contributions
 Ellen-0.000
 Hatt-0.000
 Eleanor-0.000
tered-0.000
ened-0.000
Token the
Token ID262
Feature activation-1.76e-01

Pos logit contributions
 Ellen+0.076
 Beth+0.074
 Eleanor+0.073
tered+0.072
 Hatt+0.072
Neg logit contributions
ins-0.063
aw-0.062
olo-0.061
 seaw-0.060
actus-0.059
Token other
Token ID584
Feature activation-2.76e+00

Pos logit contributions
aw+0.006
olo+0.006
ins+0.006
Dan+0.006
rou+0.006
Neg logit contributions
tered-0.006
 Ellen-0.006
 Grace-0.006
 as-0.006
ened-0.006
Token bunny
Token ID44915
Feature activation+2.09e-01

Pos logit contributions
olo+0.093
aw+0.092
ins+0.091
Dan+0.090
 pasta+0.090
Neg logit contributions
 Ellen-0.099
tered-0.099
 Beth-0.095
 Eleanor-0.095
 Hatt-0.095
Token kept
Token ID4030
Feature activation+5.25e-01

Pos logit contributions
tered+0.007
 Ellen+0.007
 Hatt+0.006
 Eleanor+0.006
 Beth+0.006
Neg logit contributions
aw-0.008
 pasta-0.008
ins-0.008
olo-0.008
 seaw-0.008
Token beating
Token ID11226
Feature activation+6.09e+00

Pos logit contributions
 Ellen+0.018
tered+0.018
 Hatt+0.017
 Eleanor+0.017
 Beth+0.017
Neg logit contributions
aw-0.019
olo-0.019
ins-0.019
 pasta-0.019
 seaw-0.019
Token him
Token ID683
Feature activation-8.97e-01

Pos logit contributions
 Ellen+0.203
tered+0.196
 Hatt+0.194
 Eleanor+0.194
 Grace+0.186
Neg logit contributions
aw-0.225
 pasta-0.218
 seaw-0.216
ins-0.216
 coconut-0.210
Token on
Token ID319
Feature activation+1.54e+00

Pos logit contributions
 pasta+0.033
aw+0.032
 seaw+0.032
ins+0.032
 coconut+0.031
Neg logit contributions
 Ellen-0.031
tered-0.030
 Hatt-0.030
 Eleanor-0.029
 Beth-0.029
Token price
Token ID2756
Feature activation+4.53e-01

Pos logit contributions
 Ellen+0.049
tered+0.049
 Hatt+0.047
 Eleanor+0.046
 Beth+0.046
Neg logit contributions
ins-0.061
aw-0.060
olo-0.060
 pasta-0.058
 seaw-0.058
Token.
Token ID13
Feature activation+8.85e-02

Pos logit contributions
 Ellen+0.016
tered+0.016
 Hatt+0.016
 Eleanor+0.015
 Beth+0.015
Neg logit contributions
aw-0.016
 pasta-0.016
 seaw-0.016
ins-0.016
olo-0.015
Token All
Token ID1439
Feature activation+9.48e-01

Pos logit contributions
 Ellen+0.003
tered+0.003
 Hatt+0.003
 Eleanor+0.003
 Beth+0.003
Neg logit contributions
 pasta-0.003
aw-0.003
ins-0.003
 seaw-0.003
olo-0.003
Token Molly
Token ID30236
Feature activation-5.20e-01

Pos logit contributions
 Ellen+0.003
tered+0.003
 Hatt+0.002
 Eleanor+0.002
 Beth+0.002
Neg logit contributions
 pasta-0.003
aw-0.003
 seaw-0.003
olo-0.003
ins-0.003
Token could
Token ID714
Feature activation-1.22e+00

Pos logit contributions
aw+0.019
 pasta+0.019
ins+0.019
olo+0.019
 seaw+0.019
Neg logit contributions
 Ellen-0.018
tered-0.018
 Hatt-0.017
 Eleanor-0.017
 Beth-0.016
Token hear
Token ID3285
Feature activation-1.39e+00

Pos logit contributions
 pasta+0.051
aw+0.051
olo+0.051
 seaw+0.050
ins+0.050
Neg logit contributions
 Ellen-0.035
tered-0.035
 Hatt-0.033
 Eleanor-0.033
 Beth-0.032
Token her
Token ID607
Feature activation+6.94e-01

Pos logit contributions
aw+0.054
 pasta+0.054
olo+0.053
ins+0.053
 seaw+0.053
Neg logit contributions
 Ellen-0.044
tered-0.044
 Hatt-0.043
 Eleanor-0.042
hered-0.041
Token heart
Token ID2612
Feature activation+1.74e+00

Pos logit contributions
 Ellen+0.024
tered+0.024
 Hatt+0.023
 Eleanor+0.022
 Beth+0.022
Neg logit contributions
aw-0.026
 pasta-0.026
olo-0.025
ins-0.025
 seaw-0.024
Token beating
Token ID11226
Feature activation+6.06e+00

Pos logit contributions
 Ellen+0.062
 Hatt+0.059
 Eleanor+0.058
tered+0.057
 Beth+0.056
Neg logit contributions
ins-0.062
aw-0.061
 pasta-0.061
oni-0.059
ades-0.058
Token inside
Token ID2641
Feature activation+9.32e-01

Pos logit contributions
 Ellen+0.210
hered+0.207
tered+0.199
 Hatt+0.198
 Eleanor+0.198
Neg logit contributions
aw-0.200
rou-0.185
 pasta-0.176
ins-0.175
 seaw-0.174
Token.
Token ID13
Feature activation-7.91e-02

Pos logit contributions
 Ellen+0.033
tered+0.033
 Hatt+0.032
 Eleanor+0.032
 Beth+0.031
Neg logit contributions
aw-0.033
 pasta-0.032
ins-0.032
olo-0.032
 seaw-0.032
Token It
Token ID632
Feature activation+1.75e+00

Pos logit contributions
aw+0.004
ins+0.004
 coconut+0.004
ades+0.004
ll+0.004
Neg logit contributions
 Grace-0.002
tered-0.002
 Beth-0.002
 Ellen-0.002
 Eleanor-0.002
Token was
Token ID373
Feature activation+1.83e-01

Pos logit contributions
 Ellen+0.067
tered+0.066
 Hatt+0.065
 Eleanor+0.064
hered+0.063
Neg logit contributions
aw-0.055
 pasta-0.053
 seaw-0.053
ins-0.053
olo-0.053
Token very
Token ID845
Feature activation-1.67e+00

Pos logit contributions
 Ellen+0.006
tered+0.006
 Hatt+0.006
 Eleanor+0.006
 Beth+0.006
Neg logit contributions
aw-0.007
 pasta-0.007
olo-0.007
ins-0.007
 seaw-0.007
Token slowly
Token ID6364
Feature activation+1.16e+00

Pos logit contributions
 Hatt+0.020
tered+0.019
 Ellen+0.019
hered+0.019
 Eleanor+0.019
Neg logit contributions
aw-0.022
 seaw-0.022
 pasta-0.021
Dan-0.021
 crabs-0.021
Token.
Token ID13
Feature activation+3.01e-01

Pos logit contributions
tered+0.041
 Ellen+0.041
 Hatt+0.040
 Eleanor+0.039
 Grace+0.037
Neg logit contributions
aw-0.043
 seaw-0.041
 pasta-0.040
Dan-0.039
olo-0.038
Token Her
Token ID2332
Feature activation+1.16e+00

Pos logit contributions
tered+0.009
 Ellen+0.009
 Hatt+0.009
ened+0.009
 Eleanor+0.009
Neg logit contributions
aw-0.011
 pasta-0.011
olo-0.010
 Brussels-0.010
 coconut-0.010
Token heart
Token ID2612
Feature activation+1.59e+00

Pos logit contributions
tered+0.047
 Ellen+0.045
 Hatt+0.044
hered+0.043
 Eleanor+0.042
Neg logit contributions
 pasta-0.036
aw-0.036
 seaw-0.035
olo-0.034
actus-0.034
Token was
Token ID373
Feature activation+1.85e-01

Pos logit contributions
 Ellen+0.064
 Hatt+0.062
 Eleanor+0.058
 Grace+0.058
 Beth+0.057
Neg logit contributions
aw-0.051
ins-0.047
rou-0.046
ll-0.046
ades-0.046
Token beating
Token ID11226
Feature activation+6.06e+00

Pos logit contributions
 Hatt+0.007
 Ellen+0.007
 Eleanor+0.007
tered+0.007
hered+0.007
Neg logit contributions
aw-0.006
rou-0.006
 pasta-0.005
 seaw-0.005
 shallow-0.005
Token fast
Token ID3049
Feature activation+9.64e-01

Pos logit contributions
hered+0.220
oned+0.219
 Ellen+0.218
 Hatt+0.217
 agreement+0.214
Neg logit contributions
rou-0.175
aw-0.170
 greedy-0.133
ll-0.132
oni-0.131
Token and
Token ID290
Feature activation+1.77e-01

Pos logit contributions
 Hatt+0.037
 Ellen+0.036
 Eleanor+0.035
tered+0.035
 Kathy+0.033
Neg logit contributions
aw-0.030
 pasta-0.027
rou-0.027
 seaw-0.027
ins-0.026
Token she
Token ID673
Feature activation-5.74e-01

Pos logit contributions
hered+0.006
 Hatt+0.006
tered+0.006
 Ellen+0.006
oned+0.006
Neg logit contributions
aw-0.006
 pasta-0.005
 seaw-0.005
rou-0.005
 shallow-0.005
Token was
Token ID373
Feature activation+1.45e-01

Pos logit contributions
aw+0.020
ins+0.019
olo+0.019
 seaw+0.019
 pasta+0.019
Neg logit contributions
 Ellen-0.021
tered-0.020
 Hatt-0.020
 Eleanor-0.020
 Beth-0.019
Token shaking
Token ID17275
Feature activation+1.18e+00

Pos logit contributions
 Hatt+0.006
tered+0.005
 Ellen+0.005
 Eleanor+0.005
 Grace+0.005
Neg logit contributions
aw-0.005
rou-0.004
ins-0.004
Dan-0.004
olo-0.004
Token in
Token ID287
Feature activation-5.00e-01

Pos logit contributions
 Ellen+0.085
 Hatt+0.082
 Eleanor+0.081
tered+0.079
 Beth+0.077
Neg logit contributions
aw-0.105
 pasta-0.103
 seaw-0.101
ins-0.097
olo-0.096
Token awe
Token ID25030
Feature activation+4.79e-01

Pos logit contributions
 pasta+0.018
 seaw+0.018
aw+0.018
olo+0.018
 coconut+0.018
Neg logit contributions
tered-0.017
 Ellen-0.017
 Hatt-0.017
 Eleanor-0.016
 Beth-0.016
Token,
Token ID11
Feature activation+3.66e-01

Pos logit contributions
tered+0.015
 Ellen+0.014
ened+0.014
 as+0.014
 Grace+0.014
Neg logit contributions
olo-0.019
 coconut-0.019
 pasta-0.019
ins-0.019
Dan-0.019
Token his
Token ID465
Feature activation+2.01e-01

Pos logit contributions
 Ellen+0.011
tered+0.010
 as+0.010
 Grace+0.010
 Eleanor+0.010
Neg logit contributions
ins-0.016
 pasta-0.015
olo-0.015
aw-0.015
 seaw-0.015
Token heart
Token ID2612
Feature activation+1.62e+00

Pos logit contributions
 Ellen+0.007
tered+0.007
 Hatt+0.007
 Eleanor+0.007
 Beth+0.007
Neg logit contributions
aw-0.007
 pasta-0.007
ins-0.007
olo-0.007
 seaw-0.007
Token beating
Token ID11226
Feature activation+6.06e+00

Pos logit contributions
 Ellen+0.057
 Hatt+0.055
 Eleanor+0.054
tered+0.053
 Beth+0.052
Neg logit contributions
aw-0.060
ins-0.057
 pasta-0.057
 seaw-0.057
olo-0.056
Token fast
Token ID3049
Feature activation+1.02e+00

Pos logit contributions
 Ellen+0.213
 Hatt+0.207
 Eleanor+0.207
hered+0.202
tered+0.200
Neg logit contributions
aw-0.207
rou-0.197
 seaw-0.196
 pasta-0.186
 coconut-0.185
Token.
Token ID13
Feature activation+1.06e-01

Pos logit contributions
 Ellen+0.038
 Hatt+0.038
 Eleanor+0.037
tered+0.037
 Beth+0.036
Neg logit contributions
aw-0.034
 seaw-0.033
 pasta-0.033
ins-0.032
olo-0.031
Token When
Token ID1649
Feature activation+1.11e+00

Pos logit contributions
 Ellen+0.004
 Beth+0.004
tered+0.004
 Grace+0.004
 Eleanor+0.004
Neg logit contributions
ins-0.004
 pasta-0.004
 seaw-0.004
actus-0.004
 coconut-0.004
Token it
Token ID340
Feature activation+4.74e-01

Pos logit contributions
 Ellen+0.043
tered+0.043
 Hatt+0.041
 Eleanor+0.041
 Beth+0.040
Neg logit contributions
aw-0.036
 pasta-0.036
olo-0.035
 seaw-0.035
ins-0.035
Token was
Token ID373
Feature activation+2.31e-01

Pos logit contributions
 Ellen+0.018
tered+0.017
 Hatt+0.017
 Eleanor+0.017
 Beth+0.016
Neg logit contributions
aw-0.016
 pasta-0.016
 seaw-0.016
olo-0.016
ins-0.016
Token she
Token ID673
Feature activation-7.06e-01

Pos logit contributions
 Ellen+0.001
tered+0.001
ened+0.001
eness+0.001
 Hatt+0.001
Neg logit contributions
 seaw-0.001
 crocod-0.001
ll-0.001
aw-0.001
 mo-0.001
Token could
Token ID714
Feature activation-1.31e+00

Pos logit contributions
aw+0.025
 seaw+0.024
olo+0.024
ins+0.024
 pasta+0.024
Neg logit contributions
 Ellen-0.025
tered-0.025
 Hatt-0.024
 Eleanor-0.024
 Grace-0.023
Token feel
Token ID1254
Feature activation+1.09e+00

Pos logit contributions
Dan+0.029
ll+0.029
olo+0.027
aw+0.026
 crabs+0.026
Neg logit contributions
ened-0.061
 Hatt-0.061
 Ellen-0.061
tered-0.060
 Eleanor-0.058
Token her
Token ID607
Feature activation+4.60e-01

Pos logit contributions
 Ellen+0.040
tered+0.040
 Hatt+0.039
 Eleanor+0.037
ened+0.037
Neg logit contributions
aw-0.037
 seaw-0.037
 pasta-0.035
 coconut-0.034
 crabs-0.034
Token heart
Token ID2612
Feature activation+1.60e+00

Pos logit contributions
 Hatt+0.017
 Ellen+0.017
 Eleanor+0.017
tered+0.017
 Grace+0.016
Neg logit contributions
 pasta-0.015
 seaw-0.015
 coconut-0.014
actus-0.014
aw-0.014
Token beating
Token ID11226
Feature activation+6.04e+00

Pos logit contributions
 Ellen+0.060
 Hatt+0.059
 Eleanor+0.057
 Beth+0.055
 Grace+0.055
Neg logit contributions
aw-0.054
 pasta-0.051
ins-0.051
rou-0.051
 seaw-0.050
Token fast
Token ID3049
Feature activation+8.57e-01

Pos logit contributions
 Ellen+0.224
 Hatt+0.222
 Eleanor+0.218
ened+0.218
 Grace+0.217
Neg logit contributions
aw-0.194
rou-0.170
P-0.161
Dan-0.157
ll-0.154
Token.
Token ID13
Feature activation-4.01e-01

Pos logit contributions
 Ellen+0.038
 Hatt+0.037
 Eleanor+0.037
 Beth+0.036
tered+0.036
Neg logit contributions
aw-0.024
 pasta-0.022
 seaw-0.022
ins-0.021
olo-0.020
Token Suddenly
Token ID24975
Feature activation+6.80e-01

Pos logit contributions
aw+0.011
 pasta+0.011
ins+0.011
olo+0.011
 seaw+0.011
Neg logit contributions
 Ellen-0.017
tered-0.017
 Hatt-0.016
 Eleanor-0.016
ened-0.016
Token,
Token ID11
Feature activation+1.74e-01

Pos logit contributions
tered+0.025
gent+0.024
oned+0.024
rated+0.024
ened+0.023
Neg logit contributions
aw-0.022
 pasta-0.020
 worms-0.020
 crabs-0.019
 ol-0.019
Token she
Token ID673
Feature activation-7.62e-01

Pos logit contributions
tered+0.007
ened+0.007
 Ellen+0.007
hered+0.007
oned+0.006
Neg logit contributions
 pasta-0.006
 seaw-0.005
 crabs-0.005
aw-0.005
 yogurt-0.005
Token exercises
Token ID13565
Feature activation+5.86e-01

Pos logit contributions
 Ellen+0.030
tered+0.030
 Hatt+0.029
 Eleanor+0.029
 Beth+0.029
Neg logit contributions
olo-0.029
aw-0.028
ins-0.028
Dan-0.028
actus-0.027
Token.
Token ID13
Feature activation+3.67e-01

Pos logit contributions
 Ellen+0.019
tered+0.018
 Hatt+0.018
 Eleanor+0.018
ened+0.018
Neg logit contributions
aw-0.023
 pasta-0.023
olo-0.022
 seaw-0.022
ins-0.022
Token
Token ID198
Feature activation-3.20e-02

Pos logit contributions
 Ellen+0.011
 Beth+0.010
 Eleanor+0.010
 Grace+0.010
 Hatt+0.010
Neg logit contributions
 seaw-0.016
ins-0.016
 pasta-0.016
olo-0.016
 coconut-0.016
Token
Token ID198
Feature activation-2.65e-02

Pos logit contributions
 seaw+0.001
 pasta+0.001
 coconut+0.001
sters+0.001
ins+0.001
Neg logit contributions
 Ellen-0.001
 Beth-0.001
 Eleanor-0.001
 Grace-0.001
 Hatt-0.001
TokenB
Token ID33
Feature activation+2.34e+00

Pos logit contributions
 pasta+0.001
 seaw+0.001
actus+0.001
oni+0.001
ins+0.001
Neg logit contributions
 Ellen-0.001
tered-0.001
 Eleanor-0.001
 Beth-0.001
 Hatt-0.001
Tokeneth
Token ID2788
Feature activation+6.04e+00

Pos logit contributions
tered+0.072
 Ellen+0.071
 Hatt+0.069
 Eleanor+0.069
 Beth+0.068
Neg logit contributions
 pasta-0.095
aw-0.095
 seaw-0.094
olo-0.093
ins-0.092
Token leaned
Token ID23831
Feature activation+7.95e-01

Pos logit contributions
 Ellen+0.200
tered+0.197
 Hatt+0.193
 Eleanor+0.189
 Beth+0.184
Neg logit contributions
aw-0.229
 pasta-0.224
ins-0.224
olo-0.223
 seaw-0.222
Token forward
Token ID2651
Feature activation+2.35e+00

Pos logit contributions
 Ellen+0.021
tered+0.020
 Hatt+0.020
 Eleanor+0.020
 Beth+0.019
Neg logit contributions
aw-0.035
 pasta-0.035
olo-0.035
ins-0.035
 seaw-0.034
Token and
Token ID290
Feature activation+4.90e-01

Pos logit contributions
 Ellen+0.085
 Hatt+0.083
tered+0.083
 Eleanor+0.081
hered+0.077
Neg logit contributions
 pasta-0.080
 seaw-0.079
aw-0.078
 oy-0.077
 crabs-0.076
Token tried
Token ID3088
Feature activation+1.54e-01

Pos logit contributions
 Ellen+0.015
tered+0.015
 Hatt+0.014
 Eleanor+0.014
 Beth+0.014
Neg logit contributions
aw-0.020
 pasta-0.020
olo-0.020
ins-0.020
 seaw-0.020
Token to
Token ID284
Feature activation+1.62e-01

Pos logit contributions
tered+0.005
 Ellen+0.005
 Hatt+0.005
 Eleanor+0.005
ened+0.005
Neg logit contributions
 pasta-0.005
aw-0.005
 seaw-0.005
ins-0.005
olo-0.005
Token she
Token ID673
Feature activation-6.64e-01

Pos logit contributions
 Ellen+0.003
tered+0.003
 Hatt+0.003
 Eleanor+0.003
hered+0.003
Neg logit contributions
 pasta-0.003
aw-0.003
 seaw-0.003
olo-0.003
ins-0.003
Token could
Token ID714
Feature activation-1.23e+00

Pos logit contributions
aw+0.016
 seaw+0.015
olo+0.015
 pasta+0.015
ins+0.015
Neg logit contributions
 Ellen-0.032
tered-0.031
 Hatt-0.030
 Eleanor-0.030
 Beth-0.030
Token,
Token ID11
Feature activation+2.96e-01

Pos logit contributions
Dan+0.032
aw+0.032
ll+0.032
 pasta+0.032
 seaw+0.032
Neg logit contributions
 Ellen-0.050
 Eleanor-0.049
 Hatt-0.049
tered-0.049
 Grace-0.047
Token her
Token ID607
Feature activation+6.95e-01

Pos logit contributions
tered+0.009
 Ellen+0.009
 Hatt+0.009
 Eleanor+0.009
hered+0.008
Neg logit contributions
 pasta-0.012
aw-0.011
 seaw-0.011
 coconut-0.011
olo-0.011
Token heart
Token ID2612
Feature activation+1.71e+00

Pos logit contributions
tered+0.027
 Ellen+0.026
hered+0.025
 Hatt+0.025
 Eleanor+0.025
Neg logit contributions
 pasta-0.024
 coconut-0.023
 seaw-0.022
aw-0.022
actus-0.022
Token beating
Token ID11226
Feature activation+6.04e+00

Pos logit contributions
 Ellen+0.059
 Hatt+0.057
 Eleanor+0.056
tered+0.055
 Grace+0.054
Neg logit contributions
aw-0.063
ins-0.060
 seaw-0.059
 pasta-0.059
rou-0.058
Token faster
Token ID5443
Feature activation+5.58e-01

Pos logit contributions
hered+0.206
 Ellen+0.203
 Eleanor+0.195
tered+0.194
 Hatt+0.194
Neg logit contributions
aw-0.192
rou-0.183
Dan-0.160
 yourself-0.160
 seaw-0.159
Token and
Token ID290
Feature activation+1.42e-01

Pos logit contributions
 Ellen+0.022
 Hatt+0.022
tered+0.022
 Eleanor+0.022
 Beth+0.021
Neg logit contributions
aw-0.018
 seaw-0.017
 pasta-0.017
ins-0.016
olo-0.016
Token faster
Token ID5443
Feature activation+5.64e-01

Pos logit contributions
 Ellen+0.005
tered+0.005
hered+0.004
 Hatt+0.004
 Eleanor+0.004
Neg logit contributions
aw-0.005
 seaw-0.005
rou-0.005
 pasta-0.005
 shallow-0.005
Token.
Token ID13
Feature activation+3.34e-01

Pos logit contributions
 Ellen+0.024
 Hatt+0.024
tered+0.024
 Eleanor+0.023
 Beth+0.023
Neg logit contributions
aw-0.017
 seaw-0.015
ins-0.015
 pasta-0.015
olo-0.014
Token
Token ID198
Feature activation-3.36e-02

Pos logit contributions
tered+0.012
 Ellen+0.012
 Hatt+0.011
 Eleanor+0.011
ened+0.011
Neg logit contributions
aw-0.012
 pasta-0.012
olo-0.011
ins-0.011
 seaw-0.011
Token.
Token ID13
Feature activation+2.55e-01

Pos logit contributions
 Ellen+0.035
tered+0.035
 Eleanor+0.033
 Hatt+0.033
 Beth+0.033
Neg logit contributions
olo-0.045
ins-0.045
aw-0.045
 pasta-0.045
 seaw-0.044
Token
Token ID220
Feature activation-7.80e-01

Pos logit contributions
 Ellen+0.008
 Beth+0.008
 Eleanor+0.008
 Hatt+0.008
 Grace+0.008
Neg logit contributions
 seaw-0.011
ins-0.010
 pasta-0.010
olo-0.010
 coconut-0.010
Token
Token ID198
Feature activation+4.64e-02

Pos logit contributions
 seaw+0.029
 pasta+0.028
 coconut+0.028
olo+0.027
 crabs+0.027
Neg logit contributions
 Ellen-0.029
 Eleanor-0.028
 Hatt-0.028
 Beth-0.028
tered-0.027
Token
Token ID198
Feature activation+7.85e-02

Pos logit contributions
 Ellen+0.002
 Eleanor+0.002
 Beth+0.002
 Hatt+0.002
tered+0.002
Neg logit contributions
 pasta-0.002
 seaw-0.002
 coconut-0.002
ins-0.002
olo-0.002
TokenB
Token ID33
Feature activation+2.09e+00

Pos logit contributions
 Ellen+0.003
 Eleanor+0.003
tered+0.003
 Beth+0.003
 Hatt+0.003
Neg logit contributions
 pasta-0.003
 seaw-0.003
oni-0.003
ins-0.003
 coconut-0.003
Tokeneth
Token ID2788
Feature activation+6.00e+00

Pos logit contributions
 Ellen+0.070
tered+0.070
 Hatt+0.068
 Eleanor+0.068
 Beth+0.067
Neg logit contributions
 pasta-0.079
aw-0.078
 seaw-0.077
olo-0.076
ins-0.076
Tokenâ
Token ID22940
Feature activation-5.17e-02

Pos logit contributions
 Ellen+0.206
 Hatt+0.200
tered+0.199
 Eleanor+0.194
 Grace+0.189
Neg logit contributions
aw-0.224
olo-0.213
ins-0.211
 pasta-0.209
 seaw-0.209
Token€
Token ID26391
Feature activation+2.08e+00

Pos logit contributions
aw+0.002
olo+0.002
ll+0.002
Dan+0.002
actus+0.002
Neg logit contributions
 Ellen-0.002
 Hatt-0.002
tered-0.002
 Eleanor-0.002
 Grace-0.002
Token™
Token ID8151
Feature activation+8.85e-01

Pos logit contributions
 Ellen+0.079
tered+0.077
 Hatt+0.076
 Eleanor+0.076
 Beth+0.075
Neg logit contributions
 seaw-0.071
 pasta-0.071
ins-0.069
olo-0.068
aw-0.068
Tokens
Token ID82
Feature activation-1.05e+00

Pos logit contributions
tered+0.028
 Beth+0.027
 Ellen+0.027
 Eleanor+0.026
 Hatt+0.025
Neg logit contributions
 seaw-0.038
 pasta-0.038
 Brussels-0.035
ins-0.035
 coconut-0.035
Token friends
Token ID2460
Feature activation-3.93e-01

Pos logit contributions
ins+0.040
aw+0.040
olo+0.040
 pasta+0.040
 seaw+0.040
Neg logit contributions
 Ellen-0.034
tered-0.034
 Hatt-0.032
 Eleanor-0.032
 Beth-0.032
Token bathroom
Token ID12436
Feature activation+1.05e-01

Pos logit contributions
olo+0.024
ins+0.024
oni+0.023
iles+0.023
Dan+0.023
Neg logit contributions
 Beth-0.023
 Grace-0.022
tered-0.022
 Hatt-0.022
 Eleanor-0.022
Token;
Token ID26
Feature activation-1.21e+00

Pos logit contributions
tered+0.003
 Ellen+0.003
 Hatt+0.003
 Eleanor+0.003
 Beth+0.003
Neg logit contributions
aw-0.004
ins-0.004
olo-0.004
 seaw-0.004
 coconut-0.004
Token the
Token ID262
Feature activation-6.51e-01

Pos logit contributions
ins+0.053
actus+0.053
aw+0.052
olo+0.052
P+0.051
Neg logit contributions
tered-0.034
 as-0.034
 Beth-0.033
 Ellen-0.033
 Grace-0.032
Token sun
Token ID4252
Feature activation+2.19e-01

Pos logit contributions
aw+0.021
olo+0.020
ins+0.020
 coconut+0.019
Dan+0.019
Neg logit contributions
tered-0.026
 Hatt-0.025
 Eleanor-0.025
 Beth-0.025
 Ellen-0.025
Token was
Token ID373
Feature activation+1.03e-02

Pos logit contributions
 Ellen+0.008
tered+0.008
 Hatt+0.008
 Eleanor+0.008
 Grace+0.007
Neg logit contributions
 pasta-0.008
aw-0.008
 seaw-0.007
ins-0.007
olo-0.007
Token beating
Token ID11226
Feature activation+5.99e+00

Pos logit contributions
tered+0.000
 Ellen+0.000
 Hatt+0.000
 Beth+0.000
 Eleanor+0.000
Neg logit contributions
aw-0.000
olo-0.000
ins-0.000
 seaw-0.000
 coconut-0.000
Token down
Token ID866
Feature activation+1.97e+00

Pos logit contributions
 Ellen+0.224
hered+0.208
 Hatt+0.207
 Eleanor+0.207
 Faith+0.204
Neg logit contributions
 pasta-0.200
aw-0.198
 seaw-0.194
rou-0.186
 shallow-0.182
Token on
Token ID319
Feature activation+1.33e+00

Pos logit contributions
tered+0.057
 Ellen+0.054
 as+0.053
ened+0.052
 Beth+0.052
Neg logit contributions
olo-0.085
aw-0.084
ins-0.084
actus-0.082
Dan-0.082
Token the
Token ID262
Feature activation-6.79e-01

Pos logit contributions
tered+0.037
 Ellen+0.035
 as+0.034
 Grace+0.034
 Beth+0.034
Neg logit contributions
ins-0.059
aw-0.058
olo-0.057
actus-0.057
ll-0.057
Token windows
Token ID9168
Feature activation+1.02e+00

Pos logit contributions
ins+0.024
aw+0.024
olo+0.023
Dan+0.023
 coconut+0.023
Neg logit contributions
tered-0.025
 Hatt-0.024
 Eleanor-0.024
 Grace-0.024
 Ellen-0.023
Token.
Token ID13
Feature activation-5.72e-01

Pos logit contributions
 as+0.031
tered+0.031
ened+0.029
 Ellen+0.028
 Beth+0.028
Neg logit contributions
P-0.042
actus-0.041
 seaw-0.040
ins-0.040
 coconut-0.040
Token board
Token ID3096
Feature activation-3.29e+00

Pos logit contributions
ins+0.046
olo+0.045
aw+0.044
Dan+0.043
ll+0.042
Neg logit contributions
 Ellen-0.047
tered-0.046
 Grace-0.045
 Hatt-0.044
 Beth-0.044
Token.
Token ID13
Feature activation-8.49e-01

Pos logit contributions
olo+0.134
ins+0.131
Dan+0.130
 coconut+0.129
aw+0.127
Neg logit contributions
tered-0.101
 Ellen-0.095
ened-0.092
 as-0.092
hered-0.090
Token His
Token ID2399
Feature activation-4.20e-01

Pos logit contributions
ins+0.038
 seaw+0.037
actus+0.037
 coconut+0.036
 pasta+0.036
Neg logit contributions
 Ellen-0.026
 Beth-0.025
 Grace-0.025
 Eleanor-0.024
tered-0.024
Token heart
Token ID2612
Feature activation+1.38e+00

Pos logit contributions
ins+0.017
aw+0.016
olo+0.016
actus+0.016
P+0.016
Neg logit contributions
 Ellen-0.013
tered-0.013
 as-0.012
ened-0.012
 Beth-0.012
Token was
Token ID373
Feature activation-1.83e-02

Pos logit contributions
 Ellen+0.050
tered+0.048
 Hatt+0.048
 Eleanor+0.047
 Beth+0.046
Neg logit contributions
aw-0.048
 pasta-0.048
ins-0.047
 seaw-0.047
olo-0.046
Token beating
Token ID11226
Feature activation+5.98e+00

Pos logit contributions
 pasta+0.001
aw+0.001
 seaw+0.001
ins+0.001
olo+0.001
Neg logit contributions
 Ellen-0.001
tered-0.001
 Hatt-0.001
 Eleanor-0.001
 Beth-0.001
Token fast
Token ID3049
Feature activation+8.36e-01

Pos logit contributions
 Ellen+0.213
 Eleanor+0.212
 Hatt+0.208
hered+0.202
 Grace+0.201
Neg logit contributions
aw-0.194
rou-0.192
 pasta-0.185
oni-0.179
 seaw-0.179
Token,
Token ID11
Feature activation+4.90e-01

Pos logit contributions
 Ellen+0.031
 Hatt+0.031
 Eleanor+0.031
tered+0.030
 Beth+0.029
Neg logit contributions
aw-0.027
 pasta-0.027
 seaw-0.026
ins-0.024
olo-0.024
Token but
Token ID475
Feature activation+8.96e-01

Pos logit contributions
 Ellen+0.015
tered+0.014
 Hatt+0.014
 Eleanor+0.014
 Beth+0.013
Neg logit contributions
aw-0.020
 pasta-0.020
 seaw-0.020
olo-0.019
ins-0.019
Token he
Token ID339
Feature activation-1.65e+00

Pos logit contributions
 Ellen+0.035
tered+0.035
 Hatt+0.034
 Eleanor+0.034
 Beth+0.033
Neg logit contributions
aw-0.028
 pasta-0.028
olo-0.028
 seaw-0.027
ins-0.027
Token was
Token ID373
Feature activation+2.04e-01

Pos logit contributions
 pasta+0.064
ins+0.063
 seaw+0.063
aw+0.062
olo+0.062
Neg logit contributions
tered-0.054
 Ellen-0.052
 Eleanor-0.050
 Hatt-0.050
 Beth-0.050
Tokenvious
Token ID1442
Feature activation-1.02e+00

Pos logit contributions
 pasta+0.007
olo+0.007
 coconut+0.007
 seaw+0.007
Dan+0.007
Neg logit contributions
tered-0.011
 Ellen-0.011
 Hatt-0.011
 Eleanor-0.011
hered-0.011
Token.
Token ID13
Feature activation+3.11e-01

Pos logit contributions
olo+0.040
 pasta+0.040
aw+0.039
ins+0.039
 seaw+0.039
Neg logit contributions
 Ellen-0.032
tered-0.032
 Hatt-0.031
 Eleanor-0.030
 Beth-0.030
Token
Token ID198
Feature activation-1.90e-01

Pos logit contributions
 Ellen+0.011
 Grace+0.010
 Beth+0.010
 Eleanor+0.010
 Hatt+0.009
Neg logit contributions
ins-0.013
 seaw-0.013
aw-0.013
actus-0.012
 pasta-0.012
Token
Token ID198
Feature activation-1.65e-01

Pos logit contributions
 seaw+0.008
ins+0.008
 pasta+0.008
 Brussels+0.008
 coconut+0.008
Neg logit contributions
 Ellen-0.006
tered-0.005
 Grace-0.005
 Eleanor-0.005
 Beth-0.005
TokenB
Token ID33
Feature activation+2.17e+00

Pos logit contributions
 seaw+0.006
 pasta+0.006
actus+0.006
 oy+0.006
 coconut+0.006
Neg logit contributions
 Ellen-0.006
tered-0.006
 Eleanor-0.005
 Grace-0.005
 Hatt-0.005
Tokeneth
Token ID2788
Feature activation+5.97e+00

Pos logit contributions
 Ellen+0.065
tered+0.065
 Hatt+0.061
 Eleanor+0.061
 Beth+0.060
Neg logit contributions
 pasta-0.090
 seaw-0.089
aw-0.089
ins-0.088
 coconut-0.087
Token smiled
Token ID13541
Feature activation-9.80e-01

Pos logit contributions
 Ellen+0.206
 Hatt+0.202
tered+0.198
 Eleanor+0.196
 Beth+0.188
Neg logit contributions
aw-0.222
olo-0.207
ins-0.206
 pasta-0.204
 seaw-0.203
Token.
Token ID13
Feature activation+1.53e-01

Pos logit contributions
aw+0.037
 pasta+0.035
 seaw+0.034
ins+0.034
olo+0.034
Neg logit contributions
 Ellen-0.033
 Hatt-0.033
 Eleanor-0.032
tered-0.032
 Beth-0.030
Token She
Token ID1375
Feature activation+1.06e+00

Pos logit contributions
 Ellen+0.006
tered+0.006
 Hatt+0.006
 Eleanor+0.006
 Beth+0.006
Neg logit contributions
aw-0.005
 pasta-0.005
ins-0.005
 seaw-0.005
olo-0.005
Token knew
Token ID2993
Feature activation-8.20e-01

Pos logit contributions
tered+0.035
 Ellen+0.035
 Hatt+0.033
 Eleanor+0.033
 Beth+0.033
Neg logit contributions
 pasta-0.040
ins-0.040
olo-0.040
aw-0.039
 seaw-0.039
Token that
Token ID326
Feature activation+2.02e-01

Pos logit contributions
ins+0.032
aw+0.032
olo+0.031
actus+0.031
 seaw+0.031
Neg logit contributions
 Ellen-0.027
tered-0.026
 Beth-0.026
 Hatt-0.026
 Eleanor-0.025
Token.
Token ID13
Feature activation-1.80e-02

Pos logit contributions
 Ellen+0.003
 Hatt+0.003
tered+0.003
 Eleanor+0.003
 Beth+0.003
Neg logit contributions
aw-0.004
 pasta-0.004
 seaw-0.004
olo-0.004
ins-0.004
Token Milo
Token ID32662
Feature activation-3.75e-01

Pos logit contributions
ins+0.001
aw+0.001
 coconut+0.001
ades+0.001
 pasta+0.001
Neg logit contributions
 Grace-0.001
tered-0.001
 Beth-0.001
 Hatt-0.001
 Ellen-0.001
Token's
Token ID338
Feature activation-1.08e+00

Pos logit contributions
 pasta+0.016
ins+0.016
 coconut+0.015
olo+0.015
 seaw+0.015
Neg logit contributions
tered-0.011
 Ellen-0.011
 Grace-0.010
ened-0.010
 Eleanor-0.010
Token heart
Token ID2612
Feature activation+1.55e+00

Pos logit contributions
ins+0.045
aw+0.045
olo+0.045
 seaw+0.044
 pasta+0.043
Neg logit contributions
 Ellen-0.031
tered-0.031
 Hatt-0.029
 Beth-0.029
 Eleanor-0.028
Token started
Token ID2067
Feature activation-1.53e-01

Pos logit contributions
 Ellen+0.057
 Hatt+0.056
 Eleanor+0.055
tered+0.054
 Beth+0.053
Neg logit contributions
aw-0.054
ins-0.052
 seaw-0.051
 pasta-0.051
actus-0.051
Token beating
Token ID11226
Feature activation+5.96e+00

Pos logit contributions
aw+0.006
olo+0.005
 pasta+0.005
ins+0.005
 seaw+0.005
Neg logit contributions
 Ellen-0.005
tered-0.005
 Hatt-0.005
 Eleanor-0.005
 Beth-0.005
Token fast
Token ID3049
Feature activation+9.15e-01

Pos logit contributions
 Ellen+0.215
 Hatt+0.214
 Eleanor+0.211
tered+0.206
hered+0.202
Neg logit contributions
aw-0.197
rou-0.183
 seaw-0.179
oni-0.173
ll-0.173
Token and
Token ID290
Feature activation+1.15e-01

Pos logit contributions
 Hatt+0.033
 Ellen+0.033
 Eleanor+0.033
tered+0.032
 Beth+0.031
Neg logit contributions
aw-0.033
 seaw-0.031
 pasta-0.031
ins-0.030
rou-0.029
Token he
Token ID339
Feature activation-1.56e+00

Pos logit contributions
 Ellen+0.004
tered+0.004
 Hatt+0.004
 Eleanor+0.004
 Beth+0.004
Neg logit contributions
aw-0.004
 pasta-0.004
ins-0.004
olo-0.004
 seaw-0.004
Token was
Token ID373
Feature activation+1.28e-01

Pos logit contributions
 pasta+0.061
ins+0.060
olo+0.060
aw+0.060
 seaw+0.060
Neg logit contributions
tered-0.050
 Ellen-0.049
 Hatt-0.047
 Eleanor-0.047
ened-0.046
Token very
Token ID845
Feature activation-1.76e+00

Pos logit contributions
 Ellen+0.005
tered+0.005
 Hatt+0.005
 Eleanor+0.005
 Beth+0.005
Neg logit contributions
aw-0.004
 pasta-0.004
ins-0.004
olo-0.004
 seaw-0.004
Token with
Token ID351
Feature activation-4.01e-01

Pos logit contributions
 Hatt+0.020
 Ellen+0.020
 Eleanor+0.019
 Beth+0.018
tered+0.018
Neg logit contributions
aw-0.022
 pasta-0.021
ades-0.020
 seaw-0.020
oni-0.020
Token the
Token ID262
Feature activation-2.37e-01

Pos logit contributions
 pasta+0.014
 seaw+0.014
aw+0.014
 coconut+0.013
 crabs+0.013
Neg logit contributions
 Hatt-0.015
tered-0.015
 Eleanor-0.014
 Ellen-0.014
hered-0.013
Token sound
Token ID2128
Feature activation+2.30e+00

Pos logit contributions
aw+0.008
ins+0.008
olo+0.008
 seaw+0.008
 pasta+0.008
Neg logit contributions
 Ellen-0.008
tered-0.008
 Hatt-0.008
 Eleanor-0.008
 Beth-0.008
Token of
Token ID286
Feature activation-6.92e-02

Pos logit contributions
 Ellen+0.086
 Hatt+0.084
tered+0.082
 Eleanor+0.082
 Grace+0.078
Neg logit contributions
aw-0.081
 pasta-0.078
ins-0.076
 seaw-0.076
olo-0.075
Token rain
Token ID6290
Feature activation+1.54e+00

Pos logit contributions
 pasta+0.002
aw+0.002
 seaw+0.002
olo+0.002
 coconut+0.002
Neg logit contributions
 Ellen-0.002
tered-0.002
 Hatt-0.002
 Eleanor-0.002
hered-0.002
Token beating
Token ID11226
Feature activation+5.96e+00

Pos logit contributions
 Ellen+0.060
 Hatt+0.059
 Eleanor+0.056
tered+0.055
 Grace+0.054
Neg logit contributions
aw-0.053
ins-0.050
 pasta-0.049
 seaw-0.047
ll-0.047
Token on
Token ID319
Feature activation+1.42e+00

Pos logit contributions
 Ellen+0.212
 Hatt+0.211
 Eleanor+0.200
 Faith+0.199
hered+0.190
Neg logit contributions
aw-0.215
rou-0.186
 seaw-0.182
 crocod-0.177
ins-0.177
Token the
Token ID262
Feature activation-2.92e-01

Pos logit contributions
 Hatt+0.051
 Ellen+0.051
 Eleanor+0.050
tered+0.049
hered+0.047
Neg logit contributions
 pasta-0.050
aw-0.050
 seaw-0.049
olo-0.047
 coconut-0.046
Token window
Token ID4324
Feature activation-4.03e-01

Pos logit contributions
ins+0.010
aw+0.010
olo+0.010
 pasta+0.010
 seaw+0.010
Neg logit contributions
 Ellen-0.011
tered-0.011
 Hatt-0.010
 Eleanor-0.010
 Beth-0.010
Token.
Token ID13
Feature activation+1.09e-01

Pos logit contributions
aw+0.014
 pasta+0.014
ins+0.014
olo+0.014
 seaw+0.014
Neg logit contributions
 Ellen-0.015
tered-0.014
 Hatt-0.014
 Eleanor-0.014
 Beth-0.014
Token
Token ID198
Feature activation-1.88e-01

Pos logit contributions
tered+0.004
 Ellen+0.004
 Hatt+0.004
 Eleanor+0.004
ened+0.004
Neg logit contributions
aw-0.004
 pasta-0.004
olo-0.004
 seaw-0.004
Dan-0.004
Token necklace
Token ID38987
Feature activation-9.78e-01

Pos logit contributions
ins+0.003
olo+0.003
aw+0.003
Dan+0.003
ll+0.003
Neg logit contributions
 Grace-0.003
tered-0.003
 Ellen-0.003
 as-0.003
 Hatt-0.003
Token.
Token ID13
Feature activation+4.57e-01

Pos logit contributions
aw+0.038
olo+0.038
 pasta+0.038
ins+0.038
 seaw+0.038
Neg logit contributions
 Ellen-0.031
tered-0.031
 Hatt-0.029
 Eleanor-0.029
 Beth-0.028
Token They
Token ID1119
Feature activation+1.08e+00

Pos logit contributions
tered+0.011
 Grace+0.011
 Ellen+0.011
 Beth+0.011
 Hatt+0.011
Neg logit contributions
ins-0.022
aw-0.021
olo-0.021
iles-0.021
 seaw-0.021
Token found
Token ID1043
Feature activation-7.56e-01

Pos logit contributions
 Ellen+0.037
tered+0.036
 Eleanor+0.035
 Hatt+0.035
 Beth+0.034
Neg logit contributions
 pasta-0.040
 seaw-0.040
aw-0.040
olo-0.039
ins-0.039
Token some
Token ID617
Feature activation+5.72e-01

Pos logit contributions
aw+0.031
 pasta+0.031
olo+0.031
ins+0.030
 seaw+0.030
Neg logit contributions
 Ellen-0.023
tered-0.023
 Hatt-0.022
 Eleanor-0.021
 Beth-0.021
Token seaw
Token ID32085
Feature activation-7.66e+00

Pos logit contributions
tered+0.021
 Ellen+0.021
 Hatt+0.021
 Eleanor+0.020
 Beth+0.020
Neg logit contributions
olo-0.019
ins-0.019
aw-0.019
Dan-0.018
 pasta-0.018
Tokeneed
Token ID2308
Feature activation-1.20e+00

Pos logit contributions
olo+0.147
ins+0.145
 pasta+0.141
aw+0.141
Dan+0.139
Neg logit contributions
tered-0.378
 Hatt-0.360
hered-0.359
 as-0.353
 Ellen-0.351
Token to
Token ID284
Feature activation-3.88e-02

Pos logit contributions
aw+0.055
olo+0.054
ins+0.054
Dan+0.054
P+0.053
Neg logit contributions
tered-0.029
 Ellen-0.027
 as-0.027
ened-0.026
 Hatt-0.026
Token use
Token ID779
Feature activation-5.45e-01

Pos logit contributions
aw+0.002
ins+0.002
olo+0.002
 pasta+0.002
actus+0.002
Neg logit contributions
tered-0.001
 Ellen-0.001
 Hatt-0.001
 Eleanor-0.001
 Beth-0.001
Token as
Token ID355
Feature activation+4.15e-01

Pos logit contributions
aw+0.023
olo+0.023
ins+0.023
 pasta+0.022
 seaw+0.022
Neg logit contributions
tered-0.015
 Ellen-0.015
 Hatt-0.015
 Eleanor-0.014
 Beth-0.014
Token string
Token ID4731
Feature activation+9.40e-01

Pos logit contributions
tered+0.012
 as+0.011
 Ellen+0.011
 Grace+0.010
 Beth+0.010
Neg logit contributions
actus-0.018
ins-0.018
olo-0.018
aw-0.018
ll-0.017
Token had
Token ID550
Feature activation+5.45e-01

Pos logit contributions
tered+0.044
 Grace+0.041
 Ellen+0.040
 Beth+0.040
 Hatt+0.040
Neg logit contributions
aw-0.063
ins-0.062
Dan-0.061
olo-0.061
ades-0.060
Token lots
Token ID6041
Feature activation-9.35e-01

Pos logit contributions
tered+0.017
ened+0.016
 Grace+0.016
 as+0.016
 Ellen+0.015
Neg logit contributions
ins-0.023
ll-0.023
actus-0.022
olo-0.022
opus-0.021
Token of
Token ID286
Feature activation-8.19e-02

Pos logit contributions
olo+0.041
Dan+0.040
ins+0.039
aw+0.039
P+0.039
Neg logit contributions
tered-0.025
 as-0.024
ened-0.024
 Ellen-0.023
 Hatt-0.023
Token y
Token ID331
Feature activation-3.64e+00

Pos logit contributions
olo+0.003
aw+0.003
ll+0.003
Dan+0.003
actus+0.003
Neg logit contributions
tered-0.003
 as-0.003
 Grace-0.003
ened-0.003
 care-0.003
Tokenummy
Token ID13513
Feature activation-1.35e+00

Pos logit contributions
olo+0.116
Dan+0.114
aw+0.114
actus+0.110
ins+0.110
Neg logit contributions
ened-0.134
tered-0.132
hered-0.128
 as-0.128
 throughout-0.121
Token seaw
Token ID32085
Feature activation-7.60e+00

Pos logit contributions
olo+0.040
Dan+0.040
aw+0.039
ll+0.039
rou+0.038
Neg logit contributions
tered-0.052
ened-0.051
 as-0.051
 Ellen-0.049
 Hatt-0.049
Tokeneed
Token ID2308
Feature activation-1.21e+00

Pos logit contributions
aw+0.167
 Brussels+0.167
Dan+0.166
 pasta+0.162
 shallow+0.161
Neg logit contributions
tered-0.333
hered-0.320
 as-0.316
ened-0.299
 Hatt-0.298
Token and
Token ID290
Feature activation-4.59e-02

Pos logit contributions
Dan+0.055
aw+0.055
olo+0.054
P+0.054
ins+0.054
Neg logit contributions
 as-0.030
tered-0.029
ened-0.026
 Grace-0.026
hered-0.025
Token a
Token ID257
Feature activation+1.08e+00

Pos logit contributions
aw+0.001
olo+0.001
ins+0.001
actus+0.001
ll+0.001
Neg logit contributions
tered-0.002
 Grace-0.002
 as-0.002
 Ellen-0.002
ened-0.002
Token big
Token ID1263
Feature activation-4.67e-01

Pos logit contributions
 Grace+0.035
 cheer+0.033
tered+0.033
 as+0.033
 Ellen+0.033
Neg logit contributions
aw-0.041
ins-0.041
iles-0.040
olo-0.040
ll-0.038
Token,
Token ID11
Feature activation+9.08e-02

Pos logit contributions
ins+0.017
aw+0.017
olo+0.017
ll+0.016
Dan+0.016
Neg logit contributions
tered-0.015
 Ellen-0.015
ened-0.015
 Grace-0.015
 Hatt-0.015
Token home
Token ID1363
Feature activation+7.24e-01

Pos logit contributions
 Ellen+0.011
tered+0.011
 Hatt+0.011
 Eleanor+0.010
 Beth+0.010
Neg logit contributions
 pasta-0.013
 seaw-0.013
aw-0.013
 coconut-0.013
olo-0.012
Token.
Token ID13
Feature activation+2.57e-02

Pos logit contributions
tered+0.022
 Ellen+0.022
 Hatt+0.021
 Eleanor+0.021
 Beth+0.021
Neg logit contributions
aw-0.029
ins-0.028
olo-0.028
 seaw-0.028
 pasta-0.028
Token He
Token ID679
Feature activation-2.87e-01

Pos logit contributions
 Ellen+0.001
 Beth+0.001
 Grace+0.001
 Eleanor+0.001
 Hatt+0.001
Neg logit contributions
ins-0.001
 seaw-0.001
actus-0.001
 pasta-0.001
ades-0.001
Token grabbed
Token ID13176
Feature activation-2.30e-01

Pos logit contributions
ins+0.012
 pasta+0.012
aw+0.011
olo+0.011
actus+0.011
Neg logit contributions
tered-0.009
 Ellen-0.008
 Hatt-0.008
 Eleanor-0.008
 Grace-0.008
Token some
Token ID617
Feature activation+6.62e-01

Pos logit contributions
 pasta+0.008
 seaw+0.008
aw+0.008
 coconut+0.008
olo+0.008
Neg logit contributions
 Ellen-0.008
tered-0.008
 Hatt-0.008
 Eleanor-0.008
hered-0.008
Token seaw
Token ID32085
Feature activation-7.59e+00

Pos logit contributions
tered+0.027
 Ellen+0.027
 Hatt+0.027
ened+0.026
 Grace+0.026
Neg logit contributions
olo-0.019
aw-0.019
Dan-0.018
ins-0.018
actus-0.018
Tokeneed
Token ID2308
Feature activation-1.10e+00

Pos logit contributions
aw+0.157
 pasta+0.155
ins+0.148
Dan+0.147
 Brussels+0.147
Neg logit contributions
tered-0.358
hered-0.346
 as-0.336
ened-0.331
 Hatt-0.328
Token and
Token ID290
Feature activation+2.38e-01

Pos logit contributions
aw+0.053
Dan+0.052
olo+0.052
ins+0.052
 Brussels+0.051
Neg logit contributions
tered-0.025
 as-0.024
ened-0.022
 Ellen-0.022
hered-0.022
Token started
Token ID2067
Feature activation+2.01e-01

Pos logit contributions
 Ellen+0.008
tered+0.008
 Hatt+0.008
 Eleanor+0.007
 Beth+0.007
Neg logit contributions
aw-0.009
 pasta-0.009
ins-0.009
olo-0.009
 seaw-0.009
Token to
Token ID284
Feature activation+1.95e-02

Pos logit contributions
 as+0.006
ened+0.006
tered+0.005
 Ellen+0.005
 Beth+0.005
Neg logit contributions
aw-0.009
actus-0.009
olo-0.008
ins-0.008
iles-0.008
Token build
Token ID1382
Feature activation-1.71e-01

Pos logit contributions
 Ellen+0.001
tered+0.001
 Hatt+0.001
 Eleanor+0.001
 Beth+0.001
Neg logit contributions
aw-0.001
ins-0.001
olo-0.001
 pasta-0.001
actus-0.001
Token He
Token ID679
Feature activation+2.16e-02

Pos logit contributions
 Ellen+0.014
 Grace+0.013
 Beth+0.013
 Hatt+0.013
 Eleanor+0.013
Neg logit contributions
 seaw-0.020
ins-0.019
 pasta-0.019
actus-0.019
aw-0.019
Token also
Token ID635
Feature activation-1.98e+00

Pos logit contributions
tered+0.001
 Ellen+0.001
 Grace+0.001
ened+0.001
 Hatt+0.001
Neg logit contributions
 pasta-0.001
ins-0.001
olo-0.001
aw-0.001
 coconut-0.001
Token saw
Token ID2497
Feature activation-8.94e-01

Pos logit contributions
ins+0.075
aw+0.074
 pasta+0.074
olo+0.074
actus+0.074
Neg logit contributions
 Ellen-0.064
tered-0.064
 Grace-0.061
 Hatt-0.060
ened-0.060
Token some
Token ID617
Feature activation+7.14e-01

Pos logit contributions
olo+0.034
ins+0.034
actus+0.033
Dan+0.033
 pasta+0.033
Neg logit contributions
 Ellen-0.029
tered-0.029
 Grace-0.028
 Beth-0.027
 Hatt-0.027
Token tasty
Token ID25103
Feature activation-2.19e+00

Pos logit contributions
tered+0.026
 Ellen+0.026
 Grace+0.025
 Hatt+0.025
ened+0.025
Neg logit contributions
olo-0.024
Dan-0.023
aw-0.023
ins-0.023
 pasta-0.023
Token seaw
Token ID32085
Feature activation-7.48e+00

Pos logit contributions
aw+0.059
olo+0.058
ins+0.057
 pasta+0.057
 seaw+0.056
Neg logit contributions
 Ellen-0.095
tered-0.095
 Hatt-0.092
 Eleanor-0.091
 Beth-0.090
Tokeneed
Token ID2308
Feature activation-1.08e+00

Pos logit contributions
Dan+0.169
aw+0.165
ins+0.164
 pasta+0.163
olo+0.162
Neg logit contributions
tered-0.328
hered-0.318
 as-0.311
 throughout-0.298
ened-0.295
Token and
Token ID290
Feature activation+2.56e-01

Pos logit contributions
Dan+0.051
aw+0.050
olo+0.050
P+0.050
ins+0.050
Neg logit contributions
 as-0.024
tered-0.024
ened-0.022
 Ellen-0.022
 Grace-0.021
Token decided
Token ID3066
Feature activation-2.01e+00

Pos logit contributions
tered+0.009
 Ellen+0.008
 Grace+0.008
ened+0.008
 as+0.008
Neg logit contributions
olo-0.009
ins-0.009
aw-0.009
actus-0.009
Dan-0.009
Token to
Token ID284
Feature activation-4.57e-03

Pos logit contributions
ins+0.093
olo+0.091
actus+0.090
iles+0.090
aw+0.087
Neg logit contributions
 as-0.055
 Ellen-0.054
tered-0.053
 Grace-0.048
 Beth-0.047
Token give
Token ID1577
Feature activation+2.28e-01

Pos logit contributions
ins+0.000
aw+0.000
actus+0.000
olo+0.000
rou+0.000
Neg logit contributions
tered-0.000
 Ellen-0.000
 Grace-0.000
 Beth-0.000
 as-0.000
Token It
Token ID632
Feature activation+1.38e+00

Pos logit contributions
ll+0.026
ins+0.026
iles+0.025
rou+0.024
anut+0.024
Neg logit contributions
 Grace-0.012
 Ellen-0.012
tered-0.012
 Eleanor-0.011
 as-0.011
Token was
Token ID373
Feature activation-4.65e-02

Pos logit contributions
tered+0.037
 Ellen+0.034
 Grace+0.034
ened+0.033
 Beth+0.033
Neg logit contributions
ins-0.064
aw-0.062
olo-0.062
 pasta-0.061
Dan-0.059
Token hiding
Token ID11816
Feature activation-3.78e-01

Pos logit contributions
ins+0.002
actus+0.002
ll+0.002
aw+0.002
act+0.002
Neg logit contributions
tered-0.001
 as-0.001
 Ellen-0.001
 Grace-0.001
 Beth-0.001
Token under
Token ID739
Feature activation-2.31e-01

Pos logit contributions
ins+0.018
olo+0.018
aw+0.018
 pasta+0.018
actus+0.018
Neg logit contributions
tered-0.009
 Ellen-0.008
 as-0.008
ened-0.008
 Beth-0.007
Token a
Token ID257
Feature activation+1.17e+00

Pos logit contributions
ins+0.012
ll+0.012
aw+0.011
actus+0.011
rou+0.011
Neg logit contributions
tered-0.005
 Ellen-0.005
 Grace-0.005
 as-0.005
 Beth-0.005
Token seaw
Token ID32085
Feature activation-7.47e+00

Pos logit contributions
 cheer+0.037
 breath+0.037
 Grace+0.036
hered+0.034
 as+0.033
Neg logit contributions
ins-0.048
ll-0.045
olo-0.044
aw-0.043
iles-0.042
Tokeneed
Token ID2308
Feature activation-1.19e+00

Pos logit contributions
ins+0.192
 Brussels+0.167
 pasta+0.167
iles+0.164
ll+0.162
Neg logit contributions
tered-0.318
hered-0.311
 as-0.302
 breath-0.299
ened-0.290
Token.
Token ID13
Feature activation-4.86e-01

Pos logit contributions
ins+0.054
Dan+0.052
aw+0.051
olo+0.051
P+0.051
Neg logit contributions
 as-0.032
tered-0.029
hered-0.027
 breath-0.027
ened-0.027
Token Lily
Token ID20037
Feature activation-1.66e-01

Pos logit contributions
ll+0.021
ins+0.021
actus+0.020
 coconut+0.020
rou+0.020
Neg logit contributions
 Grace-0.014
tered-0.014
 Ellen-0.014
 Eleanor-0.014
 Hatt-0.014
Token said
Token ID531
Feature activation+3.70e-02

Pos logit contributions
 pasta+0.008
ins+0.008
iles+0.008
 Brussels+0.007
aw+0.007
Neg logit contributions
tered-0.004
 as-0.004
ened-0.004
 Ellen-0.004
 Grace-0.004
Token,
Token ID11
Feature activation+2.16e-01

Pos logit contributions
 Ellen+0.001
tered+0.001
 Grace+0.000
 Hatt+0.000
 Beth+0.000
Neg logit contributions
ins-0.002
aw-0.002
actus-0.002
olo-0.002
 pasta-0.002
Token see
Token ID766
Feature activation-1.93e+00

Pos logit contributions
 Ellen+0.005
tered+0.005
 Hatt+0.005
 Eleanor+0.005
 Grace+0.005
Neg logit contributions
aw-0.010
ins-0.010
actus-0.010
 pasta-0.010
olo-0.010
Token shells
Token ID19679
Feature activation-3.37e+00

Pos logit contributions
ll+0.071
ins+0.070
actus+0.069
olo+0.069
P+0.068
Neg logit contributions
tered-0.063
 Ellen-0.061
 Grace-0.061
 Hatt-0.061
 as-0.061
Token,
Token ID11
Feature activation+1.97e-01

Pos logit contributions
aw+0.167
olo+0.166
 pasta+0.166
ins+0.166
 seaw+0.164
Neg logit contributions
tered-0.069
 Ellen-0.069
 Hatt-0.065
 Eleanor-0.063
 Beth-0.062
Token rocks
Token ID12586
Feature activation-1.18e+00

Pos logit contributions
 as+0.009
 Hatt+0.009
tered+0.009
 Grace+0.009
 Ellen+0.009
Neg logit contributions
actus-0.005
ll-0.004
ins-0.004
aw-0.004
rou-0.004
Token,
Token ID11
Feature activation+1.78e-01

Pos logit contributions
aw+0.058
 pasta+0.058
ins+0.057
 seaw+0.057
olo+0.057
Neg logit contributions
 Ellen-0.025
tered-0.025
 Hatt-0.024
 Eleanor-0.023
 Beth-0.023
Token seaw
Token ID32085
Feature activation-7.47e+00

Pos logit contributions
tered+0.007
 as+0.007
 Hatt+0.007
 Ellen+0.007
ened+0.007
Neg logit contributions
ins-0.005
olo-0.005
aw-0.005
actus-0.005
rou-0.005
Tokeneed
Token ID2308
Feature activation-1.24e+00

Pos logit contributions
rou+0.148
 pasta+0.145
ins+0.143
olo+0.139
 coconut+0.139
Neg logit contributions
tered-0.348
hered-0.333
 as-0.333
 Hatt-0.327
ened-0.325
Token,
Token ID11
Feature activation+1.77e-01

Pos logit contributions
aw+0.064
olo+0.063
 pasta+0.063
ins+0.063
 seaw+0.063
Neg logit contributions
tered-0.022
 Ellen-0.022
 Hatt-0.021
 Eleanor-0.020
 Grace-0.019
Token crabs
Token ID49172
Feature activation-4.22e+00

Pos logit contributions
tered+0.005
 as+0.005
 Ellen+0.005
 Hatt+0.005
ened+0.005
Neg logit contributions
ins-0.007
olo-0.007
aw-0.007
actus-0.007
Dan-0.007
Token and
Token ID290
Feature activation-2.31e-01

Pos logit contributions
aw+0.244
 pasta+0.242
olo+0.241
ins+0.241
 seaw+0.240
Neg logit contributions
 as-0.045
tered-0.044
hered-0.042
 Ellen-0.042
 Hatt-0.039
Token fish
Token ID5916
Feature activation-2.26e+00

Pos logit contributions
ll+0.006
ins+0.006
actus+0.006
olo+0.006
Dan+0.006
Neg logit contributions
tered-0.010
 as-0.010
 Grace-0.010
 Ellen-0.010
ened-0.010
Token so
Token ID523
Feature activation-1.56e+00

Pos logit contributions
ins+0.006
aw+0.006
olo+0.006
actus+0.006
ll+0.005
Neg logit contributions
 Ellen-0.004
tered-0.004
 Grace-0.004
 Beth-0.004
ened-0.004
Token many
Token ID867
Feature activation-7.91e-01

Pos logit contributions
ins+0.065
olo+0.065
 pasta+0.064
actus+0.063
aw+0.063
Neg logit contributions
tered-0.048
 Ellen-0.046
 Grace-0.045
 Beth-0.045
 Eleanor-0.044
Token little
Token ID1310
Feature activation+3.32e-01

Pos logit contributions
oni+0.028
olo+0.027
ll+0.027
Dan+0.027
actus+0.027
Neg logit contributions
tered-0.027
ened-0.027
 Ellen-0.026
 Grace-0.026
 as-0.026
Token pieces
Token ID5207
Feature activation+1.27e-01

Pos logit contributions
ened+0.014
 Grace+0.014
 as+0.013
 Ellen+0.013
tered+0.013
Neg logit contributions
olo-0.009
ades-0.009
oni-0.009
Dan-0.009
aw-0.009
Token of
Token ID286
Feature activation-2.02e-01

Pos logit contributions
tered+0.004
 Ellen+0.004
 Hatt+0.004
ened+0.004
 Beth+0.004
Neg logit contributions
olo-0.005
aw-0.005
ins-0.005
 pasta-0.005
Dan-0.005
Token seaw
Token ID32085
Feature activation-7.45e+00

Pos logit contributions
ll+0.008
actus+0.008
ins+0.008
P+0.007
olo+0.007
Neg logit contributions
 Grace-0.006
 as-0.006
tered-0.006
ened-0.006
 Ellen-0.006
Tokeneed
Token ID2308
Feature activation-1.20e+00

Pos logit contributions
 pasta+0.180
 Brussels+0.177
 coconut+0.170
 volcano+0.168
Dan+0.168
Neg logit contributions
tered-0.313
hered-0.310
 as-0.301
ened-0.292
 Hatt-0.289
Token.
Token ID13
Feature activation-1.99e-01

Pos logit contributions
olo+0.049
aw+0.049
ins+0.049
Dan+0.049
 pasta+0.048
Neg logit contributions
tered-0.035
 Ellen-0.033
ened-0.033
 as-0.033
 Hatt-0.031
Token It
Token ID632
Feature activation+1.58e+00

Pos logit contributions
ins+0.009
 pasta+0.009
actus+0.009
 seaw+0.009
olo+0.009
Neg logit contributions
tered-0.005
 Ellen-0.005
 Grace-0.005
 Beth-0.005
 Eleanor-0.005
Token sw
Token ID1509
Feature activation-1.54e+00

Pos logit contributions
tered+0.051
 Ellen+0.050
 Hatt+0.048
 Eleanor+0.048
 Beth+0.047
Neg logit contributions
 pasta-0.061
aw-0.061
ins-0.061
olo-0.060
 seaw-0.059
Tokenished
Token ID1348
Feature activation-6.73e-01

Pos logit contributions
 pasta+0.044
olo+0.043
 seaw+0.040
ins+0.040
Dan+0.039
Neg logit contributions
tered-0.068
ened-0.067
 Ellen-0.063
 as-0.061
 Eleanor-0.061
Token like
Token ID588
Feature activation-6.23e-01

Pos logit contributions
tered+0.008
 Ellen+0.008
 Hatt+0.008
 Eleanor+0.008
 Beth+0.007
Neg logit contributions
aw-0.014
ins-0.014
olo-0.014
 pasta-0.014
 seaw-0.014
Token fish
Token ID5916
Feature activation-2.30e+00

Pos logit contributions
ll+0.020
actus+0.020
ins+0.020
iflower+0.019
anut+0.019
Neg logit contributions
 as-0.024
 Grace-0.023
 Beth-0.022
tered-0.022
 Ellen-0.021
Token and
Token ID290
Feature activation-1.51e-03

Pos logit contributions
aw+0.134
Dan+0.133
rou+0.133
olo+0.133
 Brussels+0.132
Neg logit contributions
 as-0.024
hered-0.017
tered-0.016
 Grace-0.014
 Beth-0.012
Token shells
Token ID19679
Feature activation-3.32e+00

Pos logit contributions
ins+0.000
aw+0.000
actus+0.000
ll+0.000
P+0.000
Neg logit contributions
 as-0.000
 Grace-0.000
 Beth-0.000
 Ellen-0.000
tered-0.000
Token and
Token ID290
Feature activation-1.02e-02

Pos logit contributions
olo+0.139
Dan+0.139
aw+0.137
 coconut+0.136
 pasta+0.135
Neg logit contributions
tered-0.091
 as-0.090
 Ellen-0.087
 Hatt-0.084
hered-0.084
Token seaw
Token ID32085
Feature activation-7.37e+00

Pos logit contributions
ins+0.000
actus+0.000
ll+0.000
aw+0.000
olo+0.000
Neg logit contributions
 as-0.000
tered-0.000
 Grace-0.000
ened-0.000
 Ellen-0.000
Tokeneed
Token ID2308
Feature activation-1.23e+00

Pos logit contributions
 pasta+0.153
 Brussels+0.152
rou+0.152
olo+0.147
 coconut+0.146
Neg logit contributions
tered-0.331
 as-0.315
hered-0.315
ened-0.311
 breath-0.306
Token.
Token ID13
Feature activation-5.45e-01

Pos logit contributions
Dan+0.051
olo+0.050
aw+0.050
P+0.049
 coconut+0.049
Neg logit contributions
tered-0.035
 as-0.034
ened-0.032
 Ellen-0.032
 Grace-0.031
Token\n
Token ID198
Feature activation-5.79e-01

Pos logit contributions
 seaw+0.021
ins+0.021
aw+0.021
 pasta+0.021
olo+0.021
Neg logit contributions
 Ellen-0.018
 Beth-0.018
 Eleanor-0.017
 Hatt-0.017
 Grace-0.017
Token\n
Token ID198
Feature activation-5.31e-01

Pos logit contributions
 seaw+0.022
 pasta+0.021
aw+0.021
olo+0.020
 coconut+0.020
Neg logit contributions
 Ellen-0.020
 Eleanor-0.020
tered-0.020
 Beth-0.020
 Hatt-0.020
TokenOne
Token ID3198
Feature activation+7.28e-01

Pos logit contributions
 pasta+0.018
 seaw+0.018
actus+0.018
aw+0.018
olo+0.018
Neg logit contributions
tered-0.020
 Ellen-0.019
 Eleanor-0.019
 Hatt-0.019
 Beth-0.019
Token from
Token ID422
Feature activation+4.04e-01

Pos logit contributions
 Ellen+0.013
 Hatt+0.013
 Eleanor+0.013
tered+0.012
 Grace+0.012
Neg logit contributions
 pasta-0.016
aw-0.016
oni-0.015
 seaw-0.015
ll-0.015
Token the
Token ID262
Feature activation-4.08e-01

Pos logit contributions
tered+0.013
 Ellen+0.013
 Hatt+0.013
 Eleanor+0.012
hered+0.012
Neg logit contributions
 pasta-0.015
 seaw-0.015
aw-0.015
olo-0.015
Dan-0.015
Token rocks
Token ID12586
Feature activation-1.31e+00

Pos logit contributions
ins+0.015
aw+0.014
olo+0.013
rou+0.013
ll+0.013
Neg logit contributions
tered-0.015
 breath-0.014
hered-0.014
 as-0.014
 Grace-0.014
Token and
Token ID290
Feature activation+1.09e-01

Pos logit contributions
aw+0.059
 pasta+0.059
ins+0.059
olo+0.059
 seaw+0.058
Neg logit contributions
 Ellen-0.034
tered-0.033
 Hatt-0.032
 Eleanor-0.031
 Beth-0.031
Token the
Token ID262
Feature activation-4.50e-01

Pos logit contributions
 Ellen+0.003
tered+0.003
 Hatt+0.003
 Eleanor+0.003
hered+0.003
Neg logit contributions
 seaw-0.004
 pasta-0.004
aw-0.004
olo-0.004
 coconut-0.004
Token seaw
Token ID32085
Feature activation-7.37e+00

Pos logit contributions
aw+0.014
ins+0.013
rou+0.012
olo+0.012
Dan+0.012
Neg logit contributions
tered-0.018
 breath-0.018
 as-0.017
 Grace-0.017
 cheer-0.017
Tokeneed
Token ID2308
Feature activation-1.29e+00

Pos logit contributions
 pasta+0.185
 Brussels+0.177
ins+0.175
 microsc+0.171
rou+0.166
Neg logit contributions
tered-0.299
hered-0.292
 as-0.279
 breath-0.271
ened-0.271
Token,
Token ID11
Feature activation+3.68e-01

Pos logit contributions
aw+0.050
olo+0.049
ins+0.049
 pasta+0.049
 seaw+0.048
Neg logit contributions
tered-0.041
 Ellen-0.041
 Hatt-0.039
 Eleanor-0.039
 Beth-0.038
Token but
Token ID475
Feature activation+1.33e+00

Pos logit contributions
 Ellen+0.009
tered+0.009
 Hatt+0.009
 Eleanor+0.008
 Beth+0.008
Neg logit contributions
aw-0.017
 pasta-0.017
ins-0.017
olo-0.017
 seaw-0.017
Token the
Token ID262
Feature activation-4.39e-01

Pos logit contributions
 Ellen+0.043
tered+0.043
 Hatt+0.041
hered+0.040
 Eleanor+0.040
Neg logit contributions
 pasta-0.050
aw-0.050
olo-0.049
 seaw-0.049
 coconut-0.048
Token net
Token ID2010
Feature activation-2.81e+00

Pos logit contributions
ins+0.014
aw+0.014
rou+0.013
ll+0.013
ades+0.013
Neg logit contributions
 breath-0.017
 as-0.017
tered-0.017
 cheer-0.016
ened-0.016
Token decorated
Token ID24789
Feature activation+3.30e-01

Pos logit contributions
 Ellen+0.001
tered+0.001
 Hatt+0.001
 Eleanor+0.001
 Beth+0.001
Neg logit contributions
aw-0.002
ins-0.002
olo-0.002
 pasta-0.002
 seaw-0.002
Token it
Token ID340
Feature activation+3.82e-01

Pos logit contributions
tered+0.007
 Ellen+0.007
 as+0.006
 Grace+0.006
ened+0.006
Neg logit contributions
ll-0.016
olo-0.016
ins-0.016
aw-0.016
P-0.016
Token with
Token ID351
Feature activation-4.31e-01

Pos logit contributions
tered+0.009
 as+0.008
 Ellen+0.007
 Grace+0.007
ened+0.007
Neg logit contributions
olo-0.020
Dan-0.019
aw-0.019
ins-0.019
 seaw-0.019
Token shells
Token ID19679
Feature activation-3.25e+00

Pos logit contributions
actus+0.016
olo+0.016
ll+0.016
ins+0.016
opus+0.015
Neg logit contributions
tered-0.015
 as-0.015
ened-0.014
 Grace-0.014
 Ellen-0.014
Token and
Token ID290
Feature activation+8.83e-03

Pos logit contributions
Dan+0.147
olo+0.147
P+0.145
aw+0.143
ins+0.142
Neg logit contributions
 as-0.083
tered-0.078
ened-0.072
 Ellen-0.072
hered-0.071
Token seaw
Token ID32085
Feature activation-7.32e+00

Pos logit contributions
tered+0.000
 as+0.000
ened+0.000
 Ellen+0.000
 Hatt+0.000
Neg logit contributions
ins-0.000
actus-0.000
olo-0.000
aw-0.000
ll-0.000
Tokeneed
Token ID2308
Feature activation-1.01e+00

Pos logit contributions
 Brussels+0.153
 pasta+0.153
rou+0.151
olo+0.149
 coconut+0.148
Neg logit contributions
tered-0.320
 as-0.308
hered-0.305
ened-0.304
 Hatt-0.300
Token.
Token ID13
Feature activation-3.60e-01

Pos logit contributions
olo+0.040
Dan+0.040
aw+0.040
P+0.040
 pasta+0.040
Neg logit contributions
tered-0.030
 as-0.029
ened-0.028
 Ellen-0.028
 Hatt-0.027
Token They
Token ID1119
Feature activation+5.26e-01

Pos logit contributions
 seaw+0.016
ins+0.016
 pasta+0.016
aw+0.016
olo+0.016
Neg logit contributions
 Ellen-0.010
tered-0.010
 Eleanor-0.010
 Hatt-0.010
 Beth-0.010
Token had
Token ID550
Feature activation+5.24e-01

Pos logit contributions
tered+0.016
 Hatt+0.015
 Ellen+0.015
 Eleanor+0.014
 Grace+0.014
Neg logit contributions
ins-0.022
aw-0.022
actus-0.021
olo-0.021
 pasta-0.021
Token a
Token ID257
Feature activation+1.15e+00

Pos logit contributions
tered+0.016
 as+0.015
 Ellen+0.015
 Grace+0.014
 Hatt+0.014
Neg logit contributions
actus-0.022
ins-0.022
opus-0.021
olo-0.021
ll-0.020
Tokenugg
Token ID6837
Feature activation+9.33e-01

Pos logit contributions
 Ellen+0.049
tered+0.047
 Hatt+0.047
 Eleanor+0.046
 Grace+0.045
Neg logit contributions
aw-0.045
ins-0.043
 pasta-0.043
olo-0.043
 seaw-0.042
Tokenled
Token ID992
Feature activation+7.05e-01

Pos logit contributions
 Ellen+0.031
 Hatt+0.030
tered+0.029
 Eleanor+0.029
 Beth+0.029
Neg logit contributions
aw-0.037
ins-0.037
 pasta-0.036
 seaw-0.036
olo-0.035
Token up
Token ID510
Feature activation+6.73e-01

Pos logit contributions
 Ellen+0.019
tered+0.019
 Hatt+0.019
 Eleanor+0.018
 Beth+0.018
Neg logit contributions
aw-0.031
 pasta-0.031
 seaw-0.030
ins-0.030
olo-0.030
Token in
Token ID287
Feature activation-3.67e-01

Pos logit contributions
 Ellen+0.019
tered+0.019
 Hatt+0.018
 Eleanor+0.018
 Beth+0.017
Neg logit contributions
aw-0.029
olo-0.028
 pasta-0.028
ins-0.028
 seaw-0.028
Token some
Token ID617
Feature activation+3.80e-01

Pos logit contributions
 pasta+0.014
aw+0.014
 seaw+0.014
olo+0.014
ins+0.014
Neg logit contributions
 Ellen-0.012
tered-0.012
 Hatt-0.011
 Eleanor-0.011
 Beth-0.011
Token seaw
Token ID32085
Feature activation-7.31e+00

Pos logit contributions
tered+0.014
 Ellen+0.014
 Grace+0.013
 Hatt+0.013
 Eleanor+0.013
Neg logit contributions
olo-0.013
ins-0.012
Dan-0.012
aw-0.012
oni-0.012
Tokeneed
Token ID2308
Feature activation-1.13e+00

Pos logit contributions
 pasta+0.162
 Brussels+0.157
 coconut+0.153
olo+0.151
ins+0.151
Neg logit contributions
tered-0.330
hered-0.315
 as-0.310
 Hatt-0.301
ened-0.301
Token and
Token ID290
Feature activation+1.47e-01

Pos logit contributions
aw+0.051
olo+0.051
ins+0.051
 pasta+0.051
Dan+0.050
Neg logit contributions
tered-0.028
 Ellen-0.028
 Hatt-0.026
 Grace-0.026
 Eleanor-0.025
Token felt
Token ID2936
Feature activation+3.65e-01

Pos logit contributions
 Ellen+0.005
tered+0.005
 Hatt+0.005
 Eleanor+0.005
 Beth+0.004
Neg logit contributions
 pasta-0.006
aw-0.006
 seaw-0.006
olo-0.006
ins-0.006
Token very
Token ID845
Feature activation-1.73e+00

Pos logit contributions
tered+0.011
 as+0.011
 Grace+0.011
 Ellen+0.011
 Hatt+0.010
Neg logit contributions
ins-0.015
olo-0.015
iles-0.014
Dan-0.014
aw-0.014
Token cozy
Token ID37438
Feature activation+8.12e-01

Pos logit contributions
aw+0.073
olo+0.073
ins+0.073
iles+0.072
oni+0.071
Neg logit contributions
tered-0.052
 as-0.049
 Grace-0.046
 Hatt-0.044
ened-0.044
Token She
Token ID1375
Feature activation+1.12e+00

Pos logit contributions
tered+0.003
 Ellen+0.003
 Hatt+0.003
hered+0.003
 Eleanor+0.003
Neg logit contributions
aw-0.004
 pasta-0.003
olo-0.003
 coconut-0.003
 seaw-0.003
Token also
Token ID635
Feature activation-2.06e+00

Pos logit contributions
tered+0.034
 Ellen+0.033
 Hatt+0.032
ened+0.032
 Eleanor+0.032
Neg logit contributions
ins-0.045
aw-0.045
olo-0.044
 pasta-0.044
 coconut-0.043
Token picked
Token ID6497
Feature activation-1.90e+00

Pos logit contributions
aw+0.074
ins+0.074
 pasta+0.074
olo+0.073
 seaw+0.072
Neg logit contributions
 Ellen-0.071
tered-0.071
 Hatt-0.069
 Eleanor-0.068
 Beth-0.067
Token some
Token ID617
Feature activation+3.36e-01

Pos logit contributions
aw+0.077
 pasta+0.076
 seaw+0.076
ins+0.074
olo+0.074
Neg logit contributions
 Ellen-0.059
tered-0.059
 Hatt-0.057
 Eleanor-0.056
 Beth-0.054
Token green
Token ID4077
Feature activation+6.69e-02

Pos logit contributions
tered+0.013
 Ellen+0.013
 Hatt+0.013
 Eleanor+0.012
 Beth+0.012
Neg logit contributions
olo-0.011
aw-0.010
ins-0.010
Dan-0.010
 pasta-0.010
Token seaw
Token ID32085
Feature activation-7.30e+00

Pos logit contributions
 Ellen+0.003
tered+0.003
 Hatt+0.003
 Eleanor+0.003
 Beth+0.003
Neg logit contributions
aw-0.002
olo-0.002
 pasta-0.002
ins-0.002
 seaw-0.002
Tokeneed
Token ID2308
Feature activation-1.20e+00

Pos logit contributions
aw+0.146
 pasta+0.140
ins+0.139
olo+0.138
Dan+0.135
Neg logit contributions
tered-0.351
 Hatt-0.333
hered-0.333
ened-0.326
 as-0.325
Token to
Token ID284
Feature activation-3.68e-01

Pos logit contributions
aw+0.054
ins+0.054
olo+0.054
Dan+0.053
 pasta+0.053
Neg logit contributions
tered-0.030
 Ellen-0.029
 Hatt-0.028
 Eleanor-0.027
 Grace-0.027
Token show
Token ID905
Feature activation-1.50e+00

Pos logit contributions
aw+0.016
 pasta+0.016
 seaw+0.016
olo+0.016
ins+0.016
Neg logit contributions
 Ellen-0.010
tered-0.010
 Hatt-0.010
 Eleanor-0.009
 Beth-0.009
Token her
Token ID607
Feature activation+6.15e-01

Pos logit contributions
aw+0.058
 pasta+0.056
ins+0.056
olo+0.054
 crabs+0.054
Neg logit contributions
tered-0.048
hered-0.047
 Hatt-0.047
 Ellen-0.046
 Eleanor-0.046
Token friends
Token ID2460
Feature activation-3.93e-01

Pos logit contributions
 Ellen+0.020
 Beth+0.020
 Grace+0.020
 Hatt+0.019
 as+0.019
Neg logit contributions
aw-0.024
olo-0.023
ins-0.023
P-0.023
Dan-0.022
Token of
Token ID286
Feature activation-1.12e-01

Pos logit contributions
 Ellen+0.027
tered+0.027
 Hatt+0.027
 Eleanor+0.026
 Beth+0.025
Neg logit contributions
aw-0.026
ins-0.026
 pasta-0.026
 seaw-0.026
olo-0.025
Token shells
Token ID19679
Feature activation-3.34e+00

Pos logit contributions
olo+0.004
oni+0.004
actus+0.004
ll+0.004
Dan+0.004
Neg logit contributions
 Ellen-0.004
ened-0.004
 Eleanor-0.004
tered-0.004
 Hatt-0.004
Token,
Token ID11
Feature activation+2.51e-01

Pos logit contributions
aw+0.154
olo+0.154
ins+0.153
 pasta+0.152
Dan+0.151
Neg logit contributions
 Ellen-0.079
tered-0.079
 Hatt-0.074
 Grace-0.072
 Eleanor-0.072
Token rocks
Token ID12586
Feature activation-1.21e+00

Pos logit contributions
tered+0.010
 Ellen+0.010
 Hatt+0.010
ened+0.010
 as+0.010
Neg logit contributions
ins-0.007
actus-0.007
olo-0.007
aw-0.007
ll-0.007
Token and
Token ID290
Feature activation+2.85e-02

Pos logit contributions
aw+0.058
 pasta+0.058
olo+0.058
ins+0.058
 seaw+0.057
Neg logit contributions
 Ellen-0.027
tered-0.027
 Hatt-0.025
 Eleanor-0.025
 Beth-0.024
Token seaw
Token ID32085
Feature activation-7.27e+00

Pos logit contributions
tered+0.001
 Ellen+0.001
 Hatt+0.001
 Grace+0.001
ened+0.001
Neg logit contributions
olo-0.001
actus-0.001
ins-0.001
aw-0.001
ll-0.001
Tokeneed
Token ID2308
Feature activation-1.14e+00

Pos logit contributions
 pasta+0.147
 Brussels+0.140
 coconut+0.139
olo+0.139
rou+0.138
Neg logit contributions
tered-0.341
 Hatt-0.326
hered-0.325
 as-0.323
ened-0.322
Token,
Token ID11
Feature activation+2.54e-01

Pos logit contributions
aw+0.047
olo+0.047
 pasta+0.047
ins+0.046
Dan+0.046
Neg logit contributions
tered-0.033
 Ellen-0.032
 Hatt-0.031
ened-0.030
 Eleanor-0.030
Token but
Token ID475
Feature activation+1.21e+00

Pos logit contributions
tered+0.006
 Ellen+0.006
 as+0.006
 Hatt+0.006
 Grace+0.006
Neg logit contributions
actus-0.011
ins-0.011
olo-0.011
aw-0.011
P-0.011
Token Jane
Token ID12091
Feature activation-9.27e-01

Pos logit contributions
 Ellen+0.040
tered+0.039
 Hatt+0.039
 Eleanor+0.039
hered+0.038
Neg logit contributions
 pasta-0.045
 seaw-0.044
aw-0.044
 coconut-0.044
olo-0.043
Token was
Token ID373
Feature activation+1.98e-01

Pos logit contributions
 pasta+0.035
 seaw+0.035
ins+0.034
aw+0.034
olo+0.034
Neg logit contributions
tered-0.031
 Ellen-0.030
 Hatt-0.029
 Eleanor-0.029
ened-0.029
Token shore
Token ID15191
Feature activation-1.90e+00

Pos logit contributions
ins+0.017
aw+0.016
olo+0.016
 pasta+0.015
P+0.015
Neg logit contributions
tered-0.021
 Ellen-0.020
 Grace-0.020
 as-0.020
 Eleanor-0.019
Token,
Token ID11
Feature activation+1.55e-01

Pos logit contributions
 pasta+0.086
ins+0.086
olo+0.086
aw+0.086
 seaw+0.085
Neg logit contributions
tered-0.046
 Ellen-0.045
 as-0.042
 Hatt-0.042
ened-0.042
Token collecting
Token ID13157
Feature activation-2.80e+00

Pos logit contributions
 Ellen+0.005
tered+0.005
 Hatt+0.005
 Eleanor+0.005
 Beth+0.005
Neg logit contributions
aw-0.006
 pasta-0.006
ins-0.006
olo-0.006
 seaw-0.006
Token shells
Token ID19679
Feature activation-3.43e+00

Pos logit contributions
olo+0.073
ins+0.071
 pasta+0.070
ll+0.069
actus+0.069
Neg logit contributions
tered-0.123
 Ellen-0.121
ened-0.119
 Hatt-0.118
 Eleanor-0.118
Token and
Token ID290
Feature activation-6.55e-02

Pos logit contributions
olo+0.159
ins+0.158
aw+0.158
 pasta+0.158
Dan+0.157
Neg logit contributions
tered-0.079
 Ellen-0.077
 as-0.076
hered-0.072
 Hatt-0.072
Token seaw
Token ID32085
Feature activation-7.26e+00

Pos logit contributions
aw+0.002
ins+0.002
olo+0.002
actus+0.002
 pasta+0.002
Neg logit contributions
tered-0.003
 Ellen-0.002
ened-0.002
 Grace-0.002
 Beth-0.002
Tokeneed
Token ID2308
Feature activation-1.25e+00

Pos logit contributions
 pasta+0.158
 Brussels+0.150
 coconut+0.146
olo+0.141
 volcano+0.139
Neg logit contributions
tered-0.335
hered-0.323
 as-0.319
ened-0.318
 Hatt-0.312
Token.
Token ID13
Feature activation+5.97e-02

Pos logit contributions
aw+0.049
 pasta+0.049
olo+0.049
ins+0.048
 seaw+0.048
Neg logit contributions
tered-0.038
 Ellen-0.037
 Hatt-0.035
ened-0.035
 Eleanor-0.035
Token Jimmy
Token ID12963
Feature activation+1.59e-01

Pos logit contributions
tered+0.002
 Ellen+0.002
 Hatt+0.002
 Eleanor+0.002
 Grace+0.002
Neg logit contributions
ins-0.003
aw-0.003
 seaw-0.002
olo-0.002
actus-0.002
Token took
Token ID1718
Feature activation-4.03e-01

Pos logit contributions
 Ellen+0.005
tered+0.005
 Hatt+0.005
 Eleanor+0.005
 Beth+0.005
Neg logit contributions
aw-0.006
 pasta-0.006
olo-0.006
 seaw-0.006
ins-0.006
Token time
Token ID640
Feature activation-1.41e-01

Pos logit contributions
aw+0.015
 seaw+0.014
 pasta+0.014
 crabs+0.013
 oy+0.013
Neg logit contributions
 Ellen-0.015
 Hatt-0.015
tered-0.015
 Eleanor-0.014
 Beth-0.013
Token tentacles
Token ID41192
Feature activation-8.58e-01

Pos logit contributions
 Ellen+0.007
tered+0.007
 Hatt+0.007
 Eleanor+0.007
 Beth+0.007
Neg logit contributions
aw-0.007
ins-0.007
olo-0.007
 pasta-0.007
 seaw-0.007
Token to
Token ID284
Feature activation-1.76e-01

Pos logit contributions
olo+0.033
aw+0.033
ins+0.033
 pasta+0.033
 seaw+0.032
Neg logit contributions
tered-0.027
 Ellen-0.027
 Hatt-0.026
ened-0.025
 Eleanor-0.025
Token reach
Token ID3151
Feature activation-1.72e+00

Pos logit contributions
aw+0.008
 pasta+0.007
olo+0.007
 seaw+0.007
ins+0.007
Neg logit contributions
 Ellen-0.005
tered-0.005
 Hatt-0.005
 Eleanor-0.005
 Beth-0.004
Token for
Token ID329
Feature activation+9.31e-01

Pos logit contributions
aw+0.079
ins+0.079
olo+0.078
 pasta+0.076
actus+0.076
Neg logit contributions
tered-0.042
 Ellen-0.042
 Hatt-0.040
 Beth-0.039
 Eleanor-0.039
Token the
Token ID262
Feature activation-2.87e-01

Pos logit contributions
 Ellen+0.029
tered+0.029
 Hatt+0.028
 Eleanor+0.028
 Beth+0.027
Neg logit contributions
ins-0.037
aw-0.037
olo-0.036
 pasta-0.036
actus-0.035
Token seaw
Token ID32085
Feature activation-7.26e+00

Pos logit contributions
aw+0.007
ins+0.007
ades+0.006
 Brussels+0.006
Dan+0.006
Neg logit contributions
 breath-0.013
 Grace-0.013
tered-0.013
ened-0.012
hered-0.012
Tokeneed
Token ID2308
Feature activation-1.17e+00

Pos logit contributions
 yourselves+0.193
ins+0.192
 Brussels+0.189
aw+0.182
 pasta+0.180
Neg logit contributions
hered-0.280
tered-0.277
 as-0.270
 breath-0.256
ly-0.255
Token,
Token ID11
Feature activation+4.33e-01

Pos logit contributions
aw+0.053
olo+0.053
ins+0.053
Dan+0.052
P+0.051
Neg logit contributions
tered-0.029
 as-0.028
 Ellen-0.028
ened-0.026
 Hatt-0.026
Token and
Token ID290
Feature activation+1.30e-01

Pos logit contributions
tered+0.011
 Ellen+0.011
 as+0.010
ened+0.010
 Hatt+0.010
Neg logit contributions
ins-0.020
olo-0.020
aw-0.020
actus-0.020
Dan-0.019
Token he
Token ID339
Feature activation-1.50e+00

Pos logit contributions
 Ellen+0.004
tered+0.004
 Hatt+0.004
 Eleanor+0.004
 Beth+0.004
Neg logit contributions
aw-0.005
 pasta-0.005
 seaw-0.005
olo-0.005
ins-0.005
Token wrapped
Token ID12908
Feature activation+1.35e+00

Pos logit contributions
ins+0.060
aw+0.060
 pasta+0.059
olo+0.059
actus+0.058
Neg logit contributions
tered-0.047
 Ellen-0.044
 Hatt-0.043
ened-0.043
 Grace-0.042
Token She
Token ID1375
Feature activation+9.24e-01

Pos logit contributions
 Ellen+0.002
tered+0.002
 Beth+0.002
 Grace+0.002
 Hatt+0.002
Neg logit contributions
aw-0.002
ins-0.002
 seaw-0.002
olo-0.002
actus-0.002
Token told
Token ID1297
Feature activation+5.11e-01

Pos logit contributions
tered+0.033
 Ellen+0.033
 Hatt+0.032
 Eleanor+0.031
 Beth+0.031
Neg logit contributions
aw-0.032
 pasta-0.032
ins-0.032
 seaw-0.032
olo-0.032
Token him
Token ID683
Feature activation-9.06e-01

Pos logit contributions
tered+0.020
 Ellen+0.020
 Hatt+0.019
 Eleanor+0.019
 Beth+0.018
Neg logit contributions
 pasta-0.016
aw-0.016
olo-0.016
 seaw-0.016
ins-0.016
Token it
Token ID340
Feature activation+4.07e-01

Pos logit contributions
ins+0.036
aw+0.036
actus+0.036
 seaw+0.036
P+0.036
Neg logit contributions
 Ellen-0.028
tered-0.028
 Beth-0.027
 Hatt-0.026
 Grace-0.026
Token was
Token ID373
Feature activation+1.24e-01

Pos logit contributions
tered+0.014
 Ellen+0.014
 Hatt+0.013
 Eleanor+0.013
 Beth+0.013
Neg logit contributions
aw-0.015
ins-0.015
olo-0.015
 pasta-0.015
 seaw-0.015
Token seaw
Token ID32085
Feature activation-7.25e+00

Pos logit contributions
 as+0.004
tered+0.003
 Beth+0.003
 Ellen+0.003
 Grace+0.003
Neg logit contributions
aw-0.006
actus-0.006
ins-0.005
ades-0.005
opus-0.005
Tokeneed
Token ID2308
Feature activation-1.12e+00

Pos logit contributions
 pasta+0.151
aw+0.146
 Brussels+0.142
ins+0.139
olo+0.139
Neg logit contributions
tered-0.340
hered-0.333
 Hatt-0.317
ened-0.317
 as-0.313
Token.
Token ID13
Feature activation+7.72e-02

Pos logit contributions
aw+0.049
ins+0.048
olo+0.048
Dan+0.048
P+0.047
Neg logit contributions
tered-0.030
 Ellen-0.029
 Hatt-0.028
 Beth-0.027
 Grace-0.027
Token Despite
Token ID7945
Feature activation+8.43e-01

Pos logit contributions
 Ellen+0.002
tered+0.002
 Grace+0.002
 Beth+0.002
 Hatt+0.002
Neg logit contributions
aw-0.003
ins-0.003
actus-0.003
 seaw-0.003
 coconut-0.003
Token the
Token ID262
Feature activation-3.57e-01

Pos logit contributions
 Ellen+0.027
tered+0.027
 Hatt+0.026
 Eleanor+0.026
 Beth+0.025
Neg logit contributions
aw-0.032
ins-0.032
 pasta-0.032
olo-0.032
 seaw-0.032
Token smell
Token ID8508
Feature activation-1.32e+00

Pos logit contributions
ins+0.013
aw+0.013
ades+0.012
ll+0.012
olo+0.011
Neg logit contributions
 care-0.012
 breath-0.012
 moment-0.012
ened-0.012
 as-0.012
Tokenate
Token ID378
Feature activation-1.23e+00

Pos logit contributions
tered+0.016
 Ellen+0.016
 Eleanor+0.015
 Hatt+0.015
ened+0.015
Neg logit contributions
 seaw-0.009
olo-0.009
 pasta-0.009
 coconut-0.008
aw-0.008
Token it
Token ID340
Feature activation+1.25e-01

Pos logit contributions
actus+0.056
olo+0.056
P+0.056
aw+0.056
 seaw+0.056
Neg logit contributions
tered-0.029
 Ellen-0.028
 as-0.027
 Eleanor-0.027
 Grace-0.026
Token with
Token ID351
Feature activation-6.82e-01

Pos logit contributions
tered+0.003
 as+0.003
 Ellen+0.003
 Hatt+0.003
 Grace+0.003
Neg logit contributions
olo-0.006
aw-0.006
ins-0.006
Dan-0.006
actus-0.006
Token shells
Token ID19679
Feature activation-3.41e+00

Pos logit contributions
ll+0.026
actus+0.026
ins+0.026
olo+0.026
aw+0.025
Neg logit contributions
tered-0.022
 as-0.022
ened-0.022
 Grace-0.021
 Beth-0.021
Token and
Token ID290
Feature activation-2.07e-01

Pos logit contributions
Dan+0.153
olo+0.152
aw+0.150
P+0.149
ins+0.149
Neg logit contributions
 as-0.089
tered-0.081
hered-0.076
ened-0.075
 Ellen-0.074
Token seaw
Token ID32085
Feature activation-7.24e+00

Pos logit contributions
actus+0.006
ins+0.006
ll+0.006
aw+0.006
olo+0.006
Neg logit contributions
tered-0.008
 as-0.008
ened-0.008
 Hatt-0.008
 Ellen-0.008
Tokeneed
Token ID2308
Feature activation-1.25e+00

Pos logit contributions
 pasta+0.138
 coconut+0.138
 Brussels+0.137
olo+0.136
rou+0.134
Neg logit contributions
tered-0.341
hered-0.327
 as-0.325
 Hatt-0.324
ened-0.324
Token.
Token ID13
Feature activation-3.12e-01

Pos logit contributions
olo+0.051
aw+0.051
Dan+0.051
P+0.050
 coconut+0.050
Neg logit contributions
 as-0.035
tered-0.035
ened-0.033
hered-0.032
 Ellen-0.032
Token They
Token ID1119
Feature activation+4.06e-01

Pos logit contributions
aw+0.010
 pasta+0.010
olo+0.010
Dan+0.010
ins+0.010
Neg logit contributions
 Ellen-0.012
tered-0.012
ened-0.011
 Hatt-0.011
hered-0.011
Token find
Token ID1064
Feature activation-1.39e+00

Pos logit contributions
tered+0.011
 Hatt+0.011
 Ellen+0.011
 Eleanor+0.010
 Grace+0.010
Neg logit contributions
aw-0.018
ins-0.018
actus-0.017
olo-0.017
 pasta-0.017
Token a
Token ID257
Feature activation+1.00e+00

Pos logit contributions
aw+0.049
olo+0.049
ins+0.049
 pasta+0.049
 seaw+0.048
Neg logit contributions
 Ellen-0.048
tered-0.048
 Hatt-0.047
 Eleanor-0.046
 Beth-0.045
Token started
Token ID2067
Feature activation-1.15e-01

Pos logit contributions
tered+0.000
 Ellen+0.000
 Grace+0.000
ened+0.000
 Eleanor+0.000
Neg logit contributions
ins-0.000
aw-0.000
olo-0.000
actus-0.000
 pasta-0.000
Token to
Token ID284
Feature activation-3.99e-01

Pos logit contributions
actus+0.005
aw+0.005
olo+0.005
iles+0.005
P+0.005
Neg logit contributions
 as-0.003
 breath-0.003
ened-0.003
hered-0.003
 Beth-0.003
Token chew
Token ID34722
Feature activation-1.19e+00

Pos logit contributions
aw+0.017
ins+0.017
olo+0.016
actus+0.016
 pasta+0.016
Neg logit contributions
tered-0.011
 Ellen-0.011
 Grace-0.011
 Beth-0.011
 Eleanor-0.011
Token on
Token ID319
Feature activation+1.44e+00

Pos logit contributions
ins+0.048
olo+0.048
aw+0.048
 pasta+0.047
actus+0.047
Neg logit contributions
tered-0.036
 Ellen-0.034
 Hatt-0.033
 Eleanor-0.033
 Grace-0.032
Token some
Token ID617
Feature activation+2.53e-01

Pos logit contributions
tered+0.036
 Ellen+0.035
 Grace+0.034
 as+0.034
 Beth+0.033
Neg logit contributions
ins-0.067
ll-0.067
actus-0.066
olo-0.064
P-0.063
Token seaw
Token ID32085
Feature activation-7.20e+00

Pos logit contributions
 as+0.010
ened+0.009
tered+0.009
 breath+0.009
 throughout+0.009
Neg logit contributions
Dan-0.008
olo-0.008
ich-0.008
ll-0.008
rou-0.008
Tokeneed
Token ID2308
Feature activation-1.22e+00

Pos logit contributions
 Brussels+0.173
 pasta+0.169
aw+0.168
rou+0.167
 microsc+0.163
Neg logit contributions
tered-0.300
hered-0.287
 as-0.285
 breath-0.269
ened-0.268
Token.
Token ID13
Feature activation-2.20e-01

Pos logit contributions
Dan+0.052
olo+0.052
aw+0.051
P+0.051
ins+0.051
Neg logit contributions
 as-0.034
tered-0.033
ened-0.031
 throughout-0.030
hered-0.030
Token Mary
Token ID5335
Feature activation+2.58e-01

Pos logit contributions
 seaw+0.008
ins+0.008
 pasta+0.008
olo+0.008
aw+0.008
Neg logit contributions
 Ellen-0.007
tered-0.007
 Eleanor-0.007
 Hatt-0.007
 Beth-0.007
Token was
Token ID373
Feature activation+7.88e-02

Pos logit contributions
tered+0.007
ened+0.007
 as+0.007
 Grace+0.006
 Ellen+0.006
Neg logit contributions
ins-0.011
ades-0.011
 pasta-0.011
actus-0.011
Dan-0.011
Token so
Token ID523
Feature activation-1.65e+00

Pos logit contributions
 Ellen+0.003
tered+0.003
 Hatt+0.002
 Eleanor+0.002
 Beth+0.002
Neg logit contributions
aw-0.003
 pasta-0.003
olo-0.003
ins-0.003
 seaw-0.003
Token Like
Token ID4525
Feature activation+5.90e-01

Pos logit contributions
 Ellen+0.020
 Hatt+0.019
 Eleanor+0.019
 Beth+0.019
 Grace+0.019
Neg logit contributions
ins-0.022
actus-0.022
 seaw-0.021
 coconut-0.021
aw-0.021
Token shells
Token ID19679
Feature activation-2.95e+00

Pos logit contributions
 as+0.018
 Ellen+0.017
tered+0.017
 Grace+0.017
 Beth+0.016
Neg logit contributions
ins-0.024
aw-0.024
actus-0.024
olo-0.023
Dan-0.023
Token and
Token ID290
Feature activation+4.51e-01

Pos logit contributions
olo+0.144
aw+0.144
 coconut+0.143
Dan+0.143
 pasta+0.143
Neg logit contributions
tered-0.059
 Ellen-0.059
 as-0.056
 Hatt-0.055
 Grace-0.054
Token rocks
Token ID12586
Feature activation-7.01e-01

Pos logit contributions
 as+0.019
 Grace+0.018
 care+0.018
ened+0.018
tered+0.018
Neg logit contributions
actus-0.013
ins-0.013
olo-0.013
ll-0.012
aw-0.012
Token and
Token ID290
Feature activation+3.78e-01

Pos logit contributions
olo+0.031
aw+0.030
 pasta+0.030
Dan+0.030
 coconut+0.030
Neg logit contributions
tered-0.018
 Ellen-0.018
 as-0.017
 Hatt-0.017
hered-0.016
Token seaw
Token ID32085
Feature activation-7.18e+00

Pos logit contributions
 as+0.015
tered+0.015
 Grace+0.015
ened+0.015
 Ellen+0.014
Neg logit contributions
olo-0.011
ins-0.011
actus-0.011
ll-0.011
Dan-0.011
Tokeneed
Token ID2308
Feature activation-7.62e-01

Pos logit contributions
rou+0.153
Dan+0.150
 coconut+0.150
 Brussels+0.149
 pasta+0.147
Neg logit contributions
tered-0.316
 as-0.306
hered-0.302
ened-0.302
 Hatt-0.293
Token."
Token ID526
Feature activation-8.53e-01

Pos logit contributions
Dan+0.032
olo+0.032
aw+0.032
 coconut+0.032
 pasta+0.031
Neg logit contributions
tered-0.021
 as-0.020
 Ellen-0.019
ened-0.019
 Grace-0.019
Token\n
Token ID198
Feature activation-4.20e-01

Pos logit contributions
ins+0.033
 seaw+0.033
aw+0.033
actus+0.033
 coconut+0.032
Neg logit contributions
 Ellen-0.029
 Grace-0.029
 Hatt-0.028
 Beth-0.028
 Eleanor-0.028
Token\n
Token ID198
Feature activation-3.86e-01

Pos logit contributions
 seaw+0.018
 pasta+0.018
sters+0.017
 Brussels+0.017
 coconut+0.017
Neg logit contributions
 Ellen-0.013
 Eleanor-0.013
 Grace-0.013
 Beth-0.012
 Hatt-0.012
TokenM
Token ID44
Feature activation-8.96e-01

Pos logit contributions
 pasta+0.015
 seaw+0.014
oni+0.014
actus+0.014
ins+0.014
Neg logit contributions
 Ellen-0.014
 Eleanor-0.014
tered-0.013
 Hatt-0.013
 Grace-0.013
Token found
Token ID1043
Feature activation-1.19e+00

Pos logit contributions
ins+0.021
aw+0.021
 pasta+0.021
olo+0.021
 seaw+0.020
Neg logit contributions
tered-0.019
 Ellen-0.019
 Hatt-0.018
 Eleanor-0.018
 Beth-0.017
Token it
Token ID340
Feature activation+3.07e-01

Pos logit contributions
ins+0.053
olo+0.052
actus+0.051
 pasta+0.050
aw+0.050
Neg logit contributions
tered-0.033
 Ellen-0.033
 Beth-0.031
 Hatt-0.031
 Eleanor-0.030
Token stuck
Token ID7819
Feature activation-5.49e-01

Pos logit contributions
tered+0.009
 Ellen+0.008
 Hatt+0.008
 Beth+0.008
hered+0.008
Neg logit contributions
olo-0.013
aw-0.013
ins-0.013
Dan-0.013
 pasta-0.013
Token in
Token ID287
Feature activation-5.64e-01

Pos logit contributions
aw+0.023
 pasta+0.023
 seaw+0.023
ins+0.022
olo+0.022
Neg logit contributions
 Ellen-0.016
tered-0.016
 Hatt-0.015
 Eleanor-0.015
 Beth-0.015
Token some
Token ID617
Feature activation+2.41e-01

Pos logit contributions
aw+0.024
ins+0.024
olo+0.023
 pasta+0.023
actus+0.023
Neg logit contributions
 Ellen-0.017
tered-0.016
 Hatt-0.016
 Eleanor-0.016
 Beth-0.015
Token seaw
Token ID32085
Feature activation-7.18e+00

Pos logit contributions
tered+0.009
 Ellen+0.009
ened+0.008
 as+0.008
 Grace+0.008
Neg logit contributions
olo-0.008
ins-0.008
ll-0.008
Dan-0.008
oni-0.008
Tokeneed
Token ID2308
Feature activation-1.21e+00

Pos logit contributions
ins+0.167
 pasta+0.163
 Brussels+0.160
olo+0.158
rou+0.152
Neg logit contributions
tered-0.314
hered-0.303
 as-0.299
 Hatt-0.287
ened-0.285
Token.
Token ID13
Feature activation-4.00e-01

Pos logit contributions
olo+0.053
Dan+0.053
ins+0.052
aw+0.052
P+0.052
Neg logit contributions
tered-0.032
 as-0.032
 Ellen-0.030
hered-0.029
ened-0.028
Token Tim
Token ID5045
Feature activation-1.52e+00

Pos logit contributions
ins+0.016
 seaw+0.016
 pasta+0.016
olo+0.016
aw+0.016
Neg logit contributions
 Ellen-0.013
tered-0.013
 Beth-0.013
 Eleanor-0.013
 Grace-0.012
Token was
Token ID373
Feature activation+5.72e-02

Pos logit contributions
aw+0.059
olo+0.058
 pasta+0.058
ins+0.058
 seaw+0.057
Neg logit contributions
tered-0.049
 Ellen-0.048
 Hatt-0.046
 Eleanor-0.046
 Beth-0.045
Token so
Token ID523
Feature activation-1.70e+00

Pos logit contributions
 Ellen+0.002
tered+0.002
 Hatt+0.002
 Eleanor+0.002
ened+0.002
Neg logit contributions
 pasta-0.002
aw-0.002
 coconut-0.002
 greedy-0.002
ins-0.001
TokenThe
Token ID464
Feature activation-3.02e-01

Pos logit contributions
 pasta+0.008
 seaw+0.008
actus+0.007
 coconut+0.007
aw+0.007
Neg logit contributions
 Ellen-0.008
tered-0.008
 Eleanor-0.008
 Hatt-0.008
 Beth-0.007
Token little
Token ID1310
Feature activation+6.13e-01

Pos logit contributions
 pasta+0.010
 seaw+0.009
aw+0.009
olo+0.009
 oy+0.009
Neg logit contributions
 Ellen-0.011
hered-0.011
tered-0.011
 Eleanor-0.011
ened-0.011
Token girl
Token ID2576
Feature activation+1.92e+00

Pos logit contributions
tered+0.022
oned+0.021
 Ellen+0.020
hered+0.020
 Hatt+0.019
Neg logit contributions
 pasta-0.021
aw-0.021
 oy-0.021
 seaw-0.021
 crabs-0.021
Token and
Token ID290
Feature activation+2.28e-01

Pos logit contributions
 Ellen+0.068
tered+0.067
 Hatt+0.065
 Eleanor+0.064
 Beth+0.063
Neg logit contributions
aw-0.068
 pasta-0.066
olo-0.066
ins-0.066
 seaw-0.066
Token her
Token ID607
Feature activation+7.77e-01

Pos logit contributions
tered+0.008
oned+0.008
hered+0.008
 Hatt+0.007
 Eleanor+0.007
Neg logit contributions
aw-0.007
olo-0.007
 pasta-0.007
 oy-0.007
Dan-0.007
Token grandparents
Token ID28571
Feature activation+3.43e+00

Pos logit contributions
oned+0.034
hered+0.034
tered+0.034
tail+0.032
 Eleanor+0.031
Neg logit contributions
 pasta-0.020
actus-0.018
rou-0.018
olo-0.018
ll-0.017
Token spent
Token ID3377
Feature activation-8.54e-03

Pos logit contributions
 Ellen+0.132
 Eleanor+0.127
tered+0.127
 Hatt+0.124
 Beth+0.117
Neg logit contributions
aw-0.113
olo-0.104
 pasta-0.102
ll-0.102
 seaw-0.101
Token the
Token ID262
Feature activation-1.96e-01

Pos logit contributions
aw+0.000
 pasta+0.000
P+0.000
ins+0.000
 crabs+0.000
Neg logit contributions
 Hatt-0.000
 Ellen-0.000
tered-0.000
 Eleanor-0.000
uster-0.000
Token whole
Token ID2187
Feature activation-1.20e-01

Pos logit contributions
 pasta+0.007
P+0.007
aw+0.006
ll+0.006
 seaw+0.006
Neg logit contributions
 Hatt-0.007
 Ellen-0.007
ened-0.007
 Eleanor-0.007
tered-0.007
Token day
Token ID1110
Feature activation+1.40e+00

Pos logit contributions
P+0.004
aw+0.004
 pasta+0.004
eh+0.004
 seaw+0.004
Neg logit contributions
 Hatt-0.004
 Ellen-0.004
 Eleanor-0.004
tered-0.004
 Beth-0.004
Token playing
Token ID2712
Feature activation+2.23e-02

Pos logit contributions
 Ellen+0.049
tered+0.048
 Eleanor+0.047
 Hatt+0.047
ened+0.045
Neg logit contributions
aw-0.052
ins-0.051
 pasta-0.050
 seaw-0.048
olo-0.048
Token wider
Token ID10595
Feature activation+9.58e-01

Pos logit contributions
tered+0.044
 Ellen+0.043
 Beth+0.043
 Hatt+0.042
 Eleanor+0.042
Neg logit contributions
olo-0.062
actus-0.060
ins-0.060
aw-0.060
 pasta-0.059
Token and
Token ID290
Feature activation+9.56e-03

Pos logit contributions
 as+0.021
tered+0.019
ened+0.019
 Ellen+0.018
 Grace+0.017
Neg logit contributions
olo-0.047
ins-0.046
 pasta-0.046
P-0.046
actus-0.046
Token wider
Token ID10595
Feature activation+9.62e-01

Pos logit contributions
 Ellen+0.000
tered+0.000
 Hatt+0.000
 Eleanor+0.000
 Beth+0.000
Neg logit contributions
aw-0.000
 pasta-0.000
ins-0.000
olo-0.000
 seaw-0.000
Token.
Token ID13
Feature activation-6.75e-02

Pos logit contributions
 as+0.025
tered+0.025
ened+0.024
 Ellen+0.024
 Grace+0.023
Neg logit contributions
olo-0.042
ins-0.042
 pasta-0.042
aw-0.041
actus-0.041
Token The
Token ID383
Feature activation+1.43e-01

Pos logit contributions
ins+0.003
 seaw+0.003
aw+0.003
 pasta+0.003
olo+0.003
Neg logit contributions
 Ellen-0.002
tered-0.002
 Beth-0.002
 Hatt-0.002
 Eleanor-0.002
Token wind
Token ID2344
Feature activation+2.78e+00

Pos logit contributions
 Ellen+0.005
tered+0.005
 Hatt+0.005
 Grace+0.005
 Eleanor+0.005
Neg logit contributions
olo-0.005
ins-0.005
aw-0.005
 pasta-0.005
Dan-0.005
Token was
Token ID373
Feature activation+1.86e-01

Pos logit contributions
 Ellen+0.091
tered+0.090
 Hatt+0.088
 Eleanor+0.086
 Beth+0.084
Neg logit contributions
aw-0.106
 pasta-0.104
ins-0.104
olo-0.104
 seaw-0.104
Token roaring
Token ID43170
Feature activation+1.82e-01

Pos logit contributions
 Ellen+0.007
 Hatt+0.007
tered+0.006
 Eleanor+0.006
 Grace+0.006
Neg logit contributions
aw-0.007
 pasta-0.006
 seaw-0.006
olo-0.006
Dan-0.006
Token and
Token ID290
Feature activation+5.40e-02

Pos logit contributions
 Hatt+0.006
 Ellen+0.006
 Eleanor+0.006
 Beth+0.005
 Grace+0.005
Neg logit contributions
aw-0.007
 seaw-0.007
 pasta-0.007
olo-0.007
oni-0.007
Token it
Token ID340
Feature activation+4.34e-01

Pos logit contributions
 Ellen+0.002
tered+0.002
 Hatt+0.002
 Eleanor+0.002
 Beth+0.002
Neg logit contributions
aw-0.002
 pasta-0.002
 seaw-0.002
olo-0.002
ins-0.002
Token felt
Token ID2936
Feature activation+3.69e-01

Pos logit contributions
 Ellen+0.014
tered+0.014
 Hatt+0.014
 Eleanor+0.013
 Beth+0.013
Neg logit contributions
aw-0.017
 pasta-0.016
ins-0.016
olo-0.016
 seaw-0.016
Token thanked
Token ID26280
Feature activation+5.08e-01

Pos logit contributions
tered+0.021
 Ellen+0.020
ened+0.020
 Hatt+0.019
 Beth+0.019
Neg logit contributions
 pasta-0.029
ins-0.029
olo-0.028
 seaw-0.028
 coconut-0.028
Token the
Token ID262
Feature activation+1.02e-01

Pos logit contributions
tered+0.018
 Ellen+0.018
 Hatt+0.017
hered+0.017
ened+0.017
Neg logit contributions
 pasta-0.018
olo-0.018
 seaw-0.017
aw-0.017
Dan-0.017
Token little
Token ID1310
Feature activation+8.54e-01

Pos logit contributions
 Ellen+0.003
tered+0.003
 Hatt+0.003
 Eleanor+0.003
 Grace+0.003
Neg logit contributions
aw-0.004
ins-0.004
olo-0.004
 pasta-0.004
 seaw-0.004
Token one
Token ID530
Feature activation+1.72e+00

Pos logit contributions
tered+0.029
 Ellen+0.028
 Hatt+0.027
 Eleanor+0.026
hered+0.025
Neg logit contributions
 seaw-0.033
 pasta-0.032
aw-0.032
 coconut-0.031
ins-0.031
Token for
Token ID329
Feature activation+1.29e+00

Pos logit contributions
 Ellen+0.050
tered+0.049
 Hatt+0.048
 Eleanor+0.047
 Beth+0.046
Neg logit contributions
aw-0.073
 pasta-0.073
 seaw-0.072
olo-0.071
ins-0.071
Token being
Token ID852
Feature activation+2.91e+00

Pos logit contributions
 Ellen+0.051
tered+0.050
 Hatt+0.050
 Eleanor+0.049
ened+0.048
Neg logit contributions
 pasta-0.040
aw-0.040
 seaw-0.040
olo-0.039
ins-0.039
Token his
Token ID465
Feature activation+1.32e-01

Pos logit contributions
 Ellen+0.092
tered+0.087
 Beth+0.085
 Grace+0.084
 Eleanor+0.084
Neg logit contributions
actus-0.116
olo-0.114
 seaw-0.114
 coconut-0.114
 pasta-0.113
Token friend
Token ID1545
Feature activation+3.23e-01

Pos logit contributions
 Ellen+0.004
tered+0.004
 Hatt+0.004
 Eleanor+0.004
 Beth+0.004
Neg logit contributions
aw-0.005
 pasta-0.005
ins-0.005
olo-0.005
 seaw-0.005
Token and
Token ID290
Feature activation+2.96e-01

Pos logit contributions
 Ellen+0.011
tered+0.010
 Hatt+0.010
 Eleanor+0.010
 Beth+0.010
Neg logit contributions
ins-0.012
 pasta-0.012
 seaw-0.012
aw-0.012
olo-0.012
Token gave
Token ID2921
Feature activation+5.48e-01

Pos logit contributions
 Ellen+0.010
tered+0.010
 Hatt+0.009
 Eleanor+0.009
 Beth+0.009
Neg logit contributions
aw-0.011
 pasta-0.011
ins-0.011
olo-0.011
 seaw-0.011
Token them
Token ID606
Feature activation-1.01e-01

Pos logit contributions
tered+0.021
 Ellen+0.021
 Hatt+0.021
 Eleanor+0.020
hered+0.020
Neg logit contributions
aw-0.018
 pasta-0.017
 seaw-0.017
ins-0.017
olo-0.017
Token so
Token ID523
Feature activation-1.23e+00

Pos logit contributions
 Ellen+0.018
tered+0.018
 Hatt+0.018
 Eleanor+0.018
 Grace+0.017
Neg logit contributions
aw-0.017
ins-0.016
 pasta-0.016
olo-0.015
 coconut-0.015
Token sad
Token ID6507
Feature activation+1.07e+00

Pos logit contributions
 pasta+0.045
 seaw+0.045
olo+0.045
aw+0.045
ins+0.044
Neg logit contributions
tered-0.042
 Ellen-0.042
 Hatt-0.040
 Eleanor-0.040
 Beth-0.039
Token they
Token ID484
Feature activation-1.85e-01

Pos logit contributions
 Ellen+0.035
tered+0.035
 Hatt+0.034
 Eleanor+0.033
 Beth+0.032
Neg logit contributions
aw-0.041
 pasta-0.040
ins-0.040
olo-0.040
 seaw-0.039
Token started
Token ID2067
Feature activation+2.04e-01

Pos logit contributions
aw+0.007
 pasta+0.007
olo+0.007
 seaw+0.007
ins+0.007
Neg logit contributions
 Ellen-0.006
tered-0.006
 Hatt-0.006
 Eleanor-0.006
 Beth-0.006
Token to
Token ID284
Feature activation+4.99e-03

Pos logit contributions
 Ellen+0.007
tered+0.007
 Beth+0.006
 Hatt+0.006
 Eleanor+0.006
Neg logit contributions
aw-0.008
olo-0.008
ins-0.008
 seaw-0.008
 pasta-0.008
Token cry
Token ID3960
Feature activation+2.74e+00

Pos logit contributions
 Ellen+0.000
tered+0.000
 Hatt+0.000
 Eleanor+0.000
 Beth+0.000
Neg logit contributions
aw-0.000
 pasta-0.000
 seaw-0.000
olo-0.000
ins-0.000
Token.
Token ID13
Feature activation+6.31e-01

Pos logit contributions
 Ellen+0.121
 Hatt+0.119
 Eleanor+0.117
tered+0.112
ened+0.112
Neg logit contributions
aw-0.075
 pasta-0.071
ins-0.069
oni-0.064
olo-0.064
Token\n
Token ID198
Feature activation+1.96e-01

Pos logit contributions
 Ellen+0.017
tered+0.017
 Beth+0.016
 Grace+0.016
 Eleanor+0.015
Neg logit contributions
 seaw-0.029
ins-0.029
actus-0.029
 pasta-0.029
aw-0.028
Token\n
Token ID198
Feature activation+1.90e-01

Pos logit contributions
 Ellen+0.006
 Beth+0.006
tered+0.006
 Eleanor+0.006
 Hatt+0.006
Neg logit contributions
 seaw-0.008
 pasta-0.008
actus-0.008
ins-0.008
aw-0.008
TokenSuddenly
Token ID38582
Feature activation+1.49e+00

Pos logit contributions
tered+0.006
 Ellen+0.006
 Beth+0.006
oned+0.006
 Eleanor+0.006
Neg logit contributions
 pasta-0.008
 seaw-0.008
actus-0.008
 mac-0.007
ins-0.007
Token,
Token ID11
Feature activation+7.49e-01

Pos logit contributions
tered+0.040
 Hatt+0.039
 Eleanor+0.038
 Ellen+0.038
ened+0.035
Neg logit contributions
aw-0.067
 pasta-0.065
 seaw-0.065
olo-0.064
Dan-0.064
Token just
Token ID655
Feature activation-1.73e+00

Pos logit contributions
ins+0.038
actus+0.036
 seaw+0.035
olo+0.035
ades+0.035
Neg logit contributions
 Ellen-0.029
 Beth-0.027
 Grace-0.027
tered-0.026
 as-0.026
Token me
Token ID502
Feature activation+8.47e-01

Pos logit contributions
ins+0.078
olo+0.074
actus+0.074
ades+0.074
aw+0.073
Neg logit contributions
 Ellen-0.049
tered-0.047
 Grace-0.047
 Beth-0.046
 Hatt-0.045
Token and
Token ID290
Feature activation+1.36e-01

Pos logit contributions
 Ellen+0.020
tered+0.019
 Beth+0.018
 Hatt+0.018
 Eleanor+0.018
Neg logit contributions
aw-0.040
 seaw-0.040
ins-0.040
 pasta-0.040
olo-0.039
Token Mr
Token ID1770
Feature activation-3.81e-01

Pos logit contributions
 Ellen+0.006
tered+0.005
 Grace+0.005
 Beth+0.005
 Hatt+0.005
Neg logit contributions
ins-0.004
aw-0.004
olo-0.004
actus-0.004
Dan-0.004
Token.
Token ID13
Feature activation+3.33e-01

Pos logit contributions
aw+0.014
 pasta+0.014
 seaw+0.014
ins+0.014
olo+0.014
Neg logit contributions
 Ellen-0.013
tered-0.013
 Hatt-0.013
 Eleanor-0.012
 Beth-0.012
Token Sn
Token ID5489
Feature activation+3.20e+00

Pos logit contributions
 Ellen+0.008
tered+0.008
 Hatt+0.008
 Eleanor+0.008
 Beth+0.008
Neg logit contributions
aw-0.015
 pasta-0.015
ins-0.015
olo-0.015
 seaw-0.015
Tokenugg
Token ID6837
Feature activation+8.86e-01

Pos logit contributions
tered+0.105
 Hatt+0.104
 Eleanor+0.102
 Ellen+0.102
hered+0.100
Neg logit contributions
 pasta-0.118
 seaw-0.118
 coconut-0.115
ins-0.115
aw-0.113
Tokenles
Token ID829
Feature activation-6.61e-01

Pos logit contributions
 Ellen+0.027
tered+0.027
 Hatt+0.026
 Eleanor+0.025
 Beth+0.025
Neg logit contributions
aw-0.036
 pasta-0.036
olo-0.035
ins-0.035
 seaw-0.035
Token.
Token ID13
Feature activation+3.54e-01

Pos logit contributions
 coconut+0.025
olo+0.025
ins+0.025
Dan+0.025
 seaw+0.024
Neg logit contributions
tered-0.022
 Ellen-0.021
 Grace-0.020
 as-0.020
 Beth-0.020
Token We
Token ID775
Feature activation+6.52e-01

Pos logit contributions
 Ellen+0.014
tered+0.014
 Hatt+0.014
 Beth+0.014
 Grace+0.014
Neg logit contributions
ins-0.011
aw-0.011
 seaw-0.011
 pasta-0.011
olo-0.011
Token're
Token ID821
Feature activation+3.43e-01

Pos logit contributions
 Ellen+0.021
tered+0.021
 Hatt+0.020
 Eleanor+0.020
 Beth+0.019
Neg logit contributions
aw-0.025
 pasta-0.025
ins-0.025
olo-0.025
 seaw-0.025
Token you
Token ID345
Feature activation-3.87e-01

Pos logit contributions
actus+0.013
ins+0.012
ll+0.012
Dan+0.012
olo+0.012
Neg logit contributions
 as-0.013
 Grace-0.013
 Ellen-0.013
 Beth-0.012
tered-0.012
Token,
Token ID11
Feature activation+4.66e-01

Pos logit contributions
aw+0.017
 seaw+0.017
ins+0.017
olo+0.017
 coconut+0.017
Neg logit contributions
tered-0.010
 Ellen-0.010
 Grace-0.009
 Eleanor-0.009
 as-0.009
Token Uncle
Token ID23169
Feature activation+8.68e-01

Pos logit contributions
 Ellen+0.014
 Beth+0.013
 Grace+0.013
 Eleanor+0.012
 as+0.012
Neg logit contributions
ins-0.021
aw-0.020
actus-0.020
P-0.020
ades-0.020
Token Jack
Token ID3619
Feature activation-1.64e+00

Pos logit contributions
 Ellen+0.028
 Beth+0.027
 Grace+0.027
 as+0.026
 Eleanor+0.026
Neg logit contributions
 seaw-0.034
 coconut-0.034
actus-0.033
ades-0.033
ins-0.033
Token!"
Token ID2474
Feature activation-6.80e-01

Pos logit contributions
 coconut+0.064
 seaw+0.064
aw+0.063
 pasta+0.063
ades+0.062
Neg logit contributions
 Ellen-0.051
tered-0.049
 Grace-0.049
 as-0.048
 Beth-0.048
Token Uncle
Token ID23169
Feature activation+7.84e-01

Pos logit contributions
ins+0.029
 seaw+0.029
rou+0.028
 coconut+0.028
actus+0.028
Neg logit contributions
 Ellen-0.020
tered-0.019
 Beth-0.019
 as-0.018
 Grace-0.018
Token Jack
Token ID3619
Feature activation-1.73e+00

Pos logit contributions
 Ellen+0.027
tered+0.026
 Beth+0.026
 Eleanor+0.025
 Grace+0.025
Neg logit contributions
ins-0.029
 seaw-0.029
aw-0.029
 pasta-0.028
 coconut-0.028
Token smiled
Token ID13541
Feature activation-1.18e+00

Pos logit contributions
aw+0.062
olo+0.062
 pasta+0.062
 seaw+0.061
ins+0.061
Neg logit contributions
 Ellen-0.060
tered-0.059
 Hatt-0.058
 Eleanor-0.057
 Beth-0.056
Token and
Token ID290
Feature activation+1.18e-03

Pos logit contributions
aw+0.042
 pasta+0.040
ins+0.038
 seaw+0.038
olo+0.038
Neg logit contributions
 Ellen-0.045
 Hatt-0.045
 Eleanor-0.044
tered-0.042
 Beth-0.042
Token said
Token ID531
Feature activation+2.71e-01

Pos logit contributions
 Ellen+0.000
tered+0.000
 Hatt+0.000
 Eleanor+0.000
 Beth+0.000
Neg logit contributions
aw-0.000
 pasta-0.000
ins-0.000
olo-0.000
 seaw-0.000
Token,
Token ID11
Feature activation+3.21e-01

Pos logit contributions
 Ellen+0.006
 Hatt+0.006
tered+0.006
 Eleanor+0.006
 Beth+0.006
Neg logit contributions
aw-0.013
 pasta-0.013
ins-0.013
 seaw-0.013
olo-0.013
Token noticed
Token ID6810
Feature activation-9.67e-01

Pos logit contributions
iles+0.032
aw+0.031
olo+0.031
 coconut+0.031
ins+0.031
Neg logit contributions
tered-0.019
 Grace-0.019
ened-0.018
 Ellen-0.018
 Beth-0.018
Token the
Token ID262
Feature activation-3.66e-01

Pos logit contributions
ins+0.038
olo+0.038
actus+0.037
 seaw+0.037
aw+0.037
Neg logit contributions
 Ellen-0.030
tered-0.029
 Beth-0.029
 Hatt-0.028
 Grace-0.028
Token same
Token ID976
Feature activation-1.25e-01

Pos logit contributions
aw+0.012
ins+0.012
olo+0.012
Dan+0.012
iles+0.011
Neg logit contributions
tered-0.014
ened-0.013
 Ellen-0.013
 as-0.013
 Hatt-0.013
Token shiny
Token ID22441
Feature activation-3.72e-01

Pos logit contributions
olo+0.004
aw+0.004
ins+0.004
Dan+0.004
ll+0.004
Neg logit contributions
tered-0.005
 Ellen-0.005
ened-0.005
 Hatt-0.005
 Eleanor-0.004
Token object
Token ID2134
Feature activation-6.83e-01

Pos logit contributions
aw+0.010
olo+0.010
ins+0.010
Dan+0.009
 seaw+0.009
Neg logit contributions
tered-0.016
 Ellen-0.016
 Hatt-0.016
ened-0.016
 Eleanor-0.015
Token.
Token ID13
Feature activation+2.49e-01

Pos logit contributions
aw+0.028
ins+0.027
olo+0.027
 pasta+0.027
 seaw+0.027
Neg logit contributions
 Ellen-0.021
tered-0.021
 Hatt-0.020
 Eleanor-0.019
 Beth-0.019
Token She
Token ID1375
Feature activation+1.09e+00

Pos logit contributions
 Ellen+0.005
 Beth+0.005
 Grace+0.005
tered+0.005
 Hatt+0.005
Neg logit contributions
ins-0.013
iles-0.013
 seaw-0.013
actus-0.013
ll-0.013
Token quickly
Token ID2952
Feature activation-5.67e-01

Pos logit contributions
tered+0.033
 Ellen+0.032
ened+0.031
 Grace+0.030
 Hatt+0.030
Neg logit contributions
ins-0.045
olo-0.044
 pasta-0.044
aw-0.044
 seaw-0.043
Token realized
Token ID6939
Feature activation-2.47e+00

Pos logit contributions
ins+0.022
olo+0.022
aw+0.022
 pasta+0.021
 seaw+0.021
Neg logit contributions
tered-0.018
 Ellen-0.018
 Hatt-0.017
ened-0.017
 Eleanor-0.017
Token that
Token ID326
Feature activation-7.61e-02

Pos logit contributions
ins+0.104
olo+0.103
 seaw+0.103
aw+0.102
 pasta+0.101
Neg logit contributions
 Ellen-0.073
tered-0.070
 Hatt-0.068
 Beth-0.068
 Eleanor-0.067
Token it
Token ID340
Feature activation+3.73e-01

Pos logit contributions
aw+0.003
ins+0.003
olo+0.003
 seaw+0.003
 pasta+0.003
Neg logit contributions
 Ellen-0.002
tered-0.002
 Hatt-0.002
 Eleanor-0.002
 Beth-0.002
Token couldn
Token ID3521
Feature activation+2.60e-01

Pos logit contributions
 pasta+0.049
aw+0.049
olo+0.049
 seaw+0.048
ins+0.048
Neg logit contributions
tered-0.046
 Ellen-0.045
 Hatt-0.043
 Eleanor-0.043
ened-0.042
Token't
Token ID470
Feature activation-7.01e-01

Pos logit contributions
 Ellen+0.008
tered+0.008
 Hatt+0.008
 Eleanor+0.008
 Beth+0.008
Neg logit contributions
aw-0.010
ins-0.010
 pasta-0.010
 seaw-0.010
olo-0.010
Token do
Token ID466
Feature activation-6.02e-01

Pos logit contributions
 seaw+0.028
 pasta+0.028
olo+0.027
aw+0.027
 coconut+0.027
Neg logit contributions
 Ellen-0.022
 Hatt-0.020
 Eleanor-0.020
tered-0.020
ened-0.020
Token it
Token ID340
Feature activation+7.16e-01

Pos logit contributions
aw+0.022
 pasta+0.022
ins+0.022
olo+0.022
 seaw+0.022
Neg logit contributions
 Ellen-0.020
tered-0.020
 Hatt-0.020
 Eleanor-0.019
 Beth-0.019
Token.
Token ID13
Feature activation+3.50e-01

Pos logit contributions
 Ellen+0.023
tered+0.023
 Hatt+0.022
 Eleanor+0.022
 Beth+0.021
Neg logit contributions
aw-0.028
 pasta-0.027
olo-0.027
ins-0.027
 seaw-0.027
Token His
Token ID2399
Feature activation+3.67e-01

Pos logit contributions
 Ellen+0.011
 Beth+0.011
 Eleanor+0.010
 Hatt+0.010
 Grace+0.010
Neg logit contributions
 seaw-0.015
ins-0.015
ades-0.015
 pasta-0.015
actus-0.014
Token dream
Token ID4320
Feature activation+7.73e-01

Pos logit contributions
 Ellen+0.012
tered+0.012
 Hatt+0.012
 Eleanor+0.011
 Beth+0.011
Neg logit contributions
aw-0.014
 pasta-0.014
olo-0.014
 seaw-0.014
ins-0.014
Token of
Token ID286
Feature activation+2.08e-01

Pos logit contributions
 Ellen+0.028
 Hatt+0.028
tered+0.027
 Eleanor+0.027
 Grace+0.026
Neg logit contributions
aw-0.027
 pasta-0.025
ins-0.025
olo-0.025
 seaw-0.025
Token succeeding
Token ID34195
Feature activation+4.23e-01

Pos logit contributions
 Ellen+0.007
tered+0.007
 Hatt+0.007
 Eleanor+0.007
 Beth+0.007
Neg logit contributions
aw-0.008
olo-0.008
ins-0.008
actus-0.007
 pasta-0.007
Token had
Token ID550
Feature activation+8.45e-01

Pos logit contributions
 Ellen+0.012
tered+0.012
 Hatt+0.012
 Eleanor+0.011
 Beth+0.011
Neg logit contributions
aw-0.018
 pasta-0.017
olo-0.017
ins-0.017
 seaw-0.017
Token ended
Token ID4444
Feature activation+7.71e-01

Pos logit contributions
 Ellen+0.029
tered+0.028
ened+0.027
 Beth+0.027
 Eleanor+0.026
Neg logit contributions
olo-0.032
actus-0.031
ins-0.031
aw-0.031
 pasta-0.031
TokenL
Token ID43
Feature activation+1.43e+00

Pos logit contributions
 pasta+0.001
 seaw+0.001
olo+0.001
 coconut+0.001
iles+0.001
Neg logit contributions
tered-0.001
 Ellen-0.001
 Eleanor-0.001
 Hatt-0.001
 Grace-0.001
Tokenily
Token ID813
Feature activation-8.33e-01

Pos logit contributions
tered+0.056
 Ellen+0.054
 Eleanor+0.053
 Hatt+0.053
 Grace+0.051
Neg logit contributions
 seaw-0.046
 pasta-0.045
 coconut-0.045
aw-0.045
Dan-0.045
Token thought
Token ID1807
Feature activation-7.14e-01

Pos logit contributions
aw+0.031
 pasta+0.031
olo+0.030
ins+0.030
 seaw+0.030
Neg logit contributions
 Ellen-0.028
tered-0.028
 Hatt-0.027
 Eleanor-0.027
 Beth-0.026
Token the
Token ID262
Feature activation-4.37e-01

Pos logit contributions
 pasta+0.031
aw+0.030
 seaw+0.030
olo+0.030
ins+0.029
Neg logit contributions
 Ellen-0.020
tered-0.020
 Hatt-0.019
hered-0.019
 Eleanor-0.019
Token bee
Token ID20697
Feature activation-1.16e+00

Pos logit contributions
aw+0.012
ins+0.012
olo+0.012
 pasta+0.012
 seaw+0.012
Neg logit contributions
tered-0.018
 Ellen-0.018
 Grace-0.018
 Eleanor-0.018
 Hatt-0.018
Token looked
Token ID3114
Feature activation-1.71e-01

Pos logit contributions
aw+0.038
 pasta+0.037
 seaw+0.037
ins+0.037
olo+0.036
Neg logit contributions
 Ellen-0.044
 Hatt-0.043
tered-0.043
 Eleanor-0.042
 Beth-0.041
Token heavy
Token ID4334
Feature activation+2.44e-01

Pos logit contributions
olo+0.007
Dan+0.007
ins+0.007
 pasta+0.006
 coconut+0.006
Neg logit contributions
tered-0.006
 Ellen-0.005
 as-0.005
 Grace-0.005
ened-0.005
Token because
Token ID780
Feature activation+1.77e+00

Pos logit contributions
 Ellen+0.009
 Eleanor+0.009
 Hatt+0.009
tered+0.008
 Beth+0.008
Neg logit contributions
 pasta-0.009
aw-0.009
 seaw-0.009
ins-0.008
oni-0.008
Token it
Token ID340
Feature activation+2.96e-01

Pos logit contributions
 Ellen+0.048
tered+0.048
 Hatt+0.045
 Eleanor+0.045
 Grace+0.044
Neg logit contributions
aw-0.078
olo-0.077
ins-0.077
 seaw-0.076
 pasta-0.076
Token was
Token ID373
Feature activation+5.50e-02

Pos logit contributions
 Ellen+0.012
tered+0.011
 Hatt+0.011
 Eleanor+0.011
hered+0.011
Neg logit contributions
 seaw-0.009
 pasta-0.009
aw-0.009
ins-0.009
olo-0.009
Token carrying
Token ID6872
Feature activation-6.80e-01

Pos logit contributions
tered+0.002
 Grace+0.002
 Ellen+0.002
 as+0.002
 Beth+0.002
Neg logit contributions
aw-0.002
ins-0.002
olo-0.002
Dan-0.002
ades-0.002
Token across
Token ID1973
Feature activation+1.79e+00

Pos logit contributions
aw+0.096
 seaw+0.092
ins+0.090
 crabs+0.089
 pasta+0.089
Neg logit contributions
tered-0.068
 Ellen-0.067
 Hatt-0.066
 Eleanor-0.063
 Beth-0.060
Token the
Token ID262
Feature activation-6.50e-01

Pos logit contributions
tered+0.059
 Ellen+0.057
 Hatt+0.056
 Eleanor+0.054
ened+0.053
Neg logit contributions
aw-0.069
 pasta-0.069
 seaw-0.068
olo-0.067
Dan-0.066
Token lake
Token ID13546
Feature activation-1.68e+00

Pos logit contributions
ins+0.019
olo+0.019
oni+0.018
iles+0.017
 pasta+0.017
Neg logit contributions
 Ellen-0.027
 Eleanor-0.026
tered-0.026
 Grace-0.026
 Beth-0.025
Token she
Token ID673
Feature activation-6.70e-01

Pos logit contributions
aw+0.081
 pasta+0.080
 seaw+0.080
ins+0.079
olo+0.079
Neg logit contributions
 Ellen-0.038
tered-0.037
 Hatt-0.037
 Eleanor-0.036
 Beth-0.033
Token called
Token ID1444
Feature activation-1.04e-01

Pos logit contributions
 coconut+0.025
 pasta+0.025
ins+0.025
olo+0.024
 Brussels+0.024
Neg logit contributions
tered-0.021
ened-0.021
 Ellen-0.021
 Grace-0.021
 Beth-0.020
Token out
Token ID503
Feature activation+1.45e+00

Pos logit contributions
 pasta+0.004
olo+0.004
aw+0.004
 seaw+0.004
oni+0.004
Neg logit contributions
 Hatt-0.003
tered-0.003
 Eleanor-0.003
ened-0.003
 Ellen-0.003
Token to
Token ID284
Feature activation-4.96e-01

Pos logit contributions
 Hatt+0.059
 Eleanor+0.058
tered+0.057
 Ellen+0.057
hered+0.055
Neg logit contributions
 pasta-0.040
aw-0.037
oni-0.036
olo-0.036
ades-0.036
Token her
Token ID607
Feature activation+4.54e-01

Pos logit contributions
 pasta+0.019
olo+0.018
 seaw+0.018
 crabs+0.018
aw+0.018
Neg logit contributions
tered-0.016
 Hatt-0.015
 Eleanor-0.015
ened-0.015
 Ellen-0.014
Token mum
Token ID25682
Feature activation-7.12e-01

Pos logit contributions
tered+0.015
 Ellen+0.015
 Hatt+0.015
 Eleanor+0.014
 Beth+0.014
Neg logit contributions
aw-0.017
 pasta-0.017
olo-0.017
 seaw-0.017
ins-0.017
Token,
Token ID11
Feature activation+7.27e-02

Pos logit contributions
aw+0.028
 pasta+0.028
olo+0.028
ins+0.028
 seaw+0.028
Neg logit contributions
 Ellen-0.022
tered-0.022
 Hatt-0.021
 Eleanor-0.021
 Beth-0.020
Token '
Token ID705
Feature activation+8.33e-01

Pos logit contributions
tered+0.002
 Hatt+0.002
 Ellen+0.002
ened+0.002
 Eleanor+0.002
Neg logit contributions
 pasta-0.003
olo-0.003
aw-0.003
 seaw-0.003
ins-0.003
Token would
Token ID561
Feature activation+1.63e-01

Pos logit contributions
aw+0.008
 pasta+0.008
ins+0.008
olo+0.008
 seaw+0.008
Neg logit contributions
 Ellen-0.008
tered-0.008
 Hatt-0.007
 Eleanor-0.007
 Beth-0.007
Token frown
Token ID22002
Feature activation+2.19e+00

Pos logit contributions
 Ellen+0.004
tered+0.004
 Hatt+0.004
 Eleanor+0.004
 Beth+0.004
Neg logit contributions
aw-0.007
ins-0.007
 pasta-0.007
olo-0.007
 seaw-0.007
Token and
Token ID290
Feature activation+1.62e-01

Pos logit contributions
 Ellen+0.069
 Hatt+0.067
tered+0.066
 Eleanor+0.065
 Beth+0.063
Neg logit contributions
aw-0.088
ins-0.086
 pasta-0.085
 seaw-0.085
olo-0.084
Token hide
Token ID7808
Feature activation-1.09e+00

Pos logit contributions
tered+0.004
 Ellen+0.004
 Hatt+0.004
 Eleanor+0.004
 Beth+0.004
Neg logit contributions
 seaw-0.007
 pasta-0.007
aw-0.007
olo-0.007
ins-0.007
Token whenever
Token ID8797
Feature activation-6.42e-01

Pos logit contributions
 seaw+0.041
aw+0.039
 pasta+0.039
 crabs+0.039
ins+0.038
Neg logit contributions
 Ellen-0.037
 Hatt-0.036
hered-0.035
 Eleanor-0.035
tered-0.033
Token he
Token ID339
Feature activation-1.40e+00

Pos logit contributions
aw+0.015
 pasta+0.014
ins+0.014
olo+0.014
 seaw+0.014
Neg logit contributions
 Ellen-0.031
tered-0.031
 Hatt-0.030
 Eleanor-0.030
 Beth-0.029
Token saw
Token ID2497
Feature activation-1.09e+00

Pos logit contributions
ins+0.052
 pasta+0.052
olo+0.051
aw+0.051
 coconut+0.050
Neg logit contributions
tered-0.046
 Ellen-0.045
 Hatt-0.044
ened-0.044
 Eleanor-0.044
Token something
Token ID1223
Feature activation-8.42e-01

Pos logit contributions
aw+0.041
olo+0.041
ins+0.041
 pasta+0.041
 seaw+0.040
Neg logit contributions
 Ellen-0.036
tered-0.036
 Hatt-0.034
 Eleanor-0.034
 Beth-0.034
Token scary
Token ID14343
Feature activation+9.17e-01

Pos logit contributions
olo+0.034
Dan+0.034
ins+0.034
aw+0.033
P+0.032
Neg logit contributions
 Ellen-0.026
tered-0.026
 as-0.025
 Beth-0.025
 Grace-0.025
Token.
Token ID13
Feature activation+4.58e-02

Pos logit contributions
 Ellen+0.031
tered+0.030
 Hatt+0.029
 Eleanor+0.029
 Beth+0.028
Neg logit contributions
aw-0.034
 pasta-0.034
olo-0.034
ins-0.034
 seaw-0.034
Token One
Token ID1881
Feature activation+1.29e+00

Pos logit contributions
 Ellen+0.001
tered+0.001
 Beth+0.001
 Hatt+0.001
 Grace+0.001
Neg logit contributions
ins-0.002
olo-0.002
aw-0.002
 seaw-0.002
Dan-0.002
Token,
Token ID11
Feature activation+5.57e-01

Pos logit contributions
aw+0.010
 pasta+0.009
olo+0.009
 seaw+0.009
ins+0.009
Neg logit contributions
tered-0.009
 Ellen-0.009
 Hatt-0.009
 Eleanor-0.008
hered-0.008
Token she
Token ID673
Feature activation-5.37e-01

Pos logit contributions
 Ellen+0.016
tered+0.016
 Hatt+0.016
 Eleanor+0.015
hered+0.015
Neg logit contributions
aw-0.023
 pasta-0.022
 seaw-0.022
olo-0.022
 coconut-0.022
Token didn
Token ID1422
Feature activation+8.71e-01

Pos logit contributions
 pasta+0.022
ins+0.022
olo+0.022
 seaw+0.022
aw+0.021
Neg logit contributions
tered-0.017
 Ellen-0.016
 Hatt-0.016
 Eleanor-0.015
 Beth-0.015
Token't
Token ID470
Feature activation-9.78e-01

Pos logit contributions
 Ellen+0.029
tered+0.029
 Hatt+0.028
 Eleanor+0.028
 Beth+0.027
Neg logit contributions
aw-0.032
 pasta-0.032
ins-0.032
olo-0.032
 seaw-0.032
Token know
Token ID760
Feature activation+1.14e+00

Pos logit contributions
 seaw+0.035
 pasta+0.034
aw+0.034
 coconut+0.034
olo+0.034
Neg logit contributions
tered-0.033
 Ellen-0.033
 Hatt-0.032
 Eleanor-0.032
ened-0.031
Token what
Token ID644
Feature activation-1.02e+00

Pos logit contributions
 Ellen+0.038
tered+0.037
 Hatt+0.036
 Eleanor+0.036
 Beth+0.035
Neg logit contributions
aw-0.043
olo-0.043
ins-0.043
 pasta-0.043
 seaw-0.043
Token to
Token ID284
Feature activation-1.46e-01

Pos logit contributions
ins+0.041
aw+0.041
olo+0.040
actus+0.040
 seaw+0.039
Neg logit contributions
 Ellen-0.033
tered-0.032
 Beth-0.031
 Grace-0.031
 Eleanor-0.031
Token do
Token ID466
Feature activation-7.50e-01

Pos logit contributions
 pasta+0.005
olo+0.005
 seaw+0.005
aw+0.005
 coconut+0.005
Neg logit contributions
tered-0.005
 Ellen-0.005
 Hatt-0.005
 Eleanor-0.005
 Grace-0.005
Token.
Token ID13
Feature activation+1.46e-02

Pos logit contributions
 pasta+0.023
aw+0.023
ins+0.022
olo+0.022
 seaw+0.021
Neg logit contributions
 Ellen-0.030
 Hatt-0.029
tered-0.029
 Eleanor-0.029
 Beth-0.028
Token But
Token ID887
Feature activation+1.47e+00

Pos logit contributions
 Ellen+0.000
 Beth+0.000
tered+0.000
 Hatt+0.000
 Eleanor+0.000
Neg logit contributions
 seaw-0.001
ins-0.001
 pasta-0.001
olo-0.001
aw-0.001
Token she
Token ID673
Feature activation-4.35e-01

Pos logit contributions
tered+0.041
hered+0.040
 Ellen+0.039
 Hatt+0.039
 Eleanor+0.038
Neg logit contributions
 pasta-0.060
aw-0.060
 oy-0.058
 coconut-0.057
olo-0.057
Token and
Token ID290
Feature activation+2.98e-01

Pos logit contributions
aw+0.005
 pasta+0.005
ins+0.005
olo+0.005
 seaw+0.005
Neg logit contributions
 Ellen-0.004
tered-0.004
 Hatt-0.004
 Eleanor-0.004
 Beth-0.004
Token pulled
Token ID5954
Feature activation+8.38e-01

Pos logit contributions
 Ellen+0.010
tered+0.010
 Eleanor+0.010
 Beth+0.010
 Hatt+0.010
Neg logit contributions
 pasta-0.011
ins-0.011
aw-0.011
olo-0.011
 seaw-0.011
Token out
Token ID503
Feature activation+1.64e+00

Pos logit contributions
 Hatt+0.029
 Ellen+0.028
 Eleanor+0.028
tered+0.028
tail+0.026
Neg logit contributions
aw-0.031
 pasta-0.029
 seaw-0.029
olo-0.028
oni-0.028
Token his
Token ID465
Feature activation+5.91e-02

Pos logit contributions
 Ellen+0.057
tered+0.057
 Hatt+0.057
 Eleanor+0.056
hered+0.053
Neg logit contributions
aw-0.059
 pasta-0.058
 seaw-0.057
ins-0.056
olo-0.056
Token own
Token ID898
Feature activation-8.81e-01

Pos logit contributions
 Ellen+0.002
tered+0.002
 Hatt+0.002
 Eleanor+0.002
 Beth+0.002
Neg logit contributions
aw-0.002
ins-0.002
olo-0.002
 pasta-0.002
 seaw-0.002
Token sword
Token ID8429
Feature activation-1.63e+00

Pos logit contributions
ins+0.038
olo+0.038
aw+0.038
Dan+0.037
 pasta+0.037
Neg logit contributions
 Ellen-0.024
tered-0.023
ened-0.022
 Grace-0.022
hered-0.022
Token.
Token ID13
Feature activation+3.20e-01

Pos logit contributions
aw+0.055
 pasta+0.054
ins+0.053
 seaw+0.053
olo+0.053
Neg logit contributions
 Ellen-0.060
tered-0.059
 Hatt-0.059
 Eleanor-0.058
 Beth-0.056
Token He
Token ID679
Feature activation-1.64e-01

Pos logit contributions
 Ellen+0.010
 Beth+0.009
 Grace+0.009
tered+0.009
 Eleanor+0.009
Neg logit contributions
ins-0.014
 seaw-0.014
actus-0.014
 pasta-0.014
aw-0.014
Token was
Token ID373
Feature activation+2.83e-01

Pos logit contributions
aw+0.006
 pasta+0.006
ins+0.006
olo+0.006
 seaw+0.006
Neg logit contributions
 Ellen-0.006
tered-0.006
 Hatt-0.006
 Eleanor-0.005
 Beth-0.005
Token about
Token ID546
Feature activation+1.38e+00

Pos logit contributions
 Ellen+0.010
tered+0.010
 Eleanor+0.009
 Hatt+0.009
 Beth+0.009
Neg logit contributions
 seaw-0.010
 pasta-0.010
ins-0.010
olo-0.010
actus-0.010
Token to
Token ID284
Feature activation-1.41e-01

Pos logit contributions
 Ellen+0.039
 as+0.038
tered+0.036
 Grace+0.036
 Beth+0.036
Neg logit contributions
ins-0.061
actus-0.061
aw-0.061
olo-0.060
P-0.059
Token the
Token ID262
Feature activation-2.27e-01

Pos logit contributions
 Ellen+0.014
tered+0.014
 Hatt+0.014
 Eleanor+0.013
 Beth+0.013
Neg logit contributions
 pasta-0.016
aw-0.016
 seaw-0.015
olo-0.015
ins-0.015
Token game
Token ID983
Feature activation-1.14e+00

Pos logit contributions
ins+0.009
olo+0.008
aw+0.008
Dan+0.008
iles+0.008
Neg logit contributions
 Ellen-0.008
tered-0.008
hered-0.007
 Hatt-0.007
 Grace-0.007
Token.
Token ID13
Feature activation+3.98e-01

Pos logit contributions
ins+0.042
aw+0.041
olo+0.041
 pasta+0.041
 seaw+0.041
Neg logit contributions
tered-0.039
 Ellen-0.039
 Hatt-0.037
 Eleanor-0.036
 Beth-0.036
Token<|endoftext|>
Token ID50256
Feature activation-1.65e-01

Pos logit contributions
 Ellen+0.013
 Grace+0.012
 Eleanor+0.012
 Beth+0.012
 Hatt+0.012
Neg logit contributions
 seaw-0.017
aw-0.017
ins-0.016
iles-0.016
 coconut-0.016
TokenOnce
Token ID7454
Feature activation-9.27e-01

Pos logit contributions
 seaw+0.005
 coconut+0.005
 pasta+0.005
ins+0.005
olo+0.005
Neg logit contributions
 Ellen-0.007
 Grace-0.007
 Beth-0.007
 Eleanor-0.007
tered-0.007
Token upon
Token ID2402
Feature activation-1.85e+00

Pos logit contributions
aw+0.019
 seaw+0.019
olo+0.019
 pasta+0.018
ins+0.018
Neg logit contributions
 Ellen-0.047
tered-0.046
 Hatt-0.046
 Eleanor-0.045
 Beth-0.045
Token a
Token ID257
Feature activation+1.02e+00

Pos logit contributions
ins+0.085
olo+0.083
actus+0.083
aw+0.083
oni+0.082
Neg logit contributions
tered-0.049
 Ellen-0.047
 as-0.043
 Eleanor-0.043
 Grace-0.043
Token time
Token ID640
Feature activation-5.18e-01

Pos logit contributions
 Ellen+0.032
tered+0.031
 Hatt+0.031
 Eleanor+0.030
 Beth+0.029
Neg logit contributions
aw-0.040
 pasta-0.040
olo-0.040
 seaw-0.039
ins-0.039
Token,
Token ID11
Feature activation+1.28e-02

Pos logit contributions
 pasta+0.024
aw+0.024
olo+0.024
 seaw+0.024
 coconut+0.024
Neg logit contributions
tered-0.012
 Ellen-0.012
 Hatt-0.011
 Eleanor-0.011
ened-0.011
Token there
Token ID612
Feature activation+1.36e+00

Pos logit contributions
 Ellen+0.001
 Hatt+0.001
tered+0.001
 Eleanor+0.001
ened+0.001
Neg logit contributions
 pasta-0.000
aw-0.000
 coconut-0.000
 seaw-0.000
olo-0.000
Token were
Token ID547
Feature activation-1.36e-01

Pos logit contributions
 Ellen+0.049
tered+0.049
 Hatt+0.048
 Eleanor+0.047
 Beth+0.046
Neg logit contributions
aw-0.047
 pasta-0.047
olo-0.046
 seaw-0.046
ins-0.046
Token.
Token ID13
Feature activation+1.43e-01

Pos logit contributions
ins+0.014
olo+0.014
 seaw+0.013
aw+0.013
 pasta+0.013
Neg logit contributions
tered-0.012
 Ellen-0.011
ened-0.011
 Hatt-0.011
 Beth-0.011
Token He
Token ID679
Feature activation-2.09e-01

Pos logit contributions
 Beth+0.005
 Ellen+0.005
tered+0.004
 Grace+0.004
 Hatt+0.004
Neg logit contributions
olo-0.006
ins-0.006
iles-0.006
 seaw-0.006
anut-0.006
Token looked
Token ID3114
Feature activation-2.21e-01

Pos logit contributions
aw+0.008
 pasta+0.008
ins+0.008
olo+0.008
 seaw+0.008
Neg logit contributions
tered-0.007
 Ellen-0.007
 Hatt-0.007
 Eleanor-0.007
 Beth-0.006
Token outside
Token ID2354
Feature activation+4.14e-01

Pos logit contributions
aw+0.010
 pasta+0.010
ins+0.010
olo+0.010
 seaw+0.010
Neg logit contributions
 Ellen-0.006
tered-0.006
 Hatt-0.005
 Eleanor-0.005
 Beth-0.005
Token and
Token ID290
Feature activation+1.08e-01

Pos logit contributions
 Ellen+0.013
 Hatt+0.013
tered+0.013
 Eleanor+0.013
 Beth+0.012
Neg logit contributions
 pasta-0.016
aw-0.016
 seaw-0.016
ins-0.015
olo-0.015
Token saw
Token ID2497
Feature activation-1.53e+00

Pos logit contributions
 Ellen+0.004
tered+0.004
 Hatt+0.004
 Eleanor+0.004
 Beth+0.004
Neg logit contributions
aw-0.003
 pasta-0.003
ins-0.003
olo-0.003
 seaw-0.003
Token that
Token ID326
Feature activation-9.71e-02

Pos logit contributions
ins+0.063
olo+0.063
aw+0.061
actus+0.061
ll+0.060
Neg logit contributions
 Ellen-0.047
tered-0.045
 Beth-0.045
 Hatt-0.045
 Grace-0.044
Token his
Token ID465
Feature activation+1.05e-01

Pos logit contributions
aw+0.004
 pasta+0.004
ins+0.004
olo+0.004
 seaw+0.004
Neg logit contributions
 Ellen-0.003
tered-0.003
 Hatt-0.003
 Eleanor-0.003
 Beth-0.003
Token shiny
Token ID22441
Feature activation-4.34e-01

Pos logit contributions
tered+0.003
 Hatt+0.003
ened+0.003
 Ellen+0.003
 as+0.003
Neg logit contributions
ins-0.004
aw-0.004
olo-0.004
Dan-0.004
P-0.004
Token toy
Token ID13373
Feature activation+1.67e-01

Pos logit contributions
aw+0.013
olo+0.013
ins+0.013
 seaw+0.012
 pasta+0.012
Neg logit contributions
tered-0.018
 Ellen-0.018
 Hatt-0.017
 Eleanor-0.017
ened-0.017
Token car
Token ID1097
Feature activation+1.14e+00

Pos logit contributions
 Ellen+0.005
tered+0.005
 Hatt+0.005
 Beth+0.005
 Grace+0.005
Neg logit contributions
ins-0.007
olo-0.007
aw-0.007
Dan-0.007
 seaw-0.007
Token O
Token ID440
Feature activation-1.65e+00

Pos logit contributions
ins+0.015
aw+0.015
 pasta+0.014
olo+0.014
ades+0.014
Neg logit contributions
 Beth-0.009
 Ellen-0.009
 Grace-0.009
tered-0.009
 Hatt-0.009
Tokenll
Token ID297
Feature activation-5.19e+00

Pos logit contributions
 pasta+0.039
 seaw+0.037
 coconut+0.034
 pineapple+0.032
 chefs+0.031
Neg logit contributions
 Beth-0.080
 Hatt-0.078
 Eleanor-0.078
hered-0.078
 Grace-0.077
Tokenie
Token ID494
Feature activation+8.20e-01

Pos logit contributions
aw+0.227
ins+0.226
 pasta+0.225
 seaw+0.224
olo+0.223
Neg logit contributions
 Ellen-0.139
tered-0.137
 Hatt-0.132
 Eleanor-0.130
 Beth-0.124
Token.
Token ID13
Feature activation-1.11e-01

Pos logit contributions
 Ellen+0.023
tered+0.022
 Grace+0.022
 Beth+0.022
 Hatt+0.022
Neg logit contributions
 pasta-0.036
aw-0.035
 seaw-0.035
olo-0.035
ins-0.035
Token O
Token ID440
Feature activation-1.64e+00

Pos logit contributions
ins+0.005
aw+0.005
 pasta+0.005
olo+0.005
ades+0.005
Neg logit contributions
 Beth-0.003
 Grace-0.003
 Hatt-0.003
 Ellen-0.003
 Eleanor-0.003
Tokenll
Token ID297
Feature activation-4.99e+00

Pos logit contributions
 seaw+0.037
 pasta+0.037
 pineapple+0.032
 chefs+0.032
 coconut+0.031
Neg logit contributions
 Beth-0.079
 Eleanor-0.077
 Hatt-0.076
 Grace-0.073
 Ellen-0.073
Tokenie
Token ID494
Feature activation+8.43e-01

Pos logit contributions
aw+0.229
 pasta+0.227
olo+0.226
ins+0.225
 seaw+0.225
Neg logit contributions
 Ellen-0.124
tered-0.124
 Hatt-0.118
 Eleanor-0.116
 Beth-0.113
Token lived
Token ID5615
Feature activation+7.27e-01

Pos logit contributions
tered+0.025
 Ellen+0.024
 Grace+0.024
 Beth+0.024
 Eleanor+0.023
Neg logit contributions
 pasta-0.035
olo-0.035
 coconut-0.035
ins-0.034
aw-0.034
Token in
Token ID287
Feature activation-1.68e-01

Pos logit contributions
tered+0.015
 as+0.014
 Grace+0.014
 Ellen+0.014
 Beth+0.014
Neg logit contributions
olo-0.038
aw-0.037
 pasta-0.036
actus-0.036
oni-0.036
Token a
Token ID257
Feature activation+1.14e+00

Pos logit contributions
 pasta+0.006
 seaw+0.006
aw+0.006
olo+0.006
 coconut+0.006
Neg logit contributions
tered-0.006
 Ellen-0.006
 Hatt-0.005
 Eleanor-0.005
ened-0.005
Token safe
Token ID3338
Feature activation-2.70e-01

Pos logit contributions
 Grace+0.041
 Ellen+0.039
 Beth+0.039
 Hatt+0.037
 Eleanor+0.037
Neg logit contributions
ins-0.045
olo-0.045
oni-0.043
aw-0.042
iles-0.042
Token shore
Token ID15191
Feature activation-1.90e+00

Pos logit contributions
ins+0.017
aw+0.016
olo+0.016
 pasta+0.015
P+0.015
Neg logit contributions
tered-0.021
 Ellen-0.020
 Grace-0.020
 as-0.020
 Eleanor-0.019
Token,
Token ID11
Feature activation+1.55e-01

Pos logit contributions
 pasta+0.086
ins+0.086
olo+0.086
aw+0.086
 seaw+0.085
Neg logit contributions
tered-0.046
 Ellen-0.045
 as-0.042
 Hatt-0.042
ened-0.042
Token collecting
Token ID13157
Feature activation-2.80e+00

Pos logit contributions
 Ellen+0.005
tered+0.005
 Hatt+0.005
 Eleanor+0.005
 Beth+0.005
Neg logit contributions
aw-0.006
 pasta-0.006
ins-0.006
olo-0.006
 seaw-0.006
Token shells
Token ID19679
Feature activation-3.43e+00

Pos logit contributions
olo+0.073
ins+0.071
 pasta+0.070
ll+0.069
actus+0.069
Neg logit contributions
tered-0.123
 Ellen-0.121
ened-0.119
 Hatt-0.118
 Eleanor-0.118
Token and
Token ID290
Feature activation-6.55e-02

Pos logit contributions
olo+0.159
ins+0.158
aw+0.158
 pasta+0.158
Dan+0.157
Neg logit contributions
tered-0.079
 Ellen-0.077
 as-0.076
hered-0.072
 Hatt-0.072
Token seaw
Token ID32085
Feature activation-7.26e+00

Pos logit contributions
aw+0.002
ins+0.002
olo+0.002
actus+0.002
 pasta+0.002
Neg logit contributions
tered-0.003
 Ellen-0.002
ened-0.002
 Grace-0.002
 Beth-0.002
Tokeneed
Token ID2308
Feature activation-1.25e+00

Pos logit contributions
 pasta+0.158
 Brussels+0.150
 coconut+0.146
olo+0.141
 volcano+0.139
Neg logit contributions
tered-0.335
hered-0.323
 as-0.319
ened-0.318
 Hatt-0.312
Token.
Token ID13
Feature activation+5.97e-02

Pos logit contributions
aw+0.049
 pasta+0.049
olo+0.049
ins+0.048
 seaw+0.048
Neg logit contributions
tered-0.038
 Ellen-0.037
 Hatt-0.035
ened-0.035
 Eleanor-0.035
Token Jimmy
Token ID12963
Feature activation+1.59e-01

Pos logit contributions
tered+0.002
 Ellen+0.002
 Hatt+0.002
 Eleanor+0.002
 Grace+0.002
Neg logit contributions
ins-0.003
aw-0.003
 seaw-0.002
olo-0.002
actus-0.002
Token took
Token ID1718
Feature activation-4.03e-01

Pos logit contributions
 Ellen+0.005
tered+0.005
 Hatt+0.005
 Eleanor+0.005
 Beth+0.005
Neg logit contributions
aw-0.006
 pasta-0.006
olo-0.006
 seaw-0.006
ins-0.006
Token time
Token ID640
Feature activation-1.41e-01

Pos logit contributions
aw+0.015
 seaw+0.014
 pasta+0.014
 crabs+0.013
 oy+0.013
Neg logit contributions
 Ellen-0.015
 Hatt-0.015
tered-0.015
 Eleanor-0.014
 Beth-0.013
Token pretty
Token ID2495
Feature activation-4.64e-01

Pos logit contributions
 pasta+0.077
 seaw+0.075
olo+0.075
actus+0.075
iles+0.073
Neg logit contributions
 Ellen-0.036
tered-0.034
 Grace-0.034
 Beth-0.033
 as-0.032
Token and
Token ID290
Feature activation+1.04e-01

Pos logit contributions
ins+0.020
olo+0.020
Dan+0.020
 seaw+0.020
 coconut+0.020
Neg logit contributions
 Ellen-0.012
tered-0.012
 as-0.012
 Grace-0.012
ened-0.012
Token she
Token ID673
Feature activation-6.18e-01

Pos logit contributions
 Ellen+0.004
tered+0.004
 Hatt+0.004
 Eleanor+0.004
 Beth+0.003
Neg logit contributions
aw-0.004
 pasta-0.004
ins-0.004
olo-0.004
 seaw-0.004
Token asked
Token ID1965
Feature activation+5.60e-01

Pos logit contributions
 pasta+0.025
ins+0.025
olo+0.025
aw+0.024
 seaw+0.024
Neg logit contributions
tered-0.019
 Ellen-0.019
 Grace-0.018
 Beth-0.018
 Eleanor-0.018
Token O
Token ID440
Feature activation-1.61e+00

Pos logit contributions
 Ellen+0.018
tered+0.018
 Hatt+0.018
 Eleanor+0.018
 Beth+0.017
Neg logit contributions
aw-0.021
 pasta-0.021
olo-0.021
 seaw-0.021
ins-0.021
Tokenll
Token ID297
Feature activation-4.63e+00

Pos logit contributions
 seaw+0.051
 pasta+0.050
 crabs+0.046
 coconut+0.046
 oy+0.044
Neg logit contributions
 Ellen-0.068
 Eleanor-0.068
 Beth-0.066
 Hatt-0.066
tered-0.064
Tokenie
Token ID494
Feature activation+6.16e-01

Pos logit contributions
aw+0.210
 pasta+0.208
olo+0.207
ins+0.206
 seaw+0.206
Neg logit contributions
 Ellen-0.119
tered-0.118
 Hatt-0.113
 Eleanor-0.111
 Beth-0.108
Token a
Token ID257
Feature activation+1.09e+00

Pos logit contributions
 Ellen+0.014
 Hatt+0.013
 Grace+0.013
tered+0.013
 Beth+0.013
Neg logit contributions
aw-0.030
 seaw-0.030
olo-0.029
 pasta-0.029
ins-0.029
Token question
Token ID1808
Feature activation+8.52e-01

Pos logit contributions
tered+0.040
 Ellen+0.039
 Hatt+0.038
ened+0.037
 Eleanor+0.037
Neg logit contributions
 seaw-0.037
aw-0.037
 pasta-0.037
 coconut-0.037
olo-0.036
Token.
Token ID13
Feature activation+1.07e-01

Pos logit contributions
 Ellen+0.028
tered+0.028
 Hatt+0.026
 Eleanor+0.026
 Beth+0.026
Neg logit contributions
 seaw-0.032
ins-0.032
aw-0.032
 pasta-0.032
actus-0.032
Token "
Token ID366
Feature activation+1.39e+00

Pos logit contributions
 Beth+0.003
 Ellen+0.003
 Hatt+0.003
 Eleanor+0.003
 Grace+0.003
Neg logit contributions
ins-0.005
 pasta-0.005
aw-0.005
 seaw-0.005
olo-0.005
Token picture
Token ID4286
Feature activation-5.23e-02

Pos logit contributions
 Ellen+0.045
tered+0.044
 Hatt+0.043
 Eleanor+0.042
 Beth+0.041
Neg logit contributions
aw-0.042
ins-0.042
olo-0.042
 pasta-0.042
 seaw-0.041
Token of
Token ID286
Feature activation-1.82e-02

Pos logit contributions
aw+0.002
 pasta+0.002
olo+0.002
ins+0.002
 seaw+0.002
Neg logit contributions
 Ellen-0.002
tered-0.002
 Hatt-0.002
 Eleanor-0.002
 Beth-0.002
Token the
Token ID262
Feature activation-3.11e-01

Pos logit contributions
aw+0.001
ins+0.001
olo+0.001
 pasta+0.001
 seaw+0.001
Neg logit contributions
 Ellen-0.001
tered-0.001
 Hatt-0.001
 Eleanor-0.001
 Beth-0.001
Token yellow
Token ID7872
Feature activation+4.63e-01

Pos logit contributions
ins+0.010
olo+0.010
aw+0.010
 seaw+0.010
 pasta+0.010
Neg logit contributions
 Ellen-0.012
tered-0.012
 Hatt-0.011
 Eleanor-0.011
 Beth-0.011
Token lab
Token ID2248
Feature activation-3.36e-02

Pos logit contributions
 Ellen+0.016
tered+0.016
 Hatt+0.015
 Eleanor+0.015
 Beth+0.015
Neg logit contributions
ins-0.016
olo-0.016
aw-0.016
 pasta-0.016
 seaw-0.016
Tokenrador
Token ID40368
Feature activation-4.61e+00

Pos logit contributions
ins+0.001
aw+0.001
olo+0.001
 seaw+0.001
 pasta+0.001
Neg logit contributions
 Ellen-0.001
tered-0.001
 Hatt-0.001
 Eleanor-0.001
 Beth-0.001
Token.
Token ID13
Feature activation+2.27e-01

Pos logit contributions
aw+0.156
 pasta+0.152
olo+0.149
 seaw+0.149
ins+0.147
Neg logit contributions
 Ellen-0.172
 Hatt-0.169
tered-0.169
 Eleanor-0.166
 Beth-0.161
Token She
Token ID1375
Feature activation+9.72e-01

Pos logit contributions
 Ellen+0.007
tered+0.007
 Hatt+0.007
 Eleanor+0.007
 Beth+0.007
Neg logit contributions
aw-0.009
 pasta-0.009
ins-0.009
olo-0.009
 seaw-0.009
Token recognized
Token ID8018
Feature activation-1.19e+00

Pos logit contributions
 Ellen+0.034
tered+0.034
 Hatt+0.033
 Eleanor+0.033
 Beth+0.032
Neg logit contributions
aw-0.035
 pasta-0.034
ins-0.034
olo-0.034
 seaw-0.034
Token it
Token ID340
Feature activation+3.35e-01

Pos logit contributions
aw+0.043
 pasta+0.040
 seaw+0.039
 crabs+0.039
olo+0.039
Neg logit contributions
tered-0.043
 Hatt-0.041
 Ellen-0.040
hered-0.040
 Eleanor-0.040
Token as
Token ID355
Feature activation+5.46e-02

Pos logit contributions
 Hatt+0.011
 Eleanor+0.011
 Ellen+0.011
tered+0.010
ened+0.010
Neg logit contributions
 pasta-0.012
aw-0.012
oni-0.012
 seaw-0.011
olo-0.011
Token Jake
Token ID14757
Feature activation+3.26e-01

Pos logit contributions
 Ellen+0.008
tered+0.008
 Hatt+0.008
 Eleanor+0.008
 Beth+0.008
Neg logit contributions
aw-0.005
 pasta-0.005
olo-0.005
ins-0.005
 seaw-0.005
Token checked
Token ID10667
Feature activation+2.63e+00

Pos logit contributions
tered+0.010
 Ellen+0.010
 Hatt+0.009
 Eleanor+0.009
 Grace+0.009
Neg logit contributions
 pasta-0.013
aw-0.013
 coconut-0.013
ins-0.013
 seaw-0.013
Token on
Token ID319
Feature activation+1.35e+00

Pos logit contributions
tered+0.092
 Ellen+0.092
hered+0.088
 Hatt+0.088
 Eleanor+0.086
Neg logit contributions
 pasta-0.095
ins-0.093
 seaw-0.093
aw-0.092
olo-0.092
Token his
Token ID465
Feature activation-3.24e-02

Pos logit contributions
 Ellen+0.044
tered+0.044
 Hatt+0.042
 Eleanor+0.042
 Beth+0.041
Neg logit contributions
aw-0.051
 pasta-0.051
ins-0.050
olo-0.050
 seaw-0.050
Token k
Token ID479
Feature activation+2.12e+00

Pos logit contributions
 pasta+0.001
aw+0.001
 seaw+0.001
ins+0.001
olo+0.001
Neg logit contributions
 Ellen-0.001
tered-0.001
 Hatt-0.001
 Eleanor-0.001
 Beth-0.001
Tokeniw
Token ID14246
Feature activation-4.84e+00

Pos logit contributions
 Ellen+0.071
 Hatt+0.069
 Beth+0.069
 Eleanor+0.069
hered+0.068
Neg logit contributions
 seaw-0.081
olo-0.081
 pasta-0.079
oni-0.075
Dan-0.075
Tokeni
Token ID72
Feature activation-4.32e-01

Pos logit contributions
P+0.204
 pasta+0.201
 coconut+0.198
 chefs+0.197
Dan+0.196
Neg logit contributions
ened-0.138
 as-0.126
tered-0.123
 Hatt-0.115
 Eleanor-0.113
Token to
Token ID284
Feature activation-5.29e-01

Pos logit contributions
aw+0.018
ins+0.018
olo+0.018
 pasta+0.018
 seaw+0.018
Neg logit contributions
tered-0.012
 Ellen-0.012
 Hatt-0.011
 Eleanor-0.011
 Grace-0.011
Token make
Token ID787
Feature activation+4.67e-01

Pos logit contributions
 seaw+0.022
 pasta+0.022
aw+0.021
 coconut+0.021
olo+0.021
Neg logit contributions
 Ellen-0.016
tered-0.015
 Hatt-0.015
 Eleanor-0.015
 Beth-0.014
Token sure
Token ID1654
Feature activation+1.42e+00

Pos logit contributions
 Ellen+0.015
tered+0.014
 Hatt+0.014
 Eleanor+0.014
 Beth+0.014
Neg logit contributions
aw-0.018
olo-0.018
 pasta-0.018
 seaw-0.018
ins-0.018
Token it
Token ID340
Feature activation+2.58e-01

Pos logit contributions
 Ellen+0.047
tered+0.046
 Hatt+0.045
 Beth+0.045
 as+0.044
Neg logit contributions
actus-0.053
aw-0.053
olo-0.053
ins-0.052
P-0.052