TOP ACTIVATIONS
MAX = 0.074

, Lucy !" said Wh isk ers , " Do you
, Rover !" said Wh isk ers . " You are
a happy song and Wh isk ers listened .
to unlock ," said Wh isk ers . "
Let 's play !" Wh isk ers said , " No
me and Mr . Wh isk ers ?" Tim smiled and
share ." Spot and Wh isk ers agreed , and they
the mess and the qu arre ling twins . She is
my new song ?" Wh isk ers smiled and said ,
ball ." Buddy and Wh isk ers stopped fighting . They
singing and looked at Wh isk ers . "
his puzzle and thanked Wh isk ers for helping him .
a cheerful little 3 - year - old boy . He
said , " Hi Wh isk ers ! Do you want
family sh oo ed Wh isk ers away and told Max
and replied with a mis chie vous gl int in his
, kissing them on their fore heads . She
Wh isk ers ?" Wh isk ers replied , " I
" Thank you , Wh isk ers !" The
to our castle ." Wh isk ers forg ave her ,

MIN ACTIVATIONS
MIN = -0.067

strange birds . They had long necks , long legs ,
es best . They had long necks and spots . Sam
the girl would put her ear against the fence .
es best . They had long necks and spots . Ben
es best . They had long necks and spots . Sam
person . The person had long hair , a beard and
the sky . It had long wings , a sharp be
gorilla . The gorilla has long arms and long hair .
a haircut . She had long hair that got in her
, he always kept an ear out for any silly noises
his house . She had long hair and a pretty dress
named Lily . She had long , curly hair that always
weird animal . It has long ears , a fluffy tail
named Bow . Bow had long , soft fur and loved
unusual looking . He had long white hair and a big
called Lily . She had long , pretty hair and wore
the elephant . He had long , smooth ears and a
named Lily . She had long hair that was very thick
puppy . The puppy had long , soft fur . It
and loud . He had long hair and a beard .

INTERVAL 0.039 - 0.074
CONTAINS 0.132%

Tim my was so happy that he could hardly speak .
behind the curtains in her living room . She saw her
thought about it and decided that getting warm by the camp
it was far away and sometimes it was close . They
! Lucy was so excited that she decided to take it

INTERVAL 0.003 - 0.039
CONTAINS 37.537%

two friends started to explore and had fun together .
cool ! S ally said , " Let 's
from our grandma . She loves us . You are wrong
a rainbow with all the colors of the world . Seeing
herself down . She goes very fast . She feels the

INTERVAL -0.032 - 0.003
CONTAINS 61.931%

. One day , she wanted to wear a princess dress
Their mom said , " I 'm glad you
laugh and shout . " Look at my car
Yes , let 's work together . We are strong and
. We did not want to eat the caul iflower .

INTERVAL -0.067 - -0.032
CONTAINS 0.399%

Tom says . " Can I go first ?"
being foolish . Why is he doing that ?"
the curtain again . <|endoftext|> S ara and Tom were playing
?" Lily asked . She reached out her hand to take
. It had red and yellow feathers and a long tail
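
The four INTERVAL sections above have nearly equal widths (about 0.035 each) and together span MIN = -0.067 to MAX = 0.074, and their CONTAINS shares sum to roughly 100%. One reading, which the dump itself does not confirm, is that the activation range was split into four equal-width bins and each share is the fraction of tokens whose activation falls in that bin. A minimal Python sketch of that binning follows; the function names `equal_width_bins` and `bin_percentages` are hypothetical and are not taken from whatever tool produced this output.

```python
from typing import List, Sequence, Tuple


def equal_width_bins(lo: float, hi: float, n_bins: int) -> List[Tuple[float, float]]:
    """Split [lo, hi] into n_bins equal-width (start, end) intervals."""
    if hi <= lo or n_bins < 1:
        raise ValueError("need hi > lo and n_bins >= 1")
    width = (hi - lo) / n_bins
    # Use hi itself as the final edge so rounding error cannot shift it.
    edges = [lo + i * width for i in range(n_bins)] + [hi]
    return list(zip(edges[:-1], edges[1:]))


def bin_percentages(values: Sequence[float], lo: float, hi: float, n_bins: int) -> List[float]:
    """Percentage of values in each equal-width bin over [lo, hi].

    Assumption: a value equal to hi is counted in the last bin, and values
    outside [lo, hi] are ignored (they still count toward the total).
    """
    if not values:
        raise ValueError("values must be non-empty")
    if hi <= lo or n_bins < 1:
        raise ValueError("need hi > lo and n_bins >= 1")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if v < lo or v > hi:
            continue
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return [100.0 * c / len(values) for c in counts]


if __name__ == "__main__":
    # MIN and MAX as reported in the dump; the activations are made-up samples.
    for start, end in equal_width_bins(-0.067, 0.074, 4):
        print(f"INTERVAL {start:.3f} - {end:.3f}")
    print(bin_percentages([-0.067, -0.01, 0.0, 0.01, 0.074], -0.067, 0.074, 4))
```

The edges this produces from the rounded MIN and MAX agree with the reported interval boundaries to within rounding. The tool presumably worked from unrounded values, so small differences in the last digit are expected.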
Token,
Token ID11
Feature activation-9.23e-03

Pos logit contributions
 One+0.007
 grasp+0.006
 pocket+0.006
 nine+0.005
 front+0.005
Neg logit contributions
taker-0.014
lling-0.014
washer-0.013
ull-0.013
led-0.013
Token Lucy
Token ID22162
Feature activation+5.67e-03

Pos logit contributions
 One+0.012
 front+0.010
 pocket+0.010
 grasp+0.010
 nine+0.010
Neg logit contributions
led-0.018
lling-0.017
ull-0.017
taker-0.017
lad-0.017
Token!"
Token ID2474
Feature activation-1.58e-02

Pos logit contributions
taker+0.010
lad+0.010
lling+0.010
 accidents+0.010
 happens+0.010
Neg logit contributions
 grasp-0.007
 One-0.007
 pocket-0.007
 front-0.006
 spine-0.006
Token said
Token ID531
Feature activation+7.40e-03

Pos logit contributions
 grasp+0.022
 One+0.021
 spine+0.020
 pocket+0.020
 front+0.020
Neg logit contributions
taker-0.027
lad-0.027
washer-0.026
 accidents-0.026
lling-0.026
Token Wh
Token ID854
Feature activation+1.03e-02

Pos logit contributions
lad+0.014
lling+0.014
taker+0.014
washer+0.014
led+0.013
Neg logit contributions
 One-0.009
 grasp-0.009
 front-0.008
 pocket-0.008
 spine-0.008
Tokenisk
Token ID1984
Feature activation+7.39e-02

Pos logit contributions
taker+0.017
 accidents+0.016
washer+0.016
 happens+0.016
 works+0.016
Neg logit contributions
 grasp-0.015
 One-0.015
 spine-0.014
 pocket-0.013
 front-0.013
Tokeners
Token ID364
Feature activation+1.43e-02

Pos logit contributions
 accidents+0.135
washer+0.123
 disl+0.120
 Sorry+0.118
 Acc+0.116
Neg logit contributions
 nine-0.082
 front-0.078
 ten-0.077
One-0.076
ags-0.072
Token,
Token ID11
Feature activation+6.75e-03

Pos logit contributions
taker+0.027
 accidents+0.027
 happens+0.027
lad+0.027
 works+0.026
Neg logit contributions
 grasp-0.017
 One-0.016
 spine-0.015
 pocket-0.015
 front-0.015
Token "
Token ID366
Feature activation+1.67e-02

Pos logit contributions
taker+0.011
lad+0.011
 happens+0.011
washer+0.011
 works+0.011
Neg logit contributions
 grasp-0.010
 One-0.010
 spine-0.009
 pocket-0.009
 front-0.009
TokenDo
Token ID5211
Feature activation+3.77e-03

Pos logit contributions
 happens+0.027
taker+0.027
 works+0.027
washer+0.027
 accidents+0.027
Neg logit contributions
 One-0.024
 grasp-0.023
One-0.023
 front-0.022
 spine-0.022
Token you
Token ID345
Feature activation-2.47e-03

Pos logit contributions
taker+0.005
washer+0.005
lling+0.005
ull+0.005
led+0.005
Neg logit contributions
 One-0.006
 grasp-0.006
 pocket-0.006
 spine-0.006
 front-0.006
Token,
Token ID11
Feature activation-2.17e-02

Pos logit contributions
 happens+0.021
 works+0.021
taker+0.021
lling+0.021
 accidents+0.021
Neg logit contributions
 grasp-0.015
 One-0.015
 spine-0.014
 pocket-0.014
 front-0.013
Token Rover
Token ID36718
Feature activation+2.32e-02

Pos logit contributions
 grasp+0.035
 One+0.033
 pocket+0.032
 spine+0.032
 grip+0.031
Neg logit contributions
taker-0.033
washer-0.032
lad-0.032
 accidents-0.032
 works-0.031
Token!"
Token ID2474
Feature activation-2.47e-02

Pos logit contributions
 accidents+0.048
lling+0.047
lad+0.044
taker+0.044
washer+0.044
Neg logit contributions
 One-0.024
 grasp-0.023
 pocket-0.022
 front-0.020
 nine-0.020
Token said
Token ID531
Feature activation-6.06e-04

Pos logit contributions
 One+0.039
 grasp+0.037
 pocket+0.035
 front+0.033
 spine+0.032
Neg logit contributions
lad-0.043
washer-0.039
lling-0.039
taker-0.038
ull-0.038
Token Wh
Token ID854
Feature activation+1.07e-02

Pos logit contributions
 grasp+0.001
 One+0.001
 pocket+0.001
 spine+0.001
 front+0.001
Neg logit contributions
lad-0.001
taker-0.001
washer-0.001
 works-0.001
 accidents-0.001
Tokenisk
Token ID1984
Feature activation+7.06e-02

Pos logit contributions
taker+0.021
 accidents+0.020
washer+0.020
 happens+0.020
 works+0.020
Neg logit contributions
 grasp-0.013
 One-0.012
 spine-0.012
 pocket-0.011
 grip-0.011
Tokeners
Token ID364
Feature activation+1.94e-02

Pos logit contributions
 accidents+0.127
washer+0.117
 disl+0.115
 Acc+0.112
 Sorry+0.109
Neg logit contributions
 nine-0.084
 grasp-0.077
 front-0.077
One-0.074
 ten-0.071
Token.
Token ID13
Feature activation+3.53e-03

Pos logit contributions
taker+0.031
washer+0.030
ull+0.029
led+0.029
ing+0.028
Neg logit contributions
 grasp-0.030
 One-0.027
 spine-0.027
 pocket-0.026
 grip-0.026
Token "
Token ID366
Feature activation+3.04e-02

Pos logit contributions
taker+0.005
 accidents+0.004
lad+0.004
 happens+0.004
washer+0.004
Neg logit contributions
 grasp-0.007
 spine-0.006
 grip-0.006
 pocket-0.006
 front-0.006
TokenYou
Token ID1639
Feature activation+1.19e-02

Pos logit contributions
 happens+0.034
taker+0.033
 accidents+0.033
washer+0.033
 works+0.032
Neg logit contributions
 grasp-0.059
 One-0.058
One-0.056
 front-0.056
 nine-0.055
Token are
Token ID389
Feature activation+2.37e-03

Pos logit contributions
 accidents+0.024
fl+0.023
ing+0.023
lad+0.022
men+0.022
Neg logit contributions
 One-0.014
 grasp-0.013
 pocket-0.013
 nine-0.012
 front-0.011
Token a
Token ID257
Feature activation+1.49e-02

Pos logit contributions
 One+0.008
 front+0.007
 nine+0.007
One+0.006
 grasp+0.006
Neg logit contributions
washer-0.013
beh-0.012
 works-0.012
 happens-0.012
lad-0.012
Token happy
Token ID3772
Feature activation+5.75e-03

Pos logit contributions
washer+0.017
 happens+0.017
 accidents+0.017
beh+0.017
taker+0.017
Neg logit contributions
 grasp-0.029
 One-0.028
 spine-0.027
 front-0.027
 pocket-0.027
Token song
Token ID3496
Feature activation+6.38e-03

Pos logit contributions
washer+0.007
 works+0.007
 happens+0.006
lad+0.006
taker+0.006
Neg logit contributions
 grasp-0.011
 spine-0.011
 One-0.011
 front-0.011
 pocket-0.011
Token and
Token ID290
Feature activation+2.03e-02

Pos logit contributions
taker+0.010
washer+0.010
ing+0.010
 accidents+0.010
lad+0.009
Neg logit contributions
 grasp-0.010
 pocket-0.009
 spine-0.009
 One-0.009
 grip-0.009
Token Wh
Token ID854
Feature activation+1.56e-02

Pos logit contributions
lad+0.038
washer+0.036
ymes+0.036
taker+0.035
ull+0.035
Neg logit contributions
 One-0.028
 grasp-0.025
 pocket-0.023
 use-0.022
 front-0.022
Tokenisk
Token ID1984
Feature activation+6.87e-02

Pos logit contributions
taker+0.030
washer+0.029
 accidents+0.029
 happens+0.029
lling+0.028
Neg logit contributions
 grasp-0.019
 One-0.018
 spine-0.017
 pocket-0.016
 grip-0.016
Tokeners
Token ID364
Feature activation+1.65e-02

Pos logit contributions
 accidents+0.125
washer+0.117
 disl+0.112
 Acc+0.111
 Sorry+0.108
Neg logit contributions
 nine-0.082
 front-0.073
ids-0.073
ags-0.073
 grasp-0.072
Token listened
Token ID16399
Feature activation+2.88e-02

Pos logit contributions
 accidents+0.026
lad+0.025
taker+0.025
washer+0.024
 happens+0.023
Neg logit contributions
 grasp-0.026
 One-0.025
 pocket-0.024
 front-0.024
 nine-0.023
Token.
Token ID13
Feature activation+9.16e-03

Pos logit contributions
taker+0.048
 accidents+0.047
washer+0.047
lad+0.047
 happens+0.046
Neg logit contributions
 grasp-0.042
 One-0.038
 spine-0.038
 pocket-0.037
 grip-0.036
Token
Token ID198
Feature activation+1.20e-03

Pos logit contributions
taker+0.017
 accidents+0.017
washer+0.017
lling+0.017
lad+0.017
Neg logit contributions
 One-0.011
 grasp-0.011
 spine-0.010
 pocket-0.010
 front-0.009
Token
Token ID198
Feature activation-8.15e-03

Pos logit contributions
taker+0.002
lad+0.002
washer+0.002
 happens+0.002
 works+0.002
Neg logit contributions
 grasp-0.002
 spine-0.002
 pocket-0.002
 One-0.001
 grip-0.001
Token to
Token ID284
Feature activation+1.79e-02

Pos logit contributions
taker+0.007
 accidents+0.007
washer+0.007
 works+0.007
 happens+0.007
Neg logit contributions
 grasp-0.008
 One-0.008
 spine-0.008
 pocket-0.008
 grip-0.007
Token unlock
Token ID12116
Feature activation-3.74e-03

Pos logit contributions
 sound+0.030
 works+0.029
taker+0.028
 happen+0.028
rest+0.028
Neg logit contributions
 grasp-0.024
 spine-0.024
rils-0.023
 One-0.022
 pocket-0.022
Token,"
Token ID553
Feature activation-1.82e-02

Pos logit contributions
 grasp+0.007
 spine+0.007
 One+0.007
 pocket+0.007
 grip+0.007
Neg logit contributions
taker-0.004
 accidents-0.004
 happens-0.004
 works-0.004
lad-0.004
Token said
Token ID531
Feature activation-3.80e-03

Pos logit contributions
 One+0.028
 grasp+0.027
 pocket+0.026
 front+0.024
One+0.024
Neg logit contributions
lad-0.031
taker-0.030
washer-0.029
lling-0.028
free-0.028
Token Wh
Token ID854
Feature activation+8.58e-03

Pos logit contributions
 grasp+0.006
 One+0.005
 spine+0.005
 pocket+0.005
 front+0.005
Neg logit contributions
taker-0.006
lad-0.006
washer-0.006
 accidents-0.006
 works-0.006
Tokenisk
Token ID1984
Feature activation+6.80e-02

Pos logit contributions
taker+0.016
 accidents+0.015
washer+0.015
lad+0.015
 works+0.015
Neg logit contributions
 grasp-0.011
 One-0.010
 spine-0.010
 pocket-0.010
 grip-0.010
Tokeners
Token ID364
Feature activation+1.24e-02

Pos logit contributions
 accidents+0.119
washer+0.106
 disl+0.104
 Acc+0.101
 Sorry+0.100
Neg logit contributions
 nine-0.088
 grasp-0.083
 front-0.082
One-0.081
 One-0.079
Token.
Token ID13
Feature activation-3.68e-04

Pos logit contributions
taker+0.020
washer+0.019
lling+0.019
led+0.019
ing+0.019
Neg logit contributions
 grasp-0.019
 spine-0.017
 pocket-0.017
 One-0.017
 grip-0.016
Token
Token ID198
Feature activation-2.81e-03

Pos logit contributions
 grasp+0.001
 spine+0.001
 grip+0.001
 pocket+0.001
 One+0.001
Neg logit contributions
taker-0.001
 happens-0.001
 accidents-0.001
lad-0.001
washer-0.001
Token
Token ID198
Feature activation-9.23e-03

Pos logit contributions
 grasp+0.004
 spine+0.004
 One+0.003
 pocket+0.003
 grip+0.003
Neg logit contributions
taker-0.005
lad-0.005
washer-0.005
 accidents-0.005
 happens-0.005
Token"
Token ID1
Feature activation+1.30e-02

Pos logit contributions
 grasp+0.014
 One+0.013
 pocket+0.013
 spine+0.013
 front+0.012
Neg logit contributions
taker-0.015
 accidents-0.014
washer-0.014
 works-0.014
lad-0.014
TokenLet
Token ID5756
Feature activation-7.23e-03

Pos logit contributions
taker+0.010
washer+0.010
lad+0.009
 happens+0.009
 works+0.009
Neg logit contributions
 grasp-0.010
 One-0.010
One-0.009
 front-0.009
 pocket-0.009
Token's
Token ID338
Feature activation+6.59e-03

Pos logit contributions
 One+0.009
 grasp+0.009
 front+0.008
 pocket+0.008
One+0.007
Neg logit contributions
taker-0.014
washer-0.014
lad-0.014
ull-0.013
 works-0.013
Token play
Token ID711
Feature activation+2.00e-02

Pos logit contributions
 sound+0.009
 works+0.009
washer+0.009
 news+0.009
fl+0.008
Neg logit contributions
 spine-0.011
 grasp-0.010
rils-0.009
 nine-0.009
ids-0.009
Token!"
Token ID2474
Feature activation-1.86e-02

Pos logit contributions
taker+0.038
lad+0.037
washer+0.036
lling+0.036
 accidents+0.036
Neg logit contributions
 grasp-0.025
 One-0.025
 pocket-0.023
 front-0.022
 nine-0.021
Token Wh
Token ID854
Feature activation+6.09e-03

Pos logit contributions
 grasp+0.027
 One+0.024
 spine+0.024
 pocket+0.024
 grip+0.023
Neg logit contributions
taker-0.031
 accidents-0.031
washer-0.030
lad-0.030
 works-0.030
Tokenisk
Token ID1984
Feature activation+6.78e-02

Pos logit contributions
taker+0.011
 accidents+0.010
washer+0.010
lad+0.010
 works+0.010
Neg logit contributions
 grasp-0.008
 One-0.008
 spine-0.007
 pocket-0.007
 grip-0.007
Tokeners
Token ID364
Feature activation+1.93e-02

Pos logit contributions
 accidents+0.122
washer+0.112
 disl+0.110
 Acc+0.109
spe+0.107
Neg logit contributions
 nine-0.084
 front-0.076
One-0.075
 ten-0.074
ags-0.073
Token said
Token ID531
Feature activation+9.05e-03

Pos logit contributions
taker+0.026
 accidents+0.026
lad+0.025
 happens+0.025
 works+0.024
Neg logit contributions
 grasp-0.034
 One-0.033
 pocket-0.032
 spine-0.031
 front-0.031
Token,
Token ID11
Feature activation+3.37e-03

Pos logit contributions
lling+0.021
 works+0.021
lad+0.020
washer+0.020
fl+0.020
Neg logit contributions
 One-0.007
 grasp-0.007
 pocket-0.006
 grip-0.006
 spine-0.006
Token "
Token ID366
Feature activation+2.55e-02

Pos logit contributions
taker+0.006
lad+0.006
 works+0.006
 happens+0.006
washer+0.005
Neg logit contributions
 grasp-0.005
 One-0.005
 spine-0.004
 pocket-0.004
 front-0.004
TokenNo
Token ID2949
Feature activation-1.28e-02

Pos logit contributions
taker+0.033
 accidents+0.033
washer+0.032
lad+0.032
 works+0.032
Neg logit contributions
 grasp-0.046
 One-0.044
 spine-0.043
 pocket-0.042
 front-0.041
Token me
Token ID502
Feature activation-4.93e-03

Pos logit contributions
 accidents+0.013
taker+0.012
 happens+0.012
 news+0.012
washer+0.012
Neg logit contributions
 grasp-0.016
 spine-0.015
 grip-0.014
 pocket-0.014
 One-0.013
Token and
Token ID290
Feature activation+4.23e-03

Pos logit contributions
 grasp+0.010
 spine+0.009
 grip+0.009
 One+0.009
 pocket+0.009
Neg logit contributions
washer-0.006
 works-0.006
taker-0.006
 accidents-0.005
lad-0.005
Token Mr
Token ID1770
Feature activation+3.16e-02

Pos logit contributions
taker+0.008
 accidents+0.007
washer+0.007
 works+0.007
 happens+0.007
Neg logit contributions
 grasp-0.006
 spine-0.005
 One-0.005
 pocket-0.005
 grip-0.005
Token.
Token ID13
Feature activation-1.37e-03

Pos logit contributions
taker+0.059
 accidents+0.057
lad+0.057
washer+0.057
 works+0.056
Neg logit contributions
 grasp-0.040
 One-0.037
 spine-0.036
 pocket-0.035
 grip-0.034
Token Wh
Token ID854
Feature activation+2.19e-02

Pos logit contributions
 grasp+0.002
 grip+0.002
 spine+0.002
 One+0.002
 pocket+0.002
Neg logit contributions
washer-0.002
 works-0.002
 accidents-0.002
lad-0.002
taker-0.002
Tokenisk
Token ID1984
Feature activation+6.72e-02

Pos logit contributions
taker+0.041
 accidents+0.040
washer+0.039
lling+0.039
 happens+0.039
Neg logit contributions
 grasp-0.027
 One-0.026
 spine-0.025
 pocket-0.024
rils-0.023
Tokeners
Token ID364
Feature activation-9.17e-03

Pos logit contributions
 accidents+0.118
washer+0.100
 Acc+0.099
 ang+0.098
 happens+0.094
Neg logit contributions
 nine-0.090
 front-0.082
One-0.082
 One-0.081
ids-0.079
Token?"
Token ID1701
Feature activation-9.87e-03

Pos logit contributions
 grasp+0.013
 One+0.013
 pocket+0.012
 spine+0.012
 front+0.012
Neg logit contributions
taker-0.015
 accidents-0.015
lad-0.015
lling-0.014
 happens-0.014
Token Tim
Token ID5045
Feature activation+1.37e-02

Pos logit contributions
 grasp+0.014
 One+0.013
 spine+0.013
 pocket+0.012
 front+0.012
Neg logit contributions
taker-0.017
lad-0.017
 accidents-0.016
washer-0.016
 happens-0.016
Token smiled
Token ID13541
Feature activation-1.35e-02

Pos logit contributions
taker+0.021
 accidents+0.020
lad+0.020
free+0.020
 happens+0.020
Neg logit contributions
 One-0.023
 grasp-0.022
 pocket-0.021
 front-0.020
 spine-0.019
Token and
Token ID290
Feature activation+1.56e-02

Pos logit contributions
 grasp+0.017
 One+0.016
 spine+0.016
 pocket+0.015
 front+0.015
Neg logit contributions
taker-0.024
washer-0.024
lad-0.024
 accidents-0.024
 works-0.024
Token share
Token ID2648
Feature activation+1.01e-03

Pos logit contributions
taker+0.011
 accidents+0.011
washer+0.011
 works+0.011
 happens+0.011
Neg logit contributions
 grasp-0.008
 One-0.007
 spine-0.007
 pocket-0.007
 grip-0.007
Token."
Token ID526
Feature activation-9.33e-03

Pos logit contributions
taker+0.001
washer+0.001
lad+0.001
 accidents+0.001
 works+0.001
Neg logit contributions
 grasp-0.002
 One-0.002
 pocket-0.002
 spine-0.002
 front-0.001
Token Spot
Token ID15899
Feature activation+1.96e-02

Pos logit contributions
 grasp+0.014
 spine+0.013
 pocket+0.012
 grip+0.012
 One+0.012
Neg logit contributions
taker-0.015
 accidents-0.015
fl-0.015
 happens-0.014
 works-0.014
Token and
Token ID290
Feature activation-4.55e-03

Pos logit contributions
 accidents+0.035
lad+0.034
taker+0.034
rest+0.033
ymes+0.032
Neg logit contributions
 One-0.027
 grasp-0.025
 pocket-0.025
 front-0.023
 nine-0.022
Token Wh
Token ID854
Feature activation+1.55e-02

Pos logit contributions
 One+0.006
 grasp+0.006
 pocket+0.005
 front+0.005
 grip+0.005
Neg logit contributions
washer-0.009
lad-0.009
taker-0.009
 happens-0.008
 works-0.008
Tokenisk
Token ID1984
Feature activation+6.62e-02

Pos logit contributions
taker+0.028
washer+0.028
 accidents+0.028
 happens+0.027
 works+0.027
Neg logit contributions
 grasp-0.020
 One-0.019
 spine-0.018
 pocket-0.018
 grip-0.017
Tokeners
Token ID364
Feature activation+1.47e-02

Pos logit contributions
 accidents+0.110
washer+0.105
 disl+0.101
 happens+0.097
 works+0.096
Neg logit contributions
 nine-0.090
 grasp-0.081
 front-0.080
ags-0.079
One-0.079
Token agreed
Token ID4987
Feature activation+1.23e-02

Pos logit contributions
 accidents+0.021
lad+0.021
taker+0.021
 happens+0.020
 works+0.020
Neg logit contributions
 grasp-0.025
 One-0.023
 pocket-0.023
 spine-0.022
 grip-0.022
Token,
Token ID11
Feature activation+9.30e-03

Pos logit contributions
taker+0.024
 works+0.024
washer+0.024
lad+0.024
 happens+0.024
Neg logit contributions
 grasp-0.014
 One-0.013
 pocket-0.012
 spine-0.012
 grip-0.012
Token and
Token ID290
Feature activation+2.40e-02

Pos logit contributions
taker+0.018
 happens+0.018
lad+0.018
washer+0.018
 works+0.017
Neg logit contributions
 grasp-0.011
 One-0.010
 spine-0.010
 pocket-0.010
 front-0.009
Token they
Token ID484
Feature activation+1.42e-02

Pos logit contributions
lad+0.041
ymes+0.039
 works+0.038
taker+0.037
washer+0.037
Neg logit contributions
 grasp-0.036
 pocket-0.034
 One-0.033
 grip-0.031
 nine-0.031
Token the
Token ID262
Feature activation-4.62e-03

Pos logit contributions
lad+0.013
taker+0.013
washer+0.012
ull+0.012
rest+0.012
Neg logit contributions
 grasp-0.011
 One-0.011
 front-0.010
 nine-0.010
 pocket-0.010
Token mess
Token ID2085
Feature activation+2.60e-02

Pos logit contributions
 grasp+0.009
 One+0.008
 spine+0.008
 pocket+0.008
 grip+0.008
Neg logit contributions
taker-0.005
lad-0.005
 accidents-0.005
washer-0.005
 happens-0.005
Token and
Token ID290
Feature activation-1.33e-02

Pos logit contributions
haw+0.052
 disl+0.052
lad+0.052
taker+0.051
fl+0.051
Neg logit contributions
 front-0.025
 grasp-0.024
 One-0.023
 nine-0.023
 pocket-0.022
Token the
Token ID262
Feature activation+2.85e-03

Pos logit contributions
 grasp+0.020
 front+0.019
 One+0.019
 pocket+0.019
 nine+0.019
Neg logit contributions
lad-0.022
ull-0.021
taker-0.021
rest-0.021
ww-0.020
Token qu
Token ID627
Feature activation+2.92e-02

Pos logit contributions
lad+0.004
rest+0.004
taker+0.004
 happens+0.004
ull+0.004
Neg logit contributions
 grasp-0.005
 pocket-0.005
 spine-0.004
 front-0.004
 One-0.004
Tokenarre
Token ID9624
Feature activation+6.50e-02

Pos logit contributions
taker+0.040
 happens+0.040
 accidents+0.040
lad+0.039
 works+0.039
Neg logit contributions
 grasp-0.048
 One-0.045
 spine-0.043
 front-0.043
rils-0.043
Tokenling
Token ID1359
Feature activation-4.67e-03

Pos logit contributions
 disl+0.073
haw+0.068
 Acc+0.068
lad+0.067
fl+0.064
Neg logit contributions
 front-0.111
 use-0.109
 nine-0.108
 grip-0.106
 grasp-0.104
Token twins
Token ID20345
Feature activation+1.74e-02

Pos logit contributions
 grasp+0.007
 front+0.007
 One+0.007
 pocket+0.007
 grip+0.007
Neg logit contributions
lad-0.007
beh-0.007
 works-0.007
haw-0.007
rest-0.007
Token.
Token ID13
Feature activation-2.93e-03

Pos logit contributions
taker+0.029
 accidents+0.028
washer+0.028
 works+0.028
 happens+0.028
Neg logit contributions
 grasp-0.025
 One-0.023
 spine-0.023
 pocket-0.022
 grip-0.022
Token She
Token ID1375
Feature activation-1.16e-02

Pos logit contributions
 grasp+0.004
 spine+0.003
 grip+0.003
 pocket+0.003
 One+0.003
Neg logit contributions
taker-0.005
 accidents-0.005
 happens-0.005
washer-0.005
 works-0.005
Token is
Token ID318
Feature activation-7.60e-03

Pos logit contributions
 grasp+0.023
 One+0.022
 pocket+0.021
 spine+0.021
 front+0.021
Neg logit contributions
taker-0.013
lad-0.013
 accidents-0.013
washer-0.013
 happens-0.012
Token my
Token ID616
Feature activation+1.89e-02

Pos logit contributions
taker+0.007
lad+0.007
washer+0.007
lling+0.007
led+0.007
Neg logit contributions
 grasp-0.006
 One-0.006
 front-0.006
 pocket-0.005
 nine-0.005
Token new
Token ID649
Feature activation+2.03e-02

Pos logit contributions
taker+0.030
led+0.029
lling+0.029
lad+0.029
washer+0.028
Neg logit contributions
 grasp-0.032
 One-0.030
 spine-0.030
 front-0.029
 pocket-0.029
Token song
Token ID3496
Feature activation-1.58e-03

Pos logit contributions
taker+0.038
led+0.037
lad+0.037
lling+0.037
washer+0.037
Neg logit contributions
 grasp-0.026
 One-0.025
 spine-0.024
 pocket-0.024
 front-0.024
Token?"
Token ID1701
Feature activation-1.35e-02

Pos logit contributions
 grasp+0.003
 One+0.002
 spine+0.002
 pocket+0.002
 grip+0.002
Neg logit contributions
taker-0.002
 accidents-0.002
lad-0.002
washer-0.002
 works-0.002
Token Wh
Token ID854
Feature activation+6.00e-03

Pos logit contributions
 grasp+0.021
 One+0.020
 spine+0.019
 pocket+0.019
 grip+0.018
Neg logit contributions
taker-0.021
 accidents-0.020
lad-0.020
washer-0.020
 works-0.020
Tokenisk
Token ID1984
Feature activation+6.44e-02

Pos logit contributions
taker+0.011
 accidents+0.011
washer+0.011
 happens+0.010
lad+0.010
Neg logit contributions
 grasp-0.008
 One-0.007
 spine-0.007
 pocket-0.007
 grip-0.007
Tokeners
Token ID364
Feature activation+2.22e-02

Pos logit contributions
 accidents+0.110
washer+0.099
spe+0.096
ww+0.095
 disl+0.094
Neg logit contributions
 nine-0.086
One-0.082
 front-0.079
 One-0.078
ags-0.076
Token smiled
Token ID13541
Feature activation-1.75e-02

Pos logit contributions
taker+0.027
 accidents+0.026
washer+0.026
lad+0.026
 works+0.026
Neg logit contributions
 grasp-0.042
 One-0.040
 spine-0.039
 pocket-0.039
 grip-0.038
Token and
Token ID290
Feature activation+1.83e-02

Pos logit contributions
 grasp+0.026
 One+0.023
 spine+0.023
 pocket+0.023
 grip+0.022
Neg logit contributions
taker-0.029
 accidents-0.029
lad-0.028
washer-0.028
 happens-0.028
Token said
Token ID531
Feature activation+2.43e-02

Pos logit contributions
lad+0.030
taker+0.030
free+0.029
washer+0.029
lling+0.028
Neg logit contributions
 One-0.027
 grasp-0.026
 pocket-0.025
 front-0.023
 spine-0.023
Token,
Token ID11
Feature activation+1.94e-02

Pos logit contributions
lling+0.062
beh+0.057
washer+0.056
rest+0.056
taker+0.056
Neg logit contributions
 One-0.019
 grasp-0.014
 nine-0.013
 pocket-0.013
ids-0.011
Token ball
Token ID2613
Feature activation-5.26e-03

Pos logit contributions
taker+0.015
washer+0.014
lad+0.014
 happens+0.014
led+0.014
Neg logit contributions
 grasp-0.014
 One-0.013
 pocket-0.012
 spine-0.012
 grip-0.012
Token."
Token ID526
Feature activation-1.79e-02

Pos logit contributions
 grasp+0.008
 One+0.007
 pocket+0.007
 spine+0.007
 grip+0.007
Neg logit contributions
 happens-0.009
lad-0.009
taker-0.009
 works-0.009
 accidents-0.008
Token Buddy
Token ID36896
Feature activation+6.73e-03

Pos logit contributions
 grasp+0.029
 spine+0.027
 One+0.027
 pocket+0.027
 grip+0.026
Neg logit contributions
taker-0.027
 accidents-0.026
washer-0.025
 happens-0.025
 works-0.025
Token and
Token ID290
Feature activation-2.30e-02

Pos logit contributions
lad+0.012
taker+0.012
 accidents+0.012
 works+0.012
rest+0.012
Neg logit contributions
 grasp-0.009
 One-0.008
 pocket-0.008
 front-0.007
 spine-0.007
Token Wh
Token ID854
Feature activation+9.96e-03

Pos logit contributions
 grasp+0.036
 One+0.034
 spine+0.033
 pocket+0.033
 grip+0.032
Neg logit contributions
taker-0.035
lad-0.034
washer-0.034
 accidents-0.034
 works-0.034
Tokenisk
Token ID1984
Feature activation+6.42e-02

Pos logit contributions
taker+0.018
 accidents+0.017
washer+0.017
lad+0.017
 happens+0.017
Neg logit contributions
 grasp-0.014
 One-0.013
 spine-0.012
 pocket-0.012
 grip-0.012
Tokeners
Token ID364
Feature activation+6.98e-03

Pos logit contributions
 accidents+0.106
washer+0.102
 disl+0.098
 happens+0.096
iden+0.094
Neg logit contributions
 nine-0.086
 front-0.076
ags-0.076
 grasp-0.076
One-0.075
Token stopped
Token ID5025
Feature activation-3.65e-03

Pos logit contributions
lad+0.009
 accidents+0.009
taker+0.009
 happens+0.009
 works+0.009
Neg logit contributions
 grasp-0.012
 One-0.012
 pocket-0.012
 spine-0.011
 grip-0.011
Token fighting
Token ID4330
Feature activation+9.50e-03

Pos logit contributions
 grasp+0.007
 spine+0.006
 One+0.006
 pocket+0.006
 front+0.006
Neg logit contributions
 accidents-0.005
taker-0.004
lling-0.004
 works-0.004
fl-0.004
Token.
Token ID13
Feature activation-1.57e-03

Pos logit contributions
taker+0.018
lad+0.018
washer+0.018
 happens+0.018
 works+0.018
Neg logit contributions
 grasp-0.011
 One-0.011
 spine-0.010
 front-0.009
 grip-0.009
Token They
Token ID1119
Feature activation-1.91e-03

Pos logit contributions
 grasp+0.003
 spine+0.002
 grip+0.002
 pocket+0.002
 nine+0.002
Neg logit contributions
taker-0.002
 accidents-0.002
 happens-0.002
washer-0.002
fl-0.002
Token singing
Token ID13777
Feature activation+4.59e-03

Pos logit contributions
 grasp+0.010
 One+0.009
 pocket+0.009
 spine+0.009
 grip+0.009
Neg logit contributions
 accidents-0.008
fl-0.007
 works-0.007
taker-0.007
lad-0.007
Token and
Token ID290
Feature activation+1.56e-02

Pos logit contributions
lad+0.010
washer+0.010
beh+0.010
 works+0.010
 disl+0.010
Neg logit contributions
 One-0.004
 grasp-0.004
 front-0.004
 spine-0.003
 nine-0.003
Token looked
Token ID3114
Feature activation+2.85e-02

Pos logit contributions
lad+0.026
washer+0.026
taker+0.025
ymes+0.024
ull+0.024
Neg logit contributions
 grasp-0.024
 One-0.022
 pocket-0.022
 spine-0.021
 front-0.021
Token at
Token ID379
Feature activation-3.30e-03

Pos logit contributions
washer+0.052
lad+0.052
taker+0.051
 works+0.051
rest+0.050
Neg logit contributions
 grasp-0.036
 One-0.035
 pocket-0.033
 spine-0.031
 front-0.031
Token Wh
Token ID854
Feature activation+7.54e-03

Pos logit contributions
 grasp+0.005
 One+0.005
 spine+0.005
 pocket+0.005
 grip+0.004
Neg logit contributions
taker-0.005
washer-0.005
lad-0.005
 accidents-0.005
 works-0.005
Tokenisk
Token ID1984
Feature activation+6.33e-02

Pos logit contributions
taker+0.013
 accidents+0.013
lad+0.013
washer+0.013
 works+0.013
Neg logit contributions
 grasp-0.010
 One-0.010
 spine-0.009
 pocket-0.009
 grip-0.009
Tokeners
Token ID364
Feature activation+2.19e-03

Pos logit contributions
 accidents+0.109
 disl+0.104
washer+0.102
 Acc+0.101
 Sorry+0.095
Neg logit contributions
 nine-0.077
 grasp-0.072
 front-0.072
ids-0.071
 One-0.066
Token.
Token ID13
Feature activation-7.86e-03

Pos logit contributions
taker+0.004
 accidents+0.004
washer+0.004
 works+0.004
lad+0.004
Neg logit contributions
 grasp-0.003
 One-0.003
 spine-0.003
 pocket-0.003
 grip-0.003
Token
Token ID198
Feature activation+1.65e-03

Pos logit contributions
 grasp+0.013
 spine+0.012
 grip+0.012
 pocket+0.012
 front+0.012
Neg logit contributions
taker-0.011
 accidents-0.011
 happens-0.011
lad-0.010
washer-0.010
Token
Token ID198
Feature activation-9.85e-03

Pos logit contributions
taker+0.003
lad+0.003
washer+0.003
 happens+0.003
 works+0.003
Neg logit contributions
 grasp-0.002
 spine-0.002
 pocket-0.002
 grip-0.002
 One-0.002
Token"
Token ID1
Feature activation+1.18e-02

Pos logit contributions
 grasp+0.015
 One+0.014
 spine+0.014
 pocket+0.014
 front+0.013
Neg logit contributions
taker-0.015
 accidents-0.015
washer-0.015
lad-0.015
 works-0.015
Token his
Token ID465
Feature activation+5.99e-03

Pos logit contributions
taker+0.012
 happens+0.012
lad+0.012
washer+0.012
 accidents+0.012
Neg logit contributions
 grasp-0.013
 One-0.012
 pocket-0.012
 front-0.012
 spine-0.012
Token puzzle
Token ID15027
Feature activation-6.96e-03

Pos logit contributions
taker+0.006
 accidents+0.006
 works+0.006
washer+0.006
lad+0.006
Neg logit contributions
 grasp-0.012
 One-0.012
 spine-0.011
 pocket-0.011
rils-0.011
Token and
Token ID290
Feature activation+2.21e-02

Pos logit contributions
 grasp+0.010
 One+0.010
 spine+0.009
 pocket+0.009
rils+0.009
Neg logit contributions
taker-0.011
washer-0.011
 accidents-0.011
 works-0.011
lad-0.011
Token thanked
Token ID26280
Feature activation+1.27e-03

Pos logit contributions
ymes+0.039
lad+0.039
free+0.037
taker+0.036
iness+0.035
Neg logit contributions
 grasp-0.031
 pocket-0.030
 One-0.028
 front-0.026
 use-0.026
Token Wh
Token ID854
Feature activation+4.83e-03

Pos logit contributions
taker+0.002
 accidents+0.002
lad+0.002
washer+0.002
 works+0.002
Neg logit contributions
 grasp-0.002
 One-0.002
 spine-0.002
 pocket-0.002
 grip-0.002
Tokenisk
Token ID1984
Feature activation+6.33e-02

Pos logit contributions
taker+0.009
 happens+0.009
washer+0.009
 accidents+0.009
 works+0.009
Neg logit contributions
 grasp-0.006
 One-0.006
 spine-0.005
 pocket-0.005
 grip-0.005
Tokeners
Token ID364
Feature activation-1.73e-03

Pos logit contributions
 accidents+0.107
washer+0.098
 disl+0.097
spe+0.094
 Acc+0.094
Neg logit contributions
 nine-0.079
 front-0.072
 One-0.072
 grasp-0.072
One-0.071
Token for
Token ID329
Feature activation+4.84e-03

Pos logit contributions
 grasp+0.002
 One+0.002
 spine+0.002
 pocket+0.002
 grip+0.002
Neg logit contributions
taker-0.003
washer-0.003
 accidents-0.003
lad-0.003
 works-0.003
Token helping
Token ID5742
Feature activation-4.48e-03

Pos logit contributions
 happens+0.007
washer+0.007
led+0.007
taker+0.007
 works+0.007
Neg logit contributions
 grasp-0.008
 front-0.007
 pocket-0.007
 One-0.007
 nine-0.007
Token him
Token ID683
Feature activation+2.00e-02

Pos logit contributions
 grasp+0.008
 One+0.008
 spine+0.007
 pocket+0.007
 grip+0.007
Neg logit contributions
taker-0.006
lad-0.006
washer-0.006
 accidents-0.006
 works-0.006
Token.
Token ID13
Feature activation+6.39e-03

Pos logit contributions
taker+0.037
 accidents+0.036
washer+0.036
 works+0.036
lad+0.036
Neg logit contributions
 grasp-0.025
 One-0.023
 spine-0.023
 pocket-0.022
 grip-0.021
Token a
Token ID257
Feature activation+1.84e-02

Pos logit contributions
lad+0.037
lling+0.034
 works+0.034
led+0.034
rated+0.034
Neg logit contributions
 One-0.022
 grasp-0.022
 front-0.021
 nine-0.021
 pocket-0.020
Token cheerful
Token ID37999
Feature activation+1.53e-02

Pos logit contributions
led+0.030
rest+0.030
lad+0.030
beh+0.030
taker+0.029
Neg logit contributions
 grasp-0.028
 pocket-0.028
 front-0.027
 One-0.026
 grip-0.025
Token little
Token ID1310
Feature activation+2.02e-02

Pos logit contributions
lad+0.023
led+0.023
 happens+0.023
rest+0.023
taker+0.022
Neg logit contributions
 grasp-0.025
 pocket-0.024
 One-0.024
 spine-0.023
 front-0.023
Token 3
Token ID513
Feature activation+3.45e-02

Pos logit contributions
lad+0.022
taker+0.021
led+0.020
rest+0.020
 happens+0.020
Neg logit contributions
 grasp-0.043
 pocket-0.041
 One-0.040
 spine-0.040
 front-0.039
Token-
Token ID12
Feature activation+4.13e-02

Pos logit contributions
 happens+0.052
lad+0.052
free+0.050
fl+0.050
 works+0.049
Neg logit contributions
 grasp-0.053
 front-0.052
 pocket-0.052
 One-0.051
 grip-0.048
Tokenyear
Token ID1941
Feature activation+6.24e-02

Pos logit contributions
 accidents+0.047
 happens+0.043
 works+0.041
fl+0.039
lad+0.039
Neg logit contributions
 One-0.080
 grasp-0.080
 nine-0.080
 front-0.076
One-0.076
Token-
Token ID12
Feature activation+4.92e-02

Pos logit contributions
lad+0.057
 accidents+0.057
 works+0.056
 happens+0.055
taker+0.054
Neg logit contributions
 grasp-0.136
 One-0.128
 pocket-0.128
 spine-0.127
 grip-0.125
Tokenold
Token ID727
Feature activation+2.99e-02

Pos logit contributions
 accidents+0.075
 happens+0.072
 works+0.070
lad+0.069
fl+0.069
Neg logit contributions
 grasp-0.076
 One-0.074
 nine-0.071
 pocket-0.070
 front-0.069
Token boy
Token ID2933
Feature activation+1.79e-02

Pos logit contributions
lad+0.044
lling+0.043
 happens+0.043
ww+0.042
rest+0.042
Neg logit contributions
 pocket-0.050
 grasp-0.048
 front-0.046
 One-0.046
 nine-0.046
Token.
Token ID13
Feature activation+9.08e-03

Pos logit contributions
taker+0.033
lad+0.032
 accidents+0.032
 happens+0.032
 works+0.032
Neg logit contributions
 grasp-0.023
 One-0.021
 pocket-0.021
 spine-0.020
 front-0.020
Token He
Token ID679
Feature activation+8.62e-03

Pos logit contributions
taker+0.016
 accidents+0.015
lad+0.015
washer+0.015
 works+0.015
Neg logit contributions
 grasp-0.012
 One-0.012
 spine-0.011
 pocket-0.011
 front-0.011
Token said
Token ID531
Feature activation+3.37e-03

Pos logit contributions
taker+0.024
lad+0.023
 accidents+0.023
free+0.023
washer+0.023
Neg logit contributions
 One-0.023
 grasp-0.021
 pocket-0.020
One-0.020
 front-0.019
Token,
Token ID11
Feature activation+1.33e-02

Pos logit contributions
lling+0.009
 works+0.008
rest+0.008
lad+0.008
washer+0.008
Neg logit contributions
 One-0.003
 grasp-0.002
 nine-0.002
 pocket-0.002
 spine-0.002
Token "
Token ID366
Feature activation+1.14e-02

Pos logit contributions
lling+0.025
rest+0.025
lad+0.025
 works+0.025
taker+0.025
Neg logit contributions
 One-0.017
 grasp-0.015
 spine-0.014
 front-0.014
 pocket-0.013
TokenHi
Token ID17250
Feature activation-1.85e-02

Pos logit contributions
taker+0.018
washer+0.018
lad+0.017
 accidents+0.017
lling+0.017
Neg logit contributions
 One-0.018
 grasp-0.018
One-0.017
 front-0.017
 spine-0.016
Token Wh
Token ID854
Feature activation+3.82e-03

Pos logit contributions
 One+0.022
 grasp+0.021
 pocket+0.019
 front+0.019
 nine+0.018
Neg logit contributions
taker-0.038
lling-0.036
washer-0.036
led-0.035
lad-0.035
Tokenisk
Token ID1984
Feature activation+6.23e-02

Pos logit contributions
taker+0.006
lad+0.006
 accidents+0.006
 works+0.006
washer+0.006
Neg logit contributions
 grasp-0.006
 One-0.005
 spine-0.005
 pocket-0.005
 grip-0.005
Tokeners
Token ID364
Feature activation+2.17e-03

Pos logit contributions
 accidents+0.099
washer+0.090
spe+0.088
rated+0.086
taker+0.084
Neg logit contributions
 nine-0.090
 front-0.085
 One-0.084
ids-0.082
One-0.082
Token!
Token ID0
Feature activation+2.86e-03

Pos logit contributions
taker+0.003
 accidents+0.003
lad+0.003
washer+0.003
 works+0.003
Neg logit contributions
 grasp-0.003
 One-0.003
 spine-0.003
 pocket-0.003
 grip-0.003
Token Do
Token ID2141
Feature activation+2.42e-03

Pos logit contributions
lling+0.005
taker+0.004
led+0.004
 accidents+0.004
 works+0.004
Neg logit contributions
 One-0.005
 grasp-0.004
 spine-0.004
 pocket-0.004
rils-0.004
Token you
Token ID345
Feature activation-6.27e-04

Pos logit contributions
taker+0.003
washer+0.003
 accidents+0.003
lling+0.003
 works+0.003
Neg logit contributions
 grasp-0.005
 One-0.004
 spine-0.004
 pocket-0.004
 front-0.004
Token want
Token ID765
Feature activation-8.30e-03

Pos logit contributions
 grasp+0.001
 One+0.001
 spine+0.001
 pocket+0.001
 front+0.001
Neg logit contributions
taker-0.001
lad-0.001
 works-0.001
 accidents-0.001
washer-0.001
Token family
Token ID1641
Feature activation+1.63e-02

Pos logit contributions
 grasp+0.013
 One+0.012
 spine+0.011
 pocket+0.011
 grip+0.011
Neg logit contributions
taker-0.017
lad-0.017
 happens-0.017
 accidents-0.016
 works-0.016
Token sh
Token ID427
Feature activation+9.62e-03

Pos logit contributions
lad+0.022
taker+0.022
ww+0.021
 accidents+0.021
fl+0.021
Neg logit contributions
 grasp-0.029
 One-0.028
 pocket-0.027
 front-0.026
 nine-0.025
Tokenoo
Token ID2238
Feature activation+1.68e-02

Pos logit contributions
taker+0.014
 accidents+0.013
washer+0.013
lad+0.013
 happens+0.013
Neg logit contributions
 grasp-0.016
 One-0.015
 spine-0.015
 pocket-0.015
 grip-0.014
Tokened
Token ID276
Feature activation-2.71e-03

Pos logit contributions
lad+0.026
ull+0.024
washer+0.023
 disl+0.023
vell+0.023
Neg logit contributions
 One-0.025
 grasp-0.024
 front-0.023
 both-0.023
 spine-0.023
Token Wh
Token ID854
Feature activation+1.45e-02

Pos logit contributions
 One+0.003
 grasp+0.003
 front+0.003
 spine+0.003
 pocket+0.003
Neg logit contributions
lad-0.006
washer-0.005
 happens-0.005
taker-0.005
rest-0.005
Tokenisk
Token ID1984
Feature activation+6.17e-02

Pos logit contributions
taker+0.025
 accidents+0.024
lad+0.024
washer+0.024
 works+0.024
Neg logit contributions
 grasp-0.020
 One-0.019
 spine-0.018
 pocket-0.018
 grip-0.017
Tokeners
Token ID364
Feature activation-8.51e-03

Pos logit contributions
 accidents+0.103
washer+0.095
 happens+0.094
 disl+0.093
 Acc+0.090
Neg logit contributions
 nine-0.081
 grasp-0.075
 front-0.074
ids-0.074
One-0.071
Token away
Token ID1497
Feature activation-8.06e-03

Pos logit contributions
 grasp+0.012
 One+0.011
 spine+0.011
 pocket+0.010
 grip+0.010
Neg logit contributions
taker-0.015
 accidents-0.014
washer-0.014
lad-0.014
 works-0.014
Token and
Token ID290
Feature activation+6.71e-03

Pos logit contributions
 grasp+0.009
 spine+0.009
 One+0.009
 front+0.008
 grip+0.008
Neg logit contributions
lad-0.016
 happens-0.015
taker-0.015
washer-0.015
 disl-0.015
Token told
Token ID1297
Feature activation+9.29e-04

Pos logit contributions
lad+0.013
ymes+0.012
ull+0.012
taker+0.012
lling+0.011
Neg logit contributions
 grasp-0.009
 pocket-0.009
 One-0.009
 grip-0.008
 spine-0.008
Token Max
Token ID5436
Feature activation+1.39e-02

Pos logit contributions
lad+0.002
washer+0.002
lling+0.002
ull+0.002
taker+0.002
Neg logit contributions
 One-0.001
 grasp-0.001
 injuries-0.001
 front-0.001
rils-0.001
Token and
Token ID290
Feature activation+1.98e-02

Pos logit contributions
 grasp+0.043
 One+0.041
 spine+0.040
 front+0.039
 pocket+0.039
Neg logit contributions
taker-0.049
 works-0.049
 happens-0.049
lad-0.048
washer-0.048
Token replied
Token ID8712
Feature activation+1.05e-02

Pos logit contributions
lad+0.030
taker+0.029
washer+0.029
 accidents+0.029
ymes+0.029
Neg logit contributions
 grasp-0.032
 One-0.031
 pocket-0.030
 spine-0.029
 grip-0.028
Token with
Token ID351
Feature activation+6.02e-03

Pos logit contributions
lling+0.024
 works+0.023
ymes+0.023
lad+0.023
 accidents+0.023
Neg logit contributions
 One-0.009
 grasp-0.009
 pocket-0.008
 nine-0.007
 front-0.007
Token a
Token ID257
Feature activation+1.18e-02

Pos logit contributions
taker+0.009
 accidents+0.008
 works+0.008
 happens+0.008
lad+0.008
Neg logit contributions
 grasp-0.010
 spine-0.010
 One-0.010
 pocket-0.009
 grip-0.009
Token mis
Token ID2984
Feature activation+4.85e-02

Pos logit contributions
led+0.021
rest+0.020
lling+0.020
 accidents+0.020
beh+0.019
Neg logit contributions
 grasp-0.018
 pocket-0.017
 front-0.016
 One-0.016
 grip-0.016
Tokenchie
Token ID3043
Feature activation+6.16e-02

Pos logit contributions
 accidents+0.075
taker+0.074
 happens+0.074
 works+0.073
washer+0.072
Neg logit contributions
 grasp-0.076
 One-0.070
 spine-0.069
 pocket-0.068
 grip-0.066
Tokenvous
Token ID31222
Feature activation+1.71e-02

Pos logit contributions
 accidents+0.093
 works+0.075
washer+0.073
sets+0.073
fl+0.073
Neg logit contributions
 One-0.108
 grasp-0.101
 front-0.097
One-0.097
 nine-0.096
Token gl
Token ID1278
Feature activation+2.01e-02

Pos logit contributions
led+0.029
washer+0.029
rest+0.029
lling+0.029
 accidents+0.029
Neg logit contributions
 grasp-0.025
 One-0.023
 pocket-0.022
 spine-0.022
 grip-0.022
Tokenint
Token ID600
Feature activation-8.35e-03

Pos logit contributions
 accidents+0.036
taker+0.035
 happens+0.035
 works+0.035
washer+0.035
Neg logit contributions
 grasp-0.026
 spine-0.024
 One-0.024
 pocket-0.023
 grip-0.023
Token in
Token ID287
Feature activation+3.90e-03

Pos logit contributions
 grasp+0.018
 One+0.018
 spine+0.017
 pocket+0.017
 front+0.017
Neg logit contributions
 accidents-0.007
 works-0.007
 disl-0.006
lad-0.006
lling-0.006
Token his
Token ID465
Feature activation+1.07e-02

Pos logit contributions
taker+0.005
 accidents+0.005
 works+0.005
 happens+0.005
ull+0.005
Neg logit contributions
 grasp-0.007
 spine-0.007
 One-0.007
 pocket-0.006
 grip-0.006
Token,
Token ID11
Feature activation-2.68e-03

Pos logit contributions
taker+0.021
 accidents+0.021
lad+0.021
 works+0.021
washer+0.021
Neg logit contributions
 grasp-0.014
 One-0.013
 spine-0.013
 pocket-0.012
 front-0.012
Token kissing
Token ID25847
Feature activation-4.73e-03

Pos logit contributions
 grasp+0.005
 One+0.004
 spine+0.004
 pocket+0.004
 grip+0.004
Neg logit contributions
 accidents-0.004
taker-0.004
washer-0.003
lad-0.003
 sound-0.003
Token them
Token ID606
Feature activation+3.10e-02

Pos logit contributions
 One+0.004
 front+0.004
 pocket+0.004
 grasp+0.004
 hand+0.003
Neg logit contributions
lad-0.010
beh-0.010
 works-0.010
taker-0.010
washer-0.010
Token on
Token ID319
Feature activation-3.17e-03

Pos logit contributions
lad+0.059
 happens+0.058
taker+0.058
 accidents+0.058
washer+0.057
Neg logit contributions
 grasp-0.036
 One-0.036
 spine-0.034
 front-0.033
 pocket-0.033
Token their
Token ID511
Feature activation+3.55e-02

Pos logit contributions
 grasp+0.006
 spine+0.005
 pocket+0.005
 grip+0.005
 One+0.005
Neg logit contributions
 accidents-0.004
taker-0.004
 sound-0.004
 works-0.004
 news-0.004
Token fore
Token ID1674
Feature activation+6.16e-02

Pos logit contributions
lad+0.073
taker+0.072
ing+0.071
washer+0.071
lling+0.071
Neg logit contributions
 grasp-0.039
 spine-0.036
 pocket-0.035
 One-0.035
 front-0.034
Tokenheads
Token ID16600
Feature activation+3.31e-03

Pos logit contributions
taker+0.086
lad+0.084
washer+0.084
 accidents+0.083
 works+0.082
Neg logit contributions
 grasp-0.106
 One-0.099
 pocket-0.096
 spine-0.096
 grip-0.093
Token.
Token ID13
Feature activation+6.07e-03

Pos logit contributions
taker+0.006
washer+0.006
 accidents+0.006
lad+0.006
 works+0.006
Neg logit contributions
 grasp-0.004
 One-0.004
 spine-0.004
 pocket-0.004
 grip-0.004
Token
Token ID198
Feature activation+1.01e-02

Pos logit contributions
 happens+0.007
taker+0.007
 accidents+0.007
lad+0.007
ull+0.007
Neg logit contributions
 grasp-0.012
 grip-0.011
 spine-0.011
 pocket-0.011
rils-0.010
Token
Token ID198
Feature activation-1.21e-03

Pos logit contributions
taker+0.016
 accidents+0.015
washer+0.015
lad+0.015
 happens+0.015
Neg logit contributions
 grasp-0.016
 spine-0.015
 One-0.014
 pocket-0.014
 grip-0.014
TokenShe
Token ID3347
Feature activation+8.33e-04

Pos logit contributions
 grasp+0.003
 spine+0.002
 grip+0.002
 One+0.002
 pocket+0.002
Neg logit contributions
lad-0.001
 accidents-0.001
washer-0.001
ww-0.001
ull-0.001
Token Wh
Token ID854
Feature activation+1.53e-03

Pos logit contributions
 One+0.027
 grasp+0.026
 pocket+0.024
 front+0.023
 nine+0.023
Neg logit contributions
led-0.029
lling-0.028
taker-0.028
washer-0.028
lad-0.027
Tokenisk
Token ID1984
Feature activation+5.34e-02

Pos logit contributions
taker+0.003
lad+0.003
 accidents+0.002
 works+0.002
washer+0.002
Neg logit contributions
 grasp-0.002
 One-0.002
 spine-0.002
 pocket-0.002
 grip-0.002
Tokeners
Token ID364
Feature activation-8.16e-03

Pos logit contributions
 accidents+0.091
 Acc+0.084
 disl+0.081
spe+0.081
 fortunate+0.080
Neg logit contributions
 nine-0.069
 front-0.063
One-0.063
ids-0.061
ary-0.058
Token?"
Token ID1701
Feature activation-1.54e-02

Pos logit contributions
 grasp+0.011
 One+0.010
 pocket+0.010
 spine+0.009
 grip+0.009
Neg logit contributions
taker-0.015
lad-0.015
 accidents-0.015
fl-0.014
lling-0.014
Token Wh
Token ID854
Feature activation+3.04e-03

Pos logit contributions
 grasp+0.025
 One+0.023
 spine+0.023
 pocket+0.022
 front+0.022
Neg logit contributions
taker-0.023
 accidents-0.023
lad-0.023
washer-0.023
 works-0.022
Tokenisk
Token ID1984
Feature activation+6.15e-02

Pos logit contributions
taker+0.005
 accidents+0.005
washer+0.005
 happens+0.005
lad+0.005
Neg logit contributions
 grasp-0.004
 One-0.004
 spine-0.004
 pocket-0.004
 grip-0.004
Tokeners
Token ID364
Feature activation+1.67e-02

Pos logit contributions
 accidents+0.103
washer+0.095
 disl+0.095
ww+0.095
spe+0.094
Neg logit contributions
 nine-0.081
One-0.076
 front-0.074
ags-0.073
 One-0.071
Token replied
Token ID8712
Feature activation+2.26e-03

Pos logit contributions
taker+0.021
 accidents+0.021
lad+0.020
 works+0.020
washer+0.020
Neg logit contributions
 grasp-0.031
 One-0.030
 spine-0.028
 pocket-0.028
 front-0.028
Token,
Token ID11
Feature activation+9.66e-03

Pos logit contributions
lling+0.005
 works+0.005
washer+0.005
taker+0.005
lad+0.005
Neg logit contributions
 One-0.002
 grasp-0.002
 spine-0.002
 nine-0.002
 front-0.002
Token "
Token ID366
Feature activation+2.06e-02

Pos logit contributions
taker+0.016
 works+0.016
lad+0.016
washer+0.016
 happens+0.016
Neg logit contributions
 One-0.013
 grasp-0.013
 spine-0.012
 front-0.012
 pocket-0.012
TokenI
Token ID40
Feature activation-2.10e-02

Pos logit contributions
lad+0.042
lling+0.041
taker+0.040
 works+0.040
 accidents+0.038
Neg logit contributions
 One-0.023
One-0.021
 grasp-0.020
 spine-0.018
 front-0.018
Token "
Token ID366
Feature activation+1.10e-02

Pos logit contributions
taker+0.012
 works+0.012
washer+0.012
 accidents+0.012
 happens+0.012
Neg logit contributions
 grasp-0.011
 One-0.011
 spine-0.010
 pocket-0.010
 front-0.010
TokenThank
Token ID10449
Feature activation-2.36e-03

Pos logit contributions
washer+0.017
taker+0.016
 works+0.016
 happens+0.015
 accidents+0.015
Neg logit contributions
 grasp-0.018
 One-0.017
One-0.017
 front-0.016
 pocket-0.016
Token you
Token ID345
Feature activation+1.00e-02

Pos logit contributions
 grasp+0.004
 One+0.004
 pocket+0.004
 front+0.004
 spine+0.004
Neg logit contributions
taker-0.003
washer-0.003
 works-0.003
lling-0.003
lad-0.003
Token,
Token ID11
Feature activation-1.93e-02

Pos logit contributions
taker+0.018
 works+0.018
lling+0.018
 happens+0.018
 accidents+0.018
Neg logit contributions
 grasp-0.012
 One-0.012
 spine-0.011
 pocket-0.011
 front-0.011
Token Wh
Token ID854
Feature activation+5.28e-03

Pos logit contributions
 grasp+0.032
 One+0.030
 spine+0.029
 pocket+0.029
 grip+0.028
Neg logit contributions
taker-0.028
washer-0.027
lad-0.027
 accidents-0.027
 works-0.027
Tokenisk
Token ID1984
Feature activation+6.14e-02

Pos logit contributions
taker+0.010
washer+0.009
 accidents+0.009
 happens+0.009
 works+0.009
Neg logit contributions
 grasp-0.007
 One-0.007
 spine-0.006
 pocket-0.006
 grip-0.006
Tokeners
Token ID364
Feature activation-1.67e-04

Pos logit contributions
 accidents+0.108
washer+0.100
 Acc+0.096
spe+0.094
 disl+0.092
Neg logit contributions
 nine-0.074
 front-0.068
One-0.066
 One-0.064
 grasp-0.064
Token!"
Token ID2474
Feature activation-1.64e-02

Pos logit contributions
 grasp+0.000
 One+0.000
 pocket+0.000
 front+0.000
 nine+0.000
Neg logit contributions
 accidents-0.000
lling-0.000
 happens-0.000
taker-0.000
led-0.000
Token
Token ID198
Feature activation-5.62e-03

Pos logit contributions
 grasp+0.020
 One+0.019
 spine+0.018
 pocket+0.018
 front+0.017
Neg logit contributions
taker-0.031
lad-0.031
washer-0.030
 accidents-0.030
 happens-0.030
Token
Token ID198
Feature activation-1.16e-02

Pos logit contributions
 grasp+0.008
 spine+0.007
 pocket+0.007
 One+0.007
 grip+0.007
Neg logit contributions
taker-0.010
lad-0.009
washer-0.009
 accidents-0.009
 happens-0.009
TokenThe
Token ID464
Feature activation-6.89e-03

Pos logit contributions
 grasp+0.018
 One+0.017
 spine+0.017
 pocket+0.016
 grip+0.016
Neg logit contributions
taker-0.018
 accidents-0.018
washer-0.018
lad-0.018
 works-0.017
Token to
Token ID284
Feature activation+1.73e-02

Pos logit contributions
 grasp+0.010
 spine+0.009
 One+0.009
 grip+0.008
 pocket+0.008
Neg logit contributions
 works-0.007
fl-0.007
 happens-0.007
washer-0.007
ing-0.007
Token our
Token ID674
Feature activation+1.15e-02

Pos logit contributions
 accidents+0.017
 works+0.015
 disl+0.015
 happens+0.015
 sound+0.015
Neg logit contributions
 spine-0.036
 grasp-0.036
ids-0.034
 One-0.033
 pocket-0.033
Token castle
Token ID16669
Feature activation-4.19e-03

Pos logit contributions
taker+0.019
lad+0.018
 accidents+0.018
washer+0.018
 works+0.018
Neg logit contributions
 grasp-0.017
 One-0.016
 spine-0.016
 pocket-0.016
 grip-0.015
Token."
Token ID526
Feature activation-8.69e-03

Pos logit contributions
 grasp+0.007
 One+0.006
 spine+0.006
 pocket+0.006
 grip+0.006
Neg logit contributions
taker-0.006
washer-0.006
 works-0.006
 accidents-0.006
lad-0.006
Token Wh
Token ID854
Feature activation+8.88e-03

Pos logit contributions
 grasp+0.013
 spine+0.011
 grip+0.011
 One+0.011
 pocket+0.011
Neg logit contributions
taker-0.015
 accidents-0.015
washer-0.014
 works-0.014
 happens-0.014
Tokenisk
Token ID1984
Feature activation+6.12e-02

Pos logit contributions
taker+0.016
 accidents+0.016
washer+0.016
 happens+0.015
lad+0.015
Neg logit contributions
 grasp-0.012
 One-0.011
 spine-0.011
 pocket-0.010
 grip-0.010
Tokeners
Token ID364
Feature activation+1.67e-02

Pos logit contributions
 accidents+0.101
washer+0.095
 disl+0.093
ww+0.091
 happens+0.091
Neg logit contributions
 nine-0.080
One-0.073
 front-0.071
 grasp-0.071
ags-0.069
Token forg
Token ID5784
Feature activation+1.18e-02

Pos logit contributions
taker+0.022
lad+0.022
 accidents+0.022
 happens+0.022
 works+0.021
Neg logit contributions
 grasp-0.030
 One-0.028
 pocket-0.028
 spine-0.027
 front-0.027
Tokenave
Token ID1015
Feature activation-1.39e-02

Pos logit contributions
 accidents+0.020
fl+0.019
washer+0.019
lad+0.019
taker+0.019
Neg logit contributions
 grasp-0.016
 One-0.016
 spine-0.015
 pocket-0.014
 grip-0.014
Token her
Token ID607
Feature activation+1.47e-02

Pos logit contributions
 One+0.018
 grasp+0.017
 pocket+0.016
 front+0.015
 grip+0.015
Neg logit contributions
taker-0.026
lad-0.026
washer-0.026
lling-0.025
 happens-0.025
Token,
Token ID11
Feature activation+5.00e-03

Pos logit contributions
ww+0.033
 happens+0.033
lad+0.032
haw+0.031
 works+0.031
Neg logit contributions
 One-0.013
 grasp-0.013
 pocket-0.012
 spine-0.012
 grip-0.011
Token strange
Token ID6283
Feature activation-1.15e-03

Pos logit contributions
 grasp+0.018
 One+0.017
 spine+0.017
 pocket+0.017
 grip+0.016
Neg logit contributions
taker-0.013
 accidents-0.013
lad-0.013
washer-0.013
 works-0.013
Token birds
Token ID10087
Feature activation-7.80e-03

Pos logit contributions
 One+0.002
 grasp+0.002
 front+0.002
 pocket+0.002
 nine+0.002
Neg logit contributions
lad-0.001
taker-0.001
washer-0.001
lling-0.001
 accidents-0.001
Token.
Token ID13
Feature activation-4.41e-04

Pos logit contributions
 grasp+0.012
 One+0.011
 spine+0.010
 pocket+0.010
 front+0.010
Neg logit contributions
taker-0.013
lad-0.012
washer-0.012
 accidents-0.012
 happens-0.012
Token They
Token ID1119
Feature activation-4.46e-03

Pos logit contributions
 grasp+0.001
 spine+0.001
 grip+0.001
 pocket+0.001
 injuries+0.001
Neg logit contributions
 happens-0.001
taker-0.001
 accidents-0.001
 works-0.001
 news-0.001
Token had
Token ID550
Feature activation-2.07e-02

Pos logit contributions
 grasp+0.009
 spine+0.009
 One+0.008
 pocket+0.008
 grip+0.008
Neg logit contributions
taker-0.005
washer-0.005
 works-0.005
 accidents-0.005
 happens-0.005
Token long
Token ID890
Feature activation-6.71e-02

Pos logit contributions
 grasp+0.034
 spine+0.032
 One+0.031
 pocket+0.031
 grip+0.030
Neg logit contributions
taker-0.029
 happens-0.029
 works-0.029
 accidents-0.029
lad-0.028
Token necks
Token ID44184
Feature activation-4.05e-02

Pos logit contributions
 grasp+0.084
 spine+0.076
 pocket+0.073
+0.071
 grip+0.069
Neg logit contributions
fl-0.126
 happens-0.122
 works-0.121
led-0.120
 news-0.118
Token,
Token ID11
Feature activation-1.74e-02

Pos logit contributions
 grasp+0.057
 One+0.052
 spine+0.051
 pocket+0.050
 grip+0.049
Neg logit contributions
taker-0.069
 accidents-0.067
washer-0.067
lad-0.067
 works-0.066
Token long
Token ID890
Feature activation-5.09e-02

Pos logit contributions
 grasp+0.027
 spine+0.025
 One+0.025
 pocket+0.024
 grip+0.023
Neg logit contributions
taker-0.027
 accidents-0.027
 works-0.026
 happens-0.026
washer-0.026
Token legs
Token ID7405
Feature activation-4.21e-02

Pos logit contributions
 grasp+0.061
 spine+0.054
+0.054
 pocket+0.054
 One+0.052
Neg logit contributions
fl-0.096
led-0.090
 happens-0.090
 works-0.089
 noises-0.088
Token,
Token ID11
Feature activation-1.22e-02

Pos logit contributions
 grasp+0.071
 spine+0.064
 One+0.064
 pocket+0.063
 grip+0.061
Neg logit contributions
ing-0.062
taker-0.061
washer-0.057
free-0.057
fl-0.057
Tokenes
Token ID274
Feature activation-3.57e-04

Pos logit contributions
 accidents+0.005
taker+0.005
washer+0.005
 happens+0.005
lad+0.005
Neg logit contributions
 grasp-0.011
 One-0.011
 nine-0.010
 pocket-0.010
 spine-0.010
Token best
Token ID1266
Feature activation-1.03e-03

Pos logit contributions
 One+0.000
 grasp+0.000
 pocket+0.000
 front+0.000
 nine+0.000
Neg logit contributions
lad-0.001
 accidents-0.001
 happens-0.001
taker-0.001
washer-0.001
Token.
Token ID13
Feature activation-1.25e-03

Pos logit contributions
 grasp+0.002
 One+0.001
 spine+0.001
 pocket+0.001
 grip+0.001
Neg logit contributions
taker-0.002
 accidents-0.002
lad-0.002
washer-0.002
 works-0.002
Token They
Token ID1119
Feature activation+2.87e-03

Pos logit contributions
 grasp+0.002
 injuries+0.002
 grip+0.002
 spine+0.002
ider+0.002
Neg logit contributions
fl-0.002
 means-0.001
 boy-0.001
ing-0.001
 happens-0.001
Token had
Token ID550
Feature activation-1.86e-02

Pos logit contributions
 works+0.003
taker+0.003
fl+0.003
washer+0.003
 happens+0.002
Neg logit contributions
 grasp-0.006
 spine-0.006
rils-0.006
 grip-0.006
 injuries-0.006
Token long
Token ID890
Feature activation-6.66e-02

Pos logit contributions
 grasp+0.032
 spine+0.030
 One+0.030
 pocket+0.029
 grip+0.029
Neg logit contributions
taker-0.026
 accidents-0.025
 works-0.025
 happens-0.025
washer-0.024
Token necks
Token ID44184
Feature activation-3.94e-02

Pos logit contributions
 grasp+0.100
 spine+0.091
 grip+0.087
 pocket+0.086
+0.085
Neg logit contributions
fl-0.107
 works-0.106
led-0.105
 happens-0.102
 news-0.101
Token and
Token ID290
Feature activation-1.13e-02

Pos logit contributions
 grasp+0.059
 spine+0.053
 One+0.053
 pocket+0.052
 grip+0.050
Neg logit contributions
taker-0.064
 accidents-0.062
washer-0.062
led-0.062
 works-0.062
Token spots
Token ID10222
Feature activation-7.61e-03

Pos logit contributions
 grasp+0.017
 spine+0.016
 One+0.016
 pocket+0.015
 grip+0.015
Neg logit contributions
taker-0.018
 happens-0.017
 works-0.017
 accidents-0.017
washer-0.017
Token.
Token ID13
Feature activation+1.72e-04

Pos logit contributions
 grasp+0.012
 spine+0.011
 pocket+0.010
 One+0.010
rils+0.010
Neg logit contributions
taker-0.012
led-0.012
 works-0.012
 happens-0.012
washer-0.012
Token Sam
Token ID3409
Feature activation+4.37e-03

Pos logit contributions
fl+0.000
ing+0.000
 means+0.000
 boy+0.000
 happens+0.000
Neg logit contributions
 grasp-0.000
 grip-0.000
 spine-0.000
 injuries-0.000
 pocket-0.000
Token the
Token ID262
Feature activation-1.09e-02

Pos logit contributions
taker+0.001
 accidents+0.001
 happens+0.001
 works+0.001
washer+0.001
Neg logit contributions
 grasp-0.001
 spine-0.001
 One-0.001
 pocket-0.001
 grip-0.001
Token girl
Token ID2576
Feature activation+3.67e-03

Pos logit contributions
 One+0.016
 grasp+0.016
 front+0.016
 pocket+0.016
 spine+0.015
Neg logit contributions
lad-0.018
taker-0.017
ull-0.017
 happens-0.016
beh-0.016
Token would
Token ID561
Feature activation-4.06e-03

Pos logit contributions
taker+0.004
 accidents+0.004
lad+0.004
lling+0.004
washer+0.004
Neg logit contributions
 grasp-0.008
 One-0.007
 pocket-0.007
 spine-0.007
 grip-0.007
Token put
Token ID1234
Feature activation-5.14e-03

Pos logit contributions
 grasp+0.006
 spine+0.005
 One+0.005
 pocket+0.005
rils+0.005
Neg logit contributions
taker-0.007
 works-0.007
washer-0.007
 sound-0.006
lad-0.006
Token her
Token ID607
Feature activation-1.20e-02

Pos logit contributions
 grasp+0.007
 One+0.007
 pocket+0.007
 front+0.007
 grip+0.006
Neg logit contributions
lad-0.009
washer-0.009
taker-0.009
 happens-0.009
ull-0.009
Token ear
Token ID1027
Feature activation-6.44e-02

Pos logit contributions
 grasp+0.020
 One+0.018
 pocket+0.018
 spine+0.018
 front+0.018
Neg logit contributions
lad-0.018
taker-0.018
 accidents-0.017
 happens-0.017
washer-0.017
Token against
Token ID1028
Feature activation-2.45e-02

Pos logit contributions
 grasp+0.128
 One+0.124
+0.122
rils+0.118
 spine+0.114
Neg logit contributions
fl-0.067
ing-0.065
 sound-0.064
led-0.062
 accidents-0.061
Token the
Token ID262
Feature activation+2.06e-03

Pos logit contributions
 grasp+0.042
 spine+0.040
 One+0.039
 pocket+0.038
 grip+0.037
Neg logit contributions
 accidents-0.032
taker-0.032
 works-0.031
 happens-0.031
 news-0.030
Token fence
Token ID13990
Feature activation-9.12e-03

Pos logit contributions
taker+0.003
washer+0.003
 accidents+0.003
 news+0.003
 sound+0.003
Neg logit contributions
 grasp-0.003
 One-0.003
 spine-0.002
ids-0.002
 pocket-0.002
Token.
Token ID13
Feature activation+9.44e-03

Pos logit contributions
 grasp+0.015
 spine+0.014
 One+0.013
rils+0.013
 grip+0.013
Neg logit contributions
washer-0.013
taker-0.013
ing-0.013
 works-0.012
 accidents-0.012
Token
Token ID198
Feature activation-6.28e-03

Pos logit contributions
fl+0.013
taker+0.012
free+0.012
 happens+0.012
 works+0.012
Neg logit contributions
 grasp-0.017
 grip-0.016
 injuries-0.016
 spine-0.015
 pocket-0.015
Tokenes
Token ID274
Feature activation-9.03e-04

Pos logit contributions
 accidents+0.006
taker+0.006
washer+0.005
 happens+0.005
lad+0.005
Neg logit contributions
 One-0.011
 grasp-0.011
 nine-0.011
 pocket-0.011
 front-0.011
Token best
Token ID1266
Feature activation-3.45e-03

Pos logit contributions
 One+0.001
 grasp+0.001
 pocket+0.001
 front+0.001
 nine+0.001
Neg logit contributions
lad-0.002
 accidents-0.002
taker-0.002
 happens-0.002
washer-0.002
Token.
Token ID13
Feature activation+5.78e-03

Pos logit contributions
 grasp+0.005
 One+0.005
 spine+0.005
 pocket+0.004
 grip+0.004
Neg logit contributions
taker-0.006
 accidents-0.006
washer-0.006
lad-0.006
 works-0.006
Token They
Token ID1119
Feature activation-1.40e-03

Pos logit contributions
fl+0.007
 happens+0.006
 boy+0.006
ing+0.006
 smells+0.006
Neg logit contributions
 grasp-0.011
 grip-0.011
 spine-0.011
 injuries-0.010
rils-0.010
Token had
Token ID550
Feature activation-1.60e-02

Pos logit contributions
 grasp+0.003
 spine+0.003
rils+0.003
 grip+0.003
 injuries+0.003
Neg logit contributions
 works-0.001
taker-0.001
washer-0.001
fl-0.001
ull-0.001
Token long
Token ID890
Feature activation-6.43e-02

Pos logit contributions
 grasp+0.027
 One+0.025
 spine+0.025
 pocket+0.024
 grip+0.024
Neg logit contributions
taker-0.023
 accidents-0.022
washer-0.022
lad-0.022
 works-0.022
Token necks
Token ID44184
Feature activation-3.73e-02

Pos logit contributions
 grasp+0.093
 spine+0.083
 grip+0.080
 pocket+0.078
+0.078
Neg logit contributions
fl-0.108
 works-0.107
led-0.104
 happens-0.101
 news-0.100
Token and
Token ID290
Feature activation-1.16e-02

Pos logit contributions
 grasp+0.055
 spine+0.050
 One+0.050
 pocket+0.049
 grip+0.047
Neg logit contributions
taker-0.061
 accidents-0.060
washer-0.059
 works-0.059
lad-0.059
Token spots
Token ID10222
Feature activation-8.36e-03

Pos logit contributions
 grasp+0.017
 spine+0.016
 One+0.016
 pocket+0.015
 grip+0.015
Neg logit contributions
taker-0.019
 accidents-0.019
 happens-0.019
 works-0.019
washer-0.018
Token.
Token ID13
Feature activation+6.80e-03

Pos logit contributions
 grasp+0.013
 spine+0.012
 pocket+0.011
 One+0.011
rils+0.011
Neg logit contributions
taker-0.014
led-0.013
ing-0.013
 happens-0.013
washer-0.013
Token Ben
Token ID3932
Feature activation+9.98e-03

Pos logit contributions
ing+0.007
 happens+0.007
 boy+0.007
fl+0.007
 accidents+0.007
Neg logit contributions
 grasp-0.014
 grip-0.014
 spine-0.013
 injuries-0.013
rils-0.012
Tokenes
Token ID274
Feature activation+6.14e-04

Pos logit contributions
 accidents+0.008
taker+0.008
washer+0.008
 happens+0.007
lad+0.007
Neg logit contributions
 grasp-0.017
 One-0.016
 nine-0.016
 pocket-0.016
 front-0.015
Token best
Token ID1266
Feature activation-1.59e-03

Pos logit contributions
lad+0.001
 accidents+0.001
taker+0.001
 happens+0.001
washer+0.001
Neg logit contributions
 One-0.001
 grasp-0.001
 pocket-0.001
 front-0.001
 nine-0.001
Token.
Token ID13
Feature activation+1.98e-03

Pos logit contributions
 grasp+0.002
 spine+0.002
 One+0.002
 pocket+0.002
 grip+0.002
Neg logit contributions
taker-0.003
 accidents-0.003
washer-0.003
lad-0.003
 works-0.003
Token They
Token ID1119
Feature activation-1.77e-04

Pos logit contributions
fl+0.002
ing+0.002
 boy+0.002
 happens+0.002
 accidents+0.002
Neg logit contributions
 grasp-0.004
 grip-0.004
 spine-0.004
 injuries-0.004
 pocket-0.003
Token had
Token ID550
Feature activation-1.38e-02

Pos logit contributions
 grasp+0.000
 spine+0.000
rils+0.000
 grip+0.000
 injuries+0.000
Neg logit contributions
 works-0.000
taker-0.000
washer-0.000
fl-0.000
ull-0.000
Token long
Token ID890
Feature activation-6.36e-02

Pos logit contributions
 grasp+0.024
 spine+0.022
 One+0.022
 pocket+0.022
 grip+0.021
Neg logit contributions
taker-0.019
 accidents-0.019
 works-0.018
 happens-0.018
washer-0.018
Token necks
Token ID44184
Feature activation-3.71e-02

Pos logit contributions
 grasp+0.090
 spine+0.080
 grip+0.076
 pocket+0.075
+0.073
Neg logit contributions
fl-0.109
 works-0.108
led-0.106
 happens-0.103
 news-0.102
Token and
Token ID290
Feature activation-1.10e-02

Pos logit contributions
 grasp+0.055
 spine+0.050
 One+0.050
 pocket+0.048
 grip+0.047
Neg logit contributions
taker-0.060
 accidents-0.059
washer-0.059
 works-0.059
led-0.058
Token spots
Token ID10222
Feature activation-8.48e-03

Pos logit contributions
 grasp+0.016
 spine+0.015
 One+0.015
 pocket+0.014
 grip+0.014
Neg logit contributions
taker-0.018
 happens-0.017
 works-0.017
 accidents-0.017
washer-0.017
Token.
Token ID13
Feature activation+3.44e-03

Pos logit contributions
 grasp+0.013
 spine+0.012
 pocket+0.012
 One+0.012
rils+0.011
Neg logit contributions
taker-0.014
led-0.013
ing-0.013
 happens-0.013
washer-0.013
Token Sam
Token ID3409
Feature activation+8.16e-03

Pos logit contributions
fl+0.004
ing+0.004
 boy+0.004
 accidents+0.003
 happens+0.003
Neg logit contributions
 grasp-0.007
 spine-0.007
 grip-0.007
 injuries-0.007
 pocket-0.006
Token person
Token ID1048
Feature activation-8.58e-03

Pos logit contributions
taker+0.015
led+0.015
lad+0.015
rest+0.014
lling+0.014
Neg logit contributions
 grasp-0.016
 pocket-0.015
 One-0.014
 spine-0.014
 front-0.014
Token.
Token ID13
Feature activation-9.89e-03

Pos logit contributions
 grasp+0.014
 One+0.014
 spine+0.014
 grip+0.013
 pocket+0.013
Neg logit contributions
washer-0.012
 accidents-0.012
taker-0.012
 works-0.012
ing-0.011
Token The
Token ID383
Feature activation-2.14e-02

Pos logit contributions
 grasp+0.013
 spine+0.012
 One+0.011
 pocket+0.011
 grip+0.011
Neg logit contributions
taker-0.018
 happens-0.018
 accidents-0.018
 works-0.018
washer-0.018
Token person
Token ID1048
Feature activation+1.69e-02

Pos logit contributions
 grasp+0.038
 One+0.036
 spine+0.035
 pocket+0.034
 grip+0.033
Neg logit contributions
taker-0.029
 accidents-0.028
washer-0.028
 works-0.028
 happens-0.028
Token had
Token ID550
Feature activation-2.94e-02

Pos logit contributions
taker+0.020
lad+0.020
 accidents+0.019
washer+0.019
 happens+0.019
Neg logit contributions
 grasp-0.032
 One-0.031
 pocket-0.030
 spine-0.030
 front-0.029
Token long
Token ID890
Feature activation-6.33e-02

Pos logit contributions
 grasp+0.050
 spine+0.049
 One+0.047
 pocket+0.045
 grip+0.044
Neg logit contributions
 happens-0.040
 works-0.039
 accidents-0.039
 news-0.039
taker-0.038
Token hair
Token ID4190
Feature activation-2.59e-02

Pos logit contributions
 grasp+0.087
 spine+0.080
 One+0.078
+0.076
 grip+0.073
Neg logit contributions
led-0.107
 works-0.105
 happens-0.105
 news-0.104
 accidents-0.102
Token,
Token ID11
Feature activation-2.08e-02

Pos logit contributions
 grasp+0.037
 One+0.034
 spine+0.033
 pocket+0.032
 grip+0.031
Neg logit contributions
taker-0.044
 accidents-0.043
washer-0.043
lad-0.042
 works-0.042
Token a
Token ID257
Feature activation-9.26e-04

Pos logit contributions
 grasp+0.032
 One+0.030
 spine+0.029
 pocket+0.028
 grip+0.028
Neg logit contributions
taker-0.033
 accidents-0.032
 works-0.032
washer-0.031
 happens-0.031
Token beard
Token ID21213
Feature activation-5.83e-03

Pos logit contributions
 grasp+0.001
 One+0.001
 spine+0.001
 pocket+0.001
 grip+0.001
Neg logit contributions
taker-0.001
 accidents-0.001
washer-0.001
 works-0.001
 happens-0.001
Token and
Token ID290
Feature activation-2.65e-02

Pos logit contributions
 grasp+0.009
 One+0.008
 spine+0.008
 pocket+0.008
 grip+0.007
Neg logit contributions
taker-0.009
 accidents-0.009
washer-0.009
 works-0.009
led-0.009
Token the
Token ID262
Feature activation-6.35e-03

Pos logit contributions
 grasp+0.037
 spine+0.035
 One+0.034
 pocket+0.033
 grip+0.033
Neg logit contributions
taker-0.025
 accidents-0.025
 works-0.025
 happens-0.024
ull-0.024
Token sky
Token ID6766
Feature activation-1.61e-02

Pos logit contributions
 grasp+0.009
 spine+0.009
ids+0.009
 One+0.009
 grip+0.009
Neg logit contributions
fl-0.010
 sound-0.010
 news-0.010
washer-0.009
 works-0.009
Token.
Token ID13
Feature activation-4.59e-03

Pos logit contributions
 grasp+0.024
 spine+0.022
 One+0.022
 pocket+0.021
 grip+0.020
Neg logit contributions
 accidents-0.025
taker-0.025
washer-0.025
ull-0.025
 works-0.025
Token It
Token ID632
Feature activation+3.35e-03

Pos logit contributions
 grasp+0.006
 One+0.005
 spine+0.005
 pocket+0.005
 grip+0.005
Neg logit contributions
taker-0.009
 accidents-0.008
washer-0.008
lad-0.008
 happens-0.008
Token had
Token ID550
Feature activation-2.37e-02

Pos logit contributions
taker+0.004
lad+0.004
 accidents+0.004
washer+0.004
lling+0.004
Neg logit contributions
 One-0.006
 grasp-0.006
 front-0.006
 pocket-0.006
One-0.006
Token long
Token ID890
Feature activation-6.19e-02

Pos logit contributions
 grasp+0.041
 spine+0.040
 pocket+0.038
 One+0.038
 grip+0.037
Neg logit contributions
 happens-0.032
taker-0.031
 accidents-0.031
 news-0.031
 works-0.030
Token wings
Token ID12098
Feature activation-3.63e-02

Pos logit contributions
 grasp+0.093
 spine+0.082
 pocket+0.081
+0.081
 One+0.080
Neg logit contributions
fl-0.097
 happens-0.096
led-0.095
 works-0.093
 news-0.093
Token,
Token ID11
Feature activation-1.77e-02

Pos logit contributions
 grasp+0.054
 One+0.050
 spine+0.049
 pocket+0.048
 grip+0.046
Neg logit contributions
taker-0.059
 accidents-0.058
 happens-0.057
led-0.057
 works-0.057
Token a
Token ID257
Feature activation-7.44e-04

Pos logit contributions
 grasp+0.026
 One+0.024
 spine+0.024
 pocket+0.023
 grip+0.023
Neg logit contributions
taker-0.029
 accidents-0.028
washer-0.028
lad-0.028
 works-0.028
Token sharp
Token ID7786
Feature activation-1.00e-02

Pos logit contributions
 grasp+0.001
 One+0.001
 spine+0.001
 pocket+0.001
 grip+0.001
Neg logit contributions
taker-0.001
 accidents-0.001
washer-0.001
 happens-0.001
 works-0.001
Token be
Token ID307
Feature activation+8.67e-03

Pos logit contributions
 grasp+0.013
 One+0.012
One+0.012
 pocket+0.011
 front+0.011
Neg logit contributions
 sound-0.018
 happens-0.017
ull-0.017
taker-0.017
 works-0.017
Token gorilla
Token ID45314
Feature activation-6.66e-03

Pos logit contributions
 grasp+0.014
 One+0.014
 spine+0.013
 pocket+0.013
 front+0.013
Neg logit contributions
taker-0.013
 accidents-0.013
lad-0.013
washer-0.013
 happens-0.013
Token.
Token ID13
Feature activation-6.05e-03

Pos logit contributions
 grasp+0.007
 One+0.007
 pocket+0.007
 front+0.006
 spine+0.006
Neg logit contributions
taker-0.013
lad-0.013
 accidents-0.013
 happens-0.013
lling-0.013
Token The
Token ID383
Feature activation-9.54e-03

Pos logit contributions
 grasp+0.012
 grip+0.012
 spine+0.011
 injuries+0.010
rils+0.010
Neg logit contributions
 happens-0.007
 boy-0.007
 means-0.007
 smells-0.007
washer-0.007
Token gorilla
Token ID45314
Feature activation+5.49e-03

Pos logit contributions
 grasp+0.018
 One+0.017
 spine+0.016
 pocket+0.016
 front+0.016
Neg logit contributions
taker-0.012
 accidents-0.012
washer-0.012
lad-0.012
 happens-0.012
Token has
Token ID468
Feature activation-2.67e-02

Pos logit contributions
taker+0.007
 accidents+0.007
lad+0.007
lling+0.007
free+0.007
Neg logit contributions
 grasp-0.010
 One-0.010
 pocket-0.010
 nine-0.009
 front-0.009
Token long
Token ID890
Feature activation-6.18e-02

Pos logit contributions
 grasp+0.045
 spine+0.043
 One+0.041
 pocket+0.041
 grip+0.041
Neg logit contributions
taker-0.036
 happens-0.036
 works-0.035
 accidents-0.035
 news-0.035
Token arms
Token ID5101
Feature activation-3.09e-02

Pos logit contributions
 grasp+0.103
+0.098
 spine+0.091
 grip+0.091
 pocket+0.089
Neg logit contributions
fl-0.087
led-0.085
 works-0.083
 happens-0.082
 noises-0.082
Token and
Token ID290
Feature activation-7.03e-03

Pos logit contributions
 grasp+0.052
 spine+0.047
 pocket+0.046
 grip+0.045
 injuries+0.044
Neg logit contributions
ing-0.046
taker-0.046
led-0.044
washer-0.043
 works-0.043
Token long
Token ID890
Feature activation-4.30e-02

Pos logit contributions
 grasp+0.010
 One+0.009
 spine+0.009
 pocket+0.009
 grip+0.009
Neg logit contributions
taker-0.012
 accidents-0.011
 works-0.011
 happens-0.011
washer-0.011
Token hair
Token ID4190
Feature activation-1.17e-02

Pos logit contributions
 grasp+0.062
 spine+0.056
 pocket+0.055
+0.055
 grip+0.054
Neg logit contributions
led-0.069
fl-0.069
 happens-0.068
 works-0.067
 accidents-0.066
Token.
Token ID13
Feature activation-6.86e-03

Pos logit contributions
 grasp+0.016
 One+0.014
 spine+0.014
 pocket+0.014
 grip+0.013
Neg logit contributions
taker-0.021
 accidents-0.020
washer-0.020
lad-0.020
 works-0.020
Token a
Token ID257
Feature activation-2.82e-03

Pos logit contributions
 grasp+0.010
 One+0.009
 spine+0.009
 pocket+0.009
 grip+0.009
Neg logit contributions
taker-0.012
lad-0.012
 accidents-0.011
washer-0.011
 works-0.011
Token haircut
Token ID45548
Feature activation-7.24e-03

Pos logit contributions
 grasp+0.004
 One+0.004
 pocket+0.004
 spine+0.003
 grip+0.003
Neg logit contributions
taker-0.005
lling-0.005
lad-0.005
led-0.005
 accidents-0.005
Token.
Token ID13
Feature activation-4.29e-03

Pos logit contributions
 grasp+0.011
 One+0.010
 spine+0.010
rils+0.009
 pocket+0.009
Neg logit contributions
taker-0.012
washer-0.012
 accidents-0.011
 works-0.011
 happens-0.011
Token She
Token ID1375
Feature activation-6.39e-03

Pos logit contributions
 grasp+0.005
 spine+0.005
 pocket+0.005
 grip+0.005
 One+0.004
Neg logit contributions
taker-0.008
 accidents-0.008
washer-0.008
lad-0.008
 works-0.008
Token had
Token ID550
Feature activation-2.47e-02

Pos logit contributions
 grasp+0.013
 spine+0.012
 One+0.012
 pocket+0.012
rils+0.012
Neg logit contributions
taker-0.007
 works-0.007
washer-0.007
 happens-0.007
led-0.007
Token long
Token ID890
Feature activation-6.05e-02

Pos logit contributions
 spine+0.046
 grasp+0.046
 One+0.041
+0.041
rils+0.040
Neg logit contributions
 happens-0.031
 works-0.029
 news-0.029
 accidents-0.028
taker-0.028
Token hair
Token ID4190
Feature activation-3.17e-02

Pos logit contributions
 grasp+0.081
 spine+0.071
rils+0.067
 One+0.065
+0.064
Neg logit contributions
 works-0.109
led-0.109
 happens-0.107
fl-0.104
 news-0.104
Token that
Token ID326
Feature activation+1.17e-02

Pos logit contributions
 grasp+0.051
 spine+0.046
 One+0.046
 pocket+0.044
rils+0.043
Neg logit contributions
taker-0.048
led-0.047
 accidents-0.047
ing-0.047
 works-0.047
Token got
Token ID1392
Feature activation+9.17e-03

Pos logit contributions
lad+0.019
lling+0.018
 accidents+0.018
taker+0.017
ull+0.017
Neg logit contributions
 grasp-0.018
 One-0.018
 pocket-0.017
 spine-0.016
 grip-0.016
Token in
Token ID287
Feature activation-1.57e-02

Pos logit contributions
lad+0.015
taker+0.014
washer+0.013
ull+0.013
lling+0.013
Neg logit contributions
 grasp-0.014
 One-0.013
 pocket-0.013
 nine-0.013
 front-0.013
Token her
Token ID607
Feature activation-1.06e-02

Pos logit contributions
 grasp+0.027
 spine+0.026
 One+0.024
 pocket+0.024
rils+0.023
Neg logit contributions
 accidents-0.022
 works-0.021
taker-0.021
 sound-0.020
 news-0.020
Token,
Token ID11
Feature activation+2.05e-03

Pos logit contributions
 grasp+0.022
 One+0.021
 pocket+0.019
 spine+0.019
 front+0.018
Neg logit contributions
taker-0.035
lad-0.035
washer-0.034
lling-0.034
 accidents-0.034
Token he
Token ID339
Feature activation-1.03e-02

Pos logit contributions
lad+0.004
washer+0.004
 happens+0.004
taker+0.004
led+0.004
Neg logit contributions
 One-0.003
 grasp-0.003
 pocket-0.003
 nine-0.002
 spine-0.002
Token always
Token ID1464
Feature activation+1.80e-02

Pos logit contributions
 grasp+0.014
 One+0.013
 pocket+0.012
 nine+0.012
 front+0.012
Neg logit contributions
taker-0.018
 accidents-0.018
washer-0.017
lad-0.017
 news-0.017
Token kept
Token ID4030
Feature activation-2.45e-03

Pos logit contributions
taker+0.032
washer+0.029
 accidents+0.029
lad+0.028
ull+0.027
Neg logit contributions
 grasp-0.027
 pocket-0.025
 One-0.024
 spine-0.023
 front-0.022
Token an
Token ID281
Feature activation-5.93e-03

Pos logit contributions
 grasp+0.004
 front+0.004
 One+0.004
 pocket+0.003
 nine+0.003
Neg logit contributions
taker-0.004
lad-0.004
washer-0.004
 happens-0.004
 works-0.004
Token ear
Token ID1027
Feature activation-6.03e-02

Pos logit contributions
 grasp+0.006
 One+0.006
 spine+0.006
 pocket+0.005
rils+0.005
Neg logit contributions
taker-0.012
 accidents-0.012
rest-0.012
 news-0.012
lad-0.012
Token out
Token ID503
Feature activation-3.08e-02

Pos logit contributions
 grasp+0.108
 One+0.108
+0.100
rils+0.098
 pocket+0.096
Neg logit contributions
ing-0.077
 sound-0.074
free-0.074
fl-0.073
washer-0.073
Token for
Token ID329
Feature activation-1.10e-02

Pos logit contributions
 grasp+0.042
 One+0.038
 pocket+0.038
 spine+0.037
 front+0.037
Neg logit contributions
taker-0.053
lad-0.053
 happens-0.053
washer-0.053
 accidents-0.052
Token any
Token ID597
Feature activation-1.18e-02

Pos logit contributions
 grasp+0.019
 One+0.018
 pocket+0.017
 spine+0.017
 front+0.017
Neg logit contributions
taker-0.016
washer-0.016
lad-0.015
 happens-0.015
 works-0.015
Token silly
Token ID14397
Feature activation-6.00e-04

Pos logit contributions
 grasp+0.020
 One+0.018
 spine+0.018
 pocket+0.018
 grip+0.017
Neg logit contributions
taker-0.017
lad-0.017
washer-0.017
 accidents-0.017
 works-0.016
Token noises
Token ID26782
Feature activation+1.64e-03

Pos logit contributions
 grasp+0.001
 spine+0.001
 pocket+0.001
 One+0.001
 grip+0.001
Neg logit contributions
washer-0.001
taker-0.001
lad-0.001
lling-0.001
led-0.001
Token his
Token ID465
Feature activation-9.86e-03

Pos logit contributions
 grasp+0.028
 spine+0.027
 One+0.026
 pocket+0.025
 grip+0.025
Neg logit contributions
 accidents-0.019
taker-0.019
 works-0.019
 happens-0.018
washer-0.018
Token house
Token ID2156
Feature activation-5.67e-03

Pos logit contributions
 grasp+0.016
 spine+0.015
ids+0.015
rils+0.015
 One+0.015
Neg logit contributions
washer-0.013
 works-0.013
 boy-0.013
 news-0.013
 sound-0.012
Token.
Token ID13
Feature activation-6.88e-03

Pos logit contributions
 grasp+0.009
 One+0.008
 spine+0.008
 pocket+0.008
 grip+0.008
Neg logit contributions
washer-0.009
 works-0.009
 accidents-0.009
taker-0.009
ing-0.008
Token She
Token ID1375
Feature activation-3.16e-03

Pos logit contributions
 grasp+0.008
 spine+0.007
 One+0.007
 pocket+0.007
 grip+0.007
Neg logit contributions
taker-0.013
 accidents-0.013
 happens-0.013
washer-0.013
 works-0.013
Token had
Token ID550
Feature activation-2.29e-02

Pos logit contributions
 grasp+0.006
 One+0.006
 pocket+0.005
 front+0.005
 nine+0.005
Neg logit contributions
taker-0.004
lad-0.004
 accidents-0.004
lling-0.004
free-0.004
Token long
Token ID890
Feature activation-6.02e-02

Pos logit contributions
 grasp+0.040
 spine+0.038
 One+0.037
 pocket+0.036
 grip+0.036
Neg logit contributions
taker-0.030
 happens-0.030
 accidents-0.030
 works-0.030
washer-0.029
Token hair
Token ID4190
Feature activation-3.42e-02

Pos logit contributions
 grasp+0.079
 spine+0.067
+0.067
 grip+0.064
rils+0.063
Neg logit contributions
 works-0.107
 news-0.106
 happens-0.104
led-0.104
fl-0.104
Token and
Token ID290
Feature activation-9.49e-03

Pos logit contributions
 grasp+0.053
 One+0.048
 spine+0.047
 pocket+0.046
 grip+0.045
Neg logit contributions
taker-0.054
washer-0.053
 accidents-0.053
 works-0.053
led-0.052
Token a
Token ID257
Feature activation-1.68e-03

Pos logit contributions
 grasp+0.014
 One+0.013
 pocket+0.013
 front+0.012
 spine+0.012
Neg logit contributions
lad-0.016
taker-0.016
washer-0.015
lling-0.015
 accidents-0.015
Token pretty
Token ID2495
Feature activation-7.78e-03

Pos logit contributions
 grasp+0.003
 One+0.003
 spine+0.003
rils+0.003
 grip+0.002
Neg logit contributions
taker-0.002
washer-0.002
 accidents-0.002
 works-0.002
 happens-0.002
Token dress
Token ID6576
Feature activation-7.37e-03

Pos logit contributions
 grasp+0.011
 One+0.010
 spine+0.010
rils+0.010
 nine+0.009
Neg logit contributions
taker-0.013
 sound-0.012
 news-0.012
 works-0.012
 accidents-0.012
Token named
Token ID3706
Feature activation-1.03e-02

Pos logit contributions
 grasp+0.008
 spine+0.007
 One+0.007
 grip+0.007
rils+0.006
Neg logit contributions
washer-0.010
taker-0.010
ull-0.010
ing-0.010
fl-0.010
Token Lily
Token ID20037
Feature activation+2.58e-03

Pos logit contributions
 grasp+0.018
 spine+0.017
 pocket+0.017
 One+0.016
 grip+0.016
Neg logit contributions
taker-0.014
washer-0.013
 accidents-0.013
 happens-0.013
lad-0.013
Token.
Token ID13
Feature activation-1.06e-02

Pos logit contributions
taker+0.004
washer+0.004
 works+0.004
ull+0.004
lad+0.004
Neg logit contributions
 grasp-0.004
 spine-0.004
 grip-0.004
rils-0.004
 pocket-0.004
Token She
Token ID1375
Feature activation-8.27e-03

Pos logit contributions
 grasp+0.017
 grip+0.015
 spine+0.015
 injuries+0.014
 pocket+0.014
Neg logit contributions
taker-0.017
 accidents-0.016
free-0.016
lad-0.016
 happens-0.016
Token had
Token ID550
Feature activation-3.52e-02

Pos logit contributions
 grasp+0.017
 One+0.016
 spine+0.016
 pocket+0.016
 grip+0.015
Neg logit contributions
taker-0.009
 accidents-0.009
lad-0.008
washer-0.008
 works-0.008
Token long
Token ID890
Feature activation-6.01e-02

Pos logit contributions
 grasp+0.064
 spine+0.062
 One+0.058
 pocket+0.058
 grip+0.057
Neg logit contributions
taker-0.044
 accidents-0.044
 news-0.043
 happens-0.043
fl-0.042
Token,
Token ID11
Feature activation-1.27e-02

Pos logit contributions
 grasp+0.084
 spine+0.074
+0.072
rils+0.071
 grip+0.070
Neg logit contributions
fl-0.101
 works-0.099
 news-0.099
led-0.098
 happens-0.097
Token curly
Token ID45731
Feature activation-2.85e-02

Pos logit contributions
 grasp+0.020
 spine+0.017
 One+0.017
 pocket+0.017
 nine+0.016
Neg logit contributions
 accidents-0.020
taker-0.019
 sound-0.019
free-0.019
fl-0.018
Token hair
Token ID4190
Feature activation-2.77e-02

Pos logit contributions
 grasp+0.042
 spine+0.037
+0.036
 grip+0.034
rils+0.034
Neg logit contributions
fl-0.048
led-0.047
 news-0.046
 things-0.045
 happens-0.044
Token that
Token ID326
Feature activation+1.02e-02

Pos logit contributions
 grasp+0.053
 spine+0.048
 One+0.047
 Deal+0.046
 pocket+0.045
Neg logit contributions
ing-0.036
fl-0.035
led-0.034
 sound-0.033
 accidents-0.032
Token always
Token ID1464
Feature activation+7.67e-03

Pos logit contributions
lad+0.016
 accidents+0.016
lling+0.016
taker+0.015
washer+0.015
Neg logit contributions
 grasp-0.015
 One-0.015
 pocket-0.014
 spine-0.014
 front-0.013
Token weird
Token ID7650
Feature activation-1.21e-03

Pos logit contributions
led+0.005
lad+0.005
rest+0.005
taker+0.005
beh+0.005
Neg logit contributions
 grasp-0.005
 pocket-0.005
 front-0.005
 One-0.005
 spine-0.005
Token animal
Token ID5044
Feature activation+2.08e-03

Pos logit contributions
 grasp+0.002
 pocket+0.002
 One+0.002
 spine+0.002
 grip+0.002
Neg logit contributions
taker-0.002
lad-0.002
lling-0.002
washer-0.002
led-0.002
Token.
Token ID13
Feature activation-9.73e-03

Pos logit contributions
taker+0.003
lad+0.003
 accidents+0.003
 works+0.003
 happens+0.003
Neg logit contributions
 grasp-0.003
 One-0.003
 pocket-0.003
 spine-0.003
 grip-0.003
Token It
Token ID632
Feature activation-3.84e-03

Pos logit contributions
 grasp+0.017
 spine+0.017
 grip+0.017
 injuries+0.015
 pocket+0.015
Neg logit contributions
 happens-0.014
taker-0.013
 accidents-0.012
washer-0.012
ull-0.012
Token has
Token ID468
Feature activation-3.03e-02

Pos logit contributions
 One+0.007
 grasp+0.007
 pocket+0.007
 front+0.006
 spine+0.006
Neg logit contributions
lad-0.005
taker-0.005
 accidents-0.005
washer-0.005
ull-0.005
Token long
Token ID890
Feature activation-5.95e-02

Pos logit contributions
 grasp+0.051
 One+0.047
 spine+0.047
 pocket+0.046
 grip+0.045
Neg logit contributions
taker-0.044
 accidents-0.042
 happens-0.042
washer-0.042
 works-0.042
Token ears
Token ID11368
Feature activation-2.90e-02

Pos logit contributions
 grasp+0.087
+0.079
 One+0.077
 pocket+0.076
 spine+0.075
Neg logit contributions
fl-0.094
led-0.093
 happens-0.093
 news-0.091
 works-0.090
Token,
Token ID11
Feature activation-2.36e-02

Pos logit contributions
 grasp+0.044
 One+0.041
 spine+0.040
 pocket+0.040
 grip+0.038
Neg logit contributions
taker-0.046
 accidents-0.045
ing-0.045
 happens-0.044
washer-0.044
Token a
Token ID257
Feature activation-5.20e-03

Pos logit contributions
 grasp+0.034
 One+0.031
 spine+0.031
 pocket+0.030
 grip+0.029
Neg logit contributions
taker-0.039
 accidents-0.039
 works-0.038
 happens-0.038
washer-0.038
Token fluffy
Token ID39145
Feature activation-9.77e-03

Pos logit contributions
 grasp+0.008
 One+0.008
 spine+0.008
 pocket+0.008
 grip+0.007
Neg logit contributions
taker-0.008
 accidents-0.008
washer-0.007
lad-0.007
 works-0.007
Token tail
Token ID7894
Feature activation-2.53e-02

Pos logit contributions
 grasp+0.011
 One+0.010
 spine+0.009
 nine+0.009
 pocket+0.009
Neg logit contributions
 sound-0.019
fl-0.019
 accidents-0.019
taker-0.019
ull-0.019
Token named
Token ID3706
Feature activation-7.21e-03

Pos logit contributions
washer+0.005
taker+0.005
 sound+0.005
ull+0.005
ing+0.005
Neg logit contributions
 grasp-0.005
 spine-0.004
 grip-0.004
 pocket-0.004
 nine-0.004
Token Bow
Token ID9740
Feature activation-7.74e-03

Pos logit contributions
 grasp+0.014
 spine+0.013
 grip+0.013
 pocket+0.013
 nine+0.012
Neg logit contributions
 accidents-0.008
ing-0.008
ww-0.008
fl-0.008
free-0.008
Token.
Token ID13
Feature activation-2.37e-02

Pos logit contributions
 grasp+0.013
 spine+0.012
 pocket+0.011
 grip+0.011
rils+0.011
Neg logit contributions
ull-0.011
taker-0.011
led-0.011
fl-0.011
lad-0.011
Token Bow
Token ID9740
Feature activation-4.02e-03

Pos logit contributions
 grasp+0.043
 spine+0.040
 pocket+0.039
 grip+0.039
rils+0.038
Neg logit contributions
 happens-0.030
 works-0.030
fl-0.029
lad-0.029
 accidents-0.029
Token had
Token ID550
Feature activation-2.79e-02

Pos logit contributions
 One+0.007
 grasp+0.007
 pocket+0.007
One+0.007
 front+0.006
Neg logit contributions
taker-0.006
 accidents-0.005
washer-0.005
lad-0.005
lling-0.005
Token long
Token ID890
Feature activation-5.95e-02

Pos logit contributions
 grasp+0.051
 spine+0.049
 pocket+0.046
 One+0.046
 grip+0.046
Neg logit contributions
 accidents-0.034
taker-0.034
 happens-0.034
 news-0.034
fl-0.033
Token,
Token ID11
Feature activation-4.43e-03

Pos logit contributions
 grasp+0.087
 spine+0.077
+0.076
rils+0.073
 grip+0.072
Neg logit contributions
fl-0.096
 news-0.095
led-0.093
 happens-0.092
 sound-0.092
Token soft
Token ID2705
Feature activation-1.60e-02

Pos logit contributions
 grasp+0.008
 spine+0.007
 One+0.007
 pocket+0.007
 grip+0.007
Neg logit contributions
taker-0.006
 accidents-0.006
 works-0.006
washer-0.006
 happens-0.006
Token fur
Token ID9230
Feature activation-5.02e-03

Pos logit contributions
 grasp+0.020
 One+0.019
 spine+0.018
 pocket+0.018
 grip+0.017
Neg logit contributions
taker-0.029
 accidents-0.029
washer-0.028
lad-0.028
 works-0.028
Token and
Token ID290
Feature activation+9.15e-03

Pos logit contributions
 grasp+0.008
 spine+0.007
 One+0.007
 pocket+0.007
 grip+0.007
Neg logit contributions
taker-0.008
washer-0.008
led-0.008
ing-0.008
 accidents-0.008
Token loved
Token ID6151
Feature activation+8.34e-03

Pos logit contributions
taker+0.013
lad+0.013
washer+0.012
lling+0.012
 accidents+0.012
Neg logit contributions
 grasp-0.015
 One-0.015
 pocket-0.014
 spine-0.014
 front-0.014
Token unusual
Token ID8468
Feature activation-8.25e-04

Pos logit contributions
 grasp+0.012
 One+0.012
 pocket+0.012
 grip+0.011
 front+0.011
Neg logit contributions
fl-0.007
lad-0.007
taker-0.007
 accidents-0.007
 happens-0.007
Token looking
Token ID2045
Feature activation+1.20e-02

Pos logit contributions
 grasp+0.001
 One+0.001
 pocket+0.001
 spine+0.001
 front+0.001
Neg logit contributions
lad-0.002
taker-0.002
ull-0.002
lling-0.002
 happens-0.002
Token.
Token ID13
Feature activation-5.88e-04

Pos logit contributions
lad+0.025
ull+0.025
fl+0.025
beh+0.024
ymes+0.024
Neg logit contributions
 One-0.012
 pocket-0.011
 grasp-0.011
 spine-0.010
 nine-0.010
Token He
Token ID679
Feature activation-8.08e-03

Pos logit contributions
 grasp+0.001
 spine+0.001
 pocket+0.001
 grip+0.001
 One+0.001
Neg logit contributions
taker-0.001
 accidents-0.001
washer-0.001
 works-0.001
 happens-0.001
Token had
Token ID550
Feature activation-2.52e-02

Pos logit contributions
 grasp+0.013
 One+0.013
 pocket+0.012
 spine+0.012
 grip+0.012
Neg logit contributions
taker-0.012
lad-0.012
 accidents-0.011
 happens-0.011
washer-0.011
Token long
Token ID890
Feature activation-5.93e-02

Pos logit contributions
 grasp+0.041
 spine+0.039
 One+0.038
 pocket+0.037
 grip+0.036
Neg logit contributions
taker-0.037
 accidents-0.036
 works-0.036
 happens-0.036
washer-0.036
Token white
Token ID2330
Feature activation-4.81e-02

Pos logit contributions
 grasp+0.077
 spine+0.069
rils+0.064
 One+0.064
+0.063
Neg logit contributions
 works-0.109
 happens-0.104
led-0.104
taker-0.102
 news-0.102
Token hair
Token ID4190
Feature activation-3.30e-02

Pos logit contributions
 grasp+0.064
 spine+0.058
 One+0.054
rils+0.051
 grip+0.050
Neg logit contributions
 news-0.086
 works-0.086
fl-0.086
 accidents-0.084
 happens-0.083
Token and
Token ID290
Feature activation+2.65e-03

Pos logit contributions
 grasp+0.053
 One+0.048
 spine+0.048
 pocket+0.046
 grip+0.045
Neg logit contributions
taker-0.050
washer-0.050
ing-0.049
 accidents-0.049
 works-0.049
Token a
Token ID257
Feature activation-7.21e-03

Pos logit contributions
lad+0.004
taker+0.004
 accidents+0.004
washer+0.004
lling+0.004
Neg logit contributions
 grasp-0.004
 One-0.004
 pocket-0.004
 spine-0.004
 front-0.004
Token big
Token ID1263
Feature activation-8.51e-03

Pos logit contributions
 grasp+0.012
 One+0.012
 spine+0.011
rils+0.011
 pocket+0.011
Neg logit contributions
taker-0.010
washer-0.009
 accidents-0.009
 works-0.009
 happens-0.009
Token called
Token ID1444
Feature activation-2.76e-03

Pos logit contributions
 grasp+0.012
 spine+0.012
 grip+0.011
+0.010
rils+0.010
Neg logit contributions
ull-0.010
washer-0.010
fl-0.010
 sound-0.010
ing-0.009
Token Lily
Token ID20037
Feature activation+3.09e-03

Pos logit contributions
 grasp+0.004
 spine+0.004
 One+0.004
 pocket+0.004
 grip+0.004
Neg logit contributions
taker-0.004
 happens-0.004
lad-0.004
washer-0.004
 accidents-0.004
Token.
Token ID13
Feature activation-1.29e-02

Pos logit contributions
washer+0.004
taker+0.004
ull+0.004
fl+0.004
ing+0.004
Neg logit contributions
 grasp-0.005
 spine-0.005
 grip-0.005
-0.005
rils-0.005
Token She
Token ID1375
Feature activation-9.50e-03

Pos logit contributions
 grasp+0.023
 grip+0.021
 spine+0.020
 pocket+0.020
 injuries+0.020
Neg logit contributions
free-0.018
taker-0.018
 accidents-0.017
 happens-0.016
 works-0.016
Token had
Token ID550
Feature activation-3.19e-02

Pos logit contributions
 One+0.015
 grasp+0.014
 pocket+0.014
One+0.013
 front+0.013
Neg logit contributions
 accidents-0.015
taker-0.014
lad-0.014
lling-0.014
 choices-0.014
Token long
Token ID890
Feature activation-5.90e-02

Pos logit contributions
 grasp+0.054
 spine+0.051
 One+0.050
 pocket+0.049
 grip+0.048
Neg logit contributions
taker-0.045
 accidents-0.044
 happens-0.043
washer-0.043
 works-0.043
Token,
Token ID11
Feature activation-8.48e-03

Pos logit contributions
 grasp+0.086
 spine+0.075
+0.072
rils+0.071
 grip+0.071
Neg logit contributions
fl-0.098
led-0.096
 news-0.096
 works-0.094
 happens-0.094
Token pretty
Token ID2495
Feature activation-1.81e-02

Pos logit contributions
 grasp+0.014
 spine+0.013
 One+0.013
 pocket+0.012
 grip+0.012
Neg logit contributions
taker-0.012
 accidents-0.012
 sound-0.012
free-0.011
rest-0.011
Token hair
Token ID4190
Feature activation-2.19e-02

Pos logit contributions
 grasp+0.026
 One+0.023
 spine+0.023
 grip+0.022
 pocket+0.021
Neg logit contributions
taker-0.030
 accidents-0.030
fl-0.030
led-0.030
 works-0.030
Token and
Token ID290
Feature activation+7.89e-04

Pos logit contributions
 grasp+0.046
 spine+0.042
 Deal+0.041
 One+0.040
 grip+0.039
Neg logit contributions
ing-0.027
fl-0.025
 sound-0.024
led-0.023
washer-0.023
Token wore
Token ID12408
Feature activation-7.88e-04

Pos logit contributions
taker+0.001
 accidents+0.001
washer+0.001
lad+0.001
 works+0.001
Neg logit contributions
 grasp-0.001
 One-0.001
 spine-0.001
 pocket-0.001
 grip-0.001
Token the
Token ID262
Feature activation-6.21e-03

Pos logit contributions
 grasp+0.004
 One+0.004
 pocket+0.004
 spine+0.004
 front+0.004
Neg logit contributions
taker-0.004
lad-0.004
washer-0.004
 happens-0.004
 accidents-0.004
Token elephant
Token ID20950
Feature activation+3.69e-03

Pos logit contributions
 grasp+0.010
 One+0.009
 spine+0.009
 front+0.009
 pocket+0.009
Neg logit contributions
taker-0.010
 happens-0.010
lad-0.010
 accidents-0.010
washer-0.010
Token.
Token ID13
Feature activation+7.77e-03

Pos logit contributions
taker+0.006
 accidents+0.006
washer+0.006
 works+0.006
 happens+0.006
Neg logit contributions
 grasp-0.006
 One-0.005
 spine-0.005
 pocket-0.005
 grip-0.005
Token He
Token ID679
Feature activation-3.92e-03

Pos logit contributions
taker+0.013
washer+0.013
 accidents+0.013
lad+0.013
 happens+0.013
Neg logit contributions
 grasp-0.011
 spine-0.010
 grip-0.010
 pocket-0.009
 One-0.009
Token had
Token ID550
Feature activation-2.43e-02

Pos logit contributions
 grasp+0.007
 One+0.007
 spine+0.007
 pocket+0.007
 grip+0.007
Neg logit contributions
taker-0.005
 accidents-0.005
washer-0.005
 works-0.004
lad-0.004
Token long
Token ID890
Feature activation-5.88e-02

Pos logit contributions
 grasp+0.038
 spine+0.036
 One+0.036
 pocket+0.035
 grip+0.034
Neg logit contributions
taker-0.036
 accidents-0.035
 happens-0.035
 works-0.035
washer-0.035
Token,
Token ID11
Feature activation-9.39e-03

Pos logit contributions
 grasp+0.080
 spine+0.073
 pocket+0.070
 One+0.070
+0.069
Neg logit contributions
 works-0.097
fl-0.097
led-0.096
 happens-0.095
 news-0.094
Token smooth
Token ID7209
Feature activation-1.95e-02

Pos logit contributions
 grasp+0.015
 One+0.014
 spine+0.014
 pocket+0.013
 grip+0.013
Neg logit contributions
taker-0.014
 accidents-0.014
washer-0.014
lad-0.014
 works-0.014
Token ears
Token ID11368
Feature activation-1.76e-02

Pos logit contributions
 grasp+0.026
 One+0.024
 spine+0.024
 pocket+0.023
 grip+0.023
Neg logit contributions
taker-0.035
washer-0.034
lad-0.034
 accidents-0.034
 happens-0.033
Token and
Token ID290
Feature activation+2.04e-03

Pos logit contributions
 grasp+0.026
 One+0.024
 spine+0.024
 pocket+0.023
 grip+0.022
Neg logit contributions
taker-0.029
 accidents-0.028
washer-0.028
 happens-0.028
ing-0.027
Token a
Token ID257
Feature activation+1.19e-03

Pos logit contributions
taker+0.003
lad+0.003
washer+0.003
 accidents+0.003
lling+0.003
Neg logit contributions
 grasp-0.003
 One-0.003
 pocket-0.003
 spine-0.003
 front-0.003
Token named
Token ID3706
Feature activation-5.34e-03

Pos logit contributions
 spine+0.014
 grasp+0.014
 grip+0.012
+0.012
 Deal+0.012
Neg logit contributions
washer-0.011
ull-0.011
 sound-0.011
ing-0.010
fl-0.010
Token Lily
Token ID20037
Feature activation+2.83e-03

Pos logit contributions
 grasp+0.012
 spine+0.011
 grip+0.010
 pocket+0.010
rils+0.010
Neg logit contributions
taker-0.005
free-0.005
 accidents-0.005
washer-0.005
 happens-0.005
Token.
Token ID13
Feature activation-9.53e-03

Pos logit contributions
ull+0.004
washer+0.004
fl+0.004
free+0.004
 works+0.004
Neg logit contributions
 grasp-0.005
 spine-0.005
-0.005
 pocket-0.005
rils-0.005
Token She
Token ID1375
Feature activation-8.90e-03

Pos logit contributions
 grasp+0.016
 grip+0.014
 pocket+0.014
 spine+0.014
 injuries+0.013
Neg logit contributions
free-0.015
 accidents-0.014
taker-0.014
fl-0.014
 happens-0.014
Token had
Token ID550
Feature activation-2.85e-02

Pos logit contributions
 One+0.014
 grasp+0.013
 pocket+0.013
One+0.012
 front+0.012
Neg logit contributions
taker-0.014
 accidents-0.014
lad-0.014
lling-0.013
 decisions-0.013
Token long
Token ID890
Feature activation-5.87e-02

Pos logit contributions
 grasp+0.051
 spine+0.048
 One+0.046
 pocket+0.046
 grip+0.045
Neg logit contributions
taker-0.038
 accidents-0.037
 happens-0.036
washer-0.036
fl-0.036
Token hair
Token ID4190
Feature activation-3.71e-02

Pos logit contributions
 grasp+0.080
 spine+0.068
 grip+0.065
rils+0.065
+0.064
Neg logit contributions
fl-0.103
 news-0.102
led-0.101
 works-0.099
 accidents-0.099
Token that
Token ID326
Feature activation+1.40e-02

Pos logit contributions
 grasp+0.078
 spine+0.071
 One+0.069
 pocket+0.065
 grip+0.065
Neg logit contributions
ing-0.044
led-0.041
 sound-0.040
fl-0.040
 accidents-0.039
Token was
Token ID373
Feature activation-1.50e-02

Pos logit contributions
lad+0.025
lling+0.025
 accidents+0.024
ull+0.024
taker+0.024
Neg logit contributions
 grasp-0.018
 One-0.018
 pocket-0.018
 spine-0.016
 front-0.016
Token very
Token ID845
Feature activation-7.49e-03

Pos logit contributions
 grasp+0.027
 spine+0.025
 One+0.025
 pocket+0.024
 grip+0.023
Neg logit contributions
taker-0.020
 accidents-0.020
washer-0.019
 works-0.019
 happens-0.019
Token thick
Token ID6546
Feature activation-8.09e-03

Pos logit contributions
 grasp+0.015
 One+0.014
 spine+0.014
 pocket+0.014
 grip+0.014
Neg logit contributions
taker-0.008
 accidents-0.008
lad-0.008
washer-0.008
 works-0.008
Token puppy
Token ID26188
Feature activation+1.33e-02

Pos logit contributions
lad+0.014
lling+0.013
 happens+0.013
rest+0.013
taker+0.013
Neg logit contributions
 pocket-0.023
 grasp-0.023
 One-0.022
 spine-0.022
 nine-0.022
Token.
Token ID13
Feature activation-9.69e-03

Pos logit contributions
washer+0.024
taker+0.024
ull+0.024
free+0.023
 sound+0.023
Neg logit contributions
 grasp-0.018
 spine-0.016
 grip-0.015
 One-0.015
 pocket-0.014
Token The
Token ID383
Feature activation-1.68e-02

Pos logit contributions
 grasp+0.014
 spine+0.012
 grip+0.012
 nine+0.012
 pocket+0.012
Neg logit contributions
taker-0.016
 accidents-0.016
free-0.016
 works-0.016
 happens-0.015
Token puppy
Token ID26188
Feature activation+2.35e-02

Pos logit contributions
 grasp+0.032
 One+0.030
 spine+0.029
 pocket+0.029
 front+0.028
Neg logit contributions
taker-0.021
lad-0.020
washer-0.020
 accidents-0.020
 works-0.020
Token had
Token ID550
Feature activation-2.26e-02

Pos logit contributions
 accidents+0.031
lad+0.031
taker+0.031
lling+0.030
beh+0.028
Neg logit contributions
 One-0.044
 grasp-0.042
 pocket-0.041
 front-0.038
 spine-0.038
Token long
Token ID890
Feature activation-5.87e-02

Pos logit contributions
 grasp+0.039
 spine+0.036
 One+0.036
 pocket+0.035
 grip+0.034
Neg logit contributions
taker-0.031
 accidents-0.031
 happens-0.030
washer-0.030
 works-0.030
Token,
Token ID11
Feature activation+4.78e-03

Pos logit contributions
 grasp+0.088
+0.079
 spine+0.078
 nine+0.076
rils+0.076
Neg logit contributions
 news-0.089
fl-0.089
 sound-0.088
 noises-0.087
 happens-0.086
Token soft
Token ID2705
Feature activation-1.50e-02

Pos logit contributions
taker+0.007
lad+0.007
 accidents+0.007
washer+0.007
 works+0.006
Neg logit contributions
 grasp-0.008
 One-0.008
 spine-0.007
 pocket-0.007
 grip-0.007
Token fur
Token ID9230
Feature activation-6.98e-04

Pos logit contributions
 grasp+0.019
 One+0.017
 spine+0.017
 pocket+0.016
 grip+0.016
Neg logit contributions
taker-0.028
 accidents-0.028
washer-0.028
lad-0.028
 works-0.027
Token.
Token ID13
Feature activation-6.84e-03

Pos logit contributions
 grasp+0.001
 One+0.001
 spine+0.001
 pocket+0.001
 grip+0.001
Neg logit contributions
taker-0.001
washer-0.001
 accidents-0.001
 happens-0.001
led-0.001
Token It
Token ID632
Feature activation+1.33e-03

Pos logit contributions
 grasp+0.012
 spine+0.011
 grip+0.011
 injuries+0.010
 nine+0.010
Neg logit contributions
taker-0.009
 works-0.009
 accidents-0.009
 happens-0.009
 sound-0.009
Token and
Token ID290
Feature activation+1.19e-02

Pos logit contributions
 grasp+0.012
 One+0.011
 pocket+0.011
 spine+0.011
 front+0.011
Neg logit contributions
taker-0.022
lad-0.022
ull-0.021
rest-0.021
washer-0.021
Token loud
Token ID7812
Feature activation+1.06e-02

Pos logit contributions
lad+0.021
fl+0.020
ying+0.019
ing+0.019
ins+0.019
Neg logit contributions
 One-0.018
 pocket-0.016
 front-0.015
 grasp-0.015
One-0.015
Token.
Token ID13
Feature activation-6.96e-03

Pos logit contributions
lad+0.027
washer+0.026
taker+0.026
ull+0.025
ww+0.025
Neg logit contributions
 One-0.008
 front-0.006
 grasp-0.005
rils-0.005
 nine-0.005
Token He
Token ID679
Feature activation-6.42e-03

Pos logit contributions
 grasp+0.010
 spine+0.009
 pocket+0.009
 One+0.009
 grip+0.009
Neg logit contributions
taker-0.011
washer-0.011
 accidents-0.011
 happens-0.011
 works-0.011
Token had
Token ID550
Feature activation-2.28e-02

Pos logit contributions
 grasp+0.012
 One+0.011
 pocket+0.011
 spine+0.010
 front+0.010
Neg logit contributions
taker-0.008
lad-0.008
 accidents-0.008
washer-0.008
 works-0.008
Token long
Token ID890
Feature activation-5.85e-02

Pos logit contributions
 grasp+0.036
 spine+0.034
 One+0.033
 pocket+0.032
 grip+0.031
Neg logit contributions
 happens-0.034
taker-0.034
 works-0.034
 accidents-0.034
washer-0.033
Token hair
Token ID4190
Feature activation-3.38e-02

Pos logit contributions
 grasp+0.093
 spine+0.085
+0.083
 pocket+0.081
rils+0.081
Neg logit contributions
 happens-0.083
 works-0.083
led-0.082
fl-0.080
 accidents-0.079
Token and
Token ID290
Feature activation-7.56e-03

Pos logit contributions
 grasp+0.051
 One+0.047
 spine+0.047
 pocket+0.045
 grip+0.044
Neg logit contributions
taker-0.054
 accidents-0.053
washer-0.053
 works-0.052
 happens-0.052
Token a
Token ID257
Feature activation+9.83e-05

Pos logit contributions
 grasp+0.012
 One+0.011
 pocket+0.011
 front+0.011
 nine+0.011
Neg logit contributions
lad-0.012
taker-0.011
lling-0.011
 accidents-0.011
washer-0.011
Token beard
Token ID21213
Feature activation-2.36e-03

Pos logit contributions
taker+0.000
 accidents+0.000
washer+0.000
 works+0.000
 happens+0.000
Neg logit contributions
 grasp-0.000
 One-0.000
 spine-0.000
 pocket-0.000
 grip-0.000
Token.
Token ID13
Feature activation-6.94e-03

Pos logit contributions
 grasp+0.003
 One+0.003
 pocket+0.003
 spine+0.003
 front+0.003
Neg logit contributions
taker-0.004
lad-0.004
 accidents-0.004
washer-0.004
 happens-0.004
TokenTim
Token ID14967
Feature activation-1.10e-02

Pos logit contributions
 grasp+0.019
 spine+0.017
 pocket+0.017
 One+0.017
 grip+0.016
Neg logit contributions
taker-0.015
 accidents-0.015
washer-0.015
lad-0.015
 works-0.014
Tokenmy
Token ID1820
Feature activation+8.95e-03

Pos logit contributions
 One+0.013
 grasp+0.013
 pocket+0.012
 front+0.011
 nine+0.011
Neg logit contributions
taker-0.021
 accidents-0.021
 happens-0.021
lad-0.020
free-0.020
Token was
Token ID373
Feature activation-1.01e-02

Pos logit contributions
taker+0.013
lad+0.013
 accidents+0.012
free+0.012
rest+0.012
Neg logit contributions
 One-0.015
 grasp-0.015
 pocket-0.014
 front-0.014
One-0.013
Token so
Token ID523
Feature activation-1.08e-02

Pos logit contributions
 grasp+0.022
 One+0.020
 spine+0.020
 pocket+0.020
 grip+0.020
Neg logit contributions
taker-0.010
 accidents-0.010
washer-0.009
lad-0.009
 works-0.009
Token happy
Token ID3772
Feature activation-1.01e-02

Pos logit contributions
 One+0.018
 grasp+0.017
 pocket+0.016
 grip+0.016
One+0.016
Neg logit contributions
fl-0.016
works-0.016
ymes-0.016
taker-0.015
 accidents-0.015
Token that
Token ID326
Feature activation+4.22e-02

Pos logit contributions
 grasp+0.013
 One+0.012
 spine+0.012
 pocket+0.011
 grip+0.011
Neg logit contributions
taker-0.019
lad-0.018
 works-0.018
 happens-0.018
washer-0.018
Token he
Token ID339
Feature activation+2.06e-03

Pos logit contributions
ymes+0.071
led+0.068
lad+0.066
taker+0.065
lling+0.065
Neg logit contributions
 One-0.071
 grasp-0.064
 pocket-0.062
 front-0.062
One-0.059
Token could
Token ID714
Feature activation+1.16e-03

Pos logit contributions
ymes+0.004
 accidents+0.004
lad+0.004
fl+0.004
 choices+0.004
Neg logit contributions
 One-0.003
 pocket-0.003
 grasp-0.002
 grip-0.002
 use-0.002
Token hardly
Token ID8941
Feature activation-1.42e-02

Pos logit contributions
 sound+0.002
rest+0.002
 lesson+0.001
 works+0.001
taker+0.001
Neg logit contributions
 spine-0.002
rils-0.002
 grasp-0.002
ids-0.002
 One-0.002
Token speak
Token ID2740
Feature activation+5.83e-03

Pos logit contributions
 grasp+0.018
 One+0.017
 nine+0.017
rils+0.017
 pocket+0.016
Neg logit contributions
rest-0.024
 sound-0.024
taker-0.024
free-0.024
washer-0.023
Token.
Token ID13
Feature activation+6.25e-03

Pos logit contributions
lad+0.013
ww+0.013
lling+0.012
led+0.012
taker+0.012
Neg logit contributions
 One-0.006
 front-0.005
 grasp-0.005
 nine-0.004
 pocket-0.004
Token behind
Token ID2157
Feature activation-2.14e-02

Pos logit contributions
 grasp+0.013
 One+0.012
 spine+0.012
 pocket+0.012
 front+0.012
Neg logit contributions
lad-0.013
 accidents-0.013
 works-0.013
washer-0.013
taker-0.013
Token the
Token ID262
Feature activation+9.88e-03

Pos logit contributions
 grasp+0.037
 spine+0.035
 One+0.034
 pocket+0.033
 grip+0.033
Neg logit contributions
 accidents-0.029
taker-0.029
 works-0.028
 happens-0.028
 news-0.028
Token curtains
Token ID41160
Feature activation-1.12e-02

Pos logit contributions
taker+0.017
washer+0.017
 accidents+0.017
 works+0.017
 happens+0.017
Neg logit contributions
 grasp-0.013
 One-0.012
 spine-0.012
 pocket-0.012
 grip-0.011
Token in
Token ID287
Feature activation-4.26e-03

Pos logit contributions
 grasp+0.019
 One+0.017
 spine+0.017
 grip+0.017
 pocket+0.017
Neg logit contributions
washer-0.016
ing-0.015
taker-0.015
 works-0.014
lling-0.014
Token her
Token ID607
Feature activation+8.04e-03

Pos logit contributions
 spine+0.008
 grasp+0.008
 grip+0.007
ids+0.007
 One+0.007
Neg logit contributions
 accidents-0.005
 works-0.005
taker-0.005
 news-0.005
 sound-0.005
Token living
Token ID2877
Feature activation+3.99e-02

Pos logit contributions
washer+0.010
fl+0.009
 sound+0.009
 boy+0.009
 things+0.009
Neg logit contributions
 grasp-0.014
rils-0.014
 grip-0.013
 One-0.013
One-0.013
Token room
Token ID2119
Feature activation+1.59e-03

Pos logit contributions
taker+0.067
lad+0.065
 accidents+0.065
 happens+0.064
washer+0.064
Neg logit contributions
 grasp-0.058
 One-0.053
 pocket-0.052
 spine-0.052
 front-0.051
Token.
Token ID13
Feature activation+9.46e-03

Pos logit contributions
washer+0.002
 sound+0.002
ing+0.002
 accidents+0.002
taker+0.002
Neg logit contributions
 grasp-0.003
 One-0.003
 pocket-0.002
 spine-0.002
 grip-0.002
Token She
Token ID1375
Feature activation-6.70e-04

Pos logit contributions
taker+0.013
fl+0.012
 happens+0.012
free+0.012
washer+0.012
Neg logit contributions
 grasp-0.017
 grip-0.016
 injuries-0.016
 spine-0.015
 nine-0.015
Token saw
Token ID2497
Feature activation+5.39e-03

Pos logit contributions
 grasp+0.001
 One+0.001
 spine+0.001
 pocket+0.001
 grip+0.001
Neg logit contributions
taker-0.001
washer-0.001
 accidents-0.001
 works-0.001
 happens-0.001
Token her
Token ID607
Feature activation-2.54e-02

Pos logit contributions
lad+0.011
 works+0.010
lling+0.010
beh+0.010
taker+0.010
Neg logit contributions
 One-0.006
 nine-0.006
 grasp-0.006
 front-0.006
 pocket-0.006
Token thought
Token ID1807
Feature activation-8.53e-03

Pos logit contributions
 grasp+0.002
 One+0.002
 pocket+0.002
 front+0.002
 nine+0.002
Neg logit contributions
 accidents-0.002
taker-0.002
free-0.002
lad-0.002
ymes-0.002
Token about
Token ID546
Feature activation+2.01e-02

Pos logit contributions
 grasp+0.012
 One+0.011
 spine+0.011
 grip+0.011
 pocket+0.011
Neg logit contributions
taker-0.014
washer-0.014
lad-0.014
fl-0.014
 works-0.014
Token it
Token ID340
Feature activation+5.31e-04

Pos logit contributions
led+0.034
washer+0.033
taker+0.033
ymes+0.032
free+0.032
Neg logit contributions
 One-0.031
 grasp-0.028
 pocket-0.027
 front-0.027
rils-0.025
Token and
Token ID290
Feature activation+8.28e-03

Pos logit contributions
taker+0.001
lad+0.001
ull+0.001
 happens+0.001
 disl+0.001
Neg logit contributions
 One-0.001
 grasp-0.001
 pocket-0.000
 spine-0.000
 front-0.000
Token decided
Token ID3066
Feature activation+1.72e-02

Pos logit contributions
free+0.015
ymes+0.015
lad+0.014
taker+0.014
rest+0.014
Neg logit contributions
 grasp-0.011
 One-0.011
 pocket-0.011
 nine-0.010
 front-0.009
Token that
Token ID326
Feature activation+3.98e-02

Pos logit contributions
ymes+0.034
taker+0.033
lad+0.033
 works+0.033
lling+0.032
Neg logit contributions
 One-0.020
 grasp-0.020
 pocket-0.018
 front-0.018
 spine-0.017
Token getting
Token ID1972
Feature activation+2.29e-02

Pos logit contributions
led+0.066
taker+0.066
ymes+0.066
lad+0.064
free+0.063
Neg logit contributions
 One-0.063
 grasp-0.057
 pocket-0.053
 front-0.052
One-0.051
Token warm
Token ID5814
Feature activation-6.56e-04

Pos logit contributions
lad+0.039
taker+0.036
 works+0.035
ymes+0.035
lling+0.035
Neg logit contributions
 grasp-0.033
 One-0.031
 front-0.031
 pocket-0.030
 nine-0.030
Token by
Token ID416
Feature activation+1.28e-03

Pos logit contributions
 grasp+0.001
 One+0.001
 spine+0.001
 front+0.001
 pocket+0.001
Neg logit contributions
lad-0.001
taker-0.001
beh-0.001
ymes-0.001
haw-0.001
Token the
Token ID262
Feature activation+3.73e-03

Pos logit contributions
washer+0.002
taker+0.002
 happens+0.002
 works+0.002
lad+0.002
Neg logit contributions
 grasp-0.002
 One-0.002
 spine-0.002
 pocket-0.002
 front-0.002
Token camp
Token ID1413
Feature activation+1.09e-02

Pos logit contributions
taker+0.006
lad+0.006
 accidents+0.005
led+0.005
washer+0.005
Neg logit contributions
 grasp-0.006
 One-0.006
 spine-0.006
 front-0.006
 pocket-0.006
Token it
Token ID340
Feature activation-1.65e-03

Pos logit contributions
taker+0.035
lad+0.034
lling+0.033
free+0.033
washer+0.032
Neg logit contributions
 One-0.036
 grasp-0.036
 pocket-0.034
 front-0.033
 nine-0.033
Token was
Token ID373
Feature activation-4.43e-03

Pos logit contributions
 grasp+0.003
 One+0.003
 pocket+0.003
 spine+0.003
 grip+0.003
Neg logit contributions
taker-0.002
lad-0.002
 accidents-0.002
free-0.002
lling-0.002
Token far
Token ID1290
Feature activation-1.13e-02

Pos logit contributions
 grasp+0.009
 One+0.008
 spine+0.008
 pocket+0.008
 grip+0.008
Neg logit contributions
taker-0.005
 accidents-0.005
washer-0.005
lad-0.005
 works-0.005
Token away
Token ID1497
Feature activation-1.99e-02

Pos logit contributions
 grasp+0.018
 One+0.017
 spine+0.016
 front+0.016
 grip+0.016
Neg logit contributions
lad-0.018
taker-0.018
washer-0.017
ull-0.017
 happens-0.017
Token and
Token ID290
Feature activation+9.01e-03

Pos logit contributions
 grasp+0.029
 One+0.027
 spine+0.027
 front+0.026
 pocket+0.026
Neg logit contributions
taker-0.033
lad-0.032
washer-0.032
 happens-0.031
 works-0.031
Token sometimes
Token ID3360
Feature activation+3.91e-02

Pos logit contributions
lad+0.015
taker+0.015
rest+0.014
ww+0.014
ymes+0.014
Neg logit contributions
 grasp-0.013
 One-0.013
 pocket-0.012
 nine-0.012
 grip-0.012
Token it
Token ID340
Feature activation-3.04e-03

Pos logit contributions
lad+0.065
taker+0.063
washer+0.060
lling+0.059
beh+0.059
Neg logit contributions
 grasp-0.061
 One-0.060
 pocket-0.056
 grip-0.056
 spine-0.054
Token was
Token ID373
Feature activation-9.46e-03

Pos logit contributions
 grasp+0.006
 One+0.005
 pocket+0.005
 grip+0.005
 spine+0.005
Neg logit contributions
taker-0.004
lad-0.004
 accidents-0.004
free-0.004
lling-0.004
Token close
Token ID1969
Feature activation-1.91e-02

Pos logit contributions
 grasp+0.018
 spine+0.017
 One+0.017
 pocket+0.017
rils+0.016
Neg logit contributions
 accidents-0.011
 works-0.011
 happens-0.011
taker-0.011
ull-0.010
Token.
Token ID13
Feature activation-6.53e-03

Pos logit contributions
 grasp+0.030
 One+0.027
 spine+0.027
 pocket+0.027
 grip+0.025
Neg logit contributions
taker-0.029
 accidents-0.029
 works-0.029
washer-0.029
 happens-0.029
Token They
Token ID1119
Feature activation-5.25e-03

Pos logit contributions
 grasp+0.012
 spine+0.012
 grip+0.011
 pocket+0.011
 injuries+0.011
Neg logit contributions
ull-0.009
 happens-0.009
fl-0.008
lad-0.008
 means-0.008
Token!
Token ID0
Feature activation+8.78e-03

Feature not active on previous token; no predictions were affected.
Token Lucy
Token ID22162
Feature activation+1.70e-03

Pos logit contributions
taker+0.014
 happens+0.013
 accidents+0.013
fl+0.013
 works+0.013
Neg logit contributions
 grasp-0.014
 spine-0.013
 grip-0.013
 pocket-0.012
 nine-0.012
Token was
Token ID373
Feature activation-1.53e-02

Pos logit contributions
taker+0.002
 accidents+0.002
lad+0.002
washer+0.002
 happens+0.002
Neg logit contributions
 grasp-0.003
 One-0.003
 spine-0.003
 pocket-0.003
 grip-0.003
Token so
Token ID523
Feature activation-8.02e-03

Pos logit contributions
 grasp+0.029
 One+0.027
 pocket+0.027
 spine+0.027
 front+0.026
Neg logit contributions
taker-0.019
lad-0.018
 happens-0.018
washer-0.018
 accidents-0.018
Token excited
Token ID6568
Feature activation-3.27e-05

Pos logit contributions
 grasp+0.013
 One+0.013
 pocket+0.013
 grip+0.012
 front+0.012
Neg logit contributions
 happens-0.011
 accidents-0.011
fl-0.011
washer-0.011
works-0.011
Token that
Token ID326
Feature activation+4.47e-02

Pos logit contributions
 grasp+0.000
 One+0.000
 spine+0.000
 pocket+0.000
 grip+0.000
Neg logit contributions
taker-0.000
lad-0.000
washer-0.000
 works-0.000
 accidents-0.000
Token she
Token ID673
Feature activation-1.28e-03

Pos logit contributions
taker+0.073
lad+0.071
washer+0.071
 works+0.070
led+0.070
Neg logit contributions
 grasp-0.068
 One-0.065
 pocket-0.062
 spine-0.060
 front-0.060
Token decided
Token ID3066
Feature activation+1.41e-02

Pos logit contributions
 grasp+0.002
 pocket+0.002
 One+0.002
 grip+0.002
 front+0.002
Neg logit contributions
lad-0.002
 accidents-0.002
taker-0.002
 works-0.002
fl-0.002
Token to
Token ID284
Feature activation-1.93e-03

Pos logit contributions
taker+0.027
 works+0.026
lad+0.026
washer+0.026
led+0.026
Neg logit contributions
 grasp-0.017
 One-0.016
 pocket-0.015
 front-0.015
 spine-0.015
Token take
Token ID1011
Feature activation-8.50e-05

Pos logit contributions
 spine+0.003
 grasp+0.003
 nine+0.003
 injuries+0.003
ids+0.003
Neg logit contributions
 sound-0.003
ull-0.003
taker-0.003
 works-0.003
washer-0.003
Token it
Token ID340
Feature activation-5.71e-03

Pos logit contributions
 grasp+0.000
 One+0.000
 spine+0.000
 pocket+0.000
 grip+0.000
Neg logit contributions
taker-0.000
 accidents-0.000
 happens-0.000
lad-0.000
 works-0.000
Token two
Token ID734
Feature activation-3.25e-02

Pos logit contributions
 grasp+0.029
 One+0.028
 pocket+0.028
 spine+0.027
 front+0.027
Neg logit contributions
lad-0.030
taker-0.030
led-0.029
lling-0.029
 happens-0.029
Token friends
Token ID2460
Feature activation+1.38e-02

Pos logit contributions
 grasp+0.058
 One+0.054
 spine+0.054
 pocket+0.052
 grip+0.051
Neg logit contributions
taker-0.043
 accidents-0.042
washer-0.042
 works-0.042
 happens-0.041
Token started
Token ID2067
Feature activation-7.37e-03

Pos logit contributions
taker+0.019
 accidents+0.018
lad+0.018
washer+0.018
 works+0.018
Neg logit contributions
 grasp-0.024
 One-0.023
 spine-0.022
 pocket-0.022
 grip-0.021
Token to
Token ID284
Feature activation-5.03e-03

Pos logit contributions
 grasp+0.008
 One+0.008
 front+0.007
 pocket+0.007
 spine+0.007
Neg logit contributions
lad-0.016
taker-0.016
ww-0.015
led-0.015
rest-0.015
Token explore
Token ID7301
Feature activation+1.47e-02

Pos logit contributions
 grasp+0.007
 One+0.006
 spine+0.006
 pocket+0.006
 grip+0.006
Neg logit contributions
taker-0.009
lad-0.009
 accidents-0.009
led-0.009
 happens-0.009
Token and
Token ID290
Feature activation+1.47e-02

Pos logit contributions
taker+0.023
lad+0.023
washer+0.022
 accidents+0.022
 happens+0.022
Neg logit contributions
 grasp-0.023
 One-0.021
 spine-0.020
 pocket-0.020
 front-0.020
Token had
Token ID550
Feature activation-1.30e-02

Pos logit contributions
taker+0.028
washer+0.027
 works+0.027
 accidents+0.027
 happens+0.027
Neg logit contributions
 grasp-0.018
 One-0.016
 spine-0.016
 pocket-0.016
 grip-0.015
Token fun
Token ID1257
Feature activation-7.03e-03

Pos logit contributions
 grasp+0.020
 One+0.020
 pocket+0.018
 front+0.018
 nine+0.018
Neg logit contributions
taker-0.021
lad-0.021
led-0.020
washer-0.020
rest-0.020
Token together
Token ID1978
Feature activation-2.47e-03

Pos logit contributions
 grasp+0.010
 spine+0.009
 One+0.009
 pocket+0.009
rils+0.009
Neg logit contributions
taker-0.012
 accidents-0.012
 works-0.011
washer-0.011
 happens-0.011
Token.
Token ID13
Feature activation+1.20e-02

Pos logit contributions
 grasp+0.004
 spine+0.004
 pocket+0.004
 One+0.004
 nine+0.004
Neg logit contributions
lling-0.003
ull-0.003
ing-0.003
fl-0.003
 happens-0.003
Token
Token ID198
Feature activation-6.85e-03

Pos logit contributions
 happens+0.014
taker+0.014
fl+0.014
washer+0.014
ull+0.014
Neg logit contributions
 grasp-0.023
 grip-0.021
 injuries-0.021
 spine-0.021
 pocket-0.020
Token cool
Token ID3608
Feature activation+1.83e-02

Pos logit contributions
lad+0.028
washer+0.028
taker+0.027
fl+0.027
ymes+0.026
Neg logit contributions
 One-0.039
 grasp-0.038
 pocket-0.035
 grip-0.034
 front-0.033
Token!
Token ID0
Feature activation+4.04e-03

Pos logit contributions
taker+0.038
lad+0.038
ull+0.037
washer+0.037
beh+0.036
Neg logit contributions
 grasp-0.018
 One-0.017
 pocket-0.016
 front-0.015
 spine-0.015
Token
Token ID198
Feature activation-8.14e-03

Pos logit contributions
 happens+0.006
taker+0.006
 accidents+0.006
ull+0.006
fl+0.006
Neg logit contributions
 grasp-0.007
 spine-0.006
 grip-0.006
 pocket-0.006
 nine-0.006
Token
Token ID198
Feature activation-1.77e-02

Pos logit contributions
 grasp+0.015
 spine+0.015
 grip+0.014
 pocket+0.014
rils+0.013
Neg logit contributions
ull-0.011
ing-0.010
taker-0.010
lad-0.010
washer-0.010
TokenS
Token ID50
Feature activation-2.86e-02

Pos logit contributions
 grasp+0.036
 spine+0.035
 pocket+0.033
 grip+0.032
 One+0.032
Neg logit contributions
taker-0.019
lad-0.019
lling-0.019
 accidents-0.019
led-0.019
Tokenally
Token ID453
Feature activation+1.01e-02

Pos logit contributions
 grasp+0.041
 spine+0.037
 One+0.037
 pocket+0.036
 grip+0.035
Neg logit contributions
 accidents-0.048
lad-0.048
taker-0.047
 happens-0.047
 works-0.047
Token said
Token ID531
Feature activation+1.03e-02

Pos logit contributions
taker+0.015
 accidents+0.014
lad+0.014
washer+0.014
 works+0.014
Neg logit contributions
 grasp-0.017
 One-0.016
 pocket-0.015
 spine-0.015
 front-0.015
Token,
Token ID11
Feature activation+1.51e-02

Pos logit contributions
lling+0.024
 works+0.023
lad+0.023
washer+0.023
rest+0.023
Neg logit contributions
 One-0.009
 grasp-0.008
 pocket-0.007
 nine-0.007
 front-0.007
Token "
Token ID366
Feature activation+1.12e-02

Pos logit contributions
 works+0.028
lling+0.028
taker+0.028
rest+0.028
washer+0.027
Neg logit contributions
 One-0.019
 grasp-0.018
 spine-0.017
 front-0.017
 pocket-0.016
TokenLet
Token ID5756
Feature activation-8.82e-03

Pos logit contributions
taker+0.013
washer+0.013
 accidents+0.013
 works+0.013
lad+0.013
Neg logit contributions
 grasp-0.022
 One-0.021
 spine-0.021
 pocket-0.020
 front-0.020
Token's
Token ID338
Feature activation+8.29e-03

Pos logit contributions
 grasp+0.012
 One+0.012
 front+0.011
 pocket+0.010
One+0.010
Neg logit contributions
taker-0.016
washer-0.016
lad-0.016
 works-0.015
free-0.015
Token from
Token ID422
Feature activation+2.67e-02

Pos logit contributions
 grasp+0.017
 One+0.015
 spine+0.015
 pocket+0.015
 grip+0.015
Neg logit contributions
taker-0.015
washer-0.014
 accidents-0.014
 happens-0.014
 works-0.014
Token our
Token ID674
Feature activation+1.46e-03

Pos logit contributions
taker+0.044
 accidents+0.043
washer+0.043
lad+0.043
 works+0.043
Neg logit contributions
 grasp-0.039
 One-0.036
 spine-0.035
 pocket-0.034
 grip-0.033
Token grandma
Token ID49890
Feature activation+5.39e-03

Pos logit contributions
taker+0.003
lad+0.002
 accidents+0.002
led+0.002
lling+0.002
Neg logit contributions
 grasp-0.002
 One-0.002
 pocket-0.002
 spine-0.002
 front-0.002
Token.
Token ID13
Feature activation+8.24e-03

Pos logit contributions
taker+0.010
 accidents+0.010
lad+0.010
washer+0.010
 happens+0.010
Neg logit contributions
 grasp-0.007
 One-0.006
 pocket-0.006
 spine-0.006
 front-0.006
Token She
Token ID1375
Feature activation+3.13e-04

Pos logit contributions
taker+0.013
 happens+0.013
 accidents+0.013
washer+0.013
lad+0.013
Neg logit contributions
 grasp-0.013
 spine-0.012
 grip-0.011
 pocket-0.011
 One-0.011
Token loves
Token ID10408
Feature activation+1.36e-02

Pos logit contributions
taker+0.000
 accidents+0.000
lad+0.000
ull+0.000
lling+0.000
Neg logit contributions
 grasp-0.001
 One-0.001
 pocket-0.001
 front-0.001
 nine-0.001
Token us
Token ID514
Feature activation+2.22e-02

Pos logit contributions
taker+0.020
lad+0.020
washer+0.020
 accidents+0.020
 works+0.020
Neg logit contributions
 grasp-0.022
 One-0.021
 spine-0.020
 pocket-0.020
 front-0.019
Token.
Token ID13
Feature activation+1.14e-02

Pos logit contributions
 happens+0.040
lad+0.040
 accidents+0.040
taker+0.040
 works+0.040
Neg logit contributions
 One-0.027
 grasp-0.026
 front-0.025
 pocket-0.024
 nine-0.024
Token You
Token ID921
Feature activation-2.37e-03

Pos logit contributions
taker+0.016
 happens+0.016
 accidents+0.016
lad+0.016
 works+0.016
Neg logit contributions
 grasp-0.020
 grip-0.018
 spine-0.018
 pocket-0.018
rils-0.017
Token are
Token ID389
Feature activation-3.22e-03

Pos logit contributions
 grasp+0.004
 One+0.003
 pocket+0.003
 front+0.003
 nine+0.003
Neg logit contributions
 accidents-0.004
lad-0.004
taker-0.004
ing-0.004
fl-0.004
Token wrong
Token ID2642
Feature activation-8.62e-03

Pos logit contributions
 grasp+0.005
 One+0.005
 front+0.005
 pocket+0.005
 nine+0.005
Neg logit contributions
lad-0.004
taker-0.004
washer-0.004
led-0.004
 accidents-0.004
Token a
Token ID257
Feature activation+4.42e-03

Pos logit contributions
 grasp+0.021
 One+0.019
 spine+0.019
 pocket+0.019
 grip+0.018
Neg logit contributions
taker-0.018
 accidents-0.018
washer-0.017
 works-0.017
lad-0.017
Token rainbow
Token ID27223
Feature activation-5.15e-03

Pos logit contributions
led+0.007
lad+0.006
taker+0.006
rest+0.006
lling+0.006
Neg logit contributions
 grasp-0.008
 pocket-0.007
 One-0.007
 front-0.007
 spine-0.007
Token with
Token ID351
Feature activation-2.47e-03

Pos logit contributions
 grasp+0.008
 One+0.008
 spine+0.008
 pocket+0.007
 grip+0.007
Neg logit contributions
taker-0.008
 accidents-0.008
washer-0.008
lad-0.007
 works-0.007
Token all
Token ID477
Feature activation+4.61e-03

Pos logit contributions
 grasp+0.004
 One+0.004
 front+0.004
 pocket+0.003
 nine+0.003
Neg logit contributions
washer-0.004
taker-0.004
lad-0.004
 accidents-0.004
 works-0.004
Token the
Token ID262
Feature activation-3.47e-03

Pos logit contributions
washer+0.006
taker+0.006
lad+0.006
 accidents+0.006
 works+0.006
Neg logit contributions
 grasp-0.008
 One-0.008
 front-0.008
 pocket-0.008
 spine-0.007
Token colors
Token ID7577
Feature activation+6.29e-03

Pos logit contributions
 grasp+0.006
 One+0.006
 pocket+0.006
 spine+0.006
 front+0.005
Neg logit contributions
washer-0.005
lad-0.005
taker-0.005
 accidents-0.005
 happens-0.005
Token of
Token ID286
Feature activation+9.57e-03

Pos logit contributions
taker+0.010
lad+0.010
washer+0.010
 accidents+0.010
 works+0.010
Neg logit contributions
 grasp-0.009
 One-0.008
 spine-0.008
 pocket-0.008
 grip-0.008
Token the
Token ID262
Feature activation+3.57e-05

Pos logit contributions
taker+0.014
washer+0.014
lad+0.014
 accidents+0.013
 happens+0.013
Neg logit contributions
 grasp-0.016
 One-0.015
 pocket-0.014
 spine-0.014
 front-0.014
Token world
Token ID995
Feature activation+3.11e-02

Pos logit contributions
lad+0.000
taker+0.000
washer+0.000
 accidents+0.000
 happens+0.000
Neg logit contributions
 grasp-0.000
 pocket-0.000
 One-0.000
 spine-0.000
 front-0.000
Token.
Token ID13
Feature activation-2.96e-04

Pos logit contributions
lad+0.061
taker+0.058
ull+0.058
 accidents+0.058
haw+0.057
Neg logit contributions
 grasp-0.035
 One-0.033
 front-0.032
 pocket-0.030
 grip-0.029
Token Seeing
Token ID25913
Feature activation-2.22e-04

Pos logit contributions
 grasp+0.000
 spine+0.000
 grip+0.000
 pocket+0.000
 One+0.000
Neg logit contributions
taker-0.001
 accidents-0.001
 happens-0.001
washer-0.001
lad-0.001
Token herself
Token ID5223
Feature activation+1.93e-02

Pos logit contributions
 grasp+0.002
 spine+0.002
 One+0.002
 pocket+0.002
 grip+0.002
Neg logit contributions
taker-0.002
 accidents-0.002
 happens-0.002
 works-0.002
washer-0.002
Token down
Token ID866
Feature activation-3.44e-03

Pos logit contributions
taker+0.025
lad+0.024
 accidents+0.024
washer+0.024
 happens+0.024
Neg logit contributions
 grasp-0.036
 One-0.034
 spine-0.033
 pocket-0.033
 grip-0.032
Token.
Token ID13
Feature activation+8.36e-03

Pos logit contributions
 grasp+0.004
 One+0.004
 front+0.004
 pocket+0.004
 spine+0.003
Neg logit contributions
taker-0.007
lad-0.007
washer-0.007
 accidents-0.006
ull-0.006
Token She
Token ID1375
Feature activation+6.97e-04

Pos logit contributions
 happens+0.012
taker+0.012
ing+0.012
 accidents+0.011
free+0.011
Neg logit contributions
 grasp-0.014
 grip-0.012
 spine-0.012
 pocket-0.012
 injuries-0.012
Token goes
Token ID2925
Feature activation+7.76e-04

Pos logit contributions
 works+0.001
 happens+0.001
taker+0.001
washer+0.001
 accidents+0.001
Neg logit contributions
 grasp-0.001
 spine-0.001
 One-0.001
 pocket-0.001
rils-0.001
Token very
Token ID845
Feature activation+2.46e-02

Pos logit contributions
taker+0.001
washer+0.001
 accidents+0.001
 works+0.001
lad+0.001
Neg logit contributions
 grasp-0.002
 One-0.001
 spine-0.001
 pocket-0.001
rils-0.001
Token fast
Token ID3049
Feature activation-1.62e-03

Pos logit contributions
taker+0.025
 accidents+0.024
washer+0.023
led+0.023
 works+0.023
Neg logit contributions
 grasp-0.051
 One-0.048
 spine-0.048
 pocket-0.047
rils-0.047
Token.
Token ID13
Feature activation+6.81e-03

Pos logit contributions
 grasp+0.002
 One+0.002
 spine+0.002
 pocket+0.002
 grip+0.002
Neg logit contributions
taker-0.003
 accidents-0.003
lad-0.003
washer-0.003
 works-0.003
Token She
Token ID1375
Feature activation-3.05e-03

Pos logit contributions
 happens+0.009
ing+0.009
taker+0.009
free+0.009
 accidents+0.009
Neg logit contributions
 grasp-0.012
 grip-0.011
 spine-0.010
 pocket-0.010
 injuries-0.010
Token feels
Token ID5300
Feature activation-7.84e-03

Pos logit contributions
 grasp+0.006
 spine+0.006
 One+0.006
 pocket+0.005
rils+0.005
Neg logit contributions
 works-0.004
 happens-0.004
taker-0.003
 accidents-0.003
washer-0.003
Token the
Token ID262
Feature activation+3.11e-04

Pos logit contributions
 grasp+0.012
 One+0.012
 pocket+0.011
 front+0.011
 grip+0.011
Neg logit contributions
lad-0.013
washer-0.012
ull-0.012
taker-0.012
 works-0.012
Token.
Token ID13
Feature activation+2.13e-03

Pos logit contributions
 grasp+0.022
 spine+0.020
rils+0.020
 grip+0.019
 One+0.018
Neg logit contributions
washer-0.014
ing-0.013
free-0.012
fl-0.012
taker-0.012
Token One
Token ID1881
Feature activation-8.94e-03

Pos logit contributions
 boy+0.004
fl+0.004
 works+0.004
 sounds+0.004
free+0.004
Neg logit contributions
 grasp-0.003
 grip-0.003
 injuries-0.003
 spine-0.002
itching-0.002
Token day
Token ID1110
Feature activation+5.30e-03

Pos logit contributions
 grasp+0.020
 One+0.019
 spine+0.019
 pocket+0.019
 grip+0.018
Neg logit contributions
taker-0.008
washer-0.007
 accidents-0.007
lad-0.007
 works-0.007
Token,
Token ID11
Feature activation+1.21e-02

Pos logit contributions
 accidents+0.008
 happens+0.008
ing+0.007
taker+0.007
led+0.007
Neg logit contributions
 grasp-0.009
 spine-0.008
 pocket-0.008
rils-0.008
 One-0.008
Token she
Token ID673
Feature activation-1.62e-02

Pos logit contributions
 accidents+0.015
taker+0.014
free+0.013
 happens+0.013
led+0.013
Neg logit contributions
 grasp-0.024
 spine-0.022
rils-0.021
 pocket-0.021
 One-0.021
Token wanted
Token ID2227
Feature activation-1.13e-02

Pos logit contributions
 grasp+0.026
 One+0.025
 pocket+0.025
 spine+0.024
 front+0.023
Neg logit contributions
taker-0.025
lad-0.024
 accidents-0.024
free-0.023
lling-0.022
Token to
Token ID284
Feature activation-1.67e-02

Pos logit contributions
 grasp+0.019
 spine+0.018
 One+0.017
 pocket+0.017
 grip+0.016
Neg logit contributions
 accidents-0.017
washer-0.016
taker-0.016
 happens-0.016
 works-0.016
Token wear
Token ID5806
Feature activation-7.42e-04

Pos logit contributions
 spine+0.029
 grasp+0.026
rils+0.024
 injuries+0.024
 nine+0.024
Neg logit contributions
 sound-0.025
 works-0.024
 news-0.024
washer-0.024
taker-0.023
Token a
Token ID257
Feature activation-4.16e-03

Pos logit contributions
 grasp+0.001
 spine+0.001
 One+0.001
 pocket+0.001
 grip+0.001
Neg logit contributions
taker-0.001
 accidents-0.001
 works-0.001
 happens-0.001
washer-0.001
Token princess
Token ID21752
Feature activation-1.88e-02

Pos logit contributions
 grasp+0.008
 spine+0.007
 One+0.007
rils+0.007
 nine+0.007
Neg logit contributions
washer-0.005
taker-0.005
 works-0.005
 accidents-0.005
 happens-0.005
Token dress
Token ID6576
Feature activation-1.43e-02

Pos logit contributions
 grasp+0.035
 One+0.033
 spine+0.033
rils+0.031
 grip+0.031
Neg logit contributions
washer-0.023
taker-0.023
 works-0.022
 accidents-0.022
fl-0.022
Token
Token ID198
Feature activation+4.67e-03

Pos logit contributions
 grasp+0.017
 grip+0.016
 spine+0.016
rils+0.015
 pocket+0.015
Neg logit contributions
taker-0.014
 accidents-0.013
washer-0.013
 happens-0.013
ull-0.013
Token
Token ID198
Feature activation-3.26e-03

Pos logit contributions
taker+0.007
ull+0.007
 accidents+0.007
washer+0.007
lad+0.007
Neg logit contributions
 grasp-0.007
 spine-0.007
 grip-0.007
 pocket-0.007
 One-0.006
TokenTheir
Token ID14574
Feature activation+3.99e-03

Pos logit contributions
 spine+0.006
 grasp+0.006
 grip+0.005
 One+0.005
ids+0.005
Neg logit contributions
ull-0.005
washer-0.005
taker-0.004
 accidents-0.004
 sound-0.004
Token mom
Token ID1995
Feature activation+7.96e-03

Pos logit contributions
taker+0.007
 accidents+0.007
washer+0.007
 happens+0.007
 lesson+0.007
Neg logit contributions
 grasp-0.005
 One-0.005
 spine-0.005
rils-0.005
 pocket-0.005
Token said
Token ID531
Feature activation+3.35e-03

Pos logit contributions
taker+0.011
lad+0.011
fl+0.011
rest+0.011
 happens+0.011
Neg logit contributions
 grasp-0.013
 One-0.012
 pocket-0.012
 front-0.012
 nine-0.012
Token,
Token ID11
Feature activation-6.04e-03

Pos logit contributions
taker+0.006
 works+0.006
lling+0.006
lad+0.006
washer+0.006
Neg logit contributions
 grasp-0.004
 One-0.004
 pocket-0.003
 spine-0.003
 nine-0.003
Token "
Token ID366
Feature activation+1.81e-02

Pos logit contributions
 grasp+0.010
 One+0.009
 spine+0.009
 pocket+0.009
 front+0.008
Neg logit contributions
taker-0.009
 accidents-0.009
washer-0.009
lad-0.009
 works-0.009
TokenI
Token ID40
Feature activation-2.32e-02

Pos logit contributions
taker+0.027
washer+0.026
 works+0.026
ull+0.026
lad+0.026
Neg logit contributions
 grasp-0.028
 One-0.028
One-0.027
 pocket-0.026
 front-0.026
Token'm
Token ID1101
Feature activation-3.60e-03

Pos logit contributions
 grasp+0.043
 spine+0.041
 One+0.040
 pocket+0.039
 grip+0.039
Neg logit contributions
taker-0.029
washer-0.028
lad-0.028
 accidents-0.028
 works-0.028
Token glad
Token ID9675
Feature activation-1.43e-02

Pos logit contributions
 grasp+0.009
 spine+0.008
 One+0.008
 grip+0.008
 pocket+0.008
Neg logit contributions
taker-0.003
 accidents-0.003
washer-0.003
lad-0.003
free-0.002
Token you
Token ID345
Feature activation+1.97e-02

Pos logit contributions
 grasp+0.029
 One+0.027
 spine+0.027
 pocket+0.026
 grip+0.026
Neg logit contributions
taker-0.016
 accidents-0.016
lad-0.015
washer-0.015
 happens-0.015
Token laugh
Token ID6487
Feature activation+1.23e-03

Pos logit contributions
 works+0.001
taker+0.001
washer+0.001
 happens+0.001
free+0.001
Neg logit contributions
 grasp-0.001
 spine-0.001
 One-0.001
rils-0.001
 pocket-0.001
Token and
Token ID290
Feature activation+1.72e-02

Pos logit contributions
beh+0.003
lad+0.003
washer+0.003
rest+0.003
taker+0.003
Neg logit contributions
 One-0.001
 front-0.001
 nine-0.001
 grasp-0.001
 spine-0.001
Token shout
Token ID9191
Feature activation+1.18e-02

Pos logit contributions
taker+0.027
 accidents+0.026
 works+0.026
washer+0.026
 happens+0.026
Neg logit contributions
 grasp-0.026
 One-0.025
 spine-0.024
 pocket-0.024
 grip-0.023
Token.
Token ID13
Feature activation+1.05e-02

Pos logit contributions
lad+0.024
washer+0.024
taker+0.024
 works+0.024
ull+0.023
Neg logit contributions
 One-0.012
 grasp-0.012
 spine-0.011
 front-0.011
 nine-0.010
Token
Token ID198
Feature activation+6.57e-04

Pos logit contributions
 happens+0.014
taker+0.014
ull+0.014
 accidents+0.013
fl+0.013
Neg logit contributions
 grasp-0.019
 spine-0.017
 grip-0.017
 pocket-0.017
rils-0.016
Token
Token ID198
Feature activation-1.20e-03

Pos logit contributions
ull+0.001
lad+0.001
taker+0.001
washer+0.001
 happens+0.001
Neg logit contributions
 grasp-0.001
 spine-0.001
 pocket-0.001
 grip-0.001
rils-0.001
Token"
Token ID1
Feature activation+1.76e-02

Pos logit contributions
 grasp+0.002
 spine+0.002
 grip+0.002
 pocket+0.002
rils+0.002
Neg logit contributions
lad-0.002
ww-0.002
washer-0.002
taker-0.002
ull-0.002
TokenLook
Token ID8567
Feature activation-1.13e-02

Pos logit contributions
taker+0.024
washer+0.023
 accidents+0.022
lad+0.022
led+0.022
Neg logit contributions
 grasp-0.031
 One-0.031
 front-0.029
 nine-0.029
One-0.029
Token at
Token ID379
Feature activation-7.24e-03

Pos logit contributions
 grasp+0.014
 One+0.013
 pocket+0.012
 spine+0.012
 grip+0.012
Neg logit contributions
taker-0.021
lad-0.020
 works-0.020
washer-0.020
 accidents-0.020
Token my
Token ID616
Feature activation-9.61e-04

Pos logit contributions
 grasp+0.012
 One+0.011
 pocket+0.011
 front+0.011
 spine+0.011
Neg logit contributions
taker-0.011
lad-0.011
led-0.011
washer-0.010
 works-0.010
Token car
Token ID1097
Feature activation+1.56e-03

Pos logit contributions
 grasp+0.002
 One+0.001
 pocket+0.001
 spine+0.001
 grip+0.001
Neg logit contributions
taker-0.001
lad-0.001
lling-0.001
led-0.001
 happens-0.001
TokenYes
Token ID5297
Feature activation-1.33e-02

Pos logit contributions
taker+0.025
 accidents+0.025
washer+0.025
lad+0.024
 works+0.024
Neg logit contributions
 grasp-0.034
 One-0.034
 pocket-0.031
One-0.031
 spine-0.031
Token,
Token ID11
Feature activation-1.77e-02

Pos logit contributions
 grasp+0.021
 spine+0.020
 One+0.019
 pocket+0.019
 grip+0.018
Neg logit contributions
 accidents-0.021
taker-0.021
 happens-0.020
lad-0.020
washer-0.020
Token let
Token ID1309
Feature activation-6.97e-03

Pos logit contributions
 grasp+0.029
 One+0.027
 spine+0.026
 pocket+0.026
 grip+0.025
Neg logit contributions
taker-0.026
lad-0.026
washer-0.025
 accidents-0.025
 works-0.025
Token's
Token ID338
Feature activation+2.54e-02

Pos logit contributions
 One+0.008
 front+0.008
 grasp+0.008
 pocket+0.007
 nine+0.007
Neg logit contributions
taker-0.015
lad-0.014
free-0.014
washer-0.013
ull-0.013
Token work
Token ID670
Feature activation+2.86e-02

Pos logit contributions
taker+0.040
 works+0.039
washer+0.039
 accidents+0.038
 happens+0.038
Neg logit contributions
 grasp-0.039
 spine-0.036
 One-0.035
 pocket-0.034
rils-0.034
Token together
Token ID1978
Feature activation-8.15e-03

Pos logit contributions
taker+0.055
 accidents+0.054
lad+0.054
ull+0.053
led+0.053
Neg logit contributions
 One-0.031
 front-0.031
 grasp-0.031
 nine-0.030
 grip-0.028
Token.
Token ID13
Feature activation+1.83e-02

Pos logit contributions
 grasp+0.014
 spine+0.014
 pocket+0.013
rils+0.013
 grip+0.012
Neg logit contributions
washer-0.011
 happens-0.011
 works-0.011
lad-0.011
ing-0.011
Token We
Token ID775
Feature activation-9.08e-03

Pos logit contributions
taker+0.022
 happens+0.022
free+0.021
 works+0.021
washer+0.021
Neg logit contributions
 grasp-0.035
 grip-0.033
 spine-0.031
 pocket-0.030
 front-0.029
Token are
Token ID389
Feature activation+8.39e-04

Pos logit contributions
 grasp+0.019
 One+0.018
 spine+0.018
 pocket+0.017
 grip+0.017
Neg logit contributions
taker-0.010
washer-0.009
 works-0.009
 accidents-0.009
 happens-0.009
Token strong
Token ID1913
Feature activation-8.57e-03

Pos logit contributions
taker+0.001
 accidents+0.001
washer+0.001
lad+0.001
 works+0.001
Neg logit contributions
 grasp-0.002
 One-0.002
 spine-0.002
 pocket-0.001
 grip-0.001
Token and
Token ID290
Feature activation+8.25e-03

Pos logit contributions
 grasp+0.011
 One+0.010
 pocket+0.010
 spine+0.010
 front+0.009
Neg logit contributions
taker-0.016
lad-0.016
 accidents-0.015
washer-0.015
 happens-0.015
Token.
Token ID13
Feature activation+6.47e-03

Pos logit contributions
taker+0.016
lad+0.016
 works+0.016
washer+0.016
lling+0.016
Neg logit contributions
 One-0.006
 grasp-0.005
 nine-0.005
 front-0.005
One-0.004
Token We
Token ID775
Feature activation-2.36e-02

Pos logit contributions
taker+0.008
 accidents+0.008
washer+0.008
lad+0.008
 works+0.008
Neg logit contributions
 grasp-0.012
 One-0.011
 spine-0.011
 pocket-0.011
 grip-0.010
Token did
Token ID750
Feature activation+8.95e-04

Pos logit contributions
 grasp+0.043
 One+0.042
 pocket+0.040
 spine+0.040
 grip+0.039
Neg logit contributions
taker-0.030
 accidents-0.029
lad-0.029
washer-0.029
 works-0.028
Token not
Token ID407
Feature activation-6.04e-03

Pos logit contributions
taker+0.002
lad+0.002
washer+0.002
led+0.002
lling+0.002
Neg logit contributions
 grasp-0.001
 One-0.001
 front-0.001
 pocket-0.001
 grip-0.001
Token want
Token ID765
Feature activation-1.98e-02

Pos logit contributions
 grasp+0.012
 One+0.011
 spine+0.011
 pocket+0.011
 grip+0.010
Neg logit contributions
taker-0.007
 accidents-0.007
washer-0.007
lad-0.007
 works-0.007
Token to
Token ID284
Feature activation+7.62e-04

Pos logit contributions
 grasp+0.030
 One+0.028
 spine+0.028
 pocket+0.027
 grip+0.026
Neg logit contributions
taker-0.031
 accidents-0.031
 happens-0.030
washer-0.030
 works-0.030
Token eat
Token ID4483
Feature activation+1.81e-03

Pos logit contributions
 accidents+0.001
 sound+0.001
 works+0.001
 happens+0.001
taker+0.001
Neg logit contributions
 grasp-0.001
 spine-0.001
 One-0.001
 pocket-0.001
ids-0.001
Token the
Token ID262
Feature activation+7.91e-03

Pos logit contributions
taker+0.003
lad+0.003
washer+0.003
lling+0.003
 works+0.003
Neg logit contributions
 grasp-0.003
 One-0.003
 pocket-0.003
 front-0.003
 nine-0.003
Token caul
Token ID46087
Feature activation+2.58e-02

Pos logit contributions
lad+0.012
beh+0.012
taker+0.012
led+0.011
 happens+0.011
Neg logit contributions
 grasp-0.013
 pocket-0.012
 front-0.012
 One-0.012
 grip-0.012
Tokeniflower
Token ID42642
Feature activation+5.80e-03

Pos logit contributions
taker+0.019
 accidents+0.017
lad+0.017
washer+0.017
 works+0.017
Neg logit contributions
 grasp-0.062
 One-0.059
 spine-0.058
 pocket-0.057
 grip-0.056
Token.
Token ID13
Feature activation+3.81e-03

Pos logit contributions
lad+0.011
taker+0.011
 accidents+0.011
 works+0.011
 happens+0.010
Neg logit contributions
 grasp-0.007
 One-0.006
 pocket-0.006
 spine-0.006
 front-0.006
Token Tom
Token ID4186
Feature activation+4.05e-03

Pos logit contributions
 grasp+0.045
 One+0.041
 spine+0.041
 pocket+0.040
 grip+0.039
Neg logit contributions
taker-0.057
 accidents-0.055
washer-0.055
 happens-0.054
 works-0.054
Token says
Token ID1139
Feature activation+4.22e-04

Pos logit contributions
taker+0.005
washer+0.005
 works+0.005
 happens+0.005
 accidents+0.005
Neg logit contributions
 grasp-0.008
 spine-0.007
 One-0.007
 pocket-0.007
 grip-0.007
Token.
Token ID13
Feature activation+7.68e-03

Pos logit contributions
taker+0.001
lad+0.001
 accidents+0.001
 works+0.001
 happens+0.001
Neg logit contributions
 grasp-0.001
 One-0.000
 spine-0.000
 pocket-0.000
 grip-0.000
Token "
Token ID366
Feature activation+2.57e-02

Pos logit contributions
taker+0.011
washer+0.011
 accidents+0.011
 happens+0.011
lad+0.011
Neg logit contributions
 grasp-0.013
 spine-0.012
 pocket-0.011
 One-0.011
 grip-0.011
TokenCan
Token ID6090
Feature activation-3.09e-03

Pos logit contributions
taker+0.033
 accidents+0.032
washer+0.031
lad+0.031
 works+0.031
Neg logit contributions
 grasp-0.047
 One-0.045
 spine-0.044
 pocket-0.043
 grip-0.042
Token I
Token ID314
Feature activation-3.70e-02

Pos logit contributions
 grasp+0.006
 spine+0.005
 grip+0.005
 pocket+0.005
rils+0.005
Neg logit contributions
 accidents-0.004
 happens-0.004
washer-0.004
taker-0.004
 works-0.004
Token go
Token ID467
Feature activation-1.94e-03

Pos logit contributions
 spine+0.072
 injuries+0.064
 grasp+0.063
rils+0.061
 necks+0.057
Neg logit contributions
 works-0.047
washer-0.046
free-0.045
 happens-0.044
 sound-0.044
Token first
Token ID717
Feature activation-3.36e-02

Pos logit contributions
 grasp+0.004
 spine+0.004
 One+0.004
 injuries+0.004
rils+0.004
Neg logit contributions
washer-0.002
ing-0.002
fl-0.002
 works-0.002
 sound-0.002
Token?"
Token ID1701
Feature activation-1.37e-02

Pos logit contributions
 grasp+0.072
 spine+0.070
rils+0.065
 injuries+0.063
 pocket+0.063
Neg logit contributions
washer-0.034
ing-0.033
 happens-0.032
 works-0.032
 sound-0.030
Token
Token ID198
Feature activation+4.79e-03

Pos logit contributions
 grasp+0.022
 spine+0.021
 grip+0.020
 pocket+0.019
rils+0.019
Neg logit contributions
taker-0.021
 happens-0.020
 accidents-0.020
washer-0.020
ull-0.020
Token
Token ID198
Feature activation+2.07e-03

Pos logit contributions
taker+0.007
ull+0.007
lad+0.007
washer+0.007
 happens+0.007
Neg logit contributions
 grasp-0.008
 spine-0.007
 grip-0.007
 pocket-0.007
rils-0.007
Token being
Token ID852
Feature activation+1.01e-02

Pos logit contributions
taker+0.007
lad+0.007
washer+0.007
 works+0.007
led+0.007
Neg logit contributions
 grasp-0.007
 One-0.006
 spine-0.006
 pocket-0.006
 front-0.006
Token foolish
Token ID19538
Feature activation+1.31e-02

Pos logit contributions
lad+0.013
fl+0.012
 works+0.012
taker+0.012
 happens+0.011
Neg logit contributions
 grasp-0.019
 nine-0.018
 pocket-0.018
 front-0.018
 One-0.018
Token.
Token ID13
Feature activation+7.69e-03

Pos logit contributions
lad+0.026
ull+0.026
taker+0.026
lling+0.025
 works+0.025
Neg logit contributions
 One-0.014
 grasp-0.013
 spine-0.012
 front-0.011
 pocket-0.011
Token Why
Token ID4162
Feature activation-1.40e-02

Pos logit contributions
taker+0.012
 accidents+0.012
lling+0.012
lad+0.012
 works+0.012
Neg logit contributions
 grasp-0.012
 One-0.011
 spine-0.011
 pocket-0.011
 front-0.010
Token is
Token ID318
Feature activation-4.26e-04

Pos logit contributions
 One+0.020
 pocket+0.019
 grasp+0.019
 use+0.019
 grip+0.019
Neg logit contributions
taker-0.022
lad-0.021
lling-0.021
led-0.021
year-0.021
Token he
Token ID339
Feature activation-3.68e-02

Pos logit contributions
 One+0.001
 grasp+0.001
 pocket+0.001
 front+0.001
 nine+0.001
Neg logit contributions
lad-0.001
lling-0.001
ull-0.001
taker-0.001
led-0.001
Token doing
Token ID1804
Feature activation-8.55e-03

Pos logit contributions
 grasp+0.061
 One+0.057
 pocket+0.057
 grip+0.056
 spine+0.055
Neg logit contributions
taker-0.054
lad-0.053
ull-0.052
washer-0.050
lling-0.049
Token that
Token ID326
Feature activation+1.20e-02

Pos logit contributions
 grasp+0.013
 One+0.012
 pocket+0.012
 front+0.012
 nine+0.012
Neg logit contributions
lad-0.014
taker-0.013
led-0.013
ull-0.013
washer-0.013
Token?"
Token ID1701
Feature activation-1.36e-02

Pos logit contributions
led+0.019
taker+0.019
lad+0.019
lling+0.018
ull+0.018
Neg logit contributions
 grasp-0.018
 One-0.018
 front-0.017
 pocket-0.017
 grip-0.017
Token
Token ID220
Feature activation+3.00e-03

Pos logit contributions
 grasp+0.022
 spine+0.021
 grip+0.020
 pocket+0.018
 nine+0.018
Neg logit contributions
taker-0.020
 accidents-0.020
 disl-0.019
washer-0.019
 sound-0.019
Token
Token ID198
Feature activation-3.66e-03

Pos logit contributions
taker+0.004
washer+0.004
lad+0.004
ing+0.004
 accidents+0.004
Neg logit contributions
 grasp-0.005
 spine-0.005
 pocket-0.005
 grip-0.005
rils-0.004
Token the
Token ID262
Feature activation+9.71e-03

Pos logit contributions
 grasp+0.027
 One+0.026
 pocket+0.025
 front+0.025
 spine+0.025
Neg logit contributions
taker-0.026
lad-0.025
washer-0.025
 works-0.024
 happens-0.024
Token curtain
Token ID29461
Feature activation-1.34e-02

Pos logit contributions
lad+0.016
taker+0.016
 accidents+0.016
 happens+0.016
led+0.016
Neg logit contributions
 grasp-0.014
 One-0.013
 spine-0.013
 pocket-0.013
 front-0.013
Token again
Token ID757
Feature activation-4.33e-03

Pos logit contributions
 grasp+0.020
 One+0.018
 spine+0.018
 pocket+0.018
 grip+0.017
Neg logit contributions
taker-0.022
washer-0.021
 accidents-0.021
 works-0.021
 happens-0.021
Token.
Token ID13
Feature activation+2.03e-03

Pos logit contributions
 grasp+0.007
 spine+0.007
 One+0.006
rils+0.006
 pocket+0.006
Neg logit contributions
 accidents-0.006
ing-0.006
 happens-0.006
lling-0.006
 works-0.006
Token<|endoftext|>
Token ID50256
Feature activation+3.90e-02

Pos logit contributions
 happens+0.003
 accidents+0.003
taker+0.003
 works+0.003
lad+0.003
Neg logit contributions
 grasp-0.003
 grip-0.003
 spine-0.003
 pocket-0.003
rils-0.003
TokenS
Token ID50
Feature activation-3.52e-02

Pos logit contributions
taker+0.062
 accidents+0.060
lad+0.059
lling+0.058
washer+0.057
Neg logit contributions
 One-0.060
 grasp-0.060
 front-0.056
 pocket-0.055
 spine-0.053
Tokenara
Token ID3301
Feature activation+1.01e-02

Pos logit contributions
 spine+0.053
 grasp+0.052
 grip+0.046
sing+0.045
rils+0.044
Neg logit contributions
washer-0.056
lad-0.055
 works-0.054
ing-0.054
 happens-0.052
Token and
Token ID290
Feature activation-1.16e-02

Pos logit contributions
taker+0.017
 accidents+0.016
lad+0.016
lling+0.015
washer+0.015
Neg logit contributions
 grasp-0.015
 One-0.014
 pocket-0.014
 spine-0.014
 front-0.013
Token Tom
Token ID4186
Feature activation+1.88e-02

Pos logit contributions
 grasp+0.018
 spine+0.017
 One+0.017
 pocket+0.016
 grip+0.016
Neg logit contributions
taker-0.018
 accidents-0.018
 happens-0.018
washer-0.017
 works-0.017
Token were
Token ID547
Feature activation-1.19e-02

Pos logit contributions
taker+0.029
 accidents+0.029
lad+0.028
washer+0.027
lling+0.027
Neg logit contributions
 grasp-0.030
 One-0.029
 pocket-0.028
 front-0.026
 nine-0.026
Token playing
Token ID2712
Feature activation+1.55e-02

Pos logit contributions
 grasp+0.023
 spine+0.022
 One+0.021
 grip+0.020
rils+0.020
Neg logit contributions
washer-0.014
taker-0.014
 accidents-0.014
 works-0.014
led-0.013
Token?"
Token ID1701
Feature activation-2.73e-02

Pos logit contributions
 grasp+0.002
 spine+0.002
 One+0.002
 pocket+0.002
 grip+0.002
Neg logit contributions
taker-0.001
 accidents-0.001
 happens-0.001
washer-0.001
 works-0.001
Token Lily
Token ID20037
Feature activation+1.64e-02

Pos logit contributions
 grasp+0.047
 spine+0.044
 grip+0.042
rils+0.041
 pocket+0.040
Neg logit contributions
taker-0.039
 happens-0.038
 accidents-0.037
washer-0.036
 works-0.036
Token asked
Token ID1965
Feature activation+1.76e-03

Pos logit contributions
 happens+0.023
 works+0.023
taker+0.023
washer+0.023
 accidents+0.022
Neg logit contributions
 grasp-0.028
 spine-0.026
 grip-0.025
 pocket-0.025
 One-0.024
Token.
Token ID13
Feature activation+4.61e-03

Pos logit contributions
taker+0.003
 accidents+0.003
washer+0.003
 happens+0.003
ing+0.003
Neg logit contributions
 grasp-0.003
 spine-0.002
 One-0.002
 pocket-0.002
 grip-0.002
Token She
Token ID1375
Feature activation+6.65e-03

Pos logit contributions
 happens+0.005
taker+0.005
ing+0.005
 boy+0.005
 accidents+0.005
Neg logit contributions
 grasp-0.010
 spine-0.009
 grip-0.009
 injuries-0.009
 pocket-0.008
Token reached
Token ID4251
Feature activation-3.33e-02

Pos logit contributions
led+0.007
washer+0.007
 works+0.007
 happens+0.007
taker+0.007
Neg logit contributions
 grasp-0.014
 spine-0.013
 One-0.012
rils-0.012
 grip-0.012
Token out
Token ID503
Feature activation-2.64e-02

Pos logit contributions
 grasp+0.070
+0.068
 spine+0.067
rils+0.065
One+0.064
Neg logit contributions
ing-0.033
 works-0.032
washer-0.030
taker-0.030
 happens-0.030
Token her
Token ID607
Feature activation-5.89e-03

Pos logit contributions
 grasp+0.050
 spine+0.048
 pocket+0.045
rils+0.044
+0.043
Neg logit contributions
taker-0.035
 accidents-0.034
ing-0.032
ull-0.032
led-0.031
Token hand
Token ID1021
Feature activation-2.11e-02

Pos logit contributions
 One+0.007
One+0.007
 spine+0.006
ern+0.006
 grasp+0.006
Neg logit contributions
washer-0.010
 things-0.010
 boy-0.010
 works-0.009
 sound-0.009
Token to
Token ID284
Feature activation+5.93e-03

Pos logit contributions
 grasp+0.036
 One+0.033
 spine+0.032
 grip+0.032
 pocket+0.031
Neg logit contributions
ing-0.030
washer-0.029
taker-0.028
led-0.028
fl-0.028
Token take
Token ID1011
Feature activation+2.29e-03

Pos logit contributions
 works+0.010
 sound+0.009
rest+0.009
ull+0.009
lling+0.009
Neg logit contributions
 spine-0.008
 grasp-0.008
ider-0.007
 One-0.007
rils-0.007
Token.
Token ID13
Feature activation-4.28e-03

Pos logit contributions
 grasp+0.041
 Deal+0.041
 spine+0.039
ids+0.039
 One+0.037
Neg logit contributions
 sound-0.030
ing-0.028
washer-0.026
 works-0.026
fl-0.026
Token It
Token ID632
Feature activation+6.11e-03

Pos logit contributions
 grasp+0.007
 spine+0.006
 grip+0.006
 injuries+0.006
 pocket+0.006
Neg logit contributions
 happens-0.007
ull-0.006
taker-0.006
washer-0.006
ing-0.006
Token had
Token ID550
Feature activation-2.18e-02

Pos logit contributions
taker+0.007
lad+0.007
washer+0.007
 accidents+0.007
lling+0.007
Neg logit contributions
 One-0.012
 grasp-0.012
 front-0.011
 pocket-0.011
One-0.010
Token red
Token ID2266
Feature activation-4.50e-02

Pos logit contributions
 grasp+0.038
 spine+0.037
 pocket+0.035
 One+0.035
 grip+0.034
Neg logit contributions
 happens-0.029
taker-0.029
 accidents-0.029
 news-0.028
ull-0.028
Token and
Token ID290
Feature activation+3.20e-03

Pos logit contributions
 grasp+0.073
 spine+0.069
 pocket+0.066
ids+0.063
 aid+0.062
Neg logit contributions
fl-0.069
 accidents-0.068
 happens-0.065
 news-0.065
ing-0.064
Token yellow
Token ID7872
Feature activation-3.79e-02

Pos logit contributions
taker+0.006
lad+0.006
washer+0.006
lling+0.006
 accidents+0.006
Neg logit contributions
 One-0.004
 grasp-0.004
 front-0.004
 pocket-0.004
 nine-0.004
Token feathers
Token ID27400
Feature activation-8.27e-03

Pos logit contributions
 grasp+0.062
 spine+0.059
ait+0.056
 pocket+0.056
 aid+0.054
Neg logit contributions
fl-0.059
 accidents-0.056
 happens-0.054
 sound-0.053
 news-0.053
Token and
Token ID290
Feature activation+5.23e-03

Pos logit contributions
 grasp+0.012
 One+0.011
 spine+0.010
 pocket+0.010
 grip+0.010
Neg logit contributions
taker-0.014
 accidents-0.014
washer-0.014
lad-0.014
 happens-0.014
Token a
Token ID257
Feature activation+1.96e-03

Pos logit contributions
washer+0.009
lad+0.009
taker+0.009
lling+0.009
 accidents+0.009
Neg logit contributions
 One-0.007
 grasp-0.007
 front-0.007
 pocket-0.006
 nine-0.006
Token long
Token ID890
Feature activation-1.70e-02

Pos logit contributions
taker+0.003
 accidents+0.003
 sound+0.003
 happens+0.003
ull+0.003
Neg logit contributions
 grasp-0.003
 One-0.003
 spine-0.003
 pocket-0.003
rils-0.003
Token tail
Token ID7894
Feature activation-1.20e-02

Pos logit contributions
 grasp+0.021
ait+0.020
+0.018
 One+0.018
 nine+0.018
Neg logit contributions
 sound-0.031
ull-0.029
 happens-0.028
fl-0.028
ity-0.028