TOP ACTIVATIONS
MAX = 4.804

they found out that the c actus had spoiled and had
loved looking at the impressive c actus in his backyard and
told him all about the c actus . He promised to
wrong . She saw the c actus and Lily 's finger
to be careful around the c actus . <|endoftext|> Once upon
. But the c actus was not happy .
that her friend touched the c actus and got hurt .
play . She saw the c actus and said , "
in the sand . The c actus was so en vious
for her grandma . The c rows had won the battle
. She saw that the c actus was dro opy and
Suddenly , the c actus started talking ! It
my told her that the c actus was not soft like
!" The c actus just stayed still .
play . She sees the c actus and wants to touch
wanted to play with the oct opus , but we broke
. He looked at the c actus and saw a pink
patience , he noticed the c actus was starting to grow
She told him that the c actus had sharp spikes that
play nicely together . The oct opus was a gift for

MIN ACTIVATIONS
MIN = -4.467

cut into it . <|endoftext|> Ru pert was a happy boy
with new life ! <|endoftext|> Far mer Fred was very cheerful
voice . But then , mirac ulously , it returned and
was very happy . <|endoftext|> Far away in the sky was
loved . <|endoftext|> One day Far ley was playing with his
." St ri pe y felt sad because he
very fast and always won the races . The
, Tim my , sometimes elev ators can be noisy because
said together , " A br ac ad ab ra !"
The cat grinned mis chie v ously . " Never
Can we feed the ze br as , Mom ?" Lily
Weight : 15 kilograms . Average for age : 3 years
over his body . Stri pe y liked to eat grass
She said , " A br ac ad ab ra !"
a z ebra named Stri pe y . He was black
use the dictionary to learn all kinds of new words !"
line , and he won the race . The
go . An ya raced to the gym
The little boy smiled mis chie v ously and said ,
" two ". She did this until she had ten sticks

INTERVAL 2.486 - 4.804
CONTAINS 0.065%

it was stuck ! The bald man wasn 't scared .
It split open and the cater pillar crawled out to safety
. She knew that the sm elly popcorn had been a
drops the wallet . The wallet is hard . It hits
? ” But the lemon couldn 't hear . The

INTERVAL 0.169 - 2.486
CONTAINS 40.666%

feel sorry . They say , " We are sorry ,
bad . He was just a little boy .
much as you want ." L ily smiled and
was a delicate bird who could soar like no other .
feed the animals , just like his mom . He was

INTERVAL -2.149 - 0.169
CONTAINS 58.969%

sometimes it was close . They learned that this was called
. It felt very happy . But just when
a video . It was an important video about being nice
watched the sun grow bigger and bigger . After that ,
want to leave the seal . He wants to play with

INTERVAL -4.467 - -2.149
CONTAINS 0.300%

the cooler and was stealing all the drinks . Tim my
May I join you ?" Maria nodded her head
wanted to eat the fruits right away . They peeled off
down and the lightning flashes lit up the sky .
word in the newspaper , just like the bird moved from
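
Note on the interval boundaries: the four INTERVAL ranges above are consistent with splitting the span from MIN (-4.467) to MAX (4.804) into four equal-width bins, and the four CONTAINS percentages add up to 100%. The sketch below reproduces that arithmetic. It is only a check under the assumption that equal-width binning was used; the tool that produced this dump is not shown, and the helper name `equal_width_edges` is hypothetical.

```python
def equal_width_edges(lo: float, hi: float, n_bins: int) -> list[float]:
    """Return n_bins + 1 equally spaced bin edges from lo to hi (assumed scheme)."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins)] + [hi]


# MIN and MAX as reported in this dump.
edges = equal_width_edges(-4.467, 4.804, 4)

# Print each bin in the same style as the INTERVAL lines.
for left, right in zip(edges[:-1], edges[1:]):
    print(f"INTERVAL {left:.3f} - {right:.3f}")

# CONTAINS percentages as reported, listed from the highest bin to the lowest.
shares = [0.065, 40.666, 58.969, 0.300]
total = sum(shares)
print(f"TOTAL {total:.3f}%")
```

The reported MIN and MAX are rounded to three decimals, so the recomputed edges can differ from the printed ones in the last digit.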
Token they
Token ID484
Feature activation+1.99e-01

Pos logit contributions
 shark+0.020
 steak+0.019
 thief+0.019
 crab+0.018
 salad+0.018
Neg logit contributions
 sorts-0.012
 ahead-0.011
 shades-0.011
eter-0.011
 Finally-0.011
Token found
Token ID1043
Feature activation-9.17e-01

Pos logit contributions
 shark+0.005
 thief+0.005
 steak+0.005
 gag+0.005
 crab+0.005
Neg logit contributions
 sorts-0.003
 shades-0.003
eter-0.003
 ahead-0.003
 colours-0.003
Token out
Token ID503
Feature activation-8.61e-01

Pos logit contributions
 sorts+0.019
 shades+0.018
mediately+0.018
 ahead+0.018
k+0.018
Neg logit contributions
 shark-0.022
 steak-0.022
 pasta-0.020
 salad-0.020
 thief-0.020
Token that
Token ID326
Feature activation+8.29e-01

Pos logit contributions
 sorts+0.015
 shades+0.014
 Finally+0.013
 ahead+0.013
mediately+0.013
Neg logit contributions
 shark-0.024
 steak-0.023
 thief-0.022
 dessert-0.021
 pasta-0.021
Token the
Token ID262
Feature activation+1.33e+00

Pos logit contributions
 shark+0.019
 steak+0.018
 thief+0.018
opus+0.018
ascal+0.017
Neg logit contributions
 sorts-0.018
 ahead-0.017
 called-0.016
 coloured-0.016
 Finally-0.016
Token c
Token ID269
Feature activation+4.80e+00

Pos logit contributions
 shark+0.026
 steak+0.024
 thief+0.023
 salad+0.022
 crab+0.022
Neg logit contributions
 sorts-0.034
 ahead-0.033
 called-0.032
 coloured-0.031
 Finally-0.031
Tokenactus
Token ID34144
Feature activation+2.62e+00

Pos logit contributions
 shark+0.064
 steak+0.059
 thief+0.054
 salad+0.051
 crab+0.051
Neg logit contributions
 sorts-0.151
 ahead-0.150
 called-0.144
 Finally-0.143
 coloured-0.143
Token had
Token ID550
Feature activation-1.21e-01

Pos logit contributions
 shark+0.094
 steak+0.091
 thief+0.089
 salad+0.087
 crab+0.087
Neg logit contributions
 sorts-0.022
 ahead-0.022
 called-0.020
 coloured-0.019
 shades-0.018
Token spoiled
Token ID34781
Feature activation-4.26e-01

Pos logit contributions
 sorts+0.003
 ahead+0.002
 Finally+0.002
mediately+0.002
k+0.002
Neg logit contributions
 shark-0.003
 steak-0.003
 thief-0.003
 salad-0.003
 pasta-0.003
Token and
Token ID290
Feature activation+9.86e-01

Pos logit contributions
 sorts+0.009
 ahead+0.009
 Finally+0.008
mediately+0.008
ually+0.008
Neg logit contributions
 shark-0.010
 steak-0.010
 pasta-0.009
 dessert-0.009
 thief-0.009
Token had
Token ID550
Feature activation+1.37e-01

Pos logit contributions
 shark+0.025
 steak+0.024
 thief+0.023
 salad+0.023
 pasta+0.022
Neg logit contributions
 sorts-0.019
 ahead-0.019
 Finally-0.018
k-0.018
 shades-0.017
Token loved
Token ID6151
Feature activation+1.03e-01

Pos logit contributions
 sorts+0.003
 ahead+0.003
mediately+0.003
 shades+0.003
k+0.003
Neg logit contributions
 shark-0.007
 steak-0.006
 thief-0.006
 mustache-0.006
 salad-0.006
Token looking
Token ID2045
Feature activation+5.37e-01

Pos logit contributions
 shark+0.003
opus+0.003
otion+0.003
 thief+0.003
 liar+0.003
Neg logit contributions
 called-0.001
 sorts-0.001
 past-0.001
 ahead-0.001
 coloured-0.001
Token at
Token ID379
Feature activation+3.44e-01

Pos logit contributions
 shark+0.018
 steak+0.017
 thief+0.016
 crab+0.016
 salad+0.016
Neg logit contributions
 sorts-0.007
 ahead-0.006
 called-0.006
 Finally-0.006
 coloured-0.005
Token the
Token ID262
Feature activation+9.14e-01

Pos logit contributions
 shark+0.010
 steak+0.010
 thief+0.009
 salad+0.009
opus+0.009
Neg logit contributions
 sorts-0.006
 ahead-0.005
 called-0.005
 coloured-0.005
 shades-0.005
Token impressive
Token ID8036
Feature activation+1.76e+00

Pos logit contributions
 shark+0.018
 steak+0.018
 thief+0.017
 salad+0.016
 liar+0.016
Neg logit contributions
 sorts-0.022
 ahead-0.022
 called-0.021
 coloured-0.021
 shades-0.021
Token c
Token ID269
Feature activation+4.71e+00

Pos logit contributions
 shark+0.036
 steak+0.035
 thief+0.033
 salad+0.032
 crab+0.031
Neg logit contributions
 sorts-0.043
 ahead-0.042
 called-0.040
 coloured-0.040
 shades-0.039
Tokenactus
Token ID34144
Feature activation+1.82e+00

Pos logit contributions
 shark+0.051
 steak+0.047
 thief+0.044
 liar+0.040
 gag+0.039
Neg logit contributions
 ahead-0.159
 sorts-0.158
 called-0.152
 shades-0.151
 coloured-0.151
Token in
Token ID287
Feature activation+3.48e-01

Pos logit contributions
 shark+0.062
 thief+0.059
 liar+0.059
 steak+0.059
 virus+0.059
Neg logit contributions
 ahead-0.020
 sorts-0.020
 called-0.018
 over-0.018
 when-0.017
Token his
Token ID465
Feature activation+4.84e-01

Pos logit contributions
 shark+0.010
opus+0.010
 steak+0.010
 thief+0.010
 gag+0.010
Neg logit contributions
 sorts-0.005
 ahead-0.005
 called-0.005
 coloured-0.004
 shades-0.004
Token backyard
Token ID24296
Feature activation+4.61e-01

Pos logit contributions
 shark+0.012
 steak+0.011
 thief+0.011
 salad+0.011
 crab+0.011
Neg logit contributions
 sorts-0.010
 ahead-0.010
 coloured-0.009
 called-0.009
 Finally-0.009
Token and
Token ID290
Feature activation+6.54e-02

Pos logit contributions
 shark+0.014
 steak+0.014
 salad+0.013
 thief+0.013
 pasta+0.013
Neg logit contributions
 sorts-0.006
 ahead-0.006
mediately-0.006
 coloured-0.006
 shades-0.006
Token told
Token ID1297
Feature activation+1.25e+00

Pos logit contributions
 shark+0.011
 mustache+0.010
 thief+0.010
 steak+0.010
 crab+0.010
Neg logit contributions
 Finally-0.008
 After-0.008
 shades-0.008
 directions-0.008
eter-0.008
Token him
Token ID683
Feature activation+9.66e-01

Pos logit contributions
 shark+0.031
opus+0.030
 liar+0.029
 thief+0.028
 steak+0.028
Neg logit contributions
 ahead-0.025
 called-0.025
 sorts-0.025
 past-0.024
 over-0.023
Token all
Token ID477
Feature activation-1.46e+00

Pos logit contributions
 shark+0.027
 steak+0.026
 thief+0.025
opus+0.025
ascal+0.025
Neg logit contributions
 ahead-0.016
 sorts-0.015
 called-0.015
 Finally-0.014
 over-0.014
Token about
Token ID546
Feature activation+5.86e-01

Pos logit contributions
 sorts+0.021
 ahead+0.020
 called+0.018
 Finally+0.018
 coloured+0.018
Neg logit contributions
 shark-0.045
 steak-0.043
 thief-0.042
 salad-0.041
 crab-0.040
Token the
Token ID262
Feature activation+6.41e-01

Pos logit contributions
opus+0.016
 shark+0.016
 thief+0.015
 liar+0.015
otion+0.015
Neg logit contributions
 called-0.010
 sorts-0.010
 ahead-0.010
 past-0.009
 coloured-0.009
Token c
Token ID269
Feature activation+4.71e+00

Pos logit contributions
 shark+0.014
 steak+0.014
otion+0.013
 thief+0.013
oly+0.013
Neg logit contributions
 sorts-0.014
 ahead-0.014
 past-0.014
 called-0.013
 coloured-0.013
Tokenactus
Token ID34144
Feature activation+2.32e+00

Pos logit contributions
 shark+0.053
 steak+0.050
 thief+0.045
 liar+0.041
 gag+0.040
Neg logit contributions
 ahead-0.157
 sorts-0.156
 called-0.150
 coloured-0.149
 shades-0.149
Token.
Token ID13
Feature activation-1.02e-01

Pos logit contributions
 shark+0.077
 steak+0.074
 thief+0.073
 liar+0.073
 gag+0.072
Neg logit contributions
 ahead-0.026
 called-0.026
 sorts-0.026
 over-0.023
 began-0.021
Token He
Token ID679
Feature activation-3.73e-01

Pos logit contributions
 sorts+0.002
 ahead+0.002
 coloured+0.002
k+0.002
 shades+0.002
Neg logit contributions
 shark-0.003
 steak-0.003
 thief-0.003
 salad-0.003
 pasta-0.003
Token promised
Token ID8072
Feature activation+2.77e-01

Pos logit contributions
 sorts+0.006
 ahead+0.006
k+0.005
 shades+0.005
 Finally+0.005
Neg logit contributions
 shark-0.011
 steak-0.011
 thief-0.010
 salad-0.010
 crab-0.010
Token to
Token ID284
Feature activation+5.94e-01

Pos logit contributions
 shark+0.008
 steak+0.008
 thief+0.008
opus+0.008
 salad+0.008
Neg logit contributions
 ahead-0.004
 sorts-0.004
 called-0.004
 Finally-0.004
 past-0.004
Token wrong
Token ID2642
Feature activation+2.43e-01

Pos logit contributions
 ahead+0.020
 called+0.019
 past+0.018
 sorts+0.018
 coloured+0.018
Neg logit contributions
 shark-0.029
 steak-0.028
 crab-0.027
opus-0.026
 salad-0.026
Token.
Token ID13
Feature activation+1.54e-01

Pos logit contributions
 shark+0.008
 steak+0.008
 thief+0.008
 salad+0.008
 crab+0.008
Neg logit contributions
 sorts-0.003
 ahead-0.002
 Finally-0.002
 coloured-0.002
k-0.002
Token She
Token ID1375
Feature activation+5.63e-01

Pos logit contributions
 shark+0.005
 steak+0.005
 thief+0.005
opus+0.004
 crab+0.004
Neg logit contributions
 Finally-0.002
 ahead-0.002
 sorts-0.002
 called-0.002
 After-0.002
Token saw
Token ID2497
Feature activation+2.87e-02

Pos logit contributions
 shark+0.016
 thief+0.015
 steak+0.015
 crab+0.014
 ow+0.014
Neg logit contributions
 sorts-0.010
eter-0.009
 shades-0.009
 ahead-0.009
k-0.009
Token the
Token ID262
Feature activation+1.24e+00

Pos logit contributions
 shark+0.001
 steak+0.001
 thief+0.001
opus+0.001
 gag+0.001
Neg logit contributions
 called-0.000
 ahead-0.000
 sorts-0.000
 past-0.000
 over-0.000
Token c
Token ID269
Feature activation+4.68e+00

Pos logit contributions
 shark+0.023
 steak+0.022
ascal+0.020
otion+0.020
 salad+0.020
Neg logit contributions
 ahead-0.031
 sorts-0.030
 past-0.030
 called-0.029
 coloured-0.029
Tokenactus
Token ID34144
Feature activation+2.04e+00

Pos logit contributions
 shark+0.068
 steak+0.063
 thief+0.057
 salad+0.055
 dessert+0.054
Neg logit contributions
 ahead-0.141
 sorts-0.134
k-0.134
 called-0.132
 past-0.132
Token and
Token ID290
Feature activation+4.74e-01

Pos logit contributions
 shark+0.075
oly+0.074
 liar+0.073
ascal+0.072
 cowboy+0.072
Neg logit contributions
 over-0.018
 called-0.018
 ahead-0.017
 when-0.015
house-0.015
Token Lily
Token ID20037
Feature activation+9.62e-01

Pos logit contributions
 shark+0.014
 steak+0.013
 thief+0.013
 crab+0.012
 salad+0.012
Neg logit contributions
 sorts-0.008
 ahead-0.008
eter-0.007
 Finally-0.007
k-0.007
Token's
Token ID338
Feature activation+8.59e-01

Pos logit contributions
 shark+0.027
 steak+0.026
opus+0.025
 thief+0.024
 salad+0.024
Neg logit contributions
 called-0.017
 ahead-0.017
 sorts-0.016
 began-0.016
 over-0.016
Token finger
Token ID7660
Feature activation+3.22e-01

Pos logit contributions
 shark+0.018
 steak+0.017
 thief+0.016
 salad+0.016
opus+0.016
Neg logit contributions
 ahead-0.020
 sorts-0.020
 called-0.019
 shades-0.019
 coloured-0.019
Token to
Token ID284
Feature activation+4.94e-01

Pos logit contributions
 shark+0.010
 steak+0.009
 thief+0.009
opus+0.009
ascal+0.009
Neg logit contributions
 sorts-0.004
 ahead-0.003
 over-0.003
 called-0.003
 when-0.003
Token be
Token ID307
Feature activation+7.35e-01

Pos logit contributions
 shark+0.012
 steak+0.012
 salad+0.012
 thief+0.012
 dessert+0.012
Neg logit contributions
 ahead-0.009
 sorts-0.009
antly-0.009
 Finally-0.009
ples-0.009
Token careful
Token ID8161
Feature activation+1.59e+00

Pos logit contributions
 thief+0.013
 dessert+0.013
 steak+0.013
 chefs+0.012
 cowboy+0.012
Neg logit contributions
se-0.018
 sorts-0.018
 Finally-0.018
ually-0.018
ris-0.017
Token around
Token ID1088
Feature activation+9.04e-01

Pos logit contributions
 shark+0.047
 steak+0.046
 thief+0.044
 salad+0.043
 dessert+0.043
Neg logit contributions
 sorts-0.023
 Finally-0.021
 ahead-0.021
 coloured-0.020
ples-0.020
Token the
Token ID262
Feature activation+1.58e+00

Pos logit contributions
 steak+0.029
opus+0.029
 shark+0.029
 gag+0.028
 liar+0.028
Neg logit contributions
 called-0.013
 sorts-0.012
 when-0.011
 ahead-0.011
 past-0.011
Token c
Token ID269
Feature activation+4.68e+00

Pos logit contributions
 liar+0.044
otion+0.043
 shark+0.043
 steak+0.042
 gag+0.042
Neg logit contributions
 past-0.028
 called-0.026
 ahead-0.026
 coloured-0.025
 admired-0.025
Tokenactus
Token ID34144
Feature activation+1.79e+00

Pos logit contributions
 shark+0.059
 steak+0.058
 thief+0.053
 liar+0.050
 gag+0.049
Neg logit contributions
 ahead-0.148
 sorts-0.147
 called-0.143
 shades-0.140
 past-0.140
Token.
Token ID13
Feature activation+4.47e-01

Pos logit contributions
 shark+0.060
 steak+0.058
 thief+0.056
opus+0.055
 crab+0.054
Neg logit contributions
 sorts-0.021
 ahead-0.020
 called-0.020
 over-0.018
k-0.017
Token<|endoftext|>
Token ID50256
Feature activation+1.63e+00

Pos logit contributions
 shark+0.016
 steak+0.015
 thief+0.015
 crab+0.014
 liar+0.014
Neg logit contributions
 sorts-0.005
 Finally-0.005
 After-0.005
 ahead-0.004
 When-0.004
TokenOnce
Token ID7454
Feature activation+1.51e+00

Pos logit contributions
 shark+0.050
 steak+0.048
 thief+0.048
 crab+0.046
 salad+0.046
Neg logit contributions
 sorts-0.023
 ahead-0.022
 called-0.021
 Finally-0.020
 coloured-0.020
Token upon
Token ID2402
Feature activation+1.39e+00

Pos logit contributions
 shark+0.045
 steak+0.044
 thief+0.043
 liar+0.041
 gag+0.041
Neg logit contributions
 sorts-0.022
 ahead-0.022
 called-0.020
 Finally-0.020
 past-0.020
Token.
Token ID13
Feature activation+1.37e-01

Pos logit contributions
 sorts+0.024
 ahead+0.022
 coloured+0.020
 Finally+0.020
 called+0.020
Neg logit contributions
 shark-0.072
 steak-0.070
 thief-0.068
 salad-0.067
 crab-0.066
Token
Token ID198
Feature activation+9.06e-02

Pos logit contributions
 shark+0.004
 steak+0.004
 thief+0.004
 salad+0.004
 crab+0.004
Neg logit contributions
 sorts-0.002
 ahead-0.002
 coloured-0.002
 called-0.002
k-0.002
Token
Token ID198
Feature activation+1.50e-01

Pos logit contributions
 shark+0.003
 steak+0.003
 thief+0.003
 salad+0.002
 crab+0.002
Neg logit contributions
 sorts-0.001
 ahead-0.001
 called-0.001
 Finally-0.001
 coloured-0.001
TokenBut
Token ID1537
Feature activation-1.57e+00

Pos logit contributions
 shark+0.005
 steak+0.004
 thief+0.004
 crab+0.004
 salad+0.004
Neg logit contributions
 sorts-0.002
 ahead-0.002
 shades-0.002
 coloured-0.002
eter-0.002
Token the
Token ID262
Feature activation+1.49e-01

Pos logit contributions
 sorts+0.026
 ahead+0.023
 coloured+0.023
 shades+0.023
k+0.022
Neg logit contributions
 shark-0.047
 steak-0.044
 thief-0.042
 crab-0.042
 salad-0.042
Token c
Token ID269
Feature activation+4.63e+00

Pos logit contributions
otion+0.004
 steak+0.004
oly+0.004
uts+0.003
 liar+0.003
Neg logit contributions
 called-0.003
 past-0.003
 colors-0.003
 colours-0.003
 ahead-0.003
Tokenactus
Token ID34144
Feature activation+2.33e+00

Pos logit contributions
 shark+0.041
 steak+0.037
 thief+0.033
 salad+0.029
 crab+0.028
Neg logit contributions
 sorts-0.164
 ahead-0.164
 called-0.159
 coloured-0.157
 shades-0.157
Token was
Token ID373
Feature activation+5.72e-01

Pos logit contributions
 shark+0.075
 steak+0.073
 thief+0.071
 crab+0.069
 salad+0.069
Neg logit contributions
 sorts-0.030
 ahead-0.028
 Finally-0.025
 coloured-0.025
 shades-0.025
Token not
Token ID407
Feature activation+6.10e-01

Pos logit contributions
 shark+0.013
 steak+0.013
 thief+0.012
 salad+0.011
 ow+0.011
Neg logit contributions
 sorts-0.014
 Finally-0.012
ually-0.012
antly-0.012
k-0.012
Token happy
Token ID3772
Feature activation+1.46e-01

Pos logit contributions
 shark+0.017
 steak+0.016
 thief+0.015
opus+0.015
 crab+0.015
Neg logit contributions
 sorts-0.011
 ahead-0.011
 called-0.010
 coloured-0.010
 shades-0.010
Token.
Token ID13
Feature activation-3.64e-01

Pos logit contributions
 shark+0.005
 steak+0.005
 thief+0.005
opus+0.005
 liar+0.005
Neg logit contributions
 ahead-0.001
 sorts-0.001
 called-0.001
 shades-0.001
 coloured-0.001
Token that
Token ID326
Feature activation+1.58e+00

Pos logit contributions
 shark+0.037
 steak+0.035
opus+0.035
ascal+0.034
oly+0.034
Neg logit contributions
 sorts-0.014
 called-0.014
 ahead-0.014
 over-0.013
 began-0.012
Token her
Token ID607
Feature activation+1.06e+00

Pos logit contributions
 shark+0.044
opus+0.044
ascal+0.043
 steak+0.043
oly+0.042
Neg logit contributions
 called-0.026
 sorts-0.025
 when-0.025
 over-0.024
 painted-0.023
Token friend
Token ID1545
Feature activation+2.35e+00

Pos logit contributions
 shark+0.027
 steak+0.025
 thief+0.025
 crab+0.024
 salad+0.024
Neg logit contributions
 sorts-0.021
 ahead-0.021
 Finally-0.019
k-0.019
 coloured-0.019
Token touched
Token ID12615
Feature activation-5.52e-01

Pos logit contributions
 shark+0.061
 thief+0.060
 steak+0.059
 mustache+0.057
 ow+0.056
Neg logit contributions
 sorts-0.042
k-0.040
ples-0.040
 Finally-0.039
 shades-0.039
Token the
Token ID262
Feature activation+1.43e+00

Pos logit contributions
 sorts+0.010
 ahead+0.010
 Finally+0.009
k+0.009
 coloured+0.009
Neg logit contributions
 shark-0.015
 steak-0.014
 thief-0.013
 salad-0.013
 pasta-0.013
Token c
Token ID269
Feature activation+4.62e+00

Pos logit contributions
oly+0.037
 liar+0.037
otion+0.036
ascal+0.036
 steak+0.036
Neg logit contributions
 past-0.027
 called-0.025
 colors-0.023
 direction-0.023
 shades-0.023
Tokenactus
Token ID34144
Feature activation+2.26e+00

Pos logit contributions
 shark+0.047
 steak+0.042
 thief+0.038
 liar+0.033
 salad+0.033
Neg logit contributions
 ahead-0.159
 sorts-0.158
 called-0.153
 coloured-0.151
 shades-0.151
Token and
Token ID290
Feature activation+1.11e+00

Pos logit contributions
 shark+0.082
 steak+0.078
 thief+0.077
opus+0.076
oly+0.075
Neg logit contributions
 ahead-0.020
 sorts-0.020
 called-0.019
 over-0.018
 past-0.017
Token got
Token ID1392
Feature activation+1.50e+00

Pos logit contributions
 shark+0.028
 steak+0.027
 thief+0.026
 salad+0.026
 crab+0.025
Neg logit contributions
 sorts-0.022
 ahead-0.021
 Finally-0.020
k-0.020
 shades-0.020
Token hurt
Token ID5938
Feature activation+4.63e-01

Pos logit contributions
 shark+0.034
 steak+0.033
 thief+0.032
 salad+0.031
 crab+0.030
Neg logit contributions
 sorts-0.033
 ahead-0.032
 shades-0.030
 coloured-0.030
 Finally-0.030
Token.
Token ID13
Feature activation+7.65e-01

Pos logit contributions
 shark+0.015
 steak+0.015
 thief+0.015
 crab+0.014
 salad+0.014
Neg logit contributions
 sorts-0.005
 ahead-0.005
 Finally-0.005
 coloured-0.005
 shades-0.004
Token play
Token ID711
Feature activation+6.00e-01

Pos logit contributions
 sorts+0.005
 ahead+0.005
 Finally+0.005
 called+0.005
 coloured+0.005
Neg logit contributions
 shark-0.010
 steak-0.010
 thief-0.009
 crab-0.009
 salad-0.009
Token.
Token ID13
Feature activation+1.41e+00

Pos logit contributions
 steak+0.024
 shark+0.024
opus+0.024
 thief+0.023
oly+0.023
Neg logit contributions
 called-0.003
 ahead-0.003
 sorts-0.003
 past-0.002
 when-0.002
Token She
Token ID1375
Feature activation+2.80e-01

Pos logit contributions
 shark+0.041
 steak+0.041
opus+0.039
 crab+0.038
ascal+0.038
Neg logit contributions
 Finally-0.022
 sorts-0.021
 After-0.021
 ahead-0.021
 called-0.021
Token saw
Token ID2497
Feature activation+1.88e-01

Pos logit contributions
 shark+0.008
 steak+0.007
 thief+0.007
 crab+0.007
 mustache+0.007
Neg logit contributions
 sorts-0.005
eter-0.005
k-0.005
 ahead-0.005
 shades-0.005
Token the
Token ID262
Feature activation+7.23e-01

Pos logit contributions
 shark+0.006
opus+0.006
 steak+0.006
ascal+0.006
 gag+0.006
Neg logit contributions
 called-0.002
 sorts-0.002
 over-0.002
 ahead-0.002
 past-0.002
Token c
Token ID269
Feature activation+4.61e+00

Pos logit contributions
 steak+0.019
 shark+0.018
otion+0.018
oly+0.017
 liar+0.017
Neg logit contributions
 past-0.013
 called-0.013
 coloured-0.013
 colors-0.013
 colours-0.013
Tokenactus
Token ID34144
Feature activation+2.05e+00

Pos logit contributions
 shark+0.048
 steak+0.045
 thief+0.039
 salad+0.036
 dessert+0.035
Neg logit contributions
 ahead-0.158
 sorts-0.157
 called-0.152
 coloured-0.151
k-0.150
Token and
Token ID290
Feature activation+2.01e-01

Pos logit contributions
 shark+0.074
oly+0.074
 liar+0.074
ascal+0.073
 steak+0.072
Neg logit contributions
 over-0.019
 called-0.018
 ahead-0.018
 when-0.015
 began-0.015
Token said
Token ID531
Feature activation+3.38e-01

Pos logit contributions
 shark+0.006
 steak+0.005
 thief+0.005
 crab+0.005
 salad+0.005
Neg logit contributions
 sorts-0.003
 ahead-0.003
eter-0.003
k-0.003
 Finally-0.003
Token,
Token ID11
Feature activation+1.62e-01

Pos logit contributions
 shark+0.013
 steak+0.012
 thief+0.012
ascal+0.012
 salad+0.012
Neg logit contributions
 ahead-0.002
 sorts-0.002
 called-0.002
 past-0.002
 Finally-0.002
Token "
Token ID366
Feature activation-1.55e-02

Pos logit contributions
 shark+0.006
 steak+0.005
 thief+0.005
 salad+0.005
 pasta+0.005
Neg logit contributions
 ahead-0.002
 called-0.002
 past-0.002
 Finally-0.002
 over-0.002
Token in
Token ID287
Feature activation-1.72e+00

Pos logit contributions
 sorts+0.003
 ahead+0.003
 called+0.003
 Finally+0.003
 coloured+0.003
Neg logit contributions
 shark-0.006
 steak-0.006
 thief-0.006
 salad-0.005
 crab-0.005
Token the
Token ID262
Feature activation+1.47e-01

Pos logit contributions
 sorts+0.016
 ahead+0.015
 called+0.014
 coloured+0.013
 past+0.012
Neg logit contributions
opus-0.060
 shark-0.060
 thief-0.058
ascal-0.057
 liar-0.057
Token sand
Token ID6450
Feature activation+5.74e-01

Pos logit contributions
 shark+0.003
 steak+0.003
 thief+0.003
 crab+0.003
 salad+0.003
Neg logit contributions
 sorts-0.003
 ahead-0.003
 Finally-0.003
 coloured-0.003
eter-0.003
Token.
Token ID13
Feature activation+6.32e-01

Pos logit contributions
 shark+0.020
 steak+0.020
 thief+0.019
 liar+0.019
 gag+0.019
Neg logit contributions
 ahead-0.005
 sorts-0.005
 called-0.005
 coloured-0.004
k-0.004
Token The
Token ID383
Feature activation+3.71e-01

Pos logit contributions
 shark+0.017
 steak+0.017
 thief+0.016
 salad+0.016
 crab+0.016
Neg logit contributions
 sorts-0.011
 ahead-0.010
 coloured-0.010
k-0.009
 shades-0.009
Token c
Token ID269
Feature activation+4.61e+00

Pos logit contributions
 shark+0.007
 steak+0.006
otion+0.006
 thief+0.006
 salad+0.006
Neg logit contributions
 called-0.009
 sorts-0.009
 ahead-0.009
 past-0.009
 coloured-0.009
Tokenactus
Token ID34144
Feature activation+2.21e+00

Pos logit contributions
 shark+0.046
 steak+0.042
 thief+0.036
 salad+0.034
 dessert+0.033
Neg logit contributions
 ahead-0.160
 sorts-0.158
 called-0.154
 coloured-0.153
 shades-0.152
Token was
Token ID373
Feature activation-2.68e-01

Pos logit contributions
 shark+0.075
 steak+0.072
 thief+0.070
 crab+0.069
 salad+0.069
Neg logit contributions
 sorts-0.025
 ahead-0.023
 Finally-0.020
 shades-0.020
k-0.020
Token so
Token ID523
Feature activation+9.07e-01

Pos logit contributions
 sorts+0.005
 ahead+0.005
 Finally+0.005
eter+0.005
k+0.004
Neg logit contributions
 shark-0.007
 steak-0.007
 thief-0.007
 salad-0.006
 crab-0.006
Token en
Token ID551
Feature activation-1.42e+00

Pos logit contributions
 shark+0.023
 steak+0.022
 thief+0.021
 salad+0.020
 crab+0.020
Neg logit contributions
 sorts-0.018
 ahead-0.017
 Finally-0.016
 called-0.016
 shades-0.016
Tokenvious
Token ID1442
Feature activation+1.11e+00

Pos logit contributions
ually+0.041
 Finally+0.040
antly+0.040
cards+0.040
 coloured+0.039
Neg logit contributions
 shark-0.020
 thief-0.020
 crab-0.019
 steak-0.018
opus-0.018
Token for
Token ID329
Feature activation+1.59e+00

Pos logit contributions
 shark+0.010
 steak+0.010
 thief+0.009
opus+0.009
ascal+0.009
Neg logit contributions
 ahead-0.003
 sorts-0.002
 called-0.002
 past-0.002
 over-0.002
Token her
Token ID607
Feature activation+7.11e-01

Pos logit contributions
 shark+0.047
opus+0.046
 thief+0.044
 steak+0.043
 crab+0.043
Neg logit contributions
 ahead-0.026
 called-0.026
 past-0.024
 coloured-0.024
 shades-0.024
Token grandma
Token ID49890
Feature activation+1.19e+00

Pos logit contributions
 shark+0.019
 steak+0.018
opus+0.017
 thief+0.017
ascal+0.017
Neg logit contributions
 ahead-0.013
 sorts-0.013
 called-0.012
 past-0.012
 shades-0.012
Token.
Token ID13
Feature activation+5.18e-01

Pos logit contributions
 shark+0.039
 steak+0.038
 thief+0.037
 salad+0.036
 crab+0.036
Neg logit contributions
 sorts-0.014
 ahead-0.013
 Finally-0.013
 coloured-0.012
mediately-0.012
Token The
Token ID383
Feature activation+7.48e-01

Pos logit contributions
 shark+0.014
 steak+0.013
 salad+0.013
 thief+0.012
 pasta+0.012
Neg logit contributions
 coloured-0.009
 sorts-0.009
 ahead-0.009
k-0.009
 colours-0.008
Token c
Token ID269
Feature activation+4.61e+00

Pos logit contributions
 shark+0.021
 steak+0.020
 thief+0.020
 crab+0.019
 salad+0.019
Neg logit contributions
 sorts-0.012
 ahead-0.012
 called-0.011
 coloured-0.011
 Finally-0.011
Tokenrows
Token ID8516
Feature activation+3.79e-01

Pos logit contributions
 shark+0.076
 steak+0.070
 thief+0.068
opus+0.065
 crab+0.064
Neg logit contributions
 sorts-0.132
 ahead-0.127
 called-0.123
 coloured-0.123
 Finally-0.122
Token had
Token ID550
Feature activation+1.15e-01

Pos logit contributions
 shark+0.011
 steak+0.010
 thief+0.010
 crab+0.010
 salad+0.010
Neg logit contributions
 sorts-0.006
 ahead-0.006
 Finally-0.006
eter-0.006
 shades-0.006
Token won
Token ID1839
Feature activation-4.80e-01

Pos logit contributions
 shark+0.003
 steak+0.003
 thief+0.003
 salad+0.003
 crab+0.003
Neg logit contributions
 sorts-0.002
 ahead-0.002
 Finally-0.002
k-0.002
mediately-0.002
Token the
Token ID262
Feature activation-2.02e+00

Pos logit contributions
 sorts+0.008
 ahead+0.007
 called+0.007
 coloured+0.007
 Finally+0.007
Neg logit contributions
 shark-0.014
 steak-0.014
 thief-0.013
 salad-0.013
 crab-0.013
Token battle
Token ID3344
Feature activation+1.59e-01

Pos logit contributions
 ahead+0.042
 past+0.041
 called+0.040
 sorts+0.040
 coloured+0.038
Neg logit contributions
 shark-0.049
 steak-0.046
ascal-0.043
otion-0.043
opus-0.043
Token.
Token ID13
Feature activation-1.87e-01

Pos logit contributions
 shark+0.014
 steak+0.014
 thief+0.013
opus+0.013
 liar+0.013
Neg logit contributions
 ahead-0.004
 called-0.004
 over-0.003
 sorts-0.003
 past-0.003
Token She
Token ID1375
Feature activation+1.13e-01

Pos logit contributions
 sorts+0.004
 ahead+0.004
 coloured+0.003
k+0.003
 called+0.003
Neg logit contributions
 shark-0.005
 steak-0.005
 thief-0.004
 salad-0.004
 crab-0.004
Token saw
Token ID2497
Feature activation+3.31e-01

Pos logit contributions
 shark+0.003
 steak+0.003
 thief+0.003
 salad+0.003
 crab+0.003
Neg logit contributions
 sorts-0.002
 ahead-0.002
 shades-0.002
eter-0.002
 Finally-0.002
Token that
Token ID326
Feature activation+3.40e-01

Pos logit contributions
 shark+0.010
 thief+0.009
 steak+0.009
opus+0.009
 virus+0.009
Neg logit contributions
 ahead-0.005
 called-0.005
 sorts-0.005
 past-0.004
 over-0.004
Token the
Token ID262
Feature activation+1.21e+00

Pos logit contributions
opus+0.010
 shark+0.010
ascal+0.010
 steak+0.009
 thief+0.009
Neg logit contributions
 called-0.005
 sorts-0.005
 ahead-0.005
 over-0.005
 when-0.005
Token c
Token ID269
Feature activation+4.56e+00

Pos logit contributions
otion+0.030
oly+0.029
 shark+0.029
 steak+0.029
opus+0.028
Neg logit contributions
 colors-0.024
 past-0.023
 colours-0.023
 called-0.022
 coloured-0.021
Tokenactus
Token ID34144
Feature activation+2.57e+00

Pos logit contributions
 shark+0.048
 steak+0.045
 thief+0.039
 salad+0.036
 gag+0.035
Neg logit contributions
 ahead-0.154
 sorts-0.152
 called-0.149
 coloured-0.147
k-0.146
Token was
Token ID373
Feature activation+1.12e+00

Pos logit contributions
 shark+0.098
 steak+0.097
 thief+0.094
 liar+0.094
oly+0.094
Neg logit contributions
 called-0.015
 ahead-0.015
 began-0.013
 over-0.012
 sorts-0.012
Token dro
Token ID3102
Feature activation+1.90e+00

Pos logit contributions
 shark+0.027
 steak+0.025
 thief+0.024
opus+0.024
 crab+0.024
Neg logit contributions
 ahead-0.023
 sorts-0.023
 called-0.022
 coloured-0.022
 shades-0.021
Tokenopy
Token ID11081
Feature activation+7.22e-01

Pos logit contributions
 shark+0.034
 steak+0.033
 thief+0.030
 salad+0.029
 crab+0.029
Neg logit contributions
 ahead-0.051
 sorts-0.051
 called-0.049
k-0.049
 coloured-0.048
Token and
Token ID290
Feature activation+1.07e+00

Pos logit contributions
 shark+0.024
 steak+0.024
opus+0.023
 thief+0.023
oly+0.022
Neg logit contributions
 ahead-0.008
 sorts-0.007
 called-0.007
 coloured-0.007
k-0.007
Token
Token ID198
Feature activation+5.85e-01

Pos logit contributions
 steak+0.010
 shark+0.010
 salad+0.010
 crab+0.009
 dessert+0.009
Neg logit contributions
k-0.010
ples-0.009
 coloured-0.009
 sorts-0.009
ist-0.009
Token
Token ID198
Feature activation+8.51e-01

Pos logit contributions
 shark+0.017
 steak+0.016
 thief+0.016
 salad+0.015
 crab+0.015
Neg logit contributions
 sorts-0.009
 ahead-0.009
 called-0.008
 coloured-0.008
 shades-0.008
TokenSuddenly
Token ID38582
Feature activation-8.17e-01

Pos logit contributions
 steak+0.017
 shark+0.017
 thief+0.017
 crab+0.017
 salad+0.016
Neg logit contributions
 sorts-0.018
eter-0.018
k-0.018
ples-0.018
 ahead-0.018
Token,
Token ID11
Feature activation-5.50e-01

Pos logit contributions
 sorts+0.015
k+0.015
eter+0.013
ples+0.013
mediately+0.013
Neg logit contributions
 shark-0.022
 steak-0.022
 salad-0.021
 thief-0.021
 crab-0.021
Token the
Token ID262
Feature activation+1.54e-01

Pos logit contributions
 sorts+0.011
k+0.011
ples+0.011
eter+0.011
 ahead+0.010
Neg logit contributions
 shark-0.013
 steak-0.013
 thief-0.012
 salad-0.012
 dessert-0.012
Token c
Token ID269
Feature activation+4.53e+00

Pos logit contributions
 liar+0.004
 steak+0.004
 shark+0.004
otion+0.004
oly+0.004
Neg logit contributions
 called-0.003
 past-0.003
 ahead-0.003
 shades-0.003
 colours-0.003
Tokenactus
Token ID34144
Feature activation+2.68e+00

Pos logit contributions
 shark+0.042
 steak+0.039
 thief+0.033
 salad+0.031
 dessert+0.030
Neg logit contributions
 ahead-0.159
 sorts-0.158
 called-0.153
 coloured-0.152
 shades-0.151
Token started
Token ID2067
Feature activation-2.42e-01

Pos logit contributions
 shark+0.087
 steak+0.084
 thief+0.083
 crab+0.080
 salad+0.079
Neg logit contributions
 sorts-0.035
 ahead-0.032
 Finally-0.029
k-0.029
 shades-0.029
Token talking
Token ID3375
Feature activation+3.26e-01

Pos logit contributions
 sorts+0.004
 ahead+0.003
 Finally+0.003
 coloured+0.003
mediately+0.003
Neg logit contributions
 shark-0.007
 steak-0.007
 thief-0.007
 crab-0.007
 salad-0.007
Token!
Token ID0
Feature activation+6.70e-01

Pos logit contributions
 shark+0.009
 steak+0.009
 thief+0.009
 salad+0.009
 pasta+0.009
Neg logit contributions
 sorts-0.005
 Finally-0.005
mediately-0.005
ually-0.005
 ahead-0.005
Token It
Token ID632
Feature activation-4.24e-01

Pos logit contributions
 shark+0.019
 steak+0.018
 thief+0.017
 salad+0.017
 crab+0.017
Neg logit contributions
 sorts-0.011
 ahead-0.011
k-0.010
 called-0.010
 coloured-0.010
Tokenmy
Token ID1820
Feature activation-4.74e-01

Pos logit contributions
 sorts+0.005
eter+0.005
 Finally+0.004
 After+0.004
 shades+0.004
Neg logit contributions
 mustache-0.006
 shark-0.006
 ow-0.006
 crying-0.006
 thief-0.006
Token told
Token ID1297
Feature activation+3.37e-01

Pos logit contributions
 sorts+0.009
eter+0.008
 Finally+0.008
 ahead+0.008
 shades+0.008
Neg logit contributions
 shark-0.012
 steak-0.012
 thief-0.012
 mustache-0.011
 crab-0.011
Token her
Token ID607
Feature activation+1.85e-01

Pos logit contributions
 shark+0.012
opus+0.012
 liar+0.011
 steak+0.011
 crab+0.011
Neg logit contributions
 called-0.004
 past-0.003
 over-0.003
 when-0.003
 ahead-0.003
Token that
Token ID326
Feature activation+1.25e+00

Pos logit contributions
 shark+0.006
 steak+0.005
opus+0.005
 thief+0.005
ascal+0.005
Neg logit contributions
 ahead-0.002
 sorts-0.002
 called-0.002
 over-0.002
 past-0.002
Token the
Token ID262
Feature activation+2.59e-01

Pos logit contributions
 shark+0.036
opus+0.035
 thief+0.035
 steak+0.035
 liar+0.034
Neg logit contributions
 called-0.020
 sorts-0.019
 when-0.018
 ahead-0.018
 painted-0.018
Token c
Token ID269
Feature activation+4.52e+00

Pos logit contributions
 shark+0.006
 steak+0.006
oly+0.006
otion+0.006
 liar+0.006
Neg logit contributions
 called-0.005
 past-0.005
 ahead-0.005
 shades-0.005
 sorts-0.005
Tokenactus
Token ID34144
Feature activation+2.77e+00

Pos logit contributions
 shark+0.042
 steak+0.037
 thief+0.032
 salad+0.029
 crab+0.028
Neg logit contributions
 sorts-0.160
 ahead-0.160
 called-0.154
 coloured-0.153
 shades-0.153
Token was
Token ID373
Feature activation+8.99e-01

Pos logit contributions
 shark+0.097
 steak+0.094
 thief+0.092
 salad+0.090
 crab+0.090
Neg logit contributions
 sorts-0.027
 ahead-0.026
 called-0.023
 coloured-0.022
 Finally-0.022
Token not
Token ID407
Feature activation+5.90e-01

Pos logit contributions
 shark+0.020
 steak+0.019
 thief+0.019
 salad+0.018
 crab+0.018
Neg logit contributions
 sorts-0.020
 ahead-0.019
 Finally-0.019
k-0.019
 shades-0.018
Token soft
Token ID2705
Feature activation+1.17e+00

Pos logit contributions
 shark+0.017
 steak+0.016
opus+0.016
 thief+0.015
 crab+0.015
Neg logit contributions
 sorts-0.010
 ahead-0.009
 called-0.009
 coloured-0.009
 past-0.009
Token like
Token ID588
Feature activation+9.64e-01

Pos logit contributions
 shark+0.040
opus+0.038
 steak+0.038
oly+0.038
 thief+0.037
Neg logit contributions
 ahead-0.012
 sorts-0.012
 called-0.011
 like-0.011
 over-0.011
Token!"
Token ID2474
Feature activation-9.40e-01

Pos logit contributions
 shark+0.018
 steak+0.017
 thief+0.016
 salad+0.016
 pasta+0.016
Neg logit contributions
 sorts-0.010
 ahead-0.009
 Finally-0.009
 coloured-0.009
k-0.009
Token␣
Token ID220
Feature activation-2.73e+00

Pos logit contributions
eter+0.022
kind+0.021
mediately+0.021
ruly+0.021
ist+0.021
Neg logit contributions
 steak-0.016
 crying-0.016
 mustache-0.015
 ow-0.015
uts-0.015
Token\n
Token ID198
Feature activation-4.75e-01

Pos logit contributions
 sorts+0.058
 coloured+0.057
 colours+0.054
 ahead+0.054
 shades+0.054
Neg logit contributions
 shark-0.060
 steak-0.059
otion-0.057
 thief-0.056
uts-0.055
Token\n
Token ID198
Feature activation-3.05e-02

Pos logit contributions
eter+0.012
 sorts+0.012
ples+0.012
 sizes+0.012
 colours+0.011
Neg logit contributions
 steak-0.008
uts-0.007
 shark-0.007
opus-0.007
 thief-0.007
TokenThe
Token ID464
Feature activation-2.35e-01

Pos logit contributions
eter+0.001
 sorts+0.001
 rays+0.001
 shades+0.001
ples+0.001
Neg logit contributions
 thief-0.001
uts-0.001
 virus-0.001
opus-0.001
coon-0.001
Token c
Token ID269
Feature activation+4.48e+00

Pos logit contributions
 called+0.005
 sorts+0.005
 past+0.005
 ahead+0.005
 shades+0.005
Neg logit contributions
 shark-0.006
 steak-0.005
 liar-0.005
 thief-0.005
 drown-0.005
Tokenactus
Token ID34144
Feature activation+1.62e+00

Pos logit contributions
 shark+0.040
 steak+0.036
 thief+0.032
 salad+0.029
 crab+0.027
Neg logit contributions
 sorts-0.159
 ahead-0.159
 called-0.153
 coloured-0.153
 shades-0.152
Token just
Token ID655
Feature activation-9.85e-01

Pos logit contributions
 shark+0.051
 steak+0.049
 thief+0.048
 crab+0.048
 salad+0.047
Neg logit contributions
 sorts-0.022
 ahead-0.020
k-0.020
 Finally-0.019
ples-0.019
Token stayed
Token ID9658
Feature activation+1.22e+00

Pos logit contributions
 sorts+0.019
eter+0.018
ris+0.018
 Finally+0.018
ushes+0.017
Neg logit contributions
 shark-0.025
 ow-0.024
 crab-0.023
 steak-0.023
 salad-0.022
Token still
Token ID991
Feature activation+2.69e-01

Pos logit contributions
 shark+0.029
 steak+0.028
 thief+0.027
 salad+0.026
 crab+0.026
Neg logit contributions
 sorts-0.025
 Finally-0.023
bye-0.023
eter-0.023
se-0.023
Token.
Token ID13
Feature activation+5.53e-02

Pos logit contributions
 shark+0.008
 steak+0.008
 thief+0.008
 salad+0.008
 crab+0.008
Neg logit contributions
 sorts-0.004
 ahead-0.004
 Finally-0.003
 coloured-0.003
 shades-0.003
Token play
Token ID711
Feature activation+5.49e-01

Pos logit contributions
 ahead+0.008
 sorts+0.007
 called+0.007
 past+0.007
 over+0.006
Neg logit contributions
opus-0.018
 shark-0.018
 thief-0.018
ascal-0.018
otion-0.018
Token.
Token ID13
Feature activation+5.67e-01

Pos logit contributions
 shark+0.022
opus+0.022
 liar+0.022
oly+0.022
 thief+0.022
Neg logit contributions
 called-0.002
 ahead-0.002
 past-0.002
 like-0.002
 when-0.002
Token She
Token ID1375
Feature activation+1.11e+00

Pos logit contributions
 shark+0.013
 steak+0.012
 thief+0.012
 crab+0.012
opus+0.012
Neg logit contributions
 sorts-0.012
 ahead-0.012
 called-0.011
 Finally-0.011
 shades-0.011
Token sees
Token ID7224
Feature activation+1.10e-01

Pos logit contributions
 shark+0.030
 steak+0.028
 thief+0.027
 salad+0.027
 crab+0.026
Neg logit contributions
 sorts-0.020
 ahead-0.020
eter-0.019
k-0.019
 Finally-0.019
Token the
Token ID262
Feature activation+7.68e-01

Pos logit contributions
ascal+0.003
 drown+0.003
 thief+0.003
opus+0.003
 shark+0.003
Neg logit contributions
 called-0.002
 sorts-0.002
 ahead-0.002
 over-0.002
 past-0.002
Token c
Token ID269
Feature activation+4.48e+00

Pos logit contributions
 shark+0.018
 steak+0.018
 liar+0.017
otion+0.017
oly+0.017
Neg logit contributions
 past-0.016
 ahead-0.016
 called-0.016
 coloured-0.015
 sorts-0.015
Tokenactus
Token ID34144
Feature activation+2.08e+00

Pos logit contributions
 shark+0.047
 steak+0.043
 thief+0.039
 liar+0.035
 salad+0.035
Neg logit contributions
 ahead-0.153
 sorts-0.151
 called-0.147
 shades-0.145
 coloured-0.145
Token and
Token ID290
Feature activation+3.00e-01

Pos logit contributions
 shark+0.073
 liar+0.072
oly+0.071
ascal+0.071
 thief+0.070
Neg logit contributions
 over-0.021
 ahead-0.021
 called-0.020
 sorts-0.018
 past-0.017
Token wants
Token ID3382
Feature activation+1.22e+00

Pos logit contributions
 shark+0.008
 steak+0.007
 salad+0.007
 thief+0.007
 crab+0.007
Neg logit contributions
eter-0.006
 sorts-0.006
 ahead-0.006
k-0.006
 Finally-0.005
Token to
Token ID284
Feature activation+3.66e-01

Pos logit contributions
 shark+0.038
 thief+0.036
 steak+0.036
opus+0.036
ascal+0.035
Neg logit contributions
 ahead-0.016
 sorts-0.016
 called-0.015
 past-0.014
 shades-0.014
Token touch
Token ID3638
Feature activation+1.41e-01

Pos logit contributions
 shark+0.010
 steak+0.009
 crab+0.009
 thief+0.009
 salad+0.009
Neg logit contributions
 sorts-0.007
 Finally-0.007
 ahead-0.007
mediately-0.006
eter-0.006
Token wanted
Token ID2227
Feature activation+7.78e-02

Pos logit contributions
 sorts+0.003
 ahead+0.002
 Finally+0.002
eter+0.002
antly+0.002
Neg logit contributions
 shark-0.003
 thief-0.003
 steak-0.003
 crab-0.003
 liar-0.003
Token to
Token ID284
Feature activation+2.86e-01

Pos logit contributions
 shark+0.002
 steak+0.002
 thief+0.002
 salad+0.002
 crab+0.002
Neg logit contributions
 sorts-0.001
 ahead-0.001
 called-0.001
 Finally-0.001
 coloured-0.001
Token play
Token ID711
Feature activation+9.27e-02

Pos logit contributions
 shark+0.007
 steak+0.007
 thief+0.007
 dessert+0.007
 crab+0.007
Neg logit contributions
 sorts-0.005
 ahead-0.005
 Finally-0.005
mediately-0.005
ples-0.005
Token with
Token ID351
Feature activation+4.62e-01

Pos logit contributions
 shark+0.003
 steak+0.003
 thief+0.003
ascal+0.003
opus+0.003
Neg logit contributions
 ahead-0.001
 sorts-0.001
 called-0.001
 coloured-0.001
 past-0.001
Token the
Token ID262
Feature activation+1.33e+00

Pos logit contributions
 shark+0.012
 steak+0.011
 thief+0.010
 salad+0.010
 pasta+0.010
Neg logit contributions
 sorts-0.009
 ahead-0.009
 Finally-0.009
mediately-0.009
k-0.008
Token oct
Token ID19318
Feature activation+4.47e+00

Pos logit contributions
 steak+0.029
 shark+0.029
ascal+0.029
opus+0.028
olo+0.028
Neg logit contributions
 sorts-0.027
 past-0.027
 ahead-0.026
 colours-0.026
 colors-0.025
Tokenopus
Token ID25790
Feature activation+3.34e-01

Pos logit contributions
 gag+0.047
 steak+0.045
 peas+0.043
 salad+0.039
 ow+0.039
Neg logit contributions
 ahead-0.143
 sorts-0.136
 called-0.136
k-0.134
ist-0.133
Token,
Token ID11
Feature activation+5.95e-02

Pos logit contributions
 shark+0.011
 steak+0.011
ascal+0.010
 thief+0.010
opus+0.010
Neg logit contributions
 sorts-0.004
 ahead-0.004
 called-0.004
 coloured-0.003
 past-0.003
Token but
Token ID475
Feature activation-9.57e-01

Pos logit contributions
 steak+0.002
 shark+0.002
opus+0.002
 gag+0.002
 thief+0.002
Neg logit contributions
 called-0.001
 ahead-0.001
 sorts-0.001
 over-0.001
 like-0.001
Token we
Token ID356
Feature activation+4.17e-01

Pos logit contributions
 sorts+0.022
 ahead+0.021
 called+0.020
 coloured+0.020
 shades+0.020
Neg logit contributions
 shark-0.021
 steak-0.020
 thief-0.019
 salad-0.018
 crab-0.018
Token broke
Token ID6265
Feature activation-1.59e+00

Pos logit contributions
 shark+0.011
 steak+0.011
 thief+0.011
 crab+0.010
 salad+0.010
Neg logit contributions
 sorts-0.007
 ahead-0.007
 Finally-0.007
 shades-0.006
 coloured-0.006
Token.
Token ID13
Feature activation+1.57e-01

Pos logit contributions
 sorts+0.020
 ahead+0.018
 Finally+0.018
 shades+0.017
 coloured+0.017
Neg logit contributions
 shark-0.049
 steak-0.048
 thief-0.046
 crab-0.045
 salad-0.045
Token He
Token ID679
Feature activation-2.10e-01

Pos logit contributions
 shark+0.004
 steak+0.004
 thief+0.003
 salad+0.003
 crab+0.003
Neg logit contributions
 sorts-0.003
 ahead-0.003
 Finally-0.003
 called-0.003
 shades-0.003
Token looked
Token ID3114
Feature activation-1.12e+00

Pos logit contributions
 sorts+0.004
 ahead+0.004
eter+0.003
 shades+0.003
k+0.003
Neg logit contributions
 shark-0.006
 steak-0.005
 thief-0.005
 crab-0.005
 mustache-0.005
Token at
Token ID379
Feature activation-4.26e-02

Pos logit contributions
 sorts+0.021
ually+0.020
bye+0.020
eter+0.019
 shades+0.019
Neg logit contributions
 shark-0.031
 steak-0.029
 thief-0.028
 salad-0.027
 mustache-0.027
Token the
Token ID262
Feature activation+8.94e-01

Pos logit contributions
 sorts+0.001
 ahead+0.001
 Finally+0.001
k+0.001
 coloured+0.001
Neg logit contributions
 shark-0.001
 steak-0.001
 thief-0.001
 crab-0.001
 salad-0.001
Token c
Token ID269
Feature activation+4.46e+00

Pos logit contributions
 shark+0.019
 steak+0.018
 thief+0.018
 salad+0.017
 dessert+0.017
Neg logit contributions
 sorts-0.021
 ahead-0.021
 called-0.020
 coloured-0.019
 shades-0.019
Tokenactus
Token ID34144
Feature activation+1.37e+00

Pos logit contributions
 shark+0.043
 steak+0.039
 thief+0.035
 salad+0.031
 dessert+0.030
Neg logit contributions
 sorts-0.156
 ahead-0.155
 called-0.150
 shades-0.149
 coloured-0.149
Token and
Token ID290
Feature activation+7.00e-02

Pos logit contributions
 shark+0.049
 steak+0.047
 thief+0.046
ascal+0.046
 liar+0.046
Neg logit contributions
 ahead-0.013
 called-0.012
 over-0.011
 sorts-0.011
 past-0.010
Token saw
Token ID2497
Feature activation-2.16e-01

Pos logit contributions
 shark+0.002
 steak+0.002
 crab+0.002
 thief+0.002
 salad+0.002
Neg logit contributions
 sorts-0.001
 ahead-0.001
eter-0.001
 Finally-0.001
mediately-0.001
Token a
Token ID257
Feature activation+6.07e-01

Pos logit contributions
 sorts+0.004
 ahead+0.004
k+0.004
 Finally+0.004
eter+0.004
Neg logit contributions
 shark-0.006
 steak-0.005
 thief-0.005
 crab-0.005
 ow-0.005
Token pink
Token ID11398
Feature activation+4.47e-01

Pos logit contributions
 shark+0.013
 steak+0.013
 thief+0.012
otion+0.012
ascal+0.012
Neg logit contributions
 sorts-0.013
 ahead-0.013
 called-0.013
 past-0.012
 coloured-0.012
Token patience
Token ID16336
Feature activation-2.72e-01

Pos logit contributions
 sorts+0.000
 ahead+0.000
 called+0.000
 coloured+0.000
 past+0.000
Neg logit contributions
 shark-0.000
opus-0.000
 thief-0.000
ascal-0.000
oly-0.000
Token,
Token ID11
Feature activation+3.29e-01

Pos logit contributions
 sorts+0.003
 ahead+0.003
 coloured+0.003
 shades+0.003
 Finally+0.003
Neg logit contributions
 shark-0.009
 steak-0.009
 thief-0.009
 salad-0.008
 crab-0.008
Token he
Token ID339
Feature activation+1.14e-01

Pos logit contributions
 shark+0.008
 thief+0.007
ascal+0.007
 steak+0.007
opus+0.007
Neg logit contributions
 ahead-0.007
 sorts-0.007
 called-0.006
 Finally-0.006
 coloured-0.006
Token noticed
Token ID6810
Feature activation-7.95e-01

Pos logit contributions
 shark+0.004
 steak+0.003
 thief+0.003
 crab+0.003
 salad+0.003
Neg logit contributions
 sorts-0.002
 ahead-0.002
 Finally-0.001
 shades-0.001
mediately-0.001
Token the
Token ID262
Feature activation+2.19e-01

Pos logit contributions
 sorts+0.016
k+0.015
 ahead+0.015
mediately+0.015
 Finally+0.015
Neg logit contributions
 shark-0.021
 steak-0.020
 pasta-0.019
 salad-0.018
 thief-0.018
Token c
Token ID269
Feature activation+4.44e+00

Pos logit contributions
 shark+0.005
 steak+0.005
 thief+0.004
 salad+0.004
 liar+0.004
Neg logit contributions
 sorts-0.005
 ahead-0.005
 called-0.005
 coloured-0.005
 shades-0.005
Tokenactus
Token ID34144
Feature activation+1.89e+00

Pos logit contributions
 shark+0.039
 steak+0.035
 thief+0.030
 salad+0.027
 crab+0.026
Neg logit contributions
 sorts-0.160
 ahead-0.159
 called-0.154
 coloured-0.153
 shades-0.153
Token was
Token ID373
Feature activation+3.97e-01

Pos logit contributions
 shark+0.059
 steak+0.056
 thief+0.055
 crab+0.053
 salad+0.053
Neg logit contributions
 sorts-0.026
 ahead-0.026
 called-0.023
 coloured-0.023
 shades-0.023
Token starting
Token ID3599
Feature activation-2.69e-01

Pos logit contributions
 shark+0.009
 steak+0.009
 thief+0.009
 salad+0.009
 dessert+0.009
Neg logit contributions
 sorts-0.008
 ahead-0.008
 Finally-0.008
k-0.008
eter-0.007
Token to
Token ID284
Feature activation-5.44e-02

Pos logit contributions
 sorts+0.004
 ahead+0.004
mediately+0.004
 Finally+0.004
k+0.004
Neg logit contributions
 shark-0.008
 steak-0.007
 thief-0.007
 salad-0.007
 crab-0.007
Token grow
Token ID1663
Feature activation+8.47e-01

Pos logit contributions
 sorts+0.001
 ahead+0.001
 called+0.001
 coloured+0.001
 shades+0.001
Neg logit contributions
 shark-0.001
 steak-0.001
opus-0.001
 thief-0.001
 crab-0.001
Token She
Token ID1375
Feature activation+3.13e-01

Pos logit contributions
 Finally+0.001
 After+0.001
 called+0.001
 ahead+0.001
 sorts+0.001
Neg logit contributions
 shark-0.003
 steak-0.003
 thief-0.003
opus-0.003
 crab-0.003
Token told
Token ID1297
Feature activation+4.86e-01

Pos logit contributions
 shark+0.008
 thief+0.008
 steak+0.008
 crab+0.007
 ow+0.007
Neg logit contributions
 sorts-0.006
 shades-0.006
eter-0.006
k-0.006
 ahead-0.006
Token him
Token ID683
Feature activation+1.03e+00

Pos logit contributions
 shark+0.013
opus+0.013
 steak+0.012
 thief+0.012
 liar+0.012
Neg logit contributions
 called-0.008
 ahead-0.008
 past-0.008
 over-0.008
 sorts-0.008
Token that
Token ID326
Feature activation+1.37e+00

Pos logit contributions
 shark+0.028
 steak+0.027
 thief+0.026
 salad+0.025
 crab+0.025
Neg logit contributions
 sorts-0.018
 ahead-0.018
 called-0.017
 Finally-0.016
 coloured-0.016
Token the
Token ID262
Feature activation+1.95e-01

Pos logit contributions
 shark+0.031
opus+0.031
ascal+0.030
 gag+0.029
 steak+0.029
Neg logit contributions
 called-0.029
 sorts-0.028
 when-0.028
 ahead-0.027
 over-0.027
Token c
Token ID269
Feature activation+4.43e+00

Pos logit contributions
 shark+0.004
 steak+0.004
opus+0.004
otion+0.004
ascal+0.004
Neg logit contributions
 ahead-0.004
 sorts-0.004
 called-0.004
 past-0.004
 coloured-0.004
Tokenactus
Token ID34144
Feature activation+2.68e+00

Pos logit contributions
 shark+0.047
 steak+0.042
 thief+0.037
 crab+0.034
 salad+0.034
Neg logit contributions
 ahead-0.150
 sorts-0.149
 called-0.144
 coloured-0.143
 Finally-0.143
Token had
Token ID550
Feature activation-3.44e-02

Pos logit contributions
 shark+0.097
 steak+0.094
 thief+0.091
ascal+0.089
opus+0.089
Neg logit contributions
 sorts-0.022
 ahead-0.022
 called-0.021
 coloured-0.018
 began-0.017
Token sharp
Token ID7786
Feature activation+8.89e-01

Pos logit contributions
 sorts+0.001
 ahead+0.001
 Finally+0.001
 called+0.001
 coloured+0.001
Neg logit contributions
 shark-0.001
 steak-0.001
 thief-0.001
 salad-0.001
 crab-0.001
Token spikes
Token ID27198
Feature activation+1.85e+00

Pos logit contributions
 shark+0.018
 steak+0.017
 thief+0.016
 pasta+0.016
 crab+0.016
Neg logit contributions
 sorts-0.022
 ahead-0.021
 Finally-0.021
mediately-0.020
ually-0.020
Token that
Token ID326
Feature activation+1.11e+00

Pos logit contributions
 shark+0.061
 steak+0.059
 thief+0.057
 crab+0.056
opus+0.056
Neg logit contributions
 ahead-0.022
 sorts-0.021
 called-0.020
 coloured-0.019
 over-0.019
Token play
Token ID711
Feature activation+4.88e-01

Pos logit contributions
 dessert+0.018
 steak+0.017
 salad+0.017
 thief+0.017
 shark+0.016
Neg logit contributions
 sorts-0.019
antly-0.019
 Finally-0.019
eter-0.018
ples-0.018
Token nicely
Token ID16576
Feature activation-5.24e-01

Pos logit contributions
 shark+0.016
 steak+0.016
 thief+0.015
 salad+0.015
 crab+0.015
Neg logit contributions
 sorts-0.006
 ahead-0.005
 called-0.005
 coloured-0.005
 Finally-0.005
Token together
Token ID1978
Feature activation-4.50e-01

Pos logit contributions
 sorts+0.007
 ahead+0.006
 Finally+0.006
 shades+0.006
 called+0.006
Neg logit contributions
 shark-0.017
 steak-0.016
 thief-0.016
 salad-0.015
 pasta-0.015
Token.
Token ID13
Feature activation-6.29e-02

Pos logit contributions
 sorts+0.006
 ahead+0.006
 Finally+0.005
 shades+0.005
 coloured+0.005
Neg logit contributions
 shark-0.014
 steak-0.013
 thief-0.013
 salad-0.013
 dessert-0.013
Token The
Token ID383
Feature activation+4.46e-01

Pos logit contributions
 sorts+0.001
 coloured+0.001
 ahead+0.001
 began+0.001
 past+0.001
Neg logit contributions
 shark-0.001
 steak-0.001
oly-0.001
ascal-0.001
 thief-0.001
Token oct
Token ID19318
Feature activation+4.42e+00

Pos logit contributions
 steak+0.007
otion+0.007
 gag+0.006
 thief+0.006
uts+0.006
Neg logit contributions
 sorts-0.013
 called-0.012
 ahead-0.012
 colours-0.012
 colors-0.012
Tokenopus
Token ID25790
Feature activation+9.28e-01

Pos logit contributions
 gag+0.052
 steak+0.049
 peas+0.044
 salad+0.044
 dessert+0.042
Neg logit contributions
 ahead-0.138
 sorts-0.134
k-0.133
 called-0.131
ist-0.129
Token was
Token ID373
Feature activation+5.38e-01

Pos logit contributions
 shark+0.029
 steak+0.028
 thief+0.027
 crab+0.027
 salad+0.026
Neg logit contributions
 sorts-0.013
 ahead-0.012
 shades-0.011
 Finally-0.011
 coloured-0.011
Token a
Token ID257
Feature activation-2.01e-01

Pos logit contributions
 shark+0.013
 steak+0.012
 thief+0.012
 salad+0.011
 pasta+0.011
Neg logit contributions
 sorts-0.012
 ahead-0.011
 Finally-0.011
 shades-0.011
 coloured-0.011
Token gift
Token ID6979
Feature activation-7.41e-01

Pos logit contributions
 sorts+0.004
 ahead+0.004
 called+0.004
 past+0.004
 coloured+0.004
Neg logit contributions
 shark-0.005
 steak-0.005
otion-0.005
 thief-0.004
 gag-0.004
Token for
Token ID329
Feature activation+1.03e+00

Pos logit contributions
 sorts+0.010
 ahead+0.010
 coloured+0.009
 Finally+0.009
 shades+0.009
Neg logit contributions
 shark-0.023
 steak-0.022
 thief-0.022
 salad-0.021
 crab-0.021
Token cut
Token ID2005
Feature activation-1.44e+00

Pos logit contributions
 shark+0.012
 salad+0.012
 steak+0.012
 pasta+0.011
 crab+0.011
Neg logit contributions
 Finally-0.007
 ahead-0.007
eter-0.007
mediately-0.007
 After-0.006
Token into
Token ID656
Feature activation-4.02e-01

Pos logit contributions
 sorts+0.026
 ahead+0.025
mediately+0.025
bye+0.024
k+0.024
Neg logit contributions
 steak-0.037
 shark-0.036
 pasta-0.035
 salad-0.035
 dessert-0.035
Token it
Token ID340
Feature activation-1.13e-01

Pos logit contributions
 sorts+0.007
 ahead+0.007
 Finally+0.007
 called+0.007
mediately+0.007
Neg logit contributions
 shark-0.011
 steak-0.010
 salad-0.010
 dessert-0.010
 pasta-0.010
Token.
Token ID13
Feature activation-2.12e-01

Pos logit contributions
 sorts+0.001
 ahead+0.001
mediately+0.001
 Finally+0.001
k+0.001
Neg logit contributions
 shark-0.004
 steak-0.004
 salad-0.003
 thief-0.003
 pasta-0.003
Token<|endoftext|>
Token ID50256
Feature activation+1.15e+00

Pos logit contributions
 coloured+0.004
 ahead+0.004
 sorts+0.004
k+0.004
 past+0.004
Neg logit contributions
 shark-0.005
 steak-0.005
 salad-0.005
 crab-0.005
 thief-0.005
TokenRu
Token ID40464
Feature activation-4.47e+00

Pos logit contributions
 shark+0.033
 steak+0.032
 thief+0.031
 salad+0.030
 crab+0.030
Neg logit contributions
 sorts-0.019
 ahead-0.019
 called-0.017
 Finally-0.017
 coloured-0.016
Tokenpert
Token ID11766
Feature activation-1.70e-01

Pos logit contributions
 sorts+0.086
cards+0.079
 called+0.078
 shades+0.078
 sizes+0.076
Neg logit contributions
 steak-0.109
 shark-0.102
 beetle-0.097
 crab-0.096
 sausage-0.096
Token was
Token ID373
Feature activation-1.56e-01

Pos logit contributions
 sorts+0.002
 ahead+0.002
eter+0.002
k+0.002
 shades+0.002
Neg logit contributions
 shark-0.005
 steak-0.005
 thief-0.005
 salad-0.005
 crab-0.005
Token a
Token ID257
Feature activation+3.27e-01

Pos logit contributions
 sorts+0.003
k+0.003
 shades+0.003
eter+0.003
ples+0.003
Neg logit contributions
 shark-0.003
 steak-0.003
 thief-0.003
 dessert-0.003
 pasta-0.003
Token happy
Token ID3772
Feature activation+7.39e-01

Pos logit contributions
 shark+0.008
 steak+0.007
 thief+0.007
 crab+0.007
 salad+0.007
Neg logit contributions
 sorts-0.007
 ahead-0.007
 coloured-0.007
 Finally-0.007
 shades-0.007
Token boy
Token ID2933
Feature activation-9.88e-01

Pos logit contributions
 shark+0.020
 gag+0.019
otion+0.019
 steak+0.019
 liar+0.019
Neg logit contributions
 ahead-0.013
 sorts-0.013
 called-0.012
 past-0.012
 coloured-0.011
Token with
Token ID351
Feature activation-4.47e-01

Pos logit contributions
 Finally+0.026
mediately+0.024
 sorts+0.024
se+0.024
k+0.024
Neg logit contributions
 shark-0.045
 steak-0.045
 dessert-0.043
 pasta-0.043
 salad-0.042
Token new
Token ID649
Feature activation-8.88e-01

Pos logit contributions
 Finally+0.008
mediately+0.008
 ahead+0.008
k+0.008
ually+0.008
Neg logit contributions
 shark-0.011
 steak-0.011
 salad-0.011
 dessert-0.010
 pasta-0.010
Token life
Token ID1204
Feature activation+6.59e-01

Pos logit contributions
 Finally+0.020
mediately+0.019
k+0.019
ually+0.019
 ahead+0.019
Neg logit contributions
 shark-0.020
 steak-0.019
 salad-0.019
 crab-0.018
 peas-0.018
Token!
Token ID0
Feature activation-1.52e+00

Pos logit contributions
 shark+0.020
 steak+0.020
 salad+0.019
 thief+0.019
 pasta+0.019
Neg logit contributions
 Finally-0.008
 sorts-0.008
mediately-0.008
 After-0.008
 ahead-0.008
Token<|endoftext|>
Token ID50256
Feature activation+1.03e+00

Pos logit contributions
 coloured+0.030
k+0.030
 ahead+0.029
 sorts+0.029
 past+0.028
Neg logit contributions
 shark-0.036
 steak-0.035
 pasta-0.032
 salad-0.031
 dessert-0.031
TokenFar
Token ID21428
Feature activation-4.22e+00

Pos logit contributions
 shark+0.027
 steak+0.026
 thief+0.025
 salad+0.025
 crab+0.025
Neg logit contributions
 sorts-0.019
 ahead-0.018
 called-0.017
 coloured-0.017
 Finally-0.017
Tokenmer
Token ID647
Feature activation-1.30e+00

Pos logit contributions
 sorts+0.077
 brushes+0.070
 shades+0.069
 Finally+0.068
ushes+0.068
Neg logit contributions
olph-0.110
 shark-0.106
ascal-0.095
 crab-0.095
 steak-0.094
Token Fred
Token ID8559
Feature activation-5.92e-01

Pos logit contributions
 sorts+0.022
 Finally+0.019
 ahead+0.019
mediately+0.019
eter+0.019
Neg logit contributions
 shark-0.038
 steak-0.036
 thief-0.035
 crab-0.035
 pasta-0.034
Token was
Token ID373
Feature activation-3.61e-01

Pos logit contributions
 sorts+0.010
eter+0.010
 Finally+0.010
mediately+0.010
 ahead+0.010
Neg logit contributions
 shark-0.016
 steak-0.015
 crab-0.014
 mustache-0.014
 thief-0.014
Token very
Token ID845
Feature activation+3.20e-01

Pos logit contributions
 sorts+0.007
k+0.007
eter+0.007
 shades+0.007
ples+0.007
Neg logit contributions
 shark-0.008
 steak-0.008
 thief-0.008
 dessert-0.008
 pasta-0.008
Token cheerful
Token ID37999
Feature activation+9.03e-01

Pos logit contributions
 shark+0.007
 steak+0.007
 crab+0.007
 thief+0.006
 salad+0.006
Neg logit contributions
 sorts-0.007
k-0.007
antly-0.007
 ahead-0.007
ples-0.007
Token voice
Token ID3809
Feature activation+3.91e-01

Pos logit contributions
 shark+0.001
 steak+0.001
 thief+0.001
 salad+0.001
 crab+0.001
Neg logit contributions
 sorts-0.001
 ahead-0.001
 called-0.001
 coloured-0.001
 shades-0.001
Token.
Token ID13
Feature activation+3.81e-01

Pos logit contributions
 shark+0.012
 steak+0.012
 thief+0.012
 crab+0.011
 salad+0.011
Neg logit contributions
 sorts-0.005
 ahead-0.005
 Finally-0.005
 coloured-0.004
 shades-0.004
Token But
Token ID887
Feature activation-8.63e-01

Pos logit contributions
 shark+0.011
 steak+0.011
 thief+0.011
 salad+0.010
 pasta+0.010
Neg logit contributions
 sorts-0.006
 ahead-0.006
 coloured-0.005
k-0.005
 called-0.005
Token then
Token ID788
Feature activation+2.85e-01

Pos logit contributions
 sorts+0.017
k+0.016
ples+0.016
 shades+0.016
 coloured+0.015
Neg logit contributions
 shark-0.021
 steak-0.021
 thief-0.020
 crab-0.020
 dessert-0.019
Token,
Token ID11
Feature activation-2.68e-01

Pos logit contributions
 shark+0.007
 steak+0.007
 thief+0.007
 crab+0.007
 salad+0.007
Neg logit contributions
 sorts-0.006
k-0.005
mediately-0.005
eter-0.005
 ahead-0.005
Token mirac
Token ID30352
Feature activation-4.07e+00

Pos logit contributions
 sorts+0.006
k+0.006
ples+0.005
eter+0.005
mediately+0.005
Neg logit contributions
 shark-0.006
 steak-0.006
 thief-0.006
 dessert-0.006
 salad-0.006
Tokenulously
Token ID18117
Feature activation+2.63e-01

Pos logit contributions
 sorts+0.066
 Finally+0.065
 brushes+0.063
ushes+0.055
 With+0.055
Neg logit contributions
 ow-0.101
 steak-0.094
 thief-0.094
 shark-0.093
coon-0.091
Token,
Token ID11
Feature activation+9.00e-02

Pos logit contributions
 shark+0.010
 steak+0.009
 thief+0.009
 crab+0.009
opus+0.009
Neg logit contributions
 ahead-0.002
 sorts-0.002
 called-0.002
 coloured-0.002
 Finally-0.002
Token it
Token ID340
Feature activation+1.83e+00

Pos logit contributions
 shark+0.002
 steak+0.002
 thief+0.002
 salad+0.002
 crab+0.002
Neg logit contributions
 sorts-0.002
k-0.002
 ahead-0.002
ples-0.002
mediately-0.002
Token returned
Token ID4504
Feature activation+7.32e-01

Pos logit contributions
 shark+0.058
 thief+0.055
 steak+0.054
 crab+0.052
 ow+0.052
Neg logit contributions
 sorts-0.026
 ahead-0.023
 Finally-0.022
k-0.022
eter-0.021
Token and
Token ID290
Feature activation+4.49e-01

Pos logit contributions
 thief+0.016
 shark+0.016
 steak+0.016
 crab+0.015
 salad+0.015
Neg logit contributions
 sorts-0.015
mediately-0.014
ples-0.014
 Finally-0.013
ually-0.013
Token was
Token ID373
Feature activation+1.00e-01

Pos logit contributions
 shark+0.001
 steak+0.001
 thief+0.001
 crab+0.001
 salad+0.001
Neg logit contributions
 sorts-0.001
 ahead-0.001
 Finally-0.001
eter-0.001
 shades-0.001
Token very
Token ID845
Feature activation+8.81e-01

Pos logit contributions
 shark+0.002
 thief+0.002
 steak+0.002
 dessert+0.002
 crab+0.002
Neg logit contributions
 sorts-0.002
 Finally-0.002
eter-0.002
 shades-0.002
k-0.002
Token happy
Token ID3772
Feature activation+7.58e-01

Pos logit contributions
 shark+0.019
 thief+0.017
 steak+0.017
 crab+0.017
 liar+0.016
Neg logit contributions
 sorts-0.020
antly-0.020
k-0.019
eter-0.019
 Finally-0.019
Token.
Token ID13
Feature activation-6.84e-01

Pos logit contributions
 shark+0.026
 steak+0.025
 thief+0.024
 salad+0.024
 crab+0.024
Neg logit contributions
 sorts-0.008
 ahead-0.008
 called-0.007
 coloured-0.007
 shades-0.007
Token<|endoftext|>
Token ID50256
Feature activation+8.70e-01

Pos logit contributions
 sorts+0.010
 ahead+0.010
 coloured+0.009
 called+0.009
 shades+0.009
Neg logit contributions
 shark-0.020
 steak-0.020
 thief-0.019
 salad-0.018
 crab-0.018
TokenFar
Token ID21428
Feature activation-4.00e+00

Pos logit contributions
 shark+0.025
 steak+0.024
 thief+0.024
 salad+0.023
 liar+0.023
Neg logit contributions
 ahead-0.014
 sorts-0.014
 called-0.013
 Finally-0.012
 coloured-0.012
Token away
Token ID1497
Feature activation+3.91e-01

Pos logit contributions
 sorts+0.075
 shades+0.072
 brushes+0.069
eter+0.069
 Finally+0.064
Neg logit contributions
 shark-0.103
olph-0.098
 steak-0.095
opus-0.092
 crab-0.091
Token in
Token ID287
Feature activation-1.59e+00

Pos logit contributions
 shark+0.012
 steak+0.011
 crab+0.011
 thief+0.011
opus+0.011
Neg logit contributions
 sorts-0.005
k-0.005
 ahead-0.005
 coloured-0.005
ples-0.005
Token the
Token ID262
Feature activation-4.34e-01

Pos logit contributions
 sorts+0.025
k+0.023
 ahead+0.023
eter+0.022
 coloured+0.022
Neg logit contributions
 shark-0.047
 steak-0.045
 thief-0.043
 salad-0.042
 crab-0.042
Token sky
Token ID6766
Feature activation-5.93e-01

Pos logit contributions
 sorts+0.009
 ahead+0.009
 called+0.008
 coloured+0.008
 past+0.008
Neg logit contributions
 shark-0.010
 steak-0.010
 thief-0.010
 liar-0.009
otion-0.009
Token was
Token ID373
Feature activation+5.01e-02

Pos logit contributions
 sorts+0.009
 Finally+0.008
ples+0.008
eter+0.008
k+0.008
Neg logit contributions
 shark-0.017
 steak-0.017
 salad-0.016
 pasta-0.016
 crab-0.016
Token loved
Token ID6151
Feature activation+8.02e-01

Pos logit contributions
 steak+0.013
 salad+0.013
 shark+0.013
 dessert+0.012
 pasta+0.012
Neg logit contributions
se-0.017
ris-0.016
k-0.016
ples-0.016
 Finally-0.016
Token.
Token ID13
Feature activation-2.17e-01

Pos logit contributions
 shark+0.025
 steak+0.025
 thief+0.024
 salad+0.023
 pasta+0.023
Neg logit contributions
 sorts-0.010
mediately-0.009
 ahead-0.009
 Finally-0.009
k-0.009
Token<|endoftext|>
Token ID50256
Feature activation+1.19e+00

Pos logit contributions
 coloured+0.005
k+0.004
 sorts+0.004
 ahead+0.004
ples+0.004
Neg logit contributions
 steak-0.005
 shark-0.005
 dessert-0.004
 thief-0.004
 salad-0.004
TokenOne
Token ID3198
Feature activation-1.83e+00

Pos logit contributions
 shark+0.033
 steak+0.032
 thief+0.031
 salad+0.030
 crab+0.030
Neg logit contributions
 sorts-0.020
 ahead-0.020
 called-0.018
 Finally-0.018
 coloured-0.018
Token day
Token ID1110
Feature activation+5.47e-01

Pos logit contributions
 Finally+0.032
k+0.032
ples+0.032
 sorts+0.031
antly+0.031
Neg logit contributions
 shark-0.048
 steak-0.047
 crab-0.044
 pasta-0.044
 salad-0.044
Token Far
Token ID6755
Feature activation-3.85e+00

Pos logit contributions
 shark+0.018
 steak+0.018
 thief+0.017
 crab+0.017
 salad+0.017
Neg logit contributions
 sorts-0.006
 ahead-0.006
k-0.005
 shades-0.005
eter-0.005
Tokenley
Token ID1636
Feature activation-2.09e-01

Pos logit contributions
eter+0.075
 sorts+0.074
 shades+0.072
 Finally+0.069
ples+0.068
Neg logit contributions
 shark-0.092
 steak-0.087
 thief-0.083
 sailor-0.082
olph-0.081
Token was
Token ID373
Feature activation-6.36e-01

Pos logit contributions
 sorts+0.003
 ahead+0.003
k+0.003
 Finally+0.003
eter+0.002
Neg logit contributions
 shark-0.006
 steak-0.006
 thief-0.006
 crab-0.006
 salad-0.006
Token playing
Token ID2712
Feature activation-7.26e-01

Pos logit contributions
 sorts+0.013
eter+0.013
ples+0.013
k+0.013
 shades+0.012
Neg logit contributions
 steak-0.015
 shark-0.014
 dessert-0.013
 thief-0.013
 pasta-0.013
Token with
Token ID351
Feature activation+1.65e-01

Pos logit contributions
 sorts+0.009
 ahead+0.009
 Finally+0.008
 shades+0.008
k+0.008
Neg logit contributions
 shark-0.023
 steak-0.022
 thief-0.022
 salad-0.021
 crab-0.021
Token his
Token ID465
Feature activation-1.00e+00

Pos logit contributions
 shark+0.004
 steak+0.004
 thief+0.004
opus+0.004
 liar+0.004
Neg logit contributions
 sorts-0.003
 ahead-0.003
 coloured-0.003
 called-0.003
 shades-0.003
Token."
Token ID526
Feature activation-1.92e-01

Pos logit contributions
 sorts+0.001
 ahead+0.001
 called+0.001
 coloured+0.001
 Finally+0.001
Neg logit contributions
 shark-0.003
 steak-0.003
 thief-0.003
 salad-0.003
 crab-0.003
Token\n
Token ID198
Feature activation-2.17e-01

Pos logit contributions
 sorts+0.003
 ahead+0.003
 coloured+0.003
 shades+0.003
 called+0.003
Neg logit contributions
 shark-0.005
 steak-0.005
 thief-0.005
 crab-0.005
 salad-0.005
Token\n
Token ID198
Feature activation-2.23e-02

Pos logit contributions
 sorts+0.003
 ahead+0.003
 called+0.003
 Finally+0.003
 coloured+0.003
Neg logit contributions
 shark-0.006
 steak-0.006
 thief-0.006
 salad-0.006
 crab-0.006
TokenSt
Token ID1273
Feature activation-2.14e+00

Pos logit contributions
 sorts+0.000
 ahead+0.000
 called+0.000
 Finally+0.000
 coloured+0.000
Neg logit contributions
 shark-0.001
 steak-0.001
 thief-0.001
 salad-0.001
 crab-0.001
Tokenri
Token ID380
Feature activation-2.76e+00

Pos logit contributions
 sorts+0.044
 ahead+0.044
k+0.041
 called+0.041
 Finally+0.041
Neg logit contributions
 shark-0.052
 steak-0.050
 thief-0.048
 salad-0.046
 crab-0.046
Tokenpe
Token ID431
Feature activation-3.82e+00

Pos logit contributions
 sorts+0.064
 ahead+0.061
 shades+0.059
 Finally+0.059
eter+0.057
Neg logit contributions
 shark-0.057
 steak-0.055
opus-0.054
 mustache-0.053
ascal-0.053
Tokeny
Token ID88
Feature activation-9.65e-01

Pos logit contributions
 sorts+0.077
 Finally+0.071
 ahead+0.070
eter+0.069
 shades+0.066
Neg logit contributions
 shark-0.087
olph-0.086
 mustache-0.084
 steak-0.083
 crab-0.082
Token felt
Token ID2936
Feature activation-3.91e-01

Pos logit contributions
 sorts+0.016
 ahead+0.014
 shades+0.014
 coloured+0.013
 Finally+0.013
Neg logit contributions
 shark-0.028
 steak-0.027
 thief-0.026
 crab-0.026
 salad-0.026
Token sad
Token ID6507
Feature activation+9.88e-01

Pos logit contributions
 sorts+0.010
 ahead+0.009
 coloured+0.009
 shades+0.009
 called+0.009
Neg logit contributions
 shark-0.008
 steak-0.008
 thief-0.007
 crab-0.007
 salad-0.007
Token because
Token ID780
Feature activation+1.19e+00

Pos logit contributions
 shark+0.035
 steak+0.034
opus+0.034
 thief+0.034
 gag+0.033
Neg logit contributions
 ahead-0.010
 called-0.009
 when-0.009
 over-0.008
 like-0.008
Token he
Token ID339
Feature activation+8.75e-02

Pos logit contributions
 shark+0.022
 steak+0.020
 salad+0.020
 gag+0.020
 thief+0.020
Neg logit contributions
 ahead-0.031
 called-0.031
 sorts-0.030
 when-0.028
 over-0.028
Token very
Token ID845
Feature activation+8.19e-01

Pos logit contributions
 shark+0.005
 steak+0.004
 thief+0.004
 salad+0.004
 dessert+0.004
Neg logit contributions
 sorts-0.003
 ahead-0.003
antly-0.003
eter-0.003
 Finally-0.003
Token fast
Token ID3049
Feature activation+8.43e-01

Pos logit contributions
 shark+0.019
 steak+0.018
 thief+0.017
 salad+0.017
 crab+0.017
Neg logit contributions
 sorts-0.019
 ahead-0.018
antly-0.017
 shades-0.017
 coloured-0.017
Token and
Token ID290
Feature activation-2.47e-01

Pos logit contributions
 shark+0.023
 steak+0.022
 thief+0.022
 crab+0.020
 salad+0.020
Neg logit contributions
 sorts-0.015
 Finally-0.014
mediately-0.014
 shades-0.013
antly-0.013
Token always
Token ID1464
Feature activation+3.71e-01

Pos logit contributions
 sorts+0.004
antly+0.004
ples+0.004
eter+0.004
 shades+0.004
Neg logit contributions
 shark-0.006
 steak-0.006
 thief-0.006
 crab-0.006
 salad-0.006
Token won
Token ID1839
Feature activation-1.18e+00

Pos logit contributions
 shark+0.010
 thief+0.009
 crab+0.009
 steak+0.009
 liar+0.008
Neg logit contributions
 Finally-0.007
antly-0.007
 sorts-0.007
eter-0.007
mediately-0.006
Token the
Token ID262
Feature activation-3.80e+00

Pos logit contributions
 sorts+0.018
 ahead+0.017
 Finally+0.016
 coloured+0.016
mediately+0.016
Neg logit contributions
 shark-0.035
 steak-0.033
 thief-0.032
 crab-0.031
 salad-0.031
Token races
Token ID9558
Feature activation-1.15e+00

Pos logit contributions
 ahead+0.068
 sorts+0.067
 past+0.065
 called+0.062
 coloured+0.061
Neg logit contributions
 shark-0.099
 steak-0.094
otion-0.093
 thief-0.092
ascal-0.092
Token.
Token ID13
Feature activation+5.13e-01

Pos logit contributions
 sorts+0.015
 Finally+0.015
mediately+0.014
antly+0.014
 coloured+0.014
Neg logit contributions
 shark-0.036
 steak-0.034
 thief-0.034
 crab-0.033
 salad-0.032
Token\n
Token ID198
Feature activation+2.00e-01

Pos logit contributions
 shark+0.016
 steak+0.015
 thief+0.015
 crab+0.014
 salad+0.014
Neg logit contributions
 sorts-0.007
 ahead-0.007
 called-0.006
 Finally-0.006
 coloured-0.006
Token\n
Token ID198
Feature activation+5.37e-01

Pos logit contributions
 shark+0.007
 thief+0.006
 steak+0.006
 salad+0.006
 mustache+0.006
Neg logit contributions
 sorts-0.002
 ahead-0.002
 called-0.002
 Finally-0.002
 After-0.002
TokenThe
Token ID464
Feature activation-9.59e-02

Pos logit contributions
 shark+0.018
 steak+0.017
 thief+0.017
 crab+0.017
 salad+0.016
Neg logit contributions
 sorts-0.006
 ahead-0.006
 called-0.006
 Finally-0.005
 coloured-0.005
Token,
Token ID11
Feature activation-5.87e-01

Pos logit contributions
 sorts+0.011
 ahead+0.011
 called+0.009
 coloured+0.009
 Finally+0.009
Neg logit contributions
 shark-0.039
 steak-0.038
 thief-0.037
 salad-0.036
 crab-0.036
Token Tim
Token ID5045
Feature activation-8.94e-02

Pos logit contributions
 sorts+0.013
 ahead+0.012
k+0.012
 Finally+0.012
 shades+0.012
Neg logit contributions
 shark-0.014
 steak-0.013
 thief-0.013
 salad-0.012
 pasta-0.012
Tokenmy
Token ID1820
Feature activation-1.51e-02

Pos logit contributions
 Finally+0.001
 sorts+0.001
 ahead+0.001
 After+0.001
 grace+0.001
Neg logit contributions
 shark-0.003
 steak-0.003
 mustache-0.003
 thief-0.002
 pasta-0.002
Token,
Token ID11
Feature activation-4.44e-01

Pos logit contributions
amed+0.000
 sorts+0.000
 Finally+0.000
ushes+0.000
mediately+0.000
Neg logit contributions
 shark-0.000
 steak-0.000
 mustache-0.000
 thief-0.000
 pasta-0.000
Token sometimes
Token ID3360
Feature activation+6.01e-01

Pos logit contributions
 ahead+0.008
 sorts+0.007
 called+0.007
 coloured+0.007
 past+0.007
Neg logit contributions
 shark-0.012
 steak-0.012
 thief-0.011
opus-0.011
 gag-0.011
Token elev
Token ID7662
Feature activation-3.75e+00

Pos logit contributions
 shark+0.014
 steak+0.013
 thief+0.013
 crab+0.012
 salad+0.012
Neg logit contributions
 sorts-0.013
 Finally-0.012
 ahead-0.012
k-0.012
mediately-0.012
Tokenators
Token ID2024
Feature activation-2.14e+00

Pos logit contributions
 sorts+0.087
 Finally+0.080
 After+0.080
 grace+0.078
 called+0.076
Neg logit contributions
 shark-0.073
oly-0.071
otion-0.070
olph-0.068
ascal-0.066
Token can
Token ID460
Feature activation-4.78e-01

Pos logit contributions
 sorts+0.042
 ahead+0.039
 Finally+0.039
 shades+0.038
 coloured+0.038
Neg logit contributions
 shark-0.053
 steak-0.050
 thief-0.049
 crab-0.047
 dessert-0.047
Token be
Token ID307
Feature activation-5.19e-01

Pos logit contributions
 sorts+0.010
 ahead+0.009
 Finally+0.009
eter+0.009
mediately+0.009
Neg logit contributions
 shark-0.012
 steak-0.011
 thief-0.011
 ow-0.010
 crab-0.010
Token noisy
Token ID31210
Feature activation+6.79e-01

Pos logit contributions
 Finally+0.012
 sorts+0.012
bye+0.012
antly+0.012
cy+0.011
Neg logit contributions
 thief-0.010
 steak-0.010
 shark-0.010
 dessert-0.009
 ow-0.009
Token because
Token ID780
Feature activation+8.16e-01

Pos logit contributions
 shark+0.019
 steak+0.019
 thief+0.018
 salad+0.017
 pasta+0.017
Neg logit contributions
 sorts-0.011
 Finally-0.011
 ahead-0.010
mediately-0.010
eter-0.010
Token said
Token ID531
Feature activation-1.41e-01

Pos logit contributions
 shark+0.003
 steak+0.003
 thief+0.003
 crab+0.003
 salad+0.002
Neg logit contributions
 sorts-0.002
eter-0.002
 ahead-0.002
 Finally-0.002
 After-0.002
Token together
Token ID1978
Feature activation-2.42e-01

Pos logit contributions
 ahead+0.002
 sorts+0.002
 called+0.001
 Finally+0.001
 coloured+0.001
Neg logit contributions
 shark-0.005
 steak-0.005
 thief-0.004
opus-0.004
 salad-0.004
Token,
Token ID11
Feature activation-4.58e-01

Pos logit contributions
 sorts+0.003
 ahead+0.003
 Finally+0.003
mediately+0.003
k+0.003
Neg logit contributions
 shark-0.008
 steak-0.008
 thief-0.007
 salad-0.007
 pasta-0.007
Token "
Token ID366
Feature activation-2.33e-01

Pos logit contributions
 ahead+0.007
 sorts+0.007
 called+0.006
 Finally+0.006
 After+0.006
Neg logit contributions
 shark-0.014
 steak-0.013
 thief-0.013
 crab-0.013
opus-0.013
TokenA
Token ID32
Feature activation-7.53e-01

Pos logit contributions
 sorts+0.005
 ahead+0.004
 called+0.004
 coloured+0.004
 Finally+0.004
Neg logit contributions
 shark-0.006
 steak-0.006
 thief-0.005
 crab-0.005
opus-0.005
Tokenbr
Token ID1671
Feature activation-3.74e+00

Pos logit contributions
 sorts+0.015
eter+0.015
mediately+0.014
 Finally+0.014
 ahead+0.014
Neg logit contributions
 shark-0.019
 steak-0.018
 thief-0.017
opus-0.017
 virus-0.017
Tokenac
Token ID330
Feature activation-1.52e+00

Pos logit contributions
 sorts+0.063
 ahead+0.060
 began+0.057
 called+0.056
 shades+0.056
Neg logit contributions
 shark-0.098
ascal-0.096
oly-0.095
 steak-0.095
otion-0.091
Tokenad
Token ID324
Feature activation+4.97e-01

Pos logit contributions
 Finally+0.036
 After+0.034
 ahead+0.034
 sorts+0.034
eter+0.032
Neg logit contributions
 steak-0.031
 shark-0.031
 thief-0.030
 ow-0.029
 mustache-0.029
Tokenab
Token ID397
Feature activation-2.98e+00

Pos logit contributions
 shark+0.015
 steak+0.015
 thief+0.014
 salad+0.014
 crab+0.014
Neg logit contributions
 sorts-0.007
 ahead-0.007
 called-0.006
 Finally-0.006
 coloured-0.006
Tokenra
Token ID430
Feature activation-6.36e-01

Pos logit contributions
 sorts+0.057
 Finally+0.048
 After+0.046
mediately+0.046
 called+0.046
Neg logit contributions
 shark-0.081
 crab-0.076
 steak-0.076
opus-0.075
 mustache-0.074
Token!"
Token ID2474
Feature activation-3.99e-01

Pos logit contributions
 sorts+0.010
 Finally+0.009
 ahead+0.009
 shades+0.009
eter+0.009
Neg logit contributions
 shark-0.019
 steak-0.018
 thief-0.017
opus-0.017
otion-0.017
Token\n
Token ID198
Feature activation+7.20e-01

Pos logit contributions
 shark+0.013
 steak+0.013
 thief+0.013
 salad+0.012
 crab+0.012
Neg logit contributions
 ahead-0.005
 sorts-0.005
 Finally-0.005
 called-0.005
 After-0.005
TokenThe
Token ID464
Feature activation+1.38e-01

Pos logit contributions
 shark+0.025
 steak+0.024
 salad+0.023
 thief+0.023
 pasta+0.023
Neg logit contributions
 ahead-0.008
 sorts-0.008
 Finally-0.008
 called-0.007
 After-0.007
Token cat
Token ID3797
Feature activation+8.05e-01

Pos logit contributions
 shark+0.003
 steak+0.003
 crab+0.003
opus+0.003
ascal+0.003
Neg logit contributions
 ahead-0.003
 sorts-0.003
 called-0.003
 past-0.003
 coloured-0.003
Token grinned
Token ID36268
Feature activation-8.88e-01

Pos logit contributions
 shark+0.025
 steak+0.024
 thief+0.023
 crab+0.023
 salad+0.022
Neg logit contributions
 sorts-0.012
 ahead-0.011
 shades-0.010
 coloured-0.010
 Finally-0.010
Token mis
Token ID2984
Feature activation-1.56e+00

Pos logit contributions
 sorts+0.018
 brushes+0.016
 Finally+0.016
ually+0.016
 coloured+0.016
Neg logit contributions
 shark-0.024
 steak-0.022
 thief-0.021
 crab-0.020
 salad-0.020
Tokenchie
Token ID3043
Feature activation-3.71e+00

Pos logit contributions
 Finally+0.034
 sorts+0.034
 ahead+0.033
 coloured+0.032
 shades+0.032
Neg logit contributions
 thief-0.032
 shark-0.032
 liar-0.031
ascal-0.031
 crab-0.030
Tokenv
Token ID85
Feature activation-6.30e-01

Pos logit contributions
 brushes+0.084
eter+0.084
 Finally+0.082
ples+0.082
ushes+0.080
Neg logit contributions
 shark-0.065
 steak-0.063
 thief-0.062
 crab-0.062
opus-0.061
Tokenously
Token ID3481
Feature activation-1.31e+00

Pos logit contributions
 sorts+0.012
 ahead+0.012
 shades+0.011
 coloured+0.011
 Finally+0.011
Neg logit contributions
 shark-0.016
 steak-0.015
 thief-0.015
 crab-0.014
 liar-0.014
Token.
Token ID13
Feature activation-8.27e-02

Pos logit contributions
 sorts+0.025
 shades+0.022
eter+0.021
 coloured+0.021
 brushes+0.020
Neg logit contributions
 steak-0.034
 thief-0.033
 shark-0.033
 mustache-0.032
 salad-0.031
Token "
Token ID366
Feature activation-3.87e-01

Pos logit contributions
 sorts+0.001
 ahead+0.001
 coloured+0.001
k+0.001
 shades+0.001
Neg logit contributions
 shark-0.002
 steak-0.002
 thief-0.002
 salad-0.002
 crab-0.002
TokenNever
Token ID12295
Feature activation-8.15e-01

Pos logit contributions
 sorts+0.008
 ahead+0.008
 called+0.008
 Finally+0.008
 shades+0.008
Neg logit contributions
 shark-0.009
 steak-0.008
 thief-0.008
 crab-0.008
opus-0.008
TokenCan
Token ID6090
Feature activation+4.91e-01

Pos logit contributions
 sorts+0.004
 ahead+0.004
 called+0.004
ually+0.004
 Finally+0.004
Neg logit contributions
 shark-0.005
 steak-0.005
 crab-0.005
 pasta-0.004
 thief-0.004
Token we
Token ID356
Feature activation-5.22e-01

Pos logit contributions
 shark+0.010
 steak+0.010
 thief+0.009
ascal+0.009
 liar+0.009
Neg logit contributions
 ahead-0.012
 called-0.011
 past-0.011
 sorts-0.011
 Finally-0.011
Token feed
Token ID3745
Feature activation-1.01e-01

Pos logit contributions
 sorts+0.010
antly+0.010
mediately+0.010
 coloured+0.010
 Finally+0.009
Neg logit contributions
 shark-0.013
 crab-0.012
 steak-0.012
 thief-0.012
 salad-0.012
Token the
Token ID262
Feature activation-9.43e-01

Pos logit contributions
 sorts+0.002
 ahead+0.001
 Finally+0.001
 coloured+0.001
 called+0.001
Neg logit contributions
 shark-0.003
 steak-0.003
 thief-0.003
 salad-0.003
 crab-0.003
Token ze
Token ID41271
Feature activation+1.98e+00

Pos logit contributions
 sorts+0.020
 ahead+0.020
 Finally+0.019
 coloured+0.018
mediately+0.018
Neg logit contributions
 shark-0.022
 steak-0.021
 thief-0.020
 crab-0.020
 salad-0.019
Tokenbr
Token ID1671
Feature activation-3.65e+00

Pos logit contributions
 shark+0.057
 steak+0.055
 thief+0.053
 gag+0.052
 virus+0.052
Neg logit contributions
 ahead-0.031
 sorts-0.030
 called-0.030
 past-0.028
k-0.027
Tokenas
Token ID292
Feature activation-6.87e-01

Pos logit contributions
 ahead+0.063
 sorts+0.062
 began+0.059
 admired+0.059
 Finally+0.058
Neg logit contributions
 shark-0.095
 steak-0.088
opus-0.088
ascal-0.087
 crab-0.087
Token,
Token ID11
Feature activation+8.84e-02

Pos logit contributions
 ahead+0.009
 sorts+0.009
 called+0.008
 coloured+0.008
 past+0.008
Neg logit contributions
 shark-0.021
 steak-0.021
 thief-0.020
 liar-0.020
 virus-0.020
Token Mom
Token ID11254
Feature activation+1.02e+00

Pos logit contributions
 liar+0.003
 thief+0.003
 shark+0.003
 virus+0.002
ascal+0.002
Neg logit contributions
 called-0.001
 ahead-0.001
 past-0.001
 sorts-0.001
 over-0.001
Token?"
Token ID1701
Feature activation+2.36e-01

Pos logit contributions
 shark+0.024
 steak+0.023
 thief+0.022
 salad+0.022
 crab+0.022
Neg logit contributions
 sorts-0.021
 ahead-0.021
 Finally-0.020
 coloured-0.019
 shades-0.019
Token Lily
Token ID20037
Feature activation-3.24e-01

Pos logit contributions
 shark+0.004
 steak+0.004
 thief+0.004
 liar+0.004
opus+0.004
Neg logit contributions
 ahead-0.006
 called-0.006
 sorts-0.006
 Finally-0.006
 After-0.006
Token Weight
Token ID14331
Feature activation-6.04e-01

Pos logit contributions
 sorts+0.016
 ahead+0.016
 coloured+0.015
 called+0.014
k+0.014
Neg logit contributions
 shark-0.019
 steak-0.018
 thief-0.018
 salad-0.017
 pasta-0.017
Token:
Token ID25
Feature activation-3.80e-01

Pos logit contributions
 sorts+0.010
 Finally+0.010
 ahead+0.010
eter+0.009
mediately+0.009
Neg logit contributions
 shark-0.016
 steak-0.015
 pasta-0.015
opus-0.015
 salad-0.014
Token 15
Token ID1315
Feature activation-6.25e-01

Pos logit contributions
 sorts+0.007
 ahead+0.007
 coloured+0.006
mediately+0.006
k+0.006
Neg logit contributions
 shark-0.010
 steak-0.010
 thief-0.009
 salad-0.009
 pasta-0.009
Token kilograms
Token ID37075
Feature activation-1.92e-01

Pos logit contributions
 sorts+0.010
 ahead+0.010
 Finally+0.009
mediately+0.009
k+0.009
Neg logit contributions
 shark-0.017
 steak-0.017
 pasta-0.016
 salad-0.016
 thief-0.016
Token.
Token ID13
Feature activation-7.70e-01

Pos logit contributions
 sorts+0.003
 ahead+0.003
 Finally+0.003
mediately+0.003
 admired+0.002
Neg logit contributions
 shark-0.006
 steak-0.006
 salad-0.005
 pasta-0.005
 thief-0.005
Token Average
Token ID13475
Feature activation-3.65e+00

Pos logit contributions
 sorts+0.015
 ahead+0.015
 coloured+0.014
 called+0.014
 shades+0.014
Neg logit contributions
 shark-0.019
 steak-0.018
 thief-0.018
 salad-0.017
 crab-0.017
Token for
Token ID329
Feature activation-3.16e-01

Pos logit contributions
 sorts+0.071
house+0.068
mediately+0.063
 Finally+0.062
antly+0.062
Neg logit contributions
 shark-0.091
 steak-0.088
 pasta-0.086
 salad-0.084
 dessert-0.083
Token age
Token ID2479
Feature activation+1.47e-01

Pos logit contributions
 sorts+0.006
 ahead+0.006
 Finally+0.006
mediately+0.006
k+0.006
Neg logit contributions
 shark-0.008
 steak-0.008
 thief-0.007
 salad-0.007
 pasta-0.007
Token:
Token ID25
Feature activation+3.05e-01

Pos logit contributions
 steak+0.004
 shark+0.004
 thief+0.004
 pasta+0.004
 mustache+0.004
Neg logit contributions
 sorts-0.002
mediately-0.002
 Finally-0.002
 coloured-0.002
 After-0.002
Token 3
Token ID513
Feature activation-3.95e-01

Pos logit contributions
 steak+0.008
 shark+0.008
 thief+0.007
 pasta+0.007
 salad+0.007
Neg logit contributions
 sorts-0.006
mediately-0.005
 ahead-0.005
 coloured-0.005
 Finally-0.005
Token years
Token ID812
Feature activation+1.12e+00

Pos logit contributions
 sorts+0.008
 ahead+0.008
 Finally+0.008
mediately+0.008
 After+0.007
Neg logit contributions
 steak-0.009
 shark-0.009
 mustache-0.009
 thief-0.009
 liar-0.009
Token over
Token ID625
Feature activation+4.58e-01

Pos logit contributions
antly+0.042
eter+0.041
 Finally+0.041
ually+0.038
Ok+0.035
Neg logit contributions
 shark-0.051
 steak-0.048
 pasta-0.047
 dessert-0.046
 salad-0.046
Token his
Token ID465
Feature activation-4.32e-01

Pos logit contributions
 shark+0.010
 steak+0.009
 crab+0.008
 pasta+0.008
 thief+0.008
Neg logit contributions
 sorts-0.011
 Finally-0.011
ually-0.011
mediately-0.011
 ahead-0.010
Token body
Token ID1767
Feature activation+5.12e-01

Pos logit contributions
 sorts+0.010
 ahead+0.009
mediately+0.009
 coloured+0.009
 Finally+0.009
Neg logit contributions
 shark-0.010
 steak-0.010
 thief-0.009
 mustache-0.009
 ow-0.009
Token.
Token ID13
Feature activation+1.61e-01

Pos logit contributions
 shark+0.015
 steak+0.015
 pasta+0.014
 thief+0.014
 salad+0.014
Neg logit contributions
 sorts-0.007
 Finally-0.007
mediately-0.006
 ahead-0.006
eter-0.006
Token Stri
Token ID26137
Feature activation-1.58e-01

Pos logit contributions
 shark+0.005
 steak+0.004
 thief+0.004
 salad+0.004
 pasta+0.004
Neg logit contributions
 sorts-0.003
 ahead-0.003
 coloured-0.002
k-0.002
 shades-0.002
Tokenpe
Token ID431
Feature activation-3.63e+00

Pos logit contributions
 ahead+0.004
 sorts+0.004
 called+0.004
 coloured+0.004
k+0.004
Neg logit contributions
 shark-0.003
 steak-0.003
 thief-0.002
 salad-0.002
 crab-0.002
Tokeny
Token ID88
Feature activation-4.54e-01

Pos logit contributions
 Finally+0.068
eter+0.068
 ahead+0.067
 sorts+0.067
 lower+0.062
Neg logit contributions
 shark-0.082
olph-0.082
 mustache-0.080
 steak-0.077
ascal-0.077
Token liked
Token ID8288
Feature activation+2.95e-01

Pos logit contributions
 sorts+0.007
eter+0.006
 ahead+0.006
ples+0.006
 shades+0.005
Neg logit contributions
 shark-0.014
 steak-0.013
 thief-0.013
 mustache-0.013
 crab-0.013
Token to
Token ID284
Feature activation+8.11e-01

Pos logit contributions
 shark+0.010
otion+0.010
 thief+0.010
opus+0.010
 gag+0.010
Neg logit contributions
 sorts-0.003
 called-0.003
 past-0.002
 ahead-0.002
 over-0.002
Token eat
Token ID4483
Feature activation+1.16e+00

Pos logit contributions
 shark+0.020
 steak+0.019
 thief+0.018
 mustache+0.018
 crab+0.018
Neg logit contributions
 sorts-0.017
 ahead-0.016
 Finally-0.016
eter-0.016
antly-0.015
Token grass
Token ID8701
Feature activation-3.22e-01

Pos logit contributions
 shark+0.028
 thief+0.027
 steak+0.027
opus+0.025
 liar+0.025
Neg logit contributions
 sorts-0.024
 ahead-0.023
 called-0.022
 coloured-0.022
 past-0.022
Token She
Token ID1375
Feature activation+4.95e-01

Pos logit contributions
 sorts+0.003
 ahead+0.003
 Finally+0.002
 called+0.002
 After+0.002
Neg logit contributions
 shark-0.006
 steak-0.006
 thief-0.006
 crab-0.005
 salad-0.005
Token said
Token ID531
Feature activation+5.00e-03

Pos logit contributions
 shark+0.013
 steak+0.012
 thief+0.012
 crab+0.012
 ow+0.012
Neg logit contributions
 sorts-0.010
 Finally-0.009
 ahead-0.009
 After-0.008
bye-0.008
Token,
Token ID11
Feature activation+5.87e-01

Pos logit contributions
 shark+0.000
ascal+0.000
 steak+0.000
 thief+0.000
 virus+0.000
Neg logit contributions
 ahead-0.000
 called-0.000
 past-0.000
 over-0.000
 when-0.000
Token "
Token ID366
Feature activation-7.34e-01

Pos logit contributions
 shark+0.019
 steak+0.019
 thief+0.019
opus+0.018
 salad+0.018
Neg logit contributions
 ahead-0.007
 called-0.007
 over-0.006
 Finally-0.006
 past-0.006
TokenA
Token ID32
Feature activation-4.79e-01

Pos logit contributions
 ahead+0.014
 sorts+0.013
k+0.013
 coloured+0.012
 shades+0.012
Neg logit contributions
 shark-0.020
 steak-0.019
 thief-0.018
 salad-0.018
 pasta-0.018
Tokenbr
Token ID1671
Feature activation-3.63e+00

Pos logit contributions
 sorts+0.010
 ahead+0.010
 called+0.009
 coloured+0.009
 past+0.009
Neg logit contributions
 shark-0.011
 steak-0.011
 thief-0.010
ascal-0.010
 gag-0.010
Tokenac
Token ID330
Feature activation-1.22e+00

Pos logit contributions
 sorts+0.060
 ahead+0.056
 called+0.054
 began+0.054
 shades+0.053
Neg logit contributions
 shark-0.099
ascal-0.094
 steak-0.094
oly-0.093
otion-0.092
Tokenad
Token ID324
Feature activation+4.99e-01

Pos logit contributions
 Finally+0.026
 sorts+0.026
 ahead+0.025
 After+0.024
 shades+0.024
Neg logit contributions
 shark-0.029
 steak-0.027
 thief-0.027
 ow-0.026
 crab-0.026
Tokenab
Token ID397
Feature activation-2.89e+00

Pos logit contributions
 shark+0.015
 steak+0.015
 thief+0.014
 salad+0.014
 crab+0.014
Neg logit contributions
 sorts-0.007
 ahead-0.007
 called-0.006
 coloured-0.006
 Finally-0.006
Tokenra
Token ID430
Feature activation-5.82e-01

Pos logit contributions
 sorts+0.048
 Finally+0.041
 ahead+0.039
 After+0.038
 called+0.038
Neg logit contributions
 shark-0.087
 steak-0.082
 crab-0.082
 thief-0.080
opus-0.080
Token!"
Token ID2474
Feature activation-5.17e-02

Pos logit contributions
 sorts+0.009
 Finally+0.008
 shades+0.008
eter+0.008
 ahead+0.007
Neg logit contributions
 shark-0.018
 steak-0.017
 thief-0.016
opus-0.016
 mustache-0.016
Token a
Token ID257
Feature activation+5.00e-01

Pos logit contributions
 sorts+0.004
 ahead+0.004
 coloured+0.003
 called+0.003
 Finally+0.003
Neg logit contributions
 shark-0.016
opus-0.015
 thief-0.015
ascal-0.015
 liar-0.015
Token z
Token ID1976
Feature activation-7.41e-01

Pos logit contributions
otion+0.014
 gag+0.013
 shark+0.013
uts+0.013
 steak+0.013
Neg logit contributions
 called-0.008
 ahead-0.008
 past-0.008
 sorts-0.007
 loved-0.007
Tokenebra
Token ID37052
Feature activation-1.76e+00

Pos logit contributions
 ahead+0.016
 sorts+0.016
 called+0.015
k+0.015
 coloured+0.015
Neg logit contributions
 shark-0.017
 steak-0.016
 thief-0.015
 salad-0.015
 pasta-0.015
Token named
Token ID3706
Feature activation+5.06e-01

Pos logit contributions
 sorts+0.026
eter+0.024
 Finally+0.024
mediately+0.024
 shades+0.024
Neg logit contributions
 shark-0.055
 steak-0.051
 thief-0.050
 crab-0.050
opus-0.050
Token Stri
Token ID26137
Feature activation+2.53e-02

Pos logit contributions
 shark+0.013
 steak+0.012
 thief+0.011
 crab+0.011
 salad+0.011
Neg logit contributions
 sorts-0.010
 ahead-0.010
eter-0.009
mediately-0.009
 coloured-0.009
Tokenpe
Token ID431
Feature activation-3.60e+00

Pos logit contributions
 shark+0.000
 steak+0.000
 thief+0.000
 salad+0.000
 dessert+0.000
Neg logit contributions
 ahead-0.001
 sorts-0.001
 called-0.001
k-0.001
 coloured-0.001
Tokeny
Token ID88
Feature activation-5.79e-01

Pos logit contributions
eter+0.059
 sorts+0.059
 Finally+0.056
 ahead+0.053
 lower+0.052
Neg logit contributions
 shark-0.097
 steak-0.096
olph-0.092
 mustache-0.092
 crab-0.090
Token.
Token ID13
Feature activation-1.74e-01

Pos logit contributions
 sorts+0.009
 ahead+0.008
eter+0.008
 shades+0.008
 Finally+0.008
Neg logit contributions
 shark-0.018
 steak-0.017
 thief-0.016
 crab-0.016
opus-0.016
Token He
Token ID679
Feature activation-1.93e-01

Pos logit contributions
 sorts+0.004
 ahead+0.004
 coloured+0.004
 called+0.004
 shades+0.004
Neg logit contributions
 shark-0.004
 steak-0.004
 thief-0.003
 salad-0.003
 crab-0.003
Token was
Token ID373
Feature activation-8.89e-01

Pos logit contributions
 sorts+0.002
eter+0.002
 ahead+0.002
mediately+0.002
 shades+0.002
Neg logit contributions
 shark-0.006
 steak-0.006
 thief-0.006
 crab-0.006
 ow-0.006
Token black
Token ID2042
Feature activation+6.48e-01

Pos logit contributions
 sorts+0.015
eter+0.015
antly+0.014
 ahead+0.014
ples+0.014
Neg logit contributions
 shark-0.025
 steak-0.024
 thief-0.023
 crab-0.022
 cowboy-0.022
Token use
Token ID779
Feature activation-1.02e+00

Pos logit contributions
 sorts+0.004
mediately+0.004
eter+0.004
amed+0.004
antly+0.004
Neg logit contributions
 thief-0.005
 shark-0.005
 steak-0.005
 dessert-0.004
 mustache-0.004
Token the
Token ID262
Feature activation-6.38e-01

Pos logit contributions
 sorts+0.020
mediately+0.020
 ahead+0.019
 Finally+0.019
eter+0.019
Neg logit contributions
 steak-0.025
 shark-0.025
 pasta-0.023
 salad-0.023
 thief-0.023
Token dictionary
Token ID22155
Feature activation-7.34e-01

Pos logit contributions
eter+0.017
 sorts+0.017
k+0.017
mediately+0.017
ually+0.017
Neg logit contributions
 steak-0.011
 thief-0.011
 mustache-0.010
 virus-0.010
 plate-0.010
Token to
Token ID284
Feature activation+6.75e-01

Pos logit contributions
 sorts+0.012
mediately+0.011
 coloured+0.011
 ahead+0.010
ples+0.010
Neg logit contributions
 steak-0.020
 shark-0.020
 thief-0.020
 salad-0.019
 pasta-0.019
Token learn
Token ID2193
Feature activation-7.52e-01

Pos logit contributions
 steak+0.016
 shark+0.016
 thief+0.015
 dessert+0.015
 salad+0.015
Neg logit contributions
 sorts-0.013
 ahead-0.013
eter-0.012
antly-0.012
 Finally-0.012
Token all
Token ID477
Feature activation-3.58e+00

Pos logit contributions
mediately+0.014
k+0.013
 Finally+0.013
ples+0.013
 sorts+0.013
Neg logit contributions
 steak-0.020
 shark-0.019
 salad-0.018
 pasta-0.018
 dessert-0.018
Token kinds
Token ID6982
Feature activation+3.83e-01

Pos logit contributions
 Finally+0.071
eter+0.068
amed+0.067
antly+0.067
k+0.064
Neg logit contributions
 steak-0.088
 shark-0.088
 salad-0.084
 pasta-0.083
 dessert-0.082
Token of
Token ID286
Feature activation-1.01e+00

Pos logit contributions
 shark+0.011
 steak+0.011
 salad+0.011
 pasta+0.011
 thief+0.011
Neg logit contributions
 sorts-0.005
 Finally-0.005
mediately-0.005
 ahead-0.005
k-0.005
Token new
Token ID649
Feature activation-1.42e+00

Pos logit contributions
 ahead+0.020
 sorts+0.020
k+0.019
 Finally+0.019
mediately+0.019
Neg logit contributions
 steak-0.024
 shark-0.024
 thief-0.023
 salad-0.022
 dessert-0.022
Token words
Token ID2456
Feature activation-3.47e-01

Pos logit contributions
mediately+0.036
 ahead+0.035
 lower+0.034
ually+0.034
k+0.034
Neg logit contributions
 steak-0.027
 salad-0.025
 dessert-0.024
 shark-0.024
 pasta-0.024
Token!"
Token ID2474
Feature activation-6.72e-01

Pos logit contributions
 sorts+0.005
 ahead+0.005
 Finally+0.005
mediately+0.005
 coloured+0.004
Neg logit contributions
 shark-0.010
 steak-0.010
 salad-0.010
 thief-0.010
 pasta-0.009
Token line
Token ID1627
Feature activation-1.35e+00

Pos logit contributions
 sorts+0.011
 ahead+0.011
 Finally+0.011
 coloured+0.010
k+0.010
Neg logit contributions
 shark-0.012
 steak-0.011
 thief-0.011
 crab-0.010
 salad-0.010
Token,
Token ID11
Feature activation-3.75e-01

Pos logit contributions
 sorts+0.020
 Finally+0.019
ples+0.019
 shades+0.019
 coloured+0.018
Neg logit contributions
 shark-0.039
 steak-0.037
 thief-0.036
 crab-0.035
 pasta-0.035
Token and
Token ID290
Feature activation-1.50e+00

Pos logit contributions
 sorts+0.007
 ahead+0.007
 coloured+0.006
 shades+0.006
 Finally+0.006
Neg logit contributions
 shark-0.010
 steak-0.009
 thief-0.009
 crab-0.009
 salad-0.009
Token he
Token ID339
Feature activation-2.80e-01

Pos logit contributions
 sorts+0.028
 shades+0.027
 coloured+0.027
 ahead+0.026
ples+0.026
Neg logit contributions
 shark-0.040
 steak-0.037
 thief-0.036
 crab-0.036
 salad-0.034
Token won
Token ID1839
Feature activation-1.18e+00

Pos logit contributions
 sorts+0.004
 shades+0.004
ples+0.004
 Finally+0.004
 coloured+0.004
Neg logit contributions
 shark-0.008
 crab-0.008
 thief-0.008
 steak-0.007
opus-0.007
Token the
Token ID262
Feature activation-3.58e+00

Pos logit contributions
 sorts+0.018
 ahead+0.016
 coloured+0.016
 Finally+0.016
mediately+0.015
Neg logit contributions
 shark-0.035
 steak-0.034
 thief-0.033
 crab-0.032
 dessert-0.031
Token race
Token ID3234
Feature activation-1.41e+00

Pos logit contributions
 sorts+0.073
 coloured+0.068
 shades+0.067
 ahead+0.067
 called+0.066
Neg logit contributions
 shark-0.088
 steak-0.084
 thief-0.083
 crab-0.079
 salad-0.078
Token.
Token ID13
Feature activation-1.15e-01

Pos logit contributions
 sorts+0.019
 Finally+0.018
 coloured+0.018
 shades+0.018
mediately+0.018
Neg logit contributions
 shark-0.042
 thief-0.041
 steak-0.040
 crab-0.039
 salad-0.038
Token\n
Token ID198
Feature activation+2.35e-02

Pos logit contributions
 sorts+0.002
 coloured+0.002
 ahead+0.002
 shades+0.002
k+0.002
Neg logit contributions
 shark-0.003
 steak-0.003
 thief-0.003
 salad-0.003
 crab-0.003
Token\n
Token ID198
Feature activation+5.05e-01

Pos logit contributions
 shark+0.001
 steak+0.001
 thief+0.001
 salad+0.001
 crab+0.001
Neg logit contributions
 sorts-0.000
 ahead-0.000
 Finally-0.000
 called-0.000
 After-0.000
TokenThe
Token ID464
Feature activation-9.24e-01

Pos logit contributions
 shark+0.017
 steak+0.016
 thief+0.016
 salad+0.015
 pasta+0.015
Neg logit contributions
 sorts-0.006
 ahead-0.006
 Finally-0.005
 called-0.005
 After-0.005
Token go
Token ID467
Feature activation+9.00e-01

Pos logit contributions
 sorts+0.014
 ahead+0.013
 Finally+0.013
 coloured+0.013
mediately+0.013
Neg logit contributions
 shark-0.024
 steak-0.023
 thief-0.022
 crab-0.022
 salad-0.022
Token.
Token ID13
Feature activation-1.87e-01

Pos logit contributions
 shark+0.029
 steak+0.028
 thief+0.027
 salad+0.026
 crab+0.026
Neg logit contributions
 sorts-0.012
 ahead-0.012
 called-0.010
 coloured-0.010
 Finally-0.010
Token
Token ID220
Feature activation-2.00e+00

Pos logit contributions
k+0.004
 coloured+0.004
 ahead+0.003
ist+0.003
kind+0.003
Neg logit contributions
 steak-0.004
 shark-0.004
 pasta-0.004
 salad-0.004
 dessert-0.004
Token
Token ID198
Feature activation-1.57e-01

Pos logit contributions
 sorts+0.041
 coloured+0.039
 ahead+0.039
k+0.038
 colours+0.037
Neg logit contributions
 shark-0.044
 steak-0.044
otion-0.043
oly-0.041
 thief-0.041
Token
Token ID198
Feature activation+7.95e-02

Pos logit contributions
 sorts+0.002
 ahead+0.002
 Finally+0.002
 called+0.002
 After+0.002
Neg logit contributions
 shark-0.005
 thief-0.005
 steak-0.005
 crab-0.005
 liar-0.005
TokenAn
Token ID2025
Feature activation-3.57e+00

Pos logit contributions
 shark+0.003
 steak+0.002
 thief+0.002
 salad+0.002
 crab+0.002
Neg logit contributions
 sorts-0.001
 ahead-0.001
 called-0.001
 coloured-0.001
 Finally-0.001
Tokenya
Token ID3972
Feature activation-1.07e+00

Pos logit contributions
 Finally+0.091
 sorts+0.090
house+0.089
eter+0.089
 ahead+0.087
Neg logit contributions
 shark-0.063
 steak-0.060
 thief-0.056
 crab-0.056
opus-0.053
Token raced
Token ID32897
Feature activation+8.30e-03

Pos logit contributions
 sorts+0.015
 ahead+0.014
 shades+0.014
 Finally+0.013
eter+0.013
Neg logit contributions
 shark-0.032
 steak-0.031
 crab-0.030
 thief-0.030
opus-0.029
Token to
Token ID284
Feature activation+4.82e-02

Pos logit contributions
 shark+0.000
 steak+0.000
 thief+0.000
 crab+0.000
 dessert+0.000
Neg logit contributions
 sorts-0.000
eter-0.000
mediately-0.000
 Finally-0.000
 shades-0.000
Token the
Token ID262
Feature activation+4.31e-01

Pos logit contributions
 shark+0.001
 steak+0.001
 thief+0.001
 salad+0.001
 crab+0.001
Neg logit contributions
 sorts-0.001
 ahead-0.001
 Finally-0.001
 coloured-0.001
 called-0.001
Token gym
Token ID11550
Feature activation-5.14e-01

Pos logit contributions
 shark+0.008
 steak+0.007
 thief+0.007
 crab+0.007
 salad+0.007
Neg logit contributions
 sorts-0.011
 ahead-0.011
 coloured-0.011
k-0.011
eter-0.011
Token The
Token ID383
Feature activation-1.59e-01

Pos logit contributions
 sorts+0.010
 ahead+0.009
 coloured+0.009
 called+0.009
 shades+0.009
Neg logit contributions
 shark-0.019
 steak-0.019
 thief-0.018
 crab-0.017
 salad-0.017
Token little
Token ID1310
Feature activation-5.06e-01

Pos logit contributions
 sorts+0.004
 ahead+0.004
 called+0.003
 coloured+0.003
 Finally+0.003
Neg logit contributions
 shark-0.003
 steak-0.003
 thief-0.003
 salad-0.003
 crab-0.003
Token boy
Token ID2933
Feature activation-2.84e-01

Pos logit contributions
 sorts+0.013
 Finally+0.013
ually+0.013
 After+0.012
k+0.012
Neg logit contributions
 shark-0.010
 thief-0.010
 crab-0.010
 steak-0.010
 salad-0.009
Token smiled
Token ID13541
Feature activation-8.01e-02

Pos logit contributions
 sorts+0.005
 ahead+0.005
eter+0.005
bons+0.004
ushes+0.004
Neg logit contributions
 shark-0.007
 steak-0.007
 mustache-0.007
 crab-0.007
 thief-0.007
Token mis
Token ID2984
Feature activation-1.64e+00

Pos logit contributions
 sorts+0.001
 ahead+0.001
 Finally+0.001
ually+0.001
 coloured+0.001
Neg logit contributions
 shark-0.002
 steak-0.002
 thief-0.002
 crab-0.002
 salad-0.002
Tokenchie
Token ID3043
Feature activation-3.57e+00

Pos logit contributions
 Finally+0.034
 sorts+0.032
 ahead+0.032
 After+0.032
 brushes+0.032
Neg logit contributions
 shark-0.035
ascal-0.035
 thief-0.035
 crab-0.034
 liar-0.034
Tokenv
Token ID85
Feature activation-8.73e-01

Pos logit contributions
eter+0.079
 Finally+0.077
 brushes+0.074
ples+0.074
 grace+0.073
Neg logit contributions
 steak-0.065
 shark-0.064
 crab-0.064
opus-0.060
 thief-0.060
Tokenously
Token ID3481
Feature activation-1.03e+00

Pos logit contributions
 sorts+0.017
 ahead+0.017
 called+0.016
 Finally+0.016
 coloured+0.016
Neg logit contributions
 shark-0.022
 steak-0.021
 thief-0.020
 crab-0.019
 salad-0.019
Token and
Token ID290
Feature activation-5.53e-01

Pos logit contributions
 sorts+0.019
eter+0.017
 shades+0.016
 Finally+0.016
ually+0.016
Neg logit contributions
 steak-0.027
 mustache-0.027
 shark-0.027
 thief-0.026
 salad-0.025
Token said
Token ID531
Feature activation+2.92e-01

Pos logit contributions
 sorts+0.009
 ahead+0.009
 shades+0.008
ples+0.008
eter+0.008
Neg logit contributions
 shark-0.016
 steak-0.015
 crab-0.014
 thief-0.014
 salad-0.014
Token,
Token ID11
Feature activation-1.76e-01

Pos logit contributions
 shark+0.009
 steak+0.009
 thief+0.009
 salad+0.009
 crab+0.009
Neg logit contributions
 sorts-0.004
 ahead-0.003
 shades-0.003
mediately-0.003
 coloured-0.003
Token "
Token ID366
Feature activation-8.93e-01

Pos logit contributions
 shark+0.022
 steak+0.021
 thief+0.020
 virus+0.020
ascal+0.020
Neg logit contributions
 ahead-0.009
 Finally-0.009
 past-0.009
 sorts-0.009
 called-0.008
Tokentwo
Token ID11545
Feature activation-2.01e+00

Pos logit contributions
 sorts+0.017
 ahead+0.017
 called+0.016
 Finally+0.015
mediately+0.015
Neg logit contributions
 shark-0.022
 steak-0.022
 thief-0.021
 crab-0.020
 pasta-0.020
Token".
Token ID1911
Feature activation-1.21e-01

Pos logit contributions
 sorts+0.031
 ahead+0.030
 Finally+0.029
mediately+0.028
 coloured+0.027
Neg logit contributions
 shark-0.058
 steak-0.057
 salad-0.055
 pasta-0.054
 crab-0.053
Token She
Token ID1375
Feature activation+2.13e-01

Pos logit contributions
 ahead+0.002
 sorts+0.002
 Finally+0.002
 called+0.001
 After+0.001
Neg logit contributions
 shark-0.004
 steak-0.004
 thief-0.004
 crab-0.004
opus-0.003
Token did
Token ID750
Feature activation-1.03e-01

Pos logit contributions
 shark+0.006
 steak+0.005
 pasta+0.005
 crab+0.005
 salad+0.005
Neg logit contributions
 sorts-0.004
 ahead-0.004
 directions-0.004
 direction-0.003
 Finally-0.003
Token this
Token ID428
Feature activation-3.57e+00

Pos logit contributions
 sorts+0.003
bye+0.003
eter+0.003
cy+0.003
k+0.003
Neg logit contributions
 salad-0.002
 steak-0.002
 pasta-0.002
 dessert-0.002
 shark-0.002
Token until
Token ID1566
Feature activation-2.50e-01

Pos logit contributions
 sorts+0.044
 ahead+0.043
mediately+0.039
k+0.039
 Finally+0.039
Neg logit contributions
 shark-0.110
 steak-0.109
 salad-0.105
 pasta-0.104
 thief-0.103
Token she
Token ID673
Feature activation+2.83e-01

Pos logit contributions
 ahead+0.005
 sorts+0.005
 called+0.005
 Finally+0.005
 coloured+0.005
Neg logit contributions
 shark-0.006
 steak-0.006
 thief-0.006
opus-0.006
ascal-0.005
Token had
Token ID550
Feature activation-1.90e-01

Pos logit contributions
 shark+0.008
 steak+0.008
 crab+0.008
 thief+0.008
 pasta+0.008
Neg logit contributions
 sorts-0.004
 ahead-0.004
 Finally-0.004
 After-0.004
mediately-0.004
Token ten
Token ID3478
Feature activation-1.34e+00

Pos logit contributions
 sorts+0.003
 ahead+0.003
 called+0.003
 coloured+0.003
 Finally+0.003
Neg logit contributions
 shark-0.006
 steak-0.005
 thief-0.005
 crab-0.005
 salad-0.005
Token sticks
Token ID16461
Feature activation-1.40e+00

Pos logit contributions
 ahead+0.029
 sorts+0.029
 coloured+0.027
 called+0.027
 past+0.027
Neg logit contributions
 shark-0.031
 steak-0.029
 thief-0.029
opus-0.027
 virus-0.027
Token it
Token ID340
Feature activation+1.33e-01

Pos logit contributions
 sorts+0.014
 ahead+0.014
eter+0.014
 Finally+0.014
k+0.013
Neg logit contributions
 shark-0.017
 steak-0.016
 thief-0.016
 crab-0.016
 salad-0.015
Token was
Token ID373
Feature activation+3.15e-01

Pos logit contributions
 shark+0.004
 ow+0.004
 steak+0.004
 thief+0.004
 crab+0.004
Neg logit contributions
 sorts-0.002
eter-0.002
 Finally-0.002
 ahead-0.002
k-0.002
Token stuck
Token ID7819
Feature activation-8.27e-01

Pos logit contributions
 shark+0.008
 steak+0.008
 thief+0.007
 salad+0.007
 crab+0.007
Neg logit contributions
 sorts-0.006
 ahead-0.006
 Finally-0.006
k-0.006
ually-0.006
Token!
Token ID0
Feature activation+6.56e-01

Pos logit contributions
 sorts+0.016
 Finally+0.015
 After+0.014
mediately+0.014
 brushes+0.014
Neg logit contributions
 shark-0.021
 thief-0.020
 steak-0.020
 crab-0.019
 pasta-0.018
Token The
Token ID383
Feature activation+5.02e-01

Pos logit contributions
 shark+0.020
 steak+0.019
 thief+0.019
 salad+0.018
 crab+0.018
Neg logit contributions
 sorts-0.009
 ahead-0.009
 coloured-0.008
 shades-0.008
k-0.008
Token bald
Token ID28273
Feature activation+2.77e+00

Pos logit contributions
otion+0.013
 shark+0.013
 gag+0.012
ascal+0.012
uts+0.012
Neg logit contributions
 called-0.009
 ahead-0.008
 past-0.008
 began-0.008
 painted-0.008
Token man
Token ID582
Feature activation-4.76e-02

Pos logit contributions
 shark+0.062
 steak+0.059
 salad+0.058
otion+0.057
 dessert+0.057
Neg logit contributions
 ahead-0.059
k-0.056
 called-0.056
 coloured-0.055
 sorts-0.055
Token wasn
Token ID2492
Feature activation+7.47e-01

Pos logit contributions
 sorts+0.001
 ahead+0.001
eter+0.001
 shades+0.001
 Finally+0.001
Neg logit contributions
 shark-0.001
 steak-0.001
 thief-0.001
 crab-0.001
 mustache-0.001
Token't
Token ID470
Feature activation+8.44e-01

Pos logit contributions
 shark+0.019
 steak+0.018
 salad+0.018
 pasta+0.018
 peas+0.017
Neg logit contributions
 ahead-0.017
 called-0.015
 over-0.015
 past-0.014
 lower-0.014
Token scared
Token ID12008
Feature activation+9.04e-01

Pos logit contributions
 shark+0.025
opus+0.023
 steak+0.023
 salad+0.023
 pasta+0.022
Neg logit contributions
 ahead-0.015
 called-0.015
 sorts-0.014
 past-0.014
 coloured-0.013
Token.
Token ID13
Feature activation+4.09e-01

Pos logit contributions
 shark+0.030
 steak+0.030
 gag+0.028
opus+0.028
 salad+0.028
Neg logit contributions
 ahead-0.011
 sorts-0.010
 past-0.009
 called-0.009
 over-0.009
Token It
Token ID632
Feature activation+7.86e-02

Pos logit contributions
 shark+0.019
 steak+0.018
 thief+0.017
 salad+0.017
 crab+0.017
Neg logit contributions
 sorts-0.010
 ahead-0.009
 coloured-0.009
k-0.009
 shades-0.009
Token split
Token ID6626
Feature activation-1.35e+00

Pos logit contributions
 shark+0.002
 ow+0.002
 thief+0.002
 crab+0.002
 steak+0.002
Neg logit contributions
 sorts-0.001
 Finally-0.001
 ahead-0.001
k-0.001
eter-0.001
Token open
Token ID1280
Feature activation-1.59e+00

Pos logit contributions
 sorts+0.023
 Finally+0.020
ually+0.020
 ahead+0.020
mediately+0.020
Neg logit contributions
 shark-0.039
 steak-0.038
 pasta-0.035
 crab-0.035
 salad-0.035
Token and
Token ID290
Feature activation-1.66e-01

Pos logit contributions
 sorts+0.030
bye+0.029
ually+0.029
mediately+0.029
 Finally+0.027
Neg logit contributions
 steak-0.040
 shark-0.039
 salad-0.037
 pasta-0.036
 peas-0.035
Token the
Token ID262
Feature activation+7.07e-02

Pos logit contributions
 sorts+0.003
ples+0.003
k+0.003
 shades+0.003
eter+0.003
Neg logit contributions
 shark-0.004
 steak-0.004
 thief-0.004
 salad-0.004
 crab-0.004
Token cater
Token ID20825
Feature activation+2.78e+00

Pos logit contributions
 shark+0.001
 thief+0.001
 crab+0.001
 steak+0.001
 salad+0.001
Neg logit contributions
 sorts-0.002
 ahead-0.002
 Finally-0.002
ually-0.002
k-0.002
Tokenpillar
Token ID41643
Feature activation+1.21e+00

Pos logit contributions
 shark+0.028
opus+0.024
 steak+0.024
 thief+0.023
 crab+0.023
Neg logit contributions
 sorts-0.096
 Finally-0.093
 ahead-0.092
ually-0.092
 coloured-0.091
Token crawled
Token ID45668
Feature activation+2.27e-01

Pos logit contributions
 shark+0.034
 thief+0.032
 steak+0.032
 ow+0.032
 crab+0.032
Neg logit contributions
 sorts-0.020
 ahead-0.018
 Finally-0.017
mediately-0.017
 shades-0.017
Token out
Token ID503
Feature activation-5.20e-01

Pos logit contributions
 shark+0.006
 thief+0.006
 steak+0.006
 salad+0.006
 crab+0.005
Neg logit contributions
 sorts-0.004
ually-0.004
mediately-0.004
 brushes-0.004
 kinds-0.004
Token to
Token ID284
Feature activation-7.87e-01

Pos logit contributions
 sorts+0.008
 ahead+0.007
 Finally+0.007
mediately+0.007
k+0.007
Neg logit contributions
 shark-0.015
 steak-0.015
 salad-0.014
 thief-0.014
 pasta-0.014
Token safety
Token ID3747
Feature activation-3.13e-01

Pos logit contributions
 sorts+0.015
 ahead+0.014
 Finally+0.013
 coloured+0.013
 shades+0.013
Neg logit contributions
 shark-0.021
 steak-0.020
 thief-0.019
 salad-0.019
 crab-0.019
Token.
Token ID13
Feature activation+7.02e-02

Pos logit contributions
 sorts+0.007
 ahead+0.006
 Finally+0.006
mediately+0.006
 coloured+0.006
Neg logit contributions
 shark-0.015
 steak-0.015
 thief-0.014
 salad-0.014
 pasta-0.014
Token She
Token ID1375
Feature activation+3.18e-01

Pos logit contributions
 shark+0.002
 steak+0.002
 thief+0.002
 salad+0.002
 pasta+0.002
Neg logit contributions
 sorts-0.001
 coloured-0.001
 ahead-0.001
k-0.001
 called-0.001
Token knew
Token ID2993
Feature activation-3.39e-01

Pos logit contributions
 shark+0.009
 steak+0.009
 thief+0.009
 crab+0.008
 salad+0.008
Neg logit contributions
 sorts-0.005
 ahead-0.005
 Finally-0.005
 shades-0.004
k-0.004
Token that
Token ID326
Feature activation+4.77e-01

Pos logit contributions
 sorts+0.007
 ahead+0.007
 called+0.006
 coloured+0.006
 Finally+0.006
Neg logit contributions
 shark-0.009
 steak-0.008
 thief-0.008
 salad-0.008
 crab-0.008
Token the
Token ID262
Feature activation+6.85e-01

Pos logit contributions
 shark+0.013
 steak+0.012
 thief+0.012
opus+0.012
ascal+0.011
Neg logit contributions
 sorts-0.009
 ahead-0.009
 called-0.008
 coloured-0.008
 shades-0.008
Token sm
Token ID895
Feature activation+2.49e+00

Pos logit contributions
ascal+0.014
otion+0.014
oly+0.014
uts+0.014
 shark+0.014
Neg logit contributions
 ahead-0.015
 past-0.014
 colours-0.014
 called-0.014
 coloured-0.013
Tokenelly
Token ID6148
Feature activation+3.01e+00

Pos logit contributions
 shark+0.052
 steak+0.050
 thief+0.047
 salad+0.045
 crab+0.045
Neg logit contributions
 sorts-0.060
 ahead-0.059
 called-0.056
 coloured-0.056
 Finally-0.055
Token popcorn
Token ID38914
Feature activation+1.09e+00

Pos logit contributions
ascal+0.055
oly+0.053
 shark+0.052
otion+0.051
 steak+0.050
Neg logit contributions
 ahead-0.076
 past-0.074
 coloured-0.072
 sorts-0.071
 colours-0.070
Token had
Token ID550
Feature activation-3.92e-01

Pos logit contributions
 shark+0.038
 steak+0.037
 thief+0.036
 salad+0.036
 crab+0.035
Neg logit contributions
 sorts-0.010
 ahead-0.010
 called-0.009
 coloured-0.008
 shades-0.008
Token been
Token ID587
Feature activation+2.36e-01

Pos logit contributions
 sorts+0.007
 ahead+0.007
 called+0.006
 coloured+0.006
 Finally+0.006
Neg logit contributions
 shark-0.011
 steak-0.010
 thief-0.010
 crab-0.009
 salad-0.009
Token a
Token ID257
Feature activation-6.36e-02

Pos logit contributions
 shark+0.006
 steak+0.006
 thief+0.005
 salad+0.005
 crab+0.005
Neg logit contributions
 sorts-0.005
 ahead-0.005
 Finally-0.004
 called-0.004
 coloured-0.004
Token drops
Token ID10532
Feature activation+2.78e-01

Pos logit contributions
 shark+0.013
 steak+0.012
 thief+0.011
 ow+0.011
 gag+0.011
Neg logit contributions
 sorts-0.013
 ahead-0.012
eter-0.012
 shades-0.012
 coloured-0.012
Token the
Token ID262
Feature activation+1.03e+00

Pos logit contributions
 shark+0.007
 steak+0.007
 thief+0.006
 pasta+0.006
 crab+0.006
Neg logit contributions
 sorts-0.005
ually-0.005
 Finally-0.005
 coloured-0.005
ples-0.005
Token wallet
Token ID13008
Feature activation+1.67e+00

Pos logit contributions
 shark+0.022
ascal+0.020
 gag+0.019
otion+0.019
 steak+0.019
Neg logit contributions
 past-0.025
 ahead-0.024
 called-0.023
k-0.023
 coloured-0.022
Token.
Token ID13
Feature activation+1.24e-01

Pos logit contributions
 shark+0.059
 steak+0.056
ascal+0.055
 thief+0.055
opus+0.054
Neg logit contributions
 ahead-0.017
 sorts-0.015
 called-0.015
 over-0.014
 past-0.014
Token The
Token ID383
Feature activation+7.76e-01

Pos logit contributions
 shark+0.004
 steak+0.004
 thief+0.003
opus+0.003
 salad+0.003
Neg logit contributions
 Finally-0.002
 After-0.002
 ahead-0.002
 sorts-0.002
 called-0.002
Token wallet
Token ID13008
Feature activation+2.60e+00

Pos logit contributions
 shark+0.017
 steak+0.016
ascal+0.016
opus+0.015
otion+0.015
Neg logit contributions
 ahead-0.017
 called-0.016
 past-0.016
 sorts-0.016
 coloured-0.016
Token is
Token ID318
Feature activation+1.19e+00

Pos logit contributions
 shark+0.068
 steak+0.065
 thief+0.062
 salad+0.061
 crab+0.061
Neg logit contributions
 ahead-0.048
 sorts-0.048
 called-0.045
 coloured-0.044
k-0.044
Token hard
Token ID1327
Feature activation+1.20e+00

Pos logit contributions
 shark+0.025
 steak+0.023
 thief+0.022
 salad+0.021
 crab+0.021
Neg logit contributions
 sorts-0.029
 ahead-0.028
 Finally-0.027
 called-0.027
 coloured-0.027
Token.
Token ID13
Feature activation+1.04e-02

Pos logit contributions
 shark+0.049
opus+0.047
 crab+0.047
ascal+0.046
 steak+0.046
Neg logit contributions
 ahead-0.007
 called-0.006
 over-0.006
 sorts-0.005
 past-0.005
Token It
Token ID632
Feature activation+7.09e-01

Pos logit contributions
 shark+0.000
 steak+0.000
 thief+0.000
 peas+0.000
opus+0.000
Neg logit contributions
 Finally-0.000
 After-0.000
 When-0.000
 ahead-0.000
 Suddenly-0.000
Token hits
Token ID7127
Feature activation-4.12e-01

Pos logit contributions
 shark+0.018
 steak+0.016
 thief+0.016
 salad+0.015
 ow+0.015
Neg logit contributions
 sorts-0.015
 ahead-0.014
 shades-0.014
 coloured-0.013
 Finally-0.013
Token?
Token ID30
Feature activation-1.02e+00

Pos logit contributions
 sorts+0.021
 ahead+0.021
 called+0.020
 coloured+0.019
 past+0.019
Neg logit contributions
 shark-0.029
 steak-0.028
 thief-0.027
ascal-0.027
 gag-0.026
Tokenâ
Token ID22940
Feature activation-9.44e-01

Pos logit contributions
 sorts+0.020
 ahead+0.020
 coloured+0.019
 called+0.019
 shades+0.018
Neg logit contributions
 shark-0.026
 steak-0.024
 thief-0.024
 crab-0.023
 salad-0.023
Token€
Token ID26391
Feature activation+1.22e+00

Pos logit contributions
 sorts+0.021
eter+0.019
 shades+0.019
 ahead+0.019
 Finally+0.019
Neg logit contributions
 shark-0.020
 steak-0.020
opus-0.019
 thief-0.019
 pasta-0.019
Token But
Token ID887
Feature activation-8.26e-01

Pos logit contributions
 shark+0.033
 steak+0.032
 thief+0.031
 crab+0.030
 salad+0.030
Neg logit contributions
 sorts-0.022
 ahead-0.021
 coloured-0.020
 called-0.019
 shades-0.019
Token the
Token ID262
Feature activation-1.62e-01

Pos logit contributions
 sorts+0.019
 ahead+0.018
 called+0.017
 coloured+0.017
 Finally+0.017
Neg logit contributions
 shark-0.019
 steak-0.018
 thief-0.017
 salad-0.016
 crab-0.016
Token lemon
Token ID18873
Feature activation+2.94e+00

Pos logit contributions
 ahead+0.004
 sorts+0.003
 past+0.003
 colours+0.003
 called+0.003
Neg logit contributions
 shark-0.004
ascal-0.003
 steak-0.003
otion-0.003
oly-0.003
Token couldn
Token ID3521
Feature activation+5.86e-01

Pos logit contributions
 shark+0.091
 steak+0.088
 thief+0.085
 salad+0.083
 crab+0.083
Neg logit contributions
 sorts-0.040
 ahead-0.039
 called-0.036
 coloured-0.035
 shades-0.035
Token't
Token ID470
Feature activation+3.80e-01

Pos logit contributions
 shark+0.015
 steak+0.015
 salad+0.014
ascal+0.014
 liar+0.014
Neg logit contributions
 ahead-0.011
 called-0.011
 over-0.011
 began-0.010
antly-0.010
Token hear
Token ID3285
Feature activation-2.16e-02

Pos logit contributions
 steak+0.013
 liar+0.012
 shark+0.012
 salad+0.012
opus+0.012
Neg logit contributions
 ahead-0.005
 called-0.005
 over-0.005
 like-0.005
 when-0.004
Token.
Token ID13
Feature activation-1.61e-01

Pos logit contributions
 sorts+0.000
 ahead+0.000
 called+0.000
 coloured+0.000
 Finally+0.000
Neg logit contributions
 shark-0.001
 steak-0.001
 thief-0.000
 salad-0.000
 crab-0.000
Token The
Token ID383
Feature activation+7.17e-02

Pos logit contributions
 sorts+0.003
 ahead+0.003
 called+0.003
 Finally+0.003
 shades+0.003
Neg logit contributions
 shark-0.004
 steak-0.004
 thief-0.004
 salad-0.004
 crab-0.004
Token feel
Token ID1254
Feature activation+2.42e-01

Pos logit contributions
 ahead+0.011
 sorts+0.011
 called+0.010
 Finally+0.010
k+0.010
Neg logit contributions
 shark-0.018
 steak-0.017
 thief-0.017
opus-0.016
 crab-0.016
Token sorry
Token ID7926
Feature activation-2.66e-01

Pos logit contributions
 shark+0.004
 steak+0.004
 thief+0.003
 salad+0.003
 crab+0.003
Neg logit contributions
 sorts-0.007
 ahead-0.007
 called-0.007
 coloured-0.007
 Finally-0.007
Token.
Token ID13
Feature activation-1.14e+00

Pos logit contributions
 ahead+0.002
 sorts+0.002
 called+0.002
 past+0.002
 Finally+0.002
Neg logit contributions
 shark-0.010
 steak-0.010
 thief-0.009
opus-0.009
 crab-0.009
Token They
Token ID1119
Feature activation+7.26e-02

Pos logit contributions
 sorts+0.018
 ahead+0.017
 coloured+0.016
k+0.016
mediately+0.015
Neg logit contributions
 shark-0.034
 steak-0.033
 thief-0.031
 crab-0.031
 salad-0.030
Token say
Token ID910
Feature activation+1.05e-01

Pos logit contributions
 shark+0.002
 steak+0.002
 thief+0.002
 salad+0.002
 crab+0.002
Neg logit contributions
 sorts-0.001
 ahead-0.001
 coloured-0.001
 shades-0.001
 Finally-0.001
Token,
Token ID11
Feature activation+5.21e-01

Pos logit contributions
 shark+0.004
 steak+0.004
ascal+0.004
 thief+0.004
 virus+0.004
Neg logit contributions
 ahead-0.000
 past-0.000
 called-0.000
 over-0.000
 Finally-0.000
Token "
Token ID366
Feature activation-7.47e-01

Pos logit contributions
 shark+0.018
 steak+0.017
 thief+0.017
 gag+0.017
 crab+0.016
Neg logit contributions
 ahead-0.006
 called-0.005
 Finally-0.005
 over-0.005
 After-0.005
TokenWe
Token ID1135
Feature activation-4.18e-01

Pos logit contributions
 sorts+0.013
 ahead+0.013
 called+0.012
 coloured+0.012
 Finally+0.012
Neg logit contributions
 shark-0.020
 steak-0.020
 thief-0.019
 crab-0.018
 salad-0.018
Token are
Token ID389
Feature activation+1.01e-01

Pos logit contributions
 sorts+0.008
 ahead+0.008
 Finally+0.008
 shades+0.008
 coloured+0.008
Neg logit contributions
 shark-0.010
 steak-0.010
 thief-0.010
 crab-0.010
 salad-0.009
Token sorry
Token ID7926
Feature activation+4.04e-01

Pos logit contributions
 shark+0.002
opus+0.002
 steak+0.002
 virus+0.002
 gag+0.002
Neg logit contributions
 called-0.003
 ahead-0.003
 past-0.003
 over-0.003
 like-0.002
Token,
Token ID11
Feature activation-5.82e-01

Pos logit contributions
 shark+0.014
 steak+0.014
 thief+0.013
 salad+0.013
 crab+0.013
Neg logit contributions
 sorts-0.004
 ahead-0.004
 called-0.003
 coloured-0.003
 Finally-0.003
Token bad
Token ID2089
Feature activation+6.11e-01

Pos logit contributions
 thief+0.002
 shark+0.002
 dessert+0.002
 steak+0.002
 liar+0.002
Neg logit contributions
 sorts-0.004
 Finally-0.004
antly-0.004
ually-0.004
 After-0.004
Token.
Token ID13
Feature activation-8.60e-02

Pos logit contributions
 shark+0.023
 steak+0.023
opus+0.022
 thief+0.022
 crab+0.022
Neg logit contributions
 ahead-0.004
 sorts-0.003
 called-0.003
 past-0.003
 over-0.003
Token He
Token ID679
Feature activation-5.95e-02

Pos logit contributions
 sorts+0.002
 ahead+0.002
 coloured+0.002
k+0.002
 colours+0.002
Neg logit contributions
 shark-0.002
 steak-0.002
 thief-0.002
 salad-0.002
 ow-0.002
Token was
Token ID373
Feature activation-1.09e-01

Pos logit contributions
 sorts+0.001
 shades+0.001
eter+0.001
 ahead+0.001
 colours+0.001
Neg logit contributions
 shark-0.001
 steak-0.001
 thief-0.001
 mustache-0.001
 ow-0.001
Token just
Token ID655
Feature activation-1.27e+00

Pos logit contributions
 sorts+0.002
 ahead+0.002
 Finally+0.002
 shades+0.002
k+0.002
Neg logit contributions
 shark-0.003
 steak-0.003
 thief-0.003
 salad-0.002
 crab-0.002
Token a
Token ID257
Feature activation+3.50e-01

Pos logit contributions
 sorts+0.032
 ahead+0.031
 Finally+0.030
antly+0.030
 shades+0.029
Neg logit contributions
 shark-0.025
 steak-0.024
 thief-0.023
 crab-0.022
 pasta-0.022
Token little
Token ID1310
Feature activation-8.97e-01

Pos logit contributions
 shark+0.007
 steak+0.007
otion+0.007
ascal+0.007
 gag+0.007
Neg logit contributions
 ahead-0.008
 sorts-0.008
 called-0.007
 past-0.007
 coloured-0.007
Token boy
Token ID2933
Feature activation-2.13e-01

Pos logit contributions
 sorts+0.020
 ahead+0.019
 Finally+0.018
 coloured+0.018
 called+0.018
Neg logit contributions
 shark-0.021
 steak-0.020
 thief-0.019
 crab-0.018
 salad-0.018
Token.
Token ID13
Feature activation+1.89e-01

Pos logit contributions
 sorts+0.003
 shades+0.003
 coloured+0.003
 ahead+0.003
 Finally+0.003
Neg logit contributions
 shark-0.006
 steak-0.006
 thief-0.006
 crab-0.006
 dessert-0.006
Token
Token ID198
Feature activation-2.55e-01

Pos logit contributions
 shark+0.005
 steak+0.004
 thief+0.004
 salad+0.004
 pasta+0.004
Neg logit contributions
 sorts-0.004
 ahead-0.004
 coloured-0.004
k-0.004
 shades-0.004
Token
Token ID198
Feature activation-2.13e-01

Pos logit contributions
 sorts+0.003
 ahead+0.003
 called+0.003
 Finally+0.003
 After+0.003
Neg logit contributions
 shark-0.008
 thief-0.008
 steak-0.008
 crab-0.008
 salad-0.008
Token much
Token ID881
Feature activation+1.66e-01

Pos logit contributions
 shark+0.026
 steak+0.025
 salad+0.024
 pasta+0.024
 thief+0.024
Neg logit contributions
 sorts-0.019
 ahead-0.019
mediately-0.018
k-0.018
 Finally-0.018
Token as
Token ID355
Feature activation+1.13e+00

Pos logit contributions
 steak+0.004
 salad+0.004
 pasta+0.004
 shark+0.004
 dessert+0.004
Neg logit contributions
 sorts-0.003
ually-0.003
mediately-0.003
eter-0.003
k-0.003
Token you
Token ID345
Feature activation-3.57e-01

Pos logit contributions
 shark+0.027
 steak+0.027
 thief+0.025
 salad+0.025
 pasta+0.025
Neg logit contributions
 sorts-0.023
 ahead-0.022
k-0.021
mediately-0.021
 Finally-0.021
Token want
Token ID765
Feature activation+5.56e-01

Pos logit contributions
 sorts+0.006
mediately+0.006
antly+0.006
 directions+0.006
 Finally+0.005
Neg logit contributions
 pasta-0.009
 salad-0.009
 shark-0.009
 dessert-0.009
 thief-0.009
Token."
Token ID526
Feature activation+4.50e-02

Pos logit contributions
 steak+0.016
 shark+0.016
 pasta+0.015
 salad+0.015
 dessert+0.015
Neg logit contributions
 sorts-0.009
mediately-0.009
ually-0.009
k-0.008
 Finally-0.008
Token
Token ID198
Feature activation+3.28e-01

Pos logit contributions
 shark+0.001
 steak+0.001
 thief+0.001
opus+0.001
 salad+0.001
Neg logit contributions
 sorts-0.001
 ahead-0.001
 called-0.001
 Finally-0.001
 After-0.001
Token
Token ID198
Feature activation+2.58e-01

Pos logit contributions
 shark+0.010
 steak+0.010
 thief+0.010
 salad+0.009
 crab+0.009
Neg logit contributions
 sorts-0.005
 ahead-0.004
 called-0.004
 Finally-0.004
 After-0.004
TokenL
Token ID43
Feature activation-1.63e-01

Pos logit contributions
 shark+0.007
 steak+0.007
 thief+0.006
 salad+0.006
 pasta+0.006
Neg logit contributions
 sorts-0.005
 ahead-0.005
 called-0.005
 coloured-0.004
 Finally-0.004
Tokenily
Token ID813
Feature activation-2.50e-01

Pos logit contributions
 sorts+0.004
 called+0.004
 ahead+0.004
 coloured+0.004
mediately+0.004
Neg logit contributions
 shark-0.004
 steak-0.003
 thief-0.003
 salad-0.003
 gag-0.003
Token smiled
Token ID13541
Feature activation+2.92e-02

Pos logit contributions
 sorts+0.005
 ahead+0.004
 shades+0.004
k+0.004
eter+0.004
Neg logit contributions
 mustache-0.006
 salad-0.006
 shark-0.006
 steak-0.006
 pasta-0.006
Token and
Token ID290
Feature activation-1.20e-01

Pos logit contributions
 shark+0.001
 steak+0.001
 thief+0.001
 salad+0.001
 crab+0.001
Neg logit contributions
 sorts-0.000
 ahead-0.000
 coloured-0.000
 Finally-0.000
 shades-0.000
Token was
Token ID373
Feature activation-3.41e-01

Pos logit contributions
 sorts+0.007
 ahead+0.007
k+0.007
eter+0.006
 Finally+0.006
Neg logit contributions
 shark-0.023
 steak-0.022
 crab-0.021
 thief-0.021
 salad-0.020
Token a
Token ID257
Feature activation+6.12e-01

Pos logit contributions
 sorts+0.004
 ahead+0.004
 called+0.004
 coloured+0.004
 Finally+0.003
Neg logit contributions
 shark-0.011
 thief-0.011
 steak-0.011
opus-0.011
 crab-0.010
Token delicate
Token ID19217
Feature activation+5.29e-01

Pos logit contributions
otion+0.016
 shark+0.016
 steak+0.015
 gag+0.015
uts+0.015
Neg logit contributions
 ahead-0.011
 called-0.011
 past-0.010
 sorts-0.010
 coloured-0.009
Token bird
Token ID6512
Feature activation-9.70e-01

Pos logit contributions
 shark+0.010
 steak+0.009
 gag+0.009
 thief+0.009
otion+0.009
Neg logit contributions
 ahead-0.013
 sorts-0.013
 called-0.013
 coloured-0.012
 past-0.012
Token who
Token ID508
Feature activation-3.55e-01

Pos logit contributions
 Finally+0.014
 sorts+0.014
eter+0.013
 shades+0.013
 ahead+0.013
Neg logit contributions
 shark-0.029
 steak-0.028
 crab-0.027
 thief-0.027
 salad-0.026
Token could
Token ID714
Feature activation+8.93e-01

Pos logit contributions
eter+0.007
 directions+0.006
mediately+0.006
 ahead+0.006
 wider+0.006
Neg logit contributions
 shark-0.009
 steak-0.008
 ow-0.008
 crab-0.008
 gag-0.008
Token soar
Token ID48701
Feature activation-1.49e+00

Pos logit contributions
 shark+0.023
 steak+0.022
 thief+0.021
opus+0.020
 pasta+0.020
Neg logit contributions
 sorts-0.017
 ahead-0.017
 called-0.016
 coloured-0.016
 shades-0.015
Token like
Token ID588
Feature activation+1.30e+00

Pos logit contributions
 Finally+0.029
 brushes+0.028
ples+0.028
ually+0.027
mediately+0.027
Neg logit contributions
 shark-0.035
 steak-0.033
 salad-0.032
 crab-0.031
 thief-0.031
Token no
Token ID645
Feature activation-6.39e-01

Pos logit contributions
 shark+0.044
ascal+0.041
opus+0.041
 gag+0.041
 steak+0.041
Neg logit contributions
 sorts-0.014
 ahead-0.014
 past-0.013
 called-0.013
 coloured-0.012
Token other
Token ID584
Feature activation-3.47e-01

Pos logit contributions
 sorts+0.012
 ahead+0.012
 Finally+0.012
mediately+0.012
 After+0.011
Neg logit contributions
 shark-0.016
 steak-0.016
 thief-0.015
 salad-0.015
 crab-0.015
Token.
Token ID13
Feature activation+1.14e-01

Pos logit contributions
 sorts+0.007
 ahead+0.007
 Finally+0.007
mediately+0.006
 After+0.006
Neg logit contributions
 shark-0.009
 steak-0.008
 thief-0.008
 crab-0.008
 ow-0.008
Token feed
Token ID3745
Feature activation+9.24e-01

Pos logit contributions
 shark+0.025
 steak+0.024
 thief+0.023
 salad+0.022
 crab+0.022
Neg logit contributions
 sorts-0.016
 ahead-0.016
 Finally-0.015
 coloured-0.015
 shades-0.015
Token the
Token ID262
Feature activation-7.36e-01

Pos logit contributions
 shark+0.024
 steak+0.023
 thief+0.022
 crab+0.021
 salad+0.021
Neg logit contributions
 sorts-0.018
 ahead-0.017
 called-0.016
 coloured-0.016
 Finally-0.016
Token animals
Token ID4695
Feature activation+3.72e-02

Pos logit contributions
 sorts+0.018
 ahead+0.018
 called+0.017
 coloured+0.017
 shades+0.017
Neg logit contributions
 shark-0.015
 steak-0.014
 thief-0.013
 crab-0.013
 salad-0.013
Token,
Token ID11
Feature activation+1.05e-01

Pos logit contributions
 shark+0.001
 steak+0.001
 thief+0.001
opus+0.001
 crab+0.001
Neg logit contributions
 sorts-0.000
 ahead-0.000
 called-0.000
 coloured-0.000
k-0.000
Token just
Token ID655
Feature activation-2.29e+00

Pos logit contributions
 shark+0.003
 steak+0.003
 thief+0.003
 crab+0.003
 salad+0.003
Neg logit contributions
 sorts-0.002
 ahead-0.002
 called-0.002
 coloured-0.002
 Finally-0.002
Token like
Token ID588
Feature activation+9.30e-01

Pos logit contributions
 sorts+0.029
 Finally+0.026
amed+0.026
antly+0.026
ruly+0.025
Neg logit contributions
 shark-0.072
 steak-0.070
 salad-0.068
 thief-0.067
 crab-0.067
Token his
Token ID465
Feature activation-7.42e-01

Pos logit contributions
 shark+0.025
 steak+0.024
 thief+0.023
opus+0.023
 crab+0.023
Neg logit contributions
 sorts-0.016
 ahead-0.016
 called-0.015
 coloured-0.015
 shades-0.014
Token mom
Token ID1995
Feature activation+5.25e-01

Pos logit contributions
 sorts+0.015
 ahead+0.015
mediately+0.014
 Finally+0.014
k+0.014
Neg logit contributions
 shark-0.018
 steak-0.017
 thief-0.016
 salad-0.016
 crab-0.016
Token.
Token ID13
Feature activation-6.35e-01

Pos logit contributions
 shark+0.016
 steak+0.016
 thief+0.015
 mustache+0.015
 salad+0.015
Neg logit contributions
 sorts-0.007
 ahead-0.006
 Finally-0.006
 shades-0.006
 coloured-0.006
Token He
Token ID679
Feature activation-7.10e-01

Pos logit contributions
 sorts+0.010
 ahead+0.009
 called+0.008
 Finally+0.008
 coloured+0.008
Neg logit contributions
 shark-0.019
 steak-0.018
 thief-0.018
 salad-0.017
 crab-0.017
Token was
Token ID373
Feature activation-6.10e-01

Pos logit contributions
 sorts+0.011
 ahead+0.010
 shades+0.010
k+0.009
mediately+0.009
Neg logit contributions
 shark-0.021
 steak-0.020
 thief-0.019
 salad-0.019
 crab-0.019
Token sometimes
Token ID3360
Feature activation+7.44e-01

Pos logit contributions
 sorts+0.000
 ahead+0.000
 Finally+0.000
antly+0.000
k+0.000
Neg logit contributions
 shark-0.000
 steak-0.000
 salad-0.000
 crab-0.000
 thief-0.000
Token it
Token ID340
Feature activation+8.00e-01

Pos logit contributions
 shark+0.017
 steak+0.017
 thief+0.016
 crab+0.015
 salad+0.015
Neg logit contributions
 sorts-0.016
 ahead-0.016
 Finally-0.015
 coloured-0.015
mediately-0.015
Token was
Token ID373
Feature activation+4.87e-01

Pos logit contributions
 shark+0.025
 steak+0.024
 thief+0.023
 crab+0.023
 salad+0.023
Neg logit contributions
 sorts-0.011
 ahead-0.010
 Finally-0.010
 After-0.009
 coloured-0.009
Token close
Token ID1969
Feature activation-1.44e+00

Pos logit contributions
 shark+0.012
 steak+0.012
 salad+0.011
 thief+0.011
 pasta+0.011
Neg logit contributions
 sorts-0.010
 Finally-0.009
antly-0.009
k-0.009
ually-0.009
Token.
Token ID13
Feature activation-3.25e-01

Pos logit contributions
 sorts+0.033
amed+0.032
 Finally+0.031
ually+0.031
mediately+0.030
Neg logit contributions
 shark-0.029
 steak-0.029
 salad-0.027
 pasta-0.026
 peas-0.025
Token They
Token ID1119
Feature activation-2.28e-01

Pos logit contributions
 ahead+0.006
 sorts+0.006
 coloured+0.006
k+0.006
ples+0.006
Neg logit contributions
 shark-0.008
 steak-0.008
 salad-0.007
 crab-0.007
 pasta-0.007
Token learned
Token ID4499
Feature activation-7.73e-01

Pos logit contributions
eter+0.004
antly+0.004
 sorts+0.004
 After+0.004
 Finally+0.004
Neg logit contributions
 shark-0.006
 crab-0.006
 thief-0.005
 steak-0.005
 peas-0.005
Token that
Token ID326
Feature activation+3.43e-01

Pos logit contributions
 sorts+0.014
 ahead+0.014
ples+0.013
k+0.013
 Finally+0.013
Neg logit contributions
 shark-0.021
 steak-0.020
 salad-0.019
 thief-0.019
 pasta-0.019
Token this
Token ID428
Feature activation-1.64e-01

Pos logit contributions
 shark+0.009
opus+0.009
 thief+0.009
 steak+0.009
ascal+0.009
Neg logit contributions
 sorts-0.006
 ahead-0.006
 called-0.006
 shades-0.005
 coloured-0.005
Token was
Token ID373
Feature activation-1.03e+00

Pos logit contributions
 sorts+0.002
 ahead+0.002
 called+0.002
 coloured+0.002
 Finally+0.002
Neg logit contributions
 shark-0.005
 steak-0.005
 thief-0.005
 salad-0.005
 crab-0.005
Token called
Token ID1444
Feature activation-4.56e-01

Pos logit contributions
k+0.025
 sorts+0.024
se+0.024
ples+0.024
antly+0.023
Neg logit contributions
 steak-0.022
 shark-0.022
 salad-0.020
 pasta-0.020
 thief-0.020
Token.
Token ID13
Feature activation-7.10e-02

Pos logit contributions
 sorts+0.011
 ahead+0.010
 Finally+0.010
 coloured+0.010
eter+0.010
Neg logit contributions
 shark-0.025
 steak-0.025
 thief-0.023
 salad-0.023
 crab-0.023
Token It
Token ID632
Feature activation-1.38e-01

Pos logit contributions
 sorts+0.001
 ahead+0.001
 coloured+0.001
k+0.001
 shades+0.001
Neg logit contributions
 shark-0.002
 steak-0.002
 thief-0.002
 salad-0.002
 crab-0.002
Token felt
Token ID2936
Feature activation-6.53e-01

Pos logit contributions
eter+0.003
 sorts+0.003
 Finally+0.003
 brushes+0.002
 After+0.002
Neg logit contributions
 shark-0.004
 crab-0.003
 steak-0.003
 thief-0.003
 camel-0.003
Token very
Token ID845
Feature activation+1.49e+00

Pos logit contributions
 sorts+0.013
eter+0.013
k+0.013
ples+0.012
ist+0.012
Neg logit contributions
 shark-0.017
 steak-0.016
 crab-0.015
 thief-0.014
 salad-0.014
Token happy
Token ID3772
Feature activation+9.36e-01

Pos logit contributions
 shark+0.034
 steak+0.032
 crab+0.031
 thief+0.030
 salad+0.030
Neg logit contributions
 sorts-0.034
ples-0.032
eter-0.031
 ahead-0.031
 coloured-0.031
Token.
Token ID13
Feature activation-2.93e-02

Pos logit contributions
 shark+0.029
 steak+0.028
 thief+0.027
 salad+0.026
 crab+0.026
Neg logit contributions
 sorts-0.013
 ahead-0.012
 Finally-0.012
k-0.011
 coloured-0.011
Token
Token ID198
Feature activation+4.42e-01

Pos logit contributions
 sorts+0.001
 coloured+0.000
 ahead+0.000
k+0.000
 shades+0.000
Neg logit contributions
 shark-0.001
 steak-0.001
 thief-0.001
 salad-0.001
 crab-0.001
Token
Token ID198
Feature activation+9.52e-01

Pos logit contributions
 shark+0.014
 steak+0.014
 thief+0.013
 salad+0.013
 dessert+0.013
Neg logit contributions
 sorts-0.006
 ahead-0.006
 called-0.005
 Finally-0.005
 After-0.005
TokenBut
Token ID1537
Feature activation-1.41e+00

Pos logit contributions
 shark+0.032
 steak+0.031
 thief+0.030
 salad+0.029
 dessert+0.029
Neg logit contributions
 sorts-0.012
 ahead-0.012
 called-0.010
 Finally-0.010
mediately-0.010
Token just
Token ID655
Feature activation-2.45e+00

Pos logit contributions
 sorts+0.025
ples+0.022
 shades+0.021
 brushes+0.021
eter+0.021
Neg logit contributions
 shark-0.041
 steak-0.039
 crab-0.038
 thief-0.036
 salad-0.036
Token when
Token ID618
Feature activation+4.10e-01

Pos logit contributions
 sorts+0.036
 Finally+0.033
cards+0.032
ruly+0.032
eter+0.031
Neg logit contributions
 shark-0.076
 steak-0.071
 crab-0.070
 salad-0.069
 pasta-0.068
Token a
Token ID257
Feature activation-5.82e-01

Pos logit contributions
 ahead+0.002
 sorts+0.002
 called+0.002
 over+0.002
 past+0.002
Neg logit contributions
 shark-0.003
 steak-0.003
opus-0.003
 thief-0.003
 crab-0.003
Token video
Token ID2008
Feature activation-2.73e-01

Pos logit contributions
 sorts+0.012
 ahead+0.012
 past+0.012
 called+0.012
 lower+0.011
Neg logit contributions
otion-0.014
ascal-0.013
oll-0.012
olo-0.012
olph-0.012
Token.
Token ID13
Feature activation+3.70e-01

Pos logit contributions
 ahead+0.003
 called+0.003
 sorts+0.003
 over+0.003
 past+0.002
Neg logit contributions
 shark-0.009
 steak-0.009
ascal-0.009
 crab-0.009
 gag-0.009
Token It
Token ID632
Feature activation-4.67e-01

Pos logit contributions
 shark+0.011
 steak+0.011
 thief+0.011
 crab+0.010
 salad+0.010
Neg logit contributions
 sorts-0.005
 ahead-0.005
 Finally-0.005
 called-0.005
 shades-0.004
Token was
Token ID373
Feature activation-3.02e-01

Pos logit contributions
 sorts+0.006
 ahead+0.006
k+0.005
 shades+0.005
eter+0.005
Neg logit contributions
 shark-0.015
 steak-0.014
 thief-0.014
 ow-0.013
 virus-0.013
Token an
Token ID281
Feature activation-2.18e-01

Pos logit contributions
 sorts+0.006
k+0.006
 shades+0.006
 ahead+0.006
antly+0.006
Neg logit contributions
 steak-0.007
 shark-0.007
 thief-0.007
 dessert-0.007
 salad-0.007
Token important
Token ID1593
Feature activation-1.64e-01

Pos logit contributions
 sorts+0.005
 called+0.005
 ahead+0.005
 coloured+0.005
 over+0.004
Neg logit contributions
 shark-0.005
 steak-0.004
opus-0.004
 crab-0.004
 gag-0.004
Token video
Token ID2008
Feature activation+1.86e-01

Pos logit contributions
 sorts+0.003
 ahead+0.002
 called+0.002
 coloured+0.002
 past+0.002
Neg logit contributions
 shark-0.005
 steak-0.005
ascal-0.005
 peas-0.004
 thief-0.004
Token about
Token ID546
Feature activation+1.07e-01

Pos logit contributions
 shark+0.006
 steak+0.006
 thief+0.005
 crab+0.005
 salad+0.005
Neg logit contributions
 sorts-0.003
 ahead-0.003
 called-0.002
 coloured-0.002
 Finally-0.002
Token being
Token ID852
Feature activation-6.89e-01

Pos logit contributions
 shark+0.003
 steak+0.003
 thief+0.003
opus+0.002
 crab+0.002
Neg logit contributions
 sorts-0.002
 ahead-0.002
 called-0.002
 coloured-0.002
 shades-0.002
Token nice
Token ID3621
Feature activation+5.23e-01

Pos logit contributions
 sorts+0.016
 Finally+0.016
antly+0.015
 ahead+0.015
 shades+0.015
Neg logit contributions
 shark-0.014
 thief-0.014
 steak-0.014
 liar-0.013
 dessert-0.013
Token watched
Token ID7342
Feature activation-1.02e+00

Pos logit contributions
 sorts+0.007
eter+0.007
ples+0.006
mediately+0.006
 shades+0.006
Neg logit contributions
 shark-0.011
 steak-0.010
 crab-0.010
 salad-0.010
 pasta-0.010
Token the
Token ID262
Feature activation-1.44e+00

Pos logit contributions
 sorts+0.012
 ahead+0.012
 called+0.011
 coloured+0.011
 Finally+0.010
Neg logit contributions
 shark-0.033
 steak-0.032
 thief-0.032
 crab-0.031
 salad-0.030
Token sun
Token ID4252
Feature activation-1.33e+00

Pos logit contributions
 sorts+0.030
 ahead+0.029
 called+0.027
 past+0.027
 coloured+0.027
Neg logit contributions
 shark-0.035
 steak-0.034
 thief-0.033
 salad-0.031
 liar-0.031
Token grow
Token ID1663
Feature activation+6.19e-01

Pos logit contributions
 sorts+0.023
 ahead+0.021
 Finally+0.020
mediately+0.020
 called+0.019
Neg logit contributions
 shark-0.036
 steak-0.035
 salad-0.034
 thief-0.033
 pasta-0.033
Token bigger
Token ID5749
Feature activation-1.23e+00

Pos logit contributions
 shark+0.016
 steak+0.016
 salad+0.016
 pasta+0.015
 peas+0.015
Neg logit contributions
 sorts-0.011
ually-0.010
k-0.010
mediately-0.010
 Finally-0.009
Token and
Token ID290
Feature activation-3.92e-01

Pos logit contributions
 sorts+0.018
 Finally+0.017
mediately+0.017
 ahead+0.016
ually+0.016
Neg logit contributions
 shark-0.037
 steak-0.036
 salad-0.034
 pasta-0.034
 crab-0.034
Token bigger
Token ID5749
Feature activation-1.28e+00

Pos logit contributions
ris+0.008
 sorts+0.008
se+0.007
k+0.007
ess+0.007
Neg logit contributions
 shark-0.010
 salad-0.010
 steak-0.009
 peas-0.009
 crab-0.009
Token.
Token ID13
Feature activation+4.43e-01

Pos logit contributions
 sorts+0.018
mediately+0.017
 Finally+0.016
 ahead+0.016
ually+0.015
Neg logit contributions
 shark-0.040
 steak-0.039
 salad-0.037
 crab-0.036
 pasta-0.036
Token After
Token ID2293
Feature activation+6.18e-01

Pos logit contributions
 steak+0.012
 shark+0.012
 salad+0.011
 pasta+0.011
 thief+0.011
Neg logit contributions
 sorts-0.007
 ahead-0.007
k-0.007
 coloured-0.007
eter-0.007
Token that
Token ID326
Feature activation-1.71e+00

Pos logit contributions
 shark+0.022
opus+0.021
ascal+0.021
 thief+0.021
oly+0.021
Neg logit contributions
 called-0.006
 ahead-0.006
 sorts-0.006
 past-0.006
 coloured-0.005
Token,
Token ID11
Feature activation-6.89e-02

Pos logit contributions
 ahead+0.011
 sorts+0.010
 past+0.009
 over+0.007
 called+0.006
Neg logit contributions
 shark-0.066
ascal-0.066
 steak-0.065
 liar-0.065
 gag-0.064
Token want
Token ID765
Feature activation+9.90e-01

Pos logit contributions
 shark+0.039
 steak+0.038
 thief+0.036
opus+0.036
 salad+0.035
Neg logit contributions
 sorts-0.017
 ahead-0.017
 called-0.016
k-0.015
 coloured-0.015
Token to
Token ID284
Feature activation+8.70e-01

Pos logit contributions
 shark+0.032
 steak+0.031
 thief+0.030
opus+0.029
 crab+0.029
Neg logit contributions
 sorts-0.012
 ahead-0.012
 called-0.011
 coloured-0.011
 shades-0.011
Token leave
Token ID2666
Feature activation+7.27e-01

Pos logit contributions
 shark+0.024
 steak+0.023
 thief+0.022
opus+0.022
 salad+0.021
Neg logit contributions
 sorts-0.015
 ahead-0.015
 called-0.014
k-0.013
 coloured-0.013
Token the
Token ID262
Feature activation+7.71e-01

Pos logit contributions
 shark+0.019
 steak+0.018
 thief+0.018
 salad+0.017
 crab+0.017
Neg logit contributions
 sorts-0.014
 ahead-0.013
 Finally-0.013
mediately-0.013
 coloured-0.013
Token seal
Token ID13810
Feature activation+9.64e-01

Pos logit contributions
 shark+0.015
otion+0.015
ascal+0.015
 steak+0.015
 gag+0.015
Neg logit contributions
 ahead-0.018
 past-0.017
 sorts-0.017
 called-0.017
 coloured-0.016
Token.
Token ID13
Feature activation-2.00e-01

Pos logit contributions
 shark+0.030
ascal+0.030
oly+0.029
 steak+0.029
 gag+0.029
Neg logit contributions
 ahead-0.013
 called-0.011
 over-0.011
 when-0.011
 sorts-0.010
Token He
Token ID679
Feature activation-1.26e-02

Pos logit contributions
 sorts+0.004
 ahead+0.004
 coloured+0.004
k+0.004
 shades+0.004
Neg logit contributions
 shark-0.005
 steak-0.004
 thief-0.004
 crab-0.004
 salad-0.004
Token wants
Token ID3382
Feature activation+1.29e+00

Pos logit contributions
 sorts+0.000
eter+0.000
 ahead+0.000
 shades+0.000
 coloured+0.000
Neg logit contributions
 shark-0.000
 steak-0.000
 thief-0.000
 mustache-0.000
 salad-0.000
Token to
Token ID284
Feature activation+4.40e-01

Pos logit contributions
 shark+0.042
 steak+0.040
 thief+0.040
opus+0.039
 salad+0.038
Neg logit contributions
 sorts-0.016
 ahead-0.015
 called-0.014
 past-0.013
 coloured-0.013
Token play
Token ID711
Feature activation+5.84e-01

Pos logit contributions
 shark+0.012
 steak+0.012
 thief+0.011
 crab+0.011
 salad+0.011
Neg logit contributions
 sorts-0.008
 ahead-0.007
 Finally-0.007
 coloured-0.007
 shades-0.007
Token with
Token ID351
Feature activation+1.25e+00

Pos logit contributions
ascal+0.023
opus+0.023
 gag+0.023
oly+0.023
 shark+0.022
Neg logit contributions
 ahead-0.004
 called-0.003
 like-0.003
 over-0.003
 past-0.003
Token the
Token ID262
Feature activation+5.53e-02

Pos logit contributions
 sorts+0.012
 ahead+0.011
 called+0.011
 Finally+0.010
 coloured+0.010
Neg logit contributions
 shark-0.022
 steak-0.021
 thief-0.021
 salad-0.020
 crab-0.020
Token cooler
Token ID19346
Feature activation-3.58e-01

Pos logit contributions
 shark+0.001
 steak+0.001
 thief+0.001
 salad+0.001
 crab+0.001
Neg logit contributions
 sorts-0.001
 ahead-0.001
 called-0.001
 coloured-0.001
 Finally-0.001
Token and
Token ID290
Feature activation+2.04e-01

Pos logit contributions
 sorts+0.006
 shades+0.005
ples+0.005
 coloured+0.005
 ahead+0.005
Neg logit contributions
 shark-0.010
 steak-0.010
 thief-0.010
 salad-0.010
 plate-0.010
Token was
Token ID373
Feature activation-2.20e-01

Pos logit contributions
 shark+0.006
 steak+0.006
 thief+0.005
 salad+0.005
 crab+0.005
Neg logit contributions
 sorts-0.003
 ahead-0.003
k-0.003
 shades-0.003
 Finally-0.003
Token stealing
Token ID16568
Feature activation-9.89e-02

Pos logit contributions
 sorts+0.005
 shades+0.004
k+0.004
 ahead+0.004
eter+0.004
Neg logit contributions
 shark-0.005
 steak-0.005
 thief-0.005
 salad-0.005
 dessert-0.005
Token all
Token ID477
Feature activation-2.38e+00

Pos logit contributions
 sorts+0.002
ually+0.002
k+0.002
 Finally+0.002
mediately+0.002
Neg logit contributions
 steak-0.003
 shark-0.003
 salad-0.002
 pasta-0.002
 thief-0.002
Token the
Token ID262
Feature activation-6.07e-01

Pos logit contributions
ples+0.038
 Finally+0.037
k+0.037
eter+0.037
 shades+0.035
Neg logit contributions
 shark-0.065
 steak-0.064
 salad-0.062
 crab-0.060
 thief-0.060
Token drinks
Token ID11758
Feature activation+9.70e-01

Pos logit contributions
 sorts+0.013
 ahead+0.013
 called+0.012
 Finally+0.012
 coloured+0.012
Neg logit contributions
 shark-0.014
 steak-0.013
 thief-0.013
 salad-0.012
 crab-0.012
Token.
Token ID13
Feature activation+2.57e-01

Pos logit contributions
 shark+0.032
 steak+0.031
 thief+0.030
 crab+0.030
 salad+0.030
Neg logit contributions
 sorts-0.011
 ahead-0.011
 called-0.010
 coloured-0.009
 Finally-0.009
Token Tim
Token ID5045
Feature activation-2.88e-01

Pos logit contributions
 shark+0.008
 steak+0.007
 thief+0.007
 salad+0.007
 crab+0.007
Neg logit contributions
 sorts-0.004
 ahead-0.004
 coloured-0.004
 called-0.004
 shades-0.003
Tokenmy
Token ID1820
Feature activation+1.84e-01

Pos logit contributions
 sorts+0.003
 ahead+0.003
 Finally+0.003
 shades+0.003
eter+0.003
Neg logit contributions
 shark-0.010
 steak-0.009
 thief-0.009
 crab-0.009
 mustache-0.009
Token May
Token ID1737
Feature activation+1.32e-01

Pos logit contributions
 coloured+0.022
 sorts+0.021
 admired+0.020
 past+0.020
k+0.020
Neg logit contributions
 shark-0.014
 steak-0.012
 thief-0.011
opus-0.011
 crab-0.011
Token I
Token ID314
Feature activation-5.59e-01

Pos logit contributions
 shark+0.004
 steak+0.003
 crab+0.003
 thief+0.003
opus+0.003
Neg logit contributions
 sorts-0.002
 Finally-0.002
mediately-0.002
 shades-0.002
 directions-0.002
Token join
Token ID4654
Feature activation-4.88e-01

Pos logit contributions
 sorts+0.010
 shades+0.010
eter+0.010
mediately+0.010
 colours+0.010
Neg logit contributions
 shark-0.014
 crab-0.012
 steak-0.012
 thief-0.012
 liar-0.012
Token you
Token ID345
Feature activation+6.14e-01

Pos logit contributions
 sorts+0.008
 ahead+0.008
 called+0.007
 past+0.007
 coloured+0.007
Neg logit contributions
 shark-0.014
 steak-0.013
 thief-0.013
opus-0.013
 gag-0.013
Token?"
Token ID1701
Feature activation-5.40e-01

Pos logit contributions
 shark+0.019
 steak+0.018
 thief+0.018
 salad+0.017
 crab+0.017
Neg logit contributions
 sorts-0.009
 ahead-0.008
 Finally-0.008
 coloured-0.008
 shades-0.008
Token
Token ID220
Feature activation-2.35e+00

Pos logit contributions
k+0.010
eter+0.010
 ahead+0.010
ruly+0.010
 coloured+0.010
Neg logit contributions
 shark-0.012
 steak-0.011
 salad-0.011
 pasta-0.011
 mustache-0.011
Token
Token ID628
Feature activation-1.25e+00

Pos logit contributions
 sorts+0.038
 ahead+0.037
 coloured+0.034
 called+0.034
 shades+0.034
Neg logit contributions
 shark-0.067
 steak-0.065
 thief-0.062
 salad-0.061
 crab-0.061
Token Maria
Token ID14200
Feature activation-1.71e+00

Pos logit contributions
 sorts+0.022
k+0.022
 ahead+0.022
eter+0.022
 coloured+0.022
Neg logit contributions
 shark-0.032
 steak-0.030
 salad-0.029
 crab-0.029
 thief-0.029
Token nodded
Token ID14464
Feature activation-1.67e-01

Pos logit contributions
 sorts+0.029
 Finally+0.028
 ahead+0.027
eter+0.026
 After+0.025
Neg logit contributions
 shark-0.047
 steak-0.044
 mustache-0.043
 crab-0.043
opus-0.041
Token her
Token ID607
Feature activation-4.68e-02

Pos logit contributions
 sorts+0.003
 ahead+0.002
 Finally+0.002
 called+0.002
 shades+0.002
Neg logit contributions
 shark-0.005
 steak-0.005
 thief-0.005
 crab-0.005
 salad-0.005
Token head
Token ID1182
Feature activation-5.40e-02

Pos logit contributions
 sorts+0.001
 ahead+0.001
 coloured+0.001
 kinds+0.001
ples+0.001
Neg logit contributions
 shark-0.001
 steak-0.001
 thief-0.001
 mustache-0.001
 salad-0.001
Token wanted
Token ID2227
Feature activation+2.71e-01

Pos logit contributions
 sorts+0.006
 ahead+0.006
eter+0.006
antly+0.006
mediately+0.006
Neg logit contributions
 shark-0.010
 steak-0.009
 thief-0.009
 crab-0.009
 salad-0.009
Token to
Token ID284
Feature activation-1.43e-01

Pos logit contributions
 shark+0.008
 steak+0.008
 thief+0.008
 salad+0.008
 crab+0.008
Neg logit contributions
 sorts-0.004
 ahead-0.004
 called-0.003
 coloured-0.003
 Finally-0.003
Token eat
Token ID4483
Feature activation+5.65e-01

Pos logit contributions
 sorts+0.003
 ahead+0.003
 Finally+0.003
ples+0.003
mediately+0.003
Neg logit contributions
 shark-0.003
 steak-0.003
 thief-0.003
 crab-0.003
 salad-0.003
Token the
Token ID262
Feature activation-5.71e-01

Pos logit contributions
 shark+0.017
 steak+0.016
 thief+0.016
opus+0.015
 crab+0.015
Neg logit contributions
 sorts-0.008
 ahead-0.008
 called-0.008
 coloured-0.007
 Finally-0.007
Token fruits
Token ID15921
Feature activation+4.09e-01

Pos logit contributions
 sorts+0.014
 ahead+0.014
 Finally+0.013
 called+0.013
 coloured+0.013
Neg logit contributions
 shark-0.012
 steak-0.011
 thief-0.010
 crab-0.010
opus-0.010
Token right
Token ID826
Feature activation-2.35e+00

Pos logit contributions
 shark+0.015
ascal+0.014
opus+0.014
 virus+0.014
 gag+0.014
Neg logit contributions
 ahead-0.004
 sorts-0.004
 called-0.004
 over-0.003
 past-0.003
Token away
Token ID1497
Feature activation-3.19e-01

Pos logit contributions
 sorts+0.042
 Finally+0.036
 After+0.036
 brushes+0.036
ples+0.035
Neg logit contributions
 shark-0.065
 steak-0.062
 crab-0.059
 dessert-0.058
 pasta-0.058
Token.
Token ID13
Feature activation-8.07e-01

Pos logit contributions
 sorts+0.004
 Finally+0.004
 ahead+0.004
mediately+0.004
 coloured+0.004
Neg logit contributions
 shark-0.010
 steak-0.010
 thief-0.009
 crab-0.009
 salad-0.009
Token They
Token ID1119
Feature activation-3.40e-01

Pos logit contributions
 sorts+0.013
 coloured+0.012
 ahead+0.012
k+0.011
 colours+0.011
Neg logit contributions
 shark-0.024
 steak-0.023
 thief-0.022
 crab-0.021
 pasta-0.021
Token peeled
Token ID42617
Feature activation-1.00e+00

Pos logit contributions
 sorts+0.006
 ahead+0.006
eter+0.005
 shades+0.005
antly+0.005
Neg logit contributions
 shark-0.009
 steak-0.009
 thief-0.009
 crab-0.009
 salad-0.009
Token off
Token ID572
Feature activation+3.06e-01

Pos logit contributions
 sorts+0.019
 ahead+0.017
 coloured+0.016
ually+0.016
 called+0.016
Neg logit contributions
 shark-0.027
 steak-0.027
 salad-0.026
 pasta-0.025
 crab-0.025
Token down
Token ID866
Feature activation+9.81e-01

Pos logit contributions
 shark+0.034
 steak+0.033
 thief+0.032
 crab+0.031
 salad+0.031
Neg logit contributions
 sorts-0.017
 ahead-0.016
 shades-0.015
 coloured-0.015
 called-0.015
Token and
Token ID290
Feature activation+1.33e+00

Pos logit contributions
 shark+0.028
 steak+0.027
 thief+0.026
 salad+0.025
 crab+0.025
Neg logit contributions
 sorts-0.017
 ahead-0.015
k-0.015
 shades-0.015
 coloured-0.015
Token the
Token ID262
Feature activation+1.75e-01

Pos logit contributions
 shark+0.034
 steak+0.032
 salad+0.030
 pasta+0.030
 crab+0.030
Neg logit contributions
 sorts-0.026
 ahead-0.024
k-0.024
eter-0.024
ples-0.024
Token lightning
Token ID14357
Feature activation+3.71e-01

Pos logit contributions
 shark+0.004
 steak+0.004
 thief+0.004
 salad+0.004
 crab+0.004
Neg logit contributions
 sorts-0.004
 ahead-0.003
 Finally-0.003
k-0.003
 called-0.003
Token flashes
Token ID29227
Feature activation-4.49e-01

Pos logit contributions
 shark+0.010
 steak+0.010
 thief+0.009
 ow+0.009
 crab+0.009
Neg logit contributions
 sorts-0.006
eter-0.006
 Finally-0.006
mediately-0.006
 After-0.005
Token lit
Token ID6578
Feature activation-2.26e+00

Pos logit contributions
 Finally+0.007
 sorts+0.007
eter+0.007
ually+0.007
 brushes+0.007
Neg logit contributions
 shark-0.013
 steak-0.012
 salad-0.011
 ow-0.011
 crab-0.011
Token up
Token ID510
Feature activation+6.01e-02

Pos logit contributions
 brushes+0.038
mediately+0.038
 sorts+0.037
enge+0.037
Ok+0.035
Neg logit contributions
 shark-0.059
 steak-0.059
 iceberg-0.053
 ow-0.052
 salad-0.051
Token the
Token ID262
Feature activation+3.79e-01

Pos logit contributions
 shark+0.002
 steak+0.002
 salad+0.001
 pasta+0.001
 crab+0.001
Neg logit contributions
 sorts-0.001
 brushes-0.001
ually-0.001
mediately-0.001
 Finally-0.001
Token sky
Token ID6766
Feature activation-4.55e-01

Pos logit contributions
 shark+0.009
 steak+0.008
 thief+0.008
 ow+0.008
 salad+0.008
Neg logit contributions
 sorts-0.009
eter-0.008
mediately-0.008
 ahead-0.008
k-0.008
Token.
Token ID13
Feature activation+4.39e-01

Pos logit contributions
 sorts+0.007
mediately+0.007
 Finally+0.007
 brushes+0.006
ually+0.006
Neg logit contributions
 shark-0.013
 steak-0.013
 salad-0.012
 pasta-0.012
 thief-0.012
Token
Token ID198
Feature activation+4.31e-01

Pos logit contributions
 shark+0.012
 steak+0.011
 thief+0.010
 salad+0.010
 pasta+0.010
Neg logit contributions
 sorts-0.008
 coloured-0.008
 ahead-0.008
k-0.008
 shades-0.007
Token word
Token ID1573
Feature activation-5.12e-01

Pos logit contributions
 sorts+0.004
 ahead+0.003
 Finally+0.003
eter+0.003
 coloured+0.003
Neg logit contributions
 shark-0.005
 steak-0.005
 thief-0.005
 salad-0.005
 dessert-0.005
Token in
Token ID287
Feature activation-6.87e-01

Pos logit contributions
 sorts+0.008
 Finally+0.007
mediately+0.007
 ahead+0.007
 coloured+0.007
Neg logit contributions
 shark-0.015
 steak-0.014
 thief-0.014
 salad-0.013
 pasta-0.013
Token the
Token ID262
Feature activation-4.69e-01

Pos logit contributions
 Finally+0.012
 sorts+0.012
eter+0.012
 ahead+0.012
mediately+0.011
Neg logit contributions
 shark-0.018
 steak-0.017
 thief-0.016
 liar-0.016
 ow-0.016
Token newspaper
Token ID7533
Feature activation+2.48e-01

Pos logit contributions
 sorts+0.010
 ahead+0.009
eter+0.009
 coloured+0.009
ually+0.009
Neg logit contributions
 shark-0.011
 steak-0.011
 thief-0.011
 mustache-0.010
 salad-0.010
Token,
Token ID11
Feature activation-4.21e-01

Pos logit contributions
 shark+0.008
 steak+0.008
 thief+0.008
 salad+0.007
 crab+0.007
Neg logit contributions
 sorts-0.003
 ahead-0.003
 Finally-0.003
 coloured-0.003
 shades-0.003
Token just
Token ID655
Feature activation-2.33e+00

Pos logit contributions
 sorts+0.008
 ahead+0.007
 coloured+0.007
 Finally+0.007
eter+0.007
Neg logit contributions
 shark-0.011
 steak-0.011
 thief-0.010
 salad-0.010
 pasta-0.010
Token like
Token ID588
Feature activation+6.02e-01

Pos logit contributions
 sorts+0.038
 Finally+0.037
eter+0.035
amed+0.034
ris+0.033
Neg logit contributions
 shark-0.063
 steak-0.061
 thief-0.058
 ow-0.058
 salad-0.057
Token the
Token ID262
Feature activation-1.11e+00

Pos logit contributions
 shark+0.016
 steak+0.015
 thief+0.015
 salad+0.014
 crab+0.014
Neg logit contributions
 sorts-0.011
 ahead-0.011
 Finally-0.010
k-0.010
 coloured-0.010
Token bird
Token ID6512
Feature activation+3.35e-01

Pos logit contributions
 sorts+0.025
 ahead+0.023
 Finally+0.023
k+0.023
mediately+0.023
Neg logit contributions
 shark-0.025
 steak-0.025
 thief-0.024
 salad-0.023
 crab-0.022
Token moved
Token ID3888
Feature activation-4.79e-01

Pos logit contributions
 shark+0.010
 steak+0.010
 thief+0.010
 crab+0.009
 salad+0.009
Neg logit contributions
 sorts-0.005
 ahead-0.004
 Finally-0.004
 shades-0.004
 coloured-0.004
Token from
Token ID422
Feature activation-3.79e-01

Pos logit contributions
 sorts+0.007
 Finally+0.007
ually+0.007
 brushes+0.006
mediately+0.006
Neg logit contributions
 shark-0.014
 steak-0.013
 thief-0.013
 crab-0.012
 pasta-0.012