TOP ACTIVATIONS
MAX = 0.121

fence and run to the ost ric hes .
They wanted to touch the ost ric hes . They reached
their loving friendship . <|endoftext|> Pe ggy was very excited .
to mom and to the ost ric hes . They learned
stopped and looked at the ost ric hes . They saw
From that day on , Pe ppa never went for a
be filled with wonderful <|endoftext|> Once upon a time , there
lies on the grass and breat hes hard . He does
. Tom ro ars and breat hes fire . Anna screams
together every day . <|endoftext|> Once upon a time , there
in the park . <|endoftext|> Once upon a time , there
had happy dreams . <|endoftext|> Once upon a time , there
be safe at last <|endoftext|> Once upon a time , there
have good friends . <|endoftext|> Once upon a time , there
on the farm . <|endoftext|> Once upon a time , a
One day , Pe ppa stumbled upon a strange
on their faces . <|endoftext|> Once upon a time there was
happy and tired . <|endoftext|> Once upon a time , there
making people happy . <|endoftext|> Once upon a time , there
, fall weather . <|endoftext|> Once upon a time , in

MIN ACTIVATIONS
MIN = -0.098

's mom showed him a big file . It was full
their mom gave them a big bowl of cream to share
and Dad give them a big present each . Ben and
. She showed him a big map and said " Look
and dad gave her a big bowl . She was so
's mom gave him a big thick envelope to mail .
Her mom gave her a big , red umbrella to keep
. The map shows a big X in the east .
's mom gave him a big onion to eat . Tim
, Mom gave them a big bar of chocolate to share
opl ane . It is big and red and can fly
leaves . It was pile big enough for Bob to hide
. Mom gives them a big bowl of dough and a
best , because it is big and strong . Mia likes
. Mom gave her a big bowl and a spoon .
He thinks hipp os are big and funny and have big
and dad gave them a big surprise . They had bought
oy sters , which were big and hard and had a
Suddenly , she found a big stick ! She bounded over
special . It was a big temple ! The temple was

INTERVAL 0.066 - 0.121
CONTAINS 0.634%

come back again . <|endoftext|> Once upon a time , there
That was fun !" <|endoftext|> Once upon a time , there
be just perfect ! <|endoftext|> Once there was a boy named
explore the jungle . <|endoftext|> Once upon a time , there
<|endoftext|> Once upon a time , there was a small boot

INTERVAL 0.011 - 0.066
CONTAINS 27.705%

of o atmeal without any marks . He ate it all
she spun around and around in a circle . The tw
to play . She forgot about her sadness and smiled .
streets and saw a park , a store , and a
it 's not always safe to zoom around without being careful

INTERVAL -0.043 - 0.011
CONTAINS 67.607%

for a while and realized that he would feel bad if
mom and hid under the table . Their mom was angry
play her trumpet . She would play it every day and
When Sarah got the past els home , she was so
mouse ! It was a very cute mouse ! The two

INTERVAL -0.098 - -0.043
CONTAINS 4.054%

Ben . You are very nice . I will be very
the park . It is big and round and green .
floor . It made a loud noise . Suddenly
proud . He shouted , " Look , I did it
je ep found a bright red apple . He was very
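
The four INTERVAL ranges above look like equal-width bins between MIN and MAX. The sketch below is an assumption inferred from the displayed numbers, not a documented property of the tool that produced this dump. It rebuilds the bin edges from MIN = -0.098 and MAX = 0.121 and assigns an activation to a bin. The function names `equal_width_edges` and `bin_index` are hypothetical. The computed edges agree with the displayed ones to within about 0.001, which is expected because the displayed MIN and MAX are themselves rounded.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 edges that split [lo, hi] into equal-width bins."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def bin_index(value, edges):
    """Return the 0-based bin index for value, clamping to the outer bins."""
    n_bins = len(edges) - 1
    if value <= edges[0]:
        return 0
    if value >= edges[-1]:
        return n_bins - 1
    width = (edges[-1] - edges[0]) / n_bins
    return min(int((value - edges[0]) / width), n_bins - 1)


# MIN and MAX as displayed in this dump (rounded values).
edges = equal_width_edges(-0.098, 0.121, 4)

# Activation of " wanted" (-3.44e-02) and of "ric" (+9.07e-02) from the token tables.
wanted_bin = bin_index(-3.44e-02, edges)
ric_bin = bin_index(9.07e-02, edges)
```

Bin 0 is the lowest interval and bin 3 the highest, so under this assumption the " wanted" activation falls in the -0.043 to 0.011 range and the "ric" activation in the 0.066 to 0.121 range.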
Token fence
Token ID13990
Feature activation+6.65e-03

Pos logit contributions
bie+0.021
ric+0.021
abella+0.020
tha+0.020
icking+0.019
Neg logit contributions
 aback-0.018
lier-0.016
Sur-0.016
pillar-0.016
Mine-0.016
Token and
Token ID290
Feature activation+1.70e-02

Pos logit contributions
icking+0.015
abella+0.014
bie+0.014
tha+0.014
bled+0.013
Neg logit contributions
pillar-0.008
 target-0.007
Sur-0.007
 pieces-0.006
 aback-0.006
Token run
Token ID1057
Feature activation+9.51e-03

Pos logit contributions
abella+0.035
icking+0.035
bie+0.034
tha+0.033
bled+0.033
Neg logit contributions
 target-0.021
pillar-0.021
 pieces-0.021
 bites-0.018
Sur-0.018
Token to
Token ID284
Feature activation+1.75e-02

Pos logit contributions
abella+0.020
icking+0.019
bie+0.019
tha+0.019
bled+0.019
Neg logit contributions
pillar-0.011
 target-0.010
 pieces-0.010
Sur-0.009
 piece-0.008
Token the
Token ID262
Feature activation+3.74e-05

Pos logit contributions
icking+0.034
abella+0.034
bie+0.033
bled+0.032
tha+0.032
Neg logit contributions
 target-0.024
 pieces-0.024
pillar-0.023
 bites-0.022
 string-0.021
Token ost
Token ID23619
Feature activation+1.21e-01

Pos logit contributions
abella+0.000
icking+0.000
bie+0.000
tha+0.000
bled+0.000
Neg logit contributions
pillar-0.000
 aback-0.000
 target-0.000
 pieces-0.000
Sur-0.000
Tokenric
Token ID1173
Feature activation+9.07e-02

Pos logit contributions
abella+0.090
bled+0.069
icking+0.063
belt+0.062
ili+0.061
Neg logit contributions
pillar-0.312
 target-0.312
 pieces-0.304
 bites-0.291
',"-0.283
Tokenhes
Token ID956
Feature activation+7.60e-03

Pos logit contributions
icking+0.104
abella+0.099
tha+0.098
bie+0.097
belt+0.087
Neg logit contributions
pillar-0.196
 pieces-0.195
 target-0.189
Sur-0.181
 bites-0.179
Token.
Token ID13
Feature activation+1.65e-03

Pos logit contributions
abella+0.021
icking+0.020
bie+0.020
tha+0.020
bled+0.019
Neg logit contributions
 pieces-0.005
 target-0.004
pillar-0.004
 bites-0.003
 string-0.003
Token\n
Token ID198
Feature activation-7.02e-03

Pos logit contributions
icking+0.004
abella+0.004
bled+0.004
bie+0.004
tha+0.004
Neg logit contributions
 pieces-0.001
 target-0.001
pillar-0.001
Sur-0.001
 bites-0.001
Token\n
Token ID198
Feature activation-9.97e-03

Pos logit contributions
pillar+0.006
 target+0.006
 pieces+0.006
 bites+0.005
Sur+0.005
Neg logit contributions
abella-0.017
bie-0.017
icking-0.017
tha-0.016
ili-0.016
Token They
Token ID1119
Feature activation-4.05e-03

Pos logit contributions
abella+0.002
icking+0.002
bie+0.002
tha+0.002
bled+0.002
Neg logit contributions
pillar-0.001
 target-0.001
 pieces-0.001
Sur-0.001
 bites-0.001
Token wanted
Token ID2227
Feature activation-3.44e-02

Pos logit contributions
 target+0.005
pillar+0.005
 pieces+0.005
 bites+0.005
Sur+0.005
Neg logit contributions
icking-0.008
abella-0.008
bie-0.008
tha-0.008
bled-0.008
Token to
Token ID284
Feature activation+3.15e-04

Pos logit contributions
 pieces+0.022
 target+0.019
pillar+0.018
 bites+0.017
Sur+0.014
Neg logit contributions
icking-0.093
abella-0.092
tha-0.090
bled-0.090
bie-0.089
Token touch
Token ID3638
Feature activation-5.65e-03

Pos logit contributions
icking+0.001
bie+0.001
abella+0.001
tha+0.001
ric+0.000
Neg logit contributions
pillar-0.001
 pieces-0.000
 target-0.000
Sur-0.000
 aback-0.000
Token the
Token ID262
Feature activation-4.39e-03

Pos logit contributions
 pieces+0.008
pillar+0.007
 target+0.007
 bites+0.007
 string+0.007
Neg logit contributions
abella-0.011
icking-0.011
bie-0.011
bled-0.011
tha-0.011
Token ost
Token ID23619
Feature activation+1.20e-01

Pos logit contributions
pillar+0.006
 target+0.006
 pieces+0.006
 aback+0.006
Sur+0.006
Neg logit contributions
icking-0.008
abella-0.008
bie-0.008
tha-0.008
ili-0.007
Tokenric
Token ID1173
Feature activation+8.17e-02

Pos logit contributions
abella+0.090
bled+0.064
belt+0.061
ili+0.059
icking+0.057
Neg logit contributions
pillar-0.315
 target-0.309
 pieces-0.306
 bites-0.291
 bite-0.283
Tokenhes
Token ID956
Feature activation+4.03e-03

Pos logit contributions
icking+0.089
abella+0.087
bie+0.085
tha+0.085
belt+0.074
Neg logit contributions
pillar-0.180
 pieces-0.179
 target-0.173
Sur-0.165
 bites-0.165
Token.
Token ID13
Feature activation+5.40e-03

Pos logit contributions
abella+0.010
bie+0.010
icking+0.010
tha+0.010
bled+0.010
Neg logit contributions
 pieces-0.003
 target-0.003
pillar-0.003
 bites-0.002
 string-0.002
Token They
Token ID1119
Feature activation-1.57e-03

Pos logit contributions
abella+0.011
icking+0.011
bie+0.011
tha+0.010
bled+0.010
Neg logit contributions
pillar-0.007
 target-0.007
 pieces-0.007
 bites-0.006
Sur-0.006
Token reached
Token ID4251
Feature activation+1.91e-02

Pos logit contributions
 target+0.002
pillar+0.002
 pieces+0.002
 bites+0.002
 string+0.002
Neg logit contributions
icking-0.003
abella-0.003
bie-0.003
tha-0.003
bled-0.003
Token their
Token ID511
Feature activation+5.36e-02

Pos logit contributions
abella+0.081
icking+0.077
belt+0.075
bie+0.075
tha+0.074
Neg logit contributions
 pieces-0.069
 target-0.065
pillar-0.063
 string-0.059
 bites-0.058
Token loving
Token ID14442
Feature activation-9.17e-03

Pos logit contributions
abella+0.105
icking+0.104
bie+0.101
tha+0.099
bled+0.098
Neg logit contributions
 target-0.073
pillar-0.073
 pieces-0.071
 bites-0.064
Sur-0.063
Token friendship
Token ID14738
Feature activation+1.15e-02

Pos logit contributions
pillar+0.014
 target+0.014
 pieces+0.014
Sur+0.013
 bites+0.012
Neg logit contributions
icking-0.016
abella-0.015
bie-0.015
tha-0.015
bled-0.014
Token.
Token ID13
Feature activation-4.30e-03

Pos logit contributions
icking+0.027
abella+0.026
bie+0.026
tha+0.025
bled+0.025
Neg logit contributions
pillar-0.011
 target-0.011
 pieces-0.010
Sur-0.010
 bites-0.009
Token<|endoftext|>
Token ID50256
Feature activation+6.98e-04

Pos logit contributions
pillar+0.003
 target+0.003
 pieces+0.003
Sur+0.003
 bites+0.003
Neg logit contributions
icking-0.011
abella-0.011
bie-0.011
tha-0.010
bled-0.010
TokenPe
Token ID6435
Feature activation+1.17e-01

Pos logit contributions
abella+0.001
icking+0.001
bie+0.001
tha+0.001
ili+0.001
Neg logit contributions
 target-0.001
pillar-0.001
 pieces-0.001
 bites-0.001
Sur-0.001
Tokenggy
Token ID19970
Feature activation+2.48e-02

Pos logit contributions
ancer+0.185
angers+0.167
lights+0.165
bie+0.164
ores+0.163
Neg logit contributions
'-0.190
pillar-0.179
's-0.178
 target-0.175
-0.173
Token was
Token ID373
Feature activation+2.45e-02

Pos logit contributions
abella+0.053
icking+0.053
bie+0.052
tha+0.051
bled+0.049
Neg logit contributions
pillar-0.028
 target-0.028
 pieces-0.027
Sur-0.024
 bites-0.023
Token very
Token ID845
Feature activation+1.62e-02

Pos logit contributions
abella+0.051
icking+0.051
tha+0.050
riage+0.048
bie+0.048
Neg logit contributions
 pieces-0.028
 target-0.026
pillar-0.026
-0.025
 aback-0.024
Token excited
Token ID6568
Feature activation-5.12e-02

Pos logit contributions
abella+0.038
ogo+0.038
icking+0.037
ric+0.037
tha+0.037
Neg logit contributions
-0.015
 piece-0.012
 bites-0.012
!"-0.011
 pieces-0.011
Token.
Token ID13
Feature activation+1.99e-03

Pos logit contributions
pillar+0.054
 target+0.052
 pieces+0.048
Sur+0.046
 aback+0.043
Neg logit contributions
icking-0.119
abella-0.113
bie-0.113
tha-0.110
bled-0.106
Token to
Token ID284
Feature activation+1.62e-02

Pos logit contributions
pillar+0.040
 target+0.039
 pieces+0.035
Sur+0.032
 aback+0.030
Neg logit contributions
icking-0.090
bie-0.089
abella-0.088
bled-0.083
tha-0.082
Token mom
Token ID1995
Feature activation-1.69e-02

Pos logit contributions
aj+0.040
abella+0.038
icking+0.038
belt+0.037
angers+0.037
Neg logit contributions
 pieces-0.014
 string-0.014
 bite-0.014
 piece-0.013
 bites-0.013
Token and
Token ID290
Feature activation+2.74e-02

Pos logit contributions
 pieces+0.008
pillar+0.006
 target+0.005
 bites+0.004
 aback+0.003
Neg logit contributions
abella-0.049
bled-0.048
tha-0.047
ili-0.047
bie-0.047
Token to
Token ID284
Feature activation+2.18e-02

Pos logit contributions
abella+0.067
icking+0.065
bie+0.064
tha+0.063
bled+0.063
Neg logit contributions
 pieces-0.024
 target-0.023
pillar-0.023
 bites-0.020
Sur-0.020
Token the
Token ID262
Feature activation-4.03e-03

Pos logit contributions
abella+0.056
aj+0.055
angers+0.053
belt+0.052
bie+0.052
Neg logit contributions
 bite-0.019
 string-0.017
 target-0.017
 bites-0.017
 rescue-0.016
Token ost
Token ID23619
Feature activation+1.17e-01

Pos logit contributions
 pieces+0.007
pillar+0.007
 target+0.007
Sur+0.007
 bites+0.006
Neg logit contributions
icking-0.006
abella-0.006
bled-0.006
bie-0.006
belt-0.006
Tokenric
Token ID1173
Feature activation+7.84e-02

Pos logit contributions
abella+0.109
bled+0.085
belt+0.081
 Ty+0.073
 Seth+0.071
Neg logit contributions
pillar-0.274
 pieces-0.271
 target-0.263
 bites-0.251
!"-0.250
Tokenhes
Token ID956
Feature activation+1.51e-02

Pos logit contributions
icking+0.086
abella+0.084
bie+0.083
tha+0.081
belt+0.072
Neg logit contributions
 pieces-0.172
pillar-0.170
 target-0.167
Sur-0.158
 bites-0.157
Token.
Token ID13
Feature activation+3.71e-03

Pos logit contributions
abella+0.040
bie+0.040
tha+0.039
aj+0.039
icking+0.039
Neg logit contributions
 pieces-0.010
 target-0.007
 bites-0.006
 string-0.006
 piece-0.006
Token They
Token ID1119
Feature activation-5.52e-03

Pos logit contributions
icking+0.008
abella+0.008
bie+0.008
tha+0.007
bled+0.007
Neg logit contributions
pillar-0.005
 target-0.005
 pieces-0.004
Sur-0.004
 bites-0.004
Token learned
Token ID4499
Feature activation-2.94e-02

Pos logit contributions
pillar+0.009
 target+0.009
 pieces+0.008
Sur+0.008
 bites+0.008
Neg logit contributions
icking-0.010
abella-0.010
bie-0.009
tha-0.009
bled-0.009
Token stopped
Token ID5025
Feature activation+3.37e-02

Pos logit contributions
abella+0.001
icking+0.001
bie+0.001
tha+0.001
bled+0.001
Neg logit contributions
pillar-0.001
 target-0.001
 pieces-0.001
Sur-0.001
 bites-0.001
Token and
Token ID290
Feature activation+2.13e-03

Pos logit contributions
abella+0.076
belt+0.071
bled+0.069
icking+0.069
ili+0.068
Neg logit contributions
 pieces-0.037
 target-0.037
 bites-0.035
pillar-0.035
 bite-0.033
Token looked
Token ID3114
Feature activation+1.14e-02

Pos logit contributions
icking+0.005
bled+0.005
angers+0.005
belt+0.005
ancer+0.005
Neg logit contributions
 bites-0.002
 target-0.002
 pieces-0.002
's-0.002
pillar-0.002
Token at
Token ID379
Feature activation+9.04e-03

Pos logit contributions
abella+0.023
icking+0.022
bie+0.022
tha+0.021
bled+0.021
Neg logit contributions
pillar-0.015
 target-0.015
 pieces-0.015
 bites-0.013
 aback-0.013
Token the
Token ID262
Feature activation-1.24e-02

Pos logit contributions
icking+0.019
abella+0.019
aj+0.019
belt+0.019
bled+0.018
Neg logit contributions
 pieces-0.011
 target-0.010
 string-0.010
 bites-0.009
pillar-0.009
Token ost
Token ID23619
Feature activation+1.14e-01

Pos logit contributions
 pieces+0.020
pillar+0.019
 target+0.019
Sur+0.017
 bites+0.017
Neg logit contributions
icking-0.022
abella-0.021
bie-0.021
bled-0.020
tha-0.020
Tokenric
Token ID1173
Feature activation+8.30e-02

Pos logit contributions
abella+0.107
bled+0.082
belt+0.080
 seeker+0.072
ili+0.072
Neg logit contributions
pillar-0.271
 pieces-0.262
 target-0.258
 bites-0.249
!"-0.247
Tokenhes
Token ID956
Feature activation-2.09e-03

Pos logit contributions
icking+0.099
tha+0.095
bie+0.094
abella+0.094
belt+0.085
Neg logit contributions
 pieces-0.176
pillar-0.174
 target-0.165
Sur-0.161
 bites-0.160
Token.
Token ID13
Feature activation-9.75e-05

Pos logit contributions
 pieces+0.001
pillar+0.001
 target+0.001
 bites+0.001
 string+0.001
Neg logit contributions
abella-0.006
bie-0.006
tha-0.006
icking-0.006
ric-0.005
Token They
Token ID1119
Feature activation-6.68e-03

Pos logit contributions
 target+0.000
 pieces+0.000
pillar+0.000
Sur+0.000
 bites+0.000
Neg logit contributions
icking-0.000
bled-0.000
abella-0.000
tha-0.000
bie-0.000
Token saw
Token ID2497
Feature activation-1.68e-02

Pos logit contributions
 target+0.006
 pieces+0.006
pillar+0.006
 bites+0.005
 string+0.005
Neg logit contributions
icking-0.016
abella-0.015
bled-0.015
bie-0.015
tha-0.015
Token From
Token ID3574
Feature activation+3.99e-03

Pos logit contributions
pillar+0.010
 target+0.010
 pieces+0.010
Sur+0.010
 aback+0.009
Neg logit contributions
bie-0.016
abella-0.016
icking-0.016
bled-0.015
tha-0.015
Token that
Token ID326
Feature activation+3.90e-02

Pos logit contributions
bie+0.011
icking+0.011
ancer+0.011
abella+0.011
allows+0.011
Neg logit contributions
 target-0.002
pillar-0.002
 pieces-0.002
 string-0.002
 bites-0.001
Token day
Token ID1110
Feature activation+2.37e-02

Pos logit contributions
abella+0.059
icking+0.059
bie+0.057
tha+0.055
bled+0.054
Neg logit contributions
pillar-0.070
 target-0.069
 pieces-0.068
Sur-0.063
 bites-0.062
Token on
Token ID319
Feature activation+9.04e-03

Pos logit contributions
abella+0.053
bie+0.053
ric+0.052
tha+0.050
aj+0.050
Neg logit contributions
pillar-0.026
 pieces-0.024
Sur-0.024
'-0.022
 target-0.021
Token,
Token ID11
Feature activation+3.01e-02

Pos logit contributions
abella+0.024
bie+0.023
icking+0.022
ric+0.022
bled+0.022
Neg logit contributions
 pieces-0.007
pillar-0.007
 bites-0.005
Sur-0.005
 string-0.005
Token Pe
Token ID2631
Feature activation+1.13e-01

Pos logit contributions
abella+0.061
bie+0.060
aj+0.059
icking+0.058
belt+0.058
Neg logit contributions
 pieces-0.038
pillar-0.037
Sur-0.036
 string-0.035
 target-0.033
Tokenppa
Token ID44989
Feature activation+6.24e-03

Pos logit contributions
abella+0.120
ores+0.104
bie+0.104
icking+0.104
 Jas+0.102
Neg logit contributions
pillar-0.249
 pieces-0.246
 target-0.245
Sur-0.229
 string-0.225
Token never
Token ID1239
Feature activation+4.58e-03

Pos logit contributions
abella+0.013
icking+0.013
bled+0.013
tha+0.012
bie+0.012
Neg logit contributions
pillar-0.008
 pieces-0.007
 target-0.007
 bites-0.006
Sur-0.006
Token went
Token ID1816
Feature activation+4.87e-02

Pos logit contributions
abella+0.009
bled+0.008
icking+0.008
tha+0.008
ili+0.008
Neg logit contributions
 target-0.007
pillar-0.007
 pieces-0.006
 bites-0.006
 aback-0.006
Token for
Token ID329
Feature activation+8.97e-03

Pos logit contributions
abella+0.130
bie+0.126
tha+0.125
bled+0.124
icking+0.124
Neg logit contributions
 pieces-0.035
 bites-0.034
 target-0.032
 aback-0.030
pillar-0.029
Token a
Token ID257
Feature activation-1.56e-02

Pos logit contributions
icking+0.022
abella+0.021
bie+0.021
bled+0.021
tha+0.021
Neg logit contributions
 pieces-0.009
 target-0.008
pillar-0.008
 bites-0.007
Sur-0.007
Token be
Token ID307
Feature activation+3.87e-02

Pos logit contributions
pillar+0.014
 target+0.013
 pieces+0.012
Sur+0.012
 bites+0.012
Neg logit contributions
icking-0.017
abella-0.016
tha-0.016
bie-0.015
ric-0.015
Token filled
Token ID5901
Feature activation+7.77e-03

Pos logit contributions
abella+0.083
icking+0.082
bie+0.080
tha+0.080
ili+0.079
Neg logit contributions
 pieces-0.047
 target-0.046
pillar-0.041
 aback-0.040
 bites-0.039
Token with
Token ID351
Feature activation+2.31e-02

Pos logit contributions
icking+0.016
abella+0.016
bie+0.016
tha+0.016
bled+0.015
Neg logit contributions
pillar-0.010
 target-0.009
Sur-0.008
 pieces-0.008
 bites-0.008
Token wonderful
Token ID7932
Feature activation-2.56e-02

Pos logit contributions
abella+0.048
icking+0.047
bie+0.046
tha+0.046
bled+0.045
Neg logit contributions
 pieces-0.029
 target-0.028
pillar-0.028
 bites-0.025
Sur-0.024
Token<|endoftext|>
Token ID50256
Feature activation+1.00e-02

Pos logit contributions
pillar+0.037
 target+0.036
 pieces+0.033
Sur+0.032
 aback+0.031
Neg logit contributions
icking-0.047
bie-0.046
abella-0.046
tha-0.044
bled-0.043
TokenOnce
Token ID7454
Feature activation+1.13e-01

Pos logit contributions
abella+0.020
icking+0.020
bie+0.020
tha+0.019
bled+0.019
Neg logit contributions
 target-0.013
pillar-0.012
 pieces-0.012
Sur-0.011
 bites-0.011
Token upon
Token ID2402
Feature activation-4.95e-04

Pos logit contributions
abella+0.161
icking+0.154
bled+0.151
tha+0.149
belt+0.147
Neg logit contributions
 pieces-0.197
pillar-0.194
 piece-0.176
!"-0.175
 target-0.175
Token a
Token ID257
Feature activation-3.26e-03

Pos logit contributions
pillar+0.001
 target+0.001
 pieces+0.001
 bites+0.001
 aback+0.001
Neg logit contributions
abella-0.001
bie-0.001
icking-0.001
bled-0.001
ili-0.001
Token time
Token ID640
Feature activation+5.51e-02

Pos logit contributions
 target+0.006
 pieces+0.006
pillar+0.006
 aback+0.005
Sur+0.005
Neg logit contributions
bie-0.005
abella-0.005
icking-0.005
tha-0.004
allows-0.004
Token,
Token ID11
Feature activation+7.16e-02

Pos logit contributions
abella+0.132
icking+0.129
bie+0.125
tha+0.125
bled+0.124
Neg logit contributions
pillar-0.050
 pieces-0.046
 target-0.043
Sur-0.041
 bites-0.039
Token there
Token ID612
Feature activation+2.96e-02

Pos logit contributions
abella+0.154
icking+0.147
tha+0.143
aj+0.139
angers+0.139
Neg logit contributions
 pieces-0.083
pillar-0.080
 target-0.076
-0.073
 piece-0.072
Token lies
Token ID7363
Feature activation+3.72e-02

Pos logit contributions
abella+0.002
icking+0.002
bie+0.002
ric+0.002
tha+0.002
Neg logit contributions
pillar-0.001
 target-0.001
 pieces-0.001
Sur-0.001
 aback-0.001
Token on
Token ID319
Feature activation-1.00e-02

Pos logit contributions
abella+0.081
icking+0.077
bie+0.075
tha+0.075
ili+0.074
Neg logit contributions
 pieces-0.046
pillar-0.041
 target-0.041
 bites-0.040
 aback-0.038
Token the
Token ID262
Feature activation+1.60e-02

Pos logit contributions
 pieces+0.013
 target+0.013
pillar+0.012
 bites+0.011
 string+0.011
Neg logit contributions
icking-0.021
bie-0.020
abella-0.020
tha-0.019
bled-0.019
Token grass
Token ID8701
Feature activation+1.74e-02

Pos logit contributions
abella+0.029
icking+0.029
bie+0.028
tha+0.028
bled+0.027
Neg logit contributions
pillar-0.024
 target-0.023
 pieces-0.022
 aback-0.021
Sur-0.021
Token and
Token ID290
Feature activation+2.10e-02

Pos logit contributions
icking+0.043
abella+0.041
bie+0.041
tha+0.040
bled+0.040
Neg logit contributions
pillar-0.015
 target-0.014
 pieces-0.013
Sur-0.013
 aback-0.011
Token breat
Token ID4882
Feature activation+1.13e-01

Pos logit contributions
icking+0.044
abella+0.044
bie+0.042
tha+0.042
ili+0.041
Neg logit contributions
 pieces-0.027
 target-0.026
pillar-0.025
 bites-0.025
Sur-0.023
Tokenhes
Token ID956
Feature activation+2.45e-02

Pos logit contributions
bie+0.131
tha+0.128
icking+0.128
abella+0.124
belt+0.113
Neg logit contributions
 pieces-0.242
pillar-0.236
 target-0.233
Sur-0.229
 bites-0.217
Token hard
Token ID1327
Feature activation-3.04e-02

Pos logit contributions
abella+0.054
icking+0.052
bie+0.052
tha+0.052
ili+0.050
Neg logit contributions
 pieces-0.029
 target-0.028
pillar-0.026
 bites-0.025
Sur-0.024
Token.
Token ID13
Feature activation+2.41e-03

Pos logit contributions
pillar+0.037
 target+0.033
Sur+0.030
 aback+0.028
 pieces+0.027
Neg logit contributions
icking-0.064
abella-0.062
bie-0.061
allows-0.060
bled-0.059
Token He
Token ID679
Feature activation+5.35e-03

Pos logit contributions
icking+0.006
abella+0.006
tha+0.006
bie+0.006
bled+0.006
Neg logit contributions
pillar-0.002
 target-0.002
 pieces-0.002
Sur-0.002
 bites-0.002
Token does
Token ID857
Feature activation+3.17e-02

Pos logit contributions
abella+0.011
icking+0.011
bie+0.011
tha+0.011
bled+0.011
Neg logit contributions
pillar-0.006
 target-0.006
 pieces-0.006
Sur-0.005
 bites-0.005
Token.
Token ID13
Feature activation-4.19e-03

Pos logit contributions
 pieces+0.005
 target+0.005
pillar+0.004
Sur+0.003
 bites+0.003
Neg logit contributions
icking-0.022
bie-0.022
tha-0.022
abella-0.022
bled-0.021
Token Tom
Token ID4186
Feature activation-2.19e-02

Pos logit contributions
 target+0.004
pillar+0.004
 pieces+0.004
Sur+0.004
 bites+0.004
Neg logit contributions
icking-0.010
bled-0.010
tha-0.010
abella-0.009
bie-0.009
Token ro
Token ID686
Feature activation+5.60e-02

Pos logit contributions
pillar+0.021
 target+0.020
 pieces+0.020
Sur+0.017
 bites+0.016
Neg logit contributions
abella-0.052
icking-0.051
bie-0.050
tha-0.049
bled-0.049
Tokenars
Token ID945
Feature activation+1.35e-02

Pos logit contributions
icking+0.122
abella+0.121
bie+0.121
tha+0.118
bled+0.113
Neg logit contributions
 pieces-0.062
pillar-0.062
 target-0.061
Sur-0.055
 bites-0.052
Token and
Token ID290
Feature activation+6.10e-03

Pos logit contributions
abella+0.035
icking+0.035
bie+0.035
tha+0.035
bled+0.034
Neg logit contributions
 pieces-0.009
 target-0.009
pillar-0.009
 bites-0.007
Sur-0.007
Token breat
Token ID4882
Feature activation+1.12e-01

Pos logit contributions
icking+0.013
abella+0.012
bled+0.012
bie+0.012
tha+0.012
Neg logit contributions
 target-0.008
 pieces-0.008
 bites-0.007
pillar-0.007
 string-0.007
Tokenhes
Token ID956
Feature activation+2.05e-02

Pos logit contributions
bie+0.146
tha+0.143
icking+0.142
abella+0.128
belt+0.124
Neg logit contributions
 pieces-0.232
pillar-0.219
 target-0.217
Sur-0.217
 bites-0.203
Token fire
Token ID2046
Feature activation+3.44e-02

Pos logit contributions
abella+0.048
bie+0.047
tha+0.046
icking+0.046
ili+0.044
Neg logit contributions
 pieces-0.023
 target-0.021
pillar-0.020
 bites-0.019
 string-0.018
Token.
Token ID13
Feature activation-1.17e-03

Pos logit contributions
abella+0.081
icking+0.081
bie+0.079
tha+0.078
bled+0.077
Neg logit contributions
pillar-0.032
 target-0.031
 pieces-0.031
Sur-0.026
 bites-0.026
Token Anna
Token ID11735
Feature activation-1.36e-02

Pos logit contributions
 target+0.001
pillar+0.001
 pieces+0.001
Sur+0.001
 bites+0.001
Neg logit contributions
icking-0.003
tha-0.003
bled-0.003
bie-0.003
abella-0.003
Token screams
Token ID26557
Feature activation-1.57e-02

Pos logit contributions
pillar+0.013
 target+0.013
 pieces+0.012
Sur+0.011
 aback+0.010
Neg logit contributions
abella-0.031
icking-0.031
bie-0.031
tha-0.030
ili-0.029
Token together
Token ID1978
Feature activation+4.53e-03

Pos logit contributions
icking+0.032
bled+0.032
abella+0.032
ores+0.029
tha+0.028
Neg logit contributions
pillar-0.033
 target-0.030
Sur-0.029
 pieces-0.029
 bites-0.027
Token every
Token ID790
Feature activation+2.40e-02

Pos logit contributions
icking+0.007
abella+0.007
bie+0.006
bled+0.006
tha+0.006
Neg logit contributions
pillar-0.008
Sur-0.007
 target-0.007
 pieces-0.007
 aback-0.007
Token day
Token ID1110
Feature activation+1.44e-02

Pos logit contributions
abella+0.032
bie+0.031
icking+0.031
tha+0.029
ili+0.028
Neg logit contributions
pillar-0.047
 target-0.046
 pieces-0.045
Sur-0.044
 aback-0.043
Token.
Token ID13
Feature activation+2.99e-03

Pos logit contributions
icking+0.034
abella+0.033
bie+0.032
tha+0.032
bled+0.032
Neg logit contributions
pillar-0.014
 target-0.013
 pieces-0.012
Sur-0.012
 aback-0.011
Token<|endoftext|>
Token ID50256
Feature activation+5.66e-03

Pos logit contributions
icking+0.007
abella+0.007
bie+0.007
tha+0.007
bled+0.007
Neg logit contributions
pillar-0.003
 target-0.003
 pieces-0.002
Sur-0.002
 bites-0.002
TokenOnce
Token ID7454
Feature activation+1.11e-01

Pos logit contributions
abella+0.012
bie+0.012
icking+0.012
ili+0.011
tha+0.011
Neg logit contributions
 target-0.006
pillar-0.006
 pieces-0.006
 bites-0.005
Sur-0.005
Token upon
Token ID2402
Feature activation-2.98e-03

Pos logit contributions
abella+0.149
icking+0.146
bled+0.137
tha+0.136
belt+0.135
Neg logit contributions
pillar-0.209
 pieces-0.207
 target-0.196
 piece-0.187
 bites-0.186
Token a
Token ID257
Feature activation+9.07e-03

Pos logit contributions
pillar+0.005
 target+0.005
 pieces+0.005
 aback+0.005
 bites+0.004
Neg logit contributions
abella-0.005
bie-0.004
allows-0.004
icking-0.004
bled-0.004
Token time
Token ID640
Feature activation+5.73e-02

Pos logit contributions
bie+0.013
abella+0.013
icking+0.012
aj+0.011
tha+0.011
Neg logit contributions
 target-0.017
 pieces-0.017
pillar-0.017
 aback-0.016
Sur-0.015
Token,
Token ID11
Feature activation+7.75e-02

Pos logit contributions
abella+0.120
icking+0.119
bie+0.116
tha+0.114
bled+0.112
Neg logit contributions
pillar-0.069
 target-0.068
 pieces-0.066
Sur-0.059
 bites-0.058
Token there
Token ID612
Feature activation+3.32e-02

Pos logit contributions
abella+0.141
icking+0.137
bie+0.132
tha+0.131
ili+0.127
Neg logit contributions
pillar-0.114
 pieces-0.114
 target-0.112
Sur-0.103
 bites-0.101
Token in
Token ID287
Feature activation+3.69e-02

Pos logit contributions
icking+0.071
bie+0.070
abella+0.070
ric+0.069
tha+0.069
Neg logit contributions
 target-0.015
 pieces-0.015
pillar-0.015
Sur-0.010
!'-0.010
Token the
Token ID262
Feature activation+3.48e-02

Pos logit contributions
abella+0.063
icking+0.063
bie+0.061
tha+0.060
bled+0.058
Neg logit contributions
pillar-0.058
 target-0.057
 pieces-0.056
Sur-0.052
 bites-0.050
Token park
Token ID3952
Feature activation+3.46e-02

Pos logit contributions
icking+0.076
abella+0.076
bie+0.072
bled+0.072
tha+0.072
Neg logit contributions
 target-0.042
pillar-0.042
 pieces-0.041
Sur-0.035
 bites-0.034
Token.
Token ID13
Feature activation+7.05e-04

Pos logit contributions
abella+0.093
icking+0.091
bie+0.091
tha+0.090
bled+0.089
Neg logit contributions
 pieces-0.020
 target-0.020
pillar-0.020
 bites-0.014
Sur-0.013
Token<|endoftext|>
Token ID50256
Feature activation+1.27e-02

Pos logit contributions
icking+0.002
bled+0.002
tha+0.002
abella+0.002
bie+0.002
Neg logit contributions
pillar-0.001
 target-0.001
 pieces-0.000
Sur-0.000
-0.000
TokenOnce
Token ID7454
Feature activation+1.11e-01

Pos logit contributions
abella+0.029
bie+0.028
icking+0.028
tha+0.027
ili+0.027
Neg logit contributions
 target-0.013
pillar-0.012
 pieces-0.012
 bites-0.010
Sur-0.010
Token upon
Token ID2402
Feature activation+3.63e-03

Pos logit contributions
abella+0.155
icking+0.149
tha+0.145
ili+0.141
belt+0.140
Neg logit contributions
 pieces-0.204
pillar-0.203
 target-0.187
 piece-0.183
!"-0.175
Token a
Token ID257
Feature activation+1.42e-02

Pos logit contributions
abella+0.006
bie+0.006
icking+0.006
bled+0.006
ili+0.006
Neg logit contributions
pillar-0.006
 target-0.005
 pieces-0.005
 aback-0.005
 bites-0.005
Token time
Token ID640
Feature activation+5.97e-02

Pos logit contributions
bie+0.022
abella+0.021
icking+0.021
aj+0.019
tha+0.019
Neg logit contributions
 target-0.026
 pieces-0.025
pillar-0.025
 aback-0.024
Sur-0.023
Token,
Token ID11
Feature activation+8.01e-02

Pos logit contributions
abella+0.138
icking+0.133
bie+0.131
tha+0.129
bled+0.127
Neg logit contributions
pillar-0.061
 pieces-0.058
 target-0.056
Sur-0.052
 bites-0.048
Token there
Token ID612
Feature activation+3.66e-02

Pos logit contributions
abella+0.159
icking+0.152
tha+0.145
bie+0.145
ili+0.142
Neg logit contributions
 pieces-0.107
pillar-0.103
 target-0.101
 piece-0.093
Sur-0.092
Token had
Token ID550
Feature activation-2.08e-02

Pos logit contributions
pillar+0.008
 target+0.008
 pieces+0.007
 bites+0.007
Sur+0.006
Neg logit contributions
icking-0.010
abella-0.010
bie-0.010
tha-0.010
bled-0.010
Token happy
Token ID3772
Feature activation-3.58e-02

Pos logit contributions
pillar+0.025
 target+0.025
 pieces+0.024
Sur+0.022
 bites+0.021
Neg logit contributions
abella-0.044
icking-0.043
bie-0.042
tha-0.041
bled-0.041
Token dreams
Token ID10625
Feature activation+4.69e-02

Pos logit contributions
pillar+0.047
 target+0.046
 pieces+0.045
Sur+0.041
 bites+0.040
Neg logit contributions
abella-0.071
icking-0.071
bie-0.069
tha-0.068
bled-0.067
Token.
Token ID13
Feature activation-1.56e-03

Pos logit contributions
abella+0.118
icking+0.117
bie+0.116
tha+0.113
bled+0.112
Neg logit contributions
pillar-0.036
 target-0.036
 pieces-0.034
Sur-0.028
 bites-0.027
Token<|endoftext|>
Token ID50256
Feature activation+1.11e-02

Pos logit contributions
pillar+0.001
 target+0.001
Sur+0.001
 pieces+0.001
+0.001
Neg logit contributions
icking-0.004
bled-0.004
tha-0.004
abella-0.004
bie-0.004
TokenOnce
Token ID7454
Feature activation+1.11e-01

Pos logit contributions
abella+0.024
bie+0.024
icking+0.023
ili+0.023
tha+0.023
Neg logit contributions
 target-0.012
pillar-0.012
 pieces-0.011
Sur-0.010
 bites-0.010
Token upon
Token ID2402
Feature activation+4.29e-03

Pos logit contributions
abella+0.156
icking+0.150
tha+0.145
belt+0.142
ili+0.140
Neg logit contributions
 pieces-0.205
pillar-0.202
 target-0.186
 piece-0.185
 bites-0.177
Token a
Token ID257
Feature activation+1.49e-02

Pos logit contributions
abella+0.007
bie+0.007
icking+0.007
bled+0.007
ili+0.006
Neg logit contributions
pillar-0.007
 target-0.007
 pieces-0.006
 aback-0.006
 bites-0.006
Token time
Token ID640
Feature activation+5.98e-02

Pos logit contributions
bie+0.022
icking+0.021
abella+0.021
aj+0.019
tha+0.019
Neg logit contributions
 target-0.028
pillar-0.027
 pieces-0.027
 aback-0.026
Sur-0.025
Token,
Token ID11
Feature activation+8.14e-02

Pos logit contributions
abella+0.130
icking+0.128
bie+0.125
tha+0.123
bled+0.121
Neg logit contributions
pillar-0.068
 target-0.065
 pieces-0.065
Sur-0.058
 bites-0.056
Token there
Token ID612
Feature activation+3.92e-02

Pos logit contributions
abella+0.161
icking+0.154
tha+0.147
bie+0.146
ili+0.144
Neg logit contributions
 pieces-0.109
pillar-0.104
 target-0.102
 piece-0.096
Sur-0.095
Token be
Token ID307
Feature activation+5.93e-02

Pos logit contributions
abella+0.035
icking+0.034
bie+0.034
aj+0.033
bled+0.033
Neg logit contributions
 target-0.016
pillar-0.015
 pieces-0.014
 bite-0.013
 string-0.013
Token safe
Token ID3338
Feature activation-3.78e-03

Pos logit contributions
abella+0.120
bie+0.114
icking+0.113
tha+0.110
ili+0.109
Neg logit contributions
 pieces-0.079
 target-0.079
pillar-0.077
Sur-0.070
 aback-0.069
Token at
Token ID379
Feature activation+4.57e-02

Pos logit contributions
pillar+0.003
 target+0.003
 pieces+0.003
Sur+0.003
 bites+0.002
Neg logit contributions
icking-0.009
abella-0.009
bie-0.009
tha-0.009
bled-0.009
Token last
Token ID938
Feature activation+1.19e-02

Pos logit contributions
abella+0.089
icking+0.086
bie+0.085
ili+0.082
tha+0.081
Neg logit contributions
 pieces-0.066
 target-0.064
pillar-0.059
 bites-0.057
 string-0.056
Token<|endoftext|>
Token ID50256
Feature activation+6.87e-03

Pos logit contributions
icking+0.020
abella+0.020
bie+0.019
tha+0.019
belt+0.018
Neg logit contributions
pillar-0.019
 target-0.017
Sur-0.017
 pieces-0.016
 aback-0.015
TokenOnce
Token ID7454
Feature activation+1.11e-01

Pos logit contributions
abella+0.014
icking+0.013
bie+0.013
tha+0.013
bled+0.013
Neg logit contributions
 target-0.009
pillar-0.009
 pieces-0.008
Sur-0.008
 bites-0.008
Token upon
Token ID2402
Feature activation-4.45e-03

Pos logit contributions
abella+0.152
icking+0.146
bled+0.141
tha+0.140
ili+0.137
Neg logit contributions
 pieces-0.204
pillar-0.201
 target-0.187
 piece-0.182
 aback-0.181
Token a
Token ID257
Feature activation-6.50e-04

Pos logit contributions
pillar+0.007
 target+0.007
 pieces+0.006
 aback+0.006
 bites+0.006
Neg logit contributions
abella-0.008
bie-0.008
icking-0.008
bled-0.007
ili-0.007
Token time
Token ID640
Feature activation+5.10e-02

Pos logit contributions
 target+0.001
 pieces+0.001
pillar+0.001
 aback+0.001
Sur+0.001
Neg logit contributions
bie-0.001
abella-0.001
icking-0.001
tha-0.001
aj-0.001
Token,
Token ID11
Feature activation+7.12e-02

Pos logit contributions
abella+0.122
icking+0.120
tha+0.116
bie+0.116
bled+0.115
Neg logit contributions
pillar-0.046
 pieces-0.043
 target-0.040
Sur-0.039
 bites-0.036
Token there
Token ID612
Feature activation+2.78e-02

Pos logit contributions
abella+0.147
icking+0.140
tha+0.136
ili+0.132
aj+0.131
Neg logit contributions
 pieces-0.090
pillar-0.088
 target-0.086
-0.080
 piece-0.079
Token have
Token ID423
Feature activation-1.92e-02

Pos logit contributions
abella+0.054
icking+0.053
bie+0.052
tha+0.051
bled+0.050
Neg logit contributions
pillar-0.038
 target-0.038
 pieces-0.037
Sur-0.033
 bites-0.033
Token good
Token ID922
Feature activation-2.39e-02

Pos logit contributions
pillar+0.029
 target+0.027
 pieces+0.024
Sur+0.024
 bites+0.022
Neg logit contributions
abella-0.035
icking-0.034
bie-0.033
bled-0.032
tha-0.032
Token friends
Token ID2460
Feature activation-6.99e-03

Pos logit contributions
pillar+0.041
 target+0.038
 pieces+0.036
Sur+0.035
hest+0.034
Neg logit contributions
icking-0.039
bie-0.038
abella-0.037
tha-0.035
bled-0.035
Token.
Token ID13
Feature activation-4.17e-03

Pos logit contributions
pillar+0.007
 target+0.007
 pieces+0.006
Sur+0.006
 bites+0.005
Neg logit contributions
icking-0.016
abella-0.016
bie-0.016
tha-0.015
bled-0.015
Token<|endoftext|>
Token ID50256
Feature activation+5.73e-03

Pos logit contributions
pillar+0.003
 target+0.003
 pieces+0.003
Sur+0.003
 bites+0.003
Neg logit contributions
icking-0.010
abella-0.010
bie-0.010
tha-0.010
bled-0.010
TokenOnce
Token ID7454
Feature activation+1.11e-01

Pos logit contributions
abella+0.011
bie+0.011
icking+0.011
ili+0.011
tha+0.010
Neg logit contributions
 target-0.007
pillar-0.007
 pieces-0.007
 bites-0.006
Sur-0.006
Token upon
Token ID2402
Feature activation-2.63e-04

Pos logit contributions
abella+0.153
icking+0.144
tha+0.140
bled+0.137
belt+0.137
Neg logit contributions
 pieces-0.208
pillar-0.206
 target-0.189
 piece-0.188
 bites-0.182
Token a
Token ID257
Feature activation+1.45e-02

Pos logit contributions
pillar+0.000
 target+0.000
 pieces+0.000
 aback+0.000
 bites+0.000
Neg logit contributions
abella-0.000
bie-0.000
icking-0.000
bled-0.000
allows-0.000
Token time
Token ID640
Feature activation+5.93e-02

Pos logit contributions
bie+0.022
icking+0.020
abella+0.020
aj+0.019
tha+0.018
Neg logit contributions
 target-0.027
pillar-0.027
 pieces-0.027
 aback-0.025
Sur-0.024
Token,
Token ID11
Feature activation+8.01e-02

Pos logit contributions
abella+0.122
icking+0.122
bie+0.119
tha+0.117
bled+0.115
Neg logit contributions
pillar-0.073
 target-0.072
 pieces-0.070
Sur-0.063
 bites-0.061
Token there
Token ID612
Feature activation+3.45e-02

Pos logit contributions
abella+0.156
icking+0.148
tha+0.143
bie+0.140
ili+0.140
Neg logit contributions
 pieces-0.110
pillar-0.107
 target-0.104
 piece-0.097
Sur-0.096
Token on
Token ID319
Feature activation+9.12e-03

Pos logit contributions
icking+0.084
abella+0.084
bie+0.082
tha+0.081
bled+0.079
Neg logit contributions
pillar-0.030
 target-0.029
 pieces-0.028
Sur-0.024
 bites-0.023
Token the
Token ID262
Feature activation+3.48e-02

Pos logit contributions
abella+0.017
icking+0.016
bie+0.016
tha+0.016
bled+0.015
Neg logit contributions
pillar-0.014
 target-0.013
 pieces-0.013
Sur-0.012
 bites-0.012
Token farm
Token ID5318
Feature activation+3.62e-02

Pos logit contributions
icking+0.064
abella+0.064
bie+0.062
tha+0.061
bled+0.059
Neg logit contributions
pillar-0.050
 target-0.048
 pieces-0.048
Sur-0.045
 aback-0.043
Token.
Token ID13
Feature activation-7.75e-04

Pos logit contributions
icking+0.089
abella+0.089
bie+0.087
tha+0.085
bled+0.084
Neg logit contributions
pillar-0.030
 target-0.029
 pieces-0.028
Sur-0.024
 bites-0.023
Token<|endoftext|>
Token ID50256
Feature activation+8.11e-03

Pos logit contributions
pillar+0.001
 target+0.000
 pieces+0.000
Sur+0.000
+0.000
Neg logit contributions
icking-0.002
bled-0.002
abella-0.002
tha-0.002
bie-0.002
TokenOnce
Token ID7454
Feature activation+1.11e-01

Pos logit contributions
abella+0.018
bie+0.018
icking+0.018
tha+0.017
ili+0.017
Neg logit contributions
 target-0.009
pillar-0.008
 pieces-0.008
 bites-0.007
Sur-0.007
Token upon
Token ID2402
Feature activation-8.59e-04

Pos logit contributions
abella+0.148
icking+0.141
tha+0.136
bled+0.134
ili+0.132
Neg logit contributions
pillar-0.209
 pieces-0.209
 target-0.194
 piece-0.188
 bites-0.182
Token a
Token ID257
Feature activation+1.27e-02

Pos logit contributions
pillar+0.001
 target+0.001
 pieces+0.001
 aback+0.001
 bites+0.001
Neg logit contributions
abella-0.001
bie-0.001
icking-0.001
bled-0.001
allows-0.001
Token time
Token ID640
Feature activation+5.73e-02

Pos logit contributions
bie+0.019
icking+0.018
abella+0.018
aj+0.016
tha+0.016
Neg logit contributions
 target-0.024
 pieces-0.023
pillar-0.023
 aback-0.022
Sur-0.021
Token,
Token ID11
Feature activation+7.77e-02

Pos logit contributions
abella+0.125
icking+0.123
bie+0.120
tha+0.118
bled+0.116
Neg logit contributions
pillar-0.065
 target-0.062
 pieces-0.062
Sur-0.056
 bites-0.053
Token a
Token ID257
Feature activation-1.45e-02

Pos logit contributions
abella+0.147
icking+0.141
tha+0.136
bie+0.133
ili+0.132
Neg logit contributions
 pieces-0.110
pillar-0.108
 target-0.105
Sur-0.096
 piece-0.096
Token\n
Token ID198
Feature activation-5.28e-03

Pos logit contributions
abella+0.084
icking+0.083
bie+0.083
tha+0.080
bled+0.079
Neg logit contributions
 target-0.033
pillar-0.032
 pieces-0.030
 bites-0.026
Sur-0.025
Token\n
Token ID198
Feature activation-1.30e-02

Pos logit contributions
 target+0.006
pillar+0.006
 pieces+0.005
 bites+0.005
Sur+0.005
Neg logit contributions
abella-0.012
bie-0.012
ili-0.011
icking-0.011
tha-0.011
TokenOne
Token ID3198
Feature activation+3.48e-02

Pos logit contributions
pillar+0.013
 target+0.013
 pieces+0.013
Sur+0.011
 bites+0.011
Neg logit contributions
icking-0.030
abella-0.030
bie-0.029
tha-0.029
bled-0.028
Token day
Token ID1110
Feature activation+9.86e-03

Pos logit contributions
bled+0.080
abella+0.079
icking+0.078
ric+0.077
tha+0.075
Neg logit contributions
pillar-0.041
 piece-0.039
 pieces-0.035
 bite-0.035
looking-0.035
Token,
Token ID11
Feature activation+2.49e-02

Pos logit contributions
abella+0.027
icking+0.027
bie+0.026
belt+0.026
ric+0.025
Neg logit contributions
Sur-0.006
pillar-0.005
 pieces-0.005
,"-0.004
 bites-0.004
Token Pe
Token ID2631
Feature activation+1.11e-01

Pos logit contributions
icking+0.047
aj+0.046
abella+0.045
ric+0.044
bled+0.043
Neg logit contributions
 pieces-0.035
Sur-0.033
 string-0.033
pillar-0.032
 piece-0.030
Tokenppa
Token ID44989
Feature activation+6.75e-03

Pos logit contributions
abella+0.110
icking+0.100
bie+0.097
ores+0.096
 Jas+0.096
Neg logit contributions
 pieces-0.245
pillar-0.245
 target-0.240
 piece-0.225
Sur-0.224
Token stumbled
Token ID24241
Feature activation+2.60e-02

Pos logit contributions
abella+0.015
icking+0.015
tha+0.014
bled+0.014
bie+0.014
Neg logit contributions
pillar-0.008
 target-0.007
 pieces-0.007
Sur-0.006
 bites-0.006
Token upon
Token ID2402
Feature activation-4.96e-03

Pos logit contributions
abella+0.051
icking+0.051
tha+0.050
bie+0.049
bled+0.047
Neg logit contributions
pillar-0.035
 pieces-0.034
 target-0.034
Sur-0.030
 aback-0.029
Token a
Token ID257
Feature activation-4.70e-02

Pos logit contributions
pillar+0.006
 target+0.006
 pieces+0.006
Sur+0.005
 bites+0.005
Neg logit contributions
abella-0.011
icking-0.011
bie-0.010
tha-0.010
bled-0.010
Token strange
Token ID6283
Feature activation-4.30e-02

Pos logit contributions
pillar+0.057
 pieces+0.056
 target+0.054
Sur+0.053
 aback+0.053
Neg logit contributions
abella-0.092
icking-0.092
bie-0.091
tha-0.088
ric-0.084
Token on
Token ID319
Feature activation+1.67e-03

Pos logit contributions
pillar+0.026
 target+0.025
 pieces+0.025
Sur+0.022
 bites+0.021
Neg logit contributions
abella-0.050
icking-0.049
bie-0.048
tha-0.047
ili-0.046
Token their
Token ID511
Feature activation+5.85e-02

Pos logit contributions
abella+0.003
icking+0.003
bie+0.003
tha+0.002
bled+0.002
Neg logit contributions
pillar-0.003
 target-0.003
 pieces-0.003
Sur-0.003
 bites-0.003
Token faces
Token ID6698
Feature activation-1.41e-03

Pos logit contributions
icking+0.104
abella+0.102
bie+0.101
tha+0.099
bled+0.096
Neg logit contributions
pillar-0.089
 target-0.085
 pieces-0.083
Sur-0.079
 aback-0.077
Token.
Token ID13
Feature activation-9.55e-03

Pos logit contributions
pillar+0.001
 target+0.001
 pieces+0.001
Sur+0.001
 bites+0.001
Neg logit contributions
abella-0.004
icking-0.004
bie-0.004
tha-0.003
ili-0.003
Token<|endoftext|>
Token ID50256
Feature activation+6.94e-03

Pos logit contributions
pillar+0.008
 target+0.007
 pieces+0.007
Sur+0.007
+0.006
Neg logit contributions
icking-0.024
abella-0.023
bled-0.023
tha-0.023
bie-0.022
TokenOnce
Token ID7454
Feature activation+1.11e-01

Pos logit contributions
abella+0.015
bie+0.015
icking+0.014
ili+0.014
tha+0.014
Neg logit contributions
 target-0.008
pillar-0.008
 pieces-0.007
Sur-0.006
 bites-0.006
Token upon
Token ID2402
Feature activation+1.05e-03

Pos logit contributions
abella+0.161
icking+0.156
tha+0.151
belt+0.148
ili+0.147
Neg logit contributions
 pieces-0.194
pillar-0.193
 piece-0.175
 target-0.174
,"-0.168
Token a
Token ID257
Feature activation+1.37e-02

Pos logit contributions
abella+0.002
bie+0.002
icking+0.002
bled+0.002
ili+0.002
Neg logit contributions
pillar-0.002
 target-0.002
 pieces-0.001
 aback-0.001
 bites-0.001
Token time
Token ID640
Feature activation+6.08e-02

Pos logit contributions
bie+0.021
icking+0.019
abella+0.019
aj+0.018
tha+0.018
Neg logit contributions
 target-0.025
 pieces-0.025
pillar-0.025
 aback-0.024
Sur-0.022
Token there
Token ID612
Feature activation+4.31e-02

Pos logit contributions
abella+0.136
icking+0.133
bie+0.130
tha+0.128
bled+0.127
Neg logit contributions
pillar-0.065
 pieces-0.062
 target-0.061
Sur-0.056
 bites-0.053
Token was
Token ID373
Feature activation+3.45e-02

Pos logit contributions
abella+0.107
icking+0.106
bie+0.102
tha+0.102
bled+0.101
Neg logit contributions
pillar-0.035
 pieces-0.034
 target-0.032
Sur-0.027
 bites-0.026
Token happy
Token ID3772
Feature activation-3.01e-02

Pos logit contributions
abella+0.068
bled+0.064
icking+0.062
ili+0.062
tha+0.060
Neg logit contributions
 pieces-0.039
 target-0.038
 bite-0.036
 and-0.036
's-0.034
Token and
Token ID290
Feature activation+3.88e-02

Pos logit contributions
pillar+0.029
 target+0.029
 pieces+0.028
Sur+0.024
 bites+0.023
Neg logit contributions
abella-0.070
icking-0.070
bie-0.069
tha-0.067
bled-0.066
Token tired
Token ID10032
Feature activation-1.45e-02

Pos logit contributions
abella+0.082
ahs+0.075
icking+0.074
tha+0.074
ric+0.074
Neg logit contributions
 pieces-0.050
 target-0.047
-0.047
 bites-0.046
's-0.046
Token.
Token ID13
Feature activation-1.68e-03

Pos logit contributions
pillar+0.014
 target+0.013
 pieces+0.012
Sur+0.011
 bites+0.011
Neg logit contributions
icking-0.034
abella-0.034
bie-0.033
tha-0.033
bled-0.032
Token<|endoftext|>
Token ID50256
Feature activation+8.94e-03

Pos logit contributions
pillar+0.001
 target+0.001
Sur+0.001
+0.001
 pieces+0.001
Neg logit contributions
icking-0.005
bled-0.005
abella-0.004
tha-0.004
eking-0.004
TokenOnce
Token ID7454
Feature activation+1.10e-01

Pos logit contributions
abella+0.018
bie+0.018
icking+0.018
ili+0.017
tha+0.017
Neg logit contributions
 target-0.011
pillar-0.011
 pieces-0.010
Sur-0.009
 bites-0.009
Token upon
Token ID2402
Feature activation+1.70e-03

Pos logit contributions
abella+0.157
icking+0.149
tha+0.143
belt+0.141
ili+0.139
Neg logit contributions
 pieces-0.202
pillar-0.197
 piece-0.182
 target-0.181
 bites-0.175
Token a
Token ID257
Feature activation+1.55e-02

Pos logit contributions
abella+0.003
bie+0.003
bled+0.003
icking+0.003
allows+0.003
Neg logit contributions
pillar-0.003
 target-0.003
 aback-0.003
 pieces-0.002
 bites-0.002
Token time
Token ID640
Feature activation+5.99e-02

Pos logit contributions
bie+0.024
icking+0.022
abella+0.022
tha+0.021
aj+0.021
Neg logit contributions
 target-0.028
pillar-0.028
 pieces-0.027
 aback-0.026
Sur-0.025
Token,
Token ID11
Feature activation+8.20e-02

Pos logit contributions
abella+0.129
icking+0.127
bie+0.125
tha+0.122
bled+0.120
Neg logit contributions
pillar-0.068
 target-0.066
 pieces-0.066
Sur-0.059
 bites-0.057
Token there
Token ID612
Feature activation+3.85e-02

Pos logit contributions
abella+0.161
icking+0.153
tha+0.147
bie+0.145
ili+0.144
Neg logit contributions
 pieces-0.112
pillar-0.107
 target-0.106
Sur-0.099
 piece-0.098
Token making
Token ID1642
Feature activation+8.85e-03

Pos logit contributions
icking+0.027
abella+0.027
bie+0.025
tha+0.025
bled+0.025
Neg logit contributions
 target-0.019
pillar-0.019
 pieces-0.019
 bites-0.018
Sur-0.017
Token people
Token ID661
Feature activation+2.75e-02

Pos logit contributions
icking+0.017
abella+0.017
bie+0.016
tha+0.016
bled+0.016
Neg logit contributions
pillar-0.012
 target-0.012
 pieces-0.012
Sur-0.011
 bites-0.011
Token happy
Token ID3772
Feature activation-3.98e-02

Pos logit contributions
abella+0.053
icking+0.051
tha+0.050
bie+0.050
bled+0.049
Neg logit contributions
pillar-0.038
 target-0.038
 pieces-0.038
Sur-0.034
 bites-0.033
Token.
Token ID13
Feature activation-2.27e-03

Pos logit contributions
pillar+0.038
 target+0.037
 pieces+0.033
Sur+0.031
 aback+0.030
Neg logit contributions
icking-0.093
bie-0.090
abella-0.090
tha-0.087
bled-0.087
Token<|endoftext|>
Token ID50256
Feature activation+9.27e-03

Pos logit contributions
pillar+0.002
 target+0.002
 pieces+0.002
Sur+0.002
+0.002
Neg logit contributions
icking-0.006
bled-0.005
abella-0.005
tha-0.005
bie-0.005
TokenOnce
Token ID7454
Feature activation+1.10e-01

Pos logit contributions
abella+0.022
icking+0.022
bie+0.022
tha+0.021
bled+0.021
Neg logit contributions
 target-0.008
pillar-0.008
 pieces-0.008
Sur-0.007
 bites-0.006
Token upon
Token ID2402
Feature activation+3.76e-03

Pos logit contributions
abella+0.159
icking+0.151
tha+0.150
ili+0.146
belt+0.144
Neg logit contributions
 pieces-0.199
pillar-0.196
 target-0.179
 piece-0.178
!"-0.170
Token a
Token ID257
Feature activation+1.29e-02

Pos logit contributions
abella+0.007
bie+0.007
icking+0.006
bled+0.006
ili+0.006
Neg logit contributions
pillar-0.006
 target-0.005
 pieces-0.005
 aback-0.005
 bites-0.005
Token time
Token ID640
Feature activation+5.95e-02

Pos logit contributions
bie+0.020
abella+0.018
icking+0.018
aj+0.017
tha+0.017
Neg logit contributions
 target-0.024
 pieces-0.023
pillar-0.023
 aback-0.022
Sur-0.021
Token,
Token ID11
Feature activation+8.00e-02

Pos logit contributions
abella+0.135
icking+0.131
bie+0.129
tha+0.127
bled+0.125
Neg logit contributions
pillar-0.062
 pieces-0.059
 target-0.058
Sur-0.053
 bites-0.050
Token there
Token ID612
Feature activation+3.85e-02

Pos logit contributions
abella+0.167
icking+0.158
tha+0.152
ili+0.149
bie+0.149
Neg logit contributions
 pieces-0.098
pillar-0.092
 target-0.090
 piece-0.085
Sur-0.082
Token,
Token ID11
Feature activation+2.40e-02

Pos logit contributions
pillar+0.002
 target+0.001
 pieces+0.001
Sur+0.001
 bites+0.001
Neg logit contributions
icking-0.002
abella-0.002
bie-0.002
tha-0.002
bled-0.002
Token fall
Token ID2121
Feature activation+8.13e-03

Pos logit contributions
abella+0.051
icking+0.048
ili+0.048
bie+0.047
bled+0.047
Neg logit contributions
 target-0.031
 pieces-0.030
pillar-0.029
 piece-0.026
 bites-0.026
Token weather
Token ID6193
Feature activation+1.71e-02

Pos logit contributions
abella+0.016
icking+0.015
bie+0.015
tha+0.015
ili+0.014
Neg logit contributions
 target-0.012
 pieces-0.011
pillar-0.011
 bites-0.010
Sur-0.010
Token.
Token ID13
Feature activation-4.40e-03

Pos logit contributions
abella+0.043
icking+0.042
bie+0.041
tha+0.041
bled+0.040
Neg logit contributions
 target-0.014
pillar-0.014
 pieces-0.014
Sur-0.011
 bites-0.011
Token<|endoftext|>
Token ID50256
Feature activation+6.81e-03

Pos logit contributions
pillar+0.002
 target+0.002
Sur+0.002
+0.002
 pieces+0.002
Neg logit contributions
icking-0.013
bled-0.012
tha-0.012
abella-0.012
eking-0.012
TokenOnce
Token ID7454
Feature activation+1.10e-01

Pos logit contributions
abella+0.016
icking+0.015
bie+0.015
tha+0.015
ili+0.015
Neg logit contributions
 target-0.007
pillar-0.007
 pieces-0.006
Sur-0.006
 bites-0.006
Token upon
Token ID2402
Feature activation+2.17e-03

Pos logit contributions
abella+0.155
icking+0.150
tha+0.143
belt+0.142
bled+0.141
Neg logit contributions
 pieces-0.200
pillar-0.198
 target-0.181
 piece-0.180
!"-0.174
Token a
Token ID257
Feature activation+1.25e-02

Pos logit contributions
abella+0.004
bie+0.004
icking+0.004
bled+0.003
ili+0.003
Neg logit contributions
pillar-0.003
 target-0.003
 pieces-0.003
 aback-0.003
 bites-0.003
Token time
Token ID640
Feature activation+5.83e-02

Pos logit contributions
bie+0.019
abella+0.018
icking+0.018
tha+0.017
aj+0.017
Neg logit contributions
 target-0.023
 pieces-0.022
pillar-0.022
 aback-0.021
Sur-0.020
Token,
Token ID11
Feature activation+7.91e-02

Pos logit contributions
abella+0.131
icking+0.128
bie+0.125
tha+0.123
bled+0.122
Neg logit contributions
pillar-0.062
 pieces-0.059
 target-0.057
Sur-0.053
 bites-0.050
Token in
Token ID287
Feature activation+4.09e-02

Pos logit contributions
abella+0.160
icking+0.153
tha+0.146
bie+0.145
ili+0.143
Neg logit contributions
 pieces-0.102
pillar-0.099
 target-0.096
 piece-0.090
Sur-0.089
Token's
Token ID338
Feature activation+1.68e-02

Pos logit contributions
 target+0.016
pillar+0.016
 pieces+0.016
Sur+0.015
 aback+0.014
Neg logit contributions
icking-0.019
abella-0.019
bie-0.019
tha-0.018
bled-0.017
Token mom
Token ID1995
Feature activation-1.48e-02

Pos logit contributions
abella+0.019
icking+0.019
bie+0.018
tha+0.017
bled+0.017
Neg logit contributions
pillar-0.036
 target-0.036
 pieces-0.036
Sur-0.034
 bites-0.033
Token showed
Token ID3751
Feature activation-4.23e-02

Pos logit contributions
pillar+0.022
 target+0.022
 pieces+0.021
Sur+0.020
 bites+0.019
Neg logit contributions
icking-0.027
abella-0.027
bie-0.026
tha-0.025
bled-0.025
Token him
Token ID683
Feature activation-8.27e-03

Pos logit contributions
 target+0.081
pillar+0.081
 pieces+0.073
Sur+0.072
 string+0.068
Neg logit contributions
abella-0.065
bie-0.059
icking-0.058
 Ty-0.054
 Jen-0.052
Token a
Token ID257
Feature activation-7.16e-02

Pos logit contributions
 pieces+0.011
pillar+0.010
 target+0.010
 bites+0.009
 piece+0.009
Neg logit contributions
tha-0.017
icking-0.017
abella-0.017
bie-0.016
bled-0.016
Token big
Token ID1263
Feature activation-9.83e-02

Pos logit contributions
pillar+0.099
 target+0.093
 aback+0.091
 pieces+0.091
Sur+0.087
Neg logit contributions
bie-0.132
abella-0.129
icking-0.129
tha-0.122
ric-0.118
Token file
Token ID2393
Feature activation-4.79e-02

Pos logit contributions
pillar+0.124
Sur+0.116
 target+0.113
 aback+0.108
ives+0.107
Neg logit contributions
icking-0.194
bie-0.190
tha-0.184
abella-0.175
belt-0.171
Token.
Token ID13
Feature activation-2.79e-02

Pos logit contributions
pillar+0.067
 target+0.067
 aback+0.064
Sur+0.063
Mine+0.059
Neg logit contributions
icking-0.090
abella-0.084
belt-0.083
bled-0.081
tha-0.076
Token It
Token ID632
Feature activation-2.69e-02

Pos logit contributions
pillar+0.054
 target+0.053
 pieces+0.051
Sur+0.047
 aback+0.046
Neg logit contributions
abella-0.039
bie-0.037
icking-0.037
tha-0.035
ores-0.034
Token was
Token ID373
Feature activation+1.49e-02

Pos logit contributions
pillar+0.039
 target+0.039
Sur+0.034
 pieces+0.034
 aback+0.033
Neg logit contributions
abella-0.051
icking-0.050
bie-0.049
tha-0.047
bled-0.046
Token full
Token ID1336
Feature activation-7.53e-02

Pos logit contributions
icking+0.034
abella+0.034
bie+0.033
tha+0.033
ili+0.032
Neg logit contributions
 pieces-0.016
 target-0.015
 string-0.014
 piece-0.014
pillar-0.013
Token their
Token ID511
Feature activation+4.33e-02

Pos logit contributions
ric+0.044
icking+0.042
ogo+0.041
tha+0.041
abella+0.041
Neg logit contributions
 pieces-0.033
 string-0.029
 piece-0.029
Sur-0.029
 target-0.028
Token mom
Token ID1995
Feature activation-1.09e-02

Pos logit contributions
abella+0.054
icking+0.051
bie+0.049
tha+0.049
ili+0.049
Neg logit contributions
 pieces-0.091
 target-0.089
pillar-0.089
 bites-0.083
Sur-0.082
Token gave
Token ID2921
Feature activation-5.70e-04

Pos logit contributions
pillar+0.014
 pieces+0.014
 target+0.014
Sur+0.012
 bites+0.012
Neg logit contributions
abella-0.022
icking-0.021
bie-0.021
tha-0.021
bled-0.020
Token them
Token ID606
Feature activation-8.55e-04

Pos logit contributions
 pieces+0.001
pillar+0.001
 target+0.001
 bites+0.001
 aback+0.001
Neg logit contributions
abella-0.001
icking-0.001
tha-0.001
bie-0.001
belt-0.001
Token a
Token ID257
Feature activation-6.76e-02

Pos logit contributions
 pieces+0.001
pillar+0.001
 target+0.001
 bites+0.001
Sur+0.001
Neg logit contributions
abella-0.002
icking-0.002
bie-0.002
tha-0.002
bled-0.002
Token big
Token ID1263
Feature activation-9.69e-02

Pos logit contributions
pillar+0.086
 target+0.085
 pieces+0.081
Sur+0.075
 aback+0.073
Neg logit contributions
bie-0.133
abella-0.133
icking-0.133
tha-0.125
ili-0.123
Token bowl
Token ID9396
Feature activation-1.82e-02

Pos logit contributions
pillar+0.126
 target+0.121
Sur+0.114
 pieces+0.111
hest+0.103
Neg logit contributions
icking-0.191
bie-0.187
tha-0.180
abella-0.177
belt-0.170
Token of
Token ID286
Feature activation+1.39e-02

Pos logit contributions
 target+0.024
pillar+0.023
Sur+0.021
 pieces+0.019
Start+0.019
Neg logit contributions
icking-0.039
bie-0.036
ores-0.035
abella-0.035
bled-0.034
Token cream
Token ID8566
Feature activation-5.39e-03

Pos logit contributions
bie+0.020
allows+0.019
icking+0.019
 Mam+0.018
ores+0.018
Neg logit contributions
 target-0.028
pillar-0.027
Sur-0.023
 aback-0.022
ives-0.022
Token to
Token ID284
Feature activation+1.64e-02

Pos logit contributions
pillar+0.007
 target+0.007
 aback+0.006
Start+0.006
Sur+0.006
Neg logit contributions
icking-0.011
bie-0.011
abella-0.011
tha-0.011
bled-0.010
Token share
Token ID2648
Feature activation-1.02e-02

Pos logit contributions
abella+0.033
icking+0.033
bie+0.032
tha+0.031
bled+0.031
Neg logit contributions
pillar-0.021
 target-0.021
 pieces-0.021
Sur-0.018
 bites-0.018
Token and
Token ID290
Feature activation+9.89e-04

Pos logit contributions
 target+0.025
pillar+0.025
 pieces+0.023
Sur+0.021
 aback+0.020
Neg logit contributions
abella-0.046
icking-0.046
bie-0.046
tha-0.044
ili-0.043
Token Dad
Token ID17415
Feature activation-2.19e-02

Pos logit contributions
icking+0.001
abella+0.001
tha+0.001
bled+0.001
ili+0.001
Neg logit contributions
 pieces-0.002
pillar-0.002
 target-0.002
 bites-0.002
Sur-0.002
Token give
Token ID1577
Feature activation-2.36e-03

Pos logit contributions
pillar+0.030
 target+0.030
 pieces+0.028
Sur+0.026
 bites+0.025
Neg logit contributions
icking-0.042
abella-0.042
bie-0.041
tha-0.040
ili-0.039
Token them
Token ID606
Feature activation+4.67e-03

Pos logit contributions
 pieces+0.003
pillar+0.003
 target+0.003
 bites+0.003
Sur+0.003
Neg logit contributions
icking-0.004
belt-0.004
tha-0.004
abella-0.004
ric-0.004
Token a
Token ID257
Feature activation-6.67e-02

Pos logit contributions
icking+0.010
abella+0.010
tha+0.010
bie+0.010
bled+0.010
Neg logit contributions
 pieces-0.005
pillar-0.005
 target-0.005
Sur-0.004
 bites-0.004
Token big
Token ID1263
Feature activation-9.68e-02

Pos logit contributions
pillar+0.091
 target+0.088
 pieces+0.081
 aback+0.079
Sur+0.074
Neg logit contributions
abella-0.130
bie-0.129
icking-0.124
tha-0.118
ili-0.118
Token present
Token ID1944
Feature activation-2.04e-02

Pos logit contributions
pillar+0.140
 target+0.131
Sur+0.124
 pieces+0.116
 aback+0.114
Neg logit contributions
bie-0.179
icking-0.175
tha-0.166
abella-0.163
aj-0.159
Token each
Token ID1123
Feature activation-1.80e-02

Pos logit contributions
pillar+0.023
 target+0.023
 pieces+0.021
Sur+0.020
 aback+0.019
Neg logit contributions
icking-0.045
abella-0.045
bie-0.043
tha-0.043
bled-0.042
Token.
Token ID13
Feature activation-2.36e-02

Pos logit contributions
pillar+0.018
 target+0.018
 pieces+0.016
Sur+0.015
 aback+0.014
Neg logit contributions
abella-0.041
bie-0.041
icking-0.041
tha-0.040
ili-0.039
Token Ben
Token ID3932
Feature activation-2.30e-02

Pos logit contributions
 target+0.041
pillar+0.041
 pieces+0.039
Sur+0.036
 bites+0.035
Neg logit contributions
abella-0.038
bie-0.037
icking-0.036
tha-0.034
ili-0.034
Token and
Token ID290
Feature activation+1.21e-02

Pos logit contributions
pillar+0.033
 target+0.032
Sur+0.029
 pieces+0.027
 aback+0.027
Neg logit contributions
bie-0.044
abella-0.043
icking-0.040
aj-0.040
tha-0.039
Token.
Token ID13
Feature activation-2.14e-02

Pos logit contributions
pillar+0.035
 target+0.035
 pieces+0.031
Sur+0.031
 bites+0.028
Neg logit contributions
abella-0.081
icking-0.081
bie-0.077
tha-0.076
bled-0.074
Token She
Token ID1375
Feature activation-9.43e-03

Pos logit contributions
pillar+0.025
 target+0.024
 pieces+0.024
 bites+0.021
Sur+0.021
Neg logit contributions
abella-0.046
icking-0.046
bie-0.046
tha-0.044
ili-0.043
Token showed
Token ID3751
Feature activation-3.03e-02

Pos logit contributions
pillar+0.011
 target+0.011
 pieces+0.011
 bites+0.010
Sur+0.010
Neg logit contributions
icking-0.020
ric-0.019
bled-0.019
bie-0.019
abella-0.019
Token him
Token ID683
Feature activation-3.52e-03

Pos logit contributions
pillar+0.045
 target+0.045
 pieces+0.042
Sur+0.040
 bites+0.038
Neg logit contributions
abella-0.056
icking-0.055
bie-0.054
tha-0.052
bled-0.051
Token a
Token ID257
Feature activation-6.30e-02

Pos logit contributions
 pieces+0.005
pillar+0.005
 target+0.004
 bites+0.004
 piece+0.004
Neg logit contributions
icking-0.007
abella-0.007
tha-0.007
bie-0.007
bled-0.007
Token big
Token ID1263
Feature activation-9.62e-02

Pos logit contributions
pillar+0.079
 target+0.074
 pieces+0.073
Sur+0.069
 aback+0.068
Neg logit contributions
bie-0.125
icking-0.123
abella-0.123
tha-0.117
ric-0.114
Token map
Token ID3975
Feature activation-1.61e-02

Pos logit contributions
pillar+0.116
Sur+0.108
 target+0.105
 pieces+0.100
 aback+0.098
Neg logit contributions
icking-0.195
bie-0.190
tha-0.185
abella-0.179
belt-0.174
Token and
Token ID290
Feature activation+1.02e-03

Pos logit contributions
pillar+0.021
 target+0.020
Sur+0.019
 pieces+0.019
 aback+0.018
Neg logit contributions
icking-0.032
abella-0.031
tha-0.030
belt-0.029
bie-0.029
Token said
Token ID531
Feature activation-7.54e-03

Pos logit contributions
icking+0.002
bie+0.002
abella+0.002
ric+0.002
tha+0.002
Neg logit contributions
 pieces-0.001
 target-0.001
pillar-0.001
Sur-0.001
 string-0.001
Token "
Token ID366
Feature activation-5.60e-02

Pos logit contributions
pillar+0.008
 pieces+0.008
 target+0.008
Sur+0.007
 bites+0.007
Neg logit contributions
icking-0.017
abella-0.016
bie-0.016
tha-0.016
ili-0.015
TokenLook
Token ID8567
Feature activation+4.74e-03

Pos logit contributions
 target+0.063
pillar+0.063
 pieces+0.061
 bites+0.057
 aback+0.052
Neg logit contributions
abella-0.119
bie-0.117
icking-0.114
ili-0.111
tha-0.111
Token and
Token ID290
Feature activation-8.22e-03

Pos logit contributions
pillar+0.020
 pieces+0.019
 target+0.018
 bites+0.017
Sur+0.017
Neg logit contributions
abella-0.033
icking-0.032
bled-0.032
tha-0.031
bie-0.031
Token dad
Token ID9955
Feature activation-7.60e-03

Pos logit contributions
pillar+0.016
 pieces+0.016
 target+0.016
Sur+0.015
 string+0.015
Neg logit contributions
icking-0.011
abella-0.011
belt-0.010
bled-0.010
bie-0.010
Token gave
Token ID2921
Feature activation-4.21e-03

Pos logit contributions
pillar+0.009
 pieces+0.009
 target+0.009
Sur+0.008
 bites+0.008
Neg logit contributions
abella-0.016
icking-0.015
tha-0.015
bled-0.015
bie-0.015
Token her
Token ID607
Feature activation-1.08e-02

Pos logit contributions
 target+0.008
pillar+0.008
 pieces+0.007
powers+0.007
 bites+0.007
Neg logit contributions
abella-0.006
icking-0.006
bie-0.006
tha-0.006
ili-0.006
Token a
Token ID257
Feature activation-6.60e-02

Pos logit contributions
 pieces+0.011
pillar+0.011
 target+0.011
 bites+0.009
Sur+0.009
Neg logit contributions
icking-0.024
abella-0.024
tha-0.024
bie-0.023
bled-0.023
Token big
Token ID1263
Feature activation-9.60e-02

Pos logit contributions
pillar+0.084
 target+0.083
 pieces+0.079
Sur+0.073
 aback+0.073
Neg logit contributions
bie-0.131
abella-0.129
icking-0.129
tha-0.122
ili-0.121
Token bowl
Token ID9396
Feature activation-1.60e-02

Pos logit contributions
pillar+0.120
 target+0.116
 pieces+0.109
Sur+0.108
 aback+0.098
Neg logit contributions
icking-0.194
bie-0.192
abella-0.185
tha-0.184
belt-0.173
Token.
Token ID13
Feature activation-2.94e-02

Pos logit contributions
 target+0.019
pillar+0.019
Sur+0.017
 pieces+0.016
 bites+0.015
Neg logit contributions
icking-0.036
bie-0.035
abella-0.033
tha-0.032
bled-0.032
Token She
Token ID1375
Feature activation-8.87e-03

Pos logit contributions
pillar+0.041
 pieces+0.039
 target+0.038
 bites+0.035
Sur+0.034
Neg logit contributions
abella-0.056
icking-0.054
tha-0.054
bie-0.053
bled-0.052
Token was
Token ID373
Feature activation+2.44e-02

Pos logit contributions
 target+0.014
pillar+0.014
 pieces+0.012
Sur+0.012
 aback+0.012
Neg logit contributions
abella-0.016
bie-0.016
icking-0.016
tha-0.015
bled-0.015
Token so
Token ID523
Feature activation-3.69e-03

Pos logit contributions
icking+0.061
abella+0.061
bie+0.059
tha+0.058
ogo+0.058
Neg logit contributions
 pieces-0.019
 piece-0.017
 string-0.016
 aback-0.014
 target-0.014
Token's
Token ID338
Feature activation+1.87e-02

Pos logit contributions
pillar+0.020
 target+0.019
 pieces+0.018
Sur+0.017
 aback+0.016
Neg logit contributions
icking-0.042
bie-0.042
abella-0.042
tha-0.039
bled-0.038
Token mom
Token ID1995
Feature activation-1.50e-02

Pos logit contributions
icking+0.027
abella+0.026
bie+0.025
bled+0.024
tha+0.024
Neg logit contributions
pillar-0.036
 target-0.035
 pieces-0.034
Sur-0.033
 aback-0.032
Token gave
Token ID2921
Feature activation-2.03e-03

Pos logit contributions
pillar+0.023
 target+0.023
 pieces+0.022
Sur+0.020
 bites+0.020
Neg logit contributions
icking-0.027
abella-0.027
bie-0.026
tha-0.025
bled-0.025
Token him
Token ID683
Feature activation-3.92e-03

Pos logit contributions
pillar+0.004
 target+0.004
Sur+0.003
 pieces+0.003
powers+0.003
Neg logit contributions
icking-0.003
bie-0.003
abella-0.003
bled-0.003
tha-0.003
Token a
Token ID257
Feature activation-6.90e-02

Pos logit contributions
 pieces+0.004
pillar+0.004
 target+0.004
 bites+0.004
Sur+0.003
Neg logit contributions
abella-0.009
icking-0.009
tha-0.009
bie-0.008
bled-0.008
Token big
Token ID1263
Feature activation-9.55e-02

Pos logit contributions
pillar+0.098
 target+0.093
 pieces+0.091
Sur+0.084
 aback+0.081
Neg logit contributions
bie-0.125
icking-0.124
abella-0.122
aj-0.114
tha-0.114
Token thick
Token ID6546
Feature activation-7.13e-02

Pos logit contributions
pillar+0.132
 target+0.124
Sur+0.123
hest+0.116
 pieces+0.114
Neg logit contributions
icking-0.177
bie-0.170
tha-0.162
 seeker-0.156
abella-0.155
Token envelope
Token ID22878
Feature activation-1.01e-02

Pos logit contributions
pillar+0.092
Sur+0.088
 target+0.085
 pieces+0.082
 aback+0.081
Neg logit contributions
icking-0.132
belt-0.126
bie-0.123
tha-0.122
abella-0.120
Token to
Token ID284
Feature activation+1.54e-02

Pos logit contributions
pillar+0.013
 target+0.012
 pieces+0.011
Sur+0.011
Mine+0.011
Neg logit contributions
icking-0.022
belt-0.020
abella-0.020
bled-0.020
tha-0.019
Token mail
Token ID6920
Feature activation+1.47e-02

Pos logit contributions
abella+0.033
icking+0.033
bie+0.032
tha+0.031
bled+0.031
Neg logit contributions
pillar-0.018
 target-0.017
 pieces-0.017
Sur-0.015
 bites-0.015
Token.
Token ID13
Feature activation-1.84e-02

Pos logit contributions
abella+0.029
icking+0.029
bie+0.028
tha+0.027
bled+0.027
Neg logit contributions
pillar-0.020
 target-0.019
 pieces-0.018
Sur-0.017
 bites-0.017
Token Her
Token ID2332
Feature activation+3.57e-02

Pos logit contributions
pillar+0.012
 target+0.011
 pieces+0.011
Sur+0.010
 bites+0.009
Neg logit contributions
icking-0.030
abella-0.029
tha-0.028
bie-0.028
bled-0.028
Token mom
Token ID1995
Feature activation-1.48e-02

Pos logit contributions
abella+0.075
riage+0.067
ancer+0.065
ili+0.063
ores+0.062
Neg logit contributions
 pieces-0.049
 ball-0.046
 string-0.044
 bird-0.042
's-0.041
Token gave
Token ID2921
Feature activation-1.84e-03

Pos logit contributions
pillar+0.019
 pieces+0.019
 target+0.018
Sur+0.016
 bites+0.016
Neg logit contributions
abella-0.030
icking-0.029
bled-0.028
tha-0.028
bie-0.028
Token her
Token ID607
Feature activation-1.67e-02

Pos logit contributions
 target+0.004
pillar+0.004
powers+0.004
Sur+0.004
hest+0.004
Neg logit contributions
abella-0.002
bie-0.002
icking-0.002
ili-0.002
bled-0.002
Token a
Token ID257
Feature activation-7.10e-02

Pos logit contributions
pillar+0.020
 target+0.020
 pieces+0.019
Sur+0.017
 bites+0.017
Neg logit contributions
abella-0.035
icking-0.035
bie-0.034
tha-0.033
bled-0.033
Token big
Token ID1263
Feature activation-9.53e-02

Pos logit contributions
pillar+0.094
 target+0.093
 pieces+0.089
Sur+0.081
 aback+0.079
Neg logit contributions
icking-0.138
bie-0.137
abella-0.135
tha-0.129
ili-0.127
Token,
Token ID11
Feature activation+3.35e-02

Pos logit contributions
pillar+0.129
 target+0.129
 pieces+0.120
Sur+0.117
hest+0.106
Neg logit contributions
icking-0.182
bie-0.178
tha-0.170
abella-0.170
ili-0.161
Token red
Token ID2266
Feature activation-6.39e-02

Pos logit contributions
abella+0.069
icking+0.069
bie+0.067
tha+0.066
bled+0.065
Neg logit contributions
pillar-0.041
 target-0.041
 pieces-0.040
Sur-0.036
 bites-0.035
Token umbrella
Token ID25510
Feature activation+5.10e-03

Pos logit contributions
pillar+0.091
 target+0.089
 pieces+0.084
 aback+0.081
Sur+0.079
Neg logit contributions
icking-0.119
bie-0.113
tha-0.111
bled-0.109
abella-0.108
Token to
Token ID284
Feature activation+9.72e-03

Pos logit contributions
icking+0.012
abella+0.011
bled+0.011
bie+0.011
tha+0.011
Neg logit contributions
pillar-0.005
 target-0.005
 pieces-0.005
Sur-0.005
 aback-0.004
Token keep
Token ID1394
Feature activation+1.57e-02

Pos logit contributions
abella+0.022
bie+0.022
icking+0.021
tha+0.021
bled+0.020
Neg logit contributions
 target-0.010
 pieces-0.010
 string-0.010
pillar-0.010
 bite-0.009
Token.
Token ID13
Feature activation-1.41e-02

Pos logit contributions
abella+0.002
icking+0.002
belt+0.002
bled+0.002
tha+0.002
Neg logit contributions
pillar-0.001
 target-0.001
 pieces-0.001
 aback-0.001
Sur-0.001
Token The
Token ID383
Feature activation-2.91e-02

Pos logit contributions
 pieces+0.015
 target+0.014
pillar+0.014
Sur+0.013
 bites+0.013
Neg logit contributions
icking-0.032
abella-0.031
tha-0.030
bie-0.030
bled-0.030
Token map
Token ID3975
Feature activation-9.43e-03

Pos logit contributions
pillar+0.045
 target+0.041
 pieces+0.040
 aback+0.040
Sur+0.039
Neg logit contributions
abella-0.050
icking-0.049
bie-0.048
tha-0.048
bled-0.046
Token shows
Token ID2523
Feature activation-2.62e-02

Pos logit contributions
pillar+0.014
 target+0.014
Sur+0.013
 aback+0.012
 pieces+0.012
Neg logit contributions
icking-0.017
abella-0.016
tha-0.016
bled-0.015
bie-0.015
Token a
Token ID257
Feature activation-5.91e-02

Pos logit contributions
 pieces+0.034
 target+0.032
pillar+0.030
 bites+0.028
 string+0.027
Neg logit contributions
bie-0.055
icking-0.054
abella-0.053
tha-0.052
aj-0.052
Token big
Token ID1263
Feature activation-9.51e-02

Pos logit contributions
pillar+0.082
 aback+0.078
 pieces+0.074
 bites+0.071
 target+0.071
Neg logit contributions
abella-0.107
bie-0.105
icking-0.100
tha-0.099
 Ty-0.098
Token X
Token ID1395
Feature activation+7.24e-04

Pos logit contributions
pillar+0.128
Sur+0.121
 aback+0.117
ives+0.116
hest+0.107
Neg logit contributions
bie-0.167
icking-0.165
tha-0.164
belt-0.156
abella-0.155
Token in
Token ID287
Feature activation-4.91e-04

Pos logit contributions
icking+0.002
abella+0.002
tha+0.001
bie+0.001
bled+0.001
Neg logit contributions
pillar-0.001
 target-0.001
 pieces-0.001
Sur-0.001
 aback-0.001
Token the
Token ID262
Feature activation+8.23e-03

Pos logit contributions
pillar+0.001
 target+0.001
 pieces+0.001
Sur+0.001
 bites+0.001
Neg logit contributions
abella-0.001
icking-0.001
bie-0.001
tha-0.001
bled-0.001
Token east
Token ID7627
Feature activation+1.33e-02

Pos logit contributions
abella+0.014
bie+0.013
icking+0.013
tha+0.013
belt+0.012
Neg logit contributions
pillar-0.013
 target-0.012
 pieces-0.012
 aback-0.011
Sur-0.011
Token.
Token ID13
Feature activation-1.56e-02

Pos logit contributions
icking+0.029
bie+0.028
abella+0.028
tha+0.027
ric+0.027
Neg logit contributions
pillar-0.015
 target-0.014
 pieces-0.014
Sur-0.013
 bites-0.013
Token's
Token ID338
Feature activation+2.26e-02

Pos logit contributions
pillar+0.015
 target+0.015
 pieces+0.014
 aback+0.013
Sur+0.013
Neg logit contributions
bie-0.039
icking-0.039
abella-0.038
tha-0.036
bled-0.035
Token mom
Token ID1995
Feature activation-1.23e-02

Pos logit contributions
abella+0.025
icking+0.024
bie+0.023
tha+0.023
bled+0.022
Neg logit contributions
pillar-0.050
 target-0.049
 pieces-0.049
Sur-0.046
 bites-0.045
Token gave
Token ID2921
Feature activation-1.22e-03

Pos logit contributions
pillar+0.019
 target+0.019
 pieces+0.018
Sur+0.016
 bites+0.016
Neg logit contributions
icking-0.022
abella-0.022
bie-0.022
tha-0.021
bled-0.020
Token him
Token ID683
Feature activation-1.60e-03

Pos logit contributions
pillar+0.002
 target+0.002
 pieces+0.002
Sur+0.002
powers+0.002
Neg logit contributions
icking-0.002
bie-0.002
abella-0.002
tha-0.002
bled-0.002
Token a
Token ID257
Feature activation-6.84e-02

Pos logit contributions
 pieces+0.002
pillar+0.001
 target+0.001
 bites+0.001
Sur+0.001
Neg logit contributions
icking-0.004
abella-0.004
tha-0.004
bled-0.004
bie-0.004
Token big
Token ID1263
Feature activation-9.51e-02

Pos logit contributions
pillar+0.092
 target+0.090
 pieces+0.085
Sur+0.079
 aback+0.076
Neg logit contributions
bie-0.132
abella-0.131
icking-0.131
tha-0.122
ili-0.121
Token onion
Token ID21670
Feature activation-2.60e-02

Pos logit contributions
pillar+0.128
 target+0.122
Sur+0.115
 pieces+0.110
hest+0.105
Neg logit contributions
icking-0.184
bie-0.181
tha-0.172
abella-0.170
belt-0.163
Token to
Token ID284
Feature activation+1.19e-02

Pos logit contributions
pillar+0.031
 target+0.030
Sur+0.029
 aback+0.028
 pieces+0.027
Neg logit contributions
icking-0.055
bie-0.052
abella-0.050
belt-0.049
tha-0.048
Token eat
Token ID4483
Feature activation-1.24e-02

Pos logit contributions
abella+0.025
icking+0.025
bie+0.024
tha+0.024
bled+0.023
Neg logit contributions
pillar-0.014
 target-0.014
 pieces-0.014
Sur-0.012
 bites-0.012
Token.
Token ID13
Feature activation-1.82e-02

Pos logit contributions
pillar+0.017
 target+0.017
Sur+0.015
 aback+0.015
 pieces+0.013
Neg logit contributions
abella-0.024
ores-0.023
icking-0.023
allows-0.023
bie-0.023
Token Tim
Token ID5045
Feature activation+6.27e-03

Pos logit contributions
 target+0.033
pillar+0.033
 pieces+0.031
Sur+0.028
 bites+0.028
Neg logit contributions
icking-0.028
abella-0.028
bie-0.027
tha-0.026
 Su-0.025
Token,
Token ID11
Feature activation+2.27e-02

Pos logit contributions
abella+0.023
ric+0.023
icking+0.023
bie+0.023
tha+0.023
Neg logit contributions
 pieces-0.006
Sur-0.006
 bites-0.005
!"-0.005
pillar-0.005
Token Mom
Token ID11254
Feature activation-2.37e-02

Pos logit contributions
ric+0.046
icking+0.045
ogo+0.044
bie+0.043
mo+0.042
Neg logit contributions
 pieces-0.031
Sur-0.027
 string-0.027
 piece-0.027
 bites-0.026
Token gave
Token ID2921
Feature activation+3.00e-03

Pos logit contributions
 target+0.032
pillar+0.032
 pieces+0.031
Sur+0.028
 bites+0.027
Neg logit contributions
abella-0.046
icking-0.046
bie-0.045
tha-0.043
bled-0.042
Token them
Token ID606
Feature activation-3.11e-03

Pos logit contributions
icking+0.006
belt+0.006
abella+0.006
tha+0.006
bled+0.006
Neg logit contributions
 pieces-0.004
 target-0.003
 bites-0.003
pillar-0.003
 aback-0.003
Token a
Token ID257
Feature activation-6.81e-02

Pos logit contributions
pillar+0.004
 target+0.004
 pieces+0.004
Sur+0.003
 bites+0.003
Neg logit contributions
abella-0.007
icking-0.007
bie-0.006
tha-0.006
bled-0.006
Token big
Token ID1263
Feature activation-9.47e-02

Pos logit contributions
pillar+0.090
 target+0.088
 pieces+0.083
 aback+0.078
Sur+0.078
Neg logit contributions
bie-0.132
abella-0.130
icking-0.129
tha-0.122
ili-0.120
Token bar
Token ID2318
Feature activation-4.83e-03

Pos logit contributions
pillar+0.129
 target+0.122
Sur+0.118
 pieces+0.110
hest+0.108
Neg logit contributions
icking-0.179
bie-0.178
tha-0.169
abella-0.164
belt-0.161
Token of
Token ID286
Feature activation+1.42e-02

Pos logit contributions
 target+0.010
pillar+0.010
 pieces+0.010
Sur+0.010
 aback+0.010
Neg logit contributions
bie-0.005
abella-0.005
icking-0.005
tha-0.005
belt-0.005
Token chocolate
Token ID11311
Feature activation-5.14e-03

Pos logit contributions
icking+0.031
abella+0.030
bie+0.029
tha+0.029
bled+0.029
Neg logit contributions
 pieces-0.017
pillar-0.016
 target-0.016
 bites-0.015
 string-0.014
Token to
Token ID284
Feature activation+1.55e-02

Pos logit contributions
 target+0.007
pillar+0.007
 aback+0.007
Start+0.006
Sur+0.006
Neg logit contributions
icking-0.009
bie-0.009
abella-0.009
tha-0.009
belt-0.009
Token share
Token ID2648
Feature activation-8.84e-03

Pos logit contributions
icking+0.034
abella+0.033
bie+0.033
tha+0.032
bled+0.032
Neg logit contributions
pillar-0.017
 pieces-0.017
 target-0.017
 bites-0.016
Sur-0.015
Tokenopl
Token ID20106
Feature activation+2.15e-02

Pos logit contributions
abella+0.042
icking+0.040
ric+0.038
bie+0.038
tha+0.037
Neg logit contributions
 target-0.071
 pieces-0.070
Sur-0.068
pillar-0.067
 aback-0.067
Tokenane
Token ID1531
Feature activation-1.07e-02

Pos logit contributions
icking+0.031
abella+0.031
bie+0.029
tha+0.029
bled+0.028
Neg logit contributions
 target-0.040
pillar-0.040
 pieces-0.039
Sur-0.036
 bites-0.036
Token.
Token ID13
Feature activation-1.65e-02

Pos logit contributions
pillar+0.011
 target+0.010
 pieces+0.010
 aback+0.010
Sur+0.010
Neg logit contributions
icking-0.024
abella-0.023
tha-0.023
bie-0.023
ric-0.022
Token It
Token ID632
Feature activation-2.51e-02

Pos logit contributions
pillar+0.022
 target+0.021
 pieces+0.021
Sur+0.019
 bites+0.018
Neg logit contributions
abella-0.033
icking-0.032
bie-0.032
tha-0.031
ili-0.030
Token is
Token ID318
Feature activation+1.81e-02

Pos logit contributions
pillar+0.028
 target+0.028
 pieces+0.026
Sur+0.024
 aback+0.024
Neg logit contributions
abella-0.054
icking-0.053
bie-0.053
tha-0.051
ric-0.051
Token big
Token ID1263
Feature activation-9.47e-02

Pos logit contributions
icking+0.047
bie+0.044
abella+0.044
tha+0.044
ancer+0.044
Neg logit contributions
 pieces-0.016
 target-0.015
 string-0.015
 piece-0.013
 magazine-0.011
Token and
Token ID290
Feature activation+1.50e-02

Pos logit contributions
pillar+0.118
Sur+0.113
Start+0.104
 target+0.103
Mine+0.102
Neg logit contributions
icking-0.184
bie-0.175
abella-0.171
aj-0.170
belt-0.168
Token red
Token ID2266
Feature activation-6.29e-02

Pos logit contributions
icking+0.036
abella+0.035
bie+0.034
tha+0.033
bled+0.033
Neg logit contributions
 pieces-0.016
 target-0.015
 string-0.014
pillar-0.014
 piece-0.013
Token and
Token ID290
Feature activation+1.31e-02

Pos logit contributions
pillar+0.089
Sur+0.081
 aback+0.080
 target+0.079
Start+0.077
Neg logit contributions
icking-0.118
ric-0.106
belt-0.106
ahs-0.104
angers-0.102
Token can
Token ID460
Feature activation-1.07e-03

Pos logit contributions
icking+0.031
abella+0.029
bie+0.029
tha+0.028
bled+0.028
Neg logit contributions
 pieces-0.015
 target-0.014
pillar-0.013
 string-0.013
 bites-0.012
Token fly
Token ID6129
Feature activation+2.47e-02

Pos logit contributions
pillar+0.002
 target+0.002
 pieces+0.001
Sur+0.001
 aback+0.001
Neg logit contributions
abella-0.002
icking-0.002
tha-0.002
bie-0.002
bled-0.002
Token leaves
Token ID5667
Feature activation+1.13e-02

Pos logit contributions
icking+0.015
bie+0.015
abella+0.015
ili+0.014
allows+0.014
Neg logit contributions
pillar-0.014
 target-0.014
Sur-0.012
 pieces-0.012
 aback-0.011
Token.
Token ID13
Feature activation-2.00e-02

Pos logit contributions
icking+0.021
bled+0.019
abella+0.019
bie+0.019
tha+0.018
Neg logit contributions
pillar-0.017
 target-0.016
Sur-0.016
 pieces-0.015
 aback-0.015
Token It
Token ID632
Feature activation-2.27e-02

Pos logit contributions
 target+0.025
pillar+0.025
 pieces+0.023
Sur+0.021
 bites+0.020
Neg logit contributions
abella-0.041
icking-0.040
bie-0.040
tha-0.039
bled-0.038
Token was
Token ID373
Feature activation+1.65e-02

Pos logit contributions
 target+0.036
pillar+0.035
Sur+0.031
 aback+0.031
 pieces+0.030
Neg logit contributions
bie-0.040
abella-0.040
icking-0.037
tha-0.036
bled-0.036
Token pile
Token ID14540
Feature activation-5.46e-03

Pos logit contributions
icking+0.037
abella+0.037
bie+0.036
tha+0.036
ancer+0.035
Neg logit contributions
 pieces-0.019
 target-0.017
 string-0.015
pillar-0.015
 piece-0.015
Token big
Token ID1263
Feature activation-9.46e-02

Pos logit contributions
pillar+0.005
 target+0.005
 pieces+0.005
Sur+0.004
 bites+0.004
Neg logit contributions
icking-0.013
abella-0.013
bie-0.012
bled-0.012
tha-0.012
Token enough
Token ID1576
Feature activation+1.87e-03

Pos logit contributions
 target+0.123
pillar+0.123
Sur+0.116
',"+0.113
Mine+0.110
Neg logit contributions
icking-0.174
bie-0.161
allows-0.154
aj-0.153
 K-0.152
Token for
Token ID329
Feature activation-1.22e-03

Pos logit contributions
icking+0.004
bie+0.003
abella+0.003
tha+0.003
bled+0.003
Neg logit contributions
pillar-0.003
 target-0.003
Sur-0.002
 pieces-0.002
 aback-0.002
Token Bob
Token ID5811
Feature activation-2.39e-03

Pos logit contributions
 pieces+0.001
 bites+0.001
 bite+0.001
 string+0.001
 nut+0.001
Neg logit contributions
icking-0.003
aj-0.003
bie-0.003
abella-0.003
belt-0.003
Token to
Token ID284
Feature activation+1.28e-02

Pos logit contributions
pillar+0.002
 target+0.002
 pieces+0.002
Sur+0.002
 aback+0.002
Neg logit contributions
icking-0.006
bie-0.006
abella-0.006
tha-0.005
bled-0.005
Token hide
Token ID7808
Feature activation+3.04e-02

Pos logit contributions
abella+0.028
icking+0.027
bie+0.027
tha+0.027
bled+0.026
Neg logit contributions
 target-0.015
pillar-0.015
 pieces-0.015
 bites-0.013
 string-0.012
Token.
Token ID13
Feature activation-7.33e-03

Pos logit contributions
abella+0.036
icking+0.035
bie+0.035
tha+0.034
bled+0.033
Neg logit contributions
 pieces-0.018
 target-0.017
pillar-0.017
Sur-0.015
 bites-0.015
Token Mom
Token ID11254
Feature activation-2.49e-02

Pos logit contributions
 pieces+0.009
pillar+0.009
 target+0.009
 bites+0.008
Sur+0.008
Neg logit contributions
icking-0.015
bled-0.015
abella-0.015
tha-0.014
belt-0.014
Token gives
Token ID3607
Feature activation-1.38e-03

Pos logit contributions
pillar+0.033
 target+0.033
 pieces+0.031
Sur+0.028
 bites+0.028
Neg logit contributions
abella-0.050
icking-0.049
bie-0.048
tha-0.047
bled-0.046
Token them
Token ID606
Feature activation-1.66e-03

Pos logit contributions
 pieces+0.002
 piece+0.002
pillar+0.002
 slices+0.002
 bites+0.002
Neg logit contributions
belt-0.003
abella-0.003
aj-0.003
bled-0.002
icking-0.002
Token a
Token ID257
Feature activation-6.22e-02

Pos logit contributions
 pieces+0.002
pillar+0.002
 target+0.002
Sur+0.002
 bites+0.002
Neg logit contributions
abella-0.003
icking-0.003
bie-0.003
tha-0.003
bled-0.003
Token big
Token ID1263
Feature activation-9.46e-02

Pos logit contributions
 target+0.101
pillar+0.097
 pieces+0.085
 aback+0.084
Sur+0.081
Neg logit contributions
bie-0.113
abella-0.105
icking-0.104
ili-0.098
tha-0.098
Token bowl
Token ID9396
Feature activation-1.76e-02

Pos logit contributions
 target+0.152
pillar+0.151
Sur+0.133
hest+0.128
powers+0.127
Neg logit contributions
bie-0.162
icking-0.159
tha-0.149
abella-0.141
iah-0.137
Token of
Token ID286
Feature activation+1.17e-02

Pos logit contributions
 target+0.020
pillar+0.020
Sur+0.017
 pieces+0.017
 bites+0.015
Neg logit contributions
icking-0.040
bie-0.038
abella-0.038
tha-0.037
bled-0.036
Token dough
Token ID15756
Feature activation-1.93e-02

Pos logit contributions
bie+0.018
allows+0.017
abella+0.017
icking+0.017
ores+0.016
Neg logit contributions
 target-0.022
pillar-0.022
 aback-0.018
Sur-0.018
Start-0.017
Token and
Token ID290
Feature activation-9.99e-03

Pos logit contributions
pillar+0.025
 target+0.024
Sur+0.021
 aback+0.021
Start+0.021
Neg logit contributions
icking-0.040
bie-0.037
abella-0.037
tha-0.037
bled-0.037
Token a
Token ID257
Feature activation-4.46e-02

Pos logit contributions
pillar+0.013
 target+0.013
 pieces+0.013
Sur+0.012
 bites+0.012
Neg logit contributions
icking-0.019
abella-0.019
bie-0.019
tha-0.018
bled-0.018
Token best
Token ID1266
Feature activation+1.72e-03

Pos logit contributions
 target+0.021
pillar+0.021
Sur+0.019
 pieces+0.019
 aback+0.019
Neg logit contributions
bie-0.042
icking-0.042
abella-0.041
ric-0.039
allows-0.039
Token,
Token ID11
Feature activation+2.72e-02

Pos logit contributions
icking+0.004
bie+0.004
abella+0.004
tha+0.003
ili+0.003
Neg logit contributions
pillar-0.002
 target-0.002
 pieces-0.002
Sur-0.002
 aback-0.002
Token because
Token ID780
Feature activation+8.46e-03

Pos logit contributions
icking+0.061
abella+0.061
tha+0.059
bie+0.059
bled+0.058
Neg logit contributions
 target-0.029
pillar-0.029
 pieces-0.029
 bites-0.025
Sur-0.024
Token it
Token ID340
Feature activation-1.94e-02

Pos logit contributions
icking+0.021
aj+0.019
bie+0.019
tha+0.019
abella+0.019
Neg logit contributions
 pieces-0.009
pillar-0.008
 target-0.008
 string-0.007
 bites-0.007
Token is
Token ID318
Feature activation+2.19e-02

Pos logit contributions
 target+0.025
pillar+0.025
 pieces+0.023
Sur+0.022
 aback+0.022
Neg logit contributions
abella-0.038
icking-0.038
bie-0.037
tha-0.036
bled-0.035
Token big
Token ID1263
Feature activation-9.46e-02

Pos logit contributions
icking+0.048
abella+0.047
tha+0.046
bie+0.045
bled+0.044
Neg logit contributions
 pieces-0.026
 target-0.026
pillar-0.023
 bites-0.023
 string-0.022
Token and
Token ID290
Feature activation+2.62e-02

Pos logit contributions
pillar+0.129
Sur+0.126
Start+0.118
Mine+0.115
 target+0.113
Neg logit contributions
icking-0.168
bie-0.166
aj-0.153
allows-0.151
abella-0.150
Token strong
Token ID1913
Feature activation-6.23e-02

Pos logit contributions
icking+0.057
abella+0.056
bie+0.054
tha+0.054
bled+0.053
Neg logit contributions
 target-0.031
pillar-0.031
 pieces-0.030
 bites-0.027
Sur-0.026
Token.
Token ID13
Feature activation-7.44e-03

Pos logit contributions
pillar+0.079
Sur+0.075
 target+0.071
Start+0.067
 aback+0.064
Neg logit contributions
icking-0.118
bie-0.115
allows-0.111
abella-0.110
ores-0.110
Token Mia
Token ID32189
Feature activation-1.38e-02

Pos logit contributions
pillar+0.007
 target+0.007
 pieces+0.006
Sur+0.006
 bites+0.006
Neg logit contributions
icking-0.018
abella-0.017
tha-0.017
bie-0.017
bled-0.017
Token likes
Token ID7832
Feature activation-2.81e-02

Pos logit contributions
pillar+0.015
 target+0.015
 pieces+0.014
Sur+0.013
 bites+0.012
Neg logit contributions
abella-0.031
icking-0.030
bie-0.030
tha-0.029
bled-0.028
Token.
Token ID13
Feature activation-6.40e-03

Pos logit contributions
icking+0.034
abella+0.032
bie+0.031
tha+0.031
ores+0.031
Neg logit contributions
pillar-0.023
 target-0.021
Sur-0.019
 aback-0.018
 pieces-0.018
Token Mom
Token ID11254
Feature activation-2.24e-02

Pos logit contributions
pillar+0.013
 target+0.012
 pieces+0.012
 bites+0.011
Sur+0.011
Neg logit contributions
icking-0.009
bie-0.009
abella-0.009
gan-0.008
bled-0.008
Token gave
Token ID2921
Feature activation-6.04e-04

Pos logit contributions
 target+0.033
pillar+0.033
 pieces+0.031
Sur+0.029
 bites+0.028
Neg logit contributions
abella-0.042
icking-0.041
bie-0.041
tha-0.039
bled-0.038
Token her
Token ID607
Feature activation-8.50e-03

Pos logit contributions
pillar+0.001
 target+0.001
 pieces+0.001
Sur+0.001
 bites+0.001
Neg logit contributions
abella-0.001
icking-0.001
bie-0.001
tha-0.001
bled-0.001
Token a
Token ID257
Feature activation-6.40e-02

Pos logit contributions
pillar+0.013
 target+0.013
Sur+0.011
 pieces+0.011
 aback+0.010
Neg logit contributions
icking-0.016
bie-0.016
abella-0.016
bled-0.015
tha-0.015
Token big
Token ID1263
Feature activation-9.46e-02

Pos logit contributions
pillar+0.106
 target+0.106
 aback+0.092
lier+0.089
Sur+0.088
Neg logit contributions
bie-0.111
icking-0.100
abella-0.095
aj-0.095
tha-0.092
Token bowl
Token ID9396
Feature activation-1.75e-02

Pos logit contributions
pillar+0.159
 target+0.153
Sur+0.144
hest+0.141
So+0.134
Neg logit contributions
bie-0.151
icking-0.150
tha-0.136
aj-0.128
 Mam-0.127
Token and
Token ID290
Feature activation+2.80e-03

Pos logit contributions
 target+0.022
pillar+0.022
Sur+0.020
 pieces+0.018
Start+0.018
Neg logit contributions
icking-0.038
bie-0.036
abella-0.035
bled-0.035
ores-0.034
Token a
Token ID257
Feature activation-5.16e-02

Pos logit contributions
icking+0.005
abella+0.005
bie+0.005
tha+0.005
bled+0.005
Neg logit contributions
pillar-0.004
 pieces-0.004
 target-0.004
Sur-0.003
 bites-0.003
Token spoon
Token ID24556
Feature activation-2.51e-03

Pos logit contributions
pillar+0.085
 target+0.081
 aback+0.075
 pieces+0.071
hest+0.070
Neg logit contributions
bie-0.087
icking-0.079
aj-0.077
abella-0.077
 Mam-0.074
Token.
Token ID13
Feature activation-2.36e-02

Pos logit contributions
pillar+0.002
 target+0.002
 pieces+0.002
Sur+0.002
 bites+0.001
Neg logit contributions
icking-0.006
abella-0.006
bie-0.006
tha-0.006
bled-0.006
Token He
Token ID679
Feature activation-1.02e-02

Pos logit contributions
pillar+0.019
 target+0.019
 pieces+0.019
Sur+0.016
 bites+0.016
Neg logit contributions
abella-0.033
bie-0.032
icking-0.032
tha-0.030
ili-0.030
Token thinks
Token ID6834
Feature activation-9.49e-03

Pos logit contributions
pillar+0.013
 target+0.013
 pieces+0.012
Sur+0.012
 aback+0.011
Neg logit contributions
abella-0.020
bie-0.020
icking-0.020
tha-0.019
bled-0.018
Token hipp
Token ID18568
Feature activation+4.90e-03

Pos logit contributions
 pieces+0.010
pillar+0.008
 target+0.008
 string+0.008
 bites+0.008
Neg logit contributions
icking-0.022
bie-0.022
tha-0.022
abella-0.022
aj-0.021
Tokenos
Token ID418
Feature activation-1.59e-02

Pos logit contributions
icking+0.005
abella+0.005
bie+0.005
tha+0.005
bled+0.005
Neg logit contributions
 target-0.011
pillar-0.011
 pieces-0.010
Sur-0.010
 bites-0.010
Token are
Token ID389
Feature activation+2.54e-02

Pos logit contributions
 target+0.019
pillar+0.019
 pieces+0.018
Sur+0.017
 bites+0.016
Neg logit contributions
icking-0.033
abella-0.032
bie-0.032
ric-0.031
tha-0.030
Token big
Token ID1263
Feature activation-9.43e-02

Pos logit contributions
icking+0.063
tha+0.060
bie+0.059
abella+0.059
belt+0.057
Neg logit contributions
 pieces-0.026
 target-0.024
 string-0.021
pillar-0.021
 piece-0.020
Token and
Token ID290
Feature activation+1.65e-02

Pos logit contributions
pillar+0.122
Sur+0.119
 target+0.112
Start+0.111
Mine+0.109
Neg logit contributions
icking-0.171
bie-0.169
aj-0.161
 seeker-0.160
abella-0.157
Token funny
Token ID8258
Feature activation-5.30e-02

Pos logit contributions
icking+0.040
tha+0.038
abella+0.038
bie+0.037
bled+0.036
Neg logit contributions
 pieces-0.017
 target-0.017
pillar-0.016
 bites-0.015
 string-0.015
Token and
Token ID290
Feature activation+1.14e-02

Pos logit contributions
pillar+0.051
 target+0.050
 pieces+0.045
Sur+0.042
 aback+0.040
Neg logit contributions
icking-0.122
abella-0.121
bie-0.121
bled-0.114
tha-0.114
Token have
Token ID423
Feature activation-2.17e-02

Pos logit contributions
icking+0.027
tha+0.026
abella+0.026
bie+0.025
bled+0.025
Neg logit contributions
 target-0.012
 pieces-0.012
pillar-0.012
 bites-0.011
 string-0.011
Token big
Token ID1263
Feature activation-8.08e-02

Pos logit contributions
pillar+0.032
 target+0.030
 aback+0.028
 pieces+0.027
Sur+0.027
Neg logit contributions
abella-0.040
bie-0.038
icking-0.036
ric-0.035
 Ty-0.035
Token and
Token ID290
Feature activation-5.50e-03

Pos logit contributions
pillar+0.014
 pieces+0.014
 target+0.013
Sur+0.012
 bites+0.012
Neg logit contributions
abella-0.023
icking-0.023
bie-0.022
bled-0.022
ric-0.022
Token dad
Token ID9955
Feature activation-1.52e-02

Pos logit contributions
 pieces+0.011
 target+0.010
pillar+0.010
Sur+0.010
 piece+0.010
Neg logit contributions
icking-0.007
abella-0.007
ric-0.007
tha-0.007
ili-0.007
Token gave
Token ID2921
Feature activation-3.75e-03

Pos logit contributions
pillar+0.020
 pieces+0.020
 target+0.020
Sur+0.018
 bites+0.018
Neg logit contributions
abella-0.029
icking-0.029
bie-0.029
tha-0.028
bled-0.028
Token them
Token ID606
Feature activation+4.49e-03

Pos logit contributions
 pieces+0.005
pillar+0.005
 target+0.005
 bites+0.005
Sur+0.005
Neg logit contributions
icking-0.007
abella-0.007
tha-0.007
bie-0.007
bled-0.006
Token a
Token ID257
Feature activation-6.71e-02

Pos logit contributions
icking+0.010
abella+0.010
tha+0.010
bie+0.010
bled+0.010
Neg logit contributions
 pieces-0.005
pillar-0.005
 target-0.004
 bites-0.004
Sur-0.004
Token big
Token ID1263
Feature activation-9.43e-02

Pos logit contributions
pillar+0.087
 target+0.084
 pieces+0.079
Sur+0.075
 aback+0.072
Neg logit contributions
bie-0.133
abella-0.132
icking-0.131
tha-0.123
ili-0.121
Token surprise
Token ID5975
Feature activation-1.63e-02

Pos logit contributions
pillar+0.127
 target+0.118
Sur+0.114
 pieces+0.107
hest+0.101
Neg logit contributions
icking-0.183
bie-0.180
tha-0.173
abella-0.169
belt-0.165
Token.
Token ID13
Feature activation-2.22e-02

Pos logit contributions
pillar+0.018
 target+0.018
 pieces+0.016
Sur+0.015
 aback+0.015
Neg logit contributions
icking-0.036
abella-0.036
bie-0.035
tha-0.035
bled-0.034
Token They
Token ID1119
Feature activation-1.58e-02

Pos logit contributions
pillar+0.032
 target+0.032
 pieces+0.030
Sur+0.027
 bites+0.027
Neg logit contributions
abella-0.042
icking-0.040
bie-0.040
bled-0.038
ili-0.038
Token had
Token ID550
Feature activation-2.65e-02

Pos logit contributions
pillar+0.022
 target+0.022
 pieces+0.022
Sur+0.019
 bites+0.019
Neg logit contributions
abella-0.030
icking-0.030
bie-0.029
tha-0.028
bled-0.028
Token bought
Token ID5839
Feature activation-1.24e-02

Pos logit contributions
pillar+0.038
 target+0.035
 pieces+0.032
Sur+0.030
 aback+0.030
Neg logit contributions
abella-0.053
bie-0.051
icking-0.050
tha-0.049
bled-0.048
Token oy
Token ID35104
Feature activation+3.36e-02

Pos logit contributions
 pieces+0.011
pillar+0.011
 target+0.011
 bites+0.010
Sur+0.010
Neg logit contributions
icking-0.017
abella-0.016
bie-0.016
tha-0.016
ili-0.015
Tokensters
Token ID5937
Feature activation+3.00e-02

Pos logit contributions
abella+0.050
icking+0.049
bie+0.047
tha+0.046
bled+0.045
Neg logit contributions
pillar-0.061
 target-0.061
 pieces-0.060
Sur-0.056
 bites-0.055
Token,
Token ID11
Feature activation+1.41e-02

Pos logit contributions
abella+0.071
icking+0.070
bie+0.069
tha+0.068
bled+0.066
Neg logit contributions
pillar-0.028
 target-0.028
 pieces-0.028
 bites-0.023
Sur-0.023
Token which
Token ID543
Feature activation-6.18e-03

Pos logit contributions
abella+0.032
icking+0.032
tha+0.031
bie+0.031
bled+0.031
Neg logit contributions
 pieces-0.016
 target-0.015
pillar-0.014
 bites-0.013
 string-0.013
Token were
Token ID547
Feature activation+2.73e-02

Pos logit contributions
pillar+0.008
 target+0.008
 pieces+0.008
Sur+0.007
 bites+0.007
Neg logit contributions
abella-0.012
icking-0.012
bie-0.012
tha-0.012
bled-0.011
Token big
Token ID1263
Feature activation-9.41e-02

Pos logit contributions
icking+0.062
abella+0.061
tha+0.059
bie+0.058
ancer+0.057
Neg logit contributions
 pieces-0.032
 target-0.029
pillar-0.026
 string-0.025
 piece-0.025
Token and
Token ID290
Feature activation+1.78e-02

Pos logit contributions
pillar+0.112
 target+0.106
Sur+0.098
Start+0.095
 aback+0.091
Neg logit contributions
icking-0.191
bie-0.190
abella-0.180
tha-0.175
ric-0.173
Token hard
Token ID1327
Feature activation-4.74e-02

Pos logit contributions
icking+0.043
abella+0.041
tha+0.039
bie+0.039
ili+0.039
Neg logit contributions
 pieces-0.019
 target-0.018
pillar-0.017
 string-0.016
 bites-0.016
Token and
Token ID290
Feature activation+1.23e-02

Pos logit contributions
pillar+0.051
 target+0.047
Sur+0.043
 pieces+0.040
 aback+0.040
Neg logit contributions
icking-0.106
bie-0.106
abella-0.104
tha-0.100
bled-0.099
Token had
Token ID550
Feature activation-2.10e-02

Pos logit contributions
icking+0.028
abella+0.027
tha+0.027
ili+0.026
bie+0.026
Neg logit contributions
 pieces-0.015
 target-0.013
 bites-0.013
 string-0.012
 piece-0.012
Token a
Token ID257
Feature activation-3.32e-02

Pos logit contributions
pillar+0.031
 target+0.030
 pieces+0.028
Sur+0.027
 aback+0.027
Neg logit contributions
abella-0.039
bie-0.037
icking-0.037
tha-0.035
bled-0.035
Token Suddenly
Token ID24975
Feature activation-5.29e-03

Pos logit contributions
 target+0.008
pillar+0.008
 pieces+0.008
 aback+0.007
+0.007
Neg logit contributions
icking-0.026
tha-0.026
bled-0.025
bie-0.024
abella-0.024
Token,
Token ID11
Feature activation+2.47e-02

Pos logit contributions
 pieces+0.003
+0.002
pillar+0.002
Sur+0.002
 string+0.002
Neg logit contributions
ric-0.015
bie-0.015
icking-0.015
tha-0.015
allows-0.014
Token she
Token ID673
Feature activation+3.59e-03

Pos logit contributions
aj+0.070
icking+0.069
ric+0.067
bie+0.067
tha+0.066
Neg logit contributions
 pieces-0.017
-0.014
Sur-0.012
 string-0.012
Start-0.011
Token found
Token ID1043
Feature activation-1.10e-02

Pos logit contributions
icking+0.008
abella+0.008
tha+0.008
ili+0.008
bie+0.008
Neg logit contributions
pillar-0.004
 target-0.004
 pieces-0.004
Sur-0.003
 bites-0.003
Token a
Token ID257
Feature activation-6.45e-02

Pos logit contributions
 pieces+0.012
pillar+0.012
 target+0.010
 aback+0.010
Sur+0.010
Neg logit contributions
icking-0.025
bie-0.025
aj-0.024
tha-0.024
abella-0.023
Token big
Token ID1263
Feature activation-9.41e-02

Pos logit contributions
pillar+0.079
 pieces+0.075
 target+0.074
Sur+0.070
 aback+0.069
Neg logit contributions
abella-0.128
icking-0.127
bie-0.126
tha-0.123
bled-0.118
Token stick
Token ID4859
Feature activation-2.03e-02

Pos logit contributions
pillar+0.117
Sur+0.109
 target+0.109
 pieces+0.107
 aback+0.100
Neg logit contributions
icking-0.185
bie-0.180
tha-0.180
abella-0.178
bled-0.168
Token!
Token ID0
Feature activation-2.56e-02

Pos logit contributions
pillar+0.027
Sur+0.026
 target+0.026
 pieces+0.025
 aback+0.024
Neg logit contributions
icking-0.040
abella-0.036
bled-0.036
tha-0.035
belt-0.035
Token She
Token ID1375
Feature activation-4.15e-03

Pos logit contributions
pillar+0.021
 pieces+0.019
 target+0.018
Sur+0.017
 aback+0.016
Neg logit contributions
icking-0.067
bie-0.064
tha-0.063
bled-0.063
abella-0.063
Token bounded
Token ID49948
Feature activation+1.82e-02

Pos logit contributions
pillar+0.006
 target+0.006
 pieces+0.006
 aback+0.006
Sur+0.006
Neg logit contributions
abella-0.007
bie-0.007
icking-0.007
tha-0.007
bled-0.007
Token over
Token ID625
Feature activation-8.10e-03

Pos logit contributions
abella+0.034
icking+0.034
bie+0.033
tha+0.032
bled+0.032
Neg logit contributions
pillar-0.026
 target-0.025
 pieces-0.025
Sur-0.023
 bites-0.022
Token special
Token ID2041
Feature activation-4.49e-02

Pos logit contributions
pillar+0.036
 target+0.035
 pieces+0.033
Sur+0.032
 aback+0.032
Neg logit contributions
icking-0.043
abella-0.042
bled-0.040
bie-0.040
tha-0.038
Token.
Token ID13
Feature activation-1.82e-02

Pos logit contributions
pillar+0.049
 target+0.048
 pieces+0.044
Sur+0.043
 aback+0.039
Neg logit contributions
icking-0.102
abella-0.098
bie-0.094
bled-0.094
tha-0.094
Token It
Token ID632
Feature activation-2.44e-02

Pos logit contributions
pillar+0.021
 target+0.021
 pieces+0.020
Sur+0.017
 bites+0.017
Neg logit contributions
icking-0.039
abella-0.039
bie-0.038
tha-0.037
bled-0.037
Token was
Token ID373
Feature activation+6.89e-03

Pos logit contributions
pillar+0.028
 target+0.028
 pieces+0.027
Sur+0.024
 bites+0.024
Neg logit contributions
abella-0.052
icking-0.052
bie-0.050
tha-0.049
bled-0.049
Token a
Token ID257
Feature activation-5.94e-02

Pos logit contributions
icking+0.016
abella+0.016
bie+0.016
tha+0.016
ili+0.015
Neg logit contributions
 pieces-0.007
 target-0.006
pillar-0.006
 string-0.005
 piece-0.005
Token big
Token ID1263
Feature activation-9.39e-02

Pos logit contributions
pillar+0.072
 target+0.070
 pieces+0.069
 aback+0.063
Sur+0.062
Neg logit contributions
abella-0.120
icking-0.119
bie-0.118
tha-0.115
bled-0.112
Token temple
Token ID12505
Feature activation+2.10e-03

Pos logit contributions
pillar+0.125
Sur+0.118
 target+0.117
 pieces+0.115
 aback+0.110
Neg logit contributions
icking-0.177
bie-0.172
tha-0.172
abella-0.167
belt-0.159
Token!
Token ID0
Feature activation-2.82e-02

Pos logit contributions
icking+0.005
abella+0.005
bie+0.005
tha+0.005
bled+0.005
Neg logit contributions
pillar-0.002
 target-0.002
 pieces-0.002
Sur-0.002
 aback-0.002
Token The
Token ID383
Feature activation-2.10e-02

Pos logit contributions
pillar+0.041
 target+0.041
 pieces+0.039
 bites+0.036
Sur+0.035
Neg logit contributions
icking-0.052
abella-0.051
bie-0.051
tha-0.049
bled-0.048
Token temple
Token ID12505
Feature activation+1.32e-02

Pos logit contributions
 pieces+0.035
pillar+0.033
 target+0.033
Sur+0.031
 piece+0.030
Neg logit contributions
icking-0.036
abella-0.036
bie-0.034
bled-0.033
tha-0.032
Token was
Token ID373
Feature activation+1.86e-02

Pos logit contributions
abella+0.029
icking+0.029
bie+0.028
tha+0.028
bled+0.027
Neg logit contributions
pillar-0.015
 target-0.014
 pieces-0.014
Sur-0.013
 bites-0.012
Token come
Token ID1282
Feature activation+4.88e-02

Pos logit contributions
pillar+0.006
 pieces+0.006
 target+0.006
Sur+0.005
 aback+0.005
Neg logit contributions
abella-0.007
icking-0.007
bie-0.007
tha-0.007
ili-0.007
Token back
Token ID736
Feature activation+5.79e-04

Pos logit contributions
abella+0.126
icking+0.118
tha+0.117
bled+0.115
ili+0.114
Neg logit contributions
 target-0.041
 pieces-0.036
pillar-0.034
!"-0.032
 bites-0.029
Token again
Token ID757
Feature activation-7.71e-03

Pos logit contributions
abella+0.002
bled+0.002
icking+0.002
ric+0.002
tha+0.002
Neg logit contributions
 pieces-0.000
 target-0.000
pillar-0.000
!"-0.000
 bites-0.000
Token.
Token ID13
Feature activation-1.86e-04

Pos logit contributions
 target+0.006
pillar+0.006
 pieces+0.006
Sur+0.005
 bites+0.004
Neg logit contributions
abella-0.020
icking-0.020
bie-0.019
bled-0.019
tha-0.019
Token<|endoftext|>
Token ID50256
Feature activation+8.96e-04

Pos logit contributions
pillar+0.000
 target+0.000
 pieces+0.000
Sur+0.000
 bites+0.000
Neg logit contributions
abella-0.000
icking-0.000
bie-0.000
tha-0.000
bled-0.000
TokenOnce
Token ID7454
Feature activation+1.00e-01

Pos logit contributions
abella+0.002
bie+0.002
ili+0.002
icking+0.002
bled+0.002
Neg logit contributions
 target-0.001
pillar-0.001
 pieces-0.001
 bites-0.001
Sur-0.001
Token upon
Token ID2402
Feature activation-6.78e-03

Pos logit contributions
abella+0.131
icking+0.125
tha+0.119
bled+0.118
belt+0.117
Neg logit contributions
pillar-0.194
 pieces-0.194
 target-0.182
 piece-0.175
 bites-0.174
Token a
Token ID257
Feature activation+7.24e-03

Pos logit contributions
pillar+0.013
 target+0.012
 aback+0.012
 pieces+0.011
 bites+0.011
Neg logit contributions
abella-0.009
bie-0.009
allows-0.009
ores-0.009
bled-0.009
Token time
Token ID640
Feature activation+5.43e-02

Pos logit contributions
bie+0.010
icking+0.010
abella+0.010
aj+0.009
tha+0.009
Neg logit contributions
 target-0.014
pillar-0.014
 pieces-0.014
 aback-0.013
Sur-0.013
Token,
Token ID11
Feature activation+7.42e-02

Pos logit contributions
icking+0.109
abella+0.108
bie+0.106
tha+0.103
bled+0.101
Neg logit contributions
 target-0.071
pillar-0.071
 pieces-0.068
Sur-0.061
 bites-0.060
Token there
Token ID612
Feature activation+3.14e-02

Pos logit contributions
abella+0.131
icking+0.128
bie+0.123
tha+0.121
ili+0.117
Neg logit contributions
pillar-0.114
 pieces-0.114
 target-0.112
Sur-0.103
 bites-0.101
TokenThat
Token ID2504
Feature activation-2.91e-02

Pos logit contributions
pillar+0.042
 target+0.041
 pieces+0.040
Sur+0.035
 bites+0.032
Neg logit contributions
icking-0.113
abella-0.113
bie-0.110
tha-0.108
bled-0.106
Token was
Token ID373
Feature activation+1.95e-02

Pos logit contributions
pillar+0.032
 target+0.031
 pieces+0.029
 aback+0.028
Sur+0.027
Neg logit contributions
abella-0.064
icking-0.063
bie-0.063
tha-0.060
ili-0.058
Token fun
Token ID1257
Feature activation-2.31e-02

Pos logit contributions
icking+0.045
angers+0.044
bie+0.044
belt+0.043
ili+0.043
Neg logit contributions
 pieces-0.019
 target-0.018
-0.016
!"-0.016
 piece-0.016
Token!"
Token ID2474
Feature activation-1.15e-02

Pos logit contributions
pillar+0.019
 target+0.018
 pieces+0.016
Sur+0.015
 bites+0.013
Neg logit contributions
icking-0.058
abella-0.058
bie-0.057
tha-0.056
bled-0.054
Token<|endoftext|>
Token ID50256
Feature activation-4.56e-03

Pos logit contributions
 pieces+0.012
pillar+0.012
 target+0.012
Sur+0.011
 bites+0.010
Neg logit contributions
abella-0.025
icking-0.025
bled-0.024
bie-0.024
ili-0.024
TokenOnce
Token ID7454
Feature activation+9.77e-02

Pos logit contributions
 target+0.005
pillar+0.005
 pieces+0.005
 bites+0.004
Sur+0.004
Neg logit contributions
abella-0.010
icking-0.010
bie-0.010
tha-0.009
ili-0.009
Token upon
Token ID2402
Feature activation-9.80e-03

Pos logit contributions
abella+0.132
icking+0.125
bled+0.124
tha+0.121
ili+0.118
Neg logit contributions
 pieces-0.181
pillar-0.178
 target-0.165
 piece-0.163
 bites-0.160
Token a
Token ID257
Feature activation+2.64e-03

Pos logit contributions
pillar+0.017
 target+0.016
 pieces+0.015
 aback+0.014
 bites+0.014
Neg logit contributions
abella-0.016
bie-0.016
icking-0.015
allows-0.015
ili-0.014
Token time
Token ID640
Feature activation+4.97e-02

Pos logit contributions
bie+0.004
abella+0.004
icking+0.004
aj+0.003
allows+0.003
Neg logit contributions
 target-0.005
pillar-0.005
 pieces-0.005
 aback-0.005
Sur-0.005
Token,
Token ID11
Feature activation+6.53e-02

Pos logit contributions
abella+0.116
icking+0.113
bie+0.111
tha+0.110
bled+0.109
Neg logit contributions
pillar-0.048
 pieces-0.046
 target-0.042
Sur-0.041
 bites-0.038
Token there
Token ID612
Feature activation+2.78e-02

Pos logit contributions
abella+0.124
icking+0.119
tha+0.114
bie+0.112
ili+0.110
Neg logit contributions
 pieces-0.093
pillar-0.092
 target-0.087
Sur-0.083
 piece-0.081
Token be
Token ID307
Feature activation+5.22e-02

Pos logit contributions
icking+0.036
abella+0.036
bie+0.035
tha+0.034
bled+0.034
Neg logit contributions
pillar-0.027
 target-0.026
 pieces-0.025
Sur-0.023
 bites-0.022
Token just
Token ID655
Feature activation+3.40e-02

Pos logit contributions
abella+0.107
icking+0.104
tha+0.102
belt+0.101
ric+0.100
Neg logit contributions
 pieces-0.068
 target-0.061
pillar-0.057
 piece-0.056
 aback-0.056
Token perfect
Token ID2818
Feature activation-2.50e-02

Pos logit contributions
abella+0.077
ili+0.075
tha+0.074
icking+0.074
bie+0.073
Neg logit contributions
 pieces-0.037
 target-0.034
 bites-0.032
 aback-0.031
-0.030
Token!
Token ID0
Feature activation-1.06e-02

Pos logit contributions
pillar+0.030
 target+0.029
 pieces+0.027
Sur+0.025
 bites+0.024
Neg logit contributions
icking-0.054
abella-0.053
bie-0.052
tha-0.051
bled-0.050
Token<|endoftext|>
Token ID50256
Feature activation-7.74e-03

Pos logit contributions
 target+0.012
 pieces+0.011
pillar+0.011
 bites+0.010
 aback+0.009
Neg logit contributions
abella-0.024
bie-0.024
icking-0.023
ili-0.022
tha-0.022
TokenOnce
Token ID7454
Feature activation+1.01e-01

Pos logit contributions
 target+0.009
pillar+0.008
 pieces+0.008
 bites+0.007
Sur+0.007
Neg logit contributions
abella-0.017
bie-0.016
icking-0.016
ili-0.016
tha-0.016
Token there
Token ID612
Feature activation+3.26e-02

Pos logit contributions
abella+0.129
icking+0.125
bled+0.121
tha+0.119
ili+0.115
Neg logit contributions
 pieces-0.198
pillar-0.196
 target-0.185
 piece-0.177
 bites-0.177
Token was
Token ID373
Feature activation+1.62e-02

Pos logit contributions
abella+0.077
icking+0.077
bie+0.074
tha+0.074
bled+0.074
Neg logit contributions
pillar-0.029
 pieces-0.028
 target-0.027
Sur-0.024
 bites-0.023
Token a
Token ID257
Feature activation-4.53e-02

Pos logit contributions
abella+0.030
bie+0.030
icking+0.030
ili+0.028
bled+0.028
Neg logit contributions
 target-0.023
pillar-0.023
 pieces-0.021
Sur-0.020
 bites-0.020
Token boy
Token ID2933
Feature activation-1.26e-02

Pos logit contributions
pillar+0.049
 target+0.047
 pieces+0.044
Sur+0.040
 piece+0.039
Neg logit contributions
icking-0.105
abella-0.104
bled-0.100
tha-0.098
bie-0.097
Token named
Token ID3706
Feature activation+5.47e-02

Pos logit contributions
pillar+0.026
 target+0.026
 pieces+0.025
 bites+0.024
Sur+0.024
Neg logit contributions
abella-0.016
icking-0.015
bie-0.014
bled-0.014
tha-0.014
Token explore
Token ID7301
Feature activation-5.34e-06

Pos logit contributions
abella+0.056
icking+0.054
bie+0.054
bled+0.052
tha+0.052
Neg logit contributions
 target-0.031
pillar-0.030
 pieces-0.030
 bites-0.026
Sur-0.026
Token the
Token ID262
Feature activation+1.91e-02

Feature not active on prev token; no predictions were affected.
Token jungle
Token ID20712
Feature activation+2.63e-02

Pos logit contributions
icking+0.037
abella+0.037
bie+0.036
tha+0.035
bled+0.034
Neg logit contributions
pillar-0.026
 target-0.026
 pieces-0.025
Sur-0.023
 bites-0.022
Token.
Token ID13
Feature activation-1.66e-03

Pos logit contributions
icking+0.057
abella+0.057
bie+0.055
tha+0.054
bled+0.053
Neg logit contributions
pillar-0.029
 target-0.028
 pieces-0.027
Sur-0.025
 bites-0.024
Token<|endoftext|>
Token ID50256
Feature activation-3.35e-03

Pos logit contributions
pillar+0.002
 target+0.002
 pieces+0.002
Sur+0.002
 bites+0.002
Neg logit contributions
abella-0.004
bie-0.003
icking-0.003
tha-0.003
bled-0.003
TokenOnce
Token ID7454
Feature activation+1.03e-01

Pos logit contributions
pillar+0.004
 target+0.004
 pieces+0.004
Sur+0.003
 bites+0.003
Neg logit contributions
abella-0.007
icking-0.007
bie-0.007
tha-0.007
bled-0.007
Token upon
Token ID2402
Feature activation-8.65e-03

Pos logit contributions
abella+0.139
icking+0.133
bled+0.132
tha+0.126
ili+0.125
Neg logit contributions
 pieces-0.190
pillar-0.190
 target-0.178
 piece-0.171
 aback-0.169
Token a
Token ID257
Feature activation-1.64e-03

Pos logit contributions
pillar+0.014
 target+0.013
 pieces+0.012
 aback+0.012
 bites+0.011
Neg logit contributions
abella-0.015
bie-0.015
icking-0.014
ili-0.014
bled-0.014
Token time
Token ID640
Feature activation+5.15e-02

Pos logit contributions
 target+0.003
pillar+0.003
 pieces+0.003
 aback+0.003
Sur+0.003
Neg logit contributions
bie-0.002
abella-0.002
icking-0.002
tha-0.002
aj-0.002
Token,
Token ID11
Feature activation+6.90e-02

Pos logit contributions
abella+0.119
icking+0.117
bie+0.114
bled+0.113
tha+0.113
Neg logit contributions
pillar-0.050
 pieces-0.048
 target-0.046
Sur-0.042
 bites-0.040
Token there
Token ID612
Feature activation+2.83e-02

Pos logit contributions
abella+0.130
icking+0.125
tha+0.120
bie+0.118
ili+0.117
Neg logit contributions
 pieces-0.099
pillar-0.098
 target-0.093
-0.088
Sur-0.087
Token<|endoftext|>
Token ID50256
Feature activation-8.20e-03

Pos logit contributions
pillar+0.010
 target+0.009
 pieces+0.009
Sur+0.007
 bites+0.007
Neg logit contributions
abella-0.034
icking-0.034
bie-0.033
tha-0.032
bled-0.032
TokenOnce
Token ID7454
Feature activation+9.65e-02

Pos logit contributions
 target+0.011
pillar+0.010
 pieces+0.010
 bites+0.009
 string+0.009
Neg logit contributions
abella-0.016
bie-0.016
icking-0.015
ili-0.015
tha-0.015
Token upon
Token ID2402
Feature activation-8.18e-03

Pos logit contributions
abella+0.124
icking+0.120
bled+0.118
tha+0.113
belt+0.110
Neg logit contributions
pillar-0.187
 pieces-0.187
 target-0.174
 piece-0.168
 bites-0.167
Token a
Token ID257
Feature activation-2.03e-03

Pos logit contributions
pillar+0.013
 target+0.013
 pieces+0.012
 aback+0.012
 bites+0.011
Neg logit contributions
abella-0.014
bie-0.013
icking-0.013
ili-0.012
bled-0.012
Token time
Token ID640
Feature activation+4.56e-02

Pos logit contributions
 target+0.004
 pieces+0.004
pillar+0.004
 aback+0.003
Sur+0.003
Neg logit contributions
bie-0.003
abella-0.003
icking-0.003
tha-0.003
aj-0.003
Token,
Token ID11
Feature activation+6.78e-02

Pos logit contributions
abella+0.102
icking+0.101
bie+0.098
tha+0.097
bled+0.096
Neg logit contributions
pillar-0.048
 pieces-0.046
 target-0.045
Sur-0.041
 bites-0.040
Token there
Token ID612
Feature activation+2.55e-02

Pos logit contributions
abella+0.121
icking+0.118
tha+0.112
bie+0.112
ili+0.108
Neg logit contributions
 pieces-0.103
pillar-0.103
 target-0.099
Sur-0.093
 bites-0.090
Token was
Token ID373
Feature activation+1.62e-02

Pos logit contributions
abella+0.058
icking+0.057
bie+0.056
tha+0.055
bled+0.054
Neg logit contributions
pillar-0.027
 target-0.026
 pieces-0.025
Sur-0.022
 bites-0.022
Token a
Token ID257
Feature activation-4.36e-02

Pos logit contributions
abella+0.031
bie+0.030
icking+0.030
bled+0.028
tha+0.028
Neg logit contributions
pillar-0.023
 target-0.022
 pieces-0.021
 bites-0.019
 aback-0.019
Token small
Token ID1402
Feature activation-5.48e-02

Pos logit contributions
pillar+0.044
 target+0.043
 pieces+0.040
 piece+0.036
Sur+0.036
Neg logit contributions
icking-0.104
abella-0.103
bled-0.100
bie-0.097
tha-0.097
Token boot
Token ID6297
Feature activation-5.17e-03

Pos logit contributions
pillar+0.053
 target+0.052
 pieces+0.051
 bites+0.042
Sur+0.042
Neg logit contributions
abella-0.130
icking-0.128
bie-0.124
bled-0.123
tha-0.123
Token of
Token ID286
Feature activation+2.13e-02

Pos logit contributions
pillar+0.014
 target+0.013
 pieces+0.012
Sur+0.012
 bites+0.011
Neg logit contributions
icking-0.029
bie-0.028
abella-0.027
tha-0.027
bled-0.026
Token o
Token ID267
Feature activation+7.15e-02

Pos logit contributions
abella+0.034
icking+0.034
bie+0.033
tha+0.032
bled+0.031
Neg logit contributions
pillar-0.036
 target-0.036
 pieces-0.034
Sur-0.032
 bites-0.031
Tokenatmeal
Token ID45280
Feature activation+3.11e-02

Pos logit contributions
abella+0.083
tha+0.074
bie+0.072
icking+0.070
bled+0.069
Neg logit contributions
pillar-0.156
 pieces-0.152
 target-0.143
Sur-0.141
 bites-0.140
Token without
Token ID1231
Feature activation+2.03e-02

Pos logit contributions
abella+0.071
icking+0.071
bie+0.069
tha+0.068
bled+0.067
Neg logit contributions
pillar-0.031
 target-0.031
 pieces-0.030
Sur-0.026
 bites-0.025
Token any
Token ID597
Feature activation-1.40e-02

Pos logit contributions
icking+0.039
belt+0.037
abella+0.037
ancer+0.036
ili+0.036
Neg logit contributions
 pieces-0.031
 bites-0.028
 target-0.027
 string-0.026
 piece-0.026
Token marks
Token ID8849
Feature activation+2.38e-02

Pos logit contributions
pillar+0.019
 target+0.019
 pieces+0.018
Sur+0.017
 aback+0.016
Neg logit contributions
icking-0.027
abella-0.027
bie-0.026
tha-0.025
bled-0.025
Token.
Token ID13
Feature activation-1.01e-03

Pos logit contributions
abella+0.054
bie+0.053
icking+0.052
tha+0.052
ili+0.050
Neg logit contributions
 pieces-0.025
pillar-0.024
 target-0.024
 bites-0.020
Sur-0.020
Token He
Token ID679
Feature activation-1.49e-03

Pos logit contributions
pillar+0.001
 target+0.001
 pieces+0.001
Sur+0.001
 aback+0.001
Neg logit contributions
icking-0.002
abella-0.002
tha-0.002
bie-0.002
bled-0.002
Token ate
Token ID15063
Feature activation+1.55e-02

Pos logit contributions
pillar+0.002
 target+0.002
 pieces+0.002
Sur+0.002
 bites+0.002
Neg logit contributions
abella-0.003
icking-0.003
bie-0.003
tha-0.002
bled-0.002
Token it
Token ID340
Feature activation-9.54e-04

Pos logit contributions
icking+0.032
abella+0.032
bie+0.032
tha+0.031
ili+0.030
Neg logit contributions
 pieces-0.020
 target-0.018
pillar-0.018
 bites-0.017
 string-0.016
Token all
Token ID477
Feature activation+5.06e-03

Pos logit contributions
pillar+0.001
 target+0.001
 pieces+0.001
 bites+0.001
Sur+0.001
Neg logit contributions
abella-0.002
icking-0.002
bie-0.002
tha-0.002
ili-0.002
Token she
Token ID673
Feature activation-5.15e-04

Pos logit contributions
icking+0.012
abella+0.012
bie+0.012
aj+0.012
ric+0.012
Neg logit contributions
 pieces-0.006
pillar-0.005
 target-0.005
 string-0.005
 piece-0.005
Token spun
Token ID26843
Feature activation+8.09e-03

Pos logit contributions
pillar+0.001
 target+0.001
 pieces+0.001
Sur+0.001
 bites+0.001
Neg logit contributions
icking-0.001
bie-0.001
abella-0.001
tha-0.001
ric-0.001
Token around
Token ID1088
Feature activation-6.47e-03

Pos logit contributions
abella+0.015
icking+0.015
bie+0.014
bled+0.014
tha+0.014
Neg logit contributions
pillar-0.011
 target-0.011
 pieces-0.010
Sur-0.010
 bites-0.010
Token and
Token ID290
Feature activation+1.87e-02

Pos logit contributions
pillar+0.008
 target+0.008
Sur+0.007
 pieces+0.007
 aback+0.007
Neg logit contributions
abella-0.013
icking-0.012
bie-0.012
tha-0.012
bled-0.012
Token around
Token ID1088
Feature activation+8.34e-03

Pos logit contributions
icking+0.044
tha+0.041
abella+0.041
bie+0.040
ili+0.040
Neg logit contributions
 target-0.021
 pieces-0.020
pillar-0.020
 bites-0.018
 aback-0.017
Token in
Token ID287
Feature activation+1.96e-02

Pos logit contributions
abella+0.017
icking+0.017
bie+0.017
tha+0.016
bled+0.016
Neg logit contributions
pillar-0.010
 target-0.010
 pieces-0.009
Sur-0.009
 aback-0.008
Token a
Token ID257
Feature activation-1.50e-02

Pos logit contributions
bie+0.032
abella+0.032
icking+0.031
tha+0.030
 Ty+0.029
Neg logit contributions
pillar-0.031
 target-0.030
Sur-0.027
 pieces-0.027
 aback-0.026
Token circle
Token ID9197
Feature activation+3.49e-04

Pos logit contributions
 aback+0.021
pillar+0.020
ives+0.020
 target+0.020
',"+0.020
Neg logit contributions
abella-0.025
bie-0.024
 Ty-0.023
ili-0.022
ric-0.021
Token.
Token ID13
Feature activation-8.75e-03

Pos logit contributions
icking+0.001
abella+0.001
bie+0.001
bled+0.001
tha+0.001
Neg logit contributions
pillar-0.000
 target-0.000
 pieces-0.000
Sur-0.000
 aback-0.000
Token The
Token ID383
Feature activation-8.78e-03

Pos logit contributions
pillar+0.010
 target+0.010
 pieces+0.010
Sur+0.009
 bites+0.009
Neg logit contributions
icking-0.019
abella-0.019
bie-0.018
tha-0.018
bled-0.018
Token tw
Token ID665
Feature activation+4.70e-02

Pos logit contributions
 pieces+0.013
 target+0.012
pillar+0.012
 bites+0.011
 piece+0.011
Neg logit contributions
icking-0.018
abella-0.018
tha-0.017
bie-0.017
bled-0.016
Token to
Token ID284
Feature activation-4.61e-03

Pos logit contributions
icking+0.018
abella+0.018
tha+0.018
bled+0.018
bie+0.017
Neg logit contributions
 pieces-0.003
 target-0.003
 bites-0.002
pillar-0.002
 string-0.002
Token play
Token ID711
Feature activation+2.98e-03

Pos logit contributions
 target+0.005
 pieces+0.005
pillar+0.005
 bites+0.004
 string+0.004
Neg logit contributions
abella-0.010
icking-0.010
bie-0.010
bled-0.010
tha-0.010
Token.
Token ID13
Feature activation-4.28e-04

Pos logit contributions
abella+0.006
icking+0.006
bled+0.005
ili+0.005
ores+0.005
Neg logit contributions
pillar-0.004
 target-0.004
Sur-0.003
 bites-0.003
 aback-0.003
Token She
Token ID1375
Feature activation-8.25e-03

Pos logit contributions
pillar+0.000
 target+0.000
 pieces+0.000
Sur+0.000
 bites+0.000
Neg logit contributions
abella-0.001
icking-0.001
bie-0.001
tha-0.001
bled-0.001
Token forgot
Token ID16453
Feature activation-4.15e-02

Pos logit contributions
 pieces+0.011
 target+0.011
pillar+0.011
 bites+0.010
Sur+0.009
Neg logit contributions
icking-0.017
tha-0.016
abella-0.016
bie-0.016
bled-0.015
Token about
Token ID546
Feature activation+1.45e-02

Pos logit contributions
 pieces+0.059
 target+0.057
pillar+0.056
 bites+0.053
 string+0.050
Neg logit contributions
icking-0.076
tha-0.076
bie-0.075
abella-0.075
bled-0.073
Token her
Token ID607
Feature activation+2.86e-02

Pos logit contributions
icking+0.034
bled+0.031
bie+0.031
ancer+0.030
belt+0.030
Neg logit contributions
 pieces-0.018
 target-0.015
 bites-0.015
 bite-0.014
 string-0.014
Token sadness
Token ID25303
Feature activation-1.00e-02

Pos logit contributions
abella+0.059
icking+0.057
bie+0.056
bled+0.055
tha+0.054
Neg logit contributions
 pieces-0.038
 target-0.037
pillar-0.036
 bites-0.033
 string-0.031
Token and
Token ID290
Feature activation+2.30e-02

Pos logit contributions
 pieces+0.006
pillar+0.006
 target+0.006
 bites+0.004
Sur+0.004
Neg logit contributions
abella-0.027
bie-0.026
icking-0.026
tha-0.026
bled-0.026
Token smiled
Token ID13541
Feature activation-1.78e-02

Pos logit contributions
abella+0.045
icking+0.045
bie+0.043
ili+0.043
tha+0.043
Neg logit contributions
 pieces-0.033
 target-0.032
 bites-0.030
pillar-0.029
 string-0.028
Token.
Token ID13
Feature activation-1.45e-03

Pos logit contributions
 target+0.013
pillar+0.013
 pieces+0.012
Sur+0.010
 bites+0.009
Neg logit contributions
abella-0.046
icking-0.045
bie-0.045
tha-0.044
ili-0.043
Token streets
Token ID6483
Feature activation+1.94e-02

Pos logit contributions
 pieces+0.007
pillar+0.007
 target+0.007
Sur+0.006
 bites+0.006
Neg logit contributions
abella-0.019
icking-0.019
bie-0.018
tha-0.018
ili-0.018
Token and
Token ID290
Feature activation+8.11e-04

Pos logit contributions
abella+0.047
icking+0.047
bie+0.046
tha+0.045
bled+0.044
Neg logit contributions
pillar-0.017
 target-0.017
 pieces-0.017
Sur-0.014
 bites-0.013
Token saw
Token ID2497
Feature activation-8.66e-03

Pos logit contributions
icking+0.002
abella+0.002
ili+0.002
tha+0.002
bie+0.002
Neg logit contributions
 pieces-0.001
 target-0.001
pillar-0.001
 bites-0.001
Sur-0.001
Token a
Token ID257
Feature activation-4.82e-02

Pos logit contributions
 pieces+0.011
pillar+0.011
 target+0.011
Sur+0.010
 bites+0.009
Neg logit contributions
icking-0.018
abella-0.017
bie-0.017
tha-0.017
bled-0.016
Token park
Token ID3952
Feature activation+9.93e-03

Pos logit contributions
pillar+0.059
 target+0.057
 pieces+0.056
 aback+0.051
 bites+0.051
Neg logit contributions
abella-0.098
icking-0.096
bie-0.096
tha-0.093
ili-0.089
Token,
Token ID11
Feature activation+2.18e-02

Pos logit contributions
abella+0.024
icking+0.023
bie+0.023
tha+0.023
bled+0.022
Neg logit contributions
 target-0.009
pillar-0.009
 pieces-0.009
Sur-0.008
 bites-0.007
Token a
Token ID257
Feature activation-4.42e-02

Pos logit contributions
icking+0.049
abella+0.048
bie+0.047
tha+0.047
bled+0.046
Neg logit contributions
 target-0.025
 pieces-0.025
pillar-0.023
Sur-0.021
 string-0.020
Token store
Token ID3650
Feature activation-2.34e-04

Pos logit contributions
pillar+0.052
 target+0.052
 pieces+0.050
Sur+0.045
 bites+0.043
Neg logit contributions
icking-0.095
abella-0.094
bie-0.091
tha-0.090
bled-0.089
Token,
Token ID11
Feature activation+2.65e-02

Pos logit contributions
pillar+0.000
 target+0.000
 pieces+0.000
 bites+0.000
Sur+0.000
Neg logit contributions
icking-0.000
abella-0.000
bie-0.000
tha-0.000
bled-0.000
Token and
Token ID290
Feature activation+2.92e-04

Pos logit contributions
icking+0.064
abella+0.064
bie+0.063
tha+0.062
bled+0.062
Neg logit contributions
 pieces-0.026
 target-0.025
pillar-0.022
Sur-0.020
 string-0.020
Token a
Token ID257
Feature activation-3.73e-02

Pos logit contributions
icking+0.001
bie+0.001
tha+0.001
abella+0.001
ric+0.001
Neg logit contributions
 pieces-0.000
 target-0.000
pillar-0.000
Sur-0.000
 string-0.000
Token it
Token ID340
Feature activation-2.59e-02

Pos logit contributions
 pieces+0.002
 target+0.002
pillar+0.002
 bites+0.001
Sur+0.001
Neg logit contributions
abella-0.002
icking-0.002
bie-0.002
tha-0.002
bled-0.002
Token's
Token ID338
Feature activation+2.33e-02

Pos logit contributions
 target+0.025
pillar+0.022
Sur+0.020
 pieces+0.019
 slices+0.018
Neg logit contributions
icking-0.063
abella-0.059
bie-0.058
bled-0.057
tha-0.056
Token not
Token ID407
Feature activation-9.20e-03

Pos logit contributions
abella+0.044
bie+0.044
tha+0.042
icking+0.042
angers+0.042
Neg logit contributions
 target-0.033
 pieces-0.033
 string-0.029
-0.029
pillar-0.029
Token always
Token ID1464
Feature activation+2.18e-02

Pos logit contributions
pillar+0.015
 target+0.014
 pieces+0.014
Sur+0.013
 bites+0.013
Neg logit contributions
icking-0.016
abella-0.016
bie-0.015
tha-0.015
bled-0.014
Token safe
Token ID3338
Feature activation-9.05e-03

Pos logit contributions
icking+0.039
abella+0.037
bie+0.037
tha+0.035
bled+0.034
Neg logit contributions
pillar-0.035
 target-0.032
 pieces-0.031
Sur-0.030
 bites-0.028
Token to
Token ID284
Feature activation+2.43e-02

Pos logit contributions
pillar+0.009
 target+0.008
 pieces+0.008
Sur+0.007
 bites+0.006
Neg logit contributions
icking-0.022
abella-0.021
bie-0.021
tha-0.021
bled-0.020
Token zoom
Token ID19792
Feature activation+4.38e-02

Pos logit contributions
icking+0.047
abella+0.045
bie+0.045
tha+0.044
ric+0.042
Neg logit contributions
pillar-0.034
 pieces-0.032
 target-0.032
Sur-0.030
 aback-0.028
Token around
Token ID1088
Feature activation+1.81e-02

Pos logit contributions
abella+0.094
icking+0.093
bie+0.092
tha+0.090
bled+0.088
Neg logit contributions
pillar-0.051
 target-0.050
 pieces-0.049
Sur-0.043
 bites-0.042
Token without
Token ID1231
Feature activation+1.90e-02

Pos logit contributions
icking+0.036
abella+0.036
bie+0.035
tha+0.034
bled+0.033
Neg logit contributions
pillar-0.024
 target-0.023
 pieces-0.023
Sur-0.021
 bites-0.020
Token being
Token ID852
Feature activation+4.96e-02

Pos logit contributions
abella+0.034
icking+0.033
bie+0.032
tha+0.031
belt+0.031
Neg logit contributions
 pieces-0.030
 target-0.030
pillar-0.027
 bites-0.027
 string-0.027
Token careful
Token ID8161
Feature activation-1.26e-02

Pos logit contributions
abella+0.094
icking+0.092
bie+0.090
tha+0.088
bled+0.086
Neg logit contributions
 target-0.072
pillar-0.072
 pieces-0.071
Sur-0.063
 bites-0.063
Token for
Token ID329
Feature activation-1.09e-02

Pos logit contributions
 pieces+0.014
pillar+0.013
 target+0.013
 bites+0.012
 string+0.012
Neg logit contributions
bie-0.022
tha-0.022
icking-0.021
abella-0.021
bled-0.021
Token a
Token ID257
Feature activation-3.17e-02

Pos logit contributions
pillar+0.015
 target+0.015
 pieces+0.014
Sur+0.013
 bites+0.013
Neg logit contributions
abella-0.021
icking-0.021
bie-0.021
tha-0.020
bled-0.020
Token while
Token ID981
Feature activation-9.94e-03

Pos logit contributions
pillar+0.056
 pieces+0.054
 target+0.054
 bites+0.050
 aback+0.050
Neg logit contributions
bie-0.048
abella-0.045
icking-0.042
tha-0.042
ili-0.040
Token and
Token ID290
Feature activation+1.57e-02

Pos logit contributions
pillar+0.008
 target+0.008
 pieces+0.008
Sur+0.006
 bites+0.006
Neg logit contributions
abella-0.025
icking-0.025
bie-0.024
tha-0.024
bled-0.023
Token realized
Token ID6939
Feature activation-2.08e-02

Pos logit contributions
icking+0.033
bie+0.033
abella+0.032
ric+0.032
ili+0.031
Neg logit contributions
 target-0.019
 pieces-0.019
pillar-0.019
Sur-0.017
 bites-0.017
Token that
Token ID326
Feature activation-5.05e-03

Pos logit contributions
 pieces+0.020
 target+0.017
 string+0.017
pillar+0.016
 piece+0.015
Neg logit contributions
bie-0.050
icking-0.048
ric-0.046
tha-0.046
ancer-0.046
Token he
Token ID339
Feature activation-1.24e-02

Pos logit contributions
 pieces+0.007
 target+0.007
pillar+0.006
 string+0.006
 piece+0.006
Neg logit contributions
icking-0.010
abella-0.010
bie-0.010
tha-0.010
bled-0.010
Token would
Token ID561
Feature activation+6.01e-03

Pos logit contributions
pillar+0.016
 target+0.015
 pieces+0.015
 bites+0.013
Sur+0.013
Neg logit contributions
icking-0.025
abella-0.025
bie-0.024
tha-0.024
bled-0.024
Token feel
Token ID1254
Feature activation-1.21e-02

Pos logit contributions
icking+0.011
abella+0.011
bie+0.010
tha+0.010
bled+0.010
Neg logit contributions
pillar-0.009
 pieces-0.009
 target-0.009
Sur-0.008
 bites-0.008
Token bad
Token ID2089
Feature activation-4.79e-02

Pos logit contributions
pillar+0.016
 target+0.015
 pieces+0.015
Sur+0.013
 bites+0.013
Neg logit contributions
icking-0.024
abella-0.024
bie-0.023
bled-0.023
tha-0.023
Token if
Token ID611
Feature activation-8.19e-03

Pos logit contributions
pillar+0.050
 target+0.048
 pieces+0.043
Sur+0.041
 aback+0.038
Neg logit contributions
icking-0.110
abella-0.107
bie-0.104
bled-0.102
tha-0.100
Token mom
Token ID1995
Feature activation-2.70e-02

Pos logit contributions
abella+0.078
icking+0.075
bled+0.075
tha+0.072
ili+0.071
Neg logit contributions
 pieces-0.085
 target-0.077
 bites-0.074
 string-0.071
 piece-0.071
Token and
Token ID290
Feature activation+2.02e-02

Pos logit contributions
 pieces+0.018
pillar+0.018
 target+0.016
 bites+0.014
 aback+0.013
Neg logit contributions
abella-0.070
bled-0.070
bie-0.069
tha-0.068
icking-0.067
Token hid
Token ID24519
Feature activation+3.74e-02

Pos logit contributions
abella+0.036
ric+0.036
icking+0.036
bled+0.035
ili+0.034
Neg logit contributions
 pieces-0.030
 target-0.028
 piece-0.027
 string-0.027
 bites-0.027
Token under
Token ID739
Feature activation-1.54e-02

Pos logit contributions
abella+0.084
icking+0.079
bie+0.079
ancer+0.078
ric+0.078
Neg logit contributions
 pieces-0.047
 target-0.036
 string-0.034
 bites-0.034
 and-0.032
Token the
Token ID262
Feature activation+2.50e-02

Pos logit contributions
 pieces+0.020
 target+0.018
pillar+0.017
 bites+0.017
 string+0.017
Neg logit contributions
icking-0.033
abella-0.032
bie-0.031
tha-0.031
bled-0.030
Token table
Token ID3084
Feature activation+3.86e-03

Pos logit contributions
abella+0.035
icking+0.035
bie+0.034
tha+0.034
ili+0.033
Neg logit contributions
pillar-0.046
 target-0.046
 aback-0.042
Sur-0.042
 pieces-0.040
Token.
Token ID13
Feature activation-2.15e-03

Pos logit contributions
icking+0.009
abella+0.009
bie+0.009
tha+0.009
bled+0.009
Neg logit contributions
pillar-0.003
 target-0.003
 pieces-0.003
Sur-0.003
 bites-0.003
Token Their
Token ID5334
Feature activation+4.45e-02

Pos logit contributions
 pieces+0.002
 target+0.002
pillar+0.002
Sur+0.002
 bites+0.002
Neg logit contributions
icking-0.005
bie-0.004
abella-0.004
tha-0.004
ili-0.004
Token mom
Token ID1995
Feature activation-2.14e-02

Pos logit contributions
abella+0.073
ancer+0.073
icking+0.071
ili+0.069
tha+0.067
Neg logit contributions
 pieces-0.082
 bites-0.073
 piece-0.070
 string-0.069
 target-0.068
Token was
Token ID373
Feature activation+2.25e-02

Pos logit contributions
pillar+0.026
 pieces+0.025
 target+0.025
Sur+0.022
 bites+0.022
Neg logit contributions
abella-0.044
icking-0.044
bie-0.044
tha-0.043
bled-0.042
Token angry
Token ID7954
Feature activation-6.06e-02

Pos logit contributions
bie+0.042
icking+0.042
abella+0.041
tha+0.040
ancer+0.038
Neg logit contributions
 pieces-0.033
 target-0.031
pillar-0.029
 bites-0.029
Sur-0.029
Token play
Token ID711
Feature activation+2.04e-03

Pos logit contributions
pillar+0.001
 target+0.001
 pieces+0.001
Sur+0.001
 bites+0.001
Neg logit contributions
abella-0.001
icking-0.001
bie-0.001
tha-0.001
bled-0.001
Token her
Token ID607
Feature activation+2.59e-02

Pos logit contributions
bled+0.003
abella+0.003
icking+0.003
ili+0.003
ores+0.003
Neg logit contributions
pillar-0.004
 aback-0.003
 target-0.003
Sur-0.003
 pieces-0.003
Token trumpet
Token ID40334
Feature activation+1.25e-02

Pos logit contributions
icking+0.042
bie+0.040
abella+0.039
tha+0.039
allows+0.039
Neg logit contributions
 aback-0.042
pillar-0.041
Sur-0.040
 target-0.038
Start-0.037
Token.
Token ID13
Feature activation-9.31e-03

Pos logit contributions
icking+0.022
abella+0.022
bie+0.021
bled+0.020
belt+0.020
Neg logit contributions
pillar-0.019
Sur-0.018
 target-0.018
 aback-0.017
 pieces-0.017
Token She
Token ID1375
Feature activation-1.36e-03

Pos logit contributions
pillar+0.017
 pieces+0.017
 target+0.016
Sur+0.016
 bites+0.016
Neg logit contributions
 Kim-0.012
 K-0.012
abella-0.011
 Liz-0.011
 Hil-0.011
Token would
Token ID561
Feature activation+3.37e-03

Pos logit contributions
pillar+0.002
 pieces+0.002
 target+0.002
 bites+0.002
Sur+0.002
Neg logit contributions
icking-0.003
abella-0.003
tha-0.002
bie-0.002
bled-0.002
Token play
Token ID711
Feature activation-3.24e-05

Pos logit contributions
abella+0.006
icking+0.006
bie+0.006
tha+0.006
bled+0.006
Neg logit contributions
pillar-0.005
 target-0.005
 pieces-0.005
Sur-0.004
 bites-0.004
Token it
Token ID340
Feature activation-1.24e-02

Pos logit contributions
pillar+0.000
 target+0.000
 pieces+0.000
 aback+0.000
Sur+0.000
Neg logit contributions
abella-0.000
bled-0.000
ili-0.000
 Ty-0.000
ores-0.000
Token every
Token ID790
Feature activation+5.44e-03

Pos logit contributions
pillar+0.023
 target+0.023
Sur+0.022
 pieces+0.022
 aback+0.021
Neg logit contributions
icking-0.018
abella-0.017
bie-0.016
bled-0.016
tha-0.015
Token day
Token ID1110
Feature activation+7.41e-03

Pos logit contributions
abella+0.006
bie+0.006
icking+0.006
tha+0.006
ili+0.006
Neg logit contributions
pillar-0.011
 target-0.011
 pieces-0.011
Sur-0.011
 aback-0.010
Token and
Token ID290
Feature activation+3.12e-03

Pos logit contributions
abella+0.016
icking+0.016
bie+0.015
tha+0.015
ili+0.015
Neg logit contributions
 target-0.008
pillar-0.008
 pieces-0.008
Sur-0.007
 aback-0.007
Token When
Token ID1649
Feature activation-2.44e-03

Pos logit contributions
pillar+0.030
 target+0.029
 pieces+0.028
 bites+0.026
Sur+0.026
Neg logit contributions
abella-0.028
icking-0.028
bie-0.027
tha-0.026
bled-0.025
Token Sarah
Token ID10490
Feature activation-5.33e-04

Pos logit contributions
 pieces+0.003
 puzzle+0.003
 string+0.003
 surprise+0.003
 piece+0.003
Neg logit contributions
ric-0.005
mo-0.005
icking-0.005
bie-0.005
bled-0.005
Token got
Token ID1392
Feature activation+3.69e-02

Pos logit contributions
pillar+0.001
 target+0.001
 pieces+0.001
Sur+0.001
 bites+0.001
Neg logit contributions
abella-0.001
icking-0.001
bie-0.001
tha-0.001
bled-0.001
Token the
Token ID262
Feature activation-9.28e-03

Pos logit contributions
abella+0.075
icking+0.072
ric+0.072
tha+0.071
ili+0.070
Neg logit contributions
 pieces-0.050
 aback-0.042
 target-0.042
pillar-0.041
 bites-0.041
Token past
Token ID1613
Feature activation+2.20e-02

Pos logit contributions
pillar+0.013
 target+0.013
 pieces+0.012
Sur+0.011
 bites+0.011
Neg logit contributions
icking-0.018
abella-0.018
bie-0.017
tha-0.017
ili-0.017
Tokenels
Token ID1424
Feature activation+1.03e-02

Pos logit contributions
abella+0.032
icking+0.031
bie+0.031
tha+0.029
bled+0.029
Neg logit contributions
 pieces-0.041
pillar-0.041
 target-0.040
Sur-0.037
 bites-0.036
Token home
Token ID1363
Feature activation+2.46e-02

Pos logit contributions
icking+0.021
abella+0.021
bie+0.021
tha+0.020
bled+0.020
Neg logit contributions
pillar-0.013
 target-0.013
 pieces-0.012
Sur-0.011
 bites-0.011
Token,
Token ID11
Feature activation+1.92e-02

Pos logit contributions
abella+0.066
ric+0.065
bie+0.063
ili+0.062
ogo+0.061
Neg logit contributions
 pieces-0.019
pillar-0.012
 piece-0.012
 aback-0.011
!"-0.011
Token she
Token ID673
Feature activation-3.93e-03

Pos logit contributions
ric+0.046
bie+0.045
abella+0.044
ogo+0.044
aj+0.044
Neg logit contributions
 pieces-0.021
 target-0.017
 string-0.017
 piece-0.016
 magazine-0.016
Token was
Token ID373
Feature activation+2.89e-02

Pos logit contributions
pillar+0.005
 target+0.005
 pieces+0.005
Sur+0.005
 bites+0.005
Neg logit contributions
abella-0.008
icking-0.008
bie-0.007
tha-0.007
bled-0.007
Token so
Token ID523
Feature activation+4.41e-03

Pos logit contributions
abella+0.065
icking+0.064
tha+0.064
ancer+0.064
bie+0.063
Neg logit contributions
 pieces-0.029
 target-0.025
 piece-0.024
 aback-0.024
-0.023
Token mouse
Token ID10211
Feature activation-2.92e-02

Pos logit contributions
pillar+0.064
ives+0.064
Sur+0.062
hest+0.060
 target+0.060
Neg logit contributions
bie-0.118
icking-0.115
tha-0.112
abella-0.105
ric-0.105
Token!
Token ID0
Feature activation-3.05e-02

Pos logit contributions
pillar+0.035
 target+0.034
 pieces+0.034
Sur+0.031
 aback+0.030
Neg logit contributions
icking-0.060
abella-0.058
bie-0.058
tha-0.056
bled-0.055
Token It
Token ID632
Feature activation-2.09e-02

Pos logit contributions
pillar+0.041
 target+0.040
 pieces+0.039
Sur+0.035
 bites+0.035
Neg logit contributions
abella-0.060
icking-0.059
bie-0.058
tha-0.057
bled-0.055
Token was
Token ID373
Feature activation+1.90e-02

Pos logit contributions
 target+0.027
pillar+0.026
 pieces+0.026
Sur+0.023
 bites+0.023
Neg logit contributions
abella-0.041
icking-0.040
bie-0.040
tha-0.039
bled-0.038
Token a
Token ID257
Feature activation-4.17e-02

Pos logit contributions
icking+0.039
abella+0.039
bie+0.038
tha+0.037
bled+0.036
Neg logit contributions
 target-0.024
 pieces-0.023
pillar-0.023
Sur-0.020
 bites-0.020
Token very
Token ID845
Feature activation+7.40e-03

Pos logit contributions
pillar+0.051
 pieces+0.049
 target+0.048
 aback+0.047
Sur+0.045
Neg logit contributions
bie-0.081
abella-0.078
tha-0.078
icking-0.078
ili-0.072
Token cute
Token ID13779
Feature activation-5.44e-02

Pos logit contributions
icking+0.016
abella+0.016
bie+0.015
bled+0.015
tha+0.015
Neg logit contributions
 target-0.009
pillar-0.009
 pieces-0.008
 bites-0.008
 piece-0.007
Token mouse
Token ID10211
Feature activation-3.38e-02

Pos logit contributions
Sur+0.059
 pieces+0.059
pillar+0.056
 target+0.056
 aback+0.054
Neg logit contributions
icking-0.110
bie-0.109
tha-0.104
abella-0.100
aby-0.098
Token!
Token ID0
Feature activation-2.71e-02

Pos logit contributions
pillar+0.030
 target+0.030
 pieces+0.030
Sur+0.028
 aback+0.026
Neg logit contributions
icking-0.078
abella-0.077
bie-0.076
tha-0.075
ili-0.072
Token The
Token ID383
Feature activation-3.69e-03

Pos logit contributions
pillar+0.036
 target+0.036
 pieces+0.035
Sur+0.031
 bites+0.031
Neg logit contributions
abella-0.053
icking-0.052
bie-0.051
tha-0.050
bled-0.049
Token two
Token ID734
Feature activation-1.38e-02

Pos logit contributions
pillar+0.003
 target+0.003
Sur+0.002
 aback+0.002
 pieces+0.002
Neg logit contributions
icking-0.009
bie-0.009
abella-0.009
tha-0.009
bled-0.009
Token Ben
Token ID3932
Feature activation-2.15e-02

Pos logit contributions
icking+0.017
abella+0.016
bie+0.016
bled+0.016
tha+0.015
Neg logit contributions
 target-0.019
pillar-0.018
 pieces-0.018
 bites-0.017
Sur-0.016
Token.
Token ID13
Feature activation-1.43e-02

Pos logit contributions
pillar+0.027
Mine+0.026
 target+0.026
Sur+0.026
 pieces+0.023
Neg logit contributions
bie-0.043
icking-0.043
tha-0.041
abella-0.040
aj-0.039
Token You
Token ID921
Feature activation-1.34e-02

Pos logit contributions
pillar+0.013
 target+0.012
 pieces+0.011
Sur+0.010
 aback+0.009
Neg logit contributions
abella-0.036
icking-0.035
bie-0.034
bled-0.034
tha-0.033
Token are
Token ID389
Feature activation+2.98e-02

Pos logit contributions
pillar+0.021
 target+0.020
 pieces+0.019
Sur+0.018
 bites+0.017
Neg logit contributions
icking-0.024
abella-0.023
bie-0.022
bled-0.021
ili-0.021
Token very
Token ID845
Feature activation+2.53e-02

Pos logit contributions
icking+0.062
abella+0.062
bie+0.061
tha+0.059
bled+0.058
Neg logit contributions
 target-0.037
pillar-0.036
 pieces-0.036
Sur-0.032
 bites-0.031
Token nice
Token ID3621
Feature activation-4.44e-02

Pos logit contributions
abella+0.055
icking+0.055
bie+0.053
tha+0.052
bled+0.051
Neg logit contributions
pillar-0.029
 pieces-0.028
 target-0.028
Sur-0.025
 bites-0.025
Token.
Token ID13
Feature activation-1.42e-02

Pos logit contributions
pillar+0.073
 target+0.071
Mine+0.062
hest+0.062
Sur+0.062
Neg logit contributions
bie-0.074
aj-0.072
icking-0.069
allows-0.065
 Jen-0.064
Token I
Token ID314
Feature activation-1.62e-02

Pos logit contributions
pillar+0.014
 target+0.013
 pieces+0.012
Sur+0.012
 bites+0.011
Neg logit contributions
icking-0.033
abella-0.033
bie-0.032
bled-0.032
tha-0.032
Token will
Token ID481
Feature activation-1.25e-03

Pos logit contributions
 target+0.022
pillar+0.022
 pieces+0.022
Sur+0.020
 bites+0.018
Neg logit contributions
abella-0.031
icking-0.030
bie-0.030
ili-0.028
tha-0.028
Token be
Token ID307
Feature activation+4.10e-02

Pos logit contributions
pillar+0.002
 pieces+0.002
 target+0.002
Sur+0.002
 bites+0.001
Neg logit contributions
icking-0.002
abella-0.002
bie-0.002
tha-0.002
bled-0.002
Token very
Token ID845
Feature activation+3.35e-02

Pos logit contributions
abella+0.082
icking+0.081
bie+0.079
tha+0.078
ili+0.075
Neg logit contributions
 target-0.054
 pieces-0.054
pillar-0.053
Sur-0.048
 bites-0.047
Token the
Token ID262
Feature activation+2.14e-02

Pos logit contributions
icking+0.030
abella+0.029
bie+0.028
tha+0.028
bled+0.027
Neg logit contributions
pillar-0.024
 target-0.024
 pieces-0.024
Sur-0.021
 bites-0.021
Token park
Token ID3952
Feature activation+2.65e-02

Pos logit contributions
icking+0.048
abella+0.048
bie+0.047
bled+0.046
tha+0.046
Neg logit contributions
pillar-0.023
 target-0.023
 pieces-0.022
Sur-0.020
 bites-0.019
Token.
Token ID13
Feature activation-1.38e-02

Pos logit contributions
abella+0.071
icking+0.070
bie+0.069
tha+0.068
bled+0.067
Neg logit contributions
pillar-0.017
 target-0.017
 pieces-0.016
Sur-0.012
 bites-0.012
Token It
Token ID632
Feature activation-2.10e-02

Pos logit contributions
pillar+0.010
 target+0.009
 pieces+0.009
Sur+0.008
 bites+0.008
Neg logit contributions
icking-0.036
abella-0.035
bie-0.035
tha-0.034
bled-0.034
Token is
Token ID318
Feature activation+1.74e-02

Pos logit contributions
 target+0.023
pillar+0.023
 pieces+0.021
Sur+0.020
 aback+0.020
Neg logit contributions
abella-0.046
bie-0.044
icking-0.044
tha-0.043
ric-0.042
Token big
Token ID1263
Feature activation-9.11e-02

Pos logit contributions
icking+0.044
bie+0.043
abella+0.043
tha+0.043
ancer+0.042
Neg logit contributions
 pieces-0.015
 target-0.015
 string-0.013
 piece-0.012
pillar-0.012
Token and
Token ID290
Feature activation+1.38e-02

Pos logit contributions
pillar+0.098
Sur+0.094
 target+0.093
Start+0.085
 aback+0.085
Neg logit contributions
icking-0.191
abella-0.182
bie-0.181
tha-0.176
aj-0.174
Token round
Token ID2835
Feature activation-6.06e-02

Pos logit contributions
icking+0.033
abella+0.032
bie+0.031
tha+0.031
ili+0.030
Neg logit contributions
 pieces-0.015
 target-0.014
pillar-0.014
 string-0.012
 bites-0.012
Token and
Token ID290
Feature activation+1.68e-02

Pos logit contributions
 target+0.057
pillar+0.056
 pieces+0.048
Sur+0.047
 aback+0.046
Neg logit contributions
icking-0.141
abella-0.138
bie-0.136
bled-0.133
tha-0.132
Token green
Token ID4077
Feature activation-5.09e-02

Pos logit contributions
icking+0.040
abella+0.038
bie+0.037
tha+0.037
ili+0.036
Neg logit contributions
 pieces-0.019
 target-0.017
pillar-0.017
 string-0.016
 bites-0.016
Token.
Token ID13
Feature activation-1.06e-02

Pos logit contributions
pillar+0.059
 target+0.058
Sur+0.052
 aback+0.052
 pieces+0.049
Neg logit contributions
icking-0.105
abella-0.100
bled-0.099
bie-0.098
tha-0.097
Token floor
Token ID4314
Feature activation+1.01e-02

Pos logit contributions
abella+0.030
belt+0.029
tha+0.029
icking+0.029
bie+0.028
Neg logit contributions
pillar-0.034
 aback-0.031
 target-0.030
Sur-0.030
lier-0.028
Token.
Token ID13
Feature activation+8.88e-03

Pos logit contributions
icking+0.025
abella+0.024
bie+0.024
tha+0.023
bled+0.023
Neg logit contributions
pillar-0.009
 target-0.009
 pieces-0.008
Sur-0.007
 bites-0.007
Token It
Token ID632
Feature activation-1.42e-02

Pos logit contributions
abella+0.020
icking+0.019
bie+0.019
tha+0.019
bled+0.018
Neg logit contributions
 target-0.010
pillar-0.010
 pieces-0.009
Sur-0.008
 bites-0.008
Token made
Token ID925
Feature activation-2.96e-03

Pos logit contributions
pillar+0.020
 target+0.019
 pieces+0.018
Sur+0.017
 aback+0.017
Neg logit contributions
abella-0.027
icking-0.027
bie-0.026
tha-0.026
bled-0.026
Token a
Token ID257
Feature activation-1.96e-02

Pos logit contributions
 pieces+0.003
 target+0.003
pillar+0.003
Sur+0.003
 bites+0.003
Neg logit contributions
icking-0.006
abella-0.006
bie-0.006
tha-0.006
bled-0.006
Token loud
Token ID7812
Feature activation-5.44e-02

Pos logit contributions
pillar+0.036
 aback+0.035
ives+0.033
 target+0.032
 pieces+0.032
Neg logit contributions
ric-0.026
bie-0.025
abella-0.025
icking-0.024
 Ty-0.024
Token noise
Token ID7838
Feature activation-5.98e-03

Pos logit contributions
pillar+0.073
 target+0.068
 pieces+0.063
 aback+0.061
powers+0.059
Neg logit contributions
icking-0.096
ric-0.096
abella-0.095
allows-0.092
bie-0.092
Token.
Token ID13
Feature activation+5.68e-03

Pos logit contributions
pillar+0.007
 target+0.006
 pieces+0.006
Sur+0.006
 aback+0.006
Neg logit contributions
icking-0.012
abella-0.012
ores-0.012
bie-0.011
bled-0.011
Token\n
Token ID198
Feature activation-2.42e-03

Pos logit contributions
icking+0.014
abella+0.014
bie+0.014
tha+0.014
bled+0.013
Neg logit contributions
pillar-0.005
 target-0.005
 pieces-0.004
Sur-0.004
 bites-0.004
Token\n
Token ID198
Feature activation-5.88e-03

Pos logit contributions
 target+0.003
pillar+0.002
 pieces+0.002
 bites+0.002
 aback+0.002
Neg logit contributions
abella-0.006
bie-0.005
icking-0.005
tha-0.005
ili-0.005
TokenSuddenly
Token ID38582
Feature activation-2.10e-03

Pos logit contributions
 target+0.006
pillar+0.006
 pieces+0.006
 bites+0.005
Sur+0.005
Neg logit contributions
abella-0.014
icking-0.013
bie-0.013
tha-0.013
bled-0.013
Token proud
Token ID6613
Feature activation-5.92e-02

Pos logit contributions
abella+0.020
icking+0.020
tha+0.019
ric+0.019
aby+0.019
Neg logit contributions
 pieces-0.011
 bites-0.011
Sur-0.011
 target-0.010
pillar-0.010
Token.
Token ID13
Feature activation-9.61e-04

Pos logit contributions
 target+0.070
pillar+0.069
 pieces+0.061
 aback+0.058
Sur+0.056
Neg logit contributions
abella-0.126
icking-0.125
bie-0.122
tha-0.117
ores-0.115
Token He
Token ID679
Feature activation-1.64e-03

Pos logit contributions
pillar+0.001
 target+0.001
 pieces+0.001
Sur+0.001
 bites+0.001
Neg logit contributions
icking-0.002
abella-0.002
bie-0.002
tha-0.002
bled-0.002
Token shouted
Token ID17293
Feature activation-4.32e-03

Pos logit contributions
pillar+0.002
 target+0.002
 pieces+0.002
Sur+0.002
 bites+0.002
Neg logit contributions
icking-0.003
abella-0.003
bie-0.003
tha-0.003
bled-0.003
Token,
Token ID11
Feature activation+2.23e-02

Pos logit contributions
 target+0.005
pillar+0.005
 pieces+0.005
Sur+0.004
 bites+0.004
Neg logit contributions
icking-0.009
abella-0.009
bie-0.009
tha-0.009
ric-0.009
Token "
Token ID366
Feature activation-4.46e-02

Pos logit contributions
icking+0.054
abella+0.053
bie+0.051
ili+0.050
tha+0.050
Neg logit contributions
 target-0.020
pillar-0.019
Sur-0.018
 pieces-0.018
-0.017
TokenLook
Token ID8567
Feature activation+8.60e-03

Pos logit contributions
pillar+0.046
 target+0.046
 pieces+0.046
 bites+0.040
 aback+0.038
Neg logit contributions
abella-0.100
icking-0.098
bie-0.097
tha-0.094
bled-0.093
Token,
Token ID11
Feature activation-4.07e-03

Pos logit contributions
abella+0.019
icking+0.018
tha+0.018
bie+0.018
ili+0.018
Neg logit contributions
pillar-0.009
 target-0.009
 pieces-0.009
Sur-0.008
 bites-0.008
Token I
Token ID314
Feature activation-2.46e-02

Pos logit contributions
 target+0.006
 pieces+0.006
pillar+0.006
Sur+0.006
 piece+0.006
Neg logit contributions
icking-0.007
abella-0.007
bled-0.007
bie-0.007
tha-0.007
Token did
Token ID750
Feature activation-6.55e-03

Pos logit contributions
pillar+0.032
 target+0.032
 pieces+0.031
Sur+0.028
 bites+0.027
Neg logit contributions
abella-0.049
icking-0.049
bie-0.048
tha-0.047
bled-0.046
Token it
Token ID340
Feature activation-1.45e-02

Pos logit contributions
 target+0.006
pillar+0.006
 pieces+0.006
Sur+0.005
 bites+0.005
Neg logit contributions
icking-0.015
abella-0.015
bie-0.015
tha-0.015
bled-0.014
Token je
Token ID11223
Feature activation+7.40e-02

Pos logit contributions
abella+0.005
icking+0.005
bled+0.005
bie+0.005
ores+0.005
Neg logit contributions
 pieces-0.003
 target-0.003
pillar-0.002
 string-0.002
Sur-0.002
Tokenep
Token ID538
Feature activation+1.70e-02

Pos logit contributions
abella+0.102
icking+0.100
bie+0.099
tha+0.094
bled+0.090
Neg logit contributions
pillar-0.143
 target-0.141
 pieces-0.141
Sur-0.131
 bites-0.126
Token found
Token ID1043
Feature activation-1.17e-02

Pos logit contributions
abella+0.036
bie+0.035
tha+0.035
ili+0.034
icking+0.034
Neg logit contributions
 target-0.020
pillar-0.020
 pieces-0.020
Sur-0.017
's-0.016
Token a
Token ID257
Feature activation-5.52e-02

Pos logit contributions
pillar+0.013
 pieces+0.013
 target+0.013
Sur+0.011
 aback+0.011
Neg logit contributions
abella-0.026
icking-0.025
bie-0.025
tha-0.025
ili-0.024
Token bright
Token ID6016
Feature activation-5.32e-02

Pos logit contributions
pillar+0.070
 pieces+0.066
 target+0.065
Sur+0.061
 bites+0.060
Neg logit contributions
icking-0.109
abella-0.109
bie-0.108
tha-0.105
ili-0.099
Token red
Token ID2266
Feature activation-5.79e-02

Pos logit contributions
pillar+0.074
Sur+0.069
 target+0.068
 pieces+0.067
ives+0.065
Neg logit contributions
icking-0.098
bie-0.096
tha-0.094
abella-0.092
ric-0.089
Token apple
Token ID17180
Feature activation-2.72e-02

Pos logit contributions
pillar+0.078
 aback+0.075
 pieces+0.073
Sur+0.073
 target+0.071
Neg logit contributions
icking-0.109
tha-0.100
bie-0.100
ric-0.099
belt-0.097
Token.
Token ID13
Feature activation-1.82e-02

Pos logit contributions
 aback+0.041
Sur+0.040
Start+0.038
powers+0.038
pillar+0.037
Neg logit contributions
icking-0.048
bled-0.043
belt-0.041
tha-0.041
abella-0.039
Token He
Token ID679
Feature activation-3.14e-03

Pos logit contributions
pillar+0.018
 target+0.018
 pieces+0.018
Sur+0.016
 bites+0.015
Neg logit contributions
icking-0.042
abella-0.042
bie-0.041
tha-0.040
bled-0.040
Token was
Token ID373
Feature activation+3.50e-02

Pos logit contributions
pillar+0.005
 target+0.005
 pieces+0.004
 aback+0.004
Sur+0.004
Neg logit contributions
abella-0.006
bie-0.006
icking-0.005
tha-0.005
bled-0.005
Token very
Token ID845
Feature activation+1.98e-02

Pos logit contributions
abella+0.086
icking+0.085
tha+0.083
bie+0.083
ancer+0.081
Neg logit contributions
 pieces-0.031
 target-0.030
pillar-0.026
 piece-0.024
 bites-0.024