TOP ACTIVATIONS
MAX = 0.087

. It comes from a far away place called Africa .
It was coming from a far away planet . The bird
She found it in a far away land . It is
. I came from a far away land to see the
in the mail from a far away land . The gem
He is visiting from a far away land for the day
It was coming from a far away corner of the cave
can you please shut the door ?" he asked . Lily
aliens who came from a far away planet . They showed
huge flood came to a far away land . The little
. I come from a far away planet . I am
. I come from a far away place . I can
. Please , unlock the door and come out ."
wind and landed in a far away place . This place
us . Please open the door ." S ara
fly a plane to a far place . He is anxious
sorry . Please open the door . I don 't want
that it came from a far away place . The all
send a missile to a far away place . The missile
just came back from a far away place . This is

MIN ACTIVATIONS
MIN = -0.105

lots of animals of all sizes . One day , a
creatures of all shapes and sizes . One of these creatures
all sorts of shapes and sizes . She was so happy
different colors , shapes and sizes . She carefully chose the
cars of different colors and sizes . They liked to zoom
cars of different colors and sizes . He liked to make
cars of different colors and sizes . He liked to make
toys of all shapes and sizes . Her favorite were two
trains of different colors and sizes . They also have a
cubes of different colors and sizes . They can make towers
had many boxes of different sizes and shapes . They used
paint br ushes of all sizes . The painter
cars of different colors and sizes , but their favorite was
cars of different colors and sizes . They make roads with
cars of different colors and sizes . They made roads and
bles of different colors and sizes . She enjoyed rolling them
rocks and sticks of all sizes . Every morning , he
boats of different colors and sizes . They liked to make
different colors , shapes and sizes . Everyone was amazed at
cubes of different colors and sizes . They build towers and

INTERVAL 0.039 - 0.087
CONTAINS 0.947%

Max thought about it and decided not to eat the mouse
door . He pushed the door . The door was stuck
right . So he turned off the TV and went to
Lily shouts . " The road is not for playing .
. At the far side of the pond ,

INTERVAL -0.009 - 0.039
CONTAINS 71.136%

push and shove each other . Suddenly , Tim my tri
herself in the big mirror in her room .
the coin and it rolled away . Sam was
upon a time there was a boy named Jack . Jack
them feel happy again . <|endoftext|> S ully and his family

INTERVAL -0.057 - -0.009
CONTAINS 27.836%

the brightest . She loved to compete with the other animals
Lily told her mom , " I want that yellow fruit
is scary and mean . They do not want to find
and I love each other ?" Tom nodded and
to Mia . " Look , Mia . It

INTERVAL -0.105 - -0.057
CONTAINS 0.081%

the tourist took lots of pictures and learned new things .
friends squeezed lots of le mons and mixed them with sugar
and they had lots of fun together ! <|endoftext|> Once upon
a liked to do fun things every day .
. She found many pe b bles . She filled her
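
The four INTERVAL headings above are consistent with splitting the range between the reported minimum (-0.105) and maximum (0.087) into four equal-width bins of width 0.048. The following is a minimal sketch under that assumption. The tool that produced this dump is not shown, so the helper name `interval_edges` and the bin count are hypothetical, chosen only to reproduce the printed bounds.

```python
def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced edges from lo to hi, rounded to 3 decimals.

    Hypothetical helper: equal-width binning is an assumption inferred from
    the printed interval bounds, not a documented behaviour of the tool.
    """
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]


# Extremes as reported under MIN ACTIVATIONS and TOP ACTIVATIONS.
edges = interval_edges(-0.105, 0.087)
# edges == [-0.105, -0.057, -0.009, 0.039, 0.087], matching the INTERVAL headings

# Reported share of tokens in each interval (percent), lowest interval first.
shares = [0.081, 27.836, 71.136, 0.947]
total = round(sum(shares), 3)
# total == 100.0, so the four intervals cover every token
```

Read this way, roughly 99% of tokens fall in the two middle intervals, and the "far away" and "door" examples in the top interval account for under 1% of tokens.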
Token.
Token ID13
Feature activation+1.64e-02

Pos logit contributions
 my+0.050
 me+0.043
 MY+0.043
 mine+0.041
 cave+0.040
Neg logit contributions
bows-0.048
hing-0.043
owers-0.042
 understand-0.042
ruck-0.041
Token It
Token ID632
Feature activation+3.41e-02

Pos logit contributions
 my+0.028
 me+0.025
 away+0.024
 mistake+0.024
 cage+0.024
Neg logit contributions
bows-0.029
hing-0.027
owers-0.026
 understand-0.026
anes-0.024
Token comes
Token ID2058
Feature activation+1.59e-02

Pos logit contributions
 my+0.044
 cave+0.035
 gru+0.034
 MY+0.034
 me+0.034
Neg logit contributions
bows-0.074
hing-0.070
owers-0.066
 understand-0.065
 sizes-0.064
Token from
Token ID422
Feature activation+4.89e-02

Pos logit contributions
 my+0.028
 me+0.024
 MY+0.023
 cave+0.023
 cage+0.023
Neg logit contributions
bows-0.027
hing-0.027
 understand-0.025
ruck-0.024
owers-0.024
Token a
Token ID257
Feature activation+2.76e-02

Pos logit contributions
 my+0.080
lor+0.071
 MY+0.071
ar+0.070
 gru+0.066
Neg logit contributions
bows-0.078
ruck-0.075
 understand-0.075
hing-0.073
owers-0.069
Token far
Token ID1290
Feature activation+8.74e-02

Pos logit contributions
 my+0.046
fired+0.038
 MY+0.038
 slams+0.037
 me+0.036
Neg logit contributions
bows-0.046
hing-0.042
 understand-0.042
ruck-0.041
 sizes-0.040
Tokenaway
Token ID8272
Feature activation+6.01e-02

Pos logit contributions
lor+0.104
 my+0.095
 MY+0.090
 gru+0.090
 hasn+0.090
Neg logit contributions
hing-0.159
ruck-0.158
bows-0.153
 understand-0.149
aff-0.146
Token place
Token ID1295
Feature activation+3.31e-02

Pos logit contributions
 my+0.102
 MY+0.099
 gru+0.092
 timeout+0.091
lor+0.088
Neg logit contributions
bows-0.093
 sizes-0.076
ruck-0.076
 dancers-0.075
 understand-0.070
Token called
Token ID1444
Feature activation+1.86e-02

Pos logit contributions
 my+0.057
 MY+0.050
 gru+0.050
 me+0.049
ution+0.046
Neg logit contributions
bows-0.057
 understand-0.049
owers-0.048
hing-0.048
 were-0.047
Token Africa
Token ID5478
Feature activation+3.48e-02

Pos logit contributions
 my+0.032
 me+0.027
 cave+0.027
 MY+0.026
 cage+0.026
Neg logit contributions
bows-0.032
hing-0.030
owers-0.029
 understand-0.028
ruck-0.027
Token.
Token ID13
Feature activation+1.49e-02

Pos logit contributions
 my+0.066
 MY+0.057
 me+0.054
 mine+0.051
 dismay+0.051
Neg logit contributions
bows-0.059
owers-0.051
hing-0.050
 understand-0.048
ances-0.046
Token It
Token ID632
Feature activation+6.13e-03

Pos logit contributions
 my+0.010
 mine+0.008
 MY+0.008
 me+0.008
 cage+0.008
Neg logit contributions
bows-0.009
hing-0.008
owers-0.008
 understand-0.008
 sizes-0.007
Token was
Token ID373
Feature activation-2.60e-03

Pos logit contributions
 my+0.011
 mine+0.009
 me+0.009
 MY+0.009
 cage+0.009
Neg logit contributions
hing-0.011
bows-0.010
 understand-0.010
owers-0.009
ruck-0.009
Token coming
Token ID2406
Feature activation+2.61e-02

Pos logit contributions
bows+0.005
hing+0.005
owers+0.004
 understand+0.004
ruck+0.004
Neg logit contributions
 my-0.004
 me-0.004
 cage-0.004
 MY-0.003
 mine-0.003
Token from
Token ID422
Feature activation+3.07e-02

Pos logit contributions
 my+0.047
 me+0.040
 MY+0.038
 cage+0.037
 mine+0.037
Neg logit contributions
bows-0.046
hing-0.045
 understand-0.041
ruck-0.041
owers-0.041
Token a
Token ID257
Feature activation+3.83e-02

Pos logit contributions
 my+0.052
 me+0.044
 MY+0.043
 gru+0.041
 mine+0.041
Neg logit contributions
bows-0.053
hing-0.052
ruck-0.050
owers-0.048
 understand-0.047
Token far
Token ID1290
Feature activation+8.59e-02

Pos logit contributions
 my+0.066
 MY+0.054
riage+0.054
fired+0.053
ar+0.052
Neg logit contributions
hing-0.059
 understand-0.059
bows-0.059
ruck-0.056
 were-0.052
Token away
Token ID1497
Feature activation+4.71e-02

Pos logit contributions
lor+0.100
 my+0.083
fired+0.082
riage+0.082
 gru+0.081
Neg logit contributions
hing-0.177
ruck-0.166
bows-0.153
 understand-0.148
 like-0.147
Token planet
Token ID5440
Feature activation+3.26e-02

Pos logit contributions
 my+0.078
 MY+0.069
 gru+0.068
ution+0.067
 mine+0.067
Neg logit contributions
bows-0.083
ruck-0.076
hing-0.075
owers-0.070
 understand-0.069
Token.
Token ID13
Feature activation+5.17e-03

Pos logit contributions
 my+0.067
 MY+0.057
 gru+0.054
 me+0.054
ution+0.053
Neg logit contributions
bows-0.046
hing-0.043
owers-0.040
ruck-0.039
 were-0.039
Token The
Token ID383
Feature activation+7.75e-03

Pos logit contributions
 my+0.009
 mine+0.007
 MY+0.007
 mistake+0.007
 me+0.007
Neg logit contributions
bows-0.009
owers-0.008
hing-0.008
 understand-0.008
 sizes-0.008
Token bird
Token ID6512
Feature activation+2.79e-02

Pos logit contributions
 my+0.012
 me+0.010
 MY+0.009
 mine+0.009
 gru+0.008
Neg logit contributions
bows-0.016
hing-0.015
owers-0.015
 understand-0.015
ruck-0.014
Token She
Token ID1375
Feature activation+3.18e-02

Pos logit contributions
 my+0.024
 cave+0.021
 away+0.021
 me+0.021
 cage+0.021
Neg logit contributions
bows-0.027
hing-0.025
 understand-0.025
owers-0.025
ickets-0.023
Token found
Token ID1043
Feature activation+1.48e-02

Pos logit contributions
 my+0.055
 me+0.046
 cage+0.043
 MY+0.042
 cave+0.042
Neg logit contributions
bows-0.061
hing-0.058
owers-0.055
 understand-0.054
ruck-0.052
Token it
Token ID340
Feature activation+1.95e-02

Pos logit contributions
 my+0.022
 cave+0.021
 gru+0.021
 MY+0.019
 cage+0.019
Neg logit contributions
bows-0.026
 understand-0.024
hing-0.024
ruck-0.023
owers-0.023
Token in
Token ID287
Feature activation+2.77e-02

Pos logit contributions
 my+0.033
 gru+0.032
 cave+0.030
 MY+0.029
 cage+0.029
Neg logit contributions
bows-0.034
owers-0.029
hing-0.029
 understand-0.028
 were-0.026
Token a
Token ID257
Feature activation+2.90e-02

Pos logit contributions
 my+0.043
 me+0.038
 gru+0.038
 cave+0.038
ar+0.037
Neg logit contributions
bows-0.050
 understand-0.046
hing-0.045
ruck-0.044
 sizes-0.043
Token far
Token ID1290
Feature activation+8.56e-02

Pos logit contributions
 my+0.047
riage+0.046
fired+0.043
eros+0.042
 MY+0.042
Neg logit contributions
 understand-0.043
 were-0.036
hing-0.036
bows-0.036
 love-0.035
Tokenaway
Token ID8272
Feature activation+6.07e-02

Pos logit contributions
lor+0.118
 gru+0.109
 Does+0.106
 MY+0.105
 hasn+0.099
Neg logit contributions
ruck-0.135
 understand-0.135
hing-0.133
bows-0.128
aff-0.122
Token land
Token ID1956
Feature activation+3.42e-02

Pos logit contributions
 MY+0.095
 my+0.089
 gru+0.088
oh+0.086
CH+0.084
Neg logit contributions
bows-0.094
 sizes-0.081
 understand-0.080
 were-0.077
ruck-0.077
Token.
Token ID13
Feature activation+9.87e-03

Pos logit contributions
 my+0.060
 MY+0.056
 gru+0.055
 cave+0.053
 me+0.052
Neg logit contributions
bows-0.056
owers-0.049
hing-0.047
ances-0.045
 understand-0.045
Token It
Token ID632
Feature activation+2.10e-02

Pos logit contributions
 my+0.015
 cage+0.013
 cave+0.013
 away+0.013
 me+0.013
Neg logit contributions
bows-0.018
hing-0.017
 understand-0.017
owers-0.017
ickets-0.016
Token is
Token ID318
Feature activation-6.87e-04

Pos logit contributions
 my+0.034
 gru+0.029
 me+0.028
 MY+0.028
 cave+0.028
Neg logit contributions
bows-0.041
hing-0.037
owers-0.035
 were-0.035
 understand-0.034
Token.
Token ID13
Feature activation+1.36e-02

Pos logit contributions
 my+0.033
 MY+0.029
 me+0.029
 gru+0.028
 cave+0.027
Neg logit contributions
bows-0.032
hing-0.029
owers-0.028
 understand-0.028
ruck-0.026
Token I
Token ID314
Feature activation+1.45e-02

Pos logit contributions
 my+0.020
 me+0.018
 cave+0.017
 cage+0.017
 away+0.017
Neg logit contributions
bows-0.027
hing-0.026
 understand-0.025
owers-0.025
ickets-0.023
Token came
Token ID1625
Feature activation+2.16e-02

Pos logit contributions
 my+0.025
 me+0.022
 cage+0.020
 mine+0.020
 cave+0.020
Neg logit contributions
bows-0.027
hing-0.026
owers-0.025
ruck-0.023
 understand-0.023
Token from
Token ID422
Feature activation+5.49e-02

Pos logit contributions
 my+0.026
 gru+0.023
 MY+0.023
 cave+0.023
 cage+0.022
Neg logit contributions
bows-0.044
 understand-0.044
hing-0.044
ruck-0.041
owers-0.040
Token a
Token ID257
Feature activation+3.59e-02

Pos logit contributions
 my+0.072
lor+0.070
 MY+0.069
 gru+0.069
ar+0.067
Neg logit contributions
 understand-0.095
bows-0.092
ruck-0.092
hing-0.092
owers-0.091
Token far
Token ID1290
Feature activation+8.55e-02

Pos logit contributions
 my+0.058
 MY+0.052
 slams+0.051
riage+0.048
ar+0.047
Neg logit contributions
 understand-0.057
bows-0.057
ruck-0.053
hing-0.052
 were-0.049
Tokenaway
Token ID8272
Feature activation+6.50e-02

Pos logit contributions
 gru+0.102
 my+0.098
 MY+0.098
lor+0.092
 hasn+0.089
Neg logit contributions
hing-0.156
ruck-0.155
bows-0.152
 understand-0.146
aff-0.144
Token land
Token ID1956
Feature activation+3.86e-02

Pos logit contributions
 MY+0.114
 my+0.109
 gru+0.107
oh+0.097
 timeout+0.096
Neg logit contributions
bows-0.095
ruck-0.083
 dancers-0.078
 sizes-0.078
 understand-0.076
Token to
Token ID284
Feature activation+3.18e-02

Pos logit contributions
 my+0.072
 MY+0.066
 me+0.062
 dismay+0.061
 gru+0.061
Neg logit contributions
bows-0.061
hing-0.054
owers-0.053
 understand-0.049
ances-0.049
Token see
Token ID766
Feature activation+3.49e-02

Pos logit contributions
 my+0.062
 MY+0.059
 gru+0.058
 mine+0.054
ar+0.054
Neg logit contributions
 understand-0.046
hing-0.041
bows-0.040
ruck-0.035
 were-0.034
Token the
Token ID262
Feature activation+2.93e-03

Pos logit contributions
 my+0.044
 MY+0.037
 cave+0.036
 cage+0.036
 me+0.035
Neg logit contributions
bows-0.073
hing-0.072
 understand-0.069
owers-0.069
ruck-0.067
Token in
Token ID287
Feature activation+1.80e-02

Pos logit contributions
bows+0.009
hing+0.009
owers+0.008
 understand+0.008
ruck+0.008
Neg logit contributions
 my-0.008
 me-0.007
 cage-0.006
 MY-0.006
 mine-0.006
Token the
Token ID262
Feature activation+9.40e-03

Pos logit contributions
 my+0.035
 MY+0.031
ar+0.031
 cage+0.031
 gru+0.030
Neg logit contributions
bows-0.026
 understand-0.024
hing-0.024
ruck-0.023
owers-0.022
Token mail
Token ID6920
Feature activation+1.42e-02

Pos logit contributions
 my+0.016
 MY+0.015
shadow+0.015
smart+0.014
ar+0.014
Neg logit contributions
bows-0.015
owers-0.014
 understand-0.013
ruck-0.013
hing-0.013
Token from
Token ID422
Feature activation+2.75e-02

Pos logit contributions
 my+0.027
 me+0.022
 MY+0.022
 mine+0.021
 gru+0.021
Neg logit contributions
bows-0.025
hing-0.023
owers-0.023
ruck-0.021
 understand-0.021
Token a
Token ID257
Feature activation+3.24e-02

Pos logit contributions
 my+0.052
 MY+0.046
 gru+0.044
 cage+0.044
ar+0.043
Neg logit contributions
bows-0.042
 understand-0.039
hing-0.038
ruck-0.037
owers-0.037
Token far
Token ID1290
Feature activation+8.51e-02

Pos logit contributions
 my+0.055
fired+0.047
 slams+0.047
 MY+0.047
riage+0.045
Neg logit contributions
 understand-0.047
bows-0.045
hing-0.043
ruck-0.042
 were-0.041
Token away
Token ID1497
Feature activation+4.29e-02

Pos logit contributions
lor+0.103
 my+0.099
 gru+0.098
 MY+0.096
 Does+0.094
Neg logit contributions
hing-0.144
bows-0.144
ruck-0.142
aff-0.138
 understand-0.132
Token land
Token ID1956
Feature activation+2.28e-02

Pos logit contributions
 my+0.061
 MY+0.054
 gru+0.052
ution+0.051
 mine+0.050
Neg logit contributions
bows-0.088
ruck-0.079
 understand-0.074
owers-0.073
 sizes-0.073
Token.
Token ID13
Feature activation-3.86e-03

Pos logit contributions
 my+0.049
 MY+0.042
 me+0.041
 mine+0.040
 gru+0.039
Neg logit contributions
bows-0.033
owers-0.029
hing-0.028
ances-0.026
ruck-0.026
Token The
Token ID383
Feature activation-8.16e-03

Pos logit contributions
bows+0.007
hing+0.006
owers+0.006
 understand+0.006
ruck+0.005
Neg logit contributions
 my-0.007
 me-0.006
 cage-0.006
 MY-0.006
 mine-0.006
Token gem
Token ID16840
Feature activation+1.06e-02

Pos logit contributions
bows+0.017
hing+0.017
ickets+0.016
owers+0.015
akes+0.015
Neg logit contributions
 my-0.013
 me-0.011
 cage-0.010
 mine-0.010
 cave-0.009
Token He
Token ID679
Feature activation+1.56e-02

Pos logit contributions
 my+0.023
 cage+0.022
 mistake+0.022
 away+0.022
 me+0.021
Neg logit contributions
bows-0.019
hing-0.018
 understand-0.018
ickets-0.017
owers-0.016
Token is
Token ID318
Feature activation-9.24e-05

Pos logit contributions
 my+0.028
 me+0.024
 cage+0.023
 MY+0.022
 cave+0.022
Neg logit contributions
bows-0.028
hing-0.026
owers-0.025
 understand-0.025
ruck-0.023
Token visiting
Token ID10013
Feature activation+2.88e-02

Pos logit contributions
bows+0.000
hing+0.000
 understand+0.000
ruck+0.000
 sizes+0.000
Neg logit contributions
 my-0.000
 cage-0.000
 me-0.000
 cave-0.000
 MY-0.000
Token from
Token ID422
Feature activation+4.18e-02

Pos logit contributions
 my+0.033
 cage+0.032
 cave+0.031
fired+0.029
 gru+0.029
Neg logit contributions
bows-0.058
hing-0.055
 understand-0.054
owers-0.053
 dancers-0.052
Token a
Token ID257
Feature activation+3.72e-02

Pos logit contributions
 my+0.063
 cage+0.057
 MY+0.056
 gru+0.056
 cave+0.055
Neg logit contributions
bows-0.068
ruck-0.064
 understand-0.064
hing-0.063
 dancers-0.062
Token far
Token ID1290
Feature activation+8.42e-02

Pos logit contributions
 my+0.068
fired+0.060
 slams+0.058
eros+0.058
 MY+0.056
Neg logit contributions
 understand-0.050
bows-0.048
hing-0.046
 dancers-0.044
ruck-0.042
Token away
Token ID1497
Feature activation+4.63e-02

Pos logit contributions
lor+0.096
 gru+0.090
 my+0.088
 Does+0.088
fired+0.088
Neg logit contributions
hing-0.149
ruck-0.145
bows-0.142
 understand-0.140
aff-0.137
Token land
Token ID1956
Feature activation+2.82e-02

Pos logit contributions
 my+0.065
ution+0.060
 MY+0.060
 gru+0.059
 mine+0.057
Neg logit contributions
bows-0.084
 dancers-0.076
ruck-0.074
 understand-0.074
 were-0.072
Token for
Token ID329
Feature activation+1.33e-02

Pos logit contributions
 my+0.050
 MY+0.043
 me+0.042
 cage+0.042
 gru+0.041
Neg logit contributions
bows-0.045
hing-0.042
owers-0.040
ruck-0.038
ances-0.038
Token the
Token ID262
Feature activation-6.42e-03

Pos logit contributions
 my+0.023
 cage+0.020
 MY+0.020
fired+0.020
 cave+0.020
Neg logit contributions
bows-0.020
 understand-0.020
ruck-0.019
hing-0.019
 dancers-0.018
Token day
Token ID1110
Feature activation+5.87e-03

Pos logit contributions
bows+0.012
hing+0.012
owers+0.011
 understand+0.011
ruck+0.011
Neg logit contributions
 my-0.011
 me-0.009
 cage-0.009
 mine-0.009
 cave-0.009
Token It
Token ID632
Feature activation+1.85e-02

Pos logit contributions
 my+0.011
 me+0.009
 mine+0.009
 cave+0.009
 cage+0.009
Neg logit contributions
bows-0.011
hing-0.010
owers-0.010
 understand-0.010
ruck-0.009
Token was
Token ID373
Feature activation-5.23e-04

Pos logit contributions
 my+0.035
 mine+0.030
 MY+0.029
 cage+0.028
 cave+0.028
Neg logit contributions
hing-0.029
bows-0.028
 understand-0.028
 were-0.026
owers-0.024
Token coming
Token ID2406
Feature activation+2.23e-02

Pos logit contributions
bows+0.001
hing+0.001
 understand+0.001
owers+0.001
ruck+0.001
Neg logit contributions
 my-0.001
 me-0.001
 cage-0.001
 MY-0.001
 mine-0.001
Token from
Token ID422
Feature activation+3.12e-02

Pos logit contributions
 my+0.039
 me+0.033
 cage+0.031
 MY+0.031
 mine+0.030
Neg logit contributions
bows-0.042
hing-0.040
owers-0.037
 understand-0.037
ruck-0.036
Token a
Token ID257
Feature activation+4.14e-02

Pos logit contributions
 my+0.052
 me+0.043
 MY+0.041
 mine+0.040
 cage+0.040
Neg logit contributions
bows-0.059
hing-0.058
 understand-0.054
owers-0.054
ruck-0.053
Token far
Token ID1290
Feature activation+8.39e-02

Pos logit contributions
 my+0.071
 MY+0.058
 slams+0.058
fired+0.056
 belongs+0.056
Neg logit contributions
 understand-0.069
hing-0.065
bows-0.063
ruck-0.061
 were-0.059
Token away
Token ID1497
Feature activation+4.14e-02

Pos logit contributions
lor+0.106
 my+0.097
 MY+0.092
fired+0.089
 hasn+0.088
Neg logit contributions
hing-0.159
ruck-0.151
 understand-0.144
bows-0.137
 like-0.131
Token corner
Token ID5228
Feature activation+2.55e-02

Pos logit contributions
 my+0.066
 MY+0.056
 mine+0.056
 me+0.054
ution+0.050
Neg logit contributions
bows-0.077
hing-0.071
 understand-0.071
ruck-0.071
 were-0.066
Token of
Token ID286
Feature activation+2.22e-02

Pos logit contributions
 my+0.056
 MY+0.047
 me+0.047
 mine+0.047
 your+0.043
Neg logit contributions
bows-0.035
hing-0.033
ruck-0.031
 understand-0.030
owers-0.030
Token the
Token ID262
Feature activation+6.81e-03

Pos logit contributions
 my+0.036
 mine+0.030
 me+0.030
 MY+0.030
ar+0.028
Neg logit contributions
hing-0.039
bows-0.039
ruck-0.037
 understand-0.037
owers-0.035
Token cave
Token ID11527
Feature activation+4.17e-02

Pos logit contributions
 my+0.008
 MY+0.006
 mine+0.005
 your+0.005
ress+0.005
Neg logit contributions
bows-0.015
hing-0.015
ruck-0.014
 understand-0.014
 were-0.014
Token can
Token ID460
Feature activation+5.56e-03

Pos logit contributions
 my+0.021
 gru+0.020
 cage+0.020
 cave+0.019
fired+0.019
Neg logit contributions
hing-0.028
 understand-0.028
bows-0.028
ruck-0.026
 were-0.025
Token you
Token ID345
Feature activation+1.15e-02

Pos logit contributions
 my+0.006
 cage+0.005
 cave+0.005
 gru+0.005
 MY+0.004
Neg logit contributions
hing-0.013
bows-0.012
 understand-0.012
owers-0.012
ickets-0.011
Token please
Token ID3387
Feature activation+4.21e-02

Pos logit contributions
 my+0.023
 gru+0.020
 MY+0.020
 me+0.020
 cage+0.019
Neg logit contributions
hing-0.017
 understand-0.017
bows-0.016
owers-0.014
ruck-0.014
Token shut
Token ID4423
Feature activation+5.77e-02

Pos logit contributions
 my+0.080
 gru+0.072
 MY+0.070
 cage+0.068
 cave+0.067
Neg logit contributions
bows-0.064
 understand-0.062
hing-0.062
owers-0.054
ruck-0.053
Token the
Token ID262
Feature activation+1.11e-02

Pos logit contributions
 gru+0.080
 cage+0.077
 ranger+0.073
ar+0.073
bek+0.072
Neg logit contributions
hing-0.094
ruck-0.091
 viol-0.089
 dancers-0.089
bows-0.088
Token door
Token ID3420
Feature activation+8.37e-02

Pos logit contributions
 my+0.014
 gru+0.011
 MY+0.011
ar+0.011
 me+0.010
Neg logit contributions
bows-0.025
owers-0.022
 understand-0.022
hing-0.022
ruck-0.021
Token?"
Token ID1701
Feature activation+1.46e-02

Pos logit contributions
 gru+0.130
 my+0.122
 MY+0.122
eros+0.121
 crate+0.119
Neg logit contributions
bows-0.137
 understand-0.133
hing-0.130
ruck-0.117
 like-0.111
Token he
Token ID339
Feature activation+2.21e-02

Pos logit contributions
 my+0.030
 cave+0.028
 MY+0.028
 cage+0.028
 destination+0.027
Neg logit contributions
 understand-0.017
bows-0.017
hing-0.015
 were-0.014
owers-0.013
Token asked
Token ID1965
Feature activation+1.43e-02

Pos logit contributions
 my+0.044
 cage+0.038
 cave+0.037
 MY+0.037
 me+0.037
Neg logit contributions
bows-0.033
hing-0.031
 understand-0.030
owers-0.028
ruck-0.027
Token.
Token ID13
Feature activation+1.50e-02

Pos logit contributions
 my+0.026
 me+0.022
 MY+0.022
 cage+0.022
 cave+0.022
Neg logit contributions
bows-0.024
hing-0.023
 understand-0.022
owers-0.021
ruck-0.021
Token Lily
Token ID20037
Feature activation+1.27e-02

Pos logit contributions
 my+0.028
 cave+0.027
 cage+0.027
 MY+0.026
 beast+0.026
Neg logit contributions
bows-0.020
 understand-0.019
 were-0.018
hing-0.016
ruck-0.015
Token aliens
Token ID16269
Feature activation-2.46e-02

Pos logit contributions
bows+0.019
hing+0.018
owers+0.018
 understand+0.017
ruck+0.017
Neg logit contributions
 my-0.013
 me-0.011
 cage-0.010
 cave-0.009
 mine-0.009
Token who
Token ID508
Feature activation+1.15e-03

Pos logit contributions
bows+0.049
hing+0.046
owers+0.044
 understand+0.042
ruck+0.041
Neg logit contributions
 my-0.041
 me-0.035
 cage-0.032
 cave-0.032
 mine-0.031
Token came
Token ID1625
Feature activation+2.89e-04

Pos logit contributions
 my+0.002
 me+0.002
 cage+0.002
 MY+0.002
 mine+0.002
Neg logit contributions
bows-0.002
hing-0.002
owers-0.002
 understand-0.002
ruck-0.002
Token from
Token ID422
Feature activation+3.40e-02

Pos logit contributions
 my+0.000
 me+0.000
 cage+0.000
 MY+0.000
 mine+0.000
Neg logit contributions
bows-0.001
hing-0.001
owers-0.001
 understand-0.001
ruck-0.001
Token a
Token ID257
Feature activation+3.05e-02

Pos logit contributions
 my+0.060
 MY+0.051
 me+0.050
 mine+0.047
 cage+0.046
Neg logit contributions
bows-0.057
hing-0.055
 understand-0.055
ruck-0.053
owers-0.048
Token far
Token ID1290
Feature activation+8.34e-02

Pos logit contributions
 my+0.052
 MY+0.044
riage+0.043
fired+0.042
 slams+0.042
Neg logit contributions
bows-0.047
 understand-0.046
ruck-0.045
hing-0.045
 were-0.041
Tokenaway
Token ID8272
Feature activation+6.20e-02

Pos logit contributions
lor+0.108
 my+0.101
fired+0.098
 MY+0.098
 hasn+0.096
Neg logit contributions
hing-0.143
ruck-0.141
bows-0.132
aff-0.128
 understand-0.127
Token planet
Token ID5440
Feature activation+2.98e-02

Pos logit contributions
 MY+0.101
 my+0.098
 gru+0.089
oh+0.089
 Which+0.087
Neg logit contributions
bows-0.093
 dancers-0.085
ruck-0.085
 sizes-0.078
 planets-0.077
Token.
Token ID13
Feature activation+1.01e-02

Pos logit contributions
 my+0.061
 MY+0.052
 me+0.049
 gru+0.049
 mine+0.047
Neg logit contributions
bows-0.043
hing-0.039
 were-0.037
owers-0.036
ruck-0.036
Token They
Token ID1119
Feature activation-3.88e-02

Pos logit contributions
 my+0.020
 cage+0.017
 me+0.017
 MY+0.017
 mine+0.016
Neg logit contributions
bows-0.016
hing-0.014
 understand-0.014
owers-0.013
ruck-0.013
Token showed
Token ID3751
Feature activation-1.73e-02

Pos logit contributions
bows+0.079
owers+0.073
hing+0.067
akes+0.065
ickets+0.064
Neg logit contributions
 my-0.065
 me-0.060
 cave-0.054
 cage-0.054
 away-0.052
Token huge
Token ID3236
Feature activation+2.45e-02

Pos logit contributions
 my+0.009
 me+0.008
 cage+0.007
 cave+0.007
 MY+0.007
Neg logit contributions
bows-0.012
hing-0.011
owers-0.011
ruck-0.010
 understand-0.010
Token flood
Token ID6947
Feature activation+2.66e-02

Pos logit contributions
 my+0.041
 MY+0.034
 me+0.033
 gru+0.032
fired+0.031
Neg logit contributions
bows-0.045
hing-0.043
owers-0.041
ruck-0.039
 understand-0.038
Token came
Token ID1625
Feature activation+6.73e-03

Pos logit contributions
 my+0.049
 me+0.040
 MY+0.039
 mine+0.038
 gru+0.038
Neg logit contributions
bows-0.047
hing-0.045
owers-0.042
ruck-0.040
 understand-0.039
Token to
Token ID284
Feature activation+1.38e-02

Pos logit contributions
 my+0.011
 me+0.009
 cage+0.008
 MY+0.008
 mine+0.008
Neg logit contributions
bows-0.013
hing-0.013
owers-0.012
 understand-0.012
ruck-0.012
Token a
Token ID257
Feature activation+3.85e-02

Pos logit contributions
 my+0.022
 MY+0.018
 gru+0.018
 me+0.018
 cage+0.018
Neg logit contributions
bows-0.026
hing-0.025
 understand-0.024
ruck-0.023
owers-0.023
Token far
Token ID1290
Feature activation+8.33e-02

Pos logit contributions
 my+0.075
 MY+0.064
fired+0.063
 gru+0.062
lor+0.061
Neg logit contributions
bows-0.051
 understand-0.051
hing-0.049
ruck-0.048
 dancers-0.045
Tokenaway
Token ID8272
Feature activation+6.28e-02

Pos logit contributions
lor+0.119
 gru+0.104
 my+0.102
fired+0.101
 Does+0.098
Neg logit contributions
hing-0.144
ruck-0.135
bows-0.133
aff-0.127
 like-0.120
Token land
Token ID1956
Feature activation+3.10e-02

Pos logit contributions
 my+0.102
 MY+0.102
 gru+0.099
oh+0.092
lor+0.092
Neg logit contributions
bows-0.093
 dancers-0.088
ruck-0.086
 were-0.080
hing-0.078
Token.
Token ID13
Feature activation-5.44e-03

Pos logit contributions
 my+0.066
 me+0.056
 MY+0.056
 mine+0.054
 gru+0.052
Neg logit contributions
bows-0.045
hing-0.041
owers-0.040
 were-0.038
ruck-0.037
Token The
Token ID383
Feature activation-5.35e-03

Pos logit contributions
bows+0.011
hing+0.010
owers+0.010
 understand+0.009
ruck+0.009
Neg logit contributions
 my-0.009
 me-0.008
 cage-0.007
 MY-0.007
 mine-0.007
Token little
Token ID1310
Feature activation-4.04e-03

Pos logit contributions
bows+0.010
hing+0.009
owers+0.009
 understand+0.009
ruck+0.008
Neg logit contributions
 my-0.010
 me-0.008
 MY-0.008
 cage-0.007
 mine-0.007
Token.
Token ID13
Feature activation+9.74e-03

Pos logit contributions
bows+0.001
hing+0.001
owers+0.001
 understand+0.001
ruck+0.001
Neg logit contributions
 my-0.001
 me-0.001
 MY-0.001
 cage-0.001
 cave-0.001
Token I
Token ID314
Feature activation+8.29e-04

Pos logit contributions
 my+0.014
 me+0.012
 cage+0.012
 MY+0.012
 mistake+0.012
Neg logit contributions
bows-0.020
hing-0.018
 understand-0.018
owers-0.018
 sizes-0.017
Token come
Token ID1282
Feature activation+3.43e-02

Pos logit contributions
 my+0.001
 me+0.001
 cage+0.001
 cave+0.001
 mine+0.001
Neg logit contributions
bows-0.002
hing-0.001
owers-0.001
ruck-0.001
ances-0.001
Token from
Token ID422
Feature activation+5.32e-02

Pos logit contributions
 my+0.044
 MY+0.036
 me+0.036
 gru+0.034
 cave+0.034
Neg logit contributions
bows-0.073
hing-0.072
 understand-0.070
ruck-0.066
owers-0.066
Token a
Token ID257
Feature activation+2.94e-02

Pos logit contributions
 my+0.074
 MY+0.070
 gru+0.065
lor+0.063
 cage+0.062
Neg logit contributions
 understand-0.093
bows-0.092
hing-0.090
ruck-0.089
owers-0.087
Token far
Token ID1290
Feature activation+8.25e-02

Pos logit contributions
 my+0.051
 MY+0.044
 me+0.042
 slams+0.042
fired+0.041
Neg logit contributions
bows-0.047
hing-0.046
 understand-0.045
ruck-0.044
 dancers-0.041
Token away
Token ID1497
Feature activation+4.72e-02

Pos logit contributions
 my+0.101
 MY+0.098
 gru+0.096
lor+0.095
 Does+0.091
Neg logit contributions
bows-0.148
hing-0.147
ruck-0.144
aff-0.138
 understand-0.137
Token planet
Token ID5440
Feature activation+3.86e-02

Pos logit contributions
 my+0.078
 MY+0.073
 gru+0.069
 mine+0.068
 me+0.067
Neg logit contributions
bows-0.081
 understand-0.074
hing-0.072
ruck-0.072
 were-0.069
Token.
Token ID13
Feature activation+1.10e-02

Pos logit contributions
 my+0.075
 MY+0.069
 gru+0.064
 me+0.063
 dismay+0.062
Neg logit contributions
bows-0.059
owers-0.050
hing-0.050
 understand-0.049
 sizes-0.047
Token I
Token ID314
Feature activation+7.22e-03

Pos logit contributions
 my+0.016
 me+0.014
 cage+0.013
 MY+0.013
 mine+0.013
Neg logit contributions
bows-0.022
hing-0.021
 understand-0.020
owers-0.020
ickets-0.019
Token am
Token ID716
Feature activation+6.58e-03

Pos logit contributions
 my+0.014
 me+0.012
 cage+0.011
 cave+0.011
 MY+0.011
Neg logit contributions
bows-0.013
hing-0.012
owers-0.011
 understand-0.011
ruck-0.011
Token.
Token ID13
Feature activation+1.00e-02

Pos logit contributions
 my+0.019
 MY+0.017
 me+0.017
 gru+0.016
fired+0.016
Neg logit contributions
bows-0.017
hing-0.016
owers-0.015
 understand-0.015
bee-0.015
Token I
Token ID314
Feature activation+8.18e-03

Pos logit contributions
 my+0.014
 me+0.012
 cave+0.012
 cage+0.012
 mistake+0.012
Neg logit contributions
bows-0.021
hing-0.019
 understand-0.019
owers-0.019
ickets-0.018
Token come
Token ID1282
Feature activation+3.34e-02

Pos logit contributions
 my+0.015
 me+0.013
 cage+0.012
 MY+0.012
 mine+0.012
Neg logit contributions
bows-0.014
hing-0.014
owers-0.013
ruck-0.012
 understand-0.012
Token from
Token ID422
Feature activation+5.38e-02

Pos logit contributions
 my+0.048
 MY+0.041
 gru+0.041
 cave+0.040
 me+0.040
Neg logit contributions
bows-0.065
hing-0.063
 understand-0.061
owers-0.058
ruck-0.057
Token a
Token ID257
Feature activation+3.54e-02

Pos logit contributions
 MY+0.065
 my+0.065
ress+0.063
lor+0.062
 gru+0.060
Neg logit contributions
 understand-0.100
bows-0.094
hing-0.094
ruck-0.092
owers-0.092
Token far
Token ID1290
Feature activation+8.25e-02

Pos logit contributions
 my+0.058
 MY+0.050
 slams+0.047
 gru+0.046
fired+0.045
Neg logit contributions
bows-0.061
 understand-0.058
hing-0.056
ruck-0.054
owers-0.053
Token away
Token ID1497
Feature activation+3.91e-02

Pos logit contributions
 gru+0.097
 my+0.097
 MY+0.094
lor+0.087
 hasn+0.087
Neg logit contributions
hing-0.155
bows-0.152
ruck-0.147
 understand-0.143
aff-0.140
Token place
Token ID1295
Feature activation+2.03e-02

Pos logit contributions
 my+0.069
 MY+0.065
 gru+0.064
ution+0.060
 mine+0.059
Neg logit contributions
bows-0.064
owers-0.055
 understand-0.055
hing-0.055
ruck-0.053
Token.
Token ID13
Feature activation+1.46e-02

Pos logit contributions
 my+0.038
 MY+0.035
 gru+0.035
 me+0.033
ution+0.033
Neg logit contributions
bows-0.031
 understand-0.025
 sizes-0.025
owers-0.024
ruck-0.024
Token I
Token ID314
Feature activation+1.29e-02

Pos logit contributions
 my+0.020
 cave+0.018
 mistake+0.017
 me+0.017
 away+0.017
Neg logit contributions
bows-0.029
hing-0.028
 understand-0.028
owers-0.027
ickets-0.026
Token can
Token ID460
Feature activation-1.21e-02

Pos logit contributions
 my+0.025
 me+0.022
 MY+0.020
 cage+0.020
 mine+0.020
Neg logit contributions
bows-0.022
hing-0.021
 understand-0.020
owers-0.019
ruck-0.018
Token.
Token ID13
Feature activation+1.49e-02

Pos logit contributions
 my+0.059
 MY+0.054
 me+0.053
 gru+0.052
 cage+0.051
Neg logit contributions
bows-0.060
hing-0.054
 understand-0.053
ances-0.051
owers-0.051
Token Please
Token ID4222
Feature activation+2.68e-02

Pos logit contributions
 my+0.023
 cave+0.023
 away+0.022
 me+0.022
 cage+0.021
Neg logit contributions
bows-0.025
 understand-0.025
owers-0.024
hing-0.023
ickets-0.021
Token,
Token ID11
Feature activation+1.54e-02

Pos logit contributions
 my+0.047
 cave+0.040
 gru+0.039
 MY+0.039
 cage+0.039
Neg logit contributions
bows-0.046
 understand-0.045
hing-0.045
owers-0.041
ances-0.038
Token unlock
Token ID12116
Feature activation+7.41e-02

Pos logit contributions
 my+0.023
 cave+0.023
 gru+0.022
 cage+0.022
fired+0.022
Neg logit contributions
 understand-0.027
bows-0.025
 dancers-0.022
 were-0.022
hing-0.022
Token the
Token ID262
Feature activation+2.12e-02

Pos logit contributions
 gru+0.084
 cave+0.083
 cage+0.076
har+0.073
fired+0.073
Neg logit contributions
bows-0.128
ruck-0.125
 dancers-0.125
ances-0.121
hing-0.121
Token door
Token ID3420
Feature activation+8.20e-02

Pos logit contributions
 my+0.023
ar+0.020
 MY+0.018
oh+0.018
 me+0.017
Neg logit contributions
bows-0.048
 understand-0.043
ruck-0.042
 dancers-0.042
owers-0.041
Token and
Token ID290
Feature activation+4.36e-02

Pos logit contributions
 gru+0.157
 cave+0.146
ution+0.145
 MY+0.144
 my+0.143
Neg logit contributions
bows-0.110
hing-0.092
 understand-0.092
ruck-0.091
 were-0.090
Token come
Token ID1282
Feature activation+5.79e-02

Pos logit contributions
 gru+0.061
 cave+0.058
 my+0.057
ar+0.056
 cage+0.056
Neg logit contributions
 understand-0.076
bows-0.072
hing-0.069
 were-0.068
 dancers-0.064
Token out
Token ID503
Feature activation+5.70e-02

Pos logit contributions
 gru+0.091
 MY+0.084
 my+0.083
ution+0.078
 cave+0.077
Neg logit contributions
bows-0.096
 understand-0.093
ruck-0.088
hing-0.088
owers-0.081
Token."
Token ID526
Feature activation-1.36e-03

Pos logit contributions
 my+0.089
 gru+0.088
 MY+0.088
ution+0.088
 cage+0.087
Neg logit contributions
 understand-0.086
ruck-0.083
hing-0.081
bows-0.080
 dancers-0.077
Token
Token ID198
Feature activation+7.03e-03

Pos logit contributions
bows+0.002
 understand+0.002
owers+0.002
hing+0.002
ruck+0.002
Neg logit contributions
 my-0.003
 me-0.002
 mine-0.002
 MY-0.002
 cage-0.002
Token wind
Token ID2344
Feature activation-9.95e-03

Pos logit contributions
 my+0.002
 MY+0.002
 me+0.002
 gru+0.002
 cage+0.002
Neg logit contributions
bows-0.002
hing-0.001
owers-0.001
 understand-0.001
ruck-0.001
Token and
Token ID290
Feature activation+4.82e-03

Pos logit contributions
bows+0.020
hing+0.019
owers+0.019
 understand+0.018
ruck+0.017
Neg logit contributions
 my-0.016
 me-0.013
 cage-0.012
 mine-0.012
 cave-0.012
Token landed
Token ID11406
Feature activation+3.70e-03

Pos logit contributions
 my+0.008
 me+0.007
 cage+0.006
 MY+0.006
 mine+0.006
Neg logit contributions
bows-0.009
hing-0.009
 understand-0.008
owers-0.008
ruck-0.008
Token in
Token ID287
Feature activation+2.10e-02

Pos logit contributions
 my+0.007
 MY+0.006
 me+0.006
 cage+0.006
 mine+0.006
Neg logit contributions
bows-0.005
owers-0.005
hing-0.005
 understand-0.005
ruck-0.005
Token a
Token ID257
Feature activation+3.48e-02

Pos logit contributions
 my+0.039
 MY+0.034
ar+0.034
 me+0.034
 cage+0.033
Neg logit contributions
bows-0.031
 understand-0.030
hing-0.030
ruck-0.029
 sizes-0.027
Token far
Token ID1290
Feature activation+8.17e-02

Pos logit contributions
 my+0.060
riage+0.056
fired+0.053
ar+0.053
eros+0.052
Neg logit contributions
 understand-0.047
hing-0.045
ruck-0.040
bows-0.040
 dancers-0.040
Token away
Token ID1497
Feature activation+3.71e-02

Pos logit contributions
lor+0.112
 gru+0.099
 MY+0.098
fired+0.098
ution+0.095
Neg logit contributions
hing-0.147
ruck-0.132
bows-0.128
 understand-0.126
aff-0.124
Token place
Token ID1295
Feature activation+4.30e-03

Pos logit contributions
 my+0.057
 MY+0.051
 gru+0.048
ution+0.047
 mine+0.047
Neg logit contributions
bows-0.071
ruck-0.063
hing-0.062
 understand-0.062
 sizes-0.061
Token.
Token ID13
Feature activation-7.81e-04

Pos logit contributions
 my+0.009
 me+0.008
 MY+0.008
 gru+0.007
 mine+0.007
Neg logit contributions
bows-0.006
hing-0.006
 understand-0.006
owers-0.005
 sizes-0.005
Token This
Token ID770
Feature activation+3.13e-03

Pos logit contributions
bows+0.001
owers+0.001
hing+0.001
 understand+0.001
ickets+0.001
Neg logit contributions
 my-0.001
 me-0.001
 cage-0.001
 cave-0.001
 mine-0.001
Token place
Token ID1295
Feature activation+2.11e-02

Pos logit contributions
 my+0.006
 me+0.005
 gru+0.005
 MY+0.005
 mine+0.005
Neg logit contributions
bows-0.005
hing-0.005
 understand-0.004
owers-0.004
ruck-0.004
Token us
Token ID514
Feature activation-8.61e-03

Pos logit contributions
 my+0.028
 gru+0.027
 cage+0.025
 cave+0.024
 MY+0.024
Neg logit contributions
bows-0.043
hing-0.041
ruck-0.040
 understand-0.040
owers-0.039
Token.
Token ID13
Feature activation+1.48e-02

Pos logit contributions
bows+0.015
hing+0.014
 understand+0.014
owers+0.013
ruck+0.013
Neg logit contributions
 my-0.015
 me-0.013
 MY-0.013
 gru-0.012
 cave-0.012
Token Please
Token ID4222
Feature activation+2.48e-02

Pos logit contributions
 my+0.022
 me+0.021
 cave+0.020
 cage+0.020
 away+0.020
Neg logit contributions
bows-0.027
hing-0.026
 understand-0.026
owers-0.024
ickets-0.023
Token open
Token ID1280
Feature activation+5.18e-02

Pos logit contributions
 my+0.043
 gru+0.037
 cave+0.037
 MY+0.036
 cage+0.036
Neg logit contributions
bows-0.043
hing-0.042
 understand-0.042
owers-0.038
 were-0.036
Token the
Token ID262
Feature activation+1.88e-02

Pos logit contributions
 gru+0.094
har+0.083
 ranger+0.082
bek+0.082
 cave+0.081
Neg logit contributions
ruck-0.067
 dancers-0.066
bows-0.064
hing-0.063
 motors-0.062
Token door
Token ID3420
Feature activation+8.17e-02

Pos logit contributions
 my+0.026
 MY+0.022
 me+0.020
oh+0.019
 gru+0.019
Neg logit contributions
bows-0.037
hing-0.036
ruck-0.036
 understand-0.034
 sizes-0.034
Token."
Token ID526
Feature activation-1.36e-03

Pos logit contributions
 gru+0.169
 cave+0.156
 MY+0.153
ution+0.153
eros+0.150
Neg logit contributions
bows-0.097
 like-0.082
ruck-0.081
 were-0.080
as-0.077
Token
Token ID198
Feature activation+1.05e-02

Pos logit contributions
bows+0.002
 understand+0.002
owers+0.002
hing+0.002
 were+0.002
Neg logit contributions
 my-0.003
 me-0.002
 mine-0.002
 MY-0.002
 cage-0.002
Token
Token ID198
Feature activation+1.91e-02

Pos logit contributions
 my+0.019
 me+0.016
 cage+0.015
 mine+0.015
 MY+0.015
Neg logit contributions
bows-0.019
hing-0.018
owers-0.017
 understand-0.017
ruck-0.016
TokenS
Token ID50
Feature activation+8.17e-03

Pos logit contributions
 my+0.035
 me+0.030
 mine+0.029
 cave+0.028
 cage+0.028
Neg logit contributions
bows-0.033
hing-0.031
 understand-0.030
owers-0.029
ruck-0.028
Tokenara
Token ID3301
Feature activation+1.18e-02

Pos logit contributions
 my+0.013
 gru+0.011
 mine+0.010
 cave+0.010
 me+0.010
Neg logit contributions
bows-0.017
owers-0.016
hing-0.015
akes-0.015
ruck-0.014
Token fly
Token ID6129
Feature activation+2.10e-02

Pos logit contributions
bows+0.023
hing+0.022
owers+0.021
ruck+0.020
 sizes+0.019
Neg logit contributions
 my-0.026
 me-0.022
 cage-0.020
 mine-0.020
 MY-0.020
Token a
Token ID257
Feature activation+2.12e-02

Pos logit contributions
 my+0.035
 MY+0.031
 cage+0.030
 me+0.029
 mine+0.029
Neg logit contributions
bows-0.038
hing-0.034
 understand-0.033
 sizes-0.031
ruck-0.030
Token plane
Token ID6614
Feature activation+2.23e-02

Pos logit contributions
 my+0.038
 MY+0.032
 gru+0.031
fired+0.030
 me+0.030
Neg logit contributions
bows-0.035
hing-0.032
 understand-0.031
ruck-0.030
owers-0.029
Token to
Token ID284
Feature activation+1.84e-02

Pos logit contributions
 my+0.042
 MY+0.034
 me+0.034
 gru+0.033
 cave+0.033
Neg logit contributions
bows-0.040
hing-0.038
 understand-0.035
owers-0.034
ruck-0.032
Token a
Token ID257
Feature activation+3.32e-02

Pos logit contributions
 my+0.032
 MY+0.028
 me+0.027
 gru+0.027
 mine+0.027
Neg logit contributions
bows-0.031
hing-0.031
 understand-0.030
ruck-0.027
owers-0.027
Token far
Token ID1290
Feature activation+8.16e-02

Pos logit contributions
 my+0.061
 MY+0.054
fired+0.053
 gru+0.052
 slams+0.050
Neg logit contributions
bows-0.047
 understand-0.046
hing-0.044
 dancers-0.043
ruck-0.042
Token place
Token ID1295
Feature activation+1.08e-02

Pos logit contributions
 gru+0.116
lor+0.116
 my+0.115
 MY+0.114
fired+0.108
Neg logit contributions
hing-0.132
bows-0.130
 understand-0.120
ruck-0.119
aff-0.117
Token.
Token ID13
Feature activation+3.06e-03

Pos logit contributions
 my+0.021
 MY+0.019
 me+0.019
 gru+0.017
 mine+0.017
Neg logit contributions
bows-0.017
 understand-0.014
hing-0.014
ruck-0.013
 sizes-0.013
Token He
Token ID679
Feature activation+1.20e-02

Pos logit contributions
 my+0.006
 cage+0.005
 me+0.005
 cave+0.005
 MY+0.005
Neg logit contributions
bows-0.004
 understand-0.004
hing-0.004
anes-0.004
 sizes-0.003
Token is
Token ID318
Feature activation-1.15e-02

Pos logit contributions
 my+0.018
 me+0.015
 MY+0.014
 cage+0.013
 mine+0.013
Neg logit contributions
bows-0.025
hing-0.024
owers-0.023
 understand-0.023
ruck-0.022
Token anxious
Token ID18116
Feature activation-1.24e-02

Pos logit contributions
bows+0.021
hing+0.020
owers+0.018
 understand+0.018
ruck+0.018
Neg logit contributions
 my-0.021
 me-0.018
 cage-0.017
 MY-0.017
 cave-0.016
Token sorry
Token ID7926
Feature activation+6.93e-03

Pos logit contributions
 my+0.009
 cage+0.008
 me+0.008
 cave+0.008
 MY+0.008
Neg logit contributions
bows-0.009
hing-0.009
 understand-0.009
owers-0.008
ruck-0.008
Token.
Token ID13
Feature activation+1.93e-02

Pos logit contributions
 my+0.010
 cage+0.009
 me+0.009
 MY+0.008
 cave+0.008
Neg logit contributions
bows-0.014
hing-0.013
 understand-0.013
owers-0.012
ruck-0.012
Token Please
Token ID4222
Feature activation+1.61e-02

Pos logit contributions
 cage+0.025
 my+0.024
 away+0.024
 me+0.023
 cave+0.023
Neg logit contributions
bows-0.038
hing-0.036
 understand-0.035
owers-0.034
ickets-0.033
Token open
Token ID1280
Feature activation+5.76e-02

Pos logit contributions
 my+0.031
 me+0.026
 cage+0.026
 cave+0.026
 MY+0.026
Neg logit contributions
bows-0.026
hing-0.026
 understand-0.024
owers-0.023
ruck-0.022
Token the
Token ID262
Feature activation+3.25e-02

Pos logit contributions
 gru+0.085
 cage+0.078
 ranger+0.077
bek+0.076
har+0.075
Neg logit contributions
ruck-0.093
hing-0.087
 dancers-0.087
 motors-0.082
bows-0.079
Token door
Token ID3420
Feature activation+8.11e-02

Pos logit contributions
 my+0.039
ar+0.031
 MY+0.031
 me+0.030
 gru+0.029
Neg logit contributions
bows-0.070
hing-0.065
ruck-0.063
owers-0.062
 understand-0.061
Token.
Token ID13
Feature activation+2.55e-02

Pos logit contributions
 gru+0.137
 my+0.130
 cave+0.130
 cage+0.129
ution+0.128
Neg logit contributions
bows-0.125
ruck-0.107
hing-0.106
 were-0.103
 sizes-0.099
Token I
Token ID314
Feature activation+1.51e-02

Pos logit contributions
 mistake+0.037
 my+0.035
 away+0.034
eds+0.034
 cave+0.034
Neg logit contributions
 understand-0.045
bows-0.042
hing-0.040
owers-0.039
 were-0.037
Token don
Token ID836
Feature activation-8.62e-03

Pos logit contributions
 my+0.027
 me+0.023
 cage+0.021
 MY+0.021
 cave+0.021
Neg logit contributions
bows-0.028
hing-0.026
 understand-0.025
owers-0.024
ruck-0.023
Token't
Token ID470
Feature activation-1.57e-02

Pos logit contributions
bows+0.017
hing+0.015
owers+0.015
 sizes+0.014
ruck+0.014
Neg logit contributions
 my-0.015
 me-0.013
 MY-0.012
 mine-0.012
 cage-0.012
Token want
Token ID765
Feature activation+2.01e-02

Pos logit contributions
bows+0.021
hing+0.019
owers+0.018
ruck+0.016
 understand+0.016
Neg logit contributions
 my-0.037
 me-0.033
 mine-0.031
 cage-0.031
 MY-0.030
Token that
Token ID326
Feature activation-3.74e-03

Pos logit contributions
bows+0.003
hing+0.003
 understand+0.003
 were+0.002
owers+0.002
Neg logit contributions
 my-0.004
 me-0.003
 MY-0.003
 cage-0.003
 cave-0.003
Token it
Token ID340
Feature activation+6.55e-03

Pos logit contributions
bows+0.005
hing+0.005
 understand+0.005
owers+0.005
 sizes+0.005
Neg logit contributions
 my-0.008
 me-0.007
 cage-0.007
 MY-0.007
 mine-0.006
Token came
Token ID1625
Feature activation+7.11e-03

Pos logit contributions
 my+0.013
 me+0.011
 cage+0.010
 MY+0.010
 mine+0.010
Neg logit contributions
bows-0.011
hing-0.010
 understand-0.010
owers-0.009
 sizes-0.009
Token from
Token ID422
Feature activation+2.69e-02

Pos logit contributions
 my+0.010
 me+0.008
 cage+0.007
 MY+0.007
 mine+0.007
Neg logit contributions
bows-0.016
hing-0.015
owers-0.015
 understand-0.014
ruck-0.014
Token a
Token ID257
Feature activation+2.73e-02

Pos logit contributions
 my+0.046
 me+0.038
 MY+0.037
 cage+0.036
 mine+0.035
Neg logit contributions
bows-0.048
hing-0.047
 understand-0.045
ruck-0.043
owers-0.043
Token far
Token ID1290
Feature activation+8.10e-02

Pos logit contributions
 my+0.048
 slams+0.039
 MY+0.039
fired+0.037
riage+0.037
Neg logit contributions
 understand-0.043
bows-0.043
hing-0.041
ruck-0.038
 were-0.038
Tokenaway
Token ID8272
Feature activation+5.83e-02

Pos logit contributions
lor+0.093
 my+0.087
 MY+0.079
 hasn+0.077
fired+0.076
Neg logit contributions
hing-0.156
ruck-0.150
bows-0.148
 understand-0.147
aff-0.138
Token place
Token ID1295
Feature activation+9.99e-03

Pos logit contributions
 my+0.098
 MY+0.097
oh+0.084
shadow+0.083
lor+0.083
Neg logit contributions
bows-0.091
 sizes-0.079
 dancers-0.078
ruck-0.076
 were-0.074
Token.
Token ID13
Feature activation-1.77e-03

Pos logit contributions
 my+0.020
 me+0.017
 MY+0.017
 mine+0.015
 cage+0.015
Neg logit contributions
bows-0.017
hing-0.015
 understand-0.014
owers-0.014
 were-0.014
Token The
Token ID383
Feature activation-1.84e-03

Pos logit contributions
bows+0.002
owers+0.002
hing+0.002
 understand+0.002
 sizes+0.002
Neg logit contributions
 my-0.004
 me-0.003
 cave-0.003
 cage-0.003
 mine-0.003
Token all
Token ID477
Feature activation-1.92e-03

Pos logit contributions
bows+0.003
hing+0.003
owers+0.003
 understand+0.003
ruck+0.003
Neg logit contributions
 my-0.003
 me-0.003
 cage-0.003
 cave-0.003
 mine-0.003
Token send
Token ID3758
Feature activation-4.67e-03

Pos logit contributions
bows+0.023
hing+0.022
owers+0.021
ruck+0.019
 understand+0.019
Neg logit contributions
 my-0.027
 me-0.024
 cage-0.022
 mine-0.021
 cave-0.021
Token a
Token ID257
Feature activation+1.45e-02

Pos logit contributions
bows+0.008
hing+0.007
 understand+0.007
ruck+0.007
owers+0.007
Neg logit contributions
 my-0.009
 cave-0.007
 MY-0.007
 cage-0.007
 me-0.007
Token missile
Token ID10105
Feature activation+1.78e-02

Pos logit contributions
 my+0.024
 me+0.020
 MY+0.019
 mine+0.018
 cage+0.018
Neg logit contributions
bows-0.028
hing-0.026
 understand-0.026
ruck-0.025
owers-0.024
Token to
Token ID284
Feature activation+2.12e-03

Pos logit contributions
 my+0.029
 me+0.024
 MY+0.023
 gru+0.022
 cage+0.022
Neg logit contributions
bows-0.035
hing-0.033
 understand-0.031
owers-0.030
ruck-0.030
Token a
Token ID257
Feature activation+2.84e-02

Pos logit contributions
 my+0.003
 MY+0.003
 gru+0.003
 me+0.003
 cage+0.003
Neg logit contributions
 understand-0.004
hing-0.004
bows-0.004
ruck-0.004
owers-0.003
Token far
Token ID1290
Feature activation+8.08e-02

Pos logit contributions
 my+0.052
 MY+0.044
fired+0.043
 slams+0.042
ar+0.041
Neg logit contributions
 understand-0.044
bows-0.041
hing-0.039
ruck-0.039
 dancers-0.035
Tokenaway
Token ID8272
Feature activation+6.26e-02

Pos logit contributions
lor+0.112
 gru+0.107
 my+0.101
 MY+0.100
 Does+0.100
Neg logit contributions
hing-0.135
ruck-0.131
 understand-0.127
bows-0.125
aff-0.116
Token place
Token ID1295
Feature activation+1.02e-02

Pos logit contributions
 MY+0.113
 gru+0.111
 my+0.110
oh+0.102
 timeout+0.100
Neg logit contributions
bows-0.083
 dancers-0.071
ruck-0.070
 sizes-0.066
 were-0.066
Token.
Token ID13
Feature activation+2.58e-03

Pos logit contributions
 my+0.021
 me+0.018
 MY+0.018
 gru+0.017
 mine+0.017
Neg logit contributions
bows-0.015
hing-0.013
 understand-0.013
 were-0.013
ruck-0.013
Token The
Token ID383
Feature activation-1.10e-02

Pos logit contributions
 my+0.005
 me+0.004
 mine+0.004
 MY+0.004
 cage+0.004
Neg logit contributions
bows-0.005
hing-0.004
owers-0.004
 understand-0.004
ruck-0.004
Token missile
Token ID10105
Feature activation+1.86e-02

Pos logit contributions
bows+0.025
hing+0.022
owers+0.022
 sizes+0.020
ickets+0.020
Neg logit contributions
 my-0.017
 me-0.015
 cage-0.013
 cave-0.013
 door-0.012
Token just
Token ID655
Feature activation+4.49e-03

Pos logit contributions
 my+0.023
 me+0.020
 cage+0.019
 MY+0.019
 mine+0.019
Neg logit contributions
bows-0.020
hing-0.019
owers-0.017
 understand-0.017
ruck-0.016
Token came
Token ID1625
Feature activation+2.41e-02

Pos logit contributions
 my+0.010
 me+0.008
 cage+0.008
 MY+0.008
 cave+0.008
Neg logit contributions
bows-0.007
hing-0.006
owers-0.006
 understand-0.006
ruck-0.005
Token back
Token ID736
Feature activation+4.44e-02

Pos logit contributions
 my+0.033
 MY+0.028
 cage+0.028
 me+0.027
 cave+0.027
Neg logit contributions
bows-0.047
 understand-0.046
hing-0.046
ruck-0.045
owers-0.042
Token from
Token ID422
Feature activation+4.81e-02

Pos logit contributions
 my+0.063
 gru+0.058
 cage+0.057
 MY+0.056
 cave+0.054
Neg logit contributions
bows-0.084
hing-0.076
owers-0.076
ruck-0.076
 understand-0.075
Token a
Token ID257
Feature activation+3.06e-02

Pos logit contributions
 my+0.056
 MY+0.053
ar+0.051
lor+0.051
 cage+0.050
Neg logit contributions
ruck-0.091
bows-0.090
 understand-0.090
hing-0.089
 dancers-0.085
Token far
Token ID1290
Feature activation+8.06e-02

Pos logit contributions
 my+0.052
 me+0.043
 MY+0.043
fired+0.040
 slams+0.039
Neg logit contributions
bows-0.052
hing-0.049
 understand-0.048
ruck-0.048
owers-0.044
Tokenaway
Token ID8272
Feature activation+5.74e-02

Pos logit contributions
 my+0.081
 MY+0.074
 gru+0.072
lor+0.070
ution+0.067
Neg logit contributions
hing-0.168
bows-0.164
ruck-0.163
 understand-0.152
ances-0.152
Token place
Token ID1295
Feature activation+2.53e-02

Pos logit contributions
 my+0.089
 MY+0.084
oh+0.077
 gru+0.074
 Help+0.072
Neg logit contributions
bows-0.101
ruck-0.090
 sizes-0.086
hing-0.085
 dancers-0.084
Token.
Token ID13
Feature activation+2.22e-02

Pos logit contributions
 my+0.042
 MY+0.037
 me+0.036
 gru+0.035
 cage+0.034
Neg logit contributions
bows-0.046
hing-0.039
owers-0.038
ruck-0.038
 sizes-0.038
Token This
Token ID770
Feature activation+1.62e-02

Pos logit contributions
 my+0.034
 away+0.031
 cage+0.031
 mistake+0.031
 me+0.031
Neg logit contributions
bows-0.040
 understand-0.039
hing-0.038
ickets-0.035
owers-0.035
Token is
Token ID318
Feature activation+2.04e-02

Pos logit contributions
 my+0.022
 me+0.017
 MY+0.017
 cage+0.016
 mine+0.015
Neg logit contributions
bows-0.036
hing-0.035
owers-0.033
 understand-0.033
ruck-0.032
Token lots
Token ID6041
Feature activation-1.18e-02

Pos logit contributions
bows+0.022
hing+0.021
owers+0.020
 understand+0.019
ances+0.018
Neg logit contributions
 my-0.022
 me-0.019
 cage-0.017
 mine-0.017
 cave-0.017
Token of
Token ID286
Feature activation-2.85e-02

Pos logit contributions
bows+0.021
hing+0.020
 understand+0.019
owers+0.019
ances+0.017
Neg logit contributions
 my-0.023
 me-0.020
 cage-0.019
 cave-0.018
 mine-0.017
Token animals
Token ID4695
Feature activation-4.98e-02

Pos logit contributions
bows+0.054
hing+0.052
owers+0.049
 understand+0.047
ances+0.045
Neg logit contributions
 my-0.054
 me-0.048
 cage-0.044
 cave-0.043
 mine-0.041
Token of
Token ID286
Feature activation-2.13e-02

Pos logit contributions
bows+0.125
hing+0.119
owers+0.110
 understand+0.110
ances+0.109
Neg logit contributions
 my-0.054
 me-0.046
 cage-0.044
 cave-0.041
 this-0.035
Token all
Token ID477
Feature activation-2.38e-02

Pos logit contributions
bows+0.038
hing+0.037
owers+0.034
 understand+0.032
ances+0.031
Neg logit contributions
 my-0.043
 me-0.039
 cage-0.035
 cave-0.034
 mine-0.033
Token sizes
Token ID10620
Feature activation-9.59e-02

Pos logit contributions
bows+0.048
hing+0.045
owers+0.044
 understand+0.042
ruck+0.040
Neg logit contributions
 my-0.040
 me-0.034
 cage-0.031
 cave-0.030
 mine-0.029
Token.
Token ID13
Feature activation-1.00e-02

Pos logit contributions
bows+0.235
hing+0.233
 understand+0.232
owers+0.213
ances+0.212
Neg logit contributions
 my-0.095
 me-0.077
 cage-0.068
 this-0.063
 cave-0.062
Token One
Token ID1881
Feature activation+1.31e-02

Pos logit contributions
bows+0.018
hing+0.018
owers+0.016
 understand+0.016
ruck+0.016
Neg logit contributions
 my-0.018
 me-0.016
 cage-0.015
 MY-0.014
 mine-0.014
Token day
Token ID1110
Feature activation-1.42e-02

Pos logit contributions
 my+0.028
 me+0.025
 MY+0.024
 mine+0.024
 gru+0.023
Neg logit contributions
bows-0.018
hing-0.017
 understand-0.016
owers-0.016
ruck-0.015
Token,
Token ID11
Feature activation-1.13e-02

Pos logit contributions
bows+0.035
hing+0.034
owers+0.032
 understand+0.031
ruck+0.031
Neg logit contributions
 my-0.018
 me-0.014
 mine-0.012
 MY-0.012
 cage-0.012
Token a
Token ID257
Feature activation+1.75e-02

Pos logit contributions
bows+0.023
hing+0.022
owers+0.021
 understand+0.020
ruck+0.020
Neg logit contributions
 my-0.019
 me-0.016
 mine-0.014
 cage-0.014
 MY-0.014
Token creatures
Token ID8109
Feature activation-4.70e-02

Pos logit contributions
bows+0.041
hing+0.039
owers+0.036
 understand+0.034
akes+0.034
Neg logit contributions
 my-0.047
 me-0.042
 cage-0.037
 mine-0.037
 your-0.036
Token of
Token ID286
Feature activation-1.89e-02

Pos logit contributions
bows+0.111
hing+0.106
owers+0.099
ruck+0.099
 understand+0.098
Neg logit contributions
 my-0.059
 me-0.050
 cage-0.047
 cave-0.044
 mine-0.039
Token all
Token ID477
Feature activation-2.22e-02

Pos logit contributions
bows+0.035
hing+0.033
owers+0.031
akes+0.028
 understand+0.028
Neg logit contributions
 my-0.037
 me-0.033
 cage-0.029
 cave-0.029
 mine-0.028
Token shapes
Token ID15268
Feature activation-7.83e-02

Pos logit contributions
bows+0.042
hing+0.039
owers+0.038
 understand+0.036
ruck+0.035
Neg logit contributions
 my-0.040
 me-0.034
 cage-0.031
 cave-0.030
 mine-0.030
Token and
Token ID290
Feature activation-3.15e-02

Pos logit contributions
bows+0.184
hing+0.182
 understand+0.176
owers+0.172
ances+0.171
Neg logit contributions
 my-0.085
 me-0.069
 cage-0.065
 cave-0.062
 this-0.060
Token sizes
Token ID10620
Feature activation-9.55e-02

Pos logit contributions
hing+0.052
owers+0.051
bows+0.051
akes+0.047
anes+0.045
Neg logit contributions
 my-0.064
 me-0.057
 cave-0.050
 cage-0.049
 your-0.048
Token.
Token ID13
Feature activation-7.75e-03

Pos logit contributions
bows+0.230
hing+0.229
 understand+0.227
owers+0.215
ances+0.211
Neg logit contributions
 my-0.087
 me-0.070
 cage-0.064
 this-0.058
 mine-0.058
Token One
Token ID1881
Feature activation+1.55e-02

Pos logit contributions
bows+0.014
hing+0.013
owers+0.013
 understand+0.012
ruck+0.012
Neg logit contributions
 my-0.014
 me-0.012
 cage-0.011
 MY-0.011
 mine-0.011
Token of
Token ID286
Feature activation-1.08e-02

Pos logit contributions
 my+0.034
 me+0.030
 MY+0.029
 mine+0.029
 gru+0.028
Neg logit contributions
bows-0.021
hing-0.020
 understand-0.019
owers-0.018
ruck-0.017
Token these
Token ID777
Feature activation-3.34e-02

Pos logit contributions
bows+0.024
hing+0.022
owers+0.021
 understand+0.021
ruck+0.020
Neg logit contributions
 my-0.017
 me-0.014
 cage-0.012
 MY-0.012
 cave-0.012
Token creatures
Token ID8109
Feature activation-3.56e-02

Pos logit contributions
bows+0.051
hing+0.050
 understand+0.044
ruck+0.044
owers+0.043
Neg logit contributions
 my-0.070
 me-0.064
 cage-0.058
 cave-0.057
 mine-0.056
Token all
Token ID477
Feature activation+1.51e-02

Pos logit contributions
 my+0.059
 MY+0.052
ar+0.051
 me+0.050
 cage+0.049
Neg logit contributions
bows-0.066
 understand-0.064
ruck-0.062
hing-0.061
 were-0.060
Token sorts
Token ID10524
Feature activation-4.62e-02

Pos logit contributions
 my+0.025
 MY+0.021
 cage+0.021
 me+0.020
 mine+0.020
Neg logit contributions
hing-0.027
bows-0.026
ruck-0.025
 understand-0.024
 sizes-0.024
Token of
Token ID286
Feature activation-2.29e-02

Pos logit contributions
bows+0.086
hing+0.078
owers+0.076
 understand+0.073
bee+0.070
Neg logit contributions
 my-0.085
 me-0.074
 this-0.065
 mine-0.064
 away-0.063
Token shapes
Token ID15268
Feature activation-6.99e-02

Pos logit contributions
bows+0.045
hing+0.043
owers+0.041
 understand+0.038
ickets+0.036
Neg logit contributions
 my-0.041
 me-0.036
 mine-0.031
 cave-0.031
 cage-0.031
Token and
Token ID290
Feature activation-3.15e-02

Pos logit contributions
bows+0.146
hing+0.145
owers+0.136
 understand+0.136
ruck+0.129
Neg logit contributions
 my-0.103
 me-0.087
 cave-0.076
 cage-0.076
 mine-0.075
Token sizes
Token ID10620
Feature activation-9.40e-02

Pos logit contributions
bows+0.050
hing+0.049
owers+0.047
ickets+0.041
 understand+0.041
Neg logit contributions
 my-0.066
 me-0.058
 mine-0.052
 cave-0.051
 this-0.050
Token.
Token ID13
Feature activation-1.30e-02

Pos logit contributions
bows+0.208
hing+0.207
 understand+0.201
owers+0.198
ances+0.190
Neg logit contributions
 my-0.111
 me-0.092
 this-0.082
 mine-0.080
 away-0.078
Token She
Token ID1375
Feature activation+4.20e-05

Pos logit contributions
bows+0.023
hing+0.022
owers+0.021
 understand+0.020
ruck+0.020
Neg logit contributions
 my-0.023
 me-0.020
 cage-0.019
 cave-0.019
 mine-0.018
Token was
Token ID373
Feature activation-2.27e-02

Pos logit contributions
 my+0.000
 me+0.000
 cage+0.000
 MY+0.000
 mine+0.000
Neg logit contributions
bows-0.000
hing-0.000
owers-0.000
 understand-0.000
ruck-0.000
Token so
Token ID523
Feature activation-1.71e-02

Pos logit contributions
bows+0.045
hing+0.043
owers+0.041
 understand+0.039
ruck+0.039
Neg logit contributions
 my-0.038
 me-0.032
 cage-0.029
 MY-0.029
 mine-0.029
Token happy
Token ID3772
Feature activation-1.51e-02

Pos logit contributions
bows+0.030
hing+0.029
owers+0.027
ruck+0.026
ances+0.025
Neg logit contributions
 my-0.034
 me-0.029
 mine-0.026
 your-0.026
 MY-0.026
Token different
Token ID1180
Feature activation-4.04e-02

Pos logit contributions
bows+0.053
hing+0.051
owers+0.048
 understand+0.045
ruck+0.044
Neg logit contributions
 my-0.053
 me-0.046
 cage-0.042
 mine-0.041
 cave-0.041
Token colors
Token ID7577
Feature activation-6.70e-02

Pos logit contributions
hing+0.066
bows+0.062
owers+0.059
 understand+0.058
ickets+0.055
Neg logit contributions
 my-0.081
 me-0.073
 mine-0.065
 cage-0.065
 cave-0.064
Token,
Token ID11
Feature activation-2.67e-02

Pos logit contributions
hing+0.138
bows+0.134
 understand+0.134
owers+0.130
ruck+0.122
Neg logit contributions
 my-0.099
 me-0.082
 mine-0.071
 this-0.069
 cage-0.067
Token shapes
Token ID15268
Feature activation-6.14e-02

Pos logit contributions
bows+0.045
hing+0.044
owers+0.042
 understand+0.040
ruck+0.038
Neg logit contributions
 my-0.052
 me-0.045
 mine-0.041
 cage-0.041
 MY-0.040
Token and
Token ID290
Feature activation-3.42e-02

Pos logit contributions
bows+0.127
hing+0.126
owers+0.121
 understand+0.121
ances+0.115
Neg logit contributions
 my-0.091
 me-0.077
 cage-0.067
 mine-0.066
 this-0.065
Token sizes
Token ID10620
Feature activation-9.05e-02

Pos logit contributions
hing+0.045
owers+0.044
bows+0.044
ickets+0.038
ruck+0.038
Neg logit contributions
 my-0.078
 me-0.070
 mine-0.064
 cage-0.062
 this-0.062
Token.
Token ID13
Feature activation-1.30e-02

Pos logit contributions
hing+0.199
bows+0.198
 understand+0.191
owers+0.189
ruck+0.181
Neg logit contributions
 my-0.115
 me-0.094
 mine-0.084
 this-0.082
 away-0.080
Token She
Token ID1375
Feature activation+7.74e-05

Pos logit contributions
bows+0.024
hing+0.022
owers+0.021
 understand+0.021
ruck+0.020
Neg logit contributions
 my-0.024
 me-0.020
 cage-0.019
 MY-0.018
 cave-0.018
Token carefully
Token ID7773
Feature activation-7.66e-03

Pos logit contributions
 my+0.000
 me+0.000
 cage+0.000
 mine+0.000
 MY+0.000
Neg logit contributions
bows-0.000
hing-0.000
owers-0.000
ruck-0.000
 understand-0.000
Token chose
Token ID7690
Feature activation-2.64e-03

Pos logit contributions
bows+0.012
hing+0.012
owers+0.011
 understand+0.011
ruck+0.010
Neg logit contributions
 my-0.015
 me-0.014
 cage-0.012
 MY-0.012
 mine-0.012
Token the
Token ID262
Feature activation-2.06e-02

Pos logit contributions
bows+0.005
hing+0.005
owers+0.005
 understand+0.004
ruck+0.004
Neg logit contributions
 my-0.005
 me-0.004
 cage-0.004
 MY-0.003
 mine-0.003
Token cars
Token ID5006
Feature activation-4.80e-02

Pos logit contributions
bows+0.044
hing+0.043
owers+0.042
ickets+0.037
 understand+0.037
Neg logit contributions
 my-0.056
 me-0.050
 cage-0.047
 cave-0.045
 mine-0.045
Token of
Token ID286
Feature activation-3.10e-02

Pos logit contributions
owers+0.099
bows+0.097
hing+0.096
bee+0.093
 understand+0.093
Neg logit contributions
 my-0.069
 me-0.057
 cage-0.055
 this-0.050
 away-0.050
Token different
Token ID1180
Feature activation-4.66e-02

Pos logit contributions
owers+0.070
hing+0.069
bows+0.066
ickets+0.064
akes+0.061
Neg logit contributions
 my-0.051
 me-0.042
 mine-0.037
 your-0.037
 cage-0.035
Token colors
Token ID7577
Feature activation-6.70e-02

Pos logit contributions
hing+0.072
owers+0.070
 understand+0.067
bows+0.066
ickets+0.064
Neg logit contributions
 my-0.095
 me-0.086
 cage-0.077
 mine-0.076
 cave-0.075
Token and
Token ID290
Feature activation-2.76e-02

Pos logit contributions
 understand+0.147
hing+0.146
owers+0.141
bows+0.138
mun+0.134
Neg logit contributions
 my-0.084
 me-0.068
 this-0.062
 cage-0.060
 door-0.059
Token sizes
Token ID10620
Feature activation-9.04e-02

Pos logit contributions
owers+0.032
hing+0.029
ickets+0.026
akes+0.025
anes+0.023
Neg logit contributions
 my-0.070
 me-0.062
 mine-0.060
 this-0.059
 your-0.058
Token.
Token ID13
Feature activation-1.40e-02

Pos logit contributions
hing+0.219
 understand+0.213
bows+0.211
owers+0.211
ickets+0.206
Neg logit contributions
 my-0.087
 me-0.064
 this-0.062
 away-0.062
 mine-0.060
Token They
Token ID1119
Feature activation-4.39e-02

Pos logit contributions
bows+0.025
hing+0.024
owers+0.023
 understand+0.022
ruck+0.021
Neg logit contributions
 my-0.026
 me-0.022
 cage-0.021
 cave-0.020
 MY-0.020
Token liked
Token ID8288
Feature activation-1.52e-02

Pos logit contributions
owers+0.094
bows+0.090
ickets+0.089
hing+0.086
anes+0.084
Neg logit contributions
 my-0.060
 me-0.058
 locked-0.055
 away-0.051
 mine-0.049
Token to
Token ID284
Feature activation-1.77e-02

Pos logit contributions
bows+0.038
hing+0.037
owers+0.035
 understand+0.034
ruck+0.033
Neg logit contributions
 my-0.018
 me-0.014
 mine-0.012
 cage-0.012
 MY-0.012
Token zoom
Token ID19792
Feature activation-1.65e-02

Pos logit contributions
bows+0.031
owers+0.030
hing+0.030
ruck+0.027
ickets+0.026
Neg logit contributions
 my-0.033
 me-0.029
 cage-0.026
 mine-0.026
 cave-0.025
Token cars
Token ID5006
Feature activation-4.71e-02

Pos logit contributions
bows+0.047
hing+0.046
owers+0.046
ickets+0.040
 understand+0.039
Neg logit contributions
 my-0.058
 me-0.051
 cage-0.048
 mine-0.046
 cave-0.046
Token of
Token ID286
Feature activation-2.78e-02

Pos logit contributions
bows+0.098
owers+0.098
hing+0.095
 understand+0.091
bee+0.090
Neg logit contributions
 my-0.070
 me-0.058
 cage-0.053
 away-0.050
 mine-0.050
Token different
Token ID1180
Feature activation-4.34e-02

Pos logit contributions
owers+0.057
bows+0.056
hing+0.055
ickets+0.049
akes+0.049
Neg logit contributions
 my-0.049
 me-0.043
 mine-0.037
 your-0.036
 cage-0.035
Token colors
Token ID7577
Feature activation-6.50e-02

Pos logit contributions
hing+0.063
owers+0.061
bows+0.061
 understand+0.059
ickets+0.055
Neg logit contributions
 my-0.093
 me-0.084
 mine-0.075
 cage-0.075
 cave-0.074
Token and
Token ID290
Feature activation-2.09e-02

Pos logit contributions
 understand+0.136
hing+0.136
bows+0.135
owers+0.133
ances+0.124
Neg logit contributions
 my-0.091
 me-0.075
 this-0.065
 mine-0.064
 cage-0.063
Token sizes
Token ID10620
Feature activation-8.99e-02

Pos logit contributions
owers+0.019
hing+0.018
bows+0.017
akes+0.014
ickets+0.014
Neg logit contributions
 my-0.058
 me-0.052
 mine-0.049
 this-0.048
 cage-0.047
Token.
Token ID13
Feature activation-6.25e-03

Pos logit contributions
bows+0.209
hing+0.207
owers+0.206
 understand+0.205
ances+0.192
Neg logit contributions
 my-0.101
 me-0.077
 mine-0.071
 this-0.070
 away-0.068
Token He
Token ID679
Feature activation-8.56e-04

Pos logit contributions
bows+0.012
hing+0.011
owers+0.011
 understand+0.010
ruck+0.010
Neg logit contributions
 my-0.011
 me-0.009
 cage-0.009
 MY-0.008
 mine-0.008
Token liked
Token ID8288
Feature activation-6.01e-03

Pos logit contributions
bows+0.001
hing+0.001
owers+0.001
ruck+0.001
ances+0.001
Neg logit contributions
 my-0.002
 me-0.001
 cage-0.001
 cave-0.001
 mine-0.001
Token to
Token ID284
Feature activation-1.53e-02

Pos logit contributions
bows+0.014
hing+0.014
owers+0.013
 understand+0.013
ruck+0.012
Neg logit contributions
 my-0.008
 me-0.006
 cage-0.006
 mine-0.006
 MY-0.005
Token make
Token ID787
Feature activation-2.68e-02

Pos logit contributions
bows+0.026
hing+0.025
owers+0.025
ruck+0.022
 understand+0.022
Neg logit contributions
 my-0.029
 me-0.026
 cage-0.023
 mine-0.023
 MY-0.023
Token cars
Token ID5006
Feature activation-4.87e-02

Pos logit contributions
bows+0.045
owers+0.042
hing+0.042
 understand+0.037
ickets+0.037
Neg logit contributions
 my-0.055
 me-0.048
 cage-0.044
 mine-0.043
 cave-0.042
Token of
Token ID286
Feature activation-2.80e-02

Pos logit contributions
owers+0.104
bows+0.102
hing+0.100
bee+0.096
 understand+0.096
Neg logit contributions
 my-0.071
 me-0.058
 cage-0.054
 this-0.052
 away-0.050
Token different
Token ID1180
Feature activation-4.49e-02

Pos logit contributions
owers+0.060
bows+0.059
hing+0.058
ickets+0.053
akes+0.052
Neg logit contributions
 my-0.049
 me-0.042
 mine-0.036
 your-0.035
 cage-0.034
Token colors
Token ID7577
Feature activation-6.75e-02

Pos logit contributions
hing+0.066
owers+0.065
 understand+0.063
bows+0.063
ickets+0.058
Neg logit contributions
 my-0.097
 me-0.087
 mine-0.077
 cage-0.076
 cave-0.075
Token and
Token ID290
Feature activation-2.37e-02

Pos logit contributions
 understand+0.150
hing+0.148
bows+0.145
owers+0.144
ances+0.136
Neg logit contributions
 my-0.089
 me-0.073
 this-0.064
 cage-0.059
 door-0.058
Token sizes
Token ID10620
Feature activation-8.97e-02

Pos logit contributions
owers+0.024
hing+0.022
bows+0.018
akes+0.018
ickets+0.018
Neg logit contributions
 my-0.066
 me-0.058
 mine-0.056
 this-0.055
 your-0.054
Token.
Token ID13
Feature activation-5.28e-03

Pos logit contributions
bows+0.222
hing+0.222
 understand+0.220
owers+0.217
ickets+0.206
Neg logit contributions
 my-0.087
 me-0.064
 this-0.058
 mine-0.056
 away-0.053
Token He
Token ID679
Feature activation-1.97e-03

Pos logit contributions
bows+0.009
hing+0.009
owers+0.009
 understand+0.008
ruck+0.008
Neg logit contributions
 my-0.010
 me-0.008
 cage-0.008
 MY-0.008
 mine-0.007
Token liked
Token ID8288
Feature activation-8.43e-03

Pos logit contributions
bows+0.004
hing+0.003
owers+0.003
ickets+0.003
ances+0.003
Neg logit contributions
 my-0.004
 me-0.003
 cage-0.003
 cave-0.003
 mine-0.003
Token to
Token ID284
Feature activation-1.51e-02

Pos logit contributions
bows+0.020
hing+0.020
owers+0.019
 understand+0.019
ruck+0.018
Neg logit contributions
 my-0.010
 me-0.008
 cage-0.007
 mine-0.007
 MY-0.007
Token make
Token ID787
Feature activation-3.03e-02

Pos logit contributions
bows+0.026
owers+0.025
hing+0.025
ruck+0.023
ickets+0.022
Neg logit contributions
 my-0.029
 me-0.025
 mine-0.023
 cage-0.023
 MY-0.022
Token toys
Token ID14958
Feature activation-4.63e-02

Pos logit contributions
hing+0.071
bows+0.067
owers+0.065
ances+0.062
ruck+0.060
Neg logit contributions
 my-0.047
 me-0.039
 your-0.033
 mine-0.032
 cage-0.032
Token of
Token ID286
Feature activation-2.96e-02

Pos logit contributions
bows+0.097
hing+0.093
owers+0.088
 understand+0.086
ruck+0.084
Neg logit contributions
 my-0.072
 me-0.060
 cage-0.054
 mine-0.053
 cave-0.053
Token all
Token ID477
Feature activation-2.25e-02

Pos logit contributions
hing+0.063
bows+0.060
owers+0.058
ickets+0.054
anes+0.053
Neg logit contributions
 my-0.053
 me-0.045
 your-0.040
 mine-0.038
 cage-0.035
Token shapes
Token ID15268
Feature activation-7.67e-02

Pos logit contributions
bows+0.041
hing+0.040
owers+0.038
 understand+0.036
ruck+0.035
Neg logit contributions
 my-0.041
 me-0.035
 cage-0.032
 mine-0.032
 MY-0.031
Token and
Token ID290
Feature activation-3.22e-02

Pos logit contributions
hing+0.173
bows+0.167
 understand+0.164
owers+0.157
ruck+0.156
Neg logit contributions
 my-0.108
 me-0.087
 this-0.074
 cage-0.073
 cave-0.072
Token sizes
Token ID10620
Feature activation-8.95e-02

Pos logit contributions
hing+0.054
owers+0.051
uttering+0.047
anes+0.047
ickets+0.047
Neg logit contributions
 my-0.068
 me-0.061
 your-0.053
 mine-0.053
 this-0.052
Token.
Token ID13
Feature activation-1.02e-02

Pos logit contributions
hing+0.219
 understand+0.213
bows+0.206
owers+0.199
ances+0.197
Neg logit contributions
 my-0.096
 me-0.077
 mine-0.065
 this-0.064
 away-0.062
Token Her
Token ID2332
Feature activation-1.45e-02

Pos logit contributions
bows+0.020
hing+0.019
owers+0.018
 understand+0.017
ruck+0.017
Neg logit contributions
 my-0.018
 me-0.015
 cage-0.014
 MY-0.014
 mine-0.013
Token favorite
Token ID4004
Feature activation-8.85e-03

Pos logit contributions
hing+0.027
bows+0.026
ruck+0.024
 understand+0.023
owers+0.023
Neg logit contributions
 my-0.026
 me-0.023
 cage-0.021
 mine-0.021
 cave-0.021
Token were
Token ID547
Feature activation-2.28e-02

Pos logit contributions
bows+0.015
hing+0.014
owers+0.013
 understand+0.013
ruck+0.013
Neg logit contributions
 my-0.017
 me-0.015
 cage-0.014
 MY-0.014
 cave-0.014
Token two
Token ID734
Feature activation-3.81e-02

Pos logit contributions
bows+0.044
hing+0.044
owers+0.041
 understand+0.040
ruck+0.039
Neg logit contributions
 my-0.040
 me-0.034
 cage-0.030
 mine-0.030
 MY-0.030
Token trains
Token ID13404
Feature activation-5.57e-02

Pos logit contributions
bows+0.041
hing+0.040
owers+0.037
 understand+0.036
ruck+0.034
Neg logit contributions
 my-0.043
 me-0.038
 cage-0.035
 cave-0.034
 mine-0.034
Token of
Token ID286
Feature activation-2.46e-02

Pos logit contributions
bows+0.104
hing+0.099
owers+0.097
 understand+0.095
ruck+0.090
Neg logit contributions
 my-0.092
 me-0.081
 cage-0.073
 mine-0.071
 away-0.070
Token different
Token ID1180
Feature activation-3.03e-02

Pos logit contributions
bows+0.044
hing+0.042
owers+0.041
 understand+0.038
ickets+0.036
Neg logit contributions
 my-0.047
 me-0.041
 cage-0.037
 mine-0.036
 cave-0.035
Token colors
Token ID7577
Feature activation-5.82e-02

Pos logit contributions
bows+0.042
hing+0.041
owers+0.039
 understand+0.037
ickets+0.034
Neg logit contributions
 my-0.067
 me-0.060
 cage-0.055
 mine-0.054
 cave-0.054
Token and
Token ID290
Feature activation-2.27e-02

Pos logit contributions
hing+0.118
bows+0.118
 understand+0.115
owers+0.114
ruck+0.106
Neg logit contributions
 my-0.086
 me-0.071
 mine-0.062
 cage-0.060
 this-0.059
Token sizes
Token ID10620
Feature activation-8.89e-02

Pos logit contributions
bows+0.029
hing+0.028
owers+0.028
ruck+0.024
 understand+0.023
Neg logit contributions
 my-0.053
 me-0.047
 mine-0.044
 cage-0.043
 cave-0.042
Token.
Token ID13
Feature activation-8.41e-03

Pos logit contributions
bows+0.197
hing+0.197
owers+0.190
 understand+0.190
ruck+0.180
Neg logit contributions
 my-0.110
 me-0.087
 mine-0.079
 this-0.076
 away-0.076
Token They
Token ID1119
Feature activation-4.04e-02

Pos logit contributions
bows+0.015
hing+0.014
owers+0.013
 understand+0.013
ruck+0.012
Neg logit contributions
 my-0.016
 me-0.013
 cage-0.013
 MY-0.012
 cave-0.012
Token also
Token ID635
Feature activation-1.40e-02

Pos logit contributions
owers+0.067
bows+0.066
ruck+0.063
hing+0.060
ickets+0.059
Neg logit contributions
 my-0.076
 me-0.070
 away-0.063
 mine-0.062
 this-0.060
Token have
Token ID423
Feature activation-1.95e-02

Pos logit contributions
bows+0.022
owers+0.022
hing+0.021
ruck+0.020
ickets+0.019
Neg logit contributions
 my-0.028
 me-0.025
 cage-0.023
 mine-0.022
 away-0.022
Token a
Token ID257
Feature activation-1.38e-02

Pos logit contributions
bows+0.036
hing+0.035
owers+0.033
 understand+0.031
ruck+0.030
Neg logit contributions
 my-0.036
 me-0.031
 mine-0.028
 cage-0.028
 MY-0.028
Token cubes
Token ID34896
Feature activation-4.83e-02

Pos logit contributions
bows+0.045
hing+0.041
owers+0.039
 understand+0.036
ickets+0.034
Neg logit contributions
 my-0.068
 me-0.060
 mine-0.056
 cage-0.055
 cave-0.055
Token of
Token ID286
Feature activation-4.21e-02

Pos logit contributions
hing+0.096
bows+0.092
 understand+0.088
owers+0.086
ruck+0.083
Neg logit contributions
 my-0.080
 me-0.066
 cage-0.062
 mine-0.061
 away-0.058
Token different
Token ID1180
Feature activation-4.96e-02

Pos logit contributions
hing+0.094
bows+0.093
owers+0.091
ickets+0.084
bee+0.083
Neg logit contributions
 my-0.070
 me-0.059
 mine-0.051
 your-0.049
 this-0.047
Token colors
Token ID7577
Feature activation-6.57e-02

Pos logit contributions
hing+0.072
bows+0.072
owers+0.067
 understand+0.066
ickets+0.062
Neg logit contributions
 my-0.108
 me-0.097
 mine-0.088
 cave-0.087
 cage-0.085
Token and
Token ID290
Feature activation-2.65e-02

Pos logit contributions
hing+0.147
 understand+0.143
bows+0.141
owers+0.136
mun+0.133
Neg logit contributions
 my-0.087
 me-0.070
 mine-0.061
 this-0.061
 cage-0.058
Token sizes
Token ID10620
Feature activation-8.87e-02

Pos logit contributions
owers+0.031
hing+0.031
ickets+0.028
bows+0.027
anes+0.025
Neg logit contributions
 my-0.067
 mine-0.059
 me-0.058
 this-0.055
 your-0.054
Token.
Token ID13
Feature activation-1.66e-02

Pos logit contributions
hing+0.217
bows+0.212
 understand+0.207
owers+0.202
ickets+0.198
Neg logit contributions
 my-0.096
 me-0.072
 mine-0.067
 this-0.062
 away-0.060
Token They
Token ID1119
Feature activation-4.52e-02

Pos logit contributions
bows+0.030
hing+0.028
owers+0.027
 understand+0.026
ruck+0.025
Neg logit contributions
 my-0.031
 me-0.026
 cage-0.025
 cave-0.024
 MY-0.024
Token can
Token ID460
Feature activation-3.84e-02

Pos logit contributions
owers+0.096
bows+0.095
itted+0.089
ruck+0.088
anes+0.087
Neg logit contributions
 my-0.072
 me-0.066
 mine-0.062
 away-0.056
 this-0.053
Token make
Token ID787
Feature activation-3.73e-02

Pos logit contributions
bows+0.076
owers+0.071
hing+0.069
ruck+0.066
bee+0.064
Neg logit contributions
 my-0.070
 me-0.061
 mine-0.055
 your-0.051
 this-0.050
Token towers
Token ID18028
Feature activation-4.22e-02

Pos logit contributions
bows+0.077
owers+0.074
hing+0.074
bee+0.067
 understand+0.067
Neg logit contributions
 my-0.064
 me-0.054
 mine-0.047
 cage-0.046
 this-0.045
Token had
Token ID550
Feature activation-3.01e-02

Pos logit contributions
bows+0.094
owers+0.089
ickets+0.089
hing+0.087
anes+0.084
Neg logit contributions
 my-0.057
 me-0.050
 locked-0.049
 cage-0.048
 cave-0.046
Token many
Token ID867
Feature activation-2.89e-02

Pos logit contributions
hing+0.057
bows+0.056
owers+0.052
ickets+0.049
 understand+0.049
Neg logit contributions
 my-0.055
 me-0.048
 cage-0.045
 cave-0.044
 mine-0.043
Token boxes
Token ID10559
Feature activation-4.14e-02

Pos logit contributions
bows+0.053
hing+0.052
owers+0.049
 understand+0.046
ickets+0.045
Neg logit contributions
 my-0.054
 me-0.047
 cage-0.044
 cave-0.044
 mine-0.042
Token of
Token ID286
Feature activation-3.71e-02

Pos logit contributions
bows+0.083
hing+0.083
owers+0.077
 understand+0.076
ruck+0.073
Neg logit contributions
 my-0.066
 me-0.055
 cage-0.052
 cave-0.050
 mine-0.049
Token different
Token ID1180
Feature activation-4.95e-02

Pos logit contributions
bows+0.072
hing+0.071
owers+0.068
ickets+0.064
 understand+0.061
Neg logit contributions
 my-0.068
 me-0.059
 cage-0.053
 cave-0.052
 mine-0.051
Token sizes
Token ID10620
Feature activation-8.82e-02

Pos logit contributions
hing+0.081
bows+0.076
 understand+0.073
owers+0.073
ickets+0.072
Neg logit contributions
 my-0.100
 me-0.089
 cave-0.083
 cage-0.082
 mine-0.079
Token and
Token ID290
Feature activation-2.17e-02

Pos logit contributions
hing+0.199
bows+0.191
 understand+0.189
owers+0.180
ickets+0.176
Neg logit contributions
 my-0.115
 me-0.092
 this-0.084
 cage-0.084
 mine-0.083
Token shapes
Token ID15268
Feature activation-7.66e-02

Pos logit contributions
hing+0.022
bows+0.021
owers+0.020
 understand+0.018
ickets+0.017
Neg logit contributions
 my-0.058
 me-0.052
 mine-0.049
 cage-0.048
 cave-0.048
Token.
Token ID13
Feature activation-1.61e-02

Pos logit contributions
hing+0.182
bows+0.172
 understand+0.171
owers+0.168
ickets+0.163
Neg logit contributions
 my-0.088
 me-0.069
 cage-0.069
 away-0.064
 this-0.063
Token They
Token ID1119
Feature activation-4.40e-02

Pos logit contributions
bows+0.031
hing+0.029
owers+0.028
 understand+0.027
ruck+0.026
Neg logit contributions
 my-0.028
 me-0.023
 cage-0.022
 cave-0.022
 MY-0.021
Token used
Token ID973
Feature activation-3.75e-02

Pos logit contributions
bows+0.094
owers+0.089
hing+0.088
ickets+0.088
anes+0.082
Neg logit contributions
 my-0.064
 me-0.060
 locked-0.058
 cage-0.054
 cave-0.052
Token paint
Token ID7521
Feature activation-3.90e-02

Pos logit contributions
bows+0.010
hing+0.010
owers+0.010
ickets+0.009
akes+0.009
Neg logit contributions
 my-0.009
 me-0.007
 this-0.006
 your-0.006
 mine-0.006
Tokenbr
Token ID1671
Feature activation-4.26e-02

Pos logit contributions
bows+0.080
hing+0.080
 understand+0.077
owers+0.077
ruck+0.072
Neg logit contributions
 my-0.060
 me-0.052
 mine-0.043
 cage-0.043
 away-0.042
Tokenushes
Token ID17237
Feature activation-4.48e-02

Pos logit contributions
hing+0.067
bows+0.062
owers+0.057
 understand+0.055
bee+0.054
Neg logit contributions
 my-0.088
 me-0.077
 cage-0.072
 away-0.068
 your-0.068
Token of
Token ID286
Feature activation-2.04e-02

Pos logit contributions
bows+0.090
hing+0.087
owers+0.083
 understand+0.081
ruck+0.078
Neg logit contributions
 my-0.073
 me-0.061
 cage-0.055
 mine-0.054
 cave-0.054
Token all
Token ID477
Feature activation-1.03e-02

Pos logit contributions
bows+0.037
hing+0.035
owers+0.034
 understand+0.032
ickets+0.030
Neg logit contributions
 my-0.039
 me-0.034
 mine-0.030
 cage-0.030
 cave-0.029
Token sizes
Token ID10620
Feature activation-8.81e-02

Pos logit contributions
bows+0.021
hing+0.020
owers+0.019
 understand+0.018
ruck+0.018
Neg logit contributions
 my-0.017
 me-0.014
 cage-0.013
 MY-0.013
 mine-0.013
Token.
Token ID13
Feature activation-2.70e-03

Pos logit contributions
 understand+0.219
hing+0.216
bows+0.214
owers+0.203
mun+0.200
Neg logit contributions
 my-0.091
 me-0.066
 this-0.059
 mine-0.057
 away-0.054
Token\n
Token ID198
Feature activation-3.78e-03

Pos logit contributions
bows+0.005
hing+0.005
owers+0.005
 understand+0.004
ruck+0.004
Neg logit contributions
 my-0.005
 me-0.004
 cage-0.004
 MY-0.004
 mine-0.004
Token\n
Token ID198
Feature activation+9.74e-03

Pos logit contributions
bows+0.008
hing+0.007
owers+0.007
 understand+0.007
ruck+0.006
Neg logit contributions
 my-0.006
 me-0.005
 cage-0.005
 cave-0.005
 MY-0.005
TokenThe
Token ID464
Feature activation-3.86e-03

Pos logit contributions
 my+0.017
 me+0.015
 cage+0.013
 MY+0.013
 mine+0.013
Neg logit contributions
bows-0.019
hing-0.018
owers-0.017
 understand-0.016
ruck-0.016
Token painter
Token ID34537
Feature activation+9.13e-03

Pos logit contributions
bows+0.006
hing+0.006
owers+0.005
 understand+0.005
 sizes+0.005
Neg logit contributions
 my-0.009
 me-0.008
 cage-0.007
 mine-0.007
 cave-0.007
Token cars
Token ID5006
Feature activation-4.51e-02

Pos logit contributions
bows+0.041
owers+0.040
hing+0.040
 understand+0.034
ickets+0.034
Neg logit contributions
 my-0.054
 me-0.048
 cage-0.045
 mine-0.043
 cave-0.043
Token of
Token ID286
Feature activation-2.61e-02

Pos logit contributions
bows+0.095
owers+0.094
hing+0.092
 understand+0.087
bee+0.085
Neg logit contributions
 my-0.066
 me-0.055
 cage-0.050
 away-0.048
 mine-0.047
Token different
Token ID1180
Feature activation-3.93e-02

Pos logit contributions
owers+0.053
bows+0.052
hing+0.051
ickets+0.046
akes+0.045
Neg logit contributions
 my-0.047
 me-0.040
 mine-0.035
 your-0.034
 cage-0.034
Token colors
Token ID7577
Feature activation-6.20e-02

Pos logit contributions
hing+0.055
owers+0.054
bows+0.053
 understand+0.051
ickets+0.047
Neg logit contributions
 my-0.087
 me-0.078
 mine-0.070
 cage-0.070
 cave-0.069
Token and
Token ID290
Feature activation-2.23e-02

Pos logit contributions
hing+0.126
 understand+0.125
bows+0.124
owers+0.122
ances+0.113
Neg logit contributions
 my-0.090
 me-0.075
 cage-0.065
 mine-0.064
 this-0.064
Token sizes
Token ID10620
Feature activation-8.80e-02

Pos logit contributions
owers+0.018
hing+0.018
bows+0.016
ickets+0.014
ruck+0.013
Neg logit contributions
 my-0.063
 me-0.056
 mine-0.054
 cage-0.052
 this-0.052
Token,
Token ID11
Feature activation-4.40e-02

Pos logit contributions
bows+0.195
hing+0.195
 understand+0.191
owers+0.191
ruck+0.179
Neg logit contributions
 my-0.109
 me-0.086
 mine-0.079
 away-0.076
 this-0.075
Token but
Token ID475
Feature activation-1.97e-02

Pos logit contributions
hing+0.080
owers+0.080
bows+0.075
 understand+0.075
ickets+0.074
Neg logit contributions
 my-0.079
 me-0.065
 mine-0.059
 this-0.058
 your-0.056
Token their
Token ID511
Feature activation-2.62e-02

Pos logit contributions
bows+0.033
hing+0.031
owers+0.029
 understand+0.028
ruck+0.027
Neg logit contributions
 my-0.039
 me-0.034
 cage-0.032
 MY-0.031
 mine-0.031
Token favorite
Token ID4004
Feature activation-1.15e-02

Pos logit contributions
bows+0.042
hing+0.040
owers+0.038
 understand+0.035
ruck+0.035
Neg logit contributions
 my-0.054
 me-0.047
 cage-0.044
 mine-0.044
 cave-0.043
Token was
Token ID373
Feature activation-1.19e-02

Pos logit contributions
bows+0.020
hing+0.019
owers+0.018
 understand+0.017
ruck+0.017
Neg logit contributions
 my-0.022
 me-0.018
 MY-0.017
 cage-0.017
 cave-0.017
Token cars
Token ID5006
Feature activation-4.39e-02

Pos logit contributions
bows+0.043
owers+0.041
hing+0.041
 understand+0.036
ickets+0.035
Neg logit contributions
 my-0.057
 me-0.050
 cage-0.048
 mine-0.045
 cave-0.045
Token of
Token ID286
Feature activation-2.90e-02

Pos logit contributions
owers+0.096
bows+0.096
hing+0.095
 understand+0.091
bee+0.090
Neg logit contributions
 my-0.058
 me-0.047
 cage-0.044
 this-0.040
 away-0.040
Token different
Token ID1180
Feature activation-4.39e-02

Pos logit contributions
owers+0.058
hing+0.057
bows+0.056
ickets+0.051
akes+0.049
Neg logit contributions
 my-0.054
 me-0.046
 mine-0.040
 your-0.039
 cage-0.039
Token colors
Token ID7577
Feature activation-6.14e-02

Pos logit contributions
hing+0.063
owers+0.061
bows+0.060
 understand+0.059
ickets+0.055
Neg logit contributions
 my-0.095
 me-0.086
 cage-0.077
 mine-0.076
 cave-0.075
Token and
Token ID290
Feature activation-2.42e-02

Pos logit contributions
hing+0.127
 understand+0.127
bows+0.123
owers+0.121
ruck+0.114
Neg logit contributions
 my-0.089
 me-0.073
 cage-0.063
 this-0.062
 mine-0.061
Token sizes
Token ID10620
Feature activation-8.79e-02

Pos logit contributions
owers+0.023
hing+0.021
ickets+0.017
bows+0.017
akes+0.017
Neg logit contributions
 my-0.067
 me-0.059
 mine-0.057
 this-0.055
 your-0.055
Token.
Token ID13
Feature activation-9.23e-03

Pos logit contributions
hing+0.206
bows+0.203
 understand+0.201
owers+0.199
ruck+0.191
Neg logit contributions
 my-0.098
 me-0.075
 mine-0.067
 this-0.065
 away-0.064
Token They
Token ID1119
Feature activation-4.21e-02

Pos logit contributions
bows+0.015
owers+0.013
 understand+0.013
hing+0.013
 sizes+0.012
Neg logit contributions
 my-0.018
 me-0.015
 cave-0.015
 cage-0.015
 mine-0.015
Token make
Token ID787
Feature activation-3.72e-02

Pos logit contributions
owers+0.093
bows+0.086
ickets+0.083
itted+0.083
ruck+0.082
Neg logit contributions
 my-0.063
 me-0.057
 mine-0.051
 away-0.050
 locked-0.048
Token roads
Token ID9725
Feature activation-2.99e-02

Pos logit contributions
owers+0.074
bows+0.069
hing+0.066
akes+0.060
bee+0.060
Neg logit contributions
 my-0.070
 me-0.061
 your-0.052
 cage-0.052
 this-0.052
Token with
Token ID351
Feature activation-3.35e-02

Pos logit contributions
owers+0.058
bows+0.057
hing+0.056
 understand+0.053
bee+0.051
Neg logit contributions
 my-0.049
 me-0.041
 cage-0.038
 away-0.037
 cave-0.036
Token cars
Token ID5006
Feature activation-4.73e-02

Pos logit contributions
bows+0.039
owers+0.037
hing+0.036
 understand+0.032
ickets+0.032
Neg logit contributions
 my-0.051
 me-0.045
 cage-0.042
 cave-0.041
 mine-0.041
Token of
Token ID286
Feature activation-2.98e-02

Pos logit contributions
owers+0.102
hing+0.099
bows+0.099
 understand+0.096
bee+0.095
Neg logit contributions
 my-0.064
 me-0.052
 cage-0.050
 away-0.046
 this-0.045
Token different
Token ID1180
Feature activation-4.40e-02

Pos logit contributions
owers+0.064
hing+0.063
bows+0.062
ickets+0.058
akes+0.055
Neg logit contributions
 my-0.051
 me-0.043
 mine-0.037
 your-0.037
 cage-0.035
Token colors
Token ID7577
Feature activation-6.60e-02

Pos logit contributions
hing+0.064
owers+0.063
 understand+0.060
bows+0.060
ickets+0.057
Neg logit contributions
 my-0.093
 me-0.084
 mine-0.075
 cage-0.074
 cave-0.074
Token and
Token ID290
Feature activation-2.83e-02

Pos logit contributions
 understand+0.142
hing+0.141
owers+0.135
bows+0.133
mun+0.128
Neg logit contributions
 my-0.086
 me-0.071
 this-0.063
 cage-0.061
 door-0.060
Token sizes
Token ID10620
Feature activation-8.76e-02

Pos logit contributions
owers+0.027
hing+0.025
ickets+0.020
akes+0.019
bows+0.019
Neg logit contributions
 my-0.077
 me-0.069
 mine-0.066
 this-0.065
 your-0.063
Token.
Token ID13
Feature activation-1.25e-02

Pos logit contributions
hing+0.208
 understand+0.204
bows+0.203
owers+0.200
ickets+0.193
Neg logit contributions
 my-0.091
 me-0.069
 mine-0.063
 away-0.063
 this-0.062
Token They
Token ID1119
Feature activation-4.40e-02

Pos logit contributions
bows+0.024
hing+0.022
owers+0.022
 understand+0.021
ruck+0.020
Neg logit contributions
 my-0.022
 me-0.018
 cage-0.017
 MY-0.017
 cave-0.017
Token made
Token ID925
Feature activation-4.16e-02

Pos logit contributions
owers+0.099
bows+0.095
ickets+0.092
hing+0.088
anes+0.086
Neg logit contributions
 my-0.057
 me-0.053
 locked-0.050
 away-0.047
 mine-0.046
Token roads
Token ID9725
Feature activation-2.91e-02

Pos logit contributions
owers+0.090
bows+0.082
hing+0.080
bee+0.074
ickets+0.074
Neg logit contributions
 my-0.073
 me-0.063
 your-0.054
 this-0.054
 our-0.052
Token and
Token ID290
Feature activation-2.90e-02

Pos logit contributions
owers+0.058
bows+0.056
hing+0.055
 understand+0.052
bee+0.051
Neg logit contributions
 my-0.047
 me-0.040
 cage-0.036
 away-0.036
 door-0.035
Tokenbles
Token ID7689
Feature activation-5.17e-02

Pos logit contributions
 understand+0.064
bows+0.061
hing+0.058
ruck+0.053
owers+0.050
Neg logit contributions
 my-0.113
 me-0.100
 cage-0.093
 mine-0.092
ar-0.092
Token of
Token ID286
Feature activation-2.80e-02

Pos logit contributions
hing+0.101
bows+0.100
 understand+0.093
owers+0.093
ruck+0.090
Neg logit contributions
 my-0.087
 me-0.074
 cage-0.068
 mine-0.067
 this-0.063
Token different
Token ID1180
Feature activation-3.65e-02

Pos logit contributions
hing+0.059
bows+0.058
owers+0.055
akes+0.052
ances+0.049
Neg logit contributions
 my-0.051
 me-0.045
 mine-0.037
 your-0.036
 this-0.035
Token colors
Token ID7577
Feature activation-6.38e-02

Pos logit contributions
hing+0.054
bows+0.051
owers+0.049
 understand+0.049
ickets+0.043
Neg logit contributions
 my-0.083
 me-0.075
 mine-0.067
 cage-0.065
 cave-0.064
Token and
Token ID290
Feature activation-1.71e-02

Pos logit contributions
hing+0.140
 understand+0.136
bows+0.133
owers+0.129
ruck+0.125
Neg logit contributions
 my-0.089
 me-0.074
 mine-0.062
 this-0.062
 cage-0.060
Token sizes
Token ID10620
Feature activation-8.76e-02

Pos logit contributions
hing+0.022
owers+0.021
bows+0.020
ruck+0.019
akes+0.018
Neg logit contributions
 my-0.042
 me-0.037
 mine-0.035
 this-0.033
 your-0.033
Token.
Token ID13
Feature activation-7.19e-03

Pos logit contributions
hing+0.208
 understand+0.199
bows+0.199
owers+0.194
ruck+0.189
Neg logit contributions
 my-0.103
 me-0.081
 mine-0.072
 this-0.070
 away-0.068
Token She
Token ID1375
Feature activation+7.18e-03

Pos logit contributions
bows+0.014
hing+0.013
owers+0.012
ruck+0.012
 understand+0.012
Neg logit contributions
 my-0.013
 me-0.011
 cage-0.010
 mine-0.010
 MY-0.010
Token enjoyed
Token ID8359
Feature activation-1.34e-02

Pos logit contributions
 my+0.014
 me+0.013
 cage+0.011
 mine+0.011
 cave+0.010
Neg logit contributions
hing-0.013
bows-0.013
owers-0.012
ruck-0.011
ances-0.011
Token rolling
Token ID10708
Feature activation-1.46e-02

Pos logit contributions
hing+0.028
bows+0.028
owers+0.026
 understand+0.025
akes+0.025
Neg logit contributions
 my-0.023
 me-0.020
 this-0.018
 your-0.017
 mine-0.016
Token them
Token ID606
Feature activation-3.54e-02

Pos logit contributions
bows+0.027
hing+0.026
owers+0.025
 understand+0.025
ruck+0.024
Neg logit contributions
 my-0.027
 me-0.023
 mine-0.020
 cage-0.020
 away-0.020
Token rocks
Token ID12586
Feature activation-2.90e-02

Pos logit contributions
bows+0.021
hing+0.020
owers+0.019
 understand+0.018
ruck+0.018
Neg logit contributions
 my-0.022
 me-0.019
 cage-0.018
 MY-0.017
 mine-0.017
Token and
Token ID290
Feature activation-1.07e-02

Pos logit contributions
bows+0.057
hing+0.054
owers+0.052
 understand+0.051
ruck+0.049
Neg logit contributions
 my-0.048
 me-0.041
 cage-0.037
 mine-0.036
 cave-0.036
Token sticks
Token ID16461
Feature activation-3.91e-02

Pos logit contributions
bows+0.018
hing+0.017
owers+0.016
 understand+0.016
ruck+0.015
Neg logit contributions
 my-0.020
 me-0.018
 cage-0.016
 MY-0.016
 mine-0.016
Token of
Token ID286
Feature activation-1.42e-02

Pos logit contributions
bows+0.077
hing+0.074
owers+0.070
 understand+0.068
ruck+0.067
Neg logit contributions
 my-0.064
 me-0.054
 cage-0.049
 MY-0.049
 mine-0.048
Token all
Token ID477
Feature activation-4.17e-03

Pos logit contributions
bows+0.027
hing+0.026
owers+0.025
 understand+0.024
ruck+0.023
Neg logit contributions
 my-0.024
 me-0.021
 cage-0.019
 MY-0.019
 mine-0.019
Token sizes
Token ID10620
Feature activation-8.75e-02

Pos logit contributions
bows+0.008
hing+0.008
owers+0.007
 understand+0.007
ruck+0.007
Neg logit contributions
 my-0.007
 me-0.006
 cage-0.006
 MY-0.006
 mine-0.006
Token.
Token ID13
Feature activation-1.21e-02

Pos logit contributions
bows+0.188
hing+0.184
 understand+0.176
owers+0.173
ruck+0.165
Neg logit contributions
 my-0.124
 me-0.104
 mine-0.091
 cage-0.088
 cave-0.087
Token Every
Token ID3887
Feature activation+8.33e-03

Pos logit contributions
bows+0.022
hing+0.021
owers+0.020
 understand+0.019
ruck+0.019
Neg logit contributions
 my-0.022
 me-0.018
 cage-0.017
 MY-0.017
 mine-0.017
Token morning
Token ID3329
Feature activation-2.17e-03

Pos logit contributions
 my+0.018
 MY+0.015
 me+0.015
 gru+0.015
 mine+0.015
Neg logit contributions
bows-0.010
hing-0.010
 understand-0.010
 were-0.009
owers-0.009
Token,
Token ID11
Feature activation-1.36e-02

Pos logit contributions
bows+0.004
hing+0.004
owers+0.004
 understand+0.004
ruck+0.004
Neg logit contributions
 my-0.004
 me-0.003
 cage-0.003
 MY-0.003
 mine-0.003
Token he
Token ID339
Feature activation+1.01e-02

Pos logit contributions
bows+0.027
hing+0.026
owers+0.024
 understand+0.023
ruck+0.023
Neg logit contributions
 my-0.023
 me-0.020
 cage-0.017
 MY-0.017
 cave-0.017
Token boats
Token ID16659
Feature activation-4.86e-02

Pos logit contributions
bows+0.052
hing+0.046
owers+0.045
ickets+0.042
 understand+0.041
Neg logit contributions
 my-0.065
 me-0.058
 cage-0.053
 cave-0.053
 mine-0.051
Token of
Token ID286
Feature activation-3.10e-02

Pos logit contributions
bows+0.099
hing+0.098
 understand+0.092
owers+0.092
ruck+0.090
Neg logit contributions
 my-0.073
 me-0.060
 cage-0.058
 away-0.054
 cave-0.054
Token different
Token ID1180
Feature activation-4.42e-02

Pos logit contributions
bows+0.071
owers+0.068
hing+0.068
akes+0.062
ickets+0.062
Neg logit contributions
 my-0.052
 me-0.044
 your-0.037
 mine-0.036
 our-0.035
Token colors
Token ID7577
Feature activation-6.85e-02

Pos logit contributions
bows+0.063
hing+0.062
owers+0.058
 understand+0.058
ickets+0.055
Neg logit contributions
 my-0.096
 me-0.087
 cave-0.079
 cage-0.079
 away-0.077
Token and
Token ID290
Feature activation-2.91e-02

Pos logit contributions
hing+0.143
 understand+0.143
bows+0.140
owers+0.134
ruck+0.128
Neg logit contributions
 my-0.095
 me-0.078
 cage-0.070
 this-0.069
 cave-0.066
Token sizes
Token ID10620
Feature activation-8.73e-02

Pos logit contributions
hing+0.027
owers+0.026
bows+0.023
ickets+0.021
akes+0.020
Neg logit contributions
 my-0.079
 me-0.070
 mine-0.067
 this-0.066
 your-0.065
Token.
Token ID13
Feature activation-9.60e-03

Pos logit contributions
hing+0.200
bows+0.200
 understand+0.195
owers+0.190
ruck+0.184
Neg logit contributions
 my-0.102
 me-0.080
 mine-0.071
 away-0.071
 this-0.070
Token They
Token ID1119
Feature activation-4.68e-02

Pos logit contributions
bows+0.016
hing+0.015
owers+0.015
 understand+0.014
 sizes+0.013
Neg logit contributions
 my-0.018
 me-0.015
 cage-0.015
 cave-0.015
 MY-0.014
Token liked
Token ID8288
Feature activation-1.75e-02

Pos logit contributions
bows+0.097
owers+0.093
ickets+0.089
anes+0.087
ruck+0.087
Neg logit contributions
 my-0.066
 me-0.065
 away-0.058
 locked-0.056
 now-0.053
Token to
Token ID284
Feature activation-2.30e-02

Pos logit contributions
bows+0.044
hing+0.043
owers+0.041
 understand+0.040
ruck+0.039
Neg logit contributions
 my-0.021
 me-0.016
 mine-0.013
 cage-0.013
 MY-0.013
Token make
Token ID787
Feature activation-3.85e-02

Pos logit contributions
bows+0.042
owers+0.039
hing+0.039
ruck+0.036
ickets+0.035
Neg logit contributions
 my-0.042
 me-0.037
 cage-0.033
 mine-0.033
 cave-0.032
Token different
Token ID1180
Feature activation-3.25e-02

Pos logit contributions
bows+0.050
hing+0.048
owers+0.045
 understand+0.043
ruck+0.042
Neg logit contributions
 my-0.053
 me-0.046
 cage-0.043
 mine-0.042
 MY-0.042
Token colors
Token ID7577
Feature activation-6.48e-02

Pos logit contributions
hing+0.061
bows+0.058
owers+0.054
 understand+0.054
ruck+0.052
Neg logit contributions
 my-0.059
 me-0.050
 cage-0.046
 cave-0.045
 mine-0.045
Token,
Token ID11
Feature activation-3.70e-02

Pos logit contributions
hing+0.137
 understand+0.132
bows+0.131
owers+0.127
ances+0.121
Neg logit contributions
 my-0.091
 me-0.076
 cage-0.070
 cave-0.065
 mine-0.065
Token shapes
Token ID15268
Feature activation-6.97e-02

Pos logit contributions
hing+0.060
bows+0.057
owers+0.055
ruck+0.053
 understand+0.052
Neg logit contributions
 my-0.075
 me-0.064
 mine-0.059
 cage-0.059
 this-0.057
Token and
Token ID290
Feature activation-2.67e-02

Pos logit contributions
hing+0.164
bows+0.159
 understand+0.157
owers+0.155
ruck+0.150
Neg logit contributions
 my-0.081
 me-0.065
 cage-0.057
 this-0.054
 away-0.053
Token sizes
Token ID10620
Feature activation-8.72e-02

Pos logit contributions
hing+0.040
owers+0.036
bows+0.035
ruck+0.033
ickets+0.032
Neg logit contributions
 my-0.059
 me-0.052
 cage-0.048
 mine-0.047
 this-0.046
Token.
Token ID13
Feature activation-1.88e-02

Pos logit contributions
hing+0.198
bows+0.192
 understand+0.190
owers+0.184
ruck+0.180
Neg logit contributions
 my-0.105
 me-0.085
 mine-0.076
 this-0.073
 away-0.073
Token Everyone
Token ID11075
Feature activation-3.22e-02

Pos logit contributions
bows+0.031
hing+0.030
owers+0.028
 understand+0.027
ruck+0.026
Neg logit contributions
 my-0.037
 me-0.032
 cage-0.030
 MY-0.030
 mine-0.029
Token was
Token ID373
Feature activation-2.21e-02

Pos logit contributions
bows+0.057
hing+0.054
owers+0.051
 understand+0.049
ruck+0.048
Neg logit contributions
 my-0.060
 me-0.053
 cage-0.048
 MY-0.047
 cave-0.047
Token amazed
Token ID24562
Feature activation-1.96e-02

Pos logit contributions
bows+0.036
hing+0.035
owers+0.032
 understand+0.030
ruck+0.030
Neg logit contributions
 my-0.046
 me-0.040
 cage-0.037
 mine-0.036
 MY-0.036
Token at
Token ID379
Feature activation-1.60e-02

Pos logit contributions
bows+0.041
hing+0.040
owers+0.038
 understand+0.037
ruck+0.036
Neg logit contributions
 my-0.030
 me-0.026
 cage-0.023
 mine-0.022
 MY-0.022
Token cubes
Token ID34896
Feature activation-4.37e-02

Pos logit contributions
bows+0.038
hing+0.035
owers+0.033
 understand+0.030
ickets+0.029
Neg logit contributions
 my-0.055
 me-0.049
 cage-0.045
 mine-0.045
 cave-0.045
Token of
Token ID286
Feature activation-3.33e-02

Pos logit contributions
bows+0.078
hing+0.077
 understand+0.071
owers+0.071
ruck+0.067
Neg logit contributions
 my-0.080
 me-0.068
 mine-0.063
 cage-0.063
 away-0.060
Token different
Token ID1180
Feature activation-4.00e-02

Pos logit contributions
bows+0.069
hing+0.066
owers+0.065
ickets+0.058
bee+0.057
Neg logit contributions
 my-0.061
 me-0.053
 mine-0.046
 your-0.044
 this-0.044
Token colors
Token ID7577
Feature activation-5.90e-02

Pos logit contributions
bows+0.057
hing+0.056
owers+0.053
 understand+0.051
ickets+0.047
Neg logit contributions
 my-0.089
 me-0.080
 mine-0.073
 cave-0.072
 cage-0.071
Token and
Token ID290
Feature activation-1.99e-02

Pos logit contributions
hing+0.119
bows+0.118
 understand+0.114
owers+0.111
ances+0.106
Neg logit contributions
 my-0.093
 me-0.076
 mine-0.068
 cage-0.067
 this-0.065
Token sizes
Token ID10620
Feature activation-8.72e-02

Pos logit contributions
hing+0.023
bows+0.022
owers+0.022
ickets+0.018
ruck+0.018
Neg logit contributions
 my-0.051
 me-0.044
 mine-0.043
 cage-0.041
 your-0.040
Token.
Token ID13
Feature activation-1.45e-02

Pos logit contributions
hing+0.192
bows+0.191
 understand+0.184
owers+0.181
ances+0.173
Neg logit contributions
 my-0.119
 me-0.093
 mine-0.087
 this-0.081
 away-0.079
Token They
Token ID1119
Feature activation-3.96e-02

Pos logit contributions
bows+0.023
hing+0.022
owers+0.021
 understand+0.020
ruck+0.019
Neg logit contributions
 my-0.029
 me-0.025
 cage-0.024
 MY-0.023
 cave-0.023
Token build
Token ID1382
Feature activation-3.14e-02

Pos logit contributions
owers+0.076
bows+0.074
ruck+0.069
itted+0.069
hing+0.068
Neg logit contributions
 my-0.070
 me-0.063
 mine-0.058
 away-0.055
 this-0.053
Token towers
Token ID18028
Feature activation-4.47e-02

Pos logit contributions
bows+0.062
hing+0.060
owers+0.058
 understand+0.054
ruck+0.053
Neg logit contributions
 my-0.054
 me-0.046
 cage-0.042
 mine-0.041
 cave-0.041
Token and
Token ID290
Feature activation-2.26e-02

Pos logit contributions
hing+0.093
bows+0.092
owers+0.090
 understand+0.087
bee+0.085
Neg logit contributions
 my-0.067
 me-0.056
 cage-0.052
 cave-0.049
 door-0.049
Token Max
Token ID5436
Feature activation+6.27e-03

Pos logit contributions
bows+0.016
hing+0.015
 understand+0.014
owers+0.014
ruck+0.013
Neg logit contributions
 my-0.018
 me-0.015
 cage-0.014
 mine-0.014
 MY-0.014
Token thought
Token ID1807
Feature activation-1.89e-02

Pos logit contributions
 my+0.013
 cave+0.011
 cage+0.011
 MY+0.011
 mine+0.011
Neg logit contributions
bows-0.008
 understand-0.008
hing-0.008
 were-0.008
bee-0.008
Token about
Token ID546
Feature activation+2.46e-02

Pos logit contributions
bows+0.035
hing+0.035
 understand+0.033
ruck+0.032
owers+0.031
Neg logit contributions
 my-0.031
 me-0.026
 cage-0.026
 cave-0.025
 MY-0.024
Token it
Token ID340
Feature activation+1.08e-02

Pos logit contributions
 my+0.035
 cage+0.031
 away+0.030
 gru+0.030
 MY+0.030
Neg logit contributions
bows-0.046
hing-0.043
 understand-0.043
ruck-0.041
owers-0.041
Token and
Token ID290
Feature activation+1.54e-02

Pos logit contributions
 my+0.024
 gru+0.022
 cage+0.022
 MY+0.022
 cave+0.022
Neg logit contributions
 understand-0.012
hing-0.011
bows-0.011
 were-0.010
ruck-0.009
Token decided
Token ID3066
Feature activation+4.25e-02

Pos logit contributions
 my+0.033
 cage+0.029
 cave+0.029
 MY+0.029
 gru+0.027
Neg logit contributions
bows-0.020
 understand-0.018
hing-0.018
 were-0.017
bee-0.016
Token not
Token ID407
Feature activation+2.32e-03

Pos logit contributions
 cave+0.080
 cage+0.079
 my+0.079
 enclosure+0.074
 MY+0.072
Neg logit contributions
 understand-0.057
bows-0.055
 were-0.047
ruck-0.044
working-0.041
Token to
Token ID284
Feature activation-1.85e-03

Pos logit contributions
 my+0.003
 me+0.003
 cage+0.003
 cave+0.003
 MY+0.003
Neg logit contributions
bows-0.005
 understand-0.004
hing-0.004
ruck-0.004
 sizes-0.004
Token eat
Token ID4483
Feature activation-7.56e-03

Pos logit contributions
bows+0.003
hing+0.003
 understand+0.003
owers+0.003
ruck+0.003
Neg logit contributions
 my-0.004
 me-0.003
 cage-0.003
 MY-0.003
 cave-0.003
Token the
Token ID262
Feature activation-5.41e-03

Pos logit contributions
bows+0.013
hing+0.012
 understand+0.012
owers+0.012
 sizes+0.011
Neg logit contributions
 my-0.012
 cage-0.011
 cave-0.010
 MY-0.010
 gru-0.010
Token mouse
Token ID10211
Feature activation+1.51e-02

Pos logit contributions
bows+0.012
hing+0.011
owers+0.011
 understand+0.010
ruck+0.010
Neg logit contributions
 my-0.008
 me-0.007
 cage-0.006
 MY-0.006
 mine-0.006
Token door
Token ID3420
Feature activation+4.04e-02

Pos logit contributions
bows+0.002
 understand+0.001
hing+0.001
owers+0.001
 dancers+0.001
Neg logit contributions
 my-0.002
 MY-0.001
 me-0.001
 gru-0.001
fired-0.001
Token.
Token ID13
Feature activation+1.76e-04

Pos logit contributions
 my+0.092
 MY+0.082
 me+0.076
 mine+0.075
 cave+0.074
Neg logit contributions
bows-0.051
 understand-0.046
hing-0.044
 were-0.043
ances-0.040
Token He
Token ID679
Feature activation+1.74e-02

Pos logit contributions
 my+0.000
 mine+0.000
 me+0.000
 MY+0.000
 away+0.000
Neg logit contributions
bows-0.000
 understand-0.000
hing-0.000
ickets-0.000
owers-0.000
Token pushed
Token ID7121
Feature activation-4.33e-03

Pos logit contributions
 my+0.037
 mine+0.031
 MY+0.030
 cave+0.029
 cage+0.029
Neg logit contributions
bows-0.024
 were-0.023
 understand-0.022
hing-0.022
bee-0.020
Token the
Token ID262
Feature activation+1.72e-03

Pos logit contributions
hing+0.008
bows+0.008
 understand+0.008
ruck+0.007
owers+0.007
Neg logit contributions
 my-0.007
 cage-0.006
 cave-0.006
 MY-0.006
 me-0.005
Token door
Token ID3420
Feature activation+4.51e-02

Pos logit contributions
 my+0.002
 MY+0.001
 me+0.001
ar+0.001
 cave+0.001
Neg logit contributions
bows-0.004
hing-0.004
 understand-0.004
ruck-0.004
owers-0.004
Token.
Token ID13
Feature activation-6.02e-03

Pos logit contributions
 my+0.084
 MY+0.073
 me+0.071
 cave+0.068
 cage+0.067
Neg logit contributions
bows-0.075
hing-0.070
 understand-0.069
ruck-0.063
 were-0.061
Token The
Token ID383
Feature activation-4.37e-03

Pos logit contributions
bows+0.010
 understand+0.009
hing+0.009
owers+0.008
ickets+0.008
Neg logit contributions
 my-0.012
 mine-0.010
 me-0.010
 MY-0.010
 cage-0.010
Token door
Token ID3420
Feature activation+3.72e-02

Pos logit contributions
bows+0.011
 understand+0.010
 dancers+0.010
owers+0.010
hing+0.010
Neg logit contributions
 my-0.004
 MY-0.003
 me-0.003
ar-0.003
shadow-0.002
Token was
Token ID373
Feature activation+1.11e-02

Pos logit contributions
 my+0.072
 MY+0.059
 me+0.059
 mine+0.057
 cave+0.057
Neg logit contributions
bows-0.062
hing-0.060
 were-0.056
 understand-0.055
ruck-0.051
Token stuck
Token ID7819
Feature activation-9.46e-03

Pos logit contributions
 my+0.014
ar+0.013
lor+0.013
eros+0.012
 enclosure+0.012
Neg logit contributions
bows-0.021
 understand-0.020
ruck-0.020
 were-0.019
hing-0.019
Token right
Token ID826
Feature activation+2.03e-02

Pos logit contributions
bows+0.014
hing+0.013
 understand+0.013
owers+0.012
ruck+0.012
Neg logit contributions
 my-0.009
 me-0.007
 cage-0.007
 cave-0.007
 MY-0.007
Token.
Token ID13
Feature activation+1.37e-02

Pos logit contributions
 my+0.046
 MY+0.043
 cage+0.040
 me+0.038
 enclosure+0.038
Neg logit contributions
bows-0.024
 understand-0.021
hing-0.021
 were-0.020
ances-0.020
Token So
Token ID1406
Feature activation+9.28e-03

Pos logit contributions
 my+0.026
 me+0.022
 cage+0.021
 MY+0.021
 mine+0.021
Neg logit contributions
bows-0.023
hing-0.021
owers-0.021
 understand-0.020
ruck-0.020
Token he
Token ID339
Feature activation+1.53e-02

Pos logit contributions
 my+0.017
 cage+0.016
 cave+0.015
 enclosure+0.015
 gru+0.015
Neg logit contributions
 were-0.014
bows-0.014
 understand-0.013
hing-0.013
ruck-0.012
Token turned
Token ID2900
Feature activation+3.31e-03

Pos logit contributions
 my+0.033
 me+0.027
 MY+0.027
 cage+0.027
 cave+0.027
Neg logit contributions
bows-0.021
hing-0.020
 understand-0.019
owers-0.018
ruck-0.018
Token off
Token ID572
Feature activation+4.70e-02

Pos logit contributions
 my+0.006
 me+0.005
 cage+0.004
 cave+0.004
 MY+0.004
Neg logit contributions
bows-0.006
hing-0.006
 understand-0.006
owers-0.006
ruck-0.006
Token the
Token ID262
Feature activation+1.96e-03

Pos logit contributions
 my+0.083
 enclosure+0.078
 cave+0.077
 cage+0.077
 MY+0.075
Neg logit contributions
hing-0.070
bows-0.065
 were-0.065
 understand-0.064
ruck-0.061
Token TV
Token ID3195
Feature activation+1.66e-02

Pos logit contributions
 my+0.003
 me+0.003
 cage+0.003
 mine+0.003
 cave+0.003
Neg logit contributions
bows-0.004
hing-0.004
owers-0.003
 understand-0.003
ruck-0.003
Token and
Token ID290
Feature activation+7.03e-03

Pos logit contributions
 my+0.036
 me+0.031
 MY+0.030
 gru+0.029
 mine+0.029
Neg logit contributions
bows-0.025
hing-0.023
owers-0.021
 understand-0.021
ruck-0.020
Token went
Token ID1816
Feature activation+4.89e-02

Pos logit contributions
 my+0.015
 MY+0.012
 cage+0.012
 me+0.012
 mine+0.012
Neg logit contributions
bows-0.010
hing-0.010
 understand-0.009
ruck-0.009
 were-0.009
Token to
Token ID284
Feature activation+1.46e-02

Pos logit contributions
 my+0.087
 MY+0.083
 gru+0.077
riage+0.073
oh+0.072
Neg logit contributions
bows-0.075
ruck-0.068
 understand-0.067
hing-0.066
 dancers-0.063
Token Lily
Token ID20037
Feature activation-2.65e-03

Pos logit contributions
 understand+0.000
bows+0.000
 were+0.000
hing+0.000
owers+0.000
Neg logit contributions
 my-0.000
 cage-0.000
 MY-0.000
 cave-0.000
 enclosure-0.000
Token shouts
Token ID34757
Feature activation-1.65e-02

Pos logit contributions
bows+0.005
hing+0.005
owers+0.005
 understand+0.005
ruck+0.005
Neg logit contributions
 my-0.004
 me-0.003
 MY-0.003
 cage-0.003
 cave-0.003
Token.
Token ID13
Feature activation-1.11e-02

Pos logit contributions
bows+0.035
hing+0.034
owers+0.032
ruck+0.031
 understand+0.030
Neg logit contributions
 my-0.025
 me-0.021
 cage-0.019
 mine-0.019
 cave-0.018
Token "
Token ID366
Feature activation-2.45e-02

Pos logit contributions
bows+0.023
hing+0.022
owers+0.020
ruck+0.020
 understand+0.020
Neg logit contributions
 my-0.018
 me-0.016
 cage-0.014
 cave-0.013
 MY-0.013
TokenThe
Token ID464
Feature activation-1.36e-02

Pos logit contributions
bows+0.052
owers+0.046
hing+0.044
 sizes+0.044
ruck+0.044
Neg logit contributions
 my-0.040
 me-0.034
 MY-0.028
 cage-0.028
 Help-0.028
Token road
Token ID2975
Feature activation+5.74e-02

Pos logit contributions
bows+0.036
hing+0.035
owers+0.034
 understand+0.033
ruck+0.033
Neg logit contributions
 my-0.013
 me-0.010
 MY-0.008
 cage-0.008
 mine-0.008
Token is
Token ID318
Feature activation+2.28e-02

Pos logit contributions
 my+0.071
 gru+0.066
 cave+0.060
 MY+0.059
lor+0.059
Neg logit contributions
bows-0.120
hing-0.113
 were-0.107
 sizes-0.102
ruck-0.102
Token not
Token ID407
Feature activation-6.29e-03

Pos logit contributions
lor+0.037
ar+0.035
ress+0.034
eros+0.033
oh+0.033
Neg logit contributions
bows-0.034
ruck-0.034
 sizes-0.033
hing-0.032
 understand-0.031
Token for
Token ID329
Feature activation-4.80e-03

Pos logit contributions
bows+0.012
hing+0.012
 understand+0.011
ruck+0.011
owers+0.011
Neg logit contributions
 my-0.010
 me-0.009
 cage-0.008
 MY-0.008
 cave-0.008
Token playing
Token ID2712
Feature activation+1.20e-03

Pos logit contributions
bows+0.009
hing+0.009
ruck+0.009
owers+0.008
 understand+0.008
Neg logit contributions
 my-0.007
 cage-0.006
 gru-0.006
 cave-0.006
 me-0.006
Token.
Token ID13
Feature activation+6.20e-03

Pos logit contributions
 my+0.002
 MY+0.002
 me+0.002
 cage+0.002
 gru+0.002
Neg logit contributions
bows-0.002
hing-0.002
 understand-0.002
ruck-0.002
owers-0.002
Token.
Token ID13
Feature activation-1.31e-03

Pos logit contributions
bows+0.019
hing+0.018
owers+0.017
 understand+0.017
ruck+0.016
Neg logit contributions
 my-0.018
 me-0.015
 cage-0.014
 MY-0.014
 cave-0.013
Token\n
Token ID198
Feature activation+4.01e-03

Pos logit contributions
bows+0.002
hing+0.002
owers+0.002
 understand+0.002
ruck+0.002
Neg logit contributions
 my-0.002
 me-0.002
 cage-0.002
 mine-0.002
 cave-0.002
Token\n
Token ID198
Feature activation+9.46e-03

Pos logit contributions
 my+0.007
 me+0.006
 cage+0.005
 MY+0.005
 cave+0.005
Neg logit contributions
bows-0.008
hing-0.007
owers-0.007
 understand-0.007
ruck-0.007
TokenAt
Token ID2953
Feature activation+2.23e-02

Pos logit contributions
 my+0.016
 me+0.014
 cage+0.013
 cave+0.013
 mine+0.013
Neg logit contributions
bows-0.018
hing-0.017
owers-0.016
 understand-0.016
ruck-0.016
Token the
Token ID262
Feature activation+1.10e-02

Pos logit contributions
 my+0.033
 MY+0.027
 cage+0.027
 me+0.027
 mine+0.026
Neg logit contributions
bows-0.042
hing-0.042
 understand-0.040
owers-0.039
ruck-0.039
Token far
Token ID1290
Feature activation+7.40e-02

Pos logit contributions
 my+0.020
 MY+0.016
fired+0.016
enn+0.016
smart+0.016
Neg logit contributions
bows-0.018
hing-0.017
owers-0.016
 understand-0.016
 were-0.015
Token side
Token ID1735
Feature activation+2.55e-02

Pos logit contributions
lor+0.083
 my+0.081
 Help+0.075
 gru+0.072
 MY+0.071
Neg logit contributions
bows-0.141
hing-0.139
ruck-0.131
 sizes-0.123
 textures-0.122
Token of
Token ID286
Feature activation+2.32e-02

Pos logit contributions
 my+0.061
 MY+0.053
 gru+0.052
 mine+0.052
 me+0.052
Neg logit contributions
bows-0.028
hing-0.025
 were-0.024
owers-0.024
 understand-0.021
Token the
Token ID262
Feature activation+5.66e-03

Pos logit contributions
 my+0.036
 MY+0.030
ar+0.029
 gru+0.029
 mine+0.029
Neg logit contributions
bows-0.040
hing-0.039
 understand-0.037
ruck-0.037
owers-0.037
Token pond
Token ID16723
Feature activation+2.41e-02

Pos logit contributions
 my+0.009
 MY+0.007
 gru+0.007
fired+0.006
 me+0.006
Neg logit contributions
bows-0.010
hing-0.010
owers-0.010
 understand-0.009
ruck-0.009
Token,
Token ID11
Feature activation-8.77e-03

Pos logit contributions
 my+0.043
 MY+0.035
 mine+0.034
 me+0.033
 cage+0.033
Neg logit contributions
bows-0.043
owers-0.040
hing-0.039
 were-0.039
 understand-0.038
Token push
Token ID4574
Feature activation-1.30e-02

Pos logit contributions
owers+0.009
bows+0.009
hing+0.008
ruck+0.008
ances+0.008
Neg logit contributions
 my-0.007
 me-0.006
 mine-0.006
 away-0.006
 MY-0.005
Token and
Token ID290
Feature activation-8.94e-04

Pos logit contributions
bows+0.027
hing+0.027
 understand+0.025
owers+0.025
ruck+0.024
Neg logit contributions
 my-0.019
 me-0.016
 cage-0.015
 cave-0.015
 MY-0.015
Token shove
Token ID38332
Feature activation-2.65e-02

Pos logit contributions
bows+0.001
hing+0.001
 understand+0.001
owers+0.001
ruck+0.001
Neg logit contributions
 my-0.002
 me-0.002
 cage-0.001
 MY-0.001
 cave-0.001
Token each
Token ID1123
Feature activation+9.78e-04

Pos logit contributions
bows+0.048
owers+0.045
hing+0.044
ickets+0.040
 understand+0.040
Neg logit contributions
 my-0.051
 me-0.044
 mine-0.040
 away-0.039
 MY-0.038
Token other
Token ID584
Feature activation-8.35e-03

Pos logit contributions
 my+0.002
 me+0.002
 mine+0.001
 cage+0.001
 MY+0.001
Neg logit contributions
bows-0.002
hing-0.002
owers-0.002
 understand-0.002
ruck-0.001
Token.
Token ID13
Feature activation-1.20e-03

Pos logit contributions
bows+0.018
hing+0.017
owers+0.017
ruck+0.016
 understand+0.016
Neg logit contributions
 my-0.012
 me-0.010
 mine-0.009
 cage-0.009
 away-0.009
Token Suddenly
Token ID24975
Feature activation-5.25e-03

Pos logit contributions
bows+0.002
hing+0.002
owers+0.002
 understand+0.002
ruck+0.002
Neg logit contributions
 my-0.002
 me-0.002
 cage-0.002
 cave-0.002
 MY-0.002
Token,
Token ID11
Feature activation-7.28e-03

Pos logit contributions
bows+0.012
hing+0.012
owers+0.011
 understand+0.011
ruck+0.011
Neg logit contributions
 my-0.007
 me-0.006
 cage-0.005
 MY-0.005
 mine-0.005
Token Tim
Token ID5045
Feature activation-3.98e-03

Pos logit contributions
bows+0.015
hing+0.015
owers+0.014
 understand+0.013
ruck+0.013
Neg logit contributions
 my-0.012
 me-0.010
 mine-0.009
 MY-0.008
 cage-0.008
Tokenmy
Token ID1820
Feature activation+4.97e-03

Pos logit contributions
bows+0.007
hing+0.006
owers+0.006
 understand+0.006
ruck+0.006
Neg logit contributions
 my-0.008
 me-0.007
 cage-0.006
 MY-0.006
 mine-0.006
Token tri
Token ID1333
Feature activation-8.20e-03

Pos logit contributions
 my+0.008
 me+0.007
 cage+0.006
 MY+0.006
 cave+0.006
Neg logit contributions
bows-0.010
hing-0.010
owers-0.009
 understand-0.009
ruck-0.009
Token herself
Token ID5223
Feature activation-1.47e-02

Pos logit contributions
 my+0.002
 me+0.001
 cage+0.001
 MY+0.001
 cave+0.001
Neg logit contributions
bows-0.002
hing-0.002
owers-0.002
 understand-0.001
ruck-0.001
Token in
Token ID287
Feature activation+1.91e-02

Pos logit contributions
hing+0.031
bows+0.031
 understand+0.029
ruck+0.028
ickets+0.028
Neg logit contributions
 my-0.022
 me-0.018
 cage-0.017
 mine-0.016
 cave-0.016
Token the
Token ID262
Feature activation-2.33e-03

Pos logit contributions
 my+0.029
 me+0.025
 MY+0.023
 cage+0.023
 cave+0.022
Neg logit contributions
bows-0.039
hing-0.037
 understand-0.035
owers-0.035
ruck-0.034
Token big
Token ID1263
Feature activation+1.92e-02

Pos logit contributions
bows+0.004
hing+0.004
owers+0.004
 understand+0.004
ruck+0.004
Neg logit contributions
 my-0.004
 MY-0.003
 me-0.003
 gru-0.003
 mine-0.003
Token mirror
Token ID10162
Feature activation+7.97e-03

Pos logit contributions
 my+0.031
shadow+0.031
 MY+0.029
lor+0.028
har+0.027
Neg logit contributions
bows-0.034
owers-0.030
hing-0.030
 were-0.029
 sizes-0.029
Token in
Token ID287
Feature activation+2.22e-02

Pos logit contributions
 my+0.016
 me+0.013
 MY+0.013
 gru+0.013
 mine+0.012
Neg logit contributions
bows-0.014
owers-0.013
hing-0.013
 understand-0.011
ances-0.011
Token her
Token ID607
Feature activation-9.03e-04

Pos logit contributions
 my+0.043
 me+0.038
 MY+0.037
 cage+0.037
 gru+0.037
Neg logit contributions
bows-0.034
hing-0.032
 understand-0.030
owers-0.029
ruck-0.029
Token room
Token ID2119
Feature activation+2.64e-03

Pos logit contributions
bows+0.002
owers+0.002
 understand+0.001
 were+0.001
 designs+0.001
Neg logit contributions
 my-0.001
 MY-0.001
shadow-0.001
 gru-0.001
ar-0.001
Token.
Token ID13
Feature activation-2.81e-03

Pos logit contributions
 my+0.005
 me+0.004
 MY+0.004
 cage+0.004
 mine+0.004
Neg logit contributions
bows-0.005
hing-0.004
owers-0.004
 understand-0.004
ruck-0.004
Token\n
Token ID198
Feature activation-3.71e-03

Pos logit contributions
bows+0.006
hing+0.005
owers+0.005
 understand+0.005
ruck+0.005
Neg logit contributions
 my-0.005
 me-0.004
 cage-0.004
 MY-0.003
 mine-0.003
Token\n
Token ID198
Feature activation+6.78e-03

Pos logit contributions
bows+0.008
hing+0.008
owers+0.007
 sizes+0.007
ances+0.007
Neg logit contributions
 my-0.006
 me-0.005
 cage-0.004
 cave-0.004
 MY-0.004
Token the
Token ID262
Feature activation+4.39e-03

Pos logit contributions
 my+0.005
 me+0.004
 cage+0.004
 cave+0.004
 MY+0.004
Neg logit contributions
bows-0.007
hing-0.006
 understand-0.006
ruck-0.006
owers-0.006
Token coin
Token ID10752
Feature activation+2.15e-02

Pos logit contributions
 my+0.007
 me+0.006
 cage+0.006
 mine+0.005
 MY+0.005
Neg logit contributions
bows-0.009
hing-0.008
owers-0.008
 understand-0.008
ruck-0.008
Token and
Token ID290
Feature activation+3.69e-03

Pos logit contributions
 my+0.045
 gru+0.040
 me+0.039
 MY+0.039
 cave+0.038
Neg logit contributions
bows-0.030
hing-0.028
owers-0.025
ruck-0.024
 understand-0.024
Token it
Token ID340
Feature activation+9.26e-03

Pos logit contributions
 my+0.006
 me+0.005
 cage+0.005
 MY+0.005
 gru+0.005
Neg logit contributions
bows-0.006
hing-0.006
 understand-0.006
ruck-0.006
 were-0.006
Token rolled
Token ID11686
Feature activation+1.02e-03

Pos logit contributions
 my+0.017
 me+0.014
 cage+0.013
 MY+0.013
 cave+0.013
Neg logit contributions
bows-0.016
hing-0.016
owers-0.014
 understand-0.014
ruck-0.014
Token away
Token ID1497
Feature activation+1.14e-02

Pos logit contributions
 my+0.001
 cage+0.001
 MY+0.001
 me+0.001
 cave+0.001
Neg logit contributions
bows-0.002
hing-0.002
ruck-0.002
 understand-0.002
 were-0.002
Token.
Token ID13
Feature activation+1.99e-03

Pos logit contributions
 my+0.020
 me+0.017
 cage+0.016
 MY+0.016
 mine+0.016
Neg logit contributions
bows-0.021
hing-0.020
owers-0.019
 understand-0.018
ruck-0.018
Token\n
Token ID198
Feature activation+6.94e-03

Pos logit contributions
 my+0.004
 cage+0.003
 me+0.003
 cave+0.003
 away+0.003
Neg logit contributions
bows-0.003
hing-0.003
owers-0.003
 understand-0.003
ickets-0.002
Token\n
Token ID198
Feature activation+1.41e-02

Pos logit contributions
 my+0.013
 me+0.011
 cage+0.010
 mine+0.010
 MY+0.010
Neg logit contributions
bows-0.012
hing-0.012
owers-0.011
 understand-0.011
ruck-0.011
TokenSam
Token ID16305
Feature activation+3.78e-03

Pos logit contributions
 my+0.028
 me+0.024
 cage+0.024
 cave+0.023
 mine+0.023
Neg logit contributions
bows-0.022
hing-0.022
 understand-0.020
owers-0.020
ruck-0.019
Token was
Token ID373
Feature activation-3.36e-03

Pos logit contributions
 my+0.007
 me+0.006
 MY+0.006
 cage+0.006
 mine+0.006
Neg logit contributions
bows-0.006
hing-0.006
 understand-0.006
owers-0.006
ruck-0.005
Token upon
Token ID2402
Feature activation-2.28e-02

Pos logit contributions
bows+0.018
hing+0.017
owers+0.016
 understand+0.016
ruck+0.015
Neg logit contributions
 my-0.022
 me-0.019
 cage-0.017
 MY-0.017
 mine-0.017
Token a
Token ID257
Feature activation-1.53e-03

Pos logit contributions
bows+0.059
owers+0.058
hing+0.056
ances+0.055
anes+0.052
Neg logit contributions
 my-0.029
 me-0.023
 your-0.019
 this-0.019
 cave-0.018
Token time
Token ID640
Feature activation-1.52e-02

Pos logit contributions
bows+0.003
hing+0.003
owers+0.003
ruck+0.003
anes+0.003
Neg logit contributions
 my-0.003
 me-0.002
 cage-0.002
 MY-0.002
 mine-0.002
Token there
Token ID612
Feature activation-1.41e-02

Pos logit contributions
bows+0.035
hing+0.033
owers+0.032
 understand+0.031
ruck+0.030
Neg logit contributions
 my-0.021
 me-0.017
 cage-0.015
 MY-0.015
 mine-0.015
Token was
Token ID373
Feature activation-2.24e-02

Pos logit contributions
bows+0.030
hing+0.028
owers+0.027
 understand+0.026
ruck+0.026
Neg logit contributions
 my-0.022
 me-0.018
 cage-0.016
 MY-0.016
 mine-0.016
Token a
Token ID257
Feature activation+1.33e-02

Pos logit contributions
bows+0.064
hing+0.063
ances+0.061
anes+0.058
owers+0.057
Neg logit contributions
 my-0.025
 me-0.018
 your-0.016
 this-0.014
 here-0.013
Token boy
Token ID2933
Feature activation+4.73e-03

Pos logit contributions
 my+0.024
 me+0.021
 MY+0.020
 cage+0.019
 mine+0.019
Neg logit contributions
bows-0.023
hing-0.022
owers-0.021
 understand-0.021
ruck-0.020
Token named
Token ID3706
Feature activation-1.04e-03

Pos logit contributions
 my+0.008
 me+0.006
 MY+0.006
 mine+0.006
 cage+0.006
Neg logit contributions
bows-0.009
hing-0.009
owers-0.008
 understand-0.008
ruck-0.008
Token Jack
Token ID3619
Feature activation-9.48e-03

Pos logit contributions
bows+0.002
hing+0.002
 understand+0.002
owers+0.002
ruck+0.002
Neg logit contributions
 my-0.002
 me-0.002
 cage-0.002
 cave-0.002
 MY-0.002
Token.
Token ID13
Feature activation-1.61e-02

Pos logit contributions
bows+0.022
hing+0.020
owers+0.020
ickets+0.019
 understand+0.018
Neg logit contributions
 my-0.014
 me-0.012
 cage-0.010
 cave-0.010
 mine-0.009
Token Jack
Token ID3619
Feature activation-7.49e-03

Pos logit contributions
bows+0.034
hing+0.033
owers+0.032
ances+0.031
ruck+0.030
Neg logit contributions
 my-0.025
 me-0.021
 mine-0.018
 cave-0.018
 MY-0.017
Token them
Token ID606
Feature activation-4.20e-02

Pos logit contributions
bows+0.056
hing+0.053
owers+0.052
ances+0.048
ruck+0.048
Neg logit contributions
 my-0.049
 me-0.043
 cage-0.036
 mine-0.036
 cave-0.036
Token feel
Token ID1254
Feature activation-5.07e-03

Pos logit contributions
bows+0.070
owers+0.068
hing+0.068
ruck+0.063
ances+0.062
Neg logit contributions
 my-0.083
 me-0.069
 cave-0.064
 away-0.064
 mine-0.064
Token happy
Token ID3772
Feature activation-2.21e-02

Pos logit contributions
bows+0.010
hing+0.010
owers+0.009
 understand+0.009
ruck+0.009
Neg logit contributions
 my-0.008
 me-0.007
 cage-0.006
 MY-0.006
 cave-0.006
Token again
Token ID757
Feature activation+2.20e-02

Pos logit contributions
bows+0.045
hing+0.043
owers+0.040
 understand+0.039
ruck+0.039
Neg logit contributions
 my-0.036
 me-0.030
 cage-0.028
 MY-0.027
 mine-0.027
Token.
Token ID13
Feature activation-5.23e-04

Pos logit contributions
 my+0.045
 MY+0.041
 cage+0.040
 slams+0.039
 me+0.039
Neg logit contributions
bows-0.031
 understand-0.029
hing-0.028
owers-0.027
ances-0.026
Token<|endoftext|>
Token ID50256
Feature activation-3.02e-03

Pos logit contributions
bows+0.001
hing+0.001
owers+0.001
 understand+0.001
ruck+0.001
Neg logit contributions
 my-0.001
 me-0.001
 cage-0.001
 mine-0.001
 MY-0.001
TokenS
Token ID50
Feature activation-1.17e-02

Pos logit contributions
bows+0.006
hing+0.006
owers+0.006
 understand+0.005
ruck+0.005
Neg logit contributions
 my-0.005
 me-0.004
 cage-0.004
 mine-0.004
 MY-0.004
Tokenully
Token ID2132
Feature activation+1.80e-03

Pos logit contributions
bows+0.020
hing+0.020
owers+0.019
 understand+0.018
ruck+0.018
Neg logit contributions
 my-0.021
 me-0.017
 gru-0.017
 cave-0.017
 cage-0.017
Token and
Token ID290
Feature activation-1.96e-02

Pos logit contributions
 my+0.003
 me+0.003
 cage+0.003
 MY+0.003
 mine+0.003
Neg logit contributions
bows-0.003
hing-0.003
owers-0.003
 understand-0.003
ruck-0.003
Token his
Token ID465
Feature activation-1.73e-02

Pos logit contributions
bows+0.048
hing+0.045
owers+0.044
akes+0.041
ances+0.041
Neg logit contributions
 my-0.028
 me-0.023
 your-0.019
 mine-0.018
 this-0.018
Token family
Token ID1641
Feature activation-2.20e-02

Pos logit contributions
bows+0.043
hing+0.043
ruck+0.038
owers+0.038
akes+0.037
Neg logit contributions
 my-0.023
 me-0.020
 cage-0.016
 cave-0.016
 mine-0.015
Token the
Token ID262
Feature activation-1.23e-02

Pos logit contributions
bows+0.021
hing+0.019
owers+0.018
 understand+0.018
 sizes+0.017
Neg logit contributions
 my-0.022
 MY-0.020
 cage-0.019
 me-0.019
 gru-0.018
Token brightest
Token ID33871
Feature activation-1.69e-02

Pos logit contributions
bows+0.023
hing+0.022
 understand+0.021
owers+0.021
ruck+0.020
Neg logit contributions
 my-0.022
 me-0.019
 cage-0.017
 mine-0.017
 cave-0.017
Token.
Token ID13
Feature activation-1.18e-03

Pos logit contributions
bows+0.030
hing+0.029
owers+0.027
 understand+0.026
ruck+0.025
Neg logit contributions
 my-0.031
 me-0.027
 cage-0.025
 MY-0.025
 mine-0.024
Token She
Token ID1375
Feature activation+9.18e-03

Pos logit contributions
bows+0.002
owers+0.002
hing+0.002
 understand+0.002
 sizes+0.001
Neg logit contributions
 my-0.002
 away-0.002
 cage-0.002
 MY-0.002
 me-0.002
Token loved
Token ID6151
Feature activation-8.89e-03

Pos logit contributions
 my+0.019
 me+0.018
 cage+0.016
 cave+0.016
 mine+0.016
Neg logit contributions
bows-0.015
hing-0.014
owers-0.013
ruck-0.012
 understand-0.012
Token to
Token ID284
Feature activation-1.33e-02

Pos logit contributions
bows+0.021
hing+0.021
owers+0.020
 understand+0.019
ruck+0.019
Neg logit contributions
 my-0.012
 me-0.010
 cage-0.008
 mine-0.008
 cave-0.008
Token compete
Token ID9320
Feature activation-4.17e-02

Pos logit contributions
bows+0.025
hing+0.022
owers+0.022
ruck+0.021
 sizes+0.020
Neg logit contributions
 my-0.025
 me-0.022
 cage-0.020
 cave-0.019
 mine-0.019
Token with
Token ID351
Feature activation-8.65e-03

Pos logit contributions
bows+0.086
hing+0.083
owers+0.079
 understand+0.076
ruck+0.075
Neg logit contributions
 my-0.066
 me-0.055
 cage-0.049
 mine-0.049
 MY-0.049
Token the
Token ID262
Feature activation-1.53e-02

Pos logit contributions
bows+0.017
hing+0.016
owers+0.015
ruck+0.014
 understand+0.014
Neg logit contributions
 my-0.016
 me-0.013
 cage-0.012
 mine-0.012
 MY-0.011
Token other
Token ID584
Feature activation-1.91e-02

Pos logit contributions
bows+0.029
hing+0.029
owers+0.026
 understand+0.026
ruck+0.026
Neg logit contributions
 my-0.027
 me-0.024
 cage-0.022
 mine-0.022
 cave-0.020
Token animals
Token ID4695
Feature activation-3.69e-02

Pos logit contributions
bows+0.032
hing+0.031
owers+0.028
 understand+0.028
ruck+0.027
Neg logit contributions
 my-0.038
 me-0.033
 cage-0.030
 MY-0.030
 mine-0.030
Token Lily
Token ID20037
Feature activation+8.51e-04

Pos logit contributions
 my+0.007
 cave+0.006
 cage+0.006
 MY+0.006
 me+0.006
Neg logit contributions
bows-0.006
owers-0.006
hing-0.006
 understand-0.005
ruck-0.005
Token told
Token ID1297
Feature activation-7.06e-03

Pos logit contributions
 my+0.002
 me+0.001
 cage+0.001
 MY+0.001
 mine+0.001
Neg logit contributions
bows-0.002
hing-0.001
owers-0.001
 understand-0.001
ruck-0.001
Token her
Token ID607
Feature activation+7.26e-03

Pos logit contributions
bows+0.011
hing+0.011
 understand+0.010
owers+0.010
ruck+0.009
Neg logit contributions
 my-0.014
 me-0.012
 cage-0.012
 MY-0.012
 cave-0.012
Token mom
Token ID1995
Feature activation+3.03e-03

Pos logit contributions
 my+0.010
 me+0.008
 cage+0.007
 mine+0.007
 cave+0.007
Neg logit contributions
bows-0.017
hing-0.016
owers-0.015
 understand-0.015
ruck-0.015
Token,
Token ID11
Feature activation-2.46e-04

Pos logit contributions
 my+0.004
 me+0.003
 mine+0.003
 cage+0.003
 this+0.003
Neg logit contributions
bows-0.007
hing-0.007
owers-0.007
akes-0.006
ruck-0.006
Token "
Token ID366
Feature activation-1.57e-02

Pos logit contributions
bows+0.001
hing+0.001
owers+0.001
ruck+0.001
ances+0.000
Neg logit contributions
 my-0.000
 me-0.000
 cage-0.000
 mine-0.000
 MY-0.000
TokenI
Token ID40
Feature activation+1.38e-02

Pos logit contributions
bows+0.034
hing+0.033
owers+0.030
 understand+0.030
ruck+0.030
Neg logit contributions
 my-0.023
 me-0.021
 cage-0.017
 MY-0.016
 mine-0.016
Token want
Token ID765
Feature activation+2.08e-02

Pos logit contributions
 my+0.033
 me+0.029
 cave+0.028
 gru+0.028
 cage+0.028
Neg logit contributions
 understand-0.016
bows-0.016
hing-0.014
owers-0.013
 were-0.012
Token that
Token ID326
Feature activation+2.81e-02

Pos logit contributions
 my+0.029
 gru+0.027
 cave+0.026
 cage+0.026
 MY+0.025
Neg logit contributions
bows-0.038
hing-0.037
 understand-0.037
owers-0.036
ruck-0.035
Token yellow
Token ID7872
Feature activation-8.02e-03

Pos logit contributions
har+0.047
lor+0.045
 gru+0.045
 my+0.045
 prison+0.045
Neg logit contributions
bows-0.041
 understand-0.036
owers-0.035
 were-0.034
 sizes-0.032
Token fruit
Token ID8234
Feature activation+6.87e-03

Pos logit contributions
bows+0.014
hing+0.013
owers+0.013
 understand+0.012
ruck+0.012
Neg logit contributions
 my-0.014
 me-0.012
 gru-0.012
 MY-0.012
 cave-0.011
Token is
Token ID318
Feature activation+1.10e-04

Pos logit contributions
 my+0.034
 gru+0.026
 me+0.025
 cave+0.024
 MY+0.024
Neg logit contributions
bows-0.060
hing-0.054
 sizes-0.050
owers-0.050
 understand-0.049
Token scary
Token ID14343
Feature activation-1.71e-02

Pos logit contributions
 my+0.000
 cage+0.000
 me+0.000
 cave+0.000
 gru+0.000
Neg logit contributions
bows-0.000
hing-0.000
 sizes-0.000
 understand-0.000
owers-0.000
Token and
Token ID290
Feature activation+2.35e-02

Pos logit contributions
bows+0.032
hing+0.030
owers+0.028
 understand+0.028
ruck+0.027
Neg logit contributions
 my-0.030
 me-0.026
 cage-0.024
 MY-0.024
 cave-0.023
Token mean
Token ID1612
Feature activation-1.77e-02

Pos logit contributions
 my+0.041
 me+0.036
 cave+0.036
 cage+0.036
 MY+0.034
Neg logit contributions
bows-0.040
hing-0.038
 understand-0.037
 sizes-0.035
owers-0.034
Token.
Token ID13
Feature activation-2.47e-03

Pos logit contributions
bows+0.034
hing+0.032
owers+0.030
 understand+0.030
ruck+0.029
Neg logit contributions
 my-0.030
 me-0.026
 cage-0.024
 MY-0.024
 cave-0.023
Token They
Token ID1119
Feature activation-2.61e-02

Pos logit contributions
bows+0.003
 understand+0.002
owers+0.002
 were+0.002
hing+0.002
Neg logit contributions
 my-0.006
 cave-0.006
 cage-0.005
 me-0.005
 mine-0.005
Token do
Token ID466
Feature activation-2.66e-03

Pos logit contributions
bows+0.049
owers+0.045
hing+0.044
ruck+0.044
anes+0.041
Neg logit contributions
 my-0.047
 me-0.045
 mine-0.040
 away-0.039
 cage-0.037
Token not
Token ID407
Feature activation-2.00e-02

Pos logit contributions
bows+0.005
hing+0.005
 understand+0.005
owers+0.005
ruck+0.004
Neg logit contributions
 my-0.004
 me-0.004
 cage-0.003
 cave-0.003
 MY-0.003
Token want
Token ID765
Feature activation+7.31e-03

Pos logit contributions
owers+0.030
bows+0.029
hing+0.027
anes+0.025
akes+0.024
Neg logit contributions
 my-0.044
 me-0.039
 mine-0.037
 away-0.036
 cage-0.035
Token to
Token ID284
Feature activation+6.43e-03

Pos logit contributions
 my+0.011
 gru+0.010
 cave+0.009
 cage+0.009
 MY+0.009
Neg logit contributions
bows-0.014
 understand-0.013
hing-0.012
ruck-0.012
 sizes-0.012
Token find
Token ID1064
Feature activation+1.72e-02

Pos logit contributions
 my+0.012
 me+0.010
 mine+0.009
 cage+0.009
 MY+0.009
Neg logit contributions
bows-0.012
hing-0.012
owers-0.012
ruck-0.011
 understand-0.010
Token and
Token ID290
Feature activation+1.50e-02

Pos logit contributions
bows+0.008
hing+0.008
owers+0.007
 understand+0.007
ruck+0.007
Neg logit contributions
 my-0.007
 me-0.006
 cage-0.005
 MY-0.005
 mine-0.005
Token I
Token ID314
Feature activation-4.96e-03

Pos logit contributions
 my+0.023
 cage+0.020
 me+0.019
 cave+0.019
 MY+0.019
Neg logit contributions
bows-0.029
hing-0.027
 understand-0.027
 sizes-0.026
owers-0.025
Token love
Token ID1842
Feature activation+3.75e-03

Pos logit contributions
bows+0.008
hing+0.007
owers+0.007
ruck+0.007
 understand+0.006
Neg logit contributions
 my-0.010
 me-0.009
 cage-0.008
 mine-0.008
 MY-0.008
Token each
Token ID1123
Feature activation+8.30e-03

Pos logit contributions
 my+0.005
 cage+0.004
 cave+0.004
 gru+0.004
 me+0.004
Neg logit contributions
bows-0.008
hing-0.008
owers-0.008
ruck-0.008
 understand-0.007
Token other
Token ID584
Feature activation-8.65e-03

Pos logit contributions
 my+0.016
 me+0.014
 cage+0.013
 MY+0.013
 cave+0.013
Neg logit contributions
bows-0.014
hing-0.013
owers-0.013
 understand-0.012
ruck-0.012
Token?"
Token ID1701
Feature activation-9.88e-03

Pos logit contributions
bows+0.018
hing+0.017
owers+0.016
 understand+0.016
ruck+0.016
Neg logit contributions
 my-0.013
 me-0.011
 cage-0.010
 MY-0.010
 cave-0.010
Token
Token ID198
Feature activation+4.14e-03

Pos logit contributions
bows+0.020
hing+0.019
owers+0.018
 understand+0.018
ruck+0.017
Neg logit contributions
 my-0.016
 me-0.014
 cage-0.012
 MY-0.012
 mine-0.012
Token
Token ID198
Feature activation+1.01e-02

Pos logit contributions
 my+0.007
 me+0.006
 cage+0.006
 MY+0.006
 cave+0.006
Neg logit contributions
bows-0.008
hing-0.007
owers-0.007
 understand-0.007
ruck-0.006
TokenTom
Token ID13787
Feature activation+4.94e-03

Pos logit contributions
 my+0.023
 cave+0.020
 me+0.020
 cage+0.020
 MY+0.019
Neg logit contributions
bows-0.013
hing-0.013
 understand-0.012
owers-0.011
ruck-0.011
Token nodded
Token ID14464
Feature activation+5.84e-03

Pos logit contributions
 my+0.010
 me+0.009
 cage+0.009
 MY+0.008
 cave+0.008
Neg logit contributions
bows-0.007
hing-0.007
 understand-0.007
owers-0.006
ruck-0.006
Token and
Token ID290
Feature activation+1.81e-03

Pos logit contributions
 my+0.010
 me+0.009
 cage+0.008
 MY+0.008
 mine+0.008
Neg logit contributions
bows-0.011
hing-0.010
owers-0.010
 understand-0.009
ruck-0.009
Token to
Token ID284
Feature activation-4.92e-04

Pos logit contributions
 my+0.043
 MY+0.038
 me+0.037
 cave+0.036
 cage+0.036
Neg logit contributions
bows-0.044
hing-0.039
 understand-0.038
ruck-0.036
 sizes-0.036
Token Mia
Token ID32189
Feature activation+5.65e-03

Pos logit contributions
bows+0.001
hing+0.001
 understand+0.001
ruck+0.001
owers+0.001
Neg logit contributions
 my-0.001
 MY-0.001
 cage-0.001
 cave-0.001
 me-0.001
Token.
Token ID13
Feature activation-5.72e-03

Pos logit contributions
 my+0.011
 MY+0.009
 me+0.009
 mine+0.009
 cage+0.009
Neg logit contributions
bows-0.009
hing-0.009
 understand-0.008
owers-0.008
ruck-0.008
Token
Token ID198
Feature activation+6.19e-03

Pos logit contributions
bows+0.008
 understand+0.008
hing+0.008
owers+0.007
ickets+0.007
Neg logit contributions
 my-0.011
 me-0.010
 mine-0.010
 MY-0.010
 cage-0.010
Token
Token ID198
Feature activation+1.54e-02

Pos logit contributions
 my+0.011
 me+0.009
 cage+0.009
 MY+0.008
 mine+0.008
Neg logit contributions
bows-0.012
hing-0.011
owers-0.010
 understand-0.010
ruck-0.010
Token"
Token ID1
Feature activation-1.09e-02

Pos logit contributions
 my+0.023
 me+0.019
 cage+0.018
 mine+0.018
 cave+0.018
Neg logit contributions
hing-0.031
bows-0.031
 understand-0.029
owers-0.028
ruck-0.027
TokenLook
Token ID8567
Feature activation+2.63e-02

Pos logit contributions
bows+0.021
hing+0.020
owers+0.019
ances+0.019
ruck+0.019
Neg logit contributions
 my-0.018
 me-0.015
 cage-0.013
 gru-0.013
 MY-0.013
Token,
Token ID11
Feature activation+5.71e-03

Pos logit contributions
 my+0.042
 cave+0.039
 cage+0.039
 gru+0.038
 MY+0.037
Neg logit contributions
bows-0.043
hing-0.041
 understand-0.040
owers-0.040
 were-0.036
Token Mia
Token ID32189
Feature activation+1.37e-02

Pos logit contributions
har+0.008
fired+0.008
eds+0.007
 destination+0.007
 gru+0.007
Neg logit contributions
bows-0.010
 understand-0.010
 were-0.009
 dancers-0.009
hing-0.009
Token.
Token ID13
Feature activation+1.61e-03

Pos logit contributions
 my+0.022
 MY+0.019
 me+0.018
 mine+0.018
 cave+0.018
Neg logit contributions
bows-0.027
hing-0.024
owers-0.023
 understand-0.023
ruck-0.022
Token It
Token ID632
Feature activation+3.23e-02

Pos logit contributions
 my+0.002
 me+0.002
 mistake+0.002
 gru+0.002
 away+0.002
Neg logit contributions
bows-0.003
owers-0.003
hing-0.003
 understand-0.003
anes-0.003
Token the
Token ID262
Feature activation-8.49e-03

Pos logit contributions
bows+0.048
hing+0.047
owers+0.045
ruck+0.042
 understand+0.042
Neg logit contributions
 my-0.033
 me-0.027
 cave-0.025
 cage-0.024
 this-0.023
Token tourist
Token ID18473
Feature activation-2.43e-03

Pos logit contributions
bows+0.015
hing+0.014
owers+0.013
 understand+0.013
ruck+0.013
Neg logit contributions
 my-0.016
 me-0.013
 MY-0.012
 cage-0.012
 mine-0.012
Token took
Token ID1718
Feature activation-1.22e-02

Pos logit contributions
bows+0.004
hing+0.004
owers+0.004
 understand+0.004
ruck+0.004
Neg logit contributions
 my-0.004
 me-0.004
 cage-0.003
 MY-0.003
 mine-0.003
Token lots
Token ID6041
Feature activation-1.10e-02

Pos logit contributions
bows+0.022
hing+0.021
 understand+0.019
owers+0.019
ruck+0.019
Neg logit contributions
 my-0.022
 me-0.018
 MY-0.018
 cage-0.018
 cave-0.017
Token of
Token ID286
Feature activation-2.33e-02

Pos logit contributions
bows+0.019
hing+0.017
owers+0.017
 understand+0.016
ruck+0.015
Neg logit contributions
 my-0.022
 me-0.019
 cage-0.018
 cave-0.017
 mine-0.017
Token pictures
Token ID5986
Feature activation-6.16e-02

Pos logit contributions
bows+0.043
hing+0.041
owers+0.039
 understand+0.036
ickets+0.035
Neg logit contributions
 my-0.043
 me-0.037
 cage-0.035
 cave-0.033
 mine-0.033
Token and
Token ID290
Feature activation-3.07e-02

Pos logit contributions
bows+0.121
hing+0.118
owers+0.114
 understand+0.110
ances+0.105
Neg logit contributions
 my-0.100
 me-0.084
 cage-0.079
 cave-0.077
 mine-0.076
Token learned
Token ID4499
Feature activation-7.02e-03

Pos logit contributions
bows+0.049
hing+0.049
owers+0.047
 understand+0.042
ickets+0.041
Neg logit contributions
 my-0.062
 me-0.054
 cage-0.050
 cave-0.050
 mine-0.049
Token new
Token ID649
Feature activation-2.41e-02

Pos logit contributions
bows+0.012
hing+0.011
 understand+0.010
ruck+0.010
owers+0.010
Neg logit contributions
 my-0.013
 me-0.012
 cage-0.011
 MY-0.011
 cave-0.011
Token things
Token ID1243
Feature activation-4.54e-02

Pos logit contributions
bows+0.046
hing+0.045
owers+0.042
 understand+0.040
ruck+0.040
Neg logit contributions
 my-0.042
 me-0.036
 cage-0.033
 cave-0.032
 mine-0.032
Token.
Token ID13
Feature activation-4.33e-03

Pos logit contributions
bows+0.090
hing+0.086
owers+0.081
 understand+0.080
ruck+0.078
Neg logit contributions
 my-0.075
 me-0.063
 cage-0.058
 MY-0.057
 cave-0.056
Token friends
Token ID2460
Feature activation-5.21e-02

Pos logit contributions
bows+0.037
hing+0.036
owers+0.033
ruck+0.033
 understand+0.032
Neg logit contributions
 my-0.020
 me-0.016
 cage-0.014
 mine-0.014
 cave-0.014
Token squeezed
Token ID29650
Feature activation-2.58e-02

Pos logit contributions
bows+0.092
hing+0.083
owers+0.081
ickets+0.079
ruck+0.078
Neg logit contributions
 my-0.090
 me-0.088
 away-0.079
 cage-0.077
 mine-0.075
Token lots
Token ID6041
Feature activation-1.08e-02

Pos logit contributions
bows+0.057
hing+0.054
owers+0.052
 understand+0.050
ruck+0.050
Neg logit contributions
 my-0.038
 me-0.032
 cage-0.028
 mine-0.027
 MY-0.027
Token of
Token ID286
Feature activation-2.39e-02

Pos logit contributions
bows+0.018
hing+0.017
owers+0.016
 understand+0.015
ruck+0.014
Neg logit contributions
 my-0.022
 me-0.019
 cage-0.018
 mine-0.017
 MY-0.017
Token le
Token ID443
Feature activation-4.79e-02

Pos logit contributions
bows+0.051
hing+0.048
owers+0.046
ickets+0.044
ances+0.043
Neg logit contributions
 my-0.040
 me-0.035
 mine-0.029
 this-0.028
 away-0.028
Tokenmons
Token ID11567
Feature activation-6.00e-02

Pos logit contributions
 understand+0.077
hing+0.075
ruck+0.068
bows+0.068
epad+0.065
Neg logit contributions
 my-0.080
ar-0.080
 me-0.074
oh-0.073
 cage-0.072
Token and
Token ID290
Feature activation-2.02e-02

Pos logit contributions
bows+0.125
hing+0.124
 understand+0.118
ruck+0.115
owers+0.113
Neg logit contributions
 my-0.085
 me-0.074
 away-0.069
 cage-0.064
 mine-0.064
Token mixed
Token ID7668
Feature activation-3.96e-02

Pos logit contributions
bows+0.037
hing+0.037
owers+0.034
ickets+0.032
 understand+0.032
Neg logit contributions
 my-0.036
 me-0.032
 mine-0.028
 MY-0.028
 cage-0.028
Token them
Token ID606
Feature activation-3.62e-02

Pos logit contributions
bows+0.068
hing+0.065
 understand+0.061
owers+0.060
ruck+0.058
Neg logit contributions
 my-0.078
 me-0.069
 mine-0.061
 away-0.060
 MY-0.060
Token with
Token ID351
Feature activation-2.54e-02

Pos logit contributions
bows+0.070
hing+0.069
owers+0.064
ruck+0.063
 understand+0.063
Neg logit contributions
 my-0.061
 me-0.052
 mine-0.047
 cage-0.046
 away-0.046
Token sugar
Token ID7543
Feature activation-3.46e-02

Pos logit contributions
bows+0.054
hing+0.051
owers+0.049
ickets+0.047
ruck+0.047
Neg logit contributions
 my-0.041
 me-0.035
 mine-0.031
 away-0.030
 this-0.029
Token and
Token ID290
Feature activation-2.11e-02

Pos logit contributions
hing+0.042
bows+0.042
owers+0.040
ances+0.037
ruck+0.036
Neg logit contributions
 my-0.041
 me-0.034
 cage-0.032
 mine-0.031
 cave-0.030
Token they
Token ID484
Feature activation-5.05e-02

Pos logit contributions
bows+0.035
hing+0.033
owers+0.031
 understand+0.030
ruck+0.029
Neg logit contributions
 my-0.042
 me-0.036
 cage-0.034
 MY-0.033
 mine-0.033
Token had
Token ID550
Feature activation-2.37e-02

Pos logit contributions
bows+0.084
hing+0.075
owers+0.074
ruck+0.073
ances+0.069
Neg logit contributions
 my-0.097
 me-0.089
 cage-0.081
 away-0.080
 cave-0.079
Token lots
Token ID6041
Feature activation-2.97e-02

Pos logit contributions
bows+0.037
hing+0.034
owers+0.032
 understand+0.031
ruck+0.030
Neg logit contributions
 my-0.049
 me-0.043
 cage-0.041
 MY-0.040
 cave-0.040
Token of
Token ID286
Feature activation-2.86e-02

Pos logit contributions
bows+0.053
hing+0.051
owers+0.049
 understand+0.045
ances+0.045
Neg logit contributions
 my-0.055
 me-0.047
 cage-0.044
 cave-0.042
 mine-0.041
Token fun
Token ID1257
Feature activation-6.10e-02

Pos logit contributions
hing+0.058
bows+0.058
owers+0.054
ickets+0.052
ruck+0.051
Neg logit contributions
 my-0.047
 me-0.041
 cage-0.038
 mine-0.037
 cave-0.036
Token together
Token ID1978
Feature activation-2.74e-02

Pos logit contributions
hing+0.125
bows+0.125
 understand+0.115
ickets+0.113
owers+0.113
Neg logit contributions
 my-0.092
 me-0.077
 cave-0.074
 cage-0.071
 this-0.069
Token!
Token ID0
Feature activation-7.95e-03

Pos logit contributions
bows+0.052
hing+0.049
owers+0.047
 understand+0.045
ruck+0.044
Neg logit contributions
 my-0.048
 me-0.041
 cage-0.038
 MY-0.037
 mine-0.037
Token<|endoftext|>
Token ID50256
Feature activation-7.18e-03

Pos logit contributions
bows+0.015
hing+0.014
owers+0.013
 understand+0.013
ruck+0.012
Neg logit contributions
 my-0.014
 me-0.012
 cage-0.011
 MY-0.011
 mine-0.011
TokenOnce
Token ID7454
Feature activation-7.46e-03

Pos logit contributions
bows+0.016
hing+0.015
owers+0.014
 understand+0.014
ruck+0.014
Neg logit contributions
 my-0.010
 me-0.009
 cage-0.008
 cave-0.008
 MY-0.008
Token upon
Token ID2402
Feature activation-2.09e-02

Pos logit contributions
bows+0.013
hing+0.012
owers+0.011
 understand+0.011
ruck+0.011
Neg logit contributions
 my-0.015
 me-0.013
 cage-0.012
 MY-0.012
 cave-0.011
Tokena
Token ID64
Feature activation+2.02e-03

Pos logit contributions
bows+0.045
hing+0.044
owers+0.041
ruck+0.039
 understand+0.039
Neg logit contributions
 my-0.034
 me-0.029
 cage-0.027
 cave-0.025
 MY-0.024
Token liked
Token ID8288
Feature activation-5.82e-03

Pos logit contributions
 my+0.005
 me+0.004
 cage+0.004
 MY+0.004
 mine+0.004
Neg logit contributions
bows-0.003
hing-0.002
owers-0.002
 understand-0.002
ruck-0.002
Token to
Token ID284
Feature activation-1.55e-02

Pos logit contributions
bows+0.015
hing+0.015
owers+0.014
 understand+0.013
ruck+0.013
Neg logit contributions
 my-0.007
 me-0.005
 mine-0.004
 cage-0.004
 cave-0.004
Token do
Token ID466
Feature activation-1.86e-02

Pos logit contributions
bows+0.030
hing+0.028
owers+0.028
ruck+0.027
 sizes+0.026
Neg logit contributions
 my-0.027
 me-0.024
 cage-0.020
 mine-0.020
 cave-0.020
Token fun
Token ID1257
Feature activation-4.18e-02

Pos logit contributions
bows+0.034
hing+0.032
owers+0.030
 understand+0.030
ruck+0.028
Neg logit contributions
 my-0.033
 me-0.028
 cage-0.027
 MY-0.027
 cave-0.026
Token things
Token ID1243
Feature activation-5.75e-02

Pos logit contributions
bows+0.087
hing+0.085
owers+0.081
 understand+0.079
ruck+0.077
Neg logit contributions
 my-0.067
 me-0.058
 cave-0.049
 cage-0.048
 mine-0.048
Token every
Token ID790
Feature activation+1.96e-02

Pos logit contributions
hing+0.124
bows+0.122
owers+0.117
ruck+0.115
 understand+0.113
Neg logit contributions
 my-0.084
 me-0.069
 mine-0.061
 away-0.059
 cage-0.059
Token day
Token ID1110
Feature activation-1.43e-02

Pos logit contributions
 my+0.048
 MY+0.042
 gru+0.042
 me+0.041
ar+0.040
Neg logit contributions
bows-0.020
hing-0.019
 understand-0.018
ruck-0.017
owers-0.016
Token.
Token ID13
Feature activation-5.79e-03

Pos logit contributions
bows+0.028
hing+0.027
owers+0.025
 understand+0.025
ruck+0.024
Neg logit contributions
 my-0.024
 me-0.020
 cage-0.018
 MY-0.018
 mine-0.018
Token
Token ID198
Feature activation-4.67e-03

Pos logit contributions
bows+0.011
hing+0.011
owers+0.010
 understand+0.010
ruck+0.010
Neg logit contributions
 my-0.010
 me-0.008
 cage-0.007
 cave-0.007
 mine-0.007
Token
Token ID198
Feature activation+5.49e-03

Pos logit contributions
bows+0.010
hing+0.010
owers+0.009
ances+0.009
 sizes+0.009
Neg logit contributions
 my-0.007
 me-0.006
 cage-0.005
 cave-0.005
 MY-0.005
Token.
Token ID13
Feature activation-1.01e-02

Pos logit contributions
 my+0.026
 MY+0.023
 me+0.021
 mine+0.021
 gru+0.021
Neg logit contributions
bows-0.020
hing-0.016
 sizes-0.016
owers-0.016
 understand-0.015
Token She
Token ID1375
Feature activation+1.81e-02

Pos logit contributions
hing+0.019
bows+0.019
owers+0.017
ruck+0.016
 understand+0.016
Neg logit contributions
 my-0.018
 me-0.016
 cage-0.014
 cave-0.014
 MY-0.014
Token found
Token ID1043
Feature activation+2.55e-03

Pos logit contributions
 my+0.038
 MY+0.031
 mine+0.031
 me+0.030
 gru+0.029
Neg logit contributions
bows-0.027
 were-0.025
 understand-0.025
hing-0.024
owers-0.022
Token many
Token ID867
Feature activation-8.28e-03

Pos logit contributions
 my+0.004
 me+0.004
 cage+0.003
 cave+0.003
 mine+0.003
Neg logit contributions
bows-0.005
hing-0.005
owers-0.005
 understand-0.004
ruck-0.004
Token pe
Token ID613
Feature activation-3.60e-04

Pos logit contributions
bows+0.014
hing+0.014
owers+0.013
 understand+0.012
ruck+0.012
Neg logit contributions
 my-0.016
 me-0.014
 cage-0.013
 MY-0.012
 mine-0.012
Tokenb
Token ID65
Feature activation-5.74e-02

Pos logit contributions
bows+0.001
hing+0.001
owers+0.001
 understand+0.001
ruck+0.001
Neg logit contributions
 my-0.001
 me-0.001
 cage-0.000
 cave-0.000
 MY-0.000
Tokenbles
Token ID7689
Feature activation-3.76e-02

Pos logit contributions
bows+0.071
hing+0.065
 understand+0.061
 viol+0.059
 sizes+0.053
Neg logit contributions
 my-0.134
 me-0.118
 cage-0.111
 cave-0.108
 your-0.108
Token.
Token ID13
Feature activation-1.35e-02

Pos logit contributions
bows+0.071
hing+0.068
owers+0.064
 understand+0.062
ruck+0.060
Neg logit contributions
 my-0.066
 me-0.056
 cage-0.052
 cave-0.051
 mine-0.051
Token She
Token ID1375
Feature activation+9.30e-03

Pos logit contributions
hing+0.024
bows+0.024
owers+0.022
ruck+0.022
 understand+0.021
Neg logit contributions
 my-0.024
 me-0.021
 cage-0.019
 MY-0.019
 mine-0.018
Token filled
Token ID5901
Feature activation-1.57e-02

Pos logit contributions
 my+0.017
 me+0.015
 cage+0.014
 MY+0.013
 mine+0.013
Neg logit contributions
bows-0.017
hing-0.016
owers-0.015
 understand-0.015
ruck-0.014
Token her
Token ID607
Feature activation+1.13e-03

Pos logit contributions
bows+0.034
hing+0.034
owers+0.032
 understand+0.031
ruck+0.030
Neg logit contributions
 my-0.024
 me-0.020
 cage-0.017
 mine-0.017
 MY-0.016