TOP ACTIVATIONS
MAX = 0.075

Then she followed the river bank until she could see her
away in her pig gy bank every week . This way
of sight from the river bank . The little
flew down to the river bank and looked for the w
elephant went to the river bank and he noticed something strange
ran down to the river bank . When
a raft near the river bank and she thought it was
water and into the river bank . Su
hadn 't left the river bank at all .
her seat on the river bank . From that day on
to seat on the river bank and watch the other animals
," Be good , Holly ,. I 'll be back soon
John quickly scanned the river bank to see if anything was
pulled out his pig gy bank and emptied it on the
pulled herself onto the river bank , and waved hello to
and ran to the river bank . Charlie
appeared and destroyed the river bank ! Water was everywhere and
and grabbed his pig gy bank and brought it back to
man standing by the river bank . The old man showed
back down on the river bank . He continued

MIN ACTIVATIONS
MIN = -0.175

When Daisy put the magn ifying glass down , the cater
flashlight , and a magn ifying glass . As
She takes out a magn ifying glass . She looks at
, next to her magn ifying glass . She looked at
the necklace and the magn ifying glass and started to examine
s a books he lf . You can use
He brought his magn ifying glass and looked closely .
He took out a magn ifying glass and a notebook and
the watch with the magn ifying glass . He saw the
, handing him the magn ifying glass . She looked around
He pointed his magn ifying glass at different classmates and
a flashlight and a magn ifying glass to examine their finds
said , holding the magn ifying glass over a black ant
the necklace and the magn ifying glass and told him that
She had a big magn ifying glass that made things look
it she had a magn ifying glass , a tiny torch
and handed him the magn ifying glass . He
hat and grabbed her magn ifying glass . She was ready
cried , dropping the magn ifying glass . "
. Lily had a magn ifying glass that she found in

INTERVAL 0.013 - 0.075
CONTAINS 7.517%

hold the string and run with the k ite . The
the plates and cups in their right spots . Suddenly ,
have fun playing tag with his friend . The end .
the rock . Mom smiled and took the rock
. The bunny tried to eat the sword , but it

INTERVAL -0.050 - 0.013
CONTAINS 92.459%

Tom . Tom is hiding behind the sofa . He has
placed it under her pillow . She waited
and hon k and roll around . Sara and Ben laughed
made a loud noise . Alice said ,
to explore the world around him . One day , Spot

INTERVAL -0.113 - -0.050
CONTAINS 0.024%

hat and grabbed her magn ifying glass . She was ready
it back on the books he lf so we can find
. She was filled with amaz ement . At
chair and allowed the ha ird ress er to give her
. The books he lf came crashing down on

INTERVAL -0.175 - -0.113
CONTAINS 0.000%

flashlight , and a magn ifying glass . As
She takes out a magn ifying glass . She looks at
the necklace and the magn ifying glass and started to examine
, next to her magn ifying glass . She looked at
s a books he lf . You can use
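
NOTE ON INTERVALS
The four INTERVAL ranges above look like equal-width bins between MIN = -0.175 and MAX = 0.075, and the four CONTAINS shares sum to 100%. The dashboard does not state its binning rule, so the sketch below is an assumption inferred from the displayed numbers only. The function name interval_edges is hypothetical.

```python
# Sketch only: assumes equal-width bins between the reported MIN and MAX.
def interval_edges(lo, hi, n_bins):
    """Return n_bins + 1 equally spaced edges from lo to hi."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]

# MIN, MAX and the number of intervals are taken from the listing above.
edges = interval_edges(-0.175, 0.075, 4)

# CONTAINS shares as listed, ordered from the lowest interval to the highest.
shares = [0.000, 0.024, 92.459, 7.517]

for (a, b), share in zip(zip(edges, edges[1:]), shares):
    print(f"{a:+.4f} to {b:+.4f}: {share:.3f}%")
```

Under this assumption the edges match the displayed bounds to within the three-decimal rounding used in the listing.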
Token Then
Token ID3244
Feature activation+1.54e-02

Pos logit contributions
rums+0.006
 runs+0.005
 turns+0.005
 happens+0.005
 comes+0.004
Neg logit contributions
he-0.007
ifying-0.006
 contain-0.006
 spine-0.005
 entire-0.005
Token she
Token ID673
Feature activation+2.75e-03

Pos logit contributions
rums+0.034
 runs+0.028
 turns+0.026
 happens+0.026
 comes+0.026
Neg logit contributions
he-0.028
ifying-0.024
 contain-0.024
 usual-0.021
 entire-0.020
Token followed
Token ID3940
Feature activation+1.03e-03

Pos logit contributions
rums+0.005
 runs+0.004
 turns+0.004
 comes+0.004
 happens+0.004
Neg logit contributions
he-0.006
ifying-0.006
 contain-0.005
 entire-0.005
 spine-0.005
Token the
Token ID262
Feature activation+4.43e-03

Pos logit contributions
rums+0.001
 runs+0.001
 turns+0.001
 comes+0.001
 tries+0.001
Neg logit contributions
he-0.003
ifying-0.002
 contain-0.002
 entire-0.002
pron-0.002
Token river
Token ID7850
Feature activation+1.92e-02

Pos logit contributions
rums+0.009
 runs+0.008
 turns+0.008
 comes+0.007
 happens+0.007
Neg logit contributions
he-0.009
ifying-0.008
 contain-0.007
 spine-0.006
 entire-0.006
Tokenbank
Token ID17796
Feature activation+7.52e-02

Pos logit contributions
rums+0.042
 runs+0.038
 turns+0.036
 happens+0.035
 comes+0.034
Neg logit contributions
he-0.036
ifying-0.030
 contain-0.030
 entire-0.026
 spine-0.025
Token until
Token ID1566
Feature activation-5.74e-03

Pos logit contributions
rums+0.159
 runs+0.146
 turns+0.139
 comes+0.133
 happens+0.132
Neg logit contributions
he-0.144
ifying-0.123
 contain-0.119
 entire-0.103
 spine-0.101
Token she
Token ID673
Feature activation+3.71e-03

Pos logit contributions
he+0.014
ifying+0.013
 contain+0.013
 entire+0.011
 spine+0.011
Neg logit contributions
rums-0.008
 runs-0.008
 turns-0.008
 comes-0.007
 happens-0.007
Token could
Token ID714
Feature activation+1.11e-02

Pos logit contributions
rums+0.006
 runs+0.005
 turns+0.005
 comes+0.005
 happens+0.005
Neg logit contributions
he-0.009
ifying-0.008
 contain-0.008
 spine-0.007
 entire-0.007
Token see
Token ID766
Feature activation+4.61e-03

Pos logit contributions
 runs+0.019
rums+0.018
 comes+0.017
 turns+0.017
 tries+0.016
Neg logit contributions
he-0.025
ifying-0.023
 contain-0.019
pron-0.019
 entire-0.019
Token her
Token ID607
Feature activation+1.17e-02

Pos logit contributions
rums+0.009
 runs+0.008
 turns+0.008
 happens+0.007
 comes+0.007
Neg logit contributions
he-0.009
ifying-0.008
 contain-0.008
 usual-0.007
 entire-0.007
Token away
Token ID1497
Feature activation-1.67e-02

Pos logit contributions
he+0.014
ifying+0.013
 contain+0.012
 entire+0.011
 spine+0.011
Neg logit contributions
 runs-0.011
rums-0.011
 turns-0.010
 comes-0.010
es-0.010
Token in
Token ID287
Feature activation+3.43e-03

Pos logit contributions
he+0.036
ifying+0.031
 contain+0.030
pron+0.027
 entire+0.027
Neg logit contributions
rums-0.029
 runs-0.027
es-0.025
 turns-0.025
 comes-0.024
Token her
Token ID607
Feature activation+1.03e-02

Pos logit contributions
rums+0.005
 runs+0.005
 turns+0.005
 comes+0.004
 tries+0.004
Neg logit contributions
he-0.008
ifying-0.008
 contain-0.007
 spine-0.007
ancer-0.007
Token pig
Token ID12967
Feature activation+1.34e-02

Pos logit contributions
 runs+0.022
rums+0.022
 turns+0.021
 comes+0.020
 tries+0.020
Neg logit contributions
he-0.019
ifying-0.015
 contain-0.015
ancer-0.012
 entire-0.012
Tokengy
Token ID1360
Feature activation+1.40e-02

Pos logit contributions
rums+0.024
 runs+0.022
 turns+0.021
es+0.020
 comes+0.020
Neg logit contributions
he-0.030
ifying-0.026
 contain-0.025
 entire-0.022
 spine-0.021
Tokenbank
Token ID17796
Feature activation+7.29e-02

Pos logit contributions
rums+0.026
 runs+0.025
 turns+0.023
 comes+0.022
 tries+0.022
Neg logit contributions
he-0.030
ifying-0.025
 contain-0.024
 entire-0.021
 spine-0.021
Token every
Token ID790
Feature activation-1.20e-02

Pos logit contributions
rums+0.146
 runs+0.134
 turns+0.125
 comes+0.122
es+0.119
Neg logit contributions
he-0.145
ifying-0.126
 contain-0.120
 entire-0.106
 spine-0.102
Token week
Token ID1285
Feature activation-2.48e-03

Pos logit contributions
he+0.022
ifying+0.019
 contain+0.017
 spine+0.015
 entire+0.014
Neg logit contributions
 runs-0.025
rums-0.025
 turns-0.024
 comes-0.024
 tries-0.023
Token.
Token ID13
Feature activation+9.81e-04

Pos logit contributions
he+0.005
ifying+0.004
 contain+0.004
 entire+0.003
 spine+0.003
Neg logit contributions
rums-0.005
 runs-0.005
 turns-0.005
 comes-0.004
 happens-0.004
Token This
Token ID770
Feature activation+5.65e-03

Pos logit contributions
rums+0.002
 runs+0.002
 turns+0.002
 happens+0.002
 comes+0.002
Neg logit contributions
he-0.002
ifying-0.002
 contain-0.002
 entire-0.002
 spine-0.001
Token way
Token ID835
Feature activation+1.85e-02

Pos logit contributions
rums+0.014
 runs+0.013
 turns+0.012
 comes+0.012
 happens+0.012
Neg logit contributions
he-0.009
ifying-0.008
 contain-0.007
 entire-0.006
 spine-0.006
Token of
Token ID286
Feature activation-3.53e-02

Pos logit contributions
he+0.004
ifying+0.003
 contain+0.003
pron+0.003
 spine+0.003
Neg logit contributions
 runs-0.005
rums-0.005
 comes-0.004
 tries-0.004
 turns-0.004
Token sight
Token ID6504
Feature activation+4.05e-03

Pos logit contributions
he+0.077
ifying+0.068
 contain+0.066
 entire+0.058
 spine+0.058
Neg logit contributions
rums-0.064
 runs-0.060
 turns-0.058
 comes-0.054
 tries-0.053
Token from
Token ID422
Feature activation-6.23e-03

Pos logit contributions
rums+0.008
 runs+0.008
 turns+0.007
 comes+0.007
 happens+0.007
Neg logit contributions
he-0.008
ifying-0.007
 contain-0.007
 entire-0.006
 spine-0.006
Token the
Token ID262
Feature activation+5.66e-03

Pos logit contributions
he+0.015
ifying+0.013
 contain+0.013
pron+0.012
 spine+0.012
Neg logit contributions
 runs-0.009
rums-0.009
 turns-0.008
 comes-0.008
 tries-0.008
Token river
Token ID7850
Feature activation+1.65e-02

Pos logit contributions
 runs+0.009
 turns+0.008
 sounds+0.008
 comes+0.008
rums+0.008
Neg logit contributions
he-0.013
 contain-0.011
ifying-0.011
pron-0.010
 spine-0.009
Tokenbank
Token ID17796
Feature activation+7.15e-02

Pos logit contributions
rums+0.032
 runs+0.030
 turns+0.028
 comes+0.028
 happens+0.026
Neg logit contributions
he-0.033
ifying-0.029
 contain-0.027
 entire-0.024
pron-0.024
Token.
Token ID13
Feature activation+1.15e-03

Pos logit contributions
rums+0.139
 runs+0.130
 turns+0.120
 comes+0.118
es+0.115
Neg logit contributions
he-0.145
ifying-0.126
 contain-0.120
 entire-0.106
pron-0.104
Token
Token ID198
Feature activation+2.89e-03

Pos logit contributions
rums+0.002
 runs+0.002
 happens+0.002
 turns+0.002
 comes+0.002
Neg logit contributions
he-0.002
ifying-0.002
 contain-0.002
 entire-0.002
 spine-0.002
Token
Token ID198
Feature activation+1.69e-03

Pos logit contributions
rums+0.006
 runs+0.005
 turns+0.005
 comes+0.005
 happens+0.005
Neg logit contributions
he-0.006
ifying-0.005
 contain-0.005
 entire-0.004
 usual-0.004
TokenThe
Token ID464
Feature activation-1.61e-02

Pos logit contributions
rums+0.004
 runs+0.004
 turns+0.004
 comes+0.004
 happens+0.004
Neg logit contributions
he-0.003
ifying-0.002
 contain-0.002
 entire-0.002
pron-0.002
Token little
Token ID1310
Feature activation-7.97e-03

Pos logit contributions
he+0.034
ifying+0.029
 contain+0.028
 spine+0.024
pron+0.023
Neg logit contributions
rums-0.030
 runs-0.030
 turns-0.028
 comes-0.027
 sounds-0.026
Token flew
Token ID13112
Feature activation-4.53e-04

Pos logit contributions
he+0.018
 contain+0.015
ifying+0.015
 entire+0.014
 usual+0.013
Neg logit contributions
rums-0.015
 runs-0.011
 turns-0.011
 happens-0.011
es-0.011
Token down
Token ID866
Feature activation-1.61e-02

Pos logit contributions
he+0.001
ifying+0.001
 contain+0.001
 entire+0.001
 spine+0.001
Neg logit contributions
rums-0.001
 runs-0.001
 turns-0.001
 comes-0.001
 happens-0.001
Token to
Token ID284
Feature activation+1.10e-03

Pos logit contributions
he+0.034
ifying+0.031
 contain+0.029
 spine+0.026
 entire+0.025
Neg logit contributions
rums-0.029
 runs-0.028
 turns-0.027
 comes-0.026
 tries-0.024
Token the
Token ID262
Feature activation-3.50e-04

Pos logit contributions
 runs+0.002
rums+0.002
 turns+0.001
 comes+0.001
 tries+0.001
Neg logit contributions
he-0.003
ifying-0.003
 contain-0.002
 spine-0.002
 entire-0.002
Token river
Token ID7850
Feature activation+1.59e-02

Pos logit contributions
he+0.001
 contain+0.001
ifying+0.001
ancer+0.000
 spine+0.000
Neg logit contributions
 runs-0.001
rums-0.001
 turns-0.001
 sounds-0.001
 comes-0.001
Tokenbank
Token ID17796
Feature activation+7.11e-02

Pos logit contributions
rums+0.032
 runs+0.029
 turns+0.028
 comes+0.027
 happens+0.026
Neg logit contributions
he-0.032
ifying-0.028
 contain-0.027
 entire-0.024
 spine-0.023
Token and
Token ID290
Feature activation+7.90e-04

Pos logit contributions
rums+0.149
 runs+0.137
 turns+0.130
 comes+0.125
 happens+0.123
Neg logit contributions
he-0.137
ifying-0.117
 contain-0.113
 entire-0.099
 spine-0.096
Token looked
Token ID3114
Feature activation-1.47e-02

Pos logit contributions
rums+0.002
 runs+0.001
 happens+0.001
 turns+0.001
 comes+0.001
Neg logit contributions
he-0.002
 contain-0.001
ifying-0.001
 usual-0.001
 spine-0.001
Token for
Token ID329
Feature activation+6.72e-03

Pos logit contributions
he+0.029
ifying+0.024
 contain+0.023
 entire+0.020
 spine+0.020
Neg logit contributions
rums-0.030
 runs-0.029
 turns-0.027
 comes-0.026
 happens-0.026
Token the
Token ID262
Feature activation+3.09e-03

Pos logit contributions
rums+0.013
 runs+0.011
 turns+0.010
 comes+0.010
 happens+0.010
Neg logit contributions
he-0.015
ifying-0.013
 contain-0.013
 entire-0.011
 usual-0.011
Token w
Token ID266
Feature activation+1.35e-03

Pos logit contributions
 runs+0.005
rums+0.005
 turns+0.005
 comes+0.005
 sounds+0.005
Neg logit contributions
he-0.007
ifying-0.006
 contain-0.006
 spine-0.005
pron-0.005
Token elephant
Token ID20950
Feature activation+6.38e-03

Pos logit contributions
 runs+0.005
rums+0.005
 turns+0.005
 comes+0.004
 sounds+0.004
Neg logit contributions
he-0.006
 contain-0.005
ifying-0.005
 spine-0.004
 entire-0.004
Token went
Token ID1816
Feature activation+9.12e-03

Pos logit contributions
rums+0.011
 runs+0.009
 turns+0.009
 happens+0.008
 comes+0.008
Neg logit contributions
he-0.015
ifying-0.013
 contain-0.013
 entire-0.012
 spine-0.011
Token to
Token ID284
Feature activation+3.23e-03

Pos logit contributions
rums+0.017
 runs+0.016
 comes+0.015
 turns+0.015
 tries+0.014
Neg logit contributions
he-0.019
ifying-0.017
 contain-0.016
 entire-0.014
 spine-0.014
Token the
Token ID262
Feature activation+3.26e-03

Pos logit contributions
rums+0.006
 runs+0.005
 turns+0.005
 comes+0.005
 tries+0.005
Neg logit contributions
he-0.007
ifying-0.006
 contain-0.006
 entire-0.005
 spine-0.005
Token river
Token ID7850
Feature activation+1.68e-02

Pos logit contributions
rums+0.006
 runs+0.006
 turns+0.005
 sounds+0.005
 comes+0.005
Neg logit contributions
he-0.007
 contain-0.006
ifying-0.006
ancer-0.005
 spine-0.005
Tokenbank
Token ID17796
Feature activation+7.10e-02

Pos logit contributions
rums+0.032
 runs+0.030
 turns+0.028
 comes+0.027
 tries+0.026
Neg logit contributions
he-0.034
ifying-0.031
 contain-0.028
 entire-0.025
 spine-0.024
Token and
Token ID290
Feature activation+3.95e-03

Pos logit contributions
rums+0.143
 runs+0.133
 turns+0.123
 comes+0.120
es+0.116
Neg logit contributions
he-0.142
ifying-0.122
 contain-0.115
 entire-0.102
 spine-0.098
Token he
Token ID339
Feature activation-8.70e-04

Pos logit contributions
rums+0.008
 runs+0.007
 happens+0.007
 turns+0.007
 comes+0.006
Neg logit contributions
he-0.008
ifying-0.007
 contain-0.007
 usual-0.006
 entire-0.006
Token noticed
Token ID6810
Feature activation+7.98e-03

Pos logit contributions
he+0.002
ifying+0.002
 contain+0.002
 entire+0.001
 spine+0.001
Neg logit contributions
rums-0.002
 runs-0.001
 turns-0.001
 comes-0.001
 happens-0.001
Token something
Token ID1223
Feature activation+1.09e-03

Pos logit contributions
rums+0.014
 runs+0.013
 turns+0.012
 comes+0.011
 happens+0.011
Neg logit contributions
he-0.018
ifying-0.016
 contain-0.015
 entire-0.014
 spine-0.014
Token strange
Token ID6283
Feature activation+2.64e-02

Pos logit contributions
rums+0.002
 runs+0.002
 turns+0.002
 happens+0.002
 comes+0.002
Neg logit contributions
he-0.002
ifying-0.002
 contain-0.002
 spine-0.001
 entire-0.001
Token ran
Token ID4966
Feature activation+1.54e-03

Pos logit contributions
he+0.012
ifying+0.010
 contain+0.010
 entire+0.009
 usual+0.009
Neg logit contributions
rums-0.010
 runs-0.008
 turns-0.008
 comes-0.007
 happens-0.007
Token down
Token ID866
Feature activation-1.80e-02

Pos logit contributions
rums+0.003
 runs+0.003
 turns+0.003
 comes+0.003
 happens+0.003
Neg logit contributions
he-0.003
ifying-0.003
 contain-0.003
 entire-0.002
 spine-0.002
Token to
Token ID284
Feature activation+2.92e-03

Pos logit contributions
he+0.040
ifying+0.036
 contain+0.034
 spine+0.031
 entire+0.031
Neg logit contributions
rums-0.031
 runs-0.030
 turns-0.028
 comes-0.027
 tries-0.026
Token the
Token ID262
Feature activation+3.08e-03

Pos logit contributions
rums+0.006
 runs+0.005
 turns+0.005
 comes+0.005
 tries+0.005
Neg logit contributions
he-0.006
ifying-0.005
 contain-0.005
 entire-0.005
 spine-0.005
Token river
Token ID7850
Feature activation+1.50e-02

Pos logit contributions
 runs+0.006
 turns+0.006
rums+0.005
 sounds+0.005
 happens+0.005
Neg logit contributions
he-0.006
ifying-0.005
 contain-0.005
 spine-0.004
pron-0.004
Tokenbank
Token ID17796
Feature activation+7.01e-02

Pos logit contributions
rums+0.031
 runs+0.028
 turns+0.027
 comes+0.026
 happens+0.025
Neg logit contributions
he-0.029
ifying-0.026
 contain-0.024
 entire-0.021
 spine-0.021
Token.
Token ID13
Feature activation-1.99e-03

Pos logit contributions
rums+0.148
 runs+0.136
 turns+0.128
 comes+0.123
 happens+0.121
Neg logit contributions
he-0.135
ifying-0.116
 contain-0.110
 entire-0.097
 spine-0.093
Token
Token ID220
Feature activation+1.77e-03

Pos logit contributions
he+0.004
ifying+0.003
 contain+0.003
 spine+0.003
 entire+0.003
Neg logit contributions
rums-0.004
 runs-0.004
 happens-0.003
 listens-0.003
 turns-0.003
Token
Token ID198
Feature activation+1.42e-03

Pos logit contributions
rums+0.003
 runs+0.003
 turns+0.003
 happens+0.003
 comes+0.003
Neg logit contributions
he-0.004
ifying-0.003
 contain-0.003
 entire-0.003
 usual-0.003
Token
Token ID198
Feature activation+1.27e-04

Pos logit contributions
rums+0.003
 runs+0.002
 turns+0.002
 happens+0.002
 comes+0.002
Neg logit contributions
he-0.003
ifying-0.003
 contain-0.003
 entire-0.002
 usual-0.002
TokenWhen
Token ID2215
Feature activation+5.52e-03

Pos logit contributions
rums+0.000
 runs+0.000
 happens+0.000
 turns+0.000
 comes+0.000
Neg logit contributions
he-0.000
ifying-0.000
 contain-0.000
 entire-0.000
pron-0.000
Token a
Token ID257
Feature activation+2.77e-03

Pos logit contributions
rums+0.018
 runs+0.016
 turns+0.015
 comes+0.015
 happens+0.014
Neg logit contributions
he-0.016
ifying-0.014
 contain-0.013
 usual-0.012
 entire-0.012
Token raft
Token ID31812
Feature activation-1.80e-02

Pos logit contributions
 runs+0.005
 turns+0.005
rums+0.005
 comes+0.005
 sounds+0.005
Neg logit contributions
he-0.006
 contain-0.005
ifying-0.005
 spine-0.004
 entire-0.004
Token near
Token ID1474
Feature activation+6.05e-03

Pos logit contributions
he+0.034
ifying+0.032
 contain+0.031
pron+0.028
 spine+0.028
Neg logit contributions
 runs-0.032
 comes-0.030
rums-0.029
 turns-0.027
 sounds-0.026
Token the
Token ID262
Feature activation+4.71e-03

Pos logit contributions
rums+0.012
 runs+0.010
 happens+0.010
 turns+0.010
 comes+0.009
Neg logit contributions
he-0.013
 contain-0.011
ifying-0.011
 entire-0.010
 usual-0.009
Token river
Token ID7850
Feature activation+1.67e-02

Pos logit contributions
 runs+0.008
 turns+0.008
rums+0.008
 sounds+0.007
 comes+0.007
Neg logit contributions
he-0.010
 contain-0.009
ifying-0.009
 spine-0.007
pron-0.007
Tokenbank
Token ID17796
Feature activation+6.98e-02

Pos logit contributions
rums+0.032
 runs+0.030
 turns+0.028
 comes+0.028
 happens+0.026
Neg logit contributions
he-0.033
ifying-0.030
 contain-0.028
 entire-0.025
pron-0.024
Token and
Token ID290
Feature activation+5.00e-04

Pos logit contributions
rums+0.139
 runs+0.129
 turns+0.118
 comes+0.118
es+0.113
Neg logit contributions
he-0.140
ifying-0.122
 contain-0.115
 entire-0.102
pron-0.100
Token she
Token ID673
Feature activation-2.43e-03

Pos logit contributions
rums+0.001
 happens+0.001
 turns+0.001
 runs+0.001
es+0.001
Neg logit contributions
he-0.001
 contain-0.001
ifying-0.001
 usual-0.001
chard-0.001
Token thought
Token ID1807
Feature activation-1.04e-02

Pos logit contributions
he+0.005
ifying+0.004
 contain+0.004
 entire+0.004
 usual+0.004
Neg logit contributions
rums-0.005
 runs-0.004
 happens-0.004
 turns-0.004
 comes-0.004
Token it
Token ID340
Feature activation-4.90e-03

Pos logit contributions
he+0.028
ifying+0.025
 contain+0.024
 entire+0.022
 spine+0.022
Neg logit contributions
rums-0.014
 runs-0.013
 turns-0.012
 comes-0.011
 happens-0.011
Token was
Token ID373
Feature activation-7.59e-04

Pos logit contributions
he+0.011
ifying+0.009
 contain+0.009
 entire+0.009
 usual+0.008
Neg logit contributions
rums-0.009
 runs-0.008
 turns-0.007
 comes-0.007
 tries-0.007
Token water
Token ID1660
Feature activation+1.44e-03

Pos logit contributions
rums+0.008
 runs+0.008
 turns+0.008
 comes+0.007
 tries+0.007
Neg logit contributions
he-0.010
ifying-0.008
 contain-0.008
 spine-0.007
ancer-0.007
Token and
Token ID290
Feature activation+7.52e-03

Pos logit contributions
rums+0.003
 runs+0.003
 turns+0.002
 comes+0.002
 happens+0.002
Neg logit contributions
he-0.003
ifying-0.003
 contain-0.002
 entire-0.002
 spine-0.002
Token into
Token ID656
Feature activation+1.29e-02

Pos logit contributions
rums+0.014
 runs+0.012
 turns+0.011
 happens+0.011
 comes+0.011
Neg logit contributions
he-0.017
ifying-0.015
 contain-0.014
 entire-0.013
 spine-0.012
Token the
Token ID262
Feature activation+4.55e-03

Pos logit contributions
rums+0.021
 runs+0.020
 turns+0.019
 comes+0.018
 tries+0.017
Neg logit contributions
he-0.030
ifying-0.027
 contain-0.026
 spine-0.024
 entire-0.023
Token river
Token ID7850
Feature activation+1.54e-02

Pos logit contributions
 runs+0.008
rums+0.008
 turns+0.008
 comes+0.007
 sounds+0.007
Neg logit contributions
he-0.010
ifying-0.008
 contain-0.008
 spine-0.007
ancer-0.007
Tokenbank
Token ID17796
Feature activation+6.95e-02

Pos logit contributions
rums+0.035
 runs+0.031
 turns+0.030
 happens+0.029
 comes+0.028
Neg logit contributions
he-0.028
ifying-0.023
 contain-0.023
 entire-0.020
 spine-0.019
Token.
Token ID13
Feature activation-6.58e-04

Pos logit contributions
rums+0.157
 runs+0.143
 turns+0.137
 happens+0.132
 comes+0.131
Neg logit contributions
he-0.124
ifying-0.104
 contain-0.101
 entire-0.087
 spine-0.084
Token
Token ID220
Feature activation+1.89e-03

Pos logit contributions
he+0.001
ifying+0.001
 contain+0.001
 entire+0.001
 spine+0.001
Neg logit contributions
rums-0.001
 runs-0.001
 turns-0.001
 happens-0.001
 comes-0.001
Token
Token ID198
Feature activation+4.88e-04

Pos logit contributions
rums+0.004
 runs+0.004
 turns+0.003
 happens+0.003
 tries+0.003
Neg logit contributions
he-0.004
ifying-0.003
 contain-0.003
 entire-0.003
 usual-0.002
Token
Token ID198
Feature activation-5.31e-04

Pos logit contributions
rums+0.001
 runs+0.001
 turns+0.001
 happens+0.001
 comes+0.001
Neg logit contributions
he-0.001
ifying-0.001
 contain-0.001
 entire-0.001
 usual-0.001
TokenSu
Token ID5606
Feature activation-4.95e-03

Pos logit contributions
he+0.001
ifying+0.001
 contain+0.001
 entire+0.001
 usual+0.001
Neg logit contributions
rums-0.001
 runs-0.001
 turns-0.001
 happens-0.001
 comes-0.001
Token hadn
Token ID8020
Feature activation+8.10e-03

Pos logit contributions
he+0.002
ifying+0.001
 contain+0.001
 spine+0.001
 entire+0.001
Neg logit contributions
rums-0.001
 runs-0.001
 turns-0.001
 comes-0.001
 happens-0.001
Token't
Token ID470
Feature activation-5.29e-03

Pos logit contributions
rums+0.015
 runs+0.014
 comes+0.012
 turns+0.012
 happens+0.012
Neg logit contributions
he-0.017
ifying-0.016
 contain-0.015
 spine-0.014
pron-0.014
Token left
Token ID1364
Feature activation-1.29e-03

Pos logit contributions
he+0.011
ifying+0.010
 contain+0.010
 entire+0.008
 spine+0.008
Neg logit contributions
rums-0.010
 runs-0.009
 turns-0.009
 comes-0.008
 happens-0.008
Token the
Token ID262
Feature activation+3.70e-03

Pos logit contributions
he+0.003
ifying+0.003
 contain+0.002
 entire+0.002
 spine+0.002
Neg logit contributions
rums-0.002
 runs-0.002
 turns-0.002
 comes-0.002
 tries-0.002
Token river
Token ID7850
Feature activation+1.48e-02

Pos logit contributions
 runs+0.007
rums+0.007
 turns+0.006
 comes+0.006
 happens+0.006
Neg logit contributions
he-0.008
ifying-0.007
 contain-0.007
pron-0.006
 spine-0.006
Tokenbank
Token ID17796
Feature activation+6.95e-02

Pos logit contributions
rums+0.028
 runs+0.027
 turns+0.025
 comes+0.024
 happens+0.023
Neg logit contributions
he-0.030
ifying-0.026
 contain-0.024
pron-0.022
 spine-0.022
Token at
Token ID379
Feature activation+1.24e-03

Pos logit contributions
rums+0.122
 runs+0.115
 comes+0.103
es+0.103
 turns+0.103
Neg logit contributions
he-0.150
ifying-0.132
 contain-0.122
pron-0.116
 spine-0.112
Token all
Token ID477
Feature activation+9.41e-03

Pos logit contributions
rums+0.002
 runs+0.002
 turns+0.002
 comes+0.002
 happens+0.002
Neg logit contributions
he-0.003
ifying-0.003
 contain-0.002
pron-0.002
 spine-0.002
Token.
Token ID13
Feature activation-3.03e-04

Pos logit contributions
rums+0.019
 runs+0.017
 turns+0.016
 comes+0.016
 happens+0.016
Neg logit contributions
he-0.019
ifying-0.017
 contain-0.016
 spine-0.014
pron-0.014
Token
Token ID198
Feature activation+4.56e-04

Pos logit contributions
he+0.001
ifying+0.001
 contain+0.001
 entire+0.000
 spine+0.000
Neg logit contributions
rums-0.001
 runs-0.001
 happens-0.000
 turns-0.000
 comes-0.000
Token
Token ID198
Feature activation-5.31e-04

Pos logit contributions
rums+0.001
 runs+0.001
 turns+0.001
 comes+0.001
 happens+0.001
Neg logit contributions
he-0.001
ifying-0.001
 contain-0.001
 entire-0.001
 spine-0.001
Token her
Token ID607
Feature activation+9.09e-03

Pos logit contributions
rums+0.005
 runs+0.005
 turns+0.004
 happens+0.004
 comes+0.004
Neg logit contributions
he-0.006
ifying-0.005
 contain-0.005
 entire-0.004
 usual-0.004
Token seat
Token ID5852
Feature activation-1.41e-02

Pos logit contributions
rums+0.018
 runs+0.018
 turns+0.017
 tries+0.016
 sounds+0.015
Neg logit contributions
he-0.019
ifying-0.015
 contain-0.015
ancer-0.013
pron-0.013
Token on
Token ID319
Feature activation-2.68e-03

Pos logit contributions
he+0.029
ifying+0.028
 contain+0.026
 entire+0.024
ancer+0.023
Neg logit contributions
rums-0.023
 runs-0.022
es-0.021
 turns-0.019
 comes-0.019
Token the
Token ID262
Feature activation+4.15e-03

Pos logit contributions
he+0.006
ifying+0.006
 contain+0.006
 spine+0.005
 entire+0.005
Neg logit contributions
rums-0.004
 runs-0.004
 turns-0.004
 comes-0.003
 happens-0.003
Token river
Token ID7850
Feature activation+1.47e-02

Pos logit contributions
 runs+0.008
rums+0.007
 turns+0.007
 sounds+0.007
 tries+0.007
Neg logit contributions
he-0.009
ifying-0.008
 contain-0.007
ancer-0.006
pron-0.006
Tokenbank
Token ID17796
Feature activation+6.92e-02

Pos logit contributions
rums+0.029
 runs+0.027
 turns+0.025
 comes+0.024
 happens+0.024
Neg logit contributions
he-0.029
ifying-0.026
 contain-0.024
 entire-0.022
 spine-0.021
Token.
Token ID13
Feature activation+4.73e-04

Pos logit contributions
rums+0.136
 runs+0.125
 turns+0.115
 comes+0.113
es+0.111
Neg logit contributions
he-0.141
ifying-0.124
 contain-0.116
 entire-0.103
pron-0.101
Token From
Token ID3574
Feature activation+1.14e-02

Pos logit contributions
rums+0.001
 runs+0.001
 happens+0.001
 turns+0.001
 tries+0.001
Neg logit contributions
he-0.001
ifying-0.001
 contain-0.001
 entire-0.001
 spine-0.001
Token that
Token ID326
Feature activation-1.23e-03

Pos logit contributions
rums+0.021
 runs+0.020
 turns+0.019
 happens+0.018
 comes+0.017
Neg logit contributions
he-0.024
ifying-0.022
 contain-0.021
 spine-0.019
 entire-0.018
Token day
Token ID1110
Feature activation+4.33e-03

Pos logit contributions
he+0.002
ifying+0.001
 contain+0.001
 entire+0.001
 spine+0.001
Neg logit contributions
rums-0.003
 runs-0.003
 turns-0.003
 happens-0.003
 comes-0.003
Token on
Token ID319
Feature activation-2.13e-03

Pos logit contributions
rums+0.009
 runs+0.009
 turns+0.008
 comes+0.008
 happens+0.008
Neg logit contributions
he-0.008
ifying-0.007
 contain-0.007
 entire-0.006
 spine-0.006
Token to
Token ID284
Feature activation+3.94e-03

Pos logit contributions
rums+0.002
 runs+0.002
es+0.002
 happens+0.002
 turns+0.002
Neg logit contributions
he-0.001
 usual-0.001
 contain-0.001
ifying-0.001
 entire-0.001
Token seat
Token ID5852
Feature activation-1.46e-02

Pos logit contributions
rums+0.009
 runs+0.008
 turns+0.007
 comes+0.007
 happens+0.007
Neg logit contributions
he-0.007
ifying-0.006
 contain-0.006
 entire-0.005
 spine-0.005
Token on
Token ID319
Feature activation-4.59e-03

Pos logit contributions
he+0.029
ifying+0.025
 contain+0.025
 entire+0.022
 spine+0.021
Neg logit contributions
rums-0.029
 runs-0.027
 turns-0.025
 comes-0.024
 happens-0.024
Token the
Token ID262
Feature activation+3.15e-03

Pos logit contributions
he+0.010
ifying+0.009
 contain+0.008
 entire+0.007
 usual+0.007
Neg logit contributions
rums-0.009
 runs-0.008
 turns-0.007
 happens-0.007
 comes-0.007
Token river
Token ID7850
Feature activation+1.50e-02

Pos logit contributions
 runs+0.006
rums+0.006
 turns+0.005
 comes+0.005
 sounds+0.005
Neg logit contributions
he-0.006
ifying-0.006
 contain-0.005
 spine-0.005
ancer-0.004
Tokenbank
Token ID17796
Feature activation+6.90e-02

Pos logit contributions
rums+0.033
 runs+0.030
 turns+0.029
 comes+0.027
 happens+0.027
Neg logit contributions
he-0.028
ifying-0.024
 contain-0.023
 entire-0.020
 spine-0.019
Token and
Token ID290
Feature activation+1.96e-03

Pos logit contributions
rums+0.144
 runs+0.133
 turns+0.125
 comes+0.120
 happens+0.117
Neg logit contributions
he-0.133
ifying-0.115
 contain-0.109
 entire-0.096
 spine-0.093
Token watch
Token ID2342
Feature activation-7.55e-03

Pos logit contributions
rums+0.004
 runs+0.004
 happens+0.004
 turns+0.004
es+0.003
Neg logit contributions
he-0.003
 contain-0.003
ifying-0.003
 usual-0.003
 spine-0.003
Token the
Token ID262
Feature activation+5.42e-03

Pos logit contributions
he+0.016
ifying+0.013
 contain+0.013
 entire+0.012
 usual+0.011
Neg logit contributions
rums-0.015
 runs-0.013
 turns-0.012
 happens-0.012
 comes-0.012
Token other
Token ID584
Feature activation+4.96e-03

Pos logit contributions
rums+0.009
 runs+0.009
 turns+0.008
 comes+0.008
 sounds+0.007
Neg logit contributions
he-0.013
ifying-0.011
 contain-0.011
 spine-0.010
pron-0.009
Token animals
Token ID4695
Feature activation+2.83e-03

Pos logit contributions
rums+0.008
 runs+0.007
 turns+0.007
 happens+0.007
 comes+0.006
Neg logit contributions
he-0.012
ifying-0.010
 contain-0.010
 entire-0.009
 usual-0.009
Token,"
Token ID553
Feature activation-2.12e-02

Pos logit contributions
he+0.026
ifying+0.022
 contain+0.020
 entire+0.018
 spine+0.018
Neg logit contributions
rums-0.034
 runs-0.030
 turns-0.029
 happens-0.029
 comes-0.028
TokenBe
Token ID3856
Feature activation+4.69e-03

Pos logit contributions
he+0.046
ifying+0.039
 contain+0.038
 entire+0.034
 spine+0.033
Neg logit contributions
rums-0.041
 runs-0.037
 turns-0.035
 comes-0.033
 happens-0.033
Token good
Token ID922
Feature activation+2.83e-02

Pos logit contributions
rums+0.010
 runs+0.009
 happens+0.008
 comes+0.008
 turns+0.008
Neg logit contributions
he-0.008
 contain-0.007
 entire-0.007
ifying-0.006
kin-0.006
Token,
Token ID11
Feature activation+1.40e-03

Pos logit contributions
rums+0.057
 runs+0.053
 turns+0.049
 comes+0.049
 happens+0.047
Neg logit contributions
he-0.057
ifying-0.051
 contain-0.048
 entire-0.043
 usual-0.041
Token Holly
Token ID7764
Feature activation+5.63e-03

Pos logit contributions
rums+0.003
 runs+0.003
 turns+0.002
 comes+0.002
 tries+0.002
Neg logit contributions
he-0.003
ifying-0.002
 contain-0.002
 entire-0.002
 spine-0.002
Token,.
Token ID38508
Feature activation+6.86e-02

Pos logit contributions
rums+0.010
 runs+0.009
 turns+0.009
 comes+0.008
 happens+0.008
Neg logit contributions
he-0.013
ifying-0.011
 contain-0.011
 entire-0.010
 spine-0.009
Token I
Token ID314
Feature activation-3.90e-03

Pos logit contributions
rums+0.131
 runs+0.119
 turns+0.113
 comes+0.107
 happens+0.106
Neg logit contributions
he-0.147
ifying-0.127
 contain-0.123
 entire-0.109
 spine-0.107
Token'll
Token ID1183
Feature activation-9.96e-03

Pos logit contributions
he+0.009
ifying+0.008
 contain+0.008
 spine+0.008
 horizon+0.007
Neg logit contributions
rums-0.006
 runs-0.006
 turns-0.006
 comes-0.006
 tries-0.005
Token be
Token ID307
Feature activation-9.01e-03

Pos logit contributions
he+0.021
ifying+0.020
 contain+0.017
 spine+0.017
chard+0.016
Neg logit contributions
rums-0.017
 runs-0.017
 turns-0.016
 comes-0.016
 tries-0.014
Token back
Token ID736
Feature activation-1.39e-02

Pos logit contributions
he+0.017
ifying+0.015
 contain+0.015
 spine+0.013
 usual+0.012
Neg logit contributions
rums-0.019
 runs-0.018
 turns-0.017
 comes-0.016
 tries-0.016
Token soon
Token ID2582
Feature activation+1.09e-02

Pos logit contributions
he+0.031
 contain+0.027
ifying+0.027
 spine+0.025
pron+0.024
Neg logit contributions
rums-0.022
 runs-0.021
es-0.019
 turns-0.019
 comes-0.018
Token John
Token ID1757
Feature activation-1.07e-02

Pos logit contributions
he+0.004
ifying+0.003
 contain+0.003
 entire+0.003
 spine+0.003
Neg logit contributions
rums-0.003
 runs-0.003
 turns-0.003
 happens-0.003
 comes-0.003
Token quickly
Token ID2952
Feature activation-7.21e-03

Pos logit contributions
he+0.026
ifying+0.023
 contain+0.023
 entire+0.020
 spine+0.020
Neg logit contributions
rums-0.017
 runs-0.015
 turns-0.014
 comes-0.013
 happens-0.013
Token scanned
Token ID28660
Feature activation+2.70e-03

Pos logit contributions
he+0.017
 contain+0.015
ifying+0.014
 entire+0.013
 spine+0.012
Neg logit contributions
rums-0.013
 runs-0.010
 turns-0.010
 happens-0.009
 comes-0.009
Token the
Token ID262
Feature activation+4.07e-03

Pos logit contributions
rums+0.005
 runs+0.004
 turns+0.004
 comes+0.004
 tries+0.004
Neg logit contributions
he-0.006
ifying-0.006
 contain-0.005
 entire-0.005
 spine-0.005
Token river
Token ID7850
Feature activation+1.50e-02

Pos logit contributions
 runs+0.008
rums+0.008
 turns+0.008
 comes+0.007
 tries+0.007
Neg logit contributions
he-0.008
ifying-0.007
 contain-0.006
pron-0.006
 spine-0.005
Tokenbank
Token ID17796
Feature activation+6.85e-02

Pos logit contributions
rums+0.030
 runs+0.028
 turns+0.027
 comes+0.025
 happens+0.025
Neg logit contributions
he-0.030
ifying-0.026
 contain-0.025
 entire-0.022
 spine-0.021
Token to
Token ID284
Feature activation+1.38e-03

Pos logit contributions
rums+0.141
 runs+0.130
 turns+0.123
 comes+0.118
 happens+0.116
Neg logit contributions
he-0.134
ifying-0.115
 contain-0.111
 entire-0.097
 spine-0.094
Token see
Token ID766
Feature activation+1.86e-03

Pos logit contributions
rums+0.002
 runs+0.002
 turns+0.002
 comes+0.002
 tries+0.002
Neg logit contributions
he-0.003
ifying-0.003
pron-0.003
 contain-0.003
 spine-0.002
Token if
Token ID611
Feature activation+7.55e-03

Pos logit contributions
rums+0.003
 runs+0.003
 turns+0.003
 happens+0.002
 comes+0.002
Neg logit contributions
he-0.004
ifying-0.004
 contain-0.004
 entire-0.003
 usual-0.003
Token anything
Token ID1997
Feature activation+7.53e-03

Pos logit contributions
rums+0.013
 runs+0.012
 turns+0.011
 comes+0.010
 happens+0.010
Neg logit contributions
he-0.018
ifying-0.015
 contain-0.015
 entire-0.013
 usual-0.013
Token was
Token ID373
Feature activation+4.37e-04

Pos logit contributions
rums+0.013
 runs+0.011
 turns+0.010
 comes+0.010
 happens+0.009
Neg logit contributions
he-0.018
ifying-0.016
 contain-0.016
 entire-0.014
 usual-0.014
Token pulled
Token ID5954
Feature activation+6.12e-03

Pos logit contributions
rums+0.015
 runs+0.013
 turns+0.012
 comes+0.012
 happens+0.012
Neg logit contributions
he-0.022
ifying-0.019
 contain-0.019
 entire-0.017
 spine-0.017
Token out
Token ID503
Feature activation-3.23e-03

Pos logit contributions
rums+0.011
 runs+0.010
 turns+0.010
 comes+0.009
 happens+0.009
Neg logit contributions
he-0.014
ifying-0.012
 contain-0.012
 entire-0.010
 spine-0.010
Token his
Token ID465
Feature activation+9.25e-03

Pos logit contributions
he+0.008
ifying+0.007
 contain+0.006
 spine+0.006
 entire+0.006
Neg logit contributions
rums-0.005
 runs-0.005
 turns-0.005
 comes-0.004
es-0.004
Token pig
Token ID12967
Feature activation+1.16e-02

Pos logit contributions
 runs+0.017
rums+0.017
 turns+0.016
 hugs+0.015
 tries+0.015
Neg logit contributions
he-0.020
 contain-0.016
ifying-0.015
ancer-0.014
pron-0.013
Tokengy
Token ID1360
Feature activation+1.33e-02

Pos logit contributions
rums+0.022
 runs+0.020
 turns+0.019
 comes+0.018
 happens+0.018
Neg logit contributions
he-0.025
ifying-0.021
 contain-0.021
 entire-0.018
 spine-0.018
Tokenbank
Token ID17796
Feature activation+6.84e-02

Pos logit contributions
rums+0.024
 runs+0.022
 turns+0.021
 comes+0.020
 tries+0.019
Neg logit contributions
he-0.029
ifying-0.025
 contain-0.024
 entire-0.021
 spine-0.021
Token and
Token ID290
Feature activation+4.08e-03

Pos logit contributions
rums+0.137
 runs+0.127
 turns+0.120
 comes+0.116
es+0.112
Neg logit contributions
he-0.136
ifying-0.117
 contain-0.113
 entire-0.099
 spine-0.097
Token emptied
Token ID46705
Feature activation+5.16e-03

Pos logit contributions
rums+0.008
 runs+0.007
 turns+0.006
 happens+0.006
 comes+0.006
Neg logit contributions
he-0.009
ifying-0.008
 contain-0.007
 entire-0.006
 usual-0.006
Token it
Token ID340
Feature activation-5.50e-03

Pos logit contributions
rums+0.008
 runs+0.008
 turns+0.007
 comes+0.007
 happens+0.007
Neg logit contributions
he-0.012
ifying-0.011
 contain-0.011
 entire-0.010
 spine-0.010
Token on
Token ID319
Feature activation-3.42e-03

Pos logit contributions
he+0.011
ifying+0.010
 contain+0.009
 entire+0.008
 spine+0.008
Neg logit contributions
rums-0.011
 runs-0.010
 turns-0.010
 comes-0.009
 happens-0.009
Token the
Token ID262
Feature activation+3.01e-03

Pos logit contributions
he+0.009
ifying+0.008
 contain+0.008
 spine+0.007
ancer+0.007
Neg logit contributions
rums-0.004
 turns-0.004
 runs-0.004
 comes-0.004
 tries-0.003
Token pulled
Token ID5954
Feature activation+8.34e-03

Pos logit contributions
rums+0.005
 runs+0.005
 turns+0.004
 comes+0.004
 happens+0.004
Neg logit contributions
he-0.006
ifying-0.006
 contain-0.006
 entire-0.005
 usual-0.005
Token herself
Token ID5223
Feature activation+9.98e-03

Pos logit contributions
rums+0.018
 runs+0.016
 turns+0.015
 happens+0.015
 comes+0.015
Neg logit contributions
he-0.016
ifying-0.014
 contain-0.013
 entire-0.012
 spine-0.011
Token onto
Token ID4291
Feature activation-5.74e-03

Pos logit contributions
rums+0.018
 runs+0.018
 turns+0.017
 comes+0.016
 tries+0.016
Neg logit contributions
he-0.021
ifying-0.018
 contain-0.017
 entire-0.015
pron-0.015
Token the
Token ID262
Feature activation+5.35e-03

Pos logit contributions
he+0.012
ifying+0.011
 contain+0.011
 entire+0.009
 spine+0.009
Neg logit contributions
rums-0.011
 runs-0.010
 turns-0.009
 comes-0.009
 happens-0.009
Token river
Token ID7850
Feature activation+1.47e-02

Pos logit contributions
 runs+0.008
 turns+0.007
 sounds+0.007
rums+0.007
 comes+0.007
Neg logit contributions
he-0.013
ifying-0.012
 contain-0.010
pron-0.010
hel-0.010
Tokenbank
Token ID17796
Feature activation+6.81e-02

Pos logit contributions
rums+0.032
 runs+0.029
 turns+0.027
 comes+0.026
 happens+0.026
Neg logit contributions
he-0.028
ifying-0.024
 contain-0.023
 entire-0.020
 spine-0.020
Token,
Token ID11
Feature activation+2.83e-03

Pos logit contributions
rums+0.141
 runs+0.130
 turns+0.123
 comes+0.119
 happens+0.116
Neg logit contributions
he-0.133
ifying-0.114
 contain-0.109
 entire-0.096
 spine-0.093
Token and
Token ID290
Feature activation+6.88e-03

Pos logit contributions
rums+0.006
 runs+0.006
 turns+0.005
 comes+0.005
 happens+0.005
Neg logit contributions
he-0.005
ifying-0.005
 contain-0.004
 entire-0.004
 spine-0.004
Token waved
Token ID26834
Feature activation-2.44e-03

Pos logit contributions
rums+0.014
 runs+0.013
 turns+0.012
 comes+0.011
 happens+0.011
Neg logit contributions
he-0.014
ifying-0.012
 contain-0.012
 entire-0.010
 spine-0.010
Token hello
Token ID23748
Feature activation+2.67e-02

Pos logit contributions
he+0.005
ifying+0.005
 contain+0.005
 entire+0.004
 spine+0.004
Neg logit contributions
rums-0.004
 runs-0.004
 turns-0.004
 comes-0.004
 happens-0.004
Token to
Token ID284
Feature activation+3.40e-03

Pos logit contributions
rums+0.045
 runs+0.043
 turns+0.039
 comes+0.038
es+0.037
Neg logit contributions
he-0.063
ifying-0.051
 contain-0.051
pron-0.048
 entire-0.046
Token and
Token ID290
Feature activation+2.08e-03

Pos logit contributions
he+0.036
ifying+0.033
 contain+0.031
 entire+0.027
 spine+0.027
Neg logit contributions
rums-0.036
 runs-0.034
 turns-0.033
 comes-0.031
es-0.030
Token ran
Token ID4966
Feature activation+1.22e-03

Pos logit contributions
rums+0.004
 runs+0.003
 turns+0.003
 comes+0.003
 happens+0.003
Neg logit contributions
he-0.004
ifying-0.004
 contain-0.004
 entire-0.003
 usual-0.003
Token to
Token ID284
Feature activation+7.84e-04

Pos logit contributions
rums+0.003
 runs+0.002
 turns+0.002
 comes+0.002
 happens+0.002
Neg logit contributions
he-0.002
ifying-0.002
 contain-0.002
 entire-0.002
 spine-0.002
Token the
Token ID262
Feature activation+1.85e-03

Pos logit contributions
rums+0.001
 runs+0.001
 turns+0.001
 comes+0.001
 tries+0.001
Neg logit contributions
he-0.002
ifying-0.002
 contain-0.002
 entire-0.001
 spine-0.001
Token river
Token ID7850
Feature activation+1.29e-02

Pos logit contributions
rums+0.004
 runs+0.003
 turns+0.003
 comes+0.003
 sounds+0.003
Neg logit contributions
he-0.004
 contain-0.003
ifying-0.003
 spine-0.003
ancer-0.002
Tokenbank
Token ID17796
Feature activation+6.81e-02

Pos logit contributions
rums+0.026
 runs+0.024
 turns+0.023
 comes+0.022
 happens+0.021
Neg logit contributions
he-0.026
ifying-0.022
 contain-0.021
 entire-0.019
 spine-0.018
Token.
Token ID13
Feature activation-2.28e-03

Pos logit contributions
rums+0.149
 runs+0.137
 turns+0.130
 comes+0.125
 happens+0.124
Neg logit contributions
he-0.126
ifying-0.106
 contain-0.103
 entire-0.089
 spine-0.087
Token
Token ID220
Feature activation+6.57e-04

Pos logit contributions
he+0.005
ifying+0.004
 contain+0.004
 entire+0.004
 spine+0.004
Neg logit contributions
rums-0.004
 runs-0.004
 happens-0.003
 turns-0.003
 comes-0.003
Token
Token ID198
Feature activation+6.67e-04

Pos logit contributions
rums+0.001
 runs+0.001
 turns+0.001
 happens+0.001
 comes+0.001
Neg logit contributions
he-0.001
ifying-0.001
 contain-0.001
 entire-0.001
 spine-0.001
Token
Token ID198
Feature activation-1.94e-04

Pos logit contributions
rums+0.001
 runs+0.001
 turns+0.001
 comes+0.001
 happens+0.001
Neg logit contributions
he-0.001
ifying-0.001
 contain-0.001
 entire-0.001
 spine-0.001
TokenCharlie
Token ID37136
Feature activation-1.34e-02

Pos logit contributions
he+0.000
ifying+0.000
 contain+0.000
 entire+0.000
pron+0.000
Neg logit contributions
rums-0.000
 runs-0.000
 turns-0.000
 comes-0.000
 happens-0.000
Token appeared
Token ID4120
Feature activation+3.57e-04

Pos logit contributions
he+0.021
ifying+0.019
 contain+0.019
 entire+0.017
 spine+0.017
Neg logit contributions
rums-0.007
 runs-0.007
 comes-0.007
 turns-0.006
 happens-0.006
Token and
Token ID290
Feature activation+1.17e-03

Pos logit contributions
 runs+0.001
rums+0.001
 comes+0.001
es+0.001
 turns+0.000
Neg logit contributions
he-0.001
 contain-0.001
pron-0.001
ifying-0.001
 spine-0.001
Token destroyed
Token ID6572
Feature activation+4.87e-03

Pos logit contributions
rums+0.003
 runs+0.002
es+0.002
olds+0.002
 turns+0.002
Neg logit contributions
he-0.002
ifying-0.002
 contain-0.002
 usual-0.002
 entire-0.002
Token the
Token ID262
Feature activation+5.06e-03

Pos logit contributions
rums+0.010
 runs+0.009
 listens+0.008
 happens+0.008
 turns+0.008
Neg logit contributions
he-0.009
ifying-0.008
 contain-0.007
 usual-0.007
 entire-0.007
Token river
Token ID7850
Feature activation+1.40e-02

Pos logit contributions
 runs+0.008
 turns+0.007
 comes+0.007
rums+0.007
 sounds+0.007
Neg logit contributions
he-0.013
 contain-0.012
ifying-0.011
hel-0.010
ancer-0.009
Tokenbank
Token ID17796
Feature activation+6.77e-02

Pos logit contributions
rums+0.031
 runs+0.028
 turns+0.027
 comes+0.026
 happens+0.026
Neg logit contributions
he-0.026
ifying-0.022
 contain-0.021
 entire-0.018
 spine-0.018
Token!
Token ID0
Feature activation+4.89e-03

Pos logit contributions
rums+0.156
 runs+0.142
 turns+0.137
 happens+0.131
 comes+0.130
Neg logit contributions
he-0.118
ifying-0.097
 contain-0.094
 entire-0.082
 spine-0.078
Token Water
Token ID5638
Feature activation+8.81e-03

Pos logit contributions
rums+0.011
 runs+0.010
 happens+0.009
 turns+0.009
 comes+0.009
Neg logit contributions
he-0.009
ifying-0.008
 contain-0.007
 spine-0.006
 entire-0.006
Token was
Token ID373
Feature activation+2.49e-03

Pos logit contributions
rums+0.015
 runs+0.013
 turns+0.012
 happens+0.011
 comes+0.011
Neg logit contributions
he-0.021
ifying-0.019
 contain-0.018
 entire-0.016
 spine-0.016
Token everywhere
Token ID8347
Feature activation+8.18e-03

Pos logit contributions
rums+0.004
 runs+0.004
 turns+0.004
 comes+0.004
 tries+0.003
Neg logit contributions
he-0.006
 contain-0.005
ifying-0.005
ancer-0.004
 spine-0.004
Token and
Token ID290
Feature activation+4.14e-03

Pos logit contributions
rums+0.013
 runs+0.013
 comes+0.012
 turns+0.012
es+0.011
Neg logit contributions
he-0.018
ifying-0.016
 contain-0.016
pron-0.014
ancer-0.014
Token and
Token ID290
Feature activation+3.36e-03

Pos logit contributions
he+0.008
ifying+0.008
 contain+0.007
 entire+0.006
ancer+0.006
Neg logit contributions
rums-0.006
 runs-0.006
 turns-0.006
es-0.005
 tries-0.005
Token grabbed
Token ID13176
Feature activation-4.14e-03

Pos logit contributions
rums+0.006
 runs+0.005
 turns+0.005
 happens+0.005
 comes+0.005
Neg logit contributions
he-0.008
ifying-0.006
 contain-0.006
 entire-0.006
 usual-0.006
Token his
Token ID465
Feature activation+1.10e-02

Pos logit contributions
he+0.009
ifying+0.008
 contain+0.008
 entire+0.007
 usual+0.007
Neg logit contributions
rums-0.008
 runs-0.007
 turns-0.006
 happens-0.006
 comes-0.006
Token pig
Token ID12967
Feature activation+1.13e-02

Pos logit contributions
 turns+0.019
 runs+0.019
rums+0.017
 hugs+0.016
 comes+0.016
Neg logit contributions
he-0.026
 contain-0.022
ifying-0.021
 spine-0.018
 horizon-0.018
Tokengy
Token ID1360
Feature activation+1.50e-02

Pos logit contributions
rums+0.022
 runs+0.021
 turns+0.019
 comes+0.018
 happens+0.018
Neg logit contributions
he-0.023
ifying-0.020
 contain-0.019
 entire-0.017
 spine-0.017
Tokenbank
Token ID17796
Feature activation+6.75e-02

Pos logit contributions
rums+0.027
 runs+0.025
 turns+0.024
 comes+0.022
 tries+0.022
Neg logit contributions
he-0.033
ifying-0.028
 contain-0.028
 entire-0.024
pron-0.024
Token and
Token ID290
Feature activation+3.75e-03

Pos logit contributions
rums+0.134
 runs+0.124
 turns+0.115
 comes+0.111
es+0.110
Neg logit contributions
he-0.136
ifying-0.118
 contain-0.113
 entire-0.099
pron-0.096
Token brought
Token ID3181
Feature activation+9.95e-03

Pos logit contributions
rums+0.007
 runs+0.006
 happens+0.006
 turns+0.006
 comes+0.006
Neg logit contributions
he-0.008
 contain-0.007
ifying-0.007
 usual-0.006
 entire-0.006
Token it
Token ID340
Feature activation-4.69e-03

Pos logit contributions
rums+0.018
 runs+0.016
 happens+0.014
 turns+0.014
 comes+0.014
Neg logit contributions
he-0.023
ifying-0.020
 contain-0.019
 entire-0.017
 usual-0.017
Token back
Token ID736
Feature activation-1.58e-02

Pos logit contributions
he+0.009
ifying+0.008
 contain+0.007
pron+0.006
 entire+0.006
Neg logit contributions
rums-0.010
 runs-0.009
 turns-0.009
 comes-0.008
 tries-0.008
Token to
Token ID284
Feature activation+1.95e-03

Pos logit contributions
he+0.033
ifying+0.029
 contain+0.028
pron+0.025
 entire+0.024
Neg logit contributions
rums-0.029
 runs-0.027
 turns-0.025
es-0.024
 comes-0.024
Token man
Token ID582
Feature activation+1.92e-02

Pos logit contributions
he+0.003
ifying+0.003
 contain+0.003
 entire+0.002
 spine+0.002
Neg logit contributions
rums-0.003
 runs-0.003
 turns-0.002
 comes-0.002
 happens-0.002
Token standing
Token ID5055
Feature activation+3.34e-03

Pos logit contributions
rums+0.032
 runs+0.032
 comes+0.029
 turns+0.027
 tries+0.027
Neg logit contributions
he-0.042
ifying-0.038
 contain-0.036
pron-0.034
 spine-0.033
Token by
Token ID416
Feature activation-1.26e-03

Pos logit contributions
 runs+0.005
rums+0.005
 turns+0.005
es+0.005
 tries+0.005
Neg logit contributions
 contain-0.007
he-0.007
 spine-0.006
ifying-0.006
ancer-0.006
Token the
Token ID262
Feature activation+3.71e-03

Pos logit contributions
he+0.003
ifying+0.003
 contain+0.003
pron+0.002
 spine+0.002
Neg logit contributions
rums-0.002
 runs-0.002
 turns-0.002
 comes-0.002
 tries-0.002
Token river
Token ID7850
Feature activation+1.42e-02

Pos logit contributions
 runs+0.007
rums+0.006
 turns+0.006
 comes+0.006
 sounds+0.006
Neg logit contributions
he-0.008
 contain-0.007
ifying-0.007
pron-0.006
ancer-0.006
Tokenbank
Token ID17796
Feature activation+6.71e-02

Pos logit contributions
rums+0.028
 runs+0.026
 turns+0.024
 comes+0.024
 tries+0.023
Neg logit contributions
he-0.028
ifying-0.025
 contain-0.024
 entire-0.021
pron-0.021
Token.
Token ID13
Feature activation-1.35e-03

Pos logit contributions
rums+0.129
 runs+0.121
 turns+0.111
 comes+0.110
es+0.106
Neg logit contributions
he-0.138
ifying-0.121
 contain-0.116
pron-0.103
 entire-0.102
Token The
Token ID383
Feature activation-4.09e-03

Pos logit contributions
he+0.002
ifying+0.002
 contain+0.002
 entire+0.002
 usual+0.002
Neg logit contributions
rums-0.003
 runs-0.003
 happens-0.002
 tries-0.002
es-0.002
Token old
Token ID1468
Feature activation-8.17e-04

Pos logit contributions
he+0.009
ifying+0.009
 contain+0.008
 spine+0.007
pron+0.007
Neg logit contributions
 runs-0.007
 turns-0.007
rums-0.007
 comes-0.007
 happens-0.006
Token man
Token ID582
Feature activation+1.98e-02

Pos logit contributions
he+0.002
 contain+0.001
ifying+0.001
 entire+0.001
 spine+0.001
Neg logit contributions
rums-0.002
 runs-0.002
 turns-0.002
 happens-0.002
 comes-0.001
Token showed
Token ID3751
Feature activation+3.00e-02

Pos logit contributions
rums+0.031
 runs+0.028
 turns+0.026
 comes+0.024
 happens+0.024
Neg logit contributions
he-0.048
ifying-0.043
 contain-0.042
 entire-0.038
 spine-0.037
Token back
Token ID736
Feature activation-1.47e-02

Pos logit contributions
rums+0.005
 runs+0.005
 comes+0.004
 turns+0.004
es+0.004
Neg logit contributions
he-0.006
 contain-0.005
ifying-0.005
 spine-0.004
pron-0.004
Token down
Token ID866
Feature activation-1.45e-02

Pos logit contributions
he+0.030
 contain+0.026
ifying+0.025
 entire+0.022
pron+0.022
Neg logit contributions
rums-0.029
 runs-0.027
 turns-0.026
 comes-0.025
 happens-0.024
Token on
Token ID319
Feature activation-2.95e-03

Pos logit contributions
he+0.030
ifying+0.026
 contain+0.026
 spine+0.024
pron+0.024
Neg logit contributions
rums-0.026
 runs-0.026
 comes-0.025
 turns-0.024
 happens-0.023
Token the
Token ID262
Feature activation+3.91e-03

Pos logit contributions
he+0.008
 contain+0.007
ifying+0.007
 spine+0.006
ancer+0.006
Neg logit contributions
rums-0.004
 runs-0.004
 turns-0.004
 comes-0.003
 tries-0.003
Token river
Token ID7850
Feature activation+1.28e-02

Pos logit contributions
 runs+0.007
rums+0.006
 turns+0.006
 comes+0.006
 sounds+0.006
Neg logit contributions
he-0.009
ifying-0.008
 contain-0.007
pron-0.007
ancer-0.006
Tokenbank
Token ID17796
Feature activation+6.66e-02

Pos logit contributions
rums+0.026
 runs+0.024
 turns+0.022
 comes+0.022
 happens+0.021
Neg logit contributions
he-0.026
ifying-0.022
 contain-0.021
 entire-0.019
 spine-0.018
Token.
Token ID13
Feature activation-2.29e-03

Pos logit contributions
rums+0.140
 runs+0.129
 turns+0.122
 comes+0.118
 happens+0.116
Neg logit contributions
he-0.128
ifying-0.110
 contain-0.105
 entire-0.092
 spine-0.090
Token
Token ID198
Feature activation-1.96e-03

Pos logit contributions
he+0.004
ifying+0.003
 contain+0.003
 entire+0.003
 spine+0.003
Neg logit contributions
rums-0.005
 runs-0.005
 happens-0.004
 comes-0.004
 tries-0.004
Token
Token ID198
Feature activation-2.77e-03

Pos logit contributions
he+0.004
ifying+0.003
 contain+0.003
 entire+0.003
 usual+0.003
Neg logit contributions
rums-0.004
 runs-0.003
 turns-0.003
 happens-0.003
 comes-0.003
TokenHe
Token ID1544
Feature activation-1.47e-02

Pos logit contributions
he+0.005
ifying+0.004
 contain+0.004
 entire+0.004
pron+0.003
Neg logit contributions
rums-0.006
 runs-0.006
 turns-0.005
 comes-0.005
 happens-0.005
Token continued
Token ID3767
Feature activation+2.13e-03

Pos logit contributions
he+0.030
 contain+0.027
ifying+0.026
 entire+0.024
 usual+0.023
Neg logit contributions
rums-0.028
 runs-0.023
 turns-0.022
 happens-0.022
 comes-0.022
Token When
Token ID1649
Feature activation+8.97e-03

Pos logit contributions
he+0.016
ifying+0.015
 contain+0.014
 entire+0.012
 spine+0.012
Neg logit contributions
rums-0.013
 runs-0.011
 turns-0.011
 happens-0.010
 comes-0.010
Token Daisy
Token ID40355
Feature activation-6.23e-03

Pos logit contributions
rums+0.017
 runs+0.014
 turns+0.014
 happens+0.013
 comes+0.013
Neg logit contributions
he-0.020
ifying-0.017
 contain-0.017
 usual-0.015
 entire-0.015
Token put
Token ID1234
Feature activation+4.50e-03

Pos logit contributions
he+0.015
ifying+0.012
 contain+0.012
 entire+0.011
 usual+0.011
Neg logit contributions
rums-0.011
 runs-0.009
 turns-0.009
 happens-0.008
 comes-0.008
Token the
Token ID262
Feature activation+4.41e-03

Pos logit contributions
rums+0.009
 runs+0.008
 turns+0.008
 comes+0.007
 happens+0.007
Neg logit contributions
he-0.009
ifying-0.008
 contain-0.008
 entire-0.007
 spine-0.007
Token magn
Token ID7842
Feature activation+3.93e-04

Pos logit contributions
 runs+0.008
rums+0.008
 turns+0.008
 comes+0.007
 tries+0.007
Neg logit contributions
he-0.009
 contain-0.008
ifying-0.008
 spine-0.007
ancer-0.006
Tokenifying
Token ID4035
Feature activation-1.32e-01

Pos logit contributions
es+0.001
rums+0.001
 runs+0.001
 turns+0.001
 comes+0.001
Neg logit contributions
he-0.000
 contain-0.000
 usual-0.000
pron-0.000
 spine-0.000
Token glass
Token ID5405
Feature activation-7.81e-03

Pos logit contributions
he+0.240
 contain+0.216
ifying+0.194
ancer+0.187
 spine+0.187
Neg logit contributions
 turns-0.250
rums-0.248
 runs-0.244
 comes-0.228
 happens-0.225
Token down
Token ID866
Feature activation-1.49e-02

Pos logit contributions
he+0.016
ifying+0.014
 contain+0.013
 entire+0.012
 spine+0.012
Neg logit contributions
rums-0.016
 runs-0.014
 turns-0.014
 comes-0.013
 happens-0.013
Token,
Token ID11
Feature activation+9.66e-04

Pos logit contributions
he+0.033
ifying+0.029
 contain+0.028
 entire+0.025
 spine+0.025
Neg logit contributions
rums-0.026
 runs-0.024
 turns-0.023
 comes-0.022
 happens-0.021
Token the
Token ID262
Feature activation+3.14e-03

Pos logit contributions
rums+0.002
 runs+0.002
 turns+0.002
 comes+0.002
 happens+0.002
Neg logit contributions
he-0.002
ifying-0.002
 contain-0.002
 usual-0.001
 entire-0.001
Token cater
Token ID20825
Feature activation+5.41e-03

Pos logit contributions
rums+0.006
 runs+0.006
 turns+0.006
 comes+0.005
 happens+0.005
Neg logit contributions
he-0.006
ifying-0.005
 contain-0.005
 entire-0.004
 spine-0.004
Token flashlight
Token ID38371
Feature activation-8.12e-03

Pos logit contributions
he+0.002
 contain+0.001
ifying+0.001
 entire+0.001
 spine+0.001
Neg logit contributions
rums-0.002
 runs-0.002
 turns-0.002
 comes-0.002
 happens-0.002
Token,
Token ID11
Feature activation+1.21e-03

Pos logit contributions
he+0.017
ifying+0.015
 contain+0.014
 entire+0.013
 spine+0.012
Neg logit contributions
 runs-0.014
rums-0.014
 comes-0.014
 turns-0.013
 happens-0.013
Token and
Token ID290
Feature activation+3.50e-03

Pos logit contributions
rums+0.002
 runs+0.002
 turns+0.002
 comes+0.002
 happens+0.002
Neg logit contributions
he-0.002
ifying-0.002
 contain-0.002
 entire-0.002
 spine-0.002
Token a
Token ID257
Feature activation+1.14e-03

Pos logit contributions
rums+0.007
 runs+0.006
 turns+0.006
 comes+0.005
 happens+0.005
Neg logit contributions
he-0.007
ifying-0.006
 contain-0.006
 entire-0.006
 usual-0.005
Token magn
Token ID7842
Feature activation+2.67e-04

Pos logit contributions
rums+0.002
 runs+0.002
 turns+0.002
 comes+0.002
 happens+0.002
Neg logit contributions
he-0.002
 contain-0.002
ifying-0.002
 entire-0.002
 spine-0.001
Tokenifying
Token ID4035
Feature activation-1.24e-01

Pos logit contributions
es+0.001
 runs+0.001
 comes+0.001
rums+0.001
 happens+0.001
Neg logit contributions
 contain-0.000
he-0.000
pron-0.000
 rightful-0.000
 usual-0.000
Token glass
Token ID5405
Feature activation-5.54e-03

Pos logit contributions
he+0.221
 contain+0.202
ifying+0.178
ancer+0.174
 spine+0.172
Neg logit contributions
 turns-0.238
 runs-0.232
rums-0.231
 comes-0.220
 breaks-0.217
Token.
Token ID13
Feature activation+6.65e-03

Pos logit contributions
he+0.011
ifying+0.009
 contain+0.009
 spine+0.008
 entire+0.008
Neg logit contributions
rums-0.011
 runs-0.010
 turns-0.010
 comes-0.010
 happens-0.009
Token
Token ID198
Feature activation+1.02e-02

Pos logit contributions
rums+0.011
 runs+0.010
 turns+0.009
 happens+0.009
 comes+0.009
Neg logit contributions
he-0.016
ifying-0.014
 contain-0.013
 entire-0.012
 spine-0.012
Token
Token ID198
Feature activation+8.72e-03

Pos logit contributions
rums+0.020
 runs+0.019
 turns+0.017
 happens+0.017
 comes+0.017
Neg logit contributions
he-0.022
ifying-0.017
 contain-0.017
 entire-0.015
 usual-0.014
TokenAs
Token ID1722
Feature activation+1.91e-03

Pos logit contributions
rums+0.016
 runs+0.014
 turns+0.014
 happens+0.014
 comes+0.013
Neg logit contributions
he-0.020
ifying-0.017
 contain-0.016
 entire-0.015
pron-0.014
Token She
Token ID1375
Feature activation+2.88e-03

Pos logit contributions
he+0.002
ifying+0.002
 contain+0.002
 spine+0.001
 entire+0.001
Neg logit contributions
rums-0.002
 runs-0.002
 happens-0.002
 turns-0.002
 tries-0.002
Token takes
Token ID2753
Feature activation+1.79e-02

Pos logit contributions
rums+0.004
ages+0.003
haw+0.003
bank+0.003
irens+0.003
Neg logit contributions
he-0.007
 entire-0.006
ifying-0.005
 contain-0.005
bell-0.005
Token out
Token ID503
Feature activation-2.68e-04

Pos logit contributions
rums+0.039
 runs+0.034
 listens+0.032
 happens+0.031
 comes+0.031
Neg logit contributions
he-0.034
ifying-0.028
 contain-0.027
 entire-0.025
 usual-0.024
Token a
Token ID257
Feature activation+1.18e-03

Pos logit contributions
he+0.001
ifying+0.001
 contain+0.001
 entire+0.000
 spine+0.000
Neg logit contributions
rums-0.000
 runs-0.000
 turns-0.000
 comes-0.000
 happens-0.000
Token magn
Token ID7842
Feature activation-1.45e-03

Pos logit contributions
rums+0.002
 runs+0.002
 turns+0.002
 comes+0.002
 tries+0.002
Neg logit contributions
he-0.002
ifying-0.002
 contain-0.002
 spine-0.002
 entire-0.001
Tokenifying
Token ID4035
Feature activation-1.21e-01

Pos logit contributions
he+0.001
 contain+0.001
 usual+0.001
pron+0.001
chard+0.001
Neg logit contributions
es-0.004
rums-0.004
 runs-0.004
 turns-0.004
 comes-0.004
Token glass
Token ID5405
Feature activation-5.44e-03

Pos logit contributions
he+0.220
 contain+0.194
ifying+0.181
 spine+0.166
ancer+0.164
Neg logit contributions
rums-0.241
 turns-0.236
 runs-0.233
 comes-0.219
 happens-0.218
Token.
Token ID13
Feature activation-8.11e-04

Pos logit contributions
he+0.010
ifying+0.009
 contain+0.009
 entire+0.007
 spine+0.007
Neg logit contributions
rums-0.011
 runs-0.011
 turns-0.010
 comes-0.010
 happens-0.010
Token She
Token ID1375
Feature activation+2.41e-03

Pos logit contributions
he+0.002
ifying+0.002
 contain+0.002
 spine+0.001
 entire+0.001
Neg logit contributions
rums-0.001
 runs-0.001
 turns-0.001
 happens-0.001
 comes-0.001
Token looks
Token ID3073
Feature activation+5.22e-03

Pos logit contributions
rums+0.004
ages+0.003
haw+0.003
irens+0.002
ulk+0.002
Neg logit contributions
he-0.006
ifying-0.005
 entire-0.005
 contain-0.005
bell-0.004
Token at
Token ID379
Feature activation-1.46e-03

Pos logit contributions
rums+0.011
 runs+0.011
 turns+0.010
 comes+0.010
 tries+0.010
Neg logit contributions
he-0.010
ifying-0.009
 contain-0.008
 spine-0.007
 entire-0.007
Token,
Token ID11
Feature activation+1.77e-03

Pos logit contributions
rums+0.001
 runs+0.001
es+0.001
 turns+0.001
 comes+0.001
Neg logit contributions
he-0.001
ifying-0.001
 contain-0.001
 entire-0.001
ancer-0.001
Token next
Token ID1306
Feature activation-1.78e-02

Pos logit contributions
rums+0.003
 runs+0.003
 turns+0.003
 comes+0.003
 tries+0.003
Neg logit contributions
he-0.004
ifying-0.003
 contain-0.003
 entire-0.003
 spine-0.003
Token to
Token ID284
Feature activation+3.19e-03

Pos logit contributions
he+0.041
ifying+0.037
 contain+0.036
chard+0.032
ancer+0.031
Neg logit contributions
 turns-0.029
rums-0.029
 runs-0.028
 comes-0.026
 happens-0.025
Token her
Token ID607
Feature activation+7.58e-03

Pos logit contributions
rums+0.005
 runs+0.004
 turns+0.004
 comes+0.004
 tries+0.004
Neg logit contributions
he-0.008
ifying-0.007
 contain-0.007
 spine-0.007
 entire-0.007
Token magn
Token ID7842
Feature activation-3.56e-03

Pos logit contributions
 runs+0.013
rums+0.013
 turns+0.013
 comes+0.012
 hugs+0.012
Neg logit contributions
he-0.017
ifying-0.015
 contain-0.015
ancer-0.013
 spine-0.012
Tokenifying
Token ID4035
Feature activation-1.20e-01

Pos logit contributions
 contain+0.003
he+0.003
 entire+0.002
 sunny+0.002
 particular+0.002
Neg logit contributions
es-0.010
rums-0.009
 runs-0.008
 comes-0.008
 hugs-0.008
Token glass
Token ID5405
Feature activation-6.58e-03

Pos logit contributions
he+0.220
 contain+0.195
ifying+0.181
 spine+0.165
ancer+0.159
Neg logit contributions
rums-0.238
 turns-0.231
 runs-0.230
 comes-0.217
 happens-0.216
Token.
Token ID13
Feature activation+1.21e-03

Pos logit contributions
he+0.013
ifying+0.011
 contain+0.011
 entire+0.010
 spine+0.009
Neg logit contributions
rums-0.013
 runs-0.012
 turns-0.012
 comes-0.012
 happens-0.011
Token She
Token ID1375
Feature activation+4.29e-03

Pos logit contributions
rums+0.002
 runs+0.002
 turns+0.002
 happens+0.002
 comes+0.002
Neg logit contributions
he-0.003
ifying-0.002
 contain-0.002
 spine-0.002
 entire-0.002
Token looked
Token ID3114
Feature activation-1.58e-02

Pos logit contributions
rums+0.009
 runs+0.007
 turns+0.007
 happens+0.007
 comes+0.007
Neg logit contributions
he-0.009
 contain-0.007
ifying-0.007
 entire-0.006
 usual-0.006
Token at
Token ID379
Feature activation-1.24e-03

Pos logit contributions
he+0.031
ifying+0.027
 contain+0.026
 spine+0.023
 entire+0.022
Neg logit contributions
rums-0.031
 runs-0.031
 turns-0.030
 comes-0.028
 tries-0.028
Token the
Token ID262
Feature activation+3.75e-03

Pos logit contributions
he+0.002
ifying+0.002
 contain+0.002
 entire+0.001
 usual+0.001
Neg logit contributions
rums-0.001
 runs-0.001
 turns-0.001
 happens-0.001
 comes-0.001
Token necklace
Token ID38987
Feature activation-1.49e-02

Pos logit contributions
 runs+0.007
 turns+0.007
rums+0.007
 comes+0.006
 tries+0.006
Neg logit contributions
he-0.008
ifying-0.007
 contain-0.007
ancer-0.006
 entire-0.005
Token and
Token ID290
Feature activation+3.28e-03

Pos logit contributions
ifying+0.035
he+0.034
 contain+0.032
 entire+0.030
ancer+0.029
Neg logit contributions
es-0.020
 runs-0.020
rums-0.018
 comes-0.017
 turns-0.016
Token the
Token ID262
Feature activation+4.44e-03

Pos logit contributions
rums+0.007
 runs+0.005
 turns+0.005
 happens+0.005
 comes+0.005
Neg logit contributions
he-0.006
 contain-0.006
ifying-0.005
 usual-0.005
 entire-0.004
Token magn
Token ID7842
Feature activation+4.85e-04

Pos logit contributions
rums+0.009
 runs+0.008
 turns+0.008
 comes+0.008
 happens+0.007
Neg logit contributions
he-0.009
ifying-0.008
 contain-0.007
 entire-0.006
 spine-0.006
Tokenifying
Token ID4035
Feature activation-1.16e-01

Pos logit contributions
es+0.001
rums+0.001
 runs+0.001
 comes+0.001
 turns+0.001
Neg logit contributions
 contain-0.001
he-0.001
 particular-0.000
pron-0.000
 usual-0.000
Token glass
Token ID5405
Feature activation-3.38e-03

Pos logit contributions
he+0.212
 contain+0.188
ifying+0.173
 spine+0.159
ancer+0.156
Neg logit contributions
rums-0.231
 turns-0.227
 runs-0.227
 comes-0.212
 happens-0.210
Token and
Token ID290
Feature activation+3.86e-03

Pos logit contributions
he+0.007
ifying+0.006
 contain+0.005
 entire+0.005
 spine+0.005
Neg logit contributions
rums-0.007
 runs-0.007
 turns-0.006
 comes-0.006
 happens-0.006
Token started
Token ID2067
Feature activation+5.74e-03

Pos logit contributions
rums+0.008
 runs+0.006
 turns+0.006
 happens+0.006
 comes+0.006
Neg logit contributions
he-0.008
 contain-0.007
ifying-0.007
 usual-0.006
 entire-0.006
Token to
Token ID284
Feature activation+6.31e-03

Pos logit contributions
rums+0.014
 runs+0.012
 happens+0.012
 turns+0.012
 listens+0.011
Neg logit contributions
he-0.009
ifying-0.007
 contain-0.007
 usual-0.007
 entire-0.006
Token examine
Token ID10716
Feature activation-1.40e-03

Pos logit contributions
 runs+0.012
rums+0.012
 turns+0.012
 comes+0.011
 tries+0.011
Neg logit contributions
he-0.012
ifying-0.011
 contain-0.010
 entire-0.009
 spine-0.009
Token
Token ID26391
Feature activation-2.37e-02

Pos logit contributions
he+0.007
ifying+0.006
 contain+0.006
 entire+0.005
 spine+0.005
Neg logit contributions
rums-0.007
 runs-0.006
 turns-0.006
 comes-0.006
 happens-0.006
Token
Token ID8151
Feature activation-3.65e-03

Pos logit contributions
he+0.052
ifying+0.045
 contain+0.042
 entire+0.038
 spine+0.037
Neg logit contributions
rums-0.046
 runs-0.040
 turns-0.037
 happens-0.036
 comes-0.035
Tokens
Token ID82
Feature activation+3.81e-03

Pos logit contributions
he+0.008
ifying+0.007
 contain+0.007
 entire+0.006
 spine+0.006
Neg logit contributions
rums-0.007
 runs-0.006
 turns-0.006
 comes-0.005
 happens-0.005
Token a
Token ID257
Feature activation-9.13e-03

Pos logit contributions
rums+0.007
 runs+0.007
 turns+0.006
 comes+0.006
 happens+0.006
Neg logit contributions
he-0.008
ifying-0.007
 contain-0.007
 entire-0.006
 spine-0.006
Token books
Token ID3835
Feature activation-6.19e-03

Pos logit contributions
he+0.018
ifying+0.015
 contain+0.015
 spine+0.013
 entire+0.012
Neg logit contributions
 runs-0.018
rums-0.018
 turns-0.017
 comes-0.016
 happens-0.016
Tokenhe
Token ID258
Feature activation-1.16e-01

Pos logit contributions
ifying+0.008
 spine+0.007
 contain+0.007
he+0.007
 usual+0.006
Neg logit contributions
 runs-0.014
 comes-0.014
 turns-0.013
rums-0.013
 happens-0.013
Tokenlf
Token ID1652
Feature activation-9.46e-03

Pos logit contributions
he+0.227
ifying+0.202
 contain+0.192
 spine+0.167
 entire+0.166
Neg logit contributions
rums-0.236
 runs-0.219
 turns-0.205
 comes-0.199
 happens-0.195
Token.
Token ID13
Feature activation-7.13e-03

Pos logit contributions
he+0.019
ifying+0.016
 contain+0.016
 spine+0.014
 entire+0.014
Neg logit contributions
rums-0.019
 runs-0.018
 turns-0.016
 comes-0.016
 happens-0.016
Token You
Token ID921
Feature activation+7.62e-03

Pos logit contributions
he+0.017
ifying+0.015
 contain+0.015
 entire+0.014
 spine+0.013
Neg logit contributions
rums-0.011
 runs-0.010
 turns-0.009
 comes-0.009
 happens-0.009
Token can
Token ID460
Feature activation-7.54e-04

Pos logit contributions
 runs+0.012
rums+0.012
 turns+0.012
 comes+0.011
 happens+0.010
Neg logit contributions
he-0.017
ifying-0.015
 contain-0.015
 spine-0.014
pron-0.013
Token use
Token ID779
Feature activation-5.00e-03

Pos logit contributions
he+0.001
ifying+0.001
 contain+0.001
 spine+0.001
 entire+0.001
Neg logit contributions
rums-0.001
 runs-0.001
 turns-0.001
 comes-0.001
 happens-0.001
Token
Token ID198
Feature activation-4.23e-03

Pos logit contributions
he+0.008
ifying+0.006
 contain+0.006
 entire+0.006
 spine+0.006
Neg logit contributions
rums-0.006
 runs-0.006
 turns-0.005
 comes-0.005
 happens-0.005
TokenHe
Token ID1544
Feature activation-1.68e-02

Pos logit contributions
he+0.008
 contain+0.007
ifying+0.007
 entire+0.006
 usual+0.006
Neg logit contributions
rums-0.009
 runs-0.008
 comes-0.008
 turns-0.008
 happens-0.008
Token brought
Token ID3181
Feature activation+8.30e-03

Pos logit contributions
he+0.029
 contain+0.027
 entire+0.025
ifying+0.024
 usual+0.024
Neg logit contributions
rums-0.036
 runs-0.028
 happens-0.028
 turns-0.027
olds-0.027
Token his
Token ID465
Feature activation+1.03e-02

Pos logit contributions
rums+0.019
 runs+0.016
 happens+0.015
 listens+0.015
 comes+0.015
Neg logit contributions
he-0.015
 contain-0.012
ifying-0.012
 entire-0.011
 usual-0.011
Token magn
Token ID7842
Feature activation-5.91e-03

Pos logit contributions
rums+0.020
 runs+0.019
 turns+0.019
 tries+0.017
 happens+0.017
Neg logit contributions
he-0.023
ifying-0.019
 contain-0.018
 spine-0.015
ancer-0.015
Tokenifying
Token ID4035
Feature activation-1.14e-01

Pos logit contributions
he+0.006
 contain+0.004
 usual+0.004
pron+0.003
 spine+0.003
Neg logit contributions
rums-0.016
es-0.016
 runs-0.015
 turns-0.015
 hugs-0.014
Token glass
Token ID5405
Feature activation-7.34e-03

Pos logit contributions
he+0.218
 contain+0.188
ifying+0.177
 spine+0.159
 entire+0.156
Neg logit contributions
rums-0.225
 runs-0.218
 turns-0.217
 comes-0.205
 happens-0.203
Token and
Token ID290
Feature activation+1.47e-03

Pos logit contributions
he+0.015
ifying+0.012
 contain+0.012
 entire+0.011
 spine+0.010
Neg logit contributions
rums-0.015
 runs-0.014
 turns-0.013
 comes-0.013
 happens-0.012
Token looked
Token ID3114
Feature activation-1.86e-02

Pos logit contributions
rums+0.003
 MacDonald+0.002
 runs+0.002
 happens+0.002
 listens+0.002
Neg logit contributions
he-0.002
 contain-0.002
 usual-0.002
ifying-0.002
 entire-0.002
Token closely
Token ID7173
Feature activation-6.71e-03

Pos logit contributions
he+0.037
ifying+0.031
 contain+0.029
 spine+0.026
 entire+0.025
Neg logit contributions
rums-0.037
 runs-0.037
 turns-0.035
 comes-0.034
 tries-0.033
Token.
Token ID13
Feature activation-3.39e-03

Pos logit contributions
he+0.015
ifying+0.013
 contain+0.012
 entire+0.011
pron+0.011
Neg logit contributions
rums-0.013
 runs-0.012
 turns-0.011
 comes-0.010
 tries-0.010
Token He
Token ID679
Feature activation-4.16e-04

Pos logit contributions
rums+0.008
 runs+0.008
 happens+0.007
 tries+0.007
 comes+0.007
Neg logit contributions
he-0.008
 contain-0.007
ifying-0.006
 spine-0.006
 entire-0.006
Token took
Token ID1718
Feature activation+7.41e-03

Pos logit contributions
he+0.001
 contain+0.001
ifying+0.001
 entire+0.001
pron+0.001
Neg logit contributions
rums-0.001
 runs-0.001
 comes-0.001
 happens-0.001
 turns-0.001
Token out
Token ID503
Feature activation-3.52e-03

Pos logit contributions
rums+0.015
 runs+0.013
 happens+0.012
 listens+0.012
 comes+0.012
Neg logit contributions
he-0.015
 contain-0.012
ifying-0.012
 entire-0.011
 usual-0.010
Token a
Token ID257
Feature activation-3.73e-04

Pos logit contributions
he+0.008
ifying+0.007
 contain+0.007
 entire+0.006
 spine+0.006
Neg logit contributions
rums-0.006
 runs-0.005
 turns-0.005
 happens-0.005
 comes-0.005
Token magn
Token ID7842
Feature activation-4.51e-03

Pos logit contributions
he+0.001
ifying+0.001
 contain+0.001
 spine+0.000
 entire+0.000
Neg logit contributions
 runs-0.001
rums-0.001
 turns-0.001
 comes-0.001
 tries-0.001
Tokenifying
Token ID4035
Feature activation-1.13e-01

Pos logit contributions
 contain+0.004
he+0.003
 usual+0.003
chard+0.003
 entire+0.002
Neg logit contributions
es-0.012
rums-0.012
 runs-0.012
 comes-0.011
 turns-0.011
Token glass
Token ID5405
Feature activation-5.33e-03

Pos logit contributions
he+0.210
 contain+0.192
ifying+0.172
 spine+0.164
ancer+0.163
Neg logit contributions
 turns-0.215
 runs-0.208
rums-0.205
 comes-0.201
 happens-0.197
Token and
Token ID290
Feature activation+3.06e-03

Pos logit contributions
he+0.010
ifying+0.009
 contain+0.009
 entire+0.008
 spine+0.007
Neg logit contributions
rums-0.011
 runs-0.010
 turns-0.010
 comes-0.010
 happens-0.009
Token a
Token ID257
Feature activation-2.76e-04

Pos logit contributions
rums+0.006
 runs+0.005
 turns+0.005
 comes+0.005
 happens+0.005
Neg logit contributions
he-0.006
 contain-0.005
ifying-0.005
 usual-0.005
 entire-0.005
Token notebook
Token ID20922
Feature activation-1.96e-02

Pos logit contributions
he+0.001
 contain+0.000
ifying+0.000
 entire+0.000
ancer+0.000
Neg logit contributions
 runs-0.001
rums-0.001
 turns-0.001
 comes-0.000
 tries-0.000
Token and
Token ID290
Feature activation+3.30e-03

Pos logit contributions
he+0.039
ifying+0.035
 contain+0.033
 entire+0.029
 spine+0.028
Neg logit contributions
rums-0.037
 runs-0.036
 comes-0.033
 turns-0.033
 happens-0.032
Token the
Token ID262
Feature activation+1.08e-03

Pos logit contributions
he+0.000
ifying+0.000
 contain+0.000
 entire+0.000
 spine+0.000
Neg logit contributions
rums-0.000
 runs-0.000
 turns-0.000
 comes-0.000
 happens-0.000
Token watch
Token ID2342
Feature activation-1.01e-02

Pos logit contributions
 runs+0.002
rums+0.002
 turns+0.002
 comes+0.002
 sounds+0.002
Neg logit contributions
he-0.002
ifying-0.002
 contain-0.002
 spine-0.002
 entire-0.001
Token with
Token ID351
Feature activation+1.45e-02

Pos logit contributions
he+0.020
ifying+0.018
 contain+0.018
 entire+0.016
 spine+0.015
Neg logit contributions
rums-0.018
 runs-0.017
es-0.017
 comes-0.017
 turns-0.016
Token the
Token ID262
Feature activation+1.32e-03

Pos logit contributions
rums+0.024
 runs+0.022
 turns+0.021
 comes+0.020
 tries+0.019
Neg logit contributions
he-0.034
ifying-0.030
 contain-0.029
 entire-0.027
 spine-0.026
Token magn
Token ID7842
Feature activation-1.98e-03

Pos logit contributions
rums+0.003
 runs+0.003
 turns+0.002
 comes+0.002
 happens+0.002
Neg logit contributions
he-0.003
ifying-0.002
 contain-0.002
 entire-0.002
 spine-0.002
Tokenifying
Token ID4035
Feature activation-1.12e-01

Pos logit contributions
 contain+0.002
he+0.002
 usual+0.001
 entire+0.001
 rightful+0.001
Neg logit contributions
es-0.005
rums-0.005
 runs-0.005
 comes-0.005
 happens-0.005
Token glass
Token ID5405
Feature activation-5.35e-03

Pos logit contributions
he+0.205
 contain+0.182
ifying+0.170
 spine+0.156
ancer+0.150
Neg logit contributions
rums-0.225
 turns-0.219
 runs-0.219
 comes-0.207
 happens-0.205
Token.
Token ID13
Feature activation+1.05e-04

Pos logit contributions
he+0.011
ifying+0.009
 contain+0.009
 entire+0.008
 spine+0.008
Neg logit contributions
rums-0.011
 runs-0.010
 turns-0.010
 comes-0.009
 happens-0.009
Token He
Token ID679
Feature activation-1.87e-03

Pos logit contributions
rums+0.000
 runs+0.000
 turns+0.000
 comes+0.000
 happens+0.000
Neg logit contributions
he-0.000
ifying-0.000
 contain-0.000
 entire-0.000
 spine-0.000
Token saw
Token ID2497
Feature activation+3.60e-03

Pos logit contributions
he+0.004
 contain+0.003
ifying+0.003
 usual+0.003
 entire+0.003
Neg logit contributions
rums-0.004
 runs-0.003
 turns-0.003
 happens-0.003
 comes-0.003
Token the
Token ID262
Feature activation+1.41e-03

Pos logit contributions
rums+0.008
 runs+0.007
 comes+0.007
 listens+0.007
 turns+0.007
Neg logit contributions
he-0.006
ifying-0.005
 contain-0.005
 usual-0.005
 entire-0.004
Token,
Token ID11
Feature activation+5.31e-03

Pos logit contributions
he+0.016
ifying+0.014
 contain+0.014
 entire+0.012
 spine+0.012
Neg logit contributions
rums-0.017
 runs-0.016
 turns-0.015
 comes-0.014
 happens-0.014
Token handing
Token ID22786
Feature activation+1.57e-02

Pos logit contributions
rums+0.007
 runs+0.006
 turns+0.005
 comes+0.005
 tries+0.005
Neg logit contributions
he-0.015
ifying-0.013
 contain-0.013
 entire-0.012
 spine-0.011
Token him
Token ID683
Feature activation+1.37e-03

Pos logit contributions
rums+0.027
 runs+0.023
 happens+0.021
 turns+0.021
 comes+0.020
Neg logit contributions
he-0.038
ifying-0.032
 contain-0.031
 usual-0.029
 entire-0.028
Token the
Token ID262
Feature activation+3.39e-03

Pos logit contributions
rums+0.003
 runs+0.003
 happens+0.003
 comes+0.002
 turns+0.002
Neg logit contributions
he-0.002
ifying-0.002
 contain-0.002
 usual-0.002
 entire-0.002
Token magn
Token ID7842
Feature activation-1.09e-03

Pos logit contributions
 runs+0.006
rums+0.006
 turns+0.006
 comes+0.006
 happens+0.005
Neg logit contributions
he-0.007
ifying-0.006
 contain-0.006
 entire-0.005
 spine-0.005
Tokenifying
Token ID4035
Feature activation-1.12e-01

Pos logit contributions
 contain+0.001
he+0.001
 usual+0.001
 entire+0.001
 rightful+0.001
Neg logit contributions
es-0.003
rums-0.003
 runs-0.003
 happens-0.002
 turns-0.002
Token glass
Token ID5405
Feature activation-4.37e-03

Pos logit contributions
he+0.204
 contain+0.182
ifying+0.168
 spine+0.156
ancer+0.150
Neg logit contributions
rums-0.223
 turns-0.219
 runs-0.217
 comes-0.205
 happens-0.204
Token.
Token ID13
Feature activation+3.49e-03

Pos logit contributions
he+0.008
ifying+0.007
 contain+0.007
 entire+0.006
 spine+0.006
Neg logit contributions
rums-0.009
 runs-0.009
 turns-0.008
 comes-0.008
 happens-0.008
Token She
Token ID1375
Feature activation+5.12e-03

Pos logit contributions
rums+0.008
 runs+0.007
 happens+0.006
 turns+0.006
 tries+0.006
Neg logit contributions
he-0.007
ifying-0.005
 contain-0.005
-0.005
 spine-0.004
Token looked
Token ID3114
Feature activation-1.34e-02

Pos logit contributions
rums+0.010
 runs+0.008
 turns+0.008
 happens+0.008
 comes+0.008
Neg logit contributions
he-0.011
 contain-0.009
ifying-0.009
pron-0.008
 entire-0.008
Token around
Token ID1088
Feature activation+2.71e-04

Pos logit contributions
he+0.025
ifying+0.021
 contain+0.021
 entire+0.018
 spine+0.018
Neg logit contributions
rums-0.028
 runs-0.027
 turns-0.026
 comes-0.025
 tries-0.025
Token
Token ID198
Feature activation+5.71e-03

Pos logit contributions
rums+0.013
 runs+0.013
 happens+0.011
 comes+0.011
 turns+0.011
Neg logit contributions
he-0.013
 contain-0.010
ifying-0.010
 entire-0.009
 usual-0.008
TokenHe
Token ID1544
Feature activation-1.14e-02

Pos logit contributions
rums+0.011
 runs+0.010
 happens+0.010
 comes+0.010
 hides+0.009
Neg logit contributions
he-0.012
 contain-0.009
ifying-0.009
 entire-0.009
bell-0.008
Token pointed
Token ID6235
Feature activation+5.58e-03

Pos logit contributions
he+0.024
 contain+0.020
ifying+0.019
 entire+0.017
pron+0.017
Neg logit contributions
rums-0.023
 runs-0.019
 turns-0.018
 happens-0.018
 comes-0.018
Token his
Token ID465
Feature activation+9.71e-03

Pos logit contributions
rums+0.011
 runs+0.010
 turns+0.010
 comes+0.009
 happens+0.009
Neg logit contributions
he-0.011
ifying-0.010
 contain-0.009
 entire-0.008
 spine-0.008
Token magn
Token ID7842
Feature activation-4.01e-03

Pos logit contributions
rums+0.018
 runs+0.017
 turns+0.017
 comes+0.016
 tries+0.015
Neg logit contributions
he-0.022
 contain-0.018
ifying-0.018
 entire-0.015
ancer-0.015
Tokenifying
Token ID4035
Feature activation-1.11e-01

Pos logit contributions
he+0.004
 contain+0.003
 usual+0.003
 weekend+0.002
 spine+0.002
Neg logit contributions
es-0.011
rums-0.011
 runs-0.010
 turns-0.010
 comes-0.010
Token glass
Token ID5405
Feature activation-4.59e-03

Pos logit contributions
he+0.202
 contain+0.174
ifying+0.165
 spine+0.146
 entire+0.144
Neg logit contributions
rums-0.231
 runs-0.222
 turns-0.218
 comes-0.209
 happens-0.206
Token at
Token ID379
Feature activation+1.35e-03

Pos logit contributions
he+0.009
 contain+0.008
ifying+0.008
 entire+0.007
 spine+0.007
Neg logit contributions
rums-0.009
 runs-0.009
 turns-0.008
 comes-0.008
 happens-0.008
Token different
Token ID1180
Feature activation+9.81e-03

Pos logit contributions
rums+0.002
 turns+0.002
 runs+0.002
 comes+0.001
es+0.001
Neg logit contributions
he-0.003
ifying-0.003
 contain-0.003
pron-0.003
 spine-0.003
Token classmates
Token ID28999
Feature activation-1.35e-02

Pos logit contributions
rums+0.016
 runs+0.014
 turns+0.013
 happens+0.012
 comes+0.012
Neg logit contributions
he-0.024
ifying-0.021
 contain-0.021
 entire-0.019
 usual-0.018
Token and
Token ID290
Feature activation+6.55e-04

Pos logit contributions
he+0.028
ifying+0.025
 contain+0.024
 entire+0.022
 spine+0.021
Neg logit contributions
rums-0.025
 runs-0.023
 turns-0.022
es-0.022
 comes-0.021
Token a
Token ID257
Feature activation+3.48e-03

Pos logit contributions
rums+0.000
 runs+0.000
 happens+0.000
 comes+0.000
 turns+0.000
Neg logit contributions
he-0.000
 contain-0.000
ifying-0.000
 usual-0.000
 entire-0.000
Token flashlight
Token ID38371
Feature activation-5.70e-03

Pos logit contributions
rums+0.008
 runs+0.007
 turns+0.007
 comes+0.007
 happens+0.007
Neg logit contributions
he-0.006
ifying-0.005
 contain-0.005
 entire-0.004
 spine-0.004
Token and
Token ID290
Feature activation+1.21e-03

Pos logit contributions
he+0.013
ifying+0.012
 contain+0.011
 entire+0.011
pron+0.010
Neg logit contributions
 runs-0.009
 comes-0.008
rums-0.008
 turns-0.008
es-0.007
Token a
Token ID257
Feature activation+2.83e-03

Pos logit contributions
rums+0.002
 runs+0.002
 turns+0.002
 comes+0.002
 happens+0.002
Neg logit contributions
he-0.003
ifying-0.002
 contain-0.002
 entire-0.002
 usual-0.002
Token magn
Token ID7842
Feature activation-1.80e-03

Pos logit contributions
rums+0.006
 runs+0.006
 turns+0.005
 comes+0.005
 happens+0.005
Neg logit contributions
he-0.006
 contain-0.005
ifying-0.005
 entire-0.004
 spine-0.004
Tokenifying
Token ID4035
Feature activation-1.10e-01

Pos logit contributions
 contain+0.002
he+0.001
 rightful+0.001
 entire+0.001
pron+0.001
Neg logit contributions
es-0.005
 comes-0.004
 runs-0.004
rums-0.004
icks-0.004
Token glass
Token ID5405
Feature activation-2.34e-03

Pos logit contributions
he+0.203
 contain+0.181
ifying+0.166
ancer+0.154
 spine+0.151
Neg logit contributions
 turns-0.210
 runs-0.209
rums-0.206
 comes-0.198
 happens-0.194
Token to
Token ID284
Feature activation+5.56e-03

Pos logit contributions
he+0.005
ifying+0.004
 contain+0.004
 entire+0.003
 spine+0.003
Neg logit contributions
rums-0.004
 runs-0.004
 comes-0.004
 turns-0.004
 happens-0.004
Token examine
Token ID10716
Feature activation-1.52e-03

Pos logit contributions
 runs+0.010
rums+0.010
 turns+0.009
 comes+0.009
 tries+0.009
Neg logit contributions
he-0.012
ifying-0.011
 contain-0.009
 spine-0.009
ancer-0.009
Token their
Token ID511
Feature activation+1.13e-02

Pos logit contributions
he+0.003
ifying+0.003
 contain+0.002
 entire+0.002
 usual+0.002
Neg logit contributions
rums-0.003
 runs-0.003
 turns-0.003
 happens-0.003
 comes-0.002
Token finds
Token ID7228
Feature activation+1.16e-02

Pos logit contributions
rums+0.021
 runs+0.019
 turns+0.018
 comes+0.017
 happens+0.017
Neg logit contributions
he-0.025
ifying-0.022
 contain-0.021
 entire-0.019
 spine-0.018
Token said
Token ID531
Feature activation-1.21e-02

Pos logit contributions
rums+0.009
 runs+0.007
 happens+0.007
 comes+0.007
 turns+0.007
Neg logit contributions
he-0.007
 contain-0.006
 entire-0.005
 spine-0.005
ifying-0.004
Token,
Token ID11
Feature activation+2.63e-04

Pos logit contributions
he+0.025
ifying+0.022
 contain+0.021
 entire+0.018
 spine+0.018
Neg logit contributions
rums-0.024
 runs-0.022
 turns-0.021
 comes-0.020
 happens-0.020
Token holding
Token ID4769
Feature activation+1.20e-02

Pos logit contributions
rums+0.000
 runs+0.000
 turns+0.000
 comes+0.000
 tries+0.000
Neg logit contributions
he-0.001
ifying-0.000
 contain-0.000
 entire-0.000
 spine-0.000
Token the
Token ID262
Feature activation+2.42e-03

Pos logit contributions
rums+0.024
 runs+0.022
 turns+0.020
 happens+0.020
 comes+0.020
Neg logit contributions
he-0.025
ifying-0.021
 contain-0.020
 entire-0.018
 usual-0.018
Token magn
Token ID7842
Feature activation-3.37e-03

Pos logit contributions
rums+0.004
 runs+0.004
 turns+0.004
 comes+0.004
 tries+0.004
Neg logit contributions
he-0.005
 contain-0.005
ifying-0.004
ancer-0.004
 spine-0.004
Tokenifying
Token ID4035
Feature activation-1.10e-01

Pos logit contributions
 contain+0.003
he+0.003
 usual+0.002
 spine+0.002
 weekend+0.002
Neg logit contributions
es-0.009
rums-0.009
 runs-0.009
ez-0.008
 comes-0.008
Token glass
Token ID5405
Feature activation-3.89e-03

Pos logit contributions
he+0.198
 contain+0.173
ifying+0.163
 spine+0.145
 entire+0.140
Neg logit contributions
rums-0.229
 runs-0.220
 turns-0.217
 comes-0.207
 happens-0.205
Token over
Token ID625
Feature activation+8.91e-03

Pos logit contributions
he+0.008
ifying+0.007
 contain+0.007
ancer+0.006
 entire+0.006
Neg logit contributions
 runs-0.007
rums-0.007
 turns-0.007
 comes-0.007
es-0.006
Token a
Token ID257
Feature activation+5.55e-04

Pos logit contributions
rums+0.013
 runs+0.012
 turns+0.011
 comes+0.011
 tries+0.010
Neg logit contributions
he-0.023
ifying-0.020
 contain-0.020
 spine-0.018
 entire-0.018
Token black
Token ID2042
Feature activation+2.23e-03

Pos logit contributions
 runs+0.001
 turns+0.001
rums+0.001
 comes+0.001
 sounds+0.001
Neg logit contributions
he-0.001
 contain-0.001
ifying-0.001
ancer-0.001
icking-0.001
Token ant
Token ID1885
Feature activation+4.08e-03

Pos logit contributions
rums+0.004
 runs+0.004
 comes+0.004
 turns+0.004
 happens+0.004
Neg logit contributions
he-0.004
ifying-0.004
 contain-0.004
 entire-0.003
 usual-0.003
Token the
Token ID262
Feature activation+2.52e-03

Pos logit contributions
he+0.002
 contain+0.002
ifying+0.002
 usual+0.002
 entire+0.002
Neg logit contributions
rums-0.003
 happens-0.002
 runs-0.002
 turns-0.002
Raw-0.002
Token necklace
Token ID38987
Feature activation-1.60e-02

Pos logit contributions
 runs+0.004
 turns+0.004
rums+0.004
 comes+0.004
 tries+0.004
Neg logit contributions
he-0.006
ifying-0.005
 contain-0.005
 spine-0.004
ancer-0.004
Token and
Token ID290
Feature activation+2.04e-03

Pos logit contributions
he+0.033
ifying+0.030
 contain+0.029
 entire+0.025
 spine+0.025
Neg logit contributions
rums-0.029
 runs-0.029
 comes-0.027
 turns-0.026
es-0.026
Token the
Token ID262
Feature activation+2.92e-03

Pos logit contributions
rums+0.004
 runs+0.003
 turns+0.003
 happens+0.003
ez+0.003
Neg logit contributions
he-0.003
 contain-0.003
ifying-0.003
 usual-0.003
chard-0.003
Token magn
Token ID7842
Feature activation-3.03e-03

Pos logit contributions
rums+0.006
 runs+0.006
 turns+0.005
 comes+0.005
 happens+0.005
Neg logit contributions
he-0.006
ifying-0.005
 contain-0.005
 entire-0.004
 spine-0.004
Tokenifying
Token ID4035
Feature activation-1.10e-01

Pos logit contributions
 contain+0.003
he+0.003
 usual+0.003
 spine+0.003
pron+0.002
Neg logit contributions
es-0.008
rums-0.007
 runs-0.007
 comes-0.007
 turns-0.007
Token glass
Token ID5405
Feature activation-3.63e-03

Pos logit contributions
he+0.202
 contain+0.176
ifying+0.163
 spine+0.150
ancer+0.143
Neg logit contributions
rums-0.222
 turns-0.217
 runs-0.216
 comes-0.204
 happens-0.203
Token and
Token ID290
Feature activation+2.94e-03

Pos logit contributions
he+0.007
ifying+0.006
 contain+0.006
 entire+0.005
 spine+0.005
Neg logit contributions
rums-0.008
 runs-0.007
 turns-0.007
 comes-0.007
 happens-0.007
Token told
Token ID1297
Feature activation+1.95e-02

Pos logit contributions
rums+0.006
 runs+0.005
 turns+0.005
 happens+0.005
 comes+0.004
Neg logit contributions
he-0.005
 contain-0.005
ifying-0.005
 usual-0.004
chard-0.004
Token him
Token ID683
Feature activation-5.32e-04

Pos logit contributions
rums+0.036
 runs+0.031
 turns+0.029
 listens+0.028
AAAA+0.027
Neg logit contributions
he-0.041
 contain-0.035
ifying-0.033
 usual-0.033
 entire-0.028
Token that
Token ID326
Feature activation-1.40e-03

Pos logit contributions
he+0.001
 contain+0.001
ifying+0.001
 usual+0.001
 entire+0.001
Neg logit contributions
rums-0.001
 runs-0.001
 turns-0.001
AAAA-0.001
 tries-0.001
Token She
Token ID1375
Feature activation-1.28e-03

Pos logit contributions
he+0.001
ifying+0.000
 contain+0.000
 entire+0.000
 usual+0.000
Neg logit contributions
rums-0.001
 runs-0.000
 turns-0.000
 happens-0.000
 comes-0.000
Token had
Token ID550
Feature activation-2.72e-03

Pos logit contributions
he+0.003
 contain+0.002
ifying+0.002
 entire+0.002
 usual+0.002
Neg logit contributions
rums-0.002
 runs-0.002
 turns-0.002
 happens-0.002
 comes-0.002
Token a
Token ID257
Feature activation+5.59e-04

Pos logit contributions
he+0.005
ifying+0.005
 contain+0.005
 entire+0.004
 usual+0.004
Neg logit contributions
rums-0.006
 runs-0.005
 turns-0.005
 comes-0.005
 happens-0.005
Token big
Token ID1263
Feature activation-5.64e-03

Pos logit contributions
 runs+0.001
 turns+0.001
rums+0.001
 comes+0.001
 tries+0.001
Neg logit contributions
he-0.001
ifying-0.001
 contain-0.001
 spine-0.001
 entire-0.001
Token magn
Token ID7842
Feature activation-4.37e-03

Pos logit contributions
he+0.010
ifying+0.008
 contain+0.008
 entire+0.007
 spine+0.007
Neg logit contributions
rums-0.013
 runs-0.012
 turns-0.011
 comes-0.011
 happens-0.011
Tokenifying
Token ID4035
Feature activation-1.10e-01

Pos logit contributions
he+0.004
 contain+0.004
 entire+0.003
 rightful+0.003
pron+0.003
Neg logit contributions
es-0.011
rums-0.011
 runs-0.010
 comes-0.010
 turns-0.010
Token glass
Token ID5405
Feature activation-5.10e-03

Pos logit contributions
he+0.207
 contain+0.187
ifying+0.173
ancer+0.162
 spine+0.160
Neg logit contributions
 turns-0.201
 runs-0.197
rums-0.196
 comes-0.191
 breaks-0.187
Token that
Token ID326
Feature activation-5.86e-03

Pos logit contributions
he+0.010
ifying+0.009
 contain+0.009
 entire+0.008
 spine+0.008
Neg logit contributions
 runs-0.009
rums-0.009
 comes-0.009
 turns-0.009
 happens-0.008
Token made
Token ID925
Feature activation+5.12e-03

Pos logit contributions
he+0.013
 contain+0.011
ifying+0.011
 usual+0.010
 entire+0.010
Neg logit contributions
rums-0.011
 runs-0.010
 turns-0.009
 happens-0.008
 comes-0.008
Token things
Token ID1243
Feature activation-9.77e-03

Pos logit contributions
rums+0.009
 runs+0.008
 turns+0.008
 happens+0.007
 comes+0.007
Neg logit contributions
he-0.012
ifying-0.010
 contain-0.010
 entire-0.009
 spine-0.009
Token look
Token ID804
Feature activation-6.33e-03

Pos logit contributions
he+0.019
ifying+0.016
 contain+0.015
 spine+0.014
pron+0.013
Neg logit contributions
rums-0.019
 runs-0.019
 turns-0.019
 comes-0.018
 happens-0.017
Token it
Token ID340
Feature activation-7.43e-03

Pos logit contributions
he+0.003
ifying+0.003
 contain+0.003
 entire+0.002
 spine+0.002
Neg logit contributions
rums-0.002
 runs-0.002
 turns-0.002
 comes-0.002
 tries-0.001
Token she
Token ID673
Feature activation-4.15e-03

Pos logit contributions
he+0.016
ifying+0.013
 contain+0.013
 entire+0.012
 spine+0.011
Neg logit contributions
rums-0.015
 runs-0.013
 turns-0.012
 happens-0.012
 comes-0.012
Token had
Token ID550
Feature activation-2.93e-03

Pos logit contributions
he+0.010
ifying+0.008
 contain+0.008
 entire+0.007
 usual+0.007
Neg logit contributions
rums-0.007
 runs-0.006
 turns-0.006
 happens-0.006
 comes-0.005
Token a
Token ID257
Feature activation+7.15e-04

Pos logit contributions
he+0.006
ifying+0.005
 contain+0.005
 entire+0.005
 usual+0.004
Neg logit contributions
rums-0.006
 runs-0.005
 turns-0.005
 happens-0.005
 comes-0.005
Token magn
Token ID7842
Feature activation-5.81e-03

Pos logit contributions
 runs+0.001
 turns+0.001
rums+0.001
 comes+0.001
 tries+0.001
Neg logit contributions
he-0.001
 contain-0.001
ifying-0.001
ancer-0.001
 spine-0.001
Tokenifying
Token ID4035
Feature activation-1.08e-01

Pos logit contributions
 contain+0.005
he+0.005
 rightful+0.004
pron+0.004
 usual+0.004
Neg logit contributions
es-0.015
rums-0.013
 runs-0.013
 comes-0.012
ched-0.012
Token glass
Token ID5405
Feature activation-6.07e-03

Pos logit contributions
he+0.205
 contain+0.188
ifying+0.170
ancer+0.166
 spine+0.161
Neg logit contributions
 turns-0.193
 runs-0.188
 comes-0.183
rums-0.182
 breaks-0.182
Token,
Token ID11
Feature activation+1.17e-04

Pos logit contributions
he+0.011
ifying+0.010
 contain+0.010
 entire+0.009
 spine+0.008
Neg logit contributions
 runs-0.012
rums-0.012
 turns-0.011
 comes-0.011
 happens-0.010
Token a
Token ID257
Feature activation-2.92e-05

Pos logit contributions
rums+0.000
 runs+0.000
 turns+0.000
 comes+0.000
 tries+0.000
Neg logit contributions
he-0.000
ifying-0.000
 contain-0.000
 entire-0.000
 spine-0.000
Token tiny
Token ID7009
Feature activation+2.45e-03

Pos logit contributions
he+0.000
 contain+0.000
ifying+0.000
 entire+0.000
 spine+0.000
Neg logit contributions
 runs-0.000
rums-0.000
 turns-0.000
 comes-0.000
 happens-0.000
Token torch
Token ID28034
Feature activation-1.20e-02

Pos logit contributions
 runs+0.005
rums+0.005
 turns+0.004
 comes+0.004
 happens+0.004
Neg logit contributions
he-0.005
 contain-0.004
ifying-0.004
 entire-0.003
 usual-0.003
Token and
Token ID290
Feature activation+1.97e-03

Pos logit contributions
he+0.049
ifying+0.047
 contain+0.043
 entire+0.038
 spine+0.037
Neg logit contributions
rums-0.039
 runs-0.037
 turns-0.035
 comes-0.034
 happens-0.032
Token handed
Token ID10158
Feature activation+9.72e-03

Pos logit contributions
rums+0.004
 runs+0.003
 turns+0.003
 happens+0.003
 comes+0.003
Neg logit contributions
he-0.004
 contain-0.004
ifying-0.003
 entire-0.003
 usual-0.003
Token him
Token ID683
Feature activation-6.05e-04

Pos logit contributions
rums+0.015
 runs+0.013
 turns+0.012
 happens+0.012
 comes+0.011
Neg logit contributions
he-0.024
ifying-0.021
 contain-0.021
 entire-0.019
 usual-0.019
Token the
Token ID262
Feature activation+3.40e-03

Pos logit contributions
he+0.001
ifying+0.001
 contain+0.001
 entire+0.001
 usual+0.001
Neg logit contributions
rums-0.001
 runs-0.001
 happens-0.001
 turns-0.001
 comes-0.001
Token magn
Token ID7842
Feature activation-2.33e-03

Pos logit contributions
 runs+0.006
 turns+0.006
rums+0.006
 comes+0.006
 happens+0.005
Neg logit contributions
he-0.007
ifying-0.006
 contain-0.006
 spine-0.005
ancer-0.005
Tokenifying
Token ID4035
Feature activation-1.08e-01

Pos logit contributions
 contain+0.003
he+0.003
 usual+0.002
 spine+0.002
pron+0.002
Neg logit contributions
es-0.006
rums-0.005
 turns-0.005
 runs-0.005
 happens-0.005
Token glass
Token ID5405
Feature activation-2.85e-03

Pos logit contributions
he+0.199
 contain+0.174
ifying+0.161
 spine+0.149
 entire+0.141
Neg logit contributions
rums-0.219
 turns-0.213
 runs-0.213
 comes-0.202
 happens-0.201
Token.
Token ID13
Feature activation+3.69e-03

Pos logit contributions
he+0.005
ifying+0.005
 contain+0.004
 entire+0.004
 spine+0.004
Neg logit contributions
rums-0.006
 runs-0.006
 turns-0.005
 comes-0.005
 happens-0.005
Token
Token ID198
Feature activation+7.28e-03

Pos logit contributions
rums+0.007
 runs+0.006
 turns+0.006
 happens+0.005
 comes+0.005
Neg logit contributions
he-0.008
ifying-0.007
 contain-0.007
 spine-0.006
 entire-0.006
Token
Token ID198
Feature activation+6.36e-03

Pos logit contributions
rums+0.014
 runs+0.013
 turns+0.012
 happens+0.011
 comes+0.011
Neg logit contributions
he-0.016
ifying-0.013
 contain-0.013
 entire-0.011
 usual-0.011
TokenHe
Token ID1544
Feature activation-9.82e-03

Pos logit contributions
rums+0.013
 runs+0.011
 turns+0.011
 happens+0.011
 comes+0.011
Neg logit contributions
he-0.014
ifying-0.011
 contain-0.011
 entire-0.010
pron-0.009
Token hat
Token ID6877
Feature activation-5.26e-03

Pos logit contributions
 runs+0.013
 turns+0.013
rums+0.013
 hugs+0.013
 tries+0.013
Neg logit contributions
he-0.017
 contain-0.014
ifying-0.014
ancer-0.012
 spine-0.012
Token and
Token ID290
Feature activation+3.29e-03

Pos logit contributions
ifying+0.012
 contain+0.011
he+0.011
 entire+0.010
ancer+0.010
Neg logit contributions
rums-0.007
es-0.007
 runs-0.007
 comes-0.006
 turns-0.006
Token grabbed
Token ID13176
Feature activation-1.93e-03

Pos logit contributions
rums+0.006
 runs+0.005
 turns+0.005
 happens+0.005
 comes+0.005
Neg logit contributions
he-0.007
ifying-0.006
 contain-0.006
 usual-0.005
pron-0.005
Token her
Token ID607
Feature activation+7.77e-03

Pos logit contributions
he+0.005
ifying+0.004
 contain+0.004
 entire+0.004
 spine+0.004
Neg logit contributions
rums-0.003
 runs-0.003
 turns-0.003
 comes-0.002
 happens-0.002
Token magn
Token ID7842
Feature activation-2.84e-03

Pos logit contributions
 turns+0.012
 runs+0.012
rums+0.012
 hugs+0.012
 tries+0.012
Neg logit contributions
he-0.018
 contain-0.016
ifying-0.014
ancer-0.013
 spine-0.013
Tokenifying
Token ID4035
Feature activation-1.08e-01

Pos logit contributions
 contain+0.003
he+0.003
 usual+0.002
 particular+0.002
 entire+0.002
Neg logit contributions
es-0.008
rums-0.007
 runs-0.007
 turns-0.007
ez-0.007
Token glass
Token ID5405
Feature activation-3.51e-03

Pos logit contributions
he+0.200
 contain+0.173
ifying+0.164
 spine+0.145
 entire+0.142
Neg logit contributions
rums-0.220
 runs-0.213
 turns-0.210
 comes-0.201
 happens-0.199
Token.
Token ID13
Feature activation+4.26e-03

Pos logit contributions
he+0.007
 contain+0.006
ifying+0.006
 entire+0.005
 spine+0.005
Neg logit contributions
rums-0.007
 runs-0.006
 turns-0.006
 comes-0.006
es-0.006
Token She
Token ID1375
Feature activation+7.33e-03

Pos logit contributions
rums+0.008
 runs+0.007
 turns+0.007
 tries+0.007
 happens+0.006
Neg logit contributions
he-0.009
ifying-0.008
 contain-0.007
 spine-0.006
 usual-0.006
Token was
Token ID373
Feature activation+1.16e-03

Pos logit contributions
rums+0.014
 runs+0.012
 turns+0.011
 comes+0.011
 happens+0.011
Neg logit contributions
he-0.016
ifying-0.014
 contain-0.014
 entire-0.012
 usual-0.012
Token ready
Token ID3492
Feature activation+9.43e-03

Pos logit contributions
rums+0.002
 runs+0.002
 turns+0.002
 comes+0.002
 tries+0.002
Neg logit contributions
he-0.002
ifying-0.002
 contain-0.002
 spine-0.002
 entire-0.002
Token cried
Token ID16896
Feature activation+2.01e-02

Pos logit contributions
rums+0.003
 runs+0.003
 turns+0.003
 happens+0.003
 comes+0.003
Neg logit contributions
he-0.004
ifying-0.004
 contain-0.004
 entire-0.003
 spine-0.003
Token,
Token ID11
Feature activation-5.24e-04

Pos logit contributions
rums+0.040
 runs+0.037
 turns+0.035
 comes+0.034
 happens+0.033
Neg logit contributions
he-0.041
ifying-0.036
 contain-0.035
 entire-0.030
pron-0.029
Token dropping
Token ID12047
Feature activation+2.89e-02

Pos logit contributions
he+0.001
ifying+0.001
 contain+0.001
 entire+0.001
 spine+0.001
Neg logit contributions
rums-0.001
 runs-0.001
 turns-0.001
 comes-0.001
 happens-0.001
Token the
Token ID262
Feature activation+2.50e-03

Pos logit contributions
rums+0.057
 runs+0.049
 happens+0.047
 turns+0.046
 listens+0.046
Neg logit contributions
he-0.060
ifying-0.052
 contain-0.049
 entire-0.044
 usual-0.043
Token magn
Token ID7842
Feature activation-3.23e-03

Pos logit contributions
rums+0.004
 runs+0.004
 turns+0.004
 comes+0.004
 tries+0.004
Neg logit contributions
he-0.006
ifying-0.005
 contain-0.005
ancer-0.004
pron-0.004
Tokenifying
Token ID4035
Feature activation-1.05e-01

Pos logit contributions
 contain+0.003
he+0.003
 usual+0.003
 rightful+0.002
pron+0.002
Neg logit contributions
es-0.009
rums-0.008
 runs-0.008
 comes-0.007
 breaks-0.007
Token glass
Token ID5405
Feature activation-4.30e-03

Pos logit contributions
he+0.192
 contain+0.171
ifying+0.159
 spine+0.144
 entire+0.137
Neg logit contributions
rums-0.213
 runs-0.206
 turns-0.205
 comes-0.196
 happens-0.193
Token.
Token ID13
Feature activation+3.18e-04

Pos logit contributions
he+0.009
 contain+0.007
ifying+0.007
 entire+0.007
 usual+0.007
Neg logit contributions
rums-0.008
 runs-0.008
 comes-0.007
 turns-0.007
 happens-0.007
Token
Token ID198
Feature activation+4.63e-03

Pos logit contributions
rums+0.001
 runs+0.001
 turns+0.000
 happens+0.000
 comes+0.000
Neg logit contributions
he-0.001
ifying-0.001
 contain-0.001
 spine-0.001
 usual-0.001
Token
Token ID198
Feature activation+4.20e-03

Pos logit contributions
rums+0.010
 runs+0.009
 turns+0.008
 comes+0.008
 happens+0.008
Neg logit contributions
he-0.009
ifying-0.007
 contain-0.006
 entire-0.006
 usual-0.006
Token"
Token ID1
Feature activation-1.83e-02

Pos logit contributions
rums+0.008
 runs+0.007
 comes+0.006
 happens+0.006
 turns+0.006
Neg logit contributions
he-0.010
ifying-0.008
 contain-0.008
 entire-0.007
pron-0.007
Token.
Token ID13
Feature activation-2.39e-03

Pos logit contributions
rums+0.027
es+0.026
 runs+0.026
 turns+0.024
 comes+0.024
Neg logit contributions
he-0.038
 contain-0.035
ifying-0.034
 entire-0.031
 spine-0.030
Token Lily
Token ID20037
Feature activation+2.87e-03

Pos logit contributions
he+0.006
ifying+0.005
 contain+0.005
 entire+0.005
 spine+0.004
Neg logit contributions
rums-0.004
 runs-0.003
 turns-0.003
 happens-0.003
 comes-0.003
Token had
Token ID550
Feature activation-1.72e-03

Pos logit contributions
rums+0.007
ulk+0.005
 happens+0.005
 runs+0.005
irens+0.005
Neg logit contributions
he-0.005
 contain-0.004
 entire-0.004
 usual-0.004
ifying-0.004
Token a
Token ID257
Feature activation+1.47e-03

Pos logit contributions
he+0.003
ifying+0.003
 contain+0.003
 usual+0.002
 entire+0.002
Neg logit contributions
rums-0.004
 runs-0.003
 happens-0.003
 turns-0.003
 comes-0.003
Token magn
Token ID7842
Feature activation-2.44e-03

Pos logit contributions
 runs+0.003
 turns+0.003
rums+0.003
 comes+0.003
 tries+0.003
Neg logit contributions
he-0.003
 contain-0.002
ifying-0.002
 entire-0.002
 spine-0.002
Tokenifying
Token ID4035
Feature activation-1.05e-01

Pos logit contributions
he+0.002
 contain+0.002
 entire+0.002
 sunny+0.002
 rightful+0.002
Neg logit contributions
es-0.006
rums-0.006
 comes-0.006
 runs-0.006
 turns-0.006
Token glass
Token ID5405
Feature activation-2.92e-03

Pos logit contributions
he+0.197
 contain+0.173
ifying+0.164
ancer+0.148
 spine+0.144
Neg logit contributions
 turns-0.196
 runs-0.196
rums-0.194
 comes-0.188
 happens-0.184
Token that
Token ID326
Feature activation-5.80e-03

Pos logit contributions
he+0.006
ifying+0.005
 contain+0.005
 entire+0.004
 spine+0.004
Neg logit contributions
 runs-0.005
rums-0.005
 comes-0.005
 turns-0.005
 happens-0.005
Token she
Token ID673
Feature activation-6.82e-03

Pos logit contributions
he+0.012
 contain+0.010
ifying+0.010
 usual+0.009
 entire+0.009
Neg logit contributions
rums-0.012
 runs-0.010
 turns-0.009
 happens-0.009
 comes-0.009
Token found
Token ID1043
Feature activation+1.32e-02

Pos logit contributions
he+0.014
 contain+0.012
ifying+0.012
 entire+0.011
 usual+0.011
Neg logit contributions
rums-0.013
 runs-0.011
 turns-0.011
 happens-0.011
 comes-0.010
Token in
Token ID287
Feature activation+6.01e-03

Pos logit contributions
rums+0.030
 runs+0.028
 turns+0.027
 comes+0.026
 happens+0.025
Neg logit contributions
he-0.023
ifying-0.019
 contain-0.019
 entire-0.016
 spine-0.016
Token hold
Token ID1745
Feature activation-8.25e-03

Pos logit contributions
 runs+0.007
rums+0.007
 turns+0.006
 tries+0.006
 comes+0.006
Neg logit contributions
he-0.007
ifying-0.007
 contain-0.006
 spine-0.006
 entire-0.005
Token the
Token ID262
Feature activation+1.98e-03

Pos logit contributions
he+0.018
ifying+0.015
 contain+0.014
 entire+0.013
 usual+0.012
Neg logit contributions
rums-0.016
 runs-0.014
 turns-0.013
 happens-0.013
 comes-0.013
Token string
Token ID4731
Feature activation+4.22e-03

Pos logit contributions
 runs+0.003
 turns+0.003
rums+0.003
 sounds+0.002
 comes+0.002
Neg logit contributions
he-0.005
 contain-0.004
ifying-0.004
 spine-0.004
pron-0.004
Token and
Token ID290
Feature activation-1.14e-03

Pos logit contributions
 runs+0.007
rums+0.007
es+0.006
 turns+0.006
 comes+0.006
Neg logit contributions
he-0.009
ifying-0.008
 contain-0.008
 entire-0.007
pron-0.007
Token run
Token ID1057
Feature activation+1.48e-02

Pos logit contributions
he+0.002
 contain+0.002
ifying+0.002
 usual+0.001
 spine+0.001
Neg logit contributions
rums-0.003
 runs-0.002
 happens-0.002
es-0.002
 turns-0.002
Token with
Token ID351
Feature activation+1.69e-02

Pos logit contributions
rums+0.035
 runs+0.031
 happens+0.029
 turns+0.029
 comes+0.029
Neg logit contributions
he-0.026
ifying-0.021
 contain-0.020
 entire-0.018
 usual-0.016
Token the
Token ID262
Feature activation+2.65e-03

Pos logit contributions
rums+0.030
 runs+0.025
 happens+0.024
 listens+0.023
 turns+0.023
Neg logit contributions
he-0.038
ifying-0.033
 contain-0.032
 usual-0.029
 entire-0.029
Token k
Token ID479
Feature activation-2.63e-03

Pos logit contributions
 runs+0.004
rums+0.004
 turns+0.004
 comes+0.004
 happens+0.004
Neg logit contributions
he-0.006
ifying-0.006
 contain-0.005
 spine-0.005
pron-0.005
Tokenite
Token ID578
Feature activation-1.24e-03

Pos logit contributions
he+0.004
 contain+0.004
ifying+0.003
 entire+0.003
 spine+0.003
Neg logit contributions
rums-0.006
 runs-0.006
 turns-0.006
 comes-0.005
es-0.005
Token.
Token ID13
Feature activation+1.46e-03

Pos logit contributions
he+0.002
ifying+0.002
 contain+0.002
 entire+0.002
 spine+0.002
Neg logit contributions
rums-0.003
 runs-0.002
 turns-0.002
 comes-0.002
 happens-0.002
Token The
Token ID383
Feature activation-4.61e-03

Pos logit contributions
rums+0.003
 runs+0.003
 turns+0.002
 happens+0.002
 listens+0.002
Neg logit contributions
he-0.003
ifying-0.003
 contain-0.002
 entire-0.002
 spine-0.002
Token the
Token ID262
Feature activation+5.30e-03

Pos logit contributions
rums+0.008
 runs+0.007
 turns+0.007
 comes+0.007
 happens+0.007
Neg logit contributions
he-0.009
ifying-0.008
 contain-0.008
 entire-0.007
 spine-0.007
Token plates
Token ID13375
Feature activation+8.01e-03

Pos logit contributions
rums+0.009
 runs+0.008
 turns+0.008
 comes+0.007
 happens+0.007
Neg logit contributions
he-0.012
ifying-0.011
 contain-0.011
 spine-0.009
 entire-0.009
Token and
Token ID290
Feature activation+4.81e-03

Pos logit contributions
 runs+0.014
 turns+0.013
rums+0.013
 comes+0.013
es+0.012
Neg logit contributions
he-0.015
ifying-0.014
 contain-0.013
 spine-0.013
pron-0.012
Token cups
Token ID14180
Feature activation-7.03e-06

Pos logit contributions
rums+0.011
 runs+0.009
 happens+0.009
 turns+0.009
 comes+0.008
Neg logit contributions
he-0.008
ifying-0.008
 contain-0.007
 usual-0.007
 entire-0.007
Token in
Token ID287
Feature activation+4.91e-03

Feature not active on prev token; no predictions were affected.
Token their
Token ID511
Feature activation+1.29e-02

Pos logit contributions
 turns+0.006
rums+0.006
 runs+0.006
 comes+0.005
 tries+0.005
Neg logit contributions
he-0.012
 contain-0.011
ifying-0.011
 spine-0.011
ancer-0.010
Token right
Token ID826
Feature activation+2.44e-03

Pos logit contributions
rums+0.029
 runs+0.026
 turns+0.025
 happens+0.024
 comes+0.024
Neg logit contributions
he-0.024
ifying-0.020
 contain-0.019
 entire-0.017
 spine-0.017
Token spots
Token ID10222
Feature activation-1.80e-02

Pos logit contributions
rums+0.005
 runs+0.005
 turns+0.005
 comes+0.004
 tries+0.004
Neg logit contributions
he-0.005
ifying-0.004
 contain-0.004
 spine-0.004
ancer-0.003
Token.
Token ID13
Feature activation+5.79e-04

Pos logit contributions
he+0.038
ifying+0.036
 contain+0.035
 spine+0.033
 entire+0.032
Neg logit contributions
rums-0.026
 runs-0.026
 turns-0.024
 comes-0.024
es-0.024
Token Suddenly
Token ID24975
Feature activation-3.17e-03

Pos logit contributions
rums+0.001
 runs+0.001
 happens+0.001
 comes+0.001
 turns+0.001
Neg logit contributions
he-0.001
ifying-0.001
 contain-0.001
 spine-0.001
 entire-0.001
Token,
Token ID11
Feature activation+4.41e-03

Pos logit contributions
he+0.005
ifying+0.004
 contain+0.003
 usual+0.003
 entire+0.003
Neg logit contributions
rums-0.008
 runs-0.007
 listens-0.006
 shouts-0.006
haw-0.006
Token have
Token ID423
Feature activation+2.94e-03

Pos logit contributions
he+0.024
ifying+0.020
 contain+0.020
 entire+0.018
 spine+0.017
Neg logit contributions
rums-0.021
 runs-0.018
 turns-0.018
 happens-0.017
 comes-0.017
Token fun
Token ID1257
Feature activation+2.08e-02

Pos logit contributions
rums+0.006
 runs+0.005
 comes+0.005
es+0.005
 listens+0.005
Neg logit contributions
he-0.005
ifying-0.005
 contain-0.004
 usual-0.004
 entire-0.004
Token playing
Token ID2712
Feature activation+2.05e-02

Pos logit contributions
rums+0.046
 runs+0.042
 turns+0.040
 comes+0.039
 happens+0.039
Neg logit contributions
he-0.037
ifying-0.031
 contain-0.030
 entire-0.026
 spine-0.025
Token tag
Token ID7621
Feature activation+5.50e-03

Pos logit contributions
rums+0.042
 runs+0.039
 turns+0.037
 comes+0.035
 happens+0.035
Neg logit contributions
he-0.040
ifying-0.035
 contain-0.034
 spine-0.030
 entire-0.029
Token with
Token ID351
Feature activation+1.35e-02

Pos logit contributions
rums+0.011
 runs+0.010
 turns+0.010
 comes+0.009
 happens+0.009
Neg logit contributions
he-0.011
ifying-0.009
 contain-0.009
 spine-0.008
 entire-0.008
Token his
Token ID465
Feature activation+1.33e-02

Pos logit contributions
rums+0.023
 runs+0.020
 turns+0.018
 happens+0.018
 comes+0.018
Neg logit contributions
he-0.032
ifying-0.028
 contain-0.027
 entire-0.024
 usual-0.024
Token friend
Token ID1545
Feature activation-2.54e-03

Pos logit contributions
rums+0.022
 runs+0.020
 turns+0.019
 comes+0.017
 tries+0.017
Neg logit contributions
he-0.032
ifying-0.028
 contain-0.028
 entire-0.024
 spine-0.024
Token.
Token ID13
Feature activation+4.34e-04

Pos logit contributions
he+0.005
ifying+0.004
 contain+0.004
 entire+0.004
 spine+0.004
Neg logit contributions
rums-0.005
 runs-0.005
 turns-0.004
 comes-0.004
 happens-0.004
Token The
Token ID383
Feature activation-7.41e-03

Pos logit contributions
rums+0.001
 runs+0.001
 turns+0.001
 happens+0.001
 comes+0.001
Neg logit contributions
he-0.001
ifying-0.001
 contain-0.001
 entire-0.001
 spine-0.001
Token end
Token ID886
Feature activation+1.35e-03

Pos logit contributions
he+0.014
ifying+0.012
 contain+0.012
 spine+0.010
 entire+0.009
Neg logit contributions
rums-0.015
 runs-0.015
 turns-0.014
 comes-0.013
 tries-0.013
Token.
Token ID13
Feature activation+1.08e-03

Pos logit contributions
rums+0.004
 runs+0.003
 turns+0.003
 tries+0.003
 comes+0.003
Neg logit contributions
he-0.002
ifying-0.001
 contain-0.001
 entire-0.001
 spine-0.001
Token the
Token ID262
Feature activation+3.87e-03

Pos logit contributions
rums+0.037
 runs+0.032
 happens+0.030
 comes+0.029
 listens+0.029
Neg logit contributions
he-0.039
ifying-0.034
 contain-0.033
 usual-0.029
 entire-0.029
Token rock
Token ID3881
Feature activation-1.15e-03

Pos logit contributions
 runs+0.007
rums+0.006
 turns+0.006
 tries+0.006
 sounds+0.006
Neg logit contributions
he-0.008
ifying-0.008
 contain-0.007
ancer-0.006
pron-0.006
Token.
Token ID13
Feature activation-4.30e-04

Pos logit contributions
he+0.003
ifying+0.002
 contain+0.002
pron+0.002
ancer+0.002
Neg logit contributions
rums-0.002
 runs-0.002
es-0.002
 comes-0.002
 turns-0.002
Token
Token ID198
Feature activation+9.23e-04

Pos logit contributions
he+0.001
ifying+0.001
 contain+0.001
 usual+0.001
 spine+0.001
Neg logit contributions
rums-0.001
 runs-0.001
 happens-0.001
 listens-0.001
 comes-0.001
Token
Token ID198
Feature activation+4.52e-04

Pos logit contributions
rums+0.002
 runs+0.002
 turns+0.001
 comes+0.001
 happens+0.001
Neg logit contributions
he-0.002
ifying-0.002
 contain-0.002
 entire-0.002
 usual-0.001
TokenMom
Token ID29252
Feature activation+2.88e-02

Pos logit contributions
rums+0.001
 runs+0.001
 turns+0.001
 comes+0.001
 happens+0.001
Neg logit contributions
he-0.001
ifying-0.001
 contain-0.001
 entire-0.001
 spine-0.001
Token smiled
Token ID13541
Feature activation-3.81e-03

Pos logit contributions
rums+0.053
 runs+0.046
 turns+0.044
 happens+0.042
 comes+0.041
Neg logit contributions
he-0.065
ifying-0.055
 contain-0.054
 entire-0.048
 spine-0.047
Token and
Token ID290
Feature activation+1.32e-03

Pos logit contributions
he+0.009
ifying+0.008
 contain+0.008
 spine+0.007
pron+0.007
Neg logit contributions
rums-0.006
 runs-0.006
 turns-0.005
 comes-0.005
es-0.005
Token took
Token ID1718
Feature activation+9.86e-03

Pos logit contributions
rums+0.003
 runs+0.002
 happens+0.002
 turns+0.002
 comes+0.002
Neg logit contributions
he-0.003
 contain-0.002
ifying-0.002
pron-0.002
 usual-0.002
Token the
Token ID262
Feature activation+3.32e-03

Pos logit contributions
rums+0.022
 runs+0.018
 listens+0.017
drivers+0.017
 disl+0.017
Neg logit contributions
he-0.018
ifying-0.015
 contain-0.014
 entire-0.012
 usual-0.012
Token rock
Token ID3881
Feature activation-7.30e-04

Pos logit contributions
 runs+0.006
rums+0.006
 turns+0.006
 tries+0.005
 comes+0.005
Neg logit contributions
he-0.007
ifying-0.007
 contain-0.006
 spine-0.005
 entire-0.005
Token.
Token ID13
Feature activation-2.35e-04

Pos logit contributions
rums+0.008
 runs+0.007
es+0.007
 comes+0.006
 tries+0.006
Neg logit contributions
he-0.010
ifying-0.009
 contain-0.008
 entire-0.007
pron-0.007
Token The
Token ID383
Feature activation-3.56e-03

Pos logit contributions
he+0.000
ifying+0.000
 contain+0.000
 spine+0.000
 entire+0.000
Neg logit contributions
rums-0.001
 runs-0.001
 happens-0.000
 comes-0.000
ups-0.000
Token bunny
Token ID44915
Feature activation+1.09e-02

Pos logit contributions
he+0.008
ifying+0.007
 contain+0.007
 spine+0.006
 entire+0.005
Neg logit contributions
rums-0.006
 runs-0.006
 turns-0.006
 tries-0.005
 comes-0.005
Token tried
Token ID3088
Feature activation+1.84e-02

Pos logit contributions
rums+0.024
irens+0.020
ulk+0.019
 runs+0.019
 happens+0.018
Neg logit contributions
he-0.020
 contain-0.016
 entire-0.015
 usual-0.014
pron-0.014
Token to
Token ID284
Feature activation+4.27e-03

Pos logit contributions
rums+0.043
 runs+0.038
 turns+0.036
 happens+0.036
 comes+0.035
Neg logit contributions
he-0.031
 contain-0.025
ifying-0.024
 usual-0.022
 entire-0.021
Token eat
Token ID4483
Feature activation+1.55e-02

Pos logit contributions
 runs+0.007
rums+0.007
 tries+0.006
 turns+0.006
 comes+0.006
Neg logit contributions
ifying-0.010
he-0.009
ancer-0.008
uttering-0.007
 spine-0.007
Token the
Token ID262
Feature activation+4.13e-03

Pos logit contributions
rums+0.028
 runs+0.025
 turns+0.023
 comes+0.022
 happens+0.022
Neg logit contributions
he-0.035
ifying-0.030
 contain-0.029
 entire-0.026
 spine-0.026
Token sword
Token ID8429
Feature activation+2.81e-03

Pos logit contributions
rums+0.007
 runs+0.007
 turns+0.006
 tries+0.006
 comes+0.006
Neg logit contributions
he-0.010
ifying-0.009
 contain-0.008
 spine-0.007
 entire-0.007
Token,
Token ID11
Feature activation+1.67e-03

Pos logit contributions
rums+0.004
 runs+0.004
es+0.004
 turns+0.004
 comes+0.004
Neg logit contributions
he-0.007
ifying-0.006
 contain-0.006
 entire-0.005
pron-0.005
Token but
Token ID475
Feature activation-1.26e-02

Pos logit contributions
rums+0.004
 runs+0.004
 turns+0.004
 comes+0.004
 tries+0.004
Neg logit contributions
he-0.003
ifying-0.002
 contain-0.002
 entire-0.002
 spine-0.002
Token it
Token ID340
Feature activation-6.06e-03

Pos logit contributions
he+0.033
ifying+0.029
 contain+0.029
 entire+0.026
 spine+0.026
Neg logit contributions
rums-0.018
 runs-0.016
 turns-0.015
 comes-0.014
 happens-0.014
Token Tom
Token ID4186
Feature activation-1.72e-02

Pos logit contributions
rums+0.002
 runs+0.002
 turns+0.002
 comes+0.002
 happens+0.002
Neg logit contributions
he-0.002
ifying-0.002
 contain-0.002
 entire-0.002
 spine-0.002
Token.
Token ID13
Feature activation-2.59e-03

Pos logit contributions
he+0.031
ifying+0.025
 contain+0.024
 entire+0.021
 spine+0.020
Neg logit contributions
rums-0.039
 runs-0.035
 turns-0.034
 happens-0.033
 comes-0.032
Token Tom
Token ID4186
Feature activation-1.69e-02

Pos logit contributions
he+0.006
ifying+0.005
 contain+0.004
 spine+0.004
 entire+0.004
Neg logit contributions
rums-0.005
 runs-0.004
 happens-0.004
 turns-0.004
 tries-0.004
Token is
Token ID318
Feature activation+5.20e-03

Pos logit contributions
he+0.038
ifying+0.030
 still+0.030
 contain+0.029
 empty+0.028
Neg logit contributions
rums-0.028
ages-0.022
 medicines-0.021
drivers-0.019
 happens-0.019
Token hiding
Token ID11816
Feature activation+8.14e-03

Pos logit contributions
rums+0.010
 runs+0.009
 turns+0.008
 comes+0.008
 tries+0.008
Neg logit contributions
he-0.011
ifying-0.010
 contain-0.010
 spine-0.009
 entire-0.008
Token behind
Token ID2157
Feature activation-1.81e-02

Pos logit contributions
 runs+0.016
rums+0.016
es+0.015
 turns+0.015
 comes+0.014
Neg logit contributions
he-0.015
 contain-0.014
ifying-0.013
pron-0.012
 spine-0.012
Token the
Token ID262
Feature activation+3.50e-03

Pos logit contributions
he+0.043
ifying+0.038
 contain+0.037
 spine+0.033
 entire+0.033
Neg logit contributions
rums-0.030
 runs-0.028
 turns-0.026
 comes-0.024
 happens-0.024
Token sofa
Token ID34902
Feature activation+3.03e-02

Pos logit contributions
 runs+0.006
 turns+0.006
rums+0.006
 hugs+0.006
 comes+0.005
Neg logit contributions
he-0.007
ifying-0.006
 contain-0.006
ancer-0.006
 surely-0.006
Token.
Token ID13
Feature activation-1.56e-03

Pos logit contributions
rums+0.054
 runs+0.052
 comes+0.048
es+0.047
 turns+0.047
Neg logit contributions
he-0.062
ifying-0.058
 contain-0.056
 entire-0.050
pron-0.048
Token He
Token ID679
Feature activation-2.15e-03

Pos logit contributions
he+0.003
ifying+0.003
 contain+0.003
 spine+0.003
 entire+0.002
Neg logit contributions
rums-0.003
 runs-0.002
 turns-0.002
 happens-0.002
 tries-0.002
Token has
Token ID468
Feature activation+1.76e-02

Pos logit contributions
he+0.005
ifying+0.004
 still+0.004
 contain+0.004
 empty+0.004
Neg logit contributions
rums-0.004
ages-0.003
drivers-0.002
 medicines-0.002
es-0.002
Token placed
Token ID4624
Feature activation+2.09e-05

Pos logit contributions
he+0.029
ifying+0.025
 contain+0.024
 entire+0.021
 spine+0.021
Neg logit contributions
rums-0.029
 runs-0.026
 turns-0.025
 happens-0.024
 comes-0.024
Token it
Token ID340
Feature activation-3.06e-03

Pos logit contributions
rums+0.000
 runs+0.000
 turns+0.000
 comes+0.000
 happens+0.000
Neg logit contributions
he-0.000
ifying-0.000
 contain-0.000
 entire-0.000
 spine-0.000
Token under
Token ID739
Feature activation+8.47e-03

Pos logit contributions
he+0.005
ifying+0.005
 contain+0.004
 spine+0.004
 entire+0.004
Neg logit contributions
rums-0.007
 runs-0.006
 turns-0.006
 comes-0.006
es-0.006
Token her
Token ID607
Feature activation+9.88e-03

Pos logit contributions
rums+0.010
 runs+0.010
 turns+0.009
 comes+0.008
 happens+0.008
Neg logit contributions
he-0.023
 contain-0.021
ifying-0.020
 spine-0.019
ancer-0.018
Token pillow
Token ID28774
Feature activation+1.34e-02

Pos logit contributions
rums+0.019
 runs+0.018
 hugs+0.017
 turns+0.017
 tries+0.016
Neg logit contributions
he-0.020
ifying-0.016
 contain-0.016
ancer-0.015
uttering-0.014
Token.
Token ID13
Feature activation-8.22e-04

Pos logit contributions
rums+0.023
 runs+0.021
es+0.021
 turns+0.019
 comes+0.019
Neg logit contributions
he-0.028
ifying-0.026
 contain-0.025
 entire-0.023
ancer-0.022
Token
Token ID220
Feature activation+3.66e-03

Pos logit contributions
he+0.002
ifying+0.002
 contain+0.002
 entire+0.001
 spine+0.001
Neg logit contributions
rums-0.001
 runs-0.001
 turns-0.001
 happens-0.001
 comes-0.001
Token
Token ID198
Feature activation+2.93e-03

Pos logit contributions
rums+0.007
 runs+0.007
 turns+0.006
 happens+0.006
 comes+0.006
Neg logit contributions
he-0.008
ifying-0.006
 contain-0.006
 entire-0.005
 spine-0.005
Token
Token ID198
Feature activation+1.86e-03

Pos logit contributions
rums+0.005
 runs+0.005
 turns+0.005
 comes+0.004
 happens+0.004
Neg logit contributions
he-0.007
ifying-0.005
 contain-0.005
 entire-0.005
 spine-0.005
TokenShe
Token ID3347
Feature activation-9.13e-03

Pos logit contributions
rums+0.004
 runs+0.003
 turns+0.003
 comes+0.003
 happens+0.003
Neg logit contributions
he-0.004
ifying-0.003
 contain-0.003
 spine-0.003
 entire-0.003
Token waited
Token ID13488
Feature activation-1.79e-04

Pos logit contributions
he+0.021
ifying+0.018
 contain+0.018
 entire+0.016
 usual+0.015
Neg logit contributions
rums-0.017
 runs-0.014
 turns-0.013
 comes-0.013
 happens-0.013
Token and
Token ID290
Feature activation+2.37e-03

Pos logit contributions
rums+0.008
 runs+0.007
 turns+0.007
 comes+0.006
 happens+0.006
Neg logit contributions
he-0.007
ifying-0.006
 contain-0.006
 entire-0.005
 spine-0.005
Token hon
Token ID3032
Feature activation+1.16e-02

Pos logit contributions
rums+0.005
 runs+0.004
 turns+0.004
 happens+0.004
 comes+0.004
Neg logit contributions
he-0.005
ifying-0.004
 contain-0.004
 usual-0.003
 spine-0.003
Tokenk
Token ID74
Feature activation+1.04e-02

Pos logit contributions
rums+0.026
 runs+0.024
 turns+0.023
 comes+0.022
 happens+0.022
Neg logit contributions
he-0.021
ifying-0.018
 contain-0.017
 entire-0.015
 spine-0.015
Token and
Token ID290
Feature activation+3.05e-03

Pos logit contributions
rums+0.023
 runs+0.021
 turns+0.020
 happens+0.019
 comes+0.019
Neg logit contributions
he-0.019
ifying-0.016
 contain-0.015
 entire-0.013
 spine-0.013
Token roll
Token ID4836
Feature activation-3.89e-03

Pos logit contributions
rums+0.006
 runs+0.005
 turns+0.005
 happens+0.004
 comes+0.004
Neg logit contributions
he-0.007
ifying-0.006
 contain-0.006
 spine-0.005
 entire-0.005
Token around
Token ID1088
Feature activation-5.49e-04

Pos logit contributions
he+0.007
ifying+0.006
 contain+0.005
 entire+0.005
 usual+0.005
Neg logit contributions
rums-0.009
 runs-0.008
 listens-0.007
 tries-0.007
 happens-0.007
Token.
Token ID13
Feature activation+4.32e-03

Pos logit contributions
he+0.001
ifying+0.001
 contain+0.001
 entire+0.001
 spine+0.001
Neg logit contributions
rums-0.001
 runs-0.001
 turns-0.001
 comes-0.001
 happens-0.001
Token Sara
Token ID24799
Feature activation+9.35e-03

Pos logit contributions
rums+0.008
 runs+0.007
 turns+0.007
 comes+0.006
 happens+0.006
Neg logit contributions
he-0.009
ifying-0.008
 contain-0.008
 entire-0.007
 spine-0.007
Token and
Token ID290
Feature activation+2.89e-03

Pos logit contributions
rums+0.022
 runs+0.020
 turns+0.019
 happens+0.019
 comes+0.018
Neg logit contributions
he-0.016
ifying-0.012
 contain-0.012
 entire-0.011
 spine-0.010
Token Ben
Token ID3932
Feature activation-3.80e-03

Pos logit contributions
rums+0.006
 runs+0.005
 turns+0.005
 happens+0.004
 comes+0.004
Neg logit contributions
he-0.006
ifying-0.005
 contain-0.005
 usual-0.005
 spine-0.005
Token laughed
Token ID13818
Feature activation+7.99e-03

Pos logit contributions
he+0.009
ifying+0.008
 contain+0.008
 entire+0.007
 spine+0.007
Neg logit contributions
rums-0.006
 runs-0.006
 turns-0.005
 comes-0.005
 happens-0.005
Token made
Token ID925
Feature activation+4.90e-03

Pos logit contributions
rums+0.002
 runs+0.002
 turns+0.002
 comes+0.002
 happens+0.002
Neg logit contributions
he-0.002
ifying-0.002
 contain-0.002
 entire-0.002
 usual-0.002
Token a
Token ID257
Feature activation+8.47e-04

Pos logit contributions
rums+0.010
 runs+0.009
 happens+0.008
 turns+0.008
 comes+0.008
Neg logit contributions
he-0.010
 contain-0.008
ifying-0.008
 entire-0.007
 usual-0.007
Token loud
Token ID7812
Feature activation+6.10e-03

Pos logit contributions
rums+0.002
 runs+0.002
 turns+0.002
 comes+0.001
 happens+0.001
Neg logit contributions
he-0.002
ifying-0.001
 contain-0.001
 entire-0.001
 spine-0.001
Token noise
Token ID7838
Feature activation+2.00e-02

Pos logit contributions
rums+0.012
 runs+0.010
 turns+0.009
 happens+0.009
ups+0.009
Neg logit contributions
he-0.013
 contain-0.011
ifying-0.010
 entire-0.010
 spine-0.010
Token.
Token ID13
Feature activation-3.43e-03

Pos logit contributions
rums+0.039
 runs+0.037
 turns+0.034
 comes+0.034
 happens+0.033
Neg logit contributions
he-0.041
ifying-0.035
 contain-0.034
 entire-0.030
 spine-0.030
Token
Token ID220
Feature activation-9.99e-04

Pos logit contributions
he+0.007
ifying+0.006
 contain+0.006
 entire+0.005
 spine+0.005
Neg logit contributions
rums-0.007
 runs-0.006
 comes-0.006
 happens-0.005
es-0.005
Token
Token ID198
Feature activation-2.66e-03

Pos logit contributions
he+0.002
ifying+0.002
 contain+0.002
 entire+0.001
 usual+0.001
Neg logit contributions
rums-0.002
 runs-0.002
 turns-0.002
 tries-0.002
 happens-0.002
Token
Token ID198
Feature activation-3.70e-03

Pos logit contributions
he+0.006
ifying+0.005
 contain+0.005
 entire+0.004
 usual+0.004
Neg logit contributions
rums-0.005
 runs-0.004
 turns-0.004
 comes-0.004
 happens-0.004
TokenAlice
Token ID44484
Feature activation-6.23e-03

Pos logit contributions
he+0.008
ifying+0.006
 contain+0.006
 entire+0.006
pron+0.005
Neg logit contributions
rums-0.008
 runs-0.007
 turns-0.006
 comes-0.006
 happens-0.006
Token said
Token ID531
Feature activation-1.65e-02

Pos logit contributions
he+0.013
ifying+0.011
 contain+0.011
 entire+0.010
 spine+0.009
Neg logit contributions
rums-0.012
 runs-0.010
 happens-0.009
 turns-0.009
 comes-0.009
Token,
Token ID11
Feature activation-2.27e-03

Pos logit contributions
he+0.024
 contain+0.021
ifying+0.021
 usual+0.017
 entire+0.016
Neg logit contributions
rums-0.040
 runs-0.035
 turns-0.034
 tries-0.032
orns-0.032
Token to
Token ID284
Feature activation+3.13e-03

Pos logit contributions
he+0.004
 usual+0.004
 contain+0.003
ifying+0.003
 entire+0.003
Neg logit contributions
rums-0.010
 runs-0.009
 happens-0.009
 turns-0.009
es-0.009
Token explore
Token ID7301
Feature activation-1.37e-02

Pos logit contributions
rums+0.007
 runs+0.006
 turns+0.006
 happens+0.006
es+0.006
Neg logit contributions
he-0.005
 contain-0.005
ifying-0.004
 entire-0.004
 usual-0.004
Token the
Token ID262
Feature activation+2.36e-03

Pos logit contributions
he+0.025
ifying+0.020
 contain+0.019
 entire+0.017
 usual+0.016
Neg logit contributions
rums-0.031
 runs-0.028
 turns-0.027
 happens-0.026
 comes-0.026
Token world
Token ID995
Feature activation+9.41e-03

Pos logit contributions
rums+0.005
 runs+0.005
 turns+0.005
 comes+0.005
 happens+0.004
Neg logit contributions
he-0.004
ifying-0.004
 contain-0.004
 spine-0.003
pron-0.003
Token around
Token ID1088
Feature activation-3.06e-03

Pos logit contributions
rums+0.018
 runs+0.017
 turns+0.016
 comes+0.016
 happens+0.015
Neg logit contributions
he-0.019
ifying-0.017
 contain-0.016
 spine-0.015
pron-0.014
Token him
Token ID683
Feature activation-1.50e-04

Pos logit contributions
he+0.008
ifying+0.007
 contain+0.007
 entire+0.007
 spine+0.006
Neg logit contributions
rums-0.004
 runs-0.004
 turns-0.003
 comes-0.003
 happens-0.003
Token.
Token ID13
Feature activation-2.71e-03

Pos logit contributions
he+0.000
ifying+0.000
 contain+0.000
 entire+0.000
 spine+0.000
Neg logit contributions
rums-0.000
 runs-0.000
 turns-0.000
 comes-0.000
 happens-0.000
Token One
Token ID1881
Feature activation-2.01e-02

Pos logit contributions
he+0.006
ifying+0.005
 contain+0.004
 spine+0.004
 entire+0.004
Neg logit contributions
rums-0.006
 runs-0.005
 turns-0.005
 happens-0.005
 comes-0.004
Token day
Token ID1110
Feature activation+4.68e-03

Pos logit contributions
ifying+0.040
he+0.039
 contain+0.036
rily+0.030
 spine+0.030
Neg logit contributions
 runs-0.037
rums-0.034
 turns-0.033
 comes-0.033
 sounds-0.032
Token,
Token ID11
Feature activation-4.61e-04

Pos logit contributions
rums+0.013
 runs+0.010
ups+0.010
 shouts+0.010
 listens+0.010
Neg logit contributions
he-0.007
ifying-0.005
 usual-0.004
 contain-0.004
 yet-0.003
Token Spot
Token ID15899
Feature activation-1.90e-02

Pos logit contributions
he+0.001
ifying+0.001
 contain+0.001
 entire+0.001
 usual+0.001
Neg logit contributions
rums-0.001
 runs-0.001
 turns-0.001
 comes-0.001
 happens-0.001
Token hat
Token ID6877
Feature activation-5.26e-03

Pos logit contributions
 runs+0.013
 turns+0.013
rums+0.013
 hugs+0.013
 tries+0.013
Neg logit contributions
he-0.017
 contain-0.014
ifying-0.014
ancer-0.012
 spine-0.012
Token and
Token ID290
Feature activation+3.29e-03

Pos logit contributions
ifying+0.012
 contain+0.011
he+0.011
 entire+0.010
ancer+0.010
Neg logit contributions
rums-0.007
es-0.007
 runs-0.007
 comes-0.006
 turns-0.006
Token grabbed
Token ID13176
Feature activation-1.93e-03

Pos logit contributions
rums+0.006
 runs+0.005
 turns+0.005
 happens+0.005
 comes+0.005
Neg logit contributions
he-0.007
ifying-0.006
 contain-0.006
 usual-0.005
pron-0.005
Token her
Token ID607
Feature activation+7.77e-03

Pos logit contributions
he+0.005
ifying+0.004
 contain+0.004
 entire+0.004
 spine+0.004
Neg logit contributions
rums-0.003
 runs-0.003
 turns-0.003
 comes-0.002
 happens-0.002
Token magn
Token ID7842
Feature activation-2.84e-03

Pos logit contributions
 turns+0.012
 runs+0.012
rums+0.012
 hugs+0.012
 tries+0.012
Neg logit contributions
he-0.018
 contain-0.016
ifying-0.014
ancer-0.013
 spine-0.013
Tokenifying
Token ID4035
Feature activation-1.08e-01

Pos logit contributions
 contain+0.003
he+0.003
 usual+0.002
 particular+0.002
 entire+0.002
Neg logit contributions
es-0.008
rums-0.007
 runs-0.007
 turns-0.007
ez-0.007
Token glass
Token ID5405
Feature activation-3.51e-03

Pos logit contributions
he+0.200
 contain+0.173
ifying+0.164
 spine+0.145
 entire+0.142
Neg logit contributions
rums-0.220
 runs-0.213
 turns-0.210
 comes-0.201
 happens-0.199
Token.
Token ID13
Feature activation+4.26e-03

Pos logit contributions
he+0.007
 contain+0.006
ifying+0.006
 entire+0.005
 spine+0.005
Neg logit contributions
rums-0.007
 runs-0.006
 turns-0.006
 comes-0.006
es-0.006
Token She
Token ID1375
Feature activation+7.33e-03

Pos logit contributions
rums+0.008
 runs+0.007
 turns+0.007
 tries+0.007
 happens+0.006
Neg logit contributions
he-0.009
ifying-0.008
 contain-0.007
 spine-0.006
 usual-0.006
Token was
Token ID373
Feature activation+1.16e-03

Pos logit contributions
rums+0.014
 runs+0.012
 turns+0.011
 comes+0.011
 happens+0.011
Neg logit contributions
he-0.016
ifying-0.014
 contain-0.014
 entire-0.012
 usual-0.012
Token ready
Token ID3492
Feature activation+9.43e-03

Pos logit contributions
rums+0.002
 runs+0.002
 turns+0.002
 comes+0.002
 tries+0.002
Neg logit contributions
he-0.002
ifying-0.002
 contain-0.002
 spine-0.002
 entire-0.002
Token it
Token ID340
Feature activation-8.18e-03

Pos logit contributions
rums+0.001
 runs+0.001
 turns+0.001
 happens+0.001
 comes+0.001
Neg logit contributions
he-0.002
ifying-0.002
 contain-0.001
 entire-0.001
 usual-0.001
Token back
Token ID736
Feature activation-1.49e-02

Pos logit contributions
he+0.016
ifying+0.014
 contain+0.014
 spine+0.012
 horizon+0.012
Neg logit contributions
rums-0.015
 runs-0.015
 turns-0.014
 comes-0.014
es-0.013
Token on
Token ID319
Feature activation-2.37e-03

Pos logit contributions
he+0.034
 contain+0.030
ifying+0.029
 spine+0.026
 entire+0.025
Neg logit contributions
rums-0.025
 runs-0.023
 turns-0.022
 comes-0.021
es-0.021
Token the
Token ID262
Feature activation+2.73e-03

Pos logit contributions
he+0.006
 contain+0.006
ifying+0.005
 spine+0.005
uttering+0.005
Neg logit contributions
rums-0.003
 turns-0.003
 runs-0.003
 breaks-0.002
 comes-0.002
Token books
Token ID3835
Feature activation-1.18e-02

Pos logit contributions
 runs+0.004
rums+0.004
 turns+0.004
 comes+0.003
 sounds+0.003
Neg logit contributions
he-0.006
ifying-0.006
 contain-0.006
uttering-0.005
ancer-0.005
Tokenhe
Token ID258
Feature activation-8.76e-02

Pos logit contributions
ifying+0.012
 contain+0.011
 spine+0.010
uttering+0.009
 entire+0.009
Neg logit contributions
es-0.028
 turns-0.028
 comes-0.027
 runs-0.027
 means-0.027
Tokenlf
Token ID1652
Feature activation-4.53e-03

Pos logit contributions
he+0.171
ifying+0.157
 contain+0.147
 entire+0.126
 spine+0.125
Neg logit contributions
rums-0.179
 runs-0.164
 turns-0.154
 comes-0.150
 happens-0.147
Token so
Token ID523
Feature activation+1.19e-03

Pos logit contributions
ifying+0.010
he+0.010
 contain+0.010
 spine+0.009
 entire+0.008
Neg logit contributions
es-0.006
rums-0.006
 runs-0.006
 turns-0.006
 comes-0.005
Token we
Token ID356
Feature activation-2.13e-03

Pos logit contributions
rums+0.002
 runs+0.002
 turns+0.002
 happens+0.002
 comes+0.002
Neg logit contributions
he-0.003
ifying-0.003
 contain-0.003
 entire-0.002
 spine-0.002
Token can
Token ID460
Feature activation-3.46e-03

Pos logit contributions
he+0.005
 contain+0.005
ifying+0.005
 spine+0.005
 entire+0.004
Neg logit contributions
 runs-0.003
 turns-0.003
rums-0.003
 comes-0.002
 happens-0.002
Token find
Token ID1064
Feature activation-5.43e-03

Pos logit contributions
ifying+0.007
he+0.007
 spine+0.006
 contain+0.006
kin+0.006
Neg logit contributions
 runs-0.006
rums-0.005
 turns-0.005
 comes-0.005
 listens-0.005
Token.
Token ID13
Feature activation-1.89e-03

Pos logit contributions
 runs+0.006
rums+0.006
 turns+0.006
 comes+0.005
 sounds+0.005
Neg logit contributions
he-0.008
 contain-0.007
ifying-0.007
 spine-0.006
pron-0.006
Token She
Token ID1375
Feature activation-1.43e-04

Pos logit contributions
he+0.004
ifying+0.003
 contain+0.003
 entire+0.003
 spine+0.003
Neg logit contributions
rums-0.004
 runs-0.003
 turns-0.003
 happens-0.003
 comes-0.003
Token was
Token ID373
Feature activation+4.74e-04

Pos logit contributions
he+0.000
 contain+0.000
ifying+0.000
 entire+0.000
 usual+0.000
Neg logit contributions
rums-0.000
 runs-0.000
 turns-0.000
 comes-0.000
 happens-0.000
Token filled
Token ID5901
Feature activation-1.43e-02

Pos logit contributions
rums+0.001
 runs+0.001
 turns+0.001
 comes+0.001
 tries+0.001
Neg logit contributions
he-0.001
ifying-0.001
 contain-0.001
 spine-0.001
 entire-0.001
Token with
Token ID351
Feature activation+1.96e-02

Pos logit contributions
he+0.029
ifying+0.027
 contain+0.025
 spine+0.023
 entire+0.022
Neg logit contributions
rums-0.027
 runs-0.025
 turns-0.024
 comes-0.022
es-0.022
Token amaz
Token ID40642
Feature activation-5.43e-02

Pos logit contributions
rums+0.039
 runs+0.037
 turns+0.035
 comes+0.033
 happens+0.032
Neg logit contributions
he-0.040
ifying-0.034
 contain-0.033
 spine-0.029
 entire-0.029
Tokenement
Token ID972
Feature activation-4.61e-02

Pos logit contributions
he+0.094
ifying+0.086
 contain+0.084
 spine+0.078
chard+0.068
Neg logit contributions
rums-0.125
 turns-0.107
 runs-0.106
 happens-0.105
 comes-0.103
Token.
Token ID13
Feature activation-1.67e-03

Pos logit contributions
he+0.100
ifying+0.084
 contain+0.082
 entire+0.072
pron+0.071
Neg logit contributions
rums-0.087
 runs-0.081
 turns-0.075
 comes-0.073
es-0.072
Token
Token ID198
Feature activation-8.20e-04

Pos logit contributions
he+0.003
ifying+0.002
 contain+0.002
 entire+0.002
 spine+0.002
Neg logit contributions
rums-0.004
 runs-0.003
 tries-0.003
 happens-0.003
 turns-0.003
Token
Token ID198
Feature activation-1.78e-03

Pos logit contributions
he+0.002
ifying+0.002
 contain+0.002
 entire+0.001
 usual+0.001
Neg logit contributions
rums-0.002
 runs-0.001
 turns-0.001
 comes-0.001
 happens-0.001
TokenAt
Token ID2953
Feature activation-2.48e-03

Pos logit contributions
he+0.004
ifying+0.003
 contain+0.003
 entire+0.003
 usual+0.002
Neg logit contributions
rums-0.004
 runs-0.003
 turns-0.003
 comes-0.003
 happens-0.003
Token chair
Token ID5118
Feature activation+1.01e-02

Pos logit contributions
rums+0.001
 runs+0.001
 turns+0.001
 comes+0.001
 happens+0.001
Neg logit contributions
he-0.001
ifying-0.001
 contain-0.001
 entire-0.001
 spine-0.001
Token and
Token ID290
Feature activation+2.27e-03

Pos logit contributions
rums+0.015
 runs+0.015
 turns+0.014
 comes+0.014
es+0.013
Neg logit contributions
he-0.023
ifying-0.022
 contain-0.020
 entire-0.019
 spine-0.018
Token allowed
Token ID3142
Feature activation+2.23e-02

Pos logit contributions
rums+0.004
 runs+0.004
 turns+0.004
 comes+0.004
 happens+0.004
Neg logit contributions
he-0.005
ifying-0.004
 contain-0.004
 entire-0.004
 spine-0.003
Token the
Token ID262
Feature activation+2.33e-03

Pos logit contributions
rums+0.035
 runs+0.031
 turns+0.029
 comes+0.027
 happens+0.027
Neg logit contributions
he-0.055
ifying-0.049
 contain-0.048
 entire-0.043
 spine-0.043
Token ha
Token ID387
Feature activation-3.38e-04

Pos logit contributions
 runs+0.004
rums+0.004
 turns+0.004
 comes+0.004
 tries+0.004
Neg logit contributions
he-0.005
ifying-0.004
 contain-0.004
 spine-0.003
uttering-0.003
Tokenird
Token ID1447
Feature activation-5.12e-02

Pos logit contributions
he+0.001
ifying+0.001
 contain+0.000
 entire+0.000
 spine+0.000
Neg logit contributions
rums-0.001
 runs-0.001
 turns-0.001
 comes-0.001
es-0.001
Tokenress
Token ID601
Feature activation+1.35e-02

Pos logit contributions
he+0.131
 contain+0.119
ifying+0.119
 entire+0.112
 spine+0.105
Neg logit contributions
rums-0.068
 runs-0.058
es-0.056
 comes-0.052
 turns-0.052
Tokener
Token ID263
Feature activation+1.84e-02

Pos logit contributions
rums+0.021
 runs+0.019
 turns+0.017
 comes+0.016
es+0.016
Neg logit contributions
he-0.033
ifying-0.030
 contain-0.029
 entire-0.026
 spine-0.026
Token to
Token ID284
Feature activation+2.54e-03

Pos logit contributions
rums+0.036
 runs+0.033
 turns+0.031
 comes+0.030
 happens+0.029
Neg logit contributions
he-0.038
ifying-0.033
 contain-0.032
 entire-0.028
 spine-0.028
Token give
Token ID1577
Feature activation+8.62e-03

Pos logit contributions
 runs+0.004
rums+0.004
 turns+0.004
 comes+0.004
 tries+0.004
Neg logit contributions
he-0.005
ifying-0.005
 spine-0.004
 entire-0.004
uttering-0.004
Token her
Token ID607
Feature activation+9.25e-03

Pos logit contributions
rums+0.015
 runs+0.013
 turns+0.012
 happens+0.012
 comes+0.012
Neg logit contributions
he-0.020
ifying-0.017
 contain-0.016
 entire-0.015
 usual-0.015
Token.
Token ID13
Feature activation+4.95e-03

Pos logit contributions
rums+0.007
 runs+0.007
 comes+0.006
 turns+0.006
es+0.006
Neg logit contributions
he-0.008
 contain-0.007
ifying-0.007
pron-0.006
 spine-0.006
Token
Token ID198
Feature activation+9.08e-03

Pos logit contributions
rums+0.008
 runs+0.007
 turns+0.007
 happens+0.007
 comes+0.007
Neg logit contributions
he-0.012
ifying-0.010
 contain-0.010
 entire-0.009
 spine-0.009
Token
Token ID198
Feature activation+7.61e-03

Pos logit contributions
rums+0.018
 runs+0.017
 turns+0.016
 comes+0.016
 happens+0.016
Neg logit contributions
he-0.019
ifying-0.015
 contain-0.014
 entire-0.013
 usual-0.012
TokenThe
Token ID464
Feature activation-1.71e-02

Pos logit contributions
rums+0.014
 runs+0.012
 comes+0.011
 turns+0.011
 happens+0.011
Neg logit contributions
he-0.018
ifying-0.015
 contain-0.014
 entire-0.013
pron-0.013
Token books
Token ID3835
Feature activation-5.23e-03

Pos logit contributions
he+0.035
ifying+0.030
 contain+0.029
 spine+0.024
uttering+0.024
Neg logit contributions
rums-0.033
 runs-0.032
 turns-0.030
 tries-0.028
 comes-0.028
Tokenhe
Token ID258
Feature activation-8.72e-02

Pos logit contributions
ifying+0.008
he+0.008
 contain+0.008
 spine+0.008
 entire+0.007
Neg logit contributions
rums-0.010
 runs-0.010
 turns-0.010
 comes-0.010
es-0.010
Tokenlf
Token ID1652
Feature activation+5.91e-04

Pos logit contributions
he+0.166
ifying+0.145
 contain+0.139
 entire+0.120
 spine+0.119
Neg logit contributions
rums-0.186
 runs-0.171
 turns-0.163
 comes-0.155
 happens-0.153
Token came
Token ID1625
Feature activation+1.32e-02

Pos logit contributions
rums+0.001
 runs+0.001
 turns+0.001
 comes+0.001
 happens+0.001
Neg logit contributions
he-0.001
ifying-0.001
 contain-0.001
 entire-0.001
 spine-0.001
Token crashing
Token ID21899
Feature activation-1.45e-02

Pos logit contributions
rums+0.024
 runs+0.023
 turns+0.022
 comes+0.021
es+0.020
Neg logit contributions
he-0.028
 contain-0.024
ifying-0.024
 spine-0.021
 entire-0.020
Token down
Token ID866
Feature activation-1.40e-02

Pos logit contributions
he+0.030
 contain+0.026
ifying+0.026
 entire+0.022
 spine+0.022
Neg logit contributions
rums-0.028
 runs-0.026
 turns-0.025
 comes-0.024
 happens-0.023
Token on
Token ID319
Feature activation-3.41e-03

Pos logit contributions
he+0.031
ifying+0.029
 contain+0.029
 spine+0.026
pron+0.025
Neg logit contributions
rums-0.022
 turns-0.021
 runs-0.021
 comes-0.020
 happens-0.018
Token flashlight
Token ID38371
Feature activation-8.12e-03

Pos logit contributions
he+0.002
 contain+0.001
ifying+0.001
 entire+0.001
 spine+0.001
Neg logit contributions
rums-0.002
 runs-0.002
 turns-0.002
 comes-0.002
 happens-0.002
Token,
Token ID11
Feature activation+1.21e-03

Pos logit contributions
he+0.017
ifying+0.015
 contain+0.014
 entire+0.013
 spine+0.012
Neg logit contributions
 runs-0.014
rums-0.014
 comes-0.014
 turns-0.013
 happens-0.013
Token and
Token ID290
Feature activation+3.50e-03

Pos logit contributions
rums+0.002
 runs+0.002
 turns+0.002
 comes+0.002
 happens+0.002
Neg logit contributions
he-0.002
ifying-0.002
 contain-0.002
 entire-0.002
 spine-0.002
Token a
Token ID257
Feature activation+1.14e-03

Pos logit contributions
rums+0.007
 runs+0.006
 turns+0.006
 comes+0.005
 happens+0.005
Neg logit contributions
he-0.007
ifying-0.006
 contain-0.006
 entire-0.006
 usual-0.005
Token magn
Token ID7842
Feature activation+2.67e-04

Pos logit contributions
rums+0.002
 runs+0.002
 turns+0.002
 comes+0.002
 happens+0.002
Neg logit contributions
he-0.002
 contain-0.002
ifying-0.002
 entire-0.002
 spine-0.001
Tokenifying
Token ID4035
Feature activation-1.24e-01

Pos logit contributions
es+0.001
 runs+0.001
 comes+0.001
rums+0.001
 happens+0.001
Neg logit contributions
 contain-0.000
he-0.000
pron-0.000
 rightful-0.000
 usual-0.000
Token glass
Token ID5405
Feature activation-5.54e-03

Pos logit contributions
he+0.221
 contain+0.202
ifying+0.178
ancer+0.174
 spine+0.172
Neg logit contributions
 turns-0.238
 runs-0.232
rums-0.231
 comes-0.220
 breaks-0.217
Token.
Token ID13
Feature activation+6.65e-03

Pos logit contributions
he+0.011
ifying+0.009
 contain+0.009
 spine+0.008
 entire+0.008
Neg logit contributions
rums-0.011
 runs-0.010
 turns-0.010
 comes-0.010
 happens-0.009
Token
Token ID198
Feature activation+1.02e-02

Pos logit contributions
rums+0.011
 runs+0.010
 turns+0.009
 happens+0.009
 comes+0.009
Neg logit contributions
he-0.016
ifying-0.014
 contain-0.013
 entire-0.012
 spine-0.012
Token
Token ID198
Feature activation+8.72e-03

Pos logit contributions
rums+0.020
 runs+0.019
 turns+0.017
 happens+0.017
 comes+0.017
Neg logit contributions
he-0.022
ifying-0.017
 contain-0.017
 entire-0.015
 usual-0.014
TokenAs
Token ID1722
Feature activation+1.91e-03

Pos logit contributions
rums+0.016
 runs+0.014
 turns+0.014
 happens+0.014
 comes+0.013
Neg logit contributions
he-0.020
ifying-0.017
 contain-0.016
 entire-0.015
pron-0.014
Token She
Token ID1375
Feature activation+2.88e-03

Pos logit contributions
he+0.002
ifying+0.002
 contain+0.002
 spine+0.001
 entire+0.001
Neg logit contributions
rums-0.002
 runs-0.002
 happens-0.002
 turns-0.002
 tries-0.002
Token takes
Token ID2753
Feature activation+1.79e-02

Pos logit contributions
rums+0.004
ages+0.003
haw+0.003
bank+0.003
irens+0.003
Neg logit contributions
he-0.007
 entire-0.006
ifying-0.005
 contain-0.005
bell-0.005
Token out
Token ID503
Feature activation-2.68e-04

Pos logit contributions
rums+0.039
 runs+0.034
 listens+0.032
 happens+0.031
 comes+0.031
Neg logit contributions
he-0.034
ifying-0.028
 contain-0.027
 entire-0.025
 usual-0.024
Token a
Token ID257
Feature activation+1.18e-03

Pos logit contributions
he+0.001
ifying+0.001
 contain+0.001
 entire+0.000
 spine+0.000
Neg logit contributions
rums-0.000
 runs-0.000
 turns-0.000
 comes-0.000
 happens-0.000
Token magn
Token ID7842
Feature activation-1.45e-03

Pos logit contributions
rums+0.002
 runs+0.002
 turns+0.002
 comes+0.002
 tries+0.002
Neg logit contributions
he-0.002
ifying-0.002
 contain-0.002
 spine-0.002
 entire-0.001
Tokenifying
Token ID4035
Feature activation-1.21e-01

Pos logit contributions
he+0.001
 contain+0.001
 usual+0.001
pron+0.001
chard+0.001
Neg logit contributions
es-0.004
rums-0.004
 runs-0.004
 turns-0.004
 comes-0.004
Token glass
Token ID5405
Feature activation-5.44e-03

Pos logit contributions
he+0.220
 contain+0.194
ifying+0.181
 spine+0.166
ancer+0.164
Neg logit contributions
rums-0.241
 turns-0.236
 runs-0.233
 comes-0.219
 happens-0.218
Token.
Token ID13
Feature activation-8.11e-04

Pos logit contributions
he+0.010
ifying+0.009
 contain+0.009
 entire+0.007
 spine+0.007
Neg logit contributions
rums-0.011
 runs-0.011
 turns-0.010
 comes-0.010
 happens-0.010
Token She
Token ID1375
Feature activation+2.41e-03

Pos logit contributions
he+0.002
ifying+0.002
 contain+0.002
 spine+0.001
 entire+0.001
Neg logit contributions
rums-0.001
 runs-0.001
 turns-0.001
 happens-0.001
 comes-0.001
Token looks
Token ID3073
Feature activation+5.22e-03

Pos logit contributions
rums+0.004
ages+0.003
haw+0.003
irens+0.002
ulk+0.002
Neg logit contributions
he-0.006
ifying-0.005
 entire-0.005
 contain-0.005
bell-0.004
Token at
Token ID379
Feature activation-1.46e-03

Pos logit contributions
rums+0.011
 runs+0.011
 turns+0.010
 comes+0.010
 tries+0.010
Neg logit contributions
he-0.010
ifying-0.009
 contain-0.008
 spine-0.007
 entire-0.007
Token the
Token ID262
Feature activation+3.75e-03

Pos logit contributions
he+0.002
ifying+0.002
 contain+0.002
 entire+0.001
 usual+0.001
Neg logit contributions
rums-0.001
 runs-0.001
 turns-0.001
 happens-0.001
 comes-0.001
Token necklace
Token ID38987
Feature activation-1.49e-02

Pos logit contributions
 runs+0.007
 turns+0.007
rums+0.007
 comes+0.006
 tries+0.006
Neg logit contributions
he-0.008
ifying-0.007
 contain-0.007
ancer-0.006
 entire-0.005
Token and
Token ID290
Feature activation+3.28e-03

Pos logit contributions
ifying+0.035
he+0.034
 contain+0.032
 entire+0.030
ancer+0.029
Neg logit contributions
es-0.020
 runs-0.020
rums-0.018
 comes-0.017
 turns-0.016
Token the
Token ID262
Feature activation+4.44e-03

Pos logit contributions
rums+0.007
 runs+0.005
 turns+0.005
 happens+0.005
 comes+0.005
Neg logit contributions
he-0.006
 contain-0.006
ifying-0.005
 usual-0.005
 entire-0.004
Token magn
Token ID7842
Feature activation+4.85e-04

Pos logit contributions
rums+0.009
 runs+0.008
 turns+0.008
 comes+0.008
 happens+0.007
Neg logit contributions
he-0.009
ifying-0.008
 contain-0.007
 entire-0.006
 spine-0.006
Tokenifying
Token ID4035
Feature activation-1.16e-01

Pos logit contributions
es+0.001
rums+0.001
 runs+0.001
 comes+0.001
 turns+0.001
Neg logit contributions
 contain-0.001
he-0.001
 particular-0.000
pron-0.000
 usual-0.000
Token glass
Token ID5405
Feature activation-3.38e-03

Pos logit contributions
he+0.212
 contain+0.188
ifying+0.173
 spine+0.159
ancer+0.156
Neg logit contributions
rums-0.231
 turns-0.227
 runs-0.227
 comes-0.212
 happens-0.210
Token and
Token ID290
Feature activation+3.86e-03

Pos logit contributions
he+0.007
ifying+0.006
 contain+0.005
 entire+0.005
 spine+0.005
Neg logit contributions
rums-0.007
 runs-0.007
 turns-0.006
 comes-0.006
 happens-0.006
Token started
Token ID2067
Feature activation+5.74e-03

Pos logit contributions
rums+0.008
 runs+0.006
 turns+0.006
 happens+0.006
 comes+0.006
Neg logit contributions
he-0.008
 contain-0.007
ifying-0.007
 usual-0.006
 entire-0.006
Token to
Token ID284
Feature activation+6.31e-03

Pos logit contributions
rums+0.014
 runs+0.012
 happens+0.012
 turns+0.012
 listens+0.011
Neg logit contributions
he-0.009
ifying-0.007
 contain-0.007
 usual-0.007
 entire-0.006
Token examine
Token ID10716
Feature activation-1.40e-03

Pos logit contributions
 runs+0.012
rums+0.012
 turns+0.012
 comes+0.011
 tries+0.011
Neg logit contributions
he-0.012
ifying-0.011
 contain-0.010
 entire-0.009
 spine-0.009
Token,
Token ID11
Feature activation+1.77e-03

Pos logit contributions
rums+0.001
 runs+0.001
es+0.001
 turns+0.001
 comes+0.001
Neg logit contributions
he-0.001
ifying-0.001
 contain-0.001
 entire-0.001
ancer-0.001
Token next
Token ID1306
Feature activation-1.78e-02

Pos logit contributions
rums+0.003
 runs+0.003
 turns+0.003
 comes+0.003
 tries+0.003
Neg logit contributions
he-0.004
ifying-0.003
 contain-0.003
 entire-0.003
 spine-0.003
Token to
Token ID284
Feature activation+3.19e-03

Pos logit contributions
he+0.041
ifying+0.037
 contain+0.036
chard+0.032
ancer+0.031
Neg logit contributions
 turns-0.029
rums-0.029
 runs-0.028
 comes-0.026
 happens-0.025
Token her
Token ID607
Feature activation+7.58e-03

Pos logit contributions
rums+0.005
 runs+0.004
 turns+0.004
 comes+0.004
 tries+0.004
Neg logit contributions
he-0.008
ifying-0.007
 contain-0.007
 spine-0.007
 entire-0.007
Token magn
Token ID7842
Feature activation-3.56e-03

Pos logit contributions
 runs+0.013
rums+0.013
 turns+0.013
 comes+0.012
 hugs+0.012
Neg logit contributions
he-0.017
ifying-0.015
 contain-0.015
ancer-0.013
 spine-0.012
Tokenifying
Token ID4035
Feature activation-1.20e-01

Pos logit contributions
 contain+0.003
he+0.003
 entire+0.002
 sunny+0.002
 particular+0.002
Neg logit contributions
es-0.010
rums-0.009
 runs-0.008
 comes-0.008
 hugs-0.008
Token glass
Token ID5405
Feature activation-6.58e-03

Pos logit contributions
he+0.220
 contain+0.195
ifying+0.181
 spine+0.165
ancer+0.159
Neg logit contributions
rums-0.238
 turns-0.231
 runs-0.230
 comes-0.217
 happens-0.216
Token.
Token ID13
Feature activation+1.21e-03

Pos logit contributions
he+0.013
ifying+0.011
 contain+0.011
 entire+0.010
 spine+0.009
Neg logit contributions
rums-0.013
 runs-0.012
 turns-0.012
 comes-0.012
 happens-0.011
Token She
Token ID1375
Feature activation+4.29e-03

Pos logit contributions
rums+0.002
 runs+0.002
 turns+0.002
 happens+0.002
 comes+0.002
Neg logit contributions
he-0.003
ifying-0.002
 contain-0.002
 spine-0.002
 entire-0.002
Token looked
Token ID3114
Feature activation-1.58e-02

Pos logit contributions
rums+0.009
 runs+0.007
 turns+0.007
 happens+0.007
 comes+0.007
Neg logit contributions
he-0.009
 contain-0.007
ifying-0.007
 entire-0.006
 usual-0.006
Token at
Token ID379
Feature activation-1.24e-03

Pos logit contributions
he+0.031
ifying+0.027
 contain+0.026
 spine+0.023
 entire+0.022
Neg logit contributions
rums-0.031
 runs-0.031
 turns-0.030
 comes-0.028
 tries-0.028
Token
Token ID26391
Feature activation-2.37e-02

Pos logit contributions
he+0.007
ifying+0.006
 contain+0.006
 entire+0.005
 spine+0.005
Neg logit contributions
rums-0.007
 runs-0.006
 turns-0.006
 comes-0.006
 happens-0.006
Token
Token ID8151
Feature activation-3.65e-03

Pos logit contributions
he+0.052
ifying+0.045
 contain+0.042
 entire+0.038
 spine+0.037
Neg logit contributions
rums-0.046
 runs-0.040
 turns-0.037
 happens-0.036
 comes-0.035
Tokens
Token ID82
Feature activation+3.81e-03

Pos logit contributions
he+0.008
ifying+0.007
 contain+0.007
 entire+0.006
 spine+0.006
Neg logit contributions
rums-0.007
 runs-0.006
 turns-0.006
 comes-0.005
 happens-0.005
Token a
Token ID257
Feature activation-9.13e-03

Pos logit contributions
rums+0.007
 runs+0.007
 turns+0.006
 comes+0.006
 happens+0.006
Neg logit contributions
he-0.008
ifying-0.007
 contain-0.007
 entire-0.006
 spine-0.006
Token books
Token ID3835
Feature activation-6.19e-03

Pos logit contributions
he+0.018
ifying+0.015
 contain+0.015
 spine+0.013
 entire+0.012
Neg logit contributions
 runs-0.018
rums-0.018
 turns-0.017
 comes-0.016
 happens-0.016
Tokenhe
Token ID258
Feature activation-1.16e-01

Pos logit contributions
ifying+0.008
 spine+0.007
 contain+0.007
he+0.007
 usual+0.006
Neg logit contributions
 runs-0.014
 comes-0.014
 turns-0.013
rums-0.013
 happens-0.013
Tokenlf
Token ID1652
Feature activation-9.46e-03

Pos logit contributions
he+0.227
ifying+0.202
 contain+0.192
 spine+0.167
 entire+0.166
Neg logit contributions
rums-0.236
 runs-0.219
 turns-0.205
 comes-0.199
 happens-0.195
Token.
Token ID13
Feature activation-7.13e-03

Pos logit contributions
he+0.019
ifying+0.016
 contain+0.016
 spine+0.014
 entire+0.014
Neg logit contributions
rums-0.019
 runs-0.018
 turns-0.016
 comes-0.016
 happens-0.016
Token You
Token ID921
Feature activation+7.62e-03

Pos logit contributions
he+0.017
ifying+0.015
 contain+0.015
 entire+0.014
 spine+0.013
Neg logit contributions
rums-0.011
 runs-0.010
 turns-0.009
 comes-0.009
 happens-0.009
Token can
Token ID460
Feature activation-7.54e-04

Pos logit contributions
 runs+0.012
rums+0.012
 turns+0.012
 comes+0.011
 happens+0.010
Neg logit contributions
he-0.017
ifying-0.015
 contain-0.015
 spine-0.014
pron-0.013
Token use
Token ID779
Feature activation-5.00e-03

Pos logit contributions
he+0.001
ifying+0.001
 contain+0.001
 spine+0.001
 entire+0.001
Neg logit contributions
rums-0.001
 runs-0.001
 turns-0.001
 comes-0.001
 happens-0.001