TOP ACTIVATIONS
MAX = 0.125

then her friend started to mis be have . She was
's wrong , Mr . Cro cod ile ?"
there , Lily started to mis be have . She pulled
and created a shelter for Twe ety . Twe ety felt
it fits !" She helped Sil via put on the coat
, Tim my started to mis be have . He didn
Lucy noticed the weather was uncom fort ably hot . She
, a little bird named Twe ety , and told her
seeds and gave them to Twe ety . Twe
dare to climb up to Twe ety 's nest .
a film camera and asked Twe ety if she could take
a little bird ie named Twe ety . Twe ety was
" Hello , Mr . Cro cod ile ," said Lily
who was trying to take Twe ety 's nest . Twe
to never try to catch Twe ety again . From that
a little bird ie named Twe ety . Twe ety loved
a little bird ie named Twe ety . Twe ety lived
bird and they named it Twe ety . They fed Twe
friend , a bird named Twe ety , got stuck in
there were two friends , Shar in and Maria . They

MIN ACTIVATIONS
MIN = -0.125

flashlight , and a magn ifying glass . As
When Daisy put the magn ifying glass down , the cater
She carefully applied the yellow makeup to her nose and smiled
stirred . Soon the cake batter was ready . Mama poured
no , because the ash tr ay was not a toy
He was here a minute ago . He was chasing the
our country a long time ago . He helped many people
my had told him long ago . The bird spread out
born only a few days ago . The deer felt sorry
family say how beautiful the drawer looked ! Grac ie smiled
the print in his new coloring book was so clear and
said . He drank his tea and smiled . Lily drank
story . A long time ago , there was an adventurous
, next to her magn ifying glass . She looked at
She takes out a magn ifying glass . She looks at
bought her a brand new coloring book . She grabbed her
, handing him the magn ifying glass . She looked around
tr ay . The ash tr ay lived in a house
and fell . The ash tr ay flew out of her
B obby and the ash tr ay became the best of

INTERVAL 0.063 - 0.125
CONTAINS 0.152%

was just sick and couldn 't help it . The animals
off course . Sunny didn 't know where he was and
the garden . There , Bug gy saw an incredible cob
too . But Tim didn 't care . He kept playing
Suddenly , he heard a loud roar . He turned around

INTERVAL 0.000 - 0.063
CONTAINS 50.031%

better ." Max and Lily went to the veterinarian
let her get her bucket and shovel . She filled the
heard a loud crunch ing sound . When she looked ,
get better soon . It will be fun to play hockey
that day when the mum came to check on them ,

INTERVAL -0.062 - 0.000
CONTAINS 49.716%

Oh no Jimmy , does it hurt ?" Jimmy
his mom went to the shop . Jack was excited to
ones his mom my made . One day , Tim my
the laser in their room . They heard their mom call
table was covered with cr umbs and spills .

INTERVAL -0.125 - -0.062
CONTAINS 0.101%

on the pan so the cake would not stick . They
paper . Writing on the fridge is bad . It makes
measure other things in his room . His bed , the
clay ! <|endoftext|> Once , there was a small rabbit who
my said , " This bucket is available for us to
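
The four INTERVAL blocks above report what share of activations falls into each of four bins between MIN and MAX, and the listed shares add up to 100%. A minimal sketch of how such shares could be computed follows. It assumes equal-width bins with edges at -0.125, -0.0625, 0, 0.0625, and 0.125. The function name `interval_shares` and the sample values are hypothetical, and the dashboard's own edge handling is not shown in the source.

```python
from bisect import bisect_right


def interval_shares(activations, lo=-0.125, hi=0.125, n_bins=4):
    """Return (low, high, percent) for each of n_bins equal-width intervals.

    Assumption: bins are equal-width between lo and hi. Values outside
    [lo, hi] are clamped into the first or last bin.
    """
    if not activations:
        raise ValueError("activations must be non-empty")

    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    counts = [0] * n_bins

    for a in activations:
        # bisect_right assigns a value equal to an inner edge to the upper bin;
        # the clamp keeps a == hi in the last bin and a < lo in the first.
        idx = min(max(bisect_right(edges, a) - 1, 0), n_bins - 1)
        counts[idx] += 1

    total = len(activations)
    return [
        (edges[i], edges[i + 1], 100.0 * counts[i] / total)
        for i in range(n_bins)
    ]


if __name__ == "__main__":
    # Hypothetical sample of per-token feature activations.
    sample = [0.125, 0.0236, -0.0186, -0.1, 0.0468, 0.0033, -0.0042, 0.108]
    for low, high, pct in interval_shares(sample):
        print(f"INTERVAL {low:.3f} - {high:.3f}  CONTAINS {pct:.3f}%")
```

Because the bin edges are passed as parameters, the same function works for a feature with a different MIN and MAX.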
Token then
Token ID788
Feature activation+2.36e-02

Pos logit contributions
ck+0.057
 reassure+0.056
 unexpected+0.053
 swat+0.052
 fright+0.052
Neg logit contributions
robe-0.041
ase-0.034
tub-0.033
 sky-0.032
 Now-0.032
Token her
Token ID607
Feature activation-1.86e-02

Pos logit contributions
ck+0.059
 reassure+0.059
 swat+0.055
 fright+0.055
 unexpected+0.055
Neg logit contributions
robe-0.030
 has-0.024
tub-0.023
 Now-0.023
 sky-0.023
Token friend
Token ID1545
Feature activation+2.56e-02

Pos logit contributions
robe+0.033
 sky+0.028
 has+0.027
tub+0.027
ase+0.027
Neg logit contributions
ck-0.038
 reassure-0.037
 intervene-0.034
 swat-0.034
 fright-0.034
Token started
Token ID2067
Feature activation+1.04e-02

Pos logit contributions
ck+0.062
 reassure+0.061
 swat+0.058
 unexpected+0.058
 fright+0.058
Neg logit contributions
robe-0.033
 has-0.028
 He-0.027
tub-0.026
 Now-0.026
Token to
Token ID284
Feature activation+4.21e-02

Pos logit contributions
ck+0.028
 reassure+0.027
 unexpected+0.026
 intervene+0.025
ited+0.024
Neg logit contributions
robe-0.011
tub-0.009
 sky-0.008
 He-0.008
 They-0.008
Token mis
Token ID2984
Feature activation+1.25e-01

Pos logit contributions
ck+0.086
 unexpected+0.081
away+0.075
 reassure+0.073
 tighter+0.072
Neg logit contributions
robe-0.060
 has-0.056
 He-0.053
 They-0.053
 sky-0.050
Tokenbe
Token ID1350
Feature activation+4.68e-02

Pos logit contributions
 hesitation+0.241
 reassure+0.239
 unexpected+0.239
 fear+0.237
ck+0.231
Neg logit contributions
ase-0.155
elt-0.126
angers-0.126
robe-0.122
tub-0.120
Tokenhave
Token ID14150
Feature activation+3.34e-03

Pos logit contributions
 reassure+0.078
ck+0.074
 fright+0.073
 unexpected+0.073
 swat+0.071
Neg logit contributions
robe-0.090
 has-0.081
 Now-0.078
tub-0.077
ase-0.077
Token.
Token ID13
Feature activation-4.22e-03

Pos logit contributions
ck+0.008
 reassure+0.008
 swat+0.007
 unexpected+0.007
 intervene+0.007
Neg logit contributions
robe-0.005
tub-0.004
 has-0.004
ase-0.004
 Now-0.004
Token She
Token ID1375
Feature activation-1.51e-02

Pos logit contributions
robe+0.008
ates+0.007
ase+0.007
owers+0.007
fl+0.007
Neg logit contributions
 intervene-0.007
 fright-0.006
 reassure-0.006
 cautiously-0.006
 closer-0.006
Token was
Token ID373
Feature activation+1.28e-02

Pos logit contributions
robe+0.024
 has+0.021
tub+0.020
 Now+0.019
 sky+0.019
Neg logit contributions
 reassure-0.032
ck-0.032
 unexpected-0.030
 fright-0.030
 intervene-0.030
Token's
Token ID338
Feature activation+1.39e-02

Pos logit contributions
robe+0.026
 has+0.021
tub+0.020
ase+0.020
 sky+0.019
Neg logit contributions
ck-0.055
 reassure-0.054
 swat-0.052
 intervene-0.051
 unexpected-0.051
Token wrong
Token ID2642
Feature activation+7.08e-03

Pos logit contributions
 reassure+0.029
ck+0.028
 intervene+0.027
 swat+0.026
 fright+0.026
Neg logit contributions
robe-0.024
 has-0.020
tub-0.019
 Now-0.019
ase-0.019
Token,
Token ID11
Feature activation+2.67e-02

Pos logit contributions
ck+0.015
 reassure+0.015
 unexpected+0.014
 fright+0.014
 swat+0.014
Neg logit contributions
robe-0.011
ase-0.009
tub-0.009
 Now-0.009
 has-0.009
Token Mr
Token ID1770
Feature activation+5.02e-02

Pos logit contributions
ck+0.048
 reassure+0.048
 unexpected+0.044
 swat+0.044
 fright+0.044
Neg logit contributions
robe-0.053
ase-0.045
tub-0.045
 Now-0.044
 has-0.044
Token.
Token ID13
Feature activation-1.16e-02

Pos logit contributions
ck+0.104
 reassure+0.101
 unexpected+0.097
 fright+0.095
 intervene+0.094
Neg logit contributions
robe-0.083
 Now-0.069
tub-0.068
ase-0.067
 sky-0.066
Token Cro
Token ID9325
Feature activation+1.08e-01

Pos logit contributions
robe+0.022
 Now+0.018
ase+0.018
tub+0.018
 has+0.018
Neg logit contributions
ck-0.021
 reassure-0.020
 fright-0.019
 unexpected-0.019
 swat-0.019
Tokencod
Token ID19815
Feature activation+5.17e-02

Pos logit contributions
 reassure+0.156
 fright+0.139
 intervene+0.138
 unexpected+0.133
 response+0.132
Neg logit contributions
ase-0.231
robe-0.227
 has-0.217
unes-0.205
tub-0.202
Tokenile
Token ID576
Feature activation+4.24e-02

Pos logit contributions
ck+0.080
 reassure+0.078
 unexpected+0.072
 swat+0.071
 fright+0.071
Neg logit contributions
robe-0.115
tub-0.100
ase-0.099
 has-0.099
 Now-0.099
Token?"
Token ID1701
Feature activation-5.73e-03

Pos logit contributions
ck+0.075
 reassure+0.074
 unexpected+0.070
 fright+0.068
 swat+0.068
Neg logit contributions
robe-0.084
tub-0.071
 sky-0.070
ase-0.070
 Now-0.070
Token
Token ID220
Feature activation-2.44e-02

Pos logit contributions
robe+0.011
tub+0.010
angers+0.009
ase+0.009
unes+0.009
Neg logit contributions
 reassure-0.010
ck-0.010
 fright-0.010
 swat-0.010
 convince-0.009
Token
Token ID198
Feature activation+1.10e-02

Pos logit contributions
robe+0.033
tub+0.025
ase+0.025
 has+0.025
 sky+0.024
Neg logit contributions
ck-0.057
 reassure-0.057
 fright-0.053
 swat-0.053
 unexpected-0.053
Token there
Token ID612
Feature activation-2.62e-02

Pos logit contributions
ck+0.057
 reassure+0.056
 swat+0.053
 intervene+0.052
 unexpected+0.051
Neg logit contributions
robe-0.037
 sky-0.034
 has-0.032
 on-0.030
 He-0.030
Token,
Token ID11
Feature activation+2.20e-02

Pos logit contributions
robe+0.037
tub+0.029
 sky+0.029
 has+0.029
ase+0.029
Neg logit contributions
ck-0.062
 reassure-0.061
 unexpected-0.057
 swat-0.057
 fright-0.057
Token Lily
Token ID20037
Feature activation+1.47e-02

Pos logit contributions
ck+0.043
 reassure+0.042
 unexpected+0.040
 swat+0.039
 fright+0.039
Neg logit contributions
robe-0.039
ase-0.034
tub-0.033
 Now-0.033
 has-0.032
Token started
Token ID2067
Feature activation-4.22e-03

Pos logit contributions
ck+0.030
 reassure+0.030
 unexpected+0.028
 fright+0.028
 swat+0.028
Neg logit contributions
robe-0.024
 has-0.020
tub-0.020
ase-0.020
 sky-0.020
Token to
Token ID284
Feature activation+2.99e-02

Pos logit contributions
robe+0.005
 sky+0.004
tub+0.004
 has+0.004
ase+0.003
Neg logit contributions
ck-0.011
 reassure-0.011
 unexpected-0.010
 intervene-0.010
 swat-0.010
Token mis
Token ID2984
Feature activation+1.08e-01

Pos logit contributions
ck+0.050
 unexpected+0.046
 reassure+0.045
render+0.044
 intervene+0.042
Neg logit contributions
robe-0.056
 sky-0.050
 has-0.050
bows-0.049
tub-0.048
Tokenbe
Token ID1350
Feature activation+3.62e-02

Pos logit contributions
 reassure+0.221
 unexpected+0.221
 hesitation+0.218
 fear+0.216
ck+0.214
Neg logit contributions
ase-0.122
elt-0.102
angers-0.093
 sky-0.092
robe-0.091
Tokenhave
Token ID14150
Feature activation+4.16e-04

Pos logit contributions
 reassure+0.059
ck+0.056
 fright+0.055
 unexpected+0.054
 swat+0.053
Neg logit contributions
robe-0.073
 has-0.065
 Now-0.063
ase-0.063
tub-0.062
Token.
Token ID13
Feature activation-5.01e-03

Pos logit contributions
ck+0.001
 reassure+0.001
 swat+0.001
 unexpected+0.001
 intervene+0.001
Neg logit contributions
robe-0.001
 has-0.000
 sky-0.000
ase-0.000
tub-0.000
Token She
Token ID1375
Feature activation-2.41e-02

Pos logit contributions
robe+0.006
ase+0.005
tub+0.005
 has+0.004
 Now+0.004
Neg logit contributions
 reassure-0.012
 fright-0.012
 swat-0.011
 intervene-0.011
ck-0.011
Token pulled
Token ID5954
Feature activation+9.68e-03

Pos logit contributions
robe+0.038
 has+0.032
 sky+0.031
tub+0.031
ase+0.031
Neg logit contributions
 reassure-0.051
ck-0.051
 unexpected-0.048
 intervene-0.048
 swat-0.048
Token and
Token ID290
Feature activation+4.37e-02

Pos logit contributions
ck+0.040
 reassure+0.040
 fright+0.038
 unexpected+0.038
 swat+0.037
Neg logit contributions
robe-0.037
ase-0.031
tub-0.030
 has-0.030
 Now-0.029
Token created
Token ID2727
Feature activation+2.75e-03

Pos logit contributions
ck+0.072
 reassure+0.069
 unexpected+0.067
 fright+0.067
 swat+0.065
Neg logit contributions
robe-0.093
ase-0.082
tub-0.079
 Now-0.078
angers-0.077
Token a
Token ID257
Feature activation+3.27e-02

Pos logit contributions
ck+0.004
 unexpected+0.004
 fright+0.003
 reassure+0.003
 fear+0.003
Neg logit contributions
robe-0.007
ase-0.006
 has-0.006
angers-0.006
tub-0.006
Token shelter
Token ID11772
Feature activation-9.25e-03

Pos logit contributions
ck+0.065
 reassure+0.063
 intervene+0.057
 swat+0.056
 unexpected+0.056
Neg logit contributions
robe-0.058
 sky-0.051
tub-0.051
 has-0.050
ase-0.048
Token for
Token ID329
Feature activation+3.69e-02

Pos logit contributions
robe+0.017
 Now+0.016
 Cookies+0.015
angers+0.015
unes+0.015
Neg logit contributions
ck-0.016
 reassure-0.016
 fright-0.016
 unexpected-0.016
 swat-0.015
Token Twe
Token ID24205
Feature activation+1.07e-01

Pos logit contributions
 reassure+0.069
ck+0.069
 intervene+0.063
 swat+0.063
 unexpected+0.063
Neg logit contributions
robe-0.068
 Now-0.057
 sky-0.057
 has-0.057
tub-0.057
Tokenety
Token ID2963
Feature activation+4.51e-02

Pos logit contributions
 reassure+0.241
ck+0.228
 unexpected+0.226
 intervene+0.225
 fright+0.224
Neg logit contributions
robe-0.160
ase-0.140
unes-0.129
tub-0.126
 sky-0.124
Token.
Token ID13
Feature activation+6.18e-03

Pos logit contributions
ck+0.100
 reassure+0.098
 unexpected+0.093
 swat+0.092
 fright+0.092
Neg logit contributions
robe-0.070
 Now-0.056
tub-0.056
ase-0.056
 sky-0.055
Token Twe
Token ID24205
Feature activation+9.27e-02

Pos logit contributions
 fright+0.006
 reassure+0.006
 unexpected+0.006
ck+0.006
 swat+0.006
Neg logit contributions
robe-0.015
ase-0.014
tub-0.014
pit-0.014
ates-0.013
Tokenety
Token ID2963
Feature activation+2.75e-02

Pos logit contributions
 reassure+0.192
ck+0.185
 unexpected+0.180
 intervene+0.178
 fright+0.177
Neg logit contributions
robe-0.155
ase-0.135
 has-0.127
tub-0.126
 Now-0.125
Token felt
Token ID2936
Feature activation+1.15e-02

Pos logit contributions
 reassure+0.068
ck+0.066
 intervene+0.065
 unexpected+0.064
 fright+0.063
Neg logit contributions
robe-0.034
 has-0.029
ase-0.027
 sky-0.027
 Now-0.027
Token it
Token ID340
Feature activation-2.86e-02

Pos logit contributions
ck+0.065
 reassure+0.064
 swat+0.060
 unexpected+0.059
 intervene+0.059
Neg logit contributions
robe-0.050
 has-0.042
tub-0.041
 Now-0.040
 sky-0.040
Token fits
Token ID11414
Feature activation+1.24e-02

Pos logit contributions
robe+0.045
ase+0.039
tub+0.037
 Now+0.037
 sky+0.036
Neg logit contributions
ck-0.062
 reassure-0.058
 fright-0.055
 unexpected-0.055
 swat-0.055
Token!"
Token ID2474
Feature activation+2.10e-03

Pos logit contributions
ck+0.025
 reassure+0.025
 unexpected+0.023
 swat+0.023
 fright+0.023
Neg logit contributions
robe-0.022
ase-0.018
tub-0.018
 sky-0.018
 Now-0.018
Token She
Token ID1375
Feature activation-2.79e-02

Pos logit contributions
ck+0.005
 reassure+0.005
 swat+0.005
 fright+0.005
 believe+0.005
Neg logit contributions
robe-0.003
tub-0.002
ase-0.002
 has-0.002
unes-0.002
Token helped
Token ID4193
Feature activation+2.30e-02

Pos logit contributions
robe+0.047
tub+0.039
ase+0.039
 has+0.039
 Now+0.038
Neg logit contributions
ck-0.057
 reassure-0.057
 unexpected-0.054
 fright-0.053
 swat-0.053
Token Sil
Token ID4243
Feature activation+1.07e-01

Pos logit contributions
 reassure+0.045
 unexpected+0.044
ck+0.043
ited+0.043
 intervene+0.042
Neg logit contributions
robe-0.037
 They-0.033
 Now-0.031
 He-0.031
 sky-0.031
Tokenvia
Token ID8869
Feature activation+5.40e-02

Pos logit contributions
 reassure+0.194
 fright+0.183
 intervene+0.181
 swat+0.176
 fear+0.174
Neg logit contributions
ase-0.194
robe-0.190
tub-0.167
 has-0.165
angers-0.158
Token put
Token ID1234
Feature activation-1.71e-02

Pos logit contributions
ck+0.086
 reassure+0.083
 swat+0.077
 fright+0.074
 unexpected+0.074
Neg logit contributions
robe-0.121
ase-0.106
 Now-0.104
tub-0.104
 sky-0.103
Token on
Token ID319
Feature activation+3.04e-02

Pos logit contributions
robe+0.036
tub+0.034
ase+0.033
unes+0.031
 Now+0.031
Neg logit contributions
ck-0.028
 reassure-0.025
 closer-0.024
 swat-0.024
 unexpected-0.023
Token the
Token ID262
Feature activation-5.14e-03

Pos logit contributions
ck+0.052
 reassure+0.052
 unexpected+0.048
 swat+0.047
 fright+0.047
Neg logit contributions
robe-0.062
tub-0.053
ase-0.053
 has-0.052
 Now-0.052
Token coat
Token ID13209
Feature activation-3.59e-02

Pos logit contributions
robe+0.010
ase+0.009
tub+0.009
 Now+0.008
unes+0.008
Neg logit contributions
 reassure-0.010
ck-0.009
 swat-0.009
 unexpected-0.009
 fright-0.009
Token,
Token ID11
Feature activation+1.32e-02

Pos logit contributions
ck+0.071
 reassure+0.070
 unexpected+0.066
 swat+0.066
 fright+0.066
Neg logit contributions
robe-0.047
ase-0.038
tub-0.038
 has-0.037
 Now-0.037
Token Tim
Token ID5045
Feature activation+3.78e-02

Pos logit contributions
ck+0.028
 reassure+0.028
 unexpected+0.027
 fright+0.026
 intervene+0.026
Neg logit contributions
robe-0.022
ase-0.018
tub-0.018
 has-0.017
 Now-0.017
Tokenmy
Token ID1820
Feature activation+1.92e-02

Pos logit contributions
ck+0.076
 reassure+0.074
 unexpected+0.069
 swat+0.069
 fright+0.069
Neg logit contributions
robe-0.067
tub-0.055
ase-0.055
 Now-0.055
 sky-0.053
Token started
Token ID2067
Feature activation-2.43e-03

Pos logit contributions
ck+0.037
 reassure+0.036
 unexpected+0.034
 swat+0.034
 fright+0.034
Neg logit contributions
robe-0.035
tub-0.030
ase-0.030
 has-0.029
 Now-0.029
Token to
Token ID284
Feature activation+2.79e-02

Pos logit contributions
robe+0.004
ase+0.003
tub+0.003
 Now+0.003
 has+0.003
Neg logit contributions
 reassure-0.005
ck-0.005
 swat-0.005
 fright-0.005
 unexpected-0.005
Token mis
Token ID2984
Feature activation+1.06e-01

Pos logit contributions
ck+0.049
 reassure+0.046
 unexpected+0.045
 fright+0.041
 swat+0.041
Neg logit contributions
robe-0.056
 has-0.048
tub-0.048
 sky-0.047
 He-0.047
Tokenbe
Token ID1350
Feature activation+3.60e-02

Pos logit contributions
 reassure+0.224
 unexpected+0.220
ck+0.219
 hesitation+0.211
 fear+0.210
Neg logit contributions
ase-0.117
 He-0.103
elt-0.101
robe-0.100
angers-0.099
Tokenhave
Token ID14150
Feature activation+4.58e-03

Pos logit contributions
ck+0.054
 reassure+0.054
 unexpected+0.050
 fright+0.049
 swat+0.049
Neg logit contributions
robe-0.079
 has-0.069
tub-0.069
ase-0.068
 Now-0.068
Token.
Token ID13
Feature activation-6.12e-03

Pos logit contributions
ck+0.010
 reassure+0.009
 unexpected+0.009
 swat+0.009
 fright+0.009
Neg logit contributions
robe-0.008
ase-0.006
tub-0.006
 has-0.006
 Now-0.006
Token He
Token ID679
Feature activation-1.94e-02

Pos logit contributions
robe+0.008
 has+0.007
ase+0.007
owers+0.006
bows+0.006
Neg logit contributions
 fright-0.012
 intervene-0.012
 reassure-0.012
 panic-0.012
 worry-0.012
Token didn
Token ID1422
Feature activation+1.74e-02

Pos logit contributions
robe+0.030
 has+0.026
 sky+0.025
tub+0.025
ase+0.025
Neg logit contributions
 reassure-0.042
ck-0.041
 swat-0.039
 unexpected-0.039
 intervene-0.038
Token Lucy
Token ID22162
Feature activation+2.79e-02

Pos logit contributions
ck+0.047
 reassure+0.046
 unexpected+0.043
 swat+0.042
 fright+0.042
Neg logit contributions
robe-0.056
ase-0.048
tub-0.048
 has-0.047
 Now-0.047
Token noticed
Token ID6810
Feature activation-6.47e-03

Pos logit contributions
ck+0.055
 reassure+0.054
 unexpected+0.051
 swat+0.051
 fright+0.050
Neg logit contributions
robe-0.050
ase-0.041
tub-0.041
 Now-0.041
 has-0.040
Token the
Token ID262
Feature activation-4.45e-04

Pos logit contributions
robe+0.010
 sky+0.009
tub+0.009
ase+0.008
 Now+0.008
Neg logit contributions
ck-0.014
 reassure-0.014
 intervene-0.013
 swat-0.013
 fright-0.013
Token weather
Token ID6193
Feature activation-7.56e-03

Pos logit contributions
robe+0.001
 Now+0.001
ase+0.001
 has+0.001
tub+0.001
Neg logit contributions
 reassure-0.001
 unexpected-0.001
ck-0.001
 fright-0.001
 swat-0.001
Token was
Token ID373
Feature activation+1.07e-02

Pos logit contributions
robe+0.012
 Now+0.011
ase+0.010
tub+0.010
 He+0.010
Neg logit contributions
ck-0.015
 reassure-0.015
 unexpected-0.015
 fright-0.014
 swat-0.014
Token uncom
Token ID8820
Feature activation+1.06e-01

Pos logit contributions
ck+0.020
 reassure+0.020
 swat+0.018
 intervene+0.018
 unexpected+0.018
Neg logit contributions
robe-0.020
tub-0.017
 sky-0.017
 has-0.017
ase-0.017
Tokenfort
Token ID3319
Feature activation+4.68e-02

Pos logit contributions
 fear+0.194
 unexpected+0.192
 reassure+0.192
 fright+0.181
 terror+0.172
Neg logit contributions
ase-0.166
robe-0.157
bows-0.144
angers-0.143
forest-0.137
Tokenably
Token ID1346
Feature activation+3.14e-02

Pos logit contributions
 reassure+0.064
ck+0.062
 fright+0.060
 intervene+0.059
 unexpected+0.058
Neg logit contributions
robe-0.112
tub-0.099
ase-0.097
angers-0.092
 sky-0.092
Token hot
Token ID3024
Feature activation+1.20e-02

Pos logit contributions
ck+0.054
 reassure+0.053
 unexpected+0.051
 fright+0.049
 swat+0.049
Neg logit contributions
robe-0.064
ase-0.056
 Now-0.055
tub-0.054
 has-0.052
Token.
Token ID13
Feature activation-6.35e-03

Pos logit contributions
 unexpected+0.019
 reassure+0.018
 fright+0.018
ck+0.018
 swat+0.017
Neg logit contributions
robe-0.024
ase-0.020
tub-0.019
angers-0.019
 has-0.019
Token She
Token ID1375
Feature activation-2.20e-02

Pos logit contributions
robe+0.007
tub+0.005
ase+0.005
 has+0.005
angers+0.004
Neg logit contributions
 reassure-0.016
 fright-0.016
 unexpected-0.015
 swat-0.015
 intervene-0.015
Token,
Token ID11
Feature activation+2.09e-02

Pos logit contributions
ck+0.081
 reassure+0.080
 unexpected+0.076
 fright+0.075
 swat+0.074
Neg logit contributions
robe-0.042
 has-0.036
 He-0.035
 sky-0.034
tub-0.033
Token a
Token ID257
Feature activation+3.17e-02

Pos logit contributions
ck+0.046
 reassure+0.044
 unexpected+0.042
 intervene+0.042
ited+0.042
Neg logit contributions
robe-0.030
 He-0.028
 sky-0.026
 has-0.026
 They-0.026
Token little
Token ID1310
Feature activation+4.16e-02

Pos logit contributions
ck+0.067
 reassure+0.062
 intervene+0.057
 unexpected+0.057
 swat+0.056
Neg logit contributions
robe-0.055
 sky-0.048
 has-0.047
tub-0.047
 He-0.045
Token bird
Token ID6512
Feature activation+5.15e-02

Pos logit contributions
ck+0.102
 reassure+0.096
 unexpected+0.091
 intervene+0.090
 convince+0.089
Neg logit contributions
robe-0.055
 sky-0.049
 has-0.047
 He-0.047
tub-0.045
Token named
Token ID3706
Feature activation+3.88e-03

Pos logit contributions
ck+0.096
 reassure+0.095
 unexpected+0.089
 swat+0.087
 fright+0.087
Neg logit contributions
robe-0.097
ase-0.082
tub-0.082
 has-0.081
 Now-0.081
Token Twe
Token ID24205
Feature activation+1.06e-01

Pos logit contributions
ck+0.007
 reassure+0.007
 swat+0.006
 unexpected+0.006
 fright+0.006
Neg logit contributions
robe-0.008
ase-0.007
tub-0.007
 has-0.007
 Now-0.007
Tokenety
Token ID2963
Feature activation+5.01e-02

Pos logit contributions
 reassure+0.229
 unexpected+0.223
ck+0.221
 fright+0.216
 intervene+0.210
Neg logit contributions
robe-0.168
ase-0.146
 has-0.134
tub-0.132
 He-0.131
Token,
Token ID11
Feature activation+5.52e-03

Pos logit contributions
ck+0.108
 reassure+0.105
 swat+0.099
 unexpected+0.098
 fright+0.098
Neg logit contributions
robe-0.083
tub-0.068
ase-0.068
 Now-0.067
 sky-0.064
Token and
Token ID290
Feature activation+2.40e-02

Pos logit contributions
ck+0.011
 reassure+0.011
 swat+0.010
 fright+0.010
 unexpected+0.010
Neg logit contributions
robe-0.010
ase-0.008
tub-0.008
 Now-0.008
 sky-0.008
Token told
Token ID1297
Feature activation-2.30e-04

Pos logit contributions
ck+0.051
 reassure+0.050
 unexpected+0.047
 fright+0.047
 swat+0.047
Neg logit contributions
robe-0.039
 has-0.032
tub-0.032
 Now-0.032
ase-0.032
Token her
Token ID607
Feature activation+1.30e-02

Pos logit contributions
robe+0.000
 Now+0.000
 has+0.000
 He+0.000
 sky+0.000
Neg logit contributions
ck-0.000
 reassure-0.000
 unexpected-0.000
 swat-0.000
 fright-0.000
Token seeds
Token ID11904
Feature activation-5.75e-03

Pos logit contributions
robe+0.007
 Now+0.006
ase+0.006
tub+0.006
angers+0.006
Neg logit contributions
ck-0.008
 reassure-0.008
 unexpected-0.007
 swat-0.007
 fright-0.007
Token and
Token ID290
Feature activation+1.41e-02

Pos logit contributions
robe+0.010
 Now+0.009
ase+0.009
angers+0.008
tub+0.008
Neg logit contributions
ck-0.011
 reassure-0.011
 swat-0.011
 fright-0.011
 closer-0.010
Token gave
Token ID2921
Feature activation-1.50e-03

Pos logit contributions
ck+0.028
 reassure+0.028
 unexpected+0.026
 swat+0.026
 fright+0.026
Neg logit contributions
robe-0.025
ase-0.021
tub-0.021
 Now-0.020
 has-0.020
Token them
Token ID606
Feature activation+9.81e-03

Pos logit contributions
robe+0.003
tub+0.003
ase+0.003
angers+0.003
 has+0.003
Neg logit contributions
ck-0.002
 reassure-0.002
 unexpected-0.002
 swat-0.002
 fright-0.002
Token to
Token ID284
Feature activation+1.25e-02

Pos logit contributions
 reassure+0.022
ck+0.022
 swat+0.020
 intervene+0.020
 unexpected+0.020
Neg logit contributions
robe-0.015
tub-0.012
 has-0.012
ase-0.012
 sky-0.012
Token Twe
Token ID24205
Feature activation+1.06e-01

Pos logit contributions
ck+0.021
 reassure+0.021
 unexpected+0.019
 swat+0.019
 fright+0.019
Neg logit contributions
robe-0.027
ase-0.023
tub-0.023
 has-0.022
 Now-0.022
Tokenety
Token ID2963
Feature activation+4.64e-02

Pos logit contributions
 reassure+0.228
ck+0.217
 unexpected+0.216
 fright+0.212
 intervene+0.211
Neg logit contributions
robe-0.164
ase-0.147
 has-0.133
tub-0.132
unes-0.130
Token.
Token ID13
Feature activation-5.41e-03

Pos logit contributions
ck+0.101
 reassure+0.100
 swat+0.093
 unexpected+0.093
 fright+0.092
Neg logit contributions
robe-0.073
 Now-0.060
tub-0.059
ase-0.059
 sky-0.059
Token
Token ID198
Feature activation+1.43e-02

Pos logit contributions
robe+0.011
ase+0.009
pit+0.009
tub+0.009
angers+0.009
Neg logit contributions
ck-0.008
 reassure-0.007
 fright-0.007
 intervene-0.007
 swat-0.007
Token
Token ID198
Feature activation+3.97e-02

Pos logit contributions
ck+0.039
 reassure+0.038
 unexpected+0.037
 fright+0.036
 swat+0.036
Neg logit contributions
robe-0.015
tub-0.011
ase-0.011
 has-0.010
 sky-0.010
TokenTwe
Token ID32665
Feature activation+8.17e-02

Pos logit contributions
 reassure+0.056
ck+0.056
 fright+0.050
 unexpected+0.050
 swat+0.050
Neg logit contributions
robe-0.091
 Now-0.081
ase-0.081
 has-0.080
tub-0.080
Token dare
Token ID16498
Feature activation+4.37e-02

Pos logit contributions
 reassure+0.102
ck+0.101
 intervene+0.096
 unexpected+0.095
 swat+0.091
Neg logit contributions
robe-0.134
 has-0.120
 He-0.119
 sky-0.114
tub-0.111
Token to
Token ID284
Feature activation+3.24e-02

Pos logit contributions
 reassure+0.098
ck+0.097
 unexpected+0.094
 intervene+0.093
ited+0.090
Neg logit contributions
robe-0.056
 He-0.051
 has-0.049
 sky-0.048
unes-0.047
Token climb
Token ID12080
Feature activation+2.29e-02

Pos logit contributions
ited+0.067
 reassure+0.064
 intervene+0.063
Maria+0.062
Little+0.062
Neg logit contributions
 sky-0.038
 has-0.037
 sun-0.036
robe-0.035
 name-0.034
Token up
Token ID510
Feature activation+2.38e-02

Pos logit contributions
ck+0.047
 reassure+0.047
 unexpected+0.044
 swat+0.044
 intervene+0.043
Neg logit contributions
robe-0.038
tub-0.032
ase-0.031
 has-0.031
 sky-0.031
Token to
Token ID284
Feature activation+2.28e-02

Pos logit contributions
ck+0.052
 reassure+0.051
 intervene+0.048
 swat+0.047
 unexpected+0.047
Neg logit contributions
robe-0.038
 sky-0.031
 has-0.031
tub-0.031
 Now-0.030
Token Twe
Token ID24205
Feature activation+1.04e-01

Pos logit contributions
ck+0.044
 reassure+0.044
 intervene+0.040
 unexpected+0.040
 swat+0.039
Neg logit contributions
robe-0.039
 He-0.034
 has-0.034
 sky-0.034
 Now-0.033
Tokenety
Token ID2963
Feature activation+4.54e-02

Pos logit contributions
 reassure+0.211
ck+0.208
 unexpected+0.200
 fright+0.196
 swat+0.194
Neg logit contributions
robe-0.179
ase-0.157
tub-0.149
 has-0.147
 sky-0.145
Token's
Token ID338
Feature activation-7.52e-03

Pos logit contributions
ck+0.106
 reassure+0.106
 unexpected+0.100
 intervene+0.099
 swat+0.099
Neg logit contributions
robe-0.064
ase-0.050
tub-0.050
 has-0.050
 sky-0.049
Token nest
Token ID16343
Feature activation-1.10e-02

Pos logit contributions
robe+0.012
 sky+0.010
tub+0.010
 has+0.010
ase+0.010
Neg logit contributions
ck-0.016
 reassure-0.016
 intervene-0.015
 unexpected-0.015
 swat-0.015
Token.
Token ID13
Feature activation+2.81e-03

Pos logit contributions
robe+0.017
 Now+0.014
ase+0.014
tub+0.014
 has+0.014
Neg logit contributions
ck-0.024
 reassure-0.023
 unexpected-0.022
 fright-0.022
 swat-0.022
Token
Token ID198
Feature activation+1.55e-02

Pos logit contributions
 reassure+0.006
ck+0.006
 intervene+0.006
 fright+0.005
 convince+0.005
Neg logit contributions
robe-0.005
ase-0.004
tub-0.004
angers-0.004
ates-0.003
Token a
Token ID257
Feature activation-1.01e-03

Pos logit contributions
robe+0.053
ase+0.046
tub+0.046
 Now+0.045
 has+0.044
Neg logit contributions
ck-0.046
 reassure-0.045
 unexpected-0.043
 fright-0.043
 swat-0.042
Token film
Token ID2646
Feature activation-2.94e-02

Pos logit contributions
robe+0.002
tub+0.002
ase+0.002
 has+0.002
 sky+0.002
Neg logit contributions
ck-0.002
 reassure-0.002
 unexpected-0.002
 swat-0.002
 intervene-0.002
Token camera
Token ID4676
Feature activation-2.74e-02

Pos logit contributions
robe+0.048
unes+0.043
 Now+0.043
tub+0.041
 has+0.041
Neg logit contributions
 reassure-0.054
ck-0.053
 unexpected-0.053
 fright-0.051
 swat-0.050
Token and
Token ID290
Feature activation+3.75e-03

Pos logit contributions
robe+0.045
 Now+0.041
tub+0.039
ase+0.038
unes+0.038
Neg logit contributions
ck-0.051
 reassure-0.049
 unexpected-0.048
 fright-0.047
 swat-0.047
Token asked
Token ID1965
Feature activation-1.48e-02

Pos logit contributions
 reassure+0.007
ck+0.007
 unexpected+0.007
 swat+0.006
 fright+0.006
Neg logit contributions
robe-0.007
ase-0.006
tub-0.006
unes-0.006
angers-0.006
Token Twe
Token ID24205
Feature activation+1.04e-01

Pos logit contributions
robe+0.025
ase+0.020
tub+0.020
 has+0.020
 Now+0.020
Neg logit contributions
ck-0.031
 reassure-0.031
 unexpected-0.029
 fright-0.028
 swat-0.028
Tokenety
Token ID2963
Feature activation+4.55e-02

Pos logit contributions
 reassure+0.227
ck+0.220
 unexpected+0.215
 fright+0.214
 intervene+0.209
Neg logit contributions
robe-0.158
ase-0.141
tub-0.128
unes-0.127
 has-0.126
Token if
Token ID611
Feature activation+4.37e-03

Pos logit contributions
ck+0.086
 reassure+0.085
 swat+0.079
 unexpected+0.079
 fright+0.078
Neg logit contributions
robe-0.083
ase-0.071
 Now-0.071
tub-0.070
 sky-0.070
Token she
Token ID673
Feature activation-3.58e-02

Pos logit contributions
ck+0.010
 reassure+0.010
 unexpected+0.010
 swat+0.010
 fright+0.010
Neg logit contributions
robe-0.006
ase-0.005
tub-0.005
 has-0.005
 Now-0.005
Token could
Token ID714
Feature activation+3.34e-03

Pos logit contributions
robe+0.040
 has+0.031
ase+0.030
tub+0.029
 Now+0.029
Neg logit contributions
ck-0.094
 reassure-0.093
 swat-0.088
 intervene-0.088
 unexpected-0.088
Token take
Token ID1011
Feature activation+7.33e-03

Pos logit contributions
ck+0.006
 reassure+0.006
 unexpected+0.005
 swat+0.005
 fright+0.005
Neg logit contributions
robe-0.007
tub-0.006
ase-0.006
 has-0.006
 Now-0.006
Token a
Token ID257
Feature activation+4.23e-02

Pos logit contributions
 reassure+0.097
ck+0.096
 intervene+0.092
 swat+0.089
 fright+0.089
Neg logit contributions
robe-0.064
 sky-0.055
 Now-0.054
 has-0.052
tub-0.052
Token little
Token ID1310
Feature activation+5.85e-02

Pos logit contributions
ck+0.091
 reassure+0.090
 intervene+0.084
 fright+0.083
 swat+0.081
Neg logit contributions
robe-0.069
 has-0.061
 sky-0.060
tub-0.059
 He-0.056
Token bird
Token ID6512
Feature activation+5.81e-02

Pos logit contributions
ck+0.138
 reassure+0.137
 unexpected+0.128
 intervene+0.127
 fright+0.127
Neg logit contributions
robe-0.083
tub-0.066
ase-0.066
 has-0.066
 sky-0.066
Tokenie
Token ID494
Feature activation+4.49e-02

Pos logit contributions
 reassure+0.113
ck+0.111
 intervene+0.106
 unexpected+0.106
 fright+0.104
Neg logit contributions
robe-0.103
ase-0.089
tub-0.087
 has-0.086
 sky-0.086
Token named
Token ID3706
Feature activation+1.61e-02

Pos logit contributions
 reassure+0.097
ck+0.094
 intervene+0.091
 unexpected+0.091
 fright+0.089
Neg logit contributions
robe-0.071
ase-0.059
 has-0.059
tub-0.058
 sky-0.058
Token Twe
Token ID24205
Feature activation+1.04e-01

Pos logit contributions
 reassure+0.035
ck+0.034
 intervene+0.032
 unexpected+0.032
 swat+0.031
Neg logit contributions
robe-0.026
 sky-0.023
 He-0.022
 has-0.022
tub-0.022
Tokenety
Token ID2963
Feature activation+4.48e-02

Pos logit contributions
 reassure+0.236
 fright+0.225
 unexpected+0.224
 intervene+0.221
ck+0.218
Neg logit contributions
robe-0.155
ase-0.139
tub-0.121
 has-0.119
 sky-0.119
Token.
Token ID13
Feature activation-7.35e-03

Pos logit contributions
 reassure+0.105
ck+0.104
 unexpected+0.099
 intervene+0.099
 fright+0.098
Neg logit contributions
robe-0.062
ase-0.050
tub-0.049
 has-0.049
 sky-0.048
Token Twe
Token ID24205
Feature activation+8.96e-02

Pos logit contributions
robe+0.019
tub+0.017
ase+0.017
angers+0.017
balls+0.016
Neg logit contributions
ck-0.007
 reassure-0.007
 worry-0.007
 fright-0.007
 swat-0.007
Tokenety
Token ID2963
Feature activation+1.93e-02

Pos logit contributions
 reassure+0.181
ck+0.175
 unexpected+0.167
 fright+0.167
 intervene+0.167
Neg logit contributions
robe-0.157
ase-0.136
tub-0.129
 has-0.128
 sky-0.126
Token was
Token ID373
Feature activation+3.59e-03

Pos logit contributions
ck+0.043
 reassure+0.040
 unexpected+0.040
 swat+0.039
 fright+0.038
Neg logit contributions
robe-0.030
tub-0.025
unes-0.025
angers-0.024
ase-0.024
Token"
Token ID1
Feature activation+8.28e-03

Pos logit contributions
ck+0.083
 reassure+0.081
 fright+0.078
 intervene+0.077
 unexpected+0.076
Neg logit contributions
robe-0.074
 has-0.066
 Now-0.065
ase-0.063
 sky-0.063
TokenHello
Token ID15496
Feature activation+2.36e-02

Pos logit contributions
ck+0.018
 unexpected+0.017
 reassure+0.017
 intervene+0.016
render+0.016
Neg logit contributions
robe-0.013
 sky-0.013
ase-0.011
 He-0.011
 Now-0.011
Token,
Token ID11
Feature activation+2.43e-02

Pos logit contributions
 reassure+0.059
ck+0.058
 intervene+0.055
 swat+0.055
 unexpected+0.054
Neg logit contributions
robe-0.029
 has-0.025
 sky-0.024
tub-0.024
ase-0.023
Token Mr
Token ID1770
Feature activation+3.66e-02

Pos logit contributions
ck+0.046
 reassure+0.046
 unexpected+0.043
 swat+0.042
 fright+0.042
Neg logit contributions
robe-0.045
ase-0.038
tub-0.038
 Now-0.037
 has-0.037
Token.
Token ID13
Feature activation-1.60e-02

Pos logit contributions
ck+0.075
 reassure+0.072
 unexpected+0.070
 fright+0.069
 swat+0.067
Neg logit contributions
robe-0.061
tub-0.050
 Now-0.049
ase-0.049
 sky-0.048
Token Cro
Token ID9325
Feature activation+1.03e-01

Pos logit contributions
robe+0.030
ase+0.025
 Now+0.024
tub+0.024
 has+0.024
Neg logit contributions
ck-0.029
 reassure-0.029
 fright-0.027
 unexpected-0.027
 swat-0.026
Tokencod
Token ID19815
Feature activation+5.14e-02

Pos logit contributions
 reassure+0.146
 fright+0.130
 unexpected+0.128
 intervene+0.128
 swat+0.122
Neg logit contributions
robe-0.225
ase-0.223
 has-0.213
unes-0.200
tub-0.199
Tokenile
Token ID576
Feature activation+3.50e-02

Pos logit contributions
ck+0.081
 reassure+0.080
 unexpected+0.073
 swat+0.073
 fright+0.072
Neg logit contributions
robe-0.112
ase-0.097
tub-0.097
 has-0.096
 Now-0.096
Token,"
Token ID553
Feature activation+6.79e-03

Pos logit contributions
ck+0.071
 reassure+0.071
 unexpected+0.066
 swat+0.066
 intervene+0.065
Neg logit contributions
robe-0.060
 has-0.050
ase-0.050
tub-0.049
 Now-0.049
Token said
Token ID531
Feature activation-1.09e-02

Pos logit contributions
ck+0.015
 reassure+0.015
 unexpected+0.014
 swat+0.014
 fright+0.014
Neg logit contributions
robe-0.010
ase-0.008
tub-0.008
 Now-0.008
 has-0.008
Token Lily
Token ID20037
Feature activation+4.47e-02

Pos logit contributions
robe+0.015
 has+0.011
 He+0.011
 Now+0.011
 sky+0.011
Neg logit contributions
ck-0.027
 reassure-0.026
 unexpected-0.025
 swat-0.025
 intervene-0.025
Token who
Token ID508
Feature activation-1.61e-02

Pos logit contributions
 reassure+0.109
ck+0.109
 unexpected+0.103
 intervene+0.102
 fright+0.101
Neg logit contributions
robe-0.080
ase-0.066
tub-0.066
 has-0.065
 sky-0.065
Token was
Token ID373
Feature activation+1.84e-02

Pos logit contributions
robe+0.019
 has+0.015
 sky+0.015
ase+0.014
tub+0.014
Neg logit contributions
 reassure-0.041
ck-0.040
 intervene-0.039
 swat-0.038
 unexpected-0.038
Token trying
Token ID2111
Feature activation+4.05e-03

Pos logit contributions
ck+0.034
 reassure+0.034
 unexpected+0.031
 intervene+0.031
 fright+0.031
Neg logit contributions
robe-0.035
tub-0.029
 has-0.029
 sky-0.029
ase-0.029
Token to
Token ID284
Feature activation+3.46e-02

Pos logit contributions
 reassure+0.011
ck+0.011
 intervene+0.011
ited+0.011
 swat+0.011
Neg logit contributions
robe-0.003
 has-0.003
 sky-0.002
 He-0.002
 They-0.002
Token take
Token ID1011
Feature activation+2.07e-02

Pos logit contributions
 intervene+0.067
ited+0.066
 reassure+0.064
 tighter+0.063
 unexpected+0.063
Neg logit contributions
robe-0.043
 sky-0.042
 has-0.037
 He-0.036
 It-0.035
Token Twe
Token ID24205
Feature activation+1.03e-01

Pos logit contributions
 reassure+0.041
ck+0.040
 intervene+0.038
 swat+0.037
 unexpected+0.037
Neg logit contributions
robe-0.036
 Now-0.031
 sky-0.031
 has-0.030
 He-0.030
Tokenety
Token ID2963
Feature activation+3.51e-02

Pos logit contributions
 reassure+0.229
 unexpected+0.218
ck+0.218
 fright+0.215
 intervene+0.213
Neg logit contributions
robe-0.154
ase-0.139
unes-0.124
 has-0.121
tub-0.121
Token's
Token ID338
Feature activation-1.09e-02

Pos logit contributions
 reassure+0.084
ck+0.084
 intervene+0.081
 unexpected+0.079
 swat+0.079
Neg logit contributions
robe-0.048
tub-0.037
ase-0.037
 has-0.036
 sky-0.036
Token nest
Token ID16343
Feature activation-9.41e-03

Pos logit contributions
robe+0.019
 sky+0.016
 has+0.016
ase+0.016
tub+0.016
Neg logit contributions
ck-0.021
 reassure-0.021
 intervene-0.020
 unexpected-0.020
 swat-0.019
Token.
Token ID13
Feature activation+6.94e-03

Pos logit contributions
robe+0.014
tub+0.011
ase+0.011
 sky+0.010
 has+0.010
Neg logit contributions
ck-0.022
 reassure-0.022
 swat-0.021
 intervene-0.020
 unexpected-0.020
Token Twe
Token ID24205
Feature activation+9.00e-02

Pos logit contributions
 reassure+0.011
 swat+0.011
ck+0.011
 fright+0.010
 closer+0.010
Neg logit contributions
robe-0.015
ase-0.013
tub-0.013
ates-0.012
angers-0.012
Token to
Token ID284
Feature activation+2.42e-02

Pos logit contributions
ck+0.038
 reassure+0.037
 swat+0.035
 fright+0.035
 unexpected+0.035
Neg logit contributions
robe-0.033
ase-0.029
tub-0.029
 has-0.026
 sky-0.026
Token never
Token ID1239
Feature activation+4.63e-02

Pos logit contributions
ck+0.044
 reassure+0.044
 unexpected+0.041
 swat+0.040
 fright+0.040
Neg logit contributions
robe-0.047
tub-0.039
ase-0.039
 has-0.039
 Now-0.039
Token try
Token ID1949
Feature activation+1.04e-02

Pos logit contributions
ck+0.085
 reassure+0.084
 unexpected+0.079
 intervene+0.078
 swat+0.077
Neg logit contributions
robe-0.081
 has-0.073
 sky-0.071
tub-0.068
 He-0.068
Token to
Token ID284
Feature activation+3.49e-02

Pos logit contributions
ck+0.022
 reassure+0.022
 unexpected+0.020
 swat+0.020
 intervene+0.020
Neg logit contributions
robe-0.017
tub-0.014
 has-0.014
ase-0.014
 sky-0.014
Token catch
Token ID4929
Feature activation+3.00e-02

Pos logit contributions
ited+0.065
 unexpected+0.063
ck+0.063
 reassure+0.062
 intervene+0.061
Neg logit contributions
robe-0.049
 sky-0.048
 has-0.045
bows-0.042
 ones-0.040
Token Twe
Token ID24205
Feature activation+1.02e-01

Pos logit contributions
ck+0.055
 reassure+0.055
 unexpected+0.050
 swat+0.050
 intervene+0.050
Neg logit contributions
robe-0.057
tub-0.048
ase-0.048
 has-0.048
 Now-0.048
Tokenety
Token ID2963
Feature activation+3.26e-02

Pos logit contributions
 reassure+0.211
ck+0.206
 unexpected+0.201
 fright+0.198
 intervene+0.195
Neg logit contributions
robe-0.172
ase-0.149
tub-0.139
 has-0.138
unes-0.136
Token again
Token ID757
Feature activation+2.50e-02

Pos logit contributions
ck+0.067
 reassure+0.067
 unexpected+0.063
 intervene+0.062
 swat+0.062
Neg logit contributions
robe-0.055
ase-0.045
 has-0.045
tub-0.045
 sky-0.044
Token.
Token ID13
Feature activation-9.35e-04

Pos logit contributions
ck+0.058
 reassure+0.057
 swat+0.054
 unexpected+0.054
 intervene+0.054
Neg logit contributions
robe-0.036
tub-0.029
 has-0.028
ase-0.028
 sky-0.028
Token From
Token ID3574
Feature activation-2.51e-03

Pos logit contributions
robe+0.002
tub+0.002
ase+0.002
ware+0.001
ates+0.001
Neg logit contributions
 reassure-0.002
 swat-0.002
 intervene-0.002
ck-0.002
 cautiously-0.001
Token that
Token ID326
Feature activation-2.22e-02

Pos logit contributions
robe+0.003
 has+0.002
 sky+0.002
 Now+0.002
 He+0.002
Neg logit contributions
 reassure-0.007
ck-0.006
 intervene-0.006
 fright-0.006
 swat-0.006
Token a
Token ID257
Feature activation+3.74e-02

Pos logit contributions
 reassure+0.080
ck+0.080
 intervene+0.075
 swat+0.074
 fright+0.073
Neg logit contributions
robe-0.060
 sky-0.050
 Now-0.050
tub-0.049
ase-0.049
Token little
Token ID1310
Feature activation+5.38e-02

Pos logit contributions
ck+0.078
 reassure+0.077
 intervene+0.072
 fright+0.071
 unexpected+0.069
Neg logit contributions
robe-0.063
 has-0.055
 sky-0.055
tub-0.054
ase-0.052
Token bird
Token ID6512
Feature activation+5.59e-02

Pos logit contributions
ck+0.126
 reassure+0.124
 unexpected+0.117
 fright+0.116
 swat+0.116
Neg logit contributions
robe-0.077
tub-0.062
ase-0.062
 has-0.061
 sky-0.061
Tokenie
Token ID494
Feature activation+4.03e-02

Pos logit contributions
ck+0.103
 reassure+0.102
 unexpected+0.095
 swat+0.094
 fright+0.093
Neg logit contributions
robe-0.106
ase-0.090
tub-0.090
 has-0.089
 Now-0.088
Token named
Token ID3706
Feature activation+1.46e-02

Pos logit contributions
ck+0.082
 reassure+0.080
 swat+0.075
 unexpected+0.075
 fright+0.075
Neg logit contributions
robe-0.070
tub-0.058
ase-0.057
 Now-0.057
 has-0.056
Token Twe
Token ID24205
Feature activation+1.01e-01

Pos logit contributions
 reassure+0.030
ck+0.030
 unexpected+0.028
 intervene+0.028
 swat+0.027
Neg logit contributions
robe-0.025
 sky-0.021
tub-0.021
 has-0.021
 Now-0.021
Tokenety
Token ID2963
Feature activation+4.38e-02

Pos logit contributions
 reassure+0.234
 unexpected+0.224
 fright+0.223
 intervene+0.219
ck+0.218
Neg logit contributions
robe-0.152
ase-0.135
tub-0.118
 has-0.116
angers-0.115
Token.
Token ID13
Feature activation-7.13e-03

Pos logit contributions
ck+0.098
 reassure+0.095
 unexpected+0.090
 swat+0.090
 fright+0.089
Neg logit contributions
robe-0.068
tub-0.055
ase-0.054
 Now-0.054
 has-0.053
Token Twe
Token ID24205
Feature activation+8.59e-02

Pos logit contributions
robe+0.019
tub+0.017
ase+0.017
angers+0.016
balls+0.016
Neg logit contributions
ck-0.007
 reassure-0.006
 worry-0.006
 swat-0.006
 fright-0.006
Tokenety
Token ID2963
Feature activation+1.62e-02

Pos logit contributions
 reassure+0.175
ck+0.170
 unexpected+0.164
 fright+0.163
 intervene+0.162
Neg logit contributions
robe-0.150
ase-0.129
tub-0.123
 has-0.121
 Now-0.119
Token loved
Token ID6151
Feature activation-9.59e-03

Pos logit contributions
ck+0.035
 reassure+0.032
 unexpected+0.032
 swat+0.031
 fright+0.031
Neg logit contributions
robe-0.027
unes-0.022
angers-0.022
tub-0.022
ase-0.021
Token a
Token ID257
Feature activation+3.97e-02

Pos logit contributions
ck+0.083
 reassure+0.083
 intervene+0.077
 swat+0.077
 fright+0.076
Neg logit contributions
robe-0.063
 Now-0.053
 sky-0.053
tub-0.052
 has-0.052
Token little
Token ID1310
Feature activation+5.77e-02

Pos logit contributions
ck+0.085
 reassure+0.084
 intervene+0.079
 fright+0.077
 unexpected+0.075
Neg logit contributions
robe-0.064
 has-0.058
 sky-0.056
tub-0.056
 He-0.053
Token bird
Token ID6512
Feature activation+5.67e-02

Pos logit contributions
ck+0.139
 reassure+0.137
 unexpected+0.128
 intervene+0.128
 fright+0.128
Neg logit contributions
robe-0.079
tub-0.063
 sky-0.063
ase-0.063
 has-0.063
Tokenie
Token ID494
Feature activation+4.34e-02

Pos logit contributions
ck+0.104
 reassure+0.104
 unexpected+0.097
 intervene+0.096
 fright+0.095
Neg logit contributions
robe-0.107
ase-0.092
tub-0.091
 has-0.090
 sky-0.089
Token named
Token ID3706
Feature activation+1.56e-02

Pos logit contributions
 reassure+0.091
ck+0.091
 unexpected+0.086
 intervene+0.085
 fright+0.084
Neg logit contributions
robe-0.071
ase-0.059
tub-0.058
 has-0.058
 sky-0.057
Token Twe
Token ID24205
Feature activation+1.01e-01

Pos logit contributions
 reassure+0.033
ck+0.033
 intervene+0.031
 unexpected+0.031
 swat+0.030
Neg logit contributions
robe-0.025
 sky-0.022
 He-0.021
 has-0.021
tub-0.021
Tokenety
Token ID2963
Feature activation+4.44e-02

Pos logit contributions
 reassure+0.228
 unexpected+0.219
 fright+0.218
ck+0.216
 intervene+0.214
Neg logit contributions
robe-0.156
ase-0.136
tub-0.122
 has-0.120
 sky-0.118
Token.
Token ID13
Feature activation-7.20e-03

Pos logit contributions
ck+0.101
 reassure+0.100
 unexpected+0.095
 swat+0.094
 fright+0.094
Neg logit contributions
robe-0.065
ase-0.052
tub-0.052
 has-0.051
 Now-0.051
Token Twe
Token ID24205
Feature activation+8.66e-02

Pos logit contributions
robe+0.019
tub+0.017
ase+0.017
angers+0.017
 has+0.016
Neg logit contributions
ck-0.007
 worry-0.007
 reassure-0.007
 fright-0.007
 swat-0.007
Tokenety
Token ID2963
Feature activation+1.77e-02

Pos logit contributions
 reassure+0.173
ck+0.169
 unexpected+0.161
 fright+0.160
 intervene+0.160
Neg logit contributions
robe-0.154
ase-0.132
tub-0.127
 has-0.125
 Now-0.124
Token lived
Token ID5615
Feature activation+1.05e-02

Pos logit contributions
ck+0.040
 reassure+0.037
 unexpected+0.036
 swat+0.035
 fright+0.035
Neg logit contributions
robe-0.028
tub-0.023
unes-0.023
angers-0.022
ase-0.022
Token bird
Token ID6512
Feature activation+2.16e-02

Pos logit contributions
ck+0.052
 reassure+0.051
 intervene+0.049
 swat+0.048
 unexpected+0.047
Neg logit contributions
robe-0.037
 sky-0.034
 has-0.032
ase-0.031
tub-0.030
Token and
Token ID290
Feature activation+2.59e-02

Pos logit contributions
ck+0.050
 reassure+0.050
 unexpected+0.047
 intervene+0.047
 fright+0.047
Neg logit contributions
robe-0.031
ase-0.024
tub-0.024
 has-0.024
 sky-0.024
Token they
Token ID484
Feature activation+1.51e-03

Pos logit contributions
ck+0.049
 reassure+0.048
 swat+0.045
 unexpected+0.044
 fright+0.044
Neg logit contributions
robe-0.050
ase-0.044
tub-0.042
 Now-0.041
unes-0.041
Token named
Token ID3706
Feature activation+8.00e-03

Pos logit contributions
ck+0.003
 reassure+0.003
 unexpected+0.003
 fright+0.003
 swat+0.003
Neg logit contributions
robe-0.002
tub-0.002
ase-0.002
 has-0.002
 sky-0.002
Token it
Token ID340
Feature activation-9.83e-03

Pos logit contributions
ck+0.014
 reassure+0.014
 unexpected+0.013
 swat+0.013
 intervene+0.013
Neg logit contributions
robe-0.015
ase-0.013
tub-0.013
 has-0.013
 sky-0.013
Token Twe
Token ID24205
Feature activation+1.00e-01

Pos logit contributions
robe+0.018
ase+0.015
 Now+0.015
tub+0.015
 sky+0.015
Neg logit contributions
ck-0.019
 reassure-0.019
 swat-0.017
 unexpected-0.017
 convince-0.017
Tokenety
Token ID2963
Feature activation+3.39e-02

Pos logit contributions
 reassure+0.226
 unexpected+0.219
ck+0.216
 fright+0.211
 intervene+0.207
Neg logit contributions
robe-0.156
ase-0.137
tub-0.122
 has-0.121
unes-0.119
Token.
Token ID13
Feature activation-8.59e-03

Pos logit contributions
ck+0.067
 reassure+0.063
 swat+0.059
 unexpected+0.059
 closer+0.057
Neg logit contributions
robe-0.058
 Now-0.052
ase-0.051
tub-0.050
 sky-0.049
Token They
Token ID1119
Feature activation-1.10e-02

Pos logit contributions
robe+0.016
ase+0.014
ates+0.014
tub+0.014
ests+0.014
Neg logit contributions
 reassure-0.013
 swat-0.013
 intervene-0.012
ck-0.012
 convince-0.012
Token fed
Token ID11672
Feature activation-9.61e-03

Pos logit contributions
robe+0.016
 has+0.014
ase+0.013
 sky+0.013
tub+0.013
Neg logit contributions
 reassure-0.024
ck-0.024
 fright-0.023
 intervene-0.023
 unexpected-0.023
Token Twe
Token ID24205
Feature activation+9.35e-02

Pos logit contributions
robe+0.020
tub+0.018
ase+0.018
 has+0.017
 Now+0.017
Neg logit contributions
ck-0.016
 reassure-0.016
 swat-0.015
 unexpected-0.015
 fright-0.015
Token friend
Token ID1545
Feature activation+2.32e-02

Pos logit contributions
robe+0.027
ase+0.024
 Now+0.024
 has+0.023
tub+0.023
Neg logit contributions
 reassure-0.022
ck-0.022
 unexpected-0.020
 swat-0.020
 fright-0.020
Token,
Token ID11
Feature activation+1.79e-03

Pos logit contributions
ck+0.053
 reassure+0.052
 fright+0.049
 swat+0.049
 unexpected+0.049
Neg logit contributions
robe-0.034
 has-0.028
 Now-0.028
tub-0.028
ase-0.028
Token a
Token ID257
Feature activation+2.33e-02

Pos logit contributions
ck+0.004
 reassure+0.004
 swat+0.004
 unexpected+0.004
 intervene+0.004
Neg logit contributions
robe-0.003
 He-0.002
tub-0.002
 has-0.002
 Now-0.002
Token bird
Token ID6512
Feature activation+4.61e-02

Pos logit contributions
ck+0.049
 reassure+0.047
 intervene+0.043
 unexpected+0.043
 swat+0.043
Neg logit contributions
robe-0.041
tub-0.034
 sky-0.034
 has-0.034
ase-0.033
Token named
Token ID3706
Feature activation+1.28e-03

Pos logit contributions
ck+0.080
 reassure+0.077
 swat+0.072
 fright+0.070
 unexpected+0.070
Neg logit contributions
robe-0.096
tub-0.083
ase-0.081
 Now-0.081
 sky-0.080
Token Twe
Token ID24205
Feature activation+1.00e-01

Pos logit contributions
ck+0.002
 reassure+0.002
 unexpected+0.002
 swat+0.002
 fright+0.002
Neg logit contributions
robe-0.003
ase-0.002
tub-0.002
 has-0.002
 Now-0.002
Tokenety
Token ID2963
Feature activation+4.27e-02

Pos logit contributions
ck+0.191
 reassure+0.190
 unexpected+0.179
 fright+0.176
 swat+0.175
Neg logit contributions
robe-0.184
ase-0.155
tub-0.153
 has-0.151
 Now-0.151
Token,
Token ID11
Feature activation-1.15e-03

Pos logit contributions
ck+0.082
 reassure+0.077
 swat+0.074
 unexpected+0.072
 fright+0.071
Neg logit contributions
robe-0.081
tub-0.068
ase-0.068
 Now-0.066
 sky-0.064
Token got
Token ID1392
Feature activation+3.45e-02

Pos logit contributions
robe+0.002
ase+0.002
tub+0.002
 sky+0.002
 Now+0.002
Neg logit contributions
ck-0.002
 reassure-0.002
 unexpected-0.002
 fright-0.002
 swat-0.002
Token stuck
Token ID7819
Feature activation+1.27e-02

Pos logit contributions
 reassure+0.068
ck+0.065
 intervene+0.062
 unexpected+0.061
 swat+0.060
Neg logit contributions
robe-0.059
 has-0.051
 sky-0.050
 He-0.050
 Now-0.049
Token in
Token ID287
Feature activation+3.11e-02

Pos logit contributions
 reassure+0.032
ck+0.032
 intervene+0.030
 unexpected+0.030
 swat+0.030
Neg logit contributions
robe-0.016
tub-0.012
 has-0.012
 sky-0.012
ase-0.012
Token there
Token ID612
Feature activation-5.44e-02

Pos logit contributions
ck+0.025
 reassure+0.024
 unexpected+0.023
 fright+0.023
 swat+0.023
Neg logit contributions
robe-0.021
ase-0.018
tub-0.017
 Now-0.017
 sky-0.017
Token were
Token ID547
Feature activation+3.61e-02

Pos logit contributions
robe+0.078
angers+0.063
unes+0.063
ase+0.061
tub+0.060
Neg logit contributions
ck-0.123
 reassure-0.120
 unexpected-0.117
 fright-0.114
 swat-0.114
Token two
Token ID734
Feature activation+3.94e-02

Pos logit contributions
ck+0.085
 reassure+0.084
 swat+0.079
 fright+0.078
 unexpected+0.078
Neg logit contributions
robe-0.051
tub-0.041
ase-0.041
 sky-0.041
 Now-0.041
Token friends
Token ID2460
Feature activation+3.12e-02

Pos logit contributions
 reassure+0.097
ck+0.097
 fright+0.090
 swat+0.090
 intervene+0.089
Neg logit contributions
robe-0.047
 has-0.041
 sky-0.041
tub-0.038
ase-0.037
Token,
Token ID11
Feature activation+8.65e-03

Pos logit contributions
ck+0.070
 reassure+0.069
 unexpected+0.065
 swat+0.065
 fright+0.065
Neg logit contributions
robe-0.044
ase-0.037
 Now-0.037
 He-0.037
 has-0.037
Token Shar
Token ID15465
Feature activation+1.00e-01

Pos logit contributions
ck+0.017
 reassure+0.017
 unexpected+0.016
 swat+0.016
 intervene+0.016
Neg logit contributions
robe-0.015
 has-0.013
tub-0.013
 sky-0.013
 Now-0.012
Tokenin
Token ID259
Feature activation+4.03e-02

Pos logit contributions
 reassure+0.168
 fright+0.157
 intervene+0.155
ck+0.154
 unexpected+0.153
Neg logit contributions
robe-0.195
 has-0.178
ase-0.175
 Now-0.171
 sky-0.169
Token and
Token ID290
Feature activation+2.57e-02

Pos logit contributions
ck+0.086
 reassure+0.084
 swat+0.079
 unexpected+0.079
 fright+0.079
Neg logit contributions
robe-0.066
tub-0.054
ase-0.053
 has-0.052
 sky-0.052
Token Maria
Token ID14200
Feature activation+3.99e-02

Pos logit contributions
ck+0.040
 reassure+0.039
 swat+0.036
 unexpected+0.036
 intervene+0.036
Neg logit contributions
robe-0.056
ase-0.050
tub-0.049
 sky-0.048
unes-0.047
Token.
Token ID13
Feature activation-1.40e-02

Pos logit contributions
ck+0.087
 reassure+0.084
 swat+0.079
 unexpected+0.079
 fright+0.079
Neg logit contributions
robe-0.066
tub-0.054
ase-0.053
 Now-0.052
 sky-0.051
Token They
Token ID1119
Feature activation-1.91e-02

Pos logit contributions
robe+0.014
ase+0.009
tub+0.009
angers+0.007
 has+0.007
Neg logit contributions
ck-0.039
 reassure-0.039
 fright-0.037
 unexpected-0.037
 swat-0.037
Token flashlight
Token ID38371
Feature activation-4.58e-02

Pos logit contributions
robe+0.023
ase+0.021
 Now+0.020
unes+0.020
tub+0.019
Neg logit contributions
 reassure-0.024
ck-0.023
 unexpected-0.023
 fright-0.023
 swat-0.023
Token,
Token ID11
Feature activation-2.36e-02

Pos logit contributions
robe+0.081
 Now+0.079
 Cookies+0.075
angers+0.073
unes+0.073
Neg logit contributions
ck-0.075
 reassure-0.072
 fright-0.071
 closer-0.071
 unexpected-0.070
Token and
Token ID290
Feature activation-2.51e-03

Pos logit contributions
robe+0.044
ase+0.040
tub+0.037
 Now+0.036
unes+0.036
Neg logit contributions
 reassure-0.044
ck-0.044
 unexpected-0.042
 fright-0.041
 swat-0.040
Token a
Token ID257
Feature activation-1.71e-02

Pos logit contributions
robe+0.005
ase+0.005
tub+0.005
unes+0.005
 Now+0.004
Neg logit contributions
ck-0.004
 reassure-0.004
 unexpected-0.004
 fright-0.004
 swat-0.004
Token magn
Token ID7842
Feature activation-3.71e-02

Pos logit contributions
robe+0.030
ase+0.028
 Now+0.027
unes+0.026
tub+0.025
Neg logit contributions
 reassure-0.032
 unexpected-0.030
ck-0.030
 fright-0.030
 swat-0.030
Tokenifying
Token ID4035
Feature activation-1.01e-01

Pos logit contributions
robe+0.067
 Now+0.063
 Cookies+0.059
tub+0.059
ase+0.057
Neg logit contributions
ck-0.067
 closer-0.063
 reassure-0.062
 fright-0.061
 nearer-0.060
Token glass
Token ID5405
Feature activation-4.29e-02

Pos logit contributions
robe+0.174
angers+0.164
unes+0.145
 Now+0.144
balls+0.143
Neg logit contributions
 fright-0.186
 unexpected-0.178
 reassure-0.166
ck-0.163
 fear-0.163
Token.
Token ID13
Feature activation-2.87e-02

Pos logit contributions
robe+0.074
 Now+0.074
angers+0.065
 These+0.065
 Cookies+0.065
Neg logit contributions
ck-0.072
 unexpected-0.070
 fright-0.068
 reassure-0.067
 closer-0.067
Token\n
Token ID198
Feature activation+1.19e-02

Pos logit contributions
robe+0.043
ase+0.036
tub+0.035
 has+0.033
unes+0.033
Neg logit contributions
 reassure-0.056
 cautiously-0.055
 fright-0.054
ck-0.053
 intervene-0.053
Token\n
Token ID198
Feature activation+3.13e-02

Pos logit contributions
ck+0.034
 reassure+0.034
 unexpected+0.032
 swat+0.032
 fright+0.032
Neg logit contributions
robe-0.011
ase-0.007
 Now-0.007
tub-0.007
 has-0.007
TokenAs
Token ID1722
Feature activation+1.08e-02

Pos logit contributions
ck+0.061
 reassure+0.059
 fright+0.058
 unexpected+0.057
 swat+0.056
Neg logit contributions
robe-0.053
 Now-0.050
ase-0.048
 He-0.045
 has-0.045
Token When
Token ID1649
Feature activation-1.72e-02

Pos logit contributions
robe+0.046
ase+0.036
tub+0.035
bows+0.032
unes+0.032
Neg logit contributions
 reassure-0.085
 fright-0.082
ck-0.081
 intervene-0.081
 swat-0.080
Token Daisy
Token ID40355
Feature activation+5.40e-03

Pos logit contributions
robe+0.030
ase+0.027
tub+0.024
angers+0.024
unes+0.024
Neg logit contributions
 reassure-0.034
ck-0.034
 fright-0.032
 unexpected-0.032
 intervene-0.032
Token put
Token ID1234
Feature activation-2.30e-02

Pos logit contributions
ck+0.010
 reassure+0.010
 swat+0.009
 fright+0.009
 unexpected+0.009
Neg logit contributions
robe-0.010
tub-0.009
ase-0.009
angers-0.008
unes-0.008
Token the
Token ID262
Feature activation-2.57e-02

Pos logit contributions
robe+0.045
tub+0.041
ase+0.040
angers+0.037
unes+0.037
Neg logit contributions
ck-0.041
 reassure-0.039
 fright-0.038
 unexpected-0.038
 closer-0.037
Token magn
Token ID7842
Feature activation-2.69e-02

Pos logit contributions
robe+0.041
 Now+0.035
ase+0.035
tub+0.034
 has+0.033
Neg logit contributions
 reassure-0.053
ck-0.053
 unexpected-0.051
 fright-0.050
 swat-0.050
Tokenifying
Token ID4035
Feature activation-9.86e-02

Pos logit contributions
robe+0.044
 Now+0.037
tub+0.036
 has+0.036
ase+0.036
Neg logit contributions
ck-0.055
 reassure-0.053
 fright-0.051
 closer-0.050
 unexpected-0.050
Token glass
Token ID5405
Feature activation-4.71e-02

Pos logit contributions
robe+0.170
angers+0.151
unes+0.136
 has+0.133
 Now+0.132
Neg logit contributions
 fright-0.183
 unexpected-0.176
ck-0.174
 reassure-0.171
 closer-0.166
Token down
Token ID866
Feature activation-9.96e-03

Pos logit contributions
robe+0.082
 Now+0.075
angers+0.073
unes+0.070
ase+0.068
Neg logit contributions
 unexpected-0.077
 reassure-0.075
 fright-0.075
 closer-0.075
ck-0.075
Token,
Token ID11
Feature activation+1.93e-02

Pos logit contributions
robe+0.018
ase+0.016
tub+0.016
angers+0.015
 Now+0.015
Neg logit contributions
 reassure-0.018
ck-0.018
 unexpected-0.018
 fright-0.017
 swat-0.017
Token the
Token ID262
Feature activation-1.99e-02

Pos logit contributions
 reassure+0.041
ck+0.040
 unexpected+0.039
 fright+0.039
 swat+0.038
Neg logit contributions
robe-0.032
ase-0.027
tub-0.026
 Now-0.025
unes-0.025
Token cater
Token ID20825
Feature activation+4.50e-04

Pos logit contributions
robe+0.033
ase+0.029
 Now+0.028
tub+0.028
unes+0.027
Neg logit contributions
 reassure-0.039
ck-0.039
 unexpected-0.038
 fright-0.037
 swat-0.037
Token She
Token ID1375
Feature activation-3.84e-02

Pos logit contributions
robe+0.039
ase+0.031
tub+0.030
unes+0.029
 has+0.029
Neg logit contributions
 reassure-0.056
 fright-0.054
 cautiously-0.053
 intervene-0.053
 convince-0.051
Token carefully
Token ID7773
Feature activation-2.64e-02

Pos logit contributions
robe+0.072
ase+0.061
tub+0.061
 Now+0.059
 sky+0.059
Neg logit contributions
ck-0.073
 reassure-0.072
 unexpected-0.067
 swat-0.066
 fright-0.066
Token applied
Token ID5625
Feature activation-4.52e-02

Pos logit contributions
robe+0.051
tub+0.044
ase+0.044
 Now+0.043
 has+0.043
Neg logit contributions
ck-0.048
 reassure-0.047
 unexpected-0.044
 swat-0.043
 fright-0.043
Token the
Token ID262
Feature activation-4.10e-02

Pos logit contributions
robe+0.094
ase+0.085
tub+0.084
angers+0.081
 Now+0.078
Neg logit contributions
ck-0.072
 reassure-0.070
 unexpected-0.069
 fright-0.068
 closer-0.066
Token yellow
Token ID7872
Feature activation-6.91e-02

Pos logit contributions
robe+0.075
ase+0.064
tub+0.063
 Now+0.063
 has+0.061
Neg logit contributions
ck-0.078
 reassure-0.078
 unexpected-0.073
 fright-0.073
 swat-0.072
Token makeup
Token ID16029
Feature activation-9.78e-02

Pos logit contributions
robe+0.124
tub+0.107
angers+0.105
 Now+0.105
ase+0.103
Neg logit contributions
 reassure-0.124
ck-0.123
 fright-0.123
 swat-0.121
 unexpected-0.117
Token to
Token ID284
Feature activation-1.70e-02

Pos logit contributions
robe+0.161
 Now+0.153
ase+0.144
angers+0.141
 Cookies+0.141
Neg logit contributions
 unexpected-0.164
 closer-0.164
 reassure-0.164
ck-0.161
 fright-0.158
Token her
Token ID607
Feature activation-3.28e-02

Pos logit contributions
robe+0.035
ase+0.031
tub+0.030
 has+0.030
 Now+0.030
Neg logit contributions
ck-0.028
 reassure-0.028
 unexpected-0.026
 swat-0.026
 fright-0.026
Token nose
Token ID9686
Feature activation-2.75e-02

Pos logit contributions
robe+0.060
ase+0.053
tub+0.052
 Now+0.051
unes+0.049
Neg logit contributions
 reassure-0.061
ck-0.061
 unexpected-0.058
 fright-0.057
 swat-0.057
Token and
Token ID290
Feature activation+1.99e-03

Pos logit contributions
robe+0.044
 Now+0.042
ase+0.039
 has+0.039
 Cookies+0.038
Neg logit contributions
 reassure-0.052
ck-0.052
 swat-0.049
 closer-0.048
 fright-0.048
Token smiled
Token ID13541
Feature activation-1.87e-02

Pos logit contributions
ck+0.003
 reassure+0.003
 swat+0.003
 fright+0.003
 unexpected+0.003
Neg logit contributions
robe-0.004
ase-0.004
tub-0.004
unes-0.004
angers-0.003
Token stirred
Token ID33091
Feature activation-3.24e-02

Pos logit contributions
ck+0.004
 reassure+0.004
 swat+0.004
 tug+0.004
 unexpected+0.004
Neg logit contributions
robe-0.006
ase-0.005
unes-0.005
angers-0.005
tub-0.004
Token.
Token ID13
Feature activation-3.15e-02

Pos logit contributions
robe+0.065
ase+0.056
angers+0.053
unes+0.053
 has+0.052
Neg logit contributions
ck-0.058
 reassure-0.055
 unexpected-0.054
 fright-0.052
 closer-0.051
Token Soon
Token ID15894
Feature activation-8.05e-03

Pos logit contributions
robe+0.046
ase+0.039
bows+0.038
rob+0.037
 has+0.035
Neg logit contributions
 cautiously-0.063
 reassure-0.060
 intervene-0.059
 fright-0.059
Little-0.059
Token the
Token ID262
Feature activation-2.94e-02

Pos logit contributions
robe+0.017
ase+0.015
angers+0.013
tub+0.013
unes+0.013
Neg logit contributions
ck-0.014
 reassure-0.014
 unexpected-0.014
 fright-0.013
 intervene-0.013
Token cake
Token ID12187
Feature activation-7.30e-02

Pos logit contributions
robe+0.047
 Now+0.040
ase+0.040
angers+0.039
unes+0.037
Neg logit contributions
 reassure-0.059
 unexpected-0.059
 fright-0.058
ck-0.058
 swat-0.056
Token batter
Token ID4984
Feature activation-9.37e-02

Pos logit contributions
robe+0.147
angers+0.140
unes+0.132
ests+0.129
 Now+0.129
Neg logit contributions
ck-0.113
 unexpected-0.111
 closer-0.101
 believe-0.100
 reassure-0.100
Token was
Token ID373
Feature activation-1.48e-02

Pos logit contributions
robe+0.187
ase+0.170
angers+0.170
 Now+0.168
ests+0.164
Neg logit contributions
ck-0.155
 unexpected-0.141
 closer-0.137
 cautiously-0.129
 tug-0.128
Token ready
Token ID3492
Feature activation-2.98e-02

Pos logit contributions
robe+0.030
ase+0.027
 Now+0.026
angers+0.025
tub+0.025
Neg logit contributions
ck-0.026
 unexpected-0.025
 reassure-0.024
 fright-0.023
 swat-0.022
Token.
Token ID13
Feature activation-2.80e-02

Pos logit contributions
robe+0.064
ase+0.057
angers+0.053
 Now+0.053
ests+0.052
Neg logit contributions
ck-0.047
 unexpected-0.046
 reassure-0.046
 fright-0.044
 closer-0.044
Token Mama
Token ID40084
Feature activation+1.40e-02

Pos logit contributions
robe+0.052
ase+0.047
bows+0.047
unes+0.044
angers+0.043
Neg logit contributions
 reassure-0.047
 cautiously-0.047
 intervene-0.046
Little-0.045
 fright-0.044
Token poured
Token ID19036
Feature activation-3.08e-02

Pos logit contributions
ck+0.025
 reassure+0.024
 swat+0.022
 cautiously+0.022
 unexpected+0.021
Neg logit contributions
robe-0.029
tub-0.024
unes-0.024
angers-0.024
ase-0.024
Token no
Token ID645
Feature activation+7.43e-03

Pos logit contributions
robe+0.117
ase+0.104
tub+0.103
 sky+0.101
 Now+0.100
Neg logit contributions
ck-0.069
 reassure-0.069
 unexpected-0.063
 swat-0.061
 fright-0.061
Token,
Token ID11
Feature activation-1.28e-02

Pos logit contributions
ck+0.017
 reassure+0.017
 unexpected+0.016
 swat+0.016
 fright+0.016
Neg logit contributions
robe-0.011
tub-0.009
ase-0.009
 has-0.009
 Now-0.009
Token because
Token ID780
Feature activation-5.89e-03

Pos logit contributions
robe+0.026
ase+0.023
tub+0.022
 Now+0.020
 sky+0.020
Neg logit contributions
 reassure-0.023
ck-0.023
 unexpected-0.022
 swat-0.021
 fright-0.021
Token the
Token ID262
Feature activation-3.91e-02

Pos logit contributions
robe+0.011
ase+0.010
tub+0.009
 Now+0.009
unes+0.009
Neg logit contributions
ck-0.011
 reassure-0.011
 unexpected-0.011
 fright-0.010
 swat-0.010
Token ash
Token ID12530
Feature activation-7.37e-02

Pos logit contributions
robe+0.091
angers+0.082
ase+0.081
 Now+0.080
tub+0.078
Neg logit contributions
 unexpected-0.054
 reassure-0.050
ck-0.050
 fright-0.050
 worry-0.049
Tokentr
Token ID2213
Feature activation-9.24e-02

Pos logit contributions
angers+0.099
 Now+0.099
 These+0.097
robe+0.096
 Cookies+0.092
Neg logit contributions
ck-0.138
 fright-0.135
 unexpected-0.132
 reassure-0.131
 closer-0.128
Tokenay
Token ID323
Feature activation-4.37e-02

Pos logit contributions
robe+0.155
tub+0.138
 living+0.132
 sky+0.132
 vitamins+0.129
Neg logit contributions
ck-0.153
 fright-0.143
 unexpected-0.139
 reassure-0.135
 closer-0.134
Token was
Token ID373
Feature activation-2.97e-03

Pos logit contributions
robe+0.065
tub+0.058
angers+0.055
 Now+0.052
 sky+0.051
Neg logit contributions
ck-0.097
 reassure-0.089
 unexpected-0.086
 fright-0.085
 swat-0.084
Token not
Token ID407
Feature activation+2.99e-02

Pos logit contributions
robe+0.006
ase+0.005
 Now+0.005
tub+0.005
angers+0.005
Neg logit contributions
ck-0.005
 reassure-0.005
 unexpected-0.005
 fright-0.005
 swat-0.005
Token a
Token ID257
Feature activation-2.37e-02

Pos logit contributions
ck+0.052
 reassure+0.049
 unexpected+0.049
 fright+0.045
 swat+0.045
Neg logit contributions
robe-0.062
ase-0.056
 Now-0.054
tub-0.053
 sky-0.052
Token toy
Token ID13373
Feature activation-4.51e-02

Pos logit contributions
robe+0.046
ase+0.041
 Now+0.040
angers+0.040
tub+0.039
Neg logit contributions
 unexpected-0.042
 reassure-0.041
 fright-0.041
ck-0.040
 swat-0.039
Token He
Token ID679
Feature activation-4.62e-02

Pos logit contributions
robe+0.065
tub+0.055
ase+0.054
angers+0.052
 sky+0.050
Neg logit contributions
 reassure-0.082
 fright-0.078
 worry-0.078
 intervene-0.076
 swat-0.075
Token was
Token ID373
Feature activation-1.33e-02

Pos logit contributions
robe+0.077
angers+0.065
tub+0.064
ase+0.064
unes+0.060
Neg logit contributions
ck-0.102
 reassure-0.095
 fright-0.093
 unexpected-0.093
 worry-0.090
Token here
Token ID994
Feature activation-8.99e-03

Pos logit contributions
robe+0.029
ase+0.025
tub+0.024
 Now+0.024
angers+0.023
Neg logit contributions
ck-0.023
 reassure-0.022
 unexpected-0.021
 fright-0.021
 swat-0.020
Token a
Token ID257
Feature activation-2.19e-03

Pos logit contributions
robe+0.015
ase+0.013
tub+0.012
 Now+0.012
 sky+0.012
Neg logit contributions
 reassure-0.019
ck-0.018
 unexpected-0.017
 fright-0.017
 swat-0.017
Token minute
Token ID5664
Feature activation+2.69e-03

Pos logit contributions
robe+0.004
tub+0.003
 has+0.003
 sky+0.003
ase+0.003
Neg logit contributions
ck-0.004
 reassure-0.004
 swat-0.004
 intervene-0.004
 unexpected-0.004
Token ago
Token ID2084
Feature activation-9.23e-02

Pos logit contributions
 reassure+0.005
ck+0.005
 worry+0.004
 unexpected+0.004
 fright+0.004
Neg logit contributions
robe-0.005
ase-0.005
angers-0.004
 Now-0.004
unes-0.004
Token.
Token ID13
Feature activation-3.67e-02

Pos logit contributions
robe+0.179
 Now+0.154
angers+0.152
ase+0.151
tub+0.151
Neg logit contributions
 reassure-0.150
ck-0.145
 fright-0.136
 intervene-0.133
 worry-0.133
Token He
Token ID679
Feature activation-3.86e-02

Pos logit contributions
robe+0.052
ase+0.045
tub+0.041
angers+0.040
bows+0.040
Neg logit contributions
 reassure-0.083
 fright-0.080
 intervene-0.076
ck-0.076
 cautiously-0.074
Token was
Token ID373
Feature activation-8.50e-03

Pos logit contributions
robe+0.066
ase+0.055
tub+0.054
angers+0.054
unes+0.054
Neg logit contributions
ck-0.082
 reassure-0.079
 unexpected-0.076
 fright-0.074
 swat-0.072
Token chasing
Token ID20023
Feature activation+1.35e-02

Pos logit contributions
robe+0.018
ase+0.016
tub+0.015
 Now+0.014
unes+0.014
Neg logit contributions
ck-0.015
 reassure-0.014
 unexpected-0.014
 fright-0.014
 swat-0.013
Token the
Token ID262
Feature activation-1.07e-02

Pos logit contributions
ck+0.026
 reassure+0.026
 intervene+0.024
 swat+0.024
 unexpected+0.024
Neg logit contributions
robe-0.024
 Now-0.020
 He-0.020
 sky-0.020
 has-0.020
Token our
Token ID674
Feature activation-2.84e-02

Pos logit contributions
robe+0.001
 sky+0.001
 has+0.001
 Now+0.001
tub+0.001
Neg logit contributions
ck-0.001
 reassure-0.001
 swat-0.001
 unexpected-0.001
 intervene-0.001
Token country
Token ID1499
Feature activation-3.68e-02

Pos logit contributions
robe+0.050
ase+0.042
tub+0.042
 has+0.041
 Now+0.041
Neg logit contributions
ck-0.056
 reassure-0.056
 unexpected-0.052
 swat-0.052
 fright-0.052
Token a
Token ID257
Feature activation-8.17e-03

Pos logit contributions
robe+0.065
unes+0.056
angers+0.055
 Now+0.055
ase+0.054
Neg logit contributions
 reassure-0.069
 unexpected-0.067
ck-0.066
 fright-0.065
 intervene-0.063
Token long
Token ID890
Feature activation-8.50e-03

Pos logit contributions
robe+0.017
tub+0.014
ase+0.014
 has+0.014
 Now+0.014
Neg logit contributions
ck-0.014
 reassure-0.014
 unexpected-0.013
 swat-0.013
 fright-0.013
Token time
Token ID640
Feature activation-4.09e-03

Pos logit contributions
robe+0.017
tub+0.015
unes+0.015
ase+0.015
 Now+0.014
Neg logit contributions
ck-0.015
 reassure-0.014
 fright-0.014
 unexpected-0.014
 swat-0.013
Token ago
Token ID2084
Feature activation-9.20e-02

Pos logit contributions
robe+0.008
ase+0.007
unes+0.006
 Now+0.006
tub+0.006
Neg logit contributions
ck-0.007
 reassure-0.007
 unexpected-0.007
 fright-0.007
 believe-0.007
Token.
Token ID13
Feature activation-2.54e-02

Pos logit contributions
robe+0.178
 Now+0.156
ase+0.151
unes+0.149
angers+0.148
Neg logit contributions
 reassure-0.151
ck-0.149
 fright-0.136
 intervene-0.136
 convince-0.135
Token He
Token ID679
Feature activation-2.97e-02

Pos logit contributions
robe+0.036
ase+0.028
tub+0.028
 has+0.026
unes+0.026
Neg logit contributions
 reassure-0.056
 fright-0.055
 intervene-0.053
ck-0.053
 unexpected-0.052
Token helped
Token ID4193
Feature activation+5.58e-03

Pos logit contributions
robe+0.048
ase+0.040
tub+0.039
unes+0.038
 Now+0.037
Neg logit contributions
ck-0.065
 reassure-0.063
 unexpected-0.060
 fright-0.059
 swat-0.059
Token many
Token ID867
Feature activation+1.40e-02

Pos logit contributions
ck+0.010
 reassure+0.010
 unexpected+0.010
 swat+0.010
 fright+0.010
Neg logit contributions
robe-0.010
tub-0.009
 sky-0.008
 Now-0.008
ase-0.008
Token people
Token ID661
Feature activation+3.10e-02

Pos logit contributions
ck+0.029
 reassure+0.029
 intervene+0.027
 swat+0.027
 fright+0.026
Neg logit contributions
robe-0.021
 has-0.019
 sky-0.019
 Now-0.018
 He-0.018
Tokenmy
Token ID1820
Feature activation+1.22e-02

Pos logit contributions
ck+0.040
 reassure+0.039
 unexpected+0.037
 swat+0.036
 fright+0.036
Neg logit contributions
robe-0.031
tub-0.026
ase-0.024
 Now-0.024
angers-0.024
Token had
Token ID550
Feature activation-3.56e-02

Pos logit contributions
ck+0.025
 reassure+0.025
 unexpected+0.023
 swat+0.022
 worry+0.022
Neg logit contributions
robe-0.022
tub-0.018
ase-0.018
 Now-0.018
angers-0.018
Token told
Token ID1297
Feature activation-1.89e-02

Pos logit contributions
robe+0.068
ase+0.059
tub+0.058
 Now+0.056
 sky+0.054
Neg logit contributions
ck-0.067
 reassure-0.066
 unexpected-0.063
 fright-0.060
 intervene-0.059
Token him
Token ID683
Feature activation-2.33e-03

Pos logit contributions
robe+0.040
tub+0.034
ase+0.034
 has+0.034
 Now+0.033
Neg logit contributions
ck-0.031
 reassure-0.031
 unexpected-0.029
 fright-0.028
 swat-0.028
Token long
Token ID890
Feature activation-6.82e-03

Pos logit contributions
robe+0.004
ase+0.004
tub+0.003
 sky+0.003
 Now+0.003
Neg logit contributions
 reassure-0.004
ck-0.004
 unexpected-0.004
 fright-0.004
 intervene-0.004
Token ago
Token ID2084
Feature activation-9.12e-02

Pos logit contributions
robe+0.013
ase+0.011
tub+0.011
 Now+0.011
angers+0.010
Neg logit contributions
 reassure-0.012
ck-0.012
 fright-0.012
 unexpected-0.012
 swat-0.011
Token.
Token ID13
Feature activation-1.51e-02

Pos logit contributions
robe+0.164
 Now+0.146
ase+0.142
tub+0.139
unes+0.137
Neg logit contributions
 reassure-0.163
ck-0.157
 fright-0.149
 intervene-0.147
 convince-0.147
Token The
Token ID383
Feature activation-1.20e-02

Pos logit contributions
robe+0.022
ase+0.018
 has+0.017
tub+0.017
bows+0.017
Neg logit contributions
 reassure-0.032
 intervene-0.031
 fright-0.031
 cautiously-0.031
ck-0.030
Token bird
Token ID6512
Feature activation+6.31e-03

Pos logit contributions
robe+0.021
tub+0.017
 sky+0.017
 has+0.017
ase+0.017
Neg logit contributions
ck-0.024
 reassure-0.024
 unexpected-0.022
 swat-0.022
 fright-0.022
Token spread
Token ID4104
Feature activation-1.26e-02

Pos logit contributions
ck+0.013
 reassure+0.013
 unexpected+0.012
 swat+0.012
 fright+0.012
Neg logit contributions
robe-0.011
ase-0.009
tub-0.009
 has-0.009
 Now-0.009
Token out
Token ID503
Feature activation+1.23e-02

Pos logit contributions
robe+0.025
ase+0.021
tub+0.021
 has+0.021
 Now+0.021
Neg logit contributions
ck-0.022
 reassure-0.022
 unexpected-0.021
 fright-0.020
 swat-0.020
Token born
Token ID4642
Feature activation+4.00e-03

Pos logit contributions
robe+0.017
ase+0.015
tub+0.015
 Now+0.014
 has+0.014
Neg logit contributions
ck-0.017
 reassure-0.017
 unexpected-0.016
 swat-0.015
 fright-0.015
Token only
Token ID691
Feature activation-3.98e-03

Pos logit contributions
 reassure+0.008
ck+0.007
 unexpected+0.007
 fright+0.007
 swat+0.007
Neg logit contributions
robe-0.007
ase-0.006
 Now-0.006
tub-0.006
 has-0.006
Token a
Token ID257
Feature activation-1.14e-02

Pos logit contributions
robe+0.008
ase+0.007
tub+0.007
angers+0.007
 Now+0.007
Neg logit contributions
 reassure-0.007
ck-0.007
 unexpected-0.007
 fright-0.007
 swat-0.007
Token few
Token ID1178
Feature activation-2.26e-03

Pos logit contributions
robe+0.024
ase+0.020
tub+0.020
 Now+0.020
 has+0.020
Neg logit contributions
ck-0.019
 reassure-0.019
 unexpected-0.018
 fright-0.017
 swat-0.017
Token days
Token ID1528
Feature activation+3.46e-03

Pos logit contributions
robe+0.005
tub+0.004
ase+0.004
 has+0.004
 Now+0.004
Neg logit contributions
ck-0.004
 reassure-0.004
 unexpected-0.004
 swat-0.003
 fright-0.003
Token ago
Token ID2084
Feature activation-9.09e-02

Pos logit contributions
ck+0.006
 reassure+0.006
 unexpected+0.006
 swat+0.006
 convince+0.005
Neg logit contributions
robe-0.007
ase-0.006
tub-0.005
 Now-0.005
 Cookies-0.005
Token.
Token ID13
Feature activation-3.02e-02

Pos logit contributions
robe+0.177
 Now+0.158
ase+0.154
tub+0.149
angers+0.148
Neg logit contributions
 reassure-0.149
ck-0.143
 fright-0.134
 worry-0.134
 intervene-0.133
Token The
Token ID383
Feature activation-2.30e-02

Pos logit contributions
robe+0.045
ase+0.035
bows+0.035
tub+0.034
 has+0.034
Neg logit contributions
 reassure-0.064
 fright-0.063
 intervene-0.062
 cautiously-0.060
ck-0.060
Token deer
Token ID20096
Feature activation-3.45e-03

Pos logit contributions
robe+0.041
ase+0.035
tub+0.035
 Now+0.034
 has+0.034
Neg logit contributions
 reassure-0.044
ck-0.043
 unexpected-0.042
 fright-0.041
 swat-0.041
Token felt
Token ID2936
Feature activation-3.17e-03

Pos logit contributions
robe+0.006
tub+0.004
ase+0.004
 Now+0.004
unes+0.004
Neg logit contributions
ck-0.008
 reassure-0.007
 unexpected-0.007
 swat-0.007
 fright-0.007
Token sorry
Token ID7926
Feature activation-2.00e-03

Pos logit contributions
robe+0.007
ase+0.006
tub+0.006
 has+0.005
 Now+0.005
Neg logit contributions
ck-0.005
 reassure-0.005
 unexpected-0.005
 fright-0.005
 swat-0.005
Token family
Token ID1641
Feature activation-8.58e-03

Pos logit contributions
robe+0.036
ase+0.033
tub+0.031
 Now+0.030
unes+0.030
Neg logit contributions
ck-0.032
 reassure-0.031
 swat-0.029
 unexpected-0.029
 fright-0.029
Token say
Token ID910
Feature activation-4.01e-02

Pos logit contributions
robe+0.015
angers+0.013
ase+0.013
unes+0.013
tub+0.013
Neg logit contributions
ck-0.016
 reassure-0.016
 unexpected-0.015
 fright-0.015
 intervene-0.015
Token how
Token ID703
Feature activation-3.12e-02

Pos logit contributions
robe+0.074
ase+0.063
tub+0.063
 sky+0.059
 has+0.059
Neg logit contributions
ck-0.077
 reassure-0.076
 unexpected-0.072
 fright-0.071
 intervene-0.071
Token beautiful
Token ID4950
Feature activation-2.64e-02

Pos logit contributions
robe+0.058
ase+0.050
tub+0.048
angers+0.045
 sky+0.045
Neg logit contributions
ck-0.060
 reassure-0.059
 unexpected-0.058
 fright-0.056
 swat-0.055
Token the
Token ID262
Feature activation-2.16e-02

Pos logit contributions
robe+0.054
angers+0.046
ase+0.045
tub+0.044
unes+0.043
Neg logit contributions
ck-0.046
 unexpected-0.045
 reassure-0.044
 fright-0.042
 swat-0.042
Token drawer
Token ID33451
Feature activation-9.07e-02

Pos logit contributions
robe+0.037
ase+0.031
tub+0.031
 Now+0.030
 has+0.030
Neg logit contributions
ck-0.043
 reassure-0.043
 unexpected-0.041
 fright-0.040
 swat-0.040
Token looked
Token ID3114
Feature activation-4.69e-02

Pos logit contributions
robe+0.137
 Now+0.129
angers+0.127
 Cookies+0.123
unes+0.122
Neg logit contributions
 unexpected-0.172
ck-0.169
 reassure-0.165
 believe-0.160
 closer-0.156
Token!
Token ID0
Feature activation-2.70e-02

Pos logit contributions
robe+0.092
ase+0.085
 Now+0.080
angers+0.076
tub+0.075
Neg logit contributions
 unexpected-0.085
ck-0.082
 reassure-0.079
 closer-0.078
 fright-0.075
Token Grac
Token ID38099
Feature activation+1.37e-02

Pos logit contributions
robe+0.042
tub+0.033
ase+0.032
 has+0.031
angers+0.031
Neg logit contributions
 reassure-0.055
 intervene-0.053
ck-0.053
 fright-0.053
 cautiously-0.051
Tokenie
Token ID494
Feature activation-6.12e-03

Pos logit contributions
ck+0.015
 reassure+0.014
 unexpected+0.012
 swat+0.012
 intervene+0.012
Neg logit contributions
robe-0.037
tub-0.033
 sky-0.033
ase-0.033
 Now-0.033
Token smiled
Token ID13541
Feature activation-2.89e-02

Pos logit contributions
robe+0.011
tub+0.009
unes+0.009
angers+0.009
ase+0.009
Neg logit contributions
ck-0.013
 reassure-0.012
 swat-0.012
 unexpected-0.011
 fright-0.011
Token the
Token ID262
Feature activation-2.14e-02

Pos logit contributions
robe+0.016
ase+0.014
 Now+0.014
tub+0.014
unes+0.013
Neg logit contributions
 reassure-0.015
ck-0.015
 unexpected-0.014
 fright-0.014
 swat-0.014
Token print
Token ID3601
Feature activation-6.61e-02

Pos logit contributions
robe+0.038
 Now+0.033
ase+0.033
tub+0.032
 has+0.031
Neg logit contributions
 reassure-0.040
ck-0.040
 unexpected-0.039
 fright-0.038
 swat-0.038
Token in
Token ID287
Feature activation+4.17e-03

Pos logit contributions
robe+0.100
angers+0.091
 Now+0.090
unes+0.088
 Cookies+0.085
Neg logit contributions
 unexpected-0.135
ck-0.135
 reassure-0.133
 fright-0.130
 intervene-0.124
Token his
Token ID465
Feature activation-3.77e-02

Pos logit contributions
ck+0.007
 reassure+0.007
 unexpected+0.007
 fright+0.006
 swat+0.006
Neg logit contributions
robe-0.009
ase-0.008
tub-0.007
 Now-0.007
 has-0.007
Token new
Token ID649
Feature activation-2.22e-02

Pos logit contributions
robe+0.066
ase+0.055
tub+0.055
 Now+0.054
unes+0.053
Neg logit contributions
ck-0.074
 reassure-0.072
 unexpected-0.072
 fright-0.070
 swat-0.068
Token coloring
Token ID33988
Feature activation-9.06e-02

Pos logit contributions
robe+0.038
ase+0.031
 Now+0.031
tub+0.031
unes+0.030
Neg logit contributions
 reassure-0.045
ck-0.044
 unexpected-0.043
 fright-0.042
 swat-0.042
Token book
Token ID1492
Feature activation-5.93e-02

Pos logit contributions
 Now+0.137
robe+0.130
unes+0.128
 These+0.126
 Cookies+0.124
Neg logit contributions
 unexpected-0.168
ck-0.159
 reassure-0.157
 fright-0.153
 believe-0.147
Token was
Token ID373
Feature activation-6.79e-03

Pos logit contributions
robe+0.095
 Now+0.085
unes+0.081
angers+0.081
ase+0.080
Neg logit contributions
ck-0.115
 unexpected-0.114
 reassure-0.113
 fright-0.109
 intervene-0.104
Token so
Token ID523
Feature activation+1.36e-02

Pos logit contributions
robe+0.014
ase+0.012
 Now+0.012
tub+0.011
angers+0.011
Neg logit contributions
ck-0.012
 unexpected-0.012
 reassure-0.012
 fright-0.011
 swat-0.011
Token clear
Token ID1598
Feature activation-9.11e-03

Pos logit contributions
ck+0.020
 unexpected+0.019
 reassure+0.019
 fright+0.017
 swat+0.017
Neg logit contributions
robe-0.032
ase-0.029
tub-0.028
 Now-0.027
angers-0.027
Token and
Token ID290
Feature activation+1.57e-02

Pos logit contributions
robe+0.018
ase+0.016
angers+0.016
 Cookies+0.015
tub+0.015
Neg logit contributions
 unexpected-0.015
ck-0.014
 closer-0.014
 fright-0.014
 reassure-0.014
Token said
Token ID531
Feature activation-3.10e-02

Pos logit contributions
ck+0.027
 reassure+0.026
 swat+0.023
 unexpected+0.023
 intervene+0.023
Neg logit contributions
robe-0.032
ase-0.027
tub-0.027
 sky-0.026
 Now-0.026
Token.
Token ID13
Feature activation-3.32e-02

Pos logit contributions
robe+0.046
ase+0.037
tub+0.037
 has+0.036
 Now+0.036
Neg logit contributions
ck-0.070
 reassure-0.069
 unexpected-0.065
 swat-0.065
 fright-0.065
Token He
Token ID679
Feature activation-4.47e-02

Pos logit contributions
robe+0.047
owers+0.045
atmeal+0.041
unes+0.040
ase+0.040
Neg logit contributions
 closer-0.068
 fright-0.068
 reassure-0.067
 quick-0.066
 louder-0.065
Token drank
Token ID24070
Feature activation-3.68e-02

Pos logit contributions
robe+0.089
unes+0.077
ase+0.076
tub+0.075
 sky+0.074
Neg logit contributions
ck-0.082
 reassure-0.077
 swat-0.072
 unexpected-0.070
 convince-0.069
Token his
Token ID465
Feature activation-5.45e-02

Pos logit contributions
robe+0.083
ase+0.074
angers+0.073
tub+0.071
 Now+0.070
Neg logit contributions
ck-0.056
 closer-0.051
 fright-0.050
 reassure-0.049
 unexpected-0.048
Token tea
Token ID8887
Feature activation-9.05e-02

Pos logit contributions
robe+0.085
ase+0.078
 He+0.075
 Now+0.074
 sky+0.072
Neg logit contributions
ck-0.111
 reassure-0.103
 unexpected-0.099
 swat-0.098
 fright-0.097
Token and
Token ID290
Feature activation-3.56e-03

Pos logit contributions
 Now+0.141
robe+0.138
 has+0.132
 There+0.131
 These+0.129
Neg logit contributions
ck-0.165
 reassure-0.161
 convince-0.153
 tighter-0.148
 closer-0.146
Token smiled
Token ID13541
Feature activation-2.41e-02

Pos logit contributions
robe+0.007
ase+0.006
unes+0.006
angers+0.006
tub+0.006
Neg logit contributions
ck-0.007
 reassure-0.006
 swat-0.006
 unexpected-0.006
 fright-0.006
Token.
Token ID13
Feature activation-3.54e-02

Pos logit contributions
robe+0.040
ase+0.035
tub+0.033
 Now+0.033
 has+0.032
Neg logit contributions
ck-0.050
 reassure-0.050
 unexpected-0.047
 fright-0.046
 swat-0.046
Token Lily
Token ID20037
Feature activation-1.31e-02

Pos logit contributions
owers+0.058
robe+0.058
ase+0.053
atmeal+0.052
fl+0.051
Neg logit contributions
 quick-0.059
 closer-0.057
 louder-0.057
 fright-0.057
 reassure-0.056
Token drank
Token ID24070
Feature activation-3.60e-02

Pos logit contributions
robe+0.026
unes+0.022
ase+0.022
tub+0.022
angers+0.022
Neg logit contributions
ck-0.023
 reassure-0.022
 swat-0.021
 convince-0.020
 fright-0.020
Token story
Token ID1621
Feature activation-2.59e-02

Pos logit contributions
robe+0.034
 sky+0.030
 has+0.030
tub+0.029
 Now+0.028
Neg logit contributions
ck-0.047
 reassure-0.044
 intervene-0.041
 swat-0.041
 unexpected-0.040
Token.
Token ID13
Feature activation-3.50e-02

Pos logit contributions
robe+0.045
tub+0.037
 Now+0.037
ase+0.037
unes+0.036
Neg logit contributions
 reassure-0.050
ck-0.050
 unexpected-0.047
 fright-0.046
 intervene-0.045
Token A
Token ID317
Feature activation-8.75e-03

Pos logit contributions
robe+0.048
tub+0.039
ase+0.037
 sky+0.034
angers+0.034
Neg logit contributions
 reassure-0.083
ck-0.080
 fright-0.078
 intervene-0.078
 swat-0.077
Token long
Token ID890
Feature activation-6.79e-03

Pos logit contributions
robe+0.016
 has+0.013
tub+0.013
 sky+0.013
ase+0.013
Neg logit contributions
ck-0.017
 reassure-0.017
 intervene-0.016
 swat-0.016
 unexpected-0.016
Token time
Token ID640
Feature activation+8.39e-04

Pos logit contributions
robe+0.013
tub+0.011
unes+0.011
 Now+0.010
angers+0.010
Neg logit contributions
ck-0.013
 reassure-0.013
 fright-0.012
 swat-0.012
 unexpected-0.012
Token ago
Token ID2084
Feature activation-8.99e-02

Pos logit contributions
 reassure+0.002
ck+0.002
 unexpected+0.001
 fright+0.001
 believe+0.001
Neg logit contributions
robe-0.002
ase-0.001
tub-0.001
unes-0.001
angers-0.001
Token,
Token ID11
Feature activation+7.88e-03

Pos logit contributions
robe+0.181
angers+0.153
tub+0.152
ase+0.152
unes+0.151
Neg logit contributions
 reassure-0.142
ck-0.138
 intervene-0.126
 fright-0.126
 worry-0.124
Token there
Token ID612
Feature activation-4.46e-02

Pos logit contributions
 reassure+0.014
ck+0.014
 unexpected+0.013
 swat+0.013
 fright+0.012
Neg logit contributions
robe-0.016
ase-0.015
 Now-0.013
tub-0.013
unes-0.013
Token was
Token ID373
Feature activation+1.87e-02

Pos logit contributions
robe+0.057
angers+0.043
ase+0.042
tub+0.041
unes+0.040
Neg logit contributions
 reassure-0.109
ck-0.108
 unexpected-0.106
 intervene-0.102
 swat-0.101
Token an
Token ID281
Feature activation+2.16e-02

Pos logit contributions
ck+0.037
 reassure+0.037
 swat+0.034
 fright+0.034
 intervene+0.034
Neg logit contributions
robe-0.033
tub-0.028
 has-0.028
 sky-0.028
 Now-0.028
Token adventurous
Token ID39770
Feature activation+6.24e-03

Pos logit contributions
ck+0.045
 reassure+0.044
 fright+0.044
 intervene+0.043
 swat+0.041
Neg logit contributions
robe-0.031
 sky-0.029
 Now-0.027
 has-0.027
ase-0.027
Token,
Token ID11
Feature activation-1.15e-02

Pos logit contributions
robe+0.092
 Now+0.082
unes+0.081
tub+0.079
 Cookies+0.076
Neg logit contributions
 reassure-0.115
ck-0.115
 unexpected-0.107
 fright-0.106
 convince-0.105
Token next
Token ID1306
Feature activation-2.22e-02

Pos logit contributions
robe+0.022
ase+0.020
tub+0.019
unes+0.019
 Now+0.018
Neg logit contributions
ck-0.021
 reassure-0.021
 swat-0.020
 fright-0.019
 convince-0.019
Token to
Token ID284
Feature activation-1.31e-03

Pos logit contributions
robe+0.041
tub+0.034
ase+0.033
unes+0.033
angers+0.032
Neg logit contributions
 reassure-0.041
 fright-0.041
ck-0.041
 unexpected-0.041
 closer-0.038
Token her
Token ID607
Feature activation-1.63e-02

Pos logit contributions
robe+0.003
tub+0.003
ase+0.003
 sky+0.002
 has+0.002
Neg logit contributions
ck-0.002
 reassure-0.002
 unexpected-0.002
 swat-0.002
 fright-0.002
Token magn
Token ID7842
Feature activation-1.76e-02

Pos logit contributions
robe+0.030
ase+0.025
tub+0.025
 Now+0.025
 has+0.025
Neg logit contributions
ck-0.031
 reassure-0.031
 unexpected-0.029
 swat-0.029
 fright-0.029
Tokenifying
Token ID4035
Feature activation-8.98e-02

Pos logit contributions
robe+0.027
 Now+0.023
tub+0.023
 has+0.022
ase+0.022
Neg logit contributions
ck-0.038
 reassure-0.037
 fright-0.035
 unexpected-0.035
 swat-0.035
Token glass
Token ID5405
Feature activation-4.19e-02

Pos logit contributions
robe+0.150
angers+0.129
 has+0.123
 Now+0.122
unes+0.121
Neg logit contributions
 fright-0.169
 unexpected-0.165
 reassure-0.162
ck-0.155
 intervene-0.150
Token.
Token ID13
Feature activation-3.36e-02

Pos logit contributions
robe+0.061
 Now+0.060
 These+0.055
 has+0.053
 He+0.053
Neg logit contributions
ck-0.081
 reassure-0.080
 unexpected-0.078
 fright-0.078
 swat-0.075
Token She
Token ID1375
Feature activation-4.20e-02

Pos logit contributions
robe+0.049
tub+0.043
ase+0.042
unes+0.040
rob+0.039
Neg logit contributions
 reassure-0.070
 cautiously-0.068
 fright-0.067
 closer-0.064
 swat-0.064
Token looked
Token ID3114
Feature activation-3.29e-02

Pos logit contributions
robe+0.082
ase+0.071
tub+0.071
unes+0.069
 Now+0.067
Neg logit contributions
ck-0.078
 reassure-0.076
 unexpected-0.070
 swat-0.068
 fright-0.068
Token at
Token ID379
Feature activation-1.29e-02

Pos logit contributions
robe+0.054
ase+0.045
tub+0.044
 Now+0.043
 sky+0.042
Neg logit contributions
ck-0.070
 reassure-0.069
 unexpected-0.066
 fright-0.065
 swat-0.065
Token She
Token ID1375
Feature activation-4.76e-02

Pos logit contributions
robe+0.061
ase+0.049
tub+0.047
atmeal+0.043
 has+0.042
Neg logit contributions
 reassure-0.113
 swat-0.108
 fright-0.108
ck-0.107
 closer-0.106
Token takes
Token ID2753
Feature activation-4.92e-04

Pos logit contributions
robe+0.093
tub+0.081
ase+0.080
unes+0.074
angers+0.072
Neg logit contributions
ck-0.092
 reassure-0.091
 fright-0.084
 unexpected-0.083
 cautiously-0.083
Token out
Token ID503
Feature activation+1.37e-02

Pos logit contributions
robe+0.001
 Now+0.001
 sky+0.001
ase+0.001
 has+0.001
Neg logit contributions
 reassure-0.001
ck-0.001
 unexpected-0.001
 swat-0.001
 intervene-0.001
Token a
Token ID257
Feature activation-1.26e-02

Pos logit contributions
ck+0.027
 reassure+0.027
 swat+0.025
 intervene+0.025
 unexpected+0.025
Neg logit contributions
robe-0.024
tub-0.020
 sky-0.020
 has-0.020
ase-0.020
Token magn
Token ID7842
Feature activation-1.84e-02

Pos logit contributions
robe+0.022
 has+0.019
 sky+0.018
tub+0.018
ase+0.018
Neg logit contributions
ck-0.026
 reassure-0.025
 swat-0.023
 intervene-0.023
 unexpected-0.023
Tokenifying
Token ID4035
Feature activation-8.95e-02

Pos logit contributions
robe+0.030
tub+0.024
 Now+0.024
ase+0.024
 has+0.024
Neg logit contributions
ck-0.039
 reassure-0.038
 fright-0.036
 unexpected-0.036
 swat-0.036
Token glass
Token ID5405
Feature activation-4.28e-02

Pos logit contributions
robe+0.157
angers+0.135
unes+0.125
 Now+0.123
 has+0.122
Neg logit contributions
 fright-0.170
 unexpected-0.167
 reassure-0.161
ck-0.160
 fear-0.150
Token.
Token ID13
Feature activation-4.35e-02

Pos logit contributions
robe+0.074
 Now+0.068
angers+0.065
unes+0.063
 These+0.063
Neg logit contributions
ck-0.073
 closer-0.073
 fright-0.071
 unexpected-0.070
 reassure-0.069
Token She
Token ID1375
Feature activation-4.50e-02

Pos logit contributions
robe+0.057
ase+0.047
 has+0.044
tub+0.043
atmeal+0.039
Neg logit contributions
 reassure-0.095
 cautiously-0.093
 fright-0.092
ck-0.091
 swat-0.091
Token looks
Token ID3073
Feature activation-2.70e-02

Pos logit contributions
robe+0.089
tub+0.076
ase+0.075
unes+0.071
angers+0.067
Neg logit contributions
ck-0.087
 reassure-0.084
 unexpected-0.080
 fright-0.078
 cautiously-0.078
Token at
Token ID379
Feature activation-1.39e-02

Pos logit contributions
robe+0.047
ase+0.040
tub+0.039
 Now+0.038
 sky+0.037
Neg logit contributions
ck-0.055
 reassure-0.054
 unexpected-0.052
 fright-0.051
 swat-0.050
Token bought
Token ID5839
Feature activation-2.08e-02

Pos logit contributions
 reassure+0.032
ck+0.032
 convince+0.029
 intervene+0.028
 swat+0.028
Neg logit contributions
robe-0.039
tub-0.034
ase-0.033
unes-0.032
 Now-0.032
Token her
Token ID607
Feature activation-2.54e-02

Pos logit contributions
robe+0.045
tub+0.039
ase+0.039
 has+0.037
 Now+0.037
Neg logit contributions
ck-0.034
 reassure-0.033
 unexpected-0.031
 fright-0.031
 swat-0.030
Token a
Token ID257
Feature activation-1.54e-02

Pos logit contributions
robe+0.049
ase+0.041
tub+0.041
 Now+0.041
 has+0.040
Neg logit contributions
ck-0.046
 reassure-0.046
 unexpected-0.043
 swat-0.042
 fright-0.042
Token brand
Token ID4508
Feature activation-6.77e-02

Pos logit contributions
robe+0.026
tub+0.022
 has+0.022
 sky+0.022
ase+0.021
Neg logit contributions
ck-0.033
 reassure-0.031
 swat-0.029
 intervene-0.029
 fright-0.029
Token new
Token ID649
Feature activation-2.26e-02

Pos logit contributions
robe+0.134
tub+0.114
ase+0.113
 has+0.113
 Now+0.112
Neg logit contributions
ck-0.118
 reassure-0.117
 unexpected-0.110
 fright-0.107
 swat-0.107
Token coloring
Token ID33988
Feature activation-8.95e-02

Pos logit contributions
robe+0.037
ase+0.031
 Now+0.031
tub+0.030
 has+0.030
Neg logit contributions
 reassure-0.047
ck-0.046
 unexpected-0.044
 fright-0.044
 swat-0.044
Token book
Token ID1492
Feature activation-5.56e-02

Pos logit contributions
 Now+0.149
unes+0.146
robe+0.145
 Cookies+0.145
 There+0.142
Neg logit contributions
 unexpected-0.138
 reassure-0.130
ck-0.129
 closer-0.127
 believe-0.124
Token.
Token ID13
Feature activation-1.91e-02

Pos logit contributions
robe+0.099
 Now+0.093
unes+0.088
tub+0.087
ase+0.086
Neg logit contributions
 unexpected-0.092
 reassure-0.091
ck-0.088
 fright-0.084
 closer-0.084
Token She
Token ID1375
Feature activation-3.67e-02

Pos logit contributions
robe+0.031
tub+0.025
ase+0.024
unes+0.023
angers+0.023
Neg logit contributions
 reassure-0.041
 fright-0.039
ck-0.039
 intervene-0.038
 unexpected-0.038
Token grabbed
Token ID13176
Feature activation-1.64e-02

Pos logit contributions
robe+0.068
tub+0.057
ase+0.057
unes+0.055
 sky+0.055
Neg logit contributions
ck-0.072
 reassure-0.070
 unexpected-0.066
 swat-0.065
 fright-0.064
Token her
Token ID607
Feature activation-4.61e-02

Pos logit contributions
robe+0.030
ase+0.025
tub+0.025
 has+0.025
 Now+0.025
Neg logit contributions
ck-0.031
 reassure-0.031
 unexpected-0.029
 swat-0.029
 fright-0.028
Token,
Token ID11
Feature activation-1.61e-02

Pos logit contributions
robe+0.052
ase+0.042
tub+0.042
 sky+0.041
 Now+0.041
Neg logit contributions
ck-0.073
 reassure-0.072
 swat-0.067
 unexpected-0.067
 fright-0.067
Token handing
Token ID22786
Feature activation-1.54e-02

Pos logit contributions
robe+0.019
ase+0.015
tub+0.014
unes+0.014
ests+0.013
Neg logit contributions
 reassure-0.042
 swat-0.041
ck-0.041
 fright-0.040
 intervene-0.039
Token him
Token ID683
Feature activation-1.19e-03

Pos logit contributions
robe+0.027
 Now+0.023
 sky+0.022
ase+0.022
 has+0.022
Neg logit contributions
ck-0.031
 reassure-0.031
 unexpected-0.029
 swat-0.028
 fright-0.028
Token the
Token ID262
Feature activation-3.30e-02

Pos logit contributions
robe+0.002
 has+0.002
 He+0.002
ase+0.002
 sky+0.002
Neg logit contributions
 reassure-0.002
ck-0.002
 unexpected-0.002
 swat-0.002
 intervene-0.002
Token magn
Token ID7842
Feature activation-2.26e-02

Pos logit contributions
robe+0.058
ase+0.049
 Now+0.049
tub+0.048
 has+0.047
Neg logit contributions
 reassure-0.065
ck-0.064
 unexpected-0.060
 fright-0.060
 swat-0.060
Tokenifying
Token ID4035
Feature activation-8.95e-02

Pos logit contributions
robe+0.038
 Now+0.033
tub+0.032
 has+0.032
 Cookies+0.031
Neg logit contributions
ck-0.045
 reassure-0.043
 fright-0.042
 unexpected-0.040
 swat-0.040
Token glass
Token ID5405
Feature activation-4.99e-02

Pos logit contributions
robe+0.152
angers+0.129
unes+0.119
tub+0.119
 Now+0.118
Neg logit contributions
 fright-0.173
 unexpected-0.168
ck-0.165
 reassure-0.165
 swat-0.154
Token.
Token ID13
Feature activation-2.61e-02

Pos logit contributions
robe+0.084
 Now+0.080
angers+0.075
 Cookies+0.071
 These+0.071
Neg logit contributions
ck-0.091
 fright-0.088
 reassure-0.085
 closer-0.085
 unexpected-0.084
Token She
Token ID1375
Feature activation-4.64e-02

Pos logit contributions
robe+0.041
ase+0.033
tub+0.033
ates+0.029
 has+0.029
Neg logit contributions
 reassure-0.051
 fright-0.050
 intervene-0.049
 swat-0.049
 closer-0.047
Token looked
Token ID3114
Feature activation-2.78e-02

Pos logit contributions
robe+0.084
ase+0.070
tub+0.070
 Now+0.069
 sky+0.069
Neg logit contributions
ck-0.091
 reassure-0.089
 unexpected-0.083
 swat-0.083
 fright-0.082
Token around
Token ID1088
Feature activation+2.58e-06

Pos logit contributions
robe+0.042
tub+0.034
ase+0.034
 has+0.034
 Now+0.034
Neg logit contributions
ck-0.062
 reassure-0.062
 unexpected-0.058
 swat-0.058
 fright-0.057
Tokentr
Token ID2213
Feature activation-8.84e-02

Pos logit contributions
 Now+0.069
 Cookies+0.063
robe+0.061
 These+0.059
angers+0.059
Neg logit contributions
 fright-0.083
 unexpected-0.082
 reassure-0.081
 swat-0.081
ck-0.080
Tokenay
Token ID323
Feature activation-2.78e-02

Pos logit contributions
robe+0.145
 has+0.129
 Now+0.129
tub+0.129
 vitamins+0.127
Neg logit contributions
ck-0.143
 fright-0.142
 unexpected-0.138
 reassure-0.137
pr-0.136
Token.
Token ID13
Feature activation-2.04e-02

Pos logit contributions
robe+0.049
tub+0.044
 Now+0.043
 Cookies+0.042
angers+0.042
Neg logit contributions
 reassure-0.046
ck-0.045
 unexpected-0.042
 fright-0.042
 swat-0.042
Token The
Token ID383
Feature activation-1.84e-02

Pos logit contributions
robe+0.022
tub+0.015
ase+0.014
angers+0.014
 has+0.014
Neg logit contributions
 reassure-0.053
 fright-0.053
 unexpected-0.052
ck-0.051
 intervene-0.050
Token ash
Token ID12530
Feature activation-5.82e-02

Pos logit contributions
robe+0.039
ase+0.033
tub+0.033
angers+0.033
 Now+0.032
Neg logit contributions
ck-0.030
 unexpected-0.029
 reassure-0.029
 swat-0.027
 fright-0.027
Tokentr
Token ID2213
Feature activation-8.92e-02

Pos logit contributions
angers+0.084
robe+0.080
 Now+0.078
 These+0.075
tub+0.074
Neg logit contributions
 fright-0.107
ck-0.105
 reassure-0.101
 unexpected-0.099
 worry-0.098
Tokenay
Token ID323
Feature activation-4.02e-02

Pos logit contributions
robe+0.155
tub+0.140
 vitamins+0.135
 living+0.131
 ones+0.129
Neg logit contributions
ck-0.142
 fright-0.137
 unexpected-0.130
 reassure-0.128
pr-0.127
Token lived
Token ID5615
Feature activation-1.01e-02

Pos logit contributions
robe+0.062
tub+0.055
angers+0.054
unes+0.049
 Now+0.047
Neg logit contributions
ck-0.083
 reassure-0.077
 fright-0.074
 unexpected-0.073
 swat-0.072
Token in
Token ID287
Feature activation+1.61e-02

Pos logit contributions
robe+0.015
tub+0.012
ase+0.012
 sky+0.012
 has+0.011
Neg logit contributions
ck-0.024
 reassure-0.023
 unexpected-0.022
 swat-0.022
 intervene-0.022
Token a
Token ID257
Feature activation+1.74e-02

Pos logit contributions
ck+0.026
 reassure+0.026
 unexpected+0.025
 fright+0.025
 swat+0.024
Neg logit contributions
robe-0.034
ase-0.029
tub-0.029
unes-0.029
 has-0.028
Token house
Token ID2156
Feature activation-2.18e-02

Pos logit contributions
ck+0.037
 reassure+0.036
 intervene+0.033
 swat+0.033
 fright+0.033
Neg logit contributions
robe-0.028
 sky-0.024
 has-0.024
tub-0.024
ase-0.024
Token and
Token ID290
Feature activation+2.18e-02

Pos logit contributions
robe+0.001
tub+0.001
ase+0.001
angers+0.001
 Cookies+0.001
Neg logit contributions
ck-0.001
 reassure-0.001
 fright-0.001
 swat-0.001
 closer-0.001
Token fell
Token ID3214
Feature activation+1.53e-03

Pos logit contributions
ck+0.040
 reassure+0.039
 fright+0.036
 swat+0.036
 unexpected+0.036
Neg logit contributions
robe-0.042
ase-0.036
tub-0.036
 Now-0.035
 sky-0.035
Token.
Token ID13
Feature activation-3.06e-02

Pos logit contributions
ck+0.003
 reassure+0.003
 fright+0.003
 swat+0.003
 closer+0.003
Neg logit contributions
robe-0.003
ase-0.003
tub-0.002
 Now-0.002
 has-0.002
Token The
Token ID383
Feature activation-2.15e-02

Pos logit contributions
robe+0.038
ase+0.030
tub+0.030
 has+0.026
unes+0.025
Neg logit contributions
 fright-0.073
 reassure-0.073
 swat-0.071
ck-0.071
 unexpected-0.069
Token ash
Token ID12530
Feature activation-6.33e-02

Pos logit contributions
robe+0.034
 Now+0.030
ase+0.029
tub+0.028
unes+0.028
Neg logit contributions
 reassure-0.043
ck-0.043
 unexpected-0.042
 fright-0.042
 swat-0.041
Tokentr
Token ID2213
Feature activation-8.91e-02

Pos logit contributions
 These+0.118
robe+0.114
 Now+0.114
angers+0.112
 Cookies+0.105
Neg logit contributions
ck-0.099
 fright-0.097
 swat-0.088
 fear-0.087
 closer-0.085
Tokenay
Token ID323
Feature activation-3.52e-02

Pos logit contributions
robe+0.144
tub+0.126
 sky+0.120
 living+0.116
 has+0.116
Neg logit contributions
ck-0.161
 fright-0.154
 unexpected-0.144
 reassure-0.144
 fear-0.139
Token flew
Token ID13112
Feature activation-2.00e-02

Pos logit contributions
robe+0.052
tub+0.047
angers+0.043
 sky+0.040
unes+0.040
Neg logit contributions
ck-0.078
 reassure-0.071
 fright-0.070
 swat-0.069
 unexpected-0.068
Token out
Token ID503
Feature activation+9.07e-03

Pos logit contributions
robe+0.037
ase+0.031
tub+0.031
 Now+0.031
 has+0.030
Neg logit contributions
ck-0.038
 reassure-0.037
 fright-0.035
 unexpected-0.035
 swat-0.035
Token of
Token ID286
Feature activation+1.35e-03

Pos logit contributions
ck+0.020
 reassure+0.019
 unexpected+0.018
 fright+0.018
 swat+0.018
Neg logit contributions
robe-0.015
ase-0.012
tub-0.012
 Now-0.012
 has-0.012
Token her
Token ID607
Feature activation-3.35e-02

Pos logit contributions
ck+0.002
 reassure+0.002
 unexpected+0.002
 fright+0.002
 swat+0.002
Neg logit contributions
robe-0.003
ase-0.003
tub-0.002
 has-0.002
 Now-0.002
TokenB
Token ID33
Feature activation+2.68e-02

Pos logit contributions
ck+0.040
 reassure+0.039
 fright+0.037
 unexpected+0.036
 swat+0.036
Neg logit contributions
robe-0.047
 Now-0.045
ase-0.043
 He-0.042
 sky-0.041
Tokenobby
Token ID11369
Feature activation+1.78e-02

Pos logit contributions
ck+0.044
 reassure+0.044
 intervene+0.040
 swat+0.040
 fright+0.040
Neg logit contributions
robe-0.054
 sky-0.047
 has-0.047
ase-0.047
tub-0.047
Token and
Token ID290
Feature activation+3.79e-03

Pos logit contributions
ck+0.035
 reassure+0.034
 swat+0.032
 intervene+0.031
 unexpected+0.031
Neg logit contributions
robe-0.033
ase-0.027
tub-0.027
unes-0.026
angers-0.026
Token the
Token ID262
Feature activation-1.95e-02

Pos logit contributions
ck+0.006
 reassure+0.006
 swat+0.006
 fright+0.006
 unexpected+0.006
Neg logit contributions
robe-0.008
ase-0.007
tub-0.007
 sky-0.007
 has-0.007
Token ash
Token ID12530
Feature activation-6.04e-02

Pos logit contributions
robe+0.033
tub+0.028
ase+0.028
 has+0.027
 sky+0.027
Neg logit contributions
ck-0.040
 reassure-0.039
 unexpected-0.037
 swat-0.036
 fright-0.036
Tokentr
Token ID2213
Feature activation-8.90e-02

Pos logit contributions
robe+0.075
 Now+0.073
 These+0.067
 Cookies+0.064
angers+0.063
Neg logit contributions
ck-0.127
 reassure-0.121
 fright-0.119
 unexpected-0.118
 closer-0.117
Tokenay
Token ID323
Feature activation-3.15e-02

Pos logit contributions
robe+0.147
tub+0.127
 sky+0.121
 has+0.118
 Now+0.116
Neg logit contributions
ck-0.170
 fright-0.155
 reassure-0.154
 unexpected-0.151
 intervene-0.146
Token became
Token ID2627
Feature activation-8.94e-03

Pos logit contributions
robe+0.051
tub+0.044
ase+0.040
 sky+0.040
angers+0.040
Neg logit contributions
ck-0.068
 reassure-0.064
 unexpected-0.060
 fright-0.060
 intervene-0.060
Token the
Token ID262
Feature activation-2.23e-02

Pos logit contributions
robe+0.017
ase+0.015
 Now+0.014
tub+0.014
 has+0.014
Neg logit contributions
ck-0.016
 reassure-0.016
 unexpected-0.015
 fright-0.015
 swat-0.015
Token best
Token ID1266
Feature activation-3.92e-03

Pos logit contributions
robe+0.031
ase+0.025
 Now+0.024
tub+0.024
 has+0.023
Neg logit contributions
 reassure-0.053
ck-0.052
 unexpected-0.051
 fright-0.050
 swat-0.050
Token of
Token ID286
Feature activation-2.36e-02

Pos logit contributions
robe+0.008
 Cookies+0.006
 has+0.006
ase+0.006
 Now+0.006
Neg logit contributions
ck-0.007
 unexpected-0.007
 reassure-0.007
 fright-0.006
 swat-0.006
Token was
Token ID373
Feature activation-3.52e-03

Pos logit contributions
robe+0.053
ase+0.044
tub+0.044
 Now+0.042
 sky+0.042
Neg logit contributions
ck-0.070
 reassure-0.067
 unexpected-0.065
 fright-0.063
 swat-0.063
Token just
Token ID655
Feature activation-2.26e-03

Pos logit contributions
robe+0.007
ase+0.006
tub+0.006
 Now+0.006
 sky+0.006
Neg logit contributions
ck-0.006
 reassure-0.006
 unexpected-0.006
 swat-0.006
 fright-0.006
Token sick
Token ID6639
Feature activation-4.38e-03

Pos logit contributions
robe+0.005
ase+0.004
tub+0.004
 Cookies+0.004
 Now+0.004
Neg logit contributions
ck-0.004
 reassure-0.004
 swat-0.004
 unexpected-0.004
 fright-0.004
Token and
Token ID290
Feature activation+3.04e-02

Pos logit contributions
robe+0.008
ase+0.007
tub+0.006
angers+0.006
 Now+0.006
Neg logit contributions
 reassure-0.008
ck-0.008
 fright-0.008
 unexpected-0.008
 swat-0.008
Token couldn
Token ID3521
Feature activation+1.07e-02

Pos logit contributions
ck+0.060
 reassure+0.058
 unexpected+0.057
 fright+0.055
 swat+0.055
Neg logit contributions
robe-0.056
ase-0.050
angers-0.046
unes-0.045
tub-0.045
Token't
Token ID470
Feature activation+6.47e-02

Pos logit contributions
ck+0.021
 reassure+0.021
 fright+0.020
 unexpected+0.020
 swat+0.020
Neg logit contributions
robe-0.019
tub-0.016
ase-0.016
angers-0.015
 sky-0.015
Token help
Token ID1037
Feature activation+2.89e-02

Pos logit contributions
ck+0.109
 reassure+0.108
 unexpected+0.100
 intervene+0.098
 swat+0.098
Neg logit contributions
robe-0.132
tub-0.113
ase-0.112
 has-0.112
 sky-0.111
Token it
Token ID340
Feature activation+1.95e-02

Pos logit contributions
ck+0.060
 reassure+0.059
 unexpected+0.055
 swat+0.055
 fright+0.054
Neg logit contributions
robe-0.049
tub-0.040
ase-0.040
 has-0.040
 sky-0.039
Token.
Token ID13
Feature activation-1.72e-02

Pos logit contributions
ck+0.041
 reassure+0.040
 unexpected+0.038
 swat+0.038
 fright+0.037
Neg logit contributions
robe-0.033
ase-0.027
tub-0.027
 Now-0.026
 has-0.026
Token The
Token ID383
Feature activation-7.41e-03

Pos logit contributions
robe+0.027
ase+0.022
bows+0.021
 has+0.021
angers+0.020
Neg logit contributions
 reassure-0.035
 intervene-0.035
 fright-0.035
 unexpected-0.034
ck-0.033
Token animals
Token ID4695
Feature activation+1.20e-02

Pos logit contributions
robe+0.014
tub+0.011
ase+0.011
 Now+0.011
angers+0.011
Neg logit contributions
ck-0.015
 reassure-0.015
 unexpected-0.014
 swat-0.014
 fright-0.013
Token off
Token ID572
Feature activation+2.56e-02

Pos logit contributions
 reassure+0.045
ck+0.044
 intervene+0.042
 unexpected+0.042
 swat+0.040
Neg logit contributions
robe-0.040
 has-0.034
tub-0.033
 sky-0.033
ase-0.032
Token course
Token ID1781
Feature activation+8.34e-03

Pos logit contributions
ck+0.044
 reassure+0.043
 fright+0.041
 unexpected+0.041
 swat+0.040
Neg logit contributions
robe-0.052
ase-0.044
tub-0.044
 has-0.043
 Now-0.043
Token.
Token ID13
Feature activation+2.44e-03

Pos logit contributions
 reassure+0.024
 intervene+0.023
ck+0.023
render+0.022
 convince+0.022
Neg logit contributions
robe-0.006
 has-0.005
 sky-0.005
 He-0.005
 They-0.004
Token Sunny
Token ID32241
Feature activation+3.22e-02

Pos logit contributions
 reassure+0.005
ck+0.005
 fright+0.005
 swat+0.005
 unexpected+0.005
Neg logit contributions
robe-0.004
ase-0.003
angers-0.003
unes-0.003
tub-0.003
Token didn
Token ID1422
Feature activation+2.51e-02

Pos logit contributions
 reassure+0.079
ck+0.078
 intervene+0.075
 unexpected+0.075
 fright+0.073
Neg logit contributions
robe-0.040
 has-0.034
 sky-0.032
tub-0.032
 Now-0.032
Token't
Token ID470
Feature activation+6.37e-02

Pos logit contributions
ck+0.060
 reassure+0.059
 unexpected+0.056
 intervene+0.055
 swat+0.055
Neg logit contributions
robe-0.034
 Now-0.027
ase-0.027
 has-0.027
tub-0.026
Token know
Token ID760
Feature activation+4.47e-03

Pos logit contributions
ck+0.106
 reassure+0.105
 unexpected+0.097
 intervene+0.096
 swat+0.096
Neg logit contributions
robe-0.132
tub-0.113
 has-0.113
ase-0.113
 sky-0.111
Token where
Token ID810
Feature activation-5.19e-03

Pos logit contributions
ck+0.008
 reassure+0.008
 unexpected+0.007
 swat+0.007
 fright+0.007
Neg logit contributions
robe-0.009
tub-0.008
ase-0.008
 has-0.008
 Now-0.008
Token he
Token ID339
Feature activation-1.03e-02

Pos logit contributions
robe+0.008
ase+0.007
tub+0.006
 Now+0.006
 sky+0.006
Neg logit contributions
ck-0.012
 reassure-0.012
 fright-0.011
 unexpected-0.011
 swat-0.011
Token was
Token ID373
Feature activation+2.53e-02

Pos logit contributions
robe+0.010
 has+0.008
 sky+0.007
ase+0.007
 He+0.007
Neg logit contributions
 reassure-0.028
ck-0.028
 intervene-0.027
 swat-0.026
 unexpected-0.026
Token and
Token ID290
Feature activation+2.82e-02

Pos logit contributions
ck+0.056
 reassure+0.055
 intervene+0.054
 unexpected+0.052
 fright+0.052
Neg logit contributions
robe-0.034
 sky-0.030
 has-0.029
tub-0.029
ase-0.027
Token the
Token ID262
Feature activation+2.61e-02

Pos logit contributions
ck+0.037
 reassure+0.036
 unexpected+0.034
 swat+0.034
 fright+0.034
Neg logit contributions
robe-0.044
ase-0.038
tub-0.038
 has-0.037
 Now-0.037
Token garden
Token ID11376
Feature activation-1.87e-02

Pos logit contributions
ck+0.063
 reassure+0.062
 unexpected+0.058
 swat+0.058
 fright+0.058
Neg logit contributions
robe-0.036
tub-0.028
ase-0.028
 has-0.028
 Now-0.028
Token.
Token ID13
Feature activation+6.10e-03

Pos logit contributions
robe+0.037
 Now+0.031
ase+0.031
unes+0.030
 has+0.029
Neg logit contributions
 reassure-0.031
ck-0.030
 unexpected-0.029
 swat-0.029
 fright-0.028
Token There
Token ID1318
Feature activation-2.87e-02

Pos logit contributions
 reassure+0.015
 swat+0.014
 fright+0.014
 panic+0.014
 convince+0.014
Neg logit contributions
robe-0.008
ase-0.007
tub-0.006
angers-0.006
bows-0.006
Token,
Token ID11
Feature activation+3.05e-02

Pos logit contributions
robe+0.040
tub+0.032
ase+0.032
 has+0.032
 Now+0.031
Neg logit contributions
ck-0.067
 reassure-0.067
 unexpected-0.063
 swat-0.063
 fright-0.062
Token Bug
Token ID15217
Feature activation+6.46e-02

Pos logit contributions
 reassure+0.057
ck+0.057
 unexpected+0.054
 swat+0.053
 fright+0.053
Neg logit contributions
robe-0.058
ase-0.050
tub-0.048
 Now-0.046
 has-0.046
Tokengy
Token ID1360
Feature activation+2.65e-03

Pos logit contributions
ck+0.132
 reassure+0.126
 swat+0.120
 fright+0.116
 unexpected+0.115
Neg logit contributions
robe-0.112
ase-0.095
 Now-0.093
tub-0.092
 Cookies-0.091
Token saw
Token ID2497
Feature activation-4.45e-04

Pos logit contributions
ck+0.006
 reassure+0.006
 unexpected+0.005
 swat+0.005
 fright+0.005
Neg logit contributions
robe-0.004
tub-0.003
ase-0.003
 has-0.003
 sky-0.003
Token an
Token ID281
Feature activation+5.49e-02

Pos logit contributions
robe+0.001
tub+0.001
 sky+0.001
 Now+0.001
ase+0.001
Neg logit contributions
ck-0.001
 reassure-0.001
 intervene-0.001
 swat-0.001
 unexpected-0.001
Token incredible
Token ID8082
Feature activation+2.66e-02

Pos logit contributions
ck+0.121
 reassure+0.120
 intervene+0.117
 fright+0.113
render+0.110
Neg logit contributions
robe-0.075
 sky-0.074
tub-0.065
 has-0.065
 He-0.064
Token cob
Token ID22843
Feature activation+1.32e-02

Pos logit contributions
ck+0.051
 reassure+0.050
 unexpected+0.047
 swat+0.046
 fright+0.046
Neg logit contributions
robe-0.049
tub-0.041
ase-0.041
 has-0.041
 sky-0.041
Token too
Token ID1165
Feature activation+4.00e-02

Pos logit contributions
 unexpected+0.033
ck+0.033
 reassure+0.033
 fright+0.032
 closer+0.032
Neg logit contributions
robe-0.031
ase-0.028
 Now-0.028
 These-0.026
 Cookies-0.026
Token.
Token ID13
Feature activation+5.88e-04

Pos logit contributions
 reassure+0.115
 intervene+0.111
ck+0.110
 swat+0.108
 convince+0.105
Neg logit contributions
robe-0.035
tub-0.031
 sky-0.030
 has-0.029
 He-0.028
Token But
Token ID887
Feature activation+1.58e-02

Pos logit contributions
 fright+0.001
 reassure+0.001
 unexpected+0.001
 panic+0.001
 intervene+0.001
Neg logit contributions
robe-0.001
ase-0.001
tub-0.000
 has-0.000
angers-0.000
Token Tim
Token ID5045
Feature activation+3.22e-02

Pos logit contributions
 reassure+0.036
ck+0.035
 intervene+0.034
 swat+0.034
 fright+0.034
Neg logit contributions
robe-0.022
 has-0.019
 He-0.019
 Now-0.019
 sky-0.019
Token didn
Token ID1422
Feature activation+2.90e-02

Pos logit contributions
ck+0.072
 reassure+0.071
 unexpected+0.067
 fright+0.067
 swat+0.067
Neg logit contributions
robe-0.049
 has-0.040
tub-0.039
ase-0.039
 Now-0.038
Token't
Token ID470
Feature activation+6.66e-02

Pos logit contributions
ck+0.073
 reassure+0.072
 swat+0.068
 unexpected+0.068
 intervene+0.068
Neg logit contributions
robe-0.036
 Now-0.028
 has-0.027
tub-0.027
 He-0.027
Token care
Token ID1337
Feature activation+1.31e-02

Pos logit contributions
ck+0.103
 reassure+0.101
 unexpected+0.092
 swat+0.091
 intervene+0.090
Neg logit contributions
robe-0.148
tub-0.129
 has-0.128
ase-0.128
 sky-0.126
Token.
Token ID13
Feature activation-6.22e-03

Pos logit contributions
ck+0.029
 reassure+0.029
 unexpected+0.027
 swat+0.027
 fright+0.027
Neg logit contributions
robe-0.020
ase-0.016
tub-0.016
 has-0.016
 Now-0.016
Token He
Token ID679
Feature activation-1.60e-02

Pos logit contributions
robe+0.007
ase+0.005
 has+0.005
owers+0.005
tub+0.005
Neg logit contributions
 fright-0.015
 reassure-0.014
 unexpected-0.014
 panic-0.014
 fear-0.014
Token kept
Token ID4030
Feature activation+3.07e-02

Pos logit contributions
robe+0.022
 has+0.019
tub+0.017
 sky+0.017
 Now+0.017
Neg logit contributions
 reassure-0.038
ck-0.036
 intervene-0.036
 unexpected-0.035
 swat-0.035
Token playing
Token ID2712
Feature activation+2.44e-03

Pos logit contributions
ck+0.062
 reassure+0.061
 unexpected+0.057
 swat+0.057
 fright+0.057
Neg logit contributions
robe-0.054
ase-0.045
tub-0.045
 has-0.044
 Now-0.044
Token Suddenly
Token ID24975
Feature activation+1.35e-02

Pos logit contributions
robe+0.015
bows+0.013
owers+0.013
ase+0.013
fl+0.011
Neg logit contributions
 cautiously-0.020
 fright-0.019
 dismay-0.018
 tug-0.018
 reassure-0.018
Token,
Token ID11
Feature activation+2.02e-02

Pos logit contributions
ck+0.030
 reassure+0.029
 unexpected+0.028
 fright+0.027
 swat+0.027
Neg logit contributions
robe-0.021
ase-0.017
tub-0.017
 Now-0.017
 has-0.017
Token he
Token ID339
Feature activation+2.40e-04

Pos logit contributions
ck+0.042
 reassure+0.041
 unexpected+0.040
 fright+0.039
 swat+0.038
Neg logit contributions
robe-0.034
ase-0.029
tub-0.028
 has-0.027
 Now-0.027
Token heard
Token ID2982
Feature activation+1.61e-02

Pos logit contributions
ck+0.000
 reassure+0.000
 unexpected+0.000
 fright+0.000
 swat+0.000
Neg logit contributions
robe-0.000
ase-0.000
 Now-0.000
tub-0.000
 has-0.000
Token a
Token ID257
Feature activation+4.91e-02

Pos logit contributions
ck+0.032
 reassure+0.031
 swat+0.029
 intervene+0.029
 unexpected+0.029
Neg logit contributions
robe-0.028
 sky-0.024
tub-0.024
 Now-0.024
ase-0.024
Token loud
Token ID7812
Feature activation+6.29e-02

Pos logit contributions
ck+0.082
 reassure+0.075
 intervene+0.070
 convince+0.068
render+0.066
Neg logit contributions
robe-0.096
 sky-0.093
 He-0.091
 has-0.090
tub-0.087
Token roar
Token ID34532
Feature activation+2.56e-02

Pos logit contributions
ck+0.094
 reassure+0.092
 unexpected+0.084
 swat+0.084
 fright+0.083
Neg logit contributions
robe-0.142
ase-0.124
tub-0.124
 has-0.122
 Now-0.122
Token.
Token ID13
Feature activation-2.90e-03

Pos logit contributions
ck+0.055
 reassure+0.054
 intervene+0.051
 unexpected+0.050
 swat+0.050
Neg logit contributions
robe-0.040
 sky-0.034
tub-0.034
 has-0.033
 He-0.032
Token He
Token ID679
Feature activation-8.12e-03

Pos logit contributions
robe+0.004
ase+0.003
bows+0.003
angers+0.003
 has+0.003
Neg logit contributions
 fright-0.007
 cautiously-0.006
 reassure-0.006
Run-0.006
 unexpected-0.006
Token turned
Token ID2900
Feature activation+5.32e-03

Pos logit contributions
robe+0.011
 has+0.010
 sky+0.010
 He+0.009
 Now+0.009
Neg logit contributions
 reassure-0.019
ck-0.018
 intervene-0.018
 swat-0.017
 convince-0.017
Token around
Token ID1088
Feature activation+1.47e-02

Pos logit contributions
ck+0.012
 reassure+0.012
 unexpected+0.011
 intervene+0.011
 swat+0.011
Neg logit contributions
robe-0.008
 sky-0.006
tub-0.006
 has-0.006
 Now-0.006
Token better
Token ID1365
Feature activation+1.97e-02

Pos logit contributions
ck+0.027
 reassure+0.026
 unexpected+0.024
 swat+0.024
 fright+0.024
Neg logit contributions
robe-0.042
ase-0.037
tub-0.037
 Now-0.036
 has-0.036
Token."
Token ID526
Feature activation-1.36e-02

Pos logit contributions
ck+0.039
 reassure+0.039
 unexpected+0.036
 fright+0.034
 swat+0.034
Neg logit contributions
robe-0.036
ase-0.031
tub-0.029
unes-0.029
 sky-0.028
Token\n
Token ID198
Feature activation+1.11e-02

Pos logit contributions
robe+0.022
tub+0.019
unes+0.018
ase+0.018
aces+0.017
Neg logit contributions
 reassure-0.027
 fright-0.025
ck-0.025
 swat-0.024
 intervene-0.024
Token\n
Token ID198
Feature activation+3.20e-02

Pos logit contributions
ck+0.032
 reassure+0.031
 unexpected+0.030
 swat+0.030
 fright+0.030
Neg logit contributions
robe-0.010
ase-0.007
tub-0.007
 Now-0.007
 has-0.007
TokenMax
Token ID11518
Feature activation+1.73e-02

Pos logit contributions
ck+0.061
 fright+0.060
 reassure+0.058
 unexpected+0.058
 nearer+0.057
Neg logit contributions
robe-0.053
 Now-0.050
 has-0.049
 They-0.047
 He-0.046
Token and
Token ID290
Feature activation+2.15e-02

Pos logit contributions
ck+0.037
 reassure+0.037
 unexpected+0.035
 fright+0.034
 swat+0.034
Neg logit contributions
robe-0.028
 has-0.023
 Now-0.023
ase-0.022
tub-0.022
Token Lily
Token ID20037
Feature activation+9.55e-03

Pos logit contributions
ck+0.051
 reassure+0.050
 unexpected+0.047
 swat+0.047
 fright+0.047
Neg logit contributions
robe-0.030
 has-0.024
tub-0.024
ase-0.024
 Now-0.024
Token went
Token ID1816
Feature activation+1.92e-02

Pos logit contributions
 reassure+0.023
ck+0.023
 unexpected+0.022
 fright+0.022
 swat+0.022
Neg logit contributions
robe-0.011
 has-0.010
 sky-0.009
 He-0.009
 They-0.009
Token to
Token ID284
Feature activation+1.15e-02

Pos logit contributions
ck+0.042
 reassure+0.042
 unexpected+0.039
 swat+0.039
 fright+0.039
Neg logit contributions
robe-0.030
ase-0.025
tub-0.025
 Now-0.024
 has-0.024
Token the
Token ID262
Feature activation+1.37e-02

Pos logit contributions
ck+0.025
 unexpected+0.023
 reassure+0.023
ited+0.023
 swat+0.022
Neg logit contributions
robe-0.017
 They-0.016
 He-0.015
 has-0.015
 Now-0.014
Token veterinarian
Token ID41404
Feature activation+1.29e-02

Pos logit contributions
ck+0.025
 reassure+0.025
 unexpected+0.023
 swat+0.023
 fright+0.023
Neg logit contributions
robe-0.027
tub-0.023
ase-0.023
 Now-0.022
 has-0.022
Token let
Token ID1309
Feature activation+1.22e-02

Pos logit contributions
robe+0.008
ase+0.007
tub+0.007
 Now+0.007
 sky+0.007
Neg logit contributions
ck-0.008
 reassure-0.008
 unexpected-0.007
 swat-0.007
 intervene-0.007
Token her
Token ID607
Feature activation+7.57e-03

Pos logit contributions
ck+0.021
 reassure+0.020
 unexpected+0.019
 swat+0.019
 fright+0.018
Neg logit contributions
robe-0.025
ase-0.022
 has-0.022
 Now-0.021
 sky-0.021
Token get
Token ID651
Feature activation+2.69e-02

Pos logit contributions
ck+0.012
 reassure+0.012
 fright+0.011
 swat+0.011
 unexpected+0.011
Neg logit contributions
robe-0.017
ase-0.015
tub-0.014
 sky-0.014
 Now-0.014
Token her
Token ID607
Feature activation-1.51e-02

Pos logit contributions
ck+0.053
 reassure+0.053
 unexpected+0.049
 swat+0.049
 fright+0.048
Neg logit contributions
robe-0.047
 has-0.040
 Now-0.040
tub-0.040
ase-0.039
Token bucket
Token ID19236
Feature activation-2.26e-02

Pos logit contributions
robe+0.030
ase+0.027
tub+0.026
 Now+0.025
angers+0.024
Neg logit contributions
ck-0.028
 reassure-0.027
 unexpected-0.026
 fright-0.025
 swat-0.025
Token and
Token ID290
Feature activation+2.54e-02

Pos logit contributions
robe+0.039
 Now+0.036
ase+0.034
 has+0.033
 Cookies+0.033
Neg logit contributions
ck-0.044
 reassure-0.042
 closer-0.040
 unexpected-0.039
 fright-0.039
Token shovel
Token ID33677
Feature activation-8.87e-03

Pos logit contributions
ck+0.045
 reassure+0.043
 swat+0.040
 fright+0.039
 intervene+0.039
Neg logit contributions
robe-0.053
ase-0.048
tub-0.045
 sky-0.044
 Now-0.043
Token.
Token ID13
Feature activation-1.22e-02

Pos logit contributions
robe+0.015
 Now+0.013
ase+0.013
tub+0.013
 sky+0.012
Neg logit contributions
ck-0.018
 reassure-0.018
 closer-0.016
 swat-0.016
 fright-0.016
Token She
Token ID1375
Feature activation-2.80e-02

Pos logit contributions
robe+0.023
ase+0.019
tub+0.019
 has+0.018
unes+0.017
Neg logit contributions
 reassure-0.022
 intervene-0.021
 fright-0.021
 unexpected-0.020
 swat-0.020
Token filled
Token ID5901
Feature activation-6.08e-03

Pos logit contributions
robe+0.055
ase+0.047
tub+0.046
 Now+0.046
 sky+0.045
Neg logit contributions
ck-0.053
 reassure-0.050
 unexpected-0.046
 swat-0.046
 fright-0.045
Token the
Token ID262
Feature activation+8.84e-03

Pos logit contributions
robe+0.013
ase+0.011
tub+0.011
 has+0.010
 sky+0.010
Neg logit contributions
ck-0.011
 reassure-0.010
 swat-0.010
 unexpected-0.010
 fright-0.009
Token heard
Token ID2982
Feature activation+4.61e-03

Pos logit contributions
robe+0.026
ase+0.022
tub+0.022
 Now+0.021
 has+0.021
Neg logit contributions
ck-0.026
 reassure-0.026
 unexpected-0.024
 swat-0.024
 fright-0.024
Token a
Token ID257
Feature activation+2.78e-02

Pos logit contributions
ck+0.009
 reassure+0.009
 swat+0.009
 intervene+0.009
 fright+0.008
Neg logit contributions
robe-0.008
 sky-0.007
 Now-0.006
 has-0.006
tub-0.006
Token loud
Token ID7812
Feature activation+4.97e-02

Pos logit contributions
ck+0.047
 reassure+0.043
 intervene+0.039
 convince+0.038
 unexpected+0.037
Neg logit contributions
robe-0.055
 sky-0.052
 has-0.051
 He-0.050
tub-0.049
Token crunch
Token ID23527
Feature activation+2.82e-02

Pos logit contributions
ck+0.073
 reassure+0.072
 unexpected+0.065
 swat+0.065
 fright+0.064
Neg logit contributions
robe-0.113
tub-0.099
ase-0.099
 has-0.098
 Now-0.097
Tokening
Token ID278
Feature activation+4.89e-03

Pos logit contributions
ck+0.050
 reassure+0.048
 swat+0.046
 unexpected+0.044
 fright+0.044
Neg logit contributions
robe-0.056
tub-0.048
 Now-0.048
 has-0.047
 Cookies-0.047
Token sound
Token ID2128
Feature activation+5.45e-03

Pos logit contributions
ck+0.007
 reassure+0.007
 unexpected+0.007
 swat+0.007
 fright+0.007
Neg logit contributions
robe-0.011
tub-0.010
ase-0.010
 Now-0.009
 has-0.009
Token.
Token ID13
Feature activation-1.15e-02

Pos logit contributions
ck+0.011
 reassure+0.011
 unexpected+0.010
 swat+0.010
 fright+0.010
Neg logit contributions
robe-0.010
ase-0.008
tub-0.008
 Now-0.008
 has-0.008
Token When
Token ID1649
Feature activation+2.62e-03

Pos logit contributions
robe+0.016
ase+0.013
tub+0.012
unes+0.012
 has+0.012
Neg logit contributions
 reassure-0.026
 fright-0.025
 intervene-0.025
 cautiously-0.024
 unexpected-0.024
Token she
Token ID673
Feature activation-1.50e-02

Pos logit contributions
ck+0.006
 reassure+0.006
 swat+0.005
 unexpected+0.005
 fright+0.005
Neg logit contributions
robe-0.004
 has-0.003
tub-0.003
 Now-0.003
 sky-0.003
Token looked
Token ID3114
Feature activation-1.25e-02

Pos logit contributions
robe+0.021
 has+0.018
 Now+0.018
 sky+0.017
ase+0.017
Neg logit contributions
 reassure-0.034
ck-0.033
 intervene-0.032
 swat-0.032
 unexpected-0.031
Token,
Token ID11
Feature activation+1.33e-02

Pos logit contributions
robe+0.018
 has+0.016
 sky+0.016
tub+0.015
 He+0.015
Neg logit contributions
ck-0.028
 reassure-0.027
 intervene-0.026
 swat-0.026
 fright-0.025
Token get
Token ID651
Feature activation+3.34e-02

Pos logit contributions
ck+0.015
 reassure+0.015
 swat+0.014
 fright+0.014
 unexpected+0.013
Neg logit contributions
robe-0.019
ase-0.017
tub-0.017
 Now-0.016
angers-0.016
Token better
Token ID1365
Feature activation+1.99e-02

Pos logit contributions
ck+0.056
 reassure+0.055
 unexpected+0.051
 swat+0.050
 fright+0.050
Neg logit contributions
robe-0.070
tub-0.060
ase-0.060
 has-0.059
 Now-0.059
Token soon
Token ID2582
Feature activation-4.38e-03

Pos logit contributions
ck+0.037
 reassure+0.036
 fright+0.034
 unexpected+0.034
 swat+0.033
Neg logit contributions
robe-0.041
ase-0.035
tub-0.033
unes-0.032
 sky-0.031
Token.
Token ID13
Feature activation-1.58e-02

Pos logit contributions
robe+0.009
ase+0.007
tub+0.007
 Now+0.007
angers+0.007
Neg logit contributions
ck-0.008
 reassure-0.008
 unexpected-0.007
 fright-0.007
 swat-0.007
Token It
Token ID632
Feature activation-2.92e-02

Pos logit contributions
robe+0.025
tub+0.020
ase+0.020
unes+0.019
angers+0.019
Neg logit contributions
 reassure-0.034
 fright-0.033
 intervene-0.032
 swat-0.032
ck-0.031
Token will
Token ID481
Feature activation+1.31e-03

Pos logit contributions
robe+0.057
ase+0.049
angers+0.048
tub+0.047
unes+0.046
Neg logit contributions
ck-0.055
 reassure-0.054
 unexpected-0.051
 fright-0.050
 swat-0.050
Token be
Token ID307
Feature activation+3.61e-02

Pos logit contributions
ck+0.002
 reassure+0.002
 unexpected+0.002
 fright+0.002
 swat+0.002
Neg logit contributions
robe-0.003
ase-0.002
tub-0.002
 Now-0.002
 sky-0.002
Token fun
Token ID1257
Feature activation+2.03e-02

Pos logit contributions
 reassure+0.064
ck+0.062
 swat+0.059
 intervene+0.059
 fright+0.057
Neg logit contributions
robe-0.068
 has-0.062
 sky-0.059
 Now-0.059
tub-0.058
Token to
Token ID284
Feature activation+1.61e-02

Pos logit contributions
 reassure+0.034
ck+0.034
 unexpected+0.032
 fright+0.032
 closer+0.031
Neg logit contributions
robe-0.042
ase-0.036
tub-0.034
unes-0.034
 Now-0.034
Token play
Token ID711
Feature activation-7.18e-03

Pos logit contributions
ck+0.027
 reassure+0.027
 unexpected+0.025
 swat+0.025
 fright+0.025
Neg logit contributions
robe-0.033
tub-0.028
ase-0.028
 has-0.028
 Now-0.028
Token hockey
Token ID12217
Feature activation+1.03e-02

Pos logit contributions
robe+0.012
ase+0.010
tub+0.010
 sky+0.010
 has+0.010
Neg logit contributions
ck-0.014
 reassure-0.014
 unexpected-0.013
 fright-0.013
 swat-0.013
Token that
Token ID326
Feature activation-1.75e-02

Pos logit contributions
robe+0.035
ase+0.028
tub+0.028
unes+0.027
angers+0.027
Neg logit contributions
 reassure-0.040
ck-0.040
 unexpected-0.039
 fright-0.038
 intervene-0.037
Token day
Token ID1110
Feature activation+7.55e-03

Pos logit contributions
robe+0.032
unes+0.028
angers+0.027
ase+0.026
tub+0.026
Neg logit contributions
 reassure-0.034
ck-0.033
 unexpected-0.033
 fright-0.032
 swat-0.031
Token when
Token ID618
Feature activation+2.68e-02

Pos logit contributions
ck+0.013
 reassure+0.013
 unexpected+0.013
 fright+0.012
 intervene+0.012
Neg logit contributions
robe-0.014
ase-0.012
tub-0.012
unes-0.012
 has-0.011
Token the
Token ID262
Feature activation-5.15e-03

Pos logit contributions
ck+0.052
 reassure+0.051
 unexpected+0.048
 swat+0.047
 fright+0.047
Neg logit contributions
robe-0.049
tub-0.041
ase-0.041
 has-0.041
 Now-0.040
Token mum
Token ID25682
Feature activation+9.60e-03

Pos logit contributions
robe+0.009
ase+0.007
tub+0.007
 has+0.007
 Now+0.007
Neg logit contributions
 reassure-0.011
ck-0.011
 unexpected-0.010
 intervene-0.010
 swat-0.010
Token came
Token ID1625
Feature activation+3.32e-02

Pos logit contributions
 reassure+0.019
ck+0.019
 unexpected+0.018
 fright+0.018
 intervene+0.017
Neg logit contributions
robe-0.017
tub-0.014
unes-0.013
angers-0.013
ase-0.013
Token to
Token ID284
Feature activation+2.57e-02

Pos logit contributions
ck+0.071
 reassure+0.070
 unexpected+0.066
 swat+0.065
 fright+0.065
Neg logit contributions
robe-0.054
tub-0.045
ase-0.044
 has-0.044
 sky-0.044
Token check
Token ID2198
Feature activation-1.67e-02

Pos logit contributions
ck+0.051
 reassure+0.047
 unexpected+0.046
 swat+0.044
 intervene+0.044
Neg logit contributions
robe-0.043
 sky-0.037
 Now-0.037
 He-0.036
 has-0.036
Token on
Token ID319
Feature activation+4.61e-02

Pos logit contributions
robe+0.026
ase+0.022
tub+0.021
 sky+0.021
 Now+0.020
Neg logit contributions
 reassure-0.036
ck-0.036
 unexpected-0.034
 swat-0.033
 fright-0.033
Token them
Token ID606
Feature activation+1.12e-02

Pos logit contributions
ck+0.079
 reassure+0.078
 unexpected+0.072
 swat+0.072
 fright+0.071
Neg logit contributions
robe-0.094
tub-0.080
ase-0.080
 has-0.079
 Now-0.079
Token,
Token ID11
Feature activation+2.06e-02

Pos logit contributions
 reassure+0.023
ck+0.023
 unexpected+0.022
 intervene+0.021
 fright+0.021
Neg logit contributions
robe-0.019
ase-0.016
tub-0.015
 Now-0.015
 sky-0.015
TokenOh
Token ID5812
Feature activation-1.67e-02

Pos logit contributions
robe+0.004
tub+0.003
ase+0.003
 has+0.003
 Now+0.003
Neg logit contributions
ck-0.003
 reassure-0.003
 unexpected-0.003
 swat-0.003
 fright-0.003
Token no
Token ID645
Feature activation+2.48e-02

Pos logit contributions
robe+0.031
tub+0.026
ase+0.026
 sky+0.025
 Now+0.025
Neg logit contributions
ck-0.032
 reassure-0.031
 unexpected-0.029
 fright-0.029
 swat-0.029
Token Jimmy
Token ID12963
Feature activation+3.50e-02

Pos logit contributions
ck+0.055
 reassure+0.054
 unexpected+0.051
 swat+0.050
 intervene+0.050
Neg logit contributions
robe-0.036
 has-0.031
 Now-0.031
 sky-0.030
tub-0.030
Token,
Token ID11
Feature activation+4.67e-03

Pos logit contributions
ck+0.064
 reassure+0.062
 swat+0.059
 fright+0.058
 unexpected+0.058
Neg logit contributions
robe-0.069
tub-0.058
ase-0.058
 sky-0.057
angers-0.056
Token does
Token ID857
Feature activation-1.78e-05

Pos logit contributions
 reassure+0.008
ck+0.008
 unexpected+0.008
 fright+0.008
 swat+0.008
Neg logit contributions
robe-0.010
ase-0.009
angers-0.008
tub-0.008
 sky-0.008
Token it
Token ID340
Feature activation-1.13e-02

Pos logit contributions
robe+0.000
ase+0.000
tub+0.000
 sky+0.000
 Now+0.000
Neg logit contributions
ck-0.000
 reassure-0.000
 fright-0.000
 unexpected-0.000
 swat-0.000
Token hurt
Token ID5938
Feature activation+1.79e-02

Pos logit contributions
robe+0.020
ase+0.017
tub+0.017
 Now+0.017
 sky+0.016
Neg logit contributions
ck-0.023
 reassure-0.022
 unexpected-0.021
 fright-0.021
 swat-0.021
Token?"
Token ID1701
Feature activation-8.48e-03

Pos logit contributions
ck+0.031
 reassure+0.031
 unexpected+0.028
 swat+0.028
 fright+0.028
Neg logit contributions
robe-0.036
tub-0.031
ase-0.031
 has-0.030
 Now-0.030
Token
Token ID198
Feature activation+7.11e-03

Pos logit contributions
robe+0.016
ase+0.014
tub+0.013
unes+0.013
 has+0.013
Neg logit contributions
ck-0.016
 reassure-0.016
 swat-0.015
 fright-0.015
 intervene-0.015
Token
Token ID198
Feature activation+2.07e-02

Pos logit contributions
ck+0.023
 reassure+0.022
 unexpected+0.022
 swat+0.022
 matter+0.022
Neg logit contributions
robe-0.003
 Now-0.003
 He-0.003
ase-0.001
 It-0.001
TokenJimmy
Token ID40335
Feature activation+2.30e-02

Pos logit contributions
ck+0.036
 reassure+0.034
 unexpected+0.033
 fright+0.032
 matter+0.032
Neg logit contributions
 Now-0.039
robe-0.039
 He-0.037
ase-0.035
tub-0.034
Token his
Token ID465
Feature activation-3.62e-02

Pos logit contributions
 reassure+0.020
ck+0.019
 fright+0.018
 unexpected+0.018
 swat+0.018
Neg logit contributions
robe-0.029
ase-0.026
angers-0.025
tub-0.025
unes-0.025
Token mom
Token ID1995
Feature activation+1.15e-02

Pos logit contributions
robe+0.068
tub+0.057
ase+0.057
 has+0.056
 sky+0.056
Neg logit contributions
ck-0.069
 reassure-0.068
 unexpected-0.063
 swat-0.063
 fright-0.062
Token went
Token ID1816
Feature activation+1.72e-02

Pos logit contributions
ck+0.014
 reassure+0.014
 worry+0.013
 intervene+0.013
 convince+0.012
Neg logit contributions
robe-0.029
angers-0.026
tub-0.025
ase-0.024
balls-0.024
Token to
Token ID284
Feature activation+8.92e-03

Pos logit contributions
 reassure+0.031
 closer+0.030
ck+0.029
 fright+0.028
 convince+0.028
Neg logit contributions
robe-0.032
ase-0.029
 Cookies-0.027
tub-0.027
 Now-0.027
Token the
Token ID262
Feature activation+9.13e-03

Pos logit contributions
ck+0.018
 reassure+0.017
 unexpected+0.016
 swat+0.016
 intervene+0.016
Neg logit contributions
robe-0.015
 sky-0.013
 has-0.013
 Now-0.013
tub-0.013
Token shop
Token ID6128
Feature activation-3.09e-04

Pos logit contributions
ck+0.022
 reassure+0.022
 swat+0.021
 unexpected+0.021
 fright+0.021
Neg logit contributions
robe-0.012
tub-0.010
ase-0.009
 sky-0.009
 has-0.009
Token.
Token ID13
Feature activation-1.89e-02

Pos logit contributions
robe+0.001
ase+0.000
 Now+0.000
 has+0.000
unes+0.000
Neg logit contributions
 reassure-0.001
ck-0.001
 intervene-0.001
 convince-0.001
 unexpected-0.001
Token Jack
Token ID3619
Feature activation+1.49e-02

Pos logit contributions
robe+0.027
ase+0.023
bows+0.021
rob+0.021
fl+0.020
Neg logit contributions
 intervene-0.040
 reassure-0.040
 cautiously-0.039
 unexpected-0.039
 fright-0.038
Token was
Token ID373
Feature activation+1.15e-03

Pos logit contributions
ck+0.028
 reassure+0.027
 unexpected+0.026
 cautiously+0.026
 convince+0.025
Neg logit contributions
robe-0.028
unes-0.024
ase-0.023
tub-0.023
angers-0.022
Token excited
Token ID6568
Feature activation+1.42e-03

Pos logit contributions
 reassure+0.002
ck+0.002
 worry+0.002
 unexpected+0.002
 convince+0.002
Neg logit contributions
robe-0.003
ase-0.002
unes-0.002
tub-0.002
owers-0.002
Token to
Token ID284
Feature activation+1.17e-02

Pos logit contributions
 reassure+0.003
ck+0.003
 unexpected+0.003
 fright+0.002
 convince+0.002
Neg logit contributions
robe-0.003
ase-0.002
 Now-0.002
tub-0.002
 has-0.002
Token ones
Token ID3392
Feature activation+1.65e-03

Pos logit contributions
robe+0.034
 Now+0.033
ase+0.030
angers+0.030
tub+0.029
Neg logit contributions
 fright-0.033
 unexpected-0.032
 reassure-0.032
 worry-0.032
 matter-0.032
Token his
Token ID465
Feature activation-4.58e-02

Pos logit contributions
ck+0.003
 else+0.002
 unexpected+0.002
 close+0.002
 believe+0.002
Neg logit contributions
robe-0.004
 Now-0.003
angers-0.003
 sky-0.003
tub-0.003
Token mom
Token ID1995
Feature activation+5.30e-03

Pos logit contributions
robe+0.079
 Now+0.073
ase+0.072
tub+0.069
angers+0.067
Neg logit contributions
 reassure-0.078
 worry-0.077
ck-0.077
 fright-0.076
 swat-0.076
Tokenmy
Token ID1820
Feature activation+7.77e-04

Pos logit contributions
ck+0.008
 reassure+0.007
 convince+0.007
 worry+0.006
 believe+0.006
Neg logit contributions
robe-0.012
angers-0.011
tub-0.011
 Now-0.011
unes-0.011
Token made
Token ID925
Feature activation-1.94e-02

Pos logit contributions
ck+0.001
 reassure+0.001
 convince+0.001
 worry+0.001
 swat+0.001
Neg logit contributions
robe-0.001
angers-0.001
tub-0.001
unes-0.001
 Now-0.001
Token.
Token ID13
Feature activation-1.20e-02

Pos logit contributions
robe+0.035
 has+0.030
 sky+0.030
 Now+0.030
ase+0.029
Neg logit contributions
ck-0.037
 reassure-0.035
 unexpected-0.034
 swat-0.033
 convince-0.033
Token One
Token ID1881
Feature activation-1.57e-02

Pos logit contributions
robe+0.026
ase+0.022
ed+0.022
 has+0.021
tub+0.021
Neg logit contributions
 reassure-0.015
 intervene-0.015
 convince-0.014
 fright-0.014
 matter-0.014
Token day
Token ID1110
Feature activation+1.67e-02

Pos logit contributions
robe+0.016
 has+0.013
 sky+0.012
ase+0.012
tub+0.012
Neg logit contributions
ck-0.044
 reassure-0.044
 intervene-0.042
 fright-0.041
 swat-0.041
Token,
Token ID11
Feature activation+1.05e-02

Pos logit contributions
ck+0.036
 reassure+0.036
 unexpected+0.034
 swat+0.033
 fright+0.033
Neg logit contributions
robe-0.026
ase-0.022
tub-0.022
 has-0.021
 sky-0.021
Token Tim
Token ID5045
Feature activation+2.82e-02

Pos logit contributions
ck+0.021
 reassure+0.021
 unexpected+0.019
 swat+0.019
 fright+0.019
Neg logit contributions
robe-0.019
tub-0.016
ase-0.016
 has-0.015
 Now-0.015
Tokenmy
Token ID1820
Feature activation+9.62e-03

Pos logit contributions
ck+0.050
 reassure+0.048
 unexpected+0.045
 fright+0.044
 swat+0.044
Neg logit contributions
robe-0.055
 Now-0.047
tub-0.047
angers-0.046
 Cookies-0.046
Token the
Token ID262
Feature activation-1.33e-02

Pos logit contributions
 reassure+0.022
ck+0.022
 intervene+0.020
 swat+0.020
 fright+0.020
Neg logit contributions
robe-0.022
tub-0.019
 has-0.018
 sky-0.018
ase-0.018
Token laser
Token ID12855
Feature activation-3.09e-02

Pos logit contributions
robe+0.024
ase+0.021
 Now+0.021
tub+0.020
unes+0.020
Neg logit contributions
ck-0.025
 reassure-0.025
 unexpected-0.024
 fright-0.024
 swat-0.023
Token in
Token ID287
Feature activation+1.23e-02

Pos logit contributions
robe+0.052
 Now+0.048
 Cookies+0.045
ase+0.044
unes+0.043
Neg logit contributions
ck-0.059
 unexpected-0.056
 reassure-0.055
 fright-0.053
 swat-0.053
Token their
Token ID511
Feature activation-2.71e-02

Pos logit contributions
ck+0.023
 reassure+0.023
 swat+0.021
 intervene+0.021
 fright+0.020
Neg logit contributions
robe-0.023
 has-0.020
tub-0.020
 sky-0.020
 Now-0.020
Token room
Token ID2119
Feature activation-4.88e-02

Pos logit contributions
robe+0.040
unes+0.034
ase+0.033
 Now+0.032
tub+0.031
Neg logit contributions
ck-0.059
 unexpected-0.058
 reassure-0.056
 fright-0.055
 swat-0.054
Token.
Token ID13
Feature activation-2.51e-02

Pos logit contributions
robe+0.089
 Now+0.077
 has+0.074
ase+0.074
unes+0.073
Neg logit contributions
ck-0.087
 reassure-0.087
 unexpected-0.083
 intervene-0.081
 fright-0.080
Token They
Token ID1119
Feature activation-2.75e-02

Pos logit contributions
robe+0.026
ase+0.022
owers+0.019
fl+0.018
atmeal+0.018
Neg logit contributions
 reassure-0.060
 unexpected-0.060
 fright-0.060
ck-0.060
 intervene-0.060
Token heard
Token ID2982
Feature activation+9.02e-03

Pos logit contributions
robe+0.043
tub+0.035
 has+0.035
ase+0.035
 sky+0.034
Neg logit contributions
ck-0.060
 reassure-0.060
 unexpected-0.056
 swat-0.056
 fright-0.056
Token their
Token ID511
Feature activation-1.54e-02

Pos logit contributions
ck+0.019
 reassure+0.019
 swat+0.017
 intervene+0.017
 convince+0.017
Neg logit contributions
robe-0.015
 sky-0.013
tub-0.013
 He-0.012
 Now-0.012
Token mom
Token ID1995
Feature activation+4.71e-02

Pos logit contributions
robe+0.024
 sky+0.021
 has+0.020
tub+0.020
 Now+0.019
Neg logit contributions
ck-0.034
 reassure-0.033
 swat-0.031
 intervene-0.031
ited-0.031
Token call
Token ID869
Feature activation-8.12e-03

Pos logit contributions
ck+0.073
 reassure+0.073
 unexpected+0.068
 fright+0.066
 swat+0.066
Neg logit contributions
robe-0.100
 has-0.089
tub-0.087
ase-0.087
 sky-0.087
Token table
Token ID3084
Feature activation-5.14e-02

Pos logit contributions
robe+0.027
ase+0.026
 Now+0.025
unes+0.024
angers+0.023
Neg logit contributions
 reassure-0.035
 unexpected-0.033
 intervene-0.033
 worry-0.033
 panic-0.033
Token was
Token ID373
Feature activation+9.01e-03

Pos logit contributions
robe+0.076
 Now+0.068
angers+0.064
unes+0.062
ase+0.061
Neg logit contributions
ck-0.108
 reassure-0.104
 unexpected-0.102
 fright-0.102
 swat-0.098
Token covered
Token ID5017
Feature activation-1.92e-02

Pos logit contributions
ck+0.015
 reassure+0.015
 unexpected+0.014
 fright+0.014
 swat+0.014
Neg logit contributions
robe-0.019
ase-0.016
 Now-0.016
tub-0.016
 has-0.016
Token with
Token ID351
Feature activation-1.10e-04

Pos logit contributions
robe+0.029
ase+0.023
 Now+0.023
 has+0.023
tub+0.023
Neg logit contributions
ck-0.043
 reassure-0.042
 unexpected-0.040
 swat-0.039
 fright-0.039
Token cr
Token ID1067
Feature activation-1.77e-02

Pos logit contributions
robe+0.000
ase+0.000
 Now+0.000
tub+0.000
 has+0.000
Neg logit contributions
ck-0.000
 reassure-0.000
 unexpected-0.000
 fright-0.000
 swat-0.000
Tokenumbs
Token ID18146
Feature activation-1.30e-02

Pos logit contributions
robe+0.029
 has+0.028
 Now+0.024
 ones+0.023
tub+0.023
Neg logit contributions
ck-0.032
 reassure-0.030
 unexpected-0.030
 fright-0.029
 intervene-0.027
Token and
Token ID290
Feature activation+1.30e-02

Pos logit contributions
robe+0.022
ase+0.019
angers+0.018
 Now+0.018
tub+0.017
Neg logit contributions
ck-0.026
 unexpected-0.025
 reassure-0.025
 fright-0.025
 closer-0.024
Token spills
Token ID42123
Feature activation-1.56e-02

Pos logit contributions
 unexpected+0.023
ck+0.023
 fright+0.022
 reassure+0.022
 fear+0.021
Neg logit contributions
robe-0.023
ase-0.022
angers-0.021
unes-0.021
owers-0.020
Token.
Token ID13
Feature activation-1.07e-02

Pos logit contributions
robe+0.029
ase+0.027
 Now+0.025
angers+0.025
bows+0.024
Neg logit contributions
 unexpected-0.026
 fright-0.025
ck-0.025
 closer-0.024
 fear-0.024
Token
Token ID198
Feature activation+2.82e-02

Pos logit contributions
robe+0.016
ase+0.013
tub+0.012
 has+0.012
unes+0.011
Neg logit contributions
 reassure-0.023
ck-0.023
 fright-0.022
 unexpected-0.022
 intervene-0.021
Token
Token ID198
Feature activation+4.87e-02

Pos logit contributions
ck+0.083
 reassure+0.083
 swat+0.079
 unexpected+0.079
 intervene+0.078
Neg logit contributions
robe-0.023
 Now-0.016
 has-0.015
ase-0.015
tub-0.015
Token on
Token ID319
Feature activation+6.03e-03

Pos logit contributions
robe+0.078
 Now+0.077
 There+0.070
 Cookies+0.069
 has+0.069
Neg logit contributions
ck-0.090
 reassure-0.088
 swat-0.083
 closer-0.082
 convince-0.080
Token the
Token ID262
Feature activation-3.20e-02

Pos logit contributions
ck+0.010
 reassure+0.010
 unexpected+0.009
 swat+0.009
 fright+0.009
Neg logit contributions
robe-0.013
tub-0.011
ase-0.011
 has-0.011
 Now-0.011
Token pan
Token ID3425
Feature activation-5.71e-02

Pos logit contributions
robe+0.056
 Now+0.050
ase+0.047
tub+0.046
angers+0.046
Neg logit contributions
ck-0.061
 reassure-0.060
 swat-0.058
 unexpected-0.058
 fright-0.058
Token so
Token ID523
Feature activation+1.33e-03

Pos logit contributions
robe+0.097
 Now+0.092
angers+0.084
 has+0.083
ase+0.083
Neg logit contributions
 reassure-0.101
ck-0.099
 closer-0.094
 fright-0.092
 intervene-0.092
Token the
Token ID262
Feature activation-1.94e-02

Pos logit contributions
ck+0.003
 reassure+0.003
 unexpected+0.003
 swat+0.003
 fright+0.003
Neg logit contributions
robe-0.002
ase-0.002
tub-0.002
 Now-0.002
 sky-0.002
Token cake
Token ID12187
Feature activation-6.52e-02

Pos logit contributions
robe+0.035
ase+0.030
 Now+0.030
tub+0.029
 He+0.028
Neg logit contributions
 reassure-0.038
ck-0.038
 unexpected-0.036
 swat-0.035
 fright-0.035
Token would
Token ID561
Feature activation-9.45e-03

Pos logit contributions
robe+0.103
 Now+0.101
angers+0.097
 These+0.093
ase+0.090
Neg logit contributions
ck-0.126
 reassure-0.120
 unexpected-0.117
 fright-0.113
 believe-0.113
Token not
Token ID407
Feature activation+5.73e-02

Pos logit contributions
robe+0.020
ase+0.018
 Now+0.018
tub+0.018
angers+0.017
Neg logit contributions
ck-0.016
 reassure-0.015
 swat-0.015
 fright-0.015
 unexpected-0.014
Token stick
Token ID4859
Feature activation-7.82e-03

Pos logit contributions
ck+0.097
 reassure+0.095
 unexpected+0.088
 swat+0.087
 fright+0.087
Neg logit contributions
robe-0.122
ase-0.107
 Now-0.106
tub-0.104
 has-0.102
Token.
Token ID13
Feature activation-2.58e-02

Pos logit contributions
robe+0.014
 Now+0.012
tub+0.011
ase+0.011
 sky+0.011
Neg logit contributions
ck-0.016
 reassure-0.015
 unexpected-0.015
 fright-0.015
 swat-0.014
Token They
Token ID1119
Feature activation-3.19e-02

Pos logit contributions
robe+0.040
tub+0.031
ase+0.031
 has+0.030
 sky+0.028
Neg logit contributions
 reassure-0.055
 intervene-0.053
ck-0.052
 fright-0.052
 swat-0.051
Token paper
Token ID3348
Feature activation-4.35e-02

Pos logit contributions
robe+0.044
tub+0.038
ase+0.038
 Now+0.035
 sky+0.035
Neg logit contributions
ck-0.035
 reassure-0.034
 unexpected-0.033
 fright-0.032
 swat-0.031
Token.
Token ID13
Feature activation-3.30e-02

Pos logit contributions
robe+0.076
 Now+0.068
ase+0.068
tub+0.066
 He+0.062
Neg logit contributions
ck-0.084
 reassure-0.078
 unexpected-0.075
 fright-0.072
 closer-0.072
Token Writing
Token ID22183
Feature activation-5.61e-02

Pos logit contributions
robe+0.039
tub+0.031
ase+0.028
 sky+0.027
 has+0.025
Neg logit contributions
ck-0.083
 reassure-0.083
 fright-0.079
 intervene-0.078
 swat-0.078
Token on
Token ID319
Feature activation-3.94e-03

Pos logit contributions
robe+0.075
ase+0.066
tub+0.063
angers+0.063
umm+0.061
Neg logit contributions
ck-0.118
 unexpected-0.116
 reassure-0.114
 closer-0.108
 fright-0.108
Token the
Token ID262
Feature activation-3.32e-02

Pos logit contributions
robe+0.008
tub+0.007
ase+0.007
angers+0.007
 Now+0.007
Neg logit contributions
ck-0.006
 unexpected-0.006
 reassure-0.006
 fright-0.006
 swat-0.006
Token fridge
Token ID25772
Feature activation-7.65e-02

Pos logit contributions
robe+0.042
 Now+0.036
ase+0.035
tub+0.033
angers+0.032
Neg logit contributions
 reassure-0.077
 unexpected-0.075
 fright-0.074
ck-0.074
 swat-0.072
Token is
Token ID318
Feature activation-1.65e-02

Pos logit contributions
robe+0.136
 Now+0.123
angers+0.116
tub+0.115
ase+0.114
Neg logit contributions
ck-0.120
 unexpected-0.118
 reassure-0.118
 worry-0.112
 tighter-0.111
Token bad
Token ID2089
Feature activation-3.47e-03

Pos logit contributions
robe+0.037
ase+0.034
tub+0.032
 Now+0.031
 sky+0.031
Neg logit contributions
ck-0.027
 unexpected-0.025
 reassure-0.025
 fright-0.022
 swat-0.022
Token.
Token ID13
Feature activation-2.90e-02

Pos logit contributions
robe+0.007
ase+0.006
tub+0.006
angers+0.006
�+0.006
Neg logit contributions
 reassure-0.005
 unexpected-0.005
ck-0.005
 matter-0.005
 fright-0.005
Token It
Token ID632
Feature activation-4.53e-02

Pos logit contributions
robe+0.036
tub+0.029
ase+0.026
 sky+0.024
 has+0.023
Neg logit contributions
ck-0.072
 reassure-0.072
 fright-0.069
 swat-0.068
 intervene-0.067
Token makes
Token ID1838
Feature activation-2.14e-03

Pos logit contributions
robe+0.073
angers+0.064
ase+0.062
tub+0.062
unes+0.056
Neg logit contributions
ck-0.105
 unexpected-0.094
 fright-0.093
 closer-0.089
 reassure-0.089
Token measure
Token ID3953
Feature activation-7.17e-03

Pos logit contributions
robe+0.002
tub+0.002
 has+0.002
 sky+0.002
ase+0.002
Neg logit contributions
ck-0.002
 reassure-0.002
 unexpected-0.002
 fright-0.002
 intervene-0.002
Token other
Token ID584
Feature activation+1.53e-02

Pos logit contributions
robe+0.014
ase+0.012
tub+0.012
 Now+0.012
 sky+0.012
Neg logit contributions
ck-0.013
 reassure-0.012
 unexpected-0.012
 fright-0.012
 swat-0.011
Token things
Token ID1243
Feature activation-8.79e-03

Pos logit contributions
ck+0.032
 reassure+0.032
 unexpected+0.030
 swat+0.030
 fright+0.030
Neg logit contributions
robe-0.025
ase-0.021
tub-0.020
 Now-0.020
 has-0.020
Token in
Token ID287
Feature activation+7.85e-03

Pos logit contributions
robe+0.016
ase+0.014
 Now+0.014
 Cookies+0.013
tub+0.013
Neg logit contributions
ck-0.017
 reassure-0.016
 unexpected-0.016
 closer-0.015
 fright-0.015
Token his
Token ID465
Feature activation-3.03e-02

Pos logit contributions
ck+0.013
 reassure+0.013
 unexpected+0.012
 fright+0.012
 swat+0.012
Neg logit contributions
robe-0.016
ase-0.014
tub-0.014
 Now-0.013
 has-0.013
Token room
Token ID2119
Feature activation-6.26e-02

Pos logit contributions
robe+0.047
ase+0.038
tub+0.038
 Now+0.037
unes+0.037
Neg logit contributions
ck-0.065
 reassure-0.065
 unexpected-0.063
 fright-0.062
 swat-0.061
Token.
Token ID13
Feature activation-2.01e-02

Pos logit contributions
robe+0.116
 Now+0.102
ase+0.100
unes+0.096
 has+0.096
Neg logit contributions
 reassure-0.104
 unexpected-0.101
ck-0.100
 intervene-0.098
 worry-0.097
Token His
Token ID2399
Feature activation-4.10e-02

Pos logit contributions
robe+0.024
bows+0.020
fl+0.019
ase+0.019
 has+0.019
Neg logit contributions
 fright-0.044
 reassure-0.044
 intervene-0.044
 cautiously-0.043
 unexpected-0.041
Token bed
Token ID3996
Feature activation-6.15e-02

Pos logit contributions
robe+0.072
ase+0.061
tub+0.060
 Now+0.059
 has+0.059
Neg logit contributions
ck-0.081
 reassure-0.081
 unexpected-0.076
 swat-0.076
 fright-0.075
Token,
Token ID11
Feature activation-4.98e-03

Pos logit contributions
robe+0.084
 Now+0.082
unes+0.082
angers+0.077
 These+0.076
Neg logit contributions
 unexpected-0.116
 reassure-0.113
ck-0.112
 fright-0.108
 intervene-0.106
Token the
Token ID262
Feature activation-3.31e-02

Pos logit contributions
robe+0.009
ase+0.008
unes+0.008
angers+0.008
owers+0.008
Neg logit contributions
 reassure-0.009
ck-0.009
 swat-0.009
 fright-0.009
 unexpected-0.009
Token clay
Token ID21558
Feature activation-3.19e-02

Pos logit contributions
robe+0.011
ase+0.009
tub+0.009
 has+0.009
 Now+0.009
Neg logit contributions
ck-0.008
 reassure-0.008
 unexpected-0.008
 fright-0.008
 swat-0.007
Token!
Token ID0
Feature activation-1.99e-02

Pos logit contributions
robe+0.062
 Now+0.057
ase+0.055
angers+0.051
 Cookies+0.050
Neg logit contributions
 unexpected-0.053
 reassure-0.053
ck-0.053
 fright-0.050
 worry-0.049
Token<|endoftext|>
Token ID50256
Feature activation-3.77e-02

Pos logit contributions
robe+0.031
ase+0.024
tub+0.024
angers+0.024
unes+0.023
Neg logit contributions
 reassure-0.043
 fright-0.042
 intervene-0.041
 cautiously-0.040
 unexpected-0.039
TokenOnce
Token ID7454
Feature activation-2.52e-02

Pos logit contributions
robe+0.053
unes+0.047
angers+0.047
umm+0.041
balls+0.041
Neg logit contributions
 fright-0.073
 reassure-0.071
Little-0.071
 closer-0.070
 nearer-0.069
Token,
Token ID11
Feature activation+1.22e-03

Pos logit contributions
robe+0.046
ase+0.039
 Now+0.038
tub+0.038
 sky+0.037
Neg logit contributions
ck-0.048
 reassure-0.047
 unexpected-0.044
 fright-0.043
 intervene-0.043
Token there
Token ID612
Feature activation-6.45e-02

Pos logit contributions
ck+0.002
 reassure+0.002
 unexpected+0.002
 fright+0.002
 swat+0.002
Neg logit contributions
robe-0.002
ase-0.002
tub-0.002
 Now-0.002
unes-0.002
Token was
Token ID373
Feature activation+2.61e-02

Pos logit contributions
robe+0.102
angers+0.087
unes+0.086
ase+0.081
tub+0.080
Neg logit contributions
ck-0.137
 reassure-0.135
 unexpected-0.133
 fright-0.129
 swat-0.127
Token a
Token ID257
Feature activation+2.82e-02

Pos logit contributions
ck+0.048
 reassure+0.047
 unexpected+0.044
 swat+0.043
 fright+0.043
Neg logit contributions
robe-0.050
ase-0.043
tub-0.043
 has-0.042
 Now-0.041
Token small
Token ID1402
Feature activation+1.93e-02

Pos logit contributions
ck+0.058
 reassure+0.057
 intervene+0.053
 fright+0.052
 unexpected+0.052
Neg logit contributions
robe-0.048
 has-0.041
tub-0.041
 sky-0.041
ase-0.040
Token rabbit
Token ID22746
Feature activation+2.71e-02

Pos logit contributions
ck+0.038
 reassure+0.038
 unexpected+0.037
 swat+0.036
 fright+0.036
Neg logit contributions
robe-0.033
 Now-0.026
ase-0.026
tub-0.026
unes-0.026
Token who
Token ID508
Feature activation-1.71e-02

Pos logit contributions
ck+0.044
 reassure+0.042
 swat+0.040
 fright+0.039
 unexpected+0.039
Neg logit contributions
robe-0.058
tub-0.050
unes-0.049
 Now-0.049
ase-0.048
Tokenmy
Token ID1820
Feature activation+2.57e-03

Pos logit contributions
ck+0.025
 reassure+0.024
 convince+0.022
 swat+0.021
 intervene+0.021
Neg logit contributions
robe-0.032
tub-0.027
unes-0.027
angers-0.027
ase-0.026
Token said
Token ID531
Feature activation-5.20e-02

Pos logit contributions
ck+0.005
 reassure+0.005
 swat+0.004
 intervene+0.004
 unexpected+0.004
Neg logit contributions
robe-0.005
ase-0.004
tub-0.004
 sky-0.004
unes-0.004
Token,
Token ID11
Feature activation-2.35e-02

Pos logit contributions
robe+0.101
ase+0.087
tub+0.085
 sky+0.082
 Now+0.082
Neg logit contributions
 reassure-0.094
ck-0.093
 unexpected-0.089
 fright-0.087
 swat-0.086
Token "
Token ID366
Feature activation-1.63e-02

Pos logit contributions
robe+0.034
ase+0.029
tub+0.026
ests+0.025
bows+0.025
Neg logit contributions
 reassure-0.054
 swat-0.051
 closer-0.050
ck-0.050
 fright-0.050
TokenThis
Token ID1212
Feature activation-5.89e-02

Pos logit contributions
robe+0.030
tub+0.025
 has+0.024
ase+0.024
 Now+0.024
Neg logit contributions
ck-0.032
 reassure-0.032
 unexpected-0.029
 swat-0.029
 fright-0.029
Token bucket
Token ID19236
Feature activation-6.90e-02

Pos logit contributions
robe+0.109
angers+0.099
ase+0.094
 Now+0.091
unes+0.087
Neg logit contributions
 unexpected-0.106
 closer-0.098
 reassure-0.096
 worry-0.094
 fright-0.094
Token is
Token ID318
Feature activation+8.84e-04

Pos logit contributions
robe+0.093
angers+0.088
 Now+0.075
ase+0.073
unes+0.071
Neg logit contributions
 unexpected-0.140
ck-0.133
 fright-0.130
 worry-0.128
 closer-0.127
Token available
Token ID1695
Feature activation-1.44e-02

Pos logit contributions
ck+0.002
 unexpected+0.002
 reassure+0.002
 fright+0.001
 swat+0.001
Neg logit contributions
robe-0.002
ase-0.002
 Now-0.001
tub-0.001
angers-0.001
Token for
Token ID329
Feature activation+9.73e-03

Pos logit contributions
robe+0.025
ase+0.022
 sky+0.020
 Now+0.020
tub+0.020
Neg logit contributions
 reassure-0.027
 unexpected-0.027
ck-0.027
 fright-0.025
 closer-0.025
Token us
Token ID514
Feature activation+1.72e-02

Pos logit contributions
ck+0.018
 reassure+0.018
 swat+0.017
 fright+0.017
 intervene+0.017
Neg logit contributions
robe-0.018
 Now-0.015
tub-0.015
 has-0.015
ase-0.015
Token to
Token ID284
Feature activation+3.42e-03

Pos logit contributions
ck+0.036
 reassure+0.036
 unexpected+0.034
 fright+0.034
 swat+0.033
Neg logit contributions
robe-0.028
ase-0.023
 Now-0.023
tub-0.023
 sky-0.022