TOP ACTIVATIONS
MAX = 8.700

queen from long , long ago . Mia listened and her
my cow boys and cow girls . I love you ."
velvet box all those years ago . They were glad that
who had died a year ago . It was very precious
best friend a long time ago . He was my dog
family from a long time ago ." " Maybe
of animals that lived long ago . They saw shiny rocks
and plants that lived long ago ," she says .
who lived a long time ago ," mom said .
is from a long time ago , when a king lived
They were here a minute ago ." " Maybe
is from a long time ago , when people lived here
's from a long time ago , when a good man
lost it a long time ago . I thought I would
is from a long time ago , when people used different
They lived a long time ago . They had big bones
watching it stir around and gra ze on the grass .
says that a long time ago , a dragon used to
tree and watched the cows gra ze . She felt happy
treasure from a long time ago ." The

MIN ACTIVATIONS
MIN = -7.426

wand won ’ t fit in my box ,
I won ’ t let anything bad happen to
I won ’ t do it again . ”
big before ! She could even sit inside it .
creatures . One day , Cr ust y found a shiny
the squirrel decided to take an adventure in the forest .
we won ’ t be able to breathe .
, she wanted to take an exam . But before she
so cool . She couldn 't wait to rush outside and
named Cr ust y . Cr ust y lived in a
I won ’ t let anything bad happen to
I won ’ t hurt you . ”
tree won ’ t feel alone . ”
Tim my 's friends couldn 't come out to play .
Dad won ’ t like it . ”
It won ’ t grow here no more because
it fits !" She helped Sil via put on the coat
Lu cy couldn 't stop wondering who the mail
. She played and won an amazing prize . It was
so he decided to take an exam . He was a

INTERVAL 4.668 - 8.700
CONTAINS 0.073%

" Yes , Tim . Be careful and don 't get
a strange man in the park . He has a box
." Her mom says , " I love you
bull ," Lily and Ben say . " You are big
our country a long time ago . He helped many people

INTERVAL 0.637 - 4.668
CONTAINS 33.960%

. The animals were having so much fun
ily took off one of her pink m ittens and put
to pet it again ." Her mom said "
It was brown and smelled delicious . Mom my said ,
other . Before long , the other animals in the park

INTERVAL -3.395 - 0.637
CONTAINS 63.674%

flowers . Sue picked a blue flower that would match the
had so much fun . One day , they
Mom heard the noise . She came to the
cave . The dragon was trembling in fear and carrying a
the world . She had a wonderful morning and a weird

INTERVAL -7.426 - -3.395
CONTAINS 2.293%

. Suddenly , Tommy saw something shiny on the ground !
, but he couldn 't find anything . As he walked
of the fle a . L ily said
in circles , looking for an escape . Suddenly
, but then he noticed something else had landed on the
Token queen
Token ID16599
Feature activation+1.07e-01

Pos logit contributions
 unexpected+0.022
 frightening+0.021
 scary+0.021
 since+0.021
sm+0.020
Neg logit contributions
bes-0.024
amas-0.022
ippers-0.021
 ramp-0.021
ates-0.020
Token from
Token ID422
Feature activation+4.57e+00

Pos logit contributions
bes+0.004
amas+0.004
 ramp+0.004
ippers+0.004
 wrists+0.004
Neg logit contributions
 unexpected-0.001
 since-0.001
sm-0.001
 frightening-0.001
 over-0.001
Token long
Token ID890
Feature activation+1.25e+00

Pos logit contributions
bes+0.155
 clinic+0.149
 ramp+0.147
 wrists+0.146
pel+0.144
Neg logit contributions
 unexpected-0.084
 frightening-0.078
sm-0.077
 unpleasant-0.076
 since-0.076
Token,
Token ID11
Feature activation+9.54e-01

Pos logit contributions
bes+0.038
amas+0.036
 ramp+0.035
ippers+0.034
ates+0.033
Neg logit contributions
 unexpected-0.032
 since-0.030
 scary-0.029
 frightening-0.029
 dangerous-0.028
Token long
Token ID890
Feature activation+1.96e+00

Pos logit contributions
bes+0.032
 ramp+0.031
 clinic+0.030
 slot+0.029
 wrists+0.029
Neg logit contributions
 unexpected-0.020
sm-0.017
 frightening-0.017
 unpleasant-0.017
 since-0.016
Token ago
Token ID2084
Feature activation+8.70e+00

Pos logit contributions
bes+0.055
 ramp+0.053
amas+0.052
ippers+0.051
 wrists+0.049
Neg logit contributions
 unexpected-0.052
 since-0.050
 scary-0.049
 frightening-0.049
sm-0.047
Token.
Token ID13
Feature activation+6.09e-01

Pos logit contributions
 ramp+0.410
 clinic+0.382
pel+0.377
Elsa+0.376
amas+0.376
Neg logit contributions
!-0.060
 since-0.059
 to-0.052
 over-0.044
ful-0.043
Token Mia
Token ID32189
Feature activation-2.17e-01

Pos logit contributions
bes+0.019
 ramp+0.018
ippers+0.017
amas+0.017
 wrists+0.017
Neg logit contributions
 unexpected-0.015
 frightening-0.014
 since-0.014
sm-0.014
 scary-0.013
Token listened
Token ID16399
Feature activation-3.13e+00

Pos logit contributions
 unexpected+0.004
 since+0.004
 over+0.004
 frightening+0.004
sm+0.003
Neg logit contributions
ippers-0.008
 ramp-0.008
bes-0.008
amas-0.008
 clinic-0.007
Token and
Token ID290
Feature activation+1.04e-01

Pos logit contributions
 unexpected+0.054
 frightening+0.047
sm+0.045
 unpleasant+0.043
 dangerous+0.041
Neg logit contributions
bes-0.118
amas-0.110
 ramp-0.110
 wrists-0.109
ippers-0.109
Token her
Token ID607
Feature activation+1.60e+00

Pos logit contributions
bes+0.003
 ramp+0.003
ippers+0.003
amas+0.003
 clinic+0.003
Neg logit contributions
 unexpected-0.002
 frightening-0.002
 since-0.002
 scary-0.002
sm-0.002
Token my
Token ID616
Feature activation+2.77e+00

Pos logit contributions
bes+0.019
ippers+0.018
 ramp+0.018
 clinic+0.018
pel+0.018
Neg logit contributions
 unexpected-0.010
 frightening-0.009
 since-0.009
 scary-0.009
 dangerous-0.008
Token cow
Token ID9875
Feature activation+3.33e+00

Pos logit contributions
bes+0.097
 ramp+0.090
pel+0.089
 clinic+0.088
ippers+0.088
Neg logit contributions
 unexpected-0.059
 scary-0.055
 frightening-0.053
 dangerous-0.053
 amazing-0.052
Tokenboys
Token ID13202
Feature activation+9.64e-01

Pos logit contributions
 ramp+0.124
bes+0.123
ippers+0.121
amas+0.119
 slot+0.118
Neg logit contributions
sm-0.055
 unexpected-0.053
 since-0.050
 scary-0.049
 over-0.048
Token and
Token ID290
Feature activation+1.49e+00

Pos logit contributions
bes+0.039
 ramp+0.038
ippers+0.036
 wrists+0.036
 slot+0.036
Neg logit contributions
 since-0.013
 unexpected-0.013
sm-0.013
 over-0.012
 scary-0.012
Token cow
Token ID9875
Feature activation+3.05e+00

Pos logit contributions
bes+0.045
 ramp+0.043
ippers+0.042
amas+0.042
pel+0.041
Neg logit contributions
 unexpected-0.037
 frightening-0.035
 scary-0.034
 since-0.034
 dangerous-0.033
Tokengirls
Token ID36960
Feature activation+8.50e+00

Pos logit contributions
bes+0.118
 ramp+0.116
amas+0.112
ippers+0.112
pel+0.112
Neg logit contributions
 unexpected-0.046
sm-0.045
 since-0.042
 frightening-0.040
 scary-0.040
Token.
Token ID13
Feature activation+1.35e+00

Pos logit contributions
 ramp+0.362
bes+0.351
ippers+0.343
amas+0.339
 slot+0.334
Neg logit contributions
 over-0.087
!-0.082
 unexpected-0.080
 since-0.079
sm-0.077
Token I
Token ID314
Feature activation+9.15e-01

Pos logit contributions
bes+0.037
 ramp+0.035
 wrists+0.034
ippers+0.032
 clinic+0.032
Neg logit contributions
 since-0.036
 unexpected-0.036
sm-0.035
 frightening-0.034
 scary-0.034
Token love
Token ID1842
Feature activation+6.19e-01

Pos logit contributions
bes+0.033
 ramp+0.032
amas+0.032
ippers+0.031
 wrists+0.031
Neg logit contributions
 unexpected-0.018
sm-0.016
 frightening-0.016
 scary-0.015
 since-0.015
Token you
Token ID345
Feature activation+8.67e-01

Pos logit contributions
bes+0.024
 ramp+0.023
ippers+0.022
amas+0.022
 clinic+0.022
Neg logit contributions
 unexpected-0.010
 frightening-0.009
sm-0.009
 since-0.009
 scary-0.008
Token."
Token ID526
Feature activation+2.13e+00

Pos logit contributions
bes+0.033
 ramp+0.032
ippers+0.031
 clinic+0.030
 wrists+0.030
Neg logit contributions
 unexpected-0.015
 since-0.014
sm-0.013
 frightening-0.013
 scary-0.012
Token velvet
Token ID47750
Feature activation+3.69e+00

Pos logit contributions
bes+0.008
 ramp+0.008
ippers+0.008
amas+0.008
 clinic+0.007
Neg logit contributions
 unexpected-0.005
 frightening-0.004
 scary-0.004
sm-0.004
 since-0.004
Token box
Token ID3091
Feature activation+4.89e-01

Pos logit contributions
bes+0.105
ippers+0.103
 ramp+0.102
pel+0.101
amas+0.101
Neg logit contributions
 unexpected-0.093
sm-0.086
 since-0.085
 frightening-0.084
 scary-0.082
Token all
Token ID477
Feature activation-1.82e+00

Pos logit contributions
bes+0.019
 ramp+0.018
amas+0.017
ippers+0.017
 clinic+0.017
Neg logit contributions
 unexpected-0.008
 frightening-0.008
sm-0.007
 since-0.007
 scary-0.007
Token those
Token ID883
Feature activation+1.37e+00

Pos logit contributions
 unexpected+0.031
 frightening+0.029
 unpleasant+0.028
sm+0.026
vious+0.024
Neg logit contributions
bes-0.062
 ramp-0.061
ates-0.058
 slot-0.058
 kit-0.057
Token years
Token ID812
Feature activation-1.44e-01

Pos logit contributions
bes+0.048
 ramp+0.046
 wrists+0.045
 clinic+0.045
ippers+0.045
Neg logit contributions
 unexpected-0.024
 frightening-0.023
sm-0.022
 unpleasant-0.021
 since-0.021
Token ago
Token ID2084
Feature activation+8.49e+00

Pos logit contributions
 unexpected+0.003
 frightening+0.003
sm+0.002
 unpleasant+0.002
 since+0.002
Neg logit contributions
bes-0.005
 ramp-0.005
ippers-0.005
 clinic-0.005
amas-0.005
Token.
Token ID13
Feature activation+2.98e-01

Pos logit contributions
 ramp+0.369
 clinic+0.355
bes+0.352
pel+0.350
amas+0.348
Neg logit contributions
 since-0.089
 unexpected-0.083
 over-0.072
sm-0.071
!-0.070
Token They
Token ID1119
Feature activation+2.27e+00

Pos logit contributions
bes+0.009
 ramp+0.008
 clinic+0.007
amas+0.007
 wrists+0.007
Neg logit contributions
 since-0.008
 unexpected-0.008
 over-0.008
sm-0.007
 frightening-0.007
Token were
Token ID547
Feature activation+3.04e+00

Pos logit contributions
bes+0.076
 ramp+0.073
amas+0.072
ippers+0.071
 clinic+0.070
Neg logit contributions
 unexpected-0.052
 since-0.047
 frightening-0.046
sm-0.045
 scary-0.043
Token glad
Token ID9675
Feature activation+6.03e-01

Pos logit contributions
bes+0.128
 ramp+0.126
amas+0.123
ippers+0.122
pel+0.120
Neg logit contributions
 unexpected-0.045
 amazing-0.039
 frightening-0.038
 over-0.038
 since-0.037
Token that
Token ID326
Feature activation-3.00e-02

Pos logit contributions
bes+0.023
 ramp+0.022
ippers+0.021
amas+0.021
 clinic+0.021
Neg logit contributions
 unexpected-0.010
 frightening-0.009
sm-0.009
 since-0.009
 unpleasant-0.008
Token who
Token ID508
Feature activation+1.64e+00

Pos logit contributions
bes+0.023
 ramp+0.022
ippers+0.022
amas+0.022
pel+0.021
Neg logit contributions
 unexpected-0.013
 since-0.013
 frightening-0.012
 scary-0.012
sm-0.012
Token had
Token ID550
Feature activation+1.13e+00

Pos logit contributions
bes+0.054
 ramp+0.053
amas+0.053
ippers+0.052
 clinic+0.050
Neg logit contributions
 unexpected-0.038
sm-0.035
 frightening-0.034
 since-0.033
 scary-0.033
Token died
Token ID3724
Feature activation-9.24e-01

Pos logit contributions
bes+0.041
 ramp+0.040
amas+0.039
ippers+0.039
 clinic+0.038
Neg logit contributions
 unexpected-0.022
 frightening-0.020
sm-0.019
 scary-0.019
 dangerous-0.019
Token a
Token ID257
Feature activation-3.72e-01

Pos logit contributions
 unexpected+0.017
 frightening+0.015
sm+0.015
 since+0.014
 unpleasant+0.014
Neg logit contributions
bes-0.034
 ramp-0.033
amas-0.032
ippers-0.032
 clinic-0.032
Token year
Token ID614
Feature activation+6.08e-01

Pos logit contributions
 unexpected+0.009
 frightening+0.009
 scary+0.008
sm+0.008
 dangerous+0.008
Neg logit contributions
bes-0.012
amas-0.011
 ramp-0.011
ippers-0.011
ates-0.011
Token ago
Token ID2084
Feature activation+8.34e+00

Pos logit contributions
bes+0.017
 slot+0.016
 clinic+0.016
 laptop+0.016
 ramp+0.015
Neg logit contributions
 unexpected-0.015
 frightening-0.014
 unpleasant-0.014
sm-0.013
 terrifying-0.013
Token.
Token ID13
Feature activation+1.33e+00

Pos logit contributions
 ramp+0.365
pel+0.351
 clinic+0.348
bes+0.344
amas+0.342
Neg logit contributions
 since-0.076
 unexpected-0.065
!-0.065
 over-0.064
 to-0.059
Token It
Token ID632
Feature activation-9.35e-01

Pos logit contributions
bes+0.039
 ramp+0.038
ippers+0.036
 wrists+0.036
amas+0.035
Neg logit contributions
 unexpected-0.034
 frightening-0.032
 since-0.031
sm-0.031
 scary-0.030
Token was
Token ID373
Feature activation+2.80e+00

Pos logit contributions
 unexpected+0.024
 frightening+0.022
sm+0.022
 since+0.021
 unpleasant+0.021
Neg logit contributions
bes-0.028
 ramp-0.026
amas-0.025
ippers-0.025
 clinic-0.025
Token very
Token ID845
Feature activation+1.13e-01

Pos logit contributions
bes+0.096
 ramp+0.091
 clinic+0.089
amas+0.088
ippers+0.088
Neg logit contributions
 unexpected-0.057
 frightening-0.052
sm-0.050
 since-0.049
 unpleasant-0.048
Token precious
Token ID14186
Feature activation+1.66e-01

Pos logit contributions
bes+0.004
 clinic+0.004
 slot+0.003
ippers+0.003
 laptop+0.003
Neg logit contributions
 since-0.002
 unexpected-0.002
sm-0.002
 frightening-0.002
filled-0.002
Token best
Token ID1266
Feature activation+2.14e+00

Pos logit contributions
bes+0.046
 ramp+0.042
amas+0.041
pel+0.040
ippers+0.040
Neg logit contributions
 unexpected-0.091
 frightening-0.087
 scary-0.086
 dangerous-0.084
sm-0.084
Token friend
Token ID1545
Feature activation+2.15e+00

Pos logit contributions
bes+0.064
 ramp+0.060
amas+0.058
 clinic+0.057
pel+0.057
Neg logit contributions
 unexpected-0.054
 frightening-0.050
 since-0.049
sm-0.049
 scary-0.048
Token a
Token ID257
Feature activation+1.13e+00

Pos logit contributions
bes+0.081
 ramp+0.078
amas+0.076
ippers+0.076
 clinic+0.075
Neg logit contributions
 unexpected-0.038
 frightening-0.033
 since-0.033
sm-0.033
 scary-0.031
Token long
Token ID890
Feature activation+2.81e+00

Pos logit contributions
bes+0.033
 ramp+0.032
 clinic+0.031
ippers+0.030
amas+0.030
Neg logit contributions
 unexpected-0.029
 frightening-0.026
sm-0.026
 since-0.025
 unpleasant-0.025
Token time
Token ID640
Feature activation+4.08e+00

Pos logit contributions
bes+0.086
amas+0.085
 ramp+0.083
ippers+0.082
ates+0.080
Neg logit contributions
 unexpected-0.064
 since-0.061
 scary-0.059
 frightening-0.058
 dangerous-0.057
Token ago
Token ID2084
Feature activation+8.25e+00

Pos logit contributions
bes+0.141
 ramp+0.135
amas+0.130
ippers+0.130
 clinic+0.130
Neg logit contributions
 unexpected-0.084
 frightening-0.077
sm-0.073
 since-0.072
 scary-0.071
Token.
Token ID13
Feature activation+1.08e+00

Pos logit contributions
 ramp+0.354
bes+0.349
pel+0.346
amas+0.341
 clinic+0.340
Neg logit contributions
 since-0.082
 over-0.070
 unexpected-0.069
!-0.065
 to-0.061
Token He
Token ID679
Feature activation+2.37e+00

Pos logit contributions
bes+0.036
 ramp+0.034
amas+0.032
ippers+0.032
 wrists+0.032
Neg logit contributions
 unexpected-0.023
 since-0.023
 frightening-0.022
sm-0.022
 over-0.021
Token was
Token ID373
Feature activation+2.80e+00

Pos logit contributions
bes+0.078
amas+0.077
 ramp+0.076
ippers+0.075
 wrists+0.073
Neg logit contributions
 unexpected-0.054
 dangerous-0.051
 scary-0.051
 frightening-0.050
sm-0.049
Token my
Token ID616
Feature activation+3.84e+00

Pos logit contributions
bes+0.102
 ramp+0.098
amas+0.096
ippers+0.095
 clinic+0.094
Neg logit contributions
 unexpected-0.054
 frightening-0.049
sm-0.046
 scary-0.045
 since-0.045
Token dog
Token ID3290
Feature activation+2.83e+00

Pos logit contributions
bes+0.120
amas+0.109
 ramp+0.109
pel+0.106
ippers+0.104
Neg logit contributions
 unexpected-0.099
 frightening-0.092
 scary-0.090
 since-0.087
 dangerous-0.087
Token family
Token ID1641
Feature activation+1.66e+00

Pos logit contributions
bes+0.107
 ramp+0.103
 clinic+0.100
ippers+0.098
 wrists+0.097
Neg logit contributions
 unexpected-0.124
 frightening-0.117
sm-0.114
 since-0.112
 unpleasant-0.112
Token from
Token ID422
Feature activation+4.56e+00

Pos logit contributions
bes+0.062
 ramp+0.059
ippers+0.057
amas+0.057
 clinic+0.056
Neg logit contributions
 unexpected-0.031
 frightening-0.027
 since-0.026
sm-0.026
 scary-0.025
Token a
Token ID257
Feature activation+4.06e-01

Pos logit contributions
bes+0.141
 clinic+0.139
 ramp+0.137
 wrists+0.133
 laptop+0.131
Neg logit contributions
 unexpected-0.094
 frightening-0.091
sm-0.090
 unpleasant-0.088
vious-0.084
Token long
Token ID890
Feature activation+2.13e+00

Pos logit contributions
bes+0.012
 ramp+0.011
 clinic+0.011
amas+0.011
ippers+0.011
Neg logit contributions
 unexpected-0.010
 frightening-0.010
sm-0.009
 unpleasant-0.009
 since-0.009
Token time
Token ID640
Feature activation+3.98e+00

Pos logit contributions
bes+0.073
ippers+0.069
amas+0.069
ates+0.067
 ramp+0.066
Neg logit contributions
 since-0.045
 unexpected-0.044
 scary-0.040
 frightening-0.039
 dangerous-0.038
Token ago
Token ID2084
Feature activation+8.17e+00

Pos logit contributions
bes+0.139
 ramp+0.133
amas+0.129
ippers+0.128
 clinic+0.128
Neg logit contributions
 unexpected-0.081
 frightening-0.074
 since-0.070
sm-0.070
 scary-0.068
Token."
Token ID526
Feature activation+1.83e+00

Pos logit contributions
 ramp+0.341
bes+0.337
pel+0.333
 clinic+0.326
amas+0.325
Neg logit contributions
 since-0.086
 unexpected-0.082
 over-0.076
!-0.072
sm-0.065
Token
Token ID198
Feature activation+7.92e-01

Pos logit contributions
bes+0.062
 ramp+0.059
 wrists+0.058
amas+0.058
pel+0.057
Neg logit contributions
 unexpected-0.039
 frightening-0.035
sm-0.035
 since-0.034
 scary-0.034
Token
Token ID198
Feature activation+1.05e+00

Pos logit contributions
bes+0.022
 ramp+0.021
ippers+0.020
 clinic+0.020
amas+0.020
Neg logit contributions
 unexpected-0.022
 frightening-0.021
 since-0.020
 unpleasant-0.020
sm-0.019
Token"
Token ID1
Feature activation+1.09e+00

Pos logit contributions
bes+0.032
amas+0.031
 ramp+0.031
 wrists+0.030
pel+0.030
Neg logit contributions
 unexpected-0.025
sm-0.023
 since-0.023
 frightening-0.022
 dangerous-0.022
TokenMaybe
Token ID13300
Feature activation-2.71e+00

Pos logit contributions
bes+0.034
amas+0.033
 ramp+0.033
ippers+0.032
 clinic+0.031
Neg logit contributions
 unexpected-0.028
 frightening-0.025
sm-0.024
 scary-0.023
 unpleasant-0.023
Token of
Token ID286
Feature activation+1.54e+00

Pos logit contributions
bes+0.021
amas+0.020
 ramp+0.020
 wrists+0.020
ippers+0.020
Neg logit contributions
 unexpected-0.011
 frightening-0.009
 scary-0.009
 since-0.009
sm-0.009
Token animals
Token ID4695
Feature activation+2.38e+00

Pos logit contributions
bes+0.042
 ramp+0.041
 clinic+0.039
 wrists+0.038
ippers+0.038
Neg logit contributions
 unexpected-0.041
 frightening-0.039
sm-0.037
 since-0.037
 unpleasant-0.037
Token that
Token ID326
Feature activation+1.82e+00

Pos logit contributions
bes+0.089
amas+0.085
 ramp+0.084
ippers+0.083
pel+0.082
Neg logit contributions
 unexpected-0.043
 since-0.039
 frightening-0.038
 scary-0.037
 over-0.036
Token lived
Token ID5615
Feature activation+3.38e+00

Pos logit contributions
bes+0.061
amas+0.058
 ramp+0.058
ippers+0.057
 clinic+0.056
Neg logit contributions
 unexpected-0.041
 frightening-0.037
 scary-0.036
 since-0.035
 dangerous-0.035
Token long
Token ID890
Feature activation+2.80e+00

Pos logit contributions
bes+0.127
 ramp+0.121
amas+0.119
ippers+0.118
 wrists+0.117
Neg logit contributions
 unexpected-0.061
 frightening-0.054
sm-0.052
 since-0.051
 scary-0.050
Token ago
Token ID2084
Feature activation+8.16e+00

Pos logit contributions
bes+0.096
 ramp+0.092
amas+0.092
ippers+0.092
 wrists+0.090
Neg logit contributions
 unexpected-0.055
 since-0.052
 frightening-0.051
 scary-0.050
 dangerous-0.048
Token.
Token ID13
Feature activation-5.90e-01

Pos logit contributions
 ramp+0.367
pel+0.351
 clinic+0.348
Elsa+0.346
amas+0.344
Neg logit contributions
 since-0.065
!-0.060
 over-0.057
 to-0.049
 unexpected-0.044
Token They
Token ID1119
Feature activation+2.19e+00

Pos logit contributions
 unexpected+0.014
 over+0.014
 since+0.014
 frightening+0.013
 scary+0.013
Neg logit contributions
bes-0.018
 ramp-0.017
pel-0.016
amas-0.016
ates-0.016
Token saw
Token ID2497
Feature activation-1.86e+00

Pos logit contributions
bes+0.073
 ramp+0.069
amas+0.069
ippers+0.067
 clinic+0.067
Neg logit contributions
 unexpected-0.050
 frightening-0.045
 since-0.044
 scary-0.043
sm-0.043
Token shiny
Token ID22441
Feature activation+2.70e-01

Pos logit contributions
 unexpected+0.044
 frightening+0.041
sm+0.039
 since+0.039
 scary+0.038
Neg logit contributions
bes-0.058
 ramp-0.056
amas-0.054
ippers-0.054
 clinic-0.053
Token rocks
Token ID12586
Feature activation-1.19e-01

Pos logit contributions
bes+0.007
 ramp+0.007
ippers+0.006
amas+0.006
 clinic+0.006
Neg logit contributions
 unexpected-0.008
 frightening-0.007
sm-0.007
 since-0.007
 unpleasant-0.007
Token and
Token ID290
Feature activation+1.40e+00

Pos logit contributions
bes+0.078
 ramp+0.074
 clinic+0.072
ippers+0.072
 wrists+0.071
Neg logit contributions
 unexpected-0.047
 frightening-0.043
sm-0.041
 unpleasant-0.040
 since-0.039
Token plants
Token ID6134
Feature activation+1.28e+00

Pos logit contributions
bes+0.036
 ramp+0.034
 wrists+0.033
 clinic+0.032
ippers+0.032
Neg logit contributions
 unexpected-0.040
 frightening-0.038
sm-0.037
 since-0.036
 unpleasant-0.036
Token that
Token ID326
Feature activation+1.54e+00

Pos logit contributions
bes+0.045
 ramp+0.043
ippers+0.041
 clinic+0.041
amas+0.041
Neg logit contributions
 unexpected-0.026
 frightening-0.024
sm-0.022
 since-0.022
 unpleasant-0.022
Token lived
Token ID5615
Feature activation+3.68e+00

Pos logit contributions
bes+0.047
 ramp+0.044
amas+0.043
ippers+0.043
 clinic+0.042
Neg logit contributions
 unexpected-0.039
 frightening-0.036
 scary-0.035
sm-0.034
 dangerous-0.034
Token long
Token ID890
Feature activation+2.91e+00

Pos logit contributions
bes+0.128
 ramp+0.122
 clinic+0.118
ippers+0.118
 wrists+0.116
Neg logit contributions
 unexpected-0.074
 frightening-0.069
sm-0.064
 unpleasant-0.064
 since-0.064
Token ago
Token ID2084
Feature activation+8.06e+00

Pos logit contributions
bes+0.096
 ramp+0.092
amas+0.089
ippers+0.089
 clinic+0.088
Neg logit contributions
 unexpected-0.064
 frightening-0.059
 since-0.057
sm-0.056
 scary-0.056
Token,"
Token ID553
Feature activation+2.37e+00

Pos logit contributions
 ramp+0.325
bes+0.315
pel+0.314
 clinic+0.310
amas+0.308
Neg logit contributions
 unexpected-0.100
 since-0.095
 over-0.084
 dangerous-0.083
sm-0.080
Token she
Token ID673
Feature activation-1.39e-01

Pos logit contributions
 ramp+0.058
bes+0.057
 wrists+0.055
 clinic+0.054
 slot+0.054
Neg logit contributions
 unexpected-0.066
 frightening-0.064
sm-0.063
 since-0.062
 unpleasant-0.060
Token says
Token ID1139
Feature activation+4.66e+00

Pos logit contributions
 unexpected+0.003
 frightening+0.003
 scary+0.003
 unpleasant+0.003
 dangerous+0.003
Neg logit contributions
bes-0.004
 ramp-0.004
ippers-0.004
amas-0.004
 clinic-0.004
Token.
Token ID13
Feature activation+1.51e+00

Pos logit contributions
bes+0.190
 ramp+0.186
pel+0.178
amas+0.178
 clinic+0.178
Neg logit contributions
 unexpected-0.063
sm-0.056
 frightening-0.056
 over-0.056
 since-0.054
Token
Token ID198
Feature activation+6.38e-01

Pos logit contributions
bes+0.047
 ramp+0.044
amas+0.043
ippers+0.043
 clinic+0.043
Neg logit contributions
 unexpected-0.037
 frightening-0.034
sm-0.033
 since-0.032
 scary-0.032
Token who
Token ID508
Feature activation+1.80e+00

Pos logit contributions
bes+0.043
 ramp+0.041
amas+0.040
ippers+0.039
 clinic+0.039
Neg logit contributions
 unexpected-0.025
 frightening-0.023
sm-0.022
 since-0.022
 scary-0.021
Token lived
Token ID5615
Feature activation+3.53e+00

Pos logit contributions
bes+0.055
 ramp+0.053
amas+0.051
ippers+0.050
 clinic+0.050
Neg logit contributions
 unexpected-0.045
 frightening-0.042
sm-0.040
 since-0.040
 scary-0.039
Token a
Token ID257
Feature activation+1.38e+00

Pos logit contributions
bes+0.119
 ramp+0.112
ippers+0.111
 clinic+0.110
amas+0.109
Neg logit contributions
 unexpected-0.074
 frightening-0.069
 unpleasant-0.065
 since-0.064
sm-0.063
Token long
Token ID890
Feature activation+2.72e+00

Pos logit contributions
bes+0.035
 ramp+0.032
amas+0.032
ippers+0.031
pel+0.030
Neg logit contributions
 unexpected-0.042
 frightening-0.040
sm-0.038
 scary-0.038
 since-0.038
Token time
Token ID640
Feature activation+4.03e+00

Pos logit contributions
bes+0.083
amas+0.080
ippers+0.077
 ramp+0.077
 wrists+0.074
Neg logit contributions
 since-0.064
 unexpected-0.063
 scary-0.060
 frightening-0.059
sm-0.057
Token ago
Token ID2084
Feature activation+8.06e+00

Pos logit contributions
bes+0.140
 ramp+0.135
amas+0.130
ippers+0.129
 clinic+0.129
Neg logit contributions
 unexpected-0.083
 frightening-0.075
sm-0.072
 since-0.071
 scary-0.070
Token,"
Token ID553
Feature activation+2.53e+00

Pos logit contributions
 ramp+0.340
pel+0.329
bes+0.326
 clinic+0.320
amas+0.317
Neg logit contributions
 since-0.090
 unexpected-0.081
 over-0.075
!-0.074
 to-0.068
Token mom
Token ID1995
Feature activation-1.40e+00

Pos logit contributions
bes+0.080
 ramp+0.076
amas+0.073
ippers+0.073
 clinic+0.072
Neg logit contributions
 unexpected-0.061
 frightening-0.056
sm-0.054
 since-0.054
 scary-0.053
Token said
Token ID531
Feature activation+3.91e+00

Pos logit contributions
 unexpected+0.032
 frightening+0.030
sm+0.028
 unpleasant+0.028
 scary+0.027
Neg logit contributions
bes-0.045
 ramp-0.042
 wrists-0.041
amas-0.041
 clinic-0.041
Token.
Token ID13
Feature activation+1.68e+00

Pos logit contributions
bes+0.153
 ramp+0.150
pel+0.142
amas+0.142
 clinic+0.141
Neg logit contributions
 unexpected-0.060
 over-0.056
 since-0.055
sm-0.055
 frightening-0.055
Token
Token ID198
Feature activation+6.34e-01

Pos logit contributions
bes+0.053
 ramp+0.051
ippers+0.049
amas+0.049
 clinic+0.048
Neg logit contributions
 unexpected-0.040
 frightening-0.037
sm-0.036
 since-0.035
 scary-0.035
Token is
Token ID318
Feature activation+2.38e+00

Pos logit contributions
 unexpected+0.020
 scary+0.019
 frightening+0.019
 dangerous+0.018
 over+0.018
Neg logit contributions
bes-0.023
ippers-0.023
amas-0.022
 ramp-0.021
 clinic-0.021
Token from
Token ID422
Feature activation+3.84e+00

Pos logit contributions
bes+0.088
 ramp+0.084
amas+0.082
ippers+0.082
 clinic+0.082
Neg logit contributions
 unexpected-0.044
 frightening-0.040
sm-0.037
 since-0.037
 scary-0.036
Token a
Token ID257
Feature activation+4.70e-01

Pos logit contributions
bes+0.132
 ramp+0.127
 clinic+0.123
ippers+0.123
 wrists+0.122
Neg logit contributions
 unexpected-0.076
 frightening-0.071
sm-0.068
 since-0.066
 unpleasant-0.065
Token long
Token ID890
Feature activation+1.85e+00

Pos logit contributions
bes+0.014
amas+0.013
ippers+0.013
ates+0.013
 ramp+0.013
Neg logit contributions
 unexpected-0.012
 frightening-0.011
 scary-0.011
 dangerous-0.011
 since-0.011
Token time
Token ID640
Feature activation+3.97e+00

Pos logit contributions
bes+0.059
amas+0.057
ippers+0.055
ates+0.054
arians+0.054
Neg logit contributions
 since-0.041
 unexpected-0.041
 scary-0.038
 dangerous-0.038
 frightening-0.035
Token ago
Token ID2084
Feature activation+8.01e+00

Pos logit contributions
bes+0.128
 ramp+0.122
ippers+0.118
amas+0.118
 clinic+0.117
Neg logit contributions
 unexpected-0.091
 frightening-0.084
sm-0.080
 since-0.079
 unpleasant-0.078
Token,
Token ID11
Feature activation+3.10e-01

Pos logit contributions
 ramp+0.339
bes+0.328
pel+0.324
 clinic+0.321
amas+0.314
Neg logit contributions
 since-0.088
 over-0.077
!-0.073
 unexpected-0.072
 to-0.070
Token when
Token ID618
Feature activation+1.67e+00

Pos logit contributions
bes+0.011
 ramp+0.011
ippers+0.011
amas+0.011
 clinic+0.011
Neg logit contributions
 unexpected-0.006
 frightening-0.005
sm-0.005
 since-0.005
 scary-0.005
Token a
Token ID257
Feature activation-2.44e-01

Pos logit contributions
bes+0.064
 ramp+0.061
amas+0.059
ippers+0.059
 clinic+0.059
Neg logit contributions
 unexpected-0.030
 frightening-0.027
 scary-0.025
 since-0.025
 dangerous-0.025
Token king
Token ID5822
Feature activation-1.83e+00

Pos logit contributions
 unexpected+0.006
 scary+0.005
 frightening+0.005
 dangerous+0.005
sm+0.005
Neg logit contributions
bes-0.008
amas-0.007
ippers-0.007
ates-0.007
 wrists-0.007
Token lived
Token ID5615
Feature activation+3.21e+00

Pos logit contributions
 unexpected+0.036
sm+0.035
 over+0.033
 since+0.033
 scary+0.032
Neg logit contributions
bes-0.065
 wrists-0.061
amas-0.060
 ramp-0.060
ippers-0.060
TokenThey
Token ID2990
Feature activation+3.05e+00

Pos logit contributions
bes+0.008
 ramp+0.008
amas+0.008
ippers+0.008
 clinic+0.008
Neg logit contributions
 unexpected-0.006
 frightening-0.006
sm-0.006
 since-0.005
 scary-0.005
Token were
Token ID547
Feature activation+3.06e+00

Pos logit contributions
bes+0.095
 ramp+0.090
amas+0.088
ippers+0.088
 wrists+0.087
Neg logit contributions
 unexpected-0.077
 frightening-0.069
 dangerous-0.068
sm-0.068
 scary-0.068
Token here
Token ID994
Feature activation+2.26e+00

Pos logit contributions
bes+0.105
 ramp+0.100
amas+0.097
ippers+0.096
 clinic+0.096
Neg logit contributions
 unexpected-0.065
 frightening-0.059
sm-0.056
 since-0.056
 unpleasant-0.055
Token a
Token ID257
Feature activation+1.50e-01

Pos logit contributions
bes+0.074
 clinic+0.073
 ramp+0.072
amas+0.071
ippers+0.070
Neg logit contributions
 unexpected-0.044
 frightening-0.041
 unpleasant-0.039
vious-0.038
sm-0.036
Token minute
Token ID5664
Feature activation-5.92e-01

Pos logit contributions
bes+0.005
ippers+0.004
amas+0.004
 ramp+0.004
ates+0.004
Neg logit contributions
 unexpected-0.004
 frightening-0.004
 scary-0.003
 dangerous-0.003
sm-0.003
Token ago
Token ID2084
Feature activation+8.00e+00

Pos logit contributions
 unexpected+0.011
 frightening+0.010
sm+0.009
 since+0.009
 scary+0.009
Neg logit contributions
bes-0.022
 ramp-0.021
amas-0.021
ippers-0.020
 clinic-0.020
Token."
Token ID526
Feature activation+2.50e+00

Pos logit contributions
bes+0.319
 ramp+0.315
pel+0.307
 clinic+0.304
amas+0.300
Neg logit contributions
 unexpected-0.105
 since-0.100
 over-0.094
sm-0.090
 dangerous-0.088
Token
Token ID198
Feature activation+1.26e+00

Pos logit contributions
bes+0.093
 ramp+0.089
amas+0.086
pel+0.086
 wrists+0.084
Neg logit contributions
 unexpected-0.044
 since-0.041
sm-0.040
 frightening-0.040
 scary-0.040
Token
Token ID198
Feature activation+1.50e+00

Pos logit contributions
bes+0.036
 ramp+0.034
ippers+0.033
 clinic+0.033
amas+0.032
Neg logit contributions
 unexpected-0.034
 frightening-0.032
 since-0.031
sm-0.030
 unpleasant-0.030
Token"
Token ID1
Feature activation+1.24e+00

Pos logit contributions
bes+0.052
 ramp+0.050
amas+0.050
pel+0.048
ippers+0.048
Neg logit contributions
 unexpected-0.031
 frightening-0.027
sm-0.027
 since-0.027
 scary-0.026
TokenMaybe
Token ID13300
Feature activation-2.60e+00

Pos logit contributions
bes+0.039
 ramp+0.038
amas+0.037
ippers+0.037
 clinic+0.037
Neg logit contributions
 unexpected-0.029
 frightening-0.027
sm-0.026
 scary-0.025
 since-0.025
Token is
Token ID318
Feature activation+1.62e+00

Pos logit contributions
 unexpected+0.049
 frightening+0.046
sm+0.044
 scary+0.044
 since+0.044
Neg logit contributions
bes-0.045
 ramp-0.042
ippers-0.041
amas-0.041
 clinic-0.040
Token from
Token ID422
Feature activation+3.76e+00

Pos logit contributions
bes+0.056
 ramp+0.053
 clinic+0.052
 wrists+0.052
ippers+0.051
Neg logit contributions
 unexpected-0.031
sm-0.029
 frightening-0.029
 since-0.028
 unpleasant-0.026
Token a
Token ID257
Feature activation+2.38e-01

Pos logit contributions
bes+0.125
 ramp+0.121
 clinic+0.119
 wrists+0.118
ippers+0.116
Neg logit contributions
 unexpected-0.074
 frightening-0.070
sm-0.069
 since-0.066
 unpleasant-0.065
Token long
Token ID890
Feature activation+1.84e+00

Pos logit contributions
bes+0.006
 ramp+0.006
 clinic+0.006
ippers+0.006
amas+0.006
Neg logit contributions
 unexpected-0.007
 frightening-0.006
sm-0.006
 since-0.006
 unpleasant-0.006
Token time
Token ID640
Feature activation+3.60e+00

Pos logit contributions
bes+0.063
amas+0.060
ippers+0.059
ates+0.057
 wrists+0.055
Neg logit contributions
 unexpected-0.039
 since-0.038
 scary-0.036
 dangerous-0.036
 frightening-0.035
Token ago
Token ID2084
Feature activation+7.88e+00

Pos logit contributions
bes+0.101
 ramp+0.095
ippers+0.091
 clinic+0.091
amas+0.091
Neg logit contributions
 unexpected-0.098
 frightening-0.092
sm-0.088
 unpleasant-0.087
 since-0.086
Token,
Token ID11
Feature activation-7.69e-02

Pos logit contributions
bes+0.323
 ramp+0.321
pel+0.314
 clinic+0.309
amas+0.308
Neg logit contributions
 since-0.089
 unexpected-0.088
 over-0.080
sm-0.071
!-0.070
Token when
Token ID618
Feature activation+1.43e+00

Pos logit contributions
 unexpected+0.002
 frightening+0.001
 since+0.001
sm+0.001
 scary+0.001
Neg logit contributions
bes-0.003
 ramp-0.003
amas-0.003
ippers-0.003
 clinic-0.002
Token people
Token ID661
Feature activation-8.98e-01

Pos logit contributions
bes+0.052
 ramp+0.050
amas+0.049
ippers+0.049
 clinic+0.048
Neg logit contributions
 unexpected-0.028
 frightening-0.025
 scary-0.023
 since-0.023
sm-0.023
Token lived
Token ID5615
Feature activation+3.10e+00

Pos logit contributions
 unexpected+0.021
 frightening+0.018
 scary+0.018
 dangerous+0.018
 since+0.018
Neg logit contributions
bes-0.031
amas-0.028
 ramp-0.028
ippers-0.027
 wrists-0.027
Token here
Token ID994
Feature activation+2.29e+00

Pos logit contributions
bes+0.110
 ramp+0.105
ippers+0.103
 clinic+0.102
amas+0.102
Neg logit contributions
 unexpected-0.061
 frightening-0.056
sm-0.052
 since-0.052
 unpleasant-0.052
Token's
Token ID338
Feature activation+2.80e+00

Pos logit contributions
 unexpected+0.035
 frightening+0.033
 scary+0.032
sm+0.032
 dangerous+0.031
Neg logit contributions
bes-0.037
ippers-0.036
amas-0.035
 ramp-0.035
pel-0.033
Token from
Token ID422
Feature activation+3.89e+00

Pos logit contributions
bes+0.102
 ramp+0.098
amas+0.095
ippers+0.095
 clinic+0.094
Neg logit contributions
 unexpected-0.054
 frightening-0.049
sm-0.046
 since-0.045
 scary-0.045
Token a
Token ID257
Feature activation-3.97e-02

Pos logit contributions
bes+0.133
 ramp+0.127
 clinic+0.125
 wrists+0.124
ippers+0.123
Neg logit contributions
 unexpected-0.076
 frightening-0.071
sm-0.069
 since-0.066
 unpleasant-0.066
Token long
Token ID890
Feature activation+1.40e+00

Pos logit contributions
 unexpected+0.001
 frightening+0.001
 scary+0.001
 dangerous+0.001
 since+0.001
Neg logit contributions
bes-0.001
ippers-0.001
amas-0.001
 ramp-0.001
ates-0.001
Token time
Token ID640
Feature activation+3.49e+00

Pos logit contributions
bes+0.046
amas+0.045
ates+0.043
arians+0.043
ippers+0.043
Neg logit contributions
 since-0.029
 unexpected-0.029
 scary-0.027
 dangerous-0.027
 over-0.025
Token ago
Token ID2084
Feature activation+7.87e+00

Pos logit contributions
bes+0.105
 ramp+0.099
ippers+0.096
amas+0.095
 clinic+0.095
Neg logit contributions
 unexpected-0.088
 frightening-0.081
sm-0.078
 since-0.077
 unpleasant-0.076
Token,
Token ID11
Feature activation+1.03e-01

Pos logit contributions
 ramp+0.318
bes+0.307
pel+0.306
 clinic+0.300
amas+0.298
Neg logit contributions
 since-0.100
 unexpected-0.094
 over-0.086
sm-0.080
 dangerous-0.080
Token when
Token ID618
Feature activation+1.43e+00

Pos logit contributions
bes+0.004
 ramp+0.004
amas+0.003
ippers+0.003
 clinic+0.003
Neg logit contributions
 unexpected-0.002
 frightening-0.002
sm-0.002
 since-0.002
 unpleasant-0.002
Token a
Token ID257
Feature activation-3.56e-01

Pos logit contributions
bes+0.054
 ramp+0.051
ippers+0.050
amas+0.050
 clinic+0.049
Neg logit contributions
 unexpected-0.026
 frightening-0.023
 scary-0.022
 since-0.022
 dangerous-0.022
Token good
Token ID922
Feature activation+1.06e+00

Pos logit contributions
 unexpected+0.009
 frightening+0.008
 scary+0.008
 dangerous+0.008
sm+0.008
Neg logit contributions
bes-0.011
amas-0.010
ippers-0.010
ates-0.010
 wrists-0.010
Token man
Token ID582
Feature activation-5.42e-02

Pos logit contributions
bes+0.038
amas+0.035
 wrists+0.035
 ramp+0.034
pel+0.034
Neg logit contributions
 unexpected-0.022
 scary-0.020
 frightening-0.020
 since-0.019
sm-0.019
Token lost
Token ID2626
Feature activation+1.16e-01

Pos logit contributions
bes+0.031
 ramp+0.030
amas+0.030
ippers+0.029
 wrists+0.029
Neg logit contributions
 unexpected-0.020
 frightening-0.018
sm-0.018
 since-0.018
 scary-0.017
Token it
Token ID340
Feature activation-1.11e+00

Pos logit contributions
bes+0.004
 ramp+0.004
 wrists+0.004
ippers+0.004
amas+0.004
Neg logit contributions
 unexpected-0.002
 frightening-0.002
sm-0.002
 unpleasant-0.002
 since-0.002
Token a
Token ID257
Feature activation+3.98e-01

Pos logit contributions
 unexpected+0.023
 frightening+0.021
sm+0.020
 unpleasant+0.020
 scary+0.019
Neg logit contributions
bes-0.038
 ramp-0.036
 clinic-0.035
amas-0.035
ippers-0.035
Token long
Token ID890
Feature activation+2.37e+00

Pos logit contributions
bes+0.012
 ramp+0.012
amas+0.011
ippers+0.011
 clinic+0.011
Neg logit contributions
 unexpected-0.010
 frightening-0.009
sm-0.009
 scary-0.009
 since-0.009
Token time
Token ID640
Feature activation+3.96e+00

Pos logit contributions
bes+0.064
amas+0.064
ippers+0.062
 ramp+0.061
ates+0.059
Neg logit contributions
 unexpected-0.061
 since-0.059
 scary-0.057
 dangerous-0.056
 frightening-0.056
Token ago
Token ID2084
Feature activation+7.87e+00

Pos logit contributions
bes+0.113
 ramp+0.107
amas+0.104
ippers+0.102
 clinic+0.101
Neg logit contributions
 unexpected-0.105
 frightening-0.098
 since-0.096
sm-0.095
 scary-0.094
Token.
Token ID13
Feature activation+1.12e+00

Pos logit contributions
 ramp+0.324
bes+0.314
pel+0.311
 clinic+0.308
amas+0.305
Neg logit contributions
 since-0.096
 over-0.081
!-0.077
 unexpected-0.077
 to-0.075
Token I
Token ID314
Feature activation+7.60e-01

Pos logit contributions
bes+0.033
 ramp+0.031
ippers+0.030
amas+0.030
 wrists+0.029
Neg logit contributions
 unexpected-0.029
 frightening-0.027
 since-0.027
sm-0.026
 scary-0.026
Token thought
Token ID1807
Feature activation-1.58e+00

Pos logit contributions
bes+0.025
 ramp+0.024
amas+0.023
ippers+0.023
 wrists+0.023
Neg logit contributions
 unexpected-0.018
 frightening-0.016
sm-0.016
 since-0.015
 scary-0.015
Token I
Token ID314
Feature activation+3.51e-01

Pos logit contributions
 unexpected+0.031
 frightening+0.028
sm+0.027
 since+0.026
 unpleasant+0.026
Neg logit contributions
bes-0.056
 ramp-0.054
amas-0.052
ippers-0.052
 clinic-0.052
Token would
Token ID561
Feature activation+2.70e+00

Pos logit contributions
bes+0.011
 ramp+0.010
 clinic+0.010
ippers+0.010
pel+0.010
Neg logit contributions
 unexpected-0.008
 frightening-0.008
 unpleasant-0.007
sm-0.007
 since-0.007
Token is
Token ID318
Feature activation+2.35e+00

Pos logit contributions
 unexpected+0.034
 frightening+0.032
 scary+0.031
 dangerous+0.030
sm+0.030
Neg logit contributions
bes-0.035
ippers-0.033
amas-0.033
 ramp-0.033
 clinic-0.032
Token from
Token ID422
Feature activation+3.72e+00

Pos logit contributions
bes+0.083
 ramp+0.080
 clinic+0.077
 wrists+0.077
ippers+0.076
Neg logit contributions
 unexpected-0.045
 frightening-0.041
sm-0.040
 since-0.038
 unpleasant-0.038
Token a
Token ID257
Feature activation+3.84e-02

Pos logit contributions
bes+0.127
 ramp+0.122
 wrists+0.118
 clinic+0.118
ippers+0.116
Neg logit contributions
 unexpected-0.073
 frightening-0.068
sm-0.066
 since-0.064
 unpleasant-0.063
Token long
Token ID890
Feature activation+1.62e+00

Pos logit contributions
bes+0.001
 ramp+0.001
amas+0.001
ippers+0.001
 clinic+0.001
Neg logit contributions
 unexpected-0.001
 frightening-0.001
 scary-0.001
sm-0.001
 since-0.001
Token time
Token ID640
Feature activation+3.83e+00

Pos logit contributions
bes+0.051
amas+0.049
ippers+0.048
ates+0.046
 wrists+0.045
Neg logit contributions
 unexpected-0.039
 since-0.038
 scary-0.036
 dangerous-0.036
 frightening-0.034
Token ago
Token ID2084
Feature activation+7.84e+00

Pos logit contributions
bes+0.112
 ramp+0.106
amas+0.101
ippers+0.101
 clinic+0.101
Neg logit contributions
 unexpected-0.100
 frightening-0.093
sm-0.090
 since-0.089
 scary-0.088
Token,
Token ID11
Feature activation-3.20e-02

Pos logit contributions
 ramp+0.324
bes+0.316
pel+0.313
 clinic+0.312
amas+0.304
Neg logit contributions
 since-0.091
 over-0.082
 unexpected-0.081
!-0.076
sm-0.072
Token when
Token ID618
Feature activation+1.42e+00

Pos logit contributions
 unexpected+0.001
 frightening+0.001
sm+0.001
 since+0.001
 unpleasant+0.001
Neg logit contributions
bes-0.001
 ramp-0.001
amas-0.001
ippers-0.001
 clinic-0.001
Token people
Token ID661
Feature activation-8.21e-01

Pos logit contributions
bes+0.053
 ramp+0.050
amas+0.049
ippers+0.049
 clinic+0.048
Neg logit contributions
 unexpected-0.027
 frightening-0.024
sm-0.022
 since-0.022
 scary-0.022
Token used
Token ID973
Feature activation-2.26e-01

Pos logit contributions
 unexpected+0.019
 over+0.016
 frightening+0.016
 dangerous+0.016
 scary+0.016
Neg logit contributions
bes-0.028
amas-0.026
 ramp-0.026
ippers-0.025
 wrists-0.025
Token different
Token ID1180
Feature activation-6.99e-01

Pos logit contributions
 unexpected+0.004
 frightening+0.004
sm+0.004
 unpleasant+0.004
 since+0.004
Neg logit contributions
bes-0.008
 ramp-0.007
 wrists-0.007
ippers-0.007
 clinic-0.007
Token They
Token ID1119
Feature activation+3.25e+00

Pos logit contributions
bes+0.044
amas+0.039
 ramp+0.039
ippers+0.039
 clinic+0.038
Neg logit contributions
 unexpected-0.039
 since-0.037
 frightening-0.037
sm-0.036
 over-0.036
Token lived
Token ID5615
Feature activation+3.90e+00

Pos logit contributions
bes+0.105
amas+0.100
 ramp+0.100
ippers+0.098
 clinic+0.096
Neg logit contributions
 unexpected-0.079
 frightening-0.070
 scary-0.069
 since-0.069
 dangerous-0.068
Token a
Token ID257
Feature activation+1.22e+00

Pos logit contributions
bes+0.135
 ramp+0.129
ippers+0.125
 clinic+0.124
amas+0.124
Neg logit contributions
 unexpected-0.080
 frightening-0.073
sm-0.069
 unpleasant-0.068
 since-0.068
Token long
Token ID890
Feature activation+2.26e+00

Pos logit contributions
bes+0.033
 ramp+0.031
amas+0.030
ippers+0.030
 clinic+0.029
Neg logit contributions
 unexpected-0.035
 frightening-0.033
sm-0.031
 since-0.031
 scary-0.031
Token time
Token ID640
Feature activation+3.76e+00

Pos logit contributions
bes+0.064
amas+0.062
ippers+0.059
 wrists+0.058
 ramp+0.058
Neg logit contributions
 unexpected-0.059
 since-0.058
 scary-0.056
 frightening-0.055
 dangerous-0.053
Token ago
Token ID2084
Feature activation+7.84e+00

Pos logit contributions
bes+0.112
 ramp+0.106
amas+0.102
ippers+0.102
 clinic+0.101
Neg logit contributions
 unexpected-0.096
 frightening-0.090
sm-0.086
 since-0.085
 scary-0.084
Token.
Token ID13
Feature activation+8.79e-01

Pos logit contributions
 ramp+0.317
bes+0.310
pel+0.310
 clinic+0.302
amas+0.300
Neg logit contributions
 since-0.095
 unexpected-0.091
 over-0.083
sm-0.080
!-0.077
Token They
Token ID1119
Feature activation+3.09e+00

Pos logit contributions
bes+0.021
 ramp+0.020
ippers+0.019
amas+0.019
 clinic+0.018
Neg logit contributions
 unexpected-0.028
 frightening-0.026
sm-0.025
 since-0.025
 scary-0.025
Token had
Token ID550
Feature activation+1.52e+00

Pos logit contributions
bes+0.100
amas+0.096
 ramp+0.095
ippers+0.094
 clinic+0.091
Neg logit contributions
 unexpected-0.073
 since-0.066
 frightening-0.065
 scary-0.065
sm-0.064
Token big
Token ID1263
Feature activation-4.40e-01

Pos logit contributions
bes+0.053
 ramp+0.051
amas+0.049
ippers+0.049
 clinic+0.049
Neg logit contributions
 unexpected-0.031
 frightening-0.029
sm-0.027
 since-0.027
 scary-0.026
Token bones
Token ID11945
Feature activation+1.34e+00

Pos logit contributions
 unexpected+0.014
 frightening+0.013
sm+0.013
 since+0.012
 unpleasant+0.012
Neg logit contributions
bes-0.011
 ramp-0.010
ippers-0.009
amas-0.009
 clinic-0.009
Token watching
Token ID4964
Feature activation+2.10e-01

Pos logit contributions
 unexpected+0.009
 since+0.009
 frightening+0.009
 scary+0.008
sm+0.008
Neg logit contributions
bes-0.017
 ramp-0.017
ippers-0.016
amas-0.016
pel-0.016
Token it
Token ID340
Feature activation-4.03e+00

Pos logit contributions
bes+0.007
 ramp+0.007
amas+0.007
 wrists+0.007
 clinic+0.007
Neg logit contributions
 unexpected-0.004
 frightening-0.004
sm-0.004
 unpleasant-0.004
 since-0.003
Token stir
Token ID11240
Feature activation-7.27e-01

Pos logit contributions
 unexpected+0.092
 frightening+0.084
sm+0.080
 since+0.079
 scary+0.079
Neg logit contributions
bes-0.132
 ramp-0.126
ippers-0.122
amas-0.122
 clinic-0.120
Token around
Token ID1088
Feature activation+4.94e-01

Pos logit contributions
 unexpected+0.015
 frightening+0.013
sm+0.013
 since+0.012
 unpleasant+0.012
Neg logit contributions
bes-0.025
 ramp-0.024
amas-0.023
 wrists-0.023
ippers-0.023
Token and
Token ID290
Feature activation-1.84e-01

Pos logit contributions
bes+0.018
 ramp+0.017
 wrists+0.017
 clinic+0.017
amas+0.017
Neg logit contributions
 unexpected-0.009
 frightening-0.008
sm-0.007
 unpleasant-0.007
 since-0.007
Token gra
Token ID7933
Feature activation+7.81e+00

Pos logit contributions
 unexpected+0.004
 frightening+0.004
sm+0.004
 unpleasant+0.004
 since+0.004
Neg logit contributions
bes-0.006
 ramp-0.005
 wrists-0.005
 clinic-0.005
amas-0.005
Tokenze
Token ID2736
Feature activation-6.74e-01

Pos logit contributions
 ramp+0.223
 laptop+0.203
ippers+0.201
 clinic+0.196
bes+0.193
Neg logit contributions
 unexpected-0.196
 over-0.190
sm-0.185
 frightening-0.183
ful-0.176
Token on
Token ID319
Feature activation+1.24e+00

Pos logit contributions
 unexpected+0.013
 frightening+0.012
sm+0.011
 since+0.011
 scary+0.011
Neg logit contributions
bes-0.024
 ramp-0.023
ippers-0.023
amas-0.023
 clinic-0.023
Token the
Token ID262
Feature activation-1.09e+00

Pos logit contributions
bes+0.048
 ramp+0.046
ippers+0.045
amas+0.045
 clinic+0.045
Neg logit contributions
 unexpected-0.021
 frightening-0.019
sm-0.017
 since-0.017
 scary-0.017
Token grass
Token ID8701
Feature activation+2.56e+00

Pos logit contributions
 unexpected+0.034
 frightening+0.032
sm+0.031
 since+0.031
 unpleasant+0.030
Neg logit contributions
bes-0.026
 ramp-0.025
amas-0.023
ippers-0.023
 clinic-0.023
Token.
Token ID13
Feature activation-4.30e-01

Pos logit contributions
bes+0.099
 ramp+0.095
amas+0.092
ippers+0.092
pel+0.091
Neg logit contributions
 unexpected-0.043
 frightening-0.038
 since-0.037
sm-0.037
 scary-0.035
Token says
Token ID1139
Feature activation+4.63e+00

Pos logit contributions
 unexpected+0.025
 frightening+0.024
sm+0.023
 since+0.023
 unpleasant+0.022
Neg logit contributions
bes-0.025
 ramp-0.023
amas-0.022
 clinic-0.022
ippers-0.022
Token that
Token ID326
Feature activation+1.50e-01

Pos logit contributions
bes+0.157
 ramp+0.150
ippers+0.145
amas+0.145
 clinic+0.144
Neg logit contributions
 unexpected-0.099
 frightening-0.091
sm-0.086
 since-0.085
 scary-0.084
Token a
Token ID257
Feature activation+7.24e-01

Pos logit contributions
bes+0.005
 ramp+0.005
 clinic+0.005
 wrists+0.005
amas+0.005
Neg logit contributions
 unexpected-0.003
 frightening-0.003
sm-0.003
 since-0.003
 unpleasant-0.003
Token long
Token ID890
Feature activation+1.79e+00

Pos logit contributions
bes+0.022
 ramp+0.021
 clinic+0.020
amas+0.020
ippers+0.020
Neg logit contributions
 unexpected-0.017
 frightening-0.016
sm-0.016
 since-0.016
 unpleasant-0.015
Token time
Token ID640
Feature activation+3.54e+00

Pos logit contributions
bes+0.051
ippers+0.049
amas+0.048
 ramp+0.047
 wrists+0.046
Neg logit contributions
 unexpected-0.049
 frightening-0.046
 scary-0.045
 since-0.044
 dangerous-0.043
Token ago
Token ID2084
Feature activation+7.76e+00

Pos logit contributions
bes+0.129
 ramp+0.123
amas+0.119
ippers+0.119
 clinic+0.119
Neg logit contributions
 unexpected-0.067
 frightening-0.061
sm-0.057
 since-0.057
 scary-0.056
Token,
Token ID11
Feature activation+2.68e-01

Pos logit contributions
bes+0.302
 ramp+0.301
 clinic+0.290
pel+0.290
amas+0.287
Neg logit contributions
 unexpected-0.111
 since-0.107
 over-0.092
sm-0.091
 dangerous-0.088
Token a
Token ID257
Feature activation+1.09e+00

Pos logit contributions
bes+0.009
 ramp+0.009
ippers+0.009
amas+0.009
 clinic+0.009
Neg logit contributions
 unexpected-0.005
 frightening-0.005
 since-0.005
sm-0.005
 scary-0.005
Token dragon
Token ID10441
Feature activation+3.08e-01

Pos logit contributions
bes+0.035
 ramp+0.033
ippers+0.032
amas+0.032
 clinic+0.032
Neg logit contributions
 unexpected-0.025
 frightening-0.023
sm-0.022
 since-0.022
 scary-0.022
Token used
Token ID973
Feature activation+8.65e-01

Pos logit contributions
bes+0.012
 ramp+0.011
ippers+0.011
amas+0.011
 wrists+0.011
Neg logit contributions
 unexpected-0.005
 frightening-0.005
sm-0.004
 since-0.004
 scary-0.004
Token to
Token ID284
Feature activation+1.95e+00

Pos logit contributions
bes+0.032
 ramp+0.031
ippers+0.030
 wrists+0.030
 clinic+0.030
Neg logit contributions
 unexpected-0.014
 frightening-0.013
sm-0.012
 unpleasant-0.012
 since-0.012
Token tree
Token ID5509
Feature activation+1.08e+00

Pos logit contributions
 unexpected+0.034
 frightening+0.032
sm+0.031
 since+0.031
 scary+0.030
Neg logit contributions
bes-0.032
 ramp-0.031
ippers-0.029
amas-0.029
 clinic-0.029
Token and
Token ID290
Feature activation-8.28e-02

Pos logit contributions
bes+0.040
 ramp+0.039
ippers+0.038
amas+0.037
pel+0.037
Neg logit contributions
 unexpected-0.019
 frightening-0.017
 since-0.017
sm-0.017
 scary-0.016
Token watched
Token ID7342
Feature activation+1.28e-01

Pos logit contributions
 unexpected+0.002
 frightening+0.002
sm+0.002
 since+0.002
 unpleasant+0.002
Neg logit contributions
bes-0.002
 ramp-0.002
amas-0.002
 clinic-0.002
 wrists-0.002
Token the
Token ID262
Feature activation-1.48e+00

Pos logit contributions
bes+0.005
 ramp+0.005
ippers+0.005
 clinic+0.005
amas+0.005
Neg logit contributions
 unexpected-0.002
 frightening-0.002
sm-0.002
 scary-0.002
 since-0.002
Token cows
Token ID22575
Feature activation+2.44e+00

Pos logit contributions
 unexpected+0.036
 frightening+0.033
sm+0.033
 since+0.032
 unpleasant+0.031
Neg logit contributions
bes-0.045
 ramp-0.043
amas-0.042
 wrists-0.042
 clinic-0.041
Token gra
Token ID7933
Feature activation+7.75e+00

Pos logit contributions
bes+0.093
ippers+0.086
 ramp+0.085
pel+0.085
amas+0.085
Neg logit contributions
 unexpected-0.041
 over-0.040
 since-0.039
 frightening-0.038
 scary-0.037
Tokenze
Token ID2736
Feature activation-1.19e+00

Pos logit contributions
 ramp+0.205
 laptop+0.187
ippers+0.183
 suitcase+0.176
bes+0.176
Neg logit contributions
sm-0.199
 over-0.197
 unexpected-0.190
 frightening-0.183
 closer-0.179
Token.
Token ID13
Feature activation-7.03e-01

Pos logit contributions
 unexpected+0.021
 frightening+0.019
sm+0.019
 scary+0.018
 since+0.018
Neg logit contributions
bes-0.044
 ramp-0.043
ippers-0.042
amas-0.041
 clinic-0.041
Token She
Token ID1375
Feature activation+8.24e-01

Pos logit contributions
 since+0.015
 over+0.015
 unexpected+0.015
 frightening+0.015
 scary+0.015
Neg logit contributions
bes-0.023
 ramp-0.022
ates-0.022
ippers-0.021
kin-0.021
Token felt
Token ID2936
Feature activation-1.32e+00

Pos logit contributions
bes+0.027
 ramp+0.026
ippers+0.026
amas+0.026
 clinic+0.025
Neg logit contributions
 unexpected-0.018
 frightening-0.017
 since-0.017
sm-0.016
 scary-0.016
Token happy
Token ID3772
Feature activation+1.31e+00

Pos logit contributions
 unexpected+0.030
 frightening+0.028
sm+0.028
 since+0.027
 unpleasant+0.026
Neg logit contributions
bes-0.038
 clinic-0.035
 ramp-0.034
amas-0.033
ippers-0.033
Token treasure
Token ID14068
Feature activation+1.27e+00

Pos logit contributions
bes+0.007
 ramp+0.007
 clinic+0.007
ippers+0.007
amas+0.007
Neg logit contributions
 unexpected-0.004
 frightening-0.004
sm-0.004
 since-0.004
 unpleasant-0.004
Token from
Token ID422
Feature activation+3.97e+00

Pos logit contributions
bes+0.051
 ramp+0.049
ippers+0.048
amas+0.048
 wrists+0.047
Neg logit contributions
 unexpected-0.020
 frightening-0.017
 since-0.016
sm-0.016
 scary-0.016
Token a
Token ID257
Feature activation+2.18e-01

Pos logit contributions
bes+0.129
 ramp+0.126
 clinic+0.124
 wrists+0.122
 slot+0.121
Neg logit contributions
sm-0.073
 unexpected-0.073
 frightening-0.072
vious-0.069
 unpleasant-0.068
Token long
Token ID890
Feature activation+1.82e+00

Pos logit contributions
 ramp+0.006
 clinic+0.006
bes+0.006
 slot+0.006
 trailer+0.006
Neg logit contributions
 unexpected-0.005
sm-0.005
 frightening-0.005
vious-0.005
 unpleasant-0.005
Token time
Token ID640
Feature activation+3.41e+00

Pos logit contributions
bes+0.057
amas+0.054
ippers+0.053
 wrists+0.052
 ramp+0.052
Neg logit contributions
 unexpected-0.044
 since-0.042
 frightening-0.040
 scary-0.040
 dangerous-0.039
Token ago
Token ID2084
Feature activation+7.74e+00

Pos logit contributions
bes+0.121
 ramp+0.115
amas+0.112
ippers+0.112
 wrists+0.110
Neg logit contributions
 unexpected-0.067
 frightening-0.061
 since-0.060
sm-0.059
 scary-0.057
Token."
Token ID526
Feature activation+6.22e-01

Pos logit contributions
bes+0.318
 ramp+0.316
pel+0.306
 clinic+0.302
ippers+0.298
Neg logit contributions
 since-0.093
 unexpected-0.083
 over-0.078
!-0.074
sm-0.070
Token
Token ID198
Feature activation-9.94e-01

Pos logit contributions
bes+0.025
ippers+0.023
pel+0.023
kin+0.023
 wrists+0.023
Neg logit contributions
 since-0.009
 over-0.008
 Joe-0.008
 Sally-0.008
sm-0.007
Token
Token ID220
Feature activation-2.49e+00

Pos logit contributions
 unexpected+0.026
 frightening+0.024
sm+0.023
 since+0.023
 unpleasant+0.022
Neg logit contributions
bes-0.030
 ramp-0.028
amas-0.027
ippers-0.027
 clinic-0.027
Token
Token ID198
Feature activation-6.25e-01

Pos logit contributions
 unexpected+0.041
 over+0.039
sm+0.036
 since+0.036
 frightening+0.036
Neg logit contributions
bes-0.094
 wrists-0.090
pel-0.085
 ramp-0.085
kin-0.084
TokenThe
Token ID464
Feature activation+5.39e-01

Pos logit contributions
 unexpected+0.012
 frightening+0.011
 since+0.010
sm+0.010
 unpleasant+0.010
Neg logit contributions
bes-0.023
ippers-0.022
 ramp-0.022
amas-0.022
 wrists-0.021
Token wand
Token ID11569
Feature activation+2.86e-01

Pos logit contributions
bes+0.068
amas+0.065
 slot+0.065
pel+0.064
 trailer+0.061
Neg logit contributions
 scary-0.071
 amazing-0.070
 unexpected-0.070
 dangerous-0.066
 incredible-0.065
Token won
Token ID1839
Feature activation-1.95e+00

Pos logit contributions
bes+0.008
 ramp+0.008
 clinic+0.008
ippers+0.008
amas+0.008
Neg logit contributions
 unexpected-0.008
 frightening-0.007
 since-0.007
sm-0.007
 scary-0.007
Tokenâ
Token ID22940
Feature activation-2.71e+00

Pos logit contributions
 unexpected+0.053
 frightening+0.050
sm+0.048
 since+0.047
 unpleasant+0.047
Neg logit contributions
bes-0.055
 ramp-0.052
amas-0.050
ippers-0.049
 clinic-0.049
Token
Token ID26391
Feature activation-9.22e-01

Pos logit contributions
 unexpected+0.060
 since+0.056
vious+0.056
sm+0.052
 unpleasant+0.050
Neg logit contributions
 ramp-0.087
bes-0.084
 wrists-0.084
 clinic-0.079
 trailer-0.079
Token
Token ID8151
Feature activation-4.22e+00

Pos logit contributions
 since+0.023
sm+0.022
 unexpected+0.021
 scary+0.021
!+0.020
Neg logit contributions
 ramp-0.025
 wrists-0.024
 slot-0.024
pel-0.024
 trailer-0.023
Tokent
Token ID83
Feature activation-7.43e+00

Pos logit contributions
 unexpected+0.105
 frightening+0.094
 unpleasant+0.089
 scary+0.083
 unfamiliar+0.082
Neg logit contributions
bes-0.131
amas-0.126
ippers-0.123
 ramp-0.123
 clinic-0.119
Token fit
Token ID4197
Feature activation-9.28e-01

Pos logit contributions
sm+0.215
 unexpected+0.213
 frightening+0.204
 unpleasant+0.193
 since+0.189
Neg logit contributions
 ramp-0.182
bes-0.171
pel-0.158
ippers-0.155
amas-0.153
Token in
Token ID287
Feature activation+8.43e-01

Pos logit contributions
 unexpected+0.020
 frightening+0.019
sm+0.018
 unpleasant+0.018
 scary+0.016
Neg logit contributions
bes-0.030
 ramp-0.029
ippers-0.029
 wrists-0.027
ates-0.027
Token my
Token ID616
Feature activation+1.59e+00

Pos logit contributions
bes+0.031
 ramp+0.030
ippers+0.029
 clinic+0.029
amas+0.029
Neg logit contributions
 unexpected-0.015
 frightening-0.014
sm-0.013
 since-0.013
 unpleasant-0.013
Token box
Token ID3091
Feature activation+2.00e-01

Pos logit contributions
bes+0.039
amas+0.038
pel+0.037
 slot+0.036
ates+0.035
Neg logit contributions
 unexpected-0.049
 scary-0.046
 amazing-0.045
 dangerous-0.044
 frightening-0.043
Token,
Token ID11
Feature activation-4.72e-01

Pos logit contributions
bes+0.007
 ramp+0.007
amas+0.007
 clinic+0.007
ippers+0.007
Neg logit contributions
 unexpected-0.004
 frightening-0.004
sm-0.003
 unpleasant-0.003
 scary-0.003
Token I
Token ID314
Feature activation+4.93e-01

Pos logit contributions
bes+0.002
 wrists+0.002
 ramp+0.001
kin+0.001
ippers+0.001
Neg logit contributions
 since-0.001
 unexpected-0.001
 frightening-0.001
sm-0.001
 over-0.001
Token won
Token ID1839
Feature activation-1.90e+00

Pos logit contributions
bes+0.015
 ramp+0.014
amas+0.014
ippers+0.014
 clinic+0.014
Neg logit contributions
 unexpected-0.012
 frightening-0.011
sm-0.011
 since-0.011
 scary-0.011
Tokenâ
Token ID22940
Feature activation-2.65e+00

Pos logit contributions
 unexpected+0.055
 frightening+0.051
sm+0.051
 unpleasant+0.050
 since+0.049
Neg logit contributions
bes-0.051
 ramp-0.047
amas-0.046
pel-0.044
 clinic-0.044
Token
Token ID26391
Feature activation-4.63e-01

Pos logit contributions
 unexpected+0.065
 since+0.059
 frightening+0.059
sm+0.058
 unpleasant+0.056
Neg logit contributions
bes-0.081
 ramp-0.079
 wrists-0.076
 clinic-0.075
ippers-0.074
Token
Token ID8151
Feature activation-4.51e+00

Pos logit contributions
 since+0.012
sm+0.011
 unexpected+0.011
 frightening+0.011
 dangerous+0.010
Neg logit contributions
 ramp-0.012
 wrists-0.012
bes-0.012
pel-0.012
 slot-0.012
Tokent
Token ID83
Feature activation-7.37e+00

Pos logit contributions
 unexpected+0.111
 frightening+0.100
 unpleasant+0.094
 scary+0.091
 exciting+0.088
Neg logit contributions
bes-0.141
amas-0.137
ippers-0.135
 ramp-0.134
 clinic-0.130
Token let
Token ID1309
Feature activation-5.36e-01

Pos logit contributions
 unexpected+0.204
sm+0.200
 frightening+0.193
 unpleasant+0.182
elling+0.182
Neg logit contributions
 ramp-0.177
bes-0.176
 clinic-0.167
pel-0.161
ippers-0.154
Token anything
Token ID1997
Feature activation-5.59e+00

Pos logit contributions
 unexpected+0.010
 frightening+0.009
sm+0.008
 since+0.008
 scary+0.008
Neg logit contributions
bes-0.020
 ramp-0.019
ippers-0.018
amas-0.018
 clinic-0.018
Token bad
Token ID2089
Feature activation-1.99e+00

Pos logit contributions
 unexpected+0.096
sm+0.090
 frightening+0.088
 since+0.082
 unpleasant+0.080
Neg logit contributions
bes-0.198
 ramp-0.194
 clinic-0.182
amas-0.181
ippers-0.180
Token happen
Token ID1645
Feature activation-3.64e-02

Pos logit contributions
 unexpected+0.032
 frightening+0.027
 since+0.027
 scary+0.026
sm+0.025
Neg logit contributions
bes-0.079
 ramp-0.075
 clinic-0.074
amas-0.074
 wrists-0.073
Token to
Token ID284
Feature activation+8.05e-01

Pos logit contributions
 unexpected+0.001
 frightening+0.001
sm+0.001
 unpleasant+0.001
 scary+0.001
Neg logit contributions
bes-0.001
 ramp-0.001
ippers-0.001
amas-0.001
ates-0.001
Token I
Token ID314
Feature activation+6.83e-01

Pos logit contributions
bes+0.022
 wrists+0.020
 ramp+0.020
 clinic+0.019
pel+0.019
Neg logit contributions
 since-0.015
 unexpected-0.015
 frightening-0.014
sm-0.014
 over-0.014
Token won
Token ID1839
Feature activation-1.53e+00

Pos logit contributions
bes+0.018
 ramp+0.017
 clinic+0.017
ippers+0.016
amas+0.016
Neg logit contributions
 unexpected-0.018
 frightening-0.017
 unpleasant-0.017
sm-0.016
 possible-0.016
Tokenâ
Token ID22940
Feature activation-2.46e+00

Pos logit contributions
 unexpected+0.052
 frightening+0.048
sm+0.048
 unpleasant+0.047
 since+0.046
Neg logit contributions
bes-0.036
 ramp-0.031
amas-0.030
pel-0.029
 laptop-0.028
Token
Token ID26391
Feature activation-5.21e-01

Pos logit contributions
 unexpected+0.061
 since+0.055
 frightening+0.055
sm+0.055
 unpleasant+0.053
Neg logit contributions
bes-0.075
 ramp-0.073
 wrists-0.069
 clinic-0.069
ippers-0.068
Token
Token ID8151
Feature activation-4.75e+00

Pos logit contributions
 since+0.012
sm+0.011
!+0.011
 unexpected+0.011
 frightening+0.011
Neg logit contributions
 ramp-0.015
pel-0.014
bes-0.014
 wrists-0.014
 slot-0.014
Tokent
Token ID83
Feature activation-7.34e+00

Pos logit contributions
 unexpected+0.121
 frightening+0.109
 unpleasant+0.104
 scary+0.098
 exciting+0.096
Neg logit contributions
bes-0.144
amas-0.141
ippers-0.138
 ramp-0.138
 clinic-0.133
Token do
Token ID466
Feature activation+1.76e-02

Pos logit contributions
sm+0.217
 unexpected+0.215
rous+0.210
 frightening+0.209
year+0.208
Neg logit contributions
 wrist-0.145
bes-0.141
 hair-0.134
 neck-0.131
 wrists-0.130
Token it
Token ID340
Feature activation-3.79e+00

Pos logit contributions
bes+0.001
 ramp+0.001
ippers+0.001
amas+0.001
 clinic+0.001
Neg logit contributions
 unexpected-0.000
 frightening-0.000
sm-0.000
 since-0.000
 scary-0.000
Token again
Token ID757
Feature activation-1.73e+00

Pos logit contributions
 unexpected+0.068
 frightening+0.062
sm+0.059
 unpleasant+0.055
 since+0.052
Neg logit contributions
bes-0.138
 ramp-0.134
amas-0.130
 clinic-0.125
ippers-0.124
Token.
Token ID13
Feature activation-3.37e-02

Pos logit contributions
 unexpected+0.037
 frightening+0.034
sm+0.033
 unpleasant+0.032
 scary+0.030
Neg logit contributions
bes-0.058
 ramp-0.054
 clinic-0.053
ippers-0.053
amas-0.052
Tokenâ
Token ID22940
Feature activation-1.02e+00

Pos logit contributions
 unexpected+0.001
 since+0.001
 frightening+0.001
sm+0.001
 scary+0.001
Neg logit contributions
bes-0.001
 ramp-0.001
 wrists-0.001
ippers-0.001
 clinic-0.001
Token big
Token ID1263
Feature activation-1.12e-01

Pos logit contributions
 unexpected+0.005
 frightening+0.004
 since+0.004
 scary+0.004
sm+0.004
Neg logit contributions
bes-0.007
 ramp-0.007
ippers-0.007
amas-0.007
pel-0.006
Token before
Token ID878
Feature activation+1.60e+00

Pos logit contributions
 unexpected+0.003
 scary+0.003
 frightening+0.003
 since+0.003
 dangerous+0.003
Neg logit contributions
bes-0.003
ippers-0.003
amas-0.003
 wrists-0.003
 ramp-0.003
Token!
Token ID0
Feature activation-1.67e+00

Pos logit contributions
bes+0.065
 ramp+0.063
 wrists+0.061
ippers+0.060
amas+0.060
Neg logit contributions
 unexpected-0.024
 frightening-0.022
 since-0.021
 scary-0.020
sm-0.020
Token She
Token ID1375
Feature activation+7.69e-01

Pos logit contributions
 since+0.037
 unexpected+0.035
 frightening+0.034
 over+0.034
sm+0.034
Neg logit contributions
bes-0.054
 ramp-0.051
ippers-0.050
 wrists-0.049
pel-0.049
Token could
Token ID714
Feature activation+5.95e-02

Pos logit contributions
bes+0.026
ippers+0.026
amas+0.026
 ramp+0.026
ates+0.025
Neg logit contributions
 unexpected-0.016
 frightening-0.015
 since-0.014
 scary-0.014
 over-0.014
Token even
Token ID772
Feature activation-7.30e+00

Pos logit contributions
bes+0.002
amas+0.002
 ramp+0.002
ippers+0.002
ates+0.002
Neg logit contributions
 unexpected-0.001
 frightening-0.001
 scary-0.001
 dangerous-0.001
 over-0.001
Token sit
Token ID1650
Feature activation+1.59e+00

Pos logit contributions
 unexpected+0.186
sm+0.179
 frightening+0.178
 since+0.171
 unpleasant+0.168
Neg logit contributions
bes-0.198
 ramp-0.195
 clinic-0.191
 slot-0.185
 wrists-0.182
Token inside
Token ID2641
Feature activation-3.03e-02

Pos logit contributions
bes+0.056
 ramp+0.052
ippers+0.051
amas+0.051
 clinic+0.051
Neg logit contributions
 unexpected-0.033
 frightening-0.030
sm-0.029
 unpleasant-0.028
 since-0.028
Token it
Token ID340
Feature activation-3.14e+00

Pos logit contributions
 unexpected+0.001
 frightening+0.001
sm+0.001
 unpleasant+0.001
 scary+0.001
Neg logit contributions
bes-0.001
 ramp-0.001
amas-0.001
 clinic-0.001
ippers-0.001
Token.
Token ID13
Feature activation-7.35e-01

Pos logit contributions
 unexpected+0.061
 frightening+0.056
 since+0.053
sm+0.052
 scary+0.051
Neg logit contributions
bes-0.112
 ramp-0.107
ippers-0.104
amas-0.103
 clinic-0.103
Token
Token ID220
Feature activation-2.88e+00

Pos logit contributions
 since+0.016
 unexpected+0.016
 frightening+0.015
sm+0.015
 over+0.015
Neg logit contributions
bes-0.024
 ramp-0.023
ippers-0.022
pel-0.021
kin-0.021
Token creatures
Token ID8109
Feature activation+8.00e-01

Pos logit contributions
bes+0.060
 ramp+0.057
ippers+0.056
 clinic+0.055
amas+0.054
Neg logit contributions
 unexpected-0.041
 frightening-0.037
 since-0.035
 unpleasant-0.035
sm-0.034
Token.
Token ID13
Feature activation-2.01e+00

Pos logit contributions
bes+0.026
 ramp+0.024
ippers+0.024
 clinic+0.024
 wrists+0.023
Neg logit contributions
 unexpected-0.018
 frightening-0.017
 unpleasant-0.016
 since-0.015
sm-0.015
Token One
Token ID1881
Feature activation-1.89e+00

Pos logit contributions
 unexpected+0.049
 since+0.045
 frightening+0.045
 scary+0.044
 over+0.043
Neg logit contributions
bes-0.063
 ramp-0.057
amas-0.057
 clinic-0.056
ippers-0.056
Token day
Token ID1110
Feature activation+2.72e+00

Pos logit contributions
 frightening+0.056
 unexpected+0.055
 unpleasant+0.053
sm+0.051
 awful+0.051
Neg logit contributions
kin-0.040
 ramp-0.035
ippers-0.032
 clinic-0.032
 neck-0.031
Token,
Token ID11
Feature activation-1.49e+00

Pos logit contributions
bes+0.075
ippers+0.072
 ramp+0.071
 slot+0.068
pel+0.068
Neg logit contributions
 unexpected-0.072
 frightening-0.068
 unpleasant-0.065
sm-0.063
 since-0.062
Token Cr
Token ID3864
Feature activation-7.26e+00

Pos logit contributions
 unexpected+0.032
 frightening+0.030
sm+0.029
 unpleasant+0.028
 since+0.027
Neg logit contributions
bes-0.048
 ramp-0.047
ippers-0.045
 wrists-0.045
amas-0.044
Tokenust
Token ID436
Feature activation-3.57e+00

Pos logit contributions
 unexpected+0.119
 frightening+0.106
sm+0.099
 since+0.097
 scary+0.095
Neg logit contributions
bes-0.282
 ramp-0.271
ippers-0.263
amas-0.263
 clinic-0.262
Tokeny
Token ID88
Feature activation-1.08e+00

Pos logit contributions
 unexpected+0.085
 frightening+0.080
sm+0.077
 unpleasant+0.075
 dangerous+0.072
Neg logit contributions
bes-0.103
 ramp-0.102
ippers-0.097
 wrists-0.095
pel-0.095
Token found
Token ID1043
Feature activation-3.14e+00

Pos logit contributions
 unexpected+0.026
 frightening+0.024
sm+0.023
 unpleasant+0.022
 since+0.022
Neg logit contributions
bes-0.034
 ramp-0.032
 clinic-0.031
amas-0.031
ippers-0.031
Token a
Token ID257
Feature activation-1.57e+00

Pos logit contributions
 frightening+0.094
 since+0.093
 unexpected+0.091
 unpleasant+0.089
rated+0.088
Neg logit contributions
ates-0.061
 neck-0.056
 kit-0.056
bes-0.055
apped-0.054
Token shiny
Token ID22441
Feature activation-4.15e-01

Pos logit contributions
 unexpected+0.029
 frightening+0.026
sm+0.026
 since+0.025
 unpleasant+0.024
Neg logit contributions
bes-0.056
 ramp-0.055
ippers-0.052
 clinic-0.052
amas-0.052
Token the
Token ID262
Feature activation+6.94e-01

Pos logit contributions
 unexpected+0.018
 frightening+0.016
 scary+0.015
 since+0.015
 dangerous+0.014
Neg logit contributions
bes-0.065
 ramp-0.061
ippers-0.061
amas-0.061
 clinic-0.060
Token squirrel
Token ID33039
Feature activation-2.17e+00

Pos logit contributions
 ramp+0.018
bes+0.017
 neck+0.016
amas+0.016
ippers+0.016
Neg logit contributions
 unexpected-0.019
sm-0.019
 frightening-0.018
 since-0.017
 unpleasant-0.017
Token decided
Token ID3066
Feature activation-1.93e+00

Pos logit contributions
 unexpected+0.058
 frightening+0.053
 unpleasant+0.052
 possible+0.051
sm+0.050
Neg logit contributions
bes-0.060
 ramp-0.058
 clinic-0.057
amas-0.053
 slot-0.053
Token to
Token ID284
Feature activation-2.50e-01

Pos logit contributions
 unexpected+0.035
sm+0.033
 frightening+0.030
vious+0.030
elling+0.029
Neg logit contributions
bes-0.066
 wrists-0.063
 clinic-0.063
 ramp-0.063
 neck-0.062
Token take
Token ID1011
Feature activation-1.07e+00

Pos logit contributions
 unexpected+0.005
 frightening+0.004
 scary+0.004
 over+0.004
 exciting+0.004
Neg logit contributions
bes-0.009
amas-0.008
ippers-0.008
 ramp-0.008
ates-0.008
Token an
Token ID281
Feature activation-7.25e+00

Pos logit contributions
 unexpected+0.023
 frightening+0.021
 since+0.020
sm+0.020
 unpleasant+0.020
Neg logit contributions
bes-0.036
 ramp-0.034
ippers-0.033
 clinic-0.033
amas-0.033
Token adventure
Token ID8855
Feature activation-1.01e+00

Pos logit contributions
sm+0.149
 since+0.141
 frightening+0.139
 unexpected+0.133
 unpleasant+0.120
Neg logit contributions
 ramp-0.226
bes-0.222
 trailer-0.220
 clinic-0.217
 neck-0.211
Token in
Token ID287
Feature activation+5.70e-01

Pos logit contributions
 unexpected+0.020
 frightening+0.019
 unpleasant+0.018
sm+0.017
 scary+0.016
Neg logit contributions
bes-0.035
 ramp-0.034
 clinic-0.033
amas-0.032
ippers-0.032
Token the
Token ID262
Feature activation-2.05e-03

Pos logit contributions
bes+0.022
 ramp+0.021
 clinic+0.021
ippers+0.021
amas+0.021
Neg logit contributions
 unexpected-0.009
 frightening-0.008
sm-0.008
 since-0.008
 unpleasant-0.007
Token forest
Token ID8222
Feature activation+3.98e-01

Pos logit contributions
sm+0.000
 since+0.000
ful+0.000
 unexpected+0.000
 frightening+0.000
Neg logit contributions
 ramp-0.000
 clinic-0.000
 slot-0.000
 trailer-0.000
 chest-0.000
Token.
Token ID13
Feature activation-1.89e+00

Pos logit contributions
bes+0.014
 ramp+0.014
 clinic+0.013
ippers+0.013
amas+0.013
Neg logit contributions
 unexpected-0.008
 frightening-0.007
sm-0.006
 since-0.006
 unpleasant-0.006
Token we
Token ID356
Feature activation+1.31e+00

Pos logit contributions
bes+0.014
 ramp+0.014
 wrists+0.013
 clinic+0.013
ippers+0.013
Neg logit contributions
 unexpected-0.012
sm-0.011
 frightening-0.011
 since-0.010
 unpleasant-0.010
Token won
Token ID1839
Feature activation-2.05e+00

Pos logit contributions
bes+0.038
 ramp+0.036
ippers+0.035
amas+0.035
 clinic+0.035
Neg logit contributions
 unexpected-0.034
 frightening-0.032
sm-0.030
 since-0.030
 unpleasant-0.030
Tokenâ
Token ID22940
Feature activation-3.20e+00

Pos logit contributions
 unexpected+0.047
 frightening+0.044
sm+0.042
 unpleasant+0.041
 since+0.041
Neg logit contributions
bes-0.066
 ramp-0.064
amas-0.061
 clinic-0.060
pel-0.059
Token
Token ID26391
Feature activation-1.07e+00

Pos logit contributions
 unexpected+0.079
 since+0.068
vious+0.067
 unpleasant+0.065
 frightening+0.065
Neg logit contributions
bes-0.099
 ramp-0.098
 wrists-0.096
ippers-0.093
amas-0.092
Token
Token ID8151
Feature activation-4.77e+00

Pos logit contributions
 since+0.026
 unexpected+0.026
sm+0.025
 frightening+0.024
 dangerous+0.024
Neg logit contributions
 ramp-0.029
 wrists-0.029
bes-0.028
pel-0.028
ippers-0.027
Tokent
Token ID83
Feature activation-7.24e+00

Pos logit contributions
 unexpected+0.126
 frightening+0.115
 unpleasant+0.109
 scary+0.104
 unfamiliar+0.101
Neg logit contributions
bes-0.139
amas-0.137
ippers-0.134
 ramp-0.132
 clinic-0.129
Token be
Token ID307
Feature activation+2.79e-01

Pos logit contributions
sm+0.220
elling+0.204
 unexpected+0.203
rous+0.200
year+0.198
Neg logit contributions
 ramp-0.165
 clinic-0.142
bes-0.138
 neck-0.135
pel-0.134
Token able
Token ID1498
Feature activation-4.96e-01

Pos logit contributions
bes+0.010
 ramp+0.010
 clinic+0.009
ippers+0.009
amas+0.009
Neg logit contributions
 unexpected-0.005
sm-0.004
 frightening-0.004
 since-0.004
 unpleasant-0.004
Token to
Token ID284
Feature activation-2.18e-01

Pos logit contributions
 unexpected+0.009
vious+0.008
 frightening+0.008
elling+0.008
 unpleasant+0.008
Neg logit contributions
bes-0.016
 clinic-0.016
 ramp-0.015
ippers-0.015
 kit-0.015
Token breathe
Token ID18044
Feature activation-1.73e-01

Pos logit contributions
 unexpected+0.006
sm+0.005
 frightening+0.005
 since+0.005
 unpleasant+0.005
Neg logit contributions
bes-0.006
 ramp-0.006
 clinic-0.006
pel-0.005
ippers-0.005
Token.
Token ID13
Feature activation-7.62e-01

Pos logit contributions
 unexpected+0.003
 frightening+0.003
sm+0.003
 since+0.003
 unpleasant+0.003
Neg logit contributions
bes-0.006
 ramp-0.006
amas-0.006
 clinic-0.006
ippers-0.006
Token,
Token ID11
Feature activation-1.33e+00

Pos logit contributions
bes+0.090
 ramp+0.084
amas+0.081
ippers+0.081
 clinic+0.080
Neg logit contributions
 unexpected-0.073
 frightening-0.068
 unpleasant-0.065
sm-0.065
 since-0.063
Token she
Token ID673
Feature activation-1.77e-01

Pos logit contributions
 unexpected+0.029
 frightening+0.027
sm+0.025
 unpleasant+0.025
 since+0.025
Neg logit contributions
bes-0.045
 ramp-0.042
amas-0.041
 wrists-0.041
 clinic-0.040
Token wanted
Token ID2227
Feature activation-2.53e+00

Pos logit contributions
 unexpected+0.004
 frightening+0.004
sm+0.004
 since+0.004
 unpleasant+0.004
Neg logit contributions
bes-0.006
 ramp-0.005
amas-0.005
ippers-0.005
 clinic-0.005
Token to
Token ID284
Feature activation-1.02e+00

Pos logit contributions
 frightening+0.048
 unexpected+0.047
sm+0.046
vious+0.044
 unpleasant+0.044
Neg logit contributions
 clinic-0.080
 laptop-0.080
 kit-0.079
bes-0.078
 hair-0.078
Token take
Token ID1011
Feature activation-1.08e+00

Pos logit contributions
 unexpected+0.025
 frightening+0.023
 since+0.022
 scary+0.022
sm+0.022
Neg logit contributions
bes-0.032
 ramp-0.030
amas-0.029
ippers-0.029
 clinic-0.029
Token an
Token ID281
Feature activation-7.24e+00

Pos logit contributions
 unexpected+0.022
 frightening+0.020
sm+0.020
 since+0.019
 unpleasant+0.019
Neg logit contributions
bes-0.038
 ramp-0.036
ippers-0.035
amas-0.035
 clinic-0.035
Token exam
Token ID2814
Feature activation-1.99e+00

Pos logit contributions
sm+0.160
 frightening+0.159
 since+0.152
 unexpected+0.151
 unpleasant+0.140
Neg logit contributions
bes-0.209
 ramp-0.206
 laptop-0.206
 clinic-0.205
 wrist-0.203
Token.
Token ID13
Feature activation-1.56e+00

Pos logit contributions
 unexpected+0.029
 frightening+0.026
sm+0.024
 since+0.023
 unpleasant+0.023
Neg logit contributions
bes-0.081
 ramp-0.078
amas-0.076
ippers-0.076
 clinic-0.076
Token But
Token ID887
Feature activation-2.37e+00

Pos logit contributions
 unexpected+0.039
 frightening+0.037
 since+0.037
 over+0.035
sm+0.035
Neg logit contributions
bes-0.046
 ramp-0.044
ippers-0.043
pel-0.042
 clinic-0.042
Token before
Token ID878
Feature activation+2.71e-01

Pos logit contributions
 unexpected+0.043
 frightening+0.038
sm+0.036
 since+0.036
 scary+0.035
Neg logit contributions
bes-0.089
 ramp-0.086
ippers-0.083
amas-0.083
 clinic-0.083
Token she
Token ID673
Feature activation-1.37e+00

Pos logit contributions
bes+0.010
 ramp+0.010
ippers+0.010
ates+0.010
 clinic+0.010
Neg logit contributions
 unexpected-0.005
 frightening-0.004
 scary-0.004
 since-0.004
 dangerous-0.004
Token so
Token ID523
Feature activation-1.10e+00

Pos logit contributions
bes+0.043
 ramp+0.041
 clinic+0.039
 wrists+0.039
ippers+0.039
Neg logit contributions
 unexpected-0.024
 frightening-0.022
sm-0.022
 since-0.021
 unpleasant-0.020
Token cool
Token ID3608
Feature activation-1.42e+00

Pos logit contributions
 unexpected+0.020
sm+0.018
 frightening+0.018
 since+0.018
 unpleasant+0.017
Neg logit contributions
bes-0.039
 ramp-0.037
ippers-0.036
 wrists-0.036
amas-0.036
Token.
Token ID13
Feature activation-1.39e+00

Pos logit contributions
 unexpected+0.027
 frightening+0.024
sm+0.023
 unpleasant+0.022
 scary+0.021
Neg logit contributions
bes-0.051
 ramp-0.049
amas-0.048
ippers-0.047
 wrists-0.047
Token She
Token ID1375
Feature activation+1.67e-01

Pos logit contributions
 since+0.023
 over+0.020
 unexpected+0.019
 to+0.019
+0.017
Neg logit contributions
 ramp-0.052
bes-0.052
kin-0.050
 trailer-0.048
 kit-0.048
Token couldn
Token ID3521
Feature activation-2.54e+00

Pos logit contributions
bes+0.006
amas+0.006
ippers+0.006
 ramp+0.006
 clinic+0.006
Neg logit contributions
 unexpected-0.003
 since-0.003
 over-0.003
 frightening-0.003
 dangerous-0.003
Token't
Token ID470
Feature activation-7.20e+00

Pos logit contributions
 unexpected+0.063
 frightening+0.059
sm+0.057
 since+0.056
 scary+0.055
Neg logit contributions
bes-0.077
 ramp-0.073
ippers-0.071
 wrists-0.070
amas-0.070
Token wait
Token ID4043
Feature activation-9.01e-02

Pos logit contributions
 unexpected+0.198
 frightening+0.191
sm+0.186
 unpleasant+0.178
 scary+0.178
Neg logit contributions
bes-0.193
 ramp-0.185
pel-0.176
 clinic-0.175
ippers-0.174
Token to
Token ID284
Feature activation-8.78e-01

Pos logit contributions
 unexpected+0.001
 frightening+0.001
sm+0.001
 unpleasant+0.001
vious+0.001
Neg logit contributions
bes-0.004
 ramp-0.003
ippers-0.003
 wrists-0.003
 clinic-0.003
Token rush
Token ID10484
Feature activation-2.01e+00

Pos logit contributions
 unexpected+0.023
sm+0.023
 frightening+0.023
 unpleasant+0.021
 since+0.021
Neg logit contributions
bes-0.024
 ramp-0.023
 wrists-0.022
 clinic-0.022
ippers-0.022
Token outside
Token ID2354
Feature activation+5.77e-01

Pos logit contributions
 unexpected+0.045
 frightening+0.043
 unpleasant+0.043
sm+0.040
 awful+0.037
Neg logit contributions
 ramp-0.062
bes-0.062
ippers-0.061
 kit-0.058
pel-0.057
Token and
Token ID290
Feature activation-7.98e-01

Pos logit contributions
bes+0.020
 ramp+0.019
 clinic+0.018
amas+0.018
ippers+0.018
Neg logit contributions
 unexpected-0.012
 frightening-0.011
sm-0.010
 unpleasant-0.010
 scary-0.010
Token named
Token ID3706
Feature activation+1.85e+00

Pos logit contributions
 unexpected+0.058
 frightening+0.054
 unpleasant+0.052
 possible+0.050
 since+0.050
Neg logit contributions
bes-0.047
 ramp-0.044
 clinic-0.043
ippers-0.041
 slot-0.041
Token Cr
Token ID3864
Feature activation-6.52e+00

Pos logit contributions
 ramp+0.062
bes+0.062
 wrists+0.058
amas+0.058
pel+0.057
Neg logit contributions
 unexpected-0.039
 frightening-0.036
sm-0.035
 unpleasant-0.033
 since-0.031
Tokenust
Token ID436
Feature activation-3.02e+00

Pos logit contributions
 unexpected+0.140
 frightening+0.131
 scary+0.121
sm+0.121
 since+0.120
Neg logit contributions
bes-0.217
 ramp-0.208
ippers-0.204
 clinic-0.202
amas-0.202
Tokeny
Token ID88
Feature activation-6.72e-02

Pos logit contributions
 unexpected+0.075
 frightening+0.071
 unpleasant+0.067
sm+0.065
 dangerous+0.063
Neg logit contributions
 ramp-0.085
bes-0.084
ippers-0.081
ates-0.079
 wrists-0.079
Token.
Token ID13
Feature activation-1.55e+00

Pos logit contributions
 unexpected+0.001
 frightening+0.001
 unpleasant+0.001
sm+0.001
 since+0.001
Neg logit contributions
bes-0.002
 ramp-0.002
 clinic-0.002
ippers-0.002
 wrists-0.002
Token Cr
Token ID3864
Feature activation-7.20e+00

Pos logit contributions
 unexpected+0.028
 frightening+0.025
sm+0.024
 since+0.023
 unpleasant+0.023
Neg logit contributions
bes-0.057
 ramp-0.055
ippers-0.053
amas-0.053
 clinic-0.053
Tokenust
Token ID436
Feature activation-3.61e+00

Pos logit contributions
 unexpected+0.119
 frightening+0.107
sm+0.099
 since+0.097
 scary+0.096
Neg logit contributions
bes-0.278
 ramp-0.267
ippers-0.260
amas-0.260
 clinic-0.258
Tokeny
Token ID88
Feature activation-7.38e-01

Pos logit contributions
 unexpected+0.084
 frightening+0.080
sm+0.076
 unpleasant+0.074
 dangerous+0.072
Neg logit contributions
bes-0.109
 ramp-0.106
 wrists-0.100
ippers-0.100
pel-0.099
Token lived
Token ID5615
Feature activation+1.31e+00

Pos logit contributions
 unexpected+0.018
 since+0.017
 frightening+0.017
 scary+0.017
sm+0.017
Neg logit contributions
bes-0.022
 ramp-0.021
amas-0.021
ippers-0.021
 clinic-0.021
Token in
Token ID287
Feature activation+9.85e-01

Pos logit contributions
bes+0.044
 ramp+0.042
ippers+0.041
 clinic+0.040
amas+0.040
Neg logit contributions
 unexpected-0.028
 frightening-0.026
 unpleasant-0.025
 since-0.024
sm-0.024
Token a
Token ID257
Feature activation-1.15e+00

Pos logit contributions
bes+0.037
 ramp+0.036
amas+0.035
ippers+0.035
 clinic+0.034
Neg logit contributions
 unexpected-0.018
 frightening-0.016
sm-0.015
 scary-0.015
 since-0.015
Token I
Token ID314
Feature activation+4.71e-01

Pos logit contributions
 unexpected+0.019
 frightening+0.018
 since+0.017
 scary+0.017
sm+0.017
Neg logit contributions
bes-0.031
 ramp-0.029
amas-0.028
ippers-0.028
 clinic-0.028
Token won
Token ID1839
Feature activation-2.15e+00

Pos logit contributions
bes+0.014
 ramp+0.013
amas+0.013
ippers+0.012
 clinic+0.012
Neg logit contributions
 unexpected-0.012
 frightening-0.011
sm-0.011
 since-0.011
 scary-0.011
Tokenâ
Token ID22940
Feature activation-2.93e+00

Pos logit contributions
 unexpected+0.063
 frightening+0.059
sm+0.057
 unpleasant+0.057
 since+0.056
Neg logit contributions
bes-0.057
 ramp-0.052
amas-0.050
pel-0.049
 clinic-0.049
Token
Token ID26391
Feature activation-7.59e-01

Pos logit contributions
 unexpected+0.070
 since+0.062
vious+0.060
 frightening+0.060
sm+0.059
Neg logit contributions
bes-0.091
 ramp-0.090
 wrists-0.088
ippers-0.085
 clinic-0.085
Token
Token ID8151
Feature activation-4.70e+00

Pos logit contributions
 since+0.020
sm+0.019
 unexpected+0.019
 frightening+0.018
 scary+0.018
Neg logit contributions
 ramp-0.019
 wrists-0.019
bes-0.019
pel-0.019
 slot-0.018
Tokent
Token ID83
Feature activation-7.19e+00

Pos logit contributions
 unexpected+0.122
 frightening+0.110
 unpleasant+0.104
 scary+0.099
 exciting+0.096
Neg logit contributions
bes-0.142
amas-0.139
ippers-0.137
 ramp-0.134
 clinic-0.131
Token let
Token ID1309
Feature activation-2.64e-01

Pos logit contributions
sm+0.234
elling+0.223
rous+0.221
year+0.217
 unexpected+0.214
Neg logit contributions
 ramp-0.138
 clinic-0.122
bes-0.118
 head-0.115
 neck-0.114
Token anything
Token ID1997
Feature activation-4.91e+00

Pos logit contributions
 unexpected+0.005
 frightening+0.004
sm+0.004
 since+0.004
 scary+0.004
Neg logit contributions
bes-0.010
 ramp-0.009
ippers-0.009
amas-0.009
 clinic-0.009
Token bad
Token ID2089
Feature activation-1.64e+00

Pos logit contributions
sm+0.085
 unexpected+0.083
 since+0.078
 frightening+0.078
elling+0.077
Neg logit contributions
 ramp-0.162
bes-0.162
 clinic-0.149
 wrists-0.145
amas-0.145
Token happen
Token ID1645
Feature activation+4.17e-01

Pos logit contributions
 unexpected+0.026
 frightening+0.022
 since+0.022
 scary+0.021
sm+0.020
Neg logit contributions
bes-0.066
 ramp-0.062
 clinic-0.061
amas-0.061
 wrists-0.061
Token to
Token ID284
Feature activation+1.01e+00

Pos logit contributions
bes+0.012
 ramp+0.011
ippers+0.011
amas+0.011
ates+0.011
Neg logit contributions
 unexpected-0.010
 frightening-0.010
sm-0.009
 unpleasant-0.009
vious-0.009
Token I
Token ID314
Feature activation+6.11e-01

Pos logit contributions
 unexpected+0.004
 frightening+0.004
 since+0.004
 scary+0.003
sm+0.003
Neg logit contributions
bes-0.007
 ramp-0.006
amas-0.006
ippers-0.006
 wrists-0.006
Token won
Token ID1839
Feature activation-1.64e+00

Pos logit contributions
bes+0.019
 ramp+0.018
amas+0.018
 wrists+0.018
ippers+0.018
Neg logit contributions
 unexpected-0.015
 frightening-0.013
sm-0.013
 since-0.013
 scary-0.013
Tokenâ
Token ID22940
Feature activation-2.43e+00

Pos logit contributions
 unexpected+0.043
 frightening+0.040
sm+0.038
 unpleasant+0.038
 since+0.038
Neg logit contributions
bes-0.049
 ramp-0.046
amas-0.044
ippers-0.043
 clinic-0.043
Token
Token ID26391
Feature activation-2.64e-01

Pos logit contributions
 unexpected+0.057
 since+0.052
 frightening+0.050
sm+0.050
vious+0.050
Neg logit contributions
bes-0.076
 ramp-0.075
 wrists-0.074
 clinic-0.071
ippers-0.070
Token
Token ID8151
Feature activation-4.19e+00

Pos logit contributions
 since+0.007
sm+0.007
 unexpected+0.006
 frightening+0.006
 scary+0.006
Neg logit contributions
 ramp-0.007
 wrists-0.007
bes-0.007
pel-0.007
 clinic-0.006
Tokent
Token ID83
Feature activation-7.18e+00

Pos logit contributions
 unexpected+0.108
 frightening+0.097
 unpleasant+0.093
 scary+0.088
 exciting+0.086
Neg logit contributions
bes-0.126
amas-0.122
ippers-0.120
 ramp-0.119
-0.115
Token hurt
Token ID5938
Feature activation-2.51e+00

Pos logit contributions
sm+0.201
 unexpected+0.191
elling+0.190
rous+0.185
 frightening+0.183
Neg logit contributions
 ramp-0.172
bes-0.161
 clinic-0.156
pel-0.152
 neck-0.150
Token you
Token ID345
Feature activation-7.02e-01

Pos logit contributions
 unexpected+0.053
 frightening+0.049
sm+0.047
 unpleasant+0.046
 scary+0.043
Neg logit contributions
bes-0.083
 ramp-0.079
ippers-0.077
amas-0.075
pel-0.075
Token.
Token ID13
Feature activation-7.47e-02

Pos logit contributions
 unexpected+0.013
 frightening+0.012
sm+0.011
 unpleasant+0.011
 scary+0.010
Neg logit contributions
bes-0.025
 ramp-0.024
amas-0.024
ippers-0.024
 clinic-0.024
Tokenâ
Token ID22940
Feature activation-7.55e-01

Pos logit contributions
 since+0.002
 unexpected+0.002
 frightening+0.002
 over+0.002
sm+0.002
Neg logit contributions
bes-0.002
 ramp-0.002
 wrists-0.002
ippers-0.002
kin-0.002
Token
Token ID26391
Feature activation+1.48e+00

Pos logit contributions
 unexpected+0.019
 frightening+0.018
 since+0.017
sm+0.017
 scary+0.016
Neg logit contributions
bes-0.023
 ramp-0.022
ippers-0.021
 clinic-0.021
 wrists-0.021
Token tree
Token ID5509
Feature activation+9.16e-01

Pos logit contributions
 unexpected+0.046
 frightening+0.043
sm+0.042
 since+0.041
 unpleasant+0.041
Neg logit contributions
bes-0.049
 ramp-0.046
amas-0.044
 clinic-0.043
ippers-0.043
Token won
Token ID1839
Feature activation-1.53e+00

Pos logit contributions
bes+0.027
amas+0.026
ippers+0.026
pel+0.026
 ramp+0.025
Neg logit contributions
 unexpected-0.023
 frightening-0.021
 scary-0.021
 since-0.021
 dangerous-0.020
Tokenâ
Token ID22940
Feature activation-2.76e+00

Pos logit contributions
 unexpected+0.036
 frightening+0.034
sm+0.032
 since+0.032
 scary+0.031
Neg logit contributions
bes-0.048
 ramp-0.046
ippers-0.044
amas-0.044
 clinic-0.044
Token
Token ID26391
Feature activation-9.00e-01

Pos logit contributions
 unexpected+0.064
 since+0.059
vious+0.056
 frightening+0.054
sm+0.054
Neg logit contributions
 ramp-0.089
bes-0.086
 wrists-0.084
ippers-0.083
 clinic-0.082
Token
Token ID8151
Feature activation-4.34e+00

Pos logit contributions
 since+0.022
sm+0.022
 unexpected+0.021
 frightening+0.020
 dangerous+0.020
Neg logit contributions
 ramp-0.024
pel-0.024
 wrists-0.023
bes-0.023
 slot-0.023
Tokent
Token ID83
Feature activation-7.16e+00

Pos logit contributions
 unexpected+0.099
 frightening+0.090
 unpleasant+0.084
 scary+0.081
 dangerous+0.078
Neg logit contributions
bes-0.142
amas-0.135
 ramp-0.135
ippers-0.134
 clinic-0.131
Token feel
Token ID1254
Feature activation-2.06e+00

Pos logit contributions
 unexpected+0.229
sm+0.224
 frightening+0.219
 dangerous+0.207
 unpleasant+0.206
Neg logit contributions
 ramp-0.146
bes-0.144
 clinic-0.131
pel-0.127
 wrists-0.127
Token alone
Token ID3436
Feature activation+2.05e+00

Pos logit contributions
 unexpected+0.037
sm+0.033
 frightening+0.033
 unpleasant+0.031
 since+0.030
Neg logit contributions
bes-0.072
 ramp-0.069
 clinic-0.067
ippers-0.066
pel-0.065
Token.
Token ID13
Feature activation-2.86e-01

Pos logit contributions
bes+0.069
 ramp+0.067
 clinic+0.065
amas+0.064
ippers+0.064
Neg logit contributions
 unexpected-0.042
 frightening-0.039
 unpleasant-0.036
sm-0.036
 scary-0.034
Tokenâ
Token ID22940
Feature activation-1.07e+00

Pos logit contributions
 unexpected+0.007
 since+0.006
 frightening+0.006
sm+0.006
 scary+0.006
Neg logit contributions
bes-0.009
 ramp-0.009
ippers-0.008
 wrists-0.008
pel-0.008
Token
Token ID26391
Feature activation+1.43e+00

Pos logit contributions
 unexpected+0.027
 frightening+0.025
 since+0.024
sm+0.024
 unpleasant+0.023
Neg logit contributions
bes-0.032
 ramp-0.031
ippers-0.030
 clinic-0.030
 wrists-0.030
Token Tim
Token ID5045
Feature activation-2.71e+00

Pos logit contributions
 unexpected+0.018
 frightening+0.017
sm+0.016
 unpleasant+0.016
vious+0.015
Neg logit contributions
bes-0.029
 ramp-0.029
amas-0.027
ippers-0.027
 wrists-0.027
Tokenmy
Token ID1820
Feature activation-1.60e+00

Pos logit contributions
 unexpected+0.046
 frightening+0.041
sm+0.038
 unpleasant+0.037
 scary+0.034
Neg logit contributions
bes-0.103
 ramp-0.101
amas-0.098
 clinic-0.097
ippers-0.097
Token's
Token ID338
Feature activation+1.61e+00

Pos logit contributions
 unexpected+0.042
 frightening+0.038
sm+0.037
 unpleasant+0.036
 since+0.036
Neg logit contributions
bes-0.047
 ramp-0.044
amas-0.043
 clinic-0.042
ippers-0.042
Token friends
Token ID2460
Feature activation+4.06e-01

Pos logit contributions
bes+0.048
 ramp+0.044
pel+0.043
 clinic+0.043
amas+0.042
Neg logit contributions
 unexpected-0.042
 frightening-0.039
 scary-0.038
 dangerous-0.037
sm-0.036
Token couldn
Token ID3521
Feature activation-2.82e+00

Pos logit contributions
bes+0.012
 ramp+0.011
amas+0.011
 clinic+0.011
ippers+0.011
Neg logit contributions
 unexpected-0.010
 frightening-0.009
sm-0.009
 unpleasant-0.009
 since-0.009
Token't
Token ID470
Feature activation-7.15e+00

Pos logit contributions
 unexpected+0.067
 frightening+0.061
sm+0.059
 scary+0.058
 since+0.057
Neg logit contributions
bes-0.088
 ramp-0.083
ippers-0.082
 wrists-0.082
 clinic-0.081
Token come
Token ID1282
Feature activation-1.06e+00

Pos logit contributions
 unexpected+0.186
 frightening+0.179
sm+0.173
 unpleasant+0.166
 since+0.165
Neg logit contributions
bes-0.204
 ramp-0.198
 clinic-0.186
ippers-0.185
pel-0.185
Token out
Token ID503
Feature activation-2.26e+00

Pos logit contributions
 unexpected+0.023
 frightening+0.021
sm+0.021
 unpleasant+0.020
 awful+0.018
Neg logit contributions
 ramp-0.036
ippers-0.035
bes-0.035
 clinic-0.034
ates-0.033
Token to
Token ID284
Feature activation-9.17e-01

Pos logit contributions
 unexpected+0.056
 frightening+0.052
 unpleasant+0.051
sm+0.047
 awful+0.045
Neg logit contributions
bes-0.064
 ramp-0.062
ippers-0.062
 clinic-0.062
 laptop-0.059
Token play
Token ID711
Feature activation+1.42e+00

Pos logit contributions
 unexpected+0.026
 frightening+0.025
sm+0.024
 since+0.024
 unpleasant+0.024
Neg logit contributions
bes-0.024
 ramp-0.023
 clinic-0.022
 wrists-0.022
ippers-0.021
Token.
Token ID13
Feature activation-1.01e+00

Pos logit contributions
bes+0.047
 ramp+0.044
amas+0.043
ippers+0.043
 wrists+0.043
Neg logit contributions
 unexpected-0.032
 frightening-0.029
 scary-0.028
sm-0.028
 since-0.027
Token Dad
Token ID17415
Feature activation-1.50e+00

Pos logit contributions
 ramp+0.034
bes+0.034
 wrists+0.034
ippers+0.033
amas+0.032
Neg logit contributions
 unexpected-0.034
 frightening-0.033
sm-0.032
 unpleasant-0.030
 dangerous-0.030
Token won
Token ID1839
Feature activation-1.46e+00

Pos logit contributions
 unexpected+0.031
 frightening+0.028
 since+0.027
sm+0.026
 scary+0.026
Neg logit contributions
bes-0.053
 ramp-0.051
 clinic-0.049
ippers-0.049
amas-0.049
Tokenâ
Token ID22940
Feature activation-2.51e+00

Pos logit contributions
 unexpected+0.039
 frightening+0.036
sm+0.035
 since+0.034
 unpleasant+0.034
Neg logit contributions
bes-0.043
 ramp-0.040
amas-0.039
ippers-0.038
 clinic-0.038
Token
Token ID26391
Feature activation-8.00e-01

Pos logit contributions
 unexpected+0.062
 since+0.055
 frightening+0.054
sm+0.053
vious+0.052
Neg logit contributions
bes-0.078
 ramp-0.076
 wrists-0.073
 clinic-0.072
amas-0.072
Token
Token ID8151
Feature activation-4.24e+00

Pos logit contributions
 since+0.019
 unexpected+0.019
sm+0.018
 frightening+0.017
 dangerous+0.017
Neg logit contributions
 ramp-0.023
bes-0.022
pel-0.022
 wrists-0.022
 slot-0.022
Tokent
Token ID83
Feature activation-7.14e+00

Pos logit contributions
 unexpected+0.102
 frightening+0.093
 unpleasant+0.087
 scary+0.084
 dangerous+0.081
Neg logit contributions
bes-0.133
amas-0.129
ippers-0.129
 ramp-0.128
 clinic-0.124
Token like
Token ID588
Feature activation-9.69e-03

Pos logit contributions
 unexpected+0.207
sm+0.206
 frightening+0.202
 unpleasant+0.188
 since+0.186
Neg logit contributions
 ramp-0.167
bes-0.165
ippers-0.151
pel-0.150
 clinic-0.148
Token it
Token ID340
Feature activation-2.33e+00

Pos logit contributions
 unexpected+0.000
 frightening+0.000
sm+0.000
 since+0.000
 unpleasant+0.000
Neg logit contributions
bes-0.000
 ramp-0.000
 clinic-0.000
amas-0.000
ippers-0.000
Token.
Token ID13
Feature activation-7.06e-01

Pos logit contributions
 unexpected+0.040
 frightening+0.036
sm+0.034
 since+0.033
 unpleasant+0.032
Neg logit contributions
bes-0.089
 ramp-0.085
amas-0.082
ippers-0.082
 clinic-0.082
Tokenâ
Token ID22940
Feature activation-1.37e+00

Pos logit contributions
 unexpected+0.023
 frightening+0.022
sm+0.021
 since+0.021
 unpleasant+0.021
Neg logit contributions
bes-0.016
 ramp-0.015
ippers-0.014
amas-0.014
 clinic-0.014
Token
Token ID26391
Feature activation+7.98e-01

Pos logit contributions
 unexpected+0.037
 frightening+0.035
sm+0.034
 since+0.033
 unpleasant+0.033
Neg logit contributions
bes-0.038
 ramp-0.036
amas-0.035
ippers-0.035
 clinic-0.034
Token It
Token ID632
Feature activation-1.60e+00

Pos logit contributions
 unexpected+0.013
 frightening+0.013
 since+0.013
 scary+0.012
sm+0.012
Neg logit contributions
bes-0.020
ippers-0.019
 wrists-0.019
kin-0.019
 ramp-0.018
Token won
Token ID1839
Feature activation-8.12e-01

Pos logit contributions
 unexpected+0.044
 frightening+0.040
sm+0.039
 since+0.039
 unpleasant+0.038
Neg logit contributions
bes-0.045
 ramp-0.042
 clinic-0.040
amas-0.040
pel-0.039
Tokenâ
Token ID22940
Feature activation-2.72e+00

Pos logit contributions
 unexpected+0.022
 frightening+0.021
sm+0.020
 unpleasant+0.020
 since+0.019
Neg logit contributions
bes-0.023
 ramp-0.022
amas-0.021
ippers-0.020
 clinic-0.020
Token
Token ID26391
Feature activation-1.04e+00

Pos logit contributions
 unexpected+0.060
 since+0.058
sm+0.055
vious+0.054
 frightening+0.054
Neg logit contributions
bes-0.088
 wrists-0.087
 ramp-0.087
ippers-0.082
 clinic-0.081
Token
Token ID8151
Feature activation-4.76e+00

Pos logit contributions
 since+0.030
sm+0.029
 unexpected+0.028
 dangerous+0.028
 frightening+0.028
Neg logit contributions
 ramp-0.024
 wrists-0.024
bes-0.024
kin-0.023
ippers-0.023
Tokent
Token ID83
Feature activation-7.14e+00

Pos logit contributions
 unexpected+0.118
 frightening+0.106
 unpleasant+0.101
 scary+0.097
 exciting+0.094
Neg logit contributions
bes-0.146
amas-0.143
ippers-0.141
 ramp-0.140
 clinic-0.136
Token grow
Token ID1663
Feature activation-1.36e+00

Pos logit contributions
 unexpected+0.192
sm+0.186
 frightening+0.180
rous+0.171
 unpleasant+0.170
Neg logit contributions
 ramp-0.179
bes-0.174
 clinic-0.164
pel-0.163
amas-0.152
Token here
Token ID994
Feature activation+6.55e-01

Pos logit contributions
 unexpected+0.027
sm+0.024
 frightening+0.024
 unpleasant+0.022
rous+0.022
Neg logit contributions
bes-0.041
 ramp-0.040
amas-0.040
 clinic-0.038
ippers-0.038
Token no
Token ID645
Feature activation-3.12e+00

Pos logit contributions
bes+0.023
 ramp+0.022
 clinic+0.021
amas+0.021
ippers+0.021
Neg logit contributions
 unexpected-0.013
 frightening-0.012
sm-0.011
 unpleasant-0.011
 scary-0.011
Token more
Token ID517
Feature activation+8.88e-01

Pos logit contributions
 unexpected+0.060
 frightening+0.055
sm+0.054
 since+0.052
 unpleasant+0.050
Neg logit contributions
bes-0.106
 ramp-0.104
 clinic-0.102
ippers-0.100
amas-0.099
Token because
Token ID780
Feature activation-2.59e+00

Pos logit contributions
bes+0.030
 ramp+0.029
 clinic+0.028
ippers+0.028
 wrists+0.028
Neg logit contributions
 unexpected-0.018
 frightening-0.016
sm-0.016
 since-0.016
 unpleasant-0.015
Token it
Token ID340
Feature activation-3.07e+00

Pos logit contributions
bes+0.008
 ramp+0.007
 clinic+0.007
ippers+0.007
amas+0.007
Neg logit contributions
 unexpected-0.005
 frightening-0.005
sm-0.005
 unpleasant-0.005
 since-0.004
Token fits
Token ID11414
Feature activation-4.52e-01

Pos logit contributions
 unexpected+0.065
 scary+0.064
 over+0.063
 dangerous+0.063
 frightening+0.061
Neg logit contributions
bes-0.101
ippers-0.097
pel-0.094
 slot-0.093
ates-0.093
Token!"
Token ID2474
Feature activation+1.53e-01

Pos logit contributions
 unexpected+0.010
 frightening+0.009
sm+0.009
 unpleasant+0.009
vious+0.008
Neg logit contributions
bes-0.015
ippers-0.014
ates-0.014
amas-0.014
 ramp-0.014
Token She
Token ID1375
Feature activation+6.67e-01

Pos logit contributions
bes+0.004
 ramp+0.004
amas+0.004
 clinic+0.004
ippers+0.004
Neg logit contributions
 unexpected-0.004
 over-0.004
 since-0.004
sm-0.004
 frightening-0.004
Token helped
Token ID4193
Feature activation-1.11e+00

Pos logit contributions
bes+0.022
 clinic+0.021
 ramp+0.021
ippers+0.021
pel+0.020
Neg logit contributions
 unexpected-0.015
 over-0.014
sm-0.014
 since-0.013
 frightening-0.013
Token Sil
Token ID4243
Feature activation-7.14e+00

Pos logit contributions
 unexpected+0.018
 frightening+0.016
sm+0.015
 since+0.015
 over+0.015
Neg logit contributions
bes-0.044
 ramp-0.042
amas-0.041
 clinic-0.041
ippers-0.040
Tokenvia
Token ID8869
Feature activation-5.60e+00

Pos logit contributions
 unexpected+0.200
 frightening+0.190
 scary+0.180
sm+0.180
 since+0.179
Neg logit contributions
bes-0.193
 ramp-0.181
 clinic-0.173
ippers-0.173
amas-0.172
Token put
Token ID1234
Feature activation+3.70e-01

Pos logit contributions
 unexpected+0.131
 frightening+0.119
 since+0.117
 over+0.116
sm+0.116
Neg logit contributions
bes-0.178
 ramp-0.166
amas-0.164
pel-0.162
ippers-0.161
Token on
Token ID319
Feature activation-2.28e-01

Pos logit contributions
bes+0.016
 clinic+0.015
 ramp+0.015
amas+0.015
pel+0.015
Neg logit contributions
 unexpected-0.005
 over-0.004
sm-0.004
 scary-0.004
 closer-0.004
Token the
Token ID262
Feature activation-1.76e+00

Pos logit contributions
 unexpected+0.003
sm+0.002
 frightening+0.002
 scary+0.002
 dangerous+0.002
Neg logit contributions
bes-0.010
 ramp-0.010
amas-0.010
ippers-0.009
 clinic-0.009
Token coat
Token ID13209
Feature activation-8.03e-01

Pos logit contributions
 unexpected+0.058
 frightening+0.056
 scary+0.055
 dangerous+0.054
 unpleasant+0.053
Neg logit contributions
bes-0.038
 ramp-0.035
 clinic-0.033
pel-0.033
amas-0.033
Token
Token ID198
Feature activation-2.24e+00

Pos logit contributions
 unexpected+0.094
 frightening+0.087
sm+0.084
 since+0.083
 scary+0.082
Neg logit contributions
bes-0.116
 ramp-0.110
ippers-0.105
amas-0.105
 clinic-0.105
Token
Token ID198
Feature activation-1.55e+00

Pos logit contributions
 unexpected+0.056
 frightening+0.051
sm+0.050
 since+0.049
 scary+0.048
Neg logit contributions
bes-0.068
 ramp-0.065
ippers-0.062
amas-0.062
 clinic-0.062
TokenLu
Token ID25596
Feature activation-1.13e+00

Pos logit contributions
 unexpected+0.037
 frightening+0.034
sm+0.033
 since+0.033
 scary+0.032
Neg logit contributions
bes-0.049
 ramp-0.047
ippers-0.045
amas-0.045
 clinic-0.045
Tokency
Token ID948
Feature activation-3.22e-01

Pos logit contributions
 unexpected+0.023
sm+0.022
 since+0.021
 frightening+0.021
 over+0.020
Neg logit contributions
bes-0.039
 ramp-0.039
ippers-0.037
amas-0.036
pel-0.036
Token couldn
Token ID3521
Feature activation-2.50e+00

Pos logit contributions
 unexpected+0.007
 since+0.006
 frightening+0.006
sm+0.006
 over+0.006
Neg logit contributions
ippers-0.011
bes-0.011
 ramp-0.011
amas-0.011
ates-0.011
Token't
Token ID470
Feature activation-7.13e+00

Pos logit contributions
 unexpected+0.068
 frightening+0.064
sm+0.062
 since+0.061
 unpleasant+0.060
Neg logit contributions
bes-0.070
 ramp-0.066
amas-0.064
ippers-0.063
 clinic-0.063
Token stop
Token ID2245
Feature activation-2.94e+00

Pos logit contributions
 frightening+0.174
sm+0.171
 unexpected+0.171
 dangerous+0.161
 unpleasant+0.159
Neg logit contributions
bes-0.189
 ramp-0.188
pel-0.184
 wrist-0.181
 clinic-0.179
Token wondering
Token ID11263
Feature activation-3.07e+00

Pos logit contributions
 unexpected+0.078
sm+0.071
 frightening+0.070
 unpleasant+0.067
vious+0.067
Neg logit contributions
bes-0.082
 wrists-0.078
 ramp-0.077
ippers-0.076
 clinic-0.075
Token who
Token ID508
Feature activation-1.47e+00

Pos logit contributions
 unexpected+0.066
vous+0.065
 frightening+0.064
vious+0.064
sm+0.063
Neg logit contributions
 neck-0.087
 wrist-0.086
 chest-0.084
 wrists-0.084
bes-0.082
Token the
Token ID262
Feature activation+5.42e-01

Pos logit contributions
 unexpected+0.033
 frightening+0.030
sm+0.029
 since+0.028
 unpleasant+0.028
Neg logit contributions
bes-0.048
 ramp-0.046
 clinic-0.044
ippers-0.044
amas-0.044
Token mail
Token ID6920
Feature activation+1.10e+00

Pos logit contributions
bes+0.017
ippers+0.016
 ramp+0.016
amas+0.016
ates+0.015
Neg logit contributions
 unexpected-0.014
 scary-0.013
 dangerous-0.013
 frightening-0.013
 amazing-0.012
Token.
Token ID13
Feature activation-2.29e-01

Pos logit contributions
bes+0.020
ippers+0.019
 clinic+0.019
amas+0.019
 ramp+0.019
Neg logit contributions
 unexpected-0.012
 over-0.010
 frightening-0.010
 scary-0.010
 since-0.010
Token She
Token ID1375
Feature activation+8.92e-01

Pos logit contributions
 since+0.005
 unexpected+0.004
 over+0.004
 frightening+0.004
 Sally+0.004
Neg logit contributions
bes-0.008
 ramp-0.007
ates-0.007
ippers-0.007
kin-0.007
Token played
Token ID2826
Feature activation+1.65e+00

Pos logit contributions
bes+0.030
ippers+0.029
amas+0.028
 clinic+0.028
 ramp+0.028
Neg logit contributions
 unexpected-0.020
 since-0.018
 frightening-0.018
 over-0.018
 scary-0.017
Token and
Token ID290
Feature activation-1.13e+00

Pos logit contributions
ippers+0.060
bes+0.059
 clinic+0.059
amas+0.059
 slot+0.058
Neg logit contributions
 unexpected-0.029
 over-0.027
 amazing-0.026
 scary-0.026
 exciting-0.025
Token won
Token ID1839
Feature activation-6.13e-01

Pos logit contributions
 unexpected+0.029
 frightening+0.027
sm+0.026
 since+0.026
 scary+0.025
Neg logit contributions
bes-0.033
 ramp-0.032
ippers-0.031
amas-0.031
 clinic-0.030
Token an
Token ID281
Feature activation-7.12e+00

Pos logit contributions
 unexpected+0.010
 frightening+0.009
sm+0.008
 since+0.008
 unpleasant+0.008
Neg logit contributions
bes-0.024
 ramp-0.023
amas-0.022
ippers-0.022
 clinic-0.022
Token amazing
Token ID4998
Feature activation-1.70e+00

Pos logit contributions
sm+0.134
 frightening+0.133
 unexpected+0.126
 since+0.118
 scary+0.114
Neg logit contributions
 ramp-0.237
bes-0.232
 clinic-0.220
 trailer-0.218
 wrists-0.217
Token prize
Token ID11596
Feature activation-1.13e-01

Pos logit contributions
 unexpected+0.041
 frightening+0.037
 scary+0.037
 amazing+0.036
 exciting+0.036
Neg logit contributions
bes-0.053
ippers-0.050
 clinic-0.050
pel-0.049
ates-0.049
Token.
Token ID13
Feature activation-1.48e-02

Pos logit contributions
 unexpected+0.002
 frightening+0.002
sm+0.002
 unpleasant+0.002
 since+0.002
Neg logit contributions
bes-0.004
 ramp-0.004
ippers-0.004
 clinic-0.004
amas-0.004
Token It
Token ID632
Feature activation-1.32e+00

Pos logit contributions
 since+0.000
 unexpected+0.000
 over+0.000
sm+0.000
 frightening+0.000
Neg logit contributions
bes-0.000
 ramp-0.000
ippers-0.000
ates-0.000
 clinic-0.000
Token was
Token ID373
Feature activation+2.24e+00

Pos logit contributions
 unexpected+0.034
 frightening+0.031
sm+0.030
 since+0.030
 scary+0.029
Neg logit contributions
bes-0.040
 ramp-0.038
ippers-0.037
amas-0.037
 clinic-0.036
Token so
Token ID523
Feature activation-1.80e+00

Pos logit contributions
 unexpected+0.061
sm+0.059
 frightening+0.056
 since+0.053
 unpleasant+0.052
Neg logit contributions
 ramp-0.093
bes-0.092
 clinic-0.086
 wrists-0.084
ippers-0.084
Token he
Token ID339
Feature activation-1.38e+00

Pos logit contributions
 unexpected+0.047
vious+0.046
elling+0.045
 frightening+0.045
 unpleasant+0.044
Neg logit contributions
bes-0.047
 ramp-0.046
 neck-0.044
ippers-0.043
 slot-0.043
Token decided
Token ID3066
Feature activation-2.17e+00

Pos logit contributions
 unexpected+0.030
 frightening+0.027
 scary+0.026
 since+0.026
 dangerous+0.026
Neg logit contributions
bes-0.048
ippers-0.046
amas-0.046
 ramp-0.045
 clinic-0.044
Token to
Token ID284
Feature activation-4.77e-01

Pos logit contributions
 unexpected+0.040
sm+0.039
elling+0.036
vious+0.036
 frightening+0.035
Neg logit contributions
bes-0.075
 ramp-0.073
 clinic-0.071
 wrists-0.071
 neck-0.070
Token take
Token ID1011
Feature activation-1.25e+00

Pos logit contributions
 unexpected+0.011
 over+0.009
 frightening+0.009
 scary+0.009
 exciting+0.009
Neg logit contributions
bes-0.016
amas-0.015
ippers-0.015
 ramp-0.015
ates-0.015
Token an
Token ID281
Feature activation-7.12e+00

Pos logit contributions
 unexpected+0.027
 frightening+0.025
 since+0.024
sm+0.024
 unpleasant+0.023
Neg logit contributions
bes-0.042
 ramp-0.040
ippers-0.039
 clinic-0.038
amas-0.038
Token exam
Token ID2814
Feature activation-1.88e+00

Pos logit contributions
sm+0.160
 since+0.154
 frightening+0.152
 unexpected+0.151
 unpleasant+0.136
Neg logit contributions
 ramp-0.209
bes-0.206
 clinic-0.205
 trailer-0.200
pel-0.195
Token.
Token ID13
Feature activation-2.11e+00

Pos logit contributions
 unexpected+0.030
 frightening+0.026
sm+0.025
 since+0.024
 unpleasant+0.024
Neg logit contributions
bes-0.074
 ramp-0.071
amas-0.069
ippers-0.069
 clinic-0.069
Token He
Token ID679
Feature activation-4.71e-01

Pos logit contributions
 over+0.037
 since+0.036
 to+0.033
 Joe+0.032
 Sally+0.031
Neg logit contributions
bes-0.081
kin-0.079
 wrists-0.075
ippers-0.074
ates-0.073
Token was
Token ID373
Feature activation+8.98e-01

Pos logit contributions
 unexpected+0.011
 frightening+0.010
 scary+0.010
sm+0.010
 since+0.010
Neg logit contributions
bes-0.015
ippers-0.014
 ramp-0.014
amas-0.014
 clinic-0.014
Token a
Token ID257
Feature activation+1.05e-01

Pos logit contributions
bes+0.034
 ramp+0.032
ippers+0.032
amas+0.032
 wrists+0.031
Neg logit contributions
 unexpected-0.016
 frightening-0.015
 scary-0.014
 dangerous-0.014
sm-0.014
Token "
Token ID366
Feature activation+5.93e-01

Pos logit contributions
bes+0.026
 ramp+0.026
ippers+0.025
 clinic+0.024
pel+0.023
Neg logit contributions
 over-0.021
 unexpected-0.021
 since-0.021
 frightening-0.021
sm-0.020
TokenYes
Token ID5297
Feature activation+6.92e-01

Pos logit contributions
bes+0.019
 ramp+0.018
amas+0.017
ippers+0.017
 clinic+0.017
Neg logit contributions
 unexpected-0.014
 frightening-0.013
sm-0.012
 since-0.012
 scary-0.012
Token,
Token ID11
Feature activation+8.57e-01

Pos logit contributions
bes+0.023
 ramp+0.022
amas+0.021
ippers+0.021
 clinic+0.021
Neg logit contributions
 unexpected-0.016
 frightening-0.014
sm-0.014
 since-0.013
 unpleasant-0.013
Token Tim
Token ID5045
Feature activation-1.43e+00

Pos logit contributions
bes+0.029
 ramp+0.028
amas+0.027
ippers+0.027
 wrists+0.027
Neg logit contributions
 unexpected-0.017
 frightening-0.016
sm-0.016
 unpleasant-0.015
 scary-0.014
Token.
Token ID13
Feature activation+1.09e+00

Pos logit contributions
 unexpected+0.033
 frightening+0.030
sm+0.027
 unpleasant+0.027
 scary+0.026
Neg logit contributions
bes-0.046
 ramp-0.044
amas-0.043
ippers-0.042
 clinic-0.042
Token Be
Token ID1355
Feature activation+4.75e+00

Pos logit contributions
bes+0.035
 ramp+0.032
 wrists+0.031
ippers+0.031
 clinic+0.031
Neg logit contributions
 since-0.025
 over-0.022
sm-0.022
 unexpected-0.022
 Betty-0.022
Token careful
Token ID8161
Feature activation+4.64e-01

Pos logit contributions
 ramp+0.195
bes+0.181
 clinic+0.179
pel+0.178
ippers+0.178
Neg logit contributions
 unexpected-0.080
 dangerous-0.070
 scary-0.067
 frightening-0.065
 over-0.065
Token and
Token ID290
Feature activation+7.82e-02

Pos logit contributions
bes+0.017
 ramp+0.016
ippers+0.015
 clinic+0.015
amas+0.015
Neg logit contributions
 unexpected-0.009
 frightening-0.008
 since-0.008
sm-0.008
 scary-0.007
Token don
Token ID836
Feature activation-1.02e-02

Pos logit contributions
bes+0.002
 ramp+0.002
 wrists+0.002
 clinic+0.002
ippers+0.002
Neg logit contributions
 unexpected-0.002
 frightening-0.002
sm-0.002
 since-0.002
 unpleasant-0.002
Token't
Token ID470
Feature activation-4.87e+00

Pos logit contributions
 unexpected+0.000
 frightening+0.000
sm+0.000
 since+0.000
 unpleasant+0.000
Neg logit contributions
bes-0.000
 ramp-0.000
amas-0.000
ippers-0.000
 clinic-0.000
Token get
Token ID651
Feature activation-1.26e+00

Pos logit contributions
 unexpected+0.131
sm+0.127
 frightening+0.125
 unpleasant+0.115
 since+0.112
Neg logit contributions
 ramp-0.125
bes-0.124
 clinic-0.117
 wrists-0.115
amas-0.115
Token a
Token ID257
Feature activation+7.60e-01

Pos logit contributions
bes+0.047
 ramp+0.045
 clinic+0.043
amas+0.043
ippers+0.043
Neg logit contributions
 unexpected-0.028
 frightening-0.026
sm-0.024
 since-0.024
 unpleasant-0.024
Token strange
Token ID6283
Feature activation+6.05e-01

Pos logit contributions
bes+0.024
 ramp+0.023
 clinic+0.022
ippers+0.022
amas+0.022
Neg logit contributions
 unexpected-0.018
 frightening-0.016
 since-0.016
sm-0.016
 unpleasant-0.015
Token man
Token ID582
Feature activation+8.15e-01

Pos logit contributions
bes+0.017
 ramp+0.016
amas+0.016
ippers+0.015
 wrists+0.015
Neg logit contributions
 unexpected-0.017
 frightening-0.016
 scary-0.015
sm-0.015
 dangerous-0.015
Token in
Token ID287
Feature activation+2.57e+00

Pos logit contributions
bes+0.030
 ramp+0.029
amas+0.028
ippers+0.028
 wrists+0.028
Neg logit contributions
 unexpected-0.015
sm-0.014
 frightening-0.013
 since-0.013
 scary-0.013
Token the
Token ID262
Feature activation+1.57e+00

Pos logit contributions
bes+0.101
 ramp+0.097
 clinic+0.094
ippers+0.094
amas+0.093
Neg logit contributions
 unexpected-0.041
 frightening-0.037
sm-0.034
 since-0.034
 unpleasant-0.033
Token park
Token ID3952
Feature activation+4.74e+00

Pos logit contributions
bes+0.027
 ramp+0.025
 clinic+0.023
amas+0.023
 slot+0.023
Neg logit contributions
 unexpected-0.060
 frightening-0.057
sm-0.056
 since-0.056
 unpleasant-0.055
Token.
Token ID13
Feature activation+1.09e+00

Pos logit contributions
bes+0.183
 ramp+0.175
amas+0.172
ippers+0.172
pel+0.169
Neg logit contributions
 unexpected-0.080
 frightening-0.071
 since-0.069
sm-0.068
 scary-0.067
Token He
Token ID679
Feature activation+2.41e+00

Pos logit contributions
bes+0.030
 ramp+0.028
 wrists+0.027
amas+0.027
ippers+0.027
Neg logit contributions
 unexpected-0.030
 frightening-0.028
sm-0.028
 since-0.028
 scary-0.027
Token has
Token ID468
Feature activation+1.30e+00

Pos logit contributions
bes+0.069
ippers+0.066
amas+0.066
 ramp+0.065
 wrists+0.064
Neg logit contributions
 unexpected-0.066
 frightening-0.062
 dangerous-0.061
 scary-0.061
sm-0.060
Token a
Token ID257
Feature activation+3.31e-01

Pos logit contributions
bes+0.047
 ramp+0.045
amas+0.044
ippers+0.044
 clinic+0.043
Neg logit contributions
 unexpected-0.025
 frightening-0.023
sm-0.022
 scary-0.021
 since-0.021
Token box
Token ID3091
Feature activation+1.31e+00

Pos logit contributions
bes+0.010
amas+0.010
 ramp+0.009
ippers+0.009
ates+0.009
Neg logit contributions
 unexpected-0.008
 frightening-0.008
 scary-0.008
 dangerous-0.007
sm-0.007
Token."
Token ID526
Feature activation+2.23e+00

Pos logit contributions
bes+0.034
 ramp+0.033
ippers+0.032
 clinic+0.032
pel+0.032
Neg logit contributions
 unexpected-0.015
 since-0.014
 frightening-0.013
 scary-0.013
sm-0.012
Token
Token ID198
Feature activation+1.27e+00

Pos logit contributions
bes+0.069
 ramp+0.066
amas+0.063
ippers+0.063
 clinic+0.062
Neg logit contributions
 unexpected-0.054
 frightening-0.050
sm-0.049
 since-0.048
 scary-0.047
Token
Token ID198
Feature activation+1.57e+00

Pos logit contributions
bes+0.039
 ramp+0.037
ippers+0.036
amas+0.036
 clinic+0.036
Neg logit contributions
 unexpected-0.031
 frightening-0.029
 since-0.028
sm-0.028
 unpleasant-0.027
TokenHer
Token ID9360
Feature activation+1.37e+00

Pos logit contributions
 ramp+0.052
bes+0.051
amas+0.050
pel+0.050
ippers+0.049
Neg logit contributions
 unexpected-0.034
sm-0.031
 frightening-0.029
 since-0.028
ful-0.028
Token mom
Token ID1995
Feature activation-2.77e-01

Pos logit contributions
bes+0.033
 ramp+0.032
ippers+0.030
 clinic+0.030
amas+0.030
Neg logit contributions
 unexpected-0.043
 frightening-0.040
 since-0.039
sm-0.039
 unpleasant-0.038
Token says
Token ID1139
Feature activation+4.95e+00

Pos logit contributions
 over+0.005
 unexpected+0.005
sm+0.005
 scary+0.005
 frightening+0.005
Neg logit contributions
bes-0.010
ippers-0.010
amas-0.009
pel-0.009
 ramp-0.009
Token,
Token ID11
Feature activation+1.72e+00

Pos logit contributions
bes+0.186
 ramp+0.180
amas+0.172
 clinic+0.171
ippers+0.170
Neg logit contributions
 unexpected-0.088
sm-0.080
 frightening-0.078
 since-0.077
 scary-0.076
Token "
Token ID366
Feature activation+1.51e+00

Pos logit contributions
bes+0.051
 ramp+0.051
ippers+0.049
 clinic+0.048
pel+0.047
Neg logit contributions
 over-0.041
sm-0.039
 since-0.038
 unexpected-0.038
 frightening-0.037
TokenI
Token ID40
Feature activation+1.44e+00

Pos logit contributions
bes+0.054
 ramp+0.054
ippers+0.053
amas+0.052
pel+0.052
Neg logit contributions
 unexpected-0.028
sm-0.027
 since-0.027
 frightening-0.025
 scary-0.024
Token love
Token ID1842
Feature activation+9.56e-01

Pos logit contributions
bes+0.050
 ramp+0.048
 clinic+0.048
ippers+0.047
pel+0.047
Neg logit contributions
 unexpected-0.028
 frightening-0.026
 unpleasant-0.024
sm-0.024
 scary-0.023
Token you
Token ID345
Feature activation+9.34e-01

Pos logit contributions
bes+0.038
 ramp+0.037
ippers+0.036
amas+0.036
 clinic+0.035
Neg logit contributions
 unexpected-0.015
 frightening-0.013
sm-0.013
 scary-0.013
 since-0.012
Token bull
Token ID6473
Feature activation+6.74e-01

Pos logit contributions
 ramp+0.028
 clinic+0.027
bes+0.026
 trailer+0.026
 slot+0.026
Neg logit contributions
 unexpected-0.031
 frightening-0.029
 since-0.028
sm-0.028
 unpleasant-0.027
Token,"
Token ID553
Feature activation+2.70e+00

Pos logit contributions
bes+0.027
amas+0.026
 ramp+0.025
ippers+0.025
pel+0.025
Neg logit contributions
 unexpected-0.011
sm-0.009
 frightening-0.009
 since-0.009
 scary-0.009
Token Lily
Token ID20037
Feature activation-8.63e-01

Pos logit contributions
bes+0.095
 ramp+0.091
ippers+0.088
 clinic+0.087
amas+0.087
Neg logit contributions
 unexpected-0.054
 frightening-0.050
sm-0.047
 since-0.046
 unpleasant-0.046
Token and
Token ID290
Feature activation+7.13e-01

Pos logit contributions
 unexpected+0.020
 frightening+0.019
 scary+0.018
 since+0.018
sm+0.018
Neg logit contributions
bes-0.027
 ramp-0.026
ippers-0.026
amas-0.025
pel-0.025
Token Ben
Token ID3932
Feature activation-4.70e-01

Pos logit contributions
bes+0.021
 ramp+0.019
amas+0.019
ippers+0.019
ates+0.019
Neg logit contributions
 unexpected-0.019
 frightening-0.018
 scary-0.018
 since-0.017
 dangerous-0.017
Token say
Token ID910
Feature activation+4.82e+00

Pos logit contributions
 unexpected+0.009
 frightening+0.008
 scary+0.007
 since+0.007
sm+0.007
Neg logit contributions
bes-0.018
 ramp-0.017
amas-0.016
ippers-0.016
 clinic-0.016
Token.
Token ID13
Feature activation+1.66e+00

Pos logit contributions
bes+0.202
 ramp+0.197
amas+0.191
pel+0.188
 clinic+0.186
Neg logit contributions
 unexpected-0.061
sm-0.059
 over-0.054
 frightening-0.054
 closer-0.052
Token "
Token ID366
Feature activation-3.90e-02

Pos logit contributions
bes+0.056
ippers+0.051
amas+0.051
kin+0.050
ates+0.050
Neg logit contributions
 over-0.037
 unexpected-0.035
 scary-0.035
 frightening-0.035
sm-0.034
TokenYou
Token ID1639
Feature activation+1.47e+00

Pos logit contributions
 unexpected+0.001
sm+0.001
 frightening+0.001
 since+0.001
 scary+0.001
Neg logit contributions
bes-0.001
 ramp-0.001
amas-0.001
ippers-0.001
 clinic-0.001
Token are
Token ID389
Feature activation+4.04e+00

Pos logit contributions
bes+0.046
amas+0.043
 ramp+0.043
ippers+0.042
 wrists+0.041
Neg logit contributions
 unexpected-0.039
 frightening-0.035
 scary-0.034
 dangerous-0.034
sm-0.033
Token big
Token ID1263
Feature activation+6.31e-01

Pos logit contributions
bes+0.148
 ramp+0.142
amas+0.138
ippers+0.138
 clinic+0.137
Neg logit contributions
 unexpected-0.077
 frightening-0.070
sm-0.066
 scary-0.065
 since-0.064
Token our
Token ID674
Feature activation+2.88e+00

Pos logit contributions
bes+0.054
 ramp+0.052
amas+0.051
ippers+0.050
 clinic+0.050
Neg logit contributions
 unexpected-0.023
 frightening-0.021
sm-0.019
 since-0.019
 scary-0.019
Token country
Token ID1499
Feature activation+2.71e+00

Pos logit contributions
bes+0.083
amas+0.077
 ramp+0.074
pel+0.073
ippers+0.071
Neg logit contributions
 unexpected-0.083
 frightening-0.076
 scary-0.075
 dangerous-0.074
 exciting-0.073
Token a
Token ID257
Feature activation+3.33e-01

Pos logit contributions
bes+0.104
 ramp+0.098
ippers+0.097
amas+0.097
pel+0.096
Neg logit contributions
 unexpected-0.047
 since-0.043
 frightening-0.042
sm-0.040
 scary-0.040
Token long
Token ID890
Feature activation+2.03e+00

Pos logit contributions
bes+0.012
amas+0.011
ippers+0.011
 ramp+0.011
ates+0.011
Neg logit contributions
 unexpected-0.007
 frightening-0.007
 scary-0.006
 dangerous-0.006
 since-0.006
Token time
Token ID640
Feature activation+3.09e+00

Pos logit contributions
bes+0.064
amas+0.064
ippers+0.060
 ramp+0.060
 wrists+0.059
Neg logit contributions
 unexpected-0.048
 since-0.044
 frightening-0.044
 scary-0.044
 dangerous-0.043
Token ago
Token ID2084
Feature activation+6.21e+00

Pos logit contributions
bes+0.116
 ramp+0.112
amas+0.110
ippers+0.108
 wrists+0.107
Neg logit contributions
 unexpected-0.056
 since-0.051
 frightening-0.050
sm-0.048
 scary-0.047
Token.
Token ID13
Feature activation-2.63e-01

Pos logit contributions
 ramp+0.252
bes+0.252
pel+0.242
 clinic+0.241
amas+0.241
Neg logit contributions
 since-0.077
 unexpected-0.068
 over-0.067
!-0.063
sm-0.059
Token He
Token ID679
Feature activation+8.61e-01

Pos logit contributions
 since+0.006
 unexpected+0.006
 frightening+0.005
 over+0.005
sm+0.005
Neg logit contributions
bes-0.009
 wrists-0.008
ippers-0.008
 ramp-0.008
amas-0.008
Token helped
Token ID4193
Feature activation-9.82e-01

Pos logit contributions
bes+0.030
ippers+0.029
amas+0.028
 ramp+0.028
ates+0.027
Neg logit contributions
 unexpected-0.019
 frightening-0.017
 since-0.017
 scary-0.016
sm-0.016
Token many
Token ID867
Feature activation-1.54e+00

Pos logit contributions
 unexpected+0.019
 frightening+0.017
sm+0.016
 since+0.016
 scary+0.016
Neg logit contributions
bes-0.036
 ramp-0.034
amas-0.033
ippers-0.033
 clinic-0.033
Token people
Token ID661
Feature activation-1.07e+00

Pos logit contributions
 unexpected+0.038
 frightening+0.036
sm+0.035
 since+0.034
 unpleasant+0.034
Neg logit contributions
bes-0.046
 ramp-0.044
 clinic-0.042
ippers-0.042
amas-0.042
Token.
Token ID13
Feature activation-1.85e-01

Pos logit contributions
 unexpected+0.003
 frightening+0.003
sm+0.003
 unpleasant+0.003
 since+0.003
Neg logit contributions
bes-0.006
 ramp-0.006
ippers-0.006
amas-0.006
 clinic-0.006
Token
Token ID220
Feature activation-2.67e+00

Pos logit contributions
 since+0.004
 over+0.004
 Joe+0.004
 Sally+0.004
 unexpected+0.004
Neg logit contributions
bes-0.006
kin-0.006
ippers-0.006
ates-0.005
pel-0.005
Token
Token ID198
Feature activation-1.02e+00

Pos logit contributions
 unexpected+0.051
sm+0.048
 over+0.047
 frightening+0.047
 since+0.045
Neg logit contributions
bes-0.092
 wrists-0.088
 ramp-0.088
pel-0.085
 clinic-0.085
Token
Token ID198
Feature activation-4.25e-01

Pos logit contributions
 unexpected+0.023
 frightening+0.021
sm+0.021
 since+0.021
 scary+0.020
Neg logit contributions
bes-0.033
 ramp-0.031
ippers-0.030
 wrists-0.030
amas-0.030
TokenThe
Token ID464
Feature activation+4.77e-01

Pos logit contributions
 unexpected+0.009
 frightening+0.008
sm+0.008
 since+0.007
 unpleasant+0.007
Neg logit contributions
bes-0.015
 ramp-0.014
amas-0.014
ippers-0.014
 wrists-0.014
Token animals
Token ID4695
Feature activation+1.41e+00

Pos logit contributions
bes+0.016
 ramp+0.015
ippers+0.014
amas+0.014
 clinic+0.014
Neg logit contributions
 unexpected-0.011
 frightening-0.010
sm-0.010
 scary-0.010
 since-0.010
Token were
Token ID547
Feature activation+2.38e+00

Pos logit contributions
bes+0.047
 ramp+0.044
amas+0.044
ippers+0.043
pel+0.042
Neg logit contributions
 unexpected-0.033
 frightening-0.030
sm-0.029
 since-0.029
 scary-0.028
Token having
Token ID1719
Feature activation+1.83e+00

Pos logit contributions
bes+0.097
 ramp+0.095
amas+0.093
ippers+0.093
pel+0.090
Neg logit contributions
 unexpected-0.038
 frightening-0.034
 scary-0.032
 over-0.032
 amazing-0.031
Token so
Token ID523
Feature activation-1.68e+00

Pos logit contributions
bes+0.077
 ramp+0.075
amas+0.073
ippers+0.073
 clinic+0.072
Neg logit contributions
 unexpected-0.027
 frightening-0.022
 scary-0.021
 since-0.021
 exciting-0.020
Token much
Token ID881
Feature activation-2.74e+00

Pos logit contributions
 unexpected+0.025
 frightening+0.023
 since+0.022
 unpleasant+0.021
sm+0.019
Neg logit contributions
bes-0.066
 ramp-0.062
 clinic-0.061
ippers-0.061
 slot-0.060
Token fun
Token ID1257
Feature activation+1.21e+00

Pos logit contributions
 unexpected+0.049
 frightening+0.044
 since+0.042
sm+0.042
 unpleasant+0.042
Neg logit contributions
bes-0.100
 ramp-0.096
 clinic-0.093
ippers-0.092
amas-0.091
Tokenily
Token ID813
Feature activation-8.54e-01

Pos logit contributions
 unexpected+0.050
 frightening+0.048
 unpleasant+0.044
 scary+0.044
 since+0.044
Neg logit contributions
bes-0.063
 ramp-0.058
ippers-0.057
amas-0.057
 clinic-0.056
Token took
Token ID1718
Feature activation-1.23e+00

Pos logit contributions
 unexpected+0.018
 frightening+0.017
 since+0.017
sm+0.016
 scary+0.016
Neg logit contributions
bes-0.029
 ramp-0.028
ippers-0.028
amas-0.027
ates-0.027
Token off
Token ID572
Feature activation-1.05e+00

Pos logit contributions
 unexpected+0.027
 frightening+0.026
 unpleasant+0.025
 since+0.024
sm+0.023
Neg logit contributions
bes-0.039
 ramp-0.037
 kit-0.036
 clinic-0.036
 wrists-0.036
Token one
Token ID530
Feature activation-5.90e+00

Pos logit contributions
 unexpected+0.020
 frightening+0.019
 unpleasant+0.017
 since+0.017
 scary+0.016
Neg logit contributions
bes-0.038
 ramp-0.036
 clinic-0.035
amas-0.035
ippers-0.035
Token of
Token ID286
Feature activation+5.30e-01

Pos logit contributions
 frightening+0.154
 unexpected+0.153
 unpleasant+0.143
sm+0.141
 shouldn+0.139
Neg logit contributions
 ramp-0.160
 slot-0.152
 clinic-0.150
 wrist-0.149
bes-0.147
Token her
Token ID607
Feature activation+7.65e-01

Pos logit contributions
bes+0.019
 ramp+0.019
 wrists+0.018
 clinic+0.018
ippers+0.018
Neg logit contributions
 unexpected-0.010
 frightening-0.009
sm-0.008
 unpleasant-0.008
 since-0.008
Token pink
Token ID11398
Feature activation+6.54e-01

Pos logit contributions
bes+0.022
 ramp+0.021
amas+0.021
pel+0.021
 clinic+0.020
Neg logit contributions
 unexpected-0.020
 frightening-0.019
 scary-0.018
sm-0.018
 dangerous-0.018
Token m
Token ID285
Feature activation-1.30e+00

Pos logit contributions
bes+0.016
 ramp+0.016
pel+0.016
amas+0.015
 slot+0.015
Neg logit contributions
 unexpected-0.018
sm-0.017
 scary-0.017
 to-0.017
 dangerous-0.017
Tokenittens
Token ID34978
Feature activation+9.26e-01

Pos logit contributions
 unexpected+0.042
 frightening+0.040
sm+0.039
 since+0.038
 unpleasant+0.038
Neg logit contributions
bes-0.030
 ramp-0.028
 clinic-0.026
amas-0.026
 wrists-0.026
Token and
Token ID290
Feature activation-1.77e-01

Pos logit contributions
bes+0.033
 ramp+0.031
ippers+0.030
 clinic+0.030
amas+0.030
Neg logit contributions
 unexpected-0.018
 frightening-0.017
 since-0.016
sm-0.016
 scary-0.016
Token put
Token ID1234
Feature activation+8.40e-01

Pos logit contributions
 unexpected+0.005
 frightening+0.005
sm+0.004
 unpleasant+0.004
 since+0.004
Neg logit contributions
bes-0.005
 ramp-0.004
 wrists-0.004
amas-0.004
 clinic-0.004
Token to
Token ID284
Feature activation-8.97e-02

Pos logit contributions
 unexpected+0.091
vious+0.081
vous+0.081
 frightening+0.080
 since+0.079
Neg logit contributions
bes-0.109
 wrists-0.107
 neck-0.105
 hair-0.102
 clinic-0.102
Token pet
Token ID4273
Feature activation-1.25e+00

Pos logit contributions
 unexpected+0.002
 frightening+0.002
sm+0.002
 since+0.002
 unpleasant+0.002
Neg logit contributions
bes-0.003
 clinic-0.002
 ramp-0.002
 wrists-0.002
 wrist-0.002
Token it
Token ID340
Feature activation-3.63e+00

Pos logit contributions
 unexpected+0.030
 frightening+0.029
 unpleasant+0.028
sm+0.026
 exciting+0.026
Neg logit contributions
bes-0.036
 ramp-0.035
 wrists-0.035
 clinic-0.035
 slot-0.034
Token again
Token ID757
Feature activation-1.47e+00

Pos logit contributions
 unexpected+0.084
 frightening+0.078
sm+0.074
 unpleasant+0.071
 since+0.069
Neg logit contributions
bes-0.117
 ramp-0.112
 clinic-0.110
amas-0.109
 wrists-0.106
Token."
Token ID526
Feature activation+2.17e+00

Pos logit contributions
 unexpected+0.039
 frightening+0.037
sm+0.036
 unpleasant+0.035
vious+0.033
Neg logit contributions
bes-0.039
 clinic-0.038
 ramp-0.037
 wrists-0.037
ippers-0.036
Token
Token ID198
Feature activation+9.58e-01

Pos logit contributions
 ramp+0.080
bes+0.080
pel+0.076
ates+0.074
amas+0.074
Neg logit contributions
 unexpected-0.036
 since-0.036
sm-0.035
 over-0.034
 frightening-0.033
Token
Token ID198
Feature activation+1.28e+00

Pos logit contributions
bes+0.029
 ramp+0.027
ippers+0.026
 clinic+0.026
amas+0.026
Neg logit contributions
 unexpected-0.025
 frightening-0.023
 since-0.022
sm-0.022
 unpleasant-0.021
TokenHer
Token ID9360
Feature activation+8.94e-01

Pos logit contributions
bes+0.041
 ramp+0.040
amas+0.039
pel+0.038
ippers+0.038
Neg logit contributions
 unexpected-0.030
sm-0.027
 frightening-0.026
 since-0.025
 unpleasant-0.025
Token mom
Token ID1995
Feature activation-9.04e-01

Pos logit contributions
bes+0.025
 ramp+0.024
amas+0.023
ippers+0.023
pel+0.023
Neg logit contributions
 unexpected-0.025
 frightening-0.023
 scary-0.022
sm-0.022
 unpleasant-0.022
Token said
Token ID531
Feature activation+4.08e+00

Pos logit contributions
 unexpected+0.019
sm+0.018
 since+0.018
 frightening+0.017
 over+0.017
Neg logit contributions
bes-0.031
ippers-0.030
 ramp-0.030
amas-0.029
 clinic-0.029
Token "
Token ID366
Feature activation+1.27e+00

Pos logit contributions
bes+0.122
 ramp+0.122
ippers+0.111
amas+0.111
pel+0.109
Neg logit contributions
 unexpected-0.101
sm-0.098
 frightening-0.094
 over-0.094
 since-0.093
Token It
Token ID632
Feature activation-1.54e+00

Pos logit contributions
 unexpected+0.009
 frightening+0.009
 since+0.008
sm+0.008
 over+0.008
Neg logit contributions
bes-0.011
ippers-0.010
 ramp-0.010
ates-0.010
 clinic-0.010
Token was
Token ID373
Feature activation+2.12e+00

Pos logit contributions
 unexpected+0.041
 frightening+0.038
 since+0.037
sm+0.037
 unpleasant+0.036
Neg logit contributions
bes-0.044
 ramp-0.041
amas-0.039
 clinic-0.039
 wrists-0.039
Token brown
Token ID7586
Feature activation+7.97e-01

Pos logit contributions
bes+0.071
 ramp+0.067
amas+0.067
 slot+0.065
 wrists+0.065
Neg logit contributions
 unexpected-0.041
 frightening-0.038
sm-0.038
vious-0.037
 since-0.036
Token and
Token ID290
Feature activation+8.38e-01

Pos logit contributions
bes+0.028
 ramp+0.027
ippers+0.026
 clinic+0.026
amas+0.026
Neg logit contributions
 unexpected-0.016
 frightening-0.015
 since-0.014
sm-0.014
 scary-0.014
Token smelled
Token ID37609
Feature activation-2.38e+00

Pos logit contributions
bes+0.027
 ramp+0.026
 wrists+0.025
amas+0.024
 slot+0.024
Neg logit contributions
 unexpected-0.019
sm-0.017
 frightening-0.017
vious-0.016
 since-0.016
Token delicious
Token ID12625
Feature activation+1.48e+00

Pos logit contributions
 unexpected+0.042
 frightening+0.041
sm+0.039
 since+0.038
 unpleasant+0.036
Neg logit contributions
bes-0.079
amas-0.075
 ramp-0.075
 slot-0.073
pel-0.072
Token.
Token ID13
Feature activation-4.15e-01

Pos logit contributions
bes+0.056
 ramp+0.053
amas+0.052
ippers+0.051
 clinic+0.051
Neg logit contributions
 unexpected-0.026
 frightening-0.024
sm-0.022
 since-0.022
 unpleasant-0.022
Token Mom
Token ID11254
Feature activation-2.22e+00

Pos logit contributions
 unexpected+0.010
 frightening+0.010
 since+0.010
sm+0.010
 scary+0.009
Neg logit contributions
bes-0.012
 ramp-0.012
ippers-0.012
pel-0.011
 clinic-0.011
Tokenmy
Token ID1820
Feature activation-5.68e-01

Pos logit contributions
 unexpected+0.033
 frightening+0.028
sm+0.025
 unpleasant+0.025
 scary+0.024
Neg logit contributions
bes-0.091
 ramp-0.088
amas-0.086
ippers-0.085
 clinic-0.084
Token said
Token ID531
Feature activation+3.03e+00

Pos logit contributions
 unexpected+0.013
 frightening+0.012
sm+0.012
 since+0.012
 scary+0.011
Neg logit contributions
bes-0.018
 ramp-0.017
ippers-0.017
 clinic-0.017
amas-0.017
Token,
Token ID11
Feature activation+8.99e-01

Pos logit contributions
bes+0.100
 ramp+0.094
amas+0.092
ippers+0.091
 clinic+0.091
Neg logit contributions
 unexpected-0.068
 frightening-0.062
sm-0.059
 since-0.058
 unpleasant-0.058
Token other
Token ID584
Feature activation+1.78e+00

Pos logit contributions
 unexpected+0.015
 frightening+0.014
sm+0.013
 unpleasant+0.013
 since+0.012
Neg logit contributions
bes-0.023
 ramp-0.022
 clinic-0.021
ippers-0.021
 wrists-0.021
Token.
Token ID13
Feature activation+1.19e-01

Pos logit contributions
bes+0.066
 ramp+0.064
amas+0.063
ippers+0.062
pel+0.062
Neg logit contributions
 unexpected-0.033
 frightening-0.030
 since-0.030
sm-0.029
 scary-0.029
Token Before
Token ID7413
Feature activation+2.88e+00

Pos logit contributions
bes+0.004
kin+0.004
ippers+0.004
amas+0.004
 ramp+0.003
Neg logit contributions
 since-0.003
 over-0.003
sm-0.003
 unexpected-0.002
 Betty-0.002
Token long
Token ID890
Feature activation-4.13e-02

Pos logit contributions
bes+0.104
 ramp+0.099
amas+0.098
ippers+0.098
 clinic+0.096
Neg logit contributions
 unexpected-0.058
 frightening-0.053
 scary-0.051
sm-0.049
 since-0.048
Token,
Token ID11
Feature activation-8.72e-02

Pos logit contributions
 unexpected+0.001
 scary+0.001
 since+0.001
 frightening+0.001
 dangerous+0.001
Neg logit contributions
amas-0.002
 ramp-0.002
bes-0.002
 slot-0.002
ippers-0.002
Token the
Token ID262
Feature activation+1.74e+00

Pos logit contributions
 unexpected+0.002
 frightening+0.001
 since+0.001
 scary+0.001
sm+0.001
Neg logit contributions
bes-0.003
amas-0.003
 ramp-0.003
ippers-0.003
pel-0.003
Token other
Token ID584
Feature activation+2.83e+00

Pos logit contributions
bes+0.055
 ramp+0.053
amas+0.051
ippers+0.051
 clinic+0.050
Neg logit contributions
 unexpected-0.040
 frightening-0.037
sm-0.037
 since-0.035
 unpleasant-0.035
Token animals
Token ID4695
Feature activation+1.33e+00

Pos logit contributions
bes+0.086
 ramp+0.081
ippers+0.079
amas+0.078
 clinic+0.078
Neg logit contributions
 unexpected-0.069
 frightening-0.064
sm-0.062
 unpleasant-0.061
 since-0.060
Token in
Token ID287
Feature activation+2.21e+00

Pos logit contributions
bes+0.043
 ramp+0.041
amas+0.039
ippers+0.039
 clinic+0.039
Neg logit contributions
 unexpected-0.031
 frightening-0.029
sm-0.028
 since-0.027
 scary-0.027
Token the
Token ID262
Feature activation+4.37e-01

Pos logit contributions
bes+0.101
amas+0.100
 ramp+0.097
ippers+0.094
ates+0.092
Neg logit contributions
 unexpected-0.027
 frightening-0.021
 scary-0.020
sm-0.019
 since-0.018
Token park
Token ID3952
Feature activation+2.79e+00

Pos logit contributions
bes+0.013
 ramp+0.012
ippers+0.012
amas+0.012
pel+0.011
Neg logit contributions
 unexpected-0.012
 frightening-0.011
 scary-0.011
 since-0.010
sm-0.010
Token flowers
Token ID12734
Feature activation+2.71e+00

Pos logit contributions
bes+0.061
 ramp+0.059
amas+0.058
 clinic+0.057
pel+0.057
Neg logit contributions
 unexpected-0.024
 since-0.021
 scary-0.021
sm-0.021
 frightening-0.021
Token.
Token ID13
Feature activation-1.26e-01

Pos logit contributions
bes+0.101
 ramp+0.098
amas+0.095
ippers+0.094
 clinic+0.094
Neg logit contributions
 unexpected-0.049
 frightening-0.044
sm-0.042
 since-0.041
 scary-0.040
Token Sue
Token ID26089
Feature activation-1.73e+00

Pos logit contributions
 unexpected+0.003
 frightening+0.003
sm+0.003
 since+0.003
 unpleasant+0.003
Neg logit contributions
bes-0.004
 ramp-0.004
amas-0.004
ippers-0.004
 clinic-0.004
Token picked
Token ID6497
Feature activation-6.60e-01

Pos logit contributions
 unexpected+0.037
 frightening+0.034
 since+0.033
sm+0.033
 scary+0.032
Neg logit contributions
bes-0.059
 ramp-0.056
ippers-0.055
amas-0.055
 clinic-0.054
Token a
Token ID257
Feature activation-3.61e+00

Pos logit contributions
 unexpected+0.014
 frightening+0.013
 since+0.012
sm+0.012
 unpleasant+0.012
Neg logit contributions
bes-0.022
 wrists-0.021
 ramp-0.021
ippers-0.021
 clinic-0.021
Token blue
Token ID4171
Feature activation-5.28e-01

Pos logit contributions
 unexpected+0.072
 frightening+0.066
 scary+0.063
sm+0.062
 since+0.061
Neg logit contributions
bes-0.128
 ramp-0.120
amas-0.119
ippers-0.118
 clinic-0.117
Token flower
Token ID15061
Feature activation-6.81e-01

Pos logit contributions
 unexpected+0.013
 since+0.012
sm+0.012
 over+0.012
 frightening+0.011
Neg logit contributions
bes-0.016
 ramp-0.015
amas-0.015
ippers-0.014
 clinic-0.014
Token that
Token ID326
Feature activation+2.39e-01

Pos logit contributions
 unexpected+0.014
 frightening+0.013
sm+0.013
 since+0.013
 scary+0.012
Neg logit contributions
bes-0.023
 ramp-0.022
amas-0.021
ippers-0.021
 clinic-0.021
Token would
Token ID561
Feature activation+2.44e+00

Pos logit contributions
bes+0.007
 ramp+0.007
 wrists+0.006
ippers+0.006
 clinic+0.006
Neg logit contributions
 unexpected-0.006
 frightening-0.006
sm-0.005
 since-0.005
 unpleasant-0.005
Token match
Token ID2872
Feature activation-6.83e-01

Pos logit contributions
bes+0.070
 ramp+0.066
 wrists+0.063
 clinic+0.063
ippers+0.063
Neg logit contributions
 unexpected-0.064
 frightening-0.060
sm-0.059
 since-0.057
 unpleasant-0.056
Token the
Token ID262
Feature activation+4.76e-01

Pos logit contributions
 unexpected+0.013
 frightening+0.012
sm+0.011
 since+0.011
 unpleasant+0.011
Neg logit contributions
bes-0.024
 ramp-0.023
ippers-0.023
 wrists-0.023
amas-0.022
Token had
Token ID550
Feature activation+9.64e-01

Pos logit contributions
bes+0.043
amas+0.041
 ramp+0.041
ippers+0.041
 clinic+0.040
Neg logit contributions
 unexpected-0.032
 frightening-0.029
 since-0.029
sm-0.028
 scary-0.028
Token so
Token ID523
Feature activation-1.45e+00

Pos logit contributions
bes+0.038
 ramp+0.036
ippers+0.036
amas+0.035
 clinic+0.035
Neg logit contributions
 unexpected-0.016
 frightening-0.014
 scary-0.014
sm-0.013
 since-0.013
Token much
Token ID881
Feature activation-2.79e+00

Pos logit contributions
 unexpected+0.021
 frightening+0.019
 since+0.018
 unpleasant+0.017
sm+0.016
Neg logit contributions
bes-0.058
 ramp-0.056
 clinic-0.054
ippers-0.053
 wrists-0.053
Token fun
Token ID1257
Feature activation+1.29e+00

Pos logit contributions
 unexpected+0.056
 since+0.054
 frightening+0.053
sm+0.052
 unpleasant+0.051
Neg logit contributions
bes-0.091
 ramp-0.089
 clinic-0.083
ippers-0.083
amas-0.080
Token.
Token ID13
Feature activation-7.48e-01

Pos logit contributions
bes+0.046
 ramp+0.044
ippers+0.043
amas+0.042
 clinic+0.042
Neg logit contributions
 unexpected-0.025
 frightening-0.023
sm-0.022
 since-0.022
 unpleasant-0.021
Token
Token ID198
Feature activation-1.51e+00

Pos logit contributions
 unexpected+0.016
 over+0.015
 since+0.014
 frightening+0.014
sm+0.014
Neg logit contributions
bes-0.025
 ramp-0.024
ippers-0.024
amas-0.023
 wrists-0.023
Token
Token ID198
Feature activation-1.03e+00

Pos logit contributions
 unexpected+0.048
 frightening+0.047
 unpleasant+0.045
vious+0.045
 since+0.044
Neg logit contributions
bes-0.034
 ramp-0.032
 clinic-0.030
ippers-0.029
amas-0.029
TokenOne
Token ID3198
Feature activation-9.48e-01

Pos logit contributions
 unexpected+0.020
 frightening+0.018
sm+0.017
 since+0.017
 unpleasant+0.017
Neg logit contributions
bes-0.037
 ramp-0.036
amas-0.035
ippers-0.035
 clinic-0.035
Token day
Token ID1110
Feature activation+3.66e+00

Pos logit contributions
 unexpected+0.025
 frightening+0.024
 unpleasant+0.024
vious+0.023
elling+0.022
Neg logit contributions
kin-0.022
 clinic-0.021
 ramp-0.021
bes-0.021
 neck-0.019
Token,
Token ID11
Feature activation-5.67e-01

Pos logit contributions
bes+0.106
 ramp+0.102
ippers+0.100
amas+0.097
 clinic+0.097
Neg logit contributions
 unexpected-0.093
 frightening-0.087
 unpleasant-0.084
sm-0.082
 since-0.081
Token they
Token ID484
Feature activation+1.30e+00

Pos logit contributions
 unexpected+0.010
 frightening+0.009
sm+0.009
 since+0.009
 unpleasant+0.009
Neg logit contributions
bes-0.021
 ramp-0.020
ippers-0.020
amas-0.020
 clinic-0.019
Token
Token ID198
Feature activation+1.31e+00

Pos logit contributions
bes+0.053
 ramp+0.052
ates+0.051
ippers+0.050
 clinic+0.049
Neg logit contributions
 over-0.026
 frightening-0.025
 scary-0.025
 unexpected-0.024
sm-0.024
Token
Token ID198
Feature activation+1.50e+00

Pos logit contributions
bes+0.036
 ramp+0.034
ippers+0.032
amas+0.032
 clinic+0.032
Neg logit contributions
 unexpected-0.036
 frightening-0.034
 since-0.033
 unpleasant-0.033
sm-0.032
TokenMom
Token ID29252
Feature activation-2.38e+00

Pos logit contributions
bes+0.047
 ramp+0.046
amas+0.045
 clinic+0.044
ippers+0.044
Neg logit contributions
 unexpected-0.035
 frightening-0.032
sm-0.032
 since-0.031
 dangerous-0.030
Token heard
Token ID2982
Feature activation-3.52e+00

Pos logit contributions
 unexpected+0.052
 frightening+0.048
sm+0.045
 unpleasant+0.044
 since+0.044
Neg logit contributions
bes-0.080
 ramp-0.076
amas-0.074
ippers-0.073
 clinic-0.073
Token the
Token ID262
Feature activation-1.23e+00

Pos logit contributions
 unexpected+0.075
 frightening+0.069
 since+0.065
sm+0.065
 unpleasant+0.065
Neg logit contributions
bes-0.118
 ramp-0.112
 clinic-0.109
amas-0.108
ippers-0.108
Token noise
Token ID7838
Feature activation-1.42e+00

Pos logit contributions
 unexpected+0.028
 frightening+0.026
 scary+0.025
 unpleasant+0.024
sm+0.024
Neg logit contributions
bes-0.041
 ramp-0.038
ippers-0.038
 clinic-0.037
pel-0.037
Token.
Token ID13
Feature activation+1.19e+00

Pos logit contributions
 unexpected+0.032
 frightening+0.029
sm+0.028
 since+0.028
 unpleasant+0.027
Neg logit contributions
bes-0.046
 ramp-0.044
amas-0.042
ippers-0.042
 clinic-0.042
Token She
Token ID1375
Feature activation+1.56e+00

Pos logit contributions
bes+0.036
 ramp+0.034
ates+0.033
ippers+0.033
amas+0.033
Neg logit contributions
 unexpected-0.030
 frightening-0.029
 scary-0.028
sm-0.027
 dangerous-0.026
Token came
Token ID1625
Feature activation-2.58e+00

Pos logit contributions
bes+0.044
 ramp+0.041
amas+0.039
 clinic+0.039
ippers+0.039
Neg logit contributions
 unexpected-0.042
 frightening-0.039
 since-0.038
 unpleasant-0.037
sm-0.037
Token to
Token ID284
Feature activation-3.99e-01

Pos logit contributions
 unexpected+0.050
 frightening+0.045
 unpleasant+0.044
sm+0.041
 since+0.039
Neg logit contributions
bes-0.090
 ramp-0.090
ippers-0.087
 clinic-0.086
amas-0.086
Token the
Token ID262
Feature activation-6.54e-01

Pos logit contributions
 unexpected+0.009
 frightening+0.009
 unpleasant+0.008
vious+0.008
 possible+0.008
Neg logit contributions
bes-0.012
 ramp-0.012
 clinic-0.011
 wrists-0.011
 kit-0.011
Token cave
Token ID11527
Feature activation+1.26e+00

Pos logit contributions
bes+0.008
 ramp+0.007
 clinic+0.007
amas+0.007
ippers+0.007
Neg logit contributions
 unexpected-0.011
 frightening-0.011
sm-0.011
 unpleasant-0.010
 since-0.010
Token.
Token ID13
Feature activation-4.89e-01

Pos logit contributions
bes+0.047
 ramp+0.045
ippers+0.044
amas+0.044
 clinic+0.044
Neg logit contributions
 unexpected-0.023
 frightening-0.020
sm-0.019
 since-0.019
 scary-0.019
Token The
Token ID383
Feature activation+2.29e-01

Pos logit contributions
 unexpected+0.014
 frightening+0.013
sm+0.013
 since+0.012
 unpleasant+0.012
Neg logit contributions
bes-0.013
 ramp-0.013
amas-0.012
ippers-0.012
 clinic-0.012
Token dragon
Token ID10441
Feature activation-8.66e-01

Pos logit contributions
bes+0.006
 ramp+0.005
ippers+0.005
amas+0.005
 clinic+0.005
Neg logit contributions
 unexpected-0.007
 frightening-0.007
sm-0.006
 since-0.006
 scary-0.006
Token was
Token ID373
Feature activation+1.96e+00

Pos logit contributions
 unexpected+0.021
 frightening+0.019
sm+0.018
 since+0.018
 scary+0.018
Neg logit contributions
bes-0.027
 ramp-0.026
ippers-0.026
amas-0.026
 clinic-0.025
Token trembling
Token ID39228
Feature activation-1.21e+00

Pos logit contributions
bes+0.067
 ramp+0.064
 clinic+0.062
ippers+0.061
amas+0.061
Neg logit contributions
 unexpected-0.040
 frightening-0.036
sm-0.036
 since-0.035
 unpleasant-0.034
Token in
Token ID287
Feature activation+1.19e+00

Pos logit contributions
 unexpected+0.028
 frightening+0.025
sm+0.025
 unpleasant+0.024
 since+0.024
Neg logit contributions
bes-0.037
 ramp-0.035
 clinic-0.034
amas-0.034
pel-0.034
Token fear
Token ID3252
Feature activation-2.27e+00

Pos logit contributions
bes+0.037
 ramp+0.036
pel+0.034
 clinic+0.034
 wrists+0.034
Neg logit contributions
 unexpected-0.025
sm-0.023
 frightening-0.023
 since-0.023
 unpleasant-0.022
Token and
Token ID290
Feature activation-1.67e-01

Pos logit contributions
 unexpected+0.055
 frightening+0.049
sm+0.048
 unpleasant+0.048
 dangerous+0.045
Neg logit contributions
bes-0.066
 ramp-0.064
 clinic-0.060
amas-0.060
pel-0.060
Token carrying
Token ID6872
Feature activation+8.81e-01

Pos logit contributions
 unexpected+0.004
sm+0.003
 frightening+0.003
 since+0.003
 unpleasant+0.003
Neg logit contributions
bes-0.005
 ramp-0.005
 wrists-0.005
 clinic-0.005
ippers-0.005
Token a
Token ID257
Feature activation-2.24e+00

Pos logit contributions
bes+0.031
 ramp+0.030
ippers+0.029
amas+0.029
 clinic+0.028
Neg logit contributions
 unexpected-0.017
 frightening-0.016
sm-0.015
 since-0.015
 unpleasant-0.015
Token the
Token ID262
Feature activation+1.15e+00

Pos logit contributions
bes+0.137
 ramp+0.132
amas+0.130
ippers+0.127
 wrists+0.126
Neg logit contributions
 unexpected-0.042
 frightening-0.034
sm-0.032
 scary-0.032
 since-0.031
Token world
Token ID995
Feature activation+1.31e+00

Pos logit contributions
bes+0.031
 ramp+0.029
ippers+0.028
amas+0.028
pel+0.028
Neg logit contributions
 unexpected-0.033
 frightening-0.031
 scary-0.030
sm-0.030
 since-0.030
Token.
Token ID13
Feature activation+1.48e+00

Pos logit contributions
bes+0.051
 ramp+0.049
ippers+0.048
amas+0.048
pel+0.047
Neg logit contributions
 unexpected-0.022
sm-0.019
 since-0.019
 frightening-0.019
 scary-0.018
Token She
Token ID1375
Feature activation+2.01e+00

Pos logit contributions
bes+0.041
 ramp+0.039
amas+0.037
ippers+0.037
 clinic+0.036
Neg logit contributions
 unexpected-0.041
 frightening-0.038
 since-0.038
sm-0.037
 scary-0.037
Token had
Token ID550
Feature activation+1.26e+00

Pos logit contributions
bes+0.069
 ramp+0.068
amas+0.067
ippers+0.067
pel+0.064
Neg logit contributions
 unexpected-0.043
 since-0.039
sm-0.038
 frightening-0.038
 scary-0.037
Token a
Token ID257
Feature activation-2.58e-01

Pos logit contributions
bes+0.046
 ramp+0.045
ippers+0.044
amas+0.043
 clinic+0.043
Neg logit contributions
 unexpected-0.024
 frightening-0.021
 since-0.021
sm-0.021
 scary-0.020
Token wonderful
Token ID7932
Feature activation+1.10e+00

Pos logit contributions
 unexpected+0.005
 scary+0.004
 frightening+0.004
 dangerous+0.004
 amazing+0.004
Neg logit contributions
bes-0.011
amas-0.010
ippers-0.010
ates-0.010
pel-0.010
Token morning
Token ID3329
Feature activation+5.11e+00

Pos logit contributions
bes+0.044
amas+0.043
 ramp+0.042
pel+0.042
ippers+0.041
Neg logit contributions
 unexpected-0.018
sm-0.015
 scary-0.015
 amazing-0.015
 exciting-0.015
Token and
Token ID290
Feature activation+4.47e-01

Pos logit contributions
bes+0.180
 ramp+0.174
amas+0.167
ippers+0.166
 clinic+0.165
Neg logit contributions
 unexpected-0.102
 frightening-0.093
sm-0.090
 since-0.089
 scary-0.086
Token a
Token ID257
Feature activation+5.41e-01

Pos logit contributions
bes+0.015
 ramp+0.014
 clinic+0.014
 wrists+0.014
amas+0.014
Neg logit contributions
 unexpected-0.010
 frightening-0.009
sm-0.008
 unpleasant-0.008
 since-0.008
Token weird
Token ID7650
Feature activation+6.49e-01

Pos logit contributions
bes+0.020
amas+0.018
ippers+0.018
 ramp+0.018
pel+0.018
Neg logit contributions
 unexpected-0.011
 frightening-0.010
 scary-0.010
sm-0.009
 since-0.009
Token.
Token ID13
Feature activation-9.04e-01

Pos logit contributions
 unexpected+0.015
 frightening+0.014
sm+0.013
 unpleasant+0.013
 scary+0.012
Neg logit contributions
bes-0.022
 ramp-0.021
 clinic-0.021
ippers-0.020
 wrists-0.020
Token Suddenly
Token ID24975
Feature activation-1.16e-01

Pos logit contributions
 unexpected+0.022
 since+0.022
 frightening+0.022
 scary+0.021
 over+0.021
Neg logit contributions
bes-0.028
kin-0.026
ates-0.025
 ramp-0.025
ippers-0.025
Token,
Token ID11
Feature activation-4.52e-01

Pos logit contributions
 unexpected+0.002
 frightening+0.002
 since+0.002
 scary+0.002
 dangerous+0.002
Neg logit contributions
bes-0.004
 ramp-0.004
amas-0.004
 wrists-0.004
 clinic-0.004
Token Tommy
Token ID19919
Feature activation-1.41e+00

Pos logit contributions
 unexpected+0.009
 frightening+0.008
sm+0.008
 since+0.008
 unpleasant+0.008
Neg logit contributions
bes-0.016
 ramp-0.015
amas-0.015
ippers-0.015
 clinic-0.014
Token saw
Token ID2497
Feature activation-2.16e+00

Pos logit contributions
 unexpected+0.030
 frightening+0.028
 since+0.027
 scary+0.026
sm+0.026
Neg logit contributions
bes-0.048
 ramp-0.045
ippers-0.044
 clinic-0.044
amas-0.043
Token something
Token ID1223
Feature activation-3.52e+00

Pos logit contributions
 unexpected+0.050
 frightening+0.047
sm+0.047
 unpleasant+0.045
 since+0.045
Neg logit contributions
bes-0.065
 ramp-0.062
pel-0.059
 clinic-0.059
 kit-0.057
Token shiny
Token ID22441
Feature activation-8.55e-01

Pos logit contributions
vious+0.071
 since+0.070
elling+0.067
sm+0.067
 unexpected+0.066
Neg logit contributions
 ramp-0.106
bes-0.106
 slot-0.099
 kit-0.097
 laptop-0.095
Token on
Token ID319
Feature activation+1.26e+00

Pos logit contributions
 unexpected+0.019
 frightening+0.018
sm+0.017
 unpleasant+0.017
 since+0.016
Neg logit contributions
bes-0.027
 ramp-0.026
 slot-0.025
ippers-0.025
 wrists-0.024
Token the
Token ID262
Feature activation-8.25e-01

Pos logit contributions
bes+0.046
 ramp+0.045
pel+0.043
 clinic+0.043
 wrists+0.043
Neg logit contributions
 unexpected-0.022
 frightening-0.020
sm-0.020
 unpleasant-0.019
 since-0.018
Token ground
Token ID2323
Feature activation+2.22e-01

Pos logit contributions
 unexpected+0.029
sm+0.028
 frightening+0.028
 unpleasant+0.027
 since+0.026
Neg logit contributions
bes-0.016
 ramp-0.016
 wrists-0.015
 wrist-0.014
 clinic-0.014
Token!
Token ID0
Feature activation-1.60e+00

Pos logit contributions
bes+0.007
 ramp+0.007
 clinic+0.007
amas+0.007
ippers+0.007
Neg logit contributions
 unexpected-0.005
 frightening-0.004
sm-0.004
 unpleasant-0.004
 scary-0.004
Token,
Token ID11
Feature activation-7.15e-01

Pos logit contributions
 unexpected+0.019
 frightening+0.017
sm+0.016
 unpleasant+0.016
 since+0.016
Neg logit contributions
bes-0.033
 ramp-0.031
ippers-0.030
amas-0.030
 clinic-0.030
Token but
Token ID475
Feature activation-2.05e+00

Pos logit contributions
 unexpected+0.017
 frightening+0.015
sm+0.015
 unpleasant+0.015
 since+0.014
Neg logit contributions
bes-0.022
 ramp-0.021
 wrists-0.020
ippers-0.020
amas-0.020
Token he
Token ID339
Feature activation+2.34e-01

Pos logit contributions
 unexpected+0.045
sm+0.042
 frightening+0.042
 unpleasant+0.039
 scary+0.038
Neg logit contributions
bes-0.066
 ramp-0.064
pel-0.061
 slot-0.061
ippers-0.061
Token couldn
Token ID3521
Feature activation-2.03e+00

Pos logit contributions
bes+0.008
 ramp+0.008
amas+0.008
ippers+0.008
 clinic+0.008
Neg logit contributions
 unexpected-0.005
 frightening-0.004
 since-0.004
sm-0.004
 scary-0.004
Token't
Token ID470
Feature activation-6.53e+00

Pos logit contributions
 unexpected+0.052
 frightening+0.048
sm+0.046
 since+0.046
 scary+0.045
Neg logit contributions
bes-0.061
 ramp-0.057
ippers-0.055
 wrists-0.055
 clinic-0.055
Token find
Token ID1064
Feature activation-3.67e+00

Pos logit contributions
sm+0.200
 frightening+0.198
 unexpected+0.195
elling+0.187
 unpleasant+0.185
Neg logit contributions
 ramp-0.143
bes-0.140
pel-0.134
 clinic-0.132
ippers-0.130
Token anything
Token ID1997
Feature activation-3.98e+00

Pos logit contributions
 unexpected+0.077
 frightening+0.075
 unpleasant+0.070
sm+0.070
 since+0.069
Neg logit contributions
bes-0.115
 ramp-0.114
ippers-0.110
pel-0.109
 clinic-0.109
Token.
Token ID13
Feature activation-1.14e+00

Pos logit contributions
elling+0.094
sm+0.088
vious+0.088
 unexpected+0.086
 frightening+0.085
Neg logit contributions
bes-0.104
 ramp-0.103
 clinic-0.097
 kit-0.093
 chest-0.092
Token As
Token ID1081
Feature activation-5.61e-01

Pos logit contributions
 since+0.024
 unexpected+0.024
 frightening+0.024
 over+0.023
 scary+0.023
Neg logit contributions
bes-0.038
kin-0.036
ippers-0.036
 wrists-0.035
ates-0.033
Token he
Token ID339
Feature activation-5.81e-01

Pos logit contributions
 unexpected+0.013
sm+0.012
 frightening+0.012
 unpleasant+0.012
vious+0.012
Neg logit contributions
 ramp-0.017
bes-0.017
 clinic-0.016
pel-0.015
 slot-0.015
Token walked
Token ID6807
Feature activation-7.54e-02

Pos logit contributions
 unexpected+0.016
 frightening+0.014
 since+0.014
sm+0.014
 scary+0.013
Neg logit contributions
bes-0.017
amas-0.016
ippers-0.016
 ramp-0.016
 clinic-0.016
Token of
Token ID286
Feature activation+6.74e-01

Pos logit contributions
 unexpected+0.038
 frightening+0.033
 unpleasant+0.030
 exciting+0.028
 dangerous+0.028
Neg logit contributions
bes-0.106
amas-0.103
ippers-0.102
 ramp-0.101
ates-0.099
Token the
Token ID262
Feature activation+1.22e+00

Pos logit contributions
bes+0.026
 ramp+0.025
amas+0.024
ippers+0.024
 clinic+0.024
Neg logit contributions
 unexpected-0.011
 frightening-0.010
sm-0.010
 since-0.009
 scary-0.009
Token fle
Token ID5104
Feature activation+1.47e+00

Pos logit contributions
bes+0.031
ippers+0.028
 ramp+0.028
pel+0.027
 wrists+0.027
Neg logit contributions
 unexpected-0.038
 scary-0.037
 dangerous-0.036
 frightening-0.036
 unpleasant-0.034
Tokena
Token ID64
Feature activation+1.32e+00

Pos logit contributions
 ramp+0.043
bes+0.041
amas+0.041
 wrists+0.041
 trailer+0.041
Neg logit contributions
sm-0.038
 unexpected-0.036
 dangerous-0.033
ful-0.033
 frightening-0.033
Token.
Token ID13
Feature activation-1.57e+00

Pos logit contributions
bes+0.048
 ramp+0.047
amas+0.047
ippers+0.046
pel+0.046
Neg logit contributions
 unexpected-0.024
sm-0.022
 scary-0.022
 frightening-0.022
 since-0.021
Token
Token ID220
Feature activation-4.03e+00

Pos logit contributions
 since+0.030
 over+0.028
 unexpected+0.027
 scary+0.026
 frightening+0.026
Neg logit contributions
bes-0.055
ippers-0.054
ates-0.053
pel-0.053
kin-0.052
Token
Token ID198
Feature activation-2.33e+00

Pos logit contributions
 unexpected+0.108
 frightening+0.101
sm+0.096
 since+0.096
 unpleasant+0.095
Neg logit contributions
bes-0.115
 ramp-0.109
amas-0.105
ippers-0.105
 clinic-0.104
Token
Token ID198
Feature activation-1.66e+00

Pos logit contributions
 unexpected+0.064
 frightening+0.060
 unpleasant+0.057
sm+0.057
 since+0.056
Neg logit contributions
bes-0.065
 ramp-0.062
amas-0.059
 clinic-0.059
ippers-0.059
TokenL
Token ID43
Feature activation-2.01e+00

Pos logit contributions
 unexpected+0.039
 frightening+0.036
 since+0.035
sm+0.034
 scary+0.034
Neg logit contributions
bes-0.053
 ramp-0.051
ippers-0.050
amas-0.049
 clinic-0.049
Tokenily
Token ID813
Feature activation-6.48e-01

Pos logit contributions
 unexpected+0.048
 frightening+0.046
 unpleasant+0.043
 since+0.042
 scary+0.042
Neg logit contributions
bes-0.063
 ramp-0.059
 clinic-0.057
amas-0.057
ippers-0.057
Token said
Token ID531
Feature activation+2.57e+00

Pos logit contributions
 unexpected+0.015
 frightening+0.014
sm+0.013
 since+0.013
 scary+0.013
Neg logit contributions
bes-0.021
 ramp-0.020
ippers-0.019
amas-0.019
 clinic-0.019
Token in
Token ID287
Feature activation+6.03e-01

Pos logit contributions
 unexpected+0.007
 frightening+0.006
sm+0.006
 since+0.006
 scary+0.006
Neg logit contributions
bes-0.014
 ramp-0.013
ippers-0.013
amas-0.013
 clinic-0.013
Token circles
Token ID13332
Feature activation+4.44e-01

Pos logit contributions
bes+0.020
 ramp+0.019
 clinic+0.018
 wrists+0.018
pel+0.018
Neg logit contributions
 unexpected-0.012
 frightening-0.012
sm-0.011
 unpleasant-0.011
 since-0.011
Token,
Token ID11
Feature activation-7.04e-01

Pos logit contributions
bes+0.016
 ramp+0.015
ippers+0.015
 clinic+0.015
amas+0.015
Neg logit contributions
 unexpected-0.008
 frightening-0.008
 since-0.008
sm-0.008
 over-0.007
Token looking
Token ID2045
Feature activation-1.37e+00

Pos logit contributions
 unexpected+0.016
 frightening+0.015
 since+0.014
 scary+0.014
sm+0.014
Neg logit contributions
bes-0.023
 ramp-0.022
ippers-0.022
amas-0.021
 clinic-0.021
Token for
Token ID329
Feature activation+1.77e+00

Pos logit contributions
 unexpected+0.026
sm+0.023
 unpleasant+0.022
 possible+0.022
 frightening+0.022
Neg logit contributions
bes-0.043
 ramp-0.043
 clinic-0.041
ippers-0.041
amas-0.041
Token an
Token ID281
Feature activation-4.03e+00

Pos logit contributions
bes+0.059
 ramp+0.056
 clinic+0.055
 wrists+0.054
ippers+0.053
Neg logit contributions
 unexpected-0.035
 frightening-0.033
 since-0.032
sm-0.031
 unpleasant-0.030
Token escape
Token ID6654
Feature activation+2.20e-02

Pos logit contributions
sm+0.085
 since+0.074
 frightening+0.073
 unexpected+0.072
ful+0.066
Neg logit contributions
 ramp-0.123
 trailer-0.123
 clinic-0.121
bes-0.111
pel-0.111
Token.
Token ID13
Feature activation-3.73e-01

Pos logit contributions
bes+0.001
 ramp+0.001
 clinic+0.001
ippers+0.001
 wrists+0.001
Neg logit contributions
 unexpected-0.000
 frightening-0.000
sm-0.000
 unpleasant-0.000
 since-0.000
Token
Token ID198
Feature activation-9.37e-01

Pos logit contributions
 since+0.008
 unexpected+0.008
 frightening+0.008
 over+0.008
sm+0.008
Neg logit contributions
bes-0.012
 ramp-0.011
ippers-0.011
kin-0.011
 wrists-0.011
Token
Token ID198
Feature activation-3.75e-01

Pos logit contributions
 unexpected+0.025
 frightening+0.024
 since+0.023
 unpleasant+0.023
sm+0.022
Neg logit contributions
bes-0.027
 ramp-0.025
 clinic-0.024
ippers-0.024
amas-0.024
TokenSuddenly
Token ID38582
Feature activation-2.97e-01

Pos logit contributions
 unexpected+0.009
 frightening+0.008
sm+0.008
 since+0.008
 unpleasant+0.008
Neg logit contributions
bes-0.012
 ramp-0.011
ippers-0.011
amas-0.011
 clinic-0.011
Token,
Token ID11
Feature activation-1.37e+00

Pos logit contributions
 unexpected+0.087
 frightening+0.079
sm+0.078
 unpleasant+0.075
 since+0.073
Neg logit contributions
bes-0.119
 ramp-0.116
 clinic-0.109
amas-0.109
ippers-0.109
Token but
Token ID475
Feature activation-2.41e+00

Pos logit contributions
 unexpected+0.022
 since+0.022
 frightening+0.022
 scary+0.021
 over+0.020
Neg logit contributions
bes-0.054
ippers-0.053
ates-0.051
amas-0.051
 ramp-0.051
Token then
Token ID788
Feature activation+1.99e-01

Pos logit contributions
 unexpected+0.052
 frightening+0.047
sm+0.047
 unpleasant+0.044
 scary+0.043
Neg logit contributions
bes-0.080
 ramp-0.077
 clinic-0.074
ippers-0.073
pel-0.073
Token he
Token ID339
Feature activation-4.01e-01

Pos logit contributions
bes+0.007
 ramp+0.007
ippers+0.006
amas+0.006
 clinic+0.006
Neg logit contributions
 unexpected-0.004
 frightening-0.004
sm-0.004
 since-0.003
 unpleasant-0.003
Token noticed
Token ID6810
Feature activation-2.98e+00

Pos logit contributions
 unexpected+0.009
sm+0.009
 frightening+0.009
 unpleasant+0.008
 possible+0.008
Neg logit contributions
bes-0.012
 ramp-0.012
 wrists-0.011
pel-0.011
 clinic-0.011
Token something
Token ID1223
Feature activation-4.05e+00

Pos logit contributions
 unexpected+0.061
 frightening+0.058
 unpleasant+0.056
sm+0.056
 since+0.054
Neg logit contributions
bes-0.092
 ramp-0.087
 kit-0.083
pel-0.083
 clinic-0.083
Token else
Token ID2073
Feature activation-5.14e-01

Pos logit contributions
 since+0.072
vious+0.067
elling+0.066
 unexpected+0.066
sm+0.065
Neg logit contributions
 ramp-0.133
bes-0.132
 slot-0.122
 clinic-0.120
 kit-0.119
Token had
Token ID550
Feature activation-7.22e-03

Pos logit contributions
 unexpected+0.011
 frightening+0.010
sm+0.010
 unpleasant+0.010
vious+0.010
Neg logit contributions
bes-0.016
 ramp-0.015
 clinic-0.014
 wrists-0.014
 slot-0.014
Token landed
Token ID11406
Feature activation+1.92e+00

Pos logit contributions
 unexpected+0.000
 frightening+0.000
 since+0.000
sm+0.000
 unpleasant+0.000
Neg logit contributions
bes-0.000
 ramp-0.000
amas-0.000
pel-0.000
 clinic-0.000
Token on
Token ID319
Feature activation+2.73e-01

Pos logit contributions
bes+0.064
 ramp+0.061
ippers+0.060
amas+0.059
ates+0.059
Neg logit contributions
 unexpected-0.041
 frightening-0.038
 unpleasant-0.035
 since-0.034
sm-0.034
Token the
Token ID262
Feature activation-1.33e+00

Pos logit contributions
bes+0.010
 ramp+0.010
amas+0.009
 clinic+0.009
ippers+0.009
Neg logit contributions
 unexpected-0.005
 frightening-0.005
sm-0.004
 since-0.004
 unpleasant-0.004