TOP ACTIVATIONS
MAX = 0.113

The pot replied sadly , " If you kick
dad just shook his head sadly . "
the ducks ?" Kid asked sadly . Mom smiled
not working ?" asked Tom sadly . But daddy didn 't
The little boy was taken aback and felt guilty . He
The small boy was taken aback . He thanked the saw
The big machine answered sadly , " My gear snapped
we do ?" asked Maya sadly . " Let 's
then a sad look came over her face . She couldn
The man was so taken aback by Maria 's charming request
and saw that mom was sadly looking at the calendar .
long time ?" she asked sadly . Her parents assured her
the little girl was taken aback . Inside the room ,
May a looked sadly at the fragile hoop .
collect ," the bird replied sadly . B ert
touched !" Lily was taken aback and stopped in her tracks
and so her story ended sadly . <|endoftext|> Alice was a
down at him and said sadly , " I told you
His mom was taken aback . She said calmly ,
until a strange feeling came over her . In the silence

MIN ACTIVATIONS
MIN = -0.160

did something special . On Wed nes days she liked to
's a special hat for Wed nes days ."
Wed nes days because on Wed nes days she did something
saw lots of people having pic n ics and walking their
my . Tim my loved Wed nes days because that was
. She loved to have pic n ics with her family
my . Tim my loved Wed nes days because that was
the warm sunshine and having pic n ics with her family
children running around , families pic n icking , couples st
girl named Lily who loved pic n ics . She would
le balls and ride a un icycle . One day ,
the cliff . On many occasions , he had to start
could not find his famous las agna . The
liked to play and have pic n ics . One day
a family who really enjoyed pic n ics . They loved
find a box of ch ocol ates . "
Jimmy was always excited for Wed nes days because it was
pe aches on Mondays , Wed nes days , and Fridays
What 's so special about Wed nes days ?"
near the cliff and have pic n ics . One day

INTERVAL 0.045 - 0.113
CONTAINS 4.610%

he forgot how to get back to his hole . He
happy that he sang the loud est he had ever sung
, Tim my started to feel uncomfortable . His shoes were
magn ifying glass . He saw the numbers , the hands
a little angel who could not fly . She watched her

INTERVAL -0.023 - 0.045
CONTAINS 77.125%

could share it with his family . He quickly grabbed a
he couldn ’ t get out . He was
. It was so wide and fluffy . Lily 's mom
said Ros ie . B enny smiled and said
back ." Max didn 't want to at first , but

INTERVAL -0.091 - -0.023
CONTAINS 18.241%

. He loved to play with his toys and pretend they
field . They all agreed that the dove
a fence . " Look , Ben , our
Mia are sorry . They run to the car . They
at the end of the day , he put the pistol

INTERVAL -0.160 - -0.091
CONTAINS 0.024%

" V room , v room , my car is
can sew some buttons or rib bons or patches . Would
I 've got a new leather jacket !" he said ,
to prepare for a fun saf ari . They wanted to
, Pete " P ete was touched by his
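
The four INTERVAL ranges above look like equal-width bins spanning MIN to MAX. The dump does not say how the bins were chosen, so the sketch below assumes equal widths. It recomputes the edges from the rounded MIN and MAX and compares them with the displayed ones. The name equal_width_edges is illustrative and not part of the dashboard.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 equally spaced bin edges from lo to hi."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]

# MIN and MAX as displayed (already rounded to three decimals).
edges = equal_width_edges(-0.160, 0.113, 4)

# Edges and percentages as displayed in the INTERVAL / CONTAINS lines.
displayed_edges = [-0.160, -0.091, -0.023, 0.045, 0.113]
displayed_pct = [0.024, 18.241, 77.125, 4.610]  # lowest bin first

# Because MIN and MAX are rounded, only agreement to about 0.001 is expected.
max_gap = max(abs(a - b) for a, b in zip(edges, displayed_edges))
total_pct = sum(displayed_pct)

for left, right, pct in zip(edges[:-1], edges[1:], displayed_pct):
    print(f"{left:+.4f} to {right:+.4f}: {pct:.3f}%")
```

Under this assumption the recomputed edges stay within 0.001 of the displayed ones, and the four CONTAINS percentages add up to 100%.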
Token
Token ID198
Feature activation+1.52e-02

Pos logit contributions
 swollen+0.011
othe+0.010
 scared+0.009
 its+0.009
 Andy+0.008
Neg logit contributions
uries-0.042
nes-0.041
aces-0.041
Whe-0.038
coat-0.037
Token
Token ID198
Feature activation+1.13e-02

Pos logit contributions
 swollen+0.015
 burning+0.013
 frightening+0.013
 scared+0.013
 trembling+0.013
Neg logit contributions
nes-0.042
uries-0.040
aces-0.039
Whe-0.038
shirts-0.036
TokenThe
Token ID464
Feature activation+3.98e-02

Pos logit contributions
 swollen+0.014
othe+0.013
 scared+0.013
 its+0.012
 Andy+0.012
Neg logit contributions
uries-0.028
aces-0.027
nes-0.027
Whe-0.025
coat-0.024
Token pot
Token ID1787
Feature activation+1.27e-02

Pos logit contributions
 swollen+0.045
 scared+0.041
 frightening+0.038
 angry+0.037
othe+0.037
Neg logit contributions
uries-0.107
nes-0.105
aces-0.103
Whe-0.095
shirts-0.093
Token replied
Token ID8712
Feature activation+1.85e-02

Pos logit contributions
 swollen+0.012
othe+0.011
 frightening+0.010
 Andy+0.010
 Farmer+0.010
Neg logit contributions
aces-0.034
nes-0.033
uries-0.033
coat-0.030
shirts-0.029
Token sadly
Token ID21098
Feature activation+1.13e-01

Pos logit contributions
 swollen+0.015
othe+0.013
 scared+0.012
 frightening+0.012
 its+0.011
Neg logit contributions
uries-0.053
nes-0.052
aces-0.052
Whe-0.048
coat-0.048
Token,
Token ID11
Feature activation-6.19e-04

Pos logit contributions
 swollen+0.110
 scared+0.090
othe+0.090
 frightening+0.089
 faint+0.082
Neg logit contributions
uries-0.305
aces-0.298
nes-0.293
coat-0.271
shirts-0.267
Token "
Token ID366
Feature activation-2.79e-02

Pos logit contributions
uries+0.002
nes+0.002
aces+0.002
Whe+0.002
coat+0.002
Neg logit contributions
 swollen-0.000
othe-0.000
 scared-0.000
 frightening-0.000
 its-0.000
TokenIf
Token ID1532
Feature activation+3.73e-02

Pos logit contributions
uries+0.059
nes+0.058
aces+0.057
Whe+0.053
shirts+0.051
Neg logit contributions
 swollen-0.044
othe-0.041
 scared-0.041
 its-0.040
 frightening-0.040
Token you
Token ID345
Feature activation-6.88e-02

Pos logit contributions
 swollen+0.050
othe+0.046
 scared+0.046
 its+0.046
 Andy+0.045
Neg logit contributions
uries-0.089
nes-0.086
aces-0.086
Whe-0.079
coat-0.078
Token kick
Token ID4829
Feature activation-2.62e-02

Pos logit contributions
uries+0.165
aces+0.162
nes+0.158
Whe+0.156
shirts+0.146
Neg logit contributions
 scared-0.092
 swollen-0.083
othe-0.081
 Andy-0.077
 worried-0.077
Token dad
Token ID9955
Feature activation+3.28e-02

Pos logit contributions
 swollen+0.001
 scared+0.001
 angry+0.001
 worried+0.001
 frightening+0.001
Neg logit contributions
uries-0.002
nes-0.002
aces-0.002
Whe-0.002
shirts-0.001
Token just
Token ID655
Feature activation+3.37e-02

Pos logit contributions
 swollen+0.026
othe+0.022
 scared+0.021
 its+0.020
 frightening+0.020
Neg logit contributions
uries-0.096
nes-0.094
aces-0.094
Whe-0.087
coat-0.086
Token shook
Token ID14682
Feature activation+7.04e-02

Pos logit contributions
 swollen+0.035
othe+0.031
 scared+0.031
 its+0.029
 frightening+0.029
Neg logit contributions
uries-0.090
nes-0.088
aces-0.087
Whe-0.081
coat-0.079
Token his
Token ID465
Feature activation+6.74e-03

Pos logit contributions
 swollen+0.061
othe+0.054
 scared+0.051
 frightening+0.048
 Andy+0.045
Neg logit contributions
uries-0.198
nes-0.194
aces-0.193
Whe-0.179
coat-0.177
Token head
Token ID1182
Feature activation+5.73e-02

Pos logit contributions
 swollen+0.010
 scared+0.009
 angry+0.009
 its+0.009
 trembling+0.009
Neg logit contributions
uries-0.015
nes-0.015
aces-0.014
Whe-0.014
 parties-0.014
Token sadly
Token ID21098
Feature activation+1.12e-01

Pos logit contributions
 swollen+0.042
 scared+0.038
othe+0.035
 its+0.035
 trembling+0.032
Neg logit contributions
uries-0.172
nes-0.168
aces-0.164
Whe-0.158
 parties-0.154
Token.
Token ID13
Feature activation+1.33e-02

Pos logit contributions
 swollen+0.099
othe+0.086
 scared+0.084
 frightening+0.079
 its+0.078
Neg logit contributions
uries-0.315
nes-0.308
aces-0.307
Whe-0.284
coat-0.281
Token
Token ID220
Feature activation+2.61e-02

Pos logit contributions
 swollen+0.010
 scared+0.009
othe+0.008
 its+0.008
 Andy+0.008
Neg logit contributions
uries-0.039
nes-0.039
aces-0.038
 parties-0.035
Whe-0.035
Token
Token ID198
Feature activation+2.32e-02

Pos logit contributions
 swollen+0.019
 frightening+0.015
 scared+0.015
othe+0.014
 burning+0.013
Neg logit contributions
uries-0.078
aces-0.077
nes-0.077
Whe-0.071
coat-0.070
Token
Token ID198
Feature activation+1.87e-02

Pos logit contributions
 swollen+0.016
othe+0.013
 scared+0.013
 frightening+0.013
 burning+0.012
Neg logit contributions
uries-0.070
nes-0.069
aces-0.067
Whe-0.064
shirts-0.062
Token"
Token ID1
Feature activation-5.18e-02

Pos logit contributions
 swollen+0.024
othe+0.022
 scared+0.022
 its+0.021
 frightening+0.021
Neg logit contributions
uries-0.045
nes-0.044
aces-0.044
Whe-0.040
coat-0.040
Token the
Token ID262
Feature activation+3.83e-03

Pos logit contributions
 swollen+0.014
othe+0.013
 frightening+0.010
 scared+0.010
 trembling+0.010
Neg logit contributions
nes-0.035
uries-0.035
aces-0.035
 parties-0.033
Whe-0.032
Token ducks
Token ID39694
Feature activation+3.20e-03

Pos logit contributions
 swollen+0.005
 scared+0.005
 frightening+0.005
othe+0.005
 its+0.005
Neg logit contributions
uries-0.009
nes-0.009
aces-0.009
Whe-0.008
shirts-0.008
Token?"
Token ID1701
Feature activation+1.73e-02

Pos logit contributions
 swollen+0.004
 scared+0.003
othe+0.003
 its+0.003
 frightening+0.003
Neg logit contributions
uries-0.008
nes-0.008
aces-0.008
Whe-0.007
shirts-0.007
Token Kid
Token ID16828
Feature activation+1.73e-02

Pos logit contributions
 swollen+0.021
 burning+0.019
 attacking+0.018
 frightening+0.018
 cracking+0.018
Neg logit contributions
nes-0.041
Whe-0.038
uries-0.038
 shorts-0.038
aces-0.037
Token asked
Token ID1965
Feature activation+1.38e-02

Pos logit contributions
 swollen+0.011
othe+0.009
 scared+0.008
 frightening+0.008
 its+0.008
Neg logit contributions
uries-0.053
nes-0.052
aces-0.052
Whe-0.048
coat-0.048
Token sadly
Token ID21098
Feature activation+1.11e-01

Pos logit contributions
 swollen+0.010
 scared+0.009
othe+0.009
 its+0.009
 Andy+0.008
Neg logit contributions
uries-0.041
nes-0.040
aces-0.040
Whe-0.037
 parties-0.037
Token.
Token ID13
Feature activation+1.19e-02

Pos logit contributions
 swollen+0.102
othe+0.087
 scared+0.085
 frightening+0.082
 its+0.078
Neg logit contributions
uries-0.307
nes-0.299
aces-0.298
Whe-0.275
coat-0.273
Token
Token ID198
Feature activation+2.40e-02

Pos logit contributions
 swollen+0.012
 frightening+0.010
 burning+0.010
othe+0.009
 cracking+0.009
Neg logit contributions
uries-0.032
nes-0.032
Whe-0.030
aces-0.030
shirts-0.029
Token
Token ID198
Feature activation+2.14e-02

Pos logit contributions
 swollen+0.030
 frightening+0.028
 burning+0.026
 cracking+0.025
 scared+0.025
Neg logit contributions
nes-0.060
uries-0.055
Whe-0.054
shirts-0.052
ird-0.052
TokenMom
Token ID29252
Feature activation+3.37e-02

Pos logit contributions
 swollen+0.027
othe+0.025
 scared+0.024
 frightening+0.024
 its+0.023
Neg logit contributions
uries-0.052
nes-0.052
aces-0.050
Whe-0.047
shirts-0.046
Token smiled
Token ID13541
Feature activation+5.97e-02

Pos logit contributions
 swollen+0.032
othe+0.030
 frightening+0.027
 its+0.027
 scared+0.026
Neg logit contributions
uries-0.091
nes-0.090
aces-0.089
coat-0.083
shirts-0.081
Token not
Token ID407
Feature activation+7.63e-02

Pos logit contributions
 swollen+0.001
 scared+0.001
othe+0.001
 its+0.001
 frightening+0.001
Neg logit contributions
uries-0.003
aces-0.003
nes-0.003
Whe-0.003
shirts-0.003
Token working
Token ID1762
Feature activation-2.30e-02

Pos logit contributions
 scared+0.096
 its+0.095
 swollen+0.089
 burning+0.089
 dangerous+0.088
Neg logit contributions
aces-0.189
uries-0.189
nes-0.182
Whe-0.178
shirts-0.172
Token?"
Token ID1701
Feature activation+2.58e-02

Pos logit contributions
uries+0.062
aces+0.060
nes+0.060
Whe+0.058
shirts+0.056
Neg logit contributions
 scared-0.022
 swollen-0.022
 its-0.021
 frightening-0.020
 angry-0.020
Token asked
Token ID1965
Feature activation+1.95e-02

Pos logit contributions
 swollen+0.032
 burning+0.026
othe+0.026
 frightening+0.026
 buzzing+0.025
Neg logit contributions
nes-0.062
uries-0.062
aces-0.059
Whe-0.057
shirts-0.055
Token Tom
Token ID4186
Feature activation+3.58e-02

Pos logit contributions
 swollen+0.020
othe+0.018
 scared+0.018
 its+0.017
 frightening+0.017
Neg logit contributions
uries-0.052
nes-0.051
aces-0.051
Whe-0.047
coat-0.046
Token sadly
Token ID21098
Feature activation+1.10e-01

Pos logit contributions
 swollen+0.037
othe+0.032
 frightening+0.030
 its+0.030
 scared+0.029
Neg logit contributions
uries-0.096
nes-0.094
aces-0.093
coat-0.084
shirts-0.084
Token.
Token ID13
Feature activation+2.47e-02

Pos logit contributions
 swollen+0.123
othe+0.103
 frightening+0.100
 scared+0.095
 Andy+0.095
Neg logit contributions
uries-0.280
aces-0.273
nes-0.271
coat-0.247
Whe-0.246
Token But
Token ID887
Feature activation+2.75e-02

Pos logit contributions
 swollen+0.027
 frightening+0.022
othe+0.020
 burning+0.020
 dangerous+0.020
Neg logit contributions
uries-0.065
nes-0.065
aces-0.062
Whe-0.059
shirts-0.057
Token daddy
Token ID35822
Feature activation+2.17e-02

Pos logit contributions
 swollen+0.026
othe+0.022
 scared+0.022
 its+0.021
 frightening+0.021
Neg logit contributions
uries-0.077
nes-0.075
aces-0.075
Whe-0.069
coat-0.068
Token didn
Token ID1422
Feature activation+1.22e-02

Pos logit contributions
 swollen+0.023
othe+0.021
 its+0.020
 frightening+0.019
 scared+0.018
Neg logit contributions
uries-0.055
nes-0.054
aces-0.054
coat-0.050
shirts-0.048
Token't
Token ID470
Feature activation+6.15e-02

Pos logit contributions
 swollen+0.015
othe+0.013
 scared+0.013
 frightening+0.013
 its+0.013
Neg logit contributions
uries-0.030
nes-0.030
aces-0.029
Whe-0.027
coat-0.026
Token The
Token ID383
Feature activation+4.68e-02

Pos logit contributions
 swollen+0.016
othe+0.013
 scared+0.013
 frightening+0.013
 burning+0.012
Neg logit contributions
uries-0.047
nes-0.046
aces-0.046
Whe-0.043
coat-0.042
Token little
Token ID1310
Feature activation+5.36e-02

Pos logit contributions
 swollen+0.045
 scared+0.041
 frightening+0.037
 angry+0.036
 worried+0.036
Neg logit contributions
uries-0.136
nes-0.135
aces-0.132
Whe-0.120
shirts-0.119
Token boy
Token ID2933
Feature activation+4.37e-02

Pos logit contributions
 scared+0.059
 swollen+0.055
 worried+0.050
 angry+0.050
 trembling+0.047
Neg logit contributions
uries-0.159
nes-0.153
aces-0.149
 parties-0.132
shirts-0.132
Token was
Token ID373
Feature activation+1.98e-02

Pos logit contributions
 swollen+0.045
othe+0.040
 frightening+0.037
 its+0.035
 faint+0.035
Neg logit contributions
uries-0.114
nes-0.114
aces-0.111
coat-0.102
Whe-0.101
Token taken
Token ID2077
Feature activation+9.24e-03

Pos logit contributions
 swollen+0.022
othe+0.021
 Andy+0.019
 frightening+0.018
 Farmer+0.018
Neg logit contributions
uries-0.050
nes-0.049
aces-0.048
Whe-0.044
coat-0.043
Token aback
Token ID50147
Feature activation+1.10e-01

Pos logit contributions
 swollen+0.014
othe+0.012
 scared+0.012
 frightening+0.012
 Andy+0.012
Neg logit contributions
uries-0.020
nes-0.020
aces-0.020
Whe-0.018
coat-0.018
Token and
Token ID290
Feature activation+8.92e-03

Pos logit contributions
 swollen+0.150
 faint+0.125
othe+0.123
 Bee+0.122
 frightening+0.121
Neg logit contributions
aces-0.241
uries-0.239
nes-0.236
aults-0.210
 parties-0.209
Token felt
Token ID2936
Feature activation+2.47e-02

Pos logit contributions
 swollen+0.009
othe+0.008
 frightening+0.008
 scared+0.007
 faint+0.007
Neg logit contributions
uries-0.023
nes-0.023
aces-0.023
Whe-0.021
coat-0.021
Token guilty
Token ID6717
Feature activation+4.30e-02

Pos logit contributions
 swollen+0.026
othe+0.023
 Andy+0.023
 Bee+0.022
 Farmer+0.021
Neg logit contributions
uries-0.062
aces-0.061
nes-0.060
Whe-0.056
shirts-0.055
Token.
Token ID13
Feature activation+1.53e-02

Pos logit contributions
 swollen+0.040
othe+0.035
 scared+0.035
 frightening+0.032
 its+0.032
Neg logit contributions
uries-0.119
nes-0.117
aces-0.116
Whe-0.107
coat-0.106
Token He
Token ID679
Feature activation+2.67e-02

Pos logit contributions
 swollen+0.015
othe+0.012
 scared+0.012
 frightening+0.012
 burning+0.012
Neg logit contributions
uries-0.042
nes-0.041
aces-0.041
Whe-0.039
shirts-0.037
TokenThe
Token ID464
Feature activation+5.31e-02

Pos logit contributions
 swollen+0.018
othe+0.016
 scared+0.015
 its+0.014
 frightening+0.014
Neg logit contributions
uries-0.053
nes-0.052
aces-0.051
Whe-0.048
coat-0.047
Token small
Token ID1402
Feature activation+3.33e-02

Pos logit contributions
 swollen+0.064
 scared+0.060
othe+0.057
 frightening+0.056
 its+0.054
Neg logit contributions
uries-0.136
nes-0.133
aces-0.132
Whe-0.119
coat-0.119
Token boy
Token ID2933
Feature activation+4.65e-02

Pos logit contributions
 swollen+0.046
othe+0.042
 scared+0.041
 its+0.040
 frightening+0.040
Neg logit contributions
uries-0.078
nes-0.076
aces-0.076
Whe-0.069
coat-0.068
Token was
Token ID373
Feature activation+1.65e-02

Pos logit contributions
 swollen+0.045
othe+0.041
 frightening+0.037
 scared+0.036
 its+0.035
Neg logit contributions
uries-0.125
nes-0.124
aces-0.123
Whe-0.112
coat-0.111
Token taken
Token ID2077
Feature activation+1.06e-02

Pos logit contributions
 swollen+0.019
othe+0.018
 Andy+0.017
 frightening+0.016
 Farmer+0.016
Neg logit contributions
uries-0.041
nes-0.040
aces-0.039
Whe-0.035
coat-0.035
Token aback
Token ID50147
Feature activation+1.09e-01

Pos logit contributions
 swollen+0.017
othe+0.015
 frightening+0.015
 faint+0.014
 scared+0.014
Neg logit contributions
uries-0.022
aces-0.021
nes-0.021
Whe-0.019
shirts-0.019
Token.
Token ID13
Feature activation+1.79e-02

Pos logit contributions
 swollen+0.157
 faint+0.132
 frightening+0.130
othe+0.128
 Bee+0.128
Neg logit contributions
aces-0.231
uries-0.227
nes-0.223
aults-0.201
ault-0.194
Token He
Token ID679
Feature activation+2.12e-02

Pos logit contributions
 swollen+0.018
othe+0.015
 scared+0.015
 frightening+0.014
 its+0.014
Neg logit contributions
uries-0.050
aces-0.048
nes-0.048
Whe-0.044
shirts-0.044
Token thanked
Token ID26280
Feature activation+2.09e-02

Pos logit contributions
 swollen+0.024
othe+0.023
 Andy+0.020
 Farmer+0.020
 burning+0.020
Neg logit contributions
uries-0.052
nes-0.051
aces-0.051
coat-0.045
shirts-0.045
Token the
Token ID262
Feature activation+3.14e-02

Pos logit contributions
 swollen+0.017
othe+0.013
 scared+0.012
 frightening+0.012
 shocked+0.010
Neg logit contributions
uries-0.061
nes-0.060
aces-0.059
 parties-0.056
Whe-0.056
Token saw
Token ID2497
Feature activation+5.78e-02

Pos logit contributions
 swollen+0.032
othe+0.028
 scared+0.028
 its+0.026
 frightening+0.026
Neg logit contributions
uries-0.086
nes-0.084
aces-0.083
Whe-0.077
coat-0.076
Token
Token ID198
Feature activation+1.79e-02

Pos logit contributions
 swollen+0.030
 frightening+0.024
 burning+0.024
 cracking+0.023
 trembling+0.023
Neg logit contributions
nes-0.054
uries-0.050
aces-0.047
Whe-0.046
shirts-0.044
TokenThe
Token ID464
Feature activation+4.63e-02

Pos logit contributions
 swollen+0.021
othe+0.019
 scared+0.018
 frightening+0.017
 screaming+0.017
Neg logit contributions
uries-0.046
nes-0.046
aces-0.044
Whe-0.042
shirts-0.041
Token big
Token ID1263
Feature activation+3.15e-02

Pos logit contributions
 swollen+0.057
othe+0.054
 its+0.050
 scared+0.048
 Andy+0.048
Neg logit contributions
uries-0.114
aces-0.110
nes-0.110
Whe-0.104
coat-0.102
Token machine
Token ID4572
Feature activation+2.62e-02

Pos logit contributions
 swollen+0.047
othe+0.046
 its+0.042
 frightening+0.041
 scared+0.041
Neg logit contributions
uries-0.065
aces-0.065
nes-0.062
Whe-0.060
coat-0.059
Token answered
Token ID9373
Feature activation+4.40e-02

Pos logit contributions
 swollen+0.023
othe+0.020
 frightening+0.017
 its+0.017
 Farmer+0.017
Neg logit contributions
nes-0.072
uries-0.072
aces-0.071
coat-0.065
shirts-0.064
Token sadly
Token ID21098
Feature activation+1.09e-01

Pos logit contributions
 swollen+0.044
othe+0.037
 frightening+0.035
 scared+0.034
 demanding+0.032
Neg logit contributions
aces-0.117
uries-0.117
nes-0.114
Whe-0.103
coat-0.103
Token,
Token ID11
Feature activation+5.19e-04

Pos logit contributions
 swollen+0.120
 frightening+0.094
othe+0.094
 scared+0.093
 shocked+0.088
Neg logit contributions
uries-0.284
aces-0.274
nes-0.270
coat-0.247
 parties-0.246
Token "
Token ID366
Feature activation-2.81e-02

Pos logit contributions
 swollen+0.001
othe+0.001
 frightening+0.000
 painful+0.000
 Andy+0.000
Neg logit contributions
uries-0.001
aces-0.001
nes-0.001
coat-0.001
Whe-0.001
TokenMy
Token ID3666
Feature activation-1.29e-02

Pos logit contributions
uries+0.059
nes+0.059
aces+0.056
Whe+0.053
shirts+0.051
Neg logit contributions
 swollen-0.045
othe-0.043
 scared-0.040
 its-0.040
 frightening-0.040
Token gear
Token ID7733
Feature activation-1.07e-02

Pos logit contributions
nes+0.030
uries+0.030
aces+0.029
Whe+0.026
 parties+0.026
Neg logit contributions
 swollen-0.018
 scared-0.018
othe-0.017
 Andy-0.017
 frightening-0.017
Token snapped
Token ID20821
Feature activation+2.09e-02

Pos logit contributions
uries+0.031
nes+0.031
aces+0.030
Whe+0.029
shirts+0.028
Neg logit contributions
 swollen-0.009
 scared-0.008
othe-0.007
 its-0.007
 frightening-0.007
Token we
Token ID356
Feature activation-6.26e-02

Pos logit contributions
uries+0.031
nes+0.030
aces+0.030
Whe+0.026
shirts+0.026
Neg logit contributions
 swollen-0.018
othe-0.017
 scared-0.016
 frightening-0.016
 its-0.016
Token do
Token ID466
Feature activation-2.05e-02

Pos logit contributions
aces+0.159
uries+0.156
nes+0.153
Whe+0.148
shirts+0.145
Neg logit contributions
 scared-0.074
 swollen-0.068
 angry-0.066
 dangerous-0.066
 Andy-0.065
Token?"
Token ID1701
Feature activation+1.87e-02

Pos logit contributions
uries+0.060
aces+0.059
Whe+0.057
nes+0.056
shirts+0.055
Neg logit contributions
 scared-0.015
 its-0.013
 surprised-0.012
 angry-0.012
:-0.012
Token asked
Token ID1965
Feature activation+1.25e-02

Pos logit contributions
 swollen+0.016
othe+0.013
 burning+0.012
 frightening+0.012
 scared+0.012
Neg logit contributions
nes-0.053
uries-0.053
aces-0.052
Whe-0.049
shirts-0.048
Token Maya
Token ID26041
Feature activation+1.70e-02

Pos logit contributions
 swollen+0.017
 its+0.017
 scared+0.017
othe+0.016
 Andy+0.016
Neg logit contributions
uries-0.030
nes-0.028
aces-0.028
coat-0.026
Whe-0.025
Token sadly
Token ID21098
Feature activation+1.08e-01

Pos logit contributions
 swollen+0.016
othe+0.014
 frightening+0.014
 scared+0.013
 its+0.013
Neg logit contributions
uries-0.047
nes-0.046
aces-0.046
coat-0.041
Whe-0.041
Token.
Token ID13
Feature activation+2.03e-02

Pos logit contributions
 swollen+0.123
othe+0.104
 frightening+0.102
 scared+0.096
 faint+0.095
Neg logit contributions
uries-0.269
aces-0.264
nes-0.261
coat-0.240
shirts-0.235
Token
Token ID198
Feature activation+2.47e-02

Pos logit contributions
 swollen+0.014
othe+0.012
 scared+0.012
 its+0.011
 frightening+0.011
Neg logit contributions
uries-0.061
nes-0.060
aces-0.059
Whe-0.055
coat-0.054
Token"
Token ID1
Feature activation-4.46e-02

Pos logit contributions
 swollen+0.020
 scared+0.016
 frightening+0.016
othe+0.015
 burning+0.015
Neg logit contributions
nes-0.074
uries-0.073
aces-0.071
Whe-0.067
shirts-0.066
TokenLet
Token ID5756
Feature activation-2.23e-02

Pos logit contributions
uries+0.103
aces+0.101
nes+0.099
Whe+0.090
 parties+0.090
Neg logit contributions
 swollen-0.064
 scared-0.058
othe-0.058
 frightening-0.056
 its-0.055
Token's
Token ID338
Feature activation-2.74e-02

Pos logit contributions
uries+0.058
aces+0.056
nes+0.055
Whe+0.051
coat+0.051
Neg logit contributions
 swollen-0.026
 scared-0.024
 its-0.023
othe-0.023
 Andy-0.022
Token then
Token ID788
Feature activation+9.65e-03

Pos logit contributions
 swollen+0.015
othe+0.014
 scared+0.013
 frightening+0.013
 its+0.013
Neg logit contributions
uries-0.027
nes-0.026
aces-0.026
Whe-0.024
coat-0.023
Token a
Token ID257
Feature activation+2.14e-02

Pos logit contributions
 swollen+0.009
 scared+0.008
othe+0.008
 its+0.008
 Andy+0.008
Neg logit contributions
uries-0.027
nes-0.027
aces-0.027
Whe-0.024
shirts-0.024
Token sad
Token ID6507
Feature activation+2.91e-02

Pos logit contributions
 swollen+0.022
othe+0.021
 its+0.019
 Andy+0.019
 scared+0.018
Neg logit contributions
uries-0.055
nes-0.054
aces-0.052
coat-0.050
Whe-0.049
Token look
Token ID804
Feature activation+3.39e-02

Pos logit contributions
 swollen+0.032
othe+0.028
 scared+0.027
 its+0.026
 frightening+0.026
Neg logit contributions
uries-0.076
nes-0.074
aces-0.073
Whe-0.068
coat-0.067
Token came
Token ID1625
Feature activation+3.63e-02

Pos logit contributions
 swollen+0.036
 scared+0.032
othe+0.032
 its+0.030
 frightening+0.030
Neg logit contributions
uries-0.091
nes-0.089
aces-0.088
Whe-0.081
shirts-0.080
Token over
Token ID625
Feature activation+1.07e-01

Pos logit contributions
 swollen+0.041
othe+0.037
 scared+0.035
 frightening+0.034
 its+0.033
Neg logit contributions
uries-0.093
nes-0.092
aces-0.090
Whe-0.083
coat-0.083
Token her
Token ID607
Feature activation+7.16e-03

Pos logit contributions
 swollen+0.126
othe+0.108
 frightening+0.101
 scared+0.100
 shocked+0.096
Neg logit contributions
uries-0.256
aces-0.249
nes-0.248
 parties-0.237
coat-0.233
Token face
Token ID1986
Feature activation+8.39e-02

Pos logit contributions
 swollen+0.009
othe+0.008
 scared+0.008
 its+0.007
 frightening+0.007
Neg logit contributions
uries-0.018
nes-0.018
aces-0.018
Whe-0.016
shirts-0.016
Token.
Token ID13
Feature activation+2.11e-02

Pos logit contributions
 swollen+0.065
 scared+0.057
othe+0.056
 its+0.053
 frightening+0.051
Neg logit contributions
uries-0.247
nes-0.243
aces-0.239
Whe-0.225
shirts-0.220
Token She
Token ID1375
Feature activation+1.71e-02

Pos logit contributions
 swollen+0.027
othe+0.024
 scared+0.023
 frightening+0.023
 its+0.022
Neg logit contributions
uries-0.051
aces-0.050
nes-0.049
Whe-0.046
coat-0.045
Token couldn
Token ID3521
Feature activation+1.55e-02

Pos logit contributions
 swollen+0.020
othe+0.019
 its+0.017
 Farmer+0.017
 frightening+0.017
Neg logit contributions
uries-0.042
nes-0.042
aces-0.041
shirts-0.037
coat-0.037
TokenThe
Token ID464
Feature activation+4.55e-02

Pos logit contributions
 swollen+0.025
othe+0.023
 scared+0.023
 burning+0.022
 frightening+0.022
Neg logit contributions
nes-0.033
uries-0.031
aces-0.030
Whe-0.029
shirts-0.028
Token man
Token ID582
Feature activation+5.59e-02

Pos logit contributions
 swollen+0.048
 scared+0.043
othe+0.043
 frightening+0.040
 its+0.040
Neg logit contributions
uries-0.123
nes-0.120
aces-0.119
Whe-0.110
coat-0.108
Token was
Token ID373
Feature activation+1.55e-02

Pos logit contributions
 swollen+0.060
othe+0.054
 scared+0.051
 frightening+0.050
 its+0.049
Neg logit contributions
uries-0.147
nes-0.144
aces-0.143
Whe-0.130
shirts-0.129
Token so
Token ID523
Feature activation+3.94e-02

Pos logit contributions
 swollen+0.021
 scared+0.019
othe+0.019
 its+0.018
 frightening+0.018
Neg logit contributions
uries-0.037
nes-0.036
aces-0.036
Whe-0.033
shirts-0.032
Token taken
Token ID2077
Feature activation+6.36e-03

Pos logit contributions
 swollen+0.047
othe+0.043
 frightening+0.040
 Andy+0.039
 burning+0.038
Neg logit contributions
nes-0.095
uries-0.095
aces-0.092
Whe-0.084
 parties-0.082
Token aback
Token ID50147
Feature activation+1.07e-01

Pos logit contributions
 swollen+0.009
othe+0.008
 demanding+0.007
 screaming+0.007
 scared+0.007
Neg logit contributions
aces-0.014
nes-0.014
uries-0.014
 parties-0.013
Whe-0.013
Token by
Token ID416
Feature activation+4.82e-02

Pos logit contributions
 swollen+0.166
othe+0.134
 faint+0.133
 shocked+0.132
 Fire+0.131
Neg logit contributions
aces-0.221
nes-0.212
uries-0.210
 parties-0.187
aults-0.184
Token Maria
Token ID14200
Feature activation+4.99e-02

Pos logit contributions
 swollen+0.049
othe+0.044
 scared+0.039
 frightening+0.038
 painful+0.037
Neg logit contributions
aces-0.120
uries-0.119
nes-0.113
 parties-0.110
 sweets-0.104
Token's
Token ID338
Feature activation-2.32e-02

Pos logit contributions
 swollen+0.054
 scared+0.051
othe+0.050
 its+0.048
 frightening+0.046
Neg logit contributions
uries-0.130
nes-0.128
aces-0.126
Whe-0.120
coat-0.116
Token charming
Token ID23332
Feature activation-3.49e-03

Pos logit contributions
nes+0.056
uries+0.055
aces+0.054
Whe+0.051
coat+0.048
Neg logit contributions
 swollen-0.032
 scared-0.030
 trembling-0.028
 its-0.028
 burning-0.028
Token request
Token ID2581
Feature activation+1.19e-02

Pos logit contributions
nes+0.008
uries+0.008
aces+0.008
Whe+0.007
shirts+0.007
Neg logit contributions
 swollen-0.005
 scared-0.005
 its-0.005
 trembling-0.005
 frightening-0.004
Token and
Token ID290
Feature activation+1.12e-03

Pos logit contributions
 swollen+0.056
othe+0.047
 frightening+0.043
 Farmer+0.043
 scared+0.043
Neg logit contributions
uries-0.113
aces-0.112
nes-0.110
coat-0.102
 parties-0.100
Token saw
Token ID2497
Feature activation+6.00e-02

Pos logit contributions
 swollen+0.001
othe+0.001
 Farmer+0.001
 frightening+0.001
 painful+0.001
Neg logit contributions
nes-0.003
uries-0.003
aces-0.003
Whe-0.003
shirts-0.003
Token that
Token ID326
Feature activation+2.85e-02

Pos logit contributions
 swollen+0.068
othe+0.061
 scared+0.059
 frightening+0.056
 its+0.055
Neg logit contributions
uries-0.154
nes-0.151
aces-0.151
Whe-0.138
shirts-0.136
Token mom
Token ID1995
Feature activation+2.94e-02

Pos logit contributions
 swollen+0.022
 scared+0.021
 its+0.020
othe+0.020
 frightening+0.018
Neg logit contributions
uries-0.085
aces-0.083
nes-0.082
Whe-0.076
coat-0.076
Token was
Token ID373
Feature activation+1.46e-02

Pos logit contributions
 swollen+0.029
othe+0.027
 its+0.024
 Farmer+0.024
 frightening+0.024
Neg logit contributions
uries-0.078
nes-0.076
aces-0.075
coat-0.071
shirts-0.068
Token sadly
Token ID21098
Feature activation+1.07e-01

Pos logit contributions
 swollen+0.017
othe+0.015
 scared+0.015
 its+0.014
 frightening+0.014
Neg logit contributions
uries-0.038
nes-0.037
aces-0.036
Whe-0.033
coat-0.033
Token looking
Token ID2045
Feature activation+1.69e-02

Pos logit contributions
 swollen+0.119
 scared+0.111
othe+0.107
 its+0.104
 frightening+0.102
Neg logit contributions
uries-0.281
nes-0.276
aces-0.274
Whe-0.253
shirts-0.248
Token at
Token ID379
Feature activation+3.58e-02

Pos logit contributions
 swollen+0.021
othe+0.019
 scared+0.018
 Andy+0.018
 its+0.018
Neg logit contributions
uries-0.041
nes-0.040
aces-0.040
Whe-0.036
 parties-0.036
Token the
Token ID262
Feature activation+1.52e-02

Pos logit contributions
 swollen+0.030
othe+0.025
 scared+0.025
 its+0.024
 frightening+0.023
Neg logit contributions
uries-0.104
nes-0.102
aces-0.101
Whe-0.094
coat-0.092
Token calendar
Token ID11845
Feature activation+1.34e-02

Pos logit contributions
 swollen+0.020
 scared+0.020
 burning+0.019
 angry+0.019
 frightening+0.019
Neg logit contributions
nes-0.038
uries-0.037
aces-0.037
Whe-0.033
shirts-0.032
Token.
Token ID13
Feature activation+1.11e-02

Pos logit contributions
 swollen+0.008
 scared+0.008
 its+0.007
 burning+0.007
othe+0.007
Neg logit contributions
uries-0.041
nes-0.040
aces-0.039
Whe-0.039
shirts-0.038
Token long
Token ID890
Feature activation-1.80e-02

Pos logit contributions
 swollen+0.031
othe+0.028
 scared+0.028
 frightening+0.027
 its+0.027
Neg logit contributions
uries-0.053
aces-0.052
nes-0.052
Whe-0.047
shirts-0.046
Token time
Token ID640
Feature activation-3.56e-02

Pos logit contributions
uries+0.041
aces+0.040
nes+0.039
Whe+0.037
shirts+0.035
Neg logit contributions
 scared-0.025
 swollen-0.025
othe-0.025
 burning-0.024
 Farmer-0.024
Token?"
Token ID1701
Feature activation+1.88e-02

Pos logit contributions
uries+0.096
aces+0.096
nes+0.093
Whe+0.090
shirts+0.086
Neg logit contributions
 scared-0.035
 burning-0.032
othe-0.031
 its-0.031
 swollen-0.031
Token she
Token ID673
Feature activation-2.21e-03

Pos logit contributions
 swollen+0.024
 burning+0.021
 frightening+0.020
 cracking+0.020
 attacking+0.020
Neg logit contributions
nes-0.044
uries-0.041
aces-0.040
Whe-0.039
 parties-0.038
Token asked
Token ID1965
Feature activation+1.39e-02

Pos logit contributions
nes+0.006
uries+0.006
aces+0.006
shirts+0.005
coat+0.005
Neg logit contributions
 swollen-0.002
othe-0.002
 burning-0.002
 frightening-0.002
 Andy-0.002
Token sadly
Token ID21098
Feature activation+1.07e-01

Pos logit contributions
 swollen+0.014
othe+0.012
 scared+0.012
 frightening+0.012
 its+0.011
Neg logit contributions
uries-0.037
nes-0.037
aces-0.037
Whe-0.033
coat-0.033
Token.
Token ID13
Feature activation+1.50e-02

Pos logit contributions
 swollen+0.117
othe+0.097
 frightening+0.096
 scared+0.094
 shocked+0.089
Neg logit contributions
uries-0.274
nes-0.267
aces-0.266
coat-0.242
shirts-0.240
Token Her
Token ID2332
Feature activation+9.20e-03

Pos logit contributions
 swollen+0.015
 frightening+0.012
 burning+0.011
 scared+0.011
 dangerous+0.011
Neg logit contributions
uries-0.041
nes-0.041
aces-0.039
Whe-0.037
coat-0.036
Token parents
Token ID3397
Feature activation+2.45e-03

Pos logit contributions
 swollen+0.015
othe+0.014
 its+0.014
 frightening+0.013
 scared+0.013
Neg logit contributions
uries-0.018
aces-0.018
nes-0.018
 parties-0.016
shirts-0.016
Token assured
Token ID13933
Feature activation+3.69e-02

Pos logit contributions
 swollen+0.003
othe+0.002
 frightening+0.002
 its+0.002
 scared+0.002
Neg logit contributions
uries-0.006
nes-0.006
aces-0.006
coat-0.006
shirts-0.006
Token her
Token ID607
Feature activation-1.47e-02

Pos logit contributions
 swollen+0.037
othe+0.034
 scared+0.033
 its+0.031
 frightening+0.031
Neg logit contributions
uries-0.100
nes-0.097
aces-0.097
Whe-0.090
coat-0.088
Token the
Token ID262
Feature activation+1.09e-02

Pos logit contributions
 swollen+0.009
othe+0.009
 scared+0.008
 frightening+0.008
 its+0.008
Neg logit contributions
uries-0.022
nes-0.022
aces-0.021
Whe-0.020
coat-0.019
Token little
Token ID1310
Feature activation+3.72e-02

Pos logit contributions
 swollen+0.014
 scared+0.013
 trembling+0.012
 shaking+0.012
 frightening+0.012
Neg logit contributions
uries-0.028
nes-0.028
aces-0.027
Whe-0.025
 parties-0.024
Token girl
Token ID2576
Feature activation+3.14e-02

Pos logit contributions
 swollen+0.047
 scared+0.046
 trembling+0.041
 frightening+0.041
 worried+0.041
Neg logit contributions
uries-0.097
nes-0.094
aces-0.091
 parties-0.083
shirts-0.082
Token was
Token ID373
Feature activation+9.83e-03

Pos logit contributions
 swollen+0.035
othe+0.032
 Farmer+0.030
 frightening+0.029
 its+0.028
Neg logit contributions
nes-0.079
uries-0.079
aces-0.078
coat-0.071
ird-0.069
Token taken
Token ID2077
Feature activation+9.80e-03

Pos logit contributions
 swollen+0.009
othe+0.009
 frightening+0.007
 Farmer+0.007
 Andy+0.007
Neg logit contributions
uries-0.027
nes-0.026
aces-0.026
coat-0.024
Whe-0.023
Token aback
Token ID50147
Feature activation+1.07e-01

Pos logit contributions
 swollen+0.015
othe+0.013
 scared+0.013
 frightening+0.013
 screaming+0.012
Neg logit contributions
uries-0.021
aces-0.021
nes-0.021
coat-0.019
Whe-0.019
Token.
Token ID13
Feature activation+1.12e-02

Pos logit contributions
 swollen+0.146
 faint+0.127
othe+0.125
 Bee+0.125
 Fire+0.124
Neg logit contributions
aces-0.231
nes-0.227
uries-0.227
aults-0.202
ault-0.194
Token Inside
Token ID14384
Feature activation-4.85e-03

Pos logit contributions
 swollen+0.013
othe+0.010
 frightening+0.010
 scared+0.010
 burning+0.010
Neg logit contributions
uries-0.029
nes-0.028
aces-0.028
Whe-0.026
coat-0.025
Token the
Token ID262
Feature activation+7.24e-03

Pos logit contributions
nes+0.013
aces+0.012
uries+0.012
 parties+0.012
coat+0.012
Neg logit contributions
 swollen-0.005
othe-0.004
 frightening-0.004
 demanding-0.004
 scared-0.004
Token room
Token ID2119
Feature activation-1.48e-03

Pos logit contributions
 swollen+0.009
 scared+0.008
 frightening+0.008
othe+0.008
 trembling+0.008
Neg logit contributions
uries-0.018
nes-0.018
aces-0.018
Whe-0.016
shirts-0.016
Token,
Token ID11
Feature activation-9.03e-03

Pos logit contributions
uries+0.004
nes+0.004
aces+0.004
coat+0.004
 parties+0.004
Neg logit contributions
 swollen-0.001
othe-0.001
 scared-0.001
 frightening-0.001
 its-0.001
Token\n
Token ID198
Feature activation+1.70e-02

Pos logit contributions
 swollen+0.018
 frightening+0.014
othe+0.014
 dangerous+0.013
 painful+0.013
Neg logit contributions
uries-0.041
nes-0.041
aces-0.040
Whe-0.037
coat-0.036
Token\n
Token ID198
Feature activation+1.35e-02

Pos logit contributions
 swollen+0.019
 frightening+0.017
 burning+0.016
 scared+0.016
 cracking+0.016
Neg logit contributions
nes-0.045
uries-0.042
aces-0.040
Whe-0.038
shirts-0.038
TokenMay
Token ID6747
Feature activation+6.67e-03

Pos logit contributions
 swollen+0.020
othe+0.018
 scared+0.018
 frightening+0.017
 burning+0.017
Neg logit contributions
nes-0.031
uries-0.030
aces-0.028
Whe-0.027
shirts-0.026
Tokena
Token ID64
Feature activation+2.22e-02

Pos logit contributions
 swollen+0.007
othe+0.006
 scared+0.006
 its+0.006
 frightening+0.006
Neg logit contributions
uries-0.018
nes-0.017
aces-0.017
Whe-0.016
coat-0.015
Token looked
Token ID3114
Feature activation+3.05e-02

Pos logit contributions
 swollen+0.023
othe+0.021
 scared+0.020
 its+0.019
 frightening+0.019
Neg logit contributions
uries-0.059
nes-0.058
aces-0.057
Whe-0.053
coat-0.052
Token sadly
Token ID21098
Feature activation+1.07e-01

Pos logit contributions
 swollen+0.054
othe+0.050
 Farmer+0.049
 Bee+0.049
 Andy+0.048
Neg logit contributions
aces-0.053
uries-0.052
nes-0.050
coat-0.045
 parties-0.045
Token at
Token ID379
Feature activation+4.79e-02

Pos logit contributions
 swollen+0.152
 Bee+0.128
othe+0.128
 frightening+0.125
oc+0.124
Neg logit contributions
uries-0.221
aces-0.217
nes-0.199
 shorts-0.191
coat-0.190
Token the
Token ID262
Feature activation+2.24e-02

Pos logit contributions
 swollen+0.052
othe+0.046
 scared+0.044
 frightening+0.042
 its+0.041
Neg logit contributions
uries-0.126
nes-0.125
aces-0.123
Whe-0.114
 parties-0.112
Token fragile
Token ID21049
Feature activation+5.54e-03

Pos logit contributions
 swollen+0.030
 scared+0.029
 burning+0.027
 angry+0.027
 trembling+0.026
Neg logit contributions
uries-0.056
nes-0.056
aces-0.054
Whe-0.049
shirts-0.049
Token hoop
Token ID44792
Feature activation-9.93e-03

Pos logit contributions
 swollen+0.008
 scared+0.007
 its+0.007
 burning+0.007
 trembling+0.007
Neg logit contributions
uries-0.014
nes-0.014
aces-0.013
Whe-0.012
shirts-0.012
Token.
Token ID13
Feature activation+1.34e-02

Pos logit contributions
uries+0.030
nes+0.029
Whe+0.029
aces+0.028
shirts+0.028
Neg logit contributions
 swollen-0.007
 scared-0.006
othe-0.006
 its-0.006
 shaking-0.006
Token collect
Token ID2824
Feature activation-5.06e-02

Pos logit contributions
uries+0.139
nes+0.135
aces+0.134
Whe+0.124
shirts+0.120
Neg logit contributions
 swollen-0.078
 scared-0.074
othe-0.072
 its-0.069
 frightening-0.069
Token,"
Token ID553
Feature activation+8.62e-03

Pos logit contributions
uries+0.142
nes+0.136
Whe+0.130
aces+0.130
coat+0.124
Neg logit contributions
 swollen-0.049
 its-0.048
 scared-0.048
 dangerous-0.043
othe-0.042
Token the
Token ID262
Feature activation+1.44e-02

Pos logit contributions
 swollen+0.010
 burning+0.008
 attacking+0.008
 frightening+0.008
 painful+0.008
Neg logit contributions
nes-0.020
aces-0.019
Whe-0.019
 shorts-0.018
uries-0.018
Token bird
Token ID6512
Feature activation+6.08e-02

Pos logit contributions
othe+0.026
 Andy+0.021
 swollen+0.021
 frightening+0.021
 screaming+0.020
Neg logit contributions
aces-0.025
nes-0.024
uries-0.023
 guys-0.022
 bracelet-0.022
Token replied
Token ID8712
Feature activation+1.81e-02

Pos logit contributions
 swollen+0.068
 burning+0.068
 Farmer+0.067
 Fire+0.064
 attacking+0.063
Neg logit contributions
aces-0.140
nes-0.137
uries-0.128
ird-0.120
coat-0.118
Token sadly
Token ID21098
Feature activation+1.06e-01

Pos logit contributions
 swollen+0.023
 frightening+0.020
 faint+0.019
 dangerous+0.019
othe+0.019
Neg logit contributions
aces-0.043
nes-0.041
uries-0.041
coat-0.036
shirts-0.035
Token.
Token ID13
Feature activation+1.32e-02

Pos logit contributions
 swollen+0.131
 frightening+0.113
othe+0.108
 scared+0.107
 faint+0.106
Neg logit contributions
uries-0.255
aces-0.252
nes-0.244
coat-0.221
 parties-0.215
Token\n
Token ID198
Feature activation+1.79e-02

Pos logit contributions
 swollen+0.023
 angry+0.021
 burning+0.021
 painful+0.021
 cracking+0.021
Neg logit contributions
aces-0.024
nes-0.023
uries-0.023
Whe-0.021
 shorts-0.020
Token\n
Token ID198
Feature activation+1.42e-02

Pos logit contributions
 swollen+0.033
 cracking+0.030
 burning+0.030
 frightening+0.029
 angry+0.029
Neg logit contributions
nes-0.033
aces-0.027
uries-0.026
ests-0.026
Whe-0.025
TokenB
Token ID33
Feature activation-2.65e-02

Pos logit contributions
othe+0.029
 swollen+0.029
 burning+0.028
 screaming+0.027
 crying+0.027
Neg logit contributions
nes-0.021
uries-0.018
aces-0.016
aults-0.016
Whe-0.015
Tokenert
Token ID861
Feature activation+1.10e-02

Pos logit contributions
uries+0.056
nes+0.055
aces+0.054
Whe+0.048
coat+0.048
Neg logit contributions
 swollen-0.043
othe-0.040
 scared-0.040
 its-0.038
 frightening-0.038
Token touched
Token ID12615
Feature activation+4.49e-02

Pos logit contributions
uries+0.094
nes+0.092
aces+0.092
Whe+0.085
coat+0.083
Neg logit contributions
 swollen-0.036
othe-0.032
 scared-0.031
 its-0.029
 frightening-0.029
Token!"
Token ID2474
Feature activation-1.80e-03

Pos logit contributions
 swollen+0.055
 scared+0.051
 its+0.049
othe+0.049
 frightening+0.048
Neg logit contributions
uries-0.113
nes-0.110
aces-0.109
Whe-0.101
coat-0.100
Token Lily
Token ID20037
Feature activation+8.57e-03

Pos logit contributions
uries+0.005
nes+0.005
aces+0.005
Whe+0.005
coat+0.004
Neg logit contributions
 swollen-0.002
othe-0.001
 scared-0.001
 frightening-0.001
 its-0.001
Token was
Token ID373
Feature activation+1.72e-02

Pos logit contributions
 swollen+0.011
othe+0.010
 Farmer+0.009
 frightening+0.009
 burning+0.009
Neg logit contributions
nes-0.020
aces-0.020
uries-0.019
coat-0.017
ird-0.017
Token taken
Token ID2077
Feature activation+7.65e-03

Pos logit contributions
 swollen+0.018
othe+0.018
oc+0.016
 Andy+0.014
 Farmer+0.014
Neg logit contributions
uries-0.043
nes-0.042
aces-0.042
Whe-0.037
coat-0.036
Token aback
Token ID50147
Feature activation+1.06e-01

Pos logit contributions
 swollen+0.012
othe+0.011
 frightening+0.011
 screaming+0.011
 burning+0.011
Neg logit contributions
uries-0.015
aces-0.015
nes-0.015
coat-0.013
Whe-0.013
Token and
Token ID290
Feature activation+1.17e-02

Pos logit contributions
 swollen+0.144
 Fire+0.129
 faint+0.128
 Bee+0.127
 Andy+0.125
Neg logit contributions
aces-0.230
uries-0.225
nes-0.222
aults-0.200
 parties-0.197
Token stopped
Token ID5025
Feature activation+6.70e-02

Pos logit contributions
 swollen+0.015
othe+0.013
 frightening+0.012
 faint+0.012
 Farmer+0.012
Neg logit contributions
nes-0.026
uries-0.025
aces-0.024
 parties-0.023
coat-0.022
Token in
Token ID287
Feature activation+4.67e-02

Pos logit contributions
 swollen+0.087
oc+0.087
 painful+0.080
 frightening+0.074
othe+0.074
Neg logit contributions
uries-0.118
aces-0.117
nes-0.115
ault-0.114
 shoes-0.110
Token her
Token ID607
Feature activation-1.04e-02

Pos logit contributions
 swollen+0.061
othe+0.058
 frightening+0.056
 Andy+0.054
 dangerous+0.053
Neg logit contributions
uries-0.095
nes-0.094
aces-0.092
coat-0.090
 parties-0.089
Token tracks
Token ID8339
Feature activation+8.50e-03

Pos logit contributions
uries+0.019
aces+0.019
nes+0.019
Whe+0.017
coat+0.017
Neg logit contributions
 swollen-0.018
othe-0.017
 scared-0.017
 frightening-0.016
 demanding-0.016
Token and
Token ID290
Feature activation+1.96e-02

Pos logit contributions
 swollen+0.060
othe+0.053
 scared+0.050
 frightening+0.049
 its+0.047
Neg logit contributions
uries-0.173
aces-0.170
nes-0.169
Whe-0.155
coat-0.155
Token so
Token ID523
Feature activation+2.70e-02

Pos logit contributions
 swollen+0.022
othe+0.019
 Farmer+0.019
 faint+0.019
 frightening+0.018
Neg logit contributions
nes-0.048
aces-0.047
uries-0.047
coat-0.042
shirts-0.041
Token her
Token ID607
Feature activation+1.23e-03

Pos logit contributions
 swollen+0.030
 Farmer+0.027
 frightening+0.026
 faint+0.026
 Bee+0.026
Neg logit contributions
aces-0.063
nes-0.063
uries-0.062
Whe-0.056
coat-0.055
Token story
Token ID1621
Feature activation-3.29e-02

Pos logit contributions
 swollen+0.001
 scared+0.001
othe+0.001
 frightening+0.001
 its+0.001
Neg logit contributions
uries-0.003
nes-0.003
aces-0.003
Whe-0.003
shirts-0.003
Token ended
Token ID4444
Feature activation+2.48e-02

Pos logit contributions
uries+0.093
nes+0.090
aces+0.088
Whe+0.084
shirts+0.082
Neg logit contributions
 swollen-0.030
 scared-0.027
othe-0.027
 its-0.025
 frightening-0.024
Token sadly
Token ID21098
Feature activation+1.06e-01

Pos logit contributions
 swollen+0.032
othe+0.030
 frightening+0.028
 scared+0.028
 faint+0.028
Neg logit contributions
uries-0.058
nes-0.057
aces-0.056
coat-0.051
Whe-0.050
Token.
Token ID13
Feature activation+3.43e-03

Pos logit contributions
 swollen+0.108
othe+0.093
 frightening+0.092
 Andy+0.091
 scared+0.091
Neg logit contributions
uries-0.274
aces-0.264
nes-0.264
coat-0.242
 parties-0.238
Token<|endoftext|>
Token ID50256
Feature activation-5.66e-02

Pos logit contributions
 swollen+0.003
othe+0.003
 scared+0.003
 frightening+0.003
 burning+0.003
Neg logit contributions
uries-0.009
nes-0.009
aces-0.009
Whe-0.008
shirts-0.008
TokenAlice
Token ID44484
Feature activation+8.19e-03

Pos logit contributions
uries+0.183
nes+0.179
aces+0.178
Whe+0.166
shirts+0.165
Neg logit contributions
 swollen-0.027
 scared-0.021
othe-0.021
 its-0.019
 frightening-0.017
Token was
Token ID373
Feature activation+7.86e-03

Pos logit contributions
 swollen+0.007
othe+0.006
 scared+0.006
 frightening+0.005
 its+0.005
Neg logit contributions
uries-0.024
nes-0.023
aces-0.023
Whe-0.021
coat-0.021
Token a
Token ID257
Feature activation+3.16e-02

Pos logit contributions
 swollen+0.010
othe+0.009
 scared+0.009
 frightening+0.009
 its+0.009
Neg logit contributions
uries-0.019
nes-0.019
aces-0.019
Whe-0.017
coat-0.017
Token down
Token ID866
Feature activation+5.80e-02

Pos logit contributions
 swollen+0.045
othe+0.041
 Andy+0.039
 Bee+0.037
 faint+0.037
Neg logit contributions
uries-0.063
aces-0.061
nes-0.060
 parties-0.055
shirts-0.054
Token at
Token ID379
Feature activation+3.76e-02

Pos logit contributions
 swollen+0.080
othe+0.069
 Andy+0.067
 frightening+0.064
 Bee+0.064
Neg logit contributions
aces-0.126
uries-0.122
nes-0.119
 parties-0.114
coat-0.112
Token him
Token ID683
Feature activation+1.95e-02

Pos logit contributions
 swollen+0.047
othe+0.043
 scared+0.041
 frightening+0.039
 Bee+0.037
Neg logit contributions
uries-0.092
nes-0.090
aces-0.089
Whe-0.082
 parties-0.082
Token and
Token ID290
Feature activation+1.68e-02

Pos logit contributions
 swollen+0.016
othe+0.014
 scared+0.014
 its+0.013
 frightening+0.013
Neg logit contributions
uries-0.056
nes-0.056
aces-0.055
Whe-0.052
shirts-0.051
Token said
Token ID531
Feature activation+6.90e-03

Pos logit contributions
 swollen+0.016
othe+0.014
 frightening+0.013
 burning+0.013
 attacking+0.013
Neg logit contributions
nes-0.042
uries-0.042
aces-0.039
shirts-0.038
 shorts-0.038
Token sadly
Token ID21098
Feature activation+1.05e-01

Pos logit contributions
 swollen+0.005
othe+0.004
 scared+0.004
 frightening+0.004
 its+0.004
Neg logit contributions
uries-0.020
nes-0.020
aces-0.020
Whe-0.018
shirts-0.018
Token,
Token ID11
Feature activation+6.25e-04

Pos logit contributions
 swollen+0.084
othe+0.071
 scared+0.070
 frightening+0.065
 its+0.064
Neg logit contributions
uries-0.306
nes-0.299
aces-0.297
Whe-0.275
coat-0.272
Token "
Token ID366
Feature activation-2.64e-02

Pos logit contributions
 swollen+0.001
othe+0.001
 frightening+0.000
 scared+0.000
 its+0.000
Neg logit contributions
uries-0.002
nes-0.002
aces-0.002
coat-0.001
Whe-0.001
TokenI
Token ID40
Feature activation-3.93e-02

Pos logit contributions
uries+0.063
nes+0.061
aces+0.061
Whe+0.055
coat+0.054
Neg logit contributions
 swollen-0.036
othe-0.033
 scared-0.032
 its-0.031
 frightening-0.031
Token told
Token ID1297
Feature activation+1.26e-02

Pos logit contributions
nes+0.082
uries+0.082
aces+0.073
coat+0.070
ird+0.070
Neg logit contributions
 swollen-0.064
othe-0.056
 its-0.055
 cracking-0.054
 frightening-0.053
Token you
Token ID345
Feature activation-5.54e-02

Pos logit contributions
 swollen+0.016
othe+0.015
 scared+0.015
 its+0.015
 Andy+0.015
Neg logit contributions
uries-0.030
nes-0.030
aces-0.030
Whe-0.027
coat-0.027
Token\n
Token ID198
Feature activation+9.21e-03

Pos logit contributions
 swollen+0.020
 frightening+0.019
 burning+0.018
 cracking+0.017
 buzzing+0.017
Neg logit contributions
nes-0.029
uries-0.028
Whe-0.026
aces-0.025
shirts-0.025
TokenHis
Token ID6653
Feature activation+2.10e-02

Pos logit contributions
 swollen+0.010
othe+0.009
 scared+0.009
 frightening+0.009
 its+0.009
Neg logit contributions
uries-0.024
nes-0.023
aces-0.023
Whe-0.021
shirts-0.021
Token mom
Token ID1995
Feature activation+3.76e-02

Pos logit contributions
 swollen+0.023
othe+0.022
 its+0.021
 frightening+0.021
 scared+0.020
Neg logit contributions
uries-0.053
aces-0.052
nes-0.051
coat-0.048
 parties-0.047
Token was
Token ID373
Feature activation+1.28e-02

Pos logit contributions
 swollen+0.041
othe+0.038
 its+0.036
 frightening+0.036
 faint+0.036
Neg logit contributions
nes-0.092
uries-0.091
aces-0.090
coat-0.086
ird-0.084
Token taken
Token ID2077
Feature activation+6.44e-03

Pos logit contributions
 swollen+0.011
othe+0.010
 Andy+0.010
 its+0.009
 Bee+0.009
Neg logit contributions
uries-0.035
nes-0.035
aces-0.034
coat-0.032
Whe-0.031
Token aback
Token ID50147
Feature activation+1.05e-01

Pos logit contributions
 swollen+0.005
othe+0.004
 scared+0.004
 its+0.004
 frightening+0.004
Neg logit contributions
uries-0.019
nes-0.019
aces-0.019
Whe-0.017
coat-0.017
Token.
Token ID13
Feature activation+1.22e-02

Pos logit contributions
 swollen+0.134
 Bee+0.117
 frightening+0.116
othe+0.115
 faint+0.115
Neg logit contributions
uries-0.242
aces-0.242
nes-0.238
 parties-0.220
Whe-0.214
Token She
Token ID1375
Feature activation+6.78e-03

Pos logit contributions
 swollen+0.013
othe+0.011
 scared+0.011
 frightening+0.011
 its+0.010
Neg logit contributions
uries-0.032
nes-0.031
aces-0.031
Whe-0.029
coat-0.029
Token said
Token ID531
Feature activation-5.68e-03

Pos logit contributions
 swollen+0.006
othe+0.006
 faint+0.005
 its+0.005
 frightening+0.005
Neg logit contributions
nes-0.018
uries-0.018
aces-0.017
coat-0.016
shirts-0.016
Token calmly
Token ID30180
Feature activation+5.55e-02

Pos logit contributions
uries+0.017
nes+0.016
aces+0.016
Whe+0.015
coat+0.015
Neg logit contributions
 swollen-0.004
othe-0.004
 scared-0.004
 frightening-0.003
 its-0.003
Token,
Token ID11
Feature activation-1.19e-02

Pos logit contributions
 swollen+0.051
othe+0.042
 frightening+0.041
 scared+0.041
 its+0.038
Neg logit contributions
uries-0.156
aces-0.152
nes-0.150
coat-0.139
 parties-0.138
Token until
Token ID1566
Feature activation+7.39e-03

Pos logit contributions
 swollen+0.008
othe+0.007
 Andy+0.007
 scared+0.007
 frightening+0.006
Neg logit contributions
uries-0.013
aces-0.012
nes-0.012
Whe-0.011
coat-0.011
Token a
Token ID257
Feature activation+1.92e-02

Pos logit contributions
 swollen+0.007
othe+0.006
 scared+0.006
 its+0.005
 frightening+0.005
Neg logit contributions
uries-0.021
nes-0.021
aces-0.020
Whe-0.019
coat-0.019
Token strange
Token ID6283
Feature activation+2.24e-02

Pos logit contributions
 swollen+0.021
othe+0.020
 its+0.018
 scared+0.018
 Andy+0.018
Neg logit contributions
uries-0.048
nes-0.047
aces-0.046
Whe-0.044
coat-0.044
Token feeling
Token ID4203
Feature activation+2.80e-02

Pos logit contributions
 swollen+0.027
othe+0.024
 scared+0.024
 its+0.023
 frightening+0.023
Neg logit contributions
uries-0.056
nes-0.055
aces-0.055
Whe-0.050
shirts-0.049
Token came
Token ID1625
Feature activation+3.81e-02

Pos logit contributions
 swollen+0.034
othe+0.030
 Andy+0.029
 scared+0.028
 frightening+0.028
Neg logit contributions
nes-0.069
uries-0.069
aces-0.068
Whe-0.062
coat-0.062
Token over
Token ID625
Feature activation+1.05e-01

Pos logit contributions
 swollen+0.050
othe+0.047
 frightening+0.042
 Farmer+0.042
 Andy+0.040
Neg logit contributions
uries-0.086
nes-0.086
aces-0.085
coat-0.078
 parties-0.076
Token her
Token ID607
Feature activation+1.39e-03

Pos logit contributions
 swollen+0.151
othe+0.133
 frightening+0.127
 shocked+0.125
 scared+0.124
Neg logit contributions
 parties-0.207
aces-0.207
uries-0.202
nes-0.194
 shorts-0.188
Token.
Token ID13
Feature activation+1.99e-02

Pos logit contributions
 swollen+0.001
 scared+0.001
othe+0.001
 its+0.001
 frightening+0.001
Neg logit contributions
uries-0.004
nes-0.004
aces-0.004
Whe-0.003
shirts-0.003
Token In
Token ID554
Feature activation+1.06e-02

Pos logit contributions
 swollen+0.017
othe+0.016
 its+0.015
 scared+0.015
 Andy+0.014
Neg logit contributions
uries-0.056
nes-0.056
aces-0.055
Whe-0.051
coat-0.050
Token the
Token ID262
Feature activation+1.48e-02

Pos logit contributions
 swollen+0.008
othe+0.007
 scared+0.007
 frightening+0.006
 Andy+0.006
Neg logit contributions
uries-0.030
aces-0.030
nes-0.030
 parties-0.028
Whe-0.028
Token silence
Token ID9550
Feature activation+5.10e-02

Pos logit contributions
othe+0.028
 swollen+0.028
 its+0.027
 Andy+0.026
 Bee+0.025
Neg logit contributions
uries-0.024
aces-0.023
Whe-0.022
nes-0.022
coat-0.021
Token did
Token ID750
Feature activation-3.19e-02

Pos logit contributions
uries+0.082
aces+0.079
nes+0.076
Whe+0.073
aults+0.071
Neg logit contributions
 scared-0.021
 worried-0.017
 swollen-0.017
 was-0.015
 surprised-0.015
Token something
Token ID1223
Feature activation+1.34e-02

Pos logit contributions
uries+0.096
aces+0.094
nes+0.093
Whe+0.088
coat+0.086
Neg logit contributions
 its-0.022
 scared-0.021
 swollen-0.020
 faint-0.018
 angry-0.018
Token special
Token ID2041
Feature activation-2.59e-02

Pos logit contributions
 swollen+0.013
 scared+0.013
othe+0.012
 frightening+0.012
 its+0.012
Neg logit contributions
uries-0.037
aces-0.036
nes-0.036
Whe-0.033
coat-0.032
Token.
Token ID13
Feature activation-9.98e-03

Pos logit contributions
uries+0.084
Whe+0.081
nes+0.081
aces+0.080
aults+0.077
Neg logit contributions
 its-0.010
 scared-0.010
 was-0.008
:-0.008
 the-0.007
Token On
Token ID1550
Feature activation-1.48e-02

Pos logit contributions
uries+0.028
nes+0.028
aces+0.028
Whe+0.025
coat+0.025
Neg logit contributions
 swollen-0.009
othe-0.008
 scared-0.008
 its-0.007
 Andy-0.007
Token Wed
Token ID3300
Feature activation-1.60e-01

Pos logit contributions
uries+0.042
aces+0.041
nes+0.041
Whe+0.038
coat+0.037
Neg logit contributions
 swollen-0.014
othe-0.012
 scared-0.012
 its-0.011
 frightening-0.011
Tokennes
Token ID2516
Feature activation-7.09e-02

Pos logit contributions
aces+0.111
 sneakers+0.092
 shorts+0.089
Whe+0.085
ests+0.084
Neg logit contributions
 told-0.409
 scared-0.405
 its-0.405
othe-0.400
 Andy-0.399
Tokendays
Token ID12545
Feature activation-7.72e-02

Pos logit contributions
uries+0.094
nes+0.091
aces+0.087
Whe+0.073
coat+0.072
Neg logit contributions
 swollen-0.172
othe-0.163
 scared-0.160
 frightening-0.158
 its-0.157
Token she
Token ID673
Feature activation-3.76e-02

Pos logit contributions
uries+0.252
aces+0.243
nes+0.240
aults+0.230
ault+0.225
Neg logit contributions
 scared-0.031
 Andy-0.028
 its-0.027
othe-0.027
 was-0.026
Token liked
Token ID8288
Feature activation-7.55e-03

Pos logit contributions
uries+0.100
aces+0.098
nes+0.093
Whe+0.089
aults+0.086
Neg logit contributions
 scared-0.040
 worried-0.036
 swollen-0.035
 was-0.035
 its-0.034
Token to
Token ID284
Feature activation-1.03e-02

Pos logit contributions
uries+0.024
nes+0.024
aces+0.024
Whe+0.022
coat+0.022
Neg logit contributions
 swollen-0.004
 scared-0.003
 its-0.003
 Andy-0.003
 demanding-0.003
Token's
Token ID338
Feature activation-1.16e-02

Pos logit contributions
 swollen+0.018
 scared+0.017
 its+0.016
othe+0.016
 Andy+0.015
Neg logit contributions
uries-0.057
nes-0.056
aces-0.055
Whe-0.053
coat-0.051
Token a
Token ID257
Feature activation-1.49e-03

Pos logit contributions
nes+0.028
uries+0.028
aces+0.028
Whe+0.027
shirts+0.025
Neg logit contributions
 swollen-0.015
 scared-0.014
 its-0.014
 frightening-0.013
othe-0.013
Token special
Token ID2041
Feature activation-3.81e-02

Pos logit contributions
aces+0.004
uries+0.004
nes+0.004
Whe+0.003
shirts+0.003
Neg logit contributions
 swollen-0.002
 scared-0.002
 frightening-0.002
 its-0.002
 faint-0.002
Token hat
Token ID6877
Feature activation-8.80e-03

Pos logit contributions
uries+0.079
nes+0.078
aces+0.077
Whe+0.069
shirts+0.065
Neg logit contributions
 swollen-0.062
 scared-0.060
 its-0.060
othe-0.057
 frightening-0.057
Token for
Token ID329
Feature activation-2.43e-02

Pos logit contributions
uries+0.026
nes+0.025
Whe+0.025
aces+0.024
shirts+0.024
Neg logit contributions
 scared-0.006
 swollen-0.006
 its-0.006
othe-0.005
 burning-0.005
Token Wed
Token ID3300
Feature activation-1.51e-01

Pos logit contributions
uries+0.064
nes+0.063
aces+0.063
Whe+0.059
coat+0.059
Neg logit contributions
 swollen-0.027
 its-0.025
 scared-0.024
othe-0.024
 Andy-0.023
Tokennes
Token ID2516
Feature activation-7.88e-02

Pos logit contributions
aces+0.072
Whe+0.066
 sneakers+0.055
 shorts+0.051
uries+0.042
Neg logit contributions
othe-0.451
 scared-0.436
 swollen-0.427
 demanding-0.427
 frightening-0.426
Tokendays
Token ID12545
Feature activation-6.69e-02

Pos logit contributions
uries+0.129
aces+0.125
nes+0.124
Whe+0.114
coat+0.106
Neg logit contributions
 swollen-0.157
 scared-0.151
othe-0.151
 its-0.148
 Andy-0.145
Token."
Token ID526
Feature activation-7.83e-03

Pos logit contributions
uries+0.192
Whe+0.192
aces+0.186
nes+0.183
aults+0.178
Neg logit contributions
 its-0.041
 scared-0.041
 the-0.038
 was-0.038
s-0.037
Token\n
Token ID198
Feature activation+1.73e-02

Pos logit contributions
uries+0.025
nes+0.025
aces+0.025
Whe+0.023
coat+0.023
Neg logit contributions
 swollen-0.004
othe-0.003
 scared-0.003
 its-0.002
 frightening-0.002
Token\n
Token ID198
Feature activation+1.37e-02

Pos logit contributions
 swollen+0.019
 burning+0.017
 frightening+0.016
 cracking+0.016
 trembling+0.016
Neg logit contributions
nes-0.047
uries-0.044
aces-0.043
Whe-0.042
 parties-0.041
Token Wed
Token ID3300
Feature activation-1.33e-01

Pos logit contributions
uries+0.004
nes+0.004
aces+0.004
Whe+0.004
coat+0.004
Neg logit contributions
 swollen-0.001
 scared-0.001
othe-0.001
 its-0.001
 frightening-0.001
Tokennes
Token ID2516
Feature activation-6.47e-02

Pos logit contributions
aces+0.062
Whe+0.051
uries+0.049
shirts+0.039
 sneakers+0.038
Neg logit contributions
othe-0.405
 swollen-0.401
 scared-0.400
 faint-0.383
 frightening-0.382
Tokendays
Token ID12545
Feature activation-6.87e-02

Pos logit contributions
uries+0.076
nes+0.074
aces+0.068
coat+0.056
 parties+0.055
Neg logit contributions
 swollen-0.169
othe-0.159
 frightening-0.156
 scared-0.155
 Farmer-0.153
Token because
Token ID780
Feature activation-6.17e-03

Pos logit contributions
uries+0.210
aces+0.206
nes+0.205
Whe+0.197
shirts+0.188
Neg logit contributions
 scared-0.041
 its-0.036
othe-0.036
 Andy-0.033
 worried-0.031
Token on
Token ID319
Feature activation-2.14e-02

Pos logit contributions
uries+0.017
aces+0.017
nes+0.017
Whe+0.015
coat+0.015
Neg logit contributions
 swollen-0.006
 scared-0.005
 its-0.005
othe-0.005
 Andy-0.005
Token Wed
Token ID3300
Feature activation-1.48e-01

Pos logit contributions
uries+0.075
aces+0.071
nes+0.070
suit+0.065
ode+0.065
Neg logit contributions
 the-0.013
 its-0.012
 loud-0.008
 scared-0.008
 angry-0.008
Tokennes
Token ID2516
Feature activation-7.05e-02

Pos logit contributions
aces+0.095
 sneakers+0.078
 shorts+0.078
Whe+0.071
shirts+0.069
Neg logit contributions
othe-0.403
 scared-0.399
 swollen-0.387
 faint-0.386
 frightening-0.381
Tokendays
Token ID12545
Feature activation-7.44e-02

Pos logit contributions
uries+0.086
nes+0.083
aces+0.078
coat+0.064
shirts+0.062
Neg logit contributions
 swollen-0.181
othe-0.171
 scared-0.167
 frightening-0.167
 its-0.164
Token she
Token ID673
Feature activation-2.68e-02

Pos logit contributions
uries+0.236
aces+0.227
nes+0.226
aults+0.215
Whe+0.212
Neg logit contributions
 was-0.041
 scared-0.040
 its-0.033
 the-0.033
 shaking-0.028
Token did
Token ID750
Feature activation-3.19e-02

Pos logit contributions
uries+0.082
aces+0.079
nes+0.076
Whe+0.073
aults+0.071
Neg logit contributions
 scared-0.021
 worried-0.017
 swollen-0.017
 was-0.015
 surprised-0.015
Token something
Token ID1223
Feature activation+1.34e-02

Pos logit contributions
uries+0.096
aces+0.094
nes+0.093
Whe+0.088
coat+0.086
Neg logit contributions
 its-0.022
 scared-0.021
 swollen-0.020
 faint-0.018
 angry-0.018
Token saw
Token ID2497
Feature activation+6.70e-02

Pos logit contributions
 swollen+0.009
othe+0.008
 frightening+0.007
 its+0.007
 scared+0.007
Neg logit contributions
uries-0.025
nes-0.025
aces-0.024
Whe-0.022
coat-0.022
Token lots
Token ID6041
Feature activation-1.22e-02

Pos logit contributions
 swollen+0.088
othe+0.081
 scared+0.077
 frightening+0.075
 its+0.073
Neg logit contributions
uries-0.160
nes-0.157
aces-0.157
Whe-0.144
shirts-0.141
Token of
Token ID286
Feature activation+4.40e-04

Pos logit contributions
uries+0.036
nes+0.036
aces+0.035
Whe+0.033
coat+0.032
Neg logit contributions
 swollen-0.009
 scared-0.008
othe-0.008
 its-0.007
 frightening-0.007
Token people
Token ID661
Feature activation-1.56e-02

Pos logit contributions
 swollen+0.001
 scared+0.001
othe+0.001
 frightening+0.001
 its+0.001
Neg logit contributions
uries-0.001
nes-0.001
aces-0.001
Whe-0.001
coat-0.001
Token having
Token ID1719
Feature activation-5.46e-02

Pos logit contributions
uries+0.042
nes+0.041
aces+0.041
Whe+0.038
shirts+0.037
Neg logit contributions
 scared-0.016
 swollen-0.016
othe-0.015
 screaming-0.014
 worried-0.014
Token pic
Token ID8301
Feature activation-1.45e-01

Pos logit contributions
uries+0.145
nes+0.139
aces+0.134
Whe+0.127
atmeal+0.123
Neg logit contributions
 scared-0.067
 its-0.062
 angry-0.060
 dangerous-0.060
 the-0.058
Tokenn
Token ID77
Feature activation-6.76e-02

Pos logit contributions
uries+0.230
nes+0.222
aces+0.222
Whe+0.196
shirts+0.186
Neg logit contributions
 swollen-0.302
othe-0.289
 scared-0.282
 its-0.280
 Andy-0.277
Tokenics
Token ID873
Feature activation-2.60e-02

Pos logit contributions
uries+0.092
nes+0.088
aces+0.087
Whe+0.073
coat+0.071
Neg logit contributions
 swollen-0.159
othe-0.151
 scared-0.150
 its-0.147
 frightening-0.147
Token and
Token ID290
Feature activation+8.05e-04

Pos logit contributions
uries+0.086
nes+0.083
aces+0.083
Whe+0.083
shirts+0.079
Neg logit contributions
 scared-0.012
 its-0.012
 dangerous-0.009
 the-0.009
 angry-0.007
Token walking
Token ID6155
Feature activation-3.85e-02

Pos logit contributions
 swollen+0.001
 scared+0.001
othe+0.001
 its+0.001
 frightening+0.001
Neg logit contributions
uries-0.002
nes-0.002
aces-0.002
Whe-0.002
coat-0.002
Token their
Token ID511
Feature activation-4.12e-02

Pos logit contributions
uries+0.105
aces+0.102
nes+0.101
Whe+0.094
atmeal+0.093
Neg logit contributions
 its-0.040
 scared-0.039
 angry-0.034
 Andy-0.034
 swollen-0.034
Tokenmy
Token ID1820
Feature activation+2.65e-02

Pos logit contributions
uries+0.009
aces+0.009
nes+0.009
coat+0.008
 parties+0.008
Neg logit contributions
 swollen-0.009
othe-0.009
 frightening-0.008
 faint-0.008
 demanding-0.008
Token.
Token ID13
Feature activation-2.21e-02

Pos logit contributions
 swollen+0.044
 frightening+0.041
othe+0.041
 faint+0.038
 swarm+0.037
Neg logit contributions
uries-0.049
aces-0.049
nes-0.046
coat-0.045
 guys-0.042
Token Tim
Token ID5045
Feature activation-7.59e-04

Pos logit contributions
uries+0.056
aces+0.055
Whe+0.052
nes+0.052
coat+0.051
Neg logit contributions
 swollen-0.026
othe-0.022
 frightening-0.022
 scared-0.021
 shocked-0.021
Tokenmy
Token ID1820
Feature activation+2.39e-02

Pos logit contributions
uries+0.002
aces+0.002
nes+0.002
Whe+0.001
coat+0.001
Neg logit contributions
 swollen-0.001
othe-0.001
 frightening-0.001
 scared-0.001
 its-0.001
Token loved
Token ID6151
Feature activation-3.16e-04

Pos logit contributions
 swollen+0.033
othe+0.030
 frightening+0.028
 demanding+0.027
 faint+0.026
Neg logit contributions
uries-0.056
nes-0.055
aces-0.055
coat-0.050
Whe-0.050
Token Wed
Token ID3300
Feature activation-1.45e-01

Pos logit contributions
uries+0.001
nes+0.001
aces+0.001
Whe+0.001
 parties+0.001
Neg logit contributions
 swollen-0.000
othe-0.000
 scared-0.000
 frightening-0.000
 its-0.000
Tokennes
Token ID2516
Feature activation-7.14e-02

Pos logit contributions
aces+0.079
uries+0.069
Whe+0.064
 sneakers+0.048
shirts+0.047
Neg logit contributions
 swollen-0.431
othe-0.430
 scared-0.423
 frightening-0.408
 worried-0.406
Tokendays
Token ID12545
Feature activation-6.73e-02

Pos logit contributions
uries+0.087
nes+0.084
aces+0.078
coat+0.064
 parties+0.062
Neg logit contributions
 swollen-0.184
othe-0.173
 frightening-0.171
 scared-0.170
 its-0.168
Token because
Token ID780
Feature activation-6.07e-03

Pos logit contributions
uries+0.205
aces+0.201
nes+0.201
Whe+0.188
aults+0.182
Neg logit contributions
 scared-0.039
othe-0.037
 its-0.035
 Andy-0.034
 swollen-0.031
Token that
Token ID326
Feature activation+6.80e-03

Pos logit contributions
uries+0.018
aces+0.018
nes+0.018
Whe+0.016
coat+0.016
Neg logit contributions
 swollen-0.004
 scared-0.004
othe-0.004
 its-0.004
 Andy-0.003
Token was
Token ID373
Feature activation-5.85e-04

Pos logit contributions
 swollen+0.007
 scared+0.006
 its+0.006
 shaking+0.005
 worried+0.005
Neg logit contributions
uries-0.020
nes-0.019
aces-0.019
Whe-0.017
aults-0.017
Token.
Token ID13
Feature activation-2.02e-02

Pos logit contributions
uries+0.035
aces+0.034
nes+0.034
coat+0.031
 parties+0.030
Neg logit contributions
 swollen-0.026
othe-0.023
 frightening-0.023
 Farmer-0.021
 shocked-0.021
Token She
Token ID1375
Feature activation+1.34e-03

Pos logit contributions
uries+0.057
nes+0.056
aces+0.055
Whe+0.051
shirts+0.050
Neg logit contributions
 swollen-0.019
othe-0.017
 scared-0.017
 its-0.016
 Andy-0.016
Token loved
Token ID6151
Feature activation-3.64e-03

Pos logit contributions
 swollen+0.002
othe+0.002
 frightening+0.002
 Farmer+0.002
 Andy+0.002
Neg logit contributions
nes-0.003
uries-0.003
aces-0.003
coat-0.003
Whe-0.003
Token to
Token ID284
Feature activation-6.81e-03

Pos logit contributions
uries+0.010
nes+0.010
aces+0.010
Whe+0.009
shirts+0.009
Neg logit contributions
 swollen-0.003
othe-0.003
 scared-0.003
 frightening-0.003
 its-0.003
Token have
Token ID423
Feature activation-4.36e-02

Pos logit contributions
uries+0.017
nes+0.016
aces+0.016
Whe+0.014
suit+0.014
Neg logit contributions
 its-0.009
 swollen-0.009
 scared-0.009
 Andy-0.008
 faint-0.008
Token pic
Token ID8301
Feature activation-1.43e-01

Pos logit contributions
uries+0.108
nes+0.103
aces+0.097
suit+0.090
coat+0.085
Neg logit contributions
 its-0.063
 the-0.062
 scared-0.058
 loud-0.057
 dangerous-0.057
Tokenn
Token ID77
Feature activation-7.82e-02

Pos logit contributions
uries+0.214
nes+0.206
aces+0.203
Whe+0.174
coat+0.169
Neg logit contributions
 swollen-0.318
othe-0.302
 scared-0.300
 its-0.293
 frightening-0.293
Tokenics
Token ID873
Feature activation-3.09e-02

Pos logit contributions
uries+0.111
nes+0.104
aces+0.104
Whe+0.089
shirts+0.086
Neg logit contributions
 swollen-0.179
othe-0.171
 scared-0.170
 its-0.167
 frightening-0.164
Token with
Token ID351
Feature activation-3.38e-02

Pos logit contributions
uries+0.101
nes+0.097
aces+0.095
Whe+0.093
shirts+0.091
Neg logit contributions
 its-0.020
 scared-0.017
 the-0.016
 dangerous-0.013
 angry-0.012
Token her
Token ID607
Feature activation-4.44e-02

Pos logit contributions
uries+0.095
nes+0.093
aces+0.091
Whe+0.085
coat+0.083
Neg logit contributions
 swollen-0.033
 its-0.029
 scared-0.029
othe-0.029
 Andy-0.026
Token family
Token ID1641
Feature activation-1.97e-02

Pos logit contributions
uries+0.118
nes+0.114
aces+0.113
Whe+0.105
shirts+0.102
Neg logit contributions
 swollen-0.051
 scared-0.046
othe-0.044
 its-0.043
 Andy-0.042
Tokenmy
Token ID1820
Feature activation+2.45e-02

Pos logit contributions
aces+0.012
uries+0.012
nes+0.011
coat+0.011
 parties+0.010
Neg logit contributions
 swollen-0.009
othe-0.008
 frightening-0.008
 Farmer-0.007
 faint-0.007
Token.
Token ID13
Feature activation-2.31e-02

Pos logit contributions
 swollen+0.041
 frightening+0.038
othe+0.038
 faint+0.035
 Farmer+0.034
Neg logit contributions
uries-0.046
aces-0.046
nes-0.043
coat-0.041
shirts-0.039
Token Tim
Token ID5045
Feature activation-1.12e-03

Pos logit contributions
uries+0.056
aces+0.056
Whe+0.053
nes+0.052
coat+0.052
Neg logit contributions
 swollen-0.029
 frightening-0.025
othe-0.024
 shocked-0.023
 scared-0.023
Tokenmy
Token ID1820
Feature activation+2.17e-02

Pos logit contributions
uries+0.002
aces+0.002
nes+0.002
Whe+0.002
coat+0.002
Neg logit contributions
 swollen-0.002
othe-0.002
 frightening-0.001
 scared-0.001
 its-0.001
Token loved
Token ID6151
Feature activation-2.90e-03

Pos logit contributions
 swollen+0.028
othe+0.026
 frightening+0.024
 scared+0.023
 demanding+0.023
Neg logit contributions
uries-0.052
nes-0.052
aces-0.052
coat-0.047
Whe-0.047
Token Wed
Token ID3300
Feature activation-1.42e-01

Pos logit contributions
uries+0.008
aces+0.008
nes+0.008
 parties+0.007
Whe+0.007
Neg logit contributions
 swollen-0.003
othe-0.002
 scared-0.002
 frightening-0.002
 its-0.002
Tokennes
Token ID2516
Feature activation-7.19e-02

Pos logit contributions
aces+0.103
uries+0.099
Whe+0.088
nes+0.079
shirts+0.073
Neg logit contributions
 swollen-0.402
othe-0.398
 scared-0.393
 frightening-0.378
 Andy-0.376
Tokendays
Token ID12545
Feature activation-7.04e-02

Pos logit contributions
uries+0.084
nes+0.082
aces+0.075
coat+0.061
 parties+0.060
Neg logit contributions
 swollen-0.189
othe-0.177
 frightening-0.176
 scared-0.173
 Farmer-0.173
Token because
Token ID780
Feature activation-1.03e-02

Pos logit contributions
uries+0.222
aces+0.217
nes+0.216
Whe+0.205
aults+0.199
Neg logit contributions
 scared-0.035
 its-0.031
othe-0.030
 Andy-0.029
 the-0.025
Token that
Token ID326
Feature activation+3.15e-03

Pos logit contributions
uries+0.030
nes+0.029
aces+0.029
Whe+0.027
coat+0.027
Neg logit contributions
 swollen-0.008
othe-0.007
 scared-0.007
 its-0.007
 frightening-0.007
Token was
Token ID373
Feature activation-4.80e-03

Pos logit contributions
 swollen+0.003
 scared+0.003
 its+0.003
othe+0.003
 shaking+0.003
Neg logit contributions
uries-0.009
nes-0.009
aces-0.009
Whe-0.008
aults-0.008
Token the
Token ID262
Feature activation-1.36e-02

Pos logit contributions
uries+0.007
nes+0.007
aces+0.007
Whe+0.007
coat+0.007
Neg logit contributions
 swollen-0.001
othe-0.001
 scared-0.001
 its-0.001
 frightening-0.001
Token warm
Token ID5814
Feature activation-2.13e-02

Pos logit contributions
uries+0.037
aces+0.035
nes+0.034
shirts+0.030
Whe+0.030
Neg logit contributions
 swollen-0.016
 its-0.015
 scared-0.015
 burning-0.015
 dangerous-0.014
Token sunshine
Token ID34488
Feature activation-1.30e-02

Pos logit contributions
uries+0.053
nes+0.051
aces+0.048
Whe+0.044
shirts+0.043
Neg logit contributions
 its-0.028
 swollen-0.027
 scared-0.026
 burning-0.025
 Andy-0.025
Token and
Token ID290
Feature activation-9.15e-03

Pos logit contributions
uries+0.043
nes+0.042
Whe+0.040
aces+0.040
shirts+0.039
Neg logit contributions
 scared-0.006
 its-0.005
 burning-0.004
 swollen-0.004
 the-0.004
Token having
Token ID1719
Feature activation-5.82e-02

Pos logit contributions
uries+0.023
nes+0.022
aces+0.022
Whe+0.020
 parties+0.020
Neg logit contributions
 swollen-0.011
othe-0.010
 scared-0.010
 frightening-0.010
 its-0.010
Token pic
Token ID8301
Feature activation-1.40e-01

Pos logit contributions
uries+0.163
aces+0.151
nes+0.150
suit+0.142
coat+0.137
Neg logit contributions
 its-0.065
 the-0.063
 scared-0.062
 loud-0.056
 Andy-0.054
Tokenn
Token ID77
Feature activation-6.76e-02

Pos logit contributions
uries+0.228
aces+0.221
nes+0.218
Whe+0.197
shirts+0.187
Neg logit contributions
 swollen-0.281
othe-0.271
 its-0.264
 scared-0.260
 Andy-0.259
Tokenics
Token ID873
Feature activation-2.11e-02

Pos logit contributions
uries+0.100
nes+0.093
aces+0.091
Whe+0.081
shirts+0.078
Neg logit contributions
 swollen-0.151
othe-0.145
 scared-0.144
 its-0.141
 Andy-0.138
Token with
Token ID351
Feature activation-2.57e-02

Pos logit contributions
uries+0.069
nes+0.066
Whe+0.066
aces+0.066
shirts+0.063
Neg logit contributions
 its-0.011
 scared-0.011
 the-0.009
 dangerous-0.008
 swollen-0.007
Token her
Token ID607
Feature activation-3.76e-02

Pos logit contributions
uries+0.070
nes+0.069
aces+0.068
Whe+0.063
coat+0.062
Neg logit contributions
 swollen-0.026
othe-0.023
 scared-0.023
 its-0.022
 Andy-0.021
Token family
Token ID1641
Feature activation-1.32e-02

Pos logit contributions
uries+0.098
nes+0.095
aces+0.093
Whe+0.089
shirts+0.085
Neg logit contributions
 swollen-0.044
 scared-0.042
othe-0.039
 its-0.038
 Andy-0.037
Token children
Token ID1751
Feature activation-1.61e-02

Pos logit contributions
uries+0.047
nes+0.046
aces+0.046
coat+0.041
Whe+0.041
Neg logit contributions
 scared-0.025
 swollen-0.023
 its-0.022
 angry-0.022
 worried-0.022
Token running
Token ID2491
Feature activation-3.59e-02

Pos logit contributions
uries+0.044
aces+0.043
nes+0.043
Whe+0.041
shirts+0.039
Neg logit contributions
 scared-0.015
 swollen-0.014
 screaming-0.014
 worried-0.013
 crying-0.013
Token around
Token ID1088
Feature activation+2.03e-02

Pos logit contributions
uries+0.096
aces+0.095
nes+0.094
atmeal+0.089
Whe+0.088
Neg logit contributions
 scared-0.039
 its-0.038
 Andy-0.035
 angry-0.034
 frightening-0.034
Token,
Token ID11
Feature activation-8.85e-03

Pos logit contributions
 swollen+0.018
othe+0.016
 scared+0.015
 frightening+0.014
 its+0.014
Neg logit contributions
uries-0.057
nes-0.056
aces-0.055
Whe-0.051
 parties-0.051
Token families
Token ID4172
Feature activation-2.15e-02

Pos logit contributions
uries+0.024
nes+0.024
aces+0.023
Whe+0.021
shirts+0.021
Neg logit contributions
 swollen-0.009
 scared-0.009
othe-0.008
 its-0.008
 screaming-0.008
Token pic
Token ID8301
Feature activation-1.38e-01

Pos logit contributions
uries+0.063
nes+0.062
aces+0.062
Whe+0.058
shirts+0.056
Neg logit contributions
 swollen-0.017
 scared-0.016
othe-0.014
 its-0.014
 screaming-0.014
Tokenn
Token ID77
Feature activation-5.67e-02

Pos logit contributions
uries+0.199
nes+0.192
aces+0.182
Whe+0.154
 parties+0.154
Neg logit contributions
 swollen-0.321
 scared-0.304
othe-0.304
 frightening-0.294
 its-0.292
Tokenicking
Token ID7958
Feature activation+4.32e-02

Pos logit contributions
uries+0.086
nes+0.086
aces+0.084
coat+0.071
Whe+0.070
Neg logit contributions
 swollen-0.124
othe-0.116
 frightening-0.114
 scared-0.113
 its-0.112
Token,
Token ID11
Feature activation-1.41e-02

Pos logit contributions
 swollen+0.045
 scared+0.041
othe+0.041
 its+0.040
 frightening+0.038
Neg logit contributions
uries-0.115
nes-0.113
aces-0.111
Whe-0.104
shirts-0.101
Token couples
Token ID11886
Feature activation-3.07e-02

Pos logit contributions
uries+0.039
nes+0.039
aces+0.038
Whe+0.035
shirts+0.035
Neg logit contributions
 swollen-0.013
 scared-0.013
 its-0.011
othe-0.011
 frightening-0.011
Token st
Token ID336
Feature activation-6.74e-02

Pos logit contributions
uries+0.087
nes+0.086
aces+0.084
Whe+0.081
shirts+0.077
Neg logit contributions
 swollen-0.027
 scared-0.026
 screaming-0.024
 its-0.023
 trembling-0.023
Token girl
Token ID2576
Feature activation+8.64e-03

Pos logit contributions
 swollen+0.025
 scared+0.024
othe+0.022
 its+0.022
 frightening+0.022
Neg logit contributions
uries-0.047
nes-0.046
aces-0.045
Whe-0.041
shirts-0.040
Token named
Token ID3706
Feature activation-4.96e-02

Pos logit contributions
 swollen+0.014
 frightening+0.013
othe+0.013
 faint+0.012
 painful+0.012
Neg logit contributions
nes-0.017
uries-0.017
aces-0.016
coat-0.016
 parties-0.015
Token Lily
Token ID20037
Feature activation-1.49e-02

Pos logit contributions
uries+0.114
nes+0.113
aces+0.110
Whe+0.100
coat+0.100
Neg logit contributions
 swollen-0.071
othe-0.065
 scared-0.063
 frightening-0.062
 its-0.061
Token who
Token ID508
Feature activation+1.01e-02

Pos logit contributions
uries+0.032
nes+0.031
aces+0.031
coat+0.028
 parties+0.028
Neg logit contributions
 swollen-0.023
othe-0.021
 frightening-0.021
 Farmer-0.019
 shocked-0.019
Token loved
Token ID6151
Feature activation+2.12e-04

Pos logit contributions
 swollen+0.014
othe+0.013
 frightening+0.012
 Andy+0.012
 Farmer+0.012
Neg logit contributions
uries-0.022
nes-0.022
aces-0.021
coat-0.020
 parties-0.020
Token pic
Token ID8301
Feature activation-1.38e-01

Pos logit contributions
 swollen+0.000
othe+0.000
 scared+0.000
 its+0.000
 frightening+0.000
Neg logit contributions
uries-0.001
nes-0.001
aces-0.001
Whe-0.001
coat-0.001
Tokenn
Token ID77
Feature activation-7.62e-02

Pos logit contributions
uries+0.194
nes+0.189
aces+0.179
Whe+0.151
coat+0.149
Neg logit contributions
 swollen-0.322
 scared-0.309
othe-0.304
 frightening-0.299
 its-0.295
Tokenics
Token ID873
Feature activation-2.23e-02

Pos logit contributions
uries+0.102
nes+0.098
aces+0.096
Whe+0.081
coat+0.078
Neg logit contributions
 swollen-0.181
othe-0.172
 scared-0.171
 frightening-0.167
 its-0.167
Token.
Token ID13
Feature activation-1.53e-02

Pos logit contributions
uries+0.079
nes+0.076
aces+0.073
Whe+0.073
shirts+0.071
Neg logit contributions
 scared-0.009
 its-0.008
 the-0.006
 angry-0.004
 worried-0.004
Token She
Token ID1375
Feature activation+1.60e-03

Pos logit contributions
nes+0.048
uries+0.048
aces+0.046
aults+0.045
ault+0.044
Neg logit contributions
 its-0.008
-0.008
 swollen-0.007
 The-0.007
 scared-0.007
Token would
Token ID561
Feature activation-3.70e-02

Pos logit contributions
 scared+0.002
 swollen+0.002
 its+0.001
othe+0.001
 worried+0.001
Neg logit contributions
uries-0.004
aces-0.004
nes-0.004
Whe-0.004
shirts-0.004
Tokenle
Token ID293
Feature activation+1.11e-02

Pos logit contributions
uries+0.135
nes+0.126
aces+0.121
coat+0.120
Whe+0.113
Neg logit contributions
 its-0.076
 scared-0.075
 angry-0.071
othe-0.068
 swollen-0.067
Token balls
Token ID11333
Feature activation-7.03e-02

Pos logit contributions
 its+0.015
 scared+0.014
 swollen+0.014
 Bee+0.013
 Andy+0.013
Neg logit contributions
uries-0.026
nes-0.026
aces-0.024
coat-0.023
Whe-0.023
Token and
Token ID290
Feature activation-2.12e-02

Pos logit contributions
uries+0.209
nes+0.203
Whe+0.194
aces+0.190
shirts+0.188
Neg logit contributions
 its-0.054
 scared-0.053
othe-0.048
s-0.044
 the-0.043
Token ride
Token ID6594
Feature activation-3.32e-02

Pos logit contributions
uries+0.049
nes+0.049
aces+0.047
coat+0.043
Whe+0.042
Neg logit contributions
 swollen-0.029
 scared-0.028
 its-0.028
othe-0.027
 Andy-0.026
Token a
Token ID257
Feature activation-2.27e-02

Pos logit contributions
uries+0.092
nes+0.089
aces+0.087
coat+0.082
Whe+0.082
Neg logit contributions
 its-0.031
 scared-0.029
 swollen-0.028
 Andy-0.028
othe-0.028
Token un
Token ID555
Feature activation-1.38e-01

Pos logit contributions
uries+0.059
aces+0.059
nes+0.056
Whe+0.052
shirts+0.051
Neg logit contributions
 swollen-0.025
 scared-0.024
 angry-0.022
 its-0.022
 frightening-0.022
Tokenicycle
Token ID35298
Feature activation-2.83e-02

Pos logit contributions
aces+0.268
uries+0.262
nes+0.245
Whe+0.236
 parties+0.218
Neg logit contributions
 swollen-0.216
 scared-0.209
oc-0.206
 surprised-0.202
 angry-0.202
Token.
Token ID13
Feature activation-1.08e-02

Pos logit contributions
uries+0.092
nes+0.090
Whe+0.087
aces+0.086
shirts+0.082
Neg logit contributions
 scared-0.012
s-0.010
 its-0.009
othe-0.009
 angry-0.008
Token One
Token ID1881
Feature activation-2.14e-02

Pos logit contributions
uries+0.033
nes+0.032
aces+0.032
Whe+0.030
coat+0.030
Neg logit contributions
 swollen-0.007
othe-0.006
 scared-0.006
 frightening-0.005
 its-0.005
Token day
Token ID1110
Feature activation-2.76e-02

Pos logit contributions
uries+0.043
nes+0.042
aces+0.042
Whe+0.039
shirts+0.037
Neg logit contributions
 swollen-0.037
othe-0.034
 scared-0.032
 frightening-0.032
 its-0.031
Token,
Token ID11
Feature activation-2.48e-02

Pos logit contributions
aces+0.068
uries+0.067
nes+0.065
Whe+0.064
 parties+0.062
Neg logit contributions
 swollen-0.034
 frightening-0.026
othe-0.026
 faint-0.025
 silent-0.025
Token the
Token ID262
Feature activation-7.12e-03

Pos logit contributions
uries+0.024
nes+0.023
aces+0.023
Whe+0.021
coat+0.021
Neg logit contributions
 swollen-0.009
othe-0.008
 scared-0.008
 frightening-0.007
 its-0.007
Token cliff
Token ID19516
Feature activation+9.21e-03

Pos logit contributions
uries+0.015
nes+0.014
aces+0.014
Whe+0.013
 parties+0.013
Neg logit contributions
 swollen-0.012
othe-0.011
 Andy-0.010
 its-0.010
 scared-0.010
Token.
Token ID13
Feature activation+1.51e-03

Pos logit contributions
 swollen+0.008
othe+0.007
 scared+0.007
 its+0.007
 frightening+0.006
Neg logit contributions
uries-0.026
nes-0.026
aces-0.025
Whe-0.024
coat-0.023
Token On
Token ID1550
Feature activation-8.61e-03

Pos logit contributions
 swollen+0.002
 burning+0.002
 frightening+0.002
 trembling+0.002
othe+0.002
Neg logit contributions
aces-0.004
uries-0.004
nes-0.003
Whe-0.003
shirts-0.003
Token many
Token ID867
Feature activation-2.00e-02

Pos logit contributions
aces+0.022
nes+0.022
uries+0.021
Whe+0.020
 parties+0.020
Neg logit contributions
 swollen-0.009
 frightening-0.008
 scared-0.007
othe-0.007
 screaming-0.007
Token occasions
Token ID12432
Feature activation-1.37e-01

Pos logit contributions
aces+0.039
nes+0.036
uries+0.036
 parties+0.035
Whe+0.034
Neg logit contributions
 swollen-0.034
othe-0.033
 Andy-0.030
 Farmer-0.029
 burning-0.029
Token,
Token ID11
Feature activation-2.44e-02

Pos logit contributions
uries+0.402
nes+0.393
aces+0.389
Whe+0.369
shirts+0.363
Neg logit contributions
 swollen-0.097
 scared-0.094
othe-0.092
 its-0.087
 Andy-0.082
Token he
Token ID339
Feature activation-1.04e-02

Pos logit contributions
uries+0.065
aces+0.065
nes+0.064
Whe+0.061
 parties+0.059
Neg logit contributions
 swollen-0.025
othe-0.020
 frightening-0.018
 scared-0.018
 screaming-0.017
Token had
Token ID550
Feature activation-1.92e-02

Pos logit contributions
uries+0.029
nes+0.028
aces+0.028
Whe+0.026
shirts+0.026
Neg logit contributions
 scared-0.010
 swollen-0.010
 its-0.008
othe-0.008
 frightening-0.008
Token to
Token ID284
Feature activation+4.09e-03

Pos logit contributions
uries+0.055
nes+0.053
aces+0.053
Whe+0.049
shirts+0.049
Neg logit contributions
 swollen-0.016
 scared-0.016
 its-0.016
 frightening-0.014
othe-0.014
Token start
Token ID923
Feature activation+1.55e-02

Pos logit contributions
 its+0.005
 scared+0.005
 the+0.005
 swollen+0.005
 Andy+0.005
Neg logit contributions
uries-0.010
aces-0.010
nes-0.009
shirts-0.009
suit-0.009
Token could
Token ID714
Feature activation-8.18e-03

Pos logit contributions
uries+0.016
nes+0.016
aces+0.016
coat+0.014
Whe+0.014
Neg logit contributions
 swollen-0.007
othe-0.007
 frightening-0.006
 its-0.006
 Andy-0.006
Token not
Token ID407
Feature activation+5.07e-02

Pos logit contributions
uries+0.022
nes+0.022
aces+0.021
Whe+0.019
 parties+0.019
Neg logit contributions
 swollen-0.009
othe-0.007
 frightening-0.007
 Andy-0.007
 its-0.007
Token find
Token ID1064
Feature activation+1.34e-02

Pos logit contributions
 scared+0.070
 swollen+0.068
othe+0.067
 its+0.065
 burning+0.064
Neg logit contributions
uries-0.115
aces-0.114
nes-0.110
Whe-0.106
coat-0.103
Token his
Token ID465
Feature activation-2.74e-02

Pos logit contributions
 swollen+0.016
othe+0.014
 scared+0.014
 its+0.013
 frightening+0.013
Neg logit contributions
uries-0.034
nes-0.034
aces-0.033
Whe-0.031
coat-0.030
Token famous
Token ID5863
Feature activation-3.21e-02

Pos logit contributions
uries+0.055
nes+0.054
aces+0.053
Whe+0.048
coat+0.047
Neg logit contributions
 swollen-0.047
 scared-0.045
othe-0.044
 its-0.043
 frightening-0.043
Token las
Token ID39990
Feature activation-1.37e-01

Pos logit contributions
nes+0.065
uries+0.064
aces+0.062
Whe+0.057
coat+0.054
Neg logit contributions
 swollen-0.054
 scared-0.053
 burning-0.052
 frightening-0.051
 its-0.050
Tokenagna
Token ID48669
Feature activation-4.44e-02

Pos logit contributions
nes+0.161
aces+0.156
Whe+0.152
uries+0.141
coat+0.137
Neg logit contributions
oc-0.307
 burning-0.299
 burns-0.286
 swollen-0.283
 Farmer-0.282
Token.
Token ID13
Feature activation+3.46e-03

Pos logit contributions
Whe+0.151
uries+0.146
nes+0.144
aces+0.142
shirts+0.136
Neg logit contributions
 burning-0.012
 scared-0.010
 its-0.010
:-0.007
 swollen-0.007
Token
Token ID198
Feature activation+1.32e-02

Pos logit contributions
 swollen+0.002
 scared+0.002
 Andy+0.002
othe+0.002
 its+0.002
Neg logit contributions
uries-0.010
nes-0.010
aces-0.010
coat-0.009
 parties-0.009
Token
Token ID198
Feature activation+1.16e-02

Pos logit contributions
 swollen+0.015
 frightening+0.012
 burning+0.012
 cracking+0.012
 trembling+0.012
Neg logit contributions
nes-0.035
uries-0.034
aces-0.032
Whe-0.031
shirts-0.030
TokenThe
Token ID464
Feature activation+4.19e-02

Pos logit contributions
 swollen+0.013
othe+0.011
 scared+0.011
 its+0.011
 frightening+0.011
Neg logit contributions
uries-0.031
aces-0.030
nes-0.030
Whe-0.027
coat-0.027
Token liked
Token ID8288
Feature activation-5.11e-03

Pos logit contributions
uries+0.086
nes+0.082
aces+0.082
Whe+0.078
shirts+0.074
Neg logit contributions
 scared-0.035
 swollen-0.032
 its-0.031
 worried-0.030
 Andy-0.030
Token to
Token ID284
Feature activation-4.95e-03

Pos logit contributions
uries+0.014
nes+0.014
aces+0.014
 parties+0.013
Whe+0.013
Neg logit contributions
 swollen-0.005
othe-0.004
 scared-0.004
 frightening-0.004
 its-0.003
Token play
Token ID711
Feature activation-7.01e-02

Pos logit contributions
uries+0.012
nes+0.012
aces+0.012
Whe+0.011
shirts+0.010
Neg logit contributions
 Andy-0.006
 scared-0.006
 its-0.006
 swollen-0.006
 Bee-0.005
Token and
Token ID290
Feature activation-1.44e-02

Pos logit contributions
uries+0.185
nes+0.177
aces+0.175
shirts+0.162
Whe+0.161
Neg logit contributions
 swollen-0.074
 scared-0.073
 its-0.073
othe-0.070
 Andy-0.069
Token have
Token ID423
Feature activation-4.14e-02

Pos logit contributions
uries+0.030
nes+0.030
aces+0.029
Whe+0.026
coat+0.026
Neg logit contributions
 swollen-0.023
othe-0.022
 scared-0.021
 its-0.021
 frightening-0.021
Token pic
Token ID8301
Feature activation-1.36e-01

Pos logit contributions
uries+0.118
nes+0.114
aces+0.110
coat+0.102
suit+0.100
Neg logit contributions
 its-0.043
 scared-0.042
 the-0.040
 dangerous-0.039
 loud-0.038
Tokenn
Token ID77
Feature activation-6.91e-02

Pos logit contributions
uries+0.208
nes+0.200
aces+0.199
Whe+0.173
coat+0.166
Neg logit contributions
 swollen-0.296
othe-0.282
 scared-0.278
 its-0.274
 frightening-0.271
Tokenics
Token ID873
Feature activation-2.59e-02

Pos logit contributions
uries+0.096
nes+0.091
aces+0.090
Whe+0.077
shirts+0.074
Neg logit contributions
 swollen-0.160
othe-0.153
 scared-0.152
 its-0.149
 frightening-0.147
Token.
Token ID13
Feature activation-1.43e-02

Pos logit contributions
uries+0.080
nes+0.077
aces+0.077
Whe+0.075
shirts+0.073
Neg logit contributions
 scared-0.018
 its-0.017
 the-0.015
 dangerous-0.013
 angry-0.013
Token One
Token ID1881
Feature activation-1.94e-02

Pos logit contributions
uries+0.036
nes+0.035
aces+0.035
Whe+0.033
 parties+0.032
Neg logit contributions
 swollen-0.018
othe-0.016
 scared-0.015
 frightening-0.015
 burning-0.014
Token day
Token ID1110
Feature activation-2.71e-02

Pos logit contributions
nes+0.031
uries+0.031
aces+0.029
Whe+0.028
 parties+0.028
Neg logit contributions
 swollen-0.040
othe-0.036
 frightening-0.034
 demanding-0.034
 screaming-0.034
Token a
Token ID257
Feature activation+2.01e-02

Pos logit contributions
uries+0.005
nes+0.005
aces+0.005
coat+0.005
shirts+0.005
Neg logit contributions
 its-0.002
 swollen-0.002
 scared-0.002
 Andy-0.001
 Farmer-0.001
Token family
Token ID1641
Feature activation-1.16e-02

Pos logit contributions
 swollen+0.022
othe+0.021
 its+0.019
 scared+0.018
 frightening+0.018
Neg logit contributions
uries-0.051
nes-0.050
aces-0.049
Whe-0.046
coat-0.046
Token who
Token ID508
Feature activation+1.02e-02

Pos logit contributions
uries+0.029
nes+0.028
aces+0.027
 parties+0.026
shirts+0.025
Neg logit contributions
 swollen-0.015
othe-0.013
 frightening-0.012
 scared-0.012
 Andy-0.011
Token really
Token ID1107
Feature activation-6.75e-03

Pos logit contributions
 scared+0.009
 its+0.008
 swollen+0.007
 worried+0.007
othe+0.007
Neg logit contributions
uries-0.030
aces-0.029
nes-0.029
Whe-0.027
coat-0.026
Token enjoyed
Token ID8359
Feature activation-1.49e-02

Pos logit contributions
uries+0.019
aces+0.018
nes+0.018
Whe+0.017
coat+0.017
Neg logit contributions
 scared-0.006
 worried-0.005
 angry-0.005
 its-0.005
othe-0.005
Token pic
Token ID8301
Feature activation-1.36e-01

Pos logit contributions
uries+0.037
nes+0.036
aces+0.035
Whe+0.032
coat+0.032
Neg logit contributions
 swollen-0.019
 scared-0.017
othe-0.017
 its-0.017
 frightening-0.016
Tokenn
Token ID77
Feature activation-7.35e-02

Pos logit contributions
uries+0.192
nes+0.186
aces+0.179
 parties+0.149
Whe+0.149
Neg logit contributions
 swollen-0.317
 scared-0.302
othe-0.298
 frightening-0.293
 its-0.289
Tokenics
Token ID873
Feature activation-2.76e-02

Pos logit contributions
uries+0.093
nes+0.091
aces+0.089
Whe+0.072
coat+0.071
Neg logit contributions
 swollen-0.180
othe-0.171
 scared-0.170
 frightening-0.168
 its-0.165
Token.
Token ID13
Feature activation-1.74e-02

Pos logit contributions
uries+0.092
nes+0.089
aces+0.088
Whe+0.087
shirts+0.083
Neg logit contributions
 its-0.013
 scared-0.013
 the-0.009
othe-0.009
 swollen-0.009
Token They
Token ID1119
Feature activation-3.72e-02

Pos logit contributions
nes+0.052
aces+0.052
uries+0.052
aults+0.047
ault+0.047
Neg logit contributions
 Andy-0.011
 its-0.011
 Farmer-0.010
 scared-0.010
othe-0.010
Token loved
Token ID6151
Feature activation-5.22e-03

Pos logit contributions
uries+0.113
aces+0.108
nes+0.106
Whe+0.103
coat+0.098
Neg logit contributions
 scared-0.032
 its-0.028
 the-0.027
 worried-0.026
 knew-0.025
Token find
Token ID1064
Feature activation+6.49e-03

Pos logit contributions
uries+0.104
nes+0.101
aces+0.100
Whe+0.092
shirts+0.091
Neg logit contributions
 scared-0.059
 swollen-0.059
othe-0.055
 its-0.053
 angry-0.052
Token a
Token ID257
Feature activation-9.40e-03

Pos logit contributions
 swollen+0.009
othe+0.008
 frightening+0.007
 scared+0.007
 demanding+0.007
Neg logit contributions
uries-0.015
aces-0.015
nes-0.015
Whe-0.014
shirts-0.014
Token box
Token ID3091
Feature activation-9.50e-03

Pos logit contributions
uries+0.021
nes+0.021
aces+0.020
Whe+0.018
shirts+0.018
Neg logit contributions
 swollen-0.014
 scared-0.013
othe-0.013
 its-0.012
 frightening-0.012
Token of
Token ID286
Feature activation-4.44e-03

Pos logit contributions
uries+0.026
nes+0.025
aces+0.025
Whe+0.023
coat+0.023
Neg logit contributions
 swollen-0.010
othe-0.009
 scared-0.008
 frightening-0.008
 its-0.008
Token ch
Token ID442
Feature activation-2.54e-02

Pos logit contributions
uries+0.009
nes+0.009
aces+0.009
Whe+0.008
coat+0.008
Neg logit contributions
 swollen-0.007
 scared-0.007
othe-0.007
 its-0.007
 frightening-0.007
Tokenocol
Token ID4668
Feature activation-1.35e-01

Pos logit contributions
uries+0.060
nes+0.059
aces+0.057
Whe+0.053
coat+0.052
Neg logit contributions
 swollen-0.035
othe-0.033
 scared-0.032
 its-0.031
 frightening-0.031
Tokenates
Token ID689
Feature activation-3.68e-02

Pos logit contributions
uries+0.152
nes+0.145
aces+0.124
Whe+0.120
coat+0.110
Neg logit contributions
 swollen-0.345
othe-0.326
 its-0.326
 scared-0.321
 Andy-0.320
Token.
Token ID13
Feature activation-2.62e-03

Pos logit contributions
uries+0.112
nes+0.109
aces+0.107
Whe+0.104
shirts+0.100
Neg logit contributions
 swollen-0.025
 scared-0.022
othe-0.022
 its-0.022
 burning-0.020
Token
Token ID198
Feature activation+6.78e-03

Pos logit contributions
uries+0.007
nes+0.007
aces+0.007
Whe+0.006
shirts+0.006
Neg logit contributions
 swollen-0.003
 frightening-0.002
othe-0.002
 scared-0.002
 trembling-0.002
Token
Token ID198
Feature activation+5.53e-03

Pos logit contributions
 swollen+0.011
 frightening+0.010
 trembling+0.010
 cracking+0.009
 burning+0.009
Neg logit contributions
nes-0.015
uries-0.013
aces-0.012
Whe-0.012
 parties-0.012
Token"
Token ID1
Feature activation-5.30e-02

Pos logit contributions
 swollen+0.008
othe+0.007
 frightening+0.007
 trembling+0.007
 scared+0.007
Neg logit contributions
nes-0.013
uries-0.013
aces-0.012
Whe-0.012
shirts-0.011
TokenJimmy
Token ID40335
Feature activation-1.30e-02

Pos logit contributions
uries+0.177
nes+0.171
aces+0.168
shirts+0.157
coat+0.156
Neg logit contributions
 swollen-0.027
 scared-0.025
 its-0.024
othe-0.022
 Andy-0.022
Token was
Token ID373
Feature activation+6.92e-03

Pos logit contributions
uries+0.037
nes+0.037
aces+0.036
Whe+0.034
shirts+0.033
Neg logit contributions
 swollen-0.011
 scared-0.010
 its-0.009
othe-0.009
 Andy-0.009
Token always
Token ID1464
Feature activation+1.93e-02

Pos logit contributions
 swollen+0.008
 scared+0.008
 its+0.007
othe+0.007
 worried+0.007
Neg logit contributions
uries-0.018
nes-0.017
aces-0.017
Whe-0.016
shirts-0.016
Token excited
Token ID6568
Feature activation+1.52e-02

Pos logit contributions
 swollen+0.023
 scared+0.022
 its+0.020
othe+0.020
 worried+0.019
Neg logit contributions
uries-0.050
nes-0.048
aces-0.048
Whe-0.044
shirts-0.044
Token for
Token ID329
Feature activation-1.79e-02

Pos logit contributions
 scared+0.007
 its+0.007
 Andy+0.007
othe+0.007
 screaming+0.006
Neg logit contributions
uries-0.048
nes-0.048
aces-0.047
Whe-0.046
coat-0.043
Token Wed
Token ID3300
Feature activation-1.35e-01

Pos logit contributions
uries+0.053
aces+0.053
nes+0.053
coat+0.051
Whe+0.050
Neg logit contributions
 its-0.014
 the-0.012
 Andy-0.012
 scared-0.011
 demanding-0.011
Tokennes
Token ID2516
Feature activation-6.64e-02

Pos logit contributions
aces+0.070
Whe+0.051
uries+0.048
 sneakers+0.041
 shorts+0.036
Neg logit contributions
othe-0.409
 scared-0.401
 swollen-0.399
 frightening-0.390
 demanding-0.390
Tokendays
Token ID12545
Feature activation-6.21e-02

Pos logit contributions
uries+0.101
aces+0.097
nes+0.097
Whe+0.085
coat+0.081
Neg logit contributions
 swollen-0.143
othe-0.137
 scared-0.136
 its-0.133
 frightening-0.132
Token because
Token ID780
Feature activation-2.71e-03

Pos logit contributions
uries+0.200
aces+0.199
nes+0.199
Whe+0.196
shirts+0.180
Neg logit contributions
 scared-0.018
 Andy-0.018
:-0.017
 was-0.016
othe-0.016
Token it
Token ID340
Feature activation+1.22e-02

Pos logit contributions
uries+0.009
aces+0.008
nes+0.008
coat+0.008
Whe+0.008
Neg logit contributions
 swollen-0.001
 scared-0.001
 its-0.001
 Andy-0.001
othe-0.001
Token was
Token ID373
Feature activation+2.78e-03

Pos logit contributions
 swollen+0.007
othe+0.005
 scared+0.005
 its+0.005
 frightening+0.005
Neg logit contributions
uries-0.039
nes-0.038
aces-0.038
Whe-0.035
coat-0.035
Token pe
Token ID613
Feature activation-3.05e-02

Pos logit contributions
uries+0.044
nes+0.044
Whe+0.040
aces+0.040
coat+0.039
Neg logit contributions
 scared-0.024
 swollen-0.022
 angry-0.022
 worried-0.021
 its-0.021
Tokenaches
Token ID3694
Feature activation-2.85e-02

Pos logit contributions
uries+0.047
nes+0.045
aces+0.043
Whe+0.039
coat+0.038
Neg logit contributions
 swollen-0.066
 scared-0.063
othe-0.063
 its-0.061
 frightening-0.060
Token on
Token ID319
Feature activation-1.21e-02

Pos logit contributions
uries+0.091
Whe+0.088
nes+0.086
aces+0.084
shirts+0.082
Neg logit contributions
 its-0.015
 the-0.012
 scared-0.012
s-0.011
:-0.010
Token Mondays
Token ID49748
Feature activation-3.65e-02

Pos logit contributions
uries+0.038
nes+0.037
aces+0.037
Whe+0.035
coat+0.035
Neg logit contributions
 swollen-0.007
 its-0.007
 scared-0.006
othe-0.005
 Andy-0.005
Token,
Token ID11
Feature activation-1.63e-02

Pos logit contributions
Whe+0.102
nes+0.102
uries+0.100
aces+0.098
shirts+0.095
Neg logit contributions
,-0.025
 the-0.025
 its-0.024
 scared-0.023
 burning-0.023
Token Wed
Token ID3300
Feature activation-1.35e-01

Pos logit contributions
nes+0.043
aces+0.042
uries+0.042
Whe+0.038
shirts+0.038
Neg logit contributions
 scared-0.018
 its-0.017
 swollen-0.017
 Andy-0.016
 worried-0.015
Tokennes
Token ID2516
Feature activation-5.88e-02

Pos logit contributions
aces+0.089
ests+0.083
 shorts+0.082
 sneakers+0.080
shirts+0.075
Neg logit contributions
 scared-0.370
othe-0.366
 worried-0.359
 demanding-0.357
 shocked-0.355
Tokendays
Token ID12545
Feature activation-6.33e-02

Pos logit contributions
uries+0.087
nes+0.083
aces+0.082
Whe+0.070
coat+0.068
Neg logit contributions
 swollen-0.132
othe-0.125
 scared-0.124
 its-0.121
 frightening-0.121
Token,
Token ID11
Feature activation-1.46e-02

Pos logit contributions
uries+0.168
aces+0.163
nes+0.158
Whe+0.157
shirts+0.150
Neg logit contributions
 was-0.055
 its-0.054
 the-0.054
 scared-0.051
s-0.051
Token and
Token ID290
Feature activation-6.96e-03

Pos logit contributions
nes+0.042
uries+0.041
aces+0.040
Whe+0.038
shirts+0.038
Neg logit contributions
 its-0.012
 scared-0.012
 swollen-0.011
 Andy-0.010
 the-0.010
Token Fridays
Token ID45011
Feature activation+1.16e-02

Pos logit contributions
uries+0.017
aces+0.017
nes+0.017
coat+0.016
Whe+0.016
Neg logit contributions
 swollen-0.008
 scared-0.008
 its-0.008
othe-0.008
 Andy-0.008
TokenWhat
Token ID2061
Feature activation-2.35e-02

Pos logit contributions
aces+0.070
uries+0.069
nes+0.069
 parties+0.062
coat+0.061
Neg logit contributions
 swollen-0.031
 demanding-0.027
 scared-0.027
 faint-0.026
 its-0.026
Token's
Token ID338
Feature activation-2.80e-02

Pos logit contributions
uries+0.064
nes+0.063
aces+0.061
Whe+0.058
coat+0.057
Neg logit contributions
 swollen-0.023
 scared-0.022
 its-0.022
othe-0.021
 Andy-0.020
Token so
Token ID523
Feature activation+2.24e-02

Pos logit contributions
uries+0.079
nes+0.077
aces+0.077
Whe+0.071
coat+0.070
Neg logit contributions
 swollen-0.026
othe-0.023
 scared-0.022
 its-0.022
 frightening-0.021
Token special
Token ID2041
Feature activation-1.33e-02

Pos logit contributions
 swollen+0.030
othe+0.028
 scared+0.026
 its+0.026
 screaming+0.025
Neg logit contributions
uries-0.053
nes-0.052
aces-0.051
shirts-0.046
Whe-0.046
Token about
Token ID546
Feature activation+1.81e-02

Pos logit contributions
uries+0.038
nes+0.037
aces+0.037
Whe+0.035
shirts+0.033
Neg logit contributions
 scared-0.012
 swollen-0.011
 its-0.011
 frightening-0.011
othe-0.011
Token Wed
Token ID3300
Feature activation-1.35e-01

Pos logit contributions
 swollen+0.021
othe+0.020
 scared+0.018
 frightening+0.017
 shocked+0.016
Neg logit contributions
uries-0.046
aces-0.044
nes-0.044
 parties-0.041
shirts-0.040
Tokennes
Token ID2516
Feature activation-6.64e-02

Pos logit contributions
aces+0.059
uries+0.052
Whe+0.042
coat+0.033
 sneakers+0.033
Neg logit contributions
othe-0.412
 swollen-0.407
 scared-0.404
 frightening-0.395
 demanding-0.394
Tokendays
Token ID12545
Feature activation-6.08e-02

Pos logit contributions
uries+0.143
nes+0.140
aces+0.137
Whe+0.123
coat+0.122
Neg logit contributions
 swollen-0.105
othe-0.097
 scared-0.095
 frightening-0.092
 its-0.092
Token?"
Token ID1701
Feature activation+1.79e-02

Pos logit contributions
uries+0.178
aces+0.172
nes+0.167
Whe+0.164
aults+0.159
Neg logit contributions
 scared-0.046
 surprised-0.039
 the-0.039
:-0.038
 said-0.038
Token
Token ID198
Feature activation+2.26e-02

Pos logit contributions
 swollen+0.012
othe+0.009
 scared+0.009
 frightening+0.009
 burning+0.008
Neg logit contributions
uries-0.055
nes-0.054
aces-0.054
Whe-0.051
shirts-0.050
Token
Token ID198
Feature activation+1.93e-02

Pos logit contributions
 swollen+0.022
 frightening+0.018
 burning+0.018
othe+0.018
 scared+0.018
Neg logit contributions
nes-0.064
uries-0.061
aces-0.060
Whe-0.058
shirts-0.056
Token near
Token ID1474
Feature activation+1.30e-02

Pos logit contributions
uries+0.166
nes+0.161
aces+0.160
Whe+0.151
shirts+0.148
Neg logit contributions
 scared-0.049
 its-0.048
 swollen-0.047
othe-0.046
 Andy-0.044
Token the
Token ID262
Feature activation-5.35e-03

Pos logit contributions
 swollen+0.013
othe+0.011
 frightening+0.009
 painful+0.008
 scared+0.008
Neg logit contributions
uries-0.035
aces-0.034
nes-0.033
 parties-0.033
Whe-0.032
Token cliff
Token ID19516
Feature activation+5.09e-03

Pos logit contributions
nes+0.011
uries+0.010
aces+0.010
Whe+0.010
 parties+0.010
Neg logit contributions
 swollen-0.009
othe-0.008
 Andy-0.007
 its-0.007
 demanding-0.007
Token and
Token ID290
Feature activation-1.29e-02

Pos logit contributions
 swollen+0.006
othe+0.005
 frightening+0.005
 Farmer+0.005
 scared+0.005
Neg logit contributions
uries-0.013
nes-0.013
aces-0.013
 parties-0.012
coat-0.012
Token have
Token ID423
Feature activation-3.77e-02

Pos logit contributions
uries+0.027
aces+0.027
nes+0.027
Whe+0.024
coat+0.023
Neg logit contributions
 swollen-0.020
 scared-0.019
othe-0.019
 its-0.019
 frightening-0.018
Token pic
Token ID8301
Feature activation-1.34e-01

Pos logit contributions
uries+0.105
nes+0.101
aces+0.097
coat+0.089
suit+0.088
Neg logit contributions
 its-0.042
 scared-0.041
 dangerous-0.037
 the-0.037
 Andy-0.035
Tokenn
Token ID77
Feature activation-6.47e-02

Pos logit contributions
uries+0.217
aces+0.212
nes+0.208
Whe+0.188
shirts+0.178
Neg logit contributions
 swollen-0.273
othe-0.262
 its-0.256
 scared-0.253
 Andy-0.251
Tokenics
Token ID873
Feature activation-2.00e-02

Pos logit contributions
uries+0.092
nes+0.087
aces+0.085
Whe+0.074
shirts+0.071
Neg logit contributions
 swollen-0.148
othe-0.142
 scared-0.141
 its-0.138
 frightening-0.136
Token.
Token ID13
Feature activation-1.06e-02

Pos logit contributions
uries+0.064
aces+0.061
nes+0.060
Whe+0.060
shirts+0.058
Neg logit contributions
 its-0.013
 scared-0.011
 the-0.010
 dangerous-0.008
othe-0.008
Token One
Token ID1881
Feature activation-1.87e-02

Pos logit contributions
uries+0.031
nes+0.030
aces+0.030
Whe+0.028
shirts+0.027
Neg logit contributions
 swollen-0.009
othe-0.008
 scared-0.007
 frightening-0.007
 its-0.007
Token day
Token ID1110
Feature activation-2.63e-02

Pos logit contributions
nes+0.031
uries+0.030
aces+0.028
Whe+0.028
 parties+0.027
Neg logit contributions
 swollen-0.039
othe-0.035
 frightening-0.033
 screaming-0.033
 shocked-0.033
Token he
Token ID339
Feature activation-1.32e-03

Pos logit contributions
 swollen+0.013
othe+0.012
 frightening+0.010
 faint+0.010
 Farmer+0.010
Neg logit contributions
uries-0.023
aces-0.023
nes-0.023
Whe-0.021
 parties-0.021
Token forgot
Token ID16453
Feature activation+1.77e-02

Pos logit contributions
uries+0.004
nes+0.004
aces+0.003
Whe+0.003
coat+0.003
Neg logit contributions
 swollen-0.001
othe-0.001
 scared-0.001
 its-0.001
 frightening-0.001
Token how
Token ID703
Feature activation+1.02e-02

Pos logit contributions
 swollen+0.019
othe+0.016
 scared+0.015
 frightening+0.014
 Andy+0.014
Neg logit contributions
uries-0.047
aces-0.046
nes-0.046
 parties-0.042
shirts-0.042
Token to
Token ID284
Feature activation-8.80e-03

Pos logit contributions
 swollen+0.013
 Andy+0.011
 Farmer+0.011
othe+0.011
 Bee+0.011
Neg logit contributions
nes-0.023
aces-0.023
uries-0.023
shirts-0.021
Whe-0.021
Token get
Token ID651
Feature activation+2.55e-02

Pos logit contributions
uries+0.021
aces+0.020
nes+0.020
shirts+0.019
Whe+0.018
Neg logit contributions
 scared-0.012
 its-0.012
 swollen-0.011
othe-0.011
 trembling-0.011
Token back
Token ID736
Feature activation+5.20e-02

Pos logit contributions
 scared+0.024
 swollen+0.024
 its+0.022
othe+0.021
 angry+0.021
Neg logit contributions
uries-0.073
nes-0.071
aces-0.071
Whe-0.066
coat-0.065
Token to
Token ID284
Feature activation+2.67e-02

Pos logit contributions
 swollen+0.048
 scared+0.043
othe+0.042
 its+0.042
 frightening+0.039
Neg logit contributions
uries-0.146
nes-0.143
aces-0.141
Whe-0.132
shirts-0.129
Token his
Token ID465
Feature activation-6.83e-03

Pos logit contributions
 swollen+0.020
othe+0.016
 scared+0.015
 frightening+0.014
 Andy+0.013
Neg logit contributions
uries-0.079
nes-0.077
aces-0.077
Whe-0.073
 parties-0.072
Token hole
Token ID7604
Feature activation+3.95e-02

Pos logit contributions
uries+0.015
nes+0.015
aces+0.015
Whe+0.013
 parties+0.013
Neg logit contributions
 swollen-0.010
othe-0.009
 scared-0.009
 frightening-0.008
 its-0.008
Token.
Token ID13
Feature activation+6.64e-03

Pos logit contributions
 swollen+0.030
 scared+0.027
othe+0.026
 its+0.025
 frightening+0.023
Neg logit contributions
uries-0.117
nes-0.114
aces-0.112
Whe-0.108
shirts-0.105
Token He
Token ID679
Feature activation+1.91e-02

Pos logit contributions
 swollen+0.007
othe+0.006
 frightening+0.006
 scared+0.006
 burning+0.006
Neg logit contributions
uries-0.018
nes-0.017
aces-0.017
Whe-0.016
shirts-0.016
Token happy
Token ID3772
Feature activation+1.06e-02

Pos logit contributions
 swollen+0.051
othe+0.047
 Farmer+0.046
 Andy+0.045
 frightening+0.044
Neg logit contributions
nes-0.099
uries-0.095
aces-0.094
Whe-0.088
 parties-0.085
Token that
Token ID326
Feature activation+2.05e-02

Pos logit contributions
 swollen+0.010
othe+0.009
 scared+0.008
 frightening+0.008
 faint+0.007
Neg logit contributions
uries-0.029
nes-0.028
aces-0.028
 parties-0.026
Whe-0.026
Token he
Token ID339
Feature activation-2.00e-03

Pos logit contributions
 swollen+0.020
othe+0.016
 frightening+0.015
 scared+0.015
 screaming+0.014
Neg logit contributions
uries-0.056
nes-0.055
aces-0.055
Whe-0.051
 parties-0.051
Token sang
Token ID25889
Feature activation-6.46e-03

Pos logit contributions
uries+0.006
aces+0.005
nes+0.005
Whe+0.005
shirts+0.005
Neg logit contributions
 scared-0.002
 swollen-0.002
 its-0.002
othe-0.002
 frightening-0.002
Token the
Token ID262
Feature activation+1.39e-02

Pos logit contributions
uries+0.019
nes+0.018
aces+0.018
Whe+0.017
coat+0.017
Neg logit contributions
 its-0.006
 scared-0.005
 the-0.005
othe-0.005
 loud-0.005
Token loud
Token ID7812
Feature activation+4.97e-02

Pos logit contributions
 swollen+0.017
 scared+0.016
 frightening+0.015
 its+0.014
 trembling+0.014
Neg logit contributions
uries-0.036
nes-0.035
aces-0.034
Whe-0.032
shirts-0.031
Tokenest
Token ID395
Feature activation+6.73e-03

Pos logit contributions
 swollen+0.060
 scared+0.057
 its+0.056
othe+0.055
 frightening+0.053
Neg logit contributions
uries-0.126
nes-0.124
aces-0.120
Whe-0.112
shirts-0.111
Token he
Token ID339
Feature activation-1.40e-02

Pos logit contributions
 swollen+0.009
othe+0.008
 scared+0.008
 its+0.008
 frightening+0.008
Neg logit contributions
uries-0.016
nes-0.016
aces-0.015
Whe-0.014
shirts-0.014
Token had
Token ID550
Feature activation-4.36e-02

Pos logit contributions
uries+0.039
aces+0.038
nes+0.037
Whe+0.035
shirts+0.035
Neg logit contributions
 scared-0.013
 swollen-0.013
 its-0.012
 frightening-0.011
othe-0.011
Token ever
Token ID1683
Feature activation-1.57e-03

Pos logit contributions
uries+0.109
nes+0.104
aces+0.102
Whe+0.100
shirts+0.097
Neg logit contributions
 its-0.051
 swollen-0.050
 scared-0.050
othe-0.048
 frightening-0.045
Token sung
Token ID23568
Feature activation-9.13e-03

Pos logit contributions
uries+0.004
nes+0.003
aces+0.003
Whe+0.003
coat+0.003
Neg logit contributions
 swollen-0.002
 scared-0.002
othe-0.002
 its-0.002
 frightening-0.002
Token,
Token ID11
Feature activation-2.16e-02

Pos logit contributions
 swollen+0.004
othe+0.004
 scared+0.004
 its+0.003
 frightening+0.003
Neg logit contributions
uries-0.017
nes-0.017
aces-0.016
Whe-0.015
coat-0.015
Token Tim
Token ID5045
Feature activation+1.50e-02

Pos logit contributions
uries+0.067
aces+0.065
nes+0.065
Whe+0.060
coat+0.059
Neg logit contributions
 swollen-0.014
 scared-0.013
othe-0.013
 its-0.012
 Andy-0.012
Tokenmy
Token ID1820
Feature activation+3.08e-02

Pos logit contributions
 swollen+0.024
othe+0.023
 its+0.021
 frightening+0.021
 scared+0.021
Neg logit contributions
uries-0.032
aces-0.031
nes-0.031
coat-0.028
Whe-0.028
Token started
Token ID2067
Feature activation+2.27e-02

Pos logit contributions
 swollen+0.036
othe+0.034
 its+0.031
 frightening+0.030
 Andy+0.030
Neg logit contributions
uries-0.077
nes-0.076
aces-0.075
coat-0.069
Whe-0.068
Token to
Token ID284
Feature activation+3.12e-02

Pos logit contributions
 swollen+0.023
othe+0.020
 Andy+0.019
 frightening+0.018
 its+0.018
Neg logit contributions
uries-0.060
nes-0.059
aces-0.059
coat-0.056
Whe-0.055
Token feel
Token ID1254
Feature activation+4.81e-02

Pos logit contributions
 scared+0.036
 swollen+0.033
 angry+0.032
 screaming+0.031
 worried+0.031
Neg logit contributions
aces-0.085
uries-0.083
nes-0.081
shirts-0.074
ests-0.073
Token uncomfortable
Token ID12916
Feature activation+1.33e-02

Pos logit contributions
 swollen+0.054
 Andy+0.050
othe+0.050
oc+0.045
 frightening+0.045
Neg logit contributions
aces-0.114
uries-0.113
nes-0.109
Whe-0.106
shirts-0.103
Token.
Token ID13
Feature activation+1.22e-02

Pos logit contributions
 swollen+0.009
 scared+0.008
othe+0.008
 its+0.007
 burning+0.007
Neg logit contributions
uries-0.041
nes-0.040
aces-0.039
Whe-0.037
shirts-0.037
Token His
Token ID2399
Feature activation+1.45e-02

Pos logit contributions
 swollen+0.010
othe+0.009
 scared+0.008
 its+0.008
 frightening+0.008
Neg logit contributions
uries-0.036
nes-0.035
aces-0.035
Whe-0.032
coat-0.032
Token shoes
Token ID10012
Feature activation-3.42e-02

Pos logit contributions
 swollen+0.012
 scared+0.011
othe+0.010
 angry+0.009
 trembling+0.009
Neg logit contributions
uries-0.043
nes-0.042
aces-0.041
Whe-0.039
shirts-0.038
Token were
Token ID547
Feature activation-2.62e-03

Pos logit contributions
uries+0.081
nes+0.079
aces+0.078
Whe+0.072
shirts+0.070
Neg logit contributions
 swollen-0.046
othe-0.042
 scared-0.042
 its-0.040
 frightening-0.040
Token magn
Token ID7842
Feature activation-2.32e-02

Pos logit contributions
 swollen+0.021
 scared+0.019
othe+0.019
 frightening+0.018
 its+0.018
Neg logit contributions
uries-0.033
nes-0.032
aces-0.032
Whe-0.029
shirts-0.028
Tokenifying
Token ID4035
Feature activation-3.74e-02

Pos logit contributions
uries+0.043
nes+0.040
Whe+0.037
aces+0.037
 parties+0.036
Neg logit contributions
 swollen-0.044
 scared-0.041
 its-0.041
othe-0.041
 trembling-0.040
Token glass
Token ID5405
Feature activation-8.62e-03

Pos logit contributions
aces+0.064
nes+0.063
uries+0.063
Whe+0.059
 parties+0.058
Neg logit contributions
 swollen-0.068
othe-0.065
 frightening-0.062
 screaming-0.062
 demanding-0.062
Token.
Token ID13
Feature activation+7.41e-03

Pos logit contributions
uries+0.026
nes+0.025
Whe+0.024
aces+0.023
shirts+0.023
Neg logit contributions
 swollen-0.007
 scared-0.007
 its-0.006
othe-0.006
 burning-0.006
Token He
Token ID679
Feature activation+9.17e-03

Pos logit contributions
 swollen+0.005
 scared+0.005
othe+0.005
 its+0.005
 Andy+0.004
Neg logit contributions
uries-0.022
nes-0.022
aces-0.021
Whe-0.020
shirts-0.020
Token saw
Token ID2497
Feature activation+6.13e-02

Pos logit contributions
othe+0.012
 swollen+0.011
 burning+0.010
 Farmer+0.010
 Andy+0.010
Neg logit contributions
aces-0.021
nes-0.021
uries-0.021
 shorts-0.019
coat-0.018
Token the
Token ID262
Feature activation+2.24e-02

Pos logit contributions
 swollen+0.086
othe+0.083
 frightening+0.076
 demanding+0.075
 screaming+0.074
Neg logit contributions
aces-0.131
uries-0.122
nes-0.115
 parties-0.115
Whe-0.114
Token numbers
Token ID3146
Feature activation+1.96e-03

Pos logit contributions
 swollen+0.031
 scared+0.029
 frightening+0.027
 burning+0.027
othe+0.027
Neg logit contributions
uries-0.054
nes-0.054
aces-0.051
Whe-0.047
shirts-0.047
Token,
Token ID11
Feature activation-5.89e-03

Pos logit contributions
 swollen+0.002
 scared+0.002
othe+0.002
 its+0.002
 burning+0.002
Neg logit contributions
uries-0.005
nes-0.005
aces-0.005
Whe-0.005
shirts-0.005
Token the
Token ID262
Feature activation+1.51e-02

Pos logit contributions
aces+0.016
uries+0.016
nes+0.015
Whe+0.015
 parties+0.014
Neg logit contributions
 swollen-0.005
othe-0.005
 frightening-0.004
 Andy-0.004
 scared-0.004
Token hands
Token ID2832
Feature activation+2.20e-02

Pos logit contributions
 swollen+0.022
 scared+0.020
 trembling+0.019
 shaking+0.019
 frightening+0.019
Neg logit contributions
nes-0.036
uries-0.035
aces-0.033
Whe-0.031
shirts-0.031
Token a
Token ID257
Feature activation+1.96e-02

Pos logit contributions
 swollen+0.000
 scared+0.000
 its+0.000
 Andy+0.000
othe+0.000
Neg logit contributions
uries-0.000
nes-0.000
aces-0.000
shirts-0.000
coat-0.000
Token little
Token ID1310
Feature activation+1.81e-02

Pos logit contributions
 swollen+0.022
othe+0.021
 its+0.019
 Andy+0.019
 frightening+0.019
Neg logit contributions
uries-0.049
nes-0.048
aces-0.046
Whe-0.044
coat-0.044
Token angel
Token ID18304
Feature activation+5.45e-02

Pos logit contributions
 swollen+0.024
 scared+0.022
othe+0.022
 its+0.021
 frightening+0.021
Neg logit contributions
uries-0.044
nes-0.043
aces-0.042
Whe-0.038
shirts-0.038
Token who
Token ID508
Feature activation+1.13e-02

Pos logit contributions
 swollen+0.073
othe+0.065
 frightening+0.064
 Farmer+0.062
 Andy+0.058
Neg logit contributions
nes-0.126
aces-0.124
uries-0.123
coat-0.113
 parties-0.112
Token could
Token ID714
Feature activation-1.21e-02

Pos logit contributions
 swollen+0.016
othe+0.015
 Farmer+0.014
 frightening+0.013
 Andy+0.013
Neg logit contributions
uries-0.026
nes-0.025
aces-0.025
coat-0.023
Whe-0.023
Token not
Token ID407
Feature activation+4.92e-02

Pos logit contributions
nes+0.024
uries+0.023
aces+0.023
 parties+0.020
Whe+0.020
Neg logit contributions
 swollen-0.021
othe-0.019
 Andy-0.019
 Farmer-0.019
 frightening-0.019
Token fly
Token ID6129
Feature activation-1.25e-02

Pos logit contributions
 scared+0.069
 swollen+0.067
 its+0.066
othe+0.065
 angry+0.063
Neg logit contributions
uries-0.111
aces-0.108
nes-0.106
Whe-0.099
shirts-0.099
Token.
Token ID13
Feature activation-1.07e-02

Pos logit contributions
uries+0.034
nes+0.033
aces+0.033
Whe+0.031
shirts+0.030
Neg logit contributions
 swollen-0.013
 scared-0.012
othe-0.011
 its-0.011
 frightening-0.011
Token She
Token ID1375
Feature activation+8.70e-03

Pos logit contributions
uries+0.031
nes+0.030
aces+0.030
Whe+0.028
coat+0.027
Neg logit contributions
 swollen-0.009
othe-0.008
 scared-0.008
 its-0.007
 frightening-0.007
Token watched
Token ID7342
Feature activation+5.75e-03

Pos logit contributions
 swollen+0.009
othe+0.009
 frightening+0.008
 Andy+0.008
 Farmer+0.008
Neg logit contributions
uries-0.023
nes-0.023
aces-0.022
Whe-0.020
coat-0.020
Token her
Token ID607
Feature activation-5.37e-03

Pos logit contributions
 swollen+0.007
othe+0.006
 scared+0.005
 frightening+0.005
 screaming+0.005
Neg logit contributions
uries-0.014
aces-0.014
nes-0.014
 parties-0.013
Whe-0.013
Token could
Token ID714
Feature activation-2.00e-02

Pos logit contributions
uries+0.035
aces+0.033
nes+0.033
Whe+0.033
 parties+0.031
Neg logit contributions
 scared-0.014
 swollen-0.013
 its-0.012
othe-0.011
 said-0.011
Token share
Token ID2648
Feature activation-3.50e-02

Pos logit contributions
uries+0.044
aces+0.043
nes+0.043
Whe+0.039
shirts+0.038
Neg logit contributions
 swollen-0.030
othe-0.029
 scared-0.028
 its-0.027
 Andy-0.026
Token it
Token ID340
Feature activation+8.84e-03

Pos logit contributions
uries+0.101
nes+0.099
aces+0.099
Whe+0.091
 parties+0.090
Neg logit contributions
 swollen-0.029
othe-0.025
 scared-0.024
 frightening-0.023
 Andy-0.022
Token with
Token ID351
Feature activation-1.31e-02

Pos logit contributions
 its+0.005
 scared+0.005
othe+0.004
 swollen+0.004
 angry+0.003
Neg logit contributions
uries-0.028
Whe-0.027
nes-0.027
aces-0.026
shirts-0.026
Token his
Token ID465
Feature activation-3.50e-02

Pos logit contributions
uries+0.035
aces+0.035
nes+0.034
 parties+0.032
shirts+0.032
Neg logit contributions
 swollen-0.013
othe-0.011
 scared-0.011
 frightening-0.011
 screaming-0.011
Token family
Token ID1641
Feature activation-9.84e-03

Pos logit contributions
uries+0.076
aces+0.075
nes+0.071
 parties+0.065
coat+0.064
Neg logit contributions
 swollen-0.052
othe-0.050
 frightening-0.047
 scared-0.046
 its-0.046
Token.
Token ID13
Feature activation+1.43e-03

Pos logit contributions
uries+0.033
nes+0.033
Whe+0.032
aces+0.032
coat+0.030
Neg logit contributions
 scared-0.003
 its-0.003
 swollen-0.003
 the-0.002
 angry-0.002
Token He
Token ID679
Feature activation+1.24e-02

Pos logit contributions
 swollen+0.001
othe+0.001
 scared+0.001
 its+0.001
 Andy+0.001
Neg logit contributions
uries-0.004
nes-0.004
aces-0.004
Whe-0.004
coat-0.004
Token quickly
Token ID2952
Feature activation+2.04e-02

Pos logit contributions
 swollen+0.014
 scared+0.013
othe+0.013
 its+0.013
 frightening+0.012
Neg logit contributions
uries-0.032
nes-0.031
aces-0.031
Whe-0.029
coat-0.028
Token grabbed
Token ID13176
Feature activation+4.89e-03

Pos logit contributions
 swollen+0.025
 scared+0.024
 its+0.023
othe+0.023
 angry+0.021
Neg logit contributions
uries-0.051
nes-0.050
aces-0.049
Whe-0.048
shirts-0.044
Token a
Token ID257
Feature activation+6.84e-04

Pos logit contributions
 swollen+0.005
othe+0.004
 frightening+0.004
 scared+0.004
 screaming+0.004
Neg logit contributions
uries-0.013
aces-0.013
nes-0.013
Whe-0.012
shirts-0.012
Token he
Token ID339
Feature activation-6.31e-03

Pos logit contributions
uries+0.042
nes+0.042
aces+0.041
Whe+0.038
coat+0.038
Neg logit contributions
 swollen-0.009
 scared-0.008
othe-0.007
 its-0.007
 Andy-0.007
Token couldn
Token ID3521
Feature activation+1.02e-02

Pos logit contributions
uries+0.018
nes+0.017
aces+0.017
Whe+0.016
 parties+0.016
Neg logit contributions
 swollen-0.006
 scared-0.006
othe-0.005
 worried-0.005
 frightening-0.005
Tokenâ
Token ID22940
Feature activation-6.20e-02

Pos logit contributions
 swollen+0.012
othe+0.011
 scared+0.011
 frightening+0.011
 its+0.011
Neg logit contributions
uries-0.025
nes-0.025
aces-0.025
Whe-0.023
coat-0.022
Token€
Token ID26391
Feature activation-4.88e-02

Pos logit contributions
uries+0.101
aces+0.100
 skirts+0.094
nes+0.094
 parties+0.092
Neg logit contributions
othe-0.109
 swollen-0.103
 its-0.103
 scared-0.103
 demanding-0.101
Token™
Token ID8151
Feature activation-1.16e-02

Pos logit contributions
uries+0.088
nes+0.086
aces+0.085
shirts+0.077
Whe+0.077
Neg logit contributions
 scared-0.086
 swollen-0.086
othe-0.085
 Andy-0.081
 shocked-0.080
Tokent
Token ID83
Feature activation+3.78e-02

Pos logit contributions
uries+0.024
nes+0.024
aces+0.023
Whe+0.021
shirts+0.021
Neg logit contributions
 swollen-0.018
othe-0.017
 scared-0.017
 its-0.016
 frightening-0.016
Token get
Token ID651
Feature activation+2.34e-02

Pos logit contributions
 scared+0.049
othe+0.046
 swollen+0.045
 its+0.045
 trembling+0.044
Neg logit contributions
aces-0.090
uries-0.089
nes-0.086
Whe-0.083
shirts-0.082
Token out
Token ID503
Feature activation+7.24e-02

Pos logit contributions
 swollen+0.022
 scared+0.022
othe+0.020
 its+0.020
 worried+0.019
Neg logit contributions
uries-0.067
aces-0.065
nes-0.065
Whe-0.061
coat-0.059
Token.
Token ID13
Feature activation-2.17e-03

Pos logit contributions
 swollen+0.060
 scared+0.053
othe+0.052
 its+0.050
 frightening+0.048
Neg logit contributions
uries-0.211
nes-0.207
aces-0.204
Whe-0.192
coat-0.188
Token He
Token ID679
Feature activation+1.46e-02

Pos logit contributions
uries+0.006
nes+0.006
aces+0.006
coat+0.006
 parties+0.006
Neg logit contributions
 swollen-0.002
 scared-0.002
othe-0.002
 Andy-0.002
 its-0.001
Token was
Token ID373
Feature activation+8.86e-03

Pos logit contributions
 swollen+0.014
 scared+0.012
othe+0.012
 its+0.011
 frightening+0.011
Neg logit contributions
uries-0.041
nes-0.040
aces-0.040
Whe-0.037
coat-0.036
Token.
Token ID13
Feature activation+4.48e-03

Pos logit contributions
uries+0.101
aces+0.097
nes+0.096
Whe+0.094
shirts+0.089
Neg logit contributions
 burning-0.017
 scared-0.016
 swollen-0.016
othe-0.016
 its-0.015
Token It
Token ID632
Feature activation+3.29e-02

Pos logit contributions
 swollen+0.005
othe+0.004
 scared+0.004
 frightening+0.004
 its+0.004
Neg logit contributions
uries-0.012
nes-0.011
aces-0.011
Whe-0.010
coat-0.010
Token was
Token ID373
Feature activation+5.88e-03

Pos logit contributions
othe+0.027
 swollen+0.027
 frightening+0.022
 demanding+0.022
 its+0.022
Neg logit contributions
nes-0.089
uries-0.089
aces-0.087
coat-0.083
Whe-0.083
Token so
Token ID523
Feature activation+3.31e-02

Pos logit contributions
 swollen+0.007
othe+0.006
 scared+0.006
 frightening+0.006
 its+0.006
Neg logit contributions
uries-0.015
nes-0.014
aces-0.014
Whe-0.013
coat-0.013
Token wide
Token ID3094
Feature activation+5.06e-02

Pos logit contributions
 swollen+0.052
othe+0.048
 trembling+0.048
 screaming+0.047
 Andy+0.046
Neg logit contributions
nes-0.066
uries-0.065
aces-0.065
Whe-0.060
 parties-0.060
Token and
Token ID290
Feature activation+1.88e-02

Pos logit contributions
 swollen+0.033
 scared+0.030
 its+0.029
othe+0.028
 burning+0.027
Neg logit contributions
uries-0.155
nes-0.153
aces-0.149
Whe-0.142
shirts-0.140
Token fluffy
Token ID39145
Feature activation-8.35e-03

Pos logit contributions
 swollen+0.023
othe+0.022
 Andy+0.020
 Farmer+0.020
 screaming+0.020
Neg logit contributions
uries-0.043
nes-0.043
aces-0.042
Whe-0.039
coat-0.039
Token.
Token ID13
Feature activation+8.29e-03

Pos logit contributions
uries+0.028
nes+0.027
aces+0.026
Whe+0.025
shirts+0.025
Neg logit contributions
 scared-0.004
 its-0.004
 swollen-0.004
 burning-0.004
 Andy-0.004
Token Lily
Token ID20037
Feature activation+5.48e-03

Pos logit contributions
 swollen+0.010
othe+0.009
 scared+0.009
 frightening+0.009
 its+0.009
Neg logit contributions
uries-0.021
nes-0.020
aces-0.020
Whe-0.018
coat-0.018
Token's
Token ID338
Feature activation-2.23e-02

Pos logit contributions
 swollen+0.006
othe+0.006
 frightening+0.005
 scared+0.005
 its+0.005
Neg logit contributions
uries-0.014
nes-0.014
aces-0.014
coat-0.013
Whe-0.013
Token mom
Token ID1995
Feature activation+3.85e-02

Pos logit contributions
uries+0.063
nes+0.062
aces+0.061
Whe+0.056
shirts+0.055
Neg logit contributions
 swollen-0.021
 scared-0.020
othe-0.018
 its-0.017
 worried-0.017
Token said
Token ID531
Feature activation-1.23e-03

Pos logit contributions
 burning+0.010
 cracking+0.009
 attacking+0.009
 frightening+0.009
 raging+0.009
Neg logit contributions
aces-0.016
nes-0.016
 shorts-0.015
 sweets-0.015
 bracelet-0.015
Token Ros
Token ID10018
Feature activation+1.38e-02

Pos logit contributions
nes+0.002
aces+0.002
uries+0.002
 guys+0.002
 parties+0.002
Neg logit contributions
 swollen-0.002
 frightening-0.002
 cracking-0.002
 burning-0.002
 faint-0.002
Tokenie
Token ID494
Feature activation+3.41e-02

Pos logit contributions
 swollen+0.018
othe+0.018
 scared+0.017
 frightening+0.016
 Farmer+0.016
Neg logit contributions
uries-0.032
aces-0.032
nes-0.032
coat-0.028
shirts-0.028
Token.
Token ID13
Feature activation+1.36e-02

Pos logit contributions
 swollen+0.036
othe+0.032
 frightening+0.031
 scared+0.029
 burning+0.029
Neg logit contributions
uries-0.090
aces-0.089
nes-0.089
coat-0.080
shirts-0.079
Token
Token ID198
Feature activation+1.87e-02

Pos logit contributions
 swollen+0.017
 frightening+0.016
 cracking+0.016
 burning+0.015
 dangerous+0.015
Neg logit contributions
aces-0.031
uries-0.031
nes-0.031
coat-0.027
 shorts-0.027
Token
Token ID198
Feature activation+1.53e-02

Pos logit contributions
 swollen+0.023
 frightening+0.021
 cracking+0.020
 burning+0.020
 scared+0.020
Neg logit contributions
nes-0.046
aces-0.043
uries-0.043
 parties-0.040
Whe-0.040
TokenB
Token ID33
Feature activation-2.31e-02

Pos logit contributions
 swollen+0.023
othe+0.023
 scared+0.023
 frightening+0.022
 burning+0.022
Neg logit contributions
nes-0.032
uries-0.032
aces-0.030
Whe-0.027
coat-0.027
Tokenenny
Token ID11870
Feature activation+7.75e-03

Pos logit contributions
uries+0.039
aces+0.037
nes+0.037
Whe+0.034
 parties+0.033
Neg logit contributions
 swollen-0.046
 Andy-0.043
othe-0.043
 scared-0.042
 its-0.041
Token smiled
Token ID13541
Feature activation+5.44e-02

Pos logit contributions
 swollen+0.006
 scared+0.006
othe+0.005
 its+0.005
 Andy+0.005
Neg logit contributions
uries-0.023
nes-0.022
aces-0.022
Whe-0.021
shirts-0.020
Token and
Token ID290
Feature activation+1.42e-02

Pos logit contributions
 swollen+0.064
othe+0.058
 frightening+0.053
 Andy+0.052
 scared+0.051
Neg logit contributions
uries-0.135
aces-0.134
nes-0.132
Whe-0.120
coat-0.120
Token said
Token ID531
Feature activation+3.12e-03

Pos logit contributions
 swollen+0.015
othe+0.013
 frightening+0.012
 burning+0.012
 scared+0.012
Neg logit contributions
nes-0.037
uries-0.037
aces-0.036
Whe-0.034
coat-0.033
Token back
Token ID736
Feature activation+3.29e-02

Pos logit contributions
uries+0.112
Whe+0.107
nes+0.106
aces+0.106
shirts+0.101
Neg logit contributions
 scared-0.024
 the-0.023
 its-0.022
 dangerous-0.021
 swollen-0.021
Token."
Token ID526
Feature activation-8.92e-03

Pos logit contributions
 scared+0.027
 its+0.026
 swollen+0.025
 dangerous+0.024
 Andy+0.024
Neg logit contributions
uries-0.092
Whe-0.090
nes-0.090
aces-0.089
shirts-0.084
Token Max
Token ID5436
Feature activation+1.42e-02

Pos logit contributions
uries+0.021
nes+0.020
aces+0.020
 parties+0.019
coat+0.018
Neg logit contributions
 swollen-0.013
 frightening-0.011
 scared-0.011
othe-0.011
 burning-0.011
Token didn
Token ID1422
Feature activation+4.27e-03

Pos logit contributions
 swollen+0.016
othe+0.014
 frightening+0.013
 its+0.013
 scared+0.013
Neg logit contributions
uries-0.037
nes-0.036
aces-0.036
coat-0.033
Whe-0.033
Token't
Token ID470
Feature activation+5.04e-02

Pos logit contributions
 swollen+0.006
 frightening+0.006
 scared+0.005
othe+0.005
 its+0.005
Neg logit contributions
nes-0.010
uries-0.009
aces-0.009
Whe-0.008
 parties-0.008
Token want
Token ID765
Feature activation-1.84e-02

Pos logit contributions
 scared+0.072
 angry+0.066
othe+0.065
 worried+0.064
 swollen+0.064
Neg logit contributions
uries-0.117
aces-0.116
nes-0.115
Whe-0.108
shirts-0.104
Token to
Token ID284
Feature activation+1.04e-02

Pos logit contributions
uries+0.050
nes+0.049
aces+0.049
 parties+0.046
shirts+0.045
Neg logit contributions
 swollen-0.018
othe-0.016
 scared-0.015
 frightening-0.014
 burning-0.014
Token at
Token ID379
Feature activation-1.64e-04

Pos logit contributions
 swollen+0.015
 scared+0.015
othe+0.015
 its+0.014
 angry+0.014
Neg logit contributions
aces-0.024
uries-0.023
nes-0.022
Whe-0.021
shirts-0.020
Token first
Token ID717
Feature activation-3.87e-03

Pos logit contributions
uries+0.000
aces+0.000
nes+0.000
Whe+0.000
shirts+0.000
Neg logit contributions
 swollen-0.000
 scared-0.000
 its-0.000
othe-0.000
 Andy-0.000
Token,
Token ID11
Feature activation-1.46e-02

Pos logit contributions
uries+0.012
nes+0.012
aces+0.012
Whe+0.011
shirts+0.011
Neg logit contributions
 swollen-0.002
othe-0.002
 scared-0.002
 its-0.002
 screaming-0.002
Token but
Token ID475
Feature activation+2.99e-02

Pos logit contributions
uries+0.037
nes+0.036
aces+0.036
Whe+0.033
shirts+0.032
Neg logit contributions
 swollen-0.018
 scared-0.017
othe-0.016
 its-0.016
 frightening-0.016
Token.
Token ID13
Feature activation-1.68e-02

Pos logit contributions
uries+0.169
nes+0.167
aces+0.160
Whe+0.156
shirts+0.151
Neg logit contributions
 scared-0.041
 swollen-0.041
 its-0.039
othe-0.039
 Andy-0.034
Token He
Token ID679
Feature activation+4.18e-03

Pos logit contributions
uries+0.045
aces+0.044
nes+0.044
Whe+0.041
coat+0.040
Neg logit contributions
 swollen-0.018
othe-0.015
 scared-0.015
 frightening-0.014
 its-0.014
Token loved
Token ID6151
Feature activation-4.45e-03

Pos logit contributions
 swollen+0.005
 scared+0.005
othe+0.004
 its+0.004
 frightening+0.004
Neg logit contributions
uries-0.010
nes-0.010
aces-0.010
Whe-0.009
coat-0.009
Token to
Token ID284
Feature activation-8.33e-03

Pos logit contributions
uries+0.011
aces+0.011
nes+0.011
 parties+0.011
shirts+0.010
Neg logit contributions
 swollen-0.005
othe-0.005
 frightening-0.004
 scared-0.004
 shocked-0.004
Token play
Token ID711
Feature activation-7.25e-02

Pos logit contributions
uries+0.017
nes+0.016
aces+0.016
Whe+0.014
coat+0.014
Neg logit contributions
 swollen-0.014
othe-0.013
 scared-0.013
 its-0.013
 frightening-0.013
Token with
Token ID351
Feature activation-3.19e-02

Pos logit contributions
uries+0.194
nes+0.188
aces+0.186
Whe+0.173
shirts+0.171
Neg logit contributions
 swollen-0.073
 scared-0.071
 its-0.070
othe-0.068
 frightening-0.064
Token his
Token ID465
Feature activation-4.84e-02

Pos logit contributions
uries+0.089
aces+0.088
nes+0.086
 parties+0.082
shirts+0.081
Neg logit contributions
 swollen-0.029
othe-0.026
 scared-0.023
 frightening-0.023
 trembling-0.022
Token toys
Token ID14958
Feature activation-5.27e-02

Pos logit contributions
uries+0.089
nes+0.086
aces+0.086
Whe+0.076
coat+0.074
Neg logit contributions
 swollen-0.091
othe-0.086
 scared-0.084
 frightening-0.082
 its-0.082
Token and
Token ID290
Feature activation-2.38e-02

Pos logit contributions
uries+0.171
nes+0.168
Whe+0.165
aces+0.159
shirts+0.156
Neg logit contributions
 scared-0.027
 its-0.022
 swollen-0.022
 worried-0.019
 angry-0.019
Token pretend
Token ID16614
Feature activation-4.42e-02

Pos logit contributions
uries+0.048
nes+0.047
aces+0.046
Whe+0.042
coat+0.041
Neg logit contributions
 swollen-0.040
othe-0.038
 scared-0.037
 its-0.036
 frightening-0.036
Token they
Token ID484
Feature activation-5.21e-02

Pos logit contributions
uries+0.128
nes+0.126
aces+0.125
Whe+0.116
shirts+0.114
Neg logit contributions
 swollen-0.036
othe-0.031
 scared-0.030
 frightening-0.028
 its-0.027
Token field
Token ID2214
Feature activation-4.50e-03

Pos logit contributions
 swollen+0.007
 scared+0.007
 frightening+0.006
 burning+0.006
 dangerous+0.006
Neg logit contributions
uries-0.014
nes-0.014
aces-0.014
Whe-0.012
shirts-0.012
Token.
Token ID13
Feature activation-7.54e-03

Pos logit contributions
uries+0.013
nes+0.012
aces+0.012
Whe+0.012
shirts+0.011
Neg logit contributions
 swollen-0.004
othe-0.003
 scared-0.003
 its-0.003
 frightening-0.003
Token
Token ID220
Feature activation+1.82e-02

Pos logit contributions
uries+0.024
nes+0.023
aces+0.023
Whe+0.021
coat+0.021
Neg logit contributions
 swollen-0.004
othe-0.004
 scared-0.003
 its-0.003
 frightening-0.003
Token
Token ID198
Feature activation+1.56e-02

Pos logit contributions
 swollen+0.004
othe+0.004
 scared+0.003
 its+0.003
 Andy+0.003
Neg logit contributions
uries-0.062
nes-0.061
aces-0.060
shirts-0.057
coat-0.057
Token
Token ID198
Feature activation+1.42e-02

Pos logit contributions
 swollen+0.012
 scared+0.010
 frightening+0.010
othe+0.009
 trembling+0.009
Neg logit contributions
nes-0.046
uries-0.046
aces-0.045
Whe-0.043
 parties-0.041
TokenThey
Token ID2990
Feature activation-3.35e-02

Pos logit contributions
 swollen+0.013
othe+0.011
 its+0.011
 scared+0.011
 Andy+0.010
Neg logit contributions
uries-0.040
aces-0.039
nes-0.038
coat-0.036
shirts-0.035
Token all
Token ID477
Feature activation+3.53e-03

Pos logit contributions
uries+0.085
aces+0.081
nes+0.080
Whe+0.075
coat+0.074
Neg logit contributions
 swollen-0.041
 its-0.039
 scared-0.038
othe-0.037
 Andy-0.036
Token agreed
Token ID4987
Feature activation+1.17e-03

Pos logit contributions
 swollen+0.005
othe+0.004
 scared+0.004
 its+0.004
 frightening+0.004
Neg logit contributions
uries-0.009
nes-0.008
aces-0.008
Whe-0.008
coat-0.007
Token that
Token ID326
Feature activation+1.32e-02

Pos logit contributions
 swollen+0.001
othe+0.001
 scared+0.001
 its+0.001
 frightening+0.001
Neg logit contributions
uries-0.003
nes-0.003
aces-0.003
Whe-0.003
coat-0.003
Token the
Token ID262
Feature activation+1.51e-02

Pos logit contributions
 swollen+0.015
othe+0.013
 scared+0.012
 frightening+0.012
 Andy+0.011
Neg logit contributions
uries-0.034
nes-0.033
aces-0.033
Whe-0.030
 parties-0.030
Token dove
Token ID38862
Feature activation+3.23e-02

Pos logit contributions
 scared+0.020
 swollen+0.019
 worried+0.018
 frightening+0.018
 angry+0.018
Neg logit contributions
uries-0.038
nes-0.037
aces-0.037
shirts-0.034
coat-0.033
Token a
Token ID257
Feature activation+3.01e-03

Pos logit contributions
uries+0.045
nes+0.044
aces+0.043
Whe+0.040
coat+0.040
Neg logit contributions
 swollen-0.015
 its-0.015
 scared-0.014
othe-0.014
 Andy-0.013
Token fence
Token ID13990
Feature activation+6.20e-03

Pos logit contributions
 swollen+0.004
othe+0.004
 scared+0.003
 its+0.003
 frightening+0.003
Neg logit contributions
uries-0.007
nes-0.007
aces-0.007
Whe-0.006
coat-0.006
Token.
Token ID13
Feature activation-6.78e-03

Pos logit contributions
 swollen+0.005
 scared+0.005
othe+0.005
 its+0.004
 frightening+0.004
Neg logit contributions
uries-0.018
nes-0.017
aces-0.017
Whe-0.016
shirts-0.016
Token
Token ID198
Feature activation+6.57e-03

Pos logit contributions
uries+0.021
nes+0.021
aces+0.021
Whe+0.019
shirts+0.019
Neg logit contributions
 swollen-0.004
othe-0.003
 scared-0.003
 frightening-0.003
 its-0.003
Token
Token ID198
Feature activation+4.36e-03

Pos logit contributions
 swollen+0.010
 frightening+0.008
 trembling+0.008
 burning+0.007
 cracking+0.007
Neg logit contributions
nes-0.016
uries-0.015
Whe-0.014
aces-0.014
 parties-0.014
Token"
Token ID1
Feature activation-5.70e-02

Pos logit contributions
 swollen+0.004
 scared+0.004
othe+0.004
 its+0.004
 faint+0.003
Neg logit contributions
aces-0.012
uries-0.012
nes-0.011
 parties-0.011
coat-0.011
TokenLook
Token ID8567
Feature activation-2.25e-02

Pos logit contributions
uries+0.152
aces+0.150
nes+0.145
coat+0.133
 parties+0.133
Neg logit contributions
 swollen-0.054
 Bee-0.047
 Farmer-0.047
 its-0.046
 faint-0.045
Token,
Token ID11
Feature activation-3.18e-02

Pos logit contributions
uries+0.074
aces+0.071
nes+0.071
coat+0.067
shirts+0.067
Neg logit contributions
 scared-0.010
 its-0.010
 Andy-0.010
othe-0.009
 the-0.009
Token Ben
Token ID3932
Feature activation+4.53e-03

Pos logit contributions
uries+0.097
aces+0.095
nes+0.093
coat+0.090
shirts+0.086
Neg logit contributions
 the-0.028
 its-0.027
 scared-0.026
 Bee-0.025
 Sam-0.023
Token,
Token ID11
Feature activation-3.17e-02

Pos logit contributions
 swollen+0.004
othe+0.004
 scared+0.004
 frightening+0.003
 its+0.003
Neg logit contributions
uries-0.013
nes-0.012
aces-0.012
Whe-0.011
coat-0.011
Token our
Token ID674
Feature activation-5.43e-02

Pos logit contributions
uries+0.097
nes+0.092
aces+0.092
coat+0.089
shirts+0.088
Neg logit contributions
 the-0.028
 its-0.028
 scared-0.027
 lightning-0.023
 dangerous-0.022
Token Mia
Token ID32189
Feature activation+8.10e-03

Pos logit contributions
 swollen+0.009
othe+0.008
 burning+0.008
 frightening+0.008
 scared+0.007
Neg logit contributions
uries-0.018
aces-0.018
nes-0.017
Whe-0.017
 parties-0.016
Token are
Token ID389
Feature activation-9.74e-03

Pos logit contributions
 swollen+0.016
othe+0.015
 frightening+0.014
 Andy+0.014
 its+0.014
Neg logit contributions
uries-0.014
aces-0.014
nes-0.014
Whe-0.012
coat-0.012
Token sorry
Token ID7926
Feature activation+4.38e-02

Pos logit contributions
aces+0.027
uries+0.027
nes+0.027
Whe+0.024
coat+0.024
Neg logit contributions
 swollen-0.008
othe-0.007
 Andy-0.007
 Farmer-0.006
oc-0.006
Token.
Token ID13
Feature activation+1.53e-02

Pos logit contributions
 swollen+0.077
 frightening+0.063
 Farmer+0.062
 Andy+0.060
oc+0.059
Neg logit contributions
aces-0.078
nes-0.073
uries-0.072
 guys-0.072
 parties-0.068
Token They
Token ID1119
Feature activation-1.90e-02

Pos logit contributions
 swollen+0.027
othe+0.023
 frightening+0.023
 trembling+0.022
 painful+0.022
Neg logit contributions
uries-0.030
aces-0.029
nes-0.029
Whe-0.028
 parties-0.026
Token run
Token ID1057
Feature activation-3.44e-02

Pos logit contributions
uries+0.046
nes+0.045
aces+0.044
Whe+0.041
coat+0.040
Neg logit contributions
 swollen-0.025
othe-0.023
 scared-0.023
 its-0.022
 frightening-0.022
Token to
Token ID284
Feature activation+3.35e-02

Pos logit contributions
uries+0.089
nes+0.087
aces+0.086
Whe+0.079
coat+0.078
Neg logit contributions
 swollen-0.039
othe-0.035
 scared-0.035
 its-0.033
 frightening-0.033
Token the
Token ID262
Feature activation+2.29e-02

Pos logit contributions
 swollen+0.043
 frightening+0.035
 painful+0.033
othe+0.031
oc+0.031
Neg logit contributions
nes-0.081
uries-0.080
Whe-0.077
 parties-0.077
 guys-0.071
Token car
Token ID1097
Feature activation+8.19e-03

Pos logit contributions
 swollen+0.039
othe+0.037
 scared+0.036
 its+0.035
 frightening+0.035
Neg logit contributions
uries-0.046
nes-0.045
aces-0.044
Whe-0.040
coat-0.039
Token.
Token ID13
Feature activation+6.03e-03

Pos logit contributions
 swollen+0.007
othe+0.006
 scared+0.006
 its+0.006
 frightening+0.005
Neg logit contributions
uries-0.023
nes-0.023
aces-0.023
Whe-0.021
shirts-0.021
Token They
Token ID1119
Feature activation-2.99e-02

Pos logit contributions
 swollen+0.009
 frightening+0.008
othe+0.007
 trembling+0.007
 burning+0.007
Neg logit contributions
uries-0.013
nes-0.013
aces-0.013
Whe-0.013
 parties-0.012
Token at
Token ID379
Feature activation+2.00e-02

Pos logit contributions
 swollen+0.074
 Farmer+0.063
othe+0.061
 frightening+0.059
 Andy+0.057
Neg logit contributions
aces-0.148
uries-0.147
nes-0.142
 parties-0.130
shirts-0.130
Token the
Token ID262
Feature activation+1.82e-02

Pos logit contributions
 swollen+0.017
othe+0.015
 scared+0.014
 frightening+0.014
 its+0.013
Neg logit contributions
uries-0.057
nes-0.056
aces-0.056
Whe-0.052
 parties-0.051
Token end
Token ID886
Feature activation+1.45e-02

Pos logit contributions
 swollen+0.025
 scared+0.024
 demanding+0.023
 angry+0.023
 frightening+0.023
Neg logit contributions
aces-0.044
nes-0.043
uries-0.042
Whe-0.038
coat-0.038
Token of
Token ID286
Feature activation+2.95e-02

Pos logit contributions
 swollen+0.014
othe+0.012
 scared+0.011
 frightening+0.011
 its+0.010
Neg logit contributions
uries-0.040
aces-0.039
nes-0.039
shirts-0.035
coat-0.035
Token the
Token ID262
Feature activation+8.63e-03

Pos logit contributions
 swollen+0.021
othe+0.018
 scared+0.017
 frightening+0.017
 shocked+0.015
Neg logit contributions
uries-0.085
nes-0.083
aces-0.083
Whe-0.078
 parties-0.078
Token day
Token ID1110
Feature activation-2.54e-02

Pos logit contributions
 swollen+0.013
 scared+0.012
 frightening+0.011
 demanding+0.011
 screaming+0.011
Neg logit contributions
aces-0.020
uries-0.020
nes-0.020
Whe-0.017
coat-0.017
Token,
Token ID11
Feature activation+2.05e-03

Pos logit contributions
uries+0.078
nes+0.076
aces+0.075
Whe+0.071
shirts+0.070
Neg logit contributions
 scared-0.016
 swollen-0.015
othe-0.014
 its-0.014
 Andy-0.013
Token he
Token ID339
Feature activation+7.38e-03

Pos logit contributions
 swollen+0.002
 scared+0.002
othe+0.002
 its+0.002
 frightening+0.002
Neg logit contributions
uries-0.006
nes-0.006
aces-0.006
Whe-0.005
shirts-0.005
Token put
Token ID1234
Feature activation-2.65e-02

Pos logit contributions
 swollen+0.007
 scared+0.007
othe+0.006
 its+0.006
 frightening+0.006
Neg logit contributions
uries-0.020
nes-0.020
aces-0.020
Whe-0.018
shirts-0.018
Token the
Token ID262
Feature activation+1.76e-02

Pos logit contributions
uries+0.083
nes+0.082
aces+0.079
Whe+0.074
 parties+0.074
Neg logit contributions
 swollen-0.018
 its-0.016
 scared-0.015
othe-0.015
 Andy-0.013
Token pistol
Token ID16790
Feature activation+2.08e-02

Pos logit contributions
 swollen+0.028
 scared+0.026
othe+0.025
 frightening+0.025
 its+0.024
Neg logit contributions
uries-0.039
nes-0.038
aces-0.037
Whe-0.034
 parties-0.033
Token
Token ID198
Feature activation+6.72e-04

Pos logit contributions
 swollen+0.004
 frightening+0.003
 trembling+0.003
 burning+0.003
 cracking+0.003
Neg logit contributions
nes-0.008
uries-0.008
Whe-0.008
aces-0.007
 parties-0.007
Token"
Token ID1
Feature activation-5.96e-02

Pos logit contributions
 swollen+0.001
 scared+0.001
othe+0.001
 its+0.001
 frightening+0.001
Neg logit contributions
uries-0.002
aces-0.002
nes-0.002
Whe-0.002
 parties-0.002
TokenV
Token ID53
Feature activation-5.92e-02

Pos logit contributions
aces+0.165
uries+0.165
nes+0.155
 parties+0.145
atmeal+0.140
Neg logit contributions
 swollen-0.054
-0.052
 its-0.049
 scared-0.049
 was-0.045
Tokenroom
Token ID3823
Feature activation-1.25e-02

Pos logit contributions
uries+0.119
nes+0.103
shirts+0.102
aces+0.100
shirt+0.090
Neg logit contributions
oc-0.096
 was-0.095
,-0.093
:-0.092
 Bee-0.091
Token,
Token ID11
Feature activation-3.53e-02

Pos logit contributions
uries+0.041
nes+0.040
aces+0.039
Whe+0.037
shirts+0.037
Neg logit contributions
 swollen-0.006
 scared-0.005
othe-0.005
 its-0.004
 Andy-0.004
Token v
Token ID410
Feature activation-9.47e-02

Pos logit contributions
uries+0.084
nes+0.081
aces+0.080
Whe+0.073
coat+0.073
Neg logit contributions
 swollen-0.049
 scared-0.045
othe-0.044
 its-0.043
 frightening-0.042
Tokenroom
Token ID3823
Feature activation-1.55e-02

Pos logit contributions
uries+0.148
nes+0.135
aces+0.128
shirts+0.117
 parties+0.110
Neg logit contributions
 swollen-0.195
othe-0.190
 scared-0.189
 its-0.184
 shocked-0.183
Token,
Token ID11
Feature activation-3.40e-02

Pos logit contributions
uries+0.049
nes+0.047
aces+0.046
shirts+0.043
Whe+0.043
Neg logit contributions
 swollen-0.010
 scared-0.009
 its-0.008
othe-0.008
 Andy-0.008
Token my
Token ID616
Feature activation-4.92e-02

Pos logit contributions
uries+0.086
nes+0.084
aces+0.083
coat+0.075
shirts+0.075
Neg logit contributions
 swollen-0.041
 scared-0.039
 its-0.038
othe-0.037
 shaking-0.035
Token car
Token ID1097
Feature activation-7.83e-03

Pos logit contributions
nes+0.120
uries+0.120
aces+0.112
shirts+0.103
Whe+0.101
Neg logit contributions
 scared-0.065
 swollen-0.063
 angry-0.059
 worried-0.058
 shaking-0.057
Token is
Token ID318
Feature activation-5.26e-03

Pos logit contributions
uries+0.024
nes+0.023
aces+0.022
Whe+0.021
shirts+0.021
Neg logit contributions
 swollen-0.006
 scared-0.005
 its-0.005
othe-0.005
 angry-0.004
Token can
Token ID460
Feature activation-2.82e-02

Pos logit contributions
uries+0.199
aces+0.197
Whe+0.197
nes+0.194
coat+0.184
Neg logit contributions
 scared-0.074
 swollen-0.064
 worried-0.064
 its-0.061
 dangerous-0.059
Token sew
Token ID34249
Feature activation-2.85e-02

Pos logit contributions
uries+0.066
aces+0.065
nes+0.064
Whe+0.061
shirts+0.059
Neg logit contributions
 swollen-0.038
 scared-0.036
othe-0.035
 its-0.034
 faint-0.033
Token some
Token ID617
Feature activation-4.04e-02

Pos logit contributions
uries+0.088
nes+0.085
aces+0.083
coat+0.080
Whe+0.080
Neg logit contributions
 its-0.023
 the-0.018
 swollen-0.018
 scared-0.016
 angry-0.015
Token buttons
Token ID12163
Feature activation-6.37e-03

Pos logit contributions
uries+0.100
nes+0.096
aces+0.090
Whe+0.088
coat+0.083
Neg logit contributions
 swollen-0.055
 its-0.052
 scared-0.051
 dangerous-0.048
 angry-0.047
Token or
Token ID393
Feature activation-2.58e-02

Pos logit contributions
uries+0.019
nes+0.019
Whe+0.018
aces+0.018
shirts+0.017
Neg logit contributions
 its-0.004
 scared-0.004
 swollen-0.004
 the-0.003
s-0.003
Token rib
Token ID12183
Feature activation-1.12e-01

Pos logit contributions
uries+0.058
nes+0.057
aces+0.055
Whe+0.052
coat+0.051
Neg logit contributions
 swollen-0.038
 scared-0.036
 its-0.036
 angry-0.034
othe-0.034
Tokenbons
Token ID23461
Feature activation-6.15e-02

Pos logit contributions
nes+0.111
uries+0.110
aces+0.090
Whe+0.079
coat+0.077
Neg logit contributions
 swollen-0.300
 its-0.289
othe-0.280
 scared-0.278
 Andy-0.276
Token or
Token ID393
Feature activation-3.02e-02

Pos logit contributions
uries+0.185
Whe+0.176
nes+0.174
aces+0.170
shirts+0.166
Neg logit contributions
 its-0.044
 the-0.040
s-0.036
 swollen-0.035
 scared-0.035
Token patches
Token ID16082
Feature activation-4.90e-02

Pos logit contributions
nes+0.062
uries+0.061
aces+0.057
Whe+0.054
coat+0.053
Neg logit contributions
 swollen-0.051
 scared-0.048
 its-0.048
 angry-0.046
 dangerous-0.045
Token.
Token ID13
Feature activation-1.78e-02

Pos logit contributions
uries+0.143
nes+0.137
aces+0.133
Whe+0.132
shirts+0.127
Neg logit contributions
 its-0.038
 swollen-0.036
 scared-0.035
othe-0.035
 the-0.031
Token Would
Token ID10928
Feature activation-6.58e-02

Pos logit contributions
aces+0.046
nes+0.046
uries+0.045
coat+0.042
 parties+0.042
Neg logit contributions
 swollen-0.018
 its-0.017
 Andy-0.017
 scared-0.017
 Farmer-0.017
Token I
Token ID314
Feature activation-5.91e-02

Pos logit contributions
uries+0.090
nes+0.089
aces+0.088
coat+0.081
Whe+0.079
Neg logit contributions
 Andy-0.025
 its-0.024
 swollen-0.024
 scared-0.023
 Farmer-0.021
Token've
Token ID1053
Feature activation-1.57e-02

Pos logit contributions
uries+0.155
aces+0.154
nes+0.151
Whe+0.139
shirts+0.136
Neg logit contributions
 scared-0.065
 swollen-0.060
 worried-0.056
 surprised-0.056
 its-0.055
Token got
Token ID1392
Feature activation-6.69e-03

Pos logit contributions
uries+0.040
aces+0.040
nes+0.038
Whe+0.037
ests+0.036
Neg logit contributions
 scared-0.018
 its-0.017
 was-0.017
 the-0.016
 told-0.016
Token a
Token ID257
Feature activation-8.79e-03

Pos logit contributions
uries+0.021
aces+0.020
nes+0.020
coat+0.020
Whe+0.019
Neg logit contributions
 scared-0.005
 its-0.004
 the-0.004
 angry-0.004
 swollen-0.004
Token new
Token ID649
Feature activation-4.52e-02

Pos logit contributions
aces+0.021
uries+0.021
nes+0.020
Whe+0.018
 parties+0.017
Neg logit contributions
 swollen-0.013
 scared-0.012
 frightening-0.011
 its-0.011
 faint-0.011
Token leather
Token ID11620
Feature activation-9.34e-02

Pos logit contributions
uries+0.084
aces+0.081
nes+0.079
Whe+0.067
 parties+0.063
Neg logit contributions
 swollen-0.091
 its-0.086
 scared-0.086
 trembling-0.082
 shaking-0.081
Token jacket
Token ID15224
Feature activation-3.55e-02

Pos logit contributions
uries+0.201
nes+0.184
Whe+0.183
aces+0.181
 parties+0.161
Neg logit contributions
 swollen-0.154
s-0.146
 burning-0.140
 shaking-0.139
 that-0.139
Token!"
Token ID2474
Feature activation+9.01e-04

Pos logit contributions
uries+0.101
Whe+0.096
nes+0.095
aces+0.095
shirts+0.091
Neg logit contributions
 scared-0.030
 swollen-0.027
othe-0.026
 Andy-0.025
 burning-0.025
Token he
Token ID339
Feature activation-5.57e-03

Pos logit contributions
 swollen+0.001
 frightening+0.001
 attacking+0.001
 silent+0.001
 roaring+0.001
Neg logit contributions
nes-0.002
aces-0.002
uries-0.002
Whe-0.002
 shorts-0.002
Token said
Token ID531
Feature activation-5.26e-03

Pos logit contributions
uries+0.016
nes+0.016
aces+0.016
coat+0.015
shirts+0.015
Neg logit contributions
 swollen-0.004
othe-0.004
 frightening-0.003
 burning-0.003
 scared-0.003
Token,
Token ID11
Feature activation-1.18e-02

Pos logit contributions
uries+0.014
nes+0.014
aces+0.014
Whe+0.013
coat+0.013
Neg logit contributions
 swollen-0.005
othe-0.005
 scared-0.005
 frightening-0.004
 its-0.004
Token to
Token ID284
Feature activation+2.77e-02

Pos logit contributions
uries+0.142
nes+0.141
aces+0.137
Whe+0.130
shirts+0.125
Neg logit contributions
 scared-0.024
 its-0.020
 shaking-0.019
 burning-0.019
 swollen-0.019
Token prepare
Token ID8335
Feature activation-5.66e-02

Pos logit contributions
 swollen+0.026
 scared+0.024
othe+0.023
 its+0.023
 Andy+0.022
Neg logit contributions
uries-0.077
aces-0.075
nes-0.075
Whe-0.068
coat-0.068
Token for
Token ID329
Feature activation-1.86e-02

Pos logit contributions
uries+0.142
nes+0.137
aces+0.136
Whe+0.126
coat+0.123
Neg logit contributions
 swollen-0.070
othe-0.066
 scared-0.065
 its-0.065
 burning-0.061
Token a
Token ID257
Feature activation-2.26e-02

Pos logit contributions
uries+0.044
nes+0.042
aces+0.042
Whe+0.038
coat+0.038
Neg logit contributions
 swollen-0.026
 scared-0.024
othe-0.024
 its-0.023
 frightening-0.022
Token fun
Token ID1257
Feature activation-7.99e-02

Pos logit contributions
aces+0.048
uries+0.047
nes+0.045
Whe+0.041
shirts+0.038
Neg logit contributions
 swollen-0.037
 scared-0.036
othe-0.034
 burning-0.034
 frightening-0.034
Token saf
Token ID1932
Feature activation-1.08e-01

Pos logit contributions
aces+0.167
uries+0.164
nes+0.155
Whe+0.146
shirts+0.132
Neg logit contributions
 scared-0.131
 burning-0.129
 swollen-0.128
 Andy-0.128
 Farmer-0.128
Tokenari
Token ID2743
Feature activation-2.84e-02

Pos logit contributions
uries+0.183
aces+0.175
nes+0.173
Whe+0.164
shirts+0.156
Neg logit contributions
othe-0.195
 scared-0.191
s-0.190
 Andy-0.189
-0.185
Token.
Token ID13
Feature activation-1.37e-02

Pos logit contributions
uries+0.094
aces+0.093
Whe+0.092
nes+0.090
shirts+0.086
Neg logit contributions
 scared-0.009
 its-0.008
 Andy-0.006
 the-0.005
othe-0.004
Token They
Token ID1119
Feature activation-4.24e-02

Pos logit contributions
uries+0.032
aces+0.031
nes+0.031
Whe+0.029
 parties+0.029
Neg logit contributions
 swollen-0.022
 frightening-0.017
othe-0.017
 burning-0.016
 screaming-0.016
Token wanted
Token ID2227
Feature activation-1.48e-02

Pos logit contributions
uries+0.113
aces+0.110
nes+0.109
Whe+0.103
coat+0.098
Neg logit contributions
 scared-0.045
 swollen-0.042
 its-0.039
 Andy-0.039
 worried-0.039
Token to
Token ID284
Feature activation+1.12e-02

Pos logit contributions
uries+0.044
nes+0.043
aces+0.043
Whe+0.040
coat+0.039
Neg logit contributions
 swollen-0.011
othe-0.010
 scared-0.010
 its-0.009
 frightening-0.009
Token,
Token ID11
Feature activation-1.53e-02

Pos logit contributions
uries+0.159
aces+0.156
nes+0.154
Whe+0.153
shirts+0.145
Neg logit contributions
 scared-0.052
 swollen-0.045
 its-0.044
 worried-0.044
othe-0.044
Token Pete
Token ID17542
Feature activation-1.76e-03

Pos logit contributions
aces+0.041
uries+0.040
nes+0.039
shirts+0.037
Whe+0.036
Neg logit contributions
 scared-0.016
 its-0.016
 swollen-0.015
 Farmer-0.015
 Andy-0.015
Token"
Token ID1
Feature activation-4.65e-02

Pos logit contributions
uries+0.005
aces+0.005
nes+0.005
Whe+0.005
shirts+0.004
Neg logit contributions
 swollen-0.002
 scared-0.002
 its-0.001
othe-0.001
 Andy-0.001
Token
Token ID198
Feature activation+9.42e-03

Pos logit contributions
uries+0.110
aces+0.110
nes+0.106
Whe+0.096
coat+0.096
Neg logit contributions
 swollen-0.064
 its-0.061
 scared-0.060
 Farmer-0.058
othe-0.058
Token
Token ID198
Feature activation+5.49e-03

Pos logit contributions
 swollen+0.008
othe+0.007
 scared+0.007
 frightening+0.007
 its+0.006
Neg logit contributions
uries-0.027
nes-0.026
aces-0.026
Whe-0.024
shirts-0.024
TokenP
Token ID47
Feature activation-9.64e-02

Pos logit contributions
 swollen+0.008
 its+0.007
othe+0.007
 scared+0.007
 Farmer+0.007
Neg logit contributions
aces-0.012
uries-0.012
nes-0.012
coat-0.011
Whe-0.011
Tokenete
Token ID14471
Feature activation-2.86e-03

Pos logit contributions
uries+0.198
nes+0.189
aces+0.186
Whe+0.181
coat+0.176
Neg logit contributions
othe-0.153
 swollen-0.148
 its-0.140
:-0.139
 scared-0.139
Token was
Token ID373
Feature activation+5.02e-03

Pos logit contributions
uries+0.008
nes+0.008
aces+0.008
Whe+0.007
shirts+0.007
Neg logit contributions
 swollen-0.002
 scared-0.002
 its-0.002
othe-0.002
 Andy-0.002
Token touched
Token ID12615
Feature activation+5.47e-02

Pos logit contributions
 swollen+0.007
 scared+0.007
othe+0.007
 its+0.006
 frightening+0.006
Neg logit contributions
uries-0.012
nes-0.011
aces-0.011
Whe-0.010
coat-0.010
Token by
Token ID416
Feature activation+4.30e-02

Pos logit contributions
 swollen+0.046
 scared+0.042
 its+0.041
othe+0.039
 frightening+0.037
Neg logit contributions
uries-0.160
nes-0.158
aces-0.153
Whe-0.144
coat-0.141
Token his
Token ID465
Feature activation-1.45e-02

Pos logit contributions
 swollen+0.041
othe+0.036
 scared+0.036
 its+0.034
 frightening+0.033
Neg logit contributions
uries-0.119
nes-0.117
aces-0.116
Whe-0.108
coat-0.106