TOP ACTIVATIONS
MAX = 0.107

would whisper " thank you " to whoever put it there
Ha iley took Mir a for a walk and
Let 's take a break " But Tim was
, " You 're welcome " before returning to his stable
decided to say " hello " to all the animals she
le the knot unt ier " and she was the happiest
It was a big terr ac otta pot but it had
" and " thank you " to show her mom how
and said " thank you " to the mattress . The
he said " thank you " to his mom .
He said " thank you " to Tina and told her
tree and say " hello " to the little boy .
and said " thank you " to her mom my .
She kept saying " hello " to animals , and everywhere
naughty and did not share " and included a smile y
family said " thank you " before they left . They
need to learn a lot " said Mum .
a big " good bye " to her mum .
always said " thank you " when someone gave her something
people started saying " hello " too ! <|endoftext|> Once upon

MIN ACTIVATIONS
MIN = -0.104

wardrobe changed from ancient to brand new ! They looked around
scattered everywhere and the children crying and yelling . She was
the feather turned into a brand new staff . The wizard
kind lady saw Tim my crying and asked him what was
the h anger for her brand new dress . The cats
the broken sword and the crying children . She was not
the torn paper and the crying Tim . She was mad
man drove off in his brand new car , and he
heard the noise and the crying . She came to see
with your cough and your fever ," Mom urged .
floor and Sara and Ben crying . She was angry and
on her mom my 's brand new table !
spilling over and the twins crying . She quickly turned off
came in and saw her crying . He asked her ,
shop sitting by himself and crying . Jack
crown and playing in the freshly cut grass . It looked
the floor and the children crying . She said , "
's voice call out from afar . " Andy ! Hur
put your toys away when you are done . Come on
the broken plane and the crying children . She hugs them

INTERVAL 0.054 - 0.107
CONTAINS 0.174%

and food . They put everything in the crib , so
a tree . They looked up and saw the bird sitting
They put the book away and go to play
breakfast . He would pour the milk and cereal into a
play . While they were there , Lily accidentally dropped her

INTERVAL 0.002 - 0.054
CONTAINS 47.466%

to play with his friends . But when he got there
coloring book . She took out her cr ay ons and
They smile and nod . But then , Tom
and cookies . They were happy and proud of their work
fire work that had lost its spark le . It was

INTERVAL -0.051 - 0.002
CONTAINS 52.076%

birthday ever ! <|endoftext|> Once upon a time there was a
She has a dress and a hat . She is attractive
pole out of the ground . One by one
They played with the ball of yarn together .
It is so pretty . Can we catch it ?"

INTERVAL -0.104 - -0.051
CONTAINS 0.285%

was very dark outside and very late . Tom and Amy
, " Young lady , you should never go places you
and Liz are scared and sad . They cry and run
" The elderly lady nodded and said ,
just very quiet . And very pretty . He is the
Token would
Token ID561
Feature activation-1.34e-03

Pos logit contributions
 inconvenience+0.014
 hospital+0.013
 fault+0.013
 Enjoy+0.013
 tells+0.012
Neg logit contributions
quet-0.015
ized-0.015
ingly-0.015
bing-0.014
worm-0.014
Token whisper
Token ID31992
Feature activation+2.43e-03

Pos logit contributions
 inconvenience+0.002
 hospital+0.002
 fault+0.002
 tells+0.002
have+0.002
Neg logit contributions
quet-0.003
ized-0.002
ingly-0.002
bing-0.002
izing-0.002
Token "
Token ID366
Feature activation+6.58e-02

Pos logit contributions
quet+0.004
ized+0.004
bing+0.004
ingly+0.004
izing+0.004
Neg logit contributions
 hospital-0.004
 inconvenience-0.003
 tells-0.003
 fault-0.003
 Enjoy-0.003
Tokenthank
Token ID40716
Feature activation+2.38e-02

Pos logit contributions
quet+0.099
ized+0.094
bing+0.094
ingly+0.092
izing+0.091
Neg logit contributions
 hospital-0.109
 tells-0.107
 inconvenience-0.106
have-0.106
 Enjoy-0.105
Token you
Token ID345
Feature activation+6.31e-03

Pos logit contributions
quet+0.050
ized+0.049
bing+0.049
ingly+0.048
hetically+0.047
Neg logit contributions
 hospital-0.025
 tells-0.025
 Enjoy-0.025
 inconvenience-0.024
have-0.024
Token"
Token ID1
Feature activation+1.07e-01

Pos logit contributions
quet+0.010
bing+0.009
ingly+0.009
ized+0.009
izing+0.009
Neg logit contributions
 hospital-0.011
 inconvenience-0.011
 fault-0.011
 tells-0.010
 dentist-0.010
Token to
Token ID284
Feature activation-3.82e-03

Pos logit contributions
quet+0.198
ized+0.194
bing+0.194
izing+0.188
ised+0.184
Neg logit contributions
 tells-0.140
 dentist-0.135
 hospital-0.135
 done-0.132
have-0.130
Token whoever
Token ID16958
Feature activation-8.74e-03

Pos logit contributions
 hospital+0.006
 inconvenience+0.005
 tells+0.005
 fault+0.005
 Enjoy+0.005
Neg logit contributions
quet-0.007
ized-0.006
bing-0.006
ingly-0.006
izing-0.006
Token put
Token ID1234
Feature activation-7.31e-03

Pos logit contributions
 hospital+0.013
 inconvenience+0.012
 fault+0.012
 Enjoy+0.012
 tells+0.012
Neg logit contributions
quet-0.016
ized-0.015
ingly-0.015
bing-0.014
izing-0.014
Token it
Token ID340
Feature activation+2.34e-02

Pos logit contributions
 hospital+0.011
 inconvenience+0.011
 tells+0.011
 fault+0.011
 Enjoy+0.010
Neg logit contributions
quet-0.013
ized-0.012
bing-0.012
ingly-0.012
izing-0.011
Token there
Token ID612
Feature activation+3.68e-02

Pos logit contributions
quet+0.034
ized+0.031
bing+0.031
ingly+0.031
izing+0.030
Neg logit contributions
 hospital-0.043
 inconvenience-0.041
 tells-0.041
 fault-0.041
 Enjoy-0.040
Token
Token ID198
Feature activation-3.35e-03

Pos logit contributions
 inconvenience+0.007
 hospital+0.007
 fault+0.007
 mess+0.007
have+0.007
Neg logit contributions
ingly-0.006
edly-0.006
quet-0.006
bing-0.006
als-0.006
Token
Token ID198
Feature activation+6.83e-03

Pos logit contributions
 fault+0.005
 inconvenience+0.005
 hospital+0.005
 decides+0.005
 mess+0.005
Neg logit contributions
quet-0.005
ingly-0.005
bing-0.005
edly-0.005
ized-0.005
TokenHa
Token ID23303
Feature activation+3.43e-02

Pos logit contributions
quet+0.008
bing+0.007
ingly+0.007
ized+0.007
izing+0.007
Neg logit contributions
 fault-0.015
 hospital-0.015
 inconvenience-0.014
 tells-0.014
 Enjoy-0.014
Tokeniley
Token ID9618
Feature activation-4.73e-03

Pos logit contributions
quet+0.064
ized+0.060
bing+0.060
ingly+0.060
izing+0.058
Neg logit contributions
 hospital-0.049
 tells-0.047
 inconvenience-0.047
 fault-0.046
 Enjoy-0.045
Token took
Token ID1718
Feature activation+1.56e-02

Pos logit contributions
 hospital+0.008
 inconvenience+0.008
 fault+0.008
 Enjoy+0.008
 tells+0.008
Neg logit contributions
quet-0.007
ingly-0.007
ized-0.007
bing-0.007
izing-0.006
Token Mir
Token ID7381
Feature activation+1.01e-01

Pos logit contributions
quet+0.030
bing+0.029
ized+0.029
ingly+0.029
izing+0.028
Neg logit contributions
 hospital-0.021
 inconvenience-0.020
 tells-0.020
 fault-0.019
 Enjoy-0.019
Tokena
Token ID64
Feature activation+4.47e-02

Pos logit contributions
quet+0.200
ized+0.190
bing+0.190
ingly+0.188
izing+0.185
Neg logit contributions
 hospital-0.130
 tells-0.124
 inconvenience-0.124
 fault-0.121
 Enjoy-0.119
Token for
Token ID329
Feature activation-2.66e-02

Pos logit contributions
quet+0.071
ingly+0.068
bing+0.068
ized+0.068
worm+0.064
Neg logit contributions
 inconvenience-0.073
 hospital-0.073
 tells-0.071
 dentist-0.070
 Enjoy-0.069
Token a
Token ID257
Feature activation-1.01e-02

Pos logit contributions
 decides+0.041
 fault+0.039
arre+0.039
 tells+0.038
 hospital+0.038
Neg logit contributions
quet-0.046
ingly-0.042
edly-0.041
ized-0.040
bing-0.040
Token walk
Token ID2513
Feature activation+2.21e-02

Pos logit contributions
 tells+0.018
 hospital+0.018
 decides+0.017
 inconvenience+0.017
 fault+0.017
Neg logit contributions
quet-0.015
ized-0.014
ingly-0.014
worm-0.013
bing-0.013
Token and
Token ID290
Feature activation-1.36e-03

Pos logit contributions
quet+0.037
bing+0.034
ized+0.034
izing+0.033
ingly+0.032
Neg logit contributions
 hospital-0.036
 tells-0.033
 inconvenience-0.033
 fault-0.033
 Enjoy-0.032
TokenLet
Token ID5756
Feature activation+2.08e-02

Pos logit contributions
quet+0.039
ized+0.038
bing+0.037
ingly+0.036
izing+0.035
Neg logit contributions
 hospital-0.054
 fault-0.053
 tells-0.053
 inconvenience-0.053
 Enjoy-0.049
Token's
Token ID338
Feature activation-2.99e-02

Pos logit contributions
ized+0.031
quet+0.030
bing+0.029
ingly+0.028
izing+0.028
Neg logit contributions
 inconvenience-0.036
 fault-0.035
 Enjoy-0.035
 hospital-0.034
ws-0.034
Token take
Token ID1011
Feature activation+7.86e-03

Pos logit contributions
 inconvenience+0.038
 fault+0.038
 hospital+0.035
 tells+0.035
 microwave+0.035
Neg logit contributions
quet-0.057
bing-0.054
ized-0.053
ingly-0.052
ised-0.050
Token a
Token ID257
Feature activation+8.96e-03

Pos logit contributions
quet+0.014
ized+0.014
bing+0.014
ised+0.013
ingly+0.013
Neg logit contributions
 hospital-0.011
 inconvenience-0.011
 fault-0.011
 dentist-0.010
 tells-0.010
Token break
Token ID2270
Feature activation-2.70e-03

Pos logit contributions
quet+0.014
worm+0.013
ized+0.013
ingly+0.013
stones+0.012
Neg logit contributions
 hospital-0.015
 fault-0.014
 decides-0.014
 tells-0.014
have-0.014
Token"
Token ID1
Feature activation+1.01e-01

Pos logit contributions
 hospital+0.005
 inconvenience+0.004
 tells+0.004
 Enjoy+0.004
 fault+0.004
Neg logit contributions
quet-0.004
ized-0.004
bing-0.004
ingly-0.004
izing-0.004
Token
Token ID220
Feature activation+4.52e-02

Pos logit contributions
quet+0.157
ized+0.147
bing+0.147
ingly+0.144
izing+0.142
Neg logit contributions
 hospital-0.170
 tells-0.165
 inconvenience-0.163
 fault-0.161
 Enjoy-0.159
Token
Token ID198
Feature activation+4.46e-03

Pos logit contributions
quet+0.068
bing+0.063
ized+0.062
ingly+0.062
izing+0.059
Neg logit contributions
 hospital-0.082
 inconvenience-0.080
 tells-0.079
 fault-0.076
 decides-0.076
TokenBut
Token ID1537
Feature activation+2.11e-02

Pos logit contributions
quet+0.009
ized+0.009
bing+0.008
ingly+0.008
izing+0.008
Neg logit contributions
 hospital-0.006
 fault-0.005
 inconvenience-0.005
 tells-0.005
 decides-0.005
Token Tim
Token ID5045
Feature activation-8.32e-03

Pos logit contributions
quet+0.034
bing+0.034
ized+0.033
izing+0.032
ingly+0.032
Neg logit contributions
 hospital-0.033
 inconvenience-0.033
 fault-0.032
 tells-0.032
 Enjoy-0.031
Token was
Token ID373
Feature activation-1.05e-02

Pos logit contributions
 inconvenience+0.016
 hospital+0.016
 Enjoy+0.016
 aquarium+0.015
 microwave+0.015
Neg logit contributions
ingly-0.010
bing-0.010
ized-0.010
hog-0.010
hetically-0.009
Token,
Token ID11
Feature activation-1.43e-03

Pos logit contributions
quet+0.020
ized+0.020
bing+0.020
izing+0.019
ingly+0.019
Neg logit contributions
 Enjoy-0.020
 tells-0.019
 hospital-0.019
 inconvenience-0.018
 decides-0.018
Token "
Token ID366
Feature activation+4.14e-02

Pos logit contributions
 hospital+0.002
 tells+0.002
 inconvenience+0.002
 Enjoy+0.002
 fault+0.001
Neg logit contributions
quet-0.003
ized-0.003
ingly-0.003
bing-0.003
izing-0.003
TokenYou
Token ID1639
Feature activation-2.03e-03

Pos logit contributions
quet+0.079
ized+0.079
bing+0.078
ingly+0.077
hog+0.076
Neg logit contributions
 Enjoy-0.053
have-0.049
 Time-0.048
 Congratulations-0.048
 Good-0.047
Token're
Token ID821
Feature activation-7.26e-03

Pos logit contributions
 hospital+0.003
 inconvenience+0.003
 microwave+0.003
 dentist+0.003
 fault+0.003
Neg logit contributions
quet-0.003
als-0.003
ized-0.003
stones-0.003
ised-0.003
Token welcome
Token ID7062
Feature activation+8.40e-03

Pos logit contributions
 hospital+0.008
 inconvenience+0.008
 fault+0.008
 tells+0.008
 decides+0.007
Neg logit contributions
quet-0.015
bing-0.015
ized-0.014
izing-0.014
ingly-0.014
Token"
Token ID1
Feature activation+9.93e-02

Pos logit contributions
quet+0.014
ized+0.013
bing+0.013
ingly+0.013
izing+0.013
Neg logit contributions
 hospital-0.013
 tells-0.013
 inconvenience-0.013
 Enjoy-0.013
 fault-0.013
Token before
Token ID878
Feature activation+9.80e-03

Pos logit contributions
quet+0.175
ized+0.169
bing+0.167
izing+0.162
ingly+0.157
Neg logit contributions
 tells-0.145
 hospital-0.133
have-0.132
 decides-0.132
 done-0.131
Token returning
Token ID8024
Feature activation-1.80e-02

Pos logit contributions
quet+0.016
bing+0.015
ized+0.015
ingly+0.014
izing+0.014
Neg logit contributions
 hospital-0.017
 inconvenience-0.016
 tells-0.016
 fault-0.016
 Enjoy-0.016
Token to
Token ID284
Feature activation-6.87e-03

Pos logit contributions
 hospital+0.023
 tells+0.022
 Enjoy+0.021
 fault+0.021
 inconvenience+0.021
Neg logit contributions
quet-0.037
ized-0.035
ingly-0.035
bing-0.034
izing-0.034
Token his
Token ID465
Feature activation-4.61e-03

Pos logit contributions
 fault+0.012
 inconvenience+0.011
 microwave+0.011
 decides+0.011
have+0.011
Neg logit contributions
quet-0.010
bing-0.009
ized-0.009
ingly-0.009
izing-0.009
Token stable
Token ID8245
Feature activation+6.15e-04

Pos logit contributions
 hospital+0.006
 inconvenience+0.006
 tells+0.006
 fault+0.006
 Enjoy+0.006
Neg logit contributions
quet-0.009
ized-0.008
bing-0.008
ingly-0.008
izing-0.008
Token decided
Token ID3066
Feature activation+3.22e-03

Pos logit contributions
 inconvenience+0.005
 fault+0.005
 Enjoy+0.005
 hospital+0.005
 microwave+0.004
Neg logit contributions
ingly-0.005
ized-0.005
quet-0.005
bing-0.005
als-0.005
Token to
Token ID284
Feature activation-2.22e-03

Pos logit contributions
quet+0.005
ized+0.005
bing+0.004
ingly+0.004
izing+0.004
Neg logit contributions
 hospital-0.006
 inconvenience-0.005
 microwave-0.005
 fault-0.005
 tells-0.005
Token say
Token ID910
Feature activation-2.30e-03

Pos logit contributions
 microwave+0.003
 fault+0.003
 inconvenience+0.003
 hospital+0.003
 tells+0.003
Neg logit contributions
quet-0.004
ized-0.004
ingly-0.004
bing-0.004
izing-0.004
Token "
Token ID366
Feature activation+5.15e-02

Pos logit contributions
 inconvenience+0.003
 fault+0.003
 microwave+0.003
 hospital+0.003
ws+0.003
Neg logit contributions
quet-0.004
ingly-0.003
als-0.003
ized-0.003
bing-0.003
Tokenhello
Token ID31373
Feature activation+1.00e-02

Pos logit contributions
quet+0.079
ized+0.075
bing+0.074
ingly+0.073
izing+0.072
Neg logit contributions
 hospital-0.087
 tells-0.084
 inconvenience-0.084
 Enjoy-0.082
have-0.082
Token"
Token ID1
Feature activation+9.93e-02

Pos logit contributions
quet+0.016
bing+0.015
ized+0.015
izing+0.015
ingly+0.015
Neg logit contributions
 hospital-0.017
 fault-0.016
 inconvenience-0.016
 tells-0.016
 decides-0.016
Token to
Token ID284
Feature activation-1.25e-03

Pos logit contributions
quet+0.175
bing+0.170
ized+0.169
izing+0.164
ised+0.158
Neg logit contributions
 tells-0.143
 hospital-0.140
 dentist-0.134
have-0.132
 done-0.131
Token all
Token ID477
Feature activation-5.11e-03

Pos logit contributions
 fault+0.002
 hospital+0.002
 inconvenience+0.002
 microwave+0.002
 decides+0.002
Neg logit contributions
quet-0.002
bing-0.002
ized-0.002
ingly-0.002
izing-0.002
Token the
Token ID262
Feature activation+4.67e-03

Pos logit contributions
 fault+0.010
 hospital+0.009
 inconvenience+0.009
 dentist+0.009
 microwave+0.009
Neg logit contributions
quet-0.007
ized-0.007
ingly-0.006
bing-0.006
izing-0.006
Token animals
Token ID4695
Feature activation-1.60e-03

Pos logit contributions
quet+0.008
izing+0.007
bing+0.007
ized+0.007
ingly+0.007
Neg logit contributions
ws-0.007
 Enjoy-0.007
 fault-0.007
 hospital-0.006
 microwave-0.006
Token she
Token ID673
Feature activation-1.12e-02

Pos logit contributions
 hospital+0.003
 inconvenience+0.003
 fault+0.003
 dentist+0.003
 tells+0.003
Neg logit contributions
quet-0.002
ized-0.002
bing-0.002
ingly-0.002
izing-0.002
Tokenle
Token ID293
Feature activation+3.46e-03

Pos logit contributions
quet+0.091
bing+0.085
ized+0.084
ingly+0.083
izing+0.081
Neg logit contributions
 hospital-0.104
 inconvenience-0.101
 tells-0.100
 fault-0.098
have-0.098
Token the
Token ID262
Feature activation-1.80e-03

Pos logit contributions
quet+0.005
bing+0.005
ingly+0.005
ized+0.004
izing+0.004
Neg logit contributions
 hospital-0.006
 inconvenience-0.006
 fault-0.006
 dentist-0.006
 microwave-0.006
Token knot
Token ID29654
Feature activation+2.86e-02

Pos logit contributions
 hospital+0.003
 tells+0.002
 inconvenience+0.002
 fault+0.002
 Enjoy+0.002
Neg logit contributions
quet-0.003
ized-0.003
bing-0.003
ingly-0.003
izing-0.003
Token unt
Token ID1418
Feature activation+3.45e-02

Pos logit contributions
quet+0.046
ized+0.042
bing+0.042
ingly+0.042
izing+0.041
Neg logit contributions
 hospital-0.048
 inconvenience-0.047
 tells-0.046
 fault-0.046
 dentist-0.045
Tokenier
Token ID959
Feature activation+1.66e-02

Pos logit contributions
quet+0.060
 spe+0.055
edly+0.055
bing+0.055
ized+0.054
Neg logit contributions
 fault-0.049
have-0.048
 done-0.047
upid-0.047
 tells-0.047
Token"
Token ID1
Feature activation+9.74e-02

Pos logit contributions
quet+0.027
ized+0.026
bing+0.026
ingly+0.025
izing+0.025
Neg logit contributions
 hospital-0.027
 tells-0.026
 inconvenience-0.026
 fault-0.025
 Enjoy-0.025
Token and
Token ID290
Feature activation-1.52e-03

Pos logit contributions
quet+0.163
bing+0.159
ized+0.154
izing+0.150
ingly+0.150
Neg logit contributions
 tells-0.145
 hospital-0.138
have-0.135
 done-0.134
 decides-0.133
Token she
Token ID673
Feature activation-1.49e-02

Pos logit contributions
 hospital+0.002
 inconvenience+0.002
 fault+0.002
 tells+0.002
 Enjoy+0.002
Neg logit contributions
quet-0.003
ized-0.003
bing-0.003
ingly-0.002
izing-0.002
Token was
Token ID373
Feature activation-1.48e-03

Pos logit contributions
 hospital+0.019
 inconvenience+0.019
 fault+0.018
 tells+0.018
 Enjoy+0.018
Neg logit contributions
quet-0.029
ized-0.028
ingly-0.027
bing-0.027
izing-0.026
Token the
Token ID262
Feature activation+2.25e-02

Pos logit contributions
 inconvenience+0.002
 hospital+0.002
 fault+0.002
 tells+0.002
 microwave+0.002
Neg logit contributions
quet-0.003
ized-0.002
bing-0.002
ingly-0.002
izing-0.002
Token happiest
Token ID49414
Feature activation-4.55e-03

Pos logit contributions
quet+0.043
bing+0.041
ized+0.041
ingly+0.040
izing+0.039
Neg logit contributions
 hospital-0.032
 tells-0.030
 fault-0.029
 inconvenience-0.029
 dentist-0.029
Token It
Token ID632
Feature activation+1.70e-02

Pos logit contributions
hog+0.008
quet+0.008
edly+0.008
ingly+0.008
ised+0.007
Neg logit contributions
ws-0.012
 inconvenience-0.012
 fault-0.012
have-0.012
 hospital-0.012
Token was
Token ID373
Feature activation+1.24e-02

Pos logit contributions
quet+0.032
ized+0.030
bing+0.030
ingly+0.030
izing+0.029
Neg logit contributions
 hospital-0.024
 tells-0.023
 inconvenience-0.022
 fault-0.022
 Enjoy-0.022
Token a
Token ID257
Feature activation+3.08e-02

Pos logit contributions
ingly+0.020
quet+0.020
ized+0.019
bing+0.019
 beneath+0.019
Neg logit contributions
 inconvenience-0.019
have-0.018
 tells-0.018
 decides-0.018
 Enjoy-0.018
Token big
Token ID1263
Feature activation+3.29e-02

Pos logit contributions
quet+0.048
worm+0.046
hog+0.044
ized+0.043
 spe+0.043
Neg logit contributions
ws-0.052
have-0.052
 tells-0.050
 fault-0.050
 inconvenience-0.049
Token terr
Token ID8812
Feature activation+3.38e-02

Pos logit contributions
quet+0.049
ized+0.046
ingly+0.045
bing+0.045
izing+0.044
Neg logit contributions
 hospital-0.057
 tells-0.056
have-0.055
 inconvenience-0.055
 fault-0.055
Tokenac
Token ID330
Feature activation+9.72e-02

Pos logit contributions
quet+0.052
ized+0.049
bing+0.048
izing+0.047
ingly+0.046
Neg logit contributions
 hospital-0.059
 inconvenience-0.057
 fault-0.057
 tells-0.057
 Enjoy-0.056
Tokenotta
Token ID12375
Feature activation+5.69e-02

Pos logit contributions
quet+0.154
bing+0.151
izing+0.145
ized+0.143
edly+0.140
Neg logit contributions
 hospital-0.152
have-0.150
 fault-0.149
 done-0.148
 inconvenience-0.145
Token pot
Token ID1787
Feature activation+2.98e-02

Pos logit contributions
quet+0.087
ized+0.081
bing+0.081
ingly+0.080
izing+0.078
Neg logit contributions
 hospital-0.099
 tells-0.096
 inconvenience-0.096
 fault-0.095
 Enjoy-0.094
Token but
Token ID475
Feature activation+2.43e-02

Pos logit contributions
quet+0.044
ized+0.041
bing+0.041
ingly+0.040
izing+0.040
Neg logit contributions
 hospital-0.053
 inconvenience-0.050
 tells-0.050
 fault-0.049
 Enjoy-0.049
Token it
Token ID340
Feature activation-3.80e-03

Pos logit contributions
quet+0.040
ized+0.039
bing+0.039
ingly+0.039
izing+0.038
Neg logit contributions
 hospital-0.038
 inconvenience-0.038
 fault-0.036
 tells-0.036
 Enjoy-0.035
Token had
Token ID550
Feature activation+1.83e-02

Pos logit contributions
 hospital+0.005
 inconvenience+0.005
 tells+0.005
 fault+0.005
 Enjoy+0.005
Neg logit contributions
quet-0.007
ized-0.007
bing-0.006
ingly-0.006
izing-0.006
Token"
Token ID1
Feature activation+7.70e-02

Pos logit contributions
quet+0.002
bing+0.002
ingly+0.002
izing+0.002
als+0.002
Neg logit contributions
 hospital-0.004
 fault-0.003
 inconvenience-0.003
 fuss-0.003
 tells-0.003
Token and
Token ID290
Feature activation-1.77e-02

Pos logit contributions
quet+0.131
ized+0.126
bing+0.126
izing+0.121
ised+0.118
Neg logit contributions
 hospital-0.117
 tells-0.117
 dentist-0.112
 fault-0.109
have-0.108
Token "
Token ID366
Feature activation+3.56e-02

Pos logit contributions
 inconvenience+0.027
 fault+0.027
 hospital+0.027
have+0.026
 aquarium+0.025
Neg logit contributions
quet-0.029
bing-0.028
izing-0.026
ingly-0.026
edly-0.026
Tokenthank
Token ID40716
Feature activation+1.79e-02

Pos logit contributions
quet+0.053
ized+0.049
bing+0.049
ingly+0.048
izing+0.048
Neg logit contributions
 hospital-0.063
 tells-0.061
 inconvenience-0.061
 fault-0.060
 Enjoy-0.059
Token you
Token ID345
Feature activation+4.27e-03

Pos logit contributions
quet+0.038
ized+0.037
bing+0.037
ingly+0.036
izing+0.036
Neg logit contributions
 hospital-0.020
 tells-0.019
 Enjoy-0.019
 inconvenience-0.018
have-0.018
Token"
Token ID1
Feature activation+9.71e-02

Pos logit contributions
quet+0.006
bing+0.006
ingly+0.006
ized+0.005
izing+0.005
Neg logit contributions
 hospital-0.008
 inconvenience-0.008
 fault-0.008
 tells-0.008
 decides-0.007
Token to
Token ID284
Feature activation-6.87e-03

Pos logit contributions
quet+0.174
bing+0.170
ized+0.168
izing+0.165
ised+0.159
Neg logit contributions
 tells-0.140
 hospital-0.137
 dentist-0.132
have-0.129
 done-0.128
Token show
Token ID905
Feature activation-4.07e-03

Pos logit contributions
 inconvenience+0.011
 fault+0.011
 hospital+0.011
 decides+0.011
have+0.011
Neg logit contributions
quet-0.011
bing-0.010
ized-0.010
ingly-0.010
izing-0.009
Token her
Token ID607
Feature activation+6.70e-03

Pos logit contributions
 hospital+0.006
 inconvenience+0.006
 tells+0.006
 fault+0.006
 Enjoy+0.006
Neg logit contributions
quet-0.007
ized-0.006
bing-0.006
ingly-0.006
izing-0.006
Token mom
Token ID1995
Feature activation+3.36e-02

Pos logit contributions
quet+0.011
ingly+0.010
bing+0.010
ized+0.010
worm+0.010
Neg logit contributions
 inconvenience-0.011
 Enjoy-0.010
 decides-0.010
have-0.010
 hospital-0.010
Token how
Token ID703
Feature activation+3.64e-02

Pos logit contributions
quet+0.052
ingly+0.049
bing+0.049
ized+0.048
als+0.045
Neg logit contributions
 inconvenience-0.059
 hospital-0.057
 fault-0.055
 tells-0.054
 Enjoy-0.054
Token and
Token ID290
Feature activation+5.16e-03

Pos logit contributions
 hospital+0.002
 inconvenience+0.002
 tells+0.002
 fault+0.002
 Enjoy+0.002
Neg logit contributions
quet-0.002
ized-0.002
bing-0.002
ingly-0.002
izing-0.002
Token said
Token ID531
Feature activation-1.51e-03

Pos logit contributions
quet+0.010
ized+0.009
bing+0.009
ingly+0.009
izing+0.009
Neg logit contributions
 hospital-0.007
 inconvenience-0.007
 fault-0.007
 tells-0.007
 Enjoy-0.007
Token "
Token ID366
Feature activation+3.60e-02

Pos logit contributions
 hospital+0.002
 tells+0.002
 Enjoy+0.002
 inconvenience+0.002
 fault+0.002
Neg logit contributions
quet-0.003
ized-0.002
bing-0.002
ingly-0.002
izing-0.002
Tokenthank
Token ID40716
Feature activation+8.32e-03

Pos logit contributions
quet+0.058
ized+0.056
bing+0.055
ingly+0.054
izing+0.054
Neg logit contributions
 Enjoy-0.054
have-0.054
 hospital-0.053
 inconvenience-0.052
 decides-0.052
Token you
Token ID345
Feature activation-3.00e-03

Pos logit contributions
ingly+0.020
ized+0.020
lights+0.020
imm+0.020
quet+0.020
Neg logit contributions
 Enjoy-0.006
 tells-0.006
 hospital-0.006
 inconvenience-0.005
have-0.005
Token"
Token ID1
Feature activation+9.69e-02

Pos logit contributions
 hospital+0.005
 tells+0.005
 inconvenience+0.005
 fault+0.005
 Enjoy+0.005
Neg logit contributions
quet-0.005
bing-0.004
ized-0.004
ingly-0.004
izing-0.004
Token to
Token ID284
Feature activation-8.99e-03

Pos logit contributions
quet+0.195
bing+0.193
ized+0.190
izing+0.186
ised+0.184
Neg logit contributions
 tells-0.114
 done-0.108
 dentist-0.108
 hospital-0.106
have-0.101
Token the
Token ID262
Feature activation-1.86e-02

Pos logit contributions
 hospital+0.013
 tells+0.012
 inconvenience+0.012
 Enjoy+0.011
 fault+0.011
Neg logit contributions
quet-0.017
ized-0.016
ingly-0.016
bing-0.016
izing-0.016
Token mattress
Token ID33388
Feature activation-4.49e-03

Pos logit contributions
 hospital+0.025
 tells+0.024
 inconvenience+0.023
 fault+0.023
 Enjoy+0.023
Neg logit contributions
quet-0.036
ized-0.034
bing-0.034
ingly-0.034
izing-0.033
Token.
Token ID13
Feature activation-9.43e-03

Pos logit contributions
 hospital+0.007
 inconvenience+0.007
 tells+0.007
 fault+0.007
 Enjoy+0.007
Neg logit contributions
quet-0.007
ized-0.007
bing-0.007
ingly-0.007
izing-0.007
Token The
Token ID383
Feature activation-1.51e-02

Pos logit contributions
 hospital+0.015
 tells+0.014
 inconvenience+0.014
 fault+0.014
 Enjoy+0.014
Neg logit contributions
quet-0.016
ized-0.015
bing-0.015
ingly-0.015
izing-0.015
Token he
Token ID339
Feature activation+1.12e-02

Pos logit contributions
quet+0.007
ized+0.007
bing+0.007
ingly+0.007
izing+0.006
Neg logit contributions
 hospital-0.009
 inconvenience-0.009
 tells-0.008
 fault-0.008
 Enjoy-0.008
Token said
Token ID531
Feature activation+1.14e-02

Pos logit contributions
quet+0.021
ized+0.020
bing+0.020
ingly+0.019
izing+0.019
Neg logit contributions
 hospital-0.016
 tells-0.015
 inconvenience-0.015
 fault-0.015
 Enjoy-0.015
Token "
Token ID366
Feature activation+4.36e-02

Pos logit contributions
quet+0.022
ingly+0.020
ized+0.019
bing+0.019
worm+0.019
Neg logit contributions
 hospital-0.016
 fault-0.016
 inconvenience-0.016
 tells-0.015
have-0.015
Tokenthank
Token ID40716
Feature activation+1.70e-02

Pos logit contributions
quet+0.070
ized+0.066
bing+0.066
izing+0.065
ingly+0.064
Neg logit contributions
 Enjoy-0.068
have-0.067
 hospital-0.066
 dentist-0.064
 tells-0.064
Token you
Token ID345
Feature activation+4.01e-04

Pos logit contributions
quet+0.036
ized+0.035
bing+0.035
ingly+0.035
izing+0.034
Neg logit contributions
 Enjoy-0.019
 hospital-0.019
 tells-0.018
 inconvenience-0.018
 dentist-0.017
Token"
Token ID1
Feature activation+9.64e-02

Pos logit contributions
quet+0.001
ingly+0.001
bing+0.001
ized+0.001
als+0.000
Neg logit contributions
 hospital-0.001
 fault-0.001
 inconvenience-0.001
 tells-0.001
 decides-0.001
Token to
Token ID284
Feature activation-6.46e-03

Pos logit contributions
quet+0.172
ized+0.165
bing+0.163
izing+0.160
ingly+0.156
Neg logit contributions
 tells-0.137
 hospital-0.137
 dentist-0.133
have-0.130
 Enjoy-0.129
Token his
Token ID465
Feature activation-7.36e-03

Pos logit contributions
 inconvenience+0.008
 fault+0.008
 hospital+0.008
 decides+0.008
have+0.008
Neg logit contributions
quet-0.013
bing-0.012
ized-0.012
ingly-0.011
izing-0.011
Token mom
Token ID1995
Feature activation+1.58e-02

Pos logit contributions
 inconvenience+0.012
 Enjoy+0.011
have+0.011
ws+0.011
 fault+0.011
Neg logit contributions
quet-0.012
bing-0.011
ingly-0.011
ized-0.011
worm-0.010
Token.
Token ID13
Feature activation-1.28e-02

Pos logit contributions
quet+0.022
ingly+0.021
bing+0.020
ized+0.020
hog+0.019
Neg logit contributions
 inconvenience-0.030
 hospital-0.029
 fault-0.028
 tells-0.028
 Enjoy-0.027
Token
Token ID198
Feature activation-3.70e-03

Pos logit contributions
 inconvenience+0.019
 hospital+0.019
have+0.018
 tells+0.018
 fault+0.018
Neg logit contributions
quet-0.023
bing-0.021
ingly-0.021
ized-0.021
edly-0.020
Token He
Token ID679
Feature activation+1.85e-02

Pos logit contributions
hog+0.012
edly+0.011
quet+0.011
ingly+0.011
ized+0.010
Neg logit contributions
 inconvenience-0.018
have-0.018
 fault-0.017
 supposed-0.017
 hospital-0.017
Token said
Token ID531
Feature activation+4.27e-03

Pos logit contributions
ized+0.028
hog+0.026
quet+0.025
worm+0.025
bing+0.024
Neg logit contributions
 fault-0.033
 inconvenience-0.033
 Enjoy-0.030
 hospital-0.030
 dentist-0.029
Token "
Token ID366
Feature activation+3.49e-02

Pos logit contributions
quet+0.006
ingly+0.005
ized+0.005
bing+0.005
edly+0.005
Neg logit contributions
 hospital-0.008
 fault-0.008
 inconvenience-0.008
 tells-0.008
have-0.007
Tokenthank
Token ID40716
Feature activation+1.79e-02

Pos logit contributions
quet+0.056
ized+0.053
bing+0.052
ingly+0.052
izing+0.050
Neg logit contributions
 hospital-0.060
 tells-0.058
 fault-0.058
 inconvenience-0.057
 Enjoy-0.054
Token you
Token ID345
Feature activation+2.98e-03

Pos logit contributions
quet+0.040
ized+0.039
bing+0.039
ingly+0.038
izing+0.037
Neg logit contributions
 Enjoy-0.018
 hospital-0.018
 tells-0.017
 inconvenience-0.017
have-0.016
Token"
Token ID1
Feature activation+9.62e-02

Pos logit contributions
quet+0.004
bing+0.004
ingly+0.004
ized+0.003
izing+0.003
Neg logit contributions
 fault-0.006
 hospital-0.006
 inconvenience-0.006
 tells-0.006
 decides-0.005
Token to
Token ID284
Feature activation-9.03e-03

Pos logit contributions
quet+0.169
bing+0.160
ized+0.160
izing+0.157
ingly+0.153
Neg logit contributions
 tells-0.141
 hospital-0.141
 dentist-0.135
 Enjoy-0.133
have-0.132
Token Tina
Token ID35903
Feature activation+1.79e-02

Pos logit contributions
 hospital+0.012
 fault+0.012
 inconvenience+0.012
 tells+0.012
 decides+0.012
Neg logit contributions
quet-0.017
bing-0.016
ized-0.016
ingly-0.016
izing-0.015
Token and
Token ID290
Feature activation-1.89e-02

Pos logit contributions
quet+0.028
bing+0.027
ized+0.027
ingly+0.026
izing+0.026
Neg logit contributions
 hospital-0.030
 inconvenience-0.029
 tells-0.029
 fault-0.028
 Enjoy-0.028
Token told
Token ID1297
Feature activation-3.66e-02

Pos logit contributions
 inconvenience+0.031
 fault+0.030
 hospital+0.030
 microwave+0.029
have+0.029
Neg logit contributions
quet-0.030
ized-0.029
bing-0.028
ingly-0.027
hog-0.027
Token her
Token ID607
Feature activation-2.62e-02

Pos logit contributions
 microwave+0.059
 inconvenience+0.059
 hospital+0.058
 Enjoy+0.058
 dentist+0.057
Neg logit contributions
quet-0.056
ized-0.055
bing-0.055
ingly-0.055
izing-0.053
Token tree
Token ID5509
Feature activation+1.98e-03

Pos logit contributions
quet+0.056
bing+0.054
ized+0.054
ingly+0.053
izing+0.052
Neg logit contributions
 hospital-0.035
 inconvenience-0.034
 tells-0.034
 fault-0.033
 Enjoy-0.033
Token and
Token ID290
Feature activation+4.92e-03

Pos logit contributions
quet+0.003
ized+0.003
izing+0.003
bing+0.003
ingly+0.003
Neg logit contributions
 hospital-0.003
 tells-0.003
 dentist-0.003
 fault-0.003
 microwave-0.003
Token say
Token ID910
Feature activation-5.05e-03

Pos logit contributions
als+0.008
bing+0.008
worm+0.008
hog+0.008
ized+0.008
Neg logit contributions
 fault-0.008
 inconvenience-0.008
have-0.007
 microwave-0.007
 supposed-0.007
Token "
Token ID366
Feature activation+4.74e-02

Pos logit contributions
 hospital+0.009
 inconvenience+0.008
 fault+0.008
 tells+0.008
 microwave+0.008
Neg logit contributions
quet-0.008
bing-0.007
ized-0.007
ingly-0.007
izing-0.007
Tokenhello
Token ID31373
Feature activation+1.51e-04

Pos logit contributions
quet+0.074
ized+0.069
bing+0.069
ingly+0.068
izing+0.067
Neg logit contributions
 hospital-0.080
 tells-0.077
 inconvenience-0.077
 Enjoy-0.076
have-0.075
Token"
Token ID1
Feature activation+9.57e-02

Pos logit contributions
quet+0.000
ized+0.000
ingly+0.000
bing+0.000
izing+0.000
Neg logit contributions
 hospital-0.000
 tells-0.000
 inconvenience-0.000
 Enjoy-0.000
 fault-0.000
Token to
Token ID284
Feature activation-4.98e-03

Pos logit contributions
quet+0.163
ized+0.156
bing+0.152
izing+0.150
ingly+0.148
Neg logit contributions
 tells-0.145
 hospital-0.144
have-0.136
 dentist-0.136
 fault-0.136
Token the
Token ID262
Feature activation-1.61e-03

Pos logit contributions
 microwave+0.008
 fault+0.008
 inconvenience+0.008
 hospital+0.008
 decides+0.007
Neg logit contributions
quet-0.008
bing-0.007
ized-0.007
izing-0.007
hog-0.007
Token little
Token ID1310
Feature activation-1.42e-02

Pos logit contributions
 fault+0.004
 inconvenience+0.003
 hospital+0.003
ws+0.003
 microwave+0.003
Neg logit contributions
quet-0.002
ized-0.002
bing-0.002
izing-0.001
worm-0.001
Token boy
Token ID2933
Feature activation-1.07e-02

Pos logit contributions
 hospital+0.021
 tells+0.019
 decides+0.019
 inconvenience+0.018
 Time+0.018
Neg logit contributions
quet-0.028
ized-0.025
ingly-0.025
bing-0.025
izing-0.025
Token.
Token ID13
Feature activation-9.90e-03

Pos logit contributions
 inconvenience+0.022
 hospital+0.021
 microwave+0.020
 dentist+0.020
 fault+0.020
Neg logit contributions
bing-0.013
quet-0.013
ingly-0.013
hog-0.012
bed-0.012
Token and
Token ID290
Feature activation+6.47e-03

Pos logit contributions
quet+0.009
ized+0.008
bing+0.008
ingly+0.008
izing+0.008
Neg logit contributions
 hospital-0.009
 tells-0.009
 inconvenience-0.008
 fault-0.008
 Enjoy-0.008
Token said
Token ID531
Feature activation-2.68e-03

Pos logit contributions
quet+0.014
ingly+0.013
bing+0.013
ized+0.013
izing+0.013
Neg logit contributions
 tells-0.008
 hospital-0.007
 Enjoy-0.007
 decides-0.007
 inconvenience-0.007
Token "
Token ID366
Feature activation+3.83e-02

Pos logit contributions
 Enjoy+0.004
 tells+0.004
 hospital+0.004
 inconvenience+0.004
 fault+0.004
Neg logit contributions
ized-0.004
bing-0.004
quet-0.004
ingly-0.004
izing-0.004
Tokenthank
Token ID40716
Feature activation+1.17e-02

Pos logit contributions
ized+0.066
quet+0.066
bing+0.064
hog+0.064
izing+0.064
Neg logit contributions
 Enjoy-0.052
have-0.051
 Time-0.050
 dentist-0.049
 hospital-0.048
Token you
Token ID345
Feature activation-1.49e-03

Pos logit contributions
ized+0.029
lights+0.029
ingly+0.029
hetically+0.029
imm+0.028
Neg logit contributions
 Enjoy-0.009
 tells-0.008
 hospital-0.007
 Congratulations-0.007
 inconvenience-0.007
Token"
Token ID1
Feature activation+9.56e-02

Pos logit contributions
 hospital+0.003
 tells+0.002
 inconvenience+0.002
 fault+0.002
 Enjoy+0.002
Neg logit contributions
quet-0.002
bing-0.002
ized-0.002
ingly-0.002
izing-0.002
Token to
Token ID284
Feature activation-6.75e-03

Pos logit contributions
quet+0.193
ized+0.192
bing+0.188
izing+0.187
ised+0.184
Neg logit contributions
 tells-0.112
 done-0.106
 dentist-0.106
 hospital-0.105
 supposed-0.105
Token her
Token ID607
Feature activation-7.02e-03

Pos logit contributions
 hospital+0.009
 tells+0.009
 inconvenience+0.009
 Enjoy+0.009
 fault+0.009
Neg logit contributions
quet-0.013
ized-0.012
bing-0.012
ingly-0.012
izing-0.012
Token mom
Token ID1995
Feature activation+1.07e-02

Pos logit contributions
 hospital+0.007
 tells+0.007
 dentist+0.006
 fault+0.005
 microwave+0.005
Neg logit contributions
ized-0.017
quet-0.017
bing-0.017
ingly-0.017
izing-0.016
Tokenmy
Token ID1820
Feature activation+1.39e-02

Pos logit contributions
quet+0.022
bing+0.021
ized+0.021
ingly+0.021
izing+0.020
Neg logit contributions
 hospital-0.013
 inconvenience-0.012
 tells-0.012
 fault-0.012
 Enjoy-0.012
Token.
Token ID13
Feature activation-2.12e-02

Pos logit contributions
quet+0.022
bing+0.020
ingly+0.020
ized+0.020
izing+0.019
Neg logit contributions
 hospital-0.024
 inconvenience-0.024
 tells-0.023
 fault-0.023
 Enjoy-0.023
Token She
Token ID1375
Feature activation+1.38e-02

Pos logit contributions
quet+0.023
ingly+0.021
ized+0.021
bing+0.021
izing+0.021
Neg logit contributions
 hospital-0.019
 inconvenience-0.018
 tells-0.018
 fault-0.018
have-0.017
Token kept
Token ID4030
Feature activation+2.07e-02

Pos logit contributions
quet+0.025
ized+0.024
bing+0.024
ingly+0.024
izing+0.023
Neg logit contributions
 hospital-0.020
 inconvenience-0.019
 fault-0.019
 tells-0.019
 Enjoy-0.018
Token saying
Token ID2282
Feature activation+1.43e-02

Pos logit contributions
quet+0.038
bing+0.037
ized+0.037
ingly+0.036
izing+0.035
Neg logit contributions
 hospital-0.030
 inconvenience-0.030
 fault-0.029
 tells-0.028
 dentist-0.027
Token "
Token ID366
Feature activation+4.99e-02

Pos logit contributions
quet+0.026
ized+0.025
bing+0.025
ingly+0.025
izing+0.024
Neg logit contributions
 hospital-0.021
 inconvenience-0.020
 fault-0.019
 tells-0.019
 dentist-0.019
Tokenhello
Token ID31373
Feature activation+5.18e-03

Pos logit contributions
quet+0.077
bing+0.074
ized+0.074
ingly+0.072
izing+0.071
Neg logit contributions
 Enjoy-0.078
have-0.078
 hospital-0.078
 inconvenience-0.078
 tells-0.077
Token"
Token ID1
Feature activation+9.52e-02

Pos logit contributions
quet+0.009
bing+0.008
ized+0.008
ingly+0.008
izing+0.008
Neg logit contributions
 hospital-0.009
 tells-0.008
 inconvenience-0.008
 fault-0.008
 Enjoy-0.008
Token to
Token ID284
Feature activation-5.23e-03

Pos logit contributions
quet+0.166
bing+0.161
ized+0.161
izing+0.156
ised+0.151
Neg logit contributions
 tells-0.136
 hospital-0.133
 dentist-0.129
have-0.127
 done-0.126
Token animals
Token ID4695
Feature activation-6.91e-03

Pos logit contributions
 hospital+0.008
 inconvenience+0.008
 fault+0.008
 tells+0.008
 Enjoy+0.007
Neg logit contributions
quet-0.009
ized-0.009
bing-0.009
ingly-0.008
izing-0.008
Token,
Token ID11
Feature activation-1.15e-02

Pos logit contributions
 hospital+0.012
 inconvenience+0.012
 tells+0.012
 fault+0.012
 Enjoy+0.011
Neg logit contributions
quet-0.010
ized-0.010
bing-0.010
ingly-0.010
izing-0.009
Token and
Token ID290
Feature activation-1.40e-02

Pos logit contributions
 hospital+0.022
 inconvenience+0.021
 fault+0.021
 tells+0.021
 Enjoy+0.021
Neg logit contributions
quet-0.016
ized-0.015
bing-0.015
ingly-0.014
izing-0.014
Token everywhere
Token ID8347
Feature activation-5.31e-03

Pos logit contributions
 hospital+0.025
 inconvenience+0.024
 fault+0.024
 tells+0.024
 microwave+0.024
Neg logit contributions
quet-0.021
ized-0.020
bing-0.019
ingly-0.019
izing-0.019
Token naughty
Token ID45128
Feature activation-2.87e-03

Pos logit contributions
ized+0.010
quet+0.009
ingly+0.009
bing+0.009
hog+0.009
Neg logit contributions
 microwave-0.009
ws-0.009
 Enjoy-0.009
 inconvenience-0.009
 decides-0.008
Token and
Token ID290
Feature activation-2.12e-02

Pos logit contributions
 hospital+0.005
 inconvenience+0.005
 tells+0.005
 Enjoy+0.005
 dentist+0.005
Neg logit contributions
quet-0.004
ingly-0.004
bing-0.004
ized-0.004
izing-0.004
Token did
Token ID750
Feature activation+9.21e-03

Pos logit contributions
 fault+0.036
 inconvenience+0.035
 microwave+0.035
 hospital+0.034
 cinema+0.034
Neg logit contributions
ized-0.031
quet-0.029
izing-0.029
hog-0.028
worm-0.028
Token not
Token ID407
Feature activation-1.46e-02

Pos logit contributions
quet+0.014
izing+0.013
ized+0.013
bing+0.012
ingly+0.012
Neg logit contributions
 microwave-0.016
 cinema-0.015
 aquarium-0.015
 inconvenience-0.014
 hospital-0.014
Token share
Token ID2648
Feature activation+1.47e-02

Pos logit contributions
 microwave+0.023
 cinema+0.021
 fault+0.021
 hospital+0.020
 dentist+0.019
Neg logit contributions
quet-0.025
ingly-0.023
ized-0.022
izing-0.022
hog-0.021
Token"
Token ID1
Feature activation+9.51e-02

Pos logit contributions
quet+0.024
bing+0.024
ized+0.024
ingly+0.023
izing+0.023
Neg logit contributions
 inconvenience-0.023
 hospital-0.023
 microwave-0.022
 tells-0.022
 fault-0.021
Token and
Token ID290
Feature activation-9.91e-03

Pos logit contributions
quet+0.173
ized+0.164
bing+0.163
ingly+0.162
izing+0.159
Neg logit contributions
 hospital-0.137
 tells-0.132
 inconvenience-0.131
 fault-0.129
 Enjoy-0.127
Token included
Token ID3017
Feature activation-8.86e-03

Pos logit contributions
 fault+0.017
 inconvenience+0.017
 hospital+0.016
 Enjoy+0.016
 microwave+0.016
Neg logit contributions
quet-0.015
ingly-0.014
ized-0.014
bing-0.013
hog-0.013
Token a
Token ID257
Feature activation-1.90e-02

Pos logit contributions
 inconvenience+0.014
 hospital+0.014
 decides+0.014
 tells+0.014
 Enjoy+0.013
Neg logit contributions
quet-0.015
ized-0.014
ingly-0.014
bing-0.014
izing-0.013
Token smile
Token ID8212
Feature activation+6.18e-03

Pos logit contributions
ws+0.030
have+0.030
 Enjoy+0.029
 decides+0.029
 inconvenience+0.029
Neg logit contributions
quet-0.031
hog-0.029
worm-0.029
ingly-0.028
 spe-0.027
Tokeny
Token ID88
Feature activation+1.31e-03

Pos logit contributions
quet+0.009
ingly+0.009
bing+0.008
ized+0.008
izing+0.008
Neg logit contributions
 hospital-0.011
 inconvenience-0.011
 tells-0.011
 fault-0.011
 decides-0.011
Token family
Token ID1641
Feature activation-2.88e-02

Pos logit contributions
quet+0.000
ized+0.000
ingly+0.000
bing+0.000
worm+0.000
Neg logit contributions
ws-0.000
 fault-0.000
 Enjoy-0.000
 tells-0.000
 inconvenience-0.000
Token said
Token ID531
Feature activation-1.32e-02

Pos logit contributions
 hospital+0.042
 inconvenience+0.041
 tells+0.040
 fault+0.040
 Enjoy+0.039
Neg logit contributions
quet-0.052
ized-0.049
bing-0.049
ingly-0.049
izing-0.047
Token "
Token ID366
Feature activation+3.82e-02

Pos logit contributions
 hospital+0.024
 inconvenience+0.024
 fault+0.024
 tells+0.023
ws+0.023
Neg logit contributions
quet-0.018
ingly-0.017
ized-0.017
bing-0.016
izing-0.016
Tokenthank
Token ID40716
Feature activation+1.27e-02

Pos logit contributions
quet+0.058
ized+0.054
bing+0.054
ingly+0.053
izing+0.052
Neg logit contributions
 hospital-0.068
 tells-0.065
 inconvenience-0.065
 fault-0.064
 Enjoy-0.063
Token you
Token ID345
Feature activation-3.04e-03

Pos logit contributions
quet+0.026
ized+0.025
bing+0.025
ingly+0.025
izing+0.024
Neg logit contributions
 hospital-0.015
 tells-0.015
 inconvenience-0.014
 Enjoy-0.014
have-0.014
Token"
Token ID1
Feature activation+9.50e-02

Pos logit contributions
 hospital+0.006
 fault+0.005
 inconvenience+0.005
 tells+0.005
 decides+0.005
Neg logit contributions
quet-0.005
bing-0.004
ingly-0.004
ized-0.004
izing-0.004
Token before
Token ID878
Feature activation+5.99e-03

Pos logit contributions
quet+0.170
bing+0.163
ized+0.162
izing+0.158
ised+0.154
Neg logit contributions
 tells-0.135
 hospital-0.132
 dentist-0.127
 fault-0.124
have-0.124
Token they
Token ID484
Feature activation-3.20e-02

Pos logit contributions
quet+0.010
bing+0.010
ized+0.010
ingly+0.010
izing+0.010
Neg logit contributions
 hospital-0.009
 fault-0.009
 inconvenience-0.009
 Enjoy-0.009
 tells-0.009
Token left
Token ID1364
Feature activation-4.60e-02

Pos logit contributions
 fault+0.055
 inconvenience+0.050
 hospital+0.050
 tells+0.046
have+0.046
Neg logit contributions
quet-0.052
ized-0.051
als-0.050
ingly-0.049
hog-0.047
Token.
Token ID13
Feature activation-1.96e-02

Pos logit contributions
 hospital+0.070
 inconvenience+0.068
 tells+0.067
 fault+0.066
 Enjoy+0.066
Neg logit contributions
quet-0.079
ized-0.076
bing-0.075
ingly-0.075
izing-0.073
Token They
Token ID1119
Feature activation+5.53e-03

Pos logit contributions
have+0.036
 inconvenience+0.035
 hospital+0.035
 fault+0.034
ws+0.034
Neg logit contributions
quet-0.026
ized-0.025
ised-0.025
hog-0.025
bing-0.025
Token need
Token ID761
Feature activation-3.66e-02

Pos logit contributions
 microwave+0.024
 dentist+0.022
 fault+0.022
 inconvenience+0.021
 hospital+0.021
Neg logit contributions
ized-0.023
ingly-0.022
hog-0.022
bing-0.022
izing-0.022
Token to
Token ID284
Feature activation-5.31e-03

Pos logit contributions
 fault+0.058
 hospital+0.058
have+0.057
arre+0.057
 microwave+0.057
Neg logit contributions
quet-0.056
ized-0.051
ingly-0.051
izing-0.049
bing-0.049
Token learn
Token ID2193
Feature activation+1.05e-02

Pos logit contributions
 microwave+0.008
 fault+0.008
 Enjoy+0.008
 dentist+0.007
 Congratulations+0.007
Neg logit contributions
quet-0.009
bing-0.008
ized-0.008
ingly-0.008
izing-0.008
Token a
Token ID257
Feature activation-1.78e-02

Pos logit contributions
quet+0.018
ized+0.018
izing+0.018
bing+0.017
ingly+0.017
Neg logit contributions
 hospital-0.015
 inconvenience-0.015
 microwave-0.015
 Enjoy-0.015
 Time-0.014
Token lot
Token ID1256
Feature activation+7.76e-03

Pos logit contributions
 microwave+0.027
ws+0.027
 grounded+0.026
iam+0.026
have+0.026
Neg logit contributions
quet-0.029
worm-0.028
izing-0.027
stones-0.025
stone-0.025
Token"
Token ID1
Feature activation+9.50e-02

Pos logit contributions
quet+0.013
ized+0.012
ingly+0.012
izing+0.012
bing+0.012
Neg logit contributions
 hospital-0.013
 Enjoy-0.012
 inconvenience-0.012
 fault-0.012
 tells-0.012
Token said
Token ID531
Feature activation+6.29e-03

Pos logit contributions
quet+0.200
bing+0.191
ized+0.189
izing+0.185
ingly+0.184
Neg logit contributions
 tells-0.110
 hospital-0.109
have-0.101
 inconvenience-0.100
 dentist-0.099
Token Mum
Token ID38367
Feature activation+1.30e-02

Pos logit contributions
quet+0.010
ized+0.010
bing+0.010
ingly+0.009
izing+0.009
Neg logit contributions
 hospital-0.010
 tells-0.010
 inconvenience-0.010
 fault-0.010
 Enjoy-0.010
Token.
Token ID13
Feature activation-1.35e-02

Pos logit contributions
quet+0.022
ized+0.021
bing+0.021
ingly+0.020
izing+0.020
Neg logit contributions
 hospital-0.020
 tells-0.020
 inconvenience-0.020
 fault-0.019
 Enjoy-0.019
Token
Token ID198
Feature activation-7.57e-03

Pos logit contributions
 hospital+0.021
 inconvenience+0.020
have+0.020
 fault+0.020
 tells+0.020
Neg logit contributions
quet-0.023
ized-0.022
bing-0.022
ingly-0.021
edly-0.021
Token
Token ID198
Feature activation+1.99e-03

Pos logit contributions
 hospital+0.010
 inconvenience+0.010
 fault+0.009
 tells+0.009
 Enjoy+0.009
Neg logit contributions
quet-0.015
ized-0.014
bing-0.014
ingly-0.014
izing-0.014
Token a
Token ID257
Feature activation-2.15e-02

Pos logit contributions
 hospital+0.009
 inconvenience+0.008
 tells+0.008
 fault+0.008
 Enjoy+0.008
Neg logit contributions
quet-0.011
ized-0.010
bing-0.010
ingly-0.010
izing-0.010
Token big
Token ID1263
Feature activation+9.73e-03

Pos logit contributions
ws+0.035
 fault+0.034
 hospital+0.031
 grounded+0.031
 decides+0.031
Neg logit contributions
quet-0.035
worm-0.033
ingly-0.031
hog-0.030
izing-0.030
Token "
Token ID366
Feature activation+5.67e-02

Pos logit contributions
quet+0.017
ingly+0.015
hog+0.015
worm+0.015
izing+0.015
Neg logit contributions
have-0.014
 fault-0.014
 supposed-0.014
 hospital-0.013
 Enjoy-0.013
Tokengood
Token ID11274
Feature activation-4.22e-03

Pos logit contributions
quet+0.085
ized+0.081
bing+0.080
ingly+0.079
izing+0.078
Neg logit contributions
 hospital-0.097
 inconvenience-0.095
 tells-0.094
 Enjoy-0.093
 fault-0.092
Tokenbye
Token ID16390
Feature activation-3.86e-03

Pos logit contributions
 tells+0.006
 hospital+0.006
 fault+0.006
 inconvenience+0.006
 microwave+0.005
Neg logit contributions
quet-0.008
ingly-0.007
bing-0.007
ized-0.007
izing-0.007
Token"
Token ID1
Feature activation+9.49e-02

Pos logit contributions
 hospital+0.006
 tells+0.005
 inconvenience+0.005
 fault+0.005
 Enjoy+0.005
Neg logit contributions
quet-0.007
bing-0.007
ized-0.006
ingly-0.006
izing-0.006
Token to
Token ID284
Feature activation-3.86e-03

Pos logit contributions
quet+0.168
ized+0.164
bing+0.159
izing+0.156
ised+0.153
Neg logit contributions
 tells-0.138
 hospital-0.137
 dentist-0.130
have-0.128
 fault-0.128
Token her
Token ID607
Feature activation-1.26e-02

Pos logit contributions
 microwave+0.007
 fault+0.007
 inconvenience+0.006
 Enjoy+0.006
 decides+0.006
Neg logit contributions
quet-0.005
bing-0.005
ized-0.005
izing-0.005
ingly-0.005
Token mum
Token ID25682
Feature activation-5.20e-03

Pos logit contributions
 hospital+0.016
 tells+0.016
 inconvenience+0.016
 fault+0.015
 Enjoy+0.015
Neg logit contributions
quet-0.025
ized-0.023
bing-0.023
ingly-0.023
izing-0.023
Token.
Token ID13
Feature activation-1.85e-02

Pos logit contributions
 hospital+0.010
 inconvenience+0.010
 fault+0.009
 tells+0.009
 Enjoy+0.009
Neg logit contributions
quet-0.007
bing-0.007
ingly-0.007
ized-0.007
izing-0.006
Token
Token ID220
Feature activation+1.94e-03

Pos logit contributions
 hospital+0.028
 inconvenience+0.028
 fault+0.027
have+0.026
 decides+0.026
Neg logit contributions
quet-0.032
bing-0.031
ingly-0.031
edly-0.029
ized-0.029
Token always
Token ID1464
Feature activation-4.06e-03

Pos logit contributions
 hospital+0.002
 fault+0.002
 inconvenience+0.002
 microwave+0.002
 tells+0.002
Neg logit contributions
quet-0.004
ized-0.004
bing-0.004
ingly-0.003
izing-0.003
Token said
Token ID531
Feature activation-4.41e-03

Pos logit contributions
 hospital+0.006
 microwave+0.006
 fault+0.006
 dentist+0.006
 inconvenience+0.006
Neg logit contributions
bing-0.007
quet-0.006
ingly-0.006
ized-0.006
bed-0.006
Token "
Token ID366
Feature activation+4.03e-02

Pos logit contributions
 hospital+0.006
 tells+0.006
 inconvenience+0.006
 fault+0.006
 Enjoy+0.006
Neg logit contributions
quet-0.008
ized-0.008
bing-0.008
ingly-0.007
izing-0.007
Tokenthank
Token ID40716
Feature activation+1.45e-02

Pos logit contributions
quet+0.067
ized+0.063
bing+0.063
ingly+0.061
izing+0.061
Neg logit contributions
 hospital-0.064
 tells-0.061
 inconvenience-0.061
 Enjoy-0.060
have-0.060
Token you
Token ID345
Feature activation-2.70e-03

Pos logit contributions
quet+0.035
ized+0.035
bing+0.035
ingly+0.035
imm+0.034
Neg logit contributions
 Enjoy-0.011
 tells-0.011
 inconvenience-0.010
 hospital-0.010
have-0.010
Token"
Token ID1
Feature activation+9.47e-02

Pos logit contributions
 hospital+0.004
 fault+0.004
 inconvenience+0.004
 tells+0.004
 decides+0.004
Neg logit contributions
quet-0.005
bing-0.004
ingly-0.004
izing-0.004
ized-0.004
Token when
Token ID618
Feature activation+1.55e-02

Pos logit contributions
quet+0.183
ized+0.179
bing+0.177
izing+0.174
ised+0.171
Neg logit contributions
 tells-0.121
 hospital-0.117
 dentist-0.115
 done-0.115
 supposed-0.110
Token someone
Token ID2130
Feature activation-3.37e-02

Pos logit contributions
quet+0.021
bing+0.019
ized+0.019
ingly+0.019
izing+0.018
Neg logit contributions
 hospital-0.030
 inconvenience-0.029
 fault-0.029
 tells-0.029
 Enjoy-0.028
Token gave
Token ID2921
Feature activation-2.63e-03

Pos logit contributions
 hospital+0.044
 inconvenience+0.043
 fault+0.042
 Enjoy+0.041
 tells+0.041
Neg logit contributions
quet-0.065
ized-0.062
ingly-0.061
bing-0.061
izing-0.059
Token her
Token ID607
Feature activation+8.60e-03

Pos logit contributions
 hospital+0.004
 inconvenience+0.004
 tells+0.004
 fault+0.004
 Enjoy+0.004
Neg logit contributions
quet-0.005
ized-0.004
bing-0.004
ingly-0.004
izing-0.004
Token something
Token ID1223
Feature activation+1.55e-02

Pos logit contributions
quet+0.015
ized+0.014
ingly+0.014
als+0.013
hog+0.013
Neg logit contributions
 inconvenience-0.013
 hospital-0.012
 Enjoy-0.012
have-0.012
 tells-0.012
Token people
Token ID661
Feature activation-2.40e-02

Pos logit contributions
 hospital+0.021
 inconvenience+0.020
 fault+0.020
 tells+0.020
 Enjoy+0.020
Neg logit contributions
quet-0.019
ized-0.018
bing-0.018
ingly-0.017
izing-0.017
Token started
Token ID2067
Feature activation-2.48e-03

Pos logit contributions
 hospital+0.041
 inconvenience+0.040
 fault+0.040
 Enjoy+0.039
 microwave+0.039
Neg logit contributions
quet-0.038
ized-0.036
bing-0.035
ingly-0.035
izing-0.033
Token saying
Token ID2282
Feature activation+7.62e-04

Pos logit contributions
 hospital+0.004
 inconvenience+0.004
 Enjoy+0.004
 microwave+0.004
 dentist+0.004
Neg logit contributions
quet-0.004
bing-0.004
ingly-0.004
ized-0.004
izing-0.004
Token "
Token ID366
Feature activation+4.58e-02

Pos logit contributions
quet+0.001
ized+0.001
als+0.001
ingly+0.001
izing+0.001
Neg logit contributions
 hospital-0.001
 inconvenience-0.001
 microwave-0.001
 fault-0.001
 dentist-0.001
Tokenhello
Token ID31373
Feature activation+2.76e-03

Pos logit contributions
quet+0.069
ized+0.065
bing+0.064
ingly+0.064
izing+0.062
Neg logit contributions
 hospital-0.082
 tells-0.079
 inconvenience-0.078
 fault-0.077
 Enjoy-0.076
Token"
Token ID1
Feature activation+9.46e-02

Pos logit contributions
quet+0.005
ized+0.004
bing+0.004
izing+0.004
ingly+0.004
Neg logit contributions
 hospital-0.005
 fault-0.004
 tells-0.004
 inconvenience-0.004
 dentist-0.004
Token too
Token ID1165
Feature activation+4.07e-03

Pos logit contributions
quet+0.156
bing+0.149
ized+0.149
izing+0.144
ingly+0.143
Neg logit contributions
 hospital-0.146
 tells-0.145
 inconvenience-0.138
 dentist-0.137
have-0.137
Token!
Token ID0
Feature activation-2.59e-02

Pos logit contributions
quet+0.006
ized+0.006
bing+0.006
ingly+0.006
izing+0.006
Neg logit contributions
 hospital-0.007
 inconvenience-0.007
 tells-0.007
 fault-0.007
 Enjoy-0.007
Token<|endoftext|>
Token ID50256
Feature activation-3.73e-02

Pos logit contributions
 hospital+0.037
 inconvenience+0.037
 tells+0.036
 fault+0.036
have+0.036
Neg logit contributions
quet-0.047
ingly-0.044
ized-0.043
bing-0.043
edly-0.042
TokenOnce
Token ID7454
Feature activation+6.64e-03

Pos logit contributions
 hospital+0.048
 inconvenience+0.047
 fault+0.046
 tells+0.044
have+0.043
Neg logit contributions
quet-0.073
bing-0.070
ingly-0.070
ized-0.069
als-0.068
Token upon
Token ID2402
Feature activation-3.12e-02

Pos logit contributions
quet+0.013
ized+0.013
ingly+0.013
bing+0.013
izing+0.012
Neg logit contributions
 hospital-0.008
 inconvenience-0.008
 Enjoy-0.008
 tells-0.008
have-0.007
Token wardrobe
Token ID35593
Feature activation-1.30e-02

Pos logit contributions
 hospital+0.019
 tells+0.019
 inconvenience+0.018
 fault+0.018
 Enjoy+0.018
Neg logit contributions
quet-0.026
ized-0.025
bing-0.024
ingly-0.024
izing-0.024
Token changed
Token ID3421
Feature activation-1.41e-02

Pos logit contributions
 hospital+0.019
 tells+0.018
 done+0.016
 inconvenience+0.016
 dentist+0.016
Neg logit contributions
quet-0.024
bing-0.023
ingly-0.022
izing-0.022
ized-0.022
Token from
Token ID422
Feature activation+1.13e-02

Pos logit contributions
 hospital+0.024
 tells+0.023
 fault+0.023
 inconvenience+0.022
 decides+0.022
Neg logit contributions
quet-0.023
ized-0.020
bing-0.020
ingly-0.020
izing-0.020
Token ancient
Token ID6156
Feature activation-2.01e-02

Pos logit contributions
quet+0.021
ized+0.020
bing+0.020
ingly+0.020
izing+0.020
Neg logit contributions
 hospital-0.015
 inconvenience-0.015
 tells-0.015
 fault-0.014
 Enjoy-0.014
Token to
Token ID284
Feature activation-8.22e-03

Pos logit contributions
 hospital+0.031
 tells+0.030
 fault+0.028
 inconvenience+0.028
 dentist+0.028
Neg logit contributions
quet-0.036
ized-0.033
bing-0.033
izing-0.032
ingly-0.032
Token brand
Token ID4508
Feature activation-1.04e-01

Pos logit contributions
 inconvenience+0.012
 hospital+0.012
 microwave+0.012
 fault+0.012
have+0.012
Neg logit contributions
quet-0.014
ized-0.014
ingly-0.014
bing-0.013
izing-0.013
Token new
Token ID649
Feature activation-1.97e-02

Pos logit contributions
 hospital+0.143
 inconvenience+0.143
 tells+0.136
 fault+0.136
 Enjoy+0.134
Neg logit contributions
quet-0.192
ized-0.188
izing-0.184
bing-0.183
ingly-0.182
Token!
Token ID0
Feature activation-1.11e-02

Pos logit contributions
 hospital+0.029
 tells+0.028
 fault+0.027
 inconvenience+0.027
 decides+0.026
Neg logit contributions
quet-0.036
ized-0.033
bing-0.033
izing-0.033
ingly-0.032
Token They
Token ID1119
Feature activation+2.41e-03

Pos logit contributions
have+0.021
 inconvenience+0.020
 hospital+0.019
 fault+0.019
ws+0.019
Neg logit contributions
hog-0.014
quet-0.014
als-0.014
ised-0.014
ized-0.014
Token looked
Token ID3114
Feature activation-9.35e-03

Pos logit contributions
quet+0.004
ized+0.004
bing+0.004
ingly+0.004
izing+0.004
Neg logit contributions
 hospital-0.004
 inconvenience-0.004
 fault-0.004
 tells-0.004
 Enjoy-0.004
Token around
Token ID1088
Feature activation+1.24e-02

Pos logit contributions
 hospital+0.013
 tells+0.013
 Enjoy+0.012
 decides+0.012
 fault+0.012
Neg logit contributions
quet-0.018
bing-0.017
ingly-0.016
izing-0.016
ized-0.016
Token scattered
Token ID16830
Feature activation-2.37e-02

Pos logit contributions
 hospital+0.012
 tells+0.012
 inconvenience+0.012
 fault+0.012
 Enjoy+0.011
Neg logit contributions
quet-0.010
ized-0.010
bing-0.009
ingly-0.009
izing-0.009
Token everywhere
Token ID8347
Feature activation-9.42e-03

Pos logit contributions
 hospital+0.041
 inconvenience+0.040
 tells+0.040
 Enjoy+0.039
 fault+0.039
Neg logit contributions
quet-0.036
ized-0.034
ingly-0.034
bing-0.034
izing-0.033
Token and
Token ID290
Feature activation-3.56e-02

Pos logit contributions
 hospital+0.013
 tells+0.013
 fault+0.012
 inconvenience+0.012
 dentist+0.012
Neg logit contributions
quet-0.018
ized-0.017
bing-0.016
izing-0.016
ingly-0.016
Token the
Token ID262
Feature activation-2.93e-02

Pos logit contributions
 fault+0.062
 Enjoy+0.062
 inconvenience+0.058
 microwave+0.057
have+0.056
Neg logit contributions
ized-0.051
bing-0.049
ingly-0.049
als-0.048
edly-0.047
Token children
Token ID1751
Feature activation-5.70e-02

Pos logit contributions
 Enjoy+0.046
 tells+0.042
have+0.041
 decides+0.041
 hospital+0.041
Neg logit contributions
quet-0.047
ized-0.047
bing-0.046
hog-0.045
ingly-0.045
Token crying
Token ID13774
Feature activation-9.02e-02

Pos logit contributions
 Enjoy+0.112
 fault+0.109
 inconvenience+0.108
 hospital+0.104
 dentist+0.103
Neg logit contributions
ized-0.071
ingly-0.071
bing-0.071
quet-0.070
als-0.064
Token and
Token ID290
Feature activation-2.95e-02

Pos logit contributions
 hospital+0.163
 Enjoy+0.162
 inconvenience+0.162
 fault+0.157
 tells+0.155
Neg logit contributions
quet-0.128
ingly-0.126
ized-0.122
bing-0.122
izing-0.117
Token yelling
Token ID22187
Feature activation-6.14e-02

Pos logit contributions
 Enjoy+0.045
 fault+0.044
 inconvenience+0.044
have+0.043
 hospital+0.042
Neg logit contributions
bing-0.048
ized-0.047
quet-0.046
ingly-0.046
hog-0.045
Token.
Token ID13
Feature activation-6.86e-03

Pos logit contributions
 inconvenience+0.131
 Enjoy+0.127
 hospital+0.126
 dentist+0.125
 fault+0.121
Neg logit contributions
ingly-0.075
bing-0.068
ized-0.068
edly-0.064
als-0.063
Token She
Token ID1375
Feature activation-1.04e-03

Pos logit contributions
 hospital+0.011
 inconvenience+0.011
 tells+0.011
 fault+0.011
 Enjoy+0.010
Neg logit contributions
quet-0.011
ized-0.011
bing-0.011
ingly-0.010
izing-0.010
Token was
Token ID373
Feature activation-1.87e-02

Pos logit contributions
 fault+0.002
 inconvenience+0.002
 hospital+0.002
 Enjoy+0.001
 tells+0.001
Neg logit contributions
quet-0.002
ized-0.002
bing-0.002
ingly-0.002
izing-0.002
Token the
Token ID262
Feature activation-8.58e-03

Pos logit contributions
 inconvenience+0.003
 fault+0.003
 hospital+0.003
 Enjoy+0.003
 tells+0.003
Neg logit contributions
quet-0.004
ized-0.004
bing-0.003
ingly-0.003
izing-0.003
Token feather
Token ID17972
Feature activation-1.50e-02

Pos logit contributions
 Enjoy+0.016
 fault+0.016
ws+0.015
 inconvenience+0.015
 microwave+0.015
Neg logit contributions
worm-0.011
quet-0.011
ized-0.010
bing-0.010
stone-0.010
Token turned
Token ID2900
Feature activation-1.66e-02

Pos logit contributions
 hospital+0.023
 tells+0.022
 fault+0.021
 inconvenience+0.021
 decides+0.021
Neg logit contributions
quet-0.026
bing-0.025
ized-0.025
ingly-0.024
izing-0.024
Token into
Token ID656
Feature activation-1.35e-03

Pos logit contributions
 hospital+0.028
 inconvenience+0.028
 tells+0.027
 fault+0.027
 Enjoy+0.027
Neg logit contributions
quet-0.026
ized-0.025
bing-0.025
ingly-0.024
izing-0.024
Token a
Token ID257
Feature activation-1.55e-02

Pos logit contributions
 inconvenience+0.002
 hospital+0.002
 decides+0.002
 tells+0.002
have+0.002
Neg logit contributions
quet-0.002
ingly-0.002
bing-0.002
ized-0.002
izing-0.002
Token brand
Token ID4508
Feature activation-8.78e-02

Pos logit contributions
 fault+0.027
 tells+0.027
ws+0.027
have+0.027
 inconvenience+0.026
Neg logit contributions
quet-0.024
worm-0.022
izing-0.021
ized-0.021
 spe-0.021
Token new
Token ID649
Feature activation-1.81e-03

Pos logit contributions
 hospital+0.121
 inconvenience+0.119
 tells+0.118
 fault+0.115
have+0.113
Neg logit contributions
quet-0.165
ized-0.158
ingly-0.155
izing-0.154
bing-0.152
Token staff
Token ID3085
Feature activation+1.75e-02

Pos logit contributions
 hospital+0.003
 inconvenience+0.003
 tells+0.003
 Enjoy+0.003
 fault+0.003
Neg logit contributions
quet-0.003
ized-0.003
bing-0.003
ingly-0.003
izing-0.003
Token.
Token ID13
Feature activation+7.92e-03

Pos logit contributions
quet+0.029
ized+0.028
bing+0.028
ingly+0.027
izing+0.027
Neg logit contributions
 hospital-0.028
 tells-0.027
 inconvenience-0.026
 fault-0.026
 Enjoy-0.026
Token The
Token ID383
Feature activation-2.83e-03

Pos logit contributions
quet+0.012
bing+0.011
ized+0.011
ingly+0.011
edly+0.011
Neg logit contributions
 inconvenience-0.014
 hospital-0.014
have-0.014
 supposed-0.013
 fault-0.013
Token wizard
Token ID18731
Feature activation-5.24e-03

Pos logit contributions
 fault+0.005
 Enjoy+0.005
 inconvenience+0.005
 tells+0.005
ws+0.005
Neg logit contributions
quet-0.004
worm-0.004
ized-0.004
bing-0.004
izing-0.004
Token kind
Token ID1611
Feature activation+1.95e-02

Pos logit contributions
quet+0.004
 spe+0.004
worm+0.004
hog+0.004
ingly+0.004
Neg logit contributions
ws-0.005
 tells-0.004
 decides-0.004
 fault-0.004
have-0.004
Token lady
Token ID10846
Feature activation+1.09e-02

Pos logit contributions
quet+0.031
ingly+0.028
hog+0.027
worm+0.027
ized+0.026
Neg logit contributions
have-0.032
 tells-0.032
 fault-0.031
ws-0.031
 decides-0.031
Token saw
Token ID2497
Feature activation-2.21e-02

Pos logit contributions
quet+0.016
ingly+0.015
ized+0.014
bing+0.014
hog+0.014
Neg logit contributions
 fault-0.020
 inconvenience-0.019
 hospital-0.019
 microwave-0.018
 Enjoy-0.018
Token Tim
Token ID5045
Feature activation+1.23e-02

Pos logit contributions
 inconvenience+0.037
 hospital+0.037
 fault+0.037
 Enjoy+0.035
 tells+0.035
Neg logit contributions
quet-0.034
ized-0.034
ingly-0.033
bing-0.032
izing-0.032
Tokenmy
Token ID1820
Feature activation+5.92e-03

Pos logit contributions
bing+0.021
quet+0.021
ingly+0.021
ized+0.020
hog+0.019
Neg logit contributions
 inconvenience-0.019
 Enjoy-0.018
 fault-0.018
 hospital-0.018
 reservation-0.017
Token crying
Token ID13774
Feature activation-8.71e-02

Pos logit contributions
quet+0.009
bing+0.009
ingly+0.009
ized+0.009
hog+0.008
Neg logit contributions
 inconvenience-0.011
 fault-0.010
 Enjoy-0.010
 hospital-0.010
 microwave-0.009
Token and
Token ID290
Feature activation-1.32e-02

Pos logit contributions
 inconvenience+0.162
 hospital+0.157
 Enjoy+0.156
 fault+0.154
 tells+0.153
Neg logit contributions
ingly-0.120
quet-0.120
bing-0.117
ized-0.117
izing-0.111
Token asked
Token ID1965
Feature activation-9.57e-03

Pos logit contributions
 fault+0.022
 inconvenience+0.021
 Enjoy+0.020
ws+0.020
 microwave+0.020
Neg logit contributions
ized-0.021
bing-0.021
quet-0.020
bed-0.020
hog-0.020
Token him
Token ID683
Feature activation-2.79e-02

Pos logit contributions
 hospital+0.013
 inconvenience+0.013
 fault+0.013
 tells+0.013
 decides+0.013
Neg logit contributions
quet-0.017
ized-0.017
bing-0.017
ingly-0.016
izing-0.016
Token what
Token ID644
Feature activation+9.30e-03

Pos logit contributions
 inconvenience+0.036
 hospital+0.035
 fault+0.033
 microwave+0.033
 tells+0.032
Neg logit contributions
quet-0.054
bing-0.052
ingly-0.051
ized-0.051
hog-0.050
Token was
Token ID373
Feature activation+8.46e-03

Pos logit contributions
quet+0.016
ized+0.015
ingly+0.014
bing+0.014
izing+0.014
Neg logit contributions
 inconvenience-0.014
 hospital-0.014
 fault-0.014
 Enjoy-0.013
ws-0.013
Token the
Token ID262
Feature activation+7.74e-04

Pos logit contributions
quet+0.018
ized+0.017
bing+0.017
ingly+0.017
izing+0.017
Neg logit contributions
 inconvenience-0.014
 hospital-0.014
 Enjoy-0.014
 dentist-0.014
 tells-0.014
Token h
Token ID289
Feature activation+2.42e-03

Pos logit contributions
quet+0.001
stones+0.001
bing+0.001
hog+0.001
ingly+0.001
Neg logit contributions
 Enjoy-0.001
 hospital-0.001
 decides-0.001
 dentist-0.001
 tells-0.001
Tokenanger
Token ID2564
Feature activation+7.09e-03

Pos logit contributions
bing+0.004
ized+0.004
quet+0.004
izing+0.004
ised+0.004
Neg logit contributions
 hospital-0.004
 fault-0.004
 inconvenience-0.004
 dentist-0.004
 tells-0.004
Token for
Token ID329
Feature activation-3.39e-02

Pos logit contributions
quet+0.012
ized+0.011
bing+0.011
ingly+0.011
izing+0.011
Neg logit contributions
 hospital-0.011
 tells-0.011
 inconvenience-0.011
 fault-0.011
 Enjoy-0.010
Token her
Token ID607
Feature activation-1.54e-02

Pos logit contributions
 hospital+0.051
 tells+0.051
 fault+0.051
 microwave+0.051
 decides+0.051
Neg logit contributions
quet-0.058
bing-0.055
ized-0.054
ingly-0.053
izing-0.053
Token brand
Token ID4508
Feature activation-8.68e-02

Pos logit contributions
 hospital+0.025
 Enjoy+0.025
 inconvenience+0.025
 tells+0.024
 microwave+0.024
Neg logit contributions
quet-0.024
ingly-0.023
bing-0.023
ized-0.022
hog-0.022
Token new
Token ID649
Feature activation-2.07e-02

Pos logit contributions
 hospital+0.117
 inconvenience+0.115
 tells+0.113
 Enjoy+0.110
 fault+0.110
Neg logit contributions
quet-0.165
ized-0.156
bing-0.153
ingly-0.153
izing-0.153
Token dress
Token ID6576
Feature activation-5.97e-03

Pos logit contributions
 Enjoy+0.031
 hospital+0.030
 inconvenience+0.029
 grounded+0.029
 microwave+0.029
Neg logit contributions
quet-0.035
ingly-0.033
bing-0.033
worm-0.032
hog-0.032
Token.
Token ID13
Feature activation+2.16e-03

Pos logit contributions
 hospital+0.009
 fault+0.008
 tells+0.008
 inconvenience+0.008
 dentist+0.008
Neg logit contributions
quet-0.011
bing-0.010
ized-0.010
ingly-0.010
izing-0.010
Token The
Token ID383
Feature activation-1.20e-02

Pos logit contributions
quet+0.003
ized+0.003
bing+0.003
ingly+0.003
izing+0.003
Neg logit contributions
 hospital-0.004
 inconvenience-0.004
 fault-0.004
 tells-0.004
 Enjoy-0.004
Token cats
Token ID11875
Feature activation-3.18e-02

Pos logit contributions
 hospital+0.014
 inconvenience+0.014
 fault+0.014
 tells+0.014
 Enjoy+0.014
Neg logit contributions
quet-0.025
bing-0.023
ized-0.023
ingly-0.023
izing-0.023
Token the
Token ID262
Feature activation-1.26e-02

Pos logit contributions
 inconvenience+0.050
 fault+0.049
 hospital+0.049
 Enjoy+0.048
have+0.047
Neg logit contributions
quet-0.065
ized-0.065
bing-0.064
izing-0.062
ingly-0.061
Token broken
Token ID5445
Feature activation-4.59e-02

Pos logit contributions
 Enjoy+0.020
 inconvenience+0.017
have+0.017
 Time+0.017
 hospital+0.017
Neg logit contributions
ized-0.020
quet-0.020
stones-0.020
bing-0.020
ingly-0.019
Token sword
Token ID8429
Feature activation+1.38e-02

Pos logit contributions
 hospital+0.074
 Enjoy+0.074
 inconvenience+0.073
 tells+0.072
 fault+0.071
Neg logit contributions
quet-0.074
bing-0.071
ingly-0.070
ized-0.070
izing-0.068
Token and
Token ID290
Feature activation-2.02e-02

Pos logit contributions
quet+0.021
bing+0.020
ized+0.020
ingly+0.020
izing+0.019
Neg logit contributions
 hospital-0.024
 Enjoy-0.023
 inconvenience-0.023
 tells-0.023
 fault-0.023
Token the
Token ID262
Feature activation-2.32e-02

Pos logit contributions
 fault+0.031
 inconvenience+0.031
 Enjoy+0.030
 hospital+0.030
have+0.029
Neg logit contributions
quet-0.033
ized-0.032
ingly-0.031
bing-0.031
izing-0.031
Token crying
Token ID13774
Feature activation-8.57e-02

Pos logit contributions
 Enjoy+0.040
have+0.036
 tells+0.036
 Time+0.036
 inconvenience+0.035
Neg logit contributions
quet-0.032
ized-0.032
bing-0.032
worm-0.032
ingly-0.032
Token children
Token ID1751
Feature activation-4.46e-02

Pos logit contributions
 Enjoy+0.160
 hospital+0.158
 inconvenience+0.157
 fault+0.153
 tells+0.150
Neg logit contributions
quet-0.120
ingly-0.117
ized-0.113
bing-0.112
izing-0.108
Token.
Token ID13
Feature activation-2.02e-03

Pos logit contributions
 Enjoy+0.082
 inconvenience+0.082
 hospital+0.081
 fault+0.080
 dentist+0.079
Neg logit contributions
quet-0.061
ingly-0.061
bing-0.060
ized-0.059
izing-0.055
Token She
Token ID1375
Feature activation-1.08e-03

Pos logit contributions
 hospital+0.003
 tells+0.003
 fault+0.003
 inconvenience+0.003
 Enjoy+0.003
Neg logit contributions
quet-0.003
bing-0.003
ized-0.003
ingly-0.003
izing-0.003
Token was
Token ID373
Feature activation-2.25e-02

Pos logit contributions
 hospital+0.001
 tells+0.001
 decides+0.001
 dentist+0.001
 inconvenience+0.001
Neg logit contributions
quet-0.002
bing-0.002
ized-0.002
ingly-0.002
izing-0.002
Token not
Token ID407
Feature activation+1.90e-03

Pos logit contributions
 fault+0.027
 inconvenience+0.027
 Enjoy+0.027
 decides+0.026
 hospital+0.026
Neg logit contributions
quet-0.044
ized-0.044
bing-0.043
ingly-0.042
izing-0.041
Token the
Token ID262
Feature activation-1.46e-02

Pos logit contributions
 hospital+0.035
 inconvenience+0.035
 fault+0.034
 Enjoy+0.034
 tells+0.033
Neg logit contributions
quet-0.052
ized-0.051
bing-0.050
izing-0.049
ingly-0.049
Token torn
Token ID12445
Feature activation-3.72e-02

Pos logit contributions
 Enjoy+0.026
 inconvenience+0.024
 decides+0.024
 hospital+0.024
have+0.023
Neg logit contributions
bing-0.022
ized-0.021
quet-0.021
izing-0.020
ised-0.019
Token paper
Token ID3348
Feature activation+4.45e-03

Pos logit contributions
 hospital+0.063
 inconvenience+0.061
 tells+0.061
 Enjoy+0.060
 fault+0.060
Neg logit contributions
quet-0.059
bing-0.056
ized-0.055
ingly-0.055
izing-0.053
Token and
Token ID290
Feature activation-2.23e-02

Pos logit contributions
quet+0.006
bing+0.006
ized+0.006
ingly+0.006
izing+0.006
Neg logit contributions
 hospital-0.008
 inconvenience-0.008
 Enjoy-0.008
 tells-0.008
 fault-0.008
Token the
Token ID262
Feature activation-1.93e-02

Pos logit contributions
 hospital+0.032
 tells+0.030
 inconvenience+0.030
 fault+0.030
 Enjoy+0.029
Neg logit contributions
quet-0.042
ized-0.039
bing-0.039
ingly-0.039
izing-0.038
Token crying
Token ID13774
Feature activation-8.46e-02

Pos logit contributions
 Enjoy+0.036
have+0.033
 decides+0.033
 hospital+0.032
 inconvenience+0.032
Neg logit contributions
bing-0.026
 spe-0.025
ized-0.025
quet-0.025
izing-0.024
Token Tim
Token ID5045
Feature activation-7.19e-03

Pos logit contributions
 hospital+0.159
 inconvenience+0.157
 Enjoy+0.157
 tells+0.153
 fault+0.152
Neg logit contributions
quet-0.117
ingly-0.112
bing-0.109
ized-0.109
izing-0.105
Token.
Token ID13
Feature activation-1.93e-03

Pos logit contributions
 inconvenience+0.012
 Enjoy+0.012
 hospital+0.012
 tells+0.012
 reservation+0.011
Neg logit contributions
ingly-0.011
bing-0.011
quet-0.011
ized-0.010
hog-0.010
Token She
Token ID1375
Feature activation-4.48e-03

Pos logit contributions
 hospital+0.004
 inconvenience+0.004
 tells+0.004
 fault+0.003
 Enjoy+0.003
Neg logit contributions
quet-0.003
ized-0.002
bing-0.002
ingly-0.002
izing-0.002
Token was
Token ID373
Feature activation-2.23e-02

Pos logit contributions
 hospital+0.005
 tells+0.005
 dentist+0.005
 decides+0.005
 inconvenience+0.005
Neg logit contributions
quet-0.010
bing-0.009
ized-0.009
izing-0.009
ingly-0.009
Token mad
Token ID8805
Feature activation-2.99e-02

Pos logit contributions
 inconvenience+0.029
 fault+0.029
 Enjoy+0.029
 decides+0.028
 tells+0.027
Neg logit contributions
quet-0.042
ized-0.041
bing-0.040
ingly-0.040
hog-0.038
Token man
Token ID582
Feature activation+1.04e-02

Pos logit contributions
quet+0.004
ized+0.004
bing+0.004
hog+0.004
ised+0.004
Neg logit contributions
 fault-0.004
 Enjoy-0.004
 tells-0.004
ws-0.004
 supposed-0.004
Token drove
Token ID10357
Feature activation+9.35e-04

Pos logit contributions
quet+0.017
ized+0.017
bing+0.017
ingly+0.017
izing+0.016
Neg logit contributions
 inconvenience-0.016
 fault-0.016
 hospital-0.016
 Enjoy-0.015
 tells-0.015
Token off
Token ID572
Feature activation+7.45e-03

Pos logit contributions
quet+0.002
ized+0.002
ingly+0.001
bing+0.001
izing+0.001
Neg logit contributions
 hospital-0.001
 tells-0.001
 fault-0.001
 inconvenience-0.001
 Enjoy-0.001
Token in
Token ID287
Feature activation+9.04e-05

Pos logit contributions
quet+0.012
ized+0.011
bing+0.011
ingly+0.011
izing+0.011
Neg logit contributions
 hospital-0.012
 tells-0.012
 inconvenience-0.012
 fault-0.012
 Enjoy-0.011
Token his
Token ID465
Feature activation+1.47e-04

Pos logit contributions
quet+0.000
ingly+0.000
ized+0.000
bing+0.000
izing+0.000
Neg logit contributions
 hospital-0.000
 inconvenience-0.000
 fault-0.000
have-0.000
 tells-0.000
Token brand
Token ID4508
Feature activation-8.44e-02

Pos logit contributions
quet+0.000
ized+0.000
bing+0.000
ingly+0.000
ised+0.000
Neg logit contributions
 inconvenience-0.000
 hospital-0.000
 tells-0.000
 Enjoy-0.000
 fault-0.000
Token new
Token ID649
Feature activation-1.18e-02

Pos logit contributions
 inconvenience+0.120
 hospital+0.120
 tells+0.119
 decides+0.118
 Enjoy+0.116
Neg logit contributions
quet-0.152
ized-0.147
izing-0.143
bing-0.140
ingly-0.140
Token car
Token ID1097
Feature activation-4.42e-03

Pos logit contributions
 hospital+0.015
 tells+0.015
 inconvenience+0.015
 Enjoy+0.015
 fault+0.014
Neg logit contributions
quet-0.022
ized-0.022
bing-0.022
ingly-0.021
ised-0.021
Token,
Token ID11
Feature activation-8.78e-03

Pos logit contributions
 hospital+0.008
 tells+0.008
 inconvenience+0.008
 fault+0.008
 dentist+0.007
Neg logit contributions
quet-0.007
ized-0.006
bing-0.006
ingly-0.006
izing-0.006
Token and
Token ID290
Feature activation+1.17e-03

Pos logit contributions
 hospital+0.014
 inconvenience+0.014
 tells+0.014
 fault+0.014
 Enjoy+0.014
Neg logit contributions
quet-0.014
ized-0.013
bing-0.013
ingly-0.013
izing-0.013
Token he
Token ID339
Feature activation-1.37e-02

Pos logit contributions
quet+0.002
ized+0.002
bing+0.002
ingly+0.002
izing+0.002
Neg logit contributions
 fault-0.002
 inconvenience-0.002
 hospital-0.002
 tells-0.002
 Enjoy-0.002
Token heard
Token ID2982
Feature activation-2.45e-02

Pos logit contributions
 hospital+0.015
 tells+0.014
 dentist+0.013
 inconvenience+0.013
 decides+0.013
Neg logit contributions
quet-0.032
bing-0.030
ized-0.030
izing-0.030
ingly-0.029
Token the
Token ID262
Feature activation-1.37e-02

Pos logit contributions
 fault+0.043
 inconvenience+0.042
 Enjoy+0.041
 hospital+0.041
 decides+0.040
Neg logit contributions
quet-0.037
ized-0.035
bing-0.034
ingly-0.034
izing-0.033
Token noise
Token ID7838
Feature activation-4.43e-02

Pos logit contributions
 Enjoy+0.022
ws+0.019
 fault+0.018
 decides+0.018
have+0.018
Neg logit contributions
quet-0.024
bing-0.023
ingly-0.023
ized-0.023
hetically-0.023
Token and
Token ID290
Feature activation-2.26e-02

Pos logit contributions
 hospital+0.075
 tells+0.072
 inconvenience+0.068
 dentist+0.067
 microwave+0.066
Neg logit contributions
quet-0.072
ized-0.067
bing-0.066
izing-0.065
ingly-0.064
Token the
Token ID262
Feature activation-1.70e-02

Pos logit contributions
 hospital+0.022
 tells+0.022
 decides+0.020
 dentist+0.019
 inconvenience+0.018
Neg logit contributions
quet-0.054
bing-0.051
ized-0.051
izing-0.050
ingly-0.050
Token crying
Token ID13774
Feature activation-8.40e-02

Pos logit contributions
 Enjoy+0.027
ws+0.024
have+0.024
 fault+0.023
 decides+0.023
Neg logit contributions
quet-0.028
ized-0.028
bing-0.028
ingly-0.028
izing-0.027
Token.
Token ID13
Feature activation+4.93e-03

Pos logit contributions
 hospital+0.142
 Enjoy+0.141
 inconvenience+0.139
 fault+0.137
 tells+0.137
Neg logit contributions
quet-0.131
ingly-0.125
ized-0.123
bing-0.123
izing-0.119
Token She
Token ID1375
Feature activation-1.40e-03

Pos logit contributions
quet+0.007
ized+0.007
bing+0.007
ingly+0.007
izing+0.006
Neg logit contributions
 hospital-0.009
 tells-0.009
 inconvenience-0.009
 fault-0.008
 Enjoy-0.008
Token came
Token ID1625
Feature activation-5.03e-03

Pos logit contributions
 tells+0.002
 hospital+0.002
 decides+0.001
 dentist+0.001
 inconvenience+0.001
Neg logit contributions
quet-0.003
bing-0.003
ized-0.003
izing-0.003
ingly-0.003
Token to
Token ID284
Feature activation-2.36e-02

Pos logit contributions
 hospital+0.007
 tells+0.007
 decides+0.006
 fault+0.006
have+0.006
Neg logit contributions
quet-0.010
ized-0.009
izing-0.009
ingly-0.009
bing-0.009
Token see
Token ID766
Feature activation-3.31e-02

Pos logit contributions
 hospital+0.029
 tells+0.028
 inconvenience+0.027
 Enjoy+0.026
 dentist+0.026
Neg logit contributions
quet-0.050
bing-0.047
ized-0.047
ingly-0.047
izing-0.046
Token with
Token ID351
Feature activation-1.46e-03

Pos logit contributions
 inconvenience+0.073
 hospital+0.071
 tells+0.070
 Enjoy+0.069
 fault+0.069
Neg logit contributions
quet-0.076
bing-0.073
ingly-0.071
ized-0.071
izing-0.068
Token your
Token ID534
Feature activation-2.93e-02

Pos logit contributions
 inconvenience+0.002
 Congratulations+0.002
arre+0.002
 decides+0.002
 hospital+0.002
Neg logit contributions
bing-0.002
quet-0.002
ized-0.002
ingly-0.002
izing-0.002
Token cough
Token ID22094
Feature activation-5.45e-02

Pos logit contributions
 inconvenience+0.050
 Enjoy+0.050
have+0.049
 hospital+0.048
 Congratulations+0.048
Neg logit contributions
ingly-0.041
quet-0.040
bing-0.039
 fingertips-0.038
 spe-0.038
Token and
Token ID290
Feature activation-1.94e-02

Pos logit contributions
 hospital+0.090
 Enjoy+0.089
 inconvenience+0.089
 fault+0.085
 tells+0.084
Neg logit contributions
quet-0.084
ingly-0.082
ized-0.081
bing-0.080
izing-0.078
Token your
Token ID534
Feature activation-3.27e-02

Pos logit contributions
 inconvenience+0.029
 fault+0.028
 hospital+0.027
 Congratulations+0.027
 Enjoy+0.026
Neg logit contributions
bing-0.033
quet-0.031
ingly-0.031
bed-0.030
ized-0.030
Token fever
Token ID17372
Feature activation-8.38e-02

Pos logit contributions
 inconvenience+0.057
 Enjoy+0.055
 hospital+0.054
have+0.053
 Congratulations+0.053
Neg logit contributions
ingly-0.048
edly-0.045
bing-0.045
quet-0.045
ized-0.043
Token,"
Token ID553
Feature activation-1.25e-02

Pos logit contributions
 Enjoy+0.144
 hospital+0.141
 inconvenience+0.137
 aquarium+0.135
 dentist+0.133
Neg logit contributions
quet-0.123
ingly-0.120
ized-0.116
bing-0.116
izing-0.113
Token Mom
Token ID11254
Feature activation-2.31e-02

Pos logit contributions
 inconvenience+0.019
 fault+0.018
 hospital+0.018
 Enjoy+0.018
ws+0.017
Neg logit contributions
quet-0.022
ized-0.021
ingly-0.021
bing-0.021
ised-0.020
Token urged
Token ID11643
Feature activation-1.84e-02

Pos logit contributions
 hospital+0.026
 fault+0.025
 inconvenience+0.025
 Enjoy+0.024
 tells+0.023
Neg logit contributions
quet-0.050
ized-0.048
ingly-0.048
bing-0.047
izing-0.046
Token.
Token ID13
Feature activation-1.19e-02

Pos logit contributions
 hospital+0.022
 tells+0.021
 inconvenience+0.020
 fault+0.020
 dentist+0.020
Neg logit contributions
quet-0.039
bing-0.037
ized-0.036
izing-0.036
stones-0.035
Token\n
Token ID198
Feature activation-1.79e-02

Pos logit contributions
 hospital+0.015
 inconvenience+0.015
 fault+0.015
have+0.014
 tells+0.014
Neg logit contributions
quet-0.023
ized-0.022
bing-0.022
ingly-0.022
izing-0.021
Token floor
Token ID4314
Feature activation-2.97e-02

Pos logit contributions
quet+0.001
ized+0.001
bing+0.001
ingly+0.001
hog+0.001
Neg logit contributions
 Enjoy-0.001
 tells-0.001
 hospital-0.001
 inconvenience-0.001
have-0.001
Token and
Token ID290
Feature activation-2.12e-03

Pos logit contributions
 hospital+0.054
 inconvenience+0.053
 tells+0.052
 Enjoy+0.052
 fault+0.050
Neg logit contributions
quet-0.043
ized-0.041
ingly-0.040
bing-0.040
izing-0.039
Token Sara
Token ID24799
Feature activation-9.03e-03

Pos logit contributions
 inconvenience+0.004
 fault+0.004
 Enjoy+0.003
 hospital+0.003
have+0.003
Neg logit contributions
bing-0.003
ized-0.003
quet-0.003
ingly-0.003
bed-0.003
Token and
Token ID290
Feature activation-2.20e-03

Pos logit contributions
 inconvenience+0.016
 hospital+0.015
 fault+0.015
 Enjoy+0.015
 dentist+0.014
Neg logit contributions
ingly-0.013
bing-0.013
ized-0.013
als-0.013
bed-0.013
Token Ben
Token ID3932
Feature activation+4.34e-03

Pos logit contributions
 inconvenience+0.003
 hospital+0.003
have+0.003
 fault+0.003
 Enjoy+0.002
Neg logit contributions
als-0.004
quet-0.004
ingly-0.004
ized-0.004
bing-0.004
Token crying
Token ID13774
Feature activation-8.37e-02

Pos logit contributions
als+0.006
ingly+0.006
ized+0.006
quet+0.006
bing+0.006
Neg logit contributions
 inconvenience-0.008
 dentist-0.008
 hospital-0.008
 fault-0.008
 Enjoy-0.007
Token.
Token ID13
Feature activation-1.23e-02

Pos logit contributions
 inconvenience+0.156
 Enjoy+0.152
 hospital+0.152
 fault+0.148
 dentist+0.147
Neg logit contributions
ingly-0.116
quet-0.113
bing-0.112
ized-0.111
edly-0.109
Token She
Token ID1375
Feature activation+5.69e-03

Pos logit contributions
 inconvenience+0.025
 done+0.024
have+0.024
 hospital+0.023
 supposed+0.023
Neg logit contributions
bing-0.014
hog-0.014
bed-0.013
ingly-0.013
edly-0.013
Token was
Token ID373
Feature activation-5.38e-03

Pos logit contributions
ized+0.009
hog+0.009
ingly+0.009
als+0.009
bed+0.009
Neg logit contributions
 inconvenience-0.009
 fault-0.008
 Enjoy-0.007
 yours-0.007
 supposed-0.007
Token angry
Token ID7954
Feature activation-4.85e-03

Pos logit contributions
 decides+0.009
ws+0.009
 inconvenience+0.009
 Enjoy+0.009
 fault+0.009
Neg logit contributions
ized-0.008
bing-0.007
hog-0.007
quet-0.007
ingly-0.007
Token and
Token ID290
Feature activation-2.14e-02

Pos logit contributions
 hospital+0.010
 inconvenience+0.009
 tells+0.009
 Enjoy+0.009
 fault+0.009
Neg logit contributions
quet-0.006
bing-0.006
ingly-0.006
ized-0.006
izing-0.006
Token on
Token ID319
Feature activation+9.79e-03

Pos logit contributions
quet+0.015
ized+0.015
bing+0.015
ingly+0.015
izing+0.014
Neg logit contributions
 inconvenience-0.013
 hospital-0.013
 fault-0.012
 tells-0.012
 Enjoy-0.012
Token her
Token ID607
Feature activation+4.40e-03

Pos logit contributions
quet+0.018
ized+0.017
bing+0.017
ingly+0.017
izing+0.016
Neg logit contributions
 hospital-0.014
 inconvenience-0.014
 tells-0.014
 fault-0.013
 Enjoy-0.013
Token mom
Token ID1995
Feature activation+7.19e-03

Pos logit contributions
quet+0.008
ized+0.008
bing+0.008
ingly+0.008
izing+0.008
Neg logit contributions
 hospital-0.006
 tells-0.006
 inconvenience-0.006
 fault-0.006
 Enjoy-0.006
Tokenmy
Token ID1820
Feature activation+1.25e-02

Pos logit contributions
quet+0.014
ized+0.014
bing+0.014
ingly+0.014
izing+0.013
Neg logit contributions
 hospital-0.009
 tells-0.009
 inconvenience-0.009
 fault-0.008
 Enjoy-0.008
Token's
Token ID338
Feature activation-1.52e-02

Pos logit contributions
quet+0.021
bing+0.021
ingly+0.021
ized+0.020
izing+0.020
Neg logit contributions
 inconvenience-0.019
 hospital-0.019
 Enjoy-0.018
 fault-0.018
 tells-0.018
Token brand
Token ID4508
Feature activation-8.36e-02

Pos logit contributions
 hospital+0.023
 inconvenience+0.022
 tells+0.022
 Enjoy+0.022
 fault+0.021
Neg logit contributions
quet-0.027
bing-0.025
ized-0.025
ingly-0.025
izing-0.025
Token new
Token ID649
Feature activation+3.10e-04

Pos logit contributions
 hospital+0.114
 inconvenience+0.112
 tells+0.108
 fault+0.107
 Enjoy+0.105
Neg logit contributions
quet-0.159
ized-0.151
bing-0.148
izing-0.148
ingly-0.147
Token table
Token ID3084
Feature activation-2.67e-03

Pos logit contributions
quet+0.001
bing+0.000
ized+0.000
ingly+0.000
izing+0.000
Neg logit contributions
 hospital-0.000
 inconvenience-0.000
 Enjoy-0.000
 tells-0.000
 fault-0.000
Token!
Token ID0
Feature activation-2.41e-03

Pos logit contributions
 hospital+0.004
 tells+0.004
 inconvenience+0.004
 fault+0.004
 dentist+0.004
Neg logit contributions
quet-0.004
bing-0.004
ized-0.004
izing-0.004
ingly-0.004
Token\n
Token ID198
Feature activation+5.24e-03

Pos logit contributions
 inconvenience+0.005
 hospital+0.005
 grounded+0.004
have+0.004
 fault+0.004
Neg logit contributions
quet-0.003
ingly-0.003
bing-0.003
als-0.003
edly-0.003
Token\n
Token ID198
Feature activation+1.64e-02

Pos logit contributions
quet+0.010
bing+0.009
ingly+0.009
ized+0.009
ised+0.009
Neg logit contributions
 hospital-0.007
 inconvenience-0.007
 fault-0.007
 tells-0.007
 decides-0.007
Token spilling
Token ID45085
Feature activation-2.70e-02

Pos logit contributions
quet+0.001
ized+0.001
ingly+0.001
ised+0.001
izing+0.001
Neg logit contributions
 hospital-0.001
 tells-0.001
 mess-0.001
 microwave-0.001
 done-0.001
Token over
Token ID625
Feature activation+1.59e-02

Pos logit contributions
 hospital+0.041
 inconvenience+0.039
 tells+0.039
 fault+0.038
 Enjoy+0.038
Neg logit contributions
quet-0.047
ized-0.045
bing-0.045
ingly-0.044
izing-0.043
Token and
Token ID290
Feature activation-3.17e-02

Pos logit contributions
quet+0.036
worm+0.032
ized+0.032
imm+0.032
stones+0.032
Neg logit contributions
 hospital-0.018
 tells-0.016
 mess-0.016
 fault-0.016
 microwave-0.016
Token the
Token ID262
Feature activation-1.59e-02

Pos logit contributions
 hospital+0.045
 inconvenience+0.045
 fault+0.044
 Enjoy+0.043
 tells+0.042
Neg logit contributions
quet-0.057
bing-0.056
ized-0.056
ingly-0.055
izing-0.053
Token twins
Token ID20345
Feature activation-4.47e-02

Pos logit contributions
 Enjoy+0.023
have+0.021
 decides+0.021
 inconvenience+0.021
 tells+0.020
Neg logit contributions
quet-0.028
bing-0.027
ingly-0.027
ized-0.027
worm-0.026
Token crying
Token ID13774
Feature activation-8.33e-02

Pos logit contributions
 inconvenience+0.075
 Enjoy+0.074
 hospital+0.074
 fault+0.073
 tells+0.071
Neg logit contributions
quet-0.069
ingly-0.067
ized-0.067
bing-0.067
izing-0.062
Token.
Token ID13
Feature activation+1.24e-03

Pos logit contributions
 hospital+0.143
 tells+0.138
 inconvenience+0.136
 fault+0.134
have+0.131
Neg logit contributions
quet-0.131
ized-0.122
bing-0.121
ingly-0.118
izing-0.118
Token She
Token ID1375
Feature activation+6.02e-04

Pos logit contributions
quet+0.002
ized+0.002
bing+0.002
ingly+0.002
izing+0.002
Neg logit contributions
 hospital-0.002
 tells-0.002
 inconvenience-0.002
 fault-0.002
 Enjoy-0.002
Token quickly
Token ID2952
Feature activation-3.00e-02

Pos logit contributions
quet+0.001
bing+0.001
ized+0.001
izing+0.001
ingly+0.001
Neg logit contributions
 hospital-0.001
 tells-0.001
 dentist-0.001
 decides-0.001
 inconvenience-0.001
Token turned
Token ID2900
Feature activation-8.27e-03

Pos logit contributions
 hospital+0.043
 fault+0.042
 inconvenience+0.042
 tells+0.041
 Enjoy+0.041
Neg logit contributions
quet-0.054
ized-0.052
bing-0.051
ingly-0.051
izing-0.050
Token off
Token ID572
Feature activation+1.27e-02

Pos logit contributions
 hospital+0.014
 tells+0.014
 inconvenience+0.013
 fault+0.013
 Enjoy+0.013
Neg logit contributions
quet-0.013
ized-0.012
bing-0.012
ingly-0.012
izing-0.012
Token came
Token ID1625
Feature activation-1.01e-02

Pos logit contributions
 hospital+0.026
 inconvenience+0.026
 fault+0.026
 Enjoy+0.025
 tells+0.025
Neg logit contributions
quet-0.040
bing-0.038
ingly-0.038
ized-0.038
hog-0.036
Token in
Token ID287
Feature activation-8.70e-03

Pos logit contributions
 hospital+0.015
 tells+0.014
have+0.013
 decides+0.013
 supposed+0.013
Neg logit contributions
quet-0.019
ized-0.018
izing-0.018
ingly-0.017
edly-0.017
Token and
Token ID290
Feature activation-1.75e-02

Pos logit contributions
 hospital+0.015
 tells+0.015
 inconvenience+0.014
 fault+0.014
 Enjoy+0.014
Neg logit contributions
quet-0.013
ized-0.013
bing-0.013
ingly-0.012
izing-0.012
Token saw
Token ID2497
Feature activation-3.71e-02

Pos logit contributions
 fault+0.027
 inconvenience+0.026
 hospital+0.025
 Enjoy+0.023
 dentist+0.022
Neg logit contributions
ized-0.031
quet-0.031
bing-0.030
hog-0.029
stones-0.029
Token her
Token ID607
Feature activation-4.33e-03

Pos logit contributions
 inconvenience+0.076
 fault+0.075
 reservation+0.071
 Enjoy+0.070
 decides+0.068
Neg logit contributions
ized-0.045
bing-0.043
quet-0.042
izing-0.041
ised-0.040
Token crying
Token ID13774
Feature activation-8.31e-02

Pos logit contributions
 inconvenience+0.006
 Enjoy+0.006
 decides+0.005
 hospital+0.005
 tells+0.005
Neg logit contributions
ingly-0.008
quet-0.008
ized-0.008
bing-0.008
izing-0.008
Token.
Token ID13
Feature activation+2.14e-04

Pos logit contributions
 inconvenience+0.153
 Enjoy+0.151
 hospital+0.149
 fault+0.144
 tells+0.144
Neg logit contributions
quet-0.120
ingly-0.119
bing-0.117
ized-0.113
izing-0.110
Token He
Token ID679
Feature activation+8.52e-03

Pos logit contributions
hog+0.000
quet+0.000
ingly+0.000
edly+0.000
bing+0.000
Neg logit contributions
 inconvenience-0.000
 fault-0.000
have-0.000
ws-0.000
 supposed-0.000
Token asked
Token ID1965
Feature activation-8.52e-03

Pos logit contributions
quet+0.016
ized+0.016
bing+0.015
ingly+0.015
izing+0.015
Neg logit contributions
 fault-0.012
 inconvenience-0.012
 hospital-0.012
 Enjoy-0.011
 microwave-0.010
Token her
Token ID607
Feature activation-1.54e-02

Pos logit contributions
 hospital+0.013
 inconvenience+0.013
 fault+0.013
 tells+0.013
 Enjoy+0.012
Neg logit contributions
quet-0.015
bing-0.014
ized-0.014
ingly-0.014
izing-0.013
Token,
Token ID11
Feature activation-8.57e-03

Pos logit contributions
 inconvenience+0.017
 hospital+0.015
 fault+0.015
 Enjoy+0.015
 microwave+0.015
Neg logit contributions
quet-0.032
ingly-0.031
bing-0.031
hog-0.030
ized-0.030
Token shop
Token ID6128
Feature activation-2.14e-02

Pos logit contributions
 fault+0.024
 inconvenience+0.024
have+0.023
ws+0.023
 tells+0.023
Neg logit contributions
quet-0.028
ized-0.027
ingly-0.026
izing-0.026
bing-0.025
Token sitting
Token ID5586
Feature activation-3.31e-02

Pos logit contributions
 hospital+0.037
 inconvenience+0.037
 fault+0.036
 tells+0.036
 Enjoy+0.036
Neg logit contributions
quet-0.033
ized-0.031
ingly-0.031
bing-0.030
izing-0.029
Token by
Token ID416
Feature activation-1.61e-02

Pos logit contributions
 inconvenience+0.066
 hospital+0.065
 fault+0.064
 tells+0.063
 decides+0.061
Neg logit contributions
ized-0.042
quet-0.042
ingly-0.042
bing-0.041
edly-0.038
Token himself
Token ID2241
Feature activation-3.24e-03

Pos logit contributions
 fault+0.030
 reservation+0.029
 hospital+0.029
 microwave+0.028
arre+0.028
Neg logit contributions
ingly-0.022
bing-0.022
ized-0.021
edly-0.021
quet-0.021
Token and
Token ID290
Feature activation-3.41e-02

Pos logit contributions
 inconvenience+0.006
 hospital+0.006
 Enjoy+0.006
 tells+0.006
 dentist+0.006
Neg logit contributions
quet-0.004
ingly-0.004
bing-0.004
ized-0.004
izing-0.004
Token crying
Token ID13774
Feature activation-8.30e-02

Pos logit contributions
 fault+0.062
 inconvenience+0.057
ws+0.055
 hospital+0.054
upid+0.054
Neg logit contributions
ized-0.055
bing-0.054
quet-0.051
izing-0.050
ingly-0.050
Token.
Token ID13
Feature activation-5.86e-04

Pos logit contributions
 hospital+0.142
 inconvenience+0.141
 Enjoy+0.138
 tells+0.137
 fault+0.136
Neg logit contributions
quet-0.125
ingly-0.124
ized-0.123
bing-0.122
izing-0.117
Token
Token ID220
Feature activation-9.44e-03

Pos logit contributions
 inconvenience+0.001
 fault+0.001
 hospital+0.001
have+0.001
ws+0.001
Neg logit contributions
ized-0.001
quet-0.001
hog-0.001
bing-0.001
ingly-0.001
Token\n
Token ID198
Feature activation-2.45e-03

Pos logit contributions
 inconvenience+0.016
 hospital+0.016
 decides+0.015
 reservation+0.014
 microwave+0.014
Neg logit contributions
bing-0.014
quet-0.014
ized-0.014
ingly-0.013
ised-0.013
Token\n
Token ID198
Feature activation+6.87e-03

Pos logit contributions
 hospital+0.003
 inconvenience+0.003
 fault+0.003
 tells+0.003
 Enjoy+0.003
Neg logit contributions
quet-0.005
ized-0.005
bing-0.005
ingly-0.005
izing-0.005
TokenJack
Token ID14295
Feature activation+1.57e-02

Pos logit contributions
quet+0.011
bing+0.010
ized+0.010
ingly+0.010
izing+0.009
Neg logit contributions
 hospital-0.012
 tells-0.012
 inconvenience-0.011
 Enjoy-0.011
have-0.011
Token crown
Token ID12389
Feature activation+2.77e-02

Pos logit contributions
ingly+0.003
quet+0.003
ized+0.003
izing+0.003
bing+0.003
Neg logit contributions
 hospital-0.002
 tells-0.002
 fault-0.002
 mess-0.002
 done-0.002
Token and
Token ID290
Feature activation-6.39e-03

Pos logit contributions
ised+0.052
bing+0.051
edly+0.051
hog+0.051
izing+0.050
Neg logit contributions
 done-0.034
 hospital-0.032
 fault-0.031
 mess-0.031
 tells-0.030
Token playing
Token ID2712
Feature activation-2.71e-02

Pos logit contributions
 inconvenience+0.010
 hospital+0.010
 Enjoy+0.010
 fault+0.009
 microwave+0.009
Neg logit contributions
quet-0.011
bing-0.010
ized-0.010
ingly-0.010
izing-0.010
Token in
Token ID287
Feature activation+3.45e-03

Pos logit contributions
 hospital+0.056
 tells+0.054
 inconvenience+0.053
 Enjoy+0.053
 dentist+0.052
Neg logit contributions
quet-0.032
ingly-0.030
ized-0.030
bing-0.029
izing-0.028
Token the
Token ID262
Feature activation-1.88e-02

Pos logit contributions
quet+0.007
ized+0.006
bing+0.006
ingly+0.006
izing+0.006
Neg logit contributions
 hospital-0.005
 tells-0.005
 inconvenience-0.004
 fault-0.004
 Enjoy-0.004
Token freshly
Token ID29026
Feature activation-8.27e-02

Pos logit contributions
 hospital+0.026
 inconvenience+0.026
 tells+0.025
 fault+0.025
 Enjoy+0.024
Neg logit contributions
quet-0.035
ized-0.034
bing-0.034
ingly-0.033
izing-0.032
Token cut
Token ID2005
Feature activation-1.18e-03

Pos logit contributions
 hospital+0.112
 inconvenience+0.111
 fault+0.110
 tells+0.108
 Enjoy+0.103
Neg logit contributions
quet-0.157
bing-0.150
ized-0.149
ingly-0.145
izing-0.144
Token grass
Token ID8701
Feature activation-1.94e-03

Pos logit contributions
 hospital+0.002
 tells+0.002
 fault+0.002
 microwave+0.002
 dentist+0.002
Neg logit contributions
quet-0.002
ized-0.002
bing-0.002
izing-0.002
ingly-0.002
Token.
Token ID13
Feature activation+6.28e-03

Pos logit contributions
 hospital+0.003
 fault+0.003
 tells+0.003
 inconvenience+0.003
 dentist+0.003
Neg logit contributions
quet-0.003
ized-0.003
bing-0.003
ingly-0.003
izing-0.003
Token It
Token ID632
Feature activation+8.30e-03

Pos logit contributions
quet+0.010
bing+0.009
ingly+0.009
ized+0.009
izing+0.009
Neg logit contributions
 inconvenience-0.011
 hospital-0.010
have-0.010
 fault-0.010
 tells-0.010
Token looked
Token ID3114
Feature activation+3.81e-03

Pos logit contributions
quet+0.020
ingly+0.019
bing+0.018
edly+0.018
izing+0.018
Neg logit contributions
 tells-0.008
 hospital-0.008
 decides-0.008
 mess-0.007
 done-0.007
Token the
Token ID262
Feature activation-3.20e-02

Pos logit contributions
 hospital+0.009
 tells+0.008
 inconvenience+0.008
 fault+0.008
 Enjoy+0.008
Neg logit contributions
quet-0.012
ized-0.011
bing-0.011
ingly-0.011
izing-0.011
Token floor
Token ID4314
Feature activation-3.32e-02

Pos logit contributions
 hospital+0.040
 tells+0.039
 inconvenience+0.038
 fault+0.037
 Enjoy+0.037
Neg logit contributions
quet-0.065
ized-0.061
bing-0.061
ingly-0.061
izing-0.060
Token and
Token ID290
Feature activation-2.69e-02

Pos logit contributions
 hospital+0.060
 tells+0.058
 inconvenience+0.057
 fault+0.057
 Enjoy+0.055
Neg logit contributions
quet-0.049
ized-0.045
bing-0.045
ingly-0.044
izing-0.044
Token the
Token ID262
Feature activation-2.93e-02

Pos logit contributions
 fault+0.042
 inconvenience+0.042
 Enjoy+0.041
 hospital+0.041
have+0.039
Neg logit contributions
quet-0.044
ized-0.043
bing-0.042
ingly-0.042
izing-0.040
Token children
Token ID1751
Feature activation-4.73e-02

Pos logit contributions
 Enjoy+0.046
have+0.042
 decides+0.042
 tells+0.041
 fault+0.041
Neg logit contributions
quet-0.048
bing-0.047
ized-0.046
ingly-0.046
hog-0.046
Token crying
Token ID13774
Feature activation-8.26e-02

Pos logit contributions
 Enjoy+0.078
 inconvenience+0.077
 hospital+0.077
 fault+0.076
 tells+0.074
Neg logit contributions
quet-0.074
bing-0.073
ingly-0.073
ized-0.072
izing-0.068
Token.
Token ID13
Feature activation-2.22e-03

Pos logit contributions
 hospital+0.137
 inconvenience+0.132
 tells+0.132
 fault+0.130
 Enjoy+0.129
Neg logit contributions
quet-0.134
ized-0.127
bing-0.126
ingly-0.125
izing-0.122
Token She
Token ID1375
Feature activation-4.00e-03

Pos logit contributions
 hospital+0.004
 tells+0.004
 inconvenience+0.004
 fault+0.004
 Enjoy+0.004
Neg logit contributions
quet-0.003
bing-0.003
ized-0.003
ingly-0.003
izing-0.003
Token said
Token ID531
Feature activation-5.39e-03

Pos logit contributions
 tells+0.005
 hospital+0.005
 dentist+0.004
 decides+0.004
have+0.004
Neg logit contributions
quet-0.009
bing-0.008
ized-0.008
izing-0.008
ingly-0.008
Token,
Token ID11
Feature activation-9.49e-03

Pos logit contributions
 hospital+0.010
 tells+0.010
 fault+0.009
 inconvenience+0.009
 Enjoy+0.009
Neg logit contributions
quet-0.008
ized-0.008
bing-0.007
izing-0.007
ingly-0.007
Token "
Token ID366
Feature activation+7.19e-03

Pos logit contributions
 hospital+0.015
 fault+0.015
 inconvenience+0.015
have+0.014
 decides+0.014
Neg logit contributions
quet-0.016
bing-0.015
ingly-0.015
ized-0.014
izing-0.014
Token's
Token ID338
Feature activation-2.11e-02

Pos logit contributions
quet+0.052
bing+0.051
ingly+0.050
ized+0.050
izing+0.048
Neg logit contributions
 hospital-0.027
 inconvenience-0.027
 fault-0.026
 Enjoy-0.026
 tells-0.025
Token voice
Token ID3809
Feature activation-1.08e-02

Pos logit contributions
 hospital+0.036
 inconvenience+0.035
 tells+0.035
 Enjoy+0.035
 fault+0.034
Neg logit contributions
quet-0.032
ingly-0.031
bing-0.030
ized-0.030
izing-0.029
Token call
Token ID869
Feature activation-3.33e-02

Pos logit contributions
 Enjoy+0.017
 fault+0.017
 hospital+0.017
 inconvenience+0.017
 tells+0.016
Neg logit contributions
quet-0.018
ized-0.017
bing-0.017
ingly-0.017
izing-0.017
Token out
Token ID503
Feature activation-1.32e-02

Pos logit contributions
 inconvenience+0.053
 fault+0.053
 hospital+0.053
 decides+0.052
 Enjoy+0.051
Neg logit contributions
quet-0.054
ingly-0.053
bing-0.052
ized-0.049
izing-0.048
Token from
Token ID422
Feature activation-2.67e-03

Pos logit contributions
 hospital+0.020
 inconvenience+0.020
 tells+0.020
 fault+0.019
 Enjoy+0.019
Neg logit contributions
quet-0.023
bing-0.021
ized-0.021
ingly-0.021
izing-0.021
Token afar
Token ID44246
Feature activation-8.21e-02

Pos logit contributions
 fault+0.005
 inconvenience+0.005
 hospital+0.005
 tells+0.005
have+0.005
Neg logit contributions
quet-0.003
ingly-0.003
bing-0.003
ized-0.003
izing-0.003
Token.
Token ID13
Feature activation+1.05e-02

Pos logit contributions
 hospital+0.143
 inconvenience+0.141
 Enjoy+0.140
 tells+0.138
have+0.137
Neg logit contributions
quet-0.121
ingly-0.117
bing-0.115
ized-0.114
izing-0.112
Token "
Token ID366
Feature activation+3.07e-02

Pos logit contributions
quet+0.016
ingly+0.015
bing+0.015
izing+0.014
ized+0.014
Neg logit contributions
 fault-0.018
 inconvenience-0.018
 hospital-0.018
have-0.017
ws-0.017
TokenAndy
Token ID35314
Feature activation+2.38e-02

Pos logit contributions
quet+0.056
ized+0.053
bing+0.052
ingly+0.052
izing+0.051
Neg logit contributions
 hospital-0.045
 tells-0.043
 inconvenience-0.043
 Enjoy-0.042
 fault-0.041
Token!
Token ID0
Feature activation-1.98e-02

Pos logit contributions
quet+0.026
bing+0.025
ingly+0.025
ized+0.025
izing+0.023
Neg logit contributions
 hospital-0.050
 inconvenience-0.049
 Enjoy-0.048
 fault-0.048
have-0.047
Token Hur
Token ID10077
Feature activation-2.94e-02

Pos logit contributions
 hospital+0.032
 fault+0.030
ws+0.030
 inconvenience+0.030
 mess+0.030
Neg logit contributions
quet-0.032
hog-0.031
ingly-0.030
bing-0.030
als-0.030
Token put
Token ID1234
Feature activation-1.84e-02

Pos logit contributions
 microwave+0.030
 fault+0.029
 dentist+0.028
 hospital+0.028
 inconvenience+0.028
Neg logit contributions
quet-0.037
bing-0.036
ingly-0.035
izing-0.034
ized-0.033
Token your
Token ID534
Feature activation+1.04e-02

Pos logit contributions
 inconvenience+0.027
 hospital+0.026
 fault+0.024
 tells+0.024
 Enjoy+0.024
Neg logit contributions
ized-0.032
quet-0.032
bing-0.032
ingly-0.031
ised-0.030
Token toys
Token ID14958
Feature activation+1.54e-02

Pos logit contributions
quet+0.017
bing+0.017
ized+0.017
ingly+0.017
ised+0.016
Neg logit contributions
 Enjoy-0.015
 inconvenience-0.015
 hospital-0.015
 tells-0.014
 decides-0.014
Token away
Token ID1497
Feature activation+6.38e-03

Pos logit contributions
quet+0.027
ized+0.025
bing+0.025
ingly+0.025
izing+0.024
Neg logit contributions
 hospital-0.024
 tells-0.023
 inconvenience-0.023
 fault-0.022
 Enjoy-0.022
Token when
Token ID618
Feature activation+1.28e-03

Pos logit contributions
quet+0.010
ized+0.009
bing+0.009
ingly+0.009
izing+0.009
Neg logit contributions
 hospital-0.011
 tells-0.011
 inconvenience-0.011
 fault-0.010
 Enjoy-0.010
Token you
Token ID345
Feature activation-8.17e-02

Pos logit contributions
quet+0.002
bing+0.002
ized+0.002
ingly+0.002
izing+0.002
Neg logit contributions
 fault-0.002
 Enjoy-0.002
ws-0.002
 inconvenience-0.002
have-0.002
Token are
Token ID389
Feature activation-1.34e-02

Pos logit contributions
 fault+0.124
 inconvenience+0.119
 dentist+0.118
 nonsense+0.112
 microwave+0.112
Neg logit contributions
als-0.132
ized-0.125
quet-0.119
ingly-0.118
bed-0.115
Token done
Token ID1760
Feature activation-2.98e-02

Pos logit contributions
 Enjoy+0.016
 decides+0.016
 Good+0.014
 Time+0.014
 fault+0.014
Neg logit contributions
bing-0.026
quet-0.026
ized-0.026
ingly-0.025
 beneath-0.025
Token.
Token ID13
Feature activation-4.72e-03

Pos logit contributions
 inconvenience+0.062
 decides+0.060
 Enjoy+0.060
 hospital+0.060
ws+0.059
Neg logit contributions
bing-0.039
bed-0.034
ized-0.034
ingly-0.034
izing-0.031
Token Come
Token ID7911
Feature activation-1.19e-02

Pos logit contributions
 hospital+0.006
 inconvenience+0.006
 fault+0.006
have+0.006
 done+0.006
Neg logit contributions
quet-0.009
bing-0.008
ingly-0.008
izing-0.008
edly-0.008
Token on
Token ID319
Feature activation-2.27e-02

Pos logit contributions
 hospital+0.020
 inconvenience+0.020
 tells+0.019
 fault+0.019
 Enjoy+0.019
Neg logit contributions
quet-0.019
bing-0.018
ized-0.017
ingly-0.017
izing-0.017
Token the
Token ID262
Feature activation-1.09e-02

Pos logit contributions
 hospital+0.032
 inconvenience+0.032
 Enjoy+0.031
 fault+0.031
 tells+0.030
Neg logit contributions
quet-0.047
ized-0.047
bing-0.046
ingly-0.045
izing-0.045
Token broken
Token ID5445
Feature activation-4.06e-02

Pos logit contributions
 Enjoy+0.016
 decides+0.013
 inconvenience+0.013
 Time+0.013
 tells+0.013
Neg logit contributions
ized-0.020
quet-0.020
bing-0.019
izing-0.019
ingly-0.019
Token plane
Token ID6614
Feature activation+1.23e-02

Pos logit contributions
 hospital+0.065
 tells+0.063
 inconvenience+0.063
 Enjoy+0.063
 fault+0.062
Neg logit contributions
quet-0.067
ized-0.063
bing-0.063
ingly-0.062
izing-0.061
Token and
Token ID290
Feature activation-2.17e-02

Pos logit contributions
quet+0.019
ized+0.018
ingly+0.018
bing+0.018
izing+0.017
Neg logit contributions
 hospital-0.021
 tells-0.020
 inconvenience-0.020
 fault-0.019
 dentist-0.019
Token the
Token ID262
Feature activation-2.10e-02

Pos logit contributions
 fault+0.034
 inconvenience+0.034
 Enjoy+0.034
 hospital+0.033
 microwave+0.032
Neg logit contributions
bing-0.034
ized-0.034
quet-0.034
izing-0.032
ingly-0.032
Token crying
Token ID13774
Feature activation-8.17e-02

Pos logit contributions
 Enjoy+0.035
 decides+0.029
have+0.029
 Time+0.029
 Congratulations+0.028
Neg logit contributions
ized-0.035
quet-0.034
bing-0.033
 spe-0.033
ingly-0.032
Token children
Token ID1751
Feature activation-4.22e-02

Pos logit contributions
 Enjoy+0.128
 hospital+0.126
 inconvenience+0.125
 fault+0.121
 tells+0.118
Neg logit contributions
quet-0.140
ingly-0.135
bing-0.133
ized-0.132
izing-0.128
Token.
Token ID13
Feature activation-2.85e-03

Pos logit contributions
 Enjoy+0.066
 inconvenience+0.066
 hospital+0.066
 fault+0.064
 dentist+0.064
Neg logit contributions
quet-0.070
bing-0.069
ingly-0.068
ized-0.067
izing-0.064
Token She
Token ID1375
Feature activation-4.02e-03

Pos logit contributions
 hospital+0.005
 tells+0.005
 decides+0.004
 fault+0.004
 inconvenience+0.004
Neg logit contributions
quet-0.005
ized-0.005
bing-0.005
izing-0.004
ingly-0.004
Token hugs
Token ID40657
Feature activation-2.68e-02

Pos logit contributions
 hospital+0.003
 inconvenience+0.003
 fault+0.003
 Enjoy+0.003
 tells+0.003
Neg logit contributions
quet-0.010
ized-0.009
bing-0.009
ingly-0.009
izing-0.009
Token them
Token ID606
Feature activation+1.07e-02

Pos logit contributions
 hospital+0.018
 tells+0.017
 decides+0.015
 inconvenience+0.014
 dentist+0.014
Neg logit contributions
quet-0.073
ized-0.069
bing-0.068
worm-0.067
izing-0.067
Token and
Token ID290
Feature activation+6.37e-03

Pos logit contributions
quet+0.017
bing+0.016
ized+0.016
izing+0.016
ingly+0.016
Neg logit contributions
 hospital-0.016
 tells-0.015
 fault-0.015
 done-0.015
 Enjoy-0.015
Token food
Token ID2057
Feature activation-9.23e-03

Pos logit contributions
quet+0.012
ized+0.011
bing+0.011
ingly+0.011
izing+0.011
Neg logit contributions
 hospital-0.009
 inconvenience-0.009
 tells-0.008
 fault-0.008
 Enjoy-0.008
Token.
Token ID13
Feature activation+9.98e-03

Pos logit contributions
 hospital+0.016
 inconvenience+0.016
 tells+0.015
 Enjoy+0.015
 fault+0.015
Neg logit contributions
quet-0.014
ized-0.013
bing-0.013
ingly-0.013
izing-0.013
Token They
Token ID1119
Feature activation+7.29e-03

Pos logit contributions
bing+0.014
quet+0.014
ingly+0.014
als+0.014
ized+0.014
Neg logit contributions
 inconvenience-0.018
 hospital-0.017
 dentist-0.017
have-0.017
 fault-0.017
Token put
Token ID1234
Feature activation-2.55e-03

Pos logit contributions
quet+0.012
ized+0.012
bing+0.011
ingly+0.011
izing+0.011
Neg logit contributions
 inconvenience-0.011
 hospital-0.011
 fault-0.011
 tells-0.011
 Enjoy-0.011
Token everything
Token ID2279
Feature activation+5.65e-02

Pos logit contributions
 hospital+0.003
 tells+0.003
 inconvenience+0.003
 Enjoy+0.003
 fault+0.003
Neg logit contributions
quet-0.006
bing-0.005
ized-0.005
ingly-0.005
izing-0.005
Token in
Token ID287
Feature activation+2.04e-02

Pos logit contributions
quet+0.114
izing+0.107
bing+0.103
ingly+0.102
hog+0.097
Neg logit contributions
 done-0.077
 hospital-0.067
 wrong-0.066
 grounded-0.065
 ready-0.064
Token the
Token ID262
Feature activation-1.98e-03

Pos logit contributions
quet+0.036
bing+0.035
ized+0.035
ingly+0.035
izing+0.034
Neg logit contributions
 hospital-0.029
 inconvenience-0.029
 tells-0.028
have-0.028
 decides-0.028
Token crib
Token ID48083
Feature activation-1.37e-02

Pos logit contributions
 inconvenience+0.003
 fault+0.003
 tells+0.003
have+0.003
 Enjoy+0.003
Neg logit contributions
quet-0.003
ingly-0.003
bing-0.003
ized-0.003
izing-0.003
Token,
Token ID11
Feature activation-7.94e-03

Pos logit contributions
 hospital+0.024
 tells+0.023
 inconvenience+0.022
 fault+0.022
 Enjoy+0.022
Neg logit contributions
quet-0.021
bing-0.020
ized-0.020
ingly-0.019
izing-0.019
Token so
Token ID523
Feature activation-3.36e-02

Pos logit contributions
 hospital+0.014
 inconvenience+0.014
 tells+0.014
 fault+0.013
 Enjoy+0.013
Neg logit contributions
quet-0.012
ized-0.011
bing-0.011
ingly-0.011
izing-0.011
Token a
Token ID257
Feature activation-1.99e-02

Pos logit contributions
quet+0.012
ized+0.012
bing+0.012
ingly+0.012
izing+0.011
Neg logit contributions
 hospital-0.010
 inconvenience-0.009
 tells-0.009
have-0.009
 fault-0.009
Token tree
Token ID5509
Feature activation+1.99e-03

Pos logit contributions
 fault+0.037
 tells+0.036
 inconvenience+0.036
have+0.035
ws+0.034
Neg logit contributions
quet-0.030
ized-0.027
worm-0.026
izing-0.025
bing-0.025
Token.
Token ID13
Feature activation+1.18e-02

Pos logit contributions
quet+0.003
ized+0.003
bing+0.003
ingly+0.003
izing+0.003
Neg logit contributions
 hospital-0.003
 tells-0.003
 inconvenience-0.003
 fault-0.003
 Enjoy-0.003
Token They
Token ID1119
Feature activation+8.36e-03

Pos logit contributions
quet+0.017
als+0.017
edly+0.017
bing+0.016
hog+0.016
Neg logit contributions
have-0.020
 hospital-0.020
 inconvenience-0.020
 fault-0.019
 done-0.019
Token looked
Token ID3114
Feature activation-4.49e-03

Pos logit contributions
quet+0.016
bing+0.015
ingly+0.015
izing+0.015
ized+0.014
Neg logit contributions
 hospital-0.012
 tells-0.012
 Enjoy-0.011
 decides-0.011
 inconvenience-0.011
Token up
Token ID510
Feature activation+5.53e-02

Pos logit contributions
 hospital+0.006
 tells+0.006
 Enjoy+0.006
 fault+0.006
 decides+0.006
Neg logit contributions
quet-0.009
bing-0.008
ingly-0.008
ized-0.008
izing-0.008
Token and
Token ID290
Feature activation+5.26e-04

Pos logit contributions
quet+0.141
hetically+0.121
hog+0.120
stones+0.119
 threads+0.117
Neg logit contributions
 done-0.051
 mad-0.046
 tells-0.044
 decides-0.043
 says-0.041
Token saw
Token ID2497
Feature activation-1.07e-02

Pos logit contributions
quet+0.001
ingly+0.001
bing+0.001
izing+0.001
ized+0.001
Neg logit contributions
 tells-0.001
 hospital-0.001
 decides-0.000
 Enjoy-0.000
 inconvenience-0.000
Token the
Token ID262
Feature activation-9.52e-04

Pos logit contributions
 inconvenience+0.016
 hospital+0.016
have+0.016
 Enjoy+0.015
 fault+0.015
Neg logit contributions
ized-0.018
quet-0.018
bing-0.017
ingly-0.017
izing-0.017
Token bird
Token ID6512
Feature activation+7.84e-05

Pos logit contributions
 Enjoy+0.002
 tells+0.002
 hospital+0.002
 inconvenience+0.002
have+0.002
Neg logit contributions
quet-0.001
ized-0.001
bing-0.001
ingly-0.001
izing-0.001
Token sitting
Token ID5586
Feature activation-2.44e-02

Pos logit contributions
quet+0.000
ized+0.000
bing+0.000
ingly+0.000
izing+0.000
Neg logit contributions
 hospital-0.000
 tells-0.000
 inconvenience-0.000
 fault-0.000
 decides-0.000
Token
Token ID198
Feature activation+5.74e-04

Pos logit contributions
 hospital+0.006
 inconvenience+0.006
 fault+0.006
have+0.006
 tells+0.006
Neg logit contributions
quet-0.009
bing-0.009
ingly-0.009
ized-0.009
hog-0.009
Token
Token ID198
Feature activation+8.53e-03

Pos logit contributions
quet+0.001
bing+0.001
ingly+0.001
ized+0.001
hog+0.001
Neg logit contributions
 inconvenience-0.001
 hospital-0.001
 fault-0.001
 decides-0.001
 supposed-0.001
TokenThey
Token ID2990
Feature activation+1.24e-02

Pos logit contributions
quet+0.013
ized+0.012
bing+0.012
ingly+0.012
izing+0.012
Neg logit contributions
 hospital-0.015
 tells-0.014
 inconvenience-0.014
 fault-0.014
 Enjoy-0.014
Token put
Token ID1234
Feature activation+3.77e-03

Pos logit contributions
quet+0.020
als+0.020
ized+0.020
hetically+0.019
ingly+0.019
Neg logit contributions
 fault-0.019
 inconvenience-0.018
 supposed-0.018
have-0.017
 hospital-0.017
Token the
Token ID262
Feature activation+4.31e-02

Pos logit contributions
quet+0.007
ized+0.007
bing+0.007
ingly+0.007
izing+0.007
Neg logit contributions
 hospital-0.005
 tells-0.005
 inconvenience-0.005
 fault-0.005
 Enjoy-0.005
Token book
Token ID1492
Feature activation+5.67e-02

Pos logit contributions
quet+0.070
hog+0.066
worm+0.066
ized+0.063
bing+0.062
Neg logit contributions
 Enjoy-0.067
ws-0.064
 decides-0.064
have-0.063
 tells-0.060
Token away
Token ID1497
Feature activation+2.66e-02

Pos logit contributions
izing+0.099
quet+0.099
bing+0.097
ized+0.094
ingly+0.093
Neg logit contributions
 hospital-0.085
 tells-0.079
 microwave-0.075
 dentist-0.075
 fault-0.075
Token and
Token ID290
Feature activation+4.26e-03

Pos logit contributions
quet+0.049
ized+0.044
bing+0.044
izing+0.043
stones+0.043
Neg logit contributions
 hospital-0.041
 tells-0.039
 fault-0.036
 microwave-0.036
 dentist-0.035
Token go
Token ID467
Feature activation-1.38e-02

Pos logit contributions
quet+0.007
ized+0.007
ingly+0.007
bing+0.007
edly+0.007
Neg logit contributions
 fault-0.006
 inconvenience-0.006
 supposed-0.006
 Enjoy-0.006
have-0.006
Token to
Token ID284
Feature activation-8.87e-03

Pos logit contributions
 fault+0.026
 inconvenience+0.025
 nonsense+0.024
have+0.024
 decides+0.024
Neg logit contributions
quet-0.020
ingly-0.018
bing-0.018
ized-0.018
hog-0.017
Token play
Token ID711
Feature activation-1.46e-02

Pos logit contributions
have+0.018
arre+0.017
 decides+0.016
 microwave+0.015
 grounded+0.015
Neg logit contributions
bing-0.010
quet-0.009
bed-0.009
 fingertips-0.008
 spe-0.008
Token breakfast
Token ID12607
Feature activation+1.70e-02

Pos logit contributions
 fault+0.011
 decides+0.011
 inconvenience+0.011
 tells+0.011
 hospital+0.011
Neg logit contributions
quet-0.023
bing-0.023
ized-0.022
izing-0.022
ingly-0.022
Token.
Token ID13
Feature activation+1.99e-03

Pos logit contributions
quet+0.033
hog+0.031
ized+0.031
worm+0.031
ingly+0.030
Neg logit contributions
 hospital-0.022
 done-0.020
 tells-0.020
 microwave-0.020
 dentist-0.020
Token He
Token ID679
Feature activation+2.19e-02

Pos logit contributions
quet+0.003
bing+0.003
hog+0.003
ized+0.003
izing+0.003
Neg logit contributions
 inconvenience-0.003
 hospital-0.003
 grounded-0.003
have-0.003
 done-0.003
Token would
Token ID561
Feature activation+7.08e-03

Pos logit contributions
ized+0.034
quet+0.033
ingly+0.033
bing+0.033
worm+0.032
Neg logit contributions
 inconvenience-0.036
 fault-0.034
 hospital-0.032
 Enjoy-0.031
have-0.031
Token pour
Token ID12797
Feature activation+9.09e-03

Pos logit contributions
quet+0.012
bing+0.012
ingly+0.012
ized+0.011
hetically+0.011
Neg logit contributions
 fault-0.010
 inconvenience-0.010
 hospital-0.009
have-0.009
 tells-0.009
Token the
Token ID262
Feature activation+6.24e-02

Pos logit contributions
quet+0.019
ized+0.018
bing+0.018
ingly+0.017
izing+0.017
Neg logit contributions
 hospital-0.011
 tells-0.011
 Enjoy-0.010
 fault-0.010
 inconvenience-0.010
Token milk
Token ID7545
Feature activation+2.60e-02

Pos logit contributions
quet+0.133
ized+0.126
bing+0.126
ingly+0.124
izing+0.123
Neg logit contributions
 hospital-0.072
 tells-0.069
 fault-0.066
 inconvenience-0.066
 Enjoy-0.065
Token and
Token ID290
Feature activation+1.43e-02

Pos logit contributions
quet+0.048
ized+0.046
ingly+0.044
hog+0.043
izing+0.043
Neg logit contributions
 microwave-0.036
 hospital-0.035
 done-0.034
 tells-0.033
 Enjoy-0.032
Token cereal
Token ID33158
Feature activation+1.21e-02

Pos logit contributions
quet+0.028
bing+0.026
ized+0.026
ingly+0.026
izing+0.025
Neg logit contributions
 hospital-0.019
 inconvenience-0.019
 fault-0.018
 tells-0.018
 Enjoy-0.017
Token into
Token ID656
Feature activation-1.75e-03

Pos logit contributions
quet+0.022
ized+0.020
ingly+0.020
bing+0.020
izing+0.020
Neg logit contributions
 hospital-0.017
 microwave-0.017
 done-0.017
 tells-0.017
 fault-0.016
Token a
Token ID257
Feature activation-5.64e-03

Pos logit contributions
 inconvenience+0.002
 fault+0.002
 hospital+0.002
 tells+0.002
 decides+0.002
Neg logit contributions
quet-0.003
bing-0.003
ized-0.003
izing-0.003
ingly-0.003
Token play
Token ID711
Feature activation-1.33e-02

Pos logit contributions
 fault+0.027
 inconvenience+0.024
 nonsense+0.022
 microwave+0.022
 supposed+0.022
Neg logit contributions
quet-0.029
ized-0.028
bing-0.026
hog-0.026
als-0.025
Token.
Token ID13
Feature activation+9.67e-03

Pos logit contributions
 hospital+0.020
 tells+0.019
have+0.018
 Enjoy+0.018
 inconvenience+0.018
Neg logit contributions
quet-0.024
ingly-0.023
bing-0.023
izing-0.023
ized-0.022
Token While
Token ID2893
Feature activation-4.54e-03

Pos logit contributions
quet+0.019
ized+0.018
bing+0.018
ingly+0.018
izing+0.017
Neg logit contributions
 hospital-0.012
 inconvenience-0.012
 tells-0.012
 fault-0.012
 Enjoy-0.011
Token they
Token ID484
Feature activation+8.45e-03

Pos logit contributions
 hospital+0.007
 fault+0.007
 inconvenience+0.007
 tells+0.007
 Enjoy+0.007
Neg logit contributions
quet-0.007
ized-0.007
bing-0.007
ingly-0.007
izing-0.007
Token were
Token ID547
Feature activation+6.99e-03

Pos logit contributions
quet+0.016
bing+0.015
ingly+0.015
izing+0.015
ized+0.015
Neg logit contributions
 hospital-0.012
 tells-0.011
 Enjoy-0.011
 inconvenience-0.011
 decides-0.011
Token there
Token ID612
Feature activation+6.12e-02

Pos logit contributions
quet+0.015
bing+0.015
ized+0.015
ingly+0.014
izing+0.014
Neg logit contributions
 hospital-0.008
 tells-0.007
 inconvenience-0.007
 fault-0.007
 Enjoy-0.007
Token,
Token ID11
Feature activation+7.94e-03

Pos logit contributions
quet+0.117
edly+0.107
ingly+0.104
ized+0.101
stones+0.100
Neg logit contributions
 hospital-0.093
 tells-0.087
 Time-0.084
 inconvenience-0.084
 done-0.083
Token Lily
Token ID20037
Feature activation-3.52e-03

Pos logit contributions
quet+0.017
bing+0.016
ingly+0.016
ized+0.016
izing+0.016
Neg logit contributions
 hospital-0.009
 tells-0.009
 inconvenience-0.009
 Enjoy-0.008
 fault-0.008
Token accidentally
Token ID14716
Feature activation-5.67e-02

Pos logit contributions
 inconvenience+0.005
 fault+0.004
 Enjoy+0.004
 hospital+0.004
 tells+0.004
Neg logit contributions
ized-0.007
bing-0.007
quet-0.007
ingly-0.006
izing-0.006
Token dropped
Token ID5710
Feature activation-1.66e-02

Pos logit contributions
 inconvenience+0.076
 Enjoy+0.074
 hospital+0.072
 fault+0.072
 tells+0.070
Neg logit contributions
quet-0.104
bing-0.103
bed-0.101
ized-0.099
ingly-0.098
Token her
Token ID607
Feature activation+2.70e-02

Pos logit contributions
 hospital+0.024
 tells+0.023
 inconvenience+0.023
 fault+0.022
 Enjoy+0.022
Neg logit contributions
quet-0.031
ized-0.029
bing-0.029
ingly-0.029
izing-0.028
Token to
Token ID284
Feature activation-6.98e-03

Pos logit contributions
 hospital+0.025
 tells+0.024
 inconvenience+0.023
 dentist+0.023
 Enjoy+0.023
Neg logit contributions
quet-0.027
bing-0.025
ized-0.025
izing-0.025
ingly-0.025
Token play
Token ID711
Feature activation-8.15e-03

Pos logit contributions
 fault+0.010
 inconvenience+0.009
 microwave+0.009
 hospital+0.009
 supposed+0.009
Neg logit contributions
quet-0.013
ized-0.012
bing-0.011
ised-0.011
izing-0.011
Token with
Token ID351
Feature activation+1.57e-02

Pos logit contributions
 hospital+0.016
 tells+0.015
 inconvenience+0.015
 Enjoy+0.015
have+0.015
Neg logit contributions
quet-0.011
ized-0.010
bing-0.010
ingly-0.010
izing-0.010
Token his
Token ID465
Feature activation+1.61e-02

Pos logit contributions
quet+0.036
ized+0.035
bing+0.035
ingly+0.034
izing+0.034
Neg logit contributions
 inconvenience-0.015
 hospital-0.015
 fault-0.015
 tells-0.014
 Enjoy-0.014
Token friends
Token ID2460
Feature activation-1.46e-02

Pos logit contributions
quet+0.028
ized+0.026
bing+0.026
ingly+0.025
izing+0.025
Neg logit contributions
 inconvenience-0.024
 Enjoy-0.024
 hospital-0.024
 tells-0.023
ws-0.023
Token.
Token ID13
Feature activation+1.62e-03

Pos logit contributions
 hospital+0.026
 inconvenience+0.026
 fault+0.025
 tells+0.025
 Enjoy+0.024
Neg logit contributions
quet-0.022
bing-0.021
ized-0.020
ingly-0.020
izing-0.019
Token But
Token ID887
Feature activation+2.77e-02

Pos logit contributions
bing+0.002
quet+0.002
ized+0.002
hog+0.002
ised+0.002
Neg logit contributions
 inconvenience-0.003
 hospital-0.003
 grounded-0.003
 fault-0.003
have-0.002
Token when
Token ID618
Feature activation+3.15e-02

Pos logit contributions
bing+0.042
ized+0.041
bed+0.040
ised+0.040
quet+0.039
Neg logit contributions
 fault-0.047
 inconvenience-0.046
 Enjoy-0.044
 tells-0.044
 hospital-0.044
Token he
Token ID339
Feature activation-1.22e-02

Pos logit contributions
quet+0.045
bing+0.044
ized+0.044
izing+0.042
ingly+0.042
Neg logit contributions
 fault-0.055
 hospital-0.055
 tells-0.054
 inconvenience-0.054
have-0.052
Token got
Token ID1392
Feature activation+1.38e-02

Pos logit contributions
 hospital+0.015
 tells+0.014
 inconvenience+0.014
 fault+0.014
 Enjoy+0.014
Neg logit contributions
quet-0.025
ized-0.024
bing-0.024
ingly-0.023
izing-0.023
Token there
Token ID612
Feature activation+5.20e-02

Pos logit contributions
quet+0.027
bing+0.025
ized+0.025
izing+0.025
ingly+0.025
Neg logit contributions
 hospital-0.019
 tells-0.018
 inconvenience-0.018
 fault-0.018
 Enjoy-0.018
Token coloring
Token ID33988
Feature activation-1.54e-02

Pos logit contributions
 hospital+0.012
 tells+0.011
 inconvenience+0.011
 fault+0.011
 Enjoy+0.011
Neg logit contributions
quet-0.020
ized-0.019
bing-0.019
ingly-0.019
izing-0.019
Token book
Token ID1492
Feature activation+2.84e-02

Pos logit contributions
 Enjoy+0.024
 inconvenience+0.024
 decides+0.024
 fault+0.024
have+0.024
Neg logit contributions
quet-0.026
ized-0.023
ingly-0.023
bing-0.023
izing-0.023
Token.
Token ID13
Feature activation+5.99e-03

Pos logit contributions
quet+0.049
bing+0.047
ized+0.046
ingly+0.045
izing+0.045
Neg logit contributions
 hospital-0.044
 tells-0.042
 inconvenience-0.041
 fault-0.041
 dentist-0.040
Token She
Token ID1375
Feature activation+7.86e-03

Pos logit contributions
quet+0.006
bing+0.005
ingly+0.005
ized+0.005
izing+0.005
Neg logit contributions
 inconvenience-0.013
 fault-0.013
 hospital-0.013
have-0.013
ws-0.013
Token took
Token ID1718
Feature activation+1.25e-02

Pos logit contributions
quet+0.015
ized+0.015
bing+0.014
ingly+0.014
izing+0.014
Neg logit contributions
 hospital-0.010
 inconvenience-0.010
 fault-0.010
 tells-0.010
 Enjoy-0.010
Token out
Token ID503
Feature activation+2.15e-02

Pos logit contributions
quet+0.023
ized+0.022
bing+0.022
ingly+0.022
izing+0.021
Neg logit contributions
 hospital-0.017
 inconvenience-0.017
 fault-0.017
 tells-0.016
 Enjoy-0.016
Token her
Token ID607
Feature activation+3.10e-02

Pos logit contributions
quet+0.042
bing+0.040
ized+0.040
izing+0.039
ingly+0.039
Neg logit contributions
 hospital-0.028
 tells-0.027
 inconvenience-0.026
 fault-0.026
 Enjoy-0.026
Token cr
Token ID1067
Feature activation+4.76e-02

Pos logit contributions
quet+0.052
ized+0.049
bing+0.049
ingly+0.049
izing+0.047
Neg logit contributions
 hospital-0.049
 inconvenience-0.048
 tells-0.047
 Enjoy-0.047
 fault-0.046
Tokenay
Token ID323
Feature activation+3.59e-02

Pos logit contributions
bing+0.069
ized+0.069
izing+0.067
quet+0.067
worm+0.066
Neg logit contributions
 fault-0.079
 dentist-0.077
 hospital-0.076
 microwave-0.076
 tells-0.075
Tokenons
Token ID684
Feature activation+2.95e-02

Pos logit contributions
quet+0.053
ized+0.050
bing+0.050
ingly+0.049
izing+0.049
Neg logit contributions
 hospital-0.063
 tells-0.062
 inconvenience-0.061
 fault-0.060
 Enjoy-0.060
Token and
Token ID290
Feature activation-3.08e-03

Pos logit contributions
quet+0.057
izing+0.057
bing+0.054
ized+0.052
ised+0.052
Neg logit contributions
 hospital-0.038
 fault-0.036
 tells-0.035
 done-0.035
 dentist-0.033
Token They
Token ID1119
Feature activation+1.39e-02

Pos logit contributions
 microwave+0.016
 dentist+0.015
 hamb+0.014
 aquarium+0.014
 waiter+0.014
Neg logit contributions
izing-0.009
bing-0.009
ized-0.009
ingly-0.008
bed-0.008
Token smile
Token ID8212
Feature activation+2.15e-02

Pos logit contributions
als+0.022
ized+0.021
hog+0.020
worm+0.019
ingly+0.019
Neg logit contributions
 fault-0.022
 inconvenience-0.022
 dentist-0.021
 microwave-0.021
 Enjoy-0.020
Token and
Token ID290
Feature activation+1.58e-02

Pos logit contributions
ingly+0.021
bing+0.021
edly+0.021
quet+0.020
ized+0.020
Neg logit contributions
 hospital-0.050
 inconvenience-0.050
 microwave-0.047
 dentist-0.047
 Enjoy-0.046
Token nod
Token ID18666
Feature activation+3.04e-02

Pos logit contributions
quet+0.026
ized+0.026
ingly+0.025
bing+0.025
edly+0.024
Neg logit contributions
 hospital-0.024
 inconvenience-0.024
 fault-0.023
 dentist-0.022
 microwave-0.022
Token.
Token ID13
Feature activation+7.39e-03

Pos logit contributions
ingly+0.041
quet+0.039
bing+0.039
ized+0.037
edly+0.037
Neg logit contributions
 hospital-0.059
 inconvenience-0.058
 microwave-0.056
 dentist-0.056
 fault-0.055
Token
Token ID198
Feature activation+1.90e-03

Pos logit contributions
ingly+0.009
quet+0.009
bing+0.009
edly+0.009
hog+0.009
Neg logit contributions
 hospital-0.014
 inconvenience-0.014
 mess-0.013
 done-0.013
have-0.013
Token
Token ID198
Feature activation+6.50e-03

Pos logit contributions
quet+0.003
ised+0.002
bing+0.002
ingly+0.002
ized+0.002
Neg logit contributions
 fault-0.004
 hospital-0.004
 inconvenience-0.004
 reservation-0.003
 dentist-0.003
TokenBut
Token ID1537
Feature activation+2.33e-02

Pos logit contributions
quet+0.008
ized+0.007
bing+0.007
ingly+0.007
ised+0.007
Neg logit contributions
 fault-0.015
 hospital-0.014
 inconvenience-0.014
 tells-0.014
 dentist-0.013
Token then
Token ID788
Feature activation-1.80e-02

Pos logit contributions
bing+0.026
ized+0.024
quet+0.024
bed+0.024
ingly+0.023
Neg logit contributions
 fault-0.048
 inconvenience-0.047
ws-0.047
 hospital-0.047
have-0.047
Token,
Token ID11
Feature activation+9.28e-03

Pos logit contributions
 fault+0.036
 hospital+0.035
 inconvenience+0.035
 microwave+0.034
 supposed+0.034
Neg logit contributions
quet-0.021
ized-0.020
ingly-0.019
bing-0.019
izing-0.019
Token Tom
Token ID4186
Feature activation+1.86e-02

Pos logit contributions
quet+0.013
ized+0.013
bing+0.012
als+0.012
ingly+0.012
Neg logit contributions
 fault-0.015
ws-0.015
 inconvenience-0.015
 supposed-0.015
have-0.015
Token and
Token ID290
Feature activation-7.98e-05

Pos logit contributions
 hospital+0.001
 tells+0.001
 inconvenience+0.001
 fault+0.001
 Enjoy+0.001
Neg logit contributions
quet-0.001
ized-0.001
bing-0.001
ingly-0.001
izing-0.001
Token cookies
Token ID14746
Feature activation-2.10e-03

Pos logit contributions
 inconvenience+0.000
 fault+0.000
 Enjoy+0.000
 hospital+0.000
have+0.000
Neg logit contributions
quet-0.000
edly-0.000
ized-0.000
als-0.000
bing-0.000
Token.
Token ID13
Feature activation+1.17e-02

Pos logit contributions
 hospital+0.004
 inconvenience+0.003
 tells+0.003
 Enjoy+0.003
 fault+0.003
Neg logit contributions
quet-0.003
ized-0.003
bing-0.003
ingly-0.003
izing-0.003
Token They
Token ID1119
Feature activation+1.24e-02

Pos logit contributions
quet+0.019
bing+0.018
ingly+0.018
ized+0.018
edly+0.018
Neg logit contributions
 hospital-0.018
 inconvenience-0.018
have-0.017
 tells-0.017
 done-0.016
Token were
Token ID547
Feature activation-1.19e-02

Pos logit contributions
quet+0.023
ized+0.022
bing+0.022
ingly+0.022
izing+0.021
Neg logit contributions
 hospital-0.017
 inconvenience-0.017
 tells-0.016
 fault-0.016
 Enjoy-0.016
Token happy
Token ID3772
Feature activation+4.72e-03

Pos logit contributions
 inconvenience+0.018
 hospital+0.016
 Enjoy+0.016
 fault+0.016
 Time+0.016
Neg logit contributions
bing-0.021
quet-0.021
ized-0.020
ingly-0.020
izing-0.019
Token and
Token ID290
Feature activation+6.44e-05

Pos logit contributions
quet+0.007
ized+0.007
bing+0.007
ingly+0.007
izing+0.006
Neg logit contributions
 hospital-0.008
 inconvenience-0.008
 tells-0.008
 fault-0.008
 Enjoy-0.008
Token proud
Token ID6613
Feature activation-1.19e-02

Pos logit contributions
quet+0.000
bing+0.000
ingly+0.000
ized+0.000
izing+0.000
Neg logit contributions
 hospital-0.000
 tells-0.000
 Enjoy-0.000
 inconvenience-0.000
 decides-0.000
Token of
Token ID286
Feature activation+1.08e-02

Pos logit contributions
 hospital+0.023
 inconvenience+0.023
 fault+0.023
 Enjoy+0.022
have+0.022
Neg logit contributions
quet-0.016
ingly-0.015
ized-0.014
bing-0.014
izing-0.014
Token their
Token ID511
Feature activation+1.14e-03

Pos logit contributions
quet+0.019
bing+0.019
ized+0.018
ingly+0.018
izing+0.018
Neg logit contributions
 inconvenience-0.016
 fault-0.016
 hospital-0.015
 microwave-0.015
 dentist-0.015
Token work
Token ID670
Feature activation+2.54e-03

Pos logit contributions
quet+0.002
ized+0.002
bing+0.002
ingly+0.002
izing+0.002
Neg logit contributions
 Enjoy-0.001
 hospital-0.001
 inconvenience-0.001
have-0.001
 tells-0.001
Token fire
Token ID2046
Feature activation+1.46e-02

Pos logit contributions
quet+0.004
ized+0.004
bing+0.004
ingly+0.004
izing+0.004
Neg logit contributions
 hospital-0.003
 tells-0.003
 inconvenience-0.003
 fault-0.003
 Enjoy-0.003
Tokenwork
Token ID1818
Feature activation-1.35e-02

Pos logit contributions
quet+0.023
bing+0.021
ized+0.021
ingly+0.021
izing+0.020
Neg logit contributions
 hospital-0.025
 inconvenience-0.024
 tells-0.024
 fault-0.024
 Enjoy-0.024
Token that
Token ID326
Feature activation-1.60e-02

Pos logit contributions
 fault+0.025
 inconvenience+0.024
ws+0.024
 decides+0.024
 hospital+0.024
Neg logit contributions
quet-0.019
ized-0.018
bing-0.018
izing-0.017
ingly-0.017
Token had
Token ID550
Feature activation+1.74e-02

Pos logit contributions
 fault+0.031
 inconvenience+0.030
 hospital+0.028
ws+0.028
 dentist+0.027
Neg logit contributions
bing-0.022
ized-0.021
worm-0.020
izing-0.020
 spe-0.020
Token lost
Token ID2626
Feature activation-3.81e-02

Pos logit contributions
ized+0.028
quet+0.027
bing+0.026
als+0.026
ised+0.026
Neg logit contributions
 hospital-0.027
 Enjoy-0.027
 inconvenience-0.027
 tells-0.026
have-0.026
Token its
Token ID663
Feature activation+1.10e-02

Pos logit contributions
 hospital+0.061
 tells+0.060
 inconvenience+0.060
 fault+0.059
 Enjoy+0.058
Neg logit contributions
quet-0.062
ized-0.059
bing-0.059
ingly-0.059
izing-0.058
Token spark
Token ID9009
Feature activation+2.46e-03

Pos logit contributions
quet+0.021
ized+0.020
bing+0.020
ingly+0.019
izing+0.019
Neg logit contributions
 hospital-0.015
 tells-0.015
 inconvenience-0.014
 Enjoy-0.014
have-0.014
Tokenle
Token ID293
Feature activation+1.94e-02

Pos logit contributions
quet+0.003
bing+0.003
ized+0.003
ingly+0.003
izing+0.003
Neg logit contributions
 hospital-0.005
 fault-0.005
 tells-0.005
 inconvenience-0.005
have-0.005
Token.
Token ID13
Feature activation+7.31e-03

Pos logit contributions
quet+0.029
ized+0.026
bing+0.026
ingly+0.026
izing+0.026
Neg logit contributions
 hospital-0.035
 tells-0.033
 inconvenience-0.033
 fault-0.033
 Enjoy-0.032
Token It
Token ID632
Feature activation+1.73e-02

Pos logit contributions
quet+0.012
bing+0.011
izing+0.011
ized+0.011
ingly+0.011
Neg logit contributions
 fault-0.012
 hospital-0.012
 inconvenience-0.012
have-0.011
 tells-0.011
Token was
Token ID373
Feature activation+1.14e-02

Pos logit contributions
ized+0.026
bing+0.026
quet+0.026
hog+0.025
ingly+0.025
Neg logit contributions
 inconvenience-0.029
 fault-0.028
 Enjoy-0.028
 hospital-0.028
have-0.027
Token birthday
Token ID10955
Feature activation+3.82e-02

Pos logit contributions
quet+0.052
ized+0.049
ingly+0.049
bing+0.049
izing+0.048
Neg logit contributions
 hospital-0.029
 tells-0.028
have-0.028
 decides-0.027
 inconvenience-0.027
Token ever
Token ID1683
Feature activation+2.24e-02

Pos logit contributions
bing+0.055
quet+0.054
ized+0.054
ised+0.052
izing+0.052
Neg logit contributions
 hospital-0.068
 tells-0.066
 inconvenience-0.066
 fault-0.065
 Enjoy-0.064
Token!
Token ID0
Feature activation+6.38e-03

Pos logit contributions
quet+0.031
ized+0.029
bing+0.029
ingly+0.028
izing+0.028
Neg logit contributions
 hospital-0.043
 tells-0.041
 inconvenience-0.041
 fault-0.040
 Enjoy-0.040
Token<|endoftext|>
Token ID50256
Feature activation-3.87e-02

Pos logit contributions
quet+0.009
ingly+0.009
ized+0.009
bing+0.009
ised+0.009
Neg logit contributions
 inconvenience-0.011
 hospital-0.010
ws-0.010
 cheating-0.010
 tells-0.010
TokenOnce
Token ID7454
Feature activation+1.60e-02

Pos logit contributions
 hospital+0.050
 inconvenience+0.049
 fault+0.049
 tells+0.046
have+0.044
Neg logit contributions
quet-0.076
bing-0.073
ingly-0.072
ized-0.071
hog-0.069
Token upon
Token ID2402
Feature activation-2.72e-02

Pos logit contributions
quet+0.031
ized+0.031
ingly+0.030
bing+0.030
ised+0.029
Neg logit contributions
 hospital-0.020
 inconvenience-0.020
 Enjoy-0.019
 tells-0.019
have-0.018
Token a
Token ID257
Feature activation-3.31e-02

Pos logit contributions
 inconvenience+0.043
 fault+0.042
 tells+0.041
 hospital+0.041
 decides+0.039
Neg logit contributions
quet-0.045
ingly-0.042
ized-0.042
bing-0.040
edly-0.040
Token time
Token ID640
Feature activation+2.47e-02

Pos logit contributions
 hospital+0.020
 tells+0.018
 inconvenience+0.018
 fault+0.017
 Enjoy+0.016
Neg logit contributions
quet-0.088
ized-0.085
bing-0.084
ingly-0.084
izing-0.083
Token there
Token ID612
Feature activation+2.03e-02

Pos logit contributions
quet+0.040
ized+0.037
ingly+0.036
bing+0.036
izing+0.036
Neg logit contributions
 hospital-0.041
 tells-0.041
 inconvenience-0.040
 fault-0.040
 Enjoy-0.039
Token was
Token ID373
Feature activation-1.74e-02

Pos logit contributions
quet+0.031
ized+0.031
bing+0.031
ingly+0.030
izing+0.029
Neg logit contributions
 inconvenience-0.034
 hospital-0.033
 fault-0.033
 tells-0.032
 Enjoy-0.032
Token a
Token ID257
Feature activation+3.39e-03

Pos logit contributions
 hospital+0.020
 inconvenience+0.020
 tells+0.019
 fault+0.019
 Enjoy+0.018
Neg logit contributions
quet-0.036
ized-0.035
bing-0.035
ingly-0.034
izing-0.034
Token She
Token ID1375
Feature activation+6.34e-03

Pos logit contributions
 hospital+0.012
 tells+0.011
 inconvenience+0.011
 fault+0.011
 Enjoy+0.011
Neg logit contributions
quet-0.014
ized-0.014
bing-0.014
ingly-0.013
izing-0.013
Token has
Token ID468
Feature activation-3.58e-03

Pos logit contributions
als+0.013
quet+0.013
bing+0.012
stones+0.012
ized+0.012
Neg logit contributions
 inconvenience-0.007
 hospital-0.007
 fault-0.006
 dentist-0.006
 Enjoy-0.006
Token a
Token ID257
Feature activation+2.74e-03

Pos logit contributions
 inconvenience+0.006
 Enjoy+0.006
 hospital+0.005
arre+0.005
 microwave+0.005
Neg logit contributions
quet-0.005
 spe-0.005
 branches-0.005
als-0.005
stones-0.005
Token dress
Token ID6576
Feature activation+2.17e-02

Pos logit contributions
quet+0.004
 spe+0.004
worm+0.003
hog+0.003
ingly+0.003
Neg logit contributions
have-0.005
ws-0.005
 inconvenience-0.005
 decides-0.005
 tells-0.005
Token and
Token ID290
Feature activation+5.54e-03

Pos logit contributions
quet+0.035
ized+0.033
bing+0.033
ingly+0.033
izing+0.032
Neg logit contributions
 hospital-0.035
 inconvenience-0.034
 tells-0.034
 fault-0.034
 Enjoy-0.034
Token a
Token ID257
Feature activation-3.23e-03

Pos logit contributions
quet+0.010
ized+0.010
ingly+0.010
als+0.010
bing+0.009
Neg logit contributions
 inconvenience-0.007
 hospital-0.007
 fault-0.007
 Enjoy-0.007
have-0.007
Token hat
Token ID6877
Feature activation+2.34e-02

Pos logit contributions
have+0.006
 inconvenience+0.006
ws+0.006
 decides+0.006
 supposed+0.006
Neg logit contributions
quet-0.004
 spe-0.004
worm-0.004
hog-0.004
ingly-0.004
Token.
Token ID13
Feature activation-1.63e-03

Pos logit contributions
quet+0.038
ingly+0.035
bing+0.034
als+0.034
ized+0.034
Neg logit contributions
 hospital-0.038
 inconvenience-0.038
 Enjoy-0.037
 fault-0.037
have-0.037
Token She
Token ID1375
Feature activation+7.68e-03

Pos logit contributions
 hospital+0.003
 inconvenience+0.002
 fault+0.002
 tells+0.002
have+0.002
Neg logit contributions
quet-0.003
bing-0.003
ingly-0.003
ized-0.003
izing-0.003
Token is
Token ID318
Feature activation-2.89e-03

Pos logit contributions
als+0.015
quet+0.015
ized+0.014
bing+0.014
stones+0.014
Neg logit contributions
 inconvenience-0.009
 hospital-0.008
 fault-0.008
have-0.008
 Enjoy-0.008
Token attractive
Token ID10966
Feature activation+6.23e-03

Pos logit contributions
 inconvenience+0.005
arre+0.005
 hospital+0.005
 Enjoy+0.005
 fault+0.005
Neg logit contributions
bing-0.004
quet-0.004
 beneath-0.004
 spe-0.004
ized-0.004
Token pole
Token ID16825
Feature activation+3.09e-02

Pos logit contributions
quet+0.048
bing+0.045
ized+0.045
izing+0.044
worm+0.044
Neg logit contributions
 hospital-0.048
 tells-0.047
 Enjoy-0.047
 inconvenience-0.046
 dentist-0.046
Token out
Token ID503
Feature activation+1.42e-02

Pos logit contributions
quet+0.052
ized+0.049
bing+0.048
ingly+0.047
izing+0.047
Neg logit contributions
 hospital-0.050
 tells-0.047
 fault-0.047
 inconvenience-0.046
 Enjoy-0.045
Token of
Token ID286
Feature activation-6.65e-03

Pos logit contributions
quet+0.022
ized+0.020
bing+0.020
ingly+0.020
izing+0.020
Neg logit contributions
 hospital-0.024
 tells-0.024
 fault-0.023
 inconvenience-0.022
 decides-0.022
Token the
Token ID262
Feature activation+9.99e-03

Pos logit contributions
 hospital+0.012
have+0.012
 decides+0.011
 inconvenience+0.011
 fault+0.011
Neg logit contributions
quet-0.009
ized-0.009
bing-0.009
izing-0.009
ised-0.009
Token ground
Token ID2323
Feature activation-2.00e-02

Pos logit contributions
quet+0.012
ized+0.012
worm+0.011
izing+0.011
bing+0.011
Neg logit contributions
 fault-0.019
 Enjoy-0.019
 tells-0.019
have-0.019
 inconvenience-0.019
Token.
Token ID13
Feature activation-3.62e-03

Pos logit contributions
 hospital+0.034
 tells+0.033
 fault+0.033
 inconvenience+0.033
 Enjoy+0.032
Neg logit contributions
quet-0.031
ized-0.029
bing-0.029
ingly-0.028
izing-0.028
Token
Token ID198
Feature activation+2.48e-03

Pos logit contributions
 inconvenience+0.005
 hospital+0.005
have+0.005
 supposed+0.005
 grounded+0.005
Neg logit contributions
quet-0.006
hog-0.006
bing-0.006
ized-0.006
ised-0.006
Token
Token ID198
Feature activation+1.21e-02

Pos logit contributions
quet+0.004
bing+0.004
ized+0.004
ingly+0.004
ised+0.004
Neg logit contributions
 hospital-0.004
 inconvenience-0.004
 fault-0.004
 tells-0.004
 decides-0.004
TokenOne
Token ID3198
Feature activation+1.59e-03

Pos logit contributions
quet+0.017
ized+0.015
bing+0.015
izing+0.015
ingly+0.014
Neg logit contributions
 fault-0.024
 tells-0.023
 hospital-0.023
 inconvenience-0.023
 dentist-0.022
Token by
Token ID416
Feature activation+1.79e-02

Pos logit contributions
quet+0.003
ingly+0.003
ized+0.003
worm+0.003
bing+0.003
Neg logit contributions
 fault-0.002
 hospital-0.002
 inconvenience-0.002
 tells-0.002
 Enjoy-0.002
Token one
Token ID530
Feature activation+1.51e-02

Pos logit contributions
quet+0.036
ized+0.034
bing+0.034
ingly+0.033
izing+0.033
Neg logit contributions
 hospital-0.023
 tells-0.022
 inconvenience-0.022
 fault-0.021
 Enjoy-0.020
Token
Token ID198
Feature activation-3.71e-03

Pos logit contributions
 tells+0.006
 hospital+0.006
 Enjoy+0.005
 Time+0.005
 says+0.005
Neg logit contributions
quet-0.019
ized-0.018
izing-0.018
bing-0.018
ingly-0.018
Token
Token ID198
Feature activation+3.95e-03

Pos logit contributions
 hospital+0.004
 tells+0.004
 inconvenience+0.004
 fault+0.004
 Enjoy+0.004
Neg logit contributions
quet-0.008
ized-0.008
bing-0.008
ingly-0.008
izing-0.008
TokenThey
Token ID2990
Feature activation+9.71e-03

Pos logit contributions
imm+0.007
ingly+0.007
ized+0.007
quet+0.007
edly+0.007
Neg logit contributions
 hospital-0.005
 tells-0.005
have-0.005
 dentist-0.005
 grounded-0.005
Token played
Token ID2826
Feature activation-3.30e-03

Pos logit contributions
quet+0.020
bing+0.019
ingly+0.019
izing+0.019
ized+0.019
Neg logit contributions
 hospital-0.012
 tells-0.012
 inconvenience-0.011
 Enjoy-0.011
 decides-0.011
Token with
Token ID351
Feature activation+1.42e-02

Pos logit contributions
 hospital+0.006
 inconvenience+0.006
 tells+0.006
 fault+0.006
 Enjoy+0.006
Neg logit contributions
quet-0.004
ized-0.004
bing-0.004
ingly-0.004
izing-0.004
Token the
Token ID262
Feature activation-1.20e-03

Pos logit contributions
quet+0.035
bing+0.032
ingly+0.032
ized+0.031
izing+0.031
Neg logit contributions
 hospital-0.015
 tells-0.013
 fault-0.013
 dentist-0.012
 microwave-0.012
Token ball
Token ID2613
Feature activation+1.88e-02

Pos logit contributions
 hospital+0.002
 inconvenience+0.002
 Enjoy+0.002
 tells+0.002
 fault+0.002
Neg logit contributions
quet-0.002
bing-0.002
ized-0.002
izing-0.002
ingly-0.002
Token of
Token ID286
Feature activation+1.31e-03

Pos logit contributions
quet+0.033
ized+0.031
bing+0.031
izing+0.030
ingly+0.030
Neg logit contributions
 hospital-0.030
 tells-0.028
 fault-0.027
 inconvenience-0.026
 Enjoy-0.026
Token yarn
Token ID21181
Feature activation+9.73e-03

Pos logit contributions
quet+0.002
ized+0.002
bing+0.002
izing+0.002
ingly+0.002
Neg logit contributions
 hospital-0.002
 inconvenience-0.002
 tells-0.002
 fault-0.002
 Enjoy-0.002
Token together
Token ID1978
Feature activation+1.19e-02

Pos logit contributions
quet+0.017
ized+0.016
ingly+0.016
bing+0.016
izing+0.016
Neg logit contributions
 hospital-0.015
 tells-0.014
 fault-0.014
 inconvenience-0.013
 decides-0.012
Token.
Token ID13
Feature activation+1.22e-02

Pos logit contributions
quet+0.024
ized+0.023
bing+0.022
izing+0.022
stones+0.022
Neg logit contributions
 hospital-0.015
 tells-0.015
 fault-0.014
 inconvenience-0.013
 decides-0.013
TokenIt
Token ID1026
Feature activation+5.35e-03

Pos logit contributions
quet+0.041
ized+0.040
bing+0.039
ingly+0.039
izing+0.038
Neg logit contributions
 Enjoy-0.027
 hospital-0.027
 tells-0.027
 decides-0.027
have-0.026
Token is
Token ID318
Feature activation+9.26e-03

Pos logit contributions
quet+0.012
bing+0.011
ized+0.011
izing+0.011
als+0.011
Neg logit contributions
 hospital-0.006
 inconvenience-0.006
 fault-0.005
 dentist-0.005
 Enjoy-0.005
Token so
Token ID523
Feature activation-1.76e-03

Pos logit contributions
quet+0.014
bing+0.013
izing+0.013
 spe+0.012
ized+0.012
Neg logit contributions
 inconvenience-0.016
 hospital-0.016
 Enjoy-0.015
have-0.015
 dentist-0.015
Token pretty
Token ID2495
Feature activation-6.81e-04

Pos logit contributions
 hospital+0.003
 inconvenience+0.003
 fault+0.003
 tells+0.003
 dentist+0.003
Neg logit contributions
quet-0.003
ized-0.003
bing-0.003
izing-0.003
ingly-0.003
Token.
Token ID13
Feature activation-9.54e-03

Pos logit contributions
 tells+0.001
 hospital+0.001
 fault+0.001
 decides+0.001
 Enjoy+0.001
Neg logit contributions
quet-0.001
ized-0.001
ingly-0.001
bing-0.001
hetically-0.001
Token Can
Token ID1680
Feature activation-8.96e-03

Pos logit contributions
 hospital+0.011
 tells+0.010
 Enjoy+0.010
 inconvenience+0.010
 fault+0.010
Neg logit contributions
quet-0.021
ized-0.020
bing-0.020
ingly-0.019
izing-0.019
Token we
Token ID356
Feature activation-3.17e-02

Pos logit contributions
 tells+0.010
 hospital+0.009
 Enjoy+0.009
 decides+0.008
 says+0.008
Neg logit contributions
ingly-0.020
quet-0.020
imm-0.020
edly-0.019
bing-0.019
Token catch
Token ID4929
Feature activation-1.20e-02

Pos logit contributions
 hospital+0.035
 tells+0.033
 Enjoy+0.032
 inconvenience+0.031
 fault+0.030
Neg logit contributions
quet-0.069
ingly-0.067
bing-0.067
ized-0.067
izing-0.066
Token it
Token ID340
Feature activation-4.91e-03

Pos logit contributions
 hospital+0.016
 inconvenience+0.015
 tells+0.015
 fault+0.015
 Enjoy+0.015
Neg logit contributions
quet-0.023
ized-0.022
bing-0.022
ingly-0.022
izing-0.021
Token?"
Token ID1701
Feature activation-1.47e-02

Pos logit contributions
 hospital+0.004
 inconvenience+0.004
 fault+0.004
 Enjoy+0.003
 tells+0.003
Neg logit contributions
quet-0.012
ized-0.012
bing-0.012
izing-0.012
ingly-0.011
Token
Token ID198
Feature activation-5.25e-03

Pos logit contributions
 inconvenience+0.020
 hospital+0.019
 fault+0.019
 Enjoy+0.018
have+0.018
Neg logit contributions
quet-0.028
ingly-0.026
ized-0.026
bing-0.026
edly-0.025
Token was
Token ID373
Feature activation-8.86e-03

Pos logit contributions
 fault+0.036
 Enjoy+0.036
 inconvenience+0.036
 hospital+0.032
have+0.032
Neg logit contributions
ingly-0.029
ized-0.029
bing-0.029
quet-0.028
hog-0.028
Token very
Token ID845
Feature activation-2.51e-02

Pos logit contributions
 Enjoy+0.013
 decides+0.013
have+0.012
 inconvenience+0.012
 Congratulations+0.012
Neg logit contributions
 beneath-0.015
lights-0.015
bing-0.014
 spe-0.014
 between-0.014
Token dark
Token ID3223
Feature activation-1.40e-02

Pos logit contributions
 fault+0.044
 Enjoy+0.043
 Congratulations+0.042
 decides+0.041
 inconvenience+0.041
Neg logit contributions
lights-0.037
ized-0.036
 brightly-0.035
ingly-0.035
 beneath-0.034
Token outside
Token ID2354
Feature activation-3.25e-02

Pos logit contributions
 hospital+0.022
 tells+0.021
 dentist+0.021
 inconvenience+0.020
 fault+0.020
Neg logit contributions
quet-0.025
ized-0.023
bing-0.022
izing-0.022
ingly-0.022
Token and
Token ID290
Feature activation-3.27e-02

Pos logit contributions
 Enjoy+0.060
 inconvenience+0.060
 fault+0.060
 tells+0.058
 hospital+0.058
Neg logit contributions
ingly-0.043
quet-0.043
bing-0.042
ized-0.041
izing-0.039
Token very
Token ID845
Feature activation-5.70e-02

Pos logit contributions
 Enjoy+0.050
 fault+0.050
 inconvenience+0.048
have+0.047
 Congratulations+0.046
Neg logit contributions
ized-0.052
bing-0.051
lights-0.051
 beneath-0.050
ingly-0.050
Token late
Token ID2739
Feature activation-3.71e-02

Pos logit contributions
 Enjoy+0.089
 inconvenience+0.087
 fault+0.083
 Congratulations+0.082
have+0.082
Neg logit contributions
ized-0.096
ingly-0.095
lights-0.092
bing-0.092
 beneath-0.091
Token.
Token ID13
Feature activation+1.12e-02

Pos logit contributions
 hospital+0.070
have+0.070
 Enjoy+0.070
 fault+0.069
 tells+0.067
Neg logit contributions
ingly-0.048
quet-0.048
edly-0.047
bing-0.047
ized-0.046
Token Tom
Token ID4186
Feature activation+2.47e-02

Pos logit contributions
edly+0.015
ingly+0.014
quet+0.012
ized+0.012
ised+0.012
Neg logit contributions
have-0.020
 supposed-0.019
 inconvenience-0.019
 fault-0.018
ws-0.018
Token and
Token ID290
Feature activation-7.95e-03

Pos logit contributions
quet+0.033
ingly+0.031
ized+0.031
bing+0.031
izing+0.029
Neg logit contributions
 inconvenience-0.046
 hospital-0.046
 Enjoy-0.045
 fault-0.045
 tells-0.045
Token Amy
Token ID14235
Feature activation+8.01e-03

Pos logit contributions
 fault+0.015
 inconvenience+0.015
 hospital+0.014
 Enjoy+0.014
have+0.014
Neg logit contributions
quet-0.010
hog-0.009
ingly-0.009
bing-0.009
ized-0.009
Token,
Token ID11
Feature activation-2.66e-03

Pos logit contributions
quet+0.003
ized+0.002
bing+0.002
ingly+0.002
izing+0.002
Neg logit contributions
 hospital-0.003
 tells-0.003
 inconvenience-0.003
 Enjoy-0.003
 fault-0.003
Token "
Token ID366
Feature activation+2.27e-02

Pos logit contributions
 hospital+0.004
 fault+0.004
 inconvenience+0.004
have+0.004
ws+0.004
Neg logit contributions
quet-0.004
bing-0.004
ingly-0.004
hog-0.004
izing-0.004
TokenYoung
Token ID20917
Feature activation+6.25e-03

Pos logit contributions
quet+0.044
ized+0.042
bing+0.042
ingly+0.041
izing+0.041
Neg logit contributions
 hospital-0.030
 tells-0.029
 inconvenience-0.029
 Enjoy-0.028
have-0.027
Token lady
Token ID10846
Feature activation-8.38e-04

Pos logit contributions
quet+0.011
ized+0.010
ingly+0.010
bing+0.010
izing+0.010
Neg logit contributions
 hospital-0.009
 tells-0.009
 fault-0.009
 inconvenience-0.009
 Enjoy-0.009
Token,
Token ID11
Feature activation-7.43e-03

Pos logit contributions
 hospital+0.002
 inconvenience+0.002
 microwave+0.002
 fault+0.002
have+0.002
Neg logit contributions
quet-0.001
ingly-0.001
hog-0.001
ized-0.001
bing-0.001
Token you
Token ID345
Feature activation-5.34e-02

Pos logit contributions
 fault+0.010
 microwave+0.010
 hospital+0.010
ws+0.010
 inconvenience+0.009
Neg logit contributions
quet-0.013
ized-0.012
izing-0.012
worm-0.012
als-0.012
Token should
Token ID815
Feature activation-3.15e-02

Pos logit contributions
 microwave+0.075
 hospital+0.070
 fault+0.070
 dentist+0.069
 inconvenience+0.067
Neg logit contributions
quet-0.092
als-0.091
stones-0.088
ized-0.088
hog-0.085
Token never
Token ID1239
Feature activation-7.32e-03

Pos logit contributions
 microwave+0.042
 fault+0.040
 inconvenience+0.038
ws+0.038
 hospital+0.038
Neg logit contributions
quet-0.058
ized-0.054
bing-0.053
ingly-0.052
izing-0.052
Token go
Token ID467
Feature activation-4.99e-03

Pos logit contributions
 fault+0.010
 microwave+0.010
 hospital+0.010
 Enjoy+0.009
 tells+0.009
Neg logit contributions
quet-0.014
ized-0.013
bing-0.013
ingly-0.013
izing-0.013
Token places
Token ID4113
Feature activation+3.12e-02

Pos logit contributions
 hospital+0.008
 fault+0.008
 tells+0.008
 inconvenience+0.008
 Enjoy+0.008
Neg logit contributions
quet-0.008
bing-0.007
ized-0.007
ingly-0.007
izing-0.007
Token you
Token ID345
Feature activation-2.72e-02

Pos logit contributions
quet+0.050
ized+0.048
bing+0.047
ingly+0.047
izing+0.046
Neg logit contributions
 hospital-0.051
 tells-0.050
 inconvenience-0.049
 fault-0.049
 Enjoy-0.048
Token and
Token ID290
Feature activation+1.95e-03

Pos logit contributions
quet+0.000
ized+0.000
bing+0.000
ingly+0.000
izing+0.000
Neg logit contributions
 inconvenience-0.001
 hospital-0.001
 Enjoy-0.001
 fault-0.001
have-0.001
Token Liz
Token ID28313
Feature activation+1.75e-02

Pos logit contributions
quet+0.003
bing+0.003
ized+0.003
ingly+0.003
izing+0.003
Neg logit contributions
 hospital-0.003
 inconvenience-0.003
 Enjoy-0.003
 fault-0.003
 tells-0.003
Token are
Token ID389
Feature activation-1.12e-02

Pos logit contributions
quet+0.037
ized+0.037
ingly+0.036
bing+0.036
izing+0.035
Neg logit contributions
 hospital-0.018
 inconvenience-0.018
 fault-0.018
 Enjoy-0.017
 tells-0.017
Token scared
Token ID12008
Feature activation-2.75e-02

Pos logit contributions
 inconvenience+0.013
 hospital+0.013
 Enjoy+0.013
 tells+0.013
 fault+0.013
Neg logit contributions
quet-0.022
ized-0.022
bing-0.022
ingly-0.021
izing-0.021
Token and
Token ID290
Feature activation-1.48e-02

Pos logit contributions
 hospital+0.038
 tells+0.037
 inconvenience+0.035
 fault+0.035
have+0.034
Neg logit contributions
quet-0.054
ized-0.049
bing-0.049
izing-0.048
stones-0.047
Token sad
Token ID6507
Feature activation-5.49e-02

Pos logit contributions
 dentist+0.021
 fault+0.021
 inconvenience+0.020
 Enjoy+0.020
arre+0.019
Neg logit contributions
ingly-0.025
ized-0.025
bing-0.024
hog-0.024
edly-0.024
Token.
Token ID13
Feature activation+9.49e-03

Pos logit contributions
 Enjoy+0.111
 inconvenience+0.110
have+0.109
 hospital+0.108
 decides+0.107
Neg logit contributions
ingly-0.065
quet-0.062
edly-0.062
bing-0.062
izing-0.060
Token They
Token ID1119
Feature activation+7.61e-03

Pos logit contributions
quet+0.016
bing+0.016
ingly+0.015
ized+0.015
izing+0.015
Neg logit contributions
 hospital-0.014
 inconvenience-0.014
have-0.014
 Enjoy-0.014
 fault-0.013
Token cry
Token ID3960
Feature activation-4.26e-02

Pos logit contributions
ized+0.012
hog+0.012
als+0.012
ingly+0.012
hetically+0.011
Neg logit contributions
 fault-0.011
 dentist-0.011
 inconvenience-0.011
 microwave-0.011
have-0.010
Token and
Token ID290
Feature activation-6.60e-03

Pos logit contributions
 inconvenience+0.091
 hospital+0.086
 Enjoy+0.086
 dentist+0.086
 tells+0.084
Neg logit contributions
ingly-0.049
bing-0.048
quet-0.046
ized-0.044
edly-0.043
Token run
Token ID1057
Feature activation-2.23e-02

Pos logit contributions
 fault+0.009
 inconvenience+0.008
 dentist+0.008
have+0.008
 Enjoy+0.008
Neg logit contributions
ized-0.011
ingly-0.011
hetically-0.011
quet-0.011
hog-0.011
Tokenâ
Token ID22940
Feature activation-1.20e-02

Pos logit contributions
quet+0.016
bing+0.015
ingly+0.015
ized+0.014
izing+0.014
Neg logit contributions
 hospital-0.014
 inconvenience-0.013
 fault-0.013
 tells-0.013
 dentist-0.013
Token
Token ID26391
Feature activation-1.82e-03

Pos logit contributions
 hospital+0.015
 fault+0.015
 tells+0.014
 inconvenience+0.014
 dentist+0.014
Neg logit contributions
quet-0.024
ized-0.023
bing-0.023
ingly-0.022
izing-0.022
Token
Token ID198
Feature activation+1.46e-03

Pos logit contributions
 hospital+0.003
 inconvenience+0.003
 fault+0.003
 tells+0.003
 Enjoy+0.003
Neg logit contributions
quet-0.003
bing-0.003
ized-0.003
ingly-0.003
izing-0.003
Token
Token ID198
Feature activation+7.11e-03

Pos logit contributions
quet+0.003
ized+0.002
ingly+0.002
bing+0.002
ised+0.002
Neg logit contributions
 fault-0.002
 hospital-0.002
 inconvenience-0.002
 tells-0.002
 decides-0.002
TokenThe
Token ID464
Feature activation-9.69e-03

Pos logit contributions
quet+0.010
ized+0.009
bing+0.009
ingly+0.009
izing+0.008
Neg logit contributions
 hospital-0.014
 fault-0.014
 tells-0.013
 inconvenience-0.013
 decides-0.013
Token elderly
Token ID13830
Feature activation-5.24e-02

Pos logit contributions
ws+0.020
 fault+0.020
arre+0.019
 supposed+0.018
 Enjoy+0.018
Neg logit contributions
worm-0.011
hog-0.011
quet-0.011
edly-0.009
ized-0.009
Token lady
Token ID10846
Feature activation-9.49e-03

Pos logit contributions
 hospital+0.076
 fault+0.075
 tells+0.074
 inconvenience+0.074
 Enjoy+0.072
Neg logit contributions
quet-0.094
ized-0.089
bing-0.087
ingly-0.087
izing-0.085
Token nodded
Token ID14464
Feature activation+1.63e-02

Pos logit contributions
 hospital+0.012
 inconvenience+0.012
 fault+0.012
 Enjoy+0.011
 tells+0.011
Neg logit contributions
quet-0.019
ized-0.018
ingly-0.018
bing-0.017
izing-0.017
Token and
Token ID290
Feature activation+5.71e-03

Pos logit contributions
quet+0.027
ized+0.025
izing+0.025
bing+0.024
stones+0.024
Neg logit contributions
 hospital-0.026
 tells-0.025
 Enjoy-0.024
have-0.024
 decides-0.024
Token said
Token ID531
Feature activation-1.02e-02

Pos logit contributions
quet+0.012
ized+0.012
ingly+0.011
bing+0.011
izing+0.011
Neg logit contributions
 hospital-0.007
 inconvenience-0.006
 fault-0.006
 tells-0.006
 Enjoy-0.006
Token,
Token ID11
Feature activation-1.43e-02

Pos logit contributions
 hospital+0.019
 tells+0.019
 inconvenience+0.018
 Enjoy+0.018
 fault+0.018
Neg logit contributions
quet-0.014
bing-0.013
ized-0.013
ingly-0.013
izing-0.013
Token just
Token ID655
Feature activation-3.81e-03

Pos logit contributions
 inconvenience+0.001
 Enjoy+0.001
arre+0.001
 decides+0.001
 Time+0.001
Neg logit contributions
 spe-0.001
bing-0.001
 beneath-0.001
worm-0.001
hog-0.001
Token very
Token ID845
Feature activation-3.13e-02

Pos logit contributions
 inconvenience+0.007
 hospital+0.007
arre+0.007
 dentist+0.007
 microwave+0.006
Neg logit contributions
bing-0.005
 beneath-0.005
 spe-0.005
worm-0.004
ingly-0.004
Token quiet
Token ID5897
Feature activation-6.11e-03

Pos logit contributions
 hospital+0.053
 microwave+0.051
 fault+0.051
 dentist+0.051
 inconvenience+0.051
Neg logit contributions
ized-0.046
ingly-0.046
worm-0.046
izing-0.045
bing-0.043
Token.
Token ID13
Feature activation-8.23e-03

Pos logit contributions
 hospital+0.011
 inconvenience+0.011
 Enjoy+0.011
 tells+0.010
 dentist+0.010
Neg logit contributions
ingly-0.009
quet-0.009
bing-0.008
ized-0.008
edly-0.008
Token And
Token ID843
Feature activation-1.57e-02

Pos logit contributions
 inconvenience+0.014
 hospital+0.014
 done+0.014
 fault+0.014
 mess+0.013
Neg logit contributions
quet-0.012
ingly-0.011
edly-0.011
hog-0.011
bing-0.011
Token very
Token ID845
Feature activation-6.11e-02

Pos logit contributions
 inconvenience+0.025
 hospital+0.025
 fault+0.024
 dentist+0.023
 microwave+0.023
Neg logit contributions
quet-0.024
bing-0.024
ized-0.023
ingly-0.023
izing-0.023
Token pretty
Token ID2495
Feature activation-2.92e-02

Pos logit contributions
 fault+0.099
 inconvenience+0.096
 microwave+0.095
 hospital+0.094
 dentist+0.091
Neg logit contributions
ingly-0.093
ized-0.091
worm-0.091
izing-0.089
bing-0.087
Token.
Token ID13
Feature activation-8.96e-03

Pos logit contributions
 hospital+0.049
 inconvenience+0.049
 fault+0.047
 Enjoy+0.047
 dentist+0.046
Neg logit contributions
quet-0.045
ingly-0.044
bing-0.044
izing-0.043
ized-0.042
Token He
Token ID679
Feature activation+1.84e-02

Pos logit contributions
 hospital+0.014
 inconvenience+0.014
 fault+0.014
 done+0.014
 mess+0.014
Neg logit contributions
quet-0.013
ingly-0.013
edly-0.013
bing-0.013
hog-0.013
Token is
Token ID318
Feature activation-2.16e-03

Pos logit contributions
als+0.034
ized+0.033
hog+0.033
ingly+0.033
worm+0.033
Neg logit contributions
 inconvenience-0.024
 hospital-0.022
 fault-0.022
 dentist-0.022
 Enjoy-0.021
Token the
Token ID262
Feature activation+1.88e-02

Pos logit contributions
 inconvenience+0.004
arre+0.004
 decides+0.003
 Time+0.003
 Enjoy+0.003
Neg logit contributions
 spe-0.003
bing-0.003
 beneath-0.003
worm-0.003
ized-0.003