TOP ACTIVATIONS
MAX = 13.895

, it kept on flying err atically with inexplicable speed .
bring the quarrel to a resolution , until everyone was happy
˜ S ail ors Wanted â . Do
only give out on special occasions . â The kids
to grow l and sn arl to try and scare her
bananas when they went on trips together . <|endoftext|> Once upon
, and they started to argue . Their mom said ,
They raced each other and competed to see who the fastest
new word , ' emer gency ', so she whispered the
, so they started to argue . Tim had
a pop - pop - pop sound . She laughs .
sat down in a s ulk . Suddenly he remembered where
answered , " I take trips because I want to explore
opinions . They start to argue . They say , "
time . They started to argue and fight over who could
look at it from all angles . It was so colorful
to draw and started to argue . The argument
were both too young to argue over the television .
this time she wanted to compete with her brother . Her
. Suddenly , it shr ank ! The goose was

MIN ACTIVATIONS
MIN = -12.596

. But when their teacher came , she did
mess . But their mom said , " No
Finally , when their parents came to pick them
bike instead . But when their mom was busy talking to
worry however , as when their train arrived , it was
. Even though their holiday was different than they
. When their treat was finished , the
spending time together . Whenever their parents would leave the house
was a weird feeling . Er nie didn 't understand why
. When their fence was done , John
hands . When their parents saw the mole ,
pictures together . Even though their vacation was ruined , they
One day , Pe ppa stumbled upon a strange
. They hoped to rebuild their village and make it even
. When their energy was spent , they
hard , and then when his mom came home , she
Anna and Max finish their food and go to play
had fun . Their moms were proud . They
is brave . Their dad says , " Wow
and more sand together until their castle was finished .

INTERVAL 7.272 to 13.895
CONTAINS 1.102%

lots of delicious food and played games the entire day .
it . Then they will wake you up and take you
I don 't want to bite anymore . Please help me
Soon it was time to march back home . The other
sad . She started to weep , and the tears rolled

INTERVAL 0.650 to 7.272
CONTAINS 45.045%

and started to feel better . As the days
pirates and look for treasure . One day ,
see the sky , the trees , the birds , and
through the farm , she saw a big barn . She
the park , but they couldn 't find any treasure .

INTERVAL -5.973 to 0.650
CONTAINS 43.272%

, " Yes , fans are pretty cool !" <|endoftext|> Once
, Lily made sure to always be aware of the alarms
farther until they came to a park . The
. It did not listen to Lily . It kept running
. They were both competitive and wanted to win . They

INTERVAL -12.596 to -5.973
CONTAINS 10.581%

my friends ?" The children thought this was a
what they can buy with their money . The
he was able to take the lead . Finally
with his toy soldiers . He liked to make loud noises
parts and materials to make the car . He worked very
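
The four INTERVAL headings above look like equal-width bins spanning MIN to MAX. That is an inference from the printed numbers, not something the dump states. The sketch below recomputes the bin edges under that assumption and checks that the CONTAINS shares add up to 100%. The function name `equal_width_bins` is hypothetical and is not taken from whatever tool produced this dump.

```python
def equal_width_bins(lo, hi, n_bins=4):
    """Split [lo, hi] into n_bins equal-width (left, right) intervals."""
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    edges[-1] = hi  # pin the last edge to avoid floating-point drift
    return list(zip(edges[:-1], edges[1:]))


# MIN and MAX as printed in the dump (rounded to three decimals there).
MIN_ACT, MAX_ACT = -12.596, 13.895

bins = equal_width_bins(MIN_ACT, MAX_ACT)

# CONTAINS percentages as printed, listed from the lowest bin to the highest.
shares = [10.581, 43.272, 45.045, 1.102]
total_share = sum(shares)

# Print highest bin first, matching the order used in the dump.
for (left, right), share in reversed(list(zip(bins, shares))):
    print(f"{left:.3f} to {right:.3f}: {share:.3f}%")
```

Because MIN and MAX are already rounded in the dump, the recomputed edges can differ from the printed ones in the last decimal place.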
Token,
Token ID11
Feature activation-8.47e-01

Pos logit contributions
ges+0.021
uts+0.021
aps+0.020
 This+0.018
uttering+0.018
Neg logit contributions
 hospital-0.052
 nurses-0.050
 dentist-0.050
 punished-0.046
 happiest-0.046
Token it
Token ID340
Feature activation-2.30e+00

Pos logit contributions
 hospital+0.020
 dentist+0.019
 nurses+0.019
 wounds+0.018
 punished+0.018
Neg logit contributions
ges-0.013
uts-0.012
aps-0.012
 This-0.012
uttering-0.011
Token kept
Token ID4030
Feature activation+3.85e+00

Pos logit contributions
 dentist+0.047
 nurses+0.047
 hospital+0.047
 wounds+0.043
 workshop+0.042
Neg logit contributions
uts-0.038
ed-0.038
ges-0.037
 together-0.035
ung-0.035
Token on
Token ID319
Feature activation+1.42e+00

Pos logit contributions
uts+0.069
ges+0.067
aps+0.067
ore+0.062
 This+0.062
Neg logit contributions
 hospital-0.082
 nurses-0.078
 dentist-0.077
 punished-0.072
 cal-0.072
Token flying
Token ID7348
Feature activation+6.08e+00

Pos logit contributions
ges+0.026
uts+0.026
aps+0.025
ore+0.024
 This+0.024
Neg logit contributions
 hospital-0.030
 dentist-0.028
 nurses-0.028
 punished-0.025
 wounds-0.025
Token err
Token ID11454
Feature activation+1.29e+01

Pos logit contributions
ges+0.094
uts+0.092
aps+0.089
 This+0.083
uttering+0.083
Neg logit contributions
 hospital-0.144
 dentist-0.138
 nurses-0.138
 punished-0.126
 happiest-0.125
Tokenatically
Token ID4142
Feature activation+1.85e+00

Pos logit contributions
ges+0.258
uts+0.252
aps+0.246
 This+0.235
uttering+0.233
Neg logit contributions
 hospital-0.249
 nurses-0.234
 dentist-0.233
 punished-0.209
 wounds-0.207
Token with
Token ID351
Feature activation-8.10e-01

Pos logit contributions
ges+0.028
uts+0.027
aps+0.027
 This+0.025
uttering+0.025
Neg logit contributions
 hospital-0.044
 nurses-0.041
 dentist-0.040
 punished-0.038
 wounds-0.038
Token inexplicable
Token ID49602
Feature activation-4.04e+00

Pos logit contributions
 hospital+0.017
 dentist+0.016
 nurses+0.016
 punished+0.015
 wounds+0.014
Neg logit contributions
ges-0.015
uts-0.014
aps-0.014
 This-0.013
uttering-0.013
Token speed
Token ID2866
Feature activation+3.86e+00

Pos logit contributions
 hospital+0.066
 dentist+0.060
 nurses+0.060
 wounds+0.053
 punished+0.053
Neg logit contributions
ges-0.093
uts-0.091
aps-0.089
 This-0.085
ore-0.084
Token.
Token ID13
Feature activation+5.36e+00

Pos logit contributions
aps+0.053
ges+0.052
uts+0.049
uttering+0.047
 This+0.046
Neg logit contributions
 hospital-0.095
 nurses-0.089
 dentist-0.088
 forgot-0.085
 punished-0.084
Token bring
Token ID2222
Feature activation+3.44e+00

Pos logit contributions
 hospital+0.033
 dentist+0.032
 nurses+0.032
 doctors+0.028
 wounds+0.027
Neg logit contributions
ges-0.040
aps-0.037
uts-0.037
 This-0.037
ore-0.035
Token the
Token ID262
Feature activation-8.17e+00

Pos logit contributions
ges+0.062
aps+0.060
uts+0.060
 This+0.057
ore+0.055
Neg logit contributions
 hospital-0.072
 dentist-0.069
 nurses-0.068
 forgot-0.061
 punished-0.061
Token quarrel
Token ID41998
Feature activation+7.69e+00

Pos logit contributions
 hospital+0.134
 nurses+0.128
 dentist+0.125
ureen+0.112
 punished+0.111
Neg logit contributions
ges-0.172
aps-0.168
uts-0.166
 This-0.162
ore-0.161
Token to
Token ID284
Feature activation-2.89e-01

Pos logit contributions
ges+0.117
uts+0.110
aps+0.109
uttering+0.104
ore+0.101
Neg logit contributions
 hospital-0.184
 nurses-0.173
 dentist-0.173
 happiest-0.159
 punished-0.158
Token a
Token ID257
Feature activation-3.55e+00

Pos logit contributions
 hospital+0.006
 dentist+0.006
 nurses+0.006
 punished+0.005
 forgot+0.005
Neg logit contributions
ges-0.005
uts-0.005
aps-0.005
 This-0.005
ore-0.004
Token resolution
Token ID6323
Feature activation+1.28e+01

Pos logit contributions
 hospital+0.053
 nurses+0.053
 dentist+0.051
 wounds+0.048
 punished+0.046
Neg logit contributions
ges-0.076
uts-0.074
aps-0.073
 This-0.073
ore-0.072
Token,
Token ID11
Feature activation+8.07e-01

Pos logit contributions
ges+0.208
aps+0.205
uts+0.201
uttering+0.193
 This+0.190
Neg logit contributions
 hospital-0.276
 dentist-0.264
 nurses-0.260
 happiest-0.239
 cal-0.237
Token until
Token ID1566
Feature activation+7.44e-01

Pos logit contributions
ges+0.014
uts+0.013
aps+0.013
 This+0.012
uttering+0.012
Neg logit contributions
 hospital-0.018
 dentist-0.017
 nurses-0.017
 wounds-0.015
 punished-0.015
Token everyone
Token ID2506
Feature activation-1.49e+00

Pos logit contributions
ges+0.014
uts+0.014
aps+0.014
uttering+0.013
ore+0.012
Neg logit contributions
 hospital-0.016
 dentist-0.015
 nurses-0.015
 punished-0.013
 happiest-0.013
Token was
Token ID373
Feature activation-3.58e+00

Pos logit contributions
 hospital+0.029
 dentist+0.028
 nurses+0.028
 wounds+0.026
 doctors+0.024
Neg logit contributions
ges-0.030
uts-0.028
aps-0.028
ore-0.027
apped-0.026
Token happy
Token ID3772
Feature activation+1.32e+00

Pos logit contributions
 hospital+0.065
 nurses+0.061
 dentist+0.061
 punished+0.054
 happiest+0.053
Neg logit contributions
ges-0.076
uts-0.074
aps-0.072
 This-0.069
uttering-0.069
Token
Token ID26391
Feature activation+2.15e+00

Pos logit contributions
ges+0.023
uts+0.022
aps+0.021
uttering+0.020
uttered+0.020
Neg logit contributions
 hospital-0.027
 dentist-0.025
 nurses-0.024
 cal-0.023
 happiest-0.022
Token˜
Token ID41185
Feature activation+3.46e+00

Pos logit contributions
ges+0.034
uttering+0.032
uttered+0.029
elt+0.028
apped+0.028
Neg logit contributions
 hospital-0.046
 dentist-0.046
 forgot-0.044
ureen-0.044
ki-0.042
TokenS
Token ID50
Feature activation-4.04e+00

Pos logit contributions
ges+0.054
uttering+0.051
uts+0.051
aps+0.049
uttered+0.046
Neg logit contributions
 hospital-0.079
 dentist-0.075
 nurses-0.075
 happiest-0.070
 cal-0.069
Tokenail
Token ID603
Feature activation-3.06e+00

Pos logit contributions
 dentist+0.055
 hospital+0.054
 nurses+0.048
 cal+0.045
ureen+0.045
Neg logit contributions
ges-0.103
uts-0.101
uttering-0.095
aps-0.094
uttered-0.092
Tokenors
Token ID669
Feature activation+5.62e-01

Pos logit contributions
 hospital+0.066
 dentist+0.063
 nurses+0.062
 punished+0.057
 forgot+0.057
Neg logit contributions
ges-0.054
uts-0.052
aps-0.050
uttering-0.048
 This-0.047
Token Wanted
Token ID41536
Feature activation+1.25e+01

Pos logit contributions
ges+0.010
uts+0.010
aps+0.010
uttering+0.010
 This+0.009
Neg logit contributions
 hospital-0.011
 nurses-0.011
 dentist-0.011
 punished-0.010
 forgot-0.010
Tokenâ
Token ID22940
Feature activation-5.80e-01

Pos logit contributions
ges+0.183
uttering+0.176
uttered+0.166
uts+0.159
aps+0.159
Neg logit contributions
 hospital-0.273
 nurses-0.267
 dentist-0.261
 doctors-0.256
 conductor-0.248
Token
Token ID26391
Feature activation+2.22e+00

Pos logit contributions
 hospital+0.008
 dentist+0.007
 nurses+0.007
 punished+0.006
 happiest+0.006
Neg logit contributions
ges-0.015
uts-0.015
aps-0.014
uttering-0.014
apped-0.013
Token
Token ID8151
Feature activation+6.79e+00

Pos logit contributions
ges+0.038
uttering+0.035
uttered+0.032
elt+0.032
aps+0.032
Neg logit contributions
 hospital-0.044
 forgot-0.043
 dentist-0.042
ureen-0.041
ki-0.040
Token.
Token ID13
Feature activation+6.22e+00

Pos logit contributions
uttering+0.088
ges+0.088
aps+0.079
elt+0.071
ectar+0.070
Neg logit contributions
 hospital-0.154
 dentist-0.147
 forgot-0.144
ureen-0.134
 punished-0.134
Token Do
Token ID2141
Feature activation+2.06e+00

Pos logit contributions
uttering+0.097
ges+0.096
uts+0.094
aps+0.089
uttered+0.086
Neg logit contributions
 hospital-0.139
 nurses-0.133
 dentist-0.131
 cal-0.120
 happiest-0.119
Token only
Token ID691
Feature activation-3.06e+00

Pos logit contributions
 hospital+0.067
 dentist+0.064
 nurses+0.063
 wounds+0.057
ureen+0.057
Neg logit contributions
ges-0.063
uts-0.062
aps-0.061
 This-0.060
ore-0.057
Token give
Token ID1577
Feature activation+5.65e+00

Pos logit contributions
 hospital+0.058
 nurses+0.055
 dentist+0.054
 punished+0.049
 forgot+0.049
Neg logit contributions
uts-0.062
ges-0.062
aps-0.060
uttering-0.059
apping-0.056
Token out
Token ID503
Feature activation+3.36e+00

Pos logit contributions
ges+0.094
uts+0.093
aps+0.090
uttering+0.085
 This+0.084
Neg logit contributions
 hospital-0.127
 nurses-0.121
 dentist-0.120
 punished-0.110
 wounds-0.109
Token on
Token ID319
Feature activation+2.54e+00

Pos logit contributions
ges+0.062
uts+0.060
aps+0.059
 This+0.056
ore+0.055
Neg logit contributions
 hospital-0.070
 dentist-0.066
 nurses-0.066
 punished-0.059
 happiest-0.059
Token special
Token ID2041
Feature activation+4.88e-01

Pos logit contributions
ges+0.046
uts+0.045
aps+0.045
uttering+0.042
apped+0.041
Neg logit contributions
 hospital-0.054
 nurses-0.051
 dentist-0.050
 punished-0.046
 wounds-0.046
Token occasions
Token ID12432
Feature activation+1.21e+01

Pos logit contributions
ges+0.010
uts+0.010
aps+0.010
 This+0.009
ore+0.009
Neg logit contributions
 hospital-0.009
 dentist-0.008
 nurses-0.008
 punished-0.007
 forgot-0.007
Token.
Token ID13
Feature activation+5.96e+00

Pos logit contributions
ges+0.196
elt+0.192
uttered+0.190
uttering+0.184
apped+0.176
Neg logit contributions
 hospital-0.250
 dentist-0.244
 nurses-0.242
 wounds-0.227
 doctors-0.224
Tokenâ
Token ID22940
Feature activation-2.64e+00

Pos logit contributions
ges+0.087
uttering+0.086
uts+0.082
uttered+0.075
aps+0.075
Neg logit contributions
 hospital-0.147
 dentist-0.144
 nurses-0.139
 cal-0.126
 doctors-0.126
Token
Token ID26391
Feature activation+1.06e+00

Pos logit contributions
 hospital+0.049
 nurses+0.047
 dentist+0.046
 wounds+0.042
 doctors+0.041
Neg logit contributions
ges-0.053
 This-0.052
uts-0.051
ore-0.050
aps-0.050
Token The
Token ID383
Feature activation-7.19e+00

Pos logit contributions
ges+0.016
uttering+0.014
aps+0.014
uts+0.013
ore+0.013
Neg logit contributions
 hospital-0.025
 dentist-0.025
 forgot-0.023
 nurses-0.023
 punished-0.023
Token kids
Token ID3988
Feature activation-5.44e-01

Pos logit contributions
 hospital+0.103
 dentist+0.098
 nurses+0.095
 forgot+0.093
 punished+0.091
Neg logit contributions
aps-0.153
ges-0.153
 This-0.152
ore-0.149
uts-0.148
Token to
Token ID284
Feature activation-2.85e+00

Pos logit contributions
uts+0.065
ges+0.063
aps+0.060
uttering+0.055
ore+0.055
Neg logit contributions
 hospital-0.166
 nurses-0.158
 dentist-0.152
 punished-0.151
 mistake-0.150
Token grow
Token ID1663
Feature activation+5.81e+00

Pos logit contributions
 hospital+0.049
 dentist+0.048
 nurses+0.047
 wounds+0.042
 happiest+0.041
Neg logit contributions
ges-0.060
uts-0.058
aps-0.057
 This-0.055
uttering-0.054
Tokenl
Token ID75
Feature activation+3.31e+00

Pos logit contributions
ges+0.106
uts+0.103
aps+0.101
ore+0.095
uttering+0.094
Neg logit contributions
 hospital-0.121
 dentist-0.117
 nurses-0.115
 punished-0.103
 forgot-0.102
Token and
Token ID290
Feature activation-3.60e+00

Pos logit contributions
ges+0.053
 This+0.048
 Yes+0.047
aps+0.046
 Sorry+0.045
Neg logit contributions
 hospital-0.074
 dentist-0.070
 nurses-0.068
 cal-0.065
 doctors-0.064
Token sn
Token ID3013
Feature activation+5.88e+00

Pos logit contributions
 hospital+0.063
 dentist+0.060
 nurses+0.060
 punished+0.053
 happiest+0.053
Neg logit contributions
ges-0.076
uts-0.074
aps-0.073
 This-0.070
ore-0.069
Tokenarl
Token ID7063
Feature activation+1.21e+01

Pos logit contributions
ges+0.094
uts+0.093
 This+0.082
aps+0.081
 Yes+0.078
Neg logit contributions
 hospital-0.137
 punished-0.124
 cal-0.123
 nurses-0.121
 dentist-0.118
Token to
Token ID284
Feature activation-4.80e-01

Pos logit contributions
ges+0.133
elt+0.120
uttering+0.114
 This+0.114
aps+0.111
Neg logit contributions
 hospital-0.326
 nurses-0.305
 wounds-0.297
 dentist-0.295
 forgot-0.295
Token try
Token ID1949
Feature activation+6.82e+00

Pos logit contributions
 hospital+0.009
 dentist+0.009
 nurses+0.009
 forgot+0.008
 punished+0.008
Neg logit contributions
ges-0.009
 This-0.008
aps-0.008
uts-0.008
uttering-0.008
Token and
Token ID290
Feature activation-3.15e+00

Pos logit contributions
ges+0.106
uts+0.103
aps+0.101
apped+0.097
uttering+0.097
Neg logit contributions
 hospital-0.162
 dentist-0.150
 nurses-0.150
 wounds-0.142
 mistake-0.140
Token scare
Token ID19437
Feature activation+5.41e+00

Pos logit contributions
 hospital+0.055
 dentist+0.052
 nurses+0.052
 happiest+0.046
 punished+0.045
Neg logit contributions
ges-0.066
uts-0.064
aps-0.064
 This-0.061
ore-0.060
Token her
Token ID607
Feature activation-5.63e+00

Pos logit contributions
ges+0.106
uts+0.103
aps+0.101
 This+0.096
uttering+0.095
Neg logit contributions
 hospital-0.106
 dentist-0.100
 nurses-0.099
 punished-0.090
 happiest-0.089
Token bananas
Token ID35484
Feature activation+4.03e+00

Pos logit contributions
ges+0.092
uts+0.089
aps+0.088
 This+0.086
ore+0.083
Neg logit contributions
 hospital-0.090
 dentist-0.086
 nurses-0.083
 forgot-0.078
 punished-0.075
Token when
Token ID618
Feature activation+9.17e-01

Pos logit contributions
ges+0.066
aps+0.062
uttering+0.060
uttered+0.059
uts+0.058
Neg logit contributions
 hospital-0.093
 dentist-0.089
 nurses-0.085
 cal-0.079
 punished-0.079
Token they
Token ID484
Feature activation-5.23e+00

Pos logit contributions
ges+0.016
aps+0.016
uts+0.016
uttering+0.015
ore+0.015
Neg logit contributions
 hospital-0.020
 nurses-0.019
 dentist-0.019
 doctors-0.017
 forgot-0.017
Token went
Token ID1816
Feature activation+4.47e+00

Pos logit contributions
 hospital+0.091
 nurses+0.087
 dentist+0.086
 wounds+0.080
 doctors+0.075
Neg logit contributions
ges-0.112
uts-0.107
aps-0.105
 This-0.103
ore-0.102
Token on
Token ID319
Feature activation+2.47e+00

Pos logit contributions
ges+0.063
uts+0.062
aps+0.053
uttered+0.052
uttering+0.052
Neg logit contributions
 hospital-0.122
 nurses-0.106
 dentist-0.105
 doctors-0.103
 punished-0.100
Token trips
Token ID13229
Feature activation+1.19e+01

Pos logit contributions
ges+0.045
uts+0.045
aps+0.044
 This+0.043
ore+0.041
Neg logit contributions
 hospital-0.048
 nurses-0.047
 dentist-0.047
 forgot-0.043
 punished-0.042
Token together
Token ID1978
Feature activation+2.34e+00

Pos logit contributions
ges+0.154
uts+0.146
aps+0.145
uttering+0.141
uttered+0.138
Neg logit contributions
 hospital-0.309
 dentist-0.290
 nurses-0.279
 punished-0.262
 doctors-0.261
Token.
Token ID13
Feature activation+5.06e+00

Pos logit contributions
ges+0.034
uts+0.034
aps+0.033
uttering+0.031
uttered+0.031
Neg logit contributions
 hospital-0.057
 nurses-0.052
 dentist-0.052
 happiest-0.050
 cal-0.048
Token<|endoftext|>
Token ID50256
Feature activation+6.89e+00

Pos logit contributions
ges+0.056
uts+0.052
aps+0.051
uttering+0.050
ore+0.045
Neg logit contributions
 hospital-0.145
 dentist-0.139
 nurses-0.135
 happiest-0.127
 forgot-0.126
TokenOnce
Token ID7454
Feature activation+5.23e-01

Pos logit contributions
uttering+0.118
uts+0.116
ges+0.108
iked+0.105
apping+0.104
Neg logit contributions
 hospital-0.147
 dentist-0.144
 cal-0.125
 tears-0.121
 way-0.118
Token upon
Token ID2402
Feature activation-5.45e-01

Pos logit contributions
ges+0.007
uts+0.007
aps+0.006
 This+0.006
uttering+0.006
Neg logit contributions
 hospital-0.014
 dentist-0.013
 nurses-0.013
 forgot-0.012
 happiest-0.012
Token,
Token ID11
Feature activation-1.77e-01

Pos logit contributions
ges+0.040
aps+0.038
uts+0.038
uttering+0.035
 This+0.035
Neg logit contributions
 hospital-0.073
 nurses-0.069
 dentist-0.069
 happiest-0.064
 punished-0.064
Token and
Token ID290
Feature activation-4.57e+00

Pos logit contributions
 hospital+0.004
 dentist+0.004
 nurses+0.004
 punished+0.003
 wounds+0.003
Neg logit contributions
uts-0.003
ges-0.003
aps-0.003
 This-0.003
 together-0.003
Token they
Token ID484
Feature activation-4.60e+00

Pos logit contributions
 hospital+0.095
 dentist+0.092
 nurses+0.089
 wounds+0.082
 happiest+0.081
Neg logit contributions
ges-0.083
uts-0.078
aps-0.074
 This-0.074
 together-0.073
Token started
Token ID2067
Feature activation+5.97e+00

Pos logit contributions
 hospital+0.092
 dentist+0.090
 nurses+0.086
 wounds+0.080
 mistake+0.078
Neg logit contributions
ges-0.089
uts-0.084
apped-0.082
 together-0.080
ed-0.079
Token to
Token ID284
Feature activation-2.57e+00

Pos logit contributions
uts+0.070
ges+0.068
aps+0.065
ore+0.060
uttering+0.060
Neg logit contributions
 hospital-0.164
 nurses-0.154
 dentist-0.152
 punished-0.148
 wounds-0.146
Token argue
Token ID7267
Feature activation+1.19e+01

Pos logit contributions
 hospital+0.044
 dentist+0.043
 nurses+0.042
 wounds+0.037
 happiest+0.037
Neg logit contributions
ges-0.055
uts-0.053
aps-0.053
 This-0.052
ore-0.049
Token.
Token ID13
Feature activation+6.10e+00

Pos logit contributions
ges+0.206
uttering+0.193
elt+0.191
aps+0.188
uts+0.188
Neg logit contributions
 hospital-0.259
 dentist-0.233
 mess-0.231
 nurses-0.229
 mistake-0.227
Token Their
Token ID5334
Feature activation-7.53e+00

Pos logit contributions
ges+0.097
uttering+0.096
uts+0.093
aps+0.089
ore+0.086
Neg logit contributions
 hospital-0.141
 dentist-0.131
 nurses-0.127
 tears-0.117
 forgot-0.116
Token mom
Token ID1995
Feature activation-3.20e+00

Pos logit contributions
 hospital+0.116
 nurses+0.113
 dentist+0.112
 punished+0.099
 cal+0.097
Neg logit contributions
uts-0.170
aps-0.168
ges-0.167
 This-0.158
apped-0.156
Token said
Token ID531
Feature activation+2.88e+00

Pos logit contributions
 hospital+0.063
 dentist+0.059
 happiest+0.057
 nurses+0.057
 wounds+0.056
Neg logit contributions
ges-0.061
 together-0.056
ed-0.055
elt-0.053
aps-0.053
Token,
Token ID11
Feature activation-2.76e-01

Pos logit contributions
ges+0.039
uts+0.038
aps+0.036
uttering+0.033
 This+0.033
Neg logit contributions
 hospital-0.075
 nurses-0.071
 dentist-0.071
 punished-0.066
 happiest-0.065
Token They
Token ID1119
Feature activation-5.19e+00

Pos logit contributions
uts+0.082
ges+0.081
aps+0.081
uttering+0.075
ore+0.073
Neg logit contributions
 hospital-0.112
 nurses-0.108
 dentist-0.100
 forgot-0.097
 happiest-0.097
Token raced
Token ID32897
Feature activation+5.55e+00

Pos logit contributions
 hospital+0.107
 dentist+0.107
 wounds+0.099
 cal+0.098
 nurses+0.096
Neg logit contributions
ges-0.099
 together-0.088
apped-0.086
uts-0.086
ed-0.086
Token each
Token ID1123
Feature activation-5.51e+00

Pos logit contributions
ges+0.071
uts+0.066
aps+0.064
 This+0.061
uttering+0.059
Neg logit contributions
 hospital-0.148
 dentist-0.143
 nurses-0.142
 punished-0.132
 happiest-0.131
Token other
Token ID584
Feature activation-2.40e+00

Pos logit contributions
 dentist+0.077
 hospital+0.074
 nurses+0.068
 cal+0.068
 wounds+0.068
Neg logit contributions
ges-0.142
 This-0.128
ed-0.123
 on-0.122
 together-0.121
Token and
Token ID290
Feature activation-5.07e+00

Pos logit contributions
 hospital+0.054
 nurses+0.050
 dentist+0.048
 forgot+0.047
 happiest+0.046
Neg logit contributions
ges-0.039
aps-0.039
uts-0.038
elt-0.036
uttering-0.035
Token competed
Token ID32440
Feature activation+1.19e+01

Pos logit contributions
 dentist+0.101
 hospital+0.100
 nurses+0.091
 wounds+0.086
 cal+0.086
Neg logit contributions
ges-0.103
 together-0.095
uts-0.094
ed-0.092
apped-0.091
Token to
Token ID284
Feature activation-1.58e+00

Pos logit contributions
aps+0.159
ges+0.155
uts+0.153
elt+0.151
ore+0.150
Neg logit contributions
 hospital-0.276
 nurses-0.259
 dentist-0.255
 doctors-0.249
 conductor-0.248
Token see
Token ID766
Feature activation+3.22e+00

Pos logit contributions
 dentist+0.030
 hospital+0.029
 nurses+0.028
ureen+0.026
 cal+0.025
Neg logit contributions
ges-0.031
 This-0.029
aps-0.029
uts-0.028
 together-0.027
Token who
Token ID508
Feature activation-3.18e+00

Pos logit contributions
ges+0.067
uts+0.063
aps+0.063
 This+0.060
ore+0.059
Neg logit contributions
 hospital-0.060
 dentist-0.057
 nurses-0.055
 punished-0.050
 forgot-0.050
Token the
Token ID262
Feature activation-8.12e+00

Pos logit contributions
 hospital+0.062
 dentist+0.058
 nurses+0.058
 wounds+0.052
 punished+0.052
Neg logit contributions
ges-0.063
uts-0.061
aps-0.060
 This-0.057
ore-0.056
Token fastest
Token ID14162
Feature activation-8.48e-01

Pos logit contributions
 hospital+0.126
 dentist+0.120
 nurses+0.117
 cal+0.107
 wounds+0.105
Neg logit contributions
ges-0.179
uts-0.179
 This-0.175
aps-0.168
uttering-0.165
Token new
Token ID649
Feature activation-3.42e+00

Pos logit contributions
 hospital+0.054
 dentist+0.051
 nurses+0.051
 punished+0.044
 wounds+0.044
Neg logit contributions
ges-0.078
uts-0.077
aps-0.075
 This-0.072
uttering-0.072
Token word
Token ID1573
Feature activation+3.60e+00

Pos logit contributions
 hospital+0.058
 dentist+0.053
 nurses+0.052
 mistake+0.047
 mess+0.045
Neg logit contributions
ges-0.079
uts-0.077
aps-0.074
uttering-0.071
 This-0.070
Token,
Token ID11
Feature activation-5.05e-01

Pos logit contributions
ges+0.056
aps+0.053
uttering+0.053
uts+0.051
uttered+0.049
Neg logit contributions
 hospital-0.087
 dentist-0.082
 nurses-0.079
 doctors-0.073
 happiest-0.073
Token '
Token ID705
Feature activation+2.28e+00

Pos logit contributions
 hospital+0.011
 nurses+0.010
 dentist+0.010
 punished+0.010
 wounds+0.009
Neg logit contributions
ges-0.009
uts-0.009
aps-0.008
 This-0.008
uttering-0.008
Tokenemer
Token ID24677
Feature activation+3.66e+00

Pos logit contributions
uts+0.044
aps+0.043
ges+0.041
uttering+0.040
uttered+0.039
Neg logit contributions
 nurses-0.042
 punished-0.041
 dentist-0.041
 hospital-0.040
 cal-0.037
Tokengency
Token ID4949
Feature activation+1.18e+01

Pos logit contributions
uts+0.073
aps+0.068
uttering+0.065
ges+0.063
 This+0.062
Neg logit contributions
 hospital-0.072
 nurses-0.070
 dentist-0.069
 punished-0.065
 wounds-0.062
Token',
Token ID3256
Feature activation+2.69e+00

Pos logit contributions
ges+0.156
uttering+0.151
aps+0.145
elt+0.139
uttered+0.138
Neg logit contributions
 hospital-0.292
 dentist-0.278
 nurses-0.275
 punished-0.265
ureen-0.262
Token so
Token ID523
Feature activation-1.55e+00

Pos logit contributions
ges+0.044
uts+0.043
aps+0.042
uttering+0.039
 This+0.039
Neg logit contributions
 hospital-0.061
 nurses-0.058
 dentist-0.058
 punished-0.053
 happiest-0.053
Token she
Token ID673
Feature activation-5.98e+00

Pos logit contributions
 hospital+0.037
 nurses+0.035
 dentist+0.035
 punished+0.032
 happiest+0.032
Neg logit contributions
ges-0.024
uts-0.023
aps-0.023
uttering-0.021
 This-0.021
Token whispered
Token ID25029
Feature activation+7.28e+00

Pos logit contributions
 hospital+0.103
 nurses+0.099
 dentist+0.097
 wounds+0.094
 happiest+0.092
Neg logit contributions
uts-0.120
ges-0.113
ed-0.111
aps-0.110
apped-0.108
Token the
Token ID262
Feature activation-7.01e+00

Pos logit contributions
ges+0.109
aps+0.108
uts+0.108
uttering+0.099
ore+0.097
Neg logit contributions
 hospital-0.167
 nurses-0.162
 dentist-0.158
 punished-0.149
 happiest-0.147
Token,
Token ID11
Feature activation-4.07e-01

Pos logit contributions
ges+0.045
aps+0.042
uts+0.041
uttering+0.039
 This+0.039
Neg logit contributions
 hospital-0.089
 dentist-0.084
 nurses-0.083
 punished-0.078
 forgot-0.078
Token so
Token ID523
Feature activation-8.73e-01

Pos logit contributions
 hospital+0.009
 dentist+0.009
 nurses+0.009
 punished+0.008
ureen+0.008
Neg logit contributions
uts-0.006
ges-0.006
 This-0.006
aps-0.006
 together-0.005
Token they
Token ID484
Feature activation-4.37e+00

Pos logit contributions
 hospital+0.020
 nurses+0.019
 dentist+0.019
 forgot+0.018
 doctors+0.018
Neg logit contributions
ges-0.014
aps-0.014
uts-0.014
ore-0.013
uttering-0.013
Token started
Token ID2067
Feature activation+5.14e+00

Pos logit contributions
 hospital+0.090
 dentist+0.088
 mistake+0.082
 happiest+0.081
 difference+0.081
Neg logit contributions
ges-0.079
uts-0.075
apped-0.075
 together-0.073
ed-0.073
Token to
Token ID284
Feature activation-2.76e+00

Pos logit contributions
uts+0.053
ges+0.052
aps+0.049
uttering+0.046
ore+0.044
Neg logit contributions
 hospital-0.149
 nurses-0.141
 dentist-0.139
 punished-0.134
 wounds-0.134
Token argue
Token ID7267
Feature activation+1.18e+01

Pos logit contributions
 hospital+0.052
 dentist+0.051
 nurses+0.048
ureen+0.047
 happiest+0.046
Neg logit contributions
ges-0.054
 This-0.051
aps-0.050
uts-0.050
 together-0.047
Token.
Token ID13
Feature activation+5.68e+00

Pos logit contributions
elt+0.217
uttering+0.213
ges+0.211
uttered+0.200
ectar+0.197
Neg logit contributions
 hospital-0.238
 floor-0.220
 mess-0.217
 dentist-0.216
 mistake-0.210
Token
Token ID198
Feature activation+3.32e+00

Pos logit contributions
uttering+0.101
ges+0.095
uts+0.088
aps+0.085
ore+0.084
Neg logit contributions
 hospital-0.118
 tears-0.103
 dentist-0.101
 forgot-0.100
 nurses-0.099
Token
Token ID198
Feature activation+3.02e+00

Pos logit contributions
uttering+0.052
ges+0.051
elt+0.050
uts+0.050
apped+0.047
Neg logit contributions
 hospital-0.073
 dentist-0.068
 nurses-0.064
 forgot-0.063
 mistake-0.061
TokenTim
Token ID14967
Feature activation-6.83e+00

Pos logit contributions
ges+0.063
aps+0.062
uts+0.061
ore+0.060
uttering+0.059
Neg logit contributions
 hospital-0.053
 nurses-0.050
 punished-0.047
 dentist-0.047
 forgot-0.046
Token had
Token ID550
Feature activation-7.29e-01

Pos logit contributions
 hospital+0.167
 nurses+0.163
 dentist+0.163
 happiest+0.157
 wounds+0.156
Neg logit contributions
ges-0.092
ed-0.091
uts-0.088
 together-0.084
elt-0.082
Token a
Token ID257
Feature activation-4.18e+00

Pos logit contributions
 nurses+0.105
 dentist+0.104
 hospital+0.103
 punished+0.096
 forgot+0.092
Neg logit contributions
ges-0.172
 This-0.171
aps-0.169
uts-0.169
ore-0.160
Token pop
Token ID1461
Feature activation+4.41e+00

Pos logit contributions
 hospital+0.066
 nurses+0.065
 dentist+0.061
 wounds+0.059
 punished+0.057
Neg logit contributions
ges-0.089
aps-0.087
uts-0.085
 This-0.085
ore-0.084
Token-
Token ID12
Feature activation-6.59e-01

Pos logit contributions
ges+0.068
uts+0.068
aps+0.068
uttering+0.067
uttered+0.062
Neg logit contributions
 hospital-0.103
 dentist-0.096
 nurses-0.094
 mess-0.091
 punished-0.089
Tokenpop
Token ID12924
Feature activation+1.07e+01

Pos logit contributions
 hospital+0.014
 dentist+0.013
 nurses+0.013
 punished+0.012
 wounds+0.012
Neg logit contributions
uts-0.011
ges-0.011
aps-0.011
uttering-0.011
 This-0.010
Token-
Token ID12
Feature activation+9.06e-02

Pos logit contributions
ges+0.187
uttering+0.187
aps+0.185
uts+0.180
uttered+0.177
Neg logit contributions
 hospital-0.233
 nurses-0.218
 dentist-0.216
 doctors-0.200
 punished-0.199
Tokenpop
Token ID12924
Feature activation+1.17e+01

Pos logit contributions
uts+0.002
ges+0.002
aps+0.002
uttering+0.001
apping+0.001
Neg logit contributions
 hospital-0.002
 dentist-0.002
 nurses-0.002
 punished-0.002
 cal-0.002
Token sound
Token ID2128
Feature activation+6.33e+00

Pos logit contributions
uttering+0.207
aps+0.204
ges+0.204
uts+0.197
uttered+0.193
Neg logit contributions
 hospital-0.257
 nurses-0.239
 dentist-0.234
 punished-0.219
 doctors-0.218
Token.
Token ID13
Feature activation+6.18e+00

Pos logit contributions
ges+0.087
aps+0.086
uts+0.083
uttering+0.079
 This+0.076
Neg logit contributions
 hospital-0.159
 dentist-0.149
 nurses-0.149
 punished-0.143
 cal-0.141
Token She
Token ID1375
Feature activation-6.83e+00

Pos logit contributions
uttering+0.115
ges+0.112
uts+0.108
aps+0.106
uttered+0.100
Neg logit contributions
 hospital-0.128
 dentist-0.123
 nurses-0.114
 boo-0.103
 doctors-0.102
Token laughs
Token ID22051
Feature activation+5.34e+00

Pos logit contributions
 nurses+0.120
 hospital+0.118
 dentist+0.117
 happiest+0.112
 wounds+0.105
Neg logit contributions
ges-0.132
aps-0.127
uts-0.127
 This-0.123
ore-0.120
Token.
Token ID13
Feature activation+6.30e+00

Pos logit contributions
ges+0.109
aps+0.107
ore+0.104
uttering+0.104
uts+0.103
Neg logit contributions
 hospital-0.097
 dentist-0.089
 nurses-0.088
 doctors-0.083
 boo-0.082
Token sat
Token ID3332
Feature activation+4.79e+00

Pos logit contributions
 dentist+0.070
 nurses+0.069
 hospital+0.068
 stret+0.066
 wounds+0.062
Neg logit contributions
ges-0.075
 together-0.074
uts-0.074
 This-0.074
ed-0.074
Token down
Token ID866
Feature activation+3.60e+00

Pos logit contributions
ges+0.054
uts+0.053
aps+0.050
 This+0.046
uttering+0.045
Neg logit contributions
 hospital-0.134
 dentist-0.128
 nurses-0.128
 punished-0.119
 happiest-0.119
Token in
Token ID287
Feature activation-1.30e+00

Pos logit contributions
uts+0.053
ges+0.051
aps+0.048
uttering+0.046
elt+0.045
Neg logit contributions
 hospital-0.091
 dentist-0.084
 nurses-0.082
 cal-0.079
 punished-0.078
Token a
Token ID257
Feature activation-3.35e+00

Pos logit contributions
 hospital+0.025
 nurses+0.024
 dentist+0.023
 happiest+0.021
 punished+0.021
Neg logit contributions
ges-0.026
aps-0.025
uts-0.025
 This-0.024
ore-0.023
Token s
Token ID264
Feature activation+5.17e+00

Pos logit contributions
 nurses+0.061
 wounds+0.058
 forgot+0.056
 punished+0.055
 dentist+0.054
Neg logit contributions
ore-0.061
 This-0.061
aps-0.057
ges-0.056
 together-0.053
Tokenulk
Token ID12171
Feature activation+1.17e+01

Pos logit contributions
ges+0.096
uts+0.095
aps+0.088
uttering+0.087
 This+0.086
Neg logit contributions
 hospital-0.108
 dentist-0.103
 nurses-0.101
 punished-0.093
 cal-0.091
Token.
Token ID13
Feature activation+5.67e+00

Pos logit contributions
uts+0.168
aps+0.160
ges+0.157
uttering+0.156
elt+0.153
Neg logit contributions
 hospital-0.286
 dentist-0.266
 nurses-0.262
 punished-0.248
 forgot-0.247
Token Suddenly
Token ID24975
Feature activation+1.67e+00

Pos logit contributions
ges+0.085
uts+0.082
aps+0.078
uttering+0.077
ore+0.071
Neg logit contributions
 hospital-0.140
 dentist-0.134
 nurses-0.130
 happiest-0.120
 punished-0.119
Token he
Token ID339
Feature activation-5.45e+00

Pos logit contributions
ges+0.016
uts+0.015
aps+0.014
 This+0.014
ore+0.012
Neg logit contributions
 hospital-0.048
 dentist-0.047
 nurses-0.046
 wounds-0.045
 punished-0.045
Token remembered
Token ID12086
Feature activation+9.06e-01

Pos logit contributions
 nurses+0.100
 hospital+0.099
 dentist+0.088
 wounds+0.088
 stret+0.085
Neg logit contributions
ges-0.110
ed-0.108
uts-0.106
aps-0.100
elt-0.097
Token where
Token ID810
Feature activation+2.45e+00

Pos logit contributions
ges+0.016
uts+0.015
aps+0.015
 This+0.014
ore+0.014
Neg logit contributions
 hospital-0.019
 nurses-0.018
 dentist-0.018
 happiest-0.017
 wounds-0.017
Token answered
Token ID9373
Feature activation+6.85e+00

Pos logit contributions
 hospital+0.015
 dentist+0.015
 nurses+0.015
 wounds+0.013
 happiest+0.013
Neg logit contributions
ges-0.017
aps-0.016
uts-0.016
 This-0.016
apped-0.015
Token,
Token ID11
Feature activation+1.25e+00

Pos logit contributions
uts+0.105
ges+0.104
aps+0.099
uttering+0.095
ore+0.091
Neg logit contributions
 hospital-0.160
 nurses-0.152
 dentist-0.149
 cal-0.143
 wounds-0.139
Token "
Token ID366
Feature activation+3.47e+00

Pos logit contributions
ges+0.021
uts+0.020
aps+0.020
ore+0.019
uttering+0.019
Neg logit contributions
 hospital-0.028
 dentist-0.026
 nurses-0.025
 cal-0.025
 tears-0.024
TokenI
Token ID40
Feature activation-2.53e+00

Pos logit contributions
ges+0.064
uts+0.064
aps+0.062
uttering+0.059
ore+0.058
Neg logit contributions
 hospital-0.068
 nurses-0.067
 dentist-0.065
 happiest-0.061
 punished-0.060
Token take
Token ID1011
Feature activation+2.80e+00

Pos logit contributions
 hospital+0.055
 dentist+0.052
 nurses+0.052
 punished+0.048
 happiest+0.047
Neg logit contributions
ges-0.044
uts-0.043
aps-0.042
 This-0.039
uttering-0.039
Token trips
Token ID13229
Feature activation+1.17e+01

Pos logit contributions
uts+0.053
ges+0.053
aps+0.051
uttering+0.050
apping+0.048
Neg logit contributions
 hospital-0.056
 nurses-0.054
 dentist-0.052
 punished-0.048
 cal-0.047
Token because
Token ID780
Feature activation-8.07e-01

Pos logit contributions
ges+0.172
uts+0.169
aps+0.168
uttering+0.162
uttered+0.157
Neg logit contributions
 hospital-0.270
 nurses-0.256
 dentist-0.250
 punished-0.237
 cal-0.232
Token I
Token ID314
Feature activation-3.03e+00

Pos logit contributions
 hospital+0.021
 nurses+0.020
 dentist+0.020
 punished+0.019
 happiest+0.019
Neg logit contributions
ges-0.011
uts-0.010
aps-0.010
uttering-0.009
 This-0.009
Token want
Token ID765
Feature activation+3.99e+00

Pos logit contributions
 hospital+0.069
 dentist+0.066
 nurses+0.065
 punished+0.060
 happiest+0.059
Neg logit contributions
ges-0.050
uts-0.049
aps-0.047
 This-0.044
uttering-0.044
Token to
Token ID284
Feature activation-2.06e+00

Pos logit contributions
ges+0.042
uts+0.039
aps+0.039
uttering+0.036
apped+0.034
Neg logit contributions
 hospital-0.116
 nurses-0.112
 dentist-0.112
 doctors-0.103
 wounds-0.103
Token explore
Token ID7301
Feature activation+5.70e+00

Pos logit contributions
 hospital+0.036
 nurses+0.035
 dentist+0.035
 punished+0.030
 happiest+0.030
Neg logit contributions
ges-0.043
uts-0.043
aps-0.042
 This-0.040
ore-0.039
Token opinions
Token ID9317
Feature activation+6.62e+00

Pos logit contributions
 hospital+0.019
 nurses+0.018
 dentist+0.018
 wounds+0.016
 mess+0.015
Neg logit contributions
ges-0.025
uts-0.024
aps-0.023
uttering-0.023
 This-0.023
Token.
Token ID13
Feature activation+5.28e+00

Pos logit contributions
ges+0.108
uttering+0.102
aps+0.101
uts+0.097
uttered+0.094
Neg logit contributions
 hospital-0.149
 dentist-0.140
 nurses-0.139
 mess-0.133
 happiest-0.130
Token They
Token ID1119
Feature activation-5.34e+00

Pos logit contributions
uttering+0.086
ges+0.081
uts+0.076
ed+0.072
uttered+0.072
Neg logit contributions
 hospital-0.124
 dentist-0.120
 nurses-0.113
 doctors-0.104
 conductor-0.102
Token start
Token ID923
Feature activation+4.17e+00

Pos logit contributions
 dentist+0.102
 hospital+0.101
 nurses+0.099
 wounds+0.094
 doctors+0.089
Neg logit contributions
ges-0.095
uts-0.092
 This-0.089
aps-0.089
 together-0.086
Token to
Token ID284
Feature activation-2.58e+00

Pos logit contributions
uts+0.050
ges+0.049
aps+0.047
uttering+0.044
ore+0.044
Neg logit contributions
 hospital-0.113
 nurses-0.106
 dentist-0.106
 mistake-0.101
 mess-0.101
Token argue
Token ID7267
Feature activation+1.16e+01

Pos logit contributions
 hospital+0.045
 dentist+0.044
 nurses+0.043
 happiest+0.038
 wounds+0.038
Neg logit contributions
ges-0.055
uts-0.052
aps-0.052
 This-0.051
uttering-0.048
Token.
Token ID13
Feature activation+6.25e+00

Pos logit contributions
ges+0.228
uttering+0.218
uts+0.215
elt+0.212
aps+0.210
Neg logit contributions
 hospital-0.224
 dentist-0.206
 mess-0.201
 nurses-0.200
 mistake-0.194
Token They
Token ID1119
Feature activation-5.26e+00

Pos logit contributions
uttering+0.107
ges+0.103
uts+0.099
ed+0.095
ore+0.094
Neg logit contributions
 hospital-0.138
 dentist-0.131
 nurses-0.123
 doctors-0.115
 tears-0.113
Token say
Token ID910
Feature activation+2.25e+00

Pos logit contributions
 dentist+0.113
 nurses+0.109
 hospital+0.108
 doctors+0.099
 wounds+0.098
Neg logit contributions
ges-0.082
 This-0.080
uts-0.076
aps-0.076
 together-0.074
Token,
Token ID11
Feature activation-6.18e-01

Pos logit contributions
ges+0.038
uttering+0.035
aps+0.035
uts+0.035
ore+0.034
Neg logit contributions
 hospital-0.052
 dentist-0.049
 nurses-0.049
 doctors-0.045
 cal-0.045
Token "
Token ID366
Feature activation+3.26e+00

Pos logit contributions
 hospital+0.012
 dentist+0.012
 nurses+0.011
 conductor+0.011
 cal+0.010
Neg logit contributions
ges-0.012
uttering-0.011
aps-0.011
ore-0.011
uts-0.011
Token time
Token ID640
Feature activation+4.65e+00

Pos logit contributions
 hospital+0.047
 nurses+0.045
 dentist+0.044
 punished+0.039
 wounds+0.039
Neg logit contributions
ges-0.061
uts-0.059
aps-0.058
 This-0.056
ore-0.055
Token.
Token ID13
Feature activation+5.14e+00

Pos logit contributions
ges+0.071
aps+0.067
uts+0.067
uttering+0.066
ore+0.063
Neg logit contributions
 hospital-0.107
 dentist-0.100
 nurses-0.099
 forgot-0.096
 mess-0.093
Token They
Token ID1119
Feature activation-5.56e+00

Pos logit contributions
uttering+0.089
ges+0.085
uts+0.081
aps+0.077
ore+0.076
Neg logit contributions
 hospital-0.110
 dentist-0.107
 nurses-0.099
 cal-0.090
 doctors-0.090
Token started
Token ID2067
Feature activation+5.51e+00

Pos logit contributions
 dentist+0.109
 hospital+0.109
 nurses+0.105
 wounds+0.098
 doctors+0.096
Neg logit contributions
ges-0.101
uts-0.097
 This-0.094
aps-0.093
apped-0.090
Token to
Token ID284
Feature activation-2.61e+00

Pos logit contributions
ges+0.057
uts+0.057
aps+0.053
ore+0.048
uttering+0.048
Neg logit contributions
 hospital-0.158
 nurses-0.151
 dentist-0.149
 punished-0.142
 cal-0.140
Token argue
Token ID7267
Feature activation+1.16e+01

Pos logit contributions
 hospital+0.047
 dentist+0.046
 nurses+0.045
 wounds+0.040
 doctors+0.039
Neg logit contributions
ges-0.054
uts-0.052
aps-0.051
 This-0.050
ore-0.048
Token and
Token ID290
Feature activation-2.29e+00

Pos logit contributions
ges+0.237
uttering+0.222
elt+0.220
ore+0.217
aps+0.217
Neg logit contributions
 hospital-0.212
 dentist-0.190
 mess-0.189
 floor-0.185
 mistake-0.184
Token fight
Token ID1907
Feature activation+8.90e+00

Pos logit contributions
 hospital+0.040
 dentist+0.038
 nurses+0.037
 happiest+0.033
 wounds+0.033
Neg logit contributions
ges-0.050
uts-0.048
aps-0.047
 This-0.046
ore-0.045
Token over
Token ID625
Feature activation+2.18e+00

Pos logit contributions
ges+0.132
uts+0.130
aps+0.125
uttering+0.123
elt+0.118
Neg logit contributions
 hospital-0.217
 nurses-0.204
 dentist-0.202
 punished-0.192
 mess-0.192
Token who
Token ID508
Feature activation-2.70e+00

Pos logit contributions
ges+0.040
uts+0.039
aps+0.038
 This+0.036
ore+0.035
Neg logit contributions
 hospital-0.045
 dentist-0.043
 nurses-0.043
 punished-0.039
 happiest-0.038
Token could
Token ID714
Feature activation+1.10e+00

Pos logit contributions
 hospital+0.051
 dentist+0.049
 nurses+0.048
 wounds+0.045
 happiest+0.044
Neg logit contributions
ges-0.055
uts-0.052
aps-0.050
 This-0.049
ore-0.048
Token look
Token ID804
Feature activation+2.69e+00

Pos logit contributions
 hospital+0.034
 dentist+0.033
 nurses+0.032
 punished+0.029
 wounds+0.029
Neg logit contributions
ges-0.036
 This-0.033
aps-0.033
ore-0.032
uts-0.031
Token at
Token ID379
Feature activation+1.48e+00

Pos logit contributions
ges+0.029
uts+0.028
aps+0.026
 This+0.024
uttering+0.023
Neg logit contributions
 hospital-0.077
 nurses-0.074
 dentist-0.073
 punished-0.069
 happiest-0.069
Token it
Token ID340
Feature activation+1.45e-01

Pos logit contributions
ges+0.028
uts+0.028
aps+0.027
 This+0.026
uttering+0.026
Neg logit contributions
 hospital-0.030
 nurses-0.028
 dentist-0.028
 punished-0.025
 happiest-0.025
Token from
Token ID422
Feature activation-2.03e+00

Pos logit contributions
ges+0.002
uts+0.002
aps+0.002
elt+0.002
uttering+0.002
Neg logit contributions
 hospital-0.003
 nurses-0.003
 dentist-0.003
 happiest-0.003
 mess-0.003
Token all
Token ID477
Feature activation-1.69e+00

Pos logit contributions
 hospital+0.041
 dentist+0.039
 nurses+0.039
 punished+0.035
 happiest+0.034
Neg logit contributions
ges-0.039
uts-0.038
aps-0.037
 This-0.035
uttering-0.035
Token angles
Token ID18333
Feature activation+1.16e+01

Pos logit contributions
 hospital+0.032
 nurses+0.030
 dentist+0.030
 punished+0.027
 happiest+0.027
Neg logit contributions
ges-0.034
uts-0.034
aps-0.033
 This-0.031
uttering-0.031
Token.
Token ID13
Feature activation+5.20e+00

Pos logit contributions
 Sorry+0.208
 This+0.198
ore+0.193
iffs+0.191
elt+0.187
Neg logit contributions
 hospital-0.208
 floor-0.194
 mistake-0.193
 dentist-0.192
 tears-0.188
Token It
Token ID632
Feature activation-3.68e+00

Pos logit contributions
uts+0.092
ges+0.087
uttering+0.084
ore+0.079
aps+0.076
Neg logit contributions
 hospital-0.111
 nurses-0.099
 tears-0.097
 dentist-0.096
 happiest-0.093
Token was
Token ID373
Feature activation-1.41e+00

Pos logit contributions
 hospital+0.080
 dentist+0.080
 nurses+0.076
 wounds+0.074
 stret+0.070
Neg logit contributions
uts-0.061
ges-0.060
ed-0.059
 together-0.058
uttered-0.056
Token so
Token ID523
Feature activation-1.75e+00

Pos logit contributions
 hospital+0.031
 nurses+0.030
 dentist+0.030
 wounds+0.028
 punished+0.027
Neg logit contributions
ges-0.023
uts-0.023
aps-0.022
 This-0.021
uttering-0.021
Token colorful
Token ID20239
Feature activation+3.17e+00

Pos logit contributions
 hospital+0.024
 dentist+0.022
 nurses+0.021
 forgot+0.020
 cal+0.020
Neg logit contributions
ges-0.045
uts-0.042
aps-0.042
ore-0.041
uttering-0.040
Token to
Token ID284
Feature activation-3.80e+00

Pos logit contributions
ges+0.015
uts+0.014
aps+0.014
 This+0.013
ore+0.013
Neg logit contributions
 hospital-0.021
 dentist-0.020
 nurses-0.020
 wounds-0.018
 punished-0.018
Token draw
Token ID3197
Feature activation+6.47e+00

Pos logit contributions
 hospital+0.066
 dentist+0.062
 nurses+0.060
 punished+0.054
 forgot+0.054
Neg logit contributions
ges-0.083
uts-0.083
aps-0.080
uttering-0.077
ore-0.076
Token and
Token ID290
Feature activation-3.33e+00

Pos logit contributions
ges+0.098
uts+0.094
aps+0.092
uttering+0.088
ore+0.086
Neg logit contributions
 hospital-0.155
 dentist-0.148
 nurses-0.146
 happiest-0.135
 doctors-0.135
Token started
Token ID2067
Feature activation+4.90e+00

Pos logit contributions
 hospital+0.059
 dentist+0.056
 nurses+0.055
 happiest+0.049
 wounds+0.049
Neg logit contributions
ges-0.070
uts-0.069
aps-0.066
 This-0.065
ore-0.062
Token to
Token ID284
Feature activation-2.93e+00

Pos logit contributions
uts+0.063
ges+0.063
aps+0.060
uttering+0.058
ore+0.057
Neg logit contributions
 hospital-0.128
 dentist-0.119
 nurses-0.118
 mess-0.115
 mistake-0.114
Token argue
Token ID7267
Feature activation+1.16e+01

Pos logit contributions
 hospital+0.049
 nurses+0.047
 dentist+0.046
 wounds+0.041
 happiest+0.040
Neg logit contributions
ges-0.065
uts-0.063
aps-0.062
 This-0.060
ore-0.058
Token.
Token ID13
Feature activation+6.06e+00

Pos logit contributions
ges+0.179
uttering+0.169
elt+0.165
aps+0.165
uts+0.165
Neg logit contributions
 hospital-0.274
 dentist-0.253
 nurses-0.249
 mess-0.240
 floor-0.236
Token\n
Token ID198
Feature activation+3.97e+00

Pos logit contributions
ges+0.094
uts+0.090
uttering+0.088
aps+0.085
ore+0.079
Neg logit contributions
 hospital-0.147
 dentist-0.136
 nurses-0.134
 punished-0.121
 forgot-0.121
Token\n
Token ID198
Feature activation+3.64e+00

Pos logit contributions
ges+0.058
uttering+0.055
uts+0.054
elt+0.053
apped+0.050
Neg logit contributions
 hospital-0.093
 dentist-0.088
 nurses-0.083
 mistake-0.080
ureen-0.079
TokenThe
Token ID464
Feature activation-7.89e+00

Pos logit contributions
ges+0.076
uts+0.073
aps+0.072
uttering+0.069
ore+0.068
Neg logit contributions
 hospital-0.068
 nurses-0.063
 dentist-0.063
 punished-0.057
 forgot-0.056
Token argument
Token ID4578
Feature activation+5.93e+00

Pos logit contributions
 hospital+0.121
ureen+0.120
 punished+0.119
 stret+0.118
 wounds+0.115
Neg logit contributions
aps-0.143
 Yes-0.141
ges-0.140
 on-0.136
uttering-0.134
Token were
Token ID547
Feature activation-1.73e+00

Pos logit contributions
 hospital+0.105
 dentist+0.104
 nurses+0.101
 wounds+0.093
 mistake+0.089
Neg logit contributions
ges-0.095
uts-0.092
aps-0.087
 This-0.085
apped-0.084
Token both
Token ID1111
Feature activation-4.98e+00

Pos logit contributions
 hospital+0.035
 dentist+0.034
 nurses+0.033
 wounds+0.030
ureen+0.029
Neg logit contributions
ges-0.032
uts-0.031
aps-0.031
uttering-0.029
apped-0.028
Token too
Token ID1165
Feature activation-5.38e-01

Pos logit contributions
 hospital+0.095
 dentist+0.092
 nurses+0.091
 wounds+0.081
 punished+0.079
Neg logit contributions
ges-0.098
uts-0.096
aps-0.095
 This-0.089
uttering-0.088
Token young
Token ID1862
Feature activation+2.07e+00

Pos logit contributions
 hospital+0.010
 nurses+0.010
 dentist+0.009
 punished+0.009
 forgot+0.009
Neg logit contributions
ges-0.011
aps-0.010
uts-0.010
 This-0.010
uttering-0.010
Token to
Token ID284
Feature activation-3.02e+00

Pos logit contributions
ges+0.026
uts+0.026
aps+0.025
uttering+0.024
 This+0.023
Neg logit contributions
 hospital-0.055
 nurses-0.052
 dentist-0.051
 punished-0.049
 forgot-0.049
Token argue
Token ID7267
Feature activation+1.16e+01

Pos logit contributions
 hospital+0.053
 dentist+0.051
 nurses+0.050
 wounds+0.044
 happiest+0.044
Neg logit contributions
ges-0.063
uts-0.062
aps-0.061
 This-0.059
ore-0.057
Token over
Token ID625
Feature activation+3.72e+00

Pos logit contributions
ges+0.193
uttering+0.179
uts+0.178
elt+0.175
aps+0.174
Neg logit contributions
 hospital-0.259
 dentist-0.238
 nurses-0.235
 mess-0.228
 mistake-0.225
Token the
Token ID262
Feature activation-7.24e+00

Pos logit contributions
ges+0.066
uts+0.064
aps+0.063
uttering+0.059
 This+0.059
Neg logit contributions
 hospital-0.080
 nurses-0.076
 dentist-0.076
 punished-0.069
 wounds-0.068
Token television
Token ID5581
Feature activation+6.54e+00

Pos logit contributions
 hospital+0.104
 nurses+0.099
 dentist+0.099
 forgot+0.089
 punished+0.089
Neg logit contributions
ges-0.165
aps-0.161
uts-0.160
 This-0.155
ore-0.154
Token.
Token ID13
Feature activation+5.96e+00

Pos logit contributions
ges+0.092
uts+0.085
aps+0.084
uttering+0.082
elt+0.076
Neg logit contributions
 hospital-0.169
 dentist-0.158
 nurses-0.157
 happiest-0.146
 punished-0.146
Token\n
Token ID198
Feature activation+3.94e+00

Pos logit contributions
ges+0.093
uttering+0.092
uts+0.088
aps+0.083
ore+0.080
Neg logit contributions
 hospital-0.142
 dentist-0.134
 nurses-0.128
 cal-0.117
 doctors-0.116
Token this
Token ID428
Feature activation-5.32e+00

Pos logit contributions
 hospital+0.022
 dentist+0.021
 nurses+0.021
 wounds+0.020
 punished+0.019
Neg logit contributions
ges-0.016
uts-0.015
aps-0.015
 This-0.014
uttering-0.014
Token time
Token ID640
Feature activation+3.64e+00

Pos logit contributions
 hospital+0.099
 nurses+0.098
 dentist+0.094
 wounds+0.090
 punished+0.086
Neg logit contributions
ges-0.102
uts-0.101
aps-0.100
 This-0.099
ore-0.096
Token she
Token ID673
Feature activation-6.42e+00

Pos logit contributions
ges+0.058
uts+0.056
aps+0.055
uttering+0.052
ore+0.051
Neg logit contributions
 hospital-0.086
 nurses-0.082
 dentist-0.081
 punished-0.074
 forgot-0.074
Token wanted
Token ID2227
Feature activation+3.87e+00

Pos logit contributions
 nurses+0.114
 hospital+0.112
 dentist+0.107
 wounds+0.104
 stret+0.099
Neg logit contributions
uts-0.127
ges-0.123
aps-0.121
ed-0.118
apped-0.116
Token to
Token ID284
Feature activation-2.05e+00

Pos logit contributions
ges+0.033
uts+0.032
aps+0.030
 This+0.027
uttering+0.025
Neg logit contributions
 hospital-0.118
 nurses-0.114
 dentist-0.114
 punished-0.107
 happiest-0.106
Token compete
Token ID9320
Feature activation+1.15e+01

Pos logit contributions
 hospital+0.037
 nurses+0.035
 dentist+0.034
 wounds+0.031
 punished+0.031
Neg logit contributions
ges-0.043
uts-0.042
aps-0.041
 This-0.040
ore-0.039
Token with
Token ID351
Feature activation+1.15e+00

Pos logit contributions
ges+0.179
uts+0.167
uttered+0.162
elt+0.161
uttering+0.161
Neg logit contributions
 hospital-0.261
 dentist-0.250
 nurses-0.240
 happiest-0.234
 punished-0.232
Token her
Token ID607
Feature activation-5.28e+00

Pos logit contributions
ges+0.019
uts+0.018
aps+0.018
 This+0.017
uttering+0.016
Neg logit contributions
 hospital-0.026
 dentist-0.025
 nurses-0.024
 forgot-0.023
 punished-0.023
Token brother
Token ID3956
Feature activation-2.38e-01

Pos logit contributions
 hospital+0.088
 dentist+0.082
 nurses+0.082
 happiest+0.072
 punished+0.072
Neg logit contributions
ges-0.121
uts-0.117
aps-0.114
uttering-0.110
 This-0.110
Token.
Token ID13
Feature activation+5.41e+00

Pos logit contributions
 hospital+0.006
 nurses+0.006
 dentist+0.006
 punished+0.005
 happiest+0.005
Neg logit contributions
ges-0.003
uts-0.003
aps-0.003
uttering-0.003
ore-0.003
Token Her
Token ID2332
Feature activation-7.51e+00

Pos logit contributions
ges+0.085
uttering+0.083
uts+0.079
aps+0.076
ore+0.070
Neg logit contributions
 hospital-0.131
 dentist-0.122
 nurses-0.121
 doctors-0.106
 happiest-0.105
Token.
Token ID13
Feature activation+5.59e+00

Pos logit contributions
ges+0.031
uts+0.031
aps+0.030
uttering+0.027
 This+0.027
Neg logit contributions
 hospital-0.068
 nurses-0.064
 dentist-0.064
 punished-0.060
 wounds-0.059
Token\n
Token ID198
Feature activation+3.80e+00

Pos logit contributions
ges+0.088
uttering+0.087
uts+0.085
aps+0.080
ore+0.076
Neg logit contributions
 hospital-0.131
 dentist-0.125
 nurses-0.122
 doctors-0.109
 cal-0.109
TokenSuddenly
Token ID38582
Feature activation+2.39e+00

Pos logit contributions
ges+0.047
uts+0.046
uttering+0.041
aps+0.040
elt+0.039
Neg logit contributions
 hospital-0.101
 dentist-0.098
 nurses-0.095
 cal-0.089
 wounds-0.088
Token,
Token ID11
Feature activation-7.05e-01

Pos logit contributions
ges+0.029
uts+0.028
aps+0.027
 This+0.025
uttering+0.025
Neg logit contributions
 hospital-0.064
 dentist-0.061
 nurses-0.061
 punished-0.057
 happiest-0.057
Token it
Token ID340
Feature activation-2.48e+00

Pos logit contributions
 hospital+0.015
 dentist+0.015
 nurses+0.014
 wounds+0.014
 punished+0.014
Neg logit contributions
ges-0.012
uts-0.011
aps-0.011
 This-0.011
ore-0.010
Token shr
Token ID10157
Feature activation+1.15e+01

Pos logit contributions
 hospital+0.043
 nurses+0.042
 dentist+0.041
 wounds+0.037
 happiest+0.037
Neg logit contributions
ges-0.052
uts-0.052
aps-0.051
ore-0.048
 This-0.048
Tokenank
Token ID962
Feature activation+2.83e+00

Pos logit contributions
ges+0.133
 This+0.125
uts+0.125
ore+0.118
aps+0.113
Neg logit contributions
 hospital-0.291
 dentist-0.279
 nurses-0.277
 mess-0.273
 punished-0.272
Token!
Token ID0
Feature activation+5.44e+00

Pos logit contributions
uts+0.039
ges+0.039
aps+0.038
uttering+0.035
 This+0.035
Neg logit contributions
 hospital-0.070
 nurses-0.067
 dentist-0.066
 punished-0.062
 cal-0.061
Token The
Token ID383
Feature activation-7.08e+00

Pos logit contributions
ges+0.095
uttering+0.092
uts+0.092
aps+0.090
ore+0.083
Neg logit contributions
 hospital-0.114
 dentist-0.108
 nurses-0.105
 doctors-0.095
 cal-0.095
Token goose
Token ID37246
Feature activation-1.95e+00

Pos logit contributions
 hospital+0.104
 nurses+0.098
 dentist+0.097
 punished+0.092
 forgot+0.090
Neg logit contributions
ges-0.159
aps-0.156
uts-0.152
ore-0.150
uttering-0.147
Token was
Token ID373
Feature activation-1.88e+00

Pos logit contributions
 hospital+0.034
 nurses+0.034
 dentist+0.034
 wounds+0.029
 doctors+0.029
Neg logit contributions
ges-0.041
uts-0.040
aps-0.039
ore-0.038
 This-0.038
Token.
Token ID13
Feature activation+5.90e+00

Pos logit contributions
ges+0.064
uts+0.060
aps+0.060
uttering+0.059
ore+0.059
Neg logit contributions
 hospital-0.072
 dentist-0.068
 nurses-0.068
 doctors-0.063
 mess-0.063
Token\n
Token ID198
Feature activation+3.30e+00

Pos logit contributions
ges+0.092
uttering+0.090
uts+0.090
ore+0.084
aps+0.082
Neg logit contributions
 hospital-0.136
 dentist-0.127
 nurses-0.127
 doctors-0.118
 forgot-0.116
Token\n
Token ID198
Feature activation+2.94e+00

Pos logit contributions
ges+0.039
uts+0.037
uttering+0.034
aps+0.033
elt+0.033
Neg logit contributions
 hospital-0.091
 dentist-0.087
 nurses-0.084
 wounds-0.081
 cal-0.079
TokenBut
Token ID1537
Feature activation-5.86e-01

Pos logit contributions
ges+0.048
uts+0.047
aps+0.045
uttering+0.042
 This+0.042
Neg logit contributions
 hospital-0.068
 dentist-0.065
 nurses-0.065
 happiest-0.059
 punished-0.059
Token when
Token ID618
Feature activation+1.70e-01

Pos logit contributions
 hospital+0.012
 dentist+0.011
 nurses+0.011
 punished+0.010
 wounds+0.010
Neg logit contributions
ges-0.011
uts-0.011
aps-0.011
 This-0.010
uttering-0.010
Token their
Token ID511
Feature activation-1.17e+01

Pos logit contributions
ges+0.003
uts+0.003
aps+0.002
uttering+0.002
ore+0.002
Neg logit contributions
 hospital-0.004
 dentist-0.004
 nurses-0.004
 punished-0.004
 forgot-0.004
Token teacher
Token ID4701
Feature activation-1.26e+00

Pos logit contributions
 hospital+0.174
 nurses+0.167
 dentist+0.164
 punished+0.154
 cal+0.149
Neg logit contributions
aps-0.270
ges-0.265
uts-0.261
uttering-0.248
 This-0.247
Token came
Token ID1625
Feature activation+2.55e+00

Pos logit contributions
 hospital+0.024
 nurses+0.023
 wounds+0.022
 dentist+0.022
 happiest+0.021
Neg logit contributions
ges-0.025
uts-0.024
aps-0.022
 This-0.022
ed-0.022
Token,
Token ID11
Feature activation-1.85e+00

Pos logit contributions
uts+0.053
 This+0.047
unch+0.045
owers+0.045
elt+0.044
Neg logit contributions
 hospital-0.048
 nurses-0.046
 dentist-0.045
 wounds-0.042
 doctors-0.042
Token she
Token ID673
Feature activation-5.80e+00

Pos logit contributions
 hospital+0.047
 dentist+0.046
 nurses+0.045
 happiest+0.043
 punished+0.042
Neg logit contributions
ges-0.024
uts-0.024
aps-0.023
 This-0.021
uttering-0.020
Token did
Token ID750
Feature activation+2.18e+00

Pos logit contributions
 hospital+0.118
 nurses+0.117
 difference+0.112
 happiest+0.111
 wounds+0.109
Neg logit contributions
ed-0.092
ges-0.090
 on-0.085
uts-0.084
elt-0.082
Token mess
Token ID2085
Feature activation+7.76e+00

Pos logit contributions
 hospital+0.066
 nurses+0.065
 dentist+0.061
 wounds+0.059
 punished+0.057
Neg logit contributions
ges-0.089
aps-0.087
uts-0.085
 This-0.085
ore-0.084
Token.
Token ID13
Feature activation+5.31e+00

Pos logit contributions
uttering+0.118
aps+0.115
ges+0.114
uttered+0.111
uts+0.109
Neg logit contributions
 hospital-0.186
 mess-0.176
 dentist-0.176
 nurses-0.172
 punished-0.165
Token\n
Token ID198
Feature activation+3.34e+00

Pos logit contributions
uttering+0.099
ges+0.088
uts+0.087
uttered+0.082
ed+0.081
Neg logit contributions
 hospital-0.116
 dentist-0.110
 nurses-0.101
 doctors-0.094
 boo-0.093
Token\n
Token ID198
Feature activation+3.14e+00

Pos logit contributions
uts+0.048
ges+0.048
uttering+0.045
apped+0.043
uttered+0.043
Neg logit contributions
 hospital-0.082
 dentist-0.079
 nurses-0.073
 wounds-0.071
 cal-0.070
TokenBut
Token ID1537
Feature activation-1.20e+00

Pos logit contributions
uts+0.058
ges+0.058
uttered+0.054
uttering+0.054
apped+0.053
Neg logit contributions
 hospital-0.063
 dentist-0.059
 nurses-0.059
 punished-0.054
 wounds-0.053
Token their
Token ID511
Feature activation-1.13e+01

Pos logit contributions
 hospital+0.020
 dentist+0.019
 nurses+0.019
 punished+0.017
 happiest+0.016
Neg logit contributions
ges-0.027
uts-0.027
aps-0.026
uttering-0.025
 This-0.025
Token mom
Token ID1995
Feature activation-3.32e+00

Pos logit contributions
 nurses+0.178
 dentist+0.174
 hospital+0.171
ureen+0.170
 punished+0.170
Neg logit contributions
aps-0.228
ges-0.216
uts-0.212
uttering-0.207
ore-0.203
Token said
Token ID531
Feature activation+2.72e+00

Pos logit contributions
 hospital+0.062
 nurses+0.060
 dentist+0.058
 happiest+0.056
 wounds+0.055
Neg logit contributions
ges-0.065
uts-0.061
aps-0.060
 This-0.059
ore-0.058
Token,
Token ID11
Feature activation-6.01e-01

Pos logit contributions
ges+0.038
uts+0.037
aps+0.036
uttering+0.033
 This+0.033
Neg logit contributions
 hospital-0.069
 dentist-0.066
 nurses-0.066
 punished-0.061
 happiest-0.060
Token "
Token ID366
Feature activation+3.18e+00

Pos logit contributions
 hospital+0.015
 nurses+0.014
 dentist+0.014
 doctors+0.013
 forgot+0.013
Neg logit contributions
ges-0.009
aps-0.009
uttering-0.008
uts-0.008
ore-0.008
TokenNo
Token ID2949
Feature activation-2.35e+00

Pos logit contributions
ges+0.050
uts+0.048
aps+0.046
 This+0.043
uttering+0.042
Neg logit contributions
 hospital-0.076
 dentist-0.072
 nurses-0.070
 happiest-0.066
 wounds-0.066
Token\n
Token ID198
Feature activation+3.45e+00

Pos logit contributions
ges+0.075
uttering+0.065
uts+0.062
iked+0.058
ore+0.057
Neg logit contributions
 hospital-0.127
 dentist-0.123
 forgot-0.110
 cal-0.109
 wounds-0.108
Token\n
Token ID198
Feature activation+3.16e+00

Pos logit contributions
ges+0.040
uts+0.037
uttering+0.035
aps+0.034
elt+0.033
Neg logit contributions
 hospital-0.095
 dentist-0.091
 nurses-0.088
 forgot-0.084
 wounds-0.084
TokenFinally
Token ID11158
Feature activation+1.88e+00

Pos logit contributions
ges+0.050
uts+0.049
aps+0.048
 This+0.045
uttering+0.044
Neg logit contributions
 hospital-0.074
 nurses-0.071
 dentist-0.070
 happiest-0.064
 punished-0.064
Token,
Token ID11
Feature activation-6.49e-01

Pos logit contributions
uts+0.025
ges+0.025
aps+0.023
uttering+0.022
ore+0.021
Neg logit contributions
 hospital-0.050
 nurses-0.047
 dentist-0.047
 forgot-0.044
 punished-0.043
Token when
Token ID618
Feature activation+8.03e-01

Pos logit contributions
 hospital+0.015
 dentist+0.014
 nurses+0.014
 forgot+0.013
 happiest+0.013
Neg logit contributions
uts-0.011
ges-0.011
aps-0.010
uttering-0.010
elt-0.009
Token their
Token ID511
Feature activation-1.12e+01

Pos logit contributions
ges+0.014
uts+0.014
aps+0.013
uttering+0.013
elt+0.012
Neg logit contributions
 hospital-0.018
 dentist-0.017
 nurses-0.017
 forgot-0.015
 happiest-0.015
Token parents
Token ID3397
Feature activation-2.40e+00

Pos logit contributions
 hospital+0.178
 nurses+0.171
 dentist+0.170
 punished+0.152
 wounds+0.145
Neg logit contributions
uts-0.250
ges-0.250
aps-0.249
uttering-0.241
 This-0.239
Token came
Token ID1625
Feature activation+2.78e+00

Pos logit contributions
 hospital+0.048
 wounds+0.047
 nurses+0.046
 dentist+0.045
 happiest+0.044
Neg logit contributions
ges-0.044
ed-0.038
apped-0.038
uts-0.038
aps-0.038
Token to
Token ID284
Feature activation-1.72e+00

Pos logit contributions
uts+0.061
ges+0.056
 This+0.054
owers+0.054
uttering+0.052
Neg logit contributions
 hospital-0.049
 nurses-0.043
 dentist-0.041
 wounds-0.041
 cal-0.041
Token pick
Token ID2298
Feature activation+5.40e+00

Pos logit contributions
 hospital+0.036
 nurses+0.034
 dentist+0.034
 happiest+0.033
 punished+0.031
Neg logit contributions
ges-0.031
 This-0.029
aps-0.028
ore-0.027
 together-0.026
Token them
Token ID606
Feature activation-7.96e-01

Pos logit contributions
ges+0.093
uts+0.090
aps+0.088
 This+0.084
ore+0.083
Neg logit contributions
 hospital-0.119
 dentist-0.113
 nurses-0.113
 punished-0.103
 happiest-0.102
Token bike
Token ID7161
Feature activation+2.26e+00

Pos logit contributions
 hospital+0.175
 dentist+0.173
 nurses+0.172
ureen+0.165
 punished+0.160
Neg logit contributions
ges-0.171
aps-0.170
uts-0.164
 together-0.155
uttering-0.152
Token instead
Token ID2427
Feature activation+1.14e+00

Pos logit contributions
ges+0.029
uts+0.028
aps+0.027
 This+0.025
uttering+0.025
Neg logit contributions
 hospital-0.059
 nurses-0.057
 dentist-0.057
 punished-0.053
 wounds-0.052
Token.
Token ID13
Feature activation+5.53e+00

Pos logit contributions
ges+0.015
uts+0.014
aps+0.014
uttering+0.013
 This+0.013
Neg logit contributions
 hospital-0.030
 dentist-0.028
 nurses-0.028
 forgot-0.026
 doctors-0.026
Token But
Token ID887
Feature activation-2.54e-01

Pos logit contributions
uttering+0.086
ges+0.085
uts+0.085
ed+0.079
aps+0.078
Neg logit contributions
 hospital-0.131
 dentist-0.122
 nurses-0.117
 doctors-0.109
 tears-0.107
Token when
Token ID618
Feature activation+5.86e-02

Pos logit contributions
 hospital+0.006
 dentist+0.006
 nurses+0.006
 punished+0.005
 happiest+0.005
Neg logit contributions
ges-0.004
uts-0.004
aps-0.004
 This-0.004
uttering-0.004
Token their
Token ID511
Feature activation-1.12e+01

Pos logit contributions
ges+0.001
uts+0.001
aps+0.001
uttering+0.001
ore+0.001
Neg logit contributions
 hospital-0.001
 nurses-0.001
 dentist-0.001
 forgot-0.001
 punished-0.001
Token mom
Token ID1995
Feature activation-3.10e+00

Pos logit contributions
 hospital+0.175
 nurses+0.167
 dentist+0.161
 punished+0.155
ureen+0.149
Neg logit contributions
uts-0.241
aps-0.241
ges-0.237
uttering-0.226
 This-0.222
Token was
Token ID373
Feature activation-3.31e+00

Pos logit contributions
 hospital+0.060
 wounds+0.056
 nurses+0.056
 happiest+0.055
 dentist+0.054
Neg logit contributions
ges-0.058
uts-0.054
ed-0.053
aps-0.051
elt-0.051
Token busy
Token ID8179
Feature activation+7.65e-01

Pos logit contributions
 nurses+0.060
 hospital+0.060
 dentist+0.056
 wounds+0.052
 forgot+0.051
Neg logit contributions
aps-0.065
ges-0.062
uts-0.062
apping-0.061
uttering-0.061
Token talking
Token ID3375
Feature activation+6.62e+00

Pos logit contributions
uts+0.012
ges+0.012
aps+0.011
uttering+0.011
uttered+0.011
Neg logit contributions
 hospital-0.018
 nurses-0.017
 dentist-0.016
 mess-0.016
 punished-0.015
Token to
Token ID284
Feature activation+7.06e-01

Pos logit contributions
aps+0.101
ges+0.101
uttering+0.099
uts+0.094
uttered+0.089
Neg logit contributions
 hospital-0.159
 dentist-0.151
 nurses-0.149
 doctors-0.138
 mess-0.136
Token worry
Token ID5490
Feature activation+1.66e+00

Pos logit contributions
 hospital+0.042
 dentist+0.041
 nurses+0.040
 wounds+0.036
 doctors+0.035
Neg logit contributions
ges-0.050
uts-0.048
aps-0.047
 This-0.047
uttering-0.044
Token however
Token ID2158
Feature activation+7.75e-01

Pos logit contributions
ges+0.027
aps+0.026
uts+0.026
uttering+0.024
ore+0.024
Neg logit contributions
 hospital-0.038
 nurses-0.035
 dentist-0.035
 punished-0.032
 forgot-0.032
Token,
Token ID11
Feature activation-7.14e-01

Pos logit contributions
ges+0.011
uts+0.011
aps+0.010
 This+0.010
uttering+0.010
Neg logit contributions
 hospital-0.019
 nurses-0.018
 dentist-0.018
 happiest-0.017
 forgot-0.017
Token as
Token ID355
Feature activation-9.18e-02

Pos logit contributions
 hospital+0.015
 dentist+0.015
 nurses+0.015
 punished+0.013
 wounds+0.013
Neg logit contributions
ges-0.013
uts-0.012
aps-0.012
 This-0.011
uttering-0.011
Token when
Token ID618
Feature activation+4.92e-01

Pos logit contributions
 hospital+0.002
 dentist+0.002
 nurses+0.002
 punished+0.002
 wounds+0.002
Neg logit contributions
ges-0.001
uts-0.001
aps-0.001
 This-0.001
uttering-0.001
Token their
Token ID511
Feature activation-1.12e+01

Pos logit contributions
ges+0.008
uts+0.008
aps+0.008
uttering+0.007
ore+0.007
Neg logit contributions
 hospital-0.011
 nurses-0.011
 dentist-0.011
 happiest-0.010
 forgot-0.010
Token train
Token ID4512
Feature activation-1.28e+00

Pos logit contributions
 hospital+0.167
 dentist+0.162
 nurses+0.153
 punished+0.139
 forgot+0.136
Neg logit contributions
ges-0.253
uts-0.253
aps-0.250
 This-0.246
uttering-0.242
Token arrived
Token ID5284
Feature activation+2.24e+00

Pos logit contributions
 hospital+0.022
 dentist+0.022
 nurses+0.021
 wounds+0.020
 happiest+0.018
Neg logit contributions
uts-0.026
ges-0.026
 This-0.024
aps-0.024
apped-0.024
Token,
Token ID11
Feature activation-6.38e-01

Pos logit contributions
ges+0.031
uts+0.030
aps+0.028
 This+0.026
ore+0.026
Neg logit contributions
 hospital-0.058
 nurses-0.053
 dentist-0.053
 happiest-0.051
 punished-0.051
Token it
Token ID340
Feature activation-2.00e+00

Pos logit contributions
 hospital+0.015
 dentist+0.015
 nurses+0.015
 punished+0.014
 wounds+0.014
Neg logit contributions
ges-0.009
uts-0.009
aps-0.009
 This-0.008
uttering-0.008
Token was
Token ID373
Feature activation-1.69e+00

Pos logit contributions
 dentist+0.044
 nurses+0.042
 hospital+0.042
 wounds+0.039
 stret+0.037
Neg logit contributions
ed-0.031
 together-0.031
uts-0.031
uttered-0.030
ges-0.029
Token.
Token ID13
Feature activation+4.65e+00

Pos logit contributions
ges+0.026
uts+0.025
aps+0.024
uttering+0.023
uttered+0.023
Neg logit contributions
 hospital-0.030
 dentist-0.028
 nurses-0.028
 forgot-0.026
 punished-0.026
Token
Token ID198
Feature activation+3.21e+00

Pos logit contributions
ges+0.064
uttering+0.059
uts+0.059
aps+0.057
apping+0.052
Neg logit contributions
 hospital-0.121
 dentist-0.117
 nurses-0.111
 forgot-0.106
 happiest-0.104
Token
Token ID198
Feature activation+3.01e+00

Pos logit contributions
ges+0.056
uts+0.055
aps+0.054
 This+0.051
ore+0.050
Neg logit contributions
 hospital-0.069
 nurses-0.066
 dentist-0.065
 punished-0.060
 happiest-0.060
TokenEven
Token ID6104
Feature activation-1.70e+00

Pos logit contributions
ges+0.052
uts+0.052
aps+0.049
 This+0.048
uttering+0.046
Neg logit contributions
 hospital-0.065
 nurses-0.062
 dentist-0.061
 happiest-0.057
 wounds-0.056
Token though
Token ID996
Feature activation+3.59e-01

Pos logit contributions
 hospital+0.040
 dentist+0.038
 nurses+0.037
 wounds+0.037
ureen+0.034
Neg logit contributions
ges-0.026
uts-0.024
 together-0.023
aps-0.023
ore-0.022
Token their
Token ID511
Feature activation-1.12e+01

Pos logit contributions
ges+0.006
uts+0.006
aps+0.006
 This+0.006
ore+0.006
Neg logit contributions
 hospital-0.008
 dentist-0.007
 nurses-0.007
 punished-0.007
 wounds-0.007
Token holiday
Token ID9912
Feature activation+2.16e+00

Pos logit contributions
 dentist+0.176
 hospital+0.175
 nurses+0.172
ureen+0.166
 punished+0.164
Neg logit contributions
aps-0.222
ges-0.216
uts-0.214
 This-0.204
uttering-0.204
Token was
Token ID373
Feature activation-2.77e+00

Pos logit contributions
uts+0.037
ges+0.036
aps+0.035
 This+0.035
apped+0.035
Neg logit contributions
 hospital-0.044
 nurses-0.043
 dentist-0.043
 wounds-0.041
ureen-0.039
Token different
Token ID1180
Feature activation-8.35e-01

Pos logit contributions
 hospital+0.050
 dentist+0.047
 nurses+0.047
 wounds+0.045
ureen+0.042
Neg logit contributions
ges-0.055
uts-0.054
 This-0.052
uttering-0.051
aps-0.051
Token than
Token ID621
Feature activation-2.11e+00

Pos logit contributions
 hospital+0.021
 nurses+0.020
 dentist+0.020
 punished+0.018
 happiest+0.018
Neg logit contributions
ges-0.012
uts-0.012
aps-0.011
 This-0.011
uttering-0.011
Token they
Token ID484
Feature activation-5.47e+00

Pos logit contributions
 hospital+0.045
 dentist+0.043
 nurses+0.042
 wounds+0.040
ureen+0.039
Neg logit contributions
ges-0.037
uts-0.035
 This-0.033
aps-0.032
 together-0.032
Token.
Token ID13
Feature activation+5.44e+00

Pos logit contributions
ges+0.042
aps+0.040
uts+0.038
uttering+0.038
ore+0.037
Neg logit contributions
 hospital-0.058
 mess-0.055
 floor-0.054
 dentist-0.054
 nurses-0.052
Token
Token ID220
Feature activation+5.83e+00

Pos logit contributions
uts+0.099
ges+0.094
uttering+0.094
aps+0.090
ore+0.085
Neg logit contributions
 hospital-0.114
 nurses-0.109
 dentist-0.102
 tears-0.097
 happiest-0.096
Token
Token ID198
Feature activation+3.88e+00

Pos logit contributions
ges+0.085
uttering+0.077
uts+0.073
elt+0.072
ore+0.070
Neg logit contributions
 hospital-0.132
 dentist-0.126
 nurses-0.115
 wounds-0.115
 forgot-0.112
Token
Token ID198
Feature activation+3.54e+00

Pos logit contributions
ges+0.065
uts+0.065
uttering+0.064
elt+0.064
apped+0.060
Neg logit contributions
 hospital-0.074
 dentist-0.069
 nurses-0.064
 forgot-0.064
idi-0.062
TokenWhen
Token ID2215
Feature activation+1.37e+00

Pos logit contributions
ges+0.074
uts+0.073
aps+0.073
ore+0.069
uttering+0.068
Neg logit contributions
 hospital-0.064
 nurses-0.060
 dentist-0.058
 punished-0.054
 forgot-0.053
Token their
Token ID511
Feature activation-1.10e+01

Pos logit contributions
ges+0.022
uts+0.021
aps+0.021
 This+0.020
ore+0.020
Neg logit contributions
 hospital-0.031
 dentist-0.030
 nurses-0.030
 wounds-0.028
 punished-0.028
Token treat
Token ID2190
Feature activation+3.66e+00

Pos logit contributions
 hospital+0.178
 dentist+0.169
 nurses+0.163
 punished+0.155
 forgot+0.148
Neg logit contributions
ges-0.229
aps-0.229
uts-0.228
uttering-0.225
 This-0.219
Token was
Token ID373
Feature activation-3.37e+00

Pos logit contributions
ges+0.072
uts+0.069
aps+0.068
 This+0.065
ore+0.065
Neg logit contributions
 hospital-0.071
 nurses-0.068
 dentist-0.067
 wounds-0.062
 punished-0.061
Token finished
Token ID5201
Feature activation+5.28e+00

Pos logit contributions
 hospital+0.052
 dentist+0.049
 nurses+0.048
 punished+0.043
 happiest+0.042
Neg logit contributions
ges-0.080
uts-0.079
aps-0.077
 This-0.074
ore-0.073
Token,
Token ID11
Feature activation-1.12e-01

Pos logit contributions
ges+0.085
uts+0.083
aps+0.082
uttering+0.078
ore+0.076
Neg logit contributions
 hospital-0.121
 nurses-0.111
 mess-0.111
 dentist-0.110
 mistake-0.104
Token the
Token ID262
Feature activation-8.05e+00

Pos logit contributions
 hospital+0.002
 dentist+0.002
 nurses+0.002
 punished+0.002
 wounds+0.002
Neg logit contributions
ges-0.002
uts-0.002
aps-0.002
 This-0.002
uttering-0.002
Token spending
Token ID4581
Feature activation+8.87e-01

Pos logit contributions
 hospital+0.043
 nurses+0.041
 dentist+0.040
 punished+0.036
 happiest+0.036
Neg logit contributions
ges-0.042
uts-0.041
aps-0.040
uttering-0.038
ore-0.038
Token time
Token ID640
Feature activation+5.04e+00

Pos logit contributions
ges+0.019
uts+0.018
aps+0.018
 This+0.017
uttering+0.017
Neg logit contributions
 hospital-0.016
 nurses-0.015
 dentist-0.015
 punished-0.013
 happiest-0.013
Token together
Token ID1978
Feature activation+1.95e+00

Pos logit contributions
ges+0.062
elt+0.059
 This+0.057
uttered+0.056
uts+0.054
Neg logit contributions
 hospital-0.130
 dentist-0.121
 mess-0.121
 forgot-0.119
 nurses-0.117
Token.
Token ID13
Feature activation+4.62e+00

Pos logit contributions
elt+0.037
uttered+0.035
uts+0.035
aps+0.034
uck+0.034
Neg logit contributions
 hospital-0.037
 dentist-0.032
 nurses-0.032
 happiest-0.032
 forgot-0.031
Token Whenever
Token ID21326
Feature activation+1.65e+00

Pos logit contributions
ges+0.065
uts+0.061
uttering+0.059
aps+0.059
ore+0.055
Neg logit contributions
 hospital-0.119
 dentist-0.115
 nurses-0.111
 happiest-0.101
 doctors-0.101
Token their
Token ID511
Feature activation-1.10e+01

Pos logit contributions
ges+0.026
uts+0.026
aps+0.025
 This+0.024
uttering+0.023
Neg logit contributions
 hospital-0.038
 dentist-0.036
 nurses-0.036
 punished-0.033
 wounds-0.033
Token parents
Token ID3397
Feature activation-1.57e+00

Pos logit contributions
 hospital+0.185
 dentist+0.178
 nurses+0.177
 punished+0.164
ureen+0.163
Neg logit contributions
aps-0.227
uts-0.225
ges-0.224
uttering-0.213
apped-0.208
Token would
Token ID561
Feature activation+8.18e-02

Pos logit contributions
 hospital+0.028
 dentist+0.027
 wounds+0.027
 nurses+0.027
 mistake+0.025
Neg logit contributions
ges-0.030
uts-0.029
aps-0.028
apped-0.028
 together-0.028
Token leave
Token ID2666
Feature activation+3.44e+00

Pos logit contributions
ges+0.002
uts+0.002
aps+0.002
 This+0.002
uttering+0.002
Neg logit contributions
 hospital-0.001
 dentist-0.001
 nurses-0.001
 wounds-0.001
 punished-0.001
Token the
Token ID262
Feature activation-8.29e+00

Pos logit contributions
ges+0.050
uts+0.046
 This+0.045
aps+0.044
 together+0.043
Neg logit contributions
 hospital-0.082
 dentist-0.080
 nurses-0.079
 wounds-0.073
ureen-0.073
Token house
Token ID2156
Feature activation+5.98e+00

Pos logit contributions
 nurses+0.125
 hospital+0.124
ureen+0.119
 wounds+0.118
 dentist+0.117
Neg logit contributions
ges-0.166
aps-0.165
uts-0.164
uttering-0.161
ore-0.161
Token was
Token ID373
Feature activation-1.31e+00

Pos logit contributions
 hospital+0.079
 nurses+0.078
 dentist+0.078
 wounds+0.075
 stret+0.070
Neg logit contributions
uts-0.056
ed-0.055
ges-0.051
 together-0.051
 This-0.051
Token a
Token ID257
Feature activation-3.78e+00

Pos logit contributions
 hospital+0.027
 nurses+0.026
 dentist+0.025
 wounds+0.023
 forgot+0.023
Neg logit contributions
ges-0.023
uts-0.022
aps-0.022
 This-0.021
uttering-0.021
Token weird
Token ID7650
Feature activation-3.00e+00

Pos logit contributions
 nurses+0.061
 hospital+0.061
 dentist+0.057
 wounds+0.056
 doctors+0.052
Neg logit contributions
ges-0.075
 This-0.073
aps-0.073
uts-0.073
ore-0.072
Token feeling
Token ID4203
Feature activation+1.16e-01

Pos logit contributions
 hospital+0.054
 nurses+0.052
 dentist+0.051
 punished+0.047
 wounds+0.046
Neg logit contributions
ges-0.061
uts-0.059
aps-0.058
 This-0.057
ore-0.057
Token.
Token ID13
Feature activation+5.79e+00

Pos logit contributions
ges+0.002
uts+0.002
aps+0.001
 This+0.001
uttering+0.001
Neg logit contributions
 hospital-0.003
 dentist-0.003
 nurses-0.003
 punished-0.003
 happiest-0.003
Token Er
Token ID5256
Feature activation-1.10e+01

Pos logit contributions
ges+0.084
uts+0.083
uttering+0.081
aps+0.077
uttered+0.071
Neg logit contributions
 hospital-0.141
 dentist-0.136
 nurses-0.136
 cal-0.124
 happiest-0.122
Tokennie
Token ID11952
Feature activation-6.05e+00

Pos logit contributions
 hospital+0.157
 dentist+0.139
 nurses+0.122
 cal+0.122
 wounds+0.118
Neg logit contributions
ges-0.284
ore-0.269
ed-0.267
aps-0.264
apped-0.264
Token didn
Token ID1422
Feature activation+4.51e+00

Pos logit contributions
 hospital+0.109
 dentist+0.107
 nurses+0.104
 wounds+0.094
 happiest+0.092
Neg logit contributions
ges-0.125
uts-0.121
aps-0.119
apped-0.115
 This-0.114
Token't
Token ID470
Feature activation-2.92e+00

Pos logit contributions
ges+0.058
elt+0.056
uttering+0.053
apped+0.052
aps+0.050
Neg logit contributions
 hospital-0.126
 dentist-0.115
 nurses-0.108
 cal-0.108
 mistake-0.106
Token understand
Token ID1833
Feature activation+2.46e+00

Pos logit contributions
 hospital+0.058
 dentist+0.055
 nurses+0.055
 wounds+0.050
 happiest+0.049
Neg logit contributions
ges-0.057
uts-0.055
aps-0.053
 This-0.051
ore-0.050
Token why
Token ID1521
Feature activation+2.22e+00

Pos logit contributions
ges+0.050
uts+0.049
aps+0.047
 This+0.046
ore+0.045
Neg logit contributions
 hospital-0.046
 nurses-0.043
 dentist-0.043
 punished-0.039
 wounds-0.038
Token.
Token ID13
Feature activation+5.28e+00

Pos logit contributions
ges+0.087
uts+0.083
uttering+0.083
aps+0.082
elt+0.080
Neg logit contributions
 hospital-0.137
 dentist-0.129
 nurses-0.125
 mess-0.122
 happiest-0.120
Token
Token ID220
Feature activation+5.94e+00

Pos logit contributions
uttering+0.109
ges+0.106
uts+0.102
aps+0.095
ore+0.095
Neg logit contributions
 hospital-0.096
 dentist-0.093
 nurses-0.087
 tears-0.081
 floor-0.081
Token
Token ID198
Feature activation+3.65e+00

Pos logit contributions
ges+0.092
uttering+0.084
uts+0.077
elt+0.074
iked+0.073
Neg logit contributions
 hospital-0.133
 dentist-0.130
 nurses-0.115
 wounds-0.113
 cal-0.112
Token
Token ID198
Feature activation+3.32e+00

Pos logit contributions
uttering+0.061
ges+0.060
elt+0.056
uts+0.055
uttered+0.055
Neg logit contributions
 hospital-0.073
 dentist-0.073
ureen-0.065
 nurses-0.065
 conductor-0.063
TokenWhen
Token ID2215
Feature activation+1.44e+00

Pos logit contributions
ges+0.069
aps+0.067
uts+0.066
ore+0.064
uttering+0.063
Neg logit contributions
 hospital-0.061
 dentist-0.058
 nurses-0.057
 punished-0.052
 forgot-0.051
Token their
Token ID511
Feature activation-1.10e+01

Pos logit contributions
ges+0.025
uts+0.025
aps+0.024
uttering+0.023
ore+0.023
Neg logit contributions
 hospital-0.032
 dentist-0.030
 nurses-0.030
 forgot-0.027
 punished-0.027
Token fence
Token ID13990
Feature activation+1.81e+00

Pos logit contributions
 hospital+0.184
 dentist+0.181
 nurses+0.177
 punished+0.161
 cal+0.154
Neg logit contributions
uts-0.232
ges-0.232
aps-0.228
uttering-0.224
 This-0.221
Token was
Token ID373
Feature activation-3.06e+00

Pos logit contributions
ges+0.034
uts+0.033
aps+0.032
 This+0.031
ore+0.030
Neg logit contributions
 hospital-0.036
 nurses-0.035
 dentist-0.035
 wounds-0.032
 punished-0.031
Token done
Token ID1760
Feature activation+5.31e+00

Pos logit contributions
 hospital+0.046
 dentist+0.042
 nurses+0.042
 punished+0.036
 happiest+0.036
Neg logit contributions
ges-0.074
uts-0.073
aps-0.072
 This-0.069
uttering-0.068
Token,
Token ID11
Feature activation-5.15e-01

Pos logit contributions
ges+0.091
uts+0.091
aps+0.089
uttering+0.086
elt+0.085
Neg logit contributions
 hospital-0.114
 mess-0.102
 dentist-0.101
 nurses-0.100
 mistake-0.096
Token John
Token ID1757
Feature activation-5.88e+00

Pos logit contributions
 hospital+0.011
 dentist+0.011
 nurses+0.011
 wounds+0.010
 punished+0.010
Neg logit contributions
ges-0.009
uts-0.008
aps-0.008
 This-0.008
uttering-0.008
Token hands
Token ID2832
Feature activation+5.02e+00

Pos logit contributions
 dentist+0.183
 nurses+0.179
 hospital+0.177
 punished+0.161
ureen+0.160
Neg logit contributions
 This-0.183
ges-0.180
aps-0.178
uttering-0.166
uts-0.164
Token.
Token ID13
Feature activation+5.11e+00

Pos logit contributions
ges+0.083
aps+0.080
uts+0.077
ore+0.075
 This+0.075
Neg logit contributions
 hospital-0.107
 dentist-0.098
 nurses-0.097
 forgot-0.095
 happiest-0.093
Token
Token ID198
Feature activation+3.47e+00

Pos logit contributions
ges+0.084
uttering+0.081
uts+0.079
aps+0.078
ore+0.073
Neg logit contributions
 hospital-0.116
 dentist-0.103
 nurses-0.102
 forgot-0.102
 tears-0.099
Token
Token ID198
Feature activation+3.17e+00

Pos logit contributions
ges+0.047
uts+0.044
uttering+0.043
aps+0.042
elt+0.042
Neg logit contributions
 hospital-0.087
 dentist-0.082
 nurses-0.077
 forgot-0.076
 mess-0.074
TokenWhen
Token ID2215
Feature activation+1.04e+00

Pos logit contributions
ges+0.058
uts+0.057
aps+0.055
uttering+0.052
 This+0.052
Neg logit contributions
 hospital-0.067
 nurses-0.063
 dentist-0.063
 punished-0.058
 forgot-0.057
Token their
Token ID511
Feature activation-1.10e+01

Pos logit contributions
uts+0.017
ges+0.017
aps+0.017
uttering+0.016
ore+0.015
Neg logit contributions
 hospital-0.024
 nurses-0.023
 dentist-0.022
 forgot-0.021
 punished-0.021
Token parents
Token ID3397
Feature activation-2.47e+00

Pos logit contributions
 hospital+0.153
 dentist+0.146
 nurses+0.142
 punished+0.127
 cal+0.123
Neg logit contributions
ges-0.265
uts-0.265
aps-0.260
uttering-0.251
 This-0.249
Token saw
Token ID2497
Feature activation+1.82e+00

Pos logit contributions
 hospital+0.046
 wounds+0.044
 nurses+0.044
 dentist+0.043
 happiest+0.041
Neg logit contributions
ges-0.049
uts-0.044
ore-0.043
aps-0.042
ed-0.042
Token the
Token ID262
Feature activation-7.97e+00

Pos logit contributions
ges+0.029
 This+0.026
 together+0.026
uts+0.024
ore+0.024
Neg logit contributions
 hospital-0.041
 dentist-0.040
 nurses-0.040
 happiest-0.038
ureen-0.037
Token mole
Token ID9411
Feature activation-1.86e+00

Pos logit contributions
 hospital+0.139
 dentist+0.129
 nurses+0.129
 punished+0.114
 happiest+0.113
Neg logit contributions
ges-0.175
uts-0.172
aps-0.168
 This-0.161
uttering-0.160
Token,
Token ID11
Feature activation-1.74e+00

Pos logit contributions
 hospital+0.046
 dentist+0.044
 nurses+0.044
 punished+0.041
 happiest+0.040
Neg logit contributions
ges-0.026
uts-0.026
aps-0.025
 This-0.023
uttering-0.023
Token pictures
Token ID5986
Feature activation+6.83e+00

Pos logit contributions
ges+0.110
uts+0.107
aps+0.103
 This+0.097
uttering+0.097
Neg logit contributions
 hospital-0.151
 dentist-0.143
 nurses-0.143
 punished-0.130
 happiest-0.130
Token together
Token ID1978
Feature activation+2.36e+00

Pos logit contributions
ges+0.095
uttering+0.089
aps+0.084
uttered+0.084
elt+0.083
Neg logit contributions
 hospital-0.170
 dentist-0.165
 nurses-0.158
 doctors-0.150
 cal-0.149
Token.
Token ID13
Feature activation+5.72e+00

Pos logit contributions
ges+0.035
aps+0.034
uttering+0.033
uts+0.033
elt+0.033
Neg logit contributions
 hospital-0.056
 nurses-0.053
 dentist-0.052
 mess-0.049
 happiest-0.049
Token Even
Token ID3412
Feature activation-1.75e+00

Pos logit contributions
ges+0.079
uts+0.077
aps+0.073
uttering+0.072
ore+0.067
Neg logit contributions
 hospital-0.148
 dentist-0.139
 nurses-0.138
 cal-0.126
 happiest-0.126
Token though
Token ID996
Feature activation+6.81e-01

Pos logit contributions
 hospital+0.040
 dentist+0.039
 nurses+0.037
 wounds+0.036
ureen+0.034
Neg logit contributions
ges-0.028
uts-0.026
aps-0.026
uttering-0.024
 This-0.024
Token their
Token ID511
Feature activation-1.09e+01

Pos logit contributions
ges+0.011
uts+0.011
aps+0.010
 This+0.010
uttering+0.009
Neg logit contributions
 hospital-0.015
 dentist-0.015
 nurses-0.015
 punished-0.014
 wounds-0.014
Token vacation
Token ID14600
Feature activation+2.85e+00

Pos logit contributions
 dentist+0.190
 hospital+0.190
 nurses+0.187
ureen+0.170
 cal+0.170
Neg logit contributions
ges-0.205
aps-0.203
uts-0.201
 This-0.199
uttering-0.195
Token was
Token ID373
Feature activation-2.46e+00

Pos logit contributions
uts+0.049
ges+0.045
 This+0.045
aps+0.045
ed+0.045
Neg logit contributions
 nurses-0.058
 hospital-0.058
 dentist-0.057
 wounds-0.055
 stret-0.051
Token ruined
Token ID20484
Feature activation+4.49e+00

Pos logit contributions
 hospital+0.042
 nurses+0.041
 dentist+0.040
 wounds+0.038
ureen+0.035
Neg logit contributions
ges-0.050
uts-0.049
 This-0.047
aps-0.047
uttering-0.047
Token,
Token ID11
Feature activation-1.13e-01

Pos logit contributions
ges+0.059
uts+0.057
aps+0.056
uttering+0.051
ore+0.051
Neg logit contributions
 hospital-0.118
 nurses-0.112
 dentist-0.111
 punished-0.103
 happiest-0.103
Token they
Token ID484
Feature activation-4.07e+00

Pos logit contributions
 hospital+0.003
 dentist+0.003
 nurses+0.003
 punished+0.003
 wounds+0.002
Neg logit contributions
ges-0.002
uts-0.001
 This-0.001
aps-0.001
 together-0.001
Token
Token ID198
Feature activation+3.47e+00

Pos logit contributions
ges+0.085
uttering+0.078
uts+0.076
elt+0.075
iked+0.073
Neg logit contributions
 hospital-0.117
 dentist-0.114
 nurses-0.101
 tears-0.100
 wounds-0.099
Token
Token ID198
Feature activation+3.28e+00

Pos logit contributions
ges+0.050
uts+0.048
uttering+0.045
elt+0.045
apped+0.044
Neg logit contributions
 hospital-0.083
 dentist-0.080
 nurses-0.074
ureen-0.069
 forgot-0.069
TokenOne
Token ID3198
Feature activation-1.82e+00

Pos logit contributions
ges+0.065
uts+0.063
aps+0.063
apped+0.059
ore+0.058
Neg logit contributions
 hospital-0.066
 nurses-0.062
 dentist-0.061
 doctors-0.054
 happiest-0.054
Token day
Token ID1110
Feature activation+4.76e+00

Pos logit contributions
 hospital+0.040
 dentist+0.038
 nurses+0.038
 wounds+0.035
 punished+0.035
Neg logit contributions
ges-0.032
uts-0.030
aps-0.030
 This-0.029
ore-0.029
Token,
Token ID11
Feature activation-6.35e-01

Pos logit contributions
uts+0.095
aps+0.091
uttering+0.087
apping+0.084
ges+0.083
Neg logit contributions
 nurses-0.085
 hospital-0.084
 forgot-0.073
 doctors-0.072
 dentist-0.070
Token Pe
Token ID2631
Feature activation-1.09e+01

Pos logit contributions
 hospital+0.012
 nurses+0.012
 dentist+0.011
 doctors+0.010
 happiest+0.010
Neg logit contributions
uts-0.012
aps-0.012
ges-0.012
apping-0.012
uttering-0.011
Tokenppa
Token ID44989
Feature activation-3.77e+00

Pos logit contributions
 hospital+0.150
 nurses+0.137
 dentist+0.123
 cal+0.121
 happiest+0.112
Neg logit contributions
uttering-0.267
ges-0.266
uts-0.264
aps-0.258
apping-0.250
Token stumbled
Token ID24241
Feature activation+6.82e+00

Pos logit contributions
 nurses+0.072
 hospital+0.071
 wounds+0.070
 dentist+0.069
 happiest+0.066
Neg logit contributions
ges-0.068
uts-0.068
ed-0.068
ung-0.063
aps-0.063
Token upon
Token ID2402
Feature activation+2.11e+00

Pos logit contributions
uts+0.097
ges+0.095
 This+0.094
aps+0.093
 Sorry+0.089
Neg logit contributions
 hospital-0.164
 dentist-0.157
 nurses-0.153
 forgot-0.147
 happiest-0.143
Token a
Token ID257
Feature activation-3.84e+00

Pos logit contributions
ges+0.037
uts+0.036
aps+0.035
uttering+0.033
 This+0.032
Neg logit contributions
 hospital-0.046
 dentist-0.044
 nurses-0.044
 punished-0.040
 happiest-0.039
Token strange
Token ID6283
Feature activation-2.62e+00

Pos logit contributions
 nurses+0.063
 hospital+0.061
 wounds+0.059
 dentist+0.058
 punished+0.057
Neg logit contributions
ges-0.077
ore-0.076
uttering-0.075
aps-0.074
 This-0.074
Token.
Token ID13
Feature activation+5.13e+00

Pos logit contributions
 hospital+0.064
 dentist+0.062
 nurses+0.061
 punished+0.056
 wounds+0.056
Neg logit contributions
ges-0.029
uts-0.029
aps-0.027
 This-0.025
ore-0.024
Token They
Token ID1119
Feature activation-5.23e+00

Pos logit contributions
ges+0.070
uts+0.068
aps+0.065
uttering+0.062
ore+0.058
Neg logit contributions
 hospital-0.134
 dentist-0.127
 nurses-0.126
 happiest-0.115
 punished-0.115
Token hoped
Token ID10719
Feature activation+2.58e+00

Pos logit contributions
 dentist+0.126
 hospital+0.118
 wounds+0.118
 stret+0.118
 nurses+0.116
Neg logit contributions
 together-0.079
apped-0.074
uttered-0.066
ges-0.066
 on-0.063
Token to
Token ID284
Feature activation-2.47e+00

Pos logit contributions
ges+0.030
uts+0.029
 This+0.028
 together+0.027
aps+0.025
Neg logit contributions
 dentist-0.065
 hospital-0.064
 punished-0.063
 nurses-0.062
 wounds-0.061
Token rebuild
Token ID17884
Feature activation+4.65e+00

Pos logit contributions
 dentist+0.045
 nurses+0.045
 hospital+0.045
ureen+0.043
ice+0.042
Neg logit contributions
ges-0.042
aps-0.042
uts-0.041
 This-0.041
't-0.041
Token their
Token ID511
Feature activation-1.09e+01

Pos logit contributions
ges+0.070
 together+0.068
aps+0.066
uts+0.064
 This+0.062
Neg logit contributions
 dentist-0.104
 hospital-0.104
 nurses-0.103
 forgot-0.096
ureen-0.096
Token village
Token ID7404
Feature activation+2.70e+00

Pos logit contributions
 dentist+0.177
ureen+0.169
 nurses+0.169
 hospital+0.167
 punished+0.149
Neg logit contributions
aps-0.220
ges-0.215
 This-0.215
ore-0.213
uts-0.212
Token and
Token ID290
Feature activation-4.10e+00

Pos logit contributions
uts+0.043
ges+0.042
aps+0.042
 together+0.040
ore+0.039
Neg logit contributions
 dentist-0.059
 hospital-0.059
 nurses-0.058
ureen-0.053
 punished-0.052
Token make
Token ID787
Feature activation+2.46e+00

Pos logit contributions
 dentist+0.076
 hospital+0.076
 nurses+0.074
ureen+0.071
 wounds+0.067
Neg logit contributions
ges-0.075
uts-0.073
 together-0.072
aps-0.072
 This-0.071
Token it
Token ID340
Feature activation-3.17e-01

Pos logit contributions
 together+0.039
aps+0.035
uts+0.033
ges+0.033
 This+0.033
Neg logit contributions
 dentist-0.053
 nurses-0.053
 hospital-0.051
 forgot-0.051
ureen-0.049
Token even
Token ID772
Feature activation-3.07e+00

Pos logit contributions
 hospital+0.005
 nurses+0.005
 dentist+0.005
ureen+0.004
 punished+0.004
Neg logit contributions
uts-0.007
aps-0.007
ges-0.007
 This-0.007
uttering-0.006
Token.
Token ID13
Feature activation+5.58e+00

Pos logit contributions
aps+0.069
ges+0.067
uts+0.065
ore+0.063
elt+0.062
Neg logit contributions
 hospital-0.095
 forgot-0.089
 happiest-0.087
 dentist-0.086
 nurses-0.085
Token
Token ID220
Feature activation+5.67e+00

Pos logit contributions
ges+0.082
uts+0.081
uttering+0.080
aps+0.076
ore+0.070
Neg logit contributions
 hospital-0.135
 dentist-0.125
 nurses-0.122
 happiest-0.115
 forgot-0.114
Token
Token ID198
Feature activation+4.06e+00

Pos logit contributions
ges+0.079
uttering+0.071
uts+0.067
elt+0.065
ore+0.063
Neg logit contributions
 hospital-0.133
 dentist-0.126
 cal-0.116
 nurses-0.115
 wounds-0.114
Token
Token ID198
Feature activation+3.74e+00

Pos logit contributions
ges+0.049
uts+0.047
uttering+0.044
aps+0.043
elt+0.042
Neg logit contributions
 hospital-0.107
 dentist-0.103
 nurses-0.099
 punished-0.094
 forgot-0.094
TokenWhen
Token ID2215
Feature activation+1.24e+00

Pos logit contributions
ges+0.076
uts+0.075
aps+0.073
 This+0.069
uttering+0.069
Neg logit contributions
 hospital-0.070
 dentist-0.066
 nurses-0.066
 punished-0.059
 happiest-0.058
Token their
Token ID511
Feature activation-1.09e+01

Pos logit contributions
ges+0.021
uts+0.020
aps+0.019
 This+0.019
ore+0.018
Neg logit contributions
 hospital-0.027
 dentist-0.026
 nurses-0.026
 wounds-0.024
 punished-0.024
Token energy
Token ID2568
Feature activation+1.86e+00

Pos logit contributions
 hospital+0.179
 dentist+0.176
 nurses+0.174
 punished+0.157
 forgot+0.151
Neg logit contributions
aps-0.229
uts-0.228
ges-0.224
uttering-0.218
 This-0.213
Token was
Token ID373
Feature activation-3.35e+00

Pos logit contributions
ges+0.038
uts+0.038
aps+0.036
apped+0.035
ore+0.034
Neg logit contributions
 hospital-0.034
 dentist-0.033
 nurses-0.033
 wounds-0.030
 punished-0.029
Token spent
Token ID3377
Feature activation+1.66e+00

Pos logit contributions
 hospital+0.069
 nurses+0.065
 dentist+0.065
 punished+0.059
 wounds+0.058
Neg logit contributions
ges-0.062
uts-0.061
aps-0.059
 This-0.056
uttering-0.056
Token,
Token ID11
Feature activation-8.46e-01

Pos logit contributions
ges+0.028
aps+0.026
 together+0.025
uts+0.025
 This+0.025
Neg logit contributions
 dentist-0.037
 nurses-0.036
 hospital-0.036
 wounds-0.032
 punished-0.031
Token they
Token ID484
Feature activation-3.93e+00

Pos logit contributions
 hospital+0.020
 dentist+0.020
 nurses+0.020
 wounds+0.018
 punished+0.018
Neg logit contributions
ges-0.012
uts-0.012
aps-0.011
 This-0.011
apped-0.011
Token hard
Token ID1327
Feature activation+1.54e+00

Pos logit contributions
ges+0.074
uts+0.073
aps+0.071
 This+0.067
uttering+0.067
Neg logit contributions
 hospital-0.081
 dentist-0.076
 nurses-0.076
 punished-0.070
 wounds-0.068
Token,
Token ID11
Feature activation-7.58e-01

Pos logit contributions
aps+0.029
ges+0.029
uttered+0.027
 Sorry+0.027
uttering+0.027
Neg logit contributions
 hospital-0.031
 dentist-0.027
 nurses-0.027
 mess-0.027
 punished-0.026
Token and
Token ID290
Feature activation-4.82e+00

Pos logit contributions
 hospital+0.018
 dentist+0.017
 nurses+0.017
 wounds+0.016
 cal+0.016
Neg logit contributions
ges-0.011
uts-0.010
 This-0.010
 together-0.010
elt-0.009
Token then
Token ID788
Feature activation+2.00e-01

Pos logit contributions
 hospital+0.099
 dentist+0.095
 nurses+0.094
 wounds+0.091
 cal+0.088
Neg logit contributions
 together-0.083
ges-0.082
ed-0.078
uts-0.075
apped-0.073
Token when
Token ID618
Feature activation+5.59e-01

Pos logit contributions
ges+0.003
uts+0.003
aps+0.003
 together+0.003
ed+0.003
Neg logit contributions
 hospital-0.004
 nurses-0.004
 dentist-0.004
 wounds-0.004
 happiest-0.004
Token his
Token ID465
Feature activation-1.08e+01

Pos logit contributions
ges+0.010
uts+0.010
aps+0.010
 This+0.009
uttering+0.009
Neg logit contributions
 hospital-0.012
 nurses-0.011
 dentist-0.011
 punished-0.010
 happiest-0.010
Token mom
Token ID1995
Feature activation-3.63e+00

Pos logit contributions
 hospital+0.162
 nurses+0.157
 dentist+0.142
 punished+0.137
 cal+0.134
Neg logit contributions
aps-0.249
uts-0.247
ges-0.245
 This-0.236
elt-0.227
Token came
Token ID1625
Feature activation+2.53e+00

Pos logit contributions
 hospital+0.074
 nurses+0.069
 happiest+0.069
 wounds+0.068
 dentist+0.065
Neg logit contributions
ges-0.066
ed-0.064
uts-0.061
elt-0.060
aps-0.058
Token home
Token ID1363
Feature activation+5.46e+00

Pos logit contributions
uts+0.047
ges+0.041
 This+0.040
uttering+0.039
apped+0.039
Neg logit contributions
 hospital-0.053
 nurses-0.050
 dentist-0.049
 wounds-0.047
 doctors-0.046
Token,
Token ID11
Feature activation-1.15e+00

Pos logit contributions
ges+0.072
uts+0.069
aps+0.066
uttering+0.063
 This+0.061
Neg logit contributions
 hospital-0.145
 dentist-0.137
 nurses-0.136
 punished-0.128
 forgot-0.127
Token she
Token ID673
Feature activation-5.56e+00

Pos logit contributions
 hospital+0.028
 dentist+0.027
 nurses+0.027
 wounds+0.025
 happiest+0.025
Neg logit contributions
ges-0.016
uts-0.016
aps-0.015
 This-0.014
uttering-0.013
Token
Token ID198
Feature activation+2.79e+00

Pos logit contributions
uttering+0.050
ges+0.049
elt+0.049
uts+0.048
ucked+0.046
Neg logit contributions
 hospital-0.069
 dentist-0.067
 nurses-0.059
 mess-0.058
 forgot-0.058
TokenAnna
Token ID31160
Feature activation-7.31e+00

Pos logit contributions
ges+0.056
uts+0.055
aps+0.053
uttering+0.052
ore+0.050
Neg logit contributions
 hospital-0.055
 nurses-0.052
 dentist-0.052
 forgot-0.046
 punished-0.046
Token and
Token ID290
Feature activation-8.27e+00

Pos logit contributions
 nurses+0.167
 hospital+0.164
 dentist+0.160
 happiest+0.149
 doctors+0.148
Neg logit contributions
aps-0.114
ges-0.108
uts-0.105
 This-0.104
ore-0.098
Token Max
Token ID5436
Feature activation-5.81e+00

Pos logit contributions
 hospital+0.153
 dentist+0.143
 nurses+0.143
 punished+0.129
 happiest+0.127
Neg logit contributions
ges-0.172
uts-0.168
aps-0.164
 This-0.158
uttering-0.154
Token finish
Token ID5461
Feature activation+3.07e+00

Pos logit contributions
 dentist+0.126
 hospital+0.124
 nurses+0.123
 wounds+0.115
 doctors+0.114
Neg logit contributions
ges-0.094
 This-0.091
aps-0.088
uts-0.085
ore-0.080
Token their
Token ID511
Feature activation-1.08e+01

Pos logit contributions
uts+0.064
elt+0.061
ges+0.060
uttering+0.059
uttered+0.059
Neg logit contributions
 hospital-0.058
 dentist-0.051
 nurses-0.050
 mistake-0.049
 wounds-0.049
Token food
Token ID2057
Feature activation+2.06e+00

Pos logit contributions
 hospital+0.185
 dentist+0.172
 nurses+0.172
 wounds+0.152
 punished+0.151
Neg logit contributions
ges-0.243
uts-0.238
aps-0.232
uttering-0.222
 This-0.220
Token and
Token ID290
Feature activation-4.91e+00

Pos logit contributions
ges+0.030
uts+0.028
aps+0.027
 This+0.026
uttering+0.026
Neg logit contributions
 hospital-0.052
 dentist-0.049
 nurses-0.049
 punished-0.045
 forgot-0.045
Token go
Token ID467
Feature activation+3.37e+00

Pos logit contributions
 hospital+0.100
 dentist+0.099
 nurses+0.096
ureen+0.094
 happiest+0.092
Neg logit contributions
ges-0.075
 together-0.074
 This-0.074
aps-0.073
uts-0.072
Token to
Token ID284
Feature activation-1.79e+00

Pos logit contributions
uts+0.062
ges+0.057
uttering+0.054
uttered+0.054
ung+0.053
Neg logit contributions
 hospital-0.075
 dentist-0.066
 nurses-0.064
 forgot-0.060
 punished-0.060
Token play
Token ID711
Feature activation+5.95e+00

Pos logit contributions
 hospital+0.033
 nurses+0.031
 dentist+0.031
 punished+0.028
 happiest+0.027
Neg logit contributions
ges-0.037
uts-0.036
aps-0.036
 This-0.034
uttering-0.034
Token had
Token ID550
Feature activation-3.02e-01

Pos logit contributions
 wounds+0.117
 nurses+0.116
 dentist+0.115
 hospital+0.114
 mistake+0.107
Neg logit contributions
 together-0.070
apped-0.067
 on-0.066
ges-0.066
uts-0.064
Token fun
Token ID1257
Feature activation+3.10e+00

Pos logit contributions
 nurses+0.007
 hospital+0.007
 dentist+0.007
 wounds+0.007
ureen+0.007
Neg logit contributions
 together-0.004
uts-0.004
 This-0.004
aps-0.004
ges-0.003
Token.
Token ID13
Feature activation+5.54e+00

Pos logit contributions
ges+0.047
elt+0.046
uts+0.046
uttered+0.044
 This+0.043
Neg logit contributions
 hospital-0.075
 mess-0.067
 dentist-0.067
 mistake-0.066
 nurses-0.066
Token
Token ID198
Feature activation+2.87e+00

Pos logit contributions
ges+0.081
uts+0.080
uttering+0.080
iked+0.074
ore+0.074
Neg logit contributions
 hospital-0.133
 dentist-0.126
 nurses-0.120
 forgot-0.114
 doctors-0.113
Token
Token ID198
Feature activation+2.60e+00

Pos logit contributions
ges+0.054
uts+0.053
aps+0.049
uttering+0.049
elt+0.048
Neg logit contributions
 hospital-0.059
 dentist-0.056
 nurses-0.054
 wounds-0.050
 mistake-0.049
TokenTheir
Token ID14574
Feature activation-1.08e+01

Pos logit contributions
ges+0.051
uts+0.051
aps+0.048
 This+0.047
uttering+0.046
Neg logit contributions
 hospital-0.051
 dentist-0.048
 nurses-0.048
 punished-0.043
 wounds-0.043
Token moms
Token ID34015
Feature activation-1.94e+00

Pos logit contributions
 nurses+0.176
 hospital+0.175
 dentist+0.172
ureen+0.163
 cal+0.160
Neg logit contributions
aps-0.219
uts-0.219
ges-0.211
apped-0.202
uttering-0.198
Token were
Token ID547
Feature activation-2.13e+00

Pos logit contributions
 hospital+0.032
 nurses+0.031
 dentist+0.030
 wounds+0.029
 doctors+0.027
Neg logit contributions
ges-0.042
aps-0.040
uts-0.039
apped-0.038
 This-0.038
Token proud
Token ID6613
Feature activation+2.07e+00

Pos logit contributions
 hospital+0.038
 nurses+0.036
 dentist+0.036
 wounds+0.031
 forgot+0.031
Neg logit contributions
ges-0.044
uts-0.044
aps-0.044
uttering-0.042
apping-0.041
Token.
Token ID13
Feature activation+4.46e+00

Pos logit contributions
ges+0.026
uts+0.025
aps+0.024
 This+0.023
uttering+0.023
Neg logit contributions
 hospital-0.056
 nurses-0.054
 dentist-0.053
 punished-0.049
 doctors-0.049
Token They
Token ID1119
Feature activation-6.01e+00

Pos logit contributions
ges+0.067
uttering+0.062
uts+0.061
aps+0.057
ore+0.057
Neg logit contributions
 hospital-0.111
 nurses-0.105
 dentist-0.104
 doctors-0.096
 cal-0.094
Token is
Token ID318
Feature activation-1.75e+00

Pos logit contributions
 nurses+0.050
 hospital+0.050
 dentist+0.049
 happiest+0.044
 punished+0.044
Neg logit contributions
uts-0.036
ges-0.036
aps-0.035
 This-0.033
ore-0.033
Token brave
Token ID14802
Feature activation+2.15e+00

Pos logit contributions
 hospital+0.031
 nurses+0.030
 dentist+0.030
 wounds+0.026
 punished+0.025
Neg logit contributions
ges-0.036
uts-0.036
aps-0.035
 This-0.033
uttering-0.033
Token.
Token ID13
Feature activation+5.81e+00

Pos logit contributions
ges+0.028
uts+0.027
aps+0.026
uttering+0.025
 This+0.024
Neg logit contributions
 hospital-0.057
 nurses-0.054
 dentist-0.054
 punished-0.050
 wounds-0.050
Token
Token ID198
Feature activation+3.07e+00

Pos logit contributions
ges+0.100
uttering+0.098
uts+0.094
iked+0.088
ore+0.085
Neg logit contributions
 hospital-0.122
 dentist-0.113
 cal-0.110
 nurses-0.106
 doctors-0.105
Token
Token ID198
Feature activation+2.71e+00

Pos logit contributions
ges+0.040
uts+0.038
uttering+0.035
elt+0.034
aps+0.034
Neg logit contributions
 hospital-0.081
 dentist-0.077
 nurses-0.074
 wounds-0.072
 cal-0.071
TokenTheir
Token ID14574
Feature activation-1.08e+01

Pos logit contributions
ges+0.045
uts+0.043
aps+0.042
 This+0.040
uttering+0.039
Neg logit contributions
 hospital-0.062
 nurses-0.059
 dentist-0.059
 punished-0.054
 wounds-0.054
Token dad
Token ID9955
Feature activation-2.33e+00

Pos logit contributions
 hospital+0.165
 nurses+0.163
 dentist+0.158
 punished+0.146
 happiest+0.138
Neg logit contributions
uts-0.239
ges-0.239
aps-0.238
uttering-0.224
uttered-0.220
Token says
Token ID1139
Feature activation+1.24e+00

Pos logit contributions
 nurses+0.046
 hospital+0.045
 dentist+0.043
 happiest+0.043
 doctors+0.040
Neg logit contributions
aps-0.043
ges-0.042
ore-0.039
uts-0.038
 This-0.037
Token,
Token ID11
Feature activation-1.13e+00

Pos logit contributions
ges+0.018
uts+0.017
aps+0.016
uttering+0.016
ore+0.015
Neg logit contributions
 hospital-0.032
 dentist-0.030
 nurses-0.030
 doctors-0.028
 cal-0.028
Token "
Token ID366
Feature activation+2.85e+00

Pos logit contributions
 hospital+0.025
 dentist+0.025
 nurses+0.024
 doctors+0.022
 cal+0.022
Neg logit contributions
ges-0.019
uttering-0.018
aps-0.018
uts-0.017
uttered-0.016
TokenWow
Token ID22017
Feature activation+2.41e+00

Pos logit contributions
ges+0.045
uts+0.043
aps+0.042
 This+0.040
ore+0.039
Neg logit contributions
 hospital-0.068
 dentist-0.064
 nurses-0.064
 punished-0.059
 wounds-0.058
Token and
Token ID290
Feature activation-4.09e+00

Pos logit contributions
 hospital+0.027
 dentist+0.025
 nurses+0.025
 punished+0.022
 happiest+0.022
Neg logit contributions
ges-0.034
uts-0.034
aps-0.033
 This-0.031
uttering-0.031
Token more
Token ID517
Feature activation-2.63e+00

Pos logit contributions
 hospital+0.077
 dentist+0.073
 nurses+0.071
ureen+0.063
 punished+0.061
Neg logit contributions
ges-0.084
aps-0.082
 together-0.081
uts-0.080
uttering-0.075
Token sand
Token ID6450
Feature activation+4.59e+00

Pos logit contributions
 hospital+0.045
 dentist+0.042
 nurses+0.042
 punished+0.037
 happiest+0.037
Neg logit contributions
ges-0.059
uts-0.058
aps-0.057
 This-0.054
ore-0.053
Token together
Token ID1978
Feature activation+1.18e+00

Pos logit contributions
ges+0.060
 This+0.056
aps+0.055
ore+0.054
uttering+0.053
Neg logit contributions
 hospital-0.113
 dentist-0.107
 nurses-0.106
 wounds-0.104
 forgot-0.102
Token until
Token ID1566
Feature activation+2.20e-01

Pos logit contributions
aps+0.018
elt+0.017
uts+0.017
ges+0.017
uttering+0.017
Neg logit contributions
 hospital-0.026
 forgot-0.024
 happiest-0.023
 nurses-0.023
 dentist-0.023
Token their
Token ID511
Feature activation-1.08e+01

Pos logit contributions
uts+0.004
ges+0.004
aps+0.004
uttering+0.004
apping+0.004
Neg logit contributions
 hospital-0.005
 nurses-0.004
 dentist-0.004
 forgot-0.004
 cal-0.004
Token castle
Token ID16669
Feature activation+2.54e+00

Pos logit contributions
 hospital+0.181
 nurses+0.172
 dentist+0.171
 punished+0.154
 forgot+0.146
Neg logit contributions
ges-0.235
uts-0.233
aps-0.232
uttering-0.221
 This-0.219
Token was
Token ID373
Feature activation-2.65e+00

Pos logit contributions
uts+0.043
ed+0.040
aps+0.039
apped+0.039
uttered+0.037
Neg logit contributions
 nurses-0.059
 hospital-0.053
 dentist-0.052
 wounds-0.050
 happiest-0.048
Token finished
Token ID5201
Feature activation+5.46e+00

Pos logit contributions
 hospital+0.049
 nurses+0.046
 dentist+0.046
 punished+0.040
 wounds+0.040
Neg logit contributions
ges-0.055
uts-0.054
aps-0.053
 This-0.051
uttering-0.051
Token.
Token ID13
Feature activation+5.33e+00

Pos logit contributions
ges+0.090
uts+0.083
aps+0.081
elt+0.080
 This+0.079
Neg logit contributions
 hospital-0.119
 mess-0.114
 mistake-0.111
 dentist-0.111
 forgot-0.108
Token
Token ID220
Feature activation+5.60e+00

Pos logit contributions
ges+0.099
uttering+0.099
uts+0.096
aps+0.089
apping+0.086
Neg logit contributions
 hospital-0.101
 tears-0.091
 nurses-0.090
 dentist-0.090
 forgot-0.088
Token lots
Token ID6041
Feature activation+2.16e+00

Pos logit contributions
ges+0.145
uts+0.143
aps+0.139
 This+0.132
uttering+0.132
Neg logit contributions
 hospital-0.139
 nurses-0.131
 dentist-0.131
 punished-0.117
 wounds-0.117
Token of
Token ID286
Feature activation-2.31e+00

Pos logit contributions
ges+0.024
uts+0.023
aps+0.022
 This+0.020
ore+0.020
Neg logit contributions
 hospital-0.061
 dentist-0.059
 nurses-0.059
 punished-0.055
 happiest-0.054
Token delicious
Token ID12625
Feature activation+2.11e+00

Pos logit contributions
 hospital+0.044
 dentist+0.043
 nurses+0.042
 conductor+0.039
 engineer+0.038
Neg logit contributions
ges-0.045
aps-0.044
 together-0.042
uts-0.042
 This-0.042
Token food
Token ID2057
Feature activation+5.09e+00

Pos logit contributions
ges+0.050
uts+0.049
aps+0.048
uttering+0.046
 This+0.046
Neg logit contributions
 hospital-0.033
 nurses-0.030
 dentist-0.030
 wounds-0.026
 punished-0.026
Token and
Token ID290
Feature activation-4.02e+00

Pos logit contributions
ges+0.082
aps+0.078
uttering+0.074
uts+0.073
 This+0.073
Neg logit contributions
 hospital-0.116
 dentist-0.109
 nurses-0.108
 forgot-0.102
 mess-0.101
Token played
Token ID2826
Feature activation+7.70e+00

Pos logit contributions
 hospital+0.077
 dentist+0.075
 nurses+0.070
 conductor+0.065
 engineer+0.065
Neg logit contributions
ges-0.077
 together-0.074
uts-0.073
ed-0.069
aps-0.069
Token games
Token ID1830
Feature activation+8.47e+00

Pos logit contributions
ges+0.097
uts+0.095
aps+0.086
uttering+0.081
elt+0.081
Neg logit contributions
 hospital-0.208
 dentist-0.195
 nurses-0.193
 wounds-0.184
 doctors-0.181
Token the
Token ID262
Feature activation-7.17e+00

Pos logit contributions
ges+0.100
uts+0.093
aps+0.091
 This+0.086
uttering+0.084
Neg logit contributions
 hospital-0.232
 dentist-0.221
 nurses-0.219
 punished-0.204
 forgot-0.204
Token entire
Token ID2104
Feature activation-3.51e+00

Pos logit contributions
 hospital+0.127
 nurses+0.125
 dentist+0.123
 punished+0.113
 wounds+0.109
Neg logit contributions
ges-0.136
uts-0.135
aps-0.131
uttering-0.130
 This-0.129
Token day
Token ID1110
Feature activation+3.25e+00

Pos logit contributions
 hospital+0.076
 nurses+0.074
 dentist+0.073
 punished+0.068
 wounds+0.068
Neg logit contributions
ges-0.058
uts-0.056
aps-0.055
 This-0.054
ore-0.053
Token.
Token ID13
Feature activation+5.33e+00

Pos logit contributions
ges+0.046
aps+0.044
uts+0.043
elt+0.043
 This+0.040
Neg logit contributions
 hospital-0.074
 nurses-0.069
 dentist-0.068
 happiest-0.067
 forgot-0.067
Token it
Token ID340
Feature activation+1.03e+00

Pos logit contributions
ges+0.051
uts+0.050
aps+0.048
 This+0.046
uttering+0.046
Neg logit contributions
 hospital-0.051
 dentist-0.048
 nurses-0.048
 punished-0.043
 happiest-0.043
Token.
Token ID13
Feature activation+5.89e+00

Pos logit contributions
ges+0.015
aps+0.014
uts+0.014
uttering+0.013
elt+0.012
Neg logit contributions
 hospital-0.026
 dentist-0.025
 nurses-0.024
 punished-0.023
 wounds-0.022
Token Then
Token ID3244
Feature activation+6.45e-01

Pos logit contributions
uttering+0.097
ges+0.097
uts+0.093
aps+0.089
ore+0.086
Neg logit contributions
 hospital-0.135
 dentist-0.130
 nurses-0.124
 doctors-0.112
 cal-0.111
Token they
Token ID484
Feature activation-4.77e+00

Pos logit contributions
ges+0.010
uts+0.010
aps+0.010
uttering+0.009
 This+0.009
Neg logit contributions
 hospital-0.015
 nurses-0.014
 dentist-0.014
 punished-0.013
 happiest-0.013
Token will
Token ID481
Feature activation+1.99e+00

Pos logit contributions
 hospital+0.085
 dentist+0.081
 happiest+0.079
 nurses+0.077
 wounds+0.076
Neg logit contributions
ges-0.098
aps-0.088
apped-0.087
 together-0.087
uts-0.086
Token wake
Token ID7765
Feature activation+7.32e+00

Pos logit contributions
ges+0.045
uts+0.045
aps+0.044
 This+0.042
uttering+0.041
Neg logit contributions
 hospital-0.032
 nurses-0.030
 dentist-0.030
 happiest-0.026
 punished-0.026
Token you
Token ID345
Feature activation-3.36e+00

Pos logit contributions
ges+0.109
uts+0.105
aps+0.102
 This+0.097
ore+0.096
Neg logit contributions
 hospital-0.177
 dentist-0.170
 nurses-0.170
 happiest-0.156
 punished-0.156
Token up
Token ID510
Feature activation+1.63e+00

Pos logit contributions
 hospital+0.064
 nurses+0.060
 dentist+0.060
 punished+0.054
 wounds+0.054
Neg logit contributions
ges-0.067
uts-0.066
aps-0.064
 This-0.061
uttering-0.061
Token and
Token ID290
Feature activation-3.25e+00

Pos logit contributions
ges+0.027
uts+0.026
aps+0.025
uttering+0.024
 This+0.024
Neg logit contributions
 hospital-0.037
 dentist-0.035
 nurses-0.035
 punished-0.032
 cal-0.032
Token take
Token ID1011
Feature activation+2.32e+00

Pos logit contributions
 hospital+0.057
 dentist+0.055
 nurses+0.054
ureen+0.049
 happiest+0.049
Neg logit contributions
ges-0.065
aps-0.063
uts-0.063
 This-0.062
ore-0.059
Token you
Token ID345
Feature activation-3.39e+00

Pos logit contributions
ges+0.041
uts+0.040
aps+0.039
 This+0.037
ore+0.037
Neg logit contributions
 hospital-0.050
 dentist-0.047
 nurses-0.047
 punished-0.043
 happiest-0.043
Token I
Token ID314
Feature activation-2.94e+00

Pos logit contributions
uttering+0.078
uts+0.073
ges+0.073
aps+0.067
uttered+0.066
Neg logit contributions
 hospital-0.159
 nurses-0.155
 dentist-0.153
 doctors-0.143
 boo-0.139
Token don
Token ID836
Feature activation+4.49e+00

Pos logit contributions
 hospital+0.064
 dentist+0.063
 nurses+0.062
 happiest+0.057
 wounds+0.056
Neg logit contributions
ges-0.050
uts-0.048
aps-0.047
ore-0.046
 This-0.045
Token't
Token ID470
Feature activation-3.42e+00

Pos logit contributions
ges+0.062
uttering+0.060
elt+0.058
ectar+0.057
aps+0.055
Neg logit contributions
 hospital-0.116
 dentist-0.103
 nurses-0.101
 punished-0.101
 doctors-0.099
Token want
Token ID765
Feature activation+3.47e+00

Pos logit contributions
 hospital+0.078
 dentist+0.076
 nurses+0.075
 wounds+0.069
 happiest+0.069
Neg logit contributions
ges-0.056
uts-0.052
aps-0.051
 This-0.049
ore-0.047
Token to
Token ID284
Feature activation-2.45e+00

Pos logit contributions
ges+0.039
aps+0.038
uttering+0.037
uts+0.037
apped+0.035
Neg logit contributions
 hospital-0.098
 nurses-0.095
 dentist-0.092
 doctors-0.087
 punished-0.086
Token bite
Token ID13197
Feature activation+8.08e+00

Pos logit contributions
 hospital+0.039
 nurses+0.035
 dentist+0.035
 punished+0.031
 forgot+0.031
Neg logit contributions
ges-0.058
uts-0.057
aps-0.056
uttering-0.054
 This-0.053
Token anymore
Token ID7471
Feature activation+4.49e+00

Pos logit contributions
uttering+0.143
uttered+0.138
aps+0.137
uts+0.136
apped+0.136
Neg logit contributions
 hospital-0.177
 nurses-0.170
 dentist-0.157
 doctors-0.153
 wounds-0.152
Token.
Token ID13
Feature activation+5.83e+00

Pos logit contributions
uttering+0.085
uttered+0.083
elt+0.077
ges+0.077
apped+0.076
Neg logit contributions
 hospital-0.099
 dentist-0.090
 mess-0.089
 nurses-0.088
 forgot-0.087
Token Please
Token ID4222
Feature activation+2.47e+00

Pos logit contributions
uttering+0.079
ges+0.075
uts+0.072
aps+0.068
uttered+0.067
Neg logit contributions
 hospital-0.151
 dentist-0.147
 nurses-0.146
 doctors-0.136
 boo-0.135
Token help
Token ID1037
Feature activation+4.94e+00

Pos logit contributions
ges+0.053
uts+0.052
aps+0.051
uttering+0.049
 This+0.048
Neg logit contributions
 hospital-0.044
 dentist-0.041
 nurses-0.041
 punished-0.037
 forgot-0.037
Token me
Token ID502
Feature activation-1.68e+00

Pos logit contributions
ges+0.115
uts+0.112
aps+0.110
 This+0.106
ore+0.105
Neg logit contributions
 hospital-0.080
 dentist-0.074
 nurses-0.074
 punished-0.065
 happiest-0.064
Token Soon
Token ID15894
Feature activation-1.51e+00

Pos logit contributions
uttering+0.077
ges+0.076
uts+0.074
aps+0.072
ore+0.066
Neg logit contributions
 hospital-0.117
 dentist-0.112
 nurses-0.109
 cal-0.099
 happiest-0.099
Token it
Token ID340
Feature activation-2.41e+00

Pos logit contributions
 hospital+0.042
 dentist+0.041
 nurses+0.040
 wounds+0.038
 punished+0.038
Neg logit contributions
ges-0.017
uts-0.016
aps-0.016
 This-0.015
ore-0.014
Token was
Token ID373
Feature activation-1.90e+00

Pos logit contributions
 hospital+0.053
 nurses+0.053
 dentist+0.052
 wounds+0.051
 doctors+0.046
Neg logit contributions
uts-0.040
ges-0.039
ed-0.037
aps-0.036
 This-0.036
Token time
Token ID640
Feature activation+3.60e+00

Pos logit contributions
 hospital+0.032
 dentist+0.030
 nurses+0.030
 punished+0.027
 happiest+0.027
Neg logit contributions
ges-0.042
uts-0.042
aps-0.041
 This-0.039
uttering-0.039
Token to
Token ID284
Feature activation-3.76e+00

Pos logit contributions
uts+0.064
ges+0.063
aps+0.061
owers+0.057
ucked+0.057
Neg logit contributions
 hospital-0.071
 forgot-0.068
 dentist-0.066
 mess-0.065
 mistake-0.064
Token march
Token ID9960
Feature activation+8.30e+00

Pos logit contributions
 hospital+0.074
 dentist+0.073
 nurses+0.073
 wounds+0.067
 doctors+0.065
Neg logit contributions
ges-0.068
aps-0.066
uts-0.064
 This-0.063
uttering-0.061
Token back
Token ID736
Feature activation+1.53e+00

Pos logit contributions
ges+0.138
uts+0.135
aps+0.131
 This+0.122
uttering+0.122
Neg logit contributions
 hospital-0.188
 nurses-0.178
 dentist-0.178
 punished-0.162
 happiest-0.162
Token home
Token ID1363
Feature activation+6.74e+00

Pos logit contributions
ges+0.025
uts+0.024
aps+0.023
uttering+0.022
 This+0.022
Neg logit contributions
 hospital-0.036
 nurses-0.033
 dentist-0.032
 forgot-0.030
 punished-0.030
Token.
Token ID13
Feature activation+5.22e+00

Pos logit contributions
ges+0.090
uts+0.082
aps+0.080
uttering+0.076
 This+0.076
Neg logit contributions
 hospital-0.177
 dentist-0.167
 nurses-0.166
 forgot-0.157
 punished-0.156
Token The
Token ID383
Feature activation-7.75e+00

Pos logit contributions
ges+0.089
uttering+0.088
uts+0.085
aps+0.083
apping+0.076
Neg logit contributions
 hospital-0.117
 dentist-0.112
 nurses-0.107
 forgot-0.097
 cal-0.096
Token other
Token ID584
Feature activation-5.54e+00

Pos logit contributions
 punished+0.125
 boo+0.125
 wounds+0.124
 nurses+0.121
aret+0.120
Neg logit contributions
 on-0.132
 This-0.130
uttering-0.121
 at-0.120
ore-0.118
Token sad
Token ID6507
Feature activation+3.49e+00

Pos logit contributions
 hospital+0.025
 dentist+0.023
 nurses+0.023
 punished+0.020
 happiest+0.020
Neg logit contributions
ges-0.042
uts-0.042
aps-0.041
 This-0.039
uttering-0.039
Token.
Token ID13
Feature activation+4.86e+00

Pos logit contributions
ges+0.064
aps+0.058
uttering+0.058
uttered+0.057
uts+0.057
Neg logit contributions
 hospital-0.077
 nurses-0.073
 dentist-0.070
 doctors-0.067
 mess-0.067
Token She
Token ID1375
Feature activation-7.32e+00

Pos logit contributions
ges+0.075
uts+0.070
uttering+0.067
aps+0.067
ore+0.063
Neg logit contributions
 hospital-0.121
 dentist-0.116
 nurses-0.110
 doctors-0.103
 tears-0.102
Token started
Token ID2067
Feature activation+4.39e+00

Pos logit contributions
 dentist+0.149
 hospital+0.148
 nurses+0.144
 wounds+0.141
 stret+0.140
Neg logit contributions
ges-0.116
ed-0.114
apped-0.110
uts-0.109
elt-0.109
Token to
Token ID284
Feature activation-3.47e+00

Pos logit contributions
uts+0.040
ges+0.040
aps+0.037
 This+0.033
uttering+0.033
Neg logit contributions
 hospital-0.133
 nurses-0.127
 dentist-0.125
 punished-0.118
 wounds-0.117
Token weep
Token ID49671
Feature activation+7.75e+00

Pos logit contributions
 dentist+0.059
 hospital+0.056
 nurses+0.054
ureen+0.052
 happiest+0.050
Neg logit contributions
ges-0.074
aps-0.070
 This-0.068
uts-0.066
ore-0.066
Token,
Token ID11
Feature activation-9.22e-01

Pos logit contributions
ges+0.131
uts+0.124
aps+0.123
 This+0.117
ore+0.114
Neg logit contributions
 hospital-0.177
 nurses-0.166
 dentist-0.163
 doctors-0.151
 happiest-0.151
Token and
Token ID290
Feature activation-4.71e+00

Pos logit contributions
 hospital+0.019
 nurses+0.017
 dentist+0.017
 punished+0.016
 happiest+0.016
Neg logit contributions
ges-0.018
uts-0.017
aps-0.017
uttering-0.016
 This-0.016
Token the
Token ID262
Feature activation-8.72e+00

Pos logit contributions
 dentist+0.098
 hospital+0.098
 nurses+0.093
 wounds+0.089
ureen+0.088
Neg logit contributions
ges-0.080
uts-0.073
aps-0.072
 This-0.072
apped-0.069
Token tears
Token ID10953
Feature activation+1.12e+00

Pos logit contributions
 dentist+0.131
 hospital+0.131
 nurses+0.122
ureen+0.122
 punished+0.120
Neg logit contributions
ges-0.188
aps-0.187
uts-0.175
elt-0.175
ore-0.174
Token rolled
Token ID11686
Feature activation+5.70e+00

Pos logit contributions
uts+0.024
ges+0.024
aps+0.024
 This+0.023
uttering+0.023
Neg logit contributions
 hospital-0.019
 nurses-0.018
 dentist-0.017
 punished-0.016
 forgot-0.016
Token and
Token ID290
Feature activation-4.21e+00

Pos logit contributions
ges+0.046
uts+0.045
aps+0.043
 This+0.040
uttering+0.040
Neg logit contributions
 hospital-0.070
 nurses-0.067
 dentist-0.067
 punished-0.061
 wounds-0.061
Token started
Token ID2067
Feature activation+4.69e+00

Pos logit contributions
 hospital+0.081
 dentist+0.080
 nurses+0.080
 engineer+0.078
 happiest+0.076
Neg logit contributions
 together-0.077
ges-0.070
ed-0.069
uts-0.067
apped-0.066
Token to
Token ID284
Feature activation-2.98e+00

Pos logit contributions
uts+0.061
ges+0.060
uttering+0.058
apped+0.056
apping+0.055
Neg logit contributions
 hospital-0.121
 dentist-0.113
 boo-0.109
 cal-0.109
 mess-0.108
Token feel
Token ID1254
Feature activation-2.35e-01

Pos logit contributions
 nurses+0.052
 hospital+0.052
 dentist+0.050
ureen+0.045
 wounds+0.044
Neg logit contributions
ges-0.059
aps-0.059
uts-0.058
 This-0.056
ore-0.054
Token better
Token ID1365
Feature activation+7.24e-01

Pos logit contributions
 hospital+0.004
 nurses+0.003
 dentist+0.003
ureen+0.003
 conductor+0.003
Neg logit contributions
ges-0.005
aps-0.005
uts-0.005
 This-0.005
ore-0.005
Token.
Token ID13
Feature activation+5.18e+00

Pos logit contributions
ges+0.012
uttered+0.011
 This+0.011
uts+0.011
aps+0.011
Neg logit contributions
 hospital-0.017
 dentist-0.015
 mess-0.015
 forgot-0.015
 mistake-0.015
Token
Token ID198
Feature activation+3.32e+00

Pos logit contributions
ges+0.082
uttering+0.077
uts+0.076
aps+0.072
apping+0.069
Neg logit contributions
 hospital-0.126
 dentist-0.119
 nurses-0.112
 forgot-0.108
 tears-0.108
Token
Token ID198
Feature activation+3.05e+00

Pos logit contributions
ges+0.051
uts+0.050
uttering+0.049
elt+0.048
apped+0.046
Neg logit contributions
 hospital-0.075
 dentist-0.072
 forgot-0.064
 mess-0.064
 nurses-0.064
TokenAs
Token ID1722
Feature activation+1.86e+00

Pos logit contributions
ges+0.055
uts+0.054
aps+0.053
 This+0.051
uttering+0.049
Neg logit contributions
 hospital-0.064
 dentist-0.061
 nurses-0.061
 wounds-0.055
 happiest-0.055
Token the
Token ID262
Feature activation-8.52e+00

Pos logit contributions
ges+0.034
uts+0.033
aps+0.032
 This+0.030
ore+0.030
Neg logit contributions
 hospital-0.039
 nurses-0.037
 dentist-0.037
 wounds-0.035
 punished-0.034
Token days
Token ID1528
Feature activation+1.58e+00

Pos logit contributions
 hospital+0.137
 nurses+0.127
 dentist+0.127
 punished+0.115
 wounds+0.113
Neg logit contributions
ges-0.193
uts-0.190
aps-0.185
 This-0.179
uttering-0.178
Token pirates
Token ID27516
Feature activation+2.71e+00

Pos logit contributions
 hospital+0.039
 nurses+0.036
 dentist+0.036
 happiest+0.033
 doctors+0.033
Neg logit contributions
ges-0.036
uts-0.035
aps-0.034
ore-0.033
 This-0.033
Token and
Token ID290
Feature activation-3.92e+00

Pos logit contributions
ges+0.044
uts+0.042
aps+0.042
 This+0.039
ore+0.039
Neg logit contributions
 hospital-0.062
 dentist-0.059
 nurses-0.059
 punished-0.054
 happiest-0.054
Token look
Token ID804
Feature activation+2.01e+00

Pos logit contributions
 hospital+0.084
 dentist+0.083
 nurses+0.078
ureen+0.077
 punished+0.075
Neg logit contributions
ges-0.063
uts-0.062
 This-0.059
aps-0.057
 together-0.055
Token for
Token ID329
Feature activation-7.78e-01

Pos logit contributions
uts+0.031
ges+0.031
aps+0.029
uttering+0.027
ore+0.027
Neg logit contributions
 hospital-0.048
 nurses-0.046
 dentist-0.046
 happiest-0.043
 punished-0.043
Token treasure
Token ID14068
Feature activation+4.71e+00

Pos logit contributions
 hospital+0.015
 dentist+0.015
 nurses+0.013
ureen+0.013
 punished+0.013
Neg logit contributions
ges-0.015
aps-0.014
 together-0.014
uts-0.014
 This-0.013
Token.
Token ID13
Feature activation+5.39e+00

Pos logit contributions
ges+0.075
aps+0.070
 This+0.069
uttering+0.066
elt+0.065
Neg logit contributions
 hospital-0.110
 nurses-0.105
 dentist-0.104
 wounds-0.099
 doctors-0.099
Token
Token ID198
Feature activation+2.77e+00

Pos logit contributions
uts+0.078
ges+0.075
uttering+0.070
aps+0.068
ore+0.064
Neg logit contributions
 hospital-0.133
 dentist-0.122
 nurses-0.120
 forgot-0.113
 happiest-0.112
Token
Token ID198
Feature activation+2.58e+00

Pos logit contributions
uts+0.040
ges+0.039
elt+0.037
ed+0.036
uttering+0.035
Neg logit contributions
 hospital-0.065
 dentist-0.063
 nurses-0.058
 conductor-0.055
 mistake-0.055
TokenOne
Token ID3198
Feature activation-1.47e+00

Pos logit contributions
uts+0.056
ges+0.054
aps+0.053
ore+0.050
apped+0.050
Neg logit contributions
 hospital-0.044
 nurses-0.042
 dentist-0.040
 forgot-0.037
 doctors-0.036
Token day
Token ID1110
Feature activation+4.71e+00

Pos logit contributions
 hospital+0.034
 dentist+0.033
 nurses+0.032
 wounds+0.030
 punished+0.030
Neg logit contributions
ges-0.024
uts-0.023
aps-0.023
ore-0.022
 This-0.022
Token,
Token ID11
Feature activation-1.55e+00

Pos logit contributions
uts+0.105
aps+0.093
uttering+0.092
ore+0.091
to+0.090
Neg logit contributions
 forgot-0.070
 hospital-0.069
 nurses-0.069
 doctors-0.062
 way-0.059
Token see
Token ID766
Feature activation+1.11e+00

Pos logit contributions
 nurses+0.130
 dentist+0.126
 hospital+0.125
 doctors+0.117
 punished+0.111
Neg logit contributions
ges-0.073
aps-0.069
 This-0.065
uts-0.064
't-0.061
Token the
Token ID262
Feature activation-8.29e+00

Pos logit contributions
ges+0.016
aps+0.015
uts+0.014
 together+0.014
 This+0.014
Neg logit contributions
 hospital-0.026
 dentist-0.026
 punished-0.025
 nurses-0.025
 forgot-0.025
Token sky
Token ID6766
Feature activation+2.78e+00

Pos logit contributions
 nurses+0.157
 hospital+0.152
 dentist+0.147
 punished+0.144
 forgot+0.137
Neg logit contributions
aps-0.150
uts-0.146
ges-0.145
uttering-0.145
 This-0.134
Token,
Token ID11
Feature activation-2.10e+00

Pos logit contributions
ges+0.051
uts+0.050
aps+0.045
ore+0.045
 This+0.044
Neg logit contributions
 hospital-0.058
 cal-0.053
 mess-0.053
 dentist-0.052
 happiest-0.051
Token the
Token ID262
Feature activation-8.34e+00

Pos logit contributions
 hospital+0.044
 dentist+0.043
 nurses+0.043
 punished+0.040
ureen+0.037
Neg logit contributions
ges-0.038
aps-0.035
uts-0.034
 This-0.034
 together-0.033
Token trees
Token ID7150
Feature activation+9.48e-01

Pos logit contributions
 hospital+0.150
 nurses+0.147
 dentist+0.143
 punished+0.138
 forgot+0.132
Neg logit contributions
aps-0.159
ges-0.157
uts-0.154
uttering-0.150
 This-0.147
Token,
Token ID11
Feature activation-2.43e+00

Pos logit contributions
ges+0.015
uts+0.015
aps+0.015
ore+0.014
to+0.014
Neg logit contributions
 hospital-0.021
 cal-0.019
 dentist-0.018
 mess-0.018
 nurses-0.018
Token the
Token ID262
Feature activation-8.43e+00

Pos logit contributions
 hospital+0.057
 dentist+0.054
 nurses+0.053
 punished+0.052
 mistake+0.049
Neg logit contributions
ges-0.039
 on-0.036
aps-0.035
 together-0.035
uts-0.033
Token birds
Token ID10087
Feature activation-5.17e-01

Pos logit contributions
 hospital+0.146
 nurses+0.141
 dentist+0.140
 punished+0.133
 forgot+0.127
Neg logit contributions
aps-0.166
ges-0.165
uts-0.160
uttering-0.157
 This-0.153
Token,
Token ID11
Feature activation-3.01e+00

Pos logit contributions
 hospital+0.011
 happiest+0.010
 nurses+0.010
 cal+0.010
 dentist+0.010
Neg logit contributions
uts-0.009
ges-0.008
aps-0.008
 This-0.008
to-0.008
Token and
Token ID290
Feature activation-6.74e+00

Pos logit contributions
 hospital+0.073
 dentist+0.070
 nurses+0.068
 punished+0.067
 mistake+0.064
Neg logit contributions
ges-0.046
aps-0.041
 on-0.040
uts-0.040
 together-0.040
Token through
Token ID832
Feature activation+1.50e+00

Pos logit contributions
uts+0.119
ges+0.110
aps+0.105
uttering+0.098
 This+0.096
Neg logit contributions
 hospital-0.150
 nurses-0.140
 dentist-0.134
 happiest-0.128
 cal-0.127
Token the
Token ID262
Feature activation-7.20e+00

Pos logit contributions
ges+0.028
aps+0.027
uts+0.027
 This+0.026
ore+0.025
Neg logit contributions
 hospital-0.030
 dentist-0.029
 nurses-0.028
 punished-0.027
 forgot-0.026
Token farm
Token ID5318
Feature activation+6.25e+00

Pos logit contributions
 forgot+0.131
 punished+0.126
 nurses+0.124
 wounds+0.124
 dentist+0.121
Neg logit contributions
 This-0.133
aps-0.129
ore-0.127
 Today-0.121
 That-0.117
Token,
Token ID11
Feature activation-2.71e-01

Pos logit contributions
ges+0.080
uts+0.078
aps+0.073
uttering+0.067
 This+0.066
Neg logit contributions
 hospital-0.167
 nurses-0.158
 dentist-0.158
 happiest-0.147
 punished-0.145
Token she
Token ID673
Feature activation-5.18e+00

Pos logit contributions
 hospital+0.007
 dentist+0.007
 nurses+0.007
 wounds+0.006
 punished+0.006
Neg logit contributions
ges-0.003
uts-0.003
aps-0.003
 This-0.003
uttering-0.003
Token saw
Token ID2497
Feature activation+2.77e+00

Pos logit contributions
 hospital+0.088
 nurses+0.086
 dentist+0.084
 wounds+0.082
 mistake+0.079
Neg logit contributions
ges-0.102
ed-0.101
elt-0.098
uts-0.097
apped-0.096
Token a
Token ID257
Feature activation-3.58e+00

Pos logit contributions
ges+0.036
uts+0.034
ed+0.033
aps+0.032
elt+0.032
Neg logit contributions
 hospital-0.067
 dentist-0.064
 nurses-0.064
 forgot-0.063
 punished-0.062
Token big
Token ID1263
Feature activation-3.13e+00

Pos logit contributions
 hospital+0.060
 nurses+0.058
 dentist+0.055
 wounds+0.054
 punished+0.053
Neg logit contributions
ges-0.069
uts-0.069
aps-0.069
uttering-0.068
ore-0.067
Token barn
Token ID25203
Feature activation+5.69e+00

Pos logit contributions
 nurses+0.060
 hospital+0.059
 dentist+0.057
 punished+0.057
 wounds+0.057
Neg logit contributions
ges-0.054
 This-0.052
aps-0.051
ore-0.051
uts-0.049
Token.
Token ID13
Feature activation+5.51e+00

Pos logit contributions
ges+0.090
uts+0.086
aps+0.083
 This+0.076
ore+0.075
Neg logit contributions
 hospital-0.137
 dentist-0.128
 nurses-0.123
 doctors-0.116
 mess-0.116
Token She
Token ID1375
Feature activation-6.73e+00

Pos logit contributions
ges+0.078
uts+0.077
uttering+0.076
aps+0.074
apping+0.067
Neg logit contributions
 hospital-0.135
 dentist-0.130
 nurses-0.126
 tears-0.114
 doctors-0.114
Token the
Token ID262
Feature activation-8.89e+00

Pos logit contributions
 hospital+0.069
 dentist+0.065
 forgot+0.065
 punished+0.064
 happiest+0.063
Neg logit contributions
ges-0.052
aps-0.042
 together-0.041
 This-0.041
uttering-0.041
Token park
Token ID3952
Feature activation+4.39e+00

Pos logit contributions
 nurses+0.136
 dentist+0.134
 hospital+0.132
 punished+0.124
 wounds+0.119
Neg logit contributions
 This-0.191
aps-0.191
ges-0.190
uttering-0.183
ore-0.182
Token,
Token ID11
Feature activation-1.20e+00

Pos logit contributions
ges+0.063
uts+0.061
aps+0.059
 This+0.054
uttering+0.054
Neg logit contributions
 hospital-0.110
 nurses-0.104
 dentist-0.104
 happiest-0.096
 punished-0.096
Token but
Token ID475
Feature activation-2.02e+00

Pos logit contributions
 hospital+0.031
 dentist+0.030
 nurses+0.029
ureen+0.028
 mistake+0.027
Neg logit contributions
 together-0.015
ges-0.015
 on-0.014
 with-0.013
elt-0.013
Token they
Token ID484
Feature activation-5.17e+00

Pos logit contributions
 hospital+0.049
 dentist+0.048
 nurses+0.047
 wounds+0.044
 punished+0.043
Neg logit contributions
ges-0.030
uts-0.028
aps-0.027
 This-0.026
apped-0.026
Token couldn
Token ID3521
Feature activation+4.20e+00

Pos logit contributions
 hospital+0.114
 dentist+0.111
 nurses+0.110
 wounds+0.106
 cal+0.102
Neg logit contributions
ges-0.087
uts-0.081
apped-0.079
uttered-0.077
ed-0.076
Token't
Token ID470
Feature activation-3.69e+00

Pos logit contributions
ges+0.053
uttering+0.052
elt+0.047
ectar+0.047
aps+0.046
Neg logit contributions
 hospital-0.116
 dentist-0.106
 nurses-0.099
 cal-0.097
 tears-0.095
Token find
Token ID1064
Feature activation+2.36e+00

Pos logit contributions
 hospital+0.055
 nurses+0.052
 dentist+0.052
 punished+0.045
 wounds+0.044
Neg logit contributions
ges-0.089
uts-0.088
aps-0.085
 This-0.082
uttering-0.082
Token any
Token ID597
Feature activation-3.02e+00

Pos logit contributions
ges+0.040
 This+0.036
aps+0.035
uts+0.035
ore+0.034
Neg logit contributions
 hospital-0.053
 dentist-0.051
 nurses-0.050
 forgot-0.048
 punished-0.048
Token treasure
Token ID14068
Feature activation+3.47e+00

Pos logit contributions
 hospital+0.048
 nurses+0.046
 dentist+0.045
 punished+0.040
ureen+0.040
Neg logit contributions
ges-0.068
aps-0.066
uts-0.065
uttering-0.062
ore-0.061
Token.
Token ID13
Feature activation+5.39e+00

Pos logit contributions
ges+0.047
aps+0.046
uts+0.044
 This+0.042
uttering+0.042
Neg logit contributions
 hospital-0.089
 dentist-0.085
 nurses-0.083
 forgot-0.078
 doctors-0.078
Token,
Token ID11
Feature activation+4.45e-01

Pos logit contributions
uttering+0.057
ges+0.055
uts+0.054
aps+0.053
ore+0.051
Neg logit contributions
 hospital-0.059
 nurses-0.056
 cal-0.055
 dentist-0.055
 forgot-0.055
Token "
Token ID366
Feature activation+2.64e+00

Pos logit contributions
ges+0.008
aps+0.008
uttering+0.008
ore+0.008
uts+0.008
Neg logit contributions
 hospital-0.009
 dentist-0.008
 nurses-0.008
 forgot-0.008
 cal-0.008
TokenYes
Token ID5297
Feature activation+3.69e-01

Pos logit contributions
ges+0.044
uts+0.043
aps+0.042
ore+0.039
uttering+0.039
Neg logit contributions
 hospital-0.058
 nurses-0.056
 dentist-0.056
 happiest-0.052
 punished-0.051
Token,
Token ID11
Feature activation+8.83e-01

Pos logit contributions
ges+0.005
uts+0.004
aps+0.004
uttering+0.004
 This+0.004
Neg logit contributions
 hospital-0.010
 dentist-0.010
 nurses-0.010
 punished-0.009
 happiest-0.009
Token fans
Token ID3296
Feature activation+3.25e+00

Pos logit contributions
ges+0.011
uts+0.011
 This+0.010
aps+0.010
ore+0.009
Neg logit contributions
 hospital-0.024
 nurses-0.022
 dentist-0.022
 wounds-0.021
 punished-0.021
Token are
Token ID389
Feature activation-1.52e+00

Pos logit contributions
uts+0.052
uttering+0.050
aps+0.050
ges+0.048
uttered+0.046
Neg logit contributions
 hospital-0.069
 nurses-0.068
 forgot-0.064
 dentist-0.064
 punished-0.061
Token pretty
Token ID2495
Feature activation-1.39e-01

Pos logit contributions
 hospital+0.030
 dentist+0.029
 nurses+0.029
 happiest+0.029
 cal+0.028
Neg logit contributions
ges-0.028
uts-0.026
uttering-0.025
ore-0.025
aps-0.025
Token cool
Token ID3608
Feature activation+1.15e+00

Pos logit contributions
 hospital+0.003
 nurses+0.003
 dentist+0.003
 cal+0.003
 doctors+0.003
Neg logit contributions
ges-0.002
uts-0.002
aps-0.002
uttering-0.002
 This-0.002
Token!"
Token ID2474
Feature activation+5.38e+00

Pos logit contributions
ges+0.017
uts+0.016
uttering+0.016
aps+0.016
uttered+0.016
Neg logit contributions
 hospital-0.027
 nurses-0.026
 dentist-0.025
 mess-0.025
 doctors-0.024
Token<|endoftext|>
Token ID50256
Feature activation+7.28e+00

Pos logit contributions
ges+0.090
uttering+0.087
uts+0.086
aps+0.083
ore+0.076
Neg logit contributions
 hospital-0.118
 dentist-0.110
 nurses-0.107
 forgot-0.105
 tears-0.102
TokenOnce
Token ID7454
Feature activation+6.28e-01

Pos logit contributions
uttering+0.139
uts+0.139
iked+0.122
ges+0.120
itted+0.118
Neg logit contributions
 hospital-0.136
 dentist-0.134
 cal-0.119
 tears-0.115
 way-0.112
Token,
Token ID11
Feature activation-1.29e-01

Pos logit contributions
 hospital+0.015
 nurses+0.014
 dentist+0.014
 happiest+0.013
 forgot+0.013
Neg logit contributions
ges-0.007
uts-0.007
aps-0.007
uttering-0.007
elt-0.006
Token Lily
Token ID20037
Feature activation-6.28e+00

Pos logit contributions
 hospital+0.003
 dentist+0.003
 nurses+0.002
 wounds+0.002
 punished+0.002
Neg logit contributions
ges-0.002
uts-0.002
aps-0.002
 This-0.002
ore-0.002
Token made
Token ID925
Feature activation+3.29e+00

Pos logit contributions
 wounds+0.120
 hospital+0.116
 dentist+0.115
 nurses+0.113
 stret+0.111
Neg logit contributions
ed-0.116
ges-0.113
uts-0.110
 This-0.106
 together-0.104
Token sure
Token ID1654
Feature activation+2.20e-04

Pos logit contributions
aps+0.056
 together+0.055
ges+0.055
uts+0.053
 This+0.053
Neg logit contributions
 forgot-0.062
 dentist-0.061
 hospital-0.060
 nurses-0.059
 happiest-0.058
Token to
Token ID284
Feature activation-2.76e+00

Pos logit contributions
uts+0.000
ges+0.000
aps+0.000
 This+0.000
ore+0.000
Neg logit contributions
 hospital-0.000
 dentist-0.000
 nurses-0.000
 wounds-0.000
 happiest-0.000
Token always
Token ID1464
Feature activation-1.54e+00

Pos logit contributions
 hospital+0.045
 dentist+0.043
 nurses+0.043
 wounds+0.040
 happiest+0.038
Neg logit contributions
ges-0.057
uts-0.055
aps-0.055
 This-0.055
ore-0.052
Token be
Token ID307
Feature activation-2.88e+00

Pos logit contributions
 hospital+0.025
 dentist+0.024
 nurses+0.024
 wounds+0.021
 happiest+0.021
Neg logit contributions
ges-0.034
uts-0.033
aps-0.033
 This-0.032
ore-0.031
Token aware
Token ID3910
Feature activation+3.48e+00

Pos logit contributions
 hospital+0.046
 dentist+0.044
 nurses+0.040
ureen+0.038
 wounds+0.037
Neg logit contributions
ges-0.064
aps-0.063
 together-0.062
uttering-0.060
apping-0.060
Token of
Token ID286
Feature activation-1.05e+00

Pos logit contributions
ges+0.038
uts+0.036
aps+0.035
 This+0.032
ore+0.031
Neg logit contributions
 hospital-0.099
 dentist-0.095
 nurses-0.094
 punished-0.088
 wounds-0.088
Token the
Token ID262
Feature activation-7.45e+00

Pos logit contributions
 dentist+0.023
 hospital+0.023
 forgot+0.022
 nurses+0.021
ureen+0.021
Neg logit contributions
ges-0.017
uts-0.014
aps-0.014
 This-0.014
 together-0.014
Token alarms
Token ID36302
Feature activation+3.59e+00

Pos logit contributions
 dentist+0.119
 hospital+0.117
 forgot+0.113
ureen+0.113
 punished+0.107
Neg logit contributions
ges-0.140
aps-0.139
ore-0.134
 This-0.134
uttering-0.131
Token farther
Token ID18485
Feature activation+4.10e+00

Pos logit contributions
 dentist+0.104
 hospital+0.102
ureen+0.093
 nurses+0.093
 mistake+0.092
Neg logit contributions
 together-0.091
 on-0.080
ed-0.078
ges-0.074
 along-0.071
Token until
Token ID1566
Feature activation+2.61e-02

Pos logit contributions
ges+0.058
uts+0.056
aps+0.053
uttering+0.049
ore+0.049
Neg logit contributions
 hospital-0.104
 dentist-0.101
 nurses-0.100
 punished-0.091
 happiest-0.091
Token they
Token ID484
Feature activation-5.41e+00

Pos logit contributions
ges+0.000
uts+0.000
aps+0.000
 This+0.000
ore+0.000
Neg logit contributions
 hospital-0.001
 dentist-0.001
 nurses-0.001
 punished-0.001
 wounds-0.001
Token came
Token ID1625
Feature activation+3.03e+00

Pos logit contributions
 hospital+0.100
 nurses+0.099
 dentist+0.095
 wounds+0.092
 doctors+0.085
Neg logit contributions
ges-0.111
uts-0.108
aps-0.101
apped-0.100
ed-0.099
Token to
Token ID284
Feature activation-1.93e+00

Pos logit contributions
ges+0.053
uts+0.052
ore+0.049
 This+0.048
owers+0.045
Neg logit contributions
 hospital-0.062
 cal-0.056
 happiest-0.056
 nurses-0.055
 forgot-0.054
Token a
Token ID257
Feature activation-3.73e+00

Pos logit contributions
 hospital+0.045
 nurses+0.044
 dentist+0.043
 punished+0.043
 forgot+0.043
Neg logit contributions
ges-0.026
 This-0.024
aps-0.024
uts-0.022
uttering-0.022
Token park
Token ID3952
Feature activation+5.58e+00

Pos logit contributions
 wounds+0.065
 nurses+0.064
zie+0.059
ureen+0.058
 punished+0.058
Neg logit contributions
 This-0.065
ore-0.065
 together-0.064
aps-0.063
uttering-0.062
Token.
Token ID13
Feature activation+5.66e+00

Pos logit contributions
ges+0.075
uts+0.072
aps+0.067
uttering+0.063
elt+0.061
Neg logit contributions
 hospital-0.149
 dentist-0.140
 nurses-0.135
 cal-0.128
 happiest-0.128
Token\n
Token ID198
Feature activation+3.76e+00

Pos logit contributions
ges+0.085
uts+0.085
uttering+0.082
aps+0.078
apping+0.072
Neg logit contributions
 hospital-0.135
 dentist-0.126
 nurses-0.124
 tears-0.113
 cal-0.111
Token\n
Token ID198
Feature activation+3.35e+00

Pos logit contributions
ges+0.052
uttering+0.051
uts+0.050
elt+0.049
apped+0.046
Neg logit contributions
 hospital-0.091
 dentist-0.086
 nurses-0.081
 wounds-0.077
 cal-0.077
TokenThe
Token ID464
Feature activation-7.99e+00

Pos logit contributions
ges+0.069
uts+0.068
aps+0.066
ore+0.063
 This+0.063
Neg logit contributions
 hospital-0.062
 nurses-0.059
 dentist-0.058
 punished-0.053
 forgot-0.052
Token.
Token ID13
Feature activation+5.96e+00

Pos logit contributions
ges+0.059
uts+0.057
aps+0.055
 This+0.051
uttering+0.051
Neg logit contributions
 hospital-0.102
 dentist-0.097
 nurses-0.097
 punished-0.089
 happiest-0.089
Token It
Token ID632
Feature activation-3.48e+00

Pos logit contributions
ges+0.096
uttering+0.092
uts+0.090
aps+0.086
ore+0.082
Neg logit contributions
 hospital-0.138
 dentist-0.128
 tears-0.123
 nurses-0.120
 cal-0.118
Token did
Token ID750
Feature activation+2.41e+00

Pos logit contributions
 dentist+0.067
 nurses+0.067
 hospital+0.066
 wounds+0.058
ureen+0.058
Neg logit contributions
ges-0.064
uts-0.062
ed-0.061
 This-0.060
 together-0.058
Token not
Token ID407
Feature activation-2.88e+00

Pos logit contributions
ges+0.049
uts+0.048
aps+0.047
 This+0.046
uttering+0.044
Neg logit contributions
 hospital-0.045
 nurses-0.043
 dentist-0.043
 happiest-0.038
 punished-0.038
Token listen
Token ID6004
Feature activation+4.37e+00

Pos logit contributions
 hospital+0.061
 dentist+0.058
 nurses+0.057
 wounds+0.054
 happiest+0.052
Neg logit contributions
ges-0.051
uts-0.049
aps-0.046
 This-0.045
uttering-0.043
Token to
Token ID284
Feature activation-5.34e-01

Pos logit contributions
ges+0.064
aps+0.064
uttering+0.061
apped+0.060
uttered+0.059
Neg logit contributions
 hospital-0.106
 dentist-0.097
 mistake-0.097
 nurses-0.097
 mess-0.096
Token Lily
Token ID20037
Feature activation-3.84e+00

Pos logit contributions
 dentist+0.012
 hospital+0.012
 nurses+0.011
 punished+0.010
 happiest+0.010
Neg logit contributions
ges-0.009
 This-0.008
uts-0.008
aps-0.008
ore-0.008
Token.
Token ID13
Feature activation+5.11e+00

Pos logit contributions
 hospital+0.091
 dentist+0.081
 mistake+0.081
 nurses+0.080
 mess+0.079
Neg logit contributions
ges-0.060
uttered-0.060
uttering-0.059
aps-0.058
uts-0.058
Token It
Token ID632
Feature activation-3.84e+00

Pos logit contributions
ges+0.080
uttering+0.076
uts+0.075
aps+0.070
ore+0.067
Neg logit contributions
 hospital-0.123
 dentist-0.113
 nurses-0.109
 tears-0.107
 cal-0.106
Token kept
Token ID4030
Feature activation+3.89e+00

Pos logit contributions
 dentist+0.081
 nurses+0.079
 hospital+0.075
 workshop+0.072
ureen+0.071
Neg logit contributions
ed-0.062
ges-0.061
uts-0.059
 together-0.057
 on-0.056
Token running
Token ID2491
Feature activation+6.72e+00

Pos logit contributions
uts+0.077
aps+0.074
ges+0.072
ore+0.070
apping+0.068
Neg logit contributions
 hospital-0.078
 nurses-0.073
 dentist-0.071
 wounds-0.068
 punished-0.068
Token.
Token ID13
Feature activation+5.23e+00

Pos logit contributions
ges+0.147
elt+0.135
uttered+0.132
uts+0.132
ucked+0.124
Neg logit contributions
 hospital-0.213
 dentist-0.193
 doctors-0.183
 nurses-0.182
 floor-0.170
Token They
Token ID1119
Feature activation-5.19e+00

Pos logit contributions
ges+0.092
uts+0.091
uttering+0.090
aps+0.085
ore+0.084
Neg logit contributions
 hospital-0.106
 dentist-0.094
 forgot-0.094
 nurses-0.093
 tears-0.090
Token were
Token ID547
Feature activation-2.53e+00

Pos logit contributions
 hospital+0.113
 dentist+0.113
 wounds+0.107
 mistake+0.105
 nurses+0.101
Neg logit contributions
ges-0.094
apped-0.084
 together-0.082
ed-0.081
 This-0.080
Token both
Token ID1111
Feature activation-5.04e+00

Pos logit contributions
 hospital+0.057
 nurses+0.054
 dentist+0.054
 punished+0.049
 happiest+0.049
Neg logit contributions
ges-0.043
uts-0.042
aps-0.041
 This-0.038
uttering-0.038
Token competitive
Token ID7606
Feature activation+2.27e+00

Pos logit contributions
 hospital+0.094
 nurses+0.088
 dentist+0.088
 punished+0.079
 happiest+0.079
Neg logit contributions
ges-0.105
uts-0.103
aps-0.099
 This-0.095
ore-0.095
Token and
Token ID290
Feature activation-3.81e+00

Pos logit contributions
ges+0.036
uts+0.034
aps+0.032
uttering+0.032
ore+0.032
Neg logit contributions
 hospital-0.054
 nurses-0.051
 dentist-0.050
 forgot-0.047
 doctors-0.047
Token wanted
Token ID2227
Feature activation+4.57e+00

Pos logit contributions
 hospital+0.073
 dentist+0.072
 nurses+0.068
 wounds+0.064
 cal+0.062
Neg logit contributions
uts-0.076
ges-0.075
 This-0.071
aps-0.070
apped-0.070
Token to
Token ID284
Feature activation-1.91e+00

Pos logit contributions
ges+0.041
uts+0.039
aps+0.037
uttering+0.033
ore+0.033
Neg logit contributions
 hospital-0.140
 nurses-0.134
 dentist-0.134
 doctors-0.125
 punished-0.124
Token win
Token ID1592
Feature activation+7.62e+00

Pos logit contributions
 hospital+0.037
 nurses+0.035
 dentist+0.035
 punished+0.031
 wounds+0.031
Neg logit contributions
ges-0.037
uts-0.037
aps-0.036
 This-0.035
uttering-0.033
Token.
Token ID13
Feature activation+5.38e+00

Pos logit contributions
ges+0.127
uts+0.122
elt+0.118
uttering+0.116
aps+0.115
Neg logit contributions
 hospital-0.172
 nurses-0.160
 dentist-0.159
 wounds-0.151
 mistake-0.149
Token They
Token ID1119
Feature activation-5.23e+00

Pos logit contributions
ges+0.091
uttering+0.089
uts+0.088
aps+0.082
ore+0.078
Neg logit contributions
 hospital-0.112
 forgot-0.100
 dentist-0.099
 nurses-0.097
 tears-0.096
Token my
Token ID616
Feature activation-8.53e+00

Pos logit contributions
 hospital+0.063
 dentist+0.059
 nurses+0.059
 wounds+0.052
 forgot+0.051
Neg logit contributions
ges-0.065
aps-0.064
uts-0.063
 This-0.060
uttering-0.059
Token friends
Token ID2460
Feature activation-1.33e+00

Pos logit contributions
 hospital+0.147
 dentist+0.143
 nurses+0.137
 punished+0.129
 wounds+0.127
Neg logit contributions
aps-0.175
ges-0.173
uts-0.172
uttering-0.169
 This-0.165
Token?"
Token ID1701
Feature activation+5.94e+00

Pos logit contributions
 hospital+0.031
 dentist+0.029
 nurses+0.028
 happiest+0.027
 punished+0.027
Neg logit contributions
ges-0.022
uts-0.022
aps-0.021
uttering-0.020
uttered-0.020
Token\n
Token ID198
Feature activation+4.60e+00

Pos logit contributions
uts+0.099
ges+0.099
aps+0.096
uttering+0.094
apping+0.088
Neg logit contributions
 hospital-0.136
 nurses-0.130
 dentist-0.126
 cal-0.115
 punished-0.115
Token\n
Token ID198
Feature activation+4.05e+00

Pos logit contributions
ges+0.064
uts+0.062
uttering+0.060
aps+0.058
apped+0.055
Neg logit contributions
 hospital-0.113
 dentist-0.109
 nurses-0.105
 conductor-0.099
 cal-0.099
TokenThe
Token ID464
Feature activation-7.78e+00

Pos logit contributions
ges+0.092
uts+0.089
aps+0.089
uttering+0.084
ore+0.084
Neg logit contributions
 hospital-0.067
 nurses-0.064
 dentist-0.063
 punished-0.056
 cal-0.055
Token children
Token ID1751
Feature activation-2.43e+00

Pos logit contributions
 hospital+0.105
 dentist+0.096
 nurses+0.093
 punished+0.090
 forgot+0.087
Neg logit contributions
ges-0.191
uts-0.190
aps-0.185
 This-0.183
uttering-0.180
Token thought
Token ID1807
Feature activation+4.20e+00

Pos logit contributions
 hospital+0.044
 dentist+0.043
 nurses+0.041
 wounds+0.038
 doctors+0.037
Neg logit contributions
ges-0.050
uts-0.048
aps-0.048
apped-0.046
 This-0.046
Token this
Token ID428
Feature activation-4.77e+00

Pos logit contributions
ges+0.055
uts+0.053
 This+0.050
aps+0.050
ore+0.048
Neg logit contributions
 hospital-0.108
 dentist-0.105
 nurses-0.103
 punished-0.099
 forgot-0.097
Token was
Token ID373
Feature activation-8.01e-01

Pos logit contributions
 wounds+0.097
 hospital+0.096
 nurses+0.096
 dentist+0.095
 punished+0.092
Neg logit contributions
ore-0.078
 This-0.077
ges-0.076
uts-0.075
aps-0.071
Token a
Token ID257
Feature activation-3.35e+00

Pos logit contributions
 hospital+0.017
 nurses+0.016
 dentist+0.016
 wounds+0.015
 punished+0.014
Neg logit contributions
ges-0.013
 This-0.013
uts-0.013
aps-0.013
 together-0.012
Token what
Token ID644
Feature activation-5.93e-02

Pos logit contributions
 hospital+0.019
 dentist+0.018
 nurses+0.018
 forgot+0.017
 punished+0.017
Neg logit contributions
ges-0.011
 This-0.011
aps-0.011
uts-0.010
 together-0.010
Token they
Token ID484
Feature activation-4.92e+00

Pos logit contributions
 hospital+0.001
 dentist+0.001
 nurses+0.001
 doctors+0.001
 forgot+0.001
Neg logit contributions
ges-0.001
uttering-0.001
uts-0.001
aps-0.001
uttered-0.001
Token can
Token ID460
Feature activation+2.47e+00

Pos logit contributions
 hospital+0.103
 nurses+0.098
 dentist+0.098
 wounds+0.090
 doctors+0.087
Neg logit contributions
ges-0.087
uts-0.084
aps-0.082
 This-0.081
ore-0.078
Token buy
Token ID2822
Feature activation+4.42e+00

Pos logit contributions
ges+0.060
aps+0.056
uts+0.056
uttering+0.056
apped+0.055
Neg logit contributions
 hospital-0.039
 dentist-0.036
 nurses-0.035
 punished-0.031
 mess-0.031
Token with
Token ID351
Feature activation+8.17e-03

Pos logit contributions
ges+0.070
aps+0.067
uttering+0.066
uts+0.065
uttered+0.065
Neg logit contributions
 hospital-0.104
 dentist-0.098
 nurses-0.097
 doctors-0.090
 punished-0.090
Token their
Token ID511
Feature activation-8.16e+00

Pos logit contributions
ges+0.000
uts+0.000
aps+0.000
 This+0.000
ore+0.000
Neg logit contributions
 hospital-0.000
 dentist-0.000
 nurses-0.000
 happiest-0.000
 punished-0.000
Token money
Token ID1637
Feature activation+3.15e+00

Pos logit contributions
 hospital+0.134
 nurses+0.127
 dentist+0.126
 punished+0.112
 cal+0.112
Neg logit contributions
ges-0.179
uts-0.176
aps-0.175
 This-0.169
ore-0.163
Token.
Token ID13
Feature activation+5.41e+00

Pos logit contributions
ges+0.044
aps+0.041
uts+0.040
uttering+0.039
uttered+0.037
Neg logit contributions
 hospital-0.080
 nurses-0.076
 dentist-0.076
 punished-0.071
 forgot-0.070
Token\n
Token ID198
Feature activation+3.25e+00

Pos logit contributions
ges+0.087
uts+0.087
uttering+0.087
aps+0.074
ore+0.074
Neg logit contributions
 hospital-0.126
 nurses-0.115
 dentist-0.115
 tears-0.106
 cal-0.105
Token\n
Token ID198
Feature activation+2.96e+00

Pos logit contributions
ges+0.048
uts+0.047
elt+0.045
uttering+0.044
apped+0.041
Neg logit contributions
 hospital-0.079
 dentist-0.075
 nurses-0.070
 wounds-0.066
 cal-0.065
TokenThe
Token ID464
Feature activation-8.05e+00

Pos logit contributions
ges+0.055
uts+0.054
aps+0.051
 This+0.049
elt+0.049
Neg logit contributions
 hospital-0.062
 nurses-0.059
 dentist-0.057
 punished-0.053
 forgot-0.053
Token he
Token ID339
Feature activation-5.28e+00

Pos logit contributions
 hospital+0.100
 dentist+0.098
 nurses+0.094
 stret+0.088
 wounds+0.088
Neg logit contributions
uts-0.102
 together-0.100
ges-0.099
aps-0.096
 This-0.096
Token was
Token ID373
Feature activation-2.12e+00

Pos logit contributions
 nurses+0.097
 hospital+0.096
 dentist+0.096
 wounds+0.085
 stret+0.081
Neg logit contributions
uts-0.105
ges-0.100
ed-0.097
aps-0.096
 together-0.095
Token able
Token ID1498
Feature activation+1.49e+00

Pos logit contributions
 hospital+0.039
 nurses+0.038
 dentist+0.038
 wounds+0.033
 mess+0.032
Neg logit contributions
ges-0.042
uts-0.042
aps-0.041
 This-0.039
uttering-0.038
Token to
Token ID284
Feature activation-3.44e+00

Pos logit contributions
ges+0.015
uts+0.014
aps+0.013
ore+0.013
 This+0.012
Neg logit contributions
 hospital-0.043
 nurses-0.041
 dentist-0.041
 punished-0.040
 forgot-0.040
Token take
Token ID1011
Feature activation+2.39e+00

Pos logit contributions
 nurses+0.065
 hospital+0.064
 dentist+0.063
 wounds+0.056
 happiest+0.056
Neg logit contributions
uts-0.064
ges-0.063
aps-0.063
 This-0.060
ore-0.058
Token the
Token ID262
Feature activation-7.96e+00

Pos logit contributions
ges+0.045
uts+0.043
aps+0.042
 This+0.041
ore+0.040
Neg logit contributions
 hospital-0.049
 dentist-0.047
 nurses-0.046
 punished-0.042
 wounds-0.042
Token lead
Token ID1085
Feature activation+3.99e+00

Pos logit contributions
 hospital+0.128
 nurses+0.123
 dentist+0.122
 forgot+0.110
 wounds+0.110
Neg logit contributions
ges-0.172
uts-0.167
aps-0.165
 This-0.164
ore-0.160
Token.
Token ID13
Feature activation+5.28e+00

Pos logit contributions
ges+0.051
uts+0.049
aps+0.048
uttering+0.045
 This+0.042
Neg logit contributions
 hospital-0.105
 nurses-0.100
 dentist-0.099
 punished-0.093
 happiest-0.092
Token\n
Token ID198
Feature activation+3.72e+00

Pos logit contributions
ges+0.079
uts+0.076
uttering+0.075
aps+0.069
ore+0.066
Neg logit contributions
 hospital-0.127
 dentist-0.119
 nurses-0.112
 forgot-0.108
 happiest-0.108
Token\n
Token ID198
Feature activation+3.52e+00

Pos logit contributions
ges+0.042
uts+0.040
aps+0.036
uttering+0.035
elt+0.034
Neg logit contributions
 hospital-0.103
 dentist-0.099
 nurses-0.096
 cal-0.090
 punished-0.090
TokenFinally
Token ID11158
Feature activation+2.12e+00

Pos logit contributions
ges+0.068
uts+0.066
aps+0.064
 This+0.061
uttering+0.061
Neg logit contributions
 hospital-0.071
 dentist-0.066
 nurses-0.066
 punished-0.060
 happiest-0.059
Token with
Token ID351
Feature activation-1.77e+00

Pos logit contributions
ges+0.108
uts+0.104
aps+0.099
elt+0.097
ore+0.094
Neg logit contributions
 hospital-0.157
 nurses-0.145
 dentist-0.144
 doctors-0.136
 forgot-0.135
Token his
Token ID465
Feature activation-7.94e+00

Pos logit contributions
 hospital+0.038
 nurses+0.036
 dentist+0.036
 punished+0.034
 forgot+0.034
Neg logit contributions
ges-0.030
aps-0.029
uts-0.028
 This-0.027
 together-0.025
Token toy
Token ID13373
Feature activation-4.64e-01

Pos logit contributions
 hospital+0.128
 dentist+0.123
 nurses+0.121
 punished+0.111
 cal+0.110
Neg logit contributions
aps-0.175
ges-0.174
uts-0.173
uttering-0.164
 This-0.161
Token soldiers
Token ID5795
Feature activation+2.27e+00

Pos logit contributions
 hospital+0.006
 nurses+0.006
 dentist+0.006
 doctors+0.005
 happiest+0.005
Neg logit contributions
ges-0.012
uts-0.012
aps-0.012
 This-0.011
uttering-0.011
Token.
Token ID13
Feature activation+5.19e+00

Pos logit contributions
ges+0.035
uts+0.034
elt+0.033
aps+0.032
 This+0.032
Neg logit contributions
 hospital-0.053
 nurses-0.049
 dentist-0.048
 happiest-0.047
 forgot-0.047
Token He
Token ID679
Feature activation-6.23e+00

Pos logit contributions
ges+0.094
uts+0.094
uttering+0.094
aps+0.087
ung+0.086
Neg logit contributions
 hospital-0.110
 dentist-0.106
 nurses-0.092
 floor-0.088
 tears-0.087
Token liked
Token ID8288
Feature activation+3.37e+00

Pos logit contributions
 dentist+0.127
 hospital+0.124
 nurses+0.120
 wounds+0.118
 difference+0.113
Neg logit contributions
ges-0.109
apped-0.107
ed-0.107
aps-0.104
 together-0.103
Token to
Token ID284
Feature activation-3.47e+00

Pos logit contributions
ges+0.030
uts+0.029
aps+0.028
uttering+0.025
ore+0.024
Neg logit contributions
 hospital-0.102
 nurses-0.099
 dentist-0.098
 doctors-0.091
 punished-0.091
Token make
Token ID787
Feature activation+2.93e+00

Pos logit contributions
 dentist+0.068
 hospital+0.066
 nurses+0.065
 punished+0.060
ice+0.059
Neg logit contributions
ges-0.063
aps-0.062
 This-0.061
uts-0.059
 together-0.056
Token loud
Token ID7812
Feature activation+1.58e+00

Pos logit contributions
uts+0.061
aps+0.061
ges+0.061
 This+0.058
 together+0.057
Neg logit contributions
 hospital-0.049
 dentist-0.048
 nurses-0.048
 forgot-0.044
 punished-0.043
Token noises
Token ID26782
Feature activation+6.21e+00

Pos logit contributions
ges+0.036
uts+0.036
aps+0.035
 This+0.034
ore+0.033
Neg logit contributions
 hospital-0.025
 dentist-0.024
 nurses-0.024
 punished-0.021
 happiest-0.020
Token parts
Token ID3354
Feature activation+4.11e+00

Pos logit contributions
 hospital+0.053
 dentist+0.050
 nurses+0.049
 forgot+0.045
 punished+0.044
Neg logit contributions
ges-0.072
uts-0.070
aps-0.070
ore-0.068
 This-0.067
Token and
Token ID290
Feature activation-4.61e+00

Pos logit contributions
ges+0.058
uts+0.054
aps+0.054
uttering+0.053
 This+0.049
Neg logit contributions
 hospital-0.101
 nurses-0.096
 dentist-0.094
 punished-0.088
 cal-0.088
Token materials
Token ID5696
Feature activation+4.81e+00

Pos logit contributions
 hospital+0.086
 dentist+0.082
 nurses+0.078
 happiest+0.077
ureen+0.075
Neg logit contributions
uts-0.085
ges-0.084
 together-0.082
aps-0.080
 This-0.079
Token to
Token ID284
Feature activation-3.27e+00

Pos logit contributions
ges+0.067
aps+0.065
uts+0.064
uttering+0.060
 This+0.059
Neg logit contributions
 hospital-0.119
 nurses-0.113
 dentist-0.112
 punished-0.106
 forgot-0.104
Token make
Token ID787
Feature activation+3.60e+00

Pos logit contributions
 dentist+0.059
 hospital+0.058
 nurses+0.056
 happiest+0.052
 cal+0.052
Neg logit contributions
ges-0.063
 This-0.061
 together-0.059
aps-0.059
uts-0.059
Token the
Token ID262
Feature activation-7.16e+00

Pos logit contributions
ges+0.063
uts+0.061
aps+0.060
 This+0.057
ore+0.056
Neg logit contributions
 hospital-0.077
 dentist-0.073
 nurses-0.073
 forgot-0.067
 happiest-0.066
Token car
Token ID1097
Feature activation+2.55e+00

Pos logit contributions
 hospital+0.098
 dentist+0.096
 nurses+0.096
 forgot+0.093
ureen+0.092
Neg logit contributions
ges-0.154
aps-0.153
uts-0.149
 This-0.148
ore-0.147
Token.
Token ID13
Feature activation+5.65e+00

Pos logit contributions
uts+0.052
ges+0.051
aps+0.051
 This+0.048
ore+0.047
Neg logit contributions
 hospital-0.046
 dentist-0.044
 nurses-0.044
 punished-0.038
 wounds-0.038
Token He
Token ID679
Feature activation-5.86e+00

Pos logit contributions
ges+0.085
uttering+0.083
uts+0.082
aps+0.079
ore+0.072
Neg logit contributions
 hospital-0.137
 dentist-0.130
 nurses-0.127
 doctors-0.114
 happiest-0.114
Token worked
Token ID3111
Feature activation+4.73e+00

Pos logit contributions
 hospital+0.122
 dentist+0.119
 nurses+0.117
 wounds+0.110
 difference+0.108
Neg logit contributions
 together-0.103
apped-0.095
ed-0.095
uts-0.090
ges-0.088
Token very
Token ID845
Feature activation-2.15e+00

Pos logit contributions
ges+0.083
uts+0.081
aps+0.078
 This+0.074
uttering+0.074
Neg logit contributions
 hospital-0.103
 dentist-0.098
 nurses-0.098
 punished-0.088
 happiest-0.088