TOP ACTIVATIONS
MAX = 0.142

. " I think we shr ank too . The cushion
said . " They just shr ank by themselves !"
. He shr inks and shr inks . He becomes smaller
would play tag , hop sc ot ch , and many
for hours and hours , competing in lots of fun games
. Suddenly , it shr ank ! The goose was
. You can play hop sc ot ch again . But
They started to play hop sc ot ch and had lots
bird screamed again and fl apped its wings . It flew
hide and seek , hop sc ot ch and much more
. They swung and fl apped their arms again . They
. Each time , it shr ank ! The goose
She played tag and hop sc ot ch and pretended to
their beds . They n apped . They dreamed of colors
They kissed them and cl apped . They said , "
against the penny and it shr ank back to its regular
also played tag and hop sc ot ch . They laughed
. They laughed and cl apped when they made a match
. Anna laughed and cl apped . She said , "
bring the quarrel to a resolution , until everyone was happy

MIN ACTIVATIONS
MIN = -0.151

need . Even the t ini est creature needs our help
mom my ! Max broke my dress !" Sara cries .
from the sky and broke my spark . Can you help
excitement and proceeded to climb the stairs .
proud of herself for climbing the tree . She stayed on
his room and knocks on the door . He opens it
Daddy hear them and open the door . They see the
and he almost finished fixing the palace . But he was
loud and the guard shut the garden gate . Jim couldn
can have the rest of my avocado ." She pushes her
I found something hidden under my toy box !"
screams and runs out of the tunnel . She is angry
and Papa were sitting in their garden one summer season .
and soon she was crossing the finish line first .
D ott ie . D ott ie wanted to visit a
were two best friends , Mir a and Max . They
house . They knock on the door . Sam opens the
to say sorry for breaking the alarm and the fence .
They knock on the door . Dad opens it
. When D ott ie arrived at the lab

INTERVAL 0.069 - 0.142
CONTAINS 3.116%

her mom went outside to play . It was a very
ila and Ben liked to play outside when it was dark
rude . He did not care about me ."
He does not like to fight with Max . He does
covered in red bumps . They went to

INTERVAL -0.004 - 0.069
CONTAINS 55.779%

a big hug . It l icked his face !
to make sure you 're healthy ," his owner replied .
love you too . But can we still look at the
to the counter and said " Can I eat here ?".
An nie : “ I

INTERVAL -0.078 - -0.004
CONTAINS 33.456%

. She was so excited to feed the cross again soon
very angry . L ily and Ben came out
together . <|endoftext|> Once upon a time , there was a
house . " S ara ! Ben ! Come
Who pushed this ?" one of them asked .

INTERVAL -0.151 - -0.078
CONTAINS 7.649%

struggled to get out but the lake was too strong !
!" M ia 's mom watches her play .
it in her hands . The object was so pretty and
She even blew bubbles with her doll and passed the time
itself . The boy never forgot the lesson
Token.
Token ID13
Feature activation+6.12e-02

Pos logit contributions
uts+0.064
't+0.060
 Sorry+0.059
uttering+0.059
ges+0.058
Neg logit contributions
 hospital-0.089
 dentist-0.086
 stret-0.081
 nurses-0.080
 conductor-0.078
Token "
Token ID366
Feature activation+3.56e-02

Pos logit contributions
uttering+0.084
uts+0.081
iked+0.078
ges+0.078
 Sorry+0.077
Neg logit contributions
 hospital-0.131
 dentist-0.130
 conductor-0.118
 floor-0.117
 nurses-0.117
TokenI
Token ID40
Feature activation-3.51e-02

Pos logit contributions
uts+0.051
't+0.048
iked+0.047
ges+0.047
 together+0.047
Neg logit contributions
 hospital-0.072
 dentist-0.070
 nurses-0.065
 stret-0.065
 conductor-0.064
Token think
Token ID892
Feature activation+5.43e-02

Pos logit contributions
 dentist+0.075
 hospital+0.073
 stret+0.073
 nurses+0.072
 conductor+0.070
Neg logit contributions
't-0.049
uts-0.045
t-0.045
 together-0.044
ges-0.042
Token we
Token ID356
Feature activation-3.14e-02

Pos logit contributions
uts+0.053
't+0.047
ges+0.047
uttering+0.044
iked+0.044
Neg logit contributions
 hospital-0.137
 dentist-0.134
 nurses-0.127
 stret-0.125
 conductor-0.124
Token shr
Token ID10157
Feature activation+1.42e-01

Pos logit contributions
 dentist+0.070
 hospital+0.068
 stret+0.068
 nurses+0.064
 conductor+0.064
Neg logit contributions
't-0.043
uts-0.040
 together-0.038
ges-0.038
t-0.037
Tokenank
Token ID962
Feature activation+2.86e-02

Pos logit contributions
 Sorry+0.155
 Yes+0.142
 This+0.140
't+0.138
uts+0.130
Neg logit contributions
 ig-0.314
 hospital-0.307
abel-0.297
 wounds-0.297
 dentist-0.295
Token too
Token ID1165
Feature activation+2.31e-02

Pos logit contributions
uts+0.039
't+0.037
ges+0.036
 together+0.034
uttering+0.034
Neg logit contributions
 hospital-0.061
 dentist-0.059
 nurses-0.056
 stret-0.055
 conductor-0.054
Token.
Token ID13
Feature activation+6.08e-02

Pos logit contributions
uts+0.036
iked+0.035
ges+0.035
 Sorry+0.034
uttering+0.034
Neg logit contributions
 hospital-0.044
 dentist-0.042
 floor-0.039
 nurses-0.038
 cob-0.038
Token The
Token ID383
Feature activation-8.39e-02

Pos logit contributions
uttering+0.071
uts+0.070
iked+0.067
't+0.063
t+0.063
Neg logit contributions
 hospital-0.138
 dentist-0.135
 nurses-0.129
 conductor-0.128
 engineer-0.125
Token cushion
Token ID33579
Feature activation+2.28e-02

Pos logit contributions
 stret+0.100
 hospital+0.099
 dentist+0.097
 conductor+0.091
ki+0.091
Neg logit contributions
't-0.187
uts-0.176
 together-0.173
 Sorry-0.169
t-0.167
Token said
Token ID531
Feature activation+4.41e-02

Pos logit contributions
 dentist+0.125
 hospital+0.118
 ig+0.110
 nurses+0.108
 cob+0.108
Neg logit contributions
 together-0.108
't-0.107
uts-0.104
 Sorry-0.098
 This-0.096
Token.
Token ID13
Feature activation+5.66e-02

Pos logit contributions
uts+0.067
 Sorry+0.066
't+0.063
ges+0.062
uttering+0.062
Neg logit contributions
 hospital-0.083
 stret-0.078
 tears-0.075
 boo-0.073
 wounds-0.073
Token "
Token ID366
Feature activation+2.82e-02

Pos logit contributions
uttering+0.084
iked+0.082
uts+0.079
ges+0.078
used+0.076
Neg logit contributions
 hospital-0.106
 conductor-0.106
 dentist-0.103
 stret-0.100
 driver-0.098
TokenThey
Token ID2990
Feature activation-3.14e-02

Pos logit contributions
uts+0.041
iked+0.038
ges+0.037
't+0.037
uttering+0.036
Neg logit contributions
 hospital-0.057
 dentist-0.055
 nurses-0.053
 stret-0.052
 conductor-0.051
Token just
Token ID655
Feature activation+8.67e-03

Pos logit contributions
 dentist+0.074
 hospital+0.071
 stret+0.070
 conductor+0.067
ureen+0.066
Neg logit contributions
't-0.038
uts-0.035
 together-0.034
t-0.034
 Sorry-0.032
Token shr
Token ID10157
Feature activation+1.37e-01

Pos logit contributions
uts+0.014
ges+0.014
 together+0.014
't+0.014
 Sorry+0.013
Neg logit contributions
 dentist-0.015
 hospital-0.015
 stret-0.014
 conductor-0.014
 nurses-0.014
Tokenank
Token ID962
Feature activation+2.45e-02

Pos logit contributions
 Sorry+0.128
't+0.124
uts+0.122
 Yes+0.119
 This+0.118
Neg logit contributions
 hospital-0.319
 ig-0.315
 wounds-0.309
ice-0.309
 dentist-0.309
Token by
Token ID416
Feature activation-1.27e-02

Pos logit contributions
uts+0.035
't+0.033
ges+0.032
 together+0.032
 This+0.030
Neg logit contributions
 hospital-0.051
 dentist-0.050
 stret-0.046
 nurses-0.046
 conductor-0.045
Token themselves
Token ID2405
Feature activation-2.23e-02

Pos logit contributions
 hospital+0.025
 dentist+0.025
 stret+0.022
 nurses+0.022
 conductor+0.022
Neg logit contributions
uts-0.020
 together-0.019
ges-0.019
't-0.019
 This-0.017
Token!"
Token ID2474
Feature activation+5.30e-02

Pos logit contributions
 hospital+0.046
 dentist+0.043
 floor+0.043
 nurses+0.041
 stret+0.041
Neg logit contributions
uts-0.030
 Sorry-0.029
't-0.029
iked-0.028
uttering-0.028
Token\n
Token ID198
Feature activation+3.77e-02

Pos logit contributions
uttering+0.066
uts+0.065
ges+0.065
iked+0.064
uttered+0.059
Neg logit contributions
 hospital-0.118
 dentist-0.113
 stret-0.109
 conductor-0.106
 tears-0.104
Token.
Token ID13
Feature activation+5.18e-02

Pos logit contributions
 hospital+0.005
 dentist+0.005
 nurses+0.005
 stret+0.005
 conductor+0.005
Neg logit contributions
uts-0.003
't-0.002
ges-0.002
 together-0.002
uttering-0.002
Token He
Token ID679
Feature activation-5.27e-02

Pos logit contributions
uts+0.075
uttering+0.072
iked+0.072
ges+0.071
't+0.069
Neg logit contributions
 hospital-0.103
 dentist-0.102
 conductor-0.089
 floor-0.088
 nurses-0.086
Token shr
Token ID10157
Feature activation+1.26e-01

Pos logit contributions
 dentist+0.095
 stret+0.094
 nurses+0.093
 conductor+0.091
 hospital+0.090
Neg logit contributions
't-0.085
uts-0.081
 Sorry-0.076
 together-0.076
ges-0.074
Tokeninks
Token ID2973
Feature activation+6.11e-02

Pos logit contributions
 Sorry+0.131
 This+0.124
 Yes+0.122
't+0.120
uts+0.111
Neg logit contributions
 hospital-0.285
 wounds-0.277
 ig-0.276
 dentist-0.273
an-0.269
Token and
Token ID290
Feature activation+1.70e-04

Pos logit contributions
uts+0.090
't+0.085
 Sorry+0.081
ges+0.081
iked+0.080
Neg logit contributions
 hospital-0.123
 dentist-0.120
 nurses-0.111
 stret-0.111
 conductor-0.109
Token shr
Token ID10157
Feature activation+1.37e-01

Pos logit contributions
uts+0.000
't+0.000
ges+0.000
 together+0.000
 Sorry+0.000
Neg logit contributions
 hospital-0.000
 dentist-0.000
 nurses-0.000
 stret-0.000
 conductor-0.000
Tokeninks
Token ID2973
Feature activation+6.60e-02

Pos logit contributions
 Sorry+0.142
't+0.137
 Yes+0.131
 This+0.131
t+0.122
Neg logit contributions
 hospital-0.312
 dentist-0.304
 wounds-0.304
 ig-0.304
an-0.298
Token.
Token ID13
Feature activation+5.57e-02

Pos logit contributions
uts+0.088
't+0.083
 Sorry+0.078
ges+0.078
iked+0.077
Neg logit contributions
 hospital-0.142
 dentist-0.138
 nurses-0.129
 stret-0.128
 conductor-0.127
Token He
Token ID679
Feature activation-5.26e-02

Pos logit contributions
uts+0.077
ges+0.074
uttering+0.074
iked+0.073
 Sorry+0.070
Neg logit contributions
 hospital-0.114
 dentist-0.114
 conductor-0.099
 floor-0.098
 tears-0.095
Token becomes
Token ID4329
Feature activation-8.32e-03

Pos logit contributions
 stret+0.100
 nurses+0.098
 dentist+0.097
 conductor+0.094
 hospital+0.092
Neg logit contributions
't-0.082
uts-0.078
 together-0.075
 Sorry-0.073
aps-0.069
Token smaller
Token ID4833
Feature activation+4.18e-02

Pos logit contributions
 dentist+0.015
 hospital+0.015
 stret+0.014
 conductor+0.014
 nurses+0.014
Neg logit contributions
uts-0.014
 together-0.013
ges-0.012
't-0.012
 onwards-0.012
Token would
Token ID561
Feature activation+1.83e-02

Pos logit contributions
 stret+0.123
 difference+0.118
 wounds+0.116
 nurses+0.110
 dentist+0.110
Neg logit contributions
't-0.072
 together-0.072
 on-0.066
t-0.065
uts-0.062
Token play
Token ID711
Feature activation+7.21e-02

Pos logit contributions
't+0.031
 together+0.027
 This+0.027
uts+0.027
ges+0.026
Neg logit contributions
 dentist-0.032
 hospital-0.031
 stret-0.031
 nurses-0.030
an-0.030
Token tag
Token ID7621
Feature activation+7.63e-02

Pos logit contributions
uts+0.096
't+0.091
ges+0.087
 together+0.087
 This+0.083
Neg logit contributions
 hospital-0.155
 dentist-0.152
 nurses-0.144
 stret-0.143
 conductor-0.140
Token,
Token ID11
Feature activation+2.63e-02

Pos logit contributions
uts+0.086
 together+0.082
't+0.081
ges+0.076
 This+0.072
Neg logit contributions
 hospital-0.179
 dentist-0.177
 stret-0.168
 nurses-0.167
 conductor-0.163
Token hop
Token ID1725
Feature activation+8.37e-02

Pos logit contributions
't+0.042
 together+0.039
uts+0.038
ges+0.037
 This+0.037
Neg logit contributions
 dentist-0.049
 hospital-0.048
an-0.047
 stret-0.046
 conductor-0.044
Tokensc
Token ID1416
Feature activation+1.36e-01

Pos logit contributions
uts+0.131
 Sorry+0.120
ges+0.120
uttering+0.119
't+0.118
Neg logit contributions
 hospital-0.163
 dentist-0.152
 stret-0.149
 conductor-0.145
 nurses-0.144
Tokenot
Token ID313
Feature activation+2.01e-02

Pos logit contributions
uts+0.240
't+0.232
 Sorry+0.231
ges+0.225
uttering+0.222
Neg logit contributions
 hospital-0.210
an-0.198
 stret-0.193
 dentist-0.191
et-0.190
Tokench
Token ID354
Feature activation+5.47e-02

Pos logit contributions
uts+0.036
 Sorry+0.035
uttering+0.035
't+0.035
ges+0.034
Neg logit contributions
 hospital-0.032
 dentist-0.030
 stret-0.030
 nurses-0.029
 conductor-0.028
Token,
Token ID11
Feature activation+3.02e-02

Pos logit contributions
 Sorry+0.073
uts+0.071
uttering+0.067
ges+0.065
iked+0.064
Neg logit contributions
 hospital-0.112
 stret-0.109
 dentist-0.105
 floor-0.102
 rug-0.102
Token and
Token ID290
Feature activation+6.88e-03

Pos logit contributions
uts+0.044
't+0.043
 together+0.042
ges+0.041
 This+0.039
Neg logit contributions
 hospital-0.060
 dentist-0.060
 nurses-0.055
 stret-0.055
 conductor-0.054
Token many
Token ID867
Feature activation-2.03e-02

Pos logit contributions
't+0.011
 together+0.011
uts+0.011
ges+0.010
 This+0.010
Neg logit contributions
 dentist-0.012
 hospital-0.012
 stret-0.011
an-0.011
ice-0.011
Token for
Token ID329
Feature activation-2.28e-02

Pos logit contributions
uts+0.039
't+0.036
ges+0.035
 together+0.035
uttering+0.033
Neg logit contributions
 hospital-0.086
 dentist-0.085
 stret-0.080
 nurses-0.080
 conductor-0.078
Token hours
Token ID2250
Feature activation+8.16e-02

Pos logit contributions
 dentist+0.044
an+0.044
 conductor+0.041
 stret+0.041
 hospital+0.041
Neg logit contributions
 together-0.037
't-0.034
ges-0.031
uts-0.031
 This-0.031
Token and
Token ID290
Feature activation-2.12e-03

Pos logit contributions
uts+0.106
 Sorry+0.101
't+0.100
ges+0.096
 This+0.092
Neg logit contributions
 hospital-0.175
 dentist-0.169
 stret-0.161
 nurses-0.159
 conductor-0.157
Token hours
Token ID2250
Feature activation+7.47e-02

Pos logit contributions
 stret+0.004
 dentist+0.004
an+0.004
 hospital+0.004
 conductor+0.004
Neg logit contributions
 together-0.003
't-0.003
ges-0.003
 onwards-0.003
 on-0.003
Token,
Token ID11
Feature activation+1.99e-02

Pos logit contributions
uts+0.093
 Sorry+0.091
't+0.087
ges+0.082
 This+0.081
Neg logit contributions
 hospital-0.164
 dentist-0.158
 stret-0.153
 nurses-0.149
 conductor-0.148
Token competing
Token ID11780
Feature activation+1.35e-01

Pos logit contributions
 together+0.023
't+0.020
uts+0.020
ges+0.019
 onwards+0.018
Neg logit contributions
 dentist-0.048
 hospital-0.048
 nurses-0.045
 stret-0.045
 conductor-0.044
Token in
Token ID287
Feature activation-1.19e-02

Pos logit contributions
 Sorry+0.150
uts+0.143
't+0.133
ges+0.131
iked+0.123
Neg logit contributions
 hospital-0.307
 dentist-0.299
 conductor-0.296
 stret-0.281
 engine-0.281
Token lots
Token ID6041
Feature activation+2.50e-02

Pos logit contributions
 dentist+0.024
 hospital+0.023
 stret+0.023
an+0.022
 nurses+0.022
Neg logit contributions
 together-0.018
ges-0.016
't-0.016
uts-0.015
 This-0.015
Token of
Token ID286
Feature activation-3.86e-02

Pos logit contributions
uts+0.020
 together+0.020
ges+0.018
't+0.017
t+0.016
Neg logit contributions
 dentist-0.067
 hospital-0.067
 stret-0.064
 conductor-0.062
 nurses-0.062
Token fun
Token ID1257
Feature activation+5.71e-02

Pos logit contributions
 dentist+0.072
 hospital+0.067
an+0.065
 conductor+0.064
 stret+0.063
Neg logit contributions
 together-0.069
 on-0.057
 onwards-0.056
 Yes-0.055
ges-0.055
Token games
Token ID1830
Feature activation+8.23e-02

Pos logit contributions
uts+0.082
't+0.078
ges+0.076
 Sorry+0.073
uttering+0.072
Neg logit contributions
 hospital-0.118
 dentist-0.114
 nurses-0.107
 stret-0.107
 conductor-0.104
Token.
Token ID13
Feature activation+3.66e-02

Pos logit contributions
uts+0.013
't+0.012
ges+0.011
 Sorry+0.011
uttering+0.011
Neg logit contributions
 hospital-0.024
 dentist-0.024
 nurses-0.022
 stret-0.022
 conductor-0.022
Token\n
Token ID198
Feature activation+3.03e-02

Pos logit contributions
uts+0.046
uttering+0.044
ges+0.042
't+0.041
iked+0.040
Neg logit contributions
 hospital-0.081
 dentist-0.080
 nurses-0.073
 conductor-0.073
 stret-0.072
TokenSuddenly
Token ID38582
Feature activation+3.94e-02

Pos logit contributions
uts+0.030
ges+0.027
 Sorry+0.026
't+0.026
uttering+0.025
Neg logit contributions
 hospital-0.076
 dentist-0.075
 stret-0.070
 conductor-0.069
 nurses-0.069
Token,
Token ID11
Feature activation+7.21e-03

Pos logit contributions
uts+0.039
't+0.036
ges+0.034
 together+0.034
 This+0.031
Neg logit contributions
 hospital-0.099
 dentist-0.097
 stret-0.092
 nurses-0.091
 conductor-0.090
Token it
Token ID340
Feature activation-1.82e-02

Pos logit contributions
't+0.009
uts+0.009
 together+0.009
ges+0.009
 Sorry+0.008
Neg logit contributions
 hospital-0.015
 dentist-0.015
 stret-0.014
an-0.014
 conductor-0.014
Token shr
Token ID10157
Feature activation+1.35e-01

Pos logit contributions
 hospital+0.030
 dentist+0.030
 stret+0.030
 nurses+0.029
 conductor+0.026
Neg logit contributions
't-0.033
uts-0.032
t-0.030
 together-0.030
ges-0.030
Tokenank
Token ID962
Feature activation+2.48e-02

Pos logit contributions
 Sorry+0.128
uts+0.127
't+0.127
 Yes+0.115
 This+0.115
Neg logit contributions
 hospital-0.314
 dentist-0.309
 ig-0.300
an-0.295
ice-0.293
Token!
Token ID0
Feature activation+3.67e-02

Pos logit contributions
uts+0.031
't+0.029
ges+0.028
 together+0.027
 Sorry+0.026
Neg logit contributions
 hospital-0.056
 dentist-0.055
 nurses-0.051
 stret-0.051
 conductor-0.050
Token The
Token ID383
Feature activation-8.67e-02

Pos logit contributions
uts+0.054
uttering+0.050
ges+0.049
't+0.049
iked+0.049
Neg logit contributions
 hospital-0.073
 dentist-0.071
 stret-0.066
 nurses-0.065
 conductor-0.065
Token goose
Token ID37246
Feature activation-3.97e-02

Pos logit contributions
 hospital+0.106
 dentist+0.104
 stret+0.103
cod+0.095
et+0.095
Neg logit contributions
't-0.167
 Sorry-0.165
 Yes-0.165
uts-0.165
 together-0.163
Token was
Token ID373
Feature activation-2.74e-02

Pos logit contributions
 dentist+0.069
 hospital+0.067
 stret+0.065
 nurses+0.064
 conductor+0.060
Neg logit contributions
uts-0.070
't-0.069
 together-0.068
ges-0.065
t-0.063
Token.
Token ID13
Feature activation+4.53e-02

Pos logit contributions
 hospital+0.047
 dentist+0.045
 nurses+0.043
 stret+0.042
 conductor+0.041
Neg logit contributions
uts-0.026
ges-0.024
't-0.024
uttering-0.023
iked-0.023
Token You
Token ID921
Feature activation-3.97e-02

Pos logit contributions
uts+0.051
iked+0.044
uttering+0.044
ges+0.041
't+0.041
Neg logit contributions
 hospital-0.110
 dentist-0.109
 nurses-0.103
 conductor-0.100
 stret-0.098
Token can
Token ID460
Feature activation+2.72e-02

Pos logit contributions
 dentist+0.096
 stret+0.095
 hospital+0.095
 conductor+0.089
 nurses+0.088
Neg logit contributions
't-0.046
uts-0.041
 together-0.041
t-0.038
 Sorry-0.038
Token play
Token ID711
Feature activation+7.14e-02

Pos logit contributions
uts+0.046
't+0.046
 together+0.043
ges+0.043
 This+0.041
Neg logit contributions
 hospital-0.049
 dentist-0.048
 nurses-0.045
 stret-0.044
 conductor-0.043
Token hop
Token ID1725
Feature activation+8.47e-02

Pos logit contributions
uts+0.066
't+0.058
ges+0.057
iked+0.054
uttering+0.052
Neg logit contributions
 hospital-0.187
 dentist-0.182
 stret-0.172
 nurses-0.171
 conductor-0.169
Tokensc
Token ID1416
Feature activation+1.34e-01

Pos logit contributions
uts+0.134
iked+0.128
 Sorry+0.125
uttering+0.122
ges+0.119
Neg logit contributions
 hospital-0.163
 dentist-0.146
 floor-0.143
 stret-0.143
et-0.143
Tokenot
Token ID313
Feature activation+3.10e-02

Pos logit contributions
uts+0.238
't+0.230
 Sorry+0.230
uttering+0.224
ges+0.224
Neg logit contributions
 hospital-0.202
et-0.189
an-0.188
 stret-0.187
 dentist-0.185
Tokench
Token ID354
Feature activation+4.98e-02

Pos logit contributions
uts+0.061
uttering+0.060
 Sorry+0.060
iked+0.059
uttered+0.058
Neg logit contributions
 hospital-0.043
 stret-0.040
 dentist-0.039
et-0.038
 nurses-0.037
Token again
Token ID757
Feature activation+5.73e-02

Pos logit contributions
 Sorry+0.073
uts+0.071
iked+0.071
uttering+0.071
uttered+0.067
Neg logit contributions
 hospital-0.092
 stret-0.087
 floor-0.084
et-0.084
 rug-0.084
Token.
Token ID13
Feature activation+5.28e-02

Pos logit contributions
uts+0.078
iked+0.073
't+0.072
ges+0.071
uttering+0.070
Neg logit contributions
 hospital-0.123
 dentist-0.118
 stret-0.112
 nurses-0.110
 scarf-0.109
Token But
Token ID887
Feature activation+9.45e-03

Pos logit contributions
uts+0.057
uttering+0.049
ges+0.049
iked+0.049
't+0.047
Neg logit contributions
 hospital-0.129
 dentist-0.126
 nurses-0.120
 conductor-0.118
 stret-0.117
Token They
Token ID1119
Feature activation-2.92e-02

Pos logit contributions
uts+0.064
't+0.058
ges+0.057
uttering+0.056
iked+0.052
Neg logit contributions
 hospital-0.133
 dentist-0.128
 nurses-0.121
 stret-0.121
 conductor-0.119
Token started
Token ID2067
Feature activation+5.82e-02

Pos logit contributions
 dentist+0.069
 hospital+0.065
 stret+0.063
 difference+0.062
an+0.060
Neg logit contributions
 together-0.043
iked-0.035
 on-0.034
apped-0.034
uttered-0.033
Token to
Token ID284
Feature activation-3.39e-02

Pos logit contributions
uts+0.044
't+0.038
ges+0.035
 Sorry+0.035
uttering+0.032
Neg logit contributions
 hospital-0.161
 dentist-0.155
 nurses-0.150
 stret-0.150
 conductor-0.147
Token play
Token ID711
Feature activation+7.37e-02

Pos logit contributions
 dentist+0.078
 hospital+0.070
ice+0.070
ureen+0.069
an+0.069
Neg logit contributions
't-0.044
 together-0.044
ges-0.036
 Sorry-0.035
 This-0.035
Token hop
Token ID1725
Feature activation+8.69e-02

Pos logit contributions
 together+0.077
't+0.060
uts+0.059
ges+0.054
 on+0.053
Neg logit contributions
 dentist-0.191
 nurses-0.182
 hospital-0.182
 conductor-0.175
 stret-0.175
Tokensc
Token ID1416
Feature activation+1.34e-01

Pos logit contributions
uts+0.125
't+0.115
ges+0.114
 Sorry+0.110
iked+0.108
Neg logit contributions
 hospital-0.181
 dentist-0.172
 stret-0.164
 nurses-0.163
 conductor-0.161
Tokenot
Token ID313
Feature activation+2.56e-02

Pos logit contributions
uts+0.226
't+0.218
ges+0.215
 Sorry+0.212
iked+0.207
Neg logit contributions
 hospital-0.224
 dentist-0.205
an-0.205
 stret-0.202
et-0.195
Tokench
Token ID354
Feature activation+4.93e-02

Pos logit contributions
uts+0.046
 Sorry+0.044
't+0.043
uttering+0.043
ges+0.042
Neg logit contributions
 hospital-0.042
 stret-0.039
 dentist-0.039
 nurses-0.037
 conductor-0.036
Token and
Token ID290
Feature activation+4.43e-03

Pos logit contributions
 Sorry+0.067
uts+0.066
ges+0.061
uttering+0.061
iked+0.060
Neg logit contributions
 hospital-0.100
 stret-0.094
 dentist-0.093
 floor-0.090
 rug-0.088
Token had
Token ID550
Feature activation+1.69e-02

Pos logit contributions
 together+0.007
't+0.007
ges+0.006
uts+0.006
 on+0.006
Neg logit contributions
 dentist-0.009
 hospital-0.009
ice-0.008
 stret-0.008
 nurses-0.008
Token lots
Token ID6041
Feature activation+1.39e-02

Pos logit contributions
 together+0.025
uts+0.025
't+0.025
 This+0.023
t+0.022
Neg logit contributions
 dentist-0.033
 hospital-0.033
 nurses-0.031
 stret-0.031
 conductor-0.031
Token bird
Token ID6512
Feature activation-3.98e-02

Pos logit contributions
 stret+0.109
 hospital+0.104
 dentist+0.102
 nurses+0.101
ki+0.096
Neg logit contributions
't-0.187
uts-0.177
 together-0.174
 Sorry-0.171
uttering-0.165
Token screamed
Token ID25421
Feature activation+7.65e-02

Pos logit contributions
 dentist+0.078
 nurses+0.077
 stret+0.076
 hospital+0.076
 workshop+0.076
Neg logit contributions
 together-0.061
't-0.055
 on-0.054
uts-0.052
ed-0.050
Token again
Token ID757
Feature activation+5.73e-02

Pos logit contributions
uts+0.145
 Sorry+0.137
't+0.134
iked+0.132
uttering+0.129
Neg logit contributions
 hospital-0.129
 dentist-0.119
 stret-0.115
 nurses-0.111
 conductor-0.110
Token and
Token ID290
Feature activation+1.23e-02

Pos logit contributions
 Sorry+0.106
uts+0.101
uttered+0.100
't+0.100
uttering+0.098
Neg logit contributions
 hospital-0.098
 dentist-0.089
 mistake-0.086
 floor-0.085
 stret-0.083
Token fl
Token ID781
Feature activation+1.04e-01

Pos logit contributions
 together+0.020
't+0.018
 on+0.017
 Sorry+0.017
ges+0.016
Neg logit contributions
 stret-0.023
 dentist-0.023
 engineer-0.023
ice-0.022
an-0.022
Tokenapped
Token ID6320
Feature activation+1.33e-01

Pos logit contributions
't+0.094
uts+0.092
 This+0.088
 Do+0.086
 Sorry+0.080
Neg logit contributions
 hospital-0.243
aver-0.241
an-0.240
 conductor-0.237
 dentist-0.235
Token its
Token ID663
Feature activation-5.64e-02

Pos logit contributions
uts+0.222
't+0.209
ges+0.201
iked+0.196
uttering+0.194
Neg logit contributions
 hospital-0.250
 dentist-0.242
 nurses-0.224
 stret-0.223
 conductor-0.220
Token wings
Token ID12098
Feature activation+6.42e-02

Pos logit contributions
 hospital+0.080
 dentist+0.078
 nurses+0.070
 stret+0.069
 conductor+0.067
Neg logit contributions
uts-0.118
't-0.114
ges-0.111
 together-0.111
 Sorry-0.107
Token.
Token ID13
Feature activation+5.36e-02

Pos logit contributions
uts+0.083
't+0.078
ges+0.075
 Sorry+0.075
iked+0.072
Neg logit contributions
 hospital-0.140
 dentist-0.135
 stret-0.129
 nurses-0.127
 conductor-0.126
Token It
Token ID632
Feature activation-2.48e-02

Pos logit contributions
uts+0.081
ges+0.077
uttering+0.075
't+0.075
iked+0.075
Neg logit contributions
 hospital-0.107
 dentist-0.104
 stret-0.095
 nurses-0.094
 conductor-0.093
Token flew
Token ID13112
Feature activation+6.94e-02

Pos logit contributions
 dentist+0.048
 hospital+0.047
 workshop+0.047
 stret+0.047
 nurses+0.047
Neg logit contributions
 together-0.037
't-0.036
uts-0.034
 on-0.033
t-0.033
Token hide
Token ID7808
Feature activation+6.93e-02

Pos logit contributions
 together+0.045
uts+0.045
't+0.045
ges+0.041
 Yes+0.040
Neg logit contributions
 dentist-0.053
 hospital-0.053
 stret-0.049
an-0.048
 nurses-0.048
Token and
Token ID290
Feature activation+3.36e-03

Pos logit contributions
uts+0.092
 together+0.088
ges+0.085
't+0.084
iked+0.077
Neg logit contributions
 hospital-0.153
 dentist-0.152
 nurses-0.139
 stret-0.138
 conductor-0.136
Token seek
Token ID5380
Feature activation+2.93e-02

Pos logit contributions
uts+0.005
 together+0.005
't+0.005
 Sorry+0.004
ges+0.004
Neg logit contributions
 hospital-0.007
 dentist-0.007
 stret-0.006
 nurses-0.006
 conductor-0.006
Token,
Token ID11
Feature activation+2.22e-02

Pos logit contributions
uts+0.034
't+0.032
ges+0.031
 together+0.030
uttering+0.029
Neg logit contributions
 hospital-0.068
 dentist-0.067
 nurses-0.063
 stret-0.063
 conductor-0.062
Token hop
Token ID1725
Feature activation+7.63e-02

Pos logit contributions
uts+0.033
't+0.032
 together+0.032
ges+0.030
 This+0.029
Neg logit contributions
 hospital-0.044
 dentist-0.043
 stret-0.040
 nurses-0.040
 conductor-0.039
Tokensc
Token ID1416
Feature activation+1.33e-01

Pos logit contributions
uts+0.109
 Sorry+0.102
ges+0.099
't+0.098
uttering+0.098
Neg logit contributions
 hospital-0.154
 stret-0.144
 dentist-0.141
 conductor-0.140
 floor-0.138
Tokenot
Token ID313
Feature activation+1.18e-02

Pos logit contributions
uts+0.233
't+0.223
 Sorry+0.221
ges+0.215
 together+0.212
Neg logit contributions
 hospital-0.214
an-0.199
 stret-0.199
 dentist-0.193
et-0.191
Tokench
Token ID354
Feature activation+4.50e-02

Pos logit contributions
uts+0.020
't+0.019
 Sorry+0.019
uttering+0.019
ges+0.018
Neg logit contributions
 hospital-0.021
 dentist-0.020
 stret-0.019
 nurses-0.019
 conductor-0.018
Token and
Token ID290
Feature activation+1.53e-04

Pos logit contributions
uts+0.061
 Sorry+0.059
't+0.057
ges+0.056
uttering+0.055
Neg logit contributions
 hospital-0.092
 stret-0.088
 dentist-0.087
 nurses-0.083
 conductor-0.082
Token much
Token ID881
Feature activation+3.88e-03

Pos logit contributions
uts+0.000
't+0.000
 together+0.000
ges+0.000
 Sorry+0.000
Neg logit contributions
 hospital-0.000
 dentist-0.000
 stret-0.000
 nurses-0.000
 conductor-0.000
Token more
Token ID517
Feature activation-2.28e-02

Pos logit contributions
 together+0.008
uts+0.008
't+0.007
ges+0.007
 Sorry+0.007
Neg logit contributions
 dentist-0.006
 hospital-0.006
 conductor-0.005
 nurses-0.005
ice-0.005
Token.
Token ID13
Feature activation+4.78e-02

Pos logit contributions
uts+0.069
't+0.064
 Sorry+0.063
ges+0.063
uttering+0.061
Neg logit contributions
 hospital-0.116
 dentist-0.111
 stret-0.103
 nurses-0.102
 conductor-0.102
Token They
Token ID1119
Feature activation-3.86e-02

Pos logit contributions
uts+0.059
't+0.054
iked+0.053
uttering+0.053
ges+0.052
Neg logit contributions
 hospital-0.111
 dentist-0.104
 nurses-0.098
 floor-0.097
 conductor-0.097
Token swung
Token ID28507
Feature activation+9.06e-02

Pos logit contributions
 difference+0.073
 engineer+0.072
 stret+0.071
 hospital+0.071
an+0.071
Neg logit contributions
 together-0.056
't-0.054
uts-0.052
 on-0.050
t-0.050
Token and
Token ID290
Feature activation+3.54e-03

Pos logit contributions
uts+0.157
't+0.147
 Sorry+0.144
ges+0.140
uttering+0.137
Neg logit contributions
 hospital-0.162
 dentist-0.155
 scarf-0.150
 stret-0.146
 snow-0.143
Token fl
Token ID781
Feature activation+1.12e-01

Pos logit contributions
 together+0.006
uts+0.006
't+0.006
ges+0.006
t+0.005
Neg logit contributions
 hospital-0.006
 dentist-0.006
 stret-0.006
 nurses-0.006
 conductor-0.006
Tokenapped
Token ID6320
Feature activation+1.33e-01

Pos logit contributions
't+0.087
 Do+0.080
uts+0.078
 This+0.076
 Sorry+0.075
Neg logit contributions
 hospital-0.281
 dentist-0.274
 stret-0.270
an-0.269
 conductor-0.269
Token their
Token ID511
Feature activation-7.94e-02

Pos logit contributions
uts+0.216
't+0.206
ges+0.193
 Sorry+0.190
iked+0.189
Neg logit contributions
 hospital-0.252
 dentist-0.243
 stret-0.228
 nurses-0.227
 conductor-0.225
Token arms
Token ID5101
Feature activation+6.59e-02

Pos logit contributions
 hospital+0.139
 dentist+0.137
 nurses+0.124
 stret+0.123
 conductor+0.122
Neg logit contributions
uts-0.141
't-0.133
ges-0.132
 together-0.131
iked-0.126
Token again
Token ID757
Feature activation+5.18e-02

Pos logit contributions
 Sorry+0.112
uts+0.102
't+0.098
ges+0.095
 Do+0.094
Neg logit contributions
 hospital-0.116
 dentist-0.109
et-0.107
 stret-0.107
 engine-0.106
Token.
Token ID13
Feature activation+5.36e-02

Pos logit contributions
uts+0.078
't+0.077
 Sorry+0.075
iked+0.074
ges+0.072
Neg logit contributions
 hospital-0.101
 dentist-0.096
 stret-0.092
 floor-0.092
 cob-0.090
Token They
Token ID1119
Feature activation-3.83e-02

Pos logit contributions
uts+0.068
't+0.064
iked+0.061
t+0.061
used+0.060
Neg logit contributions
 hospital-0.117
 dentist-0.111
 nurses-0.104
 conductor-0.103
 floor-0.103
Token.
Token ID13
Feature activation+4.29e-02

Pos logit contributions
uts+0.058
't+0.053
 Sorry+0.052
ges+0.051
uttering+0.051
Neg logit contributions
 hospital-0.087
 dentist-0.084
 conductor-0.078
 stret-0.078
 nurses-0.078
Token Each
Token ID5501
Feature activation-1.13e-02

Pos logit contributions
uts+0.051
ges+0.047
't+0.047
uttering+0.045
 together+0.044
Neg logit contributions
 hospital-0.100
 dentist-0.097
 stret-0.090
 nurses-0.089
 conductor-0.088
Token time
Token ID640
Feature activation+6.10e-02

Pos logit contributions
 dentist+0.022
 hospital+0.022
 stret+0.022
 conductor+0.021
 nurses+0.020
Neg logit contributions
't-0.016
 together-0.016
uts-0.015
ges-0.015
 onwards-0.015
Token,
Token ID11
Feature activation+1.39e-02

Pos logit contributions
uts+0.083
't+0.079
ges+0.076
 together+0.075
iked+0.072
Neg logit contributions
 hospital-0.129
 dentist-0.127
 stret-0.120
 nurses-0.118
 conductor-0.117
Token it
Token ID340
Feature activation-1.69e-02

Pos logit contributions
uts+0.019
't+0.019
 together+0.018
ges+0.018
 Sorry+0.017
Neg logit contributions
 hospital-0.028
 dentist-0.028
 stret-0.027
 conductor-0.026
 nurses-0.026
Token shr
Token ID10157
Feature activation+1.33e-01

Pos logit contributions
 stret+0.031
 dentist+0.031
 hospital+0.030
 nurses+0.028
 workshop+0.027
Neg logit contributions
 together-0.028
uts-0.027
't-0.027
t-0.026
 on-0.025
Tokenank
Token ID962
Feature activation+1.64e-02

Pos logit contributions
 Sorry+0.126
't+0.123
uts+0.117
 This+0.109
 Yes+0.109
Neg logit contributions
 hospital-0.304
 ig-0.303
an-0.297
 wounds-0.292
 dentist-0.291
Token!
Token ID0
Feature activation+3.52e-02

Pos logit contributions
uts+0.021
't+0.020
ges+0.020
 together+0.019
 This+0.018
Neg logit contributions
 hospital-0.036
 dentist-0.036
 stret-0.034
 nurses-0.034
 conductor-0.033
Token\n
Token ID198
Feature activation+3.58e-02

Pos logit contributions
uts+0.040
uttering+0.037
ges+0.036
iked+0.035
't+0.034
Neg logit contributions
 hospital-0.083
 dentist-0.080
 stret-0.074
 nurses-0.073
 conductor-0.072
TokenThe
Token ID464
Feature activation-8.49e-02

Pos logit contributions
uts+0.042
ges+0.037
't+0.037
 Sorry+0.036
 together+0.035
Neg logit contributions
 hospital-0.084
 dentist-0.082
 stret-0.077
 nurses-0.077
 conductor-0.076
Token goose
Token ID37246
Feature activation-4.37e-02

Pos logit contributions
 dentist+0.134
 hospital+0.127
 stret+0.123
 engineer+0.119
 nurses+0.118
Neg logit contributions
't-0.151
uttering-0.142
 together-0.141
uts-0.139
 on-0.138
Token She
Token ID1375
Feature activation-5.83e-02

Pos logit contributions
uts+0.059
uttering+0.054
't+0.053
ges+0.052
 together+0.051
Neg logit contributions
 hospital-0.081
 dentist-0.076
 floor-0.071
abel-0.070
 stret-0.069
Token played
Token ID2826
Feature activation+7.28e-02

Pos logit contributions
 stret+0.110
 dentist+0.106
 hospital+0.106
 nurses+0.102
 wounds+0.100
Neg logit contributions
 together-0.092
't-0.085
uts-0.084
 onwards-0.079
t-0.078
Token tag
Token ID7621
Feature activation+7.11e-02

Pos logit contributions
uts+0.089
't+0.084
 together+0.082
ges+0.081
uttering+0.076
Neg logit contributions
 hospital-0.164
 dentist-0.162
 nurses-0.152
 stret-0.150
 conductor-0.149
Token and
Token ID290
Feature activation+4.51e-03

Pos logit contributions
uts+0.087
 together+0.083
't+0.082
ges+0.079
iked+0.074
Neg logit contributions
 hospital-0.160
 dentist-0.160
 nurses-0.149
 stret-0.149
 conductor-0.145
Token hop
Token ID1725
Feature activation+8.04e-02

Pos logit contributions
 together+0.008
uts+0.008
't+0.008
ges+0.007
 onwards+0.007
Neg logit contributions
 dentist-0.008
 hospital-0.008
 stret-0.007
an-0.007
 nurses-0.007
Tokensc
Token ID1416
Feature activation+1.32e-01

Pos logit contributions
uts+0.117
 Sorry+0.113
't+0.109
ges+0.108
uttering+0.104
Neg logit contributions
 hospital-0.161
 stret-0.150
 dentist-0.150
 conductor-0.144
 nurses-0.143
Tokenot
Token ID313
Feature activation+2.08e-02

Pos logit contributions
 Sorry+0.227
uts+0.226
't+0.225
 together+0.217
 Do+0.215
Neg logit contributions
 hospital-0.206
 stret-0.194
 dentist-0.191
an-0.189
et-0.184
Tokench
Token ID354
Feature activation+5.24e-02

Pos logit contributions
uts+0.036
 Sorry+0.035
't+0.034
uttering+0.033
ges+0.033
Neg logit contributions
 hospital-0.035
 dentist-0.033
 stret-0.033
 nurses-0.031
 conductor-0.031
Token and
Token ID290
Feature activation+4.01e-03

Pos logit contributions
uts+0.070
't+0.065
 Sorry+0.065
ges+0.064
uttering+0.061
Neg logit contributions
 hospital-0.111
 dentist-0.107
 stret-0.104
 nurses-0.101
 conductor-0.099
Token pretended
Token ID36766
Feature activation+7.79e-02

Pos logit contributions
uts+0.007
't+0.007
 together+0.007
ges+0.006
 onwards+0.006
Neg logit contributions
 hospital-0.007
 dentist-0.007
 stret-0.006
 nurses-0.006
 conductor-0.006
Token to
Token ID284
Feature activation-2.60e-02

Pos logit contributions
uts+0.075
ges+0.072
 Sorry+0.072
uttering+0.070
't+0.067
Neg logit contributions
 hospital-0.192
 dentist-0.186
 nurses-0.174
 stret-0.173
 floor-0.171
Token their
Token ID511
Feature activation-9.95e-02

Pos logit contributions
 hospital+0.031
 dentist+0.031
 nurses+0.028
 conductor+0.028
 stret+0.028
Neg logit contributions
uts-0.038
't-0.038
 together-0.037
ges-0.037
 This-0.036
Token beds
Token ID20237
Feature activation+7.48e-02

Pos logit contributions
 stret+0.120
ureen+0.116
aver+0.113
 conductor+0.111
 dentist+0.111
Neg logit contributions
 together-0.223
 Sorry-0.202
't-0.202
uts-0.201
aps-0.197
Token.
Token ID13
Feature activation+5.64e-02

Pos logit contributions
uts+0.098
't+0.092
ges+0.089
 together+0.087
uttering+0.084
Neg logit contributions
 hospital-0.163
 dentist-0.160
 nurses-0.150
 stret-0.150
 conductor-0.147
Token They
Token ID1119
Feature activation-3.41e-02

Pos logit contributions
uts+0.076
uttering+0.075
iked+0.074
used+0.072
't+0.070
Neg logit contributions
 hospital-0.114
 dentist-0.111
 conductor-0.102
 snow-0.101
 ig-0.101
Token n
Token ID299
Feature activation+7.70e-02

Pos logit contributions
 stret+0.064
 engineer+0.061
 hospital+0.061
an+0.061
 dentist+0.060
Neg logit contributions
 together-0.053
't-0.049
uts-0.048
t-0.046
 on-0.045
Tokenapped
Token ID6320
Feature activation+1.32e-01

Pos logit contributions
't+0.070
uts+0.066
 This+0.062
t+0.055
 Yes+0.054
Neg logit contributions
 hospital-0.184
 ig-0.183
 dentist-0.175
an-0.175
abel-0.175
Token.
Token ID13
Feature activation+6.04e-02

Pos logit contributions
uts+0.181
 together+0.168
't+0.168
ges+0.167
uttering+0.157
Neg logit contributions
 hospital-0.279
 dentist-0.274
 nurses-0.256
 stret-0.256
 conductor-0.249
Token They
Token ID1119
Feature activation-3.40e-02

Pos logit contributions
uttering+0.083
used+0.082
uts+0.081
iked+0.081
't+0.078
Neg logit contributions
 hospital-0.117
 dentist-0.116
 snow-0.108
 conductor-0.107
 rug-0.105
Token dreamed
Token ID27947
Feature activation+5.14e-02

Pos logit contributions
 stret+0.069
 hospital+0.067
 dentist+0.066
 engineer+0.066
an+0.064
Neg logit contributions
 together-0.049
uts-0.047
't-0.045
 on-0.041
t-0.041
Token of
Token ID286
Feature activation-8.91e-03

Pos logit contributions
uts+0.060
't+0.056
ges+0.054
 together+0.052
uttering+0.050
Neg logit contributions
 hospital-0.120
 dentist-0.118
 nurses-0.111
 stret-0.111
 conductor-0.109
Token colors
Token ID7577
Feature activation+6.59e-02

Pos logit contributions
 dentist+0.017
 hospital+0.017
 stret+0.016
 conductor+0.016
 nurses+0.016
Neg logit contributions
 together-0.013
uts-0.013
't-0.013
ges-0.012
uttering-0.012
Token They
Token ID1119
Feature activation-2.95e-02

Pos logit contributions
uts+0.078
ges+0.074
't+0.074
uttering+0.073
t+0.068
Neg logit contributions
 hospital-0.153
 dentist-0.147
 stret-0.135
 conductor-0.135
 nurses-0.135
Token kissed
Token ID27946
Feature activation+1.05e-01

Pos logit contributions
 hospital+0.057
 dentist+0.057
 stret+0.055
 wounds+0.054
an+0.053
Neg logit contributions
 together-0.046
't-0.042
 on-0.041
uts-0.040
ges-0.040
Token them
Token ID606
Feature activation-1.53e-02

Pos logit contributions
 together+0.149
ges+0.140
uts+0.137
't+0.135
 This+0.126
Neg logit contributions
 dentist-0.219
 hospital-0.219
 stret-0.203
 nurses-0.194
 ig-0.194
Token and
Token ID290
Feature activation+1.74e-02

Pos logit contributions
 hospital+0.031
 dentist+0.030
 stret+0.028
 nurses+0.028
 conductor+0.027
Neg logit contributions
uts-0.023
't-0.021
ges-0.021
 Sorry-0.020
uttering-0.020
Token cl
Token ID537
Feature activation+9.62e-02

Pos logit contributions
 together+0.031
 on+0.027
uts+0.027
 Sorry+0.026
 onwards+0.025
Neg logit contributions
 hospital-0.031
an-0.031
 dentist-0.031
 stret-0.028
cod-0.028
Tokenapped
Token ID6320
Feature activation+1.31e-01

Pos logit contributions
't+0.115
t+0.094
uts+0.093
 Do+0.077
 Yes+0.073
Neg logit contributions
 hospital-0.211
et-0.209
an-0.206
 conductor-0.205
abel-0.205
Token.
Token ID13
Feature activation+5.72e-02

Pos logit contributions
uts+0.181
't+0.170
ges+0.163
 Sorry+0.156
 together+0.155
Neg logit contributions
 hospital-0.280
 dentist-0.273
 nurses-0.257
 stret-0.256
 conductor-0.253
Token They
Token ID1119
Feature activation-3.54e-02

Pos logit contributions
uts+0.071
't+0.068
ges+0.066
t+0.066
uttering+0.065
Neg logit contributions
 hospital-0.126
 dentist-0.123
 conductor-0.118
 nurses-0.114
 floor-0.112
Token said
Token ID531
Feature activation+3.76e-02

Pos logit contributions
 dentist+0.074
 stret+0.074
 hospital+0.073
an+0.071
 wounds+0.070
Neg logit contributions
 together-0.050
uts-0.044
 on-0.043
't-0.043
t-0.038
Token,
Token ID11
Feature activation+2.65e-02

Pos logit contributions
uts+0.053
uttering+0.051
't+0.050
ges+0.049
t+0.048
Neg logit contributions
 hospital-0.077
 dentist-0.072
 nurses-0.072
 stret-0.067
 cal-0.067
Token "
Token ID366
Feature activation+3.18e-02

Pos logit contributions
uttering+0.047
uts+0.046
ung+0.046
uck+0.045
ained+0.044
Neg logit contributions
 conductor-0.040
 dentist-0.039
 stret-0.038
 hospital-0.038
 bear-0.037
Token against
Token ID1028
Feature activation+1.60e-02

Pos logit contributions
uts+0.077
't+0.072
ges+0.071
 Sorry+0.069
 together+0.067
Neg logit contributions
 hospital-0.108
 dentist-0.105
 stret-0.098
 nurses-0.098
 conductor-0.096
Token the
Token ID262
Feature activation-9.96e-02

Pos logit contributions
uts+0.028
't+0.026
ges+0.025
iked+0.024
 together+0.024
Neg logit contributions
 hospital-0.029
 dentist-0.028
 nurses-0.026
 stret-0.026
 conductor-0.025
Token penny
Token ID26369
Feature activation+1.56e-02

Pos logit contributions
 dentist+0.138
 hospital+0.131
 nurses+0.126
 stret+0.122
 conductor+0.119
Neg logit contributions
't-0.182
uts-0.181
ges-0.181
 together-0.178
 on-0.178
Token and
Token ID290
Feature activation+8.99e-03

Pos logit contributions
uts+0.020
't+0.019
ges+0.019
 together+0.018
 Sorry+0.018
Neg logit contributions
 hospital-0.034
 dentist-0.033
 stret-0.031
 nurses-0.031
 conductor-0.031
Token it
Token ID340
Feature activation-1.46e-02

Pos logit contributions
 together+0.015
uts+0.015
't+0.014
ges+0.014
 Sorry+0.013
Neg logit contributions
 hospital-0.016
 dentist-0.016
 stret-0.015
 nurses-0.015
 conductor-0.014
Token shr
Token ID10157
Feature activation+1.31e-01

Pos logit contributions
 dentist+0.026
 hospital+0.026
 stret+0.026
 nurses+0.024
 workshop+0.023
Neg logit contributions
 together-0.025
uts-0.025
't-0.023
t-0.022
uttered-0.022
Tokenank
Token ID962
Feature activation+1.55e-02

Pos logit contributions
 Sorry+0.130
't+0.128
 Yes+0.121
 This+0.118
uts+0.115
Neg logit contributions
an-0.292
 ig-0.290
 hospital-0.290
 wounds-0.288
ice-0.284
Token back
Token ID736
Feature activation+1.31e-02

Pos logit contributions
uts+0.021
't+0.019
ges+0.019
 together+0.018
uttering+0.018
Neg logit contributions
 hospital-0.033
 dentist-0.033
 nurses-0.031
 stret-0.031
 conductor-0.030
Token to
Token ID284
Feature activation-2.72e-02

Pos logit contributions
uts+0.015
't+0.014
 Sorry+0.014
t+0.013
iked+0.013
Neg logit contributions
 hospital-0.031
 dentist-0.029
 nurses-0.028
 stret-0.028
 cob-0.028
Token its
Token ID663
Feature activation-8.73e-02

Pos logit contributions
 dentist+0.054
 hospital+0.054
 nurses+0.051
 stret+0.050
 conductor+0.049
Neg logit contributions
 together-0.039
't-0.038
uts-0.037
ges-0.037
 This-0.036
Token regular
Token ID3218
Feature activation-4.15e-02

Pos logit contributions
 dentist+0.131
 hospital+0.126
 nurses+0.121
 stret+0.120
 conductor+0.115
Neg logit contributions
 together-0.163
uts-0.163
't-0.157
uttering-0.151
ges-0.148
Token also
Token ID635
Feature activation-6.56e-02

Pos logit contributions
 dentist+0.153
 hospital+0.149
 stret+0.148
oscope+0.132
 conductor+0.132
Neg logit contributions
 together-0.109
iked-0.100
uts-0.098
ived-0.091
ges-0.090
Token played
Token ID2826
Feature activation+6.13e-02

Pos logit contributions
 dentist+0.123
 hospital+0.120
 stret+0.116
 nurses+0.107
 wounds+0.106
Neg logit contributions
 together-0.113
ges-0.105
uts-0.103
't-0.100
uttering-0.100
Token tag
Token ID7621
Feature activation+5.77e-02

Pos logit contributions
uts+0.059
 together+0.057
't+0.054
ges+0.053
uttering+0.050
Neg logit contributions
 dentist-0.153
 hospital-0.153
 nurses-0.144
 stret-0.142
 conductor-0.141
Token and
Token ID290
Feature activation+4.43e-03

Pos logit contributions
 together+0.067
uts+0.063
ges+0.059
't+0.057
iked+0.056
Neg logit contributions
 dentist-0.137
 hospital-0.135
 stret-0.128
 nurses-0.127
 conductor-0.123
Token hop
Token ID1725
Feature activation+7.65e-02

Pos logit contributions
 together+0.007
ges+0.007
't+0.007
uts+0.007
iked+0.006
Neg logit contributions
 dentist-0.009
 hospital-0.008
 stret-0.008
an-0.007
 nurses-0.007
Tokensc
Token ID1416
Feature activation+1.31e-01

Pos logit contributions
uts+0.099
't+0.093
ges+0.089
 Sorry+0.089
uttering+0.085
Neg logit contributions
 hospital-0.167
 dentist-0.159
 stret-0.154
 nurses-0.152
 conductor-0.151
Tokenot
Token ID313
Feature activation+1.48e-02

Pos logit contributions
uts+0.219
't+0.218
 Sorry+0.215
 onwards+0.206
ges+0.204
Neg logit contributions
 hospital-0.211
an-0.199
 stret-0.197
et-0.191
 dentist-0.191
Tokench
Token ID354
Feature activation+4.17e-02

Pos logit contributions
uts+0.024
 Sorry+0.023
't+0.023
ges+0.022
uttering+0.022
Neg logit contributions
 hospital-0.026
 dentist-0.025
 stret-0.024
 nurses-0.024
 conductor-0.023
Token.
Token ID13
Feature activation+4.53e-02

Pos logit contributions
uts+0.057
't+0.053
 Sorry+0.053
ges+0.052
uttering+0.050
Neg logit contributions
 hospital-0.087
 dentist-0.083
 stret-0.081
 nurses-0.078
 conductor-0.077
Token They
Token ID1119
Feature activation-3.84e-02

Pos logit contributions
uts+0.070
't+0.064
ges+0.063
uttering+0.062
 Sorry+0.061
Neg logit contributions
 hospital-0.088
 dentist-0.084
 conductor-0.078
 nurses-0.077
ki-0.075
Token laughed
Token ID13818
Feature activation+7.91e-02

Pos logit contributions
 dentist+0.087
 hospital+0.082
 stret+0.077
 wounds+0.075
 nurses+0.074
Neg logit contributions
 together-0.055
ges-0.053
't-0.050
 on-0.049
 onwards-0.047
Token.
Token ID13
Feature activation+5.72e-02

Pos logit contributions
 Sorry+0.074
't+0.069
uts+0.068
iked+0.067
uttering+0.067
Neg logit contributions
 hospital-0.084
 dentist-0.082
 floor-0.078
 scarf-0.075
et-0.075
Token They
Token ID1119
Feature activation-3.94e-02

Pos logit contributions
uts+0.061
't+0.059
t+0.055
ges+0.055
iked+0.053
Neg logit contributions
 hospital-0.140
 dentist-0.136
 nurses-0.127
 conductor-0.124
 floor-0.123
Token laughed
Token ID13818
Feature activation+9.18e-02

Pos logit contributions
 stret+0.080
 hospital+0.077
 dentist+0.074
an+0.072
 wounds+0.069
Neg logit contributions
 together-0.062
't-0.059
uts-0.059
ges-0.055
t-0.052
Token and
Token ID290
Feature activation+1.77e-02

Pos logit contributions
uts+0.175
't+0.172
 Sorry+0.170
ges+0.169
iked+0.164
Neg logit contributions
 hospital-0.142
 dentist-0.135
 conductor-0.127
 scarf-0.123
 nurses-0.122
Token cl
Token ID537
Feature activation+9.30e-02

Pos logit contributions
uts+0.035
 together+0.034
ges+0.033
't+0.033
 Sorry+0.031
Neg logit contributions
 hospital-0.028
 dentist-0.027
 stret-0.025
 nurses-0.024
 conductor-0.023
Tokenapped
Token ID6320
Feature activation+1.31e-01

Pos logit contributions
't+0.105
t+0.084
uts+0.080
 Yes+0.074
 Do+0.073
Neg logit contributions
 hospital-0.207
et-0.206
 engineer-0.203
an-0.201
 conductor-0.199
Token when
Token ID618
Feature activation+4.07e-02

Pos logit contributions
uts+0.186
't+0.179
ges+0.166
 Sorry+0.165
iked+0.161
Neg logit contributions
 hospital-0.269
 dentist-0.260
 nurses-0.247
 conductor-0.247
 stret-0.244
Token they
Token ID484
Feature activation-3.96e-02

Pos logit contributions
uts+0.037
't+0.033
ges+0.032
uttering+0.030
 together+0.030
Neg logit contributions
 hospital-0.106
 dentist-0.103
 nurses-0.098
 stret-0.096
 conductor-0.096
Token made
Token ID925
Feature activation+3.20e-02

Pos logit contributions
 hospital+0.068
 dentist+0.066
 stret+0.064
 nurses+0.061
 wounds+0.059
Neg logit contributions
uts-0.070
 together-0.068
't-0.067
ges-0.067
 Sorry-0.063
Token a
Token ID257
Feature activation-5.74e-02

Pos logit contributions
 together+0.044
uts+0.041
't+0.039
 This+0.038
ges+0.037
Neg logit contributions
 hospital-0.066
 dentist-0.066
 stret-0.063
 nurses-0.063
 conductor-0.062
Token match
Token ID2872
Feature activation+3.42e-02

Pos logit contributions
 nurses+0.095
 dentist+0.094
ice+0.092
oscope+0.090
 hospital+0.088
Neg logit contributions
 together-0.102
 Sorry-0.095
't-0.088
uttering-0.086
uts-0.085
Token.
Token ID13
Feature activation+5.17e-02

Pos logit contributions
 Sorry+0.172
uts+0.158
iked+0.146
imb+0.146
ges+0.143
Neg logit contributions
 hospital-0.161
 floor-0.153
et-0.151
 dentist-0.149
 conductor-0.147
Token Anna
Token ID11735
Feature activation-7.03e-02

Pos logit contributions
uttering+0.069
uts+0.069
ges+0.067
iked+0.067
used+0.065
Neg logit contributions
 hospital-0.111
 dentist-0.102
 floor-0.098
 rug-0.094
 tears-0.094
Token laughed
Token ID13818
Feature activation+8.25e-02

Pos logit contributions
 dentist+0.128
 hospital+0.125
 stret+0.121
 nurses+0.120
 ig+0.111
Neg logit contributions
 together-0.124
uts-0.123
't-0.120
ges-0.111
t-0.109
Token and
Token ID290
Feature activation+1.52e-02

Pos logit contributions
uts+0.157
 Sorry+0.154
ges+0.153
iked+0.151
't+0.150
Neg logit contributions
 hospital-0.133
 dentist-0.126
 scarf-0.116
 cal-0.112
 conductor-0.112
Token cl
Token ID537
Feature activation+8.86e-02

Pos logit contributions
uts+0.030
't+0.028
 together+0.028
ges+0.028
 Sorry+0.027
Neg logit contributions
 hospital-0.024
 dentist-0.023
 nurses-0.021
 stret-0.021
 conductor-0.020
Tokenapped
Token ID6320
Feature activation+1.31e-01

Pos logit contributions
't+0.108
uts+0.084
t+0.081
 Do+0.075
ges+0.069
Neg logit contributions
et-0.195
aver-0.184
an-0.184
 hospital-0.183
 dentist-0.176
Token.
Token ID13
Feature activation+5.03e-02

Pos logit contributions
uts+0.198
't+0.187
ges+0.177
 Sorry+0.177
iked+0.175
Neg logit contributions
 hospital-0.264
 dentist-0.254
 nurses-0.238
 conductor-0.236
 scarf-0.234
Token She
Token ID1375
Feature activation-5.86e-02

Pos logit contributions
uttering+0.076
iked+0.073
uts+0.071
ges+0.071
used+0.070
Neg logit contributions
 hospital-0.098
 dentist-0.092
 floor-0.090
 tears-0.086
 rug-0.084
Token said
Token ID531
Feature activation+3.35e-02

Pos logit contributions
 stret+0.129
 nurses+0.126
 dentist+0.123
 hospital+0.122
 workshop+0.122
Neg logit contributions
 together-0.083
uts-0.081
't-0.078
t-0.071
 on-0.071
Token,
Token ID11
Feature activation+2.52e-02

Pos logit contributions
uttering+0.054
uts+0.053
�+0.052
ges+0.051
't+0.050
Neg logit contributions
 hospital-0.057
 dentist-0.052
 cob-0.051
 boo-0.050
 cal-0.049
Token "
Token ID366
Feature activation+3.03e-02

Pos logit contributions
uts+0.047
uttering+0.047
ung+0.046
iked+0.044
uck+0.043
Neg logit contributions
 dentist-0.037
 hospital-0.035
 conductor-0.035
 bear-0.035
 scarf-0.033
Token bring
Token ID2222
Feature activation+4.32e-02

Pos logit contributions
 dentist+0.047
 hospital+0.047
 stret+0.043
 nurses+0.043
ice+0.041
Neg logit contributions
't-0.052
 together-0.049
uts-0.048
ges-0.048
 This-0.046
Token the
Token ID262
Feature activation-1.05e-01

Pos logit contributions
uts+0.063
 together+0.063
't+0.061
ges+0.059
 This+0.058
Neg logit contributions
 hospital-0.085
 dentist-0.084
 stret-0.080
 nurses-0.077
 conductor-0.077
Token quarrel
Token ID41998
Feature activation+6.55e-02

Pos logit contributions
 hospital+0.152
 stret+0.148
 dentist+0.147
 nurses+0.138
ice+0.131
Neg logit contributions
't-0.193
uts-0.191
 together-0.189
 Yes-0.188
 Sorry-0.186
Token to
Token ID284
Feature activation-1.47e-02

Pos logit contributions
uts+0.087
't+0.083
ges+0.081
 Sorry+0.079
uttering+0.078
Neg logit contributions
 hospital-0.140
 dentist-0.135
 nurses-0.126
 stret-0.125
 conductor-0.125
Token a
Token ID257
Feature activation-6.14e-02

Pos logit contributions
 hospital+0.030
 dentist+0.030
 stret+0.029
 nurses+0.028
an+0.028
Neg logit contributions
't-0.020
uts-0.019
 together-0.019
ges-0.018
 This-0.018
Token resolution
Token ID6323
Feature activation+1.31e-01

Pos logit contributions
 dentist+0.086
 hospital+0.085
ice+0.083
et+0.082
an+0.082
Neg logit contributions
 together-0.116
't-0.113
 Sorry-0.108
uts-0.108
 Yes-0.105
Token,
Token ID11
Feature activation+3.12e-02

Pos logit contributions
 Sorry+0.192
uts+0.188
't+0.176
ges+0.173
iked+0.173
Neg logit contributions
 hospital-0.251
 dentist-0.247
 conductor-0.228
 nurses-0.227
 stret-0.223
Token until
Token ID1566
Feature activation+3.01e-02

Pos logit contributions
uts+0.047
't+0.045
 together+0.044
ges+0.044
 This+0.041
Neg logit contributions
 hospital-0.062
 dentist-0.061
 stret-0.056
 nurses-0.056
 conductor-0.055
Token everyone
Token ID2506
Feature activation-3.26e-02

Pos logit contributions
uts+0.050
ges+0.044
uttering+0.044
iked+0.042
't+0.041
Neg logit contributions
 hospital-0.061
 dentist-0.058
 nurses-0.054
 conductor-0.051
 stret-0.050
Token was
Token ID373
Feature activation-3.04e-02

Pos logit contributions
 dentist+0.059
 hospital+0.058
 stret+0.057
 nurses+0.054
 conductor+0.052
Neg logit contributions
uts-0.054
 together-0.054
't-0.053
ges-0.052
t-0.050
Token happy
Token ID3772
Feature activation+3.74e-02

Pos logit contributions
 hospital+0.055
 dentist+0.053
 nurses+0.049
 stret+0.049
 conductor+0.048
Neg logit contributions
uts-0.052
't-0.049
ges-0.048
 together-0.047
 Sorry-0.046
Token need
Token ID761
Feature activation+2.56e-02

Pos logit contributions
 hospital+0.052
 dentist+0.050
 stret+0.046
 nurses+0.045
 floor+0.044
Neg logit contributions
uts-0.057
't-0.051
ges-0.051
iked-0.051
uttering-0.050
Token.
Token ID13
Feature activation+3.13e-02

Pos logit contributions
uts+0.039
't+0.036
ges+0.036
 together+0.035
uttering+0.034
Neg logit contributions
 hospital-0.051
 dentist-0.049
 nurses-0.046
 stret-0.046
 conductor-0.045
Token Even
Token ID3412
Feature activation+1.76e-02

Pos logit contributions
uttering+0.043
uts+0.041
iked+0.038
ges+0.038
t+0.036
Neg logit contributions
 hospital-0.066
 dentist-0.065
 conductor-0.059
 nurses-0.059
 stret-0.057
Token the
Token ID262
Feature activation-1.07e-01

Pos logit contributions
 together+0.023
uts+0.022
ges+0.022
't+0.022
 onwards+0.020
Neg logit contributions
 hospital-0.037
 dentist-0.037
 stret-0.036
 conductor-0.034
an-0.033
Token t
Token ID256
Feature activation-3.33e-03

Pos logit contributions
 stret+0.139
 dentist+0.135
 hospital+0.128
ureen+0.127
ki+0.127
Neg logit contributions
't-0.199
 Yes-0.192
 together-0.189
 on-0.184
 Sorry-0.182
Tokenini
Token ID5362
Feature activation-1.37e-01

Pos logit contributions
 hospital+0.006
 dentist+0.006
 nurses+0.006
 stret+0.005
 conductor+0.005
Neg logit contributions
uts-0.005
't-0.005
ges-0.005
 together-0.005
 This-0.005
Tokenest
Token ID395
Feature activation-5.38e-02

Pos logit contributions
 dentist+0.207
 hospital+0.199
 stret+0.194
 wounds+0.182
 nurses+0.178
Neg logit contributions
t-0.262
't-0.256
 together-0.256
ed-0.249
 This-0.248
Token creature
Token ID7185
Feature activation-3.28e-02

Pos logit contributions
 dentist+0.088
 hospital+0.087
 stret+0.080
 nurses+0.077
 conductor+0.076
Neg logit contributions
uts-0.093
 together-0.092
't-0.090
ges-0.090
t-0.087
Token needs
Token ID2476
Feature activation+1.01e-02

Pos logit contributions
 dentist+0.057
 hospital+0.057
 stret+0.053
 nurses+0.052
 conductor+0.050
Neg logit contributions
uts-0.057
't-0.055
 together-0.054
ges-0.053
t-0.051
Token our
Token ID674
Feature activation-9.90e-02

Pos logit contributions
uts+0.013
't+0.012
ges+0.012
 together+0.011
 Sorry+0.011
Neg logit contributions
 hospital-0.022
 dentist-0.021
 nurses-0.020
 stret-0.020
 conductor-0.020
Token help
Token ID1037
Feature activation+1.87e-02

Pos logit contributions
 dentist+0.152
 hospital+0.148
 stret+0.138
 conductor+0.136
 nurses+0.134
Neg logit contributions
't-0.178
uts-0.178
 together-0.177
ges-0.170
 Sorry-0.170
Token mom
Token ID1995
Feature activation-4.80e-02

Pos logit contributions
uts+0.018
't+0.016
ges+0.016
 together+0.015
uttering+0.015
Neg logit contributions
 hospital-0.026
 dentist-0.025
 stret-0.024
 nurses-0.024
 conductor-0.023
Tokenmy
Token ID1820
Feature activation-4.27e-02

Pos logit contributions
 hospital+0.070
 dentist+0.068
 stret+0.062
 nurses+0.061
 conductor+0.060
Neg logit contributions
uts-0.097
't-0.094
ges-0.092
 together-0.091
 Sorry-0.088
Token!
Token ID0
Feature activation+1.75e-02

Pos logit contributions
 hospital+0.073
 dentist+0.071
 nurses+0.065
 stret+0.065
 conductor+0.064
Neg logit contributions
uts-0.076
't-0.072
ges-0.071
uttering-0.069
iked-0.068
Token Max
Token ID5436
Feature activation-5.96e-02

Pos logit contributions
uttering+0.026
uts+0.025
iked+0.023
t+0.023
uttered+0.021
Neg logit contributions
 dentist-0.033
 hospital-0.033
 stret-0.031
 conductor-0.031
 nurses-0.030
Token broke
Token ID6265
Feature activation+5.37e-02

Pos logit contributions
 dentist+0.104
 hospital+0.103
 nurses+0.097
 stret+0.096
 conductor+0.093
Neg logit contributions
uts-0.103
't-0.101
 together-0.098
 This-0.095
ges-0.094
Token my
Token ID616
Feature activation-1.29e-01

Pos logit contributions
uts+0.085
't+0.081
 together+0.080
ges+0.080
 This+0.074
Neg logit contributions
 hospital-0.103
 dentist-0.101
 stret-0.094
 nurses-0.094
 conductor-0.093
Token dress
Token ID6576
Feature activation+3.09e-02

Pos logit contributions
 dentist+0.165
 nurses+0.163
 hospital+0.159
 conductor+0.159
ureen+0.155
Neg logit contributions
 together-0.250
uts-0.246
't-0.243
 This-0.237
 on-0.235
Token!"
Token ID2474
Feature activation+3.27e-02

Pos logit contributions
uts+0.040
't+0.037
ges+0.037
 together+0.035
uttering+0.035
Neg logit contributions
 hospital-0.068
 dentist-0.066
 stret-0.062
 nurses-0.062
 conductor-0.061
Token Sara
Token ID24799
Feature activation-9.73e-02

Pos logit contributions
uts+0.046
ges+0.043
't+0.043
uttering+0.043
iked+0.041
Neg logit contributions
 hospital-0.068
 dentist-0.066
 stret-0.062
 nurses-0.062
 conductor-0.061
Token cries
Token ID24691
Feature activation+6.44e-02

Pos logit contributions
 hospital+0.172
 dentist+0.171
 nurses+0.163
 stret+0.160
 conductor+0.156
Neg logit contributions
uts-0.167
't-0.166
 together-0.161
 This-0.152
ges-0.150
Token.
Token ID13
Feature activation+4.37e-02

Pos logit contributions
uts+0.100
 Sorry+0.093
iked+0.091
't+0.091
uttering+0.090
Neg logit contributions
 hospital-0.125
 stret-0.121
 dentist-0.118
 nurses-0.115
 boo-0.114
Token from
Token ID422
Feature activation-2.60e-02

Pos logit contributions
uts+0.098
iked+0.097
uttering+0.094
ges+0.088
 Sorry+0.088
Neg logit contributions
 hospital-0.150
 dentist-0.140
 stret-0.134
 floor-0.133
 boo-0.130
Token the
Token ID262
Feature activation-1.16e-01

Pos logit contributions
 hospital+0.051
 dentist+0.050
 conductor+0.047
 nurses+0.047
 stret+0.047
Neg logit contributions
uts-0.038
't-0.038
ges-0.036
 together-0.036
 Sorry-0.034
Token sky
Token ID6766
Feature activation+2.76e-02

Pos logit contributions
 nurses+0.137
 conductor+0.136
ice+0.135
ki+0.134
 dentist+0.131
Neg logit contributions
't-0.233
 on-0.223
uts-0.213
 together-0.213
 This-0.210
Token and
Token ID290
Feature activation-1.73e-02

Pos logit contributions
uts+0.039
ges+0.037
iked+0.037
 Sorry+0.037
't+0.036
Neg logit contributions
 hospital-0.056
 dentist-0.054
 stret-0.051
 floor-0.050
 conductor-0.049
Token broke
Token ID6265
Feature activation+6.20e-02

Pos logit contributions
 hospital+0.028
 dentist+0.028
 stret+0.025
 nurses+0.025
 conductor+0.025
Neg logit contributions
uts-0.032
't-0.031
ges-0.030
 together-0.029
 Sorry-0.029
Token my
Token ID616
Feature activation-1.26e-01

Pos logit contributions
uts+0.084
't+0.078
iked+0.076
uttering+0.076
ges+0.075
Neg logit contributions
 hospital-0.131
 dentist-0.127
 stret-0.120
 nurses-0.119
 floor-0.118
Token spark
Token ID9009
Feature activation-1.05e-02

Pos logit contributions
 dentist+0.159
 nurses+0.156
 hospital+0.155
 conductor+0.147
ki+0.140
Neg logit contributions
uts-0.251
 together-0.249
't-0.241
 Sorry-0.240
ges-0.233
Token.
Token ID13
Feature activation+4.37e-02

Pos logit contributions
 hospital+0.019
 dentist+0.019
 nurses+0.018
 stret+0.017
 conductor+0.017
Neg logit contributions
uts-0.018
't-0.016
ges-0.016
 together-0.016
t-0.015
Token Can
Token ID1680
Feature activation+4.54e-02

Pos logit contributions
uttering+0.057
iked+0.052
uts+0.050
 together+0.046
t+0.046
Neg logit contributions
 hospital-0.099
 dentist-0.098
 nurses-0.089
 stret-0.087
 boo-0.087
Token you
Token ID345
Feature activation-4.73e-02

Pos logit contributions
uts+0.048
uttering+0.048
ges+0.046
iked+0.045
uttered+0.039
Neg logit contributions
 hospital-0.114
 dentist-0.110
 nurses-0.103
 boo-0.100
 conductor-0.099
Token help
Token ID1037
Feature activation+4.38e-02

Pos logit contributions
 hospital+0.076
 dentist+0.075
 nurses+0.070
 stret+0.070
 conductor+0.067
Neg logit contributions
't-0.088
uts-0.086
 together-0.082
ges-0.080
 Sorry-0.080
Token excitement
Token ID14067
Feature activation+6.95e-02

Pos logit contributions
 hospital+0.063
 dentist+0.062
 conductor+0.058
 stret+0.058
 nurses+0.058
Neg logit contributions
't-0.053
 together-0.053
uts-0.053
ges-0.053
 This-0.049
Token and
Token ID290
Feature activation-1.11e-02

Pos logit contributions
uts+0.093
't+0.091
 Sorry+0.088
ges+0.087
uttering+0.082
Neg logit contributions
 hospital-0.144
 dentist-0.140
 stret-0.132
 nurses-0.131
 floor-0.129
Token proceeded
Token ID18814
Feature activation+6.96e-02

Pos logit contributions
 hospital+0.019
 dentist+0.019
 stret+0.017
 nurses+0.017
 conductor+0.016
Neg logit contributions
uts-0.019
 together-0.019
't-0.018
ges-0.018
 Sorry-0.018
Token to
Token ID284
Feature activation-3.49e-02

Pos logit contributions
uts+0.066
't+0.062
 Sorry+0.060
ges+0.056
 This+0.053
Neg logit contributions
 hospital-0.172
 dentist-0.167
 conductor-0.163
 stret-0.162
 nurses-0.161
Token climb
Token ID12080
Feature activation+5.22e-02

Pos logit contributions
 dentist+0.060
 hospital+0.059
 stret+0.056
 nurses+0.055
ice+0.054
Neg logit contributions
't-0.060
 together-0.057
uts-0.056
ges-0.055
 Sorry-0.054
Token the
Token ID262
Feature activation-1.26e-01

Pos logit contributions
uts+0.075
't+0.069
ges+0.067
 Sorry+0.067
uttering+0.065
Neg logit contributions
 hospital-0.108
 dentist-0.104
 stret-0.098
 nurses-0.097
 conductor-0.096
Token stairs
Token ID16046
Feature activation+4.30e-02

Pos logit contributions
 dentist+0.171
 nurses+0.167
 hospital+0.164
ice+0.162
 stret+0.161
Neg logit contributions
't-0.230
 together-0.218
 This-0.217
 on-0.214
uts-0.213
Token.
Token ID13
Feature activation+3.38e-02

Pos logit contributions
uts+0.058
 Sorry+0.053
't+0.053
ges+0.052
uttering+0.049
Neg logit contributions
 hospital-0.092
 dentist-0.089
 stret-0.083
 conductor-0.083
 nurses-0.082
Token␣
Token ID220
Feature activation+6.92e-02

Pos logit contributions
uts+0.043
uttering+0.042
ges+0.040
 together+0.038
iked+0.038
Neg logit contributions
 hospital-0.072
 dentist-0.070
 stret-0.065
 conductor-0.064
 nurses-0.063
Token\n
Token ID198
Feature activation+3.41e-02

Pos logit contributions
 Sorry+0.074
uts+0.067
uttering+0.066
ges+0.066
iked+0.064
Neg logit contributions
 hospital-0.154
 dentist-0.151
 stret-0.146
 conductor-0.143
ki-0.136
Token\n
Token ID198
Feature activation+3.33e-02

Pos logit contributions
uts+0.035
 Sorry+0.033
ges+0.032
uttering+0.032
iked+0.030
Neg logit contributions
 hospital-0.080
 dentist-0.080
 stret-0.076
 conductor-0.075
 nurses-0.073
Token proud
Token ID6613
Feature activation+5.00e-02

Pos logit contributions
 hospital+0.015
 dentist+0.014
 nurses+0.013
 stret+0.013
 cob+0.013
Neg logit contributions
uts-0.019
ges-0.018
't-0.018
iked-0.018
uttering-0.017
Token of
Token ID286
Feature activation-1.84e-03

Pos logit contributions
uts+0.068
't+0.066
ges+0.066
 Sorry+0.063
uttering+0.062
Neg logit contributions
 hospital-0.104
 dentist-0.101
 nurses-0.094
 conductor-0.092
 ig-0.090
Token herself
Token ID5223
Feature activation-5.60e-02

Pos logit contributions
 hospital+0.003
 dentist+0.003
 nurses+0.003
 stret+0.003
 conductor+0.003
Neg logit contributions
uts-0.003
't-0.003
ges-0.003
uttering-0.003
t-0.003
Token for
Token ID329
Feature activation-1.86e-02

Pos logit contributions
 hospital+0.120
 dentist+0.115
 stret+0.108
 nurses+0.108
 conductor+0.107
Neg logit contributions
uts-0.075
't-0.071
ges-0.070
 Sorry-0.069
uttering-0.066
Token climbing
Token ID14281
Feature activation+4.29e-02

Pos logit contributions
 hospital+0.038
 dentist+0.037
 nurses+0.034
 stret+0.034
 conductor+0.034
Neg logit contributions
uts-0.028
't-0.026
ges-0.025
 together-0.024
uttering-0.024
Token the
Token ID262
Feature activation-1.26e-01

Pos logit contributions
uts+0.064
't+0.060
ges+0.059
 Sorry+0.057
uttering+0.056
Neg logit contributions
 hospital-0.086
 dentist-0.084
 stret-0.078
 nurses-0.078
 conductor-0.076
Token tree
Token ID5509
Feature activation+1.66e-02

Pos logit contributions
 dentist+0.170
 nurses+0.168
ice+0.166
 hospital+0.163
 stret+0.161
Neg logit contributions
't-0.232
 This-0.211
 together-0.210
 on-0.208
uts-0.206
Token.
Token ID13
Feature activation+3.03e-02

Pos logit contributions
uts+0.021
't+0.019
ges+0.019
 Sorry+0.018
uttering+0.018
Neg logit contributions
 hospital-0.037
 dentist-0.036
 stret-0.034
 nurses-0.034
 conductor-0.033
Token She
Token ID1375
Feature activation-7.27e-02

Pos logit contributions
uttering+0.037
uts+0.037
ges+0.035
't+0.033
iked+0.033
Neg logit contributions
 hospital-0.067
 dentist-0.065
 conductor-0.060
 nurses-0.058
ki-0.057
Token stayed
Token ID9658
Feature activation+3.58e-02

Pos logit contributions
 dentist+0.133
 hospital+0.132
 stret+0.130
 nurses+0.128
 difference+0.119
Neg logit contributions
 together-0.117
't-0.111
uts-0.110
 onwards-0.099
 on-0.098
Token on
Token ID319
Feature activation+1.52e-02

Pos logit contributions
 together+0.036
uts+0.034
't+0.033
uttering+0.032
 onwards+0.031
Neg logit contributions
 hospital-0.088
 dentist-0.087
 nurses-0.083
 stret-0.081
 conductor-0.080
Token his
Token ID465
Feature activation-1.14e-01

Pos logit contributions
iked+0.077
uttering+0.076
used+0.073
oor+0.072
uts+0.072
Neg logit contributions
 dentist-0.114
 hospital-0.113
 nurses-0.098
 doctor-0.095
 floor-0.095
Token room
Token ID2119
Feature activation+8.55e-03

Pos logit contributions
 hospital+0.155
 dentist+0.153
 nurses+0.143
 stret+0.137
 conductor+0.135
Neg logit contributions
uts-0.230
't-0.223
 together-0.220
ges-0.215
 This-0.212
Token and
Token ID290
Feature activation-1.54e-02

Pos logit contributions
uts+0.011
't+0.010
ges+0.010
uttering+0.010
 Sorry+0.010
Neg logit contributions
 hospital-0.019
 dentist-0.018
 stret-0.017
 nurses-0.017
 floor-0.017
Token knocks
Token ID36539
Feature activation+7.43e-02

Pos logit contributions
 hospital+0.025
 dentist+0.024
 stret+0.023
 conductor+0.022
an+0.022
Neg logit contributions
 together-0.027
uts-0.027
't-0.026
ges-0.026
 Sorry-0.025
Token on
Token ID319
Feature activation+4.93e-03

Pos logit contributions
uts+0.096
uttering+0.085
't+0.080
iked+0.079
 Sorry+0.077
Neg logit contributions
 hospital-0.173
 dentist-0.169
 nurses-0.159
 floor-0.157
 stret-0.157
Token the
Token ID262
Feature activation-1.25e-01

Pos logit contributions
 together+0.007
 This+0.006
't+0.006
ges+0.006
uts+0.005
Neg logit contributions
an-0.010
 hospital-0.010
 dentist-0.010
aver-0.010
 happiest-0.010
Token door
Token ID3420
Feature activation+2.16e-02

Pos logit contributions
ice+0.158
cod+0.157
 dentist+0.155
 nurses+0.154
 hospital+0.153
Neg logit contributions
't-0.234
 on-0.228
 This-0.228
uts-0.228
 together-0.223
Token.
Token ID13
Feature activation+4.11e-02

Pos logit contributions
uts+0.032
uttering+0.030
ges+0.030
't+0.029
iked+0.028
Neg logit contributions
 hospital-0.044
 dentist-0.043
 conductor-0.040
 floor-0.039
 stret-0.039
Token He
Token ID679
Feature activation-6.79e-02

Pos logit contributions
uttering+0.069
uts+0.063
iked+0.062
t+0.059
ges+0.058
Neg logit contributions
 dentist-0.077
 hospital-0.077
 stret-0.067
 rug-0.065
 conductor-0.065
Token opens
Token ID9808
Feature activation+2.14e-03

Pos logit contributions
 stret+0.116
 dentist+0.115
 conductor+0.115
 hospital+0.113
an+0.113
Neg logit contributions
't-0.108
 together-0.101
uts-0.100
 Sorry-0.098
 This-0.096
Token it
Token ID340
Feature activation-1.17e-02

Pos logit contributions
uts+0.004
't+0.004
ges+0.003
uttering+0.003
 Sorry+0.003
Neg logit contributions
 hospital-0.004
 dentist-0.004
 stret-0.003
 nurses-0.003
 conductor-0.003
Token Daddy
Token ID26926
Feature activation-7.15e-02

Pos logit contributions
ureen+0.163
 dentist+0.161
 hospital+0.156
ice+0.154
et+0.152
Neg logit contributions
 together-0.160
't-0.144
 Sorry-0.143
uts-0.142
 Yes-0.134
Token hear
Token ID3285
Feature activation+1.14e-02

Pos logit contributions
 hospital+0.111
 dentist+0.109
 nurses+0.099
 stret+0.096
 conductor+0.095
Neg logit contributions
uts-0.141
't-0.129
ges-0.128
uttering-0.128
iked-0.124
Token them
Token ID606
Feature activation-3.07e-02

Pos logit contributions
uts+0.016
 together+0.016
't+0.015
ges+0.015
 This+0.014
Neg logit contributions
 hospital-0.023
 dentist-0.023
 conductor-0.021
 nurses-0.021
 stret-0.021
Token and
Token ID290
Feature activation-1.42e-02

Pos logit contributions
 hospital+0.060
 dentist+0.057
 stret+0.054
 nurses+0.053
 conductor+0.053
Neg logit contributions
uts-0.048
't-0.044
ges-0.044
uttering-0.043
iked-0.042
Token open
Token ID1280
Feature activation+1.18e-02

Pos logit contributions
 dentist+0.025
 hospital+0.024
an+0.023
 stret+0.023
ice+0.023
Neg logit contributions
 together-0.023
't-0.022
ges-0.022
uts-0.022
 Sorry-0.021
Token the
Token ID262
Feature activation-1.25e-01

Pos logit contributions
uts+0.015
't+0.015
ges+0.015
 together+0.015
 This+0.014
Neg logit contributions
 hospital-0.025
 dentist-0.025
 nurses-0.023
 stret-0.023
 conductor-0.023
Token door
Token ID3420
Feature activation+1.48e-02

Pos logit contributions
 dentist+0.161
 hospital+0.159
 nurses+0.153
 conductor+0.147
ureen+0.145
Neg logit contributions
 together-0.239
ges-0.238
 on-0.236
't-0.235
 This-0.235
Token.
Token ID13
Feature activation+3.50e-02

Pos logit contributions
uts+0.021
 Sorry+0.020
't+0.020
ges+0.020
uttering+0.019
Neg logit contributions
 hospital-0.030
 dentist-0.029
 floor-0.027
 conductor-0.027
 stret-0.026
Token They
Token ID1119
Feature activation-5.49e-02

Pos logit contributions
uttering+0.048
uts+0.046
iked+0.042
t+0.042
ges+0.041
Neg logit contributions
 hospital-0.074
 dentist-0.072
 conductor-0.066
 nurses-0.065
 stret-0.064
Token see
Token ID766
Feature activation+2.12e-04

Pos logit contributions
 dentist+0.105
 hospital+0.104
an+0.099
 stret+0.099
ice+0.096
Neg logit contributions
't-0.084
 together-0.077
uts-0.077
 Sorry-0.073
ges-0.073
Token the
Token ID262
Feature activation-1.07e-01

Pos logit contributions
 together+0.000
uts+0.000
ges+0.000
't+0.000
 on+0.000
Neg logit contributions
 hospital-0.000
 dentist-0.000
 stret-0.000
an-0.000
ki-0.000
Token and
Token ID290
Feature activation-1.70e-02

Pos logit contributions
uts+0.011
't+0.010
 together+0.010
ges+0.010
 This+0.009
Neg logit contributions
 hospital-0.014
 dentist-0.014
 nurses-0.013
 stret-0.013
 conductor-0.013
Token he
Token ID339
Feature activation-5.70e-02

Pos logit contributions
 hospital+0.033
 dentist+0.033
 stret+0.031
 nurses+0.030
 conductor+0.030
Neg logit contributions
uts-0.025
't-0.024
 together-0.024
ges-0.023
 Sorry-0.022
Token almost
Token ID2048
Feature activation-1.34e-02

Pos logit contributions
 stret+0.101
 dentist+0.098
 hospital+0.097
 nurses+0.095
 conductor+0.093
Neg logit contributions
't-0.092
uts-0.089
 together-0.086
t-0.083
 Sorry-0.081
Token finished
Token ID5201
Feature activation+4.76e-02

Pos logit contributions
 hospital+0.021
 dentist+0.021
 nurses+0.020
 stret+0.020
 conductor+0.020
Neg logit contributions
uts-0.024
't-0.023
 together-0.023
ges-0.023
 This-0.022
Token fixing
Token ID18682
Feature activation+3.93e-02

Pos logit contributions
uts+0.078
uttering+0.074
ges+0.074
t+0.072
iked+0.072
Neg logit contributions
 hospital-0.090
 dentist-0.086
 stret-0.079
 floor-0.079
 ig-0.077
Token the
Token ID262
Feature activation-1.25e-01

Pos logit contributions
 together+0.052
uts+0.048
't+0.046
ges+0.046
 This+0.043
Neg logit contributions
 hospital-0.084
 dentist-0.082
 conductor-0.081
 stret-0.078
 nurses-0.078
Token palace
Token ID20562
Feature activation+1.87e-02

Pos logit contributions
 hospital+0.146
 dentist+0.141
 nurses+0.137
 conductor+0.132
 stret+0.131
Neg logit contributions
't-0.250
 together-0.246
uts-0.240
 This-0.239
 Sorry-0.238
Token.
Token ID13
Feature activation+3.18e-02

Pos logit contributions
uts+0.023
 together+0.021
't+0.021
ges+0.019
 This+0.019
Neg logit contributions
 hospital-0.042
 dentist-0.042
 nurses-0.040
 stret-0.040
 conductor-0.039
Token But
Token ID887
Feature activation-5.04e-03

Pos logit contributions
uts+0.041
uttering+0.041
ges+0.038
iked+0.036
't+0.036
Neg logit contributions
 hospital-0.070
 dentist-0.069
 conductor-0.061
 nurses-0.061
 stret-0.061
Token he
Token ID339
Feature activation-5.72e-02

Pos logit contributions
 hospital+0.010
 dentist+0.010
 nurses+0.009
 stret+0.009
 conductor+0.009
Neg logit contributions
uts-0.008
't-0.007
ges-0.007
 together-0.007
 Sorry-0.007
Token was
Token ID373
Feature activation-3.11e-02

Pos logit contributions
 stret+0.119
 nurses+0.108
 hospital+0.107
 dentist+0.104
 conductor+0.104
Neg logit contributions
't-0.084
uts-0.081
 together-0.080
t-0.079
 onwards-0.070
Token loud
Token ID7812
Feature activation+3.50e-02

Pos logit contributions
 hospital+0.024
 dentist+0.023
 nurses+0.021
 stret+0.021
 conductor+0.021
Neg logit contributions
uts-0.025
't-0.024
ges-0.023
 together-0.023
 Sorry-0.022
Token and
Token ID290
Feature activation-7.19e-03

Pos logit contributions
 Sorry+0.063
uts+0.060
uttering+0.060
uttered+0.057
't+0.056
Neg logit contributions
 hospital-0.061
 dentist-0.057
 stret-0.054
 nurses-0.053
 engine-0.052
Token the
Token ID262
Feature activation-1.04e-01

Pos logit contributions
 hospital+0.015
 dentist+0.015
 stret+0.015
 nurses+0.014
 conductor+0.014
Neg logit contributions
uts-0.009
't-0.009
 together-0.009
ges-0.008
 Sorry-0.008
Token guard
Token ID4860
Feature activation-3.82e-02

Pos logit contributions
 dentist+0.131
 hospital+0.131
 stret+0.129
ice+0.121
 conductor+0.115
Neg logit contributions
't-0.206
uts-0.201
 together-0.197
 Sorry-0.196
uttering-0.195
Token shut
Token ID4423
Feature activation+4.70e-02

Pos logit contributions
 hospital+0.068
 dentist+0.067
 stret+0.064
 wounds+0.062
 nurses+0.062
Neg logit contributions
uts-0.061
't-0.060
 together-0.060
ges-0.058
t-0.057
Token the
Token ID262
Feature activation-1.25e-01

Pos logit contributions
uts+0.067
 together+0.067
ges+0.066
't+0.065
 onwards+0.061
Neg logit contributions
 dentist-0.094
 hospital-0.094
 nurses-0.087
 stret-0.085
 conductor-0.084
Token garden
Token ID11376
Feature activation+2.01e-02

Pos logit contributions
 dentist+0.205
ice+0.201
 hospital+0.197
 nurses+0.196
ureen+0.192
Neg logit contributions
't-0.184
ges-0.182
 This-0.182
uts-0.181
 Sorry-0.179
Token gate
Token ID8946
Feature activation+2.72e-02

Pos logit contributions
uts+0.023
't+0.021
ges+0.021
 together+0.020
uttering+0.019
Neg logit contributions
 hospital-0.047
 dentist-0.046
 nurses-0.044
 stret-0.043
 conductor-0.043
Token.
Token ID13
Feature activation+3.57e-02

Pos logit contributions
uts+0.036
ges+0.033
 Sorry+0.033
't+0.033
uttering+0.032
Neg logit contributions
 hospital-0.059
 dentist-0.057
 stret-0.053
 conductor-0.053
 floor-0.052
Token Jim
Token ID5395
Feature activation-7.94e-02

Pos logit contributions
uts+0.051
uttering+0.049
ges+0.048
iked+0.045
 together+0.045
Neg logit contributions
 hospital-0.074
 dentist-0.072
 conductor-0.063
 floor-0.063
 nurses-0.062
Token couldn
Token ID3521
Feature activation+4.93e-02

Pos logit contributions
 dentist+0.154
 hospital+0.148
 stret+0.143
 nurses+0.138
 difference+0.134
Neg logit contributions
't-0.127
 together-0.123
uts-0.120
t-0.116
ges-0.115
Token can
Token ID460
Feature activation+3.25e-02

Pos logit contributions
 hospital+0.101
 dentist+0.098
 nurses+0.091
 stret+0.088
 conductor+0.088
Neg logit contributions
uts-0.090
't-0.083
ges-0.082
uttering-0.079
iked-0.078
Token have
Token ID423
Feature activation+2.10e-02

Pos logit contributions
uts+0.060
uttering+0.057
ges+0.056
iked+0.055
 onwards+0.054
Neg logit contributions
 hospital-0.055
 dentist-0.053
 nurses-0.046
 stret-0.046
 boo-0.045
Token the
Token ID262
Feature activation-1.02e-01

Pos logit contributions
uts+0.035
ges+0.033
uttering+0.033
't+0.032
iked+0.032
Neg logit contributions
 hospital-0.039
 dentist-0.038
 nurses-0.035
 stret-0.034
 scarf-0.033
Token rest
Token ID1334
Feature activation+3.50e-02

Pos logit contributions
 hospital+0.155
 stret+0.155
 conductor+0.151
 nurses+0.151
 dentist+0.147
Neg logit contributions
't-0.183
uts-0.178
 together-0.175
 This-0.171
 Yes-0.170
Token of
Token ID286
Feature activation-3.89e-02

Pos logit contributions
uts+0.048
't+0.047
 Sorry+0.046
uttering+0.045
iked+0.045
Neg logit contributions
 hospital-0.071
 dentist-0.069
 nurses-0.064
 floor-0.063
et-0.062
Token my
Token ID616
Feature activation-1.25e-01

Pos logit contributions
 hospital+0.063
 dentist+0.060
 nurses+0.053
 floor+0.051
 stret+0.051
Neg logit contributions
uts-0.078
iked-0.072
uttering-0.072
ges-0.071
t-0.070
Token avocado
Token ID40377
Feature activation+4.23e-03

Pos logit contributions
 hospital+0.174
 dentist+0.168
 nurses+0.161
 conductor+0.160
 stret+0.159
Neg logit contributions
uts-0.252
't-0.242
 together-0.240
ges-0.234
 Sorry-0.234
Token."
Token ID526
Feature activation+5.79e-02

Pos logit contributions
uts+0.005
't+0.005
ges+0.005
uttering+0.005
iked+0.005
Neg logit contributions
 hospital-0.009
 dentist-0.009
 stret-0.008
 nurses-0.008
 conductor-0.008
Token She
Token ID1375
Feature activation-5.10e-02

Pos logit contributions
uttering+0.070
uts+0.067
iked+0.064
ges+0.064
't+0.063
Neg logit contributions
 dentist-0.132
 hospital-0.132
 stret-0.117
 nurses-0.117
 scarf-0.117
Token pushes
Token ID20070
Feature activation+8.13e-02

Pos logit contributions
 hospital+0.083
 dentist+0.080
 stret+0.078
 nurses+0.077
 conductor+0.076
Neg logit contributions
 together-0.090
't-0.087
uts-0.087
ges-0.081
 This-0.080
Token her
Token ID607
Feature activation-8.64e-02

Pos logit contributions
uts+0.164
 Sorry+0.150
uttering+0.149
iked+0.146
't+0.146
Neg logit contributions
 hospital-0.125
 dentist-0.121
 nurses-0.111
 stret-0.108
 tears-0.104
TokenI
Token ID40
Feature activation-4.16e-02

Pos logit contributions
uts+0.034
 together+0.032
iked+0.032
't+0.031
ges+0.031
Neg logit contributions
 hospital-0.044
 dentist-0.044
 nurses-0.040
 stret-0.040
 conductor-0.040
Token found
Token ID1043
Feature activation+2.96e-02

Pos logit contributions
 hospital+0.082
 dentist+0.080
 stret+0.075
 nurses+0.075
 conductor+0.073
Neg logit contributions
uts-0.063
't-0.061
ges-0.059
 together-0.058
 Sorry-0.056
Token something
Token ID1223
Feature activation-2.88e-02

Pos logit contributions
uts+0.039
't+0.037
ges+0.035
 together+0.035
 This+0.033
Neg logit contributions
 hospital-0.065
 dentist-0.063
 stret-0.060
 nurses-0.059
 conductor-0.059
Token hidden
Token ID7104
Feature activation+2.26e-02

Pos logit contributions
 hospital+0.056
 dentist+0.054
 nurses+0.050
 stret+0.050
 conductor+0.049
Neg logit contributions
uts-0.044
ges-0.042
't-0.042
iked-0.041
uttering-0.040
Token under
Token ID739
Feature activation-1.22e-02

Pos logit contributions
uts+0.032
 Sorry+0.028
't+0.028
iked+0.028
ges+0.027
Neg logit contributions
 hospital-0.049
 dentist-0.047
 nurses-0.045
 stret-0.044
et-0.043
Token my
Token ID616
Feature activation-1.25e-01

Pos logit contributions
 hospital+0.021
 dentist+0.021
 nurses+0.019
 stret+0.019
 conductor+0.018
Neg logit contributions
uts-0.021
't-0.020
ges-0.020
 together-0.020
 This-0.019
Token toy
Token ID13373
Feature activation-9.21e-03

Pos logit contributions
 dentist+0.189
 nurses+0.186
ureen+0.184
 cal+0.179
 hospital+0.177
Neg logit contributions
uts-0.234
 This-0.221
't-0.220
 together-0.211
ges-0.208
Token box
Token ID3091
Feature activation+2.92e-02

Pos logit contributions
 hospital+0.010
 dentist+0.009
 nurses+0.008
 stret+0.008
 conductor+0.008
Neg logit contributions
uts-0.023
ges-0.022
't-0.021
uttering-0.021
iked-0.021
Token!"
Token ID2474
Feature activation+4.90e-02

Pos logit contributions
uts+0.038
ges+0.037
 Sorry+0.037
uttering+0.036
iked+0.036
Neg logit contributions
 hospital-0.062
 dentist-0.060
 floor-0.057
 conductor-0.055
 ig-0.054
Token
Token ID220
Feature activation+8.45e-02

Pos logit contributions
uttering+0.073
uts+0.071
ges+0.069
iked+0.066
ived+0.061
Neg logit contributions
 hospital-0.099
 dentist-0.094
 tears-0.089
 stret-0.088
 conductor-0.087
Token
Token ID198
Feature activation+4.23e-02

Pos logit contributions
iked+0.114
 Sorry+0.110
uttering+0.108
ges+0.107
 Yes+0.102
Neg logit contributions
 hospital-0.169
 dentist-0.165
 stret-0.150
 tears-0.144
 conductor-0.144
Token screams
Token ID26557
Feature activation+6.36e-02

Pos logit contributions
 dentist+0.140
 stret+0.138
 nurses+0.133
 hospital+0.133
 conductor+0.132
Neg logit contributions
't-0.124
uts-0.111
 together-0.110
 Sorry-0.107
 This-0.103
Token and
Token ID290
Feature activation-1.37e-02

Pos logit contributions
uts+0.118
 Sorry+0.117
iked+0.109
uttering+0.109
't+0.108
Neg logit contributions
 hospital-0.104
 stret-0.096
 dentist-0.093
 nurses-0.090
 cob-0.088
Token runs
Token ID4539
Feature activation+4.55e-02

Pos logit contributions
 dentist+0.023
 hospital+0.023
 conductor+0.021
 nurses+0.021
 stret+0.021
Neg logit contributions
't-0.024
uts-0.023
ges-0.022
 Sorry-0.022
 together-0.022
Token out
Token ID503
Feature activation+2.72e-02

Pos logit contributions
uts+0.062
 Sorry+0.056
uttering+0.053
iked+0.051
't+0.051
Neg logit contributions
 hospital-0.105
 dentist-0.097
 nurses-0.095
 stret-0.094
 conductor-0.091
Token of
Token ID286
Feature activation-2.06e-02

Pos logit contributions
uts+0.042
 Sorry+0.041
uttering+0.038
iked+0.038
't+0.037
Neg logit contributions
 hospital-0.053
 dentist-0.049
 nurses-0.049
 cob-0.048
 stret-0.047
Token the
Token ID262
Feature activation-1.25e-01

Pos logit contributions
 hospital+0.039
 dentist+0.038
 conductor+0.036
 stret+0.036
 nurses+0.035
Neg logit contributions
't-0.032
uts-0.031
ges-0.031
 together-0.030
 This-0.028
Token tunnel
Token ID13275
Feature activation+3.18e-02

Pos logit contributions
ice+0.176
ureen+0.176
 dentist+0.166
 conductor+0.162
abel+0.161
Neg logit contributions
 on-0.218
 together-0.218
't-0.214
 This-0.212
ges-0.211
Token.
Token ID13
Feature activation+4.32e-02

Pos logit contributions
uts+0.039
't+0.036
ges+0.036
 Sorry+0.034
uttering+0.034
Neg logit contributions
 hospital-0.072
 dentist-0.070
 stret-0.066
 nurses-0.065
 conductor-0.065
Token She
Token ID1375
Feature activation-7.63e-02

Pos logit contributions
uttering+0.061
iked+0.060
uts+0.055
ived+0.052
 together+0.052
Neg logit contributions
 hospital-0.091
 dentist-0.087
 nurses-0.078
 stret-0.078
 cob-0.077
Token is
Token ID318
Feature activation-3.96e-02

Pos logit contributions
 dentist+0.146
 conductor+0.139
 stret+0.139
 engineer+0.137
 nurses+0.137
Neg logit contributions
't-0.118
uts-0.101
 Sorry-0.099
 together-0.098
ges-0.098
Token angry
Token ID7954
Feature activation+3.41e-02

Pos logit contributions
 dentist+0.077
ki+0.075
 hospital+0.074
an+0.073
 conductor+0.072
Neg logit contributions
't-0.058
 together-0.055
uts-0.055
ges-0.053
 on-0.052
Token and
Token ID290
Feature activation-3.08e-02

Pos logit contributions
 hospital+0.120
 stret+0.116
 dentist+0.115
 wounds+0.111
 conductor+0.110
Neg logit contributions
't-0.084
ges-0.084
 together-0.083
uts-0.083
 This-0.077
Token Papa
Token ID42328
Feature activation-5.98e-02

Pos logit contributions
 hospital+0.059
 dentist+0.057
 stret+0.053
 conductor+0.052
 nurses+0.051
Neg logit contributions
uts-0.048
't-0.047
ges-0.046
 together-0.045
 Sorry-0.044
Token were
Token ID547
Feature activation-3.84e-02

Pos logit contributions
 hospital+0.107
 dentist+0.104
 nurses+0.097
 stret+0.096
 conductor+0.094
Neg logit contributions
uts-0.103
't-0.097
ges-0.095
 together-0.092
uttering-0.091
Token sitting
Token ID5586
Feature activation+3.08e-02

Pos logit contributions
 hospital+0.086
 dentist+0.080
 nurses+0.076
 stret+0.075
 ig+0.071
Neg logit contributions
uts-0.057
ges-0.050
 Sorry-0.047
t-0.045
iked-0.045
Token in
Token ID287
Feature activation-3.82e-02

Pos logit contributions
uts+0.032
't+0.028
ges+0.026
 Sorry+0.025
 This+0.024
Neg logit contributions
 hospital-0.078
 dentist-0.076
 nurses-0.072
 stret-0.072
 conductor-0.071
Token their
Token ID511
Feature activation-1.24e-01

Pos logit contributions
 hospital+0.067
 dentist+0.066
 nurses+0.061
 stret+0.060
 conductor+0.059
Neg logit contributions
uts-0.065
't-0.064
ges-0.062
 together-0.061
 This-0.060
Token garden
Token ID11376
Feature activation+3.68e-02

Pos logit contributions
 dentist+0.164
ureen+0.160
 stret+0.156
 nurses+0.151
 hospital+0.151
Neg logit contributions
 This-0.252
 together-0.248
't-0.234
 Sorry-0.229
ges-0.223
Token one
Token ID530
Feature activation-5.24e-02

Pos logit contributions
uts+0.045
't+0.042
ges+0.041
 Sorry+0.038
 together+0.038
Neg logit contributions
 hospital-0.084
 dentist-0.082
 nurses-0.077
 stret-0.076
 conductor-0.075
Token summer
Token ID3931
Feature activation+1.96e-02

Pos logit contributions
 hospital+0.092
 dentist+0.091
 stret+0.085
 nurses+0.084
 conductor+0.082
Neg logit contributions
uts-0.087
 together-0.085
't-0.084
ges-0.083
uttering-0.081
Token season
Token ID1622
Feature activation+6.14e-02

Pos logit contributions
uts+0.027
't+0.026
ges+0.025
 Sorry+0.024
uttering+0.024
Neg logit contributions
 hospital-0.041
 dentist-0.040
 nurses-0.037
 stret-0.037
 floor-0.037
Token.
Token ID13
Feature activation+3.00e-02

Pos logit contributions
uts+0.079
't+0.075
ges+0.072
 Sorry+0.070
uttering+0.068
Neg logit contributions
 hospital-0.134
 dentist-0.129
 nurses-0.123
 stret-0.123
 conductor-0.120
Token and
Token ID290
Feature activation-8.88e-03

Pos logit contributions
uts+0.033
ges+0.032
 Sorry+0.031
iked+0.031
uttering+0.030
Neg logit contributions
 hospital-0.030
 dentist-0.028
 stret-0.026
 conductor-0.025
 nurses-0.025
Token soon
Token ID2582
Feature activation-6.46e-05

Pos logit contributions
 dentist+0.016
 hospital+0.016
 stret+0.016
 nurses+0.015
 conductor+0.015
Neg logit contributions
uts-0.014
't-0.013
 together-0.013
ges-0.013
 This-0.012
Token she
Token ID673
Feature activation-5.71e-02

Pos logit contributions
 hospital+0.000
 dentist+0.000
 stret+0.000
 nurses+0.000
 conductor+0.000
Neg logit contributions
uts-0.000
't-0.000
ges-0.000
 together-0.000
iked-0.000
Token was
Token ID373
Feature activation-2.77e-02

Pos logit contributions
 stret+0.100
 dentist+0.098
 hospital+0.098
 nurses+0.095
 wounds+0.088
Neg logit contributions
uts-0.099
't-0.098
 together-0.094
t-0.090
ges-0.088
Token crossing
Token ID12538
Feature activation+1.58e-02

Pos logit contributions
 hospital+0.055
 dentist+0.054
 nurses+0.051
 stret+0.051
 conductor+0.050
Neg logit contributions
uts-0.041
't-0.039
 together-0.038
ges-0.038
uttering-0.036
Token the
Token ID262
Feature activation-1.24e-01

Pos logit contributions
uts+0.024
't+0.023
 together+0.023
ges+0.023
 This+0.021
Neg logit contributions
 dentist-0.030
 hospital-0.030
 nurses-0.028
 stret-0.028
 conductor-0.027
Token finish
Token ID5461
Feature activation+4.46e-03

Pos logit contributions
ice+0.200
 wounds+0.190
cod+0.189
 dentist+0.188
eman+0.183
Neg logit contributions
't-0.198
 on-0.183
 This-0.180
 to-0.176
 together-0.175
Token line
Token ID1627
Feature activation+4.59e-02

Pos logit contributions
uts+0.008
't+0.007
ges+0.007
 together+0.007
uttering+0.007
Neg logit contributions
 hospital-0.008
 dentist-0.008
 nurses-0.007
 stret-0.007
 conductor-0.007
Token first
Token ID717
Feature activation+8.49e-03

Pos logit contributions
uts+0.060
 Sorry+0.058
ges+0.057
't+0.055
uttering+0.053
Neg logit contributions
 hospital-0.098
 dentist-0.094
 conductor-0.089
 stret-0.088
 nurses-0.088
Token.
Token ID13
Feature activation+3.33e-02

Pos logit contributions
uts+0.012
 Sorry+0.012
ges+0.011
uttering+0.011
't+0.011
Neg logit contributions
 hospital-0.017
 dentist-0.016
 floor-0.015
 nurses-0.015
 conductor-0.015
Token
Token ID198
Feature activation+3.21e-02

Pos logit contributions
uts+0.041
uttering+0.039
ges+0.037
iked+0.036
't+0.035
Neg logit contributions
 hospital-0.074
 dentist-0.073
 nurses-0.066
 conductor-0.066
 stret-0.065
Token D
Token ID360
Feature activation-4.86e-02

Pos logit contributions
uts+0.074
ges+0.070
't+0.069
iked+0.069
uttering+0.068
Neg logit contributions
 hospital-0.057
 dentist-0.056
 nurses-0.050
 stret-0.050
 conductor-0.049
Tokenott
Token ID1252
Feature activation-1.09e-01

Pos logit contributions
 hospital+0.087
 dentist+0.084
an+0.079
 conductor+0.077
 nurses+0.076
Neg logit contributions
uts-0.078
 Sorry-0.076
't-0.073
 together-0.073
ges-0.071
Tokenie
Token ID494
Feature activation-5.53e-02

Pos logit contributions
 hospital+0.164
 dentist+0.157
 nurses+0.156
 stret+0.151
 cal+0.151
Neg logit contributions
ges-0.206
uts-0.205
 together-0.196
t-0.191
't-0.190
Token.
Token ID13
Feature activation+1.97e-02

Pos logit contributions
 hospital+0.138
 dentist+0.137
 nurses+0.130
 stret+0.130
 conductor+0.125
Neg logit contributions
uts-0.054
't-0.050
ges-0.049
 together-0.049
 This-0.045
Token D
Token ID360
Feature activation-5.61e-02

Pos logit contributions
uts+0.041
 Sorry+0.036
ges+0.036
't+0.036
t+0.036
Neg logit contributions
 hospital-0.028
 ig-0.027
 bear-0.024
 snow-0.024
 mud-0.024
Tokenott
Token ID1252
Feature activation-1.24e-01

Pos logit contributions
 hospital+0.053
 dentist+0.052
 ig+0.040
 nurses+0.039
an+0.039
Neg logit contributions
uts-0.142
't-0.137
ges-0.135
 together-0.135
 Sorry-0.134
Tokenie
Token ID494
Feature activation-6.84e-02

Pos logit contributions
 hospital+0.178
 stret+0.172
 nurses+0.170
 dentist+0.167
 cal+0.158
Neg logit contributions
uts-0.246
ges-0.239
 together-0.232
't-0.231
t-0.227
Token wanted
Token ID2227
Feature activation+3.42e-02

Pos logit contributions
 dentist+0.135
 stret+0.135
 nurses+0.131
 hospital+0.131
 wounds+0.124
Neg logit contributions
't-0.101
 together-0.100
uts-0.099
 This-0.098
t-0.093
Token to
Token ID284
Feature activation-2.73e-02

Pos logit contributions
uts+0.013
't+0.011
 together+0.009
ges+0.008
 This+0.008
Neg logit contributions
 hospital-0.105
 dentist-0.104
 stret-0.102
 conductor-0.100
an-0.099
Token visit
Token ID3187
Feature activation+4.66e-02

Pos logit contributions
 dentist+0.049
 hospital+0.049
 nurses+0.047
 stret+0.046
ice+0.046
Neg logit contributions
't-0.045
uts-0.044
 This-0.040
 Sorry-0.040
ges-0.038
Token a
Token ID257
Feature activation-5.94e-02

Pos logit contributions
uts+0.075
't+0.073
 together+0.070
ges+0.070
 This+0.068
Neg logit contributions
 hospital-0.086
 dentist-0.084
 stret-0.080
 nurses-0.079
 conductor-0.077
Token were
Token ID547
Feature activation-1.98e-02

Pos logit contributions
uts+0.005
't+0.004
ges+0.004
 together+0.004
uttering+0.004
Neg logit contributions
 hospital-0.004
 dentist-0.003
 nurses-0.003
 stret-0.003
 conductor-0.003
Token two
Token ID734
Feature activation-6.45e-02

Pos logit contributions
 hospital+0.032
 dentist+0.032
 nurses+0.029
 ig+0.028
 stret+0.027
Neg logit contributions
uts-0.036
ges-0.035
't-0.034
 Sorry-0.033
iked-0.032
Token best
Token ID1266
Feature activation-2.49e-02

Pos logit contributions
 dentist+0.149
 stret+0.147
 hospital+0.145
 Tooth+0.138
oscope+0.136
Neg logit contributions
 together-0.068
 at-0.063
s-0.059
,-0.053
re-0.053
Token friends
Token ID2460
Feature activation-2.80e-02

Pos logit contributions
 hospital+0.043
 dentist+0.043
 stret+0.039
 nurses+0.038
 conductor+0.037
Neg logit contributions
uts-0.044
 together-0.042
't-0.041
uttering-0.040
iked-0.040
Token,
Token ID11
Feature activation-2.99e-03

Pos logit contributions
 dentist+0.064
 hospital+0.063
 stret+0.061
 nurses+0.058
 floor+0.057
Neg logit contributions
 together-0.036
ges-0.032
uts-0.032
't-0.031
 This-0.029
Token Mir
Token ID7381
Feature activation-1.24e-01

Pos logit contributions
 hospital+0.007
 dentist+0.006
 stret+0.006
 wounds+0.006
an+0.006
Neg logit contributions
 together-0.004
ges-0.004
't-0.004
uts-0.004
 Yes-0.004
Tokena
Token ID64
Feature activation-9.95e-02

Pos logit contributions
 hospital+0.259
 stret+0.240
 dentist+0.239
 nurses+0.233
 conductor+0.220
Neg logit contributions
uts-0.176
ges-0.171
 together-0.170
't-0.161
uttering-0.160
Token and
Token ID290
Feature activation-5.40e-02

Pos logit contributions
 hospital+0.260
 dentist+0.251
 nurses+0.246
 stret+0.244
 wounds+0.231
Neg logit contributions
uts-0.101
 together-0.096
't-0.082
ges-0.080
 This-0.071
Token Max
Token ID5436
Feature activation-6.67e-02

Pos logit contributions
 hospital+0.104
 dentist+0.101
 stret+0.094
 nurses+0.093
 conductor+0.091
Neg logit contributions
uts-0.086
't-0.082
ges-0.080
 together-0.079
 Sorry-0.076
Token.
Token ID13
Feature activation+2.01e-02

Pos logit contributions
 hospital+0.170
 dentist+0.170
 stret+0.158
 nurses+0.158
 conductor+0.155
Neg logit contributions
uts-0.062
 together-0.061
ges-0.058
't-0.058
 This-0.053
Token They
Token ID1119
Feature activation-5.55e-02

Pos logit contributions
uts+0.023
uttering+0.021
t+0.021
't+0.021
iked+0.019
Neg logit contributions
 hospital-0.047
 dentist-0.045
 nurses-0.043
 ig-0.043
 rug-0.042
Token house
Token ID2156
Feature activation+3.17e-02

Pos logit contributions
 hospital+0.041
 dentist+0.039
 stret+0.037
 nurses+0.037
 conductor+0.035
Neg logit contributions
uts-0.039
't-0.037
 together-0.036
ges-0.036
uttering-0.035
Token.
Token ID13
Feature activation+2.63e-02

Pos logit contributions
uts+0.041
't+0.038
ges+0.038
uttering+0.036
 Sorry+0.036
Neg logit contributions
 hospital-0.070
 dentist-0.068
 nurses-0.063
 stret-0.063
 conductor-0.062
Token They
Token ID1119
Feature activation-5.64e-02

Pos logit contributions
uttering+0.035
uts+0.035
iked+0.031
ges+0.031
t+0.030
Neg logit contributions
 hospital-0.056
 dentist-0.054
 conductor-0.051
 nurses-0.050
 stret-0.049
Token knock
Token ID10643
Feature activation+6.36e-02

Pos logit contributions
 dentist+0.104
 hospital+0.100
 stret+0.097
an+0.096
 nurses+0.093
Neg logit contributions
't-0.087
 together-0.085
uts-0.080
ges-0.079
 Sorry-0.077
Token on
Token ID319
Feature activation+1.24e-02

Pos logit contributions
uts+0.075
't+0.066
uttering+0.062
ges+0.062
 Sorry+0.060
Neg logit contributions
 hospital-0.151
 dentist-0.147
 nurses-0.140
 stret-0.139
 conductor-0.136
Token the
Token ID262
Feature activation-1.24e-01

Pos logit contributions
 together+0.017
ges+0.015
't+0.015
uts+0.015
 This+0.015
Neg logit contributions
 dentist-0.026
 hospital-0.026
an-0.025
 conductor-0.025
aver-0.024
Token door
Token ID3420
Feature activation+2.51e-02

Pos logit contributions
 dentist+0.189
 hospital+0.188
ureen+0.185
ice+0.185
cod+0.185
Neg logit contributions
 together-0.198
uts-0.193
 This-0.192
 on-0.192
't-0.190
Token.
Token ID13
Feature activation+4.02e-02

Pos logit contributions
uts+0.038
ges+0.034
uttering+0.034
't+0.034
 Sorry+0.034
Neg logit contributions
 hospital-0.050
 dentist-0.049
 conductor-0.046
 nurses-0.045
 stret-0.045
Token Sam
Token ID3409
Feature activation-8.36e-02

Pos logit contributions
uttering+0.062
uts+0.059
iked+0.055
t+0.053
ived+0.052
Neg logit contributions
 hospital-0.076
 dentist-0.074
 conductor-0.070
 rug-0.069
 stret-0.067
Token opens
Token ID9808
Feature activation+2.89e-03

Pos logit contributions
 hospital+0.140
 dentist+0.139
 nurses+0.132
 stret+0.130
 conductor+0.128
Neg logit contributions
't-0.143
uts-0.142
 together-0.136
ges-0.136
 This-0.135
Token the
Token ID262
Feature activation-1.15e-01

Pos logit contributions
uts+0.005
't+0.005
ges+0.005
uttering+0.005
 Sorry+0.005
Neg logit contributions
 hospital-0.005
 dentist-0.005
 stret-0.004
 nurses-0.004
 conductor-0.004
Token to
Token ID284
Feature activation-4.78e-02

Pos logit contributions
 hospital+0.006
 dentist+0.006
 stret+0.006
 nurses+0.006
 conductor+0.005
Neg logit contributions
uts-0.005
't-0.004
 together-0.004
ges-0.004
 This-0.004
Token say
Token ID910
Feature activation+2.45e-02

Pos logit contributions
 dentist+0.080
 hospital+0.078
 nurses+0.074
 conductor+0.072
 stret+0.072
Neg logit contributions
't-0.083
uts-0.080
 together-0.077
 Sorry-0.076
ges-0.076
Token sorry
Token ID7926
Feature activation+5.60e-02

Pos logit contributions
uts+0.039
't+0.036
ges+0.036
uttering+0.035
 together+0.034
Neg logit contributions
 hospital-0.047
 dentist-0.046
 nurses-0.042
 stret-0.042
 conductor-0.042
Token for
Token ID329
Feature activation+1.30e-03

Pos logit contributions
uts+0.063
uttering+0.062
't+0.059
ges+0.059
iked+0.057
Neg logit contributions
 hospital-0.133
 dentist-0.131
 nurses-0.123
 conductor-0.118
 floor-0.117
Token breaking
Token ID7163
Feature activation+3.63e-02

Pos logit contributions
uts+0.002
't+0.002
ges+0.002
uttering+0.002
iked+0.002
Neg logit contributions
 hospital-0.003
 dentist-0.002
 nurses-0.002
 stret-0.002
 conductor-0.002
Token the
Token ID262
Feature activation-1.24e-01

Pos logit contributions
 together+0.050
uts+0.050
ges+0.049
't+0.048
 This+0.044
Neg logit contributions
 hospital-0.074
 dentist-0.073
 conductor-0.069
 stret-0.068
 nurses-0.068
Token alarm
Token ID10436
Feature activation-7.72e-04

Pos logit contributions
 hospital+0.149
 dentist+0.147
 nurses+0.137
 conductor+0.135
abel+0.132
Neg logit contributions
't-0.247
 together-0.241
uts-0.239
ges-0.236
 Yes-0.235
Token and
Token ID290
Feature activation-4.39e-03

Pos logit contributions
 hospital+0.002
 dentist+0.002
 nurses+0.002
 stret+0.002
 conductor+0.002
Neg logit contributions
uts-0.001
't-0.001
ges-0.001
 together-0.001
uttering-0.001
Token the
Token ID262
Feature activation-1.10e-01

Pos logit contributions
 hospital+0.008
 dentist+0.008
 stret+0.008
 conductor+0.007
 nurses+0.007
Neg logit contributions
uts-0.007
 together-0.007
't-0.006
 Sorry-0.006
ges-0.006
Token fence
Token ID13990
Feature activation+8.10e-04

Pos logit contributions
 hospital+0.119
 dentist+0.119
ice+0.115
abel+0.112
ureen+0.111
Neg logit contributions
't-0.222
 together-0.221
 Sorry-0.214
 Yes-0.214
 on-0.212
Token.
Token ID13
Feature activation+4.59e-02

Pos logit contributions
uts+0.001
't+0.001
ges+0.001
 together+0.001
 Sorry+0.001
Neg logit contributions
 hospital-0.002
 dentist-0.002
 nurses-0.002
 stret-0.002
 conductor-0.002
Token
Token ID198
Feature activation+2.50e-02

Pos logit contributions
iked+0.066
uts+0.064
uttering+0.064
ges+0.059
't+0.059
Neg logit contributions
 hospital-0.105
 dentist-0.104
 nurses-0.096
 conductor-0.092
 stret-0.091
Token
Token ID198
Feature activation+2.31e-02

Pos logit contributions
uts+0.046
ges+0.043
iked+0.043
 Sorry+0.042
't+0.042
Neg logit contributions
 hospital-0.042
 dentist-0.041
 stret-0.037
 conductor-0.037
 nurses-0.036
TokenThey
Token ID2990
Feature activation-4.48e-02

Pos logit contributions
uts+0.040
ges+0.036
iked+0.036
uttered+0.035
't+0.035
Neg logit contributions
 hospital-0.041
 dentist-0.040
 stret-0.038
 nurses-0.038
 conductor-0.038
Token knock
Token ID10643
Feature activation+6.83e-02

Pos logit contributions
 dentist+0.084
ice+0.084
an+0.081
 nurses+0.078
 hospital+0.077
Neg logit contributions
't-0.067
 Sorry-0.058
 together-0.058
uts-0.053
 This-0.053
Token on
Token ID319
Feature activation+2.00e-02

Pos logit contributions
uts+0.078
't+0.064
uttering+0.061
 Sorry+0.060
 This+0.057
Neg logit contributions
 hospital-0.172
 dentist-0.163
 nurses-0.159
 stret-0.155
 floor-0.151
Token the
Token ID262
Feature activation-1.24e-01

Pos logit contributions
 together+0.028
ges+0.024
't+0.024
 This+0.023
uts+0.023
Neg logit contributions
 dentist-0.043
 hospital-0.043
 conductor-0.041
oscope-0.040
aver-0.040
Token door
Token ID3420
Feature activation+3.05e-02

Pos logit contributions
 dentist+0.214
 hospital+0.213
ice+0.208
et+0.207
cod+0.206
Neg logit contributions
't-0.186
 This-0.186
uts-0.182
 together-0.180
ges-0.175
Token.
Token ID13
Feature activation+4.77e-02

Pos logit contributions
uts+0.049
uttering+0.047
ges+0.045
 Sorry+0.045
't+0.045
Neg logit contributions
 hospital-0.057
 dentist-0.055
 conductor-0.051
 floor-0.051
 rug-0.050
Token Dad
Token ID17415
Feature activation-3.94e-02

Pos logit contributions
uts+0.066
uttering+0.064
't+0.062
t+0.061
iked+0.059
Neg logit contributions
 hospital-0.101
 dentist-0.097
 nurses-0.089
 stret-0.088
 floor-0.087
Token opens
Token ID9808
Feature activation+9.30e-03

Pos logit contributions
 hospital+0.068
 dentist+0.066
 stret+0.062
 nurses+0.062
 conductor+0.060
Neg logit contributions
uts-0.068
't-0.067
ges-0.065
 together-0.065
 This-0.062
Token it
Token ID340
Feature activation-4.88e-03

Pos logit contributions
uts+0.017
't+0.016
ges+0.016
uttering+0.015
iked+0.015
Neg logit contributions
 hospital-0.016
 dentist-0.015
 stret-0.014
 nurses-0.014
 conductor-0.014
Token.
Token ID13
Feature activation+2.94e-02

Pos logit contributions
uts+0.038
't+0.034
ges+0.034
 Sorry+0.032
uttering+0.032
Neg logit contributions
 hospital-0.071
 dentist-0.070
 nurses-0.065
 stret-0.065
 conductor-0.064
Token
Token ID198
Feature activation+2.38e-02

Pos logit contributions
uts+0.045
 Sorry+0.041
ges+0.040
uttering+0.039
oor+0.038
Neg logit contributions
 hospital-0.055
 rug-0.049
 dentist-0.049
 bear-0.049
 ig-0.048
Token
Token ID198
Feature activation+2.53e-02

Pos logit contributions
 Sorry+0.031
uts+0.030
iked+0.027
ges+0.027
uttering+0.027
Neg logit contributions
 hospital-0.050
 dentist-0.049
 conductor-0.045
 stret-0.044
 ig-0.044
TokenWhen
Token ID2215
Feature activation+2.72e-02

Pos logit contributions
uts+0.046
ges+0.044
 Sorry+0.043
uttered+0.039
iked+0.039
Neg logit contributions
 hospital-0.041
 nurses-0.036
 conductor-0.036
 scarf-0.036
 dentist-0.036
Token D
Token ID360
Feature activation-5.89e-02

Pos logit contributions
uts+0.060
ges+0.048
oor+0.047
aps+0.046
t+0.046
Neg logit contributions
 hospital-0.041
 snow-0.038
 scarf-0.037
 nurses-0.037
 bear-0.036
Tokenott
Token ID1252
Feature activation-1.24e-01

Pos logit contributions
 hospital+0.055
 dentist+0.053
 ig+0.042
 nurses+0.040
an+0.040
Neg logit contributions
uts-0.151
 together-0.144
't-0.143
ges-0.142
 Sorry-0.142
Tokenie
Token ID494
Feature activation-6.97e-02

Pos logit contributions
 hospital+0.188
 stret+0.187
 dentist+0.178
 nurses+0.178
 cal+0.167
Neg logit contributions
ges-0.225
to-0.217
 together-0.217
uts-0.214
't-0.213
Token arrived
Token ID5284
Feature activation+1.61e-02

Pos logit contributions
 dentist+0.120
 hospital+0.116
 wounds+0.114
 stret+0.113
 nurses+0.111
Neg logit contributions
't-0.117
uts-0.109
 together-0.108
 This-0.108
ges-0.108
Token at
Token ID379
Feature activation-7.13e-03

Pos logit contributions
uts+0.025
 Sorry+0.022
't+0.021
t+0.020
 Do+0.020
Neg logit contributions
 hospital-0.034
 dentist-0.030
 nurses-0.029
 tears-0.028
 bear-0.028
Token the
Token ID262
Feature activation-9.83e-02

Pos logit contributions
 stret+0.015
an+0.014
 dentist+0.014
 hospital+0.014
ice+0.014
Neg logit contributions
't-0.009
 together-0.009
ges-0.009
 This-0.009
uts-0.008
Token lab
Token ID2248
Feature activation+1.56e-02

Pos logit contributions
 hospital+0.114
 dentist+0.110
 stret+0.097
 nurses+0.097
 conductor+0.092
Neg logit contributions
uts-0.230
't-0.222
ges-0.218
 together-0.215
uttering-0.211
Token her
Token ID607
Feature activation-9.36e-02

Pos logit contributions
 hospital+0.048
 dentist+0.047
 stret+0.046
 nurses+0.045
 conductor+0.045
Neg logit contributions
uts-0.024
't-0.023
ges-0.022
 together-0.022
 This-0.021
Token mom
Token ID1995
Feature activation-4.86e-02

Pos logit contributions
 stret+0.182
 hospital+0.175
 engineer+0.166
an+0.166
oscope+0.166
Neg logit contributions
 together-0.126
uts-0.123
't-0.120
 on-0.119
 to-0.113
Token went
Token ID1816
Feature activation+4.65e-02

Pos logit contributions
 hospital+0.097
 dentist+0.092
 stret+0.092
 wounds+0.090
 nurses+0.086
Neg logit contributions
 together-0.074
't-0.068
uts-0.067
ges-0.067
iked-0.064
Token outside
Token ID2354
Feature activation+5.42e-02

Pos logit contributions
uts+0.073
 Sorry+0.065
't+0.064
ges+0.063
t+0.060
Neg logit contributions
 hospital-0.098
 dentist-0.091
 stret-0.087
 nurses-0.085
 conductor-0.084
Token to
Token ID284
Feature activation-2.73e-02

Pos logit contributions
uts+0.061
 Sorry+0.054
't+0.051
iked+0.048
ges+0.047
Neg logit contributions
 hospital-0.135
 dentist-0.127
 nurses-0.120
 ig-0.119
 stret-0.118
Token play
Token ID711
Feature activation+6.92e-02

Pos logit contributions
 dentist+0.058
 conductor+0.057
an+0.056
 hospital+0.056
 stret+0.055
Neg logit contributions
't-0.035
 This-0.034
 together-0.033
ges-0.032
 Sorry-0.031
Token.
Token ID13
Feature activation+4.28e-02

Pos logit contributions
uts+0.066
 together+0.066
't+0.061
ges+0.060
uttering+0.055
Neg logit contributions
 dentist-0.172
 hospital-0.171
 nurses-0.163
 conductor-0.161
 stret-0.161
Token It
Token ID632
Feature activation-1.71e-02

Pos logit contributions
uts+0.062
ges+0.056
't+0.055
uttering+0.055
iked+0.053
Neg logit contributions
 hospital-0.091
 dentist-0.084
 floor-0.082
 nurses-0.081
 stret-0.078
Token was
Token ID373
Feature activation-1.39e-02

Pos logit contributions
 stret+0.041
 dentist+0.041
 nurses+0.039
 hospital+0.038
 workshop+0.038
Neg logit contributions
't-0.021
 together-0.021
t-0.019
s-0.017
uts-0.017
Token a
Token ID257
Feature activation-5.49e-02

Pos logit contributions
 hospital+0.029
 dentist+0.028
 stret+0.027
 nurses+0.027
 conductor+0.027
Neg logit contributions
't-0.019
 together-0.019
uts-0.019
ges-0.017
 onwards-0.017
Token very
Token ID845
Feature activation-4.13e-02

Pos logit contributions
ice+0.104
oscope+0.102
 nurses+0.101
 wounds+0.101
 dentist+0.099
Neg logit contributions
 together-0.081
't-0.081
 Sorry-0.080
 Yes-0.075
t-0.072
Tokenila
Token ID10102
Feature activation-9.14e-02

Pos logit contributions
 hospital+0.122
 dentist+0.119
 stret+0.113
 nurses+0.111
 conductor+0.110
Neg logit contributions
uts-0.111
't-0.108
ges-0.105
 together-0.103
iked-0.099
Token and
Token ID290
Feature activation-2.34e-02

Pos logit contributions
 stret+0.194
 hospital+0.193
 dentist+0.192
 nurses+0.190
 conductor+0.182
Neg logit contributions
 together-0.121
uts-0.114
ges-0.111
't-0.110
 This-0.104
Token Ben
Token ID3932
Feature activation-8.12e-02

Pos logit contributions
 hospital+0.042
 dentist+0.041
 nurses+0.038
 stret+0.037
 conductor+0.037
Neg logit contributions
uts-0.040
ges-0.037
't-0.037
uttering-0.036
iked-0.036
Token liked
Token ID8288
Feature activation+2.09e-02

Pos logit contributions
 hospital+0.154
 dentist+0.152
 stret+0.142
 nurses+0.141
 conductor+0.138
Neg logit contributions
uts-0.126
't-0.122
ges-0.121
 together-0.120
 This-0.114
Token to
Token ID284
Feature activation-4.16e-02

Pos logit contributions
uts+0.018
't+0.016
ges+0.016
 Sorry+0.015
t+0.014
Neg logit contributions
 hospital-0.056
 dentist-0.055
 nurses-0.052
 stret-0.051
 ig-0.050
Token play
Token ID711
Feature activation+6.93e-02

Pos logit contributions
 hospital+0.077
 dentist+0.076
 nurses+0.071
 stret+0.069
 conductor+0.069
Neg logit contributions
uts-0.067
't-0.066
ges-0.062
 This-0.062
 together-0.062
Token outside
Token ID2354
Feature activation+5.69e-02

Pos logit contributions
uts+0.058
't+0.053
 together+0.051
ges+0.050
uttering+0.045
Neg logit contributions
 hospital-0.184
 dentist-0.181
 nurses-0.173
 stret-0.171
 conductor-0.169
Token when
Token ID618
Feature activation+1.57e-02

Pos logit contributions
uts+0.081
iked+0.070
't+0.070
uttering+0.069
 Sorry+0.069
Neg logit contributions
 hospital-0.123
 dentist-0.115
 nurses-0.110
 ig-0.109
 floor-0.109
Token it
Token ID340
Feature activation-1.19e-02

Pos logit contributions
uts+0.017
't+0.014
uttering+0.014
ges+0.014
iked+0.014
Neg logit contributions
 hospital-0.039
 dentist-0.038
 nurses-0.036
 stret-0.035
 ig-0.035
Token was
Token ID373
Feature activation-1.94e-02

Pos logit contributions
 stret+0.023
 dentist+0.023
 hospital+0.023
 nurses+0.022
 wounds+0.022
Neg logit contributions
't-0.017
t-0.017
 together-0.017
uts-0.016
ges-0.015
Token dark
Token ID3223
Feature activation+2.69e-02

Pos logit contributions
 hospital+0.033
 dentist+0.032
 nurses+0.029
 stret+0.029
 conductor+0.028
Neg logit contributions
uts-0.036
't-0.034
ges-0.033
 together-0.032
 Sorry-0.032
Token rude
Token ID22066
Feature activation+3.23e-02

Pos logit contributions
uts+0.007
't+0.007
ges+0.007
 together+0.007
 Sorry+0.007
Neg logit contributions
 hospital-0.005
 dentist-0.005
 stret-0.004
 nurses-0.004
 conductor-0.004
Token.
Token ID13
Feature activation+5.73e-02

Pos logit contributions
uttering+0.050
't+0.048
uts+0.047
iked+0.047
�+0.046
Neg logit contributions
 hospital-0.062
 dentist-0.060
 nurses-0.057
 floor-0.055
 doctors-0.054
Token He
Token ID679
Feature activation-5.50e-02

Pos logit contributions
uttering+0.060
uts+0.059
iked+0.056
 together+0.053
ges+0.053
Neg logit contributions
 hospital-0.140
 dentist-0.139
 nurses-0.131
 conductor-0.126
 stret-0.123
Token did
Token ID750
Feature activation+3.42e-02

Pos logit contributions
 stret+0.105
 dentist+0.103
 hospital+0.100
 nurses+0.099
 workshop+0.096
Neg logit contributions
't-0.088
uts-0.082
t-0.078
 Sorry-0.076
 together-0.076
Token not
Token ID407
Feature activation+2.83e-03

Pos logit contributions
't+0.042
uts+0.039
 together+0.036
 This+0.035
t+0.033
Neg logit contributions
 hospital-0.077
 dentist-0.077
 stret-0.076
 nurses-0.075
 conductor-0.074
Token care
Token ID1337
Feature activation+6.99e-02

Pos logit contributions
't+0.004
uts+0.004
ges+0.004
 Sorry+0.004
 together+0.004
Neg logit contributions
 dentist-0.005
 hospital-0.005
 stret-0.005
 nurses-0.005
 conductor-0.005
Token about
Token ID546
Feature activation+2.48e-02

Pos logit contributions
uts+0.112
iked+0.110
uttering+0.108
 Sorry+0.107
't+0.106
Neg logit contributions
 hospital-0.131
 dentist-0.125
 nurses-0.117
 stret-0.112
 conductor-0.111
Token me
Token ID502
Feature activation-4.99e-02

Pos logit contributions
uts+0.037
't+0.037
ges+0.035
 together+0.034
 This+0.034
Neg logit contributions
 hospital-0.047
 dentist-0.046
 stret-0.044
 conductor-0.044
 nurses-0.043
Token."
Token ID526
Feature activation+5.71e-02

Pos logit contributions
 hospital+0.103
 dentist+0.098
 nurses+0.089
ice+0.089
 stret+0.088
Neg logit contributions
iked-0.070
 Sorry-0.070
uttering-0.070
uttered-0.069
uts-0.068
Token
Token ID198
Feature activation+4.12e-02

Pos logit contributions
uts+0.066
uttering+0.061
ges+0.060
't+0.060
iked+0.058
Neg logit contributions
 hospital-0.137
 dentist-0.135
 nurses-0.122
 stret-0.120
 conductor-0.118
Token
Token ID198
Feature activation+3.62e-02

Pos logit contributions
uts+0.030
't+0.027
ges+0.025
 together+0.024
uttering+0.022
Neg logit contributions
 hospital-0.114
 dentist-0.113
 nurses-0.107
 stret-0.107
 conductor-0.105
Token He
Token ID679
Feature activation-5.88e-02

Pos logit contributions
uts+0.066
uttering+0.061
't+0.061
ges+0.061
iked+0.060
Neg logit contributions
 hospital-0.104
 dentist-0.101
 nurses-0.092
 stret-0.091
 conductor-0.090
Token does
Token ID857
Feature activation+7.74e-03

Pos logit contributions
 dentist+0.113
 nurses+0.112
 stret+0.111
 conductor+0.109
 engineer+0.108
Neg logit contributions
't-0.085
uts-0.078
 together-0.076
 Sorry-0.075
 This-0.074
Token not
Token ID407
Feature activation-8.22e-03

Pos logit contributions
uts+0.014
ges+0.014
iked+0.014
uttering+0.013
elt+0.013
Neg logit contributions
 hospital-0.014
 dentist-0.013
 stret-0.011
 mistake-0.011
 wounds-0.011
Token like
Token ID588
Feature activation-2.18e-04

Pos logit contributions
 hospital+0.017
 dentist+0.017
 stret+0.016
 nurses+0.016
 conductor+0.015
Neg logit contributions
uts-0.011
't-0.011
ges-0.011
 together-0.010
t-0.010
Token to
Token ID284
Feature activation-3.98e-02

Pos logit contributions
 hospital+0.000
 stret+0.000
 conductor+0.000
 dentist+0.000
 nurses+0.000
Neg logit contributions
uts-0.000
't-0.000
ges-0.000
 This-0.000
 together-0.000
Token fight
Token ID1907
Feature activation+1.07e-01

Pos logit contributions
 hospital+0.074
 dentist+0.073
 nurses+0.068
 stret+0.067
 conductor+0.066
Neg logit contributions
't-0.063
uts-0.063
 Sorry-0.059
 together-0.058
ges-0.058
Token with
Token ID351
Feature activation+2.00e-02

Pos logit contributions
uts+0.157
uttering+0.152
iked+0.151
ges+0.141
 Sorry+0.139
Neg logit contributions
 hospital-0.221
 dentist-0.211
 nurses-0.195
 floor-0.191
 conductor-0.189
Token Max
Token ID5436
Feature activation-3.66e-02

Pos logit contributions
't+0.022
 together+0.020
ges+0.020
 This+0.020
uts+0.019
Neg logit contributions
 stret-0.046
 conductor-0.044
an-0.043
 hospital-0.043
 engineer-0.042
Token.
Token ID13
Feature activation+4.72e-02

Pos logit contributions
 hospital+0.080
 dentist+0.077
 stret+0.072
 nurses+0.072
 conductor+0.070
Neg logit contributions
uts-0.048
't-0.045
ges-0.044
uttering-0.043
 Sorry-0.043
Token He
Token ID679
Feature activation-5.59e-02

Pos logit contributions
uts+0.068
uttering+0.067
iked+0.066
ges+0.066
't+0.063
Neg logit contributions
 hospital-0.098
 dentist-0.097
 nurses-0.085
 stret-0.085
 conductor-0.084
Token does
Token ID857
Feature activation+8.32e-03

Pos logit contributions
 stret+0.113
 dentist+0.112
 nurses+0.111
 conductor+0.110
 engineer+0.108
Neg logit contributions
't-0.076
 together-0.066
uts-0.065
 This-0.065
 Sorry-0.065
Token covered
Token ID5017
Feature activation+4.79e-02

Pos logit contributions
 hospital+0.022
 dentist+0.021
 nurses+0.019
 stret+0.019
 conductor+0.019
Neg logit contributions
uts-0.026
't-0.025
ges-0.024
 together-0.024
uttering-0.023
Token in
Token ID287
Feature activation-9.24e-03

Pos logit contributions
uts+0.051
't+0.045
ges+0.043
 Sorry+0.043
iked+0.041
Neg logit contributions
 hospital-0.117
 dentist-0.114
 nurses-0.108
 stret-0.108
 conductor-0.106
Token red
Token ID2266
Feature activation+2.66e-02

Pos logit contributions
 dentist+0.013
 hospital+0.012
 conductor+0.012
an+0.011
ureen+0.011
Neg logit contributions
 together-0.019
ges-0.018
't-0.018
uts-0.018
 This-0.017
Token bumps
Token ID34029
Feature activation+5.63e-02

Pos logit contributions
uts+0.048
't+0.046
ges+0.044
 Sorry+0.044
uttering+0.044
Neg logit contributions
 hospital-0.045
 dentist-0.043
 nurses-0.040
 stret-0.040
 floor-0.039
Token.
Token ID13
Feature activation+4.26e-02

Pos logit contributions
uts+0.082
't+0.078
ges+0.076
 Sorry+0.074
 together+0.073
Neg logit contributions
 hospital-0.113
 dentist-0.110
 stret-0.103
 nurses-0.103
 conductor-0.101
Token
Token ID220
Feature activation+7.35e-02

Pos logit contributions
uttering+0.064
ges+0.060
uts+0.059
iked+0.055
ectar+0.053
Neg logit contributions
 hospital-0.088
 dentist-0.082
 floor-0.078
 tears-0.075
 stret-0.074
Token
Token ID198
Feature activation+3.53e-02

Pos logit contributions
uttering+0.093
iked+0.092
ges+0.091
 Sorry+0.090
ho+0.083
Neg logit contributions
 hospital-0.147
 dentist-0.146
 stret-0.134
 floor-0.130
 conductor-0.127
Token
Token ID198
Feature activation+3.29e-02

Pos logit contributions
uttering+0.055
iked+0.051
 Sorry+0.050
uttered+0.049
uts+0.047
Neg logit contributions
 hospital-0.064
 stret-0.062
 dentist-0.062
 rug-0.059
 floor-0.058
TokenThey
Token ID2990
Feature activation-3.12e-02

Pos logit contributions
uts+0.070
ges+0.068
uttering+0.067
iked+0.066
uttered+0.065
Neg logit contributions
 hospital-0.044
 dentist-0.040
 stret-0.038
 nurses-0.036
 floor-0.036
Token went
Token ID1816
Feature activation+4.99e-02

Pos logit contributions
 dentist+0.070
an+0.069
 hospital+0.068
 stret+0.068
 difference+0.068
Neg logit contributions
 together-0.043
ges-0.034
 on-0.034
't-0.033
iked-0.032
Token to
Token ID284
Feature activation-2.12e-02

Pos logit contributions
uts+0.055
't+0.049
ges+0.049
 Sorry+0.047
uttering+0.045
Neg logit contributions
 hospital-0.122
 dentist-0.118
 stret-0.112
 nurses-0.112
 conductor-0.110
Token a
Token ID257
Feature activation-5.50e-02

Pos logit contributions
 hospital+0.004
 dentist+0.004
 stret+0.004
 conductor+0.004
 nurses+0.004
Neg logit contributions
 together-0.002
uts-0.002
 This-0.002
ges-0.002
't-0.002
Token big
Token ID1263
Feature activation-4.28e-02

Pos logit contributions
 hospital+0.082
 dentist+0.079
 nurses+0.073
 stret+0.070
 conductor+0.069
Neg logit contributions
uts-0.110
't-0.107
 together-0.103
ges-0.103
 This-0.101
Token hug
Token ID16225
Feature activation+6.10e-02

Pos logit contributions
 hospital+0.055
 dentist+0.053
 nurses+0.048
 stret+0.047
 conductor+0.046
Neg logit contributions
uts-0.093
't-0.091
ges-0.089
 together-0.089
 This-0.087
Token.
Token ID13
Feature activation+4.21e-02

Pos logit contributions
uts+0.085
't+0.081
 Sorry+0.078
ges+0.078
uttering+0.076
Neg logit contributions
 hospital-0.127
 dentist-0.124
 stret-0.114
 conductor-0.114
 nurses-0.114
Token It
Token ID632
Feature activation-2.01e-02

Pos logit contributions
uttering+0.067
ges+0.064
uts+0.064
iked+0.061
ectar+0.059
Neg logit contributions
 hospital-0.080
 dentist-0.079
 stret-0.071
 tears-0.071
 cal-0.070
Token l
Token ID300
Feature activation+2.86e-02

Pos logit contributions
 dentist+0.041
 hospital+0.041
 stret+0.040
 nurses+0.039
 workshop+0.038
Neg logit contributions
 together-0.030
uts-0.028
't-0.027
ed-0.025
t-0.024
Tokenicked
Token ID9484
Feature activation+6.18e-02

Pos logit contributions
't+0.027
 Sorry+0.026
uts+0.026
 This+0.022
 together+0.021
Neg logit contributions
 hospital-0.068
 dentist-0.066
et-0.064
aver-0.063
abel-0.063
Token his
Token ID465
Feature activation-8.14e-02

Pos logit contributions
uts+0.107
uttering+0.098
iked+0.097
't+0.096
 Sorry+0.094
Neg logit contributions
 hospital-0.108
 dentist-0.101
 stret-0.098
 nurses-0.098
 tears-0.094
Token face
Token ID1986
Feature activation+2.33e-02

Pos logit contributions
 dentist+0.125
 hospital+0.123
 nurses+0.117
 conductor+0.104
 engineer+0.101
Neg logit contributions
uts-0.161
 together-0.153
 This-0.152
't-0.150
uttering-0.146
Token!
Token ID0
Feature activation+3.62e-02

Pos logit contributions
uts+0.036
't+0.035
ges+0.035
 Sorry+0.034
uttering+0.033
Neg logit contributions
 hospital-0.043
 dentist-0.042
 stret-0.039
 conductor-0.038
 nurses-0.038
Token
Token ID198
Feature activation+4.01e-02

Pos logit contributions
uttering+0.055
ges+0.053
uts+0.052
iked+0.052
ectar+0.050
Neg logit contributions
 hospital-0.070
 dentist-0.068
 stret-0.065
 tears-0.063
 conductor-0.062
Token to
Token ID284
Feature activation-2.75e-02

Pos logit contributions
 hospital+0.085
 dentist+0.084
 nurses+0.077
 stret+0.077
 conductor+0.076
Neg logit contributions
uts-0.073
't-0.070
 together-0.067
ges-0.066
 This-0.064
Token make
Token ID787
Feature activation+3.30e-02

Pos logit contributions
 hospital+0.044
 dentist+0.043
 stret+0.040
 conductor+0.039
 nurses+0.039
Neg logit contributions
't-0.049
uts-0.048
 together-0.046
ges-0.045
 This-0.044
Token sure
Token ID1654
Feature activation+3.07e-02

Pos logit contributions
uts+0.050
't+0.049
 together+0.047
ges+0.044
 This+0.044
Neg logit contributions
 hospital-0.063
 dentist-0.062
 stret-0.058
 conductor-0.058
 nurses-0.057
Token you
Token ID345
Feature activation-5.66e-02

Pos logit contributions
uts+0.034
't+0.031
ges+0.030
 together+0.029
uttering+0.028
Neg logit contributions
 hospital-0.074
 dentist-0.072
 nurses-0.068
 stret-0.067
 conductor-0.066
Token're
Token ID821
Feature activation-1.25e-02

Pos logit contributions
 hospital+0.103
 dentist+0.102
 stret+0.094
 nurses+0.092
 conductor+0.091
Neg logit contributions
uts-0.094
't-0.092
 together-0.087
ges-0.086
t-0.085
Token healthy
Token ID5448
Feature activation+1.96e-02

Pos logit contributions
 hospital+0.022
 dentist+0.022
 nurses+0.020
 stret+0.020
 conductor+0.019
Neg logit contributions
uts-0.022
't-0.020
ges-0.020
 together-0.019
iked-0.019
Token,"
Token ID553
Feature activation+3.80e-02

Pos logit contributions
uts+0.028
ges+0.027
iked+0.027
uttering+0.027
't+0.026
Neg logit contributions
 hospital-0.040
 dentist-0.039
 nurses-0.037
 stret-0.036
 floor-0.035
Token his
Token ID465
Feature activation-8.59e-02

Pos logit contributions
 together+0.052
't+0.049
uts+0.049
 This+0.048
 Sorry+0.047
Neg logit contributions
 hospital-0.077
 stret-0.074
 dentist-0.074
 nurses-0.072
an-0.071
Token owner
Token ID4870
Feature activation-2.40e-02

Pos logit contributions
 hospital+0.126
 nurses+0.118
 dentist+0.117
 stret+0.113
abel+0.107
Neg logit contributions
uts-0.170
't-0.161
 together-0.156
uttering-0.153
iked-0.150
Token replied
Token ID8712
Feature activation+6.34e-02

Pos logit contributions
 hospital+0.048
 stret+0.047
 dentist+0.046
 nurses+0.044
 cob+0.043
Neg logit contributions
 together-0.037
't-0.034
uts-0.034
t-0.033
 This-0.031
Token.
Token ID13
Feature activation+5.10e-02

Pos logit contributions
uts+0.090
't+0.087
 Sorry+0.086
uttering+0.084
ges+0.083
Neg logit contributions
 hospital-0.127
 dentist-0.123
 stret-0.119
 nurses-0.112
 boo-0.111
Token love
Token ID1842
Feature activation+5.19e-02

Pos logit contributions
 dentist+0.076
 hospital+0.075
 stret+0.072
 nurses+0.071
 conductor+0.068
Neg logit contributions
uts-0.057
't-0.057
t-0.055
 together-0.054
 Sorry-0.050
Token you
Token ID345
Feature activation-4.62e-02

Pos logit contributions
uts+0.060
ges+0.054
uttering+0.053
't+0.053
iked+0.051
Neg logit contributions
 hospital-0.123
 dentist-0.121
 nurses-0.116
 stret-0.110
 conductor-0.108
Token too
Token ID1165
Feature activation+1.49e-02

Pos logit contributions
 hospital+0.097
 dentist+0.093
 nurses+0.088
 floor+0.085
 conductor+0.084
Neg logit contributions
uts-0.065
ges-0.063
uttering-0.062
iked-0.061
uttered-0.058
Token.
Token ID13
Feature activation+5.17e-02

Pos logit contributions
uttered+0.027
uttering+0.026
iked+0.026
ges+0.026
 Sorry+0.025
Neg logit contributions
 hospital-0.025
 dentist-0.024
 floor-0.023
 mistake-0.023
 mess-0.021
Token But
Token ID887
Feature activation+1.05e-02

Pos logit contributions
uts+0.054
uttering+0.048
iked+0.047
ges+0.046
't+0.045
Neg logit contributions
 hospital-0.129
 dentist-0.128
 nurses-0.121
 conductor-0.116
 stret-0.115
Token can
Token ID460
Feature activation+3.51e-02

Pos logit contributions
uts+0.011
ges+0.010
't+0.010
uttering+0.009
iked+0.009
Neg logit contributions
 hospital-0.026
 dentist-0.025
 nurses-0.024
 conductor-0.023
 stret-0.023
Token we
Token ID356
Feature activation-3.08e-02

Pos logit contributions
uts+0.026
't+0.023
ges+0.022
 together+0.021
 Sorry+0.019
Neg logit contributions
 hospital-0.097
 dentist-0.095
 stret-0.090
 nurses-0.090
 conductor-0.089
Token still
Token ID991
Feature activation+1.71e-02

Pos logit contributions
 hospital+0.053
 dentist+0.051
 nurses+0.047
 stret+0.047
 conductor+0.046
Neg logit contributions
uts-0.055
ges-0.052
't-0.051
uttering-0.050
iked-0.050
Token look
Token ID804
Feature activation+5.09e-02

Pos logit contributions
uts+0.026
't+0.025
 together+0.024
ges+0.024
 This+0.023
Neg logit contributions
 hospital-0.033
 dentist-0.033
 nurses-0.030
 stret-0.030
 conductor-0.030
Token at
Token ID379
Feature activation+1.63e-02

Pos logit contributions
uts+0.049
't+0.045
ges+0.043
 together+0.040
iked+0.039
Neg logit contributions
 hospital-0.129
 dentist-0.127
 nurses-0.120
 stret-0.120
 conductor-0.118
Token the
Token ID262
Feature activation-9.76e-02

Pos logit contributions
uts+0.024
't+0.023
 together+0.023
ges+0.022
 This+0.021
Neg logit contributions
 hospital-0.032
 dentist-0.032
 stret-0.030
 conductor-0.030
 nurses-0.029
Token to
Token ID284
Feature activation-1.95e-02

Pos logit contributions
uts+0.034
 Sorry+0.029
't+0.028
ges+0.027
uttering+0.026
Neg logit contributions
 hospital-0.077
 dentist-0.075
 nurses-0.071
 stret-0.070
 conductor-0.070
Token the
Token ID262
Feature activation-1.03e-01

Pos logit contributions
 dentist+0.041
 hospital+0.041
 stret+0.039
 nurses+0.038
 conductor+0.038
Neg logit contributions
't-0.027
 together-0.025
ges-0.024
 This-0.024
uts-0.023
Token counter
Token ID3753
Feature activation+9.46e-03

Pos logit contributions
 dentist+0.128
 hospital+0.125
 stret+0.118
 nurses+0.114
 conductor+0.112
Neg logit contributions
't-0.215
uts-0.212
 together-0.208
ges-0.204
 Sorry-0.202
Token and
Token ID290
Feature activation-1.44e-03

Pos logit contributions
uts+0.015
't+0.014
ges+0.014
 Sorry+0.014
uttering+0.013
Neg logit contributions
 hospital-0.018
 dentist-0.018
 nurses-0.016
 stret-0.016
 conductor-0.016
Token said
Token ID531
Feature activation+4.19e-02

Pos logit contributions
 hospital+0.003
 dentist+0.002
 stret+0.002
 nurses+0.002
 difference+0.002
Neg logit contributions
 together-0.003
uts-0.002
't-0.002
ges-0.002
iked-0.002
Token "
Token ID366
Feature activation+3.10e-02

Pos logit contributions
uttering+0.064
uts+0.064
ges+0.061
ained+0.060
't+0.059
Neg logit contributions
 hospital-0.076
 dentist-0.074
 bear-0.068
 cal-0.067
 cob-0.067
TokenCan
Token ID6090
Feature activation+4.57e-02

Pos logit contributions
uts+0.050
 together+0.045
't+0.045
iked+0.044
uttering+0.044
Neg logit contributions
 dentist-0.058
 hospital-0.058
 nurses-0.054
 conductor-0.053
 stret-0.053
Token I
Token ID314
Feature activation-2.31e-02

Pos logit contributions
uts+0.033
't+0.029
ges+0.027
 together+0.026
uttering+0.024
Neg logit contributions
 hospital-0.127
 dentist-0.125
 nurses-0.119
 stret-0.119
 conductor-0.117
Token eat
Token ID4483
Feature activation+6.25e-02

Pos logit contributions
 hospital+0.038
 dentist+0.037
 nurses+0.033
 stret+0.033
 conductor+0.032
Neg logit contributions
uts-0.043
ges-0.041
't-0.040
iked-0.039
 together-0.039
Token here
Token ID994
Feature activation+3.77e-02

Pos logit contributions
uts+0.109
uttering+0.100
ges+0.099
iked+0.099
uttered+0.098
Neg logit contributions
 hospital-0.113
 dentist-0.110
 nurses-0.100
 scarf-0.098
 stret-0.097
Token?".
Token ID43634
Feature activation+6.04e-02

Pos logit contributions
uts+0.056
ges+0.051
iked+0.049
 Sorry+0.049
uttering+0.048
Neg logit contributions
 hospital-0.077
 dentist-0.076
 nurses-0.069
 stret-0.068
 conductor-0.067
Token
Token ID198
Feature activation+4.01e-02

Pos logit contributions
 Sorry+0.086
uts+0.076
ges+0.075
uttering+0.075
iked+0.073
Neg logit contributions
 hospital-0.175
 dentist-0.171
 stret-0.160
ki-0.156
 nurses-0.154
Token
Token ID198
Feature activation+3.73e-02

Pos logit contributions
 Sorry+0.052
uts+0.048
ges+0.047
uttering+0.046
iked+0.044
Neg logit contributions
 hospital-0.083
 dentist-0.080
 stret-0.079
ki-0.076
 conductor-0.076
TokenAn
Token ID2025
Feature activation-7.94e-02

Pos logit contributions
uts+0.058
ges+0.056
 Sorry+0.055
uttering+0.052
't+0.052
Neg logit contributions
 hospital-0.069
 dentist-0.066
 stret-0.065
 nurses-0.065
 conductor-0.062
Tokennie
Token ID11952
Feature activation-7.26e-02

Pos logit contributions
 hospital+0.077
 dentist+0.075
 stret+0.069
 conductor+0.059
 ig+0.057
Neg logit contributions
uts-0.191
't-0.187
 together-0.186
ges-0.183
t-0.180
Token:
Token ID25
Feature activation+1.71e-03

Pos logit contributions
 dentist+0.130
 hospital+0.129
 stret+0.119
 nurses+0.117
 wounds+0.109
Neg logit contributions
 together-0.121
uts-0.121
't-0.119
ges-0.114
iked-0.111
Token �
Token ID6184
Feature activation+5.73e-03

Pos logit contributions
uts+0.003
uttering+0.002
iked+0.002
ges+0.002
t+0.002
Neg logit contributions
 hospital-0.004
 dentist-0.003
 conductor-0.003
 cal-0.003
 stret-0.003
Token�
Token ID95
Feature activation-2.52e-02

Pos logit contributions
uts+0.009
't+0.008
ges+0.008
 together+0.008
iked+0.008
Neg logit contributions
 hospital-0.011
 dentist-0.011
 stret-0.010
 nurses-0.010
 conductor-0.010
Token€
Token ID26391
Feature activation-1.94e-02

Pos logit contributions
 hospital+0.048
 dentist+0.048
 nurses+0.046
 stret+0.045
 ig+0.045
Neg logit contributions
uts-0.038
't-0.037
 This-0.037
ges-0.034
 together-0.033
Token�
Token ID129
Feature activation-4.17e-02

Pos logit contributions
 dentist+0.039
 hospital+0.039
 stret+0.036
ki+0.036
 conductor+0.035
Neg logit contributions
ges-0.026
uts-0.025
iked-0.024
uttering-0.024
 together-0.024
Token�
Token ID241
Feature activation-1.31e-02

Pos logit contributions
 hospital+0.071
 dentist+0.068
 stret+0.063
 nurses+0.062
 conductor+0.061
Neg logit contributions
uts-0.075
't-0.071
ges-0.071
 together-0.069
 Sorry-0.068
TokenI
Token ID40
Feature activation-4.23e-02

Pos logit contributions
 hospital+0.024
 dentist+0.024
 stret+0.022
 nurses+0.022
 conductor+0.021
Neg logit contributions
uts-0.020
 Sorry-0.019
iked-0.018
 together-0.018
't-0.018
Token.
Token ID13
Feature activation+4.28e-02

Pos logit contributions
uts+0.049
't+0.046
ges+0.045
 Sorry+0.044
 This+0.042
Neg logit contributions
 hospital-0.095
 dentist-0.091
 stret-0.087
 conductor-0.086
 nurses-0.085
Token She
Token ID1375
Feature activation-6.13e-02

Pos logit contributions
uts+0.054
uttering+0.049
't+0.048
iked+0.047
ges+0.047
Neg logit contributions
 hospital-0.096
 dentist-0.094
 nurses-0.088
 stret-0.087
 conductor-0.086
Token was
Token ID373
Feature activation-2.93e-02

Pos logit contributions
 stret+0.121
 hospital+0.116
 dentist+0.116
 nurses+0.116
 engineer+0.110
Neg logit contributions
 together-0.090
't-0.082
uts-0.079
 on-0.074
t-0.073
Token so
Token ID523
Feature activation-8.90e-03

Pos logit contributions
 hospital+0.057
 dentist+0.057
 nurses+0.054
 stret+0.053
 conductor+0.051
Neg logit contributions
't-0.043
uts-0.043
 together-0.042
ges-0.039
 onwards-0.039
Token excited
Token ID6568
Feature activation+6.17e-02

Pos logit contributions
 hospital+0.013
 dentist+0.012
 nurses+0.011
 conductor+0.011
 stret+0.011
Neg logit contributions
uts-0.018
ges-0.017
iked-0.017
't-0.017
 This-0.016
Token to
Token ID284
Feature activation-2.29e-02

Pos logit contributions
uts+0.072
ges+0.070
 Sorry+0.069
't+0.065
uttering+0.064
Neg logit contributions
 hospital-0.142
 dentist-0.137
 nurses-0.128
 conductor-0.125
 floor-0.124
Token feed
Token ID3745
Feature activation+7.95e-02

Pos logit contributions
 hospital+0.041
 dentist+0.040
 nurses+0.037
 stret+0.037
 conductor+0.036
Neg logit contributions
uts-0.039
't-0.038
ges-0.036
 together-0.036
 Sorry-0.035
Token the
Token ID262
Feature activation-9.62e-02

Pos logit contributions
uts+0.127
't+0.123
 together+0.122
ges+0.119
 This+0.114
Neg logit contributions
 hospital-0.147
 dentist-0.145
 stret-0.137
 nurses-0.133
 conductor-0.133
Token cross
Token ID3272
Feature activation+5.32e-03

Pos logit contributions
 stret+0.121
 dentist+0.120
et+0.115
 hospital+0.115
 conductor+0.114
Neg logit contributions
uts-0.190
't-0.190
 together-0.185
uttering-0.184
t-0.179
Token again
Token ID757
Feature activation+4.18e-02

Pos logit contributions
uts+0.006
't+0.006
 together+0.006
ges+0.006
 This+0.005
Neg logit contributions
 hospital-0.012
 dentist-0.012
 nurses-0.011
 stret-0.011
 conductor-0.011
Token soon
Token ID2582
Feature activation+1.92e-02

Pos logit contributions
uts+0.060
't+0.058
ges+0.057
 Sorry+0.057
iked+0.056
Neg logit contributions
 hospital-0.084
 dentist-0.080
 stret-0.074
 nurses-0.074
 conductor-0.073
Token very
Token ID845
Feature activation-1.37e-02

Pos logit contributions
 hospital+0.058
 dentist+0.057
 conductor+0.055
 nurses+0.055
 stret+0.055
Neg logit contributions
uts-0.041
't-0.041
 together-0.040
ges-0.038
uttering-0.036
Token angry
Token ID7954
Feature activation+4.87e-02

Pos logit contributions
 hospital+0.022
 dentist+0.021
 nurses+0.019
 stret+0.019
 conductor+0.019
Neg logit contributions
uts-0.026
't-0.025
ges-0.025
 together-0.024
uttering-0.024
Token.
Token ID13
Feature activation+5.00e-02

Pos logit contributions
 Sorry+0.088
uttered+0.085
uttering+0.084
iked+0.084
't+0.082
Neg logit contributions
 hospital-0.082
 dentist-0.079
 doctors-0.070
 nurses-0.069
 floor-0.069
Token\n
Token ID198
Feature activation+3.29e-02

Pos logit contributions
iked+0.083
uttering+0.081
ived+0.075
uttered+0.072
oor+0.072
Neg logit contributions
 dentist-0.091
 hospital-0.089
 doctor-0.077
 nurses-0.076
 witch-0.075
Token\n
Token ID198
Feature activation+2.84e-02

Pos logit contributions
iked+0.048
uttering+0.047
 Sorry+0.045
uts+0.043
uttered+0.043
Neg logit contributions
 dentist-0.063
 hospital-0.063
 stret-0.055
 conductor-0.054
 floor-0.053
TokenL
Token ID43
Feature activation-7.17e-02

Pos logit contributions
uts+0.036
uttered+0.035
iked+0.035
uttering+0.034
ges+0.032
Neg logit contributions
 hospital-0.059
 dentist-0.058
 nurses-0.055
 stret-0.053
 conductor-0.053
Tokenily
Token ID813
Feature activation-8.98e-02

Pos logit contributions
 hospital+0.122
 dentist+0.118
 stret+0.110
 nurses+0.109
 conductor+0.107
Neg logit contributions
uts-0.131
't-0.125
ges-0.124
 together-0.119
uttering-0.117
Token and
Token ID290
Feature activation-3.69e-02

Pos logit contributions
 dentist+0.208
 hospital+0.206
 difference+0.202
 nurses+0.201
 stret+0.201
Neg logit contributions
uts-0.103
't-0.101
 together-0.090
ed-0.086
ges-0.085
Token Ben
Token ID3932
Feature activation-7.73e-02

Pos logit contributions
 hospital+0.077
 stret+0.075
 dentist+0.075
 conductor+0.072
an+0.072
Neg logit contributions
't-0.049
uts-0.049
ges-0.047
 together-0.045
 Sorry-0.044
Token came
Token ID1625
Feature activation+4.19e-02

Pos logit contributions
 dentist+0.164
 hospital+0.158
 nurses+0.146
 stret+0.144
 ig+0.143
Neg logit contributions
't-0.109
uts-0.107
ges-0.104
 together-0.103
t-0.095
Token out
Token ID503
Feature activation+2.21e-02

Pos logit contributions
uts+0.062
 Sorry+0.058
't+0.052
 This+0.052
t+0.049
Neg logit contributions
 hospital-0.086
 nurses-0.081
 conductor-0.078
 wounds-0.078
 dentist-0.078
Token together
Token ID1978
Feature activation+5.03e-02

Pos logit contributions
uts+0.078
iked+0.070
't+0.069
ges+0.069
uttering+0.066
Neg logit contributions
 hospital-0.172
 dentist-0.163
 stret-0.156
 nurses-0.152
 conductor-0.152
Token.
Token ID13
Feature activation+4.75e-02

Pos logit contributions
uts+0.085
iked+0.085
 Sorry+0.082
elt+0.081
uttering+0.080
Neg logit contributions
 hospital-0.090
 dentist-0.082
 floor-0.078
et-0.077
 mistake-0.076
Token<|endoftext|>
Token ID50256
Feature activation+3.56e-02

Pos logit contributions
iked+0.058
uts+0.055
uttering+0.055
ges+0.050
t+0.050
Neg logit contributions
 hospital-0.108
 dentist-0.105
 conductor-0.096
 nurses-0.094
 tears-0.092
TokenOnce
Token ID7454
Feature activation+2.65e-02

Pos logit contributions
uts+0.058
iked+0.056
 Sorry+0.054
uttering+0.053
uttered+0.048
Neg logit contributions
 dentist-0.066
 hospital-0.065
 bear-0.060
 rug-0.056
 snow-0.056
Token upon
Token ID2402
Feature activation-2.39e-02

Pos logit contributions
uts+0.038
 Sorry+0.034
iked+0.032
ges+0.031
ucked+0.031
Neg logit contributions
 hospital-0.053
 dentist-0.052
 ig-0.049
 bear-0.047
 cob-0.046
Token a
Token ID257
Feature activation-4.73e-02

Pos logit contributions
 hospital+0.042
 dentist+0.042
 stret+0.037
 nurses+0.037
 scarf+0.036
Neg logit contributions
uts-0.042
ges-0.039
iked-0.039
 Sorry-0.038
't-0.038
Token time
Token ID640
Feature activation+3.66e-02

Pos logit contributions
 hospital+0.085
 dentist+0.082
 nurses+0.079
 conductor+0.076
 stret+0.076
Neg logit contributions
uts-0.079
't-0.077
 together-0.074
ges-0.073
uttering-0.072
Token,
Token ID11
Feature activation-4.00e-03

Pos logit contributions
uts+0.069
itted+0.066
iked+0.065
uttering+0.065
 Sorry+0.065
Neg logit contributions
 ig-0.045
 hospital-0.044
 rug-0.042
 dentist-0.042
 bear-0.041
Token there
Token ID612
Feature activation+1.05e-02

Pos logit contributions
 hospital+0.008
 ig+0.008
 cob+0.008
 dentist+0.008
 rug+0.008
Neg logit contributions
uts-0.006
iked-0.005
 Sorry-0.004
't-0.004
oor-0.004
Token was
Token ID373
Feature activation-2.29e-02

Pos logit contributions
uts+0.024
uttering+0.022
ges+0.022
iked+0.022
't+0.022
Neg logit contributions
 hospital-0.013
 dentist-0.013
 ig-0.011
 cob-0.011
 floor-0.011
Token a
Token ID257
Feature activation-5.84e-02

Pos logit contributions
 hospital+0.042
 dentist+0.041
 nurses+0.037
 stret+0.037
 conductor+0.036
Neg logit contributions
uts-0.039
't-0.036
ges-0.036
iked-0.035
 Sorry-0.034
Token house
Token ID2156
Feature activation+3.44e-02

Pos logit contributions
ice+0.141
 dentist+0.135
et+0.135
abel+0.132
ureen+0.131
Neg logit contributions
't-0.204
ges-0.192
 together-0.192
 Sorry-0.191
 Yes-0.191
Token.
Token ID13
Feature activation+5.40e-02

Pos logit contributions
uts+0.047
't+0.044
 Sorry+0.043
ges+0.042
uttering+0.041
Neg logit contributions
 hospital-0.074
 dentist-0.072
 nurses-0.067
 conductor-0.066
 stret-0.066
Token\n
Token ID198
Feature activation+3.47e-02

Pos logit contributions
uts+0.067
't+0.066
t+0.065
ges+0.064
uttering+0.063
Neg logit contributions
 hospital-0.121
 dentist-0.114
 stret-0.108
 nurses-0.107
 snow-0.105
Token\n
Token ID198
Feature activation+3.08e-02

Pos logit contributions
uts+0.039
iked+0.037
uttering+0.037
ges+0.035
 Sorry+0.034
Neg logit contributions
 hospital-0.081
 dentist-0.079
 stret-0.076
 conductor-0.074
 wounds-0.072
Token"
Token ID1
Feature activation+1.27e-02

Pos logit contributions
uts+0.037
iked+0.033
't+0.032
uttering+0.032
ges+0.032
Neg logit contributions
 hospital-0.069
 dentist-0.066
 nurses-0.065
 stret-0.065
 conductor-0.064
TokenS
Token ID50
Feature activation-4.60e-02

Pos logit contributions
uts+0.018
't+0.017
ges+0.016
 together+0.016
 Sorry+0.016
Neg logit contributions
 hospital-0.027
 dentist-0.026
 stret-0.024
 nurses-0.024
 conductor-0.024
Tokenara
Token ID3301
Feature activation-7.00e-02

Pos logit contributions
 hospital+0.054
 dentist+0.053
 nurses+0.046
 stret+0.045
 conductor+0.042
Neg logit contributions
uts-0.109
't-0.103
ges-0.101
 together-0.100
 Sorry-0.098
Token!
Token ID0
Feature activation+3.89e-02

Pos logit contributions
 hospital+0.168
 dentist+0.164
 nurses+0.154
 stret+0.154
 conductor+0.151
Neg logit contributions
uts-0.078
't-0.071
ges-0.069
 together-0.066
uttering-0.065
Token Ben
Token ID3932
Feature activation-5.83e-02

Pos logit contributions
uts+0.055
 together+0.052
uttering+0.052
iked+0.051
 onwards+0.049
Neg logit contributions
 hospital-0.082
 dentist-0.079
 nurses-0.072
 ig-0.072
 stret-0.070
Token!
Token ID0
Feature activation+3.73e-02

Pos logit contributions
 hospital+0.125
 dentist+0.123
 nurses+0.115
 stret+0.115
 conductor+0.113
Neg logit contributions
uts-0.079
't-0.074
ges-0.071
 together-0.070
 Sorry-0.067
Token Come
Token ID7911
Feature activation+3.92e-02

Pos logit contributions
uttering+0.045
uts+0.045
 together+0.044
iked+0.042
uttered+0.039
Neg logit contributions
 hospital-0.086
 dentist-0.085
 nurses-0.077
 ig-0.077
 boo-0.075
TokenWho
Token ID8241
Feature activation-1.26e-02

Pos logit contributions
uts+0.007
't+0.007
ges+0.007
 together+0.007
 Sorry+0.006
Neg logit contributions
 hospital-0.011
 dentist-0.011
 stret-0.010
 nurses-0.010
 conductor-0.010
Token pushed
Token ID7121
Feature activation+7.85e-02

Pos logit contributions
 hospital+0.024
 stret+0.023
 dentist+0.023
 nurses+0.022
 conductor+0.022
Neg logit contributions
uts-0.020
't-0.019
 together-0.019
ges-0.018
t-0.018
Token this
Token ID428
Feature activation-7.97e-02

Pos logit contributions
uts+0.120
't+0.118
 together+0.113
ges+0.113
 This+0.108
Neg logit contributions
 hospital-0.152
 dentist-0.150
 stret-0.139
 nurses-0.138
 conductor-0.135
Token?"
Token ID1701
Feature activation+3.67e-02

Pos logit contributions
 hospital+0.149
 dentist+0.145
 nurses+0.135
 stret+0.134
 conductor+0.132
Neg logit contributions
uts-0.130
't-0.123
ges-0.120
 together-0.118
 Sorry-0.114
Token one
Token ID530
Feature activation-2.91e-02

Pos logit contributions
uts+0.056
't+0.056
 together+0.052
ges+0.052
 This+0.052
Neg logit contributions
 hospital-0.070
 dentist-0.069
 stret-0.065
 nurses-0.064
 conductor-0.062
Token of
Token ID286
Feature activation-3.56e-02

Pos logit contributions
 hospital+0.070
 dentist+0.069
 stret+0.066
 nurses+0.065
 conductor+0.063
Neg logit contributions
uts-0.031
't-0.029
 together-0.028
ges-0.028
t-0.026
Token them
Token ID606
Feature activation-3.84e-02

Pos logit contributions
 dentist+0.077
 hospital+0.077
 stret+0.073
an+0.070
 nurses+0.069
Neg logit contributions
ges-0.047
't-0.047
 together-0.042
 This-0.040
 onwards-0.038
Token asked
Token ID1965
Feature activation+4.97e-02

Pos logit contributions
 hospital+0.058
 dentist+0.056
 nurses+0.051
 stret+0.051
 conductor+0.051
Neg logit contributions
uts-0.076
't-0.073
ges-0.071
 Sorry-0.069
uttering-0.069
Token.
Token ID13
Feature activation+5.82e-02

Pos logit contributions
uts+0.063
't+0.058
ges+0.055
 Sorry+0.054
uttering+0.053
Neg logit contributions
 hospital-0.111
 dentist-0.108
 nurses-0.103
 stret-0.102
 conductor-0.101
Token\n
Token ID198
Feature activation+4.04e-02

Pos logit contributions
uts+0.074
uttering+0.073
ges+0.068
iked+0.067
t+0.063
Neg logit contributions
 hospital-0.123
 dentist-0.121
 conductor-0.116
 stret-0.112
 engine-0.108
Token\n
Token ID198
Feature activation+3.57e-02

Pos logit contributions
uts+0.040
uttering+0.036
ges+0.036
iked+0.035
 Sorry+0.034
Neg logit contributions
 hospital-0.099
 dentist-0.098
 conductor-0.093
 stret-0.092
 nurses-0.089
Token struggled
Token ID11615
Feature activation+1.08e-01

Pos logit contributions
 hospital+0.011
 dentist+0.010
 nurses+0.009
 cob+0.009
 stret+0.009
Neg logit contributions
uts-0.016
uttering-0.015
't-0.015
 Sorry-0.015
 This-0.015
Token to
Token ID284
Feature activation-3.21e-02

Pos logit contributions
 Sorry+0.170
uts+0.156
iked+0.141
ges+0.141
 This+0.137
Neg logit contributions
 hospital-0.214
 dentist-0.196
 floor-0.192
 cob-0.191
 mistake-0.187
Token get
Token ID651
Feature activation+4.85e-02

Pos logit contributions
 dentist+0.062
 hospital+0.059
 nurses+0.057
 conductor+0.056
 stret+0.054
Neg logit contributions
't-0.052
uts-0.050
 together-0.047
ges-0.046
 This-0.042
Token out
Token ID503
Feature activation+2.57e-02

Pos logit contributions
uts+0.081
 Sorry+0.071
iked+0.071
 This+0.070
ges+0.070
Neg logit contributions
 hospital-0.094
 dentist-0.086
 nurses-0.085
 cal-0.080
 boo-0.080
Token but
Token ID475
Feature activation-1.25e-03

Pos logit contributions
uts+0.032
't+0.030
 Sorry+0.029
ges+0.028
uttering+0.027
Neg logit contributions
 hospital-0.058
 dentist-0.055
 nurses-0.053
 stret-0.052
 ig-0.051
Token the
Token ID262
Feature activation-9.07e-02

Pos logit contributions
 hospital+0.003
 dentist+0.003
 nurses+0.003
 stret+0.002
 conductor+0.002
Neg logit contributions
uts-0.002
't-0.001
ges-0.001
 Sorry-0.001
 together-0.001
Token lake
Token ID13546
Feature activation+2.70e-02

Pos logit contributions
 dentist+0.120
 stret+0.116
 hospital+0.114
ki+0.109
 conductor+0.108
Neg logit contributions
't-0.191
 together-0.183
uts-0.177
 Sorry-0.171
ges-0.170
Token was
Token ID373
Feature activation-1.21e-02

Pos logit contributions
uts+0.042
't+0.042
 together+0.041
t+0.041
ges+0.038
Neg logit contributions
 dentist-0.052
 stret-0.051
 hospital-0.049
 nurses-0.049
 conductor-0.046
Token too
Token ID1165
Feature activation+1.03e-02

Pos logit contributions
 hospital+0.024
 dentist+0.024
 nurses+0.022
 stret+0.022
 conductor+0.021
Neg logit contributions
uts-0.018
't-0.017
ges-0.017
 Sorry-0.016
 together-0.016
Token strong
Token ID1913
Feature activation+6.21e-02

Pos logit contributions
uts+0.020
ges+0.020
 Sorry+0.019
't+0.019
iked+0.019
Neg logit contributions
 hospital-0.015
 dentist-0.015
 nurses-0.013
 stret-0.013
 cob-0.012
Token!
Token ID0
Feature activation+3.79e-02

Pos logit contributions
 Sorry+0.108
uts+0.095
uttering+0.091
iked+0.091
uttered+0.090
Neg logit contributions
 hospital-0.114
 dentist-0.108
et-0.101
 nurses-0.099
 floor-0.099
Token!"
Token ID2474
Feature activation+5.16e-02

Pos logit contributions
uts+0.057
iked+0.057
uttering+0.055
 onwards+0.054
ges+0.053
Neg logit contributions
 hospital-0.083
 dentist-0.080
 floor-0.074
et-0.074
 engine-0.073
Token\n
Token ID198
Feature activation+3.50e-02

Pos logit contributions
uttering+0.068
uts+0.066
ges+0.064
iked+0.062
uttered+0.060
Neg logit contributions
 hospital-0.114
 dentist-0.109
 floor-0.099
 stret-0.099
 tears-0.095
Token\n
Token ID198
Feature activation+2.93e-02

Pos logit contributions
uts+0.031
 Sorry+0.027
ges+0.027
iked+0.026
uttering+0.026
Neg logit contributions
 hospital-0.091
 dentist-0.090
 stret-0.084
 conductor-0.083
 nurses-0.083
TokenM
Token ID44
Feature activation-7.14e-02

Pos logit contributions
uts+0.049
't+0.047
ges+0.045
 together+0.044
uttering+0.043
Neg logit contributions
 hospital-0.054
 dentist-0.053
 nurses-0.049
 stret-0.049
 conductor-0.048
Tokenia
Token ID544
Feature activation-1.04e-01

Pos logit contributions
 hospital+0.090
 dentist+0.088
 nurses+0.077
 stret+0.076
 conductor+0.074
Neg logit contributions
uts-0.161
't-0.155
ges-0.154
 together-0.151
 This-0.148
Token's
Token ID338
Feature activation-7.94e-02

Pos logit contributions
 dentist+0.199
 stret+0.198
 nurses+0.197
 hospital+0.192
 conductor+0.190
Neg logit contributions
 together-0.156
't-0.154
uts-0.150
 This-0.144
aps-0.140
Token mom
Token ID1995
Feature activation-5.50e-02

Pos logit contributions
 hospital+0.117
 nurses+0.114
 dentist+0.114
 stret+0.113
ice+0.107
Neg logit contributions
uts-0.150
 together-0.145
't-0.145
uttering-0.143
 Sorry-0.140
Token watches
Token ID16860
Feature activation+6.87e-02

Pos logit contributions
 hospital+0.096
 stret+0.093
 nurses+0.091
 dentist+0.091
 conductor+0.088
Neg logit contributions
't-0.093
 together-0.090
ges-0.086
uts-0.082
 This-0.081
Token her
Token ID607
Feature activation-8.39e-02

Pos logit contributions
uts+0.108
't+0.097
 Sorry+0.096
uttering+0.095
ges+0.095
Neg logit contributions
 hospital-0.135
 dentist-0.131
 nurses-0.124
 stret-0.121
 conductor-0.118
Token play
Token ID711
Feature activation+4.64e-02

Pos logit contributions
 hospital+0.163
 dentist+0.157
 stret+0.147
 nurses+0.145
 conductor+0.143
Neg logit contributions
uts-0.134
't-0.126
ges-0.123
 Sorry-0.118
uttering-0.118
Token.
Token ID13
Feature activation+4.66e-02

Pos logit contributions
uts+0.053
't+0.049
ges+0.047
 together+0.045
uttering+0.044
Neg logit contributions
 hospital-0.110
 dentist-0.108
 nurses-0.101
 stret-0.101
 conductor-0.099
Token it
Token ID340
Feature activation+4.34e-03

Pos logit contributions
uts+0.062
iked+0.051
 Sorry+0.051
uttering+0.051
't+0.050
Neg logit contributions
 hospital-0.080
 tears-0.076
 dentist-0.076
 scarf-0.075
et-0.074
Token in
Token ID287
Feature activation-1.25e-02

Pos logit contributions
uts+0.008
 Sorry+0.008
ges+0.007
't+0.007
 Do+0.007
Neg logit contributions
 hospital-0.007
 scarf-0.007
 floor-0.007
 dentist-0.006
 tears-0.006
Token her
Token ID607
Feature activation-9.40e-02

Pos logit contributions
 hospital+0.020
 dentist+0.020
 tears+0.019
 scarf+0.019
 floor+0.018
Neg logit contributions
uts-0.024
t-0.022
iked-0.022
't-0.021
 Sorry-0.021
Token hands
Token ID2832
Feature activation+4.92e-02

Pos logit contributions
 dentist+0.171
 hospital+0.171
 nurses+0.161
 stret+0.153
 conductor+0.150
Neg logit contributions
uts-0.154
 together-0.152
 This-0.150
't-0.150
ges-0.147
Token.
Token ID13
Feature activation+3.74e-02

Pos logit contributions
 Sorry+0.084
uts+0.076
't+0.073
ges+0.073
iked+0.071
Neg logit contributions
 hospital-0.087
 dentist-0.082
 floor-0.080
 conductor-0.077
 engine-0.077
Token The
Token ID383
Feature activation-8.56e-02

Pos logit contributions
uttering+0.058
uts+0.058
ges+0.053
t+0.052
iked+0.050
Neg logit contributions
 hospital-0.070
 dentist-0.068
 floor-0.062
 swings-0.061
 rug-0.060
Token object
Token ID2134
Feature activation-1.72e-02

Pos logit contributions
 dentist+0.113
 hospital+0.112
 stret+0.107
 nurses+0.101
abel+0.099
Neg logit contributions
 Yes-0.172
't-0.169
ges-0.167
 together-0.167
uttering-0.167
Token was
Token ID373
Feature activation-2.16e-02

Pos logit contributions
 stret+0.035
 dentist+0.033
 nurses+0.033
 hospital+0.032
 engineer+0.030
Neg logit contributions
uts-0.026
 together-0.026
't-0.026
t-0.024
uttered-0.023
Token so
Token ID523
Feature activation+1.79e-03

Pos logit contributions
 hospital+0.038
 dentist+0.037
 nurses+0.035
 stret+0.035
 conductor+0.034
Neg logit contributions
uts-0.037
't-0.035
ges-0.034
 together-0.034
uttering-0.033
Token pretty
Token ID2495
Feature activation+2.47e-02

Pos logit contributions
ges+0.004
't+0.004
iked+0.004
uts+0.004
t+0.004
Neg logit contributions
 hospital-0.002
 cob-0.001
 dentist-0.001
 floor-0.001
 ig-0.001
Token and
Token ID290
Feature activation-2.77e-04

Pos logit contributions
uts+0.038
't+0.037
 Sorry+0.037
ges+0.037
iked+0.035
Neg logit contributions
 hospital-0.047
 dentist-0.045
 floor-0.041
 nurses-0.041
 cob-0.041
Token She
Token ID1375
Feature activation-5.96e-02

Pos logit contributions
uts+0.060
uttering+0.057
ges+0.054
ectar+0.052
iked+0.051
Neg logit contributions
 hospital-0.081
 dentist-0.078
 tears-0.073
 floor-0.072
 swings-0.069
Token even
Token ID772
Feature activation-7.92e-03

Pos logit contributions
 dentist+0.127
 hospital+0.118
 stret+0.116
 nurses+0.113
 x+0.105
Neg logit contributions
 together-0.097
't-0.084
uts-0.083
t-0.082
 on-0.082
Token blew
Token ID17948
Feature activation+8.21e-02

Pos logit contributions
 dentist+0.016
 hospital+0.015
 stret+0.014
 wounds+0.014
 difference+0.013
Neg logit contributions
 together-0.014
ges-0.012
 on-0.012
 onwards-0.011
ived-0.011
Token bubbles
Token ID25037
Feature activation+6.42e-02

Pos logit contributions
uts+0.113
't+0.107
ges+0.105
 together+0.103
uttering+0.098
Neg logit contributions
 hospital-0.174
 dentist-0.171
 nurses-0.159
 stret-0.159
 conductor-0.156
Token with
Token ID351
Feature activation+4.56e-03

Pos logit contributions
 Sorry+0.081
uts+0.080
't+0.075
 This+0.073
ges+0.071
Neg logit contributions
 hospital-0.139
 dentist-0.129
 stret-0.129
et-0.126
 conductor-0.125
Token her
Token ID607
Feature activation-8.17e-02

Pos logit contributions
uts+0.006
't+0.006
ges+0.005
 together+0.005
 This+0.005
Neg logit contributions
 hospital-0.010
 dentist-0.010
 stret-0.009
 nurses-0.009
 conductor-0.009
Token doll
Token ID3654
Feature activation+4.78e-03

Pos logit contributions
 hospital+0.117
 dentist+0.114
 nurses+0.104
 stret+0.102
 conductor+0.099
Neg logit contributions
uts-0.172
't-0.165
ges-0.161
 together-0.161
 Sorry-0.153
Token and
Token ID290
Feature activation+2.35e-03

Pos logit contributions
uts+0.006
't+0.006
ges+0.006
 together+0.006
uttering+0.005
Neg logit contributions
 hospital-0.010
 dentist-0.010
 nurses-0.010
 stret-0.010
 conductor-0.009
Token passed
Token ID3804
Feature activation+5.09e-02

Pos logit contributions
uts+0.004
 together+0.004
ges+0.004
't+0.004
iked+0.003
Neg logit contributions
 hospital-0.004
 dentist-0.004
 stret-0.004
 nurses-0.004
 conductor-0.004
Token the
Token ID262
Feature activation-1.01e-01

Pos logit contributions
uts+0.064
 together+0.063
ges+0.062
't+0.061
 onwards+0.056
Neg logit contributions
 hospital-0.113
 dentist-0.111
 stret-0.105
 nurses-0.102
 conductor-0.101
Token time
Token ID640
Feature activation+5.89e-02

Pos logit contributions
 dentist+0.145
 hospital+0.141
 nurses+0.128
 stret+0.127
 conductor+0.122
Neg logit contributions
't-0.200
uts-0.199
 together-0.195
ges-0.192
t-0.187
Token itself
Token ID2346
Feature activation-2.34e-02

Pos logit contributions
 together+0.012
uts+0.011
ges+0.011
't+0.011
 This+0.011
Neg logit contributions
 dentist-0.016
 hospital-0.016
 stret-0.015
 conductor-0.015
 nurses-0.015
Token.
Token ID13
Feature activation+3.56e-02

Pos logit contributions
 hospital+0.054
 dentist+0.053
 nurses+0.050
 stret+0.050
 conductor+0.049
Neg logit contributions
uts-0.027
't-0.025
ges-0.025
 together-0.024
uttering-0.023
Token
Token ID220
Feature activation+7.03e-02

Pos logit contributions
uts+0.045
uttering+0.041
ges+0.041
't+0.040
iked+0.038
Neg logit contributions
 hospital-0.080
 dentist-0.078
 conductor-0.071
 nurses-0.071
 stret-0.070
Token\n
Token ID198
Feature activation+3.23e-02

Pos logit contributions
iked+0.077
 Sorry+0.076
uttering+0.076
ges+0.075
uts+0.073
Neg logit contributions
 hospital-0.154
 dentist-0.153
 stret-0.139
 conductor-0.135
 cal-0.133
Token\n
Token ID198
Feature activation+3.08e-02

Pos logit contributions
uts+0.035
uttering+0.034
 Sorry+0.033
ges+0.033
iked+0.032
Neg logit contributions
 hospital-0.074
 dentist-0.073
 stret-0.069
 conductor-0.068
 wounds-0.065
TokenThe
Token ID464
Feature activation-8.82e-02

Pos logit contributions
uts+0.060
't+0.057
ges+0.056
 together+0.055
uttering+0.054
Neg logit contributions
 hospital-0.048
 dentist-0.047
 nurses-0.043
 stret-0.043
 conductor-0.042
Token boy
Token ID2933
Feature activation-7.47e-02

Pos logit contributions
 stret+0.152
ki+0.128
 dentist+0.126
 conductor+0.122
ureen+0.121
Neg logit contributions
 together-0.150
't-0.148
 on-0.145
 Yes-0.145
 Sorry-0.140
Token never
Token ID1239
Feature activation+6.28e-04

Pos logit contributions
 stret+0.146
 dentist+0.145
 nurses+0.138
 hospital+0.135
 cob+0.127
Neg logit contributions
 together-0.118
't-0.106
 on-0.099
uts-0.098
 onwards-0.097
Token forgot
Token ID16453
Feature activation+3.32e-02

Pos logit contributions
ges+0.001
uts+0.001
 together+0.001
't+0.001
t+0.001
Neg logit contributions
 stret-0.001
 dentist-0.001
 hospital-0.001
 wounds-0.001
 rug-0.001
Token the
Token ID262
Feature activation-9.82e-02

Pos logit contributions
uts+0.044
 together+0.044
't+0.041
 This+0.040
ges+0.039
Neg logit contributions
 dentist-0.068
 hospital-0.067
 stret-0.067
 conductor-0.066
an-0.065
Token lesson
Token ID11483
Feature activation+8.46e-03

Pos logit contributions
 stret+0.161
 dentist+0.155
 hospital+0.152
 conductor+0.152
ice+0.146
Neg logit contributions
 Yes-0.163
 together-0.159
't-0.157
 Today-0.152
uttering-0.151