TOP ACTIVATIONS
MAX = 14.864

, it kept on flying err atically with inexplicable speed .
room while she ran an err and . The girl was
at his salon and running err ands . But every day
driver was done running his err ands , he drove back
her mom were out running err ands . Mary was three
on dry and cozy p aj amas and sn ugg led
and red and covered in scrib bles . Em
her cr ay ons to scrib ble on the wall .
and even tried to take n aps on the rabbit 's
needed to go do some err ands . Mom
have many bl anks and scrib bles . They feel bad
bee flew around and gathered n ectar to make honey .
AND YOU WILL GET IN TR OU BLE !"
toy until it was all sl ob bery .
and put on your p aj amas . Then we can
Buzz went out to collect n ectar from flowers .
sign that said " RE VERS E ". L ila and
‘ S ail ors Wanted ’ . Do
. Amy then began to scrib ble the prices on all
doctor , which was an err and he didn 't understand

MIN ACTIVATIONS
MIN = -10.741

inside that barn . If she did , the owners would
can happen . But if she was smart and quick ,
being silly . Even though she tried her best , the
each other . Even though she had to pay a lot
very frustrated . Even though she had had a lovely bath
a secret code . If she broke the code , she
and sugar . Even though she was only three years old
was gone . Even though she could not see it anymore
a scientist . Even though she couldn 't read yet ,
a special plant . If she took good care of it
the ocean . Even though she was tired , she was
felt frustrated . Even though she understood why the fence was
happy people . Even though she was a bit scared ,
seem scary , but if she talked about it with her
play with . Even though she was still poor , she
much slower . Even though she tried her best , the
of herself . Even though she was at the hospital ,
purse back . Even though she had forgotten it , everything
mess again . And if she did accidentally knock over her
little girl . Even though she was very small , she

INTERVAL 8.463 - 14.864
CONTAINS 0.266%

's a great poem !" <|endoftext|> Once upon a time ,
a very kind vine . <|endoftext|> Ben and Mia are twins
. They are happy . <|endoftext|> L ily and Ben are
the little girl watched in amaz ement . Mama 's fingers
made an impressive sandwich . <|endoftext|> Anna and Ben are twins

INTERVAL 2.062 - 8.463
CONTAINS 37.001%

but she was still troubled . She looked around and saw
her . She liked her dress . She thought it was
Spot ate the cookie . Then he ran to
a loud noise was heard . It was coming from outside
was incredible . The warrior showed them how it worked ,

INTERVAL -4.340 - 2.062
CONTAINS 44.619%

. One time , the ball rolled too far
her . One morning , the little girl
. Maybe we can make some delicious soup with it ,"
on her face and was so happy to get her y
played together , ate together , and slept together . They

INTERVAL -10.741 - -4.340
CONTAINS 18.114%

From then on , Grace was always sure to be
on a big adventure . He held up his head and
girl named Sue . Sue also had a twist y toy
Ben agreed . He wished he could soar like the hawk
water , then scrub bed and scrub bed until all the
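
The four INTERVAL blocks above appear to be equal-width bins spanning MIN = -10.741 to MAX = 14.864: each is about 6.401 wide, and the four CONTAINS percentages sum to 100. The following is a minimal sketch of that binning, assuming equal-width bins and top-bin-inclusive edges. The function names `equal_width_bins` and `bin_shares` are hypothetical and are not taken from the tool that produced this dump.

```python
def equal_width_bins(lo, hi, n_bins=4):
    """Return (lower, upper) edges of n_bins equal-width intervals, highest first."""
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    return [(edges[i], edges[i + 1]) for i in reversed(range(n_bins))]


def bin_shares(activations, lo, hi, n_bins=4):
    """Percentage of activations in each bin, highest bin first.

    Assumption: values are clamped into [lo, hi], so the maximum
    activation lands in the top bin rather than overflowing it.
    """
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for a in activations:
        idx = int((a - lo) / width)
        idx = max(0, min(idx, n_bins - 1))  # clamp to a valid bin index
        counts[idx] += 1
    total = len(activations)
    return [100.0 * c / total for c in reversed(counts)]


# Edges implied by the MIN/MAX values reported in this dump.
bins = equal_width_bins(-10.741, 14.864)
for lower, upper in bins:
    print(f"INTERVAL {lower:.3f} - {upper:.3f}")
```

The printed edges should match the interval headers above up to rounding in the last digit. `bin_shares` would need the full list of activations, which this dump does not include, to reproduce the CONTAINS percentages.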
Token,
Token ID11
Feature activation-1.48e+00

Pos logit contributions
aps+0.038
apping+0.032
uttering+0.032
 This+0.030
apped+0.027
Neg logit contributions
 forgot-0.060
 punished-0.054
 cleaned-0.050
 sighed-0.049
 realized-0.049
Token it
Token ID340
Feature activation-4.75e+00

Pos logit contributions
 forgot+0.037
 punished+0.033
 cleaned+0.030
 sighed+0.029
 regretted+0.029
Neg logit contributions
aps-0.032
apping-0.028
uttering-0.028
 This-0.027
apped-0.025
Token kept
Token ID4030
Feature activation+2.79e+00

Pos logit contributions
 nurses+0.075
 forgot+0.074
 punished+0.072
 overs+0.065
 promise+0.064
Neg logit contributions
aps-0.095
apped-0.092
uttering-0.092
apping-0.090
uts-0.090
Token on
Token ID319
Feature activation+1.08e+00

Pos logit contributions
aps+0.063
apping+0.056
uttering+0.052
 This+0.052
uts+0.048
Neg logit contributions
 forgot-0.068
 punished-0.063
 cleaned-0.057
 sighed-0.055
 realized-0.054
Token flying
Token ID7348
Feature activation+5.56e+00

Pos logit contributions
aps+0.026
apping+0.023
uttering+0.022
 This+0.021
apped+0.020
Neg logit contributions
 forgot-0.025
 punished-0.023
 cleaned-0.021
 realized-0.020
 regretted-0.020
Token err
Token ID11454
Feature activation+1.47e+01

Pos logit contributions
aps+0.121
apping+0.106
uttering+0.103
 This+0.100
apped+0.093
Neg logit contributions
 forgot-0.140
 punished-0.126
 cleaned-0.113
 sighed-0.112
 regretted-0.111
Tokenatically
Token ID4142
Feature activation+6.99e-01

Pos logit contributions
aps+0.338
apping+0.301
uttering+0.295
 This+0.286
apped+0.271
Neg logit contributions
 forgot-0.341
 punished-0.308
 cleaned-0.278
 sighed-0.269
 realized-0.266
Token with
Token ID351
Feature activation+2.74e-01

Pos logit contributions
aps+0.015
apping+0.013
uttering+0.013
 This+0.013
apped+0.012
Neg logit contributions
 forgot-0.017
 punished-0.015
 cleaned-0.014
 realized-0.014
 sighed-0.014
Token inexplicable
Token ID49602
Feature activation-3.17e+00

Pos logit contributions
aps+0.006
apping+0.006
uttering+0.005
 This+0.005
apped+0.005
Neg logit contributions
 forgot-0.007
 punished-0.006
 cleaned-0.005
 sighed-0.005
 realized-0.005
Token speed
Token ID2866
Feature activation+3.46e+00

Pos logit contributions
 forgot+0.068
 punished+0.059
 cleaned+0.053
 sighed+0.052
 realized+0.051
Neg logit contributions
aps-0.081
apping-0.073
uttering-0.071
 This-0.069
apped-0.066
Token.
Token ID13
Feature activation+5.19e+00

Pos logit contributions
aps+0.067
apping+0.056
uttering+0.055
 This+0.052
apped+0.048
Neg logit contributions
 forgot-0.097
 punished-0.087
 cleaned-0.080
 sighed-0.079
 regretted-0.079
Token room
Token ID2119
Feature activation+4.49e+00

Pos logit contributions
 forgot+0.101
 punished+0.086
 sighed+0.076
 realized+0.075
 cleaned+0.074
Neg logit contributions
aps-0.137
apping-0.121
uttering-0.120
 This-0.120
apped-0.109
Token while
Token ID981
Feature activation+4.00e+00

Pos logit contributions
aps+0.090
apping+0.079
uttering+0.076
 This+0.073
apped+0.068
Neg logit contributions
 forgot-0.120
 punished-0.108
 sighed-0.098
 cleaned-0.097
 regretted-0.097
Token she
Token ID673
Feature activation-8.25e+00

Pos logit contributions
aps+0.099
apping+0.085
uttering+0.084
 This+0.083
ay+0.079
Neg logit contributions
 forgot-0.093
 punished-0.077
 cleaned-0.073
 realized-0.072
 nurses-0.071
Token ran
Token ID4966
Feature activation+3.39e+00

Pos logit contributions
 forgot+0.148
 punished+0.141
 regret+0.113
 regretted+0.112
 nurses+0.110
Neg logit contributions
aps-0.217
apped-0.188
apping-0.188
uttering-0.176
 This-0.173
Token an
Token ID281
Feature activation-1.04e+00

Pos logit contributions
aps+0.062
apping+0.054
uttering+0.052
 This+0.049
apped+0.046
Neg logit contributions
 forgot-0.097
 punished-0.088
 cleaned-0.082
 sighed-0.080
 realized-0.079
Token err
Token ID11454
Feature activation+1.31e+01

Pos logit contributions
 forgot+0.024
 punished+0.019
 sighed+0.017
 realized+0.017
 regretted+0.017
Neg logit contributions
aps-0.024
apping-0.023
uttering-0.023
 This-0.022
ummy-0.021
Tokenand
Token ID392
Feature activation+4.15e+00

Pos logit contributions
aps+0.240
apping+0.206
uttering+0.198
 This+0.189
apped+0.174
Neg logit contributions
 forgot-0.376
 punished-0.340
 cleaned-0.311
 sighed-0.310
 regretted-0.308
Token.
Token ID13
Feature activation+6.00e+00

Pos logit contributions
aps+0.085
apping+0.074
uttering+0.072
 This+0.070
apped+0.065
Neg logit contributions
 forgot-0.110
 punished-0.099
 cleaned-0.090
 sighed-0.089
 realized-0.088
Token The
Token ID383
Feature activation-3.65e+00

Pos logit contributions
aps+0.127
apping+0.112
uttering+0.111
ings+0.095
uts+0.095
Neg logit contributions
 forgot-0.156
 cleaned-0.134
 punished-0.132
 sighed-0.128
 hoped-0.126
Token girl
Token ID2576
Feature activation-5.40e+00

Pos logit contributions
 forgot+0.081
 punished+0.071
 realized+0.061
 hoped+0.061
 sighed+0.061
Neg logit contributions
aps-0.085
apping-0.076
uttering-0.075
 This-0.073
 Today-0.069
Token was
Token ID373
Feature activation-1.81e+00

Pos logit contributions
 forgot+0.097
 punished+0.096
wy+0.088
 regret+0.087
eling+0.084
Neg logit contributions
aps-0.115
apping-0.110
apped-0.110
 This-0.105
uttering-0.095
Token at
Token ID379
Feature activation+1.93e-01

Pos logit contributions
aps+0.077
apping+0.067
uttering+0.065
 This+0.063
apped+0.059
Neg logit contributions
 forgot-0.101
 punished-0.091
 cleaned-0.083
 sighed-0.082
 regretted-0.081
Token his
Token ID465
Feature activation-5.85e+00

Pos logit contributions
aps+0.004
apping+0.004
uttering+0.004
 This+0.004
apped+0.003
Neg logit contributions
 forgot-0.005
 punished-0.004
 sighed-0.004
 cleaned-0.004
 regretted-0.004
Token salon
Token ID46608
Feature activation+3.62e+00

Pos logit contributions
 forgot+0.120
 punished+0.107
 cleaned+0.092
 regretted+0.092
 sighed+0.090
Neg logit contributions
aps-0.152
apping-0.138
uttering-0.136
 This-0.131
apped-0.127
Token and
Token ID290
Feature activation-5.62e+00

Pos logit contributions
aps+0.068
apping+0.059
uttering+0.056
 This+0.055
apped+0.050
Neg logit contributions
 forgot-0.099
 punished-0.090
 cleaned-0.083
 sighed-0.083
 regretted-0.082
Token running
Token ID2491
Feature activation+3.91e+00

Pos logit contributions
 forgot+0.114
 punished+0.101
 cleaned+0.090
 sighed+0.088
 regretted+0.085
Neg logit contributions
aps-0.139
apping-0.124
 This-0.121
uttering-0.118
apped-0.112
Token err
Token ID11454
Feature activation+1.28e+01

Pos logit contributions
aps+0.083
apping+0.072
uttering+0.070
 This+0.068
apped+0.063
Neg logit contributions
 forgot-0.101
 punished-0.091
 cleaned-0.082
 sighed-0.081
 regretted-0.080
Tokenands
Token ID1746
Feature activation+3.58e+00

Pos logit contributions
aps+0.284
apping+0.250
uttering+0.244
 This+0.235
apped+0.220
Neg logit contributions
 forgot-0.318
 punished-0.283
 cleaned-0.256
 sighed-0.253
 regretted-0.251
Token.
Token ID13
Feature activation+4.68e+00

Pos logit contributions
aps+0.068
apping+0.058
uttering+0.057
 This+0.054
apped+0.050
Neg logit contributions
 forgot-0.100
 punished-0.090
 cleaned-0.083
 sighed-0.082
 realized-0.081
Token But
Token ID887
Feature activation-7.96e-01

Pos logit contributions
aps+0.103
uttering+0.097
apping+0.094
apped+0.081
ectar+0.079
Neg logit contributions
 forgot-0.110
 hoped-0.092
 punished-0.091
 sighed-0.090
 learned-0.089
Token every
Token ID790
Feature activation-1.12e+00

Pos logit contributions
 forgot+0.018
 punished+0.017
 cleaned+0.015
 regretted+0.014
 sighed+0.014
Neg logit contributions
aps-0.018
apping-0.016
 This-0.016
uttering-0.016
apped-0.015
Token day
Token ID1110
Feature activation+1.68e+00

Pos logit contributions
 forgot+0.027
 punished+0.025
 cleaned+0.022
 sighed+0.022
 realized+0.022
Neg logit contributions
aps-0.025
apping-0.022
uttering-0.022
 This-0.021
apped-0.020
Token driver
Token ID4639
Feature activation-2.02e+00

Pos logit contributions
 forgot+0.105
 punished+0.088
 sighed+0.079
 realized+0.078
 regretted+0.077
Neg logit contributions
aps-0.131
 This-0.118
apping-0.118
uttering-0.117
ay-0.110
Token was
Token ID373
Feature activation-4.09e+00

Pos logit contributions
 forgot+0.034
 punished+0.030
 wounds+0.027
 nurses+0.026
 regret+0.025
Neg logit contributions
aps-0.049
 This-0.046
apping-0.046
apped-0.045
ay-0.042
Token done
Token ID1760
Feature activation+2.85e+00

Pos logit contributions
 forgot+0.088
 punished+0.077
 cleaned+0.068
 sighed+0.068
 regretted+0.068
Neg logit contributions
aps-0.103
apping-0.093
uttering-0.090
 This-0.087
apped-0.083
Token running
Token ID2491
Feature activation+4.60e+00

Pos logit contributions
aps+0.067
uttering+0.060
apping+0.058
ectar+0.057
apped+0.054
Neg logit contributions
 forgot-0.059
 punished-0.055
 cleaned-0.052
 realized-0.049
 sighed-0.048
Token his
Token ID465
Feature activation-5.85e+00

Pos logit contributions
aps+0.097
apping+0.085
uttering+0.084
 This+0.077
apped+0.074
Neg logit contributions
 forgot-0.119
 punished-0.107
 cleaned-0.098
 realized-0.096
 sighed-0.095
Token err
Token ID11454
Feature activation+1.27e+01

Pos logit contributions
 forgot+0.124
 punished+0.107
 cleaned+0.095
 sighed+0.095
 realized+0.094
Neg logit contributions
aps-0.151
apping-0.135
uttering-0.133
 This-0.130
apped-0.122
Tokenands
Token ID1746
Feature activation+7.84e-01

Pos logit contributions
aps+0.270
apping+0.237
uttering+0.231
 This+0.223
apped+0.208
Neg logit contributions
 forgot-0.324
 punished-0.291
 cleaned-0.264
 sighed-0.260
 regretted-0.258
Token,
Token ID11
Feature activation-1.90e+00

Pos logit contributions
aps+0.014
apping+0.012
uttering+0.012
 This+0.011
apped+0.010
Neg logit contributions
 forgot-0.022
 punished-0.020
 cleaned-0.019
 sighed-0.018
 regretted-0.018
Token he
Token ID339
Feature activation-7.32e+00

Pos logit contributions
 forgot+0.044
 punished+0.040
 cleaned+0.036
 regretted+0.034
 sighed+0.034
Neg logit contributions
aps-0.043
apping-0.037
 This-0.036
uttering-0.036
apped-0.034
Token drove
Token ID10357
Feature activation+4.19e+00

Pos logit contributions
 forgot+0.104
 nurses+0.104
 punished+0.099
 matter+0.097
 regret+0.095
Neg logit contributions
aps-0.178
apped-0.162
apping-0.159
 This-0.146
uts-0.141
Token back
Token ID736
Feature activation+1.10e-01

Pos logit contributions
aps+0.093
apping+0.082
uttering+0.080
 This+0.078
apped+0.073
Neg logit contributions
 forgot-0.104
 punished-0.093
 cleaned-0.084
 sighed-0.082
 regretted-0.082
Token her
Token ID607
Feature activation-6.60e+00

Pos logit contributions
 forgot+0.176
 punished+0.157
 cleaned+0.142
 sighed+0.139
 regretted+0.138
Neg logit contributions
aps-0.162
apping-0.143
uttering-0.139
 This-0.136
apped-0.127
Token mom
Token ID1995
Feature activation-3.65e+00

Pos logit contributions
 forgot+0.125
 punished+0.107
 cleaned+0.094
 sighed+0.092
 realized+0.091
Neg logit contributions
aps-0.185
apping-0.168
uttering-0.165
 This-0.160
apped-0.152
Token were
Token ID547
Feature activation-1.87e+00

Pos logit contributions
 forgot+0.058
 punished+0.051
 sighed+0.042
 regretted+0.041
 regret+0.040
Neg logit contributions
aps-0.102
apping-0.093
 This-0.092
apped-0.089
uttering-0.088
Token out
Token ID503
Feature activation+5.62e-02

Pos logit contributions
 forgot+0.051
 punished+0.047
 cleaned+0.043
 sighed+0.041
 realized+0.041
Neg logit contributions
aps-0.036
 This-0.030
apping-0.030
uttering-0.029
uts-0.029
Token running
Token ID2491
Feature activation+4.37e+00

Pos logit contributions
aps+0.001
uts+0.001
 This+0.001
apping+0.001
uttering+0.001
Neg logit contributions
 forgot-0.001
 punished-0.001
 cleaned-0.001
 nurses-0.001
 regretted-0.001
Token err
Token ID11454
Feature activation+1.27e+01

Pos logit contributions
aps+0.098
apping+0.085
uttering+0.083
unch+0.079
 This+0.079
Neg logit contributions
 forgot-0.106
 punished-0.094
 cleaned-0.087
 sighed-0.086
 realized-0.084
Tokenands
Token ID1746
Feature activation+3.89e+00

Pos logit contributions
aps+0.279
apping+0.247
uttering+0.242
 This+0.235
apped+0.221
Neg logit contributions
 forgot-0.308
 punished-0.281
 cleaned-0.254
 sighed-0.247
 regretted-0.243
Token.
Token ID13
Feature activation+4.13e+00

Pos logit contributions
aps+0.080
apping+0.067
uttering+0.066
 This+0.066
apped+0.059
Neg logit contributions
 forgot-0.101
 punished-0.091
 cleaned-0.083
 sighed-0.081
 realized-0.080
Token Mary
Token ID5335
Feature activation-7.19e+00

Pos logit contributions
aps+0.097
apping+0.087
uttering+0.085
uts+0.076
ectar+0.076
Neg logit contributions
 forgot-0.098
 hoped-0.080
 sighed-0.078
 punished-0.077
 cleaned-0.075
Token was
Token ID373
Feature activation-2.33e+00

Pos logit contributions
 forgot+0.118
 punished+0.117
 nurses+0.104
 matter+0.097
 regret+0.097
Neg logit contributions
aps-0.173
 This-0.161
apping-0.157
apped-0.147
uttering-0.144
Token three
Token ID1115
Feature activation-5.92e-01

Pos logit contributions
 forgot+0.053
 punished+0.046
 sighed+0.042
 cleaned+0.042
 regretted+0.042
Neg logit contributions
aps-0.055
apping-0.050
uttering-0.048
 This-0.046
apped-0.044
Token on
Token ID319
Feature activation+2.06e+00

Pos logit contributions
aps+0.096
apping+0.082
 This+0.080
uttering+0.080
apped+0.075
Neg logit contributions
 forgot-0.110
 punished-0.096
 regretted-0.086
 cleaned-0.085
 sighed-0.085
Token dry
Token ID5894
Feature activation+6.96e+00

Pos logit contributions
aps+0.052
apping+0.048
uttering+0.047
 This+0.044
apped+0.043
Neg logit contributions
 forgot-0.043
 punished-0.039
 cleaned-0.035
 sighed-0.034
 realized-0.033
Token and
Token ID290
Feature activation-4.58e+00

Pos logit contributions
aps+0.192
apping+0.176
uttering+0.173
 This+0.166
apped+0.159
Neg logit contributions
 forgot-0.131
 punished-0.114
 cleaned-0.102
 sighed-0.098
 realized-0.096
Token cozy
Token ID37438
Feature activation+5.05e+00

Pos logit contributions
 forgot+0.082
 punished+0.071
 regretted+0.062
 realized+0.060
 sighed+0.060
Neg logit contributions
aps-0.125
apping-0.110
uttering-0.108
 This-0.108
apped-0.102
Token p
Token ID279
Feature activation+3.81e+00

Pos logit contributions
aps+0.125
apping+0.114
uttering+0.111
 This+0.108
apped+0.101
Neg logit contributions
 forgot-0.110
 punished-0.096
 cleaned-0.088
 sighed-0.085
 regretted-0.084
Tokenaj
Token ID1228
Feature activation+1.25e+01

Pos logit contributions
aps+0.099
apping+0.089
uttering+0.087
 This+0.085
apped+0.080
Neg logit contributions
 forgot-0.080
 punished-0.070
 cleaned-0.062
 sighed-0.061
 regretted-0.060
Tokenamas
Token ID17485
Feature activation+6.84e+00

Pos logit contributions
aps+0.260
uttering+0.234
apping+0.226
 This+0.215
ges+0.202
Neg logit contributions
 forgot-0.313
 punished-0.269
 cleaned-0.259
 sighed-0.251
 felt-0.243
Token and
Token ID290
Feature activation-5.54e+00

Pos logit contributions
aps+0.134
apping+0.115
uttering+0.112
 This+0.106
apped+0.098
Neg logit contributions
 forgot-0.187
 punished-0.169
 cleaned-0.158
 sighed-0.154
 realized-0.152
Token sn
Token ID3013
Feature activation+7.79e+00

Pos logit contributions
 forgot+0.101
 punished+0.091
 regretted+0.079
 overs+0.078
 sighed+0.076
Neg logit contributions
aps-0.139
 This-0.121
apping-0.117
uttering-0.113
apped-0.112
Tokenugg
Token ID6837
Feature activation+6.26e+00

Pos logit contributions
aps+0.150
apping+0.132
uttering+0.129
 This+0.128
uts+0.121
Neg logit contributions
 forgot-0.199
 punished-0.189
 cleaned-0.180
 sighed-0.167
 regretted-0.165
Tokenled
Token ID992
Feature activation+3.35e+00

Pos logit contributions
aps+0.167
apping+0.150
uttering+0.147
 This+0.142
apped+0.136
Neg logit contributions
 forgot-0.127
 punished-0.111
 cleaned-0.097
 sighed-0.095
 realized-0.094
Token and
Token ID290
Feature activation-4.94e+00

Pos logit contributions
aps+0.094
uttering+0.084
apping+0.084
 This+0.081
apped+0.076
Neg logit contributions
 forgot-0.078
 punished-0.068
 cleaned-0.063
 realized-0.061
 sighed-0.061
Token red
Token ID2266
Feature activation+3.76e+00

Pos logit contributions
 forgot+0.111
 punished+0.103
 regretted+0.090
 cleaned+0.088
 nurses+0.087
Neg logit contributions
aps-0.112
 This-0.100
apping-0.098
uttering-0.097
apped-0.089
Token and
Token ID290
Feature activation-4.84e+00

Pos logit contributions
aps+0.100
uttering+0.092
apping+0.091
 This+0.082
utes+0.082
Neg logit contributions
 forgot-0.069
 punished-0.058
 cleaned-0.057
 realized-0.055
 regretted-0.054
Token covered
Token ID5017
Feature activation+7.61e+00

Pos logit contributions
 forgot+0.102
 punished+0.094
 regretted+0.081
 cleaned+0.079
 sighed+0.078
Neg logit contributions
aps-0.117
 This-0.105
apping-0.103
uttering-0.101
apped-0.095
Token in
Token ID287
Feature activation+1.44e+00

Pos logit contributions
aps+0.182
apping+0.160
uttering+0.159
 This+0.148
uts+0.144
Neg logit contributions
 forgot-0.167
 punished-0.150
 cleaned-0.140
 sighed-0.136
 realized-0.133
Token scrib
Token ID44661
Feature activation+1.25e+01

Pos logit contributions
aps+0.038
apping+0.035
uttering+0.034
 This+0.032
apped+0.032
Neg logit contributions
 forgot-0.029
 punished-0.026
 cleaned-0.023
 realized-0.022
 sighed-0.022
Tokenbles
Token ID7689
Feature activation+1.46e+00

Pos logit contributions
aps+0.264
 This+0.246
 These+0.227
apping+0.226
 Today+0.226
Neg logit contributions
 forgot-0.285
 punished-0.256
 cleaned-0.238
 realized-0.228
 sighed-0.224
Token.
Token ID13
Feature activation+6.09e+00

Pos logit contributions
aps+0.031
uttering+0.028
apping+0.027
apped+0.026
 This+0.025
Neg logit contributions
 forgot-0.035
 punished-0.031
 realized-0.029
 cleaned-0.028
 learned-0.027
Token\n
Token ID198
Feature activation+4.63e+00

Pos logit contributions
aps+0.127
uttering+0.126
apping+0.114
ings+0.101
apped+0.099
Neg logit contributions
 forgot-0.142
 punished-0.123
 realized-0.116
 sighed-0.116
 hoped-0.116
Token\n
Token ID198
Feature activation+4.48e+00

Pos logit contributions
aps+0.076
uttering+0.070
apping+0.068
 This+0.060
apped+0.059
Neg logit contributions
 forgot-0.131
 punished-0.119
 cleaned-0.112
 sighed-0.112
 realized-0.111
TokenEm
Token ID10161
Feature activation-4.34e+00

Pos logit contributions
aps+0.107
uttering+0.095
apping+0.095
 This+0.090
apped+0.088
Neg logit contributions
 forgot-0.100
 punished-0.090
 cleaned-0.080
 sighed-0.079
 realized-0.079
Token her
Token ID607
Feature activation-5.47e+00

Pos logit contributions
aps+0.040
apping+0.036
uttering+0.035
 This+0.034
apped+0.032
Neg logit contributions
 forgot-0.036
 punished-0.032
 cleaned-0.028
 sighed-0.028
 realized-0.027
Token cr
Token ID1067
Feature activation+7.93e+00

Pos logit contributions
 forgot+0.113
 punished+0.099
 cleaned+0.089
 sighed+0.086
 realized+0.085
Neg logit contributions
aps-0.143
apping-0.131
uttering-0.128
 This-0.122
apped-0.117
Tokenay
Token ID323
Feature activation+1.86e+00

Pos logit contributions
aps+0.070
uttering+0.059
apping+0.057
 This+0.053
 Yes+0.046
Neg logit contributions
 forgot-0.266
 punished-0.249
 cleaned-0.243
 sighed-0.230
 realized-0.229
Tokenons
Token ID684
Feature activation+2.35e+00

Pos logit contributions
aps+0.043
apping+0.040
uttering+0.039
 This+0.036
apped+0.035
Neg logit contributions
 forgot-0.042
 punished-0.037
 cleaned-0.035
 sighed-0.034
 realized-0.033
Token to
Token ID284
Feature activation-3.12e+00

Pos logit contributions
aps+0.043
apping+0.037
uttering+0.037
 This+0.035
apped+0.032
Neg logit contributions
 forgot-0.066
 punished-0.060
 cleaned-0.056
 sighed-0.054
 realized-0.054
Token scrib
Token ID44661
Feature activation+1.24e+01

Pos logit contributions
 forgot+0.057
 punished+0.048
 regretted+0.043
 sighed+0.042
 realized+0.040
Neg logit contributions
aps-0.090
apping-0.080
 This-0.079
uttering-0.078
apped-0.073
Tokenble
Token ID903
Feature activation+3.33e+00

Pos logit contributions
aps+0.259
 This+0.258
 These+0.239
 Today+0.237
uttering+0.232
Neg logit contributions
 forgot-0.263
 punished-0.249
 cleaned-0.226
 realized-0.208
 hoped-0.205
Token on
Token ID319
Feature activation+2.29e+00

Pos logit contributions
aps+0.069
uttering+0.063
apping+0.063
apped+0.055
unch+0.054
Neg logit contributions
 forgot-0.081
 punished-0.076
 cleaned-0.069
 realized-0.067
 wouldn-0.065
Token the
Token ID262
Feature activation-3.53e+00

Pos logit contributions
aps+0.057
apping+0.052
uttering+0.051
 This+0.046
apped+0.046
Neg logit contributions
 forgot-0.050
 punished-0.045
 cleaned-0.040
 realized-0.039
 sighed-0.039
Token wall
Token ID3355
Feature activation+6.07e+00

Pos logit contributions
 forgot+0.069
 punished+0.059
 regretted+0.050
 cleaned+0.050
 realized+0.049
Neg logit contributions
aps-0.096
 This-0.086
apping-0.085
uttering-0.084
apped-0.078
Token.
Token ID13
Feature activation+6.27e+00

Pos logit contributions
aps+0.118
uttering+0.111
apping+0.104
 This+0.094
apped+0.094
Neg logit contributions
 forgot-0.157
 punished-0.142
 cleaned-0.134
 sighed-0.129
 realized-0.126
Token and
Token ID290
Feature activation-6.07e+00

Pos logit contributions
aps+0.029
apping+0.023
 This+0.022
uttering+0.021
apped+0.017
Neg logit contributions
 forgot-0.074
 punished-0.068
 regretted-0.063
 sighed-0.062
 cleaned-0.062
Token even
Token ID772
Feature activation-5.27e+00

Pos logit contributions
 forgot+0.115
 punished+0.101
 cleaned+0.088
 regretted+0.086
 sighed+0.085
Neg logit contributions
aps-0.161
apping-0.143
 This-0.139
uttering-0.138
apped-0.132
Token tried
Token ID3088
Feature activation+5.94e+00

Pos logit contributions
 forgot+0.070
 punished+0.062
 nurses+0.052
 overs+0.050
 cleaned+0.049
Neg logit contributions
aps-0.155
apping-0.141
 This-0.135
uttering-0.135
apped-0.132
Token to
Token ID284
Feature activation-2.86e+00

Pos logit contributions
aps+0.115
apping+0.103
uttering+0.101
 This+0.093
apped+0.093
Neg logit contributions
 forgot-0.161
 punished-0.146
 cleaned-0.134
 realized-0.132
 sighed-0.131
Token take
Token ID1011
Feature activation+3.11e+00

Pos logit contributions
 forgot+0.056
 punished+0.047
 regretted+0.044
 sighed+0.044
 cleaned+0.042
Neg logit contributions
aps-0.075
apping-0.066
 This-0.065
uttering-0.064
ings-0.060
Token n
Token ID299
Feature activation+1.23e+01

Pos logit contributions
aps+0.068
apping+0.060
uttering+0.058
 This+0.057
apped+0.053
Neg logit contributions
 forgot-0.077
 punished-0.069
 cleaned-0.062
 sighed-0.062
 regretted-0.061
Tokenaps
Token ID1686
Feature activation+9.42e+00

Pos logit contributions
 This+0.063
 These+0.057
 Today+0.037
 Nice+0.034
 Yes+0.030
Neg logit contributions
 forgot-0.424
 punished-0.421
 cleaned-0.394
 comfort-0.384
 regret-0.381
Token on
Token ID319
Feature activation+2.83e+00

Pos logit contributions
aps+0.194
uttering+0.171
apping+0.170
 This+0.158
uts+0.150
Neg logit contributions
 forgot-0.240
 punished-0.215
 cleaned-0.203
 sighed-0.200
 realized-0.198
Token the
Token ID262
Feature activation-2.68e+00

Pos logit contributions
aps+0.072
apping+0.068
uttering+0.068
apped+0.062
 This+0.059
Neg logit contributions
 forgot-0.057
 punished-0.052
 cleaned-0.046
 sighed-0.044
 realized-0.044
Token rabbit
Token ID22746
Feature activation+9.96e-01

Pos logit contributions
 forgot+0.055
 punished+0.047
 cleaned+0.041
 sighed+0.040
 realized+0.040
Neg logit contributions
aps-0.070
apping-0.062
uttering-0.061
 This-0.060
apped-0.055
Token's
Token ID338
Feature activation-2.93e+00

Pos logit contributions
aps+0.022
apping+0.020
uttering+0.020
 This+0.018
apped+0.018
Neg logit contributions
 forgot-0.024
 punished-0.022
 cleaned-0.020
 realized-0.019
 sighed-0.019
Token needed
Token ID2622
Feature activation+1.85e+00

Pos logit contributions
 forgot+0.139
 punished+0.129
 nurses+0.112
 regretted+0.110
 hospital+0.108
Neg logit contributions
aps-0.174
apping-0.163
 This-0.160
apped-0.158
uttering-0.155
Token to
Token ID284
Feature activation-2.97e+00

Pos logit contributions
aps+0.034
apping+0.029
uttering+0.028
 This+0.025
apped+0.025
Neg logit contributions
 forgot-0.053
 punished-0.049
 cleaned-0.046
 sighed-0.045
 realized-0.044
Token go
Token ID467
Feature activation+2.75e+00

Pos logit contributions
 forgot+0.060
 punished+0.051
 regretted+0.047
 sighed+0.047
 realized+0.045
Neg logit contributions
aps-0.076
apping-0.068
 This-0.068
uttering-0.066
 Today-0.061
Token do
Token ID466
Feature activation+2.55e+00

Pos logit contributions
aps+0.051
apping+0.044
uttering+0.042
 This+0.040
apped+0.038
Neg logit contributions
 forgot-0.078
 punished-0.071
 cleaned-0.066
 sighed-0.064
 realized-0.064
Token some
Token ID617
Feature activation-1.39e+00

Pos logit contributions
aps+0.054
apping+0.047
uttering+0.046
 This+0.045
apped+0.041
Neg logit contributions
 forgot-0.065
 punished-0.058
 cleaned-0.053
 sighed-0.052
 regretted-0.052
Token err
Token ID11454
Feature activation+1.23e+01

Pos logit contributions
 forgot+0.027
 punished+0.024
 cleaned+0.021
 sighed+0.020
 regretted+0.020
Neg logit contributions
aps-0.038
apping-0.035
uttering-0.034
 This-0.033
apped-0.032
Tokenands
Token ID1746
Feature activation+3.20e+00

Pos logit contributions
aps+0.266
apping+0.234
uttering+0.231
 This+0.222
apped+0.213
Neg logit contributions
 forgot-0.301
 punished-0.278
 cleaned-0.253
 sighed-0.241
 nurses-0.238
Token.
Token ID13
Feature activation+5.16e+00

Pos logit contributions
aps+0.062
apping+0.053
uttering+0.052
 This+0.050
apped+0.047
Neg logit contributions
 forgot-0.087
 punished-0.079
 cleaned-0.073
 sighed-0.071
 realized-0.071
Token\n
Token ID198
Feature activation+3.82e+00

Pos logit contributions
aps+0.116
uttering+0.106
apping+0.100
ectar+0.091
uts+0.091
Neg logit contributions
 forgot-0.126
 punished-0.111
 cleaned-0.107
 sighed-0.104
 realized-0.101
Token\n
Token ID198
Feature activation+3.98e+00

Pos logit contributions
aps+0.086
uttering+0.080
ectar+0.074
apping+0.072
apped+0.069
Neg logit contributions
 forgot-0.089
 punished-0.074
 sighed-0.072
 cleaned-0.071
 realized-0.068
TokenMom
Token ID29252
Feature activation-3.41e+00

Pos logit contributions
aps+0.095
uttering+0.083
apping+0.081
 This+0.077
apped+0.076
Neg logit contributions
 forgot-0.094
 punished-0.082
 cleaned-0.075
 sighed-0.074
 realized-0.073
Token have
Token ID423
Feature activation-2.24e+00

Pos logit contributions
 forgot+0.182
 punished+0.162
 regretted+0.157
 nurses+0.155
 sighed+0.152
Neg logit contributions
aps-0.150
 This-0.134
 Today-0.118
apping-0.113
 These-0.113
Token many
Token ID867
Feature activation-4.26e-01

Pos logit contributions
 forgot+0.058
 punished+0.051
 sighed+0.047
 regretted+0.046
 realized+0.046
Neg logit contributions
aps-0.043
 This-0.037
apping-0.035
uttering-0.034
uts-0.032
Token bl
Token ID698
Feature activation+5.32e+00

Pos logit contributions
 forgot+0.010
 punished+0.009
 cleaned+0.008
 sighed+0.008
 regretted+0.008
Neg logit contributions
aps-0.010
apping-0.008
uttering-0.008
 This-0.008
uts-0.007
Tokenanks
Token ID2283
Feature activation+7.28e+00

Pos logit contributions
aps+0.106
apping+0.094
uttering+0.094
 This+0.090
apped+0.082
Neg logit contributions
 forgot-0.139
 punished-0.124
 cleaned-0.113
 realized-0.110
 hoped-0.109
Token and
Token ID290
Feature activation-4.17e+00

Pos logit contributions
aps+0.145
uttering+0.126
apping+0.125
 This+0.115
unch+0.107
Neg logit contributions
 forgot-0.195
 punished-0.177
 cleaned-0.168
 sighed-0.162
 realized-0.161
Token scrib
Token ID44661
Feature activation+1.23e+01

Pos logit contributions
 forgot+0.085
 punished+0.074
 cleaned+0.065
 sighed+0.064
 regretted+0.063
Neg logit contributions
aps-0.110
apping-0.099
uttering-0.096
 This-0.094
apped-0.089
Tokenbles
Token ID7689
Feature activation+8.29e-01

Pos logit contributions
 This+0.264
aps+0.256
 Today+0.250
 These+0.243
uttering+0.241
Neg logit contributions
 forgot-0.249
 punished-0.233
 cleaned-0.207
 hoped-0.192
 realized-0.191
Token.
Token ID13
Feature activation+6.62e+00

Pos logit contributions
aps+0.017
uttering+0.015
apping+0.015
apped+0.013
 This+0.013
Neg logit contributions
 forgot-0.021
 punished-0.019
 cleaned-0.018
 realized-0.017
 learned-0.017
Token They
Token ID1119
Feature activation-7.63e+00

Pos logit contributions
uttering+0.154
aps+0.134
apping+0.134
iked+0.122
unch+0.121
Neg logit contributions
 forgot-0.131
 hoped-0.114
 learned-0.107
 realized-0.106
 cleaned-0.105
Token feel
Token ID1254
Feature activation-2.42e-01

Pos logit contributions
 forgot+0.163
 punished+0.148
 regretted+0.144
 nurses+0.139
 sighed+0.137
Neg logit contributions
aps-0.152
 This-0.130
 Today-0.115
apping-0.110
 These-0.109
Token bad
Token ID2089
Feature activation-5.74e-01

Pos logit contributions
 forgot+0.005
 punished+0.004
 sighed+0.004
 cleaned+0.004
 regretted+0.004
Neg logit contributions
aps-0.006
apping-0.005
 This-0.005
uttering-0.005
apped-0.005
Tokenbee
Token ID20963
Feature activation+1.43e+00

Pos logit contributions
 forgot+0.035
 punished+0.031
 sighed+0.028
 cleaned+0.028
 realized+0.028
Neg logit contributions
aps-0.029
apping-0.025
uttering-0.025
 This-0.024
apped-0.022
Token flew
Token ID13112
Feature activation+6.59e+00

Pos logit contributions
aps+0.042
uttering+0.038
apping+0.038
 This+0.036
apped+0.033
Neg logit contributions
 forgot-0.026
 punished-0.021
 realized-0.019
 sighed-0.019
 regretted-0.018
Token around
Token ID1088
Feature activation+3.94e+00

Pos logit contributions
aps+0.179
apping+0.161
uttering+0.156
 This+0.153
uts+0.149
Neg logit contributions
 forgot-0.126
 punished-0.106
 realized-0.097
 sighed-0.096
 cleaned-0.095
Token and
Token ID290
Feature activation-4.36e+00

Pos logit contributions
aps+0.107
apping+0.098
uttering+0.096
uts+0.090
 This+0.089
Neg logit contributions
 forgot-0.073
 punished-0.061
 sighed-0.058
 realized-0.058
 cleaned-0.057
Token gathered
Token ID9272
Feature activation+5.63e+00

Pos logit contributions
 forgot+0.081
 punished+0.075
 cleaned+0.065
 nurses+0.062
 regretted+0.062
Neg logit contributions
aps-0.113
apping-0.103
 This-0.102
uttering-0.100
apped-0.095
Token n
Token ID299
Feature activation+1.23e+01

Pos logit contributions
aps+0.130
apping+0.115
uttering+0.113
 This+0.109
apped+0.102
Neg logit contributions
 forgot-0.133
 punished-0.119
 cleaned-0.107
 sighed-0.106
 realized-0.104
Tokenectar
Token ID44504
Feature activation+8.45e+00

Pos logit contributions
 This+0.154
 These+0.136
oga+0.118
 Today+0.116
ay+0.113
Neg logit contributions
 forgot-0.353
 punished-0.312
 realized-0.312
 hoped-0.312
 cleaned-0.311
Token to
Token ID284
Feature activation-2.54e+00

Pos logit contributions
aps+0.171
apping+0.151
uttering+0.149
 This+0.144
 Today+0.133
Neg logit contributions
 forgot-0.213
 punished-0.186
 cleaned-0.179
 realized-0.174
 sighed-0.174
Token make
Token ID787
Feature activation+3.24e+00

Pos logit contributions
 forgot+0.051
 punished+0.046
 regretted+0.041
 sighed+0.040
 cleaned+0.039
Neg logit contributions
aps-0.065
 This-0.057
apping-0.056
uttering-0.056
ectar-0.051
Token honey
Token ID12498
Feature activation+5.57e+00

Pos logit contributions
aps+0.076
apping+0.068
uttering+0.066
 This+0.064
apped+0.060
Neg logit contributions
 forgot-0.075
 punished-0.067
 cleaned-0.060
 sighed-0.059
 realized-0.059
Token.
Token ID13
Feature activation+5.88e+00

Pos logit contributions
aps+0.106
apping+0.093
uttering+0.093
 This+0.089
apped+0.083
Neg logit contributions
 forgot-0.151
 punished-0.135
 cleaned-0.125
 realized-0.122
 sighed-0.120
Token AND
Token ID5357
Feature activation-1.22e+00

Pos logit contributions
uttering+0.109
aps+0.102
apping+0.098
overed+0.085
ings+0.084
Neg logit contributions
 forgot-0.112
 punished-0.100
 cleaned-0.092
 nurses-0.092
 learned-0.090
Token YOU
Token ID7013
Feature activation-1.87e+00

Pos logit contributions
 forgot+0.030
 punished+0.026
 cleaned+0.023
 realized+0.023
 sighed+0.023
Neg logit contributions
aps-0.028
apping-0.025
uttering-0.025
 This-0.022
apped-0.022
Token WILL
Token ID17682
Feature activation+4.04e+00

Pos logit contributions
 forgot+0.039
 punished+0.034
 cleaned+0.030
 sighed+0.030
 regretted+0.030
Neg logit contributions
aps-0.048
apping-0.043
uttering-0.042
 This-0.041
apped-0.039
Token GET
Token ID17151
Feature activation+5.17e+00

Pos logit contributions
aps+0.094
apping+0.088
uttering+0.085
apped+0.074
ectar+0.073
Neg logit contributions
 forgot-0.092
 punished-0.089
 cleaned-0.077
 realized-0.074
 nurses-0.074
Token IN
Token ID3268
Feature activation+6.29e+00

Pos logit contributions
aps+0.115
apping+0.107
uttering+0.101
apped+0.090
 Today+0.088
Neg logit contributions
 forgot-0.115
 punished-0.109
 realized-0.098
 cleaned-0.095
 sighed-0.092
Token TR
Token ID7579
Feature activation+1.23e+01

Pos logit contributions
aps+0.131
apping+0.123
uttering+0.119
apped+0.114
ectar+0.106
Neg logit contributions
 punished-0.136
 forgot-0.136
 cleaned-0.120
 realized-0.116
 learned-0.115
TokenOU
Token ID2606
Feature activation+2.97e+00

Pos logit contributions
aps+0.245
uttering+0.231
apping+0.217
apped+0.207
ectar+0.205
Neg logit contributions
 forgot-0.245
 punished-0.238
 cleaned-0.217
 overs-0.209
ty-0.207
TokenBLE
Token ID19146
Feature activation+3.70e+00

Pos logit contributions
aps+0.072
apping+0.067
uttering+0.064
ectar+0.060
apped+0.058
Neg logit contributions
 forgot-0.064
 punished-0.060
 cleaned-0.051
 realized-0.050
 overs-0.049
Token!"
Token ID2474
Feature activation+5.90e+00

Pos logit contributions
aps+0.077
apping+0.067
uttering+0.066
ectar+0.061
apped+0.060
Neg logit contributions
 forgot-0.093
 punished-0.085
 realized-0.075
 cleaned-0.073
 regretted-0.073
Token\n
Token ID198
Feature activation+5.74e+00

Pos logit contributions
uttering+0.116
aps+0.110
apping+0.109
apped+0.095
itting+0.090
Neg logit contributions
 forgot-0.146
 realized-0.125
 punished-0.122
 sighed-0.119
 hoped-0.119
Token\n
Token ID198
Feature activation+5.11e+00

Pos logit contributions
aps+0.102
uttering+0.101
apping+0.098
apped+0.088
 This+0.083
Neg logit contributions
 forgot-0.154
 punished-0.133
 realized-0.130
 cleaned-0.128
 learned-0.124
Token toy
Token ID13373
Feature activation+4.18e-01

Pos logit contributions
 forgot+0.091
 punished+0.080
 cleaned+0.071
 sighed+0.070
 regretted+0.070
Neg logit contributions
aps-0.096
apping-0.085
uttering-0.084
 This-0.081
apped-0.076
Token until
Token ID1566
Feature activation-1.92e+00

Pos logit contributions
aps+0.011
apping+0.009
uttering+0.009
 This+0.009
apped+0.009
Neg logit contributions
 forgot-0.009
 punished-0.008
 cleaned-0.007
 sighed-0.007
 regretted-0.007
Token it
Token ID340
Feature activation-4.96e+00

Pos logit contributions
 forgot+0.044
 punished+0.040
 cleaned+0.035
 regretted+0.035
 sighed+0.035
Neg logit contributions
aps-0.044
apping-0.039
 This-0.037
uttering-0.037
apped-0.036
Token was
Token ID373
Feature activation-2.23e+00

Pos logit contributions
 forgot+0.090
 punished+0.085
 nurses+0.081
 sighed+0.071
 regret+0.071
Neg logit contributions
aps-0.122
apping-0.111
apped-0.110
uttering-0.108
 This-0.107
Token all
Token ID477
Feature activation-3.83e+00

Pos logit contributions
 forgot+0.048
 punished+0.042
 cleaned+0.037
 sighed+0.037
 regretted+0.036
Neg logit contributions
aps-0.056
apping-0.050
uttering-0.050
 This-0.048
apped-0.045
Token sl
Token ID1017
Feature activation+1.23e+01

Pos logit contributions
 forgot+0.069
 punished+0.057
 realized+0.051
 sighed+0.050
 regretted+0.049
Neg logit contributions
aps-0.109
apping-0.099
uttering-0.096
 This-0.095
apped-0.091
Tokenob
Token ID672
Feature activation+1.49e+00

Pos logit contributions
 This+0.178
 Of+0.173
aps+0.172
 Yes+0.172
 Hello+0.163
Neg logit contributions
 forgot-0.294
 punished-0.286
 cleaned-0.270
wy-0.244
ty-0.241
Tokenbery
Token ID13001
Feature activation+1.92e+00

Pos logit contributions
aps+0.039
apping+0.035
uttering+0.035
 This+0.033
apped+0.032
Neg logit contributions
 forgot-0.031
 punished-0.027
 cleaned-0.024
 sighed-0.023
 regretted-0.023
Token.
Token ID13
Feature activation+4.84e+00

Pos logit contributions
aps+0.044
apping+0.039
uttering+0.038
 This+0.037
apped+0.035
Neg logit contributions
 forgot-0.046
 punished-0.041
 cleaned-0.037
 sighed-0.036
 regretted-0.036
Token␣
Token ID220
Feature activation+5.59e+00

Pos logit contributions
aps+0.116
uttering+0.114
apping+0.108
ectar+0.096
ges+0.091
Neg logit contributions
 forgot-0.109
 punished-0.092
 sighed-0.089
 cleaned-0.087
 realized-0.087
Token\n
Token ID198
Feature activation+3.96e+00

Pos logit contributions
uttering+0.110
aps+0.106
apping+0.100
ectar+0.089
oga+0.088
Neg logit contributions
 forgot-0.130
 punished-0.112
 cleaned-0.107
 hoped-0.106
 learned-0.105
Token and
Token ID290
Feature activation-4.62e+00

Pos logit contributions
aps+0.094
apping+0.084
uttering+0.082
 This+0.078
apped+0.073
Neg logit contributions
 forgot-0.114
 punished-0.105
 cleaned-0.095
 sighed-0.093
 realized-0.092
Token put
Token ID1234
Feature activation+4.04e+00

Pos logit contributions
 forgot+0.078
 punished+0.066
 regretted+0.060
 sighed+0.058
 realized+0.058
Neg logit contributions
aps-0.130
apping-0.114
 This-0.112
uttering-0.111
apped-0.103
Token on
Token ID319
Feature activation+1.65e+00

Pos logit contributions
aps+0.090
apping+0.079
uttering+0.077
 This+0.076
apped+0.071
Neg logit contributions
 forgot-0.099
 punished-0.088
 cleaned-0.079
 sighed-0.078
 regretted-0.078
Token your
Token ID534
Feature activation-4.80e+00

Pos logit contributions
aps+0.038
apping+0.033
uttering+0.033
 This+0.031
apped+0.030
Neg logit contributions
 forgot-0.040
 punished-0.035
 cleaned-0.032
 sighed-0.031
 regretted-0.031
Token p
Token ID279
Feature activation+2.41e+00

Pos logit contributions
 forgot+0.103
 punished+0.088
 realized+0.077
 regretted+0.077
 cleaned+0.076
Neg logit contributions
aps-0.122
 This-0.105
apping-0.105
uttering-0.103
apped-0.095
Tokenaj
Token ID1228
Feature activation+1.22e+01

Pos logit contributions
aps+0.055
apping+0.049
uttering+0.048
 This+0.047
apped+0.044
Neg logit contributions
 forgot-0.057
 punished-0.051
 cleaned-0.046
 sighed-0.045
 regretted-0.045
Tokenamas
Token ID17485
Feature activation+7.39e+00

Pos logit contributions
aps+0.256
uttering+0.242
apping+0.230
 This+0.214
ges+0.214
Neg logit contributions
 forgot-0.292
 punished-0.256
 cleaned-0.240
 sighed-0.232
 realized-0.224
Token.
Token ID13
Feature activation+6.31e+00

Pos logit contributions
aps+0.153
uttering+0.132
apping+0.131
 This+0.121
apped+0.115
Neg logit contributions
 forgot-0.194
 punished-0.176
 cleaned-0.163
 realized-0.156
 sighed-0.156
Token Then
Token ID3244
Feature activation-3.28e-02

Pos logit contributions
uttering+0.119
aps+0.114
apping+0.104
ings+0.094
apped+0.091
Neg logit contributions
 forgot-0.165
 punished-0.151
 cleaned-0.142
 nurses-0.140
 hoped-0.138
Token we
Token ID356
Feature activation-5.95e+00

Pos logit contributions
 forgot+0.001
 punished+0.001
 cleaned+0.001
 sighed+0.001
 regretted+0.001
Neg logit contributions
aps-0.001
apping-0.001
 This-0.001
uttering-0.001
apped-0.000
Token can
Token ID460
Feature activation+1.59e+00

Pos logit contributions
 forgot+0.122
 punished+0.106
 sighed+0.098
 regretted+0.097
 realized+0.093
Neg logit contributions
aps-0.143
apping-0.123
 This-0.118
apped-0.116
 Today-0.116
Token Buzz
Token ID15753
Feature activation-4.17e+00

Pos logit contributions
aps+0.012
apping+0.010
uttering+0.010
 This+0.010
apped+0.009
Neg logit contributions
 forgot-0.012
 punished-0.010
 cleaned-0.009
 sighed-0.009
 realized-0.009
Token went
Token ID1816
Feature activation+3.40e+00

Pos logit contributions
 forgot+0.080
 punished+0.076
 sighed+0.066
 nurses+0.065
 cleaned+0.065
Neg logit contributions
aps-0.106
apping-0.095
uttering-0.092
 This-0.088
apped-0.086
Token out
Token ID503
Feature activation+2.80e+00

Pos logit contributions
aps+0.076
apping+0.072
uttering+0.068
uts+0.066
apped+0.064
Neg logit contributions
 forgot-0.072
 punished-0.063
 cleaned-0.062
 realized-0.060
 learned-0.058
Token to
Token ID284
Feature activation-1.31e+00

Pos logit contributions
aps+0.055
apping+0.051
uttering+0.049
uts+0.048
 This+0.048
Neg logit contributions
 forgot-0.068
 punished-0.062
 cleaned-0.059
 sighed-0.056
 realized-0.056
Token collect
Token ID2824
Feature activation+4.66e+00

Pos logit contributions
 forgot+0.029
 punished+0.026
 cleaned+0.023
 sighed+0.022
 regretted+0.022
Neg logit contributions
aps-0.032
apping-0.029
uttering-0.028
 This-0.028
apped-0.026
Token n
Token ID299
Feature activation+1.22e+01

Pos logit contributions
aps+0.108
apping+0.092
uttering+0.091
 This+0.090
apped+0.083
Neg logit contributions
 forgot-0.113
 punished-0.099
 cleaned-0.088
 sighed-0.088
 regretted-0.086
Tokenectar
Token ID44504
Feature activation+8.71e+00

Pos logit contributions
 This+0.081
 These+0.073
oga+0.049
 Yes+0.045
uts+0.042
Neg logit contributions
 forgot-0.396
 cleaned-0.375
 punished-0.373
 realized-0.355
 hoped-0.354
Token from
Token ID422
Feature activation+1.54e-01

Pos logit contributions
aps+0.183
apping+0.166
uttering+0.161
 This+0.155
 These+0.149
Neg logit contributions
 forgot-0.204
 punished-0.180
 cleaned-0.175
 realized-0.169
ty-0.166
Token flowers
Token ID12734
Feature activation+3.91e+00

Pos logit contributions
aps+0.004
apping+0.003
uttering+0.003
 This+0.003
apped+0.003
Neg logit contributions
 forgot-0.003
 punished-0.003
 cleaned-0.003
 sighed-0.003
 realized-0.003
Token.
Token ID13
Feature activation+5.11e+00

Pos logit contributions
aps+0.081
apping+0.072
uttering+0.070
 This+0.067
apped+0.065
Neg logit contributions
 forgot-0.096
 punished-0.085
 cleaned-0.081
 realized-0.079
 learned-0.077
Token␣
Token ID220
Feature activation+6.28e+00

Pos logit contributions
aps+0.112
uttering+0.107
apping+0.104
apped+0.089
ay+0.088
Neg logit contributions
 forgot-0.122
 punished-0.102
 realized-0.097
 hoped-0.097
 sighed-0.097
Token sign
Token ID1051
Feature activation+1.61e+00

Pos logit contributions
 forgot+0.031
 punished+0.027
 cleaned+0.025
 sighed+0.024
 regretted+0.024
Neg logit contributions
aps-0.033
apping-0.030
uttering-0.029
 This-0.028
apped-0.027
Token that
Token ID326
Feature activation-2.19e+00

Pos logit contributions
aps+0.039
uttering+0.036
apping+0.035
ectar+0.033
 This+0.032
Neg logit contributions
 forgot-0.034
 punished-0.031
 cleaned-0.028
 sighed-0.028
 realized-0.028
Token said
Token ID531
Feature activation+2.53e+00

Pos logit contributions
 forgot+0.034
 punished+0.029
 regretted+0.023
 cleaned+0.023
 sighed+0.023
Neg logit contributions
aps-0.067
apping-0.061
 This-0.059
uttering-0.059
apped-0.057
Token "
Token ID366
Feature activation+4.52e+00

Pos logit contributions
aps+0.050
uttering+0.044
apping+0.042
apped+0.036
imb+0.036
Neg logit contributions
 forgot-0.067
 punished-0.058
 realized-0.057
 cleaned-0.057
 sighed-0.055
TokenRE
Token ID2200
Feature activation+4.98e+00

Pos logit contributions
aps+0.111
apping+0.099
uttering+0.093
 This+0.087
apped+0.087
Neg logit contributions
 forgot-0.097
 punished-0.088
 cleaned-0.078
 sighed-0.077
 drowned-0.074
TokenVERS
Token ID28884
Feature activation+1.22e+01

Pos logit contributions
aps+0.119
apping+0.106
uttering+0.104
 This+0.099
apped+0.095
Neg logit contributions
 forgot-0.114
 punished-0.101
 cleaned-0.090
 realized-0.088
 regretted-0.087
TokenE
Token ID36
Feature activation+2.04e+00

Pos logit contributions
aps+0.283
apping+0.250
uttering+0.243
 This+0.238
apped+0.224
Neg logit contributions
 forgot-0.282
 punished-0.258
 cleaned-0.228
 sighed-0.222
 realized-0.221
Token".
Token ID1911
Feature activation+6.19e+00

Pos logit contributions
aps+0.041
apping+0.033
uttering+0.033
 This+0.032
 Nice+0.031
Neg logit contributions
 forgot-0.050
 punished-0.046
 cleaned-0.041
ty-0.041
 realized-0.040
Token L
Token ID406
Feature activation-2.28e-01

Pos logit contributions
uttering+0.128
aps+0.124
apping+0.118
unch+0.103
ectar+0.103
Neg logit contributions
 forgot-0.144
 punished-0.126
 cleaned-0.120
 realized-0.120
 sighed-0.120
Tokenila
Token ID10102
Feature activation-7.50e+00

Pos logit contributions
 forgot+0.004
 punished+0.003
 cleaned+0.003
 sighed+0.003
 realized+0.003
Neg logit contributions
aps-0.006
 This-0.006
uttering-0.006
apping-0.005
 Today-0.005
Token and
Token ID290
Feature activation-8.50e+00

Pos logit contributions
 forgot+0.191
 punished+0.182
 nurses+0.171
 regret+0.161
 cleaned+0.159
Neg logit contributions
aps-0.147
apping-0.128
 This-0.118
uts-0.117
apped-0.113
Token
Token ID26391
Feature activation+4.65e+00

Pos logit contributions
aps+0.101
uttering+0.092
apping+0.087
apped+0.081
ectar+0.078
Neg logit contributions
 forgot-0.118
 punished-0.108
 cleaned-0.100
 sighed-0.097
 learned-0.097
Token˜
Token ID41185
Feature activation+3.94e+00

Pos logit contributions
uttering+0.088
aps+0.087
apping+0.077
apped+0.073
ectar+0.070
Neg logit contributions
 forgot-0.124
 punished-0.096
 learned-0.094
 realized-0.094
 hoped-0.092
TokenS
Token ID50
Feature activation-1.25e-02

Pos logit contributions
aps+0.079
uttering+0.074
apping+0.069
 This+0.063
unch+0.062
Neg logit contributions
 forgot-0.100
 punished-0.088
 sighed-0.082
 realized-0.081
 cleaned-0.080
Tokenail
Token ID603
Feature activation-2.53e+00

Pos logit contributions
 forgot+0.000
 punished+0.000
 sighed+0.000
 regretted+0.000
 realized+0.000
Neg logit contributions
aps-0.000
uttering-0.000
apping-0.000
uts-0.000
 This-0.000
Tokenors
Token ID669
Feature activation-3.80e-02

Pos logit contributions
 forgot+0.059
 punished+0.052
 cleaned+0.047
 sighed+0.046
 realized+0.046
Neg logit contributions
aps-0.059
apping-0.053
uttering-0.052
 This-0.049
apped-0.047
Token Wanted
Token ID41536
Feature activation+1.21e+01

Pos logit contributions
 forgot+0.001
 punished+0.001
 cleaned+0.001
 sighed+0.001
 regretted+0.001
Neg logit contributions
aps-0.001
apping-0.001
uttering-0.001
 This-0.001
apped-0.001
Tokenâ
Token ID22940
Feature activation+2.24e+00

Pos logit contributions
aps+0.233
uttering+0.227
apping+0.201
unch+0.192
 This+0.191
Neg logit contributions
 forgot-0.284
 punished-0.257
 cleaned-0.232
 sighed-0.232
 realized-0.232
Token
Token ID26391
Feature activation+4.81e+00

Pos logit contributions
aps+0.066
uttering+0.060
apping+0.060
uts+0.055
 This+0.055
Neg logit contributions
 forgot-0.039
 punished-0.033
 cleaned-0.029
 sighed-0.028
 realized-0.027
Token
Token ID8151
Feature activation+6.15e+00

Pos logit contributions
aps+0.095
uttering+0.093
apping+0.084
ires+0.079
apped+0.078
Neg logit contributions
 forgot-0.125
 punished-0.096
 realized-0.091
 learned-0.091
 hoped-0.091
Token.
Token ID13
Feature activation+6.32e+00

Pos logit contributions
aps+0.124
uttering+0.120
apping+0.104
ectar+0.097
 This+0.094
Neg logit contributions
 forgot-0.149
 punished-0.123
 realized-0.115
 wouldn-0.115
 sighed-0.114
Token Do
Token ID2141
Feature activation+2.38e+00

Pos logit contributions
aps+0.123
uttering+0.122
apping+0.108
ings+0.098
apped+0.097
Neg logit contributions
 forgot-0.154
 punished-0.136
 cleaned-0.126
 hoped-0.125
 sighed-0.125
Token.
Token ID13
Feature activation+5.64e+00

Pos logit contributions
aps+0.089
apping+0.079
uttering+0.077
 This+0.071
apped+0.068
Neg logit contributions
 forgot-0.124
 punished-0.113
 cleaned-0.104
 sighed-0.102
 realized-0.101
Token Amy
Token ID14235
Feature activation-8.03e+00

Pos logit contributions
aps+0.127
uttering+0.122
apping+0.115
ectar+0.099
apped+0.099
Neg logit contributions
 forgot-0.135
 punished-0.113
 sighed-0.111
 realized-0.107
 learned-0.107
Token then
Token ID788
Feature activation-2.31e+00

Pos logit contributions
 forgot+0.135
 nurses+0.128
 punished+0.123
 rubble+0.115
 colder+0.111
Neg logit contributions
aps-0.195
apping-0.177
apped-0.173
 This-0.170
ucked-0.158
Token began
Token ID2540
Feature activation+4.93e+00

Pos logit contributions
 forgot+0.033
 punished+0.028
 nurses+0.025
 drowned+0.025
 overs+0.024
Neg logit contributions
aps-0.065
apping-0.061
 This-0.058
apped-0.057
uttering-0.056
Token to
Token ID284
Feature activation-3.57e+00

Pos logit contributions
aps+0.089
apping+0.077
uttering+0.076
 This+0.069
apped+0.068
Neg logit contributions
 forgot-0.140
 punished-0.129
 cleaned-0.120
 realized-0.118
 sighed-0.118
Token scrib
Token ID44661
Feature activation+1.21e+01

Pos logit contributions
 forgot+0.070
 punished+0.057
 regretted+0.053
 sighed+0.052
 nurses+0.050
Neg logit contributions
aps-0.090
 This-0.081
apping-0.079
uttering-0.075
 Today-0.072
Tokenble
Token ID903
Feature activation+3.08e+00

Pos logit contributions
aps+0.251
 This+0.237
 These+0.221
apping+0.214
 Today+0.213
Neg logit contributions
 forgot-0.273
 punished-0.260
 cleaned-0.231
 realized-0.226
 sighed-0.221
Token the
Token ID262
Feature activation-4.23e+00

Pos logit contributions
aps+0.066
apping+0.058
uttering+0.057
 This+0.053
apped+0.051
Neg logit contributions
 forgot-0.077
 punished-0.071
 cleaned-0.064
 realized-0.063
 sighed-0.063
Token prices
Token ID4536
Feature activation+6.39e+00

Pos logit contributions
 forgot+0.094
 punished+0.082
 cleaned+0.074
 sighed+0.072
 realized+0.072
Neg logit contributions
aps-0.104
apping-0.093
uttering-0.091
 This-0.089
apped-0.083
Token on
Token ID319
Feature activation+2.52e+00

Pos logit contributions
aps+0.128
apping+0.110
uttering+0.108
 This+0.102
apped+0.095
Neg logit contributions
 forgot-0.171
 punished-0.154
 cleaned-0.143
 sighed-0.141
 realized-0.139
Token all
Token ID477
Feature activation-7.72e-01

Pos logit contributions
aps+0.063
apping+0.057
uttering+0.055
 This+0.051
apped+0.051
Neg logit contributions
 forgot-0.056
 punished-0.049
 cleaned-0.044
 sighed-0.043
 realized-0.043
Token doctor
Token ID6253
Feature activation+8.62e-01

Pos logit contributions
 forgot+0.079
 punished+0.066
 sighed+0.059
 cleaned+0.059
 regretted+0.058
Neg logit contributions
aps-0.084
 This-0.075
apping-0.075
uttering-0.074
apped-0.066
Token,
Token ID11
Feature activation-6.90e-01

Pos logit contributions
aps+0.019
uttering+0.017
apping+0.017
 This+0.015
apped+0.015
Neg logit contributions
 forgot-0.021
 punished-0.019
 cleaned-0.017
 sighed-0.017
 realized-0.017
Token which
Token ID543
Feature activation-3.56e+00

Pos logit contributions
 forgot+0.016
 punished+0.014
 regretted+0.012
 cleaned+0.012
 sighed+0.012
Neg logit contributions
aps-0.016
apping-0.014
 This-0.014
uttering-0.013
apped-0.012
Token was
Token ID373
Feature activation-1.17e+00

Pos logit contributions
 forgot+0.068
 punished+0.060
 regretted+0.052
 realized+0.051
 cleaned+0.051
Neg logit contributions
aps-0.088
 This-0.084
apping-0.078
uts-0.074
uttering-0.073
Token an
Token ID281
Feature activation+2.60e-01

Pos logit contributions
 forgot+0.027
 punished+0.023
 sighed+0.021
 realized+0.020
 regretted+0.020
Neg logit contributions
aps-0.027
apping-0.024
uttering-0.024
 This-0.023
apped-0.021
Token err
Token ID11454
Feature activation+1.21e+01

Pos logit contributions
aps+0.006
apping+0.006
 This+0.006
uttering+0.006
ummy+0.006
Neg logit contributions
 forgot-0.006
 punished-0.004
 sighed-0.004
 realized-0.004
 cleaned-0.004
Tokenand
Token ID392
Feature activation+3.28e+00

Pos logit contributions
aps+0.256
apping+0.225
uttering+0.218
 This+0.210
apped+0.196
Neg logit contributions
 forgot-0.311
 punished-0.277
 cleaned-0.251
 sighed-0.249
 regretted-0.247
Token he
Token ID339
Feature activation-6.51e+00

Pos logit contributions
aps+0.070
apping+0.061
uttering+0.060
 This+0.057
apped+0.054
Neg logit contributions
 forgot-0.084
 punished-0.076
 cleaned-0.069
 sighed-0.068
 realized-0.067
Token didn
Token ID1422
Feature activation+4.42e+00

Pos logit contributions
 forgot+0.096
 punished+0.081
 nurses+0.069
 cleaned+0.069
 sighed+0.069
Neg logit contributions
aps-0.186
 This-0.171
apping-0.169
uttering-0.163
uts-0.158
Token't
Token ID470
Feature activation-3.29e+00

Pos logit contributions
aps+0.087
uttering+0.080
apping+0.077
apped+0.072
ectar+0.071
Neg logit contributions
 forgot-0.109
 punished-0.105
 cleaned-0.095
 never-0.090
 hospital-0.090
Token understand
Token ID1833
Feature activation+1.93e+00

Pos logit contributions
 forgot+0.067
 punished+0.057
 sighed+0.055
 regretted+0.054
 realized+0.053
Neg logit contributions
aps-0.080
 This-0.073
apping-0.073
uttering-0.070
uts-0.068
Token inside
Token ID2641
Feature activation+3.69e+00

Pos logit contributions
aps+0.038
apping+0.033
uttering+0.032
 This+0.031
apped+0.029
Neg logit contributions
 forgot-0.053
 punished-0.049
 cleaned-0.044
 sighed-0.043
 realized-0.043
Token that
Token ID326
Feature activation-7.69e-01

Pos logit contributions
aps+0.078
apping+0.069
uttering+0.067
 This+0.064
apped+0.060
Neg logit contributions
 forgot-0.094
 punished-0.085
 cleaned-0.077
 sighed-0.076
 realized-0.075
Token barn
Token ID25203
Feature activation+3.84e+00

Pos logit contributions
 forgot+0.019
 punished+0.017
 sighed+0.015
 cleaned+0.015
 regretted+0.014
Neg logit contributions
aps-0.014
 This-0.014
 Today-0.014
uttering-0.013
apping-0.012
Token.
Token ID13
Feature activation+4.76e+00

Pos logit contributions
aps+0.072
apping+0.062
 This+0.059
uttering+0.059
apped+0.054
Neg logit contributions
 forgot-0.105
 punished-0.095
 cleaned-0.086
 regretted-0.086
 sighed-0.086
Token If
Token ID1002
Feature activation+1.37e+00

Pos logit contributions
aps+0.088
uttering+0.086
apping+0.080
apped+0.068
ges+0.067
Neg logit contributions
 forgot-0.128
 punished-0.113
 hoped-0.108
 learned-0.108
 sighed-0.107
Token she
Token ID673
Feature activation-1.07e+01

Pos logit contributions
aps+0.030
apping+0.026
uttering+0.025
 This+0.025
apped+0.023
Neg logit contributions
 forgot-0.035
 punished-0.031
 cleaned-0.028
 sighed-0.028
 regretted-0.027
Token did
Token ID750
Feature activation-6.04e-02

Pos logit contributions
 forgot+0.149
 punished+0.140
 nurses+0.135
 mess+0.126
 wounds+0.126
Neg logit contributions
aps-0.250
 This-0.245
apping-0.231
uts-0.230
apped-0.224
Token,
Token ID11
Feature activation-1.58e+00

Pos logit contributions
 forgot+0.002
 punished+0.001
 cleaned+0.001
 sighed+0.001
 regretted+0.001
Neg logit contributions
aps-0.001
apping-0.001
uttering-0.001
 This-0.001
uts-0.001
Token the
Token ID262
Feature activation-4.42e+00

Pos logit contributions
 forgot+0.042
 punished+0.038
 cleaned+0.034
 regretted+0.034
 sighed+0.034
Neg logit contributions
aps-0.031
apping-0.027
 This-0.027
uttering-0.026
apped-0.024
Token owners
Token ID4393
Feature activation-3.98e+00

Pos logit contributions
 forgot+0.106
 punished+0.088
 sighed+0.084
 cleaned+0.081
 regretted+0.079
Neg logit contributions
aps-0.088
 This-0.083
 Today-0.083
uttering-0.082
 These-0.074
Token would
Token ID561
Feature activation-1.67e+00

Pos logit contributions
 forgot+0.062
 punished+0.052
 overs+0.047
 regretted+0.046
 nurses+0.046
Neg logit contributions
aps-0.102
apping-0.096
apped-0.093
 This-0.091
uttering-0.089
Token can
Token ID460
Feature activation-4.94e-01

Pos logit contributions
 forgot+0.010
 punished+0.009
 sighed+0.008
 regretted+0.008
 realized+0.008
Neg logit contributions
aps-0.013
apping-0.011
uttering-0.011
 This-0.011
uts-0.011
Token happen
Token ID1645
Feature activation+4.25e+00

Pos logit contributions
 forgot+0.010
 punished+0.010
 sighed+0.009
 regretted+0.009
 cleaned+0.009
Neg logit contributions
aps-0.011
 This-0.010
apping-0.010
uts-0.009
uttering-0.009
Token.
Token ID13
Feature activation+4.90e+00

Pos logit contributions
aps+0.083
apping+0.071
uttering+0.070
 This+0.066
apped+0.064
Neg logit contributions
 forgot-0.116
 punished-0.103
 cleaned-0.096
 realized-0.094
 sighed-0.093
Token But
Token ID887
Feature activation-3.37e-01

Pos logit contributions
aps+0.081
uttering+0.077
apping+0.072
apped+0.060
uts+0.059
Neg logit contributions
 forgot-0.139
 punished-0.126
 cleaned-0.118
 hoped-0.117
 learned-0.117
Token if
Token ID611
Feature activation-1.49e+00

Pos logit contributions
 forgot+0.008
 punished+0.008
 cleaned+0.007
 sighed+0.007
 regretted+0.007
Neg logit contributions
aps-0.007
apping-0.007
uttering-0.006
 This-0.006
apped-0.006
Token she
Token ID673
Feature activation-1.07e+01

Pos logit contributions
 forgot+0.041
 punished+0.038
 cleaned+0.035
 sighed+0.034
 regretted+0.033
Neg logit contributions
aps-0.026
 This-0.022
apping-0.021
uttering-0.021
 Yes-0.018
Token was
Token ID373
Feature activation-4.00e+00

Pos logit contributions
 nurses+0.155
 forgot+0.155
 punished+0.152
 stret+0.129
 conductor+0.128
Neg logit contributions
aps-0.264
 This-0.238
apping-0.234
apped-0.228
uts-0.223
Token smart
Token ID4451
Feature activation+1.22e+00

Pos logit contributions
 forgot+0.082
 punished+0.071
 sighed+0.065
 cleaned+0.064
 regretted+0.063
Neg logit contributions
aps-0.101
apping-0.092
uttering-0.089
 This-0.085
apped-0.080
Token and
Token ID290
Feature activation-6.02e+00

Pos logit contributions
aps+0.024
apping+0.021
uttering+0.021
 This+0.019
apped+0.018
Neg logit contributions
 forgot-0.033
 punished-0.030
 cleaned-0.027
 sighed-0.027
 regretted-0.027
Token quick
Token ID2068
Feature activation+4.73e-01

Pos logit contributions
 forgot+0.109
 punished+0.097
 sighed+0.083
 regretted+0.082
 overs+0.082
Neg logit contributions
aps-0.153
 This-0.139
apping-0.138
uttering-0.135
apped-0.128
Token,
Token ID11
Feature activation-1.75e+00

Pos logit contributions
aps+0.009
apping+0.008
uttering+0.008
 This+0.007
apped+0.007
Neg logit contributions
 forgot-0.013
 punished-0.012
 cleaned-0.011
 realized-0.011
 sighed-0.011
Token being
Token ID852
Feature activation-2.34e+00

Pos logit contributions
aps+0.051
apping+0.045
uttering+0.044
 This+0.042
apped+0.041
Neg logit contributions
 forgot-0.050
 punished-0.044
 cleaned-0.040
 realized-0.039
 sighed-0.039
Token silly
Token ID14397
Feature activation+2.62e+00

Pos logit contributions
 forgot+0.042
 punished+0.036
 cleaned+0.031
 sighed+0.030
 realized+0.030
Neg logit contributions
aps-0.068
apping-0.062
uttering-0.061
 This-0.059
apped-0.057
Token.
Token ID13
Feature activation+5.35e+00

Pos logit contributions
aps+0.055
uttering+0.049
apping+0.047
 This+0.045
apped+0.042
Neg logit contributions
 forgot-0.068
 punished-0.061
 cleaned-0.056
 sighed-0.055
 realized-0.055
Token Even
Token ID3412
Feature activation-2.31e+00

Pos logit contributions
uttering+0.100
aps+0.095
apping+0.088
ings+0.075
apped+0.074
Neg logit contributions
 forgot-0.138
 punished-0.125
 learned-0.118
 realized-0.116
 hoped-0.116
Token though
Token ID996
Feature activation+1.75e-01

Pos logit contributions
 forgot+0.052
 punished+0.046
 cleaned+0.041
 sighed+0.040
 regretted+0.040
Neg logit contributions
aps-0.056
apping-0.050
uttering-0.049
 This-0.048
apped-0.045
Token she
Token ID673
Feature activation-1.06e+01

Pos logit contributions
aps+0.004
apping+0.004
uttering+0.004
 This+0.004
apped+0.003
Neg logit contributions
 forgot-0.004
 punished-0.004
 cleaned-0.003
 sighed-0.003
 realized-0.003
Token tried
Token ID3088
Feature activation+4.35e+00

Pos logit contributions
 nurses+0.166
 punished+0.161
 forgot+0.148
 secure+0.142
 stret+0.140
Neg logit contributions
aps-0.239
apping-0.220
apped-0.213
uts-0.211
uttering-0.203
Token her
Token ID607
Feature activation-5.60e+00

Pos logit contributions
aps+0.089
apping+0.079
uttering+0.078
 This+0.073
apped+0.071
Neg logit contributions
 forgot-0.114
 punished-0.103
 cleaned-0.094
 realized-0.093
 sighed-0.092
Token best
Token ID1266
Feature activation-3.05e+00

Pos logit contributions
 forgot+0.123
 punished+0.108
 cleaned+0.096
 sighed+0.095
 regretted+0.094
Neg logit contributions
aps-0.140
apping-0.126
uttering-0.123
 This-0.119
apped-0.113
Token,
Token ID11
Feature activation-1.36e+00

Pos logit contributions
 forgot+0.079
 punished+0.067
 cleaned+0.063
 realized+0.062
 hoped+0.061
Neg logit contributions
aps-0.064
uttering-0.058
apping-0.057
 This-0.053
apped-0.050
Token the
Token ID262
Feature activation-4.75e+00

Pos logit contributions
 forgot+0.034
 punished+0.031
 cleaned+0.028
 sighed+0.028
 regretted+0.027
Neg logit contributions
aps-0.029
apping-0.025
uttering-0.024
 This-0.024
apped-0.022
Token each
Token ID1123
Feature activation-5.23e+00

Pos logit contributions
 forgot+0.030
 punished+0.027
 cleaned+0.023
 sighed+0.023
 regretted+0.023
Neg logit contributions
aps-0.033
apping-0.030
uttering-0.029
 This-0.029
apped-0.027
Token other
Token ID584
Feature activation-2.23e+00

Pos logit contributions
 forgot+0.101
 punished+0.086
 sighed+0.078
 cleaned+0.073
 regretted+0.073
Neg logit contributions
aps-0.140
 This-0.127
apping-0.125
uttering-0.124
ings-0.114
Token.
Token ID13
Feature activation+4.40e+00

Pos logit contributions
 forgot+0.059
 punished+0.053
 cleaned+0.048
 sighed+0.048
 regretted+0.048
Neg logit contributions
aps-0.045
apping-0.039
uttering-0.038
 This-0.037
apped-0.034
Token Even
Token ID3412
Feature activation-2.73e+00

Pos logit contributions
aps+0.079
uttering+0.079
apping+0.072
uts+0.063
ings+0.061
Neg logit contributions
 forgot-0.111
 learned-0.097
 hoped-0.095
 realized-0.093
 cleaned-0.092
Token though
Token ID996
Feature activation+3.16e-02

Pos logit contributions
 forgot+0.064
 punished+0.057
 cleaned+0.051
 sighed+0.051
 regretted+0.050
Neg logit contributions
aps-0.063
apping-0.056
uttering-0.055
 This-0.053
apped-0.050
Token she
Token ID673
Feature activation-1.06e+01

Pos logit contributions
aps+0.001
apping+0.001
uttering+0.001
 This+0.001
apped+0.001
Neg logit contributions
 forgot-0.001
 punished-0.001
 cleaned-0.001
 realized-0.001
 sighed-0.001
Token had
Token ID550
Feature activation-3.76e+00

Pos logit contributions
 nurses+0.179
 punished+0.178
 forgot+0.166
 wounds+0.151
 hospital+0.146
Neg logit contributions
aps-0.241
apping-0.216
 This-0.211
uts-0.209
uttering-0.206
Token to
Token ID284
Feature activation-5.02e+00

Pos logit contributions
 forgot+0.078
 punished+0.071
 sighed+0.065
 nurses+0.063
 cleaned+0.062
Neg logit contributions
aps-0.088
apping-0.077
 This-0.077
uttering-0.075
apped-0.070
Token pay
Token ID1414
Feature activation+3.06e+00

Pos logit contributions
 forgot+0.095
 punished+0.080
 sighed+0.074
 regretted+0.072
 cleaned+0.071
Neg logit contributions
aps-0.134
apping-0.120
 This-0.118
uttering-0.118
ings-0.110
Token a
Token ID257
Feature activation-2.39e+00

Pos logit contributions
aps+0.066
apping+0.058
uttering+0.056
 This+0.055
ings+0.051
Neg logit contributions
 forgot-0.076
 punished-0.068
 cleaned-0.061
 sighed-0.061
 regretted-0.061
Token lot
Token ID1256
Feature activation-1.87e-01

Pos logit contributions
 forgot+0.047
 punished+0.041
 sighed+0.037
 cleaned+0.037
 realized+0.035
Neg logit contributions
aps-0.062
apping-0.056
uttering-0.056
 This-0.054
ings-0.051
Token very
Token ID845
Feature activation-4.78e+00

Pos logit contributions
 forgot+0.071
 punished+0.061
 nurses+0.057
 sighed+0.057
 regretted+0.056
Neg logit contributions
aps-0.082
apping-0.075
uttering-0.071
 This-0.066
apped-0.066
Token frustrated
Token ID14718
Feature activation+3.64e+00

Pos logit contributions
 forgot+0.097
 punished+0.085
 nurses+0.078
 sighed+0.076
 cleaned+0.075
Neg logit contributions
aps-0.123
apping-0.113
uttering-0.107
apped-0.100
 This-0.099
Token.
Token ID13
Feature activation+4.23e+00

Pos logit contributions
aps+0.076
apping+0.064
uttering+0.064
 This+0.061
apped+0.059
Neg logit contributions
 forgot-0.095
 punished-0.085
 cleaned-0.079
 realized-0.077
 regretted-0.077
Token Even
Token ID3412
Feature activation-3.58e+00

Pos logit contributions
aps+0.080
uttering+0.076
apping+0.071
apped+0.060
ges+0.060
Neg logit contributions
 forgot-0.114
 cleaned-0.097
 punished-0.097
 sighed-0.096
 regretted-0.094
Token though
Token ID996
Feature activation-4.19e-01

Pos logit contributions
 forgot+0.086
 punished+0.076
 sighed+0.069
 cleaned+0.068
 regretted+0.068
Neg logit contributions
aps-0.078
apping-0.070
uttering-0.069
 This-0.063
apped-0.060
Token she
Token ID673
Feature activation-1.05e+01

Pos logit contributions
 forgot+0.010
 punished+0.009
 cleaned+0.008
 sighed+0.008
 regretted+0.008
Neg logit contributions
aps-0.009
apping-0.008
uttering-0.008
 This-0.008
apped-0.007
Token had
Token ID550
Feature activation-3.73e+00

Pos logit contributions
 nurses+0.190
 punished+0.179
 wounds+0.165
 officer+0.163
 stret+0.162
Neg logit contributions
aps-0.202
apping-0.189
uts-0.182
apped-0.176
uttering-0.175
Token had
Token ID550
Feature activation-2.82e+00

Pos logit contributions
 forgot+0.059
 punished+0.054
 sighed+0.051
 nurses+0.050
 promise+0.046
Neg logit contributions
aps-0.092
apping-0.083
uttering-0.083
 This-0.082
uts-0.080
Token a
Token ID257
Feature activation-2.22e+00

Pos logit contributions
 forgot+0.051
 punished+0.045
 sighed+0.042
 nurses+0.039
 realized+0.038
Neg logit contributions
aps-0.069
uttering-0.061
apping-0.061
 This-0.060
uts-0.058
Token lovely
Token ID14081
Feature activation-2.75e+00

Pos logit contributions
 forgot+0.047
 punished+0.041
 sighed+0.037
 hoped+0.036
 nurses+0.036
Neg logit contributions
aps-0.053
apping-0.048
uttering-0.048
 This-0.043
ings-0.042
Token bath
Token ID7837
Feature activation+4.33e+00

Pos logit contributions
 forgot+0.062
 punished+0.055
 cleaned+0.048
 sighed+0.048
 regretted+0.048
Neg logit contributions
aps-0.067
apping-0.060
uttering-0.059
 This-0.057
apped-0.054
Token a
Token ID257
Feature activation-1.18e+00

Pos logit contributions
 forgot+0.026
 punished+0.023
 sighed+0.021
 cleaned+0.021
 regretted+0.021
Neg logit contributions
aps-0.026
apping-0.023
uttering-0.023
 This-0.022
apped-0.021
Token secret
Token ID3200
Feature activation+2.53e+00

Pos logit contributions
 forgot+0.025
 punished+0.022
 sighed+0.019
 cleaned+0.019
 regretted+0.019
Neg logit contributions
aps-0.029
uttering-0.026
apping-0.026
 This-0.025
apped-0.023
Token code
Token ID2438
Feature activation+5.21e+00

Pos logit contributions
aps+0.055
apping+0.049
uttering+0.048
 This+0.046
apped+0.043
Neg logit contributions
 forgot-0.063
 punished-0.056
 cleaned-0.051
 sighed-0.050
 realized-0.050
Token.
Token ID13
Feature activation+5.11e+00

Pos logit contributions
aps+0.106
apping+0.091
uttering+0.089
 This+0.084
apped+0.080
Neg logit contributions
 forgot-0.138
 punished-0.124
 cleaned-0.115
 sighed-0.112
 realized-0.112
Token If
Token ID1002
Feature activation+1.31e+00

Pos logit contributions
uttering+0.127
aps+0.125
apping+0.116
ectar+0.107
utes+0.106
Neg logit contributions
 forgot-0.101
 sighed-0.086
 punished-0.085
 learned-0.084
 cleaned-0.084
Token she
Token ID673
Feature activation-1.05e+01

Pos logit contributions
aps+0.027
apping+0.023
 This+0.022
uttering+0.022
apped+0.020
Neg logit contributions
 forgot-0.034
 punished-0.031
 cleaned-0.028
 regretted-0.027
 sighed-0.027
Token broke
Token ID6265
Feature activation+5.51e+00

Pos logit contributions
 forgot+0.152
 nurses+0.151
 punished+0.150
 hospital+0.134
 mess+0.134
Neg logit contributions
aps-0.254
 This-0.240
apping-0.233
apped-0.229
uttering-0.225
Token the
Token ID262
Feature activation-4.13e+00

Pos logit contributions
aps+0.129
apping+0.116
uttering+0.112
 This+0.108
apped+0.103
Neg logit contributions
 forgot-0.128
 punished-0.114
 cleaned-0.103
 realized-0.101
 sighed-0.101
Token code
Token ID2438
Feature activation+3.83e+00

Pos logit contributions
 forgot+0.089
 punished+0.079
 cleaned+0.070
 sighed+0.069
 regretted+0.067
Neg logit contributions
aps-0.102
uttering-0.091
apping-0.090
 This-0.087
apped-0.081
Token,
Token ID11
Feature activation-1.72e+00

Pos logit contributions
aps+0.072
apping+0.062
uttering+0.060
 This+0.057
apped+0.053
Neg logit contributions
 forgot-0.107
 punished-0.097
 cleaned-0.089
 sighed-0.088
 realized-0.087
Token she
Token ID673
Feature activation-8.62e+00

Pos logit contributions
 forgot+0.043
 punished+0.039
 cleaned+0.035
 regretted+0.035
 sighed+0.035
Neg logit contributions
aps-0.037
apping-0.032
uttering-0.031
 This-0.031
apped-0.028
Token and
Token ID290
Feature activation-4.39e+00

Pos logit contributions
aps+0.067
uttering+0.060
apping+0.057
imb+0.053
 This+0.053
Neg logit contributions
 forgot-0.081
 punished-0.072
 cleaned-0.069
 realized-0.067
 learned-0.066
Token sugar
Token ID7543
Feature activation+4.80e-01

Pos logit contributions
 forgot+0.093
 punished+0.085
 cleaned+0.073
 sighed+0.073
 regretted+0.072
Neg logit contributions
aps-0.109
apping-0.096
 This-0.093
uttering-0.091
apped-0.087
Token.
Token ID13
Feature activation+4.35e+00

Pos logit contributions
aps+0.009
apping+0.008
uttering+0.007
 This+0.007
apped+0.007
Neg logit contributions
 forgot-0.013
 punished-0.012
 sighed-0.011
 cleaned-0.011
 regretted-0.011
Token Even
Token ID3412
Feature activation-2.62e+00

Pos logit contributions
uttering+0.091
aps+0.085
apping+0.081
unch+0.071
ings+0.070
Neg logit contributions
 forgot-0.106
 hoped-0.092
 learned-0.089
 cleaned-0.087
 punished-0.085
Token though
Token ID996
Feature activation-2.87e-02

Pos logit contributions
 forgot+0.061
 punished+0.054
 cleaned+0.049
 sighed+0.048
 realized+0.047
Neg logit contributions
aps-0.062
apping-0.055
uttering-0.054
 This-0.052
apped-0.049
Token she
Token ID673
Feature activation-1.05e+01

Pos logit contributions
 forgot+0.001
 punished+0.001
 cleaned+0.001
 sighed+0.001
 hoped+0.001
Neg logit contributions
aps-0.001
apping-0.001
uttering-0.001
 This-0.001
apped-0.001
Token was
Token ID373
Feature activation-3.93e+00

Pos logit contributions
 nurses+0.176
 punished+0.173
 forgot+0.162
 stret+0.151
 wounds+0.147
Neg logit contributions
aps-0.249
 This-0.220
apping-0.216
uts-0.215
apped-0.205
Token only
Token ID691
Feature activation-6.37e+00

Pos logit contributions
 forgot+0.080
 punished+0.069
 sighed+0.062
 regretted+0.061
 cleaned+0.061
Neg logit contributions
aps-0.102
apping-0.093
uttering-0.091
 This-0.087
apped-0.082
Token three
Token ID1115
Feature activation-8.97e-01

Pos logit contributions
 forgot+0.152
 punished+0.135
 cleaned+0.122
 sighed+0.120
 regretted+0.119
Neg logit contributions
aps-0.146
apping-0.130
uttering-0.127
 This-0.122
apped-0.115
Token years
Token ID812
Feature activation-1.51e-01

Pos logit contributions
 forgot+0.021
 punished+0.019
 cleaned+0.017
 sighed+0.017
 regretted+0.017
Neg logit contributions
aps-0.020
apping-0.018
 This-0.016
uttering-0.016
uts-0.015
Token old
Token ID1468
Feature activation-1.64e+00

Pos logit contributions
 forgot+0.002
 punished+0.002
 cleaned+0.002
 sighed+0.002
 regretted+0.002
Neg logit contributions
aps-0.005
apping-0.004
uttering-0.004
 This-0.004
apped-0.004
Token was
Token ID373
Feature activation-3.45e+00

Pos logit contributions
 forgot+0.020
 punished+0.019
 nurses+0.016
 regret+0.015
 cleaned+0.015
Neg logit contributions
aps-0.032
apping-0.029
apped-0.029
uttering-0.028
uts-0.026
Token gone
Token ID3750
Feature activation+1.70e+00

Pos logit contributions
 forgot+0.069
 punished+0.060
 cleaned+0.052
 sighed+0.052
 regretted+0.051
Neg logit contributions
aps-0.093
apping-0.084
uttering-0.082
 This-0.079
apped-0.076
Token.
Token ID13
Feature activation+4.58e+00

Pos logit contributions
aps+0.032
apping+0.027
uttering+0.026
 This+0.025
apped+0.023
Neg logit contributions
 forgot-0.049
 punished-0.044
 cleaned-0.041
 realized-0.040
 regretted-0.040
Token Even
Token ID3412
Feature activation-3.00e+00

Pos logit contributions
aps+0.094
uttering+0.089
apping+0.083
ach+0.076
unch+0.076
Neg logit contributions
 forgot-0.101
 cleaned-0.091
 sighed-0.087
 hoped-0.087
 punished-0.085
Token though
Token ID996
Feature activation-1.62e-02

Pos logit contributions
 forgot+0.069
 punished+0.061
 sighed+0.055
 cleaned+0.055
 regretted+0.054
Neg logit contributions
aps-0.068
apping-0.062
uttering-0.061
 This-0.056
apped-0.054
Token she
Token ID673
Feature activation-1.05e+01

Pos logit contributions
 forgot+0.000
 punished+0.000
 cleaned+0.000
 sighed+0.000
 regretted+0.000
Neg logit contributions
aps-0.000
apping-0.000
uttering-0.000
 This-0.000
apped-0.000
Token could
Token ID714
Feature activation-2.26e+00

Pos logit contributions
 nurses+0.209
 punished+0.189
 stret+0.177
 wounds+0.174
 hospital+0.166
Neg logit contributions
aps-0.211
apping-0.195
apped-0.193
uttering-0.185
uts-0.178
Token not
Token ID407
Feature activation-4.83e+00

Pos logit contributions
 forgot+0.043
 punished+0.038
 sighed+0.033
 cleaned+0.033
 regretted+0.033
Neg logit contributions
aps-0.062
apping-0.056
uttering-0.055
 This-0.054
apped-0.051
Token see
Token ID766
Feature activation+1.44e+00

Pos logit contributions
 forgot+0.093
 punished+0.089
 nurses+0.077
 sighed+0.077
 regretted+0.076
Neg logit contributions
aps-0.118
apping-0.108
 This-0.105
uttering-0.104
uts-0.096
Token it
Token ID340
Feature activation-3.93e+00

Pos logit contributions
aps+0.032
apping+0.028
uttering+0.027
 This+0.027
apped+0.025
Neg logit contributions
 forgot-0.035
 punished-0.031
 cleaned-0.028
 sighed-0.028
 regretted-0.027
Token anymore
Token ID7471
Feature activation+1.40e+00

Pos logit contributions
 forgot+0.108
 punished+0.098
 cleaned+0.089
 sighed+0.088
 realized+0.088
Neg logit contributions
aps-0.076
apping-0.066
uttering-0.064
 This-0.061
apped-0.057
Token a
Token ID257
Feature activation-1.59e+00

Pos logit contributions
 forgot+0.049
 punished+0.044
 cleaned+0.039
 realized+0.038
 sighed+0.038
Neg logit contributions
aps-0.046
apping-0.041
uttering-0.040
 This-0.038
apped-0.036
Token scientist
Token ID11444
Feature activation+2.89e-01

Pos logit contributions
 forgot+0.031
 punished+0.027
 cleaned+0.024
 sighed+0.024
 regretted+0.023
Neg logit contributions
aps-0.042
apping-0.038
uttering-0.037
 This-0.036
apped-0.033
Token.
Token ID13
Feature activation+4.80e+00

Pos logit contributions
aps+0.005
uttering+0.005
apping+0.005
 This+0.004
apped+0.004
Neg logit contributions
 forgot-0.008
 punished-0.007
 cleaned-0.007
 sighed-0.007
 realized-0.007
Token Even
Token ID3412
Feature activation-2.83e+00

Pos logit contributions
uttering+0.104
aps+0.100
apping+0.096
ings+0.083
ach+0.081
Neg logit contributions
 forgot-0.107
 cleaned-0.095
 punished-0.091
 learned-0.091
 realized-0.089
Token though
Token ID996
Feature activation-3.40e-02

Pos logit contributions
 forgot+0.061
 punished+0.053
 cleaned+0.047
 sighed+0.047
 regretted+0.046
Neg logit contributions
aps-0.071
apping-0.064
uttering-0.063
 This-0.061
apped-0.058
Token she
Token ID673
Feature activation-1.05e+01

Pos logit contributions
 forgot+0.001
 punished+0.001
 cleaned+0.001
 sighed+0.001
 regretted+0.001
Neg logit contributions
aps-0.001
apping-0.001
uttering-0.001
 This-0.001
apped-0.001
Token couldn
Token ID3521
Feature activation+1.44e+00

Pos logit contributions
 nurses+0.162
 punished+0.161
 forgot+0.153
 stret+0.141
 wounds+0.136
Neg logit contributions
aps-0.258
apping-0.229
 This-0.222
apped-0.218
uts-0.218
Token't
Token ID470
Feature activation-4.70e+00

Pos logit contributions
aps+0.028
uttering+0.027
apping+0.026
ectar+0.023
apped+0.023
Neg logit contributions
 forgot-0.036
 punished-0.033
 cleaned-0.031
 hospital-0.030
 realized-0.030
Token read
Token ID1100
Feature activation+4.76e+00

Pos logit contributions
 forgot+0.092
 punished+0.082
 regretted+0.071
 cleaned+0.070
 sighed+0.069
Neg logit contributions
aps-0.125
apping-0.113
 This-0.110
uttering-0.108
uts-0.101
Token yet
Token ID1865
Feature activation-1.01e-01

Pos logit contributions
aps+0.105
apping+0.094
uttering+0.092
 This+0.085
apped+0.083
Neg logit contributions
 forgot-0.117
 punished-0.105
 cleaned-0.096
 sighed-0.093
 realized-0.093
Token,
Token ID11
Feature activation-9.29e-01

Pos logit contributions
 forgot+0.003
 punished+0.002
 realized+0.002
 cleaned+0.002
 sighed+0.002
Neg logit contributions
aps-0.002
uttering-0.002
apping-0.002
apped-0.002
 This-0.002
Token a
Token ID257
Feature activation-8.12e-01

Pos logit contributions
 forgot+0.006
 punished+0.005
 sighed+0.005
 regretted+0.005
 cleaned+0.005
Neg logit contributions
aps-0.005
apping-0.004
uttering-0.004
 This-0.004
apped-0.004
Token special
Token ID2041
Feature activation-7.06e-01

Pos logit contributions
 forgot+0.017
 punished+0.015
 sighed+0.013
 cleaned+0.013
 regretted+0.013
Neg logit contributions
aps-0.021
apping-0.018
uttering-0.018
 This-0.018
apped-0.016
Token plant
Token ID4618
Feature activation+2.45e+00

Pos logit contributions
 forgot+0.015
 punished+0.013
 cleaned+0.011
 sighed+0.011
 regretted+0.011
Neg logit contributions
aps-0.018
apping-0.016
uttering-0.016
 This-0.016
apped-0.015
Token.
Token ID13
Feature activation+5.41e+00

Pos logit contributions
aps+0.049
apping+0.042
uttering+0.041
 This+0.039
apped+0.037
Neg logit contributions
 forgot-0.066
 punished-0.059
 cleaned-0.054
 sighed-0.053
 realized-0.053
Token If
Token ID1002
Feature activation+1.52e+00

Pos logit contributions
aps+0.126
uttering+0.122
apping+0.112
unch+0.106
owers+0.102
Neg logit contributions
 forgot-0.117
 punished-0.099
 realized-0.098
 cleaned-0.097
 sighed-0.095
Token she
Token ID673
Feature activation-1.05e+01

Pos logit contributions
aps+0.034
apping+0.030
uttering+0.029
 This+0.028
apped+0.026
Neg logit contributions
 forgot-0.037
 punished-0.033
 cleaned-0.030
 sighed-0.029
 regretted-0.029
Token took
Token ID1718
Feature activation+2.21e+00

Pos logit contributions
 forgot+0.145
 nurses+0.141
 punished+0.136
 wounds+0.125
 hospital+0.124
Neg logit contributions
aps-0.247
 This-0.239
apping-0.228
apped-0.221
uts-0.220
Token good
Token ID922
Feature activation+5.96e-01

Pos logit contributions
aps+0.046
apping+0.039
 This+0.039
uttering+0.039
apped+0.035
Neg logit contributions
 forgot-0.058
 punished-0.051
 regretted-0.046
 sighed-0.046
 cleaned-0.046
Token care
Token ID1337
Feature activation+3.80e+00

Pos logit contributions
aps+0.019
apping+0.018
uttering+0.017
 This+0.017
apped+0.016
Neg logit contributions
 forgot-0.009
 punished-0.007
 cleaned-0.006
 sighed-0.006
 realized-0.006
Token of
Token ID286
Feature activation+1.13e-01

Pos logit contributions
aps+0.064
apping+0.053
uttering+0.052
 This+0.049
apped+0.045
Neg logit contributions
 forgot-0.114
 punished-0.104
 cleaned-0.097
 sighed-0.095
 realized-0.095
Token it
Token ID340
Feature activation-2.40e+00

Pos logit contributions
aps+0.002
apping+0.002
uttering+0.002
 This+0.002
apped+0.002
Neg logit contributions
 forgot-0.003
 punished-0.002
 sighed-0.002
 regretted-0.002
 cleaned-0.002
Token the
Token ID262
Feature activation-4.56e+00

Pos logit contributions
 forgot+0.030
 punished+0.027
 cleaned+0.024
 sighed+0.024
 regretted+0.024
Neg logit contributions
aps-0.029
apping-0.025
uttering-0.025
 This-0.025
apped-0.023
Token ocean
Token ID9151
Feature activation+3.65e+00

Pos logit contributions
 forgot+0.103
 punished+0.094
 cleaned+0.083
 regretted+0.080
 sighed+0.080
Neg logit contributions
aps-0.110
apping-0.098
uttering-0.096
 This-0.094
apped-0.088
Token.
Token ID13
Feature activation+4.91e+00

Pos logit contributions
aps+0.066
apping+0.057
uttering+0.056
 This+0.053
apped+0.048
Neg logit contributions
 forgot-0.104
 punished-0.094
 cleaned-0.087
 sighed-0.086
 realized-0.085
Token Even
Token ID3412
Feature activation-2.61e+00

Pos logit contributions
uttering+0.099
aps+0.096
apping+0.093
ach+0.082
utes+0.081
Neg logit contributions
 forgot-0.112
 cleaned-0.097
 hoped-0.097
 learned-0.095
 wished-0.093
Token though
Token ID996
Feature activation+5.21e-02

Pos logit contributions
 forgot+0.061
 punished+0.054
 cleaned+0.049
 sighed+0.048
 regretted+0.047
Neg logit contributions
aps-0.060
apping-0.054
uttering-0.053
 This-0.050
apped-0.048
Token she
Token ID673
Feature activation-1.04e+01

Pos logit contributions
aps+0.001
apping+0.001
uttering+0.001
 This+0.001
apped+0.001
Neg logit contributions
 forgot-0.001
 punished-0.001
 cleaned-0.001
 sighed-0.001
 realized-0.001
Token was
Token ID373
Feature activation-4.08e+00

Pos logit contributions
 punished+0.189
 nurses+0.186
 forgot+0.169
 stret+0.158
 wounds+0.158
Neg logit contributions
aps-0.231
apping-0.206
 This-0.198
uttering-0.195
apped-0.194
Token tired
Token ID10032
Feature activation+3.87e+00

Pos logit contributions
 forgot+0.087
 punished+0.076
 sighed+0.068
 cleaned+0.068
 regretted+0.067
Neg logit contributions
aps-0.101
apping-0.092
uttering-0.090
 This-0.086
apped-0.081
Token,
Token ID11
Feature activation-1.49e+00

Pos logit contributions
aps+0.073
apping+0.062
uttering+0.060
 This+0.057
apped+0.054
Neg logit contributions
 forgot-0.109
 punished-0.099
 cleaned-0.091
 sighed-0.089
 realized-0.089
Token she
Token ID673
Feature activation-8.66e+00

Pos logit contributions
 forgot+0.040
 punished+0.037
 cleaned+0.034
 sighed+0.033
 regretted+0.033
Neg logit contributions
aps-0.027
apping-0.023
 This-0.023
uttering-0.023
apped-0.020
Token was
Token ID373
Feature activation-3.08e+00

Pos logit contributions
 punished+0.125
 nurses+0.122
 forgot+0.113
 damage+0.099
 mess+0.099
Neg logit contributions
aps-0.226
apping-0.209
apped-0.203
uttering-0.188
 This-0.186
Token felt
Token ID2936
Feature activation-7.50e-01

Pos logit contributions
 forgot+0.126
 punished+0.124
 nurses+0.121
 rubble+0.108
 colder+0.106
Neg logit contributions
aps-0.191
apping-0.177
apped-0.169
 This-0.165
uts-0.155
Token frustrated
Token ID14718
Feature activation+3.60e+00

Pos logit contributions
 forgot+0.016
 punished+0.013
 nurses+0.012
 sighed+0.012
 overs+0.012
Neg logit contributions
aps-0.018
apping-0.016
 This-0.016
uttering-0.016
apped-0.015
Token.
Token ID13
Feature activation+4.78e+00

Pos logit contributions
aps+0.082
apping+0.072
uttering+0.071
 This+0.068
apped+0.065
Neg logit contributions
 forgot-0.087
 punished-0.077
 cleaned-0.070
 sighed-0.069
 realized-0.069
Token Even
Token ID3412
Feature activation-3.38e+00

Pos logit contributions
aps+0.086
uttering+0.076
apping+0.073
unch+0.062
ings+0.061
Neg logit contributions
 forgot-0.134
 punished-0.119
 sighed-0.116
 realized-0.115
 cleaned-0.115
Token though
Token ID996
Feature activation-1.94e-01

Pos logit contributions
 forgot+0.077
 punished+0.067
 cleaned+0.061
 sighed+0.060
 regretted+0.059
Neg logit contributions
aps-0.079
apping-0.071
uttering-0.070
 This-0.066
apped-0.062
Token she
Token ID673
Feature activation-1.04e+01

Pos logit contributions
 forgot+0.005
 punished+0.004
 cleaned+0.004
 sighed+0.004
 regretted+0.004
Neg logit contributions
aps-0.004
apping-0.004
uttering-0.004
 This-0.004
apped-0.003
Token understood
Token ID7247
Feature activation+6.05e-01

Pos logit contributions
 nurses+0.175
 punished+0.164
 stret+0.156
 forgot+0.151
 officer+0.144
Neg logit contributions
aps-0.223
apping-0.208
 This-0.206
uttering-0.196
apped-0.196
Token why
Token ID1521
Feature activation+1.18e+00

Pos logit contributions
aps+0.013
apping+0.011
uttering+0.011
 This+0.011
apped+0.010
Neg logit contributions
 forgot-0.015
 punished-0.014
 cleaned-0.012
 sighed-0.012
 regretted-0.012
Token the
Token ID262
Feature activation-4.61e+00

Pos logit contributions
aps+0.028
apping+0.025
uttering+0.024
 This+0.023
apped+0.022
Neg logit contributions
 forgot-0.027
 punished-0.024
 cleaned-0.022
 sighed-0.021
 realized-0.021
Token fence
Token ID13990
Feature activation-8.52e-02

Pos logit contributions
 forgot+0.102
 punished+0.090
 cleaned+0.079
 sighed+0.078
 realized+0.077
Neg logit contributions
aps-0.113
apping-0.102
uttering-0.101
 This-0.098
apped-0.092
Token was
Token ID373
Feature activation-3.15e+00

Pos logit contributions
 forgot+0.002
 punished+0.001
 overs+0.001
 nurses+0.001
ppa+0.001
Neg logit contributions
aps-0.002
 This-0.002
apping-0.002
apped-0.002
uts-0.002
Token happy
Token ID3772
Feature activation+9.49e-01

Pos logit contributions
 punished+0.137
 forgot+0.136
 promise+0.118
 drowned+0.118
 cleaned+0.116
Neg logit contributions
aps-0.125
apping-0.109
uttering-0.106
 This-0.106
apped-0.101
Token people
Token ID661
Feature activation-2.43e+00

Pos logit contributions
aps+0.021
uttering+0.019
apping+0.019
 This+0.018
unch+0.017
Neg logit contributions
 forgot-0.023
 punished-0.019
 realized-0.019
 regretted-0.019
 felt-0.018
Token.
Token ID13
Feature activation+4.56e+00

Pos logit contributions
 forgot+0.066
 punished+0.059
 cleaned+0.054
 sighed+0.054
 realized+0.053
Neg logit contributions
aps-0.049
apping-0.042
uttering-0.041
 This-0.039
apped-0.037
Token Even
Token ID3412
Feature activation-2.72e+00

Pos logit contributions
aps+0.075
apping+0.065
uttering+0.062
apped+0.051
uts+0.051
Neg logit contributions
 forgot-0.140
 punished-0.123
 realized-0.118
 cleaned-0.117
 regretted-0.116
Token though
Token ID996
Feature activation+1.31e-01

Pos logit contributions
 forgot+0.061
 punished+0.054
 cleaned+0.048
 sighed+0.048
 regretted+0.047
Neg logit contributions
aps-0.066
apping-0.059
uttering-0.058
 This-0.056
apped-0.053
Token she
Token ID673
Feature activation-1.04e+01

Pos logit contributions
aps+0.003
apping+0.003
uttering+0.003
 This+0.003
apped+0.003
Neg logit contributions
 forgot-0.003
 punished-0.003
 cleaned-0.002
 realized-0.002
 sighed-0.002
Token was
Token ID373
Feature activation-4.10e+00

Pos logit contributions
 nurses+0.190
 punished+0.187
 stret+0.175
 wounds+0.170
 forgot+0.166
Neg logit contributions
aps-0.214
apping-0.196
 This-0.185
uttering-0.185
apped-0.182
Token a
Token ID257
Feature activation-2.07e+00

Pos logit contributions
 forgot+0.083
 punished+0.072
 sighed+0.066
 regretted+0.066
 realized+0.064
Neg logit contributions
aps-0.105
apping-0.097
uttering-0.095
 This-0.088
apped-0.083
Token bit
Token ID1643
Feature activation-2.88e-01

Pos logit contributions
 punished+0.041
 forgot+0.041
 sighed+0.038
eling+0.038
wed+0.038
Neg logit contributions
aps-0.038
apping-0.036
uttering-0.034
 This-0.034
 Yes-0.033
Token scared
Token ID12008
Feature activation+4.27e+00

Pos logit contributions
 forgot+0.006
 punished+0.005
 cleaned+0.005
 sighed+0.004
 hoped+0.004
Neg logit contributions
aps-0.008
apping-0.007
uttering-0.007
 This-0.006
apped-0.006
Token,
Token ID11
Feature activation-1.04e+00

Pos logit contributions
aps+0.091
uttering+0.079
apping+0.077
apped+0.071
 This+0.071
Neg logit contributions
 forgot-0.110
 punished-0.098
 cleaned-0.090
 nurses-0.089
 realized-0.089
Token seem
Token ID1283
Feature activation-1.52e+00

Pos logit contributions
 forgot+0.051
 punished+0.046
 sighed+0.044
 regretted+0.042
 cleaned+0.041
Neg logit contributions
aps-0.062
 This-0.057
apping-0.055
uttering-0.055
uts-0.052
Token scary
Token ID14343
Feature activation+3.32e+00

Pos logit contributions
 forgot+0.033
 punished+0.029
 cleaned+0.026
 sighed+0.026
 regretted+0.025
Neg logit contributions
aps-0.037
apping-0.034
uttering-0.033
 This-0.032
apped-0.030
Token,
Token ID11
Feature activation-7.72e-01

Pos logit contributions
aps+0.071
apping+0.062
uttering+0.061
 This+0.057
apped+0.055
Neg logit contributions
 forgot-0.085
 punished-0.076
 cleaned-0.070
 realized-0.068
 sighed-0.068
Token but
Token ID475
Feature activation-2.45e+00

Pos logit contributions
 forgot+0.017
 punished+0.015
 sighed+0.014
 cleaned+0.014
 regretted+0.014
Neg logit contributions
aps-0.018
apping-0.016
uttering-0.016
 This-0.016
uts-0.015
Token if
Token ID611
Feature activation-1.22e+00

Pos logit contributions
 forgot+0.055
 punished+0.049
 cleaned+0.044
 sighed+0.043
 realized+0.042
Neg logit contributions
aps-0.060
apping-0.053
uttering-0.052
 This-0.050
apped-0.048
Token she
Token ID673
Feature activation-1.04e+01

Pos logit contributions
 forgot+0.032
 punished+0.029
 cleaned+0.026
 sighed+0.026
 regretted+0.025
Neg logit contributions
aps-0.025
apping-0.022
uttering-0.021
 This-0.020
apped-0.019
Token talked
Token ID6619
Feature activation+5.92e+00

Pos logit contributions
 forgot+0.149
 punished+0.139
 nurses+0.127
 sighed+0.113
 stret+0.108
Neg logit contributions
aps-0.282
 This-0.257
apping-0.255
uts-0.248
apped-0.241
Token about
Token ID546
Feature activation+1.94e+00

Pos logit contributions
aps+0.127
uttering+0.111
apping+0.111
apped+0.099
 This+0.099
Neg logit contributions
 forgot-0.151
 punished-0.135
 cleaned-0.126
 sighed-0.123
 realized-0.122
Token it
Token ID340
Feature activation-2.47e+00

Pos logit contributions
aps+0.045
apping+0.040
uttering+0.039
 This+0.038
apped+0.035
Neg logit contributions
 forgot-0.046
 punished-0.040
 cleaned-0.036
 sighed-0.036
 regretted-0.036
Token with
Token ID351
Feature activation+3.28e-01

Pos logit contributions
 forgot+0.066
 punished+0.060
 cleaned+0.054
 sighed+0.054
 regretted+0.053
Neg logit contributions
aps-0.049
apping-0.043
uttering-0.042
 This-0.040
apped-0.037
Token her
Token ID607
Feature activation-4.09e+00

Pos logit contributions
aps+0.008
apping+0.007
uttering+0.006
 This+0.006
apped+0.006
Neg logit contributions
 forgot-0.008
 punished-0.007
 cleaned-0.006
 sighed-0.006
 regretted-0.006
Token play
Token ID711
Feature activation+5.26e+00

Pos logit contributions
 forgot+0.059
 punished+0.051
 cleaned+0.045
 sighed+0.044
 realized+0.043
Neg logit contributions
aps-0.078
apping-0.071
uttering-0.069
 This-0.067
apped-0.064
Token with
Token ID351
Feature activation+4.92e-01

Pos logit contributions
aps+0.089
uttering+0.080
apping+0.076
 This+0.067
apped+0.066
Neg logit contributions
 forgot-0.147
 punished-0.137
 cleaned-0.130
 learned-0.125
 realized-0.125
Token.
Token ID13
Feature activation+4.49e+00

Pos logit contributions
aps+0.010
apping+0.008
uttering+0.008
 This+0.008
apped+0.007
Neg logit contributions
 forgot-0.014
 punished-0.012
 realized-0.011
 cleaned-0.011
 sighed-0.011
Token Even
Token ID3412
Feature activation-2.34e+00

Pos logit contributions
aps+0.086
uttering+0.081
apping+0.078
ings+0.066
unch+0.064
Neg logit contributions
 forgot-0.126
 cleaned-0.107
 learned-0.105
 punished-0.105
 realized-0.105
Token though
Token ID996
Feature activation+2.96e-01

Pos logit contributions
 forgot+0.052
 punished+0.046
 cleaned+0.041
 sighed+0.040
 regretted+0.040
Neg logit contributions
aps-0.058
apping-0.052
uttering-0.051
 This-0.049
apped-0.046
Token she
Token ID673
Feature activation-1.04e+01

Pos logit contributions
aps+0.007
apping+0.006
uttering+0.006
 This+0.006
apped+0.006
Neg logit contributions
 forgot-0.007
 punished-0.006
 cleaned-0.005
 realized-0.005
 regretted-0.005
Token was
Token ID373
Feature activation-3.94e+00

Pos logit contributions
 nurses+0.187
 punished+0.173
 stret+0.165
 wounds+0.162
 forgot+0.152
Neg logit contributions
aps-0.224
apping-0.200
uts-0.197
 This-0.193
apped-0.189
Token still
Token ID991
Feature activation-5.33e+00

Pos logit contributions
 forgot+0.078
 punished+0.068
 sighed+0.062
 regretted+0.061
 realized+0.060
Neg logit contributions
aps-0.101
apping-0.093
uttering-0.090
 This-0.086
apped-0.081
Token poor
Token ID3595
Feature activation+4.22e-01

Pos logit contributions
 forgot+0.116
 punished+0.102
 cleaned+0.091
 sighed+0.090
 regretted+0.089
Neg logit contributions
aps-0.134
apping-0.120
uttering-0.117
 This-0.113
apped-0.107
Token,
Token ID11
Feature activation-1.39e+00

Pos logit contributions
aps+0.008
apping+0.007
uttering+0.007
 This+0.007
apped+0.006
Neg logit contributions
 forgot-0.011
 punished-0.010
 cleaned-0.009
 regretted-0.009
 sighed-0.009
Token she
Token ID673
Feature activation-8.63e+00

Pos logit contributions
 forgot+0.037
 punished+0.034
 cleaned+0.031
 sighed+0.030
 regretted+0.030
Neg logit contributions
aps-0.027
apping-0.023
 This-0.022
uttering-0.022
apped-0.020
Token much
Token ID881
Feature activation-3.05e+00

Pos logit contributions
 forgot+0.060
 punished+0.053
 regretted+0.048
 cleaned+0.048
 sighed+0.047
Neg logit contributions
aps-0.067
apping-0.060
uttering-0.058
 This-0.056
apped-0.053
Token slower
Token ID13611
Feature activation+8.55e-01

Pos logit contributions
 forgot+0.059
oscope+0.051
 overs+0.050
 punished+0.048
ppa+0.048
Neg logit contributions
aps-0.076
apping-0.068
uttering-0.063
apped-0.056
ectar-0.053
Token.
Token ID13
Feature activation+5.15e+00

Pos logit contributions
aps+0.016
apping+0.014
uttering+0.013
 This+0.012
apped+0.011
Neg logit contributions
 forgot-0.024
 punished-0.022
 cleaned-0.020
 sighed-0.020
 regretted-0.020
Token Even
Token ID3412
Feature activation-3.28e+00

Pos logit contributions
aps+0.096
apping+0.083
uttering+0.082
 This+0.072
apped+0.069
Neg logit contributions
 forgot-0.146
 punished-0.132
 cleaned-0.122
 sighed-0.121
 realized-0.120
Token though
Token ID996
Feature activation-3.22e-01

Pos logit contributions
 forgot+0.074
 punished+0.066
 cleaned+0.060
 sighed+0.059
 regretted+0.058
Neg logit contributions
aps-0.077
apping-0.069
uttering-0.067
 This-0.064
apped-0.060
Token she
Token ID673
Feature activation-1.04e+01

Pos logit contributions
 forgot+0.007
 punished+0.006
 cleaned+0.006
 sighed+0.006
 regretted+0.006
Neg logit contributions
aps-0.008
apping-0.007
uttering-0.007
 This-0.007
apped-0.006
Token tried
Token ID3088
Feature activation+4.00e+00

Pos logit contributions
 nurses+0.166
 punished+0.160
 forgot+0.147
 cleaned+0.133
 hospital+0.133
Neg logit contributions
aps-0.258
apping-0.244
apped-0.228
uttering-0.227
uts-0.226
Token her
Token ID607
Feature activation-6.02e+00

Pos logit contributions
aps+0.087
apping+0.077
uttering+0.076
apped+0.072
 This+0.071
Neg logit contributions
 forgot-0.101
 punished-0.088
 realized-0.081
 sighed-0.079
 cleaned-0.079
Token best
Token ID1266
Feature activation-2.87e+00

Pos logit contributions
 forgot+0.118
 punished+0.098
 realized+0.083
 hoped+0.083
 regretted+0.082
Neg logit contributions
aps-0.165
apping-0.150
uttering-0.149
 This-0.143
apped-0.138
Token,
Token ID11
Feature activation-1.54e+00

Pos logit contributions
 forgot+0.073
 punished+0.060
 hoped+0.057
 realized+0.057
 cleaned+0.056
Neg logit contributions
aps-0.061
uttering-0.053
apping-0.053
 This-0.051
apped-0.048
Token the
Token ID262
Feature activation-4.71e+00

Pos logit contributions
 forgot+0.037
 punished+0.034
 cleaned+0.031
 sighed+0.030
 regretted+0.030
Neg logit contributions
aps-0.033
apping-0.029
uttering-0.029
 This-0.029
apped-0.026
Token of
Token ID286
Feature activation+6.40e-01

Pos logit contributions
aps+0.082
uttering+0.078
imb+0.076
ings+0.075
owers+0.072
Neg logit contributions
 forgot-0.079
 realized-0.067
 cleaned-0.067
 sighed-0.064
 punished-0.062
Token herself
Token ID5223
Feature activation-4.77e+00

Pos logit contributions
aps+0.015
apping+0.014
uttering+0.014
 This+0.013
apped+0.012
Neg logit contributions
 forgot-0.015
 punished-0.013
 cleaned-0.012
 sighed-0.012
 realized-0.012
Token.
Token ID13
Feature activation+3.99e+00

Pos logit contributions
 forgot+0.122
 punished+0.109
 cleaned+0.103
 learned+0.098
 realized+0.098
Neg logit contributions
aps-0.098
uttering-0.087
apping-0.083
 This-0.077
ings-0.075
Token Even
Token ID3412
Feature activation-3.08e+00

Pos logit contributions
aps+0.074
uttering+0.069
apping+0.066
ings+0.058
ach+0.054
Neg logit contributions
 forgot-0.112
 realized-0.091
 cleaned-0.091
 hoped-0.091
 punished-0.091
Token though
Token ID996
Feature activation+1.26e-01

Pos logit contributions
 forgot+0.067
 punished+0.059
 cleaned+0.053
 sighed+0.053
 regretted+0.052
Neg logit contributions
aps-0.076
apping-0.068
uttering-0.067
 This-0.064
apped-0.061
Token she
Token ID673
Feature activation-1.04e+01

Pos logit contributions
aps+0.003
apping+0.003
uttering+0.003
 This+0.003
apped+0.002
Neg logit contributions
 forgot-0.003
 punished-0.003
 cleaned-0.002
 sighed-0.002
 realized-0.002
Token was
Token ID373
Feature activation-3.97e+00

Pos logit contributions
 nurses+0.200
 stret+0.185
 punished+0.185
 wounds+0.169
 forgot+0.163
Neg logit contributions
aps-0.232
apping-0.207
apped-0.202
 This-0.193
uts-0.189
Token at
Token ID379
Feature activation-4.30e-01

Pos logit contributions
 forgot+0.079
 punished+0.068
 sighed+0.063
 regretted+0.062
 realized+0.060
Neg logit contributions
aps-0.104
apping-0.095
uttering-0.092
 This-0.088
apped-0.084
Token the
Token ID262
Feature activation-4.21e+00

Pos logit contributions
 forgot+0.010
 punished+0.008
 cleaned+0.008
 sighed+0.007
 regretted+0.007
Neg logit contributions
aps-0.011
apping-0.010
uttering-0.009
 This-0.009
apped-0.009
Token hospital
Token ID4436
Feature activation+1.59e+00

Pos logit contributions
 forgot+0.074
 punished+0.057
 sighed+0.055
 cleaned+0.051
 hoped+0.050
Neg logit contributions
aps-0.115
uttering-0.106
 This-0.106
apping-0.105
 Today-0.098
Token,
Token ID11
Feature activation-1.74e+00

Pos logit contributions
aps+0.028
apping+0.024
uttering+0.024
 This+0.022
apped+0.021
Neg logit contributions
 forgot-0.046
 punished-0.042
 cleaned-0.039
 sighed-0.038
 realized-0.038
Token purse
Token ID27517
Feature activation+2.58e+00

Pos logit contributions
 forgot+0.046
 punished+0.041
 cleaned+0.036
 sighed+0.036
 realized+0.035
Neg logit contributions
aps-0.047
apping-0.041
uttering-0.041
 This-0.040
apped-0.037
Token back
Token ID736
Feature activation+4.17e-01

Pos logit contributions
aps+0.046
apping+0.039
uttering+0.038
 This+0.037
apped+0.033
Neg logit contributions
 forgot-0.074
 punished-0.068
 cleaned-0.062
 sighed-0.062
 regretted-0.061
Token.
Token ID13
Feature activation+4.85e+00

Pos logit contributions
aps+0.008
apping+0.007
uttering+0.007
 This+0.007
apped+0.007
Neg logit contributions
 forgot-0.011
 punished-0.010
 cleaned-0.009
 realized-0.009
 regretted-0.009
Token Even
Token ID3412
Feature activation-2.53e+00

Pos logit contributions
uttering+0.090
aps+0.089
apping+0.083
ings+0.071
apped+0.068
Neg logit contributions
 forgot-0.124
 hoped-0.108
 punished-0.106
 learned-0.104
 realized-0.104
Token though
Token ID996
Feature activation+4.59e-01

Pos logit contributions
 forgot+0.057
 punished+0.050
 cleaned+0.045
 sighed+0.045
 regretted+0.044
Neg logit contributions
aps-0.060
apping-0.053
uttering-0.052
 This-0.050
apped-0.047
Token she
Token ID673
Feature activation-1.04e+01

Pos logit contributions
aps+0.011
apping+0.010
uttering+0.010
 This+0.009
apped+0.009
Neg logit contributions
 forgot-0.011
 punished-0.009
 cleaned-0.008
 sighed-0.008
 realized-0.008
Token had
Token ID550
Feature activation-3.49e+00

Pos logit contributions
 nurses+0.169
 punished+0.163
 forgot+0.160
 hospital+0.139
 wounds+0.139
Neg logit contributions
aps-0.240
 This-0.217
apping-0.213
uttering-0.207
uts-0.206
Token forgotten
Token ID11564
Feature activation+1.57e+00

Pos logit contributions
 forgot+0.074
 punished+0.065
 sighed+0.060
 nurses+0.059
 cleaned+0.058
Neg logit contributions
aps-0.078
 This-0.071
uttering-0.067
apping-0.067
uts-0.063
Token it
Token ID340
Feature activation-3.60e+00

Pos logit contributions
aps+0.033
apping+0.029
uttering+0.028
 This+0.027
apped+0.025
Neg logit contributions
 forgot-0.040
 punished-0.036
 cleaned-0.033
 sighed-0.032
 regretted-0.032
Token,
Token ID11
Feature activation-1.50e+00

Pos logit contributions
 forgot+0.087
 punished+0.079
 cleaned+0.071
 sighed+0.070
 regretted+0.070
Neg logit contributions
aps-0.077
apping-0.068
 This-0.067
uttering-0.067
uts-0.060
Token everything
Token ID2279
Feature activation-1.91e+00

Pos logit contributions
 forgot+0.039
 punished+0.035
 cleaned+0.032
 sighed+0.031
 regretted+0.031
Neg logit contributions
aps-0.031
apping-0.027
uttering-0.026
 This-0.026
apped-0.024
Token mess
Token ID2085
Feature activation+5.50e+00

Pos logit contributions
 forgot+0.057
 punished+0.051
 sighed+0.046
 cleaned+0.044
 regretted+0.044
Neg logit contributions
aps-0.066
apping-0.057
uttering-0.056
 This-0.055
ings-0.051
Token again
Token ID757
Feature activation+8.29e-01

Pos logit contributions
aps+0.119
uttering+0.103
apping+0.103
 This+0.099
apped+0.093
Neg logit contributions
 forgot-0.139
 punished-0.124
 cleaned-0.114
 sighed-0.111
 realized-0.110
Token.
Token ID13
Feature activation+4.64e+00

Pos logit contributions
aps+0.017
apping+0.014
uttering+0.014
apped+0.013
 This+0.013
Neg logit contributions
 forgot-0.023
 punished-0.021
 cleaned-0.019
 realized-0.019
 regretted-0.018
Token And
Token ID843
Feature activation-2.83e+00

Pos logit contributions
aps+0.080
uttering+0.071
apping+0.070
apped+0.058
ings+0.056
Neg logit contributions
 forgot-0.137
 punished-0.120
 cleaned-0.117
 realized-0.116
 learned-0.115
Token if
Token ID611
Feature activation-1.32e+00

Pos logit contributions
 forgot+0.072
 punished+0.064
 cleaned+0.058
 sighed+0.057
 realized+0.057
Neg logit contributions
aps-0.061
apping-0.054
uttering-0.053
 This-0.051
apped-0.047
Token she
Token ID673
Feature activation-1.04e+01

Pos logit contributions
 forgot+0.035
 punished+0.032
 cleaned+0.029
 sighed+0.029
 regretted+0.028
Neg logit contributions
aps-0.026
apping-0.022
 This-0.022
uttering-0.022
apped-0.019
Token did
Token ID750
Feature activation-2.00e-01

Pos logit contributions
 forgot+0.153
 punished+0.146
 nurses+0.145
 wounds+0.126
 hospital+0.122
Neg logit contributions
aps-0.270
 This-0.243
apping-0.242
apped-0.237
uts-0.228
Token accidentally
Token ID14716
Feature activation-9.14e-01

Pos logit contributions
 forgot+0.005
 punished+0.005
 cleaned+0.004
 sighed+0.004
 realized+0.004
Neg logit contributions
aps-0.004
apping-0.004
uttering-0.003
 This-0.003
apped-0.003
Token knock
Token ID10643
Feature activation+6.44e+00

Pos logit contributions
 forgot+0.014
 punished+0.011
 regretted+0.010
 sighed+0.010
 realized+0.009
Neg logit contributions
aps-0.027
 This-0.024
apping-0.024
uttering-0.023
ges-0.023
Token over
Token ID625
Feature activation+2.50e+00

Pos logit contributions
aps+0.137
apping+0.120
uttering+0.117
 This+0.114
apped+0.106
Neg logit contributions
 forgot-0.165
 punished-0.146
 cleaned-0.132
 sighed-0.131
 regretted-0.131
Token her
Token ID607
Feature activation-5.14e+00

Pos logit contributions
aps+0.057
apping+0.050
uttering+0.048
 This+0.048
apped+0.044
Neg logit contributions
 forgot-0.061
 punished-0.054
 cleaned-0.048
 regretted-0.048
 sighed-0.048
Token little
Token ID1310
Feature activation-1.62e+00

Pos logit contributions
 forgot+0.113
 punished+0.094
 hoped+0.086
 realized+0.086
 sighed+0.084
Neg logit contributions
aps-0.118
apping-0.107
uttering-0.104
 This-0.100
apped-0.096
Token girl
Token ID2576
Feature activation-4.51e+00

Pos logit contributions
 forgot+0.033
 punished+0.028
 realized+0.024
 cleaned+0.024
 regretted+0.024
Neg logit contributions
aps-0.042
apping-0.037
 This-0.037
uttering-0.036
apped-0.035
Token.
Token ID13
Feature activation+4.69e+00

Pos logit contributions
 forgot+0.111
 punished+0.099
 cleaned+0.089
 sighed+0.088
 regretted+0.087
Neg logit contributions
aps-0.101
apping-0.089
uttering-0.087
 This-0.084
apped-0.079
Token Even
Token ID3412
Feature activation-3.27e+00

Pos logit contributions
aps+0.100
uttering+0.099
apping+0.090
ings+0.082
ectar+0.080
Neg logit contributions
 forgot-0.104
 punished-0.093
 sighed-0.092
 cleaned-0.091
 realized-0.090
Token though
Token ID996
Feature activation-7.12e-02

Pos logit contributions
 forgot+0.075
 punished+0.064
 cleaned+0.058
 sighed+0.058
 realized+0.057
Neg logit contributions
aps-0.077
apping-0.069
uttering-0.067
 This-0.063
apped-0.061
Token she
Token ID673
Feature activation-1.04e+01

Pos logit contributions
 forgot+0.002
 punished+0.002
 cleaned+0.001
 sighed+0.001
 regretted+0.001
Neg logit contributions
aps-0.002
apping-0.001
uttering-0.001
 This-0.001
apped-0.001
Token was
Token ID373
Feature activation-4.10e+00

Pos logit contributions
 nurses+0.186
 punished+0.172
 stret+0.165
 forgot+0.159
 hospital+0.157
Neg logit contributions
aps-0.216
apping-0.200
 This-0.190
apped-0.190
uts-0.188
Token very
Token ID845
Feature activation-3.84e+00

Pos logit contributions
 forgot+0.081
 punished+0.068
 sighed+0.061
 regretted+0.060
 cleaned+0.060
Neg logit contributions
aps-0.110
apping-0.101
uttering-0.098
 This-0.094
apped-0.089
Token small
Token ID1402
Feature activation+1.35e+00

Pos logit contributions
 forgot+0.071
 punished+0.061
 cleaned+0.053
 sighed+0.052
 regretted+0.051
Neg logit contributions
aps-0.109
apping-0.100
uttering-0.097
 This-0.093
apped-0.090
Token,
Token ID11
Feature activation-2.05e+00

Pos logit contributions
aps+0.027
apping+0.023
uttering+0.023
 This+0.022
apped+0.020
Neg logit contributions
 forgot-0.036
 punished-0.033
 cleaned-0.030
 sighed-0.030
 regretted-0.029
Token she
Token ID673
Feature activation-8.42e+00

Pos logit contributions
 forgot+0.054
 punished+0.050
 cleaned+0.044
 sighed+0.043
 regretted+0.043
Neg logit contributions
aps-0.039
apping-0.034
 This-0.033
uttering-0.033
apped-0.030
Token's
Token ID338
Feature activation-1.68e+00

Pos logit contributions
 forgot+0.080
 punished+0.068
 cleaned+0.059
 sighed+0.058
 regretted+0.057
Neg logit contributions
aps-0.130
apping-0.118
uttering-0.116
 This-0.113
apped-0.108
Token a
Token ID257
Feature activation-1.53e+00

Pos logit contributions
 forgot+0.035
 punished+0.031
 cleaned+0.027
 sighed+0.026
 realized+0.026
Neg logit contributions
aps-0.045
apping-0.040
uttering-0.040
 This-0.038
apped-0.036
Token great
Token ID1049
Feature activation-1.25e+00

Pos logit contributions
 forgot+0.033
 punished+0.029
 cleaned+0.026
 sighed+0.026
 regretted+0.026
Neg logit contributions
aps-0.036
apping-0.033
 This-0.032
uttering-0.032
ay-0.029
Token poem
Token ID21247
Feature activation+3.50e+00

Pos logit contributions
 forgot+0.026
 punished+0.023
 cleaned+0.020
 realized+0.020
 sighed+0.020
Neg logit contributions
aps-0.032
apping-0.029
uttering-0.029
 This-0.028
apped-0.026
Token!"
Token ID2474
Feature activation+5.55e+00

Pos logit contributions
aps+0.070
uttering+0.064
apping+0.061
apped+0.053
 This+0.053
Neg logit contributions
 forgot-0.093
 punished-0.082
 cleaned-0.077
 sighed-0.074
 realized-0.074
Token<|endoftext|>
Token ID50256
Feature activation+8.83e+00

Pos logit contributions
aps+0.098
uttering+0.094
apping+0.086
uts+0.072
ectar+0.071
Neg logit contributions
 forgot-0.153
 punished-0.139
 realized-0.132
 sighed-0.132
 cleaned-0.132
TokenOnce
Token ID7454
Feature activation+1.10e+00

Pos logit contributions
uttering+0.203
apping+0.186
aps+0.183
uts+0.174
itted+0.167
Neg logit contributions
 forgot-0.171
 felt-0.157
 learned-0.152
 sighed-0.147
 cleaned-0.146
Token upon
Token ID2402
Feature activation+7.12e-01

Pos logit contributions
aps+0.021
apping+0.019
uttering+0.018
 This+0.017
uts+0.016
Neg logit contributions
 forgot-0.030
 punished-0.026
 cleaned-0.024
 sighed-0.024
 realized-0.024
Token a
Token ID257
Feature activation-1.11e+00

Pos logit contributions
aps+0.017
apping+0.015
uttering+0.015
 This+0.014
apped+0.014
Neg logit contributions
 forgot-0.016
 punished-0.014
 cleaned-0.013
 realized-0.013
 sighed-0.013
Token time
Token ID640
Feature activation+2.53e+00

Pos logit contributions
 forgot+0.025
 punished+0.023
 cleaned+0.020
 regretted+0.020
 sighed+0.020
Neg logit contributions
aps-0.026
apping-0.024
uttering-0.023
 This-0.022
apped-0.021
Token,
Token ID11
Feature activation-3.94e-01

Pos logit contributions
aps+0.066
uttering+0.061
apping+0.057
itted+0.056
oga+0.054
Neg logit contributions
 forgot-0.048
 didn-0.036
 felt-0.035
 sighed-0.034
 realized-0.034
Token a
Token ID257
Feature activation-1.02e+00

Pos logit contributions
 forgot+0.035
 punished+0.031
 sighed+0.027
 regretted+0.027
 cleaned+0.027
Neg logit contributions
aps-0.043
apping-0.038
uttering-0.038
 This-0.036
apped-0.035
Token very
Token ID845
Feature activation-3.10e+00

Pos logit contributions
 forgot+0.023
 punished+0.021
 sighed+0.019
 regretted+0.018
 nurses+0.018
Neg logit contributions
aps-0.022
apping-0.020
uttering-0.020
 This-0.018
 Yes-0.018
Token kind
Token ID1611
Feature activation+2.53e+00

Pos logit contributions
 forgot+0.063
 punished+0.059
 nurses+0.053
 sighed+0.052
 regretted+0.052
Neg logit contributions
aps-0.077
apping-0.069
uttering-0.068
 This-0.065
apped-0.061
Token vine
Token ID17793
Feature activation+1.91e+00

Pos logit contributions
aps+0.055
apping+0.049
uttering+0.047
 This+0.046
apped+0.043
Neg logit contributions
 forgot-0.063
 punished-0.057
 cleaned-0.051
 sighed-0.051
 regretted-0.050
Token.
Token ID13
Feature activation+5.60e+00

Pos logit contributions
aps+0.033
apping+0.028
uttering+0.027
 This+0.026
apped+0.024
Neg logit contributions
 forgot-0.056
 punished-0.051
 cleaned-0.047
 sighed-0.047
 realized-0.046
Token<|endoftext|>
Token ID50256
Feature activation+8.99e+00

Pos logit contributions
uttering+0.106
aps+0.102
apping+0.097
ings+0.088
ummy+0.083
Neg logit contributions
 forgot-0.134
 hoped-0.121
 learned-0.116
 cleaned-0.112
 wished-0.112
TokenBen
Token ID11696
Feature activation-5.02e+00

Pos logit contributions
uttering+0.219
apping+0.201
uts+0.192
ower+0.187
aps+0.184
Neg logit contributions
 forgot-0.140
 learned-0.139
 felt-0.137
 had-0.134
 cleaned-0.131
Token and
Token ID290
Feature activation-7.45e+00

Pos logit contributions
 forgot+0.110
 punished+0.105
 nurses+0.092
 regretted+0.088
 sighed+0.088
Neg logit contributions
aps-0.115
 This-0.101
apping-0.099
uttering-0.094
apped-0.092
Token Mia
Token ID32189
Feature activation-7.39e+00

Pos logit contributions
 forgot+0.196
 punished+0.179
 sighed+0.161
 regretted+0.160
 cleaned+0.158
Neg logit contributions
aps-0.148
 This-0.128
apping-0.126
uttering-0.122
apped-0.116
Token are
Token ID389
Feature activation-2.06e+00

Pos logit contributions
 forgot+0.141
 punished+0.138
 nurses+0.125
 sighed+0.114
 regretted+0.113
Neg logit contributions
aps-0.176
 This-0.158
apping-0.152
apped-0.148
uttering-0.147
Token twins
Token ID20345
Feature activation-4.09e+00

Pos logit contributions
 forgot+0.043
 punished+0.039
 learned+0.036
 cleaned+0.035
 hoped+0.035
Neg logit contributions
aps-0.047
uts-0.045
 This-0.045
ings-0.043
uttering-0.043
Token.
Token ID13
Feature activation+5.76e+00

Pos logit contributions
 forgot+0.002
 punished+0.002
 learned+0.002
 cleaned+0.002
 hoped+0.001
Neg logit contributions
aps-0.002
uttering-0.002
 This-0.002
apping-0.002
 These-0.002
Token They
Token ID1119
Feature activation-7.56e+00

Pos logit contributions
uttering+0.117
aps+0.112
apping+0.109
ings+0.098
used+0.098
Neg logit contributions
 forgot-0.141
 learned-0.123
 cleaned-0.120
 hoped-0.120
 realized-0.118
Token are
Token ID389
Feature activation-2.90e+00

Pos logit contributions
 nurses+0.184
 punished+0.177
nie+0.177
 forgot+0.176
 sighed+0.173
Neg logit contributions
aps-0.116
 This-0.094
 Share-0.077
apped-0.074
icks-0.073
Token happy
Token ID3772
Feature activation+9.57e-01

Pos logit contributions
 forgot+0.067
 punished+0.061
 sighed+0.060
 nurses+0.058
 overs+0.057
Neg logit contributions
aps-0.065
apping-0.056
uttering-0.051
apped-0.051
 This-0.047
Token.
Token ID13
Feature activation+5.52e+00

Pos logit contributions
aps+0.020
uttering+0.020
 This+0.019
apping+0.018
ay+0.017
Neg logit contributions
 forgot-0.023
 punished-0.019
 realized-0.019
 cleaned-0.019
 wished-0.018
Token<|endoftext|>
Token ID50256
Feature activation+8.75e+00

Pos logit contributions
uttering+0.110
aps+0.106
apping+0.100
unch+0.093
ings+0.091
Neg logit contributions
 forgot-0.140
 learned-0.119
 hoped-0.116
 cleaned-0.116
 realized-0.115
TokenL
Token ID43
Feature activation-4.87e-01

Pos logit contributions
uttering+0.213
apping+0.197
uts+0.191
ower+0.183
iked+0.182
Neg logit contributions
 felt-0.125
 learned-0.125
 forgot-0.124
 cleaned-0.124
 saw-0.116
Tokenily
Token ID813
Feature activation-5.63e+00

Pos logit contributions
 forgot+0.012
 punished+0.010
 cleaned+0.009
 sighed+0.009
 regretted+0.009
Neg logit contributions
aps-0.011
apping-0.010
uttering-0.010
 This-0.009
apped-0.009
Token and
Token ID290
Feature activation-6.91e+00

Pos logit contributions
 forgot+0.129
 punished+0.122
 nurses+0.106
 sighed+0.104
 regretted+0.103
Neg logit contributions
aps-0.125
 This-0.110
apping-0.105
uttering-0.097
apped-0.097
Token Ben
Token ID3932
Feature activation-6.11e+00

Pos logit contributions
 forgot+0.164
 punished+0.145
 cleaned+0.131
 sighed+0.129
 realized+0.127
Neg logit contributions
aps-0.162
apping-0.144
uttering-0.141
 This-0.135
apped-0.127
Token are
Token ID389
Feature activation-2.04e+00

Pos logit contributions
 forgot+0.137
 punished+0.133
 nurses+0.119
 regretted+0.116
 sighed+0.116
Neg logit contributions
aps-0.125
 This-0.117
apping-0.102
 These-0.100
uttering-0.098
Token the
Token ID262
Feature activation-5.44e+00

Pos logit contributions
 forgot+0.085
 punished+0.080
 regretted+0.066
 nurses+0.063
 drowned+0.063
Neg logit contributions
aps-0.155
apping-0.138
apped-0.134
 This-0.133
uttering-0.132
Token little
Token ID1310
Feature activation-2.70e+00

Pos logit contributions
 forgot+0.132
 punished+0.117
 cleaned+0.103
 hoped+0.102
 regretted+0.102
Neg logit contributions
aps-0.122
apping-0.108
uttering-0.105
 This-0.102
apped-0.097
Token girl
Token ID2576
Feature activation-5.78e+00

Pos logit contributions
 forgot+0.047
 punished+0.040
 cleaned+0.033
 realized+0.032
 regretted+0.032
Neg logit contributions
aps-0.080
apping-0.073
uttering-0.071
 This-0.068
apped-0.067
Token watched
Token ID7342
Feature activation+7.12e+00

Pos logit contributions
 forgot+0.088
 punished+0.083
 nurses+0.070
 regretted+0.065
 regret+0.064
Neg logit contributions
aps-0.167
apping-0.153
apped-0.144
 This-0.141
uttering-0.141
Token in
Token ID287
Feature activation-1.41e+00

Pos logit contributions
aps+0.188
apping+0.165
uttering+0.164
 This+0.154
uts+0.153
Neg logit contributions
 forgot-0.132
 punished-0.122
 cleaned-0.115
 nurses-0.110
 comfort-0.109
Token amaz
Token ID40642
Feature activation+9.45e+00

Pos logit contributions
 forgot+0.030
 punished+0.027
 cleaned+0.024
 sighed+0.024
 realized+0.023
Neg logit contributions
aps-0.035
apping-0.032
uttering-0.032
 This-0.030
uts-0.029
Tokenement
Token ID972
Feature activation+5.41e+00

Pos logit contributions
aps+0.177
apping+0.153
uttering+0.149
 This+0.144
apped+0.130
Neg logit contributions
 forgot-0.259
 punished-0.238
 cleaned-0.220
 sighed-0.219
 realized-0.215
Token.
Token ID13
Feature activation+5.21e+00

Pos logit contributions
aps+0.117
 This+0.105
uttering+0.100
 These+0.100
imb+0.097
Neg logit contributions
 forgot-0.122
 punished-0.102
 realized-0.098
 sighed-0.098
 cleaned-0.096
Token Mama
Token ID40084
Feature activation-4.30e+00

Pos logit contributions
aps+0.106
uttering+0.100
apping+0.097
uts+0.085
ings+0.082
Neg logit contributions
 forgot-0.123
 realized-0.105
 punished-0.104
 cleaned-0.104
 sighed-0.103
Token's
Token ID338
Feature activation-4.16e+00

Pos logit contributions
 forgot+0.070
 punished+0.063
 regret+0.052
 nurses+0.052
 regretted+0.051
Neg logit contributions
aps-0.120
apping-0.111
apped-0.103
 This-0.102
uttering-0.098
Token fingers
Token ID9353
Feature activation+9.88e-01

Pos logit contributions
 forgot+0.095
 punished+0.083
 cleaned+0.072
 regretted+0.071
 realized+0.070
Neg logit contributions
aps-0.101
apping-0.091
uttering-0.088
 This-0.085
apped-0.082
Token made
Token ID925
Feature activation+3.75e+00

Pos logit contributions
 forgot+0.037
 punished+0.034
 nurses+0.033
 sighed+0.031
 regret+0.030
Neg logit contributions
aps-0.040
 This-0.035
apping-0.034
apped-0.033
uttering-0.032
Token an
Token ID281
Feature activation-1.35e+00

Pos logit contributions
aps+0.077
apping+0.067
uttering+0.065
 This+0.063
apped+0.058
Neg logit contributions
 forgot-0.100
 punished-0.090
 cleaned-0.082
 sighed-0.081
 regretted-0.080
Token impressive
Token ID8036
Feature activation-9.15e-01

Pos logit contributions
 forgot+0.030
 punished+0.024
 sighed+0.023
 regretted+0.022
 cleaned+0.022
Neg logit contributions
aps-0.032
apping-0.030
uttering-0.030
 This-0.028
ummy-0.028
Token sandwich
Token ID20433
Feature activation+3.96e+00

Pos logit contributions
 forgot+0.019
 punished+0.017
 cleaned+0.015
 realized+0.014
 sighed+0.014
Neg logit contributions
aps-0.024
apping-0.021
uttering-0.021
 This-0.020
apped-0.019
Token.
Token ID13
Feature activation+5.63e+00

Pos logit contributions
aps+0.078
uttering+0.069
apping+0.068
 This+0.067
 These+0.060
Neg logit contributions
 forgot-0.105
 punished-0.091
 cleaned-0.086
 realized-0.083
 hoped-0.082
Token<|endoftext|>
Token ID50256
Feature activation+9.32e+00

Pos logit contributions
uttering+0.127
aps+0.111
apping+0.107
ings+0.100
iked+0.096
Neg logit contributions
 forgot-0.126
 hoped-0.112
 learned-0.109
 cleaned-0.108
 wished-0.105
TokenAnna
Token ID31160
Feature activation-5.90e+00

Pos logit contributions
uttering+0.230
apping+0.208
uts+0.194
ower+0.190
itted+0.189
Neg logit contributions
 learned-0.151
 forgot-0.148
 felt-0.142
 cleaned-0.142
 had-0.137
Token and
Token ID290
Feature activation-7.67e+00

Pos logit contributions
 forgot+0.127
 punished+0.125
 nurses+0.116
 regret+0.104
 sighed+0.103
Neg logit contributions
aps-0.133
 This-0.113
apping-0.111
apped-0.107
uttering-0.103
Token Ben
Token ID3932
Feature activation-6.24e+00

Pos logit contributions
 forgot+0.194
 punished+0.176
 cleaned+0.157
 regretted+0.156
 sighed+0.156
Neg logit contributions
aps-0.164
apping-0.143
 This-0.138
uttering-0.138
apped-0.128
Token are
Token ID389
Feature activation-1.94e+00

Pos logit contributions
 punished+0.113
 forgot+0.111
 nurses+0.106
 wounds+0.092
 sighed+0.091
Neg logit contributions
aps-0.148
 This-0.138
apped-0.127
apping-0.126
uttering-0.121
Token twins
Token ID20345
Feature activation-4.37e+00

Pos logit contributions
 forgot+0.047
 punished+0.043
 cleaned+0.040
 learned+0.038
 hoped+0.038
Neg logit contributions
aps-0.041
 This-0.037
uts-0.036
apping-0.036
uttering-0.035
Token but
Token ID475
Feature activation-2.96e+00

Pos logit contributions
 forgot+0.037
 punished+0.034
 regretted+0.030
 nurses+0.030
 cleaned+0.030
Neg logit contributions
aps-0.032
apping-0.029
uttering-0.028
 This-0.027
apped-0.025
Token she
Token ID673
Feature activation-8.97e+00

Pos logit contributions
 forgot+0.072
 punished+0.065
 cleaned+0.058
 sighed+0.057
 regretted+0.057
Neg logit contributions
aps-0.066
apping-0.059
uttering-0.056
 This-0.055
apped-0.052
Token was
Token ID373
Feature activation-2.52e+00

Pos logit contributions
 nurses+0.143
 rubble+0.126
 punished+0.122
 colder+0.119
 forgot+0.117
Neg logit contributions
aps-0.218
apping-0.203
apped-0.199
uts-0.173
uttering-0.171
Token still
Token ID991
Feature activation-4.83e+00

Pos logit contributions
 forgot+0.050
 punished+0.041
 regretted+0.038
 sighed+0.038
 nurses+0.037
Neg logit contributions
aps-0.067
apping-0.062
uttering-0.058
 This-0.056
apped-0.054
Token troubled
Token ID17840
Feature activation+4.14e+00

Pos logit contributions
 forgot+0.095
 punished+0.083
 cleaned+0.072
 sighed+0.070
 realized+0.070
Neg logit contributions
aps-0.132
apping-0.119
uttering-0.117
 This-0.114
apped-0.108
Token.
Token ID13
Feature activation+5.29e+00

Pos logit contributions
aps+0.090
uttering+0.080
apping+0.078
 This+0.073
apped+0.071
Neg logit contributions
 forgot-0.104
 punished-0.093
 cleaned-0.085
 regretted-0.085
 realized-0.085
Token She
Token ID1375
Feature activation-9.03e+00

Pos logit contributions
aps+0.105
uttering+0.101
apping+0.095
ings+0.084
apped+0.079
Neg logit contributions
 forgot-0.141
 realized-0.116
 punished-0.115
 learned-0.114
 hoped-0.114
Token looked
Token ID3114
Feature activation-1.60e-01

Pos logit contributions
 nurses+0.131
 colder+0.121
 rubble+0.119
wy+0.115
 wounds+0.112
Neg logit contributions
aps-0.189
apped-0.180
apping-0.172
 This-0.167
 with-0.157
Token around
Token ID1088
Feature activation+2.39e+00

Pos logit contributions
 forgot+0.005
 punished+0.004
 cleaned+0.004
 overs+0.004
 regretted+0.004
Neg logit contributions
aps-0.002
apping-0.002
uttering-0.002
 This-0.002
apped-0.002
Token and
Token ID290
Feature activation-5.67e+00

Pos logit contributions
aps+0.055
apping+0.048
uttering+0.047
 This+0.045
uts+0.043
Neg logit contributions
 forgot-0.057
 punished-0.050
 cleaned-0.046
 sighed-0.046
 realized-0.046
Token saw
Token ID2497
Feature activation+3.40e+00

Pos logit contributions
 punished+0.077
 forgot+0.075
 nurses+0.069
oscope+0.067
wy+0.064
Neg logit contributions
aps-0.143
uttering-0.138
 This-0.136
apping-0.136
apped-0.133
Token her
Token ID607
Feature activation-4.01e+00

Pos logit contributions
 forgot+0.030
 punished+0.027
 cleaned+0.024
 realized+0.023
 sighed+0.023
Neg logit contributions
aps-0.035
apping-0.031
uttering-0.031
 This-0.029
apped-0.028
Token.
Token ID13
Feature activation+5.32e+00

Pos logit contributions
 forgot+0.109
 punished+0.097
 cleaned+0.088
 realized+0.088
 sighed+0.087
Neg logit contributions
aps-0.079
apping-0.070
uttering-0.069
 This-0.066
apped-0.060
Token She
Token ID1375
Feature activation-9.16e+00

Pos logit contributions
uttering+0.124
aps+0.112
apping+0.112
apped+0.103
unch+0.101
Neg logit contributions
 forgot-0.106
 wished-0.095
 hoped-0.094
 regret-0.088
 sighed-0.087
Token liked
Token ID8288
Feature activation+4.59e+00

Pos logit contributions
 nurses+0.161
 punished+0.151
 secure+0.148
 colder+0.144
 rubble+0.142
Neg logit contributions
aps-0.185
apped-0.169
 This-0.156
apping-0.153
 Yes-0.153
Token her
Token ID607
Feature activation-4.87e+00

Pos logit contributions
aps+0.107
apping+0.096
uttering+0.093
 This+0.089
apped+0.085
Neg logit contributions
 forgot-0.108
 punished-0.096
 cleaned-0.086
 sighed-0.085
 realized-0.084
Token dress
Token ID6576
Feature activation+3.08e+00

Pos logit contributions
 forgot+0.100
 punished+0.087
 cleaned+0.077
 sighed+0.075
 regretted+0.074
Neg logit contributions
aps-0.129
apping-0.116
uttering-0.114
 This-0.111
apped-0.105
Token.
Token ID13
Feature activation+5.30e+00

Pos logit contributions
aps+0.058
apping+0.050
 This+0.047
uttering+0.046
ings+0.043
Neg logit contributions
 forgot-0.084
 punished-0.076
 cleaned-0.069
 regretted-0.069
 sighed-0.069
Token She
Token ID1375
Feature activation-9.08e+00

Pos logit contributions
uttering+0.129
apping+0.115
aps+0.111
iked+0.109
apped+0.105
Neg logit contributions
 forgot-0.093
 hoped-0.086
 wished-0.083
 regret-0.081
 learned-0.078
Token thought
Token ID1807
Feature activation+2.67e+00

Pos logit contributions
 nurses+0.172
 punished+0.160
 rubble+0.158
 secure+0.155
 colder+0.154
Neg logit contributions
aps-0.173
apped-0.157
apping-0.141
 Yes-0.141
 This-0.138
Token it
Token ID340
Feature activation-4.47e+00

Pos logit contributions
aps+0.066
uttering+0.059
apping+0.059
 This+0.054
apped+0.053
Neg logit contributions
 forgot-0.059
 punished-0.051
 cleaned-0.048
 sighed-0.047
 realized-0.046
Token was
Token ID373
Feature activation-4.41e-01

Pos logit contributions
 forgot+0.087
 punished+0.079
 nurses+0.071
 regretted+0.069
 realized+0.066
Neg logit contributions
aps-0.110
 This-0.100
apping-0.096
apped-0.093
uttering-0.093
Token\n
Token ID198
Feature activation+3.65e+00

Pos logit contributions
uttering+0.120
aps+0.110
apping+0.108
apped+0.096
owers+0.092
Neg logit contributions
 forgot-0.126
 punished-0.108
 hoped-0.105
 cleaned-0.103
 wished-0.102
Token\n
Token ID198
Feature activation+3.47e+00

Pos logit contributions
aps+0.060
uttering+0.054
apping+0.054
 This+0.046
apped+0.046
Neg logit contributions
 forgot-0.106
 punished-0.096
 cleaned-0.091
 realized-0.090
 sighed-0.090
TokenSpot
Token ID32565
Feature activation-5.36e+00

Pos logit contributions
aps+0.074
apping+0.065
uttering+0.063
 This+0.061
apped+0.056
Neg logit contributions
 forgot-0.089
 punished-0.080
 cleaned-0.072
 sighed-0.071
 regretted-0.071
Token ate
Token ID15063
Feature activation+5.46e+00

Pos logit contributions
 punished+0.096
 forgot+0.094
 nurses+0.090
 regret+0.084
 secure+0.082
Neg logit contributions
aps-0.128
 This-0.114
apping-0.110
uts-0.105
apped-0.104
Token the
Token ID262
Feature activation-4.70e+00

Pos logit contributions
aps+0.147
apping+0.136
uttering+0.135
 This+0.125
uts+0.121
Neg logit contributions
 forgot-0.107
 punished-0.095
 cleaned-0.086
 sighed-0.084
 realized-0.083
Token cookie
Token ID19751
Feature activation+2.50e+00

Pos logit contributions
 forgot+0.091
 punished+0.078
 cleaned+0.075
 sighed+0.069
 mess+0.067
Neg logit contributions
aps-0.124
apping-0.121
uttering-0.116
apped-0.106
uts-0.104
Token.
Token ID13
Feature activation+4.78e+00

Pos logit contributions
aps+0.053
uttering+0.047
apping+0.047
 This+0.043
apped+0.041
Neg logit contributions
 forgot-0.064
 punished-0.057
 cleaned-0.053
 sighed-0.051
 realized-0.050
Token Then
Token ID3244
Feature activation-1.04e+00

Pos logit contributions
uttering+0.111
aps+0.109
apping+0.105
apped+0.089
ectar+0.088
Neg logit contributions
 forgot-0.108
 punished-0.093
 cleaned-0.087
 hoped-0.086
 sighed-0.086
Token he
Token ID339
Feature activation-7.85e+00

Pos logit contributions
 forgot+0.024
 punished+0.022
 cleaned+0.019
 regretted+0.019
 sighed+0.019
Neg logit contributions
aps-0.024
apping-0.021
 This-0.021
uttering-0.020
apped-0.019
Token ran
Token ID4966
Feature activation+2.98e+00

Pos logit contributions
 punished+0.116
 forgot+0.115
 nurses+0.108
 regret+0.106
 secure+0.103
Neg logit contributions
aps-0.194
apped-0.166
apping-0.161
uts-0.160
 This-0.152
Token to
Token ID284
Feature activation-1.07e+00

Pos logit contributions
aps+0.075
apping+0.069
uttering+0.068
 This+0.061
unch+0.061
Neg logit contributions
 forgot-0.064
 punished-0.058
 cleaned-0.052
 realized-0.050
 sighed-0.050
Token a
Token ID257
Feature activation-1.85e+00

Pos logit contributions
 forgot+0.022
 punished+0.019
 cleaned+0.018
 sighed+0.017
 realized+0.017
Neg logit contributions
aps-0.029
apping-0.026
uttering-0.025
uts-0.024
 This-0.023
Token loud
Token ID7812
Feature activation-1.01e-01

Pos logit contributions
 forgot+0.037
 punished+0.033
 cleaned+0.028
 sighed+0.028
 hoped+0.028
Neg logit contributions
aps-0.047
apping-0.043
uttering-0.042
 This-0.041
apped-0.039
Token noise
Token ID7838
Feature activation+3.44e+00

Pos logit contributions
 forgot+0.002
 punished+0.002
 cleaned+0.001
 regretted+0.001
 sighed+0.001
Neg logit contributions
aps-0.003
apping-0.003
 This-0.002
uttering-0.002
apped-0.002
Token was
Token ID373
Feature activation-9.46e-01

Pos logit contributions
aps+0.090
apping+0.082
uttering+0.080
 This+0.079
apped+0.075
Neg logit contributions
 forgot-0.069
 punished-0.060
 cleaned-0.052
 sighed-0.052
 regretted-0.051
Token heard
Token ID2982
Feature activation+5.29e+00

Pos logit contributions
 forgot+0.022
 punished+0.020
 cleaned+0.018
 sighed+0.018
 regretted+0.017
Neg logit contributions
aps-0.022
apping-0.020
uttering-0.019
 This-0.018
apped-0.017
Token.
Token ID13
Feature activation+5.35e+00

Pos logit contributions
aps+0.113
uttering+0.098
apping+0.098
 This+0.089
apped+0.085
Neg logit contributions
 forgot-0.132
 punished-0.122
 cleaned-0.113
 sighed-0.111
 realized-0.111
Token It
Token ID632
Feature activation-5.86e+00

Pos logit contributions
aps+0.112
uttering+0.104
apping+0.097
ectar+0.092
unch+0.087
Neg logit contributions
 forgot-0.130
 sighed-0.117
 cleaned-0.116
 punished-0.113
 felt-0.112
Token was
Token ID373
Feature activation-7.97e-01

Pos logit contributions
 forgot+0.116
 punished+0.109
ppa+0.101
 nurses+0.096
 regretted+0.089
Neg logit contributions
aps-0.130
 This-0.120
apped-0.115
apping-0.114
uttering-0.108
Token coming
Token ID2406
Feature activation+4.76e+00

Pos logit contributions
 forgot+0.021
 punished+0.018
 cleaned+0.017
 sighed+0.017
 regretted+0.016
Neg logit contributions
aps-0.017
apping-0.015
uttering-0.014
 This-0.014
apped-0.013
Token from
Token ID422
Feature activation-8.73e-01

Pos logit contributions
aps+0.093
apping+0.080
uttering+0.078
 This+0.074
apped+0.069
Neg logit contributions
 forgot-0.131
 punished-0.120
 cleaned-0.110
 realized-0.107
 sighed-0.107
Token outside
Token ID2354
Feature activation+4.41e+00

Pos logit contributions
 forgot+0.022
 punished+0.019
 sighed+0.018
 regretted+0.017
 cleaned+0.017
Neg logit contributions
aps-0.019
uttering-0.016
apping-0.016
 This-0.016
apped-0.015
Token was
Token ID373
Feature activation-5.06e-01

Pos logit contributions
 forgot+0.076
 punished+0.069
 regretted+0.056
 cleaned+0.055
 nurses+0.055
Neg logit contributions
aps-0.122
 This-0.114
apping-0.110
uttering-0.107
apped-0.105
Token incredible
Token ID8082
Feature activation+1.86e+00

Pos logit contributions
 forgot+0.011
 punished+0.010
 sighed+0.009
 cleaned+0.009
 regretted+0.009
Neg logit contributions
aps-0.012
apping-0.011
 This-0.011
uttering-0.011
apped-0.010
Token.
Token ID13
Feature activation+5.14e+00

Pos logit contributions
aps+0.041
uttering+0.038
apping+0.037
unch+0.034
apped+0.033
Neg logit contributions
 forgot-0.045
 punished-0.039
 realized-0.037
 hoped-0.036
 cleaned-0.036
Token The
Token ID383
Feature activation-4.15e+00

Pos logit contributions
aps+0.100
uttering+0.095
apping+0.090
uts+0.077
ings+0.077
Neg logit contributions
 forgot-0.133
 punished-0.113
 sighed-0.111
 hoped-0.111
 realized-0.110
Token warrior
Token ID16491
Feature activation-3.14e+00

Pos logit contributions
 forgot+0.083
 punished+0.076
 cleaned+0.067
 sighed+0.064
 drowned+0.063
Neg logit contributions
aps-0.103
apping-0.093
uttering-0.091
 This-0.088
ay-0.083
Token showed
Token ID3751
Feature activation+5.06e+00

Pos logit contributions
 forgot+0.050
 punished+0.047
 nurses+0.040
 cleaned+0.039
 regret+0.037
Neg logit contributions
aps-0.084
apping-0.079
apped-0.076
 This-0.074
uttering-0.073
Token them
Token ID606
Feature activation-2.32e+00

Pos logit contributions
aps+0.110
 This+0.095
apping+0.093
uttering+0.088
apped+0.086
Neg logit contributions
 forgot-0.129
 punished-0.115
 cleaned-0.103
 regretted-0.102
 sighed-0.101
Token how
Token ID703
Feature activation+2.75e+00

Pos logit contributions
 forgot+0.054
 punished+0.048
 cleaned+0.044
 realized+0.043
 sighed+0.043
Neg logit contributions
aps-0.054
apping-0.049
uttering-0.048
 This-0.045
apped-0.043
Token it
Token ID340
Feature activation-4.71e+00

Pos logit contributions
aps+0.046
apping+0.039
uttering+0.037
 This+0.037
apped+0.032
Neg logit contributions
 forgot-0.082
 punished-0.075
 cleaned-0.069
 sighed-0.069
 regretted-0.068
Token worked
Token ID3111
Feature activation+4.35e+00

Pos logit contributions
 punished+0.086
 stret+0.078
 forgot+0.078
 nurses+0.077
ppa+0.075
Neg logit contributions
aps-0.100
apped-0.089
uts-0.085
 This-0.084
uttering-0.080
Token,
Token ID11
Feature activation-1.96e-01

Pos logit contributions
aps+0.087
apping+0.076
uttering+0.074
 This+0.070
apped+0.066
Neg logit contributions
 forgot-0.117
 punished-0.106
 cleaned-0.097
 realized-0.095
 sighed-0.095
Token.
Token ID13
Feature activation+5.61e+00

Pos logit contributions
aps+0.114
apping+0.099
uttering+0.096
 This+0.093
apped+0.086
Neg logit contributions
 forgot-0.156
 punished-0.143
 cleaned-0.132
 realized-0.127
 regretted-0.127
Token
Token ID198
Feature activation+4.13e+00

Pos logit contributions
aps+0.110
uttering+0.107
apping+0.098
ings+0.082
apped+0.082
Neg logit contributions
 forgot-0.149
 punished-0.130
 sighed-0.122
 hoped-0.122
 cleaned-0.122
Token
Token ID198
Feature activation+4.06e+00

Pos logit contributions
aps+0.065
uttering+0.057
apping+0.056
 This+0.050
apped+0.048
Neg logit contributions
 forgot-0.124
 punished-0.113
 cleaned-0.106
 sighed-0.105
 realized-0.105
TokenOne
Token ID3198
Feature activation+5.78e-01

Pos logit contributions
aps+0.093
apping+0.083
uttering+0.081
 This+0.078
apped+0.074
Neg logit contributions
 forgot-0.097
 punished-0.087
 cleaned-0.078
 sighed-0.077
 realized-0.076
Token time
Token ID640
Feature activation+2.87e+00

Pos logit contributions
aps+0.012
apping+0.011
uttering+0.010
 This+0.010
apped+0.009
Neg logit contributions
 forgot-0.015
 punished-0.014
 cleaned-0.012
 sighed-0.012
 regretted-0.012
Token,
Token ID11
Feature activation-1.28e+00

Pos logit contributions
aps+0.070
uttering+0.065
apping+0.063
unch+0.060
ectar+0.059
Neg logit contributions
 forgot-0.064
 punished-0.054
 cleaned-0.053
 sighed-0.050
 realized-0.050
Token the
Token ID262
Feature activation-4.90e+00

Pos logit contributions
 forgot+0.029
 punished+0.025
 cleaned+0.023
 sighed+0.022
 realized+0.022
Neg logit contributions
aps-0.032
apping-0.028
uttering-0.028
 This-0.026
apped-0.025
Token ball
Token ID2613
Feature activation+1.37e+00

Pos logit contributions
 forgot+0.113
 punished+0.101
 cleaned+0.089
 sighed+0.088
 regretted+0.087
Neg logit contributions
aps-0.116
apping-0.103
uttering-0.101
 This-0.097
apped-0.093
Token rolled
Token ID11686
Feature activation+6.22e+00

Pos logit contributions
aps+0.036
apping+0.034
uttering+0.033
apped+0.033
 This+0.032
Neg logit contributions
 forgot-0.023
 punished-0.021
 nurses-0.018
 cleaned-0.017
 regret-0.017
Token too
Token ID1165
Feature activation-2.90e-02

Pos logit contributions
aps+0.154
apping+0.137
uttering+0.132
 This+0.127
uts+0.120
Neg logit contributions
 forgot-0.139
 punished-0.124
 cleaned-0.113
 sighed-0.111
 realized-0.110
Token far
Token ID1290
Feature activation+3.82e+00

Pos logit contributions
 forgot+0.001
 punished+0.001
 cleaned+0.001
 sighed+0.001
 realized+0.001
Neg logit contributions
aps-0.001
apping-0.001
uttering-0.001
 This-0.001
apped-0.001
Token her
Token ID607
Feature activation-3.69e+00

Pos logit contributions
aps+0.010
apping+0.009
uttering+0.009
 This+0.008
apped+0.008
Neg logit contributions
 forgot-0.009
 punished-0.008
 cleaned-0.008
 realized-0.007
 sighed-0.007
Token.
Token ID13
Feature activation+4.73e+00

Pos logit contributions
 forgot+0.105
 punished+0.094
 cleaned+0.087
 regretted+0.085
 sighed+0.085
Neg logit contributions
aps-0.070
apping-0.061
uttering-0.060
 This-0.054
apped-0.053
Token
Token ID220
Feature activation+5.61e+00

Pos logit contributions
aps+0.093
uttering+0.088
apping+0.084
uts+0.074
apped+0.070
Neg logit contributions
 forgot-0.119
 cleaned-0.103
 hoped-0.100
 punished-0.099
 learned-0.099
Token
Token ID198
Feature activation+3.90e+00

Pos logit contributions
uttering+0.105
aps+0.104
apping+0.097
oga+0.087
apped+0.086
Neg logit contributions
 forgot-0.130
 punished-0.111
 cleaned-0.110
 hoped-0.109
 learned-0.105
Token
Token ID198
Feature activation+3.82e+00

Pos logit contributions
aps+0.071
uttering+0.064
apping+0.064
apped+0.058
uts+0.055
Neg logit contributions
 forgot-0.102
 punished-0.087
 cleaned-0.086
 learned-0.085
 sighed-0.085
TokenOne
Token ID3198
Feature activation-3.10e-02

Pos logit contributions
aps+0.084
apping+0.074
uttering+0.072
 This+0.069
apped+0.066
Neg logit contributions
 forgot-0.096
 punished-0.086
 cleaned-0.078
 sighed-0.076
 realized-0.076
Token morning
Token ID3329
Feature activation+3.13e+00

Pos logit contributions
 forgot+0.001
 punished+0.001
 cleaned+0.001
 sighed+0.001
 regretted+0.001
Neg logit contributions
aps-0.001
apping-0.001
uttering-0.001
 This-0.001
apped-0.001
Token,
Token ID11
Feature activation-8.30e-01

Pos logit contributions
aps+0.075
uttering+0.066
apping+0.065
uts+0.059
apped+0.055
Neg logit contributions
 forgot-0.076
 cleaned-0.062
 punished-0.062
 realized-0.061
 sighed-0.060
Token the
Token ID262
Feature activation-5.22e+00

Pos logit contributions
 forgot+0.018
 punished+0.015
 cleaned+0.014
 realized+0.014
 sighed+0.014
Neg logit contributions
aps-0.022
apping-0.020
uttering-0.019
 This-0.018
uts-0.018
Token little
Token ID1310
Feature activation-1.46e+00

Pos logit contributions
 forgot+0.117
 punished+0.106
 hoped+0.092
 drowned+0.091
 realized+0.091
Neg logit contributions
aps-0.117
apping-0.105
uttering-0.102
 This-0.100
ummy-0.098
Token girl
Token ID2576
Feature activation-5.02e+00

Pos logit contributions
 forgot+0.028
 punished+0.027
 drowned+0.022
 regretted+0.022
 sighed+0.021
Neg logit contributions
aps-0.038
apping-0.035
uttering-0.034
apped-0.032
 This-0.032
Token.
Token ID13
Feature activation+4.14e+00

Pos logit contributions
aps+0.035
uttering+0.031
apping+0.031
 This+0.028
apped+0.027
Neg logit contributions
 forgot-0.044
 punished-0.039
 cleaned-0.036
 realized-0.036
 regretted-0.035
Token Maybe
Token ID6674
Feature activation-1.02e+00

Pos logit contributions
aps+0.084
uttering+0.084
ings+0.074
apping+0.073
uts+0.066
Neg logit contributions
 forgot-0.101
 punished-0.086
 cleaned-0.085
 hoped-0.085
 realized-0.081
Token we
Token ID356
Feature activation-5.68e+00

Pos logit contributions
 forgot+0.021
 punished+0.018
 realized+0.017
 cleaned+0.016
 wished+0.016
Neg logit contributions
aps-0.028
apping-0.025
uttering-0.024
apped-0.022
 This-0.022
Token can
Token ID460
Feature activation+6.40e-01

Pos logit contributions
 forgot+0.122
 punished+0.112
 sighed+0.106
 nurses+0.097
 regretted+0.096
Neg logit contributions
aps-0.129
 This-0.115
apping-0.113
uttering-0.113
apped-0.109
Token make
Token ID787
Feature activation+3.07e+00

Pos logit contributions
aps+0.019
apping+0.017
uttering+0.017
 This+0.016
apped+0.016
Neg logit contributions
 forgot-0.011
 punished-0.010
 cleaned-0.008
 sighed-0.008
 regretted-0.008
Token some
Token ID617
Feature activation-1.58e+00

Pos logit contributions
aps+0.076
apping+0.070
uttering+0.069
 This+0.063
apped+0.062
Neg logit contributions
 forgot-0.065
 punished-0.059
 realized-0.052
 cleaned-0.051
 learned-0.050
Token delicious
Token ID12625
Feature activation+2.49e+00

Pos logit contributions
 forgot+0.027
 punished+0.025
 cleaned+0.021
 realized+0.020
 mess+0.020
Neg logit contributions
aps-0.044
apping-0.041
uttering-0.041
 This-0.040
ges-0.038
Token soup
Token ID17141
Feature activation+2.82e+00

Pos logit contributions
aps+0.066
uttering+0.063
apping+0.063
 This+0.058
apped+0.057
Neg logit contributions
 forgot-0.046
 punished-0.039
 mess-0.036
 realized-0.035
 cleaned-0.035
Token with
Token ID351
Feature activation+3.22e-01

Pos logit contributions
aps+0.054
apping+0.048
uttering+0.048
 This+0.045
apped+0.043
Neg logit contributions
 forgot-0.077
 punished-0.069
 cleaned-0.064
 realized-0.061
 hoped-0.060
Token it
Token ID340
Feature activation-2.48e+00

Pos logit contributions
aps+0.008
apping+0.007
uttering+0.007
 This+0.007
apped+0.006
Neg logit contributions
 forgot-0.007
 punished-0.006
 cleaned-0.006
 realized-0.006
 sighed-0.005
Token,"
Token ID553
Feature activation+3.48e+00

Pos logit contributions
 forgot+0.063
 punished+0.055
 cleaned+0.051
 realized+0.050
 sighed+0.048
Neg logit contributions
aps-0.056
apping-0.050
uttering-0.049
 This-0.044
apped-0.044
Token on
Token ID319
Feature activation+3.42e+00

Pos logit contributions
aps+0.027
apping+0.022
uttering+0.022
 This+0.021
apped+0.019
Neg logit contributions
 forgot-0.035
 punished-0.031
 cleaned-0.030
 realized-0.030
 sighed-0.029
Token her
Token ID607
Feature activation-4.23e+00

Pos logit contributions
aps+0.092
apping+0.085
uttering+0.083
apped+0.075
ectar+0.073
Neg logit contributions
 forgot-0.069
 punished-0.061
 cleaned-0.057
 realized-0.055
 sighed-0.054
Token face
Token ID1986
Feature activation+2.44e+00

Pos logit contributions
 forgot+0.079
 punished+0.068
 cleaned+0.059
 sighed+0.058
 realized+0.057
Neg logit contributions
aps-0.119
apping-0.108
uttering-0.106
 This-0.103
apped-0.099
Token and
Token ID290
Feature activation-5.27e+00

Pos logit contributions
aps+0.061
uttering+0.053
apping+0.052
 This+0.049
apped+0.048
Neg logit contributions
 forgot-0.053
 punished-0.046
 realized-0.044
 cleaned-0.044
 sighed-0.043
Token was
Token ID373
Feature activation-2.10e+00

Pos logit contributions
 forgot+0.092
 punished+0.081
 cleaned+0.069
 sighed+0.067
 regretted+0.066
Neg logit contributions
aps-0.150
apping-0.134
 This-0.131
uttering-0.130
apped-0.125
Token so
Token ID523
Feature activation-2.43e+00

Pos logit contributions
 forgot+0.046
 punished+0.040
 cleaned+0.036
 sighed+0.035
 regretted+0.035
Neg logit contributions
aps-0.053
apping-0.047
uttering-0.046
 This-0.045
apped-0.042
Token happy
Token ID3772
Feature activation+2.49e+00

Pos logit contributions
 forgot+0.044
 punished+0.038
 cleaned+0.033
 realized+0.032
 hoped+0.031
Neg logit contributions
aps-0.070
apping-0.064
uttering-0.063
 This-0.062
ings-0.060
Token to
Token ID284
Feature activation-2.50e+00

Pos logit contributions
aps+0.059
uttering+0.054
apping+0.053
apped+0.050
imb+0.048
Neg logit contributions
 forgot-0.055
 punished-0.045
 realized-0.044
 cleaned-0.042
 regretted-0.041
Token get
Token ID651
Feature activation+2.91e+00

Pos logit contributions
 forgot+0.046
 punished+0.038
 cleaned+0.034
 realized+0.033
 sighed+0.032
Neg logit contributions
aps-0.073
apping-0.067
uttering-0.067
 This-0.061
apped-0.061
Token her
Token ID607
Feature activation-5.96e+00

Pos logit contributions
aps+0.065
apping+0.058
uttering+0.056
 This+0.054
apped+0.051
Neg logit contributions
 forgot-0.072
 punished-0.065
 cleaned-0.059
 sighed-0.057
 realized-0.057
Token y
Token ID331
Feature activation+6.35e+00

Pos logit contributions
 forgot+0.123
 punished+0.107
 cleaned+0.095
 sighed+0.093
 regretted+0.092
Neg logit contributions
aps-0.157
apping-0.142
uttering-0.139
 This-0.134
apped-0.128
Token played
Token ID2826
Feature activation+6.48e+00

Pos logit contributions
 forgot+0.142
 punished+0.141
 sighed+0.116
 regretted+0.115
 nurses+0.114
Neg logit contributions
aps-0.186
apping-0.171
apped-0.169
uttering-0.169
 This-0.168
Token together
Token ID1978
Feature activation-2.58e-01

Pos logit contributions
aps+0.122
apping+0.106
uttering+0.104
 This+0.098
uts+0.093
Neg logit contributions
 forgot-0.178
 punished-0.159
 cleaned-0.151
 realized-0.146
 sighed-0.145
Token,
Token ID11
Feature activation-1.03e+00

Pos logit contributions
 forgot+0.006
 punished+0.005
 cleaned+0.004
 hoped+0.004
 learned+0.004
Neg logit contributions
aps-0.006
apping-0.006
uttering-0.006
 This-0.005
uts-0.005
Token ate
Token ID15063
Feature activation+6.29e+00

Pos logit contributions
 punished+0.025
 forgot+0.025
 regretted+0.022
 regret+0.022
 nurses+0.022
Neg logit contributions
aps-0.019
apping-0.017
 This-0.017
uttering-0.017
apped-0.015
Token together
Token ID1978
Feature activation+1.06e+00

Pos logit contributions
aps+0.172
apping+0.160
uttering+0.155
 This+0.148
apped+0.142
Neg logit contributions
 forgot-0.120
 punished-0.104
 cleaned-0.097
 realized-0.092
 sighed-0.091
Token,
Token ID11
Feature activation-1.22e-01

Pos logit contributions
aps+0.028
uttering+0.026
apping+0.025
 This+0.024
imb+0.023
Neg logit contributions
 forgot-0.020
 cleaned-0.016
 hoped-0.016
 learned-0.016
 punished-0.016
Token and
Token ID290
Feature activation-4.73e+00

Pos logit contributions
 forgot+0.003
 punished+0.003
 regretted+0.003
 nurses+0.003
 regret+0.003
Neg logit contributions
aps-0.002
apping-0.002
uttering-0.002
 This-0.002
apped-0.002
Token slept
Token ID21256
Feature activation+4.32e+00

Pos logit contributions
 forgot+0.092
 punished+0.092
 cleaned+0.075
 regretted+0.074
 sighed+0.074
Neg logit contributions
aps-0.110
apping-0.100
 This-0.099
apped-0.095
uttering-0.094
Token together
Token ID1978
Feature activation-1.29e+00

Pos logit contributions
aps+0.084
apping+0.073
uttering+0.071
 This+0.068
apped+0.064
Neg logit contributions
 forgot-0.118
 punished-0.107
 cleaned-0.097
 sighed-0.097
 regretted-0.096
Token.
Token ID13
Feature activation+4.86e+00

Pos logit contributions
 forgot+0.032
 punished+0.026
 cleaned+0.026
 hoped+0.026
 learned+0.025
Neg logit contributions
aps-0.029
uttering-0.026
apping-0.026
 This-0.023
imb-0.023
Token They
Token ID1119
Feature activation-7.55e+00

Pos logit contributions
uttering+0.106
aps+0.102
apping+0.096
uts+0.090
ings+0.089
Neg logit contributions
 forgot-0.107
 learned-0.094
 hoped-0.091
 cleaned-0.088
 also-0.085
Token
Token ID198
Feature activation+4.29e+00

Pos logit contributions
aps+0.091
apping+0.082
uttering+0.080
 This+0.075
apped+0.073
Neg logit contributions
 forgot-0.106
 punished-0.095
 cleaned-0.088
 realized-0.087
 sighed-0.087
TokenFrom
Token ID4863
Feature activation-2.03e+00

Pos logit contributions
aps+0.100
apping+0.089
uttering+0.087
 This+0.084
apped+0.079
Neg logit contributions
 forgot-0.101
 punished-0.090
 cleaned-0.081
 sighed-0.080
 realized-0.079
Token then
Token ID788
Feature activation-1.56e+00

Pos logit contributions
 forgot+0.046
 punished+0.040
 cleaned+0.036
 sighed+0.035
 regretted+0.035
Neg logit contributions
aps-0.049
apping-0.044
uttering-0.043
 This-0.042
apped-0.039
Token on
Token ID319
Feature activation-4.44e-01

Pos logit contributions
 forgot+0.043
 punished+0.040
 cleaned+0.037
 realized+0.037
 learned+0.036
Neg logit contributions
aps-0.029
uttering-0.025
apping-0.024
uts-0.022
ectar-0.021
Token,
Token ID11
Feature activation-6.26e-01

Pos logit contributions
 forgot+0.014
 punished+0.012
 cleaned+0.011
 sighed+0.011
 regretted+0.011
Neg logit contributions
aps-0.007
apping-0.006
 This-0.006
uttering-0.006
apped-0.005
Token Grace
Token ID16156
Feature activation-6.60e+00

Pos logit contributions
 forgot+0.016
 punished+0.015
 sighed+0.014
 cleaned+0.014
 regretted+0.014
Neg logit contributions
aps-0.012
 This-0.011
apping-0.011
uttering-0.010
apped-0.009
Token was
Token ID373
Feature activation-2.68e+00

Pos logit contributions
 forgot+0.106
 punished+0.100
 nurses+0.087
 sighed+0.084
 regret+0.081
Neg logit contributions
aps-0.175
apping-0.161
 This-0.158
apped-0.151
uts-0.148
Token always
Token ID1464
Feature activation-4.22e+00

Pos logit contributions
 forgot+0.055
 punished+0.045
 sighed+0.043
 regretted+0.043
 realized+0.040
Neg logit contributions
aps-0.066
apping-0.061
uttering-0.057
 This-0.057
apped-0.053
Token sure
Token ID1654
Feature activation-1.31e+00

Pos logit contributions
 forgot+0.091
 punished+0.080
 cleaned+0.071
 sighed+0.070
 regretted+0.069
Neg logit contributions
aps-0.106
apping-0.096
uttering-0.093
 This-0.090
apped-0.086
Token to
Token ID284
Feature activation-3.59e+00

Pos logit contributions
 forgot+0.038
 punished+0.035
 cleaned+0.032
 sighed+0.031
 regretted+0.031
Neg logit contributions
aps-0.024
apping-0.020
uttering-0.020
 This-0.019
apped-0.017
Token be
Token ID307
Feature activation-1.98e+00

Pos logit contributions
 forgot+0.066
 punished+0.056
 sighed+0.051
 regretted+0.050
 realized+0.049
Neg logit contributions
aps-0.098
apping-0.088
 This-0.087
uttering-0.086
apped-0.078
Token on
Token ID319
Feature activation+1.27e+00

Pos logit contributions
aps+0.065
apping+0.059
uttering+0.055
apped+0.052
uts+0.051
Neg logit contributions
 forgot-0.089
 punished-0.081
 cleaned-0.077
 realized-0.073
 learned-0.072
Token a
Token ID257
Feature activation-5.32e-01

Pos logit contributions
aps+0.028
apping+0.026
uttering+0.025
 This+0.023
apped+0.023
Neg logit contributions
 forgot-0.031
 punished-0.028
 cleaned-0.025
 realized-0.025
 sighed-0.025
Token big
Token ID1263
Feature activation+1.29e+00

Pos logit contributions
 forgot+0.011
 punished+0.009
 sighed+0.008
 regretted+0.008
 cleaned+0.008
Neg logit contributions
aps-0.014
uttering-0.013
 This-0.013
apping-0.013
 Today-0.011
Token adventure
Token ID8855
Feature activation+6.59e+00

Pos logit contributions
aps+0.032
 This+0.030
apping+0.028
uttering+0.028
 Today+0.027
Neg logit contributions
 forgot-0.027
 punished-0.024
 regretted-0.021
 sighed-0.020
 hoped-0.020
Token.
Token ID13
Feature activation+4.87e+00

Pos logit contributions
aps+0.133
uttering+0.112
apping+0.111
apped+0.103
 This+0.100
Neg logit contributions
 forgot-0.173
 punished-0.158
 cleaned-0.148
 realized-0.139
 sighed-0.139
Token He
Token ID679
Feature activation-8.12e+00

Pos logit contributions
aps+0.111
uttering+0.109
apping+0.103
ectar+0.092
apped+0.090
Neg logit contributions
 forgot-0.116
 punished-0.097
 sighed-0.092
 cleaned-0.091
 hoped-0.090
Token held
Token ID2714
Feature activation+2.85e+00

Pos logit contributions
 punished+0.154
 forgot+0.151
 regret+0.145
 rubble+0.143
 difference+0.139
Neg logit contributions
apped-0.132
aps-0.132
 This-0.121
rolled-0.119
 Inside-0.117
Token up
Token ID510
Feature activation+2.53e+00

Pos logit contributions
aps+0.059
apping+0.052
uttering+0.051
 This+0.048
apped+0.045
Neg logit contributions
 forgot-0.075
 punished-0.067
 cleaned-0.062
 sighed-0.061
 realized-0.060
Token his
Token ID465
Feature activation-5.83e+00

Pos logit contributions
aps+0.062
apping+0.055
uttering+0.054
 This+0.052
apped+0.049
Neg logit contributions
 forgot-0.058
 punished-0.051
 cleaned-0.046
 sighed-0.045
 realized-0.044
Token head
Token ID1182
Feature activation+2.36e+00

Pos logit contributions
 forgot+0.125
 punished+0.110
 cleaned+0.096
 realized+0.095
 sighed+0.094
Neg logit contributions
aps-0.149
apping-0.133
uttering-0.132
 This-0.127
apped-0.121
Token and
Token ID290
Feature activation-5.86e+00

Pos logit contributions
aps+0.057
apping+0.048
uttering+0.047
 This+0.047
ectar+0.045
Neg logit contributions
 forgot-0.052
 punished-0.046
 cleaned-0.044
 sighed-0.043
 realized-0.042
Token girl
Token ID2576
Feature activation-2.14e+00

Pos logit contributions
 forgot+0.002
 punished+0.002
 cleaned+0.001
 sighed+0.001
 hoped+0.001
Neg logit contributions
aps-0.003
apping-0.003
uttering-0.003
 This-0.003
apped-0.002
Token named
Token ID3706
Feature activation+2.23e+00

Pos logit contributions
 forgot+0.052
 punished+0.049
 cleaned+0.044
 regret+0.043
 wished+0.043
Neg logit contributions
aps-0.045
uttering-0.041
apping-0.039
uts-0.039
 This-0.039
Token Sue
Token ID26089
Feature activation-5.02e+00

Pos logit contributions
aps+0.055
uttering+0.051
apping+0.050
apped+0.045
 This+0.044
Neg logit contributions
 forgot-0.050
 punished-0.043
 cleaned-0.039
 sighed-0.038
 realized-0.037
Token.
Token ID13
Feature activation+4.44e+00

Pos logit contributions
 forgot+0.126
 punished+0.115
 cleaned+0.105
 realized+0.105
 sighed+0.105
Neg logit contributions
aps-0.109
uttering-0.097
apping-0.096
uts-0.089
apped-0.087
Token Sue
Token ID26089
Feature activation-7.25e+00

Pos logit contributions
aps+0.116
uttering+0.113
apping+0.109
utes+0.099
ectar+0.096
Neg logit contributions
 forgot-0.095
 punished-0.076
 sighed-0.074
 cleaned-0.073
 learned-0.071
Token also
Token ID635
Feature activation-7.72e+00

Pos logit contributions
 forgot+0.121
 punished+0.118
 nurses+0.112
 cleaned+0.095
 overs+0.095
Neg logit contributions
aps-0.185
 This-0.167
apping-0.166
apped-0.154
uts-0.150
Token had
Token ID550
Feature activation-1.48e+00

Pos logit contributions
 forgot+0.098
 punished+0.078
 cleaned+0.061
 sighed+0.059
 regretted+0.057
Neg logit contributions
aps-0.265
apping-0.245
uttering-0.241
 This-0.235
apped-0.227
Token a
Token ID257
Feature activation-1.93e+00

Pos logit contributions
 forgot+0.041
 punished+0.038
 sighed+0.034
 cleaned+0.034
 regretted+0.034
Neg logit contributions
aps-0.028
apping-0.023
 This-0.023
uttering-0.023
apped-0.020
Token twist
Token ID14528
Feature activation+2.64e+00

Pos logit contributions
 forgot+0.042
 punished+0.037
 cleaned+0.032
 sighed+0.032
 regretted+0.032
Neg logit contributions
aps-0.049
apping-0.043
uttering-0.043
 This-0.041
apped-0.039
Tokeny
Token ID88
Feature activation-1.96e+00

Pos logit contributions
aps+0.054
apping+0.047
uttering+0.045
 This+0.045
uts+0.041
Neg logit contributions
 forgot-0.068
 punished-0.061
 cleaned-0.057
 sighed-0.056
 regretted-0.055
Token toy
Token ID13373
Feature activation+1.38e+00

Pos logit contributions
 forgot+0.037
 punished+0.031
 cleaned+0.029
 sighed+0.028
 realized+0.028
Neg logit contributions
aps-0.054
apping-0.050
uttering-0.049
 This-0.046
uts-0.045
TokenBen
Token ID11696
Feature activation-6.06e+00

Pos logit contributions
aps+0.107
apping+0.096
uttering+0.095
 This+0.091
apped+0.087
Neg logit contributions
 forgot-0.117
 punished-0.103
 cleaned-0.094
 realized-0.094
 sighed-0.093
Token agreed
Token ID4987
Feature activation+2.46e+00

Pos logit contributions
 forgot+0.100
 punished+0.098
 nurses+0.097
 regret+0.084
 hospital+0.082
Neg logit contributions
aps-0.164
apping-0.145
apped-0.142
 This-0.131
uts-0.129
Token.
Token ID13
Feature activation+4.85e+00

Pos logit contributions
aps+0.045
uttering+0.039
apping+0.039
 This+0.036
apped+0.034
Neg logit contributions
 forgot-0.071
 punished-0.064
 cleaned-0.060
 sighed-0.059
 realized-0.058
Token He
Token ID679
Feature activation-8.51e+00

Pos logit contributions
uttering+0.115
aps+0.103
apping+0.100
owers+0.098
ings+0.093
Neg logit contributions
 forgot-0.103
 hoped-0.088
 sighed-0.083
 realized-0.083
 cleaned-0.082
Token wished
Token ID16555
Feature activation+3.60e+00

Pos logit contributions
 nurses+0.140
 punished+0.135
 forgot+0.131
 regret+0.127
 hospital+0.122
Neg logit contributions
aps-0.213
apped-0.189
apping-0.179
uts-0.175
uttering-0.169
Token he
Token ID339
Feature activation-7.84e+00

Pos logit contributions
aps+0.086
apping+0.076
uttering+0.074
 This+0.072
apped+0.068
Neg logit contributions
 forgot-0.083
 punished-0.074
 cleaned-0.066
 sighed-0.065
 regretted-0.065
Token could
Token ID714
Feature activation-5.15e-01

Pos logit contributions
 punished+0.156
 nurses+0.149
 mess+0.143
 forgot+0.142
 hospital+0.140
Neg logit contributions
aps-0.152
 This-0.137
uts-0.136
apped-0.133
 Yes-0.130
Token soar
Token ID48701
Feature activation+4.30e+00

Pos logit contributions
 forgot+0.010
 punished+0.008
 sighed+0.007
 regretted+0.007
 cleaned+0.007
Neg logit contributions
aps-0.014
apping-0.013
uttering-0.012
 This-0.012
uts-0.012
Token like
Token ID588
Feature activation+2.86e+00

Pos logit contributions
aps+0.093
apping+0.083
uttering+0.080
 This+0.077
apped+0.073
Neg logit contributions
 forgot-0.107
 punished-0.096
 cleaned-0.087
 realized-0.086
 sighed-0.085
Token the
Token ID262
Feature activation-4.52e+00

Pos logit contributions
aps+0.074
apping+0.067
uttering+0.065
 This+0.061
apped+0.060
Neg logit contributions
 forgot-0.060
 punished-0.053
 cleaned-0.048
 sighed-0.047
 realized-0.046
Token hawk
Token ID48710
Feature activation-1.30e+00

Pos logit contributions
 forgot+0.096
 punished+0.083
 cleaned+0.074
 sighed+0.072
 realized+0.070
Neg logit contributions
aps-0.116
uttering-0.104
apping-0.103
 This-0.101
apped-0.094
Token water
Token ID1660
Feature activation+2.52e+00

Pos logit contributions
 forgot+0.021
 punished+0.018
 overs+0.016
 regretted+0.015
 sighed+0.015
Neg logit contributions
aps-0.021
uttering-0.018
apping-0.018
 This-0.017
apped-0.017
Token,
Token ID11
Feature activation-2.31e+00

Pos logit contributions
aps+0.061
apping+0.054
uttering+0.053
 This+0.051
apped+0.048
Neg logit contributions
 forgot-0.056
 punished-0.049
 cleaned-0.046
 sighed-0.044
 realized-0.044
Token then
Token ID788
Feature activation-1.05e+00

Pos logit contributions
 forgot+0.054
 punished+0.052
 regretted+0.047
 regret+0.047
 overs+0.047
Neg logit contributions
aps-0.046
apping-0.041
uttering-0.039
 This-0.038
apped-0.038
Token scrub
Token ID27268
Feature activation+5.00e+00

Pos logit contributions
 forgot+0.020
 punished+0.017
 regretted+0.015
 sighed+0.014
 regret+0.014
Neg logit contributions
aps-0.028
apping-0.025
uttering-0.025
 This-0.024
apped-0.024
Tokenbed
Token ID3077
Feature activation+5.47e+00

Pos logit contributions
aps+0.081
apping+0.068
uttering+0.064
 This+0.064
apped+0.059
Neg logit contributions
 forgot-0.152
 punished-0.139
 regretted-0.128
 sighed-0.126
 cleaned-0.126
Token and
Token ID290
Feature activation-5.68e+00

Pos logit contributions
aps+0.123
apping+0.108
uttering+0.105
 This+0.103
apped+0.096
Neg logit contributions
 forgot-0.134
 punished-0.119
 regretted-0.106
 sighed-0.105
 cleaned-0.105
Token scrub
Token ID27268
Feature activation+4.96e+00

Pos logit contributions
 forgot+0.085
 punished+0.075
 regretted+0.064
 overs+0.060
 nurses+0.059
Neg logit contributions
aps-0.165
apping-0.151
uttering-0.146
 This-0.145
apped-0.143
Tokenbed
Token ID3077
Feature activation+5.67e+00

Pos logit contributions
aps+0.078
apping+0.065
uttering+0.061
 This+0.060
apped+0.055
Neg logit contributions
 forgot-0.154
 punished-0.141
 regretted-0.129
 cleaned-0.129
 sighed-0.129
Token until
Token ID1566
Feature activation-1.45e+00

Pos logit contributions
aps+0.125
apping+0.110
uttering+0.107
 This+0.104
apped+0.097
Neg logit contributions
 forgot-0.141
 punished-0.126
 cleaned-0.114
 sighed-0.112
 regretted-0.112
Token all
Token ID477
Feature activation-4.69e+00

Pos logit contributions
 forgot+0.034
 punished+0.030
 cleaned+0.027
 sighed+0.027
 regretted+0.027
Neg logit contributions
aps-0.034
apping-0.030
uttering-0.029
 This-0.028
apped-0.027
Token the
Token ID262
Feature activation-5.00e+00

Pos logit contributions
 forgot+0.118
 punished+0.108
 regretted+0.098
 sighed+0.096
 realized+0.095
Neg logit contributions
aps-0.095
apping-0.084
 This-0.080
uttering-0.075
apped-0.075