TOP ACTIVATIONS
MAX = 0.321

't know the rh in oc eros was wild . Can
he would take the av oc ados and scatter them around
am taking the rh in oc eros to a new home
give to the rh in oc eros . Lily walked to
it to the rh in oc eros . When he tasted
there was a rh in oc eros who was always naughty
that the bad rh in oc eros was now nice .
," said the rh in oc eros . The
't touch the rh in oc eros ! It might bite
to see the rh in oc eros . Tim my was
was a big rh in oc eros . He was troubled
She loved to eat av oc ados . One day ,
They find the rh in oc eros . They are gentle
turned out that the av oc ados were bad and made
" The rh in oc eros is in there ,
They saw the rh in oc eros . They thanked Mom
garden and admire his sun fl owers . They also see
loud . The rh in oc eros was so scared .
closer , the rh in oc eros started to tear up
and all the other av oc ados were very happy too

MIN ACTIVATIONS
MIN = -0.142

smiled . Max was happy he could help . From that
The boy was so glad he could lend the lion some
Max was so happy , he w agged his tail and
friends . He was glad he had been so loyal .
Tim my felt happy that he could make someone feel better
His mom said he had to take a bath
Tom my was glad he was with his friend .
Ben felt sorry that he had pushed Jane , so
so grateful and happy that he hugged the man . From
lawn . Tim was happy he realized his mistake and cleaned
penny . The man said he could help her find it
very afraid . He wished he had listened to Lily .
Sammy was so thankful that he had surrendered and accepted that
George was glad that he was able to help ,
. Officer Roy was happy he could help . <|endoftext|> Once
you , playful dog ," he said . Max
Tom was happy that he had chosen to watch the
nodded . He was happy he could balance on the bush
Sam was so happy that he named the bunny Honey Bunny
. He was so happy he could ride around the park

INTERVAL 0.205 - 0.321
CONTAINS 0.019%

can 't draw on the black board . It 's for
giving him too many ch ocol ates ." The king learned
would take care of the sw an . He would feed
," said the rh in oc eros . The
They come to the ostr ich pen . Ost ric

INTERVAL 0.090 - 0.205
CONTAINS 2.987%

find it . ” “ Don ’
family . He invited a beautiful princess , and she said
, Lily and the sweet re ind eer became good friends
the ant was carrying a little piece of food . Lily
big river where a serious crocod ile lived . He liked

INTERVAL -0.026 - 0.090
CONTAINS 67.541%

their little kid to the store to buy some nice things
hole in it . He was curious . He sw am
pepper plant kept on blo oming . Every day , more
Soon , all the other animals came to see Beef
read and write in many different languages . She could even

INTERVAL -0.142 - -0.026
CONTAINS 29.452%

tree was teacher Lisa . She was giving a lecture about
play in the garden . They have a big m op
. So she kept crawling and crawling and crawling
is old and sad . He has no fruits and no
the helmet and the gloves and climbed on the motorcycle .
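
The four INTERVAL blocks above appear to be equal-width bins spanning MIN to MAX. The sketch below checks that reading against the reported numbers. The bin count of 4 is an assumption inferred from the number of INTERVAL blocks, and `interval_edges` is a hypothetical helper, not part of the tool that produced this dump. Because the dump rounds to three decimals, the check uses a small tolerance.

```python
# Values copied from the MAX / MIN lines of this dump.
ACT_MIN = -0.142
ACT_MAX = 0.321
N_BINS = 4  # assumption: one bin per INTERVAL block


def interval_edges(lo, hi, n_bins):
    """Return the n_bins + 1 edges of equal-width bins over [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


edges = interval_edges(ACT_MIN, ACT_MAX, N_BINS)

# Edges as printed in the INTERVAL headers, lowest first.
reported = [-0.142, -0.026, 0.090, 0.205, 0.321]

# CONTAINS percentages, lowest interval first.
shares = [29.452, 67.541, 2.987, 0.019]

for computed, shown in zip(edges, reported):
    # Allow for the three-decimal rounding used in the dump.
    print(f"{computed:+.5f} vs {shown:+.3f}", abs(computed - shown) < 1e-3)

print("total share (%):", round(sum(shares), 3))
```

If the equal-width reading holds, every comparison prints `True` and the shares sum to roughly 100%, so nearly all sampled activations lie in the two lowest bins.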
Token't
Token ID470
Feature activation+7.35e-02

Pos logit contributions
 forgot+0.039
 managed+0.035
 missed+0.035
use+0.035
 never+0.034
Neg logit contributions
 films-0.031
ums-0.031
apping-0.030
 pals-0.030
aps-0.029
Token know
Token ID760
Feature activation+6.97e-02

Pos logit contributions
 films+0.163
ums+0.161
apping+0.160
 pals+0.159
aps+0.156
Neg logit contributions
 forgot-0.100
 managed-0.086
 missed-0.085
use-0.085
 never-0.082
Token the
Token ID262
Feature activation+1.42e-01

Pos logit contributions
ums+0.129
apping+0.126
aps+0.121
 films+0.119
 pals+0.118
Neg logit contributions
 forgot-0.126
 never-0.118
 would-0.113
 managed-0.111
 missed-0.110
Token rh
Token ID9529
Feature activation+2.27e-01

Pos logit contributions
apping+0.243
ums+0.243
 films+0.236
aps+0.234
 pals+0.230
Neg logit contributions
 forgot-0.264
use-0.237
 missed-0.237
 never-0.237
 managed-0.234
Tokenin
Token ID259
Feature activation+1.11e-01

Pos logit contributions
 films+0.407
ums+0.397
apping+0.385
 pals+0.384
aps+0.377
Neg logit contributions
 forgot-0.397
use-0.361
 managed-0.358
 missed-0.354
 never-0.354
Tokenoc
Token ID420
Feature activation+2.88e-01

Pos logit contributions
 films+0.146
ums+0.144
apping+0.142
 pals+0.140
aps+0.137
Neg logit contributions
 forgot-0.248
 managed-0.227
 missed-0.227
use-0.226
 never-0.222
Tokeneros
Token ID27498
Feature activation+1.10e-01

Pos logit contributions
 films+0.252
apping+0.234
aps+0.209
 pals+0.205
ums+0.187
Neg logit contributions
use-0.748
 forgot-0.736
 never-0.722
 managed-0.711
 wouldn-0.705
Token was
Token ID373
Feature activation+5.81e-02

Pos logit contributions
 films+0.248
ums+0.246
 pals+0.246
apping+0.244
aps+0.238
Neg logit contributions
 forgot-0.138
use-0.118
 managed-0.117
 missed-0.117
 never-0.111
Token wild
Token ID4295
Feature activation+1.12e-01

Pos logit contributions
 films+0.114
ums+0.113
apping+0.112
 pals+0.111
aps+0.109
Neg logit contributions
 forgot-0.093
 managed-0.082
 missed-0.082
use-0.081
 never-0.080
Token.
Token ID13
Feature activation+9.42e-02

Pos logit contributions
ums+0.224
utes+0.217
apping+0.215
aps+0.211
uff+0.205
Neg logit contributions
 would-0.174
 forgot-0.166
 had-0.166
 never-0.166
 needed-0.163
Token Can
Token ID1680
Feature activation+6.10e-02

Pos logit contributions
 films+0.164
ums+0.161
apping+0.158
 pals+0.157
ning+0.157
Neg logit contributions
 forgot-0.164
 never-0.156
 missed-0.155
Maybe-0.148
 had-0.147
Token he
Token ID339
Feature activation+1.00e-02

Pos logit contributions
ums+0.259
apping+0.256
aps+0.250
 films+0.247
 pals+0.242
Neg logit contributions
 forgot-0.161
 never-0.143
 had-0.140
 missed-0.139
 fell-0.138
Token would
Token ID561
Feature activation+3.35e-02

Pos logit contributions
 pals+0.025
 films+0.025
apping+0.024
ums+0.024
birds+0.024
Neg logit contributions
 forgot-0.008
use-0.007
 managed-0.007
 missed-0.006
Maybe-0.006
Token take
Token ID1011
Feature activation+6.52e-02

Pos logit contributions
 films+0.067
 pals+0.065
ums+0.065
apping+0.065
aps+0.064
Neg logit contributions
 forgot-0.051
use-0.044
 managed-0.044
 missed-0.044
Maybe-0.041
Token the
Token ID262
Feature activation+1.34e-01

Pos logit contributions
apping+0.121
ums+0.119
aps+0.113
 films+0.112
 pals+0.111
Neg logit contributions
 forgot-0.110
 never-0.106
 managed-0.104
 missed-0.101
use-0.101
Token av
Token ID1196
Feature activation+2.12e-01

Pos logit contributions
apping+0.222
ums+0.222
aps+0.213
 films+0.207
 pals+0.204
Neg logit contributions
 forgot-0.251
use-0.240
'd-0.230
 would-0.230
 managed-0.228
Tokenoc
Token ID420
Feature activation+2.86e-01

Pos logit contributions
 films+0.306
apping+0.292
ums+0.289
 pals+0.284
aps+0.280
Neg logit contributions
 forgot-0.447
use-0.429
 managed-0.423
 never-0.407
 missed-0.401
Tokenados
Token ID22484
Feature activation+7.72e-02

Pos logit contributions
 films+0.296
apping+0.274
aps+0.255
 pals+0.246
 sym+0.234
Neg logit contributions
use-0.695
 forgot-0.682
 managed-0.663
 never-0.660
 wouldn-0.643
Token and
Token ID290
Feature activation+3.89e-02

Pos logit contributions
 films+0.120
ums+0.119
apping+0.117
 pals+0.116
aps+0.113
Neg logit contributions
 forgot-0.154
 managed-0.140
 missed-0.139
use-0.139
 never-0.136
Token scatter
Token ID41058
Feature activation+5.64e-02

Pos logit contributions
 films+0.069
 pals+0.068
ums+0.068
apping+0.065
aps+0.064
Neg logit contributions
 forgot-0.067
 missed-0.061
use-0.060
 managed-0.060
 never-0.058
Token them
Token ID606
Feature activation+2.65e-02

Pos logit contributions
 films+0.093
ums+0.093
apping+0.092
 pals+0.090
aps+0.089
Neg logit contributions
 forgot-0.107
 managed-0.097
use-0.096
 missed-0.096
 never-0.095
Token around
Token ID1088
Feature activation+6.09e-02

Pos logit contributions
 films+0.043
ums+0.042
apping+0.042
 pals+0.041
aps+0.040
Neg logit contributions
 forgot-0.052
 managed-0.047
 missed-0.047
use-0.046
 never-0.046
Token am
Token ID716
Feature activation+7.34e-02

Pos logit contributions
aps+0.073
 pals+0.073
apping+0.072
ap+0.071
ums+0.069
Neg logit contributions
 forgot-0.038
 managed-0.033
use-0.032
Maybe-0.031
 missed-0.030
Token taking
Token ID2263
Feature activation+7.18e-02

Pos logit contributions
ums+0.145
 films+0.142
apping+0.139
 pals+0.137
ning+0.136
Neg logit contributions
 forgot-0.122
 never-0.110
 missed-0.109
 managed-0.107
 fell-0.101
Token the
Token ID262
Feature activation+1.45e-01

Pos logit contributions
apping+0.129
ums+0.129
 films+0.124
 pals+0.121
aps+0.120
Neg logit contributions
 forgot-0.125
 never-0.117
use-0.117
 managed-0.115
 missed-0.114
Token rh
Token ID9529
Feature activation+2.14e-01

Pos logit contributions
ums+0.235
apping+0.235
aps+0.226
 pals+0.221
 films+0.220
Neg logit contributions
 forgot-0.279
use-0.264
 never-0.252
 missed-0.250
 managed-0.250
Tokenin
Token ID259
Feature activation+1.16e-01

Pos logit contributions
 films+0.419
ums+0.405
 pals+0.393
apping+0.387
aps+0.384
Neg logit contributions
 forgot-0.320
use-0.299
 never-0.295
 managed-0.290
'd-0.278
Tokenoc
Token ID420
Feature activation+2.82e-01

Pos logit contributions
apping+0.193
ums+0.192
 films+0.189
aps+0.185
 pals+0.184
Neg logit contributions
 forgot-0.207
use-0.206
 never-0.203
 would-0.195
 fell-0.195
Tokeneros
Token ID27498
Feature activation+1.15e-01

Pos logit contributions
 films+0.271
apping+0.239
aps+0.218
 pals+0.214
 sym+0.197
Neg logit contributions
use-0.705
 never-0.678
 forgot-0.669
 wouldn-0.658
 managed-0.657
Token to
Token ID284
Feature activation+4.37e-02

Pos logit contributions
ums+0.196
 films+0.192
apping+0.188
aps+0.183
 pals+0.181
Neg logit contributions
 forgot-0.218
 never-0.202
use-0.199
 managed-0.198
 missed-0.195
Token a
Token ID257
Feature activation+8.79e-02

Pos logit contributions
apping+0.075
ums+0.074
 films+0.073
 pals+0.072
aps+0.071
Neg logit contributions
 forgot-0.081
 never-0.074
 missed-0.073
 managed-0.073
use-0.072
Token new
Token ID649
Feature activation+1.26e-01

Pos logit contributions
ums+0.157
apping+0.151
aps+0.148
 films+0.146
 pals+0.146
Neg logit contributions
 forgot-0.161
 never-0.148
use-0.146
 managed-0.145
 missed-0.145
Token home
Token ID1363
Feature activation+5.19e-02

Pos logit contributions
ums+0.218
apping+0.216
aps+0.205
 films+0.204
 pals+0.193
Neg logit contributions
 never-0.230
 forgot-0.227
use-0.214
 would-0.213
'd-0.209
Token give
Token ID1577
Feature activation+9.09e-02

Pos logit contributions
ums+0.179
apping+0.178
 films+0.177
 pals+0.173
aps+0.171
Neg logit contributions
 forgot-0.133
 missed-0.116
 managed-0.116
 never-0.116
use-0.115
Token to
Token ID284
Feature activation+4.61e-02

Pos logit contributions
apping+0.169
 films+0.168
ums+0.167
 pals+0.161
aps+0.161
Neg logit contributions
 forgot-0.157
use-0.141
 managed-0.141
 missed-0.140
 never-0.139
Token the
Token ID262
Feature activation+1.35e-01

Pos logit contributions
apping+0.098
ums+0.091
aps+0.088
uff+0.086
ning+0.084
Neg logit contributions
 forgot-0.077
 never-0.074
 managed-0.067
 would-0.066
 had-0.066
Token rh
Token ID9529
Feature activation+2.13e-01

Pos logit contributions
apping+0.204
ums+0.203
 films+0.198
aps+0.193
 pals+0.184
Neg logit contributions
 forgot-0.279
 never-0.252
use-0.251
 managed-0.251
 missed-0.249
Tokenin
Token ID259
Feature activation+1.07e-01

Pos logit contributions
 films+0.412
ums+0.402
 pals+0.389
apping+0.386
aters+0.378
Neg logit contributions
 forgot-0.330
use-0.309
 never-0.301
 managed-0.297
 missed-0.288
Tokenoc
Token ID420
Feature activation+2.81e-01

Pos logit contributions
apping+0.188
 films+0.182
ums+0.180
 pals+0.172
uff+0.171
Neg logit contributions
use-0.182
 forgot-0.181
 would-0.177
 never-0.175
 needed-0.173
Tokeneros
Token ID27498
Feature activation+1.02e-01

Pos logit contributions
 films+0.246
apping+0.235
aps+0.204
 pals+0.201
ums+0.184
Neg logit contributions
use-0.727
 never-0.700
 forgot-0.698
 managed-0.680
 wouldn-0.672
Token.
Token ID13
Feature activation+9.74e-02

Pos logit contributions
ums+0.173
apping+0.171
 films+0.170
uff+0.162
aps+0.160
Neg logit contributions
 forgot-0.190
 never-0.175
 would-0.174
 managed-0.173
use-0.173
Token Lily
Token ID20037
Feature activation+3.85e-02

Pos logit contributions
 films+0.179
ums+0.178
apping+0.175
 pals+0.171
ning+0.171
Neg logit contributions
 forgot-0.158
 missed-0.147
 never-0.147
Maybe-0.146
use-0.143
Token walked
Token ID6807
Feature activation+3.59e-02

Pos logit contributions
 films+0.074
 pals+0.074
ums+0.073
aps+0.071
apping+0.071
Neg logit contributions
 forgot-0.055
use-0.053
 managed-0.049
 missed-0.048
 forgotten-0.048
Token to
Token ID284
Feature activation+3.62e-02

Pos logit contributions
 films+0.063
ums+0.062
apping+0.062
 pals+0.061
aps+0.060
Neg logit contributions
 forgot-0.065
 managed-0.059
 missed-0.058
use-0.058
 never-0.057
Token it
Token ID340
Feature activation+1.52e-02

Pos logit contributions
apping+0.196
 films+0.190
ums+0.186
aps+0.182
 pals+0.176
Neg logit contributions
 forgot-0.161
 never-0.148
 managed-0.146
 missed-0.144
 wouldn-0.141
Token to
Token ID284
Feature activation+4.81e-02

Pos logit contributions
ums+0.029
apping+0.029
 films+0.028
uff+0.027
aps+0.027
Neg logit contributions
 forgot-0.026
 never-0.024
 managed-0.023
 missed-0.023
 fell-0.023
Token the
Token ID262
Feature activation+1.36e-01

Pos logit contributions
apping+0.087
ums+0.084
aps+0.082
 films+0.080
ning+0.079
Neg logit contributions
 forgot-0.088
 never-0.083
 missed-0.080
 managed-0.077
 fell-0.075
Token rh
Token ID9529
Feature activation+2.04e-01

Pos logit contributions
ums+0.214
apping+0.214
aps+0.204
 films+0.203
 pals+0.198
Neg logit contributions
 forgot-0.273
 never-0.251
 missed-0.248
use-0.244
 managed-0.240
Tokenin
Token ID259
Feature activation+1.06e-01

Pos logit contributions
 films+0.388
ums+0.375
 pals+0.364
apping+0.361
aps+0.351
Neg logit contributions
 forgot-0.332
 never-0.301
use-0.300
 managed-0.298
 missed-0.294
Tokenoc
Token ID420
Feature activation+2.72e-01

Pos logit contributions
 films+0.173
ums+0.171
apping+0.171
 pals+0.165
aps+0.163
Neg logit contributions
 forgot-0.195
 never-0.184
use-0.181
 would-0.177
 had-0.176
Tokeneros
Token ID27498
Feature activation+1.03e-01

Pos logit contributions
 films+0.223
apping+0.205
aps+0.178
 pals+0.177
ums+0.163
Neg logit contributions
use-0.717
 forgot-0.708
 never-0.702
 managed-0.685
 missed-0.682
Token.
Token ID13
Feature activation+9.42e-02

Pos logit contributions
 films+0.163
ums+0.162
apping+0.156
uff+0.153
 pals+0.149
Neg logit contributions
 forgot-0.206
 never-0.190
 missed-0.187
 managed-0.186
 would-0.185
Token When
Token ID1649
Feature activation+2.70e-02

Pos logit contributions
 films+0.166
ums+0.163
apping+0.161
ning+0.159
 pals+0.156
Neg logit contributions
 forgot-0.158
 never-0.150
 missed-0.149
Maybe-0.145
 had-0.144
Token he
Token ID339
Feature activation-3.79e-02

Pos logit contributions
 films+0.051
ums+0.050
apping+0.050
 pals+0.049
aps+0.048
Neg logit contributions
 forgot-0.046
 managed-0.040
 missed-0.040
use-0.040
 never-0.039
Token tasted
Token ID29187
Feature activation+7.35e-02

Pos logit contributions
use+0.044
Maybe+0.041
 forgot+0.040
Hmm+0.037
rated+0.036
Neg logit contributions
 pals-0.073
 films-0.072
 juggling-0.070
birds-0.069
berry-0.069
Token there
Token ID612
Feature activation+5.36e-02

Pos logit contributions
 films+0.127
ums+0.127
 pals+0.121
apping+0.119
aps+0.118
Neg logit contributions
 forgot-0.137
 managed-0.124
 missed-0.122
 never-0.122
use-0.120
Token was
Token ID373
Feature activation+7.15e-02

Pos logit contributions
apping+0.113
ums+0.112
 films+0.112
 pals+0.112
aps+0.110
Neg logit contributions
 forgot-0.075
use-0.065
 missed-0.064
 managed-0.064
 never-0.061
Token a
Token ID257
Feature activation+8.75e-02

Pos logit contributions
ums+0.135
apping+0.130
aps+0.127
 films+0.125
 pals+0.121
Neg logit contributions
 forgot-0.127
 never-0.126
 missed-0.120
 managed-0.118
 had-0.117
Token rh
Token ID9529
Feature activation+1.92e-01

Pos logit contributions
ums+0.141
 films+0.138
apping+0.137
aps+0.134
 pals+0.134
Neg logit contributions
 forgot-0.174
 missed-0.155
 managed-0.155
 never-0.152
use-0.152
Tokenin
Token ID259
Feature activation+8.35e-02

Pos logit contributions
 films+0.387
ums+0.378
 pals+0.362
aters+0.361
 sym+0.353
Neg logit contributions
 forgot-0.280
 never-0.261
 managed-0.257
use-0.251
 missed-0.245
Tokenoc
Token ID420
Feature activation+2.61e-01

Pos logit contributions
 films+0.139
apping+0.137
ums+0.135
aps+0.134
 pals+0.131
Neg logit contributions
 forgot-0.157
 never-0.146
 had-0.143
use-0.142
 missed-0.142
Tokeneros
Token ID27498
Feature activation+1.10e-01

Pos logit contributions
 films+0.265
apping+0.230
aps+0.209
 sym+0.202
utes+0.198
Neg logit contributions
use-0.635
 never-0.624
 managed-0.605
 forgot-0.605
 wouldn-0.597
Token who
Token ID508
Feature activation+3.67e-02

Pos logit contributions
ums+0.203
 films+0.202
aters+0.191
aps+0.190
uff+0.189
Neg logit contributions
 had-0.193
 never-0.190
 forgot-0.190
 needed-0.182
 fell-0.181
Token was
Token ID373
Feature activation+4.61e-02

Pos logit contributions
 pals+0.081
 films+0.081
ums+0.081
apping+0.080
aps+0.078
Neg logit contributions
 forgot-0.044
use-0.040
 managed-0.037
 missed-0.035
Maybe-0.034
Token always
Token ID1464
Feature activation+2.36e-03

Pos logit contributions
apping+0.101
aps+0.100
 films+0.100
ums+0.100
utes+0.096
Neg logit contributions
 never-0.066
 missed-0.061
 forgot-0.059
 had-0.058
 needed-0.054
Token naughty
Token ID45128
Feature activation+9.39e-02

Pos logit contributions
ums+0.004
 films+0.004
aps+0.004
apping+0.004
 pals+0.004
Neg logit contributions
 forgot-0.004
 managed-0.004
 never-0.004
 missed-0.004
 had-0.004
Token that
Token ID326
Feature activation+3.77e-02

Pos logit contributions
ums+0.214
 films+0.209
aps+0.193
uff+0.191
aters+0.191
Neg logit contributions
 forgot-0.173
 never-0.164
 had-0.158
 would-0.153
 fell-0.151
Token the
Token ID262
Feature activation+1.13e-01

Pos logit contributions
ums+0.076
 films+0.076
apping+0.075
 pals+0.074
aps+0.074
Neg logit contributions
 forgot-0.059
 managed-0.051
 never-0.051
 missed-0.051
use-0.049
Token bad
Token ID2089
Feature activation+1.12e-01

Pos logit contributions
ums+0.182
apping+0.176
aps+0.172
 films+0.171
ning+0.166
Neg logit contributions
 forgot-0.223
 missed-0.201
 never-0.201
 managed-0.198
 had-0.197
Token rh
Token ID9529
Feature activation+1.89e-01

Pos logit contributions
 films+0.193
ums+0.192
apping+0.189
 pals+0.187
aps+0.185
Neg logit contributions
 forgot-0.207
 managed-0.184
 missed-0.184
use-0.183
 never-0.180
Tokenin
Token ID259
Feature activation+7.71e-02

Pos logit contributions
 films+0.344
ums+0.337
 pals+0.322
apping+0.322
aps+0.319
Neg logit contributions
 forgot-0.331
 managed-0.297
 never-0.295
 missed-0.295
use-0.289
Tokenoc
Token ID420
Feature activation+2.57e-01

Pos logit contributions
 films+0.105
ums+0.103
 pals+0.102
apping+0.101
aps+0.098
Neg logit contributions
 forgot-0.168
 managed-0.154
use-0.153
 missed-0.153
 never-0.149
Tokeneros
Token ID27498
Feature activation+7.96e-02

Pos logit contributions
 films+0.213
apping+0.187
aps+0.170
 pals+0.170
ums+0.160
Neg logit contributions
 forgot-0.687
use-0.672
 never-0.664
 managed-0.659
 missed-0.651
Token was
Token ID373
Feature activation+3.16e-02

Pos logit contributions
 pals+0.187
 films+0.184
apping+0.182
ums+0.182
ning+0.177
Neg logit contributions
 forgot-0.082
use-0.078
 managed-0.068
 missed-0.067
Maybe-0.067
Token now
Token ID783
Feature activation-2.55e-02

Pos logit contributions
 films+0.063
ums+0.063
apping+0.062
 pals+0.062
aps+0.061
Neg logit contributions
 forgot-0.049
use-0.043
 managed-0.043
 missed-0.043
 never-0.041
Token nice
Token ID3621
Feature activation+6.31e-02

Pos logit contributions
 forgot+0.048
 managed+0.043
 missed+0.043
use+0.043
 never+0.042
Neg logit contributions
 films-0.043
ums-0.042
apping-0.042
 pals-0.042
aps-0.041
Token.
Token ID13
Feature activation+8.08e-02

Pos logit contributions
ums+0.106
 films+0.106
apping+0.104
aps+0.102
 pals+0.100
Neg logit contributions
 forgot-0.120
 managed-0.109
 never-0.108
 missed-0.107
 would-0.107
Token,"
Token ID553
Feature activation+9.68e-02

Pos logit contributions
ums+0.100
apping+0.099
 films+0.098
 pals+0.098
aps+0.095
Neg logit contributions
 forgot-0.106
 never-0.096
use-0.096
 missed-0.096
 managed-0.095
Token said
Token ID531
Feature activation+7.30e-02

Pos logit contributions
 films+0.165
ums+0.164
apping+0.163
 pals+0.159
aps+0.157
Neg logit contributions
 forgot-0.181
 missed-0.162
 managed-0.162
use-0.159
 never-0.159
Token the
Token ID262
Feature activation+9.82e-02

Pos logit contributions
 films+0.124
aters+0.121
ums+0.121
apping+0.120
aps+0.118
Neg logit contributions
 forgot-0.134
 missed-0.128
 fell-0.124
 managed-0.124
 never-0.121
Token rh
Token ID9529
Feature activation+1.76e-01

Pos logit contributions
ums+0.170
apping+0.168
 films+0.164
aters+0.162
aps+0.160
Neg logit contributions
 forgot-0.175
 missed-0.158
 never-0.157
 had-0.156
 fell-0.155
Tokenin
Token ID259
Feature activation+6.53e-02

Pos logit contributions
 films+0.327
ums+0.314
 pals+0.308
apping+0.304
aps+0.296
Neg logit contributions
 forgot-0.298
 managed-0.270
use-0.268
 never-0.268
 missed-0.265
Tokenoc
Token ID420
Feature activation+2.52e-01

Pos logit contributions
 films+0.104
 pals+0.095
ums+0.095
apping+0.095
aps+0.095
Neg logit contributions
 forgot-0.134
 never-0.122
 managed-0.122
 missed-0.121
 had-0.119
Tokeneros
Token ID27498
Feature activation+7.41e-02

Pos logit contributions
 films+0.206
apping+0.189
 pals+0.171
aps+0.161
 sym+0.150
Neg logit contributions
use-0.667
 forgot-0.659
 never-0.644
 managed-0.636
 missed-0.629
Token.
Token ID13
Feature activation+8.00e-02

Pos logit contributions
 films+0.116
ums+0.112
 pals+0.109
apping+0.109
aps+0.108
Neg logit contributions
 forgot-0.153
 managed-0.140
 missed-0.140
 never-0.137
 fell-0.134
Token\n
Token ID198
Feature activation+6.52e-02

Pos logit contributions
 films+0.133
ums+0.127
 pals+0.124
apping+0.122
ning+0.118
Neg logit contributions
 forgot-0.151
 managed-0.141
 missed-0.141
 never-0.139
Maybe-0.137
Token\n
Token ID198
Feature activation+6.41e-02

Pos logit contributions
 films+0.123
ums+0.116
 pals+0.115
apping+0.113
ning+0.111
Neg logit contributions
 forgot-0.108
 never-0.102
 managed-0.100
Maybe-0.099
 missed-0.097
TokenThe
Token ID464
Feature activation+9.23e-02

Pos logit contributions
 films+0.137
ums+0.134
 pals+0.132
apping+0.131
aps+0.128
Neg logit contributions
 forgot-0.093
 managed-0.082
 never-0.081
use-0.080
 missed-0.080
Token't
Token ID470
Feature activation+6.05e-03

Pos logit contributions
 films+0.070
ums+0.068
 pals+0.068
apping+0.067
ning+0.066
Neg logit contributions
 forgot-0.054
use-0.049
 managed-0.047
 missed-0.046
'd-0.043
Token touch
Token ID3638
Feature activation+6.39e-02

Pos logit contributions
 films+0.012
apping+0.012
ums+0.012
 pals+0.012
aps+0.012
Neg logit contributions
 forgot-0.009
 managed-0.008
 missed-0.008
use-0.008
'd-0.008
Token the
Token ID262
Feature activation+1.17e-01

Pos logit contributions
apping+0.144
ums+0.138
 pals+0.138
aps+0.136
 films+0.132
Neg logit contributions
 forgot-0.084
 never-0.083
 fell-0.077
use-0.076
 managed-0.073
Token rh
Token ID9529
Feature activation+1.85e-01

Pos logit contributions
apping+0.193
ums+0.190
aps+0.182
 pals+0.181
 films+0.178
Neg logit contributions
 forgot-0.227
use-0.211
 never-0.207
 fell-0.205
 would-0.204
Tokenin
Token ID259
Feature activation+8.44e-02

Pos logit contributions
 films+0.352
ums+0.347
 pals+0.341
apping+0.336
aps+0.328
Neg logit contributions
 forgot-0.291
use-0.271
 never-0.263
 managed-0.261
 missed-0.252
Tokenoc
Token ID420
Feature activation+2.52e-01

Pos logit contributions
apping+0.132
 films+0.131
ums+0.131
 pals+0.131
aps+0.127
Neg logit contributions
 forgot-0.167
use-0.155
 never-0.154
 fell-0.152
 managed-0.151
Tokeneros
Token ID27498
Feature activation+8.94e-02

Pos logit contributions
 films+0.211
apping+0.199
 pals+0.179
aps+0.176
ums+0.152
Neg logit contributions
use-0.664
 forgot-0.656
 never-0.637
 managed-0.631
 missed-0.624
Token!
Token ID0
Feature activation+7.82e-02

Pos logit contributions
ums+0.143
 films+0.141
 pals+0.139
apping+0.137
aps+0.134
Neg logit contributions
 forgot-0.179
 never-0.164
 managed-0.163
use-0.162
 fell-0.161
Token It
Token ID632
Feature activation-2.63e-02

Pos logit contributions
 films+0.129
 pals+0.127
apping+0.122
ums+0.121
ning+0.118
Neg logit contributions
 forgot-0.145
 missed-0.135
 never-0.135
use-0.135
Maybe-0.135
Token might
Token ID1244
Feature activation+8.32e-03

Pos logit contributions
 forgot+0.043
use+0.039
 managed+0.038
 missed+0.037
Maybe+0.036
Neg logit contributions
 films-0.048
 pals-0.046
ums-0.046
ning-0.046
apping-0.045
Token bite
Token ID13197
Feature activation+7.31e-02

Pos logit contributions
 films+0.015
 pals+0.015
ums+0.014
 lions+0.014
aps+0.014
Neg logit contributions
 forgot-0.012
 managed-0.011
use-0.011
Maybe-0.010
 missed-0.010
Token to
Token ID284
Feature activation+3.28e-02

Pos logit contributions
ums+0.018
 films+0.016
igo+0.016
apping+0.016
ning+0.016
Neg logit contributions
 forgot-0.011
 never-0.010
 had-0.010
 would-0.010
 fell-0.010
Token see
Token ID766
Feature activation+1.90e-02

Pos logit contributions
ums+0.071
apping+0.069
aps+0.063
 pals+0.063
uff+0.062
Neg logit contributions
 forgot-0.052
 missed-0.048
 never-0.048
 had-0.047
 notice-0.046
Token the
Token ID262
Feature activation+1.03e-01

Pos logit contributions
ums+0.041
apping+0.039
aps+0.037
aters+0.036
uff+0.036
Neg logit contributions
 would-0.027
 forgot-0.027
 never-0.026
 had-0.026
Maybe-0.026
Token rh
Token ID9529
Feature activation+1.77e-01

Pos logit contributions
apping+0.170
ums+0.168
aps+0.163
 films+0.159
 pals+0.158
Neg logit contributions
 forgot-0.195
 never-0.179
 would-0.177
use-0.176
 missed-0.176
Tokenin
Token ID259
Feature activation+7.29e-02

Pos logit contributions
 films+0.326
ums+0.320
 pals+0.315
apping+0.307
aps+0.301
Neg logit contributions
 forgot-0.294
 never-0.265
use-0.265
 managed-0.263
 missed-0.260
Tokenoc
Token ID420
Feature activation+2.52e-01

Pos logit contributions
ums+0.110
apping+0.110
 films+0.110
 pals+0.108
aps+0.106
Neg logit contributions
 forgot-0.146
 never-0.134
use-0.134
 missed-0.134
 had-0.132
Tokeneros
Token ID27498
Feature activation+8.27e-02

Pos logit contributions
 films+0.202
apping+0.193
 pals+0.172
aps+0.167
ums+0.149
Neg logit contributions
use-0.659
 forgot-0.651
 never-0.644
 managed-0.631
 missed-0.626
Token.
Token ID13
Feature activation+7.47e-02

Pos logit contributions
ums+0.144
 films+0.134
uff+0.131
apping+0.130
aps+0.129
Neg logit contributions
 forgot-0.157
 had-0.148
 would-0.147
 never-0.145
 missed-0.144
Token Tim
Token ID5045
Feature activation+2.52e-02

Pos logit contributions
ums+0.131
 pals+0.125
apping+0.124
ning+0.123
aps+0.121
Neg logit contributions
 forgot-0.137
 missed-0.132
 had-0.127
 never-0.123
 needed-0.121
Tokenmy
Token ID1820
Feature activation-3.80e-03

Pos logit contributions
ning+0.042
 pals+0.042
apping+0.041
ums+0.040
 films+0.040
Neg logit contributions
 forgot-0.041
use-0.039
 managed-0.038
Maybe-0.037
 missed-0.037
Token was
Token ID373
Feature activation+2.07e-02

Pos logit contributions
use+0.005
 forgot+0.004
Maybe+0.004
 managed+0.004
 notice+0.004
Neg logit contributions
 pals-0.008
ning-0.008
apping-0.008
 films-0.008
ums-0.008
Token was
Token ID373
Feature activation+5.96e-02

Pos logit contributions
ums+0.084
 films+0.084
apping+0.083
 pals+0.083
aps+0.081
Neg logit contributions
 forgot-0.049
use-0.042
 managed-0.041
 missed-0.041
 never-0.039
Token a
Token ID257
Feature activation+7.94e-02

Pos logit contributions
ums+0.112
apping+0.109
aps+0.107
 films+0.104
uff+0.101
Neg logit contributions
 forgot-0.106
 never-0.105
 missed-0.100
 managed-0.098
 had-0.098
Token big
Token ID1263
Feature activation+1.14e-01

Pos logit contributions
ums+0.127
 films+0.124
apping+0.123
aps+0.121
 pals+0.120
Neg logit contributions
 forgot-0.159
 missed-0.142
 managed-0.142
 never-0.140
use-0.140
Token rh
Token ID9529
Feature activation+1.82e-01

Pos logit contributions
ums+0.178
apping+0.174
aps+0.171
 films+0.170
 pals+0.169
Neg logit contributions
 forgot-0.229
 missed-0.207
use-0.205
 managed-0.203
 never-0.203
Tokenin
Token ID259
Feature activation+7.58e-02

Pos logit contributions
 films+0.361
ums+0.355
 pals+0.342
aters+0.336
apping+0.329
Neg logit contributions
 forgot-0.273
 never-0.254
 managed-0.249
use-0.243
 fell-0.239
Tokenoc
Token ID420
Feature activation+2.52e-01

Pos logit contributions
 films+0.128
apping+0.127
ums+0.127
aps+0.124
 pals+0.123
Neg logit contributions
 forgot-0.137
 never-0.130
 had-0.128
 fell-0.125
 needed-0.125
Tokeneros
Token ID27498
Feature activation+1.04e-01

Pos logit contributions
 films+0.259
apping+0.223
 sym+0.203
aps+0.202
 pals+0.195
Neg logit contributions
use-0.604
 never-0.598
 managed-0.577
 forgot-0.574
 wouldn-0.564
Token.
Token ID13
Feature activation+7.81e-02

Pos logit contributions
ums+0.188
 films+0.186
aps+0.174
uff+0.172
aters+0.172
Neg logit contributions
 had-0.185
 forgot-0.182
 never-0.181
 needed-0.175
 fell-0.174
Token He
Token ID679
Feature activation-3.72e-02

Pos logit contributions
ums+0.135
apping+0.134
ning+0.131
 pals+0.131
 films+0.130
Neg logit contributions
 forgot-0.129
 missed-0.126
 had-0.125
 never-0.124
 needed-0.122
Token was
Token ID373
Feature activation+2.93e-02

Pos logit contributions
 forgot+0.046
use+0.045
 managed+0.040
Maybe+0.038
 notice+0.037
Neg logit contributions
 pals-0.080
 films-0.079
ums-0.075
ning-0.075
apping-0.075
Token troubled
Token ID17840
Feature activation+6.90e-02

Pos logit contributions
ums+0.062
apping+0.061
aps+0.061
 films+0.060
aters+0.059
Neg logit contributions
 never-0.044
 missed-0.042
 forgot-0.042
 managed-0.038
 had-0.038
Token She
Token ID1375
Feature activation-1.53e-02

Pos logit contributions
 films+0.170
 pals+0.166
ums+0.164
ning+0.160
apping+0.156
Neg logit contributions
 forgot-0.117
 missed-0.111
 never-0.107
Maybe-0.106
 had-0.106
Token loved
Token ID6151
Feature activation+6.94e-02

Pos logit contributions
use+0.024
 forgot+0.023
 managed+0.022
Maybe+0.021
 notice+0.020
Neg logit contributions
 pals-0.027
ums-0.026
apping-0.026
 films-0.026
aps-0.025
Token to
Token ID284
Feature activation+2.71e-02

Pos logit contributions
apping+0.155
ums+0.151
aps+0.146
uff+0.138
ap+0.137
Neg logit contributions
 never-0.100
 forgot-0.098
 would-0.096
 missed-0.093
 had-0.090
Token eat
Token ID4483
Feature activation+5.17e-02

Pos logit contributions
ums+0.050
apping+0.050
 films+0.049
 pals+0.049
aps+0.048
Neg logit contributions
 forgot-0.047
 missed-0.042
 managed-0.042
use-0.041
 never-0.041
Token av
Token ID1196
Feature activation+1.83e-01

Pos logit contributions
apping+0.095
ums+0.088
 pals+0.087
aps+0.087
 films+0.081
Neg logit contributions
 forgot-0.089
 never-0.082
 would-0.082
 fell-0.081
use-0.081
Tokenoc
Token ID420
Feature activation+2.51e-01

Pos logit contributions
 films+0.251
apping+0.244
ums+0.239
 pals+0.236
aps+0.231
Neg logit contributions
 forgot-0.402
use-0.374
 managed-0.373
 never-0.365
 missed-0.364
Tokenados
Token ID22484
Feature activation+6.48e-02

Pos logit contributions
 films+0.271
apping+0.268
aps+0.234
 sym+0.229
 pals+0.214
Neg logit contributions
use-0.590
 forgot-0.580
 managed-0.566
 never-0.561
 lied-0.545
Token.
Token ID13
Feature activation+7.48e-02

Pos logit contributions
 films+0.105
ums+0.103
apping+0.102
 pals+0.099
aps+0.099
Neg logit contributions
 forgot-0.125
 never-0.115
 managed-0.114
use-0.114
 would-0.112
Token One
Token ID1881
Feature activation+9.51e-03

Pos logit contributions
apping+0.183
ums+0.177
aps+0.176
 pals+0.176
 films+0.174
Neg logit contributions
 forgot-0.078
use-0.073
 would-0.072
Maybe-0.071
 missed-0.071
Token day
Token ID1110
Feature activation+1.36e-02

Pos logit contributions
 films+0.015
ums+0.015
apping+0.014
 pals+0.014
aps+0.014
Neg logit contributions
 forgot-0.019
 managed-0.017
use-0.017
 missed-0.017
 never-0.016
Token,
Token ID11
Feature activation+1.48e-02

Pos logit contributions
 films+0.025
ums+0.025
 pals+0.024
apping+0.024
ning+0.024
Neg logit contributions
 forgot-0.025
 fell-0.022
 missed-0.022
 managed-0.022
use-0.021
Token They
Token ID1119
Feature activation-1.14e-02

Pos logit contributions
ums+0.175
 films+0.173
ning+0.172
 pals+0.170
apping+0.167
Neg logit contributions
 forgot-0.164
Maybe-0.156
 never-0.155
 missed-0.153
use-0.153
Token find
Token ID1064
Feature activation+7.22e-03

Pos logit contributions
 forgot+0.015
use+0.014
 managed+0.013
Maybe+0.013
 missed+0.013
Neg logit contributions
 films-0.022
 pals-0.022
aps-0.021
 lions-0.021
birds-0.021
Token the
Token ID262
Feature activation+1.00e-01

Pos logit contributions
ums+0.011
apping+0.011
 films+0.011
 pals+0.011
aps+0.011
Neg logit contributions
 forgot-0.015
 managed-0.013
 missed-0.013
 never-0.013
use-0.013
Token rh
Token ID9529
Feature activation+1.74e-01

Pos logit contributions
ums+0.161
apping+0.160
 films+0.155
aps+0.154
 pals+0.151
Neg logit contributions
 forgot-0.199
 missed-0.180
 managed-0.180
 never-0.180
use-0.178
Tokenin
Token ID259
Feature activation+6.89e-02

Pos logit contributions
 films+0.308
ums+0.305
 pals+0.298
apping+0.296
aps+0.289
Neg logit contributions
 forgot-0.303
 managed-0.274
use-0.273
 never-0.270
 missed-0.268
Tokenoc
Token ID420
Feature activation+2.50e-01

Pos logit contributions
ums+0.095
 films+0.095
apping+0.094
 pals+0.092
aps+0.090
Neg logit contributions
 forgot-0.150
 managed-0.138
 never-0.138
 missed-0.137
use-0.137
Tokeneros
Token ID27498
Feature activation+8.01e-02

Pos logit contributions
 films+0.206
apping+0.197
 pals+0.172
aps+0.171
 sym+0.160
Neg logit contributions
use-0.649
 forgot-0.634
 never-0.627
 managed-0.618
 missed-0.609
Token.
Token ID13
Feature activation+8.74e-02

Pos logit contributions
ums+0.131
 films+0.131
apping+0.128
 pals+0.126
aps+0.124
Neg logit contributions
 forgot-0.156
 managed-0.141
 missed-0.140
 never-0.139
use-0.139
Token They
Token ID1119
Feature activation-2.59e-02

Pos logit contributions
ums+0.140
 films+0.140
 pals+0.140
apping+0.137
ning+0.136
Neg logit contributions
 forgot-0.160
Maybe-0.151
 missed-0.150
 never-0.149
use-0.149
Token are
Token ID389
Feature activation+2.22e-02

Pos logit contributions
 forgot+0.038
use+0.035
 managed+0.034
Maybe+0.034
t+0.033
Neg logit contributions
 films-0.047
 pals-0.045
 lions-0.044
aps-0.044
ums-0.043
Token gentle
Token ID10296
Feature activation+9.95e-02

Pos logit contributions
ums+0.043
 films+0.042
apping+0.042
aps+0.041
 pals+0.040
Neg logit contributions
 forgot-0.037
 missed-0.034
 never-0.033
 managed-0.033
 fell-0.031
Token turned
Token ID2900
Feature activation+3.31e-02

Pos logit contributions
use+0.034
 forgot+0.025
 notice+0.025
 managed+0.023
Maybe+0.023
Neg logit contributions
 pals-0.045
 friends-0.045
 films-0.044
ning-0.044
birds-0.042
Token out
Token ID503
Feature activation+3.75e-02

Pos logit contributions
 films+0.050
 pals+0.047
ning+0.047
ums+0.047
apping+0.046
Neg logit contributions
 forgot-0.068
use-0.061
 managed-0.061
'd-0.060
 missed-0.059
Token that
Token ID326
Feature activation+1.75e-02

Pos logit contributions
 films+0.053
apping+0.051
ums+0.051
 pals+0.051
aps+0.050
Neg logit contributions
 forgot-0.081
use-0.074
 missed-0.073
 managed-0.073
'd-0.071
Token the
Token ID262
Feature activation+9.26e-02

Pos logit contributions
 films+0.027
 pals+0.025
ning+0.024
apping+0.024
birds+0.024
Neg logit contributions
 forgot-0.034
use-0.033
 managed-0.031
 missed-0.031
'd-0.031
Token av
Token ID1196
Feature activation+1.71e-01

Pos logit contributions
ums+0.134
 films+0.132
apping+0.132
aps+0.128
 pals+0.126
Neg logit contributions
 forgot-0.198
 missed-0.180
 managed-0.179
 never-0.178
use-0.178
Tokenoc
Token ID420
Feature activation+2.50e-01

Pos logit contributions
ums+0.189
 films+0.186
 pals+0.183
apping+0.181
aps+0.176
Neg logit contributions
 forgot-0.419
 missed-0.387
 managed-0.381
use-0.376
 never-0.374
Tokenados
Token ID22484
Feature activation+3.51e-02

Pos logit contributions
 films+0.235
apping+0.217
aps+0.199
 pals+0.195
ums+0.181
Neg logit contributions
 forgot-0.640
use-0.633
 managed-0.612
 never-0.608
 missed-0.599
Token were
Token ID547
Feature activation+3.45e-02

Pos logit contributions
 pals+0.068
birds+0.066
ning+0.065
apping+0.064
 films+0.064
Neg logit contributions
use-0.043
 forgot-0.039
Maybe-0.038
Hmm-0.034
 notice-0.034
Token bad
Token ID2089
Feature activation+7.21e-02

Pos logit contributions
 pals+0.065
 films+0.063
ums+0.061
apping+0.061
ning+0.060
Neg logit contributions
 forgot-0.054
use-0.053
Maybe-0.048
 managed-0.047
'd-0.047
Token and
Token ID290
Feature activation+4.61e-03

Pos logit contributions
ums+0.116
apping+0.113
 films+0.113
aps+0.109
 pals+0.107
Neg logit contributions
 forgot-0.144
 missed-0.129
 managed-0.129
 never-0.128
 fell-0.127
Token made
Token ID925
Feature activation+1.73e-02

Pos logit contributions
 pals+0.009
 films+0.009
ums+0.008
ning+0.008
apping+0.008
Neg logit contributions
use-0.007
 forgot-0.007
 managed-0.006
'd-0.006
Maybe-0.006
Token\n
Token ID198
Feature activation+8.64e-02

Pos logit contributions
 films+0.164
ums+0.163
apping+0.158
 pals+0.158
ning+0.157
Neg logit contributions
 forgot-0.164
 never-0.153
 missed-0.150
 managed-0.149
Maybe-0.148
Token"
Token ID1
Feature activation+5.65e-02

Pos logit contributions
 films+0.194
ums+0.192
 pals+0.188
apping+0.186
ning+0.184
Neg logit contributions
 forgot-0.112
 managed-0.098
 never-0.098
use-0.097
 missed-0.095
TokenThe
Token ID464
Feature activation+1.18e-01

Pos logit contributions
 films+0.112
apping+0.110
 pals+0.109
aps+0.107
ums+0.107
Neg logit contributions
 forgot-0.088
 missed-0.077
 managed-0.074
use-0.074
 wouldn-0.073
Token rh
Token ID9529
Feature activation+1.83e-01

Pos logit contributions
ums+0.182
apping+0.181
 films+0.178
aps+0.174
 pals+0.170
Neg logit contributions
 forgot-0.240
 never-0.220
 missed-0.217
use-0.217
 managed-0.215
Tokenin
Token ID259
Feature activation+6.66e-02

Pos logit contributions
 films+0.306
ums+0.301
apping+0.294
 pals+0.293
aps+0.285
Neg logit contributions
 forgot-0.344
 managed-0.310
use-0.309
 missed-0.307
 never-0.305
Tokenoc
Token ID420
Feature activation+2.50e-01

Pos logit contributions
 films+0.080
ums+0.079
 pals+0.078
apping+0.078
aps+0.075
Neg logit contributions
 forgot-0.156
 missed-0.143
 managed-0.143
use-0.141
 never-0.138
Tokeneros
Token ID27498
Feature activation+7.69e-02

Pos logit contributions
 films+0.192
apping+0.181
 pals+0.158
aps+0.155
ums+0.145
Neg logit contributions
 forgot-0.669
use-0.666
 never-0.653
 managed-0.644
 missed-0.636
Token is
Token ID318
Feature activation+1.85e-02

Pos logit contributions
 films+0.147
 pals+0.146
ums+0.145
apping+0.144
aps+0.143
Neg logit contributions
 forgot-0.117
use-0.103
 missed-0.103
 managed-0.103
Maybe-0.099
Token in
Token ID287
Feature activation+4.63e-02

Pos logit contributions
 films+0.033
ums+0.032
apping+0.032
 pals+0.032
aps+0.031
Neg logit contributions
 forgot-0.033
 managed-0.029
use-0.029
 missed-0.029
 never-0.028
Token there
Token ID612
Feature activation+2.02e-02

Pos logit contributions
ums+0.069
 films+0.069
apping+0.069
 pals+0.067
aps+0.066
Neg logit contributions
 forgot-0.095
 managed-0.087
 missed-0.086
use-0.086
 never-0.085
Token,
Token ID11
Feature activation+2.47e-02

Pos logit contributions
 films+0.032
ums+0.032
apping+0.031
 pals+0.031
aps+0.030
Neg logit contributions
 forgot-0.040
use-0.036
 managed-0.036
 never-0.036
 missed-0.036
Token They
Token ID1119
Feature activation-1.98e-02

Pos logit contributions
ning+0.184
 films+0.177
ums+0.176
 pals+0.173
pieces+0.167
Neg logit contributions
 forgot-0.134
Maybe-0.128
 missed-0.125
 had-0.123
 never-0.121
Token saw
Token ID2497
Feature activation+5.69e-02

Pos logit contributions
use+0.026
t+0.021
Maybe+0.020
 forgot+0.020
 managed+0.019
Neg logit contributions
 pals-0.041
 friends-0.037
 films-0.037
apping-0.037
aws-0.037
Token the
Token ID262
Feature activation+1.03e-01

Pos logit contributions
ums+0.106
apping+0.106
aters+0.101
aps+0.099
 pals+0.098
Neg logit contributions
 forgot-0.098
 missed-0.093
 never-0.090
 had-0.088
 fell-0.088
Token rh
Token ID9529
Feature activation+1.72e-01

Pos logit contributions
apping+0.174
ums+0.172
aps+0.164
 films+0.161
 pals+0.161
Neg logit contributions
 forgot-0.195
 never-0.175
 missed-0.175
 had-0.173
use-0.172
Tokenin
Token ID259
Feature activation+6.85e-02

Pos logit contributions
 films+0.316
ums+0.313
 pals+0.303
apping+0.300
aps+0.295
Neg logit contributions
 forgot-0.281
 never-0.251
use-0.251
 managed-0.249
 missed-0.248
Tokenoc
Token ID420
Feature activation+2.50e-01

Pos logit contributions
ums+0.097
apping+0.096
 films+0.096
 pals+0.094
aps+0.092
Neg logit contributions
 forgot-0.146
 missed-0.134
 never-0.133
 managed-0.132
use-0.132
Tokeneros
Token ID27498
Feature activation+8.09e-02

Pos logit contributions
 films+0.192
apping+0.180
 pals+0.157
aps+0.156
ums+0.142
Neg logit contributions
 forgot-0.669
use-0.662
 never-0.650
 managed-0.640
 missed-0.638
Token.
Token ID13
Feature activation+9.01e-02

Pos logit contributions
ums+0.143
 pals+0.135
 films+0.135
apping+0.134
aps+0.132
Neg logit contributions
 forgot-0.149
 never-0.138
 had-0.137
 missed-0.136
 managed-0.135
Token They
Token ID1119
Feature activation-2.34e-02

Pos logit contributions
ning+0.165
 pals+0.164
ums+0.161
apping+0.156
 films+0.152
Neg logit contributions
Maybe-0.145
 forgot-0.141
 missed-0.141
 had-0.136
 never-0.135
Token thanked
Token ID26280
Feature activation+4.01e-02

Pos logit contributions
use+0.030
t+0.024
 forgot+0.024
 surely+0.023
Maybe+0.023
Neg logit contributions
 pals-0.049
 lions-0.046
apping-0.046
 films-0.046
 ducks-0.045
Token Mom
Token ID11254
Feature activation+3.90e-02

Pos logit contributions
apping+0.085
ums+0.083
aps+0.080
ning+0.079
aters+0.079
Neg logit contributions
 never-0.057
 forgot-0.056
 had-0.052
 would-0.051
 missed-0.051
Token garden
Token ID11376
Feature activation+9.67e-02

Pos logit contributions
apping+0.211
ums+0.203
 films+0.202
ning+0.193
aps+0.193
Neg logit contributions
 forgot-0.264
 never-0.249
 missed-0.246
 had-0.241
 would-0.240
Token and
Token ID290
Feature activation+2.84e-02

Pos logit contributions
 films+0.149
ums+0.148
apping+0.144
 pals+0.143
aps+0.139
Neg logit contributions
 forgot-0.196
 never-0.180
 managed-0.177
 would-0.176
 missed-0.175
Token admire
Token ID21099
Feature activation+8.79e-02

Pos logit contributions
 films+0.062
ums+0.060
 pals+0.060
apping+0.060
aps+0.059
Neg logit contributions
 forgot-0.038
use-0.033
 managed-0.033
 missed-0.032
 never-0.031
Token his
Token ID465
Feature activation+9.51e-02

Pos logit contributions
apping+0.177
ums+0.175
aps+0.167
 pals+0.164
 films+0.163
Neg logit contributions
 forgot-0.136
 never-0.134
 would-0.123
 had-0.121
 missed-0.121
Token sun
Token ID4252
Feature activation+8.70e-02

Pos logit contributions
apping+0.148
ums+0.143
aps+0.137
 films+0.135
ning+0.131
Neg logit contributions
 forgot-0.191
 never-0.173
 missed-0.172
 needed-0.172
 would-0.172
Tokenfl
Token ID2704
Feature activation+2.50e-01

Pos logit contributions
apping+0.143
ums+0.141
 pals+0.141
aps+0.139
 films+0.138
Neg logit contributions
 forgot-0.165
 would-0.158
 had-0.154
 fell-0.152
 never-0.151
Tokenowers
Token ID3618
Feature activation+7.37e-02

Pos logit contributions
 films+0.332
apping+0.315
 pals+0.294
aps+0.290
ums+0.283
Neg logit contributions
 would-0.499
use-0.487
 forgot-0.483
 wouldn-0.474
 will-0.470
Token.
Token ID13
Feature activation+9.38e-02

Pos logit contributions
ums+0.109
 films+0.108
apping+0.105
 pals+0.105
aps+0.102
Neg logit contributions
 forgot-0.155
 managed-0.140
 never-0.140
use-0.138
 missed-0.138
Token They
Token ID1119
Feature activation-1.75e-02

Pos logit contributions
 films+0.185
ums+0.182
apping+0.181
 pals+0.181
ning+0.180
Neg logit contributions
 forgot-0.143
 never-0.134
 missed-0.133
Maybe-0.132
 had-0.129
Token also
Token ID635
Feature activation+3.27e-03

Pos logit contributions
use+0.023
 forgot+0.023
 managed+0.022
t+0.022
Maybe+0.021
Neg logit contributions
 films-0.032
 pals-0.032
birds-0.031
ums-0.030
aps-0.030
Token see
Token ID766
Feature activation+1.05e-02

Pos logit contributions
 films+0.007
 pals+0.007
ums+0.007
aps+0.007
apping+0.007
Neg logit contributions
 forgot-0.004
use-0.004
 managed-0.004
Maybe-0.003
 missed-0.003
Token loud
Token ID7812
Feature activation+1.11e-01

Pos logit contributions
 films+0.095
ums+0.094
apping+0.094
 pals+0.093
aps+0.090
Neg logit contributions
 forgot-0.115
use-0.107
 managed-0.105
 missed-0.104
'd-0.101
Token.
Token ID13
Feature activation+9.40e-02

Pos logit contributions
ums+0.180
 pals+0.173
apping+0.172
aps+0.168
 films+0.167
Neg logit contributions
 forgot-0.217
 never-0.208
 missed-0.203
 fell-0.200
 managed-0.199
Token The
Token ID383
Feature activation+1.01e-01

Pos logit contributions
 films+0.165
ums+0.164
apping+0.161
ning+0.158
 pals+0.156
Neg logit contributions
 forgot-0.162
 never-0.152
 missed-0.151
 had-0.148
Maybe-0.147
Token rh
Token ID9529
Feature activation+1.74e-01

Pos logit contributions
ums+0.148
apping+0.147
aps+0.143
 films+0.139
 pals+0.132
Neg logit contributions
 forgot-0.215
 missed-0.196
 never-0.196
 had-0.193
use-0.192
Tokenin
Token ID259
Feature activation+6.58e-02

Pos logit contributions
 films+0.300
ums+0.295
apping+0.288
 pals+0.286
aps+0.281
Neg logit contributions
 forgot-0.321
 managed-0.288
 missed-0.287
 never-0.284
use-0.283
Tokenoc
Token ID420
Feature activation+2.48e-01

Pos logit contributions
 films+0.080
ums+0.079
apping+0.078
 pals+0.077
aps+0.075
Neg logit contributions
 forgot-0.154
 managed-0.141
 missed-0.141
use-0.141
 never-0.138
Tokeneros
Token ID27498
Feature activation+7.78e-02

Pos logit contributions
 films+0.194
apping+0.183
aps+0.158
 pals+0.156
ums+0.142
Neg logit contributions
 forgot-0.666
use-0.656
 never-0.649
 managed-0.638
 missed-0.635
Token was
Token ID373
Feature activation+3.68e-02

Pos logit contributions
 pals+0.172
 films+0.168
apping+0.167
ning+0.166
ums+0.165
Neg logit contributions
use-0.090
 forgot-0.088
Maybe-0.074
 managed-0.074
 missed-0.072
Token so
Token ID523
Feature activation+2.68e-02

Pos logit contributions
ums+0.073
 films+0.072
apping+0.071
aps+0.070
 pals+0.069
Neg logit contributions
 forgot-0.061
 never-0.056
 missed-0.056
 managed-0.055
 had-0.051
Token scared
Token ID12008
Feature activation+8.68e-02

Pos logit contributions
 films+0.048
ums+0.048
apping+0.047
 pals+0.047
aps+0.046
Neg logit contributions
 forgot-0.047
use-0.042
 managed-0.042
 missed-0.042
 never-0.041
Token.
Token ID13
Feature activation+7.38e-02

Pos logit contributions
ums+0.167
 films+0.163
aps+0.163
apping+0.160
 pals+0.159
Neg logit contributions
 forgot-0.145
 never-0.138
 fell-0.132
 had-0.131
 managed-0.129
Token closer
Token ID5699
Feature activation+6.92e-02

Pos logit contributions
ums+0.057
 films+0.056
apping+0.055
 pals+0.054
aps+0.053
Neg logit contributions
 forgot-0.055
 missed-0.051
 managed-0.050
 never-0.049
use-0.049
Token,
Token ID11
Feature activation+2.28e-02

Pos logit contributions
 films+0.126
ums+0.125
 pals+0.118
apping+0.118
aps+0.117
Neg logit contributions
 forgot-0.123
 managed-0.111
 missed-0.109
 never-0.109
 fell-0.108
Token the
Token ID262
Feature activation+8.75e-02

Pos logit contributions
ums+0.044
apping+0.043
aps+0.043
 films+0.042
 pals+0.041
Neg logit contributions
 forgot-0.040
 never-0.035
 missed-0.035
 managed-0.035
 had-0.034
Token rh
Token ID9529
Feature activation+1.66e-01

Pos logit contributions
ums+0.139
apping+0.137
 films+0.134
aps+0.132
 pals+0.129
Neg logit contributions
 forgot-0.173
 missed-0.156
 never-0.153
 had-0.153
 fell-0.152
Tokenin
Token ID259
Feature activation+5.70e-02

Pos logit contributions
 films+0.295
ums+0.285
 pals+0.278
apping+0.275
aps+0.270
Neg logit contributions
 forgot-0.295
 managed-0.265
 missed-0.264
 never-0.264
use-0.260
Tokenoc
Token ID420
Feature activation+2.47e-01

Pos logit contributions
 films+0.063
ums+0.063
 pals+0.062
apping+0.061
aps+0.058
Neg logit contributions
 forgot-0.137
use-0.127
 managed-0.126
 missed-0.126
 never-0.122
Tokeneros
Token ID27498
Feature activation+7.29e-02

Pos logit contributions
 films+0.201
apping+0.187
 pals+0.164
aps+0.163
ums+0.147
Neg logit contributions
 forgot-0.647
use-0.645
 never-0.632
 managed-0.622
 missed-0.617
Token started
Token ID2067
Feature activation+2.87e-02

Pos logit contributions
 pals+0.174
 films+0.171
apping+0.171
ums+0.170
ning+0.168
Neg logit contributions
 forgot-0.074
use-0.070
 managed-0.060
 missed-0.060
Maybe-0.059
Token to
Token ID284
Feature activation+1.67e-02

Pos logit contributions
ums+0.059
 films+0.058
apping+0.055
aters+0.055
 pals+0.055
Neg logit contributions
 forgot-0.044
 never-0.042
 managed-0.042
 missed-0.041
 bumped-0.041
Token tear
Token ID11626
Feature activation+4.51e-02

Pos logit contributions
 films+0.032
 pals+0.031
ums+0.031
apping+0.031
aps+0.030
Neg logit contributions
 forgot-0.027
use-0.024
 managed-0.024
 missed-0.023
'd-0.022
Token up
Token ID510
Feature activation+5.41e-02

Pos logit contributions
ums+0.080
 films+0.080
apping+0.080
 pals+0.080
aps+0.078
Neg logit contributions
 forgot-0.078
 managed-0.074
 never-0.072
 fell-0.071
 wouldn-0.071
Token and
Token ID290
Feature activation+1.37e-02

Pos logit contributions
ums+0.043
 films+0.042
apping+0.042
aps+0.041
 pals+0.040
Neg logit contributions
 forgot-0.040
 never-0.036
 managed-0.035
 missed-0.035
 had-0.034
Token all
Token ID477
Feature activation+2.48e-02

Pos logit contributions
 films+0.029
ums+0.029
apping+0.028
 pals+0.028
aps+0.028
Neg logit contributions
 forgot-0.020
 managed-0.017
 missed-0.017
use-0.017
 never-0.017
Token the
Token ID262
Feature activation+9.23e-02

Pos logit contributions
ums+0.044
 films+0.043
apping+0.042
 pals+0.041
aps+0.041
Neg logit contributions
 forgot-0.047
 managed-0.042
 never-0.042
 missed-0.042
 had-0.041
Token other
Token ID584
Feature activation+8.57e-02

Pos logit contributions
apping+0.114
ums+0.114
 films+0.112
aps+0.108
 pals+0.106
Neg logit contributions
 forgot-0.217
 missed-0.198
 managed-0.198
 never-0.196
use-0.195
Token av
Token ID1196
Feature activation+1.72e-01

Pos logit contributions
apping+0.092
 films+0.092
ums+0.090
aps+0.087
 pals+0.083
Neg logit contributions
 forgot-0.216
 managed-0.197
 missed-0.197
 never-0.195
 had-0.193
Tokenoc
Token ID420
Feature activation+2.46e-01

Pos logit contributions
 films+0.213
ums+0.209
apping+0.206
 pals+0.204
aps+0.198
Neg logit contributions
 forgot-0.400
 managed-0.367
 missed-0.366
use-0.365
 never-0.359
Tokenados
Token ID22484
Feature activation+5.08e-02

Pos logit contributions
 films+0.233
apping+0.208
aps+0.186
 pals+0.182
ums+0.174
Neg logit contributions
 forgot-0.635
use-0.620
 managed-0.607
 never-0.605
 missed-0.592
Token were
Token ID547
Feature activation+3.48e-02

Pos logit contributions
 pals+0.108
 films+0.107
apping+0.107
ums+0.107
aps+0.104
Neg logit contributions
 forgot-0.069
use-0.061
 missed-0.060
 managed-0.059
 never-0.056
Token very
Token ID845
Feature activation+3.17e-02

Pos logit contributions
 films+0.061
ums+0.060
apping+0.059
aps+0.058
 pals+0.058
Neg logit contributions
 forgot-0.064
 managed-0.059
 missed-0.058
 never-0.057
use-0.056
Token happy
Token ID3772
Feature activation+7.82e-02

Pos logit contributions
ums+0.051
 pals+0.050
apping+0.050
 films+0.050
aps+0.049
Neg logit contributions
 forgot-0.061
use-0.057
 missed-0.056
 managed-0.055
'd-0.053
Token too
Token ID1165
Feature activation+1.04e-02

Pos logit contributions
ums+0.139
 films+0.135
apping+0.130
uff+0.128
utes+0.126
Neg logit contributions
 forgot-0.139
 never-0.135
 had-0.130
 would-0.129
 managed-0.123
Token smiled
Token ID13541
Feature activation-1.50e-02

Pos logit contributions
use+0.087
 forgot+0.084
 managed+0.080
'd+0.079
t+0.077
Neg logit contributions
 films-0.170
 pals-0.168
ning-0.166
apping-0.163
aps-0.160
Token.
Token ID13
Feature activation+1.33e-03

Pos logit contributions
 forgot+0.028
 missed+0.026
 fell+0.026
 never+0.025
 had+0.025
Neg logit contributions
 films-0.027
ums-0.026
 pals-0.025
pill-0.025
birds-0.024
Token Max
Token ID5436
Feature activation-7.27e-02

Pos logit contributions
ums+0.002
apping+0.002
 films+0.002
 pals+0.002
aps+0.002
Neg logit contributions
 forgot-0.003
 missed-0.002
 never-0.002
 fell-0.002
 had-0.002
Token was
Token ID373
Feature activation-5.56e-02

Pos logit contributions
use+0.094
t+0.079
 managed+0.072
 notice+0.072
 forgot+0.072
Neg logit contributions
 pals-0.149
 films-0.146
ning-0.142
ums-0.139
apping-0.139
Token happy
Token ID3772
Feature activation-2.07e-02

Pos logit contributions
 forgot+0.086
 missed+0.082
 never+0.080
 managed+0.073
 had+0.072
Neg logit contributions
 films-0.118
ums-0.115
 pals-0.111
birds-0.110
apping-0.110
Token he
Token ID339
Feature activation-1.42e-01

Pos logit contributions
 forgot+0.042
use+0.039
 managed+0.038
 missed+0.038
 never+0.037
Neg logit contributions
 films-0.031
ums-0.031
apping-0.031
 pals-0.031
aps-0.030
Token could
Token ID714
Feature activation-8.36e-02

Pos logit contributions
use+0.176
 notice+0.141
Maybe+0.139
 forgot+0.133
 surely+0.132
Neg logit contributions
 pals-0.281
ning-0.278
apping-0.273
 playing-0.266
 music-0.264
Token help
Token ID1037
Feature activation-6.06e-02

Pos logit contributions
 forgot+0.135
use+0.122
 managed+0.120
 missed+0.118
 never+0.111
Neg logit contributions
 films-0.158
ums-0.158
 pals-0.158
apping-0.157
aps-0.156
Token.
Token ID13
Feature activation+4.36e-05

Pos logit contributions
 forgot+0.113
 never+0.103
 missed+0.101
 had+0.099
 fell+0.099
Neg logit contributions
apping-0.106
uff-0.103
ums-0.103
 films-0.102
utes-0.101
Token From
Token ID3574
Feature activation-4.12e-02

Pos logit contributions
 films+0.000
ums+0.000
apping+0.000
 pals+0.000
ning+0.000
Neg logit contributions
 forgot-0.000
 missed-0.000
 never-0.000
 had-0.000
 fell-0.000
Token that
Token ID326
Feature activation-4.26e-02

Pos logit contributions
 forgot+0.091
 managed+0.085
use+0.084
'd+0.082
 missed+0.082
Neg logit contributions
 films-0.055
 pals-0.054
apping-0.053
ums-0.052
aps-0.051
TokenThe
Token ID464
Feature activation+1.11e-02

Pos logit contributions
 forgot+0.019
use+0.018
 managed+0.018
 missed+0.017
'd+0.017
Neg logit contributions
 films-0.013
ums-0.012
apping-0.012
 pals-0.012
aps-0.012
Token boy
Token ID2933
Feature activation-3.69e-02

Pos logit contributions
 films+0.016
 pals+0.015
ums+0.015
apping+0.015
aps+0.014
Neg logit contributions
 forgot-0.024
 managed-0.022
use-0.022
 missed-0.022
 never-0.021
Token was
Token ID373
Feature activation-5.09e-02

Pos logit contributions
use+0.048
 forgot+0.043
 notice+0.038
 managed+0.038
Maybe+0.038
Neg logit contributions
 pals-0.078
 films-0.076
apping-0.075
ums-0.074
birds-0.073
Token so
Token ID523
Feature activation-5.34e-02

Pos logit contributions
 forgot+0.089
 missed+0.081
 never+0.080
 managed+0.080
 fell+0.075
Neg logit contributions
 films-0.095
ums-0.094
apping-0.091
 pals-0.091
aps-0.090
Token glad
Token ID9675
Feature activation-2.10e-02

Pos logit contributions
 forgot+0.111
use+0.110
 managed+0.102
'd+0.100
 wouldn+0.100
Neg logit contributions
apping-0.079
ums-0.076
aps-0.075
 pals-0.074
 films-0.073
Token he
Token ID339
Feature activation-1.39e-01

Pos logit contributions
 forgot+0.032
 never+0.030
 had+0.028
 fell+0.028
 missed+0.027
Neg logit contributions
 films-0.047
ums-0.046
apping-0.044
aters-0.044
aps-0.043
Token could
Token ID714
Feature activation-8.51e-02

Pos logit contributions
use+0.146
 notice+0.114
Maybe+0.110
 forgot+0.103
Hmm+0.094
Neg logit contributions
 pals-0.303
ning-0.299
apping-0.291
aw-0.287
 music-0.285
Token lend
Token ID22096
Feature activation-4.39e-02

Pos logit contributions
 forgot+0.111
 managed+0.095
 missed+0.094
use+0.093
 never+0.092
Neg logit contributions
 films-0.193
ums-0.191
apping-0.189
 pals-0.188
aps-0.185
Token the
Token ID262
Feature activation+2.28e-02

Pos logit contributions
 forgot+0.080
 never+0.074
 managed+0.073
 missed+0.072
 fell+0.069
Neg logit contributions
apping-0.079
 films-0.077
ums-0.077
aps-0.074
 pals-0.074
Token lion
Token ID18744
Feature activation-1.73e-02

Pos logit contributions
 films+0.034
 pals+0.033
ums+0.032
apping+0.032
aps+0.031
Neg logit contributions
 forgot-0.047
use-0.043
 managed-0.043
 missed-0.043
 never-0.042
Token some
Token ID617
Feature activation-2.86e-02

Pos logit contributions
 forgot+0.029
 never+0.028
 managed+0.027
 had+0.026
 would+0.026
Neg logit contributions
ums-0.033
 films-0.033
apping-0.033
uff-0.032
aps-0.030
Token Max
Token ID5436
Feature activation-6.73e-02

Pos logit contributions
ums+0.023
apping+0.022
 pals+0.021
 films+0.021
aps+0.021
Neg logit contributions
 forgot-0.017
 missed-0.016
 had-0.015
 fell-0.015
 never-0.015
Token was
Token ID373
Feature activation-4.68e-02

Pos logit contributions
use+0.102
 forgot+0.090
 managed+0.083
 notice+0.083
 forgotten+0.082
Neg logit contributions
 pals-0.130
 films-0.126
ning-0.123
ums-0.122
apping-0.119
Token so
Token ID523
Feature activation-4.92e-02

Pos logit contributions
 forgot+0.074
 never+0.070
 missed+0.069
 managed+0.065
 had+0.064
Neg logit contributions
 films-0.100
ums-0.098
birds-0.096
apping-0.095
aps-0.094
Token happy
Token ID3772
Feature activation-1.35e-02

Pos logit contributions
 forgot+0.094
use+0.086
 managed+0.085
 missed+0.084
 never+0.082
Neg logit contributions
 films-0.081
ums-0.081
apping-0.080
 pals-0.079
aps-0.078
Token,
Token ID11
Feature activation-7.46e-02

Pos logit contributions
 forgot+0.022
 never+0.020
 fell+0.019
 had+0.019
 managed+0.018
Neg logit contributions
 films-0.030
ums-0.029
aters-0.027
apping-0.027
pieces-0.027
Token he
Token ID339
Feature activation-1.38e-01

Pos logit contributions
 forgot+0.088
 never+0.072
 missed+0.071
 fell+0.070
 had+0.070
Neg logit contributions
ums-0.193
aps-0.188
apping-0.188
 films-0.186
 pals-0.181
Token w
Token ID266
Feature activation+2.28e-02

Pos logit contributions
use+0.163
 forgot+0.153
 managed+0.138
Maybe+0.135
 missed+0.134
Neg logit contributions
 pals-0.307
 films-0.300
apping-0.297
ums-0.295
aps-0.289
Tokenagged
Token ID14655
Feature activation-1.72e-02

Pos logit contributions
 films+0.025
ums+0.024
apping+0.024
 pals+0.024
aps+0.023
Neg logit contributions
 forgot-0.057
 managed-0.053
 missed-0.053
use-0.052
 never-0.052
Token his
Token ID465
Feature activation-8.88e-03

Pos logit contributions
 forgot+0.031
 never+0.029
 missed+0.028
 managed+0.028
 fell+0.027
Neg logit contributions
 films-0.033
ums-0.033
apping-0.032
birds-0.031
aps-0.031
Token tail
Token ID7894
Feature activation+5.15e-03

Pos logit contributions
 forgot+0.022
 missed+0.020
 had+0.020
 never+0.020
 fell+0.019
Neg logit contributions
 films-0.012
birds-0.011
ums-0.011
pill-0.011
apping-0.010
Token and
Token ID290
Feature activation-7.02e-02

Pos logit contributions
 films+0.011
ocol+0.011
ums+0.011
utes+0.010
pill+0.010
Neg logit contributions
 never-0.008
 forgot-0.007
 had-0.007
 missed-0.007
 fell-0.007
Token friends
Token ID2460
Feature activation-4.49e-02

Pos logit contributions
 forgot+0.038
 had+0.035
 missed+0.035
 knew+0.034
 fell+0.034
Neg logit contributions
apping-0.022
ums-0.022
aw-0.021
ning-0.021
pill-0.021
Token.
Token ID13
Feature activation-4.48e-04

Pos logit contributions
 never+0.069
 forgot+0.065
 had+0.063
 fell+0.061
 soon+0.059
Neg logit contributions
 films-0.089
ums-0.085
pill-0.084
ning-0.084
 pals-0.083
Token He
Token ID679
Feature activation-1.23e-01

Pos logit contributions
 forgot+0.001
 missed+0.001
 managed+0.001
 never+0.001
 fell+0.001
Neg logit contributions
 films-0.001
apping-0.001
ums-0.001
ning-0.001
birds-0.001
Token was
Token ID373
Feature activation-5.16e-02

Pos logit contributions
use+0.142
 forgot+0.131
 managed+0.113
 missed+0.112
Maybe+0.110
Neg logit contributions
 pals-0.281
 films-0.268
ums-0.266
apping-0.264
aps-0.258
Token glad
Token ID9675
Feature activation-2.36e-02

Pos logit contributions
 forgot+0.088
 never+0.084
 managed+0.080
 missed+0.079
 had+0.074
Neg logit contributions
 films-0.105
ums-0.101
birds-0.097
aps-0.096
aters-0.096
Token he
Token ID339
Feature activation-1.38e-01

Pos logit contributions
 forgot+0.031
 never+0.029
 missed+0.026
 had+0.025
 managed+0.025
Neg logit contributions
ums-0.055
 films-0.054
apping-0.053
aters-0.053
aps-0.052
Token had
Token ID550
Feature activation-7.80e-02

Pos logit contributions
use+0.160
 notice+0.129
 forgot+0.127
Maybe+0.123
 bud+0.111
Neg logit contributions
ning-0.289
 pals-0.283
apping-0.270
aw-0.266
 playing-0.258
Token been
Token ID587
Feature activation-5.79e-02

Pos logit contributions
 forgot+0.114
use+0.101
 missed+0.099
 managed+0.098
 never+0.094
Neg logit contributions
 films-0.161
 pals-0.160
ums-0.160
apping-0.159
aps-0.156
Token so
Token ID523
Feature activation-5.47e-02

Pos logit contributions
 forgot+0.101
 managed+0.094
 never+0.089
 missed+0.088
 had+0.084
Neg logit contributions
 films-0.112
ums-0.110
aps-0.104
apping-0.103
aters-0.103
Token loyal
Token ID9112
Feature activation-4.85e-04

Pos logit contributions
 forgot+0.098
use+0.092
 missed+0.088
 managed+0.087
'd+0.086
Neg logit contributions
apping-0.095
ums-0.094
 pals-0.094
aps-0.091
 films-0.091
Token.
Token ID13
Feature activation-3.18e-03

Pos logit contributions
 forgot+0.001
 never+0.001
 managed+0.001
 missed+0.001
 had+0.001
Neg logit contributions
 films-0.001
ums-0.001
apping-0.001
aps-0.001
 sym-0.001
Token Tim
Token ID5045
Feature activation-4.52e-02

Pos logit contributions
 films+0.005
ums+0.005
apping+0.005
 pals+0.005
aps+0.005
Neg logit contributions
 forgot-0.005
 missed-0.005
 never-0.004
 had-0.004
 managed-0.004
Tokenmy
Token ID1820
Feature activation-8.18e-02

Pos logit contributions
 forgot+0.079
use+0.075
 managed+0.073
 missed+0.072
Maybe+0.071
Neg logit contributions
 pals-0.075
ums-0.074
ning-0.072
 films-0.072
apping-0.071
Token felt
Token ID2936
Feature activation-4.77e-02

Pos logit contributions
use+0.109
 forgot+0.091
 managed+0.086
Maybe+0.085
 notice+0.083
Neg logit contributions
 pals-0.172
 films-0.168
ums-0.166
apping-0.164
ning-0.160
Token happy
Token ID3772
Feature activation-1.67e-02

Pos logit contributions
 forgot+0.094
use+0.092
 managed+0.085
'd+0.084
 missed+0.081
Neg logit contributions
 films-0.073
apping-0.071
ums-0.071
 pals-0.070
aps-0.069
Token that
Token ID326
Feature activation-6.74e-02

Pos logit contributions
 forgot+0.033
 missed+0.030
 managed+0.030
 never+0.029
use+0.029
Neg logit contributions
 films-0.027
ums-0.027
apping-0.026
 pals-0.026
aps-0.026
Token he
Token ID339
Feature activation-1.37e-01

Pos logit contributions
use+0.127
 forgot+0.122
 managed+0.113
'd+0.110
Maybe+0.105
Neg logit contributions
 films-0.116
 pals-0.108
ums-0.105
apping-0.104
 television-0.101
Token could
Token ID714
Feature activation-8.28e-02

Pos logit contributions
use+0.162
 notice+0.133
Maybe+0.118
Hmm+0.113
 bud+0.113
Neg logit contributions
ning-0.276
 pals-0.268
 music-0.262
apping-0.260
 playing-0.258
Token make
Token ID787
Feature activation-3.76e-02

Pos logit contributions
 forgot+0.130
use+0.119
 managed+0.114
 missed+0.113
Maybe+0.106
Neg logit contributions
 pals-0.160
ums-0.160
apping-0.158
aps-0.157
 films-0.157
Token someone
Token ID2130
Feature activation-6.85e-02

Pos logit contributions
 never+0.049
 forgot+0.046
 managed+0.043
 notice+0.043
 missed+0.042
Neg logit contributions
apping-0.083
aters-0.082
uff-0.079
aps-0.078
utes-0.078
Token feel
Token ID1254
Feature activation-3.75e-02

Pos logit contributions
 forgot+0.112
 missed+0.096
 managed+0.095
 never+0.094
 had+0.091
Neg logit contributions
ums-0.140
 films-0.138
apping-0.135
uff-0.134
 pals-0.134
Token better
Token ID1365
Feature activation-4.18e-02

Pos logit contributions
 forgot+0.056
use+0.052
 managed+0.049
 missed+0.048
'd+0.047
Neg logit contributions
 films-0.075
apping-0.074
 pals-0.073
ums-0.073
aps-0.072
Token\n
Token ID198
Feature activation-6.49e-03

Pos logit contributions
 films+0.015
ums+0.014
 pals+0.014
apping+0.014
aps+0.013
Neg logit contributions
 forgot-0.017
 missed-0.015
 managed-0.015
 never-0.015
use-0.015
Token\n
Token ID198
Feature activation-4.34e-03

Pos logit contributions
 forgot+0.015
 managed+0.014
use+0.014
 missed+0.013
'd+0.013
Neg logit contributions
ums-0.008
aps-0.008
apping-0.008
 films-0.008
 pals-0.008
TokenHis
Token ID6653
Feature activation-3.74e-02

Pos logit contributions
 forgot+0.008
use+0.008
 missed+0.007
'd+0.007
 managed+0.007
Neg logit contributions
ums-0.008
apping-0.007
aps-0.007
uff-0.007
aw-0.007
Token mom
Token ID1995
Feature activation-5.31e-02

Pos logit contributions
 forgot+0.072
 missed+0.065
 managed+0.064
use+0.063
 never+0.063
Neg logit contributions
 films-0.062
ums-0.061
apping-0.061
 pals-0.060
aps-0.059
Token said
Token ID531
Feature activation-1.98e-02

Pos logit contributions
use+0.079
 forgot+0.067
 notice+0.066
 surely+0.065
Maybe+0.062
Neg logit contributions
 pals-0.095
aw-0.094
ning-0.094
apping-0.094
 friends-0.092
Token he
Token ID339
Feature activation-1.37e-01

Pos logit contributions
 forgot+0.033
use+0.031
 managed+0.029
 missed+0.029
'd+0.028
Neg logit contributions
 films-0.036
ums-0.035
 pals-0.035
apping-0.035
aps-0.034
Token had
Token ID550
Feature activation-7.85e-02

Pos logit contributions
use+0.161
 notice+0.155
 surely+0.129
 forgot+0.128
Maybe+0.117
Neg logit contributions
 pals-0.287
ums-0.280
 friends-0.277
ning-0.270
oses-0.268
Token to
Token ID284
Feature activation-5.84e-02

Pos logit contributions
 forgot+0.165
use+0.154
 managed+0.151
Maybe+0.148
 missed+0.148
Neg logit contributions
 films-0.107
 pals-0.105
ums-0.102
apping-0.100
aps-0.099
Token take
Token ID1011
Feature activation-5.32e-02

Pos logit contributions
 forgot+0.085
 managed+0.075
 missed+0.075
use+0.074
'd+0.071
Neg logit contributions
 films-0.121
ums-0.120
apping-0.118
 pals-0.117
aps-0.115
Token a
Token ID257
Feature activation-4.82e-03

Pos logit contributions
 forgot+0.107
 managed+0.098
 missed+0.097
use+0.096
 never+0.096
Neg logit contributions
 films-0.082
ums-0.082
apping-0.081
 pals-0.080
aps-0.078
Token bath
Token ID7837
Feature activation-2.23e-02

Pos logit contributions
 forgot+0.011
 managed+0.010
 missed+0.010
use+0.010
 never+0.010
Neg logit contributions
 films-0.006
apping-0.006
 pals-0.006
ums-0.005
aps-0.005
Token\n
Token ID198
Feature activation-3.61e-03

Pos logit contributions
 forgot+0.012
use+0.012
 managed+0.011
 missed+0.011
 wouldn+0.011
Neg logit contributions
aps-0.006
apping-0.006
 films-0.006
ums-0.006
 pals-0.006
TokenTom
Token ID13787
Feature activation-3.75e-02

Pos logit contributions
 forgot+0.006
use+0.006
 missed+0.006
 managed+0.006
 wouldn+0.006
Neg logit contributions
apping-0.006
ums-0.006
 films-0.006
aps-0.006
uff-0.006
Tokenmy
Token ID1820
Feature activation-6.89e-02

Pos logit contributions
 forgot+0.063
use+0.060
 missed+0.058
 managed+0.056
Maybe+0.056
Neg logit contributions
 pals-0.066
apping-0.066
ums-0.065
ning-0.064
 films-0.063
Token was
Token ID373
Feature activation-4.32e-02

Pos logit contributions
use+0.097
 forgot+0.094
 missed+0.084
 managed+0.083
'd+0.082
Neg logit contributions
 pals-0.139
apping-0.138
ums-0.135
 films-0.133
ning-0.131
Token glad
Token ID9675
Feature activation-2.08e-02

Pos logit contributions
 forgot+0.076
 never+0.069
 missed+0.069
 managed+0.069
 fell+0.064
Neg logit contributions
 films-0.083
ums-0.081
 pals-0.078
apping-0.076
aps-0.076
Token he
Token ID339
Feature activation-1.37e-01

Pos logit contributions
 forgot+0.027
 never+0.025
 fell+0.023
 had+0.023
 missed+0.023
Neg logit contributions
 films-0.050
ums-0.049
apping-0.048
aters-0.048
aps-0.047
Token was
Token ID373
Feature activation-5.85e-02

Pos logit contributions
use+0.165
 notice+0.126
 forgot+0.121
Maybe+0.114
 dies+0.106
Neg logit contributions
ning-0.296
 pals-0.282
apping-0.276
 playing-0.269
aw-0.265
Token with
Token ID351
Feature activation-3.81e-02

Pos logit contributions
 forgot+0.106
 managed+0.096
 missed+0.094
 never+0.094
 fell+0.090
Neg logit contributions
 films-0.107
ums-0.106
 pals-0.102
apping-0.101
aps-0.100
Token his
Token ID465
Feature activation-1.87e-02

Pos logit contributions
 forgot+0.083
use+0.077
 managed+0.075
 missed+0.074
'd+0.074
Neg logit contributions
 films-0.054
 pals-0.053
ums-0.053
apping-0.051
aps-0.050
Token friend
Token ID1545
Feature activation-2.47e-02

Pos logit contributions
 forgot+0.046
 missed+0.043
 managed+0.043
 never+0.043
 fell+0.042
Neg logit contributions
 films-0.022
ums-0.021
apping-0.020
aps-0.019
ning-0.018
Token.
Token ID13
Feature activation-1.75e-03

Pos logit contributions
 forgot+0.042
 had+0.040
 managed+0.038
 missed+0.037
 would+0.037
Neg logit contributions
 films-0.047
utes-0.046
ums-0.045
aps-0.045
ocol-0.044
Token\n
Token ID198
Feature activation-5.77e-03

Pos logit contributions
 forgot+0.016
 managed+0.015
 missed+0.015
use+0.015
 never+0.014
Neg logit contributions
 films-0.011
ums-0.011
apping-0.011
 pals-0.011
aps-0.011
TokenBen
Token ID11696
Feature activation-5.82e-02

Pos logit contributions
 forgot+0.009
 managed+0.008
 missed+0.008
use+0.008
 never+0.008
Neg logit contributions
 films-0.012
ums-0.011
apping-0.011
 pals-0.011
aps-0.011
Token felt
Token ID2936
Feature activation-4.85e-02

Pos logit contributions
use+0.092
 surely+0.082
Maybe+0.081
 notice+0.079
t+0.079
Neg logit contributions
 pals-0.095
 friends-0.090
 films-0.090
ning-0.088
apping-0.088
Token sorry
Token ID7926
Feature activation-4.28e-02

Pos logit contributions
 forgot+0.077
use+0.074
'd+0.069
 managed+0.068
 wouldn+0.061
Neg logit contributions
 films-0.092
 pals-0.090
ums-0.089
apping-0.087
aps-0.086
Token that
Token ID326
Feature activation-7.07e-02

Pos logit contributions
 forgot+0.101
use+0.095
 managed+0.095
 missed+0.093
'd+0.090
Neg logit contributions
 films-0.051
apping-0.051
ums-0.050
 pals-0.049
aps-0.048
Token he
Token ID339
Feature activation-1.37e-01

Pos logit contributions
 forgot+0.125
 managed+0.111
 missed+0.111
use+0.110
 never+0.108
Neg logit contributions
 films-0.128
ums-0.127
apping-0.126
 pals-0.125
aps-0.122
Token had
Token ID550
Feature activation-7.66e-02

Pos logit contributions
use+0.159
 notice+0.126
 surely+0.124
Maybe+0.120
 forgot+0.114
Neg logit contributions
 pals-0.276
ning-0.270
aw-0.267
 friends-0.262
apping-0.258
Token pushed
Token ID7121
Feature activation-3.25e-02

Pos logit contributions
use+0.105
 forgot+0.105
'd+0.096
Maybe+0.094
 managed+0.093
Neg logit contributions
 pals-0.155
apping-0.148
ums-0.147
 films-0.145
aps-0.145
Token Jane
Token ID12091
Feature activation-6.35e-02

Pos logit contributions
 forgot+0.061
 managed+0.056
 missed+0.055
 never+0.055
use+0.055
Neg logit contributions
 films-0.055
ums-0.054
apping-0.054
 pals-0.053
aps-0.052
Token,
Token ID11
Feature activation-5.99e-02

Pos logit contributions
 forgot+0.117
 managed+0.105
 missed+0.105
 never+0.105
use+0.103
Neg logit contributions
 films-0.111
ums-0.109
 pals-0.108
apping-0.107
aps-0.104
Token so
Token ID523
Feature activation-4.65e-02

Pos logit contributions
use+0.128
 forgot+0.125
 managed+0.114
 missed+0.113
'd+0.112
Neg logit contributions
 pals-0.075
apping-0.075
 films-0.074
ums-0.069
ning-0.069
Token so
Token ID523
Feature activation-4.88e-02

Pos logit contributions
 forgot+0.086
 managed+0.077
 missed+0.077
use+0.076
 never+0.075
Neg logit contributions
 films-0.081
ums-0.080
apping-0.079
 pals-0.078
aps-0.077
Token grateful
Token ID14066
Feature activation-6.39e-03

Pos logit contributions
use+0.115
 forgot+0.109
 wouldn+0.105
'd+0.105
 managed+0.103
Neg logit contributions
apping-0.063
aps-0.059
ums-0.056
 pals-0.055
 films-0.050
Token and
Token ID290
Feature activation-7.96e-02

Pos logit contributions
 forgot+0.012
 never+0.011
 managed+0.010
 missed+0.010
 fell+0.010
Neg logit contributions
 films-0.012
ums-0.011
apping-0.011
 pals-0.011
aps-0.011
Token happy
Token ID3772
Feature activation-2.80e-03

Pos logit contributions
use+0.116
 forgot+0.103
'd+0.095
 managed+0.095
Maybe+0.093
Neg logit contributions
 pals-0.164
 films-0.159
apping-0.156
ums-0.155
aps-0.152
Token that
Token ID326
Feature activation-6.81e-02

Pos logit contributions
 forgot+0.005
 never+0.004
 fell+0.004
 missed+0.004
 had+0.004
Neg logit contributions
 films-0.006
ums-0.006
aters-0.005
apping-0.005
uff-0.005
Token he
Token ID339
Feature activation-1.36e-01

Pos logit contributions
use+0.124
 forgot+0.122
 managed+0.112
'd+0.109
 missed+0.105
Neg logit contributions
 films-0.113
 pals-0.111
ums-0.108
ning-0.104
apping-0.103
Token hugged
Token ID41130
Feature activation-2.81e-02

Pos logit contributions
use+0.172
 notice+0.132
Maybe+0.115
 forgot+0.114
 forgotten+0.106
Neg logit contributions
 pals-0.302
ning-0.282
apping-0.282
 friends-0.277
 films-0.273
Token the
Token ID262
Feature activation+1.24e-02

Pos logit contributions
 forgot+0.060
 managed+0.054
use+0.054
 missed+0.054
'd+0.052
Neg logit contributions
 films-0.041
ums-0.040
 pals-0.040
apping-0.039
aps-0.038
Token man
Token ID582
Feature activation-4.31e-02

Pos logit contributions
 films+0.020
 pals+0.020
apping+0.020
ums+0.020
aps+0.019
Neg logit contributions
 forgot-0.023
use-0.021
 missed-0.021
 managed-0.021
 never-0.020
Token.
Token ID13
Feature activation+2.91e-03

Pos logit contributions
 forgot+0.074
 never+0.070
 had+0.068
 managed+0.067
 fell+0.067
Neg logit contributions
 films-0.082
ums-0.079
 pals-0.078
utes-0.077
uff-0.076
Token From
Token ID3574
Feature activation-3.99e-02

Pos logit contributions
 films+0.005
apping+0.004
ums+0.004
 pals+0.004
aps+0.004
Neg logit contributions
 forgot-0.006
 missed-0.005
 never-0.005
 managed-0.005
 fell-0.005
Token lawn
Token ID21133
Feature activation+3.53e-03

Pos logit contributions
apping+0.022
ums+0.021
aps+0.021
 pals+0.018
 films+0.018
Neg logit contributions
 forgot-0.029
 would-0.028
 had-0.027
 never-0.026
 fell-0.026
Token.
Token ID13
Feature activation+1.03e-02

Pos logit contributions
 films+0.006
ums+0.006
 pals+0.006
apping+0.006
ocol+0.006
Neg logit contributions
 forgot-0.006
 never-0.006
 would-0.006
 had-0.006
 fell-0.006
Token Tim
Token ID5045
Feature activation-4.99e-02

Pos logit contributions
 films+0.018
apping+0.017
ums+0.017
aps+0.016
 pals+0.016
Neg logit contributions
 forgot-0.020
 missed-0.019
 had-0.018
 never-0.018
 fell-0.018
Token was
Token ID373
Feature activation-4.47e-02

Pos logit contributions
use+0.075
Maybe+0.059
 forgot+0.057
 notice+0.056
t+0.055
Neg logit contributions
 pals-0.098
ning-0.095
ums-0.094
apping-0.092
 friends-0.090
Token happy
Token ID3772
Feature activation-1.32e-02

Pos logit contributions
 forgot+0.077
 never+0.070
 missed+0.069
 managed+0.068
 fell+0.063
Neg logit contributions
 films-0.088
ums-0.086
 pals-0.083
apping-0.083
aps-0.082
Token he
Token ID339
Feature activation-1.36e-01

Pos logit contributions
 forgot+0.024
 never+0.021
 missed+0.021
 managed+0.021
 fell+0.020
Neg logit contributions
 films-0.025
ums-0.024
apping-0.023
 pals-0.023
aps-0.023
Token realized
Token ID6939
Feature activation-4.83e-02

Pos logit contributions
use+0.161
Maybe+0.110
 notice+0.107
 forgot+0.101
 bud+0.097
Neg logit contributions
apping-0.287
ning-0.285
 pals-0.279
 playing-0.275
 music-0.274
Token his
Token ID465
Feature activation-1.22e-02

Pos logit contributions
 forgot+0.075
 never+0.072
 managed+0.066
 wouldn+0.066
 had+0.065
Neg logit contributions
ums-0.101
apping-0.098
aps-0.097
 films-0.097
 pals-0.092
Token mistake
Token ID7457
Feature activation-2.03e-02

Pos logit contributions
 forgot+0.031
 missed+0.028
 managed+0.028
 never+0.028
use+0.028
Neg logit contributions
 films-0.013
ums-0.013
apping-0.013
aps-0.012
 pals-0.012
Token and
Token ID290
Feature activation-7.66e-02

Pos logit contributions
 forgot+0.044
 managed+0.040
 missed+0.040
 never+0.039
use+0.039
Neg logit contributions
 films-0.029
ums-0.028
 pals-0.028
apping-0.028
aps-0.027
Token cleaned
Token ID20750
Feature activation-2.84e-02

Pos logit contributions
use+0.093
 forgot+0.074
Maybe+0.068
'd+0.068
 managed+0.064
Neg logit contributions
 pals-0.181
 films-0.174
apping-0.173
ums-0.171
ning-0.168
Token penny
Token ID26369
Feature activation-3.19e-02

Pos logit contributions
 forgot+0.003
 managed+0.002
 missed+0.002
use+0.002
 never+0.002
Neg logit contributions
 films-0.003
ums-0.003
apping-0.003
 pals-0.002
aps-0.002
Token.
Token ID13
Feature activation+9.31e-03

Pos logit contributions
 forgot+0.068
 managed+0.062
 missed+0.061
use+0.061
 never+0.060
Neg logit contributions
 films-0.046
ums-0.045
apping-0.045
 pals-0.044
aps-0.043
Token The
Token ID383
Feature activation+1.19e-02

Pos logit contributions
 films+0.012
ums+0.012
apping+0.012
 pals+0.012
aps+0.011
Neg logit contributions
 forgot-0.021
 missed-0.019
 managed-0.019
use-0.019
 never-0.019
Token man
Token ID582
Feature activation-4.97e-02

Pos logit contributions
 films+0.017
ums+0.017
 pals+0.016
apping+0.016
aps+0.016
Neg logit contributions
 forgot-0.025
use-0.023
 managed-0.023
 wouldn-0.022
 missed-0.022
Token said
Token ID531
Feature activation-2.75e-02

Pos logit contributions
use+0.076
 forgot+0.068
 notice+0.066
Maybe+0.062
 forgotten+0.061
Neg logit contributions
 pals-0.095
apping-0.089
 films-0.089
 friends-0.089
ning-0.088
Token he
Token ID339
Feature activation-1.36e-01

Pos logit contributions
 forgot+0.055
use+0.051
 managed+0.049
'd+0.048
 missed+0.048
Neg logit contributions
 films-0.042
 pals-0.040
apping-0.040
ums-0.040
aps-0.038
Token could
Token ID714
Feature activation-8.17e-02

Pos logit contributions
use+0.161
 notice+0.156
 forgot+0.128
 surely+0.126
Maybe+0.125
Neg logit contributions
 pals-0.282
 friends-0.278
ums-0.269
ning-0.261
birds-0.261
Token help
Token ID1037
Feature activation-5.49e-02

Pos logit contributions
 forgot+0.148
use+0.135
 managed+0.132
 missed+0.126
Maybe+0.126
Neg logit contributions
ums-0.125
aps-0.125
 pals-0.120
ap-0.116
apping-0.115
Token her
Token ID607
Feature activation-1.12e-02

Pos logit contributions
 forgot+0.114
 managed+0.104
 missed+0.104
use+0.103
'd+0.100
Neg logit contributions
 films-0.083
ums-0.082
 pals-0.079
apping-0.079
aps-0.077
Token find
Token ID1064
Feature activation-7.85e-02

Pos logit contributions
 forgot+0.018
use+0.015
 managed+0.015
 missed+0.015
'd+0.015
Neg logit contributions
ums-0.022
 films-0.022
apping-0.022
 pals-0.022
aps-0.021
Token it
Token ID340
Feature activation-1.05e-01

Pos logit contributions
 forgot+0.147
 missed+0.129
use+0.128
 managed+0.127
'd+0.127
Neg logit contributions
ums-0.134
 films-0.134
 pals-0.132
aps-0.128
ning-0.125
Token very
Token ID845
Feature activation-6.04e-02

Pos logit contributions
 forgot+0.049
 missed+0.047
 never+0.044
 managed+0.042
 fell+0.041
Neg logit contributions
ums-0.099
 films-0.099
 pals-0.095
aters-0.094
apping-0.094
Token afraid
Token ID7787
Feature activation-1.32e-02

Pos logit contributions
 forgot+0.094
 managed+0.082
use+0.082
 missed+0.082
 never+0.079
Neg logit contributions
 films-0.122
ums-0.121
apping-0.120
 pals-0.119
aps-0.118
Token.
Token ID13
Feature activation+8.37e-03

Pos logit contributions
 forgot+0.024
 fell+0.023
 never+0.023
 had+0.023
 missed+0.022
Neg logit contributions
 films-0.022
ums-0.022
 pals-0.021
ning-0.020
apping-0.020
Token He
Token ID679
Feature activation-1.16e-01

Pos logit contributions
ums+0.013
 films+0.013
apping+0.013
 pals+0.013
aps+0.012
Neg logit contributions
 forgot-0.017
use-0.015
 managed-0.015
 missed-0.015
 never-0.015
Token wished
Token ID16555
Feature activation-6.28e-02

Pos logit contributions
use+0.148
 forgot+0.108
 notice+0.105
t+0.105
Maybe+0.104
Neg logit contributions
aw-0.239
 pals-0.236
 lions-0.231
 films-0.229
 monkeys-0.229
Token he
Token ID339
Feature activation-1.36e-01

Pos logit contributions
 forgot+0.093
use+0.081
 managed+0.081
 missed+0.080
 never+0.077
Neg logit contributions
 films-0.131
ums-0.129
apping-0.128
 pals-0.128
aps-0.125
Token had
Token ID550
Feature activation-8.06e-02

Pos logit contributions
use+0.153
 notice+0.141
 surely+0.100
 forgot+0.099
Maybe+0.097
Neg logit contributions
 lions-0.288
 monkeys-0.280
 pals-0.278
 friends-0.270
ning-0.266
Token listened
Token ID16399
Feature activation-3.68e-02

Pos logit contributions
 forgot+0.089
use+0.084
 managed+0.074
Maybe+0.073
 missed+0.073
Neg logit contributions
 pals-0.187
 films-0.184
ums-0.179
apping-0.179
ning-0.178
Token to
Token ID284
Feature activation-6.69e-02

Pos logit contributions
 forgot+0.083
use+0.076
 managed+0.075
 missed+0.075
'd+0.072
Neg logit contributions
apping-0.046
 films-0.045
aps-0.044
 pals-0.044
ums-0.043
Token Lily
Token ID20037
Feature activation-5.05e-02

Pos logit contributions
 forgot+0.129
 missed+0.115
 managed+0.115
use+0.114
 never+0.114
Neg logit contributions
apping-0.110
ums-0.110
 films-0.110
 pals-0.107
aps-0.106
Token.
Token ID13
Feature activation+2.29e-03

Pos logit contributions
 forgot+0.089
 never+0.087
 had+0.083
 fell+0.080
 would+0.080
Neg logit contributions
 films-0.085
 pals-0.085
ums-0.085
apping-0.084
utes-0.082
Token Sammy
Token ID44067
Feature activation-5.43e-02

Pos logit contributions
 films+0.020
ums+0.020
aps+0.020
apping+0.019
 pals+0.019
Neg logit contributions
 forgot-0.026
 missed-0.025
 managed-0.024
 needed-0.024
 fell-0.023
Token was
Token ID373
Feature activation-4.14e-02

Pos logit contributions
use+0.072
 forgot+0.062
Maybe+0.060
 managed+0.057
 missed+0.056
Neg logit contributions
 pals-0.111
 films-0.106
ning-0.106
apping-0.106
ums-0.103
Token so
Token ID523
Feature activation-4.60e-02

Pos logit contributions
 forgot+0.070
 missed+0.064
 never+0.063
 managed+0.063
 fell+0.059
Neg logit contributions
 films-0.082
ums-0.080
 pals-0.077
apping-0.076
aps-0.076
Token thankful
Token ID25535
Feature activation-1.85e-02

Pos logit contributions
use+0.095
 forgot+0.093
'd+0.088
 wouldn+0.087
 managed+0.087
Neg logit contributions
apping-0.071
ums-0.067
aps-0.066
 pals-0.064
 films-0.062
Token that
Token ID326
Feature activation-6.82e-02

Pos logit contributions
 forgot+0.039
 managed+0.035
 missed+0.035
use+0.035
 never+0.034
Neg logit contributions
 films-0.028
ums-0.027
apping-0.027
 pals-0.027
aps-0.026
Token he
Token ID339
Feature activation-1.36e-01

Pos logit contributions
 forgot+0.116
use+0.106
 managed+0.102
 missed+0.102
'd+0.098
Neg logit contributions
 films-0.127
ums-0.126
 pals-0.125
apping-0.124
aps-0.121
Token had
Token ID550
Feature activation-7.98e-02

Pos logit contributions
use+0.160
 notice+0.113
Maybe+0.101
 forgot+0.098
t+0.091
Neg logit contributions
 pals-0.319
apping-0.292
 friends-0.290
 films-0.286
ning-0.286
Token surrendered
Token ID29209
Feature activation-3.45e-02

Pos logit contributions
use+0.101
 forgot+0.090
Maybe+0.080
'd+0.076
 missed+0.076
Neg logit contributions
 pals-0.181
apping-0.174
aps-0.170
ums-0.169
 films-0.167
Token and
Token ID290
Feature activation-6.94e-02

Pos logit contributions
 forgot+0.066
 managed+0.060
 never+0.059
 missed+0.059
use+0.058
Neg logit contributions
 films-0.058
ums-0.057
apping-0.057
 pals-0.056
aps-0.055
Token accepted
Token ID6292
Feature activation-2.93e-02

Pos logit contributions
use+0.083
 forgot+0.080
 missed+0.069
'd+0.069
 managed+0.067
Neg logit contributions
 pals-0.157
 films-0.152
apping-0.152
aps-0.149
ums-0.149
Token that
Token ID326
Feature activation-4.65e-02

Pos logit contributions
 forgot+0.042
 never+0.041
 managed+0.039
 had+0.038
 missed+0.037
Neg logit contributions
apping-0.062
ums-0.061
 films-0.060
aps-0.059
aters-0.059
Token\n
Token ID198
Feature activation-5.57e-03

Pos logit contributions
 forgot+0.013
 managed+0.012
 missed+0.012
use+0.012
 never+0.012
Neg logit contributions
 films-0.010
ums-0.010
apping-0.010
 pals-0.010
aps-0.010
TokenGeorge
Token ID20191
Feature activation-7.64e-02

Pos logit contributions
 forgot+0.011
use+0.010
 missed+0.010
 wouldn+0.009
 bumped+0.009
Neg logit contributions
 films-0.009
 pals-0.009
apping-0.009
aps-0.009
ums-0.008
Token was
Token ID373
Feature activation-5.01e-02

Pos logit contributions
use+0.101
 forgot+0.095
 notice+0.083
Maybe+0.082
 forgotten+0.081
Neg logit contributions
 pals-0.162
ums-0.155
apping-0.153
 films-0.152
aps-0.150
Token glad
Token ID9675
Feature activation-2.24e-02

Pos logit contributions
 forgot+0.086
 managed+0.076
use+0.076
 missed+0.076
 never+0.073
Neg logit contributions
 films-0.093
ums-0.092
apping-0.091
 pals-0.091
aps-0.089
Token that
Token ID326
Feature activation-7.20e-02

Pos logit contributions
 forgot+0.037
 never+0.033
 missed+0.032
 managed+0.032
 fell+0.031
Neg logit contributions
 films-0.045
ums-0.044
apping-0.043
aps-0.042
 pals-0.042
Token he
Token ID339
Feature activation-1.36e-01

Pos logit contributions
 forgot+0.136
use+0.129
 managed+0.123
 missed+0.119
'd+0.117
Neg logit contributions
 films-0.122
 pals-0.116
apping-0.111
ums-0.111
aps-0.108
Token was
Token ID373
Feature activation-5.92e-02

Pos logit contributions
use+0.144
 notice+0.129
Maybe+0.104
 forgot+0.103
 bumped+0.098
Neg logit contributions
 pals-0.283
ning-0.279
aw-0.267
 such-0.258
 friends-0.255
Token able
Token ID1498
Feature activation-4.72e-02

Pos logit contributions
 forgot+0.103
use+0.091
 missed+0.091
 managed+0.091
 never+0.087
Neg logit contributions
 films-0.108
ums-0.107
apping-0.107
 pals-0.106
aps-0.103
Token to
Token ID284
Feature activation-7.02e-02

Pos logit contributions
 forgot+0.082
 never+0.074
 managed+0.074
 missed+0.072
 fell+0.068
Neg logit contributions
 films-0.094
ums-0.091
apping-0.087
uff-0.085
aps-0.085
Token help
Token ID1037
Feature activation-5.53e-02

Pos logit contributions
 forgot+0.099
 missed+0.088
use+0.086
 managed+0.085
'd+0.081
Neg logit contributions
ums-0.146
 pals-0.146
 films-0.145
apping-0.142
aps-0.142
Token,
Token ID11
Feature activation-5.34e-02

Pos logit contributions
 forgot+0.106
 never+0.097
 managed+0.096
 missed+0.095
use+0.094
Neg logit contributions
apping-0.091
ums-0.091
 films-0.089
aps-0.088
 pals-0.088
Token.
Token ID13
Feature activation+5.50e-03

Pos logit contributions
 forgot+0.031
 never+0.029
 managed+0.029
 had+0.029
 missed+0.028
Neg logit contributions
ums-0.026
 films-0.026
apping-0.025
uff-0.024
aps-0.024
Token Officer
Token ID10391
Feature activation+3.19e-02

Pos logit contributions
 films+0.009
ums+0.008
apping+0.008
 pals+0.008
aps+0.008
Neg logit contributions
 forgot-0.011
 missed-0.010
 never-0.010
 managed-0.010
use-0.010
Token Roy
Token ID9817
Feature activation-5.78e-02

Pos logit contributions
 films+0.048
ums+0.047
apping+0.047
aps+0.045
 pals+0.044
Neg logit contributions
 forgot-0.067
 managed-0.061
use-0.060
 never-0.060
 missed-0.060
Token was
Token ID373
Feature activation-4.90e-02

Pos logit contributions
use+0.081
 forgot+0.078
 missed+0.068
Maybe+0.068
 managed+0.068
Neg logit contributions
 pals-0.117
 films-0.112
ums-0.111
apping-0.110
aps-0.106
Token happy
Token ID3772
Feature activation-1.60e-02

Pos logit contributions
 forgot+0.091
 never+0.084
 missed+0.083
 managed+0.083
 fell+0.078
Neg logit contributions
 films-0.087
ums-0.084
apping-0.082
 pals-0.082
aps-0.081
Token he
Token ID339
Feature activation-1.36e-01

Pos logit contributions
 forgot+0.031
 managed+0.028
 missed+0.028
 never+0.028
use+0.027
Neg logit contributions
 films-0.027
ums-0.026
apping-0.025
 pals-0.025
aps-0.025
Token could
Token ID714
Feature activation-8.11e-02

Pos logit contributions
use+0.134
 notice+0.113
Maybe+0.104
 forgot+0.103
 damage+0.096
Neg logit contributions
 pals-0.291
ning-0.286
apping-0.279
over-0.275
 playing-0.267
Token help
Token ID1037
Feature activation-5.82e-02

Pos logit contributions
 forgot+0.115
use+0.102
 missed+0.100
 managed+0.099
Maybe+0.093
Neg logit contributions
 pals-0.171
ums-0.170
 films-0.168
apping-0.168
aps-0.166
Token.
Token ID13
Feature activation+4.02e-03

Pos logit contributions
 forgot+0.109
 never+0.100
 managed+0.099
 missed+0.097
use+0.095
Neg logit contributions
apping-0.098
 films-0.097
ums-0.097
aps-0.096
 pals-0.094
Token<|endoftext|>
Token ID50256
Feature activation+3.30e-02

Pos logit contributions
 films+0.007
apping+0.006
ums+0.006
 pals+0.006
ning+0.006
Neg logit contributions
 forgot-0.008
 never-0.007
 missed-0.007
 managed-0.007
Maybe-0.007
TokenOnce
Token ID7454
Feature activation-2.27e-02

Pos logit contributions
 films+0.060
ums+0.060
apping+0.058
 pals+0.058
ning+0.054
Neg logit contributions
 forgot-0.056
 missed-0.052
 had-0.052
 managed-0.052
 never-0.051
Token you
Token ID345
Feature activation-9.77e-02

Pos logit contributions
 forgot+0.007
use+0.007
 never+0.007
 managed+0.006
 missed+0.006
Neg logit contributions
ums-0.011
apping-0.010
 films-0.010
 pals-0.010
aters-0.010
Token,
Token ID11
Feature activation-4.27e-02

Pos logit contributions
 forgot+0.184
 never+0.169
 missed+0.164
 managed+0.163
use+0.162
Neg logit contributions
 films-0.163
ums-0.160
apping-0.158
aps-0.157
ning-0.153
Token playful
Token ID34264
Feature activation+1.66e-02

Pos logit contributions
 forgot+0.087
 never+0.079
 missed+0.078
 managed+0.077
use+0.076
Neg logit contributions
ums-0.067
apping-0.065
aps-0.064
 films-0.063
 pals-0.063
Token dog
Token ID3290
Feature activation-1.73e-02

Pos logit contributions
apping+0.028
ums+0.028
aps+0.027
 films+0.026
 pals+0.025
Neg logit contributions
 forgot-0.031
use-0.029
 never-0.028
 fell-0.028
 would-0.028
Token,"
Token ID553
Feature activation+1.34e-02

Pos logit contributions
 forgot+0.027
 never+0.024
 missed+0.023
 fell+0.023
use+0.023
Neg logit contributions
 films-0.036
apping-0.035
ums-0.035
aps-0.034
uff-0.034
Token he
Token ID339
Feature activation-1.36e-01

Pos logit contributions
ums+0.025
apping+0.025
 films+0.025
 pals+0.024
aps+0.024
Neg logit contributions
 forgot-0.023
 missed-0.020
 managed-0.020
 never-0.020
use-0.020
Token said
Token ID531
Feature activation-1.14e-02

Pos logit contributions
 forgot+0.167
use+0.147
 managed+0.143
 missed+0.141
 never+0.136
Neg logit contributions
 films-0.309
 pals-0.307
ums-0.306
apping-0.303
aps-0.298
Token.
Token ID13
Feature activation+2.03e-03

Pos logit contributions
 forgot+0.023
 fell+0.021
 managed+0.021
 never+0.021
 missed+0.021
Neg logit contributions
 films-0.019
ums-0.018
 pals-0.018
uff-0.017
aters-0.017
Token\n
Token ID198
Feature activation-4.28e-03

Pos logit contributions
ums+0.004
apping+0.004
 films+0.004
uff+0.004
 pals+0.004
Neg logit contributions
 forgot-0.003
 missed-0.003
 never-0.003
 had-0.003
 fell-0.003
Token\n
Token ID198
Feature activation-3.58e-03

Pos logit contributions
 forgot+0.007
 never+0.007
 missed+0.007
 managed+0.007
 fell+0.007
Neg logit contributions
ums-0.008
 films-0.008
apping-0.007
 pals-0.007
aps-0.007
TokenMax
Token ID11518
Feature activation-6.99e-02

Pos logit contributions
 forgot+0.006
use+0.006
 missed+0.006
 managed+0.005
 wouldn+0.005
Neg logit contributions
 films-0.007
apping-0.006
 pals-0.006
ums-0.006
aps-0.006
Token\n
Token ID198
Feature activation-4.69e-03

Pos logit contributions
 forgot+0.012
 managed+0.011
 missed+0.010
use+0.010
 never+0.010
Neg logit contributions
 films-0.009
ums-0.008
apping-0.008
 pals-0.008
aps-0.008
TokenTom
Token ID13787
Feature activation-5.37e-02

Pos logit contributions
 forgot+0.008
use+0.007
 missed+0.007
 managed+0.007
 wouldn+0.007
Neg logit contributions
apping-0.009
ums-0.009
 films-0.009
aps-0.009
 pals-0.008
Token was
Token ID373
Feature activation-4.65e-02

Pos logit contributions
use+0.080
 forgot+0.075
Maybe+0.067
 managed+0.067
 missed+0.067
Neg logit contributions
 pals-0.104
ums-0.102
apping-0.102
 films-0.101
ning-0.098
Token happy
Token ID3772
Feature activation-1.33e-02

Pos logit contributions
 forgot+0.080
 never+0.073
 missed+0.072
 managed+0.072
 had+0.067
Neg logit contributions
 films-0.091
ums-0.089
 pals-0.086
apping-0.086
aps-0.086
Token that
Token ID326
Feature activation-6.91e-02

Pos logit contributions
 forgot+0.024
 never+0.022
 missed+0.021
 managed+0.021
 had+0.021
Neg logit contributions
 films-0.025
ums-0.024
apping-0.023
 pals-0.023
aters-0.023
Token he
Token ID339
Feature activation-1.35e-01

Pos logit contributions
 forgot+0.107
use+0.095
 managed+0.094
 missed+0.094
 never+0.090
Neg logit contributions
 films-0.140
ums-0.138
apping-0.137
 pals-0.136
aps-0.133
Token had
Token ID550
Feature activation-8.06e-02

Pos logit contributions
use+0.146
 notice+0.103
Maybe+0.101
 forgot+0.098
Hmm+0.093
Neg logit contributions
 pals-0.302
apping-0.291
ning-0.291
 music-0.282
 films-0.279
Token chosen
Token ID7147
Feature activation-5.74e-02

Pos logit contributions
 forgot+0.133
use+0.120
 missed+0.117
 managed+0.117
 never+0.113
Neg logit contributions
 films-0.153
ums-0.151
apping-0.150
 pals-0.150
aps-0.147
Token to
Token ID284
Feature activation-7.00e-02

Pos logit contributions
 forgot+0.094
 never+0.090
 managed+0.087
 missed+0.084
 had+0.082
Neg logit contributions
ums-0.111
apping-0.107
 films-0.104
aters-0.103
uff-0.103
Token watch
Token ID2342
Feature activation-4.11e-02

Pos logit contributions
 forgot+0.102
 managed+0.089
 missed+0.088
use+0.087
 never+0.086
Neg logit contributions
 films-0.148
ums-0.147
apping-0.145
 pals-0.144
aps-0.142
Token the
Token ID262
Feature activation+2.64e-02

Pos logit contributions
 forgot+0.060
 never+0.060
 would+0.057
 had+0.057
 missed+0.056
Neg logit contributions
ums-0.079
apping-0.077
 pals-0.077
aps-0.076
uff-0.076
Token nodded
Token ID14464
Feature activation-3.90e-02

Pos logit contributions
 forgot+0.090
use+0.085
 missed+0.077
 managed+0.075
'd+0.075
Neg logit contributions
 pals-0.176
apping-0.174
 films-0.174
ums-0.174
aps-0.170
Token.
Token ID13
Feature activation+3.74e-03

Pos logit contributions
 forgot+0.083
 managed+0.077
 missed+0.076
 never+0.076
use+0.075
Neg logit contributions
 films-0.056
ums-0.055
 pals-0.054
apping-0.053
aps-0.052
Token He
Token ID679
Feature activation-1.21e-01

Pos logit contributions
 films+0.006
ums+0.006
apping+0.006
 pals+0.006
aps+0.006
Neg logit contributions
 forgot-0.007
 missed-0.006
 managed-0.006
 never-0.006
use-0.006
Token was
Token ID373
Feature activation-4.99e-02

Pos logit contributions
use+0.166
 forgot+0.129
 notice+0.126
 forgotten+0.121
 missed+0.119
Neg logit contributions
 pals-0.256
apping-0.244
ning-0.239
ums-0.238
aps-0.237
Token happy
Token ID3772
Feature activation-1.40e-02

Pos logit contributions
 forgot+0.085
 managed+0.078
 never+0.078
 missed+0.077
 fell+0.071
Neg logit contributions
 films-0.098
ums-0.096
 pals-0.092
apping-0.092
aps-0.091
Token he
Token ID339
Feature activation-1.35e-01

Pos logit contributions
 forgot+0.025
 never+0.023
 managed+0.022
 missed+0.022
 fell+0.021
Neg logit contributions
 films-0.026
ums-0.026
apping-0.024
 pals-0.024
aps-0.024
Token could
Token ID714
Feature activation-7.89e-02

Pos logit contributions
use+0.148
 notice+0.114
Maybe+0.112
 forgot+0.109
 missed+0.099
Neg logit contributions
ning-0.295
 pals-0.287
apping-0.286
 music-0.269
 playing-0.268
Token balance
Token ID5236
Feature activation-4.50e-02

Pos logit contributions
 forgot+0.124
use+0.110
 missed+0.109
 managed+0.108
 never+0.103
Neg logit contributions
 films-0.155
ums-0.155
apping-0.154
 pals-0.153
aps-0.151
Token on
Token ID319
Feature activation-2.57e-02

Pos logit contributions
 forgot+0.083
 managed+0.076
 never+0.075
 missed+0.075
use+0.074
Neg logit contributions
 films-0.076
ums-0.076
apping-0.076
 pals-0.075
aps-0.073
Token the
Token ID262
Feature activation+3.91e-02

Pos logit contributions
 forgot+0.046
 never+0.043
 managed+0.042
 missed+0.041
 fell+0.040
Neg logit contributions
apping-0.047
ums-0.046
aps-0.045
 pals-0.044
 films-0.044
Token bush
Token ID24484
Feature activation+2.62e-03

Pos logit contributions
 films+0.060
ums+0.059
 pals+0.058
apping+0.057
aps+0.056
Neg logit contributions
 forgot-0.080
 managed-0.073
 missed-0.072
use-0.072
 never-0.070
Token Sam
Token ID3409
Feature activation-5.95e-02

Pos logit contributions
 films+0.013
apping+0.012
ums+0.012
 pals+0.012
aps+0.012
Neg logit contributions
 forgot-0.016
 never-0.015
 missed-0.015
 managed-0.015
 had-0.014
Token was
Token ID373
Feature activation-4.58e-02

Pos logit contributions
use+0.077
 forgot+0.074
 managed+0.068
 missed+0.064
Maybe+0.064
Neg logit contributions
 pals-0.123
 films-0.122
ums-0.120
apping-0.118
ning-0.117
Token so
Token ID523
Feature activation-4.89e-02

Pos logit contributions
 forgot+0.074
 never+0.073
 missed+0.068
 managed+0.066
 had+0.062
Neg logit contributions
 films-0.096
ums-0.094
aps-0.092
apping-0.091
aters-0.090
Token happy
Token ID3772
Feature activation-8.99e-03

Pos logit contributions
 forgot+0.093
use+0.085
 managed+0.084
 missed+0.083
 never+0.081
Neg logit contributions
 films-0.081
ums-0.081
apping-0.080
 pals-0.079
aps-0.078
Token that
Token ID326
Feature activation-6.92e-02

Pos logit contributions
 forgot+0.013
 never+0.013
 had+0.011
 fell+0.011
 would+0.011
Neg logit contributions
 films-0.020
ums-0.020
aters-0.019
pieces-0.019
ocol-0.018
Token he
Token ID339
Feature activation-1.35e-01

Pos logit contributions
 forgot+0.104
use+0.094
 managed+0.092
 missed+0.090
'd+0.086
Neg logit contributions
 films-0.143
ums-0.139
 pals-0.139
apping-0.137
aps-0.134
Token named
Token ID3706
Feature activation-1.99e-02

Pos logit contributions
use+0.145
Maybe+0.104
 forgot+0.099
 notice+0.097
 missed+0.089
Neg logit contributions
 pals-0.310
apping-0.298
ning-0.297
 films-0.296
 music-0.292
Token the
Token ID262
Feature activation+1.11e-02

Pos logit contributions
 forgot+0.029
 never+0.028
 missed+0.028
 had+0.027
 would+0.026
Neg logit contributions
apping-0.044
ums-0.042
uff-0.041
aps-0.040
utes-0.038
Token bunny
Token ID44915
Feature activation-2.76e-02

Pos logit contributions
apping+0.017
ums+0.017
aps+0.016
 films+0.016
oses+0.015
Neg logit contributions
 forgot-0.023
 missed-0.021
 never-0.021
use-0.021
 managed-0.021
Token Honey
Token ID21788
Feature activation-3.64e-02

Pos logit contributions
 forgot+0.053
 would+0.049
 had+0.049
 never+0.049
 missed+0.048
Neg logit contributions
ums-0.047
 films-0.046
apping-0.046
uff-0.044
aps-0.044
Token Bunny
Token ID28683
Feature activation-4.15e-02

Pos logit contributions
 forgot+0.076
 would+0.069
 had+0.068
 never+0.068
 missed+0.067
Neg logit contributions
ums-0.057
apping-0.057
 films-0.056
aps-0.056
 pals-0.053
Token.
Token ID13
Feature activation+3.75e-03

Pos logit contributions
 forgot+0.066
 had+0.064
 would+0.062
 never+0.062
 wouldn+0.059
Neg logit contributions
 films-0.084
 pals-0.079
 sym-0.079
aps-0.078
ums-0.078
Token He
Token ID679
Feature activation-1.23e-01

Pos logit contributions
 films+0.007
 pals+0.006
ums+0.006
ning+0.006
birds+0.006
Neg logit contributions
 forgot-0.007
 missed-0.006
 never-0.006
 managed-0.006
 fell-0.006
Token was
Token ID373
Feature activation-5.07e-02

Pos logit contributions
use+0.188
 notice+0.148
 forgot+0.141
t+0.136
 forgotten+0.131
Neg logit contributions
 pals-0.252
 friends-0.232
ums-0.227
apping-0.221
aw-0.220
Token so
Token ID523
Feature activation-5.11e-02

Pos logit contributions
 forgot+0.093
 never+0.086
 missed+0.085
 managed+0.084
 had+0.078
Neg logit contributions
 films-0.094
ums-0.090
 pals-0.088
apping-0.087
aps-0.087
Token happy
Token ID3772
Feature activation-1.22e-02

Pos logit contributions
 forgot+0.107
use+0.107
 managed+0.097
 missed+0.096
'd+0.096
Neg logit contributions
apping-0.075
ums-0.072
aps-0.071
 pals-0.070
 films-0.068
Token he
Token ID339
Feature activation-1.35e-01

Pos logit contributions
 forgot+0.018
 never+0.017
 had+0.016
 fell+0.015
 managed+0.015
Neg logit contributions
 films-0.028
ums-0.027
aters-0.026
utes-0.025
apping-0.025
Token could
Token ID714
Feature activation-8.21e-02

Pos logit contributions
use+0.149
Maybe+0.114
 notice+0.110
 forgot+0.098
 missed+0.094
Neg logit contributions
 pals-0.297
apping-0.291
ning-0.288
 music-0.277
aw-0.275
Token ride
Token ID6594
Feature activation-4.38e-02

Pos logit contributions
 forgot+0.131
use+0.124
 missed+0.118
 managed+0.114
Maybe+0.109
Neg logit contributions
ums-0.156
 pals-0.154
apping-0.153
aps-0.153
 films-0.149
Token around
Token ID1088
Feature activation-4.49e-02

Pos logit contributions
 forgot+0.073
 never+0.069
 managed+0.068
 missed+0.066
 had+0.065
Neg logit contributions
ums-0.082
apping-0.081
 films-0.081
 pals-0.079
aps-0.078
Token the
Token ID262
Feature activation+3.39e-02

Pos logit contributions
 forgot+0.086
 managed+0.079
 never+0.078
 missed+0.078
use+0.076
Neg logit contributions
ums-0.075
 films-0.073
apping-0.073
 pals-0.072
aps-0.071
Token park
Token ID3952
Feature activation-2.30e-02

Pos logit contributions
 films+0.042
ums+0.041
apping+0.040
 pals+0.040
aps+0.040
Neg logit contributions
 forgot-0.079
 managed-0.073
use-0.073
 missed-0.072
 never-0.070
Token can
Token ID460
Feature activation+8.20e-02

Pos logit contributions
use+0.019
 forgot+0.019
 managed+0.016
 forgotten+0.016
 notice+0.016
Neg logit contributions
 pals-0.032
 films-0.031
ning-0.030
birds-0.030
ums-0.030
Token't
Token ID470
Feature activation+6.66e-02

Pos logit contributions
 films+0.170
ums+0.166
 pals+0.165
apping+0.164
aps+0.163
Neg logit contributions
 forgot-0.120
 managed-0.107
 missed-0.104
use-0.104
'd-0.098
Token draw
Token ID3197
Feature activation+1.13e-01

Pos logit contributions
 films+0.132
 pals+0.130
apping+0.129
ums+0.127
aps+0.125
Neg logit contributions
 forgot-0.098
 managed-0.089
use-0.088
 missed-0.086
Maybe-0.083
Token on
Token ID319
Feature activation+8.37e-02

Pos logit contributions
ums+0.202
apping+0.201
 pals+0.201
 films+0.196
aps+0.188
Neg logit contributions
 forgot-0.196
 never-0.178
 missed-0.177
 fell-0.176
 managed-0.174
Token the
Token ID262
Feature activation+1.65e-01

Pos logit contributions
apping+0.157
ums+0.151
 pals+0.148
aps+0.141
 films+0.139
Neg logit contributions
 forgot-0.146
 never-0.140
use-0.132
 fell-0.131
 missed-0.131
Token black
Token ID2042
Feature activation+2.19e-01

Pos logit contributions
apping+0.289
ums+0.284
 pals+0.276
aps+0.271
 films+0.263
Neg logit contributions
 forgot-0.307
 never-0.285
use-0.284
 fell-0.276
 had-0.276
Tokenboard
Token ID3526
Feature activation+9.68e-02

Pos logit contributions
 pals+0.432
apping+0.431
ums+0.423
 films+0.416
aps+0.407
Neg logit contributions
use-0.317
 forgot-0.315
 fell-0.314
 would-0.311
 had-0.300
Token.
Token ID13
Feature activation+1.03e-01

Pos logit contributions
 films+0.113
apping+0.111
ums+0.108
aps+0.107
 pals+0.103
Neg logit contributions
 forgot-0.226
 managed-0.209
use-0.207
 missed-0.207
'd-0.197
Token It
Token ID632
Feature activation-1.17e-02

Pos logit contributions
ums+0.176
 films+0.175
ning+0.175
 pals+0.174
apping+0.172
Neg logit contributions
 forgot-0.180
 never-0.167
Maybe-0.167
 missed-0.167
use-0.164
Token's
Token ID338
Feature activation+6.65e-02

Pos logit contributions
 forgot+0.017
use+0.017
 managed+0.016
 missed+0.015
 forgotten+0.014
Neg logit contributions
 films-0.022
aps-0.021
 pals-0.020
ning-0.020
ums-0.020
Token for
Token ID329
Feature activation+4.20e-02

Pos logit contributions
 films+0.138
ums+0.138
apping+0.136
 pals+0.135
aps+0.133
Neg logit contributions
 forgot-0.101
 missed-0.088
 managed-0.088
 never-0.086
use-0.085
Token giving
Token ID3501
Feature activation+8.10e-02

Pos logit contributions
apping+0.170
 films+0.168
ums+0.163
 pals+0.163
aps+0.161
Neg logit contributions
 forgot-0.203
 never-0.192
 missed-0.186
 managed-0.184
use-0.183
Token him
Token ID683
Feature activation+2.35e-02

Pos logit contributions
apping+0.152
 films+0.149
ums+0.147
 pals+0.144
aps+0.142
Neg logit contributions
 forgot-0.138
use-0.129
 never-0.126
 managed-0.124
 missed-0.124
Token too
Token ID1165
Feature activation+2.40e-02

Pos logit contributions
apping+0.039
 films+0.038
ums+0.038
 pals+0.037
aps+0.036
Neg logit contributions
 forgot-0.045
 never-0.041
use-0.041
 managed-0.041
 missed-0.041
Token many
Token ID867
Feature activation+4.88e-02

Pos logit contributions
aps+0.038
ums+0.038
 pals+0.038
apping+0.038
 films+0.038
Neg logit contributions
 forgot-0.046
 managed-0.042
 missed-0.041
use-0.041
'd-0.040
Token ch
Token ID442
Feature activation+1.67e-01

Pos logit contributions
apping+0.076
ums+0.075
 films+0.072
aps+0.070
uff+0.069
Neg logit contributions
 forgot-0.099
 never-0.090
use-0.090
 managed-0.089
 missed-0.089
Tokenocol
Token ID4668
Feature activation+2.29e-01

Pos logit contributions
 films+0.174
 pals+0.156
ning+0.156
apping+0.153
ums+0.152
Neg logit contributions
 forgot-0.412
use-0.399
 managed-0.387
 missed-0.386
 never-0.379
Tokenates
Token ID689
Feature activation+8.04e-02

Pos logit contributions
 films+0.195
 pals+0.146
 television+0.146
apping+0.146
ums+0.137
Neg logit contributions
use-0.610
 forgot-0.603
 managed-0.598
 never-0.583
 missed-0.568
Token."
Token ID526
Feature activation+1.05e-01

Pos logit contributions
 films+0.113
ums+0.110
apping+0.108
 pals+0.108
aps+0.104
Neg logit contributions
 forgot-0.174
 managed-0.159
use-0.159
 never-0.158
 missed-0.157
Token The
Token ID383
Feature activation+9.83e-02

Pos logit contributions
 films+0.204
ums+0.193
ning+0.193
apping+0.191
 pals+0.188
Neg logit contributions
 forgot-0.175
 never-0.159
 missed-0.158
 fell-0.152
 had-0.152
Token king
Token ID5822
Feature activation+5.24e-02

Pos logit contributions
apping+0.164
 films+0.163
ums+0.163
aps+0.157
ning+0.156
Neg logit contributions
 forgot-0.188
 missed-0.166
 never-0.165
use-0.165
 managed-0.164
Token learned
Token ID4499
Feature activation+2.64e-02

Pos logit contributions
ums+0.112
 films+0.112
apping+0.112
 pals+0.111
aps+0.110
Neg logit contributions
 forgot-0.071
use-0.063
 managed-0.061
 missed-0.061
 never-0.058
Token would
Token ID561
Feature activation+3.67e-02

Pos logit contributions
 films+0.245
ums+0.238
apping+0.230
 pals+0.229
aps+0.224
Neg logit contributions
 forgot-0.185
 managed-0.163
 missed-0.162
 never-0.162
 fell-0.156
Token take
Token ID1011
Feature activation+6.28e-02

Pos logit contributions
 films+0.076
ums+0.074
 pals+0.074
apping+0.074
aps+0.072
Neg logit contributions
 forgot-0.054
 managed-0.047
use-0.047
 missed-0.047
 never-0.044
Token care
Token ID1337
Feature activation+5.13e-02

Pos logit contributions
apping+0.115
ums+0.110
 films+0.107
aps+0.106
 pals+0.105
Neg logit contributions
 forgot-0.108
 never-0.102
 managed-0.101
 missed-0.099
use-0.097
Token of
Token ID286
Feature activation+8.33e-02

Pos logit contributions
 films+0.065
ums+0.064
 pals+0.062
apping+0.061
ning+0.059
Neg logit contributions
 forgot-0.116
 never-0.108
 managed-0.106
 missed-0.105
use-0.104
Token the
Token ID262
Feature activation+1.28e-01

Pos logit contributions
apping+0.159
ums+0.150
aps+0.146
 films+0.142
 pals+0.140
Neg logit contributions
 forgot-0.142
 never-0.139
 managed-0.128
 missed-0.125
 had-0.123
Token sw
Token ID1509
Feature activation+2.06e-01

Pos logit contributions
apping+0.200
ums+0.198
aps+0.187
 films+0.184
 pals+0.182
Neg logit contributions
 forgot-0.257
use-0.235
 missed-0.231
 would-0.230
 never-0.229
Tokenan
Token ID272
Feature activation+1.10e-01

Pos logit contributions
 films+0.286
apping+0.268
aps+0.267
ums+0.256
birds+0.253
Neg logit contributions
 forgot-0.442
 managed-0.406
use-0.403
 never-0.394
'd-0.391
Token.
Token ID13
Feature activation+9.21e-02

Pos logit contributions
ums+0.198
aps+0.184
uff+0.184
apping+0.183
 films+0.182
Neg logit contributions
 forgot-0.194
 would-0.187
 never-0.181
 wouldn-0.177
 had-0.174
Token He
Token ID679
Feature activation-3.10e-02

Pos logit contributions
 films+0.181
ums+0.177
apping+0.175
ning+0.175
 pals+0.170
Neg logit contributions
 forgot-0.139
 never-0.128
Maybe-0.127
 missed-0.125
 managed-0.123
Token would
Token ID561
Feature activation+3.31e-03

Pos logit contributions
 forgot+0.032
use+0.032
 managed+0.026
Maybe+0.026
 missed+0.026
Neg logit contributions
 pals-0.071
 films-0.070
ums-0.067
birds-0.066
apping-0.066
Token feed
Token ID3745
Feature activation+6.39e-02

Pos logit contributions
 films+0.007
ums+0.006
 pals+0.006
apping+0.006
aps+0.006
Neg logit contributions
 forgot-0.005
 managed-0.005
use-0.004
 missed-0.004
Maybe-0.004
Token,"
Token ID553
Feature activation+9.68e-02

Pos logit contributions
ums+0.100
apping+0.099
 films+0.098
 pals+0.098
aps+0.095
Neg logit contributions
 forgot-0.106
 never-0.096
use-0.096
 missed-0.096
 managed-0.095
Token said
Token ID531
Feature activation+7.30e-02

Pos logit contributions
 films+0.165
ums+0.164
apping+0.163
 pals+0.159
aps+0.157
Neg logit contributions
 forgot-0.181
 missed-0.162
 managed-0.162
use-0.159
 never-0.159
Token the
Token ID262
Feature activation+9.82e-02

Pos logit contributions
 films+0.124
aters+0.121
ums+0.121
apping+0.120
aps+0.118
Neg logit contributions
 forgot-0.134
 missed-0.128
 fell-0.124
 managed-0.124
 never-0.121
Token rh
Token ID9529
Feature activation+1.76e-01

Pos logit contributions
ums+0.170
apping+0.168
 films+0.164
aters+0.162
aps+0.160
Neg logit contributions
 forgot-0.175
 missed-0.158
 never-0.157
 had-0.156
 fell-0.155
Tokenin
Token ID259
Feature activation+6.53e-02

Pos logit contributions
 films+0.327
ums+0.314
 pals+0.308
apping+0.304
aps+0.296
Neg logit contributions
 forgot-0.298
 managed-0.270
use-0.268
 never-0.268
 missed-0.265
Tokenoc
Token ID420
Feature activation+2.52e-01

Pos logit contributions
 films+0.104
 pals+0.095
ums+0.095
apping+0.095
aps+0.095
Neg logit contributions
 forgot-0.134
 never-0.122
 managed-0.122
 missed-0.121
 had-0.119
Tokeneros
Token ID27498
Feature activation+7.41e-02

Pos logit contributions
 films+0.206
apping+0.189
 pals+0.171
aps+0.161
 sym+0.150
Neg logit contributions
use-0.667
 forgot-0.659
 never-0.644
 managed-0.636
 missed-0.629
Token.
Token ID13
Feature activation+8.00e-02

Pos logit contributions
 films+0.116
ums+0.112
 pals+0.109
apping+0.109
aps+0.108
Neg logit contributions
 forgot-0.153
 managed-0.140
 missed-0.140
 never-0.137
 fell-0.134
Token
Token ID198
Feature activation+6.52e-02

Pos logit contributions
 films+0.133
ums+0.127
 pals+0.124
apping+0.122
ning+0.118
Neg logit contributions
 forgot-0.151
 managed-0.141
 missed-0.141
 never-0.139
Maybe-0.137
Token
Token ID198
Feature activation+6.41e-02

Pos logit contributions
 films+0.123
ums+0.116
 pals+0.115
apping+0.113
ning+0.111
Neg logit contributions
 forgot-0.108
 never-0.102
 managed-0.100
Maybe-0.099
 missed-0.097
TokenThe
Token ID464
Feature activation+9.23e-02

Pos logit contributions
 films+0.137
ums+0.134
 pals+0.132
apping+0.131
aps+0.128
Neg logit contributions
 forgot-0.093
 managed-0.082
 never-0.081
use-0.080
 missed-0.080
Token
Token ID198
Feature activation+5.64e-02

Pos logit contributions
 films+0.090
ums+0.089
 pals+0.087
apping+0.087
aps+0.083
Neg logit contributions
 forgot-0.106
 never-0.096
 missed-0.096
 managed-0.095
use-0.095
TokenThey
Token ID2990
Feature activation-3.62e-02

Pos logit contributions
 films+0.119
apping+0.118
ums+0.118
 pals+0.116
aps+0.115
Neg logit contributions
 forgot-0.082
 missed-0.072
 managed-0.072
use-0.070
'd-0.067
Token come
Token ID1282
Feature activation+8.46e-03

Pos logit contributions
 forgot+0.054
Maybe+0.051
 managed+0.051
use+0.050
 missed+0.048
Neg logit contributions
 films-0.065
 pals-0.063
ums-0.061
 lions-0.060
aps-0.060
Token to
Token ID284
Feature activation+1.30e-02

Pos logit contributions
 films+0.019
pill+0.019
ums+0.019
 pals+0.018
opl+0.018
Neg logit contributions
 forgot-0.011
 never-0.010
 would-0.009
 had-0.009
 managed-0.009
Token the
Token ID262
Feature activation+8.93e-02

Pos logit contributions
ums+0.026
apping+0.026
aps+0.024
ning+0.024
uff+0.024
Neg logit contributions
 forgot-0.021
 never-0.020
 missed-0.020
use-0.019
 notice-0.019
Token ostr
Token ID42387
Feature activation+2.06e-01

Pos logit contributions
ums+0.098
apping+0.098
 films+0.091
 pals+0.090
aps+0.090
Neg logit contributions
 forgot-0.225
 never-0.209
 missed-0.207
use-0.207
 managed-0.203
Tokenich
Token ID488
Feature activation+6.79e-02

Pos logit contributions
 films+0.292
 television+0.237
 cartoons+0.232
 sports+0.230
 pals+0.223
Neg logit contributions
 forgot-0.467
 never-0.451
use-0.430
 missed-0.421
 fell-0.416
Token pen
Token ID3112
Feature activation+3.92e-02

Pos logit contributions
ums+0.127
uff+0.120
 pals+0.118
apping+0.115
 films+0.115
Neg logit contributions
 never-0.117
 forgot-0.116
 would-0.114
use-0.110
 had-0.108
Token.
Token ID13
Feature activation+7.20e-02

Pos logit contributions
ums+0.065
 films+0.064
 pals+0.062
apping+0.060
aps+0.058
Neg logit contributions
 forgot-0.072
 never-0.069
 fell-0.067
 would-0.067
 had-0.066
Token Ost
Token ID38919
Feature activation+1.20e-01

Pos logit contributions
ums+0.136
ning+0.135
 pals+0.133
apping+0.133
agging+0.128
Neg logit contributions
Maybe-0.112
 forgot-0.109
 missed-0.107
 never-0.107
 had-0.104
Tokenric
Token ID1173
Feature activation+1.74e-01

Pos logit contributions
 films+0.078
apping+0.077
 pals+0.076
ums+0.072
aps+0.067
Neg logit contributions
 forgot-0.343
 never-0.330
 missed-0.326
use-0.325
 managed-0.322
Token find
Token ID1064
Feature activation-3.85e-02

Pos logit contributions
 forgot+0.054
use+0.052
 missed+0.050
 managed+0.049
 wouldn+0.047
Neg logit contributions
apping-0.067
ums-0.064
aps-0.063
 pals-0.060
 films-0.060
Token it
Token ID340
Feature activation-6.49e-02

Pos logit contributions
 forgot+0.075
 managed+0.068
 missed+0.068
use+0.067
 never+0.067
Neg logit contributions
 films-0.062
apping-0.062
ums-0.061
 pals-0.060
aps-0.059
Token.
Token ID13
Feature activation+4.00e-02

Pos logit contributions
 forgot+0.161
 missed+0.149
use+0.149
 managed+0.148
 never+0.143
Neg logit contributions
ums-0.065
 films-0.065
 pals-0.065
apping-0.065
aps-0.061
Token �
Token ID6184
Feature activation+7.12e-02

Pos logit contributions
 films+0.067
ums+0.064
 pals+0.064
apping+0.063
aps+0.061
Neg logit contributions
 forgot-0.076
 managed-0.069
 missed-0.068
use-0.068
 never-0.067
Token�
Token ID95
Feature activation+8.52e-02

Pos logit contributions
 films+0.117
ums+0.116
apping+0.114
 pals+0.113
aps+0.111
Neg logit contributions
 forgot-0.137
 managed-0.124
 missed-0.123
use-0.123
 never-0.120
Token€
Token ID26391
Feature activation+1.13e-01

Pos logit contributions
 films+0.146
ums+0.144
 pals+0.142
apping+0.142
aps+0.138
Neg logit contributions
 forgot-0.158
 managed-0.142
 missed-0.141
use-0.141
 never-0.138
Token�
Token ID129
Feature activation+8.52e-02

Pos logit contributions
 films+0.178
ums+0.172
 pals+0.171
apping+0.169
aps+0.163
Neg logit contributions
 forgot-0.231
 managed-0.209
 missed-0.208
use-0.206
 never-0.204
Token�
Token ID241
Feature activation+4.20e-02

Pos logit contributions
 films+0.132
 pals+0.132
apping+0.130
ums+0.128
aps+0.127
Neg logit contributions
 forgot-0.168
 never-0.157
 missed-0.156
 managed-0.153
use-0.151
TokenDon
Token ID3987
Feature activation-5.61e-03

Pos logit contributions
 films+0.088
ums+0.088
apping+0.088
 pals+0.087
aps+0.086
Neg logit contributions
 forgot-0.060
 managed-0.052
 missed-0.052
use-0.052
 never-0.050
Tokenâ
Token ID22940
Feature activation+7.95e-02

Pos logit contributions
 forgot+0.008
 missed+0.007
 managed+0.007
use+0.007
'd+0.006
Neg logit contributions
ums-0.012
apping-0.012
 films-0.012
 pals-0.012
aps-0.012
Token€
Token ID26391
Feature activation+1.25e-01

Pos logit contributions
ums+0.123
apping+0.121
 films+0.116
aps+0.113
ning+0.112
Neg logit contributions
 forgot-0.159
 missed-0.142
 managed-0.142
Maybe-0.135
'd-0.134
Token family
Token ID1641
Feature activation+3.54e-02

Pos logit contributions
 films+0.038
ums+0.037
apping+0.037
 pals+0.037
aps+0.036
Neg logit contributions
 forgot-0.028
 managed-0.024
 missed-0.024
use-0.023
 never-0.023
Token.
Token ID13
Feature activation+7.66e-02

Pos logit contributions
 films+0.058
ums+0.055
apping+0.053
aps+0.052
 pals+0.051
Neg logit contributions
 forgot-0.073
 managed-0.067
 never-0.065
 missed-0.065
 had-0.064
Token He
Token ID679
Feature activation-4.55e-02

Pos logit contributions
ning+0.138
apping+0.137
ums+0.136
 films+0.134
aps+0.132
Neg logit contributions
 forgot-0.122
 missed-0.117
 never-0.113
 had-0.113
Maybe-0.112
Token invited
Token ID9392
Feature activation+3.16e-02

Pos logit contributions
use+0.049
 forgot+0.047
 managed+0.041
Maybe+0.040
 missed+0.038
Neg logit contributions
 pals-0.105
 films-0.104
ums-0.100
apping-0.099
aps-0.098
Token a
Token ID257
Feature activation+5.43e-02

Pos logit contributions
apping+0.060
ums+0.059
aps+0.058
 films+0.056
uff+0.054
Neg logit contributions
 forgot-0.052
 never-0.050
 managed-0.047
 missed-0.047
 had-0.045
Token beautiful
Token ID4950
Feature activation+1.04e-01

Pos logit contributions
ums+0.093
 films+0.090
apping+0.089
aps+0.087
 pals+0.087
Neg logit contributions
 forgot-0.105
 managed-0.093
 missed-0.091
use-0.091
 never-0.091
Token princess
Token ID21752
Feature activation+5.32e-02

Pos logit contributions
ums+0.200
apping+0.185
aps+0.179
uff+0.174
ning+0.170
Neg logit contributions
 forgot-0.183
 had-0.175
 would-0.169
 fell-0.166
use-0.166
Token,
Token ID11
Feature activation+7.97e-03

Pos logit contributions
ums+0.104
 films+0.099
apping+0.096
uff+0.094
aps+0.093
Neg logit contributions
 forgot-0.089
 never-0.084
 would-0.082
 had-0.081
 managed-0.081
Token and
Token ID290
Feature activation+9.71e-03

Pos logit contributions
ums+0.013
apping+0.013
 films+0.013
aps+0.013
 pals+0.012
Neg logit contributions
 forgot-0.016
 managed-0.014
 never-0.014
 missed-0.014
 had-0.014
Token she
Token ID673
Feature activation-4.45e-02

Pos logit contributions
ums+0.019
 films+0.019
apping+0.018
aps+0.018
 pals+0.018
Neg logit contributions
 forgot-0.016
 managed-0.014
 missed-0.014
 never-0.014
use-0.014
Token said
Token ID531
Feature activation+5.32e-02

Pos logit contributions
 forgot+0.051
use+0.047
 managed+0.043
 missed+0.043
 never+0.040
Neg logit contributions
 pals-0.103
 films-0.103
ums-0.102
apping-0.101
aps-0.100
Token,
Token ID11
Feature activation-1.84e-02

Pos logit contributions
 forgot+0.027
 managed+0.025
use+0.025
 missed+0.025
'd+0.024
Neg logit contributions
 films-0.016
ums-0.016
 pals-0.016
apping-0.015
aps-0.015
Token Lily
Token ID20037
Feature activation-1.90e-02

Pos logit contributions
 forgot+0.035
 managed+0.032
 missed+0.032
use+0.031
 never+0.031
Neg logit contributions
 films-0.031
ums-0.030
apping-0.030
 pals-0.030
aps-0.029
Token and
Token ID290
Feature activation-2.89e-02

Pos logit contributions
use+0.028
 forgot+0.023
 lied+0.022
Maybe+0.021
 notice+0.021
Neg logit contributions
ning-0.037
 pals-0.037
ums-0.037
 films-0.036
 friends-0.034
Token the
Token ID262
Feature activation+4.73e-02

Pos logit contributions
 forgot+0.055
 missed+0.050
use+0.049
 never+0.049
 managed+0.049
Neg logit contributions
apping-0.050
ums-0.049
aps-0.048
 pals-0.048
 films-0.047
Token sweet
Token ID6029
Feature activation+7.04e-02

Pos logit contributions
 films+0.055
 pals+0.046
ums+0.044
 lions+0.042
ning+0.040
Neg logit contributions
 forgot-0.123
 managed-0.119
use-0.114
 wouldn-0.110
 fell-0.109
Token re
Token ID302
Feature activation+1.07e-01

Pos logit contributions
 films+0.098
ums+0.095
 pals+0.092
apping+0.091
aps+0.088
Neg logit contributions
 forgot-0.157
 managed-0.146
use-0.142
 missed-0.141
 never-0.139
Tokenind
Token ID521
Feature activation+6.76e-02

Pos logit contributions
 films+0.118
ums+0.114
apping+0.112
 pals+0.112
aps+0.108
Neg logit contributions
 forgot-0.261
 missed-0.242
use-0.241
 managed-0.241
 never-0.239
Tokeneer
Token ID28153
Feature activation-5.97e-03

Pos logit contributions
 films+0.090
ums+0.089
apping+0.086
 pals+0.085
aps+0.083
Neg logit contributions
 forgot-0.152
 managed-0.140
use-0.138
 missed-0.137
 never-0.135
Token became
Token ID2627
Feature activation-1.31e-02

Pos logit contributions
 forgot+0.008
use+0.007
 managed+0.006
Maybe+0.006
 missed+0.006
Neg logit contributions
 pals-0.013
 films-0.013
apping-0.013
ums-0.013
birds-0.012
Token good
Token ID922
Feature activation-3.33e-03

Pos logit contributions
 forgot+0.022
 managed+0.020
 never+0.020
 missed+0.020
 would+0.019
Neg logit contributions
apping-0.025
ums-0.025
aps-0.024
 films-0.024
uff-0.023
Token friends
Token ID2460
Feature activation-1.65e-02

Pos logit contributions
 forgot+0.009
 missed+0.008
 never+0.008
 managed+0.008
 would+0.008
Neg logit contributions
 films-0.003
ums-0.003
igo-0.002
ric-0.002
pieces-0.002
Token the
Token ID262
Feature activation+1.00e-01

Pos logit contributions
ums+0.042
 films+0.042
apping+0.042
 pals+0.041
aps+0.041
Neg logit contributions
 forgot-0.043
 managed-0.039
 never-0.038
 missed-0.038
 fell-0.037
Token ant
Token ID1885
Feature activation+7.95e-02

Pos logit contributions
 films+0.137
apping+0.136
ums+0.136
aps+0.132
 pals+0.128
Neg logit contributions
 forgot-0.220
 managed-0.200
 missed-0.199
 never-0.198
use-0.195
Token was
Token ID373
Feature activation+2.98e-02

Pos logit contributions
 films+0.178
 pals+0.177
ums+0.177
apping+0.176
aps+0.172
Neg logit contributions
 forgot-0.096
use-0.086
 managed-0.081
 missed-0.081
Maybe-0.075
Token carrying
Token ID6872
Feature activation+2.86e-02

Pos logit contributions
 films+0.058
ums+0.057
apping+0.056
 pals+0.055
aps+0.055
Neg logit contributions
 forgot-0.050
 missed-0.045
 managed-0.045
 never-0.045
 fell-0.042
Token a
Token ID257
Feature activation+5.58e-02

Pos logit contributions
 films+0.044
apping+0.044
ums+0.043
 pals+0.042
aps+0.041
Neg logit contributions
 forgot-0.058
 managed-0.053
 never-0.052
 missed-0.052
use-0.052
Token little
Token ID1310
Feature activation+9.35e-02

Pos logit contributions
 films+0.099
ums+0.098
apping+0.096
 pals+0.095
aps+0.094
Neg logit contributions
 forgot-0.101
 managed-0.090
 missed-0.089
use-0.089
 never-0.088
Token piece
Token ID3704
Feature activation+6.37e-02

Pos logit contributions
ums+0.207
 films+0.205
apping+0.201
 pals+0.197
aps+0.195
Neg logit contributions
 forgot-0.129
use-0.114
 managed-0.113
 missed-0.109
 fell-0.108
Token of
Token ID286
Feature activation+8.06e-02

Pos logit contributions
 films+0.075
ums+0.073
 pals+0.071
apping+0.071
aps+0.068
Neg logit contributions
 forgot-0.152
 managed-0.141
 missed-0.139
 never-0.138
use-0.138
Token food
Token ID2057
Feature activation+4.50e-02

Pos logit contributions
apping+0.135
ums+0.130
 films+0.127
aps+0.124
 pals+0.122
Neg logit contributions
 forgot-0.155
 managed-0.143
 never-0.143
use-0.141
 missed-0.139
Token.
Token ID13
Feature activation+7.14e-02

Pos logit contributions
 films+0.057
apping+0.057
ums+0.057
 pals+0.056
aps+0.054
Neg logit contributions
 forgot-0.100
 missed-0.092
use-0.091
 managed-0.091
 never-0.088
Token Lily
Token ID20037
Feature activation+1.33e-02

Pos logit contributions
 films+0.141
apping+0.132
ums+0.131
 pals+0.129
ning+0.128
Neg logit contributions
 forgot-0.115
 missed-0.105
 never-0.105
Maybe-0.104
 managed-0.103
Token big
Token ID1263
Feature activation+1.04e-01

Pos logit contributions
ums+0.118
apping+0.114
 films+0.114
aps+0.113
 pals+0.112
Neg logit contributions
 forgot-0.144
 missed-0.128
 managed-0.127
 never-0.126
use-0.125
Token river
Token ID7850
Feature activation+7.36e-02

Pos logit contributions
ums+0.165
apping+0.161
aps+0.159
 pals+0.156
 films+0.156
Neg logit contributions
 forgot-0.208
 missed-0.189
use-0.187
 never-0.185
 managed-0.184
Token where
Token ID810
Feature activation+4.55e-02

Pos logit contributions
ums+0.122
 films+0.121
 pals+0.118
aps+0.116
apping+0.111
Neg logit contributions
 never-0.138
 forgot-0.138
 had-0.132
 fell-0.130
 needed-0.129
Token a
Token ID257
Feature activation+6.85e-02

Pos logit contributions
 films+0.084
 pals+0.082
apping+0.082
ums+0.082
aps+0.080
Neg logit contributions
 forgot-0.077
use-0.070
 managed-0.068
 missed-0.068
 never-0.066
Token serious
Token ID2726
Feature activation+1.03e-01

Pos logit contributions
ums+0.113
apping+0.108
 films+0.108
aps+0.107
 pals+0.103
Neg logit contributions
 forgot-0.137
 managed-0.122
 missed-0.121
 never-0.120
'd-0.119
Token crocod
Token ID37565
Feature activation+1.76e-01

Pos logit contributions
ums+0.169
aps+0.168
apping+0.164
 films+0.159
 pals+0.154
Neg logit contributions
 forgot-0.203
 missed-0.183
 had-0.181
 would-0.181
 fell-0.181
Tokenile
Token ID576
Feature activation+7.46e-02

Pos logit contributions
ums+0.271
 films+0.269
aps+0.257
apping+0.256
 pals+0.252
Neg logit contributions
 forgot-0.353
 missed-0.324
 managed-0.323
use-0.322
 never-0.321
Token lived
Token ID5615
Feature activation+5.45e-02

Pos logit contributions
 films+0.156
ums+0.154
apping+0.153
 pals+0.152
aps+0.149
Neg logit contributions
 forgot-0.108
 managed-0.094
use-0.094
 missed-0.094
 never-0.090
Token.
Token ID13
Feature activation+6.80e-02

Pos logit contributions
 films+0.112
ums+0.109
aters+0.101
 cartoons+0.100
 pals+0.097
Neg logit contributions
 never-0.083
 managed-0.080
 had-0.078
 forgot-0.077
 fell-0.075
Token He
Token ID679
Feature activation-4.93e-02

Pos logit contributions
apping+0.120
ums+0.119
ning+0.117
 films+0.117
 pals+0.116
Neg logit contributions
 forgot-0.112
 missed-0.107
 never-0.107
 had-0.105
 needed-0.104
Token liked
Token ID8288
Feature activation+4.13e-02

Pos logit contributions
use+0.062
 forgot+0.061
 managed+0.054
Maybe+0.051
 notice+0.051
Neg logit contributions
 pals-0.105
 films-0.104
 lions-0.097
apping-0.096
ning-0.095
Token their
Token ID511
Feature activation+4.85e-02

Pos logit contributions
 forgot+0.011
 missed+0.010
 never+0.010
 fell+0.010
 lied+0.010
Neg logit contributions
apping-0.011
ums-0.011
aps-0.010
 pals-0.010
uff-0.010
Token little
Token ID1310
Feature activation+9.13e-02

Pos logit contributions
apping+0.075
ums+0.074
 films+0.072
aps+0.072
 pals+0.071
Neg logit contributions
 forgot-0.099
 missed-0.090
use-0.090
 managed-0.088
 never-0.088
Token kid
Token ID5141
Feature activation+3.34e-02

Pos logit contributions
apping+0.142
ums+0.141
aps+0.136
 films+0.131
 pals+0.131
Neg logit contributions
 forgot-0.182
use-0.170
 missed-0.163
 would-0.161
 needed-0.161
Token to
Token ID284
Feature activation-6.03e-03

Pos logit contributions
ums+0.060
apping+0.059
 films+0.059
aps+0.058
 pals+0.057
Neg logit contributions
 forgot-0.060
use-0.053
 never-0.053
 missed-0.053
 managed-0.053
Token the
Token ID262
Feature activation+7.43e-02

Pos logit contributions
 forgot+0.009
 never+0.008
 notice+0.008
 missed+0.008
 forgotten+0.008
Neg logit contributions
apping-0.013
ums-0.013
ning-0.012
birds-0.012
aps-0.012
Token store
Token ID3650
Feature activation+2.25e-02

Pos logit contributions
apping+0.139
ums+0.138
aps+0.132
 pals+0.131
 films+0.125
Neg logit contributions
 forgot-0.124
 never-0.119
 missed-0.114
use-0.112
 would-0.112
Token to
Token ID284
Feature activation-9.07e-03

Pos logit contributions
 films+0.030
ums+0.029
apping+0.029
 pals+0.029
aps+0.028
Neg logit contributions
 forgot-0.051
 managed-0.046
 missed-0.046
use-0.046
 never-0.045
Token buy
Token ID2822
Feature activation-6.89e-03

Pos logit contributions
 forgot+0.013
 managed+0.011
use+0.011
 missed+0.011
'd+0.011
Neg logit contributions
 films-0.020
ums-0.019
 pals-0.019
apping-0.019
aps-0.019
Token some
Token ID617
Feature activation+1.63e-02

Pos logit contributions
 forgot+0.012
 never+0.011
 had+0.011
 would+0.011
 needed+0.011
Neg logit contributions
apping-0.013
ums-0.012
aps-0.012
 films-0.012
 pals-0.012
Token nice
Token ID3621
Feature activation+4.65e-02

Pos logit contributions
apping+0.030
ums+0.028
ning+0.027
aps+0.027
ric+0.026
Neg logit contributions
 forgot-0.027
use-0.026
 would-0.026
 needed-0.026
 had-0.025
Token things
Token ID1243
Feature activation-1.46e-02

Pos logit contributions
apping+0.082
 sym+0.077
ric+0.076
ning+0.075
ums+0.075
Neg logit contributions
 forgot-0.075
 got-0.073
 had-0.073
 would-0.073
 lied-0.072
Token hole
Token ID7604
Feature activation+3.26e-02

Pos logit contributions
 films+0.049
ums+0.049
apping+0.049
 pals+0.048
aps+0.048
Neg logit contributions
 forgot-0.054
use-0.048
 missed-0.048
 managed-0.048
 never-0.047
Token in
Token ID287
Feature activation+3.19e-02

Pos logit contributions
 films+0.051
ums+0.051
 pals+0.050
apping+0.049
aps+0.048
Neg logit contributions
 forgot-0.065
 managed-0.059
 missed-0.059
 never-0.058
use-0.058
Token it
Token ID340
Feature activation-3.71e-02

Pos logit contributions
 films+0.050
ums+0.048
aps+0.048
 pals+0.047
ning+0.046
Neg logit contributions
 forgot-0.064
use-0.059
 missed-0.056
 managed-0.056
'd-0.056
Token.
Token ID13
Feature activation+5.02e-02

Pos logit contributions
 forgot+0.093
use+0.086
 managed+0.085
 missed+0.085
 never+0.083
Neg logit contributions
 films-0.039
ums-0.038
apping-0.038
 pals-0.037
aps-0.037
Token He
Token ID679
Feature activation-7.38e-02

Pos logit contributions
 films+0.085
 pals+0.084
ums+0.083
apping+0.082
ning+0.079
Neg logit contributions
 forgot-0.093
 missed-0.086
 managed-0.084
 fell-0.083
 never-0.082
Token was
Token ID373
Feature activation-6.86e-03

Pos logit contributions
use+0.108
 surely+0.080
 notice+0.077
 forgot+0.075
t+0.071
Neg logit contributions
 pals-0.146
aw-0.144
aps-0.139
igo-0.139
 friends-0.139
Token curious
Token ID11040
Feature activation+3.67e-02

Pos logit contributions
 forgot+0.010
 missed+0.009
 never+0.009
 managed+0.009
 fell+0.009
Neg logit contributions
 films-0.014
ums-0.014
 pals-0.014
apping-0.014
aps-0.014
Token.
Token ID13
Feature activation+4.24e-02

Pos logit contributions
ums+0.076
utes+0.074
 pals+0.073
apping+0.072
uff+0.070
Neg logit contributions
 forgot-0.056
 had-0.055
 would-0.053
 never-0.052
 fell-0.052
Token He
Token ID679
Feature activation-8.23e-02

Pos logit contributions
 films+0.082
 pals+0.080
ums+0.079
apping+0.079
ning+0.077
Neg logit contributions
 forgot-0.068
 missed-0.063
Maybe-0.061
 managed-0.061
 never-0.060
Token sw
Token ID1509
Feature activation+9.20e-02

Pos logit contributions
use+0.119
 surely+0.093
 notice+0.088
t+0.087
 forgot+0.087
Neg logit contributions
 pals-0.158
aw-0.156
 ducks-0.149
 friends-0.149
igo-0.148
Tokenam
Token ID321
Feature activation-2.15e-02

Pos logit contributions
 films+0.122
 pals+0.108
apping+0.107
ums+0.106
aps+0.102
Neg logit contributions
 forgot-0.216
 managed-0.201
 missed-0.197
 never-0.194
use-0.191
Token pepper
Token ID13385
Feature activation-9.56e-03

Pos logit contributions
 films+0.043
 pals+0.042
ums+0.042
apping+0.040
aps+0.039
Neg logit contributions
 forgot-0.045
 managed-0.041
use-0.040
 missed-0.040
'd-0.038
Token plant
Token ID4618
Feature activation-3.20e-02

Pos logit contributions
use+0.013
 forgot+0.012
 notice+0.011
 missed+0.011
 forgotten+0.010
Neg logit contributions
 pals-0.020
ums-0.019
 films-0.019
birds-0.019
ning-0.019
Token kept
Token ID4030
Feature activation-6.55e-02

Pos logit contributions
use+0.044
 forgot+0.038
 notice+0.035
Maybe+0.035
'd+0.033
Neg logit contributions
 pals-0.064
birds-0.062
ums-0.062
 films-0.061
 friends-0.060
Token on
Token ID319
Feature activation-3.01e-02

Pos logit contributions
 forgot+0.137
use+0.124
 managed+0.123
 missed+0.122
'd+0.120
Neg logit contributions
 films-0.097
ums-0.095
 pals-0.094
apping-0.093
aps-0.090
Token blo
Token ID24924
Feature activation+8.16e-03

Pos logit contributions
 forgot+0.066
 managed+0.060
 missed+0.060
 never+0.059
use+0.059
Neg logit contributions
 films-0.042
ums-0.041
apping-0.041
 pals-0.040
aps-0.039
Tokenoming
Token ID3383
Feature activation+3.06e-02

Pos logit contributions
 films+0.008
 pals+0.007
ums+0.007
apping+0.007
ning+0.006
Neg logit contributions
 forgot-0.021
 managed-0.020
use-0.020
 missed-0.019
 never-0.019
Token.
Token ID13
Feature activation+1.57e-02

Pos logit contributions
ums+0.051
apping+0.051
 films+0.051
 pals+0.049
aps+0.049
Neg logit contributions
 forgot-0.058
 managed-0.053
 never-0.052
 missed-0.051
 fell-0.051
Token Every
Token ID3887
Feature activation-1.90e-02

Pos logit contributions
 films+0.023
ums+0.023
apping+0.023
 pals+0.023
aps+0.022
Neg logit contributions
 forgot-0.032
 missed-0.030
 managed-0.030
 never-0.029
use-0.029
Token day
Token ID1110
Feature activation-5.23e-02

Pos logit contributions
 forgot+0.038
use+0.035
 managed+0.034
 missed+0.033
'd+0.032
Neg logit contributions
 films-0.030
 pals-0.029
apping-0.029
ums-0.028
aps-0.027
Token,
Token ID11
Feature activation-4.47e-02

Pos logit contributions
 forgot+0.110
use+0.103
 managed+0.100
 missed+0.100
'd+0.098
Neg logit contributions
 films-0.075
ums-0.074
apping-0.073
 pals-0.072
aps-0.069
Token more
Token ID517
Feature activation-1.35e-02

Pos logit contributions
 forgot+0.066
 managed+0.058
 never+0.057
 missed+0.056
 fell+0.055
Neg logit contributions
apping-0.095
ums-0.095
 films-0.093
aps-0.092
 pals-0.091
Token
Token ID198
Feature activation-1.99e-03

Pos logit contributions
 forgot+0.009
use+0.009
 managed+0.009
 missed+0.008
 wouldn+0.008
Neg logit contributions
 films-0.005
apping-0.005
aps-0.005
ums-0.005
 pals-0.005
TokenSoon
Token ID28093
Feature activation-7.67e-02

Pos logit contributions
 forgot+0.004
 managed+0.003
 missed+0.003
 wouldn+0.003
use+0.003
Neg logit contributions
 films-0.003
ums-0.003
apping-0.003
ning-0.003
 pals-0.003
Token,
Token ID11
Feature activation-4.37e-02

Pos logit contributions
 forgot+0.181
use+0.170
 managed+0.167
'd+0.164
 missed+0.163
Neg logit contributions
 films-0.093
ums-0.087
 pals-0.086
apping-0.084
aps-0.083
Token all
Token ID477
Feature activation-5.28e-02

Pos logit contributions
 forgot+0.087
use+0.079
 managed+0.079
 missed+0.077
'd+0.076
Neg logit contributions
 films-0.070
ums-0.067
 pals-0.066
apping-0.065
ning-0.064
Token the
Token ID262
Feature activation+1.98e-02

Pos logit contributions
 forgot+0.103
 missed+0.092
 managed+0.092
 never+0.091
 fell+0.090
Neg logit contributions
ums-0.087
apping-0.086
 pals-0.084
aps-0.084
 films-0.083
Token other
Token ID584
Feature activation+2.18e-02

Pos logit contributions
 films+0.016
 pals+0.014
ums+0.013
apping+0.012
aps+0.012
Neg logit contributions
 forgot-0.056
 managed-0.054
use-0.053
 missed-0.053
'd-0.052
Token animals
Token ID4695
Feature activation-1.16e-02

Pos logit contributions
 films+0.017
ums+0.016
 pals+0.016
apping+0.015
aps+0.014
Neg logit contributions
 forgot-0.061
 managed-0.057
use-0.057
 missed-0.057
 never-0.056
Token came
Token ID1625
Feature activation-3.42e-02

Pos logit contributions
 forgot+0.016
use+0.015
Maybe+0.014
 managed+0.014
 never+0.014
Neg logit contributions
apping-0.024
 films-0.023
 pals-0.023
aps-0.022
birds-0.022
Token to
Token ID284
Feature activation-2.89e-02

Pos logit contributions
 forgot+0.049
 never+0.046
 missed+0.044
 had+0.043
 managed+0.043
Neg logit contributions
 films-0.076
ums-0.075
ocol-0.071
opl-0.070
 pals-0.069
Token see
Token ID766
Feature activation-4.82e-02

Pos logit contributions
 forgot+0.046
 never+0.042
use+0.042
 missed+0.042
 managed+0.040
Neg logit contributions
ums-0.059
apping-0.059
aps-0.055
 pals-0.054
uff-0.054
Token Beef
Token ID35309
Feature activation-1.02e-02

Pos logit contributions
 forgot+0.078
 never+0.078
 missed+0.076
 would+0.075
use+0.074
Neg logit contributions
apping-0.092
ums-0.089
 pals-0.086
aps-0.085
utes-0.081
Token read
Token ID1100
Feature activation-3.25e-02

Pos logit contributions
 forgot+0.093
 managed+0.082
 missed+0.082
use+0.081
 fell+0.076
Neg logit contributions
 films-0.111
ums-0.110
 pals-0.109
apping-0.108
aps-0.107
Token and
Token ID290
Feature activation-5.70e-02

Pos logit contributions
 forgot+0.062
 never+0.057
 managed+0.056
 missed+0.056
use+0.056
Neg logit contributions
ums-0.053
apping-0.052
 films-0.051
 pals-0.051
aps-0.050
Token write
Token ID3551
Feature activation-4.51e-02

Pos logit contributions
 forgot+0.089
 managed+0.078
 missed+0.078
 never+0.076
use+0.076
Neg logit contributions
 films-0.115
ums-0.114
apping-0.113
 pals-0.111
aps-0.110
Token in
Token ID287
Feature activation-2.03e-02

Pos logit contributions
 forgot+0.092
 managed+0.085
 missed+0.084
 never+0.083
use+0.082
Neg logit contributions
 films-0.067
ums-0.067
apping-0.066
 pals-0.065
aps-0.063
Token many
Token ID867
Feature activation+2.73e-03

Pos logit contributions
 forgot+0.038
 never+0.036
 managed+0.035
 missed+0.035
use+0.034
Neg logit contributions
apping-0.033
ums-0.033
 films-0.033
 pals-0.032
aps-0.031
Token different
Token ID1180
Feature activation+5.55e-02

Pos logit contributions
ums+0.004
 films+0.003
apping+0.003
 pals+0.003
aps+0.003
Neg logit contributions
 forgot-0.006
 managed-0.006
use-0.006
 missed-0.006
 never-0.006
Token languages
Token ID8950
Feature activation+5.80e-03

Pos logit contributions
apping+0.075
ums+0.074
uff+0.069
aps+0.068
 films+0.068
Neg logit contributions
 forgot-0.122
use-0.113
 would-0.111
 never-0.110
 had-0.109
Token.
Token ID13
Feature activation+1.92e-02

Pos logit contributions
ums+0.009
 films+0.008
apping+0.008
 pals+0.008
aps+0.008
Neg logit contributions
 forgot-0.012
 never-0.011
 managed-0.011
 would-0.011
'd-0.011
Token She
Token ID1375
Feature activation-9.37e-02

Pos logit contributions
 films+0.033
apping+0.032
 pals+0.032
ums+0.032
ning+0.031
Neg logit contributions
 forgot-0.033
 never-0.032
 missed-0.031
Maybe-0.031
 would-0.030
Token could
Token ID714
Feature activation-6.18e-02

Pos logit contributions
use+0.126
 forgot+0.098
t+0.091
 forgotten+0.089
 blocked+0.087
Neg logit contributions
 pals-0.196
 films-0.188
ums-0.187
 friends-0.184
birds-0.182
Token even
Token ID772
Feature activation-6.56e-02

Pos logit contributions
 forgot+0.082
use+0.073
 managed+0.071
 missed+0.070
'd+0.065
Neg logit contributions
 films-0.136
ums-0.134
 pals-0.133
apping-0.132
aps-0.131
Token tree
Token ID5509
Feature activation-7.89e-03

Pos logit contributions
 pals+0.044
ning+0.043
aps+0.041
ums+0.041
birds+0.041
Neg logit contributions
 forgot-0.084
 missed-0.078
use-0.078
 managed-0.076
 lied-0.071
Token was
Token ID373
Feature activation-3.28e-02

Pos logit contributions
 forgot+0.014
 managed+0.012
 missed+0.012
 never+0.012
use+0.012
Neg logit contributions
 films-0.015
ums-0.014
apping-0.014
 pals-0.014
aps-0.014
Token teacher
Token ID4701
Feature activation-3.89e-02

Pos logit contributions
 forgot+0.063
 never+0.060
 managed+0.059
 missed+0.059
 had+0.057
Neg logit contributions
ums-0.054
 films-0.054
apping-0.053
 pals-0.051
aps-0.051
Token Lisa
Token ID15378
Feature activation-4.61e-02

Pos logit contributions
 had+0.064
 forgot+0.062
 never+0.056
 would+0.056
 fell+0.054
Neg logit contributions
 films-0.072
 pals-0.070
ums-0.070
 sym-0.068
utes-0.067
Token.
Token ID13
Feature activation+4.75e-03

Pos logit contributions
 forgot+0.083
 had+0.080
 never+0.077
 missed+0.074
 would+0.074
Neg logit contributions
ums-0.084
 films-0.083
 pals-0.081
apping-0.078
aps-0.077
Token She
Token ID1375
Feature activation-1.06e-01

Pos logit contributions
ums+0.008
 pals+0.008
apping+0.008
 films+0.008
ning+0.007
Neg logit contributions
 forgot-0.009
 missed-0.008
 had-0.008
 managed-0.008
 never-0.008
Token was
Token ID373
Feature activation-4.24e-02

Pos logit contributions
use+0.133
 forgot+0.125
 managed+0.108
 missed+0.105
 notice+0.104
Neg logit contributions
 pals-0.230
apping-0.222
 films-0.219
birds-0.218
aps-0.216
Token giving
Token ID3501
Feature activation-2.46e-02

Pos logit contributions
 forgot+0.073
 never+0.071
 missed+0.069
 managed+0.067
 had+0.065
Neg logit contributions
 films-0.078
ums-0.077
 pals-0.074
aters-0.074
apping-0.072
Token a
Token ID257
Feature activation-2.35e-02

Pos logit contributions
 forgot+0.045
 never+0.043
 managed+0.042
 would+0.041
 had+0.041
Neg logit contributions
apping-0.041
ums-0.040
 films-0.038
aps-0.038
uff-0.037
Token lecture
Token ID19143
Feature activation-1.95e-02

Pos logit contributions
 forgot+0.040
 managed+0.036
use+0.035
 missed+0.034
 fell+0.034
Neg logit contributions
ums-0.045
 films-0.044
apping-0.043
 pals-0.042
ning-0.041
Token about
Token ID546
Feature activation-3.12e-02

Pos logit contributions
 forgot+0.041
 never+0.038
 managed+0.038
 missed+0.037
 had+0.037
Neg logit contributions
ums-0.028
 films-0.028
 pals-0.027
apping-0.026
aps-0.025
Token play
Token ID711
Feature activation+1.01e-02

Pos logit contributions
 forgot+0.014
 missed+0.013
use+0.013
 managed+0.013
 never+0.012
Neg logit contributions
ums-0.015
apping-0.015
 films-0.015
 pals-0.015
aps-0.014
Token in
Token ID287
Feature activation+1.39e-02

Pos logit contributions
ums+0.021
ric+0.020
opl+0.019
apping+0.019
aters+0.018
Neg logit contributions
 would-0.015
Maybe-0.014
 forgot-0.014
 never-0.013
 had-0.013
Token the
Token ID262
Feature activation+7.46e-02

Pos logit contributions
apping+0.033
ums+0.033
 pals+0.031
uff+0.029
oses+0.029
Neg logit contributions
 never-0.018
 had-0.017
 forgot-0.016
 notice-0.016
Maybe-0.015
Token garden
Token ID11376
Feature activation+3.69e-02

Pos logit contributions
ums+0.143
apping+0.142
 pals+0.131
aps+0.131
oses+0.128
Neg logit contributions
 forgot-0.123
 never-0.119
 had-0.119
 would-0.116
Maybe-0.114
Token.
Token ID13
Feature activation+5.07e-02

Pos logit contributions
ums+0.068
 pals+0.066
 films+0.065
 sym+0.064
ocol+0.063
Neg logit contributions
 never-0.060
 would-0.059
 forgot-0.058
 had-0.057
 waited-0.054
Token They
Token ID1119
Feature activation-6.28e-02

Pos logit contributions
ning+0.114
 pals+0.114
ums+0.114
apping+0.110
uff+0.106
Neg logit contributions
 forgot-0.065
 missed-0.062
 had-0.059
Maybe-0.058
use-0.058
Token have
Token ID423
Feature activation-3.25e-02

Pos logit contributions
 forgot+0.095
 managed+0.094
 recovered+0.093
 wouldn+0.091
Maybe+0.088
Neg logit contributions
 films-0.105
ap-0.098
 lions-0.096
birds-0.095
 pals-0.095
Token a
Token ID257
Feature activation+2.98e-02

Pos logit contributions
 forgot+0.068
 managed+0.062
 never+0.062
 missed+0.062
use+0.059
Neg logit contributions
ums-0.050
apping-0.048
 films-0.048
 pals-0.047
aps-0.046
Token big
Token ID1263
Feature activation+7.46e-02

Pos logit contributions
ums+0.050
 films+0.049
apping+0.049
 pals+0.048
aps+0.047
Neg logit contributions
 forgot-0.057
 missed-0.051
 managed-0.051
use-0.050
 never-0.050
Token m
Token ID285
Feature activation+1.04e-01

Pos logit contributions
ums+0.112
apping+0.110
 films+0.110
 pals+0.107
aps+0.106
Neg logit contributions
 forgot-0.154
use-0.140
 missed-0.140
 managed-0.139
 never-0.137
Tokenop
Token ID404
Feature activation+3.48e-02

Pos logit contributions
 films+0.147
 pals+0.136
apping+0.126
ning+0.121
aps+0.120
Neg logit contributions
 forgot-0.225
 never-0.210
use-0.209
 fell-0.209
 managed-0.203
Token.
Token ID13
Feature activation+3.06e-02

Pos logit contributions
 films+0.000
ums+0.000
apping+0.000
 pals+0.000
aps+0.000
Neg logit contributions
 forgot-0.001
 never-0.001
 managed-0.001
use-0.001
 missed-0.000
Token
Token ID198
Feature activation+1.43e-02

Pos logit contributions
 films+0.053
ums+0.051
apping+0.050
 pals+0.050
aps+0.048
Neg logit contributions
 forgot-0.057
 missed-0.052
 managed-0.051
 never-0.051
use-0.051
Token
Token ID198
Feature activation+1.55e-02

Pos logit contributions
apping+0.018
aps+0.018
ums+0.018
 pals+0.017
 films+0.017
Neg logit contributions
 forgot-0.034
use-0.030
 managed-0.030
 missed-0.030
 wouldn-0.029
TokenSo
Token ID2396
Feature activation-2.38e-02

Pos logit contributions
apping+0.024
aps+0.023
 films+0.023
 pals+0.023
ums+0.023
Neg logit contributions
 forgot-0.030
use-0.028
 missed-0.028
 wouldn-0.027
 managed-0.026
Token she
Token ID673
Feature activation-9.28e-02

Pos logit contributions
 forgot+0.050
use+0.049
 managed+0.047
 missed+0.046
'd+0.046
Neg logit contributions
ums-0.031
 films-0.031
 pals-0.030
apping-0.030
aps-0.029
Token kept
Token ID4030
Feature activation-5.19e-02

Pos logit contributions
 forgot+0.107
use+0.107
 missed+0.096
Maybe+0.086
 forgotten+0.083
Neg logit contributions
 pals-0.208
ning-0.199
apping-0.197
 films-0.195
ums-0.192
Token crawling
Token ID34499
Feature activation-1.73e-02

Pos logit contributions
 forgot+0.111
 managed+0.102
 missed+0.101
use+0.101
 never+0.099
Neg logit contributions
 films-0.074
ums-0.073
apping-0.072
 pals-0.071
aps-0.070
Token and
Token ID290
Feature activation-4.50e-02

Pos logit contributions
 forgot+0.035
 missed+0.032
use+0.032
 managed+0.032
 never+0.031
Neg logit contributions
 films-0.026
ums-0.026
apping-0.026
 pals-0.026
aps-0.025
Token crawling
Token ID34499
Feature activation-1.89e-02

Pos logit contributions
 forgot+0.079
use+0.075
 missed+0.072
 lied+0.069
Maybe+0.068
Neg logit contributions
 pals-0.073
apping-0.072
ning-0.072
ums-0.070
aps-0.069
Token and
Token ID290
Feature activation-4.60e-02

Pos logit contributions
 forgot+0.037
 managed+0.034
 missed+0.033
use+0.033
 never+0.033
Neg logit contributions
 films-0.030
ums-0.030
apping-0.029
 pals-0.029
aps-0.029
Token crawling
Token ID34499
Feature activation-1.86e-02

Pos logit contributions
 forgot+0.090
use+0.087
 missed+0.082
 lied+0.080
 managed+0.080
Neg logit contributions
apping-0.065
 pals-0.065
ning-0.064
ums-0.064
 films-0.062
Token is
Token ID318
Feature activation+3.17e-02

Pos logit contributions
use+0.028
 forgot+0.028
 managed+0.024
Maybe+0.023
 missed+0.022
Neg logit contributions
 pals-0.059
 films-0.059
ums-0.057
aps-0.057
birds-0.056
Token old
Token ID1468
Feature activation+8.03e-02

Pos logit contributions
ums+0.064
 films+0.064
apping+0.062
aps+0.061
 pals+0.060
Neg logit contributions
 forgot-0.052
 never-0.047
 missed-0.047
 managed-0.046
 had-0.043
Token and
Token ID290
Feature activation+1.17e-02

Pos logit contributions
ums+0.150
 films+0.148
apping+0.146
aps+0.140
 pals+0.140
Neg logit contributions
 forgot-0.139
 never-0.128
 missed-0.124
 had-0.122
 managed-0.121
Token sad
Token ID6507
Feature activation+7.02e-02

Pos logit contributions
 films+0.021
 pals+0.020
aps+0.019
ums+0.019
apping+0.019
Neg logit contributions
 forgot-0.019
use-0.018
 managed-0.017
Maybe-0.017
'd-0.016
Token.
Token ID13
Feature activation+8.40e-02

Pos logit contributions
ums+0.108
 films+0.106
apping+0.104
 pals+0.103
aps+0.100
Neg logit contributions
 forgot-0.144
 missed-0.131
 never-0.130
 managed-0.129
use-0.128
Token He
Token ID679
Feature activation-4.02e-02

Pos logit contributions
 films+0.165
ning+0.163
 pals+0.162
ums+0.161
apping+0.161
Neg logit contributions
 forgot-0.123
 missed-0.117
 never-0.116
Maybe-0.115
 had-0.112
Token has
Token ID468
Feature activation-7.12e-03

Pos logit contributions
 forgot+0.057
 managed+0.055
use+0.054
Maybe+0.051
t+0.048
Neg logit contributions
 films-0.075
aps-0.071
 pals-0.071
ums-0.070
 lions-0.069
Token no
Token ID645
Feature activation+3.59e-02

Pos logit contributions
 forgot+0.011
use+0.011
 managed+0.010
'd+0.010
Maybe+0.010
Neg logit contributions
 films-0.012
 pals-0.012
ums-0.012
aps-0.012
 lions-0.012
Token fruits
Token ID15921
Feature activation+6.69e-02

Pos logit contributions
apping+0.056
ums+0.055
 films+0.055
aps+0.052
 pals+0.051
Neg logit contributions
 forgot-0.073
 missed-0.067
 never-0.067
use-0.066
 managed-0.065
Token and
Token ID290
Feature activation+3.00e-02

Pos logit contributions
apping+0.070
ums+0.068
 films+0.066
aps+0.066
 pals+0.064
Neg logit contributions
 forgot-0.157
 managed-0.145
use-0.144
 missed-0.143
Maybe-0.142
Token no
Token ID645
Feature activation+3.89e-02

Pos logit contributions
 films+0.039
 pals+0.037
ums+0.036
aps+0.034
apping+0.034
Neg logit contributions
 forgot-0.067
 managed-0.062
use-0.061
 missed-0.061
'd-0.059
Token the
Token ID262
Feature activation+3.28e-02

Pos logit contributions
 forgot+0.059
 never+0.056
 missed+0.054
 fell+0.054
 managed+0.054
Neg logit contributions
apping-0.075
ums-0.073
 pals-0.073
 films-0.072
aps-0.070
Token helmet
Token ID14335
Feature activation-4.41e-02

Pos logit contributions
 films+0.043
 pals+0.040
ums+0.040
apping+0.039
aps+0.038
Neg logit contributions
 forgot-0.075
 managed-0.069
 missed-0.068
use-0.068
 never-0.066
Token and
Token ID290
Feature activation-5.94e-02

Pos logit contributions
 forgot+0.084
 managed+0.076
 never+0.076
 missed+0.076
use+0.075
Neg logit contributions
 films-0.073
ums-0.072
 pals-0.071
apping-0.069
aps-0.068
Token the
Token ID262
Feature activation+2.69e-02

Pos logit contributions
 forgot+0.077
use+0.077
 missed+0.069
 managed+0.068
'd+0.068
Neg logit contributions
 films-0.123
 pals-0.119
ums-0.118
ning-0.117
aps-0.116
Token gloves
Token ID18051
Feature activation-3.71e-02

Pos logit contributions
 films+0.040
ums+0.039
apping+0.039
 pals+0.038
aps+0.038
Neg logit contributions
 forgot-0.056
 managed-0.051
 missed-0.051
use-0.051
 never-0.050
Token and
Token ID290
Feature activation-5.82e-02

Pos logit contributions
 forgot+0.084
use+0.078
 missed+0.076
 managed+0.076
 never+0.073
Neg logit contributions
apping-0.047
ums-0.047
 films-0.045
aps-0.045
 pals-0.045
Token climbed
Token ID19952
Feature activation-4.24e-02

Pos logit contributions
 forgot+0.082
use+0.081
 missed+0.073
 managed+0.072
'd+0.072
Neg logit contributions
 films-0.116
 pals-0.112
ums-0.111
ning-0.110
apping-0.110
Token on
Token ID319
Feature activation-1.54e-02

Pos logit contributions
 never+0.054
 fell+0.053
 forgot+0.052
 had+0.052
 would+0.051
Neg logit contributions
ums-0.094
apping-0.093
 pals-0.091
 films-0.090
aps-0.084
Token the
Token ID262
Feature activation+4.33e-02

Pos logit contributions
 forgot+0.024
 never+0.024
 managed+0.023
 fell+0.022
 missed+0.022
Neg logit contributions
apping-0.030
ums-0.030
 films-0.029
 pals-0.029
aps-0.028
Token motorcycle
Token ID18757
Feature activation-2.08e-02

Pos logit contributions
 films+0.065
ums+0.063
apping+0.062
 pals+0.062
aps+0.060
Neg logit contributions
 forgot-0.090
 managed-0.082
 missed-0.082
use-0.081
 never-0.080
Token.
Token ID13
Feature activation+2.14e-02

Pos logit contributions
 forgot+0.037
 never+0.036
 fell+0.036
 would+0.036
 had+0.034
Neg logit contributions
 pals-0.035
 films-0.035
ums-0.035
apping-0.033
ocol-0.031