TOP ACTIVATIONS
MAX = 0.413

perfect bag now I 'm very
by one 1 , 2 ,
rare silence that added to the
anxious anymore he knew he was
it forever things are only available
and rocked higher and higher .
important thing his leash ! She
of school it was an important
something was wrong â all the water had
also very persistent â so they gathered around
special home a hut . She
her pocket just in case her
t help it â he reached out and
something strange the mirror was making
something special a pool ! It
something incredible a magical plant !
looked lovely it was bright and
taught him cleaning is important work
beautiful thing a bright rainbow !
also excited the gorilla was so

MIN ACTIVATIONS
MIN = -0.434

dream called a " night mare ". She woke up crying
always wear her flexible a pron . She also learned that
put on his special a pron and decided to travel .
put on a pink a pron and climbed on a stool
there was a cool a pron . It was worn by
. It covered her a pron , her face , her
mom gave her an a pron with a bright pink flower
he put on his a pron and helped her make spaghetti
Lily put on the a pron and felt very grown -
. Lily thought her a pron was impressive because it had
her to wear an a pron so she wouldn 't get
my gave her an a pron to wear so she wouldn
the help of her a pron , she could make creations
. The a pron suddenly began to spark le
She put on her a pron with a big zip .
She wore a pink a pron with flowers on it .
She put on an a pron so she was ready to
's get a new a pron and a new bowl .
He took off his a pron and proudly told his mom
mother was wearing an a pron . Rose wanted to wear

INTERVAL 0.201 - 0.413
CONTAINS 0.390%

her and said , " Hello , little girl . Do
reply . " Hello Joe ," said a voice
tried your best . But next time , you need to
and cold . He was trapped . " Sam
Hello , little girl . Can you help me sort these

INTERVAL -0.011 - 0.201
CONTAINS 59.559%

walked towards it . When she reached the door , it
have to find me !" His friends agreed , so Bob
have asked me first . You should have respected my things
So , Tom and his mom my walked over to the
and marched until she came across a rude man . Amy

INTERVAL -0.223 - -0.011
CONTAINS 39.793%

content . <|endoftext|> Once upon a time , there was a
It 's a hotel , it 's supposed to be fancy
glad that she had experienced the tough land and had many
new ! The old man was very happy
It was a little bird who had hurt its wing .

INTERVAL -0.434 - -0.223
CONTAINS 0.258%

from then on they were always going to be careful when
, Lily 's mom would always remember the orange tree and
Tom was very flexible and always full of energy . One
ucky was more careful and always asked for help when he
his feet made little p udd les in the mud .
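
NOTE ON THE INTERVAL BLOCKS

The four INTERVAL headers above look like an equal-width split of the range from MIN to MAX. The dump does not state the binning rule, so this is an assumption. The sketch below is a minimal check of that reading: it recomputes the edges from the printed MIN and MAX and compares them with the printed interval bounds. The helper name `bin_edges` is hypothetical, and small differences are expected because the printed MIN and MAX are already rounded to three decimals.

```python
def bin_edges(lo, hi, n_bins):
    """Return n_bins + 1 equally spaced edges from lo to hi."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]

# MIN and MAX exactly as printed in the dump.
MIN_ACT, MAX_ACT = -0.434, 0.413

# Assumption: four equal-width bins between MIN and MAX.
edges = bin_edges(MIN_ACT, MAX_ACT, 4)

# Interval bounds as printed in the INTERVAL headers, lowest to highest.
shown = [-0.434, -0.223, -0.011, 0.201, 0.413]

for computed, displayed in zip(edges, shown):
    print(f"computed {computed:+.4f}   printed {displayed:+.3f}")

# Percentages printed under CONTAINS, highest interval first.
shares = [0.390, 59.559, 39.793, 0.258]
total = sum(shares)
print(f"total share: {total:.3f}%")
```

If the assumption holds, each computed edge should agree with the printed one to within rounding, and the four CONTAINS shares should add up to roughly 100%.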
Token perfect
Token ID2818
Feature activation+3.43e-02

Pos logit contributions
+0.020
Warning+0.019
 aston+0.016
 policy+0.016
 next+0.016
Neg logit contributions
mare-0.021
pron-0.018
ack-0.018
lic-0.016
atmeal-0.016
Token bag
Token ID6131
Feature activation-6.17e-02

Pos logit contributions
+0.048
Warning+0.042
 next+0.040
 policy+0.039
 aston+0.038
Neg logit contributions
mare-0.055
lic-0.047
ack-0.047
pron-0.046
ackle-0.045
Token
Token ID6184
Feature activation-5.45e-02

Pos logit contributions
mare+0.103
ack+0.088
pron+0.088
lic+0.083
ackle+0.083
Neg logit contributions
-0.087
Warning-0.074
 next-0.069
 policy-0.066
 aston-0.061
Token
Token ID95
Feature activation-1.26e-01

Pos logit contributions
mare+0.090
pron+0.079
ack+0.078
lic+0.076
ackle+0.075
Neg logit contributions
-0.079
Warning-0.067
 next-0.058
 aston-0.057
 policy-0.056
Token
Token ID26391
Feature activation-2.54e-02

Pos logit contributions
mare+0.206
ins+0.198
pron+0.193
auldron+0.192
atmeal+0.187
Neg logit contributions
-0.200
Warning-0.155
 next-0.129
 Give-0.116
 aston-0.115
Token
Token ID447
Feature activation+4.09e-01

Pos logit contributions
mare+0.045
atmeal+0.042
pron+0.042
ackle+0.041
auldron+0.041
Neg logit contributions
-0.037
 ago-0.022
 next-0.020
Warning-0.020
 aston-0.018
Token
Token ID250
Feature activation-6.57e-02

Pos logit contributions
+0.683
Warning+0.468
Hello+0.429
 next+0.424
 aston+0.409
Neg logit contributions
mare-0.715
rows-0.619
auldron-0.600
ins-0.595
ocked-0.576
Token now
Token ID783
Feature activation+6.09e-02

Pos logit contributions
mare+0.113
pron+0.098
ack+0.096
ackle+0.091
atmeal+0.091
Neg logit contributions
-0.088
Warning-0.081
 aston-0.068
 next-0.066
 policy-0.066
Token I
Token ID314
Feature activation+7.69e-02

Pos logit contributions
+0.079
Warning+0.056
 next+0.054
 policy+0.048
 aston+0.047
Neg logit contributions
mare-0.116
pron-0.107
lic-0.105
ackle-0.105
ack-0.104
Token'm
Token ID1101
Feature activation-6.72e-02

Pos logit contributions
+0.107
Warning+0.076
 aston+0.070
 next+0.065
 policy+0.062
Neg logit contributions
mare-0.144
pron-0.130
atmeal-0.123
auldron-0.123
ackle-0.122
Token very
Token ID845
Feature activation-5.10e-02

Pos logit contributions
mare+0.120
ack+0.110
rows+0.109
ackle+0.107
auldron+0.106
Neg logit contributions
-0.097
 next-0.067
 aston-0.061
 dis-0.061
 ago-0.061
Token by
Token ID416
Feature activation-4.46e-02

Pos logit contributions
+0.023
Warning+0.022
 aston+0.019
 policy+0.019
 next+0.018
Neg logit contributions
mare-0.022
pron-0.019
ack-0.019
lic-0.017
ackle-0.017
Token one
Token ID530
Feature activation+8.68e-02

Pos logit contributions
mare+0.078
pron+0.070
ack+0.066
atmeal+0.066
lic+0.066
Neg logit contributions
-0.060
Warning-0.053
 next-0.048
 aston-0.046
 policy-0.045
Token
Token ID6184
Feature activation-5.18e-02

Pos logit contributions
+0.120
Warning+0.103
 next+0.096
 aston+0.094
 policy+0.091
Neg logit contributions
mare-0.149
pron-0.128
ack-0.122
lic-0.121
atmeal-0.120
Token
Token ID95
Feature activation-1.21e-01

Pos logit contributions
mare+0.085
pron+0.075
ack+0.070
atmeal+0.069
auldron+0.068
Neg logit contributions
-0.076
Warning-0.066
 aston-0.060
 next-0.058
 policy-0.054
Token
Token ID26391
Feature activation-1.90e-02

Pos logit contributions
mare+0.199
pron+0.185
ins+0.178
atmeal+0.175
auldron+0.172
Neg logit contributions
-0.188
Warning-0.154
 next-0.132
 aston-0.123
 counts-0.118
Token
Token ID447
Feature activation+4.05e-01

Pos logit contributions
mare+0.038
pron+0.035
atmeal+0.034
ench+0.034
rows+0.034
Neg logit contributions
-0.024
 ago-0.013
Warning-0.013
 aston-0.012
 next-0.012
Token
Token ID250
Feature activation-6.22e-02

Pos logit contributions
+0.658
Warning+0.490
Hello+0.444
 aston+0.428
 next+0.427
Neg logit contributions
mare-0.723
rows-0.608
ins-0.579
auldron-0.576
ocked-0.573
Token 1
Token ID352
Feature activation+1.79e-01

Pos logit contributions
mare+0.108
pron+0.094
ack+0.091
atmeal+0.088
ackle+0.087
Neg logit contributions
-0.082
Warning-0.075
 aston-0.063
 next-0.061
 policy-0.061
Token,
Token ID11
Feature activation+4.11e-02

Pos logit contributions
+0.235
Warning+0.185
 next+0.172
 counts+0.150
 policy+0.148
Neg logit contributions
mare-0.344
pron-0.314
auldron-0.295
ackle-0.292
ench-0.287
Token 2
Token ID362
Feature activation+1.26e-01

Pos logit contributions
+0.040
Warning+0.029
 next+0.024
 aston+0.022
 moral+0.019
Neg logit contributions
mare-0.091
pron-0.083
ack-0.080
ackle-0.079
atmeal-0.078
Token,
Token ID11
Feature activation+3.70e-02

Pos logit contributions
+0.161
Warning+0.125
 next+0.119
 aston+0.105
 counts+0.102
Neg logit contributions
mare-0.248
pron-0.217
ackle-0.205
ench-0.202
auldron-0.202
Token rare
Token ID4071
Feature activation+1.22e-01

Pos logit contributions
mare+0.004
atmeal+0.004
ack+0.003
ins+0.003
pron+0.003
Neg logit contributions
-0.003
Warning-0.002
 next-0.002
 aston-0.002
 moral-0.002
Token silence
Token ID9550
Feature activation+6.77e-03

Pos logit contributions
+0.163
Warning+0.136
 policy+0.133
 next+0.129
 moral+0.120
Neg logit contributions
mare-0.199
lic-0.180
atmeal-0.179
ack-0.176
pron-0.173
Token
Token ID6184
Feature activation-6.56e-02

Pos logit contributions
+0.009
Warning+0.007
 next+0.007
 policy+0.007
 aston+0.006
Neg logit contributions
mare-0.012
pron-0.010
atmeal-0.010
ack-0.010
auldron-0.010
Token
Token ID95
Feature activation-1.32e-01

Pos logit contributions
mare+0.100
pron+0.092
ins+0.087
atmeal+0.085
auldron+0.085
Neg logit contributions
-0.103
Warning-0.089
 aston-0.079
 next-0.077
 ago-0.072
Token
Token ID26391
Feature activation-2.97e-02

Pos logit contributions
mare+0.214
ins+0.211
pron+0.203
atmeal+0.197
auldron+0.196
Neg logit contributions
-0.210
Warning-0.173
Hello-0.137
 next-0.132
 aston-0.130
Token
Token ID447
Feature activation+4.03e-01

Pos logit contributions
mare+0.044
pron+0.044
atmeal+0.042
ins+0.041
auldron+0.039
Neg logit contributions
-0.050
 ago-0.035
Warning-0.032
 aston-0.031
 next-0.030
Token
Token ID250
Feature activation-6.80e-02

Pos logit contributions
+0.673
Warning+0.505
Hello+0.443
 aston+0.432
 next+0.430
Neg logit contributions
mare-0.704
ins-0.586
ocked-0.574
rows-0.562
auldron-0.540
Token that
Token ID326
Feature activation-7.17e-02

Pos logit contributions
mare+0.117
pron+0.102
ack+0.098
atmeal+0.094
ackle+0.094
Neg logit contributions
-0.091
Warning-0.084
 aston-0.071
 next-0.069
 policy-0.068
Token added
Token ID2087
Feature activation+2.84e-02

Pos logit contributions
mare+0.160
pron+0.145
lic+0.141
atmeal+0.141
ack+0.139
Neg logit contributions
-0.067
Warning-0.049
 next-0.040
 aston-0.038
 policy-0.036
Token to
Token ID284
Feature activation+5.06e-03

Pos logit contributions
+0.040
Warning+0.036
 aston+0.032
 policy+0.031
 next+0.031
Neg logit contributions
mare-0.046
pron-0.040
ack-0.039
atmeal-0.037
lic-0.037
Token the
Token ID262
Feature activation-3.49e-02

Pos logit contributions
+0.007
Warning+0.006
 next+0.005
 aston+0.005
 policy+0.005
Neg logit contributions
mare-0.008
pron-0.008
ack-0.007
lic-0.007
atmeal-0.007
Token anxious
Token ID18116
Feature activation+2.95e-02

Pos logit contributions
mare+0.051
pron+0.049
atmeal+0.048
ack+0.046
lor+0.046
Neg logit contributions
-0.023
 dis-0.018
 angry-0.015
 next-0.015
 unexpected-0.015
Token anymore
Token ID7471
Feature activation+4.67e-02

Pos logit contributions
+0.041
Warning+0.040
 aston+0.035
 policy+0.033
 next+0.031
Neg logit contributions
mare-0.046
ack-0.040
pron-0.039
lic-0.037
ackle-0.037
Token
Token ID6184
Feature activation-7.22e-02

Pos logit contributions
+0.062
Warning+0.054
 next+0.048
 policy+0.048
 aston+0.046
Neg logit contributions
mare-0.083
pron-0.073
ack-0.070
atmeal-0.068
lic-0.067
Token
Token ID95
Feature activation-1.39e-01

Pos logit contributions
mare+0.116
pron+0.097
rows+0.095
lic+0.095
auldron+0.094
Neg logit contributions
-0.118
Warning-0.101
 aston-0.090
 next-0.089
 policy-0.081
Token
Token ID26391
Feature activation-2.91e-02

Pos logit contributions
mare+0.235
pron+0.213
ins+0.209
lic+0.203
auldron+0.195
Neg logit contributions
-0.206
Warning-0.181
 next-0.155
 aston-0.140
 counts-0.136
Token
Token ID447
Feature activation+4.02e-01

Pos logit contributions
mare+0.052
pron+0.047
ench+0.042
iddle+0.042
atmeal+0.042
Neg logit contributions
-0.045
Warning-0.029
 next-0.028
 aston-0.026
 ago-0.025
Token
Token ID251
Feature activation-1.42e-01

Pos logit contributions
+0.598
Warning+0.487
 next+0.429
 aston+0.414
Hello+0.410
Neg logit contributions
mare-0.739
lic-0.560
rows-0.553
ocked-0.548
ins-0.543
Token he
Token ID339
Feature activation-1.03e-02

Pos logit contributions
mare+0.276
pron+0.248
ackle+0.238
auldron+0.237
atmeal+0.237
Neg logit contributions
-0.191
Warning-0.142
 next-0.136
 aston-0.114
 moral-0.112
Token knew
Token ID2993
Feature activation-1.98e-02

Pos logit contributions
mare+0.020
atmeal+0.018
pron+0.017
auldron+0.017
akes+0.017
Neg logit contributions
-0.015
 next-0.009
 moral-0.009
 aston-0.008
 counts-0.008
Token he
Token ID339
Feature activation-1.00e-02

Pos logit contributions
mare+0.033
ack+0.029
pron+0.028
ackle+0.027
auldron+0.027
Neg logit contributions
-0.028
Warning-0.025
 next-0.023
 policy-0.022
 aston-0.021
Token was
Token ID373
Feature activation+3.37e-02

Pos logit contributions
mare+0.018
pron+0.016
ack+0.015
atmeal+0.015
auldron+0.015
Neg logit contributions
-0.014
Warning-0.011
 next-0.010
 aston-0.010
 policy-0.009
Token it
Token ID340
Feature activation-1.60e-02

Pos logit contributions
+0.008
Warning+0.008
 aston+0.007
 policy+0.007
 moral+0.007
Neg logit contributions
mare-0.007
pron-0.006
ack-0.006
atmeal-0.006
auldron-0.005
Token forever
Token ID8097
Feature activation+1.24e-03

Pos logit contributions
mare+0.024
pron+0.021
ack+0.020
atmeal+0.019
ench+0.019
Neg logit contributions
-0.023
Warning-0.023
 aston-0.020
 policy-0.019
 next-0.018
Token
Token ID6184
Feature activation-6.45e-02

Pos logit contributions
Warning+0.002
+0.002
 aston+0.002
 policy+0.002
 horror+0.002
Neg logit contributions
mare-0.002
pron-0.001
ack-0.001
ench-0.001
atmeal-0.001
Token
Token ID95
Feature activation-1.31e-01

Pos logit contributions
mare+0.094
pron+0.080
lic+0.079
ack+0.078
ins+0.077
Neg logit contributions
-0.105
Warning-0.095
 aston-0.085
 next-0.082
 policy-0.078
Token
Token ID26391
Feature activation-2.66e-02

Pos logit contributions
ins+0.205
mare+0.196
pron+0.186
ucker+0.185
atmeal+0.185
Neg logit contributions
-0.205
Warning-0.181
 next-0.149
 aston-0.141
 moral-0.129
Token
Token ID447
Feature activation+4.02e-01

Pos logit contributions
mare+0.049
pron+0.047
atmeal+0.045
auldron+0.044
ackle+0.043
Neg logit contributions
-0.036
 ago-0.020
 next-0.019
Warning-0.019
 counts-0.017
Token
Token ID250
Feature activation-6.75e-02

Pos logit contributions
+0.653
Warning+0.497
 next+0.443
Hello+0.440
 aston+0.428
Neg logit contributions
mare-0.694
rows-0.568
auldron-0.558
ins-0.557
ocked-0.551
Token things
Token ID1243
Feature activation-2.25e-03

Pos logit contributions
mare+0.110
pron+0.095
ack+0.093
lic+0.088
atmeal+0.088
Neg logit contributions
-0.093
Warning-0.087
 aston-0.075
 policy-0.074
 next-0.073
Token are
Token ID389
Feature activation+2.57e-02

Pos logit contributions
mare+0.004
pron+0.003
ack+0.003
atmeal+0.003
auldron+0.003
Neg logit contributions
-0.003
Warning-0.003
 next-0.002
 policy-0.002
 aston-0.002
Token only
Token ID691
Feature activation-3.20e-02

Pos logit contributions
Warning+0.037
+0.036
 aston+0.032
 policy+0.030
 next+0.028
Neg logit contributions
mare-0.038
pron-0.034
ack-0.031
lic-0.030
ench-0.029
Token available
Token ID1695
Feature activation+1.40e-01

Pos logit contributions
mare+0.056
pron+0.049
ack+0.049
atmeal+0.047
lic+0.047
Neg logit contributions
-0.042
Warning-0.035
 next-0.032
 policy-0.031
 aston-0.031
Token and
Token ID290
Feature activation+7.77e-03

Pos logit contributions
mare+0.066
pron+0.057
ack+0.056
lic+0.053
ackle+0.052
Neg logit contributions
-0.058
Warning-0.055
 aston-0.049
 policy-0.048
 next-0.044
Token rocked
Token ID36872
Feature activation-4.14e-02

Pos logit contributions
+0.011
Warning+0.010
 next+0.009
 aston+0.009
 policy+0.008
Neg logit contributions
mare-0.013
pron-0.011
ack-0.011
ench-0.010
ackle-0.010
Token
Token ID6184
Feature activation-7.16e-02

Pos logit contributions
mare+0.062
pron+0.054
ack+0.054
lic+0.050
ackle+0.048
Neg logit contributions
-0.059
Warning-0.058
 aston-0.052
 policy-0.051
 upcoming-0.046
Token
Token ID95
Feature activation-1.37e-01

Pos logit contributions
mare+0.114
pron+0.093
ins+0.089
auldron+0.089
lic+0.087
Neg logit contributions
-0.123
Warning-0.099
 aston-0.095
 seventeen-0.083
 next-0.082
Token
Token ID26391
Feature activation-2.43e-02

Pos logit contributions
mare+0.240
ins+0.225
pron+0.219
lic+0.207
rows+0.204
Neg logit contributions
-0.214
Warning-0.170
 aston-0.135
 counts-0.129
 next-0.123
Token
Token ID447
Feature activation+4.02e-01

Pos logit contributions
mare+0.043
pron+0.038
ench+0.035
iddle+0.034
ackle+0.034
Neg logit contributions
-0.039
 ago-0.024
Warning-0.024
 next-0.022
 aston-0.021
Token
Token ID250
Feature activation-6.48e-02

Pos logit contributions
+0.649
Warning+0.524
 next+0.465
 aston+0.446
Hello+0.434
Neg logit contributions
mare-0.712
auldron-0.526
rows-0.522
ins-0.516
lic-0.513
Token higher
Token ID2440
Feature activation-8.04e-02

Pos logit contributions
mare+0.107
pron+0.093
ack+0.090
lic+0.085
atmeal+0.085
Neg logit contributions
-0.089
Warning-0.082
 aston-0.071
 next-0.069
 policy-0.068
Token and
Token ID290
Feature activation+7.16e-03

Pos logit contributions
mare+0.142
pron+0.123
atmeal+0.116
lic+0.116
ackle+0.115
Neg logit contributions
-0.111
Warning-0.092
 next-0.088
 aston-0.079
 policy-0.078
Token higher
Token ID2440
Feature activation-8.03e-02

Pos logit contributions
+0.010
Warning+0.009
 next+0.008
 aston+0.008
 policy+0.007
Neg logit contributions
mare-0.012
pron-0.010
ack-0.010
lic-0.009
ackle-0.009
Token.
Token ID13
Feature activation-1.46e-04

Pos logit contributions
mare+0.137
pron+0.120
lic+0.111
atmeal+0.110
ackle+0.109
Neg logit contributions
-0.114
Warning-0.099
 next-0.093
 aston-0.085
 policy-0.083
Token important
Token ID1593
Feature activation+1.33e-01

Pos logit contributions
+0.103
Warning+0.101
 aston+0.086
 policy+0.084
 next+0.080
Neg logit contributions
mare-0.111
pron-0.098
ack-0.095
atmeal-0.088
ackle-0.088
Token thing
Token ID1517
Feature activation+6.99e-03

Pos logit contributions
+0.154
Warning+0.139
 policy+0.126
 next+0.117
 aston+0.113
Neg logit contributions
mare-0.252
pron-0.225
ack-0.221
lic-0.216
ackle-0.212
Token
Token ID6184
Feature activation-6.51e-02

Pos logit contributions
+0.010
Warning+0.010
 aston+0.009
 policy+0.008
 next+0.008
Neg logit contributions
mare-0.011
pron-0.009
ack-0.009
atmeal-0.008
lic-0.008
Token
Token ID95
Feature activation-1.34e-01

Pos logit contributions
mare+0.091
pron+0.076
ack+0.073
ins+0.071
lic+0.070
Neg logit contributions
-0.112
Warning-0.099
 aston-0.091
 next-0.088
 policy-0.085
Token
Token ID26391
Feature activation-3.05e-02

Pos logit contributions
ins+0.226
mare+0.226
rows+0.202
pron+0.200
ucker+0.195
Neg logit contributions
-0.201
Warning-0.160
 next-0.132
 aston-0.123
 counts-0.118
Token
Token ID447
Feature activation+4.02e-01

Pos logit contributions
mare+0.048
pron+0.044
ock+0.042
iddle+0.042
ench+0.042
Neg logit contributions
-0.050
Warning-0.031
 next-0.031
 ago-0.030
 aston-0.030
Token
Token ID250
Feature activation-6.62e-02

Pos logit contributions
+0.661
Warning+0.508
 next+0.422
Hello+0.408
 aston+0.404
Neg logit contributions
mare-0.730
ins-0.570
ocked-0.564
rows-0.551
lic-0.541
Token his
Token ID465
Feature activation-1.38e-02

Pos logit contributions
mare+0.115
pron+0.099
ack+0.096
ackle+0.092
atmeal+0.092
Neg logit contributions
-0.090
Warning-0.083
 next-0.068
 aston-0.068
 policy-0.065
Token leash
Token ID40529
Feature activation-5.92e-03

Pos logit contributions
mare+0.022
pron+0.019
lic+0.019
ack+0.019
atmeal+0.018
Neg logit contributions
-0.020
Warning-0.017
 next-0.016
 aston-0.016
 policy-0.016
Token!
Token ID0
Feature activation-6.10e-02

Pos logit contributions
mare+0.011
pron+0.009
ack+0.009
auldron+0.009
atmeal+0.009
Neg logit contributions
-0.008
Warning-0.006
 next-0.006
 policy-0.005
 aston-0.005
Token She
Token ID1375
Feature activation-4.15e-02

Pos logit contributions
mare+0.154
pron+0.141
ack+0.139
lic+0.134
atmeal+0.134
Neg logit contributions
-0.029
Warning-0.023
 aston-0.013
 policy-0.012
 next-0.010
Token of
Token ID286
Feature activation-2.03e-03

Pos logit contributions
Warning+0.034
+0.032
 aston+0.029
 policy+0.028
 moral+0.027
Neg logit contributions
mare-0.032
pron-0.028
ack-0.027
atmeal-0.025
lic-0.025
Token school
Token ID1524
Feature activation-5.50e-02

Pos logit contributions
mare+0.003
pron+0.003
ack+0.003
lic+0.003
ackle+0.003
Neg logit contributions
-0.003
Warning-0.003
 aston-0.002
 next-0.002
 policy-0.002
Token
Token ID6184
Feature activation-6.79e-02

Pos logit contributions
mare+0.099
pron+0.085
ack+0.084
ench+0.081
lic+0.081
Neg logit contributions
-0.073
Warning-0.059
 next-0.058
 policy-0.052
 aston-0.051
Token
Token ID95
Feature activation-1.35e-01

Pos logit contributions
mare+0.104
pron+0.088
ack+0.087
lic+0.084
ins+0.082
Neg logit contributions
-0.106
Warning-0.093
 aston-0.085
 next-0.080
 policy-0.079
Token
Token ID26391
Feature activation-2.98e-02

Pos logit contributions
mare+0.212
ack+0.190
pron+0.188
ins+0.187
rows+0.183
Neg logit contributions
-0.208
Warning-0.177
 aston-0.153
 next-0.150
 counts-0.143
Token
Token ID447
Feature activation+4.01e-01

Pos logit contributions
mare+0.049
pron+0.044
atmeal+0.043
oster+0.043
ench+0.042
Neg logit contributions
-0.046
Warning-0.031
 aston-0.030
 ago-0.029
 next-0.029
Token
Token ID250
Feature activation-6.84e-02

Pos logit contributions
+0.602
Warning+0.479
 next+0.435
 aston+0.429
Hello+0.416
Neg logit contributions
mare-0.695
rows-0.581
ins-0.563
auldron-0.543
ocked-0.541
Token it
Token ID340
Feature activation-1.86e-02

Pos logit contributions
mare+0.123
pron+0.107
ack+0.104
ackle+0.100
atmeal+0.100
Neg logit contributions
-0.087
Warning-0.079
 next-0.065
 aston-0.065
 policy-0.062
Token was
Token ID373
Feature activation+3.50e-02

Pos logit contributions
mare+0.046
pron+0.042
ack+0.041
auldron+0.041
atmeal+0.041
Neg logit contributions
-0.016
Warning-0.006
 next-0.006
 aston-0.005
 counts-0.004
Token an
Token ID281
Feature activation+8.02e-02

Pos logit contributions
+0.051
Warning+0.042
 next+0.039
 aston+0.038
 policy+0.037
Neg logit contributions
mare-0.058
ack-0.050
pron-0.050
atmeal-0.048
ackle-0.048
Token important
Token ID1593
Feature activation+1.27e-01

Pos logit contributions
Warning+0.116
+0.099
 policy+0.088
 aston+0.088
 seventeen+0.082
Neg logit contributions
mare-0.121
pron-0.107
ack-0.102
adow-0.098
ackle-0.095
Token something
Token ID1223
Feature activation-2.83e-02

Pos logit contributions
mare+0.097
pron+0.084
ack+0.082
lic+0.078
atmeal+0.077
Neg logit contributions
-0.086
Warning-0.079
 aston-0.069
 policy-0.068
 next-0.067
Token was
Token ID373
Feature activation+3.47e-02

Pos logit contributions
mare+0.045
pron+0.042
ack+0.040
lic+0.039
ackle+0.037
Neg logit contributions
Warning-0.036
-0.034
 aston-0.030
 policy-0.029
 next-0.026
Token wrong
Token ID2642
Feature activation-1.44e-02

Pos logit contributions
+0.045
Warning+0.044
 aston+0.037
 policy+0.036
 next+0.033
Neg logit contributions
mare-0.057
pron-0.052
ack-0.049
lic-0.047
ackle-0.046
Tokenâ
Token ID22940
Feature activation-6.45e-02

Pos logit contributions
mare+0.023
pron+0.020
ack+0.019
lic+0.018
atmeal+0.018
Neg logit contributions
-0.020
Warning-0.019
 aston-0.017
 policy-0.016
 next-0.016
Token
Token ID26391
Feature activation-2.61e-02

Pos logit contributions
mare+0.100
pron+0.085
ack+0.083
lic+0.080
auldron+0.078
Neg logit contributions
-0.098
Warning-0.091
 next-0.078
 aston-0.076
 policy-0.076
Token
Token ID447
Feature activation+4.01e-01

Pos logit contributions
mare+0.044
pron+0.039
atmeal+0.038
ins+0.037
ack+0.036
Neg logit contributions
-0.040
Warning-0.028
 ago-0.028
 next-0.026
 aston-0.026
Token
Token ID251
Feature activation-1.41e-01

Pos logit contributions
+0.689
Warning+0.542
 next+0.474
 aston+0.452
Hello+0.425
Neg logit contributions
mare-0.674
pron-0.510
ins-0.497
ack-0.487
adow-0.485
Tokenall
Token ID439
Feature activation-7.19e-02

Pos logit contributions
mare+0.261
pron+0.239
atmeal+0.235
auldron+0.230
ackle+0.226
Neg logit contributions
-0.197
Warning-0.149
 next-0.133
 aston-0.120
 dis-0.116
Token the
Token ID262
Feature activation-3.20e-02

Pos logit contributions
mare+0.127
auldron+0.120
pron+0.116
atmeal+0.114
ravings+0.107
Neg logit contributions
-0.110
 next-0.080
Warning-0.073
 dis-0.067
 unexpected-0.062
Token water
Token ID1660
Feature activation-5.36e-02

Pos logit contributions
mare+0.054
pron+0.046
ack+0.046
atmeal+0.045
ench+0.044
Neg logit contributions
-0.046
Warning-0.039
 next-0.037
 moral-0.034
 aston-0.034
Token had
Token ID550
Feature activation+1.27e-01

Pos logit contributions
mare+0.089
pron+0.076
atmeal+0.073
ack+0.073
auldron+0.072
Neg logit contributions
-0.076
Warning-0.066
 next-0.062
 aston-0.058
 policy-0.055
Token also
Token ID635
Feature activation+3.52e-02

Pos logit contributions
+0.090
Warning+0.078
 next+0.072
 aston+0.072
 policy+0.068
Neg logit contributions
mare-0.103
pron-0.088
ack-0.086
atmeal-0.082
lic-0.082
Token very
Token ID845
Feature activation-5.57e-02

Pos logit contributions
+0.047
Warning+0.040
 next+0.038
 aston+0.037
 policy+0.035
Neg logit contributions
mare-0.061
pron-0.052
ack-0.051
ackle-0.049
lic-0.049
Token persistent
Token ID16218
Feature activation+1.61e-02

Pos logit contributions
mare+0.093
pron+0.080
ack+0.078
atmeal+0.074
lic+0.074
Neg logit contributions
-0.076
Warning-0.069
 aston-0.061
 policy-0.059
 next-0.059
Tokenâ
Token ID22940
Feature activation-5.86e-02

Pos logit contributions
+0.022
Warning+0.021
 aston+0.018
 policy+0.018
 next+0.018
Neg logit contributions
mare-0.026
pron-0.023
ack-0.022
lic-0.021
atmeal-0.021
Token
Token ID26391
Feature activation-2.33e-02

Pos logit contributions
mare+0.094
pron+0.081
lic+0.078
ack+0.077
auldron+0.077
Neg logit contributions
-0.087
Warning-0.080
 next-0.069
 policy-0.066
 aston-0.065
Token
Token ID447
Feature activation+4.01e-01

Pos logit contributions
mare+0.037
pron+0.033
atmeal+0.032
auldron+0.031
ackle+0.030
Neg logit contributions
-0.039
Warning-0.027
 ago-0.027
 next-0.026
 aston-0.025
Token
Token ID251
Feature activation-1.39e-01

Pos logit contributions
+0.625
Warning+0.484
 next+0.415
 aston+0.405
Hello+0.394
Neg logit contributions
mare-0.737
auldron-0.575
pron-0.556
lic-0.552
rows-0.548
Tokenso
Token ID568
Feature activation+1.85e-02

Pos logit contributions
mare+0.256
pron+0.234
auldron+0.231
atmeal+0.229
ackle+0.225
Neg logit contributions
-0.196
Warning-0.149
 next-0.134
 aston-0.124
 dis-0.112
Token they
Token ID484
Feature activation+2.58e-02

Pos logit contributions
+0.029
 next+0.019
Warning+0.019
 policy+0.017
 aston+0.017
Neg logit contributions
mare-0.032
auldron-0.030
pron-0.028
atmeal-0.028
ackle-0.027
Token gathered
Token ID9272
Feature activation+2.35e-02

Pos logit contributions
+0.038
Warning+0.023
 next+0.023
 aston+0.023
 dis+0.023
Neg logit contributions
mare-0.047
atmeal-0.043
auldron-0.043
pron-0.042
rows-0.041
Token around
Token ID1088
Feature activation+6.39e-02

Pos logit contributions
+0.033
Warning+0.033
 policy+0.029
 aston+0.027
 next+0.025
Neg logit contributions
mare-0.035
ack-0.031
pron-0.030
ench-0.028
lic-0.028
Token special
Token ID2041
Feature activation+1.03e-01

Pos logit contributions
mare+0.013
ack+0.012
pron+0.012
lic+0.011
atmeal+0.011
Neg logit contributions
-0.011
Warning-0.010
 policy-0.009
 aston-0.009
 next-0.009
Token home
Token ID1363
Feature activation-1.88e-02

Pos logit contributions
+0.155
Warning+0.149
 aston+0.129
 policy+0.122
 next+0.120
Neg logit contributions
mare-0.156
pron-0.137
ack-0.127
ench-0.119
atmeal-0.119
Token
Token ID6184
Feature activation-6.50e-02

Pos logit contributions
mare+0.029
pron+0.025
ack+0.024
atmeal+0.023
lic+0.023
Neg logit contributions
-0.027
Warning-0.026
 aston-0.023
 policy-0.022
 next-0.021
Token
Token ID95
Feature activation-1.33e-01

Pos logit contributions
mare+0.088
pron+0.076
ack+0.074
lic+0.069
ins+0.068
Neg logit contributions
-0.110
Warning-0.100
 aston-0.090
 next-0.087
 policy-0.087
Token
Token ID26391
Feature activation-2.93e-02

Pos logit contributions
ins+0.209
mare+0.204
pron+0.195
ack+0.192
rows+0.186
Neg logit contributions
-0.209
Warning-0.182
 aston-0.138
 next-0.135
 counts-0.133
Token
Token ID447
Feature activation+4.00e-01

Pos logit contributions
mare+0.052
pron+0.050
atmeal+0.048
ins+0.046
iddle+0.046
Neg logit contributions
-0.042
Warning-0.027
 aston-0.024
 next-0.024
 ago-0.023
Token
Token ID250
Feature activation-6.68e-02

Pos logit contributions
+0.644
Warning+0.515
 next+0.455
Hello+0.443
 aston+0.438
Neg logit contributions
mare-0.688
ins-0.543
rows-0.524
auldron-0.522
ocked-0.521
Token a
Token ID257
Feature activation-9.65e-03

Pos logit contributions
mare+0.111
pron+0.098
ack+0.096
lic+0.091
atmeal+0.089
Neg logit contributions
-0.084
Warning-0.080
 aston-0.071
 policy-0.071
 next-0.065
Token hut
Token ID40812
Feature activation+1.48e-02

Pos logit contributions
mare+0.016
ack+0.014
atmeal+0.014
ins+0.014
ocked+0.014
Neg logit contributions
-0.013
Warning-0.010
 aston-0.010
 next-0.010
 policy-0.009
Token.
Token ID13
Feature activation-6.32e-04

Pos logit contributions
+0.021
Warning+0.017
 next+0.017
 policy+0.016
 aston+0.015
Neg logit contributions
mare-0.024
pron-0.022
ack-0.022
lic-0.021
atmeal-0.020
Token She
Token ID1375
Feature activation-4.28e-02

Pos logit contributions
mare+0.001
pron+0.001
ack+0.001
lic+0.001
atmeal+0.001
Neg logit contributions
-0.001
Warning-0.000
 aston-0.000
 policy-0.000
 next-0.000
Token her
Token ID607
Feature activation+2.77e-02

Pos logit contributions
+0.053
Warning+0.052
 policy+0.045
 aston+0.045
 next+0.042
Neg logit contributions
mare-0.055
pron-0.049
ack-0.046
lic-0.042
ench-0.041
Token pocket
Token ID10000
Feature activation-1.02e-02

Pos logit contributions
+0.040
Warning+0.034
 next+0.032
 aston+0.032
 policy+0.030
Neg logit contributions
mare-0.044
ack-0.038
lic-0.038
pron-0.037
ackle-0.036
Token
Token ID6184
Feature activation-6.79e-02

Pos logit contributions
mare+0.019
ack+0.016
pron+0.016
lic+0.015
ackle+0.015
Neg logit contributions
-0.013
Warning-0.011
 next-0.010
 policy-0.009
 aston-0.009
Token
Token ID95
Feature activation-1.33e-01

Pos logit contributions
mare+0.098
pron+0.082
ack+0.080
lic+0.078
ins+0.076
Neg logit contributions
-0.113
Warning-0.100
 aston-0.091
 next-0.083
 policy-0.083
Token
Token ID26391
Feature activation-2.72e-02

Pos logit contributions
mare+0.211
ins+0.195
pron+0.191
ack+0.187
lic+0.183
Neg logit contributions
-0.202
Warning-0.173
 aston-0.145
 counts-0.137
 next-0.134
Token
Token ID447
Feature activation+4.00e-01

Pos logit contributions
mare+0.041
pron+0.036
atmeal+0.034
ench+0.033
lic+0.033
Neg logit contributions
-0.044
Warning-0.033
 aston-0.031
 next-0.030
 ago-0.030
Token
Token ID250
Feature activation-6.59e-02

Pos logit contributions
+0.604
Warning+0.477
 next+0.441
 aston+0.427
Hello+0.406
Neg logit contributions
mare-0.687
rows-0.560
ins-0.551
ocked-0.547
lic-0.527
Token just
Token ID655
Feature activation-1.38e-01

Pos logit contributions
mare+0.112
pron+0.097
ack+0.095
ackle+0.090
atmeal+0.090
Neg logit contributions
-0.088
Warning-0.081
 aston-0.069
 next-0.067
 policy-0.067
Token in
Token ID287
Feature activation+3.59e-02

Pos logit contributions
mare+0.225
pron+0.195
ack+0.195
lic+0.185
ackle+0.184
Neg logit contributions
-0.197
Warning-0.170
 aston-0.151
 next-0.151
 policy-0.148
Token case
Token ID1339
Feature activation-1.59e-02

Pos logit contributions
+0.049
Warning+0.042
 next+0.037
 aston+0.037
 policy+0.036
Neg logit contributions
mare-0.062
pron-0.055
ack-0.054
lic-0.054
atmeal-0.053
Token her
Token ID607
Feature activation+2.31e-02

Pos logit contributions
mare+0.028
pron+0.025
atmeal+0.024
ack+0.024
ackle+0.023
Neg logit contributions
-0.022
Warning-0.017
 next-0.016
 policy-0.015
 aston-0.014
Tokent
Token ID83
Feature activation-7.46e-03

Pos logit contributions
mare+0.218
pron+0.186
ack+0.175
auldron+0.174
atmeal+0.171
Neg logit contributions
-0.190
Warning-0.163
 next-0.141
 policy-0.141
 aston-0.137
Token help
Token ID1037
Feature activation+1.06e-01

Pos logit contributions
mare+0.014
pron+0.013
auldron+0.012
ack+0.012
ackle+0.012
Neg logit contributions
-0.010
Warning-0.008
 next-0.007
 policy-0.006
 dis-0.006
Token it
Token ID340
Feature activation-2.09e-02

Pos logit contributions
+0.127
Warning+0.106
 next+0.095
 policy+0.091
 aston+0.088
Neg logit contributions
mare-0.205
pron-0.183
ack-0.174
auldron-0.173
ackle-0.172
Tokenâ
Token ID22940
Feature activation-6.80e-02

Pos logit contributions
mare+0.036
pron+0.032
ack+0.030
lic+0.029
atmeal+0.029
Neg logit contributions
-0.028
Warning-0.025
 aston-0.022
 next-0.022
 policy-0.021
Token
Token ID26391
Feature activation-2.89e-02

Pos logit contributions
mare+0.122
pron+0.109
lic+0.108
auldron+0.107
ack+0.101
Neg logit contributions
-0.093
Warning-0.083
 next-0.069
 policy-0.061
 aston-0.060
Token
Token ID447
Feature activation+4.00e-01

Pos logit contributions
mare+0.050
pron+0.043
iddle+0.041
ackle+0.041
ock+0.041
Neg logit contributions
-0.047
Warning-0.030
 ago-0.029
 next-0.026
 aston-0.026
Token
Token ID251
Feature activation-1.45e-01

Pos logit contributions
+0.604
Warning+0.479
 next+0.407
 aston+0.374
Hello+0.356
Neg logit contributions
mare-0.737
adow-0.580
ocked-0.552
rows-0.548
ins-0.547
Tokenhe
Token ID258
Feature activation-2.45e-01

Pos logit contributions
mare+0.279
pron+0.256
auldron+0.250
ackle+0.249
atmeal+0.242
Neg logit contributions
-0.196
Warning-0.144
 next-0.135
 dis-0.112
 aston-0.112
Token reached
Token ID4251
Feature activation-5.84e-02

Pos logit contributions
mare+0.451
auldron+0.433
ench+0.424
udd+0.420
ackle+0.417
Neg logit contributions
-0.291
 next-0.256
Warning-0.217
 dis-0.205
 moral-0.192
Token out
Token ID503
Feature activation-2.33e-02

Pos logit contributions
mare+0.111
pron+0.099
atmeal+0.096
ack+0.096
lic+0.094
Neg logit contributions
-0.072
Warning-0.056
 next-0.056
 aston-0.051
 behind-0.048
Token and
Token ID290
Feature activation+5.07e-03

Pos logit contributions
mare+0.040
pron+0.036
atmeal+0.035
ack+0.034
auldron+0.033
Neg logit contributions
-0.032
Warning-0.028
 aston-0.025
 next-0.025
 policy-0.023
Token something
Token ID1223
Feature activation-3.80e-02

Pos logit contributions
+0.001
Warning+0.001
 aston+0.001
 policy+0.001
 next+0.001
Neg logit contributions
mare-0.001
pron-0.001
ack-0.001
lic-0.001
atmeal-0.001
Token strange
Token ID6283
Feature activation+1.28e-01

Pos logit contributions
pron+0.049
mare+0.048
lic+0.046
ack+0.044
ackle+0.041
Neg logit contributions
Warning-0.058
-0.049
 policy-0.048
 aston-0.048
 narrative-0.044
Token
Token ID6184
Feature activation-6.62e-02

Pos logit contributions
Warning+0.192
+0.191
 aston+0.169
 policy+0.161
 moral+0.153
Neg logit contributions
mare-0.177
pron-0.153
ack-0.148
lic-0.140
atmeal-0.135
Token
Token ID95
Feature activation-1.32e-01

Pos logit contributions
mare+0.102
pron+0.089
ack+0.086
lic+0.082
ins+0.081
Neg logit contributions
-0.102
Warning-0.090
 aston-0.081
 next-0.079
 policy-0.075
Token
Token ID26391
Feature activation-3.10e-02

Pos logit contributions
mare+0.214
ins+0.199
pron+0.195
ack+0.186
ench+0.184
Neg logit contributions
-0.206
Warning-0.174
 aston-0.140
 next-0.137
 counts-0.134
Token
Token ID447
Feature activation+4.00e-01

Pos logit contributions
mare+0.048
pron+0.045
atmeal+0.041
ench+0.041
ins+0.040
Neg logit contributions
-0.052
Warning-0.036
 ago-0.034
 aston-0.034
 next-0.033
Token
Token ID250
Feature activation-6.94e-02

Pos logit contributions
+0.696
Warning+0.570
 aston+0.494
 next+0.475
Hello+0.456
Neg logit contributions
mare-0.656
pron-0.498
ins-0.484
ocked-0.483
rows-0.481
Token the
Token ID262
Feature activation-3.47e-02

Pos logit contributions
mare+0.115
pron+0.101
ack+0.100
lic+0.095
atmeal+0.092
Neg logit contributions
-0.089
Warning-0.083
 policy-0.074
 aston-0.074
 next-0.069
Token mirror
Token ID10162
Feature activation-5.30e-02

Pos logit contributions
mare+0.055
ack+0.047
pron+0.047
lic+0.044
atmeal+0.044
Neg logit contributions
-0.050
Warning-0.046
 next-0.040
 aston-0.040
 policy-0.040
Token was
Token ID373
Feature activation+2.86e-02

Pos logit contributions
mare+0.094
pron+0.082
ack+0.081
ackle+0.078
ench+0.078
Neg logit contributions
-0.076
 next-0.055
Warning-0.055
 aston-0.048
 counts-0.048
Token making
Token ID1642
Feature activation-7.32e-02

Pos logit contributions
+0.040
Warning+0.038
 aston+0.033
 policy+0.032
 next+0.032
Neg logit contributions
mare-0.045
pron-0.039
ack-0.039
lic-0.036
atmeal-0.036
Token something
Token ID1223
Feature activation-4.51e-02

Pos logit contributions
mare+0.021
pron+0.018
ack+0.018
atmeal+0.017
ackle+0.017
Neg logit contributions
-0.018
Warning-0.017
 aston-0.014
 next-0.014
 policy-0.014
Token special
Token ID2041
Feature activation+1.03e-01

Pos logit contributions
pron+0.055
lic+0.055
mare+0.050
ack+0.048
ocked+0.046
Neg logit contributions
Warning-0.072
 aston-0.063
-0.061
 ago-0.058
 seventeen-0.057
Token
Token ID6184
Feature activation-6.92e-02

Pos logit contributions
Warning+0.175
+0.170
 aston+0.162
 seventeen+0.156
 moral+0.145
Neg logit contributions
mare-0.117
pron-0.098
adow-0.095
ack-0.089
ocked-0.089
Token
Token ID95
Feature activation-1.35e-01

Pos logit contributions
mare+0.106
pron+0.092
ack+0.089
lic+0.085
ackle+0.084
Neg logit contributions
-0.104
Warning-0.094
 aston-0.084
 next-0.081
 policy-0.080
Token
Token ID26391
Feature activation-2.90e-02

Pos logit contributions
mare+0.227
pron+0.204
ins+0.201
ack+0.197
ench+0.192
Neg logit contributions
-0.192
Warning-0.166
 aston-0.137
 next-0.133
 counts-0.130
Token
Token ID447
Feature activation+4.00e-01

Pos logit contributions
mare+0.041
pron+0.038
ench+0.036
iddle+0.035
ins+0.034
Neg logit contributions
-0.047
Warning-0.036
 counts-0.033
 aston-0.033
 next-0.033
Token
Token ID250
Feature activation-6.91e-02

Pos logit contributions
+0.678
Warning+0.603
 aston+0.537
 next+0.537
Hello+0.486
Neg logit contributions
mare-0.567
pron-0.439
ins-0.413
ocked-0.407
atmeal-0.402
Token a
Token ID257
Feature activation-7.79e-03

Pos logit contributions
mare+0.118
pron+0.104
ack+0.103
lic+0.100
atmeal+0.095
Neg logit contributions
-0.084
Warning-0.078
 policy-0.073
 aston-0.071
 next-0.064
Token pool
Token ID5933
Feature activation-7.66e-02

Pos logit contributions
mare+0.014
atmeal+0.013
ocked+0.012
ack+0.012
ins+0.012
Neg logit contributions
-0.009
Warning-0.007
 next-0.006
 precious-0.006
 policy-0.006
Token!
Token ID0
Feature activation-6.49e-02

Pos logit contributions
mare+0.111
pron+0.096
ack+0.092
atmeal+0.087
ackle+0.087
Neg logit contributions
-0.120
Warning-0.110
 policy-0.098
 aston-0.097
 next-0.096
Token It
Token ID632
Feature activation-4.24e-02

Pos logit contributions
mare+0.107
pron+0.087
ack+0.085
lic+0.083
atmeal+0.077
Neg logit contributions
Warning-0.080
-0.078
 policy-0.071
 next-0.071
 aston-0.070
Token something
Token ID1223
Feature activation-4.83e-02

Pos logit contributions
+0.004
Warning+0.003
 policy+0.003
 aston+0.003
 next+0.003
Neg logit contributions
mare-0.004
pron-0.003
ack-0.003
lic-0.003
atmeal-0.003
Token incredible
Token ID8082
Feature activation+1.47e-01

Pos logit contributions
pron+0.052
lic+0.051
ack+0.047
mare+0.047
ocked+0.046
Neg logit contributions
Warning-0.082
 aston-0.068
-0.067
 policy-0.066
 ago-0.064
Token
Token ID6184
Feature activation-7.13e-02

Pos logit contributions
+0.242
Warning+0.241
 aston+0.217
 policy+0.206
 moral+0.201
Neg logit contributions
mare-0.177
pron-0.148
ack-0.148
lic-0.140
ocked-0.139
Token
Token ID95
Feature activation-1.39e-01

Pos logit contributions
mare+0.103
pron+0.088
ack+0.086
lic+0.081
atmeal+0.080
Neg logit contributions
-0.115
Warning-0.104
 aston-0.093
 next-0.090
 policy-0.088
Token
Token ID26391
Feature activation-3.21e-02

Pos logit contributions
mare+0.219
ins+0.201
pron+0.199
ack+0.196
atmeal+0.188
Neg logit contributions
-0.211
Warning-0.184
 aston-0.150
 next-0.149
 counts-0.142
Token
Token ID447
Feature activation+3.99e-01

Pos logit contributions
mare+0.050
pron+0.046
atmeal+0.042
iddle+0.042
ench+0.042
Neg logit contributions
-0.052
Warning-0.038
 aston-0.035
 ago-0.035
 next-0.034
Token
Token ID250
Feature activation-6.94e-02

Pos logit contributions
+0.697
Warning+0.588
 aston+0.517
 next+0.505
Hello+0.468
Neg logit contributions
mare-0.618
pron-0.473
ins-0.443
atmeal-0.437
rows-0.437
Token a
Token ID257
Feature activation-1.19e-02

Pos logit contributions
mare+0.114
pron+0.101
ack+0.100
lic+0.097
atmeal+0.091
Neg logit contributions
-0.086
Warning-0.081
 policy-0.077
 aston-0.075
 next-0.068
Token magical
Token ID10883
Feature activation+1.14e-01

Pos logit contributions
mare+0.021
atmeal+0.018
ins+0.018
ack+0.018
ocked+0.017
Neg logit contributions
-0.016
Warning-0.012
 aston-0.011
 next-0.011
 precious-0.011
Token plant
Token ID4618
Feature activation+2.84e-02

Pos logit contributions
+0.149
Warning+0.126
 policy+0.120
 next+0.117
 aston+0.116
Neg logit contributions
mare-0.190
ack-0.169
atmeal-0.169
lic-0.163
pron-0.163
Token!
Token ID0
Feature activation-6.36e-02

Pos logit contributions
+0.042
Warning+0.036
 next+0.032
 aston+0.032
 policy+0.032
Neg logit contributions
mare-0.045
pron-0.039
ack-0.038
atmeal-0.037
lic-0.036
Token looked
Token ID3114
Feature activation-4.05e-02

Pos logit contributions
mare+0.028
pron+0.024
ack+0.023
lic+0.023
atmeal+0.023
Neg logit contributions
-0.022
Warning-0.018
 next-0.017
 aston-0.016
 counts-0.015
Token lovely
Token ID14081
Feature activation+5.83e-02

Pos logit contributions
mare+0.049
pron+0.046
ack+0.045
lic+0.042
ench+0.042
Neg logit contributions
Warning-0.070
-0.064
 policy-0.060
 aston-0.054
 counts-0.052
Token
Token ID6184
Feature activation-7.60e-02

Pos logit contributions
Warning+0.094
+0.091
 policy+0.079
 aston+0.078
 moral+0.074
Neg logit contributions
mare-0.078
pron-0.067
ack-0.066
ench-0.060
lic-0.060
Token
Token ID95
Feature activation-1.42e-01

Pos logit contributions
mare+0.119
pron+0.102
ack+0.099
ins+0.096
lic+0.095
Neg logit contributions
-0.116
Warning-0.099
 aston-0.091
 next-0.089
 policy-0.084
Token
Token ID26391
Feature activation-3.04e-02

Pos logit contributions
mare+0.230
ins+0.220
pron+0.208
rows+0.205
auldron+0.202
Neg logit contributions
-0.220
Warning-0.184
 next-0.150
 aston-0.147
 counts-0.141
Token
Token ID447
Feature activation+3.99e-01

Pos logit contributions
mare+0.053
pron+0.050
atmeal+0.046
ins+0.046
iddle+0.045
Neg logit contributions
-0.046
 ago-0.030
Warning-0.027
 aston-0.026
 next-0.026
Token
Token ID250
Feature activation-6.80e-02

Pos logit contributions
+0.653
Warning+0.507
 aston+0.464
 next+0.462
Hello+0.432
Neg logit contributions
mare-0.677
rows-0.511
ins-0.508
ocked-0.496
pron-0.496
Token it
Token ID340
Feature activation-1.99e-02

Pos logit contributions
mare+0.120
pron+0.105
ack+0.102
atmeal+0.098
ackle+0.098
Neg logit contributions
-0.086
Warning-0.079
 aston-0.066
 next-0.065
 policy-0.064
Token was
Token ID373
Feature activation+3.11e-02

Pos logit contributions
mare+0.041
pron+0.037
ack+0.036
atmeal+0.035
ackle+0.035
Neg logit contributions
-0.023
Warning-0.015
 next-0.015
 aston-0.014
 counts-0.013
Token bright
Token ID6016
Feature activation+5.95e-02

Pos logit contributions
+0.045
Warning+0.040
 aston+0.035
 next+0.035
 policy+0.035
Neg logit contributions
mare-0.050
pron-0.043
ack-0.042
atmeal-0.040
lic-0.040
Token and
Token ID290
Feature activation+8.39e-03

Pos logit contributions
+0.083
Warning+0.071
 next+0.065
 aston+0.065
 policy+0.062
Neg logit contributions
mare-0.099
pron-0.086
ack-0.084
atmeal-0.082
ackle-0.080
Token taught
Token ID7817
Feature activation-1.66e-02

Pos logit contributions
mare+0.401
ack+0.341
pron+0.336
ackle+0.332
ench+0.331
Neg logit contributions
-0.299
Warning-0.230
 next-0.221
 policy-0.192
 dis-0.184
Token him
Token ID683
Feature activation+6.64e-02

Pos logit contributions
mare+0.027
pron+0.024
ack+0.023
lic+0.022
atmeal+0.022
Neg logit contributions
-0.021
Warning-0.021
 aston-0.018
 policy-0.017
 next-0.016
Token
Token ID6184
Feature activation-6.95e-02

Pos logit contributions
+0.088
Warning+0.072
 next+0.068
 policy+0.063
 aston+0.062
Neg logit contributions
mare-0.122
pron-0.108
ack-0.103
lic-0.100
ackle-0.098
Token
Token ID95
Feature activation-1.36e-01

Pos logit contributions
mare+0.102
pron+0.083
lic+0.078
ack+0.077
ackle+0.073
Neg logit contributions
-0.118
Warning-0.106
 aston-0.095
 next-0.090
 policy-0.089
Token
Token ID26391
Feature activation-3.16e-02

Pos logit contributions
mare+0.227
ins+0.194
pron+0.193
ack+0.187
ocked+0.182
Neg logit contributions
-0.208
Warning-0.184
 next-0.149
 aston-0.148
 counts-0.137
Token
Token ID447
Feature activation+3.99e-01

Pos logit contributions
mare+0.050
pron+0.042
ench+0.040
ackle+0.039
ack+0.039
Neg logit contributions
-0.054
Warning-0.037
 next-0.036
 aston-0.034
 ago-0.034
Token
Token ID250
Feature activation-6.84e-02

Pos logit contributions
+0.646
Warning+0.547
 next+0.473
 aston+0.453
Hello+0.435
Neg logit contributions
mare-0.678
lic-0.515
pron-0.513
ocked-0.511
ins-0.511
Token cleaning
Token ID12724
Feature activation+6.37e-02

Pos logit contributions
mare+0.119
pron+0.102
ack+0.098
ackle+0.095
auldron+0.095
Neg logit contributions
-0.093
Warning-0.085
 next-0.071
 aston-0.070
 policy-0.068
Token is
Token ID318
Feature activation+1.60e-02

Pos logit contributions
+0.088
 next+0.068
Warning+0.067
 policy+0.059
 aston+0.057
Neg logit contributions
mare-0.114
lic-0.097
ackle-0.097
pron-0.096
ack-0.096
Token important
Token ID1593
Feature activation+1.28e-01

Pos logit contributions
+0.023
Warning+0.019
 next+0.018
 policy+0.017
 aston+0.016
Neg logit contributions
mare-0.027
ack-0.023
pron-0.023
ackle-0.022
atmeal-0.022
Token work
Token ID670
Feature activation-5.50e-02

Pos logit contributions
+0.168
Warning+0.143
 next+0.134
 policy+0.131
 aston+0.122
Neg logit contributions
mare-0.223
ack-0.197
pron-0.196
lic-0.188
atmeal-0.187
Token beautiful
Token ID4950
Feature activation+4.89e-02

Pos logit contributions
Warning+0.055
+0.050
 policy+0.044
 aston+0.040
 counts+0.040
Neg logit contributions
mare-0.056
pron-0.056
ack-0.055
lic-0.051
ackle-0.051
Token thing
Token ID1517
Feature activation+2.78e-03

Pos logit contributions
+0.067
Warning+0.063
 aston+0.054
 policy+0.053
 next+0.051
Neg logit contributions
mare-0.080
pron-0.070
ack-0.068
lic-0.064
atmeal-0.064
Token
Token ID6184
Feature activation-6.96e-02

Pos logit contributions
Warning+0.004
+0.004
 aston+0.004
 moral+0.004
 policy+0.004
Neg logit contributions
mare-0.004
pron-0.003
ack-0.003
lic-0.003
ocked-0.003
Token
Token ID95
Feature activation-1.37e-01

Pos logit contributions
mare+0.090
pron+0.075
ack+0.072
lic+0.068
atmeal+0.067
Neg logit contributions
-0.121
Warning-0.114
 aston-0.102
 policy-0.100
 next-0.099
Token
Token ID26391
Feature activation-3.16e-02

Pos logit contributions
mare+0.216
pron+0.194
ins+0.190
ack+0.186
atmeal+0.186
Neg logit contributions
-0.210
Warning-0.186
 aston-0.156
 next-0.150
 policy-0.141
Token
Token ID447
Feature activation+3.99e-01

Pos logit contributions
mare+0.052
pron+0.048
atmeal+0.046
ins+0.043
iddle+0.043
Neg logit contributions
-0.049
Warning-0.035
 aston-0.033
 ago-0.032
 next-0.032
Token
Token ID250
Feature activation-6.90e-02

Pos logit contributions
+0.657
Warning+0.550
 aston+0.499
 next+0.483
Hello+0.461
Neg logit contributions
mare-0.638
pron-0.491
ins-0.481
rows-0.479
lic-0.479
Token a
Token ID257
Feature activation-1.28e-02

Pos logit contributions
mare+0.117
pron+0.104
ack+0.103
lic+0.098
atmeal+0.093
Neg logit contributions
-0.083
Warning-0.079
 policy-0.071
 aston-0.070
 next-0.065
Token bright
Token ID6016
Feature activation+5.91e-02

Pos logit contributions
mare+0.023
atmeal+0.020
ack+0.019
ins+0.019
rows+0.019
Neg logit contributions
-0.017
Warning-0.012
 aston-0.012
 next-0.012
 precious-0.011
Token rainbow
Token ID27223
Feature activation+1.37e-01

Pos logit contributions
+0.074
 aston+0.058
Warning+0.058
 policy+0.058
 next+0.057
Neg logit contributions
mare-0.101
atmeal-0.093
lic-0.089
ack-0.088
ackle-0.088
Token!
Token ID0
Feature activation-6.49e-02

Pos logit contributions
+0.206
Warning+0.167
 next+0.159
 policy+0.149
 aston+0.147
Neg logit contributions
mare-0.217
pron-0.187
ack-0.181
ackle-0.180
auldron-0.178
Token also
Token ID635
Feature activation+2.45e-02

Pos logit contributions
+0.046
Warning+0.037
 next+0.036
 aston+0.035
 policy+0.031
Neg logit contributions
mare-0.058
pron-0.050
ack-0.049
atmeal-0.048
auldron-0.046
Token excited
Token ID6568
Feature activation+6.54e-02

Pos logit contributions
+0.031
 next+0.023
 aston+0.023
Warning+0.023
 precious+0.020
Neg logit contributions
mare-0.047
pron-0.040
atmeal-0.040
ack-0.040
auldron-0.039
Token
Token ID6184
Feature activation-6.93e-02

Pos logit contributions
+0.093
Warning+0.087
 aston+0.076
 policy+0.074
 next+0.073
Neg logit contributions
mare-0.104
pron-0.090
ack-0.088
lic-0.083
atmeal-0.083
Token
Token ID95
Feature activation-1.34e-01

Pos logit contributions
mare+0.106
pron+0.089
ack+0.084
ins+0.081
auldron+0.080
Neg logit contributions
-0.114
Warning-0.099
 aston-0.090
 next-0.085
 policy-0.083
Token
Token ID26391
Feature activation-3.00e-02

Pos logit contributions
mare+0.233
pron+0.209
ack+0.201
ins+0.197
ench+0.187
Neg logit contributions
-0.191
Warning-0.164
 aston-0.138
 next-0.132
 counts-0.131
Token
Token ID447
Feature activation+3.99e-01

Pos logit contributions
mare+0.047
pron+0.043
ock+0.039
iddle+0.039
ocked+0.038
Neg logit contributions
-0.048
Warning-0.034
 aston-0.033
 next-0.032
 counts-0.031
Token
Token ID250
Feature activation-6.90e-02

Pos logit contributions
+0.616
Warning+0.501
 next+0.452
 aston+0.440
Hello+0.422
Neg logit contributions
mare-0.703
ins-0.531
pron-0.530
auldron-0.522
rows-0.518
Token the
Token ID262
Feature activation-2.95e-02

Pos logit contributions
mare+0.124
pron+0.108
ack+0.105
atmeal+0.101
ackle+0.101
Neg logit contributions
-0.086
Warning-0.079
 aston-0.066
 next-0.065
 policy-0.063
Token gorilla
Token ID45314
Feature activation-2.41e-02

Pos logit contributions
mare+0.046
pron+0.040
ack+0.039
atmeal+0.037
lic+0.036
Neg logit contributions
-0.042
Warning-0.040
 aston-0.035
 policy-0.034
 next-0.032
Token was
Token ID373
Feature activation+2.93e-02

Pos logit contributions
mare+0.043
pron+0.039
ack+0.037
auldron+0.036
ackle+0.036
Neg logit contributions
-0.035
 next-0.025
Warning-0.024
 aston-0.022
 counts-0.022
Token so
Token ID523
Feature activation-6.81e-02

Pos logit contributions
+0.041
Warning+0.035
 next+0.032
 aston+0.032
 policy+0.030
Neg logit contributions
mare-0.049
ack-0.043
pron-0.043
atmeal-0.041
ackle-0.040
Token dream
Token ID4320
Feature activation-4.74e-02

Pos logit contributions
+0.121
 policy+0.107
Warning+0.105
 next+0.104
 moral+0.096
Neg logit contributions
mare-0.147
lic-0.136
ack-0.127
pron-0.124
ins-0.123
Token called
Token ID1444
Feature activation-5.60e-02

Pos logit contributions
mare+0.095
ack+0.088
ins+0.087
ench+0.086
ucker+0.085
Neg logit contributions
-0.046
 next-0.043
Warning-0.032
 policy-0.032
 happen-0.032
Token a
Token ID257
Feature activation-1.51e-02

Pos logit contributions
mare+0.094
ack+0.087
pron+0.085
atmeal+0.082
ackle+0.081
Neg logit contributions
-0.073
Warning-0.064
 next-0.057
 policy-0.056
 aston-0.055
Token "
Token ID366
Feature activation-4.54e-02

Pos logit contributions
mare+0.028
ins+0.027
ack+0.027
ocked+0.026
ench+0.026
Neg logit contributions
-0.016
 next-0.013
 moral-0.012
 precious-0.011
 policy-0.011
Tokennight
Token ID3847
Feature activation-1.19e-01

Pos logit contributions
mare+0.068
pron+0.058
ack+0.056
auldron+0.053
ackle+0.053
Neg logit contributions
-0.068
Warning-0.065
 next-0.056
 aston-0.053
 policy-0.053
Tokenmare
Token ID11449
Feature activation-4.34e-01

Pos logit contributions
udd+0.126
atmeal+0.119
ins+0.118
ulk+0.116
ucker+0.115
Neg logit contributions
-0.206
 next-0.199
 true-0.170
 ago-0.169
 policy-0.167
Token".
Token ID1911
Feature activation-7.04e-02

Pos logit contributions
atmeal+0.699
pron+0.651
ack+0.648
mare+0.635
ins+0.625
Neg logit contributions
-0.582
 next-0.481
Warning-0.412
 moral-0.396
 ago-0.391
Token She
Token ID1375
Feature activation-4.64e-02

Pos logit contributions
mare+0.147
ack+0.140
ins+0.140
pron+0.137
atmeal+0.136
Neg logit contributions
-0.066
Warning-0.047
 next-0.041
 aston-0.036
 counts-0.035
Token woke
Token ID19092
Feature activation-1.90e-02

Pos logit contributions
mare+0.084
atmeal+0.078
ench+0.078
pron+0.077
ack+0.077
Neg logit contributions
-0.059
 next-0.041
Warning-0.039
 moral-0.036
 aston-0.036
Token up
Token ID510
Feature activation+6.23e-02

Pos logit contributions
mare+0.046
lic+0.041
pron+0.040
ack+0.040
atmeal+0.039
Neg logit contributions
-0.011
 next-0.010
 behind-0.008
 closely-0.007
Warning-0.007
Token crying
Token ID13774
Feature activation+1.30e-01

Pos logit contributions
+0.086
Warning+0.075
 next+0.073
 aston+0.069
 policy+0.065
Neg logit contributions
mare-0.102
pron-0.088
ack-0.088
lic-0.086
atmeal-0.085
Token always
Token ID1464
Feature activation-2.36e-01

Pos logit contributions
+0.005
Warning+0.004
 next+0.004
 policy+0.004
 aston+0.004
Neg logit contributions
mare-0.006
ack-0.005
pron-0.005
lic-0.005
ackle-0.005
Token wear
Token ID5806
Feature activation-5.69e-02

Pos logit contributions
mare+0.360
pron+0.312
ack+0.308
lic+0.289
ackle+0.287
Neg logit contributions
-0.354
Warning-0.324
 aston-0.289
 next-0.288
 policy-0.286
Token her
Token ID607
Feature activation+1.79e-02

Pos logit contributions
mare+0.108
ack+0.094
pron+0.093
lic+0.090
atmeal+0.089
Neg logit contributions
-0.067
Warning-0.056
 next-0.052
 aston-0.051
 policy-0.048
Token flexible
Token ID12846
Feature activation+3.16e-02

Pos logit contributions
+0.024
Warning+0.022
 aston+0.019
 policy+0.019
 next+0.019
Neg logit contributions
mare-0.030
pron-0.026
ack-0.026
lic-0.025
atmeal-0.024
Token a
Token ID257
Feature activation-1.54e-02

Pos logit contributions
+0.047
Warning+0.044
 aston+0.039
 policy+0.038
 next+0.037
Neg logit contributions
mare-0.048
pron-0.041
ack-0.040
lic-0.038
atmeal-0.038
Tokenpron
Token ID31186
Feature activation-3.64e-01

Pos logit contributions
mare+0.027
atmeal+0.023
ack+0.023
lic+0.022
ackle+0.022
Neg logit contributions
-0.020
 next-0.016
 policy-0.016
 precious-0.015
 aston-0.015
Token.
Token ID13
Feature activation-4.58e-03

Pos logit contributions
mare+0.661
ack+0.577
pron+0.566
lic+0.557
atmeal+0.544
Neg logit contributions
-0.465
 next-0.383
Warning-0.382
 policy-0.352
 aston-0.316
Token She
Token ID1375
Feature activation-4.57e-02

Pos logit contributions
mare+0.008
pron+0.007
ack+0.007
atmeal+0.007
lic+0.007
Neg logit contributions
-0.006
Warning-0.005
 aston-0.005
 next-0.005
 policy-0.005
Token also
Token ID635
Feature activation+2.23e-02

Pos logit contributions
mare+0.082
lic+0.070
pron+0.069
ack+0.067
ench+0.065
Neg logit contributions
-0.067
Warning-0.050
 next-0.049
 moral-0.048
 aston-0.047
Token learned
Token ID4499
Feature activation-5.57e-02

Pos logit contributions
+0.032
Warning+0.024
 aston+0.022
 next+0.022
 moral+0.022
Neg logit contributions
mare-0.039
pron-0.035
ack-0.034
lic-0.034
ackle-0.033
Token that
Token ID326
Feature activation-8.11e-02

Pos logit contributions
mare+0.090
pron+0.078
ack+0.077
atmeal+0.074
lic+0.070
Neg logit contributions
-0.084
Warning-0.074
 next-0.068
 policy-0.064
 aston-0.063
Token put
Token ID1234
Feature activation-3.65e-02

Pos logit contributions
mare+0.069
ack+0.061
ench+0.061
pron+0.061
ackle+0.058
Neg logit contributions
-0.036
Warning-0.028
 next-0.027
 aston-0.023
 moral-0.022
Token on
Token ID319
Feature activation+3.72e-02

Pos logit contributions
mare+0.063
pron+0.055
ack+0.054
lic+0.051
ackle+0.051
Neg logit contributions
-0.047
Warning-0.044
 aston-0.037
 next-0.037
 policy-0.037
Token his
Token ID465
Feature activation-1.94e-02

Pos logit contributions
+0.055
Warning+0.052
 aston+0.045
 policy+0.044
 next+0.044
Neg logit contributions
mare-0.057
pron-0.049
ack-0.048
lic-0.045
atmeal-0.045
Token special
Token ID2041
Feature activation+9.23e-02

Pos logit contributions
mare+0.027
pron+0.026
atmeal+0.022
auldron+0.021
ack+0.021
Neg logit contributions
Warning-0.032
-0.031
 aston-0.026
 policy-0.026
 comparison-0.024
Token a
Token ID257
Feature activation-1.44e-02

Pos logit contributions
+0.162
Warning+0.162
 aston+0.139
 policy+0.130
Those+0.130
Neg logit contributions
mare-0.118
pron-0.112
atmeal-0.085
ack-0.084
auldron-0.083
Tokenpron
Token ID31186
Feature activation-3.64e-01

Pos logit contributions
mare+0.024
ack+0.020
lic+0.019
ench+0.019
atmeal+0.019
Neg logit contributions
-0.019
Warning-0.016
 next-0.016
 policy-0.015
 aston-0.015
Token and
Token ID290
Feature activation-7.19e-04

Pos logit contributions
mare+0.659
ack+0.626
lic+0.596
ench+0.586
ackle+0.578
Neg logit contributions
-0.371
 next-0.362
Warning-0.309
 behind-0.302
 policy-0.302
Token decided
Token ID3066
Feature activation+1.18e-01

Pos logit contributions
mare+0.001
ack+0.001
pron+0.001
ackle+0.001
ench+0.001
Neg logit contributions
-0.001
Warning-0.001
 next-0.001
 aston-0.001
 policy-0.001
Token to
Token ID284
Feature activation-4.67e-03

Pos logit contributions
+0.158
Warning+0.146
 aston+0.126
 policy+0.124
 next+0.124
Neg logit contributions
mare-0.197
pron-0.172
ack-0.170
atmeal-0.159
lic-0.159
Token travel
Token ID3067
Feature activation+2.80e-02

Pos logit contributions
mare+0.009
ins+0.009
udd+0.009
lic+0.008
ackle+0.008
Neg logit contributions
-0.006
Warning-0.004
 next-0.004
 happen-0.003
 aston-0.003
Token.
Token ID13
Feature activation-9.19e-03

Pos logit contributions
+0.035
Warning+0.032
 next+0.029
 policy+0.026
 aston+0.026
Neg logit contributions
mare-0.050
pron-0.044
ack-0.043
atmeal-0.042
lic-0.041
Token put
Token ID1234
Feature activation-2.87e-02

Pos logit contributions
mare+0.072
pron+0.062
ack+0.060
lic+0.057
atmeal+0.057
Neg logit contributions
-0.065
Warning-0.060
 aston-0.053
 policy-0.052
 next-0.051
Token on
Token ID319
Feature activation+4.20e-02

Pos logit contributions
mare+0.052
pron+0.045
ack+0.045
lic+0.042
ackle+0.042
Neg logit contributions
-0.035
Warning-0.032
 aston-0.027
 next-0.027
 policy-0.026
Token a
Token ID257
Feature activation-1.24e-02

Pos logit contributions
+0.058
Warning+0.055
 policy+0.048
 aston+0.047
 next+0.044
Neg logit contributions
mare-0.067
pron-0.060
ack-0.057
atmeal-0.054
lic-0.053
Token pink
Token ID11398
Feature activation+1.37e-01

Pos logit contributions
mare+0.018
pron+0.016
ack+0.015
lic+0.014
atmeal+0.014
Neg logit contributions
-0.019
Warning-0.019
 aston-0.017
 policy-0.016
 next-0.015
Token a
Token ID257
Feature activation-1.26e-02

Pos logit contributions
+0.216
Warning+0.197
 next+0.179
 aston+0.178
 policy+0.174
Neg logit contributions
mare-0.192
ack-0.162
pron-0.158
lic-0.153
ackle-0.152
Tokenpron
Token ID31186
Feature activation-3.63e-01

Pos logit contributions
mare+0.021
pron+0.018
ack+0.018
lic+0.017
atmeal+0.017
Neg logit contributions
-0.017
Warning-0.016
 aston-0.014
 policy-0.014
 next-0.013
Token and
Token ID290
Feature activation+5.57e-03

Pos logit contributions
mare+0.681
ack+0.638
ench+0.611
lic+0.607
ackle+0.599
Neg logit contributions
-0.367
 next-0.343
 behind-0.281
Warning-0.276
 policy-0.261
Token climbed
Token ID19952
Feature activation-6.79e-02

Pos logit contributions
+0.008
Warning+0.008
 aston+0.007
 policy+0.007
 next+0.006
Neg logit contributions
mare-0.008
pron-0.007
ack-0.007
lic-0.007
atmeal-0.007
Token on
Token ID319
Feature activation+4.22e-02

Pos logit contributions
mare+0.130
pron+0.114
ack+0.113
atmeal+0.107
ackle+0.107
Neg logit contributions
-0.078
Warning-0.068
 next-0.060
 aston-0.056
 policy-0.053
Token a
Token ID257
Feature activation-1.24e-02

Pos logit contributions
+0.057
Warning+0.052
 aston+0.045
 next+0.044
 policy+0.043
Neg logit contributions
mare-0.072
pron-0.062
ack-0.061
lic-0.058
atmeal-0.058
Token stool
Token ID38753
Feature activation-9.03e-02

Pos logit contributions
mare+0.018
ack+0.014
atmeal+0.013
ackle+0.013
lic+0.013
Neg logit contributions
-0.021
Warning-0.017
 next-0.016
 aston-0.016
 policy-0.015
Token there
Token ID612
Feature activation-1.00e-02

Pos logit contributions
+0.011
Warning+0.011
 aston+0.010
 policy+0.010
 next+0.009
Neg logit contributions
mare-0.008
pron-0.006
ack-0.006
atmeal-0.006
lic-0.006
Token was
Token ID373
Feature activation+2.37e-02

Pos logit contributions
mare+0.014
ack+0.012
pron+0.012
atmeal+0.011
lic+0.011
Neg logit contributions
Warning-0.015
-0.015
 policy-0.013
 aston-0.013
 next-0.012
Token a
Token ID257
Feature activation-1.39e-02

Pos logit contributions
Warning+0.046
 policy+0.043
 seventeen+0.040
+0.040
edient+0.039
Neg logit contributions
mare-0.019
ack-0.017
pron-0.017
adow-0.015
atmeal-0.015
Token cool
Token ID3608
Feature activation-5.81e-03

Pos logit contributions
mare+0.023
atmeal+0.020
ocked+0.019
ench+0.019
ack+0.019
Neg logit contributions
-0.019
 next-0.015
Warning-0.015
 aston-0.014
 moral-0.014
Token a
Token ID257
Feature activation-1.40e-02

Pos logit contributions
mare+0.010
lic+0.009
hes+0.009
ins+0.008
pron+0.008
Neg logit contributions
-0.007
 next-0.006
 policy-0.005
Warning-0.005
 orange-0.005
Tokenpron
Token ID31186
Feature activation-3.63e-01

Pos logit contributions
mare+0.024
ench+0.021
ocked+0.021
ack+0.021
atmeal+0.020
Neg logit contributions
-0.018
 next-0.015
 moral-0.013
Warning-0.013
 aston-0.013
Token.
Token ID13
Feature activation-1.06e-02

Pos logit contributions
mare+0.674
ack+0.633
ench+0.614
ackle+0.603
auldron+0.595
Neg logit contributions
-0.392
 next-0.349
 policy-0.298
Warning-0.296
 behind-0.253
Token It
Token ID632
Feature activation-4.87e-02

Pos logit contributions
mare+0.020
rows+0.019
ack+0.019
ackle+0.018
pron+0.018
Neg logit contributions
-0.013
Warning-0.011
 aston-0.009
 policy-0.008
 next-0.008
Token was
Token ID373
Feature activation+2.35e-02

Pos logit contributions
mare+0.100
ench+0.088
ack+0.086
ackle+0.085
pron+0.083
Neg logit contributions
-0.056
 next-0.039
Warning-0.038
 aston-0.032
 counts-0.032
Token worn
Token ID12666
Feature activation-2.58e-02

Pos logit contributions
+0.030
Warning+0.026
 next+0.024
 aston+0.023
 policy+0.022
Neg logit contributions
mare-0.042
ack-0.036
pron-0.035
ackle-0.034
atmeal-0.034
Token by
Token ID416
Feature activation-8.89e-02

Pos logit contributions
mare+0.048
pron+0.042
atmeal+0.042
ack+0.042
ackle+0.039
Neg logit contributions
-0.032
Warning-0.025
 next-0.024
 precious-0.022
 aston-0.021
Token.
Token ID13
Feature activation-8.33e-03

Pos logit contributions
+0.009
Warning+0.008
 next+0.007
 policy+0.007
 aston+0.007
Neg logit contributions
mare-0.012
pron-0.010
ack-0.010
atmeal-0.010
ench-0.010
Token It
Token ID632
Feature activation-5.18e-02

Pos logit contributions
mare+0.014
pron+0.012
ack+0.012
atmeal+0.012
ackle+0.011
Neg logit contributions
-0.011
Warning-0.010
 aston-0.009
 policy-0.009
 next-0.009
Token covered
Token ID5017
Feature activation-1.25e-02

Pos logit contributions
mare+0.083
pron+0.072
ack+0.070
atmeal+0.066
lic+0.066
Neg logit contributions
-0.074
Warning-0.068
 aston-0.060
 policy-0.058
 next-0.058
Token her
Token ID607
Feature activation+1.60e-02

Pos logit contributions
mare+0.021
pron+0.018
ack+0.018
atmeal+0.017
lic+0.017
Neg logit contributions
-0.017
Warning-0.015
 aston-0.013
 next-0.013
 policy-0.013
Token a
Token ID257
Feature activation-1.36e-02

Pos logit contributions
+0.023
Warning+0.020
 next+0.019
 aston+0.019
 policy+0.017
Neg logit contributions
mare-0.027
pron-0.022
lic-0.022
ack-0.021
atmeal-0.020
Tokenpron
Token ID31186
Feature activation-3.62e-01

Pos logit contributions
mare+0.023
ack+0.019
pron+0.019
atmeal+0.019
lic+0.018
Neg logit contributions
-0.018
Warning-0.016
 aston-0.015
 next-0.014
 policy-0.014
Token,
Token ID11
Feature activation+3.51e-02

Pos logit contributions
mare+0.726
lic+0.623
ack+0.606
atmeal+0.604
pron+0.602
Neg logit contributions
-0.420
 next-0.350
Warning-0.307
 behind-0.281
 policy-0.272
Token her
Token ID607
Feature activation+1.53e-02

Pos logit contributions
+0.049
Warning+0.047
 aston+0.040
 policy+0.040
 next+0.037
Neg logit contributions
mare-0.056
pron-0.049
ack-0.047
lic-0.044
atmeal-0.044
Token face
Token ID1986
Feature activation-1.27e-01

Pos logit contributions
+0.021
 next+0.018
Warning+0.017
 aston+0.017
 policy+0.016
Neg logit contributions
mare-0.026
lic-0.023
ack-0.022
pron-0.021
ackle-0.020
Token,
Token ID11
Feature activation+3.50e-02

Pos logit contributions
mare+0.229
ack+0.189
pron+0.187
ackle+0.182
ench+0.179
Neg logit contributions
-0.171
Warning-0.138
 next-0.136
 aston-0.125
 counts-0.113
Token her
Token ID607
Feature activation+1.48e-02

Pos logit contributions
+0.045
Warning+0.041
 next+0.037
 aston+0.036
 policy+0.034
Neg logit contributions
mare-0.061
ack-0.053
pron-0.052
lic-0.051
atmeal-0.050
Token mom
Token ID1995
Feature activation+6.85e-02

Pos logit contributions
Warning+0.034
+0.034
 aston+0.028
 policy+0.027
 seventeen+0.027
Neg logit contributions
mare-0.032
pron-0.028
ack-0.026
auldron-0.025
ackle-0.025
Token gave
Token ID2921
Feature activation+2.79e-02

Pos logit contributions
+0.099
Warning+0.087
 next+0.078
 aston+0.076
 policy+0.075
Neg logit contributions
mare-0.110
pron-0.095
ack-0.094
ackle-0.088
atmeal-0.087
Token her
Token ID607
Feature activation+2.20e-02

Pos logit contributions
+0.035
Warning+0.032
 aston+0.027
 next+0.027
 policy+0.026
Neg logit contributions
mare-0.049
pron-0.044
ack-0.042
lic-0.041
atmeal-0.041
Token an
Token ID281
Feature activation+7.19e-02

Pos logit contributions
+0.028
Warning+0.024
 next+0.021
 aston+0.021
 policy+0.020
Neg logit contributions
mare-0.040
ack-0.035
pron-0.035
lic-0.033
ackle-0.033
Token a
Token ID257
Feature activation-1.25e-02

Pos logit contributions
Warning+0.104
+0.097
 aston+0.083
 policy+0.079
 counts+0.079
Neg logit contributions
mare-0.106
pron-0.102
atmeal-0.089
auldron-0.087
iddle-0.087
Tokenpron
Token ID31186
Feature activation-3.62e-01

Pos logit contributions
mare+0.022
ack+0.019
pron+0.018
atmeal+0.018
ackle+0.018
Neg logit contributions
-0.016
Warning-0.014
 aston-0.012
 policy-0.012
 next-0.012
Token with
Token ID351
Feature activation+2.68e-02

Pos logit contributions
mare+0.628
ack+0.579
pron+0.550
ench+0.534
ackle+0.532
Neg logit contributions
-0.451
Warning-0.375
 next-0.363
 policy-0.358
 aston-0.309
Token a
Token ID257
Feature activation-1.24e-02

Pos logit contributions
+0.039
Warning+0.039
 policy+0.035
 aston+0.034
 next+0.031
Neg logit contributions
mare-0.038
pron-0.035
ack-0.032
lic-0.031
atmeal-0.030
Token bright
Token ID6016
Feature activation+5.57e-02

Pos logit contributions
mare+0.019
ack+0.016
atmeal+0.015
ackle+0.015
ench+0.014
Neg logit contributions
-0.019
Warning-0.016
 aston-0.015
 next-0.015
 policy-0.015
Token pink
Token ID11398
Feature activation+1.35e-01

Pos logit contributions
+0.057
Warning+0.042
 aston+0.041
 next+0.039
 orange+0.039
Neg logit contributions
mare-0.111
ack-0.097
atmeal-0.097
ackle-0.096
lic-0.095
Token flower
Token ID15061
Feature activation+2.88e-02

Pos logit contributions
+0.181
Warning+0.153
 next+0.146
 policy+0.137
 aston+0.130
Neg logit contributions
mare-0.224
ack-0.198
lic-0.193
ackle-0.189
ench-0.183
Token he
Token ID339
Feature activation-1.52e-02

Pos logit contributions
mare+0.151
ack+0.140
iddle+0.135
ric+0.134
ench+0.134
Neg logit contributions
-0.077
 next-0.066
Warning-0.053
 policy-0.052
 aston-0.048
Token put
Token ID1234
Feature activation-3.39e-02

Pos logit contributions
mare+0.027
ric+0.025
ack+0.024
ench+0.024
 salon+0.024
Neg logit contributions
-0.020
 next-0.014
 moral-0.012
Warning-0.012
 dis-0.012
Token on
Token ID319
Feature activation+3.99e-02

Pos logit contributions
mare+0.075
ack+0.070
ackle+0.068
lic+0.068
arg+0.067
Neg logit contributions
 next-0.026
-0.025
 behind-0.019
 orange-0.019
Warning-0.017
Token his
Token ID465
Feature activation-1.65e-02

Pos logit contributions
+0.042
 next+0.032
Warning+0.030
 aston+0.030
 orange+0.027
Neg logit contributions
mare-0.083
ack-0.075
lic-0.074
adow-0.070
ute-0.070
Token a
Token ID257
Feature activation-1.22e-02

Pos logit contributions
mare+0.024
pron+0.024
atmeal+0.019
auldron+0.019
ouse+0.018
Neg logit contributions
Warning-0.026
-0.025
 aston-0.021
 policy-0.021
 seventeen-0.020
Tokenpron
Token ID31186
Feature activation-3.62e-01

Pos logit contributions
mare+0.011
ack+0.009
pron+0.008
lic+0.007
atmeal+0.007
Neg logit contributions
-0.025
Warning-0.023
 aston-0.022
 policy-0.022
 next-0.022
Token and
Token ID290
Feature activation+5.05e-03

Pos logit contributions
mare+0.708
ack+0.650
lic+0.626
ric+0.611
atmeal+0.609
Neg logit contributions
-0.357
 next-0.327
 behind-0.278
Warning-0.263
 policy-0.237
Token helped
Token ID4193
Feature activation-9.26e-02

Pos logit contributions
+0.007
Warning+0.007
 aston+0.006
 next+0.006
 policy+0.006
Neg logit contributions
mare-0.008
ack-0.007
pron-0.007
lic-0.006
ackle-0.006
Token her
Token ID607
Feature activation+2.09e-02

Pos logit contributions
mare+0.170
ack+0.152
pron+0.147
lic+0.146
udd+0.143
Neg logit contributions
-0.113
Warning-0.092
 next-0.085
 policy-0.080
 aston-0.075
Token make
Token ID787
Feature activation-5.74e-02

Pos logit contributions
+0.027
 next+0.022
 aston+0.017
 policy+0.017
Warning+0.017
Neg logit contributions
mare-0.039
ack-0.036
lic-0.035
ocked-0.034
ackle-0.033
Token spaghetti
Token ID48241
Feature activation+4.54e-03

Pos logit contributions
mare+0.133
lic+0.112
ancer+0.110
ack+0.110
ric+0.110
Neg logit contributions
-0.052
 next-0.040
 orange-0.034
 happen-0.031
 surprises-0.029
Token Lily
Token ID20037
Feature activation+2.61e-03

Pos logit contributions
mare+0.015
pron+0.013
ack+0.013
atmeal+0.013
ackle+0.012
Neg logit contributions
-0.010
Warning-0.008
 next-0.007
 aston-0.007
 policy-0.006
Token put
Token ID1234
Feature activation-3.31e-02

Pos logit contributions
+0.003
 next+0.002
 moral+0.002
Warning+0.002
 aston+0.002
Neg logit contributions
mare-0.005
lic-0.005
ric-0.005
ancer-0.004
atmeal-0.004
Token on
Token ID319
Feature activation+3.88e-02

Pos logit contributions
mare+0.067
ack+0.061
ackle+0.059
ench+0.058
lic+0.057
Neg logit contributions
-0.034
 next-0.028
Warning-0.028
 aston-0.022
 moral-0.021
Token the
Token ID262
Feature activation-3.42e-02

Pos logit contributions
+0.043
Warning+0.035
 next+0.032
 aston+0.031
 policy+0.026
Neg logit contributions
mare-0.078
ack-0.069
lic-0.068
ackle-0.066
atmeal-0.065
Token a
Token ID257
Feature activation-1.29e-02

Pos logit contributions
mare+0.056
ack+0.049
lic+0.046
ackle+0.046
ench+0.045
Neg logit contributions
-0.048
Warning-0.041
 next-0.040
 aston-0.036
 policy-0.035
Tokenpron
Token ID31186
Feature activation-3.62e-01

Pos logit contributions
mare+0.007
ack+0.004
ench+0.004
atmeal+0.004
ackle+0.004
Neg logit contributions
-0.032
 next-0.028
Warning-0.028
 aston-0.027
 precious-0.026
Token and
Token ID290
Feature activation+2.66e-03

Pos logit contributions
mare+0.734
ack+0.699
ins+0.684
lic+0.683
atmeal+0.680
Neg logit contributions
-0.326
 next-0.317
 behind-0.255
Warning-0.211
 policy-0.188
Token felt
Token ID2936
Feature activation-1.13e-01

Pos logit contributions
+0.004
Warning+0.003
 next+0.003
 aston+0.003
 counts+0.003
Neg logit contributions
mare-0.004
ack-0.004
ench-0.004
ackle-0.004
lic-0.004
Token very
Token ID845
Feature activation-6.70e-02

Pos logit contributions
mare+0.203
ack+0.178
pron+0.172
ackle+0.165
lic+0.164
Neg logit contributions
-0.158
Warning-0.120
 aston-0.119
 next-0.114
 precious-0.107
Token grown
Token ID7334
Feature activation-4.49e-02

Pos logit contributions
mare+0.115
pron+0.096
ack+0.096
lic+0.092
ackle+0.090
Neg logit contributions
-0.091
Warning-0.077
 next-0.073
 aston-0.071
 policy-0.069
Token-
Token ID12
Feature activation-1.48e-01

Pos logit contributions
mare+0.070
ack+0.059
pron+0.057
atmeal+0.055
ric+0.055
Neg logit contributions
-0.070
 next-0.060
Warning-0.055
 aston-0.053
 behind-0.051
Token.
Token ID13
Feature activation-1.14e-02

Pos logit contributions
+0.072
Warning+0.070
 aston+0.062
 policy+0.059
 moral+0.055
Neg logit contributions
mare-0.071
pron-0.062
ack-0.060
atmeal-0.056
ackle-0.055
Token Lily
Token ID20037
Feature activation+3.20e-03

Pos logit contributions
mare+0.020
ack+0.018
pron+0.018
atmeal+0.018
ackle+0.017
Neg logit contributions
-0.015
Warning-0.012
 next-0.011
 aston-0.010
 policy-0.010
Token thought
Token ID1807
Feature activation-2.80e-03

Pos logit contributions
+0.004
 next+0.002
Warning+0.002
 aston+0.002
 dis+0.002
Neg logit contributions
mare-0.007
lic-0.006
ancer-0.006
atmeal-0.006
ack-0.006
Token her
Token ID607
Feature activation+2.04e-02

Pos logit contributions
mare+0.004
pron+0.004
ack+0.003
atmeal+0.003
lic+0.003
Neg logit contributions
-0.004
Warning-0.004
 aston-0.004
 policy-0.003
 next-0.003
Token a
Token ID257
Feature activation-1.28e-02

Pos logit contributions
+0.024
Warning+0.022
 next+0.019
 aston+0.019
 policy+0.018
Neg logit contributions
mare-0.037
ack-0.032
pron-0.032
lic-0.031
ackle-0.031
Tokenpron
Token ID31186
Feature activation-3.62e-01

Pos logit contributions
mare+0.023
rows+0.022
ack+0.021
ins+0.021
adow+0.021
Neg logit contributions
-0.015
 moral-0.012
 next-0.012
 aston-0.012
 precious-0.011
Token was
Token ID373
Feature activation+2.57e-02

Pos logit contributions
mare+0.692
ack+0.615
pron+0.599
lic+0.583
ackle+0.576
Neg logit contributions
-0.430
Warning-0.342
 next-0.325
 policy-0.309
 aston-0.275
Token impressive
Token ID8036
Feature activation+1.15e-01

Pos logit contributions
+0.033
 next+0.026
Warning+0.026
 precious+0.023
 policy+0.023
Neg logit contributions
mare-0.047
ack-0.042
ackle-0.039
ench-0.038
pron-0.038
Token because
Token ID780
Feature activation-4.89e-02

Pos logit contributions
+0.154
Warning+0.127
 next+0.124
 policy+0.118
 aston+0.115
Neg logit contributions
mare-0.201
ack-0.173
pron-0.172
lic-0.161
ackle-0.161
Token it
Token ID340
Feature activation-2.79e-02

Pos logit contributions
mare+0.054
pron+0.051
atmeal+0.040
hes+0.039
ack+0.039
Neg logit contributions
Warning-0.087
-0.081
 aston-0.079
 seventeen-0.073
ulent-0.072
Token had
Token ID550
Feature activation+1.24e-01

Pos logit contributions
mare+0.048
ack+0.042
pron+0.041
ench+0.039
ackle+0.039
Neg logit contributions
-0.039
Warning-0.032
 next-0.029
 aston-0.027
 policy-0.027
Token her
Token ID607
Feature activation+1.96e-02

Pos logit contributions
Warning+0.082
+0.081
 aston+0.072
 policy+0.069
 seventeen+0.067
Neg logit contributions
mare-0.076
pron-0.063
ack-0.060
ackle-0.054
ench-0.054
Token to
Token ID284
Feature activation-2.37e-03

Pos logit contributions
+0.028
Warning+0.028
 aston+0.024
 policy+0.023
 next+0.022
Neg logit contributions
mare-0.030
pron-0.026
ack-0.025
lic-0.024
atmeal-0.023
Token wear
Token ID5806
Feature activation-5.55e-02

Pos logit contributions
mare+0.004
ack+0.004
pron+0.004
ackle+0.004
lic+0.004
Neg logit contributions
-0.003
Warning-0.002
 next-0.002
 aston-0.002
 policy-0.002
Token an
Token ID281
Feature activation+6.89e-02

Pos logit contributions
mare+0.095
ack+0.082
pron+0.082
lic+0.078
ackle+0.078
Neg logit contributions
-0.073
Warning-0.066
 aston-0.057
 next-0.057
 policy-0.055
Token a
Token ID257
Feature activation-1.37e-02

Pos logit contributions
Warning+0.097
+0.094
 aston+0.078
 policy+0.078
 seventeen+0.071
Neg logit contributions
mare-0.109
pron-0.104
atmeal-0.091
ack-0.089
auldron-0.087
Tokenpron
Token ID31186
Feature activation-3.62e-01

Pos logit contributions
mare+0.014
ack+0.011
rows+0.010
atmeal+0.010
ackle+0.009
Neg logit contributions
-0.027
Warning-0.024
 next-0.023
 aston-0.023
 policy-0.022
Token so
Token ID523
Feature activation-7.72e-02

Pos logit contributions
mare+0.689
ack+0.673
ric+0.645
ins+0.638
ench+0.635
Neg logit contributions
-0.346
 next-0.337
 policy-0.254
 behind-0.253
Warning-0.247
Token she
Token ID673
Feature activation-1.04e-02

Pos logit contributions
mare+0.144
ack+0.126
pron+0.124
ackle+0.120
ench+0.119
Neg logit contributions
-0.091
Warning-0.077
 policy-0.070
 next-0.070
 aston-0.066
Token wouldn
Token ID3636
Feature activation-1.07e-01

Pos logit contributions
mare+0.017
ack+0.016
ench+0.015
ackle+0.015
pron+0.015
Neg logit contributions
-0.015
 next-0.011
Warning-0.011
 aston-0.010
 moral-0.010
Token't
Token ID470
Feature activation-3.59e-02

Pos logit contributions
mare+0.172
pron+0.148
ack+0.143
lic+0.137
ackle+0.136
Neg logit contributions
-0.163
Warning-0.143
 next-0.124
 aston-0.122
 policy-0.121
Token get
Token ID651
Feature activation-5.20e-02

Pos logit contributions
mare+0.066
pron+0.058
lic+0.057
auldron+0.056
ins+0.056
Neg logit contributions
-0.049
Warning-0.037
 next-0.037
 policy-0.031
 harm-0.030
Tokenmy
Token ID1820
Feature activation-2.43e-02

Pos logit contributions
+0.087
Warning+0.072
 next+0.069
 policy+0.062
 aston+0.061
Neg logit contributions
mare-0.109
ack-0.094
pron-0.093
ackle-0.090
ench-0.089
Token gave
Token ID2921
Feature activation+2.93e-02

Pos logit contributions
mare+0.038
pron+0.033
ack+0.032
atmeal+0.030
ackle+0.030
Neg logit contributions
-0.036
Warning-0.032
 next-0.028
 aston-0.028
 policy-0.027
Token her
Token ID607
Feature activation+2.06e-02

Pos logit contributions
+0.037
Warning+0.033
 aston+0.028
 next+0.028
 policy+0.028
Neg logit contributions
mare-0.052
pron-0.046
ack-0.045
lic-0.043
atmeal-0.043
Token an
Token ID281
Feature activation+7.13e-02

Pos logit contributions
+0.025
Warning+0.020
 next+0.019
 aston+0.018
 moral+0.017
Neg logit contributions
mare-0.039
ack-0.035
pron-0.033
lic-0.033
ackle-0.033
Token a
Token ID257
Feature activation-1.39e-02

Pos logit contributions
Warning+0.109
+0.102
 aston+0.089
 policy+0.085
 counts+0.085
Neg logit contributions
mare-0.101
pron-0.100
atmeal-0.083
ackle-0.080
ench-0.079
Tokenpron
Token ID31186
Feature activation-3.62e-01

Pos logit contributions
mare+0.024
ack+0.021
atmeal+0.020
rows+0.020
ackle+0.020
Neg logit contributions
-0.018
Warning-0.014
 aston-0.014
 policy-0.014
 next-0.013
Token to
Token ID284
Feature activation-1.97e-03

Pos logit contributions
mare+0.635
ack+0.597
pron+0.558
ench+0.557
atmeal+0.549
Neg logit contributions
-0.436
Warning-0.356
 next-0.355
 policy-0.345
 aston-0.290
Token wear
Token ID5806
Feature activation-5.45e-02

Pos logit contributions
mare+0.004
ables+0.003
ack+0.003
ocked+0.003
rows+0.003
Neg logit contributions
-0.002
 next-0.002
Warning-0.002
 aston-0.001
 dis-0.001
Token so
Token ID523
Feature activation-7.61e-02

Pos logit contributions
mare+0.101
ack+0.088
ackle+0.083
pron+0.083
atmeal+0.083
Neg logit contributions
-0.067
 next-0.055
Warning-0.054
 policy-0.045
 aston-0.043
Token she
Token ID673
Feature activation-1.02e-02

Pos logit contributions
mare+0.132
ack+0.112
pron+0.110
ackle+0.106
lic+0.106
Neg logit contributions
-0.101
Warning-0.089
 next-0.082
 policy-0.082
 aston-0.077
Token wouldn
Token ID3636
Feature activation-1.08e-01

Pos logit contributions
mare+0.017
ack+0.016
ench+0.015
ackle+0.015
pron+0.015
Neg logit contributions
-0.014
 next-0.010
Warning-0.010
 aston-0.009
 policy-0.009
Token the
Token ID262
Feature activation-3.63e-02

Pos logit contributions
mare+0.040
pron+0.038
ack+0.034
ocked+0.031
hes+0.030
Neg logit contributions
Warning-0.066
 policy-0.061
-0.061
 aston-0.053
 narrative-0.052
Token help
Token ID1037
Feature activation+9.89e-02

Pos logit contributions
mare+0.064
ack+0.056
lic+0.055
atmeal+0.053
ackle+0.053
Neg logit contributions
-0.047
Warning-0.040
 next-0.038
 aston-0.037
 policy-0.035
Token of
Token ID286
Feature activation-9.60e-03

Pos logit contributions
+0.140
Warning+0.120
 policy+0.108
 aston+0.105
 next+0.105
Neg logit contributions
mare-0.164
pron-0.144
ack-0.140
ackle-0.135
atmeal-0.135
Token her
Token ID607
Feature activation+2.06e-02

Pos logit contributions
mare+0.017
ack+0.015
pron+0.015
lic+0.015
ackle+0.014
Neg logit contributions
-0.012
Warning-0.011
 next-0.010
 aston-0.010
 policy-0.009
Token a
Token ID257
Feature activation-1.31e-02

Pos logit contributions
+0.029
Warning+0.027
 aston+0.023
 policy+0.023
 next+0.022
Neg logit contributions
mare-0.034
pron-0.029
ack-0.028
lic-0.027
atmeal-0.027
Tokenpron
Token ID31186
Feature activation-3.62e-01

Pos logit contributions
mare+0.020
ack+0.017
atmeal+0.017
ocked+0.017
lic+0.016
Neg logit contributions
-0.019
Warning-0.015
 next-0.015
 aston-0.015
 policy-0.015
Token,
Token ID11
Feature activation+4.11e-02

Pos logit contributions
mare+0.662
ack+0.580
pron+0.578
lic+0.552
ackle+0.551
Neg logit contributions
-0.447
Warning-0.371
 next-0.345
 policy-0.336
 aston-0.311
Token she
Token ID673
Feature activation-1.11e-02

Pos logit contributions
+0.059
Warning+0.057
 aston+0.049
 policy+0.048
 next+0.046
Neg logit contributions
mare-0.063
pron-0.055
ack-0.052
lic-0.050
atmeal-0.050
Token could
Token ID714
Feature activation-3.44e-02

Pos logit contributions
mare+0.020
ench+0.017
ackle+0.016
ack+0.016
pron+0.016
Neg logit contributions
-0.015
 next-0.011
Warning-0.011
 aston-0.011
 counts-0.010
Token make
Token ID787
Feature activation-5.69e-02

Pos logit contributions
mare+0.066
ins+0.058
lic+0.056
ack+0.055
rows+0.055
Neg logit contributions
-0.044
 next-0.034
 aston-0.029
Warning-0.028
 happen-0.028
Token creations
Token ID28443
Feature activation-2.35e-02

Pos logit contributions
mare+0.125
pron+0.106
ack+0.100
lic+0.100
atmeal+0.099
Neg logit contributions
-0.061
 next-0.041
 surprises-0.039
 unexpected-0.038
 orange-0.037
Token.
Token ID13
Feature activation-1.02e-02

Pos logit contributions
Warning+0.228
+0.227
 aston+0.205
 policy+0.197
 seventeen+0.184
Neg logit contributions
mare-0.190
pron-0.164
ack-0.160
lic-0.146
ackle-0.143
Token
Token ID198
Feature activation+9.86e-03

Pos logit contributions
mare+0.016
pron+0.014
ack+0.014
atmeal+0.013
lic+0.013
Neg logit contributions
-0.015
Warning-0.014
 aston-0.012
 policy-0.012
 next-0.011
Token
Token ID198
Feature activation+9.38e-03

Pos logit contributions
+0.014
Warning+0.013
 aston+0.011
 policy+0.011
 next+0.011
Neg logit contributions
mare-0.016
pron-0.014
ack-0.013
atmeal-0.013
lic-0.012
TokenThe
Token ID464
Feature activation-2.30e-02

Pos logit contributions
+0.013
Warning+0.012
 next+0.010
 aston+0.010
 policy+0.009
Neg logit contributions
mare-0.016
pron-0.014
ack-0.013
atmeal-0.013
auldron-0.013
Token a
Token ID257
Feature activation-1.42e-02

Pos logit contributions
mare+0.037
ack+0.032
ric+0.031
lic+0.031
ench+0.030
Neg logit contributions
-0.033
 next-0.030
Warning-0.027
 moral-0.026
 aston-0.024
Tokenpron
Token ID31186
Feature activation-3.62e-01

Pos logit contributions
mare+0.025
atmeal+0.022
ric+0.022
rows+0.022
iddle+0.021
Neg logit contributions
-0.018
 next-0.014
 moral-0.013
 precious-0.013
 aston-0.012
Token suddenly
Token ID6451
Feature activation+1.06e-01

Pos logit contributions
mare+0.679
ack+0.585
ench+0.584
atmeal+0.583
pron+0.577
Neg logit contributions
-0.428
 next-0.339
Warning-0.337
 aston-0.285
 policy-0.282
Token began
Token ID2540
Feature activation-1.23e-01

Pos logit contributions
+0.143
Warning+0.119
 next+0.110
 aston+0.104
 policy+0.097
Neg logit contributions
mare-0.182
pron-0.161
atmeal-0.155
auldron-0.147
ack-0.146
Token to
Token ID284
Feature activation-3.29e-03

Pos logit contributions
mare+0.233
pron+0.211
atmeal+0.204
ack+0.199
ackle+0.194
Neg logit contributions
-0.145
Warning-0.121
 next-0.113
 aston-0.107
 policy-0.104
Token spark
Token ID9009
Feature activation+1.27e-01

Pos logit contributions
mare+0.006
ack+0.005
lic+0.005
ables+0.005
pron+0.005
Neg logit contributions
-0.004
Warning-0.004
 next-0.004
 aston-0.003
 moral-0.003
Tokenle
Token ID293
Feature activation-4.24e-02

Pos logit contributions
+0.121
 behind+0.098
 next+0.096
 closely+0.085
Warning+0.085
Neg logit contributions
mare-0.275
pron-0.251
atmeal-0.230
ric-0.215
arg-0.215
Token She
Token ID1375
Feature activation-4.22e-02

Pos logit contributions
mare+0.006
pron+0.005
ack+0.005
atmeal+0.005
ackle+0.005
Neg logit contributions
-0.005
Warning-0.005
 aston-0.004
 policy-0.004
 next-0.004
Token put
Token ID1234
Feature activation-3.11e-02

Pos logit contributions
mare+0.068
pron+0.059
ack+0.057
lic+0.054
atmeal+0.053
Neg logit contributions
-0.059
Warning-0.056
 aston-0.049
 policy-0.048
 next-0.046
Token on
Token ID319
Feature activation+3.98e-02

Pos logit contributions
mare+0.060
ack+0.053
pron+0.052
ackle+0.051
lic+0.050
Neg logit contributions
-0.033
Warning-0.029
 next-0.027
 aston-0.024
 policy-0.023
Token her
Token ID607
Feature activation+1.93e-02

Pos logit contributions
+0.050
Warning+0.045
 aston+0.039
 next+0.038
 policy+0.036
Neg logit contributions
mare-0.072
ack-0.062
pron-0.062
lic-0.059
ackle-0.058
Token a
Token ID257
Feature activation-1.13e-02

Pos logit contributions
Warning+0.029
+0.028
 aston+0.024
 policy+0.024
 comparison+0.022
Neg logit contributions
mare-0.030
pron-0.027
atmeal-0.024
ack-0.024
auldron-0.023
Tokenpron
Token ID31186
Feature activation-3.62e-01

Pos logit contributions
mare+0.018
pron+0.016
ack+0.016
lic+0.015
atmeal+0.015
Neg logit contributions
-0.016
Warning-0.015
 aston-0.013
 policy-0.012
 next-0.012
Token with
Token ID351
Feature activation+2.62e-02

Pos logit contributions
mare+0.664
ack+0.643
lic+0.612
ins+0.608
ench+0.602
Neg logit contributions
-0.358
 next-0.343
Warning-0.291
 behind-0.289
 policy-0.275
Token a
Token ID257
Feature activation-1.11e-02

Pos logit contributions
+0.037
Warning+0.035
 policy+0.030
 aston+0.030
 next+0.028
Neg logit contributions
mare-0.042
pron-0.037
ack-0.035
atmeal-0.033
lic-0.033
Token big
Token ID1263
Feature activation+3.74e-02

Pos logit contributions
mare+0.020
ack+0.017
atmeal+0.017
ackle+0.017
ocked+0.017
Neg logit contributions
-0.013
Warning-0.010
 next-0.010
 aston-0.009
 policy-0.009
Token zip
Token ID19974
Feature activation-6.35e-02

Pos logit contributions
+0.053
Warning+0.044
 aston+0.042
 next+0.041
 policy+0.040
Neg logit contributions
mare-0.059
ack-0.051
lic-0.051
ackle-0.050
atmeal-0.049
Token.
Token ID13
Feature activation-3.81e-03

Pos logit contributions
mare+0.116
ack+0.103
ackle+0.103
ench+0.102
atmeal+0.099
Neg logit contributions
-0.072
 next-0.063
Warning-0.055
 ago-0.050
 behind-0.050
Token She
Token ID1375
Feature activation-4.85e-02

Pos logit contributions
mare+0.018
pron+0.016
ack+0.015
lic+0.014
atmeal+0.014
Neg logit contributions
-0.016
Warning-0.015
 policy-0.013
 aston-0.013
 next-0.013
Token wore
Token ID12408
Feature activation-8.06e-02

Pos logit contributions
mare+0.080
pron+0.070
ack+0.068
lic+0.064
atmeal+0.064
Neg logit contributions
-0.066
Warning-0.061
 aston-0.053
 policy-0.052
 next-0.051
Token a
Token ID257
Feature activation-1.42e-02

Pos logit contributions
mare+0.134
pron+0.116
ack+0.115
lic+0.108
atmeal+0.108
Neg logit contributions
-0.109
Warning-0.101
 aston-0.088
 policy-0.085
 next-0.084
Token pink
Token ID11398
Feature activation+1.31e-01

Pos logit contributions
mare+0.024
ack+0.020
pron+0.020
atmeal+0.019
lic+0.019
Neg logit contributions
-0.018
Warning-0.017
 aston-0.015
 next-0.014
 policy-0.014
Token a
Token ID257
Feature activation-1.44e-02

Pos logit contributions
+0.205
Warning+0.187
 next+0.171
 aston+0.169
 policy+0.165
Neg logit contributions
mare-0.185
ack-0.156
pron-0.150
lic-0.149
ackle-0.147
Tokenpron
Token ID31186
Feature activation-3.62e-01

Pos logit contributions
mare+0.024
ack+0.020
pron+0.019
atmeal+0.019
ackle+0.019
Neg logit contributions
-0.019
Warning-0.017
 aston-0.015
 next-0.015
 policy-0.015
Token with
Token ID351
Feature activation+2.18e-02

Pos logit contributions
mare+0.694
ack+0.652
lic+0.638
ins+0.624
ench+0.619
Neg logit contributions
 next-0.346
-0.344
 behind-0.284
 policy-0.251
Warning-0.243
Token flowers
Token ID12734
Feature activation+4.94e-02

Pos logit contributions
+0.030
Warning+0.029
 policy+0.025
 aston+0.025
 next+0.024
Neg logit contributions
mare-0.034
pron-0.031
ack-0.029
lic-0.028
atmeal-0.027
Token on
Token ID319
Feature activation+3.62e-02

Pos logit contributions
+0.059
 next+0.047
Warning+0.045
 aston+0.038
 policy+0.038
Neg logit contributions
mare-0.095
ack-0.082
pron-0.082
ackle-0.079
lic-0.077
Token it
Token ID340
Feature activation-3.17e-02

Pos logit contributions
+0.048
Warning+0.044
 aston+0.038
 next+0.038
 policy+0.037
Neg logit contributions
mare-0.062
ack-0.052
pron-0.052
lic-0.051
atmeal-0.051
Token.
Token ID13
Feature activation-1.06e-02

Pos logit contributions
mare+0.053
pron+0.046
ack+0.045
lic+0.042
atmeal+0.042
Neg logit contributions
-0.043
Warning-0.039
 next-0.034
 aston-0.034
 policy-0.034
Token She
Token ID1375
Feature activation-4.20e-02

Pos logit contributions
mare+0.008
pron+0.007
ack+0.006
lic+0.006
atmeal+0.006
Neg logit contributions
-0.008
Warning-0.008
 aston-0.007
 policy-0.007
 next-0.007
Token put
Token ID1234
Feature activation-3.31e-02

Pos logit contributions
mare+0.070
pron+0.060
ack+0.060
atmeal+0.056
lic+0.056
Neg logit contributions
-0.058
Warning-0.051
 next-0.045
 aston-0.045
 policy-0.043
Token on
Token ID319
Feature activation+4.13e-02

Pos logit contributions
mare+0.065
ack+0.059
lic+0.056
ackle+0.056
pron+0.055
Neg logit contributions
-0.035
Warning-0.029
 next-0.028
 aston-0.024
 moral-0.022
Token an
Token ID281
Feature activation+6.91e-02

Pos logit contributions
+0.052
Warning+0.047
 aston+0.041
 next+0.040
 policy+0.039
Neg logit contributions
mare-0.074
ack-0.063
pron-0.063
lic-0.061
ackle-0.059
Token a
Token ID257
Feature activation-1.18e-02

Pos logit contributions
Warning+0.113
+0.107
 aston+0.095
 policy+0.094
Hello+0.088
Neg logit contributions
mare-0.093
pron-0.091
atmeal-0.079
auldron-0.077
ack-0.074
Tokenpron
Token ID31186
Feature activation-3.61e-01

Pos logit contributions
mare+0.009
ack+0.006
pron+0.006
lic+0.005
atmeal+0.005
Neg logit contributions
-0.027
Warning-0.025
 aston-0.023
 next-0.023
 policy-0.023
Token so
Token ID523
Feature activation-7.36e-02

Pos logit contributions
mare+0.651
ack+0.614
lic+0.578
atmeal+0.577
ench+0.575
Neg logit contributions
-0.399
 next-0.347
Warning-0.320
 policy-0.295
 behind-0.292
Token she
Token ID673
Feature activation-9.24e-03

Pos logit contributions
mare+0.136
ack+0.117
pron+0.114
lic+0.112
ackle+0.112
Neg logit contributions
-0.090
Warning-0.075
 next-0.071
 policy-0.067
 aston-0.065
Token was
Token ID373
Feature activation+2.65e-02

Pos logit contributions
mare+0.015
ack+0.013
ench+0.013
ackle+0.013
pron+0.013
Neg logit contributions
-0.013
Warning-0.010
 next-0.010
 aston-0.009
 dis-0.009
Token ready
Token ID3492
Feature activation-3.00e-02

Pos logit contributions
+0.027
 next+0.024
 dis+0.019
 precious+0.019
 orange+0.018
Neg logit contributions
ack-0.054
mare-0.052
uddle-0.048
iolet-0.048
elvet-0.048
Token to
Token ID284
Feature activation-4.12e-04

Pos logit contributions
mare+0.057
ack+0.052
pron+0.049
atmeal+0.049
ackle+0.049
Neg logit contributions
-0.034
 next-0.028
Warning-0.027
 policy-0.023
 aston-0.021
Token's
Token ID338
Feature activation+8.75e-02

Pos logit contributions
+0.006
Warning+0.005
 aston+0.004
 next+0.004
 policy+0.004
Neg logit contributions
mare-0.010
pron-0.009
ack-0.008
lic-0.008
ackle-0.008
Token get
Token ID651
Feature activation-5.01e-02

Pos logit contributions
+0.120
 next+0.091
Warning+0.086
 aston+0.082
 moral+0.076
Neg logit contributions
mare-0.162
lic-0.141
pron-0.136
ack-0.135
rows-0.134
Token a
Token ID257
Feature activation-1.61e-02

Pos logit contributions
mare+0.075
pron+0.068
atmeal+0.064
ack+0.063
hes+0.058
Neg logit contributions
Warning-0.072
-0.070
 aston-0.062
 policy-0.061
 upcoming-0.054
Token new
Token ID649
Feature activation+5.29e-02

Pos logit contributions
mare+0.025
ack+0.021
pron+0.021
atmeal+0.020
lic+0.020
Neg logit contributions
-0.023
Warning-0.021
 aston-0.019
 policy-0.018
 next-0.018
Token a
Token ID257
Feature activation-1.66e-02

Pos logit contributions
+0.083
Warning+0.079
 aston+0.069
 policy+0.066
 next+0.064
Neg logit contributions
mare-0.078
pron-0.068
ack-0.064
atmeal-0.059
ackle-0.058
Tokenpron
Token ID31186
Feature activation-3.61e-01

Pos logit contributions
mare+0.029
ack+0.025
pron+0.024
atmeal+0.024
lic+0.024
Neg logit contributions
-0.021
Warning-0.018
 policy-0.017
 aston-0.016
 next-0.016
Token and
Token ID290
Feature activation+1.53e-03

Pos logit contributions
mare+0.627
pron+0.557
ack+0.550
lic+0.539
ackle+0.524
Neg logit contributions
-0.458
Warning-0.395
 next-0.382
 policy-0.368
 aston-0.319
Token a
Token ID257
Feature activation-1.63e-02

Pos logit contributions
+0.002
Warning+0.002
 aston+0.002
 next+0.002
 policy+0.002
Neg logit contributions
mare-0.003
pron-0.002
ack-0.002
lic-0.002
ackle-0.002
Token new
Token ID649
Feature activation+5.20e-02

Pos logit contributions
mare+0.025
ack+0.021
lic+0.020
atmeal+0.020
ackle+0.020
Neg logit contributions
-0.024
Warning-0.021
 aston-0.019
 next-0.019
 policy-0.019
Token bowl
Token ID9396
Feature activation-9.53e-02

Pos logit contributions
+0.075
Warning+0.067
 next+0.062
 policy+0.062
 aston+0.060
Neg logit contributions
mare-0.081
lic-0.069
ack-0.069
pron-0.066
atmeal-0.066
Token.
Token ID13
Feature activation-1.08e-02

Pos logit contributions
mare+0.161
pron+0.141
ack+0.137
lic+0.131
ackle+0.130
Neg logit contributions
-0.128
Warning-0.112
 next-0.104
 policy-0.101
 aston-0.095
TokenHe
Token ID1544
Feature activation-6.20e-02

Pos logit contributions
Warning+0.015
+0.015
 policy+0.013
 aston+0.013
 counts+0.012
Neg logit contributions
mare-0.012
pron-0.011
ack-0.011
lic-0.010
ackle-0.010
Token took
Token ID1718
Feature activation-2.62e-02

Pos logit contributions
mare+0.102
pron+0.088
ack+0.086
atmeal+0.082
lic+0.082
Neg logit contributions
-0.088
Warning-0.078
 aston-0.070
 next-0.068
 policy-0.066
Token off
Token ID572
Feature activation-2.90e-02

Pos logit contributions
mare+0.044
pron+0.038
ack+0.037
atmeal+0.035
lic+0.035
Neg logit contributions
-0.036
Warning-0.032
 aston-0.029
 next-0.028
 policy-0.028
Token his
Token ID465
Feature activation-1.89e-02

Pos logit contributions
mare+0.049
pron+0.042
ack+0.042
ackle+0.040
atmeal+0.040
Neg logit contributions
-0.040
Warning-0.035
 next-0.032
 aston-0.032
 policy-0.030
Token a
Token ID257
Feature activation-1.48e-02

Pos logit contributions
pron+0.026
mare+0.025
atmeal+0.020
ack+0.019
uddle+0.019
Neg logit contributions
Warning-0.032
-0.031
 aston-0.026
 policy-0.026
 seventeen-0.025
Tokenpron
Token ID31186
Feature activation-3.61e-01

Pos logit contributions
mare+0.021
pron+0.019
ack+0.018
lic+0.017
ackle+0.016
Neg logit contributions
-0.023
Warning-0.022
 aston-0.019
 policy-0.019
 next-0.018
Token and
Token ID290
Feature activation+3.51e-03

Pos logit contributions
mare+0.684
ack+0.609
pron+0.597
atmeal+0.596
ins+0.594
Neg logit contributions
-0.429
 next-0.325
Warning-0.312
 policy-0.293
 aston-0.273
Token proudly
Token ID23017
Feature activation-6.22e-02

Pos logit contributions
+0.005
Warning+0.005
 aston+0.004
 policy+0.004
 next+0.004
Neg logit contributions
mare-0.005
pron-0.005
ack-0.005
lic-0.004
ackle-0.004
Token told
Token ID1297
Feature activation-1.26e-02

Pos logit contributions
mare+0.102
ack+0.087
pron+0.086
ackle+0.083
ench+0.083
Neg logit contributions
-0.087
Warning-0.076
 aston-0.069
 next-0.069
 policy-0.064
Token his
Token ID465
Feature activation-1.88e-02

Pos logit contributions
mare+0.018
pron+0.016
ack+0.015
lic+0.014
ackle+0.014
Neg logit contributions
-0.020
Warning-0.019
 aston-0.016
 policy-0.016
 next-0.016
Token mom
Token ID1995
Feature activation+6.43e-02

Pos logit contributions
mare+0.033
pron+0.029
ack+0.029
lic+0.027
atmeal+0.027
Neg logit contributions
-0.024
Warning-0.022
 aston-0.019
 policy-0.019
 next-0.018
Token mother
Token ID2802
Feature activation+1.30e-01

Pos logit contributions
mare+0.006
pron+0.006
ack+0.005
atmeal+0.005
lic+0.005
Neg logit contributions
-0.007
Warning-0.007
 aston-0.006
 policy-0.006
 next-0.005
Token was
Token ID373
Feature activation+2.44e-02

Pos logit contributions
Warning+0.198
+0.175
 policy+0.164
 aston+0.160
 comparison+0.152
Neg logit contributions
mare-0.189
pron-0.172
ack-0.161
lic-0.155
atmeal-0.149
Token wearing
Token ID5762
Feature activation-2.32e-03

Pos logit contributions
+0.034
Warning+0.030
 next+0.028
 aston+0.028
 policy+0.026
Neg logit contributions
mare-0.040
ack-0.034
pron-0.034
atmeal-0.032
ackle-0.032
Token an
Token ID281
Feature activation+6.96e-02

Pos logit contributions
mare+0.004
ack+0.004
lic+0.004
ackle+0.004
pron+0.004
Neg logit contributions
-0.003
Warning-0.002
 aston-0.002
 next-0.002
 policy-0.002
Token a
Token ID257
Feature activation-1.36e-02

Pos logit contributions
Warning+0.097
+0.089
 policy+0.076
 aston+0.073
Welcome+0.069
Neg logit contributions
mare-0.112
pron-0.108
atmeal-0.095
auldron-0.093
ack-0.092
Tokenpron
Token ID31186
Feature activation-3.61e-01

Pos logit contributions
mare+0.021
ack+0.018
atmeal+0.017
lic+0.017
ackle+0.017
Neg logit contributions
-0.019
Warning-0.017
 aston-0.015
 next-0.015
 policy-0.015
Token.
Token ID13
Feature activation-6.41e-03

Pos logit contributions
mare+0.664
ack+0.613
lic+0.590
ackle+0.580
ench+0.578
Neg logit contributions
-0.398
 next-0.363
Warning-0.308
 policy-0.304
 behind-0.295
Token Rose
Token ID8049
Feature activation-1.02e-01

Pos logit contributions
mare+0.011
atmeal+0.010
pron+0.010
ack+0.009
ackle+0.009
Neg logit contributions
-0.009
Warning-0.007
 next-0.007
 aston-0.006
 counts-0.006
Token wanted
Token ID2227
Feature activation+7.32e-02

Pos logit contributions
mare+0.170
pron+0.147
ack+0.144
lic+0.137
atmeal+0.137
Neg logit contributions
-0.141
Warning-0.125
 aston-0.110
 next-0.107
 policy-0.105
Token to
Token ID284
Feature activation-7.79e-04

Pos logit contributions
+0.101
Warning+0.095
 aston+0.082
 policy+0.081
 next+0.079
Neg logit contributions
mare-0.119
pron-0.104
ack-0.101
lic-0.095
atmeal-0.095
Token wear
Token ID5806
Feature activation-5.54e-02

Pos logit contributions
mare+0.001
ack+0.001
lic+0.001
udd+0.001
ins+0.001
Neg logit contributions
-0.001
Warning-0.001
 aston-0.001
 next-0.001
 policy-0.001
Token her
Token ID607
Feature activation+2.54e-02

Pos logit contributions
+0.277
Warning+0.260
 aston+0.227
 policy+0.224
 next+0.217
Neg logit contributions
mare-0.300
pron-0.259
ack-0.252
lic-0.237
atmeal-0.235
Token and
Token ID290
Feature activation+5.48e-03

Pos logit contributions
+0.034
Warning+0.030
 aston+0.027
 next+0.027
 policy+0.026
Neg logit contributions
mare-0.043
pron-0.038
ack-0.037
lic-0.036
atmeal-0.035
Token said
Token ID531
Feature activation+7.55e-03

Pos logit contributions
+0.007
 next+0.006
Warning+0.005
 aston+0.005
 behind+0.005
Neg logit contributions
mare-0.010
pron-0.009
ack-0.009
atmeal-0.009
ench-0.008
Token,
Token ID11
Feature activation+4.13e-02

Pos logit contributions
+0.010
Warning+0.009
 aston+0.008
 policy+0.008
 next+0.007
Neg logit contributions
mare-0.013
pron-0.011
ack-0.011
lic-0.011
atmeal-0.011
Token "
Token ID366
Feature activation-3.58e-02

Pos logit contributions
+0.058
Warning+0.048
 next+0.046
 aston+0.044
 moral+0.040
Neg logit contributions
mare-0.069
ack-0.060
pron-0.060
atmeal-0.059
ackle-0.057
TokenHello
Token ID15496
Feature activation+2.74e-01

Pos logit contributions
mare+0.068
pron+0.061
ack+0.059
lic+0.056
atmeal+0.056
Neg logit contributions
-0.039
Warning-0.036
 aston-0.031
 policy-0.030
 next-0.029
Token,
Token ID11
Feature activation+3.97e-02

Pos logit contributions
+0.359
Warning+0.290
 next+0.278
 aston+0.265
 policy+0.253
Neg logit contributions
mare-0.483
pron-0.438
ack-0.409
atmeal-0.408
auldron-0.402
Token little
Token ID1310
Feature activation-6.06e-02

Pos logit contributions
+0.055
Warning+0.047
 aston+0.042
 next+0.042
 policy+0.041
Neg logit contributions
mare-0.067
pron-0.059
ack-0.058
atmeal-0.055
ackle-0.055
Token girl
Token ID2576
Feature activation-8.41e-02

Pos logit contributions
mare+0.080
pron+0.067
ack+0.066
lic+0.059
atmeal+0.059
Neg logit contributions
-0.100
Warning-0.097
 aston-0.086
 policy-0.084
 next-0.081
Token.
Token ID13
Feature activation-6.98e-03

Pos logit contributions
mare+0.158
pron+0.147
atmeal+0.140
ackle+0.140
auldron+0.139
Neg logit contributions
-0.109
 next-0.077
Warning-0.070
 aston-0.065
 policy-0.065
Token Do
Token ID2141
Feature activation+1.85e-01

Pos logit contributions
mare+0.013
pron+0.010
ack+0.010
lic+0.010
atmeal+0.009
Neg logit contributions
Warning-0.008
-0.008
 policy-0.007
 aston-0.007
 next-0.006
Token reply
Token ID10971
Feature activation+1.25e-01

Pos logit contributions
mare+0.023
pron+0.020
atmeal+0.019
ack+0.019
auldron+0.018
Neg logit contributions
-0.020
Warning-0.018
 aston-0.016
 next-0.015
 policy-0.015
Token.
Token ID13
Feature activation-4.49e-03

Pos logit contributions
+0.176
Warning+0.166
 aston+0.144
 policy+0.142
 next+0.137
Neg logit contributions
mare-0.198
pron-0.171
ack-0.167
lic-0.158
ackle-0.156
Token\n
Token ID198
Feature activation+8.01e-03

Pos logit contributions
mare+0.007
pron+0.007
atmeal+0.007
ack+0.007
ackle+0.006
Neg logit contributions
-0.006
Warning-0.006
 aston-0.005
 next-0.005
 policy-0.004
Token\n
Token ID198
Feature activation+7.31e-03

Pos logit contributions
+0.012
Warning+0.011
 aston+0.010
 policy+0.010
 next+0.010
Neg logit contributions
mare-0.012
pron-0.010
ack-0.010
lic-0.010
ackle-0.009
Token"
Token ID1
Feature activation-2.46e-02

Pos logit contributions
+0.011
Warning+0.011
 aston+0.009
 policy+0.009
 next+0.009
Neg logit contributions
mare-0.011
pron-0.009
ack-0.009
lic-0.008
ackle-0.008
TokenHello
Token ID15496
Feature activation+2.68e-01

Pos logit contributions
mare+0.045
lic+0.040
pron+0.039
ack+0.038
it+0.037
Neg logit contributions
 policy-0.025
-0.025
Warning-0.023
 aston-0.021
 allowance-0.021
Token Joe
Token ID5689
Feature activation+6.37e-02

Pos logit contributions
+0.376
Warning+0.353
 aston+0.307
 policy+0.302
 next+0.295
Neg logit contributions
mare-0.429
pron-0.372
ack-0.362
lic-0.342
atmeal-0.339
Token,"
Token ID553
Feature activation+1.47e-01

Pos logit contributions
+0.070
Warning+0.057
 next+0.050
 aston+0.047
 policy+0.046
Neg logit contributions
mare-0.126
pron-0.114
ack-0.110
atmeal-0.107
lic-0.106
Token said
Token ID531
Feature activation-4.62e-03

Pos logit contributions
+0.208
Warning+0.190
 aston+0.165
 next+0.161
 policy+0.158
Neg logit contributions
mare-0.243
pron-0.215
ack-0.207
atmeal-0.199
ackle-0.195
Token a
Token ID257
Feature activation-1.47e-02

Pos logit contributions
mare+0.008
pron+0.007
ack+0.007
atmeal+0.006
lic+0.006
Neg logit contributions
-0.006
Warning-0.006
 aston-0.005
 policy-0.005
 next-0.005
Token voice
Token ID3809
Feature activation-4.16e-02

Pos logit contributions
mare+0.023
pron+0.020
ack+0.020
atmeal+0.020
auldron+0.019
Neg logit contributions
-0.021
Warning-0.019
 aston-0.017
 next-0.016
 policy-0.016
Token tried
Token ID3088
Feature activation-5.12e-02

Pos logit contributions
+0.010
Warning+0.007
 aston+0.006
 next+0.006
 counts+0.006
Neg logit contributions
mare-0.014
pron-0.013
ack-0.012
ocked-0.012
lic-0.012
Token your
Token ID534
Feature activation-6.35e-02

Pos logit contributions
mare+0.065
pron+0.058
ack+0.056
auldron+0.052
ench+0.052
Neg logit contributions
Warning-0.082
-0.079
 aston-0.073
 policy-0.070
 upcoming-0.065
Token best
Token ID1266
Feature activation+4.29e-02

Pos logit contributions
mare+0.118
pron+0.103
ack+0.102
lic+0.099
ackle+0.096
Neg logit contributions
-0.077
Warning-0.069
 next-0.060
 policy-0.059
 aston-0.058
Token.
Token ID13
Feature activation-1.53e-02

Pos logit contributions
+0.063
Warning+0.061
 aston+0.053
 policy+0.050
 next+0.049
Neg logit contributions
mare-0.066
pron-0.057
ack-0.055
atmeal-0.052
ackle-0.050
Token But
Token ID887
Feature activation+1.14e-02

Pos logit contributions
mare+0.024
pron+0.020
ack+0.020
atmeal+0.019
auldron+0.018
Neg logit contributions
-0.021
Warning-0.020
 aston-0.018
 policy-0.017
 next-0.017
Token next
Token ID1306
Feature activation+3.00e-01

Pos logit contributions
+0.017
Warning+0.017
 aston+0.015
 policy+0.014
 upcoming+0.013
Neg logit contributions
mare-0.016
pron-0.014
ack-0.013
atmeal-0.013
 salon-0.012
Token time
Token ID640
Feature activation-1.60e-02

Pos logit contributions
+0.323
 next+0.264
Warning+0.257
 policy+0.251
 moral+0.232
Neg logit contributions
mare-0.589
lic-0.517
atmeal-0.509
pron-0.508
ins-0.500
Token,
Token ID11
Feature activation+3.27e-02

Pos logit contributions
mare+0.024
pron+0.020
ack+0.020
atmeal+0.018
lic+0.018
Neg logit contributions
Warning-0.023
-0.022
 aston-0.020
 policy-0.019
 upcoming-0.018
Token you
Token ID345
Feature activation+2.56e-03

Pos logit contributions
+0.050
Warning+0.048
 aston+0.042
 policy+0.041
 next+0.039
Neg logit contributions
mare-0.048
pron-0.041
ack-0.039
atmeal-0.037
lic-0.036
Token need
Token ID761
Feature activation+3.87e-02

Pos logit contributions
+0.003
Warning+0.003
 next+0.002
 aston+0.002
 moral+0.002
Neg logit contributions
mare-0.005
pron-0.004
ack-0.004
lic-0.004
atmeal-0.004
Token to
Token ID284
Feature activation-1.73e-03

Pos logit contributions
+0.054
Warning+0.051
 aston+0.045
 policy+0.043
 next+0.042
Neg logit contributions
mare-0.062
pron-0.054
ack-0.052
lic-0.049
atmeal-0.049
Token and
Token ID290
Feature activation-4.34e-03

Pos logit contributions
Warning+0.103
+0.103
 aston+0.093
 policy+0.089
 upcoming+0.084
Neg logit contributions
mare-0.068
ack-0.058
rows-0.051
ackle-0.051
pron-0.050
Token cold
Token ID4692
Feature activation-2.57e-02

Pos logit contributions
mare+0.007
pron+0.006
ack+0.006
lic+0.006
atmeal+0.006
Neg logit contributions
-0.006
Warning-0.006
 aston-0.005
 policy-0.005
 next-0.005
Token.
Token ID13
Feature activation-1.29e-02

Pos logit contributions
mare+0.039
pron+0.033
ack+0.033
lic+0.031
atmeal+0.030
Neg logit contributions
-0.038
Warning-0.036
 aston-0.032
 policy-0.031
 next-0.029
Token He
Token ID679
Feature activation-4.05e-02

Pos logit contributions
mare+0.020
pron+0.017
ack+0.017
lic+0.016
atmeal+0.016
Neg logit contributions
-0.019
Warning-0.018
 aston-0.016
 policy-0.015
 next-0.015
Token was
Token ID373
Feature activation+2.18e-02

Pos logit contributions
mare+0.070
pron+0.063
ack+0.058
atmeal+0.058
lic+0.056
Neg logit contributions
-0.056
Warning-0.049
 aston-0.043
 next-0.043
 moral-0.040
Token trapped
Token ID13640
Feature activation+2.23e-01

Pos logit contributions
+0.028
Warning+0.023
 next+0.022
 aston+0.021
 moral+0.019
Neg logit contributions
mare-0.040
pron-0.035
ack-0.034
atmeal-0.034
lic-0.033
Token.
Token ID13
Feature activation-1.28e-02

Pos logit contributions
Warning+0.346
+0.341
 aston+0.305
 policy+0.290
 moral+0.286
Neg logit contributions
mare-0.291
ack-0.253
pron-0.233
ackle-0.226
lic-0.225
Token\n
Token ID198
Feature activation+7.97e-03

Pos logit contributions
mare+0.018
ack+0.015
pron+0.015
lic+0.014
ackle+0.014
Neg logit contributions
-0.020
Warning-0.019
 aston-0.017
 policy-0.017
 next-0.016
Token\n
Token ID198
Feature activation+7.28e-03

Pos logit contributions
+0.013
Warning+0.012
 policy+0.011
 aston+0.011
 upcoming+0.011
Neg logit contributions
mare-0.010
ack-0.009
pron-0.008
lic-0.008
ackle-0.008
Token"
Token ID1
Feature activation-2.55e-02

Pos logit contributions
+0.011
Warning+0.011
 aston+0.010
 policy+0.010
 upcoming+0.009
Neg logit contributions
mare-0.010
ack-0.009
pron-0.009
lic-0.008
ackle-0.008
TokenSam
Token ID16305
Feature activation+8.79e-03

Pos logit contributions
mare+0.039
ack+0.035
pron+0.033
elvet+0.030
it+0.030
Neg logit contributions
-0.035
 aston-0.030
 policy-0.029
Warning-0.029
 upcoming-0.028
TokenHello
Token ID15496
Feature activation+2.72e-01

Pos logit contributions
mare+0.032
pron+0.029
lic+0.027
ack+0.027
atmeal+0.025
Neg logit contributions
 policy-0.016
-0.016
Warning-0.014
 ago-0.013
 aston-0.013
Token,
Token ID11
Feature activation+4.07e-02

Pos logit contributions
+0.316
Warning+0.271
 next+0.240
 aston+0.238
 policy+0.227
Neg logit contributions
mare-0.502
pron-0.455
ack-0.437
atmeal-0.420
ackle-0.420
Token little
Token ID1310
Feature activation-6.15e-02

Pos logit contributions
+0.052
Warning+0.042
 next+0.039
 aston+0.038
 policy+0.037
Neg logit contributions
mare-0.072
pron-0.065
ack-0.065
ench-0.063
atmeal-0.062
Token girl
Token ID2576
Feature activation-8.70e-02

Pos logit contributions
mare+0.089
pron+0.077
ack+0.074
lic+0.070
atmeal+0.069
Neg logit contributions
-0.095
Warning-0.090
 aston-0.079
 policy-0.078
 next-0.078
Token.
Token ID13
Feature activation-6.55e-03

Pos logit contributions
mare+0.149
pron+0.135
ack+0.130
atmeal+0.125
lic+0.123
Neg logit contributions
-0.114
Warning-0.099
 next-0.087
 aston-0.087
 policy-0.087
Token Can
Token ID1680
Feature activation+2.07e-01

Pos logit contributions
mare+0.013
pron+0.011
lic+0.011
ack+0.010
atmeal+0.010
Neg logit contributions
-0.007
Warning-0.007
 policy-0.006
 aston-0.005
 next-0.005
Token you
Token ID345
Feature activation+6.72e-03

Pos logit contributions
Warning+0.327
+0.305
 policy+0.301
 aston+0.299
 seventeen+0.270
Neg logit contributions
mare-0.273
pron-0.227
atmeal-0.205
ack-0.198
lic-0.183
Token help
Token ID1037
Feature activation+9.80e-02

Pos logit contributions
+0.008
Warning+0.007
 aston+0.006
 next+0.006
 policy+0.006
Neg logit contributions
mare-0.012
pron-0.011
ack-0.011
atmeal-0.010
lic-0.010
Token me
Token ID502
Feature activation-3.82e-02

Pos logit contributions
+0.155
Warning+0.144
 aston+0.127
 next+0.125
 policy+0.125
Neg logit contributions
mare-0.141
pron-0.121
ack-0.117
lic-0.110
atmeal-0.109
Token sort
Token ID3297
Feature activation+4.99e-02

Pos logit contributions
mare+0.060
pron+0.051
ack+0.049
lic+0.047
ackle+0.046
Neg logit contributions
-0.058
Warning-0.050
 next-0.047
 aston-0.044
 policy-0.043
Token these
Token ID777
Feature activation+1.24e-01

Pos logit contributions
Warning+0.091
+0.084
 aston+0.083
 policy+0.081
 horror+0.078
Neg logit contributions
mare-0.044
ack-0.041
pron-0.039
udd-0.036
ocked-0.035
Token walked
Token ID6807
Feature activation+2.12e-02

Pos logit contributions
mare+0.019
pron+0.018
atmeal+0.017
ack+0.017
ench+0.017
Neg logit contributions
-0.013
Warning-0.010
 next-0.009
 aston-0.009
 moral-0.008
Token towards
Token ID3371
Feature activation+1.87e-01

Pos logit contributions
+0.025
Warning+0.025
 policy+0.021
 aston+0.021
 seventeen+0.019
Neg logit contributions
mare-0.037
pron-0.032
ack-0.031
lic-0.031
ackle-0.030
Token it
Token ID340
Feature activation-2.69e-02

Pos logit contributions
+0.287
Warning+0.274
 policy+0.242
 aston+0.241
 next+0.228
Neg logit contributions
mare-0.270
pron-0.224
ack-0.220
ackle-0.203
lic-0.202
Token.
Token ID13
Feature activation-3.49e-03

Pos logit contributions
mare+0.041
pron+0.035
ack+0.034
lic+0.032
ackle+0.032
Neg logit contributions
-0.039
Warning-0.038
 aston-0.033
 policy-0.033
 next-0.031
Token When
Token ID1649
Feature activation-1.14e-01

Pos logit contributions
mare+0.005
pron+0.005
ack+0.004
lic+0.004
atmeal+0.004
Neg logit contributions
-0.005
Warning-0.005
 aston-0.004
 policy-0.004
 next-0.004
Token she
Token ID673
Feature activation-9.92e-03

Pos logit contributions
mare+0.170
pron+0.146
ack+0.142
atmeal+0.131
ackle+0.130
Neg logit contributions
-0.169
Warning-0.162
 aston-0.142
 policy-0.140
 next-0.132
Token reached
Token ID4251
Feature activation-6.45e-02

Pos logit contributions
mare+0.016
pron+0.014
ack+0.014
ench+0.013
ackle+0.013
Neg logit contributions
-0.015
Warning-0.012
 aston-0.012
 next-0.011
 counts-0.010
Token the
Token ID262
Feature activation-3.35e-02

Pos logit contributions
mare+0.096
pron+0.080
ack+0.078
ackle+0.072
lic+0.072
Neg logit contributions
-0.098
Warning-0.094
 aston-0.082
 policy-0.081
 next-0.077
Token door
Token ID3420
Feature activation-3.64e-02

Pos logit contributions
mare+0.047
pron+0.040
ack+0.039
lic+0.036
ackle+0.036
Neg logit contributions
-0.054
Warning-0.051
 aston-0.045
 policy-0.045
 next-0.043
Token,
Token ID11
Feature activation+4.27e-02

Pos logit contributions
mare+0.052
ack+0.043
pron+0.043
lic+0.043
ackle+0.040
Neg logit contributions
Warning-0.053
-0.052
 policy-0.045
 aston-0.045
 moral-0.043
Token it
Token ID340
Feature activation-2.70e-02

Pos logit contributions
+0.064
Warning+0.062
 aston+0.054
 policy+0.053
 next+0.050
Neg logit contributions
mare-0.063
pron-0.055
ack-0.051
atmeal-0.048
ackle-0.048
Token have
Token ID423
Feature activation+3.53e-02

Pos logit contributions
+0.017
 next+0.013
 counts+0.012
 ever+0.012
Warning+0.011
Neg logit contributions
mare-0.019
pron-0.018
ocked-0.015
auldron-0.015
atmeal-0.015
Token to
Token ID284
Feature activation+4.32e-03

Pos logit contributions
+0.047
Warning+0.040
 next+0.036
 aston+0.035
 policy+0.034
Neg logit contributions
mare-0.062
pron-0.054
ack-0.053
atmeal-0.050
ackle-0.050
Token find
Token ID1064
Feature activation+1.02e-01

Pos logit contributions
+0.006
Warning+0.005
 next+0.005
 aston+0.004
 policy+0.004
Neg logit contributions
mare-0.007
pron-0.007
lic-0.006
auldron-0.006
ocked-0.006
Token me
Token ID502
Feature activation-3.42e-02

Pos logit contributions
+0.152
Warning+0.146
 aston+0.127
 policy+0.126
 next+0.119
Neg logit contributions
mare-0.154
pron-0.131
ack-0.130
lic-0.120
ackle-0.119
Token!"
Token ID2474
Feature activation-5.23e-02

Pos logit contributions
mare+0.060
pron+0.054
lic+0.052
ack+0.052
ackle+0.051
Neg logit contributions
-0.045
 next-0.039
Warning-0.032
 behind-0.032
 counts-0.030
Token His
Token ID2399
Feature activation+6.28e-02

Pos logit contributions
mare+0.106
pron+0.104
ackle+0.091
lic+0.090
ack+0.090
Neg logit contributions
-0.061
Warning-0.047
 next-0.040
 aston-0.037
 counts-0.035
Token friends
Token ID2460
Feature activation-1.31e-01

Pos logit contributions
+0.124
Warning+0.119
 aston+0.107
 policy+0.106
 next+0.102
Neg logit contributions
mare-0.065
pron-0.052
ack-0.051
lic-0.045
atmeal-0.044
Token agreed
Token ID4987
Feature activation-7.96e-02

Pos logit contributions
mare+0.230
pron+0.204
auldron+0.195
atmeal+0.195
ack+0.192
Neg logit contributions
-0.182
Warning-0.146
 next-0.134
 aston-0.132
 dis-0.117
Token,
Token ID11
Feature activation+4.51e-02

Pos logit contributions
mare+0.145
pron+0.127
ack+0.124
atmeal+0.122
ackle+0.119
Neg logit contributions
-0.102
Warning-0.086
 next-0.079
 aston-0.075
 policy-0.074
Token so
Token ID523
Feature activation-6.99e-02

Pos logit contributions
+0.057
 next+0.042
Warning+0.038
 aston+0.036
 moral+0.033
Neg logit contributions
mare-0.086
atmeal-0.079
pron-0.078
ack-0.077
ench-0.076
Token Bob
Token ID5811
Feature activation+7.95e-02

Pos logit contributions
mare+0.132
pron+0.119
ack+0.114
atmeal+0.110
ackle+0.109
Neg logit contributions
-0.082
Warning-0.072
 aston-0.062
 next-0.061
 policy-0.060
Token have
Token ID423
Feature activation+3.16e-02

Pos logit contributions
+0.176
Warning+0.158
 policy+0.141
 next+0.140
 aston+0.137
Neg logit contributions
mare-0.179
pron-0.154
ack-0.149
lic-0.146
atmeal-0.142
Token asked
Token ID1965
Feature activation+5.69e-02

Pos logit contributions
+0.048
Warning+0.045
 aston+0.040
 policy+0.039
 next+0.038
Neg logit contributions
mare-0.047
pron-0.041
ack-0.040
lic-0.037
atmeal-0.037
Token me
Token ID502
Feature activation-3.71e-02

Pos logit contributions
+0.080
Warning+0.077
 aston+0.067
 policy+0.065
 next+0.060
Neg logit contributions
mare-0.089
pron-0.077
ack-0.076
atmeal-0.070
ackle-0.070
Token first
Token ID717
Feature activation+1.22e-01

Pos logit contributions
mare+0.071
pron+0.063
ack+0.061
lic+0.059
atmeal+0.058
Neg logit contributions
-0.041
Warning-0.037
 aston-0.031
 policy-0.030
 next-0.030
Token.
Token ID13
Feature activation-1.25e-02

Pos logit contributions
+0.163
Warning+0.146
 next+0.130
 policy+0.129
 aston+0.126
Neg logit contributions
mare-0.207
pron-0.182
ack-0.176
lic-0.171
atmeal-0.169
Token You
Token ID921
Feature activation+1.02e-02

Pos logit contributions
mare+0.021
pron+0.018
ack+0.018
lic+0.017
atmeal+0.017
Neg logit contributions
-0.017
Warning-0.016
 aston-0.013
 policy-0.013
 next-0.013
Token should
Token ID815
Feature activation+1.15e-01

Pos logit contributions
+0.013
Warning+0.010
 aston+0.008
 policy+0.008
 next+0.008
Neg logit contributions
mare-0.020
pron-0.018
ack-0.017
atmeal-0.017
ackle-0.017
Token have
Token ID423
Feature activation+3.14e-02

Pos logit contributions
+0.166
Warning+0.142
 next+0.131
 policy+0.129
 aston+0.122
Neg logit contributions
mare-0.192
pron-0.165
lic-0.164
ack-0.161
auldron-0.155
Token respected
Token ID14462
Feature activation+9.58e-03

Pos logit contributions
+0.046
Warning+0.043
 aston+0.037
 policy+0.037
 next+0.036
Neg logit contributions
mare-0.049
pron-0.043
ack-0.042
lic-0.039
atmeal-0.039
Token my
Token ID616
Feature activation-1.68e-02

Pos logit contributions
+0.015
Warning+0.015
 aston+0.013
 policy+0.012
 upcoming+0.012
Neg logit contributions
mare-0.013
ack-0.011
pron-0.011
ins-0.010
atmeal-0.009
Token things
Token ID1243
Feature activation-2.00e-02

Pos logit contributions
mare+0.029
pron+0.025
ack+0.025
lic+0.024
atmeal+0.024
Neg logit contributions
-0.022
Warning-0.020
 aston-0.017
 policy-0.017
 next-0.017
Token So
Token ID1406
Feature activation-3.48e-02

Pos logit contributions
mare+0.114
ackle+0.106
lic+0.104
auldron+0.103
pron+0.103
Neg logit contributions
-0.035
 next-0.030
Warning-0.022
 moral-0.021
 policy-0.019
Token,
Token ID11
Feature activation+4.15e-02

Pos logit contributions
mare+0.062
pron+0.055
ack+0.053
lic+0.051
ackle+0.050
Neg logit contributions
-0.044
Warning-0.040
 aston-0.034
 policy-0.034
 next-0.033
Token Tom
Token ID4186
Feature activation+2.78e-02

Pos logit contributions
+0.027
 next+0.024
 behind+0.018
 Mom+0.014
 orange+0.013
Neg logit contributions
mare-0.094
ench-0.092
rows-0.091
ackle-0.090
udd-0.089
Token and
Token ID290
Feature activation+4.69e-03

Pos logit contributions
+0.039
 next+0.028
Warning+0.027
 aston+0.026
 counts+0.025
Neg logit contributions
mare-0.051
pron-0.043
ench-0.041
ack-0.041
lic-0.041
Token his
Token ID465
Feature activation-1.40e-02

Pos logit contributions
+0.006
Warning+0.006
 policy+0.005
 aston+0.005
 next+0.005
Neg logit contributions
mare-0.008
ack-0.007
pron-0.007
ackle-0.007
lic-0.007
Token mom
Token ID1995
Feature activation+6.49e-02

Pos logit contributions
mare+0.013
pron+0.013
uddle+0.011
ack+0.011
ouse+0.010
Neg logit contributions
Warning-0.029
-0.028
 seventeen-0.023
 aston-0.023
 comparison-0.023
Tokenmy
Token ID1820
Feature activation-1.57e-02

Pos logit contributions
+0.092
Warning+0.076
 next+0.071
 aston+0.068
 policy+0.063
Neg logit contributions
mare-0.111
pron-0.094
ack-0.090
ench-0.087
ackle-0.087
Token walked
Token ID6807
Feature activation+2.26e-02

Pos logit contributions
mare+0.028
pron+0.024
lic+0.023
ench+0.022
ackle+0.022
Neg logit contributions
-0.022
Warning-0.017
 aston-0.016
 next-0.016
 counts-0.014
Token over
Token ID625
Feature activation+1.66e-01

Pos logit contributions
+0.028
Warning+0.026
 aston+0.022
 policy+0.021
 next+0.021
Neg logit contributions
mare-0.040
pron-0.036
ack-0.035
atmeal-0.033
lic-0.033
Token to
Token ID284
Feature activation+3.03e-03

Pos logit contributions
+0.277
Warning+0.275
 policy+0.246
 seventeen+0.238
 aston+0.236
Neg logit contributions
mare-0.196
ack-0.174
pron-0.156
lic-0.148
ackle-0.136
Token the
Token ID262
Feature activation-2.92e-02

Pos logit contributions
+0.004
Warning+0.004
 aston+0.003
 next+0.003
 policy+0.003
Neg logit contributions
mare-0.005
pron-0.005
ack-0.005
lic-0.004
ackle-0.004
Token and
Token ID290
Feature activation+3.67e-03

Pos logit contributions
mare+0.092
pron+0.084
atmeal+0.082
ack+0.079
auldron+0.078
Neg logit contributions
-0.051
 next-0.042
Warning-0.042
 aston-0.036
 behind-0.034
Token marched
Token ID23558
Feature activation-4.63e-02

Pos logit contributions
+0.005
Warning+0.005
 next+0.004
 aston+0.004
 counts+0.004
Neg logit contributions
mare-0.006
pron-0.005
ack-0.005
ench-0.005
atmeal-0.005
Token until
Token ID1566
Feature activation+2.96e-02

Pos logit contributions
mare+0.088
pron+0.080
atmeal+0.077
ack+0.074
auldron+0.073
Neg logit contributions
-0.053
Warning-0.046
 next-0.043
 aston-0.039
 policy-0.037
Token she
Token ID673
Feature activation-1.24e-02

Pos logit contributions
+0.042
Warning+0.040
 aston+0.035
 policy+0.034
 next+0.033
Neg logit contributions
mare-0.047
pron-0.041
ack-0.039
lic-0.037
atmeal-0.037
Token came
Token ID1625
Feature activation-9.58e-02

Pos logit contributions
mare+0.021
pron+0.021
atmeal+0.019
ench+0.018
auldron+0.018
Neg logit contributions
-0.017
 next-0.013
Warning-0.013
 aston-0.012
 counts-0.011
Token across
Token ID1973
Feature activation+1.52e-01

Pos logit contributions
mare+0.138
ack+0.122
ench+0.111
pron+0.109
lic+0.109
Neg logit contributions
Warning-0.138
-0.132
 policy-0.118
 aston-0.117
 seventeen-0.114
Token a
Token ID257
Feature activation-1.20e-02

Pos logit contributions
+0.213
Warning+0.200
 aston+0.174
 policy+0.171
 next+0.166
Neg logit contributions
mare-0.245
pron-0.213
ack-0.208
lic-0.196
atmeal-0.195
Token rude
Token ID22066
Feature activation+4.54e-02

Pos logit contributions
mare+0.023
atmeal+0.020
pron+0.019
ins+0.019
ack+0.019
Neg logit contributions
-0.015
Warning-0.011
 aston-0.010
 next-0.010
 unexpected-0.010
Token man
Token ID582
Feature activation-7.57e-02

Pos logit contributions
+0.056
Warning+0.048
 policy+0.047
 next+0.045
 aston+0.044
Neg logit contributions
mare-0.076
pron-0.070
atmeal-0.069
hes-0.065
lic-0.065
Token.
Token ID13
Feature activation-6.19e-03

Pos logit contributions
mare+0.111
pron+0.094
ack+0.093
lic+0.087
atmeal+0.086
Neg logit contributions
-0.110
Warning-0.108
 aston-0.094
 policy-0.094
 moral-0.088
Token Amy
Token ID14235
Feature activation+6.38e-02

Pos logit contributions
mare+0.010
pron+0.009
ack+0.009
atmeal+0.008
lic+0.008
Neg logit contributions
-0.009
Warning-0.008
 aston-0.007
 policy-0.007
 next-0.007
Token content
Token ID2695
Feature activation-6.11e-02

Pos logit contributions
+0.009
Warning+0.009
 aston+0.008
 policy+0.008
 next+0.007
Neg logit contributions
mare-0.009
pron-0.008
ack-0.008
lic-0.007
atmeal-0.007
Token.
Token ID13
Feature activation-3.54e-03

Pos logit contributions
mare+0.116
pron+0.110
ack+0.103
atmeal+0.097
lic+0.097
Neg logit contributions
-0.064
 next-0.062
Warning-0.056
 policy-0.051
 aston-0.049
Token<|endoftext|>
Token ID50256
Feature activation+5.22e-02

Pos logit contributions
mare+0.006
pron+0.005
ack+0.005
lic+0.004
atmeal+0.004
Neg logit contributions
-0.005
Warning-0.005
 aston-0.004
 policy-0.004
 next-0.004
TokenOnce
Token ID7454
Feature activation+9.45e-03

Pos logit contributions
+0.072
Warning+0.066
 aston+0.057
 next+0.056
 policy+0.056
Neg logit contributions
mare-0.086
pron-0.076
ack-0.073
atmeal-0.070
ackle-0.069
Token upon
Token ID2402
Feature activation-1.25e-01

Pos logit contributions
+0.017
Warning+0.016
 aston+0.015
 policy+0.015
 next+0.014
Neg logit contributions
mare-0.011
pron-0.009
ack-0.009
atmeal-0.008
lic-0.008
Token a
Token ID257
Feature activation-1.12e-02

Pos logit contributions
mare+0.254
pron+0.227
ack+0.224
atmeal+0.216
auldron+0.213
Neg logit contributions
-0.141
Warning-0.111
 next-0.102
 aston-0.091
 moral-0.082
Token time
Token ID640
Feature activation-1.25e-02

Pos logit contributions
mare+0.017
pron+0.014
ack+0.014
ackle+0.013
lic+0.013
Neg logit contributions
-0.017
Warning-0.016
 policy-0.014
 aston-0.014
 next-0.013
Token,
Token ID11
Feature activation+4.09e-02

Pos logit contributions
mare+0.013
ack+0.011
adow+0.010
uffy+0.009
lic+0.009
Neg logit contributions
Warning-0.025
-0.021
 seventeen-0.020
 bills-0.020
 aston-0.020
Token there
Token ID612
Feature activation-7.07e-03

Pos logit contributions
Warning+0.074
+0.067
 policy+0.066
 aston+0.063
 cameras+0.063
Neg logit contributions
mare-0.045
ack-0.038
pron-0.036
lic-0.034
it-0.033
Token was
Token ID373
Feature activation+2.64e-02

Pos logit contributions
mare+0.011
ack+0.009
pron+0.009
lic+0.008
atmeal+0.008
Neg logit contributions
-0.010
Warning-0.010
 aston-0.009
 policy-0.008
 next-0.008
Token a
Token ID257
Feature activation-1.19e-02

Pos logit contributions
Warning+0.046
+0.044
 policy+0.043
 aston+0.040
 seventeen+0.038
Neg logit contributions
mare-0.030
pron-0.026
ack-0.025
hes-0.022
atmeal-0.021
TokenIt
Token ID1026
Feature activation-1.25e-01

Pos logit contributions
mare+0.068
pron+0.059
ack+0.058
lic+0.055
atmeal+0.054
Neg logit contributions
-0.054
Warning-0.050
 aston-0.044
 policy-0.044
 next-0.041
Token's
Token ID338
Feature activation+8.86e-02

Pos logit contributions
mare+0.281
pron+0.269
ack+0.259
atmeal+0.250
ackle+0.243
Neg logit contributions
-0.130
 next-0.070
 counts-0.070
Warning-0.066
 aston-0.059
Token a
Token ID257
Feature activation-1.03e-02

Pos logit contributions
+0.121
Warning+0.109
 aston+0.096
 next+0.094
 policy+0.094
Neg logit contributions
mare-0.149
pron-0.130
ack-0.128
atmeal-0.121
lic-0.121
Token hotel
Token ID7541
Feature activation-1.21e-01

Pos logit contributions
mare+0.017
pron+0.014
ack+0.014
atmeal+0.014
lic+0.014
Neg logit contributions
-0.015
Warning-0.013
 aston-0.012
 policy-0.012
 next-0.011
Token,
Token ID11
Feature activation+4.18e-02

Pos logit contributions
mare+0.207
pron+0.185
ack+0.183
ench+0.176
ackle+0.175
Neg logit contributions
-0.165
 next-0.138
 policy-0.124
Warning-0.124
 counts-0.114
Token it
Token ID340
Feature activation-2.64e-02

Pos logit contributions
+0.056
Warning+0.048
 next+0.043
 aston+0.042
 policy+0.041
Neg logit contributions
mare-0.072
pron-0.064
ack-0.063
atmeal-0.060
lic-0.060
Token's
Token ID338
Feature activation+8.89e-02

Pos logit contributions
mare+0.059
pron+0.057
ack+0.054
ulk+0.053
elvet+0.053
Neg logit contributions
-0.029
 next-0.016
 counts-0.015
 precious-0.012
 happen-0.011
Token supposed
Token ID4385
Feature activation+5.31e-02

Pos logit contributions
+0.128
Warning+0.112
 next+0.101
 aston+0.100
 policy+0.098
Neg logit contributions
mare-0.144
pron-0.126
ack-0.125
atmeal-0.120
lic-0.119
Token to
Token ID284
Feature activation+2.06e-03

Pos logit contributions
+0.068
Warning+0.060
 next+0.053
 policy+0.052
 aston+0.052
Neg logit contributions
mare-0.092
pron-0.083
ack-0.082
lic-0.078
atmeal-0.077
Token be
Token ID307
Feature activation+1.82e-03

Pos logit contributions
+0.003
Warning+0.003
 aston+0.002
 policy+0.002
 next+0.002
Neg logit contributions
mare-0.003
pron-0.003
ack-0.003
lic-0.003
atmeal-0.003
Token fancy
Token ID14996
Feature activation+7.38e-02

Pos logit contributions
+0.002
Warning+0.002
 aston+0.002
 next+0.002
 policy+0.002
Neg logit contributions
mare-0.003
pron-0.003
ack-0.003
atmeal-0.003
ackle-0.002
Token glad
Token ID9675
Feature activation-1.60e-02

Pos logit contributions
+0.035
 next+0.028
 aston+0.024
 ever+0.023
 true+0.023
Neg logit contributions
mare-0.050
ack-0.049
uddle-0.049
rows-0.048
auldron-0.048
Token that
Token ID326
Feature activation-7.78e-02

Pos logit contributions
mare+0.029
pron+0.027
ack+0.026
atmeal+0.026
lic+0.025
Neg logit contributions
-0.020
 next-0.017
Warning-0.016
 policy-0.015
 aston-0.014
Token she
Token ID673
Feature activation-1.48e-02

Pos logit contributions
mare+0.082
pron+0.060
ack+0.055
 always+0.054
atmeal+0.054
Neg logit contributions
Warning-0.138
 seventeen-0.130
-0.128
 aston-0.124
 policy-0.118
Token had
Token ID550
Feature activation+1.24e-01

Pos logit contributions
mare+0.029
pron+0.028
ack+0.028
ench+0.027
atmeal+0.026
Neg logit contributions
-0.016
Warning-0.012
 next-0.011
 aston-0.010
 counts-0.009
Token experienced
Token ID5924
Feature activation+2.02e-02

Pos logit contributions
+0.158
Warning+0.137
 next+0.128
 aston+0.117
 policy+0.114
Neg logit contributions
mare-0.215
pron-0.205
ack-0.202
atmeal-0.191
ackle-0.190
Token the
Token ID262
Feature activation-3.36e-02

Pos logit contributions
+0.027
Warning+0.023
 aston+0.021
 next+0.021
 policy+0.020
Neg logit contributions
mare-0.035
pron-0.032
ack-0.031
atmeal-0.030
lic-0.030
Token tough
Token ID5802
Feature activation+1.38e-02

Pos logit contributions
mare+0.062
pron+0.056
ack+0.054
lic+0.053
atmeal+0.052
Neg logit contributions
-0.040
Warning-0.036
 next-0.031
 aston-0.031
 policy-0.030
Token land
Token ID1956
Feature activation-1.36e-01

Pos logit contributions
+0.021
Warning+0.019
 policy+0.017
 next+0.017
 aston+0.017
Neg logit contributions
mare-0.020
pron-0.018
ack-0.017
lic-0.017
atmeal-0.016
Token and
Token ID290
Feature activation+4.36e-03

Pos logit contributions
mare+0.241
pron+0.212
ack+0.210
lic+0.201
atmeal+0.197
Neg logit contributions
-0.176
Warning-0.146
 next-0.141
 policy-0.128
 aston-0.128
Token had
Token ID550
Feature activation+1.23e-01

Pos logit contributions
+0.006
Warning+0.005
 next+0.005
 aston+0.005
 policy+0.005
Neg logit contributions
mare-0.007
pron-0.007
ack-0.007
lic-0.006
ackle-0.006
Token many
Token ID867
Feature activation-2.55e-02

Pos logit contributions
+0.151
Warning+0.115
 next+0.114
 unexpected+0.100
 aston+0.099
Neg logit contributions
mare-0.227
pron-0.220
ack-0.217
atmeal-0.209
ackle-0.208
Token new
Token ID649
Feature activation+5.38e-02

Pos logit contributions
+0.014
Warning+0.013
 policy+0.011
 aston+0.011
 next+0.011
Neg logit contributions
mare-0.016
pron-0.014
ack-0.014
lic-0.013
atmeal-0.013
Token!
Token ID0
Feature activation-6.45e-02

Pos logit contributions
+0.071
Warning+0.061
 next+0.058
 policy+0.056
 aston+0.055
Neg logit contributions
mare-0.096
pron-0.085
ack-0.080
atmeal-0.078
lic-0.077
Token
Token ID220
Feature activation-5.47e-02

Pos logit contributions
mare+0.093
pron+0.080
ack+0.078
lic+0.072
atmeal+0.072
Neg logit contributions
-0.100
Warning-0.094
 aston-0.084
 policy-0.083
 next-0.081
Token
Token ID198
Feature activation+1.47e-02

Pos logit contributions
mare+0.115
pron+0.105
atmeal+0.096
auldron+0.094
ackle+0.092
Neg logit contributions
-0.064
Warning-0.048
 next-0.040
 aston-0.040
 policy-0.035
Token
Token ID198
Feature activation+1.37e-02

Pos logit contributions
+0.021
Warning+0.019
 aston+0.017
 policy+0.016
 next+0.016
Neg logit contributions
mare-0.024
pron-0.021
ack-0.020
atmeal-0.019
lic-0.019
TokenThe
Token ID464
Feature activation-2.40e-02

Pos logit contributions
+0.021
Warning+0.020
 aston+0.017
 next+0.017
 policy+0.016
Neg logit contributions
mare-0.021
pron-0.018
ack-0.017
atmeal-0.016
auldron-0.016
Token old
Token ID1468
Feature activation+1.44e-01

Pos logit contributions
mare+0.036
pron+0.034
ack+0.031
atmeal+0.029
lic+0.029
Neg logit contributions
Warning-0.034
-0.034
 policy-0.029
 aston-0.028
 counts-0.025
Token man
Token ID582
Feature activation-7.93e-02

Pos logit contributions
+0.229
Warning+0.211
 next+0.197
 aston+0.190
 policy+0.188
Neg logit contributions
mare-0.205
ack-0.171
pron-0.170
ench-0.164
lic-0.162
Token was
Token ID373
Feature activation+2.95e-02

Pos logit contributions
mare+0.137
pron+0.119
ack+0.114
atmeal+0.109
ackle+0.109
Neg logit contributions
-0.108
Warning-0.093
 aston-0.084
 next-0.082
 policy-0.078
Token very
Token ID845
Feature activation-6.47e-02

Pos logit contributions
+0.042
 next+0.032
 aston+0.031
Warning+0.027
 dis+0.027
Neg logit contributions
mare-0.053
auldron-0.046
rows-0.046
pron-0.045
ack-0.045
Token happy
Token ID3772
Feature activation-6.24e-03

Pos logit contributions
mare+0.104
pron+0.088
ack+0.086
atmeal+0.083
auldron+0.082
Neg logit contributions
-0.095
Warning-0.083
 aston-0.078
 next-0.076
 policy-0.073
Token It
Token ID632
Feature activation-4.76e-02

Pos logit contributions
pron+0.007
mare+0.007
atmeal+0.007
ack+0.007
ins+0.006
Neg logit contributions
-0.005
Warning-0.004
 aston-0.003
 next-0.003
 Oy-0.003
Token was
Token ID373
Feature activation+2.60e-02

Pos logit contributions
mare+0.074
pron+0.064
ack+0.063
lic+0.059
ackle+0.058
Neg logit contributions
-0.068
Warning-0.065
 aston-0.057
 policy-0.056
 next-0.054
Token a
Token ID257
Feature activation-1.16e-02

Pos logit contributions
Warning+0.041
+0.040
 policy+0.035
 aston+0.034
 counts+0.033
Neg logit contributions
mare-0.036
pron-0.030
ack-0.029
lic-0.027
ackle-0.026
Token little
Token ID1310
Feature activation-5.98e-02

Pos logit contributions
mare+0.021
ack+0.018
atmeal+0.018
pron+0.018
ench+0.017
Neg logit contributions
-0.014
Warning-0.012
 aston-0.011
 next-0.011
 policy-0.011
Token bird
Token ID6512
Feature activation-1.28e-02

Pos logit contributions
mare+0.090
pron+0.077
ack+0.075
lic+0.071
ackle+0.070
Neg logit contributions
-0.089
Warning-0.085
 aston-0.074
 policy-0.073
 next-0.071
Token who
Token ID508
Feature activation-1.17e-01

Pos logit contributions
mare+0.021
pron+0.019
ack+0.018
atmeal+0.017
auldron+0.017
Neg logit contributions
-0.019
Warning-0.016
 next-0.014
 aston-0.014
 policy-0.013
Token had
Token ID550
Feature activation+1.24e-01

Pos logit contributions
mare+0.223
pron+0.201
atmeal+0.193
ack+0.191
ench+0.189
Neg logit contributions
-0.143
Warning-0.116
 aston-0.103
 next-0.099
 policy-0.087
Token hurt
Token ID5938
Feature activation+8.21e-02

Pos logit contributions
+0.139
Warning+0.115
 next+0.099
 aston+0.097
 unexpected+0.088
Neg logit contributions
mare-0.242
pron-0.223
ack-0.214
atmeal-0.212
ackle-0.205
Token its
Token ID663
Feature activation+1.20e-02

Pos logit contributions
+0.105
Warning+0.097
 aston+0.082
 next+0.081
 policy+0.081
Neg logit contributions
mare-0.143
pron-0.129
ack-0.124
atmeal-0.120
lic-0.118
Token wing
Token ID8539
Feature activation-2.98e-02

Pos logit contributions
+0.020
Warning+0.019
 aston+0.017
 policy+0.016
 next+0.015
Neg logit contributions
mare-0.016
pron-0.014
ack-0.013
ench-0.012
atmeal-0.012
Token.
Token ID13
Feature activation-6.10e-03

Pos logit contributions
mare+0.058
pron+0.050
atmeal+0.049
ack+0.049
lic+0.047
Neg logit contributions
-0.038
 next-0.028
Warning-0.028
 policy-0.023
 behind-0.023
Token from
Token ID422
Feature activation+1.84e-01

Pos logit contributions
mare+0.094
pron+0.082
ack+0.079
atmeal+0.065
 always+0.065
Neg logit contributions
Warning-0.128
-0.115
 aston-0.112
 seventeen-0.111
ulent-0.104
Token then
Token ID788
Feature activation+1.45e-01

Pos logit contributions
+0.282
Warning+0.280
 aston+0.237
 policy+0.232
 counts+0.220
Neg logit contributions
mare-0.255
pron-0.226
ack-0.214
ackle-0.189
atmeal-0.189
Token on
Token ID319
Feature activation+4.17e-02

Pos logit contributions
Warning+0.221
+0.208
 aston+0.192
 policy+0.180
 seventeen+0.177
Neg logit contributions
mare-0.197
ack-0.162
pron-0.156
ench-0.143
ocked-0.140
Token they
Token ID484
Feature activation+1.91e-02

Pos logit contributions
+0.057
Warning+0.057
 aston+0.049
 policy+0.046
 next+0.044
Neg logit contributions
mare-0.065
pron-0.057
ack-0.056
ench-0.050
atmeal-0.050
Token were
Token ID547
Feature activation+3.98e-02

Pos logit contributions
+0.025
Warning+0.020
 next+0.018
 aston+0.017
 policy+0.016
Neg logit contributions
mare-0.036
pron-0.033
ack-0.030
lic-0.030
atmeal-0.030
Token always
Token ID1464
Feature activation-2.32e-01

Pos logit contributions
+0.061
Warning+0.052
 next+0.050
 aston+0.048
 policy+0.046
Neg logit contributions
mare-0.063
pron-0.053
ack-0.052
lic-0.050
ackle-0.049
Token going
Token ID1016
Feature activation+4.41e-02

Pos logit contributions
mare+0.377
pron+0.327
ack+0.320
lic+0.302
atmeal+0.300
Neg logit contributions
-0.325
Warning-0.302
 aston-0.264
 policy-0.259
 next-0.256
Token to
Token ID284
Feature activation+1.10e-03

Pos logit contributions
+0.061
Warning+0.057
 aston+0.050
 policy+0.049
 next+0.047
Neg logit contributions
mare-0.072
pron-0.062
ack-0.061
lic-0.057
atmeal-0.057
Token be
Token ID307
Feature activation+6.34e-04

Pos logit contributions
+0.002
Warning+0.001
 next+0.001
 aston+0.001
 policy+0.001
Neg logit contributions
mare-0.002
pron-0.002
lic-0.002
ack-0.002
ackle-0.002
Token careful
Token ID8161
Feature activation-8.35e-02

Pos logit contributions
+0.001
Warning+0.001
 next+0.001
 aston+0.001
 policy+0.001
Neg logit contributions
mare-0.001
pron-0.001
ack-0.001
lic-0.001
atmeal-0.001
Token when
Token ID618
Feature activation-7.20e-02

Pos logit contributions
mare+0.110
pron+0.096
ack+0.096
ench+0.088
ocked+0.085
Neg logit contributions
Warning-0.132
-0.123
 aston-0.112
 policy-0.106
Hello-0.100
Token,
Token ID11
Feature activation+4.19e-02

Pos logit contributions
mare+0.016
pron+0.014
ack+0.014
atmeal+0.013
ackle+0.013
Neg logit contributions
-0.017
Warning-0.016
 aston-0.014
 policy-0.013
 moral-0.013
Token Lily
Token ID20037
Feature activation+3.19e-03

Pos logit contributions
+0.061
Warning+0.059
 aston+0.051
 policy+0.050
 next+0.047
Neg logit contributions
mare-0.063
pron-0.056
ack-0.053
atmeal-0.050
lic-0.049
Token's
Token ID338
Feature activation+8.63e-02

Pos logit contributions
+0.004
 next+0.003
Warning+0.003
 counts+0.003
 aston+0.003
Neg logit contributions
mare-0.006
lic-0.005
ack-0.005
pron-0.005
ench-0.005
Token mom
Token ID1995
Feature activation+6.47e-02

Pos logit contributions
Warning+0.142
+0.142
 aston+0.115
 policy+0.113
 counts+0.106
Neg logit contributions
mare-0.112
pron-0.109
atmeal-0.095
ack-0.092
auldron-0.091
Token would
Token ID561
Feature activation-6.01e-02

Pos logit contributions
+0.084
 next+0.068
Warning+0.055
 counts+0.052
 behind+0.052
Neg logit contributions
mare-0.120
ack-0.103
lic-0.101
ackle-0.100
pron-0.100
Token always
Token ID1464
Feature activation-2.33e-01

Pos logit contributions
mare+0.092
pron+0.078
lic+0.078
ack+0.076
ackle+0.072
Neg logit contributions
-0.096
Warning-0.080
 next-0.077
 aston-0.076
 moral-0.071
Token remember
Token ID3505
Feature activation-9.99e-03

Pos logit contributions
mare+0.365
pron+0.314
ack+0.308
lic+0.292
ackle+0.287
Neg logit contributions
-0.339
Warning-0.313
 aston-0.277
 policy-0.272
 next-0.269
Token the
Token ID262
Feature activation-3.53e-02

Pos logit contributions
mare+0.017
pron+0.014
ack+0.014
lic+0.014
ackle+0.013
Neg logit contributions
-0.014
Warning-0.012
 aston-0.011
 next-0.011
 policy-0.011
Token orange
Token ID10912
Feature activation+2.31e-01

Pos logit contributions
pron+0.048
mare+0.048
atmeal+0.043
ack+0.041
ackle+0.040
Neg logit contributions
-0.051
Warning-0.051
 aston-0.044
 policy-0.041
foot-0.038
Token tree
Token ID5509
Feature activation-1.58e-02

Pos logit contributions
+0.262
Warning+0.240
 aston+0.199
 policy+0.198
 next+0.195
Neg logit contributions
mare-0.436
pron-0.385
ack-0.378
lic-0.360
ackle-0.356
Token and
Token ID290
Feature activation+6.79e-03

Pos logit contributions
mare+0.026
pron+0.023
ack+0.022
lic+0.021
ackle+0.021
Neg logit contributions
-0.022
Warning-0.020
 next-0.017
 aston-0.017
 policy-0.017
Token Tom
Token ID4186
Feature activation+2.80e-02

Pos logit contributions
mare+0.008
pron+0.007
ack+0.007
atmeal+0.007
ackle+0.007
Neg logit contributions
-0.005
Warning-0.005
 aston-0.004
 policy-0.004
 next-0.004
Token was
Token ID373
Feature activation+2.89e-02

Pos logit contributions
Warning+0.042
+0.038
 aston+0.036
 policy+0.035
 upcoming+0.034
Neg logit contributions
mare-0.040
pron-0.035
ack-0.034
atmeal-0.031
ackle-0.031
Token very
Token ID845
Feature activation-6.51e-02

Pos logit contributions
+0.037
Warning+0.031
 next+0.030
 aston+0.029
 policy+0.026
Neg logit contributions
mare-0.051
ack-0.045
pron-0.044
lic-0.042
ackle-0.042
Token flexible
Token ID12846
Feature activation+3.41e-02

Pos logit contributions
mare+0.108
pron+0.094
ack+0.092
lic+0.087
atmeal+0.086
Neg logit contributions
-0.088
Warning-0.082
 aston-0.071
 policy-0.070
 next-0.068
Token and
Token ID290
Feature activation+3.45e-03

Pos logit contributions
+0.035
 next+0.030
 behind+0.024
Warning+0.023
 policy+0.023
Neg logit contributions
mare-0.074
pron-0.066
atmeal-0.063
lic-0.063
ack-0.062
Token always
Token ID1464
Feature activation-2.32e-01

Pos logit contributions
+0.004
 next+0.003
 precious+0.003
 furry+0.003
 floppy+0.003
Neg logit contributions
mare-0.006
ench-0.006
ack-0.006
ackle-0.006
rows-0.006
Token full
Token ID1336
Feature activation+4.96e-03

Pos logit contributions
mare+0.409
ack+0.376
pron+0.375
ackle+0.358
auldron+0.354
Neg logit contributions
-0.299
Warning-0.245
 next-0.233
 dis-0.218
 counts-0.217
Token of
Token ID286
Feature activation-1.04e-02

Pos logit contributions
+0.007
Warning+0.007
 aston+0.006
 policy+0.006
 next+0.006
Neg logit contributions
mare-0.008
pron-0.007
ack-0.007
atmeal-0.006
ackle-0.006
Token energy
Token ID2568
Feature activation-5.24e-02

Pos logit contributions
mare+0.018
pron+0.016
ack+0.016
lic+0.015
atmeal+0.015
Neg logit contributions
-0.013
Warning-0.012
 aston-0.011
 policy-0.011
 next-0.010
Token.
Token ID13
Feature activation-4.46e-03

Pos logit contributions
mare+0.109
pron+0.095
ack+0.093
ackle+0.090
ench+0.089
Neg logit contributions
-0.058
 next-0.047
Warning-0.044
 policy-0.040
 ago-0.037
Token One
Token ID1881
Feature activation+1.43e-01

Pos logit contributions
mare+0.007
pron+0.006
ack+0.006
lic+0.006
atmeal+0.006
Neg logit contributions
-0.006
Warning-0.006
 aston-0.005
 policy-0.005
 next-0.005
Tokenucky
Token ID5309
Feature activation+1.24e-02

Pos logit contributions
+0.071
Warning+0.063
 next+0.056
 policy+0.055
 aston+0.053
Neg logit contributions
mare-0.090
pron-0.079
lic-0.073
ack-0.072
atmeal-0.071
Token was
Token ID373
Feature activation+2.58e-02

Pos logit contributions
+0.018
 next+0.014
Warning+0.013
 moral+0.013
 counts+0.012
Neg logit contributions
mare-0.021
pron-0.020
lic-0.020
ackle-0.018
ack-0.017
Token more
Token ID517
Feature activation-9.53e-03

Pos logit contributions
+0.037
 next+0.030
Warning+0.027
 moral+0.026
 aston+0.026
Neg logit contributions
mare-0.045
ack-0.040
lic-0.039
ackle-0.038
pron-0.038
Token careful
Token ID8161
Feature activation-8.28e-02

Pos logit contributions
mare+0.016
pron+0.013
ack+0.013
atmeal+0.013
lic+0.013
Neg logit contributions
-0.014
Warning-0.012
 next-0.011
 aston-0.011
 policy-0.011
Token and
Token ID290
Feature activation+4.82e-03

Pos logit contributions
mare+0.114
pron+0.098
ack+0.095
atmeal+0.087
ackle+0.087
Neg logit contributions
-0.130
Warning-0.126
 aston-0.112
 policy-0.108
 next-0.103
Token always
Token ID1464
Feature activation-2.33e-01

Pos logit contributions
+0.007
Warning+0.007
 next+0.006
 policy+0.006
 aston+0.006
Neg logit contributions
mare-0.007
ack-0.006
pron-0.006
lic-0.006
ackle-0.006
Token asked
Token ID1965
Feature activation+5.79e-02

Pos logit contributions
mare+0.348
pron+0.303
ack+0.295
lic+0.278
ackle+0.275
Neg logit contributions
-0.357
Warning-0.328
 aston-0.292
 next-0.287
 policy-0.286
Token for
Token ID329
Feature activation+2.03e-02

Pos logit contributions
+0.090
Warning+0.088
 aston+0.076
 policy+0.072
 seventeen+0.069
Neg logit contributions
mare-0.083
pron-0.070
ack-0.067
atmeal-0.063
lic-0.063
Token help
Token ID1037
Feature activation+1.00e-01

Pos logit contributions
+0.023
Warning+0.020
 aston+0.017
 next+0.017
 policy+0.017
Neg logit contributions
mare-0.039
pron-0.034
ack-0.034
lic-0.032
ackle-0.032
Token when
Token ID618
Feature activation-7.18e-02

Pos logit contributions
+0.145
Warning+0.127
 next+0.117
 policy+0.112
 aston+0.109
Neg logit contributions
mare-0.165
pron-0.144
ack-0.139
lic-0.134
atmeal-0.131
Token he
Token ID339
Feature activation-1.71e-02

Pos logit contributions
mare+0.100
pron+0.085
ack+0.082
atmeal+0.081
lic+0.075
Neg logit contributions
-0.110
Warning-0.107
 aston-0.096
 policy-0.089
 seventeen-0.086
Token his
Token ID465
Feature activation-1.82e-02

Pos logit contributions
+0.047
Warning+0.041
 next+0.037
 aston+0.035
 policy+0.032
Neg logit contributions
mare-0.069
pron-0.062
ack-0.061
lic-0.057
ackle-0.057
Token feet
Token ID3625
Feature activation+6.41e-02

Pos logit contributions
mare+0.032
pron+0.028
atmeal+0.027
lic+0.027
ack+0.026
Neg logit contributions
-0.023
Warning-0.020
 next-0.019
 aston-0.019
 policy-0.017
Token made
Token ID925
Feature activation-1.03e-01

Pos logit contributions
+0.096
Warning+0.079
 next+0.075
 aston+0.069
 behind+0.065
Neg logit contributions
mare-0.106
pron-0.089
auldron-0.083
atmeal-0.082
ack-0.081
Token little
Token ID1310
Feature activation-6.22e-02

Pos logit contributions
mare+0.202
pron+0.177
atmeal+0.161
ack+0.161
auldron+0.159
Neg logit contributions
-0.130
Warning-0.104
 next-0.096
 aston-0.095
 unexpected-0.083
Token p
Token ID279
Feature activation-2.46e-02

Pos logit contributions
mare+0.091
ack+0.078
pron+0.077
ackle+0.071
lic+0.071
Neg logit contributions
-0.092
Warning-0.089
 aston-0.077
 policy-0.077
 next-0.073
Tokenudd
Token ID4185
Feature activation-2.82e-01

Pos logit contributions
mare+0.037
pron+0.030
elvet+0.024
 superstar+0.021
iolet+0.021
Neg logit contributions
-0.042
 aston-0.034
Warning-0.034
 next-0.033
 upcoming-0.032
Tokenles
Token ID829
Feature activation-1.85e-01

Pos logit contributions
mare+0.511
pron+0.436
auldron+0.390
atmeal+0.387
ack+0.385
Neg logit contributions
-0.375
Warning-0.323
 next-0.308
 aston-0.284
 unexpected-0.276
Token in
Token ID287
Feature activation+2.45e-02

Pos logit contributions
mare+0.357
pron+0.317
auldron+0.292
atmeal+0.291
ack+0.288
Neg logit contributions
-0.248
 next-0.180
Warning-0.171
 aston-0.168
 behind-0.150
Token the
Token ID262
Feature activation-3.53e-02

Pos logit contributions
+0.035
Warning+0.033
 policy+0.029
 aston+0.028
 next+0.027
Neg logit contributions
mare-0.036
ack-0.031
pron-0.031
lic-0.028
ench-0.028
Token mud
Token ID17492
Feature activation+6.98e-02

Pos logit contributions
mare+0.063
pron+0.055
ack+0.053
lic+0.052
atmeal+0.051
Neg logit contributions
-0.046
Warning-0.041
 aston-0.035
 next-0.035
 policy-0.033
Token.
Token ID13
Feature activation-9.54e-03

Pos logit contributions
+0.077
 next+0.067
Warning+0.063
 behind+0.053
 policy+0.051
Neg logit contributions
mare-0.139
pron-0.125
ack-0.114
lic-0.113
ackle-0.111