TOP ACTIVATIONS
MAX = 0.205

and enthusiastic he could barely contain himself . He
M az zy could barely contain her excitement as she watched
so happy she could barely contain herself ! She got close
dress that she could barely contain her happiness . She couldn
before and she could hardly contain her excitement . She skipped
puppy . She could barely contain her excitement . She hugged
so happy he could hardly contain his excitement . He spent
excited - she could hardly contain herself . She started to
vious that he could hardly contain himself . He said to
special polish he couldn 't contain his happiness . He began
couldn ’ t contain himself . He yelled ,
so happy she could hardly contain her excitement ! She carefully
cartoon . He could barely contain his excitement as he sat
The two friends could hardly contain their joy , and hugged
so happy she could hardly contain her excitement .
cool that he could barely contain it . He was so
new board he could hardly contain himself . His dad smiled
excited that she could barely contain her joy .
excited that he could hardly contain himself ! Everywhere he went
's so special about Wed nes days ?" Max
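
NOTE (illustrative sketch, not the dashboard's own code)
The per-token records further down suggest that the contexts above are ordered by their peak feature activation: the "contain" token scores 0.205, 0.204, 0.203, ... in that order. Assuming that is how the listing was built, the ranking could be reproduced roughly as follows. The function name `top_contexts` and the `(tokens, activations)` input layout are hypothetical.

```python
from typing import List, Sequence, Tuple

Context = Tuple[Sequence[str], Sequence[float]]


def top_contexts(contexts: List[Context], k: int = 20) -> List[Tuple[float, Sequence[str]]]:
    """Return the k contexts with the highest peak activation, best first.

    Each context is a (tokens, activations) pair of equal length.
    This layout is an assumption made for the sketch.
    """
    # Score every context by its single largest per-token activation.
    scored = [(max(acts), tokens) for tokens, acts in contexts]
    # Sort by the score only, descending, and keep the first k entries.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Values taken from the token records in this dump.
    demo = [
        (["could", "hardly", "contain"], [-0.0674, 0.0711, 0.201]),
        (["could", "barely", "contain"], [-0.0448, 0.108, 0.205]),
        (["got", "close"], [0.0118, -0.0152]),
    ]
    for score, tokens in top_contexts(demo, k=2):
        print(f"{score:+.3f}  {' '.join(tokens)}")
```

The MIN ACTIVATIONS listing could presumably be built the same way with `min` and an ascending sort, but nothing in this dump confirms that.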

MIN ACTIVATIONS
MIN = -0.218

you go , my d arl ings . Enjoy your lunch
and saw a big lion coming out of the house .
she saw a big cloud coming . It was going to
heard a deep rum ble coming from within the volcano .
and see another fire truck coming . They wonder what is
face and a pain - free tooth . <|endoftext|> Once upon
Fin saw a big shark coming near . Fin sw am
and saw a big bear coming out of the woods .
family heard a me ow coming from the garden . They
a hole and some stuffing coming out . It was not
He sees a soft light coming from his balloon .
. He saw a car coming fast . He did not
noise and saw an arrow coming right towards him ! He
and saw fire and lava coming out of the volcano .
heard Mom my and Daddy coming down the stairs . They
, brown , hairy bear coming out of the woods .
he saw a big wave coming towards the shore .
She sees a big truck coming . It stops near the
and saw a big wave coming towards them . They were
and Sue see the water coming . They are scared .

INTERVAL 0.099 - 0.205
CONTAINS 0.916%

years old and loved to explore the world around her .
a fun new style of dress and a sour treat for
It was so fun to see what it was doing each
do . He liked to dream about racing cars , playing
want to play hide and seek ?" " Yes

INTERVAL -0.007 - 0.099
CONTAINS 59.026%

he found other things to do , like build sand cast
. Every day he liked to play outside in the sun
The owl smiled and said , " I 'm
On this special day , Sue decided to design a
in the sauce jar . He tried to take it out

INTERVAL -0.113 - -0.007
CONTAINS 39.674%

to bake . It didn 't know what to do !
Finally , one of his friends said , "
g iggle . They did not mind being wet . They
. She wanted it for her fort . She ran to
much . I thought you would like a surprise . Do

INTERVAL -0.218 - -0.113
CONTAINS 0.384%

a trip , but she didn 't know what " prep
" Tom , are you okay ?" she asked . "
bag , it fell out of her hands and broke on
a while , the musician stopped playing and said to Lily
said they had to say sorry and be good .
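
NOTE (illustrative sketch, not the dashboard's own code)
The four INTERVAL blocks above look like equal-width bins spanning MIN = -0.218 to MAX = 0.205, and the CONTAINS percentages sum to 100%. Assuming that reading is right, the bin edges and per-bin shares could be computed roughly like this. The helper names are hypothetical, and the edges only match the printed ones up to rounding of MIN and MAX.

```python
from typing import Iterable, List


def interval_edges(lo: float, hi: float, n_bins: int = 4) -> List[float]:
    """Edges of n_bins equal-width intervals covering [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def interval_shares(activations: Iterable[float], lo: float, hi: float,
                    n_bins: int = 4) -> List[float]:
    """Percentage of activations falling in each equal-width interval."""
    values = list(activations)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for a in values:
        # Clamp so that a == hi lands in the last bin and a < lo in the first.
        idx = min(max(int((a - lo) / width), 0), n_bins - 1)
        counts[idx] += 1
    return [100.0 * c / len(values) for c in counts]


if __name__ == "__main__":
    print([round(e, 3) for e in interval_edges(-0.218, 0.205)])
    print(interval_shares([-0.2, -0.05, 0.0, 0.05, 0.2], -0.218, 0.205))
```

Bins are listed here from lowest to highest, whereas the blocks above run from the highest interval down.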
Token and
Token ID290
Feature activation+3.90e-02

Pos logit contributions
aster+0.041
rob+0.039
oph+0.038
le+0.034
 course+0.034
Neg logit contributions
arl-0.054
 name-0.052
 twins-0.048
 welcome-0.045
 sorry-0.044
Token enthusiastic
Token ID21848
Feature activation+5.75e-02

Pos logit contributions
aster+0.048
rob+0.046
oph+0.046
 course+0.043
le+0.041
Neg logit contributions
arl-0.070
 name-0.065
 twins-0.058
 sorry-0.057
pering-0.056
Token he
Token ID339
Feature activation+4.31e-02

Pos logit contributions
aster+0.078
rob+0.074
oph+0.071
 course+0.066
le+0.063
Neg logit contributions
arl-0.099
 name-0.091
 twins-0.083
 sorry-0.079
olving-0.078
Token could
Token ID714
Feature activation-4.48e-02

Pos logit contributions
aster+0.063
rob+0.060
oph+0.053
 course+0.051
mmmm+0.050
Neg logit contributions
 name-0.071
arl-0.062
 head-0.058
 twins-0.058
 curious-0.057
Token barely
Token ID8523
Feature activation+1.08e-01

Pos logit contributions
 name+0.069
arl+0.058
 welcome+0.054
 sorry+0.051
 head+0.049
Neg logit contributions
rob-0.082
oph-0.074
aster-0.066
 course-0.065
ha-0.065
Token contain
Token ID3994
Feature activation+2.05e-01

Pos logit contributions
rob+0.173
oph+0.154
 course+0.136
ha+0.129
mm+0.128
Neg logit contributions
 name-0.176
arl-0.164
 welcome-0.150
 sorry-0.143
free-0.137
Token himself
Token ID2241
Feature activation+7.97e-02

Pos logit contributions
rob+0.204
aster+0.201
oph+0.198
chair+0.173
 course+0.171
Neg logit contributions
arl-0.411
 name-0.393
 twins-0.358
 sorry-0.347
 welcome-0.346
Token.
Token ID13
Feature activation+1.79e-02

Pos logit contributions
aster+0.106
rob+0.104
oph+0.103
 course+0.098
le+0.092
Neg logit contributions
arl-0.134
 name-0.121
 twins-0.109
pering-0.105
 sorry-0.105
Token
Token ID198
Feature activation+3.35e-02

Pos logit contributions
aster+0.022
rob+0.022
oph+0.021
 course+0.021
hop+0.019
Neg logit contributions
arl-0.032
 name-0.028
 twins-0.025
pering-0.025
 bruised-0.025
Token
Token ID198
Feature activation+2.66e-02

Pos logit contributions
aster+0.049
oph+0.048
rob+0.047
 course+0.045
le+0.044
Neg logit contributions
arl-0.053
 name-0.047
 twins-0.042
pering-0.040
 sorry-0.040
TokenHe
Token ID1544
Feature activation+4.05e-02

Pos logit contributions
aster+0.032
oph+0.032
rob+0.031
le+0.029
 course+0.029
Neg logit contributions
arl-0.048
 name-0.044
 twins-0.039
pering-0.039
olving-0.038
TokenM
Token ID44
Feature activation+2.19e-02

Pos logit contributions
rob+0.020
aster+0.020
oph+0.020
le+0.019
ly+0.018
Neg logit contributions
arl-0.029
 name-0.025
 twins-0.023
pering-0.023
 welcome-0.022
Tokenaz
Token ID1031
Feature activation+7.18e-02

Pos logit contributions
rob+0.040
aster+0.039
oph+0.038
 course+0.037
le+0.037
Neg logit contributions
arl-0.029
 name-0.023
 twins-0.020
olving-0.019
 sorry-0.018
Tokenzy
Token ID7357
Feature activation+4.41e-02

Pos logit contributions
aster+0.097
rob+0.094
 tomorrow+0.092
oph+0.091
 course+0.090
Neg logit contributions
arl-0.113
 name-0.101
olving-0.089
 twins-0.089
ric-0.084
Token could
Token ID714
Feature activation-4.26e-02

Pos logit contributions
rob+0.056
aster+0.053
 course+0.045
oph+0.045
mmmm+0.044
Neg logit contributions
 name-0.077
arl-0.072
 twins-0.069
 head-0.066
 younger-0.064
Token barely
Token ID8523
Feature activation+1.05e-01

Pos logit contributions
 name+0.065
arl+0.056
 sorry+0.051
 welcome+0.049
 head+0.045
Neg logit contributions
rob-0.077
oph-0.068
ha-0.062
 course-0.061
aster-0.061
Token contain
Token ID3994
Feature activation+2.04e-01

Pos logit contributions
rob+0.183
oph+0.158
 course+0.141
mm+0.135
ha+0.133
Neg logit contributions
 name-0.165
arl-0.158
 welcome-0.140
 sorry-0.132
free-0.129
Token her
Token ID607
Feature activation+4.81e-02

Pos logit contributions
rob+0.274
aster+0.257
oph+0.254
chair+0.237
 course+0.230
Neg logit contributions
arl-0.354
 name-0.336
 twins-0.299
 welcome-0.288
pering-0.287
Token excitement
Token ID14067
Feature activation+7.64e-02

Pos logit contributions
aster+0.076
rob+0.075
oph+0.075
 course+0.070
le+0.068
Neg logit contributions
arl-0.070
 name-0.063
 twins-0.055
 sorry-0.053
 welcome-0.052
Token as
Token ID355
Feature activation-3.77e-02

Pos logit contributions
aster+0.105
rob+0.104
oph+0.101
 course+0.096
le+0.090
Neg logit contributions
arl-0.127
 name-0.115
 twins-0.102
pering-0.099
 sorry-0.098
Token she
Token ID673
Feature activation+1.60e-02

Pos logit contributions
arl+0.086
 name+0.078
 twins+0.072
pering+0.071
 bruised+0.070
Neg logit contributions
aster-0.029
rob-0.027
oph-0.027
 course-0.025
 tomorrow-0.023
Token watched
Token ID7342
Feature activation-1.43e-02

Pos logit contributions
aster+0.021
rob+0.020
oph+0.018
mmm+0.018
mmmm+0.018
Neg logit contributions
 name-0.027
arl-0.027
 twins-0.023
 head-0.022
 curious-0.022
Token so
Token ID523
Feature activation+5.65e-02

Pos logit contributions
aster+0.039
 then+0.037
 sat+0.036
oph+0.036
hop+0.036
Neg logit contributions
arl-0.058
 name-0.047
olving-0.043
 twins-0.041
arse-0.041
Token happy
Token ID3772
Feature activation+2.70e-02

Pos logit contributions
aster+0.084
oph+0.080
 sat+0.080
 then+0.079
 tomorrow+0.079
Neg logit contributions
arl-0.086
 name-0.069
olving-0.064
 twins-0.062
arse-0.061
Token she
Token ID673
Feature activation+2.43e-02

Pos logit contributions
aster+0.034
rob+0.034
oph+0.031
 course+0.029
le+0.028
Neg logit contributions
arl-0.047
 name-0.045
 twins-0.042
 sorry-0.039
pering-0.039
Token could
Token ID714
Feature activation-5.19e-02

Pos logit contributions
aster+0.036
rob+0.033
oph+0.030
 course+0.030
mmmm+0.029
Neg logit contributions
 name-0.038
arl-0.036
 curious-0.033
 sorry-0.033
 welcome-0.032
Token barely
Token ID8523
Feature activation+1.03e-01

Pos logit contributions
 name+0.080
arl+0.071
 welcome+0.064
 sorry+0.062
 head+0.059
Neg logit contributions
rob-0.092
oph-0.085
 course-0.077
aster-0.077
ha-0.073
Token contain
Token ID3994
Feature activation+2.03e-01

Pos logit contributions
rob+0.149
oph+0.138
 course+0.121
mmm+0.119
mm+0.118
Neg logit contributions
 name-0.171
arl-0.166
 welcome-0.151
 sorry-0.147
 crashing-0.141
Token herself
Token ID5223
Feature activation+8.32e-02

Pos logit contributions
rob+0.248
aster+0.238
oph+0.231
chair+0.214
 course+0.211
Neg logit contributions
arl-0.367
 name-0.349
 twins-0.313
 sorry-0.305
pering-0.304
Token!
Token ID0
Feature activation+5.44e-02

Pos logit contributions
aster+0.102
oph+0.096
rob+0.095
 course+0.093
le+0.090
Neg logit contributions
arl-0.153
 name-0.133
 twins-0.120
pering-0.118
 bruised-0.117
Token She
Token ID1375
Feature activation+3.22e-02

Pos logit contributions
aster+0.078
rob+0.077
oph+0.075
 course+0.073
le+0.070
Neg logit contributions
arl-0.087
 name-0.077
 twins-0.068
pering-0.066
 sorry-0.065
Token got
Token ID1392
Feature activation+1.18e-02

Pos logit contributions
aster+0.049
rob+0.049
mmmm+0.044
mm+0.043
mmm+0.042
Neg logit contributions
 name-0.050
 curious-0.044
 head-0.043
 twins-0.041
 younger-0.041
Token close
Token ID1969
Feature activation-1.52e-02

Pos logit contributions
aster+0.014
 sat+0.013
oph+0.013
 later+0.013
 he+0.013
Neg logit contributions
arl-0.023
 name-0.018
olving-0.018
pering-0.017
arse-0.017
Token dress
Token ID6576
Feature activation+9.79e-02

Pos logit contributions
arl+0.034
 name+0.033
 twins+0.028
 welcome+0.027
 sorry+0.026
Neg logit contributions
aster-0.034
rob-0.032
oph-0.031
 course-0.030
le-0.029
Token that
Token ID326
Feature activation+1.81e-02

Pos logit contributions
aster+0.150
rob+0.143
oph+0.136
 course+0.129
mmm+0.128
Neg logit contributions
arl-0.152
 name-0.144
 twins-0.124
olving-0.121
 undone-0.120
Token she
Token ID673
Feature activation+3.24e-02

Pos logit contributions
rob+0.019
aster+0.018
oph+0.017
le+0.016
 course+0.016
Neg logit contributions
 name-0.034
arl-0.034
 twins-0.031
 welcome-0.030
 sorry-0.030
Token could
Token ID714
Feature activation-5.22e-02

Pos logit contributions
aster+0.046
rob+0.044
oph+0.040
mmmm+0.040
 course+0.039
Neg logit contributions
 name-0.051
arl-0.049
 twins-0.046
 head-0.044
 younger-0.043
Token barely
Token ID8523
Feature activation+1.06e-01

Pos logit contributions
 name+0.075
arl+0.063
 head+0.056
 welcome+0.054
 undone+0.053
Neg logit contributions
rob-0.100
oph-0.094
aster-0.085
mmmm-0.083
cream-0.082
Token contain
Token ID3994
Feature activation+2.02e-01

Pos logit contributions
rob+0.188
oph+0.172
mmm+0.150
mm+0.149
mmmm+0.148
Neg logit contributions
 name-0.154
arl-0.140
 welcome-0.129
free-0.123
 undone-0.121
Token her
Token ID607
Feature activation+4.49e-02

Pos logit contributions
rob+0.282
aster+0.266
oph+0.255
chair+0.242
 course+0.234
Neg logit contributions
arl-0.332
 name-0.327
 twins-0.291
 welcome-0.278
 sorry-0.275
Token happiness
Token ID12157
Feature activation+4.65e-02

Pos logit contributions
aster+0.076
rob+0.074
oph+0.074
 course+0.070
le+0.067
Neg logit contributions
arl-0.061
 name-0.055
 twins-0.047
 sorry-0.045
 welcome-0.044
Token.
Token ID13
Feature activation+1.26e-02

Pos logit contributions
aster+0.061
rob+0.059
oph+0.059
 course+0.056
le+0.053
Neg logit contributions
arl-0.080
 name-0.072
 twins-0.064
pering-0.062
 sorry-0.062
Token She
Token ID1375
Feature activation+2.81e-02

Pos logit contributions
aster+0.018
oph+0.017
rob+0.017
 course+0.016
le+0.016
Neg logit contributions
arl-0.020
 name-0.018
 twins-0.016
pering-0.016
 sorry-0.015
Token couldn
Token ID3521
Feature activation-2.41e-02

Pos logit contributions
aster+0.044
rob+0.043
mmmm+0.041
mmm+0.039
mm+0.038
Neg logit contributions
 name-0.043
 head-0.038
 curious-0.036
 younger-0.035
 twins-0.035
Token before
Token ID878
Feature activation-5.68e-02

Pos logit contributions
rob+0.014
aster+0.014
oph+0.013
 course+0.012
mmm+0.012
Neg logit contributions
arl-0.015
 name-0.014
 twins-0.013
 welcome-0.013
pering-0.013
Token and
Token ID290
Feature activation+2.36e-02

Pos logit contributions
arl+0.090
 name+0.083
 twins+0.075
pering+0.073
 sorry+0.072
Neg logit contributions
rob-0.081
aster-0.080
oph-0.077
le-0.072
 course-0.072
Token she
Token ID673
Feature activation+7.64e-03

Pos logit contributions
rob+0.028
aster+0.025
oph+0.025
le+0.023
 course+0.022
Neg logit contributions
 name-0.043
arl-0.042
 sorry-0.039
 welcome-0.038
 twins-0.038
Token could
Token ID714
Feature activation-6.74e-02

Pos logit contributions
rob+0.011
aster+0.010
oph+0.009
mmm+0.009
mmmm+0.009
Neg logit contributions
 name-0.013
arl-0.012
 twins-0.011
 sorry-0.011
 head-0.011
Token hardly
Token ID8941
Feature activation+7.11e-02

Pos logit contributions
 name+0.100
arl+0.094
 welcome+0.080
 sorry+0.080
olving+0.076
Neg logit contributions
rob-0.115
oph-0.105
aster-0.099
ha-0.096
 course-0.094
Token contain
Token ID3994
Feature activation+2.01e-01

Pos logit contributions
rob+0.121
oph+0.101
 course+0.088
ha+0.086
mmm+0.083
Neg logit contributions
 name-0.119
arl-0.108
 welcome-0.101
 sorry-0.096
 head-0.093
Token her
Token ID607
Feature activation+5.07e-02

Pos logit contributions
rob+0.240
aster+0.237
oph+0.229
 course+0.209
chair+0.203
Neg logit contributions
arl-0.369
 name-0.345
 twins-0.312
pering-0.302
 sorry-0.302
Token excitement
Token ID14067
Feature activation+8.70e-02

Pos logit contributions
aster+0.077
rob+0.073
oph+0.073
 course+0.072
le+0.068
Neg logit contributions
arl-0.078
 name-0.067
 twins-0.059
pering-0.059
 bruised-0.057
Token.
Token ID13
Feature activation+1.16e-02

Pos logit contributions
rob+0.126
aster+0.125
oph+0.121
 course+0.116
le+0.107
Neg logit contributions
arl-0.137
 name-0.125
 twins-0.111
 sorry-0.107
pering-0.107
Token She
Token ID1375
Feature activation+3.26e-02

Pos logit contributions
aster+0.016
oph+0.016
rob+0.015
 course+0.015
 tomorrow+0.014
Neg logit contributions
arl-0.019
 name-0.017
 twins-0.015
pering-0.015
 bruised-0.015
Token skipped
Token ID26684
Feature activation+5.94e-02

Pos logit contributions
rob+0.053
aster+0.052
mmmm+0.047
mm+0.045
mmm+0.045
Neg logit contributions
 name-0.049
 head-0.042
 curious-0.042
 twins-0.040
 younger-0.039
Token puppy
Token ID26188
Feature activation+1.82e-02

Pos logit contributions
arl+0.020
 name+0.017
pering+0.015
 twins+0.015
 sorry+0.015
Neg logit contributions
aster-0.019
rob-0.019
oph-0.019
 course-0.018
 tomorrow-0.017
Token.
Token ID13
Feature activation+5.44e-03

Pos logit contributions
aster+0.022
rob+0.021
oph+0.019
 course+0.018
le+0.017
Neg logit contributions
arl-0.032
 name-0.032
 twins-0.029
 sorry-0.028
 welcome-0.027
Token She
Token ID1375
Feature activation+2.62e-02

Pos logit contributions
aster+0.007
oph+0.007
rob+0.007
 course+0.007
 tomorrow+0.007
Neg logit contributions
arl-0.009
 name-0.008
 twins-0.007
pering-0.007
 bruised-0.007
Token could
Token ID714
Feature activation-4.48e-02

Pos logit contributions
rob+0.041
aster+0.040
mmmm+0.034
oph+0.034
mm+0.033
Neg logit contributions
 name-0.041
 head-0.035
 curious-0.035
 twins-0.034
arl-0.034
Token barely
Token ID8523
Feature activation+1.05e-01

Pos logit contributions
 name+0.070
arl+0.056
 sorry+0.053
 welcome+0.052
 head+0.050
Neg logit contributions
rob-0.081
oph-0.072
ha-0.066
aster-0.066
 course-0.064
Token contain
Token ID3994
Feature activation+2.01e-01

Pos logit contributions
rob+0.170
oph+0.143
mm+0.127
 course+0.126
ops+0.123
Neg logit contributions
 name-0.175
arl-0.160
 welcome-0.150
 sorry-0.143
free-0.137
Token her
Token ID607
Feature activation+4.70e-02

Pos logit contributions
rob+0.268
aster+0.256
oph+0.241
chair+0.231
ha+0.219
Neg logit contributions
arl-0.342
 name-0.334
 twins-0.297
pering-0.287
 welcome-0.286
Token excitement
Token ID14067
Feature activation+7.99e-02

Pos logit contributions
aster+0.069
rob+0.067
oph+0.067
 course+0.065
le+0.062
Neg logit contributions
arl-0.074
 name-0.064
 twins-0.057
pering-0.056
 bruised-0.055
Token.
Token ID13
Feature activation+8.76e-03

Pos logit contributions
aster+0.091
rob+0.091
oph+0.087
 course+0.082
le+0.075
Neg logit contributions
arl-0.150
 name-0.139
 twins-0.126
pering-0.122
 sorry-0.121
Token She
Token ID1375
Feature activation+2.95e-02

Pos logit contributions
aster+0.013
rob+0.012
oph+0.012
 course+0.012
le+0.011
Neg logit contributions
arl-0.014
 name-0.013
 twins-0.011
pering-0.011
 sorry-0.011
Token hugged
Token ID41130
Feature activation-2.16e-02

Pos logit contributions
aster+0.046
rob+0.045
mmmm+0.040
mm+0.039
mmm+0.039
Neg logit contributions
 name-0.047
 head-0.040
 curious-0.039
 twins-0.038
 younger-0.037
Token so
Token ID523
Feature activation+4.52e-02

Pos logit contributions
aster+0.008
 course+0.007
 then+0.007
 tomorrow+0.007
 later+0.007
Neg logit contributions
arl-0.009
 name-0.008
pering-0.007
olving-0.007
arse-0.007
Token happy
Token ID3772
Feature activation+1.99e-02

Pos logit contributions
aster+0.075
oph+0.070
 course+0.068
 tomorrow+0.068
 he+0.068
Neg logit contributions
arl-0.062
 name-0.051
pering-0.047
 twins-0.045
olving-0.044
Token he
Token ID339
Feature activation+3.23e-02

Pos logit contributions
aster+0.031
rob+0.031
oph+0.030
le+0.026
ha+0.026
Neg logit contributions
 name-0.028
arl-0.027
 twins-0.026
 sorry-0.023
 welcome-0.023
Token could
Token ID714
Feature activation-5.20e-02

Pos logit contributions
aster+0.049
rob+0.047
 course+0.043
oph+0.042
mmmm+0.042
Neg logit contributions
 name-0.048
arl-0.042
 welcome-0.041
 sorry-0.041
 curious-0.041
Token hardly
Token ID8941
Feature activation+7.60e-02

Pos logit contributions
 name+0.079
arl+0.071
 welcome+0.062
 sorry+0.062
 head+0.059
Neg logit contributions
rob-0.090
oph-0.087
aster-0.076
 course-0.076
ha-0.073
Token contain
Token ID3994
Feature activation+2.01e-01

Pos logit contributions
rob+0.105
oph+0.096
 course+0.083
ha+0.080
aster+0.078
Neg logit contributions
 name-0.135
arl-0.130
 welcome-0.116
 sorry-0.115
free-0.109
Token his
Token ID465
Feature activation+7.60e-02

Pos logit contributions
rob+0.227
oph+0.221
aster+0.218
chair+0.199
ha+0.195
Neg logit contributions
 name-0.372
arl-0.370
 twins-0.337
 welcome-0.328
 sorry-0.328
Token excitement
Token ID14067
Feature activation+8.41e-02

Pos logit contributions
aster+0.131
rob+0.128
oph+0.128
 course+0.122
le+0.118
Neg logit contributions
arl-0.100
 name-0.088
 twins-0.076
 sorry-0.072
pering-0.071
Token.
Token ID13
Feature activation+9.94e-03

Pos logit contributions
rob+0.123
aster+0.122
oph+0.118
 course+0.113
hop+0.104
Neg logit contributions
arl-0.131
 name-0.121
 twins-0.108
 sorry-0.102
pering-0.102
Token He
Token ID679
Feature activation+4.70e-02

Pos logit contributions
aster+0.013
rob+0.013
oph+0.013
 course+0.012
 tomorrow+0.011
Neg logit contributions
arl-0.017
 name-0.015
 twins-0.013
pering-0.013
 bruised-0.013
Token spent
Token ID3377
Feature activation+1.20e-01

Pos logit contributions
aster+0.072
rob+0.072
mmmm+0.067
mmm+0.064
mm+0.063
Neg logit contributions
 name-0.072
 head-0.062
 twins-0.061
 curious-0.061
 younger-0.059
Token excited
Token ID6568
Feature activation+6.01e-02

Pos logit contributions
rob+0.083
aster+0.080
oph+0.080
 course+0.076
le+0.072
Neg logit contributions
arl-0.075
 name-0.069
 sorry-0.060
 twins-0.059
 welcome-0.058
Token -
Token ID532
Feature activation-1.18e-02

Pos logit contributions
rob+0.091
aster+0.088
oph+0.081
 course+0.079
mm+0.075
Neg logit contributions
arl-0.087
 name-0.086
 twins-0.079
 attention-0.074
 crashing-0.071
Token she
Token ID673
Feature activation+2.17e-02

Pos logit contributions
arl+0.023
 name+0.021
 twins+0.019
 sorry+0.019
pering+0.019
Neg logit contributions
aster-0.013
rob-0.012
oph-0.012
 course-0.011
le-0.011
Token could
Token ID714
Feature activation-4.95e-02

Pos logit contributions
rob+0.029
aster+0.027
oph+0.026
 course+0.025
mmmm+0.023
Neg logit contributions
 name-0.037
arl-0.036
 twins-0.032
 welcome-0.032
 sorry-0.032
Token hardly
Token ID8941
Feature activation+7.94e-02

Pos logit contributions
 name+0.075
arl+0.058
 welcome+0.057
 sorry+0.056
 head+0.055
Neg logit contributions
rob-0.092
oph-0.085
 course-0.074
ha-0.073
aster-0.072
Token contain
Token ID3994
Feature activation+2.00e-01

Pos logit contributions
rob+0.129
oph+0.110
 course+0.097
mm+0.091
ha+0.091
Neg logit contributions
 name-0.135
arl-0.124
 welcome-0.119
 sorry-0.112
free-0.107
Token herself
Token ID5223
Feature activation+8.94e-02

Pos logit contributions
rob+0.255
aster+0.243
oph+0.238
chair+0.218
ha+0.210
Neg logit contributions
arl-0.349
 name-0.342
 twins-0.304
 welcome-0.301
 sorry-0.294
Token.
Token ID13
Feature activation+1.51e-02

Pos logit contributions
aster+0.119
rob+0.119
oph+0.116
 course+0.109
le+0.102
Neg logit contributions
arl-0.149
 name-0.138
 twins-0.123
 sorry-0.119
pering-0.118
Token She
Token ID1375
Feature activation+3.29e-02

Pos logit contributions
aster+0.022
rob+0.021
oph+0.021
 course+0.020
le+0.019
Neg logit contributions
arl-0.024
 name-0.022
 twins-0.019
pering-0.019
 sorry-0.018
Token started
Token ID2067
Feature activation-3.31e-02

Pos logit contributions
rob+0.051
aster+0.047
mmmm+0.044
mm+0.043
oph+0.042
Neg logit contributions
 name-0.052
 head-0.044
 curious-0.044
 twins-0.043
 younger-0.042
Token to
Token ID284
Feature activation+4.92e-02

Pos logit contributions
arl+0.048
 name+0.043
 twins+0.037
pering+0.036
 sorry+0.036
Neg logit contributions
aster-0.052
rob-0.051
oph-0.051
 course-0.049
le-0.047
Tokenvious
Token ID1442
Feature activation+7.45e-02

Pos logit contributions
aster+0.024
oph+0.023
rob+0.022
 course+0.020
le+0.019
Neg logit contributions
arl-0.119
 name-0.111
 twins-0.103
pering-0.102
 sorry-0.101
Token that
Token ID326
Feature activation+1.39e-02

Pos logit contributions
aster+0.087
rob+0.083
oph+0.076
zed+0.069
 course+0.069
Neg logit contributions
arl-0.142
 name-0.129
 twins-0.125
pering-0.117
 crashing-0.116
Token he
Token ID339
Feature activation+4.76e-02

Pos logit contributions
aster+0.010
rob+0.009
oph+0.009
 course+0.008
le+0.007
Neg logit contributions
arl-0.032
 name-0.030
 twins-0.028
 sorry-0.028
 welcome-0.028
Token could
Token ID714
Feature activation-4.81e-02

Pos logit contributions
aster+0.068
rob+0.064
oph+0.061
 course+0.058
mmmm+0.056
Neg logit contributions
 name-0.075
arl-0.071
 twins-0.068
 sorry-0.066
 curious-0.065
Token hardly
Token ID8941
Feature activation+7.05e-02

Pos logit contributions
 name+0.065
arl+0.057
 crashing+0.054
 welcome+0.050
 sorry+0.049
Neg logit contributions
oph-0.093
rob-0.089
ly-0.079
ha-0.079
chair-0.078
Token contain
Token ID3994
Feature activation+2.00e-01

Pos logit contributions
rob+0.122
oph+0.116
 course+0.096
mm+0.096
mmm+0.096
Neg logit contributions
 name-0.108
arl-0.100
 welcome-0.092
 crashing-0.091
 sorry-0.087
Token himself
Token ID2241
Feature activation+7.75e-02

Pos logit contributions
rob+0.193
aster+0.191
oph+0.188
chair+0.168
 course+0.165
Neg logit contributions
arl-0.408
 name-0.389
 twins-0.357
pering-0.343
 sorry-0.342
Token.
Token ID13
Feature activation+1.96e-02

Pos logit contributions
rob+0.109
aster+0.107
oph+0.105
 course+0.100
chair+0.092
Neg logit contributions
arl-0.125
 name-0.116
 twins-0.103
 sorry-0.100
pering-0.099
Token He
Token ID679
Feature activation+4.51e-02

Pos logit contributions
aster+0.026
oph+0.025
rob+0.025
 course+0.024
 tomorrow+0.022
Neg logit contributions
arl-0.034
 name-0.030
 twins-0.027
pering-0.026
 bruised-0.026
Token said
Token ID531
Feature activation+1.23e-02

Pos logit contributions
aster+0.068
rob+0.067
mmmm+0.060
oph+0.058
mm+0.058
Neg logit contributions
 name-0.071
 curious-0.062
 twins-0.061
 head-0.061
 younger-0.061
Token to
Token ID284
Feature activation+2.11e-02

Pos logit contributions
rob+0.016
aster+0.016
oph+0.016
 course+0.015
le+0.014
Neg logit contributions
arl-0.021
 name-0.020
 sorry-0.018
 twins-0.017
 welcome-0.017
Token special
Token ID2041
Feature activation+5.26e-02

Pos logit contributions
arl+0.119
 name+0.096
 Deal+0.093
pering+0.092
olving+0.089
Neg logit contributions
aster-0.103
 later-0.097
 course-0.096
rob-0.095
 he-0.095
Token polish
Token ID25245
Feature activation+5.19e-02

Pos logit contributions
aster+0.082
rob+0.080
oph+0.079
 course+0.072
hop+0.071
Neg logit contributions
arl-0.076
 name-0.073
 twins-0.063
 welcome-0.060
olving-0.058
Token he
Token ID339
Feature activation+1.74e-02

Pos logit contributions
aster+0.077
oph+0.076
rob+0.071
hop+0.065
 course+0.065
Neg logit contributions
arl-0.078
 name-0.075
 twins-0.067
 bruised-0.064
free-0.063
Token couldn
Token ID3521
Feature activation-2.02e-02

Pos logit contributions
rob+0.027
oph+0.026
aster+0.025
 course+0.022
mmmm+0.022
Neg logit contributions
 name-0.026
arl-0.023
 twins-0.022
 welcome-0.021
 head-0.021
Token't
Token ID470
Feature activation-1.74e-02

Pos logit contributions
 name+0.015
arl+0.012
 undone+0.011
free+0.010
 welcome+0.010
Neg logit contributions
oph-0.047
rob-0.045
 hours-0.043
aster-0.042
 imagined-0.041
Token contain
Token ID3994
Feature activation+2.00e-01

Pos logit contributions
 name+0.019
free+0.016
 undone+0.016
 welcome+0.016
 head+0.015
Neg logit contributions
rob-0.040
oph-0.035
aw-0.033
ted-0.031
oline-0.030
Token his
Token ID465
Feature activation+7.99e-02

Pos logit contributions
rob+0.247
oph+0.240
aster+0.225
chair+0.211
ha+0.208
Neg logit contributions
arl-0.359
 name-0.357
 twins-0.316
 sorry-0.311
 welcome-0.308
Token happiness
Token ID12157
Feature activation+5.75e-02

Pos logit contributions
aster+0.142
rob+0.142
oph+0.140
 course+0.133
le+0.128
Neg logit contributions
arl-0.099
 name-0.090
 twins-0.076
 sorry-0.072
 welcome-0.070
Token.
Token ID13
Feature activation+1.17e-02

Pos logit contributions
aster+0.078
rob+0.076
oph+0.074
 course+0.071
hop+0.065
Neg logit contributions
arl-0.097
 name-0.089
 twins-0.079
 sorry-0.076
olving-0.076
Token He
Token ID679
Feature activation+4.67e-02

Pos logit contributions
aster+0.015
rob+0.015
oph+0.015
 course+0.014
le+0.014
Neg logit contributions
arl-0.020
 name-0.018
 twins-0.016
pering-0.015
 sorry-0.015
Token began
Token ID2540
Feature activation-4.07e-02

Pos logit contributions
rob+0.067
aster+0.062
oph+0.060
mmmm+0.058
mmm+0.056
Neg logit contributions
 name-0.077
 head-0.066
 twins-0.063
arl-0.063
 curious-0.062
Token couldn
Token ID3521
Feature activation-2.99e-02

Pos logit contributions
aster+0.063
rob+0.060
oph+0.055
 course+0.053
mmmm+0.050
Neg logit contributions
 name-0.081
arl-0.075
 twins-0.068
 younger-0.066
 curious-0.066
Tokenâ
Token ID22940
Feature activation+5.48e-02

Pos logit contributions
arl+0.026
 name+0.025
 sorry+0.017
 welcome+0.017
 crashing+0.017
Neg logit contributions
oph-0.064
 hours-0.061
 imagined-0.061
rob-0.060
aster-0.059
Token
Token ID26391
Feature activation+6.38e-02

Pos logit contributions
oph+0.086
 seas+0.081
 hours+0.081
rob+0.081
der+0.080
Neg logit contributions
arl-0.075
 name-0.068
 welcome-0.064
free-0.057
 Fighting-0.056
Token
Token ID8151
Feature activation+1.37e-02

Pos logit contributions
aster+0.095
oph+0.093
rob+0.093
 course+0.091
le+0.091
Neg logit contributions
arl-0.094
 name-0.088
 twins-0.077
 sorry-0.073
pering-0.072
Tokent
Token ID83
Feature activation+8.35e-04

Pos logit contributions
aster+0.013
rob+0.013
oph+0.012
 course+0.012
le+0.011
Neg logit contributions
arl-0.028
 name-0.025
 twins-0.023
pering-0.023
 sorry-0.023
Token contain
Token ID3994
Feature activation+2.00e-01

Pos logit contributions
rob+0.002
oph+0.002
ons+0.001
 seas+0.001
aw+0.001
Neg logit contributions
 name-0.001
arl-0.001
free-0.001
 welcome-0.001
 crashing-0.001
Token himself
Token ID2241
Feature activation+7.63e-02

Pos logit contributions
rob+0.215
oph+0.203
aster+0.199
chair+0.191
ha+0.177
Neg logit contributions
arl-0.394
 name-0.380
 twins-0.345
 welcome-0.334
 sorry-0.330
Token.
Token ID13
Feature activation+9.81e-03

Pos logit contributions
aster+0.100
rob+0.098
oph+0.097
 course+0.093
le+0.087
Neg logit contributions
arl-0.131
 name-0.117
 twins-0.105
pering-0.102
 sorry-0.101
Token He
Token ID679
Feature activation+3.94e-02

Pos logit contributions
aster+0.012
oph+0.011
rob+0.011
 course+0.011
 tomorrow+0.010
Neg logit contributions
arl-0.018
 name-0.016
 twins-0.014
pering-0.014
 bruised-0.014
Token yelled
Token ID22760
Feature activation-1.28e-03

Pos logit contributions
aster+0.057
rob+0.055
mmmm+0.049
oph+0.048
mmm+0.047
Neg logit contributions
 name-0.066
 head-0.055
 younger-0.055
 curious-0.055
 twins-0.054
Token,
Token ID11
Feature activation-1.74e-02

Pos logit contributions
arl+0.002
 name+0.002
 twins+0.002
pering+0.002
 bruised+0.002
Neg logit contributions
aster-0.002
oph-0.002
rob-0.002
 course-0.001
le-0.001
Token so
Token ID523
Feature activation+5.53e-02

Pos logit contributions
aster+0.036
oph+0.033
rob+0.032
 course+0.032
 tomorrow+0.031
Neg logit contributions
arl-0.059
 name-0.051
olving-0.046
 twins-0.045
pering-0.045
Token happy
Token ID3772
Feature activation+2.26e-02

Pos logit contributions
aster+0.092
oph+0.089
rob+0.086
 course+0.084
 tomorrow+0.083
Neg logit contributions
arl-0.076
 name-0.064
 twins-0.056
pering-0.055
olving-0.054
Token she
Token ID673
Feature activation+2.65e-02

Pos logit contributions
rob+0.029
aster+0.029
oph+0.027
 course+0.025
le+0.024
Neg logit contributions
arl-0.038
 name-0.037
 twins-0.034
pering-0.031
 sorry-0.031
Token could
Token ID714
Feature activation-5.52e-02

Pos logit contributions
aster+0.038
rob+0.035
 course+0.033
oph+0.032
mmmm+0.031
Neg logit contributions
 name-0.041
arl-0.041
 sorry-0.036
 twins-0.036
 welcome-0.035
Token hardly
Token ID8941
Feature activation+6.67e-02

Pos logit contributions
 name+0.087
arl+0.076
 welcome+0.070
 sorry+0.066
 head+0.065
Neg logit contributions
rob-0.097
oph-0.086
 course-0.078
aster-0.076
aw-0.075
Token contain
Token ID3994
Feature activation+1.99e-01

Pos logit contributions
rob+0.094
oph+0.080
 course+0.071
ha+0.067
aster+0.066
Neg logit contributions
 name-0.120
arl-0.116
 welcome-0.106
 sorry-0.102
free-0.100
Token her
Token ID607
Feature activation+4.61e-02

Pos logit contributions
rob+0.260
aster+0.243
oph+0.232
chair+0.223
 course+0.217
Neg logit contributions
arl-0.345
 name-0.334
 twins-0.301
 sorry-0.289
 welcome-0.288
Token excitement
Token ID14067
Feature activation+7.86e-02

Pos logit contributions
aster+0.065
oph+0.062
 course+0.061
rob+0.061
le+0.058
Neg logit contributions
arl-0.077
 name-0.063
pering-0.060
olving-0.058
 twins-0.057
Token!
Token ID0
Feature activation+5.26e-02

Pos logit contributions
aster+0.102
rob+0.099
oph+0.098
 course+0.094
le+0.089
Neg logit contributions
arl-0.136
 name-0.121
 twins-0.109
pering-0.106
 sorry-0.105
Token She
Token ID1375
Feature activation+3.37e-02

Pos logit contributions
aster+0.072
oph+0.070
rob+0.070
 course+0.067
 tomorrow+0.063
Neg logit contributions
arl-0.087
 name-0.077
 twins-0.069
pering-0.068
 bruised-0.066
Token carefully
Token ID7773
Feature activation+3.06e-02

Pos logit contributions
aster+0.051
rob+0.050
mmmm+0.044
mm+0.042
mmm+0.042
Neg logit contributions
 name-0.054
 head-0.045
 twins-0.045
 curious-0.044
arl-0.044
Token cartoon
Token ID16251
Feature activation-5.41e-04

Pos logit contributions
aster+0.103
rob+0.099
oph+0.098
 course+0.098
le+0.094
Neg logit contributions
arl-0.117
 name-0.097
pering-0.090
 twins-0.087
olving-0.086
Token.
Token ID13
Feature activation-3.70e-03

Pos logit contributions
arl+0.001
 name+0.001
 twins+0.001
pering+0.001
 sorry+0.001
Neg logit contributions
aster-0.001
rob-0.001
oph-0.001
 course-0.001
le-0.001
Token He
Token ID679
Feature activation+2.62e-02

Pos logit contributions
arl+0.007
 name+0.006
 twins+0.006
 bruised+0.006
pering+0.006
Neg logit contributions
aster-0.004
oph-0.004
 course-0.004
rob-0.004
 he-0.004
Token could
Token ID714
Feature activation-5.55e-02

Pos logit contributions
rob+0.047
aster+0.043
ock+0.039
 course+0.039
mmmm+0.039
Neg logit contributions
 name-0.038
 curious-0.035
 head-0.033
 puzzled-0.033
 nos-0.032
Token barely
Token ID8523
Feature activation+1.01e-01

Pos logit contributions
 name+0.081
arl+0.065
 sorry+0.062
 head+0.059
 curious+0.058
Neg logit contributions
rob-0.111
oph-0.103
 course-0.094
ha-0.093
mmmm-0.087
Token contain
Token ID3994
Feature activation+1.99e-01

Pos logit contributions
rob+0.168
oph+0.147
 course+0.133
mm+0.125
ha+0.123
Neg logit contributions
 name-0.166
arl-0.156
 welcome-0.137
 sorry-0.136
free-0.131
Token his
Token ID465
Feature activation+5.83e-02

Pos logit contributions
rob+0.178
aster+0.174
oph+0.171
 course+0.152
chair+0.146
Neg logit contributions
arl-0.419
 name-0.400
 twins-0.365
pering-0.355
 sorry-0.354
Token excitement
Token ID14067
Feature activation+7.77e-02

Pos logit contributions
aster+0.089
rob+0.085
 course+0.085
oph+0.085
le+0.081
Neg logit contributions
arl-0.090
 name-0.073
pering-0.067
olving-0.066
 twins-0.065
Token as
Token ID355
Feature activation-4.07e-02

Pos logit contributions
aster+0.090
rob+0.086
oph+0.086
 course+0.081
le+0.077
Neg logit contributions
arl-0.146
 name-0.131
 twins-0.119
pering-0.116
 sorry-0.115
Token he
Token ID339
Feature activation+1.86e-02

Pos logit contributions
arl+0.115
 name+0.105
 twins+0.099
pering+0.098
 bruised+0.097
Neg logit contributions
aster-0.011
rob-0.006
oph-0.006
 course-0.006
 he-0.005
Token sat
Token ID3332
Feature activation-1.63e-02

Pos logit contributions
aster+0.025
rob+0.024
mmmm+0.023
oph+0.023
mmm+0.022
Neg logit contributions
 name-0.030
arl-0.028
 head-0.025
 curious-0.025
tails-0.025
Token The
Token ID383
Feature activation+2.18e-02

Pos logit contributions
rob+0.060
aster+0.059
 course+0.055
oph+0.055
le+0.055
Neg logit contributions
arl-0.059
 name-0.055
 twins-0.048
 sorry-0.045
pering-0.045
Token two
Token ID734
Feature activation+4.79e-02

Pos logit contributions
oph+0.032
 course+0.032
 later+0.030
ly+0.030
 then+0.030
Neg logit contributions
arl-0.035
pering-0.028
 name-0.026
 crashing-0.026
ric-0.025
Token friends
Token ID2460
Feature activation+7.48e-03

Pos logit contributions
aster+0.073
rob+0.070
oph+0.067
 course+0.063
hop+0.062
Neg logit contributions
arl-0.073
 name-0.067
 twins-0.063
 sorry-0.058
 bruised-0.057
Token could
Token ID714
Feature activation-5.05e-02

Pos logit contributions
rob+0.010
aster+0.009
oph+0.008
mmmm+0.008
mmm+0.008
Neg logit contributions
 name-0.013
arl-0.012
 twins-0.012
 curious-0.011
 crashing-0.011
Token hardly
Token ID8941
Feature activation+6.66e-02

Pos logit contributions
 name+0.080
arl+0.065
 welcome+0.060
 sorry+0.058
 curious+0.058
Neg logit contributions
rob-0.091
aster-0.080
oph-0.077
ha-0.075
 course-0.073
Token contain
Token ID3994
Feature activation+1.99e-01

Pos logit contributions
rob+0.114
oph+0.095
 course+0.086
aster+0.083
ha+0.081
Neg logit contributions
 name-0.111
arl-0.097
 welcome-0.094
 sorry-0.090
free-0.088
Token their
Token ID511
Feature activation+2.68e-02

Pos logit contributions
aster+0.229
oph+0.220
 course+0.218
 later+0.210
 tomorrow+0.209
Neg logit contributions
arl-0.381
 name-0.319
 bruised-0.299
pering-0.292
 twins-0.290
Token joy
Token ID8716
Feature activation+4.46e-02

Pos logit contributions
aster+0.037
oph+0.036
rob+0.036
 course+0.035
le+0.034
Neg logit contributions
arl-0.044
 name-0.038
pering-0.034
 twins-0.034
olving-0.033
Token,
Token ID11
Feature activation+3.12e-02

Pos logit contributions
aster+0.050
rob+0.048
oph+0.048
 course+0.045
le+0.043
Neg logit contributions
arl-0.085
 name-0.077
 twins-0.070
pering-0.068
 sorry-0.068
Token and
Token ID290
Feature activation+4.23e-02

Pos logit contributions
aster+0.035
rob+0.033
oph+0.033
 course+0.031
le+0.029
Neg logit contributions
arl-0.060
 name-0.055
 twins-0.050
pering-0.048
 sorry-0.048
Token hugged
Token ID41130
Feature activation-3.76e-02

Pos logit contributions
aster+0.042
rob+0.041
oph+0.040
 course+0.037
le+0.034
Neg logit contributions
arl-0.085
 name-0.079
 twins-0.073
 sorry-0.071
pering-0.070
Token so
Token ID523
Feature activation+5.29e-02

Pos logit contributions
aster+0.025
oph+0.023
 then+0.022
 course+0.022
 tomorrow+0.022
Neg logit contributions
arl-0.044
 name-0.036
olving-0.033
 twins-0.033
pering-0.032
Token happy
Token ID3772
Feature activation+2.01e-02

Pos logit contributions
aster+0.075
oph+0.071
 tomorrow+0.070
 sat+0.070
 then+0.069
Neg logit contributions
arl-0.087
 name-0.069
olving-0.064
 twins-0.063
pering-0.063
Token she
Token ID673
Feature activation+2.83e-02

Pos logit contributions
aster+0.026
rob+0.026
oph+0.023
 course+0.022
le+0.021
Neg logit contributions
arl-0.034
 name-0.034
 twins-0.031
pering-0.029
 sorry-0.029
Token could
Token ID714
Feature activation-5.71e-02

Pos logit contributions
aster+0.042
rob+0.039
oph+0.036
 course+0.035
mmmm+0.033
Neg logit contributions
 name-0.043
arl-0.041
 curious-0.038
 welcome-0.037
 twins-0.037
Token hardly
Token ID8941
Feature activation+6.32e-02

Pos logit contributions
 name+0.090
arl+0.076
 welcome+0.075
 sorry+0.069
 head+0.068
Neg logit contributions
rob-0.101
oph-0.091
 course-0.081
aster-0.080
ha-0.078
Token contain
Token ID3994
Feature activation+1.98e-01

Pos logit contributions
rob+0.092
oph+0.080
 course+0.069
ha+0.067
aster+0.065
Neg logit contributions
 name-0.112
arl-0.105
 welcome-0.099
 sorry-0.094
 crashing-0.092
Token her
Token ID607
Feature activation+4.43e-02

Pos logit contributions
aster+0.211
rob+0.209
oph+0.200
 course+0.187
chair+0.177
Neg logit contributions
arl-0.388
 name-0.359
 twins-0.327
pering-0.318
 sorry-0.315
Token excitement
Token ID14067
Feature activation+7.74e-02

Pos logit contributions
aster+0.067
rob+0.065
oph+0.065
 course+0.063
le+0.060
Neg logit contributions
arl-0.068
 name-0.059
 twins-0.052
pering-0.051
 bruised-0.050
Token.
Token ID13
Feature activation+1.55e-02

Pos logit contributions
aster+0.100
rob+0.096
oph+0.096
 course+0.091
le+0.087
Neg logit contributions
arl-0.135
 name-0.121
 twins-0.108
pering-0.106
 sorry-0.104
Token\n
Token ID198
Feature activation+3.62e-02

Pos logit contributions
oph+0.019
aster+0.018
 course+0.017
rob+0.017
chair+0.017
Neg logit contributions
arl-0.028
 name-0.024
 bruised-0.023
pering-0.023
rier-0.022
Token\n
Token ID198
Feature activation+3.04e-02

Pos logit contributions
aster+0.057
rob+0.056
oph+0.056
 course+0.053
le+0.052
Neg logit contributions
arl-0.052
 name-0.047
 twins-0.041
pering-0.039
 welcome-0.039
Token cool
Token ID3608
Feature activation+3.77e-02

Pos logit contributions
rob+0.116
oph+0.111
aster+0.111
 course+0.106
le+0.102
Neg logit contributions
arl-0.058
 name-0.052
 sorry-0.044
 welcome-0.042
 twins-0.040
Token that
Token ID326
Feature activation+1.65e-02

Pos logit contributions
aster+0.043
rob+0.040
oph+0.039
 course+0.036
hop+0.033
Neg logit contributions
arl-0.073
 name-0.066
 twins-0.061
pering-0.060
 sorry-0.059
Token he
Token ID339
Feature activation+4.70e-02

Pos logit contributions
rob+0.011
oph+0.010
aster+0.009
 course+0.007
le+0.007
Neg logit contributions
arl-0.038
 name-0.038
 sorry-0.035
 twins-0.035
 welcome-0.035
Token could
Token ID714
Feature activation-5.17e-02

Pos logit contributions
rob+0.072
aster+0.071
oph+0.064
mmmm+0.062
 course+0.062
Neg logit contributions
 name-0.068
 head-0.061
 curious-0.060
arl-0.059
 nos-0.058
Token barely
Token ID8523
Feature activation+9.80e-02

Pos logit contributions
 name+0.074
arl+0.063
 head+0.055
 sorry+0.055
 welcome+0.053
Neg logit contributions
rob-0.101
oph-0.092
ha-0.081
aster-0.081
 course-0.081
Token contain
Token ID3994
Feature activation+1.98e-01

Pos logit contributions
rob+0.169
oph+0.155
 course+0.136
mmm+0.133
mm+0.131
Neg logit contributions
 name-0.143
arl-0.136
free-0.120
 welcome-0.119
 sorry-0.117
Token it
Token ID340
Feature activation+9.36e-02

Pos logit contributions
rob+0.224
oph+0.212
aster+0.209
chair+0.195
ha+0.189
Neg logit contributions
arl-0.375
 name-0.365
 twins-0.331
 sorry-0.318
 welcome-0.317
Token.
Token ID13
Feature activation+1.53e-02

Pos logit contributions
aster+0.136
rob+0.134
oph+0.130
 course+0.121
hop+0.115
Neg logit contributions
arl-0.145
 name-0.136
 twins-0.121
 sorry-0.118
pering-0.116
Token He
Token ID679
Feature activation+4.31e-02

Pos logit contributions
aster+0.021
rob+0.020
oph+0.020
 course+0.019
le+0.018
Neg logit contributions
arl-0.026
 name-0.023
 twins-0.020
pering-0.020
 sorry-0.020
Token was
Token ID373
Feature activation+2.97e-02

Pos logit contributions
rob+0.068
aster+0.066
mmmm+0.061
mmm+0.060
mm+0.058
Neg logit contributions
 name-0.063
 head-0.058
 curious-0.055
 younger-0.053
 tight-0.053
Token so
Token ID523
Feature activation+5.43e-02

Pos logit contributions
aster+0.033
oph+0.031
rob+0.030
 course+0.030
 then+0.029
Neg logit contributions
arl-0.059
 name-0.051
 twins-0.046
pering-0.046
olving-0.045
Token new
Token ID649
Feature activation-3.17e-02

Pos logit contributions
aster+0.011
oph+0.010
rob+0.010
 course+0.010
le+0.009
Neg logit contributions
arl-0.012
 name-0.011
 twins-0.009
 sorry-0.009
olving-0.009
Token board
Token ID3096
Feature activation+6.94e-02

Pos logit contributions
arl+0.048
 name+0.046
 twins+0.040
 welcome+0.038
olving+0.037
Neg logit contributions
aster-0.047
oph-0.046
rob-0.045
le-0.041
mmm-0.041
Token he
Token ID339
Feature activation+2.47e-02

Pos logit contributions
aster+0.115
oph+0.112
rob+0.109
mmmm+0.105
mmm+0.105
Neg logit contributions
arl-0.089
 name-0.081
 fastest-0.074
 twins-0.073
 undone-0.070
Token could
Token ID714
Feature activation-5.25e-02

Pos logit contributions
aster+0.036
oph+0.036
rob+0.035
mmmm+0.034
mmm+0.033
Neg logit contributions
 name-0.036
arl-0.033
 curious-0.031
 fastest-0.030
 head-0.030
Token hardly
Token ID8941
Feature activation+7.35e-02

Pos logit contributions
 name+0.071
arl+0.054
 head+0.052
 crashing+0.051
 welcome+0.051
Neg logit contributions
oph-0.103
rob-0.100
ha-0.086
mmmm-0.085
mm-0.083
Token contain
Token ID3994
Feature activation+1.98e-01

Pos logit contributions
rob+0.129
oph+0.117
 course+0.101
mm+0.098
ha+0.094
Neg logit contributions
 name-0.113
arl-0.104
 welcome-0.097
 undone-0.095
 head-0.093
Token himself
Token ID2241
Feature activation+7.49e-02

Pos logit contributions
rob+0.232
oph+0.223
aster+0.215
chair+0.195
 course+0.188
Neg logit contributions
arl-0.362
 name-0.362
 twins-0.323
 welcome-0.317
 sorry-0.311
Token.
Token ID13
Feature activation+1.72e-02

Pos logit contributions
aster+0.104
rob+0.103
oph+0.101
 course+0.097
le+0.090
Neg logit contributions
arl-0.122
 name-0.111
 twins-0.099
 sorry-0.095
pering-0.095
Token His
Token ID2399
Feature activation+3.18e-02

Pos logit contributions
aster+0.023
rob+0.023
oph+0.023
 course+0.022
le+0.021
Neg logit contributions
arl-0.029
 name-0.026
 twins-0.023
pering-0.022
 sorry-0.022
Token dad
Token ID9955
Feature activation+6.57e-02

Pos logit contributions
aster+0.053
oph+0.051
 course+0.051
rob+0.050
 tomorrow+0.048
Neg logit contributions
arl-0.044
 name-0.034
pering-0.033
 bruised-0.031
 twins-0.030
Token smiled
Token ID13541
Feature activation-1.23e-02

Pos logit contributions
aster+0.081
rob+0.078
oph+0.073
mmmm+0.069
mm+0.066
Neg logit contributions
 name-0.111
arl-0.105
 twins-0.102
 younger-0.095
 curious-0.094
Token excited
Token ID6568
Feature activation+6.50e-02

Pos logit contributions
aster+0.076
oph+0.074
rob+0.074
 course+0.072
le+0.069
Neg logit contributions
arl-0.063
 name-0.054
 twins-0.047
pering-0.046
olving-0.044
Token that
Token ID326
Feature activation+6.86e-03

Pos logit contributions
aster+0.103
rob+0.101
oph+0.085
hop+0.085
 course+0.084
Neg logit contributions
arl-0.092
 name-0.091
 twins-0.084
 attention-0.075
 welcome-0.073
Token she
Token ID673
Feature activation+2.77e-02

Pos logit contributions
rob+0.006
aster+0.006
oph+0.005
 course+0.005
le+0.005
Neg logit contributions
arl-0.015
 name-0.014
 twins-0.013
 sorry-0.012
 welcome-0.012
Token could
Token ID714
Feature activation-5.74e-02

Pos logit contributions
rob+0.036
aster+0.036
 course+0.031
oph+0.031
hop+0.029
Neg logit contributions
 name-0.049
arl-0.046
 twins-0.041
 sorry-0.041
 curious-0.041
Token barely
Token ID8523
Feature activation+9.94e-02

Pos logit contributions
 name+0.086
 welcome+0.060
arl+0.059
 head+0.057
 sorry+0.056
Neg logit contributions
rob-0.123
oph-0.109
aw-0.094
ha-0.094
 course-0.094
Token contain
Token ID3994
Feature activation+1.98e-01

Pos logit contributions
rob+0.170
oph+0.152
 course+0.132
mm+0.125
aw+0.124
Neg logit contributions
 name-0.157
arl-0.139
free-0.130
 sorry-0.129
 welcome-0.128
Token her
Token ID607
Feature activation+4.54e-02

Pos logit contributions
rob+0.215
aster+0.214
oph+0.204
 course+0.191
chair+0.183
Neg logit contributions
arl-0.384
 name-0.357
 twins-0.323
pering-0.313
 sorry-0.312
Token joy
Token ID8716
Feature activation+4.25e-02

Pos logit contributions
aster+0.069
 course+0.066
oph+0.066
rob+0.065
le+0.061
Neg logit contributions
arl-0.071
 name-0.058
pering-0.054
 bruised-0.052
 twins-0.051
Token.
Token ID13
Feature activation+2.36e-02

Pos logit contributions
aster+0.053
oph+0.051
rob+0.051
 course+0.048
le+0.047
Neg logit contributions
arl-0.076
 name-0.068
 twins-0.061
pering-0.060
 bruised-0.059
Token\n
Token ID198
Feature activation+3.99e-02

Pos logit contributions
oph+0.029
aster+0.026
 course+0.025
rob+0.025
 tomorrow+0.025
Neg logit contributions
arl-0.044
 name-0.037
pering-0.036
 bruised-0.036
olving-0.035
Token\n
Token ID198
Feature activation+3.33e-02

Pos logit contributions
aster+0.063
rob+0.062
oph+0.060
 course+0.058
le+0.057
Neg logit contributions
arl-0.058
 name-0.051
 twins-0.045
 welcome-0.042
pering-0.042
Token excited
Token ID6568
Feature activation+6.69e-02

Pos logit contributions
aster+0.078
oph+0.075
rob+0.074
 course+0.072
 tomorrow+0.071
Neg logit contributions
arl-0.070
 name-0.059
 twins-0.052
pering-0.052
olving-0.051
Token that
Token ID326
Feature activation+5.23e-03

Pos logit contributions
aster+0.112
rob+0.111
oph+0.102
 course+0.098
 behold+0.094
Neg logit contributions
 name-0.083
arl-0.079
 twins-0.075
 attention-0.070
 welcome-0.067
Token he
Token ID339
Feature activation+3.97e-02

Pos logit contributions
rob+0.003
oph+0.003
aster+0.003
 course+0.002
le+0.002
Neg logit contributions
 name-0.012
arl-0.012
 twins-0.011
 sorry-0.011
 welcome-0.011
Token could
Token ID714
Feature activation-5.77e-02

Pos logit contributions
aster+0.061
rob+0.056
 course+0.052
oph+0.052
mmmm+0.050
Neg logit contributions
 name-0.063
arl-0.057
 curious-0.055
 nos-0.053
 head-0.053
Token hardly
Token ID8941
Feature activation+6.48e-02

Pos logit contributions
 name+0.090
 welcome+0.068
arl+0.067
 head+0.063
 crashing+0.061
Neg logit contributions
rob-0.114
oph-0.102
ha-0.092
 course-0.090
ly-0.089
Token contain
Token ID3994
Feature activation+1.98e-01

Pos logit contributions
rob+0.097
oph+0.085
 course+0.074
ha+0.070
mm+0.068
Neg logit contributions
 name-0.115
arl-0.108
 welcome-0.100
 sorry-0.095
free-0.093
Token himself
Token ID2241
Feature activation+7.38e-02

Pos logit contributions
rob+0.228
oph+0.215
aster+0.208
chair+0.199
ha+0.191
Neg logit contributions
 name-0.366
arl-0.362
 twins-0.332
 welcome-0.321
 sorry-0.315
Token!
Token ID0
Feature activation+5.09e-02

Pos logit contributions
aster+0.098
rob+0.097
oph+0.095
 course+0.091
le+0.084
Neg logit contributions
arl-0.124
 name-0.114
 twins-0.102
pering-0.098
 sorry-0.098
Token Everywhere
Token ID48201
Feature activation+5.69e-03

Pos logit contributions
aster+0.064
oph+0.062
rob+0.061
 course+0.058
le+0.055
Neg logit contributions
arl-0.091
 name-0.081
 twins-0.073
pering-0.072
 bruised-0.071
Token he
Token ID339
Feature activation+4.16e-02

Pos logit contributions
aster+0.004
rob+0.003
oph+0.003
 course+0.003
hop+0.003
Neg logit contributions
arl-0.013
 name-0.013
 twins-0.012
 sorry-0.011
 welcome-0.011
Token went
Token ID1816
Feature activation+3.80e-02

Pos logit contributions
aster+0.054
mmm+0.051
rob+0.050
oph+0.049
mmmm+0.048
Neg logit contributions
 name-0.067
arl-0.064
 head-0.059
 younger-0.057
 nos-0.057
Token's
Token ID338
Feature activation+1.82e-02

Pos logit contributions
aster+0.052
rob+0.051
oph+0.051
 course+0.048
le+0.046
Neg logit contributions
arl-0.053
 name-0.048
 twins-0.043
 sorry-0.041
olving-0.040
Token so
Token ID523
Feature activation+8.44e-02

Pos logit contributions
aster+0.016
rob+0.015
 course+0.014
 later+0.013
oph+0.013
Neg logit contributions
arl-0.040
 name-0.035
pering-0.034
 twins-0.033
olving-0.032
Token special
Token ID2041
Feature activation+8.66e-02

Pos logit contributions
aster+0.124
 he+0.118
rob+0.115
oph+0.113
ha+0.113
Neg logit contributions
arl-0.130
pering-0.106
arse-0.101
 name-0.101
olving-0.101
Token about
Token ID546
Feature activation-3.44e-02

Pos logit contributions
aster+0.089
oph+0.087
rob+0.086
 course+0.076
hop+0.074
Neg logit contributions
arl-0.173
 name-0.169
 twins-0.149
 welcome-0.149
 sorry-0.144
Token Wed
Token ID3300
Feature activation+3.13e-02

Pos logit contributions
arl+0.065
 name+0.064
 twins+0.058
 sorry+0.056
pering+0.054
Neg logit contributions
oph-0.037
aster-0.036
rob-0.035
 course-0.032
le-0.031
Tokennes
Token ID2516
Feature activation+1.98e-01

Pos logit contributions
aster+0.030
rob+0.029
oph+0.029
 course+0.028
le+0.025
Neg logit contributions
arl-0.065
 name-0.058
pering-0.053
 twins-0.053
olving-0.053
Tokendays
Token ID12545
Feature activation+8.90e-02

Pos logit contributions
rob+0.210
aster+0.205
oph+0.205
 course+0.197
chair+0.173
Neg logit contributions
arl-0.384
 name-0.357
 twins-0.327
pering-0.313
 welcome-0.313
Token?"
Token ID1701
Feature activation+1.24e-01

Pos logit contributions
aster+0.116
rob+0.115
oph+0.109
 mun+0.105
 course+0.101
Neg logit contributions
arl-0.160
 name-0.153
 welcome-0.141
 twins-0.137
 sorry-0.127
Token\n
Token ID198
Feature activation+1.27e-02

Pos logit contributions
aster+0.187
oph+0.181
rob+0.181
 course+0.171
le+0.169
Neg logit contributions
arl-0.189
 name-0.171
 twins-0.149
 sorry-0.142
pering-0.141
Token\n
Token ID198
Feature activation+1.17e-02

Pos logit contributions
aster+0.019
oph+0.018
rob+0.018
 course+0.017
le+0.017
Neg logit contributions
arl-0.020
 name-0.018
 twins-0.015
pering-0.015
 sorry-0.015
TokenMax
Token ID11518
Feature activation-1.45e-02

Pos logit contributions
aster+0.018
oph+0.017
rob+0.017
 course+0.016
le+0.016
Neg logit contributions
arl-0.018
 name-0.016
 twins-0.014
pering-0.013
 sorry-0.013
Token you
Token ID345
Feature activation-4.09e-02

Pos logit contributions
aster+0.042
rob+0.042
oph+0.041
 course+0.038
le+0.037
Neg logit contributions
arl-0.073
 name-0.066
 twins-0.061
 sorry-0.059
pering-0.059
Token go
Token ID467
Feature activation-5.12e-02

Pos logit contributions
arl+0.063
 name+0.061
 welcome+0.053
 twins+0.052
 sorry+0.052
Neg logit contributions
rob-0.062
aster-0.060
oph-0.058
 course-0.054
chair-0.052
Token,
Token ID11
Feature activation-5.16e-02

Pos logit contributions
arl+0.069
 name+0.064
pering+0.055
 twins+0.055
 welcome+0.054
Neg logit contributions
oph-0.092
rob-0.084
aster-0.084
 course-0.080
mmm-0.078
Token my
Token ID616
Feature activation-1.41e-03

Pos logit contributions
arl+0.088
 name+0.080
 twins+0.071
pering+0.069
 sorry+0.069
Neg logit contributions
aster-0.068
rob-0.067
oph-0.066
 course-0.062
le-0.059
Token d
Token ID288
Feature activation-1.10e-01

Pos logit contributions
arl+0.002
 name+0.002
pering+0.001
 twins+0.001
olving+0.001
Neg logit contributions
aster-0.002
rob-0.002
oph-0.002
 course-0.002
 tomorrow-0.002
Tokenarl
Token ID7063
Feature activation-2.15e-01

Pos logit contributions
arl+0.078
 name+0.042
 twins+0.021
pering+0.020
rier+0.016
Neg logit contributions
rob-0.281
aster-0.273
 course-0.268
oph-0.267
 tomorrow-0.259
Tokenings
Token ID654
Feature activation-5.28e-02

Pos logit contributions
arl+0.287
 name+0.242
 twins+0.199
olving+0.189
rier+0.184
Neg logit contributions
aster-0.389
oph-0.368
rob-0.366
 course-0.359
 tomorrow-0.344
Token.
Token ID13
Feature activation-4.49e-02

Pos logit contributions
arl+0.091
 name+0.083
 welcome+0.074
 twins+0.073
 sorry+0.070
Neg logit contributions
aster-0.070
oph-0.069
rob-0.069
orse-0.061
 hours-0.061
Token Enjoy
Token ID18179
Feature activation-8.61e-03

Pos logit contributions
arl+0.065
 name+0.054
olving+0.050
pering+0.050
 twins+0.049
Neg logit contributions
aster-0.071
 course-0.066
rob-0.066
oph-0.065
 tomorrow-0.063
Token your
Token ID534
Feature activation+1.21e-02

Pos logit contributions
arl+0.012
 name+0.011
 welcome+0.009
 twins+0.009
 attention+0.009
Neg logit contributions
chair-0.014
rob-0.014
oph-0.014
path-0.013
aster-0.013
Token lunch
Token ID9965
Feature activation+1.36e-02

Pos logit contributions
aster+0.015
rob+0.015
oph+0.015
 course+0.014
le+0.013
Neg logit contributions
arl-0.021
 name-0.020
 twins-0.017
 sorry-0.017
 welcome-0.017
Token and
Token ID290
Feature activation-1.47e-02

Pos logit contributions
arl+0.011
 name+0.010
 twins+0.010
pering+0.009
 sorry+0.009
Neg logit contributions
aster-0.006
rob-0.006
oph-0.005
 course-0.005
le-0.005
Token saw
Token ID2497
Feature activation+2.04e-02

Pos logit contributions
arl+0.024
 name+0.024
 twins+0.022
 sorry+0.022
 bruised+0.022
Neg logit contributions
oph-0.018
rob-0.018
mmm-0.017
aster-0.017
mmmm-0.017
Token a
Token ID257
Feature activation-3.03e-02

Pos logit contributions
aster+0.023
rob+0.022
oph+0.021
 course+0.021
le+0.019
Neg logit contributions
arl-0.039
 name-0.035
 twins-0.032
pering-0.032
olving-0.031
Token big
Token ID1263
Feature activation-5.22e-02

Pos logit contributions
arl+0.052
olving+0.045
 fault+0.043
 Deal+0.043
pering+0.042
Neg logit contributions
 he-0.038
aster-0.038
ha-0.037
 woke-0.036
 course-0.035
Token lion
Token ID18744
Feature activation-7.84e-02

Pos logit contributions
arl+0.082
 name+0.070
olving+0.064
pering+0.061
 twins+0.061
Neg logit contributions
aster-0.078
rob-0.074
 course-0.074
oph-0.072
le-0.068
Token coming
Token ID2406
Feature activation-2.11e-01

Pos logit contributions
arl+0.154
 name+0.127
 Deal+0.120
olving+0.120
 younger+0.119
Neg logit contributions
aster-0.095
 course-0.090
rob-0.089
 he-0.088
le-0.087
Token out
Token ID503
Feature activation-5.48e-02

Pos logit contributions
arl+0.418
olving+0.372
 name+0.355
 fault+0.350
 Deal+0.349
Neg logit contributions
 later-0.228
 he-0.223
 him-0.214
le-0.211
ly-0.210
Token of
Token ID286
Feature activation-1.51e-01

Pos logit contributions
arl+0.137
 name+0.126
olving+0.118
 fault+0.117
 Deal+0.115
Neg logit contributions
aster-0.030
le-0.027
 later-0.026
 course-0.025
 he-0.024
Token the
Token ID262
Feature activation-5.02e-02

Pos logit contributions
arl+0.295
 name+0.267
 twins+0.239
olving+0.238
pering+0.235
Neg logit contributions
aster-0.170
rob-0.159
 course-0.156
oph-0.151
 later-0.141
Token house
Token ID2156
Feature activation-1.84e-02

Pos logit contributions
arl+0.072
free+0.064
rier+0.063
olving+0.061
arse+0.061
Neg logit contributions
 course-0.075
aster-0.074
chair-0.074
 later-0.069
 tomorrow-0.069
Token.
Token ID13
Feature activation-2.77e-02

Pos logit contributions
arl+0.036
 name+0.032
 Deal+0.028
olving+0.028
 welcome+0.028
Neg logit contributions
aster-0.021
 course-0.019
 later-0.019
rob-0.018
 tomorrow-0.018
Token she
Token ID673
Feature activation+1.69e-03

Pos logit contributions
 name+0.006
 sorry+0.006
 twins+0.006
arl+0.006
 welcome+0.006
Neg logit contributions
rob-0.006
oph-0.006
aster-0.006
orse-0.005
le-0.005
Token saw
Token ID2497
Feature activation+3.08e-02

Pos logit contributions
rob+0.002
aster+0.002
 course+0.002
oph+0.002
mmm+0.002
Neg logit contributions
 name-0.003
arl-0.003
 twins-0.003
 nos-0.002
 sorry-0.002
Token a
Token ID257
Feature activation-1.68e-02

Pos logit contributions
rob+0.050
oph+0.049
mmm+0.046
aster+0.046
 mun+0.045
Neg logit contributions
 twins-0.040
 name-0.039
 sorry-0.038
arl-0.036
 bruised-0.035
Token big
Token ID1263
Feature activation-4.01e-02

Pos logit contributions
arl+0.031
 Deal+0.028
pering+0.027
olving+0.026
girls+0.026
Neg logit contributions
 he-0.019
ha-0.018
aster-0.018
 replied-0.018
 course-0.018
Token cloud
Token ID6279
Feature activation-3.58e-02

Pos logit contributions
arl+0.065
 name+0.055
olving+0.050
pering+0.049
itating+0.049
Neg logit contributions
aster-0.058
 course-0.056
rob-0.055
oph-0.054
 tomorrow-0.051
Token coming
Token ID2406
Feature activation-2.08e-01

Pos logit contributions
arl+0.053
 name+0.047
 twins+0.041
pering+0.040
 sorry+0.039
Neg logit contributions
aster-0.055
rob-0.054
oph-0.054
 course-0.051
le-0.049
Token.
Token ID13
Feature activation-6.82e-03

Pos logit contributions
arl+0.342
 name+0.308
olving+0.287
 twins+0.272
 sorry+0.270
Neg logit contributions
 later-0.276
aster-0.272
 tomorrow-0.266
le-0.256
 sat-0.256
Token It
Token ID632
Feature activation+3.37e-02

Pos logit contributions
arl+0.011
 name+0.010
 twins+0.009
pering+0.009
 sorry+0.009
Neg logit contributions
aster-0.009
rob-0.009
oph-0.009
 course-0.009
le-0.008
Token was
Token ID373
Feature activation+8.10e-03

Pos logit contributions
aster+0.045
oph+0.042
rob+0.041
hop+0.037
mmm+0.037
Neg logit contributions
 name-0.056
arl-0.055
 twins-0.050
 nos-0.048
 curious-0.048
Token going
Token ID1016
Feature activation-1.15e-01

Pos logit contributions
 then+0.008
aster+0.008
 later+0.007
 sat+0.007
 tomorrow+0.007
Neg logit contributions
arl-0.017
arse-0.014
rier-0.014
 name-0.014
itating-0.014
Token to
Token ID284
Feature activation+1.47e-02

Pos logit contributions
arl+0.243
 name+0.214
 Deal+0.205
olving+0.200
 twins+0.200
Neg logit contributions
 later-0.107
 tomorrow-0.105
 he-0.103
rob-0.102
 sat-0.102
Token heard
Token ID2982
Feature activation+3.71e-02

Pos logit contributions
rob+0.012
aster+0.012
 course+0.011
oph+0.011
chair+0.010
Neg logit contributions
arl-0.022
 name-0.021
 twins-0.019
 sorry-0.018
pering-0.018
Token a
Token ID257
Feature activation-3.01e-03

Pos logit contributions
rob+0.069
oph+0.064
ops+0.062
mm+0.059
 course+0.059
Neg logit contributions
 name-0.048
arl-0.041
 sorry-0.040
 twins-0.039
 younger-0.037
Token deep
Token ID2769
Feature activation-3.52e-02

Pos logit contributions
arl+0.005
 fault+0.004
olving+0.004
 name+0.004
pering+0.004
Neg logit contributions
aster-0.004
 course-0.004
 replied-0.004
 he-0.004
oph-0.004
Token rum
Token ID7440
Feature activation-1.10e-01

Pos logit contributions
arl+0.053
 name+0.046
 twins+0.041
olving+0.041
pering+0.039
Neg logit contributions
aster-0.054
rob-0.051
 course-0.051
oph-0.051
le-0.050
Tokenble
Token ID903
Feature activation-2.67e-02

Pos logit contributions
arl+0.140
 name+0.122
 twins+0.101
olving+0.097
pering+0.096
Neg logit contributions
aster-0.195
rob-0.190
oph-0.186
 course-0.182
le-0.178
Token coming
Token ID2406
Feature activation-2.07e-01

Pos logit contributions
arl+0.048
 name+0.042
 twins+0.038
olving+0.037
pering+0.037
Neg logit contributions
aster-0.034
rob-0.032
oph-0.031
 course-0.031
le-0.030
Token from
Token ID422
Feature activation-8.22e-02

Pos logit contributions
arl+0.409
olving+0.348
 name+0.339
 Deal+0.326
 fault+0.323
Neg logit contributions
 later-0.227
aster-0.220
le-0.220
 he-0.219
 tomorrow-0.210
Token within
Token ID1626
Feature activation+5.59e-02

Pos logit contributions
arl+0.143
 name+0.128
 twins+0.115
pering+0.112
 sorry+0.111
Neg logit contributions
aster-0.106
rob-0.103
oph-0.103
 course-0.098
le-0.093
Token the
Token ID262
Feature activation-1.21e-02

Pos logit contributions
oph+0.078
rob+0.074
chair+0.070
aster+0.069
mmm+0.065
Neg logit contributions
arl-0.092
 name-0.084
 welcome-0.077
pering-0.076
 twins-0.075
Token volcano
Token ID32410
Feature activation-4.84e-02

Pos logit contributions
arl+0.020
 name+0.016
olving+0.015
 undone+0.015
pering+0.015
Neg logit contributions
aster-0.018
 course-0.017
rob-0.016
 he-0.016
chair-0.016
Token.
Token ID13
Feature activation+3.10e-03

Pos logit contributions
arl+0.093
 name+0.082
 twins+0.072
 sorry+0.072
olving+0.071
Neg logit contributions
aster-0.056
 course-0.054
rob-0.052
oph-0.051
 later-0.051
Token and
Token ID290
Feature activation-8.91e-03

Pos logit contributions
arl+0.124
 name+0.112
 twins+0.101
pering+0.099
 sorry+0.098
Neg logit contributions
aster-0.075
rob-0.073
oph-0.072
 course-0.069
le-0.064
Token see
Token ID766
Feature activation+6.05e-02

Pos logit contributions
arl+0.014
 name+0.014
 twins+0.012
 sorry+0.012
 bruised+0.011
Neg logit contributions
oph-0.013
rob-0.013
aster-0.012
 course-0.012
le-0.011
Token another
Token ID1194
Feature activation+2.99e-02

Pos logit contributions
aster+0.075
rob+0.072
oph+0.070
 course+0.069
le+0.065
Neg logit contributions
arl-0.109
 name-0.097
pering-0.086
 twins-0.086
olving-0.085
Token fire
Token ID2046
Feature activation-1.32e-01

Pos logit contributions
aster+0.047
rob+0.045
 course+0.044
oph+0.043
le+0.043
Neg logit contributions
arl-0.044
 name-0.037
pering-0.033
olving-0.032
 bruised-0.031
Token truck
Token ID7779
Feature activation-8.38e-02

Pos logit contributions
arl+0.196
 name+0.166
 undone+0.150
 Deal+0.149
 welcome+0.147
Neg logit contributions
aster-0.202
rob-0.198
le-0.190
 later-0.188
 course-0.186
Token coming
Token ID2406
Feature activation-2.07e-01

Pos logit contributions
arl+0.149
 name+0.133
 Deal+0.119
olving+0.117
 welcome+0.116
Neg logit contributions
aster-0.107
 later-0.098
rob-0.098
 course-0.097
 tomorrow-0.094
Token.
Token ID13
Feature activation-2.28e-02

Pos logit contributions
arl+0.399
olving+0.345
 name+0.344
 Deal+0.327
iqu+0.317
Neg logit contributions
 later-0.241
 tomorrow-0.231
aster-0.216
le-0.215
 then-0.211
Token They
Token ID1119
Feature activation-2.72e-02

Pos logit contributions
arl+0.042
 name+0.035
olving+0.034
arse+0.034
pering+0.033
Neg logit contributions
aster-0.028
oph-0.025
rob-0.024
chair-0.024
 tomorrow-0.024
Token wonder
Token ID4240
Feature activation+5.25e-02

Pos logit contributions
arl+0.047
 name+0.042
 twins+0.038
pering+0.037
 sorry+0.036
Neg logit contributions
aster-0.035
rob-0.035
oph-0.035
 course-0.033
le-0.031
Token what
Token ID644
Feature activation-7.72e-02

Pos logit contributions
aster+0.077
rob+0.076
oph+0.075
 course+0.071
hop+0.067
Neg logit contributions
arl-0.083
 name-0.074
 twins-0.066
 sorry-0.064
pering-0.063
Token is
Token ID318
Feature activation+3.54e-03

Pos logit contributions
arl+0.130
 name+0.125
 twins+0.115
 sorry+0.110
 welcome+0.109
Neg logit contributions
oph-0.097
aster-0.095
rob-0.095
 course-0.087
le-0.084
Token face
Token ID1986
Feature activation+4.31e-03

Pos logit contributions
arl+0.015
 name+0.012
olving+0.011
 twins+0.011
pering+0.011
Neg logit contributions
aster-0.011
oph-0.010
rob-0.010
 course-0.010
le-0.010
Token and
Token ID290
Feature activation+2.32e-02

Pos logit contributions
aster+0.005
rob+0.005
oph+0.005
 course+0.005
le+0.004
Neg logit contributions
arl-0.008
 name-0.007
 twins-0.006
pering-0.006
 sorry-0.006
Token a
Token ID257
Feature activation-7.83e-03

Pos logit contributions
aster+0.027
rob+0.026
oph+0.025
 course+0.024
le+0.023
Neg logit contributions
arl-0.044
 name-0.040
 twins-0.036
pering-0.035
 sorry-0.035
Token pain
Token ID2356
Feature activation-6.71e-02

Pos logit contributions
arl+0.013
pering+0.011
olving+0.010
 Fighting+0.010
asc+0.010
Neg logit contributions
aster-0.011
 he-0.011
 course-0.011
 replied-0.010
oph-0.010
Token-
Token ID12
Feature activation-9.69e-02

Pos logit contributions
arl+0.108
 name+0.097
 twins+0.087
 sorry+0.084
 bruised+0.084
Neg logit contributions
aster-0.095
rob-0.093
oph-0.093
 course-0.087
hop-0.081
Tokenfree
Token ID5787
Feature activation-2.07e-01

Pos logit contributions
arl+0.146
 name+0.130
 twins+0.109
free+0.108
 bruised+0.108
Neg logit contributions
oph-0.146
rob-0.145
aster-0.144
 course-0.135
 tomorrow-0.130
Token tooth
Token ID16162
Feature activation+1.34e-02

Pos logit contributions
arl+0.317
 name+0.278
 twins+0.246
pering+0.239
 sorry+0.235
Neg logit contributions
aster-0.311
rob-0.302
oph-0.300
 course-0.289
le-0.276
Token.
Token ID13
Feature activation+1.82e-03

Pos logit contributions
aster+0.017
rob+0.016
oph+0.016
 course+0.016
le+0.015
Neg logit contributions
arl-0.024
 name-0.021
 twins-0.019
pering-0.018
 welcome-0.018
Token<|endoftext|>
Token ID50256
Feature activation-3.34e-03

Pos logit contributions
aster+0.002
rob+0.002
oph+0.002
 course+0.002
 tomorrow+0.002
Neg logit contributions
arl-0.003
 name-0.003
pering-0.003
 twins-0.003
 bruised-0.002
TokenOnce
Token ID7454
Feature activation-2.65e-02

Pos logit contributions
arl+0.007
 name+0.006
rier+0.006
 sorry+0.005
 bruised+0.005
Neg logit contributions
rob-0.004
aster-0.003
 course-0.003
oph-0.003
 later-0.003
Token upon
Token ID2402
Feature activation-3.86e-02

Pos logit contributions
arl+0.028
 name+0.023
pering+0.019
 twins+0.018
 bruised+0.018
Neg logit contributions
aster-0.052
oph-0.051
rob-0.050
 course-0.049
 later-0.048
Token Fin
Token ID4463
Feature activation-3.68e-03

Pos logit contributions
aster+0.012
rob+0.011
oph+0.011
 course+0.010
le+0.010
Neg logit contributions
arl-0.023
 name-0.020
 twins-0.019
pering-0.018
olving-0.018
Token saw
Token ID2497
Feature activation+3.95e-02

Pos logit contributions
arl+0.007
 name+0.007
 twins+0.006
pering+0.006
 welcome+0.006
Neg logit contributions
aster-0.004
rob-0.004
oph-0.004
 course-0.004
chair-0.003
Token a
Token ID257
Feature activation-1.50e-02

Pos logit contributions
oph+0.063
rob+0.060
aster+0.059
chair+0.057
hop+0.057
Neg logit contributions
 name-0.055
arl-0.054
 twins-0.051
 sorry-0.049
 welcome-0.047
Token big
Token ID1263
Feature activation-3.43e-02

Pos logit contributions
arl+0.025
 Deal+0.021
olving+0.020
pering+0.020
 Fighting+0.020
Neg logit contributions
aster-0.019
 he-0.019
oph-0.019
 seas-0.019
ha-0.018
Token shark
Token ID21027
Feature activation-2.17e-02

Pos logit contributions
arl+0.060
 name+0.052
olving+0.047
 twins+0.047
pering+0.046
Neg logit contributions
aster-0.045
oph-0.043
rob-0.042
 course-0.042
 tomorrow-0.039
Token coming
Token ID2406
Feature activation-2.06e-01

Pos logit contributions
arl+0.043
 name+0.037
olving+0.035
 Deal+0.035
rier+0.034
Neg logit contributions
aster-0.024
 course-0.024
le-0.023
 he-0.022
 later-0.022
Token near
Token ID1474
Feature activation-7.75e-02

Pos logit contributions
arl+0.410
olving+0.358
 name+0.353
 Deal+0.339
 fault+0.331
Neg logit contributions
 later-0.222
 tomorrow-0.209
 he-0.209
 sat-0.200
aster-0.194
Token.
Token ID13
Feature activation+1.28e-03

Pos logit contributions
arl+0.184
 name+0.154
olving+0.153
 fastest+0.152
rier+0.152
Neg logit contributions
aster-0.066
 course-0.058
 later-0.056
 tomorrow-0.051
 he-0.050
Token Fin
Token ID4463
Feature activation+1.53e-02

Pos logit contributions
aster+0.002
oph+0.002
rob+0.002
 course+0.001
hop+0.001
Neg logit contributions
arl-0.002
 name-0.002
 twins-0.002
pering-0.002
olving-0.002
Token sw
Token ID1509
Feature activation+2.66e-02

Pos logit contributions
aster+0.019
rob+0.018
oph+0.017
 course+0.016
chair+0.015
Neg logit contributions
 name-0.027
arl-0.027
 twins-0.024
 curious-0.023
 welcome-0.023
Tokenam
Token ID321
Feature activation+5.56e-03

Pos logit contributions
aster+0.027
rob+0.026
oph+0.026
 course+0.024
le+0.023
Neg logit contributions
arl-0.054
 name-0.049
 twins-0.045
pering-0.044
 sorry-0.043
Token and
Token ID290
Feature activation+4.82e-03

Pos logit contributions
arl+0.032
 name+0.028
 twins+0.025
pering+0.025
 sorry+0.025
Neg logit contributions
aster-0.025
rob-0.024
oph-0.024
 course-0.023
le-0.021
Token saw
Token ID2497
Feature activation+2.33e-02

Pos logit contributions
rob+0.006
oph+0.006
 course+0.006
mmm+0.005
aster+0.005
Neg logit contributions
 name-0.008
arl-0.008
 sorry-0.008
 twins-0.007
 bruised-0.007
Token a
Token ID257
Feature activation-2.62e-02

Pos logit contributions
aster+0.024
rob+0.023
 course+0.022
oph+0.022
 later+0.020
Neg logit contributions
arl-0.048
 name-0.042
pering-0.038
olving-0.038
 welcome-0.037
Token big
Token ID1263
Feature activation-4.77e-02

Pos logit contributions
arl+0.044
olving+0.037
 fault+0.035
pering+0.035
 Fighting+0.034
Neg logit contributions
aster-0.035
 he-0.035
ha-0.034
 woke-0.033
 course-0.033
Token bear
Token ID6842
Feature activation-7.61e-02

Pos logit contributions
arl+0.074
 name+0.064
olving+0.057
pering+0.055
 twins+0.055
Neg logit contributions
aster-0.072
rob-0.069
 course-0.069
oph-0.067
le-0.063
Token coming
Token ID2406
Feature activation-2.06e-01

Pos logit contributions
arl+0.149
 name+0.126
 Deal+0.121
 younger+0.116
olving+0.115
Neg logit contributions
aster-0.093
 he-0.088
 course-0.087
 later-0.086
le-0.085
Token out
Token ID503
Feature activation-5.60e-02

Pos logit contributions
arl+0.407
olving+0.374
 Deal+0.359
 name+0.356
 fault+0.350
Neg logit contributions
 later-0.231
 him-0.221
 he-0.216
ly-0.212
le-0.201
Token of
Token ID286
Feature activation-1.51e-01

Pos logit contributions
arl+0.138
 name+0.128
olving+0.117
 sorry+0.115
 bruised+0.115
Neg logit contributions
aster-0.032
 course-0.028
le-0.028
 later-0.027
rob-0.026
Token the
Token ID262
Feature activation-4.74e-02

Pos logit contributions
arl+0.304
 name+0.277
olving+0.249
 twins+0.247
pering+0.245
Neg logit contributions
aster-0.162
 course-0.150
rob-0.148
oph-0.141
 later-0.135
Token woods
Token ID16479
Feature activation-4.05e-02

Pos logit contributions
arl+0.066
 name+0.054
olving+0.051
free+0.051
itating+0.049
Neg logit contributions
 course-0.079
rob-0.078
aster-0.077
 he-0.073
oph-0.072
Token.
Token ID13
Feature activation-1.05e-02

Pos logit contributions
arl+0.091
 name+0.081
 Deal+0.078
arse+0.071
 sorry+0.070
Neg logit contributions
 he-0.037
 later-0.035
aster-0.035
 course-0.033
 tomorrow-0.032
Token family
Token ID1641
Feature activation-4.86e-03

Pos logit contributions
arl+0.058
rier+0.046
itating+0.046
pering+0.045
itates+0.043
Neg logit contributions
 then-0.042
aster-0.041
chair-0.041
 hours-0.041
 course-0.040
Token heard
Token ID2982
Feature activation+3.25e-02

Pos logit contributions
arl+0.008
 name+0.008
 twins+0.008
 sorry+0.007
 crashing+0.007
Neg logit contributions
rob-0.006
aster-0.006
oph-0.005
mmm-0.005
mmmm-0.005
Token a
Token ID257
Feature activation-2.54e-03

Pos logit contributions
rob+0.052
oph+0.046
aster+0.044
 sometime+0.041
mmm+0.041
Neg logit contributions
 name-0.047
arl-0.047
 twins-0.043
 sorry-0.041
 crashing-0.040
Token me
Token ID502
Feature activation-1.35e-02

Pos logit contributions
arl+0.004
 name+0.003
olving+0.003
 fault+0.003
pering+0.003
Neg logit contributions
aster-0.004
 course-0.004
oph-0.004
 replied-0.004
 he-0.004
Tokenow
Token ID322
Feature activation-2.99e-02

Pos logit contributions
arl+0.021
 name+0.019
 twins+0.017
pering+0.016
 bruised+0.016
Neg logit contributions
aster-0.019
rob-0.019
oph-0.019
 course-0.018
 later-0.017
Token coming
Token ID2406
Feature activation-2.06e-01

Pos logit contributions
arl+0.048
 name+0.043
 twins+0.039
pering+0.038
 sorry+0.037
Neg logit contributions
oph-0.043
rob-0.042
aster-0.042
 course-0.039
hop-0.037
Token from
Token ID422
Feature activation-8.30e-02

Pos logit contributions
arl+0.395
olving+0.345
 name+0.344
 Deal+0.329
 Fighting+0.326
Neg logit contributions
 later-0.229
 tomorrow-0.215
le-0.210
 then-0.201
ly-0.198
Token the
Token ID262
Feature activation-4.01e-02

Pos logit contributions
arl+0.138
 name+0.122
 twins+0.109
pering+0.106
 sorry+0.105
Neg logit contributions
aster-0.115
rob-0.111
oph-0.110
 course-0.106
le-0.101
Token garden
Token ID11376
Feature activation-4.27e-02

Pos logit contributions
arl+0.058
 name+0.044
olving+0.040
itating+0.040
rier+0.039
Neg logit contributions
aster-0.069
chair-0.066
 then-0.064
 course-0.063
 hours-0.062
Token.
Token ID13
Feature activation-2.04e-04

Pos logit contributions
arl+0.083
 name+0.073
 sorry+0.064
 Deal+0.064
olving+0.063
Neg logit contributions
aster-0.048
oph-0.046
 course-0.045
 later-0.044
 tomorrow-0.044
Token They
Token ID1119
Feature activation-3.10e-03

Pos logit contributions
arl+0.000
 name+0.000
 twins+0.000
pering+0.000
 sorry+0.000
Neg logit contributions
aster-0.000
rob-0.000
oph-0.000
 course-0.000
le-0.000
Token a
Token ID257
Feature activation+3.16e-03

Pos logit contributions
arl+0.034
 name+0.032
 twins+0.029
 bruised+0.028
 sorry+0.027
Neg logit contributions
rob-0.035
aster-0.034
oph-0.033
 course-0.031
le-0.031
Token hole
Token ID7604
Feature activation-1.87e-02

Pos logit contributions
aster+0.005
 course+0.004
 replied+0.004
 he+0.004
oph+0.004
Neg logit contributions
arl-0.005
pering-0.004
itating-0.004
olving-0.004
 Fighting-0.004
Token and
Token ID290
Feature activation-1.47e-03

Pos logit contributions
arl+0.026
 name+0.023
 twins+0.020
pering+0.019
 sorry+0.019
Neg logit contributions
aster-0.031
rob-0.030
oph-0.030
 course-0.029
hop-0.027
Token some
Token ID617
Feature activation+3.91e-02

Pos logit contributions
arl+0.002
 name+0.002
 twins+0.002
 bruised+0.002
 sorry+0.002
Neg logit contributions
rob-0.002
oph-0.002
mmm-0.002
 course-0.002
aster-0.002
Token stuffing
Token ID43672
Feature activation-1.10e-01

Pos logit contributions
aster+0.059
rob+0.057
oph+0.056
 course+0.055
le+0.053
Neg logit contributions
arl-0.059
 name-0.052
 twins-0.045
pering-0.045
olving-0.043
Token coming
Token ID2406
Feature activation-2.06e-01

Pos logit contributions
arl+0.179
 name+0.160
 twins+0.143
pering+0.138
 sorry+0.137
Neg logit contributions
aster-0.153
rob-0.149
oph-0.148
 course-0.141
le-0.133
Token out
Token ID503
Feature activation-5.45e-02

Pos logit contributions
arl+0.355
 Deal+0.306
olving+0.304
 name+0.292
iqu+0.280
Neg logit contributions
 later-0.277
 tomorrow-0.260
le-0.259
ly-0.258
 he-0.258
Token.
Token ID13
Feature activation-2.96e-02

Pos logit contributions
arl+0.102
 name+0.092
 twins+0.084
pering+0.082
 sorry+0.081
Neg logit contributions
aster-0.063
rob-0.061
oph-0.060
 course-0.057
le-0.054
Token It
Token ID632
Feature activation+2.77e-02

Pos logit contributions
arl+0.054
 name+0.047
arse+0.045
pering+0.045
olving+0.045
Neg logit contributions
aster-0.035
oph-0.032
rob-0.030
 tomorrow-0.030
 course-0.030
Token was
Token ID373
Feature activation-4.60e-03

Pos logit contributions
oph+0.035
aster+0.034
rob+0.032
mmm+0.031
 course+0.030
Neg logit contributions
 name-0.048
arl-0.047
 twins-0.043
 bruised-0.043
 sorry-0.041
Token not
Token ID407
Feature activation-9.98e-02

Pos logit contributions
arl+0.008
itating+0.007
arse+0.007
olving+0.007
 name+0.007
Neg logit contributions
 then-0.006
aster-0.006
 he-0.005
 later-0.005
 sat-0.005
Token He
Token ID679
Feature activation-1.81e-02

Pos logit contributions
arl+0.081
 name+0.071
pering+0.064
 twins+0.064
olving+0.064
Neg logit contributions
aster-0.061
oph-0.057
rob-0.057
 course-0.055
 later-0.053
Token sees
Token ID7224
Feature activation+4.13e-02

Pos logit contributions
arl+0.028
 name+0.026
 twins+0.023
 sorry+0.022
pering+0.022
Neg logit contributions
aster-0.026
rob-0.026
oph-0.025
 course-0.024
le-0.023
Token a
Token ID257
Feature activation-2.21e-02

Pos logit contributions
oph+0.055
rob+0.055
aster+0.054
le+0.050
mmm+0.049
Neg logit contributions
arl-0.068
 name-0.064
 twins-0.061
 sorry-0.057
 bruised-0.055
Token soft
Token ID2705
Feature activation-9.46e-02

Pos logit contributions
arl+0.037
olving+0.031
pering+0.031
arse+0.030
iqu+0.030
Neg logit contributions
aster-0.029
oph-0.028
rob-0.027
 course-0.027
 he-0.026
Token light
Token ID1657
Feature activation-5.92e-02

Pos logit contributions
arl+0.143
 name+0.128
 twins+0.112
pering+0.107
 sorry+0.107
Neg logit contributions
aster-0.143
rob-0.140
oph-0.139
 course-0.132
le-0.126
Token coming
Token ID2406
Feature activation-2.06e-01

Pos logit contributions
arl+0.100
 name+0.087
pering+0.078
 twins+0.077
olving+0.076
Neg logit contributions
aster-0.081
rob-0.076
 course-0.076
oph-0.075
le-0.074
Token from
Token ID422
Feature activation-9.10e-02

Pos logit contributions
arl+0.352
 name+0.302
olving+0.299
 Deal+0.287
 bruised+0.281
Neg logit contributions
 later-0.264
aster-0.260
le-0.249
 tomorrow-0.246
ly-0.246
Token his
Token ID465
Feature activation-4.15e-02

Pos logit contributions
arl+0.141
 name+0.130
 twins+0.115
pering+0.110
 welcome+0.110
Neg logit contributions
oph-0.136
rob-0.130
aster-0.125
 course-0.121
mmm-0.119
Token balloon
Token ID21190
Feature activation-1.33e-02

Pos logit contributions
arl+0.062
 name+0.050
pering+0.049
olving+0.047
free+0.047
Neg logit contributions
aster-0.063
 course-0.060
oph-0.059
rob-0.058
 tomorrow-0.058
Token.
Token ID13
Feature activation-3.75e-02

Pos logit contributions
arl+0.023
 name+0.021
 twins+0.018
pering+0.018
 sorry+0.018
Neg logit contributions
aster-0.017
rob-0.017
oph-0.016
 course-0.016
 later-0.015
Token
Token ID198
Feature activation-1.64e-02

Pos logit contributions
arl+0.070
 name+0.059
arse+0.058
pering+0.058
olving+0.057
Neg logit contributions
aster-0.044
oph-0.040
rob-0.039
 he-0.038
 course-0.038
Token.
Token ID13
Feature activation-2.70e-02

Pos logit contributions
arl+0.040
 name+0.036
 twins+0.031
 sorry+0.031
olving+0.031
Neg logit contributions
aster-0.026
rob-0.024
 course-0.024
 later-0.023
 tomorrow-0.023
Token He
Token ID679
Feature activation+2.60e-03

Pos logit contributions
arl+0.049
 name+0.043
 twins+0.039
pering+0.039
olving+0.038
Neg logit contributions
aster-0.034
rob-0.032
oph-0.032
 course-0.030
chair-0.029
Token saw
Token ID2497
Feature activation+2.75e-02

Pos logit contributions
aster+0.004
oph+0.004
mmm+0.003
rob+0.003
mmmm+0.003
Neg logit contributions
 name-0.004
arl-0.004
 twins-0.004
 curious-0.004
 nos-0.004
Token a
Token ID257
Feature activation-2.15e-02

Pos logit contributions
aster+0.033
rob+0.032
oph+0.031
 course+0.030
le+0.028
Neg logit contributions
arl-0.051
 name-0.046
 twins-0.042
pering-0.041
 sorry-0.040
Token car
Token ID1097
Feature activation-2.24e-03

Pos logit contributions
arl+0.036
 Deal+0.031
iqu+0.030
olving+0.030
pering+0.030
Neg logit contributions
 he-0.027
ha-0.026
aster-0.026
 woke-0.025
 course-0.025
Token coming
Token ID2406
Feature activation-2.06e-01

Pos logit contributions
arl+0.004
 name+0.004
 twins+0.003
pering+0.003
 sorry+0.003
Neg logit contributions
aster-0.003
rob-0.003
oph-0.003
 course-0.003
le-0.002
Token fast
Token ID3049
Feature activation-1.48e-02

Pos logit contributions
arl+0.367
 name+0.321
olving+0.305
 Deal+0.294
 twins+0.290
Neg logit contributions
 later-0.258
 he-0.245
 tomorrow-0.244
aster-0.237
 sat-0.232
Token.
Token ID13
Feature activation-2.97e-02

Pos logit contributions
arl+0.034
 name+0.031
 Deal+0.030
 sorry+0.029
iqu+0.029
Neg logit contributions
 later-0.010
aster-0.010
 course-0.010
 tomorrow-0.009
rob-0.009
Token He
Token ID679
Feature activation+6.97e-04

Pos logit contributions
arl+0.053
 name+0.047
 twins+0.042
pering+0.042
 sorry+0.041
Neg logit contributions
aster-0.038
rob-0.036
oph-0.036
 course-0.035
le-0.033
Token did
Token ID750
Feature activation-6.37e-02

Pos logit contributions
aster+0.001
oph+0.001
rob+0.001
 course+0.001
mmm+0.001
Neg logit contributions
 name-0.001
arl-0.001
 twins-0.001
 curious-0.001
 nos-0.001
Token not
Token ID407
Feature activation-9.52e-02

Pos logit contributions
arl+0.065
 name+0.060
 sorry+0.047
 twins+0.047
pering+0.045
Neg logit contributions
oph-0.129
rob-0.128
aster-0.124
 course-0.120
le-0.117
Token noise
Token ID7838
Feature activation-3.13e-02

Pos logit contributions
arl+0.091
 name+0.082
 twins+0.071
 sorry+0.069
pering+0.068
Neg logit contributions
oph-0.100
aster-0.100
rob-0.098
 course-0.093
 tomorrow-0.089
Token and
Token ID290
Feature activation+1.06e-02

Pos logit contributions
arl+0.037
 name+0.036
 crashing+0.033
pering+0.031
 twins+0.030
Neg logit contributions
oph-0.058
rob-0.054
mm-0.051
aster-0.051
mmm-0.050
Token saw
Token ID2497
Feature activation+3.18e-02

Pos logit contributions
rob+0.013
oph+0.013
aster+0.013
 course+0.012
mm+0.011
Neg logit contributions
 name-0.018
arl-0.017
 sorry-0.017
 curious-0.017
 twins-0.016
Token an
Token ID281
Feature activation+1.97e-02

Pos logit contributions
aster+0.043
oph+0.043
rob+0.041
 sometime+0.038
le+0.038
Neg logit contributions
 name-0.050
arl-0.050
 twins-0.045
 sorry-0.044
 welcome-0.041
Token arrow
Token ID15452
Feature activation-1.09e-02

Pos logit contributions
oph+0.040
rob+0.040
aster+0.039
 course+0.037
zed+0.037
Neg logit contributions
arl-0.019
 name-0.017
 twins-0.016
 sorry-0.015
 younger-0.014
Token coming
Token ID2406
Feature activation-2.04e-01

Pos logit contributions
arl+0.020
 name+0.016
 Deal+0.015
 welcome+0.015
olving+0.015
Neg logit contributions
 course-0.014
 he-0.014
rob-0.014
aster-0.014
 later-0.013
Token right
Token ID826
Feature activation-3.02e-02

Pos logit contributions
arl+0.417
 Deal+0.369
olving+0.353
 name+0.349
 fault+0.340
Neg logit contributions
 he-0.215
 later-0.209
 him-0.200
ly-0.192
 sat-0.191
Token towards
Token ID3371
Feature activation+4.70e-02

Pos logit contributions
arl+0.049
 name+0.042
 sorry+0.039
 twins+0.038
 bruised+0.038
Neg logit contributions
aster-0.041
rob-0.040
le-0.040
 later-0.040
 tomorrow-0.039
Token him
Token ID683
Feature activation-5.18e-02

Pos logit contributions
rob+0.045
oph+0.041
aster+0.039
mmm+0.035
 mun+0.035
Neg logit contributions
arl-0.094
 name-0.094
 twins-0.086
 welcome-0.083
 sorry-0.081
Token!
Token ID0
Feature activation+4.46e-02

Pos logit contributions
arl+0.117
 name+0.098
 Deal+0.095
 sorry+0.092
 twins+0.092
Neg logit contributions
 later-0.043
 tomorrow-0.042
 he-0.041
aster-0.038
le-0.038
Token He
Token ID679
Feature activation+3.07e-02

Pos logit contributions
aster+0.057
rob+0.055
oph+0.055
 course+0.053
le+0.049
Neg logit contributions
arl-0.078
 name-0.070
 twins-0.063
pering-0.061
 bruised-0.061
Token and
Token ID290
Feature activation-6.73e-03

Pos logit contributions
arl+0.039
 name+0.036
 crashing+0.034
pering+0.033
 twins+0.033
Neg logit contributions
oph-0.050
rob-0.046
mmm-0.042
mm-0.042
ha-0.041
Token saw
Token ID2497
Feature activation+2.54e-02

Pos logit contributions
 name+0.010
 sorry+0.010
arl+0.010
 curious+0.010
 twins+0.010
Neg logit contributions
rob-0.010
oph-0.010
mmm-0.009
 course-0.009
mmmm-0.009
Token fire
Token ID2046
Feature activation-8.83e-02

Pos logit contributions
aster+0.031
oph+0.031
rob+0.031
 course+0.029
le+0.027
Neg logit contributions
arl-0.045
 name-0.041
 twins-0.038
 sorry-0.036
 bruised-0.036
Token and
Token ID290
Feature activation+3.14e-03

Pos logit contributions
arl+0.161
 Deal+0.140
 name+0.139
 sorry+0.132
 welcome+0.131
Neg logit contributions
le-0.101
rob-0.100
 later-0.098
 then-0.097
 he-0.097
Token lava
Token ID28856
Feature activation-7.35e-02

Pos logit contributions
oph+0.005
rob+0.005
aster+0.005
 course+0.005
mmm+0.005
Neg logit contributions
arl-0.004
 name-0.004
 twins-0.004
 bruised-0.003
 sorry-0.003
Token coming
Token ID2406
Feature activation-2.04e-01

Pos logit contributions
arl+0.127
 name+0.114
 Deal+0.100
 welcome+0.099
 sorry+0.098
Neg logit contributions
aster-0.094
rob-0.090
le-0.089
 later-0.088
 course-0.088
Token out
Token ID503
Feature activation-5.15e-02

Pos logit contributions
arl+0.332
 name+0.285
olving+0.265
 Deal+0.256
 sorry+0.248
Neg logit contributions
 later-0.286
aster-0.285
le-0.278
 tomorrow-0.275
 he-0.269
Token of
Token ID286
Feature activation-1.41e-01

Pos logit contributions
arl+0.112
 name+0.103
 twins+0.093
 sorry+0.092
olving+0.091
Neg logit contributions
aster-0.044
rob-0.040
oph-0.040
 course-0.039
 later-0.038
Token the
Token ID262
Feature activation-4.59e-02

Pos logit contributions
arl+0.272
 name+0.248
 twins+0.221
pering+0.218
olving+0.218
Neg logit contributions
aster-0.159
rob-0.149
 course-0.146
oph-0.144
 later-0.134
Token volcano
Token ID32410
Feature activation-6.85e-02

Pos logit contributions
arl+0.083
 name+0.072
olving+0.070
 Fighting+0.068
pering+0.068
Neg logit contributions
 course-0.057
aster-0.055
rob-0.054
 he-0.053
chair-0.053
Token.
Token ID13
Feature activation-1.80e-02

Pos logit contributions
arl+0.122
 name+0.109
 twins+0.096
 sorry+0.095
pering+0.094
Neg logit contributions
aster-0.087
rob-0.083
oph-0.081
 course-0.080
 tomorrow-0.077
Token heard
Token ID2982
Feature activation+2.60e-02

Pos logit contributions
arl+0.050
 name+0.048
 twins+0.045
 sorry+0.044
 bruised+0.044
Neg logit contributions
rob-0.037
aster-0.034
mmm-0.031
oph-0.031
mmmm-0.031
Token Mom
Token ID11254
Feature activation-3.93e-02

Pos logit contributions
rob+0.038
oph+0.037
aster+0.037
 course+0.035
le+0.034
Neg logit contributions
arl-0.040
 name-0.037
 twins-0.033
 sorry-0.032
 bruised-0.030
Tokenmy
Token ID1820
Feature activation-8.58e-02

Pos logit contributions
arl+0.069
 name+0.063
rier+0.058
olving+0.058
 fault+0.055
Neg logit contributions
le-0.048
aster-0.047
 sat-0.046
 later-0.045
 then-0.045
Token and
Token ID290
Feature activation-3.10e-02

Pos logit contributions
arl+0.200
 name+0.177
rier+0.171
olving+0.170
 fastest+0.167
Neg logit contributions
aster-0.062
le-0.061
 sat-0.055
 later-0.054
ly-0.053
Token Daddy
Token ID26926
Feature activation-1.89e-02

Pos logit contributions
arl+0.047
 name+0.041
 twins+0.038
 sorry+0.037
 bruised+0.036
Neg logit contributions
rob-0.048
aster-0.047
oph-0.046
 course-0.044
le-0.043
Token coming
Token ID2406
Feature activation-2.04e-01

Pos logit contributions
arl+0.032
 name+0.027
 fault+0.026
rier+0.026
olving+0.026
Neg logit contributions
le-0.026
 sat-0.026
aster-0.026
 later-0.026
ly-0.025
Token down
Token ID866
Feature activation-1.66e-02

Pos logit contributions
arl+0.453
olving+0.402
 name+0.393
 fastest+0.380
osity+0.379
Neg logit contributions
 later-0.206
 tomorrow-0.181
 then-0.174
 him-0.163
 he-0.160
Token the
Token ID262
Feature activation-5.48e-02

Pos logit contributions
arl+0.033
 name+0.029
arse+0.027
 fastest+0.026
pering+0.026
Neg logit contributions
aster-0.019
 later-0.018
 tomorrow-0.016
 he-0.016
 course-0.016
Token stairs
Token ID16046
Feature activation-3.21e-02

Pos logit contributions
arl+0.104
 name+0.091
 Deal+0.084
iqu+0.080
arse+0.080
Neg logit contributions
aster-0.072
chair-0.070
rob-0.064
hop-0.063
lf-0.062
Token.
Token ID13
Feature activation-3.40e-02

Pos logit contributions
arl+0.062
 name+0.059
 Deal+0.050
 welcome+0.050
 twins+0.050
Neg logit contributions
aster-0.036
 later-0.033
 course-0.032
 tomorrow-0.031
chair-0.031
Token They
Token ID1119
Feature activation-3.12e-02

Pos logit contributions
arl+0.056
 name+0.047
rier+0.046
olving+0.046
 fastest+0.045
Neg logit contributions
aster-0.044
 he-0.042
oph-0.041
 then-0.041
 tomorrow-0.040
Token,
Token ID11
Feature activation-3.93e-02

Pos logit contributions
arl+0.076
 name+0.065
olving+0.059
itating+0.058
pering+0.056
Neg logit contributions
aster-0.081
 course-0.076
rob-0.075
oph-0.073
le-0.070
Token brown
Token ID7586
Feature activation-7.50e-02

Pos logit contributions
arl+0.063
olving+0.050
 name+0.049
itating+0.048
arse+0.047
Neg logit contributions
aster-0.062
 he-0.057
 then-0.056
 course-0.054
 sat-0.053
Token,
Token ID11
Feature activation-3.52e-02

Pos logit contributions
arl+0.095
 name+0.079
 twins+0.067
pering+0.067
olving+0.066
Neg logit contributions
aster-0.134
rob-0.129
oph-0.128
 course-0.126
le-0.121
Token hairy
Token ID43873
Feature activation-3.56e-02

Pos logit contributions
arl+0.058
olving+0.047
 name+0.045
itating+0.045
 fault+0.043
Neg logit contributions
aster-0.054
 then-0.052
 he-0.051
 course-0.048
 sat-0.047
Token bear
Token ID6842
Feature activation-6.37e-02

Pos logit contributions
arl+0.047
 name+0.039
 twins+0.033
pering+0.033
olving+0.033
Neg logit contributions
aster-0.062
rob-0.060
oph-0.059
 course-0.058
le-0.056
Token coming
Token ID2406
Feature activation-2.04e-01

Pos logit contributions
arl+0.147
 Deal+0.132
 name+0.130
 younger+0.129
rier+0.124
Neg logit contributions
aster-0.052
 he-0.051
 course-0.050
 later-0.046
le-0.045
Token out
Token ID503
Feature activation-5.49e-02

Pos logit contributions
arl+0.361
olving+0.319
 name+0.312
 fault+0.297
iqu+0.294
Neg logit contributions
 later-0.263
 he-0.245
 tomorrow-0.242
aster-0.241
 him-0.235
Token of
Token ID286
Feature activation-1.46e-01

Pos logit contributions
arl+0.131
 name+0.123
olving+0.112
 twins+0.110
pering+0.109
Neg logit contributions
aster-0.035
 later-0.031
 course-0.030
le-0.029
 he-0.028
Token the
Token ID262
Feature activation-5.06e-02

Pos logit contributions
arl+0.277
 name+0.251
 twins+0.225
pering+0.221
olving+0.220
Neg logit contributions
aster-0.169
rob-0.161
oph-0.157
 course-0.155
 later-0.142
Token woods
Token ID16479
Feature activation-4.38e-02

Pos logit contributions
arl+0.099
 name+0.087
itating+0.086
arse+0.086
osity+0.086
Neg logit contributions
aster-0.058
 course-0.056
board-0.052
 he-0.049
chair-0.048
Token.
Token ID13
Feature activation-3.72e-02

Pos logit contributions
arl+0.104
 name+0.095
 Deal+0.093
arse+0.086
iqu+0.084
Neg logit contributions
 he-0.033
 later-0.033
 course-0.030
aster-0.029
 tomorrow-0.029
Token he
Token ID339
Feature activation+1.80e-02

Pos logit contributions
oph+0.015
rob+0.015
aster+0.015
le+0.013
 course+0.013
Neg logit contributions
arl-0.029
 name-0.027
 sorry-0.025
 twins-0.025
 welcome-0.024
Token saw
Token ID2497
Feature activation+3.66e-02

Pos logit contributions
rob+0.023
aster+0.023
oph+0.022
 course+0.021
mmm+0.021
Neg logit contributions
 name-0.029
arl-0.028
 twins-0.027
 sorry-0.025
pering-0.025
Token a
Token ID257
Feature activation-1.32e-02

Pos logit contributions
oph+0.062
rob+0.057
mmm+0.056
aster+0.055
le+0.055
Neg logit contributions
 twins-0.047
 name-0.047
 sorry-0.045
arl-0.044
 curious-0.042
Token big
Token ID1263
Feature activation-3.42e-02

Pos logit contributions
arl+0.021
olving+0.017
 Deal+0.017
pering+0.017
 Fighting+0.017
Neg logit contributions
aster-0.018
oph-0.017
ha-0.017
 he-0.017
 course-0.017
Token wave
Token ID6769
Feature activation+3.65e-02

Pos logit contributions
arl+0.058
 name+0.051
 twins+0.045
pering+0.044
olving+0.044
Neg logit contributions
aster-0.046
rob-0.045
oph-0.044
 course-0.043
le-0.041
Token coming
Token ID2406
Feature activation-2.04e-01

Pos logit contributions
aster+0.055
 course+0.052
rob+0.052
oph+0.051
le+0.049
Neg logit contributions
arl-0.056
 name-0.049
 twins-0.043
olving-0.042
pering-0.042
Token towards
Token ID3371
Feature activation+3.61e-02

Pos logit contributions
arl+0.365
 name+0.333
olving+0.325
 Deal+0.312
 fault+0.305
Neg logit contributions
 later-0.235
 he-0.232
aster-0.220
ly-0.218
 sat-0.215
Token the
Token ID262
Feature activation-2.94e-02

Pos logit contributions
rob+0.036
oph+0.036
chair+0.032
aster+0.030
mmm+0.029
Neg logit contributions
arl-0.073
 name-0.071
 twins-0.062
 welcome-0.062
pering-0.061
Token shore
Token ID15191
Feature activation+7.29e-02

Pos logit contributions
arl+0.054
 name+0.047
olving+0.043
 bruised+0.042
 twins+0.042
Neg logit contributions
aster-0.036
rob-0.035
oph-0.034
 course-0.034
le-0.031
Token.
Token ID13
Feature activation+5.20e-03

Pos logit contributions
aster+0.086
rob+0.079
 course+0.079
oph+0.078
 later+0.077
Neg logit contributions
arl-0.137
 name-0.121
 bruised-0.107
olving-0.107
 twins-0.107
Token
Token ID198
Feature activation+2.37e-02

Pos logit contributions
aster+0.007
oph+0.007
rob+0.007
 course+0.006
 tomorrow+0.006
Neg logit contributions
arl-0.009
 name-0.008
 twins-0.007
pering-0.007
 bruised-0.007
Token She
Token ID1375
Feature activation-1.93e-02

Pos logit contributions
arl+0.069
 name+0.064
 twins+0.054
 sorry+0.053
pering+0.051
Neg logit contributions
rob-0.070
aster-0.069
oph-0.068
 course-0.066
le-0.063
Token sees
Token ID7224
Feature activation+3.98e-02

Pos logit contributions
arl+0.032
 name+0.029
 twins+0.026
pering+0.025
 sorry+0.025
Neg logit contributions
aster-0.026
rob-0.026
oph-0.026
 course-0.024
le-0.023
Token a
Token ID257
Feature activation-2.32e-02

Pos logit contributions
oph+0.054
rob+0.054
aster+0.052
 course+0.049
 sometime+0.048
Neg logit contributions
arl-0.066
 name-0.060
 twins-0.056
 sorry-0.053
 bruised-0.052
Token big
Token ID1263
Feature activation-4.53e-02

Pos logit contributions
arl+0.038
olving+0.034
 fault+0.034
pering+0.034
 Deal+0.033
Neg logit contributions
 he-0.028
ha-0.028
aster-0.027
 woke-0.027
 replied-0.026
Token truck
Token ID7779
Feature activation-8.28e-02

Pos logit contributions
arl+0.066
 name+0.054
olving+0.050
itating+0.050
arse+0.048
Neg logit contributions
aster-0.074
 course-0.070
rob-0.069
oph-0.067
le-0.064
Token coming
Token ID2406
Feature activation-2.04e-01

Pos logit contributions
arl+0.152
 name+0.133
 fault+0.129
 Deal+0.128
olving+0.125
Neg logit contributions
aster-0.106
 he-0.099
 course-0.097
 later-0.094
 then-0.091
Token.
Token ID13
Feature activation-4.35e-02

Pos logit contributions
arl+0.363
olving+0.316
 name+0.314
iqu+0.299
ric+0.294
Neg logit contributions
 later-0.256
aster-0.244
 tomorrow-0.234
 he-0.232
ly-0.222
Token It
Token ID632
Feature activation+1.56e-02

Pos logit contributions
arl+0.075
 name+0.067
 twins+0.060
pering+0.058
 sorry+0.058
Neg logit contributions
aster-0.058
rob-0.056
oph-0.055
 course-0.053
le-0.050
Token stops
Token ID9911
Feature activation-1.16e-01

Pos logit contributions
aster+0.022
rob+0.021
 course+0.020
oph+0.020
le+0.019
Neg logit contributions
arl-0.026
 name-0.022
 twins-0.020
pering-0.020
olving-0.020
Token near
Token ID1474
Feature activation-7.67e-02

Pos logit contributions
arl+0.237
 name+0.218
olving+0.198
 Deal+0.193
ometer+0.190
Neg logit contributions
aster-0.129
 later-0.123
 he-0.121
 tomorrow-0.115
 then-0.110
Token the
Token ID262
Feature activation-5.58e-02

Pos logit contributions
arl+0.158
 name+0.139
olving+0.127
pering+0.126
 bruised+0.124
Neg logit contributions
aster-0.083
oph-0.074
 course-0.073
 later-0.070
rob-0.069
Token and
Token ID290
Feature activation+7.94e-03

Pos logit contributions
arl+0.041
 name+0.038
 crashing+0.035
 twins+0.033
pering+0.032
Neg logit contributions
oph-0.060
rob-0.056
aster-0.053
mm-0.053
mmm-0.053
Token saw
Token ID2497
Feature activation+2.79e-02

Pos logit contributions
rob+0.012
aster+0.011
oph+0.010
mmm+0.010
 course+0.010
Neg logit contributions
 name-0.012
 sorry-0.012
arl-0.012
 twins-0.011
 curious-0.011
Token a
Token ID257
Feature activation-1.84e-02

Pos logit contributions
aster+0.034
rob+0.034
oph+0.033
 course+0.030
hop+0.030
Neg logit contributions
arl-0.050
 name-0.046
 twins-0.043
 sorry-0.041
pering-0.040
Token big
Token ID1263
Feature activation-3.75e-02

Pos logit contributions
arl+0.030
olving+0.025
pering+0.024
 Deal+0.024
 fault+0.023
Neg logit contributions
aster-0.024
 he-0.024
ha-0.024
 course-0.023
oph-0.023
Token wave
Token ID6769
Feature activation+3.33e-02

Pos logit contributions
arl+0.054
 name+0.046
olving+0.040
 twins+0.039
pering+0.039
Neg logit contributions
aster-0.061
rob-0.058
 course-0.058
oph-0.057
le-0.053
Token coming
Token ID2406
Feature activation-2.03e-01

Pos logit contributions
aster+0.055
 course+0.053
rob+0.051
le+0.050
 later+0.050
Neg logit contributions
arl-0.046
 name-0.039
olving-0.034
 twins-0.033
pering-0.033
Token towards
Token ID3371
Feature activation+3.78e-02

Pos logit contributions
arl+0.378
 name+0.336
olving+0.335
 Deal+0.333
 fault+0.321
Neg logit contributions
 later-0.225
ly-0.215
 he-0.212
le-0.207
 sat-0.206
Token them
Token ID606
Feature activation-3.48e-02

Pos logit contributions
rob+0.055
oph+0.049
aster+0.047
mmm+0.047
chair+0.045
Neg logit contributions
arl-0.062
 name-0.060
 twins-0.054
 welcome-0.051
 sorry-0.050
Token.
Token ID13
Feature activation+5.96e-03

Pos logit contributions
arl+0.077
 name+0.066
 Deal+0.061
olving+0.060
 bruised+0.060
Neg logit contributions
 later-0.032
 tomorrow-0.031
 he-0.029
le-0.028
 then-0.027
Token They
Token ID1119
Feature activation-4.36e-03

Pos logit contributions
aster+0.009
rob+0.009
oph+0.008
 course+0.008
le+0.008
Neg logit contributions
arl-0.009
 name-0.008
 twins-0.007
 sorry-0.007
 welcome-0.007
Token were
Token ID547
Feature activation-5.09e-03

Pos logit contributions
arl+0.007
 name+0.007
 twins+0.007
 crashing+0.007
 sorry+0.007
Neg logit contributions
aster-0.006
rob-0.005
oph-0.005
 course-0.005
mmm-0.005
Token and
Token ID290
Feature activation+1.75e-02

Pos logit contributions
aster+0.031
rob+0.029
oph+0.029
 course+0.027
le+0.025
Neg logit contributions
arl-0.059
 name-0.054
 twins-0.049
pering-0.048
 sorry-0.047
Token Sue
Token ID26089
Feature activation+1.35e-02

Pos logit contributions
rob+0.030
oph+0.029
aster+0.029
 course+0.027
le+0.027
Neg logit contributions
arl-0.023
 name-0.022
 welcome-0.019
 twins-0.019
 sorry-0.017
Token see
Token ID766
Feature activation+7.14e-02

Pos logit contributions
aster+0.016
 course+0.015
 sat+0.015
rob+0.014
oph+0.014
Neg logit contributions
arl-0.026
 name-0.022
 twins-0.020
olving-0.020
pering-0.020
Token the
Token ID262
Feature activation-3.68e-02

Pos logit contributions
oph+0.091
aster+0.090
rob+0.090
le+0.081
 course+0.080
Neg logit contributions
arl-0.125
 name-0.116
 twins-0.107
 sorry-0.103
 welcome-0.100
Token water
Token ID1660
Feature activation-4.06e-02

Pos logit contributions
arl+0.073
 Deal+0.062
 Fighting+0.057
tails+0.056
arse+0.056
Neg logit contributions
 he-0.040
 ice-0.038
 seas-0.037
 course-0.036
 adventure-0.034
Token coming
Token ID2406
Feature activation-2.03e-01

Pos logit contributions
arl+0.072
 name+0.063
 Deal+0.058
olving+0.056
arse+0.055
Neg logit contributions
aster-0.050
 course-0.048
 he-0.046
 later-0.046
rob-0.045
Token.
Token ID13
Feature activation-2.33e-03

Pos logit contributions
arl+0.389
 Deal+0.334
olving+0.321
 name+0.316
 appearances+0.302
Neg logit contributions
 later-0.256
 tomorrow-0.245
 then-0.233
 he-0.229
 sat-0.225
Token They
Token ID1119
Feature activation-1.94e-02

Pos logit contributions
arl+0.004
 name+0.003
pering+0.003
 twins+0.003
olving+0.003
Neg logit contributions
aster-0.004
oph-0.003
rob-0.003
 course-0.003
chair-0.003
Token are
Token ID389
Feature activation-1.98e-02

Pos logit contributions
arl+0.035
 name+0.030
 twins+0.028
pering+0.027
olving+0.027
Neg logit contributions
aster-0.024
oph-0.023
rob-0.022
 course-0.022
le-0.021
Token scared
Token ID12008
Feature activation-3.06e-02

Pos logit contributions
arl+0.038
arse+0.029
 Deal+0.028
ometer+0.027
Ru+0.027
Neg logit contributions
 then-0.028
 even-0.027
 sat-0.027
 reached-0.025
 he-0.024
Token.
Token ID13
Feature activation-9.56e-03

Pos logit contributions
arl+0.057
 name+0.051
 twins+0.047
pering+0.046
 sorry+0.045
Neg logit contributions
aster-0.037
rob-0.036
oph-0.035
 course-0.033
le-0.031
Token years
Token ID812
Feature activation-8.96e-02

Pos logit contributions
aster+0.172
rob+0.167
oph+0.164
 course+0.149
ha+0.143
Neg logit contributions
arl-0.176
 twins-0.175
 name-0.165
 curious-0.151
 nos-0.147
Token old
Token ID1468
Feature activation+2.66e-02

Pos logit contributions
arl+0.114
 name+0.100
olving+0.083
pering+0.082
arse+0.082
Neg logit contributions
aster-0.156
oph-0.149
rob-0.143
chair-0.142
 later-0.141
Token and
Token ID290
Feature activation+3.09e-02

Pos logit contributions
oph+0.021
aster+0.021
rob+0.021
 course+0.020
 later+0.019
Neg logit contributions
arl-0.059
 name-0.053
pering-0.049
olving-0.048
 sorry-0.048
Token loved
Token ID6151
Feature activation-1.33e-02

Pos logit contributions
aster+0.034
rob+0.032
oph+0.032
 course+0.031
 tomorrow+0.029
Neg logit contributions
arl-0.061
 name-0.053
pering-0.049
 twins-0.048
olving-0.048
Token to
Token ID284
Feature activation+5.22e-02

Pos logit contributions
arl+0.027
 name+0.023
rier+0.022
pering+0.022
itating+0.022
Neg logit contributions
aster-0.014
 tomorrow-0.013
rob-0.013
 later-0.013
 sat-0.013
Token explore
Token ID7301
Feature activation+1.09e-01

Pos logit contributions
oph+0.094
rob+0.094
ha+0.083
le+0.081
 course+0.081
Neg logit contributions
 name-0.078
arl-0.065
 welcome-0.061
 sorry-0.055
 head-0.053
Token the
Token ID262
Feature activation+1.01e-02

Pos logit contributions
aster+0.167
oph+0.165
rob+0.161
 course+0.149
hop+0.148
Neg logit contributions
arl-0.161
 name-0.149
 twins-0.134
 welcome-0.126
pering-0.125
Token world
Token ID995
Feature activation+8.36e-03

Pos logit contributions
oph+0.017
 tomorrow+0.017
 course+0.016
 seas+0.016
aster+0.016
Neg logit contributions
arl-0.014
 sorry-0.011
 bruised-0.011
 Deal-0.010
rier-0.010
Token around
Token ID1088
Feature activation+5.66e-02

Pos logit contributions
aster+0.012
rob+0.011
 course+0.011
oph+0.011
le+0.011
Neg logit contributions
arl-0.014
 name-0.012
 twins-0.011
 sorry-0.011
 bruised-0.011
Token her
Token ID607
Feature activation+1.37e-03

Pos logit contributions
aster+0.057
rob+0.057
oph+0.055
 course+0.051
le+0.047
Neg logit contributions
arl-0.114
 name-0.105
 twins-0.097
pering-0.093
 welcome-0.093
Token.
Token ID13
Feature activation+1.32e-02

Pos logit contributions
 tomorrow+0.001
 later+0.001
oph+0.001
 sometime+0.001
 imagined+0.001
Neg logit contributions
arl-0.003
 name-0.002
 trigger-0.002
 sorry-0.002
 bruised-0.002
Token a
Token ID257
Feature activation-1.30e-03

Pos logit contributions
oph+0.058
aster+0.058
 course+0.056
rob+0.054
 later+0.052
Neg logit contributions
arl-0.098
 name-0.086
pering-0.078
 twins-0.078
 sorry-0.076
Token fun
Token ID1257
Feature activation-5.21e-02

Pos logit contributions
arl+0.002
pering+0.002
 name+0.002
 Deal+0.002
olving+0.002
Neg logit contributions
oph-0.002
 course-0.002
aster-0.002
ha-0.002
 tomorrow-0.002
Token new
Token ID649
Feature activation-5.03e-02

Pos logit contributions
arl+0.078
 name+0.073
 twins+0.062
 welcome+0.059
olving+0.059
Neg logit contributions
aster-0.078
rob-0.077
oph-0.075
 course-0.070
le-0.070
Token style
Token ID3918
Feature activation+9.83e-02

Pos logit contributions
arl+0.075
 name+0.067
 twins+0.059
pering+0.056
 sorry+0.055
Neg logit contributions
aster-0.077
rob-0.075
oph-0.075
 course-0.072
le-0.069
Token of
Token ID286
Feature activation-7.18e-02

Pos logit contributions
rob+0.127
aster+0.124
oph+0.119
 course+0.109
le+0.104
Neg logit contributions
arl-0.164
 name-0.158
 twins-0.144
 welcome-0.137
olving-0.136
Token dress
Token ID6576
Feature activation+1.05e-01

Pos logit contributions
arl+0.115
 name+0.100
 twins+0.088
pering+0.088
 sorry+0.086
Neg logit contributions
aster-0.103
rob-0.100
oph-0.100
 course-0.098
 tomorrow-0.091
Token and
Token ID290
Feature activation+4.19e-02

Pos logit contributions
aster+0.138
rob+0.133
oph+0.126
 course+0.123
 sometime+0.111
Neg logit contributions
arl-0.178
 name-0.165
 twins-0.149
olving-0.142
pering-0.142
Token a
Token ID257
Feature activation+8.53e-03

Pos logit contributions
aster+0.049
rob+0.047
oph+0.046
 course+0.043
le+0.041
Neg logit contributions
arl-0.077
 name-0.073
 twins-0.066
 sorry-0.063
 welcome-0.063
Token sour
Token ID11348
Feature activation-5.40e-02

Pos logit contributions
oph+0.012
 course+0.012
aster+0.012
 tomorrow+0.011
 replied+0.011
Neg logit contributions
arl-0.014
pering-0.011
olving-0.010
 name-0.010
arse-0.010
Token treat
Token ID2190
Feature activation-1.68e-02

Pos logit contributions
arl+0.085
 name+0.079
 twins+0.068
 sorry+0.065
 welcome+0.065
Neg logit contributions
aster-0.078
rob-0.078
oph-0.074
 course-0.071
 tomorrow-0.069
Token for
Token ID329
Feature activation+3.02e-02

Pos logit contributions
arl+0.028
 name+0.025
 twins+0.023
 welcome+0.022
 sorry+0.022
Neg logit contributions
aster-0.023
rob-0.023
oph-0.022
 course-0.021
le-0.019
Token It
Token ID632
Feature activation+5.50e-02

Pos logit contributions
aster+0.018
rob+0.017
oph+0.017
 course+0.016
le+0.016
Neg logit contributions
arl-0.020
 name-0.018
 twins-0.016
pering-0.016
 sorry-0.016
Token was
Token ID373
Feature activation+2.96e-02

Pos logit contributions
aster+0.078
oph+0.072
rob+0.071
mmmm+0.065
mmm+0.065
Neg logit contributions
 name-0.089
arl-0.082
 twins-0.078
 bruised-0.077
 fastest-0.077
Token so
Token ID523
Feature activation+5.43e-02

Pos logit contributions
aster+0.037
oph+0.036
rob+0.035
 course+0.034
 tomorrow+0.032
Neg logit contributions
arl-0.053
 name-0.047
 twins-0.042
pering-0.041
olving-0.040
Token fun
Token ID1257
Feature activation+6.14e-03

Pos logit contributions
aster+0.089
rob+0.087
oph+0.086
 course+0.083
le+0.080
Neg logit contributions
arl-0.076
 name-0.066
 twins-0.058
pering-0.056
 sorry-0.055
Token to
Token ID284
Feature activation+4.57e-02

Pos logit contributions
aster+0.009
oph+0.009
rob+0.008
le+0.007
mmmm+0.007
Neg logit contributions
arl-0.009
 name-0.009
 twins-0.008
 welcome-0.008
 living-0.008
Token see
Token ID766
Feature activation+1.03e-01

Pos logit contributions
rob+0.081
oph+0.075
chair+0.067
cream+0.066
ly+0.066
Neg logit contributions
 name-0.066
arl-0.060
 welcome-0.054
 crashing-0.049
 sorry-0.048
Token what
Token ID644
Feature activation-5.42e-02

Pos logit contributions
aster+0.127
rob+0.122
oph+0.119
hop+0.105
 course+0.103
Neg logit contributions
arl-0.181
 name-0.172
 twins-0.156
 welcome-0.150
 sorry-0.148
Token it
Token ID340
Feature activation+7.76e-02

Pos logit contributions
 name+0.092
arl+0.087
 twins+0.085
 welcome+0.082
 younger+0.080
Neg logit contributions
aster-0.065
rob-0.065
oph-0.060
le-0.055
mmm-0.054
Token was
Token ID373
Feature activation+3.76e-02

Pos logit contributions
aster+0.114
oph+0.103
rob+0.103
mmmm+0.095
mmm+0.095
Neg logit contributions
 name-0.126
arl-0.112
 welcome-0.104
 twins-0.102
 curious-0.102
Token doing
Token ID1804
Feature activation-1.99e-02

Pos logit contributions
aster+0.054
rob+0.053
oph+0.052
 course+0.049
chair+0.046
Neg logit contributions
arl-0.060
 name-0.056
 twins-0.049
 welcome-0.048
olving-0.047
Token each
Token ID1123
Feature activation+1.50e-01

Pos logit contributions
arl+0.028
 name+0.027
 twins+0.024
olving+0.024
 welcome+0.024
Neg logit contributions
rob-0.031
aster-0.031
oph-0.031
 course-0.027
chair-0.026
Token do
Token ID466
Feature activation+1.28e-02

Pos logit contributions
rob+0.069
oph+0.064
aster+0.055
cream+0.055
dd+0.055
Neg logit contributions
 name-0.052
arl-0.042
 welcome-0.041
olving-0.036
 head-0.034
Token.
Token ID13
Feature activation+1.06e-02

Pos logit contributions
oph+0.027
rob+0.025
aster+0.024
 mun+0.022
mmmm+0.022
Neg logit contributions
 name-0.013
 twins-0.011
arl-0.011
olving-0.011
 sorry-0.011
Token He
Token ID679
Feature activation+3.61e-02

Pos logit contributions
rob+0.014
aster+0.014
oph+0.014
le+0.013
 course+0.012
Neg logit contributions
arl-0.018
 name-0.017
 twins-0.015
 welcome-0.014
 younger-0.014
Token liked
Token ID8288
Feature activation+1.31e-02

Pos logit contributions
mmmm+0.063
rob+0.063
mmm+0.061
aster+0.060
mm+0.059
Neg logit contributions
 name-0.050
 head-0.044
 curious-0.043
 younger-0.041
 brother-0.040
Token to
Token ID284
Feature activation+5.73e-02

Pos logit contributions
aster+0.018
oph+0.018
rob+0.018
 course+0.016
le+0.016
Neg logit contributions
arl-0.022
 name-0.020
 twins-0.018
pering-0.017
 sorry-0.017
Token dream
Token ID4320
Feature activation+1.43e-01

Pos logit contributions
rob+0.109
oph+0.108
ha+0.092
aw+0.091
mmmm+0.089
Neg logit contributions
 name-0.085
 head-0.067
arl-0.066
 welcome-0.063
 sorry-0.058
Token about
Token ID546
Feature activation-4.62e-02

Pos logit contributions
rob+0.206
oph+0.205
aster+0.193
 mun+0.186
 course+0.181
Neg logit contributions
arl-0.226
 name-0.213
olving-0.190
 younger-0.183
 twins-0.180
Token racing
Token ID11717
Feature activation+1.37e-02

Pos logit contributions
arl+0.079
 name+0.074
 twins+0.068
 sorry+0.064
olving+0.064
Neg logit contributions
oph-0.059
aster-0.058
rob-0.057
le-0.052
 course-0.052
Token cars
Token ID5006
Feature activation+1.86e-03

Pos logit contributions
oph+0.020
aster+0.020
rob+0.019
mmm+0.017
 course+0.017
Neg logit contributions
arl-0.021
 name-0.021
 twins-0.018
 welcome-0.017
 sorry-0.017
Token,
Token ID11
Feature activation+1.82e-02

Pos logit contributions
oph+0.002
aster+0.002
rob+0.002
mmm+0.002
mmmm+0.002
Neg logit contributions
arl-0.003
 name-0.003
 twins-0.003
olving-0.003
pering-0.003
Token playing
Token ID2712
Feature activation-2.11e-02

Pos logit contributions
aster+0.027
oph+0.027
rob+0.027
 course+0.025
le+0.025
Neg logit contributions
arl-0.027
 name-0.025
 twins-0.023
pering-0.021
 bruised-0.021
Token want
Token ID765
Feature activation-1.67e-03

Pos logit contributions
rob+0.004
oph+0.004
aster+0.004
 course+0.004
chair+0.004
Neg logit contributions
arl-0.003
 name-0.003
 welcome-0.003
 twins-0.002
 sorry-0.002
Token to
Token ID284
Feature activation+3.26e-02

Pos logit contributions
 name+0.001
 welcome+0.001
arl+0.001
 twins+0.001
 living+0.001
Neg logit contributions
rob-0.004
oph-0.004
ited-0.004
le-0.004
aster-0.003
Token play
Token ID711
Feature activation+3.75e-02

Pos logit contributions
rob+0.048
oph+0.046
aster+0.045
 course+0.043
chair+0.042
Neg logit contributions
arl-0.052
 name-0.050
 welcome-0.043
 twins-0.041
 sorry-0.040
Token hide
Token ID7808
Feature activation+3.57e-02

Pos logit contributions
oph+0.075
aster+0.068
rob+0.065
ureen+0.062
chair+0.062
Neg logit contributions
 name-0.046
arl-0.043
 welcome-0.039
 sorry-0.037
 twins-0.036
Token and
Token ID290
Feature activation+2.13e-02

Pos logit contributions
aster+0.053
rob+0.051
oph+0.051
zed+0.047
 course+0.045
Neg logit contributions
arl-0.057
 name-0.051
 twins-0.045
 welcome-0.043
pering-0.043
Token seek
Token ID5380
Feature activation+1.18e-01

Pos logit contributions
aster+0.023
rob+0.021
oph+0.021
 course+0.020
 tomorrow+0.020
Neg logit contributions
arl-0.042
 name-0.038
 twins-0.035
pering-0.034
olving-0.034
Token?"
Token ID1701
Feature activation+1.17e-01

Pos logit contributions
aster+0.189
oph+0.184
rob+0.181
zed+0.169
ureen+0.161
Neg logit contributions
arl-0.171
 name-0.165
 welcome-0.147
 twins-0.142
 fastest-0.134
Token
Token ID198
Feature activation+7.50e-03

Pos logit contributions
aster+0.208
oph+0.203
rob+0.201
 course+0.192
le+0.187
Neg logit contributions
arl-0.148
 name-0.132
 twins-0.111
 sorry-0.107
pering-0.105
Token
Token ID198
Feature activation+7.36e-03

Pos logit contributions
aster+0.011
rob+0.011
oph+0.011
 course+0.010
le+0.010
Neg logit contributions
arl-0.012
 name-0.011
 twins-0.009
pering-0.009
 sorry-0.009
Token"
Token ID1
Feature activation-1.64e-02

Pos logit contributions
aster+0.010
rob+0.009
oph+0.009
 course+0.009
le+0.008
Neg logit contributions
arl-0.013
 name-0.011
 twins-0.010
pering-0.010
 sorry-0.010
TokenYes
Token ID5297
Feature activation+3.90e-02

Pos logit contributions
arl+0.028
 name+0.026
 twins+0.022
 welcome+0.022
 sorry+0.021
Neg logit contributions
aster-0.023
rob-0.021
oph-0.021
le-0.020
 course-0.019
Token he
Token ID339
Feature activation+3.94e-02

Pos logit contributions
rob+0.037
oph+0.034
aster+0.033
 course+0.029
le+0.025
Neg logit contributions
arl-0.135
 name-0.131
 twins-0.126
 sorry-0.125
 bruised-0.121
Token found
Token ID1043
Feature activation+6.71e-02

Pos logit contributions
rob+0.066
aster+0.061
mmmm+0.056
mmm+0.056
 course+0.055
Neg logit contributions
 name-0.056
 curious-0.051
 nos-0.050
 twins-0.050
 head-0.050
Token other
Token ID584
Feature activation-2.32e-02

Pos logit contributions
oph+0.093
rob+0.087
aster+0.086
mmm+0.082
mmmm+0.078
Neg logit contributions
arl-0.109
 name-0.106
 bruised-0.097
 twins-0.097
 sorry-0.096
Token things
Token ID1243
Feature activation-5.04e-02

Pos logit contributions
 name+0.028
arl+0.027
 twins+0.025
 welcome+0.022
 younger+0.021
Neg logit contributions
aster-0.042
rob-0.041
oph-0.040
mmm-0.038
 course-0.038
Token to
Token ID284
Feature activation+4.39e-02

Pos logit contributions
arl+0.083
 name+0.075
 sorry+0.068
 bruised+0.068
 twins+0.068
Neg logit contributions
oph-0.070
aster-0.070
rob-0.067
mmm-0.060
mmmm-0.059
Token do
Token ID466
Feature activation+1.87e-02

Pos logit contributions
oph+0.092
ha+0.085
rob+0.082
oline+0.081
abel+0.079
Neg logit contributions
 name-0.054
arl-0.046
 welcome-0.046
 sorry-0.037
 crashing-0.034
Token,
Token ID11
Feature activation+2.32e-02

Pos logit contributions
oph+0.042
aster+0.039
rob+0.038
mmmm+0.036
mmm+0.035
Neg logit contributions
 name-0.015
 twins-0.015
 living-0.014
 best-0.013
arl-0.013
Token like
Token ID588
Feature activation-2.36e-02

Pos logit contributions
rob+0.035
oph+0.033
aster+0.033
le+0.030
 course+0.030
Neg logit contributions
arl-0.035
 name-0.034
 twins-0.031
 sorry-0.030
 crashing-0.029
Token build
Token ID1382
Feature activation+3.28e-02

Pos logit contributions
arl+0.046
 name+0.042
 twins+0.038
 sorry+0.037
pering+0.037
Neg logit contributions
aster-0.026
rob-0.025
oph-0.025
 course-0.023
le-0.022
Token sand
Token ID6450
Feature activation+5.46e-02

Pos logit contributions
oph+0.047
aster+0.047
rob+0.045
 course+0.041
chair+0.041
Neg logit contributions
arl-0.049
 name-0.048
 twins-0.044
 sorry-0.041
 welcome-0.041
Tokencast
Token ID2701
Feature activation+1.10e-01

Pos logit contributions
aster+0.074
oph+0.072
rob+0.068
 course+0.067
chair+0.062
Neg logit contributions
arl-0.094
 name-0.085
 twins-0.077
pering-0.073
olving-0.072
Token.
Token ID13
Feature activation+4.91e-03

Pos logit contributions
arl+0.082
 name+0.073
 twins+0.065
pering+0.065
 sorry+0.064
Neg logit contributions
aster-0.062
rob-0.059
oph-0.059
 course-0.057
le-0.054
Token Every
Token ID3887
Feature activation-7.04e-03

Pos logit contributions
aster+0.006
rob+0.006
oph+0.005
 course+0.005
le+0.005
Neg logit contributions
arl-0.009
 name-0.009
 twins-0.008
 welcome-0.008
 sorry-0.007
Token day
Token ID1110
Feature activation+4.15e-02

Pos logit contributions
 name+0.007
arl+0.007
 welcome+0.005
 twins+0.005
 crashing+0.004
Neg logit contributions
rob-0.014
oph-0.014
mmm-0.013
ized-0.013
mmmm-0.013
Token he
Token ID339
Feature activation+3.59e-02

Pos logit contributions
rob+0.054
aster+0.049
oph+0.049
hop+0.046
 course+0.045
Neg logit contributions
arl-0.070
 name-0.069
 sorry-0.063
 twins-0.062
 welcome-0.062
Token liked
Token ID8288
Feature activation+7.82e-03

Pos logit contributions
oph+0.069
mmm+0.066
mmmm+0.065
aster+0.060
rob+0.059
Neg logit contributions
 name-0.041
 twins-0.036
 living-0.035
 curious-0.033
arl-0.031
Token to
Token ID284
Feature activation+5.65e-02

Pos logit contributions
aster+0.011
rob+0.010
oph+0.010
 course+0.010
le+0.010
Neg logit contributions
arl-0.013
 name-0.012
 twins-0.010
pering-0.010
 sorry-0.010
Token play
Token ID711
Feature activation+4.32e-02

Pos logit contributions
oph+0.108
rob+0.107
ha+0.095
 course+0.090
der+0.089
Neg logit contributions
 name-0.079
arl-0.063
 welcome-0.061
 head-0.057
 sorry-0.054
Token outside
Token ID2354
Feature activation+3.34e-04

Pos logit contributions
oph+0.092
aster+0.081
mmm+0.078
mmmm+0.077
rob+0.077
Neg logit contributions
 name-0.045
arl-0.042
 twins-0.038
 sorry-0.035
 living-0.034
Token in
Token ID287
Feature activation-1.46e-03

Pos logit contributions
oph+0.001
aster+0.001
rob+0.001
ha+0.000
 course+0.000
Neg logit contributions
arl-0.000
 name-0.000
 twins-0.000
pering-0.000
 sorry-0.000
Token the
Token ID262
Feature activation+1.84e-02

Pos logit contributions
arl+0.002
 name+0.002
 twins+0.002
 welcome+0.002
pering+0.002
Neg logit contributions
oph-0.002
rob-0.002
aster-0.002
le-0.002
mmm-0.002
Token sun
Token ID4252
Feature activation-3.31e-02

Pos logit contributions
aster+0.027
oph+0.025
rob+0.025
 course+0.025
 tomorrow+0.024
Neg logit contributions
arl-0.030
 name-0.025
pering-0.023
 twins-0.023
olving-0.022
Token
Token ID198
Feature activation+1.77e-02

Pos logit contributions
oph+0.223
rob+0.218
aster+0.214
le+0.207
 hours+0.193
Neg logit contributions
 name-0.154
arl-0.146
 twins-0.125
 sorry-0.122
 welcome-0.121
Token
Token ID198
Feature activation+1.61e-02

Pos logit contributions
aster+0.028
rob+0.027
oph+0.027
 course+0.026
le+0.025
Neg logit contributions
arl-0.025
 name-0.023
 twins-0.019
 welcome-0.019
pering-0.018
TokenThe
Token ID464
Feature activation+2.41e-03

Pos logit contributions
rob+0.019
aster+0.019
oph+0.018
 course+0.017
le+0.016
Neg logit contributions
arl-0.030
 name-0.027
 twins-0.024
pering-0.024
 welcome-0.024
Token owl
Token ID39610
Feature activation+3.99e-03

Pos logit contributions
 course+0.003
 then+0.003
aster+0.003
 he+0.003
 replied+0.003
Neg logit contributions
arl-0.004
ifying-0.003
 fault-0.003
less-0.003
iqu-0.003
Token smiled
Token ID13541
Feature activation-3.41e-02

Pos logit contributions
aster+0.005
rob+0.005
oph+0.004
 hours+0.003
hop+0.003
Neg logit contributions
 name-0.008
 curious-0.007
 puzzled-0.007
 nos-0.007
 head-0.007
Token and
Token ID290
Feature activation+7.94e-03

Pos logit contributions
arl+0.070
 name+0.065
 twins+0.059
pering+0.057
 sorry+0.057
Neg logit contributions
aster-0.034
rob-0.033
oph-0.032
 course-0.030
hop-0.028
Token said
Token ID531
Feature activation-2.05e-02

Pos logit contributions
aster+0.011
rob+0.010
chair+0.009
le+0.009
 occasions+0.009
Neg logit contributions
 name-0.013
 curious-0.012
 sorry-0.012
 nos-0.012
arl-0.011
Token,
Token ID11
Feature activation-2.04e-02

Pos logit contributions
arl+0.039
 name+0.036
 twins+0.032
 sorry+0.032
 welcome+0.032
Neg logit contributions
aster-0.023
rob-0.022
oph-0.022
 course-0.020
le-0.019
Token "
Token ID366
Feature activation-6.55e-02

Pos logit contributions
arl+0.028
 name+0.023
olving+0.021
 twins+0.021
 bruised+0.021
Neg logit contributions
aster-0.033
oph-0.031
 he-0.031
 then-0.031
 course-0.031
TokenI
Token ID40
Feature activation+2.72e-02

Pos logit contributions
arl+0.100
 name+0.091
 twins+0.080
pering+0.076
 Fighting+0.075
Neg logit contributions
aster-0.099
oph-0.094
rob-0.092
le-0.088
 course-0.087
Token'm
Token ID1101
Feature activation+3.55e-02

Pos logit contributions
rob+0.051
oph+0.051
aster+0.050
chair+0.047
 course+0.046
Neg logit contributions
 name-0.035
arl-0.029
 welcome-0.026
olving-0.024
 twins-0.023
Token
Token ID198
Feature activation+1.75e-02

Pos logit contributions
rob+0.040
aster+0.039
oph+0.039
 course+0.037
le+0.037
Neg logit contributions
arl-0.026
 name-0.024
 twins-0.019
 welcome-0.019
itating-0.017
TokenOn
Token ID2202
Feature activation+2.03e-02

Pos logit contributions
rob+0.023
oph+0.021
aster+0.020
le+0.020
 course+0.018
Neg logit contributions
arl-0.031
 name-0.029
 welcome-0.025
 twins-0.024
 Fighting-0.024
Token this
Token ID428
Feature activation+3.95e-02

Pos logit contributions
rob+0.032
ly+0.031
le+0.031
oph+0.030
cream+0.030
Neg logit contributions
arl-0.026
 name-0.024
 twins-0.023
olving-0.023
 welcome-0.022
Token special
Token ID2041
Feature activation+3.62e-02

Pos logit contributions
rob+0.074
oph+0.071
aster+0.068
hop+0.067
aw+0.066
Neg logit contributions
arl-0.044
 name-0.043
olving-0.034
 welcome-0.033
 twins-0.033
Token day
Token ID1110
Feature activation+2.49e-02

Pos logit contributions
oph+0.058
rob+0.057
aster+0.056
 course+0.053
le+0.052
Neg logit contributions
arl-0.050
 name-0.047
 twins-0.040
olving-0.038
 welcome-0.038
Token,
Token ID11
Feature activation+2.04e-02

Pos logit contributions
rob+0.036
oph+0.033
aster+0.031
 course+0.030
le+0.029
Neg logit contributions
arl-0.041
 name-0.038
 welcome-0.035
 twins-0.035
 sorry-0.033
Token Sue
Token ID26089
Feature activation+1.69e-02

Pos logit contributions
rob+0.035
aster+0.033
oph+0.032
le+0.030
hop+0.030
Neg logit contributions
arl-0.027
 name-0.027
 welcome-0.023
 sorry-0.022
 twins-0.022
Token decided
Token ID3066
Feature activation+1.95e-03

Pos logit contributions
rob+0.022
aster+0.022
mmm+0.021
oph+0.021
mmmm+0.021
Neg logit contributions
 name-0.027
 twins-0.025
arl-0.025
 younger-0.023
 head-0.023
Token to
Token ID284
Feature activation+3.10e-02

Pos logit contributions
rob+0.004
aster+0.004
oph+0.004
le+0.004
 course+0.004
Neg logit contributions
 name-0.002
arl-0.002
 sorry-0.002
 twins-0.002
 undone-0.002
Token design
Token ID1486
Feature activation+7.65e-02

Pos logit contributions
rob+0.047
oph+0.046
aster+0.045
 course+0.042
le+0.041
Neg logit contributions
arl-0.047
 name-0.045
 twins-0.038
olving-0.037
 welcome-0.036
Token a
Token ID257
Feature activation+2.48e-02

Pos logit contributions
oph+0.108
rob+0.107
aster+0.104
hop+0.095
 course+0.093
Neg logit contributions
 name-0.121
arl-0.116
 twins-0.107
 sorry-0.102
 younger-0.099
Token in
Token ID287
Feature activation-4.55e-03

Pos logit contributions
aster+0.068
oph+0.067
rob+0.066
 course+0.059
 tomorrow+0.057
Neg logit contributions
arl-0.106
 name-0.097
 twins-0.089
pering-0.087
 bruised-0.086
Token the
Token ID262
Feature activation+1.79e-02

Pos logit contributions
arl+0.008
 name+0.007
 twins+0.006
pering+0.006
 welcome+0.006
Neg logit contributions
rob-0.006
aster-0.006
oph-0.006
 course-0.005
le-0.005
Token sauce
Token ID10746
Feature activation-3.61e-02

Pos logit contributions
aster+0.030
oph+0.028
rob+0.028
 course+0.028
hop+0.026
Neg logit contributions
arl-0.026
 name-0.022
 twins-0.019
pering-0.019
olving-0.018
Token jar
Token ID17379
Feature activation+2.67e-02

Pos logit contributions
arl+0.059
 name+0.054
 twins+0.049
pering+0.047
 welcome+0.046
Neg logit contributions
aster-0.047
rob-0.047
oph-0.045
 sometime-0.041
 course-0.041
Token.
Token ID13
Feature activation+5.74e-03

Pos logit contributions
rob+0.040
aster+0.039
oph+0.038
 sometime+0.034
mmm+0.034
Neg logit contributions
arl-0.039
 name-0.035
 twins-0.032
pering-0.031
 welcome-0.030
Token He
Token ID679
Feature activation+2.82e-02

Pos logit contributions
aster+0.007
oph+0.007
rob+0.006
 course+0.006
 tomorrow+0.006
Neg logit contributions
arl-0.011
 name-0.009
 twins-0.009
pering-0.008
 bruised-0.008
Token tried
Token ID3088
Feature activation-2.16e-02

Pos logit contributions
rob+0.050
aster+0.043
ock+0.038
 course+0.038
mm+0.037
Neg logit contributions
 name-0.041
 nos-0.039
 curious-0.039
 head-0.036
 loose-0.034
Token to
Token ID284
Feature activation+1.92e-02

Pos logit contributions
arl+0.042
 name+0.037
 twins+0.033
pering+0.033
 sorry+0.032
Neg logit contributions
aster-0.025
oph-0.023
 course-0.023
rob-0.023
 later-0.021
Token take
Token ID1011
Feature activation+1.02e-02

Pos logit contributions
rob+0.035
oph+0.031
chair+0.031
aster+0.030
 course+0.029
Neg logit contributions
 name-0.025
arl-0.025
 welcome-0.021
 head-0.020
pering-0.020
Token it
Token ID340
Feature activation+2.32e-02

Pos logit contributions
aster+0.010
rob+0.010
oph+0.010
 course+0.009
le+0.008
Neg logit contributions
arl-0.021
 name-0.019
 twins-0.018
pering-0.017
 sorry-0.017
Token out
Token ID503
Feature activation-1.64e-02

Pos logit contributions
aster+0.029
 later+0.028
 tomorrow+0.028
oph+0.028
 course+0.027
Neg logit contributions
arl-0.041
 name-0.035
 twins-0.034
pering-0.031
 bruised-0.031
Token to
Token ID284
Feature activation+4.42e-02

Pos logit contributions
arl+0.015
 name+0.014
 twins+0.012
 sorry+0.012
 welcome+0.011
Neg logit contributions
rob-0.014
aster-0.014
oph-0.014
 course-0.013
le-0.013
Token bake
Token ID28450
Feature activation+3.77e-02

Pos logit contributions
rob+0.079
oph+0.073
 course+0.066
ha+0.065
aster+0.065
Neg logit contributions
 name-0.063
arl-0.059
 welcome-0.052
 head-0.048
 sorry-0.047
Token.
Token ID13
Feature activation+1.21e-02

Pos logit contributions
rob+0.045
oph+0.045
aster+0.044
 course+0.040
hop+0.039
Neg logit contributions
arl-0.066
 name-0.065
 twins-0.058
 sorry-0.057
 welcome-0.056
Token It
Token ID632
Feature activation+6.14e-02

Pos logit contributions
aster+0.015
rob+0.015
oph+0.015
 course+0.014
le+0.014
Neg logit contributions
arl-0.021
 name-0.020
 twins-0.017
 sorry-0.017
pering-0.017
Token didn
Token ID1422
Feature activation-8.95e-02

Pos logit contributions
oph+0.087
rob+0.087
aster+0.083
mmm+0.080
mmmm+0.076
Neg logit contributions
 name-0.098
arl-0.093
 bruised-0.086
 curious-0.085
 twins-0.085
Token't
Token ID470
Feature activation-4.26e-02

Pos logit contributions
arl+0.085
 name+0.079
 sorry+0.058
pering+0.058
 undone+0.058
Neg logit contributions
oph-0.189
rob-0.189
 imagined-0.175
 hours-0.173
 course-0.171
Token know
Token ID760
Feature activation+6.42e-02

Pos logit contributions
arl+0.059
 name+0.059
 welcome+0.052
free+0.047
 sorry+0.046
Neg logit contributions
rob-0.076
oph-0.074
 course-0.067
mm-0.063
ized-0.062
Token what
Token ID644
Feature activation-6.16e-02

Pos logit contributions
aster+0.076
rob+0.074
oph+0.073
 course+0.069
le+0.065
Neg logit contributions
arl-0.119
 name-0.107
 twins-0.097
pering-0.095
 sorry-0.094
Token to
Token ID284
Feature activation+6.63e-02

Pos logit contributions
arl+0.098
 name+0.092
 twins+0.081
 sorry+0.078
 welcome+0.078
Neg logit contributions
rob-0.086
oph-0.086
aster-0.085
 course-0.078
le-0.077
Token do
Token ID466
Feature activation+3.27e-02

Pos logit contributions
rob+0.123
oph+0.111
aster+0.104
chair+0.103
 hours+0.103
Neg logit contributions
 name-0.088
arl-0.084
 welcome-0.072
 twins-0.067
 head-0.062
Token!
Token ID0
Feature activation+4.99e-02

Pos logit contributions
oph+0.060
rob+0.058
aster+0.057
 mun+0.053
 course+0.050
Neg logit contributions
arl-0.039
 name-0.039
 twins-0.035
olving-0.034
 welcome-0.033
Token
Token ID198
Feature activation+1.90e-02

Pos logit contributions
arl+0.002
 name+0.001
 twins+0.001
pering+0.001
 sorry+0.001
Neg logit contributions
aster-0.001
rob-0.001
oph-0.001
 course-0.001
le-0.001
Token
Token ID198
Feature activation+1.54e-02

Pos logit contributions
aster+0.028
rob+0.028
oph+0.027
 course+0.026
le+0.025
Neg logit contributions
arl-0.030
 name-0.026
 twins-0.023
pering-0.022
 sorry-0.022
TokenFinally
Token ID11158
Feature activation-1.48e-03

Pos logit contributions
rob+0.016
aster+0.016
oph+0.016
 course+0.014
le+0.014
Neg logit contributions
arl-0.031
 name-0.028
 twins-0.026
pering-0.025
 sorry-0.025
Token,
Token ID11
Feature activation-2.21e-03

Pos logit contributions
 sorry+0.002
 name+0.002
 twins+0.002
 welcome+0.001
arl+0.001
Neg logit contributions
rob-0.003
oph-0.003
round-0.002
chair-0.002
hop-0.002
Token one
Token ID530
Feature activation+8.68e-02

Pos logit contributions
 name+0.004
arl+0.003
 sorry+0.003
 twins+0.003
 welcome+0.003
Neg logit contributions
oph-0.003
rob-0.003
chair-0.003
le-0.003
aster-0.002
Token of
Token ID286
Feature activation-9.31e-02

Pos logit contributions
aster+0.106
oph+0.099
rob+0.098
 course+0.096
le+0.092
Neg logit contributions
arl-0.162
 name-0.142
pering-0.130
 twins-0.128
olving-0.125
Token his
Token ID465
Feature activation+2.13e-02

Pos logit contributions
arl+0.172
 name+0.156
 twins+0.144
 sorry+0.137
pering+0.137
Neg logit contributions
oph-0.110
rob-0.108
aster-0.106
chair-0.097
 course-0.094
Token friends
Token ID2460
Feature activation-1.86e-02

Pos logit contributions
aster+0.033
 course+0.031
oph+0.031
 tomorrow+0.031
 later+0.030
Neg logit contributions
arl-0.032
 name-0.024
pering-0.023
itating-0.023
 Fighting-0.023
Token said
Token ID531
Feature activation-1.65e-02

Pos logit contributions
arl+0.035
 name+0.034
 twins+0.032
 sorry+0.031
 welcome+0.030
Neg logit contributions
rob-0.021
aster-0.020
oph-0.018
mmm-0.016
 course-0.016
Token,
Token ID11
Feature activation-1.82e-02

Pos logit contributions
arl+0.034
 name+0.032
 sorry+0.030
 twins+0.029
 bruised+0.028
Neg logit contributions
rob-0.016
aster-0.016
oph-0.016
 course-0.014
le-0.013
Token "
Token ID366
Feature activation-6.48e-02

Pos logit contributions
arl+0.018
 name+0.015
 twins+0.012
 sorry+0.012
pering+0.011
Neg logit contributions
rob-0.038
aster-0.037
oph-0.037
 course-0.035
le-0.035
Token g
Token ID308
Feature activation+2.98e-02

Pos logit contributions
aster+0.021
 course+0.020
 sat+0.019
le+0.019
 then+0.019
Neg logit contributions
arl-0.029
 name-0.024
rier-0.024
pering-0.023
 twins-0.023
Tokeniggle
Token ID24082
Feature activation+1.33e-03

Pos logit contributions
rob+0.046
 course+0.042
aster+0.041
oph+0.040
hop+0.038
Neg logit contributions
arl-0.052
 name-0.043
 sorry-0.040
 twins-0.040
pering-0.039
Token.
Token ID13
Feature activation-1.80e-02

Pos logit contributions
aster+0.002
le+0.001
 later+0.001
 course+0.001
 tomorrow+0.001
Neg logit contributions
arl-0.003
 name-0.002
pering-0.002
olving-0.002
 bruised-0.002
Token They
Token ID1119
Feature activation-1.52e-02

Pos logit contributions
arl+0.029
 name+0.025
 bruised+0.024
pering+0.024
rier+0.024
Neg logit contributions
aster-0.023
oph-0.022
 he-0.022
chair-0.022
rob-0.022
Token did
Token ID750
Feature activation-4.47e-02

Pos logit contributions
 name+0.024
arl+0.024
 twins+0.022
 crashing+0.021
 bruised+0.021
Neg logit contributions
aster-0.020
rob-0.020
oph-0.019
mmm-0.018
mmmm-0.018
Token not
Token ID407
Feature activation-8.13e-02

Pos logit contributions
arl+0.051
 name+0.046
 twins+0.038
 sorry+0.037
pering+0.035
Neg logit contributions
rob-0.084
oph-0.083
aster-0.083
 course-0.079
chair-0.076
Token mind
Token ID2000
Feature activation+2.99e-02

Pos logit contributions
arl+0.122
 name+0.114
 welcome+0.100
 twins+0.099
 sorry+0.097
Neg logit contributions
rob-0.131
oph-0.126
 course-0.119
aster-0.118
chair-0.115
Token being
Token ID852
Feature activation-5.64e-02

Pos logit contributions
aster+0.029
le+0.028
 he+0.028
 later+0.027
 tomorrow+0.026
Neg logit contributions
arl-0.062
 name-0.057
 Deal-0.051
olving-0.050
 welcome-0.050
Token wet
Token ID9583
Feature activation-9.00e-03

Pos logit contributions
arl+0.077
 name+0.064
pering+0.055
olving+0.054
 twins+0.052
Neg logit contributions
aster-0.094
rob-0.090
oph-0.089
 course-0.088
le-0.088
Token.
Token ID13
Feature activation-2.23e-02

Pos logit contributions
arl+0.018
 name+0.017
 twins+0.015
pering+0.015
 sorry+0.014
Neg logit contributions
aster-0.009
rob-0.008
oph-0.008
 course-0.008
le-0.008
Token They
Token ID1119
Feature activation-1.59e-02

Pos logit contributions
arl+0.039
rier+0.034
pering+0.034
arse+0.034
 bruised+0.034
Neg logit contributions
 he-0.024
aster-0.024
oph-0.024
 tomorrow-0.023
 then-0.023
Token.
Token ID13
Feature activation-2.58e-02

Pos logit contributions
arl+0.065
 name+0.057
 twins+0.051
pering+0.050
 sorry+0.050
Neg logit contributions
aster-0.046
rob-0.044
oph-0.044
 course-0.043
le-0.040
Token She
Token ID1375
Feature activation-3.69e-03

Pos logit contributions
arl+0.054
arse+0.049
 fastest+0.047
itating+0.047
iqu+0.046
Neg logit contributions
aster-0.023
oph-0.022
chair-0.020
 course-0.019
 he-0.019
Token wanted
Token ID2227
Feature activation-3.12e-02

Pos logit contributions
 name+0.005
 curious+0.005
 nos+0.005
 twins+0.005
 head+0.005
Neg logit contributions
rob-0.006
aster-0.006
 course-0.005
 eternity-0.005
oph-0.005
Token it
Token ID340
Feature activation+3.34e-02

Pos logit contributions
arl+0.049
 name+0.046
 twins+0.041
 sorry+0.039
 bruised+0.039
Neg logit contributions
rob-0.044
oph-0.044
aster-0.044
 course-0.040
le-0.040
Token for
Token ID329
Feature activation+3.11e-02

Pos logit contributions
aster+0.049
rob+0.047
oph+0.047
 course+0.045
le+0.043
Neg logit contributions
arl-0.053
 name-0.048
 twins-0.042
pering-0.041
 sorry-0.040
Token her
Token ID607
Feature activation-7.74e-03

Pos logit contributions
rob+0.046
oph+0.041
chair+0.040
le+0.040
aster+0.038
Neg logit contributions
arl-0.052
 name-0.046
 twins-0.044
 sorry-0.043
 welcome-0.041
Token fort
Token ID6285
Feature activation-3.38e-02

Pos logit contributions
arl+0.013
 name+0.011
pering+0.010
 twins+0.010
 bruised+0.010
Neg logit contributions
aster-0.011
oph-0.010
 course-0.010
 tomorrow-0.010
rob-0.010
Token.
Token ID13
Feature activation-2.54e-02

Pos logit contributions
arl+0.051
 name+0.047
 twins+0.042
 welcome+0.041
olving+0.040
Neg logit contributions
aster-0.051
rob-0.050
oph-0.048
 course-0.045
mmm-0.044
Token She
Token ID1375
Feature activation-1.33e-03

Pos logit contributions
arl+0.046
 name+0.039
pering+0.038
arse+0.038
olving+0.038
Neg logit contributions
oph-0.030
aster-0.029
 course-0.027
 tomorrow-0.026
chair-0.026
Token ran
Token ID4966
Feature activation-7.03e-03

Pos logit contributions
 name+0.002
 nos+0.002
 curious+0.002
 twins+0.002
 head+0.002
Neg logit contributions
rob-0.002
aster-0.002
mmmm-0.002
mmm-0.002
 eternity-0.002
Token to
Token ID284
Feature activation-2.46e-03

Pos logit contributions
arl+0.013
 name+0.012
 twins+0.011
olving+0.011
 bruised+0.011
Neg logit contributions
aster-0.008
oph-0.008
 later-0.007
rob-0.007
 tomorrow-0.007
Token much
Token ID881
Feature activation-2.88e-02

Pos logit contributions
aster+0.062
rob+0.060
oph+0.059
 course+0.057
le+0.054
Neg logit contributions
arl-0.074
 name-0.065
 twins-0.058
pering-0.057
olving-0.056
Token.
Token ID13
Feature activation-1.57e-02

Pos logit contributions
arl+0.045
 name+0.040
 twins+0.035
 sorry+0.034
 welcome+0.034
Neg logit contributions
oph-0.044
rob-0.043
aster-0.043
 course-0.040
hop-0.038
Token I
Token ID314
Feature activation+1.42e-02

Pos logit contributions
arl+0.027
arse+0.023
pering+0.023
 name+0.022
olving+0.022
Neg logit contributions
aster-0.020
chair-0.018
 he-0.018
 course-0.018
 then-0.018
Token thought
Token ID1807
Feature activation+2.86e-02

Pos logit contributions
aster+0.020
rob+0.020
oph+0.019
 course+0.018
hop+0.017
Neg logit contributions
arl-0.023
 name-0.023
 twins-0.019
 sorry-0.019
 bruised-0.019
Token you
Token ID345
Feature activation-2.95e-02

Pos logit contributions
oph+0.035
rob+0.035
aster+0.031
 course+0.029
le+0.028
Neg logit contributions
arl-0.051
 name-0.048
 sorry-0.047
 twins-0.047
 welcome-0.045
Token would
Token ID561
Feature activation-4.88e-02

Pos logit contributions
 name+0.045
arl+0.037
free+0.037
 twins+0.036
 bruised+0.035
Neg logit contributions
rob-0.048
aster-0.048
oph-0.046
 course-0.041
lf-0.041
Token like
Token ID588
Feature activation-2.58e-02

Pos logit contributions
arl+0.086
 name+0.073
 twins+0.067
pering+0.067
olving+0.065
Neg logit contributions
aster-0.063
oph-0.059
 course-0.059
rob-0.058
 tomorrow-0.057
Token a
Token ID257
Feature activation+1.31e-02

Pos logit contributions
arl+0.045
 name+0.043
 twins+0.039
 sorry+0.037
 welcome+0.036
Neg logit contributions
oph-0.032
rob-0.032
aster-0.031
le-0.028
 course-0.027
Token surprise
Token ID5975
Feature activation+5.12e-02

Pos logit contributions
aster+0.017
 course+0.016
 tomorrow+0.016
oph+0.015
 he+0.015
Neg logit contributions
arl-0.023
pering-0.020
olving-0.018
itating-0.018
 name-0.018
Token.
Token ID13
Feature activation-1.49e-02

Pos logit contributions
rob+0.081
oph+0.080
aster+0.078
 course+0.071
zed+0.068
Neg logit contributions
arl-0.076
 name-0.069
 twins-0.063
 welcome-0.061
 sorry-0.059
Token Do
Token ID2141
Feature activation-3.74e-02

Pos logit contributions
arl+0.024
 name+0.019
arse+0.019
pering+0.019
olving+0.019
Neg logit contributions
aster-0.022
 tomorrow-0.020
 course-0.020
chair-0.020
rob-0.019
Token a
Token ID257
Feature activation+3.62e-04

Pos logit contributions
 name+0.016
arl+0.015
 twins+0.014
pering+0.013
 sorry+0.013
Neg logit contributions
oph-0.018
aster-0.018
rob-0.018
le-0.017
 course-0.017
Token trip
Token ID5296
Feature activation+1.53e-02

Pos logit contributions
rob+0.001
oph+0.001
aster+0.001
 course+0.001
mmmm+0.001
Neg logit contributions
arl-0.000
 name-0.000
 cozy-0.000
 sorry-0.000
 twins-0.000
Token,
Token ID11
Feature activation-5.56e-03

Pos logit contributions
rob+0.027
aster+0.026
oph+0.025
 course+0.024
mm+0.024
Neg logit contributions
arl-0.018
 name-0.017
 welcome-0.015
 twins-0.015
 forward-0.015
Token but
Token ID475
Feature activation+4.45e-02

Pos logit contributions
arl+0.009
 name+0.008
 twins+0.008
 sorry+0.007
pering+0.007
Neg logit contributions
rob-0.008
aster-0.008
oph-0.008
 course-0.007
le-0.007
Token she
Token ID673
Feature activation+1.85e-02

Pos logit contributions
rob+0.047
aster+0.043
oph+0.041
 course+0.040
le+0.037
Neg logit contributions
arl-0.088
 name-0.083
 twins-0.077
 sorry-0.076
 bruised-0.074
Token didn
Token ID1422
Feature activation-1.15e-01

Pos logit contributions
rob+0.031
aster+0.028
 course+0.026
mmmm+0.025
oph+0.025
Neg logit contributions
 name-0.027
 twins-0.024
 curious-0.024
 head-0.024
arl-0.023
Token't
Token ID470
Feature activation-6.00e-02

Pos logit contributions
arl+0.141
 name+0.126
pering+0.100
 twins+0.100
 sorry+0.100
Neg logit contributions
rob-0.209
oph-0.205
 course-0.198
aster-0.198
le-0.189
Token know
Token ID760
Feature activation+4.78e-02

Pos logit contributions
arl+0.085
 name+0.080
 welcome+0.070
 sorry+0.066
 twins+0.066
Neg logit contributions
rob-0.104
oph-0.099
 course-0.097
aster-0.092
chair-0.087
Token what
Token ID644
Feature activation-7.59e-02

Pos logit contributions
aster+0.057
oph+0.057
rob+0.057
 course+0.051
le+0.049
Neg logit contributions
arl-0.086
 name-0.078
 twins-0.072
 sorry-0.070
 bruised-0.069
Token "
Token ID366
Feature activation-2.79e-02

Pos logit contributions
arl+0.113
 name+0.111
 twins+0.097
 welcome+0.092
 sorry+0.092
Neg logit contributions
oph-0.112
rob-0.110
aster-0.108
 course-0.101
le-0.099
Tokenprep
Token ID46012
Feature activation+3.20e-02

Pos logit contributions
arl+0.042
 name+0.039
free+0.034
 undone+0.033
olving+0.032
Neg logit contributions
aster-0.042
oph-0.041
 course-0.040
rob-0.039
 eternity-0.036
Token"
Token ID1
Feature activation-3.55e-02

Pos logit contributions
arl+0.034
 name+0.031
 twins+0.028
pering+0.027
 sorry+0.027
Neg logit contributions
aster-0.018
rob-0.018
oph-0.018
 course-0.017
le-0.016
TokenTom
Token ID13787
Feature activation+6.09e-02

Pos logit contributions
arl+0.068
 name+0.061
 twins+0.057
 bruised+0.054
pering+0.054
Neg logit contributions
aster-0.039
oph-0.038
rob-0.037
le-0.035
 course-0.034
Token,
Token ID11
Feature activation+2.77e-03

Pos logit contributions
 course+0.037
 replied+0.035
aster+0.034
 he+0.033
 then+0.032
Neg logit contributions
arl-0.152
 name-0.130
pering-0.129
olving-0.127
rier-0.125
Token are
Token ID389
Feature activation-1.93e-02

Pos logit contributions
rob+0.004
oph+0.004
aster+0.004
le+0.003
 course+0.003
Neg logit contributions
arl-0.005
 name-0.004
 twins-0.004
 sorry-0.004
 welcome-0.004
Token you
Token ID345
Feature activation-2.95e-02

Pos logit contributions
arl+0.036
 name+0.033
 twins+0.030
pering+0.029
 sorry+0.029
Neg logit contributions
aster-0.022
rob-0.022
oph-0.022
 course-0.020
le-0.019
Token okay
Token ID8788
Feature activation-1.40e-01

Pos logit contributions
arl+0.024
 name+0.019
olving+0.014
pering+0.013
rier+0.011
Neg logit contributions
aster-0.064
 course-0.062
 later-0.062
rob-0.062
oph-0.061
Token?"
Token ID1701
Feature activation+1.28e-01

Pos logit contributions
arl+0.313
 name+0.287
 twins+0.263
pering+0.261
olving+0.258
Neg logit contributions
aster-0.113
rob-0.105
oph-0.105
 course-0.098
le-0.090
Token she
Token ID673
Feature activation+2.06e-03

Pos logit contributions
rob+0.230
oph+0.198
le+0.191
ops+0.190
 course+0.177
Neg logit contributions
 sorry-0.193
 name-0.185
arl-0.171
 sad-0.167
 curious-0.167
Token asked
Token ID1965
Feature activation+4.95e-03

Pos logit contributions
aster+0.002
rob+0.002
oph+0.002
 course+0.002
mmm+0.002
Neg logit contributions
arl-0.004
 name-0.004
 twins-0.004
 sorry-0.004
 younger-0.004
Token.
Token ID13
Feature activation-3.97e-02

Pos logit contributions
aster+0.006
 later+0.005
oph+0.005
 course+0.005
rob+0.005
Neg logit contributions
arl-0.010
 name-0.008
olving-0.007
pering-0.007
 twins-0.007
Token "
Token ID366
Feature activation-6.64e-02

Pos logit contributions
arl+0.054
 name+0.046
 twins+0.040
pering+0.039
olving+0.039
Neg logit contributions
aster-0.066
oph-0.063
rob-0.063
 course-0.061
hop-0.058
Token bag
Token ID6131
Feature activation+6.40e-02

Pos logit contributions
aster+0.017
 course+0.016
rob+0.015
chair+0.015
 then+0.015
Neg logit contributions
arl-0.015
 name-0.012
olving-0.011
pering-0.011
itating-0.011
Token,
Token ID11
Feature activation+1.76e-02

Pos logit contributions
rob+0.091
aster+0.089
oph+0.082
 hours+0.075
 course+0.074
Neg logit contributions
arl-0.107
 name-0.096
 welcome-0.087
 twins-0.086
 sorry-0.085
Token it
Token ID340
Feature activation+3.48e-02

Pos logit contributions
aster+0.015
rob+0.015
oph+0.014
 course+0.013
le+0.012
Neg logit contributions
arl-0.038
 name-0.035
 twins-0.032
pering-0.032
 sorry-0.032
Token fell
Token ID3214
Feature activation+8.67e-04

Pos logit contributions
aster+0.042
rob+0.042
oph+0.039
 course+0.037
mmm+0.035
Neg logit contributions
arl-0.060
 name-0.058
 bruised-0.053
 twins-0.053
 sorry-0.052
Token out
Token ID503
Feature activation-3.79e-02

Pos logit contributions
 later+0.001
 he+0.001
 sat+0.001
ly+0.001
rob+0.001
Neg logit contributions
arl-0.001
 name-0.001
 twins-0.001
 sorry-0.001
 Deal-0.001
Token of
Token ID286
Feature activation-1.18e-01

Pos logit contributions
arl+0.074
 name+0.067
 twins+0.062
pering+0.060
 bruised+0.060
Neg logit contributions
rob-0.041
aster-0.041
oph-0.040
 course-0.037
hop-0.035
Token her
Token ID607
Feature activation-3.94e-02

Pos logit contributions
arl+0.214
 name+0.192
 twins+0.174
pering+0.169
 sorry+0.168
Neg logit contributions
aster-0.143
rob-0.140
oph-0.138
 course-0.130
le-0.125
Token hands
Token ID2832
Feature activation-1.60e-02

Pos logit contributions
arl+0.051
 name+0.042
pering+0.036
 twins+0.035
olving+0.035
Neg logit contributions
aster-0.068
oph-0.066
rob-0.065
 course-0.064
chair-0.062
Token and
Token ID290
Feature activation+2.03e-02

Pos logit contributions
arl+0.030
 name+0.027
 twins+0.024
pering+0.023
 sorry+0.023
Neg logit contributions
aster-0.019
oph-0.018
rob-0.018
 course-0.017
le-0.016
Token broke
Token ID6265
Feature activation-1.23e-02

Pos logit contributions
rob+0.026
aster+0.026
oph+0.025
 course+0.024
le+0.022
Neg logit contributions
arl-0.035
 name-0.033
 twins-0.029
 bruised-0.029
 sorry-0.029
Token on
Token ID319
Feature activation+2.34e-02

Pos logit contributions
arl+0.027
 name+0.024
olving+0.022
 welcome+0.021
 sorry+0.021
Neg logit contributions
 later-0.011
oph-0.010
aster-0.010
 tomorrow-0.010
le-0.010
Token a
Token ID257
Feature activation+3.12e-03

Pos logit contributions
arl+0.039
 name+0.037
 twins+0.036
 sorry+0.036
pering+0.032
Neg logit contributions
oph-0.036
rob-0.036
le-0.034
aster-0.032
chair-0.031
Token while
Token ID981
Feature activation-5.29e-02

Pos logit contributions
aster+0.003
rob+0.003
oph+0.003
 course+0.003
le+0.003
Neg logit contributions
arl-0.006
 name-0.005
 twins-0.005
 sorry-0.005
pering-0.005
Token,
Token ID11
Feature activation+2.51e-03

Pos logit contributions
arl+0.101
 name+0.091
 twins+0.084
 sorry+0.083
pering+0.082
Neg logit contributions
rob-0.059
oph-0.058
aster-0.058
 course-0.053
le-0.051
Token the
Token ID262
Feature activation-1.70e-02

Pos logit contributions
rob+0.004
oph+0.003
aster+0.003
le+0.003
 course+0.003
Neg logit contributions
arl-0.004
 name-0.004
 twins-0.003
 sorry-0.003
 bruised-0.003
Token musician
Token ID21623
Feature activation+2.03e-02

Pos logit contributions
arl+0.030
 name+0.025
pering+0.024
olving+0.023
 bruised+0.023
Neg logit contributions
aster-0.023
oph-0.021
 course-0.021
chair-0.021
rob-0.021
Token stopped
Token ID5025
Feature activation-1.14e-01

Pos logit contributions
rob+0.026
aster+0.025
mmmm+0.023
mmm+0.023
 course+0.022
Neg logit contributions
 name-0.034
 twins-0.031
arl-0.030
 head-0.029
 sorry-0.029
Token playing
Token ID2712
Feature activation-4.96e-02

Pos logit contributions
arl+0.225
 name+0.194
rier+0.173
olving+0.173
 twins+0.173
Neg logit contributions
aster-0.124
 later-0.113
 tomorrow-0.112
rob-0.110
 he-0.109
Token and
Token ID290
Feature activation+3.10e-02

Pos logit contributions
arl+0.069
 name+0.067
 twins+0.062
 sorry+0.061
 welcome+0.059
Neg logit contributions
oph-0.081
rob-0.076
aster-0.073
mmm-0.071
mmmm-0.069
Token said
Token ID531
Feature activation-2.32e-02

Pos logit contributions
aster+0.036
rob+0.035
oph+0.035
mmm+0.034
mmmm+0.033
Neg logit contributions
 name-0.053
arl-0.051
 sorry-0.050
 twins-0.049
 bruised-0.048
Token to
Token ID284
Feature activation+2.61e-03

Pos logit contributions
arl+0.041
 name+0.038
 sorry+0.036
 twins+0.034
 welcome+0.033
Neg logit contributions
rob-0.030
aster-0.030
oph-0.029
 course-0.027
le-0.025
Token Lily
Token ID20037
Feature activation-1.19e-02

Pos logit contributions
rob+0.005
oph+0.004
aster+0.004
ops+0.004
mmm+0.004
Neg logit contributions
arl-0.003
 name-0.003
 twins-0.003
 sorry-0.003
 welcome-0.003
Token said
Token ID531
Feature activation-3.22e-02

Pos logit contributions
 twins+0.031
 name+0.031
 sorry+0.029
arl+0.027
 bruised+0.027
Neg logit contributions
aster-0.032
rob-0.029
mmm-0.029
mmmm-0.029
hop-0.029
Token they
Token ID484
Feature activation-6.56e-02

Pos logit contributions
arl+0.066
 name+0.060
 twins+0.057
 sorry+0.056
 bruised+0.053
Neg logit contributions
rob-0.033
aster-0.031
oph-0.031
 course-0.028
le-0.028
Token had
Token ID550
Feature activation-6.07e-02

Pos logit contributions
arl+0.099
 name+0.096
 twins+0.089
 sorry+0.082
 bruised+0.081
Neg logit contributions
aster-0.098
oph-0.098
rob-0.098
 course-0.088
mmm-0.087
Token to
Token ID284
Feature activation+1.65e-02

Pos logit contributions
arl+0.083
 name+0.077
 twins+0.072
 fault+0.066
 sorry+0.066
Neg logit contributions
rob-0.103
aster-0.096
oph-0.095
le-0.094
 course-0.091
Token say
Token ID910
Feature activation-1.75e-02

Pos logit contributions
rob+0.023
oph+0.022
aster+0.021
 course+0.020
chair+0.020
Neg logit contributions
arl-0.028
 name-0.025
 twins-0.022
 welcome-0.022
 sorry-0.022
Token sorry
Token ID7926
Feature activation-1.48e-01

Pos logit contributions
arl+0.013
 name+0.010
 twins+0.007
 sorry+0.007
pering+0.007
Neg logit contributions
aster-0.040
rob-0.039
oph-0.039
 course-0.038
le-0.037
Token and
Token ID290
Feature activation-2.41e-02

Pos logit contributions
arl+0.216
 name+0.201
 twins+0.184
 attention+0.163
 sorry+0.163
Neg logit contributions
rob-0.269
oph-0.239
aster-0.234
 course-0.228
hop-0.210
Token be
Token ID307
Feature activation+1.08e-02

Pos logit contributions
arl+0.038
 name+0.034
 twins+0.030
 sorry+0.030
 welcome+0.029
Neg logit contributions
rob-0.037
oph-0.036
aster-0.035
 course-0.033
le-0.033
Token good
Token ID922
Feature activation-4.88e-02

Pos logit contributions
aster+0.017
 later+0.016
 then+0.016
 tomorrow+0.016
 course+0.016
Neg logit contributions
arl-0.016
 name-0.013
ometer-0.012
olving-0.012
arse-0.012
Token.
Token ID13
Feature activation-5.24e-02

Pos logit contributions
arl+0.084
 name+0.075
 twins+0.068
pering+0.065
 sorry+0.065
Neg logit contributions
aster-0.065
rob-0.063
oph-0.063
 course-0.060
le-0.057
Token\n
Token ID198
Feature activation-2.92e-02

Pos logit contributions
arl+0.086
 name+0.076
pering+0.067
 twins+0.067
olving+0.067
Neg logit contributions
aster-0.072
oph-0.070
rob-0.069
 course-0.067
 tomorrow-0.064