TOP ACTIVATIONS
MAX = 0.205

. They had stopped qu arre ling . To celebrate ,
Sharing was better than qu arre ling . They said sorry
on , they never qu arre led again over clothes .
? Why are you qu arre ling with your mom ?
never happy . He qu arre led and yelled and sc
walking by saw them qu arre ling . She was very
together . They never qu arre led again . <|endoftext|> Once
and saw two birds qu arre ling ! They were squ
control . The ranc her heard the noise and
Tom and Mia stop qu arre ling . They see the
the parents were also qu arre ling . The first family
" Why are you qu arre ling ? You can share
but today they were qu arre ling . Jenna wanted to
the mess and the qu arre ling twins . She is
hands and they never qu arre led again . <|endoftext|> Once
Finally , after much qu arre ling , one of the
says , " Stop qu arre ling , Tom and Lily
she was sorry for qu arre ling . From that day
They learn that qu arre ling is bad and sharing
on , they never qu arre led again and played together

MIN ACTIVATIONS
MIN = -0.196

, " Yes , of course but you have to be
and said , " Of course !" They
, " Yes , of course ! I love to spin
and said , " Of course you can . Why don
and said , " Of course you can ! It 's
and said , " Of course , Tim my . Ice
Of course , sweet ie . But
‘ Yes , of course . Here , take this
, " Yes , of course you can have a bite
, " Yes , of course . We will go to
mom said , " Of course , Tim ! Let 's
said " yes , of course you can ", and so
“ Of course I do , ”
, " Yes , of course I will !"
my replied , " Of course , let 's go inside
my said , " Of course , but only pick a
shouted back , " Of course , and stay safe !"
Yes Ab bie , of course you can !" <|endoftext|> Once
and said , " Of course , let 's go get
and said , " Of course , Lily . That pattern

INTERVAL 0.105 - 0.205
CONTAINS 0.432%

At first , Jack didn 't understand what she meant
forest , but quickly got tired and felt very lonely .
and help me with the chores ." L ily
together to buy sa us ages . And Tim my was
She qu acked and w add led around . All the

INTERVAL 0.005 - 0.105
CONTAINS 39.657%

more and play less . Jack got frustrated and shouted at
the story is : Do not demand things that are not
thought for a moment and said , " I 'm sorry
nodd les . Then , she and her mummy started to
will play with safer toys . Tom and Anna

INTERVAL -0.096 - 0.005
CONTAINS 58.717%

began to suffer . He was not alone though ; there
the way . They were happy and curious .
all alone in the park . Then , Tom
. Kitty learned that it 's okay to feel embarrass and
hand , and he wanted to catch the duck . The

INTERVAL -0.196 - -0.096
CONTAINS 1.193%

. The kind and compassionate face always gave her the courage
thanking her for the special present . The moral
house . It was a gift for her . Sue was
for giving the drain a hug , Tina . As a
smiled and said , " I 'm glad I could help
Token.
Token ID13
Feature activation+1.03e-02

Pos logit contributions
 course+0.003
 yeah+0.002
Sure+0.002
 reminder+0.002
of+0.002
Neg logit contributions
allows-0.003
hes-0.002
hers-0.002
 pans-0.002
ighters-0.002
Token They
Token ID1119
Feature activation+3.59e-02

Pos logit contributions
allows+0.021
hers+0.019
hes+0.019
ighters+0.018
 pans+0.018
Neg logit contributions
 course-0.020
 reminder-0.014
 yeah-0.014
 friendship-0.014
 wink-0.014
Token had
Token ID550
Feature activation+2.15e-02

Pos logit contributions
allows+0.087
hers+0.080
hes+0.079
ighters+0.075
 pans+0.073
Neg logit contributions
 course-0.059
 reminder-0.038
 yeah-0.037
 friendship-0.036
 amaz-0.035
Token stopped
Token ID5025
Feature activation+7.35e-02

Pos logit contributions
allows+0.048
hers+0.043
hes+0.042
ighters+0.042
 pans+0.040
Neg logit contributions
 course-0.040
 friendship-0.027
 wink-0.027
 reminder-0.027
 yeah-0.025
Token qu
Token ID627
Feature activation+8.42e-02

Pos logit contributions
allows+0.154
hes+0.143
hers+0.143
ighters+0.134
 pans+0.129
Neg logit contributions
 course-0.142
 friendship-0.100
 reminder-0.098
 wink-0.096
 amaz-0.094
Tokenarre
Token ID9624
Feature activation+2.05e-01

Pos logit contributions
allows+0.124
 pans+0.120
ighters+0.116
hers+0.114
ometers+0.112
Neg logit contributions
 course-0.200
of-0.157
est-0.151
 wink-0.145
bird-0.140
Tokenling
Token ID1359
Feature activation+8.34e-02

Pos logit contributions
allows+0.347
ighters+0.323
hes+0.315
ometers+0.311
angers+0.309
Neg logit contributions
 course-0.457
est-0.345
 smiled-0.329
 friendship-0.323
 believe-0.313
Token.
Token ID13
Feature activation+7.16e-03

Pos logit contributions
allows+0.213
hers+0.195
hes+0.195
ighters+0.186
 pans+0.183
Neg logit contributions
 course-0.127
 reminder-0.077
 yeah-0.073
 wink-0.073
 friendship-0.073
Token To
Token ID1675
Feature activation-2.89e-02

Pos logit contributions
allows+0.016
hers+0.014
hes+0.014
ighters+0.013
 pans+0.013
Neg logit contributions
 course-0.014
 reminder-0.009
 yeah-0.009
 friendship-0.009
 wink-0.009
Token celebrate
Token ID10648
Feature activation-1.97e-02

Pos logit contributions
 course+0.058
 reminder+0.042
 yeah+0.041
Sure+0.041
 amaz+0.040
Neg logit contributions
allows-0.057
hes-0.051
hers-0.050
ighters-0.046
 pans-0.045
Token,
Token ID11
Feature activation+2.10e-03

Pos logit contributions
 course+0.031
 reminder+0.020
 yeah+0.019
Sure+0.019
 wink+0.019
Neg logit contributions
allows-0.049
hers-0.044
hes-0.044
 pans-0.042
ighters-0.040
Token Sharing
Token ID36644
Feature activation-3.49e-02

Pos logit contributions
 course+0.006
 reminder+0.004
 yeah+0.004
 wink+0.004
 friendship+0.004
Neg logit contributions
allows-0.004
hers-0.003
hes-0.003
ighters-0.003
 pans-0.003
Token was
Token ID373
Feature activation-3.09e-02

Pos logit contributions
 course+0.047
 reminder+0.029
 yeah+0.029
 amaz+0.029
Sure+0.027
Neg logit contributions
allows-0.090
hes-0.087
hers-0.084
ained-0.078
 pans-0.078
Token better
Token ID1365
Feature activation+3.11e-02

Pos logit contributions
 course+0.061
 reminder+0.043
 yeah+0.042
of+0.041
 wink+0.041
Neg logit contributions
allows-0.064
hes-0.059
hers-0.058
ighters-0.053
 pans-0.052
Token than
Token ID621
Feature activation-6.43e-03

Pos logit contributions
allows+0.061
hes+0.054
ages+0.054
hers+0.053
ighters+0.050
Neg logit contributions
 course-0.064
Sure-0.044
 yeah-0.044
 amaz-0.044
 wink-0.044
Token qu
Token ID627
Feature activation+9.22e-02

Pos logit contributions
 course+0.013
 reminder+0.009
 yeah+0.009
 wink+0.009
Sure+0.009
Neg logit contributions
allows-0.013
hers-0.012
hes-0.012
ighters-0.011
 pans-0.011
Tokenarre
Token ID9624
Feature activation+2.02e-01

Pos logit contributions
allows+0.138
 pans+0.129
ighters+0.126
hes+0.125
hers+0.124
Neg logit contributions
 course-0.217
of-0.169
est-0.159
 friendship-0.150
 cared-0.149
Tokenling
Token ID1359
Feature activation+8.61e-02

Pos logit contributions
allows+0.342
ighters+0.314
hes+0.309
hers+0.304
angers+0.293
Neg logit contributions
 course-0.459
est-0.347
 friendship-0.338
 cared-0.319
 smiled-0.319
Token.
Token ID13
Feature activation-1.45e-03

Pos logit contributions
allows+0.206
hers+0.187
hes+0.187
ighters+0.177
 pans+0.174
Neg logit contributions
 course-0.144
 reminder-0.093
 yeah-0.091
 wink-0.089
Sure-0.087
Token They
Token ID1119
Feature activation+3.56e-02

Pos logit contributions
 course+0.003
 reminder+0.002
 wink+0.002
 yeah+0.002
 friendship+0.002
Neg logit contributions
allows-0.003
hes-0.002
hers-0.002
ighters-0.002
 pans-0.002
Token said
Token ID531
Feature activation+2.63e-02

Pos logit contributions
allows+0.076
hes+0.072
hers+0.071
ighters+0.067
 pans+0.063
Neg logit contributions
 course-0.063
 reminder-0.043
 yeah-0.042
 amaz-0.041
joy-0.040
Token sorry
Token ID7926
Feature activation+4.62e-02

Pos logit contributions
allows+0.061
hes+0.056
hers+0.055
ighters+0.053
 pans+0.051
Neg logit contributions
 course-0.045
 reminder-0.029
 amaz-0.028
 yeah-0.027
 wink-0.027
Token on
Token ID319
Feature activation+2.19e-02

Pos logit contributions
 course+0.057
 reminder+0.041
 yeah+0.039
 friendship+0.039
 wink+0.039
Neg logit contributions
allows-0.045
hers-0.040
hes-0.037
 pans-0.037
ighters-0.036
Token,
Token ID11
Feature activation-6.01e-03

Pos logit contributions
allows+0.061
hers+0.056
hes+0.056
ighters+0.054
 pans+0.053
Neg logit contributions
 course-0.028
 reminder-0.015
 yeah-0.014
 wink-0.014
 friendship-0.014
Token they
Token ID484
Feature activation+4.70e-02

Pos logit contributions
 course+0.011
 friendship+0.008
 reminder+0.008
 yeah+0.007
 wink+0.007
Neg logit contributions
allows-0.013
hers-0.012
hes-0.012
ighters-0.011
 pans-0.011
Token never
Token ID1239
Feature activation-2.52e-02

Pos logit contributions
allows+0.105
hes+0.096
hers+0.095
ighters+0.090
 pans+0.087
Neg logit contributions
 course-0.086
 reminder-0.058
 yeah-0.057
 friendship-0.055
 wink-0.055
Token qu
Token ID627
Feature activation+8.20e-02

Pos logit contributions
 course+0.054
 yeah+0.037
Sure+0.036
 amaz+0.036
 reminder+0.036
Neg logit contributions
allows-0.044
hes-0.041
hers-0.039
inters-0.038
ages-0.038
Tokenarre
Token ID9624
Feature activation+2.01e-01

Pos logit contributions
allows+0.121
 pans+0.118
hers+0.104
ighters+0.100
hes+0.098
Neg logit contributions
 course-0.204
of-0.159
est-0.148
 friendship-0.148
 wink-0.145
Tokenled
Token ID992
Feature activation+8.13e-02

Pos logit contributions
allows+0.324
hers+0.275
hes+0.274
ighters+0.272
 pans+0.262
Neg logit contributions
 course-0.497
 friendship-0.365
est-0.357
 reminder-0.353
 smiled-0.353
Token again
Token ID757
Feature activation-7.37e-02

Pos logit contributions
allows+0.192
hers+0.178
hes+0.175
ighters+0.168
ained+0.163
Neg logit contributions
 course-0.133
 yeah-0.084
 reminder-0.081
Sure-0.078
 amaz-0.078
Token over
Token ID625
Feature activation+2.17e-02

Pos logit contributions
 course+0.133
 yeah+0.090
Sure+0.088
 reminder+0.085
 wink+0.083
Neg logit contributions
allows-0.167
hers-0.150
hes-0.149
ighters-0.141
 pans-0.138
Token clothes
Token ID8242
Feature activation-1.72e-03

Pos logit contributions
allows+0.045
hers+0.041
hes+0.040
ighters+0.038
 pans+0.037
Neg logit contributions
 course-0.043
 yeah-0.030
Sure-0.030
 reminder-0.029
 wink-0.029
Token.
Token ID13
Feature activation-3.07e-03

Pos logit contributions
 course+0.003
 yeah+0.002
Sure+0.002
 amaz+0.002
 reminder+0.002
Neg logit contributions
allows-0.004
hers-0.004
hes-0.004
ighters-0.004
ained-0.004
Token?
Token ID30
Feature activation-3.93e-02

Pos logit contributions
 course+0.023
 reminder+0.016
 yeah+0.016
 wink+0.015
 friendship+0.015
Neg logit contributions
allows-0.025
hers-0.023
hes-0.023
ighters-0.021
 pans-0.021
Token Why
Token ID4162
Feature activation+4.09e-02

Pos logit contributions
 course+0.058
 yeah+0.037
Sure+0.034
 reminder+0.034
 wink+0.033
Neg logit contributions
allows-0.100
hers-0.093
hes-0.087
ained-0.084
 pans-0.084
Token are
Token ID389
Feature activation-5.29e-02

Pos logit contributions
allows+0.112
hers+0.104
hes+0.102
ighters+0.099
 pans+0.099
Neg logit contributions
 course-0.056
 yeah-0.030
 friendship-0.030
 wink-0.029
 smiled-0.027
Token you
Token ID345
Feature activation-2.24e-02

Pos logit contributions
 course+0.044
 yeah+0.012
 friendship+0.012
 thank+0.005
Sure+0.005
Neg logit contributions
allows-0.170
ilts-0.161
ained-0.160
hers-0.157
hes-0.156
Token qu
Token ID627
Feature activation+9.34e-02

Pos logit contributions
 course+0.034
 be+0.024
 wink+0.024
 kid+0.022
 again+0.022
Neg logit contributions
allows-0.048
ighters-0.048
agons-0.046
ometers-0.046
inters-0.045
Tokenarre
Token ID9624
Feature activation+2.01e-01

Pos logit contributions
allows+0.152
 pans+0.146
ighters+0.145
ometers+0.138
hers+0.135
Neg logit contributions
 course-0.209
of-0.154
est-0.141
 wink-0.139
ad-0.138
Tokenling
Token ID1359
Feature activation+8.81e-02

Pos logit contributions
allows+0.348
ighters+0.336
ometers+0.329
angers+0.316
hers+0.311
Neg logit contributions
 course-0.439
est-0.323
 was-0.304
 smiled-0.300
 friendship-0.297
Token with
Token ID351
Feature activation-2.90e-02

Pos logit contributions
allows+0.213
hers+0.190
hes+0.188
ighters+0.185
 pans+0.178
Neg logit contributions
 course-0.151
 reminder-0.095
 wink-0.095
 friendship-0.094
 yeah-0.089
Token your
Token ID534
Feature activation-2.54e-02

Pos logit contributions
 course+0.052
 friendship+0.032
 wink+0.030
 yeah+0.030
 reminder+0.030
Neg logit contributions
allows-0.069
hes-0.062
hers-0.062
ighters-0.060
ained-0.059
Token mom
Token ID1995
Feature activation+1.16e-02

Pos logit contributions
 course+0.051
 friendship+0.038
 reminder+0.037
 wink+0.036
 yeah+0.035
Neg logit contributions
allows-0.050
hes-0.047
hers-0.047
ained-0.045
ighters-0.044
Token?
Token ID30
Feature activation-3.87e-02

Pos logit contributions
allows+0.030
hers+0.027
ighters+0.027
hes+0.026
 pans+0.025
Neg logit contributions
 course-0.018
 reminder-0.011
 wink-0.011
 yeah-0.010
 friendship-0.010
Token never
Token ID1239
Feature activation-2.04e-02

Pos logit contributions
 course+0.052
 wink+0.038
 reminder+0.035
 friendship+0.035
 smiled+0.034
Neg logit contributions
allows-0.066
angers-0.060
 pans-0.059
ometers-0.058
agons-0.058
Token happy
Token ID3772
Feature activation-4.16e-03

Pos logit contributions
 course+0.041
 wink+0.031
 reminder+0.031
 friendship+0.030
 smiled+0.030
Neg logit contributions
allows-0.040
hers-0.037
hes-0.035
ighters-0.035
angers-0.033
Token.
Token ID13
Feature activation-1.42e-03

Pos logit contributions
 course+0.008
 reminder+0.005
 yeah+0.005
Sure+0.005
 wink+0.005
Neg logit contributions
allows-0.009
hers-0.008
hes-0.008
ighters-0.008
 pans-0.008
Token He
Token ID679
Feature activation-1.07e-02

Pos logit contributions
 course+0.002
 reminder+0.002
 yeah+0.002
 wink+0.002
 friendship+0.001
Neg logit contributions
allows-0.003
hes-0.003
hers-0.003
ighters-0.003
 pans-0.003
Token qu
Token ID627
Feature activation+9.05e-02

Pos logit contributions
 course+0.018
 yeah+0.013
 reminder+0.012
Sure+0.012
 amaz+0.012
Neg logit contributions
allows-0.024
hes-0.023
hers-0.022
ighters-0.021
ages-0.020
Tokenarre
Token ID9624
Feature activation+2.01e-01

Pos logit contributions
allows+0.164
hers+0.148
 pans+0.146
ighters+0.140
hes+0.140
Neg logit contributions
 course-0.204
of-0.143
 friendship-0.140
 wink-0.139
 reminder-0.137
Tokenled
Token ID992
Feature activation+8.53e-02

Pos logit contributions
allows+0.318
hers+0.279
hes+0.266
ighters+0.265
angers+0.255
Neg logit contributions
 course-0.502
 friendship-0.365
est-0.365
 wink-0.363
 smiled-0.362
Token and
Token ID290
Feature activation+9.34e-04

Pos logit contributions
allows+0.187
hers+0.168
hes+0.166
ighters+0.158
 pans+0.155
Neg logit contributions
 course-0.155
 yeah-0.104
 reminder-0.103
 amaz-0.098
Sure-0.097
Token yelled
Token ID22760
Feature activation+8.01e-02

Pos logit contributions
allows+0.002
hes+0.002
hers+0.002
 pans+0.002
ighters+0.002
Neg logit contributions
 course-0.002
 reminder-0.001
 yeah-0.001
Sure-0.001
of-0.001
Token and
Token ID290
Feature activation+3.81e-04

Pos logit contributions
allows+0.218
hers+0.202
hes+0.201
ighters+0.192
 pans+0.190
Neg logit contributions
 course-0.108
 reminder-0.061
 friendship-0.058
 yeah-0.057
 wink-0.057
Token sc
Token ID629
Feature activation-1.20e-02

Pos logit contributions
allows+0.001
hes+0.001
hers+0.001
 pans+0.001
ighters+0.001
Neg logit contributions
 course-0.001
 reminder-0.000
 yeah-0.000
Sure-0.000
 friendship-0.000
Token walking
Token ID6155
Feature activation+3.78e-02

Pos logit contributions
 course+0.060
 wink+0.042
 reminder+0.039
 friendship+0.039
 yeah+0.038
Neg logit contributions
allows-0.067
ighters-0.058
hers-0.058
hes-0.057
 pans-0.056
Token by
Token ID416
Feature activation-8.54e-04

Pos logit contributions
allows+0.077
hers+0.069
hes+0.068
ighters+0.064
 pans+0.063
Neg logit contributions
 course-0.078
 reminder-0.055
 yeah-0.054
 wink-0.053
 friendship-0.053
Token saw
Token ID2497
Feature activation-1.32e-02

Pos logit contributions
 course+0.002
 yeah+0.001
 reminder+0.001
Sure+0.001
 wink+0.001
Neg logit contributions
allows-0.002
hers-0.002
hes-0.002
ighters-0.002
 pans-0.002
Token them
Token ID606
Feature activation+5.56e-02

Pos logit contributions
 course+0.030
 reminder+0.025
 yeah+0.023
 wink+0.023
 friendship+0.023
Neg logit contributions
allows-0.023
hers-0.020
hes-0.020
ighters-0.019
 pans-0.018
Token qu
Token ID627
Feature activation+8.50e-02

Pos logit contributions
allows+0.138
hers+0.126
hes+0.126
ighters+0.120
 pans+0.118
Neg logit contributions
 course-0.089
 reminder-0.055
 yeah-0.053
 wink-0.053
 friendship-0.053
Tokenarre
Token ID9624
Feature activation+2.01e-01

Pos logit contributions
allows+0.134
 pans+0.125
ighters+0.122
ometers+0.122
hes+0.117
Neg logit contributions
 course-0.193
of-0.147
est-0.143
ad-0.136
 wink-0.133
Tokenling
Token ID1359
Feature activation+8.14e-02

Pos logit contributions
allows+0.359
ometers+0.343
ighters+0.341
angers+0.323
hes+0.322
Neg logit contributions
 course-0.436
est-0.321
 smiled-0.295
 was-0.288
of-0.286
Token.
Token ID13
Feature activation+2.10e-04

Pos logit contributions
allows+0.198
hers+0.180
hes+0.180
ighters+0.170
 pans+0.167
Neg logit contributions
 course-0.134
 reminder-0.085
 yeah-0.084
 friendship-0.081
 wink-0.081
Token She
Token ID1375
Feature activation+1.81e-03

Pos logit contributions
allows+0.000
hes+0.000
hers+0.000
ighters+0.000
 pans+0.000
Neg logit contributions
 course-0.000
 reminder-0.000
 yeah-0.000
 friendship-0.000
 wink-0.000
Token was
Token ID373
Feature activation-3.24e-02

Pos logit contributions
allows+0.004
hers+0.004
hes+0.004
 pans+0.004
ighters+0.004
Neg logit contributions
 course-0.003
 reminder-0.002
 yeah-0.002
 friendship-0.002
 amaz-0.002
Token very
Token ID845
Feature activation+7.81e-02

Pos logit contributions
 course+0.062
 wink+0.048
 smiled+0.042
 reminder+0.041
 friendship+0.041
Neg logit contributions
allows-0.068
ighters-0.059
ages-0.057
 pans-0.057
ilts-0.057
Token together
Token ID1978
Feature activation+2.57e-02

Pos logit contributions
 course+0.114
 yeah+0.088
 reminder+0.087
 wink+0.086
of+0.085
Neg logit contributions
allows-0.085
hers-0.077
hes-0.076
ighters-0.069
 pans-0.065
Token.
Token ID13
Feature activation-4.57e-03

Pos logit contributions
allows+0.050
hers+0.046
hes+0.045
ighters+0.041
ained+0.040
Neg logit contributions
 course-0.051
 yeah-0.038
Sure-0.037
 reminder-0.036
 wink-0.036
Token They
Token ID1119
Feature activation+2.79e-02

Pos logit contributions
 course+0.010
 reminder+0.008
 wink+0.007
 yeah+0.007
 friendship+0.007
Neg logit contributions
allows-0.008
hers-0.007
hes-0.007
ighters-0.007
ained-0.006
Token never
Token ID1239
Feature activation-2.51e-02

Pos logit contributions
allows+0.064
hers+0.059
hes+0.058
ighters+0.055
 pans+0.054
Neg logit contributions
 course-0.049
 reminder-0.032
 yeah-0.032
 friendship-0.031
 wink-0.030
Token qu
Token ID627
Feature activation+8.35e-02

Pos logit contributions
 course+0.055
 yeah+0.039
ement+0.038
Sure+0.038
 reminder+0.037
Neg logit contributions
allows-0.040
hes-0.036
hers-0.036
inters-0.036
ained-0.034
Tokenarre
Token ID9624
Feature activation+2.01e-01

Pos logit contributions
allows+0.121
 pans+0.118
hers+0.106
ighters+0.102
hes+0.100
Neg logit contributions
 course-0.210
of-0.159
est-0.152
 friendship-0.150
 wink-0.147
Tokenled
Token ID992
Feature activation+8.21e-02

Pos logit contributions
allows+0.336
hes+0.293
ighters+0.291
hers+0.291
 pans+0.279
Neg logit contributions
 course-0.475
 friendship-0.341
est-0.340
 smiled-0.334
 cared-0.328
Token again
Token ID757
Feature activation-7.37e-02

Pos logit contributions
allows+0.191
hers+0.176
hes+0.170
ighters+0.164
ained+0.158
Neg logit contributions
 course-0.139
 yeah-0.089
Sure-0.087
 reminder-0.086
 wink-0.083
Token.
Token ID13
Feature activation-3.50e-03

Pos logit contributions
 course+0.124
 yeah+0.080
Sure+0.079
 reminder+0.075
 wink+0.074
Neg logit contributions
allows-0.175
hers-0.158
hes-0.155
ighters-0.147
 pans-0.146
Token<|endoftext|>
Token ID50256
Feature activation+8.49e-02

Pos logit contributions
 course+0.007
 wink+0.005
 reminder+0.005
 yeah+0.005
Sure+0.005
Neg logit contributions
allows-0.007
hes-0.006
hers-0.006
ighters-0.006
ained-0.005
TokenOnce
Token ID7454
Feature activation+5.07e-02

Pos logit contributions
allows+0.193
hers+0.175
hes+0.173
ighters+0.165
 pans+0.162
Neg logit contributions
 course-0.155
 reminder-0.101
 wink-0.099
 friendship-0.098
 yeah-0.098
Token and
Token ID290
Feature activation+2.89e-03

Pos logit contributions
allows+0.008
hers+0.007
hes+0.007
 pans+0.007
ighters+0.007
Neg logit contributions
 course-0.009
 yeah-0.006
 reminder-0.006
 friendship-0.006
Sure-0.006
Token saw
Token ID2497
Feature activation-8.60e-03

Pos logit contributions
allows+0.007
hers+0.006
hes+0.006
 pans+0.006
ighters+0.006
Neg logit contributions
 course-0.005
 reminder-0.003
 friendship-0.003
 yeah-0.003
Sure-0.003
Token two
Token ID734
Feature activation-2.12e-02

Pos logit contributions
 course+0.017
 reminder+0.013
 yeah+0.012
Sure+0.012
 friendship+0.012
Neg logit contributions
allows-0.017
hes-0.015
hers-0.015
ighters-0.014
 pans-0.014
Token birds
Token ID10087
Feature activation+2.20e-04

Pos logit contributions
 course+0.045
 reminder+0.032
 yeah+0.032
 wink+0.032
 friendship+0.031
Neg logit contributions
allows-0.041
hes-0.037
hers-0.037
ighters-0.034
 pans-0.033
Token qu
Token ID627
Feature activation+8.22e-02

Pos logit contributions
allows+0.000
hes+0.000
hers+0.000
ighters+0.000
 pans+0.000
Neg logit contributions
 course-0.000
 wink-0.000
 reminder-0.000
 friendship-0.000
 yeah-0.000
Tokenarre
Token ID9624
Feature activation+2.00e-01

Pos logit contributions
allows+0.140
 pans+0.135
hes+0.135
ighters+0.132
ometers+0.127
Neg logit contributions
 course-0.174
of-0.132
est-0.123
ad-0.120
 wink-0.118
Tokenling
Token ID1359
Feature activation+7.98e-02

Pos logit contributions
allows+0.359
ometers+0.347
ighters+0.346
hes+0.343
angers+0.339
Neg logit contributions
 course-0.418
est-0.310
 smiled-0.286
 believe-0.275
 was-0.274
Token!
Token ID0
Feature activation-2.05e-02

Pos logit contributions
allows+0.178
hes+0.162
hers+0.161
ighters+0.154
 pans+0.150
Neg logit contributions
 course-0.148
 reminder-0.098
 friendship-0.096
 wink-0.096
 yeah-0.094
Token They
Token ID1119
Feature activation+2.73e-02

Pos logit contributions
 course+0.035
 yeah+0.022
Sure+0.022
 wink+0.021
 reminder+0.021
Neg logit contributions
allows-0.050
hes-0.045
hers-0.045
ighters-0.043
 pans-0.042
Token were
Token ID547
Feature activation-3.71e-03

Pos logit contributions
allows+0.067
hers+0.061
hes+0.060
ighters+0.057
 pans+0.056
Neg logit contributions
 course-0.045
 reminder-0.029
 yeah-0.028
 friendship-0.027
 wink-0.026
Token squ
Token ID2809
Feature activation+3.06e-02

Pos logit contributions
 course+0.006
 wink+0.005
 friendship+0.004
 amaz+0.004
 smiled+0.004
Neg logit contributions
allows-0.008
ighters-0.008
 pans-0.007
angers-0.007
hes-0.007
Token control
Token ID1630
Feature activation-5.91e-02

Pos logit contributions
allows+0.123
hers+0.111
hes+0.111
ighters+0.105
 pans+0.102
Neg logit contributions
 course-0.101
 reminder-0.067
 yeah-0.065
 friendship-0.065
 wink-0.065
Token.
Token ID13
Feature activation+1.18e-03

Pos logit contributions
 course+0.099
 reminder+0.063
 friendship+0.061
 wink+0.061
 yeah+0.060
Neg logit contributions
allows-0.142
hers-0.130
hes-0.129
ighters-0.123
 pans-0.121
Token\n
Token ID198
Feature activation-1.02e-02

Pos logit contributions
allows+0.003
hes+0.003
hers+0.003
ighters+0.002
 pans+0.002
Neg logit contributions
 course-0.002
 reminder-0.001
 yeah-0.001
 wink-0.001
 friendship-0.001
Token\n
Token ID198
Feature activation-9.59e-03

Pos logit contributions
 course+0.015
 wink+0.008
Sure+0.008
 friendship+0.008
 reminder+0.008
Neg logit contributions
allows-0.027
hers-0.025
hes-0.024
 pans-0.023
ighters-0.023
TokenThe
Token ID464
Feature activation-2.93e-02

Pos logit contributions
 course+0.014
Sure+0.007
 wink+0.007
 reminder+0.007
of+0.007
Neg logit contributions
allows-0.026
hers-0.024
hes-0.023
ighters-0.023
 pans-0.022
Token ranc
Token ID32205
Feature activation+2.00e-01

Pos logit contributions
 course+0.053
 friendship+0.035
 reminder+0.035
 end+0.034
 wink+0.031
Neg logit contributions
allows-0.069
hes-0.062
ighters-0.061
angers-0.061
ants-0.060
Tokenher
Token ID372
Feature activation+6.62e-02

Pos logit contributions
allows+0.343
 pans+0.286
 onions+0.255
hes+0.255
 pigs+0.253
Neg logit contributions
 course-0.469
 friendship-0.334
 wink-0.334
 reminder-0.328
 smiled-0.324
Token heard
Token ID2982
Feature activation+1.50e-02

Pos logit contributions
allows+0.169
hes+0.152
hers+0.151
ighters+0.149
 pans+0.145
Neg logit contributions
 course-0.100
 wink-0.062
 smiled-0.059
 reminder-0.058
 friendship-0.058
Token the
Token ID262
Feature activation+1.30e-02

Pos logit contributions
allows+0.034
hers+0.031
hes+0.031
ighters+0.029
 pans+0.029
Neg logit contributions
 course-0.028
 reminder-0.018
 wink-0.018
 friendship-0.018
 yeah-0.017
Token noise
Token ID7838
Feature activation+3.69e-02

Pos logit contributions
allows+0.030
hes+0.027
ained+0.027
ighters+0.026
angers+0.026
Neg logit contributions
 course-0.023
 reminder-0.015
 wink-0.015
 friendship-0.015
 child-0.014
Token and
Token ID290
Feature activation+2.08e-03

Pos logit contributions
allows+0.084
hes+0.078
hers+0.077
ighters+0.072
 pans+0.071
Neg logit contributions
 course-0.066
 yeah-0.047
 reminder-0.046
 friendship-0.043
 amaz-0.042
Token Tom
Token ID4186
Feature activation-5.81e-03

Pos logit contributions
allows+0.022
hers+0.020
hes+0.020
ighters+0.019
 pans+0.018
Neg logit contributions
 course-0.016
 reminder-0.011
 yeah-0.010
 wink-0.010
 friendship-0.010
Token and
Token ID290
Feature activation+1.02e-02

Pos logit contributions
 course+0.008
 reminder+0.005
 yeah+0.005
 friendship+0.005
 wink+0.005
Neg logit contributions
allows-0.015
hers-0.014
hes-0.014
ighters-0.013
 pans-0.013
Token Mia
Token ID32189
Feature activation-3.94e-03

Pos logit contributions
allows+0.023
hes+0.021
hers+0.021
ighters+0.020
 pans+0.020
Neg logit contributions
 course-0.019
 reminder-0.013
 yeah-0.012
 friendship-0.012
 wink-0.011
Token stop
Token ID2245
Feature activation-4.09e-02

Pos logit contributions
 course+0.007
 reminder+0.004
 yeah+0.004
 friendship+0.004
 wink+0.004
Neg logit contributions
allows-0.009
hes-0.009
hers-0.009
ighters-0.008
 pans-0.008
Token qu
Token ID627
Feature activation+8.40e-02

Pos logit contributions
 course+0.083
 reminder+0.057
 yeah+0.057
 friendship+0.055
Sure+0.055
Neg logit contributions
allows-0.087
hes-0.076
hers-0.076
ighters-0.072
 pans-0.071
Tokenarre
Token ID9624
Feature activation+2.00e-01

Pos logit contributions
allows+0.188
hers+0.181
 pans+0.175
ometers+0.171
ighters+0.170
Neg logit contributions
 course-0.139
of-0.093
 wink-0.087
 circle-0.082
 friendship-0.080
Tokenling
Token ID1359
Feature activation+8.20e-02

Pos logit contributions
allows+0.349
ometers+0.334
ighters+0.321
hers+0.318
angers+0.316
Neg logit contributions
 course-0.432
est-0.316
 smiled-0.301
of-0.297
 was-0.295
Token.
Token ID13
Feature activation+5.33e-03

Pos logit contributions
allows+0.200
hers+0.182
hes+0.182
ighters+0.173
 pans+0.170
Neg logit contributions
 course-0.136
 reminder-0.086
 wink-0.082
 friendship-0.082
 yeah-0.082
Token They
Token ID1119
Feature activation+3.79e-02

Pos logit contributions
allows+0.010
hers+0.009
hes+0.009
ighters+0.008
 pans+0.008
Neg logit contributions
 course-0.012
 friendship-0.008
 wink-0.008
Sure-0.008
 yeah-0.008
Token see
Token ID766
Feature activation-6.40e-02

Pos logit contributions
allows+0.099
hers+0.090
hes+0.090
ighters+0.085
ained+0.085
Neg logit contributions
 course-0.058
 wink-0.036
 yeah-0.033
 amaz-0.032
 reminder-0.032
Token the
Token ID262
Feature activation+1.27e-02

Pos logit contributions
 course+0.132
 reminder+0.097
 yeah+0.094
Sure+0.091
 wink+0.090
Neg logit contributions
allows-0.130
hes-0.119
hers-0.116
ighters-0.111
 pans-0.106
Token the
Token ID262
Feature activation+8.60e-03

Pos logit contributions
 course+0.104
 yeah+0.072
 reminder+0.072
 wink+0.070
Sure+0.070
Neg logit contributions
allows-0.093
hers-0.083
hes-0.083
ighters-0.078
 pans-0.076
Token parents
Token ID3397
Feature activation+5.11e-03

Pos logit contributions
allows+0.018
hes+0.016
hers+0.016
ighters+0.015
 pans+0.015
Neg logit contributions
 course-0.018
 reminder-0.012
 friendship-0.012
 yeah-0.012
 wink-0.012
Token were
Token ID547
Feature activation-3.83e-03

Pos logit contributions
allows+0.012
hers+0.011
hes+0.011
ighters+0.010
 pans+0.010
Neg logit contributions
 course-0.008
 yeah-0.006
 reminder-0.006
 friendship-0.005
Sure-0.005
Token also
Token ID635
Feature activation+1.19e-02

Pos logit contributions
 course+0.007
 reminder+0.005
 wink+0.005
 friendship+0.005
 amaz+0.005
Neg logit contributions
allows-0.008
hers-0.007
hes-0.007
ighters-0.007
 pans-0.007
Token qu
Token ID627
Feature activation+8.20e-02

Pos logit contributions
allows+0.025
hers+0.022
hes+0.022
ighters+0.021
 pans+0.020
Neg logit contributions
 course-0.024
 reminder-0.017
 wink-0.016
 yeah-0.016
 friendship-0.016
Tokenarre
Token ID9624
Feature activation+2.00e-01

Pos logit contributions
allows+0.122
 pans+0.113
ighters+0.107
hes+0.106
ometers+0.103
Neg logit contributions
 course-0.204
of-0.158
est-0.148
 wink-0.144
 reminder-0.141
Tokenling
Token ID1359
Feature activation+7.87e-02

Pos logit contributions
allows+0.324
hes+0.286
ighters+0.282
hers+0.276
angers+0.270
Neg logit contributions
 course-0.482
est-0.355
 reminder-0.353
 smiled-0.345
 wink-0.343
Token.
Token ID13
Feature activation-2.25e-03

Pos logit contributions
allows+0.173
hers+0.157
hes+0.154
ighters+0.144
 pans+0.141
Neg logit contributions
 course-0.144
 yeah-0.099
 reminder-0.095
Sure-0.094
 wink-0.093
Token The
Token ID383
Feature activation-2.35e-02

Pos logit contributions
 course+0.004
 reminder+0.003
 yeah+0.003
 friendship+0.003
 wink+0.003
Neg logit contributions
allows-0.005
hers-0.005
hes-0.005
ighters-0.004
 pans-0.004
Token first
Token ID717
Feature activation-2.36e-02

Pos logit contributions
 course+0.046
 reminder+0.032
 yeah+0.031
 wink+0.031
 friendship+0.031
Neg logit contributions
allows-0.050
hers-0.045
hes-0.045
ighters-0.042
 pans-0.041
Token family
Token ID1641
Feature activation+1.66e-02

Pos logit contributions
 course+0.041
 yeah+0.027
Sure+0.027
 wink+0.025
 reminder+0.025
Neg logit contributions
allows-0.053
hers-0.048
 pans-0.045
hes-0.045
ages-0.043
Token "
Token ID366
Feature activation+3.79e-02

Pos logit contributions
 course+0.015
 reminder+0.010
 amaz+0.009
 wink+0.009
 yeah+0.009
Neg logit contributions
allows-0.024
hes-0.022
hers-0.021
ighters-0.020
 pans-0.020
TokenWhy
Token ID5195
Feature activation-2.09e-02

Pos logit contributions
allows+0.093
hes+0.085
hers+0.085
ighters+0.080
 pans+0.080
Neg logit contributions
 course-0.061
 reminder-0.039
 yeah-0.038
 friendship-0.038
 amaz-0.037
Token are
Token ID389
Feature activation-5.95e-02

Pos logit contributions
 course+0.031
 wink+0.018
 yeah+0.018
 friendship+0.018
 reminder+0.018
Neg logit contributions
allows-0.054
hers-0.050
hes-0.049
ighters-0.048
 pans-0.048
Token you
Token ID345
Feature activation-2.74e-02

Pos logit contributions
 course+0.070
 yeah+0.029
 friendship+0.026
Sure+0.025
 wink+0.024
Neg logit contributions
allows-0.183
hes-0.171
hers-0.169
ained-0.163
ighters-0.163
Token qu
Token ID627
Feature activation+8.89e-02

Pos logit contributions
 course+0.050
 wink+0.035
 yeah+0.033
 smiled+0.032
 be+0.031
Neg logit contributions
allows-0.059
ighters-0.056
hes-0.056
ages-0.053
 pans-0.053
Tokenarre
Token ID9624
Feature activation+2.00e-01

Pos logit contributions
allows+0.141
ighters+0.136
 pans+0.135
ometers+0.129
hers+0.127
Neg logit contributions
 course-0.213
of-0.162
est-0.148
 wink-0.147
ad-0.142
Tokenling
Token ID1359
Feature activation+8.45e-02

Pos logit contributions
allows+0.349
ighters+0.338
ometers+0.328
hers+0.316
hes+0.312
Neg logit contributions
 course-0.445
est-0.329
 smiled-0.302
of-0.302
 friendship-0.299
Token?
Token ID30
Feature activation-4.36e-02

Pos logit contributions
allows+0.196
hes+0.175
hers+0.174
ighters+0.173
angers+0.163
Neg logit contributions
 course-0.154
 wink-0.100
 friendship-0.099
 reminder-0.098
of-0.093
Token You
Token ID921
Feature activation-1.90e-02

Pos logit contributions
 course+0.070
Sure+0.043
 yeah+0.043
 wink+0.042
 reminder+0.042
Neg logit contributions
allows-0.108
hers-0.099
hes-0.096
ighters-0.092
 pans-0.092
Token can
Token ID460
Feature activation-3.24e-02

Pos logit contributions
 course+0.028
 wink+0.017
 believe+0.015
 yeah+0.014
 friendship+0.014
Neg logit contributions
allows-0.052
hers-0.047
hes-0.046
 pans-0.046
angers-0.045
Token share
Token ID2648
Feature activation-2.43e-02

Pos logit contributions
 course+0.061
 reminder+0.041
 yeah+0.040
 wink+0.040
 friendship+0.040
Neg logit contributions
allows-0.072
hers-0.065
hes-0.064
ighters-0.061
 pans-0.060
Token but
Token ID475
Feature activation-5.75e-02

Pos logit contributions
 course+0.003
 yeah+0.002
 reminder+0.002
Sure+0.002
 wink+0.002
Neg logit contributions
allows-0.004
hers-0.004
hes-0.003
 pans-0.003
ained-0.003
Token today
Token ID1909
Feature activation+3.10e-02

Pos logit contributions
 course+0.111
 yeah+0.074
Sure+0.074
 reminder+0.074
of+0.073
Neg logit contributions
allows-0.118
hers-0.106
hes-0.102
ained-0.098
ighters-0.096
Token they
Token ID484
Feature activation+4.49e-02

Pos logit contributions
allows+0.056
hers+0.049
hes+0.049
ighters+0.046
ained+0.045
Neg logit contributions
 course-0.067
 yeah-0.049
 reminder-0.049
Sure-0.049
 amaz-0.048
Token were
Token ID547
Feature activation-6.42e-03

Pos logit contributions
allows+0.115
hes+0.105
hers+0.104
ighters+0.100
 pans+0.099
Neg logit contributions
 course-0.069
 reminder-0.042
 wink-0.041
 friendship-0.040
 yeah-0.039
Token qu
Token ID627
Feature activation+7.98e-02

Pos logit contributions
 course+0.011
 wink+0.008
 friendship+0.008
 smiled+0.008
 reminder+0.008
Neg logit contributions
allows-0.013
agons-0.012
 pans-0.012
angers-0.012
ages-0.012
Tokenarre
Token ID9624
Feature activation+2.00e-01

Pos logit contributions
allows+0.124
 pans+0.121
ometers+0.114
ighters+0.109
hes+0.109
Neg logit contributions
 course-0.185
of-0.141
est-0.138
bird-0.129
 wink-0.128
Tokenling
Token ID1359
Feature activation+8.04e-02

Pos logit contributions
allows+0.348
ometers+0.323
ighters+0.323
hes+0.317
angers+0.310
Neg logit contributions
 course-0.426
est-0.327
 smiled-0.302
 cared-0.292
 believe-0.287
Token.
Token ID13
Feature activation-6.17e-03

Pos logit contributions
allows+0.180
hers+0.163
hes+0.162
ighters+0.153
 pans+0.150
Neg logit contributions
 course-0.148
 reminder-0.099
 yeah-0.097
 wink-0.096
 friendship-0.095
Token Jenna
Token ID45377
Feature activation+2.79e-02

Pos logit contributions
 course+0.011
 reminder+0.007
 yeah+0.007
 wink+0.007
 friendship+0.007
Neg logit contributions
allows-0.014
hers-0.013
hes-0.013
ighters-0.012
 pans-0.012
Token wanted
Token ID2227
Feature activation+2.67e-02

Pos logit contributions
allows+0.068
hers+0.063
hes+0.062
ighters+0.059
 pans+0.058
Neg logit contributions
 course-0.045
 reminder-0.028
 yeah-0.028
 friendship-0.027
 wink-0.027
Token to
Token ID284
Feature activation-7.49e-03

Pos logit contributions
allows+0.049
hers+0.044
hes+0.044
ighters+0.040
 pans+0.040
Neg logit contributions
 course-0.056
 reminder-0.042
 yeah-0.042
 amaz-0.042
 wink-0.039
Token the
Token ID262
Feature activation+1.39e-02

Pos logit contributions
 course+0.052
 wink+0.029
 friendship+0.029
 amaz+0.028
 yeah+0.028
Neg logit contributions
allows-0.078
hers-0.071
ained-0.070
ages-0.070
hes-0.069
Token mess
Token ID2085
Feature activation+9.81e-02

Pos logit contributions
allows+0.029
ighters+0.026
ained+0.026
hes+0.025
hers+0.025
Neg logit contributions
 course-0.027
 wink-0.019
 reminder-0.019
 friendship-0.018
 smiled-0.018
Token and
Token ID290
Feature activation+6.53e-03

Pos logit contributions
allows+0.258
hers+0.237
ighters+0.231
hes+0.230
ained+0.222
Neg logit contributions
 course-0.142
 wink-0.075
 reminder-0.073
 friendship-0.071
 smiled-0.069
Token the
Token ID262
Feature activation+1.59e-02

Pos logit contributions
allows+0.015
hers+0.014
hes+0.014
ighters+0.013
 pans+0.013
Neg logit contributions
 course-0.012
 reminder-0.008
 yeah-0.008
 wink-0.007
 friendship-0.007
Token qu
Token ID627
Feature activation+8.72e-02

Pos logit contributions
allows+0.031
hes+0.028
hers+0.027
ighters+0.027
ained+0.027
Neg logit contributions
 course-0.033
 reminder-0.024
 friendship-0.024
 wink-0.024
 smiled-0.022
Tokenarre
Token ID9624
Feature activation+2.00e-01

Pos logit contributions
allows+0.139
ighters+0.136
hers+0.132
nam+0.132
 pans+0.131
Neg logit contributions
 course-0.182
of-0.137
est-0.132
 circle-0.123
ad-0.121
Tokenling
Token ID1359
Feature activation+8.29e-02

Pos logit contributions
allows+0.329
ighters+0.297
hers+0.295
ometers+0.291
angers+0.285
Neg logit contributions
 course-0.464
est-0.342
 reminder-0.328
 wink-0.322
 friendship-0.321
Token twins
Token ID20345
Feature activation+8.83e-02

Pos logit contributions
allows+0.184
hers+0.169
hes+0.165
ighters+0.163
ained+0.154
Neg logit contributions
 course-0.153
 reminder-0.106
 friendship-0.103
 wink-0.102
 yeah-0.095
Token.
Token ID13
Feature activation+1.18e-03

Pos logit contributions
allows+0.215
hers+0.196
hes+0.193
ighters+0.186
 pans+0.183
Neg logit contributions
 course-0.146
 reminder-0.091
 wink-0.089
 friendship-0.087
 yeah-0.086
Token She
Token ID1375
Feature activation+8.62e-03

Pos logit contributions
allows+0.003
hers+0.003
hes+0.003
ighters+0.003
ained+0.003
Neg logit contributions
 course-0.002
Sure-0.001
 wink-0.001
 friendship-0.001
 yeah-0.001
Token is
Token ID318
Feature activation-5.02e-02

Pos logit contributions
allows+0.023
hes+0.021
hers+0.021
 pans+0.020
ighters+0.020
Neg logit contributions
 course-0.012
 reminder-0.008
 yeah-0.007
 friendship-0.007
Sure-0.006
Token hands
Token ID2832
Feature activation-1.88e-02

Pos logit contributions
allows+0.018
hes+0.015
hers+0.015
 pans+0.014
ighters+0.014
Neg logit contributions
 course-0.023
 reminder-0.018
 yeah-0.018
 wink-0.017
Sure-0.017
Token and
Token ID290
Feature activation-1.30e-03

Pos logit contributions
 course+0.036
 yeah+0.025
 reminder+0.024
 friendship+0.023
Sure+0.023
Neg logit contributions
allows-0.041
hers-0.037
hes-0.036
ighters-0.033
 pans-0.033
Token they
Token ID484
Feature activation+4.85e-02

Pos logit contributions
 course+0.003
 yeah+0.002
of+0.002
Sure+0.002
 reminder+0.002
Neg logit contributions
allows-0.002
hes-0.002
hers-0.002
 pans-0.002
ages-0.002
Token never
Token ID1239
Feature activation-2.24e-02

Pos logit contributions
allows+0.105
hers+0.095
hes+0.095
ighters+0.089
 pans+0.087
Neg logit contributions
 course-0.093
 reminder-0.063
 yeah-0.062
 friendship-0.061
 wink-0.061
Token qu
Token ID627
Feature activation+9.07e-02

Pos logit contributions
 course+0.050
 yeah+0.034
Sure+0.034
 amaz+0.032
ement+0.032
Neg logit contributions
allows-0.037
hes-0.033
hers-0.033
inters-0.032
ained-0.032
Tokenarre
Token ID9624
Feature activation+2.00e-01

Pos logit contributions
allows+0.136
 pans+0.136
ighters+0.117
hes+0.116
hers+0.116
Neg logit contributions
 course-0.221
of-0.170
est-0.162
 wink-0.159
 friendship-0.158
Tokenled
Token ID992
Feature activation+8.40e-02

Pos logit contributions
allows+0.324
hes+0.283
ighters+0.283
hers+0.279
 pans+0.269
Neg logit contributions
 course-0.479
 friendship-0.353
est-0.351
 reminder-0.348
 smiled-0.346
Token again
Token ID757
Feature activation-7.15e-02

Pos logit contributions
allows+0.221
hers+0.203
hes+0.201
ighters+0.193
 pans+0.187
Neg logit contributions
 course-0.121
 yeah-0.068
 reminder-0.066
Sure-0.064
 wink-0.064
Token.
Token ID13
Feature activation-3.19e-03

Pos logit contributions
 course+0.125
 reminder+0.080
 yeah+0.079
 wink+0.078
 friendship+0.077
Neg logit contributions
allows-0.168
hers-0.152
hes-0.151
ighters-0.144
 pans-0.141
Token<|endoftext|>
Token ID50256
Feature activation+9.08e-02

Pos logit contributions
 course+0.006
 wink+0.004
 yeah+0.004
 reminder+0.004
 friendship+0.004
Neg logit contributions
allows-0.006
hes-0.006
hers-0.006
ighters-0.005
ained-0.005
TokenOnce
Token ID7454
Feature activation+5.62e-02

Pos logit contributions
allows+0.226
hers+0.206
hes+0.204
ighters+0.196
 pans+0.193
Neg logit contributions
 course-0.146
 reminder-0.089
 wink-0.087
 friendship-0.085
 yeah-0.085
Token Finally
Token ID9461
Feature activation+2.84e-02

Pos logit contributions
 course+0.008
 reminder+0.006
 yeah+0.006
 friendship+0.006
Sure+0.006
Neg logit contributions
allows-0.010
hers-0.009
hes-0.009
 pans-0.009
ighters-0.009
Token,
Token ID11
Feature activation-6.73e-03

Pos logit contributions
allows+0.067
hes+0.061
hers+0.059
 pans+0.057
ighters+0.057
Neg logit contributions
 course-0.044
of-0.029
Sure-0.029
 reminder-0.029
 yeah-0.028
Token after
Token ID706
Feature activation+3.29e-02

Pos logit contributions
 course+0.013
 reminder+0.008
 wink+0.008
 friendship+0.008
 yeah+0.008
Neg logit contributions
allows-0.015
hers-0.014
hes-0.014
ighters-0.013
 pans-0.013
Token much
Token ID881
Feature activation+5.89e-02

Pos logit contributions
allows+0.073
hers+0.066
hes+0.066
ighters+0.063
 pans+0.061
Neg logit contributions
 course-0.063
 reminder-0.042
 friendship-0.041
 wink-0.041
 yeah-0.039
Token qu
Token ID627
Feature activation+8.25e-02

Pos logit contributions
allows+0.130
hes+0.117
hers+0.117
ighters+0.110
 pans+0.110
Neg logit contributions
 course-0.106
 yeah-0.073
 reminder-0.072
Sure-0.072
 wink-0.069
Tokenarre
Token ID9624
Feature activation+2.00e-01

Pos logit contributions
allows+0.139
 pans+0.130
ighters+0.129
hers+0.127
hes+0.125
Neg logit contributions
 course-0.191
of-0.140
est-0.135
 wink-0.132
 friendship-0.129
Tokenling
Token ID1359
Feature activation+7.92e-02

Pos logit contributions
allows+0.331
ighters+0.300
hes+0.297
hers+0.289
angers+0.289
Neg logit contributions
 course-0.481
est-0.348
 smiled-0.341
 friendship-0.336
 wink-0.331
Token,
Token ID11
Feature activation-7.64e-03

Pos logit contributions
allows+0.200
hers+0.184
hes+0.182
ighters+0.173
 pans+0.170
Neg logit contributions
 course-0.121
 reminder-0.074
 yeah-0.073
 wink-0.070
Sure-0.069
Token one
Token ID530
Feature activation-7.26e-02

Pos logit contributions
 course+0.013
 friendship+0.009
 reminder+0.009
 wink+0.009
 yeah+0.009
Neg logit contributions
allows-0.018
hers-0.016
hes-0.016
ighters-0.015
 pans-0.015
Token of
Token ID286
Feature activation+4.49e-02

Pos logit contributions
 course+0.138
 yeah+0.103
Sure+0.099
Of+0.094
 amaz+0.091
Neg logit contributions
allows-0.130
hes-0.121
hers-0.102
 pans-0.100
ores-0.099
Token the
Token ID262
Feature activation+1.00e-02

Pos logit contributions
allows+0.080
hes+0.073
hers+0.072
ighters+0.065
 pans+0.064
Neg logit contributions
 course-0.091
 reminder-0.076
 yeah-0.074
Sure-0.071
of-0.070
Token says
Token ID1139
Feature activation-8.92e-03

Pos logit contributions
allows+0.024
hes+0.023
hers+0.022
ighters+0.021
 pans+0.021
Neg logit contributions
 course-0.018
 reminder-0.014
 yeah-0.012
 friendship-0.011
bye-0.011
Token,
Token ID11
Feature activation-1.16e-03

Pos logit contributions
 course+0.009
 yeah+0.004
 friendship+0.003
 reminder+0.003
 wink+0.003
Neg logit contributions
allows-0.028
hers-0.027
hes-0.026
ained-0.025
ighters-0.025
Token "
Token ID366
Feature activation+3.78e-02

Pos logit contributions
 course+0.002
 reminder+0.001
 yeah+0.001
 wink+0.001
Sure+0.001
Neg logit contributions
allows-0.003
hers-0.003
hes-0.003
ighters-0.003
ained-0.002
TokenStop
Token ID19485
Feature activation-3.44e-02

Pos logit contributions
allows+0.079
hes+0.076
hers+0.075
 pans+0.071
ighters+0.069
Neg logit contributions
 course-0.066
 reminder-0.049
 amaz-0.046
 yeah-0.046
 friendship-0.046
Token qu
Token ID627
Feature activation+8.71e-02

Pos logit contributions
 course+0.061
 reminder+0.039
 yeah+0.039
 wink+0.038
 friendship+0.038
Neg logit contributions
allows-0.080
hes-0.073
hers-0.073
ighters-0.069
 pans-0.068
Tokenarre
Token ID9624
Feature activation+2.00e-01

Pos logit contributions
allows+0.153
hers+0.139
ighters+0.134
 pans+0.133
hes+0.129
Neg logit contributions
 course-0.201
of-0.149
 wink-0.141
Sure-0.136
 friendship-0.132
Tokenling
Token ID1359
Feature activation+8.28e-02

Pos logit contributions
allows+0.339
hers+0.309
ighters+0.307
ometers+0.300
hes+0.297
Neg logit contributions
 course-0.464
est-0.333
 friendship-0.320
of-0.318
 smiled-0.317
Token,
Token ID11
Feature activation-2.33e-03

Pos logit contributions
allows+0.223
hers+0.205
hes+0.205
ighters+0.198
 pans+0.193
Neg logit contributions
 course-0.119
 friendship-0.063
 reminder-0.063
 wink-0.062
of-0.060
Token Tom
Token ID4186
Feature activation-6.82e-03

Pos logit contributions
 course+0.005
 reminder+0.003
 amaz+0.003
 friendship+0.003
 wink+0.003
Neg logit contributions
allows-0.005
hers-0.004
hes-0.004
 pans-0.004
ighters-0.004
Token and
Token ID290
Feature activation+3.32e-03

Pos logit contributions
 course+0.009
 reminder+0.005
 wink+0.005
 friendship+0.005
 yeah+0.005
Neg logit contributions
allows-0.019
hers-0.017
hes-0.017
ighters-0.017
 pans-0.016
Token Lily
Token ID20037
Feature activation-9.47e-03

Pos logit contributions
allows+0.008
hers+0.008
hes+0.007
ighters+0.007
ained+0.007
Neg logit contributions
 course-0.006
 yeah-0.003
 friendship-0.003
 wink-0.003
 reminder-0.003
Token she
Token ID673
Feature activation+1.50e-02

Pos logit contributions
allows+0.064
hers+0.058
hes+0.058
ighters+0.055
 pans+0.055
Neg logit contributions
 course-0.045
 reminder-0.029
 friendship-0.029
 yeah-0.029
 wink-0.028
Token was
Token ID373
Feature activation-3.60e-02

Pos logit contributions
allows+0.037
hes+0.034
hers+0.034
ighters+0.032
ages+0.031
Neg logit contributions
 course-0.024
 reminder-0.015
 amaz-0.015
 yeah-0.015
Sure-0.013
Token sorry
Token ID7926
Feature activation+3.94e-02

Pos logit contributions
 course+0.073
 reminder+0.050
 friendship+0.049
 wink+0.049
 yeah+0.047
Neg logit contributions
allows-0.077
 pans-0.067
hers-0.067
angers-0.066
ighters-0.066
Token for
Token ID329
Feature activation+5.92e-02

Pos logit contributions
allows+0.083
hes+0.074
hers+0.074
ighters+0.069
 pans+0.067
Neg logit contributions
 course-0.077
 amaz-0.052
 wink-0.051
 yeah-0.051
 reminder-0.051
Token qu
Token ID627
Feature activation+8.42e-02

Pos logit contributions
allows+0.128
hers+0.115
hes+0.115
ighters+0.108
 pans+0.106
Neg logit contributions
 course-0.115
 reminder-0.079
 yeah-0.077
 wink-0.076
 friendship-0.076
Tokenarre
Token ID9624
Feature activation+2.00e-01

Pos logit contributions
allows+0.122
 pans+0.115
hers+0.107
ighters+0.107
hes+0.106
Neg logit contributions
 course-0.210
of-0.162
est-0.154
 friendship-0.152
Sure-0.150
Tokenling
Token ID1359
Feature activation+7.91e-02

Pos logit contributions
allows+0.317
ighters+0.277
hers+0.275
hes+0.275
angers+0.260
Neg logit contributions
 course-0.488
est-0.360
 friendship-0.358
 reminder-0.358
 smiled-0.346
Token.
Token ID13
Feature activation-6.82e-04

Pos logit contributions
allows+0.182
hers+0.164
hes+0.164
ighters+0.154
 pans+0.152
Neg logit contributions
 course-0.139
 reminder-0.092
 yeah-0.091
 wink-0.089
 friendship-0.087
Token From
Token ID3574
Feature activation+1.44e-02

Pos logit contributions
 course+0.001
 reminder+0.001
 wink+0.001
 yeah+0.001
 friendship+0.001
Neg logit contributions
allows-0.001
hes-0.001
hers-0.001
ighters-0.001
ained-0.001
Token that
Token ID326
Feature activation+1.11e-02

Pos logit contributions
allows+0.034
hers+0.032
hes+0.031
 pans+0.029
ighters+0.029
Neg logit contributions
 course-0.023
 reminder-0.016
 yeah-0.015
 friendship-0.015
 amaz-0.015
Token day
Token ID1110
Feature activation-2.54e-02

Pos logit contributions
allows+0.023
hes+0.020
hers+0.019
ained+0.019
 pans+0.018
Neg logit contributions
 course-0.020
 yeah-0.015
 amaz-0.015
 wink-0.014
of-0.014
Token\n
Token ID198
Feature activation-1.04e-02

Pos logit contributions
 course+0.020
 friendship+0.014
 wink+0.014
 reminder+0.014
Sure+0.013
Neg logit contributions
allows-0.023
hers-0.021
hes-0.020
 pans-0.019
ighters-0.019
TokenThey
Token ID2990
Feature activation+2.37e-02

Pos logit contributions
 course+0.022
 reminder+0.015
Sure+0.015
 friendship+0.015
 wink+0.015
Neg logit contributions
allows-0.022
hers-0.019
hes-0.019
ighters-0.018
 pans-0.017
Token learn
Token ID2193
Feature activation-7.89e-04

Pos logit contributions
allows+0.057
hers+0.052
hes+0.052
ighters+0.050
 pans+0.048
Neg logit contributions
 course-0.039
 reminder-0.025
 yeah-0.024
 friendship-0.024
 wink-0.023
Token that
Token ID326
Feature activation+1.38e-02

Pos logit contributions
 course+0.001
 reminder+0.001
 yeah+0.001
 wink+0.001
 amaz+0.001
Neg logit contributions
allows-0.002
hes-0.002
hers-0.002
ighters-0.002
 pans-0.001
Token qu
Token ID627
Feature activation+8.47e-02

Pos logit contributions
allows+0.029
hes+0.025
hers+0.025
ighters+0.024
 pans+0.023
Neg logit contributions
 course-0.028
 yeah-0.020
 reminder-0.019
of-0.019
Sure-0.019
Tokenarre
Token ID9624
Feature activation+1.99e-01

Pos logit contributions
allows+0.134
 pans+0.121
hers+0.115
hes+0.108
ighters+0.106
Neg logit contributions
 course-0.212
of-0.152
 friendship-0.152
est-0.143
 wink-0.138
Tokenling
Token ID1359
Feature activation+8.17e-02

Pos logit contributions
allows+0.315
hers+0.273
hes+0.264
ighters+0.255
 pans+0.242
Neg logit contributions
 course-0.503
 friendship-0.371
 reminder-0.361
 wink-0.357
of-0.357
Token is
Token ID318
Feature activation-5.39e-02

Pos logit contributions
allows+0.220
hes+0.205
hers+0.203
ighters+0.195
 pans+0.190
Neg logit contributions
 course-0.110
 reminder-0.064
 yeah-0.063
 wink-0.059
 amaz-0.059
Token bad
Token ID2089
Feature activation+2.30e-02

Pos logit contributions
 course+0.095
 friendship+0.069
 reminder+0.065
 wink+0.060
Sure+0.056
Neg logit contributions
allows-0.128
angers-0.114
ighters-0.111
hers-0.110
 pans-0.110
Token and
Token ID290
Feature activation+3.53e-03

Pos logit contributions
allows+0.049
hes+0.045
hers+0.043
ighters+0.042
ages+0.041
Neg logit contributions
 course-0.043
 amaz-0.031
 yeah-0.031
 reminder-0.030
 wink-0.030
Token sharing
Token ID7373
Feature activation+2.03e-02

Pos logit contributions
allows+0.008
hers+0.007
hes+0.007
ighters+0.007
angers+0.006
Neg logit contributions
 course-0.007
 friendship-0.005
 reminder-0.005
 wink-0.004
 yeah-0.004
Token on
Token ID319
Feature activation+1.98e-02

Pos logit contributions
 course+0.072
 reminder+0.054
 yeah+0.053
 friendship+0.053
 wink+0.053
Neg logit contributions
allows-0.046
hers-0.041
hes-0.040
ighters-0.037
 pans-0.036
Token,
Token ID11
Feature activation-9.45e-03

Pos logit contributions
allows+0.054
hers+0.050
hes+0.049
ighters+0.047
 pans+0.047
Neg logit contributions
 course-0.027
 reminder-0.015
 wink-0.014
 yeah-0.014
 amaz-0.014
Token they
Token ID484
Feature activation+4.68e-02

Pos logit contributions
 course+0.019
 reminder+0.013
 yeah+0.013
 wink+0.013
 friendship+0.013
Neg logit contributions
allows-0.020
hers-0.018
hes-0.018
ighters-0.016
 pans-0.016
Token never
Token ID1239
Feature activation-2.57e-02

Pos logit contributions
allows+0.105
hers+0.095
hes+0.095
ighters+0.090
 pans+0.087
Neg logit contributions
 course-0.086
 reminder-0.058
 yeah-0.057
 wink-0.056
 friendship-0.056
Token qu
Token ID627
Feature activation+8.41e-02

Pos logit contributions
 course+0.054
 yeah+0.037
 reminder+0.036
 amaz+0.036
ement+0.035
Neg logit contributions
allows-0.047
hes-0.043
hers-0.041
inters-0.040
ained-0.040
Tokenarre
Token ID9624
Feature activation+1.99e-01

Pos logit contributions
 pans+0.127
allows+0.125
hers+0.114
ighters+0.108
hes+0.104
Neg logit contributions
 course-0.199
of-0.157
est-0.145
 friendship-0.144
Sure-0.141
Tokenled
Token ID992
Feature activation+8.12e-02

Pos logit contributions
allows+0.314
hers+0.273
ighters+0.265
hes+0.264
 pans+0.254
Neg logit contributions
 course-0.494
 friendship-0.365
est-0.358
 smiled-0.355
 reminder-0.353
Token again
Token ID757
Feature activation-7.50e-02

Pos logit contributions
allows+0.203
hers+0.187
hes+0.186
ighters+0.178
 pans+0.172
Neg logit contributions
 course-0.127
 reminder-0.077
 yeah-0.076
 amaz-0.071
 wink-0.071
Token and
Token ID290
Feature activation-5.60e-03

Pos logit contributions
 course+0.131
 yeah+0.083
 reminder+0.082
Sure+0.081
 wink+0.079
Neg logit contributions
allows-0.179
hers-0.161
hes-0.160
ighters-0.152
 pans-0.149
Token played
Token ID2826
Feature activation+2.39e-02

Pos logit contributions
 course+0.012
 reminder+0.008
 yeah+0.008
 amaz+0.008
Sure+0.008
Neg logit contributions
allows-0.010
hes-0.010
hers-0.009
ained-0.009
ighters-0.009
Token together
Token ID1978
Feature activation+2.85e-02

Pos logit contributions
allows+0.049
hers+0.044
hes+0.043
ighters+0.041
 pans+0.040
Neg logit contributions
 course-0.049
 reminder-0.034
 yeah-0.034
 friendship-0.034
 wink-0.033
Token,
Token ID11
Feature activation-6.29e-03

Pos logit contributions
allows+0.001
hes+0.001
hers+0.001
ighters+0.001
 pans+0.001
Neg logit contributions
 course-0.000
 reminder-0.000
 yeah-0.000
 friendship-0.000
 wink-0.000
Token "
Token ID366
Feature activation+2.45e-02

Pos logit contributions
 course+0.010
 reminder+0.007
 wink+0.007
 yeah+0.006
 friendship+0.006
Neg logit contributions
allows-0.015
hers-0.014
hes-0.014
ighters-0.013
 pans-0.013
TokenYes
Token ID5297
Feature activation-1.72e-01

Pos logit contributions
allows+0.050
hers+0.047
hes+0.047
 pans+0.044
ained+0.043
Neg logit contributions
 course-0.041
 friendship-0.030
 amaz-0.029
 yeah-0.028
 statue-0.028
Token,
Token ID11
Feature activation-8.83e-03

Pos logit contributions
 course+0.257
 reminder+0.154
 amaz+0.150
 friendship+0.149
 wink+0.145
Neg logit contributions
allows-0.436
hers-0.389
hes-0.377
 pans-0.367
ighters-0.364
Token of
Token ID286
Feature activation+3.81e-02

Pos logit contributions
 course+0.015
 reminder+0.010
 wink+0.009
 friendship+0.009
 yeah+0.009
Neg logit contributions
allows-0.021
hers-0.018
hes-0.018
ighters-0.017
 pans-0.017
Token course
Token ID1781
Feature activation-1.91e-01

Pos logit contributions
allows+0.125
hers+0.120
hes+0.116
 pans+0.111
ics+0.104
Neg logit contributions
 course-0.007
Sure-0.004
 reminder-0.003
found-0.000
 Sure-0.000
Token but
Token ID475
Feature activation-5.77e-02

Pos logit contributions
 course+0.329
 reminder+0.213
 yeah+0.205
 wink+0.204
 friendship+0.203
Neg logit contributions
allows-0.449
hers-0.407
hes-0.403
ighters-0.383
 pans-0.380
Token you
Token ID345
Feature activation-3.41e-02

Pos logit contributions
 course+0.100
 amaz+0.064
 reminder+0.064
Sure+0.062
 yeah+0.062
Neg logit contributions
allows-0.127
hers-0.115
hes-0.113
 pans-0.107
ighters-0.107
Token have
Token ID423
Feature activation-1.91e-02

Pos logit contributions
 course+0.054
 wink+0.035
 believe+0.033
 reminder+0.032
 friendship+0.031
Neg logit contributions
allows-0.088
hers-0.080
hes-0.080
 pans-0.078
angers-0.078
Token to
Token ID284
Feature activation-6.03e-03

Pos logit contributions
 course+0.035
 reminder+0.023
 friendship+0.023
 wink+0.022
 yeah+0.022
Neg logit contributions
allows-0.044
hers-0.039
hes-0.039
ighters-0.037
 pans-0.036
Token be
Token ID307
Feature activation-7.61e-02

Pos logit contributions
 course+0.009
 reminder+0.006
 yeah+0.005
 friendship+0.005
 wink+0.005
Neg logit contributions
allows-0.015
hers-0.014
hes-0.014
ighters-0.013
 pans-0.013
Token and
Token ID290
Feature activation+8.48e-04

Pos logit contributions
 course+0.044
 reminder+0.027
 yeah+0.026
 friendship+0.025
 wink+0.025
Neg logit contributions
allows-0.072
hers-0.065
hes-0.065
ighters-0.062
 pans-0.061
Token said
Token ID531
Feature activation+2.40e-02

Pos logit contributions
hes+0.002
allows+0.002
ages+0.002
hers+0.002
 pans+0.001
Neg logit contributions
 course-0.002
 reminder-0.001
 yeah-0.001
 friendship-0.001
of-0.001
Token,
Token ID11
Feature activation-8.84e-04

Pos logit contributions
allows+0.061
hes+0.056
hers+0.055
ighters+0.053
ages+0.052
Neg logit contributions
 course-0.037
 reminder-0.021
 amaz-0.021
 friendship-0.020
 yeah-0.019
Token "
Token ID366
Feature activation+2.76e-02

Pos logit contributions
 course+0.002
 reminder+0.001
 yeah+0.001
 wink+0.001
 friendship+0.001
Neg logit contributions
allows-0.002
hers-0.002
hes-0.002
ighters-0.002
 pans-0.002
TokenOf
Token ID5189
Feature activation-1.70e-01

Pos logit contributions
hes+0.052
allows+0.051
ained+0.048
hers+0.047
 pans+0.046
Neg logit contributions
 course-0.047
 amaz-0.036
 friendship-0.035
 statue-0.031
 yeah-0.031
Token course
Token ID1781
Feature activation-1.90e-01

Pos logit contributions
 amaz+0.044
 Sure+0.032
 Nice+0.028
 reminder+0.023
found+0.022
Neg logit contributions
allows-0.496
hers-0.467
hes-0.467
agons-0.450
ards-0.443
Token!"
Token ID2474
Feature activation+4.08e-02

Pos logit contributions
 course+0.326
 amaz+0.225
 reminder+0.213
Sure+0.207
 friendship+0.203
Neg logit contributions
allows-0.422
hers-0.371
hes-0.365
 pans-0.361
ages-0.345
Token
Token ID220
Feature activation+3.26e-02

Pos logit contributions
allows+0.089
hes+0.082
hers+0.080
ighters+0.075
 pans+0.074
Neg logit contributions
 course-0.077
 reminder-0.052
 yeah-0.051
 friendship-0.050
 amaz-0.050
Token\n
Token ID198
Feature activation-1.98e-02

Pos logit contributions
allows+0.094
hers+0.091
 pans+0.088
hes+0.084
ighters+0.083
Neg logit contributions
 course-0.032
Sure-0.016
of-0.016
 yeah-0.014
 wink-0.014
Token\n
Token ID198
Feature activation-2.04e-02

Pos logit contributions
 course+0.032
 reminder+0.020
 yeah+0.019
 wink+0.019
 friendship+0.019
Neg logit contributions
allows-0.049
hers-0.045
hes-0.045
ighters-0.043
 pans-0.042
TokenThey
Token ID2990
Feature activation+1.92e-02

Pos logit contributions
 course+0.039
 reminder+0.026
 yeah+0.025
 wink+0.025
 friendship+0.025
Neg logit contributions
allows-0.045
hers-0.041
hes-0.041
ighters-0.039
 pans-0.038
Token,
Token ID11
Feature activation-1.45e-03

Pos logit contributions
allows+0.056
hes+0.051
hers+0.050
ighters+0.047
ages+0.046
Neg logit contributions
 course-0.035
 reminder-0.021
 amaz-0.020
 friendship-0.020
 yeah-0.020
Token "
Token ID366
Feature activation+2.28e-02

Pos logit contributions
 course+0.002
 reminder+0.002
 wink+0.002
 yeah+0.002
 friendship+0.002
Neg logit contributions
allows-0.003
hers-0.003
hes-0.003
ighters-0.003
 pans-0.003
TokenYes
Token ID5297
Feature activation-1.69e-01

Pos logit contributions
allows+0.044
hes+0.043
hers+0.042
ained+0.038
 pans+0.038
Neg logit contributions
 course-0.039
 friendship-0.030
 amaz-0.030
 yeah-0.028
 wink-0.027
Token,
Token ID11
Feature activation-4.98e-03

Pos logit contributions
 course+0.252
 reminder+0.151
 amaz+0.145
 friendship+0.144
 wink+0.143
Neg logit contributions
allows-0.432
hers-0.388
hes-0.380
ighters-0.364
 pans-0.364
Token of
Token ID286
Feature activation+3.69e-02

Pos logit contributions
 course+0.008
 reminder+0.005
 yeah+0.005
 friendship+0.005
 wink+0.005
Neg logit contributions
allows-0.012
hers-0.011
hes-0.011
ighters-0.010
 pans-0.010
Token course
Token ID1781
Feature activation-1.89e-01

Pos logit contributions
allows+0.123
hers+0.118
hes+0.117
 pans+0.110
ores+0.103
Neg logit contributions
Sure-0.001
 course-0.001
 reminder+0.000
found+0.001
Of+0.001
Token!
Token ID0
Feature activation-2.18e-02

Pos logit contributions
 course+0.350
 reminder+0.237
 yeah+0.228
 wink+0.227
 friendship+0.226
Neg logit contributions
allows-0.421
hers-0.377
hes-0.374
ighters-0.354
 pans-0.352
Token I
Token ID314
Feature activation-9.70e-02

Pos logit contributions
 course+0.036
 reminder+0.023
 friendship+0.022
 wink+0.022
 yeah+0.021
Neg logit contributions
allows-0.053
hes-0.048
hers-0.048
ighters-0.046
 pans-0.044
Token love
Token ID1842
Feature activation-4.13e-02

Pos logit contributions
 course+0.174
 yeah+0.114
 reminder+0.112
 friendship+0.111
 amaz+0.109
Neg logit contributions
allows-0.221
hes-0.200
hers-0.198
ighters-0.188
 pans-0.179
Token to
Token ID284
Feature activation-8.72e-03

Pos logit contributions
 course+0.078
 friendship+0.050
 reminder+0.050
 wink+0.048
 yeah+0.048
Neg logit contributions
allows-0.093
hers-0.085
hes-0.085
ighters-0.082
ained-0.078
Token spin
Token ID7906
Feature activation-3.06e-02

Pos logit contributions
 course+0.015
 reminder+0.010
 yeah+0.010
 friendship+0.010
Sure+0.010
Neg logit contributions
allows-0.020
hers-0.018
hes-0.018
ighters-0.017
 pans-0.017
Token and
Token ID290
Feature activation-7.42e-03

Pos logit contributions
 course+0.048
 yeah+0.033
Sure+0.032
of+0.030
 reminder+0.030
Neg logit contributions
allows-0.047
hes-0.039
hers-0.038
 pans-0.036
ages-0.035
Token said
Token ID531
Feature activation+2.36e-02

Pos logit contributions
 course+0.013
 reminder+0.008
 yeah+0.008
 friendship+0.008
 amaz+0.008
Neg logit contributions
allows-0.017
hes-0.016
hers-0.015
 pans-0.014
ighters-0.014
Token,
Token ID11
Feature activation-6.60e-03

Pos logit contributions
allows+0.058
hes+0.053
hers+0.052
ighters+0.049
ages+0.048
Neg logit contributions
 course-0.038
 reminder-0.022
 amaz-0.022
 yeah-0.021
 friendship-0.021
Token "
Token ID366
Feature activation+2.12e-02

Pos logit contributions
 course+0.011
 reminder+0.007
 wink+0.007
 yeah+0.007
 friendship+0.007
Neg logit contributions
allows-0.016
hers-0.014
hes-0.014
ighters-0.013
 pans-0.013
TokenOf
Token ID5189
Feature activation-1.67e-01

Pos logit contributions
allows+0.043
hes+0.042
hers+0.041
ained+0.037
 pans+0.037
Neg logit contributions
 course-0.034
 amaz-0.025
 friendship-0.025
 yeah-0.023
 wink-0.022
Token course
Token ID1781
Feature activation-1.89e-01

Pos logit contributions
 amaz+0.036
 Sure+0.027
Found+0.022
 Nice+0.018
found+0.016
Neg logit contributions
allows-0.503
hers-0.471
hes-0.462
ards-0.438
ained-0.430
Token you
Token ID345
Feature activation-3.42e-02

Pos logit contributions
 course+0.336
 amaz+0.230
 reminder+0.221
Sure+0.216
 wink+0.214
Neg logit contributions
allows-0.417
hers-0.371
hes-0.360
 pans-0.350
ages-0.340
Token can
Token ID460
Feature activation-3.61e-02

Pos logit contributions
 course+0.054
 reminder+0.033
 yeah+0.032
 wink+0.032
 friendship+0.031
Neg logit contributions
allows-0.087
hers-0.079
hes-0.079
ighters-0.075
 pans-0.074
Token.
Token ID13
Feature activation+6.24e-03

Pos logit contributions
 course+0.060
 reminder+0.038
 yeah+0.036
 wink+0.036
 friendship+0.036
Neg logit contributions
allows-0.088
hes-0.081
hers-0.081
ighters-0.077
 pans-0.075
Token Why
Token ID4162
Feature activation+3.45e-02

Pos logit contributions
allows+0.014
hes+0.013
hers+0.012
ighters+0.012
ained+0.012
Neg logit contributions
 course-0.011
 reminder-0.007
 wink-0.007
 amaz-0.007
 friendship-0.007
Token don
Token ID836
Feature activation+6.37e-02

Pos logit contributions
allows+0.081
hes+0.074
hers+0.073
ighters+0.071
 pans+0.070
Neg logit contributions
 course-0.061
 friendship-0.040
 yeah-0.039
 wink-0.039
 reminder-0.038
Token and
Token ID290
Feature activation-7.42e-03

Pos logit contributions
 course+0.048
 yeah+0.033
Sure+0.032
of+0.030
 reminder+0.030
Neg logit contributions
allows-0.047
hes-0.039
hers-0.038
 pans-0.036
ages-0.035
Token said
Token ID531
Feature activation+2.36e-02

Pos logit contributions
 course+0.013
 reminder+0.008
 yeah+0.008
 friendship+0.008
 amaz+0.008
Neg logit contributions
allows-0.017
hes-0.016
hers-0.015
 pans-0.014
ighters-0.014
Token,
Token ID11
Feature activation-6.60e-03

Pos logit contributions
allows+0.058
hes+0.053
hers+0.052
ighters+0.049
ages+0.048
Neg logit contributions
 course-0.038
 reminder-0.022
 amaz-0.022
 yeah-0.021
 friendship-0.021
Token "
Token ID366
Feature activation+2.12e-02

Pos logit contributions
 course+0.011
 reminder+0.007
 wink+0.007
 yeah+0.007
 friendship+0.007
Neg logit contributions
allows-0.016
hers-0.014
hes-0.014
ighters-0.013
 pans-0.013
TokenOf
Token ID5189
Feature activation-1.67e-01

Pos logit contributions
allows+0.043
hes+0.042
hers+0.041
ained+0.037
 pans+0.037
Neg logit contributions
 course-0.034
 amaz-0.025
 friendship-0.025
 yeah-0.023
 wink-0.022
Token course
Token ID1781
Feature activation-1.89e-01

Pos logit contributions
 amaz+0.036
 Sure+0.027
Found+0.022
 Nice+0.018
found+0.016
Neg logit contributions
allows-0.503
hers-0.471
hes-0.462
ards-0.438
ained-0.430
Token you
Token ID345
Feature activation-3.42e-02

Pos logit contributions
 course+0.336
 amaz+0.230
 reminder+0.221
Sure+0.216
 wink+0.214
Neg logit contributions
allows-0.417
hers-0.371
hes-0.360
 pans-0.350
ages-0.340
Token can
Token ID460
Feature activation-3.61e-02

Pos logit contributions
 course+0.054
 reminder+0.033
 yeah+0.032
 wink+0.032
 friendship+0.031
Neg logit contributions
allows-0.087
hers-0.079
hes-0.079
ighters-0.075
 pans-0.074
Token!
Token ID0
Feature activation-1.71e-02

Pos logit contributions
 course+0.060
 reminder+0.038
 yeah+0.036
 wink+0.036
 friendship+0.036
Neg logit contributions
allows-0.088
hes-0.081
hers-0.081
ighters-0.077
 pans-0.075
Token It
Token ID632
Feature activation-5.30e-02

Pos logit contributions
 course+0.030
 reminder+0.020
 amaz+0.019
 wink+0.019
 friendship+0.019
Neg logit contributions
allows-0.038
hes-0.035
hers-0.034
ighters-0.033
ained-0.032
Token's
Token ID338
Feature activation-1.30e-02

Pos logit contributions
 course+0.084
 reminder+0.056
 amaz+0.055
 yeah+0.054
Sure+0.052
Neg logit contributions
allows-0.127
hes-0.117
hers-0.117
ighters-0.109
ages-0.106
Token and
Token ID290
Feature activation-7.51e-03

Pos logit contributions
 course+0.002
 reminder+0.001
 yeah+0.001
Sure+0.001
 friendship+0.001
Neg logit contributions
allows-0.003
hes-0.002
hers-0.002
ighters-0.002
 pans-0.002
Token said
Token ID531
Feature activation+2.70e-02

Pos logit contributions
 course+0.013
 reminder+0.009
 yeah+0.008
 wink+0.008
 friendship+0.008
Neg logit contributions
allows-0.018
hers-0.016
hes-0.016
ighters-0.015
 pans-0.015
Token,
Token ID11
Feature activation-3.55e-03

Pos logit contributions
allows+0.070
hes+0.063
hers+0.063
ighters+0.060
 pans+0.058
Neg logit contributions
 course-0.041
 reminder-0.024
 yeah-0.023
 friendship-0.023
 wink-0.022
Token "
Token ID366
Feature activation+2.32e-02

Pos logit contributions
 course+0.006
 reminder+0.004
 wink+0.004
 yeah+0.004
 friendship+0.004
Neg logit contributions
allows-0.009
hers-0.008
hes-0.008
ighters-0.008
 pans-0.007
TokenOf
Token ID5189
Feature activation-1.66e-01

Pos logit contributions
allows+0.050
hes+0.048
hers+0.047
 pans+0.043
ained+0.043
Neg logit contributions
 course-0.039
 friendship-0.028
 amaz-0.028
 reminder-0.027
 yeah-0.026
Token course
Token ID1781
Feature activation-1.89e-01

Pos logit contributions
 amaz+0.017
 Sure+0.003
 reminder+0.003
 course+0.001
found+0.000
Neg logit contributions
allows-0.537
hers-0.503
hes-0.494
 pans-0.469
agons-0.467
Token,
Token ID11
Feature activation-5.31e-03

Pos logit contributions
 course+0.322
 amaz+0.216
 reminder+0.210
Sure+0.205
 wink+0.204
Neg logit contributions
allows-0.433
hers-0.387
hes-0.374
 pans-0.364
ighters-0.356
Token Tim
Token ID5045
Feature activation+1.68e-02

Pos logit contributions
 course+0.009
 reminder+0.006
 wink+0.006
 amaz+0.006
 yeah+0.006
Neg logit contributions
allows-0.012
hers-0.011
hes-0.011
ighters-0.010
 pans-0.010
Tokenmy
Token ID1820
Feature activation+4.02e-02

Pos logit contributions
allows+0.040
hers+0.036
hes+0.036
 pans+0.034
ighters+0.034
Neg logit contributions
 course-0.028
 reminder-0.018
 amaz-0.018
 yeah-0.018
 friendship-0.017
Token.
Token ID13
Feature activation+4.90e-03

Pos logit contributions
allows+0.094
hers+0.085
hes+0.085
ighters+0.080
 pans+0.079
Neg logit contributions
 course-0.070
 reminder-0.045
 yeah-0.044
 wink-0.043
 friendship-0.043
Token Ice
Token ID6663
Feature activation+2.14e-02

Pos logit contributions
allows+0.011
hes+0.010
hers+0.010
ighters+0.010
 pans+0.009
Neg logit contributions
 course-0.009
 reminder-0.006
 wink-0.005
 amaz-0.005
 friendship-0.005
Token
Token ID95
Feature activation+2.62e-02

Pos logit contributions
allows+0.030
hers+0.027
hes+0.027
ighters+0.025
 pans+0.024
Neg logit contributions
 course-0.034
 reminder-0.024
 friendship-0.024
 yeah-0.023
 wink-0.023
Token
Token ID26391
Feature activation+9.56e-03

Pos logit contributions
allows+0.050
hers+0.048
hes+0.046
ighters+0.042
 pans+0.041
Neg logit contributions
 course-0.054
 friendship-0.039
 yeah-0.038
 reminder-0.038
 wink-0.038
Token
Token ID129
Feature activation+2.67e-02

Pos logit contributions
allows+0.021
hers+0.019
hes+0.019
ighters+0.018
 pans+0.018
Neg logit contributions
 course-0.018
 yeah-0.012
 reminder-0.012
 wink-0.012
of-0.012
Token
Token ID241
Feature activation+2.68e-02

Pos logit contributions
allows+0.051
hes+0.046
 pans+0.045
ighters+0.045
hers+0.042
Neg logit contributions
 course-0.057
 reminder-0.040
 wink-0.040
 friendship-0.039
Sure-0.038
TokenOf
Token ID5189
Feature activation-1.69e-01

Pos logit contributions
allows+0.060
hes+0.054
hers+0.054
ained+0.051
ighters+0.048
Neg logit contributions
 course-0.050
 friendship-0.032
 amaz-0.032
 yeah-0.032
 wink-0.032
Token course
Token ID1781
Feature activation-1.89e-01

Pos logit contributions
 amaz+0.070
 Sure+0.043
 Nice+0.041
 reminder+0.037
 statue+0.035
Neg logit contributions
allows-0.468
hes-0.441
hers-0.431
ained-0.420
ards-0.407
Token,
Token ID11
Feature activation-5.96e-03

Pos logit contributions
 course+0.313
 amaz+0.207
 reminder+0.197
 friendship+0.192
 wink+0.190
Neg logit contributions
allows-0.436
hers-0.391
hes-0.381
 pans-0.371
ighters-0.360
Token sweet
Token ID6029
Feature activation-4.69e-02

Pos logit contributions
 course+0.011
 reminder+0.007
 wink+0.007
 friendship+0.007
 yeah+0.007
Neg logit contributions
allows-0.013
hers-0.012
hes-0.012
 pans-0.011
ighters-0.011
Tokenie
Token ID494
Feature activation+3.54e-02

Pos logit contributions
 course+0.073
 reminder+0.044
 yeah+0.043
 wink+0.042
 amaz+0.042
Neg logit contributions
allows-0.119
hers-0.107
hes-0.106
 pans-0.101
ighters-0.100
Token.
Token ID13
Feature activation-1.76e-03

Pos logit contributions
allows+0.085
hers+0.078
hes+0.077
ighters+0.073
 pans+0.072
Neg logit contributions
 course-0.058
 reminder-0.037
 yeah-0.036
 friendship-0.036
 amaz-0.035
Token But
Token ID887
Feature activation-9.83e-02

Pos logit contributions
 course+0.003
 reminder+0.002
 wink+0.002
 friendship+0.002
 yeah+0.002
Neg logit contributions
allows-0.004
hes-0.004
hers-0.004
ighters-0.004
 pans-0.003
Token
Token ID26391
Feature activation+1.16e-02

Pos logit contributions
allows+0.059
hers+0.055
hes+0.054
ighters+0.049
 pans+0.048
Neg logit contributions
 course-0.063
 friendship-0.044
 reminder-0.044
 yeah-0.043
 amaz-0.043
Token˜
Token ID41185
Feature activation+1.49e-02

Pos logit contributions
allows+0.024
hers+0.022
hes+0.022
ighters+0.021
 pans+0.020
Neg logit contributions
 course-0.023
 yeah-0.016
 wink-0.016
 reminder-0.016
of-0.015
TokenYes
Token ID5297
Feature activation-1.64e-01

Pos logit contributions
allows+0.037
hers+0.034
hes+0.033
ighters+0.032
 pans+0.031
Neg logit contributions
 course-0.024
 reminder-0.015
 yeah-0.015
 wink-0.015
 friendship-0.014
Token,
Token ID11
Feature activation-1.05e-02

Pos logit contributions
 course+0.250
 reminder+0.153
 amaz+0.152
 friendship+0.150
 wink+0.147
Neg logit contributions
allows-0.413
hers-0.374
hes-0.373
ighters-0.355
 pans-0.347
Token of
Token ID286
Feature activation+4.21e-02

Pos logit contributions
 course+0.018
 reminder+0.011
 yeah+0.011
 friendship+0.011
 wink+0.011
Neg logit contributions
allows-0.024
hes-0.022
hers-0.022
ighters-0.021
 pans-0.021
Token course
Token ID1781
Feature activation-1.89e-01

Pos logit contributions
allows+0.138
hes+0.134
hers+0.133
ores+0.122
ighters+0.122
Neg logit contributions
 course-0.003
 reminder-0.002
bye+0.002
Sure+0.002
 yeah+0.003
Token.
Token ID13
Feature activation-6.85e-03

Pos logit contributions
 course+0.324
 reminder+0.215
 amaz+0.215
 yeah+0.211
Sure+0.206
Neg logit contributions
allows-0.428
hers-0.381
hes-0.380
ighters-0.363
 pans-0.359
Token Here
Token ID3423
Feature activation-9.24e-02

Pos logit contributions
 course+0.012
 reminder+0.008
 friendship+0.007
 wink+0.007
 yeah+0.007
Neg logit contributions
allows-0.016
hes-0.014
hers-0.014
ighters-0.014
 pans-0.013
Token,
Token ID11
Feature activation-9.87e-03

Pos logit contributions
 course+0.138
 reminder+0.102
 amaz+0.098
Sure+0.095
 yeah+0.091
Neg logit contributions
allows-0.209
hes-0.192
hers-0.190
ighters-0.181
 pans-0.180
Token take
Token ID1011
Feature activation-3.10e-02

Pos logit contributions
 course+0.017
 reminder+0.011
 yeah+0.010
 friendship+0.010
 wink+0.010
Neg logit contributions
allows-0.024
hers-0.022
hes-0.022
ighters-0.021
 pans-0.020
Token this
Token ID428
Feature activation-1.05e-02

Pos logit contributions
 course+0.059
 reminder+0.043
 yeah+0.042
 amaz+0.041
 friendship+0.040
Neg logit contributions
allows-0.065
hes-0.059
hers-0.058
ighters-0.054
 pans-0.054
Token,
Token ID11
Feature activation-1.19e-02

Pos logit contributions
 course+0.024
 reminder+0.015
 yeah+0.015
 friendship+0.015
 amaz+0.014
Neg logit contributions
allows-0.034
hes-0.031
hers-0.030
ighters-0.029
ages-0.028
Token "
Token ID366
Feature activation+1.64e-02

Pos logit contributions
 course+0.019
 reminder+0.012
 wink+0.012
 yeah+0.012
 friendship+0.012
Neg logit contributions
allows-0.029
hers-0.027
hes-0.027
ighters-0.026
 pans-0.025
TokenYes
Token ID5297
Feature activation-1.72e-01

Pos logit contributions
allows+0.035
hers+0.033
hes+0.032
ained+0.030
 pans+0.030
Neg logit contributions
 course-0.028
 amaz-0.019
 friendship-0.019
 reminder-0.018
 yeah-0.018
Token,
Token ID11
Feature activation-1.17e-02

Pos logit contributions
 course+0.255
 reminder+0.155
 amaz+0.151
 friendship+0.146
 wink+0.145
Neg logit contributions
allows-0.435
hers-0.392
hes-0.381
ighters-0.367
 pans-0.367
Token of
Token ID286
Feature activation+3.43e-02

Pos logit contributions
 course+0.019
 reminder+0.012
 yeah+0.012
 wink+0.011
 friendship+0.011
Neg logit contributions
allows-0.029
hers-0.026
hes-0.026
ighters-0.025
 pans-0.024
Token course
Token ID1781
Feature activation-1.89e-01

Pos logit contributions
allows+0.116
hers+0.111
hes+0.108
 pans+0.103
ics+0.097
Neg logit contributions
 course-0.003
 reminder-0.001
Sure-0.000
 Sure+0.002
joy+0.002
Token you
Token ID345
Feature activation-3.61e-02

Pos logit contributions
 course+0.319
 reminder+0.207
 yeah+0.198
 wink+0.196
 friendship+0.196
Neg logit contributions
allows-0.449
hers-0.407
hes-0.402
ighters-0.383
 pans-0.382
Token can
Token ID460
Feature activation-3.88e-02

Pos logit contributions
 course+0.059
 reminder+0.037
 yeah+0.036
 friendship+0.035
 wink+0.035
Neg logit contributions
allows-0.089
hers-0.081
hes-0.081
ighters-0.077
 pans-0.075
Token have
Token ID423
Feature activation-1.98e-02

Pos logit contributions
 course+0.065
 reminder+0.041
 yeah+0.040
 friendship+0.039
 wink+0.039
Neg logit contributions
allows-0.095
hers-0.086
hes-0.086
ighters-0.082
 pans-0.080
Token a
Token ID257
Feature activation-7.70e-03

Pos logit contributions
 course+0.036
 reminder+0.024
 yeah+0.024
 wink+0.023
 friendship+0.023
Neg logit contributions
allows-0.045
hers-0.041
hes-0.041
ighters-0.038
 pans-0.038
Token bite
Token ID13197
Feature activation-2.41e-02

Pos logit contributions
 course+0.015
 yeah+0.010
 amaz+0.009
of+0.009
 reminder+0.009
Neg logit contributions
allows-0.016
hers-0.014
hes-0.014
 pans-0.013
ighters-0.012
Token,
Token ID11
Feature activation-1.04e-02

Pos logit contributions
allows+0.046
hes+0.042
hers+0.041
ighters+0.039
ages+0.039
Neg logit contributions
 course-0.029
 amaz-0.017
 reminder-0.017
 friendship-0.016
 wink-0.016
Token "
Token ID366
Feature activation+2.72e-02

Pos logit contributions
 course+0.018
 reminder+0.012
 yeah+0.012
 wink+0.012
 friendship+0.012
Neg logit contributions
allows-0.024
hers-0.022
hes-0.022
ighters-0.021
 pans-0.020
TokenYes
Token ID5297
Feature activation-1.64e-01

Pos logit contributions
allows+0.061
hes+0.061
hers+0.061
 pans+0.055
ained+0.054
Neg logit contributions
 course-0.037
 amaz-0.028
 friendship-0.023
joy-0.021
 statue-0.021
Token,
Token ID11
Feature activation-1.16e-02

Pos logit contributions
 course+0.242
 reminder+0.143
 amaz+0.140
 wink+0.137
 friendship+0.136
Neg logit contributions
allows-0.424
hers-0.383
hes-0.375
ighters-0.360
 pans-0.359
Token of
Token ID286
Feature activation+4.22e-02

Pos logit contributions
 course+0.019
 reminder+0.012
 amaz+0.011
 wink+0.011
 yeah+0.011
Neg logit contributions
allows-0.028
hers-0.025
hes-0.025
ighters-0.024
 pans-0.024
Token course
Token ID1781
Feature activation-1.89e-01

Pos logit contributions
allows+0.133
hers+0.130
hes+0.126
ics+0.116
 pans+0.116
Neg logit contributions
joy-0.003
Sure-0.002
 reminder-0.002
 yeah-0.001
 Sure-0.001
Token.
Token ID13
Feature activation-7.04e-03

Pos logit contributions
 course+0.337
 reminder+0.224
 yeah+0.216
 wink+0.215
 amaz+0.214
Neg logit contributions
allows-0.430
hers-0.388
hes-0.384
ighters-0.365
 pans-0.362
Token We
Token ID775
Feature activation-5.85e-02

Pos logit contributions
 course+0.012
 reminder+0.007
 wink+0.007
 amaz+0.007
 friendship+0.007
Neg logit contributions
allows-0.017
hes-0.015
hers-0.015
ighters-0.015
 pans-0.014
Token will
Token ID481
Feature activation-3.98e-02

Pos logit contributions
 course+0.087
 reminder+0.049
 wink+0.048
 believe+0.047
 friendship+0.045
Neg logit contributions
allows-0.160
hes-0.147
hers-0.146
 pans-0.144
angers-0.142
Token go
Token ID467
Feature activation-3.41e-02

Pos logit contributions
 course+0.059
 reminder+0.035
 wink+0.033
 friendship+0.033
 yeah+0.033
Neg logit contributions
allows-0.104
hes-0.096
hers-0.095
ighters-0.092
 pans-0.090
Token to
Token ID284
Feature activation-8.02e-03

Pos logit contributions
 course+0.061
 reminder+0.041
 wink+0.040
 friendship+0.039
Sure+0.038
Neg logit contributions
allows-0.079
hers-0.071
hes-0.071
ighters-0.069
 pans-0.066
Token mom
Token ID1995
Feature activation+1.25e-02

Pos logit contributions
 course+0.037
 reminder+0.026
 yeah+0.026
 wink+0.026
 friendship+0.025
Neg logit contributions
allows-0.032
hers-0.028
hes-0.028
ighters-0.026
 pans-0.026
Token said
Token ID531
Feature activation+2.16e-02

Pos logit contributions
allows+0.031
hers+0.028
hes+0.028
ighters+0.027
 pans+0.026
Neg logit contributions
 course-0.020
 reminder-0.012
 yeah-0.012
 wink-0.012
 friendship-0.012
Token,
Token ID11
Feature activation+8.90e-04

Pos logit contributions
allows+0.052
hes+0.046
hers+0.046
ighters+0.043
 pans+0.042
Neg logit contributions
 course-0.037
 reminder-0.023
 amaz-0.023
 yeah-0.023
 friendship-0.023
Token "
Token ID366
Feature activation+2.63e-02

Pos logit contributions
allows+0.002
hers+0.002
hes+0.002
ighters+0.002
ained+0.002
Neg logit contributions
 course-0.001
 reminder-0.001
 wink-0.001
 yeah-0.001
Sure-0.001
TokenOf
Token ID5189
Feature activation-1.70e-01

Pos logit contributions
allows+0.058
hes+0.055
hers+0.055
 pans+0.050
ighters+0.049
Neg logit contributions
 course-0.045
 amaz-0.031
 friendship-0.031
 wink-0.030
 reminder-0.029
Token course
Token ID1781
Feature activation-1.89e-01

Pos logit contributions
 amaz+0.021
 course+0.001
 wink-0.003
found-0.004
 Sure-0.006
Neg logit contributions
allows-0.549
hers-0.505
hes-0.502
 pans-0.480
agons-0.471
Token,
Token ID11
Feature activation-4.66e-03

Pos logit contributions
 course+0.312
 amaz+0.205
 reminder+0.197
 wink+0.191
Sure+0.191
Neg logit contributions
allows-0.443
hers-0.396
hes-0.386
 pans-0.375
ighters-0.367
Token Tim
Token ID5045
Feature activation+1.86e-02

Pos logit contributions
 course+0.008
 reminder+0.005
 wink+0.005
 amaz+0.005
 yeah+0.005
Neg logit contributions
allows-0.011
hers-0.010
hes-0.010
ighters-0.009
 pans-0.009
Token!
Token ID0
Feature activation-2.25e-02

Pos logit contributions
allows+0.046
hers+0.042
hes+0.042
ighters+0.039
 pans+0.039
Neg logit contributions
 course-0.030
 reminder-0.019
 yeah-0.018
 friendship-0.018
 wink-0.018
Token Let
Token ID3914
Feature activation-1.17e-01

Pos logit contributions
 course+0.038
 reminder+0.025
 wink+0.024
 amaz+0.024
 friendship+0.024
Neg logit contributions
allows-0.052
hes-0.048
hers-0.046
ighters-0.045
ained-0.043
Token's
Token ID338
Feature activation-1.51e-02

Pos logit contributions
 course+0.181
 reminder+0.116
 yeah+0.114
 amaz+0.110
 friendship+0.109
Neg logit contributions
allows-0.291
hers-0.267
hes-0.265
ighters-0.249
 pans-0.247
Token said
Token ID531
Feature activation+1.65e-02

Pos logit contributions
 course+0.012
 reminder+0.009
 yeah+0.008
 friendship+0.008
est+0.008
Neg logit contributions
allows-0.011
hes-0.011
hers-0.010
 pans-0.009
ages-0.009
Token "
Token ID366
Feature activation+2.77e-02

Pos logit contributions
allows+0.040
hes+0.037
hers+0.036
ighters+0.034
 pans+0.033
Neg logit contributions
 course-0.026
 amaz-0.016
 reminder-0.016
 friendship-0.015
 wink-0.015
Tokenyes
Token ID8505
Feature activation-8.14e-02

Pos logit contributions
allows+0.061
hes+0.060
hers+0.058
 pans+0.055
ained+0.055
Neg logit contributions
 course-0.039
 amaz-0.030
 friendship-0.028
 yeah-0.025
 statue-0.024
Token,
Token ID11
Feature activation-8.07e-03

Pos logit contributions
 course+0.144
 reminder+0.095
 yeah+0.093
 wink+0.091
 friendship+0.090
Neg logit contributions
allows-0.190
hers-0.172
hes-0.172
ighters-0.163
 pans-0.160
Token of
Token ID286
Feature activation+4.30e-02

Pos logit contributions
 course+0.013
 reminder+0.008
 wink+0.008
 amaz+0.008
 friendship+0.008
Neg logit contributions
allows-0.019
hers-0.017
hes-0.017
 pans-0.016
ighters-0.016
Token course
Token ID1781
Feature activation-1.88e-01

Pos logit contributions
allows+0.140
hers+0.137
hes+0.133
 pans+0.128
ighters+0.120
Neg logit contributions
 course-0.006
 reminder-0.001
 wink+0.001
Sure+0.001
found+0.002
Token you
Token ID345
Feature activation-3.52e-02

Pos logit contributions
 course+0.341
 reminder+0.227
 yeah+0.220
 wink+0.218
 friendship+0.218
Neg logit contributions
allows-0.428
hers-0.388
hes-0.385
ighters-0.365
 pans-0.359
Token can
Token ID460
Feature activation-4.25e-02

Pos logit contributions
 course+0.055
 reminder+0.035
 yeah+0.033
 wink+0.033
 friendship+0.033
Neg logit contributions
allows-0.090
hers-0.082
hes-0.082
ighters-0.078
 pans-0.078
Token",
Token ID1600
Feature activation+8.26e-02

Pos logit contributions
 course+0.074
 reminder+0.050
 yeah+0.048
 friendship+0.048
 amaz+0.047
Neg logit contributions
allows-0.098
hers-0.089
hes-0.089
ighters-0.083
 pans-0.081
Token and
Token ID290
Feature activation-6.61e-03

Pos logit contributions
allows+0.193
hes+0.176
hers+0.173
ighters+0.163
 pans+0.163
Neg logit contributions
 course-0.141
 reminder-0.093
 friendship-0.089
 yeah-0.088
 amaz-0.087
Token so
Token ID523
Feature activation+4.88e-02

Pos logit contributions
 course+0.012
 reminder+0.008
 yeah+0.008
 friendship+0.008
 wink+0.008
Neg logit contributions
allows-0.015
hes-0.014
hers-0.013
ighters-0.013
 pans-0.013
Tokenâ
Token ID22940
Feature activation+6.86e-02

Pos logit contributions
 course+0.058
 reminder+0.038
 yeah+0.037
Sure+0.037
 wink+0.037
Neg logit contributions
allows-0.073
hers-0.066
hes-0.066
ighters-0.063
 pans-0.061
Token
Token ID26391
Feature activation+1.21e-02

Pos logit contributions
allows+0.123
hers+0.114
hes+0.108
ained+0.098
ighters+0.096
Neg logit contributions
 course-0.148
 amaz-0.112
 friendship-0.110
 yeah-0.110
 reminder-0.108
Token
Token ID129
Feature activation+2.82e-02

Pos logit contributions
allows+0.028
hes+0.026
hers+0.026
ighters+0.025
 pans+0.024
Neg logit contributions
 course-0.021
 reminder-0.014
 yeah-0.013
 wink-0.013
 friendship-0.013
Token
Token ID241
Feature activation+2.94e-02

Pos logit contributions
allows+0.055
hes+0.050
ighters+0.049
 pans+0.049
hers+0.045
Neg logit contributions
 course-0.060
 wink-0.041
 reminder-0.041
 yeah-0.040
 friendship-0.040
TokenOf
Token ID5189
Feature activation-1.65e-01

Pos logit contributions
allows+0.070
hers+0.064
hes+0.063
ighters+0.060
 pans+0.059
Neg logit contributions
 course-0.050
 reminder-0.032
 yeah-0.031
 wink-0.031
 friendship-0.031
Token course
Token ID1781
Feature activation-1.88e-01

Pos logit contributions
 amaz+0.045
 reminder+0.026
 Sure+0.022
 statue+0.021
 circle+0.015
Neg logit contributions
allows-0.495
hers-0.464
hes-0.459
ained-0.455
agons-0.431
Token I
Token ID314
Feature activation-9.46e-02

Pos logit contributions
 course+0.303
 amaz+0.198
 reminder+0.198
 wink+0.189
 friendship+0.188
Neg logit contributions
allows-0.442
hers-0.399
hes-0.384
 pans-0.372
ighters-0.368
Token do
Token ID466
Feature activation-2.96e-02

Pos logit contributions
 course+0.177
 yeah+0.121
 friendship+0.119
 reminder+0.118
 amaz+0.117
Neg logit contributions
allows-0.198
hes-0.181
hers-0.178
ighters-0.166
ained-0.162
Token,
Token ID11
Feature activation-5.96e-03

Pos logit contributions
 course+0.050
 reminder+0.032
 yeah+0.031
 wink+0.031
 friendship+0.031
Neg logit contributions
allows-0.071
hers-0.065
hes-0.065
ighters-0.062
 pans-0.060
Tokenâ
Token ID22940
Feature activation+6.87e-02

Pos logit contributions
 course+0.011
 reminder+0.007
 yeah+0.007
 wink+0.007
 friendship+0.007
Neg logit contributions
allows-0.013
hers-0.012
hes-0.012
ighters-0.011
 pans-0.011
Token
Token ID26391
Feature activation+1.26e-02

Pos logit contributions
allows+0.128
hers+0.117
hes+0.114
 pans+0.102
ained+0.102
Neg logit contributions
 course-0.143
 amaz-0.108
 friendship-0.107
 reminder-0.107
 yeah-0.105
Token,
Token ID11
Feature activation-3.55e-03

Pos logit contributions
allows+0.070
hes+0.063
hers+0.063
ighters+0.060
 pans+0.058
Neg logit contributions
 course-0.041
 reminder-0.024
 yeah-0.023
 friendship-0.023
 wink-0.022
Token "
Token ID366
Feature activation+2.32e-02

Pos logit contributions
 course+0.006
 reminder+0.004
 wink+0.004
 yeah+0.004
 friendship+0.004
Neg logit contributions
allows-0.009
hers-0.008
hes-0.008
ighters-0.008
 pans-0.007
TokenYes
Token ID5297
Feature activation-1.67e-01

Pos logit contributions
allows+0.050
hes+0.048
hers+0.047
 pans+0.043
ained+0.043
Neg logit contributions
 course-0.039
 friendship-0.028
 amaz-0.028
 reminder-0.027
 yeah-0.026
Token,
Token ID11
Feature activation-6.31e-03

Pos logit contributions
 course+0.238
 reminder+0.141
 wink+0.133
 friendship+0.132
 amaz+0.132
Neg logit contributions
allows-0.441
hers-0.399
hes-0.392
ighters-0.375
 pans-0.375
Token of
Token ID286
Feature activation+3.74e-02

Pos logit contributions
 course+0.011
 reminder+0.007
 yeah+0.006
 wink+0.006
 friendship+0.006
Neg logit contributions
allows-0.015
hers-0.014
hes-0.014
ighters-0.013
 pans-0.013
Token course
Token ID1781
Feature activation-1.88e-01

Pos logit contributions
allows+0.124
hers+0.120
hes+0.116
 pans+0.111
ts+0.104
Neg logit contributions
 course-0.002
Sure-0.002
 reminder-0.002
found-0.001
 Sure+0.000
Token I
Token ID314
Feature activation-9.43e-02

Pos logit contributions
 course+0.338
 reminder+0.225
 yeah+0.217
 wink+0.216
 friendship+0.215
Neg logit contributions
allows-0.429
hers-0.387
hes-0.383
ighters-0.363
 pans-0.361
Token will
Token ID481
Feature activation-3.32e-02

Pos logit contributions
 course+0.170
 reminder+0.110
 yeah+0.110
 friendship+0.108
 amaz+0.107
Neg logit contributions
allows-0.215
hes-0.195
hers-0.194
ighters-0.184
 pans-0.176
Token!"
Token ID2474
Feature activation+3.77e-02

Pos logit contributions
 course+0.057
 reminder+0.037
 wink+0.036
 yeah+0.036
 friendship+0.036
Neg logit contributions
allows-0.079
hes-0.072
hers-0.072
ighters-0.068
 pans-0.067
Token
Token ID198
Feature activation-1.37e-02

Pos logit contributions
allows+0.086
hers+0.077
hes+0.077
ighters+0.073
 pans+0.072
Neg logit contributions
 course-0.069
 reminder-0.046
 yeah-0.044
 friendship-0.044
 wink-0.044
Token
Token ID198
Feature activation-1.51e-02

Pos logit contributions
 course+0.023
 reminder+0.014
 wink+0.013
Sure+0.013
 yeah+0.013
Neg logit contributions
allows-0.034
hers-0.031
hes-0.030
ighters-0.029
 pans-0.029
Tokenmy
Token ID1820
Feature activation+3.30e-02

Pos logit contributions
 course+0.002
 reminder+0.002
 friendship+0.002
 yeah+0.002
 amaz+0.002
Neg logit contributions
allows-0.002
hers-0.002
hes-0.002
 pans-0.002
ages-0.002
Token replied
Token ID8712
Feature activation-1.02e-03

Pos logit contributions
allows+0.081
hers+0.076
hes+0.075
 pans+0.068
ages+0.068
Neg logit contributions
 course-0.049
 reminder-0.033
 friendship-0.032
 yeah-0.032
 amaz-0.030
Token,
Token ID11
Feature activation-6.86e-03

Pos logit contributions
 course+0.002
 amaz+0.001
 reminder+0.001
 friendship+0.001
 yeah+0.001
Neg logit contributions
allows-0.003
hes-0.002
hers-0.002
ighters-0.002
 pans-0.002
Token "
Token ID366
Feature activation+2.85e-02

Pos logit contributions
 course+0.012
 reminder+0.008
 wink+0.007
 yeah+0.007
Sure+0.007
Neg logit contributions
allows-0.016
hers-0.015
hes-0.015
ighters-0.014
 pans-0.014
TokenOf
Token ID5189
Feature activation-1.72e-01

Pos logit contributions
hes+0.063
allows+0.061
hers+0.060
 pans+0.058
ained+0.055
Neg logit contributions
 course-0.041
 amaz-0.033
 friendship-0.028
est-0.027
 statue-0.026
Token course
Token ID1781
Feature activation-1.88e-01

Pos logit contributions
 amaz+0.078
 Nice+0.043
 wink+0.036
 statue+0.035
 reminder+0.033
Neg logit contributions
allows-0.482
hes-0.452
 pans-0.443
hers-0.438
ards-0.432
Token,
Token ID11
Feature activation-7.60e-03

Pos logit contributions
 course+0.314
 amaz+0.218
 reminder+0.202
 wink+0.198
Sure+0.192
Neg logit contributions
allows-0.435
hers-0.380
 pans-0.373
hes-0.370
ighters-0.362
Token let
Token ID1309
Feature activation-1.01e-01

Pos logit contributions
 course+0.014
 amaz+0.010
 reminder+0.010
 wink+0.010
joy+0.009
Neg logit contributions
allows-0.016
hers-0.014
hes-0.014
 pans-0.013
ighters-0.013
Token's
Token ID338
Feature activation-1.47e-02

Pos logit contributions
 course+0.125
 reminder+0.055
 friendship+0.055
of+0.054
 wink+0.053
Neg logit contributions
allows-0.292
hers-0.274
hes-0.270
ighters-0.263
 pans-0.257
Token go
Token ID467
Feature activation-3.60e-02

Pos logit contributions
 course+0.018
 wink+0.010
 reminder+0.009
 believe+0.009
 friendship+0.009
Neg logit contributions
allows-0.042
hers-0.039
hes-0.039
ighters-0.039
angers-0.038
Token inside
Token ID2641
Feature activation-5.16e-03

Pos logit contributions
 course+0.066
 reminder+0.045
 wink+0.044
 yeah+0.042
Sure+0.042
Neg logit contributions
allows-0.081
hers-0.074
hes-0.073
ighters-0.069
 pans-0.067
Tokenmy
Token ID1820
Feature activation+2.99e-02

Pos logit contributions
 course+0.004
 reminder+0.003
 yeah+0.003
 amaz+0.003
 friendship+0.003
Neg logit contributions
allows-0.004
hes-0.004
hers-0.004
 pans-0.003
ages-0.003
Token said
Token ID531
Feature activation+1.88e-02

Pos logit contributions
allows+0.066
hers+0.061
hes+0.060
 pans+0.055
ages+0.055
Neg logit contributions
 course-0.054
 yeah-0.037
 reminder-0.037
 friendship-0.037
 amaz-0.036
Token,
Token ID11
Feature activation-5.24e-03

Pos logit contributions
allows+0.044
hes+0.041
hers+0.039
ighters+0.037
 pans+0.037
Neg logit contributions
 course-0.031
 amaz-0.021
 friendship-0.019
 reminder-0.019
of-0.019
Token "
Token ID366
Feature activation+3.00e-02

Pos logit contributions
 course+0.010
 reminder+0.006
 yeah+0.006
 friendship+0.006
 wink+0.006
Neg logit contributions
allows-0.012
hers-0.011
hes-0.011
ighters-0.010
 pans-0.010
TokenOf
Token ID5189
Feature activation-1.74e-01

Pos logit contributions
hes+0.064
allows+0.061
hers+0.059
 pans+0.058
ained+0.058
Neg logit contributions
 course-0.044
 amaz-0.037
 friendship-0.032
 statue-0.029
joy-0.029
Token course
Token ID1781
Feature activation-1.88e-01

Pos logit contributions
 amaz+0.085
 Nice+0.043
joy+0.042
 Sure+0.031
 marvel+0.031
Neg logit contributions
allows-0.472
hes-0.448
hers-0.431
 pans-0.430
ards-0.428
Token,
Token ID11
Feature activation-5.89e-03

Pos logit contributions
 course+0.305
 amaz+0.200
 reminder+0.185
 wink+0.178
Sure+0.177
Neg logit contributions
allows-0.451
hers-0.399
hes-0.392
 pans-0.390
ighters-0.377
Token but
Token ID475
Feature activation-5.85e-02

Pos logit contributions
 course+0.011
 amaz+0.008
 reminder+0.008
 wink+0.007
joy+0.007
Neg logit contributions
allows-0.012
hes-0.011
hers-0.011
 pans-0.010
ighters-0.010
Token only
Token ID691
Feature activation-8.33e-02

Pos logit contributions
 course+0.105
 amaz+0.078
joy+0.076
five+0.074
of+0.074
Neg logit contributions
allows-0.113
hers-0.102
hes-0.098
 pans-0.096
ighters-0.096
Token pick
Token ID2298
Feature activation-4.58e-02

Pos logit contributions
 course+0.174
Sure+0.120
 yeah+0.118
 amaz+0.118
 reminder+0.118
Neg logit contributions
allows-0.154
hes-0.134
hers-0.132
 pans-0.129
ighters-0.127
Token a
Token ID257
Feature activation-3.90e-03

Pos logit contributions
 course+0.090
 reminder+0.061
 yeah+0.059
 wink+0.059
 friendship+0.059
Neg logit contributions
allows-0.098
hers-0.088
hes-0.088
ighters-0.083
 pans-0.081
Token shouted
Token ID17293
Feature activation+4.61e-02

Pos logit contributions
 course+0.013
 yeah+0.009
 reminder+0.008
 friendship+0.008
 amaz+0.008
Neg logit contributions
allows-0.015
hes-0.015
ages-0.013
hers-0.013
 pans-0.013
Token back
Token ID736
Feature activation-7.67e-02

Pos logit contributions
allows+0.122
hers+0.111
hes+0.111
ighters+0.106
 pans+0.104
Neg logit contributions
 course-0.068
 reminder-0.037
 yeah-0.036
 friendship-0.036
 wink-0.035
Token,
Token ID11
Feature activation-1.20e-02

Pos logit contributions
 course+0.111
 reminder+0.065
 yeah+0.062
 friendship+0.061
 wink+0.061
Neg logit contributions
allows-0.203
hers-0.187
hes-0.186
ighters-0.178
 pans-0.175
Token "
Token ID366
Feature activation+2.57e-02

Pos logit contributions
 course+0.024
 friendship+0.016
 yeah+0.016
 reminder+0.016
 amaz+0.015
Neg logit contributions
allows-0.025
hes-0.023
hers-0.023
 pans-0.021
ighters-0.021
TokenOf
Token ID5189
Feature activation-1.69e-01

Pos logit contributions
allows+0.049
hers+0.047
hes+0.047
ained+0.042
 pans+0.039
Neg logit contributions
 course-0.046
 amaz-0.034
 friendship-0.034
 yeah-0.033
 reminder-0.032
Token course
Token ID1781
Feature activation-1.88e-01

Pos logit contributions
 Sure+0.037
 amaz+0.029
found+0.015
 reminder+0.013
five+0.009
Neg logit contributions
allows-0.480
hers-0.453
hes-0.436
 pans-0.430
ards-0.426
Token,
Token ID11
Feature activation-1.16e-02

Pos logit contributions
 course+0.321
 amaz+0.213
 reminder+0.208
 friendship+0.200
Sure+0.196
Neg logit contributions
allows-0.419
hers-0.371
hes-0.367
 pans-0.357
ighters-0.345
Token and
Token ID290
Feature activation-8.11e-03

Pos logit contributions
 course+0.020
 reminder+0.013
 wink+0.012
 amaz+0.012
 friendship+0.012
Neg logit contributions
allows-0.026
hes-0.023
hers-0.023
ighters-0.022
 pans-0.022
Token stay
Token ID2652
Feature activation-2.07e-02

Pos logit contributions
 course+0.013
 reminder+0.008
 yeah+0.008
 wink+0.007
 friendship+0.007
Neg logit contributions
allows-0.019
hes-0.017
hers-0.017
 pans-0.016
ighters-0.016
Token safe
Token ID3338
Feature activation-1.42e-02

Pos logit contributions
 course+0.041
 reminder+0.029
 wink+0.028
 friendship+0.027
 yeah+0.027
Neg logit contributions
allows-0.044
hers-0.040
hes-0.039
ighters-0.038
 pans-0.036
Token!"
Token ID2474
Feature activation+3.62e-02

Pos logit contributions
 course+0.025
 reminder+0.017
 friendship+0.016
 wink+0.015
 again+0.015
Neg logit contributions
allows-0.033
hes-0.031
hers-0.030
ighters-0.030
 pans-0.028
TokenYes
Token ID5297
Feature activation-1.65e-01

Pos logit contributions
hes+0.052
allows+0.052
hers+0.049
 pans+0.047
ained+0.045
Neg logit contributions
 course-0.035
 amaz-0.029
 friendship-0.027
 statue-0.027
est-0.024
Token Ab
Token ID2275
Feature activation-2.69e-02

Pos logit contributions
 course+0.249
 reminder+0.153
 amaz+0.152
 wink+0.150
 friendship+0.149
Neg logit contributions
allows-0.420
hers-0.372
hes-0.363
ighters-0.352
 pans-0.350
Tokenbie
Token ID12590
Feature activation+5.58e-02

Pos logit contributions
 course+0.072
 friendship+0.056
 reminder+0.056
Sure+0.055
 wink+0.055
Neg logit contributions
allows-0.040
hers-0.029
 pans-0.028
hes-0.028
ighters-0.027
Token,
Token ID11
Feature activation-1.23e-02

Pos logit contributions
allows+0.136
hes+0.123
hers+0.122
ighters+0.115
ages+0.114
Neg logit contributions
 course-0.090
 reminder-0.057
 amaz-0.057
Sure-0.056
 yeah-0.055
Token of
Token ID286
Feature activation+4.29e-02

Pos logit contributions
 course+0.023
joy+0.016
 amaz+0.015
 reminder+0.015
 wink+0.015
Neg logit contributions
allows-0.025
hes-0.021
hers-0.021
ighters-0.020
 pans-0.020
Token course
Token ID1781
Feature activation-1.88e-01

Pos logit contributions
allows+0.134
hes+0.130
hers+0.128
ained+0.118
 pans+0.116
Neg logit contributions
 reminder-0.005
Sure-0.005
 Sure-0.004
 Nice-0.002
 yeah-0.002
Token you
Token ID345
Feature activation-3.42e-02

Pos logit contributions
 course+0.317
 reminder+0.205
 amaz+0.196
 friendship+0.195
 yeah+0.195
Neg logit contributions
allows-0.447
hers-0.400
hes-0.397
ighters-0.379
 pans-0.379
Token can
Token ID460
Feature activation-4.21e-02

Pos logit contributions
 course+0.056
 reminder+0.035
 yeah+0.034
 friendship+0.034
 wink+0.033
Neg logit contributions
allows-0.084
hers-0.076
hes-0.076
ighters-0.073
 pans-0.071
Token!"
Token ID2474
Feature activation+3.54e-02

Pos logit contributions
 course+0.079
 reminder+0.053
 amaz+0.052
 friendship+0.052
 yeah+0.051
Neg logit contributions
allows-0.091
hes-0.082
hers-0.082
ighters-0.076
ained-0.074
Token<|endoftext|>
Token ID50256
Feature activation+8.25e-02

Pos logit contributions
allows+0.073
hes+0.063
hers+0.057
ages+0.056
 pans+0.055
Neg logit contributions
 course-0.072
 friendship-0.050
 reminder-0.050
 yeah-0.050
 wink-0.048
TokenOnce
Token ID7454
Feature activation+4.69e-02

Pos logit contributions
allows+0.169
hers+0.151
hes+0.150
ighters+0.141
 pans+0.138
Neg logit contributions
 course-0.169
 reminder-0.118
 yeah-0.115
 wink-0.115
 friendship-0.114
Token and
Token ID290
Feature activation+2.09e-03

Pos logit contributions
 course+0.048
 yeah+0.030
 reminder+0.029
 friendship+0.028
Sure+0.028
Neg logit contributions
allows-0.067
hes-0.060
hers-0.059
 pans-0.056
ighters-0.055
Token said
Token ID531
Feature activation+2.23e-02

Pos logit contributions
allows+0.004
hes+0.004
hers+0.004
 pans+0.004
ages+0.003
Neg logit contributions
 course-0.004
 reminder-0.003
 yeah-0.003
 friendship-0.003
of-0.003
Token,
Token ID11
Feature activation+3.38e-04

Pos logit contributions
allows+0.054
hes+0.050
hers+0.049
ighters+0.046
ages+0.045
Neg logit contributions
 course-0.036
 reminder-0.022
 amaz-0.022
 friendship-0.021
 yeah-0.020
Token "
Token ID366
Feature activation+2.69e-02

Pos logit contributions
allows+0.001
hers+0.001
hes+0.001
ighters+0.001
 pans+0.001
Neg logit contributions
 course-0.001
 reminder-0.000
 yeah-0.000
 wink-0.000
 friendship-0.000
TokenOf
Token ID5189
Feature activation-1.68e-01

Pos logit contributions
hes+0.052
allows+0.050
hers+0.048
ained+0.046
 pans+0.046
Neg logit contributions
 course-0.043
 amaz-0.034
 friendship-0.032
 statue-0.030
 thank-0.029
Token course
Token ID1781
Feature activation-1.88e-01

Pos logit contributions
 amaz+0.055
 Nice+0.034
 Sure+0.030
 reminder+0.027
 wink+0.025
Neg logit contributions
allows-0.489
hes-0.452
hers-0.452
ards-0.436
 pans-0.433
Token,
Token ID11
Feature activation-2.89e-03

Pos logit contributions
 course+0.313
 amaz+0.214
 reminder+0.200
Sure+0.197
 wink+0.193
Neg logit contributions
allows-0.427
hers-0.378
hes-0.369
 pans-0.364
ighters-0.351
Token let
Token ID1309
Feature activation-1.01e-01

Pos logit contributions
 course+0.005
 reminder+0.003
 amaz+0.003
 wink+0.003
 yeah+0.003
Neg logit contributions
allows-0.006
hers-0.006
hes-0.005
 pans-0.005
ighters-0.005
Token's
Token ID338
Feature activation-1.30e-02

Pos logit contributions
 course+0.131
 reminder+0.063
 friendship+0.061
 wink+0.061
 yeah+0.060
Neg logit contributions
allows-0.288
hers-0.268
hes-0.265
ighters-0.258
 pans-0.253
Token go
Token ID467
Feature activation-3.52e-02

Pos logit contributions
 course+0.020
 reminder+0.012
 wink+0.012
 friendship+0.011
 yeah+0.011
Neg logit contributions
allows-0.034
hes-0.031
hers-0.031
ighters-0.031
 pans-0.030
Token get
Token ID651
Feature activation+2.89e-02

Pos logit contributions
 course+0.064
 reminder+0.042
 wink+0.041
 yeah+0.041
 friendship+0.040
Neg logit contributions
allows-0.081
hers-0.073
hes-0.073
ighters-0.070
 pans-0.068
Token and
Token ID290
Feature activation+2.09e-03

Pos logit contributions
 course+0.048
 yeah+0.030
 reminder+0.029
 friendship+0.028
Sure+0.028
Neg logit contributions
allows-0.067
hes-0.060
hers-0.059
 pans-0.056
ighters-0.055
Token said
Token ID531
Feature activation+2.23e-02

Pos logit contributions
allows+0.004
hes+0.004
hers+0.004
 pans+0.004
ages+0.003
Neg logit contributions
 course-0.004
 reminder-0.003
 yeah-0.003
 friendship-0.003
of-0.003
Token,
Token ID11
Feature activation+3.38e-04

Pos logit contributions
allows+0.054
hes+0.050
hers+0.049
ighters+0.046
ages+0.045
Neg logit contributions
 course-0.036
 reminder-0.022
 amaz-0.022
 friendship-0.021
 yeah-0.020
Token "
Token ID366
Feature activation+2.69e-02

Pos logit contributions
allows+0.001
hers+0.001
hes+0.001
ighters+0.001
 pans+0.001
Neg logit contributions
 course-0.001
 reminder-0.000
 yeah-0.000
 wink-0.000
 friendship-0.000
TokenOf
Token ID5189
Feature activation-1.68e-01

Pos logit contributions
hes+0.052
allows+0.050
hers+0.048
ained+0.046
 pans+0.046
Neg logit contributions
 course-0.043
 amaz-0.034
 friendship-0.032
 statue-0.030
 thank-0.029
Token course
Token ID1781
Feature activation-1.88e-01

Pos logit contributions
 amaz+0.055
 Nice+0.034
 Sure+0.030
 reminder+0.027
 wink+0.025
Neg logit contributions
allows-0.489
hes-0.452
hers-0.452
ards-0.436
 pans-0.433
Token,
Token ID11
Feature activation-2.89e-03

Pos logit contributions
 course+0.313
 amaz+0.214
 reminder+0.200
Sure+0.197
 wink+0.193
Neg logit contributions
allows-0.427
hers-0.378
hes-0.369
 pans-0.364
ighters-0.351
Token Lily
Token ID20037
Feature activation-9.15e-03

Pos logit contributions
 course+0.005
 reminder+0.003
 amaz+0.003
 wink+0.003
 yeah+0.003
Neg logit contributions
allows-0.006
hers-0.006
hes-0.005
 pans-0.005
ighters-0.005
Token.
Token ID13
Feature activation+2.78e-03

Pos logit contributions
 course+0.014
 reminder+0.008
 yeah+0.008
 friendship+0.008
 wink+0.008
Neg logit contributions
allows-0.024
hers-0.022
hes-0.022
ighters-0.021
 pans-0.020
Token That
Token ID1320
Feature activation-3.05e-02

Pos logit contributions
allows+0.006
hes+0.006
hers+0.005
ighters+0.005
ained+0.005
Neg logit contributions
 course-0.005
 wink-0.003
 reminder-0.003
 amaz-0.003
 friendship-0.003
Token pattern
Token ID3912
Feature activation+1.32e-03

Pos logit contributions
 course+0.047
 yeah+0.030
 reminder+0.029
Sure+0.029
 amaz+0.028
Neg logit contributions
allows-0.075
hers-0.068
hes-0.068
 pans-0.063
ained-0.063
Token\n
Token ID198
Feature activation-1.72e-02

Pos logit contributions
 course+0.025
Sure+0.014
 reminder+0.014
 wink+0.014
 yeah+0.014
Neg logit contributions
allows-0.044
hers-0.040
hes-0.040
ighters-0.038
 pans-0.038
TokenAt
Token ID2953
Feature activation+1.30e-02

Pos logit contributions
 course+0.030
 reminder+0.019
Sure+0.019
 yeah+0.018
 wink+0.018
Neg logit contributions
allows-0.042
hes-0.038
hers-0.037
ighters-0.036
 pans-0.035
Token first
Token ID717
Feature activation-2.29e-02

Pos logit contributions
allows+0.030
hers+0.027
hes+0.027
ighters+0.026
 pans+0.025
Neg logit contributions
 course-0.024
 reminder-0.016
 wink-0.015
 friendship-0.015
 yeah-0.015
Token,
Token ID11
Feature activation-2.88e-03

Pos logit contributions
 course+0.032
 reminder+0.018
 yeah+0.018
 wink+0.017
Sure+0.017
Neg logit contributions
allows-0.061
hes-0.055
hers-0.055
ighters-0.053
 pans-0.052
Token Jack
Token ID3619
Feature activation+2.45e-02

Pos logit contributions
 course+0.005
 reminder+0.003
 yeah+0.003
 wink+0.003
 friendship+0.003
Neg logit contributions
allows-0.007
hers-0.006
hes-0.006
ighters-0.006
ained-0.006
Token didn
Token ID1422
Feature activation+1.23e-01

Pos logit contributions
allows+0.063
hers+0.057
hes+0.057
ighters+0.054
 pans+0.054
Neg logit contributions
 course-0.038
 reminder-0.023
 yeah-0.022
 friendship-0.022
 wink-0.021
Token't
Token ID470
Feature activation+7.80e-02

Pos logit contributions
allows+0.246
 pans+0.217
hers+0.209
ighters+0.207
hes+0.205
Neg logit contributions
 course-0.261
 wink-0.181
Sure-0.179
of-0.176
 reminder-0.171
Token understand
Token ID1833
Feature activation-4.20e-02

Pos logit contributions
allows+0.184
hers+0.167
hes+0.166
ighters+0.158
 pans+0.155
Neg logit contributions
 course-0.136
 reminder-0.091
 wink-0.087
 yeah-0.087
 friendship-0.086
Token what
Token ID644
Feature activation+2.01e-02

Pos logit contributions
 course+0.076
 reminder+0.055
Sure+0.053
 wink+0.053
 yeah+0.052
Neg logit contributions
allows-0.091
hers-0.083
hes-0.081
 pans-0.075
ighters-0.075
Token she
Token ID673
Feature activation+1.36e-02

Pos logit contributions
allows+0.046
hes+0.042
hers+0.042
 pans+0.039
ighters+0.039
Neg logit contributions
 course-0.035
 reminder-0.023
 yeah-0.023
 amaz-0.022
 wink-0.022
Token meant
Token ID4001
Feature activation+7.05e-02

Pos logit contributions
allows+0.031
hes+0.029
hers+0.029
ighters+0.027
ages+0.026
Neg logit contributions
 course-0.024
 yeah-0.016
 reminder-0.016
 amaz-0.015
 friendship-0.015
Token forest
Token ID8222
Feature activation-1.34e-02

Pos logit contributions
allows+0.019
ained+0.019
hes+0.019
hers+0.018
 pans+0.017
Neg logit contributions
 course-0.018
 yeah-0.013
 reminder-0.013
 believe-0.012
 amaz-0.012
Token,
Token ID11
Feature activation-2.77e-03

Pos logit contributions
 course+0.027
 yeah+0.019
 reminder+0.018
Sure+0.018
 friendship+0.018
Neg logit contributions
allows-0.028
hers-0.026
hes-0.025
ighters-0.023
ained-0.023
Token but
Token ID475
Feature activation-5.68e-02

Pos logit contributions
 course+0.005
 reminder+0.003
 yeah+0.003
 friendship+0.003
 amaz+0.003
Neg logit contributions
allows-0.006
hers-0.006
hes-0.006
 pans-0.005
ighters-0.005
Token quickly
Token ID2952
Feature activation+3.11e-02

Pos logit contributions
 course+0.092
 reminder+0.057
 yeah+0.055
 wink+0.055
 friendship+0.055
Neg logit contributions
allows-0.141
hers-0.128
hes-0.128
ighters-0.122
 pans-0.120
Token got
Token ID1392
Feature activation+8.34e-02

Pos logit contributions
allows+0.068
hes+0.062
hers+0.061
ighters+0.058
 pans+0.058
Neg logit contributions
 course-0.059
 yeah-0.040
 reminder-0.039
Sure-0.039
 friendship-0.037
Token tired
Token ID10032
Feature activation+1.18e-01

Pos logit contributions
allows+0.192
hers+0.171
ants+0.165
agons+0.163
ighters+0.163
Neg logit contributions
 course-0.143
 wink-0.109
 smiled-0.102
 friendship-0.097
 reminder-0.097
Token and
Token ID290
Feature activation-1.79e-03

Pos logit contributions
allows+0.265
hers+0.241
hes+0.240
ighters+0.226
 pans+0.225
Neg logit contributions
 course-0.216
 yeah-0.146
 reminder-0.143
 wink-0.139
 friendship-0.139
Token felt
Token ID2936
Feature activation+3.40e-02

Pos logit contributions
 course+0.003
 wink+0.002
 reminder+0.002
 friendship+0.002
 smiled+0.002
Neg logit contributions
allows-0.004
hers-0.004
hes-0.004
ighters-0.004
 pans-0.003
Token very
Token ID845
Feature activation+7.21e-02

Pos logit contributions
allows+0.065
hers+0.059
hes+0.057
ighters+0.054
 pans+0.053
Neg logit contributions
 course-0.073
 reminder-0.053
 wink-0.052
 friendship-0.052
 yeah-0.051
Token lonely
Token ID21757
Feature activation+4.71e-02

Pos logit contributions
allows+0.122
hers+0.113
hes+0.108
ighters+0.101
 pans+0.098
Neg logit contributions
 course-0.163
 reminder-0.127
 wink-0.125
 friendship-0.124
 yeah-0.120
Token.
Token ID13
Feature activation-6.47e-03

Pos logit contributions
allows+0.112
hers+0.102
hes+0.101
ighters+0.096
 pans+0.094
Neg logit contributions
 course-0.081
 reminder-0.052
 yeah-0.051
 wink-0.050
 friendship-0.050
Token and
Token ID290
Feature activation-3.36e-03

Pos logit contributions
 course+0.039
 reminder+0.024
 wink+0.024
 friendship+0.024
 yeah+0.023
Neg logit contributions
allows-0.063
hers-0.058
hes-0.058
ighters-0.055
 pans-0.053
Token help
Token ID1037
Feature activation-6.97e-02

Pos logit contributions
 course+0.006
 reminder+0.004
 yeah+0.004
 amaz+0.004
Sure+0.004
Neg logit contributions
allows-0.007
hes-0.007
hers-0.007
 pans-0.006
ighters-0.006
Token me
Token ID502
Feature activation-5.16e-02

Pos logit contributions
 course+0.111
 friendship+0.065
 reminder+0.065
 wink+0.065
of+0.063
Neg logit contributions
allows-0.178
hes-0.161
hers-0.161
ighters-0.154
ained-0.150
Token with
Token ID351
Feature activation-3.60e-02

Pos logit contributions
 course+0.083
 wink+0.056
 reminder+0.054
 friendship+0.051
 believe+0.051
Neg logit contributions
allows-0.124
hers-0.112
hes-0.112
ighters-0.109
ained-0.107
Token the
Token ID262
Feature activation+1.28e-02

Pos logit contributions
 course+0.066
 reminder+0.044
 yeah+0.043
 wink+0.042
 friendship+0.042
Neg logit contributions
allows-0.081
hers-0.074
hes-0.074
ighters-0.070
 pans-0.068
Token chores
Token ID46173
Feature activation+1.17e-01

Pos logit contributions
allows+0.027
hes+0.024
hers+0.024
 pans+0.023
ighters+0.023
Neg logit contributions
 course-0.024
 yeah-0.017
 reminder-0.017
Sure-0.016
of-0.016
Token."
Token ID526
Feature activation+3.47e-02

Pos logit contributions
allows+0.264
hers+0.237
hes+0.236
ighters+0.225
 pans+0.220
Neg logit contributions
 course-0.217
 reminder-0.145
 friendship-0.139
 wink-0.139
 yeah-0.139
Token\n
Token ID198
Feature activation-1.35e-02

Pos logit contributions
allows+0.083
hes+0.076
hers+0.076
ighters+0.072
 pans+0.070
Neg logit contributions
 course-0.059
 reminder-0.038
 yeah-0.037
 wink-0.036
 friendship-0.036
Token\n
Token ID198
Feature activation-1.34e-02

Pos logit contributions
 course+0.022
 reminder+0.014
 yeah+0.013
 wink+0.013
 friendship+0.013
Neg logit contributions
allows-0.033
hers-0.030
hes-0.030
ighters-0.029
 pans-0.028
TokenL
Token ID43
Feature activation+1.42e-02

Pos logit contributions
 course+0.021
 reminder+0.013
 yeah+0.013
 amaz+0.013
 wink+0.013
Neg logit contributions
allows-0.033
hers-0.031
hes-0.031
 pans-0.029
ighters-0.029
Tokenily
Token ID813
Feature activation-2.05e-02

Pos logit contributions
allows+0.028
hes+0.026
hers+0.025
ighters+0.024
ained+0.023
Neg logit contributions
 course-0.030
 reminder-0.022
 amaz-0.021
 wink-0.021
 yeah-0.021
Token together
Token ID1978
Feature activation+2.56e-02

Pos logit contributions
 course+0.015
 yeah+0.010
 reminder+0.010
 friendship+0.010
 amaz+0.010
Neg logit contributions
allows-0.018
hers-0.017
hes-0.017
ighters-0.016
 pans-0.015
Token to
Token ID284
Feature activation-7.53e-03

Pos logit contributions
allows+0.055
hers+0.050
hes+0.049
ighters+0.046
 pans+0.044
Neg logit contributions
 course-0.049
 yeah-0.034
 reminder-0.032
Sure-0.032
 friendship-0.032
Token buy
Token ID2822
Feature activation-4.78e-03

Pos logit contributions
 course+0.015
 yeah+0.011
 reminder+0.011
est+0.011
Sure+0.010
Neg logit contributions
allows-0.014
hers-0.013
hes-0.013
ighters-0.012
 pans-0.011
Token sa
Token ID473
Feature activation+1.14e-01

Pos logit contributions
 course+0.010
 reminder+0.008
 yeah+0.008
 wink+0.007
Sure+0.007
Neg logit contributions
allows-0.009
hes-0.008
hers-0.008
 pans-0.007
ighters-0.007
Tokenus
Token ID385
Feature activation+1.58e-01

Pos logit contributions
allows+0.165
hers+0.137
hes+0.136
ighters+0.130
 pans+0.120
Neg logit contributions
 course-0.300
of-0.233
 reminder-0.224
Sure-0.223
 yeah-0.220
Tokenages
Token ID1095
Feature activation+1.34e-01

Pos logit contributions
allows+0.119
ained+0.102
hers+0.098
hes+0.093
ighters+0.092
Neg logit contributions
 course-0.471
of-0.403
 wink-0.398
 reminder-0.397
 friendship-0.396
Token.
Token ID13
Feature activation-8.94e-03

Pos logit contributions
allows+0.322
hers+0.292
hes+0.291
ighters+0.279
 pans+0.273
Neg logit contributions
 course-0.227
 reminder-0.146
 wink-0.139
 friendship-0.139
 yeah-0.138
Token And
Token ID843
Feature activation-5.42e-02

Pos logit contributions
 course+0.015
 reminder+0.010
 yeah+0.009
 friendship+0.009
 wink+0.009
Neg logit contributions
allows-0.021
hers-0.020
hes-0.019
ighters-0.019
 pans-0.018
Token Tim
Token ID5045
Feature activation+1.47e-02

Pos logit contributions
 course+0.122
 wink+0.087
 yeah+0.087
 reminder+0.086
 amaz+0.084
Neg logit contributions
allows-0.094
hes-0.082
hers-0.080
ighters-0.075
ages-0.075
Tokenmy
Token ID1820
Feature activation+3.53e-02

Pos logit contributions
allows+0.027
hers+0.024
hes+0.024
ained+0.022
ages+0.021
Neg logit contributions
 course-0.031
 amaz-0.023
 yeah-0.023
 reminder-0.023
Sure-0.022
Token was
Token ID373
Feature activation-3.80e-02

Pos logit contributions
allows+0.084
hes+0.078
hers+0.077
ages+0.075
ighters+0.067
Neg logit contributions
 course-0.055
 yeah-0.036
Sure-0.034
 amaz-0.033
 reminder-0.033
Token She
Token ID1375
Feature activation+1.37e-03

Pos logit contributions
 course+0.005
Sure+0.003
 yeah+0.003
 friendship+0.003
 reminder+0.003
Neg logit contributions
allows-0.009
hers-0.008
hes-0.008
ighters-0.008
 pans-0.008
Token qu
Token ID627
Feature activation+8.52e-02

Pos logit contributions
allows+0.003
hers+0.003
hes+0.003
ighters+0.003
 pans+0.003
Neg logit contributions
 course-0.002
 reminder-0.001
 yeah-0.001
 friendship-0.001
 wink-0.001
Tokenacked
Token ID6021
Feature activation+5.32e-02

Pos logit contributions
allows+0.167
hers+0.157
 pans+0.156
ighters+0.152
hes+0.148
Neg logit contributions
 course-0.172
of-0.118
 wink-0.112
est-0.112
 friendship-0.110
Token and
Token ID290
Feature activation-3.00e-03

Pos logit contributions
allows+0.125
hers+0.115
hes+0.115
ighters+0.111
ained+0.107
Neg logit contributions
 course-0.090
 wink-0.059
 reminder-0.058
 friendship-0.057
Sure-0.055
Token w
Token ID266
Feature activation+6.15e-02

Pos logit contributions
 course+0.005
 reminder+0.003
 wink+0.003
 yeah+0.003
 friendship+0.003
Neg logit contributions
allows-0.007
hers-0.007
hes-0.007
ighters-0.006
 pans-0.006
Tokenadd
Token ID2860
Feature activation+1.11e-01

Pos logit contributions
allows+0.096
hers+0.084
hes+0.083
 pans+0.078
ighters+0.077
Neg logit contributions
 course-0.155
 reminder-0.115
 yeah-0.113
 friendship-0.113
 wink-0.113
Tokenled
Token ID992
Feature activation+8.02e-02

Pos logit contributions
allows+0.173
hers+0.160
hes+0.154
ighters+0.152
 pans+0.145
Neg logit contributions
 course-0.269
of-0.199
 smiled-0.198
 friendship-0.198
 wink-0.198
Token around
Token ID1088
Feature activation+3.39e-02

Pos logit contributions
allows+0.168
hes+0.148
hers+0.148
ighters+0.138
 pans+0.138
Neg logit contributions
 course-0.162
 reminder-0.111
 yeah-0.111
 friendship-0.107
Sure-0.105
Token.
Token ID13
Feature activation-4.28e-03

Pos logit contributions
allows+0.073
hers+0.065
hes+0.064
ighters+0.060
 pans+0.060
Neg logit contributions
 course-0.064
 yeah-0.045
Sure-0.043
 reminder-0.043
 wink-0.042
Token All
Token ID1439
Feature activation-3.43e-02

Pos logit contributions
 course+0.007
 reminder+0.005
Sure+0.004
 yeah+0.004
 friendship+0.004
Neg logit contributions
allows-0.010
hers-0.009
hes-0.009
ighters-0.009
 pans-0.009
Token the
Token ID262
Feature activation+1.18e-02

Pos logit contributions
 course+0.064
 reminder+0.046
 yeah+0.045
 wink+0.043
 amaz+0.043
Neg logit contributions
allows-0.074
hes-0.067
hers-0.064
 pans-0.061
ighters-0.061
Token more
Token ID517
Feature activation-3.46e-02

Pos logit contributions
 course+0.022
 reminder+0.015
 yeah+0.015
 amaz+0.015
 wink+0.014
Neg logit contributions
allows-0.022
hes-0.020
hers-0.020
ighters-0.019
 pans-0.018
Token and
Token ID290
Feature activation-4.52e-03

Pos logit contributions
 course+0.066
 reminder+0.045
 yeah+0.045
 wink+0.044
Sure+0.043
Neg logit contributions
allows-0.075
hers-0.067
hes-0.067
ighters-0.063
 pans-0.062
Token play
Token ID711
Feature activation-5.25e-02

Pos logit contributions
 course+0.008
 yeah+0.005
of+0.005
 reminder+0.005
 amaz+0.005
Neg logit contributions
allows-0.010
hes-0.009
hers-0.009
 pans-0.008
ighters-0.008
Token less
Token ID1342
Feature activation-3.72e-02

Pos logit contributions
 course+0.098
 reminder+0.068
 yeah+0.068
 wink+0.066
 amaz+0.065
Neg logit contributions
allows-0.112
hers-0.102
hes-0.102
ighters-0.095
 pans-0.093
Token.
Token ID13
Feature activation-7.21e-04

Pos logit contributions
 course+0.065
 friendship+0.044
 reminder+0.044
 wink+0.042
 amaz+0.040
Neg logit contributions
allows-0.087
hes-0.079
hers-0.079
ighters-0.075
angers-0.072
Token Jack
Token ID3619
Feature activation+2.47e-02

Pos logit contributions
 course+0.001
 reminder+0.001
 wink+0.001
 friendship+0.001
 yeah+0.001
Neg logit contributions
allows-0.001
hers-0.001
hes-0.001
ighters-0.001
angers-0.001
Token got
Token ID1392
Feature activation+8.43e-02

Pos logit contributions
allows+0.060
hers+0.056
hes+0.055
ighters+0.051
ages+0.051
Neg logit contributions
 course-0.039
 yeah-0.024
 reminder-0.024
 amaz-0.023
 friendship-0.022
Token frustrated
Token ID14718
Feature activation+1.16e-01

Pos logit contributions
allows+0.173
hers+0.156
hes+0.147
ighters+0.145
 pans+0.140
Neg logit contributions
 course-0.171
 wink-0.125
 friendship-0.120
 reminder-0.120
Sure-0.120
Token and
Token ID290
Feature activation-3.48e-03

Pos logit contributions
allows+0.245
hers+0.227
hes+0.221
 pans+0.206
ighters+0.203
Neg logit contributions
 course-0.205
 yeah-0.140
 amaz-0.139
 wink-0.139
 reminder-0.138
Token shouted
Token ID17293
Feature activation+4.65e-02

Pos logit contributions
 course+0.006
 reminder+0.004
 yeah+0.004
 friendship+0.004
 wink+0.004
Neg logit contributions
allows-0.008
hers-0.007
hes-0.007
ighters-0.007
 pans-0.007
Token at
Token ID379
Feature activation-1.34e-02

Pos logit contributions
allows+0.116
hers+0.106
hes+0.105
ighters+0.100
 pans+0.098
Neg logit contributions
 course-0.075
 reminder-0.046
 yeah-0.045
 wink-0.044
 friendship-0.044
Token the
Token ID262
Feature activation+1.13e-02

Pos logit contributions
allows+0.109
hes+0.100
hers+0.098
ighters+0.091
 pans+0.090
Neg logit contributions
 course-0.105
 yeah-0.077
 reminder-0.076
 amaz-0.074
 wink-0.073
Token story
Token ID1621
Feature activation+4.62e-02

Pos logit contributions
allows+0.028
hers+0.025
hes+0.025
ighters+0.024
 pans+0.024
Neg logit contributions
 course-0.019
 reminder-0.012
 friendship-0.011
 wink-0.011
 yeah-0.011
Token is
Token ID318
Feature activation-5.34e-02

Pos logit contributions
allows+0.114
hes+0.109
ained+0.103
hers+0.103
ighters+0.102
Neg logit contributions
 course-0.069
 amaz-0.042
 reminder-0.042
 yeah-0.042
 wink-0.040
Token:
Token ID25
Feature activation+5.05e-02

Pos logit contributions
 course+0.083
 reminder+0.051
 yeah+0.049
 amaz+0.048
 wink+0.048
Neg logit contributions
allows-0.135
hes-0.124
hers-0.123
ighters-0.118
 pans-0.114
Token Do
Token ID2141
Feature activation-4.40e-02

Pos logit contributions
allows+0.113
hes+0.103
hers+0.102
ighters+0.099
 pans+0.094
Neg logit contributions
 course-0.090
 reminder-0.059
 yeah-0.058
 wink-0.058
 amaz-0.057
Token not
Token ID407
Feature activation+7.74e-02

Pos logit contributions
 course+0.115
 reminder+0.088
 yeah+0.086
 friendship+0.086
 wink+0.086
Neg logit contributions
allows-0.066
hers-0.056
hes-0.056
ighters-0.051
 pans-0.050
Token demand
Token ID3512
Feature activation+1.55e-02

Pos logit contributions
allows+0.191
hes+0.173
hers+0.173
ighters+0.164
 pans+0.162
Neg logit contributions
 course-0.123
 reminder-0.075
 yeah-0.073
 amaz-0.070
Sure-0.069
Token things
Token ID1243
Feature activation+1.62e-02

Pos logit contributions
allows+0.032
hes+0.028
hers+0.028
ighters+0.027
 pans+0.026
Neg logit contributions
 course-0.031
 reminder-0.022
 yeah-0.021
 wink-0.021
Sure-0.021
Token that
Token ID326
Feature activation+1.54e-02

Pos logit contributions
allows+0.036
hes+0.033
hers+0.033
ighters+0.031
 pans+0.030
Neg logit contributions
 course-0.029
 reminder-0.020
 amaz-0.019
 yeah-0.019
 wink-0.019
Token are
Token ID389
Feature activation-6.09e-02

Pos logit contributions
allows+0.033
hes+0.031
hers+0.029
ighters+0.029
 pans+0.028
Neg logit contributions
 course-0.027
 reminder-0.020
 yeah-0.020
Sure-0.019
joy-0.019
Token not
Token ID407
Feature activation+7.76e-02

Pos logit contributions
 course+0.134
 friendship+0.105
 reminder+0.099
 wink+0.097
 yeah+0.091
Neg logit contributions
allows-0.116
hes-0.097
hers-0.097
angers-0.096
ighters-0.093
Token thought
Token ID1807
Feature activation+2.19e-02

Pos logit contributions
allows+0.046
hes+0.042
hers+0.042
ages+0.039
ighters+0.038
Neg logit contributions
 course-0.030
 yeah-0.022
 reminder-0.021
 amaz-0.020
 friendship-0.019
Token for
Token ID329
Feature activation+5.69e-02

Pos logit contributions
allows+0.046
hes+0.041
hers+0.040
ighters+0.039
 pans+0.036
Neg logit contributions
 course-0.043
 reminder-0.031
 yeah-0.030
 amaz-0.030
Sure-0.030
Token a
Token ID257
Feature activation-1.39e-03

Pos logit contributions
allows+0.101
hers+0.093
hes+0.090
 pans+0.082
ighters+0.081
Neg logit contributions
 course-0.123
 reminder-0.096
 yeah-0.096
Sure-0.092
of-0.090
Token moment
Token ID2589
Feature activation-7.82e-02

Pos logit contributions
 course+0.002
 amaz+0.002
ement+0.002
 yeah+0.001
ad+0.001
Neg logit contributions
allows-0.003
 pans-0.003
hers-0.003
hes-0.002
inters-0.002
Token and
Token ID290
Feature activation+6.50e-04

Pos logit contributions
 course+0.146
 yeah+0.111
Sure+0.107
 amaz+0.104
 friendship+0.103
Neg logit contributions
allows-0.152
hers-0.135
hes-0.133
ages-0.122
ighters-0.117
Token said
Token ID531
Feature activation+2.38e-02

Pos logit contributions
allows+0.001
hes+0.001
hers+0.001
ighters+0.001
 pans+0.001
Neg logit contributions
 course-0.001
 reminder-0.001
 yeah-0.001
 friendship-0.001
est-0.001
Token,
Token ID11
Feature activation-1.16e-03

Pos logit contributions
allows+0.058
hes+0.052
hers+0.052
ighters+0.050
ages+0.048
Neg logit contributions
 course-0.037
 reminder-0.024
 amaz-0.023
 friendship-0.022
of-0.022
Token "
Token ID366
Feature activation+3.06e-02

Pos logit contributions
 course+0.002
 reminder+0.001
 yeah+0.001
 wink+0.001
 friendship+0.001
Neg logit contributions
allows-0.003
hers-0.002
hes-0.002
ighters-0.002
 pans-0.002
TokenI
Token ID40
Feature activation-1.24e-01

Pos logit contributions
allows+0.058
hers+0.057
hes+0.056
 pans+0.053
ained+0.051
Neg logit contributions
 course-0.053
 amaz-0.042
 friendship-0.039
 yeah-0.038
 reminder-0.038
Token'm
Token ID1101
Feature activation-4.76e-02

Pos logit contributions
 course+0.238
 yeah+0.156
 reminder+0.155
 amaz+0.152
 friendship+0.152
Neg logit contributions
allows-0.263
hes-0.238
hers-0.232
ighters-0.228
ages-0.212
Token sorry
Token ID7926
Feature activation+3.70e-02

Pos logit contributions
 course+0.089
 wink+0.065
 reminder+0.061
 yeah+0.060
 friendship+0.058
Neg logit contributions
allows-0.100
hers-0.091
 pans-0.090
ages-0.090
ometers-0.087
Token nodd
Token ID13177
Feature activation+4.63e-02

Pos logit contributions
allows+0.014
hes+0.013
hers+0.013
 pans+0.012
 chores+0.012
Neg logit contributions
 course-0.012
 yeah-0.009
of-0.008
 amaz-0.008
Sure-0.008
Tokenles
Token ID829
Feature activation+3.21e-02

Pos logit contributions
allows+0.090
hes+0.081
hers+0.080
ages+0.073
ighters+0.072
Neg logit contributions
 course-0.096
 amaz-0.073
 reminder-0.072
 yeah-0.070
 friendship-0.070
Token.
Token ID13
Feature activation-8.54e-03

Pos logit contributions
allows+0.067
hes+0.060
hers+0.058
ighters+0.054
ages+0.054
Neg logit contributions
 course-0.060
 yeah-0.042
 amaz-0.041
Sure-0.041
 friendship-0.040
Token Then
Token ID3244
Feature activation-4.00e-02

Pos logit contributions
 course+0.016
 reminder+0.011
 wink+0.011
 yeah+0.011
 friendship+0.011
Neg logit contributions
allows-0.019
hes-0.017
hers-0.016
ighters-0.015
ained-0.015
Token,
Token ID11
Feature activation-1.25e-02

Pos logit contributions
 course+0.070
 reminder+0.045
 yeah+0.043
 friendship+0.043
Sure+0.042
Neg logit contributions
allows-0.088
hes-0.078
 pans-0.076
ighters-0.073
hers-0.072
Token she
Token ID673
Feature activation+1.34e-02

Pos logit contributions
 course+0.023
 reminder+0.015
 yeah+0.015
 friendship+0.015
 wink+0.014
Neg logit contributions
allows-0.028
hes-0.025
hers-0.024
ighters-0.023
 pans-0.023
Token and
Token ID290
Feature activation-6.69e-03

Pos logit contributions
allows+0.030
hes+0.028
hers+0.027
 pans+0.025
ighters+0.025
Neg logit contributions
 course-0.024
 yeah-0.016
 reminder-0.015
 friendship-0.015
Sure-0.015
Token her
Token ID607
Feature activation-1.02e-02

Pos logit contributions
 course+0.013
 yeah+0.009
 reminder+0.009
Sure+0.009
 friendship+0.009
Neg logit contributions
allows-0.014
hes-0.012
 pans-0.011
hers-0.011
ighters-0.011
Token mummy
Token ID38567
Feature activation+4.36e-02

Pos logit contributions
 course+0.025
 yeah+0.018
Sure+0.018
 reminder+0.018
of+0.018
Neg logit contributions
allows-0.017
hes-0.014
hers-0.013
 pans-0.013
ighters-0.012
Token started
Token ID2067
Feature activation+5.76e-02

Pos logit contributions
allows+0.102
hes+0.093
hers+0.093
ighters+0.086
 pans+0.085
Neg logit contributions
 course-0.074
 yeah-0.049
 reminder-0.049
 friendship-0.047
Sure-0.046
Token to
Token ID284
Feature activation-6.84e-03

Pos logit contributions
allows+0.134
hers+0.123
ighters+0.121
hes+0.121
 pans+0.114
Neg logit contributions
 course-0.099
 reminder-0.067
 wink-0.066
 friendship-0.065
 amaz-0.063
Token will
Token ID481
Feature activation-3.09e-02

Pos logit contributions
allows+0.061
hers+0.056
hes+0.055
ighters+0.053
 pans+0.052
Neg logit contributions
 course-0.034
 reminder-0.019
 yeah-0.019
 wink-0.018
 friendship-0.018
Token play
Token ID711
Feature activation-4.81e-02

Pos logit contributions
 course+0.052
 reminder+0.032
 yeah+0.032
 friendship+0.031
 amaz+0.031
Neg logit contributions
allows-0.074
hes-0.068
hers-0.068
ighters-0.064
 pans-0.063
Token with
Token ID351
Feature activation-3.23e-02

Pos logit contributions
 course+0.091
 reminder+0.066
 yeah+0.064
 amaz+0.063
of+0.062
Neg logit contributions
allows-0.099
hes-0.093
hers-0.089
ighters-0.085
 pans-0.082
Token safer
Token ID14178
Feature activation-2.97e-02

Pos logit contributions
 course+0.061
 reminder+0.047
 yeah+0.043
Sure+0.042
 wink+0.042
Neg logit contributions
allows-0.069
hers-0.062
hes-0.062
ighters-0.059
 pans-0.058
Token toys
Token ID14958
Feature activation+3.73e-02

Pos logit contributions
 course+0.063
 reminder+0.047
 yeah+0.045
 wink+0.045
Sure+0.044
Neg logit contributions
allows-0.058
hers-0.051
hes-0.050
 pans-0.048
ighters-0.048
Token.
Token ID13
Feature activation+5.93e-03

Pos logit contributions
allows+0.085
hers+0.078
hes+0.077
ighters+0.073
 pans+0.071
Neg logit contributions
 course-0.065
 reminder-0.044
 yeah-0.043
 amaz-0.042
 wink-0.042
Token\n
Token ID198
Feature activation-3.05e-03

Pos logit contributions
allows+0.013
hes+0.012
hers+0.011
ighters+0.011
 pans+0.011
Neg logit contributions
 course-0.012
 reminder-0.008
 yeah-0.008
 wink-0.008
 friendship-0.008
Token\n
Token ID198
Feature activation-3.30e-03

Pos logit contributions
 course+0.005
 reminder+0.003
 wink+0.003
 friendship+0.003
 yeah+0.003
Neg logit contributions
allows-0.007
hers-0.007
hes-0.007
ighters-0.006
 pans-0.006
TokenTom
Token ID13787
Feature activation-4.98e-02

Pos logit contributions
 course+0.006
 reminder+0.004
 yeah+0.004
 friendship+0.004
 amaz+0.003
Neg logit contributions
allows-0.008
hes-0.007
hers-0.007
 pans-0.007
ighters-0.007
Token and
Token ID290
Feature activation+6.28e-03

Pos logit contributions
 course+0.075
 reminder+0.050
 yeah+0.045
 friendship+0.043
Sure+0.042
Neg logit contributions
allows-0.126
hes-0.117
hers-0.115
ighters-0.109
 pans-0.107
Token Anna
Token ID11735
Feature activation-4.67e-04

Pos logit contributions
allows+0.013
hes+0.012
hers+0.012
ighters+0.011
 pans+0.011
Neg logit contributions
 course-0.012
 reminder-0.009
 yeah-0.008
 friendship-0.008
 wink-0.008
Token began
Token ID2540
Feature activation+5.30e-02

Pos logit contributions
allows+0.033
hers+0.030
hes+0.030
ighters+0.028
 pans+0.028
Neg logit contributions
 course-0.020
 reminder-0.012
 yeah-0.012
 friendship-0.011
 amaz-0.011
Token to
Token ID284
Feature activation-6.85e-03

Pos logit contributions
allows+0.126
hers+0.116
hes+0.115
ighters+0.111
 pans+0.107
Neg logit contributions
 course-0.089
 reminder-0.058
 friendship-0.057
 wink-0.057
 amaz-0.054
Token suffer
Token ID8659
Feature activation+4.98e-02

Pos logit contributions
 course+0.012
 wink+0.008
 reminder+0.008
 friendship+0.008
 amaz+0.008
Neg logit contributions
allows-0.016
hes-0.015
hers-0.015
ighters-0.014
angers-0.014
Token.
Token ID13
Feature activation-4.47e-03

Pos logit contributions
allows+0.123
hes+0.113
hers+0.112
ighters+0.107
 pans+0.105
Neg logit contributions
 course-0.081
 reminder-0.050
 friendship-0.049
 wink-0.048
 yeah-0.047
Token He
Token ID679
Feature activation-1.34e-02

Pos logit contributions
 course+0.008
 reminder+0.005
 yeah+0.005
 friendship+0.005
 wink+0.005
Neg logit contributions
allows-0.010
hes-0.009
hers-0.009
ighters-0.009
 pans-0.008
Token was
Token ID373
Feature activation-3.62e-02

Pos logit contributions
 course+0.021
 reminder+0.013
 yeah+0.012
 friendship+0.012
 wink+0.012
Neg logit contributions
allows-0.034
hers-0.031
hes-0.031
ighters-0.030
 pans-0.029
Token not
Token ID407
Feature activation+7.08e-02

Pos logit contributions
 course+0.054
 wink+0.043
 smiled+0.038
 reminder+0.037
 friendship+0.037
Neg logit contributions
allows-0.085
angers-0.079
agons-0.079
 pans-0.079
ants-0.078
Token alone
Token ID3436
Feature activation+4.82e-02

Pos logit contributions
allows+0.157
ighters+0.151
angers+0.149
hers+0.147
agons+0.147
Neg logit contributions
 course-0.106
 wink-0.084
 reminder-0.080
 smiled-0.076
 friendship-0.073
Token though
Token ID996
Feature activation+3.47e-02

Pos logit contributions
allows+0.125
hes+0.114
hers+0.114
ighters+0.109
 pans+0.107
Neg logit contributions
 course-0.073
 reminder-0.043
 friendship-0.042
 wink-0.041
 yeah-0.041
Token;
Token ID26
Feature activation+3.97e-02

Pos logit contributions
allows+0.084
hers+0.077
hes+0.075
ighters+0.072
 pans+0.071
Neg logit contributions
 course-0.057
 reminder-0.036
 yeah-0.036
Sure-0.036
 wink-0.034
Token there
Token ID612
Feature activation+1.09e-02

Pos logit contributions
allows+0.099
hers+0.091
hes+0.090
ighters+0.085
 pans+0.085
Neg logit contributions
 course-0.062
 reminder-0.039
 yeah-0.038
 wink-0.037
Sure-0.037
Token the
Token ID262
Feature activation+1.35e-02

Pos logit contributions
allows+0.045
hers+0.041
hes+0.040
ighters+0.038
 pans+0.037
Neg logit contributions
 course-0.041
 reminder-0.029
 yeah-0.029
 wink-0.028
 friendship-0.028
Token way
Token ID835
Feature activation-1.15e-01

Pos logit contributions
allows+0.027
hers+0.024
hes+0.024
 pans+0.022
ighters+0.022
Neg logit contributions
 course-0.028
 reminder-0.020
 yeah-0.020
 wink-0.019
 friendship-0.019
Token.
Token ID13
Feature activation-2.35e-03

Pos logit contributions
 course+0.205
 reminder+0.134
 yeah+0.131
 wink+0.129
 friendship+0.129
Neg logit contributions
allows-0.266
hers-0.241
hes-0.240
ighters-0.228
 pans-0.223
Token They
Token ID1119
Feature activation+3.31e-02

Pos logit contributions
 course+0.005
 reminder+0.003
 friendship+0.003
 yeah+0.003
Sure+0.003
Neg logit contributions
allows-0.005
hers-0.004
hes-0.004
 pans-0.004
ighters-0.004
Token were
Token ID547
Feature activation-2.72e-04

Pos logit contributions
allows+0.080
hers+0.074
hes+0.073
ighters+0.069
 pans+0.068
Neg logit contributions
 course-0.055
 reminder-0.035
 yeah-0.034
 friendship-0.033
Sure-0.032
Token happy
Token ID3772
Feature activation-6.16e-03

Pos logit contributions
 course+0.000
 wink+0.000
 friendship+0.000
 amaz+0.000
 smiled+0.000
Neg logit contributions
angers-0.001
ighters-0.001
allows-0.001
ants-0.001
agons-0.001
Token and
Token ID290
Feature activation+3.37e-04

Pos logit contributions
 course+0.011
 reminder+0.008
 yeah+0.008
Sure+0.007
of+0.007
Neg logit contributions
allows-0.014
hers-0.012
hes-0.012
ighters-0.012
 pans-0.012
Token curious
Token ID11040
Feature activation+4.43e-03

Pos logit contributions
allows+0.001
hers+0.001
hes+0.001
ighters+0.001
 pans+0.001
Neg logit contributions
 course-0.001
 reminder-0.000
 wink-0.000
 friendship-0.000
 yeah-0.000
Token.
Token ID13
Feature activation-1.90e-03

Pos logit contributions
allows+0.010
hers+0.008
hes+0.008
ighters+0.008
 pans+0.008
Neg logit contributions
 course-0.009
 reminder-0.006
 yeah-0.006
Sure-0.006
 wink-0.006
Token\n
Token ID198
Feature activation-1.37e-02

Pos logit contributions
 course+0.003
 reminder+0.002
 yeah+0.002
 wink+0.002
 friendship+0.002
Neg logit contributions
allows-0.004
hers-0.004
hes-0.004
ighters-0.004
 pans-0.004
Token\n
Token ID198
Feature activation-1.32e-02

Pos logit contributions
 course+0.019
Sure+0.010
 wink+0.010
 friendship+0.010
 reminder+0.010
Neg logit contributions
allows-0.036
hers-0.034
hes-0.033
 pans-0.033
ighters-0.032
Token all
Token ID477
Feature activation+1.64e-02

Pos logit contributions
 course+0.058
 wink+0.041
 reminder+0.040
 friendship+0.038
 smiled+0.038
Neg logit contributions
allows-0.066
ighters-0.058
hers-0.058
 pans-0.058
hes-0.058
Token alone
Token ID3436
Feature activation+5.19e-02

Pos logit contributions
allows+0.031
hers+0.028
hes+0.028
ighters+0.026
 pans+0.025
Neg logit contributions
 course-0.036
 reminder-0.026
 yeah-0.025
 wink-0.025
 friendship-0.025
Token in
Token ID287
Feature activation+6.65e-03

Pos logit contributions
allows+0.118
hers+0.107
hes+0.107
ighters+0.101
 pans+0.099
Neg logit contributions
 course-0.093
 reminder-0.062
 yeah-0.061
 wink-0.060
 friendship-0.060
Token the
Token ID262
Feature activation+1.02e-02

Pos logit contributions
allows+0.014
hers+0.012
hes+0.012
ighters+0.011
 pans+0.011
Neg logit contributions
 course-0.013
 reminder-0.009
 yeah-0.009
Sure-0.009
 wink-0.009
Token park
Token ID3952
Feature activation-1.00e-01

Pos logit contributions
allows+0.021
hers+0.020
hes+0.019
 pans+0.018
ained+0.018
Neg logit contributions
 course-0.019
 yeah-0.014
Sure-0.013
 reminder-0.013
of-0.013
Token.
Token ID13
Feature activation+4.33e-03

Pos logit contributions
 course+0.187
 yeah+0.128
Sure+0.125
 reminder+0.125
 amaz+0.123
Neg logit contributions
allows-0.215
hers-0.198
hes-0.191
ighters-0.178
ained-0.177
Token\n
Token ID198
Feature activation-1.64e-02

Pos logit contributions
allows+0.010
hers+0.010
hes+0.010
ighters+0.009
 pans+0.009
Neg logit contributions
 course-0.007
 reminder-0.005
 yeah-0.004
 friendship-0.004
 wink-0.004
Token\n
Token ID198
Feature activation-1.65e-02

Pos logit contributions
 course+0.025
Sure+0.014
 reminder+0.014
 wink+0.014
 friendship+0.014
Neg logit contributions
allows-0.042
hers-0.039
hes-0.038
 pans-0.037
ighters-0.037
TokenThen
Token ID6423
Feature activation-5.42e-02

Pos logit contributions
 course+0.031
 reminder+0.020
Sure+0.020
 yeah+0.020
 wink+0.020
Neg logit contributions
allows-0.037
hers-0.033
hes-0.033
ighters-0.032
 pans-0.031
Token,
Token ID11
Feature activation-1.85e-04

Pos logit contributions
 course+0.081
 reminder+0.049
 yeah+0.047
 friendship+0.046
 wink+0.046
Neg logit contributions
allows-0.138
hers-0.125
hes-0.125
ighters-0.120
 pans-0.119
Token Tom
Token ID4186
Feature activation-6.16e-03

Pos logit contributions
 course+0.000
 reminder+0.000
 wink+0.000
 yeah+0.000
 friendship+0.000
Neg logit contributions
allows-0.000
hes-0.000
hers-0.000
ighters-0.000
 pans-0.000
Token.
Token ID13
Feature activation-4.65e-03

Pos logit contributions
 course+0.203
 yeah+0.139
 reminder+0.137
 amaz+0.135
 wink+0.133
Neg logit contributions
allows-0.231
hes-0.210
hers-0.209
ighters-0.196
ained-0.190
Token Kitty
Token ID21168
Feature activation+4.05e-02

Pos logit contributions
 course+0.007
 friendship+0.004
 reminder+0.004
Sure+0.004
 wink+0.004
Neg logit contributions
allows-0.012
hers-0.011
hes-0.011
 pans-0.011
ighters-0.011
Token learned
Token ID4499
Feature activation+2.35e-02

Pos logit contributions
allows+0.097
hes+0.091
hers+0.090
ages+0.084
ighters+0.082
Neg logit contributions
 course-0.060
 reminder-0.041
 yeah-0.040
 amaz-0.039
Sure-0.037
Token that
Token ID326
Feature activation+1.23e-02

Pos logit contributions
allows+0.066
hers+0.060
hes+0.059
ighters+0.058
 pans+0.058
Neg logit contributions
 course-0.033
 friendship-0.018
 wink-0.017
 reminder-0.017
Sure-0.017
Token it
Token ID340
Feature activation+1.34e-02

Pos logit contributions
allows+0.025
hes+0.023
hers+0.022
ighters+0.021
ained+0.020
Neg logit contributions
 course-0.025
 yeah-0.018
 reminder-0.017
of-0.017
 wink-0.017
Token's
Token ID338
Feature activation-1.27e-02

Pos logit contributions
allows+0.037
hers+0.034
hes+0.034
ighters+0.032
 pans+0.032
Neg logit contributions
 course-0.018
 reminder-0.010
 yeah-0.009
 wink-0.009
 friendship-0.009
Token okay
Token ID8788
Feature activation+2.40e-02

Pos logit contributions
 course+0.026
 friendship+0.019
 reminder+0.019
 wink+0.017
Sure+0.017
Neg logit contributions
allows-0.027
hers-0.024
ighters-0.023
 pans-0.023
angers-0.023
Token to
Token ID284
Feature activation-3.21e-03

Pos logit contributions
allows+0.046
hes+0.041
hers+0.040
ighters+0.038
ained+0.038
Neg logit contributions
 course-0.052
 amaz-0.038
 yeah-0.036
 reminder-0.036
 wink-0.036
Token feel
Token ID1254
Feature activation+2.15e-02

Pos logit contributions
 course+0.005
 reminder+0.003
 yeah+0.003
 amaz+0.003
 wink+0.003
Neg logit contributions
allows-0.008
hers-0.007
hes-0.007
ighters-0.007
 pans-0.007
Token embarrass
Token ID9614
Feature activation+5.03e-02

Pos logit contributions
allows+0.043
hers+0.039
hes+0.037
ighters+0.035
 pans+0.035
Neg logit contributions
 course-0.046
 reminder-0.033
 friendship-0.032
 wink-0.032
 yeah-0.031
Token and
Token ID290
Feature activation-4.67e-03

Pos logit contributions
allows+0.114
hers+0.103
hes+0.102
ighters+0.097
 pans+0.095
Neg logit contributions
 course-0.093
 reminder-0.062
 yeah-0.060
 wink-0.060
 friendship-0.059
Token hand
Token ID1021
Feature activation-8.88e-02

Pos logit contributions
 course+0.033
 reminder+0.024
Sure+0.023
 yeah+0.023
of+0.023
Neg logit contributions
allows-0.027
hes-0.024
ages-0.023
 pans-0.023
hers-0.022
Token,
Token ID11
Feature activation-2.64e-03

Pos logit contributions
 course+0.141
 reminder+0.087
 yeah+0.085
 friendship+0.084
Sure+0.082
Neg logit contributions
allows-0.221
hers-0.202
hes-0.202
ighters-0.190
 pans-0.188
Token and
Token ID290
Feature activation-1.36e-03

Pos logit contributions
 course+0.005
 reminder+0.003
 yeah+0.003
 friendship+0.003
Sure+0.003
Neg logit contributions
allows-0.006
hes-0.006
hers-0.006
 pans-0.005
ighters-0.005
Token he
Token ID339
Feature activation+1.78e-02

Pos logit contributions
 course+0.002
 reminder+0.001
 yeah+0.001
 friendship+0.001
 wink+0.001
Neg logit contributions
allows-0.003
hes-0.003
hers-0.003
ighters-0.003
 pans-0.003
Token wanted
Token ID2227
Feature activation+3.27e-02

Pos logit contributions
allows+0.048
hers+0.044
hes+0.044
ighters+0.042
 pans+0.041
Neg logit contributions
 course-0.025
 reminder-0.014
 wink-0.014
 friendship-0.014
 yeah-0.013
Token to
Token ID284
Feature activation-3.06e-03

Pos logit contributions
allows+0.064
hes+0.058
hers+0.057
ighters+0.053
 pans+0.053
Neg logit contributions
 course-0.069
 yeah-0.050
 reminder-0.050
 amaz-0.049
Sure-0.047
Token catch
Token ID4929
Feature activation-5.34e-02

Pos logit contributions
 course+0.005
 reminder+0.003
 yeah+0.003
 friendship+0.003
 amaz+0.003
Neg logit contributions
allows-0.007
hes-0.007
hers-0.007
ighters-0.006
 pans-0.006
Token the
Token ID262
Feature activation+1.58e-02

Pos logit contributions
 course+0.092
 reminder+0.059
 yeah+0.057
 wink+0.057
 friendship+0.057
Neg logit contributions
allows-0.128
hers-0.116
hes-0.116
ighters-0.110
 pans-0.108
Token duck
Token ID22045
Feature activation+4.62e-03

Pos logit contributions
allows+0.039
hers+0.036
hes+0.035
ighters+0.034
angers+0.033
Neg logit contributions
 course-0.026
 wink-0.017
 friendship-0.017
 reminder-0.016
Sure-0.015
Token.
Token ID13
Feature activation-6.23e-03

Pos logit contributions
allows+0.012
hers+0.011
hes+0.011
ighters+0.010
 pans+0.010
Neg logit contributions
 course-0.007
 reminder-0.004
 wink-0.004
 friendship-0.004
of-0.004
Token The
Token ID383
Feature activation-1.71e-02

Pos logit contributions
 course+0.011
 reminder+0.007
 yeah+0.007
 friendship+0.007
 wink+0.007
Neg logit contributions
allows-0.015
hers-0.013
hes-0.013
ighters-0.013
 pans-0.013
Token.
Token ID13
Feature activation-9.93e-03

Pos logit contributions
 course+0.135
 yeah+0.096
 reminder+0.093
 amaz+0.092
Sure+0.091
Neg logit contributions
allows-0.143
hers-0.130
hes-0.128
ighters-0.120
 pans-0.118
Token The
Token ID383
Feature activation-2.09e-02

Pos logit contributions
 course+0.020
 reminder+0.014
 wink+0.014
 yeah+0.013
 friendship+0.013
Neg logit contributions
allows-0.020
hes-0.018
hers-0.018
ighters-0.017
ained-0.016
Token kind
Token ID1611
Feature activation-3.08e-02

Pos logit contributions
 course+0.037
of+0.026
 yeah+0.024
ah+0.022
 Sure+0.021
Neg logit contributions
allows-0.043
 pans-0.042
hes-0.040
hers-0.039
ighters-0.036
Token and
Token ID290
Feature activation-4.57e-03

Pos logit contributions
 course+0.062
 yeah+0.041
Sure+0.037
 reminder+0.035
of+0.034
Neg logit contributions
allows-0.061
hes-0.055
 pans-0.053
hers-0.052
ighters-0.051
Token compassionate
Token ID32533
Feature activation-6.29e-03

Pos logit contributions
 course+0.009
of+0.006
 yeah+0.006
Sure+0.006
 reminder+0.006
Neg logit contributions
allows-0.009
hes-0.009
 pans-0.008
hers-0.008
ighters-0.007
Token face
Token ID1986
Feature activation-9.68e-02

Pos logit contributions
 course+0.013
 yeah+0.008
Sure+0.007
 amaz+0.007
of+0.007
Neg logit contributions
allows-0.013
 pans-0.011
hes-0.011
hers-0.011
ighters-0.011
Token always
Token ID1464
Feature activation+1.45e-02

Pos logit contributions
 course+0.158
 yeah+0.099
 reminder+0.099
 amaz+0.093
Sure+0.093
Neg logit contributions
allows-0.234
hes-0.216
hers-0.214
ighters-0.200
ages-0.195
Token gave
Token ID2921
Feature activation+2.05e-02

Pos logit contributions
allows+0.028
hes+0.026
hers+0.025
ages+0.023
 pans+0.023
Neg logit contributions
 course-0.028
Sure-0.021
 yeah-0.021
of-0.020
Of-0.018
Token her
Token ID607
Feature activation-9.64e-03

Pos logit contributions
allows+0.043
hes+0.038
hers+0.038
ighters+0.036
 pans+0.035
Neg logit contributions
 course-0.039
 reminder-0.028
 yeah-0.028
 amaz-0.026
 wink-0.026
Token the
Token ID262
Feature activation+7.41e-03

Pos logit contributions
 course+0.019
 reminder+0.013
 yeah+0.013
 wink+0.012
Sure+0.012
Neg logit contributions
allows-0.020
hes-0.018
hers-0.018
ighters-0.017
 pans-0.017
Token courage
Token ID11917
Feature activation-7.74e-02

Pos logit contributions
allows+0.015
 pans+0.014
hes+0.013
hers+0.013
ighters+0.012
Neg logit contributions
 course-0.014
of-0.009
 yeah-0.008
Sure-0.008
bye-0.008
Token thanking
Token ID41892
Feature activation-1.43e-02

Pos logit contributions
 course+0.009
 yeah+0.006
of+0.006
est+0.006
 friendship+0.006
Neg logit contributions
allows-0.011
hes-0.010
hers-0.009
 pans-0.009
ages-0.008
Token her
Token ID607
Feature activation-9.75e-03

Pos logit contributions
 course+0.028
 reminder+0.020
 yeah+0.020
 amaz+0.019
 wink+0.019
Neg logit contributions
allows-0.029
hers-0.026
hes-0.026
ighters-0.024
 pans-0.024
Token for
Token ID329
Feature activation+5.61e-02

Pos logit contributions
 course+0.018
 reminder+0.013
 friendship+0.012
 wink+0.012
 yeah+0.012
Neg logit contributions
allows-0.022
hers-0.020
hes-0.020
ighters-0.018
 pans-0.018
Token the
Token ID262
Feature activation+6.29e-03

Pos logit contributions
allows+0.110
hes+0.106
hers+0.102
 pans+0.090
ighters+0.088
Neg logit contributions
 course-0.109
of-0.083
 yeah-0.081
est-0.080
 reminder-0.079
Token special
Token ID2041
Feature activation-6.17e-02

Pos logit contributions
allows+0.013
hers+0.012
hes+0.012
 pans+0.011
ighters+0.011
Neg logit contributions
 course-0.011
of-0.008
 yeah-0.008
Sure-0.007
 amaz-0.007
Token present
Token ID1944
Feature activation-9.91e-02

Pos logit contributions
 course+0.106
 yeah+0.082
Nice+0.081
 amaz+0.080
Sure+0.079
Neg logit contributions
allows-0.101
 pans-0.094
hes-0.089
hers-0.088
 chores-0.087
Token.
Token ID13
Feature activation-4.09e-03

Pos logit contributions
 course+0.202
 yeah+0.141
 amaz+0.141
Sure+0.133
 friendship+0.129
Neg logit contributions
allows-0.194
hes-0.175
hers-0.175
ages-0.164
ighters-0.159
Token\n
Token ID198
Feature activation-2.16e-02

Pos logit contributions
 course+0.007
 reminder+0.005
 yeah+0.005
 wink+0.005
 friendship+0.005
Neg logit contributions
allows-0.009
hes-0.008
hers-0.008
ighters-0.008
 pans-0.008
Token\n
Token ID198
Feature activation-2.15e-02

Pos logit contributions
 course+0.040
 reminder+0.026
 wink+0.026
 friendship+0.026
Sure+0.025
Neg logit contributions
allows-0.049
hers-0.044
hes-0.043
ighters-0.042
 pans-0.041
TokenThe
Token ID464
Feature activation-4.13e-02

Pos logit contributions
 course+0.039
 reminder+0.026
Sure+0.025
 wink+0.025
 yeah+0.024
Neg logit contributions
allows-0.050
hers-0.045
hes-0.044
ighters-0.043
 pans-0.042
Token moral
Token ID6573
Feature activation+2.54e-02

Pos logit contributions
 course+0.073
 yeah+0.055
of+0.054
Sure+0.048
 amaz+0.046
Neg logit contributions
allows-0.085
hers-0.078
hes-0.077
 pans-0.075
ighters-0.070
Token house
Token ID2156
Feature activation+2.10e-02

Pos logit contributions
 course+0.010
 reminder+0.007
 yeah+0.007
 wink+0.007
 friendship+0.007
Neg logit contributions
allows-0.009
hers-0.008
hes-0.008
ighters-0.007
 pans-0.007
Token.
Token ID13
Feature activation-4.82e-03

Pos logit contributions
allows+0.052
hers+0.047
hes+0.047
ighters+0.046
 pans+0.044
Neg logit contributions
 course-0.034
 reminder-0.021
 wink-0.020
 friendship-0.020
 yeah-0.020
Token It
Token ID632
Feature activation-5.28e-02

Pos logit contributions
 course+0.010
 reminder+0.007
 friendship+0.006
 yeah+0.006
 wink+0.006
Neg logit contributions
allows-0.010
hes-0.009
hers-0.009
ighters-0.008
 pans-0.008
Token was
Token ID373
Feature activation-3.38e-02

Pos logit contributions
 course+0.066
 reminder+0.035
 yeah+0.035
 friendship+0.031
 amaz+0.031
Neg logit contributions
allows-0.149
hes-0.141
hers-0.139
 pans-0.132
ighters-0.129
Token a
Token ID257
Feature activation+3.62e-04

Pos logit contributions
 course+0.059
 wink+0.042
 reminder+0.042
 friendship+0.041
 amaz+0.039
Neg logit contributions
allows-0.077
hers-0.069
ighters-0.068
hes-0.067
ages-0.065
Token gift
Token ID6979
Feature activation-9.69e-02

Pos logit contributions
allows+0.001
 pans+0.001
hers+0.001
hes+0.001
ighters+0.000
Neg logit contributions
 course-0.001
of-0.001
 yeah-0.001
est-0.001
 amaz-0.001
Token for
Token ID329
Feature activation+5.71e-02

Pos logit contributions
 course+0.219
 yeah+0.161
 amaz+0.158
 friendship+0.154
 reminder+0.154
Neg logit contributions
allows-0.168
hes-0.155
hers-0.153
ighters-0.139
 pans-0.135
Token her
Token ID607
Feature activation-4.64e-03

Pos logit contributions
allows+0.138
hers+0.125
hes+0.125
ighters+0.118
 pans+0.116
Neg logit contributions
 course-0.096
 reminder-0.062
 yeah-0.060
 wink-0.059
 friendship-0.059
Token.
Token ID13
Feature activation-5.46e-03

Pos logit contributions
 course+0.009
 yeah+0.006
 reminder+0.006
Sure+0.006
 wink+0.006
Neg logit contributions
allows-0.010
hers-0.009
hes-0.009
ighters-0.008
ages-0.008
Token Sue
Token ID26089
Feature activation-1.76e-02

Pos logit contributions
 course+0.011
 reminder+0.008
 friendship+0.007
 yeah+0.007
 amaz+0.007
Neg logit contributions
allows-0.011
hes-0.010
hers-0.009
ained-0.009
ighters-0.009
Token was
Token ID373
Feature activation-3.36e-02

Pos logit contributions
 course+0.028
 reminder+0.018
 yeah+0.018
 amaz+0.017
 friendship+0.017
Neg logit contributions
allows-0.044
hers-0.040
hes-0.040
ighters-0.038
 pans-0.037
Token for
Token ID329
Feature activation+5.59e-02

Pos logit contributions
 course+0.053
 reminder+0.034
 yeah+0.030
 wink+0.030
 friendship+0.030
Neg logit contributions
allows-0.088
hes-0.082
hers-0.081
ighters-0.079
ained-0.076
Token giving
Token ID3501
Feature activation+1.55e-02

Pos logit contributions
allows+0.116
hers+0.105
hes+0.105
 pans+0.096
ighters+0.096
Neg logit contributions
 course-0.109
 reminder-0.077
 yeah-0.076
Sure-0.075
of-0.074
Token the
Token ID262
Feature activation+7.38e-03

Pos logit contributions
allows+0.040
hers+0.037
hes+0.036
ighters+0.035
 pans+0.034
Neg logit contributions
 course-0.023
 reminder-0.014
 yeah-0.014
 wink-0.013
 friendship-0.013
Token drain
Token ID14782
Feature activation+7.03e-02

Pos logit contributions
allows+0.014
hes+0.013
hers+0.013
ighters+0.011
ages+0.011
Neg logit contributions
 course-0.016
 friendship-0.012
 reminder-0.012
 wink-0.012
 memory-0.011
Token a
Token ID257
Feature activation-5.70e-03

Pos logit contributions
allows+0.157
hers+0.142
hes+0.142
ighters+0.134
 pans+0.132
Neg logit contributions
 course-0.129
 reminder-0.086
 yeah-0.084
 wink-0.083
 friendship-0.083
Token hug
Token ID16225
Feature activation-9.90e-02

Pos logit contributions
 course+0.011
 yeah+0.007
 amaz+0.006
of+0.006
joy+0.006
Neg logit contributions
allows-0.012
hes-0.011
hers-0.010
 pans-0.010
ighters-0.009
Token,
Token ID11
Feature activation-9.29e-03

Pos logit contributions
 course+0.204
 reminder+0.144
 wink+0.139
 friendship+0.138
 yeah+0.138
Neg logit contributions
allows-0.202
hers-0.181
hes-0.180
ighters-0.172
 pans-0.167
Token Tina
Token ID35903
Feature activation+4.07e-02

Pos logit contributions
 course+0.016
 reminder+0.011
 yeah+0.010
 wink+0.010
 friendship+0.010
Neg logit contributions
allows-0.022
hers-0.020
hes-0.020
ighters-0.019
 pans-0.018
Token.
Token ID13
Feature activation-7.72e-03

Pos logit contributions
allows+0.088
hers+0.080
hes+0.080
ighters+0.076
 pans+0.074
Neg logit contributions
 course-0.079
 reminder-0.055
 wink-0.053
 friendship-0.052
 yeah-0.052
Token As
Token ID1081
Feature activation+3.95e-02

Pos logit contributions
 course+0.013
 reminder+0.008
 yeah+0.008
 wink+0.007
 friendship+0.007
Neg logit contributions
allows-0.019
hers-0.017
hes-0.017
ighters-0.016
 pans-0.016
Token a
Token ID257
Feature activation-6.15e-03

Pos logit contributions
allows+0.081
hes+0.072
hers+0.072
 pans+0.069
ighters+0.067
Neg logit contributions
 course-0.075
 yeah-0.053
 amaz-0.051
Sure-0.051
 reminder-0.051
Token smiled
Token ID13541
Feature activation-1.99e-02

Pos logit contributions
 course+0.037
 yeah+0.025
 reminder+0.023
Sure+0.022
 amaz+0.022
Neg logit contributions
allows-0.059
hers-0.055
hes-0.055
ages-0.052
 pans-0.049
Token and
Token ID290
Feature activation-2.11e-03

Pos logit contributions
 course+0.037
 yeah+0.023
 reminder+0.022
of+0.021
Sure+0.021
Neg logit contributions
allows-0.045
hes-0.041
hers-0.040
ighters-0.037
ages-0.037
Token said
Token ID531
Feature activation+2.23e-02

Pos logit contributions
 course+0.004
 yeah+0.003
 reminder+0.003
of+0.002
 amaz+0.002
Neg logit contributions
hes-0.004
allows-0.004
ages-0.004
hers-0.004
 pans-0.004
Token,
Token ID11
Feature activation-6.86e-03

Pos logit contributions
allows+0.059
hes+0.054
hers+0.053
ighters+0.051
 pans+0.050
Neg logit contributions
 course-0.032
 reminder-0.018
 amaz-0.017
 yeah-0.017
 wink-0.017
Token "
Token ID366
Feature activation+2.93e-02

Pos logit contributions
 course+0.013
 reminder+0.009
 yeah+0.008
 friendship+0.008
 wink+0.008
Neg logit contributions
allows-0.015
hers-0.014
hes-0.014
ighters-0.013
 pans-0.013
TokenI
Token ID40
Feature activation-1.27e-01

Pos logit contributions
allows+0.055
hes+0.054
hers+0.054
ained+0.047
 pans+0.046
Neg logit contributions
 course-0.054
 amaz-0.041
 yeah-0.040
 friendship-0.039
 wink-0.037
Token'm
Token ID1101
Feature activation-4.90e-02

Pos logit contributions
 course+0.232
 yeah+0.148
 amaz+0.145
 friendship+0.141
of+0.138
Neg logit contributions
allows-0.280
hes-0.258
hers-0.245
ages-0.232
ighters-0.229
Token glad
Token ID9675
Feature activation-2.26e-02

Pos logit contributions
 course+0.089
 reminder+0.060
 wink+0.059
 yeah+0.058
 friendship+0.056
Neg logit contributions
allows-0.111
hers-0.102
hes-0.098
 pans-0.098
ighters-0.097
Token I
Token ID314
Feature activation-9.28e-02

Pos logit contributions
 course+0.038
 reminder+0.024
 yeah+0.023
 wink+0.023
 friendship+0.023
Neg logit contributions
allows-0.054
hers-0.049
hes-0.049
ighters-0.047
 pans-0.046
Token could
Token ID714
Feature activation-1.04e-02

Pos logit contributions
 course+0.165
 yeah+0.109
 friendship+0.107
 reminder+0.107
 amaz+0.106
Neg logit contributions
allows-0.211
hes-0.197
hers-0.192
ighters-0.183
ained-0.174
Token help
Token ID1037
Feature activation-7.25e-02

Pos logit contributions
 course+0.015
 reminder+0.009
 wink+0.009
 yeah+0.008
 friendship+0.008
Neg logit contributions
allows-0.028
hers-0.026
hes-0.025
ighters-0.024
 pans-0.024