TOP ACTIVATIONS
MAX = 0.132

: " Don 't worry , it will be alright ,
loud voice . " Hello , children !" she said .
of her . " Hello , Lily ! I am the
said " Don 't worry , I won 't hurt you
snow man . " Hello , Mr . Snow man !"
. " Don 't worry , Tim ," she said .
She said , " Hello , do ggy . I want
the dog , " Hi , I 'm Tom . Are
the rat . " Hello , rat . What is your
and said , " Hello , sir . We are trying
and said , " Hello , Rose . I am Tom
He said , " Hello , little girl . Do you
He said , " Hello , Anna . I want to
the girl , " Hello , little girl . My name
. " Hello , little ones . Why are
. " Hello , prince . Why are you
and said , " Hello , little bird . Why are
He said , " Hello , little boy . We saw
and says , " Hello , can I play your fl
’t worry , Sam . I can help

MIN ACTIVATIONS
MIN = -0.109

to be lucky with their newly prepared jar . <|endoftext|> Once
beautiful jazz tune on her newly repaired toy . From that
dogs and small dogs and fluffy dogs . They wanted to
show Mom the dresses they both had chosen . <|endoftext|> Once
crown and playing in the freshly cut grass . It looked
enjoyed playing together in their newly cleaned space . <|endoftext|> Once
stirred until it was all soft and fluffy . Then ,
it . They both enjoyed each other . <|endoftext|> L ily
no energy to play . Tears rolled down his cheeks and
and fell asleep in her newly painted crib . From that
. He stretched out his newly found wings and flew away
and blocks , all in bright red colours ! She was
having fun together all year long . They fit their plans
als and bloom ed in bright colors - red , yellow
parent smiles and kisses them both . " You are welcome
brothers . They were all brown and furry . They looked
with water and spl ashed each other . They fed the
with delicious potatoes all year long . The end
and laughed as it fl uttered and waved in the wind
the sand and bat hed all afternoon . When the sun

INTERVAL 0.072 - 0.132
CONTAINS 2.551%

<|endoftext|> Once upon a time , there was a bad dust
<|endoftext|> Once upon a time , a little boy named Tim
" Look , Leo , there is a peach for
It was a beautiful doll , but its arm was missing
a moment and then answered , " Pink !" Mom smiled

INTERVAL 0.011 - 0.072
CONTAINS 27.015%

They decided to go on an adventure . The two friends
inside . " Let 's open it !" Ben
the stable door . Ben saw her and tried to stop
skills with others , you can make new friends and have
When you hold it close , it can tell you things

INTERVAL -0.049 - 0.011
CONTAINS 66.801%

friends . <|endoftext|> Once upon a time , there was a
he could . His pants got caught on a branch and
the beautiful sounds of nature around her . When she tried
the voice . Tim my was sad .
excited to play on the swings and slide . But first

INTERVAL -0.109 - -0.049
CONTAINS 3.633%

!" Tom and Anna did not want to stop
Tim went on the swings and Sue looked at the mushrooms
little boy who liked to stretch . Every morning he would
depending on each other to have fun and stay safe .
g iggle a lot . They love playing with cardboard .
Token:
Token ID25
Feature activation+7.49e-02

Pos logit contributions
chard+0.004
 sir+0.004
ache+0.004
 why+0.004
ber+0.003
Neg logit contributions
acks-0.006
 drifted-0.005
acked-0.005
irmed-0.005
 care-0.005
Token "
Token ID366
Feature activation+5.33e-02

Pos logit contributions
ache+0.158
chard+0.147
 salon+0.145
itor+0.144
 sir+0.140
Neg logit contributions
 They-0.104
acks-0.094
 Soon-0.091
 turns-0.086
 drifted-0.085
TokenDon
Token ID3987
Feature activation+4.76e-02

Pos logit contributions
chard+0.089
 sir+0.088
ache+0.086
ber+0.081
 why+0.080
Neg logit contributions
acks-0.094
 drifted-0.091
acked-0.084
irmed-0.084
 care-0.082
Token't
Token ID470
Feature activation+4.56e-02

Pos logit contributions
chard+0.083
ache+0.079
 sir+0.075
ber+0.074
 salon+0.073
Neg logit contributions
acks-0.083
 drifted-0.080
 care-0.076
acked-0.074
 They-0.074
Token worry
Token ID5490
Feature activation+5.28e-02

Pos logit contributions
 sir+0.074
chard+0.073
ache+0.072
 why+0.067
 salon+0.065
Neg logit contributions
acks-0.084
 drifted-0.081
acked-0.077
irmed-0.076
igg-0.072
Token,
Token ID11
Feature activation+1.31e-01

Pos logit contributions
chard+0.115
 salon+0.105
ache+0.104
ber+0.103
 principal+0.099
Neg logit contributions
acks-0.069
 They-0.064
 care-0.063
 drifted-0.062
 turns-0.057
Token it
Token ID340
Feature activation+2.81e-02

Pos logit contributions
ache+0.265
chard+0.260
 salon+0.259
ber+0.256
itor+0.245
Neg logit contributions
 They-0.169
 care-0.162
 drifted-0.155
 turns-0.151
 air-0.143
Token will
Token ID481
Feature activation+2.44e-02

Pos logit contributions
ache+0.054
 sir+0.054
chard+0.052
 why+0.052
ber+0.050
Neg logit contributions
acks-0.042
 drifted-0.040
acked-0.037
irmed-0.036
 care-0.035
Token be
Token ID307
Feature activation+5.11e-03

Pos logit contributions
 sir+0.045
chard+0.045
ache+0.045
 why+0.043
 salon+0.040
Neg logit contributions
acks-0.038
 drifted-0.036
acked-0.035
irmed-0.034
hed-0.032
Token alright
Token ID23036
Feature activation+1.73e-02

Pos logit contributions
chard+0.008
 sir+0.008
ache+0.008
 why+0.008
ber+0.007
Neg logit contributions
acks-0.010
 drifted-0.009
acked-0.009
irmed-0.008
 care-0.008
Token,
Token ID11
Feature activation+1.09e-01

Pos logit contributions
chard+0.041
 salon+0.041
ache+0.040
 principal+0.039
itor+0.039
Neg logit contributions
 care-0.018
 They-0.017
acks-0.016
 drifted-0.015
 night-0.015
Token loud
Token ID7812
Feature activation-2.01e-02

Pos logit contributions
chard+0.032
 sir+0.032
ache+0.032
 why+0.030
ber+0.030
Neg logit contributions
 drifted-0.019
acks-0.018
 care-0.018
 turns-0.016
irmed-0.016
Token voice
Token ID3809
Feature activation+2.21e-02

Pos logit contributions
acks+0.032
 drifted+0.031
acked+0.029
irmed+0.029
 care+0.027
Neg logit contributions
 sir-0.036
ache-0.036
chard-0.035
 why-0.034
 salon-0.033
Token.
Token ID13
Feature activation+1.53e-02

Pos logit contributions
chard+0.057
itor+0.053
 sir+0.052
 salon+0.051
ache+0.051
Neg logit contributions
 drifted-0.019
 care-0.017
acks-0.017
 They-0.015
 drift-0.015
Token "
Token ID366
Feature activation+5.29e-02

Pos logit contributions
ache+0.036
chard+0.035
 sir+0.035
ber+0.033
 salon+0.033
Neg logit contributions
acks-0.017
 drifted-0.016
 They-0.014
acked-0.014
 care-0.013
TokenHello
Token ID15496
Feature activation+5.09e-02

Pos logit contributions
chard+0.088
 sir+0.087
ache+0.086
ber+0.079
 salon+0.078
Neg logit contributions
acks-0.093
 drifted-0.090
acked-0.085
 care-0.084
irmed-0.082
Token,
Token ID11
Feature activation+1.31e-01

Pos logit contributions
chard+0.119
ache+0.112
belt+0.109
 salon+0.108
itor+0.104
Neg logit contributions
 turns-0.050
acks-0.048
 night-0.047
 They-0.047
 care-0.046
Token children
Token ID1751
Feature activation+1.48e-02

Pos logit contributions
chard+0.274
ache+0.263
belt+0.251
ber+0.245
itor+0.240
Neg logit contributions
 They-0.158
 care-0.147
 drifted-0.144
 turns-0.140
 air-0.139
Token!"
Token ID2474
Feature activation+2.07e-06

Pos logit contributions
chard+0.033
ache+0.032
 sir+0.032
 salon+0.030
ber+0.030
Neg logit contributions
acks-0.018
 drifted-0.017
 care-0.016
acked-0.015
 They-0.015
Token she
Token ID673
Feature activation+1.04e-02

Feature not active on prev token;
no predictions were affected.
Token said
Token ID531
Feature activation+2.35e-02

Pos logit contributions
chard+0.021
 sir+0.021
ache+0.020
ber+0.019
 why+0.019
Neg logit contributions
acks-0.015
 drifted-0.014
acked-0.013
irmed-0.013
 care-0.013
Token.
Token ID13
Feature activation+1.76e-02

Pos logit contributions
chard+0.056
 sir+0.054
ache+0.054
 salon+0.054
ber+0.052
Neg logit contributions
 drifted-0.023
acks-0.023
 care-0.021
 They-0.020
hed-0.018
Token of
Token ID286
Feature activation+1.58e-03

Pos logit contributions
chard+0.082
ache+0.078
 sir+0.076
itor+0.073
ber+0.073
Neg logit contributions
acks-0.056
 drifted-0.054
 They-0.050
 care-0.050
acked-0.049
Token her
Token ID607
Feature activation+2.63e-02

Pos logit contributions
chard+0.004
ache+0.004
 sir+0.004
ber+0.003
 salon+0.003
Neg logit contributions
acks-0.002
 drifted-0.002
 care-0.002
acked-0.001
irmed-0.001
Token.
Token ID13
Feature activation+1.25e-02

Pos logit contributions
chard+0.055
ache+0.054
 sir+0.053
itor+0.049
ber+0.048
Neg logit contributions
acks-0.033
 drifted-0.033
 hands-0.032
 care-0.031
 turns-0.030
Token "
Token ID366
Feature activation+4.73e-02

Pos logit contributions
ache+0.029
chard+0.029
 sir+0.028
ber+0.027
 salon+0.027
Neg logit contributions
acks-0.014
 drifted-0.013
 They-0.012
acked-0.012
 Soon-0.011
TokenHello
Token ID15496
Feature activation+5.37e-02

Pos logit contributions
chard+0.086
 sir+0.086
ache+0.085
ber+0.079
 why+0.078
Neg logit contributions
acks-0.075
 drifted-0.073
acked-0.068
 care-0.067
irmed-0.066
Token,
Token ID11
Feature activation+1.30e-01

Pos logit contributions
chard+0.129
ache+0.124
belt+0.120
 salon+0.114
itor+0.112
Neg logit contributions
 turns-0.053
acks-0.046
 care-0.046
 night-0.045
 They-0.045
Token Lily
Token ID20037
Feature activation+4.32e-03

Pos logit contributions
chard+0.275
ache+0.273
belt+0.250
ber+0.246
itor+0.238
Neg logit contributions
 They-0.156
 turns-0.149
 care-0.149
 air-0.141
 Everyone-0.137
Token!
Token ID0
Feature activation+4.50e-02

Pos logit contributions
chard+0.010
 sir+0.010
ache+0.010
 salon+0.009
ber+0.009
Neg logit contributions
acks-0.005
 drifted-0.004
 They-0.004
 care-0.004
acked-0.004
Token I
Token ID314
Feature activation+6.69e-03

Pos logit contributions
ache+0.087
chard+0.085
 sir+0.077
ber+0.076
 salon+0.075
Neg logit contributions
acks-0.066
 They-0.066
 drifted-0.063
|-0.061
 Soon-0.061
Token am
Token ID716
Feature activation+3.26e-02

Pos logit contributions
chard+0.013
 sir+0.013
ache+0.013
ber+0.012
 salon+0.012
Neg logit contributions
acks-0.010
 drifted-0.009
acked-0.008
irmed-0.008
 care-0.008
Token the
Token ID262
Feature activation+6.27e-04

Pos logit contributions
ache+0.061
 sir+0.059
chard+0.058
ber+0.056
 salon+0.055
Neg logit contributions
acks-0.048
 drifted-0.046
 care-0.045
 They-0.043
 Soon-0.042
Token said
Token ID531
Feature activation+4.63e-02

Pos logit contributions
chard+0.031
 sir+0.031
ache+0.031
ber+0.029
 salon+0.029
Neg logit contributions
acks-0.027
 drifted-0.026
acked-0.024
 care-0.024
irmed-0.024
Token "
Token ID366
Feature activation+7.43e-02

Pos logit contributions
chard+0.094
ache+0.091
 salon+0.088
 sir+0.088
itor+0.084
Neg logit contributions
 drifted-0.062
acks-0.061
 They-0.057
 care-0.057
acked-0.052
TokenDon
Token ID3987
Feature activation+5.93e-02

Pos logit contributions
chard+0.131
 sir+0.130
ache+0.129
ber+0.120
 why+0.119
Neg logit contributions
acks-0.123
 drifted-0.118
acked-0.110
 care-0.109
irmed-0.108
Token't
Token ID470
Feature activation+4.59e-02

Pos logit contributions
chard+0.114
ache+0.104
 sir+0.100
ber+0.100
 salon+0.097
Neg logit contributions
acks-0.095
 drifted-0.091
 They-0.087
 care-0.085
acked-0.084
Token worry
Token ID5490
Feature activation+5.21e-02

Pos logit contributions
 sir+0.067
chard+0.065
ache+0.065
 why+0.060
 salon+0.058
Neg logit contributions
acks-0.092
 drifted-0.089
acked-0.085
irmed-0.084
igg-0.080
Token,
Token ID11
Feature activation+1.29e-01

Pos logit contributions
chard+0.117
 salon+0.105
ber+0.104
ache+0.104
 principal+0.100
Neg logit contributions
acks-0.065
 They-0.063
 care-0.059
 drifted-0.058
 hands-0.053
Token I
Token ID314
Feature activation+2.12e-02

Pos logit contributions
ache+0.264
chard+0.262
ber+0.256
 salon+0.249
itor+0.245
Neg logit contributions
 They-0.167
 care-0.158
 drifted-0.155
 turns-0.148
 air-0.142
Token won
Token ID1839
Feature activation+1.98e-02

Pos logit contributions
chard+0.041
 sir+0.040
ache+0.039
 salon+0.037
ber+0.037
Neg logit contributions
acks-0.032
 drifted-0.031
 care-0.030
acked-0.029
irmed-0.029
Token't
Token ID470
Feature activation+2.21e-02

Pos logit contributions
chard+0.031
 sir+0.031
ache+0.030
ber+0.028
 salon+0.028
Neg logit contributions
acks-0.037
 drifted-0.036
 care-0.034
acked-0.033
irmed-0.033
Token hurt
Token ID5938
Feature activation+1.31e-02

Pos logit contributions
chard+0.044
 sir+0.042
ache+0.041
ber+0.041
 salon+0.040
Neg logit contributions
acks-0.033
 drifted-0.031
 care-0.031
irmed-0.028
acked-0.028
Token you
Token ID345
Feature activation-3.49e-03

Pos logit contributions
chard+0.029
ache+0.028
 sir+0.028
ber+0.027
 salon+0.027
Neg logit contributions
acks-0.016
 care-0.015
 drifted-0.014
 They-0.013
 turns-0.013
Token snow
Token ID6729
Feature activation+2.92e-02

Pos logit contributions
acks+0.007
 drifted+0.006
 care+0.006
 air+0.006
 turns+0.006
Neg logit contributions
chard-0.008
ache-0.008
ber-0.008
 sir-0.008
itor-0.008
Tokenman
Token ID805
Feature activation+2.00e-02

Pos logit contributions
chard+0.048
 sir+0.048
ache+0.045
 why+0.044
 salon+0.043
Neg logit contributions
 drifted-0.051
acks-0.050
acked-0.048
hed-0.046
 They-0.045
Token.
Token ID13
Feature activation+7.15e-03

Pos logit contributions
 sir+0.039
chard+0.038
ache+0.036
itor+0.036
ber+0.035
Neg logit contributions
 drifted-0.029
acks-0.028
acked-0.026
hed-0.026
 care-0.025
Token "
Token ID366
Feature activation+4.77e-02

Pos logit contributions
chard+0.017
 sir+0.017
ache+0.017
ber+0.016
 why+0.016
Neg logit contributions
acks-0.007
 drifted-0.007
acked-0.006
irmed-0.006
 care-0.006
TokenHello
Token ID15496
Feature activation+5.26e-02

Pos logit contributions
chard+0.076
 sir+0.075
ache+0.074
ber+0.069
 why+0.068
Neg logit contributions
acks-0.089
 drifted-0.085
irmed-0.079
acked-0.078
 care-0.077
Token,
Token ID11
Feature activation+1.29e-01

Pos logit contributions
chard+0.127
 salon+0.123
belt+0.120
ache+0.118
itor+0.113
Neg logit contributions
 turns-0.053
 night-0.050
 They-0.049
 air-0.046
acks-0.045
Token Mr
Token ID1770
Feature activation+3.89e-03

Pos logit contributions
chard+0.296
belt+0.287
ache+0.285
 salon+0.282
ber+0.274
Neg logit contributions
 They-0.137
 night-0.123
 air-0.123
 turns-0.122
 drifted-0.118
Token.
Token ID13
Feature activation+4.63e-02

Pos logit contributions
 sir+0.008
ache+0.007
chard+0.007
ber+0.007
 why+0.007
Neg logit contributions
acks-0.006
 drifted-0.006
acked-0.005
irmed-0.005
 care-0.005
Token Snow
Token ID7967
Feature activation+1.20e-02

Pos logit contributions
chard+0.107
ache+0.101
 salon+0.098
 why+0.095
 sir+0.093
Neg logit contributions
 drifted-0.052
 They-0.048
acks-0.047
 Soon-0.042
 night-0.042
Tokenman
Token ID805
Feature activation+2.38e-02

Pos logit contributions
chard+0.016
 sir+0.016
ache+0.015
 salon+0.014
 why+0.014
Neg logit contributions
acks-0.025
 drifted-0.024
acked-0.023
irmed-0.022
 care-0.022
Token!"
Token ID2474
Feature activation-6.71e-03

Pos logit contributions
chard+0.047
 sir+0.046
ache+0.045
 salon+0.043
ber+0.042
Neg logit contributions
acks-0.035
 drifted-0.033
acked-0.031
 care-0.031
irmed-0.029
Token.
Token ID13
Feature activation+1.46e-02

Pos logit contributions
acks+0.030
 drifted+0.027
irmed+0.026
acked+0.026
 care+0.025
Neg logit contributions
ache-0.035
chard-0.035
 sir-0.035
ber-0.034
 why-0.033
Token "
Token ID366
Feature activation+4.59e-02

Pos logit contributions
ache+0.035
chard+0.035
 sir+0.034
ber+0.033
 salon+0.032
Neg logit contributions
acks-0.015
 drifted-0.014
acked-0.012
 They-0.012
irmed-0.012
TokenDon
Token ID3987
Feature activation+4.50e-02

Pos logit contributions
chard+0.078
 sir+0.077
ache+0.076
ber+0.071
 salon+0.070
Neg logit contributions
acks-0.080
 drifted-0.077
acked-0.071
irmed-0.071
 care-0.070
Token't
Token ID470
Feature activation+4.40e-02

Pos logit contributions
chard+0.083
ache+0.077
 sir+0.074
ber+0.073
 salon+0.072
Neg logit contributions
acks-0.075
 drifted-0.073
 care-0.068
 They-0.068
acked-0.067
Token worry
Token ID5490
Feature activation+4.78e-02

Pos logit contributions
chard+0.065
 sir+0.064
ache+0.063
ber+0.058
 salon+0.058
Neg logit contributions
acks-0.086
 drifted-0.084
acked-0.078
 care-0.078
irmed-0.078
Token,
Token ID11
Feature activation+1.29e-01

Pos logit contributions
chard+0.108
 salon+0.096
ber+0.096
ache+0.095
itor+0.092
Neg logit contributions
acks-0.060
 They-0.057
 care-0.056
 drifted-0.054
 turns-0.050
Token Tim
Token ID5045
Feature activation+2.71e-02

Pos logit contributions
chard+0.264
ache+0.262
ber+0.254
 salon+0.253
itor+0.246
Neg logit contributions
 They-0.159
 care-0.157
 drifted-0.154
 turns-0.146
 hands-0.137
Token,"
Token ID553
Feature activation-2.91e-03

Pos logit contributions
chard+0.057
 salon+0.052
 sir+0.052
ber+0.051
ache+0.050
Neg logit contributions
acks-0.037
 drifted-0.035
 care-0.034
acked-0.033
 They-0.032
Token she
Token ID673
Feature activation+2.13e-02

Pos logit contributions
acks+0.005
 drifted+0.005
acked+0.004
irmed+0.004
 care+0.004
Neg logit contributions
chard-0.005
 sir-0.005
ache-0.005
 why-0.005
ber-0.005
Token said
Token ID531
Feature activation+2.70e-02

Pos logit contributions
chard+0.043
 sir+0.043
ache+0.042
ber+0.040
 salon+0.039
Neg logit contributions
acks-0.030
 drifted-0.029
acked-0.026
irmed-0.026
 care-0.026
Token.
Token ID13
Feature activation+2.49e-02

Pos logit contributions
chard+0.064
 sir+0.063
 salon+0.062
ache+0.062
ber+0.060
Neg logit contributions
 drifted-0.027
acks-0.026
 care-0.024
 They-0.022
irmed-0.022
Token She
Token ID1375
Feature activation-5.06e-03

Pos logit contributions
ache+0.035
chard+0.035
 sir+0.034
ber+0.033
 salon+0.032
Neg logit contributions
acks-0.015
 drifted-0.014
acked-0.012
 They-0.012
irmed-0.012
Token said
Token ID531
Feature activation+2.83e-02

Pos logit contributions
acks+0.008
 drifted+0.007
acked+0.007
irmed+0.007
 care+0.007
Neg logit contributions
 sir-0.010
chard-0.009
ache-0.009
 why-0.009
ber-0.009
Token,
Token ID11
Feature activation+9.88e-02

Pos logit contributions
chard+0.060
itor+0.055
ache+0.055
 salon+0.054
 sir+0.053
Neg logit contributions
 drifted-0.036
 care-0.035
 They-0.034
 turns-0.032
acks-0.032
Token "
Token ID366
Feature activation+6.48e-02

Pos logit contributions
chard+0.175
ache+0.175
 salon+0.163
itor+0.162
 sir+0.159
Neg logit contributions
 They-0.162
 drifted-0.150
acks-0.142
 Soon-0.141
 drift-0.130
TokenHello
Token ID15496
Feature activation+5.28e-02

Pos logit contributions
chard+0.115
 sir+0.113
ache+0.111
ber+0.106
 salon+0.104
Neg logit contributions
acks-0.108
 drifted-0.104
irmed-0.096
acked-0.095
 care-0.093
Token,
Token ID11
Feature activation+1.29e-01

Pos logit contributions
chard+0.129
ache+0.124
 salon+0.115
belt+0.113
itor+0.111
Neg logit contributions
acks-0.053
 They-0.051
 turns-0.050
 care-0.046
 night-0.044
Token do
Token ID466
Feature activation+5.12e-02

Pos logit contributions
chard+0.273
ache+0.259
belt+0.242
ber+0.234
itor+0.233
Neg logit contributions
 They-0.161
 care-0.148
 turns-0.146
acks-0.145
 air-0.142
Tokenggy
Token ID19970
Feature activation+7.75e-03

Pos logit contributions
chard+0.094
 sir+0.092
ache+0.092
ber+0.085
 salon+0.085
Neg logit contributions
acks-0.082
 drifted-0.078
 care-0.073
acked-0.073
irmed-0.071
Token.
Token ID13
Feature activation+3.24e-02

Pos logit contributions
chard+0.018
ache+0.017
 sir+0.017
 salon+0.016
itor+0.015
Neg logit contributions
acks-0.010
 drifted-0.009
 care-0.008
acked-0.008
 They-0.008
Token I
Token ID314
Feature activation+3.95e-03

Pos logit contributions
ache+0.055
chard+0.054
 sir+0.051
 salon+0.049
ber+0.048
Neg logit contributions
acks-0.056
 drifted-0.053
 They-0.053
 care-0.051
acked-0.050
Token want
Token ID765
Feature activation+3.81e-02

Pos logit contributions
ache+0.007
 sir+0.007
chard+0.007
 why+0.007
ber+0.006
Neg logit contributions
acks-0.006
 drifted-0.006
acked-0.006
irmed-0.006
 Soon-0.005
Token the
Token ID262
Feature activation+3.63e-02

Pos logit contributions
chard+0.117
 sir+0.116
ache+0.114
ber+0.108
 why+0.107
Neg logit contributions
acks-0.084
 drifted-0.081
acked-0.074
irmed-0.073
 care-0.072
Token dog
Token ID3290
Feature activation+3.43e-02

Pos logit contributions
chard+0.064
ache+0.062
ber+0.058
itor+0.057
 sir+0.057
Neg logit contributions
 drifted-0.053
 care-0.051
 air-0.051
acks-0.050
 hearts-0.050
Token,
Token ID11
Feature activation+9.31e-02

Pos logit contributions
chard+0.070
ache+0.066
 sir+0.065
 salon+0.063
itor+0.061
Neg logit contributions
acks-0.046
 drifted-0.045
 They-0.044
 care-0.043
irmed-0.040
Token "
Token ID366
Feature activation+6.27e-02

Pos logit contributions
chard+0.152
ache+0.149
 salon+0.141
itor+0.137
ber+0.136
Neg logit contributions
 They-0.165
 drifted-0.158
acks-0.153
 care-0.147
 Soon-0.146
TokenHi
Token ID17250
Feature activation+5.26e-02

Pos logit contributions
chard+0.101
 sir+0.099
ache+0.097
ber+0.092
 salon+0.090
Neg logit contributions
acks-0.116
 drifted-0.111
irmed-0.102
acked-0.101
 care-0.101
Token,
Token ID11
Feature activation+1.28e-01

Pos logit contributions
chard+0.117
ache+0.111
 salon+0.106
itor+0.101
 sir+0.101
Neg logit contributions
acks-0.067
 They-0.060
 turns-0.060
 drifted-0.059
 care-0.058
Token I
Token ID314
Feature activation+1.01e-02

Pos logit contributions
chard+0.275
ache+0.259
itor+0.244
ber+0.239
 salon+0.238
Neg logit contributions
 They-0.160
acks-0.150
 care-0.147
 turns-0.145
 drifted-0.142
Token'm
Token ID1101
Feature activation+3.90e-02

Pos logit contributions
chard+0.017
 sir+0.017
ache+0.017
 why+0.016
ber+0.016
Neg logit contributions
acks-0.017
 drifted-0.016
irmed-0.015
acked-0.015
 care-0.014
Token Tom
Token ID4186
Feature activation-1.03e-02

Pos logit contributions
chard+0.069
 sir+0.068
ache+0.067
ber+0.063
 why+0.062
Neg logit contributions
acks-0.065
 drifted-0.063
acked-0.058
irmed-0.058
 care-0.057
Token.
Token ID13
Feature activation+2.78e-02

Pos logit contributions
acks+0.012
 drifted+0.011
acked+0.010
 care+0.010
 They+0.010
Neg logit contributions
chard-0.023
 sir-0.023
ache-0.022
 salon-0.022
ber-0.021
Token Are
Token ID4231
Feature activation+1.89e-02

Pos logit contributions
ache+0.046
chard+0.046
 sir+0.044
 salon+0.042
ber+0.041
Neg logit contributions
acks-0.049
 drifted-0.047
 They-0.045
acked-0.044
 care-0.044
Token the
Token ID262
Feature activation+1.07e-02

Pos logit contributions
chard+0.083
 sir+0.083
ache+0.082
 why+0.078
ber+0.077
Neg logit contributions
acks-0.051
 drifted-0.049
acked-0.045
irmed-0.044
 care-0.042
Token rat
Token ID4227
Feature activation+4.17e-02

Pos logit contributions
chard+0.021
 sir+0.020
ache+0.020
ber+0.020
itor+0.019
Neg logit contributions
 drifted-0.014
acks-0.014
 care-0.013
 air-0.013
 hearts-0.013
Token.
Token ID13
Feature activation-7.37e-03

Pos logit contributions
chard+0.087
 sir+0.085
ache+0.083
itor+0.079
ber+0.079
Neg logit contributions
acks-0.056
 drifted-0.053
 care-0.049
acked-0.047
 They-0.046
Token "
Token ID366
Feature activation+3.93e-02

Pos logit contributions
acks+0.008
 drifted+0.008
acked+0.007
irmed+0.007
 care+0.007
Neg logit contributions
chard-0.017
 sir-0.017
ache-0.017
ber-0.016
 salon-0.016
TokenHello
Token ID15496
Feature activation+4.51e-02

Pos logit contributions
ache+0.059
chard+0.059
 sir+0.058
ber+0.053
 why+0.052
Neg logit contributions
acks-0.077
 drifted-0.072
acked-0.068
irmed-0.068
igg-0.066
Token,
Token ID11
Feature activation+1.28e-01

Pos logit contributions
chard+0.117
belt+0.112
ache+0.108
 salon+0.108
itor+0.106
Neg logit contributions
 turns-0.040
acks-0.036
 They-0.036
 care-0.035
 hops-0.033
Token rat
Token ID4227
Feature activation+3.65e-02

Pos logit contributions
chard+0.274
belt+0.247
but+0.243
ache+0.242
itor+0.239
Neg logit contributions
 They-0.151
 turns-0.143
 air-0.142
 lot-0.140
 care-0.139
Token.
Token ID13
Feature activation+2.46e-02

Pos logit contributions
chard+0.085
 sir+0.079
 salon+0.078
ache+0.076
itor+0.075
Neg logit contributions
acks-0.040
 drifted-0.036
 care-0.035
acked-0.032
 They-0.032
Token What
Token ID1867
Feature activation+4.49e-02

Pos logit contributions
chard+0.041
ache+0.041
 sir+0.039
 salon+0.038
ber+0.037
Neg logit contributions
acks-0.043
 drifted-0.041
 They-0.039
 care-0.039
acked-0.038
Token is
Token ID318
Feature activation+4.09e-02

Pos logit contributions
chard+0.116
ache+0.112
 sir+0.112
itor+0.108
 salon+0.108
Neg logit contributions
acks-0.039
 drifted-0.036
 care-0.034
 turns-0.031
 They-0.030
Token your
Token ID534
Feature activation+2.70e-02

Pos logit contributions
 sir+0.090
ache+0.090
chard+0.088
 salon+0.085
ber+0.084
Neg logit contributions
acks-0.047
 drifted-0.046
 care-0.042
acked-0.040
 They-0.040
Token and
Token ID290
Feature activation-7.46e-03

Pos logit contributions
acks+0.125
acked+0.112
 drifted+0.106
irmed+0.105
 Soon+0.101
Neg logit contributions
 sir-0.130
chard-0.127
ache-0.125
 why-0.121
ber-0.116
Token said
Token ID531
Feature activation+3.02e-02

Pos logit contributions
acks+0.010
 drifted+0.010
 care+0.009
irmed+0.009
acked+0.008
Neg logit contributions
ache-0.015
 sir-0.015
chard-0.015
ber-0.014
 salon-0.014
Token,
Token ID11
Feature activation+1.03e-01

Pos logit contributions
chard+0.061
ache+0.059
 salon+0.058
 sir+0.057
itor+0.056
Neg logit contributions
 drifted-0.040
acks-0.038
 care-0.038
 They-0.037
 Soon-0.035
Token "
Token ID366
Feature activation+6.28e-02

Pos logit contributions
chard+0.165
ache+0.162
 salon+0.152
 sir+0.149
ber+0.148
Neg logit contributions
 They-0.184
 drifted-0.180
acks-0.175
 Soon-0.167
 care-0.159
TokenHello
Token ID15496
Feature activation+5.14e-02

Pos logit contributions
chard+0.111
 sir+0.109
ache+0.107
ber+0.102
 salon+0.100
Neg logit contributions
acks-0.106
 drifted-0.101
irmed-0.093
acked-0.092
 care-0.091
Token,
Token ID11
Feature activation+1.28e-01

Pos logit contributions
chard+0.121
ache+0.118
 salon+0.111
belt+0.109
itor+0.105
Neg logit contributions
acks-0.054
 They-0.051
 turns-0.050
 care-0.048
 night-0.046
Token sir
Token ID15967
Feature activation+3.49e-02

Pos logit contributions
chard+0.269
ache+0.268
belt+0.247
ber+0.240
itor+0.237
Neg logit contributions
 They-0.159
 care-0.145
 turns-0.141
acks-0.140
 drifted-0.135
Token.
Token ID13
Feature activation+3.25e-02

Pos logit contributions
chard+0.088
ache+0.082
 salon+0.081
itor+0.079
 sir+0.078
Neg logit contributions
acks-0.031
 They-0.029
 care-0.028
acked-0.024
 drifted-0.023
Token We
Token ID775
Feature activation-1.91e-02

Pos logit contributions
ache+0.054
chard+0.053
 sir+0.050
 salon+0.048
ber+0.047
Neg logit contributions
acks-0.057
 drifted-0.055
 They-0.054
 care-0.052
acked-0.052
Token are
Token ID389
Feature activation+1.29e-02

Pos logit contributions
acks+0.030
 drifted+0.028
acked+0.026
irmed+0.026
 Soon+0.025
Neg logit contributions
 sir-0.036
chard-0.035
ache-0.035
 why-0.035
ber-0.032
Token trying
Token ID2111
Feature activation+3.00e-02

Pos logit contributions
ache+0.024
chard+0.024
 sir+0.024
 salon+0.022
ber+0.022
Neg logit contributions
acks-0.019
 drifted-0.019
 care-0.017
acked-0.017
 They-0.017
Token and
Token ID290
Feature activation-7.46e-03

Pos logit contributions
 drifted+0.004
acks+0.003
 care+0.003
irmed+0.003
 Soon+0.003
Neg logit contributions
 sir-0.006
ache-0.006
chard-0.006
itor-0.006
 salon-0.006
Token said
Token ID531
Feature activation+3.02e-02

Pos logit contributions
acks+0.010
 drifted+0.010
 care+0.009
irmed+0.009
acked+0.008
Neg logit contributions
ache-0.015
 sir-0.015
chard-0.015
ber-0.014
 salon-0.014
Token,
Token ID11
Feature activation+1.03e-01

Pos logit contributions
chard+0.061
ache+0.059
 salon+0.058
 sir+0.057
itor+0.056
Neg logit contributions
 drifted-0.040
acks-0.038
 care-0.038
 They-0.037
 Soon-0.035
Token "
Token ID366
Feature activation+6.28e-02

Pos logit contributions
chard+0.165
ache+0.162
 salon+0.152
 sir+0.149
ber+0.148
Neg logit contributions
 They-0.184
 drifted-0.180
acks-0.175
 Soon-0.167
 care-0.159
TokenHello
Token ID15496
Feature activation+5.14e-02

Pos logit contributions
chard+0.111
 sir+0.109
ache+0.107
ber+0.102
 salon+0.100
Neg logit contributions
acks-0.106
 drifted-0.101
irmed-0.093
acked-0.092
 care-0.091
Token,
Token ID11
Feature activation+1.28e-01

Pos logit contributions
chard+0.121
ache+0.118
 salon+0.111
belt+0.109
itor+0.105
Neg logit contributions
acks-0.054
 They-0.051
 turns-0.050
 care-0.048
 night-0.046
Token Rose
Token ID8049
Feature activation+9.72e-03

Pos logit contributions
chard+0.269
ache+0.268
belt+0.247
ber+0.240
itor+0.237
Neg logit contributions
 They-0.159
 care-0.145
 turns-0.141
acks-0.140
 drifted-0.135
Token.
Token ID13
Feature activation+3.31e-02

Pos logit contributions
chard+0.024
 sir+0.022
ache+0.022
 salon+0.021
itor+0.021
Neg logit contributions
acks-0.010
 drifted-0.009
 They-0.008
acked-0.008
 care-0.008
Token I
Token ID314
Feature activation+5.47e-03

Pos logit contributions
ache+0.056
chard+0.054
 sir+0.052
 salon+0.049
ber+0.048
Neg logit contributions
acks-0.058
 drifted-0.056
 They-0.055
 care-0.053
acked-0.052
Token am
Token ID716
Feature activation+2.92e-02

Pos logit contributions
ache+0.009
 sir+0.009
chard+0.009
 why+0.009
ber+0.008
Neg logit contributions
acks-0.009
 drifted-0.009
acked-0.008
irmed-0.008
 Soon-0.008
Token Tom
Token ID4186
Feature activation-1.70e-02

Pos logit contributions
ache+0.056
 sir+0.055
chard+0.055
 salon+0.052
ber+0.052
Neg logit contributions
acks-0.042
 drifted-0.040
 care-0.038
 They-0.037
acked-0.036
Token He
Token ID679
Feature activation-8.65e-03

Pos logit contributions
ache+0.035
chard+0.035
 sir+0.034
ber+0.033
 salon+0.032
Neg logit contributions
acks-0.015
 drifted-0.014
acked-0.012
 They-0.012
irmed-0.012
Token said
Token ID531
Feature activation+2.51e-02

Pos logit contributions
acks+0.013
 drifted+0.012
acked+0.012
irmed+0.011
 care+0.011
Neg logit contributions
 sir-0.016
ache-0.016
chard-0.016
 why-0.016
ber-0.015
Token,
Token ID11
Feature activation+9.67e-02

Pos logit contributions
chard+0.054
 salon+0.050
ache+0.049
itor+0.049
 sir+0.047
Neg logit contributions
 drifted-0.031
 care-0.030
 They-0.030
acks-0.029
 turns-0.028
Token "
Token ID366
Feature activation+6.23e-02

Pos logit contributions
chard+0.165
ache+0.164
 salon+0.157
itor+0.152
 sir+0.150
Neg logit contributions
 They-0.164
 drifted-0.156
acks-0.149
 Soon-0.145
 drift-0.136
TokenHello
Token ID15496
Feature activation+5.11e-02

Pos logit contributions
chard+0.106
 sir+0.105
ache+0.102
ber+0.099
 salon+0.095
Neg logit contributions
acks-0.108
 drifted-0.103
irmed-0.095
igg-0.094
acked-0.093
Token,
Token ID11
Feature activation+1.27e-01

Pos logit contributions
chard+0.123
ache+0.118
 salon+0.114
belt+0.107
itor+0.106
Neg logit contributions
acks-0.053
 They-0.050
 turns-0.049
 care-0.046
 drifted-0.044
Token little
Token ID1310
Feature activation+1.80e-02

Pos logit contributions
chard+0.261
ache+0.250
belt+0.229
 salon+0.227
ber+0.224
Neg logit contributions
 They-0.166
 care-0.157
acks-0.153
 turns-0.152
 drifted-0.149
Token girl
Token ID2576
Feature activation+1.70e-02

Pos logit contributions
chard+0.033
ache+0.032
 sir+0.031
 salon+0.030
ber+0.030
Neg logit contributions
acks-0.028
 drifted-0.027
 care-0.026
acked-0.025
irmed-0.025
Token.
Token ID13
Feature activation+2.87e-02

Pos logit contributions
chard+0.040
ache+0.038
 sir+0.038
 salon+0.037
itor+0.036
Neg logit contributions
acks-0.018
 drifted-0.017
 care-0.016
 They-0.015
acked-0.015
Token Do
Token ID2141
Feature activation+4.67e-02

Pos logit contributions
ache+0.048
chard+0.046
 sir+0.045
 salon+0.043
ber+0.041
Neg logit contributions
acks-0.050
 drifted-0.048
 They-0.047
 care-0.046
acked-0.046
Token you
Token ID345
Feature activation+1.77e-02

Pos logit contributions
 sir+0.083
chard+0.080
ache+0.078
 why+0.076
ber+0.074
Neg logit contributions
acks-0.081
 drifted-0.077
acked-0.074
irmed-0.073
hed-0.069
Token He
Token ID679
Feature activation-8.65e-03

Pos logit contributions
ache+0.035
chard+0.035
 sir+0.034
ber+0.033
 salon+0.032
Neg logit contributions
acks-0.015
 drifted-0.014
acked-0.012
 They-0.012
irmed-0.012
Token said
Token ID531
Feature activation+2.51e-02

Pos logit contributions
acks+0.013
 drifted+0.012
acked+0.012
irmed+0.011
 care+0.011
Neg logit contributions
 sir-0.016
ache-0.016
chard-0.016
 why-0.016
ber-0.015
Token,
Token ID11
Feature activation+9.67e-02

Pos logit contributions
chard+0.054
 salon+0.050
ache+0.049
itor+0.049
 sir+0.047
Neg logit contributions
 drifted-0.031
 care-0.030
 They-0.030
acks-0.029
 turns-0.028
Token "
Token ID366
Feature activation+6.23e-02

Pos logit contributions
chard+0.165
ache+0.164
 salon+0.157
itor+0.152
 sir+0.150
Neg logit contributions
 They-0.164
 drifted-0.156
acks-0.149
 Soon-0.145
 drift-0.136
TokenHello
Token ID15496
Feature activation+5.11e-02

Pos logit contributions
chard+0.106
 sir+0.105
ache+0.102
ber+0.099
 salon+0.095
Neg logit contributions
acks-0.108
 drifted-0.103
irmed-0.095
igg-0.094
acked-0.093
Token,
Token ID11
Feature activation+1.27e-01

Pos logit contributions
chard+0.123
ache+0.118
 salon+0.114
belt+0.107
itor+0.106
Neg logit contributions
acks-0.053
 They-0.050
 turns-0.049
 care-0.046
 drifted-0.044
Token Anna
Token ID11735
Feature activation-2.36e-02

Pos logit contributions
chard+0.261
ache+0.250
belt+0.229
 salon+0.227
ber+0.224
Neg logit contributions
 They-0.166
 care-0.157
acks-0.153
 turns-0.152
 drifted-0.149
Token.
Token ID13
Feature activation+2.50e-02

Pos logit contributions
acks+0.029
 drifted+0.028
acked+0.025
 care+0.025
irmed+0.024
Neg logit contributions
chard-0.052
 sir-0.051
ache-0.050
 salon-0.048
ber-0.047
Token I
Token ID314
Feature activation+4.63e-03

Pos logit contributions
ache+0.042
chard+0.041
 sir+0.039
 salon+0.038
ber+0.036
Neg logit contributions
acks-0.043
 drifted-0.042
 They-0.040
 care-0.040
acked-0.039
Token want
Token ID765
Feature activation+3.92e-02

Pos logit contributions
ache+0.008
 sir+0.008
chard+0.008
 why+0.008
ber+0.007
Neg logit contributions
acks-0.008
 drifted-0.007
acked-0.007
irmed-0.007
 Soon-0.006
Token to
Token ID284
Feature activation+5.62e-02

Pos logit contributions
chard+0.058
ache+0.058
 sir+0.057
ber+0.053
 salon+0.053
Neg logit contributions
acks-0.074
 drifted-0.073
 care-0.071
acked-0.066
 They-0.066
Token the
Token ID262
Feature activation+7.13e-03

Pos logit contributions
chard+0.084
ache+0.084
 sir+0.083
ber+0.080
 salon+0.079
Neg logit contributions
acks-0.037
 drifted-0.035
 care-0.031
acked-0.030
irmed-0.030
Token girl
Token ID2576
Feature activation+2.00e-02

Pos logit contributions
ache+0.016
 sir+0.015
chard+0.015
ber+0.015
 salon+0.015
Neg logit contributions
acks-0.008
 drifted-0.008
 care-0.007
 air-0.007
 They-0.007
Token,
Token ID11
Feature activation+8.65e-02

Pos logit contributions
ache+0.043
chard+0.043
 sir+0.042
 salon+0.040
itor+0.040
Neg logit contributions
acks-0.024
 drifted-0.023
 They-0.023
 care-0.021
acked-0.020
Token "
Token ID366
Feature activation+5.99e-02

Pos logit contributions
ache+0.129
chard+0.128
 sir+0.127
ber+0.119
 salon+0.119
Neg logit contributions
 drifted-0.161
acks-0.160
 They-0.149
 care-0.147
acked-0.146
TokenHello
Token ID15496
Feature activation+5.08e-02

Pos logit contributions
 sir+0.106
chard+0.106
ache+0.104
ber+0.098
 salon+0.097
Neg logit contributions
acks-0.098
 drifted-0.096
acked-0.089
 care-0.087
irmed-0.087
Token,
Token ID11
Feature activation+1.27e-01

Pos logit contributions
belt+0.125
chard+0.125
 salon+0.120
ache+0.119
cher+0.113
Neg logit contributions
 turns-0.048
acks-0.047
 They-0.045
 air-0.044
 night-0.041
Token little
Token ID1310
Feature activation+1.58e-02

Pos logit contributions
belt+0.293
chard+0.285
ache+0.285
ber+0.266
 salon+0.263
Neg logit contributions
 They-0.144
 turns-0.132
 air-0.131
 lot-0.131
 care-0.121
Token girl
Token ID2576
Feature activation+2.06e-02

Pos logit contributions
 sir+0.025
chard+0.025
ache+0.024
ber+0.023
 why+0.023
Neg logit contributions
acks-0.029
 drifted-0.028
acked-0.026
irmed-0.026
 care-0.025
Token.
Token ID13
Feature activation+3.13e-02

Pos logit contributions
chard+0.050
ache+0.047
 salon+0.047
 sir+0.047
itor+0.045
Neg logit contributions
acks-0.020
 care-0.018
 drifted-0.018
 They-0.017
acked-0.017
Token My
Token ID2011
Feature activation+6.02e-02

Pos logit contributions
ache+0.057
chard+0.052
 sir+0.051
 salon+0.051
ber+0.049
Neg logit contributions
 They-0.052
acks-0.050
 Soon-0.048
 drifted-0.047
 care-0.046
Token name
Token ID1438
Feature activation+3.86e-02

Pos logit contributions
chard+0.126
ache+0.122
 sir+0.119
 salon+0.114
 why+0.111
Neg logit contributions
acks-0.081
 drifted-0.076
 care-0.072
irmed-0.070
acked-0.070
Token.
Token ID13
Feature activation+6.44e-03

Pos logit contributions
chard+0.071
 sir+0.065
ache+0.064
itor+0.064
ber+0.063
Neg logit contributions
acks-0.038
 drifted-0.037
 care-0.035
 They-0.032
acked-0.031
Token\n
Token ID198
Feature activation-1.49e-02

Pos logit contributions
chard+0.017
ache+0.017
 sir+0.017
ber+0.016
 salon+0.016
Neg logit contributions
acks-0.005
 drifted-0.004
acked-0.004
irmed-0.004
 care-0.004
Token\n
Token ID198
Feature activation-6.77e-03

Pos logit contributions
acks+0.019
 drifted+0.018
irmed+0.017
acked+0.016
 care+0.016
Neg logit contributions
 sir-0.032
chard-0.031
ache-0.031
ber-0.030
 why-0.029
Token"
Token ID1
Feature activation+7.22e-02

Pos logit contributions
acks+0.012
 drifted+0.012
irmed+0.011
acked+0.011
 care+0.011
Neg logit contributions
chard-0.011
 sir-0.011
ache-0.011
ber-0.010
 why-0.010
TokenHello
Token ID15496
Feature activation+4.52e-02

Pos logit contributions
chard+0.127
ache+0.126
 sir+0.125
 why+0.115
ber+0.115
Neg logit contributions
 drifted-0.117
acks-0.114
 care-0.107
acked-0.106
irmed-0.105
Token,
Token ID11
Feature activation+1.27e-01

Pos logit contributions
chard+0.110
ache+0.109
belt+0.103
 salon+0.101
but+0.096
Neg logit contributions
 turns-0.047
 They-0.041
acks-0.039
 care-0.038
 smiles-0.037
Token little
Token ID1310
Feature activation+1.11e-02

Pos logit contributions
chard+0.259
ache+0.257
belt+0.229
ber+0.226
 salon+0.225
Neg logit contributions
 They-0.167
 drifted-0.157
 care-0.155
 turns-0.150
 their-0.148
Token ones
Token ID3392
Feature activation+5.60e-02

Pos logit contributions
chard+0.020
ache+0.020
 sir+0.019
 salon+0.018
ber+0.018
Neg logit contributions
acks-0.017
 drifted-0.017
 care-0.016
irmed-0.015
acked-0.015
Token.
Token ID13
Feature activation+2.43e-02

Pos logit contributions
chard+0.120
ache+0.112
 sir+0.111
 salon+0.109
itor+0.108
Neg logit contributions
acks-0.071
 care-0.066
 drifted-0.066
 They-0.063
acked-0.061
Token Why
Token ID4162
Feature activation+5.41e-02

Pos logit contributions
ache+0.043
chard+0.041
 sir+0.039
 salon+0.038
ber+0.037
Neg logit contributions
acks-0.040
 They-0.040
 drifted-0.038
 care-0.037
 Soon-0.037
Token are
Token ID389
Feature activation+4.60e-02

Pos logit contributions
chard+0.145
ache+0.140
itor+0.134
 sir+0.133
ber+0.133
Neg logit contributions
 care-0.037
acks-0.036
 drifted-0.034
 turns-0.034
 They-0.032
Token.
Token ID13
Feature activation-2.23e-03

Pos logit contributions
 sir+0.060
chard+0.059
 salon+0.058
ache+0.058
itor+0.056
Neg logit contributions
 drifted-0.020
 care-0.020
acks-0.020
 They-0.016
irmed-0.015
Token\n
Token ID198
Feature activation-8.65e-03

Pos logit contributions
acks+0.003
 drifted+0.003
irmed+0.003
acked+0.003
 care+0.003
Neg logit contributions
 sir-0.004
chard-0.004
 why-0.004
ache-0.004
ber-0.004
Token\n
Token ID198
Feature activation-3.18e-04

Pos logit contributions
acks+0.010
 drifted+0.010
acked+0.009
irmed+0.009
 care+0.009
Neg logit contributions
chard-0.019
 sir-0.019
ache-0.019
ber-0.018
 why-0.018
Token"
Token ID1
Feature activation+6.85e-02

Pos logit contributions
acks+0.001
 drifted+0.001
irmed+0.001
acked+0.001
 care+0.000
Neg logit contributions
chard-0.001
 sir-0.001
ache-0.001
ber-0.000
 why-0.000
TokenHello
Token ID15496
Feature activation+4.85e-02

Pos logit contributions
 sir+0.117
ache+0.116
chard+0.114
 salon+0.109
ber+0.108
Neg logit contributions
 drifted-0.117
acks-0.112
acked-0.105
 care-0.105
irmed-0.102
Token,
Token ID11
Feature activation+1.27e-01

Pos logit contributions
belt+0.109
 salon+0.109
ache+0.108
chard+0.107
itor+0.102
Neg logit contributions
 turns-0.047
 night-0.046
 They-0.045
acks-0.044
 drifted-0.043
Token prince
Token ID19716
Feature activation+2.96e-02

Pos logit contributions
ache+0.268
belt+0.267
ber+0.260
chard+0.258
 salon+0.258
Neg logit contributions
 They-0.141
 drifted-0.133
 lot-0.126
 night-0.125
 care-0.123
Token.
Token ID13
Feature activation+3.12e-02

Pos logit contributions
chard+0.062
ache+0.059
 sir+0.059
 salon+0.059
itor+0.055
Neg logit contributions
acks-0.039
 drifted-0.036
 care-0.035
acked-0.033
 They-0.032
Token Why
Token ID4162
Feature activation+6.01e-02

Pos logit contributions
ache+0.057
 salon+0.051
chard+0.050
 sir+0.049
ber+0.047
Neg logit contributions
 They-0.052
acks-0.049
 drifted-0.047
 Soon-0.047
 care-0.047
Token are
Token ID389
Feature activation+4.98e-02

Pos logit contributions
chard+0.151
 sir+0.146
ache+0.146
ber+0.140
 salon+0.140
Neg logit contributions
acks-0.053
 drifted-0.052
 care-0.047
acked-0.042
irmed-0.041
Token you
Token ID345
Feature activation+1.29e-03

Pos logit contributions
chard+0.095
 sir+0.095
ache+0.094
ber+0.088
 salon+0.087
Neg logit contributions
acks-0.074
 drifted-0.071
acked-0.066
irmed-0.065
 care-0.065
Token and
Token ID290
Feature activation-2.08e-02

Pos logit contributions
acks+0.018
 drifted+0.017
acked+0.015
hed+0.015
 care+0.015
Neg logit contributions
 sir-0.028
chard-0.028
ache-0.026
itor-0.026
ber-0.025
Token said
Token ID531
Feature activation+2.18e-02

Pos logit contributions
acks+0.028
 drifted+0.028
 care+0.025
irmed+0.025
acked+0.024
Neg logit contributions
chard-0.042
 sir-0.042
ache-0.041
ber-0.039
 salon-0.038
Token,
Token ID11
Feature activation+9.77e-02

Pos logit contributions
chard+0.048
ache+0.045
 salon+0.044
itor+0.044
 sir+0.043
Neg logit contributions
 drifted-0.026
 They-0.026
 care-0.025
acks-0.025
 turns-0.024
Token "
Token ID366
Feature activation+6.28e-02

Pos logit contributions
chard+0.178
ache+0.171
ber+0.164
itor+0.164
 salon+0.163
Neg logit contributions
 They-0.159
 drifted-0.145
acks-0.138
 Soon-0.131
 drift-0.127
TokenHello
Token ID15496
Feature activation+5.10e-02

Pos logit contributions
chard+0.106
 sir+0.105
ache+0.105
ber+0.101
 salon+0.098
Neg logit contributions
acks-0.106
 drifted-0.102
irmed-0.094
igg-0.093
 care-0.092
Token,
Token ID11
Feature activation+1.27e-01

Pos logit contributions
chard+0.123
ache+0.116
 sir+0.105
itor+0.104
 salon+0.104
Neg logit contributions
acks-0.060
 turns-0.053
 They-0.052
 drifted-0.051
 care-0.051
Token little
Token ID1310
Feature activation+1.64e-02

Pos logit contributions
chard+0.301
ache+0.283
itor+0.258
belt+0.251
ber+0.251
Neg logit contributions
acks-0.136
 They-0.134
 drifted-0.129
 care-0.126
 turns-0.123
Token bird
Token ID6512
Feature activation+3.13e-02

Pos logit contributions
 sir+0.028
chard+0.028
ache+0.028
ber+0.026
 salon+0.026
Neg logit contributions
acks-0.028
 drifted-0.027
acked-0.025
irmed-0.025
 care-0.024
Token.
Token ID13
Feature activation+2.53e-02

Pos logit contributions
chard+0.072
 sir+0.069
ache+0.068
 salon+0.066
itor+0.066
Neg logit contributions
acks-0.037
 drifted-0.032
 care-0.031
acked-0.031
 They-0.030
Token Why
Token ID4162
Feature activation+5.50e-02

Pos logit contributions
chard+0.041
ache+0.040
 sir+0.040
 salon+0.037
ber+0.037
Neg logit contributions
acks-0.045
 drifted-0.044
acked-0.041
 care-0.041
irmed-0.041
Token are
Token ID389
Feature activation+5.02e-02

Pos logit contributions
chard+0.129
 sir+0.126
ache+0.126
ber+0.120
 salon+0.119
Neg logit contributions
acks-0.060
 drifted-0.057
 care-0.051
acked-0.048
irmed-0.048
Token He
Token ID679
Feature activation-3.68e-03

Pos logit contributions
chard+0.031
ache+0.030
 sir+0.030
ber+0.029
 salon+0.029
Neg logit contributions
acks-0.012
 drifted-0.012
acked-0.010
irmed-0.010
 care-0.010
Token said
Token ID531
Feature activation+2.46e-02

Pos logit contributions
acks+0.006
 drifted+0.005
acked+0.005
irmed+0.005
 care+0.005
Neg logit contributions
 sir-0.007
chard-0.007
ache-0.007
 why-0.007
ber-0.007
Token,
Token ID11
Feature activation+9.57e-02

Pos logit contributions
chard+0.054
ache+0.050
itor+0.049
 salon+0.049
 sir+0.047
Neg logit contributions
 drifted-0.029
 They-0.029
 care-0.029
 turns-0.028
 their-0.026
Token "
Token ID366
Feature activation+6.42e-02

Pos logit contributions
ache+0.173
chard+0.171
 salon+0.165
itor+0.159
ber+0.155
Neg logit contributions
 They-0.159
 drifted-0.141
 Soon-0.136
acks-0.134
 drift-0.125
TokenHello
Token ID15496
Feature activation+5.27e-02

Pos logit contributions
chard+0.111
 sir+0.110
ache+0.109
ber+0.102
 salon+0.101
Neg logit contributions
acks-0.109
 drifted-0.105
acked-0.097
irmed-0.097
 care-0.096
Token,
Token ID11
Feature activation+1.27e-01

Pos logit contributions
chard+0.129
ache+0.121
 salon+0.112
itor+0.107
 principal+0.106
Neg logit contributions
 They-0.056
acks-0.053
 turns-0.052
 care-0.047
text-0.046
Token little
Token ID1310
Feature activation+1.93e-02

Pos logit contributions
chard+0.265
ache+0.248
itor+0.215
 salon+0.213
ber+0.213
Neg logit contributions
 They-0.182
 turns-0.161
 care-0.159
 their-0.156
 air-0.154
Token boy
Token ID2933
Feature activation+1.92e-02

Pos logit contributions
chard+0.035
ache+0.035
 sir+0.034
ber+0.032
 salon+0.032
Neg logit contributions
acks-0.031
 drifted-0.029
 care-0.028
acked-0.027
irmed-0.027
Token.
Token ID13
Feature activation+3.04e-02

Pos logit contributions
chard+0.042
 sir+0.041
ache+0.040
 salon+0.039
ber+0.038
Neg logit contributions
acks-0.024
 drifted-0.022
 care-0.021
acked-0.020
irmed-0.020
Token We
Token ID775
Feature activation-1.87e-02

Pos logit contributions
ache+0.053
chard+0.052
 sir+0.049
 salon+0.048
ber+0.046
Neg logit contributions
acks-0.050
 They-0.049
 drifted-0.049
 care-0.047
acked-0.046
Token saw
Token ID2497
Feature activation+6.06e-02

Pos logit contributions
acks+0.031
 drifted+0.028
acked+0.026
irmed+0.026
 Soon+0.025
Neg logit contributions
 sir-0.035
chard-0.034
 why-0.034
ache-0.034
ber-0.031
Token and
Token ID290
Feature activation-7.93e-03

Pos logit contributions
chard+0.076
 sir+0.075
ache+0.074
ber+0.069
 salon+0.069
Neg logit contributions
acks-0.054
 drifted-0.053
acked-0.048
 care-0.047
irmed-0.047
Token says
Token ID1139
Feature activation+3.28e-02

Pos logit contributions
 drifted+0.010
acks+0.010
 care+0.009
 They+0.009
acked+0.009
Neg logit contributions
ache-0.016
chard-0.016
 sir-0.016
ber-0.015
 salon-0.015
Token,
Token ID11
Feature activation+1.06e-01

Pos logit contributions
chard+0.067
ache+0.063
 salon+0.061
itor+0.060
 sir+0.060
Neg logit contributions
 drifted-0.044
 care-0.043
acks-0.042
 They-0.042
 turns-0.039
Token "
Token ID366
Feature activation+6.26e-02

Pos logit contributions
chard+0.189
ache+0.175
 salon+0.168
itor+0.165
ber+0.157
Neg logit contributions
 They-0.183
 drifted-0.170
acks-0.164
 Soon-0.159
 care-0.154
TokenHello
Token ID15496
Feature activation+4.67e-02

Pos logit contributions
 sir+0.109
chard+0.108
ache+0.105
ber+0.102
 salon+0.099
Neg logit contributions
acks-0.106
 drifted-0.101
irmed-0.094
igg-0.092
 care-0.091
Token,
Token ID11
Feature activation+1.27e-01

Pos logit contributions
chard+0.111
ache+0.105
 salon+0.100
belt+0.096
itor+0.096
Neg logit contributions
acks-0.051
 They-0.049
 turns-0.048
 care-0.046
 drifted-0.045
Token can
Token ID460
Feature activation+8.30e-02

Pos logit contributions
chard+0.255
ache+0.242
belt+0.223
itor+0.218
 salon+0.216
Neg logit contributions
 They-0.173
 care-0.164
 turns-0.159
 drifted-0.153
acks-0.153
Token I
Token ID314
Feature activation+1.39e-02

Pos logit contributions
chard+0.133
ache+0.130
 sir+0.130
ber+0.120
 salon+0.119
Neg logit contributions
acks-0.151
 drifted-0.146
 care-0.136
acked-0.136
irmed-0.134
Token play
Token ID711
Feature activation+4.00e-02

Pos logit contributions
chard+0.031
 sir+0.031
ache+0.031
ber+0.029
itor+0.029
Neg logit contributions
acks-0.016
 drifted-0.016
 care-0.015
acked-0.014
 They-0.013
Token your
Token ID534
Feature activation+3.13e-02

Pos logit contributions
ache+0.080
chard+0.076
 sir+0.076
 salon+0.070
ber+0.070
Neg logit contributions
acks-0.061
 care-0.054
 drifted-0.054
acked-0.053
 They-0.052
Token fl
Token ID781
Feature activation+1.47e-02

Pos logit contributions
ache+0.060
chard+0.060
 sir+0.058
 why+0.055
ber+0.054
Neg logit contributions
acks-0.046
 drifted-0.044
 care-0.041
acked-0.040
irmed-0.039
Tokenâ
Token ID22940
Feature activation+1.30e-02

Pos logit contributions
 sir+0.056
chard+0.056
ache+0.055
ber+0.050
 why+0.050
Neg logit contributions
acks-0.078
 drifted-0.075
acked-0.071
irmed-0.070
 care-0.069
Token
Token ID26391
Feature activation+1.76e-02

Pos logit contributions
 sir+0.022
chard+0.021
ache+0.021
 salon+0.020
 why+0.020
Neg logit contributions
acks-0.022
 drifted-0.022
acked-0.020
irmed-0.020
 care-0.020
Token
Token ID8151
Feature activation+5.08e-02

Pos logit contributions
chard+0.041
ache+0.040
 sir+0.039
 why+0.038
 salon+0.037
Neg logit contributions
acks-0.020
 drifted-0.018
 They-0.016
acked-0.016
 Soon-0.015
Tokent
Token ID83
Feature activation+3.72e-02

Pos logit contributions
chard+0.093
 sir+0.091
ache+0.087
itor+0.085
 why+0.084
Neg logit contributions
acks-0.085
acked-0.076
hed-0.076
 drifted-0.075
 They-0.070
Token worry
Token ID5490
Feature activation+4.42e-02

Pos logit contributions
chard+0.073
 sir+0.072
ache+0.071
ber+0.067
 salon+0.066
Neg logit contributions
acks-0.055
 drifted-0.053
 care-0.048
acked-0.048
irmed-0.048
Token,
Token ID11
Feature activation+1.27e-01

Pos logit contributions
chard+0.097
ache+0.087
itor+0.085
 salon+0.084
ber+0.084
Neg logit contributions
acks-0.059
 drifted-0.054
 They-0.054
 care-0.053
 drift-0.047
Token Sam
Token ID3409
Feature activation+3.32e-02

Pos logit contributions
chard+0.251
ache+0.243
itor+0.232
 salon+0.224
ber+0.221
Neg logit contributions
 drifted-0.174
 care-0.171
 They-0.171
acks-0.161
 drift-0.151
Token.
Token ID13
Feature activation+2.88e-02

Pos logit contributions
chard+0.064
 sir+0.062
ache+0.061
ber+0.058
 salon+0.058
Neg logit contributions
acks-0.050
 drifted-0.049
acked-0.044
 care-0.044
irmed-0.044
Token I
Token ID314
Feature activation+6.58e-03

Pos logit contributions
ache+0.064
chard+0.063
 salon+0.060
 sir+0.057
itor+0.057
Neg logit contributions
 They-0.035
acks-0.035
 Soon-0.032
 drifted-0.031
acked-0.030
Token can
Token ID460
Feature activation+7.22e-02

Pos logit contributions
 sir+0.011
chard+0.011
ache+0.011
 why+0.011
ber+0.011
Neg logit contributions
acks-0.011
 drifted-0.010
acked-0.010
irmed-0.010
 care-0.009
Token help
Token ID1037
Feature activation+6.69e-02

Pos logit contributions
chard+0.130
 sir+0.130
ache+0.128
 why+0.119
ber+0.119
Neg logit contributions
acks-0.117
 drifted-0.112
acked-0.105
irmed-0.103
 care-0.101
Token to
Token ID284
Feature activation-1.40e-03

Pos logit contributions
acks+0.030
 drifted+0.028
irmed+0.028
acked+0.027
igg+0.027
Neg logit contributions
chard-0.018
ache-0.017
 why-0.017
 sir-0.017
 salon-0.016
Token be
Token ID307
Feature activation-3.41e-02

Pos logit contributions
acks+0.002
 drifted+0.002
acked+0.002
irmed+0.002
text+0.002
Neg logit contributions
chard-0.003
 sir-0.003
ache-0.003
 why-0.003
 salon-0.003
Token lucky
Token ID9670
Feature activation-1.45e-02

Pos logit contributions
acks+0.057
 drifted+0.054
acked+0.051
irmed+0.050
 care+0.048
Neg logit contributions
chard-0.060
 sir-0.059
ache-0.059
 why-0.055
ber-0.055
Token with
Token ID351
Feature activation-3.82e-02

Pos logit contributions
acks+0.023
 drifted+0.023
 care+0.021
acked+0.021
irmed+0.020
Neg logit contributions
chard-0.027
 sir-0.026
ache-0.026
ber-0.024
 salon-0.024
Token their
Token ID511
Feature activation-1.72e-02

Pos logit contributions
acks+0.051
 drifted+0.048
acked+0.045
irmed+0.043
 care+0.040
Neg logit contributions
chard-0.081
 sir-0.080
ache-0.078
 why-0.078
ber-0.075
Token newly
Token ID8308
Feature activation-1.09e-01

Pos logit contributions
acks+0.027
 drifted+0.026
acked+0.024
irmed+0.023
hed+0.022
Neg logit contributions
chard-0.032
 sir-0.032
ache-0.031
 why-0.030
 salon-0.030
Token prepared
Token ID5597
Feature activation-7.34e-03

Pos logit contributions
acks+0.187
 drifted+0.179
 They+0.162
irmed+0.162
text+0.160
Neg logit contributions
chard-0.188
ache-0.185
 sir-0.169
 why-0.168
 oh-0.167
Token jar
Token ID17379
Feature activation-1.23e-03

Pos logit contributions
acks+0.011
 drifted+0.010
acked+0.009
 care+0.009
irmed+0.009
Neg logit contributions
chard-0.014
 sir-0.014
ache-0.014
ber-0.013
 salon-0.013
Token.
Token ID13
Feature activation-1.95e-02

Pos logit contributions
acks+0.002
 drifted+0.002
acked+0.001
 care+0.001
irmed+0.001
Neg logit contributions
chard-0.003
 sir-0.003
ache-0.002
 salon-0.002
ber-0.002
Token<|endoftext|>
Token ID50256
Feature activation-4.88e-02

Pos logit contributions
 drifted+0.025
acks+0.024
irmed+0.023
 care+0.023
 breaths+0.023
Neg logit contributions
 sir-0.040
 why-0.037
 salon-0.037
ber-0.037
chard-0.037
TokenOnce
Token ID7454
Feature activation-7.09e-03

Pos logit contributions
acks+0.060
 drifted+0.058
irmed+0.053
 care+0.053
acked+0.051
Neg logit contributions
chard-0.103
 sir-0.103
ache-0.100
ber-0.096
 why-0.096
Token beautiful
Token ID4950
Feature activation-2.27e-02

Pos logit contributions
chard+0.042
 sir+0.041
ache+0.041
 salon+0.039
ber+0.039
Neg logit contributions
acks-0.032
 drifted-0.031
acked-0.029
irmed-0.029
 care-0.027
Token jazz
Token ID21274
Feature activation-1.21e-02

Pos logit contributions
acks+0.040
 drifted+0.039
acked+0.037
irmed+0.036
 care+0.036
Neg logit contributions
chard-0.037
 sir-0.037
ache-0.036
ber-0.034
 salon-0.034
Token tune
Token ID14009
Feature activation-2.45e-02

Pos logit contributions
acks+0.024
 drifted+0.023
acked+0.021
 care+0.021
irmed+0.021
Neg logit contributions
chard-0.018
 sir-0.018
ache-0.017
ber-0.016
 why-0.016
Token on
Token ID319
Feature activation-6.11e-02

Pos logit contributions
acks+0.033
 drifted+0.032
acked+0.029
 care+0.029
irmed+0.029
Neg logit contributions
chard-0.051
 sir-0.050
ache-0.049
ber-0.047
 salon-0.046
Token her
Token ID607
Feature activation-3.29e-03

Pos logit contributions
acks+0.079
 drifted+0.076
irmed+0.070
acked+0.069
 care+0.065
Neg logit contributions
chard-0.131
 sir-0.129
ache-0.128
 why-0.122
 salon-0.122
Token newly
Token ID8308
Feature activation-1.06e-01

Pos logit contributions
acks+0.006
 drifted+0.006
acked+0.005
irmed+0.005
 care+0.005
Neg logit contributions
chard-0.005
 sir-0.005
ache-0.005
ber-0.005
 salon-0.005
Token repaired
Token ID27457
Feature activation+4.87e-03

Pos logit contributions
acks+0.176
 drifted+0.162
irmed+0.157
 They+0.155
acked+0.151
Neg logit contributions
ache-0.182
chard-0.175
 why-0.165
 sir-0.164
 salon-0.164
Token toy
Token ID13373
Feature activation+6.79e-03

Pos logit contributions
chard+0.007
 sir+0.007
ache+0.007
ber+0.006
 why+0.006
Neg logit contributions
acks-0.009
 drifted-0.009
acked-0.009
 care-0.009
irmed-0.008
Token.
Token ID13
Feature activation-2.04e-02

Pos logit contributions
chard+0.012
 sir+0.012
ache+0.012
ber+0.011
 salon+0.011
Neg logit contributions
acks-0.011
 drifted-0.011
acked-0.010
irmed-0.010
 care-0.010
Token From
Token ID3574
Feature activation-2.69e-02

Pos logit contributions
acks+0.032
 breaths+0.031
irmed+0.031
 drifted+0.030
 fins+0.029
Neg logit contributions
chard-0.035
 why-0.035
 sir-0.034
ache-0.032
ber-0.031
Token that
Token ID326
Feature activation+5.73e-03

Pos logit contributions
 drifted+0.044
acks+0.044
 care+0.041
acked+0.041
irmed+0.040
Neg logit contributions
ache-0.046
chard-0.046
 sir-0.045
 salon-0.043
ber-0.042
Token dogs
Token ID6844
Feature activation-1.53e-02

Pos logit contributions
acks+0.059
 drifted+0.056
acked+0.051
 care+0.051
irmed+0.050
Neg logit contributions
chard-0.087
 sir-0.085
ache-0.085
ber-0.080
 why-0.079
Token and
Token ID290
Feature activation-5.26e-02

Pos logit contributions
acks+0.019
 drifted+0.019
 They+0.017
acked+0.016
 care+0.016
Neg logit contributions
chard-0.033
 sir-0.033
ache-0.033
itor-0.031
ber-0.030
Token small
Token ID1402
Feature activation-7.44e-02

Pos logit contributions
 drifted+0.062
acks+0.061
 care+0.056
 They+0.054
irmed+0.053
Neg logit contributions
 sir-0.116
ache-0.115
chard-0.113
ber-0.109
 salon-0.108
Token dogs
Token ID6844
Feature activation-3.27e-02

Pos logit contributions
acks+0.101
 drifted+0.097
acked+0.088
 care+0.088
irmed+0.087
Neg logit contributions
chard-0.154
 sir-0.152
ache-0.150
ber-0.142
 salon-0.140
Token and
Token ID290
Feature activation-6.01e-02

Pos logit contributions
acks+0.040
 drifted+0.038
acked+0.034
 care+0.033
 They+0.033
Neg logit contributions
 sir-0.072
chard-0.072
ache-0.071
itor-0.067
ber-0.066
Token fluffy
Token ID39145
Feature activation-1.06e-01

Pos logit contributions
 drifted+0.072
acks+0.070
 care+0.065
 They+0.063
irmed+0.062
Neg logit contributions
 sir-0.131
ache-0.129
chard-0.128
ber-0.123
 salon-0.122
Token dogs
Token ID6844
Feature activation-4.32e-02

Pos logit contributions
acks+0.157
 drifted+0.151
irmed+0.139
acked+0.136
igg+0.134
Neg logit contributions
chard-0.199
 sir-0.197
ache-0.196
ber-0.183
 why-0.183
Token.
Token ID13
Feature activation-4.45e-02

Pos logit contributions
acks+0.050
 drifted+0.048
acked+0.043
 care+0.042
 They+0.041
Neg logit contributions
 sir-0.097
chard-0.097
ache-0.095
ber-0.090
 salon-0.090
Token They
Token ID1119
Feature activation-6.01e-02

Pos logit contributions
acks+0.024
 They+0.022
 drifted+0.019
 Soon+0.016
acked+0.015
Neg logit contributions
ache-0.131
chard-0.129
 sir-0.126
 salon-0.123
ber-0.123
Token wanted
Token ID2227
Feature activation+2.72e-02

Pos logit contributions
acks+0.079
 drifted+0.078
 care+0.070
irmed+0.069
acked+0.069
Neg logit contributions
 sir-0.124
chard-0.124
ache-0.123
ber-0.116
 salon-0.115
Token to
Token ID284
Feature activation+3.93e-02

Pos logit contributions
ache+0.047
 sir+0.047
chard+0.045
 salon+0.043
ber+0.043
Neg logit contributions
 care-0.044
 They-0.042
 drifted-0.042
acks-0.042
 turns-0.041
Token show
Token ID905
Feature activation-1.07e-02

Pos logit contributions
 why+0.008
chard+0.008
 salon+0.008
 sir+0.008
ache+0.008
Neg logit contributions
acks-0.007
acked-0.007
 drifted-0.006
irmed-0.006
hed-0.006
Token Mom
Token ID11254
Feature activation-1.20e-02

Pos logit contributions
acks+0.012
 drifted+0.011
acked+0.010
irmed+0.010
igg+0.010
Neg logit contributions
 sir-0.025
 why-0.025
chard-0.024
ache-0.024
ber-0.024
Token the
Token ID262
Feature activation+1.41e-03

Pos logit contributions
acks+0.016
 drifted+0.015
acked+0.014
irmed+0.014
 care+0.014
Neg logit contributions
chard-0.025
 sir-0.025
ache-0.024
ber-0.023
 why-0.023
Token dresses
Token ID27309
Feature activation-6.79e-03

Pos logit contributions
 sir+0.003
chard+0.003
ache+0.003
 salon+0.002
ber+0.002
Neg logit contributions
acks-0.002
 drifted-0.002
acked-0.002
irmed-0.002
 care-0.002
Token they
Token ID484
Feature activation-6.26e-02

Pos logit contributions
acks+0.009
 drifted+0.008
 care+0.007
acked+0.007
 They+0.007
Neg logit contributions
chard-0.015
 sir-0.014
ache-0.014
ber-0.013
itor-0.013
Token both
Token ID1111
Feature activation-1.05e-01

Pos logit contributions
acks+0.103
 drifted+0.090
acked+0.089
irmed+0.087
igg+0.083
Neg logit contributions
chard-0.117
ache-0.113
 sir-0.110
 why-0.107
ber-0.105
Token had
Token ID550
Feature activation-9.17e-02

Pos logit contributions
acks+0.168
acked+0.148
igg+0.143
irmed+0.143
 drifted+0.143
Neg logit contributions
ache-0.186
chard-0.182
 why-0.180
 sir-0.179
ber-0.174
Token chosen
Token ID7147
Feature activation-1.75e-02

Pos logit contributions
acks+0.146
 drifted+0.134
irmed+0.129
acked+0.128
 care+0.121
Neg logit contributions
chard-0.166
ache-0.165
 sir-0.162
ber-0.155
 why-0.154
Token.
Token ID13
Feature activation-2.74e-02

Pos logit contributions
acks+0.024
 drifted+0.023
acked+0.021
irmed+0.021
 care+0.020
Neg logit contributions
chard-0.036
 sir-0.035
ache-0.035
ber-0.033
 why-0.033
Token<|endoftext|>
Token ID50256
Feature activation-4.90e-02

Pos logit contributions
igg+0.047
irmed+0.046
irt+0.045
 breaths+0.044
bons+0.044
Neg logit contributions
 sir-0.042
 why-0.041
ber-0.039
chard-0.038
 dentist-0.036
TokenOnce
Token ID7454
Feature activation-8.65e-03

Pos logit contributions
acks+0.064
 drifted+0.063
irmed+0.058
 care+0.057
acked+0.054
Neg logit contributions
chard-0.099
 sir-0.098
ache-0.095
ber-0.093
 why-0.091
Token crown
Token ID12389
Feature activation-6.51e-03

Pos logit contributions
chard+0.004
 sir+0.004
ache+0.003
ber+0.003
 salon+0.003
Neg logit contributions
acks-0.003
 drifted-0.003
acked-0.002
irmed-0.002
 care-0.002
Token and
Token ID290
Feature activation-4.66e-02

Pos logit contributions
acks+0.009
 drifted+0.009
 care+0.008
acked+0.008
irmed+0.008
Neg logit contributions
chard-0.013
 sir-0.013
ache-0.012
ber-0.012
 salon-0.012
Token playing
Token ID2712
Feature activation-6.58e-03

Pos logit contributions
 drifted+0.064
acks+0.063
 care+0.059
 They+0.055
 turns+0.054
Neg logit contributions
 sir-0.097
ache-0.094
chard-0.094
ber-0.087
 salon-0.086
Token in
Token ID287
Feature activation-6.32e-02

Pos logit contributions
acks+0.010
 drifted+0.009
 care+0.009
acked+0.009
 They+0.009
Neg logit contributions
 sir-0.013
ache-0.013
chard-0.012
ber-0.012
 salon-0.011
Token the
Token ID262
Feature activation-4.80e-02

Pos logit contributions
acks+0.078
 drifted+0.074
acked+0.067
irmed+0.067
 care+0.063
Neg logit contributions
chard-0.141
 sir-0.138
ache-0.135
 why-0.130
 salon-0.130
Token freshly
Token ID29026
Feature activation-1.04e-01

Pos logit contributions
acks+0.069
 drifted+0.065
irmed+0.061
acked+0.061
text+0.058
Neg logit contributions
chard-0.095
 sir-0.089
ache-0.089
 salon-0.088
 why-0.086
Token cut
Token ID2005
Feature activation-6.85e-03

Pos logit contributions
acks+0.159
 drifted+0.150
irmed+0.139
acked+0.137
 care+0.135
Neg logit contributions
chard-0.196
ache-0.193
 sir-0.190
ber-0.181
 why-0.180
Token grass
Token ID8701
Feature activation-2.86e-02

Pos logit contributions
acks+0.011
 drifted+0.010
acked+0.009
irmed+0.009
 care+0.009
Neg logit contributions
chard-0.013
 sir-0.013
ache-0.013
ber-0.012
 salon-0.012
Token.
Token ID13
Feature activation-2.35e-02

Pos logit contributions
acks+0.032
 drifted+0.028
acked+0.027
 They+0.026
hed+0.025
Neg logit contributions
 sir-0.066
ache-0.064
chard-0.064
itor-0.062
ber-0.061
Token It
Token ID632
Feature activation-2.39e-02

Pos logit contributions
acks+0.032
 drifted+0.030
 breaths+0.030
 fins+0.029
 care+0.029
Neg logit contributions
 why-0.043
chard-0.043
 sir-0.042
ache-0.040
 salon-0.040
Token looked
Token ID3114
Feature activation-2.86e-02

Pos logit contributions
acks+0.045
irmed+0.041
acked+0.041
irt+0.040
 drifted+0.040
Neg logit contributions
ache-0.035
 why-0.034
chard-0.033
 salon-0.031
 sir-0.030
Token enjoyed
Token ID8359
Feature activation-4.06e-02

Pos logit contributions
acks+0.044
 drifted+0.042
acked+0.038
 care+0.038
irmed+0.037
Neg logit contributions
chard-0.073
 sir-0.073
ache-0.072
ber-0.068
 salon-0.068
Token playing
Token ID2712
Feature activation-5.39e-03

Pos logit contributions
acks+0.057
acked+0.054
 drifted+0.053
irmed+0.051
text+0.048
Neg logit contributions
chard-0.082
 sir-0.079
 why-0.078
ache-0.076
ber-0.074
Token together
Token ID1978
Feature activation-4.17e-02

Pos logit contributions
acks+0.007
 drifted+0.007
acked+0.007
irmed+0.006
 care+0.006
Neg logit contributions
chard-0.011
 sir-0.011
ache-0.011
 why-0.010
ber-0.010
Token in
Token ID287
Feature activation-6.17e-02

Pos logit contributions
acks+0.055
 drifted+0.053
acked+0.048
 care+0.047
irmed+0.046
Neg logit contributions
 sir-0.088
chard-0.087
ache-0.086
ber-0.081
 salon-0.080
Token their
Token ID511
Feature activation-4.33e-02

Pos logit contributions
acks+0.096
acked+0.092
irmed+0.091
 drifted+0.090
igg+0.084
Neg logit contributions
chard-0.119
 sir-0.113
 why-0.112
 salon-0.107
ber-0.102
Token newly
Token ID8308
Feature activation-1.04e-01

Pos logit contributions
acks+0.082
 drifted+0.077
acked+0.076
hed+0.074
irmed+0.073
Neg logit contributions
chard-0.065
 sir-0.062
 salon-0.061
ber-0.059
 why-0.058
Token cleaned
Token ID20750
Feature activation-2.50e-03

Pos logit contributions
acks+0.156
 drifted+0.152
irmed+0.143
 They+0.141
acked+0.138
Neg logit contributions
ache-0.180
chard-0.179
 why-0.171
 oh-0.169
 sir-0.169
Token space
Token ID2272
Feature activation+2.17e-03

Pos logit contributions
acks+0.004
 drifted+0.004
acked+0.003
irmed+0.003
 care+0.003
Neg logit contributions
chard-0.005
 sir-0.005
ache-0.005
ber-0.004
 salon-0.004
Token.
Token ID13
Feature activation-1.92e-02

Pos logit contributions
chard+0.005
 sir+0.005
ache+0.005
 salon+0.004
ber+0.004
Neg logit contributions
acks-0.003
 drifted-0.003
 care-0.002
acked-0.002
irmed-0.002
Token<|endoftext|>
Token ID50256
Feature activation-4.73e-02

Pos logit contributions
acks+0.027
 drifted+0.026
irmed+0.025
 breaths+0.024
igg+0.024
Neg logit contributions
 sir-0.038
 why-0.037
chard-0.037
ache-0.034
ber-0.034
TokenOnce
Token ID7454
Feature activation-5.70e-03

Pos logit contributions
acks+0.064
 drifted+0.063
irmed+0.059
 care+0.057
acked+0.055
Neg logit contributions
chard-0.093
 sir-0.092
ache-0.089
ber-0.087
 why-0.086
Token stirred
Token ID33091
Feature activation-2.81e-02

Pos logit contributions
acks+0.045
text+0.040
 drifted+0.040
acked+0.040
irmed+0.039
Neg logit contributions
chard-0.056
ache-0.054
 why-0.052
 sir-0.051
 salon-0.050
Token until
Token ID1566
Feature activation-9.16e-03

Pos logit contributions
acks+0.039
 drifted+0.038
irmed+0.035
acked+0.034
 care+0.033
Neg logit contributions
chard-0.057
 sir-0.056
ache-0.055
 why-0.053
ber-0.052
Token it
Token ID340
Feature activation-2.19e-02

Pos logit contributions
acks+0.014
 drifted+0.013
acked+0.013
irmed+0.013
text+0.012
Neg logit contributions
chard-0.017
 sir-0.017
ache-0.016
 why-0.016
 salon-0.016
Token was
Token ID373
Feature activation-1.74e-02

Pos logit contributions
acks+0.035
 drifted+0.033
irmed+0.032
 care+0.031
acked+0.031
Neg logit contributions
ache-0.038
chard-0.037
 sir-0.036
 why-0.035
 salon-0.035
Token all
Token ID477
Feature activation-7.97e-02

Pos logit contributions
acks+0.025
 drifted+0.023
acked+0.022
irmed+0.022
 care+0.021
Neg logit contributions
chard-0.035
 sir-0.034
ache-0.034
 why-0.032
 salon-0.031
Token soft
Token ID2705
Feature activation-1.04e-01

Pos logit contributions
acks+0.136
 drifted+0.127
irmed+0.125
text+0.124
acked+0.120
Neg logit contributions
chard-0.133
ache-0.129
 sir-0.124
 why-0.124
 salon-0.118
Token and
Token ID290
Feature activation-3.58e-02

Pos logit contributions
acks+0.161
 drifted+0.156
irmed+0.145
acked+0.144
text+0.137
Neg logit contributions
ache-0.185
chard-0.183
 sir-0.178
 why-0.171
ber-0.165
Token fluffy
Token ID39145
Feature activation-9.35e-02

Pos logit contributions
acks+0.051
 drifted+0.050
 care+0.046
acked+0.045
irmed+0.044
Neg logit contributions
 sir-0.071
ache-0.070
chard-0.070
ber-0.066
 salon-0.065
Token.
Token ID13
Feature activation-1.39e-02

Pos logit contributions
acks+0.141
 drifted+0.135
irmed+0.125
acked+0.124
text+0.119
Neg logit contributions
ache-0.175
chard-0.173
 sir-0.167
 why-0.160
ber-0.156
Token Then
Token ID3244
Feature activation-8.40e-03

Pos logit contributions
acks+0.020
 drifted+0.020
irmed+0.018
 care+0.018
acked+0.018
Neg logit contributions
chard-0.027
 sir-0.026
 why-0.025
ache-0.025
ber-0.024
Token,
Token ID11
Feature activation+7.35e-02

Pos logit contributions
acks+0.015
 drifted+0.014
acked+0.014
irmed+0.013
 care+0.013
Neg logit contributions
chard-0.014
 sir-0.014
ache-0.013
ber-0.012
 why-0.012
Token it
Token ID340
Feature activation-5.28e-02

Pos logit contributions
acks+0.055
 drifted+0.051
acked+0.045
irmed+0.044
 care+0.043
Neg logit contributions
chard-0.136
 sir-0.135
ache-0.134
ber-0.128
 salon-0.127
Token.
Token ID13
Feature activation-4.27e-02

Pos logit contributions
acks+0.101
acked+0.095
irmed+0.095
 drifted+0.094
irted+0.088
Neg logit contributions
chard-0.079
ache-0.077
 sir-0.077
 why-0.075
ber-0.073
Token They
Token ID1119
Feature activation-7.20e-02

Pos logit contributions
acks+0.046
 drifted+0.043
irmed+0.040
acked+0.039
hed+0.036
Neg logit contributions
 sir-0.101
chard-0.100
 why-0.095
ache-0.094
ber-0.091
Token both
Token ID1111
Feature activation-8.84e-02

Pos logit contributions
acks+0.129
acked+0.111
irmed+0.107
 Soon+0.103
irt+0.100
Neg logit contributions
chard-0.137
ache-0.127
 why-0.120
 salon-0.120
 sir-0.119
Token enjoyed
Token ID8359
Feature activation-5.76e-02

Pos logit contributions
acks+0.176
acked+0.148
irmed+0.145
igg+0.143
 Soon+0.143
Neg logit contributions
chard-0.145
ache-0.138
ber-0.132
 why-0.124
 sir-0.122
Token each
Token ID1123
Feature activation-1.03e-01

Pos logit contributions
acks+0.060
 drifted+0.054
acked+0.050
irmed+0.049
 care+0.045
Neg logit contributions
chard-0.141
 sir-0.139
ache-0.136
ber-0.132
 why-0.131
Token other
Token ID584
Feature activation-4.96e-02

Pos logit contributions
acks+0.186
 drifted+0.184
irmed+0.175
acked+0.163
igg+0.157
Neg logit contributions
 sir-0.164
chard-0.163
 lawyer-0.163
 salon-0.158
ber-0.157
Token.
Token ID13
Feature activation-4.41e-02

Pos logit contributions
acks+0.087
 drifted+0.087
irmed+0.084
acked+0.082
igg+0.081
Neg logit contributions
ache-0.078
ber-0.076
chard-0.074
 sir-0.072
 why-0.069
Token<|endoftext|>
Token ID50256
Feature activation-5.67e-02

Pos logit contributions
acks+0.057
 drifted+0.053
irmed+0.050
acked+0.049
hed+0.046
Neg logit contributions
 sir-0.096
chard-0.094
 why-0.090
ache-0.088
ber-0.086
TokenL
Token ID43
Feature activation-5.46e-02

Pos logit contributions
acks+0.071
 drifted+0.069
irmed+0.064
 care+0.061
acked+0.061
Neg logit contributions
chard-0.120
 sir-0.119
ache-0.116
ber-0.112
 why-0.110
Tokenily
Token ID813
Feature activation-3.37e-02

Pos logit contributions
acks+0.109
 drifted+0.109
irmed+0.104
acked+0.100
 harmony+0.098
Neg logit contributions
 sir-0.074
 salon-0.073
 why-0.069
chard-0.068
ache-0.067
Token no
Token ID645
Feature activation-2.70e-02

Pos logit contributions
acks+0.118
acked+0.111
 drifted+0.105
irmed+0.103
igg+0.102
Neg logit contributions
chard-0.075
ache-0.074
 sir-0.074
 why-0.070
 salon-0.067
Token energy
Token ID2568
Feature activation-2.56e-02

Pos logit contributions
acks+0.036
acked+0.033
 drifted+0.032
hed+0.030
irmed+0.029
Neg logit contributions
 sir-0.056
chard-0.055
ache-0.054
 why-0.053
 salon-0.053
Token to
Token ID284
Feature activation+1.54e-02

Pos logit contributions
 drifted+0.045
acks+0.045
 care+0.042
irmed+0.040
acked+0.039
Neg logit contributions
chard-0.043
 sir-0.042
ache-0.039
itor-0.039
ber-0.037
Token play
Token ID711
Feature activation+1.06e-02

Pos logit contributions
chard+0.035
 sir+0.035
ache+0.034
 why+0.033
 salon+0.033
Neg logit contributions
acks-0.017
 drifted-0.016
acked-0.015
irmed-0.015
 care-0.014
Token.
Token ID13
Feature activation-1.25e-02

Pos logit contributions
chard+0.021
 sir+0.021
ache+0.021
ber+0.019
 salon+0.019
Neg logit contributions
acks-0.016
 drifted-0.015
 care-0.014
acked-0.013
 They-0.013
Token Tears
Token ID40328
Feature activation-1.03e-01

Pos logit contributions
acks+0.015
 drifted+0.015
acked+0.013
irmed+0.013
 care+0.013
Neg logit contributions
 sir-0.028
chard-0.027
ache-0.026
 why-0.026
ber-0.026
Token rolled
Token ID11686
Feature activation-5.13e-02

Pos logit contributions
acks+0.157
bit+0.140
irt+0.135
 sorts+0.135
 care+0.135
Neg logit contributions
ache-0.191
chard-0.175
 sir-0.168
 why-0.159
 salon-0.158
Token down
Token ID866
Feature activation-5.09e-02

Pos logit contributions
acks+0.069
 drifted+0.067
acked+0.063
irmed+0.062
 care+0.061
Neg logit contributions
chard-0.106
 sir-0.105
ache-0.104
 salon-0.101
ber-0.100
Token his
Token ID465
Feature activation-2.96e-02

Pos logit contributions
acks+0.063
 drifted+0.059
acked+0.054
irmed+0.053
 care+0.052
Neg logit contributions
chard-0.112
 sir-0.111
ache-0.110
ber-0.104
 why-0.104
Token cheeks
Token ID25839
Feature activation-6.74e-02

Pos logit contributions
acked+0.054
 sorts+0.054
acks+0.051
hed+0.051
irmed+0.047
Neg logit contributions
ache-0.048
 sir-0.046
ber-0.040
 salon-0.040
 why-0.038
Token and
Token ID290
Feature activation-6.45e-02

Pos logit contributions
acks+0.101
 drifted+0.096
acked+0.089
irmed+0.088
 care+0.088
Neg logit contributions
chard-0.129
 sir-0.127
ache-0.127
ber-0.118
 salon-0.118
Token and
Token ID290
Feature activation-1.83e-02

Pos logit contributions
 drifted+0.036
 care+0.033
acks+0.033
 They+0.028
irmed+0.027
Neg logit contributions
chard-0.085
 sir-0.082
ache-0.079
 salon-0.078
itor-0.078
Token fell
Token ID3214
Feature activation-3.59e-02

Pos logit contributions
acks+0.024
 drifted+0.023
acked+0.021
irmed+0.021
 care+0.021
Neg logit contributions
chard-0.039
 sir-0.038
ache-0.038
ber-0.036
 salon-0.036
Token asleep
Token ID16039
Feature activation-2.14e-02

Pos logit contributions
 drifted+0.050
acks+0.050
acked+0.043
irmed+0.043
 care+0.043
Neg logit contributions
 sir-0.075
chard-0.075
ache-0.070
ber-0.070
itor-0.068
Token in
Token ID287
Feature activation-6.02e-02

Pos logit contributions
 drifted+0.027
acks+0.026
 care+0.024
irmed+0.022
 They+0.022
Neg logit contributions
 sir-0.046
chard-0.045
ache-0.044
ber-0.042
 why-0.042
Token her
Token ID607
Feature activation-3.21e-02

Pos logit contributions
acks+0.091
acked+0.086
irmed+0.084
igg+0.083
 drifted+0.081
Neg logit contributions
chard-0.110
 why-0.110
 sir-0.109
 salon-0.104
ache-0.102
Token newly
Token ID8308
Feature activation-1.03e-01

Pos logit contributions
acks+0.058
acked+0.053
hed+0.051
irmed+0.050
 drifted+0.050
Neg logit contributions
chard-0.053
 sir-0.051
 salon-0.051
ache-0.050
ber-0.049
Token painted
Token ID13055
Feature activation-6.54e-03

Pos logit contributions
acks+0.176
 They+0.162
 drifted+0.152
acked+0.149
irmed+0.148
Neg logit contributions
ache-0.181
chard-0.166
 oh-0.157
 salon-0.155
 why-0.155
Token crib
Token ID48083
Feature activation-7.68e-03

Pos logit contributions
acks+0.010
 drifted+0.010
acked+0.009
 air+0.009
bons+0.009
Neg logit contributions
chard-0.013
 sir-0.013
ache-0.012
ber-0.012
 why-0.011
Token.
Token ID13
Feature activation-2.02e-02

Pos logit contributions
 drifted+0.008
acks+0.008
 care+0.007
 They+0.006
irmed+0.006
Neg logit contributions
 sir-0.018
chard-0.018
ache-0.017
itor-0.017
ber-0.016
Token From
Token ID3574
Feature activation-2.34e-02

Pos logit contributions
acks+0.029
 drifted+0.028
acked+0.027
irmed+0.026
bit+0.025
Neg logit contributions
chard-0.038
 sir-0.038
 why-0.037
ache-0.036
 salon-0.035
Token that
Token ID326
Feature activation+9.03e-03

Pos logit contributions
 drifted+0.038
acks+0.037
 care+0.035
irmed+0.034
acked+0.034
Neg logit contributions
ache-0.041
 sir-0.040
chard-0.039
 salon-0.037
ber-0.036
Token.
Token ID13
Feature activation-1.93e-02

Pos logit contributions
acks+0.039
 drifted+0.038
acked+0.034
 care+0.034
irmed+0.034
Neg logit contributions
chard-0.057
 sir-0.057
ache-0.056
ber-0.053
 salon-0.053
Token He
Token ID679
Feature activation-2.04e-02

Pos logit contributions
acks+0.022
 drifted+0.022
 care+0.020
irmed+0.020
acked+0.020
Neg logit contributions
 sir-0.044
chard-0.043
ache-0.041
 why-0.040
ber-0.040
Token stretched
Token ID19110
Feature activation-5.42e-02

Pos logit contributions
acks+0.034
 drifted+0.029
acked+0.029
irmed+0.029
 care+0.029
Neg logit contributions
chard-0.036
ache-0.035
 sir-0.035
ber-0.034
 why-0.034
Token out
Token ID503
Feature activation-3.19e-02

Pos logit contributions
acks+0.078
 drifted+0.072
acked+0.068
irmed+0.067
 care+0.064
Neg logit contributions
chard-0.108
 sir-0.106
ache-0.103
 why-0.102
ber-0.098
Token his
Token ID465
Feature activation-1.73e-02

Pos logit contributions
acks+0.045
 drifted+0.040
acked+0.038
irmed+0.038
irt+0.036
Neg logit contributions
chard-0.063
 why-0.063
 sir-0.063
ache-0.060
ber-0.057
Token newly
Token ID8308
Feature activation-1.03e-01

Pos logit contributions
acks+0.020
acked+0.017
 drifted+0.016
irmed+0.015
hed+0.015
Neg logit contributions
chard-0.040
 sir-0.040
ache-0.040
 salon-0.037
ber-0.037
Token found
Token ID1043
Feature activation+2.04e-02

Pos logit contributions
acks+0.179
 They+0.160
 drifted+0.150
|+0.149
irt+0.146
Neg logit contributions
ache-0.170
chard-0.157
cher-0.144
 why-0.143
lock-0.141
Token wings
Token ID12098
Feature activation-3.66e-02

Pos logit contributions
chard+0.039
 sir+0.039
ache+0.038
ber+0.036
 salon+0.036
Neg logit contributions
acks-0.031
 drifted-0.030
acked-0.028
irmed-0.027
 care-0.027
Token and
Token ID290
Feature activation-4.40e-02

Pos logit contributions
acks+0.052
 drifted+0.050
acked+0.045
 care+0.045
irmed+0.044
Neg logit contributions
chard-0.074
 sir-0.073
ache-0.071
ber-0.069
 salon-0.068
Token flew
Token ID13112
Feature activation-2.02e-02

Pos logit contributions
acks+0.060
 drifted+0.055
acked+0.052
irmed+0.050
 care+0.049
Neg logit contributions
chard-0.092
 sir-0.090
ache-0.090
 why-0.084
ber-0.084
Token away
Token ID1497
Feature activation-1.22e-02

Pos logit contributions
acks+0.031
 drifted+0.029
irmed+0.027
acked+0.027
 care+0.026
Neg logit contributions
chard-0.039
 sir-0.038
ache-0.037
 why-0.036
ber-0.035
Token and
Token ID290
Feature activation-4.04e-02

Pos logit contributions
chard+0.003
 sir+0.003
ache+0.003
ber+0.003
itor+0.003
Neg logit contributions
acks-0.002
 drifted-0.002
acked-0.002
 care-0.002
 They-0.002
Token blocks
Token ID7021
Feature activation-8.91e-03

Pos logit contributions
 drifted+0.057
acks+0.057
 care+0.053
 They+0.050
acked+0.050
Neg logit contributions
 sir-0.079
chard-0.078
ache-0.078
ber-0.073
 salon-0.071
Token,
Token ID11
Feature activation+4.70e-02

Pos logit contributions
acks+0.012
 drifted+0.011
acked+0.010
 care+0.010
 They+0.010
Neg logit contributions
chard-0.019
 sir-0.019
ache-0.018
ber-0.018
itor-0.017
Token all
Token ID477
Feature activation-5.95e-02

Pos logit contributions
chard+0.104
 sir+0.104
ache+0.103
ber+0.098
itor+0.096
Neg logit contributions
 drifted-0.055
acks-0.053
 care-0.049
 They-0.049
 hearts-0.047
Token in
Token ID287
Feature activation-6.84e-02

Pos logit contributions
acks+0.080
 drifted+0.075
irmed+0.071
igg+0.068
acked+0.066
Neg logit contributions
chard-0.124
 why-0.123
 sir-0.121
ache-0.117
 salon-0.117
Token bright
Token ID6016
Feature activation-1.02e-01

Pos logit contributions
acks+0.089
 drifted+0.084
acked+0.078
irmed+0.077
 care+0.072
Neg logit contributions
chard-0.146
 sir-0.144
ache-0.140
 why-0.137
 salon-0.136
Token red
Token ID2266
Feature activation-9.04e-02

Pos logit contributions
acks+0.161
 drifted+0.158
irmed+0.150
igg+0.140
bit+0.139
Neg logit contributions
chard-0.165
 why-0.165
 sir-0.162
ache-0.159
 salon-0.150
Token colours
Token ID18915
Feature activation-4.32e-02

Pos logit contributions
acks+0.145
 drifted+0.142
irmed+0.131
acked+0.129
igg+0.123
Neg logit contributions
 sir-0.154
ache-0.153
chard-0.153
 why-0.143
 salon-0.141
Token!
Token ID0
Feature activation-1.35e-02

Pos logit contributions
acks+0.054
 drifted+0.051
acked+0.046
 care+0.045
irmed+0.044
Neg logit contributions
chard-0.095
 sir-0.094
ache-0.092
ber-0.088
 salon-0.087
Token She
Token ID1375
Feature activation-1.13e-02

Pos logit contributions
 drifted+0.020
acks+0.020
 care+0.018
igg+0.018
irmed+0.018
Neg logit contributions
chard-0.025
 sir-0.025
 salon-0.024
 why-0.023
ber-0.023
Token was
Token ID373
Feature activation+2.33e-04

Pos logit contributions
acks+0.017
 drifted+0.016
acked+0.015
irmed+0.014
 care+0.014
Neg logit contributions
chard-0.022
 sir-0.022
ache-0.022
ber-0.021
 why-0.020
Token having
Token ID1719
Feature activation-6.77e-02

Pos logit contributions
acks+0.041
 drifted+0.040
acked+0.037
irmed+0.037
 care+0.037
Neg logit contributions
chard-0.039
 sir-0.039
ache-0.038
ber-0.035
 salon-0.035
Token fun
Token ID1257
Feature activation-2.78e-02

Pos logit contributions
acks+0.113
acked+0.110
 drifted+0.103
irmed+0.102
irt+0.097
Neg logit contributions
 why-0.114
 sir-0.113
chard-0.109
ache-0.105
 oh-0.100
Token together
Token ID1978
Feature activation-3.97e-02

Pos logit contributions
acks+0.041
 drifted+0.039
acked+0.036
irmed+0.036
 care+0.035
Neg logit contributions
chard-0.054
 sir-0.054
ache-0.053
ber-0.050
 why-0.050
Token all
Token ID477
Feature activation-7.88e-02

Pos logit contributions
acks+0.046
 drifted+0.044
 care+0.039
acked+0.038
irmed+0.037
Neg logit contributions
chard-0.090
 sir-0.090
ache-0.089
 salon-0.084
ber-0.083
Token year
Token ID614
Feature activation-5.03e-02

Pos logit contributions
acks+0.119
 drifted+0.113
irmed+0.112
igg+0.105
acked+0.105
Neg logit contributions
 why-0.151
 sir-0.145
chard-0.140
ber-0.139
ache-0.133
Token long
Token ID890
Feature activation-1.02e-01

Pos logit contributions
acks+0.074
 drifted+0.071
acked+0.065
irmed+0.064
 care+0.064
Neg logit contributions
chard-0.098
 sir-0.098
ache-0.096
ber-0.091
 salon-0.090
Token.
Token ID13
Feature activation-1.98e-02

Pos logit contributions
acks+0.148
 drifted+0.140
acked+0.133
irmed+0.129
 care+0.124
Neg logit contributions
chard-0.194
ache-0.194
 sir-0.194
 why-0.186
ber-0.183
Token They
Token ID1119
Feature activation-5.38e-02

Pos logit contributions
 breaths+0.031
irmed+0.029
 drifted+0.028
les+0.028
acks+0.028
Neg logit contributions
 why-0.036
 sir-0.036
ber-0.033
keeper-0.033
chard-0.032
Token fit
Token ID4197
Feature activation-2.40e-02

Pos logit contributions
acks+0.100
acked+0.086
irmed+0.081
irt+0.081
 Soon+0.081
Neg logit contributions
chard-0.093
 sir-0.090
 why-0.089
ache-0.088
ber-0.086
Token their
Token ID511
Feature activation-1.02e-02

Pos logit contributions
acks+0.040
 drifted+0.038
acked+0.036
irmed+0.034
igg+0.033
Neg logit contributions
 sir-0.040
 why-0.040
chard-0.039
ber-0.038
ache-0.038
Token plans
Token ID3352
Feature activation+1.54e-02

Pos logit contributions
acks+0.015
acked+0.013
 drifted+0.013
hed+0.012
irmed+0.012
Neg logit contributions
 sir-0.020
chard-0.019
ache-0.019
ber-0.019
 why-0.019
Tokenals
Token ID874
Feature activation-1.39e-02

Pos logit contributions
acks+0.001
 drifted+0.001
irmed+0.001
acked+0.001
 care+0.001
Neg logit contributions
chard-0.001
 sir-0.001
ache-0.001
ber-0.001
 salon-0.001
Token and
Token ID290
Feature activation-4.27e-02

Pos logit contributions
acks+0.020
 drifted+0.019
acked+0.017
 care+0.017
irmed+0.017
Neg logit contributions
chard-0.028
 sir-0.028
ache-0.027
ber-0.026
 salon-0.025
Token bloom
Token ID29955
Feature activation-3.15e-02

Pos logit contributions
 drifted+0.053
acks+0.052
 care+0.047
irmed+0.045
acked+0.045
Neg logit contributions
 sir-0.092
ache-0.091
chard-0.090
ber-0.086
 salon-0.085
Tokened
Token ID276
Feature activation-5.32e-02

Pos logit contributions
acks+0.041
 drifted+0.039
irmed+0.035
acked+0.035
 care+0.035
Neg logit contributions
chard-0.066
ache-0.065
 sir-0.065
ber-0.061
 why-0.061
Token in
Token ID287
Feature activation-6.94e-02

Pos logit contributions
acks+0.084
 drifted+0.079
irmed+0.074
acked+0.073
igg+0.070
Neg logit contributions
chard-0.098
 sir-0.096
ache-0.095
 why-0.093
ber-0.089
Token bright
Token ID6016
Feature activation-1.02e-01

Pos logit contributions
acks+0.118
irmed+0.115
acked+0.108
igg+0.108
 drifted+0.106
Neg logit contributions
 why-0.122
chard-0.116
 sir-0.113
 please-0.101
 yet-0.100
Token colors
Token ID7577
Feature activation-4.36e-02

Pos logit contributions
irmed+0.167
acks+0.165
|+0.156
 drifted+0.155
ez+0.153
Neg logit contributions
 why-0.145
chard-0.129
 sir-0.124
 yet-0.116
ache-0.114
Token -
Token ID532
Feature activation+5.13e-02

Pos logit contributions
acks+0.056
 drifted+0.054
acked+0.049
irmed+0.048
 care+0.048
Neg logit contributions
chard-0.092
 sir-0.092
ache-0.091
ber-0.086
 salon-0.085
Token red
Token ID2266
Feature activation-6.00e-02

Pos logit contributions
ache+0.108
 sir+0.107
chard+0.107
ber+0.101
itor+0.100
Neg logit contributions
 drifted-0.065
acks-0.065
 care-0.058
 They-0.058
acked-0.056
Token,
Token ID11
Feature activation+5.84e-02

Pos logit contributions
acks+0.092
 drifted+0.088
irmed+0.084
acked+0.082
 care+0.080
Neg logit contributions
chard-0.109
ache-0.109
 sir-0.109
 why-0.102
 salon-0.101
Token yellow
Token ID7872
Feature activation-6.08e-02

Pos logit contributions
 sir+0.128
chard+0.127
ache+0.126
ber+0.120
 salon+0.118
Neg logit contributions
acks-0.071
 drifted-0.070
acked-0.061
 care-0.060
irmed-0.060
Token parent
Token ID2560
Feature activation-1.16e-03

Pos logit contributions
ache+0.011
 sir+0.011
chard+0.011
ber+0.010
 why+0.010
Neg logit contributions
acks-0.021
acked-0.019
 drifted-0.019
irmed-0.018
hed-0.018
Token smiles
Token ID21845
Feature activation-6.39e-02

Pos logit contributions
acks+0.001
 drifted+0.001
irmed+0.001
acked+0.001
 Soon+0.001
Neg logit contributions
 sir-0.003
ache-0.003
 why-0.003
 salon-0.003
ber-0.003
Token and
Token ID290
Feature activation-5.45e-02

Pos logit contributions
acks+0.124
irmed+0.108
|+0.108
 drifted+0.107
acked+0.106
Neg logit contributions
ache-0.095
 sir-0.090
 why-0.089
ber-0.083
chard-0.082
Token kisses
Token ID36544
Feature activation-3.87e-02

Pos logit contributions
acks+0.065
acked+0.061
 drifted+0.058
|+0.058
irmed+0.055
Neg logit contributions
 sir-0.120
ache-0.114
 why-0.114
chard-0.110
ber-0.109
Token them
Token ID606
Feature activation-4.26e-02

Pos logit contributions
acks+0.052
 drifted+0.048
acked+0.045
irmed+0.045
igg+0.042
Neg logit contributions
 sir-0.082
ache-0.080
chard-0.079
 why-0.078
ber-0.074
Token both
Token ID1111
Feature activation-1.02e-01

Pos logit contributions
acks+0.085
igg+0.080
 sorts+0.079
acked+0.076
text+0.076
Neg logit contributions
ache-0.056
 sir-0.053
ber-0.051
chard-0.051
 why-0.049
Token.
Token ID13
Feature activation-3.74e-02

Pos logit contributions
acks+0.204
Ar+0.200
acked+0.195
igg+0.194
|+0.194
Neg logit contributions
 sir-0.132
 why-0.131
ache-0.111
ber-0.106
keeper-0.099
Token "
Token ID366
Feature activation+1.34e-02

Pos logit contributions
acks+0.031
 drifted+0.029
acked+0.027
 care+0.026
irmed+0.024
Neg logit contributions
 sir-0.096
chard-0.093
 why-0.092
ache-0.090
ber-0.087
TokenYou
Token ID1639
Feature activation-4.18e-02

Pos logit contributions
chard+0.021
 sir+0.020
ache+0.020
ber+0.020
 salon+0.019
Neg logit contributions
acks-0.025
 drifted-0.022
 breaths-0.022
acked-0.022
irmed-0.022
Token are
Token ID389
Feature activation-1.20e-02

Pos logit contributions
acks+0.070
acked+0.062
 drifted+0.061
irmed+0.059
 Soon+0.055
Neg logit contributions
 sir-0.081
ache-0.078
 why-0.073
chard-0.072
cher-0.068
Token welcome
Token ID7062
Feature activation-8.30e-03

Pos logit contributions
acks+0.020
 drifted+0.020
acked+0.018
irmed+0.018
 care+0.017
Neg logit contributions
chard-0.021
 sir-0.021
ache-0.021
 why-0.019
ber-0.019
Token brothers
Token ID9397
Feature activation-5.46e-02

Pos logit contributions
acks+0.116
 drifted+0.110
acked+0.102
irmed+0.100
 care+0.098
Neg logit contributions
chard-0.164
 sir-0.163
ache-0.162
ber-0.153
 why-0.152
Token.
Token ID13
Feature activation-3.77e-02

Pos logit contributions
acks+0.071
 drifted+0.070
acked+0.062
 care+0.062
irmed+0.061
Neg logit contributions
chard-0.114
 sir-0.113
ache-0.111
ber-0.105
 salon-0.105
Token They
Token ID1119
Feature activation-5.85e-02

Pos logit contributions
acks+0.039
 drifted+0.038
acked+0.035
irmed+0.034
 care+0.033
Neg logit contributions
 sir-0.089
chard-0.087
 why-0.084
ache-0.084
ber-0.082
Token were
Token ID547
Feature activation-2.58e-02

Pos logit contributions
acks+0.076
 drifted+0.071
acked+0.065
irmed+0.064
 care+0.063
Neg logit contributions
chard-0.125
 sir-0.124
ache-0.123
ber-0.116
 why-0.115
Token all
Token ID477
Feature activation-7.66e-02

Pos logit contributions
acks+0.037
 drifted+0.037
 care+0.034
acked+0.033
 They+0.033
Neg logit contributions
ache-0.049
 sir-0.049
chard-0.048
ber-0.045
 salon-0.045
Token brown
Token ID7586
Feature activation-1.02e-01

Pos logit contributions
acks+0.129
 drifted+0.123
acked+0.115
irmed+0.114
 care+0.112
Neg logit contributions
chard-0.135
 sir-0.133
ache-0.131
ber-0.123
 why-0.122
Token and
Token ID290
Feature activation-4.29e-02

Pos logit contributions
acks+0.151
 drifted+0.145
acked+0.133
irmed+0.131
 care+0.129
Neg logit contributions
chard-0.195
 sir-0.193
ache-0.192
ber-0.181
 why-0.179
Token furry
Token ID40857
Feature activation-8.68e-02

Pos logit contributions
acks+0.064
 drifted+0.059
acked+0.057
irmed+0.055
igg+0.053
Neg logit contributions
chard-0.086
 sir-0.083
ache-0.081
 why-0.078
ber-0.077
Token.
Token ID13
Feature activation-4.00e-02

Pos logit contributions
acks+0.129
 drifted+0.123
acked+0.113
irmed+0.111
 care+0.108
Neg logit contributions
chard-0.166
 sir-0.165
ache-0.164
ber-0.156
 why-0.155
Token They
Token ID1119
Feature activation-6.21e-02

Pos logit contributions
acks+0.043
 drifted+0.041
acked+0.038
irmed+0.037
 care+0.035
Neg logit contributions
 sir-0.094
chard-0.091
ache-0.089
 why-0.088
ber-0.087
Token looked
Token ID3114
Feature activation-1.94e-02

Pos logit contributions
acks+0.083
 drifted+0.078
acked+0.072
irmed+0.071
 care+0.070
Neg logit contributions
chard-0.130
 sir-0.129
ache-0.127
ber-0.121
 why-0.120
Token with
Token ID351
Feature activation-4.75e-02

Pos logit contributions
acks+0.008
 drifted+0.008
irmed+0.008
 care+0.007
acked+0.007
Neg logit contributions
chard-0.006
 sir-0.006
ache-0.006
ber-0.006
 why-0.006
Token water
Token ID1660
Feature activation-1.53e-02

Pos logit contributions
acks+0.057
 drifted+0.051
acked+0.048
irmed+0.046
 Soon+0.042
Neg logit contributions
chard-0.110
 sir-0.107
ache-0.103
 why-0.101
ber-0.098
Token and
Token ID290
Feature activation-6.56e-02

Pos logit contributions
acks+0.020
 drifted+0.019
acked+0.017
 care+0.017
irmed+0.017
Neg logit contributions
chard-0.033
 sir-0.032
ache-0.032
ber-0.030
 salon-0.030
Token spl
Token ID4328
Feature activation-6.15e-02

Pos logit contributions
acks+0.079
 drifted+0.073
acked+0.067
irmed+0.065
 care+0.064
Neg logit contributions
chard-0.147
 sir-0.145
ache-0.143
ber-0.136
 why-0.136
Tokenashed
Token ID5263
Feature activation-6.62e-02

Pos logit contributions
 drifted+0.088
 harmony+0.080
 ease+0.079
 drift+0.077
 care+0.076
Neg logit contributions
ache-0.112
chard-0.110
ber-0.107
 sir-0.101
 why-0.097
Token each
Token ID1123
Feature activation-1.02e-01

Pos logit contributions
acks+0.099
 drifted+0.098
irmed+0.086
 Soon+0.085
acked+0.084
Neg logit contributions
chard-0.120
 sir-0.117
 why-0.117
ache-0.113
ber-0.109
Token other
Token ID584
Feature activation-4.96e-02

Pos logit contributions
 drifted+0.233
acks+0.222
irmed+0.217
igg+0.211
istence+0.206
Neg logit contributions
chard-0.087
 why-0.085
 worker-0.084
 else-0.082
ber-0.081
Token.
Token ID13
Feature activation-4.38e-02

Pos logit contributions
acks+0.078
 drifted+0.077
irmed+0.071
acked+0.070
igg+0.069
Neg logit contributions
chard-0.091
ache-0.088
 sir-0.087
ber-0.085
 why-0.081
Token They
Token ID1119
Feature activation-7.16e-02

Pos logit contributions
acked+0.042
acks+0.042
hed+0.040
 drifted+0.039
 sorts+0.038
Neg logit contributions
 sir-0.102
chard-0.097
 why-0.095
 dentist-0.090
 principal-0.090
Token fed
Token ID11672
Feature activation+2.87e-04

Pos logit contributions
acks+0.114
acked+0.099
 drifted+0.096
irmed+0.095
 Soon+0.093
Neg logit contributions
chard-0.141
ache-0.134
 sir-0.133
 salon-0.128
ber-0.128
Token the
Token ID262
Feature activation+3.61e-03

Pos logit contributions
chard+0.001
 sir+0.001
ache+0.001
ber+0.001
 why+0.001
Neg logit contributions
acks-0.000
 drifted-0.000
acked-0.000
irmed-0.000
 care-0.000
Token with
Token ID351
Feature activation-5.03e-02

Pos logit contributions
chard+0.002
 sir+0.002
ache+0.002
ber+0.002
 salon+0.002
Neg logit contributions
acks-0.001
 drifted-0.001
acked-0.001
irmed-0.001
 care-0.001
Token delicious
Token ID12625
Feature activation-4.63e-02

Pos logit contributions
acks+0.070
 drifted+0.063
acked+0.061
irmed+0.060
igg+0.055
Neg logit contributions
chard-0.105
 sir-0.103
ache-0.102
 why-0.100
ber-0.093
Token potatoes
Token ID18821
Feature activation-1.81e-02

Pos logit contributions
acks+0.077
 drifted+0.074
acked+0.071
irmed+0.070
|+0.070
Neg logit contributions
ache-0.077
 sir-0.076
chard-0.076
 why-0.074
ber-0.067
Token all
Token ID477
Feature activation-8.56e-02

Pos logit contributions
acks+0.024
 drifted+0.023
 care+0.021
acked+0.020
 They+0.020
Neg logit contributions
chard-0.037
 sir-0.037
ache-0.036
 salon-0.035
ber-0.035
Token year
Token ID614
Feature activation-4.91e-02

Pos logit contributions
acks+0.137
 drifted+0.132
irmed+0.127
text+0.122
igg+0.120
Neg logit contributions
 why-0.153
chard-0.151
 sir-0.143
ache-0.139
ber-0.135
Token long
Token ID890
Feature activation-1.01e-01

Pos logit contributions
acks+0.075
 drifted+0.071
acked+0.066
 care+0.065
irmed+0.064
Neg logit contributions
chard-0.094
 sir-0.093
ache-0.091
 salon-0.086
ber-0.086
Token.
Token ID13
Feature activation-1.42e-02

Pos logit contributions
acks+0.138
 drifted+0.130
acked+0.120
irmed+0.118
 care+0.116
Neg logit contributions
chard-0.208
 sir-0.206
ache-0.205
ber-0.193
 why-0.192
Token\n
Token ID198
Feature activation-1.51e-02

Pos logit contributions
irmed+0.025
 breaths+0.024
 drifted+0.023
irted+0.023
acks+0.023
Neg logit contributions
 sir-0.025
 why-0.023
 dentist-0.022
ber-0.021
 lawyer-0.021
Token\n
Token ID198
Feature activation-7.62e-03

Pos logit contributions
acks+0.021
 drifted+0.021
irmed+0.020
acked+0.019
 care+0.019
Neg logit contributions
chard-0.029
 sir-0.029
ache-0.029
 why-0.028
ber-0.027
TokenThe
Token ID464
Feature activation-8.07e-03

Pos logit contributions
acks+0.013
 drifted+0.013
irmed+0.013
 breaths+0.012
acked+0.012
Neg logit contributions
ache-0.013
chard-0.013
 sir-0.012
 why-0.012
ber-0.012
Token end
Token ID886
Feature activation+2.61e-02

Pos logit contributions
acks+0.016
 drifted+0.015
acked+0.015
irmed+0.014
 care+0.014
Neg logit contributions
chard-0.012
 sir-0.012
ache-0.011
ber-0.011
 salon-0.010
Token and
Token ID290
Feature activation-4.16e-02

Pos logit contributions
acks+0.058
 drifted+0.056
acked+0.051
 care+0.051
irmed+0.051
Neg logit contributions
chard-0.078
 sir-0.077
ache-0.077
 salon-0.073
ber-0.072
Token laughed
Token ID13818
Feature activation-4.53e-02

Pos logit contributions
acks+0.061
 drifted+0.054
acked+0.053
irmed+0.050
igg+0.050
Neg logit contributions
chard-0.085
 sir-0.081
ache-0.080
 why-0.077
ber-0.076
Token as
Token ID355
Feature activation-2.08e-02

Pos logit contributions
acks+0.067
 drifted+0.064
acked+0.059
irmed+0.058
 care+0.058
Neg logit contributions
chard-0.088
 sir-0.088
ache-0.087
ber-0.081
 salon-0.081
Token it
Token ID340
Feature activation-4.58e-02

Pos logit contributions
acks+0.030
 drifted+0.030
 care+0.027
 They+0.026
acked+0.026
Neg logit contributions
ache-0.040
 sir-0.040
chard-0.040
 salon-0.038
ber-0.037
Token fl
Token ID781
Feature activation-5.37e-02

Pos logit contributions
acks+0.062
 care+0.054
 They+0.052
 drifted+0.052
 Soon+0.051
Neg logit contributions
ache-0.092
chard-0.091
ber-0.087
 sir-0.086
 salon-0.086
Tokenuttered
Token ID46322
Feature activation-1.01e-01

Pos logit contributions
acks+0.072
 drifted+0.069
 care+0.066
 turns+0.061
 They+0.060
Neg logit contributions
 sir-0.109
chard-0.108
ache-0.107
ber-0.104
 salon-0.097
Token and
Token ID290
Feature activation-5.48e-02

Pos logit contributions
acks+0.180
 drifted+0.169
text+0.160
igg+0.154
|+0.153
Neg logit contributions
chard-0.159
 why-0.153
 sir-0.147
ache-0.145
keeper-0.134
Token waved
Token ID26834
Feature activation-5.65e-02

Pos logit contributions
acks+0.073
 drifted+0.068
acked+0.063
irmed+0.062
 care+0.061
Neg logit contributions
chard-0.116
 sir-0.114
ache-0.113
ber-0.107
 why-0.106
Token in
Token ID287
Feature activation-6.24e-02

Pos logit contributions
acks+0.095
 drifted+0.089
acked+0.085
irmed+0.081
igg+0.079
Neg logit contributions
chard-0.099
 sir-0.098
ache-0.095
 why-0.093
ber-0.090
Token the
Token ID262
Feature activation-4.60e-02

Pos logit contributions
acks+0.092
 drifted+0.085
acked+0.081
irmed+0.079
igg+0.078
Neg logit contributions
chard-0.122
 sir-0.120
 why-0.119
ache-0.109
 salon-0.108
Token wind
Token ID2344
Feature activation-4.86e-02

Pos logit contributions
acks+0.060
 drifted+0.053
acked+0.051
text+0.048
igg+0.048
Neg logit contributions
 sir-0.098
chard-0.098
ache-0.093
 salon-0.091
 why-0.089
Token the
Token ID262
Feature activation-5.73e-02

Pos logit contributions
acks+0.115
 drifted+0.111
irmed+0.109
acked+0.108
igg+0.103
Neg logit contributions
chard-0.144
 sir-0.138
 why-0.135
 salon-0.127
ber-0.125
Token sand
Token ID6450
Feature activation-1.48e-02

Pos logit contributions
acks+0.108
 drifted+0.104
irmed+0.100
 breaths+0.096
text+0.096
Neg logit contributions
chard-0.089
 sir-0.083
 salon-0.076
 lawyer-0.075
ache-0.075
Token and
Token ID290
Feature activation-5.41e-02

Pos logit contributions
acks+0.021
 drifted+0.020
irmed+0.019
acked+0.019
 care+0.018
Neg logit contributions
 sir-0.029
chard-0.029
ache-0.028
ber-0.027
 salon-0.027
Token bat
Token ID7365
Feature activation-3.94e-02

Pos logit contributions
acks+0.072
 drifted+0.068
acked+0.062
irmed+0.062
 care+0.060
Neg logit contributions
chard-0.114
 sir-0.112
ache-0.111
ber-0.106
 why-0.105
Tokenhed
Token ID704
Feature activation-6.97e-02

Pos logit contributions
 drifted+0.037
acks+0.036
irmed+0.035
 imag+0.035
irt+0.034
Neg logit contributions
ache-0.085
ber-0.081
chard-0.081
 salon-0.081
 sir-0.080
Token all
Token ID477
Feature activation-1.01e-01

Pos logit contributions
acks+0.112
 drifted+0.108
acked+0.100
irmed+0.099
igg+0.097
Neg logit contributions
chard-0.119
 sir-0.117
ache-0.116
 why-0.114
ber-0.112
Token afternoon
Token ID6672
Feature activation-6.02e-02

Pos logit contributions
 drifted+0.176
irmed+0.167
acks+0.166
igg+0.163
 imag+0.158
Neg logit contributions
 why-0.151
chard-0.130
ber-0.124
 sir-0.121
 yet-0.118
Token.
Token ID13
Feature activation-2.81e-02

Pos logit contributions
acks+0.090
 drifted+0.084
irmed+0.082
acked+0.082
 care+0.077
Neg logit contributions
chard-0.113
ache-0.113
 sir-0.110
ber-0.104
 salon-0.102
Token When
Token ID1649
Feature activation-5.06e-03

Pos logit contributions
 breaths+0.043
igg+0.043
 drifted+0.042
irmed+0.042
 care+0.039
Neg logit contributions
 sir-0.049
chard-0.047
 why-0.045
 principal-0.043
ber-0.043
Token the
Token ID262
Feature activation-6.88e-03

Pos logit contributions
acks+0.009
acked+0.009
irmed+0.009
igg+0.009
 drifted+0.008
Neg logit contributions
 sir-0.008
chard-0.007
 why-0.007
 dentist-0.007
ber-0.007
Token sun
Token ID4252
Feature activation-2.71e-02

Pos logit contributions
acks+0.010
 drifted+0.010
acked+0.009
irmed+0.009
 care+0.009
Neg logit contributions
 sir-0.014
chard-0.014
ache-0.013
ber-0.013
 salon-0.013
Token<|endoftext|>
Token ID50256
Feature activation-4.79e-02

Pos logit contributions
acks+0.011
 drifted+0.011
irmed+0.009
acked+0.009
igg+0.009
Neg logit contributions
 sir-0.017
chard-0.016
 why-0.016
ache-0.015
ber-0.015
TokenOnce
Token ID7454
Feature activation-6.03e-03

Pos logit contributions
acks+0.059
 drifted+0.057
irmed+0.051
 care+0.050
acked+0.050
Neg logit contributions
chard-0.103
 sir-0.102
ache-0.100
ber-0.096
 why-0.095
Token upon
Token ID2402
Feature activation-2.15e-02

Pos logit contributions
acks+0.011
 drifted+0.010
acked+0.010
 care+0.010
irmed+0.010
Neg logit contributions
chard-0.010
ache-0.010
 sir-0.010
ber-0.009
 salon-0.009
Token a
Token ID257
Feature activation-4.36e-03

Pos logit contributions
acks+0.031
 drifted+0.029
acked+0.028
irmed+0.027
igg+0.025
Neg logit contributions
chard-0.043
 sir-0.042
ache-0.041
 why-0.040
ber-0.039
Token time
Token ID640
Feature activation-7.60e-03

Pos logit contributions
acks+0.005
 drifted+0.005
acked+0.005
irmed+0.004
hed+0.004
Neg logit contributions
 sir-0.009
ache-0.009
 why-0.009
chard-0.009
ber-0.009
Token,
Token ID11
Feature activation+8.58e-02

Pos logit contributions
acks+0.009
 They+0.008
 drifted+0.008
text+0.008
 care+0.007
Neg logit contributions
ache-0.018
chard-0.017
itor-0.016
 salon-0.016
 sir-0.016
Token there
Token ID612
Feature activation+1.07e-03

Pos logit contributions
ache+0.117
 sir+0.109
chard+0.104
 salon+0.102
itor+0.101
Neg logit contributions
acks-0.178
 drifted-0.175
 They-0.174
acked-0.170
irmed-0.167
Token was
Token ID373
Feature activation+1.12e-02

Pos logit contributions
chard+0.002
 sir+0.002
ache+0.002
ber+0.002
 salon+0.002
Neg logit contributions
acks-0.001
 drifted-0.001
acked-0.001
irmed-0.001
 care-0.001
Token a
Token ID257
Feature activation+2.32e-02

Pos logit contributions
chard+0.019
 sir+0.017
 why+0.016
 oh+0.016
 dentist+0.016
Neg logit contributions
acks-0.021
 drifted-0.019
irmed-0.019
acked-0.018
 care-0.017
Token bad
Token ID2089
Feature activation+2.13e-02

Pos logit contributions
chard+0.048
ache+0.048
 sir+0.046
 salon+0.045
 why+0.045
Neg logit contributions
 drifted-0.027
acks-0.026
 They-0.025
 care-0.025
 lot-0.024
Token dust
Token ID8977
Feature activation+3.52e-02

Pos logit contributions
chard+0.035
 sir+0.034
ache+0.034
ber+0.031
 salon+0.031
Neg logit contributions
acks-0.038
 drifted-0.037
 care-0.034
acked-0.033
irmed-0.033
Token<|endoftext|>
Token ID50256
Feature activation-5.20e-02

Pos logit contributions
irmed+0.045
 drifted+0.045
 breaths+0.044
acks+0.043
les+0.042
Neg logit contributions
 sir-0.036
 why-0.033
ber-0.030
chard-0.029
keeper-0.028
TokenOnce
Token ID7454
Feature activation-1.13e-02

Pos logit contributions
acks+0.071
 drifted+0.069
irmed+0.065
 care+0.063
acked+0.061
Neg logit contributions
chard-0.103
 sir-0.103
ache-0.098
ber-0.095
 why-0.095
Token upon
Token ID2402
Feature activation-2.32e-02

Pos logit contributions
acks+0.021
 drifted+0.020
acked+0.019
 care+0.018
irmed+0.018
Neg logit contributions
chard-0.018
ache-0.018
 sir-0.018
ber-0.016
 salon-0.016
Token a
Token ID257
Feature activation-1.67e-03

Pos logit contributions
acks+0.036
 drifted+0.034
acked+0.033
irmed+0.033
igg+0.030
Neg logit contributions
chard-0.044
 sir-0.042
ache-0.042
 why-0.041
ber-0.039
Token time
Token ID640
Feature activation-6.56e-03

Pos logit contributions
acks+0.002
 drifted+0.002
acked+0.002
irmed+0.002
 breaths+0.002
Neg logit contributions
 sir-0.003
ache-0.003
 why-0.003
chard-0.003
ber-0.003
Token,
Token ID11
Feature activation+8.60e-02

Pos logit contributions
acks+0.009
 drifted+0.008
 They+0.008
text+0.007
 care+0.007
Neg logit contributions
ache-0.014
chard-0.014
 sir-0.013
 salon-0.013
itor-0.013
Token a
Token ID257
Feature activation+2.34e-02

Pos logit contributions
ache+0.119
 sir+0.112
chard+0.107
 salon+0.106
itor+0.102
Neg logit contributions
acks-0.177
 drifted-0.174
 They-0.170
acked-0.169
irmed-0.165
Token little
Token ID1310
Feature activation-8.37e-03

Pos logit contributions
chard+0.043
 sir+0.042
ache+0.042
 salon+0.039
ber+0.039
Neg logit contributions
acks-0.037
 drifted-0.036
acked-0.033
irmed-0.033
 care-0.033
Token boy
Token ID2933
Feature activation+1.30e-02

Pos logit contributions
acks+0.017
 drifted+0.017
irmed+0.016
acked+0.015
 ease+0.015
Neg logit contributions
 sir-0.012
chard-0.012
ache-0.011
ber-0.011
 lawyer-0.010
Token named
Token ID3706
Feature activation-1.48e-03

Pos logit contributions
chard+0.018
 sir+0.018
ache+0.017
 why+0.017
ber+0.017
Neg logit contributions
acks-0.026
 drifted-0.025
irmed-0.024
acked-0.023
igg-0.022
Token Tim
Token ID5045
Feature activation+5.68e-03

Pos logit contributions
acks+0.002
 drifted+0.002
acked+0.002
 care+0.002
 They+0.002
Neg logit contributions
ache-0.003
chard-0.003
 sir-0.003
 salon-0.003
ber-0.003
Token\n
Token ID198
Feature activation-1.58e-02

Pos logit contributions
acks+0.031
 drifted+0.030
acked+0.027
irmed+0.026
 care+0.026
Neg logit contributions
chard-0.057
 sir-0.057
ache-0.056
ber-0.053
 why-0.053
Token"
Token ID1
Feature activation+6.52e-02

Pos logit contributions
acks+0.029
 drifted+0.028
acked+0.026
irmed+0.026
 care+0.025
Neg logit contributions
chard-0.025
 sir-0.025
ache-0.025
ber-0.023
 salon-0.023
TokenLook
Token ID8567
Feature activation+1.09e-02

Pos logit contributions
 sir+0.129
chard+0.129
ache+0.127
ber+0.119
 why+0.119
Neg logit contributions
acks-0.095
 drifted-0.092
acked-0.085
 care-0.083
irmed-0.083
Token,
Token ID11
Feature activation+9.63e-02

Pos logit contributions
chard+0.025
ache+0.025
 salon+0.023
belt+0.023
ber+0.023
Neg logit contributions
acks-0.012
 hands-0.011
 care-0.011
 turns-0.011
irt-0.010
Token Leo
Token ID19632
Feature activation-6.43e-03

Pos logit contributions
ache+0.233
chard+0.222
belt+0.219
ber+0.210
itor+0.209
Neg logit contributions
 They-0.098
 their-0.089
 turns-0.088
 care-0.087
 hands-0.080
Token,
Token ID11
Feature activation+1.01e-01

Pos logit contributions
acks+0.010
 drifted+0.010
 care+0.009
acked+0.009
irmed+0.009
Neg logit contributions
chard-0.012
 sir-0.012
ache-0.012
 salon-0.011
ber-0.011
Token there
Token ID612
Feature activation+1.68e-03

Pos logit contributions
ache+0.219
belt+0.216
chard+0.212
itor+0.209
ber+0.207
Neg logit contributions
 care-0.118
 They-0.117
 turns-0.115
 breaks-0.105
 their-0.104
Token is
Token ID318
Feature activation+2.06e-02

Pos logit contributions
 sir+0.004
chard+0.004
ache+0.004
 why+0.004
 salon+0.004
Neg logit contributions
acks-0.002
 drifted-0.002
irmed-0.002
acked-0.002
 care-0.002
Token a
Token ID257
Feature activation+3.24e-02

Pos logit contributions
chard+0.045
 sir+0.045
ache+0.044
ber+0.042
 salon+0.042
Neg logit contributions
acks-0.026
 drifted-0.024
acked-0.022
 care-0.022
irmed-0.022
Token peach
Token ID47565
Feature activation+3.54e-02

Pos logit contributions
ache+0.062
ver+0.061
itor+0.061
 sir+0.060
 why+0.060
Neg logit contributions
 lot-0.042
 care-0.039
 night-0.037
 drifted-0.037
acks-0.036
Token for
Token ID329
Feature activation-3.02e-03

Pos logit contributions
 sir+0.073
chard+0.071
itor+0.071
ache+0.067
 principal+0.067
Neg logit contributions
acks-0.046
 drifted-0.044
acked-0.041
 air-0.041
 breaks-0.039
Token It
Token ID632
Feature activation-5.00e-03

Pos logit contributions
ache+0.002
 sir+0.002
chard+0.002
ber+0.002
 salon+0.002
Neg logit contributions
acks-0.001
 drifted-0.001
 They-0.001
acked-0.001
 care-0.001
Token was
Token ID373
Feature activation+2.70e-03

Pos logit contributions
acks+0.008
 drifted+0.007
acked+0.007
irmed+0.007
 care+0.006
Neg logit contributions
chard-0.010
ache-0.009
 sir-0.009
 why-0.009
ber-0.009
Token a
Token ID257
Feature activation+2.26e-02

Pos logit contributions
 sir+0.006
ber+0.006
ache+0.006
chard+0.006
itor+0.006
Neg logit contributions
 drifted-0.003
acks-0.003
 care-0.003
 They-0.003
irmed-0.003
Token beautiful
Token ID4950
Feature activation-2.15e-02

Pos logit contributions
 sir+0.050
 why+0.050
chard+0.050
ber+0.050
itor+0.049
Neg logit contributions
 lot-0.023
 drifted-0.021
 care-0.020
 night-0.019
 air-0.018
Token doll
Token ID3654
Feature activation+9.37e-03

Pos logit contributions
acks+0.029
 drifted+0.028
 care+0.026
acked+0.026
irmed+0.025
Neg logit contributions
chard-0.043
 sir-0.043
ache-0.042
ber-0.041
 why-0.040
Token,
Token ID11
Feature activation+7.19e-02

Pos logit contributions
 sir+0.019
chard+0.019
ache+0.017
ber+0.017
 why+0.017
Neg logit contributions
 drifted-0.012
acks-0.012
 They-0.011
 care-0.011
hed-0.011
Token but
Token ID475
Feature activation+2.51e-02

Pos logit contributions
 sir+0.113
ache+0.110
chard+0.108
ber+0.106
itor+0.100
Neg logit contributions
 drifted-0.131
acks-0.130
 care-0.120
 They-0.120
irmed-0.117
Token its
Token ID663
Feature activation+2.73e-03

Pos logit contributions
ache+0.059
 sir+0.058
ber+0.054
chard+0.054
itor+0.053
Neg logit contributions
 drifted-0.029
acks-0.026
 care-0.026
irmed-0.025
 They-0.025
Token arm
Token ID3211
Feature activation+2.51e-03

Pos logit contributions
 sir+0.006
chard+0.006
ache+0.005
ber+0.005
 why+0.005
Neg logit contributions
 drifted-0.004
acks-0.003
 care-0.003
irmed-0.003
 hands-0.003
Token was
Token ID373
Feature activation+6.89e-03

Pos logit contributions
 sir+0.005
chard+0.005
ache+0.005
ber+0.005
 why+0.005
Neg logit contributions
acks-0.003
 drifted-0.003
acked-0.003
 care-0.003
irmed-0.003
Token missing
Token ID4814
Feature activation+2.00e-02

Pos logit contributions
 sir+0.015
ber+0.013
itor+0.013
ache+0.013
chard+0.013
Neg logit contributions
 drifted-0.009
 care-0.008
irmed-0.008
acks-0.008
acked-0.007
Token a
Token ID257
Feature activation+3.61e-02

Pos logit contributions
chard+0.037
 sir+0.037
ache+0.036
ber+0.035
 salon+0.034
Neg logit contributions
acks-0.018
 drifted-0.017
 care-0.016
acked-0.015
irmed-0.015
Token moment
Token ID2589
Feature activation+4.71e-03

Pos logit contributions
chard+0.085
 sir+0.083
ache+0.082
ber+0.080
itor+0.079
Neg logit contributions
acks-0.037
 care-0.037
 drifted-0.037
 turns-0.032
 They-0.030
Token and
Token ID290
Feature activation-2.12e-02

Pos logit contributions
chard+0.013
itor+0.013
 sir+0.013
ache+0.012
 salon+0.012
Neg logit contributions
 drifted-0.003
 care-0.003
 night-0.003
 They-0.002
acks-0.002
Token then
Token ID788
Feature activation-5.95e-03

Pos logit contributions
 drifted+0.029
acks+0.026
 care+0.025
irmed+0.024
 turns+0.024
Neg logit contributions
ache-0.043
 sir-0.043
chard-0.041
 salon-0.040
ber-0.040
Token answered
Token ID9373
Feature activation+2.41e-02

Pos logit contributions
 drifted+0.007
acks+0.007
 care+0.006
irmed+0.006
 turns+0.006
Neg logit contributions
 sir-0.013
chard-0.013
ache-0.013
 salon-0.012
itor-0.012
Token,
Token ID11
Feature activation+8.86e-02

Pos logit contributions
ache+0.058
chard+0.057
 sir+0.057
 salon+0.056
itor+0.054
Neg logit contributions
acks-0.022
 drifted-0.021
 care-0.021
 They-0.019
 hands-0.019
Token "
Token ID366
Feature activation+5.34e-02

Pos logit contributions
ache+0.131
 sir+0.130
chard+0.124
ber+0.123
 salon+0.118
Neg logit contributions
 drifted-0.167
acks-0.164
 They-0.157
 Soon-0.152
acked-0.148
TokenPink
Token ID41912
Feature activation-2.41e-04

Pos logit contributions
 sir+0.097
chard+0.096
ache+0.094
 why+0.088
itor+0.087
Neg logit contributions
acks-0.087
 drifted-0.084
acked-0.079
 care-0.076
irmed-0.076
Token!"
Token ID2474
Feature activation+3.29e-05

Pos logit contributions
acks+0.000
 drifted+0.000
acked+0.000
irmed+0.000
 care+0.000
Neg logit contributions
chard-0.000
 sir-0.000
ache-0.000
ber-0.000
 salon-0.000
Token Mom
Token ID11254
Feature activation+2.26e-02

Pos logit contributions
chard+0.000
 sir+0.000
ache+0.000
 why+0.000
 salon+0.000
Neg logit contributions
acks-0.000
 drifted-0.000
acked-0.000
irmed-0.000
 care-0.000
Token smiled
Token ID13541
Feature activation-1.32e-02

Pos logit contributions
chard+0.046
 sir+0.045
ache+0.043
 salon+0.041
itor+0.041
Neg logit contributions
 drifted-0.032
acks-0.031
acked-0.029
 care-0.028
hed-0.027
Token They
Token ID1119
Feature activation-2.41e-02

Pos logit contributions
 drifted+0.007
acks+0.007
irmed+0.007
igg+0.007
les+0.006
Neg logit contributions
 sir-0.016
 why-0.016
chard-0.015
itor-0.015
 Lee-0.014
Token decided
Token ID3066
Feature activation+4.78e-02

Pos logit contributions
acks+0.051
 drifted+0.044
irmed+0.043
 breaths+0.042
acked+0.041
Neg logit contributions
chard-0.038
 why-0.035
 sir-0.032
ache-0.031
 please-0.029
Token to
Token ID284
Feature activation+5.23e-02

Pos logit contributions
ache+0.088
 sir+0.086
chard+0.085
ber+0.083
 salon+0.082
Neg logit contributions
 drifted-0.070
 care-0.070
 They-0.069
 their-0.068
 turns-0.066
Token go
Token ID467
Feature activation+2.48e-02

Pos logit contributions
 sir+0.102
chard+0.102
ache+0.099
 why+0.095
 salon+0.095
Neg logit contributions
acks-0.075
 drifted-0.073
irmed-0.068
acked-0.067
igg-0.061
Token on
Token ID319
Feature activation-4.13e-02

Pos logit contributions
chard+0.032
 sir+0.031
ache+0.029
 why+0.029
ber+0.028
Neg logit contributions
acks-0.051
 drifted-0.051
acked-0.048
irmed-0.048
text-0.046
Token an
Token ID281
Feature activation+3.59e-02

Pos logit contributions
acks+0.076
acked+0.074
 drifted+0.071
irmed+0.070
hed+0.066
Neg logit contributions
chard-0.067
 sir-0.064
 why-0.059
 dentist-0.058
 salon-0.058
Token adventure
Token ID8855
Feature activation+5.93e-02

Pos logit contributions
chard+0.069
 sir+0.069
ache+0.067
 why+0.064
 salon+0.064
Neg logit contributions
acks-0.054
 drifted-0.052
acked-0.048
irmed-0.047
 care-0.046
Token.
Token ID13
Feature activation-5.79e-04

Pos logit contributions
chard+0.099
 sir+0.097
ache+0.093
ber+0.089
 why+0.089
Neg logit contributions
acks-0.105
 drifted-0.101
irmed-0.096
acked-0.095
 care-0.091
Token The
Token ID383
Feature activation+5.56e-03

Pos logit contributions
 drifted+0.001
acks+0.001
irmed+0.001
igg+0.001
acked+0.001
Neg logit contributions
 sir-0.001
chard-0.001
 why-0.001
ber-0.001
itor-0.001
Token two
Token ID734
Feature activation-2.76e-02

Pos logit contributions
chard+0.013
ache+0.013
 sir+0.012
ber+0.012
 why+0.012
Neg logit contributions
 drifted-0.006
acks-0.006
 care-0.005
 They-0.005
acked-0.005
Token friends
Token ID2460
Feature activation+1.25e-02

Pos logit contributions
acks+0.044
 drifted+0.041
acked+0.038
irmed+0.037
 care+0.037
Neg logit contributions
 sir-0.052
chard-0.050
ache-0.050
 why-0.048
ber-0.048
Token inside
Token ID2641
Feature activation+2.06e-02

Pos logit contributions
 sir+0.053
ache+0.051
chard+0.050
 salon+0.048
ber+0.047
Neg logit contributions
acks-0.060
 drifted-0.059
 care-0.058
acked-0.055
irmed-0.054
Token.
Token ID13
Feature activation-1.08e-02

Pos logit contributions
ache+0.046
chard+0.046
 sir+0.045
 salon+0.043
ber+0.043
Neg logit contributions
acks-0.024
 drifted-0.023
 care-0.021
 They-0.020
acked-0.020
Token\n
Token ID198
Feature activation-1.47e-02

Pos logit contributions
acks+0.012
 drifted+0.012
acked+0.010
irmed+0.010
 care+0.010
Neg logit contributions
chard-0.025
 sir-0.025
ache-0.024
ber-0.023
 why-0.023
Token\n
Token ID198
Feature activation-6.92e-03

Pos logit contributions
acks+0.017
 drifted+0.016
acked+0.015
irmed+0.014
 care+0.014
Neg logit contributions
chard-0.033
 sir-0.033
ache-0.032
ber-0.031
 salon-0.031
Token"
Token ID1
Feature activation+6.42e-02

Pos logit contributions
acks+0.013
 drifted+0.013
acked+0.012
irmed+0.012
 care+0.012
Neg logit contributions
chard-0.011
 sir-0.010
ache-0.010
ber-0.009
 salon-0.009
TokenLet
Token ID5756
Feature activation+1.73e-02

Pos logit contributions
ache+0.151
 sir+0.148
chard+0.139
 salon+0.137
 why+0.136
Neg logit contributions
 They-0.072
acked-0.071
acks-0.068
 drifted-0.067
 turns-0.060
Token's
Token ID338
Feature activation+4.25e-02

Pos logit contributions
 sir+0.035
ache+0.035
chard+0.035
 why+0.033
ber+0.032
Neg logit contributions
acks-0.024
 drifted-0.023
irmed-0.021
acked-0.021
 care-0.020
Token open
Token ID1280
Feature activation+2.32e-02

Pos logit contributions
ache+0.089
chard+0.089
 sir+0.088
ber+0.085
 salon+0.083
Neg logit contributions
acks-0.056
 drifted-0.053
 care-0.051
 They-0.048
acked-0.047
Token it
Token ID340
Feature activation+5.65e-03

Pos logit contributions
chard+0.047
 sir+0.047
ache+0.047
ber+0.044
 salon+0.044
Neg logit contributions
acks-0.032
 drifted-0.031
acked-0.028
irmed-0.028
 care-0.027
Token!"
Token ID2474
Feature activation-1.64e-02

Pos logit contributions
ache+0.013
chard+0.013
 sir+0.012
ber+0.012
 salon+0.012
Neg logit contributions
acks-0.007
 They-0.006
acked-0.006
 Soon-0.006
hed-0.005
Token Ben
Token ID3932
Feature activation-5.49e-03

Pos logit contributions
acks+0.032
acked+0.030
 drifted+0.030
bons+0.030
bit+0.029
Neg logit contributions
 sir-0.025
chard-0.023
 why-0.023
 oh-0.021
 salon-0.021
Token the
Token ID262
Feature activation-8.97e-04

Pos logit contributions
chard+0.041
 sir+0.041
ache+0.040
ber+0.038
 salon+0.038
Neg logit contributions
acks-0.021
 drifted-0.020
acked-0.018
 care-0.017
irmed-0.017
Token stable
Token ID8245
Feature activation+2.63e-02

Pos logit contributions
acks+0.001
 drifted+0.001
 care+0.001
 air+0.001
 hearts+0.001
Neg logit contributions
ache-0.002
chard-0.002
 sir-0.002
 why-0.002
ber-0.002
Token door
Token ID3420
Feature activation+1.42e-02

Pos logit contributions
ache+0.062
chard+0.062
 sir+0.060
 why+0.058
 oh+0.057
Neg logit contributions
acks-0.025
 drifted-0.023
 They-0.023
 care-0.021
acked-0.020
Token.
Token ID13
Feature activation-1.58e-02

Pos logit contributions
chard+0.036
ache+0.035
 sir+0.035
itor+0.033
ber+0.033
Neg logit contributions
acks-0.013
 drifted-0.012
 They-0.011
 care-0.011
acked-0.010
Token Ben
Token ID3932
Feature activation-2.65e-02

Pos logit contributions
acks+0.019
 drifted+0.018
acked+0.016
irmed+0.016
 care+0.015
Neg logit contributions
chard-0.036
 sir-0.035
ache-0.035
ber-0.033
 salon-0.033
Token saw
Token ID2497
Feature activation+4.80e-02

Pos logit contributions
 drifted+0.036
acks+0.035
irmed+0.031
acked+0.031
 care+0.031
Neg logit contributions
chard-0.055
 sir-0.054
ache-0.053
 salon-0.050
 why-0.050
Token her
Token ID607
Feature activation+2.21e-02

Pos logit contributions
chard+0.109
ache+0.108
 sir+0.107
ber+0.103
itor+0.100
Neg logit contributions
acks-0.055
 drifted-0.051
 care-0.049
 They-0.046
 turns-0.045
Token and
Token ID290
Feature activation-2.76e-02

Pos logit contributions
chard+0.045
ache+0.042
 sir+0.041
itor+0.039
ber+0.038
Neg logit contributions
acks-0.030
 drifted-0.030
 hands-0.027
 care-0.026
 They-0.026
Token tried
Token ID3088
Feature activation+7.18e-03

Pos logit contributions
 drifted+0.032
acks+0.030
 care+0.028
irmed+0.027
 turns+0.027
Neg logit contributions
 sir-0.062
ache-0.062
chard-0.062
ber-0.058
 salon-0.058
Token to
Token ID284
Feature activation+4.25e-02

Pos logit contributions
 sir+0.014
ache+0.014
chard+0.014
itor+0.013
ber+0.012
Neg logit contributions
 care-0.011
acks-0.010
 They-0.010
 their-0.010
 turns-0.010
Token stop
Token ID2245
Feature activation-4.15e-03

Pos logit contributions
chard+0.082
 sir+0.081
ache+0.080
ber+0.076
 salon+0.075
Neg logit contributions
acks-0.064
 drifted-0.062
 care-0.057
acked-0.056
irmed-0.056
Token skills
Token ID4678
Feature activation-4.76e-03

Pos logit contributions
acks+0.006
 drifted+0.006
acked+0.005
irmed+0.005
 care+0.005
Neg logit contributions
chard-0.009
 sir-0.009
ache-0.009
ber-0.008
 salon-0.008
Token with
Token ID351
Feature activation-3.44e-02

Pos logit contributions
acks+0.007
 care+0.006
 drifted+0.006
 They+0.006
acked+0.006
Neg logit contributions
chard-0.009
ache-0.009
 sir-0.009
itor-0.009
ber-0.009
Token others
Token ID1854
Feature activation-3.14e-02

Pos logit contributions
acks+0.047
 drifted+0.045
acked+0.041
irmed+0.041
 care+0.040
Neg logit contributions
chard-0.071
 sir-0.070
ache-0.069
ber-0.065
 why-0.065
Token,
Token ID11
Feature activation+5.81e-02

Pos logit contributions
acks+0.048
 drifted+0.045
 care+0.043
acked+0.041
irmed+0.041
Neg logit contributions
chard-0.059
 sir-0.058
ache-0.058
 salon-0.054
itor-0.054
Token you
Token ID345
Feature activation-2.26e-02

Pos logit contributions
ache+0.121
chard+0.121
 sir+0.117
itor+0.114
ber+0.113
Neg logit contributions
acks-0.074
 drifted-0.072
 care-0.072
 They-0.067
 turns-0.063
Token can
Token ID460
Feature activation+4.04e-02

Pos logit contributions
acks+0.041
 drifted+0.039
acked+0.038
irmed+0.037
text+0.035
Neg logit contributions
 sir-0.037
ache-0.036
chard-0.036
 why-0.034
ber-0.033
Token make
Token ID787
Feature activation+1.00e-03

Pos logit contributions
chard+0.070
 sir+0.070
ache+0.069
 why+0.064
 salon+0.064
Neg logit contributions
acks-0.067
 drifted-0.065
acked-0.061
irmed-0.060
 care-0.058
Token new
Token ID649
Feature activation+1.72e-03

Pos logit contributions
 sir+0.002
chard+0.002
ache+0.002
 why+0.002
 salon+0.002
Neg logit contributions
 drifted-0.001
acks-0.001
acked-0.001
irmed-0.001
text-0.001
Token friends
Token ID2460
Feature activation-4.91e-03

Pos logit contributions
chard+0.004
 sir+0.004
ache+0.004
ber+0.004
 salon+0.004
Neg logit contributions
acks-0.002
 drifted-0.002
 care-0.002
acked-0.002
irmed-0.002
Token and
Token ID290
Feature activation-2.70e-02

Pos logit contributions
acks+0.007
 care+0.006
 drifted+0.006
 They+0.005
acked+0.005
Neg logit contributions
chard-0.010
ache-0.010
 sir-0.010
itor-0.010
 salon-0.009
Token have
Token ID423
Feature activation-6.39e-02

Pos logit contributions
acks+0.035
 drifted+0.034
 care+0.032
irmed+0.030
acked+0.030
Neg logit contributions
chard-0.057
 sir-0.056
ache-0.056
ber-0.053
 salon-0.052
Token When
Token ID1649
Feature activation+1.04e-02

Pos logit contributions
chard+0.004
ache+0.004
 sir+0.004
 salon+0.004
ber+0.004
Neg logit contributions
acks-0.002
 drifted-0.002
acked-0.002
 They-0.002
 care-0.002
Token you
Token ID345
Feature activation-2.02e-02

Pos logit contributions
chard+0.020
 sir+0.020
ache+0.020
ber+0.018
 salon+0.018
Neg logit contributions
acks-0.016
 drifted-0.015
 care-0.014
acked-0.014
irmed-0.014
Token hold
Token ID1745
Feature activation-2.91e-02

Pos logit contributions
acks+0.030
 drifted+0.029
irmed+0.028
acked+0.028
 Soon+0.026
Neg logit contributions
 sir-0.040
ache-0.038
chard-0.037
 why-0.036
 salon-0.034
Token it
Token ID340
Feature activation-2.65e-02

Pos logit contributions
acks+0.044
 drifted+0.043
acked+0.039
irmed+0.038
 care+0.036
Neg logit contributions
chard-0.056
 sir-0.055
ache-0.054
 why-0.052
ber-0.051
Token close
Token ID1969
Feature activation-5.17e-02

Pos logit contributions
acks+0.040
 drifted+0.038
 care+0.036
acked+0.036
 They+0.035
Neg logit contributions
chard-0.049
 sir-0.049
ache-0.048
 salon-0.046
ber-0.046
Token,
Token ID11
Feature activation+6.98e-02

Pos logit contributions
acks+0.083
 drifted+0.079
 care+0.075
acked+0.074
 They+0.073
Neg logit contributions
chard-0.093
ache-0.093
 sir-0.090
 salon-0.086
ber-0.086
Token it
Token ID340
Feature activation-1.70e-02

Pos logit contributions
ache+0.162
chard+0.157
 salon+0.149
ber+0.148
itor+0.148
Neg logit contributions
 care-0.080
acks-0.078
 They-0.076
 turns-0.074
 drifted-0.074
Token can
Token ID460
Feature activation+3.61e-02

Pos logit contributions
acks+0.025
 drifted+0.024
acked+0.023
irmed+0.022
 care+0.021
Neg logit contributions
ache-0.033
 sir-0.032
chard-0.031
 why-0.031
ber-0.030
Token tell
Token ID1560
Feature activation+3.24e-03

Pos logit contributions
chard+0.068
 sir+0.068
ache+0.067
 why+0.063
 salon+0.062
Neg logit contributions
acks-0.056
 drifted-0.053
acked-0.051
irmed-0.050
 care-0.047
Token you
Token ID345
Feature activation-1.20e-02

Pos logit contributions
chard+0.007
 sir+0.007
ache+0.007
ber+0.006
 salon+0.006
Neg logit contributions
acks-0.004
 drifted-0.004
acked-0.004
irmed-0.004
 care-0.004
Token things
Token ID1243
Feature activation+3.79e-02

Pos logit contributions
acks+0.020
 drifted+0.019
 care+0.019
 They+0.017
irmed+0.017
Neg logit contributions
chard-0.021
 sir-0.020
ache-0.020
 salon-0.019
ber-0.019
Token friends
Token ID2460
Feature activation-3.34e-02

Pos logit contributions
acks+0.041
acked+0.038
 drifted+0.038
irmed+0.036
hed+0.036
Neg logit contributions
chard-0.044
 sir-0.043
ache-0.043
ber-0.040
 why-0.040
Token.
Token ID13
Feature activation-2.19e-02

Pos logit contributions
 care+0.037
 drifted+0.037
acks+0.036
 They+0.033
hed+0.032
Neg logit contributions
 sir-0.074
chard-0.074
ache-0.072
 salon-0.071
itor-0.069
Token<|endoftext|>
Token ID50256
Feature activation-5.90e-02

Pos logit contributions
acks+0.033
 breaths+0.032
 drifted+0.031
 minds+0.030
 care+0.030
Neg logit contributions
 sir-0.038
 why-0.037
ache-0.036
chard-0.034
keeper-0.034
TokenOnce
Token ID7454
Feature activation-1.29e-02

Pos logit contributions
acks+0.083
 drifted+0.081
irmed+0.077
 care+0.074
acked+0.072
Neg logit contributions
chard-0.113
 sir-0.112
ache-0.108
 why-0.105
ber-0.104
Token upon
Token ID2402
Feature activation-2.60e-02

Pos logit contributions
acks+0.024
 drifted+0.023
acked+0.022
irmed+0.022
 care+0.022
Neg logit contributions
chard-0.020
 sir-0.020
ache-0.019
ber-0.018
 why-0.018
Token a
Token ID257
Feature activation-6.68e-03

Pos logit contributions
acks+0.042
 drifted+0.039
acked+0.038
irmed+0.038
igg+0.035
Neg logit contributions
chard-0.047
ache-0.045
 sir-0.045
 why-0.044
ber-0.041
Token time
Token ID640
Feature activation-8.70e-03

Pos logit contributions
acks+0.008
 drifted+0.007
acked+0.007
irmed+0.007
 breaths+0.007
Neg logit contributions
ache-0.014
 sir-0.014
 why-0.014
chard-0.013
ber-0.013
Token,
Token ID11
Feature activation+8.52e-02

Pos logit contributions
acks+0.013
 drifted+0.012
 They+0.011
 care+0.011
text+0.011
Neg logit contributions
ache-0.018
chard-0.017
 sir-0.017
ber-0.016
 salon-0.016
Token there
Token ID612
Feature activation+1.25e-03

Pos logit contributions
ache+0.110
 sir+0.106
chard+0.100
 salon+0.096
itor+0.094
Neg logit contributions
acks-0.181
 drifted-0.180
 They-0.173
acked-0.172
irmed-0.170
Token was
Token ID373
Feature activation+8.99e-03

Pos logit contributions
chard+0.002
 sir+0.002
ache+0.002
 why+0.002
ber+0.002
Neg logit contributions
acks-0.002
 drifted-0.002
irmed-0.002
acked-0.002
 care-0.002
Token a
Token ID257
Feature activation+2.25e-02

Pos logit contributions
chard+0.015
 sir+0.013
 oh+0.013
 why+0.013
 dentist+0.013
Neg logit contributions
acks-0.017
 drifted-0.016
irmed-0.015
acked-0.015
 minds-0.014
Token he
Token ID339
Feature activation-2.22e-02

Pos logit contributions
acks+0.018
 drifted+0.018
 care+0.017
 They+0.016
irmed+0.016
Neg logit contributions
ache-0.022
chard-0.022
 sir-0.022
 salon-0.021
ber-0.020
Token could
Token ID714
Feature activation+9.40e-03

Pos logit contributions
 drifted+0.041
acks+0.039
irmed+0.037
 care+0.036
acked+0.036
Neg logit contributions
chard-0.036
 sir-0.036
ache-0.035
 salon-0.034
ber-0.032
Token.
Token ID13
Feature activation-3.67e-03

Pos logit contributions
chard+0.020
 sir+0.020
ache+0.020
ber+0.019
 salon+0.019
Neg logit contributions
acks-0.013
 drifted-0.012
 care-0.011
acked-0.010
irmed-0.010
Token His
Token ID2399
Feature activation+2.76e-02

Pos logit contributions
acks+0.005
 drifted+0.005
acked+0.004
irmed+0.004
irt+0.004
Neg logit contributions
chard-0.007
 sir-0.007
 why-0.007
ache-0.007
ber-0.007
Token pants
Token ID12581
Feature activation+1.15e-02

Pos logit contributions
chard+0.055
ache+0.051
 sir+0.051
 salon+0.048
 why+0.048
Neg logit contributions
 drifted-0.042
acks-0.041
 care-0.039
 hands-0.038
irmed-0.037
Token got
Token ID1392
Feature activation+2.62e-03

Pos logit contributions
chard+0.024
 sir+0.024
ache+0.024
ber+0.023
 salon+0.023
Neg logit contributions
acks-0.015
 drifted-0.014
acked-0.013
irmed-0.013
 care-0.013
Token caught
Token ID4978
Feature activation+1.47e-02

Pos logit contributions
chard+0.005
 sir+0.005
ache+0.005
ber+0.004
 salon+0.004
Neg logit contributions
acks-0.004
 drifted-0.004
acked-0.004
irmed-0.004
 care-0.004
Token on
Token ID319
Feature activation-4.65e-02

Pos logit contributions
chard+0.027
 sir+0.027
ache+0.027
ber+0.025
 salon+0.025
Neg logit contributions
acks-0.023
 drifted-0.022
acked-0.020
irmed-0.020
 care-0.020
Token a
Token ID257
Feature activation+1.50e-02

Pos logit contributions
acks+0.065
 drifted+0.062
irmed+0.057
acked+0.057
 care+0.054
Neg logit contributions
chard-0.094
 sir-0.094
ache-0.092
 why-0.089
ber-0.085
Token branch
Token ID8478
Feature activation+9.49e-03

Pos logit contributions
chard+0.030
 sir+0.029
ber+0.028
ache+0.028
 why+0.028
Neg logit contributions
acks-0.021
 drifted-0.019
 care-0.019
 lot-0.018
 drift-0.017
Token and
Token ID290
Feature activation-3.60e-02

Pos logit contributions
chard+0.021
 sir+0.020
ache+0.020
ber+0.020
itor+0.019
Neg logit contributions
acks-0.012
 drifted-0.011
 care-0.010
acked-0.010
 They-0.010
Token the
Token ID262
Feature activation-2.26e-03

Pos logit contributions
chard+0.031
 sir+0.031
ache+0.030
ber+0.029
 salon+0.029
Neg logit contributions
acks-0.018
 drifted-0.018
 care-0.016
acked-0.016
irmed-0.015
Token beautiful
Token ID4950
Feature activation-2.25e-02

Pos logit contributions
acks+0.003
 drifted+0.003
acked+0.003
irmed+0.003
 care+0.003
Neg logit contributions
 sir-0.004
chard-0.004
ache-0.004
ber-0.004
 why-0.004
Token sounds
Token ID5238
Feature activation-1.15e-02

Pos logit contributions
acks+0.034
irmed+0.034
 drifted+0.034
igg+0.031
acked+0.029
Neg logit contributions
 sir-0.040
ache-0.039
 why-0.039
chard-0.038
 keeper-0.036
Token of
Token ID286
Feature activation-4.24e-03

Pos logit contributions
acks+0.017
 drifted+0.017
acked+0.015
 care+0.015
irmed+0.014
Neg logit contributions
chard-0.022
 sir-0.022
ache-0.022
 salon-0.021
ber-0.021
Token nature
Token ID3450
Feature activation-7.53e-03

Pos logit contributions
acks+0.005
 drifted+0.005
acked+0.005
irmed+0.005
 care+0.005
Neg logit contributions
chard-0.009
 sir-0.009
ache-0.009
 why-0.008
ber-0.008
Token around
Token ID1088
Feature activation-3.77e-02

Pos logit contributions
acks+0.011
 drifted+0.011
irmed+0.010
acked+0.010
 care+0.010
Neg logit contributions
chard-0.014
ache-0.014
 sir-0.014
 why-0.013
ber-0.013
Token her
Token ID607
Feature activation-7.08e-03

Pos logit contributions
acks+0.045
 drifted+0.043
acked+0.039
irmed+0.039
 care+0.037
Neg logit contributions
chard-0.084
 sir-0.083
ache-0.082
 why-0.078
ber-0.078
Token.
Token ID13
Feature activation-5.88e-03

Pos logit contributions
acks+0.009
 drifted+0.009
acked+0.008
 care+0.008
irmed+0.008
Neg logit contributions
chard-0.015
 sir-0.015
ache-0.015
ber-0.014
 salon-0.014
Token When
Token ID1649
Feature activation+7.20e-03

Pos logit contributions
acks+0.009
 drifted+0.008
acked+0.008
 care+0.007
irmed+0.007
Neg logit contributions
chard-0.012
 sir-0.011
 why-0.011
ache-0.011
 salon-0.010
Token she
Token ID673
Feature activation-1.25e-02

Pos logit contributions
chard+0.009
 sir+0.008
 why+0.008
 salon+0.008
ber+0.007
Neg logit contributions
acks-0.016
acked-0.016
bit-0.015
irmed-0.015
igg-0.015
Token tried
Token ID3088
Feature activation-8.13e-03

Pos logit contributions
 drifted+0.017
acks+0.017
 care+0.015
acked+0.015
irmed+0.015
Neg logit contributions
chard-0.026
 sir-0.026
ache-0.025
ber-0.024
 salon-0.024
Token the
Token ID262
Feature activation+1.75e-02

Pos logit contributions
chard+0.067
 sir+0.067
ache+0.066
ber+0.063
 salon+0.063
Neg logit contributions
acks-0.031
 drifted-0.029
acked-0.026
 care-0.026
irmed-0.025
Token voice
Token ID3809
Feature activation+1.98e-02

Pos logit contributions
chard+0.030
ache+0.029
 salon+0.028
 sir+0.027
ber+0.027
Neg logit contributions
acks-0.029
 drifted-0.029
 care-0.028
 They-0.027
 air-0.026
Token.
Token ID13
Feature activation+4.49e-03

Pos logit contributions
chard+0.044
 sir+0.042
 salon+0.041
ache+0.041
ber+0.040
Neg logit contributions
 drifted-0.023
acks-0.022
 care-0.021
acked-0.020
irmed-0.020
Token (space)
Token ID220
Feature activation-2.35e-02

Pos logit contributions
ache+0.012
chard+0.011
 salon+0.011
ber+0.011
 sir+0.011
Neg logit contributions
 They-0.004
acks-0.004
 Soon-0.004
 drifted-0.004
acked-0.003
Token (newline)
Token ID198
Feature activation-1.73e-03

Pos logit contributions
acks+0.026
 drifted+0.024
acked+0.022
irmed+0.021
 care+0.021
Neg logit contributions
chard-0.055
 sir-0.054
ache-0.053
ber-0.051
 salon-0.051
Token (newline)
Token ID198
Feature activation+3.91e-03

Pos logit contributions
acks+0.002
 drifted+0.002
acked+0.002
 care+0.002
irmed+0.002
Neg logit contributions
chard-0.004
 sir-0.004
ache-0.004
ber-0.004
 salon-0.004
TokenTim
Token ID14967
Feature activation+4.63e-03

Pos logit contributions
chard+0.007
ache+0.007
 sir+0.007
 salon+0.007
ber+0.007
Neg logit contributions
acks-0.006
acked-0.006
 drifted-0.006
 They-0.006
 care-0.006
Tokenmy
Token ID1820
Feature activation+6.08e-03

Pos logit contributions
chard+0.008
 sir+0.008
ache+0.008
 salon+0.007
ber+0.007
Neg logit contributions
acks-0.008
 drifted-0.008
acked-0.007
irmed-0.007
 care-0.007
Token was
Token ID373
Feature activation+3.67e-03

Pos logit contributions
chard+0.013
 sir+0.012
ache+0.012
ber+0.012
 salon+0.011
Neg logit contributions
acks-0.008
 drifted-0.008
acked-0.007
irmed-0.007
 care-0.007
Token sad
Token ID6507
Feature activation-1.74e-02

Pos logit contributions
 sir+0.006
chard+0.006
ache+0.006
 salon+0.006
ber+0.006
Neg logit contributions
acks-0.006
 drifted-0.006
irmed-0.006
acked-0.005
 care-0.005
Token.
Token ID13
Feature activation+2.52e-03

Pos logit contributions
acks+0.027
 drifted+0.027
 care+0.025
 They+0.024
irmed+0.024
Neg logit contributions
chard-0.034
 sir-0.031
ache-0.031
itor-0.030
 salon-0.030
Token excited
Token ID6568
Feature activation-9.65e-03

Pos logit contributions
ache+0.013
 sir+0.013
chard+0.012
ber+0.012
 salon+0.011
Neg logit contributions
 drifted-0.007
acks-0.007
 They-0.007
 care-0.007
 Soon-0.006
Token to
Token ID284
Feature activation+3.68e-02

Pos logit contributions
acks+0.015
 drifted+0.014
 care+0.014
 They+0.013
acked+0.013
Neg logit contributions
chard-0.019
 sir-0.018
ache-0.018
ber-0.017
 salon-0.017
Token play
Token ID711
Feature activation+2.11e-02

Pos logit contributions
chard+0.084
ache+0.083
 sir+0.083
ber+0.078
 salon+0.077
Neg logit contributions
acks-0.044
 drifted-0.041
 care-0.038
acked-0.036
irmed-0.036
Token on
Token ID319
Feature activation-4.41e-02

Pos logit contributions
ache+0.042
chard+0.041
 sir+0.041
ber+0.038
 salon+0.037
Neg logit contributions
acks-0.033
 drifted-0.031
 care-0.028
acked-0.028
irmed-0.028
Token the
Token ID262
Feature activation-4.02e-03

Pos logit contributions
acks+0.049
 drifted+0.046
acked+0.040
 care+0.040
irmed+0.040
Neg logit contributions
chard-0.103
 sir-0.102
ache-0.102
ber-0.096
 salon-0.095
Token swings
Token ID26728
Feature activation-1.82e-02

Pos logit contributions
acks+0.008
 drifted+0.008
acked+0.008
irmed+0.008
 care+0.007
Neg logit contributions
chard-0.005
 sir-0.005
ache-0.005
 salon-0.005
ber-0.005
Token and
Token ID290
Feature activation-4.25e-02

Pos logit contributions
acks+0.022
 drifted+0.020
 They+0.019
acked+0.019
hed+0.019
Neg logit contributions
ache-0.041
 sir-0.040
chard-0.040
ber-0.037
itor-0.037
Token slide
Token ID10649
Feature activation-2.06e-02

Pos logit contributions
acks+0.068
 drifted+0.066
acked+0.060
 care+0.060
irmed+0.060
Neg logit contributions
chard-0.077
 sir-0.077
ache-0.077
ber-0.072
 salon-0.071
Token.
Token ID13
Feature activation-1.04e-02

Pos logit contributions
acks+0.028
 drifted+0.026
acked+0.024
irmed+0.024
 care+0.024
Neg logit contributions
ache-0.043
 sir-0.043
chard-0.043
ber-0.040
 salon-0.039
Token But
Token ID887
Feature activation+4.55e-02

Pos logit contributions
acks+0.012
 drifted+0.012
acked+0.011
irmed+0.010
 care+0.010
Neg logit contributions
 sir-0.023
chard-0.023
ache-0.022
 why-0.022
 salon-0.021
Token first
Token ID717
Feature activation+1.78e-02

Pos logit contributions
ache+0.103
 sir+0.094
chard+0.088
itor+0.087
 salon+0.087
Neg logit contributions
 drifted-0.059
 night-0.054
 They-0.053
 Soon-0.052
 care-0.052
Token!"
Token ID2474
Feature activation-3.53e-02

Pos logit contributions
acks+0.049
 drifted+0.047
 care+0.043
acked+0.042
irmed+0.042
Neg logit contributions
chard-0.064
ache-0.063
 sir-0.063
ber-0.059
 salon-0.059
Token (newline)
Token ID198
Feature activation-1.82e-02

Pos logit contributions
acks+0.047
 drifted+0.045
acked+0.042
irmed+0.041
 care+0.040
Neg logit contributions
 sir-0.075
chard-0.073
ache-0.071
ber-0.069
 why-0.069
Token (newline)
Token ID198
Feature activation-1.44e-02

Pos logit contributions
acks+0.021
 drifted+0.020
acked+0.018
irmed+0.018
 care+0.018
Neg logit contributions
chard-0.041
 sir-0.041
ache-0.040
ber-0.038
 salon-0.038
TokenTom
Token ID13787
Feature activation-1.64e-02

Pos logit contributions
acks+0.024
 drifted+0.022
acked+0.021
 care+0.021
 They+0.020
Neg logit contributions
chard-0.026
ache-0.026
 sir-0.026
 why-0.024
 salon-0.023
Token and
Token ID290
Feature activation-3.35e-02

Pos logit contributions
 drifted+0.017
acks+0.017
acked+0.015
hed+0.014
 They+0.014
Neg logit contributions
chard-0.041
 sir-0.038
 salon-0.037
ache-0.037
 why-0.036
Token Anna
Token ID11735
Feature activation-6.55e-02

Pos logit contributions
acks+0.053
 drifted+0.052
acked+0.047
 care+0.047
irmed+0.047
Neg logit contributions
chard-0.062
 sir-0.061
ache-0.061
ber-0.056
 salon-0.056
Token did
Token ID750
Feature activation-1.21e-02

Pos logit contributions
acks+0.098
 drifted+0.090
acked+0.086
irmed+0.083
 care+0.082
Neg logit contributions
 sir-0.127
chard-0.127
ache-0.126
ber-0.119
 why-0.118
Token not
Token ID407
Feature activation-4.32e-03

Pos logit contributions
acks+0.028
 drifted+0.028
 care+0.026
irmed+0.026
acked+0.026
Neg logit contributions
chard-0.014
 sir-0.013
ache-0.013
ber-0.012
 salon-0.011
Token want
Token ID765
Feature activation+2.22e-02

Pos logit contributions
acks+0.006
 drifted+0.006
 care+0.006
acked+0.005
irmed+0.005
Neg logit contributions
chard-0.009
 sir-0.008
ache-0.008
ber-0.008
 salon-0.008
Token to
Token ID284
Feature activation+3.32e-02

Pos logit contributions
chard+0.030
 sir+0.029
ache+0.029
ber+0.026
 salon+0.026
Neg logit contributions
acks-0.047
 drifted-0.045
acked-0.042
 care-0.042
irmed-0.042
Token stop
Token ID2245
Feature activation-1.77e-02

Pos logit contributions
chard+0.066
 sir+0.065
ache+0.065
ber+0.061
 salon+0.060
Neg logit contributions
acks-0.048
 drifted-0.046
acked-0.042
irmed-0.042
 care-0.042
Token Tim
Token ID5045
Feature activation-3.82e-03

Pos logit contributions
acks+0.009
 drifted+0.008
acked+0.007
 care+0.007
irmed+0.007
Neg logit contributions
chard-0.032
 sir-0.032
ache-0.031
 why-0.030
ber-0.030
Token went
Token ID1816
Feature activation+4.87e-03

Pos logit contributions
acks+0.005
 drifted+0.005
acked+0.005
irmed+0.005
 care+0.005
Neg logit contributions
chard-0.008
 sir-0.008
ache-0.008
ber-0.007
 salon-0.007
Token on
Token ID319
Feature activation-5.45e-02

Pos logit contributions
ache+0.012
 sir+0.011
chard+0.010
keys+0.010
ber+0.010
Neg logit contributions
acks-0.006
 care-0.006
 drifted-0.005
acked-0.005
 They-0.005
Token the
Token ID262
Feature activation-1.08e-02

Pos logit contributions
acks+0.065
 drifted+0.062
 care+0.056
acked+0.054
irmed+0.054
Neg logit contributions
ache-0.122
chard-0.122
 sir-0.122
ber-0.115
 why-0.113
Token swings
Token ID26728
Feature activation-1.66e-02

Pos logit contributions
acks+0.019
 drifted+0.018
acked+0.017
 care+0.016
irmed+0.016
Neg logit contributions
 sir-0.019
chard-0.019
ache-0.018
ber-0.017
 why-0.017
Token and
Token ID290
Feature activation-4.91e-02

Pos logit contributions
acks+0.023
 drifted+0.021
acked+0.020
 care+0.019
irmed+0.019
Neg logit contributions
ache-0.035
chard-0.034
 sir-0.034
ber-0.032
itor-0.031
Token Sue
Token ID26089
Feature activation-3.23e-02

Pos logit contributions
acks+0.066
 drifted+0.062
acked+0.057
irmed+0.056
 care+0.055
Neg logit contributions
chard-0.104
 sir-0.101
ache-0.099
ber-0.095
 salon-0.095
Token looked
Token ID3114
Feature activation-4.15e-03

Pos logit contributions
acks+0.045
 drifted+0.042
acked+0.039
irmed+0.039
 care+0.038
Neg logit contributions
chard-0.066
 sir-0.065
ache-0.064
ber-0.061
 salon-0.061
Token at
Token ID379
Feature activation+1.57e-02

Pos logit contributions
acks+0.007
 drifted+0.007
acked+0.006
irmed+0.006
 care+0.006
Neg logit contributions
chard-0.007
 sir-0.007
ache-0.007
 why-0.007
ber-0.007
Token the
Token ID262
Feature activation+2.25e-03

Pos logit contributions
chard+0.037
 sir+0.036
ache+0.035
 why+0.034
ber+0.034
Neg logit contributions
acks-0.018
 drifted-0.017
acked-0.015
irmed-0.015
 care-0.014
Token mushrooms
Token ID23452
Feature activation+2.34e-02

Pos logit contributions
chard+0.004
 sir+0.004
ache+0.004
ber+0.004
 salon+0.004
Neg logit contributions
acks-0.003
 drifted-0.003
acked-0.003
 care-0.003
irmed-0.003
Token little
Token ID1310
Feature activation-1.41e-02

Pos logit contributions
chard+0.035
ache+0.034
 sir+0.033
 why+0.032
 salon+0.032
Neg logit contributions
acks-0.021
 drifted-0.021
 They-0.020
 care-0.020
acked-0.018
Token boy
Token ID2933
Feature activation+3.85e-03

Pos logit contributions
acks+0.025
 drifted+0.024
acked+0.022
irmed+0.022
 care+0.022
Neg logit contributions
chard-0.024
 sir-0.023
ache-0.023
ber-0.021
 salon-0.021
Token who
Token ID508
Feature activation-3.30e-02

Pos logit contributions
chard+0.006
 sir+0.006
ache+0.006
ber+0.006
 salon+0.006
Neg logit contributions
acks-0.007
 drifted-0.007
acked-0.006
irmed-0.006
 care-0.006
Token liked
Token ID8288
Feature activation+1.26e-02

Pos logit contributions
acks+0.061
 breaths+0.054
 drifted+0.054
acked+0.053
les+0.053
Neg logit contributions
ache-0.052
 why-0.052
chard-0.050
ber-0.048
 lawyer-0.046
Token to
Token ID284
Feature activation+3.03e-02

Pos logit contributions
 sir+0.020
ache+0.020
chard+0.019
ber+0.018
itor+0.018
Neg logit contributions
 drifted-0.023
acks-0.022
 care-0.021
 They-0.021
 hearts-0.020
Token stretch
Token ID7539
Feature activation-5.10e-02

Pos logit contributions
chard+0.063
 sir+0.063
ache+0.062
 why+0.059
 salon+0.059
Neg logit contributions
acks-0.040
 drifted-0.038
acked-0.036
irmed-0.035
 care-0.033
Token.
Token ID13
Feature activation-3.13e-03

Pos logit contributions
acks+0.071
 drifted+0.066
acked+0.064
irmed+0.063
igg+0.060
Neg logit contributions
chard-0.104
ache-0.100
 why-0.099
 sir-0.099
ber-0.094
Token Every
Token ID3887
Feature activation-9.50e-03

Pos logit contributions
 drifted+0.004
acks+0.004
irmed+0.004
acked+0.004
 breaths+0.004
Neg logit contributions
 sir-0.007
chard-0.006
 why-0.006
 salon-0.006
 dentist-0.006
Token morning
Token ID3329
Feature activation+2.64e-03

Pos logit contributions
acks+0.011
acked+0.010
 drifted+0.009
irmed+0.009
bit+0.009
Neg logit contributions
 sir-0.021
ache-0.021
chard-0.021
 why-0.020
ber-0.020
Token he
Token ID339
Feature activation-2.25e-03

Pos logit contributions
chard+0.005
ache+0.005
 sir+0.005
 salon+0.005
itor+0.005
Neg logit contributions
acks-0.004
 drifted-0.004
 care-0.004
 They-0.004
acked-0.004
Token would
Token ID561
Feature activation+4.79e-03

Pos logit contributions
acks+0.005
acked+0.004
 drifted+0.004
irmed+0.004
 Soon+0.004
Neg logit contributions
ache-0.003
chard-0.003
 why-0.003
ber-0.003
 sir-0.003
Token depending
Token ID6906
Feature activation+1.12e-02

Pos logit contributions
acks+0.036
 drifted+0.034
acked+0.032
irmed+0.031
igg+0.030
Neg logit contributions
chard-0.034
 sir-0.033
ache-0.032
 why-0.031
ber-0.030
Token on
Token ID319
Feature activation-2.16e-02

Pos logit contributions
chard+0.019
 sir+0.019
ache+0.019
ber+0.017
 salon+0.017
Neg logit contributions
acks-0.019
 drifted-0.019
acked-0.017
irmed-0.017
 care-0.017
Token each
Token ID1123
Feature activation-6.06e-02

Pos logit contributions
acks+0.033
 drifted+0.032
acked+0.031
irmed+0.029
igg+0.027
Neg logit contributions
 why-0.042
chard-0.042
 sir-0.041
ache-0.040
ber-0.038
Token other
Token ID584
Feature activation-4.44e-03

Pos logit contributions
acks+0.091
 drifted+0.086
irmed+0.080
acked+0.078
igg+0.075
Neg logit contributions
chard-0.115
 sir-0.114
 why-0.112
ache-0.109
ber-0.106
Token to
Token ID284
Feature activation+2.61e-02

Pos logit contributions
acks+0.006
 drifted+0.006
 care+0.005
acked+0.005
irmed+0.005
Neg logit contributions
chard-0.009
 sir-0.009
ache-0.009
 salon-0.009
ber-0.008
Token have
Token ID423
Feature activation-5.74e-02

Pos logit contributions
chard+0.048
 sir+0.048
 why+0.046
ache+0.046
ber+0.042
Neg logit contributions
acks-0.043
 drifted-0.040
acked-0.039
irmed-0.038
hed-0.036
Token fun
Token ID1257
Feature activation-1.59e-02

Pos logit contributions
acks+0.097
 drifted+0.085
acked+0.085
irmed+0.082
igg+0.080
Neg logit contributions
chard-0.099
 why-0.098
 oh-0.093
 sir-0.092
ache-0.091
Token and
Token ID290
Feature activation-2.37e-02

Pos logit contributions
acks+0.027
 drifted+0.026
acked+0.024
irmed+0.024
 care+0.024
Neg logit contributions
chard-0.027
 sir-0.027
ache-0.026
ber-0.025
 why-0.025
Token stay
Token ID2652
Feature activation-3.66e-02

Pos logit contributions
acks+0.035
 drifted+0.033
acked+0.031
irmed+0.030
 care+0.030
Neg logit contributions
chard-0.047
 sir-0.046
ache-0.045
ber-0.043
 why-0.043
Token safe
Token ID3338
Feature activation-3.93e-02

Pos logit contributions
acks+0.073
 drifted+0.068
acked+0.065
text+0.064
irmed+0.062
Neg logit contributions
chard-0.055
 why-0.051
 sir-0.050
ache-0.049
ber-0.046
Token.
Token ID13
Feature activation-1.69e-02

Pos logit contributions
acks+0.047
 drifted+0.045
 care+0.041
acked+0.040
irmed+0.039
Neg logit contributions
 sir-0.087
chard-0.087
ache-0.085
 salon-0.083
ber-0.080
Token g
Token ID308
Feature activation-5.80e-02

Pos logit contributions
acks+0.054
 drifted+0.052
acked+0.047
irmed+0.047
 care+0.046
Neg logit contributions
chard-0.085
 sir-0.084
ache-0.083
ber-0.079
 why-0.078
Tokeniggle
Token ID24082
Feature activation-6.88e-02

Pos logit contributions
 drifted+0.070
 care+0.066
irmed+0.062
Ar+0.061
acked+0.061
Neg logit contributions
 sir-0.120
 why-0.115
ache-0.115
ister-0.112
 salon-0.111
Token a
Token ID257
Feature activation-1.34e-02

Pos logit contributions
acks+0.100
 drifted+0.096
acked+0.089
irmed+0.088
 care+0.085
Neg logit contributions
 sir-0.134
chard-0.132
ache-0.132
ber-0.126
 why-0.126
Token lot
Token ID1256
Feature activation-3.12e-02

Pos logit contributions
acks+0.015
acked+0.013
irmed+0.013
 drifted+0.013
igg+0.012
Neg logit contributions
ache-0.032
 sir-0.032
chard-0.030
ber-0.029
 salon-0.029
Token.
Token ID13
Feature activation-4.13e-02

Pos logit contributions
acks+0.037
 drifted+0.034
 care+0.031
acked+0.030
irmed+0.029
Neg logit contributions
chard-0.072
 sir-0.069
ache-0.069
 salon-0.066
ber-0.065
Token They
Token ID1119
Feature activation-6.49e-02

Pos logit contributions
acks+0.050
 drifted+0.049
acked+0.045
 breaths+0.044
 care+0.044
Neg logit contributions
 sir-0.088
chard-0.087
 why-0.085
ber-0.083
ache-0.081
Token love
Token ID1842
Feature activation-5.00e-03

Pos logit contributions
acks+0.107
acked+0.103
 drifted+0.097
irmed+0.096
 Soon+0.091
Neg logit contributions
chard-0.116
 sir-0.115
ache-0.112
 salon-0.111
 why-0.111
Token playing
Token ID2712
Feature activation-1.03e-03

Pos logit contributions
acks+0.006
 drifted+0.006
 care+0.005
 They+0.005
 turns+0.005
Neg logit contributions
ache-0.011
chard-0.011
 sir-0.011
ber-0.010
 salon-0.010
Token with
Token ID351
Feature activation-3.67e-02

Pos logit contributions
acks+0.002
 drifted+0.002
 care+0.001
acked+0.001
irmed+0.001
Neg logit contributions
chard-0.002
ache-0.002
 sir-0.002
ber-0.002
 salon-0.002
Token cardboard
Token ID26819
Feature activation-8.78e-03

Pos logit contributions
acks+0.044
 drifted+0.040
acked+0.037
irmed+0.036
igg+0.033
Neg logit contributions
chard-0.086
 sir-0.084
ache-0.081
 why-0.080
ber-0.079
Token.
Token ID13
Feature activation-4.16e-02

Pos logit contributions
acks+0.012
 drifted+0.011
acked+0.010
 care+0.010
irmed+0.010
Neg logit contributions
chard-0.018
 sir-0.018
ache-0.018
ber-0.017
 salon-0.017