TOP ACTIVATIONS
MAX = 0.126

. She grabbed the fire extingu isher and put out the
Sally was even more " int elligent " and careful when
the apple ! He squ int ed his eyes and saw
he found a magical o int ment . He touched it
that the kitten was un har med ! The two hugged
that the kitten was un har med ! The two hugged
sky . The sun gl int ed brightly off the elephant
Jess ie squ int ed , and saw a
quickly checked for a fire extingu isher , but there wasn
the distance . He squ int ed and said to Sarah
. He found an ash tr ay and started to jugg
the honey from the be eh ive . Finally
. He had an ash tr ay that could zip around
cigarette and using an ash tr ay . L
. Everyone in the be eh ive was so happy for
still be healthy and un har med . Suddenly , her
it was the best ash tr ay they had ever seen
it came across an ash tr ay . It wanted to
and rub the magical o int ment on her forehead .
leaves . With soft sn ores , Lucy began to drift

MIN ACTIVATIONS
MIN = -0.119

deep paws , to no avail . He finally thought of
him . But to no avail , they were too scared
times , but to no avail . Jane 's
out , but to no avail . He felt sad and
They have a lot in common . They become friends .
and said , " G reetings Sir . Do you speak
't judge someone by their appearance . Sometimes they are not
friendly voice . " G reetings Jack ! Are you here
friendly voice , " G reetings , little one . Why
gen ie . " G reetings , kind one !" said
spoke up . " G reetings , le opard ," it
No , that is too common ," Anna says . "
that it 's not the appearance of things that matter ,
meant something that was not common , something that was special
that they had more in common than they thought . They
" G reetings , brave rabbit . I
. It is a very common bird . It likes to
not judge others by their appearance . You should be kind
. They learn about different cultures and times . They have
't judge things by their appearance , be curious and respectful

INTERVAL 0.064 - 0.126
CONTAINS 0.125%

their mom said . " You are being too loud and
a big fight . " You can 't play with my
hello to Sally . Sur pr ised but excited , Sally
impressed and said , " You are a restless little musician
their mom calls . " You must be hungry and thirsty

INTERVAL 0.003 - 0.064
CONTAINS 41.919%

was so blue and the light was so bright . He
not notice the mouse and just kept walking . The mouse
a big tree and hid behind it . She hoped Tom
The people in the church were friendly and
arrived at a beautiful me adow . There was no fight

INTERVAL -0.058 - 0.003
CONTAINS 57.738%

Tom is curious . He wants to talk to the bald
jealous and said , " Can I grab it too ?"
they were proud of her and loved her very much .
him to close the door before he went to bed .
not normal . J ill said , " I

INTERVAL -0.119 - -0.058
CONTAINS 0.219%

, Mia , what 's wrong ?" Mom asks . She
at first , it 's hardest to take the first step
but it did not say anything . It just kept recording
dad told her a silly story about a friendly monster who
smiled and said , " Hello , kids . I see
Token.
Token ID13
Feature activation+1.92e-02

Pos logit contributions
 help+0.011
 Madd+0.010
f+0.009
 support+0.009
 happily+0.008
Neg logit contributions
 appearance-0.016
 ours-0.014
 yourselves-0.014
cks-0.013
 yours-0.013
Token She
Token ID1375
Feature activation+2.73e-02

Pos logit contributions
 help+0.036
 Madd+0.034
f+0.032
apy+0.030
 support+0.030
Neg logit contributions
 appearance-0.031
 ours-0.025
 yourselves-0.025
 yours-0.024
cks-0.023
Token grabbed
Token ID13176
Feature activation+8.45e-03

Pos logit contributions
 help+0.044
 Madd+0.042
f+0.038
apy+0.037
 support+0.036
Neg logit contributions
 appearance-0.048
 doesn-0.038
 ours-0.037
 yourselves-0.037
 habitats-0.036
Token the
Token ID262
Feature activation+9.54e-03

Pos logit contributions
appy+0.016
sy+0.015
f+0.015
mates+0.015
tted+0.015
Neg logit contributions
 appearance-0.009
 ours-0.009
 yours-0.009
 yourselves-0.009
 yourself-0.008
Token fire
Token ID2046
Feature activation+4.55e-02

Pos logit contributions
 Madd+0.017
f+0.016
appy+0.015
 help+0.015
Sure+0.015
Neg logit contributions
 appearance-0.015
 ours-0.013
 yourselves-0.012
 yours-0.012
 stealing-0.011
Token extingu
Token ID24995
Feature activation+1.10e-01

Pos logit contributions
 Madd+0.081
 help+0.077
apy+0.072
Sure+0.071
 oblig+0.071
Neg logit contributions
 appearance-0.068
 ours-0.052
 yourselves-0.051
 doesn-0.049
 yours-0.048
Tokenisher
Token ID4828
Feature activation+2.84e-02

Pos logit contributions
 help+0.155
 Madd+0.155
apy+0.133
 gard+0.126
sy+0.125
Neg logit contributions
 appearance-0.227
 ours-0.193
 yourselves-0.191
 yours-0.188
cks-0.174
Token and
Token ID290
Feature activation+1.03e-02

Pos logit contributions
 Madd+0.053
 help+0.053
apy+0.048
f+0.047
Sure+0.046
Neg logit contributions
 appearance-0.041
 ours-0.034
 yourselves-0.032
 yours-0.032
 yourself-0.030
Token put
Token ID1234
Feature activation-8.24e-03

Pos logit contributions
 Madd+0.015
 help+0.015
appy+0.015
f+0.014
 oblig+0.014
Neg logit contributions
 appearance-0.016
 yourselves-0.015
 ours-0.015
 doesn-0.014
 writes-0.013
Token out
Token ID503
Feature activation+3.86e-02

Pos logit contributions
 appearance+0.007
 yours+0.007
 yourselves+0.007
 ours+0.006
 yourself+0.006
Neg logit contributions
 Madd-0.018
 help-0.018
appy-0.018
 gard-0.017
f-0.017
Token the
Token ID262
Feature activation+8.83e-03

Pos logit contributions
 Madd+0.061
 help+0.060
f+0.058
appy+0.058
Carl+0.055
Neg logit contributions
 appearance-0.060
 yours-0.053
 yourselves-0.052
 ours-0.052
 yourself-0.049
Token Sally
Token ID25737
Feature activation-1.95e-02

Pos logit contributions
 help+0.032
 Madd+0.029
f+0.028
apy+0.027
sy+0.027
Neg logit contributions
 appearance-0.042
 ours-0.036
 yourselves-0.035
 yours-0.035
 yourself-0.033
Token was
Token ID373
Feature activation+3.45e-03

Pos logit contributions
 appearance+0.037
 ours+0.031
 yourselves+0.031
 yours+0.030
cks+0.029
Neg logit contributions
 help-0.031
 Madd-0.029
f-0.026
apy-0.025
 support-0.025
Token even
Token ID772
Feature activation+1.01e-02

Pos logit contributions
 help+0.005
 Madd+0.005
f+0.005
apy+0.004
 support+0.004
Neg logit contributions
 appearance-0.006
 ours-0.005
 yourselves-0.005
 yours-0.005
 yourself-0.005
Token more
Token ID517
Feature activation-2.11e-03

Pos logit contributions
 help+0.014
 Madd+0.014
f+0.013
apy+0.012
aroo+0.012
Neg logit contributions
 appearance-0.020
 ours-0.016
 yours-0.016
 yourselves-0.016
 yourself-0.015
Token "
Token ID366
Feature activation+2.81e-02

Pos logit contributions
 appearance+0.005
 ours+0.004
 yourselves+0.004
 yours+0.004
 yourself+0.004
Neg logit contributions
 help-0.003
 Madd-0.002
f-0.002
apy-0.002
 support-0.002
Tokenint
Token ID600
Feature activation+1.07e-01

Pos logit contributions
 help+0.043
 Madd+0.040
f+0.036
apy+0.034
 support+0.034
Neg logit contributions
 appearance-0.054
 yourselves-0.046
 ours-0.046
 yours-0.044
cks-0.042
Tokenelligent
Token ID32940
Feature activation-1.55e-02

Pos logit contributions
 help+0.149
 Madd+0.137
f+0.124
apy+0.116
 support+0.115
Neg logit contributions
 appearance-0.224
 ours-0.192
 yourselves-0.191
 yours-0.186
cks-0.179
Token"
Token ID1
Feature activation+4.36e-02

Pos logit contributions
 appearance+0.029
 ours+0.024
 yourselves+0.024
 yours+0.023
cks+0.023
Neg logit contributions
 help-0.026
 Madd-0.024
f-0.022
 support-0.021
apy-0.021
Token and
Token ID290
Feature activation+9.16e-03

Pos logit contributions
 help+0.067
 Madd+0.062
f+0.057
 support+0.053
apy+0.052
Neg logit contributions
 appearance-0.087
 ours-0.073
 yourselves-0.072
 yours-0.070
cks-0.070
Token careful
Token ID8161
Feature activation+3.64e-02

Pos logit contributions
 help+0.013
 Madd+0.012
f+0.011
apy+0.010
 support+0.010
Neg logit contributions
 appearance-0.019
 ours-0.016
 yourselves-0.016
 yours-0.016
cks-0.015
Token when
Token ID618
Feature activation+3.21e-02

Pos logit contributions
 help+0.061
 Madd+0.055
f+0.051
 support+0.048
apy+0.048
Neg logit contributions
 appearance-0.068
 ours-0.058
 yourselves-0.055
cks-0.054
 yours-0.054
Token the
Token ID262
Feature activation+1.37e-02

Pos logit contributions
 help+0.011
 Madd+0.009
 support+0.007
apy+0.007
 happily+0.007
Neg logit contributions
 appearance-0.021
 ours-0.018
cks-0.018
 habitats-0.018
 Sharing-0.018
Token apple
Token ID17180
Feature activation+1.58e-02

Pos logit contributions
 help+0.023
 Madd+0.021
f+0.019
apy+0.018
 support+0.018
Neg logit contributions
 appearance-0.025
 ours-0.021
 yourselves-0.021
 yours-0.020
cks-0.020
Token!
Token ID0
Feature activation+1.78e-02

Pos logit contributions
 help+0.022
 Madd+0.020
f+0.018
apy+0.017
 support+0.016
Neg logit contributions
 appearance-0.034
 ours-0.028
 yourselves-0.028
 yours-0.027
 habitats-0.027
Token He
Token ID679
Feature activation+4.03e-03

Pos logit contributions
 help+0.029
 Madd+0.028
f+0.025
apy+0.024
 support+0.024
Neg logit contributions
 appearance-0.033
 ours-0.028
 yourselves-0.027
 yours-0.026
cks-0.026
Token squ
Token ID2809
Feature activation+2.19e-02

Pos logit contributions
 help+0.006
 Madd+0.005
f+0.005
apy+0.005
 support+0.005
Neg logit contributions
 appearance-0.008
 ours-0.007
 yourselves-0.007
 yours-0.007
cks-0.007
Tokenint
Token ID600
Feature activation+1.07e-01

Pos logit contributions
 help+0.034
 Madd+0.030
f+0.027
 support+0.026
apy+0.025
Neg logit contributions
 appearance-0.044
 ours-0.037
 yourselves-0.037
cks-0.036
 yours-0.036
Tokened
Token ID276
Feature activation+3.02e-02

Pos logit contributions
 help+0.112
 Madd+0.094
 support+0.083
f+0.079
 believe+0.074
Neg logit contributions
 appearance-0.254
 yourselves-0.223
 ours-0.221
cks-0.219
 yours-0.217
Token his
Token ID465
Feature activation-7.85e-03

Pos logit contributions
 help+0.049
 Madd+0.046
f+0.044
apy+0.042
 support+0.042
Neg logit contributions
 appearance-0.051
 ours-0.042
 yours-0.041
 yourselves-0.041
cks-0.039
Token eyes
Token ID2951
Feature activation-8.51e-03

Pos logit contributions
 appearance+0.013
 ours+0.011
 yourselves+0.010
 yours+0.010
cks+0.010
Neg logit contributions
 help-0.014
 Madd-0.014
f-0.013
apy-0.012
 support-0.012
Token and
Token ID290
Feature activation-1.51e-03

Pos logit contributions
 appearance+0.016
 ours+0.013
 yourselves+0.013
 yours+0.013
cks+0.012
Neg logit contributions
 help-0.014
 Madd-0.013
f-0.012
apy-0.011
 support-0.011
Token saw
Token ID2497
Feature activation-8.44e-04

Pos logit contributions
 appearance+0.003
 ours+0.003
 yourselves+0.003
 yours+0.003
cks+0.003
Neg logit contributions
 help-0.002
 Madd-0.002
f-0.002
 support-0.002
apy-0.002
Token he
Token ID339
Feature activation-9.85e-03

Pos logit contributions
 help+0.052
 Madd+0.048
f+0.043
apy+0.041
 support+0.040
Neg logit contributions
 appearance-0.072
 ours-0.062
 yourselves-0.060
 yours-0.059
cks-0.058
Token found
Token ID1043
Feature activation+1.46e-02

Pos logit contributions
 appearance+0.021
 ours+0.020
 yourselves+0.019
 yours+0.019
cks+0.018
Neg logit contributions
 help-0.014
 Madd-0.012
f-0.011
 happily-0.011
 support-0.010
Token a
Token ID257
Feature activation+2.26e-02

Pos logit contributions
 help+0.025
 Madd+0.023
f+0.022
 support+0.021
apy+0.020
Neg logit contributions
 appearance-0.026
 ours-0.022
 yourselves-0.021
 yours-0.020
cks-0.020
Token magical
Token ID10883
Feature activation-1.47e-02

Pos logit contributions
 help+0.034
 Madd+0.029
f+0.026
 support+0.026
apy+0.025
Neg logit contributions
 appearance-0.047
 yourselves-0.041
 ours-0.040
cks-0.040
 yours-0.038
Token o
Token ID267
Feature activation+6.20e-02

Pos logit contributions
 appearance+0.030
 yourselves+0.026
 ours+0.026
cks+0.025
 yours+0.024
Neg logit contributions
 help-0.022
 Madd-0.020
f-0.017
 support-0.017
apy-0.016
Tokenint
Token ID600
Feature activation+1.07e-01

Pos logit contributions
 help+0.102
 Madd+0.092
f+0.087
 support+0.081
apy+0.080
Neg logit contributions
 appearance-0.118
 ours-0.101
 yourselves-0.098
 yours-0.095
cks-0.092
Tokenment
Token ID434
Feature activation+4.58e-02

Pos logit contributions
 help+0.111
 Madd+0.098
f+0.087
apy+0.079
 support+0.076
Neg logit contributions
 appearance-0.263
 ours-0.232
 yourselves-0.229
 yours-0.223
cks-0.217
Token.
Token ID13
Feature activation+1.61e-02

Pos logit contributions
 help+0.063
 Madd+0.054
f+0.053
 support+0.047
 happily+0.047
Neg logit contributions
 appearance-0.100
 yourselves-0.086
 ours-0.085
 yours-0.080
cks-0.080
Token He
Token ID679
Feature activation+3.94e-03

Pos logit contributions
 Madd+0.025
 help+0.025
f+0.020
 happily+0.020
apy+0.020
Neg logit contributions
 appearance-0.033
 ours-0.027
 yourselves-0.026
cks-0.026
 yours-0.025
Token touched
Token ID12615
Feature activation+4.00e-02

Pos logit contributions
 help+0.006
 Madd+0.005
f+0.005
apy+0.005
 support+0.005
Neg logit contributions
 appearance-0.008
 ours-0.007
 yourselves-0.007
 yours-0.006
cks-0.006
Token it
Token ID340
Feature activation+3.82e-03

Pos logit contributions
 help+0.067
 Madd+0.062
f+0.057
apy+0.054
 support+0.054
Neg logit contributions
 appearance-0.074
 ours-0.063
 yourselves-0.061
 yours-0.059
cks-0.057
Token that
Token ID326
Feature activation+7.99e-03

Pos logit contributions
 help+0.004
f+0.004
 Madd+0.004
apy+0.004
 lear+0.004
Neg logit contributions
 appearance-0.004
 yours-0.004
 ours-0.004
 yourselves-0.004
 yourself-0.004
Token the
Token ID262
Feature activation+1.20e-02

Pos logit contributions
 help+0.013
 Madd+0.012
f+0.011
apy+0.010
 support+0.010
Neg logit contributions
 appearance-0.015
 ours-0.013
 yourselves-0.013
 yours-0.012
cks-0.012
Token kitten
Token ID42143
Feature activation+3.77e-02

Pos logit contributions
 help+0.014
 Madd+0.013
f+0.012
apy+0.012
 support+0.010
Neg logit contributions
 appearance-0.027
 ours-0.023
 yourselves-0.023
 yours-0.023
 yourself-0.022
Token was
Token ID373
Feature activation+5.23e-03

Pos logit contributions
 help+0.058
 Madd+0.052
f+0.047
 support+0.045
 happily+0.044
Neg logit contributions
 appearance-0.075
 yourselves-0.065
 ours-0.065
 yours-0.062
cks-0.062
Token un
Token ID555
Feature activation+2.27e-02

Pos logit contributions
 help+0.008
 Madd+0.007
 happily+0.006
 support+0.006
sy+0.006
Neg logit contributions
 appearance-0.011
 yourselves-0.009
ulum-0.009
 habitats-0.009
 ours-0.009
Tokenhar
Token ID9869
Feature activation+1.06e-01

Pos logit contributions
 help+0.033
 Madd+0.032
apy+0.028
 gard+0.026
 lear+0.026
Neg logit contributions
 appearance-0.042
 yourselves-0.038
 ours-0.038
ind-0.036
 yours-0.035
Tokenmed
Token ID1150
Feature activation+2.92e-02

Pos logit contributions
 help+0.141
 Madd+0.128
f+0.122
apy+0.110
sy+0.109
Neg logit contributions
 appearance-0.229
 ours-0.201
 yourselves-0.194
 yours-0.191
cks-0.180
Token!
Token ID0
Feature activation+2.57e-02

Pos logit contributions
 help+0.047
 Madd+0.044
f+0.040
apy+0.038
 support+0.038
Neg logit contributions
 appearance-0.054
 ours-0.046
 yourselves-0.045
 yours-0.044
cks-0.042
Token The
Token ID383
Feature activation+3.70e-02

Pos logit contributions
 help+0.042
 Madd+0.040
f+0.036
apy+0.035
 support+0.034
Neg logit contributions
 appearance-0.048
 ours-0.040
 yourselves-0.040
 yours-0.038
cks-0.038
Token two
Token ID734
Feature activation+2.08e-03

Pos logit contributions
f+0.047
 help+0.044
 Madd+0.043
apy+0.042
 oblig+0.040
Neg logit contributions
 appearance-0.076
 ours-0.064
 yourselves-0.063
 yours-0.061
 stealing-0.061
Token hugged
Token ID41130
Feature activation+1.08e-02

Pos logit contributions
 Madd+0.003
f+0.003
 help+0.003
 oblig+0.002
 lear+0.002
Neg logit contributions
 appearance-0.004
 ours-0.003
 strangers-0.003
 stealing-0.003
 sharing-0.003
Token that
Token ID326
Feature activation+7.99e-03

Pos logit contributions
 help+0.004
f+0.004
 Madd+0.004
apy+0.004
 lear+0.004
Neg logit contributions
 appearance-0.004
 yours-0.004
 ours-0.004
 yourselves-0.004
 yourself-0.004
Token the
Token ID262
Feature activation+1.20e-02

Pos logit contributions
 help+0.013
 Madd+0.012
f+0.011
apy+0.010
 support+0.010
Neg logit contributions
 appearance-0.015
 ours-0.013
 yourselves-0.013
 yours-0.012
cks-0.012
Token kitten
Token ID42143
Feature activation+3.77e-02

Pos logit contributions
 help+0.014
 Madd+0.013
f+0.012
apy+0.012
 support+0.010
Neg logit contributions
 appearance-0.027
 ours-0.023
 yourselves-0.023
 yours-0.023
 yourself-0.022
Token was
Token ID373
Feature activation+5.23e-03

Pos logit contributions
 help+0.058
 Madd+0.052
f+0.047
 support+0.045
 happily+0.044
Neg logit contributions
 appearance-0.075
 yourselves-0.065
 ours-0.065
 yours-0.062
cks-0.062
Token un
Token ID555
Feature activation+2.27e-02

Pos logit contributions
 help+0.008
 Madd+0.007
 happily+0.006
 support+0.006
sy+0.006
Neg logit contributions
 appearance-0.011
 yourselves-0.009
ulum-0.009
 habitats-0.009
 ours-0.009
Tokenhar
Token ID9869
Feature activation+1.06e-01

Pos logit contributions
 help+0.033
 Madd+0.032
apy+0.028
 gard+0.026
 lear+0.026
Neg logit contributions
 appearance-0.042
 yourselves-0.038
 ours-0.038
ind-0.036
 yours-0.035
Tokenmed
Token ID1150
Feature activation+2.92e-02

Pos logit contributions
 help+0.141
 Madd+0.128
f+0.122
apy+0.110
sy+0.109
Neg logit contributions
 appearance-0.229
 ours-0.201
 yourselves-0.194
 yours-0.191
cks-0.180
Token!
Token ID0
Feature activation+2.57e-02

Pos logit contributions
 help+0.047
 Madd+0.044
f+0.040
apy+0.038
 support+0.038
Neg logit contributions
 appearance-0.054
 ours-0.046
 yourselves-0.045
 yours-0.044
cks-0.042
Token The
Token ID383
Feature activation+3.70e-02

Pos logit contributions
 help+0.042
 Madd+0.040
f+0.036
apy+0.035
 support+0.034
Neg logit contributions
 appearance-0.048
 ours-0.040
 yourselves-0.040
 yours-0.038
cks-0.038
Token two
Token ID734
Feature activation+2.08e-03

Pos logit contributions
f+0.047
 help+0.044
 Madd+0.043
apy+0.042
 oblig+0.040
Neg logit contributions
 appearance-0.076
 ours-0.064
 yourselves-0.063
 yours-0.061
 stealing-0.061
Token hugged
Token ID41130
Feature activation+1.08e-02

Pos logit contributions
 Madd+0.003
f+0.003
 help+0.003
 oblig+0.002
 lear+0.002
Neg logit contributions
 appearance-0.004
 ours-0.003
 strangers-0.003
 stealing-0.003
 sharing-0.003
Token sky
Token ID6766
Feature activation+2.62e-02

Pos logit contributions
 help+0.035
 Madd+0.033
f+0.030
apy+0.030
 support+0.029
Neg logit contributions
 appearance-0.030
 ours-0.024
 yourselves-0.024
 yours-0.023
cks-0.022
Token.
Token ID13
Feature activation+2.13e-02

Pos logit contributions
 help+0.038
 Madd+0.034
f+0.031
 support+0.029
apy+0.029
Neg logit contributions
 appearance-0.054
 ours-0.047
 yourselves-0.047
 yours-0.045
cks-0.044
Token The
Token ID383
Feature activation+3.79e-02

Pos logit contributions
 help+0.034
 Madd+0.031
f+0.029
 support+0.028
apy+0.028
Neg logit contributions
 appearance-0.040
 ours-0.034
 yourselves-0.033
 yours-0.032
cks-0.031
Token sun
Token ID4252
Feature activation+3.59e-02

Pos logit contributions
 Madd+0.058
apy+0.057
 help+0.057
f+0.052
 oblig+0.050
Neg logit contributions
 appearance-0.069
 ours-0.055
 habitats-0.054
 yourselves-0.053
 yours-0.052
Token gl
Token ID1278
Feature activation+1.40e-02

Pos logit contributions
 help+0.056
 Madd+0.051
f+0.047
 support+0.045
apy+0.043
Neg logit contributions
 appearance-0.068
 yourselves-0.056
 ours-0.056
 yours-0.054
cks-0.054
Tokenint
Token ID600
Feature activation+1.05e-01

Pos logit contributions
 help+0.020
 Madd+0.018
f+0.016
 support+0.015
apy+0.015
Neg logit contributions
 appearance-0.029
 yourselves-0.025
 ours-0.025
 yours-0.024
cks-0.024
Tokened
Token ID276
Feature activation+2.85e-02

Pos logit contributions
 help+0.132
 Madd+0.118
f+0.105
 support+0.100
apy+0.095
Neg logit contributions
 appearance-0.231
 yourselves-0.199
 ours-0.199
 yours-0.193
cks-0.191
Token brightly
Token ID35254
Feature activation+2.60e-02

Pos logit contributions
 help+0.043
 Madd+0.039
f+0.036
apy+0.034
 support+0.034
Neg logit contributions
 appearance-0.056
 ours-0.047
 yourselves-0.046
 yours-0.045
cks-0.043
Token off
Token ID572
Feature activation+3.78e-04

Pos logit contributions
 help+0.038
 Madd+0.034
f+0.031
apy+0.030
 support+0.029
Neg logit contributions
 appearance-0.053
 ours-0.045
 yourselves-0.045
 yours-0.044
cks-0.043
Token the
Token ID262
Feature activation+1.47e-02

Pos logit contributions
 help+0.001
 Madd+0.000
f+0.000
 support+0.000
apy+0.000
Neg logit contributions
 appearance-0.001
 ours-0.001
 yourselves-0.001
 yours-0.001
cks-0.001
Token elephant
Token ID20950
Feature activation+2.91e-02

Pos logit contributions
 help+0.027
 Madd+0.026
apy+0.024
f+0.023
 support+0.022
Neg logit contributions
 appearance-0.024
 ours-0.019
 yourselves-0.019
 yours-0.018
 habitats-0.018
Token\n
Token ID198
Feature activation+7.62e-03

Pos logit contributions
 help+0.009
 Madd+0.008
f+0.008
apy+0.007
 support+0.007
Neg logit contributions
 appearance-0.014
 ours-0.012
 yours-0.011
 yourselves-0.011
cks-0.011
Token\n
Token ID198
Feature activation+6.75e-03

Pos logit contributions
 help+0.012
 Madd+0.011
f+0.010
apy+0.010
 support+0.010
Neg logit contributions
 appearance-0.014
 ours-0.012
 yourselves-0.012
 yours-0.012
cks-0.011
TokenJess
Token ID34648
Feature activation+3.59e-03

Pos logit contributions
 help+0.010
 Madd+0.009
f+0.009
 support+0.008
apy+0.008
Neg logit contributions
 appearance-0.012
 yourselves-0.011
 ours-0.011
 yourself-0.011
 yours-0.010
Tokenie
Token ID494
Feature activation-5.77e-03

Pos logit contributions
 help+0.006
 Madd+0.005
f+0.005
apy+0.005
 support+0.005
Neg logit contributions
 appearance-0.007
 ours-0.006
 yourselves-0.006
 yours-0.006
cks-0.005
Token squ
Token ID2809
Feature activation+1.96e-02

Pos logit contributions
 appearance+0.011
 yourselves+0.009
 ours+0.009
 yours+0.009
cks+0.009
Neg logit contributions
 help-0.009
 Madd-0.008
f-0.007
apy-0.007
 support-0.007
Tokenint
Token ID600
Feature activation+1.04e-01

Pos logit contributions
 help+0.027
 Madd+0.023
 support+0.020
f+0.020
apy+0.019
Neg logit contributions
 appearance-0.042
 yourselves-0.036
 ours-0.036
 yours-0.035
cks-0.035
Tokened
Token ID276
Feature activation+2.26e-02

Pos logit contributions
 help+0.122
 Madd+0.098
 support+0.094
 believe+0.086
f+0.083
Neg logit contributions
 appearance-0.228
 yourselves-0.201
cks-0.200
 yours-0.196
 yourself-0.193
Token,
Token ID11
Feature activation+9.25e-03

Pos logit contributions
 help+0.034
 Madd+0.031
f+0.030
 support+0.029
apy+0.028
Neg logit contributions
 appearance-0.041
 yourselves-0.033
 yours-0.033
 ours-0.033
 yourself-0.032
Token and
Token ID290
Feature activation+1.22e-03

Pos logit contributions
 help+0.013
 Madd+0.012
f+0.010
apy+0.010
 happily+0.010
Neg logit contributions
 appearance-0.020
 ours-0.017
 yourselves-0.017
 yours-0.016
 habitats-0.016
Token saw
Token ID2497
Feature activation+8.77e-04

Pos logit contributions
 help+0.002
 Madd+0.002
f+0.001
 support+0.001
apy+0.001
Neg logit contributions
 appearance-0.002
 ours-0.002
 yourselves-0.002
 yours-0.002
 yourself-0.002
Token a
Token ID257
Feature activation+2.28e-02

Pos logit contributions
 help+0.001
 Madd+0.001
f+0.001
apy+0.001
 support+0.001
Neg logit contributions
 appearance-0.002
 ours-0.002
 yourselves-0.002
 yours-0.001
cks-0.001
Token quickly
Token ID2952
Feature activation+4.38e-03

Pos logit contributions
 appearance+0.022
 yourselves+0.017
 ours+0.017
cks+0.017
 doesn+0.017
Neg logit contributions
 help-0.017
 Madd-0.017
f-0.015
apy-0.014
 support-0.014
Token checked
Token ID10667
Feature activation+4.52e-02

Pos logit contributions
 Madd+0.007
 help+0.006
aroo+0.006
f+0.006
Sure+0.006
Neg logit contributions
 appearance-0.007
 ours-0.006
cks-0.006
 stealing-0.006
 yourselves-0.006
Token for
Token ID329
Feature activation+1.70e-02

Pos logit contributions
 help+0.071
 Madd+0.070
f+0.063
apy+0.062
sy+0.060
Neg logit contributions
 appearance-0.082
 yourselves-0.071
 ours-0.071
 yours-0.069
cks-0.065
Token a
Token ID257
Feature activation+2.88e-02

Pos logit contributions
 Madd+0.024
f+0.022
 help+0.021
sy+0.021
apy+0.020
Neg logit contributions
 appearance-0.034
 yourselves-0.029
 ours-0.029
 yours-0.029
 stealing-0.028
Token fire
Token ID2046
Feature activation+3.91e-02

Pos logit contributions
 Madd+0.042
f+0.037
 help+0.037
apped+0.036
apy+0.035
Neg logit contributions
 appearance-0.054
 ours-0.046
 yours-0.044
 stealing-0.043
 mine-0.041
Token extingu
Token ID24995
Feature activation+1.04e-01

Pos logit contributions
 Madd+0.062
 help+0.061
apy+0.056
f+0.055
Sure+0.052
Neg logit contributions
 appearance-0.071
 ours-0.057
 yours-0.056
 yourselves-0.056
 habitats-0.053
Tokenisher
Token ID4828
Feature activation+2.55e-02

Pos logit contributions
 Madd+0.136
 help+0.132
apy+0.118
sy+0.108
f+0.108
Neg logit contributions
 appearance-0.226
 ours-0.196
 yours-0.191
 yourselves-0.188
cks-0.176
Token,
Token ID11
Feature activation+1.17e-02

Pos logit contributions
 help+0.041
 Madd+0.038
f+0.035
apy+0.033
 support+0.033
Neg logit contributions
 appearance-0.048
 ours-0.040
 yourselves-0.040
 yours-0.039
cks-0.037
Token but
Token ID475
Feature activation+9.01e-03

Pos logit contributions
 help+0.018
 Madd+0.016
 support+0.014
f+0.013
 happily+0.013
Neg logit contributions
 appearance-0.024
 ours-0.020
 yourselves-0.020
 habitats-0.020
 yours-0.019
Token there
Token ID612
Feature activation+1.06e-02

Pos logit contributions
 help+0.015
 Madd+0.014
f+0.013
 support+0.012
apy+0.012
Neg logit contributions
 appearance-0.016
 ours-0.014
 yourselves-0.014
 yours-0.013
 stealing-0.012
Token wasn
Token ID2492
Feature activation+1.82e-02

Pos logit contributions
 help+0.017
 Madd+0.016
f+0.014
 support+0.014
apy+0.014
Neg logit contributions
 appearance-0.020
 ours-0.017
 yourselves-0.017
 yours-0.016
cks-0.016
Token the
Token ID262
Feature activation+1.78e-02

Pos logit contributions
 help+0.021
 Madd+0.018
 support+0.015
f+0.015
apy+0.015
Neg logit contributions
 appearance-0.038
 ours-0.033
 yourselves-0.032
 yours-0.031
 habitats-0.031
Token distance
Token ID5253
Feature activation-1.84e-02

Pos logit contributions
 help+0.030
 Madd+0.027
f+0.025
 support+0.024
apy+0.023
Neg logit contributions
 appearance-0.033
 ours-0.027
 yourselves-0.027
 yours-0.026
cks-0.026
Token.
Token ID13
Feature activation+2.41e-02

Pos logit contributions
 appearance+0.043
 ours+0.037
 yourselves+0.036
cks+0.036
 habitats+0.035
Neg logit contributions
 help-0.022
 Madd-0.019
f-0.017
 happily-0.016
apy-0.015
Token He
Token ID679
Feature activation+2.11e-03

Pos logit contributions
 help+0.036
 Madd+0.036
f+0.031
apy+0.030
 happily+0.029
Neg logit contributions
 appearance-0.049
 ours-0.040
 yourselves-0.039
cks-0.039
 yours-0.038
Token squ
Token ID2809
Feature activation+2.49e-02

Pos logit contributions
 help+0.003
 Madd+0.002
f+0.002
 happily+0.002
 support+0.002
Neg logit contributions
 appearance-0.005
 ours-0.004
 yourselves-0.004
 yours-0.004
cks-0.004
Tokenint
Token ID600
Feature activation+1.04e-01

Pos logit contributions
 help+0.039
 Madd+0.036
f+0.032
 support+0.031
apy+0.031
Neg logit contributions
 appearance-0.048
 ours-0.041
 yourselves-0.041
 yours-0.039
cks-0.039
Tokened
Token ID276
Feature activation+2.90e-02

Pos logit contributions
 help+0.098
 Madd+0.083
f+0.068
 support+0.068
 believe+0.058
Neg logit contributions
 appearance-0.260
 yourselves-0.230
 ours-0.228
 yours-0.224
cks-0.224
Token and
Token ID290
Feature activation-3.06e-03

Pos logit contributions
 help+0.045
 Madd+0.041
f+0.039
apy+0.037
 support+0.037
Neg logit contributions
 appearance-0.054
 ours-0.045
 yourselves-0.045
 yours-0.044
cks-0.042
Token said
Token ID531
Feature activation+5.34e-03

Pos logit contributions
 appearance+0.006
 ours+0.005
 yourselves+0.005
 yours+0.005
cks+0.005
Neg logit contributions
 help-0.004
 Madd-0.004
f-0.004
 support-0.003
apy-0.003
Token to
Token ID284
Feature activation-5.86e-03

Pos logit contributions
 help+0.007
 Madd+0.006
f+0.006
 support+0.005
apy+0.005
Neg logit contributions
 appearance-0.011
 yourselves-0.010
 ours-0.010
 yours-0.010
cks-0.009
Token Sarah
Token ID10490
Feature activation-1.52e-02

Pos logit contributions
 appearance+0.013
 ours+0.011
 yourselves+0.010
cks+0.010
 yours+0.010
Neg logit contributions
 help-0.009
 Madd-0.008
 support-0.006
f-0.006
 happily-0.006
Token.
Token ID13
Feature activation+8.06e-03

Pos logit contributions
 help+0.002
 Madd+0.002
f+0.002
apy+0.002
 support+0.002
Neg logit contributions
 appearance-0.003
 ours-0.002
 yourselves-0.002
 yours-0.002
cks-0.002
Token He
Token ID679
Feature activation+7.49e-03

Pos logit contributions
 help+0.013
 Madd+0.011
f+0.010
 support+0.010
Sure+0.010
Neg logit contributions
 appearance-0.016
 yourselves-0.013
 ours-0.013
 yours-0.013
cks-0.013
Token found
Token ID1043
Feature activation+8.46e-03

Pos logit contributions
 help+0.012
 Madd+0.011
f+0.010
apy+0.009
 support+0.009
Neg logit contributions
 appearance-0.014
 ours-0.012
 yourselves-0.012
 yours-0.012
cks-0.011
Token an
Token ID281
Feature activation+1.08e-02

Pos logit contributions
 help+0.012
 Madd+0.012
f+0.011
apy+0.010
sy+0.010
Neg logit contributions
 appearance-0.016
 ours-0.014
 yourselves-0.014
 yours-0.014
cks-0.013
Token ash
Token ID12530
Feature activation+5.06e-02

Pos logit contributions
 help+0.018
 Madd+0.018
f+0.017
apy+0.016
 support+0.015
Neg logit contributions
 appearance-0.019
 ours-0.016
 yourselves-0.015
 yours-0.015
 yourself-0.014
Tokentr
Token ID2213
Feature activation+1.02e-01

Pos logit contributions
 help+0.079
 Madd+0.071
f+0.066
 support+0.062
apy+0.062
Neg logit contributions
 appearance-0.099
 ours-0.085
 yourselves-0.084
 yours-0.081
cks-0.078
Tokenay
Token ID323
Feature activation-9.34e-03

Pos logit contributions
 help+0.155
 Madd+0.143
f+0.131
apy+0.126
 support+0.122
Neg logit contributions
 appearance-0.205
 ours-0.172
 yourselves-0.171
 yours-0.164
cks-0.161
Token and
Token ID290
Feature activation-1.37e-03

Pos logit contributions
 appearance+0.022
 yourselves+0.020
 ours+0.019
 habitats+0.019
cks+0.018
Neg logit contributions
 help-0.011
f-0.009
 Madd-0.009
 support-0.008
 happily-0.008
Token started
Token ID2067
Feature activation+3.19e-02

Pos logit contributions
 appearance+0.003
 ours+0.002
 yourselves+0.002
 yours+0.002
cks+0.002
Neg logit contributions
 help-0.002
 Madd-0.002
f-0.002
 support-0.002
apy-0.002
Token to
Token ID284
Feature activation-1.36e-03

Pos logit contributions
 help+0.050
 Madd+0.044
f+0.041
apy+0.039
 support+0.039
Neg logit contributions
 appearance-0.063
 ours-0.055
 yourselves-0.054
 yours-0.052
cks-0.050
Token jugg
Token ID24873
Feature activation-2.39e-02

Pos logit contributions
 appearance+0.003
 ours+0.002
 yourselves+0.002
 yours+0.002
cks+0.002
Neg logit contributions
 help-0.002
 Madd-0.002
f-0.002
apy-0.002
 support-0.002
Token the
Token ID262
Feature activation+2.54e-02

Pos logit contributions
 help+0.029
 Madd+0.027
f+0.025
apy+0.023
 support+0.023
Neg logit contributions
 appearance-0.042
 ours-0.037
 yourselves-0.036
 yours-0.035
cks-0.034
Token honey
Token ID12498
Feature activation+1.25e-02

Pos logit contributions
 Madd+0.037
 help+0.036
f+0.033
apy+0.033
Sure+0.030
Neg logit contributions
 appearance-0.049
 ours-0.041
 yourselves-0.040
 yours-0.040
 yourself-0.037
Token from
Token ID422
Feature activation+4.02e-02

Pos logit contributions
 help+0.020
 Madd+0.018
f+0.017
apy+0.016
 support+0.016
Neg logit contributions
 appearance-0.023
 ours-0.020
 yourselves-0.019
 yours-0.019
cks-0.018
Token the
Token ID262
Feature activation+2.04e-02

Pos logit contributions
 help+0.055
 Madd+0.046
 support+0.040
f+0.039
 happily+0.039
Neg logit contributions
 appearance-0.090
 ours-0.076
 yourselves-0.075
 habitats-0.073
 Sharing-0.072
Token be
Token ID307
Feature activation+3.72e-02

Pos logit contributions
 help+0.033
 Madd+0.030
f+0.028
apy+0.027
 support+0.026
Neg logit contributions
 appearance-0.039
 ours-0.032
 yourselves-0.032
 yours-0.031
cks-0.029
Tokeneh
Token ID17231
Feature activation+1.02e-01

Pos logit contributions
 help+0.063
 Madd+0.058
f+0.054
apy+0.051
 support+0.051
Neg logit contributions
 appearance-0.067
 ours-0.056
 yourselves-0.056
 yours-0.054
cks-0.052
Tokenive
Token ID425
Feature activation+3.15e-02

Pos logit contributions
 Madd+0.121
 help+0.113
apy+0.112
 happily+0.101
 accomplishment+0.094
Neg logit contributions
 appearance-0.212
 yourselves-0.179
 yourself-0.177
 yours-0.175
 ours-0.174
Token.
Token ID13
Feature activation+2.77e-02

Pos logit contributions
 help+0.048
 Madd+0.044
f+0.041
apy+0.038
 support+0.038
Neg logit contributions
 appearance-0.062
 ours-0.053
 yourselves-0.052
 yours-0.051
cks-0.049
Token\n
Token ID198
Feature activation+1.97e-02

Pos logit contributions
 help+0.045
 Madd+0.041
f+0.038
apy+0.036
 support+0.036
Neg logit contributions
 appearance-0.052
 ours-0.044
 yourselves-0.043
 yours-0.042
cks-0.040
Token\n
Token ID198
Feature activation+1.77e-02

Pos logit contributions
 help+0.033
 Madd+0.030
f+0.028
 support+0.026
apy+0.026
Neg logit contributions
 appearance-0.035
 yourselves-0.030
 ours-0.030
 yours-0.028
cks-0.028
TokenFinally
Token ID11158
Feature activation+3.93e-02

Pos logit contributions
 help+0.028
 Madd+0.025
f+0.023
 support+0.023
apy+0.022
Neg logit contributions
 appearance-0.031
 yourselves-0.028
 ours-0.027
 yourself-0.026
 yours-0.026
Token.
Token ID13
Feature activation+9.12e-03

Pos logit contributions
 appearance+0.084
 ours+0.072
 yourselves+0.072
 yours+0.068
cks+0.065
Neg logit contributions
 help-0.055
 Madd-0.048
f-0.045
 support-0.041
aroo-0.040
Token He
Token ID679
Feature activation+1.15e-02

Pos logit contributions
 help+0.015
 Madd+0.013
f+0.012
 support+0.012
 happily+0.012
Neg logit contributions
 appearance-0.017
 ours-0.015
 yourselves-0.015
cks-0.014
 yours-0.014
Token had
Token ID550
Feature activation-1.60e-02

Pos logit contributions
 help+0.017
 Madd+0.015
f+0.015
 support+0.014
apy+0.013
Neg logit contributions
 appearance-0.023
 ours-0.020
 yourselves-0.020
 yours-0.019
cks-0.018
Token an
Token ID281
Feature activation+7.08e-03

Pos logit contributions
 appearance+0.041
 ours+0.036
 yourselves+0.035
cks+0.034
ulum+0.034
Neg logit contributions
 help-0.017
 Madd-0.012
 support-0.011
apy-0.010
f-0.010
Token ash
Token ID12530
Feature activation+5.02e-02

Pos logit contributions
 Madd+0.012
 help+0.012
f+0.011
apy+0.011
Sure+0.010
Neg logit contributions
 appearance-0.012
 ours-0.011
 yours-0.010
 yourselves-0.010
cks-0.009
Tokentr
Token ID2213
Feature activation+1.02e-01

Pos logit contributions
 help+0.088
 Madd+0.083
f+0.077
apy+0.073
 support+0.073
Neg logit contributions
 appearance-0.087
 ours-0.071
 yourselves-0.071
 yours-0.068
cks-0.066
Tokenay
Token ID323
Feature activation-9.71e-03

Pos logit contributions
 help+0.152
 Madd+0.140
f+0.128
apy+0.124
 support+0.119
Neg logit contributions
 appearance-0.207
 ours-0.174
 yourselves-0.173
 yours-0.166
cks-0.163
Token that
Token ID326
Feature activation-3.67e-05

Pos logit contributions
 appearance+0.021
 ours+0.018
 yourselves+0.018
cks+0.017
 habitats+0.017
Neg logit contributions
 help-0.014
 Madd-0.011
f-0.011
 support-0.010
apy-0.010
Token could
Token ID714
Feature activation+3.85e-03

Pos logit contributions
 appearance+0.000
 ours+0.000
 yourselves+0.000
 yours+0.000
cks+0.000
Neg logit contributions
 help-0.000
 Madd-0.000
f-0.000
 support-0.000
apy-0.000
Token zip
Token ID19974
Feature activation-2.06e-02

Pos logit contributions
 help+0.005
 Madd+0.005
f+0.004
 support+0.004
apy+0.004
Neg logit contributions
 appearance-0.008
 ours-0.007
 yourselves-0.007
 yours-0.007
 habitats-0.007
Token around
Token ID1088
Feature activation+2.94e-02

Pos logit contributions
 appearance+0.049
 yourselves+0.044
 ours+0.042
 habitats+0.042
 Sharing+0.042
Neg logit contributions
 help-0.024
f-0.018
 support-0.016
sy-0.016
 Madd-0.016
Token cigarette
Token ID17779
Feature activation+3.93e-02

Pos logit contributions
 help+0.033
 Madd+0.029
apy+0.026
f+0.026
 support+0.025
Neg logit contributions
 appearance-0.046
 yourselves-0.039
 ours-0.039
cks-0.038
 yours-0.038
Token and
Token ID290
Feature activation-3.34e-03

Pos logit contributions
 help+0.049
 Madd+0.038
apy+0.037
f+0.036
sy+0.032
Neg logit contributions
 appearance-0.091
 ours-0.082
 yours-0.079
 habitats-0.076
 yourselves-0.076
Token using
Token ID1262
Feature activation-6.57e-03

Pos logit contributions
 appearance+0.007
 ours+0.006
 yours+0.006
 yourselves+0.006
 habitats+0.006
Neg logit contributions
 help-0.005
 Madd-0.004
apy-0.004
 support-0.004
f-0.004
Token an
Token ID281
Feature activation+8.22e-03

Pos logit contributions
 appearance+0.015
 yourselves+0.012
 ours+0.012
 habitats+0.012
cks+0.012
Neg logit contributions
 help-0.009
 Madd-0.008
apy-0.007
 support-0.007
f-0.007
Token ash
Token ID12530
Feature activation+5.03e-02

Pos logit contributions
 help+0.015
 Madd+0.014
f+0.013
apy+0.012
 support+0.012
Neg logit contributions
 appearance-0.014
 ours-0.011
 yourselves-0.011
 yours-0.011
cks-0.010
Tokentr
Token ID2213
Feature activation+1.02e-01

Pos logit contributions
 Madd+0.093
 help+0.093
f+0.085
 support+0.081
apy+0.081
Neg logit contributions
 appearance-0.077
 ours-0.061
 yourselves-0.061
 yours-0.059
cks-0.056
Tokenay
Token ID323
Feature activation-9.89e-03

Pos logit contributions
 help+0.154
 Madd+0.141
f+0.130
apy+0.125
 support+0.120
Neg logit contributions
 appearance-0.204
 ours-0.171
 yourselves-0.170
 yours-0.163
 habitats-0.160
Token.
Token ID13
Feature activation-2.19e-03

Pos logit contributions
 appearance+0.020
 ours+0.017
 yourselves+0.017
 yours+0.016
cks+0.016
Neg logit contributions
 help-0.015
 Madd-0.013
f-0.013
apy-0.012
 support-0.012
Token\n
Token ID198
Feature activation-1.96e-03

Pos logit contributions
 appearance+0.006
cks+0.005
 ours+0.005
 yourselves+0.005
 habitats+0.005
Neg logit contributions
 Madd-0.002
 help-0.002
 happily-0.002
Sure-0.002
 Fr-0.002
Token\n
Token ID198
Feature activation-9.35e-04

Pos logit contributions
 appearance+0.005
 ours+0.004
 yourselves+0.004
 habitats+0.004
 yours+0.004
Neg logit contributions
 help-0.002
 Madd-0.002
 happily-0.002
f-0.002
Sure-0.002
TokenL
Token ID43
Feature activation+3.23e-02

Pos logit contributions
 appearance+0.002
 ours+0.002
 yourselves+0.002
 yours+0.002
cks+0.002
Neg logit contributions
 help-0.001
 Madd-0.001
f-0.001
Sure-0.001
apy-0.001
Token.
Token ID13
Feature activation+6.99e-03

Pos logit contributions
 help+0.073
 Madd+0.063
f+0.059
 support+0.055
 happily+0.055
Neg logit contributions
 appearance-0.110
 ours-0.096
 yourselves-0.093
 yours-0.090
cks-0.090
Token Everyone
Token ID11075
Feature activation+2.91e-02

Pos logit contributions
 help+0.011
 Madd+0.011
f+0.009
apy+0.009
 happily+0.009
Neg logit contributions
 appearance-0.014
 ours-0.012
 yourselves-0.012
cks-0.011
 yours-0.011
Token in
Token ID287
Feature activation+1.78e-02

Pos logit contributions
 help+0.044
 Madd+0.040
f+0.037
apy+0.035
 support+0.035
Neg logit contributions
 appearance-0.058
 ours-0.050
 yourselves-0.049
 yours-0.047
cks-0.046
Token the
Token ID262
Feature activation+1.43e-02

Pos logit contributions
 help+0.022
 Madd+0.019
 support+0.016
f+0.016
 happily+0.015
Neg logit contributions
 appearance-0.041
 ours-0.037
 yourselves-0.036
 yours-0.035
cks-0.034
Token be
Token ID307
Feature activation+3.20e-02

Pos logit contributions
 help+0.025
 Madd+0.023
f+0.022
apy+0.021
 support+0.020
Neg logit contributions
 appearance-0.025
 ours-0.021
 yourselves-0.020
 yours-0.020
cks-0.019
Tokeneh
Token ID17231
Feature activation+1.01e-01

Pos logit contributions
 help+0.044
 Madd+0.040
f+0.037
 support+0.034
apy+0.034
Neg logit contributions
 appearance-0.066
 ours-0.057
 yourselves-0.057
 yours-0.056
cks-0.055
Tokenive
Token ID425
Feature activation+2.80e-02

Pos logit contributions
 help+0.125
 Madd+0.113
apy+0.107
 happily+0.101
 accomplishment+0.101
Neg logit contributions
 appearance-0.198
 yourselves-0.173
 habitats-0.172
 yourself-0.171
opl-0.169
Token was
Token ID373
Feature activation+6.33e-04

Pos logit contributions
 help+0.040
 Madd+0.037
f+0.033
apy+0.031
 support+0.031
Neg logit contributions
 appearance-0.059
 ours-0.051
 yourselves-0.049
 yours-0.049
cks-0.046
Token so
Token ID523
Feature activation-2.04e-02

Pos logit contributions
 help+0.001
 Madd+0.001
f+0.001
 support+0.001
apy+0.001
Neg logit contributions
 appearance-0.001
 yourselves-0.001
 ours-0.001
 yours-0.001
 habitats-0.001
Token happy
Token ID3772
Feature activation+2.69e-02

Pos logit contributions
 appearance+0.045
 ours+0.039
 yourselves+0.038
 yours+0.037
cks+0.036
Neg logit contributions
 help-0.027
 Madd-0.024
f-0.022
apy-0.020
 support-0.020
Token for
Token ID329
Feature activation+2.18e-02

Pos logit contributions
 help+0.039
 Madd+0.035
f+0.031
 happily+0.029
 support+0.029
Neg logit contributions
 appearance-0.057
 ours-0.049
 yourselves-0.048
cks-0.047
 yours-0.045
Token still
Token ID991
Feature activation+1.22e-02

Pos logit contributions
 appearance+0.029
 ours+0.025
 yourselves+0.025
 yours+0.024
cks+0.023
Neg logit contributions
 help-0.022
 Madd-0.020
f-0.018
 support-0.017
apy-0.017
Token be
Token ID307
Feature activation+3.02e-02

Pos logit contributions
 help+0.018
 Madd+0.016
f+0.015
 support+0.014
apy+0.014
Neg logit contributions
 appearance-0.025
 ours-0.021
 yourselves-0.021
 yours-0.020
cks-0.020
Token healthy
Token ID5448
Feature activation-2.24e-02

Pos logit contributions
 help+0.044
 Madd+0.039
f+0.035
 support+0.034
apy+0.033
Neg logit contributions
 appearance-0.063
 yourselves-0.053
 ours-0.053
 yours-0.051
cks-0.050
Token and
Token ID290
Feature activation+1.09e-02

Pos logit contributions
 appearance+0.045
 yourselves+0.039
 ours+0.039
cks+0.038
 yours+0.037
Neg logit contributions
 help-0.034
 Madd-0.030
f-0.027
 support-0.026
 happily-0.026
Token un
Token ID555
Feature activation+2.01e-02

Pos logit contributions
 help+0.014
 Madd+0.012
f+0.011
 support+0.011
apy+0.010
Neg logit contributions
 appearance-0.025
 yourselves-0.021
 ours-0.021
cks-0.020
 yours-0.020
Tokenhar
Token ID9869
Feature activation+1.01e-01

Pos logit contributions
 help+0.031
 Madd+0.030
 gard+0.025
apy+0.025
 lear+0.025
Neg logit contributions
 appearance-0.037
 yourselves-0.033
 ours-0.032
cks-0.031
 yours-0.031
Tokenmed
Token ID1150
Feature activation+2.47e-02

Pos logit contributions
 help+0.129
 Madd+0.116
f+0.114
apy+0.102
sy+0.098
Neg logit contributions
 appearance-0.225
 ours-0.197
 yourselves-0.191
 yours-0.185
 habitats-0.178
Token.
Token ID13
Feature activation+2.57e-02

Pos logit contributions
 help+0.041
 Madd+0.039
f+0.035
apy+0.034
 support+0.034
Neg logit contributions
 appearance-0.043
 ours-0.036
 yourselves-0.036
 yours-0.035
 yourself-0.033
Token Suddenly
Token ID24975
Feature activation+8.74e-03

Pos logit contributions
 help+0.032
 Madd+0.030
f+0.026
apy+0.026
Sure+0.026
Neg logit contributions
 appearance-0.059
 ours-0.051
 yours-0.049
 yourselves-0.049
cks-0.047
Token,
Token ID11
Feature activation+1.96e-02

Pos logit contributions
 help+0.011
 Madd+0.010
f+0.009
 support+0.009
sy+0.009
Neg logit contributions
 appearance-0.019
 yourselves-0.016
 ours-0.016
 yours-0.016
 yourself-0.015
Token her
Token ID607
Feature activation+6.79e-03

Pos logit contributions
 help+0.031
 Madd+0.029
f+0.027
 support+0.026
sy+0.025
Neg logit contributions
 appearance-0.036
 yourselves-0.031
 ours-0.031
 yours-0.029
 yourself-0.028
Token it
Token ID340
Feature activation+4.13e-03

Pos logit contributions
 help+0.026
 Madd+0.024
f+0.023
 support+0.021
apy+0.021
Neg logit contributions
 appearance-0.028
 ours-0.023
 yourselves-0.023
 yours-0.022
cks-0.021
Token was
Token ID373
Feature activation+1.74e-03

Pos logit contributions
 help+0.007
 Madd+0.006
f+0.006
 support+0.005
apy+0.005
Neg logit contributions
 appearance-0.008
 ours-0.007
 yourselves-0.007
 yours-0.006
cks-0.006
Token the
Token ID262
Feature activation+9.53e-03

Pos logit contributions
 help+0.002
 Madd+0.002
 support+0.002
f+0.002
 happily+0.002
Neg logit contributions
 appearance-0.004
 yourselves-0.003
 ours-0.003
cks-0.003
 habitats-0.003
Token best
Token ID1266
Feature activation-2.28e-02

Pos logit contributions
apy+0.013
 Madd+0.013
 help+0.012
f+0.012
aroo+0.011
Neg logit contributions
 appearance-0.020
 ours-0.017
 yours-0.017
 yourselves-0.016
 yourself-0.016
Token ash
Token ID12530
Feature activation+5.51e-02

Pos logit contributions
 appearance+0.042
 yours+0.035
 ours+0.035
 yourselves+0.034
 yourself+0.031
Neg logit contributions
 help-0.035
 Madd-0.033
f-0.032
apy-0.030
sy-0.030
Tokentr
Token ID2213
Feature activation+1.01e-01

Pos logit contributions
 Madd+0.111
 help+0.108
apy+0.100
f+0.098
Sure+0.096
Neg logit contributions
 appearance-0.071
 yours-0.059
 ours-0.057
 yourselves-0.053
 doesn-0.050
Tokenay
Token ID323
Feature activation-7.08e-03

Pos logit contributions
 help+0.183
 Madd+0.175
 gard+0.155
f+0.154
 support+0.152
Neg logit contributions
 appearance-0.155
 ours-0.135
 yours-0.134
 yourselves-0.131
cks-0.122
Token they
Token ID484
Feature activation+1.61e-02

Pos logit contributions
 appearance+0.013
 ours+0.012
 yourselves+0.011
 yours+0.011
cks+0.011
Neg logit contributions
 help-0.011
 Madd-0.010
f-0.009
apy-0.009
 support-0.009
Token had
Token ID550
Feature activation-7.73e-03

Pos logit contributions
 help+0.025
 Madd+0.023
f+0.021
 support+0.020
apy+0.020
Neg logit contributions
 appearance-0.031
 ours-0.026
 yourselves-0.026
 yours-0.025
cks-0.024
Token ever
Token ID1683
Feature activation+2.42e-02

Pos logit contributions
 appearance+0.019
 ours+0.016
 yourselves+0.016
ulum+0.016
cks+0.016
Neg logit contributions
 help-0.009
f-0.007
 Madd-0.007
 support-0.006
 happily-0.006
Token seen
Token ID1775
Feature activation+1.17e-02

Pos logit contributions
 help+0.048
f+0.041
 Madd+0.041
 support+0.039
 believe+0.038
Neg logit contributions
 appearance-0.038
 yourselves-0.033
 ours-0.032
 yours-0.030
cks-0.030
Token it
Token ID340
Feature activation+8.93e-03

Pos logit contributions
 help+0.024
f+0.022
sy+0.022
 Madd+0.021
Sure+0.021
Neg logit contributions
 appearance-0.019
 yourselves-0.016
 ours-0.016
 yourself-0.015
 stealing-0.015
Token came
Token ID1625
Feature activation+3.71e-02

Pos logit contributions
 help+0.013
 Madd+0.012
f+0.011
apy+0.011
 support+0.011
Neg logit contributions
 appearance-0.018
 ours-0.015
 yourselves-0.015
 yours-0.014
cks-0.014
Token across
Token ID1973
Feature activation+3.59e-02

Pos logit contributions
 help+0.053
 Madd+0.049
f+0.045
apy+0.042
 support+0.042
Neg logit contributions
 appearance-0.076
 ours-0.065
 yourselves-0.064
 yours-0.062
cks-0.060
Token an
Token ID281
Feature activation+1.36e-02

Pos logit contributions
 help+0.050
 Madd+0.047
f+0.043
apy+0.041
 support+0.039
Neg logit contributions
 appearance-0.074
 yourselves-0.063
 ours-0.063
 yours-0.061
 yourself-0.059
Token ash
Token ID12530
Feature activation+5.62e-02

Pos logit contributions
 help+0.026
 Madd+0.024
f+0.023
apy+0.022
 support+0.022
Neg logit contributions
 appearance-0.021
 ours-0.017
 yourselves-0.017
 yours-0.017
cks-0.016
Tokentr
Token ID2213
Feature activation+1.00e-01

Pos logit contributions
 help+0.092
 Madd+0.081
f+0.076
apy+0.072
 support+0.071
Neg logit contributions
 appearance-0.106
 ours-0.092
 yourselves-0.089
 yours-0.088
cks-0.083
Tokenay
Token ID323
Feature activation-5.26e-03

Pos logit contributions
 help+0.158
 Madd+0.146
f+0.134
apy+0.127
 support+0.126
Neg logit contributions
 appearance-0.193
 ours-0.163
 yourselves-0.161
 yours-0.156
cks-0.151
Token.
Token ID13
Feature activation+1.63e-02

Pos logit contributions
 appearance+0.011
 ours+0.010
 yourselves+0.010
cks+0.009
 yours+0.009
Neg logit contributions
 help-0.007
f-0.006
 Madd-0.006
 happily-0.006
sy-0.006
Token It
Token ID632
Feature activation+3.34e-02

Pos logit contributions
 help+0.035
 Madd+0.033
f+0.032
 support+0.030
apy+0.030
Neg logit contributions
 appearance-0.020
 ours-0.016
 yourselves-0.016
 yours-0.015
 yourself-0.014
Token wanted
Token ID2227
Feature activation+6.38e-03

Pos logit contributions
 help+0.055
 Madd+0.051
f+0.048
 support+0.045
apy+0.044
Neg logit contributions
 appearance-0.060
 ours-0.049
 yourselves-0.049
 yours-0.047
cks-0.047
Token to
Token ID284
Feature activation-2.92e-03

Pos logit contributions
 help+0.009
 Madd+0.008
 support+0.007
f+0.007
 happily+0.006
Neg logit contributions
 appearance-0.014
 yourselves-0.012
 ours-0.012
cks-0.012
ulum-0.011
Token and
Token ID290
Feature activation-3.48e-04

Pos logit contributions
 help+0.013
 Madd+0.012
f+0.011
 support+0.010
apy+0.010
Neg logit contributions
 appearance-0.019
 ours-0.016
 yourselves-0.016
 yours-0.016
cks-0.015
Token rub
Token ID6437
Feature activation+1.41e-02

Pos logit contributions
 appearance+0.001
 ours+0.001
 yourselves+0.001
 yours+0.001
cks+0.001
Neg logit contributions
 help-0.001
 Madd-0.000
 support-0.000
apy-0.000
f-0.000
Token the
Token ID262
Feature activation+9.29e-03

Pos logit contributions
 help+0.021
 Madd+0.019
f+0.018
apy+0.016
 support+0.016
Neg logit contributions
 appearance-0.028
 ours-0.024
 yourselves-0.024
 yours-0.023
cks-0.022
Token magical
Token ID10883
Feature activation-1.78e-02

Pos logit contributions
 help+0.014
 Madd+0.013
f+0.012
apy+0.012
 support+0.011
Neg logit contributions
 appearance-0.018
 yourselves-0.015
 ours-0.015
 yours-0.014
 yourself-0.014
Token o
Token ID267
Feature activation+6.19e-02

Pos logit contributions
 appearance+0.036
 ours+0.030
 yourselves+0.030
 yours+0.029
cks+0.028
Neg logit contributions
 help-0.026
 Madd-0.024
f-0.022
apy-0.021
 support-0.020
Tokenint
Token ID600
Feature activation+1.00e-01

Pos logit contributions
 help+0.108
 Madd+0.102
f+0.093
apy+0.089
 support+0.089
Neg logit contributions
 appearance-0.107
 yourselves-0.088
 ours-0.088
 yours-0.084
cks-0.082
Tokenment
Token ID434
Feature activation+4.04e-02

Pos logit contributions
 help+0.127
 Madd+0.115
f+0.102
 support+0.096
apy+0.093
Neg logit contributions
 appearance-0.219
 yourselves-0.189
 ours-0.188
 yours-0.184
cks-0.178
Token on
Token ID319
Feature activation-3.63e-04

Pos logit contributions
 help+0.064
 Madd+0.062
f+0.054
apy+0.053
 support+0.052
Neg logit contributions
 appearance-0.074
 ours-0.062
 yourselves-0.061
 yours-0.061
 yourself-0.057
Token her
Token ID607
Feature activation+4.28e-04

Pos logit contributions
 appearance+0.001
 ours+0.001
 yourselves+0.001
 yours+0.001
cks+0.001
Neg logit contributions
 help-0.001
 Madd-0.000
f-0.000
 support-0.000
apy-0.000
Token forehead
Token ID22645
Feature activation+3.57e-02

Pos logit contributions
 help+0.001
 Madd+0.001
f+0.001
apy+0.001
 support+0.001
Neg logit contributions
 appearance-0.001
 ours-0.001
 yourselves-0.001
 yours-0.001
cks-0.001
Token.
Token ID13
Feature activation+1.38e-02

Pos logit contributions
 help+0.056
 Madd+0.051
f+0.047
apy+0.044
 support+0.044
Neg logit contributions
 appearance-0.070
 ours-0.059
 yourselves-0.059
 yours-0.057
cks-0.055
Token leaves
Token ID5667
Feature activation+1.14e-02

Pos logit contributions
 help+0.002
 Madd+0.002
f+0.002
apy+0.002
 support+0.002
Neg logit contributions
 appearance-0.002
 ours-0.002
 yourselves-0.002
 yours-0.001
cks-0.001
Token.
Token ID13
Feature activation+9.05e-03

Pos logit contributions
 help+0.017
 Madd+0.016
f+0.014
 support+0.013
apy+0.013
Neg logit contributions
 appearance-0.023
 ours-0.020
 yourselves-0.019
 yours-0.019
cks-0.018
Token With
Token ID2080
Feature activation-2.85e-02

Pos logit contributions
 help+0.016
 Madd+0.015
f+0.014
 support+0.013
apy+0.013
Neg logit contributions
 appearance-0.015
 ours-0.013
 yourselves-0.013
 yours-0.012
cks-0.011
Token soft
Token ID2705
Feature activation-4.67e-04

Pos logit contributions
 appearance+0.053
 ours+0.046
 yours+0.046
 yourselves+0.044
cks+0.042
Neg logit contributions
 help-0.041
 Madd-0.039
f-0.039
sy-0.035
mates-0.034
Token sn
Token ID3013
Feature activation+1.48e-02

Pos logit contributions
 appearance+0.001
 yourselves+0.001
 yours+0.001
 yourself+0.001
 ours+0.001
Neg logit contributions
 help-0.001
 Madd-0.001
f-0.001
Sure-0.001
appy-0.001
Tokenores
Token ID2850
Feature activation+1.00e-01

Pos logit contributions
 help+0.020
 Madd+0.018
 lear+0.016
f+0.016
 believe+0.016
Neg logit contributions
 appearance-0.029
 yourselves-0.026
 yourself-0.025
 yours-0.025
opl-0.025
Token,
Token ID11
Feature activation+8.81e-03

Pos logit contributions
 help+0.147
 Madd+0.142
f+0.130
 support+0.128
 oblig+0.126
Neg logit contributions
 appearance-0.173
 yours-0.146
 ours-0.145
 yourselves-0.139
 stealing-0.134
Token Lucy
Token ID22162
Feature activation+6.32e-03

Pos logit contributions
 help+0.014
f+0.013
sy+0.013
 Madd+0.012
appy+0.012
Neg logit contributions
 appearance-0.015
 yourselves-0.013
 ours-0.013
 yourself-0.012
 yours-0.012
Token began
Token ID2540
Feature activation+2.93e-02

Pos logit contributions
 help+0.010
 Madd+0.009
f+0.008
apy+0.008
 support+0.008
Neg logit contributions
 appearance-0.012
 yourselves-0.009
 ours-0.009
cks-0.009
 yours-0.009
Token to
Token ID284
Feature activation-2.69e-03

Pos logit contributions
 help+0.047
 Madd+0.044
f+0.040
apy+0.038
 support+0.038
Neg logit contributions
 appearance-0.055
 ours-0.046
 yourselves-0.046
 yours-0.044
cks-0.043
Token drift
Token ID24260
Feature activation+1.89e-02

Pos logit contributions
 appearance+0.006
 ours+0.005
 yourselves+0.005
 yours+0.005
cks+0.005
Neg logit contributions
 help-0.004
 Madd-0.003
f-0.003
apy-0.003
 support-0.003
Token deep
Token ID2769
Feature activation-4.87e-03

Pos logit contributions
 appearance+0.026
 ours+0.021
 yours+0.020
 yourselves+0.020
 mine+0.019
Neg logit contributions
 Madd-0.019
f-0.017
 help-0.017
sy-0.016
apy-0.016
Token paws
Token ID49193
Feature activation+1.82e-02

Pos logit contributions
 appearance+0.009
 ours+0.008
 yourselves+0.008
 yours+0.007
cks+0.007
Neg logit contributions
 help-0.008
 Madd-0.007
f-0.007
apy-0.006
 support-0.006
Token,
Token ID11
Feature activation+6.10e-03

Pos logit contributions
 help+0.027
 Madd+0.025
f+0.023
apy+0.021
 support+0.021
Neg logit contributions
 appearance-0.036
 ours-0.031
 yourselves-0.030
 yours-0.030
cks-0.028
Token to
Token ID284
Feature activation-4.68e-03

Pos logit contributions
 help+0.010
 Madd+0.009
f+0.008
 support+0.008
apy+0.007
Neg logit contributions
 appearance-0.012
 ours-0.010
 yourselves-0.010
 yours-0.010
cks-0.010
Token no
Token ID645
Feature activation-2.87e-02

Pos logit contributions
 appearance+0.010
 ours+0.009
 yourselves+0.008
 yours+0.008
cks+0.008
Neg logit contributions
 help-0.006
 Madd-0.006
f-0.005
apy-0.005
 support-0.005
Token avail
Token ID29107
Feature activation-1.19e-01

Pos logit contributions
 appearance+0.047
 yourselves+0.041
 ours+0.040
cks+0.038
 yours+0.037
Neg logit contributions
 help-0.055
 Madd-0.048
 support-0.045
f-0.045
 happily-0.042
Token.
Token ID13
Feature activation+8.58e-03

Pos logit contributions
 appearance+0.263
 ours+0.234
 yourselves+0.234
ulum+0.216
 yours+0.216
Neg logit contributions
 help-0.159
 Madd-0.140
f-0.126
 happily-0.118
 support-0.114
Token He
Token ID679
Feature activation+1.00e-03

Pos logit contributions
 help+0.014
 Madd+0.013
f+0.012
apy+0.011
 support+0.011
Neg logit contributions
 appearance-0.016
 ours-0.014
 yourselves-0.014
 yours-0.013
cks-0.013
Token finally
Token ID3443
Feature activation-1.78e-02

Pos logit contributions
 help+0.002
 Madd+0.002
f+0.001
apy+0.001
 support+0.001
Neg logit contributions
 appearance-0.002
 yourselves-0.001
 habitats-0.001
 ours-0.001
 doesn-0.001
Token thought
Token ID1807
Feature activation+2.82e-02

Pos logit contributions
 appearance+0.030
 ours+0.024
 yourselves+0.023
 yours+0.023
cks+0.023
Neg logit contributions
 Madd-0.029
 help-0.029
apy-0.026
f-0.025
 support-0.025
Token of
Token ID286
Feature activation+4.18e-03

Pos logit contributions
 help+0.049
 Madd+0.045
f+0.042
apy+0.040
 support+0.040
Neg logit contributions
 appearance-0.050
 ours-0.042
 yourselves-0.041
 yours-0.040
cks-0.038
Token him
Token ID683
Feature activation-2.00e-04

Pos logit contributions
 help+0.025
 Madd+0.022
 support+0.019
f+0.018
 happily+0.018
Neg logit contributions
 appearance-0.039
 yourselves-0.034
 ours-0.033
ulum-0.032
 yours-0.031
Token.
Token ID13
Feature activation+2.12e-02

Pos logit contributions
 appearance+0.000
 yourselves+0.000
 ours+0.000
 yours+0.000
cks+0.000
Neg logit contributions
 help-0.000
 Madd-0.000
f-0.000
 support-0.000
apy-0.000
Token But
Token ID887
Feature activation+4.39e-02

Pos logit contributions
 help+0.040
 Madd+0.037
f+0.035
apy+0.034
 support+0.033
Neg logit contributions
 appearance-0.033
 ours-0.027
 yourselves-0.027
 yours-0.026
 habitats-0.024
Token to
Token ID284
Feature activation-1.84e-03

Pos logit contributions
 help+0.064
 Madd+0.060
f+0.056
apy+0.053
sy+0.052
Neg logit contributions
 appearance-0.086
 ours-0.072
 yourselves-0.071
 yours-0.070
 stealing-0.066
Token no
Token ID645
Feature activation-2.73e-02

Pos logit contributions
 appearance+0.004
 ours+0.003
 yourselves+0.003
 yours+0.003
cks+0.003
Neg logit contributions
 help-0.003
 Madd-0.002
 support-0.002
f-0.002
 believe-0.002
Token avail
Token ID29107
Feature activation-1.19e-01

Pos logit contributions
 appearance+0.039
 yourselves+0.033
 ours+0.032
 yours+0.030
cks+0.030
Neg logit contributions
 help-0.057
 Madd-0.052
f-0.049
 support-0.048
apy-0.046
Token,
Token ID11
Feature activation+1.30e-02

Pos logit contributions
 appearance+0.268
 ours+0.241
 yourselves+0.240
ulum+0.227
 habitats+0.222
Neg logit contributions
 help-0.152
 Madd-0.136
f-0.118
 happily-0.110
 support-0.108
Token they
Token ID484
Feature activation+1.35e-02

Pos logit contributions
 help+0.019
 Madd+0.017
f+0.015
 support+0.015
apy+0.014
Neg logit contributions
 appearance-0.027
 ours-0.023
 yourselves-0.023
 yours-0.022
cks-0.022
Token were
Token ID547
Feature activation+1.01e-02

Pos logit contributions
 help+0.021
 Madd+0.020
f+0.018
apy+0.017
 support+0.017
Neg logit contributions
 appearance-0.026
 ours-0.021
 yourselves-0.021
 yours-0.021
cks-0.020
Token too
Token ID1165
Feature activation+3.47e-02

Pos logit contributions
 help+0.016
 Madd+0.014
f+0.013
 support+0.012
 happily+0.012
Neg logit contributions
 appearance-0.020
 yourselves-0.017
 ours-0.017
 yours-0.016
ulum-0.016
Token scared
Token ID12008
Feature activation+1.14e-02

Pos logit contributions
 help+0.048
 Madd+0.044
f+0.040
apy+0.038
 support+0.037
Neg logit contributions
 appearance-0.074
 ours-0.062
 yourselves-0.062
 yours-0.061
cks-0.059
Token times
Token ID1661
Feature activation-9.05e-03

Pos logit contributions
 appearance+0.021
 ours+0.017
 yourselves+0.017
 yours+0.016
cks+0.016
Neg logit contributions
 help-0.018
 Madd-0.017
f-0.016
apy-0.015
 support-0.014
Token,
Token ID11
Feature activation+1.09e-02

Pos logit contributions
 appearance+0.018
 ours+0.016
 yourselves+0.016
 yours+0.015
 yourself+0.014
Neg logit contributions
 help-0.012
 Madd-0.012
f-0.010
apy-0.010
 support-0.010
Token but
Token ID475
Feature activation+5.89e-03

Pos logit contributions
 help+0.019
 Madd+0.017
f+0.016
 support+0.015
apy+0.015
Neg logit contributions
 appearance-0.020
 ours-0.017
 yourselves-0.017
 yours-0.016
cks-0.016
Token to
Token ID284
Feature activation-2.72e-03

Pos logit contributions
 help+0.010
 Madd+0.010
f+0.010
sy+0.010
 oblig+0.009
Neg logit contributions
 appearance-0.009
 ours-0.008
 yours-0.008
 yourselves-0.007
 stealing-0.007
Token no
Token ID645
Feature activation-2.80e-02

Pos logit contributions
 appearance+0.005
 ours+0.005
 yourselves+0.005
 yours+0.004
cks+0.004
Neg logit contributions
 help-0.004
 Madd-0.004
f-0.003
 support-0.003
apy-0.003
Token avail
Token ID29107
Feature activation-1.18e-01

Pos logit contributions
 appearance+0.039
 yourselves+0.032
 ours+0.032
cks+0.029
 yours+0.029
Neg logit contributions
 help-0.060
 Madd-0.055
f-0.051
 support-0.050
apy-0.048
Token.
Token ID13
Feature activation+1.66e-02

Pos logit contributions
 appearance+0.249
 ours+0.221
 yourselves+0.219
 yours+0.205
 habitats+0.204
Neg logit contributions
 help-0.167
 Madd-0.152
f-0.136
 support-0.125
 happily-0.125
Token
Token ID220
Feature activation-1.01e-02

Pos logit contributions
 help+0.025
 Madd+0.023
f+0.022
apy+0.020
 support+0.020
Neg logit contributions
 appearance-0.033
 ours-0.028
 yourselves-0.027
 yours-0.027
cks-0.026
Token\n\n
Token ID628
Feature activation+1.82e-02

Pos logit contributions
 appearance+0.021
 ours+0.017
 yourselves+0.017
 yours+0.017
cks+0.016
Neg logit contributions
 help-0.015
 Madd-0.014
f-0.013
apy-0.012
 support-0.012
Token Jane
Token ID12091
Feature activation-1.33e-02

Pos logit contributions
 help+0.034
f+0.031
 Madd+0.031
 lear+0.030
 support+0.030
Neg logit contributions
 appearance-0.024
 yourselves-0.022
 ours-0.020
 Sharing-0.020
 yourself-0.019
Token's
Token ID338
Feature activation-2.07e-02

Pos logit contributions
 appearance+0.023
 yourselves+0.018
 ours+0.018
cks+0.017
 yours+0.017
Neg logit contributions
 help-0.023
 Madd-0.022
f-0.020
apy-0.019
 support-0.019
Token out
Token ID503
Feature activation+3.87e-02

Pos logit contributions
 appearance+0.076
 ours+0.066
 yours+0.065
 yourselves+0.064
 yourself+0.060
Neg logit contributions
 help-0.052
 Madd-0.047
f-0.046
apy-0.043
 support-0.042
Token,
Token ID11
Feature activation+1.49e-02

Pos logit contributions
 help+0.048
 Madd+0.045
f+0.041
apy+0.038
 support+0.037
Neg logit contributions
 appearance-0.085
 ours-0.072
 yourselves-0.072
 yours-0.071
cks-0.069
Token but
Token ID475
Feature activation+8.68e-03

Pos logit contributions
 help+0.025
 Madd+0.023
f+0.020
 support+0.019
apy+0.019
Neg logit contributions
 appearance-0.028
 ours-0.024
 yourselves-0.023
 yours-0.023
cks-0.022
Token to
Token ID284
Feature activation-7.62e-04

Pos logit contributions
 help+0.016
 Madd+0.015
f+0.014
sy+0.014
 support+0.013
Neg logit contributions
 appearance-0.013
 ours-0.011
 yourselves-0.010
 yours-0.010
 stealing-0.010
Token no
Token ID645
Feature activation-2.65e-02

Pos logit contributions
 appearance+0.002
 ours+0.001
 yourselves+0.001
 yours+0.001
cks+0.001
Neg logit contributions
 help-0.001
 Madd-0.001
f-0.001
 support-0.001
apy-0.001
Token avail
Token ID29107
Feature activation-1.18e-01

Pos logit contributions
 appearance+0.047
 ours+0.040
 yourselves+0.039
 yours+0.037
cks+0.037
Neg logit contributions
 help-0.047
 Madd-0.043
f-0.039
 support-0.038
apy-0.037
Token.
Token ID13
Feature activation+2.40e-02

Pos logit contributions
 appearance+0.252
 ours+0.224
 yourselves+0.223
 yours+0.208
ulum+0.205
Neg logit contributions
 help-0.164
 Madd-0.148
f-0.132
 happily-0.123
 support-0.121
Token He
Token ID679
Feature activation+1.10e-02

Pos logit contributions
 help+0.042
 Madd+0.038
f+0.037
 support+0.035
apy+0.034
Neg logit contributions
 appearance-0.041
 yourselves-0.034
 ours-0.033
 yours-0.032
cks-0.031
Token felt
Token ID2936
Feature activation+1.45e-02

Pos logit contributions
 help+0.018
 Madd+0.017
f+0.016
apy+0.015
 support+0.015
Neg logit contributions
 appearance-0.019
 yourselves-0.015
 doesn-0.014
cks-0.014
 habitats-0.014
Token sad
Token ID6507
Feature activation+4.87e-04

Pos logit contributions
 help+0.022
 Madd+0.021
f+0.019
apy+0.018
 support+0.018
Neg logit contributions
 appearance-0.028
 ours-0.024
 yourselves-0.023
 yours-0.023
cks-0.022
Token and
Token ID290
Feature activation+3.22e-03

Pos logit contributions
 help+0.001
 Madd+0.001
f+0.001
 support+0.001
apy+0.001
Neg logit contributions
 appearance-0.001
 ours-0.001
 yourselves-0.001
 yours-0.001
cks-0.001
Token They
Token ID1119
Feature activation-1.03e-02

Pos logit contributions
 appearance+0.070
 ours+0.059
cks+0.056
 yourselves+0.055
ind+0.054
Neg logit contributions
 help-0.042
 Madd-0.039
Sure-0.032
apy-0.030
 support-0.030
Token have
Token ID423
Feature activation-1.52e-02

Pos logit contributions
 appearance+0.018
 ours+0.014
 yours+0.014
 yourselves+0.013
cks+0.013
Neg logit contributions
 Madd-0.018
 help-0.017
apy-0.016
f-0.015
aroo-0.015
Token a
Token ID257
Feature activation+6.12e-03

Pos logit contributions
 appearance+0.032
 yourselves+0.027
 ours+0.027
cks+0.026
 yours+0.026
Neg logit contributions
 help-0.023
 Madd-0.019
 support-0.018
f-0.017
 happily-0.017
Token lot
Token ID1256
Feature activation+5.20e-03

Pos logit contributions
 Madd+0.009
 help+0.009
f+0.008
apy+0.008
sy+0.008
Neg logit contributions
 appearance-0.012
 ours-0.010
 yours-0.010
 English-0.009
 yourselves-0.009
Token in
Token ID287
Feature activation-2.12e-04

Pos logit contributions
 help+0.006
 Madd+0.005
f+0.004
 support+0.004
apy+0.004
Neg logit contributions
 appearance-0.013
 ours-0.011
 yourselves-0.011
 yours-0.011
cks-0.011
Token common
Token ID2219
Feature activation-1.14e-01

Pos logit contributions
 appearance+0.001
 ours+0.000
 yourselves+0.000
ulum+0.000
 yours+0.000
Neg logit contributions
 help-0.000
 Madd-0.000
 support-0.000
f-0.000
 happily-0.000
Token.
Token ID13
Feature activation-3.16e-02

Pos logit contributions
 appearance+0.263
 ours+0.241
 yours+0.238
ulum+0.233
arter+0.225
Neg logit contributions
 help-0.132
 happily-0.103
f-0.098
 Madd-0.089
 support-0.088
Token They
Token ID1119
Feature activation-1.04e-02

Pos logit contributions
 appearance+0.071
 ours+0.059
 yourselves+0.057
 yours+0.057
ind+0.055
Neg logit contributions
 help-0.044
 Madd-0.039
f-0.033
Sure-0.033
 support-0.033
Token become
Token ID1716
Feature activation-1.16e-02

Pos logit contributions
 appearance+0.018
 ours+0.015
 yourselves+0.014
 yours+0.014
cks+0.014
Neg logit contributions
 Madd-0.017
 help-0.017
apy-0.016
f-0.015
aroo-0.015
Token friends
Token ID2460
Feature activation+2.43e-03

Pos logit contributions
 appearance+0.027
 ours+0.025
 yourselves+0.024
 yours+0.024
 yourself+0.023
Neg logit contributions
 help-0.013
 Madd-0.012
f-0.011
apy-0.010
Sure-0.010
Token.
Token ID13
Feature activation-3.09e-02

Pos logit contributions
 help+0.003
f+0.003
 happily+0.003
 Madd+0.003
 support+0.003
Neg logit contributions
 appearance-0.005
 ours-0.005
 yourselves-0.004
 yours-0.004
cks-0.004
Token and
Token ID290
Feature activation-2.33e-03

Pos logit contributions
 help+0.005
 Madd+0.004
f+0.004
 happily+0.004
 support+0.004
Neg logit contributions
 appearance-0.006
 ours-0.005
 yourselves-0.005
 yours-0.005
cks-0.005
Token said
Token ID531
Feature activation+2.31e-03

Pos logit contributions
 appearance+0.004
 ours+0.003
 yourselves+0.003
 yours+0.003
cks+0.003
Neg logit contributions
 help-0.004
 Madd-0.004
f-0.004
apy-0.003
 support-0.003
Token,
Token ID11
Feature activation-5.61e-04

Pos logit contributions
 help+0.003
 Madd+0.003
f+0.002
 support+0.002
apy+0.002
Neg logit contributions
 appearance-0.005
 ours-0.004
 yourselves-0.004
 yours-0.004
 yourself-0.004
Token "
Token ID366
Feature activation+1.15e-02

Pos logit contributions
 appearance+0.001
 ours+0.001
 yourselves+0.001
 yours+0.001
cks+0.001
Neg logit contributions
 help-0.001
 Madd-0.001
f-0.001
apy-0.001
 support-0.001
TokenG
Token ID38
Feature activation+5.35e-03

Pos logit contributions
 help+0.018
 Madd+0.017
f+0.017
Sure+0.015
 support+0.015
Neg logit contributions
 appearance-0.022
 ours-0.018
 yourselves-0.018
 yours-0.017
cks-0.017
Tokenreetings
Token ID46648
Feature activation-1.13e-01

Pos logit contributions
 help+0.008
f+0.008
 Madd+0.008
sy+0.007
ail+0.007
Neg logit contributions
 appearance-0.010
 ours-0.008
 yourselves-0.008
cks-0.008
 yours-0.008
Token Sir
Token ID7361
Feature activation-6.42e-02

Pos logit contributions
 appearance+0.260
 ours+0.226
 yourselves+0.224
 yours+0.219
cks+0.213
Neg logit contributions
 help-0.136
 Madd-0.122
f-0.109
apy-0.100
 support-0.100
Token.
Token ID13
Feature activation+3.34e-03

Pos logit contributions
 appearance+0.121
 yourselves+0.102
 ours+0.100
cks+0.099
ulum+0.094
Neg logit contributions
 help-0.103
 Madd-0.092
f-0.089
 support-0.079
 happily-0.078
Token Do
Token ID2141
Feature activation-7.23e-02

Pos logit contributions
 help+0.006
 Madd+0.006
f+0.005
 support+0.005
Sure+0.005
Neg logit contributions
 appearance-0.005
 ours-0.005
cks-0.004
 yourselves-0.004
 yours-0.004
Token you
Token ID345
Feature activation+2.73e-02

Pos logit contributions
 appearance+0.118
 ours+0.093
 yourselves+0.089
cks+0.089
 yours+0.086
Neg logit contributions
 help-0.144
 Madd-0.125
f-0.124
 support-0.117
Sure-0.111
Token speak
Token ID2740
Feature activation+5.51e-04

Pos logit contributions
 help+0.047
f+0.037
 Madd+0.037
 support+0.037
 believe+0.036
Neg logit contributions
 appearance-0.054
 yourselves-0.047
 ours-0.045
cks-0.045
ulum-0.043
Token't
Token ID470
Feature activation-2.63e-02

Pos logit contributions
 appearance+0.012
 ours+0.010
 yourselves+0.009
 yours+0.009
cks+0.009
Neg logit contributions
 help-0.014
 Madd-0.012
f-0.012
 support-0.011
apy-0.011
Token judge
Token ID5052
Feature activation-4.66e-02

Pos logit contributions
 appearance+0.052
 ours+0.050
 yourselves+0.049
cks+0.048
 habitats+0.047
Neg logit contributions
 help-0.040
 support-0.030
 Madd-0.029
f-0.029
 believe-0.027
Token someone
Token ID2130
Feature activation-4.48e-02

Pos logit contributions
 appearance+0.087
 ours+0.080
ulum+0.076
cks+0.073
opl+0.073
Neg logit contributions
 help-0.076
f-0.059
 Madd-0.059
 support-0.058
 happily-0.052
Token by
Token ID416
Feature activation+3.82e-02

Pos logit contributions
 appearance+0.088
 yourselves+0.077
ulum+0.077
 ours+0.074
 habitats+0.072
Neg logit contributions
 help-0.073
f-0.058
 support-0.056
 Madd-0.055
 believe-0.051
Token their
Token ID511
Feature activation+9.70e-03

Pos logit contributions
 help+0.067
 Madd+0.054
 support+0.052
 believe+0.048
 days+0.047
Neg logit contributions
 appearance-0.063
ulum-0.058
 ours-0.058
ays-0.058
ucker-0.056
Token appearance
Token ID5585
Feature activation-1.13e-01

Pos logit contributions
 help+0.023
 support+0.019
 Madd+0.019
f+0.019
 happily+0.018
Neg logit contributions
 appearance-0.010
 ours-0.009
 yourselves-0.009
cks-0.009
 yours-0.009
Token.
Token ID13
Feature activation-2.45e-02

Pos logit contributions
 appearance+0.253
 ours+0.239
cks+0.233
ulum+0.230
 habitats+0.223
Neg logit contributions
 help-0.129
 Madd-0.095
 happily-0.089
 support-0.083
f-0.082
Token Sometimes
Token ID8975
Feature activation-2.77e-02

Pos logit contributions
 appearance+0.044
 ours+0.038
cks+0.035
ulum+0.035
 yours+0.034
Neg logit contributions
 help-0.042
 Madd-0.039
 support-0.034
f-0.032
Sure-0.032
Token they
Token ID484
Feature activation-4.93e-03

Pos logit contributions
 appearance+0.063
 ours+0.057
 yourselves+0.056
cks+0.056
ulum+0.055
Neg logit contributions
 help-0.035
 Madd-0.025
 support-0.023
f-0.022
apy-0.021
Token are
Token ID389
Feature activation-1.58e-02

Pos logit contributions
 appearance+0.009
 yourselves+0.008
 ours+0.008
ulum+0.008
 yours+0.008
Neg logit contributions
 help-0.008
 support-0.006
f-0.006
 Madd-0.006
 believe-0.006
Token not
Token ID407
Feature activation-3.20e-02

Pos logit contributions
 appearance+0.030
ulum+0.027
 yourselves+0.024
opl+0.024
 habitats+0.024
Neg logit contributions
 help-0.027
 support-0.022
 Madd-0.021
 happily-0.021
 brave-0.020
Token friendly
Token ID8030
Feature activation-2.29e-02

Pos logit contributions
 help+0.033
 Madd+0.030
f+0.027
apy+0.026
 support+0.026
Neg logit contributions
 appearance-0.047
 ours-0.041
 yourselves-0.040
 yours-0.039
cks-0.038
Token voice
Token ID3809
Feature activation+2.19e-03

Pos logit contributions
 appearance+0.047
 ours+0.041
 yourselves+0.041
cks+0.040
 yours+0.040
Neg logit contributions
 help-0.033
 Madd-0.030
f-0.027
 support-0.025
apy-0.024
Token.
Token ID13
Feature activation+1.40e-02

Pos logit contributions
 help+0.003
 Madd+0.003
f+0.003
apy+0.002
 support+0.002
Neg logit contributions
 appearance-0.005
 ours-0.004
 yourselves-0.004
 yours-0.004
cks-0.004
Token "
Token ID366
Feature activation+1.66e-02

Pos logit contributions
 help+0.021
 Madd+0.020
f+0.017
apy+0.017
 happily+0.016
Neg logit contributions
 appearance-0.029
 ours-0.024
 yourselves-0.024
cks-0.023
 yours-0.023
TokenG
Token ID38
Feature activation+1.03e-02

Pos logit contributions
 help+0.029
 Madd+0.027
f+0.026
Sure+0.025
 support+0.024
Neg logit contributions
 appearance-0.030
 ours-0.026
 yourselves-0.024
cks-0.023
 yours-0.023
Tokenreetings
Token ID46648
Feature activation-1.13e-01

Pos logit contributions
 help+0.017
f+0.016
 Madd+0.015
sy+0.014
tted+0.013
Neg logit contributions
 appearance-0.019
 ours-0.015
 yourselves-0.014
 habitats-0.014
ulum-0.014
Token Jack
Token ID3619
Feature activation-3.07e-02

Pos logit contributions
 appearance+0.242
 ours+0.209
 yourselves+0.208
 yours+0.202
cks+0.193
Neg logit contributions
 help-0.152
 Madd-0.137
f-0.126
apy-0.119
 support-0.117
Token!
Token ID0
Feature activation+1.27e-02

Pos logit contributions
 appearance+0.055
 ours+0.045
 yourselves+0.045
 yours+0.043
cks+0.042
Neg logit contributions
 help-0.053
 Madd-0.049
f-0.046
 support-0.043
apy-0.042
Token Are
Token ID4231
Feature activation-4.06e-02

Pos logit contributions
 help+0.027
 Madd+0.026
f+0.023
 support+0.022
Sure+0.022
Neg logit contributions
 appearance-0.019
 ours-0.015
cks-0.014
 yourselves-0.014
 yours-0.013
Token you
Token ID345
Feature activation+3.00e-02

Pos logit contributions
 appearance+0.062
 ours+0.048
 yourselves+0.048
cks+0.045
 yours+0.044
Neg logit contributions
 help-0.083
 Madd-0.077
f-0.072
 support-0.068
apy-0.068
Token here
Token ID994
Feature activation+2.79e-02

Pos logit contributions
 help+0.048
 Madd+0.043
f+0.039
 support+0.038
apy+0.036
Neg logit contributions
 appearance-0.059
 yourselves-0.049
 ours-0.049
cks-0.047
 habitats-0.046
Token friendly
Token ID8030
Feature activation-2.31e-02

Pos logit contributions
 help+0.033
 Madd+0.030
f+0.027
 support+0.025
apy+0.025
Neg logit contributions
 appearance-0.044
 ours-0.038
 yourselves-0.038
 yours-0.037
cks-0.036
Token voice
Token ID3809
Feature activation+3.77e-03

Pos logit contributions
 appearance+0.050
 ours+0.045
 yourselves+0.044
 yours+0.043
cks+0.042
Neg logit contributions
 help-0.031
 Madd-0.028
f-0.024
 support-0.023
 gard-0.022
Token,
Token ID11
Feature activation+4.26e-03

Pos logit contributions
 help+0.005
 Madd+0.004
f+0.004
 support+0.003
apy+0.003
Neg logit contributions
 appearance-0.009
 ours-0.007
 yourselves-0.007
 yours-0.007
cks-0.007
Token "
Token ID366
Feature activation+1.51e-02

Pos logit contributions
 help+0.006
 Madd+0.006
f+0.005
 happily+0.005
apy+0.005
Neg logit contributions
 appearance-0.009
 ours-0.008
 yourselves-0.008
 yours-0.008
cks-0.007
TokenG
Token ID38
Feature activation+1.07e-02

Pos logit contributions
 help+0.026
 Madd+0.025
f+0.024
Sure+0.023
 support+0.022
Neg logit contributions
 appearance-0.028
 ours-0.023
 yourselves-0.022
cks-0.021
 yours-0.021
Tokenreetings
Token ID46648
Feature activation-1.12e-01

Pos logit contributions
 help+0.019
f+0.018
 Madd+0.017
sy+0.016
 support+0.016
Neg logit contributions
 appearance-0.018
 ours-0.015
 yourselves-0.014
 yours-0.013
 habitats-0.013
Token,
Token ID11
Feature activation+6.05e-03

Pos logit contributions
 appearance+0.237
 ours+0.204
 yourselves+0.202
 yours+0.197
cks+0.191
Neg logit contributions
 help-0.153
 Madd-0.139
f-0.126
apy-0.117
 support-0.117
Token little
Token ID1310
Feature activation+7.75e-03

Pos logit contributions
 help+0.009
 Madd+0.008
f+0.007
 support+0.007
 gard+0.006
Neg logit contributions
 appearance-0.013
cks-0.011
 ours-0.011
 yourselves-0.011
ulum-0.010
Token one
Token ID530
Feature activation-3.56e-04

Pos logit contributions
 help+0.009
 Madd+0.008
f+0.007
apy+0.007
 support+0.006
Neg logit contributions
 appearance-0.018
 ours-0.016
 yourselves-0.016
 yours-0.015
 yourself-0.015
Token.
Token ID13
Feature activation+1.04e-02

Pos logit contributions
 appearance+0.001
 yourselves+0.001
 ours+0.001
ulum+0.001
cks+0.001
Neg logit contributions
 help-0.001
f-0.000
 Madd-0.000
 support-0.000
 believe-0.000
Token Why
Token ID4162
Feature activation+1.62e-02

Pos logit contributions
 help+0.022
 Madd+0.021
f+0.018
 support+0.018
Sure+0.018
Neg logit contributions
 appearance-0.015
 ours-0.012
cks-0.012
 yourselves-0.011
 yours-0.011
Token gen
Token ID2429
Feature activation+1.64e-03

Pos logit contributions
 help+0.040
 Madd+0.038
f+0.035
apy+0.032
 support+0.032
Neg logit contributions
 appearance-0.049
 ours-0.041
 yourselves-0.040
 yours-0.040
cks-0.037
Tokenie
Token ID494
Feature activation-7.48e-03

Pos logit contributions
 help+0.002
 Madd+0.002
f+0.002
 support+0.002
apy+0.002
Neg logit contributions
 appearance-0.003
 ours-0.003
 yourselves-0.003
 yours-0.003
 habitats-0.003
Token.
Token ID13
Feature activation+1.81e-02

Pos logit contributions
 appearance+0.015
 ours+0.013
 yourselves+0.013
 yours+0.012
cks+0.012
Neg logit contributions
 help-0.011
 Madd-0.010
f-0.010
 support-0.009
apy-0.009
Token "
Token ID366
Feature activation+1.80e-02

Pos logit contributions
 help+0.030
 Madd+0.028
f+0.026
apy+0.024
 support+0.024
Neg logit contributions
 appearance-0.032
 ours-0.027
 yourselves-0.027
 yours-0.026
cks-0.025
TokenG
Token ID38
Feature activation+1.68e-02

Pos logit contributions
 help+0.032
 Madd+0.030
f+0.028
apy+0.026
 support+0.026
Neg logit contributions
 appearance-0.031
 ours-0.026
 yourselves-0.025
 yours-0.025
cks-0.024
Tokenreetings
Token ID46648
Feature activation-1.12e-01

Pos logit contributions
 help+0.026
 Madd+0.025
f+0.024
sy+0.022
ail+0.021
Neg logit contributions
 appearance-0.032
 ours-0.027
 yourselves-0.026
 habitats-0.025
 Sharing-0.024
Token,
Token ID11
Feature activation+1.14e-02

Pos logit contributions
 appearance+0.227
 ours+0.195
 yourselves+0.194
 yours+0.188
 yourself+0.180
Neg logit contributions
 help-0.162
 Madd-0.146
f-0.137
apy-0.131
 support-0.128
Token kind
Token ID1611
Feature activation-9.96e-03

Pos logit contributions
 help+0.017
 Madd+0.016
f+0.014
 support+0.013
apy+0.013
Neg logit contributions
 appearance-0.023
 ours-0.019
 yourselves-0.019
cks-0.019
 yours-0.018
Token one
Token ID530
Feature activation+1.64e-03

Pos logit contributions
 appearance+0.021
 ours+0.018
 yourselves+0.018
 yours+0.017
cks+0.017
Neg logit contributions
 help-0.014
 Madd-0.013
f-0.012
apy-0.011
 support-0.011
Token!"
Token ID2474
Feature activation+1.04e-03

Pos logit contributions
 help+0.002
 Madd+0.002
f+0.002
 support+0.002
apy+0.002
Neg logit contributions
 appearance-0.003
 yourselves-0.003
 ours-0.003
 yours-0.003
cks-0.003
Token said
Token ID531
Feature activation+1.11e-02

Pos logit contributions
 help+0.002
 Madd+0.002
f+0.002
sy+0.001
 support+0.001
Neg logit contributions
 appearance-0.002
 yourselves-0.001
 ours-0.001
 yourself-0.001
 yours-0.001
Token spoke
Token ID5158
Feature activation-3.14e-03

Pos logit contributions
 help+0.003
 Madd+0.003
f+0.002
apy+0.002
 support+0.002
Neg logit contributions
 appearance-0.003
 yourselves-0.002
cks-0.002
 ours-0.002
 yours-0.002
Token up
Token ID510
Feature activation+4.49e-03

Pos logit contributions
 appearance+0.007
 ours+0.006
 yourselves+0.006
 yours+0.005
cks+0.005
Neg logit contributions
 help-0.004
 Madd-0.004
f-0.004
 support-0.003
apy-0.003
Token.
Token ID13
Feature activation+6.91e-03

Pos logit contributions
 help+0.007
 Madd+0.007
f+0.006
apy+0.006
 support+0.006
Neg logit contributions
 appearance-0.008
 ours-0.007
 yourselves-0.007
 yours-0.007
cks-0.006
Token "
Token ID366
Feature activation+1.44e-02

Pos logit contributions
 help+0.012
 Madd+0.011
f+0.010
apy+0.010
 support+0.010
Neg logit contributions
 appearance-0.012
 ours-0.010
 yourselves-0.010
 yours-0.010
cks-0.009
TokenG
Token ID38
Feature activation+9.63e-03

Pos logit contributions
 help+0.025
 Madd+0.024
f+0.022
Sure+0.021
 support+0.021
Neg logit contributions
 appearance-0.026
 ours-0.022
 yours-0.020
 yourselves-0.020
cks-0.019
Tokenreetings
Token ID46648
Feature activation-1.11e-01

Pos logit contributions
 help+0.015
 Madd+0.014
f+0.014
sy+0.013
aroo+0.012
Neg logit contributions
 appearance-0.019
 ours-0.015
 yourselves-0.015
 yours-0.014
ulum-0.014
Token,
Token ID11
Feature activation+7.84e-03

Pos logit contributions
 appearance+0.244
 ours+0.210
 yourselves+0.208
 yours+0.203
cks+0.198
Neg logit contributions
 help-0.145
 Madd-0.132
f-0.118
 support-0.109
apy-0.109
Token le
Token ID443
Feature activation+5.44e-02

Pos logit contributions
 help+0.011
 Madd+0.010
f+0.008
 support+0.008
 gard+0.007
Neg logit contributions
 appearance-0.017
cks-0.015
 ours-0.014
 yourselves-0.014
ulum-0.014
Tokenopard
Token ID15478
Feature activation+3.89e-02

Pos logit contributions
 help+0.068
 Madd+0.063
f+0.055
apy+0.054
 support+0.053
Neg logit contributions
 appearance-0.115
 yourselves-0.104
 ours-0.101
 yours-0.098
cks-0.098
Token,"
Token ID553
Feature activation-4.63e-03

Pos logit contributions
 help+0.065
 Madd+0.060
f+0.055
 support+0.052
apy+0.052
Neg logit contributions
 appearance-0.072
 ours-0.060
 yourselves-0.059
 yours-0.057
cks-0.055
Token it
Token ID340
Feature activation+4.30e-04

Pos logit contributions
 appearance+0.008
 ours+0.007
 yourselves+0.007
 yours+0.007
cks+0.006
Neg logit contributions
 help-0.008
 Madd-0.007
f-0.007
apy-0.006
 support-0.006
TokenNo
Token ID2949
Feature activation+6.71e-03

Pos logit contributions
 help+0.033
 Madd+0.030
f+0.028
apy+0.027
 support+0.027
Neg logit contributions
 appearance-0.034
 ours-0.028
 yourselves-0.027
 yours-0.027
cks-0.026
Token,
Token ID11
Feature activation-1.26e-02

Pos logit contributions
 help+0.007
 Madd+0.006
f+0.006
 support+0.005
apy+0.005
Neg logit contributions
 appearance-0.016
 ours-0.014
 yourselves-0.014
 yours-0.014
cks-0.014
Token that
Token ID326
Feature activation-1.40e-02

Pos logit contributions
 appearance+0.023
ulum+0.020
cks+0.019
 habitats+0.019
 yourselves+0.019
Neg logit contributions
 help-0.023
 Madd-0.019
 support-0.017
f-0.017
apy-0.016
Token is
Token ID318
Feature activation-3.54e-02

Pos logit contributions
 appearance+0.025
 ours+0.021
 yourselves+0.021
 habitats+0.021
ellery+0.021
Neg logit contributions
 help-0.025
 Madd-0.021
f-0.019
 support-0.019
apy-0.018
Token too
Token ID1165
Feature activation+1.45e-02

Pos logit contributions
 appearance+0.073
ulum+0.064
 habitats+0.063
 habitat+0.060
opl+0.060
Neg logit contributions
 help-0.059
 Madd-0.045
 brave-0.045
 support-0.042
 happily-0.040
Token common
Token ID2219
Feature activation-1.11e-01

Pos logit contributions
 help+0.015
 brave+0.013
apy+0.011
 strong+0.011
f+0.011
Neg logit contributions
 appearance-0.037
ellery-0.032
ulum-0.032
 habitats-0.031
ric-0.031
Token,"
Token ID553
Feature activation-2.26e-02

Pos logit contributions
 appearance+0.277
 habitats+0.230
ulum+0.226
 ours+0.225
ellery+0.224
Neg logit contributions
 help-0.135
f-0.095
 happily-0.092
 Madd-0.091
 support-0.089
Token Anna
Token ID11735
Feature activation-5.01e-03

Pos logit contributions
 appearance+0.031
 ours+0.025
 yourselves+0.025
 yours+0.024
cks+0.022
Neg logit contributions
 help-0.046
 Madd-0.044
f-0.042
sy-0.040
apy-0.040
Token says
Token ID1139
Feature activation-2.70e-03

Pos logit contributions
 appearance+0.009
 yourselves+0.008
 ours+0.008
 yours+0.008
cks+0.007
Neg logit contributions
 help-0.009
f-0.008
 happily-0.007
 Madd-0.007
 support-0.007
Token.
Token ID13
Feature activation-3.19e-02

Pos logit contributions
 appearance+0.004
 ours+0.003
 yours+0.003
 yourselves+0.003
cks+0.003
Neg logit contributions
 Madd-0.005
 help-0.005
apy-0.004
f-0.004
aroo-0.004
Token "
Token ID366
Feature activation-3.62e-03

Pos logit contributions
 appearance+0.069
 ours+0.060
 yourselves+0.058
 yours+0.057
cks+0.055
Neg logit contributions
 help-0.046
 Madd-0.042
apy-0.036
f-0.036
 happily-0.035
Token that
Token ID326
Feature activation-6.29e-03

Pos logit contributions
 help+0.017
 Madd+0.014
 support+0.012
 happily+0.012
f+0.012
Neg logit contributions
 appearance-0.026
cks-0.022
 ours-0.022
 yourselves-0.021
ulum-0.021
Token it
Token ID340
Feature activation-5.14e-03

Pos logit contributions
 appearance+0.012
cks+0.010
 ours+0.010
 yourselves+0.010
 habitats+0.010
Neg logit contributions
 help-0.010
 Madd-0.009
 support-0.008
apy-0.008
f-0.008
Token's
Token ID338
Feature activation-2.89e-02

Pos logit contributions
 appearance+0.009
 ours+0.007
 yourselves+0.007
 yours+0.007
cks+0.007
Neg logit contributions
 help-0.010
 Madd-0.009
f-0.008
 support-0.008
sy-0.008
Token not
Token ID407
Feature activation-2.62e-02

Pos logit contributions
 appearance+0.054
 yourselves+0.044
cks+0.044
 ours+0.043
ulum+0.042
Neg logit contributions
 help-0.049
 Madd-0.043
 support-0.040
f-0.039
sy-0.037
Token the
Token ID262
Feature activation-9.29e-04

Pos logit contributions
 appearance+0.053
 yourselves+0.045
cks+0.045
ulum+0.044
 habitats+0.043
Neg logit contributions
 help-0.040
 Madd-0.033
 support-0.032
f-0.030
 happily-0.029
Token appearance
Token ID5585
Feature activation-1.11e-01

Pos logit contributions
 appearance+0.002
 ours+0.002
 yourselves+0.002
 yours+0.001
cks+0.001
Neg logit contributions
 help-0.001
 Madd-0.001
f-0.001
 support-0.001
apy-0.001
Token of
Token ID286
Feature activation-3.58e-03

Pos logit contributions
 appearance+0.264
cks+0.240
 ours+0.237
 habitats+0.230
 yourselves+0.228
Neg logit contributions
 help-0.127
 Madd-0.098
 support-0.086
f-0.078
apy-0.076
Token things
Token ID1243
Feature activation-5.01e-02

Pos logit contributions
 appearance+0.007
ulum+0.006
cks+0.006
 ours+0.006
osaurus+0.006
Neg logit contributions
 help-0.006
 Madd-0.005
 support-0.004
f-0.004
 assistance-0.004
Token that
Token ID326
Feature activation-7.90e-03

Pos logit contributions
 appearance+0.108
 habitats+0.094
 yourselves+0.092
ulum+0.092
cks+0.092
Neg logit contributions
 help-0.073
 Madd-0.057
f-0.055
 support-0.054
 happily-0.048
Token matter
Token ID2300
Feature activation-5.75e-02

Pos logit contributions
 appearance+0.015
 yourselves+0.013
 ours+0.013
cks+0.013
ulum+0.013
Neg logit contributions
 help-0.013
 Madd-0.011
 support-0.010
f-0.010
apy-0.009
Token,
Token ID11
Feature activation+4.26e-03

Pos logit contributions
 appearance+0.106
 ours+0.089
ulum+0.088
 habitats+0.087
cks+0.087
Neg logit contributions
 help-0.097
 Madd-0.083
 support-0.076
f-0.072
 together-0.069
Token meant
Token ID4001
Feature activation-2.09e-02

Pos logit contributions
 help+0.027
 Madd+0.026
f+0.024
apy+0.023
 support+0.022
Neg logit contributions
 appearance-0.029
 ours-0.024
 yourselves-0.024
 yours-0.023
cks-0.023
Token something
Token ID1223
Feature activation-4.47e-02

Pos logit contributions
 appearance+0.043
 ours+0.037
 yourselves+0.036
cks+0.035
 yours+0.034
Neg logit contributions
 help-0.032
 Madd-0.029
f-0.025
 support-0.024
apy-0.024
Token that
Token ID326
Feature activation-1.24e-02

Pos logit contributions
 appearance+0.084
 ours+0.072
 yourselves+0.071
 yours+0.069
cks+0.066
Neg logit contributions
 help-0.069
 Madd-0.065
f-0.060
apy-0.057
 support-0.056
Token was
Token ID373
Feature activation-1.81e-02

Pos logit contributions
 appearance+0.025
 ours+0.021
 yourselves+0.021
 yours+0.020
cks+0.020
Neg logit contributions
 help-0.019
 Madd-0.017
f-0.016
apy-0.015
 support-0.015
Token not
Token ID407
Feature activation-3.25e-02

Pos logit contributions
 appearance+0.039
 yourselves+0.033
ulum+0.032
 ours+0.031
 habitats+0.031
Neg logit contributions
 help-0.026
 Madd-0.022
 support-0.020
f-0.019
 happily-0.019
Token common
Token ID2219
Feature activation-1.10e-01

Pos logit contributions
 appearance+0.069
 yourselves+0.059
 ours+0.058
cks+0.055
 yours+0.055
Neg logit contributions
 help-0.045
 Madd-0.040
f-0.037
 support-0.035
apy-0.034
Token,
Token ID11
Feature activation-1.00e-02

Pos logit contributions
 appearance+0.244
 ours+0.214
 yourselves+0.208
cks+0.201
 yours+0.199
Neg logit contributions
 help-0.149
 Madd-0.124
f-0.119
 happily-0.110
 support-0.110
Token something
Token ID1223
Feature activation-4.42e-02

Pos logit contributions
 appearance+0.019
cks+0.016
 ours+0.016
ulum+0.016
 yourselves+0.016
Neg logit contributions
 help-0.016
 Madd-0.014
f-0.012
 lear-0.012
apy-0.012
Token that
Token ID326
Feature activation-1.16e-02

Pos logit contributions
 appearance+0.085
 ours+0.071
 yourselves+0.071
 yours+0.068
cks+0.066
Neg logit contributions
 help-0.071
 Madd-0.065
f-0.060
apy-0.056
 support-0.056
Token was
Token ID373
Feature activation-1.79e-02

Pos logit contributions
 appearance+0.024
 ours+0.021
 yourselves+0.021
 yours+0.019
ulum+0.019
Neg logit contributions
 help-0.017
 Madd-0.016
f-0.014
apy-0.013
 support-0.013
Token special
Token ID2041
Feature activation-1.73e-02

Pos logit contributions
 appearance+0.038
 yourselves+0.032
ulum+0.031
 habitats+0.030
opl+0.029
Neg logit contributions
 help-0.027
 Madd-0.022
 happily-0.021
 support-0.020
f-0.020
Token that
Token ID326
Feature activation-1.36e-02

Pos logit contributions
 appearance+0.046
 yourselves+0.040
 ours+0.040
 yours+0.038
cks+0.038
Neg logit contributions
 help-0.026
 Madd-0.022
f-0.019
 support-0.019
apy-0.018
Token they
Token ID484
Feature activation-2.38e-03

Pos logit contributions
 appearance+0.032
 yourselves+0.028
 ours+0.028
ulum+0.027
cks+0.027
Neg logit contributions
 help-0.017
 Madd-0.015
apy-0.012
 support-0.012
f-0.011
Token had
Token ID550
Feature activation-2.50e-02

Pos logit contributions
 appearance+0.004
 ours+0.004
 yourselves+0.003
 yours+0.003
cks+0.003
Neg logit contributions
 Madd-0.004
 help-0.004
f-0.003
apy-0.003
 gard-0.003
Token more
Token ID517
Feature activation-2.42e-02

Pos logit contributions
 appearance+0.053
 yourselves+0.045
 ours+0.045
 yours+0.043
cks+0.043
Neg logit contributions
 help-0.036
 Madd-0.031
f-0.028
 support-0.028
apy-0.026
Token in
Token ID287
Feature activation+3.35e-03

Pos logit contributions
 appearance+0.050
 yourselves+0.042
 ours+0.042
 yours+0.040
ulum+0.039
Neg logit contributions
 help-0.037
 Madd-0.032
f-0.030
 support-0.029
apy-0.028
Token common
Token ID2219
Feature activation-1.10e-01

Pos logit contributions
 help+0.005
 Madd+0.004
 support+0.004
 happily+0.004
f+0.004
Neg logit contributions
 appearance-0.007
 ours-0.006
 yourselves-0.006
ulum-0.006
cks-0.006
Token than
Token ID621
Feature activation-2.55e-02

Pos logit contributions
 appearance+0.240
 ours+0.216
ulum+0.206
 yours+0.206
 yourselves+0.205
Neg logit contributions
 help-0.143
 Madd-0.119
 happily-0.113
f-0.113
 support-0.105
Token they
Token ID484
Feature activation-3.17e-03

Pos logit contributions
 appearance+0.053
 yourselves+0.045
ulum+0.045
cks+0.044
 ours+0.044
Neg logit contributions
 help-0.039
 Madd-0.032
f-0.030
 support-0.030
 happily-0.028
Token thought
Token ID1807
Feature activation+2.15e-02

Pos logit contributions
 appearance+0.007
 yourselves+0.006
 ours+0.006
ulum+0.006
cks+0.006
Neg logit contributions
 help-0.005
f-0.004
 support-0.004
 Madd-0.004
 happily-0.003
Token.
Token ID13
Feature activation-2.13e-02

Pos logit contributions
 help+0.029
 Madd+0.023
f+0.023
 support+0.021
apy+0.021
Neg logit contributions
 appearance-0.049
 yourselves-0.042
 ours-0.041
 habitats-0.040
ulum-0.039
Token They
Token ID1119
Feature activation+1.36e-03

Pos logit contributions
 appearance+0.051
cks+0.041
 ours+0.041
 yours+0.041
osaurus+0.040
Neg logit contributions
 Madd-0.024
 help-0.024
Sure-0.020
apy-0.020
 happily-0.019
Token
Token ID220
Feature activation-8.98e-03

Pos logit contributions
 help+0.029
f+0.026
 Madd+0.026
 support+0.024
 lear+0.024
Neg logit contributions
 appearance-0.025
 yourselves-0.020
 ours-0.020
 yours-0.019
cks-0.019
Token
Token ID198
Feature activation+1.26e-02

Pos logit contributions
 appearance+0.020
 ours+0.017
 yourselves+0.017
 yours+0.016
 habitats+0.016
Neg logit contributions
 help-0.012
 Madd-0.011
f-0.009
apy-0.009
 support-0.009
Token
Token ID198
Feature activation+1.24e-02

Pos logit contributions
 help+0.021
 Madd+0.020
f+0.019
apy+0.018
 support+0.018
Neg logit contributions
 appearance-0.021
 yourselves-0.018
 ours-0.017
 yours-0.017
cks-0.016
Token"
Token ID1
Feature activation+3.21e-02

Pos logit contributions
f+0.018
 help+0.018
 Madd+0.018
 lear+0.017
 accomplishment+0.017
Neg logit contributions
 appearance-0.020
 yourselves-0.018
 yourself-0.018
cks-0.017
 ours-0.017
TokenG
Token ID38
Feature activation+1.67e-02

Pos logit contributions
 help+0.061
f+0.056
 Madd+0.055
apy+0.053
 accomplishment+0.052
Neg logit contributions
 appearance-0.044
 yourselves-0.038
 ours-0.036
 yourself-0.035
 yours-0.034
Tokenreetings
Token ID46648
Feature activation-1.10e-01

Pos logit contributions
 help+0.024
 Madd+0.022
f+0.020
apy+0.018
 support+0.018
Neg logit contributions
 appearance-0.035
 ours-0.030
 yourselves-0.030
 yours-0.029
cks-0.028
Token,
Token ID11
Feature activation+8.98e-03

Pos logit contributions
 appearance+0.225
 ours+0.195
 yourselves+0.192
 yours+0.188
 yourself+0.180
Neg logit contributions
 help-0.154
 Madd-0.137
f-0.133
apy-0.126
 support-0.122
Token brave
Token ID14802
Feature activation-6.84e-04

Pos logit contributions
 help+0.009
 Madd+0.008
f+0.006
 support+0.006
apy+0.006
Neg logit contributions
 appearance-0.023
 ours-0.020
 yourselves-0.020
 yours-0.019
cks-0.019
Token rabbit
Token ID22746
Feature activation-3.58e-03

Pos logit contributions
 appearance+0.002
 ours+0.001
 yourselves+0.001
 yours+0.001
cks+0.001
Neg logit contributions
 help-0.001
 Madd-0.001
f-0.001
apy-0.001
 support-0.001
Token.
Token ID13
Feature activation+1.36e-02

Pos logit contributions
 appearance+0.006
 ours+0.005
 yourselves+0.005
 yours+0.005
cks+0.004
Neg logit contributions
 help-0.007
 Madd-0.006
f-0.006
apy-0.005
 support-0.005
Token I
Token ID314
Feature activation+6.30e-03

Pos logit contributions
 help+0.030
 Madd+0.029
f+0.026
 support+0.026
apy+0.025
Neg logit contributions
 appearance-0.018
 ours-0.014
 yourselves-0.013
cks-0.013
 yours-0.013
Token.
Token ID13
Feature activation-2.08e-02

Pos logit contributions
 help+0.046
 Madd+0.039
f+0.037
 support+0.034
aroo+0.033
Neg logit contributions
 appearance-0.063
 ours-0.054
 yourselves-0.053
ulum-0.050
 yours-0.050
Token It
Token ID632
Feature activation+1.67e-02

Pos logit contributions
 appearance+0.035
 ours+0.028
ulum+0.027
 yourselves+0.027
 habitats+0.026
Neg logit contributions
 help-0.039
 Madd-0.037
f-0.033
 support-0.032
apy-0.032
Token is
Token ID318
Feature activation-3.38e-02

Pos logit contributions
 help+0.038
 Madd+0.035
f+0.033
 support+0.032
apy+0.031
Neg logit contributions
 appearance-0.022
 ours-0.017
 yourselves-0.017
ulum-0.016
 habitats-0.015
Token a
Token ID257
Feature activation+1.42e-02

Pos logit contributions
 appearance+0.073
ulum+0.065
 habitats+0.060
 yourselves+0.060
opl+0.059
Neg logit contributions
 help-0.051
 Madd-0.041
 support-0.037
 brave-0.034
 happily-0.034
Token very
Token ID845
Feature activation-7.13e-03

Pos logit contributions
 help+0.021
 Madd+0.019
f+0.017
 support+0.016
apy+0.016
Neg logit contributions
 appearance-0.029
 ours-0.025
 yourselves-0.024
 yours-0.024
cks-0.023
Token common
Token ID2219
Feature activation-1.09e-01

Pos logit contributions
 appearance+0.015
 yourselves+0.013
ulum+0.012
 ours+0.012
cks+0.012
Neg logit contributions
 help-0.011
 Madd-0.009
f-0.009
 support-0.008
apy-0.008
Token bird
Token ID6512
Feature activation+1.80e-02

Pos logit contributions
 appearance+0.229
ulum+0.197
 ours+0.191
 yourselves+0.189
cks+0.184
Neg logit contributions
 help-0.159
 Madd-0.141
f-0.121
aroo-0.120
 happily-0.118
Token.
Token ID13
Feature activation-2.17e-02

Pos logit contributions
 help+0.022
f+0.016
 Madd+0.016
aroo+0.015
 support+0.014
Neg logit contributions
 appearance-0.044
ulum-0.038
 yourselves-0.038
 ours-0.037
 habitats-0.036
Token It
Token ID632
Feature activation+1.59e-02

Pos logit contributions
 appearance+0.038
 ours+0.030
ulum+0.029
 habitats+0.028
 yourselves+0.028
Neg logit contributions
 help-0.040
 Madd-0.038
f-0.032
 support-0.032
apy-0.032
Token likes
Token ID7832
Feature activation-2.24e-02

Pos logit contributions
 help+0.036
 Madd+0.031
f+0.030
 support+0.030
aroo+0.028
Neg logit contributions
 appearance-0.022
 yourselves-0.017
 ours-0.017
ulum-0.017
 habitats-0.016
Token to
Token ID284
Feature activation-1.60e-02

Pos logit contributions
 appearance+0.048
ulum+0.045
oma+0.042
 yourselves+0.041
opl+0.041
Neg logit contributions
 help-0.032
 Madd-0.025
 support-0.022
f-0.020
 crit-0.019
Token not
Token ID407
Feature activation-2.90e-02

Pos logit contributions
 appearance+0.077
ays+0.068
ulum+0.068
cks+0.068
 ours+0.067
Neg logit contributions
 help-0.063
 support-0.048
 believe-0.043
 Madd-0.043
f-0.039
Token judge
Token ID5052
Feature activation-4.44e-02

Pos logit contributions
 appearance+0.060
 ours+0.057
 yourselves+0.056
cks+0.054
ulum+0.053
Neg logit contributions
 help-0.045
 support-0.033
 believe-0.031
f-0.031
 Madd-0.030
Token others
Token ID1854
Feature activation-4.28e-02

Pos logit contributions
 appearance+0.083
ulum+0.073
 ours+0.072
opl+0.070
cks+0.069
Neg logit contributions
 help-0.073
 support-0.057
f-0.055
 Madd-0.054
 happily-0.051
Token by
Token ID416
Feature activation+4.15e-02

Pos logit contributions
 appearance+0.090
ulum+0.080
 ours+0.079
 habitats+0.078
cks+0.078
Neg logit contributions
 help-0.062
 support-0.046
f-0.045
 happily-0.043
 believe-0.042
Token their
Token ID511
Feature activation+1.22e-02

Pos logit contributions
 help+0.077
 support+0.061
 Madd+0.057
 believe+0.052
f+0.052
Neg logit contributions
 appearance-0.067
ulum-0.060
 ours-0.059
ays-0.058
ellery-0.057
Token appearance
Token ID5585
Feature activation-1.09e-01

Pos logit contributions
 help+0.027
 support+0.022
 Madd+0.022
f+0.022
 happily+0.021
Neg logit contributions
 appearance-0.014
 ours-0.013
cks-0.013
ulum-0.013
 yourselves-0.013
Token.
Token ID13
Feature activation-1.74e-02

Pos logit contributions
 appearance+0.225
cks+0.202
 ours+0.202
ulum+0.198
 habitats+0.191
Neg logit contributions
 help-0.150
 Madd-0.113
 happily-0.111
 support-0.106
f-0.103
Token You
Token ID921
Feature activation+3.13e-02

Pos logit contributions
 appearance+0.030
 ours+0.024
cks+0.023
ulum+0.022
 habitats+0.021
Neg logit contributions
 help-0.031
 Madd-0.030
 support-0.026
Sure-0.025
 Fr-0.024
Token should
Token ID815
Feature activation-3.79e-02

Pos logit contributions
 help+0.063
 Madd+0.060
f+0.056
apy+0.054
 support+0.053
Neg logit contributions
 appearance-0.045
 ours-0.036
 yourselves-0.036
 yours-0.034
cks-0.032
Token be
Token ID307
Feature activation+1.84e-02

Pos logit contributions
 appearance+0.078
ays+0.071
ulum+0.070
cks+0.069
 ours+0.068
Neg logit contributions
 help-0.064
 support-0.048
 believe-0.044
 Madd-0.042
f-0.039
Token kind
Token ID1611
Feature activation-1.99e-02

Pos logit contributions
 help+0.026
 Madd+0.023
f+0.021
 support+0.020
apy+0.019
Neg logit contributions
 appearance-0.039
 ours-0.033
 yourselves-0.033
 yours-0.032
cks-0.031
Token.
Token ID13
Feature activation-1.62e-02

Pos logit contributions
 appearance+0.122
 yourselves+0.107
 ours+0.106
ulum+0.102
 yours+0.101
Neg logit contributions
 help-0.060
f-0.046
 Madd-0.045
 happily-0.041
 support-0.039
Token They
Token ID1119
Feature activation+5.05e-03

Pos logit contributions
 appearance+0.027
 ours+0.022
 yourselves+0.022
 yours+0.021
cks+0.020
Neg logit contributions
 help-0.030
 Madd-0.028
f-0.026
apy-0.025
 support-0.025
Token learn
Token ID2193
Feature activation-1.65e-03

Pos logit contributions
 help+0.008
 Madd+0.006
f+0.006
 support+0.006
 believe+0.006
Neg logit contributions
 appearance-0.010
 ours-0.009
 yourselves-0.009
 yours-0.009
cks-0.008
Token about
Token ID546
Feature activation-4.81e-02

Pos logit contributions
 appearance+0.003
 yourselves+0.003
 ours+0.003
ulum+0.003
 yours+0.003
Neg logit contributions
 help-0.003
 Madd-0.002
 support-0.002
f-0.002
apy-0.002
Token different
Token ID1180
Feature activation-8.54e-03

Pos logit contributions
 appearance+0.096
ulum+0.089
 yourselves+0.087
 ours+0.087
opl+0.084
Neg logit contributions
 help-0.071
 Madd-0.058
 support-0.051
 crit-0.047
 El-0.046
Token cultures
Token ID13817
Feature activation-1.08e-01

Pos logit contributions
 appearance+0.016
 ours+0.016
ulum+0.015
 yours+0.015
 yourselves+0.015
Neg logit contributions
 help-0.012
f-0.010
 crit-0.009
 support-0.009
 Madd-0.009
Token and
Token ID290
Feature activation-1.63e-02

Pos logit contributions
 appearance+0.258
 yourselves+0.227
 ours+0.227
ulum+0.222
cks+0.217
Neg logit contributions
 help-0.134
 Madd-0.095
f-0.094
 support-0.092
 happily-0.088
Token times
Token ID1661
Feature activation-1.06e-02

Pos logit contributions
ulum+0.027
 appearance+0.026
ays+0.026
oma+0.025
 yourselves+0.025
Neg logit contributions
 help-0.031
 support-0.025
 believe-0.022
 Madd-0.022
f-0.021
Token.
Token ID13
Feature activation-1.74e-02

Pos logit contributions
 appearance+0.023
 yourselves+0.020
 ours+0.020
ulum+0.019
 yours+0.019
Neg logit contributions
 help-0.015
f-0.012
 Madd-0.012
 support-0.011
apy-0.011
Token They
Token ID1119
Feature activation+4.30e-03

Pos logit contributions
 appearance+0.034
 ours+0.029
 yourselves+0.028
 yours+0.028
cks+0.028
Neg logit contributions
 help-0.028
 Madd-0.027
f-0.022
apy-0.022
 support-0.022
Token have
Token ID423
Feature activation-8.52e-03

Pos logit contributions
 help+0.007
 Madd+0.006
f+0.006
 support+0.005
apy+0.005
Neg logit contributions
 appearance-0.008
 ours-0.007
 yourselves-0.007
 yours-0.007
cks-0.007
Token't
Token ID470
Feature activation-2.37e-02

Pos logit contributions
 help+0.006
 Madd+0.005
f+0.005
 support+0.005
apy+0.005
Neg logit contributions
 appearance-0.006
 ours-0.005
 habitats-0.005
 yourselves-0.005
 yours-0.005
Token judge
Token ID5052
Feature activation-4.04e-02

Pos logit contributions
 appearance+0.048
 yourselves+0.044
 ours+0.044
cks+0.044
ulum+0.042
Neg logit contributions
 help-0.037
 support-0.028
 Madd-0.027
 believe-0.027
f-0.026
Token things
Token ID1243
Feature activation-4.90e-02

Pos logit contributions
 appearance+0.075
 ours+0.067
ulum+0.064
cks+0.063
 yourselves+0.063
Neg logit contributions
 help-0.066
 Madd-0.053
f-0.052
 support-0.052
 happily-0.046
Token by
Token ID416
Feature activation+4.19e-02

Pos logit contributions
 appearance+0.104
 yourselves+0.088
 ours+0.087
ulum+0.086
 habitats+0.086
Neg logit contributions
 help-0.073
 Madd-0.058
f-0.057
 support-0.055
 believe-0.049
Token their
Token ID511
Feature activation+1.26e-02

Pos logit contributions
 help+0.085
 Madd+0.069
 support+0.068
 believe+0.064
f+0.062
Neg logit contributions
 appearance-0.062
ulum-0.055
 ours-0.053
ays-0.052
ucker-0.050
Token appearance
Token ID5585
Feature activation-1.07e-01

Pos logit contributions
 help+0.031
 support+0.027
f+0.026
 Madd+0.025
 happily+0.025
Neg logit contributions
 yourselves-0.010
cks-0.010
ulum-0.010
 ours-0.010
 appearance-0.010
Token,
Token ID11
Feature activation-2.76e-03

Pos logit contributions
 appearance+0.238
 ours+0.218
cks+0.214
ulum+0.209
 habitats+0.205
Neg logit contributions
 help-0.132
 Madd-0.103
 support-0.092
f-0.091
 happily-0.090
Token be
Token ID307
Feature activation+1.89e-02

Pos logit contributions
 appearance+0.005
cks+0.005
ulum+0.005
oma+0.005
opl+0.005
Neg logit contributions
 help-0.004
 Madd-0.004
 support-0.003
f-0.003
 believe-0.003
Token curious
Token ID11040
Feature activation-1.98e-02

Pos logit contributions
 help+0.022
 Madd+0.018
f+0.016
 support+0.016
sy+0.015
Neg logit contributions
 appearance-0.046
 ours-0.039
 yourselves-0.039
 yours-0.037
ulum-0.037
Token and
Token ID290
Feature activation-1.28e-02

Pos logit contributions
 appearance+0.043
cks+0.042
 yourselves+0.041
 ours+0.040
ulum+0.038
Neg logit contributions
 help-0.026
 support-0.019
 Madd-0.018
 happily-0.018
f-0.017
Token respectful
Token ID25923
Feature activation+7.51e-03

Pos logit contributions
 appearance+0.027
ulum+0.023
 yourselves+0.023
oma+0.023
cks+0.022
Neg logit contributions
 help-0.020
 support-0.016
 Madd-0.015
f-0.014
 believe-0.014
Token their
Token ID511
Feature activation+1.27e-02

Pos logit contributions
 appearance+0.027
 yourselves+0.022
 ours+0.021
 yours+0.020
cks+0.019
Neg logit contributions
 help-0.037
 Madd-0.034
f-0.033
sy-0.032
 support-0.032
Token mom
Token ID1995
Feature activation+1.57e-02

Pos logit contributions
 help+0.017
 Madd+0.017
f+0.016
free+0.015
sy+0.015
Neg logit contributions
 appearance-0.027
 yourselves-0.022
 ours-0.021
 habitats-0.021
 yours-0.020
Token said
Token ID531
Feature activation+3.04e-03

Pos logit contributions
 help+0.026
 Madd+0.025
apy+0.022
 support+0.022
f+0.022
Neg logit contributions
 appearance-0.027
 yourselves-0.023
 ours-0.022
cks-0.022
 yours-0.021
Token.
Token ID13
Feature activation-1.54e-02

Pos logit contributions
 Madd+0.006
 help+0.006
Quick+0.005
Carl+0.005
Sure+0.005
Neg logit contributions
 appearance-0.004
 yourselves-0.003
 ours-0.003
 stealing-0.003
 either-0.003
Token "
Token ID366
Feature activation+3.24e-03

Pos logit contributions
 appearance+0.030
 ours+0.026
 yours+0.025
 yourselves+0.025
cks+0.024
Neg logit contributions
 help-0.025
 Madd-0.023
f-0.020
apy-0.019
 support-0.019
TokenYou
Token ID1639
Feature activation+6.79e-02

Pos logit contributions
 help+0.006
 Madd+0.005
f+0.005
apy+0.005
 support+0.005
Neg logit contributions
 appearance-0.006
 ours-0.005
 yours-0.005
 yourselves-0.005
cks-0.004
Token are
Token ID389
Feature activation-1.36e-02

Pos logit contributions
 help+0.116
 Madd+0.116
f+0.104
apy+0.100
 support+0.097
Neg logit contributions
 appearance-0.114
 ours-0.098
 yourselves-0.095
 yours-0.092
cks-0.087
Token being
Token ID852
Feature activation+3.97e-02

Pos logit contributions
 appearance+0.020
 ours+0.016
 yourselves+0.016
 yours+0.015
cks+0.014
Neg logit contributions
 help-0.028
 Madd-0.026
f-0.025
 support-0.024
apy-0.024
Token too
Token ID1165
Feature activation+2.19e-02

Pos logit contributions
 Madd+0.069
 help+0.068
f+0.064
apy+0.061
Sure+0.058
Neg logit contributions
 appearance-0.063
 ours-0.059
 yourselves-0.059
 yours-0.055
 yourself-0.053
Token loud
Token ID7812
Feature activation+2.89e-03

Pos logit contributions
 help+0.033
 Madd+0.032
f+0.028
apy+0.027
 support+0.027
Neg logit contributions
 appearance-0.043
 ours-0.037
 yourselves-0.037
 yours-0.035
cks-0.034
Token and
Token ID290
Feature activation-1.51e-02

Pos logit contributions
 help+0.003
 happily+0.002
 together+0.002
f+0.002
 support+0.002
Neg logit contributions
 appearance-0.007
 ours-0.006
cks-0.006
ellery-0.006
 yourselves-0.006
Token a
Token ID257
Feature activation+1.55e-02

Pos logit contributions
 appearance+0.048
 ours+0.041
 yourselves+0.041
cks+0.041
 yours+0.040
Neg logit contributions
 help-0.024
 Madd-0.021
f-0.018
apy-0.017
 support-0.017
Token big
Token ID1263
Feature activation-2.24e-02

Pos logit contributions
 Madd+0.025
 help+0.023
f+0.022
taker+0.022
aroo+0.022
Neg logit contributions
 appearance-0.028
 ours-0.023
 yours-0.022
ulum-0.020
 mine-0.020
Token fight
Token ID1907
Feature activation-1.71e-03

Pos logit contributions
 appearance+0.044
 ours+0.036
 yours+0.034
ulum+0.033
 yourselves+0.033
Neg logit contributions
 Madd-0.033
 help-0.032
f-0.031
apy-0.030
sy-0.029
Token.
Token ID13
Feature activation-5.99e-03

Pos logit contributions
 appearance+0.003
 ours+0.003
 yourselves+0.003
 yours+0.002
cks+0.002
Neg logit contributions
 help-0.003
 Madd-0.003
f-0.002
apy-0.002
 support-0.002
Token "
Token ID366
Feature activation+7.69e-03

Pos logit contributions
 appearance+0.010
 ours+0.009
 yourselves+0.008
 yours+0.008
cks+0.008
Neg logit contributions
 help-0.011
 Madd-0.010
f-0.009
apy-0.009
 support-0.009
TokenYou
Token ID1639
Feature activation+7.12e-02

Pos logit contributions
 help+0.013
 Madd+0.012
f+0.011
apy+0.011
Sure+0.011
Neg logit contributions
 appearance-0.014
 ours-0.012
 yours-0.011
 yourselves-0.011
cks-0.011
Token can
Token ID460
Feature activation-2.50e-04

Pos logit contributions
 help+0.127
 Madd+0.126
f+0.114
apy+0.110
 support+0.106
Neg logit contributions
 appearance-0.115
 ours-0.098
 yourselves-0.093
 yours-0.092
cks-0.086
Token't
Token ID470
Feature activation-2.28e-02

Pos logit contributions
 appearance+0.000
 ours+0.000
 yourselves+0.000
 yours+0.000
cks+0.000
Neg logit contributions
 help-0.000
 Madd-0.000
f-0.000
apy-0.000
 support-0.000
Token play
Token ID711
Feature activation+9.66e-03

Pos logit contributions
 appearance+0.046
 ours+0.039
 yourselves+0.039
 yours+0.038
cks+0.036
Neg logit contributions
 help-0.033
 Madd-0.031
f-0.028
apy-0.027
 support-0.026
Token with
Token ID351
Feature activation+3.01e-03

Pos logit contributions
 help+0.015
 Madd+0.013
f+0.012
 support+0.011
apy+0.011
Neg logit contributions
 appearance-0.020
 ours-0.017
 yourselves-0.016
 habitats-0.016
 yours-0.016
Token my
Token ID616
Feature activation-1.07e-02

Pos logit contributions
 help+0.005
 Madd+0.005
 support+0.004
apy+0.004
f+0.004
Neg logit contributions
 appearance-0.005
 habitats-0.004
cks-0.004
 ours-0.004
 yourselves-0.004
Token hello
Token ID23748
Feature activation-6.33e-02

Pos logit contributions
 help+0.009
 Madd+0.008
f+0.008
 support+0.007
apy+0.007
Neg logit contributions
 appearance-0.009
 ours-0.008
 yourselves-0.008
cks-0.007
ulum-0.007
Token to
Token ID284
Feature activation-7.96e-03

Pos logit contributions
 appearance+0.135
 ours+0.122
 yourselves+0.120
cks+0.116
 yours+0.114
Neg logit contributions
 help-0.087
 Madd-0.078
f-0.070
 happily-0.065
 support-0.064
Token Sally
Token ID25737
Feature activation-2.90e-02

Pos logit contributions
 appearance+0.016
 ours+0.014
 yourselves+0.014
 yours+0.013
cks+0.013
Neg logit contributions
 help-0.012
 Madd-0.011
f-0.009
 support-0.009
apy-0.009
Token.
Token ID13
Feature activation+7.54e-03

Pos logit contributions
 appearance+0.064
 yourselves+0.056
 ours+0.055
cks+0.052
 yours+0.052
Neg logit contributions
 help-0.038
 Madd-0.035
f-0.031
 support-0.028
 happily-0.028
Token Sur
Token ID4198
Feature activation+3.22e-02

Pos logit contributions
 help+0.012
 Madd+0.011
f+0.011
apy+0.010
 support+0.010
Neg logit contributions
 appearance-0.014
 ours-0.012
 yourselves-0.012
 yours-0.011
cks-0.011
Tokenpr
Token ID1050
Feature activation+7.03e-02

Pos logit contributions
 help+0.043
 Madd+0.041
apy+0.038
 support+0.034
 accomplishment+0.033
Neg logit contributions
 appearance-0.061
 yourselves-0.057
 yourself-0.055
cks-0.055
 yours-0.054
Tokenised
Token ID1417
Feature activation+5.58e-02

Pos logit contributions
 help+0.078
 Madd+0.064
f+0.058
 support+0.055
 lear+0.051
Neg logit contributions
 appearance-0.166
 yourselves-0.145
 ours-0.143
cks-0.142
 yours-0.139
Token but
Token ID475
Feature activation-2.43e-04

Pos logit contributions
 help+0.070
 Madd+0.063
f+0.057
apy+0.052
 support+0.052
Neg logit contributions
 appearance-0.125
 ours-0.109
 yourselves-0.108
 yours-0.105
cks-0.102
Token excited
Token ID6568
Feature activation-7.51e-03

Pos logit contributions
 appearance+0.000
 ours+0.000
 yourselves+0.000
 yours+0.000
cks+0.000
Neg logit contributions
 help-0.000
 Madd-0.000
f-0.000
apy-0.000
 support-0.000
Token,
Token ID11
Feature activation+1.18e-02

Pos logit contributions
 appearance+0.018
 ours+0.016
 yourselves+0.016
cks+0.016
opl+0.016
Neg logit contributions
 help-0.008
 Madd-0.007
f-0.006
 support-0.005
 happily-0.005
Token Sally
Token ID25737
Feature activation-2.52e-02

Pos logit contributions
 help+0.018
 Madd+0.016
f+0.016
apy+0.014
 support+0.014
Neg logit contributions
 appearance-0.022
 ours-0.019
 yourselves-0.019
 yours-0.019
 yourself-0.018
Token impressed
Token ID12617
Feature activation-2.60e-02

Pos logit contributions
 appearance+0.025
 yourselves+0.021
 ours+0.021
 habitats+0.020
 yours+0.020
Neg logit contributions
 help-0.012
 Madd-0.011
 happily-0.009
 support-0.009
f-0.009
Token and
Token ID290
Feature activation+1.92e-03

Pos logit contributions
 appearance+0.054
 ours+0.046
 yourselves+0.045
 yours+0.043
cks+0.043
Neg logit contributions
 help-0.039
 Madd-0.035
f-0.033
 support-0.030
 happily-0.030
Token said
Token ID531
Feature activation+6.79e-03

Pos logit contributions
 help+0.003
 Madd+0.002
f+0.002
apy+0.002
 support+0.002
Neg logit contributions
 appearance-0.004
 ours-0.004
 yourselves-0.004
 yours-0.003
cks-0.003
Token,
Token ID11
Feature activation+7.40e-03

Pos logit contributions
 help+0.009
 Madd+0.008
f+0.008
sy+0.007
 support+0.007
Neg logit contributions
 appearance-0.014
 yours-0.012
 ours-0.012
 yourselves-0.012
 yourself-0.011
Token "
Token ID366
Feature activation+1.40e-02

Pos logit contributions
 help+0.012
 Madd+0.011
f+0.010
apy+0.010
 support+0.010
Neg logit contributions
 appearance-0.014
 ours-0.012
 yourselves-0.012
 yours-0.011
cks-0.011
TokenYou
Token ID1639
Feature activation+7.56e-02

Pos logit contributions
 help+0.022
 Madd+0.020
f+0.019
Sure+0.018
 support+0.017
Neg logit contributions
 appearance-0.028
 ours-0.024
 yourselves-0.023
 yours-0.022
cks-0.022
Token are
Token ID389
Feature activation-8.50e-03

Pos logit contributions
 Madd+0.124
 help+0.123
f+0.111
apy+0.110
aroo+0.103
Neg logit contributions
 appearance-0.130
 ours-0.112
 yourselves-0.108
 yours-0.107
 yourself-0.103
Token a
Token ID257
Feature activation+1.89e-02

Pos logit contributions
 appearance+0.016
 ours+0.014
 yourselves+0.014
 yours+0.014
 yourself+0.013
Neg logit contributions
 help-0.013
 Madd-0.013
f-0.012
apy-0.011
 support-0.011
Token restless
Token ID42537
Feature activation+2.89e-03

Pos logit contributions
 Madd+0.030
f+0.026
 help+0.026
apy+0.026
 Suz+0.024
Neg logit contributions
 appearance-0.035
 yours-0.029
 ours-0.029
 yourselves-0.028
 mine-0.027
Token little
Token ID1310
Feature activation+4.64e-03

Pos logit contributions
 help+0.004
 Madd+0.004
f+0.004
apy+0.003
 support+0.003
Neg logit contributions
 appearance-0.006
 ours-0.005
 yourselves-0.005
 yours-0.005
cks-0.005
Token musician
Token ID21623
Feature activation-1.14e-02

Pos logit contributions
 Madd+0.006
 help+0.006
f+0.006
apy+0.005
sy+0.005
Neg logit contributions
 appearance-0.009
 ours-0.008
 yours-0.008
 yourselves-0.008
 yourself-0.007
Token their
Token ID511
Feature activation+1.12e-02

Pos logit contributions
 appearance+0.039
 ours+0.034
 yourselves+0.033
 yours+0.031
 Do+0.030
Neg logit contributions
sy-0.049
f-0.048
 help-0.047
 Madd-0.047
tted-0.043
Token mom
Token ID1995
Feature activation+1.48e-02

Pos logit contributions
 Madd+0.015
 help+0.014
f+0.014
sy+0.012
aroo+0.012
Neg logit contributions
 appearance-0.024
 yourselves-0.019
 ours-0.019
 habitats-0.018
 yours-0.018
Token calls
Token ID3848
Feature activation-1.19e-02

Pos logit contributions
 help+0.030
 Madd+0.027
f+0.026
 support+0.024
 happily+0.023
Neg logit contributions
 appearance-0.024
 ours-0.019
 yourselves-0.018
 yours-0.018
 habitats-0.017
Token.
Token ID13
Feature activation-2.12e-02

Pos logit contributions
 appearance+0.014
 either+0.011
 ours+0.010
 yourselves+0.010
 or+0.010
Neg logit contributions
 Madd-0.024
apy-0.023
aroo-0.022
sy-0.022
Sure-0.022
Token "
Token ID366
Feature activation+2.71e-03

Pos logit contributions
 appearance+0.044
 ours+0.038
 yourselves+0.037
 yours+0.037
cks+0.036
Neg logit contributions
 help-0.032
 Madd-0.029
f-0.025
apy-0.024
 support-0.024
TokenYou
Token ID1639
Feature activation+6.79e-02

Pos logit contributions
 help+0.005
 Madd+0.005
f+0.004
apy+0.004
 support+0.004
Neg logit contributions
 appearance-0.005
 yourselves-0.004
 ours-0.004
 yours-0.004
cks-0.003
Token must
Token ID1276
Feature activation-4.03e-02

Pos logit contributions
 Madd+0.122
 help+0.117
apy+0.110
f+0.109
Sure+0.104
Neg logit contributions
 appearance-0.107
 ours-0.093
 yourselves-0.088
 yours-0.086
 yourself-0.081
Token be
Token ID307
Feature activation+2.22e-02

Pos logit contributions
 appearance+0.081
 ours+0.068
 yourselves+0.066
 yours+0.065
cks+0.065
Neg logit contributions
 help-0.063
 Madd-0.055
f-0.050
 support-0.049
 believe-0.045
Token hungry
Token ID14720
Feature activation-3.77e-02

Pos logit contributions
 help+0.027
 Madd+0.025
f+0.023
apy+0.021
 support+0.020
Neg logit contributions
 appearance-0.050
 ours-0.044
 yourselves-0.044
 yours-0.043
cks-0.041
Token and
Token ID290
Feature activation-1.82e-02

Pos logit contributions
 appearance+0.085
cks+0.075
 habitats+0.073
ulum+0.071
 ours+0.070
Neg logit contributions
 help-0.052
 Madd-0.038
 happily-0.037
f-0.036
 support-0.035
Token thirsty
Token ID47124
Feature activation-1.65e-02

Pos logit contributions
 appearance+0.039
 yourselves+0.032
 ours+0.032
ulum+0.032
cks+0.031
Neg logit contributions
 help-0.027
 Madd-0.022
f-0.021
 support-0.020
apy-0.019
Token was
Token ID373
Feature activation-1.82e-03

Pos logit contributions
 help+0.023
 Madd+0.020
f+0.019
apy+0.018
 support+0.017
Neg logit contributions
 appearance-0.033
 ours-0.029
 yourselves-0.028
 yours-0.027
cks-0.026
Token so
Token ID523
Feature activation-2.12e-02

Pos logit contributions
 appearance+0.004
 yourselves+0.003
 ours+0.003
cks+0.003
 Sharing+0.003
Neg logit contributions
 help-0.002
 Madd-0.002
 happily-0.002
apy-0.002
 support-0.002
Token blue
Token ID4171
Feature activation-1.79e-02

Pos logit contributions
 appearance+0.046
 yourselves+0.038
 ours+0.038
 yours+0.037
cks+0.037
Neg logit contributions
 help-0.030
 Madd-0.028
apy-0.025
f-0.024
 support-0.023
Token and
Token ID290
Feature activation+3.14e-03

Pos logit contributions
 appearance+0.043
 ours+0.038
 yourselves+0.038
cks+0.036
 habitats+0.036
Neg logit contributions
 help-0.020
 Madd-0.018
 happily-0.016
f-0.016
apy-0.014
Token the
Token ID262
Feature activation+8.53e-03

Pos logit contributions
 help+0.004
 Madd+0.004
apy+0.003
f+0.003
 support+0.003
Neg logit contributions
 appearance-0.007
 yourselves-0.006
 ours-0.006
cks-0.006
 yours-0.006
Token light
Token ID1657
Feature activation+1.33e-02

Pos logit contributions
 help+0.013
 Madd+0.012
f+0.011
apy+0.011
 support+0.010
Neg logit contributions
 appearance-0.017
 ours-0.014
 yourselves-0.014
 yours-0.013
cks-0.013
Token was
Token ID373
Feature activation-9.29e-04

Pos logit contributions
 help+0.019
 Madd+0.017
f+0.016
apy+0.015
 support+0.015
Neg logit contributions
 appearance-0.028
 ours-0.024
 yourselves-0.024
 yours-0.023
cks-0.022
Token so
Token ID523
Feature activation-2.18e-02

Pos logit contributions
 appearance+0.002
 yourselves+0.002
 ours+0.002
 Sharing+0.002
 habitats+0.002
Neg logit contributions
 help-0.001
 Madd-0.001
 happily-0.001
apy-0.001
 support-0.001
Token bright
Token ID6016
Feature activation+1.32e-02

Pos logit contributions
 appearance+0.050
 yourselves+0.041
 ours+0.040
 yours+0.040
cks+0.040
Neg logit contributions
 help-0.029
 Madd-0.027
apy-0.025
f-0.023
 happily-0.022
Token.
Token ID13
Feature activation+1.30e-02

Pos logit contributions
 help+0.016
 Madd+0.014
 happily+0.013
f+0.012
apy+0.011
Neg logit contributions
 appearance-0.031
 ours-0.027
 yourselves-0.027
cks-0.026
 habitats-0.026
Token He
Token ID679
Feature activation+4.76e-03

Pos logit contributions
 help+0.020
 Madd+0.020
f+0.017
apy+0.017
 support+0.016
Neg logit contributions
 appearance-0.026
 ours-0.021
 yourselves-0.021
 yours-0.020
cks-0.020
Token not
Token ID407
Feature activation-1.57e-02

Pos logit contributions
 help+0.024
 Madd+0.021
f+0.020
 support+0.020
 happily+0.019
Neg logit contributions
 appearance-0.012
 ours-0.010
 yourselves-0.009
 yours-0.009
 habitats-0.008
Token notice
Token ID4003
Feature activation-1.40e-02

Pos logit contributions
 appearance+0.029
 ours+0.024
 yourselves+0.024
 yours+0.023
cks+0.023
Neg logit contributions
 help-0.026
 Madd-0.024
f-0.022
apy-0.021
 support-0.021
Token the
Token ID262
Feature activation+1.85e-03

Pos logit contributions
 appearance+0.033
 ours+0.030
 yours+0.029
 yourselves+0.029
cks+0.028
Neg logit contributions
 help-0.016
 Madd-0.014
f-0.011
apy-0.010
 support-0.010
Token mouse
Token ID10211
Feature activation+6.55e-03

Pos logit contributions
 help+0.003
 Madd+0.003
f+0.002
apy+0.002
 support+0.002
Neg logit contributions
 appearance-0.004
 ours-0.003
 yourselves-0.003
 yours-0.003
cks-0.003
Token and
Token ID290
Feature activation+1.13e-03

Pos logit contributions
 help+0.011
 Madd+0.009
f+0.009
 support+0.008
 happily+0.008
Neg logit contributions
 appearance-0.013
 yourselves-0.011
 ours-0.011
 yours-0.011
 habitats-0.010
Token just
Token ID655
Feature activation+1.38e-02

Pos logit contributions
 help+0.002
 Madd+0.001
f+0.001
apy+0.001
 support+0.001
Neg logit contributions
 appearance-0.002
 ours-0.002
 yourselves-0.002
 yours-0.002
cks-0.002
Token kept
Token ID4030
Feature activation-8.71e-03

Pos logit contributions
 help+0.021
 Madd+0.019
f+0.017
apy+0.016
 support+0.016
Neg logit contributions
 appearance-0.028
 ours-0.024
 yourselves-0.024
 yours-0.023
cks-0.022
Token walking
Token ID6155
Feature activation-3.55e-03

Pos logit contributions
 appearance+0.015
 ours+0.012
 yourselves+0.012
 habitats+0.012
 yours+0.012
Neg logit contributions
 help-0.016
 Madd-0.014
 support-0.013
f-0.013
apy-0.013
Token.
Token ID13
Feature activation+1.69e-02

Pos logit contributions
 appearance+0.007
 ours+0.006
 yourselves+0.006
 yours+0.005
 yourself+0.005
Neg logit contributions
 help-0.005
 Madd-0.005
apy-0.005
f-0.005
 support-0.005
Token The
Token ID383
Feature activation+2.77e-02

Pos logit contributions
 help+0.025
 Madd+0.023
f+0.021
 support+0.020
apy+0.020
Neg logit contributions
 appearance-0.034
 ours-0.029
 yourselves-0.029
 yours-0.028
cks-0.027
Token mouse
Token ID10211
Feature activation+7.43e-03

Pos logit contributions
 Madd+0.040
f+0.035
apy+0.034
 help+0.034
 oblig+0.032
Neg logit contributions
 appearance-0.057
 ours-0.047
 stealing-0.046
 yourselves-0.046
 yours-0.044
Token a
Token ID257
Feature activation+6.66e-03

Pos logit contributions
 appearance+0.035
 yourselves+0.030
 ours+0.030
 yours+0.029
 yourself+0.028
Neg logit contributions
 Madd-0.028
 help-0.027
apy-0.026
f-0.026
sy-0.025
Token big
Token ID1263
Feature activation-3.12e-02

Pos logit contributions
 help+0.010
 Madd+0.009
f+0.009
 support+0.008
apy+0.008
Neg logit contributions
 appearance-0.013
 ours-0.011
 yourselves-0.011
 yours-0.011
cks-0.011
Token tree
Token ID5509
Feature activation-1.89e-02

Pos logit contributions
 appearance+0.065
 ours+0.056
 yourselves+0.055
 yours+0.054
cks+0.053
Neg logit contributions
 help-0.044
 Madd-0.040
f-0.036
 support-0.034
apy-0.033
Token and
Token ID290
Feature activation-1.40e-02

Pos logit contributions
 appearance+0.040
 ours+0.034
 yourselves+0.034
 yours+0.033
 habitats+0.032
Neg logit contributions
 help-0.028
 Madd-0.023
f-0.022
 happily-0.021
 support-0.021
Token hid
Token ID24519
Feature activation-2.07e-02

Pos logit contributions
 appearance+0.024
 ours+0.022
 yourselves+0.021
 yours+0.021
 yourself+0.020
Neg logit contributions
 help-0.022
 Madd-0.020
f-0.019
apy-0.018
 support-0.018
Token behind
Token ID2157
Feature activation+1.57e-02

Pos logit contributions
 appearance+0.034
 yourselves+0.031
 ours+0.030
 yours+0.029
cks+0.028
Neg logit contributions
 help-0.035
 Madd-0.033
f-0.032
apy-0.031
 support-0.030
Token it
Token ID340
Feature activation-1.03e-02

Pos logit contributions
 help+0.021
 Madd+0.019
f+0.017
 support+0.015
apy+0.015
Neg logit contributions
 appearance-0.035
 ours-0.030
 yourselves-0.029
 yours-0.029
cks-0.028
Token.
Token ID13
Feature activation-1.69e-02

Pos logit contributions
 appearance+0.021
 ours+0.018
 yourselves+0.017
 yours+0.017
cks+0.016
Neg logit contributions
 help-0.016
 Madd-0.014
f-0.013
 support-0.012
apy-0.012
Token She
Token ID1375
Feature activation+5.23e-03

Pos logit contributions
 appearance+0.036
 ours+0.030
 yours+0.030
 yourselves+0.029
cks+0.028
Neg logit contributions
 help-0.025
 Madd-0.024
f-0.021
apy-0.020
 happily-0.019
Token hoped
Token ID10719
Feature activation+1.16e-02

Pos logit contributions
 help+0.008
 Madd+0.008
f+0.007
apy+0.007
 support+0.007
Neg logit contributions
 appearance-0.010
 yourselves-0.008
 ours-0.008
 yours-0.008
cks-0.007
Token Tom
Token ID4186
Feature activation-2.31e-02

Pos logit contributions
 help+0.020
 Madd+0.019
f+0.018
apy+0.017
 support+0.017
Neg logit contributions
 appearance-0.020
 ours-0.017
 yourselves-0.017
 yours-0.016
cks-0.015
Token
Token ID220
Feature activation-2.75e-03

Pos logit contributions
 help+0.030
 Madd+0.028
 happily+0.025
Sure+0.025
 mummy+0.024
Neg logit contributions
 appearance-0.060
 ours-0.054
cks-0.053
 yourselves-0.051
 yours-0.049
Token
Token ID198
Feature activation+1.99e-02

Pos logit contributions
 appearance+0.006
 ours+0.005
 yourselves+0.005
 yours+0.005
cks+0.005
Neg logit contributions
 help-0.004
 Madd-0.004
f-0.003
apy-0.003
 happily-0.003
Token
Token ID198
Feature activation+1.87e-02

Pos logit contributions
 help+0.027
 Madd+0.025
f+0.023
Sure+0.022
apy+0.021
Neg logit contributions
 appearance-0.044
 ours-0.038
 yourselves-0.036
 yours-0.035
cks-0.034
TokenThe
Token ID464
Feature activation+2.37e-02

Pos logit contributions
 help+0.025
 Madd+0.022
f+0.021
apy+0.019
 support+0.019
Neg logit contributions
 appearance-0.041
 ours-0.035
 yourselves-0.034
 yours-0.034
cks-0.033
Token people
Token ID661
Feature activation-7.58e-03

Pos logit contributions
 Madd+0.035
 help+0.034
apy+0.031
f+0.031
sy+0.028
Neg logit contributions
 appearance-0.045
 yourselves-0.037
 ours-0.037
 yours-0.036
 yourself-0.035
Token in
Token ID287
Feature activation+1.96e-02

Pos logit contributions
 appearance+0.015
 ours+0.012
 yourselves+0.012
 yours+0.012
cks+0.012
Neg logit contributions
 help-0.011
 Madd-0.010
f-0.009
apy-0.009
 support-0.008
Token the
Token ID262
Feature activation+1.13e-02

Pos logit contributions
 help+0.023
 Madd+0.019
f+0.017
 support+0.017
apy+0.016
Neg logit contributions
 appearance-0.046
 ours-0.041
 yourselves-0.041
 yours-0.040
cks-0.038
Token church
Token ID4928
Feature activation+1.03e-02

Pos logit contributions
 help+0.019
 Madd+0.019
f+0.017
apy+0.016
 support+0.016
Neg logit contributions
 appearance-0.020
 yourselves-0.017
 ours-0.016
 yours-0.016
 habitats-0.015
Token were
Token ID547
Feature activation+1.08e-02

Pos logit contributions
 help+0.015
 Madd+0.013
f+0.012
apy+0.011
 support+0.011
Neg logit contributions
 appearance-0.021
 ours-0.018
 yourselves-0.018
 yours-0.018
cks-0.017
Token friendly
Token ID8030
Feature activation-1.71e-02

Pos logit contributions
 help+0.015
 Madd+0.014
f+0.012
 support+0.012
apy+0.012
Neg logit contributions
 appearance-0.023
 ours-0.019
 yourselves-0.019
 yours-0.019
cks-0.018
Token and
Token ID290
Feature activation+6.34e-03

Pos logit contributions
 appearance+0.034
 ours+0.030
 yourselves+0.029
 yours+0.028
cks+0.028
Neg logit contributions
 help-0.026
 Madd-0.023
f-0.021
 support-0.020
apy-0.019
Token arrived
Token ID5284
Feature activation-2.44e-02

Pos logit contributions
 appearance+0.004
 ours+0.003
 yourselves+0.003
 yours+0.003
cks+0.003
Neg logit contributions
 help-0.003
 Madd-0.003
f-0.003
apy-0.003
 support-0.002
Token at
Token ID379
Feature activation+3.15e-02

Pos logit contributions
 appearance+0.046
 yourselves+0.041
 ours+0.040
 yourself+0.038
 yours+0.038
Neg logit contributions
 help-0.036
 Madd-0.035
f-0.032
 support-0.030
apy-0.030
Token a
Token ID257
Feature activation+2.44e-02

Pos logit contributions
 help+0.043
 Madd+0.040
f+0.039
apy+0.036
sy+0.035
Neg logit contributions
 appearance-0.065
 yourselves-0.057
 ours-0.056
 yours-0.055
 yourself-0.052
Token beautiful
Token ID4950
Feature activation-6.18e-04

Pos logit contributions
 help+0.035
 Madd+0.030
 support+0.027
f+0.027
sy+0.025
Neg logit contributions
 appearance-0.051
 yourselves-0.045
cks-0.044
 ours-0.044
 yours-0.043
Token me
Token ID502
Feature activation+1.44e-02

Pos logit contributions
 appearance+0.001
cks+0.001
 yourselves+0.001
 ours+0.001
 yours+0.001
Neg logit contributions
 help-0.001
 Madd-0.001
 support-0.001
 happily-0.001
f-0.001
Tokenadow
Token ID4584
Feature activation+3.53e-02

Pos logit contributions
 help+0.021
f+0.017
 Madd+0.017
 support+0.016
sy+0.016
Neg logit contributions
 appearance-0.031
 ours-0.027
 yourselves-0.026
 yours-0.025
cks-0.024
Token.
Token ID13
Feature activation+1.88e-02

Pos logit contributions
 help+0.048
 Madd+0.044
f+0.040
 happily+0.037
apy+0.036
Neg logit contributions
 appearance-0.077
 ours-0.067
 yourselves-0.065
 yours-0.064
cks-0.063
Token There
Token ID1318
Feature activation+8.89e-03

Pos logit contributions
 help+0.035
 Madd+0.032
f+0.032
 support+0.029
apy+0.029
Neg logit contributions
 appearance-0.030
 ours-0.024
 yourselves-0.024
 yours-0.023
 Sharing-0.022
Token was
Token ID373
Feature activation-4.92e-03

Pos logit contributions
 help+0.013
 Madd+0.012
f+0.011
 lear+0.010
apy+0.010
Neg logit contributions
 appearance-0.017
 ours-0.014
 yourselves-0.014
 yours-0.014
 yourself-0.013
Token no
Token ID645
Feature activation-2.93e-02

Pos logit contributions
 appearance+0.012
 ours+0.010
cks+0.010
 yourselves+0.010
 yours+0.010
Neg logit contributions
 help-0.006
 Madd-0.005
 support-0.004
apy-0.004
 gard-0.004
Token fight
Token ID1907
Feature activation+3.12e-03

Pos logit contributions
 appearance+0.059
 ours+0.050
 yourselves+0.049
 yours+0.048
cks+0.045
Neg logit contributions
 help-0.044
 Madd-0.041
f-0.038
apy-0.035
 support-0.034
Token Tom
Token ID4186
Feature activation-2.32e-02

Pos logit contributions
 help+0.036
 Madd+0.034
f+0.033
sy+0.031
 support+0.031
Neg logit contributions
 appearance-0.029
 ours-0.024
 yourselves-0.024
 yours-0.023
 stealing-0.021
Token is
Token ID318
Feature activation-3.63e-02

Pos logit contributions
 appearance+0.031
 ours+0.030
 yours+0.028
 yourselves+0.027
weet+0.024
Neg logit contributions
 help-0.050
f-0.043
 Madd-0.041
 happily-0.041
 support-0.041
Token curious
Token ID11040
Feature activation-2.46e-02

Pos logit contributions
 appearance+0.078
 yourselves+0.065
 ours+0.064
 habitats+0.063
 yours+0.062
Neg logit contributions
 help-0.053
 Madd-0.044
 happily-0.041
f-0.040
 support-0.040
Token.
Token ID13
Feature activation-2.05e-02

Pos logit contributions
 appearance+0.065
 ours+0.065
 yourselves+0.064
cks+0.062
 yours+0.060
Neg logit contributions
 help-0.022
 happily-0.012
f-0.011
 days-0.010
 Madd-0.010
Token He
Token ID679
Feature activation-1.76e-02

Pos logit contributions
 appearance+0.035
 ours+0.029
 yours+0.028
 yourselves+0.028
cks+0.027
Neg logit contributions
 help-0.038
 Madd-0.035
f-0.032
apy-0.031
 support-0.031
Token wants
Token ID3382
Feature activation-2.21e-02

Pos logit contributions
 appearance+0.026
 ours+0.025
 yourselves+0.024
 yours+0.024
cks+0.021
Neg logit contributions
 help-0.035
f-0.029
 happily-0.029
 support-0.028
 brave-0.028
Token to
Token ID284
Feature activation-1.76e-02

Pos logit contributions
 appearance+0.050
 ours+0.045
 yourselves+0.044
cks+0.044
ulum+0.043
Neg logit contributions
 help-0.030
 Madd-0.023
 support-0.021
f-0.020
Sure-0.017
Token talk
Token ID1561
Feature activation+7.05e-03

Pos logit contributions
 appearance+0.037
 ours+0.032
 yourselves+0.032
 yours+0.031
cks+0.030
Neg logit contributions
 help-0.026
 Madd-0.022
f-0.020
 support-0.020
apy-0.018
Token to
Token ID284
Feature activation-1.69e-02

Pos logit contributions
 help+0.009
f+0.007
 support+0.006
 Madd+0.006
 happily+0.006
Neg logit contributions
 appearance-0.016
cks-0.015
 ours-0.015
 yourselves-0.014
opl-0.014
Token the
Token ID262
Feature activation-8.19e-03

Pos logit contributions
 appearance+0.041
 ours+0.037
 yourselves+0.036
cks+0.036
 yours+0.035
Neg logit contributions
 help-0.020
 Madd-0.014
f-0.013
 support-0.013
 happily-0.011
Token bald
Token ID28273
Feature activation-1.25e-02

Pos logit contributions
 appearance+0.020
cks+0.019
 ours+0.019
 yours+0.019
 yourselves+0.018
Neg logit contributions
 help-0.009
 gard-0.006
 support-0.006
 band-0.006
f-0.005
Token jealous
Token ID19354
Feature activation-1.64e-02

Pos logit contributions
 help+0.053
 Madd+0.050
f+0.048
apy+0.046
aroo+0.045
Neg logit contributions
 appearance-0.045
 ours-0.037
 yourselves-0.034
 yours-0.033
 stealing-0.033
Token and
Token ID290
Feature activation+4.86e-03

Pos logit contributions
 appearance+0.035
 ours+0.030
 yourselves+0.030
 yours+0.029
cks+0.028
Neg logit contributions
 help-0.023
 Madd-0.021
f-0.019
 support-0.018
apy-0.017
Token said
Token ID531
Feature activation+6.44e-03

Pos logit contributions
 help+0.007
 Madd+0.007
f+0.006
apy+0.006
 support+0.006
Neg logit contributions
 appearance-0.010
 ours-0.008
 yourselves-0.008
 yours-0.008
cks-0.008
Token,
Token ID11
Feature activation+9.88e-03

Pos logit contributions
 help+0.009
 Madd+0.008
f+0.008
sy+0.008
 support+0.007
Neg logit contributions
 appearance-0.013
 yours-0.011
 yourselves-0.010
 ours-0.010
 yourself-0.010
Token "
Token ID366
Feature activation+1.64e-02

Pos logit contributions
 help+0.017
 Madd+0.016
f+0.015
apy+0.014
 support+0.014
Neg logit contributions
 appearance-0.016
 ours-0.013
 yourselves-0.013
 yours-0.013
 yourself-0.012
TokenCan
Token ID6090
Feature activation-2.59e-02

Pos logit contributions
 help+0.025
 Madd+0.024
f+0.022
Sure+0.021
 support+0.020
Neg logit contributions
 appearance-0.032
 ours-0.028
 yourselves-0.026
cks-0.026
 yours-0.026
Token I
Token ID314
Feature activation+8.01e-04

Pos logit contributions
 appearance+0.036
 ours+0.028
cks+0.028
 yourselves+0.027
 habitats+0.026
Neg logit contributions
 help-0.058
 Madd-0.052
f-0.049
 support-0.048
Sure-0.047
Token grab
Token ID5552
Feature activation-3.75e-03

Pos logit contributions
 help+0.001
 Madd+0.001
f+0.001
apy+0.001
 support+0.001
Neg logit contributions
 appearance-0.002
 ours-0.001
 yourselves-0.001
 yours-0.001
cks-0.001
Token it
Token ID340
Feature activation+2.63e-03

Pos logit contributions
 appearance+0.007
 ours+0.006
 yourselves+0.006
 yours+0.005
cks+0.005
Neg logit contributions
 help-0.006
 Madd-0.006
f-0.005
 support-0.005
apy-0.005
Token too
Token ID1165
Feature activation+2.97e-02

Pos logit contributions
 help+0.006
 Madd+0.005
f+0.005
 support+0.005
apy+0.005
Neg logit contributions
 appearance-0.004
 yourselves-0.003
 ours-0.003
cks-0.003
 habitats-0.003
Token?"
Token ID1701
Feature activation-2.66e-02

Pos logit contributions
 help+0.059
 Madd+0.052
 happily+0.049
Sure+0.048
f+0.047
Neg logit contributions
 appearance-0.048
cks-0.042
 habitats-0.039
ellery-0.038
 yourselves-0.035
Token they
Token ID484
Feature activation+2.76e-03

Pos logit contributions
 appearance+0.002
 ours+0.002
 yourselves+0.002
 yours+0.002
cks+0.002
Neg logit contributions
 help-0.001
 Madd-0.001
f-0.001
 support-0.001
apy-0.001
Token were
Token ID547
Feature activation-3.63e-03

Pos logit contributions
 help+0.004
 Madd+0.004
f+0.003
apy+0.003
 support+0.003
Neg logit contributions
 appearance-0.006
 ours-0.005
 yourselves-0.005
 yours-0.005
cks-0.004
Token proud
Token ID6613
Feature activation+1.22e-02

Pos logit contributions
 appearance+0.007
 ours+0.006
 yours+0.006
 yourselves+0.006
cks+0.006
Neg logit contributions
 help-0.005
 Madd-0.005
f-0.005
apy-0.005
 support-0.004
Token of
Token ID286
Feature activation-2.06e-03

Pos logit contributions
 help+0.013
 Madd+0.011
f+0.009
 support+0.008
apy+0.008
Neg logit contributions
 appearance-0.031
 ours-0.027
 yourselves-0.027
 yours-0.026
cks-0.026
Token her
Token ID607
Feature activation-8.20e-03

Pos logit contributions
 appearance+0.004
 ours+0.004
 yourselves+0.004
 yours+0.004
cks+0.003
Neg logit contributions
 help-0.003
 Madd-0.003
f-0.002
apy-0.002
 support-0.002
Token and
Token ID290
Feature activation-1.12e-02

Pos logit contributions
 appearance+0.014
 ours+0.012
 yours+0.011
 yourselves+0.011
 habitats+0.010
Neg logit contributions
 Madd-0.015
 help-0.013
apy-0.012
aroo-0.012
sy-0.012
Token loved
Token ID6151
Feature activation-7.00e-03

Pos logit contributions
 appearance+0.024
 yourselves+0.021
 ours+0.021
 yours+0.020
cks+0.020
Neg logit contributions
 help-0.015
 Madd-0.014
f-0.012
 support-0.012
apy-0.012
Token her
Token ID607
Feature activation-8.34e-03

Pos logit contributions
 appearance+0.015
 yourselves+0.013
 ours+0.012
cks+0.012
ulum+0.012
Neg logit contributions
 help-0.010
 Madd-0.009
f-0.008
 happily-0.008
 support-0.008
Token very
Token ID845
Feature activation-1.95e-04

Pos logit contributions
 appearance+0.015
 ours+0.013
 yours+0.013
 yourselves+0.012
 habitats+0.012
Neg logit contributions
 Madd-0.013
 help-0.013
apy-0.012
f-0.012
sy-0.011
Token much
Token ID881
Feature activation+6.51e-03

Pos logit contributions
 appearance+0.000
 ours+0.000
 yourselves+0.000
 yours+0.000
cks+0.000
Neg logit contributions
 help-0.000
 Madd-0.000
f-0.000
apy-0.000
 support-0.000
Token.
Token ID13
Feature activation-8.33e-03

Pos logit contributions
 help+0.011
 Madd+0.010
f+0.009
apy+0.009
 support+0.008
Neg logit contributions
 appearance-0.012
 ours-0.010
 yourselves-0.010
 yours-0.010
cks-0.009
Token him
Token ID683
Feature activation-8.37e-03

Pos logit contributions
 appearance+0.044
 ours+0.036
 yourselves+0.035
cks+0.034
 yours+0.034
Neg logit contributions
 help-0.050
 Madd-0.045
f-0.042
 support-0.041
apy-0.040
Token to
Token ID284
Feature activation-1.32e-02

Pos logit contributions
 appearance+0.013
 yours+0.011
 ours+0.011
 stealing+0.010
 yourselves+0.010
Neg logit contributions
 help-0.014
 Madd-0.014
f-0.013
apy-0.013
sy-0.012
Token close
Token ID1969
Feature activation+7.63e-03

Pos logit contributions
 appearance+0.029
 ours+0.025
 yourselves+0.025
 yours+0.024
 yourself+0.023
Neg logit contributions
 help-0.016
 Madd-0.015
f-0.014
apy-0.013
 support-0.012
Token the
Token ID262
Feature activation-6.69e-03

Pos logit contributions
 help+0.010
 Madd+0.008
 support+0.007
f+0.007
apy+0.007
Neg logit contributions
 appearance-0.017
 ours-0.015
 habitats-0.014
 Sharing-0.014
cks-0.014
Token door
Token ID3420
Feature activation-1.04e-02

Pos logit contributions
 appearance+0.012
cks+0.010
 ours+0.010
 yourselves+0.010
 yours+0.010
Neg logit contributions
 help-0.012
 Madd-0.010
 support-0.009
 gard-0.009
f-0.009
Token before
Token ID878
Feature activation-4.91e-03

Pos logit contributions
 appearance+0.021
 ours+0.018
 yourselves+0.018
 yours+0.017
cks+0.017
Neg logit contributions
 help-0.015
 Madd-0.014
f-0.013
apy-0.012
 support-0.012
Token he
Token ID339
Feature activation-2.29e-02

Pos logit contributions
 appearance+0.011
 ours+0.009
 yourselves+0.009
 yours+0.009
cks+0.009
Neg logit contributions
 help-0.007
 Madd-0.006
apy-0.005
f-0.005
 support-0.005
Token went
Token ID1816
Feature activation+2.25e-02

Pos logit contributions
 appearance+0.039
 reads+0.032
 doesn+0.032
 wants+0.032
 writes+0.031
Neg logit contributions
 Madd-0.039
 help-0.033
apy-0.032
wards-0.032
taker-0.031
Token to
Token ID284
Feature activation-1.05e-02

Pos logit contributions
 help+0.034
 Madd+0.033
f+0.030
apy+0.029
 support+0.028
Neg logit contributions
 appearance-0.043
 ours-0.037
 yourselves-0.036
 yours-0.035
 yourself-0.033
Token bed
Token ID3996
Feature activation-1.29e-03

Pos logit contributions
 appearance+0.021
 ours+0.019
 yourselves+0.018
 yours+0.018
 yourself+0.017
Neg logit contributions
 Madd-0.014
 help-0.014
f-0.013
apy-0.012
sy-0.012
Token.
Token ID13
Feature activation+1.21e-03

Pos logit contributions
 appearance+0.003
 ours+0.002
 yourselves+0.002
cks+0.002
 yours+0.002
Neg logit contributions
 help-0.002
 Madd-0.001
f-0.001
 happily-0.001
 support-0.001
Token not
Token ID407
Feature activation-1.71e-02

Pos logit contributions
 appearance+0.015
 yourselves+0.013
cks+0.013
 habitats+0.013
ind+0.013
Neg logit contributions
 help-0.010
 Madd-0.008
 support-0.007
 happily-0.007
 brave-0.006
Token normal
Token ID3487
Feature activation+2.13e-02

Pos logit contributions
 appearance+0.036
 yourselves+0.031
 habitats+0.030
cks+0.030
ulum+0.029
Neg logit contributions
 help-0.026
 Madd-0.021
 support-0.020
f-0.019
 happily-0.019
Token.
Token ID13
Feature activation+6.27e-03

Pos logit contributions
 help+0.029
 Madd+0.023
 support+0.021
f+0.020
 happily+0.020
Neg logit contributions
 appearance-0.045
cks-0.042
 ours-0.042
 yourselves-0.042
ulum-0.040
Token
Token ID220
Feature activation-1.60e-02

Pos logit contributions
 help+0.010
 Madd+0.009
f+0.009
apy+0.008
 support+0.008
Neg logit contributions
 appearance-0.012
 yourselves-0.010
 ours-0.010
 yours-0.009
cks-0.009
Token
Token ID198
Feature activation+7.76e-03

Pos logit contributions
 appearance+0.036
 ours+0.030
 yourselves+0.029
 yours+0.029
 habitats+0.028
Neg logit contributions
 help-0.021
 Madd-0.021
f-0.018
apy-0.017
 happily-0.016
TokenJ
Token ID41
Feature activation-6.01e-03

Pos logit contributions
 help+0.013
 Madd+0.012
f+0.011
apy+0.010
 support+0.010
Neg logit contributions
 appearance-0.015
 ours-0.012
 yourselves-0.012
 yours-0.012
cks-0.011
Tokenill
Token ID359
Feature activation-2.24e-03

Pos logit contributions
 appearance+0.011
 ours+0.009
 yourselves+0.009
 yours+0.008
cks+0.008
Neg logit contributions
 help-0.010
 Madd-0.010
f-0.009
apy-0.009
 support-0.008
Token said
Token ID531
Feature activation+5.26e-03

Pos logit contributions
 appearance+0.005
 ours+0.004
 yourselves+0.004
 yours+0.004
cks+0.004
Neg logit contributions
 help-0.003
 Madd-0.003
f-0.003
apy-0.003
 support-0.003
Token,
Token ID11
Feature activation+4.05e-03

Pos logit contributions
 help+0.008
 Madd+0.008
f+0.007
sy+0.007
 support+0.007
Neg logit contributions
 appearance-0.009
 yours-0.008
 stealing-0.008
 yourselves-0.008
 ours-0.007
Token "
Token ID366
Feature activation+1.36e-02

Pos logit contributions
 help+0.007
 Madd+0.007
f+0.006
apy+0.006
 support+0.006
Neg logit contributions
 appearance-0.007
 yourselves-0.005
 ours-0.005
 yours-0.005
 Sharing-0.005
TokenI
Token ID40
Feature activation+2.46e-02

Pos logit contributions
 Madd+0.018
 help+0.017
Sure+0.017
f+0.016
B+0.015
Neg logit contributions
 appearance-0.030
 ours-0.027
 yours-0.025
cks-0.025
arter-0.023
Token,
Token ID11
Feature activation-9.62e-03

Pos logit contributions
 appearance+0.027
 ours+0.024
 yourselves+0.024
 habitats+0.024
 yours+0.023
Neg logit contributions
 help-0.009
 Madd-0.006
f-0.005
 support-0.005
 happily-0.004
Token Mia
Token ID32189
Feature activation-5.40e-03

Pos logit contributions
 appearance+0.015
 ours+0.013
 yourselves+0.013
 yours+0.013
 yourself+0.011
Neg logit contributions
 Madd-0.017
 help-0.017
f-0.016
sy-0.015
apy-0.015
Token,
Token ID11
Feature activation-9.01e-03

Pos logit contributions
 appearance+0.013
 ours+0.012
 yourselves+0.011
 yours+0.011
cks+0.011
Neg logit contributions
 help-0.006
 Madd-0.005
f-0.005
 support-0.004
apy-0.004
Token what
Token ID644
Feature activation-2.43e-02

Pos logit contributions
 appearance+0.015
 ours+0.013
 yourselves+0.013
 yours+0.012
 yourself+0.011
Neg logit contributions
 Madd-0.015
 help-0.015
f-0.015
apy-0.014
sy-0.014
Token's
Token ID338
Feature activation-3.18e-02

Pos logit contributions
 appearance+0.049
 yourselves+0.041
 ours+0.040
 habitats+0.039
cks+0.038
Neg logit contributions
 help-0.041
f-0.032
 Madd-0.032
 support-0.031
 believe-0.029
Token wrong
Token ID2642
Feature activation-6.55e-02

Pos logit contributions
 appearance+0.044
 yourselves+0.033
 ours+0.032
 habitats+0.030
 yours+0.030
Neg logit contributions
 help-0.070
 Madd-0.063
f-0.060
 support-0.059
apy-0.058
Token?"
Token ID1701
Feature activation-5.10e-02

Pos logit contributions
 appearance+0.083
 ours+0.060
 yourselves+0.057
 habitats+0.056
cks+0.055
Neg logit contributions
 help-0.154
 Madd-0.142
f-0.132
 support-0.129
apy-0.126
Token Mom
Token ID11254
Feature activation+1.75e-02

Pos logit contributions
 appearance+0.106
 yours+0.090
 ours+0.090
opl+0.084
ellery+0.083
Neg logit contributions
 help-0.082
 Madd-0.073
apy-0.065
f-0.060
 support-0.060
Token asks
Token ID7893
Feature activation-2.65e-02

Pos logit contributions
 help+0.033
 Madd+0.026
f+0.025
 cheer+0.025
 support+0.025
Neg logit contributions
 appearance-0.031
 ours-0.029
 yours-0.028
 habitats-0.026
 yourselves-0.025
Token.
Token ID13
Feature activation-2.08e-02

Pos logit contributions
 appearance+0.042
 yourselves+0.034
 ours+0.032
cks+0.031
 yourself+0.031
Neg logit contributions
 help-0.047
 Madd-0.047
apy-0.043
f-0.042
sy-0.042
Token She
Token ID1375
Feature activation+3.09e-03

Pos logit contributions
 appearance+0.045
 ours+0.040
 yours+0.039
 yourselves+0.037
cks+0.037
Neg logit contributions
 help-0.030
 Madd-0.027
 gard-0.023
f-0.023
apy-0.023
Token at
Token ID379
Feature activation+3.36e-02

Pos logit contributions
 help+0.020
 Madd+0.018
 support+0.016
f+0.016
apy+0.015
Neg logit contributions
 appearance-0.018
 ours-0.015
cks-0.015
 yourselves-0.015
ulum-0.014
Token first
Token ID717
Feature activation-2.31e-02

Pos logit contributions
 help+0.045
 Madd+0.042
f+0.038
apy+0.035
 support+0.035
Neg logit contributions
 appearance-0.072
 ours-0.062
 yourselves-0.062
 yours-0.060
cks-0.058
Token,
Token ID11
Feature activation+1.14e-02

Pos logit contributions
 appearance+0.058
 ours+0.054
cks+0.053
 habitats+0.052
 yourselves+0.052
Neg logit contributions
 help-0.023
 Madd-0.017
 support-0.014
f-0.014
apy-0.012
Token it
Token ID340
Feature activation+3.98e-03

Pos logit contributions
 help+0.022
 Madd+0.021
f+0.019
 support+0.019
apy+0.018
Neg logit contributions
 appearance-0.018
 ours-0.014
 yourselves-0.014
 yours-0.013
cks-0.013
Token's
Token ID338
Feature activation-2.03e-02

Pos logit contributions
 help+0.008
 Madd+0.007
f+0.006
 support+0.006
sy+0.006
Neg logit contributions
 appearance-0.006
 ours-0.005
 yourselves-0.005
 yours-0.005
cks-0.005
Token hardest
Token ID17612
Feature activation-5.94e-02

Pos logit contributions
 appearance+0.045
 yourselves+0.039
 habitats+0.039
ulum+0.038
asks+0.038
Neg logit contributions
 help-0.026
 support-0.020
 happily-0.019
 Madd-0.019
 brave-0.018
Token to
Token ID284
Feature activation-4.31e-04

Pos logit contributions
 appearance+0.128
ulum+0.117
oma+0.116
cks+0.115
 ours+0.115
Neg logit contributions
 help-0.080
 Madd-0.057
 support-0.056
f-0.050
 happily-0.049
Token take
Token ID1011
Feature activation-8.01e-03

Pos logit contributions
 appearance+0.001
 ours+0.001
 yourselves+0.001
 yours+0.001
cks+0.001
Neg logit contributions
 help-0.001
 Madd-0.001
f-0.000
 support-0.000
apy-0.000
Token the
Token ID262
Feature activation+7.40e-03

Pos logit contributions
 appearance+0.017
 ours+0.014
 yourselves+0.014
 yours+0.014
cks+0.013
Neg logit contributions
 help-0.011
 Madd-0.010
f-0.009
apy-0.009
 support-0.009
Token first
Token ID717
Feature activation-2.26e-02

Pos logit contributions
 Madd+0.011
apy+0.010
sy+0.009
f+0.009
Carl+0.009
Neg logit contributions
 appearance-0.015
 ours-0.013
 yourselves-0.012
 yours-0.012
 yourself-0.012
Token step
Token ID2239
Feature activation-4.73e-03

Pos logit contributions
 appearance+0.047
 ours+0.040
 yourselves+0.039
 yours+0.039
cks+0.037
Neg logit contributions
 help-0.031
 Madd-0.030
f-0.026
apy-0.025
 support-0.025
Token but
Token ID475
Feature activation-3.08e-03

Pos logit contributions
 help+0.001
 Madd+0.001
 support+0.001
 happily+0.001
 together+0.001
Neg logit contributions
 appearance-0.002
cks-0.002
ulum-0.002
 yourselves-0.002
ellery-0.002
Token it
Token ID340
Feature activation-3.88e-03

Pos logit contributions
 appearance+0.006
 ours+0.005
 yourselves+0.005
 yours+0.005
 habitats+0.005
Neg logit contributions
 help-0.005
 Madd-0.004
f-0.004
apy-0.004
 support-0.004
Token did
Token ID750
Feature activation+6.69e-03

Pos logit contributions
 appearance+0.006
 ours+0.005
 doesn+0.005
 yourselves+0.005
 yours+0.005
Neg logit contributions
 help-0.007
 Madd-0.007
f-0.006
apy-0.006
 oblig-0.006
Token not
Token ID407
Feature activation-2.27e-02

Pos logit contributions
 help+0.015
 support+0.012
f+0.012
 happily+0.012
 Madd+0.012
Neg logit contributions
 appearance-0.009
 ours-0.008
 habitats-0.008
 yourselves-0.007
ulum-0.007
Token say
Token ID910
Feature activation-5.21e-05

Pos logit contributions
 appearance+0.046
 ours+0.039
 yourselves+0.039
 yours+0.038
cks+0.036
Neg logit contributions
 help-0.033
 Madd-0.031
f-0.028
apy-0.027
 support-0.026
Token anything
Token ID1997
Feature activation-7.01e-02

Pos logit contributions
 appearance+0.000
 yourselves+0.000
 ours+0.000
 habitats+0.000
ulum+0.000
Neg logit contributions
 help-0.000
 Madd-0.000
 support-0.000
f-0.000
apy-0.000
Token.
Token ID13
Feature activation-1.91e-03

Pos logit contributions
 appearance+0.147
 yourselves+0.128
 ours+0.127
ulum+0.123
cks+0.123
Neg logit contributions
 help-0.103
f-0.083
 Madd-0.083
 support-0.077
 happily-0.074
Token It
Token ID632
Feature activation+2.27e-02

Pos logit contributions
 appearance+0.003
 ours+0.003
 yourselves+0.003
 yours+0.003
cks+0.003
Neg logit contributions
 help-0.003
 Madd-0.003
f-0.003
 support-0.003
apy-0.003
Token just
Token ID655
Feature activation+5.66e-03

Pos logit contributions
 help+0.039
 Madd+0.038
f+0.035
apy+0.034
 support+0.032
Neg logit contributions
 appearance-0.038
 doesn-0.029
cks-0.029
 ours-0.029
 yourselves-0.029
Token kept
Token ID4030
Feature activation-1.41e-02

Pos logit contributions
 help+0.009
 Madd+0.008
f+0.008
apy+0.007
 support+0.007
Neg logit contributions
 appearance-0.011
 ours-0.009
 yourselves-0.009
 yours-0.009
cks-0.008
Token recording
Token ID8296
Feature activation+1.07e-03

Pos logit contributions
 appearance+0.025
 ours+0.021
 yourselves+0.021
 yours+0.020
 habitats+0.020
Neg logit contributions
 help-0.025
 Madd-0.021
f-0.020
 support-0.020
 happily-0.020
Token dad
Token ID9955
Feature activation+7.75e-04

Pos logit contributions
 appearance+0.054
 ours+0.043
 yourselves+0.043
 habitats+0.042
 yours+0.041
Neg logit contributions
 help-0.030
 Madd-0.030
f-0.030
apy-0.027
 oblig-0.026
Token told
Token ID1297
Feature activation-1.69e-02

Pos logit contributions
 help+0.001
 Madd+0.001
f+0.001
 support+0.001
 happily+0.001
Neg logit contributions
 appearance-0.002
 yourselves-0.001
 ours-0.001
 yours-0.001
cks-0.001
Token her
Token ID607
Feature activation-7.89e-03

Pos logit contributions
 appearance+0.036
 ours+0.032
 yourselves+0.031
cks+0.030
 yours+0.030
Neg logit contributions
 help-0.024
 Madd-0.021
f-0.019
 support-0.018
apy-0.017
Token a
Token ID257
Feature activation+1.62e-02

Pos logit contributions
 appearance+0.015
 ours+0.013
 yourselves+0.013
 yours+0.012
cks+0.012
Neg logit contributions
 help-0.013
 Madd-0.012
f-0.011
apy-0.010
 support-0.010
Token silly
Token ID14397
Feature activation+5.23e-04

Pos logit contributions
 help+0.033
 Madd+0.032
f+0.030
apy+0.030
 support+0.028
Neg logit contributions
 appearance-0.022
 ours-0.017
 yours-0.016
 yourselves-0.016
 stealing-0.014
Token story
Token ID1621
Feature activation-6.11e-02

Pos logit contributions
 help+0.001
 Madd+0.001
f+0.001
apy+0.001
 support+0.001
Neg logit contributions
 appearance-0.001
 ours-0.001
 yourselves-0.001
 yours-0.001
cks-0.001
Token about
Token ID546
Feature activation-4.31e-02

Pos logit contributions
 appearance+0.100
 yourselves+0.083
 ours+0.080
cks+0.075
 yours+0.073
Neg logit contributions
 help-0.121
f-0.106
 Madd-0.105
 support-0.098
apy-0.095
Token a
Token ID257
Feature activation+1.56e-02

Pos logit contributions
 appearance+0.101
 ours+0.089
 yourselves+0.088
cks+0.088
ulum+0.087
Neg logit contributions
 help-0.053
 Madd-0.043
 support-0.036
f-0.032
 happily-0.031
Token friendly
Token ID8030
Feature activation-2.76e-02

Pos logit contributions
 help+0.026
 Madd+0.024
f+0.022
apy+0.021
 support+0.021
Neg logit contributions
 appearance-0.029
 ours-0.024
 yourselves-0.024
 yours-0.023
cks-0.022
Token monster
Token ID9234
Feature activation-1.94e-02

Pos logit contributions
 appearance+0.058
 ours+0.051
 yourselves+0.051
cks+0.049
 yours+0.049
Neg logit contributions
 help-0.039
 Madd-0.035
f-0.031
 support-0.030
apy-0.028
Token who
Token ID508
Feature activation+3.63e-03

Pos logit contributions
 appearance+0.040
 ours+0.035
 yourselves+0.034
 yours+0.032
cks+0.032
Neg logit contributions
 help-0.029
 Madd-0.025
f-0.024
 support-0.022
 happily-0.021
Token smiled
Token ID13541
Feature activation+3.32e-03

Pos logit contributions
 appearance+0.019
 ours+0.016
 yourselves+0.016
 yours+0.015
cks+0.015
Neg logit contributions
 help-0.015
 Madd-0.013
f-0.012
apy-0.012
 support-0.012
Token and
Token ID290
Feature activation-4.96e-03

Pos logit contributions
 help+0.005
 Madd+0.005
f+0.004
 happily+0.004
 support+0.004
Neg logit contributions
 appearance-0.007
 ours-0.006
 yourselves-0.006
 yours-0.006
cks-0.005
Token said
Token ID531
Feature activation+1.62e-03

Pos logit contributions
 appearance+0.008
 ours+0.007
 yourselves+0.007
cks+0.007
 yours+0.007
Neg logit contributions
 help-0.008
f-0.007
 Madd-0.007
 support-0.007
sy-0.007
Token,
Token ID11
Feature activation+2.50e-04

Pos logit contributions
 help+0.002
 Madd+0.002
f+0.002
 support+0.002
sy+0.002
Neg logit contributions
 appearance-0.003
 yourselves-0.003
 ours-0.003
 yours-0.003
 yourself-0.003
Token "
Token ID366
Feature activation+9.00e-03

Pos logit contributions
 help+0.000
 Madd+0.000
f+0.000
apy+0.000
 support+0.000
Neg logit contributions
 appearance-0.000
 ours-0.000
 yourselves-0.000
 yours-0.000
cks-0.000
TokenHello
Token ID15496
Feature activation-6.27e-02

Pos logit contributions
 help+0.015
 Madd+0.014
f+0.013
apy+0.012
 support+0.012
Neg logit contributions
 appearance-0.016
 yourselves-0.014
 ours-0.014
 yours-0.013
cks-0.013
Token,
Token ID11
Feature activation+1.01e-03

Pos logit contributions
 appearance+0.149
 ours+0.129
 yourselves+0.128
cks+0.126
 yours+0.124
Neg logit contributions
 help-0.073
 Madd-0.065
f-0.056
 support-0.050
apy-0.049
Token kids
Token ID3988
Feature activation+1.08e-02

Pos logit contributions
 help+0.002
 Madd+0.001
 support+0.001
f+0.001
aroo+0.001
Neg logit contributions
 appearance-0.002
cks-0.002
 habitats-0.002
 yourselves-0.002
ulum-0.002
Token.
Token ID13
Feature activation-6.28e-03

Pos logit contributions
 help+0.017
 Madd+0.015
f+0.014
 support+0.013
apy+0.013
Neg logit contributions
 appearance-0.021
 ours-0.018
 yourselves-0.018
 yours-0.017
cks-0.017
Token I
Token ID314
Feature activation-6.06e-03

Pos logit contributions
 appearance+0.008
 ours+0.006
 yourselves+0.006
cks+0.006
 yours+0.005
Neg logit contributions
 help-0.014
 Madd-0.013
f-0.012
 support-0.012
apy-0.012
Token see
Token ID766
Feature activation-2.79e-02

Pos logit contributions
 appearance+0.013
 ours+0.011
 yourselves+0.011
 yours+0.011
cks+0.011
Neg logit contributions
 help-0.008
 Madd-0.007
f-0.006
apy-0.006
 support-0.006