TOP ACTIVATIONS
MAX = 0.124

made a fire and roasted marsh m allows . They gave
She ate popcorn and roasted marsh m allows .
words and the pictures . Before he knew it , he
and knocked something loose . Before he knew it , something
move closer to her . Before she knew it , the
around the fireplace and roasted marsh m allows . Suddenly ,
. They pretended to roast marsh m allows and sing songs
more and more space . Before long , the entire box
things in the lab . Before Molly knew it , her
for someone to come . Before long , lots of people
They laughed and played . Before dark , they returned back
road block 's back . Before he knew it , he
and combined everything together . Before long , Daisy had created
him tight while they roasted marsh m allows over the camp
grew bigger and bigger . Before long , the bush was
a mouse . Before the movie starts , they
they pointed . Before they knew it , they
flowers to other people . Before she left , she gave
the drive back home . Before they reached the house ,
fireplace so they could roast marsh m allows . They were

MIN ACTIVATIONS
MIN = -0.113

her and said , " Ell ie , I found a
her and said , " Ell ie , I need your
Her friend said , " Ell ie , come quick !
It was a big j igsaw puzzle and she wanted to
recommendation and picked the right av oc ados .
girl would make a great fer ret parent .
every day to supply the av oc ados . And the
would take out a j igsaw puzzle and try to solve
easily . " Ell ie , you 're too
said to her , " Ell ie , singing is good
, and all the other av oc ados were very happy
box and saw a j igsaw puzzle inside . He paused
fish said , " Hi Gab ri ella , would you
so happy and brought the av oc ados home to share
the tree knew that the av oc ados would be back
The young girl took the fer ret home and was very
Everyday he would take the av oc ados and scatter them
said , " I love av oc ados ! They are
“ Ell ie , do you see
was a little girl called Lay la . She was three

INTERVAL 0.065 - 0.124
CONTAINS 0.146%

was so happy . But after a while , it realized
playing in the park . After they had been playing for
that everything was white and cold . Snow had fallen during
got the missing piece . When she came back , Lily
fun to watch the long noodles boil in the pot .

INTERVAL 0.005 - 0.065
CONTAINS 36.746%

filled with lots of y ummy food ! B
for the rest of the summer . The sun was hot
clouds . The helpers worked hard and used lots of bright
the storm would pass . But the storm didn
all agreed that it was time to go home .

INTERVAL -0.054 - 0.005
CONTAINS 62.683%

cat . So , Tim went out to look for Sam
. He decided to take it home . When
hill . I met a new friend , Tom . He
any nightmare that tried to enter her room would be caught
looking duck . The duck said , "

INTERVAL -0.113 - -0.054
CONTAINS 0.425%

. Lucy said , " It 's time to go home
take a nap . You look tired . And maybe you
girl and said " That puppy looks so helpful !" The
He had a j olly smile and was always singing .
and said , " Thank you , Lily ! I will
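
The four INTERVAL ranges above appear to be equal-width bins spanning MIN to MAX (each is about 0.059 wide), and the CONTAINS percentages sum to 100%. Below is a minimal sketch of how such shares could be computed, assuming equal-width binning over a flat list of per-token activations. The function name `interval_shares` and the sample values are hypothetical and are not taken from the tool that produced this page.

```python
def interval_shares(activations, n_bins=4):
    """Split [min, max] into n_bins equal-width intervals and return
    a list of ((low, high), percent_of_tokens), ordered from lowest to highest."""
    if not activations:
        raise ValueError("activations must be non-empty")
    lo, hi = min(activations), max(activations)
    if hi == lo:
        # Degenerate case: every activation is identical.
        return [((lo, hi), 100.0)]
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for a in activations:
        # Clamp so the maximum value falls in the last bin.
        idx = min(int((a - lo) / width), n_bins - 1)
        counts[idx] += 1
    total = len(activations)
    return [
        ((lo + i * width, lo + (i + 1) * width), 100.0 * c / total)
        for i, c in enumerate(counts)
    ]


# Hypothetical sample values, chosen only to span the MIN/MAX shown above.
sample = [-0.113, -0.05, 0.0, 0.01, 0.124]
for (low, high), pct in reversed(interval_shares(sample)):
    print(f"INTERVAL {low:.3f} - {high:.3f}  CONTAINS {pct:.3f}%")
```

The loop prints the highest interval first to mirror the ordering used above. Whether the original tool bins this way is an assumption.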
Token made
Token ID925
Feature activation+5.62e-02

Pos logit contributions
 elbows+0.052
tails+0.046
 poop+0.045
w+0.044
mons+0.043
Neg logit contributions
Em-0.055
 newfound-0.055
igsaw-0.054
lad-0.054
oke-0.053
Token a
Token ID257
Feature activation+2.11e-02

Pos logit contributions
tails+0.090
 elbows+0.089
ms+0.081
w+0.078
bat+0.073
Neg logit contributions
igsaw-0.092
 pleasure-0.082
 newfound-0.079
 lifelong-0.073
 brand-0.072
Token fire
Token ID2046
Feature activation+3.80e-02

Pos logit contributions
 elbows+0.030
tails+0.028
 poop+0.028
 covered+0.028
 marks+0.028
Neg logit contributions
igsaw-0.039
lad-0.033
 newfound-0.032
Em-0.032
 importantly-0.031
Token and
Token ID290
Feature activation+8.57e-03

Pos logit contributions
 elbows+0.060
w+0.052
tails+0.052
 dressed+0.050
ups+0.048
Neg logit contributions
igsaw-0.061
 hero-0.056
lad-0.052
oke-0.050
 pleasure-0.049
Token roasted
Token ID30490
Feature activation+7.05e-02

Pos logit contributions
 elbows+0.010
tails+0.010
w+0.009
 marks+0.008
 poop+0.008
Neg logit contributions
igsaw-0.016
 importantly-0.015
 newfound-0.015
Em-0.015
 million-0.014
Token marsh
Token ID22397
Feature activation+1.05e-01

Pos logit contributions
 elbows+0.110
tails+0.108
w+0.097
 poop+0.092
 dressed+0.092
Neg logit contributions
igsaw-0.115
 newfound-0.099
lad-0.097
 importantly-0.093
 million-0.093
Tokenm
Token ID76
Feature activation+5.65e-02

Pos logit contributions
 elbows+0.162
tails+0.161
 marks+0.148
 poop+0.145
 smoke+0.144
Neg logit contributions
igsaw-0.165
lad-0.148
oke-0.147
 importantly-0.137
Em-0.136
Tokenallows
Token ID47205
Feature activation+8.23e-02

Pos logit contributions
tails+0.096
 elbows+0.094
tail+0.091
Give+0.083
 knit+0.083
Neg logit contributions
lad-0.075
igsaw-0.066
 importantly-0.064
 newfound-0.063
oke-0.059
Token.
Token ID13
Feature activation+1.02e-02

Pos logit contributions
 elbows+0.125
tails+0.109
w+0.102
 poop+0.095
 marks+0.090
Neg logit contributions
igsaw-0.145
lad-0.128
 million-0.122
 Nice-0.117
Em-0.116
Token They
Token ID1119
Feature activation+2.63e-02

Pos logit contributions
 elbows+0.013
 poop+0.013
 smoke+0.013
 covered+0.013
 made+0.012
Neg logit contributions
igsaw-0.021
lad-0.018
reetings-0.017
 newfound-0.017
Em-0.017
Token gave
Token ID2921
Feature activation-6.90e-03

Pos logit contributions
 elbows+0.036
tails+0.031
w+0.029
 poop+0.029
 marks+0.028
Neg logit contributions
igsaw-0.046
Em-0.044
 newfound-0.044
 million-0.043
lad-0.043
Token She
Token ID1375
Feature activation-7.61e-03

Pos logit contributions
igsaw+0.022
lad+0.019
reetings+0.019
 newfound+0.018
 importantly+0.018
Neg logit contributions
 made-0.016
 dressed-0.016
 covered-0.015
 elbows-0.015
tails-0.015
Token ate
Token ID15063
Feature activation+2.73e-02

Pos logit contributions
igsaw+0.015
lad+0.014
Em+0.013
 newfound+0.013
reetings+0.013
Neg logit contributions
 elbows-0.010
 poop-0.009
tails-0.009
 smoke-0.008
 marks-0.008
Token popcorn
Token ID38914
Feature activation+4.86e-02

Pos logit contributions
 elbows+0.030
tails+0.028
 poop+0.027
 covered+0.026
 marks+0.026
Neg logit contributions
igsaw-0.060
lad-0.052
 newfound-0.051
Em-0.050
 importantly-0.049
Token and
Token ID290
Feature activation+4.65e-03

Pos logit contributions
 elbows+0.075
tails+0.063
 poop+0.061
 smoke+0.060
w+0.059
Neg logit contributions
igsaw-0.093
lad-0.076
 newfound-0.076
 hero-0.072
Em-0.071
Token roasted
Token ID30490
Feature activation+6.77e-02

Pos logit contributions
 elbows+0.006
 poop+0.005
tails+0.005
out+0.005
ow+0.005
Neg logit contributions
igsaw-0.008
 newfound-0.008
Em-0.008
 importantly-0.008
lad-0.008
Token marsh
Token ID22397
Feature activation+1.02e-01

Pos logit contributions
 elbows+0.107
tails+0.098
w+0.093
 poop+0.092
 covered+0.092
Neg logit contributions
igsaw-0.116
 newfound-0.098
lad-0.097
Em-0.095
 importantly-0.093
Tokenm
Token ID76
Feature activation+5.44e-02

Pos logit contributions
 elbows+0.158
tails+0.149
 smoke+0.146
 marks+0.145
 poop+0.144
Neg logit contributions
igsaw-0.167
lad-0.148
Em-0.142
oke-0.141
 importantly-0.140
Tokenallows
Token ID47205
Feature activation+7.98e-02

Pos logit contributions
 elbows+0.089
tails+0.079
 poop+0.076
tail+0.073
 covered+0.073
Neg logit contributions
lad-0.082
igsaw-0.079
 importantly-0.073
 newfound-0.072
oke-0.068
Token.
Token ID13
Feature activation-7.18e-03

Pos logit contributions
 elbows+0.120
tails+0.099
w+0.097
 poop+0.096
 marks+0.089
Neg logit contributions
igsaw-0.154
lad-0.130
Em-0.123
 Nice-0.123
 million-0.120
Token\n
Token ID198
Feature activation-8.60e-03

Pos logit contributions
igsaw+0.015
lad+0.013
reetings+0.013
 newfound+0.012
 importantly+0.012
Neg logit contributions
 dressed-0.008
 made-0.008
 covered-0.008
 marks-0.008
 when-0.008
Token\n
Token ID198
Feature activation-1.42e-02

Pos logit contributions
igsaw+0.018
lad+0.015
 importantly+0.014
 newfound+0.014
reetings+0.014
Neg logit contributions
 made-0.011
 dressed-0.010
 covered-0.010
 marks-0.010
w-0.010
Token words
Token ID2456
Feature activation-8.08e-03

Pos logit contributions
igsaw+0.054
lad+0.047
Em+0.046
reetings+0.044
 importantly+0.044
Neg logit contributions
 made-0.036
 marks-0.036
 smoke-0.036
 poop-0.035
 covered-0.034
Token and
Token ID290
Feature activation-1.28e-02

Pos logit contributions
igsaw+0.017
lad+0.015
 newfound+0.014
 importantly+0.014
Em+0.014
Neg logit contributions
 elbows-0.010
 poop-0.009
tails-0.009
w-0.009
 smoke-0.008
Token the
Token ID262
Feature activation-1.89e-02

Pos logit contributions
igsaw+0.025
 newfound+0.023
lad+0.022
 importantly+0.022
Em+0.022
Neg logit contributions
 elbows-0.016
w-0.015
 poop-0.015
tails-0.014
 dressed-0.014
Token pictures
Token ID5986
Feature activation+1.23e-02

Pos logit contributions
igsaw+0.043
lad+0.038
Em+0.037
reetings+0.036
 importantly+0.035
Neg logit contributions
 smoke-0.021
 made-0.020
 marks-0.020
 poop-0.020
 elbows-0.019
Token.
Token ID13
Feature activation-8.07e-03

Pos logit contributions
 elbows+0.017
 poop+0.015
tails+0.015
w+0.014
 marks+0.013
Neg logit contributions
igsaw-0.024
lad-0.021
 newfound-0.021
 importantly-0.020
Em-0.020
Token Before
Token ID7413
Feature activation+1.01e-01

Pos logit contributions
igsaw+0.017
 newfound+0.015
reetings+0.015
lad+0.015
Em+0.014
Neg logit contributions
 made-0.009
 smoke-0.009
 when-0.009
 waited-0.009
 covered-0.009
Token he
Token ID339
Feature activation+2.71e-02

Pos logit contributions
 elbows+0.149
 poop+0.142
tails+0.141
 dressed+0.140
w+0.139
Neg logit contributions
igsaw-0.188
 newfound-0.152
 importantly-0.150
lad-0.149
Em-0.148
Token knew
Token ID2993
Feature activation-2.32e-03

Pos logit contributions
 made+0.028
 covered+0.025
 marks+0.023
 waited+0.023
 dressed+0.022
Neg logit contributions
igsaw-0.068
lad-0.058
reetings-0.056
Em-0.056
 importantly-0.055
Token it
Token ID340
Feature activation-1.75e-02

Pos logit contributions
igsaw+0.006
lad+0.005
Em+0.005
reetings+0.005
 newfound+0.005
Neg logit contributions
 made-0.002
 smoke-0.002
 covered-0.002
 marks-0.002
 elbows-0.002
Token,
Token ID11
Feature activation-4.24e-03

Pos logit contributions
igsaw+0.037
lad+0.032
Em+0.031
 newfound+0.031
reetings+0.030
Neg logit contributions
 elbows-0.022
 poop-0.020
tails-0.020
 covered-0.019
 marks-0.019
Token he
Token ID339
Feature activation+2.50e-03

Pos logit contributions
igsaw+0.008
lad+0.007
 newfound+0.007
Em+0.007
 importantly+0.007
Neg logit contributions
 elbows-0.005
 poop-0.005
tails-0.005
 marks-0.005
 covered-0.005
Token and
Token ID290
Feature activation+1.96e-03

Pos logit contributions
 elbows+0.046
w+0.041
tails+0.041
 poop+0.040
 dressed+0.039
Neg logit contributions
igsaw-0.062
lad-0.054
 victorious-0.052
 newfound-0.051
 million-0.051
Token knocked
Token ID13642
Feature activation+3.54e-02

Pos logit contributions
 elbows+0.003
tails+0.002
w+0.002
 poop+0.002
 dressed+0.002
Neg logit contributions
igsaw-0.003
 newfound-0.003
lad-0.003
reetings-0.003
 importantly-0.003
Token something
Token ID1223
Feature activation+4.76e-02

Pos logit contributions
 elbows+0.045
tails+0.044
 poop+0.041
 smoke+0.040
 covered+0.040
Neg logit contributions
igsaw-0.068
lad-0.061
Em-0.058
reetings-0.058
 newfound-0.054
Token loose
Token ID9155
Feature activation+3.10e-02

Pos logit contributions
 elbows+0.063
tails+0.059
w+0.056
 dressed+0.056
 poop+0.055
Neg logit contributions
igsaw-0.091
 newfound-0.079
lad-0.078
 importantly-0.077
Em-0.077
Token.
Token ID13
Feature activation-3.03e-03

Pos logit contributions
 elbows+0.040
tails+0.036
 poop+0.033
 dressed+0.032
ups+0.032
Neg logit contributions
igsaw-0.060
lad-0.054
 hero-0.051
reetings-0.051
Em-0.051
Token Before
Token ID7413
Feature activation+1.01e-01

Pos logit contributions
igsaw+0.005
lad+0.004
Em+0.004
 newfound+0.004
reetings+0.004
Neg logit contributions
 elbows-0.005
 poop-0.005
tails-0.005
 covered-0.005
 marks-0.005
Token he
Token ID339
Feature activation+3.12e-02

Pos logit contributions
 elbows+0.133
 dressed+0.130
 poop+0.127
tails+0.126
w+0.122
Neg logit contributions
igsaw-0.196
 newfound-0.165
lad-0.163
 importantly-0.161
Em-0.154
Token knew
Token ID2993
Feature activation+1.13e-03

Pos logit contributions
 made+0.029
 covered+0.027
 marks+0.023
 waited+0.023
w+0.022
Neg logit contributions
igsaw-0.083
lad-0.070
Em-0.068
reetings-0.067
 importantly-0.067
Token it
Token ID340
Feature activation-8.73e-03

Pos logit contributions
 elbows+0.001
 covered+0.001
 made+0.001
 poop+0.001
 marks+0.001
Neg logit contributions
igsaw-0.003
lad-0.002
Em-0.002
reetings-0.002
 newfound-0.002
Token,
Token ID11
Feature activation+1.63e-03

Pos logit contributions
igsaw+0.019
lad+0.016
Em+0.016
 newfound+0.016
reetings+0.015
Neg logit contributions
 elbows-0.010
 poop-0.010
tails-0.009
 covered-0.009
 marks-0.009
Token something
Token ID1223
Feature activation+2.94e-02

Pos logit contributions
 elbows+0.002
 poop+0.002
tails+0.002
 dressed+0.002
w+0.002
Neg logit contributions
igsaw-0.003
 newfound-0.003
lad-0.003
 importantly-0.003
joy-0.002
Token move
Token ID1445
Feature activation+1.03e-02

Pos logit contributions
w+0.007
 elbows+0.007
 dressed+0.006
tails+0.006
 poop+0.006
Neg logit contributions
igsaw-0.012
 newfound-0.011
 million-0.010
Em-0.010
lad-0.010
Token closer
Token ID5699
Feature activation+2.84e-04

Pos logit contributions
 elbows+0.013
w+0.013
 poop+0.012
tails+0.012
 dressed+0.011
Neg logit contributions
igsaw-0.019
 newfound-0.017
 million-0.016
Em-0.016
lad-0.016
Token to
Token ID284
Feature activation+1.14e-02

Pos logit contributions
 elbows+0.000
w+0.000
 poop+0.000
 dressed+0.000
 smoke+0.000
Neg logit contributions
igsaw-0.001
lad-0.001
 newfound-0.001
 importantly-0.001
Em-0.000
Token her
Token ID607
Feature activation+1.69e-02

Pos logit contributions
 elbows+0.010
 poop+0.009
 covered+0.009
tails+0.009
 marks+0.009
Neg logit contributions
igsaw-0.028
lad-0.024
Em-0.024
 newfound-0.024
reetings-0.023
Token.
Token ID13
Feature activation-2.05e-03

Pos logit contributions
 elbows+0.019
w+0.018
 poop+0.017
 dressed+0.017
 covered+0.017
Neg logit contributions
igsaw-0.037
 newfound-0.032
lad-0.031
 importantly-0.030
 pleasure-0.030
Token Before
Token ID7413
Feature activation+1.00e-01

Pos logit contributions
igsaw+0.004
lad+0.004
reetings+0.003
 newfound+0.003
 importantly+0.003
Neg logit contributions
 covered-0.003
 made-0.003
 elbows-0.003
 poop-0.003
 dressed-0.003
Token she
Token ID673
Feature activation+2.48e-02

Pos logit contributions
 elbows+0.135
 dressed+0.130
 poop+0.129
tails+0.127
w+0.126
Neg logit contributions
igsaw-0.198
 newfound-0.164
Em-0.162
 importantly-0.159
lad-0.156
Token knew
Token ID2993
Feature activation+7.07e-04

Pos logit contributions
 made+0.025
 covered+0.024
 waited+0.021
 marks+0.021
 smoke+0.021
Neg logit contributions
igsaw-0.062
lad-0.054
reetings-0.052
Em-0.051
 newfound-0.050
Token it
Token ID340
Feature activation-1.18e-02

Pos logit contributions
 elbows+0.001
 poop+0.001
tails+0.001
 covered+0.001
 marks+0.001
Neg logit contributions
igsaw-0.002
lad-0.002
Em-0.001
 newfound-0.001
reetings-0.001
Token,
Token ID11
Feature activation+1.49e-03

Pos logit contributions
igsaw+0.023
lad+0.020
 newfound+0.020
Em+0.019
reetings+0.019
Neg logit contributions
 elbows-0.016
 poop-0.015
tails-0.014
 smoke-0.014
 dressed-0.014
Token the
Token ID262
Feature activation-1.62e-02

Pos logit contributions
 elbows+0.002
tails+0.002
 poop+0.002
w+0.002
 marks+0.001
Neg logit contributions
igsaw-0.003
 newfound-0.003
Em-0.003
lad-0.003
 importantly-0.003
Token around
Token ID1088
Feature activation-3.99e-03

Pos logit contributions
 elbows+0.029
 poop+0.027
tails+0.027
 marks+0.027
 smoke+0.026
Neg logit contributions
igsaw-0.065
lad-0.058
Em-0.057
 newfound-0.056
 importantly-0.055
Token the
Token ID262
Feature activation+2.59e-03

Pos logit contributions
igsaw+0.009
Em+0.008
 newfound+0.008
lad+0.007
 importantly+0.007
Neg logit contributions
 elbows-0.004
 poop-0.004
tails-0.004
w-0.004
 marks-0.004
Token fireplace
Token ID44250
Feature activation+2.99e-02

Pos logit contributions
 elbows+0.005
 poop+0.004
tails+0.004
 marks+0.004
 covered+0.004
Neg logit contributions
igsaw-0.004
lad-0.003
Em-0.003
 newfound-0.003
 importantly-0.003
Token and
Token ID290
Feature activation-1.36e-03

Pos logit contributions
 elbows+0.041
w+0.040
 poop+0.037
tails+0.036
 marks+0.035
Neg logit contributions
igsaw-0.055
Em-0.047
 importantly-0.045
joy-0.045
 million-0.045
Token roasted
Token ID30490
Feature activation+6.33e-02

Pos logit contributions
Em+0.002
 newfound+0.002
igsaw+0.002
 brand+0.002
 million+0.002
Neg logit contributions
 elbows-0.002
w-0.002
 marks-0.002
ms-0.002
mons-0.002
Token marsh
Token ID22397
Feature activation+9.97e-02

Pos logit contributions
 elbows+0.117
w+0.109
tails+0.107
ms+0.100
 poop+0.099
Neg logit contributions
igsaw-0.086
 newfound-0.075
 million-0.071
 brand-0.070
Em-0.068
Tokenm
Token ID76
Feature activation+4.84e-02

Pos logit contributions
 elbows+0.155
tails+0.151
 marks+0.146
 poop+0.145
 smoke+0.142
Neg logit contributions
igsaw-0.155
Em-0.139
lad-0.133
oke-0.131
 importantly-0.130
Tokenallows
Token ID47205
Feature activation+7.59e-02

Pos logit contributions
 elbows+0.083
w+0.080
tail+0.080
Give+0.079
tails+0.079
Neg logit contributions
 importantly-0.052
lad-0.052
igsaw-0.048
 newfound-0.046
 brand-0.044
Token.
Token ID13
Feature activation-5.62e-03

Pos logit contributions
 elbows+0.113
w+0.097
tails+0.095
 poop+0.093
 marks+0.085
Neg logit contributions
igsaw-0.136
Em-0.114
lad-0.112
 million-0.111
 Nice-0.107
Token Suddenly
Token ID24975
Feature activation+3.46e-02

Pos logit contributions
igsaw+0.011
lad+0.010
reetings+0.010
 newfound+0.009
 importantly+0.009
Neg logit contributions
 covered-0.007
 made-0.007
 smoke-0.007
 poop-0.007
 dressed-0.007
Token,
Token ID11
Feature activation+1.79e-02

Pos logit contributions
 elbows+0.049
 poop+0.046
tails+0.045
 dressed+0.044
 marks+0.043
Neg logit contributions
igsaw-0.066
Em-0.055
 newfound-0.054
lad-0.053
 importantly-0.052
Token.
Token ID13
Feature activation+1.46e-02

Pos logit contributions
igsaw+0.007
 newfound+0.006
lad+0.006
Em+0.006
 importantly+0.006
Neg logit contributions
 elbows-0.004
tails-0.003
w-0.003
 poop-0.003
 marks-0.003
Token They
Token ID1119
Feature activation+2.57e-02

Pos logit contributions
 poop+0.016
 made+0.016
 elbows+0.016
 covered+0.015
 dressed+0.015
Neg logit contributions
igsaw-0.032
lad-0.030
reetings-0.028
 newfound-0.027
 lifetime-0.026
Token pretended
Token ID36766
Feature activation+1.95e-02

Pos logit contributions
 elbows+0.036
 poop+0.031
w+0.031
tails+0.030
 marks+0.029
Neg logit contributions
igsaw-0.044
Em-0.043
 newfound-0.041
lad-0.039
 million-0.039
Token to
Token ID284
Feature activation-1.87e-02

Pos logit contributions
 elbows+0.032
w+0.029
tails+0.028
 poop+0.025
 marks+0.025
Neg logit contributions
igsaw-0.031
 newfound-0.027
 importantly-0.026
 Nice-0.026
Em-0.026
Token roast
Token ID32595
Feature activation+6.32e-02

Pos logit contributions
igsaw+0.033
 newfound+0.029
lad+0.028
reetings+0.028
 hero+0.027
Neg logit contributions
 elbows-0.026
w-0.025
 dressed-0.023
mons-0.023
ups-0.023
Token marsh
Token ID22397
Feature activation+9.97e-02

Pos logit contributions
 elbows+0.096
tails+0.087
w+0.085
 poop+0.079
 dressed+0.078
Neg logit contributions
igsaw-0.110
 newfound-0.094
lad-0.089
 importantly-0.087
Em-0.087
Tokenm
Token ID76
Feature activation+4.85e-02

Pos logit contributions
 elbows+0.149
tails+0.137
 poop+0.133
 marks+0.132
 smoke+0.130
Neg logit contributions
igsaw-0.166
lad-0.146
oke-0.145
Em-0.140
 importantly-0.134
Tokenallows
Token ID47205
Feature activation+7.54e-02

Pos logit contributions
 elbows+0.083
Give+0.078
w+0.075
 say+0.073
tail+0.072
Neg logit contributions
lad-0.057
igsaw-0.056
oke-0.049
 importantly-0.049
 newfound-0.048
Token and
Token ID290
Feature activation+1.06e-02

Pos logit contributions
 elbows+0.120
w+0.102
tails+0.099
 poop+0.097
 marks+0.089
Neg logit contributions
igsaw-0.133
lad-0.113
 Nice-0.105
 hero-0.104
Em-0.104
Token sing
Token ID1702
Feature activation-5.17e-03

Pos logit contributions
w+0.013
 elbows+0.012
uck+0.011
out+0.011
ups+0.010
Neg logit contributions
igsaw-0.020
 newfound-0.018
 importantly-0.018
Em-0.017
 million-0.017
Token songs
Token ID7259
Feature activation-4.01e-04

Pos logit contributions
igsaw+0.010
Em+0.009
 newfound+0.009
lad+0.008
 importantly+0.008
Neg logit contributions
 elbows-0.007
w-0.007
 poop-0.006
 spill-0.006
 dressed-0.006
Token more
Token ID517
Feature activation+3.28e-02

Pos logit contributions
 elbows+0.022
w+0.021
tails+0.020
 poop+0.020
 dressed+0.019
Neg logit contributions
igsaw-0.038
 newfound-0.032
lad-0.031
joy-0.030
Em-0.029
Token and
Token ID290
Feature activation-8.95e-04

Pos logit contributions
 elbows+0.043
tails+0.039
w+0.037
 waited+0.036
 poop+0.035
Neg logit contributions
igsaw-0.071
 newfound-0.058
 importantly-0.057
Em-0.055
lad-0.055
Token more
Token ID517
Feature activation+3.88e-02

Pos logit contributions
igsaw+0.002
lad+0.001
 newfound+0.001
Em+0.001
 importantly+0.001
Neg logit contributions
 elbows-0.001
 poop-0.001
tails-0.001
w-0.001
 marks-0.001
Token space
Token ID2272
Feature activation-1.63e-02

Pos logit contributions
 elbows+0.049
w+0.044
tails+0.043
 poop+0.042
 waited+0.040
Neg logit contributions
igsaw-0.084
 importantly-0.071
 newfound-0.070
Em-0.068
lad-0.067
Token.
Token ID13
Feature activation-8.90e-03

Pos logit contributions
igsaw+0.036
lad+0.031
 newfound+0.030
 importantly+0.029
reetings+0.029
Neg logit contributions
 elbows-0.019
 poop-0.017
w-0.017
 smoke-0.015
tails-0.015
Token Before
Token ID7413
Feature activation+9.96e-02

Pos logit contributions
igsaw+0.014
lad+0.013
reetings+0.013
 newfound+0.011
 importantly+0.011
Neg logit contributions
 covered-0.014
 made-0.014
 elbows-0.014
 smoke-0.014
tails-0.014
Token long
Token ID890
Feature activation+5.94e-02

Pos logit contributions
 elbows+0.134
 poop+0.129
 marks+0.125
 dressed+0.125
tails+0.124
Neg logit contributions
igsaw-0.197
 newfound-0.162
Em-0.161
lad-0.160
 importantly-0.160
Token,
Token ID11
Feature activation+1.06e-02

Pos logit contributions
 elbows+0.076
 poop+0.072
tails+0.070
w+0.070
 dressed+0.069
Neg logit contributions
igsaw-0.121
lad-0.102
 newfound-0.100
Em-0.098
 importantly-0.097
Token the
Token ID262
Feature activation-1.54e-02

Pos logit contributions
 elbows+0.015
 poop+0.014
tails+0.014
 marks+0.014
 covered+0.014
Neg logit contributions
igsaw-0.020
lad-0.017
Em-0.016
 newfound-0.016
 importantly-0.016
Token entire
Token ID2104
Feature activation-7.72e-03

Pos logit contributions
igsaw+0.031
 newfound+0.026
lad+0.026
Em+0.025
 importantly+0.025
Neg logit contributions
 elbows-0.021
tails-0.019
 poop-0.019
 dressed-0.018
w-0.018
Token box
Token ID3091
Feature activation-2.17e-02

Pos logit contributions
igsaw+0.015
lad+0.013
Em+0.012
 newfound+0.012
reetings+0.012
Neg logit contributions
 elbows-0.010
 poop-0.010
 covered-0.010
 marks-0.010
tails-0.010
Token things
Token ID1243
Feature activation+1.89e-02

Pos logit contributions
igsaw+0.014
 newfound+0.013
 importantly+0.012
joy+0.011
 lifetime+0.011
Neg logit contributions
 elbows-0.010
w-0.010
ow-0.009
tails-0.009
uck-0.008
Token in
Token ID287
Feature activation-2.96e-02

Pos logit contributions
 elbows+0.024
tails+0.021
w+0.021
 poop+0.020
 marks+0.020
Neg logit contributions
igsaw-0.038
 importantly-0.032
lad-0.032
 newfound-0.032
Em-0.032
Token the
Token ID262
Feature activation-8.35e-05

Pos logit contributions
igsaw+0.066
 newfound+0.055
lad+0.054
Em+0.054
 importantly+0.053
Neg logit contributions
 elbows-0.033
 poop-0.030
w-0.030
tails-0.030
 covered-0.029
Token lab
Token ID2248
Feature activation+1.42e-02

Pos logit contributions
igsaw+0.000
 newfound+0.000
Em+0.000
 lifetime+0.000
reetings+0.000
Neg logit contributions
 elbows-0.000
w-0.000
tails-0.000
 covered-0.000
 poop-0.000
Token.
Token ID13
Feature activation-9.14e-03

Pos logit contributions
 elbows+0.017
w+0.016
 poop+0.015
tails+0.014
 marks+0.013
Neg logit contributions
igsaw-0.030
Em-0.025
 importantly-0.025
lad-0.025
 lifetime-0.024
Token Before
Token ID7413
Feature activation+9.94e-02

Pos logit contributions
igsaw+0.017
lad+0.017
reetings+0.016
 newfound+0.015
 hero+0.015
Neg logit contributions
 dressed-0.011
 made-0.010
 covered-0.010
 elbows-0.010
 marks-0.010
Token Molly
Token ID30236
Feature activation+3.32e-02

Pos logit contributions
 elbows+0.149
 poop+0.141
tails+0.140
w+0.140
 dressed+0.140
Neg logit contributions
igsaw-0.188
 newfound-0.151
 importantly-0.147
Em-0.144
 million-0.140
Token knew
Token ID2993
Feature activation-8.63e-04

Pos logit contributions
 elbows+0.052
tails+0.044
 poop+0.043
w+0.042
 smoke+0.041
Neg logit contributions
igsaw-0.058
Em-0.052
 newfound-0.051
lad-0.050
oke-0.050
Token it
Token ID340
Feature activation-1.62e-02

Pos logit contributions
igsaw+0.002
lad+0.002
 newfound+0.002
 importantly+0.002
Em+0.002
Neg logit contributions
 elbows-0.001
 poop-0.001
tails-0.001
w-0.001
 covered-0.001
Token,
Token ID11
Feature activation-6.70e-04

Pos logit contributions
igsaw+0.032
lad+0.027
 newfound+0.026
reetings+0.026
Em+0.026
Neg logit contributions
 elbows-0.023
 poop-0.021
tails-0.020
w-0.020
 marks-0.020
Token her
Token ID607
Feature activation-1.30e-02

Pos logit contributions
igsaw+0.001
 newfound+0.001
 importantly+0.001
lad+0.001
Em+0.001
Neg logit contributions
 elbows-0.001
tails-0.001
 poop-0.001
w-0.001
 marks-0.001
Token for
Token ID329
Feature activation+3.92e-02

Pos logit contributions
 elbows+0.026
 poop+0.024
tails+0.024
 covered+0.023
 marks+0.023
Neg logit contributions
igsaw-0.055
lad-0.048
Em-0.046
 newfound-0.046
reetings-0.046
Token someone
Token ID2130
Feature activation+9.76e-03

Pos logit contributions
 elbows+0.040
tails+0.037
 poop+0.037
 covered+0.036
 dressed+0.036
Neg logit contributions
igsaw-0.089
lad-0.078
 newfound-0.077
Em-0.076
 importantly-0.075
Token to
Token ID284
Feature activation+1.01e-02

Pos logit contributions
 elbows+0.012
w+0.010
 smoke+0.010
 marks+0.010
 poop+0.010
Neg logit contributions
igsaw-0.020
Em-0.018
 newfound-0.018
lad-0.017
reetings-0.017
Token come
Token ID1282
Feature activation+2.24e-03

Pos logit contributions
 elbows+0.012
w+0.012
tails+0.011
 poop+0.011
 marks+0.011
Neg logit contributions
igsaw-0.021
 newfound-0.018
lad-0.018
reetings-0.018
Em-0.017
Token.
Token ID13
Feature activation-5.75e-03

Pos logit contributions
 elbows+0.002
 poop+0.002
 smoke+0.002
w+0.002
 marks+0.002
Neg logit contributions
igsaw-0.005
lad-0.004
reetings-0.004
 newfound-0.004
Em-0.004
Token Before
Token ID7413
Feature activation+9.92e-02

Pos logit contributions
igsaw+0.010
lad+0.009
reetings+0.009
Em+0.009
 newfound+0.008
Neg logit contributions
 smoke-0.008
 covered-0.008
 elbows-0.008
 made-0.008
 dressed-0.008
Token long
Token ID890
Feature activation+6.02e-02

Pos logit contributions
 elbows+0.153
 poop+0.145
tails+0.144
w+0.143
 dressed+0.142
Neg logit contributions
igsaw-0.178
 newfound-0.145
 importantly-0.142
Em-0.138
lad-0.137
Token,
Token ID11
Feature activation+8.25e-03

Pos logit contributions
 elbows+0.087
tails+0.083
w+0.082
 poop+0.081
 covered+0.077
Neg logit contributions
igsaw-0.110
 newfound-0.091
lad-0.091
 importantly-0.089
reetings-0.086
Token lots
Token ID6041
Feature activation+4.65e-03

Pos logit contributions
 elbows+0.010
 poop+0.010
tails+0.010
 marks+0.009
 dressed+0.009
Neg logit contributions
igsaw-0.017
 newfound-0.015
lad-0.014
 importantly-0.014
Em-0.014
Token of
Token ID286
Feature activation+6.68e-03

Pos logit contributions
 elbows+0.009
tails+0.008
 poop+0.008
w+0.008
 marks+0.008
Neg logit contributions
igsaw-0.007
 importantly-0.005
lad-0.005
 newfound-0.005
Em-0.005
Token people
Token ID661
Feature activation+1.38e-02

Pos logit contributions
 elbows+0.010
tails+0.009
w+0.009
bat+0.009
 covered+0.009
Neg logit contributions
igsaw-0.012
 newfound-0.011
 importantly-0.011
 pleasure-0.010
lad-0.009
Token They
Token ID1119
Feature activation+1.98e-02

Pos logit contributions
igsaw+0.018
lad+0.017
reetings+0.016
 newfound+0.016
Em+0.015
Neg logit contributions
 elbows-0.009
 covered-0.009
 dressed-0.009
 smoke-0.008
 poop-0.008
Token laughed
Token ID13818
Feature activation+2.79e-03

Pos logit contributions
 elbows+0.026
 poop+0.023
tails+0.022
 marks+0.022
 smoke+0.021
Neg logit contributions
igsaw-0.038
Em-0.036
 newfound-0.034
lad-0.034
 importantly-0.033
Token and
Token ID290
Feature activation-4.98e-03

Pos logit contributions
 elbows+0.004
tails+0.004
w+0.004
 poop+0.003
 marks+0.003
Neg logit contributions
igsaw-0.005
 newfound-0.005
 importantly-0.005
lad-0.005
Em-0.005
Token played
Token ID2826
Feature activation-3.94e-03

Pos logit contributions
igsaw+0.009
Em+0.009
 newfound+0.008
 importantly+0.008
lad+0.008
Neg logit contributions
 elbows-0.006
 marks-0.006
 poop-0.006
 smoke-0.006
w-0.005
Token.
Token ID13
Feature activation-1.29e-03

Pos logit contributions
igsaw+0.009
lad+0.007
 newfound+0.007
Em+0.007
 importantly+0.007
Neg logit contributions
 elbows-0.004
tails-0.004
 poop-0.004
w-0.004
 marks-0.004
Token Before
Token ID7413
Feature activation+9.92e-02

Pos logit contributions
igsaw+0.003
lad+0.003
reetings+0.003
 newfound+0.002
 hero+0.002
Neg logit contributions
 dressed-0.001
 covered-0.001
 made-0.001
 smoke-0.001
 elbows-0.001
Token dark
Token ID3223
Feature activation+4.96e-02

Pos logit contributions
 elbows+0.139
tails+0.134
 marks+0.130
 poop+0.129
w+0.127
Neg logit contributions
igsaw-0.190
 importantly-0.155
 newfound-0.155
Em-0.152
 million-0.147
Token,
Token ID11
Feature activation+8.25e-03

Pos logit contributions
 elbows+0.068
tails+0.065
w+0.060
 poop+0.060
 marks+0.058
Neg logit contributions
igsaw-0.097
lad-0.081
 newfound-0.078
reetings-0.076
 importantly-0.076
Token they
Token ID484
Feature activation+3.18e-02

Pos logit contributions
 elbows+0.013
tails+0.012
 poop+0.012
 marks+0.011
w+0.011
Neg logit contributions
igsaw-0.015
 newfound-0.012
Em-0.012
 importantly-0.012
 pleasure-0.012
Token returned
Token ID4504
Feature activation-9.45e-03

Pos logit contributions
 elbows+0.035
tails+0.031
 poop+0.031
 marks+0.030
 smoke+0.029
Neg logit contributions
igsaw-0.069
Em-0.062
lad-0.062
 newfound-0.060
 importantly-0.059
Token back
Token ID736
Feature activation-1.27e-02

Pos logit contributions
igsaw+0.017
joy+0.015
 newfound+0.015
 pleasure+0.014
lad+0.014
Neg logit contributions
 elbows-0.013
w-0.013
tails-0.012
 poop-0.012
 smoke-0.012
Token road
Token ID2975
Feature activation+5.18e-03

Pos logit contributions
igsaw+0.002
lad+0.001
reetings+0.001
Em+0.001
 importantly+0.001
Neg logit contributions
 elbows-0.001
 poop-0.001
 covered-0.001
 smoke-0.001
 marks-0.001
Tokenblock
Token ID9967
Feature activation+1.40e-02

Pos logit contributions
 elbows+0.007
w+0.007
 poop+0.006
tails+0.006
 marks+0.006
Neg logit contributions
igsaw-0.010
lad-0.009
 newfound-0.009
Em-0.008
joy-0.008
Token's
Token ID338
Feature activation+4.24e-02

Pos logit contributions
 elbows+0.019
w+0.018
tails+0.017
 poop+0.016
 marks+0.016
Neg logit contributions
igsaw-0.027
 newfound-0.022
Em-0.022
lad-0.021
 million-0.021
Token back
Token ID736
Feature activation-2.96e-02

Pos logit contributions
 elbows+0.057
 poop+0.054
 covered+0.053
 marks+0.053
tails+0.053
Neg logit contributions
igsaw-0.084
lad-0.072
Em-0.069
reetings-0.068
 newfound-0.068
Token.
Token ID13
Feature activation-1.19e-02

Pos logit contributions
igsaw+0.068
lad+0.058
 newfound+0.057
Em+0.056
 importantly+0.056
Neg logit contributions
 elbows-0.030
 poop-0.027
w-0.027
 smoke-0.026
tails-0.026
Token Before
Token ID7413
Feature activation+9.90e-02

Pos logit contributions
igsaw+0.022
lad+0.021
reetings+0.020
 newfound+0.019
Em+0.019
Neg logit contributions
 dressed-0.014
 covered-0.014
 made-0.014
 elbows-0.013
 smoke-0.013
Token he
Token ID339
Feature activation+2.63e-02

Pos logit contributions
 elbows+0.144
 poop+0.139
tails+0.137
 marks+0.137
w+0.135
Neg logit contributions
igsaw-0.183
 newfound-0.152
 importantly-0.148
Em-0.146
lad-0.145
Token knew
Token ID2993
Feature activation-7.11e-03

Pos logit contributions
 made+0.026
 covered+0.025
 dressed+0.024
 elbows+0.024
 poop+0.023
Neg logit contributions
igsaw-0.063
lad-0.054
Em-0.053
 importantly-0.051
reetings-0.051
Token it
Token ID340
Feature activation-1.61e-02

Pos logit contributions
igsaw+0.017
lad+0.015
Em+0.014
reetings+0.014
 newfound+0.014
Neg logit contributions
 elbows-0.007
 made-0.006
 covered-0.006
 poop-0.006
 dressed-0.006
Token,
Token ID11
Feature activation-3.69e-03

Pos logit contributions
igsaw+0.032
lad+0.028
 newfound+0.027
reetings+0.027
Em+0.027
Neg logit contributions
 elbows-0.022
 poop-0.020
tails-0.019
 marks-0.019
 smoke-0.019
Token he
Token ID339
Feature activation+1.19e-03

Pos logit contributions
igsaw+0.008
lad+0.007
 newfound+0.007
 importantly+0.006
Em+0.006
Neg logit contributions
 elbows-0.005
 poop-0.004
tails-0.004
 marks-0.004
w-0.004
Token and
Token ID290
Feature activation+1.03e-02

Pos logit contributions
 elbows+0.067
tails+0.061
 poop+0.059
 covered+0.056
 dressed+0.056
Neg logit contributions
igsaw-0.130
lad-0.112
 newfound-0.110
Em-0.109
 importantly-0.104
Token combined
Token ID5929
Feature activation+2.25e-02

Pos logit contributions
 elbows+0.012
tails+0.011
 poop+0.011
 marks+0.010
w+0.010
Neg logit contributions
igsaw-0.022
Em-0.020
lad-0.019
 newfound-0.019
 importantly-0.019
Token everything
Token ID2279
Feature activation+5.70e-03

Pos logit contributions
 elbows+0.027
tails+0.024
 poop+0.024
 covered+0.023
w+0.023
Neg logit contributions
igsaw-0.049
lad-0.041
 newfound-0.041
Em-0.040
 importantly-0.039
Token together
Token ID1978
Feature activation-7.77e-03

Pos logit contributions
 elbows+0.007
tails+0.006
 poop+0.006
w+0.006
 smoke+0.006
Neg logit contributions
igsaw-0.013
lad-0.011
 newfound-0.010
Em-0.010
 importantly-0.010
Token.
Token ID13
Feature activation-1.15e-02

Pos logit contributions
igsaw+0.017
lad+0.015
 newfound+0.015
Em+0.014
 importantly+0.014
Neg logit contributions
 elbows-0.009
tails-0.008
 poop-0.008
w-0.008
 smoke-0.007
Token Before
Token ID7413
Feature activation+9.88e-02

Pos logit contributions
igsaw+0.022
lad+0.020
reetings+0.019
 hero+0.018
oke+0.018
Neg logit contributions
 made-0.014
 dressed-0.014
 covered-0.014
 elbows-0.014
 smoke-0.014
Token long
Token ID890
Feature activation+5.97e-02

Pos logit contributions
 elbows+0.132
tails+0.124
 poop+0.122
 dressed+0.120
w+0.119
Neg logit contributions
igsaw-0.203
 newfound-0.168
 importantly-0.165
Em-0.163
 million-0.161
Token,
Token ID11
Feature activation+8.39e-03

Pos logit contributions
 elbows+0.085
tails+0.082
w+0.080
 poop+0.079
 dressed+0.078
Neg logit contributions
igsaw-0.113
 newfound-0.093
lad-0.091
 importantly-0.088
Em-0.088
Token Daisy
Token ID40355
Feature activation+2.04e-03

Pos logit contributions
 elbows+0.011
 poop+0.010
tails+0.010
 marks+0.010
 covered+0.009
Neg logit contributions
igsaw-0.017
lad-0.015
 newfound-0.015
Em-0.014
 importantly-0.014
Token had
Token ID550
Feature activation-5.58e-03

Pos logit contributions
 elbows+0.003
 poop+0.003
tails+0.003
w+0.003
mons+0.002
Neg logit contributions
igsaw-0.004
Em-0.003
 million-0.003
lad-0.003
 newfound-0.003
Token created
Token ID2727
Feature activation+2.15e-02

Pos logit contributions
igsaw+0.012
 newfound+0.010
Em+0.010
lad+0.010
 million+0.009
Neg logit contributions
 elbows-0.007
tails-0.007
 poop-0.006
w-0.006
ards-0.006
Token him
Token ID683
Feature activation+9.79e-03

Pos logit contributions
igsaw+0.061
Em+0.054
lad+0.054
 newfound+0.053
 importantly+0.051
Neg logit contributions
 elbows-0.024
 poop-0.022
tails-0.021
 covered-0.021
 marks-0.021
Token tight
Token ID5381
Feature activation+5.06e-03

Pos logit contributions
 elbows+0.009
 poop+0.008
 covered+0.008
 marks+0.008
tails+0.008
Neg logit contributions
igsaw-0.023
lad-0.021
Em-0.020
 newfound-0.020
 importantly-0.020
Token while
Token ID981
Feature activation+4.25e-02

Pos logit contributions
 elbows+0.006
 poop+0.006
w+0.005
 smoke+0.005
tails+0.005
Neg logit contributions
igsaw-0.010
lad-0.009
Em-0.009
 newfound-0.009
 importantly-0.009
Token they
Token ID484
Feature activation+2.28e-02

Pos logit contributions
 elbows+0.040
 poop+0.036
 covered+0.036
tails+0.035
 made+0.035
Neg logit contributions
igsaw-0.102
lad-0.089
Em-0.086
reetings-0.086
 newfound-0.085
Token roasted
Token ID30490
Feature activation+6.36e-02

Pos logit contributions
 elbows+0.028
 poop+0.026
 marks+0.025
 smoke+0.023
w+0.023
Neg logit contributions
igsaw-0.043
 newfound-0.040
lad-0.039
Em-0.039
 hero-0.037
Token marsh
Token ID22397
Feature activation+9.88e-02

Pos logit contributions
 elbows+0.105
w+0.096
tails+0.094
 poop+0.090
 covered+0.088
Neg logit contributions
igsaw-0.101
 newfound-0.087
 million-0.079
lad-0.079
 importantly-0.078
Tokenm
Token ID76
Feature activation+4.96e-02

Pos logit contributions
 elbows+0.142
tails+0.134
 smoke+0.134
 marks+0.134
 poop+0.134
Neg logit contributions
igsaw-0.166
lad-0.147
Em-0.145
 importantly-0.143
oke-0.140
Tokenallows
Token ID47205
Feature activation+7.62e-02

Pos logit contributions
 elbows+0.082
Give+0.076
w+0.075
 say+0.073
 poop+0.072
Neg logit contributions
lad-0.060
igsaw-0.057
 importantly-0.056
 newfound-0.055
 million-0.050
Token over
Token ID625
Feature activation-6.10e-03

Pos logit contributions
 elbows+0.109
w+0.090
 poop+0.090
tails+0.087
 marks+0.082
Neg logit contributions
igsaw-0.144
lad-0.125
Em-0.120
 Nice-0.118
 million-0.115
Token the
Token ID262
Feature activation+5.38e-03

Pos logit contributions
igsaw+0.012
 newfound+0.011
joy+0.010
lad+0.010
 importantly+0.010
Neg logit contributions
 elbows-0.008
w-0.007
tails-0.007
 poop-0.007
 marks-0.006
Token camp
Token ID1413
Feature activation+1.50e-02

Pos logit contributions
 elbows+0.008
w+0.007
tails+0.007
 dressed+0.007
 marks+0.007
Neg logit contributions
igsaw-0.010
 newfound-0.008
lad-0.008
 million-0.008
Em-0.008
Token grew
Token ID6348
Feature activation+3.10e-02

Pos logit contributions
igsaw+0.019
 newfound+0.019
Em+0.019
lad+0.018
 hero+0.017
Neg logit contributions
 elbows-0.016
tails-0.015
w-0.014
 poop-0.014
mons-0.013
Token bigger
Token ID5749
Feature activation+2.29e-02

Pos logit contributions
 elbows+0.043
w+0.040
tails+0.040
 poop+0.040
 dressed+0.038
Neg logit contributions
igsaw-0.056
lad-0.050
 newfound-0.049
 importantly-0.047
Em-0.046
Token and
Token ID290
Feature activation-7.51e-03

Pos logit contributions
 elbows+0.034
w+0.033
tails+0.030
 dressed+0.030
 poop+0.029
Neg logit contributions
igsaw-0.040
 importantly-0.034
joy-0.034
lad-0.033
 hero-0.033
Token bigger
Token ID5749
Feature activation+3.02e-02

Pos logit contributions
igsaw+0.013
lad+0.011
Em+0.011
reetings+0.011
 newfound+0.010
Neg logit contributions
 elbows-0.011
 covered-0.011
 made-0.011
 poop-0.011
 smoke-0.011
Token.
Token ID13
Feature activation-7.05e-03

Pos logit contributions
w+0.039
 elbows+0.038
 dressed+0.037
tails+0.034
 poop+0.033
Neg logit contributions
igsaw-0.056
 importantly-0.050
lad-0.049
joy-0.049
 newfound-0.048
Token Before
Token ID7413
Feature activation+9.87e-02

Pos logit contributions
igsaw+0.013
lad+0.012
reetings+0.011
 newfound+0.011
Em+0.010
Neg logit contributions
 smoke-0.010
 made-0.009
 covered-0.009
 elbows-0.009
 poop-0.009
Token long
Token ID890
Feature activation+6.08e-02

Pos logit contributions
 elbows+0.158
 dressed+0.153
 poop+0.152
tails+0.150
 marks+0.148
Neg logit contributions
igsaw-0.166
 newfound-0.136
 importantly-0.133
lad-0.130
Em-0.129
Token,
Token ID11
Feature activation+1.06e-02

Pos logit contributions
 elbows+0.078
 poop+0.074
tails+0.074
 dressed+0.073
w+0.072
Neg logit contributions
igsaw-0.119
lad-0.103
 newfound-0.101
 importantly-0.099
reetings-0.097
Token the
Token ID262
Feature activation-1.05e-02

Pos logit contributions
 elbows+0.012
 poop+0.011
tails+0.011
 dressed+0.011
 marks+0.010
Neg logit contributions
igsaw-0.022
 newfound-0.020
lad-0.019
 importantly-0.019
Em-0.019
Token bush
Token ID24484
Feature activation+3.39e-03

Pos logit contributions
igsaw+0.019
lad+0.016
 newfound+0.016
 importantly+0.015
Em+0.015
Neg logit contributions
 elbows-0.016
 dressed-0.015
tails-0.014
 marks-0.014
 poop-0.014
Token was
Token ID373
Feature activation+1.29e-02

Pos logit contributions
 elbows+0.005
tails+0.004
 poop+0.004
w+0.004
 dressed+0.004
Neg logit contributions
igsaw-0.006
 newfound-0.006
lad-0.005
Em-0.005
 hero-0.005
Token a
Token ID257
Feature activation+9.38e-03

Pos logit contributions
igsaw+0.003
 newfound+0.003
 importantly+0.002
Em+0.002
 pleasure+0.002
Neg logit contributions
 elbows-0.002
w-0.001
 poop-0.001
 covered-0.001
 marks-0.001
Token mouse
Token ID10211
Feature activation-1.23e-02

Pos logit contributions
 poop+0.013
 smoke+0.013
 elbows+0.013
 made+0.013
 covered+0.013
Neg logit contributions
igsaw-0.018
lad-0.015
reetings-0.014
Em-0.014
 importantly-0.014
Token.
Token ID13
Feature activation+6.20e-03

Pos logit contributions
igsaw+0.024
 importantly+0.021
 hero+0.021
oke+0.021
Em+0.021
Neg logit contributions
 elbows-0.016
w-0.014
 marks-0.013
tails-0.012
 dressed-0.012
Token\n
Token ID198
Feature activation+6.70e-03

Pos logit contributions
 elbows+0.011
 poop+0.010
tails+0.010
 covered+0.010
 marks+0.010
Neg logit contributions
igsaw-0.010
lad-0.008
Em-0.008
 newfound-0.008
 importantly-0.008
Token\n
Token ID198
Feature activation+9.09e-04

Pos logit contributions
 elbows+0.011
 made+0.011
 poop+0.011
 dressed+0.010
 marks+0.010
Neg logit contributions
igsaw-0.011
lad-0.009
 newfound-0.009
 importantly-0.009
Em-0.009
TokenBefore
Token ID8421
Feature activation+9.87e-02

Pos logit contributions
 elbows+0.002
 marks+0.002
 smoke+0.002
tails+0.002
 poop+0.002
Neg logit contributions
igsaw-0.001
Em-0.001
lad-0.001
reetings-0.001
 importantly-0.001
Token the
Token ID262
Feature activation+1.00e-02

Pos logit contributions
w+0.142
 elbows+0.141
 poop+0.134
ms+0.133
 dressed+0.133
Neg logit contributions
igsaw-0.177
 newfound-0.149
 importantly-0.146
joy-0.142
Em-0.141
Token movie
Token ID3807
Feature activation+4.10e-02

Pos logit contributions
 elbows+0.017
 dressed+0.016
 covered+0.016
tails+0.016
 poop+0.016
Neg logit contributions
igsaw-0.015
 newfound-0.013
lad-0.012
 hero-0.012
 importantly-0.012
Token starts
Token ID4940
Feature activation+7.07e-02

Pos logit contributions
 elbows+0.066
tails+0.058
 poop+0.056
 dressed+0.055
w+0.054
Neg logit contributions
igsaw-0.075
lad-0.059
Em-0.059
oke-0.059
 hero-0.058
Token,
Token ID11
Feature activation+7.78e-03

Pos logit contributions
w+0.116
 elbows+0.109
 poop+0.102
mons+0.097
tails+0.097
Neg logit contributions
igsaw-0.112
 importantly-0.105
 newfound-0.094
oke-0.094
run-0.093
Token they
Token ID484
Feature activation+2.30e-02

Pos logit contributions
tails+0.012
 elbows+0.012
w+0.012
ms+0.011
 poop+0.011
Neg logit contributions
igsaw-0.012
 newfound-0.011
 importantly-0.010
 pleasure-0.010
joy-0.010
Token they
Token ID484
Feature activation+1.58e-02

Pos logit contributions
 elbows+0.031
w+0.028
 poop+0.028
tails+0.027
 smoke+0.026
Neg logit contributions
igsaw-0.051
lad-0.045
Em-0.044
 newfound-0.044
joy-0.044
Token pointed
Token ID6235
Feature activation-8.81e-03

Pos logit contributions
 elbows+0.018
 poop+0.017
 marks+0.016
tails+0.016
 smoke+0.016
Neg logit contributions
igsaw-0.031
Em-0.028
 newfound-0.027
 hero-0.026
 million-0.026
Token.
Token ID13
Feature activation-7.76e-03

Pos logit contributions
igsaw+0.018
 newfound+0.016
joy+0.015
 pleasure+0.015
lad+0.015
Neg logit contributions
 elbows-0.011
w-0.010
tails-0.010
 poop-0.010
 smoke-0.009
Token\n
Token ID198
Feature activation-1.41e-02

Pos logit contributions
igsaw+0.016
lad+0.015
reetings+0.014
 newfound+0.014
 importantly+0.013
Neg logit contributions
 smoke-0.009
 made-0.009
 poop-0.009
 elbows-0.009
 dressed-0.009
Token\n
Token ID198
Feature activation-1.91e-02

Pos logit contributions
igsaw+0.033
lad+0.027
joy+0.026
 importantly+0.026
 newfound+0.026
Neg logit contributions
 shouted-0.015
 made-0.015
 yelled-0.015
 dressed-0.014
 said-0.014
TokenBefore
Token ID8421
Feature activation+9.86e-02

Pos logit contributions
igsaw+0.034
lad+0.029
Em+0.027
 newfound+0.027
reetings+0.027
Neg logit contributions
 elbows-0.029
 poop-0.028
 covered-0.027
tails-0.027
 marks-0.027
Token they
Token ID484
Feature activation+4.63e-02

Pos logit contributions
 elbows+0.142
w+0.136
 poop+0.134
tails+0.134
 marks+0.132
Neg logit contributions
igsaw-0.186
 newfound-0.153
 importantly-0.150
Em-0.150
lad-0.143
Token knew
Token ID2993
Feature activation-1.19e-03

Pos logit contributions
 elbows+0.055
tails+0.050
 poop+0.050
 marks+0.048
 smoke+0.047
Neg logit contributions
igsaw-0.095
Em-0.083
 newfound-0.082
lad-0.082
 importantly-0.080
Token it
Token ID340
Feature activation-1.36e-02

Pos logit contributions
igsaw+0.003
lad+0.002
Em+0.002
 newfound+0.002
reetings+0.002
Neg logit contributions
 elbows-0.001
 poop-0.001
 covered-0.001
tails-0.001
 marks-0.001
Token,
Token ID11
Feature activation+9.75e-04

Pos logit contributions
igsaw+0.026
lad+0.023
Em+0.022
 newfound+0.022
reetings+0.022
Neg logit contributions
 elbows-0.019
 poop-0.017
tails-0.017
w-0.016
 marks-0.016
Token they
Token ID484
Feature activation+1.51e-02

Pos logit contributions
 elbows+0.001
tails+0.001
w+0.001
 poop+0.001
 marks+0.001
Neg logit contributions
igsaw-0.002
 newfound-0.002
 importantly-0.002
Em-0.002
lad-0.002
Token flowers
Token ID12734
Feature activation+9.98e-03

Pos logit contributions
 elbows+0.040
tails+0.035
 covered+0.033
 poop+0.032
w+0.032
Neg logit contributions
igsaw-0.072
 newfound-0.063
 importantly-0.063
lad-0.060
Em-0.059
Token to
Token ID284
Feature activation+7.47e-03

Pos logit contributions
 elbows+0.014
 smoke+0.012
 marks+0.011
 poop+0.011
w+0.011
Neg logit contributions
igsaw-0.020
Em-0.016
lad-0.016
 newfound-0.016
reetings-0.016
Token other
Token ID584
Feature activation+9.57e-03

Pos logit contributions
 elbows+0.007
 poop+0.006
 covered+0.006
tails+0.006
 marks+0.006
Neg logit contributions
igsaw-0.018
lad-0.016
Em-0.015
 newfound-0.015
reetings-0.015
Token people
Token ID661
Feature activation+9.24e-03

Pos logit contributions
 elbows+0.011
w+0.010
 covered+0.010
tails+0.009
 poop+0.009
Neg logit contributions
igsaw-0.020
 newfound-0.017
lad-0.017
 importantly-0.017
Em-0.017
Token.
Token ID13
Feature activation-5.33e-03

Pos logit contributions
 elbows+0.012
 smoke+0.011
 marks+0.011
 poop+0.010
w+0.010
Neg logit contributions
igsaw-0.018
Em-0.016
lad-0.015
 newfound-0.015
 importantly-0.015
Token Before
Token ID7413
Feature activation+9.85e-02

Pos logit contributions
igsaw+0.011
lad+0.010
reetings+0.010
 newfound+0.009
 importantly+0.009
Neg logit contributions
 made-0.007
 covered-0.007
 poop-0.006
 dressed-0.006
 when-0.006
Token she
Token ID673
Feature activation+2.02e-02

Pos logit contributions
 elbows+0.125
 poop+0.117
 marks+0.116
tails+0.115
 covered+0.114
Neg logit contributions
igsaw-0.201
lad-0.168
Em-0.167
 newfound-0.166
 importantly-0.164
Token left
Token ID1364
Feature activation+3.97e-02

Pos logit contributions
 made+0.019
 covered+0.016
 waited+0.016
 dressed+0.015
 said+0.014
Neg logit contributions
igsaw-0.054
lad-0.047
reetings-0.045
 importantly-0.044
Em-0.044
Token,
Token ID11
Feature activation+2.63e-03

Pos logit contributions
 elbows+0.040
 poop+0.038
 covered+0.038
 made+0.037
 smoke+0.037
Neg logit contributions
igsaw-0.091
lad-0.081
Em-0.078
reetings-0.077
 newfound-0.076
Token she
Token ID673
Feature activation+7.39e-03

Pos logit contributions
 elbows+0.003
 poop+0.003
 covered+0.003
 marks+0.003
tails+0.003
Neg logit contributions
igsaw-0.006
lad-0.005
Em-0.005
 newfound-0.005
reetings-0.005
Token gave
Token ID2921
Feature activation-7.44e-03

Pos logit contributions
 elbows+0.008
 poop+0.007
tails+0.007
 marks+0.007
 smoke+0.006
Neg logit contributions
igsaw-0.017
lad-0.015
Em-0.015
 newfound-0.014
 importantly-0.014
Token the
Token ID262
Feature activation+3.16e-04

Pos logit contributions
w+0.069
ms+0.062
 elbows+0.061
mons+0.061
 poop+0.060
Neg logit contributions
igsaw-0.064
joy-0.060
 newfound-0.058
 million-0.053
Em-0.053
Token drive
Token ID3708
Feature activation-1.71e-02

Pos logit contributions
 elbows+0.001
tails+0.001
w+0.001
ms+0.001
 poop+0.001
Neg logit contributions
igsaw-0.000
 newfound-0.000
 hero-0.000
 lifetime-0.000
 pleasure-0.000
Token back
Token ID736
Feature activation-2.04e-02

Pos logit contributions
igsaw+0.035
lad+0.029
Em+0.029
 newfound+0.029
joy+0.028
Neg logit contributions
 elbows-0.022
 poop-0.020
w-0.020
tails-0.019
 marks-0.019
Token home
Token ID1363
Feature activation-2.54e-03

Pos logit contributions
igsaw+0.045
lad+0.039
 newfound+0.038
 importantly+0.037
reetings+0.037
Neg logit contributions
 elbows-0.022
 poop-0.020
w-0.020
 smoke-0.020
tails-0.019
Token.
Token ID13
Feature activation-1.07e-02

Pos logit contributions
igsaw+0.006
lad+0.005
 newfound+0.005
 importantly+0.005
Em+0.005
Neg logit contributions
 elbows-0.003
w-0.002
 poop-0.002
 marks-0.002
tails-0.002
Token Before
Token ID7413
Feature activation+9.84e-02

Pos logit contributions
igsaw+0.022
lad+0.020
reetings+0.019
 newfound+0.019
Em+0.018
Neg logit contributions
 covered-0.012
 poop-0.012
 made-0.012
 elbows-0.012
 smoke-0.012
Token they
Token ID484
Feature activation+4.36e-02

Pos logit contributions
 elbows+0.138
 poop+0.130
 marks+0.130
w+0.129
tails+0.127
Neg logit contributions
igsaw-0.190
 newfound-0.155
 importantly-0.155
Em-0.153
lad-0.150
Token reached
Token ID4251
Feature activation+1.24e-02

Pos logit contributions
 elbows+0.042
 poop+0.039
 covered+0.039
 made+0.038
 marks+0.038
Neg logit contributions
igsaw-0.102
lad-0.090
Em-0.087
 newfound-0.086
reetings-0.086
Token the
Token ID262
Feature activation+9.29e-03

Pos logit contributions
 elbows+0.010
 covered+0.010
 made+0.010
 dressed+0.010
 poop+0.010
Neg logit contributions
igsaw-0.030
lad-0.027
Em-0.026
reetings-0.026
 importantly-0.026
Token house
Token ID2156
Feature activation+3.36e-02

Pos logit contributions
 elbows+0.015
tails+0.014
 poop+0.014
 covered+0.014
 marks+0.014
Neg logit contributions
igsaw-0.016
lad-0.013
 newfound-0.012
Em-0.012
reetings-0.012
Token,
Token ID11
Feature activation+9.45e-03

Pos logit contributions
 elbows+0.045
 poop+0.041
tails+0.041
w+0.041
 marks+0.040
Neg logit contributions
igsaw-0.067
lad-0.057
 newfound-0.056
Em-0.055
reetings-0.054
Token fireplace
Token ID44250
Feature activation+3.73e-02

Pos logit contributions
 elbows+0.020
tails+0.018
 poop+0.018
 dressed+0.018
 covered+0.018
Neg logit contributions
igsaw-0.025
lad-0.020
 newfound-0.020
Em-0.020
reetings-0.019
Token so
Token ID523
Feature activation+4.79e-02

Pos logit contributions
 elbows+0.065
w+0.058
tails+0.054
 poop+0.053
 marks+0.048
Neg logit contributions
igsaw-0.064
lad-0.048
 million-0.048
joy-0.048
Em-0.047
Token they
Token ID484
Feature activation+2.46e-02

Pos logit contributions
 elbows+0.062
tails+0.056
 poop+0.055
 marks+0.054
 dressed+0.051
Neg logit contributions
igsaw-0.096
Em-0.084
 newfound-0.083
 importantly-0.082
lad-0.082
Token could
Token ID714
Feature activation+3.30e-02

Pos logit contributions
 elbows+0.043
tails+0.038
 poop+0.037
 marks+0.036
w+0.034
Neg logit contributions
Em-0.033
igsaw-0.031
 newfound-0.030
 million-0.029
lad-0.028
Token roast
Token ID32595
Feature activation+6.81e-02

Pos logit contributions
 elbows+0.051
w+0.048
ups+0.045
ards+0.043
mons+0.042
Neg logit contributions
igsaw-0.050
 newfound-0.050
 pleasure-0.047
 brand-0.043
 hero-0.042
Token marsh
Token ID22397
Feature activation+9.82e-02

Pos logit contributions
 elbows+0.112
tails+0.100
w+0.098
aunts+0.088
ow+0.087
Neg logit contributions
igsaw-0.110
 newfound-0.097
 brand-0.091
 million-0.088
 importantly-0.084
Tokenm
Token ID76
Feature activation+4.59e-02

Pos logit contributions
 elbows+0.153
tails+0.146
 smoke+0.141
 marks+0.140
 poop+0.140
Neg logit contributions
igsaw-0.157
Em-0.137
lad-0.135
oke-0.132
 importantly-0.126
Tokenallows
Token ID47205
Feature activation+7.83e-02

Pos logit contributions
 elbows+0.084
Give+0.077
w+0.075
talk+0.073
 say+0.071
Neg logit contributions
lad-0.047
 importantly-0.042
igsaw-0.042
 newfound-0.041
 million-0.040
Token.
Token ID13
Feature activation+1.67e-04

Pos logit contributions
 elbows+0.116
tails+0.094
w+0.091
 poop+0.091
 marks+0.085
Neg logit contributions
igsaw-0.149
lad-0.127
Em-0.122
 million-0.120
 Nice-0.117
Token They
Token ID1119
Feature activation+2.74e-02

Pos logit contributions
 covered+0.000
 poop+0.000
 made+0.000
 smoke+0.000
 dressed+0.000
Neg logit contributions
igsaw-0.000
lad-0.000
reetings-0.000
 newfound-0.000
 importantly-0.000
Token were
Token ID547
Feature activation+1.62e-02

Pos logit contributions
 elbows+0.042
w+0.035
 poop+0.034
mons+0.034
uck+0.033
Neg logit contributions
Em-0.042
 million-0.041
 newfound-0.040
igsaw-0.038
 eighty-0.036
Token her
Token ID607
Feature activation+1.24e-02

Pos logit contributions
 elbows+0.004
 poop+0.003
 covered+0.003
 smoke+0.003
 made+0.003
Neg logit contributions
igsaw-0.009
lad-0.008
Em-0.007
 importantly-0.007
reetings-0.007
Token and
Token ID290
Feature activation-1.59e-03

Pos logit contributions
 elbows+0.016
w+0.016
 smoke+0.015
 poop+0.014
 dressed+0.014
Neg logit contributions
igsaw-0.025
 newfound-0.020
 hero-0.020
Em-0.020
lad-0.019
Token said
Token ID531
Feature activation-8.49e-03

Pos logit contributions
igsaw+0.003
Em+0.003
 newfound+0.003
lad+0.003
 importantly+0.003
Neg logit contributions
 elbows-0.002
 marks-0.002
 poop-0.001
w-0.001
tails-0.001
Token,
Token ID11
Feature activation-1.24e-02

Pos logit contributions
igsaw+0.020
lad+0.017
Em+0.017
 newfound+0.016
reetings+0.016
Neg logit contributions
 elbows-0.008
 poop-0.008
 made-0.008
 covered-0.008
 smoke-0.008
Token "
Token ID366
Feature activation-6.51e-03

Pos logit contributions
igsaw+0.017
lad+0.013
reetings+0.012
 newfound+0.012
Em+0.012
Neg logit contributions
 made-0.024
 poop-0.023
 covered-0.023
 elbows-0.023
 smoke-0.023
TokenEll
Token ID30639
Feature activation-1.13e-01

Pos logit contributions
igsaw+0.011
lad+0.009
Em+0.009
 newfound+0.008
reetings+0.008
Neg logit contributions
 elbows-0.011
 poop-0.010
 covered-0.010
 marks-0.010
tails-0.010
Tokenie
Token ID494
Feature activation-2.54e-02

Pos logit contributions
igsaw+0.276
 newfound+0.248
 importantly+0.241
 pleasure+0.240
Em+0.238
Neg logit contributions
tails-0.098
 marks-0.096
 covered-0.090
 poop-0.090
 made-0.088
Token,
Token ID11
Feature activation-7.79e-03

Pos logit contributions
igsaw+0.045
lad+0.038
 newfound+0.035
reetings+0.035
 hero+0.035
Neg logit contributions
 elbows-0.040
 dressed-0.037
 marks-0.035
 poop-0.034
 smoke-0.034
Token I
Token ID314
Feature activation-4.03e-02

Pos logit contributions
igsaw+0.017
lad+0.014
Em+0.014
 newfound+0.014
reetings+0.014
Neg logit contributions
 elbows-0.009
tails-0.008
 poop-0.008
 marks-0.008
 covered-0.008
Token found
Token ID1043
Feature activation-3.73e-02

Pos logit contributions
igsaw+0.061
reetings+0.059
lad+0.056
 newfound+0.056
 pleasure+0.055
Neg logit contributions
 elbows-0.071
ups-0.059
tails-0.058
ards-0.058
 noses-0.058
Token a
Token ID257
Feature activation-1.25e-02

Pos logit contributions
igsaw+0.083
lad+0.072
Em+0.070
 newfound+0.069
reetings+0.069
Neg logit contributions
 elbows-0.040
 poop-0.037
 covered-0.037
tails-0.037
 marks-0.036
Token her
Token ID607
Feature activation+1.83e-02

Pos logit contributions
 elbows+0.012
 poop+0.012
 made+0.012
 smoke+0.012
tails+0.012
Neg logit contributions
igsaw-0.026
lad-0.023
 importantly-0.022
reetings-0.022
Em-0.022
Token and
Token ID290
Feature activation+6.24e-04

Pos logit contributions
 elbows+0.022
w+0.021
 poop+0.020
 smoke+0.019
 dressed+0.019
Neg logit contributions
igsaw-0.039
 newfound-0.032
Em-0.032
lad-0.032
 importantly-0.031
Token said
Token ID531
Feature activation-5.25e-03

Pos logit contributions
 elbows+0.001
 marks+0.001
 poop+0.001
w+0.000
 smoke+0.000
Neg logit contributions
igsaw-0.001
Em-0.001
lad-0.001
 newfound-0.001
reetings-0.001
Token,
Token ID11
Feature activation-8.84e-03

Pos logit contributions
igsaw+0.012
lad+0.011
Em+0.010
 newfound+0.010
 importantly+0.010
Neg logit contributions
 made-0.005
 elbows-0.005
 poop-0.005
 smoke-0.005
 covered-0.005
Token "
Token ID366
Feature activation-4.76e-03

Pos logit contributions
igsaw+0.012
lad+0.009
 newfound+0.009
Em+0.009
reetings+0.008
Neg logit contributions
 made-0.017
 elbows-0.017
 poop-0.017
 covered-0.017
 smoke-0.017
TokenEll
Token ID30639
Feature activation-1.11e-01

Pos logit contributions
igsaw+0.008
lad+0.006
Em+0.006
 newfound+0.006
reetings+0.006
Neg logit contributions
 elbows-0.008
 poop-0.008
 covered-0.008
tails-0.008
 marks-0.008
Tokenie
Token ID494
Feature activation-2.26e-02

Pos logit contributions
igsaw+0.278
 newfound+0.253
 pleasure+0.247
 importantly+0.243
Em+0.239
Neg logit contributions
tails-0.092
 marks-0.088
 made-0.081
 covered-0.080
 poop-0.079
Token,
Token ID11
Feature activation-5.30e-03

Pos logit contributions
igsaw+0.041
lad+0.035
reetings+0.032
 newfound+0.032
Em+0.032
Neg logit contributions
 elbows-0.035
 dressed-0.032
 poop-0.030
 marks-0.030
 covered-0.029
Token I
Token ID314
Feature activation-3.79e-02

Pos logit contributions
igsaw+0.011
lad+0.010
Em+0.009
 newfound+0.009
reetings+0.009
Neg logit contributions
 elbows-0.007
tails-0.006
 poop-0.006
 marks-0.006
 covered-0.006
Token need
Token ID761
Feature activation+1.06e-02

Pos logit contributions
igsaw+0.059
reetings+0.054
lad+0.051
 newfound+0.050
 pleasure+0.049
Neg logit contributions
 elbows-0.068
tails-0.056
ups-0.055
 noses-0.055
 marks-0.054
Token your
Token ID534
Feature activation-2.10e-03

Pos logit contributions
 elbows+0.010
 poop+0.009
 covered+0.009
tails+0.009
 marks+0.008
Neg logit contributions
igsaw-0.025
lad-0.022
Em-0.022
 newfound-0.021
reetings-0.021
Token Her
Token ID2332
Feature activation-6.08e-03

Pos logit contributions
igsaw+0.004
lad+0.003
reetings+0.003
 newfound+0.003
 importantly+0.003
Neg logit contributions
tails-0.003
 elbows-0.003
 made-0.003
 poop-0.003
 covered-0.003
Token friend
Token ID1545
Feature activation-2.05e-02

Pos logit contributions
igsaw+0.010
lad+0.008
Em+0.008
reetings+0.008
 importantly+0.008
Neg logit contributions
tails-0.010
 poop-0.010
 marks-0.010
 made-0.010
 covered-0.010
Token said
Token ID531
Feature activation-8.66e-03

Pos logit contributions
igsaw+0.043
lad+0.037
Em+0.036
 newfound+0.035
reetings+0.035
Neg logit contributions
 elbows-0.025
 poop-0.023
 covered-0.023
tails-0.023
 made-0.023
Token,
Token ID11
Feature activation-1.41e-02

Pos logit contributions
igsaw+0.020
lad+0.018
Em+0.017
reetings+0.017
 newfound+0.017
Neg logit contributions
 elbows-0.008
 made-0.008
 poop-0.008
 covered-0.008
 smoke-0.008
Token "
Token ID366
Feature activation-8.64e-03

Pos logit contributions
igsaw+0.019
lad+0.015
 newfound+0.014
reetings+0.014
 importantly+0.014
Neg logit contributions
 poop-0.027
 made-0.027
 covered-0.027
 elbows-0.027
 smoke-0.027
TokenEll
Token ID30639
Feature activation-1.10e-01

Pos logit contributions
igsaw+0.011
lad+0.009
Em+0.008
reetings+0.008
 newfound+0.008
Neg logit contributions
 elbows-0.018
 poop-0.017
 marks-0.017
 covered-0.017
tails-0.017
Tokenie
Token ID494
Feature activation-2.60e-02

Pos logit contributions
igsaw+0.276
 newfound+0.250
 importantly+0.244
 pleasure+0.243
Em+0.237
Neg logit contributions
 marks-0.087
tails-0.086
 covered-0.082
 made-0.080
 poop-0.079
Token,
Token ID11
Feature activation-9.63e-03

Pos logit contributions
igsaw+0.047
lad+0.041
reetings+0.038
 newfound+0.037
Em+0.037
Neg logit contributions
 elbows-0.040
 dressed-0.036
 marks-0.034
 poop-0.034
 covered-0.033
Token come
Token ID1282
Feature activation-1.53e-02

Pos logit contributions
igsaw+0.020
lad+0.017
Em+0.016
 newfound+0.016
reetings+0.016
Neg logit contributions
 elbows-0.012
tails-0.011
 poop-0.011
 marks-0.011
 covered-0.011
Token quick
Token ID2068
Feature activation+9.22e-03

Pos logit contributions
igsaw+0.035
lad+0.031
Em+0.030
 newfound+0.030
reetings+0.030
Neg logit contributions
 elbows-0.015
 poop-0.014
 marks-0.014
 smoke-0.014
 covered-0.014
Token!
Token ID0
Feature activation-1.60e-02

Pos logit contributions
 elbows+0.015
w+0.012
mons+0.012
 poop+0.012
tails+0.012
Neg logit contributions
igsaw-0.016
Em-0.014
joy-0.014
 hero-0.014
lad-0.013
Token It
Token ID632
Feature activation-2.20e-02

Pos logit contributions
lad+0.020
igsaw+0.020
reetings+0.018
 lifetime+0.017
 newfound+0.017
Neg logit contributions
 dressed-0.011
 covered-0.011
 made-0.011
 like-0.010
 when-0.010
Token was
Token ID373
Feature activation+1.29e-02

Pos logit contributions
igsaw+0.041
Em+0.037
 newfound+0.035
lad+0.035
 importantly+0.034
Neg logit contributions
 elbows-0.032
 poop-0.027
w-0.027
tails-0.027
 marks-0.027
Token a
Token ID257
Feature activation-1.67e-02

Pos logit contributions
 elbows+0.017
w+0.015
tails+0.015
 poop+0.015
 marks+0.014
Neg logit contributions
igsaw-0.026
 newfound-0.022
Em-0.022
 importantly-0.022
lad-0.021
Token big
Token ID1263
Feature activation-1.05e-02

Pos logit contributions
igsaw+0.032
lad+0.030
reetings+0.028
Em+0.026
 importantly+0.026
Neg logit contributions
 made-0.022
 poop-0.022
 covered-0.022
 smoke-0.021
 elbows-0.021
Token j
Token ID474
Feature activation-3.08e-02

Pos logit contributions
igsaw+0.021
lad+0.019
Em+0.018
reetings+0.018
 newfound+0.017
Neg logit contributions
 elbows-0.013
 poop-0.013
 covered-0.013
 made-0.013
 marks-0.012
Tokenigsaw
Token ID45636
Feature activation-1.07e-01

Pos logit contributions
igsaw+0.033
lad+0.032
 newfound+0.029
reetings+0.029
Em+0.029
Neg logit contributions
 dressed-0.059
 covered-0.058
 poop-0.058
 marks-0.057
 made-0.057
Token puzzle
Token ID15027
Feature activation-3.30e-02

Pos logit contributions
lad+0.164
igsaw+0.162
reetings+0.161
 newfound+0.147
joy+0.141
Neg logit contributions
 made-0.169
 covered-0.164
 marks-0.156
 dressed-0.156
 smoke-0.154
Token and
Token ID290
Feature activation-9.99e-03

Pos logit contributions
igsaw+0.072
Em+0.061
 importantly+0.060
lad+0.059
 newfound+0.058
Neg logit contributions
 elbows-0.042
 poop-0.036
w-0.035
tails-0.035
 marks-0.034
Token she
Token ID673
Feature activation-1.07e-02

Pos logit contributions
igsaw+0.022
lad+0.019
Em+0.019
 newfound+0.019
 importantly+0.019
Neg logit contributions
 elbows-0.011
 poop-0.010
tails-0.010
 marks-0.010
 smoke-0.010
Token wanted
Token ID2227
Feature activation-2.38e-02

Pos logit contributions
igsaw+0.020
Em+0.018
lad+0.017
 newfound+0.017
 importantly+0.017
Neg logit contributions
 elbows-0.016
 poop-0.014
tails-0.014
 smoke-0.013
 marks-0.013
Token to
Token ID284
Feature activation-7.14e-03

Pos logit contributions
igsaw+0.054
Em+0.046
lad+0.046
 newfound+0.046
 importantly+0.045
Neg logit contributions
 elbows-0.026
 poop-0.022
tails-0.022
w-0.022
 covered-0.022
Token recommendation
Token ID15602
Feature activation-2.21e-02

Pos logit contributions
 elbows+0.046
 covered+0.044
 poop+0.043
 dressed+0.043
tails+0.043
Neg logit contributions
igsaw-0.052
lad-0.044
 newfound-0.043
 importantly-0.041
reetings-0.041
Token and
Token ID290
Feature activation-1.40e-02

Pos logit contributions
igsaw+0.049
lad+0.044
Em+0.042
reetings+0.042
 newfound+0.042
Neg logit contributions
 elbows-0.024
tails-0.021
 poop-0.021
 smoke-0.019
 marks-0.019
Token picked
Token ID6497
Feature activation-2.76e-02

Pos logit contributions
igsaw+0.029
lad+0.026
 newfound+0.025
 importantly+0.025
Em+0.025
Neg logit contributions
 elbows-0.016
 poop-0.015
tails-0.015
 marks-0.014
w-0.014
Token the
Token ID262
Feature activation-1.53e-02

Pos logit contributions
igsaw+0.070
lad+0.063
Em+0.061
 newfound+0.060
reetings+0.060
Neg logit contributions
 elbows-0.020
 poop-0.019
 marks-0.018
 covered-0.018
 smoke-0.018
Token right
Token ID826
Feature activation-3.57e-02

Pos logit contributions
igsaw+0.026
lad+0.023
Em+0.021
 importantly+0.021
reetings+0.021
Neg logit contributions
 marks-0.026
 made-0.026
 smoke-0.026
 smell-0.025
 poop-0.025
Token av
Token ID1196
Feature activation-1.05e-01

Pos logit contributions
igsaw+0.072
Em+0.065
lad+0.064
 importantly+0.062
reetings+0.059
Neg logit contributions
 marks-0.044
 made-0.044
 poop-0.043
 smoke-0.043
 covered-0.043
Tokenoc
Token ID420
Feature activation-5.75e-02

Pos logit contributions
igsaw+0.147
lad+0.118
Em+0.117
 importantly+0.115
 newfound+0.111
Neg logit contributions
tails-0.192
 poop-0.190
 made-0.190
 elbows-0.189
w-0.188
Tokenados
Token ID22484
Feature activation-3.93e-03

Pos logit contributions
igsaw+0.105
Em+0.099
 importantly+0.094
 newfound+0.093
lad+0.090
Neg logit contributions
 marks-0.078
 poop-0.077
 elbows-0.077
tails-0.076
w-0.076
Token.
Token ID13
Feature activation-1.55e-02

Pos logit contributions
igsaw+0.009
lad+0.008
Em+0.008
 newfound+0.008
reetings+0.008
Neg logit contributions
 elbows-0.004
 poop-0.003
tails-0.003
 smoke-0.003
 marks-0.003
Token\n
Token ID198
Feature activation-1.79e-02

Pos logit contributions
igsaw+0.030
reetings+0.026
 newfound+0.026
lad+0.025
 importantly+0.024
Neg logit contributions
 said-0.019
 covered-0.019
 made-0.019
 dressed-0.018
 smoke-0.018
Token\n
Token ID198
Feature activation-2.25e-02

Pos logit contributions
igsaw+0.038
 importantly+0.030
lad+0.030
 newfound+0.030
 pleasure+0.030
Neg logit contributions
 poop-0.020
 shouted-0.019
 dressed-0.019
 made-0.019
 marks-0.019
Token girl
Token ID2576
Feature activation-5.46e-02

Pos logit contributions
igsaw+0.098
lad+0.082
Em+0.080
 newfound+0.078
reetings+0.078
Neg logit contributions
 elbows-0.080
 poop-0.075
 covered-0.074
tails-0.074
 marks-0.073
Token would
Token ID561
Feature activation-3.93e-03

Pos logit contributions
igsaw+0.110
lad+0.096
 newfound+0.091
Em+0.091
reetings+0.091
Neg logit contributions
 elbows-0.072
 poop-0.066
 smoke-0.063
 covered-0.063
 marks-0.063
Token make
Token ID787
Feature activation+3.10e-02

Pos logit contributions
igsaw+0.008
lad+0.007
 newfound+0.007
reetings+0.007
Em+0.007
Neg logit contributions
 elbows-0.005
 poop-0.005
w-0.005
tails-0.004
 marks-0.004
Token a
Token ID257
Feature activation-2.46e-03

Pos logit contributions
 elbows+0.039
w+0.037
 covered+0.035
 poop+0.034
 waited+0.033
Neg logit contributions
igsaw-0.063
 newfound-0.054
Em-0.051
 lifelong-0.051
lad-0.050
Token great
Token ID1049
Feature activation-2.55e-02

Pos logit contributions
igsaw+0.004
lad+0.003
 newfound+0.003
Em+0.003
reetings+0.003
Neg logit contributions
 elbows-0.005
 poop-0.004
 covered-0.004
tails-0.004
 marks-0.004
Token fer
Token ID11354
Feature activation-1.04e-01

Pos logit contributions
igsaw+0.046
 newfound+0.036
lad+0.034
 pleasure+0.034
 importantly+0.032
Neg logit contributions
 elbows-0.044
w-0.041
tails-0.039
 poop-0.039
 covered-0.038
Tokenret
Token ID1186
Feature activation-6.07e-02

Pos logit contributions
igsaw+0.155
lad+0.128
Em+0.127
 importantly+0.124
 newfound+0.123
Neg logit contributions
 elbows-0.174
tails-0.174
w-0.173
 marks-0.171
 dressed-0.170
Token parent
Token ID2560
Feature activation-6.42e-03

Pos logit contributions
igsaw+0.117
lad+0.100
Em+0.097
 newfound+0.096
 importantly+0.095
Neg logit contributions
 elbows-0.079
 marks-0.076
 poop-0.076
tails-0.076
 covered-0.076
Token.
Token ID13
Feature activation-1.21e-02

Pos logit contributions
igsaw+0.015
lad+0.013
Em+0.013
 newfound+0.013
reetings+0.013
Neg logit contributions
 elbows-0.006
 poop-0.006
 covered-0.005
tails-0.005
 marks-0.005
Token
Token ID220
Feature activation-1.75e-02

Pos logit contributions
igsaw+0.023
lad+0.021
reetings+0.020
 newfound+0.019
 importantly+0.019
Neg logit contributions
 dressed-0.015
 made-0.015
 elbows-0.015
tails-0.015
 say-0.015
Token\n
Token ID198
Feature activation-1.53e-02

Pos logit contributions
igsaw+0.035
 importantly+0.030
lad+0.030
 newfound+0.030
 lifetime+0.028
Neg logit contributions
 dressed-0.020
tails-0.020
 marks-0.020
 say-0.019
 made-0.019
Token every
Token ID790
Feature activation+1.29e-02

Pos logit contributions
igsaw+0.055
lad+0.047
 newfound+0.046
Em+0.045
reetings+0.045
Neg logit contributions
 elbows-0.030
 poop-0.028
 marks-0.027
 smoke-0.027
 covered-0.027
Token day
Token ID1110
Feature activation-1.12e-03

Pos logit contributions
 made+0.011
 water+0.010
 covered+0.010
 smoke+0.010
 dressed+0.009
Neg logit contributions
igsaw-0.030
lad-0.030
oke-0.029
Em-0.028
reetings-0.028
Token to
Token ID284
Feature activation-4.34e-03

Pos logit contributions
igsaw+0.002
 newfound+0.002
 importantly+0.002
lad+0.002
Em+0.002
Neg logit contributions
 elbows-0.001
 poop-0.001
w-0.001
tails-0.001
 smoke-0.001
Token supply
Token ID5127
Feature activation+7.67e-03

Pos logit contributions
igsaw+0.011
Em+0.010
lad+0.010
 importantly+0.010
oke+0.010
Neg logit contributions
 made-0.003
 smoke-0.003
 say-0.003
 smell-0.003
 marks-0.003
Token the
Token ID262
Feature activation-2.23e-02

Pos logit contributions
 made+0.007
 marks+0.007
 elbows+0.007
 smoke+0.007
 poop+0.006
Neg logit contributions
igsaw-0.018
lad-0.016
Em-0.016
reetings-0.015
 importantly-0.015
Token av
Token ID1196
Feature activation-1.03e-01

Pos logit contributions
igsaw+0.042
lad+0.037
Em+0.035
reetings+0.035
 Nice+0.034
Neg logit contributions
 marks-0.033
 made-0.032
 smoke-0.031
 covered-0.031
 smell-0.031
Tokenoc
Token ID420
Feature activation-4.14e-02

Pos logit contributions
igsaw+0.151
lad+0.123
Em+0.123
 importantly+0.120
reetings+0.115
Neg logit contributions
tails-0.182
 made-0.178
 marks-0.176
 covered-0.176
 poop-0.174
Tokenados
Token ID22484
Feature activation-8.01e-03

Pos logit contributions
igsaw+0.075
Em+0.070
 newfound+0.066
 importantly+0.066
lad+0.063
Neg logit contributions
 marks-0.058
 poop-0.055
 elbows-0.055
tails-0.054
w-0.054
Token.
Token ID13
Feature activation-1.32e-02

Pos logit contributions
igsaw+0.019
lad+0.017
Em+0.017
 newfound+0.016
reetings+0.016
Neg logit contributions
 elbows-0.007
 poop-0.006
 covered-0.006
tails-0.006
 marks-0.006
Token And
Token ID843
Feature activation+2.13e-02

Pos logit contributions
igsaw+0.024
lad+0.022
reetings+0.021
Em+0.021
 newfound+0.021
Neg logit contributions
 smoke-0.016
 made-0.016
 dressed-0.016
 marks-0.016
 like-0.015
Token the
Token ID262
Feature activation-2.04e-02

Pos logit contributions
 elbows+0.025
 poop+0.023
tails+0.022
 marks+0.022
 smoke+0.022
Neg logit contributions
igsaw-0.046
lad-0.040
 newfound-0.040
 importantly-0.039
Em-0.039
Token would
Token ID561
Feature activation+1.12e-02

Pos logit contributions
 elbows+0.007
 poop+0.007
tails+0.007
 smoke+0.006
 marks+0.006
Neg logit contributions
igsaw-0.011
lad-0.009
Em-0.009
 newfound-0.009
reetings-0.009
Token take
Token ID1011
Feature activation-2.46e-02

Pos logit contributions
 elbows+0.013
 poop+0.012
w+0.012
tails+0.011
 marks+0.011
Neg logit contributions
igsaw-0.024
lad-0.021
Em-0.020
 newfound-0.020
reetings-0.020
Token out
Token ID503
Feature activation-6.91e-03

Pos logit contributions
igsaw+0.057
lad+0.051
Em+0.049
reetings+0.049
 importantly+0.048
Neg logit contributions
 elbows-0.024
 marks-0.022
 made-0.022
 poop-0.022
 covered-0.022
Token a
Token ID257
Feature activation+1.55e-03

Pos logit contributions
igsaw+0.015
lad+0.014
reetings+0.013
Em+0.013
 importantly+0.013
Neg logit contributions
 elbows-0.007
 marks-0.007
 made-0.007
 covered-0.007
 smoke-0.007
Token j
Token ID474
Feature activation-2.87e-02

Pos logit contributions
 made+0.002
 smell+0.002
 marks+0.002
 smoke+0.002
 spill+0.002
Neg logit contributions
lad-0.003
reetings-0.003
 importantly-0.003
igsaw-0.003
oke-0.003
Tokenigsaw
Token ID45636
Feature activation-1.03e-01

Pos logit contributions
igsaw+0.032
lad+0.030
Em+0.029
reetings+0.028
 newfound+0.027
Neg logit contributions
 marks-0.054
 poop-0.054
 elbows-0.054
 covered-0.053
 dressed-0.053
Token puzzle
Token ID15027
Feature activation-2.27e-02

Pos logit contributions
lad+0.143
reetings+0.142
igsaw+0.136
 newfound+0.127
 pleasure+0.126
Neg logit contributions
 made-0.176
 covered-0.170
 marks-0.169
 smoke-0.161
 dressed-0.160
Token and
Token ID290
Feature activation-8.77e-03

Pos logit contributions
igsaw+0.049
Em+0.040
 newfound+0.039
 importantly+0.039
lad+0.039
Neg logit contributions
 elbows-0.029
w-0.025
 poop-0.025
tails-0.025
 smoke-0.024
Token try
Token ID1949
Feature activation-1.42e-04

Pos logit contributions
igsaw+0.019
 newfound+0.016
Em+0.015
lad+0.015
 importantly+0.015
Neg logit contributions
 elbows-0.010
w-0.009
 poop-0.009
tails-0.009
 covered-0.008
Token to
Token ID284
Feature activation-5.40e-04

Pos logit contributions
igsaw+0.000
lad+0.000
Em+0.000
 importantly+0.000
reetings+0.000
Neg logit contributions
 marks-0.000
 made-0.000
 elbows-0.000
 poop-0.000
 dressed-0.000
Token solve
Token ID8494
Feature activation-3.44e-02

Pos logit contributions
igsaw+0.001
lad+0.001
Em+0.001
 newfound+0.001
reetings+0.001
Neg logit contributions
 elbows-0.001
 poop-0.001
tails-0.001
 covered-0.001
 dressed-0.001
Token easily
Token ID3538
Feature activation+1.22e-03

Pos logit contributions
igsaw+0.023
lad+0.019
 newfound+0.019
Em+0.019
 importantly+0.018
Neg logit contributions
 elbows-0.015
 poop-0.014
tails-0.013
 smoke-0.013
w-0.013
Token.
Token ID13
Feature activation-8.98e-03

Pos logit contributions
 elbows+0.001
tails+0.001
 poop+0.001
 smoke+0.001
 marks+0.001
Neg logit contributions
igsaw-0.003
lad-0.002
Em-0.002
 newfound-0.002
reetings-0.002
Token\n
Token ID198
Feature activation-8.58e-03

Pos logit contributions
igsaw+0.018
lad+0.017
reetings+0.016
 newfound+0.016
 importantly+0.015
Neg logit contributions
 poop-0.011
 covered-0.011
 elbows-0.011
 made-0.010
tails-0.010
Token\n
Token ID198
Feature activation-1.47e-02

Pos logit contributions
igsaw+0.020
lad+0.017
 pleasure+0.017
 newfound+0.017
 hero+0.017
Neg logit contributions
 yelled-0.008
 shouted-0.008
 said-0.007
 made-0.007
 dressed-0.007
Token"
Token ID1
Feature activation+2.71e-02

Pos logit contributions
igsaw+0.028
lad+0.024
 pleasure+0.024
 newfound+0.023
joy+0.023
Neg logit contributions
tails-0.019
 made-0.019
 elbows-0.019
 poop-0.019
 covered-0.019
TokenEll
Token ID30639
Feature activation-1.03e-01

Pos logit contributions
 elbows+0.042
tails+0.040
 marks+0.040
 poop+0.039
 smoke+0.039
Neg logit contributions
igsaw-0.047
Em-0.042
lad-0.040
reetings-0.038
 newfound-0.037
Tokenie
Token ID494
Feature activation-1.47e-02

Pos logit contributions
igsaw+0.234
 newfound+0.200
lad+0.199
Em+0.197
 importantly+0.196
Neg logit contributions
 elbows-0.105
 marks-0.102
 covered-0.101
tails-0.101
 poop-0.100
Token,
Token ID11
Feature activation-9.69e-03

Pos logit contributions
igsaw+0.027
lad+0.023
reetings+0.022
 newfound+0.022
Em+0.022
Neg logit contributions
 elbows-0.022
 dressed-0.020
 poop-0.019
 marks-0.019
 smoke-0.019
Token you
Token ID345
Feature activation-2.95e-02

Pos logit contributions
igsaw+0.019
lad+0.017
Em+0.016
 newfound+0.016
reetings+0.016
Neg logit contributions
 elbows-0.013
tails-0.012
 poop-0.012
 marks-0.011
 covered-0.011
Token're
Token ID821
Feature activation+1.18e-02

Pos logit contributions
igsaw+0.060
lad+0.052
Em+0.050
 newfound+0.049
reetings+0.049
Neg logit contributions
 elbows-0.038
 poop-0.035
tails-0.035
 marks-0.034
 dressed-0.034
Token too
Token ID1165
Feature activation+2.69e-02

Pos logit contributions
 elbows+0.014
tails+0.013
 poop+0.013
 marks+0.013
 smoke+0.012
Neg logit contributions
igsaw-0.025
lad-0.022
Em-0.021
 newfound-0.021
 importantly-0.021
Token said
Token ID531
Feature activation-3.30e-03

Pos logit contributions
 elbows+0.024
 poop+0.020
tails+0.020
 marks+0.019
w+0.019
Neg logit contributions
igsaw-0.040
Em-0.039
lad-0.037
reetings-0.036
 importantly-0.036
Token to
Token ID284
Feature activation-2.01e-02

Pos logit contributions
igsaw+0.007
lad+0.006
Em+0.006
 newfound+0.006
reetings+0.006
Neg logit contributions
 elbows-0.004
 poop-0.004
tails-0.004
 marks-0.004
 covered-0.004
Token her
Token ID607
Feature activation+3.15e-03

Pos logit contributions
igsaw+0.055
lad+0.051
 importantly+0.050
reetings+0.049
oke+0.049
Neg logit contributions
 smoke-0.011
 made-0.011
 elbows-0.010
 poop-0.010
 covered-0.009
Token,
Token ID11
Feature activation-2.02e-03

Pos logit contributions
 elbows+0.004
 poop+0.004
 covered+0.003
tails+0.003
 marks+0.003
Neg logit contributions
igsaw-0.007
lad-0.006
Em-0.006
 newfound-0.006
reetings-0.006
Token "
Token ID366
Feature activation-7.55e-04

Pos logit contributions
igsaw+0.002
lad+0.002
 newfound+0.002
Em+0.002
reetings+0.002
Neg logit contributions
 elbows-0.004
 covered-0.004
 poop-0.004
 made-0.004
 smoke-0.004
TokenEll
Token ID30639
Feature activation-1.02e-01

Pos logit contributions
igsaw+0.001
lad+0.001
Em+0.001
reetings+0.001
 newfound+0.001
Neg logit contributions
 elbows-0.001
 poop-0.001
 marks-0.001
 covered-0.001
tails-0.001
Tokenie
Token ID494
Feature activation-1.50e-02

Pos logit contributions
igsaw+0.219
lad+0.189
 newfound+0.183
Em+0.183
 importantly+0.181
Neg logit contributions
 elbows-0.118
 poop-0.112
tails-0.111
 marks-0.111
 covered-0.111
Token,
Token ID11
Feature activation-3.34e-03

Pos logit contributions
igsaw+0.026
lad+0.022
reetings+0.020
Em+0.020
 newfound+0.020
Neg logit contributions
 elbows-0.025
 dressed-0.022
 marks-0.021
 poop-0.021
 smoke-0.021
Token singing
Token ID13777
Feature activation-1.53e-02

Pos logit contributions
igsaw+0.006
lad+0.006
Em+0.005
reetings+0.005
 newfound+0.005
Neg logit contributions
 elbows-0.005
tails-0.004
 marks-0.004
 poop-0.004
 covered-0.004
Token is
Token ID318
Feature activation+1.51e-02

Pos logit contributions
igsaw+0.033
lad+0.029
Em+0.028
 newfound+0.027
reetings+0.027
Neg logit contributions
 elbows-0.018
 poop-0.016
tails-0.016
 marks-0.016
 covered-0.016
Token good
Token ID922
Feature activation+1.68e-03

Pos logit contributions
 elbows+0.023
tails+0.020
 poop+0.020
 marks+0.020
w+0.020
Neg logit contributions
igsaw-0.027
lad-0.023
Em-0.023
 newfound-0.023
 importantly-0.023
Token,
Token ID11
Feature activation-7.75e-03

Pos logit contributions
igsaw+0.063
lad+0.053
 newfound+0.053
 importantly+0.052
Em+0.052
Neg logit contributions
 elbows-0.032
w-0.028
 smoke-0.027
 poop-0.027
 marks-0.027
Token and
Token ID290
Feature activation-1.23e-04

Pos logit contributions
igsaw+0.017
lad+0.015
 newfound+0.014
Em+0.014
 importantly+0.014
Neg logit contributions
 elbows-0.009
tails-0.008
 poop-0.008
 marks-0.008
 covered-0.008
Token all
Token ID477
Feature activation-3.41e-02

Pos logit contributions
igsaw+0.000
 newfound+0.000
lad+0.000
Em+0.000
 importantly+0.000
Neg logit contributions
 elbows-0.000
 poop-0.000
tails-0.000
 marks-0.000
w-0.000
Token the
Token ID262
Feature activation-1.08e-02

Pos logit contributions
igsaw+0.069
lad+0.060
Em+0.058
 newfound+0.058
 importantly+0.057
Neg logit contributions
 elbows-0.044
 poop-0.040
tails-0.040
 marks-0.040
 covered-0.038
Token other
Token ID584
Feature activation-2.48e-03

Pos logit contributions
igsaw+0.024
lad+0.021
Em+0.020
reetings+0.020
 newfound+0.020
Neg logit contributions
 elbows-0.012
 poop-0.011
 marks-0.011
 smoke-0.011
 covered-0.011
Token av
Token ID1196
Feature activation-1.01e-01

Pos logit contributions
igsaw+0.005
lad+0.004
Em+0.004
 newfound+0.004
 importantly+0.004
Neg logit contributions
 elbows-0.003
 poop-0.003
tails-0.003
 covered-0.003
 marks-0.003
Tokenoc
Token ID420
Feature activation-4.32e-02

Pos logit contributions
igsaw+0.124
lad+0.095
Em+0.089
 newfound+0.087
reetings+0.086
Neg logit contributions
 elbows-0.211
 poop-0.204
 covered-0.202
tails-0.202
 marks-0.202
Tokenados
Token ID22484
Feature activation-5.03e-03

Pos logit contributions
igsaw+0.074
Em+0.063
 newfound+0.061
lad+0.061
 importantly+0.059
Neg logit contributions
 elbows-0.068
 marks-0.066
 poop-0.065
tails-0.065
w-0.064
Token were
Token ID547
Feature activation-7.34e-03

Pos logit contributions
igsaw+0.010
lad+0.009
Em+0.009
 newfound+0.009
 importantly+0.009
Neg logit contributions
 elbows-0.006
 poop-0.005
tails-0.005
 marks-0.005
 smoke-0.005
Token very
Token ID845
Feature activation+1.08e-02

Pos logit contributions
igsaw+0.015
lad+0.013
Em+0.012
 newfound+0.012
reetings+0.012
Neg logit contributions
 elbows-0.009
 poop-0.009
 covered-0.009
tails-0.009
 marks-0.009
Token happy
Token ID3772
Feature activation-1.44e-02

Pos logit contributions
 elbows+0.018
 covered+0.017
 made+0.017
 poop+0.017
 marks+0.017
Neg logit contributions
igsaw-0.018
lad-0.014
Em-0.014
reetings-0.014
 newfound-0.013
Token box
Token ID3091
Feature activation-1.18e-02

Pos logit contributions
 made+0.012
 marks+0.012
 poop+0.012
 covered+0.012
 smoke+0.011
Neg logit contributions
igsaw-0.022
lad-0.020
reetings-0.020
 importantly-0.019
Em-0.019
Token and
Token ID290
Feature activation+8.89e-03

Pos logit contributions
igsaw+0.024
 newfound+0.020
lad+0.020
 importantly+0.019
Em+0.019
Neg logit contributions
 elbows-0.016
 poop-0.014
w-0.014
tails-0.014
 smoke-0.013
Token saw
Token ID2497
Feature activation-5.45e-03

Pos logit contributions
 elbows+0.009
 poop+0.008
tails+0.008
 marks+0.008
 smoke+0.007
Neg logit contributions
igsaw-0.021
lad-0.018
Em-0.018
 newfound-0.018
 importantly-0.017
Token a
Token ID257
Feature activation+8.14e-03

Pos logit contributions
igsaw+0.012
 newfound+0.010
lad+0.010
 importantly+0.010
Em+0.010
Neg logit contributions
 elbows-0.006
 poop-0.006
tails-0.006
w-0.006
 dressed-0.005
Token j
Token ID474
Feature activation-2.20e-02

Pos logit contributions
 made+0.011
 covered+0.011
 smoke+0.011
 poop+0.011
 marks+0.011
Neg logit contributions
igsaw-0.015
lad-0.014
reetings-0.013
Em-0.013
 importantly-0.012
Tokenigsaw
Token ID45636
Feature activation-1.01e-01

Pos logit contributions
igsaw+0.024
lad+0.022
Em+0.021
reetings+0.020
 newfound+0.020
Neg logit contributions
 covered-0.042
 poop-0.042
 marks-0.042
 dressed-0.042
 elbows-0.042
Token puzzle
Token ID15027
Feature activation-2.00e-02

Pos logit contributions
igsaw+0.162
lad+0.160
reetings+0.157
 newfound+0.146
 importantly+0.145
Neg logit contributions
 made-0.149
 covered-0.146
 marks-0.140
 smoke-0.137
 dressed-0.134
Token inside
Token ID2641
Feature activation-5.88e-03

Pos logit contributions
igsaw+0.045
lad+0.037
 importantly+0.037
Em+0.037
 newfound+0.037
Neg logit contributions
 elbows-0.024
tails-0.021
 poop-0.020
w-0.020
 dressed-0.019
Token.
Token ID13
Feature activation-1.01e-02

Pos logit contributions
igsaw+0.012
lad+0.011
 importantly+0.011
 newfound+0.011
joy+0.010
Neg logit contributions
 elbows-0.007
tails-0.007
 poop-0.006
w-0.006
mons-0.006
Token He
Token ID679
Feature activation+2.35e-03

Pos logit contributions
igsaw+0.021
lad+0.020
reetings+0.020
 newfound+0.019
 lifelong+0.018
Neg logit contributions
 covered-0.011
 said-0.011
 dressed-0.011
 made-0.011
 water-0.010
Token paused
Token ID24487
Feature activation-7.43e-03

Pos logit contributions
 made+0.002
 covered+0.002
 elbows+0.002
 marks+0.002
 dressed+0.002
Neg logit contributions
igsaw-0.006
lad-0.005
reetings-0.005
Em-0.005
 newfound-0.005
Token fish
Token ID5916
Feature activation+1.97e-02

Pos logit contributions
igsaw+0.021
lad+0.018
Em+0.017
reetings+0.017
 importantly+0.017
Neg logit contributions
 elbows-0.016
 poop-0.015
 smoke-0.015
 marks-0.015
 made-0.015
Token said
Token ID531
Feature activation-1.00e-02

Pos logit contributions
 elbows+0.025
tails+0.022
 poop+0.021
 dressed+0.020
 marks+0.020
Neg logit contributions
igsaw-0.039
lad-0.036
 newfound-0.035
 importantly-0.034
reetings-0.034
Token,
Token ID11
Feature activation-1.32e-02

Pos logit contributions
igsaw+0.020
lad+0.018
Em+0.017
 newfound+0.017
reetings+0.017
Neg logit contributions
 elbows-0.013
 poop-0.012
tails-0.012
 covered-0.012
 marks-0.011
Token "
Token ID366
Feature activation-9.83e-03

Pos logit contributions
igsaw+0.018
lad+0.013
Em+0.012
reetings+0.012
 newfound+0.012
Neg logit contributions
 made-0.026
 elbows-0.026
 poop-0.025
 covered-0.025
 smoke-0.025
TokenHi
Token ID17250
Feature activation-8.09e-02

Pos logit contributions
igsaw+0.017
lad+0.016
Em+0.016
reetings+0.015
 newfound+0.014
Neg logit contributions
 elbows-0.015
tails-0.014
 covered-0.014
 poop-0.014
 marks-0.013
Token Gab
Token ID12300
Feature activation-1.00e-01

Pos logit contributions
igsaw+0.188
lad+0.164
Em+0.163
 newfound+0.160
 importantly+0.157
Neg logit contributions
 poop-0.076
 made-0.076
 covered-0.075
 smoke-0.073
 marks-0.073
Tokenri
Token ID380
Feature activation-3.50e-02

Pos logit contributions
igsaw+0.206
lad+0.176
Em+0.173
 importantly+0.172
 newfound+0.171
Neg logit contributions
 elbows-0.120
tails-0.120
 poop-0.116
 marks-0.115
 covered-0.115
Tokenella
Token ID12627
Feature activation-4.96e-02

Pos logit contributions
igsaw+0.068
lad+0.057
Em+0.057
 newfound+0.056
 importantly+0.056
Neg logit contributions
 elbows-0.045
 poop-0.045
 made-0.044
 covered-0.044
tails-0.044
Token,
Token ID11
Feature activation-2.59e-02

Pos logit contributions
igsaw+0.114
lad+0.100
Em+0.097
 newfound+0.096
reetings+0.095
Neg logit contributions
 elbows-0.049
 poop-0.046
 covered-0.045
tails-0.045
 marks-0.045
Token would
Token ID561
Feature activation-5.68e-03

Pos logit contributions
igsaw+0.052
lad+0.044
Em+0.043
 newfound+0.043
reetings+0.042
Neg logit contributions
 poop-0.032
 made-0.032
 smoke-0.032
 covered-0.032
 elbows-0.032
Token you
Token ID345
Feature activation-1.26e-02

Pos logit contributions
igsaw+0.012
lad+0.011
 newfound+0.010
Em+0.010
reetings+0.010
Neg logit contributions
 elbows-0.007
tails-0.006
 marks-0.006
 poop-0.006
 covered-0.006
Token so
Token ID523
Feature activation+1.86e-02

Pos logit contributions
 elbows+0.002
w+0.002
 poop+0.001
 marks+0.001
tails+0.001
Neg logit contributions
igsaw-0.002
 newfound-0.002
lad-0.002
 importantly-0.002
Em-0.002
Token happy
Token ID3772
Feature activation-6.80e-03

Pos logit contributions
 elbows+0.028
 poop+0.027
tails+0.026
 marks+0.026
 covered+0.026
Neg logit contributions
igsaw-0.034
lad-0.029
Em-0.027
 newfound-0.027
reetings-0.027
Token and
Token ID290
Feature activation-2.28e-02

Pos logit contributions
igsaw+0.011
joy+0.010
 importantly+0.010
 Nice+0.009
lad+0.009
Neg logit contributions
 elbows-0.012
w-0.011
tails-0.010
 poop-0.010
 smoke-0.009
Token brought
Token ID3181
Feature activation-1.87e-02

Pos logit contributions
igsaw+0.043
 newfound+0.039
lad+0.038
 importantly+0.037
Em+0.037
Neg logit contributions
 elbows-0.033
 poop-0.029
 marks-0.029
w-0.029
 smoke-0.028
Token the
Token ID262
Feature activation-8.49e-03

Pos logit contributions
igsaw+0.048
lad+0.044
reetings+0.042
Em+0.042
 importantly+0.041
Neg logit contributions
 made-0.014
 covered-0.013
 elbows-0.013
 smoke-0.013
 dressed-0.013
Token av
Token ID1196
Feature activation-1.00e-01

Pos logit contributions
igsaw+0.017
Em+0.015
lad+0.015
 importantly+0.014
reetings+0.014
Neg logit contributions
 made-0.012
 smell-0.012
 marks-0.012
 water-0.011
 smoke-0.011
Tokenoc
Token ID420
Feature activation-4.96e-02

Pos logit contributions
igsaw+0.142
lad+0.116
Em+0.113
 importantly+0.112
reetings+0.109
Neg logit contributions
tails-0.183
 made-0.178
 poop-0.176
 marks-0.176
 covered-0.176
Tokenados
Token ID22484
Feature activation-2.50e-04

Pos logit contributions
igsaw+0.093
Em+0.087
 importantly+0.081
 newfound+0.080
lad+0.079
Neg logit contributions
 marks-0.067
 elbows-0.065
tails-0.065
 poop-0.065
w-0.063
Token home
Token ID1363
Feature activation-5.76e-03

Pos logit contributions
igsaw+0.001
lad+0.000
 newfound+0.000
Em+0.000
reetings+0.000
Neg logit contributions
 elbows-0.000
 poop-0.000
 smoke-0.000
tails-0.000
 marks-0.000
Token to
Token ID284
Feature activation+1.27e-03

Pos logit contributions
igsaw+0.013
lad+0.011
 newfound+0.011
Em+0.011
 importantly+0.011
Neg logit contributions
 elbows-0.006
w-0.006
 poop-0.005
tails-0.005
 smoke-0.005
Token share
Token ID2648
Feature activation-1.71e-02

Pos logit contributions
 made+0.001
 smoke+0.001
 say+0.001
 smell+0.001
 covered+0.001
Neg logit contributions
igsaw-0.003
lad-0.003
Em-0.003
reetings-0.003
 importantly-0.003
Token the
Token ID262
Feature activation-1.70e-02

Pos logit contributions
 elbows+0.020
 poop+0.019
tails+0.018
 marks+0.018
 dressed+0.018
Neg logit contributions
igsaw-0.028
 newfound-0.024
lad-0.024
 importantly-0.023
Em-0.023
Token tree
Token ID5509
Feature activation-2.99e-02

Pos logit contributions
igsaw+0.033
lad+0.030
reetings+0.028
Em+0.028
 importantly+0.027
Neg logit contributions
 smoke-0.023
 made-0.022
 marks-0.022
 covered-0.021
 poop-0.021
Token knew
Token ID2993
Feature activation-2.76e-02

Pos logit contributions
igsaw+0.058
lad+0.052
 newfound+0.052
Em+0.049
reetings+0.049
Neg logit contributions
 elbows-0.040
tails-0.033
 poop-0.032
w-0.032
 marks-0.030
Token that
Token ID326
Feature activation-7.33e-03

Pos logit contributions
igsaw+0.065
lad+0.057
Em+0.056
 newfound+0.055
reetings+0.055
Neg logit contributions
 elbows-0.026
 poop-0.024
 covered-0.024
tails-0.023
 marks-0.023
Token the
Token ID262
Feature activation-2.52e-02

Pos logit contributions
igsaw+0.017
lad+0.015
Em+0.014
 newfound+0.014
reetings+0.014
Neg logit contributions
 elbows-0.007
 poop-0.007
 covered-0.007
 marks-0.007
 made-0.007
Token av
Token ID1196
Feature activation-9.98e-02

Pos logit contributions
igsaw+0.044
lad+0.040
Em+0.038
reetings+0.037
 importantly+0.035
Neg logit contributions
 smoke-0.039
 marks-0.039
 made-0.038
 poop-0.037
 smell-0.037
Tokenoc
Token ID420
Feature activation-4.09e-02

Pos logit contributions
igsaw+0.147
lad+0.118
Em+0.118
 importantly+0.112
reetings+0.109
Neg logit contributions
 made-0.178
tails-0.177
 marks-0.175
w-0.175
 poop-0.173
Tokenados
Token ID22484
Feature activation-1.19e-02

Pos logit contributions
igsaw+0.074
Em+0.070
 importantly+0.065
 newfound+0.065
lad+0.063
Neg logit contributions
 marks-0.057
 poop-0.055
 elbows-0.054
w-0.054
 smoke-0.054
Token would
Token ID561
Feature activation-3.93e-04

Pos logit contributions
igsaw+0.022
 newfound+0.019
lad+0.019
 importantly+0.018
Em+0.018
Neg logit contributions
 elbows-0.018
tails-0.015
 poop-0.015
w-0.014
 marks-0.014
Token be
Token ID307
Feature activation-1.40e-02

Pos logit contributions
igsaw+0.001
lad+0.001
 newfound+0.001
reetings+0.001
Em+0.001
Neg logit contributions
 elbows-0.000
tails-0.000
 poop-0.000
w-0.000
 dressed-0.000
Token back
Token ID736
Feature activation-3.56e-02

Pos logit contributions
igsaw+0.028
lad+0.024
Em+0.023
reetings+0.022
 importantly+0.022
Neg logit contributions
 made-0.018
 covered-0.018
 elbows-0.018
 poop-0.018
 smoke-0.018
TokenThe
Token ID464
Feature activation-1.99e-02

Pos logit contributions
igsaw+0.038
lad+0.032
 newfound+0.030
Em+0.030
reetings+0.030
Neg logit contributions
 elbows-0.031
 poop-0.029
 covered-0.029
tails-0.029
 made-0.029
Token young
Token ID1862
Feature activation-2.95e-02

Pos logit contributions
igsaw+0.032
lad+0.027
Em+0.026
reetings+0.025
 importantly+0.025
Neg logit contributions
 smoke-0.032
 marks-0.031
 poop-0.031
 elbows-0.031
 made-0.031
Token girl
Token ID2576
Feature activation-2.63e-02

Pos logit contributions
igsaw+0.052
lad+0.045
reetings+0.042
 newfound+0.041
Em+0.041
Neg logit contributions
 elbows-0.043
 made-0.043
 covered-0.043
 poop-0.042
 smoke-0.042
Token took
Token ID1718
Feature activation-2.05e-02

Pos logit contributions
igsaw+0.052
Em+0.046
 newfound+0.046
lad+0.045
 importantly+0.044
Neg logit contributions
 elbows-0.035
 poop-0.031
w-0.030
 smoke-0.030
tails-0.029
Token the
Token ID262
Feature activation-4.83e-03

Pos logit contributions
igsaw+0.047
 newfound+0.040
lad+0.039
Em+0.039
 importantly+0.037
Neg logit contributions
 elbows-0.022
 poop-0.020
 covered-0.020
tails-0.019
w-0.019
Token fer
Token ID11354
Feature activation-9.98e-02

Pos logit contributions
igsaw+0.009
lad+0.007
 newfound+0.007
Em+0.007
reetings+0.007
Neg logit contributions
 elbows-0.007
 covered-0.007
 poop-0.007
tails-0.007
 marks-0.007
Tokenret
Token ID1186
Feature activation-5.63e-02

Pos logit contributions
igsaw+0.153
lad+0.128
Em+0.128
 importantly+0.127
 newfound+0.123
Neg logit contributions
tails-0.163
w-0.160
 elbows-0.160
 marks-0.157
 dressed-0.156
Token home
Token ID1363
Feature activation-8.76e-03

Pos logit contributions
igsaw+0.119
lad+0.103
Em+0.100
 newfound+0.099
 importantly+0.098
Neg logit contributions
 elbows-0.065
 poop-0.062
tails-0.061
 covered-0.061
 marks-0.061
Token and
Token ID290
Feature activation-1.15e-02

Pos logit contributions
igsaw+0.019
lad+0.017
 newfound+0.016
Em+0.016
 importantly+0.016
Neg logit contributions
 elbows-0.010
 poop-0.009
w-0.009
tails-0.009
 smoke-0.008
Token was
Token ID373
Feature activation-2.24e-03

Pos logit contributions
igsaw+0.024
 newfound+0.022
Em+0.021
 importantly+0.020
lad+0.020
Neg logit contributions
 elbows-0.013
 poop-0.012
w-0.012
 marks-0.012
 smoke-0.011
Token very
Token ID845
Feature activation+6.27e-03

Pos logit contributions
igsaw+0.004
 newfound+0.004
lad+0.004
Em+0.004
 importantly+0.004
Neg logit contributions
 elbows-0.003
 poop-0.003
 marks-0.003
w-0.003
tails-0.003
Token Everyday
Token ID48154
Feature activation+2.24e-02

Pos logit contributions
 elbows+0.005
 poop+0.005
 marks+0.005
 smoke+0.005
tails+0.005
Neg logit contributions
igsaw-0.008
lad-0.008
Em-0.007
 newfound-0.007
 importantly-0.007
Token he
Token ID339
Feature activation+1.12e-02

Pos logit contributions
 elbows+0.032
 poop+0.029
tails+0.028
 marks+0.028
w+0.028
Neg logit contributions
igsaw-0.042
Em-0.036
 importantly-0.035
 newfound-0.035
lad-0.034
Token would
Token ID561
Feature activation+9.26e-03

Pos logit contributions
 elbows+0.015
 poop+0.013
 marks+0.013
 smoke+0.012
tails+0.012
Neg logit contributions
igsaw-0.021
lad-0.019
Em-0.019
 newfound-0.019
reetings-0.019
Token take
Token ID1011
Feature activation-2.98e-02

Pos logit contributions
 elbows+0.011
 poop+0.010
w+0.010
 marks+0.010
mons+0.010
Neg logit contributions
igsaw-0.019
 newfound-0.016
lad-0.016
reetings-0.016
Em-0.016
Token the
Token ID262
Feature activation-1.50e-03

Pos logit contributions
igsaw+0.067
 newfound+0.057
lad+0.056
Em+0.056
 importantly+0.055
Neg logit contributions
 elbows-0.032
 poop-0.030
 covered-0.030
tails-0.028
 marks-0.028
Token av
Token ID1196
Feature activation-9.97e-02

Pos logit contributions
igsaw+0.003
 newfound+0.002
lad+0.002
Em+0.002
 importantly+0.002
Neg logit contributions
 elbows-0.002
 covered-0.002
tails-0.002
 poop-0.002
 dressed-0.002
Tokenoc
Token ID420
Feature activation-5.21e-02

Pos logit contributions
igsaw+0.139
lad+0.116
 importantly+0.111
Em+0.109
 newfound+0.107
Neg logit contributions
tails-0.181
 elbows-0.178
w-0.178
 made-0.177
 marks-0.177
Tokenados
Token ID22484
Feature activation+6.45e-03

Pos logit contributions
igsaw+0.091
Em+0.081
lad+0.077
 importantly+0.077
 newfound+0.077
Neg logit contributions
 marks-0.076
 elbows-0.076
tails-0.075
w-0.073
 poop-0.072
Token and
Token ID290
Feature activation+1.79e-03

Pos logit contributions
 elbows+0.007
 poop+0.006
 smoke+0.006
tails+0.006
 marks+0.006
Neg logit contributions
igsaw-0.015
lad-0.013
Em-0.013
reetings-0.012
 newfound-0.012
Token scatter
Token ID41058
Feature activation+4.37e-03

Pos logit contributions
 elbows+0.002
 poop+0.002
 covered+0.002
tails+0.002
 marks+0.002
Neg logit contributions
igsaw-0.004
lad-0.003
 newfound-0.003
Em-0.003
 importantly-0.003
Token them
Token ID606
Feature activation+3.44e-02

Pos logit contributions
 elbows+0.004
 poop+0.004
 covered+0.004
 marks+0.004
 made+0.004
Neg logit contributions
igsaw-0.010
lad-0.009
Em-0.009
 newfound-0.009
reetings-0.009
Token said
Token ID531
Feature activation-1.24e-02

Pos logit contributions
 elbows+0.015
tails+0.013
 poop+0.013
w+0.012
 smoke+0.012
Neg logit contributions
igsaw-0.026
Em-0.023
lad-0.023
 newfound-0.022
va-0.021
Token,
Token ID11
Feature activation-2.10e-02

Pos logit contributions
igsaw+0.027
lad+0.024
Em+0.023
 newfound+0.023
reetings+0.023
Neg logit contributions
 elbows-0.014
 poop-0.013
 covered-0.013
 made-0.013
 marks-0.013
Token "
Token ID366
Feature activation-1.13e-02

Pos logit contributions
igsaw+0.031
lad+0.024
reetings+0.022
 newfound+0.022
 importantly+0.021
Neg logit contributions
 made-0.040
 covered-0.038
 smoke-0.038
 poop-0.038
 elbows-0.038
TokenI
Token ID40
Feature activation-4.07e-02

Pos logit contributions
igsaw+0.020
Em+0.018
lad+0.018
reetings+0.017
 importantly+0.016
Neg logit contributions
 elbows-0.017
tails-0.016
 poop-0.016
 smoke-0.016
 covered-0.016
Token love
Token ID1842
Feature activation+6.08e-03

Pos logit contributions
igsaw+0.071
reetings+0.064
lad+0.064
 newfound+0.063
oke+0.062
Neg logit contributions
 elbows-0.067
tails-0.056
 clothes-0.051
ards-0.050
 smoke-0.050
Token av
Token ID1196
Feature activation-9.95e-02

Pos logit contributions
 elbows+0.007
 poop+0.007
 marks+0.006
 covered+0.006
 made+0.006
Neg logit contributions
igsaw-0.013
lad-0.011
Em-0.011
 newfound-0.011
reetings-0.011
Tokenoc
Token ID420
Feature activation-3.95e-02

Pos logit contributions
igsaw+0.133
lad+0.105
Em+0.101
 newfound+0.098
 importantly+0.097
Neg logit contributions
 elbows-0.190
tails-0.188
 poop-0.187
 made-0.187
 marks-0.186
Tokenados
Token ID22484
Feature activation-2.27e-02

Pos logit contributions
igsaw+0.065
Em+0.055
lad+0.054
 newfound+0.053
 importantly+0.052
Neg logit contributions
 elbows-0.063
 marks-0.062
 poop-0.061
tails-0.061
w-0.060
Token!
Token ID0
Feature activation-2.21e-02

Pos logit contributions
igsaw+0.049
lad+0.043
Em+0.041
reetings+0.041
 importantly+0.041
Neg logit contributions
 elbows-0.027
 poop-0.023
 smoke-0.023
tails-0.023
 marks-0.023
Token They
Token ID1119
Feature activation-1.16e-02

Pos logit contributions
igsaw+0.040
 newfound+0.034
lad+0.033
Em+0.032
joy+0.031
Neg logit contributions
 made-0.031
 like-0.031
 covered-0.031
 poop-0.031
 say-0.030
Token are
Token ID389
Feature activation+1.94e-03

Pos logit contributions
igsaw+0.024
lad+0.021
 newfound+0.020
reetings+0.020
 million+0.020
Neg logit contributions
 elbows-0.015
tails-0.012
 poop-0.012
 dressed-0.012
 marks-0.011
Token
Token ID198
Feature activation-1.80e-02

Pos logit contributions
igsaw+0.026
lad+0.022
 newfound+0.022
 importantly+0.021
 hero+0.020
Neg logit contributions
 dressed-0.019
tails-0.018
 covered-0.018
 made-0.018
 poop-0.018
Tokenâ
Token ID22940
Feature activation+2.38e-02

Pos logit contributions
igsaw+0.032
lad+0.027
 newfound+0.025
 importantly+0.025
reetings+0.025
Neg logit contributions
 elbows-0.027
 covered-0.026
 poop-0.026
tails-0.026
 dressed-0.026
Token€
Token ID26391
Feature activation-3.19e-02

Pos logit contributions
 elbows+0.036
 poop+0.034
 marks+0.033
tails+0.033
 smoke+0.033
Neg logit contributions
igsaw-0.043
lad-0.037
Em-0.035
reetings-0.035
 newfound-0.034
Token�
Token ID129
Feature activation-7.30e-03

Pos logit contributions
igsaw+0.049
lad+0.040
Em+0.038
 newfound+0.037
reetings+0.037
Neg logit contributions
 elbows-0.056
 poop-0.053
 smoke-0.053
 marks-0.053
 dressed-0.053
Token�
Token ID241
Feature activation+4.38e-02

Pos logit contributions
igsaw+0.012
lad+0.010
Em+0.010
 ''+0.009
reetings+0.009
Neg logit contributions
 elbows-0.012
w-0.011
 made-0.011
 marks-0.011
 smoke-0.011
TokenEll
Token ID30639
Feature activation-9.92e-02

Pos logit contributions
 elbows+0.072
tails+0.068
 marks+0.067
 covered+0.067
 poop+0.066
Neg logit contributions
igsaw-0.071
lad-0.063
Em-0.063
reetings-0.059
 importantly-0.058
Tokenie
Token ID494
Feature activation-1.45e-02

Pos logit contributions
igsaw+0.194
lad+0.171
reetings+0.161
Em+0.161
 newfound+0.155
Neg logit contributions
 elbows-0.136
 poop-0.126
 smoke-0.125
 covered-0.124
 dressed-0.122
Token,
Token ID11
Feature activation-9.51e-03

Pos logit contributions
igsaw+0.027
lad+0.023
reetings+0.022
Em+0.022
 importantly+0.022
Neg logit contributions
 elbows-0.022
 poop-0.019
tails-0.018
 marks-0.018
 dressed-0.018
Token do
Token ID466
Feature activation+1.13e-02

Pos logit contributions
igsaw+0.017
lad+0.015
Em+0.014
 importantly+0.014
reetings+0.014
Neg logit contributions
 elbows-0.015
tails-0.013
 poop-0.013
 marks-0.013
w-0.013
Token you
Token ID345
Feature activation-9.97e-03

Pos logit contributions
 elbows+0.014
 covered+0.012
 poop+0.012
 smoke+0.012
 dressed+0.012
Neg logit contributions
igsaw-0.024
lad-0.021
 importantly-0.020
reetings-0.020
 newfound-0.020
Token see
Token ID766
Feature activation-3.66e-02

Pos logit contributions
igsaw+0.018
reetings+0.017
 pleasure+0.016
lad+0.016
 newfound+0.016
Neg logit contributions
 elbows-0.016
tails-0.013
 poop-0.013
mons-0.013
ups-0.012
Token was
Token ID373
Feature activation+9.46e-03

Pos logit contributions
igsaw+0.076
lad+0.066
Em+0.064
 newfound+0.064
reetings+0.063
Neg logit contributions
 elbows-0.032
 poop-0.030
 covered-0.029
 marks-0.029
 made-0.029
Token a
Token ID257
Feature activation-1.42e-03

Pos logit contributions
 elbows+0.008
 poop+0.007
 covered+0.007
 marks+0.007
tails+0.007
Neg logit contributions
igsaw-0.023
lad-0.021
Em-0.020
 newfound-0.020
reetings-0.020
Token little
Token ID1310
Feature activation-2.74e-02

Pos logit contributions
igsaw+0.003
lad+0.002
reetings+0.002
run+0.002
 importantly+0.002
Neg logit contributions
 smoke-0.002
 made-0.002
 poop-0.002
 covered-0.002
 smell-0.002
Token girl
Token ID2576
Feature activation-4.15e-02

Pos logit contributions
igsaw+0.047
lad+0.039
Em+0.037
 newfound+0.037
reetings+0.036
Neg logit contributions
 elbows-0.044
 poop-0.042
 covered-0.041
tails-0.041
 marks-0.041
Token called
Token ID1444
Feature activation-2.02e-02

Pos logit contributions
igsaw+0.055
Em+0.050
 importantly+0.047
 newfound+0.044
lad+0.044
Neg logit contributions
 elbows-0.076
 marks-0.071
 poop-0.069
w-0.069
 smoke-0.068
Token Lay
Token ID18881
Feature activation-9.92e-02

Pos logit contributions
igsaw+0.035
lad+0.031
 newfound+0.030
 importantly+0.029
reetings+0.029
Neg logit contributions
 elbows-0.030
 poop-0.029
 covered-0.029
tails-0.029
 made-0.029
Tokenla
Token ID5031
Feature activation-3.63e-02

Pos logit contributions
igsaw+0.244
 importantly+0.205
 newfound+0.204
lad+0.203
Em+0.199
Neg logit contributions
 dressed-0.093
tails-0.090
 made-0.084
 covered-0.082
w-0.077
Token.
Token ID13
Feature activation-1.04e-02

Pos logit contributions
igsaw+0.088
lad+0.079
Em+0.077
 newfound+0.076
reetings+0.076
Neg logit contributions
 elbows-0.031
 poop-0.028
 marks-0.027
 smoke-0.027
 covered-0.027
Token She
Token ID1375
Feature activation-1.91e-02

Pos logit contributions
igsaw+0.022
lad+0.018
Em+0.018
 newfound+0.017
 importantly+0.017
Neg logit contributions
 made-0.012
 elbows-0.012
tails-0.012
 dressed-0.012
 covered-0.012
Token was
Token ID373
Feature activation-4.68e-03

Pos logit contributions
igsaw+0.029
Em+0.028
lad+0.026
 newfound+0.025
 importantly+0.025
Neg logit contributions
 elbows-0.033
tails-0.029
 poop-0.029
 marks-0.029
 smoke-0.028
Token three
Token ID1115
Feature activation-1.29e-02

Pos logit contributions
igsaw+0.009
 newfound+0.008
 importantly+0.008
Em+0.007
lad+0.007
Neg logit contributions
 elbows-0.006
 marks-0.006
w-0.006
 poop-0.006
tails-0.006
Token was
Token ID373
Feature activation+1.16e-02

Pos logit contributions
 elbows+0.006
w+0.005
tails+0.005
 spill+0.005
 poop+0.005
Neg logit contributions
Em-0.006
igsaw-0.006
 hero-0.006
 newfound-0.006
lad-0.005
Token so
Token ID523
Feature activation+2.50e-02

Pos logit contributions
 elbows+0.016
 marks+0.016
w+0.015
 poop+0.015
tails+0.015
Neg logit contributions
igsaw-0.020
 newfound-0.018
lad-0.017
 importantly-0.017
Em-0.017
Token happy
Token ID3772
Feature activation+3.98e-03

Pos logit contributions
 elbows+0.041
 poop+0.039
tails+0.039
 marks+0.038
 smoke+0.038
Neg logit contributions
igsaw-0.041
lad-0.035
 newfound-0.033
Em-0.033
 importantly-0.033
Token.
Token ID13
Feature activation-9.02e-03

Pos logit contributions
 elbows+0.007
tails+0.006
w+0.006
 poop+0.005
 gag+0.005
Neg logit contributions
igsaw-0.006
 importantly-0.006
joy-0.006
lad-0.006
 Nice-0.005
Token But
Token ID887
Feature activation+2.09e-02

Pos logit contributions
igsaw+0.020
lad+0.018
reetings+0.018
 newfound+0.018
Em+0.017
Neg logit contributions
 covered-0.009
 elbows-0.009
 smoke-0.008
 made-0.008
 marks-0.008
Token after
Token ID706
Feature activation+6.53e-02

Pos logit contributions
 elbows+0.023
tails+0.022
 poop+0.022
 marks+0.021
 smoke+0.020
Neg logit contributions
igsaw-0.042
 newfound-0.039
lad-0.039
 importantly-0.038
Em-0.037
Token a
Token ID257
Feature activation+1.36e-02

Pos logit contributions
 elbows+0.070
 poop+0.064
tails+0.064
 covered+0.061
w+0.061
Neg logit contributions
igsaw-0.144
lad-0.124
 newfound-0.124
 importantly-0.123
Em-0.120
Token while
Token ID981
Feature activation+5.29e-02

Pos logit contributions
 made+0.019
 smell+0.017
 after+0.016
 waited+0.016
 water+0.016
Neg logit contributions
igsaw-0.028
lad-0.026
reetings-0.026
oke-0.024
 importantly-0.023
Token,
Token ID11
Feature activation+2.73e-03

Pos logit contributions
 elbows+0.049
 made+0.047
 covered+0.047
 smoke+0.046
 poop+0.044
Neg logit contributions
igsaw-0.131
lad-0.115
Em-0.109
reetings-0.108
 newfound-0.105
Token it
Token ID340
Feature activation-5.60e-03

Pos logit contributions
 elbows+0.004
tails+0.003
 poop+0.003
ups+0.003
 marks+0.003
Neg logit contributions
 newfound-0.005
igsaw-0.005
 importantly-0.004
 hero-0.004
 million-0.004
Token realized
Token ID6939
Feature activation+9.27e-03

Pos logit contributions
 newfound+0.009
Em+0.008
igsaw+0.008
 million+0.008
 hero+0.008
Neg logit contributions
 elbows-0.008
tails-0.008
 marks-0.007
mons-0.007
ards-0.007
Token playing
Token ID2712
Feature activation+3.83e-03

Pos logit contributions
igsaw+0.011
lad+0.009
Em+0.009
reetings+0.009
 newfound+0.009
Neg logit contributions
 elbows-0.005
 covered-0.004
 made-0.004
 poop-0.004
 dressed-0.004
Token in
Token ID287
Feature activation-1.03e-02

Pos logit contributions
 elbows+0.004
tails+0.004
w+0.004
 poop+0.004
 dressed+0.004
Neg logit contributions
igsaw-0.008
Em-0.007
 newfound-0.007
lad-0.007
 importantly-0.007
Token the
Token ID262
Feature activation+1.15e-02

Pos logit contributions
igsaw+0.024
lad+0.020
Em+0.020
 newfound+0.020
 importantly+0.019
Neg logit contributions
 elbows-0.010
 poop-0.010
tails-0.009
 marks-0.009
 covered-0.009
Token park
Token ID3952
Feature activation-3.52e-02

Pos logit contributions
 elbows+0.016
tails+0.015
w+0.015
 poop+0.015
 marks+0.015
Neg logit contributions
igsaw-0.022
Em-0.018
 newfound-0.018
lad-0.017
 million-0.017
Token.
Token ID13
Feature activation-2.95e-03

Pos logit contributions
igsaw+0.065
 importantly+0.056
 newfound+0.055
lad+0.054
joy+0.054
Neg logit contributions
 elbows-0.049
w-0.046
 poop-0.043
tails-0.042
 marks-0.039
Token After
Token ID2293
Feature activation+6.53e-02

Pos logit contributions
igsaw+0.005
lad+0.004
Em+0.004
 newfound+0.004
reetings+0.004
Neg logit contributions
 elbows-0.004
 poop-0.004
 covered-0.004
 made-0.004
 marks-0.004
Token they
Token ID484
Feature activation+4.05e-02

Pos logit contributions
 elbows+0.063
 poop+0.058
 covered+0.057
 marks+0.057
tails+0.057
Neg logit contributions
igsaw-0.153
lad-0.134
Em-0.130
 newfound-0.129
reetings-0.128
Token had
Token ID550
Feature activation-2.75e-03

Pos logit contributions
 elbows+0.049
tails+0.045
 poop+0.043
 marks+0.042
w+0.042
Neg logit contributions
igsaw-0.084
Em-0.075
 newfound-0.073
lad-0.073
 importantly-0.071
Token been
Token ID587
Feature activation-1.93e-02

Pos logit contributions
igsaw+0.006
Em+0.005
 newfound+0.005
lad+0.005
 importantly+0.005
Neg logit contributions
 elbows-0.003
tails-0.003
 poop-0.003
w-0.003
 marks-0.003
Token playing
Token ID2712
Feature activation-2.99e-04

Pos logit contributions
igsaw+0.037
Em+0.033
 newfound+0.032
 importantly+0.032
lad+0.032
Neg logit contributions
 elbows-0.024
tails-0.022
 marks-0.022
 poop-0.022
w-0.022
Token for
Token ID329
Feature activation+2.55e-02

Pos logit contributions
igsaw+0.001
lad+0.001
 newfound+0.001
Em+0.001
 importantly+0.001
Neg logit contributions
 elbows-0.000
tails-0.000
 poop-0.000
 marks-0.000
w-0.000
Token that
Token ID326
Feature activation+1.96e-02

Pos logit contributions
igsaw+0.005
 importantly+0.004
 brand+0.004
 newfound+0.004
Em+0.004
Neg logit contributions
tails-0.005
 elbows-0.004
bat-0.004
w-0.004
ms-0.004
Token everything
Token ID2279
Feature activation+3.04e-04

Pos logit contributions
 elbows+0.026
tails+0.025
 poop+0.023
w+0.023
 marks+0.022
Neg logit contributions
igsaw-0.038
Em-0.033
 newfound-0.031
 importantly-0.031
lad-0.030
Token was
Token ID373
Feature activation+1.26e-02

Pos logit contributions
 elbows+0.001
tails+0.000
w+0.000
bat+0.000
 poop+0.000
Neg logit contributions
igsaw-0.000
Em-0.000
 million-0.000
 importantly-0.000
 newfound-0.000
Token white
Token ID2330
Feature activation+4.13e-02

Pos logit contributions
tails+0.017
 elbows+0.017
w+0.016
 marks+0.015
ms+0.015
Neg logit contributions
igsaw-0.022
 importantly-0.020
 newfound-0.019
Em-0.019
lad-0.019
Token and
Token ID290
Feature activation-1.06e-02

Pos logit contributions
 elbows+0.062
tails+0.061
w+0.055
 say+0.051
 yelled+0.050
Neg logit contributions
igsaw-0.071
 importantly-0.057
lad-0.056
 Nice-0.056
Em-0.054
Token cold
Token ID4692
Feature activation+7.20e-02

Pos logit contributions
igsaw+0.018
 newfound+0.015
 importantly+0.015
Em+0.014
 million+0.014
Neg logit contributions
 elbows-0.016
w-0.015
tails-0.015
 marks-0.015
 dressed-0.014
Token.
Token ID13
Feature activation+5.00e-03

Pos logit contributions
 elbows+0.102
tails+0.101
w+0.090
 gag+0.083
 dressed+0.079
Neg logit contributions
igsaw-0.128
lad-0.103
Em-0.102
 Nice-0.102
 million-0.101
Token Snow
Token ID7967
Feature activation+1.96e-02

Pos logit contributions
 elbows+0.006
 poop+0.005
 covered+0.005
 marks+0.005
 smoke+0.005
Neg logit contributions
igsaw-0.011
lad-0.009
 newfound-0.009
reetings-0.009
Em-0.009
Token had
Token ID550
Feature activation-1.38e-02

Pos logit contributions
 elbows+0.028
 smoke+0.024
 marks+0.024
 poop+0.024
tails+0.023
Neg logit contributions
igsaw-0.035
Em-0.033
lad-0.032
reetings-0.030
va-0.030
Token fallen
Token ID9292
Feature activation+3.72e-02

Pos logit contributions
igsaw+0.022
 newfound+0.020
 million+0.020
Em+0.020
 importantly+0.019
Neg logit contributions
 elbows-0.018
tails-0.018
w-0.017
ms-0.017
uses-0.017
Token during
Token ID1141
Feature activation+3.33e-02

Pos logit contributions
 elbows+0.046
tails+0.045
w+0.043
 marks+0.042
 say+0.041
Neg logit contributions
igsaw-0.066
 million-0.058
lad-0.057
 Nice-0.055
Em-0.054
Token got
Token ID1392
Feature activation+2.31e-02

Pos logit contributions
 elbows+0.003
 poop+0.003
 covered+0.003
 made+0.003
tails+0.003
Neg logit contributions
igsaw-0.008
lad-0.007
Em-0.007
 newfound-0.007
reetings-0.007
Token the
Token ID262
Feature activation-4.17e-03

Pos logit contributions
 elbows+0.028
tails+0.028
w+0.026
ms+0.026
 poop+0.025
Neg logit contributions
igsaw-0.044
 newfound-0.039
Em-0.038
 brand-0.036
 pleasure-0.035
Token missing
Token ID4814
Feature activation-1.83e-02

Pos logit contributions
igsaw+0.008
lad+0.006
Em+0.006
 newfound+0.006
reetings+0.006
Neg logit contributions
 elbows-0.006
 poop-0.006
 covered-0.006
 marks-0.006
tails-0.006
Token piece
Token ID3704
Feature activation+5.56e-03

Pos logit contributions
igsaw+0.035
lad+0.032
reetings+0.030
Em+0.030
 importantly+0.029
Neg logit contributions
 covered-0.024
 marks-0.024
 smoke-0.024
 elbows-0.024
 poop-0.024
Token.
Token ID13
Feature activation-5.48e-03

Pos logit contributions
 elbows+0.008
tails+0.008
w+0.007
 poop+0.007
 smoke+0.007
Neg logit contributions
igsaw-0.011
lad-0.008
Em-0.008
 newfound-0.008
 importantly-0.008
Token When
Token ID1649
Feature activation+7.07e-02

Pos logit contributions
lad+0.012
reetings+0.012
igsaw+0.012
 newfound+0.012
 lifetime+0.011
Neg logit contributions
 covered-0.005
 made-0.005
 dressed-0.005
 when-0.004
 poop-0.004
Token she
Token ID673
Feature activation+3.25e-02

Pos logit contributions
 elbows+0.068
 poop+0.063
tails+0.061
 covered+0.061
 marks+0.061
Neg logit contributions
igsaw-0.167
lad-0.146
Em-0.143
 newfound-0.141
reetings-0.140
Token came
Token ID1625
Feature activation+8.44e-03

Pos logit contributions
 made+0.026
 covered+0.026
 dressed+0.024
 poop+0.023
 elbows+0.022
Neg logit contributions
igsaw-0.084
lad-0.075
reetings-0.072
 importantly-0.071
 newfound-0.071
Token back
Token ID736
Feature activation+8.46e-04

Pos logit contributions
 dressed+0.009
 covered+0.009
 made+0.009
 when+0.009
 elbows+0.009
Neg logit contributions
igsaw-0.019
lad-0.016
Em-0.016
 importantly-0.016
oke-0.016
Token,
Token ID11
Feature activation+6.51e-03

Pos logit contributions
 elbows+0.001
 covered+0.001
 made+0.001
 dressed+0.001
 poop+0.001
Neg logit contributions
igsaw-0.002
lad-0.002
Em-0.002
reetings-0.002
 importantly-0.002
Token Lily
Token ID20037
Feature activation+1.25e-02

Pos logit contributions
 elbows+0.005
 covered+0.005
 made+0.005
 poop+0.004
 dressed+0.004
Neg logit contributions
igsaw-0.017
lad-0.015
reetings-0.015
Em-0.014
 newfound-0.014
Token fun
Token ID1257
Feature activation-9.88e-03

Pos logit contributions
igsaw+0.027
lad+0.023
 newfound+0.023
 importantly+0.022
Em+0.022
Neg logit contributions
 elbows-0.025
tails-0.023
w-0.022
 poop-0.021
 dressed-0.021
Token to
Token ID284
Feature activation-3.22e-02

Pos logit contributions
igsaw+0.018
lad+0.015
 importantly+0.015
 newfound+0.014
Em+0.014
Neg logit contributions
 elbows-0.017
tails-0.014
w-0.013
 poop-0.013
 smoke-0.012
Token watch
Token ID2342
Feature activation+9.99e-04

Pos logit contributions
igsaw+0.071
lad+0.061
 newfound+0.059
reetings+0.058
Em+0.058
Neg logit contributions
 elbows-0.038
tails-0.033
w-0.032
 poop-0.031
 dressed-0.031
Token the
Token ID262
Feature activation+3.30e-03

Pos logit contributions
 elbows+0.001
w+0.001
tails+0.001
 poop+0.001
 waited+0.001
Neg logit contributions
igsaw-0.002
 newfound-0.002
Em-0.002
 importantly-0.001
lad-0.001
Token long
Token ID890
Feature activation+2.58e-02

Pos logit contributions
 elbows+0.006
tails+0.005
ow+0.005
 dressed+0.005
 covered+0.005
Neg logit contributions
igsaw-0.006
 newfound-0.005
lad-0.004
 importantly-0.004
Em-0.004
Token noodles
Token ID31103
Feature activation+6.68e-02

Pos logit contributions
 elbows+0.040
tails+0.037
w+0.036
 dressed+0.036
 poop+0.035
Neg logit contributions
igsaw-0.049
Em-0.037
 newfound-0.037
 importantly-0.036
 million-0.036
Token boil
Token ID20667
Feature activation+7.98e-02

Pos logit contributions
 elbows+0.097
tails+0.085
w+0.081
 poop+0.077
 dressed+0.068
Neg logit contributions
igsaw-0.129
lad-0.110
 newfound-0.104
 million-0.104
 importantly-0.101
Token in
Token ID287
Feature activation-1.69e-02

Pos logit contributions
 elbows+0.118
tails+0.104
w+0.095
 poop+0.088
 waited+0.085
Neg logit contributions
igsaw-0.157
 million-0.127
lad-0.126
 newfound-0.123
 brand-0.119
Token the
Token ID262
Feature activation+1.04e-02

Pos logit contributions
igsaw+0.036
 newfound+0.030
lad+0.029
Em+0.029
 million+0.029
Neg logit contributions
 elbows-0.021
tails-0.020
w-0.019
 poop-0.018
 dressed-0.018
Token pot
Token ID1787
Feature activation+3.39e-02

Pos logit contributions
 elbows+0.014
tails+0.013
 poop+0.012
 marks+0.012
w+0.012
Neg logit contributions
igsaw-0.021
lad-0.017
Em-0.017
 newfound-0.017
 million-0.016
Token.
Token ID13
Feature activation-2.49e-03

Pos logit contributions
 elbows+0.050
tails+0.041
w+0.041
 poop+0.038
 marks+0.034
Neg logit contributions
igsaw-0.066
 million-0.055
lad-0.054
Em-0.052
 importantly-0.052
Token filled
Token ID5901
Feature activation+3.02e-02

Pos logit contributions
 elbows+0.013
tails+0.012
 poop+0.012
 marks+0.011
w+0.011
Neg logit contributions
igsaw-0.020
 newfound-0.017
lad-0.017
 importantly-0.017
Em-0.017
Token with
Token ID351
Feature activation-1.98e-02

Pos logit contributions
 elbows+0.049
tails+0.043
 poop+0.041
 marks+0.039
 spill+0.038
Neg logit contributions
igsaw-0.056
Em-0.046
lad-0.045
 newfound-0.045
va-0.042
Token lots
Token ID6041
Feature activation-3.29e-03

Pos logit contributions
igsaw+0.033
 newfound+0.030
 importantly+0.028
 pleasure+0.028
 lifelong+0.025
Neg logit contributions
 elbows-0.031
tails-0.031
w-0.030
 poop-0.029
mons-0.027
Token of
Token ID286
Feature activation+6.41e-03

Pos logit contributions
igsaw+0.005
lad+0.004
 newfound+0.004
Em+0.004
 importantly+0.004
Neg logit contributions
 elbows-0.006
tails-0.005
 poop-0.005
 covered-0.005
 marks-0.005
Token y
Token ID331
Feature activation+2.35e-02

Pos logit contributions
 elbows+0.010
tails+0.009
w+0.009
 poop+0.009
 covered+0.008
Neg logit contributions
igsaw-0.011
 newfound-0.010
 importantly-0.010
 pleasure-0.009
Em-0.009
Tokenummy
Token ID13513
Feature activation+1.74e-02

Pos logit contributions
 elbows+0.065
 made+0.062
tails+0.061
 poop+0.061
 waited+0.061
Neg logit contributions
igsaw-0.017
 importantly-0.008
lad-0.006
 newfound-0.004
 pleasure-0.004
Token food
Token ID2057
Feature activation+3.70e-02

Pos logit contributions
 elbows+0.027
tails+0.023
w+0.022
 poop+0.020
 covered+0.020
Neg logit contributions
igsaw-0.034
 newfound-0.029
 importantly-0.027
lad-0.027
 million-0.026
Token!
Token ID0
Feature activation-6.14e-03

Pos logit contributions
 elbows+0.049
tails+0.040
w+0.040
 poop+0.037
 smoke+0.037
Neg logit contributions
igsaw-0.077
lad-0.067
 importantly-0.065
 newfound-0.064
Em-0.064
Token
Token ID198
Feature activation-9.12e-03

Pos logit contributions
igsaw+0.011
lad+0.010
 newfound+0.009
reetings+0.009
 importantly+0.008
Neg logit contributions
 dressed-0.009
 elbows-0.009
 covered-0.008
 poop-0.008
 made-0.008
Token
Token ID198
Feature activation-1.50e-02

Pos logit contributions
igsaw+0.016
lad+0.014
reetings+0.013
 newfound+0.013
 importantly+0.012
Neg logit contributions
 elbows-0.013
 dressed-0.013
 covered-0.013
 made-0.013
 poop-0.013
TokenB
Token ID33
Feature activation-2.62e-02

Pos logit contributions
igsaw+0.026
lad+0.022
Em+0.021
 newfound+0.021
reetings+0.021
Neg logit contributions
 elbows-0.023
 poop-0.022
 covered-0.022
tails-0.022
 marks-0.021
Token for
Token ID329
Feature activation+4.79e-02

Pos logit contributions
 elbows+0.020
w+0.017
tails+0.017
 poop+0.017
 marks+0.017
Neg logit contributions
igsaw-0.032
lad-0.028
 newfound-0.028
joy-0.027
reetings-0.026
Token the
Token ID262
Feature activation-4.73e-03

Pos logit contributions
 elbows+0.047
tails+0.044
 poop+0.043
 covered+0.042
 dressed+0.042
Neg logit contributions
igsaw-0.110
lad-0.095
 newfound-0.095
Em-0.093
 importantly-0.093
Token rest
Token ID1334
Feature activation+1.22e-02

Pos logit contributions
igsaw+0.009
lad+0.008
 newfound+0.007
Em+0.007
reetings+0.007
Neg logit contributions
 elbows-0.007
tails-0.006
 dressed-0.006
 poop-0.006
 covered-0.006
Token of
Token ID286
Feature activation+3.09e-02

Pos logit contributions
 elbows+0.021
 poop+0.020
tails+0.020
 smoke+0.019
 marks+0.019
Neg logit contributions
igsaw-0.020
lad-0.016
 importantly-0.015
 newfound-0.015
Em-0.015
Token the
Token ID262
Feature activation+3.62e-03

Pos logit contributions
 elbows+0.031
tails+0.029
 poop+0.029
 covered+0.028
 marks+0.028
Neg logit contributions
igsaw-0.071
lad-0.062
 newfound-0.060
Em-0.060
 importantly-0.059
Token summer
Token ID3931
Feature activation+3.34e-02

Pos logit contributions
 elbows+0.005
tails+0.005
 dressed+0.004
 covered+0.004
 poop+0.004
Neg logit contributions
igsaw-0.007
lad-0.006
 newfound-0.006
Em-0.006
 importantly-0.006
Token.
Token ID13
Feature activation-1.39e-02

Pos logit contributions
 elbows+0.043
tails+0.040
 poop+0.036
w+0.036
 dressed+0.035
Neg logit contributions
igsaw-0.070
Em-0.058
 importantly-0.057
lad-0.056
 newfound-0.056
Token The
Token ID383
Feature activation-1.27e-02

Pos logit contributions
igsaw+0.028
reetings+0.024
 newfound+0.024
lad+0.024
Em+0.024
Neg logit contributions
 made-0.016
 covered-0.015
 when-0.015
 dressed-0.015
 water-0.015
Token sun
Token ID4252
Feature activation+2.83e-03

Pos logit contributions
igsaw+0.022
lad+0.019
Em+0.018
 newfound+0.018
reetings+0.018
Neg logit contributions
 elbows-0.020
 poop-0.019
 covered-0.019
 marks-0.019
tails-0.019
Token was
Token ID373
Feature activation+1.00e-02

Pos logit contributions
 elbows+0.004
tails+0.004
 poop+0.004
 marks+0.004
w+0.003
Neg logit contributions
igsaw-0.005
 newfound-0.005
lad-0.004
Em-0.004
 importantly-0.004
Token hot
Token ID3024
Feature activation+6.11e-02

Pos logit contributions
 elbows+0.014
tails+0.013
 poop+0.013
 marks+0.013
w+0.012
Neg logit contributions
igsaw-0.018
lad-0.016
 newfound-0.016
 importantly-0.016
Em-0.016
Token clouds
Token ID15114
Feature activation+1.36e-02

Pos logit contributions
 elbows+0.011
 smoke+0.010
 poop+0.010
 covered+0.010
 made+0.010
Neg logit contributions
igsaw-0.017
lad-0.015
Em-0.014
reetings-0.014
 importantly-0.014
Token.
Token ID13
Feature activation-3.05e-03

Pos logit contributions
 elbows+0.015
 poop+0.013
tails+0.013
 marks+0.012
w+0.012
Neg logit contributions
igsaw-0.031
lad-0.026
Em-0.026
 newfound-0.025
reetings-0.025
Token The
Token ID383
Feature activation-9.66e-03

Pos logit contributions
igsaw+0.006
lad+0.005
 newfound+0.005
reetings+0.005
 importantly+0.005
Neg logit contributions
 dressed-0.004
 poop-0.004
 smoke-0.004
 elbows-0.004
 covered-0.004
Token helpers
Token ID49385
Feature activation+1.69e-02

Pos logit contributions
igsaw+0.018
lad+0.016
Em+0.015
 newfound+0.015
reetings+0.015
Neg logit contributions
 elbows-0.014
 poop-0.013
 covered-0.013
tails-0.013
 marks-0.013
Token worked
Token ID3111
Feature activation-5.62e-03

Pos logit contributions
 elbows+0.021
 poop+0.019
tails+0.018
 marks+0.017
w+0.017
Neg logit contributions
igsaw-0.035
lad-0.031
 newfound-0.030
Em-0.030
reetings-0.030
Token hard
Token ID1327
Feature activation+2.35e-02

Pos logit contributions
igsaw+0.011
lad+0.009
Em+0.009
 newfound+0.009
reetings+0.009
Neg logit contributions
 elbows-0.008
 poop-0.007
tails-0.007
 covered-0.007
 marks-0.007
Token and
Token ID290
Feature activation-4.04e-04

Pos logit contributions
 elbows+0.029
 poop+0.026
tails+0.026
 covered+0.024
w+0.024
Neg logit contributions
igsaw-0.052
lad-0.043
 newfound-0.042
Em-0.042
 importantly-0.042
Token used
Token ID973
Feature activation+5.39e-03

Pos logit contributions
igsaw+0.001
lad+0.001
Em+0.001
 newfound+0.001
 importantly+0.001
Neg logit contributions
 elbows-0.000
 poop-0.000
tails-0.000
 marks-0.000
w-0.000
Token lots
Token ID6041
Feature activation-7.73e-03

Pos logit contributions
 elbows+0.006
tails+0.006
w+0.006
 poop+0.006
 covered+0.005
Neg logit contributions
igsaw-0.012
 newfound-0.010
lad-0.010
Em-0.010
 importantly-0.010
Token of
Token ID286
Feature activation-4.83e-03

Pos logit contributions
igsaw+0.013
lad+0.011
Em+0.010
 newfound+0.010
reetings+0.010
Neg logit contributions
 elbows-0.013
 poop-0.012
 covered-0.012
tails-0.012
 marks-0.012
Token bright
Token ID6016
Feature activation-4.55e-03

Pos logit contributions
igsaw+0.010
 newfound+0.008
lad+0.008
 importantly+0.008
Em+0.008
Neg logit contributions
 elbows-0.006
tails-0.006
 poop-0.006
w-0.006
 covered-0.005
Token the
Token ID262
Feature activation-9.65e-03

Pos logit contributions
 elbows+0.017
w+0.016
tails+0.016
 poop+0.015
 marks+0.015
Neg logit contributions
igsaw-0.023
 newfound-0.020
Em-0.020
 importantly-0.020
lad-0.019
Token storm
Token ID6388
Feature activation-9.21e-03

Pos logit contributions
igsaw+0.018
lad+0.015
 newfound+0.015
Em+0.015
 importantly+0.015
Neg logit contributions
 elbows-0.013
tails-0.013
 covered-0.012
 dressed-0.012
 poop-0.012
Token would
Token ID561
Feature activation+6.36e-03

Pos logit contributions
igsaw+0.014
lad+0.014
reetings+0.013
Em+0.013
oke+0.013
Neg logit contributions
 elbows-0.015
tails-0.013
 poop-0.013
 marks-0.012
 smoke-0.012
Token pass
Token ID1208
Feature activation-1.00e-02

Pos logit contributions
 elbows+0.008
w+0.007
tails+0.007
 poop+0.007
 marks+0.007
Neg logit contributions
igsaw-0.012
lad-0.011
 newfound-0.011
reetings-0.011
Em-0.010
Token.
Token ID13
Feature activation+2.21e-02

Pos logit contributions
igsaw+0.021
lad+0.018
 newfound+0.018
Em+0.018
reetings+0.018
Neg logit contributions
 elbows-0.012
tails-0.011
 poop-0.011
 covered-0.010
 marks-0.010
Token
Token ID198
Feature activation+1.43e-02

Pos logit contributions
 elbows+0.028
 poop+0.027
 covered+0.026
 dressed+0.026
 smoke+0.026
Neg logit contributions
igsaw-0.045
lad-0.039
 newfound-0.038
reetings-0.036
Em-0.035
Token
Token ID198
Feature activation+8.53e-03

Pos logit contributions
 elbows+0.020
 poop+0.020
 made+0.019
 marks+0.019
 smoke+0.019
Neg logit contributions
igsaw-0.027
lad-0.023
 newfound-0.021
reetings-0.021
 importantly-0.021
TokenBut
Token ID1537
Feature activation+2.54e-02

Pos logit contributions
 elbows+0.012
 poop+0.011
 covered+0.011
 marks+0.011
tails+0.011
Neg logit contributions
igsaw-0.016
lad-0.014
Em-0.013
 newfound-0.013
reetings-0.013
Token the
Token ID262
Feature activation-5.38e-03

Pos logit contributions
 elbows+0.031
 marks+0.029
tails+0.028
 poop+0.028
 smoke+0.027
Neg logit contributions
igsaw-0.051
Em-0.044
 newfound-0.044
lad-0.043
 importantly-0.043
Token storm
Token ID6388
Feature activation-8.07e-03

Pos logit contributions
igsaw+0.010
lad+0.008
 newfound+0.008
Em+0.008
 importantly+0.008
Neg logit contributions
 elbows-0.008
tails-0.007
 poop-0.007
 covered-0.007
 dressed-0.007
Token didn
Token ID1422
Feature activation+3.90e-02

Pos logit contributions
oke+0.012
lad+0.011
igsaw+0.011
reetings+0.010
 hero+0.010
Neg logit contributions
 elbows-0.014
tails-0.012
 dressed-0.011
 poop-0.011
 gag-0.011
Token all
Token ID477
Feature activation-2.71e-02

Pos logit contributions
mons+0.021
 elbows+0.021
 poop+0.020
ups+0.019
w+0.019
Neg logit contributions
Em-0.025
igsaw-0.023
 newfound-0.022
 million-0.022
 importantly-0.022
Token agreed
Token ID4987
Feature activation-1.14e-02

Pos logit contributions
igsaw+0.051
Em+0.050
 newfound+0.044
 importantly+0.044
lad+0.044
Neg logit contributions
 elbows-0.034
mons-0.032
 poop-0.030
 marks-0.030
tails-0.030
Token that
Token ID326
Feature activation-3.12e-03

Pos logit contributions
igsaw+0.019
 newfound+0.017
Em+0.017
 importantly+0.016
lad+0.016
Neg logit contributions
 elbows-0.019
w-0.018
tails-0.017
mons-0.017
bat-0.016
Token it
Token ID340
Feature activation-3.22e-02

Pos logit contributions
igsaw+0.005
 importantly+0.005
Em+0.005
 newfound+0.005
 pleasure+0.005
Neg logit contributions
w-0.004
 elbows-0.004
tails-0.004
 poop-0.004
ards-0.004
Token was
Token ID373
Feature activation-5.49e-03

Pos logit contributions
igsaw+0.050
Em+0.048
lad+0.048
 million+0.043
 newfound+0.043
Neg logit contributions
 elbows-0.053
tails-0.046
 poop-0.046
 smoke-0.043
mons-0.042
Token time
Token ID640
Feature activation+1.30e-02

Pos logit contributions
igsaw+0.010
lad+0.008
 newfound+0.008
Em+0.008
 importantly+0.008
Neg logit contributions
 elbows-0.008
 poop-0.008
tails-0.008
 marks-0.008
 smoke-0.007
Token to
Token ID284
Feature activation-1.02e-02

Pos logit contributions
 elbows+0.017
 poop+0.015
tails+0.015
 smoke+0.015
 marks+0.015
Neg logit contributions
igsaw-0.026
lad-0.023
 newfound-0.022
 importantly-0.022
Em-0.022
Token go
Token ID467
Feature activation+3.48e-03

Pos logit contributions
igsaw+0.021
 newfound+0.018
lad+0.017
Em+0.017
 pleasure+0.017
Neg logit contributions
 elbows-0.012
w-0.011
mons-0.011
tails-0.011
 poop-0.011
Token home
Token ID1363
Feature activation-2.97e-03

Pos logit contributions
 elbows+0.004
 poop+0.004
 smoke+0.004
tails+0.004
 marks+0.004
Neg logit contributions
igsaw-0.007
lad-0.006
Em-0.006
 newfound-0.006
reetings-0.006
Token.
Token ID13
Feature activation-9.05e-03

Pos logit contributions
igsaw+0.007
lad+0.006
 newfound+0.006
Em+0.006
 importantly+0.006
Neg logit contributions
 elbows-0.003
w-0.003
 poop-0.003
 smoke-0.003
tails-0.003
Token
Token ID198
Feature activation-1.48e-02

Pos logit contributions
igsaw+0.018
lad+0.017
reetings+0.017
 newfound+0.016
 lifetime+0.015
Neg logit contributions
 dressed-0.010
 covered-0.010
 made-0.010
 poop-0.010
 waited-0.010
Token cat
Token ID3797
Feature activation+1.10e-02

Pos logit contributions
igsaw+0.012
lad+0.011
Em+0.010
reetings+0.010
 importantly+0.010
Neg logit contributions
 elbows-0.009
 poop-0.009
 covered-0.009
 marks-0.009
 made-0.009
Token.
Token ID13
Feature activation+9.36e-04

Pos logit contributions
 elbows+0.013
 poop+0.012
tails+0.012
 smoke+0.012
w+0.012
Neg logit contributions
igsaw-0.023
lad-0.020
 newfound-0.019
Em-0.019
 importantly-0.019
Token So
Token ID1406
Feature activation+1.42e-02

Pos logit contributions
 elbows+0.002
 dressed+0.002
 poop+0.002
 smoke+0.002
tails+0.002
Neg logit contributions
igsaw-0.001
lad-0.001
Em-0.001
 importantly-0.001
 newfound-0.001
Token,
Token ID11
Feature activation-7.34e-03

Pos logit contributions
 elbows+0.021
 poop+0.020
 dressed+0.020
 marks+0.019
tails+0.019
Neg logit contributions
igsaw-0.024
 newfound-0.021
 importantly-0.021
lad-0.021
reetings-0.020
Token Tim
Token ID5045
Feature activation+3.33e-03

Pos logit contributions
igsaw+0.007
 newfound+0.007
 importantly+0.007
 Nice+0.007
 hero+0.006
Neg logit contributions
 elbows-0.014
ms-0.014
w-0.014
 dressed-0.014
out-0.014
Token went
Token ID1816
Feature activation-1.74e-02

Pos logit contributions
 elbows+0.005
 smoke+0.004
w+0.004
 poop+0.004
mons+0.004
Neg logit contributions
Em-0.005
reetings-0.005
igsaw-0.005
lad-0.005
 hero-0.005
Token out
Token ID503
Feature activation+8.20e-03

Pos logit contributions
igsaw+0.035
lad+0.030
 newfound+0.029
Em+0.029
reetings+0.029
Neg logit contributions
 elbows-0.022
w-0.021
 smoke-0.020
 marks-0.019
 poop-0.019
Token to
Token ID284
Feature activation+1.57e-02

Pos logit contributions
 elbows+0.008
 poop+0.008
tails+0.008
 marks+0.008
 smoke+0.008
Neg logit contributions
igsaw-0.019
lad-0.016
Em-0.016
 newfound-0.016
 importantly-0.016
Token look
Token ID804
Feature activation-5.33e-02

Pos logit contributions
 elbows+0.016
 poop+0.015
 covered+0.015
tails+0.015
 marks+0.015
Neg logit contributions
igsaw-0.036
lad-0.031
Em-0.030
 newfound-0.030
reetings-0.030
Token for
Token ID329
Feature activation+1.32e-02

Pos logit contributions
igsaw+0.129
lad+0.114
Em+0.109
reetings+0.109
 importantly+0.106
Neg logit contributions
 covered-0.049
 made-0.048
 dressed-0.047
 elbows-0.047
 poop-0.045
Token Sam
Token ID3409
Feature activation+6.36e-03

Pos logit contributions
 elbows+0.016
 poop+0.015
tails+0.015
 covered+0.014
 dressed+0.014
Neg logit contributions
igsaw-0.028
lad-0.024
 newfound-0.023
Em-0.023
 importantly-0.023
Token.
Token ID13
Feature activation-7.03e-03

Pos logit contributions
igsaw+0.004
lad+0.004
joy+0.004
 importantly+0.004
 Nice+0.004
Neg logit contributions
 elbows-0.003
w-0.003
tails-0.003
 poop-0.003
 smoke-0.003
Token\n
Token ID198
Feature activation-3.30e-03

Pos logit contributions
igsaw+0.013
lad+0.011
 newfound+0.011
reetings+0.011
 importantly+0.011
Neg logit contributions
 made-0.010
 covered-0.010
 poop-0.010
 dressed-0.009
 smoke-0.009
Token\n
Token ID198
Feature activation-8.45e-03

Pos logit contributions
igsaw+0.007
lad+0.006
 newfound+0.005
 importantly+0.005
reetings+0.005
Neg logit contributions
 made-0.004
 dressed-0.004
 covered-0.004
 poop-0.004
 shouted-0.004
TokenHe
Token ID1544
Feature activation+9.41e-03

Pos logit contributions
igsaw+0.013
Em+0.011
lad+0.011
reetings+0.010
 importantly+0.010
Neg logit contributions
 elbows-0.015
 poop-0.014
tails-0.014
 marks-0.014
 dressed-0.014
Token decided
Token ID3066
Feature activation-4.50e-03

Pos logit contributions
 elbows+0.010
 poop+0.009
tails+0.009
 marks+0.009
w+0.009
Neg logit contributions
igsaw-0.020
Em-0.018
 newfound-0.018
lad-0.018
 importantly-0.017
Token to
Token ID284
Feature activation-7.57e-03

Pos logit contributions
igsaw+0.009
lad+0.009
Em+0.008
 newfound+0.008
reetings+0.008
Neg logit contributions
 elbows-0.005
tails-0.005
 poop-0.005
w-0.005
 marks-0.004
Token take
Token ID1011
Feature activation-1.89e-02

Pos logit contributions
igsaw+0.016
 newfound+0.013
lad+0.013
Em+0.013
reetings+0.013
Neg logit contributions
 elbows-0.010
 poop-0.008
w-0.008
 marks-0.008
tails-0.008
Token it
Token ID340
Feature activation+3.99e-03

Pos logit contributions
igsaw+0.045
lad+0.039
Em+0.038
 newfound+0.038
reetings+0.038
Neg logit contributions
 elbows-0.018
 poop-0.016
 covered-0.016
tails-0.016
 marks-0.016
Token home
Token ID1363
Feature activation-5.54e-03

Pos logit contributions
 elbows+0.007
w+0.006
 smoke+0.006
 poop+0.006
tails+0.006
Neg logit contributions
igsaw-0.007
 newfound-0.005
lad-0.005
 hero-0.005
Em-0.005
Token.
Token ID13
Feature activation-2.33e-03

Pos logit contributions
igsaw+0.013
lad+0.011
 newfound+0.011
Em+0.011
reetings+0.010
Neg logit contributions
 elbows-0.006
 poop-0.005
tails-0.005
 smoke-0.005
w-0.005
Token When
Token ID1649
Feature activation+6.48e-02

Pos logit contributions
igsaw+0.004
lad+0.003
reetings+0.003
 newfound+0.003
Em+0.003
Neg logit contributions
 covered-0.004
 made-0.004
 smoke-0.004
 dressed-0.004
 poop-0.003
Token hill
Token ID12788
Feature activation-2.84e-02

Pos logit contributions
 elbows+0.000
tails+0.000
 dressed+0.000
 covered+0.000
w+0.000
Neg logit contributions
igsaw-0.000
lad-0.000
 newfound-0.000
Em-0.000
 importantly-0.000
Token.
Token ID13
Feature activation+1.13e-02

Pos logit contributions
igsaw+0.066
lad+0.058
Em+0.056
reetings+0.055
 newfound+0.055
Neg logit contributions
 elbows-0.028
tails-0.025
 poop-0.025
 marks-0.024
 covered-0.024
Token I
Token ID314
Feature activation-1.91e-02

Pos logit contributions
 elbows+0.017
 covered+0.016
 made+0.016
 poop+0.016
 smoke+0.016
Neg logit contributions
igsaw-0.021
lad-0.017
 newfound-0.017
Em-0.016
reetings-0.016
Token met
Token ID1138
Feature activation-2.77e-02

Pos logit contributions
igsaw+0.031
lad+0.029
 newfound+0.029
reetings+0.028
 pleasure+0.028
Neg logit contributions
 elbows-0.030
tails-0.026
 marks-0.024
 smoke-0.024
ards-0.024
Token a
Token ID257
Feature activation-6.95e-03

Pos logit contributions
igsaw+0.061
lad+0.054
Em+0.052
 newfound+0.052
reetings+0.051
Neg logit contributions
 elbows-0.030
 poop-0.027
tails-0.027
 covered-0.027
 marks-0.027
Token new
Token ID649
Feature activation-3.78e-02

Pos logit contributions
igsaw+0.013
lad+0.012
reetings+0.011
 importantly+0.010
Em+0.010
Neg logit contributions
 poop-0.010
 smoke-0.010
 made-0.010
 water-0.009
 covered-0.009
Token friend
Token ID1545
Feature activation-3.74e-02

Pos logit contributions
igsaw+0.063
lad+0.053
Em+0.050
reetings+0.049
 newfound+0.049
Neg logit contributions
 elbows-0.061
 poop-0.059
 covered-0.059
 marks-0.058
 made-0.058
Token,
Token ID11
Feature activation-1.51e-03

Pos logit contributions
igsaw+0.084
lad+0.074
Em+0.072
reetings+0.071
 newfound+0.071
Neg logit contributions
 elbows-0.039
 poop-0.035
tails-0.035
 marks-0.035
 covered-0.035
Token Tom
Token ID4186
Feature activation-2.16e-03

Pos logit contributions
igsaw+0.003
lad+0.003
reetings+0.003
Em+0.003
 newfound+0.003
Neg logit contributions
 covered-0.002
 poop-0.002
 made-0.002
 elbows-0.002
 dressed-0.002
Token.
Token ID13
Feature activation+1.24e-02

Pos logit contributions
igsaw+0.005
lad+0.004
Em+0.004
reetings+0.004
 newfound+0.004
Neg logit contributions
 elbows-0.002
 poop-0.002
tails-0.002
 marks-0.002
 smoke-0.002
Token He
Token ID679
Feature activation-5.74e-03

Pos logit contributions
 poop+0.017
 covered+0.017
 like+0.017
 smoke+0.017
 say+0.017
Neg logit contributions
igsaw-0.023
 newfound-0.020
lad-0.020
 lifetime-0.018
Em-0.018
Token any
Token ID597
Feature activation+1.02e-02

Pos logit contributions
igsaw+0.009
Em+0.008
 newfound+0.008
lad+0.008
 importantly+0.008
Neg logit contributions
 elbows-0.006
tails-0.006
 poop-0.005
w-0.005
 marks-0.005
Token nightmare
Token ID17123
Feature activation-1.22e-02

Pos logit contributions
 elbows+0.015
 poop+0.015
tails+0.014
 covered+0.014
 marks+0.014
Neg logit contributions
igsaw-0.018
lad-0.015
Em-0.015
 newfound-0.015
 importantly-0.015
Token that
Token ID326
Feature activation-3.08e-03

Pos logit contributions
igsaw+0.023
Em+0.021
lad+0.020
 importantly+0.020
 newfound+0.019
Neg logit contributions
 elbows-0.017
tails-0.015
 poop-0.015
w-0.015
 smoke-0.013
Token tried
Token ID3088
Feature activation+7.02e-03

Pos logit contributions
igsaw+0.006
Em+0.006
 newfound+0.006
lad+0.005
 importantly+0.005
Neg logit contributions
 elbows-0.004
tails-0.004
 poop-0.003
w-0.003
 smoke-0.003
Token to
Token ID284
Feature activation+4.53e-03

Pos logit contributions
 elbows+0.008
w+0.008
 poop+0.007
tails+0.007
 covered+0.007
Neg logit contributions
igsaw-0.015
 newfound-0.013
lad-0.013
Em-0.012
 importantly-0.012
Token enter
Token ID3802
Feature activation-6.01e-03

Pos logit contributions
 elbows+0.005
 poop+0.005
tails+0.005
w+0.005
 dressed+0.005
Neg logit contributions
igsaw-0.009
lad-0.008
 newfound-0.008
Em-0.008
reetings-0.008
Token her
Token ID607
Feature activation+1.24e-02

Pos logit contributions
igsaw+0.014
lad+0.012
Em+0.012
 newfound+0.012
 importantly+0.012
Neg logit contributions
 elbows-0.006
 poop-0.005
tails-0.005
w-0.005
 covered-0.005
Token room
Token ID2119
Feature activation+2.44e-02

Pos logit contributions
 elbows+0.017
 poop+0.016
 smoke+0.016
 covered+0.016
tails+0.016
Neg logit contributions
igsaw-0.024
 newfound-0.020
lad-0.019
Em-0.019
 importantly-0.019
Token would
Token ID561
Feature activation+7.52e-03

Pos logit contributions
 elbows+0.028
 poop+0.026
w+0.026
tails+0.025
 smoke+0.024
Neg logit contributions
igsaw-0.054
lad-0.045
 newfound-0.045
Em-0.045
 importantly-0.044
Token be
Token ID307
Feature activation-8.72e-03

Pos logit contributions
 elbows+0.009
 poop+0.008
tails+0.008
w+0.008
 marks+0.008
Neg logit contributions
igsaw-0.016
lad-0.014
 newfound-0.013
Em-0.013
reetings-0.013
Token caught
Token ID4978
Feature activation+4.13e-03

Pos logit contributions
igsaw+0.017
lad+0.015
 newfound+0.015
Em+0.015
reetings+0.014
Neg logit contributions
 elbows-0.011
 poop-0.011
tails-0.010
w-0.010
 marks-0.010
Token looking
Token ID2045
Feature activation-3.76e-02

Pos logit contributions
 elbows+0.014
tails+0.013
 dressed+0.013
 poop+0.013
 covered+0.012
Neg logit contributions
igsaw-0.023
lad-0.019
 newfound-0.019
Em-0.019
 importantly-0.019
Token duck
Token ID22045
Feature activation-6.91e-03

Pos logit contributions
igsaw+0.073
lad+0.062
Em+0.060
 newfound+0.059
reetings+0.059
Neg logit contributions
 elbows-0.051
 poop-0.049
 covered-0.048
 smoke-0.048
 marks-0.048
Token.
Token ID13
Feature activation-3.27e-03

Pos logit contributions
igsaw+0.011
 million+0.010
 importantly+0.010
lad+0.010
 hero+0.010
Neg logit contributions
 elbows-0.011
ups-0.009
mons-0.009
 marks-0.008
 waited-0.008
Token
Token ID220
Feature activation+2.00e-03

Pos logit contributions
igsaw+0.005
lad+0.004
 newfound+0.004
reetings+0.004
Em+0.004
Neg logit contributions
 elbows-0.005
 covered-0.005
 poop-0.005
 smoke-0.005
 made-0.005
Token\n
Token ID198
Feature activation-8.14e-03

Pos logit contributions
 marks+0.003
 poop+0.003
 elbows+0.003
 made+0.003
 smoke+0.003
Neg logit contributions
igsaw-0.004
 newfound-0.003
lad-0.003
Em-0.003
 importantly-0.003
Token\n
Token ID198
Feature activation-1.27e-02

Pos logit contributions
igsaw+0.016
lad+0.013
 newfound+0.013
 importantly+0.013
 pleasure+0.012
Neg logit contributions
 made-0.011
 covered-0.011
 poop-0.011
 elbows-0.011
 shouted-0.010
TokenThe
Token ID464
Feature activation-1.06e-02

Pos logit contributions
igsaw+0.021
lad+0.017
Em+0.017
reetings+0.016
 newfound+0.016
Neg logit contributions
 elbows-0.021
 poop-0.020
 marks-0.020
tails-0.020
 covered-0.020
Token duck
Token ID22045
Feature activation+6.35e-03

Pos logit contributions
igsaw+0.021
lad+0.018
Em+0.017
 newfound+0.017
reetings+0.017
Neg logit contributions
 elbows-0.014
 poop-0.013
 covered-0.013
tails-0.013
 marks-0.013
Token said
Token ID531
Feature activation-3.92e-03

Pos logit contributions
 elbows+0.008
 poop+0.007
tails+0.007
 marks+0.006
mons+0.006
Neg logit contributions
igsaw-0.012
lad-0.011
Em-0.011
 newfound-0.011
 million-0.011
Token,
Token ID11
Feature activation-5.06e-03

Pos logit contributions
igsaw+0.009
lad+0.008
Em+0.008
 newfound+0.008
reetings+0.008
Neg logit contributions
 elbows-0.004
 covered-0.004
 made-0.004
 poop-0.004
 smoke-0.004
Token "
Token ID366
Feature activation-2.37e-03

Pos logit contributions
igsaw+0.007
lad+0.005
reetings+0.005
 newfound+0.005
Em+0.005
Neg logit contributions
 made-0.010
 elbows-0.010
 covered-0.010
 smoke-0.010
 poop-0.010
Token.
Token ID13
Feature activation-1.38e-02

Pos logit contributions
 elbows+0.004
w+0.004
 poop+0.004
tails+0.004
 marks+0.004
Neg logit contributions
igsaw-0.007
 newfound-0.006
Em-0.006
lad-0.006
 lifelong-0.006
Token Lucy
Token ID22162
Feature activation-4.34e-03

Pos logit contributions
igsaw+0.029
lad+0.027
reetings+0.026
 newfound+0.025
joy+0.023
Neg logit contributions
 covered-0.016
 smoke-0.016
 poop-0.016
 made-0.016
 dressed-0.016
Token said
Token ID531
Feature activation-7.79e-03

Pos logit contributions
igsaw+0.009
Em+0.008
lad+0.008
 newfound+0.008
va+0.007
Neg logit contributions
 elbows-0.006
 smoke-0.005
 poop-0.005
 marks-0.004
w-0.004
Token,
Token ID11
Feature activation-9.94e-03

Pos logit contributions
igsaw+0.019
lad+0.016
Em+0.016
reetings+0.016
 newfound+0.015
Neg logit contributions
 made-0.007
 elbows-0.007
 covered-0.007
tails-0.007
 dressed-0.007
Token "
Token ID366
Feature activation-7.66e-03

Pos logit contributions
igsaw+0.022
reetings+0.016
 newfound+0.016
lad+0.016
 importantly+0.016
Neg logit contributions
 made-0.014
 when-0.014
 like-0.013
 covered-0.013
 said-0.013
TokenIt
Token ID1026
Feature activation-5.41e-02

Pos logit contributions
igsaw+0.016
lad+0.013
 newfound+0.013
reetings+0.012
 importantly+0.012
Neg logit contributions
 elbows-0.010
 made-0.010
 dressed-0.009
 poop-0.009
 covered-0.009
Token's
Token ID338
Feature activation+3.69e-02

Pos logit contributions
igsaw+0.117
lad+0.107
Em+0.101
reetings+0.100
 newfound+0.100
Neg logit contributions
 elbows-0.064
 poop-0.054
tails-0.051
 marks-0.051
 dressed-0.050
Token time
Token ID640
Feature activation+1.46e-02

Pos logit contributions
 elbows+0.054
 poop+0.051
 covered+0.050
tails+0.050
 marks+0.050
Neg logit contributions
igsaw-0.069
lad-0.058
Em-0.056
 newfound-0.055
reetings-0.055
Token to
Token ID284
Feature activation-7.25e-03

Pos logit contributions
 elbows+0.018
 poop+0.016
tails+0.016
 smoke+0.016
 marks+0.016
Neg logit contributions
igsaw-0.030
lad-0.027
 newfound-0.025
Em-0.025
reetings-0.025
Token go
Token ID467
Feature activation+4.21e-03

Pos logit contributions
igsaw+0.017
lad+0.015
 newfound+0.015
reetings+0.014
Em+0.014
Neg logit contributions
 elbows-0.007
 poop-0.006
tails-0.006
 smoke-0.006
 marks-0.006
Token home
Token ID1363
Feature activation-4.59e-03

Pos logit contributions
 elbows+0.005
 poop+0.005
 covered+0.005
tails+0.005
 marks+0.005
Neg logit contributions
igsaw-0.009
lad-0.008
Em-0.008
 newfound-0.007
reetings-0.007
Token take
Token ID1011
Feature activation-8.91e-03

Pos logit contributions
 elbows+0.017
tails+0.014
 covered+0.014
w+0.014
 poop+0.014
Neg logit contributions
igsaw-0.023
lad-0.020
reetings-0.020
Em-0.019
 pleasure-0.019
Token a
Token ID257
Feature activation+5.05e-03

Pos logit contributions
igsaw+0.020
lad+0.018
 newfound+0.017
Em+0.017
 importantly+0.017
Neg logit contributions
 elbows-0.009
 covered-0.008
 poop-0.008
tails-0.008
 smoke-0.008
Token nap
Token ID25422
Feature activation+6.00e-03

Pos logit contributions
 poop+0.005
 made+0.005
 spill+0.005
 smoke+0.005
 water+0.005
Neg logit contributions
igsaw-0.011
lad-0.011
reetings-0.010
 importantly-0.010
Em-0.009
Token.
Token ID13
Feature activation-3.46e-03

Pos logit contributions
 elbows+0.006
 poop+0.006
tails+0.005
 smoke+0.005
 marks+0.005
Neg logit contributions
igsaw-0.014
lad-0.012
Em-0.012
 newfound-0.012
reetings-0.012
Token You
Token ID921
Feature activation-3.26e-02

Pos logit contributions
igsaw+0.006
lad+0.005
 newfound+0.005
Em+0.005
 lifetime+0.005
Neg logit contributions
 poop-0.005
 elbows-0.005
 covered-0.005
 dressed-0.005
 smoke-0.005
Token look
Token ID804
Feature activation-6.38e-02

Pos logit contributions
igsaw+0.056
lad+0.048
Em+0.047
 newfound+0.047
reetings+0.046
Neg logit contributions
 elbows-0.051
 poop-0.046
tails-0.045
 marks-0.044
 smoke-0.044
Token tired
Token ID10032
Feature activation+4.82e-02

Pos logit contributions
igsaw+0.132
lad+0.114
Em+0.110
reetings+0.108
 newfound+0.108
Neg logit contributions
 elbows-0.078
 poop-0.074
 covered-0.074
 marks-0.073
 made-0.073
Token.
Token ID13
Feature activation-2.02e-03

Pos logit contributions
 elbows+0.042
 poop+0.039
 covered+0.038
 marks+0.038
tails+0.037
Neg logit contributions
igsaw-0.117
lad-0.103
Em-0.100
 newfound-0.100
reetings-0.099
Token And
Token ID843
Feature activation+1.55e-02

Pos logit contributions
igsaw+0.004
lad+0.003
 newfound+0.003
Em+0.003
 million+0.003
Neg logit contributions
 elbows-0.003
 poop-0.003
 dressed-0.003
 covered-0.003
 made-0.003
Token maybe
Token ID3863
Feature activation-7.93e-03

Pos logit contributions
 elbows+0.020
 poop+0.018
tails+0.018
 marks+0.018
 covered+0.018
Neg logit contributions
igsaw-0.031
lad-0.027
 newfound-0.026
Em-0.026
 importantly-0.026
Token you
Token ID345
Feature activation-7.78e-03

Pos logit contributions
igsaw+0.015
lad+0.013
 importantly+0.013
reetings+0.013
Em+0.013
Neg logit contributions
 elbows-0.010
tails-0.010
 poop-0.010
 covered-0.009
 smoke-0.009
Token girl
Token ID2576
Feature activation-2.77e-02

Pos logit contributions
igsaw+0.018
Em+0.014
lad+0.014
 newfound+0.014
 importantly+0.014
Neg logit contributions
 elbows-0.014
tails-0.013
 dressed-0.013
 covered-0.013
 poop-0.013
Token and
Token ID290
Feature activation-5.48e-03

Pos logit contributions
igsaw+0.056
 newfound+0.049
Em+0.048
 importantly+0.047
lad+0.046
Neg logit contributions
 elbows-0.034
w-0.032
 smoke-0.030
 marks-0.030
 poop-0.030
Token said
Token ID531
Feature activation-1.07e-02

Pos logit contributions
igsaw+0.011
 newfound+0.010
Em+0.010
 importantly+0.010
lad+0.010
Neg logit contributions
 elbows-0.006
 poop-0.006
w-0.006
 marks-0.006
tails-0.005
Token "
Token ID366
Feature activation-1.03e-02

Pos logit contributions
igsaw+0.023
lad+0.019
Em+0.019
 newfound+0.018
reetings+0.018
Neg logit contributions
 made-0.013
 covered-0.012
 elbows-0.012
 dressed-0.012
 smoke-0.012
TokenThat
Token ID2504
Feature activation-3.89e-02

Pos logit contributions
igsaw+0.018
lad+0.016
Em+0.016
reetings+0.015
 newfound+0.015
Neg logit contributions
 elbows-0.016
 poop-0.015
tails-0.015
 marks-0.015
 covered-0.014
Token puppy
Token ID26188
Feature activation-5.45e-02

Pos logit contributions
igsaw+0.070
lad+0.059
 million+0.059
 newfound+0.058
Em+0.058
Neg logit contributions
 elbows-0.062
 dressed-0.048
 poop-0.048
mons-0.048
 covered-0.047
Token looks
Token ID3073
Feature activation-4.71e-02

Pos logit contributions
igsaw+0.102
lad+0.091
 newfound+0.084
reetings+0.084
Em+0.083
Neg logit contributions
 elbows-0.082
 poop-0.069
 marks-0.068
 dressed-0.067
 smoke-0.067
Token so
Token ID523
Feature activation+8.25e-03

Pos logit contributions
igsaw+0.099
 newfound+0.089
lad+0.087
 pleasure+0.086
 importantly+0.084
Neg logit contributions
 elbows-0.058
tails-0.051
 marks-0.048
 poop-0.047
w-0.047
Token helpful
Token ID7613
Feature activation-1.23e-02

Pos logit contributions
 elbows+0.014
 poop+0.013
tails+0.013
 marks+0.013
 covered+0.013
Neg logit contributions
igsaw-0.014
lad-0.011
 newfound-0.011
Em-0.011
 importantly-0.011
Token!"
Token ID2474
Feature activation+6.09e-03

Pos logit contributions
igsaw+0.023
lad+0.020
reetings+0.019
Em+0.019
 importantly+0.018
Neg logit contributions
 elbows-0.019
 poop-0.017
tails-0.017
 marks-0.016
 smoke-0.016
Token The
Token ID383
Feature activation-2.16e-02

Pos logit contributions
 elbows+0.009
 covered+0.009
 made+0.009
 poop+0.009
 dressed+0.009
Neg logit contributions
igsaw-0.011
lad-0.009
reetings-0.009
Em-0.008
 newfound-0.008
Token He
Token ID679
Feature activation-1.81e-02

Pos logit contributions
igsaw+0.023
lad+0.019
 newfound+0.019
Em+0.018
reetings+0.018
Neg logit contributions
 covered-0.023
 elbows-0.022
 made-0.022
 smoke-0.022
 poop-0.022
Token had
Token ID550
Feature activation-2.59e-02

Pos logit contributions
igsaw+0.033
lad+0.028
Em+0.028
 newfound+0.027
reetings+0.027
Neg logit contributions
 elbows-0.027
 poop-0.025
 marks-0.024
tails-0.024
 smoke-0.024
Token a
Token ID257
Feature activation+9.90e-03

Pos logit contributions
igsaw+0.050
Em+0.046
 newfound+0.045
 importantly+0.044
 million+0.044
Neg logit contributions
 elbows-0.031
 poop-0.028
tails-0.027
 marks-0.027
ms-0.027
Token j
Token ID474
Feature activation-2.35e-02

Pos logit contributions
 poop+0.013
 made+0.013
 elbows+0.013
 smoke+0.013
tails+0.013
Neg logit contributions
igsaw-0.020
lad-0.018
reetings-0.017
Em-0.016
 importantly-0.016
Tokenolly
Token ID5098
Feature activation-4.89e-03

Pos logit contributions
igsaw+0.028
lad+0.025
Em+0.025
reetings+0.023
 newfound+0.022
Neg logit contributions
 elbows-0.044
 dressed-0.044
 poop-0.044
 covered-0.044
 marks-0.044
Token smile
Token ID8212
Feature activation-5.56e-02

Pos logit contributions
igsaw+0.009
lad+0.008
 newfound+0.008
Em+0.008
 importantly+0.007
Neg logit contributions
 elbows-0.007
 covered-0.006
tails-0.006
 poop-0.006
 marks-0.006
Token and
Token ID290
Feature activation-2.12e-02

Pos logit contributions
igsaw+0.120
lad+0.107
Em+0.104
 newfound+0.104
 importantly+0.104
Neg logit contributions
 elbows-0.061
 poop-0.052
tails-0.049
 marks-0.049
 smoke-0.049
Token was
Token ID373
Feature activation-4.49e-03

Pos logit contributions
igsaw+0.041
 newfound+0.037
 importantly+0.037
lad+0.037
Em+0.036
Neg logit contributions
 elbows-0.026
 poop-0.024
 marks-0.023
w-0.023
 smoke-0.022
Token always
Token ID1464
Feature activation-2.24e-02

Pos logit contributions
igsaw+0.010
lad+0.008
Em+0.008
 newfound+0.008
 importantly+0.008
Neg logit contributions
 elbows-0.005
 poop-0.005
 marks-0.005
tails-0.005
 covered-0.005
Token singing
Token ID13777
Feature activation-1.77e-02

Pos logit contributions
igsaw+0.046
lad+0.040
Em+0.039
 newfound+0.038
reetings+0.038
Neg logit contributions
 elbows-0.028
 poop-0.026
 marks-0.026
tails-0.025
 covered-0.025
Token.
Token ID13
Feature activation-1.37e-02

Pos logit contributions
igsaw+0.037
 newfound+0.032
lad+0.032
Em+0.032
 importantly+0.032
Neg logit contributions
 elbows-0.022
 poop-0.020
w-0.019
 spill-0.018
tails-0.018
Token and
Token ID290
Feature activation-2.50e-02

Pos logit contributions
igsaw+0.017
joy+0.016
 importantly+0.015
lad+0.015
Em+0.015
Neg logit contributions
 elbows-0.014
w-0.014
tails-0.013
 poop-0.012
 smoke-0.012
Token said
Token ID531
Feature activation-1.22e-02

Pos logit contributions
igsaw+0.049
 importantly+0.044
 newfound+0.043
Em+0.043
lad+0.043
Neg logit contributions
 elbows-0.033
w-0.029
 marks-0.029
 poop-0.029
 smoke-0.027
Token,
Token ID11
Feature activation-1.25e-02

Pos logit contributions
igsaw+0.030
Em+0.025
Mic+0.025
 newfound+0.025
 lifetime+0.024
Neg logit contributions
 made-0.011
 like-0.010
 covered-0.010
 when-0.010
 dressed-0.009
Token "
Token ID366
Feature activation-4.60e-03

Pos logit contributions
igsaw+0.020
reetings+0.015
lad+0.015
 newfound+0.014
 importantly+0.014
Neg logit contributions
 made-0.023
 covered-0.022
 dressed-0.022
 poop-0.022
 smoke-0.021
TokenThank
Token ID10449
Feature activation-3.93e-02

Pos logit contributions
igsaw+0.009
lad+0.008
 newfound+0.008
reetings+0.008
 importantly+0.008
Neg logit contributions
 elbows-0.006
 made-0.006
 poop-0.005
 covered-0.005
 dressed-0.005
Token you
Token ID345
Feature activation-5.94e-02

Pos logit contributions
igsaw+0.099
lad+0.087
 newfound+0.087
Em+0.084
reetings+0.084
Neg logit contributions
 elbows-0.030
w-0.028
 made-0.027
 poop-0.027
tails-0.026
Token,
Token ID11
Feature activation-3.23e-02

Pos logit contributions
igsaw+0.147
lad+0.130
 newfound+0.129
reetings+0.129
Em+0.128
Neg logit contributions
 made-0.047
 covered-0.045
tails-0.044
 poop-0.043
 marks-0.042
Token Lily
Token ID20037
Feature activation-3.41e-02

Pos logit contributions
igsaw+0.076
reetings+0.070
lad+0.070
joy+0.067
 newfound+0.066
Neg logit contributions
 made-0.033
 say-0.032
 covered-0.030
 said-0.029
 poop-0.029
Token!
Token ID0
Feature activation-1.78e-02

Pos logit contributions
igsaw+0.071
lad+0.064
Em+0.061
 importantly+0.058
reetings+0.058
Neg logit contributions
 elbows-0.043
 poop-0.037
 dressed-0.036
 smoke-0.035
 marks-0.035
Token I
Token ID314
Feature activation-4.11e-02

Pos logit contributions
igsaw+0.039
 newfound+0.036
 lifetime+0.033
 brand+0.032
 lifelong+0.032
Neg logit contributions
 made-0.017
 when-0.016
 like-0.016
 dressed-0.016
 say-0.015
Token will
Token ID481
Feature activation+1.30e-02

Pos logit contributions
igsaw+0.069
lad+0.067
reetings+0.065
 newfound+0.063
 pleasure+0.062
Neg logit contributions
 elbows-0.071
tails-0.057
ups-0.056
 poop-0.053
ards-0.052