TOP ACTIVATIONS
MAX = 0.096

her mom and dad and did what she wanted .
back to my house and enjoy ! â
sat on the hill and enjoyed the view together . All
Benny knew that if he kept trying , he could do
, so he went and got her . Mom
began the game and Jill lasted 5 minutes before she said
. He hopped over and enjoyed the bowl of boun cy
uffy became best friends and did everything together . They played
down the tree and happily enjoyed his tasty snack . The
master of the guitar and enjoyed playing music every day .
didn 't listen and he kept insisting . M
cold , but if he kept searching , he would find
decided to stay friends and enjoy their differences .
that she ran off and did just that . From then
read the newspaper together and enjoyed a mild morning . <|endoftext|>
kids happily agreed and they enjoyed the l oll ipop together
went to the river and enjoyed the sunshine . Then ,
were very brave and you did great !" The
every day and they both enjoyed listening to the waves together
ered over the rock and enjoyed the beautiful rainbow . <|endoftext|>

MIN ACTIVATIONS
MIN = -0.181

" Maybe the hipp o was just tired .
changed . The crocod ile wanted to slap a
One day , the hipp o 's friends asked him
tired . The le opard was searching for a
me ." The hipp o is not angry .
?" The hipp o smiled and said yes
One day , the hipp o saw a little bird
adventure . The le opard was walking through the
a distant place . The hipp o had many friends who
happy ! The hipp o asked Anna and Ben
inside . The hipp o was happy and Lily
here ?" The crocod ile replied , " I
One day , the green oct opus got sick . His
, mom . Maybe the hipp o is not so dull
â The le opard nodded and said ,
us ?" The oct opus did not speak .
. The flaming o nodded its head and
was amazed and asked the hipp o how he knew how
in the jungle . The le opard had many friends ,
; they knew that the hipp o would always be close

INTERVAL 0.027 - 0.096
CONTAINS 29.022%

playing with it until it was time to go home .
happily as if he understood and she gave him a nice
. As she wandered around , Lucy tried to guess what
said he would try to like the museum . Lily gave
their money , and always tried to find the cheapest way

INTERVAL -0.043 - 0.027
CONTAINS 56.285%

the bug and put it back on the ground . He
her mom 's help , she started to reach up and
That was the day the weird witch learned that
better and said , " Okay , Tom . I forgive
the air and catches it . He thinks he is very

INTERVAL -0.112 - -0.043
CONTAINS 13.647%

forgot all about his mud p uddle . He was so
, she saw a little fish swimming in the pond .
sandbox . They thank the old lady and Max for the
." They play with the truck , the ball and the
, she saw her friend Tim my with a big bag

INTERVAL -0.181 - -0.112
CONTAINS 1.046%

giant 's cave . The mouse shouted brave ly , "
decided she would paint the lizard . She started
The bright rag made many circles in the
the morning came , the sailor would return to the harbor
was a little boy named Tim my . Tim my loved
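
The four INTERVAL blocks above appear to split the range between MIN and MAX into equal-width bins. That is an inference from the printed numbers, not something the dump states. The short Python sketch below recomputes the bin edges under that assumption and checks that the CONTAINS percentages cover all tokens. The names `interval_edges`, `MIN_ACT`, `MAX_ACT`, and `SHARES` are hypothetical and chosen only for illustration.

```python
# Values copied from the summary above.
MIN_ACT = -0.181
MAX_ACT = 0.096
# "CONTAINS" percentages, listed from the highest interval to the lowest.
SHARES = [29.022, 56.285, 13.647, 1.046]


def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced edges from lo to hi.

    Equal-width binning is an assumption inferred from the listed
    interval boundaries; the dump itself does not say how they were chosen.
    """
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


edges = interval_edges(MIN_ACT, MAX_ACT)
total_share = sum(SHARES)

# Print edges at the same three-decimal precision the dump uses.
print([f"{e:.3f}" for e in edges])
print(f"{total_share:.3f}")
```

Under this assumption each recomputed edge agrees with the listed boundaries (-0.181, -0.112, -0.043, 0.027, 0.096) to within 0.001, and the four percentages sum to 100, so the intervals seem to partition the full activation range.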
Token her
Token ID607
Feature activation-1.10e-02

Pos logit contributions
 common+0.048
 thrilling+0.046
 comparison+0.045
 awesome+0.044
 late+0.044
Neg logit contributions
bee-0.119
ie-0.112
itty-0.108
bees-0.105
owed-0.104
Token mom
Token ID1995
Feature activation-5.98e-02

Pos logit contributions
bee+0.023
itty+0.022
owed+0.021
oke+0.020
op+0.020
Neg logit contributions
 comparison-0.018
 outer-0.018
 fun-0.018
 common-0.017
 our-0.017
Token and
Token ID290
Feature activation+9.03e-03

Pos logit contributions
owed+0.142
bee+0.142
unn+0.134
itty+0.130
mun+0.128
Neg logit contributions
 their-0.079
 our-0.079
 the-0.078
 fun-0.077
 an-0.076
Token dad
Token ID9955
Feature activation-4.52e-02

Pos logit contributions
 comparison+0.014
 outer+0.013
 common+0.013
 awesome+0.013
 thrilling+0.012
Neg logit contributions
bee-0.020
itty-0.019
owed-0.019
m-0.018
op-0.018
Token and
Token ID290
Feature activation+1.88e-02

Pos logit contributions
bee+0.092
itty+0.086
owed+0.084
unn+0.082
oke+0.081
Neg logit contributions
 fun-0.076
 an-0.074
 their-0.074
 comparison-0.074
 our-0.073
Token did
Token ID750
Feature activation+9.62e-02

Pos logit contributions
 comparison+0.031
 outer+0.029
 fun+0.029
 common+0.029
 our+0.028
Neg logit contributions
bee-0.040
itty-0.038
owed-0.037
oke-0.035
op-0.035
Token what
Token ID644
Feature activation+3.42e-02

Pos logit contributions
 comparison+0.130
 outer+0.126
 thrilling+0.122
rous+0.117
 faith+0.111
Neg logit contributions
owed-0.199
bee-0.196
ies-0.180
ie-0.177
bees-0.175
Token she
Token ID673
Feature activation-3.72e-02

Pos logit contributions
 comparison+0.048
 outer+0.047
 common+0.045
 awesome+0.044
 thrilling+0.043
Neg logit contributions
bee-0.078
itty-0.076
owed-0.074
oke-0.071
op-0.070
Token wanted
Token ID2227
Feature activation+4.12e-02

Pos logit contributions
bee+0.072
itty+0.067
owed+0.066
op+0.063
oke+0.062
Neg logit contributions
 fun-0.067
 comparison-0.064
 common-0.064
 trouble-0.063
 awesome-0.063
Token.
Token ID13
Feature activation-6.09e-04

Pos logit contributions
 comparison+0.060
 fun+0.060
 outer+0.058
 our+0.057
 common+0.057
Neg logit contributions
bee-0.093
itty-0.088
owed-0.086
oke-0.083
op-0.081
Token
Token ID198
Feature activation+6.41e-03

Pos logit contributions
bee+0.001
owed+0.001
itty+0.001
oke+0.001
ppy+0.001
Neg logit contributions
 fun-0.001
 comparison-0.001
 our-0.001
 common-0.001
 outer-0.001
Token back
Token ID736
Feature activation-4.02e-03

Pos logit contributions
 comparison+0.056
 outer+0.053
 common+0.052
 fun+0.052
 awesome+0.051
Neg logit contributions
bee-0.073
itty-0.071
owed-0.070
op-0.066
oke-0.066
Token to
Token ID284
Feature activation+5.09e-02

Pos logit contributions
bee+0.008
itty+0.008
owed+0.008
op+0.008
m+0.008
Neg logit contributions
 comparison-0.007
 outer-0.006
 common-0.006
 fun-0.006
 awesome-0.006
Token my
Token ID616
Feature activation-7.98e-03

Pos logit contributions
 comparison+0.059
 thrilling+0.055
 awesome+0.052
 common+0.049
Hours+0.048
Neg logit contributions
bee-0.121
itty-0.121
berry-0.117
ie-0.115
ird-0.113
Token house
Token ID2156
Feature activation-3.32e-02

Pos logit contributions
bee+0.017
itty+0.016
owed+0.016
oke+0.015
op+0.015
Neg logit contributions
 comparison-0.013
 outer-0.012
 common-0.012
 fun-0.012
 our-0.011
Token and
Token ID290
Feature activation+3.95e-02

Pos logit contributions
bee+0.061
itty+0.057
owed+0.056
oke+0.053
unn+0.053
Neg logit contributions
 comparison-0.062
 fun-0.061
 our-0.060
 outer-0.060
 common-0.060
Token enjoy
Token ID2883
Feature activation+9.62e-02

Pos logit contributions
 comparison+0.056
 outer+0.053
 common+0.052
 fun+0.052
 awesome+0.050
Neg logit contributions
bee-0.091
itty-0.088
owed-0.087
oke-0.082
op-0.082
Token!
Token ID0
Feature activation+1.73e-03

Pos logit contributions
 comparison+0.100
 thrilling+0.092
Hours+0.084
 common+0.082
 backup+0.082
Neg logit contributions
bee-0.242
berry-0.220
ie-0.217
itty-0.216
bees-0.214
Tokenâ
Token ID22940
Feature activation-4.16e-02

Pos logit contributions
 our+0.002
 fun+0.002
 comparison+0.002
 outer+0.002
 an+0.002
Neg logit contributions
bee-0.004
owed-0.004
itty-0.004
ops-0.004
bees-0.004
Token
Token ID26391
Feature activation-6.85e-02

Pos logit contributions
bee+0.086
itty+0.082
op+0.078
owed+0.077
unn+0.077
Neg logit contributions
 common-0.065
 fun-0.064
 awesome-0.063
 comparison-0.063
 our-0.061
Token
Token ID198
Feature activation+8.06e-03

Pos logit contributions
bee+0.143
owed+0.131
itty+0.131
unn+0.125
ie+0.122
Neg logit contributions
 fun-0.113
 our-0.110
 comparison-0.108
 common-0.108
 trouble-0.108
Token
Token ID198
Feature activation-9.35e-04

Pos logit contributions
 comparison+0.011
 outer+0.011
 fun+0.010
 common+0.010
 our+0.010
Neg logit contributions
bee-0.019
itty-0.018
owed-0.018
oke-0.017
op-0.017
Token sat
Token ID3332
Feature activation+5.30e-02

Pos logit contributions
 fun+0.021
 comparison+0.021
 outer+0.020
 thrilling+0.020
 an+0.020
Neg logit contributions
bee-0.023
itty-0.022
oke-0.021
owed-0.021
op-0.020
Token on
Token ID319
Feature activation+5.52e-02

Pos logit contributions
 comparison+0.082
 outer+0.078
 common+0.078
 fun+0.076
 awesome+0.075
Neg logit contributions
bee-0.115
itty-0.111
owed-0.111
m-0.104
op-0.103
Token the
Token ID262
Feature activation-2.56e-04

Pos logit contributions
 comparison+0.064
 awesome+0.060
 unnecessary+0.057
 rewarding+0.056
 gonna+0.056
Neg logit contributions
bee-0.119
berry-0.115
bird-0.114
flower-0.112
itty-0.111
Token hill
Token ID12788
Feature activation-4.56e-02

Pos logit contributions
bee+0.001
itty+0.001
owed+0.001
oke+0.001
op+0.001
Neg logit contributions
 comparison-0.000
 fun-0.000
 outer-0.000
 awesome-0.000
 common-0.000
Token and
Token ID290
Feature activation+3.87e-02

Pos logit contributions
bee+0.092
itty+0.087
owed+0.085
oke+0.082
unn+0.080
Neg logit contributions
 comparison-0.077
 fun-0.075
 outer-0.075
 common-0.073
 our-0.072
Token enjoyed
Token ID8359
Feature activation+9.60e-02

Pos logit contributions
 comparison+0.073
 common+0.070
 outer+0.069
 trouble+0.069
 awesome+0.068
Neg logit contributions
bee-0.071
owed-0.068
itty-0.067
m-0.062
op-0.061
Token the
Token ID262
Feature activation-2.07e-03

Pos logit contributions
 comparison+0.099
 common+0.099
 backup+0.097
 frustrating+0.089
 impossible+0.087
Neg logit contributions
bee-0.234
owed-0.216
m-0.212
bees-0.210
flower-0.207
Token view
Token ID1570
Feature activation-3.86e-02

Pos logit contributions
bee+0.006
itty+0.006
owed+0.006
oke+0.006
ppy+0.006
Neg logit contributions
 fun-0.002
 comparison-0.002
 outer-0.002
 awesome-0.001
 thrilling-0.001
Token together
Token ID1978
Feature activation+2.57e-02

Pos logit contributions
bee+0.077
itty+0.073
owed+0.070
oke+0.067
unn+0.067
Neg logit contributions
 fun-0.069
 comparison-0.067
 outer-0.066
 awesome-0.065
 thrilling-0.064
Token.
Token ID13
Feature activation+1.18e-02

Pos logit contributions
 comparison+0.056
 outer+0.054
 fun+0.054
 common+0.054
 our+0.053
Neg logit contributions
bee-0.040
itty-0.037
owed-0.037
oke-0.034
op-0.033
Token All
Token ID1439
Feature activation+8.52e-03

Pos logit contributions
 comparison+0.018
 outer+0.017
 common+0.017
 fun+0.017
 awesome+0.016
Neg logit contributions
bee-0.026
itty-0.025
owed-0.025
oke-0.024
op-0.023
Token Benny
Token ID44275
Feature activation-7.13e-02

Pos logit contributions
 comparison+0.044
 common+0.042
 outer+0.041
 awesome+0.039
 thrilling+0.039
Neg logit contributions
bee-0.050
itty-0.048
m-0.046
owed-0.046
ie-0.045
Token knew
Token ID2993
Feature activation+6.00e-02

Pos logit contributions
bee+0.137
itty+0.132
owed+0.131
oke+0.122
op+0.120
Neg logit contributions
 comparison-0.125
 fun-0.121
 outer-0.120
 common-0.119
 our-0.117
Token that
Token ID326
Feature activation+3.12e-02

Pos logit contributions
 comparison+0.095
 common+0.091
 outer+0.083
 awesome+0.080
 thrilling+0.079
Neg logit contributions
bee-0.126
itty-0.124
owed-0.121
m-0.119
op-0.117
Token if
Token ID611
Feature activation+4.37e-02

Pos logit contributions
 common+0.049
 comparison+0.047
 outer+0.046
 awesome+0.044
 thrilling+0.043
Neg logit contributions
bee-0.067
itty-0.061
m-0.060
owed-0.060
op-0.059
Token he
Token ID339
Feature activation-1.17e-02

Pos logit contributions
 comparison+0.074
 common+0.069
 awesome+0.069
 outer+0.068
 thrilling+0.068
Neg logit contributions
bee-0.080
m-0.075
unn-0.075
itty-0.074
ie-0.073
Token kept
Token ID4030
Feature activation+9.58e-02

Pos logit contributions
bee+0.023
itty+0.023
owed+0.023
oke+0.022
pp+0.021
Neg logit contributions
 fun-0.019
 comparison-0.019
 common-0.018
 trouble-0.018
 awesome-0.018
Token trying
Token ID2111
Feature activation+8.18e-02

Pos logit contributions
 comparison+0.158
 common+0.156
 outer+0.155
 proper+0.146
 thrilling+0.145
Neg logit contributions
bee-0.182
m-0.157
op-0.148
flower-0.147
ie-0.146
Token,
Token ID11
Feature activation+2.62e-02

Pos logit contributions
 comparison+0.164
 outer+0.156
 common+0.155
 awesome+0.147
 late+0.145
Neg logit contributions
bee-0.125
owed-0.110
ie-0.109
m-0.107
op-0.107
Token he
Token ID339
Feature activation-1.10e-02

Pos logit contributions
 common+0.044
 comparison+0.044
 outer+0.043
 awesome+0.041
 thrilling+0.039
Neg logit contributions
bee-0.050
itty-0.046
m-0.045
op-0.045
owed-0.044
Token could
Token ID714
Feature activation+6.69e-02

Pos logit contributions
bee+0.020
itty+0.019
owed+0.018
oke+0.017
op+0.017
Neg logit contributions
 comparison-0.021
 fun-0.021
 outer-0.020
 common-0.020
 our-0.020
Token do
Token ID466
Feature activation+6.53e-02

Pos logit contributions
 comparison+0.085
 outer+0.081
 fun+0.080
 common+0.079
 our+0.077
Neg logit contributions
bee-0.163
itty-0.158
owed-0.156
oke-0.149
op-0.147
Token,
Token ID11
Feature activation+2.15e-02

Pos logit contributions
bee+0.082
unn+0.080
itty+0.080
owed+0.075
affe+0.075
Neg logit contributions
 their-0.076
 his-0.075
 fun-0.075
 an-0.074
 our-0.073
Token so
Token ID523
Feature activation+2.39e-02

Pos logit contributions
 fun+0.039
 our+0.039
 an+0.038
 comparison+0.038
 outer+0.038
Neg logit contributions
bee-0.041
owed-0.039
itty-0.038
unn-0.036
ops-0.036
Token he
Token ID339
Feature activation-2.32e-02

Pos logit contributions
 comparison+0.033
 outer+0.031
 common+0.031
 fun+0.031
 our+0.030
Neg logit contributions
bee-0.057
itty-0.054
owed-0.054
oke-0.051
op-0.051
Token went
Token ID1816
Feature activation+5.51e-02

Pos logit contributions
bee+0.046
itty+0.044
owed+0.043
oke+0.041
op+0.040
Neg logit contributions
 fun-0.039
 comparison-0.039
 our-0.038
 outer-0.038
 common-0.038
Token and
Token ID290
Feature activation+4.08e-02

Pos logit contributions
 comparison+0.092
 awesome+0.082
 thrilling+0.081
 trouble+0.079
 outer+0.079
Neg logit contributions
m-0.109
bee-0.107
itty-0.106
owed-0.106
op-0.102
Token got
Token ID1392
Feature activation+9.58e-02

Pos logit contributions
 comparison+0.076
 outer+0.072
 common+0.071
 fun+0.071
 awesome+0.070
Neg logit contributions
bee-0.077
itty-0.073
owed-0.073
oke-0.067
op-0.067
Token her
Token ID607
Feature activation-1.25e-02

Pos logit contributions
 comparison+0.151
 common+0.142
 outer+0.132
 thrilling+0.127
 purpose+0.123
Neg logit contributions
m-0.201
op-0.188
owed-0.187
ie-0.183
ird-0.183
Token.
Token ID13
Feature activation+7.72e-03

Pos logit contributions
bee+0.025
itty+0.023
owed+0.023
op+0.022
m+0.021
Neg logit contributions
 comparison-0.022
 outer-0.021
 common-0.020
 trouble-0.020
 thrilling-0.020
Token
Token ID198
Feature activation+1.56e-02

Pos logit contributions
 comparison+0.012
 outer+0.011
 common+0.011
 fun+0.011
 awesome+0.011
Neg logit contributions
bee-0.017
itty-0.016
owed-0.016
oke-0.015
op-0.015
Token
Token ID198
Feature activation+1.72e-03

Pos logit contributions
 comparison+0.023
 fun+0.022
 common+0.022
 outer+0.022
 our+0.022
Neg logit contributions
bee-0.035
owed-0.033
itty-0.033
oke-0.031
unn-0.031
TokenMom
Token ID29252
Feature activation-6.08e-02

Pos logit contributions
 comparison+0.003
 outer+0.002
 common+0.002
 trouble+0.002
 fun+0.002
Neg logit contributions
bee-0.004
itty-0.004
owed-0.003
op-0.003
m-0.003
Token began
Token ID2540
Feature activation+7.50e-02

Pos logit contributions
bee+0.017
itty+0.017
owed+0.016
oke+0.016
unn+0.016
Neg logit contributions
 comparison-0.012
 fun-0.012
 outer-0.012
 common-0.012
 our-0.011
Token the
Token ID262
Feature activation-1.13e-02

Pos logit contributions
 comparison+0.102
 awesome+0.100
 outer+0.099
 ours+0.095
 common+0.094
Neg logit contributions
bee-0.174
op-0.160
itty-0.160
owed-0.158
bird-0.156
Token game
Token ID983
Feature activation-3.15e-02

Pos logit contributions
bee+0.026
itty+0.025
owed+0.024
oke+0.024
op+0.023
Neg logit contributions
 comparison-0.016
 outer-0.015
 common-0.015
 our-0.014
 fun-0.014
Token and
Token ID290
Feature activation+3.55e-02

Pos logit contributions
bee+0.057
itty+0.055
owed+0.055
op+0.052
oke+0.051
Neg logit contributions
 comparison-0.059
 outer-0.057
 common-0.056
 our-0.055
 awesome-0.054
Token Jill
Token ID22772
Feature activation-6.78e-02

Pos logit contributions
 comparison+0.049
 outer+0.047
 common+0.046
 awesome+0.043
 trouble+0.043
Neg logit contributions
bee-0.083
owed-0.079
itty-0.079
oke-0.075
m-0.074
Token lasted
Token ID15436
Feature activation+9.57e-02

Pos logit contributions
bee+0.127
itty+0.127
owed+0.121
unn+0.120
erry+0.115
Neg logit contributions
 fun-0.117
 comparison-0.112
 an-0.112
 their-0.110
 great-0.109
Token 5
Token ID642
Feature activation-4.05e-02

Pos logit contributions
 comparison+0.167
 common+0.160
 outer+0.158
rous+0.155
 ours+0.146
Neg logit contributions
bee-0.166
owed-0.155
ppy-0.150
ie-0.145
ist-0.145
Token minutes
Token ID2431
Feature activation+1.05e-03

Pos logit contributions
bee+0.087
itty+0.085
owed+0.084
unn+0.078
oke+0.078
Neg logit contributions
 comparison-0.062
 fun-0.061
 common-0.059
 outer-0.059
 trouble-0.058
Token before
Token ID878
Feature activation+3.51e-02

Pos logit contributions
 comparison+0.002
 outer+0.002
 common+0.002
 fun+0.002
 our+0.002
Neg logit contributions
bee-0.002
itty-0.002
owed-0.002
oke-0.002
op-0.002
Token she
Token ID673
Feature activation-8.93e-03

Pos logit contributions
 outer+0.056
 comparison+0.055
 awesome+0.053
 common+0.052
rous+0.051
Neg logit contributions
bee-0.071
m-0.065
ops-0.064
op-0.064
itty-0.063
Token said
Token ID531
Feature activation+4.45e-02

Pos logit contributions
bee+0.017
itty+0.017
owed+0.016
unn+0.015
oke+0.015
Neg logit contributions
 comparison-0.015
 fun-0.015
 common-0.015
 outer-0.015
 trouble-0.015
Token.
Token ID13
Feature activation+8.41e-03

Pos logit contributions
 comparison+0.068
 outer+0.065
 fun+0.065
 our+0.064
 common+0.064
Neg logit contributions
bee-0.064
itty-0.062
owed-0.061
oke-0.056
op-0.055
Token He
Token ID679
Feature activation-6.91e-03

Pos logit contributions
 comparison+0.011
 fun+0.011
 outer+0.010
 trouble+0.010
 common+0.010
Neg logit contributions
bee-0.021
itty-0.019
owed-0.019
oke-0.018
ppy-0.018
Token hopped
Token ID45230
Feature activation+5.39e-02

Pos logit contributions
bee+0.013
itty+0.013
owed+0.013
oke+0.012
op+0.012
Neg logit contributions
 fun-0.012
 comparison-0.012
 common-0.011
 outer-0.011
 awesome-0.011
Token over
Token ID625
Feature activation+5.51e-02

Pos logit contributions
 comparison+0.103
 outer+0.100
 common+0.094
 awesome+0.091
 thrilling+0.090
Neg logit contributions
m-0.095
bee-0.095
y-0.088
ie-0.088
itty-0.088
Token and
Token ID290
Feature activation+4.89e-02

Pos logit contributions
 comparison+0.104
 common+0.100
 outer+0.099
 awesome+0.097
 ours+0.093
Neg logit contributions
m-0.095
bee-0.094
itty-0.093
berry-0.090
op-0.087
Token enjoyed
Token ID8359
Feature activation+9.56e-02

Pos logit contributions
 comparison+0.097
 common+0.093
 outer+0.092
 awesome+0.089
 trouble+0.089
Neg logit contributions
bee-0.088
itty-0.083
owed-0.083
ppy-0.076
m-0.076
Token the
Token ID262
Feature activation-5.84e-03

Pos logit contributions
 outer+0.128
 thrilling+0.128
 common+0.127
 comparison+0.124
 backup+0.123
Neg logit contributions
bee-0.198
m-0.178
bees-0.170
berry-0.169
ie-0.168
Token bowl
Token ID9396
Feature activation-8.27e-02

Pos logit contributions
bee+0.013
m+0.013
itty+0.012
owed+0.012
ops+0.012
Neg logit contributions
 outer-0.008
 comparison-0.008
 common-0.007
 backup-0.007
 thrilling-0.007
Token of
Token ID286
Feature activation+3.03e-02

Pos logit contributions
bee+0.137
itty+0.129
owed+0.127
oke+0.119
op+0.117
Neg logit contributions
 comparison-0.169
 outer-0.164
 fun-0.164
 common-0.162
 our-0.160
Token boun
Token ID31283
Feature activation-3.90e-02

Pos logit contributions
 outer+0.058
 comparison+0.056
 backup+0.056
 common+0.053
 thrilling+0.053
Neg logit contributions
bee-0.050
ie-0.046
m-0.046
berry-0.045
mun-0.043
Tokency
Token ID948
Feature activation-7.69e-02

Pos logit contributions
bee+0.039
itty+0.034
owed+0.034
oke+0.030
op+0.029
Neg logit contributions
 comparison-0.107
 outer-0.104
 fun-0.103
 common-0.103
 our-0.102
Tokenuffy
Token ID15352
Feature activation-7.63e-02

Pos logit contributions
bee+0.084
oke+0.071
aron+0.070
affe+0.070
cut+0.064
Neg logit contributions
 common-0.148
 our-0.146
 fun-0.145
 their-0.143
 great-0.142
Token became
Token ID2627
Feature activation+8.32e-02

Pos logit contributions
itty+0.173
oke+0.168
bee+0.167
op+0.162
owed+0.160
Neg logit contributions
 fun-0.109
 their-0.103
 great-0.100
 an-0.097
 comparison-0.094
Token best
Token ID1266
Feature activation+3.98e-02

Pos logit contributions
 comparison+0.115
 outer+0.113
 common+0.104
 thrilling+0.102
 sight+0.098
Neg logit contributions
bee-0.191
itty-0.185
owed-0.174
m-0.173
flower-0.171
Token friends
Token ID2460
Feature activation-2.79e-02

Pos logit contributions
 fun+0.064
 trouble+0.061
 our+0.059
 comparison+0.059
 common+0.058
Neg logit contributions
bee-0.088
owed-0.084
itty-0.080
oke-0.078
pp-0.078
Token and
Token ID290
Feature activation+3.63e-02

Pos logit contributions
bee+0.057
oke+0.051
unn+0.051
itty+0.051
owed+0.051
Neg logit contributions
 fun-0.049
 an-0.048
 their-0.047
 comparison-0.046
 its-0.046
Token did
Token ID750
Feature activation+9.56e-02

Pos logit contributions
 fun+0.054
 their+0.052
 our+0.051
 comparison+0.051
 outer+0.050
Neg logit contributions
bee-0.079
itty-0.077
owed-0.077
oke-0.076
op-0.074
Token everything
Token ID2279
Feature activation-3.29e-03

Pos logit contributions
 comparison+0.105
 outer+0.100
 common+0.093
 late+0.092
 backup+0.091
Neg logit contributions
bee-0.242
owed-0.237
ie-0.224
ops-0.215
bees-0.213
Token together
Token ID1978
Feature activation+2.54e-02

Pos logit contributions
bee+0.008
itty+0.007
owed+0.007
oke+0.007
op+0.007
Neg logit contributions
 comparison-0.005
 fun-0.004
 outer-0.004
 our-0.004
 common-0.004
Token.
Token ID13
Feature activation+6.87e-03

Pos logit contributions
 comparison+0.050
 fun+0.049
 outer+0.049
 common+0.048
 our+0.048
Neg logit contributions
bee-0.044
itty-0.042
owed-0.041
oke-0.038
unn-0.038
Token They
Token ID1119
Feature activation+2.63e-03

Pos logit contributions
 fun+0.008
 comparison+0.008
 our+0.008
 their+0.008
 outer+0.007
Neg logit contributions
bee-0.018
owed-0.017
itty-0.016
op-0.016
itive-0.016
Token played
Token ID2826
Feature activation+6.41e-02

Pos logit contributions
 fun+0.004
 comparison+0.004
 their+0.004
 outer+0.004
 an+0.004
Neg logit contributions
itty-0.006
bee-0.006
owed-0.006
oke-0.006
op-0.005
Token down
Token ID866
Feature activation+3.75e-02

Pos logit contributions
 comparison+0.008
 outer+0.007
 common+0.007
 fun+0.007
 awesome+0.007
Neg logit contributions
bee-0.016
itty-0.015
owed-0.015
op-0.014
m-0.014
Token the
Token ID262
Feature activation-1.14e-02

Pos logit contributions
 comparison+0.052
 awesome+0.049
 outer+0.048
 common+0.047
 fantastic+0.044
Neg logit contributions
itty-0.085
m-0.083
owed-0.079
bee-0.079
bird-0.078
Token tree
Token ID5509
Feature activation-7.07e-02

Pos logit contributions
bee+0.022
owed+0.021
itty+0.021
oke+0.019
op+0.019
Neg logit contributions
 comparison-0.021
 fun-0.020
 outer-0.020
 common-0.020
 our-0.020
Token and
Token ID290
Feature activation+4.03e-02

Pos logit contributions
bee+0.124
itty+0.115
owed+0.112
oke+0.108
unn+0.105
Neg logit contributions
 comparison-0.138
 fun-0.136
 outer-0.134
 our-0.131
 common-0.131
Token happily
Token ID18177
Feature activation+5.40e-03

Pos logit contributions
 comparison+0.073
 outer+0.070
 common+0.070
 fun+0.069
 awesome+0.068
Neg logit contributions
bee-0.077
itty-0.074
owed-0.073
oke-0.068
op-0.067
Token enjoyed
Token ID8359
Feature activation+9.55e-02

Pos logit contributions
 fun+0.008
 straight+0.007
 free+0.007
 an+0.007
 far+0.007
Neg logit contributions
bee-0.012
itty-0.011
oke-0.011
owers-0.011
itive-0.011
Token his
Token ID465
Feature activation+4.58e-03

Pos logit contributions
 common+0.130
 comparison+0.122
 complicated+0.121
 outer+0.119
 frustrating+0.114
Neg logit contributions
bee-0.190
m-0.173
mun-0.167
es-0.165
itty-0.165
Token tasty
Token ID25103
Feature activation-9.46e-03

Pos logit contributions
 comparison+0.007
 outer+0.006
 common+0.006
 our+0.006
 trouble+0.006
Neg logit contributions
bee-0.011
itty-0.010
owed-0.010
m-0.009
op-0.009
Token snack
Token ID26906
Feature activation-2.92e-02

Pos logit contributions
bee+0.019
itty+0.018
owed+0.018
oke+0.017
op+0.017
Neg logit contributions
 comparison-0.016
 outer-0.015
 fun-0.015
 common-0.015
 our-0.015
Token.
Token ID13
Feature activation+4.74e-03

Pos logit contributions
bee+0.055
itty+0.053
owed+0.050
oke+0.050
unn+0.049
Neg logit contributions
 fun-0.054
 comparison-0.052
 awesome-0.051
 outer-0.050
 an-0.050
Token The
Token ID383
Feature activation-2.26e-02

Pos logit contributions
 comparison+0.006
 fun+0.006
 outer+0.006
 our+0.006
 common+0.006
Neg logit contributions
bee-0.012
itty-0.011
owed-0.011
oke-0.010
op-0.010
Token master
Token ID4958
Feature activation-4.85e-02

Pos logit contributions
 comparison+0.037
 outer+0.035
 common+0.034
 trouble+0.033
 fun+0.033
Neg logit contributions
bee-0.056
itty-0.054
owed-0.051
op-0.050
oke-0.050
Token of
Token ID286
Feature activation+4.06e-02

Pos logit contributions
bee+0.094
owed+0.092
unn+0.090
itty+0.089
owers+0.083
Neg logit contributions
 our-0.082
 fun-0.080
 an-0.080
 comparison-0.079
 awesome-0.078
Token the
Token ID262
Feature activation-2.58e-02

Pos logit contributions
 comparison+0.057
 outer+0.053
 common+0.053
 thrilling+0.050
 awesome+0.050
Neg logit contributions
bee-0.091
itty-0.090
bird-0.083
m-0.080
bug-0.079
Token guitar
Token ID10047
Feature activation-8.13e-02

Pos logit contributions
bee+0.055
itty+0.054
m+0.049
op+0.049
flower+0.048
Neg logit contributions
 comparison-0.041
 outer-0.040
 common-0.038
 our-0.037
 awesome-0.037
Token and
Token ID290
Feature activation+2.63e-02

Pos logit contributions
bee+0.148
itty+0.140
owed+0.140
unn+0.133
oke+0.132
Neg logit contributions
 fun-0.148
 comparison-0.148
 outer-0.146
 our-0.144
 an-0.141
Token enjoyed
Token ID8359
Feature activation+9.50e-02

Pos logit contributions
 comparison+0.034
 outer+0.032
 common+0.031
 awesome+0.030
 fun+0.030
Neg logit contributions
bee-0.065
itty-0.062
owed-0.061
oke-0.058
op-0.057
Token playing
Token ID2712
Feature activation+4.92e-02

Pos logit contributions
 comparison+0.123
 impossible+0.115
 ours+0.113
 frustrating+0.112
 common+0.108
Neg logit contributions
bee-0.210
es-0.190
mun-0.183
itive-0.181
bees-0.178
Token music
Token ID2647
Feature activation-1.49e-02

Pos logit contributions
 comparison+0.087
 outer+0.079
 common+0.078
 thrilling+0.076
 awesome+0.076
Neg logit contributions
bee-0.097
itty-0.089
owed-0.088
op-0.084
ird-0.083
Token every
Token ID790
Feature activation+3.48e-02

Pos logit contributions
bee+0.031
itty+0.030
owed+0.029
oke+0.028
unn+0.028
Neg logit contributions
 fun-0.024
 comparison-0.024
 outer-0.023
 common-0.023
 our-0.023
Token day
Token ID1110
Feature activation-2.14e-03

Pos logit contributions
 comparison+0.059
 common+0.056
 outer+0.056
 awesome+0.055
 fun+0.055
Neg logit contributions
bee-0.069
itty-0.068
owed-0.063
m-0.062
ne-0.060
Token.
Token ID13
Feature activation+1.02e-02

Pos logit contributions
bee+0.003
itty+0.003
owed+0.003
oke+0.003
op+0.003
Neg logit contributions
 comparison-0.005
 outer-0.004
 fun-0.004
 common-0.004
 our-0.004
Token didn
Token ID1422
Feature activation+7.49e-02

Pos logit contributions
bee+0.141
itty+0.136
unn+0.131
owed+0.130
oke+0.126
Neg logit contributions
 fun-0.123
 trouble-0.121
 an-0.120
 comparison-0.119
 great-0.117
Token't
Token ID470
Feature activation+6.16e-02

Pos logit contributions
 comparison+0.124
 outer+0.120
 common+0.117
 fun+0.115
 sight+0.113
Neg logit contributions
bee-0.151
itty-0.145
op-0.142
owed-0.142
ops-0.137
Token listen
Token ID6004
Feature activation+5.48e-02

Pos logit contributions
 fun+0.081
 comparison+0.081
 common+0.078
 our+0.076
 trouble+0.075
Neg logit contributions
bee-0.147
itty-0.144
owed-0.140
oke-0.135
unn-0.134
Token and
Token ID290
Feature activation+4.68e-02

Pos logit contributions
 comparison+0.126
 outer+0.124
 common+0.119
 awesome+0.118
 our+0.116
Neg logit contributions
bee-0.076
owed-0.072
m-0.068
op-0.065
y-0.063
Token he
Token ID339
Feature activation-7.88e-03

Pos logit contributions
 comparison+0.087
 outer+0.085
 common+0.084
 awesome+0.081
 thrilling+0.079
Neg logit contributions
bee-0.087
owed-0.084
itty-0.082
oke-0.076
m-0.076
Token kept
Token ID4030
Feature activation+9.48e-02

Pos logit contributions
bee+0.016
itty+0.015
owed+0.015
oke+0.015
unn+0.014
Neg logit contributions
 fun-0.013
 trouble-0.012
 an-0.012
 comparison-0.012
 great-0.012
Token insisting
Token ID23630
Feature activation+6.89e-02

Pos logit contributions
 outer+0.148
 comparison+0.147
 common+0.147
 thrilling+0.141
 awesome+0.138
Neg logit contributions
bee-0.189
m-0.181
itty-0.173
ird-0.171
op-0.168
Token.
Token ID13
Feature activation+3.28e-03

Pos logit contributions
 comparison+0.124
 outer+0.122
 common+0.119
 thrilling+0.113
 fun+0.113
Neg logit contributions
bee-0.134
owed-0.124
itty-0.122
op-0.120
oke-0.120
Token
Token ID198
Feature activation+1.26e-02

Pos logit contributions
 comparison+0.005
 outer+0.005
 fun+0.005
 common+0.005
 our+0.005
Neg logit contributions
bee-0.007
itty-0.007
owed-0.007
oke-0.006
op-0.006
Token
Token ID198
Feature activation+1.77e-03

Pos logit contributions
 comparison+0.020
 fun+0.019
 common+0.019
 outer+0.019
 our+0.018
Neg logit contributions
bee-0.028
owed-0.026
itty-0.026
oke-0.024
unn-0.024
TokenM
Token ID44
Feature activation-2.88e-02

Pos logit contributions
 comparison+0.003
 outer+0.003
 common+0.002
 fun+0.002
 our+0.002
Neg logit contributions
bee-0.004
itty-0.004
owed-0.004
op-0.004
oke-0.004
Token cold
Token ID4692
Feature activation-1.18e-02

Pos logit contributions
 comparison+0.065
 outer+0.059
 common+0.051
 recess+0.050
 sight+0.048
Neg logit contributions
owed-0.124
bee-0.122
ie-0.119
m-0.118
itty-0.117
Token,
Token ID11
Feature activation+2.27e-02

Pos logit contributions
bee+0.019
itty+0.018
owed+0.018
oke+0.016
m+0.016
Neg logit contributions
 comparison-0.025
 outer-0.024
 common-0.024
 fun-0.023
 awesome-0.023
Token but
Token ID475
Feature activation+3.17e-02

Pos logit contributions
 comparison+0.047
 outer+0.047
 common+0.047
 backup+0.043
 awesome+0.043
Neg logit contributions
bee-0.035
itty-0.031
owed-0.031
m-0.030
oke-0.029
Token if
Token ID611
Feature activation+3.28e-02

Pos logit contributions
 comparison+0.040
 outer+0.039
 common+0.038
 backup+0.035
 awesome+0.035
Neg logit contributions
m-0.069
bee-0.067
oke-0.066
itty-0.066
owed-0.065
Token he
Token ID339
Feature activation-5.51e-03

Pos logit contributions
 comparison+0.052
 common+0.047
 outer+0.046
 awesome+0.045
 thrilling+0.044
Neg logit contributions
bee-0.062
m-0.060
owed-0.060
ie-0.060
itty-0.060
Token kept
Token ID4030
Feature activation+9.47e-02

Pos logit contributions
bee+0.011
itty+0.011
owed+0.011
oke+0.010
op+0.010
Neg logit contributions
 comparison-0.009
 fun-0.009
 common-0.009
 trouble-0.009
 outer-0.008
Token searching
Token ID10342
Feature activation+7.10e-02

Pos logit contributions
 comparison+0.140
 thrilling+0.136
 common+0.135
 outer+0.133
 awesome+0.129
Neg logit contributions
bee-0.177
m-0.175
ie-0.165
op-0.162
itty-0.160
Token,
Token ID11
Feature activation+2.73e-02

Pos logit contributions
 comparison+0.151
 common+0.141
 outer+0.136
 awesome+0.130
 sight+0.129
Neg logit contributions
owed-0.102
ie-0.097
bee-0.096
m-0.093
itty-0.089
Token he
Token ID339
Feature activation-6.46e-03

Pos logit contributions
 common+0.040
 comparison+0.040
 outer+0.037
 awesome+0.036
 thrilling+0.035
Neg logit contributions
bee-0.056
owed-0.053
m-0.052
op-0.052
itty-0.051
Token would
Token ID561
Feature activation+6.74e-02

Pos logit contributions
bee+0.011
itty+0.010
owed+0.010
oke+0.009
op+0.009
Neg logit contributions
 comparison-0.013
 outer-0.013
 common-0.013
 fun-0.013
 our-0.012
Token find
Token ID1064
Feature activation+6.69e-02

Pos logit contributions
 comparison+0.097
 outer+0.094
 common+0.091
 awesome+0.090
 thrilling+0.088
Neg logit contributions
bee-0.149
owed-0.140
itty-0.138
m-0.134
ppy-0.133
Token decided
Token ID3066
Feature activation+5.35e-02

Pos logit contributions
bee+0.146
itty+0.146
oke+0.137
unn+0.134
owed+0.134
Neg logit contributions
 fun-0.111
 comparison-0.110
 their-0.108
 our-0.106
 an-0.106
Token to
Token ID284
Feature activation+5.65e-02

Pos logit contributions
 outer+0.101
 comparison+0.100
 common+0.094
 thrilling+0.089
 awesome+0.089
Neg logit contributions
owed-0.082
bee-0.080
ess-0.078
oke-0.075
es-0.075
Token stay
Token ID2652
Feature activation+6.37e-02

Pos logit contributions
 comparison+0.070
 outer+0.066
 fun+0.066
 common+0.066
 our+0.064
Neg logit contributions
bee-0.140
itty-0.135
owed-0.133
oke-0.128
op-0.126
Token friends
Token ID2460
Feature activation-1.34e-02

Pos logit contributions
 comparison+0.133
 outer+0.120
 thrilling+0.116
 common+0.116
 backup+0.114
Neg logit contributions
owed-0.094
bee-0.090
flower-0.089
ies-0.088
ie-0.088
Token and
Token ID290
Feature activation+4.89e-02

Pos logit contributions
bee+0.025
itty+0.024
owed+0.024
oke+0.023
unn+0.022
Neg logit contributions
 comparison-0.024
 fun-0.024
 outer-0.024
 our-0.023
 common-0.023
Token enjoy
Token ID2883
Feature activation+9.46e-02

Pos logit contributions
 comparison+0.066
 outer+0.062
 common+0.060
 fun+0.059
 awesome+0.059
Neg logit contributions
bee-0.116
itty-0.111
owed-0.110
op-0.104
oke-0.103
Token their
Token ID511
Feature activation-2.21e-03

Pos logit contributions
 comparison+0.095
 thrilling+0.081
Hours+0.080
 backup+0.080
 impossible+0.076
Neg logit contributions
bee-0.231
bees-0.210
flower-0.204
owers-0.204
m-0.202
Token differences
Token ID5400
Feature activation+1.68e-03

Pos logit contributions
bee+0.006
owed+0.005
itty+0.005
ops+0.005
op+0.005
Neg logit contributions
 comparison-0.002
 outer-0.002
 common-0.002
 late-0.002
 trouble-0.002
Token.
Token ID13
Feature activation+1.14e-02

Pos logit contributions
 comparison+0.003
 fun+0.003
 outer+0.003
 common+0.003
 our+0.003
Neg logit contributions
bee-0.003
itty-0.003
owed-0.003
oke-0.003
unn-0.003
Token
Token ID198
Feature activation+1.36e-02

Pos logit contributions
 comparison+0.015
 fun+0.014
 our+0.014
 outer+0.014
 common+0.014
Neg logit contributions
bee-0.028
owed-0.026
itty-0.026
bees-0.024
m-0.024
Token
Token ID198
Feature activation+5.71e-03

Pos logit contributions
 comparison+0.020
 outer+0.019
 fun+0.019
 common+0.019
 our+0.019
Neg logit contributions
bee-0.030
itty-0.029
owed-0.029
oke-0.027
op-0.027
Token that
Token ID326
Feature activation+3.02e-02

Pos logit contributions
 comparison+0.052
 outer+0.050
 fun+0.050
 common+0.049
 our+0.049
Neg logit contributions
bee-0.058
itty-0.055
owed-0.055
oke-0.052
op-0.051
Token she
Token ID673
Feature activation-2.15e-02

Pos logit contributions
 comparison+0.052
 outer+0.050
 common+0.050
 fun+0.049
 awesome+0.048
Neg logit contributions
bee-0.059
itty-0.056
owed-0.056
m-0.053
oke-0.052
Token ran
Token ID4966
Feature activation+3.90e-02

Pos logit contributions
bee+0.039
itty+0.038
owed+0.036
oke+0.035
op+0.034
Neg logit contributions
 fun-0.039
 comparison-0.039
 outer-0.038
 our-0.038
 common-0.038
Token off
Token ID572
Feature activation+3.78e-02

Pos logit contributions
 comparison+0.065
 outer+0.060
 common+0.058
 thrilling+0.054
 awesome+0.054
Neg logit contributions
bee-0.075
itty-0.072
owed-0.071
ne-0.071
m-0.070
Token and
Token ID290
Feature activation+4.40e-02

Pos logit contributions
 comparison+0.063
 outer+0.059
 awesome+0.058
 common+0.057
 fun+0.056
Neg logit contributions
bee-0.076
itty-0.073
owed-0.073
m-0.070
op-0.070
Token did
Token ID750
Feature activation+9.45e-02

Pos logit contributions
 comparison+0.088
 outer+0.084
 common+0.084
 awesome+0.081
 fun+0.080
Neg logit contributions
bee-0.075
owed-0.073
itty-0.072
ird-0.066
m-0.066
Token just
Token ID655
Feature activation+4.63e-02

Pos logit contributions
 comparison+0.119
 common+0.105
 outer+0.105
 late+0.102
rous+0.097
Neg logit contributions
bee-0.223
owed-0.222
ird-0.209
ops-0.201
es-0.197
Token that
Token ID326
Feature activation+2.47e-02

Pos logit contributions
 comparison+0.067
 common+0.065
 our+0.064
 outer+0.062
 awesome+0.062
Neg logit contributions
bee-0.102
owed-0.099
itty-0.093
ird-0.091
m-0.089
Token.
Token ID13
Feature activation+6.80e-03

Pos logit contributions
 comparison+0.044
 outer+0.043
 common+0.042
 awesome+0.041
 fun+0.041
Neg logit contributions
bee-0.047
itty-0.044
owed-0.044
op-0.041
m-0.041
Token From
Token ID3574
Feature activation+3.17e-02

Pos logit contributions
 comparison+0.010
 outer+0.010
 fun+0.010
 common+0.010
 trouble+0.010
Neg logit contributions
bee-0.015
owed-0.014
itty-0.014
oke-0.013
op-0.013
Token then
Token ID788
Feature activation+1.94e-02

Pos logit contributions
 comparison+0.060
 outer+0.053
 common+0.053
 trouble+0.052
 backup+0.052
Neg logit contributions
bee-0.060
itty-0.059
owed-0.056
ie-0.054
ne-0.053
Token read
Token ID1100
Feature activation+6.02e-02

Pos logit contributions
 comparison+0.003
 fun+0.002
 outer+0.002
 our+0.002
 common+0.002
Neg logit contributions
bee-0.004
itty-0.003
owed-0.003
oke-0.003
op-0.003
Token the
Token ID262
Feature activation-9.93e-03

Pos logit contributions
 comparison+0.069
 outer+0.056
 common+0.055
 gonna+0.050
 possible+0.050
Neg logit contributions
bee-0.153
owed-0.152
itty-0.147
m-0.147
oke-0.143
Token newspaper
Token ID7533
Feature activation-5.54e-02

Pos logit contributions
bee+0.020
itty+0.019
owed+0.019
m+0.018
oke+0.018
Neg logit contributions
 comparison-0.017
 outer-0.016
 common-0.016
 our-0.016
 awesome-0.015
Token together
Token ID1978
Feature activation+1.99e-02

Pos logit contributions
bee+0.117
owed+0.112
itty+0.111
oke+0.104
op+0.103
Neg logit contributions
 comparison-0.089
 outer-0.085
 common-0.084
 fun-0.082
 our-0.081
Token and
Token ID290
Feature activation+4.04e-02

Pos logit contributions
 comparison+0.045
 outer+0.043
 common+0.043
 fun+0.042
 our+0.042
Neg logit contributions
bee-0.030
itty-0.028
owed-0.028
oke-0.025
op-0.025
Token enjoyed
Token ID8359
Feature activation+9.45e-02

Pos logit contributions
 comparison+0.073
 outer+0.069
 common+0.068
 awesome+0.066
 thrilling+0.066
Neg logit contributions
bee-0.078
owed-0.074
itty-0.073
oke-0.068
m-0.068
Token a
Token ID257
Feature activation+3.40e-02

Pos logit contributions
 comparison+0.118
 impossible+0.110
 common+0.105
 frustrating+0.102
 backup+0.100
Neg logit contributions
bee-0.208
es-0.192
bees-0.190
ops-0.186
m-0.180
Token mild
Token ID11607
Feature activation+1.86e-02

Pos logit contributions
 fun+0.028
 comparison+0.028
 outer+0.026
 common+0.026
 awesome+0.025
Neg logit contributions
bee-0.099
itty-0.096
owed-0.096
oke-0.092
unn-0.090
Token morning
Token ID3329
Feature activation+6.26e-03

Pos logit contributions
 fun+0.026
 comparison+0.024
 sight+0.024
 outer+0.024
 common+0.024
Neg logit contributions
bee-0.044
owed-0.043
itty-0.042
oke-0.041
ppy-0.041
Token.
Token ID13
Feature activation+1.08e-02

Pos logit contributions
 comparison+0.011
 common+0.011
 outer+0.011
 fun+0.010
 awesome+0.010
Neg logit contributions
bee-0.012
itty-0.011
owed-0.011
op-0.010
m-0.010
Token<|endoftext|>
Token ID50256
Feature activation-2.99e-02

Pos logit contributions
 comparison+0.020
 outer+0.019
 common+0.019
 fun+0.018
 awesome+0.018
Neg logit contributions
bee-0.020
itty-0.020
owed-0.019
oke-0.019
m-0.018
Token kids
Token ID3988
Feature activation-1.03e-01

Pos logit contributions
bee+0.060
owed+0.058
itty+0.058
op+0.054
ppy+0.054
Neg logit contributions
 comparison-0.049
 fun-0.049
 awesome-0.047
 outer-0.046
 common-0.046
Token happily
Token ID18177
Feature activation-1.11e-02

Pos logit contributions
bee+0.204
itty+0.193
unn+0.190
owed+0.178
ppy+0.176
Neg logit contributions
 their-0.178
 an-0.175
 fun-0.175
 our-0.167
 comparison-0.167
Token agreed
Token ID4987
Feature activation+4.35e-02

Pos logit contributions
bee+0.025
itty+0.024
op+0.023
bees+0.023
itive+0.023
Neg logit contributions
 fun-0.016
 their-0.015
 an-0.014
 our-0.014
 great-0.014
Token and
Token ID290
Feature activation+4.31e-02

Pos logit contributions
 outer+0.104
 comparison+0.103
 common+0.098
 awesome+0.096
 thrilling+0.095
Neg logit contributions
bee-0.060
owed-0.055
itty-0.050
m-0.050
op-0.049
Token they
Token ID484
Feature activation+3.48e-03

Pos logit contributions
 comparison+0.066
 outer+0.063
 fun+0.062
 common+0.062
 our+0.061
Neg logit contributions
bee-0.095
itty-0.090
owed-0.089
oke-0.085
op-0.084
Token enjoyed
Token ID8359
Feature activation+9.43e-02

Pos logit contributions
 fun+0.004
 comparison+0.004
 their+0.004
 our+0.004
 an+0.004
Neg logit contributions
bee-0.009
itty-0.009
owed-0.008
oke-0.008
unn-0.008
Token the
Token ID262
Feature activation-5.26e-03

Pos logit contributions
 comparison+0.110
 outer+0.109
 thrilling+0.099
 common+0.092
 impossible+0.091
Neg logit contributions
bee-0.221
m-0.201
ops-0.196
mun-0.196
itty-0.195
Token l
Token ID300
Feature activation-6.40e-02

Pos logit contributions
bee+0.012
itty+0.011
owed+0.011
oke+0.011
ppy+0.011
Neg logit contributions
 fun-0.008
 comparison-0.008
 awesome-0.007
 common-0.007
 outer-0.007
Tokenoll
Token ID692
Feature activation-1.16e-01

Pos logit contributions
bee+0.080
flower+0.073
itty+0.064
itive+0.062
mun+0.062
Neg logit contributions
 awesome-0.143
 backup-0.141
 fun-0.139
 comparison-0.138
 an-0.138
Tokenipop
Token ID42800
Feature activation-5.75e-02

Pos logit contributions
bird+0.116
itty+0.114
bee+0.113
flower+0.109
pillar+0.102
Neg logit contributions
 his-0.283
 your-0.276
 our-0.275
 their-0.270
 fun-0.268
Token together
Token ID1978
Feature activation+2.74e-02

Pos logit contributions
bee+0.135
itty+0.130
oke+0.126
unn+0.123
owed+0.122
Neg logit contributions
 fun-0.084
 their-0.076
 our-0.075
 awesome-0.074
 an-0.074
Token went
Token ID1816
Feature activation+4.49e-02

Pos logit contributions
bee+0.014
owed+0.013
itty+0.013
oke+0.012
op+0.012
Neg logit contributions
 comparison-0.014
 outer-0.013
 common-0.013
 fun-0.013
 our-0.013
Token to
Token ID284
Feature activation+5.56e-02

Pos logit contributions
 comparison+0.080
 common+0.071
 awesome+0.069
 outer+0.067
 trouble+0.067
Neg logit contributions
bee-0.085
m-0.085
itty-0.085
owed-0.085
y-0.080
Token the
Token ID262
Feature activation-1.91e-02

Pos logit contributions
 comparison+0.077
 thrilling+0.068
 fantastic+0.065
 awesome+0.065
 impossible+0.064
Neg logit contributions
bee-0.104
ird-0.102
itty-0.098
bird-0.098
ie-0.098
Token river
Token ID7850
Feature activation-7.71e-02

Pos logit contributions
m+0.023
bee+0.021
board+0.020
itty+0.020
ird+0.020
Neg logit contributions
 common-0.044
 comparison-0.043
 accomplishment-0.041
 faith-0.040
 concentration-0.039
Token and
Token ID290
Feature activation+3.50e-02

Pos logit contributions
bee+0.165
itty+0.157
owed+0.154
oke+0.147
op+0.144
Neg logit contributions
 comparison-0.121
 fun-0.118
 outer-0.117
 common-0.115
 our-0.114
Token enjoyed
Token ID8359
Feature activation+9.43e-02

Pos logit contributions
 comparison+0.068
 awesome+0.064
 common+0.063
 outer+0.062
 trouble+0.061
Neg logit contributions
bee-0.062
owed-0.061
itty-0.060
m-0.055
ppy-0.054
Token the
Token ID262
Feature activation-4.00e-03

Pos logit contributions
 common+0.124
 comparison+0.120
 impossible+0.116
 backup+0.115
 thrilling+0.110
Neg logit contributions
bee-0.206
bees-0.186
m-0.183
itive-0.177
mun-0.173
Token sunshine
Token ID34488
Feature activation-6.65e-02

Pos logit contributions
bee+0.010
itty+0.009
owed+0.009
m+0.009
op+0.009
Neg logit contributions
 comparison-0.005
 outer-0.005
 common-0.004
 our-0.004
 thrilling-0.004
Token.
Token ID13
Feature activation+1.45e-02

Pos logit contributions
bee+0.118
itty+0.115
owed+0.110
oke+0.107
unn+0.104
Neg logit contributions
 fun-0.128
 comparison-0.128
 outer-0.125
 our-0.122
 awesome-0.121
Token Then
Token ID3244
Feature activation+1.65e-02

Pos logit contributions
 comparison+0.021
 common+0.020
 outer+0.020
 awesome+0.019
 our+0.019
Neg logit contributions
bee-0.032
itty-0.031
owed-0.031
op-0.029
oke-0.029
Token,
Token ID11
Feature activation+1.95e-02

Pos logit contributions
 comparison+0.040
 outer+0.039
 common+0.038
 awesome+0.038
 thrilling+0.037
Neg logit contributions
bee-0.022
owed-0.020
itty-0.020
op-0.018
m-0.018
Token were
Token ID547
Feature activation+5.81e-02

Pos logit contributions
bee+0.001
itty+0.001
owed+0.001
op+0.001
oke+0.001
Neg logit contributions
 fun-0.001
 an-0.001
 great-0.001
 our-0.001
 awesome-0.001
Token very
Token ID845
Feature activation+5.63e-02

Pos logit contributions
 comparison+0.079
 outer+0.075
 common+0.072
 sight+0.066
 thrilling+0.064
Neg logit contributions
bee-0.131
itty-0.126
m-0.125
owed-0.120
ppy-0.119
Token brave
Token ID14802
Feature activation+1.73e-03

Pos logit contributions
 comparison+0.073
 outer+0.068
 trouble+0.064
 sight+0.064
 common+0.062
Neg logit contributions
bee-0.136
itty-0.131
owed-0.131
oke-0.127
m-0.124
Token and
Token ID290
Feature activation+2.91e-02

Pos logit contributions
 fun+0.003
 comparison+0.003
 trouble+0.003
 our+0.003
 common+0.003
Neg logit contributions
bee-0.003
owed-0.003
itty-0.003
op-0.003
oke-0.003
Token you
Token ID345
Feature activation-1.27e-02

Pos logit contributions
 comparison+0.047
 outer+0.041
 sight+0.039
 trouble+0.039
 mischief+0.038
Neg logit contributions
bee-0.061
owed-0.059
itty-0.059
oke-0.057
affe-0.056
Token did
Token ID750
Feature activation+9.42e-02

Pos logit contributions
bee+0.025
itty+0.024
op+0.022
oke+0.022
owed+0.022
Neg logit contributions
 fun-0.023
 great-0.022
 our-0.022
 an-0.022
 common-0.022
Token great
Token ID1049
Feature activation+3.84e-02

Pos logit contributions
 comparison+0.125
 outer+0.110
rous+0.107
 shallow+0.102
 thrilling+0.096
Neg logit contributions
bee-0.200
owed-0.187
es-0.186
affe-0.174
ie-0.174
Token!"
Token ID2474
Feature activation-7.25e-03

Pos logit contributions
 comparison+0.073
 outer+0.071
 thrilling+0.067
 common+0.067
 awesome+0.064
Neg logit contributions
bee-0.071
owed-0.064
itty-0.062
ops-0.061
oke-0.061
Token
Token ID198
Feature activation-6.22e-03

Pos logit contributions
bee+0.018
owed+0.017
itty+0.017
ird+0.016
ppy+0.016
Neg logit contributions
 fun-0.010
 common-0.009
 our-0.009
 an-0.009
 trouble-0.009
Token
Token ID198
Feature activation-1.21e-02

Pos logit contributions
bee+0.015
itty+0.014
owed+0.014
oke+0.013
op+0.013
Neg logit contributions
 comparison-0.009
 outer-0.008
 common-0.008
 fun-0.008
 our-0.008
TokenThe
Token ID464
Feature activation-2.50e-02

Pos logit contributions
bee+0.025
itty+0.024
owed+0.023
oke+0.023
op+0.022
Neg logit contributions
 comparison-0.020
 outer-0.020
 common-0.019
 sight-0.018
 trouble-0.018
Token every
Token ID790
Feature activation+2.99e-02

Pos logit contributions
bee+0.199
itty+0.189
owed+0.187
oke+0.177
op+0.174
Neg logit contributions
 comparison-0.166
 outer-0.161
 fun-0.159
 common-0.157
 our-0.156
Token day
Token ID1110
Feature activation-3.69e-03

Pos logit contributions
 comparison+0.053
 common+0.052
 awesome+0.052
 outer+0.052
 fun+0.051
Neg logit contributions
bee-0.056
itty-0.053
m-0.050
ie-0.049
ne-0.048
Token and
Token ID290
Feature activation+3.55e-02

Pos logit contributions
bee+0.006
itty+0.006
owed+0.006
oke+0.006
op+0.006
Neg logit contributions
 comparison-0.007
 outer-0.007
 common-0.007
 fun-0.007
 awesome-0.007
Token they
Token ID484
Feature activation-2.33e-03

Pos logit contributions
 comparison+0.062
 outer+0.059
 common+0.058
 awesome+0.057
 fun+0.056
Neg logit contributions
bee-0.070
owed-0.066
itty-0.065
oke-0.063
m-0.062
Token both
Token ID1111
Feature activation+6.82e-03

Pos logit contributions
bee+0.005
itty+0.005
owed+0.005
oke+0.004
op+0.004
Neg logit contributions
 comparison-0.004
 fun-0.004
 outer-0.004
 common-0.004
 our-0.004
Token enjoyed
Token ID8359
Feature activation+9.42e-02

Pos logit contributions
 fun+0.011
 comparison+0.011
 their+0.011
 common+0.011
 our+0.011
Neg logit contributions
itty-0.014
bee-0.014
owed-0.013
op-0.013
oke-0.013
Token listening
Token ID8680
Feature activation+6.61e-02

Pos logit contributions
 comparison+0.107
 impossible+0.096
 common+0.093
 backup+0.093
 frustrating+0.090
Neg logit contributions
bee-0.230
m-0.205
mun-0.203
bees-0.203
owers-0.201
Token to
Token ID284
Feature activation+4.74e-02

Pos logit contributions
 comparison+0.095
 common+0.088
 outer+0.085
 trouble+0.083
 awesome+0.082
Neg logit contributions
bee-0.146
owed-0.144
oke-0.131
m-0.131
y-0.130
Token the
Token ID262
Feature activation-8.27e-03

Pos logit contributions
 comparison+0.042
 common+0.038
 outer+0.035
 thrilling+0.034
 awesome+0.034
Neg logit contributions
bee-0.130
owed-0.123
itty-0.120
ie-0.119
oke-0.118
Token waves
Token ID9813
Feature activation-3.51e-02

Pos logit contributions
bee+0.019
m+0.018
oke+0.017
ne+0.017
itty+0.017
Neg logit contributions
 comparison-0.010
 outer-0.010
 our-0.010
 common-0.009
 gonna-0.009
Token together
Token ID1978
Feature activation+1.45e-02

Pos logit contributions
bee+0.066
itty+0.064
owed+0.062
unn+0.060
oke+0.059
Neg logit contributions
 comparison-0.062
 fun-0.062
 our-0.061
 outer-0.061
 their-0.060
Tokenered
Token ID1068
Feature activation+8.06e-02

Pos logit contributions
bee+0.030
owed+0.029
itty+0.028
oke+0.027
op+0.026
Neg logit contributions
 comparison-0.028
 outer-0.027
 fun-0.027
 common-0.027
 trouble-0.026
Token over
Token ID625
Feature activation+5.41e-02

Pos logit contributions
 comparison+0.135
 common+0.134
 awesome+0.127
 trouble+0.123
 fun+0.123
Neg logit contributions
bee-0.144
owed-0.143
ie-0.137
itty-0.137
y-0.131
Token the
Token ID262
Feature activation-2.77e-02

Pos logit contributions
 comparison+0.077
 awesome+0.076
 common+0.075
 outer+0.071
 ours+0.070
Neg logit contributions
itty-0.113
bee-0.113
owed-0.109
m-0.108
bird-0.106
Token rock
Token ID3881
Feature activation-9.86e-02

Pos logit contributions
bee+0.058
itty+0.056
owed+0.053
oke+0.052
m+0.052
Neg logit contributions
 comparison-0.044
 common-0.042
 outer-0.042
 fun-0.041
 our-0.040
Token and
Token ID290
Feature activation+3.85e-02

Pos logit contributions
bee+0.187
itty+0.172
unn+0.171
owed+0.167
oke+0.164
Neg logit contributions
 fun-0.171
 comparison-0.170
 outer-0.169
 an-0.169
 our-0.165
Token enjoyed
Token ID8359
Feature activation+9.42e-02

Pos logit contributions
 comparison+0.070
 common+0.068
 outer+0.067
 awesome+0.066
 fun+0.065
Neg logit contributions
bee-0.072
owed-0.072
itty-0.069
m-0.064
oke-0.063
Token the
Token ID262
Feature activation-7.63e-03

Pos logit contributions
 common+0.112
 comparison+0.109
 backup+0.105
 ours+0.102
 proper+0.101
Neg logit contributions
bee-0.221
bees-0.199
owed-0.193
m-0.190
itty-0.186
Token beautiful
Token ID4950
Feature activation-2.35e-02

Pos logit contributions
bee+0.020
itty+0.020
owed+0.020
oke+0.019
op+0.019
Neg logit contributions
 comparison-0.008
 outer-0.007
 common-0.007
 fun-0.007
 our-0.007
Token rainbow
Token ID27223
Feature activation-7.15e-02

Pos logit contributions
bee+0.057
itty+0.054
owed+0.054
oke+0.051
m+0.050
Neg logit contributions
 comparison-0.030
 outer-0.028
 common-0.028
 fun-0.027
 our-0.027
Token.
Token ID13
Feature activation+5.23e-03

Pos logit contributions
bee+0.141
itty+0.137
oke+0.134
unn+0.132
owed+0.132
Neg logit contributions
 fun-0.122
 outer-0.118
 comparison-0.117
 our-0.114
 awesome-0.114
Token<|endoftext|>
Token ID50256
Feature activation-3.69e-02

Pos logit contributions
 an+0.006
 fun+0.006
 comparison+0.006
 outer+0.006
 late+0.006
Neg logit contributions
bee-0.014
owed-0.012
itive-0.012
itty-0.012
bees-0.012
Token
Token ID198
Feature activation-1.41e-02

Pos logit contributions
bee+0.042
itty+0.040
owed+0.040
oke+0.038
op+0.037
Neg logit contributions
 comparison-0.028
 outer-0.026
 fun-0.026
 common-0.026
 our-0.025
Token
Token ID198
Feature activation-2.30e-02

Pos logit contributions
bee+0.027
itty+0.026
owed+0.025
oke+0.024
op+0.024
Neg logit contributions
 comparison-0.026
 outer-0.025
 fun-0.024
 common-0.024
 awesome-0.024
Token"
Token ID1
Feature activation-3.79e-03

Pos logit contributions
bee+0.037
itty+0.036
owed+0.033
oke+0.032
op+0.032
Neg logit contributions
 comparison-0.048
 outer-0.047
 fun-0.045
 sight-0.045
 trouble-0.045
TokenMaybe
Token ID13300
Feature activation+2.23e-02

Pos logit contributions
bee+0.008
itty+0.008
owed+0.007
oke+0.007
m+0.007
Neg logit contributions
 comparison-0.006
 outer-0.006
 common-0.006
 trouble-0.006
 fun-0.006
Token the
Token ID262
Feature activation-4.40e-02

Pos logit contributions
 comparison+0.033
 outer+0.030
 common+0.030
 thrilling+0.029
 awesome+0.029
Neg logit contributions
bee-0.050
owed-0.047
itty-0.047
oke-0.045
m-0.045
Token hipp
Token ID18568
Feature activation-1.81e-01

Pos logit contributions
owed+0.070
bee+0.068
itty+0.066
op+0.063
ppy+0.063
Neg logit contributions
 fun-0.097
 comparison-0.094
 our-0.092
 awesome-0.092
 common-0.092
Tokeno
Token ID78
Feature activation-1.15e-01

Pos logit contributions
bee+0.306
ird+0.283
ppy+0.281
itive+0.270
isy+0.265
Neg logit contributions
 our-0.332
 their-0.319
 an-0.304
 common-0.302
 fun-0.301
Token was
Token ID373
Feature activation+5.66e-02

Pos logit contributions
bee+0.226
itty+0.212
ird+0.211
owed+0.210
unn+0.208
Neg logit contributions
 our-0.188
 common-0.182
 fun-0.180
 an-0.175
 their-0.174
Token just
Token ID655
Feature activation+2.58e-02

Pos logit contributions
 comparison+0.072
 outer+0.067
 thrilling+0.061
Hours+0.058
 common+0.057
Neg logit contributions
bee-0.138
itty-0.124
owed-0.122
m-0.122
oke-0.120
Token tired
Token ID10032
Feature activation+1.45e-03

Pos logit contributions
 comparison+0.029
 outer+0.026
 common+0.025
 thrilling+0.024
Hours+0.024
Neg logit contributions
bee-0.067
owed-0.062
itty-0.062
oke-0.060
m-0.060
Token.
Token ID13
Feature activation-1.76e-03

Pos logit contributions
 fun+0.003
 trouble+0.003
 our+0.003
 late+0.003
 comparison+0.003
Neg logit contributions
bee-0.003
itty-0.003
owed-0.003
op-0.002
oke-0.002
Token changed
Token ID3421
Feature activation+2.38e-02

Pos logit contributions
bee+0.046
owed+0.043
itty+0.042
ie+0.042
oke+0.041
Neg logit contributions
 comparison-0.022
 outer-0.022
 common-0.019
 our-0.018
 sight-0.018
Token.
Token ID13
Feature activation-5.69e-03

Pos logit contributions
 comparison+0.051
 outer+0.049
 awesome+0.048
 common+0.047
 thrilling+0.047
Neg logit contributions
bee-0.037
owed-0.035
itty-0.033
m-0.032
ie-0.031
Token
Token ID198
Feature activation+3.50e-03

Pos logit contributions
bee+0.016
owed+0.015
itty+0.014
pp+0.014
unn+0.014
Neg logit contributions
 an-0.006
 comparison-0.006
 fun-0.006
 outer-0.006
 trouble-0.006
Token
Token ID198
Feature activation-6.60e-03

Pos logit contributions
 comparison+0.005
 fun+0.005
 outer+0.005
 common+0.005
 trouble+0.005
Neg logit contributions
bee-0.008
owed-0.007
itty-0.007
oke-0.007
unn-0.007
TokenThe
Token ID464
Feature activation-3.96e-02

Pos logit contributions
bee+0.016
itty+0.016
owed+0.016
oke+0.015
op+0.015
Neg logit contributions
 comparison-0.008
 outer-0.008
 fun-0.008
 common-0.008
 our-0.008
Token crocod
Token ID37565
Feature activation-1.78e-01

Pos logit contributions
bee+0.078
owed+0.075
itty+0.075
oke+0.069
op+0.069
Neg logit contributions
 comparison-0.069
 fun-0.067
 outer-0.066
 common-0.066
 our-0.065
Tokenile
Token ID576
Feature activation-1.31e-01

Pos logit contributions
ird+0.323
owed+0.301
bee+0.268
pp+0.262
 Girl+0.254
Neg logit contributions
 fun-0.304
 far-0.300
 comparison-0.299
 awesome-0.298
 common-0.291
Token wanted
Token ID2227
Feature activation+6.04e-02

Pos logit contributions
bee+0.273
itty+0.264
owed+0.261
unn+0.256
ird+0.251
Neg logit contributions
 fun-0.204
 an-0.193
 trouble-0.191
 comparison-0.188
 great-0.184
Token to
Token ID284
Feature activation+4.91e-02

Pos logit contributions
 comparison+0.099
 common+0.090
 outer+0.090
 thrilling+0.085
 awesome+0.084
Neg logit contributions
bee-0.116
itty-0.116
m-0.113
ie-0.110
owed-0.110
Token slap
Token ID23905
Feature activation+3.19e-02

Pos logit contributions
 comparison+0.067
 outer+0.064
 fun+0.061
 thrilling+0.061
 common+0.061
Neg logit contributions
bee-0.115
owed-0.109
itty-0.108
op-0.102
ird-0.102
Token a
Token ID257
Feature activation-1.47e-03

Pos logit contributions
 comparison+0.016
 fun+0.015
 outer+0.014
 our+0.014
 common+0.014
Neg logit contributions
bee-0.102
itty-0.100
owed-0.099
oke-0.096
op-0.095
Token
Token ID198
Feature activation-4.36e-03

Pos logit contributions
 fun+0.009
 comparison+0.009
 common+0.008
 outer+0.008
 our+0.008
Neg logit contributions
bee-0.014
owed-0.013
itty-0.013
unn-0.012
oke-0.012
TokenOne
Token ID3198
Feature activation-1.26e-02

Pos logit contributions
bee+0.010
itty+0.010
owed+0.010
op+0.009
oke+0.009
Neg logit contributions
 comparison-0.006
 outer-0.006
 trouble-0.005
 common-0.005
 fun-0.005
Token day
Token ID1110
Feature activation+9.90e-03

Pos logit contributions
bee+0.032
owed+0.031
itty+0.031
ird+0.030
oke+0.030
Neg logit contributions
 fun-0.015
 common-0.014
 comparison-0.014
 our-0.014
 outer-0.014
Token,
Token ID11
Feature activation+1.30e-02

Pos logit contributions
 our+0.018
 fun+0.018
 an+0.018
 their+0.017
 comparison+0.017
Neg logit contributions
bee-0.019
itty-0.018
owed-0.018
oke-0.017
pp-0.016
Token the
Token ID262
Feature activation-4.76e-02

Pos logit contributions
 comparison+0.010
 outer+0.009
 common+0.008
 awesome+0.008
 thrilling+0.008
Neg logit contributions
bee-0.039
itty-0.037
owed-0.036
op-0.036
m-0.035
Token hipp
Token ID18568
Feature activation-1.78e-01

Pos logit contributions
bee+0.077
owed+0.074
itty+0.073
op+0.066
ppy+0.066
Neg logit contributions
 comparison-0.101
 fun-0.100
 our-0.097
 common-0.097
 awesome-0.097
Tokeno
Token ID78
Feature activation-1.06e-01

Pos logit contributions
bee+0.348
ird+0.315
owed+0.293
bees+0.290
itive+0.287
Neg logit contributions
 their-0.293
 our-0.286
 his-0.277
 far-0.277
 late-0.276
Token's
Token ID338
Feature activation+6.09e-03

Pos logit contributions
bee+0.217
owed+0.208
itty+0.208
ird+0.194
unn+0.192
Neg logit contributions
 fun-0.169
 our-0.164
 comparison-0.162
 trouble-0.162
 an-0.162
Token friends
Token ID2460
Feature activation-7.03e-02

Pos logit contributions
 comparison+0.012
 outer+0.012
 common+0.011
 fun+0.011
 our+0.011
Neg logit contributions
bee-0.011
itty-0.010
owed-0.010
op-0.009
oke-0.009
Token asked
Token ID1965
Feature activation+2.62e-02

Pos logit contributions
bee+0.155
itty+0.148
owed+0.144
unn+0.140
affe+0.138
Neg logit contributions
 fun-0.104
 an-0.103
 comparison-0.100
 trouble-0.100
 their-0.098
Token him
Token ID683
Feature activation-3.89e-02

Pos logit contributions
 comparison+0.027
 outer+0.025
 common+0.024
 fun+0.023
 awesome+0.023
Neg logit contributions
bee-0.072
owed-0.069
itty-0.068
oke-0.065
op-0.065
Token tired
Token ID10032
Feature activation+1.66e-02

Pos logit contributions
 comparison+0.082
 outer+0.077
 common+0.072
 trouble+0.072
 sight+0.071
Neg logit contributions
owed-0.102
bee-0.102
m-0.099
itty-0.099
op-0.097
Token.
Token ID13
Feature activation-4.88e-03

Pos logit contributions
 comparison+0.034
 outer+0.033
 fun+0.033
 common+0.033
 our+0.032
Neg logit contributions
bee-0.027
itty-0.026
owed-0.026
oke-0.024
op-0.023
Token
Token ID198
Feature activation+3.78e-03

Pos logit contributions
bee+0.012
owed+0.011
itty+0.011
oke+0.010
unn+0.010
Neg logit contributions
 fun-0.007
 comparison-0.007
 outer-0.006
 an-0.006
 common-0.006
Token
Token ID198
Feature activation-8.17e-03

Pos logit contributions
 fun+0.005
 comparison+0.005
 common+0.005
 outer+0.005
 an+0.005
Neg logit contributions
bee-0.009
owed-0.009
itty-0.008
unn-0.008
flower-0.008
TokenThe
Token ID464
Feature activation-4.75e-02

Pos logit contributions
bee+0.021
owed+0.020
itty+0.020
oke+0.019
unn+0.019
Neg logit contributions
 comparison-0.010
 fun-0.009
 outer-0.009
 common-0.009
 thrilling-0.009
Token le
Token ID443
Feature activation-1.76e-01

Pos logit contributions
bee+0.093
owed+0.089
itty+0.088
oke+0.082
op+0.081
Neg logit contributions
 comparison-0.084
 fun-0.081
 outer-0.080
 common-0.079
 awesome-0.078
Tokenopard
Token ID15478
Feature activation-1.34e-01

Pos logit contributions
owed+0.336
ie+0.320
ird+0.317
bee+0.304
ppy+0.303
Neg logit contributions
 steel-0.261
 golf-0.260
 fun-0.258
 far-0.254
 long-0.243
Token was
Token ID373
Feature activation+5.99e-02

Pos logit contributions
bee+0.280
itty+0.275
oke+0.267
owed+0.263
unn+0.256
Neg logit contributions
 fun-0.209
 an-0.205
 trouble-0.192
 far-0.191
 comparison-0.190
Token searching
Token ID10342
Feature activation+5.83e-02

Pos logit contributions
 comparison+0.110
 common+0.109
 outer+0.104
 thrilling+0.097
 backup+0.095
Neg logit contributions
bee-0.108
m-0.106
owed-0.106
itty-0.103
ppy-0.097
Token for
Token ID329
Feature activation+4.00e-02

Pos logit contributions
 comparison+0.094
 common+0.088
 outer+0.088
 awesome+0.084
 fun+0.083
Neg logit contributions
bee-0.123
owed-0.117
itty-0.116
m-0.109
oke-0.108
Token a
Token ID257
Feature activation+2.10e-02

Pos logit contributions
 comparison+0.056
 common+0.050
 outer+0.046
rous+0.045
 late+0.045
Neg logit contributions
bee-0.087
owed-0.084
erry-0.081
itty-0.081
ie-0.080
Token me
Token ID502
Feature activation-3.71e-02

Pos logit contributions
 comparison+0.047
 outer+0.043
 thrilling+0.042
 common+0.041
 awesome+0.037
Neg logit contributions
bee-0.112
itty-0.105
owed-0.103
ie-0.102
op-0.101
Token."
Token ID526
Feature activation-1.60e-02

Pos logit contributions
bee+0.071
itty+0.069
owed+0.066
oke+0.065
op+0.062
Neg logit contributions
 our-0.066
 fun-0.065
 comparison-0.065
 outer-0.064
 an-0.063
Token
Token ID198
Feature activation-6.80e-03

Pos logit contributions
bee+0.041
owed+0.038
itty+0.037
pp+0.036
unn+0.036
Neg logit contributions
 an-0.020
 trouble-0.020
 fun-0.020
 their-0.020
 our-0.019
Token
Token ID198
Feature activation-1.22e-02

Pos logit contributions
bee+0.015
owed+0.014
itty+0.014
oke+0.013
unn+0.013
Neg logit contributions
 comparison-0.010
 fun-0.010
 common-0.010
 outer-0.010
 our-0.010
TokenThe
Token ID464
Feature activation-4.32e-02

Pos logit contributions
bee+0.030
owed+0.028
itty+0.028
oke+0.027
unn+0.026
Neg logit contributions
 comparison-0.016
 outer-0.015
 fun-0.015
 common-0.015
 our-0.015
Token hipp
Token ID18568
Feature activation-1.76e-01

Pos logit contributions
owed+0.067
bee+0.066
itty+0.063
ppy+0.058
op+0.058
Neg logit contributions
 comparison-0.095
 fun-0.094
 our-0.092
 common-0.092
 awesome-0.091
Tokeno
Token ID78
Feature activation-1.09e-01

Pos logit contributions
bee+0.327
ird+0.301
ppy+0.296
itive+0.281
bees+0.281
Neg logit contributions
 their-0.303
 his-0.298
 our-0.294
 an-0.268
 common-0.268
Token is
Token ID318
Feature activation+5.06e-02

Pos logit contributions
bee+0.245
itty+0.238
owed+0.229
ird+0.225
unn+0.222
Neg logit contributions
 fun-0.151
 comparison-0.148
 trouble-0.148
 our-0.148
 common-0.146
Token not
Token ID407
Feature activation+5.75e-02

Pos logit contributions
 comparison+0.087
 outer+0.087
 common+0.076
 purpose+0.074
 thrilling+0.074
Neg logit contributions
bee-0.097
owed-0.095
itty-0.091
m-0.088
bug-0.087
Token angry
Token ID7954
Feature activation+1.24e-02

Pos logit contributions
 fun+0.061
 comparison+0.058
 common+0.057
 awesome+0.056
 our+0.056
Neg logit contributions
bee-0.152
itty-0.152
owed-0.147
op-0.145
ops-0.143
Token.
Token ID13
Feature activation-5.71e-03

Pos logit contributions
 trouble+0.021
 comparison+0.021
 fun+0.021
 our+0.020
 an+0.020
Neg logit contributions
bee-0.025
itty-0.024
owed-0.024
pp-0.023
ird-0.022
Token?"
Token ID1701
Feature activation-1.54e-02

Pos logit contributions
bee+0.124
owed+0.117
unn+0.116
itty+0.115
op+0.111
Neg logit contributions
 our-0.097
 fun-0.095
 your-0.092
 an-0.092
 awesome-0.090
Token
Token ID198
Feature activation-9.23e-05

Pos logit contributions
bee+0.040
owed+0.038
itty+0.037
unn+0.037
itive+0.036
Neg logit contributions
 an-0.017
 their-0.017
 his-0.016
 fun-0.016
 our-0.016
Token
Token ID220
Feature activation-4.48e-03

Pos logit contributions
bee+0.000
owed+0.000
itty+0.000
unn+0.000
flower+0.000
Neg logit contributions
 common-0.000
 fun-0.000
 comparison-0.000
 our-0.000
 an-0.000
Token
Token ID198
Feature activation-4.87e-03

Pos logit contributions
bee+0.011
itty+0.010
owed+0.010
flower+0.009
itive+0.009
Neg logit contributions
 an-0.006
 our-0.006
 fun-0.006
 common-0.006
 comparison-0.006
TokenThe
Token ID464
Feature activation-4.59e-02

Pos logit contributions
bee+0.012
itty+0.012
owed+0.012
oke+0.011
op+0.011
Neg logit contributions
 comparison-0.006
 outer-0.006
 fun-0.005
 common-0.005
 our-0.005
Token hipp
Token ID18568
Feature activation-1.76e-01

Pos logit contributions
owed+0.069
bee+0.068
itty+0.065
ppy+0.061
op+0.060
Neg logit contributions
 fun-0.102
 comparison-0.101
 our-0.099
 awesome-0.099
 common-0.098
Tokeno
Token ID78
Feature activation-1.07e-01

Pos logit contributions
bee+0.315
ird+0.286
bees+0.280
itive+0.277
ppy+0.273
Neg logit contributions
 their-0.288
 his-0.285
 our-0.277
 an-0.269
 common-0.266
Token smiled
Token ID13541
Feature activation+2.46e-02

Pos logit contributions
itty+0.192
bee+0.191
unn+0.180
ird+0.177
owed+0.174
Neg logit contributions
 fun-0.188
 common-0.183
 an-0.182
 our-0.180
 their-0.178
Token and
Token ID290
Feature activation+1.97e-02

Pos logit contributions
 comparison+0.059
 outer+0.058
 common+0.057
 fun+0.056
 awesome+0.056
Neg logit contributions
bee-0.033
owed-0.030
itty-0.029
m-0.027
oke-0.026
Token said
Token ID531
Feature activation+3.03e-02

Pos logit contributions
 comparison+0.048
 outer+0.047
 common+0.046
 trouble+0.045
 fun+0.045
Neg logit contributions
bee-0.026
owed-0.024
itty-0.024
oke-0.021
op-0.021
Token yes
Token ID3763
Feature activation+2.46e-03

Pos logit contributions
 comparison+0.060
 fun+0.058
 outer+0.058
 common+0.058
 our+0.057
Neg logit contributions
bee-0.052
itty-0.050
owed-0.048
oke-0.045
op-0.044
Token
Token ID198
Feature activation-3.83e-03

Pos logit contributions
 fun+0.011
 common+0.010
 comparison+0.010
 outer+0.010
 our+0.010
Neg logit contributions
bee-0.018
owed-0.017
itty-0.017
unn-0.016
flower-0.016
TokenOne
Token ID3198
Feature activation-1.02e-02

Pos logit contributions
bee+0.010
itty+0.009
owed+0.009
oke+0.009
op+0.009
Neg logit contributions
 comparison-0.005
 outer-0.005
 common-0.004
 fun-0.004
 our-0.004
Token day
Token ID1110
Feature activation+1.29e-02

Pos logit contributions
bee+0.027
owed+0.026
ird+0.026
itty+0.025
oke+0.025
Neg logit contributions
 fun-0.012
 common-0.011
 our-0.011
 great-0.010
 comparison-0.010
Token,
Token ID11
Feature activation+1.91e-02

Pos logit contributions
 fun+0.024
 our+0.023
 an+0.023
 comparison+0.023
 common+0.023
Neg logit contributions
bee-0.024
itty-0.023
owed-0.023
oke-0.022
ppy-0.021
Token the
Token ID262
Feature activation-4.75e-02

Pos logit contributions
 comparison+0.013
 outer+0.011
 common+0.011
 fun+0.010
 awesome+0.010
Neg logit contributions
bee-0.059
itty-0.057
owed-0.056
op-0.054
m-0.054
Token hipp
Token ID18568
Feature activation-1.76e-01

Pos logit contributions
bee+0.077
owed+0.076
itty+0.073
ppy+0.068
ird+0.067
Neg logit contributions
 fun-0.101
 comparison-0.099
 our-0.097
 awesome-0.096
 sight-0.095
Tokeno
Token ID78
Feature activation-1.04e-01

Pos logit contributions
bee+0.362
ird+0.327
owed+0.315
isy+0.306
bees+0.306
Neg logit contributions
 their-0.270
 his-0.262
 our-0.261
 far-0.260
 many-0.255
Token saw
Token ID2497
Feature activation+5.27e-02

Pos logit contributions
bee+0.228
owed+0.222
itty+0.221
ird+0.212
oke+0.208
Neg logit contributions
 fun-0.148
 an-0.142
 our-0.141
 trouble-0.137
 their-0.137
Token a
Token ID257
Feature activation+8.67e-03

Pos logit contributions
 comparison+0.053
 common+0.048
 outer+0.047
 thrilling+0.043
 awesome+0.040
Neg logit contributions
bee-0.145
itty-0.139
owed-0.139
m-0.137
ie-0.131
Token little
Token ID1310
Feature activation-2.58e-02

Pos logit contributions
 comparison+0.009
 outer+0.009
 common+0.008
 our+0.008
 awesome+0.008
Neg logit contributions
bee-0.023
itty-0.022
owed-0.021
oke-0.021
op-0.021
Token bird
Token ID6512
Feature activation-1.09e-01

Pos logit contributions
owed+0.046
bee+0.046
ops+0.042
ird+0.041
itty+0.041
Neg logit contributions
 fun-0.051
 trouble-0.050
 sight-0.050
 our-0.049
 comparison-0.049
Token adventure
Token ID8855
Feature activation-1.16e-02

Pos logit contributions
 comparison+0.022
 outer+0.021
 fun+0.021
 awesome+0.020
 common+0.019
Neg logit contributions
bee-0.075
itty-0.073
owed-0.072
oke-0.070
op-0.069
Token.
Token ID13
Feature activation-4.92e-03

Pos logit contributions
bee+0.020
itty+0.019
owed+0.018
oke+0.017
op+0.017
Neg logit contributions
 comparison-0.023
 outer-0.023
 fun-0.023
 common-0.022
 our-0.022
Token
Token ID198
Feature activation+3.47e-03

Pos logit contributions
bee+0.013
owed+0.012
itty+0.012
ppy+0.011
bees+0.011
Neg logit contributions
 fun-0.006
 comparison-0.006
 outer-0.006
 an-0.006
 common-0.006
Token
Token ID198
Feature activation-8.54e-03

Pos logit contributions
 fun+0.005
 comparison+0.005
 outer+0.005
 common+0.005
 an+0.005
Neg logit contributions
bee-0.008
owed-0.008
itty-0.007
unn-0.007
oke-0.007
TokenThe
Token ID464
Feature activation-5.01e-02

Pos logit contributions
bee+0.021
itty+0.020
owed+0.020
oke+0.019
op+0.019
Neg logit contributions
 comparison-0.011
 outer-0.010
 common-0.010
 fun-0.010
 our-0.010
Token le
Token ID443
Feature activation-1.76e-01

Pos logit contributions
owed+0.105
bee+0.103
itty+0.098
ird+0.094
ppy+0.094
Neg logit contributions
 fun-0.082
 comparison-0.079
 awesome-0.077
 trouble-0.076
 great-0.076
Tokenopard
Token ID15478
Feature activation-1.32e-01

Pos logit contributions
owed+0.337
ie+0.328
ird+0.324
iness+0.312
bee+0.305
Neg logit contributions
 golf-0.246
 steel-0.245
 fun-0.240
 far-0.235
 long-0.227
Token was
Token ID373
Feature activation+5.80e-02

Pos logit contributions
bee+0.280
itty+0.277
owed+0.268
oke+0.268
unn+0.264
Neg logit contributions
 fun-0.203
 an-0.197
 far-0.182
 great-0.181
 exciting-0.180
Token walking
Token ID6155
Feature activation+3.31e-02

Pos logit contributions
 comparison+0.104
 outer+0.099
 common+0.096
 thrilling+0.088
 backup+0.088
Neg logit contributions
bee-0.111
itty-0.106
owed-0.103
m-0.103
ppy-0.097
Token through
Token ID832
Feature activation+3.47e-02

Pos logit contributions
 comparison+0.057
 outer+0.054
 common+0.053
 fun+0.052
 trouble+0.052
Neg logit contributions
bee-0.067
itty-0.064
owed-0.063
m-0.058
op-0.058
Token the
Token ID262
Feature activation-1.58e-02

Pos logit contributions
 common+0.040
 comparison+0.038
 awesome+0.037
 ours+0.035
 backup+0.034
Neg logit contributions
bee-0.082
itty-0.080
bird-0.078
ie-0.078
m-0.075
Token a
Token ID257
Feature activation+4.87e-03

Pos logit contributions
 awesome+0.026
 gonna+0.026
 thrilling+0.026
 brand+0.025
 fantastic+0.025
Neg logit contributions
itty-0.046
bee-0.046
bird-0.045
m-0.045
berry-0.045
Token distant
Token ID12899
Feature activation-2.13e-02

Pos logit contributions
 comparison+0.007
 outer+0.006
 common+0.006
 awesome+0.006
 theirs+0.006
Neg logit contributions
itty-0.011
bee-0.011
op-0.010
m-0.010
anda-0.010
Token place
Token ID1295
Feature activation-3.96e-02

Pos logit contributions
bee+0.047
itty+0.046
owed+0.044
oke+0.042
op+0.042
Neg logit contributions
 comparison-0.031
 outer-0.029
 awesome-0.029
 common-0.029
 fun-0.029
Token.
Token ID13
Feature activation-1.10e-02

Pos logit contributions
bee+0.075
owed+0.073
itty+0.072
oke+0.067
m+0.067
Neg logit contributions
 comparison-0.071
 outer-0.069
 common-0.067
 fun-0.067
 awesome-0.066
Token The
Token ID383
Feature activation-4.15e-02

Pos logit contributions
bee+0.030
owed+0.028
itty+0.028
ppy+0.026
oke+0.026
Neg logit contributions
 fun-0.011
 comparison-0.011
 outer-0.011
 an-0.011
 common-0.010
Token hipp
Token ID18568
Feature activation-1.76e-01

Pos logit contributions
bee+0.084
owed+0.081
itty+0.081
oke+0.075
op+0.074
Neg logit contributions
 comparison-0.071
 fun-0.069
 outer-0.068
 common-0.067
 our-0.066
Tokeno
Token ID78
Feature activation-1.06e-01

Pos logit contributions
bee+0.338
ird+0.306
itive+0.298
ppy+0.284
bees+0.282
Neg logit contributions
 their-0.285
 many-0.282
 far-0.280
 our-0.276
 an-0.267
Token had
Token ID550
Feature activation+6.44e-02

Pos logit contributions
bee+0.212
itty+0.200
owed+0.198
ird+0.191
unn+0.186
Neg logit contributions
 fun-0.176
 our-0.167
 an-0.166
 common-0.164
 comparison-0.163
Token many
Token ID867
Feature activation+1.36e-02

Pos logit contributions
 avail+0.095
 comparison+0.094
 accomplishment+0.089
 outer+0.087
 common+0.083
Neg logit contributions
bee-0.126
itty-0.123
m-0.123
owed-0.121
bees-0.115
Token friends
Token ID2460
Feature activation-5.91e-02

Pos logit contributions
 comparison+0.022
 outer+0.022
 common+0.021
 fun+0.021
 our+0.020
Neg logit contributions
bee-0.028
itty-0.027
owed-0.026
oke-0.025
op-0.025
Token who
Token ID508
Feature activation-1.21e-02

Pos logit contributions
bee+0.109
itty+0.102
owed+0.099
oke+0.095
unn+0.094
Neg logit contributions
 fun-0.111
 comparison-0.109
 common-0.105
 outer-0.105
 our-0.105
Token happy
Token ID3772
Feature activation+1.69e-02

Pos logit contributions
 comparison+0.085
 outer+0.085
 backup+0.074
Hours+0.074
 thrilling+0.073
Neg logit contributions
bee-0.126
m-0.115
itty-0.113
owed-0.113
bug-0.110
Token!
Token ID0
Feature activation-2.32e-02

Pos logit contributions
 comparison+0.027
 fun+0.027
 our+0.026
 outer+0.026
 common+0.026
Neg logit contributions
bee-0.035
itty-0.034
owed-0.034
oke-0.032
op-0.032
Token
Token ID198
Feature activation-8.97e-03

Pos logit contributions
bee+0.050
owed+0.048
itty+0.048
op+0.044
oke+0.044
Neg logit contributions
 comparison-0.036
 fun-0.035
 outer-0.034
 common-0.034
 our-0.034
Token
Token ID198
Feature activation-1.92e-02

Pos logit contributions
bee+0.018
itty+0.017
owed+0.017
oke+0.016
op+0.016
Neg logit contributions
 comparison-0.016
 outer-0.015
 common-0.014
 fun-0.014
 awesome-0.014
TokenThe
Token ID464
Feature activation-4.58e-02

Pos logit contributions
bee+0.035
itty+0.033
owed+0.032
oke+0.030
op+0.030
Neg logit contributions
 comparison-0.037
 outer-0.036
 fun-0.035
 common-0.035
 trouble-0.034
Token hipp
Token ID18568
Feature activation-1.75e-01

Pos logit contributions
owed+0.084
bee+0.082
itty+0.080
op+0.076
ppy+0.075
Neg logit contributions
 fun-0.087
 comparison-0.087
 our-0.085
 common-0.084
 awesome-0.083
Tokeno
Token ID78
Feature activation-1.11e-01

Pos logit contributions
bee+0.319
ird+0.305
ppy+0.297
owed+0.284
itive+0.284
Neg logit contributions
 their-0.300
 our-0.291
 his-0.275
 common-0.256
 fun-0.255
Token asked
Token ID1965
Feature activation+1.68e-02

Pos logit contributions
itty+0.206
bee+0.203
ird+0.199
owed+0.199
op+0.191
Neg logit contributions
 fun-0.189
 our-0.187
 their-0.180
 an-0.178
 common-0.177
Token Anna
Token ID11735
Feature activation-9.75e-02

Pos logit contributions
 comparison+0.021
 outer+0.020
 fun+0.020
 common+0.020
 our+0.020
Neg logit contributions
bee-0.041
itty-0.040
owed-0.039
oke-0.037
op-0.037
Token and
Token ID290
Feature activation-1.59e-02

Pos logit contributions
bee+0.194
itty+0.185
owed+0.178
unn+0.172
oke+0.171
Neg logit contributions
 our-0.166
 an-0.161
 fun-0.161
 their-0.159
 comparison-0.158
Token Ben
Token ID3932
Feature activation-5.89e-02

Pos logit contributions
bee+0.040
itty+0.039
owed+0.038
oke+0.037
op+0.036
Neg logit contributions
 comparison-0.019
 outer-0.018
 fun-0.018
 common-0.018
 our-0.017
Token inside
Token ID2641
Feature activation+1.56e-02

Pos logit contributions
oke+0.265
itty+0.257
bee+0.254
ird+0.246
unn+0.244
Neg logit contributions
 our-0.140
 an-0.138
 their-0.137
 his-0.134
 fun-0.133
Token.
Token ID13
Feature activation-1.79e-02

Pos logit contributions
 comparison+0.036
 outer+0.035
 common+0.034
 awesome+0.034
 fun+0.034
Neg logit contributions
bee-0.021
owed-0.020
itty-0.020
m-0.019
op-0.018
Token
Token ID198
Feature activation+2.19e-03

Pos logit contributions
bee+0.040
itty+0.038
owed+0.038
oke+0.035
op+0.035
Neg logit contributions
 comparison-0.027
 outer-0.026
 fun-0.026
 common-0.026
 our-0.025
Token
Token ID198
Feature activation-8.03e-03

Pos logit contributions
 comparison+0.004
 outer+0.004
 fun+0.003
 common+0.003
 our+0.003
Neg logit contributions
bee-0.004
itty-0.004
owed-0.004
oke-0.004
op-0.004
TokenThe
Token ID464
Feature activation-4.45e-02

Pos logit contributions
bee+0.015
itty+0.014
op+0.013
owed+0.013
ppy+0.013
Neg logit contributions
 comparison-0.015
 outer-0.015
 sight-0.014
 trouble-0.014
 fun-0.014
Token hipp
Token ID18568
Feature activation-1.75e-01

Pos logit contributions
owed+0.068
bee+0.067
itty+0.065
op+0.060
ppy+0.060
Neg logit contributions
 comparison-0.097
 fun-0.097
 our-0.094
 awesome-0.094
 common-0.094
Tokeno
Token ID78
Feature activation-1.09e-01

Pos logit contributions
bee+0.317
ird+0.291
ppy+0.282
bees+0.270
itive+0.268
Neg logit contributions
 their-0.303
 his-0.293
 our-0.292
 fun-0.267
 far-0.265
Token was
Token ID373
Feature activation+5.18e-02

Pos logit contributions
bee+0.212
itty+0.209
owed+0.199
ird+0.197
unn+0.195
Neg logit contributions
 fun-0.177
 our-0.171
 an-0.170
 common-0.165
 their-0.165
Token happy
Token ID3772
Feature activation+1.63e-02

Pos logit contributions
 comparison+0.090
 outer+0.089
 common+0.085
 backup+0.078
 faith+0.076
Neg logit contributions
bee-0.100
m-0.095
owed-0.095
itty-0.091
bug-0.088
Token and
Token ID290
Feature activation+2.12e-02

Pos logit contributions
 comparison+0.030
 fun+0.029
 outer+0.029
 common+0.028
 our+0.028
Neg logit contributions
bee-0.031
itty-0.030
owed-0.029
oke-0.027
op-0.027
Token Lily
Token ID20037
Feature activation-8.69e-02

Pos logit contributions
 comparison+0.048
 outer+0.046
 common+0.043
 trouble+0.043
 backup+0.042
Neg logit contributions
bee-0.031
owed-0.031
itty-0.027
m-0.027
bees-0.026
Token here
Token ID994
Feature activation+1.12e-02

Pos logit contributions
 comparison+0.077
 outer+0.071
 common+0.070
 thrilling+0.069
 backup+0.068
Neg logit contributions
bee-0.090
itty-0.083
owed-0.082
op-0.081
flower-0.079
Token?"
Token ID1701
Feature activation-1.49e-02

Pos logit contributions
 comparison+0.026
 outer+0.024
 common+0.024
 trouble+0.024
 fun+0.024
Neg logit contributions
bee-0.015
itty-0.014
owed-0.014
m-0.013
op-0.013
Token
Token ID198
Feature activation+6.31e-03

Pos logit contributions
bee+0.041
owed+0.038
unn+0.038
itty+0.038
oke+0.037
Neg logit contributions
 an-0.016
 comparison-0.015
 fun-0.014
 outer-0.014
 their-0.014
Token
Token ID198
Feature activation+9.81e-05

Pos logit contributions
 comparison+0.009
 common+0.009
 outer+0.009
 fun+0.009
 our+0.008
Neg logit contributions
bee-0.015
owed-0.014
itty-0.014
oke-0.013
unn-0.013
TokenThe
Token ID464
Feature activation-4.35e-02

Pos logit contributions
 comparison+0.000
 outer+0.000
 fun+0.000
 common+0.000
 our+0.000
Neg logit contributions
bee-0.000
owed-0.000
itty-0.000
oke-0.000
op-0.000
Token crocod
Token ID37565
Feature activation-1.75e-01

Pos logit contributions
bee+0.086
owed+0.083
itty+0.082
op+0.076
oke+0.075
Neg logit contributions
 comparison-0.075
 fun-0.074
 common-0.072
 outer-0.072
 our-0.071
Tokenile
Token ID576
Feature activation-1.24e-01

Pos logit contributions
ird+0.324
owed+0.297
pp+0.271
anna+0.262
bee+0.259
Neg logit contributions
 awesome-0.300
 fun-0.298
 surprising-0.283
 unexpected-0.279
 comparison-0.278
Token replied
Token ID8712
Feature activation+2.57e-02

Pos logit contributions
itty+0.227
unn+0.226
bee+0.221
ird+0.218
owed+0.210
Neg logit contributions
 fun-0.225
 an-0.215
 great-0.206
 his-0.206
 sight-0.205
Token,
Token ID11
Feature activation-1.85e-03

Pos logit contributions
 comparison+0.050
 fun+0.048
 outer+0.048
 our+0.047
 an+0.047
Neg logit contributions
bee-0.045
itty-0.043
owed-0.041
unn-0.039
oke-0.039
Token "
Token ID366
Feature activation-8.93e-03

Pos logit contributions
bee+0.004
itty+0.004
owed+0.004
unn+0.004
oke+0.004
Neg logit contributions
 comparison-0.002
 fun-0.002
 an-0.002
 outer-0.002
 our-0.002
TokenI
Token ID40
Feature activation+1.13e-03

Pos logit contributions
bee+0.020
owed+0.020
itty+0.019
oke+0.018
unn+0.018
Neg logit contributions
 comparison-0.013
 outer-0.012
 common-0.012
 fun-0.012
 our-0.012
TokenOne
Token ID3198
Feature activation-9.54e-03

Pos logit contributions
bee+0.012
itty+0.012
owed+0.012
oke+0.011
op+0.011
Neg logit contributions
 comparison-0.006
 outer-0.006
 fun-0.006
 common-0.006
 our-0.005
Token day
Token ID1110
Feature activation+9.28e-03

Pos logit contributions
bee+0.023
owed+0.022
itty+0.022
oke+0.021
ird+0.021
Neg logit contributions
 comparison-0.012
 fun-0.012
 outer-0.012
 common-0.012
 our-0.012
Token,
Token ID11
Feature activation+1.49e-02

Pos logit contributions
 fun+0.017
 our+0.017
 an+0.017
 comparison+0.016
 common+0.016
Neg logit contributions
bee-0.017
itty-0.017
owed-0.016
oke-0.015
ppy-0.015
Token the
Token ID262
Feature activation-4.09e-02

Pos logit contributions
 comparison+0.014
 outer+0.013
 common+0.012
 thrilling+0.012
 trouble+0.012
Neg logit contributions
bee-0.042
itty-0.040
owed-0.039
op-0.038
m-0.038
Token green
Token ID4077
Feature activation-1.03e-01

Pos logit contributions
bee+0.077
itty+0.072
owed+0.070
oke+0.067
m+0.067
Neg logit contributions
 comparison-0.076
 outer-0.073
 common-0.072
 fun-0.071
 our-0.070
Token oct
Token ID19318
Feature activation-1.75e-01

Pos logit contributions
owed+0.236
ird+0.231
itty+0.230
bee+0.223
oke+0.216
Neg logit contributions
 sight-0.145
 comparison-0.143
 outer-0.139
 fun-0.139
 common-0.136
Tokenopus
Token ID25790
Feature activation-1.28e-01

Pos logit contributions
owed+0.238
bee+0.222
itty+0.200
ird+0.199
flower+0.188
Neg logit contributions
 fun-0.394
 common-0.388
 outer-0.385
 comparison-0.382
 awesome-0.382
Token got
Token ID1392
Feature activation+7.26e-02

Pos logit contributions
bee+0.287
itty+0.282
owed+0.280
oke+0.267
unn+0.266
Neg logit contributions
 fun-0.181
 an-0.171
 our-0.168
 comparison-0.166
 trouble-0.165
Token sick
Token ID6639
Feature activation+9.53e-03

Pos logit contributions
 comparison+0.122
 outer+0.117
 common+0.111
 thrilling+0.110
 awesome+0.105
Neg logit contributions
m-0.149
owed-0.134
bee-0.134
ie-0.132
itty-0.130
Token.
Token ID13
Feature activation-2.02e-03

Pos logit contributions
 fun+0.016
 an+0.016
 trouble+0.016
 their+0.015
 late+0.015
Neg logit contributions
itty-0.020
bee-0.019
owed-0.019
oke-0.019
anda-0.018
Token His
Token ID2399
Feature activation-6.92e-03

Pos logit contributions
bee+0.004
owed+0.004
itty+0.004
oke+0.004
op+0.004
Neg logit contributions
 comparison-0.003
 outer-0.003
 common-0.003
 fun-0.003
 awesome-0.003
Token,
Token ID11
Feature activation-1.95e-03

Pos logit contributions
 comparison+0.047
 outer+0.046
 fun+0.045
 common+0.045
 our+0.044
Neg logit contributions
bee-0.037
itty-0.035
owed-0.035
oke-0.032
op-0.032
Token mom
Token ID1995
Feature activation-7.54e-02

Pos logit contributions
bee+0.004
owed+0.004
itty+0.004
op+0.004
oke+0.004
Neg logit contributions
 comparison-0.003
 fun-0.003
 our-0.003
 outer-0.003
 common-0.003
Token.
Token ID13
Feature activation+6.05e-04

Pos logit contributions
bee+0.173
owed+0.154
itty+0.152
op+0.151
bees+0.148
Neg logit contributions
 fun-0.117
 your-0.112
 late-0.112
 our-0.111
 trouble-0.103
Token Maybe
Token ID6674
Feature activation+2.87e-02

Pos logit contributions
 our+0.001
 comparison+0.001
 outer+0.001
 fun+0.001
 an+0.001
Neg logit contributions
bee-0.001
owed-0.001
itty-0.001
ie-0.001
ops-0.001
Token the
Token ID262
Feature activation-3.80e-02

Pos logit contributions
 comparison+0.039
 common+0.035
 outer+0.035
 thrilling+0.034
 fun+0.034
Neg logit contributions
bee-0.068
itty-0.065
owed-0.065
oke-0.061
m-0.061
Token hipp
Token ID18568
Feature activation-1.75e-01

Pos logit contributions
owed+0.059
bee+0.058
itty+0.055
ppy+0.052
op+0.052
Neg logit contributions
 fun-0.084
 comparison-0.083
 our-0.081
 awesome-0.080
 common-0.080
Tokeno
Token ID78
Feature activation-1.07e-01

Pos logit contributions
bee+0.313
ppy+0.282
isy+0.277
ird+0.277
owed+0.272
Neg logit contributions
 our-0.315
 their-0.293
 an-0.281
 common-0.276
 fun-0.276
Token is
Token ID318
Feature activation+5.11e-02

Pos logit contributions
bee+0.224
owed+0.210
itty+0.208
ird+0.205
unn+0.202
Neg logit contributions
 our-0.169
 fun-0.163
 common-0.162
 trouble-0.157
 an-0.155
Token not
Token ID407
Feature activation+5.49e-02

Pos logit contributions
 comparison+0.067
 outer+0.063
 thrilling+0.062
 backup+0.057
Hours+0.057
Neg logit contributions
bee-0.115
flower-0.104
itty-0.104
m-0.104
bug-0.103
Token so
Token ID523
Feature activation+4.72e-02

Pos logit contributions
 comparison+0.061
 outer+0.058
 common+0.056
 fun+0.055
 awesome+0.053
Neg logit contributions
bee-0.143
itty-0.137
owed-0.136
oke-0.130
op-0.128
Token dull
Token ID19222
Feature activation-1.19e-03

Pos logit contributions
 comparison+0.051
 outer+0.048
 thrilling+0.043
 common+0.042
 sight+0.041
Neg logit contributions
bee-0.124
itty-0.118
owed-0.116
oke-0.115
m-0.111
Tokenâ
Token ID22940
Feature activation-4.80e-02

Pos logit contributions
 our+0.001
 outer+0.001
 an+0.001
 your+0.001
 fun+0.001
Neg logit contributions
bee-0.003
owed-0.003
bees-0.003
ops-0.003
itive-0.003
Token
Token ID26391
Feature activation-6.68e-02

Pos logit contributions
bee+0.095
bees+0.086
unn+0.086
owed+0.086
op+0.085
Neg logit contributions
 common-0.082
 fun-0.079
 comparison-0.079
 awesome-0.077
 our-0.077
Token
Token ID198
Feature activation+1.31e-02

Pos logit contributions
bee+0.118
owed+0.104
unn+0.103
itty+0.097
itive+0.096
Neg logit contributions
 fun-0.135
 our-0.131
 common-0.128
 an-0.126
 awesome-0.124
Token
Token ID198
Feature activation-2.36e-03

Pos logit contributions
 comparison+0.021
 outer+0.021
 fun+0.020
 common+0.020
 our+0.020
Neg logit contributions
bee-0.027
itty-0.026
owed-0.026
oke-0.024
op-0.024
TokenThe
Token ID464
Feature activation-4.29e-02

Pos logit contributions
bee+0.005
itty+0.004
owed+0.004
op+0.004
m+0.004
Neg logit contributions
 comparison-0.004
 outer-0.004
 trouble-0.004
 common-0.004
 sight-0.004
Token le
Token ID443
Feature activation-1.75e-01

Pos logit contributions
bee+0.081
itty+0.076
owed+0.075
oke+0.071
op+0.070
Neg logit contributions
 comparison-0.080
 outer-0.077
 fun-0.076
 common-0.076
 our-0.075
Tokenopard
Token ID15478
Feature activation-1.27e-01

Pos logit contributions
ie+0.327
owed+0.321
itive+0.308
iness+0.303
ppy+0.292
Neg logit contributions
 golf-0.273
 steel-0.273
 fun-0.266
 awesome-0.248
 an-0.248
Token nodded
Token ID14464
Feature activation+3.28e-02

Pos logit contributions
bee+0.243
itty+0.240
unn+0.231
oke+0.230
owed+0.227
Neg logit contributions
 fun-0.221
 an-0.210
 great-0.199
 common-0.198
 our-0.197
Token and
Token ID290
Feature activation+3.73e-02

Pos logit contributions
 comparison+0.040
 outer+0.037
 fun+0.036
 common+0.036
 trouble+0.035
Neg logit contributions
bee-0.084
owed-0.080
itty-0.079
m-0.074
ppy-0.074
Token said
Token ID531
Feature activation+4.84e-02

Pos logit contributions
 comparison+0.081
 outer+0.079
 fun+0.078
 common+0.078
 our+0.077
Neg logit contributions
bee-0.058
itty-0.054
owed-0.053
oke-0.050
op-0.049
Token,
Token ID11
Feature activation+1.57e-03

Pos logit contributions
 comparison+0.100
 outer+0.097
 fun+0.096
 common+0.095
 our+0.094
Neg logit contributions
bee-0.080
itty-0.076
owed-0.074
oke-0.069
op-0.068
Token us
Token ID514
Feature activation-5.01e-02

Pos logit contributions
 comparison+0.045
 thrilling+0.040
 common+0.039
 outer+0.039
 accomplishment+0.038
Neg logit contributions
bee-0.070
itty-0.068
ie-0.067
op-0.064
ppy-0.064
Token?"
Token ID1701
Feature activation-2.29e-02

Pos logit contributions
bee+0.090
itty+0.086
owed+0.084
oke+0.080
op+0.078
Neg logit contributions
 comparison-0.095
 fun-0.093
 outer-0.093
 common-0.092
 our-0.092
Token
Token ID198
Feature activation-1.62e-02

Pos logit contributions
bee+0.043
itty+0.042
owed+0.040
oke+0.039
op+0.038
Neg logit contributions
 comparison-0.042
 outer-0.040
 common-0.039
 fun-0.039
 awesome-0.038
Token
Token ID198
Feature activation-2.02e-02

Pos logit contributions
bee+0.032
itty+0.031
owed+0.030
oke+0.029
op+0.028
Neg logit contributions
 comparison-0.028
 outer-0.027
 fun-0.027
 common-0.026
 awesome-0.026
TokenThe
Token ID464
Feature activation-4.11e-02

Pos logit contributions
bee+0.039
itty+0.038
owed+0.036
oke+0.035
op+0.035
Neg logit contributions
 comparison-0.037
 outer-0.035
 fun-0.034
 awesome-0.034
 trouble-0.034
Token oct
Token ID19318
Feature activation-1.75e-01

Pos logit contributions
bee+0.075
owed+0.072
itty+0.072
oke+0.066
op+0.065
Neg logit contributions
 comparison-0.078
 outer-0.075
 fun-0.075
 common-0.074
 our-0.073
Tokenopus
Token ID25790
Feature activation-1.32e-01

Pos logit contributions
owed+0.268
bee+0.241
ird+0.226
itty+0.220
ie+0.219
Neg logit contributions
 common-0.348
 trouble-0.338
 comparison-0.338
 backup-0.337
 fun-0.337
Token did
Token ID750
Feature activation+7.05e-02

Pos logit contributions
unn+0.275
itty+0.270
ird+0.259
bee+0.254
pp+0.252
Neg logit contributions
 fun-0.216
 an-0.200
 his-0.199
 common-0.195
 their-0.193
Token not
Token ID407
Feature activation+4.69e-02

Pos logit contributions
 outer+0.097
 comparison+0.097
rous+0.093
 trouble+0.092
 sight+0.088
Neg logit contributions
bee-0.154
owed-0.150
ie-0.128
bees-0.125
es-0.124
Token speak
Token ID2740
Feature activation+3.76e-02

Pos logit contributions
 fun+0.059
 common+0.057
 comparison+0.053
 awesome+0.052
 our+0.050
Neg logit contributions
itty-0.121
op-0.116
owed-0.113
bee-0.112
ops-0.111
Token.
Token ID13
Feature activation-1.39e-02

Pos logit contributions
 comparison+0.082
 outer+0.081
 thrilling+0.076
 common+0.076
 trouble+0.076
Neg logit contributions
bee-0.058
owed-0.057
itty-0.050
ppy-0.048
m-0.048
Token.
Token ID13
Feature activation-1.34e-02

Pos logit contributions
 comparison+0.055
 outer+0.053
 fun+0.053
 common+0.053
 our+0.052
Neg logit contributions
bee-0.053
itty-0.051
owed-0.050
oke-0.046
op-0.046
Token
Token ID220
Feature activation+3.81e-03

Pos logit contributions
bee+0.032
owed+0.030
itty+0.030
oke+0.028
op+0.028
Neg logit contributions
 comparison-0.018
 outer-0.018
 fun-0.017
 our-0.017
 common-0.017
Token
Token ID198
Feature activation+2.55e-03

Pos logit contributions
 fun+0.005
 outer+0.005
 comparison+0.005
 our+0.005
 common+0.005
Neg logit contributions
bee-0.009
itty-0.008
owed-0.008
flower-0.008
ppy-0.008
Token
Token ID198
Feature activation-4.84e-03

Pos logit contributions
 comparison+0.004
 outer+0.004
 fun+0.004
 common+0.004
 our+0.004
Neg logit contributions
bee-0.005
owed-0.005
itty-0.005
oke-0.005
op-0.005
TokenThe
Token ID464
Feature activation-3.91e-02

Pos logit contributions
bee+0.010
itty+0.010
owed+0.009
oke+0.009
op+0.009
Neg logit contributions
 comparison-0.008
 outer-0.008
 common-0.007
 fun-0.007
 trouble-0.007
Token flaming
Token ID39518
Feature activation-1.75e-01

Pos logit contributions
owed+0.077
itty+0.075
bee+0.075
op+0.072
ird+0.071
Neg logit contributions
 fun-0.068
 comparison-0.068
 awesome-0.066
 sight-0.065
 our-0.064
Tokeno
Token ID78
Feature activation-9.97e-02

Pos logit contributions
ird+0.246
bee+0.244
itive+0.240
unn+0.231
bees+0.226
Neg logit contributions
 fun-0.295
 an-0.275
 our-0.275
 their-0.274
 common-0.273
Token nodded
Token ID14464
Feature activation+3.22e-02

Pos logit contributions
itty+0.184
bee+0.181
unn+0.173
owed+0.173
ird+0.172
Neg logit contributions
 fun-0.168
 our-0.161
 an-0.160
 common-0.157
 great-0.156
Token its
Token ID663
Feature activation-1.42e-02

Pos logit contributions
 comparison+0.061
 outer+0.059
 fun+0.058
 common+0.058
 trouble+0.058
Neg logit contributions
bee-0.059
owed-0.056
itty-0.053
m-0.050
ppy-0.049
Token head
Token ID1182
Feature activation-3.22e-02

Pos logit contributions
bee+0.020
owed+0.019
itty+0.018
ppy+0.016
flower+0.016
Neg logit contributions
 comparison-0.033
 outer-0.032
 common-0.031
 trouble-0.031
 fun-0.031
Token and
Token ID290
Feature activation+2.87e-02

Pos logit contributions
bee+0.056
itty+0.054
owed+0.051
oke+0.050
unn+0.050
Neg logit contributions
 comparison-0.062
 outer-0.060
 fun-0.060
 common-0.059
 our-0.059
Token was
Token ID373
Feature activation+6.01e-02

Pos logit contributions
unn+0.197
itty+0.192
ird+0.191
owers+0.190
erry+0.187
Neg logit contributions
 his-0.119
 their-0.114
 an-0.109
 its-0.109
 fun-0.105
Token amazed
Token ID24562
Feature activation+4.86e-02

Pos logit contributions
 outer+0.102
 comparison+0.102
 common+0.098
 backup+0.091
 thrilling+0.088
Neg logit contributions
bee-0.115
owed-0.113
itty-0.112
m-0.111
bug-0.108
Token and
Token ID290
Feature activation+2.81e-02

Pos logit contributions
 comparison+0.093
 fun+0.092
 outer+0.089
 common+0.088
 sight+0.088
Neg logit contributions
bee-0.089
itty-0.084
owed-0.083
ird-0.078
op-0.078
Token asked
Token ID1965
Feature activation+3.62e-02

Pos logit contributions
 comparison+0.064
 outer+0.063
 common+0.062
 backup+0.061
 trouble+0.060
Neg logit contributions
bee-0.040
owed-0.038
ie-0.033
itty-0.033
m-0.032
Token the
Token ID262
Feature activation-5.11e-02

Pos logit contributions
 comparison+0.035
 outer+0.032
 common+0.032
 fun+0.032
 awesome+0.030
Neg logit contributions
bee-0.100
itty-0.097
owed-0.096
oke-0.092
op-0.091
Token hipp
Token ID18568
Feature activation-1.75e-01

Pos logit contributions
owed+0.083
bee+0.083
itty+0.077
ppy+0.075
op+0.073
Neg logit contributions
 fun-0.110
 comparison-0.108
 awesome-0.106
 common-0.105
 our-0.105
Tokeno
Token ID78
Feature activation-9.96e-02

Pos logit contributions
bee+0.306
ird+0.283
itive+0.257
bees+0.254
ppy+0.253
Neg logit contributions
 their-0.317
 his-0.313
 an-0.306
 our-0.304
 fun-0.291
Token how
Token ID703
Feature activation+4.54e-02

Pos logit contributions
bee+0.214
itty+0.205
owed+0.198
op+0.190
oke+0.189
Neg logit contributions
 comparison-0.153
 fun-0.150
 outer-0.149
 common-0.147
 our-0.146
Token he
Token ID339
Feature activation-2.47e-02

Pos logit contributions
 comparison+0.058
 outer+0.055
 common+0.054
 our+0.051
 sight+0.050
Neg logit contributions
bee-0.106
owed-0.101
itty-0.099
oke-0.097
ie-0.096
Token knew
Token ID2993
Feature activation+5.56e-02

Pos logit contributions
bee+0.047
itty+0.045
owed+0.043
op+0.042
oke+0.042
Neg logit contributions
 fun-0.044
 comparison-0.043
 outer-0.042
 common-0.042
 awesome-0.042
Token how
Token ID703
Feature activation+5.94e-02

Pos logit contributions
 comparison+0.063
 outer+0.060
 fun+0.059
 common+0.058
 our+0.057
Neg logit contributions
bee-0.143
itty-0.138
owed-0.137
oke-0.131
op-0.130
Token in
Token ID287
Feature activation+1.74e-02

Pos logit contributions
 comparison+0.069
 outer+0.067
 fun+0.066
 common+0.066
 our+0.065
Neg logit contributions
bee-0.074
itty-0.070
owed-0.069
oke-0.065
op-0.064
Token the
Token ID262
Feature activation-2.97e-02

Pos logit contributions
 thrilling+0.022
 comparison+0.022
 destination+0.021
 brand+0.021
 accomplishment+0.021
Neg logit contributions
itty-0.037
bee-0.034
bird-0.034
bug-0.033
oc-0.031
Token jungle
Token ID20712
Feature activation-7.73e-02

Pos logit contributions
itty+0.056
bee+0.052
m+0.050
flower+0.049
op+0.048
Neg logit contributions
 comparison-0.054
 common-0.053
 outer-0.051
 accomplishment-0.049
 thrilling-0.048
Token.
Token ID13
Feature activation-8.98e-03

Pos logit contributions
bee+0.160
itty+0.148
unn+0.144
owed+0.143
oke+0.142
Neg logit contributions
 fun-0.136
 an-0.131
 far-0.130
 comparison-0.128
 their-0.127
Token The
Token ID383
Feature activation-4.42e-02

Pos logit contributions
bee+0.021
itty+0.020
owed+0.020
oke+0.019
op+0.019
Neg logit contributions
 comparison-0.012
 outer-0.012
 common-0.011
 fun-0.011
 awesome-0.011
Token le
Token ID443
Feature activation-1.74e-01

Pos logit contributions
bee+0.086
owed+0.083
itty+0.082
oke+0.076
op+0.076
Neg logit contributions
 comparison-0.078
 fun-0.076
 outer-0.075
 common-0.074
 our-0.073
Tokenopard
Token ID15478
Feature activation-1.32e-01

Pos logit contributions
owed+0.320
ie+0.308
ird+0.302
bee+0.293
iness+0.293
Neg logit contributions
 fun-0.273
 steel-0.264
 golf-0.259
 far-0.255
 long-0.251
Token had
Token ID550
Feature activation+6.74e-02

Pos logit contributions
bee+0.299
owed+0.299
unn+0.296
itty+0.294
ird+0.292
Neg logit contributions
 fun-0.195
 an-0.179
 his-0.171
 great-0.169
 exciting-0.167
Token many
Token ID867
Feature activation+1.19e-02

Pos logit contributions
 comparison+0.094
 outer+0.091
 avail+0.089
 common+0.087
 accomplishment+0.087
Neg logit contributions
bee-0.149
itty-0.142
affe-0.135
m-0.134
bees-0.131
Token friends
Token ID2460
Feature activation-5.77e-02

Pos logit contributions
 fun+0.019
 comparison+0.018
 trouble+0.018
 awesome+0.017
 common+0.017
Neg logit contributions
bee-0.026
owed-0.025
itty-0.025
ppy-0.023
oke-0.023
Token,
Token ID11
Feature activation+1.77e-02

Pos logit contributions
bee+0.102
itty+0.094
owed+0.094
oke+0.090
unn+0.090
Neg logit contributions
 fun-0.112
 comparison-0.108
 their-0.108
 common-0.107
 our-0.107
Token;
Token ID26
Feature activation+2.88e-02

Pos logit contributions
 comparison+0.046
 outer+0.044
 fun+0.044
 common+0.044
 our+0.043
Neg logit contributions
bee-0.041
itty-0.039
owed-0.039
oke-0.036
op-0.036
Token they
Token ID484
Feature activation-1.24e-02

Pos logit contributions
 outer+0.040
 comparison+0.040
 common+0.039
 awesome+0.037
 thrilling+0.035
Neg logit contributions
bee-0.064
itty-0.060
m-0.058
owed-0.058
oke-0.057
Token knew
Token ID2993
Feature activation+5.32e-02

Pos logit contributions
bee+0.021
itty+0.021
owed+0.020
oke+0.019
unn+0.019
Neg logit contributions
 comparison-0.025
 fun-0.024
 outer-0.024
 common-0.023
 our-0.023
Token that
Token ID326
Feature activation+2.00e-02

Pos logit contributions
 comparison+0.072
 outer+0.066
 common+0.065
 awesome+0.062
 thrilling+0.062
Neg logit contributions
bee-0.125
owed-0.120
itty-0.119
oke-0.114
m-0.112
Token the
Token ID262
Feature activation-3.81e-02

Pos logit contributions
 comparison+0.025
 outer+0.024
 common+0.023
 awesome+0.023
 thrilling+0.022
Neg logit contributions
bee-0.048
owed-0.046
itty-0.046
oke-0.044
m-0.043
Token hipp
Token ID18568
Feature activation-1.74e-01

Pos logit contributions
bee+0.057
itty+0.053
owed+0.053
oke+0.048
op+0.048
Neg logit contributions
 comparison-0.085
 fun-0.083
 outer-0.082
 common-0.082
 our-0.081
Tokeno
Token ID78
Feature activation-1.09e-01

Pos logit contributions
bee+0.286
ird+0.256
owed+0.242
itive+0.241
ppy+0.237
Neg logit contributions
 our-0.312
 their-0.306
 an-0.304
 fun-0.302
 common-0.300
Token would
Token ID561
Feature activation+5.89e-02

Pos logit contributions
bee+0.196
itty+0.184
owed+0.181
unn+0.173
oke+0.172
Neg logit contributions
 fun-0.197
 comparison-0.196
 common-0.196
 our-0.193
 an-0.188
Token always
Token ID1464
Feature activation+5.78e-02

Pos logit contributions
 comparison+0.085
 outer+0.084
 awesome+0.078
 trouble+0.077
 thrilling+0.077
Neg logit contributions
bee-0.131
owed-0.122
itty-0.119
m-0.116
ppy-0.113
Token be
Token ID307
Feature activation+5.24e-02

Pos logit contributions
 comparison+0.083
 outer+0.079
 common+0.076
 fun+0.075
 trouble+0.074
Neg logit contributions
bee-0.131
owed-0.125
itty-0.124
oke-0.116
m-0.116
Token close
Token ID1969
Feature activation+1.63e-02

Pos logit contributions
 comparison+0.078
 outer+0.076
 common+0.069
 thrilling+0.067
 backup+0.064
Neg logit contributions
bee-0.110
owed-0.109
itty-0.107
m-0.105
ie-0.104
Token playing
Token ID2712
Feature activation+4.21e-02

Pos logit contributions
 comparison+0.088
 thrilling+0.076
 common+0.076
 outer+0.074
 backup+0.073
Neg logit contributions
bee-0.211
op-0.191
owed-0.191
ie-0.187
m-0.187
Token with
Token ID351
Feature activation+3.60e-02

Pos logit contributions
 comparison+0.074
 outer+0.070
 common+0.069
 fun+0.067
 awesome+0.066
Neg logit contributions
bee-0.083
itty-0.079
owed-0.078
op-0.073
oke-0.073
Token it
Token ID340
Feature activation-3.35e-02

Pos logit contributions
 comparison+0.043
 outer+0.035
 thrilling+0.033
Half+0.032
Hours+0.031
Neg logit contributions
bee-0.091
m-0.085
itty-0.085
ops-0.083
ie-0.082
Token until
Token ID1566
Feature activation+4.31e-02

Pos logit contributions
bee+0.056
itty+0.053
owed+0.053
oke+0.048
op+0.048
Neg logit contributions
 comparison-0.069
 outer-0.066
 common-0.065
 fun-0.064
 our-0.064
Token it
Token ID340
Feature activation-1.95e-02

Pos logit contributions
 comparison+0.062
 ours+0.060
 thrilling+0.060
 common+0.059
 awesome+0.058
Neg logit contributions
m-0.081
ops-0.078
ird-0.077
bee-0.075
ie-0.073
Token was
Token ID373
Feature activation+6.99e-02

Pos logit contributions
bee+0.033
itty+0.031
owed+0.031
oke+0.029
op+0.028
Neg logit contributions
 comparison-0.040
 fun-0.039
 outer-0.038
 common-0.038
 our-0.037
Token time
Token ID640
Feature activation+3.53e-02

Pos logit contributions
 backup+0.106
 comparison+0.105
 outer+0.105
 common+0.104
 faith+0.100
Neg logit contributions
ie-0.134
bee-0.127
m-0.126
flower-0.121
owed-0.120
Token to
Token ID284
Feature activation+5.92e-02

Pos logit contributions
 comparison+0.064
 outer+0.059
 common+0.057
 awesome+0.054
 thrilling+0.054
Neg logit contributions
owed-0.068
m-0.066
bee-0.064
oke-0.062
op-0.061
Token go
Token ID467
Feature activation+4.56e-02

Pos logit contributions
 comparison+0.081
 outer+0.077
 common+0.075
 fun+0.075
 thrilling+0.074
Neg logit contributions
bee-0.138
owed-0.130
itty-0.129
m-0.123
ird-0.122
Token home
Token ID1363
Feature activation+8.79e-03

Pos logit contributions
 comparison+0.065
 common+0.057
 outer+0.056
 thrilling+0.055
 awesome+0.055
Neg logit contributions
itty-0.105
m-0.101
owed-0.101
bee-0.101
ie-0.099
Token.
Token ID13
Feature activation+7.89e-03

Pos logit contributions
 comparison+0.020
 outer+0.019
 common+0.019
 fun+0.019
 awesome+0.019
Neg logit contributions
bee-0.013
owed-0.012
itty-0.012
oke-0.011
op-0.011
Token happily
Token ID18177
Feature activation-1.85e-03

Pos logit contributions
 comparison+0.080
 outer+0.078
 common+0.078
 fun+0.075
 awesome+0.075
Neg logit contributions
bee-0.076
owed-0.072
itty-0.071
oke-0.065
m-0.064
Token as
Token ID355
Feature activation+2.01e-02

Pos logit contributions
itty+0.004
bee+0.004
unn+0.003
oke+0.003
owed+0.003
Neg logit contributions
 fun-0.003
 comparison-0.003
 an-0.003
 outer-0.003
 common-0.003
Token if
Token ID611
Feature activation+2.03e-02

Pos logit contributions
 comparison+0.031
 outer+0.029
 common+0.028
 fun+0.028
 awesome+0.028
Neg logit contributions
bee-0.044
itty-0.041
owed-0.041
oke-0.038
op-0.038
Token he
Token ID339
Feature activation-2.53e-02

Pos logit contributions
 comparison+0.029
 awesome+0.026
 outer+0.026
 common+0.026
 thrilling+0.026
Neg logit contributions
bee-0.046
itty-0.044
owed-0.043
ie-0.041
ne-0.040
Token understood
Token ID7247
Feature activation+5.76e-02

Pos logit contributions
bee+0.050
itty+0.049
oke+0.046
unn+0.046
owed+0.046
Neg logit contributions
 fun-0.041
 common-0.039
 great-0.039
 an-0.039
 thrilling-0.039
Token and
Token ID290
Feature activation+3.07e-02

Pos logit contributions
 comparison+0.105
 outer+0.101
 common+0.099
 fun+0.099
 awesome+0.098
Neg logit contributions
bee-0.109
owed-0.103
itty-0.102
oke-0.096
op-0.094
Token she
Token ID673
Feature activation-1.66e-02

Pos logit contributions
 comparison+0.061
 outer+0.059
 common+0.058
 awesome+0.056
 our+0.055
Neg logit contributions
bee-0.054
owed-0.052
itty-0.051
oke-0.047
m-0.046
Token gave
Token ID2921
Feature activation+5.90e-02

Pos logit contributions
bee+0.031
itty+0.031
unn+0.028
oke+0.028
op+0.028
Neg logit contributions
 fun-0.029
 an-0.028
 common-0.027
 comparison-0.027
 great-0.027
Token him
Token ID683
Feature activation-3.15e-02

Pos logit contributions
 comparison+0.050
 outer+0.046
 common+0.044
 fun+0.044
 awesome+0.042
Neg logit contributions
bee-0.169
itty-0.165
owed-0.164
op-0.155
oke-0.155
Token a
Token ID257
Feature activation+1.64e-02

Pos logit contributions
bee+0.091
owed+0.089
itty+0.089
ppy+0.084
m+0.084
Neg logit contributions
 comparison-0.026
 outer-0.022
 common-0.022
 thrilling-0.020
 fun-0.020
Token nice
Token ID3621
Feature activation+8.92e-03

Pos logit contributions
 comparison+0.019
 outer+0.018
 common+0.017
 fun+0.017
 our+0.016
Neg logit contributions
bee-0.042
itty-0.041
owed-0.040
op-0.039
oke-0.039
Token.
Token ID13
Feature activation+9.97e-03

Pos logit contributions
 comparison+0.074
 common+0.070
 outer+0.065
 awesome+0.065
 our+0.065
Neg logit contributions
bee-0.111
itty-0.106
owed-0.105
m-0.100
op-0.099
Token As
Token ID1081
Feature activation+4.16e-02

Pos logit contributions
 comparison+0.015
 outer+0.015
 fun+0.015
 common+0.014
 trouble+0.014
Neg logit contributions
bee-0.022
itty-0.021
owed-0.020
oke-0.019
op-0.019
Token she
Token ID673
Feature activation-2.64e-02

Pos logit contributions
 comparison+0.077
 common+0.073
 outer+0.073
 awesome+0.071
 fun+0.071
Neg logit contributions
bee-0.076
itty-0.073
owed-0.071
m-0.067
op-0.066
Token wandered
Token ID36036
Feature activation+5.99e-02

Pos logit contributions
bee+0.053
itty+0.051
owed+0.049
oke+0.048
unn+0.046
Neg logit contributions
 fun-0.043
 outer-0.043
 comparison-0.043
 common-0.041
 thrilling-0.041
Token around
Token ID1088
Feature activation+2.78e-02

Pos logit contributions
 comparison+0.122
 common+0.113
 outer+0.112
 trouble+0.110
 our+0.109
Neg logit contributions
bee-0.102
itty-0.098
owed-0.096
m-0.092
ie-0.089
Token,
Token ID11
Feature activation+2.89e-02

Pos logit contributions
 comparison+0.047
 awesome+0.045
 common+0.045
 ours+0.044
 our+0.043
Neg logit contributions
bee-0.053
owed-0.051
itty-0.051
m-0.048
ie-0.047
Token Lucy
Token ID22162
Feature activation-8.03e-02

Pos logit contributions
 comparison+0.048
 fun+0.046
 outer+0.046
 an+0.045
 trouble+0.045
Neg logit contributions
bee-0.060
itty-0.057
owed-0.056
oke-0.055
unn-0.053
Token tried
Token ID3088
Feature activation+7.77e-02

Pos logit contributions
bee+0.172
itty+0.166
owed+0.161
oke+0.156
unn+0.154
Neg logit contributions
 comparison-0.121
 fun-0.119
 outer-0.118
 an-0.114
 thrilling-0.113
Token to
Token ID284
Feature activation+5.56e-02

Pos logit contributions
 common+0.114
 comparison+0.114
 outer+0.108
 awesome+0.107
 our+0.105
Neg logit contributions
bee-0.164
m-0.158
owed-0.151
op-0.145
itty-0.142
Token guess
Token ID4724
Feature activation+5.42e-02

Pos logit contributions
 comparison+0.081
 outer+0.077
 common+0.077
 our+0.074
 fun+0.074
Neg logit contributions
bee-0.126
owed-0.117
itty-0.114
ird-0.109
ppy-0.108
Token what
Token ID644
Feature activation+4.45e-02

Pos logit contributions
 comparison+0.057
 fun+0.056
 outer+0.055
 awesome+0.052
 an+0.052
Neg logit contributions
bee-0.143
itty-0.141
owed-0.135
oke-0.132
unn-0.131
Token said
Token ID531
Feature activation+2.76e-02

Pos logit contributions
bee+0.032
itty+0.030
owed+0.029
unn+0.028
op+0.028
Neg logit contributions
 fun-0.036
 an-0.034
 common-0.034
 trouble-0.033
 awesome-0.033
Token he
Token ID339
Feature activation-2.72e-02

Pos logit contributions
 comparison+0.042
 outer+0.041
 common+0.040
 thrilling+0.038
 awesome+0.038
Neg logit contributions
bee-0.062
itty-0.058
owed-0.058
oke-0.057
op-0.055
Token would
Token ID561
Feature activation+5.93e-02

Pos logit contributions
bee+0.056
itty+0.051
unn+0.051
owed+0.050
bees+0.050
Neg logit contributions
 fun-0.046
 an-0.042
 great-0.042
 trouble-0.041
 awesome-0.041
Token try
Token ID1949
Feature activation+6.24e-02

Pos logit contributions
 comparison+0.084
 outer+0.081
 awesome+0.074
 thrilling+0.074
 sight+0.074
Neg logit contributions
bee-0.137
owed-0.130
itty-0.129
op-0.124
ird-0.123
Token to
Token ID284
Feature activation+4.25e-02

Pos logit contributions
 comparison+0.098
 outer+0.088
 awesome+0.085
 thrilling+0.085
 common+0.084
Neg logit contributions
bee-0.132
m-0.126
itty-0.124
op-0.123
owed-0.122
Token like
Token ID588
Feature activation+5.78e-02

Pos logit contributions
 comparison+0.065
 outer+0.060
 thrilling+0.057
 common+0.057
 sight+0.057
Neg logit contributions
bee-0.093
owed-0.088
itty-0.087
ird-0.084
op-0.083
Token the
Token ID262
Feature activation-3.59e-02

Pos logit contributions
 comparison+0.075
 common+0.070
 thrilling+0.066
 outer+0.064
 fantastic+0.063
Neg logit contributions
bee-0.132
itty-0.128
op-0.123
owed-0.122
ie-0.121
Token museum
Token ID13257
Feature activation-9.09e-02

Pos logit contributions
bee+0.080
itty+0.076
owed+0.075
oke+0.073
m+0.071
Neg logit contributions
 comparison-0.053
 outer-0.050
 common-0.049
 our-0.048
 fun-0.048
Token.
Token ID13
Feature activation-3.99e-03

Pos logit contributions
bee+0.175
itty+0.166
owed+0.162
unn+0.153
op+0.150
Neg logit contributions
 fun-0.166
 comparison-0.161
 awesome-0.157
 common-0.157
 our-0.157
Token Lily
Token ID20037
Feature activation-8.02e-02

Pos logit contributions
bee+0.007
oke+0.007
itty+0.007
owed+0.007
op+0.007
Neg logit contributions
 comparison-0.007
 outer-0.007
 thrilling-0.007
 awesome-0.007
 common-0.007
Token gave
Token ID2921
Feature activation+4.90e-02

Pos logit contributions
bee+0.172
unn+0.171
owers+0.169
erry+0.169
itty+0.166
Neg logit contributions
 fun-0.124
 an-0.115
 their-0.111
 his-0.109
 trouble-0.108
Token their
Token ID511
Feature activation-1.64e-02

Pos logit contributions
 comparison+0.036
 outer+0.032
 thrilling+0.030
 awesome+0.029
 late+0.028
Neg logit contributions
bee-0.097
itty-0.093
owed-0.093
op-0.092
ie-0.089
Token money
Token ID1637
Feature activation-4.99e-02

Pos logit contributions
bee+0.038
op+0.036
itty+0.036
owed+0.036
ops+0.034
Neg logit contributions
 comparison-0.021
 outer-0.021
 late-0.019
 common-0.019
 awesome-0.019
Token,
Token ID11
Feature activation+2.11e-02

Pos logit contributions
bee+0.078
itty+0.073
owed+0.073
oke+0.067
op+0.067
Neg logit contributions
 comparison-0.107
 outer-0.104
 common-0.102
 fun-0.102
 awesome-0.100
Token and
Token ID290
Feature activation+3.32e-02

Pos logit contributions
 outer+0.047
 comparison+0.047
 common+0.045
 awesome+0.044
 thrilling+0.044
Neg logit contributions
bee-0.031
itty-0.027
owed-0.027
oke-0.026
op-0.025
Token always
Token ID1464
Feature activation+5.91e-02

Pos logit contributions
 comparison+0.049
 outer+0.047
 common+0.046
 awesome+0.044
 thrilling+0.043
Neg logit contributions
bee-0.073
owed-0.070
itty-0.068
oke-0.067
op-0.065
Token tried
Token ID3088
Feature activation+7.23e-02

Pos logit contributions
 comparison+0.107
 outer+0.102
 common+0.098
 awesome+0.098
 thrilling+0.098
Neg logit contributions
bee-0.110
owed-0.110
itty-0.101
ird-0.099
m-0.099
Token to
Token ID284
Feature activation+4.91e-02

Pos logit contributions
 comparison+0.094
 outer+0.088
 awesome+0.087
 thrilling+0.085
 common+0.082
Neg logit contributions
bee-0.160
op-0.152
m-0.150
owed-0.150
y-0.148
Token find
Token ID1064
Feature activation+6.43e-02

Pos logit contributions
 comparison+0.072
 outer+0.067
 thrilling+0.063
 sight+0.063
 common+0.062
Neg logit contributions
bee-0.110
owed-0.103
itty-0.102
ird-0.099
op-0.098
Token the
Token ID262
Feature activation-2.82e-02

Pos logit contributions
 comparison+0.077
 outer+0.061
 common+0.060
 thrilling+0.060
 late+0.057
Neg logit contributions
owed-0.152
itty-0.150
bee-0.149
ird-0.148
ie-0.144
Token cheapest
Token ID28918
Feature activation-2.28e-02

Pos logit contributions
bee+0.062
itty+0.059
m+0.056
op+0.056
owed+0.055
Neg logit contributions
 comparison-0.040
 outer-0.040
 common-0.038
 our-0.037
 awesome-0.036
Token way
Token ID835
Feature activation+5.10e-03

Pos logit contributions
bee+0.051
itty+0.049
owed+0.048
oke+0.046
op+0.045
Neg logit contributions
 comparison-0.034
 outer-0.032
 fun-0.032
 common-0.032
 our-0.031
Token the
Token ID262
Feature activation-1.93e-02

Pos logit contributions
 comparison+0.093
 outer+0.079
 common+0.075
 thrilling+0.070
 ours+0.067
Neg logit contributions
bee-0.144
ie-0.142
owed-0.141
itty-0.138
ne-0.138
Token bug
Token ID5434
Feature activation-1.11e-01

Pos logit contributions
bee+0.030
itty+0.028
owed+0.028
oke+0.026
op+0.025
Neg logit contributions
 comparison-0.042
 outer-0.041
 fun-0.040
 common-0.040
 our-0.039
Token and
Token ID290
Feature activation+2.77e-02

Pos logit contributions
bee+0.294
itty+0.282
owed+0.279
oke+0.273
op+0.266
Neg logit contributions
 comparison-0.117
 outer-0.112
 fun-0.112
 common-0.109
 our-0.108
Token put
Token ID1234
Feature activation+6.54e-02

Pos logit contributions
 comparison+0.050
 outer+0.047
 common+0.047
 fun+0.046
 awesome+0.046
Neg logit contributions
bee-0.054
itty-0.051
owed-0.051
oke-0.047
op-0.047
Token it
Token ID340
Feature activation-2.47e-02

Pos logit contributions
 comparison+0.085
 common+0.077
 outer+0.074
 rewarding+0.069
 thrilling+0.065
Neg logit contributions
bee-0.155
itty-0.150
ie-0.147
owed-0.145
m-0.145
Token back
Token ID736
Feature activation+6.32e-03

Pos logit contributions
bee+0.052
owed+0.051
itty+0.050
m+0.047
ppy+0.046
Neg logit contributions
 comparison-0.040
 common-0.039
 outer-0.038
 trouble-0.037
 fun-0.036
Token on
Token ID319
Feature activation+4.30e-02

Pos logit contributions
 comparison+0.011
 outer+0.011
 common+0.010
 fun+0.010
 awesome+0.010
Neg logit contributions
bee-0.012
itty-0.012
owed-0.012
oke-0.011
op-0.011
Token the
Token ID262
Feature activation-9.98e-04

Pos logit contributions
 comparison+0.069
 awesome+0.065
 thrilling+0.064
 outer+0.063
 common+0.062
Neg logit contributions
bee-0.078
itty-0.073
bird-0.072
berry-0.072
m-0.071
Token ground
Token ID2323
Feature activation-4.04e-02

Pos logit contributions
bee+0.002
itty+0.002
owed+0.002
op+0.002
oke+0.002
Neg logit contributions
 comparison-0.002
 common-0.002
 outer-0.002
 fun-0.002
 our-0.002
Token.
Token ID13
Feature activation+1.35e-02

Pos logit contributions
bee+0.062
itty+0.058
owed+0.058
oke+0.053
op+0.053
Neg logit contributions
 comparison-0.087
 outer-0.085
 common-0.084
 fun-0.084
 our-0.082
Token He
Token ID679
Feature activation-5.99e-03

Pos logit contributions
 comparison+0.023
 common+0.022
 outer+0.022
 awesome+0.021
 fun+0.021
Neg logit contributions
bee-0.026
itty-0.025
owed-0.025
oke-0.025
op-0.024
Token her
Token ID607
Feature activation-1.58e-02

Pos logit contributions
 comparison+0.057
 outer+0.052
Half+0.051
 common+0.049
 late+0.049
Neg logit contributions
bee-0.110
itty-0.101
ie-0.100
owed-0.098
bug-0.097
Token mom
Token ID1995
Feature activation-7.25e-02

Pos logit contributions
bee+0.038
itty+0.036
owed+0.036
op+0.034
oke+0.034
Neg logit contributions
 comparison-0.021
 outer-0.019
 late-0.018
 common-0.018
 our-0.017
Token's
Token ID338
Feature activation-1.47e-03

Pos logit contributions
bee+0.127
itty+0.115
owed+0.114
unn+0.108
oke+0.108
Neg logit contributions
 fun-0.139
 comparison-0.138
 our-0.137
 outer-0.136
 common-0.134
Token help
Token ID1037
Feature activation+1.30e-02

Pos logit contributions
bee+0.003
itty+0.003
owed+0.003
op+0.003
oke+0.003
Neg logit contributions
 comparison-0.002
 outer-0.002
 common-0.002
 late-0.002
 our-0.002
Token,
Token ID11
Feature activation+3.56e-04

Pos logit contributions
 comparison+0.025
 fun+0.025
 outer+0.024
 common+0.024
 our+0.024
Neg logit contributions
bee-0.023
itty-0.022
owed-0.021
oke-0.020
unn-0.019
Token she
Token ID673
Feature activation-3.33e-02

Pos logit contributions
 comparison+0.001
 outer+0.001
 fun+0.000
 common+0.000
 our+0.000
Neg logit contributions
bee-0.001
itty-0.001
owed-0.001
oke-0.001
op-0.001
Token started
Token ID2067
Feature activation+5.83e-02

Pos logit contributions
bee+0.068
itty+0.066
owed+0.063
oke+0.061
unn+0.061
Neg logit contributions
 fun-0.053
 comparison-0.052
 our-0.051
 common-0.051
 outer-0.051
Token to
Token ID284
Feature activation+4.66e-02

Pos logit contributions
 comparison+0.081
 awesome+0.079
 common+0.079
 outer+0.078
 thrilling+0.074
Neg logit contributions
bee-0.133
owed-0.126
itty-0.124
op-0.123
m-0.120
Token reach
Token ID3151
Feature activation+5.17e-02

Pos logit contributions
 comparison+0.067
 outer+0.064
 common+0.062
 fun+0.062
 awesome+0.061
Neg logit contributions
bee-0.107
owed-0.100
itty-0.100
op-0.094
ird-0.093
Token up
Token ID510
Feature activation+3.26e-02

Pos logit contributions
 comparison+0.052
 common+0.045
 outer+0.043
 thrilling+0.043
 awesome+0.042
Neg logit contributions
bee-0.138
owed-0.131
itty-0.129
ie-0.127
m-0.125
Token and
Token ID290
Feature activation+3.09e-02

Pos logit contributions
 comparison+0.052
 outer+0.049
 awesome+0.049
 common+0.049
 backup+0.048
Neg logit contributions
bee-0.064
owed-0.059
itty-0.057
bees-0.057
m-0.057
Token\n
Token ID198
Feature activation+1.33e-03

Pos logit contributions
bee+0.043
owed+0.040
itty+0.038
op+0.037
ops+0.037
Neg logit contributions
 comparison-0.023
 fun-0.022
 an-0.022
 outer-0.022
 late-0.022
Token\n
Token ID198
Feature activation-7.54e-03

Pos logit contributions
 comparison+0.002
 fun+0.002
 outer+0.002
 common+0.002
 our+0.002
Neg logit contributions
bee-0.003
owed-0.003
itty-0.003
oke-0.003
unn-0.003
TokenThat
Token ID2504
Feature activation-5.28e-03

Pos logit contributions
bee+0.018
itty+0.017
owed+0.017
oke+0.016
op+0.016
Neg logit contributions
 comparison-0.010
 outer-0.010
 fun-0.010
 common-0.010
 our-0.009
Token was
Token ID373
Feature activation+6.09e-02

Pos logit contributions
bee+0.012
itty+0.011
owed+0.011
m+0.010
oke+0.010
Neg logit contributions
 comparison-0.008
 outer-0.008
 common-0.008
 awesome-0.007
 fun-0.007
Token the
Token ID262
Feature activation-1.93e-02

Pos logit contributions
 comparison+0.068
 outer+0.063
 backup+0.061
 common+0.059
 trouble+0.057
Neg logit contributions
bee-0.141
owed-0.140
ie-0.133
itty-0.131
bug-0.131
Token day
Token ID1110
Feature activation-1.10e-02

Pos logit contributions
bee+0.045
itty+0.043
owed+0.043
oke+0.041
op+0.040
Neg logit contributions
 comparison-0.027
 fun-0.026
 outer-0.026
 common-0.025
 awesome-0.025
Token the
Token ID262
Feature activation-3.32e-02

Pos logit contributions
bee+0.021
itty+0.020
owed+0.020
oke+0.019
op+0.019
Neg logit contributions
 comparison-0.020
 outer-0.019
 common-0.019
 fun-0.019
 our-0.018
Token weird
Token ID7650
Feature activation-5.33e-02

Pos logit contributions
bee+0.071
owed+0.068
itty+0.068
oke+0.064
op+0.064
Neg logit contributions
 comparison-0.052
 fun-0.051
 outer-0.050
 common-0.050
 awesome-0.049
Token witch
Token ID16365
Feature activation-1.30e-01

Pos logit contributions
bee+0.109
itty+0.107
owed+0.105
oke+0.100
op+0.099
Neg logit contributions
 comparison-0.087
 fun-0.085
 common-0.084
 outer-0.084
 sight-0.081
Token learned
Token ID4499
Feature activation+5.31e-02

Pos logit contributions
bee+0.240
itty+0.228
unn+0.225
owed+0.220
oke+0.216
Neg logit contributions
 fun-0.232
 an-0.219
 our-0.219
 outer-0.218
 comparison-0.218
Token that
Token ID326
Feature activation+1.68e-02

Pos logit contributions
 comparison+0.077
 outer+0.070
 common+0.069
 thrilling+0.066
 awesome+0.066
Neg logit contributions
bee-0.117
owed-0.114
itty-0.110
m-0.105
ie-0.105
Token better
Token ID1365
Feature activation-2.15e-03

Pos logit contributions
owed+0.086
bee+0.085
ops+0.077
itty+0.076
op+0.076
Neg logit contributions
 trouble-0.113
 fun-0.113
 our-0.113
 sight-0.112
 an-0.109
Token and
Token ID290
Feature activation+2.22e-02

Pos logit contributions
bee+0.004
itty+0.004
owed+0.004
oke+0.003
op+0.003
Neg logit contributions
 comparison-0.004
 outer-0.004
 fun-0.004
 common-0.004
 our-0.004
Token said
Token ID531
Feature activation+2.64e-02

Pos logit contributions
 comparison+0.035
 outer+0.033
 common+0.033
 fun+0.032
 awesome+0.031
Neg logit contributions
bee-0.047
owed-0.046
itty-0.045
oke-0.042
m-0.042
Token,
Token ID11
Feature activation-1.20e-02

Pos logit contributions
 comparison+0.051
 outer+0.049
 common+0.048
 fun+0.047
 awesome+0.046
Neg logit contributions
bee-0.048
owed-0.046
itty-0.045
oke-0.042
op-0.042
Token "
Token ID366
Feature activation-1.24e-02

Pos logit contributions
bee+0.027
itty+0.026
owed+0.026
oke+0.025
op+0.024
Neg logit contributions
 comparison-0.017
 outer-0.016
 fun-0.016
 common-0.016
 our-0.016
TokenOkay
Token ID16454
Feature activation-7.47e-03

Pos logit contributions
bee+0.030
owed+0.029
itty+0.028
oke+0.028
unn+0.027
Neg logit contributions
 comparison-0.016
 common-0.015
 outer-0.015
 fun-0.015
 our-0.015
Token,
Token ID11
Feature activation-1.31e-02

Pos logit contributions
bee+0.014
itty+0.013
owed+0.013
unn+0.012
oke+0.012
Neg logit contributions
 fun-0.014
 comparison-0.014
 our-0.014
 common-0.014
 awesome-0.014
Token Tom
Token ID4186
Feature activation-9.42e-02

Pos logit contributions
bee+0.029
itty+0.028
owed+0.027
oke+0.026
op+0.026
Neg logit contributions
 comparison-0.020
 outer-0.019
 fun-0.019
 common-0.019
 our-0.018
Token.
Token ID13
Feature activation-1.07e-02

Pos logit contributions
bee+0.196
itty+0.183
owed+0.183
unn+0.178
op+0.175
Neg logit contributions
 our-0.158
 fun-0.156
 your-0.150
 an-0.147
 trouble-0.146
Token I
Token ID314
Feature activation-5.67e-03

Pos logit contributions
bee+0.026
owed+0.024
itty+0.024
ie+0.023
oke+0.023
Neg logit contributions
 our-0.014
 comparison-0.014
 fun-0.013
 outer-0.013
 common-0.013
Token forgive
Token ID20927
Feature activation+2.46e-02

Pos logit contributions
bee+0.013
itty+0.012
owed+0.012
unn+0.012
oke+0.012
Neg logit contributions
 fun-0.008
 common-0.008
 comparison-0.008
 great-0.008
 trouble-0.007
Token the
Token ID262
Feature activation-3.41e-03

Pos logit contributions
 rewarding+0.063
 thrilling+0.059
 awesome+0.059
 gonna+0.058
 outer+0.058
Neg logit contributions
bee-0.094
bird-0.090
itty-0.090
y-0.089
m-0.088
Token air
Token ID1633
Feature activation-4.27e-02

Pos logit contributions
itty+0.007
bee+0.007
m+0.006
oke+0.006
op+0.006
Neg logit contributions
 comparison-0.006
 outer-0.006
 common-0.006
 thrilling-0.005
 awesome-0.005
Token and
Token ID290
Feature activation+2.09e-02

Pos logit contributions
bee+0.060
itty+0.054
owed+0.054
oke+0.050
m+0.049
Neg logit contributions
 comparison-0.099
 outer-0.098
 common-0.095
 awesome-0.094
 fun-0.094
Token catches
Token ID17591
Feature activation+4.56e-02

Pos logit contributions
 comparison+0.036
 outer+0.035
 thrilling+0.034
 common+0.033
 awesome+0.032
Neg logit contributions
bee-0.041
itty-0.036
owed-0.036
oke-0.035
ops-0.035
Token it
Token ID340
Feature activation-4.96e-02

Pos logit contributions
 comparison+0.052
 common+0.050
 outer+0.048
 thrilling+0.046
 rewarding+0.043
Neg logit contributions
bee-0.116
itty-0.108
ie-0.107
es-0.106
owed-0.105
Token.
Token ID13
Feature activation+9.74e-03

Pos logit contributions
bee+0.086
itty+0.081
owed+0.081
op+0.075
oke+0.074
Neg logit contributions
 comparison-0.097
 outer-0.095
 common-0.094
 fun-0.091
 thrilling-0.090
Token He
Token ID679
Feature activation-1.16e-02

Pos logit contributions
 comparison+0.021
 outer+0.020
 thrilling+0.019
rous+0.019
 awesome+0.019
Neg logit contributions
oke-0.014
itty-0.014
bee-0.014
op-0.014
owed-0.012
Token thinks
Token ID6834
Feature activation+4.07e-02

Pos logit contributions
bee+0.023
itty+0.021
owed+0.021
oke+0.021
anda+0.021
Neg logit contributions
 outer-0.020
 comparison-0.020
 thrilling-0.019
 sight-0.018
 common-0.018
Token he
Token ID339
Feature activation-2.75e-02

Pos logit contributions
 comparison+0.045
 common+0.043
 outer+0.043
 sight+0.041
 faith+0.039
Neg logit contributions
bee-0.106
oke-0.101
owed-0.100
itty-0.098
op-0.098
Token is
Token ID318
Feature activation+5.01e-02

Pos logit contributions
bee+0.059
itty+0.056
owed+0.056
oke+0.053
unn+0.052
Neg logit contributions
 fun-0.042
 comparison-0.042
 common-0.041
 our-0.040
 trouble-0.040
Token very
Token ID845
Feature activation+4.88e-02

Pos logit contributions
 outer+0.072
 comparison+0.070
 common+0.064
 sight+0.062
 thrilling+0.062
Neg logit contributions
bee-0.111
ppy-0.100
bug-0.098
m-0.096
itty-0.094
Token forgot
Token ID16453
Feature activation+2.64e-02

Pos logit contributions
bee+0.064
itty+0.061
owed+0.059
oke+0.056
op+0.056
Neg logit contributions
 comparison-0.064
 fun-0.064
 outer-0.062
 common-0.062
 trouble-0.061
Token all
Token ID477
Feature activation+5.12e-03

Pos logit contributions
 comparison+0.034
 common+0.033
 outer+0.033
 thrilling+0.030
 awesome+0.030
Neg logit contributions
bee-0.063
owed-0.060
itty-0.057
oke-0.057
m-0.056
Token about
Token ID546
Feature activation+5.47e-02

Pos logit contributions
 comparison+0.008
 outer+0.008
 common+0.007
 awesome+0.007
 our+0.007
Neg logit contributions
bee-0.011
owed-0.010
itty-0.010
oke-0.010
m-0.010
Token his
Token ID465
Feature activation-9.15e-03

Pos logit contributions
 comparison+0.052
 common+0.049
 outer+0.049
 thrilling+0.047
 awesome+0.046
Neg logit contributions
bee-0.145
owed-0.130
ie-0.129
bees-0.129
itty-0.129
Token mud
Token ID17492
Feature activation-8.56e-02

Pos logit contributions
bee+0.023
itty+0.021
owed+0.021
op+0.020
anda+0.020
Neg logit contributions
 comparison-0.011
 outer-0.011
 common-0.010
 late-0.010
 our-0.010
Token p
Token ID279
Feature activation-6.81e-02

Pos logit contributions
bee+0.153
itty+0.148
owed+0.143
oke+0.138
unn+0.133
Neg logit contributions
 fun-0.160
 comparison-0.159
 our-0.155
 trouble-0.152
 common-0.152
Tokenuddle
Token ID24500
Feature activation-3.09e-02

Pos logit contributions
bee+0.089
itty+0.081
owed+0.079
oke+0.072
op+0.070
Neg logit contributions
 comparison-0.163
 fun-0.160
 outer-0.159
 common-0.158
 our-0.156
Token.
Token ID13
Feature activation-2.21e-02

Pos logit contributions
bee+0.053
itty+0.050
owed+0.050
oke+0.046
op+0.046
Neg logit contributions
 comparison-0.062
 outer-0.059
 common-0.059
 fun-0.058
 awesome-0.057
Token He
Token ID679
Feature activation-2.24e-02

Pos logit contributions
bee+0.054
owed+0.051
itty+0.050
op+0.048
unn+0.048
Neg logit contributions
 comparison-0.028
 fun-0.027
 outer-0.027
 our-0.026
 common-0.026
Token was
Token ID373
Feature activation+5.29e-02

Pos logit contributions
bee+0.042
itty+0.040
owed+0.040
oke+0.037
op+0.037
Neg logit contributions
 comparison-0.041
 outer-0.039
 fun-0.039
 common-0.039
 our-0.038
Token so
Token ID523
Feature activation+4.37e-02

Pos logit contributions
 common+0.099
 outer+0.095
 comparison+0.091
 avail+0.083
 sight+0.082
Neg logit contributions
bee-0.096
m-0.093
itty-0.091
bug-0.091
owed-0.089
Token,
Token ID11
Feature activation+1.44e-02

Pos logit contributions
 comparison+0.049
 outer+0.047
 fun+0.047
 common+0.047
 our+0.047
Neg logit contributions
bee-0.041
itty-0.039
owed-0.038
oke-0.036
op-0.035
Token she
Token ID673
Feature activation-1.87e-02

Pos logit contributions
 comparison+0.012
 outer+0.011
 common+0.010
 fun+0.010
 awesome+0.010
Neg logit contributions
bee-0.042
itty-0.040
owed-0.040
op-0.038
m-0.038
Token saw
Token ID2497
Feature activation+6.24e-02

Pos logit contributions
bee+0.041
itty+0.040
owed+0.039
oke+0.038
unn+0.038
Neg logit contributions
 comparison-0.027
 fun-0.027
 outer-0.026
 an-0.026
 common-0.026
Token a
Token ID257
Feature activation+1.15e-02

Pos logit contributions
 comparison+0.051
 common+0.048
 outer+0.043
 thrilling+0.041
 awesome+0.041
Neg logit contributions
bee-0.180
itty-0.175
owed-0.174
ie-0.164
m-0.164
Token little
Token ID1310
Feature activation-1.87e-02

Pos logit contributions
 comparison+0.015
 outer+0.015
 common+0.014
 fun+0.014
 our+0.014
Neg logit contributions
bee-0.028
itty-0.026
owed-0.026
oke-0.025
op-0.025
Token fish
Token ID5916
Feature activation-1.07e-01

Pos logit contributions
bee+0.029
owed+0.028
itty+0.026
ops+0.025
oke+0.025
Neg logit contributions
 comparison-0.040
 fun-0.040
 trouble-0.039
 outer-0.039
 our-0.039
Token swimming
Token ID14899
Feature activation+1.50e-02

Pos logit contributions
bee+0.266
owed+0.258
itty+0.253
unn+0.248
oke+0.246
Neg logit contributions
 comparison-0.129
 outer-0.121
 our-0.119
 fun-0.119
 an-0.115
Token in
Token ID287
Feature activation+4.88e-02

Pos logit contributions
 comparison+0.027
 outer+0.026
 common+0.026
 fun+0.025
 awesome+0.025
Neg logit contributions
bee-0.029
itty-0.027
owed-0.027
m-0.025
oke-0.025
Token the
Token ID262
Feature activation-1.42e-02

Pos logit contributions
 awesome+0.059
 fantastic+0.056
 backup+0.056
 thrilling+0.054
Hours+0.054
Neg logit contributions
bird-0.109
bee-0.108
itty-0.108
m-0.105
flower-0.104
Token pond
Token ID16723
Feature activation-6.13e-02

Pos logit contributions
bee+0.028
itty+0.028
m+0.027
flower+0.025
op+0.025
Neg logit contributions
 comparison-0.024
 outer-0.023
 common-0.023
 awesome-0.021
 thrilling-0.021
Token.
Token ID13
Feature activation+3.06e-03

Pos logit contributions
bee+0.093
itty+0.087
owed+0.086
oke+0.079
op+0.078
Neg logit contributions
 comparison-0.135
 outer-0.131
 common-0.129
 fun-0.129
 awesome-0.127
Token sandbox
Token ID35204
Feature activation-5.56e-02

Pos logit contributions
bee+0.039
owed+0.038
itty+0.038
oke+0.035
op+0.035
Neg logit contributions
 comparison-0.036
 fun-0.036
 outer-0.034
 common-0.034
 trouble-0.034
Token.
Token ID13
Feature activation+8.73e-03

Pos logit contributions
bee+0.094
itty+0.090
owed+0.085
oke+0.082
op+0.081
Neg logit contributions
 comparison-0.112
 fun-0.111
 outer-0.108
 common-0.107
 our-0.107
Token They
Token ID1119
Feature activation-1.13e-02

Pos logit contributions
 comparison+0.017
 outer+0.016
 thrilling+0.016
 awesome+0.016
rous+0.015
Neg logit contributions
itty-0.014
bee-0.014
owed-0.014
oke-0.013
ird-0.013
Token thank
Token ID5875
Feature activation+4.30e-02

Pos logit contributions
bee+0.026
itty+0.025
owed+0.025
oke+0.024
op+0.023
Neg logit contributions
 comparison-0.016
 fun-0.015
 outer-0.015
 common-0.015
 our-0.015
Token the
Token ID262
Feature activation-4.13e-02

Pos logit contributions
 comparison+0.040
 outer+0.037
 fun+0.036
 common+0.036
 our+0.035
Neg logit contributions
bee-0.121
itty-0.117
owed-0.116
oke-0.111
op-0.110
Token old
Token ID1468
Feature activation-9.43e-02

Pos logit contributions
bee+0.100
itty+0.096
owed+0.096
oke+0.091
op+0.090
Neg logit contributions
 comparison-0.054
 fun-0.051
 outer-0.051
 common-0.051
 our-0.049
Token lady
Token ID10846
Feature activation-1.28e-01

Pos logit contributions
bee+0.170
owed+0.168
itty+0.163
ppy+0.160
op+0.155
Neg logit contributions
 fun-0.177
 comparison-0.175
 common-0.173
 sight-0.173
 trouble-0.170
Token and
Token ID290
Feature activation+7.75e-03

Pos logit contributions
bee+0.288
ird+0.264
ops+0.263
itive+0.259
unn+0.258
Neg logit contributions
 their-0.214
 our-0.191
 the-0.187
 an-0.185
 his-0.184
Token Max
Token ID5436
Feature activation-8.48e-02

Pos logit contributions
 comparison+0.008
 outer+0.007
 common+0.007
 awesome+0.007
 thrilling+0.007
Neg logit contributions
bee-0.021
itty-0.020
owed-0.020
oke-0.019
ird-0.019
Token for
Token ID329
Feature activation+4.23e-02

Pos logit contributions
bee+0.195
itty+0.188
owed+0.182
oke+0.176
op+0.176
Neg logit contributions
 fun-0.120
 our-0.117
 their-0.114
 an-0.113
 comparison-0.113
Token the
Token ID262
Feature activation-9.39e-03

Pos logit contributions
 comparison+0.058
 thrilling+0.056
 common+0.054
 outer+0.054
 complicated+0.052
Neg logit contributions
bee-0.094
owed-0.083
itty-0.082
m-0.082
bees-0.079
Token."
Token ID526
Feature activation-1.01e-02

Pos logit contributions
 comparison+0.078
 outer+0.075
 common+0.073
 fun+0.073
 awesome+0.072
Neg logit contributions
bee-0.087
itty-0.082
owed-0.081
oke-0.077
op-0.076
Token They
Token ID1119
Feature activation-2.40e-03

Pos logit contributions
bee+0.025
owed+0.024
itty+0.024
op+0.022
oke+0.022
Neg logit contributions
 fun-0.013
 comparison-0.013
 common-0.012
 our-0.012
 trouble-0.012
Token play
Token ID711
Feature activation+4.49e-02

Pos logit contributions
bee+0.006
itty+0.005
owed+0.005
oke+0.005
op+0.005
Neg logit contributions
 comparison-0.003
 fun-0.003
 outer-0.003
 common-0.003
 our-0.003
Token with
Token ID351
Feature activation+2.20e-02

Pos logit contributions
 comparison+0.081
 outer+0.077
 common+0.073
 thrilling+0.072
 awesome+0.070
Neg logit contributions
bee-0.088
owed-0.083
itty-0.083
oke-0.079
m-0.078
Token the
Token ID262
Feature activation-2.87e-02

Pos logit contributions
 comparison+0.018
 outer+0.016
 thrilling+0.015
 common+0.014
 awesome+0.014
Neg logit contributions
bee-0.063
itty-0.062
owed-0.060
ppy-0.058
ie-0.058
Token truck
Token ID7779
Feature activation-8.41e-02

Pos logit contributions
bee+0.058
owed+0.056
itty+0.055
oke+0.051
op+0.051
Neg logit contributions
 comparison-0.049
 fun-0.048
 outer-0.047
 common-0.047
 our-0.046
Token,
Token ID11
Feature activation+2.29e-02

Pos logit contributions
bee+0.180
itty+0.169
owed+0.165
unn+0.160
oke+0.157
Neg logit contributions
 fun-0.135
 comparison-0.131
 our-0.126
 their-0.125
 trouble-0.124
Token the
Token ID262
Feature activation-2.86e-02

Pos logit contributions
 comparison+0.036
 outer+0.035
 common+0.034
 fun+0.034
 awesome+0.033
Neg logit contributions
bee-0.049
itty-0.047
owed-0.046
oke-0.044
m-0.043
Token ball
Token ID2613
Feature activation-7.96e-02

Pos logit contributions
bee+0.054
itty+0.052
owed+0.051
oke+0.048
op+0.047
Neg logit contributions
 comparison-0.052
 outer-0.050
 common-0.050
 fun-0.049
 our-0.048
Token and
Token ID290
Feature activation+2.79e-02

Pos logit contributions
bee+0.142
itty+0.134
owed+0.132
oke+0.123
unn+0.122
Neg logit contributions
 fun-0.153
 comparison-0.152
 our-0.147
 outer-0.144
 common-0.144
Token the
Token ID262
Feature activation-1.87e-02

Pos logit contributions
 comparison+0.023
 outer+0.021
 common+0.020
 thrilling+0.019
 awesome+0.018
Neg logit contributions
bee-0.081
itty-0.077
owed-0.076
oke-0.074
m-0.073
Token,
Token ID11
Feature activation+1.32e-02

Pos logit contributions
 our+0.020
 fun+0.020
 comparison+0.020
 common+0.020
 their+0.020
Neg logit contributions
bee-0.022
owed-0.020
itty-0.020
oke-0.019
affe-0.019
Token she
Token ID673
Feature activation-2.03e-02

Pos logit contributions
 comparison+0.026
 outer+0.025
 thrilling+0.024
 awesome+0.024
 trouble+0.024
Neg logit contributions
bee-0.024
itty-0.022
owed-0.021
m-0.021
ie-0.020
Token saw
Token ID2497
Feature activation+5.58e-02

Pos logit contributions
bee+0.041
itty+0.039
owed+0.039
oke+0.037
op+0.036
Neg logit contributions
 comparison-0.034
 fun-0.034
 common-0.033
 our-0.033
 trouble-0.032
Token her
Token ID607
Feature activation-1.66e-02

Pos logit contributions
 comparison+0.060
 common+0.059
 outer+0.053
 thrilling+0.052
 awesome+0.048
Neg logit contributions
bee-0.148
itty-0.147
m-0.140
owed-0.138
ie-0.137
Token friend
Token ID1545
Feature activation-4.68e-02

Pos logit contributions
bee+0.037
owed+0.036
itty+0.035
oke+0.034
unn+0.033
Neg logit contributions
 fun-0.025
 sight-0.024
 common-0.024
 comparison-0.024
 trouble-0.024
Token Tim
Token ID5045
Feature activation-9.35e-02

Pos logit contributions
bee+0.123
owed+0.116
unn+0.115
ops+0.114
op+0.111
Neg logit contributions
 their-0.053
 an-0.052
 our-0.050
 fun-0.049
 his-0.048
Tokenmy
Token ID1820
Feature activation-5.91e-02

Pos logit contributions
bee+0.131
owed+0.124
unn+0.123
affe+0.121
ops+0.116
Neg logit contributions
 their-0.211
 an-0.208
 our-0.206
 his-0.206
 comparison-0.206
Token with
Token ID351
Feature activation+1.90e-02

Pos logit contributions
bee+0.151
owed+0.147
unn+0.142
affe+0.139
itty+0.138
Neg logit contributions
 their-0.066
 comparison-0.064
 fun-0.063
 an-0.063
 our-0.061
Token a
Token ID257
Feature activation+1.78e-02

Pos logit contributions
 common+0.024
 comparison+0.023
 thrilling+0.023
 destination+0.022
 accomplishment+0.022
Neg logit contributions
bee-0.041
itty-0.039
bird-0.038
m-0.037
bug-0.036
Token big
Token ID1263
Feature activation+1.41e-02

Pos logit contributions
 comparison+0.018
 outer+0.017
 common+0.015
 our+0.015
 awesome+0.015
Neg logit contributions
bee-0.048
itty-0.047
owed-0.044
op-0.044
m-0.044
Token bag
Token ID6131
Feature activation-6.69e-02

Pos logit contributions
 comparison+0.022
 fun+0.021
 outer+0.021
 common+0.021
 our+0.021
Neg logit contributions
bee-0.030
itty-0.029
owed-0.028
oke-0.027
op-0.026
Token giant
Token ID6175
Feature activation-1.09e-01

Pos logit contributions
bee+0.080
itty+0.076
owed+0.075
oke+0.070
op+0.070
Neg logit contributions
 comparison-0.071
 outer-0.069
 fun-0.068
 common-0.067
 our-0.066
Token's
Token ID338
Feature activation-2.00e-02

Pos logit contributions
bee+0.212
owed+0.206
itty+0.194
unn+0.185
ird+0.182
Neg logit contributions
 outer-0.192
 fun-0.191
 our-0.186
 trouble-0.184
 comparison-0.184
Token cave
Token ID11527
Feature activation-7.85e-02

Pos logit contributions
bee+0.044
itty+0.042
owed+0.041
oke+0.039
op+0.038
Neg logit contributions
 comparison-0.031
 outer-0.030
 fun-0.030
 common-0.029
 trouble-0.029
Token.
Token ID13
Feature activation-3.83e-03

Pos logit contributions
bee+0.130
itty+0.123
owed+0.119
oke+0.111
op+0.110
Neg logit contributions
 comparison-0.162
 outer-0.157
 fun-0.157
 common-0.155
 our-0.153
Token The
Token ID383
Feature activation-3.27e-02

Pos logit contributions
bee+0.010
itty+0.010
owed+0.010
oke+0.009
unn+0.009
Neg logit contributions
 comparison-0.004
 fun-0.004
 outer-0.004
 common-0.004
 our-0.004
Token mouse
Token ID10211
Feature activation-1.12e-01

Pos logit contributions
bee+0.054
owed+0.053
itty+0.050
unn+0.047
ird+0.046
Neg logit contributions
 comparison-0.069
 fun-0.069
 trouble-0.068
 sight-0.067
 outer-0.067
Token shouted
Token ID17293
Feature activation+3.28e-02

Pos logit contributions
itty+0.237
bee+0.235
owers+0.229
ird+0.228
unn+0.226
Neg logit contributions
 an-0.161
 fun-0.157
 his-0.154
 great-0.148
 its-0.147
Token brave
Token ID14802
Feature activation-1.47e-02

Pos logit contributions
 comparison+0.060
 outer+0.058
 common+0.058
 fun+0.056
 awesome+0.056
Neg logit contributions
bee-0.063
owed-0.059
itty-0.058
op-0.056
oke-0.055
Tokenly
Token ID306
Feature activation-2.71e-02

Pos logit contributions
bee+0.018
itty+0.018
owed+0.017
ird+0.017
op+0.016
Neg logit contributions
 fun-0.034
 comparison-0.034
 our-0.034
 trouble-0.034
 sight-0.034
Token,
Token ID11
Feature activation+5.13e-03

Pos logit contributions
bee+0.051
itty+0.050
owed+0.048
ird+0.046
oke+0.046
Neg logit contributions
 comparison-0.046
 fun-0.045
 our-0.045
 an-0.045
 outer-0.044
Token "
Token ID366
Feature activation-1.26e-03

Pos logit contributions
 comparison+0.009
 outer+0.009
 common+0.009
 fun+0.008
 awesome+0.008
Neg logit contributions
bee-0.010
itty-0.010
owed-0.010
oke-0.009
m-0.009
Token decided
Token ID3066
Feature activation+5.13e-02

Pos logit contributions
 comparison+0.058
 common+0.055
 outer+0.055
 awesome+0.054
 fun+0.053
Neg logit contributions
bee-0.057
owed-0.054
itty-0.054
op-0.050
ird-0.049
Token she
Token ID673
Feature activation-1.78e-02

Pos logit contributions
 comparison+0.093
 outer+0.088
 common+0.087
 awesome+0.081
 harmony+0.080
Neg logit contributions
owed-0.091
bee-0.090
m-0.084
itty-0.083
ne-0.082
Token would
Token ID561
Feature activation+6.04e-02

Pos logit contributions
bee+0.033
itty+0.031
owed+0.030
oke+0.029
unn+0.028
Neg logit contributions
 fun-0.032
 comparison-0.032
 outer-0.031
 common-0.031
 an-0.031
Token paint
Token ID7521
Feature activation+3.00e-02

Pos logit contributions
 comparison+0.084
 outer+0.080
 common+0.078
 trouble+0.077
 fun+0.076
Neg logit contributions
bee-0.139
owed-0.133
itty-0.132
op-0.127
ird-0.125
Token the
Token ID262
Feature activation-3.65e-02

Pos logit contributions
 comparison+0.035
 common+0.032
 backup+0.028
 thrilling+0.028
 outer+0.027
Neg logit contributions
bee-0.075
owed-0.072
itty-0.071
ie-0.070
bird-0.069
Token lizard
Token ID42406
Feature activation-1.22e-01

Pos logit contributions
bee+0.058
owed+0.056
itty+0.055
oke+0.051
op+0.050
Neg logit contributions
 comparison-0.077
 fun-0.076
 outer-0.075
 common-0.074
 awesome-0.074
Token.
Token ID13
Feature activation-1.25e-03

Pos logit contributions
bee+0.255
itty+0.247
unn+0.236
oke+0.234
owed+0.234
Neg logit contributions
 fun-0.197
 outer-0.193
 comparison-0.193
 our-0.189
 awesome-0.185
Token\n
Token ID198
Feature activation+1.42e-02

Pos logit contributions
bee+0.003
itty+0.002
owed+0.002
oke+0.002
op+0.002
Neg logit contributions
 comparison-0.002
 outer-0.002
 fun-0.002
 common-0.002
 our-0.002
Token\n
Token ID198
Feature activation+4.08e-03

Pos logit contributions
 fun+0.020
 common+0.019
 comparison+0.019
 outer+0.019
 our+0.018
Neg logit contributions
bee-0.033
owed-0.031
itty-0.031
flower-0.030
oke-0.029
TokenShe
Token ID3347
Feature activation-1.73e-02

Pos logit contributions
 comparison+0.008
 outer+0.008
 common+0.008
 trouble+0.008
 our+0.008
Neg logit contributions
bee-0.007
itty-0.006
owed-0.006
op-0.006
m-0.006
Token started
Token ID2067
Feature activation+6.89e-02

Pos logit contributions
bee+0.040
itty+0.039
oke+0.038
unn+0.038
owed+0.037
Neg logit contributions
 fun-0.024
 an-0.022
 great-0.021
 their-0.021
 our-0.020
Token
Token ID220
Feature activation+8.96e-04

Pos logit contributions
bee+0.023
owed+0.021
itty+0.021
oke+0.020
unn+0.020
Neg logit contributions
 comparison-0.013
 outer-0.013
 fun-0.013
 common-0.012
 our-0.012
Token\n
Token ID198
Feature activation-2.95e-03

Pos logit contributions
 fun+0.001
 common+0.001
 comparison+0.001
 our+0.001
 an+0.001
Neg logit contributions
bee-0.002
itty-0.002
owed-0.002
unn-0.002
flower-0.002
Token\n
Token ID198
Feature activation-9.14e-03

Pos logit contributions
bee+0.006
owed+0.006
itty+0.006
oke+0.006
unn+0.006
Neg logit contributions
 comparison-0.004
 fun-0.004
 outer-0.004
 common-0.004
 our-0.004
TokenThe
Token ID464
Feature activation-3.10e-02

Pos logit contributions
bee+0.020
itty+0.019
owed+0.019
oke+0.018
op+0.018
Neg logit contributions
 comparison-0.014
 outer-0.013
 fun-0.013
 common-0.013
 our-0.013
Token bright
Token ID6016
Feature activation-4.14e-02

Pos logit contributions
bee+0.063
owed+0.062
itty+0.061
oke+0.057
op+0.057
Neg logit contributions
 comparison-0.051
 fun-0.050
 outer-0.049
 common-0.048
 awesome-0.048
Token rag
Token ID34232
Feature activation-1.30e-01

Pos logit contributions
bee+0.089
owed+0.087
itty+0.086
ppy+0.083
oke+0.082
Neg logit contributions
 comparison-0.063
 fun-0.061
 sight-0.060
 outer-0.060
 common-0.059
Token made
Token ID925
Feature activation+4.37e-02

Pos logit contributions
bee+0.245
itty+0.238
unn+0.237
oke+0.235
owed+0.230
Neg logit contributions
 fun-0.220
 an-0.211
 comparison-0.209
 our-0.204
 thrilling-0.203
Token many
Token ID867
Feature activation+8.59e-03

Pos logit contributions
 comparison+0.046
 common+0.043
 outer+0.043
 thrilling+0.035
rous+0.035
Neg logit contributions
bee-0.113
owed-0.110
ie-0.107
itty-0.106
m-0.103
Token circles
Token ID13332
Feature activation-4.61e-02

Pos logit contributions
 comparison+0.009
 fun+0.008
 outer+0.008
 common+0.008
 trouble+0.008
Neg logit contributions
bee-0.023
itty-0.023
owed-0.023
oke-0.022
op-0.021
Token in
Token ID287
Feature activation+3.35e-02

Pos logit contributions
bee+0.078
itty+0.073
owed+0.072
oke+0.067
op+0.066
Neg logit contributions
 comparison-0.093
 outer-0.090
 common-0.089
 fun-0.089
 our-0.088
Token the
Token ID262
Feature activation-1.32e-02

Pos logit contributions
 unnecessary+0.039
 awesome+0.038
 gonna+0.037
 rewarding+0.037
 backup+0.037
Neg logit contributions
bird-0.071
bee-0.070
y-0.067
itty-0.067
berry-0.067
Token the
Token ID262
Feature activation-3.24e-02

Pos logit contributions
 comparison+0.035
 outer+0.032
 awesome+0.032
 thrilling+0.032
rous+0.032
Neg logit contributions
bee-0.040
ops-0.040
ie-0.039
m-0.039
op-0.036
Token morning
Token ID3329
Feature activation-7.77e-03

Pos logit contributions
bee+0.059
itty+0.052
owed+0.050
m+0.050
oke+0.049
Neg logit contributions
 comparison-0.063
 common-0.060
 outer-0.060
 our-0.057
 fun-0.057
Token came
Token ID1625
Feature activation+4.70e-02

Pos logit contributions
bee+0.014
itty+0.013
owed+0.013
oke+0.012
op+0.012
Neg logit contributions
 comparison-0.015
 outer-0.015
 common-0.014
 fun-0.014
 awesome-0.014
Token,
Token ID11
Feature activation+2.08e-02

Pos logit contributions
 comparison+0.108
 common+0.105
 outer+0.104
 fun+0.104
 awesome+0.104
Neg logit contributions
bee-0.061
owed-0.059
itty-0.058
m-0.055
flower-0.054
Token the
Token ID262
Feature activation-3.64e-02

Pos logit contributions
 comparison+0.023
 common+0.021
 outer+0.021
 awesome+0.020
 thrilling+0.020
Neg logit contributions
bee-0.052
owed-0.049
itty-0.049
m-0.048
op-0.048
Token sailor
Token ID43272
Feature activation-1.17e-01

Pos logit contributions
bee+0.056
itty+0.047
m+0.046
flower+0.045
oke+0.045
Neg logit contributions
 comparison-0.079
 common-0.077
 outer-0.075
 awesome-0.073
 thrilling-0.073
Token would
Token ID561
Feature activation+6.02e-02

Pos logit contributions
bee+0.226
itty+0.221
owed+0.208
unn+0.204
op+0.200
Neg logit contributions
 fun-0.192
 our-0.191
 comparison-0.189
 outer-0.188
 common-0.186
Token return
Token ID1441
Feature activation+4.63e-02

Pos logit contributions
 comparison+0.090
 trouble+0.087
 outer+0.086
 common+0.086
 fun+0.083
Neg logit contributions
bee-0.132
owed-0.124
itty-0.119
m-0.114
op-0.114
Token to
Token ID284
Feature activation+3.81e-02

Pos logit contributions
 comparison+0.063
 outer+0.055
 awesome+0.054
 common+0.054
 thrilling+0.052
Neg logit contributions
owed-0.105
itty-0.104
m-0.102
bee-0.101
ie-0.101
Token the
Token ID262
Feature activation-2.25e-02

Pos logit contributions
 comparison+0.043
 thrilling+0.035
 common+0.034
rous+0.034
 complicated+0.033
Neg logit contributions
bee-0.087
ird-0.083
ie-0.083
owed-0.082
itty-0.079
Token harbor
Token ID25451
Feature activation-6.80e-02

Pos logit contributions
bee+0.043
m+0.040
flower+0.039
ie+0.037
itty+0.037
Neg logit contributions
 comparison-0.037
 common-0.036
 outer-0.034
 awesome-0.032
 trouble-0.032
Token was
Token ID373
Feature activation+4.41e-02

Pos logit contributions
bee+0.025
itty+0.023
owed+0.023
oke+0.022
op+0.021
Neg logit contributions
 comparison-0.030
 outer-0.030
 fun-0.029
 common-0.029
 our-0.029
Token a
Token ID257
Feature activation+2.56e-03

Pos logit contributions
 comparison+0.033
 common+0.029
 outer+0.029
 thrilling+0.025
 trouble+0.023
Neg logit contributions
bee-0.131
itty-0.127
owed-0.122
m-0.120
op-0.120
Token little
Token ID1310
Feature activation-3.88e-02

Pos logit contributions
 fun+0.003
 great+0.003
 our+0.003
 long+0.003
 far+0.003
Neg logit contributions
owed-0.007
bee-0.006
ops-0.006
unn-0.006
itty-0.006
Token boy
Token ID2933
Feature activation-1.24e-01

Pos logit contributions
owed+0.081
ops+0.067
ird+0.065
oke+0.064
bee+0.064
Neg logit contributions
 fun-0.077
 our-0.075
 long-0.072
 sight-0.072
 far-0.071
Token named
Token ID3706
Feature activation-7.95e-03

Pos logit contributions
bee+0.257
owed+0.250
unn+0.248
owers+0.246
oke+0.244
Neg logit contributions
 their-0.216
 his-0.207
 our-0.205
 your-0.204
 an-0.203
Token Tim
Token ID5045
Feature activation-1.19e-01

Pos logit contributions
bee+0.017
itty+0.017
owed+0.016
oke+0.016
op+0.016
Neg logit contributions
 comparison-0.012
 outer-0.012
 common-0.012
 trouble-0.011
 thrilling-0.011
Tokenmy
Token ID1820
Feature activation-8.26e-02

Pos logit contributions
unn+0.176
bee+0.174
owed+0.171
bees+0.170
oke+0.168
Neg logit contributions
 his-0.274
 their-0.266
 fun-0.263
 an-0.260
 our-0.258
Token.
Token ID13
Feature activation-2.91e-02

Pos logit contributions
owed+0.186
oke+0.184
unn+0.182
bee+0.180
owers+0.176
Neg logit contributions
 his-0.131
 their-0.130
 fun-0.124
 an-0.123
 our-0.119
Token Tim
Token ID5045
Feature activation-9.45e-02

Pos logit contributions
bee+0.075
owed+0.073
itty+0.071
oke+0.069
op+0.067
Neg logit contributions
 comparison-0.034
 fun-0.032
 outer-0.031
 our-0.031
 common-0.030
Tokenmy
Token ID1820
Feature activation-6.04e-02

Pos logit contributions
bee+0.124
owed+0.122
oke+0.118
unn+0.116
itty+0.115
Neg logit contributions
 fun-0.227
 our-0.218
 his-0.217
 an-0.217
 great-0.214
Token loved
Token ID6151
Feature activation+6.53e-02

Pos logit contributions
owed+0.141
bee+0.138
oke+0.135
itty+0.134
unn+0.133
Neg logit contributions
 fun-0.087
 their-0.082
 our-0.082
 an-0.081
 his-0.080