TOP ACTIVATIONS
MAX = 0.087

and wanted to stretch his eight legs . He crawled out
, six , seven , eight bees ." Tom counts .
, six , seven , eight , nine , ten !
If I slip on this banana peel , it might be
so weird . It has eight legs and a lot of
" Six , seven , eight , nine , ten flowers
. Five , six , seven , eight bees ." Tom
, six , seven , eight , nine , ten !"
, six , seven , eight , nine , ten !
, six , seven , eight , nine , ten .
, five , six , seven , eight , nine ,
told her that spiders have eight eyes , that they can
, six , seven , eight , nine , ten ,
ten flowers . Five , six , seven , eight bees
room " and " la la la ". Mom
, five , six , seven . There were seven ants
, five , six , seven , eight , nine ,
, six , seven , eight ," Mia says .
butterflies and lady bugs and cater pill ars ."
, four , six , seven , eight , nine ,

MIN ACTIVATIONS
MIN = -0.088

boy said " thank you " and Lily felt good for
he said " thank you " to his mom .
happy and he hugged his daddy . Daddy was happy too
He was ready to share all the different things he had
his puzzle and thanked Wh isk ers for helping him .
He said " thank you " to Tom and ran home
and said " thank you " to his mom .
thought it was the best vest ever ! He put it
boy called Jon ny . Jon ny had two parents who
and said " thank you " to her mom . She
the cardboard house and played all day long .
" thank you mom my " and ate her new orange
opened it - it was filled with spark ly things !
so happy and gave his dad a big hug .
happy , she hugged her daddy and the t eddy bear
happy and he gave his dad a big hug to say
home , and they played all day . From that day
was very happy and said " Thank you , Mom my
and said " thank you " to her mom my .
was a little boy called Jon ny . Jon ny had

INTERVAL 0.043 to 0.087
CONTAINS 0.124%

. They see cows , pigs , chickens , and corn
. I don 't like hairy things in my hair ."
. But then they hear a loud noise .
, five , six , seven , eight , nine ,
or frogs or snakes . Some of them might be friendly

INTERVAL -0.001 to 0.043
CONTAINS 52.071%

go to the park sooner ." Lily thought about it and
made her laugh . It was a big bulb with funny
meal on the yacht . He used only the finest ingredients
fun along the way . Then she heard a noise !
don 't you count the flowers ? They will make you

INTERVAL -0.044 to -0.001
CONTAINS 47.295%

their picnic and went home . They had a nice day
t ines and discovered a shiny coin ! He
for her . She said she would get a sticker and
, " You can be very intelligent too if you use
L ily and Ben agreed . They helped Mom put the

INTERVAL -0.088 to -0.044
CONTAINS 0.510%

scary dog . <|endoftext|> Once upon a time , there was
brown man . <|endoftext|> Once upon a time , there was
found a big box and filled it with some sand .
my got a present . It was a deep , red
and search for treasure . It was the best day ever
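
The four INTERVAL blocks above look like equal-width bins spanning MIN to MAX, with each CONTAINS value giving the share of activations in that bin. That reading is an inference from the listed numbers, not something the dump states. A minimal Python sketch to check it, using only values copied from the headers (all variable names are illustrative):

```python
# Values copied from the MAX / MIN / INTERVAL / CONTAINS headers above.
act_min, act_max = -0.088, 0.087
listed = [  # ((lower, upper), percent), highest interval first
    ((0.043, 0.087), 0.124),
    ((-0.001, 0.043), 52.071),
    ((-0.044, -0.001), 47.295),
    ((-0.088, -0.044), 0.510),
]

# Assumption: the bins are equal-width over [act_min, act_max].
n_bins = len(listed)
width = (act_max - act_min) / n_bins
edges = [act_min + i * width for i in range(n_bins + 1)]

# The listed shares should account for every activation.
total_pct = sum(pct for _, pct in listed)

# Largest difference between a listed bound and the matching equal-width edge.
max_gap = 0.0
for k, ((lo, hi), _) in enumerate(listed):
    exp_lo = edges[n_bins - 1 - k]
    exp_hi = edges[n_bins - k]
    max_gap = max(max_gap, abs(lo - exp_lo), abs(hi - exp_hi))

print(f"bin width: {width:.5f}")
print(f"percentages sum to: {total_pct:.3f}%")
print(f"largest listed-vs-computed edge gap: {max_gap:.4f}")
```

If the gap stays below the three-decimal display precision of the headers, the equal-width reading is consistent with the dump. The two middle bins hold over 99% of activations, so the top-activation examples come from a very thin tail.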

Token and
Token ID290
Feature activation+1.21e-02

Pos logit contributions
 mouths+0.014
 heights+0.014
 noses+0.014
 tongues+0.013
 appearance+0.013
Neg logit contributions
des-0.010
xy-0.010
 snug-0.010
 secure-0.010
lad-0.010
Token wanted
Token ID2227
Feature activation+6.05e-03

Pos logit contributions
 mouths+0.021
 heights+0.020
 noses+0.020
 tongues+0.020
els+0.019
Neg logit contributions
 snug-0.016
 secure-0.015
des-0.015
xy-0.015
lad-0.015
Token to
Token ID284
Feature activation+5.90e-04

Pos logit contributions
 mouths+0.012
els+0.011
 noses+0.011
 tongues+0.011
 heights+0.011
Neg logit contributions
des-0.007
 snug-0.007
 secure-0.006
 valuable-0.006
xy-0.006
Token stretch
Token ID7539
Feature activation+6.93e-03

Pos logit contributions
 mouths+0.001
 noses+0.001
 tongues+0.001
els+0.001
 heights+0.001
Neg logit contributions
 secure-0.001
des-0.001
 snug-0.001
 repaired-0.001
 safe-0.001
Token his
Token ID465
Feature activation+3.32e-02

Pos logit contributions
 mouths+0.013
 heights+0.013
 noses+0.013
 tongues+0.012
 appearance+0.012
Neg logit contributions
 snug-0.008
des-0.008
 secure-0.007
xy-0.007
lad-0.007
Token eight
Token ID3624
Feature activation+8.66e-02

Pos logit contributions
els+0.032
 mouths+0.032
 groups+0.032
 salon+0.030
 heights+0.030
Neg logit contributions
 snug-0.067
 secure-0.063
 valuable-0.063
 safe-0.063
sy-0.062
Token legs
Token ID7405
Feature activation+3.72e-02

Pos logit contributions
 salon+0.114
uders+0.110
ric+0.105
els+0.105
ican+0.103
Neg logit contributions
 snug-0.138
 hard-0.134
 valuable-0.133
 good-0.131
 tight-0.130
Token.
Token ID13
Feature activation+1.16e-02

Pos logit contributions
 mouths+0.075
 heights+0.071
 tongues+0.071
 noses+0.070
els+0.069
Neg logit contributions
 snug-0.040
 secure-0.039
sy-0.039
 safe-0.038
des-0.037
Token He
Token ID679
Feature activation+1.68e-02

Pos logit contributions
 mouths+0.024
 heights+0.023
 noses+0.023
 tongues+0.023
 appearance+0.023
Neg logit contributions
des-0.011
 snug-0.011
xy-0.011
 secure-0.011
lad-0.010
Token crawled
Token ID45668
Feature activation+2.87e-03

Pos logit contributions
 mouths+0.027
 heights+0.026
 noses+0.026
 tongues+0.025
 appearance+0.025
Neg logit contributions
des-0.023
xy-0.023
 snug-0.023
lad-0.023
 secure-0.022
Token out
Token ID503
Feature activation-1.15e-02

Pos logit contributions
 frogs+0.004
 hoops+0.004
 heights+0.003
 weeds+0.003
 beetles+0.003
Neg logit contributions
xy-0.005
than-0.005
lad-0.005
des-0.005
great-0.005
Token,
Token ID11
Feature activation+2.47e-02

Pos logit contributions
 elbows+0.121
 mouths+0.117
 appearance+0.115
 anth+0.114
 salon+0.112
Neg logit contributions
ve-0.071
lad-0.068
 snug-0.066
sy-0.066
 wonderful-0.065
Token six
Token ID2237
Feature activation+7.26e-02

Pos logit contributions
 mouths+0.040
 appearance+0.039
 tongues+0.039
 heights+0.038
 elbows+0.038
Neg logit contributions
Good-0.030
da-0.029
 valuable-0.029
 Good-0.029
 snug-0.028
Token,
Token ID11
Feature activation+2.79e-02

Pos logit contributions
 appearance+0.131
 salon+0.131
 mouths+0.130
 elbows+0.128
 anth+0.126
Neg logit contributions
ve-0.070
lad-0.068
 good-0.065
 snug-0.063
iest-0.063
Token seven
Token ID3598
Feature activation+7.68e-02

Pos logit contributions
 mouths+0.044
 appearance+0.043
 heights+0.042
 noses+0.042
 tongues+0.042
Neg logit contributions
Good-0.036
da-0.036
 snug-0.036
 valuable-0.035
lad-0.034
Token,
Token ID11
Feature activation+3.62e-02

Pos logit contributions
 elbows+0.150
 appearance+0.150
 conclusions+0.147
ric+0.146
 mouths+0.146
Neg logit contributions
 ever-0.060
 good-0.058
 hard-0.055
ve-0.052
 much-0.051
Token eight
Token ID3624
Feature activation+7.86e-02

Pos logit contributions
 mouths+0.049
 appearance+0.049
 elbows+0.046
 noses+0.046
 heights+0.046
Neg logit contributions
da-0.056
Good-0.056
ve-0.053
lad-0.053
Great-0.053
Token bees
Token ID17002
Feature activation+6.33e-02

Pos logit contributions
 appearance+0.149
 salon+0.147
 mouths+0.143
 elbows+0.143
 anth+0.143
Neg logit contributions
 good-0.062
ve-0.062
 hard-0.061
 ever-0.061
 snug-0.059
Token."
Token ID526
Feature activation+1.31e-02

Pos logit contributions
 mouths+0.134
 appearance+0.126
 noses+0.126
 elbows+0.126
 anth+0.125
Neg logit contributions
 ever-0.052
ve-0.052
xy-0.051
da-0.050
sy-0.047
Token Tom
Token ID4186
Feature activation+6.29e-03

Pos logit contributions
 appearance+0.027
els+0.026
 tongues+0.026
 mouths+0.026
 heights+0.025
Neg logit contributions
 late-0.012
Good-0.012
da-0.012
Great-0.011
 Good-0.011
Token counts
Token ID9853
Feature activation+2.33e-02

Pos logit contributions
 appearance+0.012
 mouths+0.012
 conclusions+0.012
 elbows+0.012
 heights+0.012
Neg logit contributions
da-0.006
sy-0.006
xy-0.006
 fixed-0.005
ager-0.005
Token.
Token ID13
Feature activation-4.29e-03

Pos logit contributions
 anth+0.044
 mouths+0.044
 appearance+0.044
 conclusions+0.043
 tongues+0.043
Neg logit contributions
xy-0.023
man-0.021
 snug-0.021
lad-0.021
year-0.021
Token,
Token ID11
Feature activation+2.32e-02

Pos logit contributions
 mouths+0.093
 appearance+0.092
 tongues+0.090
 salon+0.090
 elbows+0.090
Neg logit contributions
 snug-0.057
ve-0.055
lad-0.054
 valuable-0.052
 safe-0.052
Token six
Token ID2237
Feature activation+6.91e-02

Pos logit contributions
 mouths+0.037
 tongues+0.035
 appearance+0.034
 noses+0.034
 heights+0.034
Neg logit contributions
 snug-0.031
Good-0.031
da-0.031
lad-0.031
des-0.031
Token,
Token ID11
Feature activation+2.70e-02

Pos logit contributions
 mouths+0.129
 salon+0.126
 appearance+0.126
 tongues+0.124
 anth+0.123
Neg logit contributions
 snug-0.065
ve-0.063
lad-0.061
 good-0.059
iest-0.058
Token seven
Token ID3598
Feature activation+7.34e-02

Pos logit contributions
 mouths+0.041
 heights+0.039
 noses+0.038
 appearance+0.038
 tongues+0.038
Neg logit contributions
 snug-0.040
da-0.039
des-0.038
Good-0.038
sy-0.038
Token,
Token ID11
Feature activation+3.48e-02

Pos logit contributions
ric+0.141
 mouths+0.138
 appearance+0.137
 conclusions+0.135
 elbows+0.134
Neg logit contributions
 ever-0.063
 good-0.057
 snug-0.057
ve-0.055
 wonderful-0.053
Token eight
Token ID3624
Feature activation+7.80e-02

Pos logit contributions
 mouths+0.045
 appearance+0.042
 tongues+0.042
 heights+0.041
 noses+0.041
Neg logit contributions
da-0.059
Good-0.057
 snug-0.055
ve-0.054
lad-0.054
Token,
Token ID11
Feature activation+2.27e-02

Pos logit contributions
 appearance+0.145
 salon+0.145
ric+0.144
 mouths+0.143
 anth+0.141
Neg logit contributions
 snug-0.066
ve-0.062
 ever-0.061
 good-0.060
 hard-0.057
Token nine
Token ID5193
Feature activation+6.18e-02

Pos logit contributions
 mouths+0.021
 heights+0.018
 noses+0.018
 tongues+0.018
 appearance+0.018
Neg logit contributions
 snug-0.047
lad-0.047
des-0.047
xy-0.047
da-0.046
Token,
Token ID11
Feature activation+2.17e-02

Pos logit contributions
 conclusions+0.118
 mouths+0.117
 anth+0.117
 elbows+0.116
 hoops+0.114
Neg logit contributions
lad-0.052
ve-0.052
 snug-0.051
 ever-0.050
da-0.046
Token ten
Token ID3478
Feature activation+5.22e-02

Pos logit contributions
 mouths+0.029
 tongues+0.027
 noses+0.026
 heights+0.026
 appearance+0.025
Neg logit contributions
xy-0.036
des-0.036
lad-0.036
da-0.036
Good-0.036
Token!
Token ID0
Feature activation+3.83e-03

Pos logit contributions
 mouths+0.110
 elbows+0.107
 appearance+0.107
 tongues+0.105
 salon+0.105
Neg logit contributions
ve-0.041
lad-0.038
 ever-0.037
 snug-0.036
da-0.035
TokenIf
Token ID1532
Feature activation-2.85e-03

Pos logit contributions
xy+0.008
des+0.008
 snug+0.008
lad+0.008
 secure+0.008
Neg logit contributions
 mouths-0.018
 heights-0.017
 noses-0.017
 tongues-0.017
 appearance-0.016
Token I
Token ID314
Feature activation+2.47e-02

Pos logit contributions
 secure+0.002
 snug+0.002
xy+0.002
 valuable+0.002
Good+0.002
Neg logit contributions
 mouths-0.006
 tongues-0.006
 groups-0.006
els-0.006
 noses-0.006
Token slip
Token ID13819
Feature activation+3.71e-02

Pos logit contributions
 mouths+0.048
 heights+0.047
 noses+0.046
 tongues+0.045
 groups+0.045
Neg logit contributions
 secure-0.027
 snug-0.027
sy-0.027
des-0.027
 repaired-0.026
Token on
Token ID319
Feature activation+4.09e-02

Pos logit contributions
 mouths+0.075
 heights+0.071
 noses+0.070
 tongues+0.070
 groups+0.069
Neg logit contributions
 snug-0.037
xy-0.036
 secure-0.035
sy-0.034
lad-0.034
Token this
Token ID428
Feature activation+1.76e-02

Pos logit contributions
 mouths+0.086
 tongues+0.082
 appearance+0.081
 noses+0.081
 groups+0.080
Neg logit contributions
 snug-0.033
 safe-0.032
Good-0.031
 secure-0.030
xy-0.030
Token banana
Token ID25996
Feature activation+7.78e-02

Pos logit contributions
 mouths+0.031
 noses+0.029
 tongues+0.029
els+0.028
 groups+0.028
Neg logit contributions
 snug-0.023
 secure-0.023
 safe-0.022
 valuable-0.022
des-0.022
Token peel
Token ID31738
Feature activation+4.94e-02

Pos logit contributions
 mouths+0.138
 noses+0.135
 tongues+0.131
 appearance+0.129
 groups+0.127
Neg logit contributions
xy-0.087
 safe-0.087
 snug-0.083
tted-0.082
 fixed-0.082
Token,
Token ID11
Feature activation+1.36e-02

Pos logit contributions
 mouths+0.083
 heights+0.080
 noses+0.079
 tongues+0.078
 appearance+0.077
Neg logit contributions
xy-0.065
 snug-0.064
lad-0.062
 secure-0.062
des-0.061
Token it
Token ID340
Feature activation+8.45e-03

Pos logit contributions
 mouths+0.027
 tongues+0.026
 heights+0.026
 noses+0.026
els+0.025
Neg logit contributions
 snug-0.014
 secure-0.014
des-0.013
 valuable-0.013
lad-0.013
Token might
Token ID1244
Feature activation-4.62e-03

Pos logit contributions
 mouths+0.017
 heights+0.017
 noses+0.016
 tongues+0.016
 groups+0.016
Neg logit contributions
xy-0.008
 snug-0.008
 secure-0.008
des-0.008
lad-0.008
Token be
Token ID307
Feature activation+8.50e-03

Pos logit contributions
 secure+0.006
 snug+0.006
 valuable+0.005
des+0.005
sy+0.005
Neg logit contributions
 mouths-0.008
els-0.008
 heights-0.008
 noses-0.008
 groups-0.008
Token so
Token ID523
Feature activation-5.41e-03

Pos logit contributions
 mouths+0.028
 heights+0.027
 tongues+0.027
 appearance+0.027
 salon+0.026
Neg logit contributions
 snug-0.012
 valuable-0.012
 secure-0.011
 safe-0.011
sy-0.010
Token weird
Token ID7650
Feature activation+6.26e-03

Pos logit contributions
des+0.007
than+0.007
lad+0.007
 secure+0.006
xy+0.006
Neg logit contributions
 mouths-0.009
 frogs-0.009
 hoops-0.009
 heights-0.009
 anth-0.009
Token.
Token ID13
Feature activation+1.82e-02

Pos logit contributions
 mouths+0.012
 heights+0.012
 noses+0.012
 tongues+0.012
 appearance+0.011
Neg logit contributions
des-0.006
 snug-0.006
xy-0.006
 secure-0.006
lad-0.006
Token It
Token ID632
Feature activation-4.19e-03

Pos logit contributions
 mouths+0.038
 heights+0.038
 tongues+0.037
 appearance+0.037
 noses+0.036
Neg logit contributions
 Good-0.017
da-0.016
 snug-0.016
year-0.016
Good-0.015
Token has
Token ID468
Feature activation+3.07e-02

Pos logit contributions
xy+0.004
 snug+0.004
des+0.004
 secure+0.004
lad+0.004
Neg logit contributions
 mouths-0.009
 heights-0.008
 noses-0.008
 tongues-0.008
 appearance-0.008
Token eight
Token ID3624
Feature activation+7.76e-02

Pos logit contributions
 mouths+0.060
els+0.059
 heights+0.059
 salon+0.058
 groups+0.058
Neg logit contributions
 valuable-0.030
 snug-0.029
da-0.027
Good-0.027
 secure-0.026
Token legs
Token ID7405
Feature activation+4.82e-02

Pos logit contributions
 salon+0.095
 groups+0.075
ican+0.075
 appearance+0.074
 nations+0.073
Neg logit contributions
 snug-0.137
ve-0.136
 valuable-0.133
 hard-0.130
 tight-0.129
Token and
Token ID290
Feature activation+3.65e-02

Pos logit contributions
 mouths+0.097
 heights+0.095
 tongues+0.091
 appearance+0.091
 groups+0.090
Neg logit contributions
xy-0.046
lad-0.046
 snug-0.045
sy-0.045
da-0.045
Token a
Token ID257
Feature activation+4.54e-02

Pos logit contributions
 mouths+0.069
 heights+0.069
 groups+0.069
 appearance+0.068
 salon+0.067
Neg logit contributions
 snug-0.035
Good-0.034
xy-0.033
 Good-0.033
da-0.033
Token lot
Token ID1256
Feature activation+9.55e-03

Pos logit contributions
els+0.089
 heights+0.087
 mouths+0.082
 groups+0.082
 crabs+0.081
Neg logit contributions
 valuable-0.052
 safe-0.049
 snug-0.048
 tight-0.045
ve-0.045
Token of
Token ID286
Feature activation+3.58e-02

Pos logit contributions
 mouths+0.021
 appearance+0.020
 noses+0.020
 heights+0.020
 tongues+0.020
Neg logit contributions
 valuable-0.008
 snug-0.008
year-0.007
tted-0.007
da-0.007
Token"
Token ID1
Feature activation-6.13e-03

Pos logit contributions
 mouths+0.024
 heights+0.023
 noses+0.023
 tongues+0.023
 appearance+0.022
Neg logit contributions
des-0.014
xy-0.014
 snug-0.014
lad-0.014
 secure-0.013
TokenSix
Token ID21447
Feature activation+6.35e-02

Pos logit contributions
Good+0.007
Great+0.007
xy+0.006
da+0.006
lad+0.006
Neg logit contributions
 mouths-0.013
 appearance-0.012
 noses-0.012
 tongues-0.012
 heights-0.012
Token,
Token ID11
Feature activation+3.93e-02

Pos logit contributions
 mouths+0.123
 appearance+0.118
 elbows+0.116
 tongues+0.116
 noses+0.115
Neg logit contributions
lad-0.060
ve-0.059
 good-0.056
iest-0.055
xy-0.055
Token seven
Token ID3598
Feature activation+7.20e-02

Pos logit contributions
 appearance+0.054
 elbows+0.053
 heights+0.053
okes+0.053
els+0.053
Neg logit contributions
Good-0.053
da-0.052
 good-0.051
 Good-0.050
 valuable-0.049
Token,
Token ID11
Feature activation+4.47e-02

Pos logit contributions
 salon+0.141
 appearance+0.141
 mouths+0.139
 elbows+0.139
 hoops+0.138
Neg logit contributions
 ever-0.055
 good-0.054
 hard-0.051
lad-0.050
ve-0.049
Token eight
Token ID3624
Feature activation+7.73e-02

Pos logit contributions
 elbows+0.061
 appearance+0.061
 mouths+0.060
 heights+0.058
uders+0.057
Neg logit contributions
Good-0.063
da-0.062
 Good-0.058
 good-0.058
ve-0.057
Token,
Token ID11
Feature activation+2.62e-02

Pos logit contributions
 appearance+0.148
 salon+0.148
 mouths+0.145
 hoops+0.144
 anth+0.140
Neg logit contributions
ve-0.061
 good-0.059
 hard-0.059
 ever-0.058
 snug-0.057
Token nine
Token ID5193
Feature activation+6.01e-02

Pos logit contributions
 mouths+0.031
 appearance+0.029
 tongues+0.029
 noses+0.029
 heights+0.028
Neg logit contributions
Good-0.047
da-0.046
lad-0.045
Great-0.045
 snug-0.044
Token,
Token ID11
Feature activation+2.49e-02

Pos logit contributions
 hoops+0.118
 elbows+0.117
 conclusions+0.117
 appearance+0.116
 mouths+0.116
Neg logit contributions
ve-0.046
lad-0.046
 ever-0.045
 good-0.041
 hard-0.039
Token ten
Token ID3478
Feature activation+5.27e-02

Pos logit contributions
 mouths+0.037
 tongues+0.035
 noses+0.035
 anth+0.034
 appearance+0.034
Neg logit contributions
Good-0.037
da-0.035
lad-0.035
Great-0.035
des-0.034
Token flowers
Token ID12734
Feature activation+3.93e-02

Pos logit contributions
 mouths+0.114
 appearance+0.113
 elbows+0.111
 salon+0.111
 tongues+0.108
Neg logit contributions
ve-0.038
lad-0.034
xy-0.033
 ever-0.032
 valuable-0.031
Token.
Token ID13
Feature activation+2.24e-02

Pos logit contributions
 mouths+0.080
 appearance+0.077
 noses+0.077
 anth+0.075
 tongues+0.075
Neg logit contributions
xy-0.036
ve-0.035
lad-0.034
 secure-0.034
 snug-0.034
Token Five
Token ID10579
Feature activation+6.64e-02

Pos logit contributions
 heights+0.044
 appearance+0.043
 mouths+0.043
els+0.043
 tongues+0.043
Neg logit contributions
 Good-0.024
da-0.020
Good-0.020
Great-0.020
xy-0.019
Token,
Token ID11
Feature activation+2.47e-02

Pos logit contributions
 elbows+0.121
 mouths+0.117
 appearance+0.115
 anth+0.114
 salon+0.112
Neg logit contributions
ve-0.071
lad-0.068
 snug-0.066
sy-0.066
 wonderful-0.065
Token six
Token ID2237
Feature activation+7.26e-02

Pos logit contributions
 mouths+0.040
 appearance+0.039
 tongues+0.039
 heights+0.038
 elbows+0.038
Neg logit contributions
Good-0.030
da-0.029
 valuable-0.029
 Good-0.029
 snug-0.028
Token,
Token ID11
Feature activation+2.79e-02

Pos logit contributions
 appearance+0.131
 salon+0.131
 mouths+0.130
 elbows+0.128
 anth+0.126
Neg logit contributions
ve-0.070
lad-0.068
 good-0.065
 snug-0.063
iest-0.063
Token seven
Token ID3598
Feature activation+7.68e-02

Pos logit contributions
 mouths+0.044
 appearance+0.043
 heights+0.042
 noses+0.042
 tongues+0.042
Neg logit contributions
Good-0.036
da-0.036
 snug-0.036
 valuable-0.035
lad-0.034
Token,
Token ID11
Feature activation+3.62e-02

Pos logit contributions
 elbows+0.150
 appearance+0.150
 conclusions+0.147
ric+0.146
 mouths+0.146
Neg logit contributions
 ever-0.060
 good-0.058
 hard-0.055
ve-0.052
 much-0.051
Token eight
Token ID3624
Feature activation+7.86e-02

Pos logit contributions
 mouths+0.049
 appearance+0.049
 elbows+0.046
 noses+0.046
 heights+0.046
Neg logit contributions
da-0.056
Good-0.056
ve-0.053
lad-0.053
Great-0.053
Token bees
Token ID17002
Feature activation+6.33e-02

Pos logit contributions
 appearance+0.149
 salon+0.147
 mouths+0.143
 elbows+0.143
 anth+0.143
Neg logit contributions
 good-0.062
ve-0.062
 hard-0.061
 ever-0.061
 snug-0.059
Token."
Token ID526
Feature activation+1.31e-02

Pos logit contributions
 mouths+0.134
 appearance+0.126
 noses+0.126
 elbows+0.126
 anth+0.125
Neg logit contributions
 ever-0.052
ve-0.052
xy-0.051
da-0.050
sy-0.047
Token Tom
Token ID4186
Feature activation+6.29e-03

Pos logit contributions
 appearance+0.027
els+0.026
 tongues+0.026
 mouths+0.026
 heights+0.025
Neg logit contributions
 late-0.012
Good-0.012
da-0.012
Great-0.011
 Good-0.011
Token,
Token ID11
Feature activation+3.09e-02

Pos logit contributions
 mouths+0.089
 appearance+0.088
 elbows+0.086
 tongues+0.086
 noses+0.084
Neg logit contributions
ve-0.051
 snug-0.050
lad-0.050
 valuable-0.048
 secure-0.048
Token six
Token ID2237
Feature activation+6.67e-02

Pos logit contributions
 mouths+0.047
 tongues+0.045
 noses+0.044
 heights+0.044
 appearance+0.044
Neg logit contributions
 snug-0.044
lad-0.044
des-0.043
Good-0.043
xy-0.043
Token,
Token ID11
Feature activation+3.48e-02

Pos logit contributions
 mouths+0.121
 appearance+0.118
 salon+0.116
 elbows+0.115
 tongues+0.114
Neg logit contributions
ve-0.070
 snug-0.070
lad-0.069
 secure-0.064
sy-0.063
Token seven
Token ID3598
Feature activation+7.23e-02

Pos logit contributions
 mouths+0.051
 heights+0.048
 noses+0.048
 tongues+0.047
 appearance+0.047
Neg logit contributions
 snug-0.053
des-0.052
 secure-0.051
lad-0.051
sy-0.051
Token,
Token ID11
Feature activation+3.99e-02

Pos logit contributions
 appearance+0.132
 mouths+0.132
 elbows+0.131
 conclusions+0.129
 salon+0.128
Neg logit contributions
 ever-0.069
ve-0.062
 good-0.062
 snug-0.062
 wonderful-0.061
Token eight
Token ID3624
Feature activation+7.58e-02

Pos logit contributions
 mouths+0.055
 tongues+0.052
 appearance+0.051
 noses+0.051
 elbows+0.050
Neg logit contributions
da-0.063
Good-0.062
 snug-0.061
lad-0.061
Great-0.061
Token,
Token ID11
Feature activation+2.61e-02

Pos logit contributions
 appearance+0.132
 salon+0.130
 mouths+0.129
uders+0.127
 conclusions+0.126
Neg logit contributions
 snug-0.079
ve-0.076
 good-0.071
 ever-0.070
lad-0.069
Token nine
Token ID5193
Feature activation+5.45e-02

Pos logit contributions
 mouths+0.031
 noses+0.028
 heights+0.028
 tongues+0.028
 appearance+0.027
Neg logit contributions
lad-0.047
 snug-0.047
des-0.047
xy-0.046
 secure-0.046
Token,
Token ID11
Feature activation+2.49e-02

Pos logit contributions
 conclusions+0.101
 elbows+0.100
 hoops+0.099
 mouths+0.099
 appearance+0.096
Neg logit contributions
ve-0.053
lad-0.052
 snug-0.050
 ever-0.049
 secure-0.048
Token ten
Token ID3478
Feature activation+4.72e-02

Pos logit contributions
 mouths+0.037
 tongues+0.035
 noses+0.034
 appearance+0.032
 heights+0.032
Neg logit contributions
lad-0.038
Good-0.038
des-0.037
xy-0.037
 snug-0.037
Token!"
Token ID2474
Feature activation-2.70e-02

Pos logit contributions
 mouths+0.105
 elbows+0.102
 appearance+0.101
 salon+0.100
 tongues+0.099
Neg logit contributions
ve-0.036
lad-0.033
 snug-0.030
 ever-0.030
 valuable-0.029
Token,
Token ID11
Feature activation+2.47e-02

Pos logit contributions
 mouths+0.088
 appearance+0.086
 tongues+0.085
 salon+0.085
 throats+0.083
Neg logit contributions
 snug-0.058
ve-0.056
 safe-0.055
 valuable-0.054
 secure-0.054
Token six
Token ID2237
Feature activation+6.58e-02

Pos logit contributions
 mouths+0.040
 tongues+0.038
 appearance+0.037
 heights+0.037
 noses+0.037
Neg logit contributions
 snug-0.034
des-0.033
lad-0.033
da-0.033
Good-0.033
Token,
Token ID11
Feature activation+2.56e-02

Pos logit contributions
 mouths+0.121
 salon+0.119
 appearance+0.116
 tongues+0.116
 throats+0.115
Neg logit contributions
 snug-0.066
ve-0.064
 safe-0.060
lad-0.060
 good-0.059
Token seven
Token ID3598
Feature activation+7.11e-02

Pos logit contributions
 mouths+0.038
 heights+0.036
 noses+0.035
 tongues+0.035
 appearance+0.035
Neg logit contributions
 snug-0.039
des-0.038
 secure-0.038
lad-0.037
sy-0.037
Token,
Token ID11
Feature activation+3.30e-02

Pos logit contributions
 mouths+0.134
ric+0.134
 salon+0.131
 throats+0.130
 conclusions+0.130
Neg logit contributions
 ever-0.061
 snug-0.060
 wonderful-0.056
ve-0.055
 good-0.055
Token eight
Token ID3624
Feature activation+7.49e-02

Pos logit contributions
 mouths+0.041
 appearance+0.038
 heights+0.038
 tongues+0.038
 noses+0.037
Neg logit contributions
da-0.058
 snug-0.057
Good-0.056
sy-0.055
lad-0.055
Token,
Token ID11
Feature activation+2.13e-02

Pos logit contributions
 salon+0.137
 appearance+0.135
ric+0.134
 mouths+0.134
 conclusions+0.131
Neg logit contributions
 snug-0.072
ve-0.065
 safe-0.063
 good-0.062
 ever-0.061
Token nine
Token ID5193
Feature activation+5.44e-02

Pos logit contributions
 mouths+0.022
 heights+0.021
 noses+0.020
 tongues+0.020
 appearance+0.020
Neg logit contributions
 snug-0.041
des-0.041
lad-0.041
xy-0.041
 secure-0.041
Token,
Token ID11
Feature activation+2.39e-02

Pos logit contributions
 conclusions+0.100
 mouths+0.099
 anth+0.097
 salon+0.097
 tongues+0.097
Neg logit contributions
 snug-0.050
ve-0.050
lad-0.049
 ever-0.047
 secure-0.045
Token ten
Token ID3478
Feature activation+4.98e-02

Pos logit contributions
 mouths+0.034
 tongues+0.031
 noses+0.031
 appearance+0.030
 heights+0.030
Neg logit contributions
Good-0.038
des-0.038
da-0.038
lad-0.038
xy-0.038
Token!
Token ID0
Feature activation+1.98e-02

Pos logit contributions
 mouths+0.099
 appearance+0.096
 salon+0.095
 tongues+0.095
 elbows+0.094
Neg logit contributions
ve-0.046
 snug-0.041
lad-0.041
 ever-0.041
da-0.039
Token,
Token ID11
Feature activation+2.60e-02

Pos logit contributions
 appearance+0.083
 mouths+0.080
 tongues+0.078
 elbows+0.078
 noses+0.078
Neg logit contributions
ve-0.047
lad-0.046
 snug-0.046
 secure-0.045
 valuable-0.044
Token six
Token ID2237
Feature activation+6.25e-02

Pos logit contributions
 mouths+0.042
 tongues+0.040
 noses+0.039
 appearance+0.039
 heights+0.039
Neg logit contributions
 snug-0.035
Good-0.035
lad-0.034
des-0.034
da-0.034
Token,
Token ID11
Feature activation+2.88e-02

Pos logit contributions
 appearance+0.117
 mouths+0.113
 throats+0.110
 tongues+0.110
 conclusions+0.110
Neg logit contributions
lad-0.058
ve-0.058
 snug-0.054
 good-0.053
 himself-0.052
Token seven
Token ID3598
Feature activation+7.07e-02

Pos logit contributions
 mouths+0.043
 heights+0.042
 noses+0.041
 appearance+0.041
 tongues+0.041
Neg logit contributions
 snug-0.042
des-0.041
 secure-0.041
lad-0.041
Good-0.040
Token,
Token ID11
Feature activation+3.70e-02

Pos logit contributions
 appearance+0.136
 conclusions+0.133
uders+0.131
 mouths+0.130
 throats+0.130
Neg logit contributions
 ever-0.055
 good-0.054
lad-0.054
ve-0.052
 much-0.051
Token eight
Token ID3624
Feature activation+7.34e-02

Pos logit contributions
 mouths+0.047
 appearance+0.045
 tongues+0.044
 noses+0.044
 heights+0.043
Neg logit contributions
Good-0.062
da-0.062
Great-0.060
lad-0.060
 snug-0.059
Token,
Token ID11
Feature activation+2.70e-02

Pos logit contributions
 appearance+0.139
uders+0.134
 mouths+0.132
 conclusions+0.131
 mothers+0.131
Neg logit contributions
 snug-0.061
ve-0.059
 good-0.057
 hard-0.055
 repaired-0.055
Token nine
Token ID5193
Feature activation+5.31e-02

Pos logit contributions
 mouths+0.023
 noses+0.021
 heights+0.021
 tongues+0.020
 appearance+0.020
Neg logit contributions
des-0.057
 snug-0.057
lad-0.057
xy-0.057
 secure-0.056
Token,
Token ID11
Feature activation+2.47e-02

Pos logit contributions
 conclusions+0.101
 hoops+0.100
 elbows+0.098
 mouths+0.098
 appearance+0.098
Neg logit contributions
ve-0.047
lad-0.046
 snug-0.042
 ever-0.042
 secure-0.041
Token ten
Token ID3478
Feature activation+4.38e-02

Pos logit contributions
 mouths+0.032
 noses+0.029
 tongues+0.029
 heights+0.029
 appearance+0.028
Neg logit contributions
des-0.042
xy-0.042
lad-0.042
 snug-0.042
Good-0.041
Token.
Token ID13
Feature activation+1.48e-02

Pos logit contributions
 mouths+0.088
 appearance+0.088
 tongues+0.085
 elbows+0.084
 noses+0.084
Neg logit contributions
ve-0.038
lad-0.036
 snug-0.034
xy-0.033
 valuable-0.033
Token,
Token ID11
Feature activation+2.40e-02

Pos logit contributions
 mouths+0.104
 appearance+0.104
 salon+0.101
 tongues+0.101
 elbows+0.100
Neg logit contributions
 snug-0.068
ve-0.063
lad-0.062
 secure-0.061
 safe-0.060
Token five
Token ID1936
Feature activation+5.21e-02

Pos logit contributions
 mouths+0.039
 tongues+0.037
 heights+0.037
 appearance+0.036
 noses+0.036
Neg logit contributions
 snug-0.032
Good-0.031
da-0.031
lad-0.031
des-0.031
Token,
Token ID11
Feature activation+2.32e-02

Pos logit contributions
 mouths+0.093
 appearance+0.092
 tongues+0.090
 salon+0.090
 elbows+0.090
Neg logit contributions
 snug-0.057
ve-0.055
lad-0.054
 valuable-0.052
 safe-0.052
Token six
Token ID2237
Feature activation+6.91e-02

Pos logit contributions
 mouths+0.037
 tongues+0.035
 appearance+0.034
 noses+0.034
 heights+0.034
Neg logit contributions
 snug-0.031
Good-0.031
da-0.031
lad-0.031
des-0.031
Token,
Token ID11
Feature activation+2.70e-02

Pos logit contributions
 mouths+0.129
 salon+0.126
 appearance+0.126
 tongues+0.124
 anth+0.123
Neg logit contributions
 snug-0.065
ve-0.063
lad-0.061
 good-0.059
iest-0.058
Token seven
Token ID3598
Feature activation+7.34e-02

Pos logit contributions
 mouths+0.041
 heights+0.039
 noses+0.038
 appearance+0.038
 tongues+0.038
Neg logit contributions
 snug-0.040
da-0.039
des-0.038
Good-0.038
sy-0.038
Token,
Token ID11
Feature activation+3.48e-02

Pos logit contributions
ric+0.141
 mouths+0.138
 appearance+0.137
 conclusions+0.135
 elbows+0.134
Neg logit contributions
 ever-0.063
 good-0.057
 snug-0.057
ve-0.055
 wonderful-0.053
Token eight
Token ID3624
Feature activation+7.80e-02

Pos logit contributions
 mouths+0.045
 appearance+0.042
 tongues+0.042
 heights+0.041
 noses+0.041
Neg logit contributions
da-0.059
Good-0.057
 snug-0.055
ve-0.054
lad-0.054
Token,
Token ID11
Feature activation+2.27e-02

Pos logit contributions
 appearance+0.145
 salon+0.145
ric+0.144
 mouths+0.143
 anth+0.141
Neg logit contributions
 snug-0.066
ve-0.062
 ever-0.061
 good-0.060
 hard-0.057
Token nine
Token ID5193
Feature activation+6.18e-02

Pos logit contributions
 mouths+0.021
 heights+0.018
 noses+0.018
 tongues+0.018
 appearance+0.018
Neg logit contributions
 snug-0.047
lad-0.047
des-0.047
xy-0.047
da-0.046
Token,
Token ID11
Feature activation+2.17e-02

Pos logit contributions
 conclusions+0.118
 mouths+0.117
 anth+0.117
 elbows+0.116
 hoops+0.114
Neg logit contributions
lad-0.052
ve-0.052
 snug-0.051
 ever-0.050
da-0.046
Token told
Token ID1297
Feature activation+1.08e-02

Pos logit contributions
des+0.026
lad+0.025
 snug+0.025
xy+0.025
 secure+0.024
Neg logit contributions
 mouths-0.025
 noses-0.023
 spr-0.023
 groups-0.023
els-0.022
Token her
Token ID607
Feature activation+4.64e-03

Pos logit contributions
 mouths+0.022
 heights+0.022
 tongues+0.021
 noses+0.021
 anth+0.021
Neg logit contributions
 snug-0.009
 valuable-0.009
 secure-0.008
Good-0.008
da-0.008
Token that
Token ID326
Feature activation+1.25e-02

Pos logit contributions
 mouths+0.009
 heights+0.009
 tongues+0.009
 noses+0.008
 appearance+0.008
Neg logit contributions
 valuable-0.005
 snug-0.005
 secure-0.005
lad-0.005
des-0.005
Token spiders
Token ID26120
Feature activation+2.94e-02

Pos logit contributions
 mouths+0.027
 tongues+0.026
 noses+0.026
 appearance+0.025
 groups+0.025
Neg logit contributions
 valuable-0.010
xy-0.010
 snug-0.009
 secure-0.009
 Good-0.008
Token have
Token ID423
Feature activation+2.95e-02

Pos logit contributions
 mouths+0.054
 heights+0.053
 noses+0.051
 tongues+0.051
 appearance+0.050
Neg logit contributions
 secure-0.033
 snug-0.033
xy-0.033
lad-0.032
des-0.032
Token eight
Token ID3624
Feature activation+7.32e-02

Pos logit contributions
 salon+0.058
 heights+0.057
 anth+0.056
 mouths+0.056
els+0.054
Neg logit contributions
 valuable-0.031
 snug-0.029
 secure-0.029
 repaired-0.029
 safe-0.029
Token eyes
Token ID2951
Feature activation+3.21e-02

Pos logit contributions
 salon+0.093
 heights+0.079
ican+0.078
uders+0.074
 anth+0.073
Neg logit contributions
 snug-0.125
 valuable-0.123
ve-0.117
sy-0.115
 secure-0.114
Token,
Token ID11
Feature activation+3.53e-02

Pos logit contributions
 heights+0.056
 mouths+0.055
 anth+0.053
 tongues+0.052
 noses+0.052
Neg logit contributions
 snug-0.040
xy-0.039
 secure-0.039
lad-0.038
sy-0.038
Token that
Token ID326
Feature activation+1.20e-02

Pos logit contributions
 anth+0.059
 mouths+0.057
 heights+0.057
 tongues+0.055
 salon+0.055
Neg logit contributions
 snug-0.045
 secure-0.043
 valuable-0.041
 fixed-0.040
 safe-0.040
Token they
Token ID484
Feature activation+5.74e-03

Pos logit contributions
 heights+0.021
 mouths+0.021
 tongues+0.020
 salon+0.020
 appearance+0.020
Neg logit contributions
 valuable-0.014
 snug-0.014
xy-0.014
 secure-0.013
 safe-0.013
Token can
Token ID460
Feature activation-3.27e-03

Pos logit contributions
 mouths+0.011
 heights+0.011
 noses+0.010
 tongues+0.010
 appearance+0.010
Neg logit contributions
 secure-0.006
 snug-0.006
xy-0.006
des-0.006
sy-0.006
Token,
Token ID11
Feature activation+3.46e-02

Pos logit contributions
 appearance+0.105
 elbows+0.103
 mouths+0.102
 salon+0.100
 conclusions+0.100
Neg logit contributions
ve-0.056
 snug-0.055
lad-0.053
 valuable-0.051
 secure-0.051
Token six
Token ID2237
Feature activation+6.64e-02

Pos logit contributions
 mouths+0.059
 tongues+0.056
 anth+0.056
 heights+0.055
 noses+0.055
Neg logit contributions
 snug-0.041
Good-0.041
da-0.040
lad-0.040
 secure-0.039
Token,
Token ID11
Feature activation+3.74e-02

Pos logit contributions
 mouths+0.128
 appearance+0.127
 elbows+0.126
 salon+0.125
 throats+0.125
Neg logit contributions
ve-0.054
 good-0.050
 snug-0.050
lad-0.049
 wonderful-0.047
Token seven
Token ID3598
Feature activation+7.20e-02

Pos logit contributions
 mouths+0.057
 heights+0.054
 appearance+0.054
 tongues+0.053
 noses+0.053
Neg logit contributions
 snug-0.052
Good-0.051
da-0.051
 valuable-0.049
Great-0.049
Token,
Token ID11
Feature activation+4.26e-02

Pos logit contributions
ric+0.146
 conclusions+0.144
 appearance+0.142
 elbows+0.142
 throats+0.141
Neg logit contributions
 ever-0.051
 good-0.046
ve-0.042
 hard-0.042
 wonderful-0.041
Token eight
Token ID3624
Feature activation+7.31e-02

Pos logit contributions
 mouths+0.056
 appearance+0.053
 tongues+0.052
 noses+0.051
 heights+0.051
Neg logit contributions
da-0.068
Good-0.068
Great-0.065
lad-0.065
ve-0.065
Token,
Token ID11
Feature activation+2.99e-02

Pos logit contributions
 appearance+0.139
 salon+0.136
 mouths+0.136
ric+0.135
 conclusions+0.134
Neg logit contributions
ve-0.057
 snug-0.056
 ever-0.053
 good-0.053
 tight-0.052
Token nine
Token ID5193
Feature activation+5.56e-02

Pos logit contributions
 mouths+0.030
 tongues+0.027
 noses+0.027
 heights+0.026
 appearance+0.026
Neg logit contributions
lad-0.059
 snug-0.059
xy-0.059
Good-0.059
des-0.059
Token,
Token ID11
Feature activation+2.98e-02

Pos logit contributions
 conclusions+0.113
 elbows+0.110
 anth+0.109
 mouths+0.108
 hoops+0.106
Neg logit contributions
ve-0.043
 ever-0.041
lad-0.040
da-0.036
 snug-0.036
Token ten
Token ID3478
Feature activation+4.61e-02

Pos logit contributions
 mouths+0.045
 tongues+0.041
 noses+0.040
 groups+0.039
 anth+0.038
Neg logit contributions
Good-0.045
xy-0.045
da-0.045
lad-0.045
des-0.044
Token,
Token ID11
Feature activation+2.73e-02

Pos logit contributions
 mouths+0.101
 appearance+0.100
 elbows+0.100
 tongues+0.097
 conclusions+0.097
Neg logit contributions
ve-0.033
 ever-0.028
lad-0.027
 valuable-0.026
da-0.025
Token ten
Token ID3478
Feature activation+5.27e-02

Pos logit contributions
 mouths+0.037
 tongues+0.035
 noses+0.035
 anth+0.034
 appearance+0.034
Neg logit contributions
Good-0.037
da-0.035
lad-0.035
Great-0.035
des-0.034
Token flowers
Token ID12734
Feature activation+3.93e-02

Pos logit contributions
 mouths+0.114
 appearance+0.113
 elbows+0.111
 salon+0.111
 tongues+0.108
Neg logit contributions
ve-0.038
lad-0.034
xy-0.033
 ever-0.032
 valuable-0.031
Token.
Token ID13
Feature activation+2.24e-02

Pos logit contributions
 mouths+0.080
 appearance+0.077
 noses+0.077
 anth+0.075
 tongues+0.075
Neg logit contributions
xy-0.036
ve-0.035
lad-0.034
 secure-0.034
 snug-0.034
Token Five
Token ID10579
Feature activation+6.64e-02

Pos logit contributions
 heights+0.044
 appearance+0.043
 mouths+0.043
els+0.043
 tongues+0.043
Neg logit contributions
 Good-0.024
da-0.020
Good-0.020
Great-0.020
xy-0.019
Token,
Token ID11
Feature activation+2.47e-02

Pos logit contributions
 elbows+0.121
 mouths+0.117
 appearance+0.115
 anth+0.114
 salon+0.112
Neg logit contributions
ve-0.071
lad-0.068
 snug-0.066
sy-0.066
 wonderful-0.065
Token six
Token ID2237
Feature activation+7.26e-02

Pos logit contributions
 mouths+0.040
 appearance+0.039
 tongues+0.039
 heights+0.038
 elbows+0.038
Neg logit contributions
Good-0.030
da-0.029
 valuable-0.029
 Good-0.029
 snug-0.028
Token,
Token ID11
Feature activation+2.79e-02

Pos logit contributions
 appearance+0.131
 salon+0.131
 mouths+0.130
 elbows+0.128
 anth+0.126
Neg logit contributions
ve-0.070
lad-0.068
 good-0.065
 snug-0.063
iest-0.063
Token seven
Token ID3598
Feature activation+7.68e-02

Pos logit contributions
 mouths+0.044
 appearance+0.043
 heights+0.042
 noses+0.042
 tongues+0.042
Neg logit contributions
Good-0.036
da-0.036
 snug-0.036
 valuable-0.035
lad-0.034
Token,
Token ID11
Feature activation+3.62e-02

Pos logit contributions
 elbows+0.150
 appearance+0.150
 conclusions+0.147
ric+0.146
 mouths+0.146
Neg logit contributions
 ever-0.060
 good-0.058
 hard-0.055
ve-0.052
 much-0.051
Token eight
Token ID3624
Feature activation+7.86e-02

Pos logit contributions
 mouths+0.049
 appearance+0.049
 elbows+0.046
 noses+0.046
 heights+0.046
Neg logit contributions
da-0.056
Good-0.056
ve-0.053
lad-0.053
Great-0.053
Token bees
Token ID17002
Feature activation+6.33e-02

Pos logit contributions
 appearance+0.149
 salon+0.147
 mouths+0.143
 elbows+0.143
 anth+0.143
Neg logit contributions
 good-0.062
ve-0.062
 hard-0.061
 ever-0.061
 snug-0.059
Tokenroom
Token ID3823
Feature activation+4.39e-02

Pos logit contributions
 mouths+0.023
 heights+0.022
 noses+0.022
 tongues+0.021
els+0.021
Neg logit contributions
 secure-0.020
 snug-0.020
des-0.020
xy-0.019
 valuable-0.019
Token"
Token ID1
Feature activation-3.06e-02

Pos logit contributions
 mouths+0.092
 heights+0.090
 tongues+0.086
 appearance+0.086
 noses+0.086
Neg logit contributions
xy-0.040
lad-0.039
 snug-0.038
sy-0.038
da-0.037
Token and
Token ID290
Feature activation+3.20e-02

Pos logit contributions
des+0.041
 snug+0.040
 secure+0.040
lad+0.039
xy+0.039
Neg logit contributions
 mouths-0.051
 heights-0.049
 noses-0.049
 tongues-0.048
 appearance-0.047
Token "
Token ID366
Feature activation+4.16e-03

Pos logit contributions
 mouths+0.053
 heights+0.051
 noses+0.050
 tongues+0.050
els+0.050
Neg logit contributions
xy-0.042
 snug-0.041
lad-0.041
Good-0.040
 valuable-0.040
Tokenla
Token ID5031
Feature activation+4.25e-02

Pos logit contributions
 mouths+0.008
 heights+0.008
 noses+0.007
 appearance+0.007
 tongues+0.007
Neg logit contributions
des-0.005
xy-0.005
lad-0.005
 snug-0.005
Good-0.005
Token la
Token ID8591
Feature activation+7.25e-02

Pos logit contributions
 heights+0.086
 mouths+0.085
 appearance+0.083
 tongues+0.081
 noses+0.081
Neg logit contributions
da-0.041
xy-0.040
sy-0.039
lad-0.039
ve-0.038
Token la
Token ID8591
Feature activation+6.85e-02

Pos logit contributions
 mouths+0.147
 heights+0.140
 tongues+0.137
 elbows+0.136
 noses+0.134
Neg logit contributions
da-0.081
ve-0.078
lad-0.074
Good-0.073
tted-0.071
Token".
Token ID1911
Feature activation+1.11e-02

Pos logit contributions
 mouths+0.146
 heights+0.138
 tongues+0.135
 elbows+0.134
 noses+0.133
Neg logit contributions
da-0.072
ve-0.068
lad-0.064
Good-0.062
xy-0.061
Token
Token ID198
Feature activation+9.06e-03

Pos logit contributions
els+0.025
 heights+0.025
 appearance+0.025
 crabs+0.025
 tongues+0.024
Neg logit contributions
Great-0.008
 late-0.007
da-0.007
 Good-0.007
 Thanks-0.007
Token
Token ID198
Feature activation+1.10e-02

Pos logit contributions
 mouths+0.017
 heights+0.016
 noses+0.016
 groups+0.016
 tongues+0.016
Neg logit contributions
des-0.010
 secure-0.010
 snug-0.010
lad-0.010
xy-0.010
TokenMom
Token ID29252
Feature activation+1.17e-02

Pos logit contributions
 mouths+0.023
 heights+0.022
 tongues+0.022
 noses+0.021
 appearance+0.021
Neg logit contributions
xy-0.011
sy-0.011
Good-0.011
 snug-0.011
des-0.011
Token,
Token ID11
Feature activation+2.30e-02

Pos logit contributions
 appearance+0.131
 salon+0.130
 conclusions+0.124
 elbows+0.121
uders+0.119
Neg logit contributions
ve-0.062
 wonderful-0.053
 useful-0.052
 much-0.052
sy-0.052
Token five
Token ID1936
Feature activation+4.48e-02

Pos logit contributions
 mouths+0.036
 tongues+0.033
 noses+0.033
 heights+0.033
 appearance+0.033
Neg logit contributions
 snug-0.033
des-0.032
xy-0.032
lad-0.032
Good-0.032
Token,
Token ID11
Feature activation+2.23e-02

Pos logit contributions
 salon+0.087
 appearance+0.087
 throats+0.083
 elbows+0.083
ric+0.082
Neg logit contributions
ve-0.042
 valuable-0.040
 useful-0.040
 much-0.040
 wonderful-0.040
Token six
Token ID2237
Feature activation+6.46e-02

Pos logit contributions
 mouths+0.035
 tongues+0.032
 noses+0.032
 appearance+0.032
 heights+0.032
Neg logit contributions
 snug-0.032
des-0.031
lad-0.031
xy-0.031
Good-0.031
Token,
Token ID11
Feature activation+2.46e-02

Pos logit contributions
 salon+0.133
 appearance+0.128
ric+0.126
 conclusions+0.123
uders+0.121
Neg logit contributions
 good-0.049
ve-0.048
--0.048
 hard-0.048
 fast-0.047
Token seven
Token ID3598
Feature activation+7.23e-02

Pos logit contributions
 mouths+0.038
 heights+0.036
 noses+0.035
 appearance+0.035
 tongues+0.035
Neg logit contributions
 snug-0.035
des-0.034
lad-0.034
Good-0.034
 secure-0.034
Token.
Token ID13
Feature activation+1.89e-02

Pos logit contributions
ric+0.142
 salon+0.141
 appearance+0.136
uders+0.136
 conclusions+0.134
Neg logit contributions
 ever-0.057
 good-0.055
 much-0.054
 hard-0.054
 fast-0.051
Token There
Token ID1318
Feature activation+1.01e-03

Pos logit contributions
 appearance+0.042
 heights+0.040
els+0.039
 noses+0.039
 salon+0.039
Neg logit contributions
Great-0.016
 Good-0.015
tted-0.015
 Thanks-0.015
Good-0.015
Token were
Token ID547
Feature activation+1.65e-02

Pos logit contributions
 mouths+0.002
 heights+0.002
 noses+0.002
 groups+0.002
 tongues+0.002
Neg logit contributions
des-0.001
 snug-0.001
 secure-0.001
lad-0.001
xy-0.001
Token seven
Token ID3598
Feature activation+7.09e-02

Pos logit contributions
 mouths+0.027
 appearance+0.027
 heights+0.026
els+0.026
 tongues+0.026
Neg logit contributions
Good-0.019
sy-0.019
 snug-0.019
 valuable-0.019
Great-0.019
Token ants
Token ID27842
Feature activation+3.65e-02

Pos logit contributions
 appearance+0.110
 conclusions+0.106
ric+0.104
 elbows+0.103
 throats+0.102
Neg logit contributions
 delicious-0.088
 awesome-0.085
 good-0.084
 wonderful-0.084
 tasty-0.081
Token,
Token ID11
Feature activation+3.33e-02

Pos logit contributions
 appearance+0.098
 mouths+0.098
 elbows+0.095
 tongues+0.093
 salon+0.093
Neg logit contributions
ve-0.061
 snug-0.060
lad-0.058
sy-0.057
 secure-0.056
Token five
Token ID1936
Feature activation+4.80e-02

Pos logit contributions
 mouths+0.052
 tongues+0.049
 noses+0.048
 heights+0.048
 appearance+0.048
Neg logit contributions
 snug-0.047
lad-0.047
des-0.046
Good-0.046
da-0.046
Token,
Token ID11
Feature activation+3.09e-02

Pos logit contributions
 mouths+0.089
 appearance+0.088
 elbows+0.086
 tongues+0.086
 noses+0.084
Neg logit contributions
ve-0.051
 snug-0.050
lad-0.050
 valuable-0.048
 secure-0.048
Token six
Token ID2237
Feature activation+6.67e-02

Pos logit contributions
 mouths+0.047
 tongues+0.045
 noses+0.044
 heights+0.044
 appearance+0.044
Neg logit contributions
 snug-0.044
lad-0.044
des-0.043
Good-0.043
xy-0.043
Token,
Token ID11
Feature activation+3.48e-02

Pos logit contributions
 mouths+0.121
 appearance+0.118
 salon+0.116
 elbows+0.115
 tongues+0.114
Neg logit contributions
ve-0.070
 snug-0.070
lad-0.069
 secure-0.064
sy-0.063
Token seven
Token ID3598
Feature activation+7.23e-02

Pos logit contributions
 mouths+0.051
 heights+0.048
 noses+0.048
 tongues+0.047
 appearance+0.047
Neg logit contributions
 snug-0.053
des-0.052
 secure-0.051
lad-0.051
sy-0.051
Token,
Token ID11
Feature activation+3.99e-02

Pos logit contributions
 appearance+0.132
 mouths+0.132
 elbows+0.131
 conclusions+0.129
 salon+0.128
Neg logit contributions
 ever-0.069
ve-0.062
 good-0.062
 snug-0.062
 wonderful-0.061
Token eight
Token ID3624
Feature activation+7.58e-02

Pos logit contributions
 mouths+0.055
 tongues+0.052
 appearance+0.051
 noses+0.051
 elbows+0.050
Neg logit contributions
da-0.063
Good-0.062
 snug-0.061
lad-0.061
Great-0.061
Token,
Token ID11
Feature activation+2.61e-02

Pos logit contributions
 appearance+0.132
 salon+0.130
 mouths+0.129
uders+0.127
 conclusions+0.126
Neg logit contributions
 snug-0.079
ve-0.076
 good-0.071
 ever-0.070
lad-0.069
Token nine
Token ID5193
Feature activation+5.45e-02

Pos logit contributions
 mouths+0.031
 noses+0.028
 heights+0.028
 tongues+0.028
 appearance+0.027
Neg logit contributions
lad-0.047
 snug-0.047
des-0.047
xy-0.046
 secure-0.046
Token,
Token ID11
Feature activation+2.49e-02

Pos logit contributions
 conclusions+0.101
 elbows+0.100
 hoops+0.099
 mouths+0.099
 appearance+0.096
Neg logit contributions
ve-0.053
lad-0.052
 snug-0.050
 ever-0.049
 secure-0.048
Token,
Token ID11
Feature activation+2.56e-02

Pos logit contributions
 anth+0.100
 mouths+0.098
 elbows+0.098
 appearance+0.097
uders+0.096
Neg logit contributions
ve-0.059
lad-0.058
 snug-0.057
 good-0.057
 valuable-0.057
Token six
Token ID2237
Feature activation+6.74e-02

Pos logit contributions
 mouths+0.042
 anth+0.041
 appearance+0.041
 heights+0.041
 tongues+0.041
Neg logit contributions
Good-0.032
Great-0.031
da-0.031
 valuable-0.030
 snug-0.029
Token,
Token ID11
Feature activation+2.91e-02

Pos logit contributions
 mouths+0.127
 appearance+0.125
 elbows+0.124
 anth+0.122
 conclusions+0.121
Neg logit contributions
ve-0.061
lad-0.060
 good-0.057
 snug-0.055
 valuable-0.053
Token seven
Token ID3598
Feature activation+7.05e-02

Pos logit contributions
 mouths+0.046
 heights+0.045
 appearance+0.045
 noses+0.044
 tongues+0.044
Neg logit contributions
Good-0.039
da-0.039
 snug-0.038
Great-0.037
 valuable-0.037
Token,
Token ID11
Feature activation+3.39e-02

Pos logit contributions
 conclusions+0.136
 elbows+0.134
 appearance+0.134
 mouths+0.133
ric+0.132
Neg logit contributions
 ever-0.057
 good-0.055
ve-0.052
lad-0.051
Good-0.049
Token eight
Token ID3624
Feature activation+7.21e-02

Pos logit contributions
 mouths+0.045
 appearance+0.044
 heights+0.043
 noses+0.042
 tongues+0.042
Neg logit contributions
da-0.055
Good-0.055
Great-0.053
lad-0.052
ve-0.052
Token,"
Token ID553
Feature activation-4.57e-03

Pos logit contributions
 appearance+0.134
 mouths+0.131
 conclusions+0.131
 elbows+0.130
uders+0.129
Neg logit contributions
ve-0.061
 snug-0.061
 good-0.059
lad-0.058
 ever-0.056
Token Mia
Token ID32189
Feature activation+1.62e-02

Pos logit contributions
 snug+0.005
xy+0.005
Good+0.005
 secure+0.005
des+0.005
Neg logit contributions
 mouths-0.009
 appearance-0.009
 tongues-0.008
 noses-0.008
 heights-0.008
Token says
Token ID1139
Feature activation-1.23e-03

Pos logit contributions
 mouths+0.030
 appearance+0.029
 tongues+0.029
 elbows+0.028
 heights+0.028
Neg logit contributions
da-0.018
xy-0.018
sy-0.017
 fixed-0.017
 repaired-0.017
Token.
Token ID13
Feature activation+1.78e-02

Pos logit contributions
des+0.002
xy+0.002
 snug+0.002
 secure+0.002
lad+0.002
Neg logit contributions
 mouths-0.002
 heights-0.002
 noses-0.002
 tongues-0.002
 appearance-0.002
Token
Token ID198
Feature activation+9.81e-03

Pos logit contributions
 appearance+0.045
 tongues+0.043
 mouths+0.042
els+0.042
 heights+0.041
Neg logit contributions
 He-0.011
Good-0.011
Great-0.010
 Good-0.010
xy-0.010
Token butterflies
Token ID44571
Feature activation+4.53e-02

Pos logit contributions
 anth+0.091
 mouths+0.087
 salon+0.087
 apartment+0.086
 darkened+0.085
Neg logit contributions
 good-0.043
 safe-0.043
 free-0.041
 wonderful-0.040
 useful-0.040
Token and
Token ID290
Feature activation+3.22e-02

Pos logit contributions
 mouths+0.093
 anth+0.093
 heights+0.091
 elbows+0.090
 tongues+0.090
Neg logit contributions
 safe-0.043
xy-0.041
 fixed-0.039
 snug-0.039
 valuable-0.038
Token lady
Token ID10846
Feature activation+2.49e-02

Pos logit contributions
els+0.041
andel+0.039
 salon+0.039
 anth+0.039
uders+0.038
Neg logit contributions
 valuable-0.052
Good-0.050
 wonderful-0.050
 snug-0.050
des-0.049
Tokenbugs
Token ID32965
Feature activation+4.01e-02

Pos logit contributions
 heights+0.036
 anth+0.035
 mouths+0.033
 elbows+0.032
 appearance+0.031
Neg logit contributions
sy-0.037
xy-0.037
 fixed-0.037
great-0.037
 safe-0.036
Token and
Token ID290
Feature activation+3.80e-02

Pos logit contributions
 anth+0.081
 heights+0.080
 mouths+0.078
 elbows+0.076
 tongues+0.075
Neg logit contributions
 safe-0.039
 fixed-0.038
xy-0.037
 valuable-0.037
 snug-0.036
Token cater
Token ID20825
Feature activation+7.21e-02

Pos logit contributions
 anth+0.047
els+0.047
 salon+0.047
uders+0.046
andel+0.046
Neg logit contributions
Good-0.059
 valuable-0.057
 wonderful-0.056
 snug-0.056
 safe-0.055
Tokenpill
Token ID27215
Feature activation+5.82e-02

Pos logit contributions
 mouths+0.131
 salon+0.126
 noses+0.121
 groups+0.121
 appearance+0.120
Neg logit contributions
xy-0.089
ve-0.084
des-0.083
free-0.080
da-0.077
Tokenars
Token ID945
Feature activation+3.92e-03

Pos logit contributions
 mouths+0.124
 heights+0.121
 noses+0.120
 appearance+0.118
 tongues+0.117
Neg logit contributions
des-0.052
 secure-0.051
xy-0.051
 snug-0.050
sy-0.050
Token."
Token ID526
Feature activation+1.14e-02

Pos logit contributions
 heights+0.007
 mouths+0.007
 anth+0.007
 tongues+0.007
 noses+0.007
Neg logit contributions
xy-0.004
lad-0.004
 snug-0.004
 valuable-0.004
 secure-0.004
Token
Token ID198
Feature activation-3.80e-03

Pos logit contributions
 mouths+0.028
 appearance+0.028
 noses+0.027
 tongues+0.027
 heights+0.027
Neg logit contributions
xy-0.006
des-0.006
Good-0.006
 snug-0.005
lad-0.005
Token
Token ID198
Feature activation+2.89e-03

Pos logit contributions
des+0.004
 snug+0.004
 secure+0.004
than+0.004
lad+0.004
Neg logit contributions
 mouths-0.007
 heights-0.007
 noses-0.007
 tongues-0.006
 groups-0.006
Token,
Token ID11
Feature activation+2.98e-02

Pos logit contributions
 appearance+0.077
 mouths+0.075
 salon+0.074
 heights+0.074
 elbows+0.074
Neg logit contributions
 snug-0.049
lad-0.048
ve-0.047
 valuable-0.046
 secure-0.045
Token four
Token ID1440
Feature activation+5.66e-02

Pos logit contributions
 mouths+0.053
 tongues+0.051
 anth+0.051
 heights+0.050
 noses+0.050
Neg logit contributions
 snug-0.033
Good-0.033
da-0.032
lad-0.032
 secure-0.031
Token,
Token ID11
Feature activation+3.46e-02

Pos logit contributions
 appearance+0.105
 elbows+0.103
 mouths+0.102
 salon+0.100
 conclusions+0.100
Neg logit contributions
ve-0.056
 snug-0.055
lad-0.053
 valuable-0.051
 secure-0.051
Token six
Token ID2237
Feature activation+6.64e-02

Pos logit contributions
 mouths+0.059
 tongues+0.056
 anth+0.056
 heights+0.055
 noses+0.055
Neg logit contributions
 snug-0.041
Good-0.041
da-0.040
lad-0.040
 secure-0.039
Token,
Token ID11
Feature activation+3.74e-02

Pos logit contributions
 mouths+0.128
 appearance+0.127
 elbows+0.126
 salon+0.125
 throats+0.125
Neg logit contributions
ve-0.054
 good-0.050
 snug-0.050
lad-0.049
 wonderful-0.047
Token seven
Token ID3598
Feature activation+7.20e-02

Pos logit contributions
 mouths+0.057
 heights+0.054
 appearance+0.054
 tongues+0.053
 noses+0.053
Neg logit contributions
 snug-0.052
Good-0.051
da-0.051
 valuable-0.049
Great-0.049
Token,
Token ID11
Feature activation+4.26e-02

Pos logit contributions
ric+0.146
 conclusions+0.144
 appearance+0.142
 elbows+0.142
 throats+0.141
Neg logit contributions
 ever-0.051
 good-0.046
ve-0.042
 hard-0.042
 wonderful-0.041
Token eight
Token ID3624
Feature activation+7.31e-02

Pos logit contributions
 mouths+0.056
 appearance+0.053
 tongues+0.052
 noses+0.051
 heights+0.051
Neg logit contributions
da-0.068
Good-0.068
Great-0.065
lad-0.065
ve-0.065
Token,
Token ID11
Feature activation+2.99e-02

Pos logit contributions
 appearance+0.139
 salon+0.136
 mouths+0.136
ric+0.135
 conclusions+0.134
Neg logit contributions
ve-0.057
 snug-0.056
 ever-0.053
 good-0.053
 tight-0.052
Token nine
Token ID5193
Feature activation+5.56e-02

Pos logit contributions
 mouths+0.030
 tongues+0.027
 noses+0.027
 heights+0.026
 appearance+0.026
Neg logit contributions
lad-0.059
 snug-0.059
xy-0.059
Good-0.059
des-0.059
Token,
Token ID11
Feature activation+2.98e-02

Pos logit contributions
 conclusions+0.113
 elbows+0.110
 anth+0.109
 mouths+0.108
 hoops+0.106
Neg logit contributions
ve-0.043
 ever-0.041
lad-0.040
da-0.036
 snug-0.036
Token boy
Token ID2933
Feature activation-3.06e-02

Pos logit contributions
xy+0.029
des+0.028
 secure+0.027
 snug+0.027
sy+0.027
Neg logit contributions
 mouths-0.032
 heights-0.031
 noses-0.030
 groups-0.030
 appearance-0.030
Token said
Token ID531
Feature activation+5.25e-04

Pos logit contributions
des+0.032
xy+0.032
 snug+0.031
lad+0.031
 secure+0.031
Neg logit contributions
 mouths-0.060
 heights-0.058
 noses-0.058
 tongues-0.056
 appearance-0.056
Token "
Token ID366
Feature activation-5.60e-02

Pos logit contributions
 mouths+0.001
 heights+0.001
 noses+0.001
 crabs+0.001
 frogs+0.001
Neg logit contributions
xy-0.001
sy-0.001
des-0.001
 snug-0.001
 secure-0.001
Tokenthank
Token ID40716
Feature activation-1.50e-02

Pos logit contributions
 snug+0.066
 secure+0.066
vern+0.066
 valuable+0.065
xy+0.062
Neg logit contributions
 heights-0.088
 noses-0.084
 mouths-0.084
els-0.083
 tongues-0.083
Token you
Token ID345
Feature activation-1.54e-02

Pos logit contributions
 secure+0.024
sy+0.024
 snug+0.024
xy+0.023
des+0.023
Neg logit contributions
 heights-0.021
 frogs-0.020
 mouths-0.020
 crabs-0.020
 beetles-0.019
Token"
Token ID1
Feature activation-8.23e-02

Pos logit contributions
des+0.023
 secure+0.023
xy+0.022
 snug+0.022
than+0.022
Neg logit contributions
 mouths-0.023
 noses-0.022
 groups-0.021
 appearance-0.021
 heights-0.021
Token and
Token ID290
Feature activation-2.05e-02

Pos logit contributions
 secure+0.151
 snug+0.150
lad+0.149
des+0.148
vern+0.145
Neg logit contributions
 mouths-0.075
 cats-0.075
 beetles-0.073
bugs-0.073
 salon-0.073
Token Lily
Token ID20037
Feature activation-1.17e-02

Pos logit contributions
des+0.030
xy+0.029
lad+0.029
sy+0.028
 snug+0.028
Neg logit contributions
 mouths-0.033
 noses-0.031
 frogs-0.030
 spr-0.030
 heights-0.030
Token felt
Token ID2936
Feature activation-1.75e-02

Pos logit contributions
xy+0.018
des+0.018
lad+0.017
 snug+0.017
 secure+0.016
Neg logit contributions
 mouths-0.018
 noses-0.017
 groups-0.016
 eyes-0.016
 tongues-0.016
Token good
Token ID922
Feature activation-3.56e-03

Pos logit contributions
xy+0.030
ve+0.025
rob+0.024
lett+0.024
eddy+0.024
Neg logit contributions
 frogs-0.027
 crabs-0.027
 mouths-0.026
 spr-0.026
 hoops-0.025
Token for
Token ID329
Feature activation-2.86e-02

Pos logit contributions
des+0.005
 snug+0.005
xy+0.005
 secure+0.005
lad+0.005
Neg logit contributions
 mouths-0.006
 heights-0.006
 noses-0.006
 tongues-0.006
 appearance-0.005
Token he
Token ID339
Feature activation-4.80e-02

Pos logit contributions
des+0.008
 snug+0.008
xy+0.008
 secure+0.008
lad+0.007
Neg logit contributions
 mouths-0.012
 heights-0.011
 noses-0.011
 tongues-0.011
 appearance-0.011
Token said
Token ID531
Feature activation-8.66e-04

Pos logit contributions
xy+0.077
vern+0.074
lad+0.074
eddy+0.073
des+0.072
Neg logit contributions
 spr-0.059
 mouths-0.059
 crabs-0.058
 frogs-0.056
 feet-0.055
Token "
Token ID366
Feature activation-6.99e-02

Pos logit contributions
xy+0.001
des+0.001
sy+0.001
 secure+0.001
 snug+0.001
Neg logit contributions
 mouths-0.001
 heights-0.001
 eyes-0.001
 noses-0.001
 crabs-0.001
Tokenthank
Token ID40716
Feature activation-9.13e-03

Pos logit contributions
vern+0.081
 valuable+0.080
 secure+0.079
xy+0.078
 snug+0.077
Neg logit contributions
 heights-0.115
 noses-0.114
 tongues-0.112
 mouths-0.112
 crabs-0.108
Token you
Token ID345
Feature activation-8.01e-03

Pos logit contributions
 secure+0.012
 snug+0.012
des+0.012
xy+0.012
lad+0.011
Neg logit contributions
 heights-0.015
 mouths-0.015
 noses-0.015
 frogs-0.014
 tongues-0.014
Token"
Token ID1
Feature activation-7.96e-02

Pos logit contributions
des+0.010
 secure+0.010
xy+0.010
 snug+0.010
lad+0.009
Neg logit contributions
 mouths-0.014
 noses-0.014
 heights-0.013
 tongues-0.013
 appearance-0.013
Token to
Token ID284
Feature activation-2.63e-02

Pos logit contributions
des+0.130
lad+0.130
 secure+0.129
 snug+0.129
xy+0.124
Neg logit contributions
 heights-0.088
 mouths-0.088
 noses-0.086
 eyes-0.083
 salon-0.082
Token his
Token ID465
Feature activation+1.43e-03

Pos logit contributions
des+0.036
xy+0.035
 snug+0.035
lad+0.034
 secure+0.034
Neg logit contributions
 mouths-0.044
 heights-0.043
 noses-0.042
 tongues-0.041
 appearance-0.041
Token mom
Token ID1995
Feature activation-1.79e-02

Pos logit contributions
 heights+0.002
els+0.002
 mouths+0.002
uders+0.002
 appearance+0.002
Neg logit contributions
 snug-0.002
 secure-0.002
 valuable-0.002
 safe-0.002
eddy-0.002
Token.
Token ID13
Feature activation-1.60e-02

Pos logit contributions
des+0.025
 snug+0.025
xy+0.024
 secure+0.024
lad+0.024
Neg logit contributions
 mouths-0.029
 heights-0.028
 noses-0.027
 tongues-0.027
 appearance-0.027
Token
Token ID198
Feature activation-2.20e-02

Pos logit contributions
des+0.019
 secure+0.019
 snug+0.018
lad+0.018
xy+0.018
Neg logit contributions
 mouths-0.030
 noses-0.028
 tongues-0.027
 heights-0.027
 groups-0.027
Token happy
Token ID3772
Feature activation-1.24e-02

Pos logit contributions
xy+0.028
rob+0.027
da+0.025
lett+0.025
 Good+0.024
Neg logit contributions
 groups-0.023
 mouths-0.022
 spr-0.021
 blamed-0.021
 eyes-0.020
Token and
Token ID290
Feature activation-4.85e-02

Pos logit contributions
des+0.016
 snug+0.016
xy+0.016
 secure+0.016
lad+0.016
Neg logit contributions
 mouths-0.021
 heights-0.020
 noses-0.020
 tongues-0.020
 appearance-0.019
Token he
Token ID339
Feature activation-3.48e-02

Pos logit contributions
xy+0.081
lad+0.078
tted+0.074
ve+0.073
des+0.073
Neg logit contributions
 mouths-0.075
 eyes-0.067
 noses-0.066
 feet-0.065
 ears-0.063
Token hugged
Token ID41130
Feature activation-3.49e-02

Pos logit contributions
des+0.054
xy+0.053
lad+0.053
 snug+0.050
 secure+0.049
Neg logit contributions
 mouths-0.052
 noses-0.048
 spr-0.047
 groups-0.046
 eyes-0.046
Token his
Token ID465
Feature activation-3.27e-02

Pos logit contributions
xy+0.037
lad+0.037
des+0.036
sy+0.035
tted+0.035
Neg logit contributions
 mouths-0.070
 noses-0.068
 heights-0.067
 eyes-0.066
 tongues-0.066
Token daddy
Token ID35822
Feature activation-7.88e-02

Pos logit contributions
des+0.066
xy+0.065
 secure+0.063
lad+0.063
than+0.062
Neg logit contributions
 mouths-0.037
 noses-0.033
 groups-0.032
 eyes-0.032
 tongues-0.031
Token.
Token ID13
Feature activation-3.33e-02

Pos logit contributions
xy+0.110
lad+0.102
Good+0.097
Great+0.096
tted+0.095
Neg logit contributions
 mouths-0.136
 noses-0.131
 eyes-0.127
 ears-0.125
 tongues-0.122
Token Daddy
Token ID26926
Feature activation-3.35e-02

Pos logit contributions
 secure+0.044
sy+0.044
des+0.044
ve+0.044
xy+0.042
Neg logit contributions
 mouths-0.060
 eyes-0.056
 noses-0.054
 tongues-0.052
 anth-0.051
Token was
Token ID373
Feature activation-2.63e-02

Pos logit contributions
des+0.051
xy+0.048
 snug+0.046
 secure+0.046
lad+0.046
Neg logit contributions
 mouths-0.050
 spr-0.048
els-0.048
 noses-0.047
 groups-0.047
Token happy
Token ID3772
Feature activation-2.20e-02

Pos logit contributions
xy+0.045
des+0.038
sy+0.037
lad+0.037
da+0.037
Neg logit contributions
 groups-0.037
 salon-0.036
 mouths-0.036
 heights-0.035
 eyes-0.035
Token too
Token ID1165
Feature activation-4.33e-02

Pos logit contributions
xy+0.036
sy+0.036
tted+0.036
des+0.036
 secure+0.035
Neg logit contributions
 mouths-0.030
 heights-0.030
 eyes-0.028
 noses-0.028
 appearance-0.027
Token He
Token ID679
Feature activation-3.14e-02

Pos logit contributions
des+0.016
 secure+0.015
xy+0.015
sy+0.015
 snug+0.014
Neg logit contributions
 mouths-0.024
 eyes-0.022
 heights-0.022
 noses-0.022
 frogs-0.022
Token was
Token ID373
Feature activation-6.11e-03

Pos logit contributions
des+0.033
xy+0.032
lad+0.032
 snug+0.031
 secure+0.031
Neg logit contributions
 mouths-0.062
 noses-0.059
 heights-0.059
 tongues-0.058
 appearance-0.057
Token ready
Token ID3492
Feature activation+2.77e-03

Pos logit contributions
xy+0.008
des+0.008
lad+0.008
than+0.007
Good+0.007
Neg logit contributions
 mouths-0.010
 heights-0.010
 noses-0.010
 groups-0.010
 tongues-0.010
Token to
Token ID284
Feature activation-1.26e-02

Pos logit contributions
 mouths+0.004
 heights+0.004
 noses+0.004
 appearance+0.004
 eyes+0.004
Neg logit contributions
xy-0.004
des-0.004
than-0.004
lad-0.004
 snug-0.004
Token share
Token ID2648
Feature activation-3.21e-02

Pos logit contributions
 secure+0.016
des+0.015
 snug+0.015
xy+0.015
lad+0.015
Neg logit contributions
 mouths-0.022
 noses-0.021
 heights-0.021
 tongues-0.021
els-0.020
Token all
Token ID477
Feature activation-7.60e-02

Pos logit contributions
des+0.032
 snug+0.032
xy+0.032
 secure+0.032
lad+0.031
Neg logit contributions
 mouths-0.064
 heights-0.062
 noses-0.061
 tongues-0.060
 appearance-0.060
Token the
Token ID262
Feature activation-3.56e-02

Pos logit contributions
ve+0.131
tted+0.129
xy+0.128
sy+0.122
 secure+0.120
Neg logit contributions
 heights-0.098
 frogs-0.096
 mouths-0.096
 crabs-0.095
 groups-0.091
Token different
Token ID1180
Feature activation-1.63e-02

Pos logit contributions
xy+0.056
eddy+0.053
tted+0.053
than+0.052
 secure+0.051
Neg logit contributions
 heights-0.052
 groups-0.049
 mouths-0.049
 noses-0.049
 frogs-0.048
Token things
Token ID1243
Feature activation-5.25e-02

Pos logit contributions
des+0.021
 snug+0.021
xy+0.021
 secure+0.020
lad+0.020
Neg logit contributions
 mouths-0.028
 heights-0.027
 noses-0.027
 tongues-0.026
 appearance-0.026
Token he
Token ID339
Feature activation-2.84e-02

Pos logit contributions
da+0.085
des+0.083
Great+0.083
lad+0.083
Good+0.082
Neg logit contributions
 heights-0.069
 crabs-0.066
 mouths-0.065
 noses-0.060
 groups-0.060
Token had
Token ID550
Feature activation+2.61e-03

Pos logit contributions
eddy+0.037
 snug+0.036
xy+0.035
des+0.035
 secure+0.035
Neg logit contributions
 mouths-0.046
 heights-0.045
 appearance-0.044
 beetles-0.044
 crabs-0.043
Token his
Token ID465
Feature activation-4.03e-03

Pos logit contributions
des+0.004
 secure+0.004
 snug+0.004
lad+0.004
tted+0.004
Neg logit contributions
 mouths-0.005
 heights-0.005
 noses-0.005
els-0.005
 groups-0.005
Token puzzle
Token ID15027
Feature activation-2.48e-02

Pos logit contributions
des+0.005
than+0.005
xy+0.005
lad+0.005
da+0.005
Neg logit contributions
 mouths-0.007
 heights-0.007
 groups-0.006
 noses-0.006
 tongues-0.006
Token and
Token ID290
Feature activation-2.11e-02

Pos logit contributions
des+0.041
lad+0.040
xy+0.040
da+0.040
than+0.040
Neg logit contributions
 mouths-0.034
 heights-0.033
 groups-0.032
 noses-0.031
 eyes-0.030
Token thanked
Token ID26280
Feature activation-1.81e-02

Pos logit contributions
des+0.035
lad+0.033
xy+0.032
year+0.031
great+0.031
Neg logit contributions
 mouths-0.033
 frogs-0.030
 noses-0.030
 spr-0.029
 crabs-0.028
Token Wh
Token ID854
Feature activation-9.19e-03

Pos logit contributions
des+0.026
lad+0.026
 snug+0.026
 secure+0.025
xy+0.025
Neg logit contributions
 mouths-0.029
 heights-0.028
 noses-0.027
 tongues-0.026
 frogs-0.026
Tokenisk
Token ID1984
Feature activation-7.53e-02

Pos logit contributions
 snug+0.015
than+0.015
 secure+0.015
des+0.015
year+0.013
Neg logit contributions
els-0.012
 noses-0.012
 mouths-0.012
 appearance-0.012
 heights-0.012
Tokeners
Token ID364
Feature activation-1.60e-02

Pos logit contributions
 snug+0.151
 secure+0.146
 brand+0.144
des+0.139
 pleasure+0.138
Neg logit contributions
 bees-0.073
 frogs-0.072
 snakes-0.071
els-0.067
 beetles-0.066
Token for
Token ID329
Feature activation-1.85e-02

Pos logit contributions
des+0.016
xy+0.016
 snug+0.016
 secure+0.016
lad+0.015
Neg logit contributions
 mouths-0.032
 heights-0.031
 noses-0.031
 groups-0.030
 tongues-0.030
Token helping
Token ID5742
Feature activation-9.60e-03

Pos logit contributions
des+0.032
da+0.030
xy+0.029
than+0.029
eddy+0.029
Neg logit contributions
 frogs-0.025
 groups-0.023
 mouths-0.023
 heights-0.023
 crabs-0.023
Token him
Token ID683
Feature activation-7.55e-03

Pos logit contributions
des+0.010
 snug+0.010
lad+0.010
xy+0.010
 secure+0.009
Neg logit contributions
 mouths-0.019
 heights-0.019
 noses-0.018
 tongues-0.018
 appearance-0.018
Token.
Token ID13
Feature activation-1.41e-02

Pos logit contributions
xy+0.011
des+0.011
lad+0.011
 snug+0.011
than+0.011
Neg logit contributions
 mouths-0.012
 heights-0.011
 noses-0.011
 tongues-0.011
 appearance-0.011
Token He
Token ID679
Feature activation-3.43e-02

Pos logit contributions
 secure+0.034
 snug+0.033
sy+0.032
des+0.032
than+0.031
Neg logit contributions
 mouths-0.039
 noses-0.035
okes-0.035
 eyes-0.034
 tongues-0.034
Token said
Token ID531
Feature activation+1.63e-02

Pos logit contributions
des+0.046
 snug+0.044
lad+0.044
xy+0.044
 secure+0.044
Neg logit contributions
 mouths-0.059
 noses-0.055
 heights-0.054
 tongues-0.054
 frogs-0.053
Token "
Token ID366
Feature activation-5.22e-02

Pos logit contributions
 mouths+0.025
 heights+0.024
 noses+0.024
 tongues+0.023
 appearance+0.023
Neg logit contributions
des-0.024
 snug-0.024
xy-0.024
lad-0.024
 secure-0.024
Tokenthank
Token ID40716
Feature activation-1.19e-02

Pos logit contributions
 snug+0.063
 secure+0.062
 valuable+0.058
vern+0.057
xy+0.056
Neg logit contributions
 heights-0.088
 noses-0.084
 mouths-0.084
 crabs-0.083
 hoops-0.083
Token you
Token ID345
Feature activation-1.29e-02

Pos logit contributions
 secure+0.015
 snug+0.015
xy+0.015
des+0.015
sy+0.015
Neg logit contributions
 heights-0.020
 mouths-0.020
 noses-0.019
 tongues-0.019
 appearance-0.019
Token"
Token ID1
Feature activation-7.42e-02

Pos logit contributions
des+0.019
 secure+0.018
 snug+0.018
xy+0.018
than+0.018
Neg logit contributions
 mouths-0.020
 noses-0.019
 heights-0.019
 groups-0.019
 appearance-0.019
Token to
Token ID284
Feature activation-2.56e-02

Pos logit contributions
 snug+0.132
 secure+0.132
des+0.128
lad+0.127
xy+0.122
Neg logit contributions
 mouths-0.080
 heights-0.079
okes-0.076
 noses-0.075
 hoops-0.075
Token Tom
Token ID4186
Feature activation-7.57e-03

Pos logit contributions
des+0.039
sy+0.039
 snug+0.038
xy+0.038
 secure+0.038
Neg logit contributions
 mouths-0.039
 heights-0.038
 noses-0.037
 appearance-0.036
 frogs-0.036
Token and
Token ID290
Feature activation-6.42e-03

Pos logit contributions
 snug+0.010
des+0.010
lad+0.010
 secure+0.010
xy+0.010
Neg logit contributions
 mouths-0.013
 heights-0.013
 noses-0.012
 tongues-0.012
 appearance-0.012
Token ran
Token ID4966
Feature activation-1.41e-02

Pos logit contributions
des+0.009
xy+0.009
sy+0.009
ve+0.009
lad+0.009
Neg logit contributions
 mouths-0.010
 noses-0.010
 frogs-0.010
 heights-0.010
 crabs-0.009
Token home
Token ID1363
Feature activation-2.19e-02

Pos logit contributions
lad+0.022
xy+0.022
 secure+0.021
des+0.021
 snug+0.020
Neg logit contributions
 mouths-0.021
 heights-0.020
 noses-0.020
 hoops-0.020
 groups-0.020
Token and
Token ID290
Feature activation-3.31e-02

Pos logit contributions
des+0.008
 snug+0.007
xy+0.007
 secure+0.007
lad+0.007
Neg logit contributions
 mouths-0.010
 heights-0.010
 noses-0.009
 tongues-0.009
 appearance-0.009
Token said
Token ID531
Feature activation+1.11e-02

Pos logit contributions
lad+0.053
xy+0.053
tted+0.051
des+0.050
year+0.049
Neg logit contributions
 mouths-0.053
 noses-0.048
 tongues-0.046
 feet-0.045
 eyes-0.045
Token "
Token ID366
Feature activation-5.89e-02

Pos logit contributions
 mouths+0.014
 noses+0.012
 heights+0.012
 tongues+0.012
 appearance+0.012
Neg logit contributions
des-0.020
xy-0.020
 secure-0.020
 snug-0.020
sy-0.019
Tokenthank
Token ID40716
Feature activation-1.20e-02

Pos logit contributions
vern+0.078
 secure+0.076
xy+0.074
 snug+0.072
lad+0.072
Neg logit contributions
 tongues-0.094
 noses-0.094
 mouths-0.093
 heights-0.087
 anth-0.086
Token you
Token ID345
Feature activation-1.49e-02

Pos logit contributions
 secure+0.017
 snug+0.017
xy+0.017
lad+0.017
des+0.017
Neg logit contributions
 heights-0.019
 mouths-0.018
 noses-0.018
 tongues-0.017
 frogs-0.017
Token"
Token ID1
Feature activation-7.38e-02

Pos logit contributions
des+0.024
 secure+0.023
xy+0.023
 snug+0.022
than+0.022
Neg logit contributions
 mouths-0.022
 noses-0.021
 groups-0.020
 appearance-0.020
 tongues-0.020
Token to
Token ID284
Feature activation-2.81e-02

Pos logit contributions
 secure+0.136
des+0.136
 snug+0.134
lad+0.134
 Good+0.128
Neg logit contributions
 mouths-0.072
 noses-0.070
 eyes-0.067
 heights-0.066
 salon-0.065
Token his
Token ID465
Feature activation-9.41e-03

Pos logit contributions
des+0.030
xy+0.029
sy+0.028
 snug+0.028
lad+0.028
Neg logit contributions
 mouths-0.057
 heights-0.055
 noses-0.054
 tongues-0.053
 appearance-0.053
Token mom
Token ID1995
Feature activation-2.04e-02

Pos logit contributions
 snug+0.014
 secure+0.014
 repaired+0.014
eddy+0.014
 valuable+0.014
Neg logit contributions
 heights-0.014
 mouths-0.013
 appearance-0.013
els-0.013
 tongues-0.013
Token.
Token ID13
Feature activation-4.30e-03

Pos logit contributions
des+0.028
 snug+0.027
xy+0.027
 secure+0.027
lad+0.027
Neg logit contributions
 mouths-0.034
 heights-0.033
 noses-0.032
 tongues-0.031
 appearance-0.031
Token
Token ID220
Feature activation-2.20e-02

Pos logit contributions
xy+0.004
des+0.004
 snug+0.004
lad+0.004
 secure+0.004
Neg logit contributions
 mouths-0.009
 heights-0.008
 noses-0.008
 tongues-0.008
 appearance-0.008
Token thought
Token ID1807
Feature activation-1.46e-02

Pos logit contributions
 snug+0.015
des+0.014
 secure+0.014
xy+0.014
lad+0.014
Neg logit contributions
 mouths-0.033
 heights-0.032
 noses-0.031
 tongues-0.031
 appearance-0.031
Token it
Token ID340
Feature activation-5.29e-02

Pos logit contributions
sy+0.021
xy+0.021
des+0.021
eddy+0.021
Great+0.020
Neg logit contributions
 mouths-0.023
 eyes-0.022
 heights-0.022
 noses-0.022
 appearance-0.021
Token was
Token ID373
Feature activation-3.53e-02

Pos logit contributions
des+0.083
than+0.074
lad+0.073
eddy+0.073
year+0.073
Neg logit contributions
 mouths-0.080
 noses-0.078
 appearance-0.073
 eyes-0.073
 heights-0.073
Token the
Token ID262
Feature activation-3.21e-02

Pos logit contributions
des+0.049
xy+0.046
than+0.045
lad+0.045
tted+0.044
Neg logit contributions
 mouths-0.062
 noses-0.060
 heights-0.059
 eyes-0.058
 appearance-0.058
Token best
Token ID1266
Feature activation-4.22e-02

Pos logit contributions
xy+0.034
des+0.034
 snug+0.033
lad+0.033
 secure+0.033
Neg logit contributions
 mouths-0.062
 heights-0.061
 noses-0.060
 tongues-0.059
 appearance-0.058
Token vest
Token ID19750
Feature activation-7.26e-02

Pos logit contributions
xy+0.075
sy+0.070
year+0.069
ve+0.066
tted+0.066
Neg logit contributions
 eyes-0.063
 appearance-0.060
 heights-0.059
 mouths-0.057
 hoops-0.056
Token ever
Token ID1683
Feature activation-4.65e-02

Pos logit contributions
lad+0.094
des+0.089
 Good+0.089
Good+0.087
Great+0.086
Neg logit contributions
 mouths-0.115
 salon-0.109
 feet-0.107
 noses-0.105
 tongues-0.105
Token!
Token ID0
Feature activation-3.24e-02

Pos logit contributions
tted+0.090
Great+0.089
tto+0.088
than+0.087
Thanks+0.086
Neg logit contributions
 appearance-0.045
 mouths-0.044
 heights-0.044
 noses-0.043
 crabs-0.042
Token He
Token ID679
Feature activation-2.99e-02

Pos logit contributions
 secure+0.037
 snug+0.037
des+0.036
sy+0.035
xy+0.035
Neg logit contributions
 mouths-0.062
 heights-0.059
 noses-0.058
 groups-0.057
 tongues-0.057
Token put
Token ID1234
Feature activation-3.82e-02

Pos logit contributions
xy+0.034
des+0.034
lad+0.033
 snug+0.033
 secure+0.033
Neg logit contributions
 mouths-0.056
 noses-0.053
 heights-0.053
 tongues-0.052
 groups-0.051
Token it
Token ID340
Feature activation-4.41e-02

Pos logit contributions
des+0.075
lad+0.069
xy+0.068
ager+0.067
sy+0.066
Neg logit contributions
 eyes-0.049
 mouths-0.048
els-0.048
 hoops-0.046
 feet-0.044
Token boy
Token ID2933
Feature activation-8.34e-03

Pos logit contributions
des+0.010
lad+0.010
xy+0.010
 secure+0.010
 snug+0.010
Neg logit contributions
 mouths-0.012
 heights-0.011
 noses-0.011
 tongues-0.011
 groups-0.011
Token called
Token ID1444
Feature activation-1.21e-02

Pos logit contributions
xy+0.012
 snug+0.012
des+0.012
sy+0.012
eddy+0.012
Neg logit contributions
 heights-0.013
 mouths-0.013
 appearance-0.012
 noses-0.012
 tongues-0.012
Token Jon
Token ID5966
Feature activation-7.13e-02

Pos logit contributions
 secure+0.016
des+0.016
lad+0.015
 snug+0.015
xy+0.015
Neg logit contributions
 mouths-0.021
 noses-0.020
 heights-0.020
 tongues-0.019
 groups-0.019
Tokenny
Token ID3281
Feature activation-1.27e-02

Pos logit contributions
 secure+0.118
 valuable+0.117
vern+0.115
igsaw+0.115
 memorable+0.114
Neg logit contributions
 noses-0.071
 dogs-0.069
 pigs-0.065
 necks-0.064
 goats-0.064
Token.
Token ID13
Feature activation-1.67e-02

Pos logit contributions
xy+0.017
 snug+0.017
des+0.017
 secure+0.016
sy+0.016
Neg logit contributions
 mouths-0.021
 heights-0.020
 noses-0.020
 tongues-0.020
 appearance-0.019
Token Jon
Token ID5966
Feature activation-7.26e-02

Pos logit contributions
ager+0.022
year+0.022
lad+0.022
sy+0.022
ve+0.022
Neg logit contributions
 heights-0.028
 mouths-0.028
 noses-0.027
 eyes-0.026
 pigs-0.026
Tokenny
Token ID3281
Feature activation+1.31e-05

Pos logit contributions
vern+0.128
 accomplishment+0.122
year+0.120
 valuable+0.118
 secure+0.117
Neg logit contributions
 dogs-0.066
 noses-0.064
 necks-0.063
ric-0.061
 asks-0.061
Token had
Token ID550
Feature activation+1.10e-03

Pos logit contributions
 mouths+0.000
 noses+0.000
 tongues+0.000
 heights+0.000
 appearance+0.000
Neg logit contributions
des-0.000
lad-0.000
 secure-0.000
xy-0.000
 snug-0.000
Token two
Token ID734
Feature activation+2.31e-02

Pos logit contributions
 mouths+0.002
 noses+0.002
 heights+0.002
 frogs+0.002
 tongues+0.002
Neg logit contributions
lad-0.001
xy-0.001
des-0.001
 secure-0.001
than-0.001
Token parents
Token ID3397
Feature activation-4.80e-03

Pos logit contributions
 heights+0.039
uders+0.037
 conclusions+0.036
 appearance+0.036
 salon+0.035
Neg logit contributions
sy-0.030
 snug-0.030
 valuable-0.030
 wonderful-0.029
ve-0.028
Token who
Token ID508
Feature activation-1.24e-02

Pos logit contributions
des+0.008
 snug+0.007
xy+0.007
lad+0.007
 secure+0.007
Neg logit contributions
 mouths-0.007
 heights-0.006
 noses-0.006
 tongues-0.006
 appearance-0.006
Token and
Token ID290
Feature activation-3.10e-02

Pos logit contributions
 snug+0.006
des+0.006
 secure+0.006
lad+0.005
xy+0.005
Neg logit contributions
 mouths-0.008
 noses-0.008
 heights-0.008
 tongues-0.008
els-0.008
Token said
Token ID531
Feature activation+1.24e-02

Pos logit contributions
xy+0.052
lad+0.051
rob+0.048
des+0.048
eddy+0.047
Neg logit contributions
 mouths-0.048
 spr-0.043
 eyes-0.043
 noses-0.043
 frogs-0.041
Token "
Token ID366
Feature activation-4.99e-02

Pos logit contributions
 mouths+0.017
 heights+0.016
 noses+0.016
 tongues+0.015
 appearance+0.015
Neg logit contributions
des-0.020
xy-0.020
 snug-0.020
 secure-0.020
lad-0.020
Tokenthank
Token ID40716
Feature activation-2.11e-03

Pos logit contributions
xy+0.058
 valuable+0.057
vern+0.057
 secure+0.056
 snug+0.056
Neg logit contributions
 heights-0.084
 mouths-0.081
 noses-0.081
 tongues-0.080
 appearance-0.078
Token you
Token ID345
Feature activation-1.13e-02

Pos logit contributions
 secure+0.003
xy+0.003
 snug+0.003
des+0.003
lad+0.003
Neg logit contributions
 mouths-0.004
 heights-0.004
 noses-0.003
 tongues-0.003
 appearance-0.003
Token"
Token ID1
Feature activation-7.23e-02

Pos logit contributions
des+0.018
xy+0.017
 secure+0.017
 snug+0.017
than+0.017
Neg logit contributions
 mouths-0.017
 noses-0.016
 groups-0.015
 heights-0.015
 appearance-0.015
Token to
Token ID284
Feature activation-1.69e-02

Pos logit contributions
 secure+0.131
 snug+0.129
lad+0.128
des+0.128
xy+0.127
Neg logit contributions
 mouths-0.076
 heights-0.072
 noses-0.071
 salon-0.070
 eyes-0.069
Token her
Token ID607
Feature activation-5.68e-03

Pos logit contributions
des+0.026
xy+0.026
sy+0.025
than+0.025
 snug+0.025
Neg logit contributions
 mouths-0.026
 heights-0.024
 noses-0.024
els-0.023
 tongues-0.023
Token mom
Token ID1995
Feature activation-1.52e-02

Pos logit contributions
 snug+0.009
 secure+0.009
 valuable+0.009
 fixed+0.009
 safe+0.009
Neg logit contributions
 heights-0.009
 mouths-0.008
 tongues-0.008
 appearance-0.008
 noses-0.008
Token.
Token ID13
Feature activation-1.26e-03

Pos logit contributions
des+0.021
 snug+0.021
xy+0.021
 secure+0.021
lad+0.021
Neg logit contributions
 mouths-0.025
 heights-0.024
 noses-0.023
 tongues-0.023
 appearance-0.023
Token She
Token ID1375
Feature activation-1.53e-02

Pos logit contributions
xy+0.002
 snug+0.002
lad+0.002
des+0.002
 secure+0.002
Neg logit contributions
 mouths-0.002
 heights-0.002
 noses-0.002
 appearance-0.002
 tongues-0.002
Token the
Token ID262
Feature activation+7.04e-04

Pos logit contributions
des+0.032
xy+0.032
lad+0.032
 snug+0.031
 secure+0.031
Neg logit contributions
 mouths-0.037
 heights-0.035
 noses-0.035
 tongues-0.034
 groups-0.034
Token cardboard
Token ID26819
Feature activation+1.48e-03

Pos logit contributions
 heights+0.001
 mouths+0.001
okes+0.001
 noses+0.001
 tongues+0.001
Neg logit contributions
 snug-0.001
 repaired-0.001
 secure-0.001
 valuable-0.001
 safe-0.001
Token house
Token ID2156
Feature activation-1.99e-02

Pos logit contributions
 heights+0.003
 mouths+0.003
 appearance+0.003
 tongues+0.003
 noses+0.003
Neg logit contributions
xy-0.001
 snug-0.001
 secure-0.001
sy-0.001
 repaired-0.001
Token and
Token ID290
Feature activation-3.05e-03

Pos logit contributions
des+0.034
lad+0.034
than+0.033
Good+0.033
xy+0.033
Neg logit contributions
 mouths-0.027
 noses-0.025
 heights-0.025
 groups-0.024
 tongues-0.024
Token played
Token ID2826
Feature activation-1.31e-02

Pos logit contributions
lad+0.004
des+0.004
xy+0.004
 snug+0.004
sy+0.004
Neg logit contributions
 mouths-0.005
 noses-0.005
 heights-0.005
 tongues-0.005
 groups-0.005
Token all
Token ID477
Feature activation-7.22e-02

Pos logit contributions
lad+0.019
xy+0.019
des+0.019
than+0.019
tto+0.018
Neg logit contributions
 groups-0.020
 mouths-0.020
 heights-0.020
 frogs-0.019
 hoops-0.019
Token day
Token ID1110
Feature activation-1.42e-02

Pos logit contributions
ve+0.137
vern+0.135
cher+0.135
xy+0.129
des+0.128
Neg logit contributions
 hoops-0.068
 frogs-0.066
 houses-0.065
 groups-0.065
 mouths-0.063
Token long
Token ID890
Feature activation+6.31e-03

Pos logit contributions
lad+0.022
des+0.021
Good+0.021
xy+0.020
 Good+0.020
Neg logit contributions
 mouths-0.021
 noses-0.020
 heights-0.020
 groups-0.020
 tongues-0.019
Token.
Token ID13
Feature activation-1.42e-02

Pos logit contributions
 mouths+0.012
 appearance+0.011
 heights+0.011
 noses+0.011
 tongues+0.011
Neg logit contributions
 snug-0.007
xy-0.007
 secure-0.007
des-0.007
 repaired-0.007
Token
Token ID198
Feature activation-1.47e-02

Pos logit contributions
des+0.018
 secure+0.018
 snug+0.018
than+0.018
xy+0.017
Neg logit contributions
 mouths-0.025
 noses-0.023
 heights-0.023
 tongues-0.023
 hoops-0.022
Token
Token ID198
Feature activation-9.17e-03

Pos logit contributions
des+0.023
 secure+0.023
 snug+0.023
lad+0.023
than+0.022
Neg logit contributions
 groups-0.018
 frogs-0.018
 mouths-0.018
 hoops-0.017
 heights-0.017
Token "
Token ID366
Feature activation-5.94e-02

Pos logit contributions
 mouths+0.019
 heights+0.018
 noses+0.018
 tongues+0.017
 appearance+0.017
Neg logit contributions
des-0.023
xy-0.023
 snug-0.023
 secure-0.023
lad-0.022
Tokenthank
Token ID40716
Feature activation-4.59e-03

Pos logit contributions
xy+0.060
 secure+0.059
 snug+0.058
vern+0.056
 valuable+0.056
Neg logit contributions
 mouths-0.108
 heights-0.108
 noses-0.107
 tongues-0.106
els-0.105
Token you
Token ID345
Feature activation-7.37e-03

Pos logit contributions
 secure+0.006
 snug+0.006
des+0.006
xy+0.006
lad+0.006
Neg logit contributions
 heights-0.008
 mouths-0.008
 noses-0.007
 appearance-0.007
 tongues-0.007
Token mom
Token ID1995
Feature activation-2.55e-02

Pos logit contributions
des+0.012
xy+0.012
 secure+0.011
 snug+0.011
than+0.011
Neg logit contributions
 mouths-0.010
 noses-0.010
 appearance-0.010
 groups-0.010
 heights-0.009
Tokenmy
Token ID1820
Feature activation-4.33e-03

Pos logit contributions
des+0.026
 snug+0.026
 secure+0.026
xy+0.026
than+0.025
Neg logit contributions
 mouths-0.050
 noses-0.048
 heights-0.048
els-0.047
 appearance-0.047
Token"
Token ID1
Feature activation-7.22e-02

Pos logit contributions
des+0.007
xy+0.007
 secure+0.007
 snug+0.006
than+0.006
Neg logit contributions
 mouths-0.006
 noses-0.006
 heights-0.006
 groups-0.006
 appearance-0.006
Token and
Token ID290
Feature activation-3.60e-04

Pos logit contributions
 secure+0.136
 snug+0.134
lad+0.133
des+0.133
xy+0.131
Neg logit contributions
 mouths-0.064
 eyes-0.063
 salon-0.063
 noses-0.058
 beetles-0.057
Token ate
Token ID15063
Feature activation-4.43e-03

Pos logit contributions
des+0.001
xy+0.001
sy+0.001
lad+0.001
tted+0.001
Neg logit contributions
 mouths-0.001
 noses-0.001
 tongues-0.001
 eyes-0.001
 heights-0.001
Token her
Token ID607
Feature activation-1.42e-02

Pos logit contributions
des+0.006
xy+0.006
lad+0.006
 snug+0.006
 secure+0.006
Neg logit contributions
 mouths-0.007
 heights-0.007
 noses-0.007
 tongues-0.007
 appearance-0.007
Token new
Token ID649
Feature activation-1.35e-02

Pos logit contributions
 snug+0.019
des+0.019
 secure+0.019
lad+0.018
 valuable+0.018
Neg logit contributions
 mouths-0.023
 heights-0.023
 noses-0.023
 tongues-0.022
 appearance-0.022
Token orange
Token ID10912
Feature activation+1.45e-03

Pos logit contributions
 snug+0.018
des+0.017
 secure+0.017
lad+0.017
 valuable+0.017
Neg logit contributions
 mouths-0.022
 heights-0.022
 noses-0.021
 tongues-0.021
 appearance-0.021
Token opened
Token ID4721
Feature activation-2.28e-04

Pos logit contributions
lad+0.012
xy+0.011
des+0.011
sy+0.011
 valuable+0.011
Neg logit contributions
 mouths-0.011
 noses-0.010
 heights-0.010
 spr-0.010
 crabs-0.010
Token it
Token ID340
Feature activation+3.60e-03

Pos logit contributions
des+0.000
lad+0.000
xy+0.000
than+0.000
 secure+0.000
Neg logit contributions
 heights-0.000
 mouths-0.000
els-0.000
 noses-0.000
 eyes-0.000
Token -
Token ID532
Feature activation-1.17e-03

Pos logit contributions
 mouths+0.005
 heights+0.005
 noses+0.005
 groups+0.005
 eyes+0.005
Neg logit contributions
des-0.006
than-0.006
lad-0.006
Good-0.006
xy-0.006
Token it
Token ID340
Feature activation-3.36e-02

Pos logit contributions
xy+0.002
des+0.002
lad+0.002
sy+0.001
than+0.001
Neg logit contributions
 heights-0.002
 mouths-0.002
 crabs-0.002
 frogs-0.002
 noses-0.002
Token was
Token ID373
Feature activation-2.03e-02

Pos logit contributions
des+0.044
lad+0.041
than+0.040
year+0.039
eddy+0.038
Neg logit contributions
 mouths-0.060
 spr-0.059
 noses-0.058
 heights-0.056
 frogs-0.056
Token filled
Token ID5901
Feature activation-7.22e-02

Pos logit contributions
des+0.022
xy+0.022
lad+0.022
than+0.021
da+0.020
Neg logit contributions
 mouths-0.039
 noses-0.038
 heights-0.038
 tongues-0.037
 groups-0.037
Token with
Token ID351
Feature activation-1.21e-02

Pos logit contributions
lad+0.142
des+0.132
 secure+0.131
 Good+0.130
da+0.129
Neg logit contributions
 crabs-0.070
 heights-0.069
 groups-0.069
els-0.064
 frogs-0.063
Token spark
Token ID9009
Feature activation-2.26e-02

Pos logit contributions
xy+0.015
lad+0.015
des+0.014
than+0.014
sy+0.014
Neg logit contributions
 mouths-0.022
 noses-0.022
 heights-0.021
 tongues-0.021
 groups-0.021
Tokenly
Token ID306
Feature activation-1.05e-02

Pos logit contributions
des+0.033
xy+0.032
lad+0.032
 snug+0.032
 secure+0.032
Neg logit contributions
 mouths-0.036
 noses-0.035
 heights-0.034
els-0.033
 tongues-0.033
Token things
Token ID1243
Feature activation-4.31e-02

Pos logit contributions
 snug+0.016
des+0.016
 secure+0.016
xy+0.016
lad+0.016
Neg logit contributions
 mouths-0.015
 heights-0.014
 noses-0.014
 tongues-0.014
 appearance-0.014
Token!
Token ID0
Feature activation-1.01e-02

Pos logit contributions
lad+0.072
than+0.070
des+0.070
year+0.070
Great+0.070
Neg logit contributions
 mouths-0.056
 heights-0.055
 noses-0.055
 spr-0.052
 crabs-0.051
Token so
Token ID523
Feature activation-9.43e-03

Pos logit contributions
 mouths+0.012
 heights+0.012
 noses+0.012
 groups+0.011
 tongues+0.011
Neg logit contributions
xy-0.009
des-0.009
lad-0.008
than-0.008
 snug-0.008
Token happy
Token ID3772
Feature activation-4.24e-03

Pos logit contributions
xy+0.017
rob+0.016
des+0.015
rev+0.014
eddy+0.014
Neg logit contributions
 groups-0.012
 feet-0.011
 mouths-0.011
 spr-0.011
 legs-0.011
Token and
Token ID290
Feature activation-3.98e-02

Pos logit contributions
des+0.006
 snug+0.006
xy+0.006
 secure+0.006
lad+0.006
Neg logit contributions
 mouths-0.007
 heights-0.007
 noses-0.007
 tongues-0.007
 appearance-0.007
Token gave
Token ID2921
Feature activation-3.70e-02

Pos logit contributions
xy+0.067
lad+0.067
des+0.064
year+0.064
eddy+0.062
Neg logit contributions
 mouths-0.058
 feet-0.054
 legs-0.052
 noses-0.052
 eyes-0.050
Token his
Token ID465
Feature activation-1.73e-02

Pos logit contributions
lad+0.065
xy+0.064
ve+0.063
tted+0.061
sy+0.061
Neg logit contributions
els-0.052
 mouths-0.052
 eyes-0.052
 noses-0.050
 heights-0.050
Token dad
Token ID9955
Feature activation-7.21e-02

Pos logit contributions
xy+0.036
sy+0.035
lad+0.035
des+0.035
 secure+0.035
Neg logit contributions
 mouths-0.018
 noses-0.016
 heights-0.016
 tongues-0.015
 groups-0.015
Token a
Token ID257
Feature activation-2.34e-02

Pos logit contributions
xy+0.106
lad+0.102
ve+0.096
 secure+0.095
than+0.093
Neg logit contributions
 mouths-0.117
 noses-0.114
 crabs-0.110
 groups-0.107
 frogs-0.107
Token big
Token ID1263
Feature activation-1.70e-02

Pos logit contributions
xy+0.035
than+0.032
lad+0.031
des+0.031
lett+0.030
Neg logit contributions
 mouths-0.037
 eyes-0.035
 spr-0.034
 groups-0.034
 hoops-0.034
Token hug
Token ID16225
Feature activation-2.28e-02

Pos logit contributions
xy+0.024
lad+0.023
des+0.023
than+0.023
 secure+0.023
Neg logit contributions
 mouths-0.027
 heights-0.026
 noses-0.026
 tongues-0.025
 groups-0.025
Token.
Token ID13
Feature activation-2.67e-02

Pos logit contributions
des+0.037
lad+0.036
xy+0.036
 snug+0.036
 secure+0.036
Neg logit contributions
 mouths-0.031
 noses-0.030
 heights-0.029
 tongues-0.028
 appearance-0.028
Token
Token ID220
Feature activation-1.88e-02

Pos logit contributions
des+0.045
 secure+0.045
sy+0.044
 snug+0.043
ve+0.042
Neg logit contributions
 mouths-0.039
 eyes-0.036
 noses-0.035
 tongues-0.033
 groups-0.032
Token happy
Token ID3772
Feature activation-3.35e-03

Pos logit contributions
xy+0.033
rob+0.031
rev+0.029
eddy+0.027
man+0.027
Neg logit contributions
 mouths-0.022
 groups-0.021
 spr-0.020
 frogs-0.019
 crabs-0.019
Token,
Token ID11
Feature activation-2.85e-02

Pos logit contributions
des+0.004
 snug+0.004
xy+0.004
 secure+0.004
lad+0.004
Neg logit contributions
 mouths-0.006
 heights-0.006
 noses-0.006
 tongues-0.005
 appearance-0.005
Token she
Token ID673
Feature activation-4.22e-02

Pos logit contributions
xy+0.049
sy+0.049
lad+0.047
tted+0.047
da+0.045
Neg logit contributions
 mouths-0.039
 eyes-0.036
 noses-0.036
 heights-0.035
 salon-0.035
Token hugged
Token ID41130
Feature activation-2.63e-02

Pos logit contributions
xy+0.069
des+0.066
lad+0.065
year+0.063
eddy+0.061
Neg logit contributions
 mouths-0.056
 spr-0.055
 noses-0.052
 eyes-0.052
 feet-0.050
Token her
Token ID607
Feature activation-3.69e-02

Pos logit contributions
xy+0.038
des+0.038
lad+0.038
 secure+0.037
 snug+0.037
Neg logit contributions
 mouths-0.042
 heights-0.040
 noses-0.040
 tongues-0.038
 appearance-0.038
Token daddy
Token ID35822
Feature activation-7.20e-02

Pos logit contributions
xy+0.078
tted+0.078
des+0.076
lad+0.071
da+0.071
Neg logit contributions
 eyes-0.041
 mouths-0.040
 noses-0.035
 ears-0.034
 fingers-0.033
Token and
Token ID290
Feature activation-2.81e-02

Pos logit contributions
xy+0.116
lad+0.104
tted+0.102
Ar+0.100
des+0.099
Neg logit contributions
 mouths-0.110
 eyes-0.107
 noses-0.106
 ears-0.104
 feet-0.099
Token the
Token ID262
Feature activation-2.57e-02

Pos logit contributions
tted+0.048
xy+0.048
lad+0.047
des+0.046
sy+0.045
Neg logit contributions
 mouths-0.043
 eyes-0.039
 noses-0.038
 spr-0.037
 tongues-0.035
Token t
Token ID256
Feature activation-3.34e-02

Pos logit contributions
xy+0.047
lad+0.044
des+0.044
year+0.043
da+0.042
Neg logit contributions
 mouths-0.035
 noses-0.034
 eyes-0.032
 appearance-0.032
 tongues-0.031
Tokeneddy
Token ID21874
Feature activation-4.11e-02

Pos logit contributions
 secure+0.034
Great+0.033
des+0.033
 valuable+0.033
tted+0.032
Neg logit contributions
els-0.067
 mouths-0.062
 eyes-0.062
 noses-0.060
 tongues-0.059
Token bear
Token ID6842
Feature activation-4.49e-02

Pos logit contributions
des+0.080
xy+0.079
tted+0.077
than+0.074
lad+0.074
Neg logit contributions
 mouths-0.050
 noses-0.047
 eyes-0.046
 ears-0.044
 pigs-0.042
Token happy
Token ID3772
Feature activation-1.01e-02

Pos logit contributions
xy+0.025
rob+0.024
da+0.022
rev+0.021
 Good+0.021
Neg logit contributions
 groups-0.020
 mouths-0.019
 spr-0.018
 eyes-0.018
 salon-0.018
Token and
Token ID290
Feature activation-5.53e-02

Pos logit contributions
des+0.013
 snug+0.013
xy+0.013
 secure+0.013
lad+0.013
Neg logit contributions
 mouths-0.017
 heights-0.017
 noses-0.016
 tongues-0.016
 appearance-0.016
Token he
Token ID339
Feature activation-4.43e-02

Pos logit contributions
xy+0.099
lad+0.092
ve+0.090
year+0.090
sy+0.090
Neg logit contributions
 mouths-0.080
 eyes-0.073
 feet-0.072
 ears-0.069
 legs-0.068
Token gave
Token ID2921
Feature activation-4.38e-02

Pos logit contributions
xy+0.076
des+0.075
lad+0.074
year+0.072
iest+0.070
Neg logit contributions
 mouths-0.055
 spr-0.052
 eyes-0.049
 noses-0.048
 ears-0.048
Token his
Token ID465
Feature activation-2.94e-02

Pos logit contributions
xy+0.073
lad+0.073
sy+0.071
ve+0.070
tted+0.069
Neg logit contributions
 mouths-0.066
 eyes-0.064
els-0.064
 crabs-0.062
 heights-0.061
Token dad
Token ID9955
Feature activation-7.18e-02

Pos logit contributions
xy+0.064
sy+0.063
lad+0.062
des+0.062
tted+0.061
Neg logit contributions
 mouths-0.028
 noses-0.024
 eyes-0.024
 crabs-0.023
 heights-0.023
Token a
Token ID257
Feature activation-3.99e-02

Pos logit contributions
xy+0.111
lad+0.101
ve+0.099
 secure+0.098
than+0.094
Neg logit contributions
 mouths-0.115
 noses-0.111
 crabs-0.108
 frogs-0.106
 eyes-0.105
Token big
Token ID1263
Feature activation-3.14e-02

Pos logit contributions
xy+0.064
lad+0.057
than+0.057
lett+0.056
des+0.055
Neg logit contributions
 mouths-0.058
 eyes-0.057
 spr-0.055
 groups-0.053
 salon-0.053
Token hug
Token ID16225
Feature activation-3.23e-02

Pos logit contributions
xy+0.050
sy+0.046
 secure+0.046
than+0.046
lad+0.046
Neg logit contributions
 mouths-0.047
 eyes-0.044
 noses-0.044
 heights-0.043
 spr-0.042
Token to
Token ID284
Feature activation-4.49e-02

Pos logit contributions
des+0.055
xy+0.054
lad+0.054
 snug+0.054
 secure+0.053
Neg logit contributions
 mouths-0.041
 noses-0.039
 heights-0.038
 appearance-0.037
 eyes-0.036
Token say
Token ID910
Feature activation-1.08e-02

Pos logit contributions
sy+0.064
xy+0.063
lad+0.061
des+0.060
tted+0.060
Neg logit contributions
 mouths-0.075
 noses-0.071
 appearance-0.071
 heights-0.070
 eyes-0.069
Token home
Token ID1363
Feature activation-2.40e-02

Pos logit contributions
 snug+0.006
 secure+0.006
des+0.006
xy+0.006
lad+0.006
Neg logit contributions
 mouths-0.008
 heights-0.008
 noses-0.008
 tongues-0.008
els-0.007
Token,
Token ID11
Feature activation-8.80e-03

Pos logit contributions
xy+0.038
des+0.038
than+0.037
lad+0.037
Good+0.036
Neg logit contributions
 mouths-0.034
 heights-0.033
 noses-0.032
 groups-0.032
 tongues-0.031
Token and
Token ID290
Feature activation-1.90e-02

Pos logit contributions
xy+0.013
des+0.012
lad+0.012
than+0.012
sy+0.012
Neg logit contributions
 mouths-0.014
 noses-0.013
 heights-0.013
 tongues-0.013
 groups-0.013
Token they
Token ID484
Feature activation-2.09e-02

Pos logit contributions
lad+0.029
des+0.028
xy+0.028
sy+0.027
 snug+0.027
Neg logit contributions
 mouths-0.030
 noses-0.028
 frogs-0.027
 heights-0.027
 tongues-0.027
Token played
Token ID2826
Feature activation-2.07e-02

Pos logit contributions
lad+0.034
des+0.034
xy+0.033
 snug+0.033
 secure+0.032
Neg logit contributions
 mouths-0.030
 noses-0.028
 tongues-0.027
 frogs-0.026
 groups-0.026
Token all
Token ID477
Feature activation-7.16e-02

Pos logit contributions
lad+0.032
than+0.031
tto+0.029
xy+0.029
da+0.029
Neg logit contributions
 frogs-0.033
 groups-0.031
 heights-0.031
 crabs-0.030
 mouths-0.030
Token day
Token ID1110
Feature activation-1.13e-02

Pos logit contributions
cher+0.127
ve+0.126
des+0.123
vern+0.123
rev+0.123
Neg logit contributions
 frogs-0.087
 hoops-0.078
 mouths-0.076
 groups-0.075
 noses-0.073
Token.
Token ID13
Feature activation-1.40e-02

Pos logit contributions
lad+0.016
des+0.015
 secure+0.014
 snug+0.014
xy+0.014
Neg logit contributions
 mouths-0.019
 noses-0.018
 heights-0.018
 groups-0.018
 tongues-0.017
Token From
Token ID3574
Feature activation-1.13e-02

Pos logit contributions
des+0.022
 secure+0.022
 snug+0.021
lad+0.021
than+0.021
Neg logit contributions
 mouths-0.021
 noses-0.019
 frogs-0.019
 groups-0.019
 hoops-0.018
Token that
Token ID326
Feature activation-2.25e-02

Pos logit contributions
des+0.017
lad+0.017
xy+0.016
 snug+0.016
 secure+0.016
Neg logit contributions
 mouths-0.017
 heights-0.017
 noses-0.016
 frogs-0.016
 crabs-0.016
Token day
Token ID1110
Feature activation-2.50e-03

Pos logit contributions
des+0.034
tted+0.033
 secure+0.032
xy+0.032
year+0.032
Neg logit contributions
 mouths-0.032
 appearance-0.030
 anth-0.030
 noses-0.030
 heights-0.030
Token was
Token ID373
Feature activation-3.84e-03

Pos logit contributions
des+0.034
lad+0.034
xy+0.033
 secure+0.033
 snug+0.032
Neg logit contributions
 mouths-0.055
 noses-0.052
 groups-0.050
 tongues-0.050
 eyes-0.050
Token very
Token ID845
Feature activation-1.68e-02

Pos logit contributions
xy+0.005
des+0.005
lad+0.005
than+0.004
 snug+0.004
Neg logit contributions
 mouths-0.007
 heights-0.007
 noses-0.007
 groups-0.006
 tongues-0.006
Token happy
Token ID3772
Feature activation-9.63e-03

Pos logit contributions
xy+0.026
rob+0.025
des+0.025
lad+0.025
da+0.023
Neg logit contributions
 mouths-0.026
 groups-0.026
 feet-0.024
 noses-0.024
 frogs-0.023
Token and
Token ID290
Feature activation-4.46e-02

Pos logit contributions
 snug+0.011
des+0.011
xy+0.011
 secure+0.011
lad+0.011
Neg logit contributions
 mouths-0.018
 heights-0.017
 noses-0.017
 tongues-0.017
 appearance-0.016
Token said
Token ID531
Feature activation+3.93e-03

Pos logit contributions
lad+0.084
xy+0.081
des+0.078
tted+0.077
vern+0.076
Neg logit contributions
 mouths-0.062
 eyes-0.055
 noses-0.055
 feet-0.054
 ears-0.052
Token "
Token ID366
Feature activation-7.14e-02

Pos logit contributions
 mouths+0.005
 noses+0.004
 eyes+0.004
 heights+0.004
 crabs+0.004
Neg logit contributions
xy-0.007
 secure-0.007
des-0.007
sy-0.007
 snug-0.007
TokenThank
Token ID10449
Feature activation-2.55e-02

Pos logit contributions
vern+0.108
 secure+0.101
 snug+0.098
xy+0.096
 valuable+0.096
Neg logit contributions
 noses-0.099
 tongues-0.096
 mouths-0.096
 tits-0.092
els-0.092
Token you
Token ID345
Feature activation-2.07e-02

Pos logit contributions
 secure+0.048
 snug+0.046
 valuable+0.045
sy+0.044
des+0.043
Neg logit contributions
 heights-0.029
 mouths-0.029
 crabs-0.028
els-0.028
 frogs-0.028
Token,
Token ID11
Feature activation-7.95e-03

Pos logit contributions
des+0.033
 secure+0.033
xy+0.033
 snug+0.033
lad+0.032
Neg logit contributions
 mouths-0.029
 noses-0.028
 heights-0.027
 tongues-0.027
 appearance-0.026
Token Mom
Token ID11254
Feature activation-3.54e-02

Pos logit contributions
des+0.014
xy+0.014
 snug+0.014
 secure+0.014
lad+0.014
Neg logit contributions
 mouths-0.010
 heights-0.009
 noses-0.009
 tongues-0.009
 appearance-0.008
Tokenmy
Token ID1820
Feature activation-2.71e-02

Pos logit contributions
des+0.044
 secure+0.043
tted+0.043
 snug+0.043
xy+0.042
Neg logit contributions
 mouths-0.063
 noses-0.061
 heights-0.060
 appearance-0.059
 groups-0.059
Token and
Token ID290
Feature activation-3.15e-02

Pos logit contributions
des+0.001
xy+0.001
 snug+0.001
lad+0.001
 secure+0.001
Neg logit contributions
 mouths-0.002
 heights-0.002
 noses-0.002
 tongues-0.002
 groups-0.002
Token said
Token ID531
Feature activation+2.07e-02

Pos logit contributions
xy+0.052
lad+0.051
des+0.050
tted+0.048
eddy+0.048
Neg logit contributions
 mouths-0.047
 noses-0.043
 eyes-0.042
 feet-0.041
 spr-0.041
Token "
Token ID366
Feature activation-5.63e-02

Pos logit contributions
 mouths+0.027
 heights+0.026
 noses+0.026
 tongues+0.025
 appearance+0.024
Neg logit contributions
des-0.035
xy-0.035
 secure-0.035
 snug-0.035
lad-0.034
Tokenthank
Token ID40716
Feature activation-6.23e-03

Pos logit contributions
 secure+0.066
xy+0.064
vern+0.063
 snug+0.063
lad+0.063
Neg logit contributions
 mouths-0.097
 noses-0.096
 tongues-0.096
 heights-0.092
els-0.092
Token you
Token ID345
Feature activation-9.51e-03

Pos logit contributions
des+0.008
 secure+0.008
 snug+0.008
xy+0.008
lad+0.008
Neg logit contributions
 mouths-0.011
 heights-0.011
 noses-0.010
 tongues-0.010
 appearance-0.010
Token"
Token ID1
Feature activation-7.14e-02

Pos logit contributions
des+0.014
 secure+0.014
xy+0.014
 snug+0.013
than+0.013
Neg logit contributions
 mouths-0.015
 noses-0.014
 groups-0.014
 appearance-0.013
 heights-0.013
Token to
Token ID284
Feature activation-1.94e-02

Pos logit contributions
 secure+0.129
des+0.129
lad+0.127
 snug+0.126
xy+0.124
Neg logit contributions
 mouths-0.073
 noses-0.070
 salon-0.069
 eyes-0.069
 heights-0.067
Token her
Token ID607
Feature activation-1.19e-02

Pos logit contributions
des+0.028
xy+0.028
 snug+0.027
sy+0.027
lad+0.027
Neg logit contributions
 mouths-0.031
 heights-0.029
 noses-0.029
 tongues-0.028
 appearance-0.028
Token mom
Token ID1995
Feature activation-2.53e-02

Pos logit contributions
 snug+0.021
 secure+0.020
 valuable+0.020
des+0.020
lad+0.019
Neg logit contributions
 heights-0.016
 mouths-0.015
 tongues-0.015
 noses-0.015
 appearance-0.014
Tokenmy
Token ID1820
Feature activation-6.30e-03

Pos logit contributions
des+0.023
xy+0.023
 snug+0.023
 secure+0.023
lad+0.022
Neg logit contributions
 mouths-0.053
 heights-0.051
 noses-0.051
 tongues-0.050
 appearance-0.049
Token.
Token ID13
Feature activation+5.63e-03

Pos logit contributions
 snug+0.008
 secure+0.008
des+0.008
sy+0.008
xy+0.008
Neg logit contributions
 heights-0.011
 mouths-0.011
 tongues-0.010
 noses-0.010
 appearance-0.010
Token was
Token ID373
Feature activation-7.36e-03

Pos logit contributions
des+0.003
 secure+0.003
 valuable+0.003
year+0.003
xy+0.003
Neg logit contributions
 mouths-0.005
 noses-0.004
 spr-0.004
 heights-0.004
 groups-0.004
Token a
Token ID257
Feature activation-7.92e-03

Pos logit contributions
lad+0.010
 secure+0.009
xy+0.009
than+0.009
des+0.009
Neg logit contributions
 frogs-0.012
 noses-0.012
 mouths-0.012
 hoops-0.012
 groups-0.012
Token little
Token ID1310
Feature activation-7.26e-03

Pos logit contributions
des+0.013
lad+0.013
xy+0.013
 secure+0.013
 snug+0.012
Neg logit contributions
 mouths-0.011
 heights-0.010
 noses-0.010
 groups-0.010
 frogs-0.010
Token boy
Token ID2933
Feature activation-8.34e-03

Pos logit contributions
des+0.010
lad+0.010
xy+0.010
 secure+0.010
 snug+0.010
Neg logit contributions
 mouths-0.012
 heights-0.011
 noses-0.011
 tongues-0.011
 groups-0.011
Token called
Token ID1444
Feature activation-1.21e-02

Pos logit contributions
xy+0.012
 snug+0.012
des+0.012
sy+0.012
eddy+0.012
Neg logit contributions
 heights-0.013
 mouths-0.013
 appearance-0.012
 noses-0.012
 tongues-0.012
Token Jon
Token ID5966
Feature activation-7.13e-02

Pos logit contributions
 secure+0.016
des+0.016
lad+0.015
 snug+0.015
xy+0.015
Neg logit contributions
 mouths-0.021
 noses-0.020
 heights-0.020
 tongues-0.019
 groups-0.019
Tokenny
Token ID3281
Feature activation-1.27e-02

Pos logit contributions
 secure+0.118
 valuable+0.117
vern+0.115
igsaw+0.115
 memorable+0.114
Neg logit contributions
 noses-0.071
 dogs-0.069
 pigs-0.065
 necks-0.064
 goats-0.064
Token.
Token ID13
Feature activation-1.67e-02

Pos logit contributions
xy+0.017
 snug+0.017
des+0.017
 secure+0.016
sy+0.016
Neg logit contributions
 mouths-0.021
 heights-0.020
 noses-0.020
 tongues-0.020
 appearance-0.019
Token Jon
Token ID5966
Feature activation-7.26e-02

Pos logit contributions
ager+0.022
year+0.022
lad+0.022
sy+0.022
ve+0.022
Neg logit contributions
 heights-0.028
 mouths-0.028
 noses-0.027
 eyes-0.026
 pigs-0.026
Tokenny
Token ID3281
Feature activation+1.31e-05

Pos logit contributions
vern+0.128
 accomplishment+0.122
year+0.120
 valuable+0.118
 secure+0.117
Neg logit contributions
 dogs-0.066
 noses-0.064
 necks-0.063
ric-0.061
 asks-0.061
Token had
Token ID550
Feature activation+1.10e-03

Pos logit contributions
 mouths+0.000
 noses+0.000
 tongues+0.000
 heights+0.000
 appearance+0.000
Neg logit contributions
des-0.000
lad-0.000
 secure-0.000
xy-0.000
 snug-0.000
Token.
Token ID13
Feature activation+2.30e-02

Pos logit contributions
 mouths+0.033
 appearance+0.032
 heights+0.032
 tongues+0.032
 anth+0.031
Neg logit contributions
xy-0.014
sy-0.013
 safe-0.013
 snug-0.013
 secure-0.013
Token They
Token ID1119
Feature activation+3.46e-02

Pos logit contributions
 heights+0.046
 appearance+0.046
els+0.045
 crabs+0.045
 elbows+0.045
Neg logit contributions
 Good-0.023
da-0.021
Great-0.020
Good-0.020
John-0.019
Token see
Token ID766
Feature activation+2.18e-02

Pos logit contributions
 mouths+0.072
 heights+0.069
 tongues+0.069
 noses+0.068
 elbows+0.068
Neg logit contributions
 secure-0.029
sy-0.029
 snug-0.029
 safe-0.029
 wonderful-0.028
Token cows
Token ID22575
Feature activation+3.31e-02

Pos logit contributions
 mouths+0.031
 salon+0.031
 appearance+0.030
 tongues+0.030
 heights+0.030
Neg logit contributions
 valuable-0.032
 snug-0.031
xy-0.030
 wonderful-0.030
 safe-0.030
Token,
Token ID11
Feature activation+3.20e-02

Pos logit contributions
 mouths+0.061
 heights+0.060
 appearance+0.059
 tongues+0.059
 conclusions+0.059
Neg logit contributions
 safe-0.035
xy-0.035
 snug-0.034
sy-0.034
 secure-0.032
Token pigs
Token ID22333
Feature activation+4.43e-02

Pos logit contributions
uders+0.036
 appearance+0.036
 salon+0.034
 heights+0.033
inches+0.033
Neg logit contributions
 safe-0.056
 snug-0.056
Good-0.056
sy-0.055
 wonderful-0.055
Token,
Token ID11
Feature activation+3.37e-02

Pos logit contributions
 mouths+0.086
 heights+0.085
 conclusions+0.084
 tongues+0.083
 appearance+0.082
Neg logit contributions
sy-0.046
xy-0.045
 safe-0.043
 snug-0.041
 fixed-0.040
Token chickens
Token ID25972
Feature activation+4.71e-02

Pos logit contributions
 appearance+0.037
uders+0.037
 mouths+0.035
 heights+0.035
 salon+0.035
Neg logit contributions
 snug-0.058
 safe-0.058
Good-0.058
sy-0.057
 fixed-0.056
Token,
Token ID11
Feature activation+3.21e-02

Pos logit contributions
 mouths+0.103
 heights+0.100
 appearance+0.100
 conclusions+0.100
 tongues+0.100
Neg logit contributions
sy-0.036
 safe-0.035
xy-0.035
man-0.031
 snug-0.030
Token and
Token ID290
Feature activation+3.15e-02

Pos logit contributions
 appearance+0.054
uders+0.054
 mouths+0.054
 anth+0.053
 salon+0.053
Neg logit contributions
 snug-0.035
 safe-0.035
Good-0.034
des-0.034
sy-0.034
Token corn
Token ID11676
Feature activation+2.58e-02

Pos logit contributions
els+0.031
uders+0.031
 mouths+0.030
 salon+0.030
okes+0.029
Neg logit contributions
des-0.060
 snug-0.060
 valuable-0.059
Good-0.059
 secure-0.058
Token.
Token ID13
Feature activation+1.17e-02

Pos logit contributions
 heights+0.071
 mouths+0.071
 anth+0.068
 noses+0.067
 tongues+0.067
Neg logit contributions
xy-0.040
 snug-0.039
des-0.038
eddy-0.038
 fixed-0.037
Token I
Token ID314
Feature activation+2.15e-02

Pos logit contributions
 mouths+0.025
 heights+0.025
 noses+0.024
 groups+0.024
 tongues+0.024
Neg logit contributions
 snug-0.009
des-0.009
xy-0.009
 secure-0.009
lad-0.009
Token don
Token ID836
Feature activation+5.77e-04

Pos logit contributions
 mouths+0.043
 heights+0.042
 noses+0.041
 tongues+0.041
 appearance+0.040
Neg logit contributions
des-0.021
xy-0.021
 snug-0.021
lad-0.021
 secure-0.021
Token't
Token ID470
Feature activation+1.59e-02

Pos logit contributions
 mouths+0.001
 heights+0.001
 noses+0.001
 tongues+0.001
 groups+0.001
Neg logit contributions
des-0.001
 snug-0.001
 secure-0.001
lad-0.001
xy-0.001
Token like
Token ID588
Feature activation+3.12e-02

Pos logit contributions
 mouths+0.032
 heights+0.030
 noses+0.029
els+0.029
 tongues+0.029
Neg logit contributions
 secure-0.017
des-0.017
 snug-0.017
 valuable-0.016
sy-0.016
Token hairy
Token ID43873
Feature activation+4.40e-02

Pos logit contributions
 groups+0.077
 mouths+0.075
 apartment+0.075
 darkened+0.074
ume+0.074
Neg logit contributions
 yours-0.010
 valuable-0.009
 it-0.009
 fast-0.009
 himself-0.009
Token things
Token ID1243
Feature activation+1.07e-02

Pos logit contributions
 salon+0.060
 mouths+0.055
 groups+0.054
ric+0.053
 apartment+0.052
Neg logit contributions
des-0.066
 himself-0.065
 ever-0.062
 snug-0.061
iest-0.061
Token in
Token ID287
Feature activation+6.47e-03

Pos logit contributions
 mouths+0.020
 heights+0.019
 noses+0.019
 tongues+0.018
 anth+0.018
Neg logit contributions
 snug-0.013
 secure-0.013
des-0.013
xy-0.013
 valuable-0.012
Token my
Token ID616
Feature activation+2.08e-02

Pos logit contributions
 mouths+0.014
 noses+0.013
 tongues+0.013
 heights+0.013
 salon+0.013
Neg logit contributions
 snug-0.005
 secure-0.005
xy-0.005
 valuable-0.005
 safe-0.005
Token hair
Token ID4190
Feature activation+3.29e-02

Pos logit contributions
els+0.030
 groups+0.028
ants+0.028
okes+0.028
 mouths+0.028
Neg logit contributions
 snug-0.031
 safe-0.031
 valuable-0.029
des-0.029
 secure-0.029
Token."
Token ID526
Feature activation-8.36e-03

Pos logit contributions
 mouths+0.064
 heights+0.064
 anth+0.061
 noses+0.061
 hoops+0.061
Neg logit contributions
xy-0.031
 snug-0.030
des-0.029
eddy-0.029
 secure-0.029
Token.
Token ID13
Feature activation-1.46e-02

Pos logit contributions
 mouths+0.008
 tongues+0.008
 noses+0.007
 appearance+0.007
 heights+0.007
Neg logit contributions
 snug-0.004
xy-0.004
 secure-0.004
des-0.004
 repaired-0.004
Token
Token ID198
Feature activation-1.00e-02

Pos logit contributions
des+0.017
 secure+0.017
xy+0.017
 snug+0.017
lad+0.017
Neg logit contributions
 mouths-0.027
 heights-0.026
 noses-0.026
 tongues-0.025
 appearance-0.025
Token
Token ID198
Feature activation+7.11e-03

Pos logit contributions
des+0.015
lad+0.015
 snug+0.015
 secure+0.015
than+0.015
Neg logit contributions
 mouths-0.014
 noses-0.014
 heights-0.014
 frogs-0.013
 groups-0.013
TokenBut
Token ID1537
Feature activation+1.57e-02

Pos logit contributions
 mouths+0.013
 heights+0.012
 noses+0.012
 tongues+0.012
 groups+0.012
Neg logit contributions
des-0.009
 snug-0.009
 secure-0.009
xy-0.009
lad-0.009
Token then
Token ID788
Feature activation+5.43e-03

Pos logit contributions
 groups+0.029
 heights+0.029
 mouths+0.029
 appearance+0.029
 tongues+0.028
Neg logit contributions
 snug-0.017
 Good-0.016
Good-0.016
 secure-0.016
xy-0.016
Token they
Token ID484
Feature activation+4.30e-02

Pos logit contributions
 heights+0.009
 mouths+0.009
 appearance+0.008
 tongues+0.008
 noses+0.008
Neg logit contributions
 snug-0.007
xy-0.007
than-0.007
des-0.007
da-0.007
Token hear
Token ID3285
Feature activation+1.81e-02

Pos logit contributions
 heights+0.067
els+0.065
 mouths+0.064
 noses+0.063
ants+0.063
Neg logit contributions
 safe-0.058
 repaired-0.058
 fixed-0.057
 snug-0.054
 secure-0.054
Token a
Token ID257
Feature activation+8.74e-03

Pos logit contributions
 heights+0.045
 salon+0.045
 appearance+0.044
 anth+0.044
 mouths+0.044
Neg logit contributions
 Good-0.007
 late-0.007
 snug-0.007
eddy-0.006
da-0.006
Token loud
Token ID7812
Feature activation+5.24e-04

Pos logit contributions
 mouths+0.015
 heights+0.015
 noses+0.015
els+0.015
 tongues+0.014
Neg logit contributions
 snug-0.011
 secure-0.010
xy-0.010
des-0.010
lad-0.010
Token noise
Token ID7838
Feature activation-1.17e-02

Pos logit contributions
 mouths+0.001
 heights+0.001
 noses+0.001
 tongues+0.001
 appearance+0.001
Neg logit contributions
des-0.001
 snug-0.001
 secure-0.001
xy-0.001
lad-0.001
Token.
Token ID13
Feature activation+8.01e-04

Pos logit contributions
xy+0.011
 snug+0.011
des+0.011
 secure+0.011
eddy+0.011
Neg logit contributions
 mouths-0.024
 tongues-0.023
 noses-0.022
 appearance-0.022
 heights-0.022
Token,
Token ID11
Feature activation+2.25e-02

Pos logit contributions
 appearance+0.096
 conclusions+0.094
 heights+0.091
 elbows+0.091
uders+0.090
Neg logit contributions
ve-0.050
 snug-0.048
 wonderful-0.046
 valuable-0.046
 secure-0.045
Token five
Token ID1936
Feature activation+3.88e-02

Pos logit contributions
 mouths+0.036
 tongues+0.034
 heights+0.034
 appearance+0.034
 noses+0.034
Neg logit contributions
 snug-0.031
xy-0.030
Good-0.030
des-0.030
lad-0.030
Token,
Token ID11
Feature activation+2.22e-02

Pos logit contributions
 appearance+0.072
 anth+0.069
 tongues+0.069
 mouths+0.069
 conclusions+0.068
Neg logit contributions
ve-0.038
 valuable-0.037
 snug-0.037
 wonderful-0.035
 useful-0.035
Token six
Token ID2237
Feature activation+5.21e-02

Pos logit contributions
 mouths+0.035
 tongues+0.033
 appearance+0.032
 heights+0.032
 noses+0.032
Neg logit contributions
 snug-0.031
Good-0.031
xy-0.030
lad-0.030
des-0.030
Token,
Token ID11
Feature activation+2.02e-02

Pos logit contributions
 appearance+0.100
 anth+0.098
 conclusions+0.098
ican+0.097
 mouths+0.097
Neg logit contributions
ve-0.041
 snug-0.040
 good-0.039
 hard-0.038
 much-0.037
Token seven
Token ID3598
Feature activation+6.03e-02

Pos logit contributions
 mouths+0.030
 heights+0.029
 noses+0.029
 tongues+0.028
 appearance+0.028
Neg logit contributions
 snug-0.030
des-0.029
xy-0.029
lad-0.029
 secure-0.029
Token,
Token ID11
Feature activation+2.84e-02

Pos logit contributions
ric+0.115
 conclusions+0.114
 appearance+0.113
uders+0.113
 anth+0.111
Neg logit contributions
 ever-0.047
 much-0.045
 good-0.045
 hard-0.045
 wonderful-0.043
Token eight
Token ID3624
Feature activation+6.63e-02

Pos logit contributions
 mouths+0.037
 appearance+0.035
 heights+0.035
 tongues+0.035
 noses+0.034
Neg logit contributions
Good-0.047
da-0.046
Great-0.046
 snug-0.045
xy-0.045
Token,
Token ID11
Feature activation+2.18e-02

Pos logit contributions
 appearance+0.124
uders+0.120
 mothers+0.119
 conclusions+0.119
ric+0.119
Neg logit contributions
 snug-0.052
ve-0.052
 hard-0.052
 good-0.051
 ever-0.051
Token nine
Token ID5193
Feature activation+4.78e-02

Pos logit contributions
 mouths+0.018
 heights+0.016
 noses+0.016
 tongues+0.015
 appearance+0.015
Neg logit contributions
des-0.048
xy-0.047
 snug-0.047
lad-0.047
 secure-0.047
Token,
Token ID11
Feature activation+2.29e-02

Pos logit contributions
 conclusions+0.097
 anth+0.095
 elbows+0.092
 crabs+0.090
 mosquitoes+0.090
Neg logit contributions
ve-0.037
 ever-0.036
lad-0.033
 fixed-0.032
 repaired-0.032
Token or
Token ID393
Feature activation+1.30e-02

Pos logit contributions
 mouths+0.047
 heights+0.045
 noses+0.044
 tongues+0.044
 anth+0.044
Neg logit contributions
xy-0.030
lad-0.028
 secure-0.028
 snug-0.028
des-0.028
Token frogs
Token ID37475
Feature activation+4.17e-02

Pos logit contributions
els+0.021
andel+0.020
okes+0.020
 salon+0.020
 heights+0.019
Neg logit contributions
 valuable-0.017
 snug-0.016
 safe-0.016
 wonderful-0.016
 secure-0.015
Token or
Token ID393
Feature activation+2.15e-02

Pos logit contributions
 mouths+0.082
 noses+0.078
 heights+0.077
 anth+0.077
 tongues+0.076
Neg logit contributions
 secure-0.043
 safe-0.043
des-0.042
xy-0.042
 snug-0.042
Token snakes
Token ID26042
Feature activation+4.00e-02

Pos logit contributions
els+0.031
 salon+0.029
okes+0.029
andel+0.028
 anth+0.028
Neg logit contributions
 valuable-0.030
 snug-0.030
 safe-0.030
 wonderful-0.030
 useful-0.029
Token.
Token ID13
Feature activation+1.71e-02

Pos logit contributions
 mouths+0.086
 anth+0.080
 noses+0.079
 tongues+0.079
 heights+0.078
Neg logit contributions
xy-0.034
 safe-0.033
 ever-0.033
 secure-0.033
 fixed-0.032
Token Some
Token ID2773
Feature activation+4.78e-02

Pos logit contributions
 heights+0.035
 mouths+0.035
els+0.034
 salon+0.034
 tongues+0.034
Neg logit contributions
 Good-0.015
xy-0.015
 snug-0.014
da-0.013
des-0.013
Token of
Token ID286
Feature activation+1.67e-02

Pos logit contributions
 salon+0.082
 anth+0.078
 heights+0.077
 appearance+0.074
els+0.074
Neg logit contributions
 valuable-0.055
 snug-0.055
 secure-0.053
 safe-0.053
 wonderful-0.052
Token them
Token ID606
Feature activation+3.03e-02

Pos logit contributions
 mouths+0.029
 salon+0.028
 anth+0.028
 tongues+0.027
 heights+0.027
Neg logit contributions
 snug-0.019
xy-0.019
 secure-0.019
 valuable-0.018
 himself-0.018
Token might
Token ID1244
Feature activation-8.43e-03

Pos logit contributions
 heights+0.057
 mouths+0.056
 anth+0.055
 hoops+0.054
 noses+0.053
Neg logit contributions
 snug-0.032
 secure-0.032
xy-0.031
 repaired-0.031
sy-0.030
Token be
Token ID307
Feature activation+1.25e-02

Pos logit contributions
 secure+0.010
 snug+0.010
 valuable+0.010
 safe+0.009
sy+0.009
Neg logit contributions
 mouths-0.016
 heights-0.015
els-0.015
 noses-0.014
 anth-0.014
Token friendly
Token ID8030
Feature activation+2.36e-02

Pos logit contributions
 salon+0.025
 heights+0.024
 mouths+0.024
 anth+0.023
 tongues+0.023
Neg logit contributions
 valuable-0.014
 snug-0.013
 secure-0.013
 safe-0.013
sy-0.013
Token go
Token ID467
Feature activation-8.19e-03

Pos logit contributions
 mouths+0.000
 heights+0.000
 noses+0.000
els+0.000
 tongues+0.000
Neg logit contributions
 secure-0.000
 snug-0.000
des-0.000
 repaired-0.000
 fixed-0.000
Token to
Token ID284
Feature activation+4.55e-03

Pos logit contributions
xy+0.010
des+0.010
lad+0.010
 snug+0.010
 secure+0.010
Neg logit contributions
 mouths-0.014
 heights-0.014
 noses-0.014
 tongues-0.013
 appearance-0.013
Token the
Token ID262
Feature activation+1.34e-02

Pos logit contributions
 mouths+0.009
 tongues+0.008
 noses+0.008
els+0.008
 heights+0.008
Neg logit contributions
 secure-0.005
 snug-0.005
des-0.005
 safe-0.005
xy-0.005
Token park
Token ID3952
Feature activation+2.39e-03

Pos logit contributions
 mouths+0.021
els+0.020
okes+0.020
 tongues+0.020
 noses+0.020
Neg logit contributions
 safe-0.020
 snug-0.020
 secure-0.019
 valuable-0.019
 Good-0.019
Token sooner
Token ID14556
Feature activation+3.35e-02

Pos logit contributions
 mouths+0.004
 heights+0.004
 noses+0.004
 tongues+0.004
 appearance+0.004
Neg logit contributions
des-0.003
 snug-0.003
xy-0.003
 secure-0.003
lad-0.003
Token."
Token ID526
Feature activation+2.26e-03

Pos logit contributions
 mouths+0.064
 tongues+0.062
els+0.060
 noses+0.060
 hoops+0.057
Neg logit contributions
 snug-0.036
 secure-0.035
 safe-0.034
des-0.033
 late-0.033
Token Lily
Token ID20037
Feature activation-3.05e-03

Pos logit contributions
 heights+0.005
 appearance+0.005
 mouths+0.005
els+0.005
 noses+0.005
Neg logit contributions
da-0.001
des-0.001
 snug-0.001
 Good-0.001
Great-0.001
Token thought
Token ID1807
Feature activation-1.97e-04

Pos logit contributions
des+0.003
 snug+0.003
 secure+0.003
xy+0.003
sy+0.003
Neg logit contributions
 mouths-0.006
 heights-0.006
 tongues-0.006
 noses-0.006
 appearance-0.005
Token about
Token ID546
Feature activation-1.92e-02

Pos logit contributions
des+0.000
lad+0.000
xy+0.000
than+0.000
 snug+0.000
Neg logit contributions
 mouths-0.000
 heights-0.000
 noses-0.000
 tongues-0.000
 appearance-0.000
Token it
Token ID340
Feature activation-1.38e-02

Pos logit contributions
des+0.023
sy+0.023
tted+0.023
 secure+0.022
 snug+0.022
Neg logit contributions
 heights-0.035
 mouths-0.035
 noses-0.033
 frogs-0.032
 appearance-0.032
Token and
Token ID290
Feature activation+2.56e-03

Pos logit contributions
des+0.018
 snug+0.018
xy+0.018
 secure+0.017
lad+0.017
Neg logit contributions
 mouths-0.024
 heights-0.023
 noses-0.022
 tongues-0.022
 appearance-0.022
Token made
Token ID925
Feature activation-1.40e-02

Pos logit contributions
des+0.044
lad+0.043
xy+0.042
 snug+0.041
 secure+0.041
Neg logit contributions
 mouths-0.071
 heights-0.068
 noses-0.068
 tongues-0.067
 appearance-0.065
Token her
Token ID607
Feature activation-1.38e-03

Pos logit contributions
 snug+0.016
 secure+0.015
xy+0.015
des+0.015
sy+0.014
Neg logit contributions
 mouths-0.026
 heights-0.025
 appearance-0.025
 tongues-0.025
 noses-0.025
Token laugh
Token ID6487
Feature activation-1.34e-03

Pos logit contributions
 snug+0.002
 secure+0.002
 valuable+0.002
des+0.002
 safe+0.002
Neg logit contributions
 heights-0.002
els-0.002
uders-0.002
 mouths-0.002
 appearance-0.002
Token.
Token ID13
Feature activation+2.01e-02

Pos logit contributions
 snug+0.002
 secure+0.002
xy+0.002
des+0.002
lad+0.002
Neg logit contributions
 mouths-0.002
 heights-0.002
 noses-0.002
 tongues-0.002
 appearance-0.002
Token It
Token ID632
Feature activation-2.90e-02

Pos logit contributions
els+0.044
 appearance+0.044
 heights+0.044
 groups+0.043
 mouths+0.043
Neg logit contributions
 Thanks-0.014
 Good-0.013
da-0.013
Great-0.013
Good-0.012
Token was
Token ID373
Feature activation+4.68e-04

Pos logit contributions
 snug+0.019
xy+0.019
des+0.018
 secure+0.018
lad+0.018
Neg logit contributions
 mouths-0.068
 heights-0.067
 noses-0.066
 tongues-0.065
 appearance-0.064
Token a
Token ID257
Feature activation-6.83e-03

Pos logit contributions
 mouths+0.001
 heights+0.001
 appearance+0.001
 tongues+0.001
 noses+0.001
Neg logit contributions
 snug-0.000
 secure-0.000
 valuable-0.000
sy-0.000
xy-0.000
Token big
Token ID1263
Feature activation-5.10e-03

Pos logit contributions
 snug+0.008
 valuable+0.008
 secure+0.007
xy+0.007
sy+0.007
Neg logit contributions
 mouths-0.013
 heights-0.013
els-0.012
 noses-0.012
 tongues-0.012
Token bulb
Token ID28287
Feature activation-9.56e-03

Pos logit contributions
 snug+0.007
 secure+0.007
xy+0.007
des+0.007
than+0.007
Neg logit contributions
 mouths-0.008
 heights-0.008
 noses-0.007
 tongues-0.007
els-0.007
Token with
Token ID351
Feature activation+4.79e-04

Pos logit contributions
 snug+0.011
 secure+0.010
xy+0.010
sy+0.010
eddy+0.010
Neg logit contributions
 mouths-0.018
 heights-0.017
 noses-0.017
 appearance-0.017
 tongues-0.017
Token funny
Token ID8258
Feature activation+3.36e-02

Pos logit contributions
 mouths+0.001
 heights+0.001
 appearance+0.001
 noses+0.001
 tongues+0.001
Neg logit contributions
 snug-0.000
 secure-0.000
 valuable-0.000
sy-0.000
xy-0.000
Token meal
Token ID9799
Feature activation-1.29e-02

Pos logit contributions
des+0.031
xy+0.031
 snug+0.031
lad+0.031
 secure+0.031
Neg logit contributions
 mouths-0.031
 heights-0.029
 noses-0.029
 tongues-0.028
 appearance-0.028
Token on
Token ID319
Feature activation+2.54e-03

Pos logit contributions
des+0.017
lad+0.017
 snug+0.017
xy+0.017
 secure+0.017
Neg logit contributions
 mouths-0.021
 heights-0.020
 noses-0.020
 tongues-0.020
 groups-0.020
Token the
Token ID262
Feature activation-1.54e-02

Pos logit contributions
 mouths+0.006
okes+0.005
 appearance+0.005
 tongues+0.005
 noses+0.005
Neg logit contributions
 snug-0.002
 secure-0.002
 valuable-0.002
xy-0.002
than-0.002
Token yacht
Token ID43207
Feature activation-1.18e-02

Pos logit contributions
des+0.027
 snug+0.027
xy+0.027
 secure+0.027
than+0.027
Neg logit contributions
 mouths-0.019
 heights-0.018
 noses-0.017
 tongues-0.017
els-0.017
Token.
Token ID13
Feature activation+3.60e-03

Pos logit contributions
xy+0.014
 secure+0.014
des+0.013
 snug+0.013
sy+0.013
Neg logit contributions
 mouths-0.021
 heights-0.021
 noses-0.021
 tongues-0.020
 appearance-0.020
Token He
Token ID679
Feature activation-1.95e-04

Pos logit contributions
 mouths+0.008
 heights+0.007
 noses+0.007
 tongues+0.007
 groups+0.007
Neg logit contributions
des-0.003
xy-0.003
 snug-0.003
 secure-0.003
lad-0.003
Token used
Token ID973
Feature activation+7.34e-03

Pos logit contributions
des+0.000
lad+0.000
 snug+0.000
xy+0.000
 secure+0.000
Neg logit contributions
 mouths-0.000
 noses-0.000
 heights-0.000
 tongues-0.000
 groups-0.000
Token only
Token ID691
Feature activation+7.64e-03

Pos logit contributions
 mouths+0.014
 heights+0.014
 noses+0.014
 appearance+0.014
 tongues+0.014
Neg logit contributions
 snug-0.008
des-0.007
 secure-0.007
xy-0.007
lad-0.007
Token the
Token ID262
Feature activation-8.41e-04

Pos logit contributions
 mouths+0.013
 heights+0.013
 noses+0.013
 appearance+0.013
 tongues+0.013
Neg logit contributions
 snug-0.009
 secure-0.009
des-0.009
 valuable-0.009
sy-0.009
Token finest
Token ID18822
Feature activation+1.04e-02

Pos logit contributions
 snug+0.001
 secure+0.001
des+0.001
xy+0.001
 valuable+0.001
Neg logit contributions
 mouths-0.001
 heights-0.001
 noses-0.001
 tongues-0.001
 appearance-0.001
Token ingredients
Token ID9391
Feature activation+4.82e-03

Pos logit contributions
 heights+0.017
 mouths+0.016
 salon+0.016
 appearance+0.016
 anth+0.016
Neg logit contributions
 snug-0.014
 secure-0.014
des-0.014
 valuable-0.013
 repaired-0.013
Token fun
Token ID1257
Feature activation-3.85e-03

Pos logit contributions
 mouths+0.010
 tongues+0.010
 noses+0.009
 salon+0.009
uders+0.009
Neg logit contributions
 snug-0.007
 secure-0.007
des-0.007
 valuable-0.007
 safe-0.007
Token along
Token ID1863
Feature activation+6.37e-03

Pos logit contributions
des+0.006
 snug+0.006
 secure+0.006
xy+0.006
lad+0.006
Neg logit contributions
 mouths-0.006
 tongues-0.005
 noses-0.005
 heights-0.005
 appearance-0.005
Token the
Token ID262
Feature activation-5.40e-03

Pos logit contributions
 mouths+0.012
 tongues+0.011
 noses+0.011
 appearance+0.011
 heights+0.011
Neg logit contributions
 snug-0.008
 secure-0.007
xy-0.007
des-0.007
 valuable-0.007
Token way
Token ID835
Feature activation-5.79e-03

Pos logit contributions
des+0.009
 snug+0.009
 secure+0.009
xy+0.009
lad+0.008
Neg logit contributions
 mouths-0.008
 heights-0.007
 noses-0.007
 tongues-0.007
els-0.007
Token.
Token ID13
Feature activation+2.61e-02

Pos logit contributions
des+0.008
xy+0.008
lad+0.008
 snug+0.008
 secure+0.008
Neg logit contributions
 mouths-0.009
 heights-0.009
 noses-0.008
 tongues-0.008
 appearance-0.008
Token Then
Token ID3244
Feature activation+1.26e-02

Pos logit contributions
 mouths+0.051
 noses+0.049
els+0.049
 tongues+0.049
 heights+0.049
Neg logit contributions
 snug-0.027
des-0.026
Great-0.026
da-0.026
tted-0.026
Token she
Token ID673
Feature activation+1.79e-02

Pos logit contributions
 mouths+0.023
 tongues+0.022
 groups+0.022
els+0.022
 heights+0.022
Neg logit contributions
 snug-0.015
 secure-0.014
 safe-0.014
da-0.013
Good-0.013
Token heard
Token ID2982
Feature activation+1.57e-02

Pos logit contributions
 mouths+0.034
 heights+0.034
 tongues+0.033
 appearance+0.033
els+0.032
Neg logit contributions
sy-0.020
 snug-0.020
 delivered-0.019
 repaired-0.019
da-0.019
Token a
Token ID257
Feature activation-1.63e-03

Pos logit contributions
 mouths+0.038
 tongues+0.036
els+0.035
 noses+0.035
 heights+0.035
Neg logit contributions
 snug-0.010
des-0.010
 secure-0.010
 valuable-0.010
year-0.009
Token noise
Token ID7838
Feature activation-1.74e-02

Pos logit contributions
 snug+0.002
 secure+0.002
xy+0.002
des+0.002
 valuable+0.002
Neg logit contributions
 mouths-0.003
 heights-0.003
 noses-0.003
els-0.003
 tongues-0.003
Token!
Token ID0
Feature activation+1.69e-02

Pos logit contributions
 snug+0.019
des+0.019
 secure+0.018
xy+0.018
sy+0.018
Neg logit contributions
 mouths-0.036
 tongues-0.033
 noses-0.033
 appearance-0.032
 heights-0.031
Token don
Token ID836
Feature activation-5.94e-03

Pos logit contributions
 mouths+0.010
 heights+0.010
 noses+0.010
 tongues+0.010
 appearance+0.010
Neg logit contributions
lad-0.007
 snug-0.007
xy-0.007
 secure-0.007
des-0.007
Token't
Token ID470
Feature activation-1.08e-02

Pos logit contributions
des+0.008
 snug+0.008
 secure+0.008
xy+0.007
lad+0.007
Neg logit contributions
 mouths-0.010
 heights-0.010
 noses-0.010
 tongues-0.009
 groups-0.009
Token you
Token ID345
Feature activation+2.02e-02

Pos logit contributions
 secure+0.006
than+0.006
 snug+0.006
 safe+0.006
 ever+0.005
Neg logit contributions
 mouths-0.026
els-0.025
 noses-0.025
 tongues-0.024
 heights-0.024
Token count
Token ID954
Feature activation+2.84e-02

Pos logit contributions
 mouths+0.039
 heights+0.037
 elbows+0.036
 tongues+0.036
 noses+0.035
Neg logit contributions
 secure-0.023
sy-0.022
 fixed-0.022
 repaired-0.022
 snug-0.022
Token the
Token ID262
Feature activation+1.21e-03

Pos logit contributions
 mouths+0.055
 tongues+0.053
 anth+0.053
 noses+0.052
 elbows+0.052
Neg logit contributions
 snug-0.028
xy-0.028
eddy-0.027
 safe-0.026
 secure-0.026
Token flowers
Token ID12734
Feature activation+2.82e-02

Pos logit contributions
 mouths+0.002
 heights+0.002
okes+0.002
 tongues+0.002
els+0.002
Neg logit contributions
 valuable-0.002
 snug-0.001
 secure-0.001
 safe-0.001
Good-0.001
Token?
Token ID30
Feature activation+2.02e-02

Pos logit contributions
 mouths+0.051
 noses+0.049
 heights+0.049
 tongues+0.048
 appearance+0.047
Neg logit contributions
 snug-0.034
 secure-0.034
xy-0.033
lad-0.033
des-0.032
Token They
Token ID1119
Feature activation+9.89e-03

Pos logit contributions
 mouths+0.043
 heights+0.042
 tongues+0.042
 noses+0.042
 appearance+0.042
Neg logit contributions
 Good-0.018
da-0.018
xy-0.017
lad-0.017
Great-0.017
Token will
Token ID481
Feature activation+1.76e-03

Pos logit contributions
 mouths+0.019
 heights+0.018
 noses+0.018
 tongues+0.018
 appearance+0.018
Neg logit contributions
 secure-0.011
 snug-0.011
des-0.010
xy-0.010
sy-0.010
Token make
Token ID787
Feature activation-5.84e-03

Pos logit contributions
 mouths+0.003
 heights+0.003
 noses+0.003
 tongues+0.003
els+0.003
Neg logit contributions
 secure-0.002
 snug-0.002
 repaired-0.002
sy-0.002
xy-0.002
Token you
Token ID345
Feature activation-9.83e-03

Pos logit contributions
 snug+0.007
 secure+0.007
xy+0.007
des+0.006
 valuable+0.006
Neg logit contributions
 mouths-0.011
 heights-0.010
 tongues-0.010
 noses-0.010
 appearance-0.010
Token their
Token ID511
Feature activation-1.18e-02

Pos logit contributions
 mouths+0.020
 heights+0.020
 noses+0.019
 appearance+0.019
 tongues+0.019
Neg logit contributions
 snug-0.015
xy-0.015
 secure-0.015
des-0.015
lad-0.014
Token picnic
Token ID35715
Feature activation-9.83e-03

Pos logit contributions
xy+0.017
des+0.017
lad+0.017
 snug+0.017
 secure+0.017
Neg logit contributions
 mouths-0.018
 heights-0.017
 noses-0.017
 tongues-0.017
 appearance-0.017
Token and
Token ID290
Feature activation-4.26e-03

Pos logit contributions
des+0.014
lad+0.014
xy+0.014
 snug+0.014
 secure+0.014
Neg logit contributions
 mouths-0.016
 heights-0.015
 noses-0.015
 tongues-0.014
 groups-0.014
Token went
Token ID1816
Feature activation-3.23e-03

Pos logit contributions
lad+0.006
xy+0.006
des+0.006
 snug+0.006
 secure+0.005
Neg logit contributions
 mouths-0.007
 noses-0.007
 tongues-0.007
 heights-0.007
 frogs-0.007
Token home
Token ID1363
Feature activation-1.81e-02

Pos logit contributions
lad+0.004
xy+0.004
 secure+0.004
than+0.004
des+0.004
Neg logit contributions
 mouths-0.005
 noses-0.005
 heights-0.005
 frogs-0.005
 hoops-0.005
Token.
Token ID13
Feature activation-7.10e-03

Pos logit contributions
xy+0.028
lad+0.028
des+0.028
than+0.027
Great+0.027
Neg logit contributions
 mouths-0.026
 groups-0.025
 noses-0.025
 heights-0.024
 tongues-0.023
Token They
Token ID1119
Feature activation-3.78e-03

Pos logit contributions
xy+0.010
lad+0.010
 secure+0.010
des+0.010
than+0.010
Neg logit contributions
 mouths-0.012
 noses-0.011
 heights-0.011
 tongues-0.010
 groups-0.010
Token had
Token ID550
Feature activation+7.74e-03

Pos logit contributions
des+0.004
lad+0.004
xy+0.004
 snug+0.004
 secure+0.004
Neg logit contributions
 mouths-0.007
 noses-0.007
 heights-0.007
 tongues-0.006
 groups-0.006
Token a
Token ID257
Feature activation-1.28e-02

Pos logit contributions
 mouths+0.014
 noses+0.013
 heights+0.013
 frogs+0.013
 groups+0.013
Neg logit contributions
xy-0.010
lad-0.010
than-0.010
des-0.009
 secure-0.009
Token nice
Token ID3621
Feature activation-1.52e-02

Pos logit contributions
xy+0.013
lad+0.013
des+0.013
 secure+0.012
than+0.012
Neg logit contributions
 mouths-0.026
 heights-0.024
 noses-0.024
 tongues-0.024
 groups-0.024
Token day
Token ID1110
Feature activation-1.27e-02

Pos logit contributions
xy+0.028
year+0.028
lad+0.028
cher+0.026
tted+0.026
Neg logit contributions
 mouths-0.018
 noses-0.017
 frogs-0.017
 eyes-0.017
 spr-0.016
Token t
Token ID256
Feature activation+5.09e-03

Pos logit contributions
 mouths+0.025
 tongues+0.024
 noses+0.024
uders+0.023
 heights+0.023
Neg logit contributions
 snug-0.030
 secure-0.030
sy-0.029
 valuable-0.029
 safe-0.029
Tokenines
Token ID1127
Feature activation+1.67e-02

Pos logit contributions
 mouths+0.010
 heights+0.010
 noses+0.009
 tongues+0.009
 appearance+0.009
Neg logit contributions
xy-0.005
des-0.005
 snug-0.005
lad-0.005
 secure-0.005
Token and
Token ID290
Feature activation-1.05e-02

Pos logit contributions
 mouths+0.030
 heights+0.029
 appearance+0.028
 noses+0.028
 tongues+0.028
Neg logit contributions
 snug-0.020
 secure-0.020
xy-0.019
des-0.019
lad-0.019
Token discovered
Token ID5071
Feature activation-3.58e-03

Pos logit contributions
des+0.015
xy+0.015
lad+0.015
 snug+0.015
 secure+0.015
Neg logit contributions
 mouths-0.017
 heights-0.016
 noses-0.016
 tongues-0.015
 appearance-0.015
Token a
Token ID257
Feature activation-2.42e-03

Pos logit contributions
des+0.005
lad+0.005
xy+0.005
than+0.005
 snug+0.005
Neg logit contributions
 heights-0.006
 mouths-0.006
 noses-0.006
 crabs-0.006
 frogs-0.006
Token shiny
Token ID22441
Feature activation-6.69e-03

Pos logit contributions
 snug+0.003
 secure+0.003
 valuable+0.003
sy+0.002
des+0.002
Neg logit contributions
 mouths-0.005
 heights-0.005
 noses-0.004
 tongues-0.004
els-0.004
Token coin
Token ID10752
Feature activation-3.46e-02

Pos logit contributions
 snug+0.009
 secure+0.009
des+0.009
xy+0.009
lad+0.008
Neg logit contributions
 mouths-0.011
 heights-0.011
 noses-0.011
 tongues-0.011
 appearance-0.010
Token!
Token ID0
Feature activation-1.21e-02

Pos logit contributions
des+0.051
lad+0.050
xy+0.049
 snug+0.049
 secure+0.049
Neg logit contributions
 mouths-0.053
 heights-0.050
 noses-0.050
els-0.049
 crabs-0.049
Token\n
Token ID198
Feature activation-1.38e-02

Pos logit contributions
des+0.012
 secure+0.012
xy+0.012
 snug+0.012
lad+0.012
Neg logit contributions
 mouths-0.024
 heights-0.024
 noses-0.023
 tongues-0.023
 groups-0.023
Token\n
Token ID198
Feature activation-5.35e-03

Pos logit contributions
des+0.020
 secure+0.020
 snug+0.019
xy+0.019
lad+0.019
Neg logit contributions
 groups-0.019
 heights-0.019
 frogs-0.019
 hoops-0.019
 anth-0.019
TokenHe
Token ID1544
Feature activation-1.98e-02

Pos logit contributions
xy+0.006
 secure+0.006
 snug+0.006
des+0.006
lad+0.006
Neg logit contributions
 heights-0.009
 mouths-0.009
 groups-0.009
 hoops-0.009
 frogs-0.009
Token for
Token ID329
Feature activation+8.10e-03

Pos logit contributions
 mouths+0.036
 heights+0.036
 noses+0.035
 tongues+0.035
els+0.034
Neg logit contributions
 snug-0.014
des-0.014
 secure-0.013
xy-0.013
 valuable-0.013
Token her
Token ID607
Feature activation+2.16e-02

Pos logit contributions
 mouths+0.016
 heights+0.015
 anth+0.015
 tongues+0.015
 noses+0.015
Neg logit contributions
 snug-0.008
 secure-0.008
xy-0.008
 safe-0.008
Good-0.008
Token.
Token ID13
Feature activation+1.16e-02

Pos logit contributions
 mouths+0.039
 heights+0.039
 hoops+0.038
els+0.038
 crabs+0.038
Neg logit contributions
 snug-0.026
 secure-0.026
 valuable-0.025
des-0.025
 safe-0.024
Token She
Token ID1375
Feature activation-7.58e-03

Pos logit contributions
 heights+0.028
 crabs+0.027
 appearance+0.027
 noses+0.027
els+0.026
Neg logit contributions
 Good-0.008
Great-0.007
Good-0.007
 Thanks-0.007
 late-0.007
Token said
Token ID531
Feature activation+3.79e-02

Pos logit contributions
lad+0.011
des+0.011
xy+0.011
 snug+0.011
 secure+0.011
Neg logit contributions
 mouths-0.012
 noses-0.011
 tongues-0.011
 heights-0.010
 spr-0.010
Token she
Token ID673
Feature activation-1.91e-02

Pos logit contributions
 anth+0.077
 heights+0.075
els+0.074
 groups+0.074
 mouths+0.072
Neg logit contributions
 snug-0.038
 valuable-0.035
Good-0.034
 Good-0.034
 secure-0.033
Token would
Token ID561
Feature activation+3.51e-03

Pos logit contributions
lad+0.029
des+0.027
 snug+0.026
Great+0.026
 secure+0.026
Neg logit contributions
 mouths-0.031
 noses-0.029
 tongues-0.027
 eyes-0.027
 appearance-0.026
Token get
Token ID651
Feature activation-1.75e-03

Pos logit contributions
 mouths+0.006
 heights+0.006
 noses+0.006
els+0.006
 tongues+0.006
Neg logit contributions
 secure-0.004
 snug-0.004
des-0.004
sy-0.004
xy-0.004
Token a
Token ID257
Feature activation+1.43e-02

Pos logit contributions
des+0.002
 snug+0.002
 secure+0.002
xy+0.002
lad+0.002
Neg logit contributions
 mouths-0.003
 heights-0.003
 noses-0.003
 tongues-0.003
 appearance-0.003
Token sticker
Token ID27735
Feature activation-7.64e-03

Pos logit contributions
 mouths+0.027
 heights+0.026
 noses+0.025
 tongues+0.025
els+0.025
Neg logit contributions
 snug-0.017
des-0.017
 secure-0.016
xy-0.016
lad-0.016
Token and
Token ID290
Feature activation+8.69e-03

Pos logit contributions
lad+0.011
des+0.011
xy+0.011
 snug+0.011
 secure+0.011
Neg logit contributions
 mouths-0.012
 heights-0.011
 noses-0.011
 salon-0.011
 tongues-0.011
Token,
Token ID11
Feature activation+1.18e-04

Pos logit contributions
 mouths+0.016
 heights+0.015
 noses+0.015
 tongues+0.015
 appearance+0.015
Neg logit contributions
des-0.020
xy-0.020
 secure-0.020
 snug-0.020
lad-0.020
Token "
Token ID366
Feature activation-3.53e-02

Pos logit contributions
 mouths+0.000
 heights+0.000
 noses+0.000
 tongues+0.000
 appearance+0.000
Neg logit contributions
 snug-0.000
xy-0.000
des-0.000
lad-0.000
 secure-0.000
TokenYou
Token ID1639
Feature activation+4.75e-03

Pos logit contributions
 secure+0.045
 snug+0.044
xy+0.041
sy+0.041
 safe+0.041
Neg logit contributions
 heights-0.057
 crabs-0.054
 frogs-0.053
 mouths-0.053
 hoops-0.053
Token can
Token ID460
Feature activation+2.08e-04

Pos logit contributions
 mouths+0.007
 noses+0.006
 heights+0.006
 groups+0.006
 frogs+0.006
Neg logit contributions
xy-0.008
des-0.008
 snug-0.007
eddy-0.007
lad-0.007
Token be
Token ID307
Feature activation+3.89e-03

Pos logit contributions
 heights+0.000
 mouths+0.000
 noses+0.000
 tongues+0.000
 appearance+0.000
Neg logit contributions
xy-0.000
lad-0.000
des-0.000
year-0.000
 snug-0.000
Token very
Token ID845
Feature activation-1.71e-02

Pos logit contributions
 mouths+0.007
 heights+0.007
 noses+0.007
 tongues+0.007
 groups+0.007
Neg logit contributions
xy-0.004
des-0.004
lad-0.004
 snug-0.004
 secure-0.004
Token intelligent
Token ID12661
Feature activation+2.72e-02

Pos logit contributions
eddy+0.025
xy+0.024
rob+0.024
rev+0.023
 Good+0.022
Neg logit contributions
 crabs-0.025
 frogs-0.025
 groups-0.025
 heights-0.025
 mouths-0.023
Token too
Token ID1165
Feature activation+4.63e-03

Pos logit contributions
 mouths+0.052
 tongues+0.049
 noses+0.049
 heights+0.048
els+0.047
Neg logit contributions
 snug-0.030
lad-0.029
des-0.029
 secure-0.029
 valuable-0.028
Token if
Token ID611
Feature activation+1.23e-03

Pos logit contributions
 mouths+0.009
 noses+0.009
 tongues+0.009
 heights+0.009
els+0.009
Neg logit contributions
 snug-0.005
 secure-0.005
 valuable-0.005
lad-0.005
xy-0.004
Token you
Token ID345
Feature activation+2.41e-02

Pos logit contributions
 mouths+0.003
 noses+0.002
 tongues+0.002
 heights+0.002
 groups+0.002
Neg logit contributions
xy-0.001
 snug-0.001
 secure-0.001
des-0.001
lad-0.001
Token use
Token ID779
Feature activation+2.52e-02

Pos logit contributions
 mouths+0.047
 heights+0.045
 noses+0.044
 tongues+0.043
els+0.042
Neg logit contributions
 secure-0.027
des-0.026
sy-0.026
 snug-0.026
 repaired-0.026
TokenL
Token ID43
Feature activation-2.72e-03

Pos logit contributions
 mouths+0.007
 heights+0.007
 noses+0.007
 groups+0.007
 tongues+0.006
Neg logit contributions
des-0.004
xy-0.004
 snug-0.004
 secure-0.004
lad-0.004
Tokenily
Token ID813
Feature activation+7.14e-03

Pos logit contributions
 secure+0.004
 snug+0.004
than+0.004
des+0.004
xy+0.004
Neg logit contributions
els-0.004
 hoops-0.004
 groups-0.004
 mouths-0.004
okes-0.004
Token and
Token ID290
Feature activation+2.03e-02

Pos logit contributions
 mouths+0.012
 heights+0.012
 noses+0.012
 tongues+0.012
 appearance+0.011
Neg logit contributions
des-0.009
 snug-0.009
xy-0.009
 secure-0.009
lad-0.009
Token Ben
Token ID3932
Feature activation+1.59e-02

Pos logit contributions
 heights+0.040
 appearance+0.038
 conclusions+0.037
 groups+0.036
uders+0.036
Neg logit contributions
 fixed-0.019
 snug-0.019
eddy-0.018
Good-0.018
 Good-0.018
Token agreed
Token ID4987
Feature activation-4.70e-03

Pos logit contributions
 mouths+0.027
 noses+0.026
 heights+0.025
 tongues+0.025
 groups+0.024
Neg logit contributions
xy-0.021
des-0.021
lad-0.021
 snug-0.021
 secure-0.021
Token.
Token ID13
Feature activation-8.06e-03

Pos logit contributions
xy+0.007
des+0.007
lad+0.007
 snug+0.007
 secure+0.007
Neg logit contributions
 mouths-0.007
 heights-0.007
 noses-0.007
 tongues-0.007
 appearance-0.006
Token They
Token ID1119
Feature activation+6.57e-03

Pos logit contributions
 Thanks+0.004
 Good+0.004
 late+0.004
 Now+0.004
eddy+0.004
Neg logit contributions
 heights-0.020
els-0.020
 crabs-0.019
 appearance-0.019
 strangers-0.019
Token helped
Token ID4193
Feature activation+6.81e-04

Pos logit contributions
 mouths+0.011
 noses+0.011
 groups+0.010
 tongues+0.010
 spr+0.010
Neg logit contributions
lad-0.009
xy-0.009
des-0.009
eddy-0.008
 Good-0.008
Token Mom
Token ID11254
Feature activation+4.29e-03

Pos logit contributions
 mouths+0.001
 heights+0.001
 noses+0.001
 appearance+0.001
 tongues+0.001
Neg logit contributions
 snug-0.001
 secure-0.001
des-0.001
xy-0.001
than-0.001
Token put
Token ID1234
Feature activation-1.63e-02

Pos logit contributions
 mouths+0.007
 noses+0.006
 heights+0.006
 tongues+0.006
 groups+0.006
Neg logit contributions
des-0.006
xy-0.006
lad-0.006
 secure-0.006
 snug-0.006
Token the
Token ID262
Feature activation-1.28e-02

Pos logit contributions
des+0.028
lad+0.028
xy+0.027
 secure+0.027
than+0.027
Neg logit contributions
 mouths-0.024
 noses-0.022
els-0.022
 heights-0.021
 tongues-0.021
Token scary
Token ID14343
Feature activation+1.86e-02

Pos logit contributions
 mouths+0.005
 heights+0.005
 noses+0.005
els+0.005
 tongues+0.005
Neg logit contributions
 snug-0.005
des-0.005
 secure-0.005
xy-0.005
 valuable-0.005
Token dog
Token ID3290
Feature activation+3.55e-03

Pos logit contributions
 mouths+0.029
els+0.028
 tongues+0.027
 heights+0.027
 noses+0.027
Neg logit contributions
xy-0.025
 valuable-0.025
 snug-0.025
Good-0.024
des-0.024
Token.
Token ID13
Feature activation-2.23e-02

Pos logit contributions
 mouths+0.007
 heights+0.006
 noses+0.006
 tongues+0.006
els+0.006
Neg logit contributions
xy-0.004
 snug-0.004
 secure-0.004
des-0.004
sy-0.004
Token<|endoftext|>
Token ID50256
Feature activation-1.45e-02

Pos logit contributions
 secure+0.035
des+0.035
sy+0.034
than+0.034
 snug+0.034
Neg logit contributions
 mouths-0.034
 noses-0.030
 groups-0.030
 hoops-0.030
 tongues-0.029
TokenOnce
Token ID7454
Feature activation+7.16e-03

Pos logit contributions
 secure+0.027
des+0.025
lad+0.025
 snug+0.025
 valuable+0.024
Neg logit contributions
 groups-0.017
 frogs-0.017
 mouths-0.016
 heights-0.016
 hoops-0.016
Token upon
Token ID2402
Feature activation-5.78e-02

Pos logit contributions
 mouths+0.009
 groups+0.008
 heights+0.008
 noses+0.008
 tongues+0.008
Neg logit contributions
des-0.013
 secure-0.012
 snug-0.012
xy-0.012
than-0.012
Token a
Token ID257
Feature activation-4.15e-03

Pos logit contributions
xy+0.107
lad+0.107
than+0.103
 secure+0.100
packing+0.099
Neg logit contributions
 frogs-0.068
 groups-0.065
 noses-0.063
 beetles-0.062
 eyes-0.062
Token time
Token ID640
Feature activation+1.40e-02

Pos logit contributions
xy+0.008
 secure+0.008
lad+0.008
than+0.008
year+0.008
Neg logit contributions
 mouths-0.004
 noses-0.004
 heights-0.004
 tongues-0.004
 groups-0.004
Token,
Token ID11
Feature activation+2.21e-03

Pos logit contributions
 mouths+0.015
 heights+0.014
 noses+0.014
 groups+0.014
 tongues+0.013
Neg logit contributions
 secure-0.027
des-0.027
than-0.026
 snug-0.026
xy-0.026
Token there
Token ID612
Feature activation-7.12e-03

Pos logit contributions
 groups+0.002
 frogs+0.002
 mouths+0.002
 spr+0.002
 hoops+0.002
Neg logit contributions
des-0.005
than-0.005
 secure-0.004
xy-0.004
lad-0.004
Token was
Token ID373
Feature activation-1.26e-02

Pos logit contributions
des+0.011
 secure+0.010
year+0.010
xy+0.010
lad+0.010
Neg logit contributions
 mouths-0.011
 noses-0.010
 spr-0.010
 frogs-0.010
 eyes-0.010
Token brown
Token ID7586
Feature activation+2.28e-02

Pos logit contributions
 mouths+0.010
 heights+0.010
 noses+0.010
 tongues+0.010
els+0.010
Neg logit contributions
 snug-0.006
 secure-0.006
des-0.005
 valuable-0.005
Good-0.005
Token man
Token ID582
Feature activation-1.24e-02

Pos logit contributions
 mouths+0.039
 heights+0.039
 appearance+0.037
 tongues+0.037
 groups+0.036
Neg logit contributions
 snug-0.028
xy-0.027
eddy-0.027
des-0.027
Good-0.027
Token.
Token ID13
Feature activation-1.40e-02

Pos logit contributions
 snug+0.015
des+0.014
xy+0.014
 secure+0.014
sy+0.014
Neg logit contributions
 heights-0.022
 mouths-0.022
 noses-0.021
 tongues-0.021
 appearance-0.021
Token<|endoftext|>
Token ID50256
Feature activation-1.61e-02

Pos logit contributions
 secure+0.022
sy+0.021
 snug+0.021
des+0.021
ve+0.021
Neg logit contributions
 mouths-0.021
 heights-0.019
 noses-0.019
 tongues-0.019
 hoops-0.019
TokenOnce
Token ID7454
Feature activation+8.43e-03

Pos logit contributions
 secure+0.029
 snug+0.028
des+0.028
than+0.028
lad+0.028
Neg logit contributions
 groups-0.018
 frogs-0.018
 mouths-0.018
 heights-0.017
 hoops-0.017
Token upon
Token ID2402
Feature activation-5.03e-02

Pos logit contributions
 mouths+0.011
 heights+0.010
 noses+0.010
 groups+0.010
 tongues+0.009
Neg logit contributions
des-0.015
 secure-0.014
 snug-0.014
xy-0.014
sy-0.014
Token a
Token ID257
Feature activation+1.08e-03

Pos logit contributions
xy+0.080
lad+0.079
 secure+0.077
than+0.077
sy+0.074
Neg logit contributions
 frogs-0.071
 noses-0.069
 mouths-0.068
 heights-0.067
 groups-0.067
Token time
Token ID640
Feature activation+2.38e-02

Pos logit contributions
 mouths+0.001
 noses+0.001
 heights+0.001
 tongues+0.001
 groups+0.001
Neg logit contributions
 secure-0.002
xy-0.002
lad-0.002
 snug-0.002
than-0.002
Token,
Token ID11
Feature activation+1.05e-02

Pos logit contributions
 mouths+0.028
 heights+0.026
 noses+0.026
 groups+0.025
 tongues+0.025
Neg logit contributions
 secure-0.044
des-0.044
 snug-0.043
xy-0.043
than-0.043
Token there
Token ID612
Feature activation+6.74e-04

Pos logit contributions
 mouths+0.012
 frogs+0.011
 groups+0.011
 spr+0.011
 noses+0.011
Neg logit contributions
des-0.020
than-0.019
 secure-0.019
xy-0.019
lad-0.019
Token was
Token ID373
Feature activation-1.32e-02

Pos logit contributions
 mouths+0.001
 noses+0.001
 spr+0.001
 frogs+0.001
 heights+0.001
Neg logit contributions
des-0.001
 secure-0.001
year-0.001
 snug-0.001
sy-0.001
Token found
Token ID1043
Feature activation-5.25e-03

Pos logit contributions
 snug+0.001
 secure+0.001
xy+0.001
des+0.001
sy+0.001
Neg logit contributions
 mouths-0.001
 heights-0.001
 noses-0.001
 tongues-0.001
 appearance-0.001
Token a
Token ID257
Feature activation-5.72e-03

Pos logit contributions
des+0.007
xy+0.007
lad+0.007
 snug+0.007
 secure+0.007
Neg logit contributions
 mouths-0.009
 heights-0.009
 noses-0.009
els-0.009
 frogs-0.008
Token big
Token ID1263
Feature activation-5.22e-03

Pos logit contributions
 snug+0.007
 secure+0.006
des+0.006
xy+0.006
lad+0.006
Neg logit contributions
 mouths-0.011
 heights-0.010
 noses-0.010
 tongues-0.010
 appearance-0.010
Token box
Token ID3091
Feature activation-2.60e-02

Pos logit contributions
des+0.008
 snug+0.008
xy+0.008
 secure+0.008
lad+0.008
Neg logit contributions
 mouths-0.008
 heights-0.008
 noses-0.008
 tongues-0.007
 appearance-0.007
Token and
Token ID290
Feature activation-1.43e-02

Pos logit contributions
des+0.040
lad+0.039
xy+0.039
Good+0.038
than+0.038
Neg logit contributions
 mouths-0.037
 heights-0.036
 groups-0.036
 noses-0.036
els-0.035
Token filled
Token ID5901
Feature activation-4.71e-02

Pos logit contributions
lad+0.023
xy+0.022
des+0.022
eddy+0.020
vern+0.020
Neg logit contributions
 mouths-0.021
 noses-0.020
 groups-0.020
 hoops-0.020
 eyes-0.020
Token it
Token ID340
Feature activation-3.25e-02

Pos logit contributions
lad+0.084
des+0.083
 secure+0.078
vern+0.076
sy+0.076
Neg logit contributions
 groups-0.061
els-0.061
 crabs-0.061
 heights-0.058
 frogs-0.057
Token with
Token ID351
Feature activation-1.02e-02

Pos logit contributions
des+0.046
lad+0.045
than+0.044
xy+0.044
Good+0.043
Neg logit contributions
 mouths-0.054
 groups-0.051
 noses-0.051
 heights-0.050
 hoops-0.049
Token some
Token ID617
Feature activation-4.04e-03

Pos logit contributions
xy+0.014
des+0.014
lad+0.013
 snug+0.013
 secure+0.013
Neg logit contributions
 mouths-0.017
 heights-0.016
 noses-0.016
 tongues-0.016
 groups-0.016
Token sand
Token ID6450
Feature activation-7.67e-03

Pos logit contributions
 snug+0.006
 secure+0.005
 valuable+0.005
des+0.005
sy+0.005
Neg logit contributions
 heights-0.007
 mouths-0.006
 appearance-0.006
 tongues-0.006
 noses-0.006
Token.
Token ID13
Feature activation-6.94e-03

Pos logit contributions
xy+0.010
 snug+0.010
des+0.010
 secure+0.010
lad+0.010
Neg logit contributions
 mouths-0.012
 heights-0.012
 noses-0.012
 appearance-0.012
 tongues-0.012
Tokenmy
Token ID1820
Feature activation-1.08e-02

Pos logit contributions
 secure+0.034
xy+0.033
 valuable+0.033
tted+0.033
des+0.032
Neg logit contributions
 mouths-0.035
 noses-0.035
els-0.034
 spr-0.034
 frogs-0.034
Token got
Token ID1392
Feature activation+7.38e-03

Pos logit contributions
 secure+0.018
xy+0.018
 valuable+0.017
lad+0.017
 snug+0.017
Neg logit contributions
 spr-0.014
 mouths-0.014
 noses-0.014
 frogs-0.013
 groups-0.013
Token a
Token ID257
Feature activation-2.60e-02

Pos logit contributions
 mouths+0.014
 heights+0.014
 noses+0.014
 tongues+0.014
 appearance+0.013
Neg logit contributions
 snug-0.008
des-0.008
 secure-0.008
xy-0.008
than-0.007
Token present
Token ID1944
Feature activation-2.10e-02

Pos logit contributions
lad+0.033
xy+0.033
des+0.032
 secure+0.031
 snug+0.031
Neg logit contributions
 mouths-0.046
 noses-0.043
 heights-0.043
 tongues-0.042
 groups-0.042
Token.
Token ID13
Feature activation-7.86e-03

Pos logit contributions
lad+0.029
des+0.029
 snug+0.028
xy+0.028
 secure+0.028
Neg logit contributions
 mouths-0.034
 noses-0.032
 heights-0.031
 groups-0.031
 salon-0.031
Token It
Token ID632
Feature activation-5.29e-02

Pos logit contributions
xy+0.011
lad+0.011
des+0.011
 secure+0.010
than+0.010
Neg logit contributions
 mouths-0.014
 noses-0.013
 eyes-0.012
 tongues-0.012
 groups-0.012
Token was
Token ID373
Feature activation-3.28e-02

Pos logit contributions
des+0.108
lad+0.102
vern+0.098
than+0.092
year+0.092
Neg logit contributions
 spr-0.067
 mouths-0.064
 eyes-0.062
 noses-0.055
els-0.053
Token a
Token ID257
Feature activation-3.82e-02

Pos logit contributions
lad+0.043
des+0.043
xy+0.042
than+0.041
ager+0.038
Neg logit contributions
 mouths-0.059
 eyes-0.057
 noses-0.056
 groups-0.055
 spr-0.055
Token deep
Token ID2769
Feature activation-2.57e-02

Pos logit contributions
lad+0.048
des+0.048
xy+0.047
tted+0.043
 secure+0.043
Neg logit contributions
 mouths-0.068
 noses-0.064
 spr-0.063
 eyes-0.063
 groups-0.063
Token,
Token ID11
Feature activation-2.57e-02

Pos logit contributions
lad+0.041
xy+0.040
des+0.040
 secure+0.038
than+0.038
Neg logit contributions
 mouths-0.037
 heights-0.036
 noses-0.035
 tongues-0.034
 appearance-0.034
Token red
Token ID2266
Feature activation-1.61e-02

Pos logit contributions
lad+0.044
xy+0.041
than+0.039
des+0.038
tto+0.038
Neg logit contributions
 mouths-0.037
 eyes-0.037
 spr-0.036
 noses-0.036
 hoops-0.036
Token and
Token ID290
Feature activation-2.37e-02

Pos logit contributions
des+0.025
than+0.025
 secure+0.025
 snug+0.025
lad+0.025
Neg logit contributions
 mouths-0.043
 heights-0.043
 noses-0.042
 tongues-0.041
 groups-0.041
Token search
Token ID2989
Feature activation-4.02e-02

Pos logit contributions
des+0.034
lad+0.034
xy+0.033
than+0.032
ve+0.032
Neg logit contributions
 mouths-0.039
 heights-0.038
 noses-0.037
 frogs-0.036
 crabs-0.036
Token for
Token ID329
Feature activation-3.39e-02

Pos logit contributions
than+0.070
tto+0.069
lad+0.066
xy+0.066
da+0.065
Neg logit contributions
 mouths-0.052
 groups-0.052
 heights-0.050
 frogs-0.050
 noses-0.049
Token treasure
Token ID14068
Feature activation-2.04e-02

Pos logit contributions
des+0.044
xy+0.044
 snug+0.043
lad+0.043
 secure+0.043
Neg logit contributions
 mouths-0.058
 heights-0.056
 noses-0.055
 tongues-0.054
 appearance-0.054
Token.
Token ID13
Feature activation-1.11e-02

Pos logit contributions
xy+0.028
des+0.028
 snug+0.028
 secure+0.028
lad+0.027
Neg logit contributions
 mouths-0.033
 heights-0.032
 noses-0.031
 tongues-0.031
 appearance-0.030
Token It
Token ID632
Feature activation-4.73e-02

Pos logit contributions
des+0.015
 secure+0.015
 snug+0.014
than+0.014
xy+0.014
Neg logit contributions
 mouths-0.019
 noses-0.018
 groups-0.018
 heights-0.018
 hoops-0.018
Token was
Token ID373
Feature activation-1.73e-02

Pos logit contributions
des+0.058
lad+0.053
eddy+0.053
xy+0.053
than+0.052
Neg logit contributions
 mouths-0.088
 noses-0.085
 groups-0.085
 spr-0.085
 frogs-0.084
Token the
Token ID262
Feature activation-2.43e-02

Pos logit contributions
des+0.022
xy+0.021
lad+0.021
than+0.020
ve+0.020
Neg logit contributions
 mouths-0.031
 heights-0.031
 noses-0.030
 groups-0.030
 frogs-0.029
Token best
Token ID1266
Feature activation-3.10e-02

Pos logit contributions
 snug+0.025
 secure+0.025
des+0.024
lad+0.024
xy+0.024
Neg logit contributions
 mouths-0.048
 heights-0.046
 noses-0.046
 tongues-0.045
els-0.045
Token day
Token ID1110
Feature activation-3.54e-02

Pos logit contributions
xy+0.045
lad+0.042
sy+0.042
 snug+0.041
 secure+0.041
Neg logit contributions
 mouths-0.051
 heights-0.050
 noses-0.049
 appearance-0.048
 crabs-0.048
Token ever
Token ID1683
Feature activation-3.65e-02

Pos logit contributions
lad+0.030
xy+0.029
sy+0.029
des+0.028
than+0.028
Neg logit contributions
 mouths-0.075
 noses-0.072
 heights-0.072
 crabs-0.072
 frogs-0.071