TOP ACTIVATIONS
MAX = 0.109

mum , " What is a jewel ?" M
scissors . " Thank you so much !" Santa
was a bird named Blue . Blue was very happy because
Queen . She was a grand ship , with a tall
was a bird named Blue . Blue was a graceful bird
little fish y named Blue . Blue loved to swim and
was a bird named Blue . Blue was very beautiful with
was a bird named Blue . Blue liked to fly high
was a bird named Blue . Blue had a strong and
was a bird named Blue . Blue loved to soar high
was a bird named Blue . Blue was an ordinary bird
ostr ich named Blue y . Blue y liked to run
a little bird named Blue . Blue was very sad because
was a bird named Blue . Blue loved to fly high
blue cat named Blue y . Blue y was sad because
a little bird named Blue . Blue was very happy because
calm little bird named Blue . Blue lived in a tiny
wrote letters to Santa and put them in a big red
a little bird named Blue . Blue had a peaceful life
gifted little bird named Blue . Blue loved to sing and

MIN ACTIVATIONS
MIN = -0.099

high ! Can you teach me ?" The frog nodded and
asked , " What is that ?" The mom
Mom my took out a spray bottle . She showed Sarah
higher than you !" The small bubble said , " No
it tastes ." L ula could barely contain her
said , " Look at that blue circle !" His mom
They swim back to the shallow end of the pool ,
fun ! Can you teach me how to make music with
let 's play it . One , two , three ,
my , can you tell me a story ?" His mom
the bald player and the girl skipped up and down the
ball ! Can you teach me how to fly ?" The
dancer ! Can you teach me ?" Lucy smiled and said
like to stay in the shallow end ." Ben
he found a can of spray paint in the woods .
can have the steak , L ula ". She turned around
. She showed Sarah how to spray some nice smelling liquid
cat said , " Mar riage is when two animals love
good ! Can you teach me how to play too ?"
and the baby walked together to the garden . When they

INTERVAL 0.057 - 0.109
CONTAINS 0.075%

a boy named Jack . Jack was a clever boy who
ia looked up and saw the diamond too . " Let
a big t eddy bear that she loved to play with
, Charlie and Lucy . Charlie had a brush in his
Every day his mum would buy one for him and he

INTERVAL 0.005 - 0.057
CONTAINS 37.491%

. Together , Twe ety and the squirrel flew
day , they found a shiny packet of seeds that said
?" Dad asked . " Yes , please !"
the ju ic iest ones . Finally , he found a
side and Sam 's side . She frowned .

INTERVAL -0.047 - 0.005
CONTAINS 62.050%

the park . He always loved being creative and making new
and make loud noises ." The squirrel said , " I
Kate smiled as she put on her swim suit and grabbed
nice . You have to be kind to the animals .
truck is hard to do a jump . Maybe you can

INTERVAL -0.099 - -0.047
CONTAINS 0.384%

like to play outside . One day , they go to
started to cry and asked all of the animals to help
the team was trying to figure out how to do a
around the room again . She knew she could always rely
not so busy . That way , we can have fun
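
The four INTERVAL ranges above are consistent with equal-width bins spanning MIN = -0.099 to MAX = 0.109 (width 0.052 each), and the four CONTAINS percentages sum to 100%. A minimal sketch of that binning follows, assuming equal-width bins. The function names and the sample values are hypothetical and are not taken from the tool that produced this dump.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 edges that split [lo, hi] into equal-width bins."""
    step = (hi - lo) / n_bins
    # Round to 3 decimals to match the precision shown in the dump.
    return [round(lo + i * step, 3) for i in range(n_bins + 1)]


def bin_shares(values, edges):
    """Return the percentage of values in each bin.

    Bins are half-open [left, right), except the last bin, which also
    includes its right edge so that the maximum value is counted.
    """
    n_bins = len(edges) - 1
    counts = [0] * n_bins
    for v in values:
        for i in range(n_bins):
            is_last = i == n_bins - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    return [100.0 * c / len(values) for c in counts]


# Bin edges reconstructed from the MIN and MAX values reported above.
edges = equal_width_edges(-0.099, 0.109, 4)
print(edges)

# Hypothetical activations, for illustration only (not data from the dump).
sample = [-0.08, -0.01, 0.0, 0.02, 0.109]
print(bin_shares(sample, edges))
```

On the reported MIN and MAX, the computed edges are -0.099, -0.047, 0.005, 0.057, and 0.109, which match the interval boundaries listed above.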
Token mum
Token ID25682
Feature activation-2.27e-02

Pos logit contributions
upid+0.008
ours+0.008
ry+0.008
about+0.007
asing+0.007
Neg logit contributions
Brown-0.011
Ell-0.010
 Brown-0.010
 Stretch-0.010
Twe-0.009
Token,
Token ID11
Feature activation-1.90e-02

Pos logit contributions
upid+0.029
ours+0.027
 bubble+0.027
olt+0.027
 ankle+0.027
Neg logit contributions
Brown-0.039
 Brown-0.037
Ell-0.037
 Stretch-0.035
 Barney-0.035
Token "
Token ID366
Feature activation+1.91e-02

Pos logit contributions
upid+0.029
ours+0.028
ry+0.028
olt+0.027
about+0.027
Neg logit contributions
Brown-0.029
Ell-0.028
 Brown-0.028
 Barney-0.026
 Stretch-0.026
TokenWhat
Token ID2061
Feature activation-1.79e-02

Pos logit contributions
Brown+0.034
Ell+0.032
hy+0.032
 Stretch+0.031
Twe+0.030
Neg logit contributions
ry-0.024
ours-0.023
about-0.022
 bubble-0.021
upid-0.021
Token is
Token ID318
Feature activation+7.53e-02

Pos logit contributions
upid+0.023
ry+0.023
ours+0.023
 Sandy+0.022
 shallow+0.022
Neg logit contributions
Brown-0.031
Ell-0.029
 Stretch-0.028
 Brown-0.028
hy-0.027
Token a
Token ID257
Feature activation+1.09e-01

Pos logit contributions
Brown+0.171
Ell+0.167
Ar+0.151
thank+0.151
hy+0.151
Neg logit contributions
 how-0.061
 more-0.058
 Sandy-0.056
ours-0.056
 bubble-0.053
Token jewel
Token ID16927
Feature activation+6.40e-03

Pos logit contributions
Ell+0.226
Brown+0.218
hy+0.193
ding+0.190
rob+0.189
Neg logit contributions
 shell-0.121
 promise-0.115
 foot-0.115
 eye-0.108
 end-0.108
Token?"
Token ID1701
Feature activation+1.25e-03

Pos logit contributions
Brown+0.013
Ell+0.013
Twe+0.012
Ar+0.012
 Stretch+0.012
Neg logit contributions
ry-0.006
ours-0.006
upid-0.006
 bubble-0.006
 Sandy-0.005
Token
Token ID198
Feature activation+8.61e-03

Pos logit contributions
Brown+0.002
Ell+0.002
Twe+0.002
 Stretch+0.002
hy+0.002
Neg logit contributions
ours-0.002
ry-0.002
upid-0.002
 Sandy-0.002
 bubble-0.002
Token
Token ID198
Feature activation+4.60e-03

Pos logit contributions
Brown+0.014
Ell+0.013
Twe+0.013
hy+0.013
 Stretch+0.012
Neg logit contributions
ry-0.012
ours-0.012
upid-0.012
 Sandy-0.011
 bubble-0.011
TokenM
Token ID44
Feature activation+9.09e-03

Pos logit contributions
Brown+0.008
Ell+0.007
Twe+0.007
 Stretch+0.007
hy+0.007
Neg logit contributions
ry-0.006
ours-0.006
upid-0.006
 bubble-0.006
 Sandy-0.006
Token scissors
Token ID41786
Feature activation+2.63e-03

Pos logit contributions
Brown+0.055
Ell+0.053
Twe+0.048
 Stretch+0.048
hy+0.047
Neg logit contributions
 bubble-0.036
ours-0.035
upid-0.034
 foot-0.034
 Sandy-0.033
Token.
Token ID13
Feature activation+5.49e-02

Pos logit contributions
Brown+0.005
Ell+0.005
Twe+0.004
 Stretch+0.004
Ar+0.004
Neg logit contributions
ours-0.003
ry-0.003
upid-0.003
 Sandy-0.003
 bubble-0.003
Token
Token ID198
Feature activation+4.41e-02

Pos logit contributions
Ell+0.088
Brown+0.081
ve+0.073
Twe+0.073
wl+0.072
Neg logit contributions
ours-0.079
ry-0.075
 hand-0.074
upid-0.072
 Sandy-0.072
Token
Token ID198
Feature activation+4.67e-02

Pos logit contributions
hy+0.084
Ell+0.083
Brown+0.083
Twe+0.076
 loot+0.076
Neg logit contributions
upid-0.040
ry-0.039
ours-0.036
 Sandy-0.036
ress-0.035
Token"
Token ID1
Feature activation+5.14e-02

Pos logit contributions
hy+0.065
Ell+0.063
 Brown+0.059
Brown+0.058
 Ellie+0.058
Neg logit contributions
ry-0.062
about-0.061
upid-0.061
ress-0.060
'-0.060
TokenThank
Token ID10449
Feature activation+9.64e-02

Pos logit contributions
hy+0.077
gy+0.077
Ell+0.077
 Brown+0.076
Brown+0.076
Neg logit contributions
about-0.061
 bubble-0.057
ired-0.054
ful-0.054
Make-0.053
Token you
Token ID345
Feature activation+3.84e-02

Pos logit contributions
Ell+0.207
Brown+0.189
gy+0.188
hy+0.182
storm+0.182
Neg logit contributions
ful-0.084
 of-0.079
'-0.078
ry-0.072
 if-0.065
Token so
Token ID523
Feature activation+2.65e-02

Pos logit contributions
Brown+0.072
Ell+0.070
Twe+0.067
Ar+0.066
 Stretch+0.062
Neg logit contributions
ry-0.042
ress-0.041
 Sandy-0.040
upid-0.040
ful-0.039
Token much
Token ID881
Feature activation+5.52e-02

Pos logit contributions
Brown+0.047
Ell+0.046
Ar+0.042
Twe+0.042
hy+0.040
Neg logit contributions
ry-0.033
upid-0.033
ours-0.032
 Sandy-0.031
 foot-0.030
Token!"
Token ID2474
Feature activation+6.41e-02

Pos logit contributions
Ell+0.115
Brown+0.111
Twe+0.102
Mic+0.101
hy+0.099
Neg logit contributions
ry-0.054
 of-0.051
ours-0.049
 if-0.048
ired-0.047
Token Santa
Token ID8909
Feature activation+2.22e-02

Pos logit contributions
Ell+0.109
Brown+0.108
Twe+0.101
Ar+0.098
hy+0.096
Neg logit contributions
ry-0.074
 bubble-0.073
 hand-0.073
 Sandy-0.072
 Mama-0.068
Token was
Token ID373
Feature activation+2.22e-02

Pos logit contributions
ours+0.021
upid+0.021
ry+0.020
asing+0.019
 bubble+0.019
Neg logit contributions
Brown-0.022
Ell-0.021
 Stretch-0.020
 Brown-0.020
Twe-0.019
Token a
Token ID257
Feature activation+5.71e-03

Pos logit contributions
Brown+0.047
Ell+0.045
hy+0.043
Twe+0.042
ve+0.041
Neg logit contributions
 bubble-0.020
 Sandy-0.020
ours-0.019
upid-0.019
 foot-0.018
Token bird
Token ID6512
Feature activation-1.33e-02

Pos logit contributions
Brown+0.011
Ell+0.011
Twe+0.010
 Stretch+0.010
hy+0.010
Neg logit contributions
ours-0.006
ry-0.006
upid-0.006
 bubble-0.006
 Sandy-0.006
Token named
Token ID3706
Feature activation+2.64e-02

Pos logit contributions
upid+0.018
ours+0.018
ry+0.017
 Sandy+0.017
about+0.016
Neg logit contributions
Brown-0.024
Ell-0.022
Twe-0.021
hy-0.020
 Stretch-0.020
Token Blue
Token ID4518
Feature activation+1.70e-03

Pos logit contributions
Brown+0.033
Ell+0.031
Ar+0.030
ve+0.029
hy+0.028
Neg logit contributions
 bubble-0.046
 Sandy-0.046
upid-0.045
ours-0.043
ry-0.042
Token.
Token ID13
Feature activation+9.61e-02

Pos logit contributions
Brown+0.003
Ell+0.003
Twe+0.003
 Stretch+0.003
hy+0.003
Neg logit contributions
ours-0.002
ry-0.002
upid-0.002
 Sandy-0.002
 bubble-0.002
Token Blue
Token ID4518
Feature activation+3.34e-02

Pos logit contributions
ve+0.178
bel+0.163
ind+0.142
grand+0.141
cle+0.141
Neg logit contributions
 again-0.097
 of-0.094
 if-0.092
 the-0.088
 Cater-0.088
Token was
Token ID373
Feature activation+3.74e-02

Pos logit contributions
Brown+0.058
Ell+0.057
hy+0.054
 Barney+0.053
 Stretch+0.053
Neg logit contributions
ry-0.042
 bubble-0.038
upid-0.037
 foot-0.036
ours-0.036
Token very
Token ID845
Feature activation+4.80e-02

Pos logit contributions
Brown+0.069
Ell+0.064
Ar+0.062
Twe+0.060
ve+0.060
Neg logit contributions
 Sandy-0.043
 foot-0.043
 bubble-0.042
upid-0.040
 hand-0.039
Token happy
Token ID3772
Feature activation+3.42e-03

Pos logit contributions
Brown+0.092
Ell+0.089
Ar+0.078
 Barney+0.078
 Stretch+0.077
Neg logit contributions
 foot-0.055
 Sandy-0.055
upid-0.054
ours-0.053
 bubble-0.052
Token because
Token ID780
Feature activation+3.12e-02

Pos logit contributions
Brown+0.006
Ell+0.006
Twe+0.005
 Stretch+0.005
Ar+0.005
Neg logit contributions
ry-0.004
ours-0.004
upid-0.004
 Sandy-0.004
 bubble-0.004
Token Queen
Token ID7542
Feature activation+1.02e-03

Pos logit contributions
ours+0.023
upid+0.023
ry+0.021
 ankle+0.021
asing+0.020
Neg logit contributions
Brown-0.026
Ell-0.024
 Stretch-0.023
Twe-0.023
Ar-0.022
Token.
Token ID13
Feature activation+4.52e-02

Pos logit contributions
Brown+0.002
Ell+0.002
Twe+0.002
 Stretch+0.002
hy+0.002
Neg logit contributions
ours-0.001
ry-0.001
upid-0.001
 Sandy-0.001
 bubble-0.001
Token She
Token ID1375
Feature activation+2.72e-02

Pos logit contributions
Brown+0.088
ve+0.083
Ell+0.081
Twe+0.080
Ar+0.078
Neg logit contributions
 Sandy-0.048
ry-0.046
upid-0.046
ours-0.045
 Jelly-0.044
Token was
Token ID373
Feature activation+2.81e-02

Pos logit contributions
Brown+0.048
Ell+0.048
Mic+0.043
Twe+0.042
 Kirst+0.042
Neg logit contributions
ry-0.034
 bubble-0.033
ours-0.032
upid-0.031
 foot-0.031
Token a
Token ID257
Feature activation+3.70e-02

Pos logit contributions
Brown+0.050
Ell+0.048
Twe+0.043
Mic+0.042
Ar+0.041
Neg logit contributions
 Sandy-0.036
ours-0.035
 foot-0.035
 bubble-0.035
upid-0.034
Token grand
Token ID4490
Feature activation+9.45e-02

Pos logit contributions
Ell+0.063
Brown+0.063
Twe+0.055
 Kirst+0.054
hy+0.053
Neg logit contributions
 bubble-0.050
 foot-0.049
ry-0.048
 dolphin-0.046
 twin-0.046
Token ship
Token ID4074
Feature activation+2.93e-02

Pos logit contributions
Ell+0.192
Brown+0.188
Red+0.179
Twe+0.178
Mic+0.172
Neg logit contributions
ry-0.114
 eye-0.107
 dolphin-0.106
 insect-0.105
ful-0.103
Token,
Token ID11
Feature activation+2.23e-02

Pos logit contributions
Brown+0.057
Ell+0.055
Twe+0.052
Ar+0.052
ve+0.049
Neg logit contributions
ry-0.033
 bubble-0.031
 foot-0.030
ours-0.029
apping-0.029
Token with
Token ID351
Feature activation+3.71e-02

Pos logit contributions
Brown+0.045
Ell+0.043
Twe+0.040
Ar+0.040
ve+0.040
Neg logit contributions
 Sandy-0.022
ry-0.021
 bubble-0.021
ours-0.021
 foot-0.021
Token a
Token ID257
Feature activation-9.05e-04

Pos logit contributions
Brown+0.072
Ell+0.070
Twe+0.068
ve+0.064
 Claus+0.063
Neg logit contributions
upid-0.040
 bubble-0.039
 Sandy-0.039
 foot-0.038
ful-0.037
Token tall
Token ID7331
Feature activation-4.27e-03

Pos logit contributions
ours+0.001
ry+0.001
upid+0.001
 Sandy+0.001
 bubble+0.001
Neg logit contributions
Brown-0.001
Ell-0.001
Twe-0.001
 Stretch-0.001
Ar-0.001
Token was
Token ID373
Feature activation+1.74e-02

Pos logit contributions
ours+0.020
upid+0.020
ry+0.020
asing+0.019
 bubble+0.018
Neg logit contributions
Brown-0.022
Ell-0.021
 Stretch-0.019
 Brown-0.019
Twe-0.019
Token a
Token ID257
Feature activation+5.97e-03

Pos logit contributions
Brown+0.036
Ell+0.034
Twe+0.032
hy+0.032
 Stretch+0.031
Neg logit contributions
ours-0.017
 bubble-0.017
upid-0.017
 Sandy-0.016
 foot-0.015
Token bird
Token ID6512
Feature activation-1.50e-02

Pos logit contributions
Brown+0.012
Ell+0.011
Twe+0.010
 Stretch+0.010
hy+0.010
Neg logit contributions
ours-0.007
ry-0.007
upid-0.007
 bubble-0.006
 Sandy-0.006
Token named
Token ID3706
Feature activation+1.90e-02

Pos logit contributions
upid+0.021
ours+0.021
ry+0.019
 Sandy+0.019
about+0.018
Neg logit contributions
Brown-0.026
Ell-0.025
Twe-0.023
hy-0.022
 Stretch-0.022
Token Blue
Token ID4518
Feature activation+2.17e-03

Pos logit contributions
Brown+0.025
Ell+0.023
Ar+0.022
ve+0.021
hy+0.021
Neg logit contributions
upid-0.032
 bubble-0.032
 Sandy-0.032
ours-0.032
ry-0.031
Token.
Token ID13
Feature activation+9.40e-02

Pos logit contributions
Brown+0.004
Ell+0.004
Twe+0.003
 Stretch+0.003
hy+0.003
Neg logit contributions
ours-0.003
ry-0.003
upid-0.003
 Sandy-0.003
 bubble-0.003
Token Blue
Token ID4518
Feature activation+3.51e-02

Pos logit contributions
ve+0.172
bel+0.158
Ar+0.139
grand+0.137
cle+0.135
Neg logit contributions
 again-0.094
 Cater-0.092
 Duck-0.091
 if-0.091
 cricket-0.089
Token was
Token ID373
Feature activation+3.80e-02

Pos logit contributions
Brown+0.062
Ell+0.061
hy+0.058
 Barney+0.057
 Stretch+0.056
Neg logit contributions
ry-0.043
 bubble-0.038
ours-0.037
upid-0.037
 foot-0.037
Token a
Token ID257
Feature activation+3.37e-02

Pos logit contributions
Brown+0.070
Ell+0.065
Ar+0.063
Twe+0.061
ve+0.060
Neg logit contributions
 Sandy-0.044
 foot-0.043
 bubble-0.043
upid-0.041
 hand-0.040
Token graceful
Token ID44363
Feature activation-5.30e-03

Pos logit contributions
Brown+0.056
Ell+0.055
hy+0.049
Twe+0.049
 Kirst+0.048
Neg logit contributions
 foot-0.044
 bubble-0.044
ry-0.040
 hand-0.040
 twin-0.040
Token bird
Token ID6512
Feature activation-9.15e-03

Pos logit contributions
ours+0.007
upid+0.006
ry+0.006
 Sandy+0.006
asing+0.006
Neg logit contributions
Brown-0.010
Ell-0.009
Twe-0.009
 Stretch-0.009
hy-0.009
Token little
Token ID1310
Feature activation+8.48e-03

Pos logit contributions
Brown+0.011
Ell+0.011
Twe+0.010
 Stretch+0.010
hy+0.010
Neg logit contributions
ours-0.006
ry-0.006
upid-0.006
 bubble-0.006
 Sandy-0.006
Token fish
Token ID5916
Feature activation-2.39e-02

Pos logit contributions
Brown+0.017
Ell+0.016
Twe+0.015
 Stretch+0.015
hy+0.015
Neg logit contributions
ry-0.009
ours-0.009
upid-0.009
 bubble-0.009
 Sandy-0.008
Tokeny
Token ID88
Feature activation-2.62e-02

Pos logit contributions
upid+0.036
ours+0.035
 Sandy+0.033
ry+0.032
about+0.032
Neg logit contributions
Brown-0.038
Ell-0.035
Twe-0.034
 Stretch-0.033
hy-0.033
Token named
Token ID3706
Feature activation+7.05e-03

Pos logit contributions
upid+0.040
ours+0.038
asing+0.036
ress+0.035
ry+0.035
Neg logit contributions
Brown-0.042
 Stretch-0.038
Twe-0.038
Ell-0.037
 Brown-0.037
Token Blue
Token ID4518
Feature activation-3.18e-04

Pos logit contributions
Brown+0.010
Ell+0.010
Twe+0.009
hy+0.009
 Stretch+0.009
Neg logit contributions
ours-0.011
upid-0.011
ry-0.011
 Sandy-0.011
 bubble-0.011
Token.
Token ID13
Feature activation+9.29e-02

Pos logit contributions
ours+0.000
ry+0.000
upid+0.000
 Sandy+0.000
 bubble+0.000
Neg logit contributions
Brown-0.001
Ell-0.001
Twe-0.001
 Stretch-0.001
Ar-0.001
Token Blue
Token ID4518
Feature activation+2.11e-02

Pos logit contributions
ve+0.171
bel+0.151
cle+0.142
grand+0.141
ind+0.135
Neg logit contributions
 again-0.103
 if-0.096
 goalie-0.095
 frogs-0.095
 how-0.094
Token loved
Token ID6151
Feature activation+1.34e-02

Pos logit contributions
Brown+0.045
Ell+0.044
hy+0.041
Twe+0.041
 Stretch+0.041
Neg logit contributions
ry-0.020
upid-0.018
ours-0.018
 bubble-0.017
 foot-0.016
Token to
Token ID284
Feature activation-4.04e-03

Pos logit contributions
Brown+0.025
Ell+0.023
Twe+0.022
hy+0.022
 Stretch+0.021
Neg logit contributions
ours-0.017
upid-0.016
 bubble-0.016
ry-0.016
 Sandy-0.016
Token swim
Token ID9422
Feature activation+9.62e-03

Pos logit contributions
ours+0.005
ry+0.005
upid+0.005
 Sandy+0.005
asing+0.005
Neg logit contributions
Brown-0.007
Ell-0.007
Twe-0.006
 Stretch-0.006
Ar-0.006
Token and
Token ID290
Feature activation-1.32e-02

Pos logit contributions
Brown+0.016
Ell+0.016
Twe+0.014
hy+0.014
 Stretch+0.014
Neg logit contributions
ry-0.014
ours-0.013
upid-0.013
 bubble-0.012
 Sandy-0.012
Token was
Token ID373
Feature activation+1.99e-02

Pos logit contributions
ours+0.031
upid+0.031
asing+0.030
ry+0.030
 ankle+0.028
Neg logit contributions
Brown-0.029
Ell-0.027
 Brown-0.027
 Stretch-0.026
Twe-0.025
Token a
Token ID257
Feature activation+1.99e-03

Pos logit contributions
Brown+0.042
Ell+0.040
hy+0.037
Twe+0.037
 Stretch+0.037
Neg logit contributions
 Sandy-0.018
 bubble-0.018
ours-0.018
upid-0.018
 foot-0.017
Token bird
Token ID6512
Feature activation-1.61e-02

Pos logit contributions
Brown+0.004
Ell+0.004
Twe+0.003
 Stretch+0.003
hy+0.003
Neg logit contributions
ours-0.002
ry-0.002
upid-0.002
 Sandy-0.002
 bubble-0.002
Token named
Token ID3706
Feature activation+1.84e-02

Pos logit contributions
upid+0.023
ours+0.023
ry+0.021
 Sandy+0.021
about+0.020
Neg logit contributions
Brown-0.028
Ell-0.026
Twe-0.025
hy-0.024
 Stretch-0.023
Token Blue
Token ID4518
Feature activation+8.43e-04

Pos logit contributions
Brown+0.024
Ell+0.023
Ar+0.022
ve+0.021
hy+0.021
Neg logit contributions
 bubble-0.031
upid-0.031
 Sandy-0.030
ours-0.030
ry-0.030
Token.
Token ID13
Feature activation+9.28e-02

Pos logit contributions
Brown+0.001
Ell+0.001
Twe+0.001
 Stretch+0.001
hy+0.001
Neg logit contributions
ours-0.001
ry-0.001
upid-0.001
 Sandy-0.001
 bubble-0.001
Token Blue
Token ID4518
Feature activation+3.36e-02

Pos logit contributions
ve+0.172
bel+0.153
ind+0.138
cle+0.137
grand+0.137
Neg logit contributions
 again-0.099
 if-0.092
 goalie-0.090
 of-0.090
 Cater-0.089
Token was
Token ID373
Feature activation+3.50e-02

Pos logit contributions
Brown+0.059
Ell+0.058
hy+0.055
 Barney+0.054
Twe+0.054
Neg logit contributions
ry-0.041
 bubble-0.037
ours-0.036
upid-0.035
 foot-0.035
Token very
Token ID845
Feature activation+4.80e-02

Pos logit contributions
Brown+0.063
Ell+0.059
Ar+0.057
Twe+0.055
ve+0.055
Neg logit contributions
 Sandy-0.042
 foot-0.041
 bubble-0.041
upid-0.039
ours-0.038
Token beautiful
Token ID4950
Feature activation+1.38e-03

Pos logit contributions
Brown+0.091
Ell+0.089
 Barney+0.078
 Elliot+0.078
Ar+0.078
Neg logit contributions
 foot-0.056
upid-0.055
 Sandy-0.055
 bubble-0.053
ulu-0.052
Token with
Token ID351
Feature activation+5.00e-02

Pos logit contributions
Brown+0.002
Ell+0.002
Twe+0.002
 Stretch+0.002
hy+0.002
Neg logit contributions
ours-0.002
ry-0.002
upid-0.002
 Sandy-0.002
 bubble-0.002
Token was
Token ID373
Feature activation+2.25e-02

Pos logit contributions
ours+0.017
upid+0.017
ry+0.016
asing+0.016
 bubble+0.015
Neg logit contributions
Brown-0.019
Ell-0.018
 Stretch-0.017
 Brown-0.017
Twe-0.017
Token a
Token ID257
Feature activation+1.10e-02

Pos logit contributions
Brown+0.047
Ell+0.044
hy+0.042
Twe+0.042
 Stretch+0.041
Neg logit contributions
 bubble-0.021
 Sandy-0.020
ours-0.020
upid-0.020
 foot-0.019
Token bird
Token ID6512
Feature activation-1.01e-02

Pos logit contributions
Brown+0.021
Ell+0.021
Twe+0.019
hy+0.019
 Stretch+0.019
Neg logit contributions
ry-0.012
ours-0.012
upid-0.012
 bubble-0.011
 foot-0.011
Token named
Token ID3706
Feature activation+2.51e-02

Pos logit contributions
upid+0.013
ours+0.013
ry+0.012
 Sandy+0.012
about+0.012
Neg logit contributions
Brown-0.018
Ell-0.017
Twe-0.016
hy-0.016
 Stretch-0.016
Token Blue
Token ID4518
Feature activation+6.32e-03

Pos logit contributions
Brown+0.033
Ell+0.032
Ar+0.031
ve+0.030
hy+0.028
Neg logit contributions
 Sandy-0.041
 bubble-0.041
upid-0.041
ours-0.040
ry-0.039
Token.
Token ID13
Feature activation+9.21e-02

Pos logit contributions
Brown+0.012
Ell+0.011
Twe+0.010
 Stretch+0.010
hy+0.010
Neg logit contributions
ry-0.008
ours-0.008
upid-0.008
 bubble-0.007
 Sandy-0.007
Token Blue
Token ID4518
Feature activation+3.37e-02

Pos logit contributions
ve+0.159
bel+0.144
grand+0.127
ind+0.126
cle+0.126
Neg logit contributions
 of-0.101
 Duck-0.099
 again-0.098
 if-0.096
 Cater-0.095
Token liked
Token ID8288
Feature activation+2.61e-02

Pos logit contributions
Brown+0.058
Ell+0.057
hy+0.054
 Barney+0.053
Twe+0.053
Neg logit contributions
ry-0.043
 bubble-0.038
ours-0.037
upid-0.037
 foot-0.037
Token to
Token ID284
Feature activation+1.89e-02

Pos logit contributions
Brown+0.052
Ell+0.049
Twe+0.047
ve+0.046
hy+0.046
Neg logit contributions
 bubble-0.027
ours-0.027
 Sandy-0.026
upid-0.026
 foot-0.025
Token fly
Token ID6129
Feature activation+3.45e-02

Pos logit contributions
Brown+0.036
Ell+0.035
ve+0.032
 Stretch+0.032
Twe+0.032
Neg logit contributions
 bubble-0.022
upid-0.021
 foot-0.020
ours-0.020
ry-0.020
Token high
Token ID1029
Feature activation+5.28e-02

Pos logit contributions
Brown+0.066
Twe+0.060
Ell+0.059
hy+0.058
 Barney+0.057
Neg logit contributions
ry-0.038
 bubble-0.038
 foot-0.036
upid-0.036
 Sandy-0.035
Token was
Token ID373
Feature activation+2.33e-02

Pos logit contributions
ours+0.023
upid+0.023
ry+0.022
asing+0.022
 bubble+0.021
Neg logit contributions
Brown-0.024
Ell-0.023
 Brown-0.022
 Stretch-0.022
Twe-0.021
Token a
Token ID257
Feature activation+5.30e-03

Pos logit contributions
Brown+0.049
Ell+0.046
hy+0.044
Twe+0.044
ve+0.043
Neg logit contributions
 bubble-0.021
 Sandy-0.021
ours-0.020
upid-0.020
 foot-0.019
Token bird
Token ID6512
Feature activation-1.43e-02

Pos logit contributions
Brown+0.010
Ell+0.010
Twe+0.009
 Stretch+0.009
hy+0.009
Neg logit contributions
ours-0.006
ry-0.006
upid-0.006
 bubble-0.006
 Sandy-0.006
Token named
Token ID3706
Feature activation+2.37e-02

Pos logit contributions
upid+0.020
ours+0.020
ry+0.018
 Sandy+0.018
about+0.017
Neg logit contributions
Brown-0.025
Ell-0.024
Twe-0.022
hy-0.021
 Stretch-0.021
Token Blue
Token ID4518
Feature activation+3.60e-03

Pos logit contributions
Brown+0.032
Ell+0.030
Ar+0.029
ve+0.028
hy+0.027
Neg logit contributions
 bubble-0.039
 Sandy-0.039
upid-0.039
ours-0.038
ry-0.037
Token.
Token ID13
Feature activation+9.15e-02

Pos logit contributions
Brown+0.006
Ell+0.006
Twe+0.006
 Stretch+0.006
hy+0.006
Neg logit contributions
ry-0.005
ours-0.005
upid-0.004
 bubble-0.004
 Sandy-0.004
Token Blue
Token ID4518
Feature activation+2.96e-02

Pos logit contributions
ve+0.161
bel+0.142
ind+0.130
grand+0.126
cle+0.125
Neg logit contributions
 of-0.099
 again-0.097
 Duck-0.092
 if-0.092
 let-0.091
Token had
Token ID550
Feature activation+2.85e-02

Pos logit contributions
Brown+0.050
Ell+0.049
hy+0.046
Twe+0.045
 Barney+0.045
Neg logit contributions
ry-0.039
 bubble-0.035
upid-0.035
ours-0.035
 foot-0.034
Token a
Token ID257
Feature activation+3.60e-02

Pos logit contributions
Brown+0.059
Ell+0.055
Twe+0.054
ve+0.054
 Stretch+0.053
Neg logit contributions
 foot-0.025
 bubble-0.025
upid-0.024
 Sandy-0.023
 hand-0.023
Token strong
Token ID1913
Feature activation+1.01e-02

Pos logit contributions
Brown+0.061
Ell+0.058
Twe+0.054
hy+0.054
 Stretch+0.053
Neg logit contributions
 bubble-0.049
 foot-0.048
 hand-0.045
upid-0.044
 twin-0.043
Token and
Token ID290
Feature activation+6.37e-03

Pos logit contributions
Brown+0.021
Ell+0.020
Twe+0.019
 Stretch+0.018
Ar+0.018
Neg logit contributions
upid-0.010
ours-0.010
ry-0.010
 bubble-0.009
 foot-0.009
Token was
Token ID373
Feature activation+1.68e-02

Pos logit contributions
ours+0.032
upid+0.031
asing+0.030
ry+0.030
 shallow+0.029
Neg logit contributions
Brown-0.030
Ell-0.027
 Brown-0.027
 Stretch-0.026
Twe-0.025
Token a
Token ID257
Feature activation+5.22e-03

Pos logit contributions
Brown+0.034
Ell+0.032
hy+0.031
Twe+0.031
 Stretch+0.030
Neg logit contributions
 bubble-0.016
ours-0.016
upid-0.016
 Sandy-0.016
 foot-0.015
Token bird
Token ID6512
Feature activation-1.38e-02

Pos logit contributions
Brown+0.010
Ell+0.010
Twe+0.009
 Stretch+0.009
hy+0.009
Neg logit contributions
ours-0.006
ry-0.006
upid-0.006
 bubble-0.005
 Sandy-0.005
Token named
Token ID3706
Feature activation+2.06e-02

Pos logit contributions
upid+0.019
ours+0.019
ry+0.018
 Sandy+0.017
about+0.017
Neg logit contributions
Brown-0.024
Ell-0.023
Twe-0.022
hy-0.021
 Stretch-0.021
Token Blue
Token ID4518
Feature activation+1.45e-03

Pos logit contributions
Brown+0.027
Ell+0.025
Ar+0.024
ve+0.023
hy+0.023
Neg logit contributions
upid-0.035
 bubble-0.035
 Sandy-0.035
ours-0.035
ry-0.034
Token.
Token ID13
Feature activation+9.14e-02

Pos logit contributions
Brown+0.003
Ell+0.002
Twe+0.002
 Stretch+0.002
hy+0.002
Neg logit contributions
ours-0.002
ry-0.002
upid-0.002
 Sandy-0.002
 bubble-0.002
Token Blue
Token ID4518
Feature activation+2.75e-02

Pos logit contributions
ve+0.162
bel+0.145
cle+0.129
ind+0.129
Ar+0.129
Neg logit contributions
 again-0.097
 of-0.096
 Cater-0.094
 if-0.093
 Duck-0.092
Token loved
Token ID6151
Feature activation+2.76e-02

Pos logit contributions
Brown+0.047
Ell+0.045
hy+0.043
Twe+0.042
 Stretch+0.042
Neg logit contributions
ry-0.036
ours-0.033
 bubble-0.033
upid-0.033
 foot-0.032
Token to
Token ID284
Feature activation+1.16e-02

Pos logit contributions
Brown+0.054
Ell+0.051
Twe+0.050
hy+0.049
ve+0.049
Neg logit contributions
 bubble-0.029
ours-0.028
 Sandy-0.028
upid-0.027
 foot-0.026
Token soar
Token ID48701
Feature activation+4.56e-02

Pos logit contributions
Brown+0.021
Ell+0.020
Twe+0.019
 Stretch+0.019
Ar+0.018
Neg logit contributions
upid-0.014
 bubble-0.014
ours-0.014
ry-0.014
 foot-0.013
Token high
Token ID1029
Feature activation+6.08e-02

Pos logit contributions
Brown+0.089
Ell+0.081
Twe+0.079
hy+0.079
 Barney+0.076
Neg logit contributions
ry-0.050
 foot-0.048
 bubble-0.048
 shallow-0.046
 Sandy-0.046
Token was
Token ID373
Feature activation+1.77e-02

Pos logit contributions
ours+0.016
upid+0.016
ry+0.016
asing+0.015
 bubble+0.014
Neg logit contributions
Brown-0.019
Ell-0.018
 Stretch-0.017
 Brown-0.017
Twe-0.017
Token a
Token ID257
Feature activation+5.53e-03

Pos logit contributions
Brown+0.036
Ell+0.034
hy+0.032
Twe+0.032
ve+0.032
Neg logit contributions
ours-0.017
 bubble-0.017
upid-0.017
 Sandy-0.017
ry-0.016
Token bird
Token ID6512
Feature activation-1.13e-02

Pos logit contributions
Brown+0.010
Ell+0.010
Twe+0.009
 Stretch+0.009
hy+0.009
Neg logit contributions
ours-0.007
ry-0.007
upid-0.006
 bubble-0.006
 Sandy-0.006
Token named
Token ID3706
Feature activation+2.14e-02

Pos logit contributions
upid+0.015
ours+0.015
ry+0.014
 Sandy+0.014
about+0.013
Neg logit contributions
Brown-0.020
Ell-0.019
Twe-0.018
hy-0.017
 Stretch-0.017
Token Blue
Token ID4518
Feature activation+2.16e-03

Pos logit contributions
Brown+0.028
Ell+0.026
Ar+0.025
ve+0.024
hy+0.023
Neg logit contributions
 bubble-0.036
upid-0.036
 Sandy-0.036
ours-0.035
 foot-0.034
Token.
Token ID13
Feature activation+9.14e-02

Pos logit contributions
Brown+0.004
Ell+0.004
Twe+0.003
 Stretch+0.003
hy+0.003
Neg logit contributions
ours-0.003
ry-0.003
upid-0.003
 Sandy-0.003
 bubble-0.003
Token Blue
Token ID4518
Feature activation+3.38e-02

Pos logit contributions
ve+0.164
bel+0.140
ind+0.138
cle+0.130
grand+0.128
Neg logit contributions
 goalie-0.097
 again-0.096
 of-0.096
 the-0.094
 All-0.093
Token was
Token ID373
Feature activation+3.60e-02

Pos logit contributions
Brown+0.057
Ell+0.056
hy+0.053
 Barney+0.052
 Stretch+0.052
Neg logit contributions
ry-0.045
 bubble-0.040
upid-0.039
ours-0.039
 foot-0.038
Token an
Token ID281
Feature activation+2.05e-02

Pos logit contributions
Brown+0.066
Ell+0.061
Ar+0.059
Twe+0.057
ve+0.057
Neg logit contributions
 bubble-0.042
 foot-0.042
 Sandy-0.042
upid-0.040
 hand-0.039
Token ordinary
Token ID8850
Feature activation+4.15e-03

Pos logit contributions
Brown+0.032
Ell+0.030
Ar+0.028
 Stretch+0.027
Twe+0.027
Neg logit contributions
upid-0.031
ours-0.030
ry-0.030
 foot-0.030
 bubble-0.029
Token bird
Token ID6512
Feature activation-1.12e-02

Pos logit contributions
Brown+0.008
Ell+0.008
Twe+0.007
 Stretch+0.007
hy+0.007
Neg logit contributions
ry-0.004
ours-0.004
upid-0.004
 bubble-0.004
 Sandy-0.004
Token ostr
Token ID42387
Feature activation+4.43e-02

Pos logit contributions
upid+0.031
ours+0.031
ry+0.030
asing+0.030
 Sandy+0.028
Neg logit contributions
Brown-0.037
Ell-0.034
Twe-0.033
 Brown-0.033
hy-0.033
Tokenich
Token ID488
Feature activation-2.28e-02

Pos logit contributions
Ar+0.084
Ell+0.083
Brown+0.082
 Barney+0.079
 Stretch+0.078
Neg logit contributions
ry-0.052
icking-0.046
apping-0.045
ours-0.045
ress-0.044
Token named
Token ID3706
Feature activation+3.71e-02

Pos logit contributions
upid+0.036
ours+0.036
ry+0.032
 Sandy+0.032
about+0.031
Neg logit contributions
Brown-0.037
Ell-0.035
Twe-0.033
hy-0.032
 Stretch-0.031
Token Blue
Token ID4518
Feature activation+3.43e-03

Pos logit contributions
Brown+0.058
Ell+0.057
Ar+0.053
ve+0.053
hy+0.052
Neg logit contributions
 Sandy-0.054
upid-0.048
 bubble-0.047
ry-0.047
ours-0.046
Tokeny
Token ID88
Feature activation+1.96e-03

Pos logit contributions
Brown+0.006
Ell+0.006
Twe+0.006
 Stretch+0.006
hy+0.006
Neg logit contributions
ry-0.004
ours-0.004
upid-0.004
 bubble-0.004
 Sandy-0.004
Token.
Token ID13
Feature activation+9.12e-02

Pos logit contributions
Brown+0.003
Ell+0.003
Twe+0.003
 Stretch+0.003
hy+0.003
Neg logit contributions
ours-0.003
ry-0.003
upid-0.003
 Sandy-0.002
 bubble-0.002
Token Blue
Token ID4518
Feature activation+2.86e-02

Pos logit contributions
ve+0.176
bel+0.152
ind+0.140
cle+0.136
grand+0.130
Neg logit contributions
 again-0.091
 Believe-0.090
 of-0.089
 All-0.089
 Let-0.089
Tokeny
Token ID88
Feature activation+4.72e-02

Pos logit contributions
Brown+0.065
Ell+0.063
hy+0.061
Twe+0.060
 Stretch+0.060
Neg logit contributions
ry-0.022
upid-0.018
ours-0.018
 bubble-0.017
 foot-0.016
Token liked
Token ID8288
Feature activation+2.27e-02

Pos logit contributions
Brown+0.086
Ell+0.085
Ar+0.080
Mic+0.079
ve+0.078
Neg logit contributions
ry-0.052
 foot-0.050
 bubble-0.048
 Sandy-0.048
 twin-0.046
Token to
Token ID284
Feature activation+1.63e-02

Pos logit contributions
Brown+0.044
Ell+0.042
Twe+0.040
hy+0.039
ve+0.039
Neg logit contributions
 bubble-0.024
ours-0.024
 Sandy-0.024
upid-0.023
 foot-0.022
Token run
Token ID1057
Feature activation+2.87e-02

Pos logit contributions
Brown+0.031
Ell+0.030
ve+0.028
Twe+0.027
 Stretch+0.027
Neg logit contributions
 bubble-0.019
upid-0.018
ours-0.018
ry-0.018
 foot-0.017
Token a
Token ID257
Feature activation+8.27e-03

Pos logit contributions
Brown+0.032
Ell+0.030
Twe+0.028
hy+0.028
 Stretch+0.028
Neg logit contributions
ours-0.015
upid-0.015
 bubble-0.015
 Sandy-0.015
 foot-0.014
Token little
Token ID1310
Feature activation+3.76e-03

Pos logit contributions
Brown+0.016
Ell+0.015
Twe+0.014
 Stretch+0.014
hy+0.014
Neg logit contributions
ours-0.009
ry-0.009
upid-0.009
 bubble-0.009
 Sandy-0.008
Token bird
Token ID6512
Feature activation-1.47e-02

Pos logit contributions
Brown+0.007
Ell+0.007
Twe+0.007
 Stretch+0.006
hy+0.006
Neg logit contributions
ours-0.004
ry-0.004
upid-0.004
 bubble-0.004
 Sandy-0.004
Token named
Token ID3706
Feature activation+1.93e-02

Pos logit contributions
upid+0.020
ours+0.020
ry+0.019
 Sandy+0.019
about+0.018
Neg logit contributions
Brown-0.026
Ell-0.024
Twe-0.023
hy-0.022
 Stretch-0.022
Token Blue
Token ID4518
Feature activation+3.23e-03

Pos logit contributions
Brown+0.027
Ell+0.026
Ar+0.025
hy+0.024
ve+0.023
Neg logit contributions
upid-0.031
 Sandy-0.031
 bubble-0.031
ours-0.031
ry-0.030
Token.
Token ID13
Feature activation+9.01e-02

Pos logit contributions
Brown+0.006
Ell+0.006
Twe+0.005
 Stretch+0.005
hy+0.005
Neg logit contributions
ry-0.004
ours-0.004
upid-0.004
 bubble-0.004
 Sandy-0.004
Token Blue
Token ID4518
Feature activation+2.94e-02

Pos logit contributions
ve+0.161
bel+0.143
cle+0.127
Ar+0.127
grand+0.125
Neg logit contributions
 again-0.096
 Cater-0.093
 Duck-0.093
 of-0.093
 if-0.091
Token was
Token ID373
Feature activation+3.54e-02

Pos logit contributions
Brown+0.050
Ell+0.049
hy+0.046
Twe+0.045
 Stretch+0.045
Neg logit contributions
ry-0.039
 bubble-0.035
ours-0.035
upid-0.035
 foot-0.034
Token very
Token ID845
Feature activation+4.50e-02

Pos logit contributions
Brown+0.063
Ell+0.059
Ar+0.057
Twe+0.055
ve+0.054
Neg logit contributions
 foot-0.043
 Sandy-0.043
 bubble-0.042
upid-0.041
ours-0.040
Token sad
Token ID6507
Feature activation-5.77e-03

Pos logit contributions
Brown+0.085
Ell+0.083
Ar+0.073
 Barney+0.072
 Stretch+0.072
Neg logit contributions
 foot-0.053
upid-0.053
 Sandy-0.052
ours-0.051
 bubble-0.051
Token because
Token ID780
Feature activation+2.97e-02

Pos logit contributions
ours+0.008
upid+0.008
ry+0.008
 Sandy+0.008
about+0.007
Neg logit contributions
Brown-0.010
Ell-0.009
 Stretch-0.008
 Brown-0.008
Twe-0.008
Token was
Token ID373
Feature activation+1.97e-02

Pos logit contributions
ours+0.003
upid+0.003
ry+0.003
 bubble+0.003
 Sandy+0.003
Neg logit contributions
Brown-0.004
Ell-0.004
Twe-0.004
 Stretch-0.004
hy-0.004
Token a
Token ID257
Feature activation+6.55e-03

Pos logit contributions
Brown+0.041
Ell+0.039
hy+0.037
Twe+0.037
ve+0.036
Neg logit contributions
ours-0.018
 bubble-0.018
 Sandy-0.018
upid-0.018
 foot-0.017
Token bird
Token ID6512
Feature activation-8.28e-03

Pos logit contributions
Brown+0.012
Ell+0.012
Twe+0.011
 Stretch+0.011
hy+0.011
Neg logit contributions
ours-0.008
ry-0.008
upid-0.008
 bubble-0.007
 Sandy-0.007
Token named
Token ID3706
Feature activation+2.13e-02

Pos logit contributions
ours+0.011
upid+0.011
ry+0.010
 Sandy+0.010
 bubble+0.009
Neg logit contributions
Brown-0.015
Ell-0.014
Twe-0.014
hy-0.013
 Stretch-0.013
Token Blue
Token ID4518
Feature activation+6.71e-03

Pos logit contributions
Brown+0.030
Ell+0.028
Ar+0.026
ve+0.026
Twe+0.025
Neg logit contributions
 Sandy-0.034
upid-0.034
 bubble-0.034
ours-0.033
 foot-0.032
Token.
Token ID13
Feature activation+9.00e-02

Pos logit contributions
Brown+0.012
Ell+0.012
Twe+0.011
 Stretch+0.011
hy+0.011
Neg logit contributions
ry-0.008
ours-0.008
upid-0.008
 bubble-0.008
 Sandy-0.008
Token Blue
Token ID4518
Feature activation+3.69e-02

Pos logit contributions
ve+0.158
bel+0.138
ind+0.128
cle+0.128
grand+0.127
Neg logit contributions
 goalie-0.102
 again-0.101
 harder-0.096
 cricket-0.094
 of-0.094
Token loved
Token ID6151
Feature activation+2.94e-02

Pos logit contributions
Brown+0.064
Ell+0.062
hy+0.060
 Barney+0.059
 Stretch+0.058
Neg logit contributions
ry-0.046
 bubble-0.041
 foot-0.040
upid-0.040
ours-0.040
Token to
Token ID284
Feature activation+8.91e-03

Pos logit contributions
Brown+0.059
Ell+0.056
Twe+0.054
hy+0.054
ve+0.054
Neg logit contributions
 bubble-0.030
ours-0.029
 Sandy-0.029
upid-0.027
 foot-0.027
Token fly
Token ID6129
Feature activation+3.65e-02

Pos logit contributions
Brown+0.016
Ell+0.015
Twe+0.014
 Stretch+0.014
Ar+0.014
Neg logit contributions
upid-0.011
ours-0.011
ry-0.011
 bubble-0.011
 Sandy-0.010
Token high
Token ID1029
Feature activation+5.14e-02

Pos logit contributions
Brown+0.070
Ell+0.061
Twe+0.061
 Barney+0.060
hy+0.060
Neg logit contributions
ry-0.041
 bubble-0.041
 foot-0.039
 Sandy-0.038
upid-0.038
Token blue
Token ID4171
Feature activation-2.18e-02

Pos logit contributions
Brown+0.009
Ell+0.008
Twe+0.008
 Stretch+0.008
hy+0.008
Neg logit contributions
ours-0.006
ry-0.006
upid-0.006
 bubble-0.005
 Sandy-0.005
Token cat
Token ID3797
Feature activation-3.14e-03

Pos logit contributions
upid+0.031
ours+0.030
ry+0.030
asing+0.029
 Sandy+0.027
Neg logit contributions
Brown-0.036
Ell-0.034
Twe-0.033
hy-0.032
 Brown-0.032
Token named
Token ID3706
Feature activation-5.77e-03

Pos logit contributions
ours+0.004
upid+0.004
ry+0.004
 Sandy+0.003
 bubble+0.003
Neg logit contributions
Brown-0.006
Ell-0.006
Twe-0.005
 Stretch-0.005
hy-0.005
Token Blue
Token ID4518
Feature activation+8.43e-03

Pos logit contributions
ry+0.009
ours+0.009
upid+0.009
about+0.008
 bubble+0.008
Neg logit contributions
Brown-0.009
Ell-0.009
Twe-0.008
 Stretch-0.008
 Brown-0.008
Tokeny
Token ID88
Feature activation-2.13e-04

Pos logit contributions
Brown+0.015
Ell+0.015
Twe+0.014
 Stretch+0.014
Ar+0.014
Neg logit contributions
ry-0.010
ours-0.010
upid-0.010
 bubble-0.009
 Sandy-0.009
Token.
Token ID13
Feature activation+8.90e-02

Pos logit contributions
ours+0.000
upid+0.000
ry+0.000
 Sandy+0.000
 bubble+0.000
Neg logit contributions
Brown-0.000
Ell-0.000
Twe-0.000
 Stretch-0.000
hy-0.000
Token Blue
Token ID4518
Feature activation+3.12e-02

Pos logit contributions
ve+0.171
bel+0.147
ind+0.135
cle+0.131
Ar+0.129
Neg logit contributions
 All-0.095
 of-0.093
 cricket-0.092
 bubble-0.090
 goalie-0.090
Tokeny
Token ID88
Feature activation+3.61e-02

Pos logit contributions
Brown+0.060
Ell+0.058
hy+0.055
Twe+0.055
 Stretch+0.055
Neg logit contributions
ry-0.034
upid-0.030
 bubble-0.030
ours-0.029
 foot-0.029
Token was
Token ID373
Feature activation+2.69e-02

Pos logit contributions
Brown+0.062
Ell+0.060
Ar+0.058
ve+0.056
Twe+0.056
Neg logit contributions
ry-0.044
 bubble-0.043
 foot-0.043
 Sandy-0.043
ours-0.041
Token sad
Token ID6507
Feature activation-9.08e-03

Pos logit contributions
Brown+0.048
Ell+0.045
Ar+0.043
ve+0.042
Twe+0.042
Neg logit contributions
 Sandy-0.033
 foot-0.033
 bubble-0.032
ours-0.032
upid-0.031
Token because
Token ID780
Feature activation+2.59e-02

Pos logit contributions
ours+0.013
upid+0.013
ry+0.013
about+0.012
 Sandy+0.012
Neg logit contributions
Brown-0.014
Ell-0.013
 Stretch-0.013
 Brown-0.013
Twe-0.012
Token a
Token ID257
Feature activation+4.62e-03

Pos logit contributions
Brown+0.019
Ell+0.018
Twe+0.017
hy+0.017
 Stretch+0.017
Neg logit contributions
ours-0.011
upid-0.011
ry-0.011
 bubble-0.010
 Sandy-0.010
Token little
Token ID1310
Feature activation+3.69e-03

Pos logit contributions
Brown+0.009
Ell+0.008
Twe+0.008
 Stretch+0.008
hy+0.008
Neg logit contributions
ours-0.005
ry-0.005
upid-0.005
 bubble-0.005
 Sandy-0.005
Token bird
Token ID6512
Feature activation-1.68e-02

Pos logit contributions
Brown+0.007
Ell+0.007
Twe+0.006
 Stretch+0.006
hy+0.006
Neg logit contributions
ours-0.004
ry-0.004
upid-0.004
 bubble-0.004
 Sandy-0.004
Token named
Token ID3706
Feature activation+1.61e-02

Pos logit contributions
upid+0.023
ours+0.023
ry+0.022
 Sandy+0.021
about+0.021
Neg logit contributions
Brown-0.029
Ell-0.028
Twe-0.026
hy-0.025
 Stretch-0.025
Token Blue
Token ID4518
Feature activation-3.81e-03

Pos logit contributions
Brown+0.021
Ell+0.020
Ar+0.019
hy+0.018
ve+0.018
Neg logit contributions
upid-0.027
 Sandy-0.027
 bubble-0.027
ours-0.027
ry-0.026
Token.
Token ID13
Feature activation+8.88e-02

Pos logit contributions
ours+0.005
upid+0.005
ry+0.005
 Sandy+0.004
 bubble+0.004
Neg logit contributions
Brown-0.007
Ell-0.007
Twe-0.006
 Stretch-0.006
Ar-0.006
Token Blue
Token ID4518
Feature activation+2.87e-02

Pos logit contributions
ve+0.161
bel+0.142
cle+0.129
Ar+0.126
grand+0.124
Neg logit contributions
 goalie-0.102
 again-0.101
 harder-0.097
 Believe-0.094
 more-0.092
Token was
Token ID373
Feature activation+3.07e-02

Pos logit contributions
Brown+0.047
Ell+0.045
hy+0.043
 Stretch+0.043
 Barney+0.042
Neg logit contributions
ry-0.040
upid-0.036
 bubble-0.036
ours-0.036
 foot-0.035
Token very
Token ID845
Feature activation+4.25e-02

Pos logit contributions
Brown+0.054
Ell+0.050
Ar+0.048
Twe+0.046
ve+0.046
Neg logit contributions
 foot-0.039
 Sandy-0.039
 bubble-0.039
upid-0.037
ours-0.037
Token happy
Token ID3772
Feature activation-2.25e-03

Pos logit contributions
Brown+0.080
Ell+0.078
 Stretch+0.068
Ar+0.067
 Barney+0.067
Neg logit contributions
upid-0.051
 foot-0.051
 Sandy-0.050
ours-0.050
 bubble-0.049
Token because
Token ID780
Feature activation+2.53e-02

Pos logit contributions
upid+0.003
ours+0.003
ry+0.003
 Sandy+0.003
 bubble+0.003
Neg logit contributions
Brown-0.004
Ell-0.004
 Stretch-0.003
Twe-0.003
 Brown-0.003
Token calm
Token ID9480
Feature activation-1.01e-02

Pos logit contributions
Brown+0.010
Ell+0.010
Twe+0.009
 Stretch+0.009
hy+0.009
Neg logit contributions
ours-0.008
ry-0.008
upid-0.008
 bubble-0.008
 Sandy-0.007
Token little
Token ID1310
Feature activation-8.12e-03

Pos logit contributions
ours+0.014
upid+0.014
ry+0.013
asing+0.012
 Sandy+0.012
Neg logit contributions
Brown-0.017
Ell-0.016
 Stretch-0.015
hy-0.015
Twe-0.015
Token bird
Token ID6512
Feature activation-1.39e-02

Pos logit contributions
upid+0.011
ours+0.010
ry+0.010
 Sandy+0.009
asing+0.009
Neg logit contributions
Brown-0.014
Ell-0.014
 Stretch-0.013
Twe-0.013
hy-0.013
Token named
Token ID3706
Feature activation+1.69e-02

Pos logit contributions
upid+0.020
ours+0.019
ry+0.018
 Sandy+0.018
about+0.017
Neg logit contributions
Brown-0.024
Ell-0.022
Twe-0.021
hy-0.020
 Stretch-0.020
Token Blue
Token ID4518
Feature activation+5.23e-04

Pos logit contributions
Brown+0.026
Ell+0.025
Ar+0.023
ve+0.023
Twe+0.023
Neg logit contributions
 Sandy-0.025
upid-0.024
ours-0.024
 bubble-0.024
ry-0.024
Token.
Token ID13
Feature activation+8.87e-02

Pos logit contributions
Brown+0.001
Ell+0.001
Twe+0.001
 Stretch+0.001
hy+0.001
Neg logit contributions
ours-0.001
ry-0.001
upid-0.001
 Sandy-0.001
 bubble-0.001
Token Blue
Token ID4518
Feature activation+2.44e-02

Pos logit contributions
ve+0.159
bel+0.134
ind+0.130
cle+0.126
grand+0.123
Neg logit contributions
 of-0.104
 goalie-0.101
 All-0.100
 Believe-0.098
 again-0.095
Token lived
Token ID5615
Feature activation+6.90e-03

Pos logit contributions
Brown+0.039
Ell+0.038
Twe+0.035
hy+0.035
 Barney+0.035
Neg logit contributions
ry-0.035
ours-0.032
 bubble-0.032
upid-0.032
 foot-0.031
Token in
Token ID287
Feature activation+3.82e-02

Pos logit contributions
Brown+0.011
Ell+0.010
Twe+0.010
hy+0.010
 Stretch+0.010
Neg logit contributions
ours-0.010
ry-0.010
upid-0.010
 bubble-0.009
 Sandy-0.009
Token a
Token ID257
Feature activation+2.03e-03

Pos logit contributions
ve+0.087
Brown+0.085
Ell+0.085
Twe+0.082
hy+0.077
Neg logit contributions
 bubble-0.028
 foot-0.024
upid-0.023
 hand-0.023
 Sandy-0.022
Token tiny
Token ID7009
Feature activation-1.48e-02

Pos logit contributions
Brown+0.003
Ell+0.003
Twe+0.002
 Stretch+0.002
hy+0.002
Neg logit contributions
ours-0.003
ry-0.003
upid-0.003
 Sandy-0.003
 bubble-0.003
Token wrote
Token ID2630
Feature activation+5.20e-02

Pos logit contributions
Brown+0.060
Ell+0.057
Twe+0.054
 Stretch+0.052
Ar+0.051
Neg logit contributions
ry-0.037
ours-0.036
 bubble-0.033
 Sandy-0.033
 foot-0.033
Token letters
Token ID7475
Feature activation+5.67e-02

Pos logit contributions
Twe+0.109
Ell+0.109
Brown+0.108
Mic+0.099
Ar+0.096
Neg logit contributions
ours-0.050
ry-0.048
ful-0.044
 of-0.043
 how-0.042
Token to
Token ID284
Feature activation+5.74e-02

Pos logit contributions
Ell+0.130
Twe+0.129
Brown+0.127
Mic+0.123
Ar+0.119
Neg logit contributions
ry-0.049
 of-0.042
ours-0.041
ful-0.041
upid-0.038
Token Santa
Token ID8909
Feature activation+5.61e-02

Pos logit contributions
Ell+0.107
Brown+0.100
Twe+0.092
ve+0.090
hy+0.088
Neg logit contributions
 hand-0.070
 bubble-0.068
 foot-0.068
 why-0.068
ours-0.067
Token and
Token ID290
Feature activation+3.81e-02

Pos logit contributions
Ell+0.058
Brown+0.051
Twe+0.051
Ar+0.049
Mic+0.045
Neg logit contributions
ry-0.125
icking-0.111
ours-0.110
ful-0.108
 twin-0.107
Token put
Token ID1234
Feature activation+8.87e-02

Pos logit contributions
Brown+0.066
Ell+0.066
Twe+0.063
ve+0.060
hy+0.058
Neg logit contributions
ours-0.047
 bubble-0.047
ry-0.047
 hand-0.045
 Sandy-0.045
Token them
Token ID606
Feature activation+7.04e-02

Pos logit contributions
Ell+0.210
Brown+0.199
Twe+0.192
Mic+0.185
hy+0.180
Neg logit contributions
 hand-0.070
ours-0.067
 more-0.063
 how-0.059
 something-0.058
Token in
Token ID287
Feature activation+4.42e-02

Pos logit contributions
Brown+0.161
Twe+0.161
Ell+0.161
Mic+0.150
Ar+0.148
Neg logit contributions
ry-0.052
 hand-0.045
 foot-0.042
ac-0.041
icking-0.040
Token a
Token ID257
Feature activation+4.10e-02

Pos logit contributions
Ell+0.096
Brown+0.095
Twe+0.090
ve+0.087
Mic+0.086
Neg logit contributions
 hand-0.038
 bubble-0.037
upid-0.033
 twin-0.033
ry-0.033
Token big
Token ID1263
Feature activation+2.20e-02

Pos logit contributions
Brown+0.074
Ell+0.071
Twe+0.067
 Stretch+0.064
 Kirst+0.063
Neg logit contributions
 bubble-0.051
ry-0.050
 hand-0.050
 foot-0.049
ours-0.048
Token red
Token ID2266
Feature activation+1.74e-02

Pos logit contributions
Brown+0.044
Ell+0.043
Twe+0.041
 Stretch+0.039
Ar+0.038
Neg logit contributions
ry-0.024
ours-0.024
 bubble-0.023
upid-0.023
 foot-0.022
Token a
Token ID257
Feature activation+4.66e-03

Pos logit contributions
Brown+0.041
Ell+0.039
Twe+0.037
hy+0.037
ve+0.036
Neg logit contributions
 bubble-0.019
ours-0.019
 Sandy-0.019
upid-0.018
ry-0.017
Token little
Token ID1310
Feature activation+3.62e-03

Pos logit contributions
Brown+0.009
Ell+0.008
Twe+0.008
 Stretch+0.008
hy+0.008
Neg logit contributions
ours-0.005
ry-0.005
upid-0.005
 bubble-0.005
 Sandy-0.005
Token bird
Token ID6512
Feature activation-1.46e-02

Pos logit contributions
Brown+0.007
Ell+0.007
Twe+0.006
 Stretch+0.006
hy+0.006
Neg logit contributions
ours-0.004
ry-0.004
upid-0.004
 bubble-0.004
 Sandy-0.004
Token named
Token ID3706
Feature activation+1.77e-02

Pos logit contributions
upid+0.020
ours+0.020
ry+0.019
 Sandy+0.018
about+0.018
Neg logit contributions
Brown-0.026
Ell-0.024
Twe-0.023
hy-0.022
 Stretch-0.022
Token Blue
Token ID4518
Feature activation+1.46e-03

Pos logit contributions
Brown+0.025
Ell+0.023
Ar+0.022
hy+0.022
ve+0.022
Neg logit contributions
upid-0.028
 bubble-0.028
 Sandy-0.028
ours-0.028
ry-0.027
Token.
Token ID13
Feature activation+8.86e-02

Pos logit contributions
Brown+0.003
Ell+0.002
Twe+0.002
 Stretch+0.002
hy+0.002
Neg logit contributions
ours-0.002
ry-0.002
upid-0.002
 Sandy-0.002
 bubble-0.002
Token Blue
Token ID4518
Feature activation+3.10e-02

Pos logit contributions
ve+0.165
bel+0.145
ind+0.133
cle+0.132
grand+0.130
Neg logit contributions
 again-0.094
 goalie-0.093
 harder-0.090
 the-0.088
 Believe-0.087
Token had
Token ID550
Feature activation+3.10e-02

Pos logit contributions
Brown+0.052
Ell+0.050
hy+0.048
 Stretch+0.047
 Barney+0.046
Neg logit contributions
ry-0.042
 bubble-0.038
upid-0.037
ours-0.037
 foot-0.036
Token a
Token ID257
Feature activation+3.57e-02

Pos logit contributions
Brown+0.065
Ell+0.061
Twe+0.060
ve+0.059
 Stretch+0.059
Neg logit contributions
 bubble-0.026
 foot-0.026
upid-0.024
 Sandy-0.024
 hand-0.024
Token peaceful
Token ID12309
Feature activation+2.51e-02

Pos logit contributions
Brown+0.061
Ell+0.057
Twe+0.053
hy+0.053
Ar+0.052
Neg logit contributions
 bubble-0.049
 foot-0.048
upid-0.045
 hand-0.044
 twin-0.043
Token life
Token ID1204
Feature activation+3.99e-03

Pos logit contributions
Brown+0.048
Ell+0.046
Twe+0.044
ve+0.043
Ar+0.043
Neg logit contributions
 bubble-0.028
 foot-0.027
ry-0.027
upid-0.027
 hand-0.025
Token gifted
Token ID22527
Feature activation-1.15e-02

Pos logit contributions
Brown+0.000
Ell+0.000
Twe+0.000
 Stretch+0.000
Ar+0.000
Neg logit contributions
ours-0.000
ry-0.000
upid-0.000
 Sandy-0.000
 bubble-0.000
Token little
Token ID1310
Feature activation-3.77e-03

Pos logit contributions
ours+0.015
upid+0.015
ry+0.015
asing+0.014
 Sandy+0.014
Neg logit contributions
Brown-0.020
Ell-0.019
 Stretch-0.018
Twe-0.018
hy-0.017
Token bird
Token ID6512
Feature activation-1.92e-02

Pos logit contributions
upid+0.005
ours+0.005
ry+0.005
 Sandy+0.004
 bubble+0.004
Neg logit contributions
Brown-0.007
Ell-0.007
 Stretch-0.006
Twe-0.006
hy-0.006
Token named
Token ID3706
Feature activation-6.74e-04

Pos logit contributions
upid+0.029
ours+0.029
ry+0.026
 Sandy+0.026
about+0.026
Neg logit contributions
Brown-0.031
Ell-0.029
Twe-0.027
hy-0.026
 Brown-0.026
Token Blue
Token ID4518
Feature activation-1.82e-03

Pos logit contributions
ours+0.001
ry+0.001
upid+0.001
 bubble+0.001
 Sandy+0.001
Neg logit contributions
Brown-0.001
Ell-0.001
Twe-0.001
 Stretch-0.001
hy-0.001
Token.
Token ID13
Feature activation+8.77e-02

Pos logit contributions
ours+0.002
upid+0.002
ry+0.002
 Sandy+0.002
 bubble+0.002
Neg logit contributions
Brown-0.004
Ell-0.004
Twe-0.003
 Stretch-0.003
hy-0.003
Token Blue
Token ID4518
Feature activation+1.78e-02

Pos logit contributions
ve+0.153
cle+0.130
ind+0.129
bel+0.125
wl+0.117
Neg logit contributions
 of-0.112
 Believe-0.108
 Keep-0.106
 Let-0.103
 All-0.102
Token loved
Token ID6151
Feature activation+2.61e-02

Pos logit contributions
Brown+0.037
Ell+0.036
Twe+0.034
hy+0.034
 Stretch+0.034
Neg logit contributions
ry-0.017
upid-0.016
ours-0.015
 bubble-0.015
 foot-0.014
Token to
Token ID284
Feature activation+7.24e-03

Pos logit contributions
Brown+0.052
Ell+0.049
Twe+0.048
hy+0.047
ve+0.046
Neg logit contributions
 bubble-0.028
ours-0.027
 Sandy-0.026
upid-0.026
 foot-0.025
Token sing
Token ID1702
Feature activation+1.55e-02

Pos logit contributions
Brown+0.013
Ell+0.012
Twe+0.011
 Stretch+0.011
hy+0.011
Neg logit contributions
upid-0.009
ours-0.009
ry-0.009
 bubble-0.009
 Sandy-0.009
Token and
Token ID290
Feature activation-2.40e-03

Pos logit contributions
Brown+0.030
Ell+0.029
Twe+0.028
hy+0.027
Ar+0.026
Neg logit contributions
ry-0.016
upid-0.016
 bubble-0.016
ours-0.015
 foot-0.015
Token high
Token ID1029
Feature activation+3.01e-03

Pos logit contributions
ours+0.058
ress+0.057
ry+0.056
upid+0.056
icking+0.055
Neg logit contributions
Brown-0.026
Ell-0.023
 Stretch-0.022
Twe-0.022
hy-0.021
Token!
Token ID0
Feature activation-4.93e-02

Pos logit contributions
Brown+0.005
Ell+0.005
Twe+0.005
 Stretch+0.005
Ar+0.005
Neg logit contributions
ours-0.004
ry-0.004
upid-0.004
 Sandy-0.004
 bubble-0.003
Token Can
Token ID1680
Feature activation+1.63e-02

Pos logit contributions
ry+0.093
upid+0.089
ours+0.088
icking+0.086
 shallow+0.086
Neg logit contributions
Brown-0.060
 Stretch-0.055
 Brown-0.054
hy-0.052
Twe-0.052
Token you
Token ID345
Feature activation-5.73e-02

Pos logit contributions
Brown+0.026
Ell+0.024
Ar+0.023
hy+0.022
 Stretch+0.022
Neg logit contributions
ours-0.024
ry-0.023
upid-0.023
 bubble-0.023
 Sandy-0.023
Token teach
Token ID4545
Feature activation-9.65e-03

Pos logit contributions
ours+0.118
ays+0.109
ress+0.107
ry+0.105
about+0.103
Neg logit contributions
Brown-0.060
Ell-0.059
Twe-0.058
ve-0.053
 Ellie-0.052
Token me
Token ID502
Feature activation-9.90e-02

Pos logit contributions
ry+0.018
ours+0.018
upid+0.017
about+0.016
olt+0.016
Neg logit contributions
Brown-0.012
Ell-0.011
 Stretch-0.010
Twe-0.010
 Brown-0.010
Token?"
Token ID1701
Feature activation+3.47e-03

Pos logit contributions
olt+0.175
 horror+0.173
 pitcher+0.154
idget+0.150
about+0.147
Neg logit contributions
 Christmas-0.109
Brown-0.098
 novels-0.094
 Ellie-0.092
 saf-0.090
Token The
Token ID383
Feature activation-2.22e-02

Pos logit contributions
Brown+0.006
Ell+0.006
Twe+0.005
 Stretch+0.005
Ar+0.005
Neg logit contributions
ours-0.005
ry-0.005
upid-0.005
 Sandy-0.004
 bubble-0.004
Token frog
Token ID21264
Feature activation-1.66e-02

Pos logit contributions
ry+0.030
upid+0.029
ours+0.029
asing+0.027
icking+0.026
Neg logit contributions
Brown-0.039
Ell-0.038
 Stretch-0.037
Ar-0.037
Twe-0.035
Token nodded
Token ID14464
Feature activation+1.94e-03

Pos logit contributions
upid+0.026
ours+0.025
ry+0.024
ress+0.023
 Sandy+0.022
Neg logit contributions
Brown-0.025
Ell-0.025
 Stretch-0.022
Twe-0.022
 Brown-0.022
Token and
Token ID290
Feature activation-3.51e-02

Pos logit contributions
Brown+0.003
Ell+0.003
Twe+0.003
 Stretch+0.003
hy+0.003
Neg logit contributions
ours-0.003
ry-0.003
upid-0.003
 Sandy-0.002
 bubble-0.002
Token asked
Token ID1965
Feature activation+7.37e-04

Pos logit contributions
Brown+0.002
Ell+0.002
Twe+0.002
 Stretch+0.002
hy+0.002
Neg logit contributions
ours-0.002
ry-0.002
upid-0.002
 Sandy-0.002
 bubble-0.002
Token,
Token ID11
Feature activation-1.37e-02

Pos logit contributions
Brown+0.001
Ell+0.001
Twe+0.001
 Stretch+0.001
hy+0.001
Neg logit contributions
ours-0.001
ry-0.001
upid-0.001
 bubble-0.001
 Sandy-0.001
Token "
Token ID366
Feature activation-7.86e-03

Pos logit contributions
upid+0.020
ry+0.019
ours+0.019
asing+0.018
olt+0.017
Neg logit contributions
Brown-0.023
Ell-0.022
 Brown-0.021
 Stretch-0.020
Twe-0.020
TokenWhat
Token ID2061
Feature activation-4.93e-02

Pos logit contributions
upid+0.012
ours+0.012
ry+0.012
 Sandy+0.011
 bubble+0.011
Neg logit contributions
Brown-0.013
Ell-0.012
Twe-0.011
Ar-0.011
 Stretch-0.011
Token is
Token ID318
Feature activation-3.07e-02

Pos logit contributions
 shallow+0.082
 Sandy+0.080
olt+0.080
upid+0.080
ry+0.079
Neg logit contributions
Brown-0.061
 Stretch-0.060
 Brown-0.060
 Christmas-0.058
hy-0.057
Token that
Token ID326
Feature activation-9.74e-02

Pos logit contributions
ill+0.051
ress+0.050
ry+0.050
upid+0.049
about+0.048
Neg logit contributions
Brown-0.041
 Brown-0.040
 Stretch-0.039
 Barney-0.038
 Twe-0.037
Token?"
Token ID1701
Feature activation-4.41e-04

Pos logit contributions
ill+0.171
ress+0.170
asing+0.170
omp+0.165
ry+0.161
Neg logit contributions
 Brown-0.099
 Twe-0.095
 long-0.092
 Blue-0.090
 Ellie-0.088
Token\n
Token ID198
Feature activation+1.60e-03

Pos logit contributions
ours+0.001
upid+0.001
ry+0.001
 Sandy+0.001
 bubble+0.001
Neg logit contributions
Brown-0.001
Ell-0.001
Twe-0.001
 Stretch-0.001
hy-0.001
Token\n
Token ID198
Feature activation+2.60e-03

Pos logit contributions
Brown+0.003
Ell+0.002
Twe+0.002
 Stretch+0.002
hy+0.002
Neg logit contributions
ours-0.002
ry-0.002
upid-0.002
 Sandy-0.002
 bubble-0.002
TokenThe
Token ID464
Feature activation+1.35e-03

Pos logit contributions
Brown+0.005
Ell+0.005
Twe+0.004
 Stretch+0.004
hy+0.004
Neg logit contributions
ours-0.003
ry-0.003
upid-0.003
 bubble-0.003
 Sandy-0.003
Token mom
Token ID1995
Feature activation+2.50e-05

Pos logit contributions
Brown+0.003
Ell+0.002
Twe+0.002
 Stretch+0.002
hy+0.002
Neg logit contributions
ours-0.002
ry-0.002
upid-0.002
 Sandy-0.001
 bubble-0.001
Token Mom
Token ID11254
Feature activation-8.88e-03

Pos logit contributions
upid+0.020
ours+0.020
ry+0.020
 bubble+0.018
asing+0.018
Neg logit contributions
Brown-0.017
Ell-0.016
 Brown-0.015
 Stretch-0.014
Twe-0.014
Tokenmy
Token ID1820
Feature activation-6.88e-03

Pos logit contributions
upid+0.013
ours+0.013
ry+0.013
 Sandy+0.012
 bubble+0.012
Neg logit contributions
Brown-0.015
Ell-0.014
 Brown-0.012
Twe-0.012
 Stretch-0.012
Token took
Token ID1718
Feature activation+1.03e-02

Pos logit contributions
upid+0.010
ours+0.010
ry+0.009
 Sandy+0.009
 bubble+0.009
Neg logit contributions
Brown-0.012
Ell-0.011
Twe-0.010
 Stretch-0.010
 Brown-0.010
Token out
Token ID503
Feature activation-2.16e-02

Pos logit contributions
Brown+0.017
Ell+0.016
Twe+0.015
 Stretch+0.015
Ar+0.015
Neg logit contributions
ours-0.014
upid-0.014
ry-0.014
 Sandy-0.014
 bubble-0.014
Token a
Token ID257
Feature activation-1.87e-02

Pos logit contributions
ry+0.035
upid+0.034
ours+0.034
 ankle+0.031
asing+0.031
Neg logit contributions
Ell-0.031
Brown-0.030
 Brown-0.028
hy-0.026
 Stretch-0.026
Token spray
Token ID11662
Feature activation-9.68e-02

Pos logit contributions
ours+0.027
upid+0.027
ry+0.026
asing+0.026
 Sandy+0.024
Neg logit contributions
Brown-0.030
Ell-0.029
Twe-0.026
 Stretch-0.026
Ar-0.026
Token bottle
Token ID9294
Feature activation-4.15e-02

Pos logit contributions
ress+0.142
upid+0.133
asing+0.133
omp+0.129
ours+0.129
Neg logit contributions
 past-0.119
Ell-0.113
 Brown-0.112
hy-0.111
 Ellie-0.110
Token.
Token ID13
Feature activation-7.93e-03

Pos logit contributions
upid+0.057
ours+0.056
 ankle+0.051
ry+0.051
ress+0.050
Neg logit contributions
Ell-0.064
Brown-0.063
hy-0.060
 Brown-0.059
ve-0.058
Token She
Token ID1375
Feature activation-6.18e-02

Pos logit contributions
ry+0.013
upid+0.013
ours+0.012
 bubble+0.012
asing+0.012
Neg logit contributions
Brown-0.012
Ell-0.011
Twe-0.010
 Brown-0.010
 Stretch-0.010
Token showed
Token ID3751
Feature activation-8.04e-03

Pos logit contributions
upid+0.089
 conclusions+0.081
olt+0.075
 goalie+0.072
iece+0.072
Neg logit contributions
Ell-0.107
Brown-0.105
ve-0.104
 Brown-0.097
Ar-0.096
Token Sarah
Token ID10490
Feature activation-4.12e-02

Pos logit contributions
ry+0.012
ours+0.012
upid+0.012
 bubble+0.011
 shallow+0.011
Neg logit contributions
Brown-0.013
Ell-0.012
 Brown-0.011
Twe-0.011
 Stretch-0.011
Token higher
Token ID2440
Feature activation-2.53e-02

Pos logit contributions
ours+0.040
upid+0.039
ress+0.038
ry+0.038
olt+0.037
Neg logit contributions
Brown-0.030
Ell-0.027
 Stretch-0.027
Ar-0.026
Twe-0.025
Token than
Token ID621
Feature activation-7.74e-03

Pos logit contributions
ours+0.041
upid+0.040
ry+0.038
ress+0.037
olt+0.037
Neg logit contributions
Brown-0.035
 Stretch-0.032
Ar-0.032
Ell-0.031
 Brown-0.030
Token you
Token ID345
Feature activation-2.29e-02

Pos logit contributions
ours+0.014
ry+0.014
upid+0.014
about+0.013
 Sandy+0.013
Neg logit contributions
Brown-0.010
Ell-0.009
 Stretch-0.008
Twe-0.008
Ar-0.008
Token!"
Token ID2474
Feature activation-1.30e-02

Pos logit contributions
ours+0.035
ry+0.033
ress+0.032
 Sandy+0.032
upid+0.032
Neg logit contributions
Brown-0.034
 Stretch-0.031
Ell-0.031
Ar-0.030
hy-0.030
Token The
Token ID383
Feature activation-6.57e-02

Pos logit contributions
upid+0.022
ry+0.022
ours+0.022
 shallow+0.020
asing+0.020
Neg logit contributions
Brown-0.018
Ell-0.017
 Stretch-0.016
 Brown-0.016
hy-0.016
Token small
Token ID1402
Feature activation-9.63e-02

Pos logit contributions
ress+0.171
ry+0.170
ours+0.165
asing+0.161
ill+0.160
Neg logit contributions
 long-0.044
Ar-0.039
 shiny-0.030
Brown-0.028
 whole-0.026
Token bubble
Token ID14310
Feature activation-2.98e-02

Pos logit contributions
idget+0.144
run+0.141
asing+0.141
ress+0.136
ayed+0.135
Neg logit contributions
 long-0.142
Ar-0.129
 rh-0.121
 shiny-0.117
 furry-0.112
Token said
Token ID531
Feature activation-1.68e-02

Pos logit contributions
ours+0.045
upid+0.044
ry+0.041
ress+0.039
 Sandy+0.039
Neg logit contributions
Brown-0.046
Ell-0.044
Ar-0.042
pe-0.041
 Stretch-0.041
Token,
Token ID11
Feature activation-5.51e-03

Pos logit contributions
ours+0.026
upid+0.026
ry+0.026
 shallow+0.024
 bubble+0.024
Neg logit contributions
Brown-0.024
Ell-0.023
 Stretch-0.022
 Brown-0.022
hy-0.022
Token "
Token ID366
Feature activation+1.25e-02

Pos logit contributions
upid+0.007
ry+0.007
ours+0.007
 Sandy+0.006
 bubble+0.006
Neg logit contributions
Brown-0.010
Ell-0.010
 Stretch-0.009
Twe-0.009
hy-0.009
TokenNo
Token ID2949
Feature activation-3.32e-02

Pos logit contributions
Brown+0.021
Ell+0.020
 Brown+0.019
 Barney+0.018
 Stretch+0.018
Neg logit contributions
upid-0.016
ry-0.016
ours-0.016
 bubble-0.016
about-0.016
Token it
Token ID340
Feature activation+1.59e-02

Pos logit contributions
upid+0.027
ry+0.026
ours+0.026
about+0.026
 Sandy+0.025
Neg logit contributions
Brown-0.027
Ell-0.024
Twe-0.023
 Stretch-0.023
ve-0.023
Token tastes
Token ID18221
Feature activation-2.95e-03

Pos logit contributions
Brown+0.033
Ell+0.032
Twe+0.030
hy+0.029
Ar+0.029
Neg logit contributions
ours-0.017
ry-0.017
upid-0.016
 bubble-0.015
 Sandy-0.014
Token."
Token ID526
Feature activation-1.93e-02

Pos logit contributions
ours+0.004
ry+0.004
upid+0.004
 Sandy+0.004
 bubble+0.004
Neg logit contributions
Brown-0.005
Ell-0.005
Twe-0.004
 Stretch-0.004
Ar-0.004
Token\n
Token ID198
Feature activation-2.27e-02

Pos logit contributions
upid+0.034
ours+0.033
ry+0.033
 shallow+0.031
 bubble+0.031
Neg logit contributions
Brown-0.025
Ell-0.024
 Brown-0.022
Twe-0.022
 Stretch-0.021
Token\n
Token ID198
Feature activation-2.70e-02

Pos logit contributions
ours+0.042
ry+0.041
upid+0.041
asing+0.039
about+0.039
Neg logit contributions
Brown-0.029
Ell-0.027
Twe-0.024
Ar-0.023
 Brown-0.023
TokenL
Token ID43
Feature activation-9.35e-02

Pos logit contributions
ours+0.050
upid+0.048
ry+0.048
icking+0.046
 Sandy+0.045
Neg logit contributions
Brown-0.038
Ell-0.035
Twe-0.033
Ar-0.031
Red-0.030
Tokenula
Token ID4712
Feature activation+2.25e-02

Pos logit contributions
 shallow+0.125
ry+0.122
 reel+0.119
 fin+0.117
 net+0.113
Neg logit contributions
ve-0.154
Ell-0.131
Twe-0.126
Red-0.124
Brown-0.117
Token could
Token ID714
Feature activation+3.18e-03

Pos logit contributions
Brown+0.045
Ell+0.043
Ar+0.041
Twe+0.041
 Stretch+0.040
Neg logit contributions
ry-0.025
ours-0.024
upid-0.023
 bubble-0.023
 Sandy-0.022
Token barely
Token ID8523
Feature activation-4.22e-03

Pos logit contributions
Brown+0.005
Ell+0.005
Twe+0.004
 Stretch+0.004
Ar+0.004
Neg logit contributions
ours-0.005
ry-0.005
upid-0.005
 Sandy-0.005
 bubble-0.005
Token contain
Token ID3994
Feature activation-3.72e-03

Pos logit contributions
ours+0.005
ry+0.005
upid+0.005
 Sandy+0.005
 bubble+0.005
Neg logit contributions
Brown-0.007
Ell-0.007
Twe-0.007
 Stretch-0.007
hy-0.007
Token her
Token ID607
Feature activation-5.77e-03

Pos logit contributions
ours+0.007
ry+0.007
upid+0.007
 Sandy+0.006
 bubble+0.006
Neg logit contributions
Brown-0.005
Ell-0.004
Twe-0.004
 Stretch-0.004
 Barney-0.004
Token said
Token ID531
Feature activation+2.88e-03

Pos logit contributions
upid+0.006
ours+0.006
ry+0.006
 bubble+0.006
 Sandy+0.006
Neg logit contributions
Brown-0.006
Ell-0.006
 Stretch-0.006
Twe-0.006
Ar-0.005
Token,
Token ID11
Feature activation+1.55e-02

Pos logit contributions
Brown+0.005
Ell+0.005
Twe+0.005
 Stretch+0.005
Ar+0.005
Neg logit contributions
ours-0.004
ry-0.004
upid-0.004
 Sandy-0.003
 bubble-0.003
Token "
Token ID366
Feature activation+2.07e-02

Pos logit contributions
Brown+0.033
Ell+0.031
Twe+0.030
Ar+0.029
hy+0.029
Neg logit contributions
ry-0.015
ours-0.015
upid-0.014
 Sandy-0.014
 bubble-0.013
TokenLook
Token ID8567
Feature activation-1.03e-02

Pos logit contributions
Brown+0.028
Ell+0.027
 Stretch+0.026
hy+0.026
ve+0.025
Neg logit contributions
about-0.031
ry-0.031
 bubble-0.031
upid-0.031
ours-0.030
Token at
Token ID379
Feature activation-2.01e-02

Pos logit contributions
ours+0.013
upid+0.013
ry+0.013
 bubble+0.012
 shallow+0.012
Neg logit contributions
Brown-0.018
Ell-0.017
 Stretch-0.016
 Brown-0.016
Twe-0.016
Token that
Token ID326
Feature activation-9.27e-02

Pos logit contributions
ry+0.031
ours+0.030
upid+0.029
olt+0.029
about+0.029
Neg logit contributions
Brown-0.030
 Brown-0.028
 Stretch-0.028
 Barney-0.027
Ell-0.027
Token blue
Token ID4171
Feature activation-7.26e-02

Pos logit contributions
asing+0.152
icking+0.152
idget+0.151
ours+0.150
arre+0.143
Neg logit contributions
 long-0.107
 saf-0.102
 Brown-0.094
 shiny-0.090
 Stretch-0.086
Token circle
Token ID9197
Feature activation-1.40e-02

Pos logit contributions
icking+0.105
 ankle+0.104
idget+0.104
arre+0.101
ours+0.099
Neg logit contributions
 saf-0.085
 Christmas-0.083
 Brown-0.082
hy-0.080
 Stretch-0.080
Token!"
Token ID2474
Feature activation-4.68e-04

Pos logit contributions
ours+0.019
upid+0.019
ry+0.019
 Sandy+0.018
 bubble+0.018
Neg logit contributions
Brown-0.023
Ell-0.021
hy-0.020
 Stretch-0.020
Twe-0.020
Token His
Token ID2399
Feature activation-3.77e-02

Pos logit contributions
ours+0.001
ry+0.001
upid+0.001
 Sandy+0.001
 bubble+0.001
Neg logit contributions
Brown-0.001
Ell-0.001
Twe-0.001
 Stretch-0.001
hy-0.001
Token mom
Token ID1995
Feature activation-6.27e-05

Pos logit contributions
ours+0.059
ry+0.058
ion+0.057
upid+0.055
ays+0.054
Neg logit contributions
Brown-0.052
 Brown-0.050
Ar-0.048
 Stretch-0.047
 saf-0.046
Token They
Token ID1119
Feature activation+8.82e-03

Pos logit contributions
ry+0.021
upid+0.021
ours+0.021
ress+0.020
asing+0.020
Neg logit contributions
Brown-0.013
Ell-0.012
Ar-0.011
 Stretch-0.011
Twe-0.011
Token swim
Token ID9422
Feature activation-1.67e-02

Pos logit contributions
Brown+0.018
Ell+0.017
Twe+0.016
 Stretch+0.016
Ar+0.016
Neg logit contributions
upid-0.009
ours-0.009
ry-0.009
 bubble-0.008
 Sandy-0.008
Token back
Token ID736
Feature activation-4.22e-03

Pos logit contributions
upid+0.026
ours+0.026
ry+0.024
 bubble+0.024
 Sandy+0.024
Neg logit contributions
Brown-0.024
Ar-0.022
Ell-0.022
 Stretch-0.021
 Brown-0.021
Token to
Token ID284
Feature activation-2.57e-02

Pos logit contributions
upid+0.006
ours+0.006
ry+0.006
 Sandy+0.006
 bubble+0.006
Neg logit contributions
Brown-0.007
Ell-0.006
 Stretch-0.006
Twe-0.006
 Brown-0.006
Token the
Token ID262
Feature activation-1.86e-02

Pos logit contributions
ry+0.043
ours+0.042
upid+0.041
asing+0.041
about+0.040
Neg logit contributions
 Brown-0.036
Brown-0.035
 Stretch-0.033
Ell-0.033
 Christmas-0.032
Token shallow
Token ID19337
Feature activation-9.25e-02

Pos logit contributions
ry+0.026
ours+0.026
upid+0.026
asing+0.024
 Sandy+0.024
Neg logit contributions
Brown-0.032
Ell-0.029
Ar-0.028
Twe-0.027
 Stretch-0.027
Token end
Token ID886
Feature activation-4.12e-02

Pos logit contributions
asing+0.172
ours+0.153
run+0.146
 ankle+0.144
okes+0.143
Neg logit contributions
Ar-0.122
 saf-0.109
Yo-0.105
thank-0.103
pe-0.101
Token of
Token ID286
Feature activation-2.57e-02

Pos logit contributions
upid+0.069
 ankle+0.065
ours+0.062
about+0.061
asing+0.060
Neg logit contributions
Brown-0.049
Ell-0.047
Ar-0.044
 Brown-0.044
pe-0.044
Token the
Token ID262
Feature activation-2.75e-02

Pos logit contributions
ry+0.041
upid+0.039
asing+0.039
ours+0.039
about+0.037
Neg logit contributions
Brown-0.038
 Brown-0.038
Ell-0.036
 Stretch-0.036
Ar-0.035
Token pool
Token ID5933
Feature activation-3.86e-02

Pos logit contributions
ours+0.042
ry+0.041
asing+0.041
upid+0.041
ress+0.040
Neg logit contributions
Brown-0.041
Ar-0.041
Ell-0.038
 saf-0.037
 Stretch-0.036
Token,
Token ID11
Feature activation-2.45e-02

Pos logit contributions
upid+0.071
ours+0.068
 ankle+0.065
asing+0.064
 Sandy+0.064
Neg logit contributions
Brown-0.039
pe-0.038
hy-0.038
Ar-0.036
 Brown-0.036
Token fun
Token ID1257
Feature activation+1.09e-02

Pos logit contributions
ours+0.035
ry+0.034
upid+0.034
ress+0.033
 ankle+0.032
Neg logit contributions
Brown-0.020
Ell-0.018
 Stretch-0.017
Twe-0.017
Ar-0.017
Token!
Token ID0
Feature activation-3.75e-02

Pos logit contributions
Brown+0.020
Ell+0.019
Twe+0.018
Ar+0.018
hy+0.017
Neg logit contributions
ry-0.013
ours-0.013
upid-0.013
 Sandy-0.012
 bubble-0.012
Token Can
Token ID1680
Feature activation+1.91e-02

Pos logit contributions
ry+0.062
upid+0.061
ours+0.061
 shallow+0.059
 bubble+0.058
Neg logit contributions
Brown-0.055
 Stretch-0.050
 Brown-0.049
Ell-0.049
 Barney-0.046
Token you
Token ID345
Feature activation-3.88e-02

Pos logit contributions
Brown+0.034
Ell+0.033
Ar+0.031
hy+0.031
Twe+0.030
Neg logit contributions
ry-0.023
ours-0.023
 Sandy-0.022
upid-0.022
 bubble-0.022
Token teach
Token ID4545
Feature activation+7.67e-03

Pos logit contributions
ours+0.069
ry+0.063
ress+0.059
ays+0.059
upid+0.059
Neg logit contributions
Brown-0.056
Ell-0.052
 Stretch-0.048
Twe-0.047
 Brown-0.047
Token me
Token ID502
Feature activation-9.23e-02

Pos logit contributions
Brown+0.011
Ell+0.011
Twe+0.010
hy+0.010
Ar+0.010
Neg logit contributions
ours-0.012
ry-0.012
upid-0.012
 bubble-0.011
 Sandy-0.011
Token how
Token ID703
Feature activation-4.48e-02

Pos logit contributions
olt+0.151
 horror+0.144
ours+0.138
 anth+0.137
 pitcher+0.136
Neg logit contributions
Brown-0.116
 Christmas-0.109
 Brown-0.100
 Stretch-0.100
 novels-0.098
Token to
Token ID284
Feature activation-3.53e-02

Pos logit contributions
icking+0.089
about+0.088
ours+0.087
olt+0.087
ill+0.086
Neg logit contributions
Brown-0.045
 Stretch-0.037
Ell-0.037
 Brown-0.036
Ar-0.035
Token make
Token ID787
Feature activation-4.26e-02

Pos logit contributions
ours+0.058
ry+0.053
 Sandy+0.052
ress+0.051
upid+0.049
Neg logit contributions
Brown-0.056
Ell-0.049
Ar-0.048
 Stretch-0.047
pe-0.047
Token music
Token ID2647
Feature activation-4.69e-02

Pos logit contributions
ry+0.071
ours+0.069
upid+0.067
olt+0.067
ill+0.064
Neg logit contributions
Brown-0.058
 Stretch-0.056
 Brown-0.056
 Barney-0.053
Ell-0.050
Token with
Token ID351
Feature activation-2.64e-02

Pos logit contributions
upid+0.073
ours+0.072
ry+0.069
 Sandy+0.067
olt+0.067
Neg logit contributions
Brown-0.070
Ell-0.062
 Brown-0.061
 Stretch-0.061
Ar-0.058
Token let
Token ID1309
Feature activation-3.87e-02

Pos logit contributions
Brown+0.001
Ell+0.001
Twe+0.001
 Stretch+0.001
hy+0.001
Neg logit contributions
ours-0.001
ry-0.001
upid-0.001
 Sandy-0.001
 bubble-0.001
Token's
Token ID338
Feature activation-5.53e-02

Pos logit contributions
ours+0.055
 shallow+0.053
apping+0.053
asing+0.052
upid+0.051
Neg logit contributions
 Brown-0.064
Brown-0.064
 Stretch-0.058
 Barney-0.056
Ell-0.055
Token play
Token ID711
Feature activation-6.05e-02

Pos logit contributions
ours+0.098
ry+0.094
ress+0.092
asing+0.088
upid+0.088
Neg logit contributions
Brown-0.072
 Brown-0.066
 Stretch-0.063
Ell-0.059
 Christmas-0.058
Token it
Token ID340
Feature activation-4.87e-03

Pos logit contributions
upid+0.108
ours+0.106
olt+0.101
icking+0.100
 Sandy+0.098
Neg logit contributions
Brown-0.076
 Stretch-0.071
Yo-0.069
Ell-0.068
Ar-0.068
Token.
Token ID13
Feature activation-2.48e-02

Pos logit contributions
ours+0.007
upid+0.007
ry+0.007
 Sandy+0.006
 bubble+0.006
Neg logit contributions
Brown-0.008
Ell-0.008
 Stretch-0.007
Twe-0.007
Ar-0.007
Token One
Token ID1881
Feature activation-9.21e-02

Pos logit contributions
ry+0.041
ours+0.041
upid+0.040
about+0.039
 shallow+0.038
Neg logit contributions
Brown-0.037
 Stretch-0.032
 Brown-0.032
Ell-0.032
Twe-0.030
Token,
Token ID11
Feature activation-4.44e-02

Pos logit contributions
omp+0.183
 Sandy+0.181
idget+0.179
ill+0.177
ours+0.175
Neg logit contributions
 long-0.079
 Christmas-0.078
 de-0.072
bows-0.071
Brown-0.069
Token two
Token ID734
Feature activation-1.94e-02

Pos logit contributions
upid+0.067
ry+0.064
ours+0.064
olt+0.063
ired+0.060
Neg logit contributions
Brown-0.073
 Brown-0.066
 Stretch-0.065
Ar-0.063
Ell-0.062
Token,
Token ID11
Feature activation-4.80e-02

Pos logit contributions
ours+0.027
upid+0.027
ry+0.027
 Sandy+0.025
ress+0.025
Neg logit contributions
Brown-0.032
Ell-0.029
 Stretch-0.028
hy-0.028
 Brown-0.027
Token three
Token ID1115
Feature activation-1.70e-02

Pos logit contributions
upid+0.072
ry+0.071
ours+0.070
olt+0.069
icking+0.066
Neg logit contributions
Brown-0.078
 Brown-0.071
Ell-0.070
 Stretch-0.069
Ar-0.068
Token,
Token ID11
Feature activation-3.89e-02

Pos logit contributions
upid+0.025
ours+0.025
ry+0.024
 Sandy+0.023
icking+0.022
Neg logit contributions
Brown-0.027
Ell-0.025
 Stretch-0.024
Ar-0.023
hy-0.023
Tokenmy
Token ID1820
Feature activation+2.88e-03

Pos logit contributions
upid+0.019
ours+0.019
ry+0.019
 bubble+0.018
about+0.018
Neg logit contributions
Brown-0.021
Ell-0.020
 Brown-0.019
 Stretch-0.018
Twe-0.018
Token,
Token ID11
Feature activation-3.52e-02

Pos logit contributions
Brown+0.005
Ell+0.005
Twe+0.005
 Stretch+0.005
Ar+0.005
Neg logit contributions
ours-0.004
ry-0.004
upid-0.004
 Sandy-0.003
 bubble-0.003
Token can
Token ID460
Feature activation+1.71e-02

Pos logit contributions
upid+0.050
ours+0.047
ry+0.047
about+0.047
apping+0.046
Neg logit contributions
Brown-0.056
 Brown-0.055
 Stretch-0.050
 Barney-0.050
 Christmas-0.050
Token you
Token ID345
Feature activation-3.80e-02

Pos logit contributions
Brown+0.028
Ell+0.028
Ar+0.026
hy+0.025
Twe+0.025
Neg logit contributions
ry-0.024
ours-0.024
 bubble-0.023
upid-0.022
 Sandy-0.022
Token tell
Token ID1560
Feature activation-6.10e-03

Pos logit contributions
ours+0.073
ry+0.070
upid+0.068
 Sandy+0.068
about+0.068
Neg logit contributions
Brown-0.046
Ell-0.040
Twe-0.038
 Brown-0.036
ve-0.034
Token me
Token ID502
Feature activation-9.20e-02

Pos logit contributions
ry+0.010
ours+0.010
upid+0.010
about+0.009
 Sandy+0.009
Neg logit contributions
Brown-0.009
Ell-0.008
Twe-0.007
 Stretch-0.007
 Brown-0.007
Token a
Token ID257
Feature activation-2.27e-02

Pos logit contributions
udge+0.153
omp+0.147
olt+0.146
about+0.145
 goalie+0.141
Neg logit contributions
 Christmas-0.112
Brown-0.103
 Brown-0.102
 Ellie-0.095
 novels-0.088
Token story
Token ID1621
Feature activation-1.45e-02

Pos logit contributions
ours+0.036
ry+0.036
upid+0.034
asing+0.034
ays+0.032
Neg logit contributions
Brown-0.035
Ell-0.033
Twe-0.030
Ar-0.030
 Brown-0.030
Token?"
Token ID1701
Feature activation+1.39e-02

Pos logit contributions
upid+0.017
ours+0.017
ry+0.016
 Sandy+0.016
 bubble+0.015
Neg logit contributions
Brown-0.028
Ell-0.026
 Stretch-0.024
hy-0.024
 Brown-0.024
Token His
Token ID2399
Feature activation-1.79e-03

Pos logit contributions
Brown+0.019
Ell+0.017
Ar+0.017
 Stretch+0.016
ve+0.016
Neg logit contributions
ours-0.024
ry-0.024
 Sandy-0.023
upid-0.023
 foot-0.023
Token mom
Token ID1995
Feature activation+5.92e-03

Pos logit contributions
ry+0.002
ours+0.002
upid+0.002
 Sandy+0.002
 bubble+0.002
Neg logit contributions
Brown-0.003
Ell-0.003
Twe-0.003
 Stretch-0.003
hy-0.003
Token the
Token ID262
Feature activation-3.25e-02

Pos logit contributions
ours+0.007
upid+0.007
ry+0.007
 bubble+0.006
about+0.006
Neg logit contributions
Brown-0.009
Ell-0.009
 Stretch-0.008
 Brown-0.008
Twe-0.008
Token bald
Token ID28273
Feature activation-1.18e-02

Pos logit contributions
ry+0.063
ours+0.061
upid+0.061
ress+0.055
ill+0.055
Neg logit contributions
Brown-0.039
 Stretch-0.036
 Brown-0.036
Ar-0.035
Ell-0.034
Token player
Token ID2137
Feature activation-5.95e-02

Pos logit contributions
upid+0.019
ours+0.019
ry+0.019
 Sandy+0.018
 shallow+0.018
Neg logit contributions
Brown-0.017
Ell-0.015
hy-0.014
 Stretch-0.014
Twe-0.014
Token and
Token ID290
Feature activation-3.11e-02

Pos logit contributions
upid+0.103
ours+0.102
 conclusions+0.094
 shallow+0.093
ry+0.092
Neg logit contributions
Brown-0.075
Ell-0.075
ve-0.068
 Stretch-0.068
Ar-0.066
Token the
Token ID262
Feature activation-4.11e-02

Pos logit contributions
ours+0.050
upid+0.049
ry+0.048
icking+0.044
ays+0.043
Neg logit contributions
Ell-0.050
Brown-0.049
 Stretch-0.045
 Brown-0.044
 Ellie-0.043
Token girl
Token ID2576
Feature activation-9.16e-02

Pos logit contributions
ry+0.081
ours+0.080
upid+0.078
ays+0.070
icking+0.069
Neg logit contributions
Brown-0.050
 Stretch-0.046
Ar-0.046
 Brown-0.046
Ell-0.045
Token skipped
Token ID26684
Feature activation-4.17e-03

Pos logit contributions
ours+0.194
upid+0.192
 conclusions+0.191
ainer+0.186
udge+0.181
Neg logit contributions
Ell-0.076
ve-0.069
 Kirst-0.063
hy-0.062
Brown-0.057
Token up
Token ID510
Feature activation-1.59e-02

Pos logit contributions
ours+0.006
ry+0.006
upid+0.006
 bubble+0.005
asing+0.005
Neg logit contributions
Brown-0.007
Ell-0.007
 Stretch-0.006
Twe-0.006
Ar-0.006
Token and
Token ID290
Feature activation-4.11e-02

Pos logit contributions
upid+0.023
ours+0.022
ry+0.022
asing+0.021
ress+0.020
Neg logit contributions
Ell-0.025
Brown-0.025
 Stretch-0.023
Ar-0.023
 Brown-0.023
Token down
Token ID866
Feature activation-2.88e-02

Pos logit contributions
upid+0.073
ours+0.072
ry+0.068
ava+0.067
ainer+0.067
Neg logit contributions
Ell-0.054
Brown-0.050
 Brown-0.048
 Kirst-0.047
 Stretch-0.046
Token the
Token ID262
Feature activation-1.85e-02

Pos logit contributions
ours+0.044
upid+0.044
asing+0.042
ry+0.041
 ankle+0.040
Neg logit contributions
Ar-0.039
Ell-0.039
 Stretch-0.038
hy-0.038
Brown-0.037
Token ball
Token ID2613
Feature activation-2.95e-02

Pos logit contributions
ry+0.095
ices+0.089
asing+0.087
ours+0.086
ress+0.085
Neg logit contributions
 Ellie-0.061
 Blue-0.060
 Brown-0.059
 Twe-0.059
 Stretch-0.057
Token!
Token ID0
Feature activation-2.82e-02

Pos logit contributions
ours+0.050
upid+0.050
 ankle+0.046
asing+0.046
ry+0.046
Neg logit contributions
Brown-0.038
Ell-0.037
Twe-0.036
hy-0.034
 Stretch-0.034
Token Can
Token ID1680
Feature activation+2.43e-02

Pos logit contributions
upid+0.045
ry+0.045
ours+0.044
about+0.044
 bubble+0.043
Neg logit contributions
Brown-0.041
Ell-0.038
 Stretch-0.037
hy-0.036
Twe-0.036
Token you
Token ID345
Feature activation-1.44e-02

Pos logit contributions
Brown+0.040
Ell+0.039
Ar+0.037
hy+0.035
ve+0.035
Neg logit contributions
ours-0.032
ry-0.032
upid-0.031
 bubble-0.031
 Sandy-0.031
Token teach
Token ID4545
Feature activation-1.57e-02

Pos logit contributions
ours+0.024
ry+0.024
upid+0.023
about+0.022
 Sandy+0.022
Neg logit contributions
Brown-0.020
Ell-0.020
Twe-0.018
hy-0.018
 Stretch-0.017
Token me
Token ID502
Feature activation-9.10e-02

Pos logit contributions
ry+0.031
ours+0.030
upid+0.029
olt+0.029
about+0.028
Neg logit contributions
Brown-0.018
Ell-0.017
 Stretch-0.015
 Brown-0.015
Twe-0.015
Token how
Token ID703
Feature activation-3.97e-02

Pos logit contributions
olt+0.153
 horror+0.146
idget+0.140
about+0.138
 Sandy+0.137
Neg logit contributions
 Christmas-0.102
Brown-0.098
 Ellie-0.095
 novels-0.094
 Blue-0.093
Token to
Token ID284
Feature activation-1.51e-03

Pos logit contributions
ry+0.078
about+0.077
ours+0.076
ill+0.076
icking+0.076
Neg logit contributions
Brown-0.041
Ell-0.038
hy-0.034
Twe-0.034
 Stretch-0.033
Token fly
Token ID6129
Feature activation-3.47e-02

Pos logit contributions
ours+0.002
ry+0.002
upid+0.002
 Sandy+0.002
 bubble+0.002
Neg logit contributions
Brown-0.003
Ell-0.002
Twe-0.002
 Stretch-0.002
hy-0.002
Token?"
Token ID1701
Feature activation+2.72e-03

Pos logit contributions
ours+0.056
ry+0.054
ress+0.053
olt+0.052
icking+0.052
Neg logit contributions
Brown-0.049
Ell-0.046
Twe-0.043
 Stretch-0.043
 Brown-0.043
Token The
Token ID383
Feature activation-9.88e-03

Pos logit contributions
Brown+0.005
Ell+0.004
Twe+0.004
 Stretch+0.004
hy+0.004
Neg logit contributions
ours-0.004
ry-0.004
upid-0.004
 bubble-0.003
 Sandy-0.003
Token dancer
Token ID38619
Feature activation-1.47e-02

Pos logit contributions
Brown+0.001
Ell+0.000
Twe+0.000
 Stretch+0.000
hy+0.000
Neg logit contributions
ours-0.000
ry-0.000
upid-0.000
 Sandy-0.000
 bubble-0.000
Token!
Token ID0
Feature activation-3.62e-02

Pos logit contributions
ours+0.021
upid+0.021
ry+0.020
 Sandy+0.019
 bubble+0.019
Neg logit contributions
Brown-0.024
Ell-0.022
 Stretch-0.021
hy-0.021
 Brown-0.020
Token Can
Token ID1680
Feature activation+1.65e-02

Pos logit contributions
ry+0.065
upid+0.063
ours+0.063
 shallow+0.061
 bubble+0.061
Neg logit contributions
Brown-0.047
Ell-0.044
 Stretch-0.042
hy-0.042
 Brown-0.041
Token you
Token ID345
Feature activation-4.44e-02

Pos logit contributions
Brown+0.028
Ell+0.027
Ar+0.026
hy+0.025
 Stretch+0.025
Neg logit contributions
ry-0.021
ours-0.021
upid-0.021
 Sandy-0.021
 bubble-0.020
Token teach
Token ID4545
Feature activation+3.07e-03

Pos logit contributions
ours+0.081
ry+0.071
ays+0.070
ress+0.067
about+0.066
Neg logit contributions
Brown-0.062
Ell-0.060
 Stretch-0.055
Twe-0.054
Ar-0.054
Token me
Token ID502
Feature activation-9.02e-02

Pos logit contributions
Brown+0.004
Ell+0.004
Twe+0.004
 Stretch+0.004
Ar+0.004
Neg logit contributions
ours-0.005
ry-0.005
upid-0.005
 Sandy-0.005
 bubble-0.005
Token?"
Token ID1701
Feature activation+3.32e-02

Pos logit contributions
olt+0.139
 horror+0.136
ours+0.125
inter+0.124
 anth+0.122
Neg logit contributions
Brown-0.113
 Christmas-0.111
 novels-0.103
Ell-0.102
 Stretch-0.102
Token Lucy
Token ID22162
Feature activation-1.49e-02

Pos logit contributions
Brown+0.066
Ell+0.061
Ar+0.060
ve+0.059
Twe+0.058
Neg logit contributions
 Sandy-0.037
ours-0.034
about-0.032
ry-0.032
ress-0.031
Token smiled
Token ID13541
Feature activation-5.54e-03

Pos logit contributions
upid+0.022
ours+0.021
ry+0.021
 shallow+0.019
asing+0.019
Neg logit contributions
Brown-0.024
Ell-0.023
 Stretch-0.021
Twe-0.021
 Brown-0.021
Token and
Token ID290
Feature activation-2.24e-02

Pos logit contributions
ours+0.007
upid+0.007
ry+0.007
 bubble+0.006
 Sandy+0.006
Neg logit contributions
Brown-0.010
Ell-0.009
 Stretch-0.009
Twe-0.009
 Brown-0.008
Token said
Token ID531
Feature activation-1.14e-02

Pos logit contributions
upid+0.041
ry+0.038
ours+0.038
 conclusions+0.036
 foot+0.035
Neg logit contributions
Brown-0.028
Ell-0.028
 Stretch-0.026
 Brown-0.026
 Barney-0.026
Token like
Token ID588
Feature activation-2.76e-02

Pos logit contributions
ours+0.020
ry+0.019
upid+0.019
ress+0.018
 Sandy+0.018
Neg logit contributions
Brown-0.023
Ell-0.021
Ar-0.021
Twe-0.020
ve-0.020
Token to
Token ID284
Feature activation-1.54e-02

Pos logit contributions
ry+0.045
upid+0.041
olt+0.040
ours+0.040
asing+0.039
Neg logit contributions
Brown-0.041
 Brown-0.038
 Stretch-0.036
Ell-0.036
 Christmas-0.036
Token stay
Token ID2652
Feature activation-1.19e-02

Pos logit contributions
ours+0.021
ry+0.021
upid+0.020
asing+0.019
ress+0.018
Neg logit contributions
Brown-0.026
Ell-0.024
Twe-0.024
hy-0.024
Ar-0.023
Token in
Token ID287
Feature activation-4.18e-02

Pos logit contributions
ours+0.019
upid+0.019
ry+0.019
about+0.017
ress+0.017
Neg logit contributions
Brown-0.017
Ell-0.016
 Stretch-0.015
Twe-0.015
Ar-0.015
Token the
Token ID262
Feature activation-3.31e-02

Pos logit contributions
ry+0.073
ours+0.072
asing+0.069
icking+0.068
apping+0.066
Neg logit contributions
Brown-0.055
 Brown-0.052
 Christmas-0.051
 Stretch-0.050
Ar-0.049
Token shallow
Token ID19337
Feature activation-8.95e-02

Pos logit contributions
asing+0.055
ry+0.055
ress+0.052
ful+0.051
inda+0.050
Neg logit contributions
Ar-0.044
Brown-0.044
Yo-0.043
hy-0.042
 saf-0.042
Token end
Token ID886
Feature activation-4.15e-02

Pos logit contributions
asing+0.164
ours+0.155
 ankle+0.143
isted+0.142
icking+0.139
Neg logit contributions
Ar-0.111
 saf-0.098
Yo-0.094
pe-0.088
 road-0.086
Token."
Token ID526
Feature activation+9.07e-03

Pos logit contributions
upid+0.071
 ankle+0.070
ours+0.068
asing+0.067
icking+0.066
Neg logit contributions
Brown-0.049
Ell-0.045
Ar-0.043
Twe-0.042
 Brown-0.042
Token\n
Token ID198
Feature activation-1.23e-03

Pos logit contributions
Brown+0.016
Ell+0.015
Twe+0.014
hy+0.014
 Stretch+0.014
Neg logit contributions
ours-0.012
ry-0.012
upid-0.012
 Sandy-0.012
 bubble-0.011
Token\n
Token ID198
Feature activation+4.83e-04

Pos logit contributions
ours+0.002
ry+0.002
upid+0.002
 Sandy+0.002
 bubble+0.002
Neg logit contributions
Brown-0.002
Ell-0.002
Twe-0.002
 Stretch-0.002
Ar-0.002
TokenBen
Token ID11696
Feature activation-3.78e-02

Pos logit contributions
Brown+0.001
Ell+0.001
Twe+0.001
 Stretch+0.001
Ar+0.001
Neg logit contributions
ours-0.001
ry-0.001
upid-0.001
 Sandy-0.001
 bubble-0.001
Token he
Token ID339
Feature activation-1.03e-02

Pos logit contributions
Brown+0.024
Ell+0.023
Ar+0.020
Twe+0.020
hy+0.020
Neg logit contributions
ry-0.026
ours-0.026
 Sandy-0.026
 bubble-0.026
 foot-0.026
Token found
Token ID1043
Feature activation-1.27e-02

Pos logit contributions
upid+0.017
ours+0.017
ry+0.017
asing+0.016
 bubble+0.015
Neg logit contributions
Brown-0.015
Ell-0.014
Twe-0.013
Ar-0.013
 Brown-0.012
Token a
Token ID257
Feature activation-6.18e-03

Pos logit contributions
ry+0.020
ours+0.019
upid+0.019
about+0.018
apping+0.017
Neg logit contributions
Brown-0.020
Ell-0.019
 Brown-0.018
 Stretch-0.018
 Barney-0.017
Token can
Token ID460
Feature activation-2.52e-02

Pos logit contributions
ours+0.011
ry+0.010
upid+0.010
 Sandy+0.010
asing+0.010
Neg logit contributions
Brown-0.008
Ell-0.008
Twe-0.007
 Stretch-0.007
hy-0.007
Token of
Token ID286
Feature activation-1.42e-02

Pos logit contributions
ours+0.025
upid+0.023
ry+0.021
about+0.020
 shallow+0.020
Neg logit contributions
Brown-0.053
Ell-0.051
hy-0.050
Twe-0.049
 Brown-0.048
Token spray
Token ID11662
Feature activation-8.95e-02

Pos logit contributions
ry+0.024
ours+0.023
upid+0.022
ress+0.021
asing+0.021
Neg logit contributions
Brown-0.020
Ell-0.019
 Stretch-0.018
 Brown-0.018
Twe-0.018
Token paint
Token ID7521
Feature activation-3.67e-02

Pos logit contributions
ress+0.145
ry+0.133
asing+0.131
upid+0.131
ours+0.130
Neg logit contributions
 Twe-0.103
 past-0.100
hy-0.099
 Ellie-0.098
 Brown-0.097
Token in
Token ID287
Feature activation-2.31e-02

Pos logit contributions
upid+0.062
ours+0.060
ry+0.058
ress+0.055
 Sandy+0.055
Neg logit contributions
Brown-0.050
Ell-0.048
hy-0.046
 Brown-0.044
Twe-0.043
Token the
Token ID262
Feature activation-3.71e-02

Pos logit contributions
ours+0.035
ry+0.034
upid+0.033
asing+0.032
apping+0.032
Neg logit contributions
Brown-0.038
 Brown-0.035
Ell-0.035
Ar-0.033
Twe-0.033
Token woods
Token ID16479
Feature activation-2.57e-02

Pos logit contributions
ry+0.060
ours+0.060
upid+0.060
ress+0.058
omp+0.056
Neg logit contributions
Brown-0.055
Ell-0.054
Twe-0.052
Red-0.046
hy-0.046
Token.
Token ID13
Feature activation-3.92e-03

Pos logit contributions
ours+0.044
upid+0.044
ry+0.041
 Sandy+0.041
 bubble+0.040
Neg logit contributions
Brown-0.033
hy-0.031
Ell-0.031
pe-0.029
 Brown-0.029
Token can
Token ID460
Feature activation+2.17e-02

Pos logit contributions
ours+0.002
ry+0.002
upid+0.002
 Sandy+0.002
 bubble+0.002
Neg logit contributions
Brown-0.004
Ell-0.004
Twe-0.003
 Stretch-0.003
hy-0.003
Token have
Token ID423
Feature activation+2.32e-02

Pos logit contributions
Brown+0.038
Ell+0.037
Ar+0.034
Twe+0.033
 Stretch+0.033
Neg logit contributions
upid-0.028
ry-0.027
 bubble-0.026
 foot-0.026
 Sandy-0.026
Token the
Token ID262
Feature activation+2.87e-02

Pos logit contributions
Brown+0.046
Ell+0.043
Twe+0.041
hy+0.041
Ar+0.040
Neg logit contributions
upid-0.024
ours-0.024
 Sandy-0.024
 bubble-0.024
ry-0.023
Token steak
Token ID26320
Feature activation+2.04e-02

Pos logit contributions
Ell+0.049
Brown+0.048
Twe+0.045
 Kirst+0.042
hy+0.040
Neg logit contributions
 bubble-0.040
 foot-0.040
upid-0.040
ours-0.040
 hand-0.040
Token,
Token ID11
Feature activation-4.04e-02

Pos logit contributions
Brown+0.038
Ell+0.037
Twe+0.034
Ar+0.034
 Stretch+0.033
Neg logit contributions
ry-0.025
upid-0.024
ours-0.024
 Sandy-0.023
 foot-0.023
Token L
Token ID406
Feature activation-8.91e-02

Pos logit contributions
ours+0.060
upid+0.058
ry+0.057
about+0.054
ays+0.053
Neg logit contributions
Brown-0.064
 Brown-0.059
Ell-0.057
Twe-0.057
hy-0.056
Tokenula
Token ID4712
Feature activation+1.95e-02

Pos logit contributions
 shallow+0.118
 reel+0.111
 amaz+0.111
 goalie+0.109
 horror+0.109
Neg logit contributions
ve-0.163
Twe-0.131
Ell-0.125
Brown-0.116
Red-0.114
Token".
Token ID1911
Feature activation-1.37e-02

Pos logit contributions
Brown+0.037
Ell+0.036
Twe+0.033
Ar+0.033
 Stretch+0.032
Neg logit contributions
ry-0.024
ours-0.023
upid-0.023
 Sandy-0.021
 bubble-0.021
Token She
Token ID1375
Feature activation-2.33e-04

Pos logit contributions
upid+0.019
ours+0.019
ry+0.018
 bubble+0.017
asing+0.017
Neg logit contributions
Brown-0.023
Ell-0.022
hy-0.021
Twe-0.020
 Brown-0.020
Token turned
Token ID2900
Feature activation-5.27e-03

Pos logit contributions
ours+0.000
upid+0.000
ry+0.000
 Sandy+0.000
 bubble+0.000
Neg logit contributions
Brown-0.000
Ell-0.000
Twe-0.000
 Stretch-0.000
Ar-0.000
Token around
Token ID1088
Feature activation+8.49e-03

Pos logit contributions
ours+0.007
upid+0.007
ry+0.007
 Sandy+0.006
 bubble+0.006
Neg logit contributions
Brown-0.009
Ell-0.008
 Stretch-0.008
hy-0.008
Twe-0.008
Token.
Token ID13
Feature activation-7.93e-03

Pos logit contributions
upid+0.057
ours+0.056
 ankle+0.051
ry+0.051
ress+0.050
Neg logit contributions
Ell-0.064
Brown-0.063
hy-0.060
 Brown-0.059
ve-0.058
Token She
Token ID1375
Feature activation-6.18e-02

Pos logit contributions
ry+0.013
upid+0.013
ours+0.012
 bubble+0.012
asing+0.012
Neg logit contributions
Brown-0.012
Ell-0.011
Twe-0.010
 Brown-0.010
 Stretch-0.010
Token showed
Token ID3751
Feature activation-8.04e-03

Pos logit contributions
upid+0.089
 conclusions+0.081
olt+0.075
 goalie+0.072
iece+0.072
Neg logit contributions
Ell-0.107
Brown-0.105
ve-0.104
 Brown-0.097
Ar-0.096
Token Sarah
Token ID10490
Feature activation-4.12e-02

Pos logit contributions
ry+0.012
ours+0.012
upid+0.012
 bubble+0.011
 shallow+0.011
Neg logit contributions
Brown-0.013
Ell-0.012
 Brown-0.011
Twe-0.011
 Stretch-0.011
Token how
Token ID703
Feature activation-4.84e-02

Pos logit contributions
upid+0.054
olt+0.052
ry+0.048
ours+0.047
asing+0.047
Neg logit contributions
Brown-0.072
 Brown-0.068
Ell-0.068
 Kirst-0.063
 Stretch-0.063
Token to
Token ID284
Feature activation-8.88e-02

Pos logit contributions
about+0.101
ress+0.099
 goalie+0.097
upid+0.096
ry+0.095
Neg logit contributions
Ell-0.037
ve-0.037
Brown-0.036
 long-0.034
Twe-0.033
Token spray
Token ID11662
Feature activation-7.66e-02

Pos logit contributions
ress+0.146
ired+0.136
iece+0.135
apping+0.134
asing+0.134
Neg logit contributions
Twe-0.115
Brown-0.109
 Twe-0.102
Ell-0.100
 Brown-0.098
Token some
Token ID617
Feature activation-8.37e-02

Pos logit contributions
ress+0.136
apping+0.124
olt+0.123
 goalie+0.122
ired+0.122
Neg logit contributions
 Brown-0.089
 Twe-0.085
 Ellie-0.083
 Blue-0.081
Brown-0.080
Token nice
Token ID3621
Feature activation-8.44e-02

Pos logit contributions
ress+0.141
olt+0.123
ill+0.123
upid+0.121
ours+0.118
Neg logit contributions
 long-0.107
Brown-0.105
Twe-0.098
ve-0.098
 de-0.098
Token smelling
Token ID48090
Feature activation-8.64e-02

Pos logit contributions
ress+0.140
olt+0.139
icking+0.134
upid+0.133
 ankle+0.133
Neg logit contributions
 past-0.091
 Brown-0.089
bows-0.088
 de-0.087
Brown-0.083
Token liquid
Token ID8122
Feature activation-2.59e-02

Pos logit contributions
ress+0.139
olt+0.130
 goalie+0.129
upid+0.127
ired+0.127
Neg logit contributions
 de-0.105
 Brown-0.102
 Twe-0.102
Brown-0.098
 Kirst-0.097
Token cat
Token ID3797
Feature activation+9.85e-03

Pos logit contributions
ry+0.040
upid+0.038
ours+0.038
ill+0.035
asing+0.035
Neg logit contributions
Brown-0.032
Ell-0.029
hy-0.028
 Brown-0.027
 Stretch-0.027
Token said
Token ID531
Feature activation-8.92e-03

Pos logit contributions
Brown+0.017
Ell+0.016
Twe+0.015
Ar+0.015
 Stretch+0.015
Neg logit contributions
ry-0.013
ours-0.013
upid-0.012
 bubble-0.012
 Sandy-0.012
Token,
Token ID11
Feature activation-3.92e-03

Pos logit contributions
ours+0.013
upid+0.013
ry+0.012
 bubble+0.012
 shallow+0.011
Neg logit contributions
Brown-0.014
Ell-0.014
 Stretch-0.013
 Brown-0.013
hy-0.013
Token "
Token ID366
Feature activation+1.18e-02

Pos logit contributions
upid+0.005
ours+0.005
ry+0.005
 Sandy+0.004
 bubble+0.004
Neg logit contributions
Brown-0.007
Ell-0.007
 Stretch-0.006
Twe-0.006
 Brown-0.006
TokenMar
Token ID7676
Feature activation-1.97e-02

Pos logit contributions
Brown+0.019
Ell+0.018
 Stretch+0.017
 Barney+0.017
 Brown+0.016
Neg logit contributions
ry-0.017
ours-0.017
upid-0.016
 bubble-0.016
 Sandy-0.016
Tokenriage
Token ID4087
Feature activation-8.85e-02

Pos logit contributions
upid+0.029
ours+0.028
 bubble+0.028
 Sandy+0.028
 shallow+0.027
Neg logit contributions
Brown-0.031
Ell-0.028
hy-0.027
Twe-0.027
 Stretch-0.027
Token is
Token ID318
Feature activation-1.75e-02

Pos logit contributions
 shallow+0.162
upid+0.162
olt+0.159
 ankle+0.158
ours+0.157
Neg logit contributions
Brown-0.091
 Brown-0.084
pe-0.080
 Stretch-0.079
Ar-0.077
Token when
Token ID618
Feature activation-1.45e-03

Pos logit contributions
ours+0.022
upid+0.022
ry+0.021
about+0.020
 bubble+0.019
Neg logit contributions
Brown-0.031
Ell-0.030
Twe-0.028
 Stretch-0.028
 Brown-0.028
Token two
Token ID734
Feature activation+6.46e-03

Pos logit contributions
ours+0.002
ry+0.002
upid+0.002
 bubble+0.002
 Sandy+0.002
Neg logit contributions
Brown-0.002
Ell-0.002
Twe-0.002
 Stretch-0.002
hy-0.002
Token animals
Token ID4695
Feature activation-5.46e-03

Pos logit contributions
Brown+0.012
Ell+0.011
Twe+0.010
 Stretch+0.010
Ar+0.010
Neg logit contributions
ours-0.008
ry-0.008
upid-0.008
 bubble-0.008
 Sandy-0.007
Token love
Token ID1842
Feature activation+1.02e-02

Pos logit contributions
ours+0.007
ry+0.007
upid+0.007
 Sandy+0.006
 bubble+0.006
Neg logit contributions
Brown-0.010
Ell-0.009
 Stretch-0.009
Twe-0.009
hy-0.009
Token good
Token ID922
Feature activation-1.09e-02

Pos logit contributions
Brown+0.007
Ell+0.007
Twe+0.006
 Stretch+0.006
Ar+0.006
Neg logit contributions
ours-0.008
upid-0.008
ry-0.008
 bubble-0.007
 Sandy-0.007
Token!
Token ID0
Feature activation-3.91e-02

Pos logit contributions
ours+0.015
upid+0.015
ry+0.015
 shallow+0.014
 bubble+0.014
Neg logit contributions
Brown-0.018
Ell-0.017
 Brown-0.016
hy-0.016
 Stretch-0.016
Token Can
Token ID1680
Feature activation+2.39e-02

Pos logit contributions
ry+0.070
upid+0.068
 shallow+0.068
ours+0.067
 bubble+0.066
Neg logit contributions
Brown-0.052
 Brown-0.046
Ell-0.046
hy-0.045
Twe-0.045
Token you
Token ID345
Feature activation-3.63e-02

Pos logit contributions
Brown+0.039
Ell+0.038
Ar+0.037
hy+0.035
ve+0.034
Neg logit contributions
ry-0.032
ours-0.032
upid-0.030
 Sandy-0.030
 bubble-0.030
Token teach
Token ID4545
Feature activation+1.29e-02

Pos logit contributions
ours+0.066
ry+0.061
ays+0.059
upid+0.055
ress+0.054
Neg logit contributions
Brown-0.052
Ell-0.049
Twe-0.046
hy-0.045
 Brown-0.043
Token me
Token ID502
Feature activation-8.84e-02

Pos logit contributions
Brown+0.021
Ell+0.019
Ar+0.018
Twe+0.018
hy+0.018
Neg logit contributions
upid-0.019
ours-0.019
ry-0.018
 bubble-0.018
 Sandy-0.018
Token how
Token ID703
Feature activation-3.41e-02

Pos logit contributions
 horror+0.137
olt+0.136
 anth+0.131
 Sandy+0.123
inter+0.122
Neg logit contributions
Brown-0.113
 Christmas-0.111
 novels-0.108
 him-0.099
 Brown-0.099
Token to
Token ID284
Feature activation-9.48e-03

Pos logit contributions
ry+0.064
ours+0.064
 Sandy+0.063
ill+0.063
icking+0.063
Neg logit contributions
Brown-0.039
Ell-0.034
hy-0.032
 Stretch-0.031
Twe-0.030
Token play
Token ID711
Feature activation-4.70e-02

Pos logit contributions
ours+0.014
ry+0.014
upid+0.013
 Sandy+0.013
ress+0.012
Neg logit contributions
Brown-0.016
Ell-0.014
Twe-0.014
 Stretch-0.013
Ar-0.013
Token too
Token ID1165
Feature activation-6.14e-02

Pos logit contributions
upid+0.082
ours+0.081
 Sandy+0.079
ry+0.079
olt+0.077
Neg logit contributions
Brown-0.066
Ell-0.060
 Stretch-0.058
Red-0.057
 Brown-0.056
Token?"
Token ID1701
Feature activation+1.42e-02

Pos logit contributions
 ankle+0.111
asing+0.104
ours+0.104
 Sandy+0.103
upid+0.103
Neg logit contributions
Brown-0.072
 Brown-0.066
 Stretch-0.064
 Christmas-0.063
Ell-0.063
Token and
Token ID290
Feature activation+6.91e-04

Pos logit contributions
 conclusions+0.121
upid+0.120
ours+0.110
 ankle+0.103
 shallow+0.102
Neg logit contributions
Brown-0.080
Ell-0.079
ve-0.070
 Brown-0.066
Ar-0.064
Token the
Token ID262
Feature activation-1.33e-03

Pos logit contributions
Brown+0.001
Ell+0.001
Twe+0.001
 Stretch+0.001
hy+0.001
Neg logit contributions
ours-0.001
ry-0.001
upid-0.001
 Sandy-0.001
 bubble-0.001
Token baby
Token ID5156
Feature activation-6.96e-02

Pos logit contributions
ry+0.002
ours+0.002
upid+0.002
 Sandy+0.002
 bubble+0.002
Neg logit contributions
Brown-0.002
Ell-0.002
Twe-0.002
 Stretch-0.002
hy-0.002
Token walked
Token ID6807
Feature activation+2.84e-03

Pos logit contributions
upid+0.122
 conclusions+0.107
olt+0.104
ours+0.104
 ankle+0.103
Neg logit contributions
Ell-0.090
Brown-0.087
ve-0.085
 Kirst-0.082
 Brown-0.081
Token together
Token ID1978
Feature activation-2.89e-02

Pos logit contributions
Brown+0.005
Ell+0.005
Twe+0.004
 Stretch+0.004
hy+0.004
Neg logit contributions
ry-0.004
ours-0.004
upid-0.004
 Sandy-0.003
 bubble-0.003
Token to
Token ID284
Feature activation-8.84e-02

Pos logit contributions
ours+0.053
upid+0.052
asing+0.048
ry+0.048
ress+0.048
Neg logit contributions
Ell-0.031
ve-0.031
Brown-0.030
 Stretch-0.029
 Brown-0.029
Token the
Token ID262
Feature activation-5.90e-02

Pos logit contributions
asing+0.200
ry+0.193
ays+0.192
iness+0.192
ares+0.190
Neg logit contributions
 visit-0.080
 Brown-0.076
 Ellie-0.074
 long-0.072
 Twe-0.071
Token garden
Token ID11376
Feature activation-3.52e-02

Pos logit contributions
ry+0.110
asing+0.104
run+0.104
omp+0.102
ours+0.100
Neg logit contributions
Ell-0.072
Brown-0.068
Twe-0.065
 ranch-0.065
Yo-0.064
Token.
Token ID13
Feature activation-6.43e-03

Pos logit contributions
upid+0.063
ours+0.060
asing+0.056
ry+0.055
 ankle+0.055
Neg logit contributions
Ell-0.042
Brown-0.041
 Brown-0.039
pe-0.038
hy-0.038
Token When
Token ID1649
Feature activation+1.44e-03

Pos logit contributions
ry+0.011
upid+0.011
ours+0.011
 bubble+0.011
asing+0.010
Neg logit contributions
Brown-0.009
Ell-0.008
 Stretch-0.007
 Brown-0.007
Twe-0.007
Token they
Token ID484
Feature activation-5.09e-02

Pos logit contributions
Brown+0.002
Ell+0.002
Twe+0.002
 Stretch+0.002
hy+0.002
Neg logit contributions
ours-0.002
ry-0.002
upid-0.002
 Sandy-0.002
 bubble-0.002
Token a
Token ID257
Feature activation+8.91e-03

Pos logit contributions
Brown+0.029
Ell+0.028
Twe+0.026
hy+0.026
 Stretch+0.026
Neg logit contributions
ours-0.015
upid-0.015
 Sandy-0.015
 bubble-0.015
ry-0.014
Token boy
Token ID2933
Feature activation-3.64e-03

Pos logit contributions
Brown+0.017
Ell+0.016
Twe+0.015
 Stretch+0.015
hy+0.015
Neg logit contributions
ry-0.010
ours-0.010
upid-0.010
 bubble-0.009
 Sandy-0.009
Token named
Token ID3706
Feature activation+2.30e-02

Pos logit contributions
upid+0.004
ours+0.004
ry+0.004
 Sandy+0.004
 bubble+0.004
Neg logit contributions
Brown-0.007
Ell-0.006
Twe-0.006
 Stretch-0.006
hy-0.006
Token Jack
Token ID3619
Feature activation+5.96e-03

Pos logit contributions
Brown+0.041
Ell+0.040
ve+0.037
Twe+0.037
pe+0.036
Neg logit contributions
ours-0.028
 Sandy-0.028
upid-0.028
ry-0.027
 bubble-0.027
Token.
Token ID13
Feature activation+2.28e-02

Pos logit contributions
Brown+0.011
Ell+0.010
Twe+0.010
 Stretch+0.009
Ar+0.009
Neg logit contributions
ry-0.008
ours-0.007
upid-0.007
 bubble-0.007
 Sandy-0.007
Token Jack
Token ID3619
Feature activation+5.94e-02

Pos logit contributions
Brown+0.042
Ell+0.039
Twe+0.038
ve+0.037
 Kirst+0.036
Neg logit contributions
 Sandy-0.028
upid-0.028
ours-0.028
ry-0.027
 bubble-0.027
Token was
Token ID373
Feature activation+2.44e-02

Pos logit contributions
Ell+0.118
Brown+0.117
ve+0.117
 Kirst+0.115
Ar+0.113
Neg logit contributions
ry-0.055
 bubble-0.051
 foot-0.050
's-0.047
 bat-0.044
Token a
Token ID257
Feature activation+2.84e-02

Pos logit contributions
Brown+0.046
Ell+0.044
Twe+0.040
Ar+0.040
ve+0.040
Neg logit contributions
 foot-0.028
ours-0.028
 Sandy-0.028
 bubble-0.027
 twin-0.027
Token clever
Token ID14169
Feature activation-1.78e-02

Pos logit contributions
Brown+0.049
Ell+0.049
Twe+0.043
 Kirst+0.043
 Stretch+0.042
Neg logit contributions
 foot-0.037
 bubble-0.037
ry-0.036
ours-0.035
 twin-0.034
Token boy
Token ID2933
Feature activation-5.97e-03

Pos logit contributions
upid+0.027
ours+0.027
ry+0.026
asing+0.024
 Sandy+0.024
Neg logit contributions
Brown-0.028
Ell-0.025
 Brown-0.025
 Stretch-0.025
hy-0.024
Token who
Token ID508
Feature activation+3.12e-02

Pos logit contributions
upid+0.007
ours+0.007
ry+0.007
 Sandy+0.007
 bubble+0.006
Neg logit contributions
Brown-0.011
Ell-0.011
Twe-0.010
 Stretch-0.010
hy-0.010
Tokenia
Token ID544
Feature activation+9.00e-03

Pos logit contributions
ours+0.013
upid+0.013
ry+0.012
 bubble+0.012
 Sandy+0.012
Neg logit contributions
Brown-0.014
Ell-0.013
Twe-0.013
hy-0.012
 Stretch-0.012
Token looked
Token ID3114
Feature activation+3.18e-03

Pos logit contributions
Brown+0.016
Ell+0.015
Twe+0.015
 Stretch+0.014
Ar+0.014
Neg logit contributions
ry-0.011
ours-0.011
upid-0.011
 bubble-0.010
 Sandy-0.010
Token up
Token ID510
Feature activation+2.50e-02

Pos logit contributions
Brown+0.006
Ell+0.005
Twe+0.005
 Stretch+0.005
Ar+0.005
Neg logit contributions
ours-0.004
ry-0.004
upid-0.004
 Sandy-0.004
 bubble-0.004
Token and
Token ID290
Feature activation+2.03e-02

Pos logit contributions
Brown+0.058
Ell+0.054
Twe+0.051
Ar+0.050
 Stretch+0.050
Neg logit contributions
ry-0.023
 Sandy-0.022
ours-0.021
 bubble-0.020
 foot-0.020
Token saw
Token ID2497
Feature activation-1.43e-02

Pos logit contributions
Brown+0.038
Ell+0.036
Twe+0.034
Ar+0.034
hy+0.034
Neg logit contributions
ry-0.026
ours-0.024
 Sandy-0.024
 bubble-0.023
upid-0.022
Token the
Token ID262
Feature activation+6.03e-02

Pos logit contributions
upid+0.020
ry+0.020
ours+0.020
asing+0.018
about+0.018
Neg logit contributions
Brown-0.024
Ell-0.022
 Brown-0.022
 Stretch-0.022
 Barney-0.021
Token diamond
Token ID15291
Feature activation+1.71e-02

Pos logit contributions
Ell+0.105
Brown+0.104
Twe+0.097
hy+0.095
Ar+0.094
Neg logit contributions
 bubble-0.078
 foot-0.074
 hand-0.072
 shell-0.069
 eye-0.069
Token too
Token ID1165
Feature activation+2.67e-02

Pos logit contributions
Brown+0.031
Ell+0.029
Twe+0.027
Ar+0.027
 Stretch+0.026
Neg logit contributions
ry-0.023
ours-0.022
 bubble-0.021
upid-0.021
 foot-0.021
Token.
Token ID13
Feature activation+1.10e-02

Pos logit contributions
Brown+0.057
Ell+0.055
Twe+0.051
Ar+0.050
Mic+0.050
Neg logit contributions
ry-0.029
 bubble-0.027
ours-0.025
 Sandy-0.025
upid-0.024
Token "
Token ID366
Feature activation+1.37e-03

Pos logit contributions
Brown+0.021
Ell+0.020
Twe+0.018
Ar+0.018
hy+0.018
Neg logit contributions
ours-0.013
ry-0.013
upid-0.013
 Sandy-0.013
 bubble-0.012
TokenLet
Token ID5756
Feature activation+1.16e-02

Pos logit contributions
Brown+0.002
Ell+0.002
Twe+0.002
 Stretch+0.002
hy+0.002
Neg logit contributions
ours-0.002
ry-0.002
upid-0.002
 Sandy-0.002
 bubble-0.002
Token a
Token ID257
Feature activation+2.77e-02

Pos logit contributions
Brown+0.045
Ell+0.043
Twe+0.042
 Stretch+0.041
Ar+0.041
Neg logit contributions
 bubble-0.019
 Sandy-0.019
 foot-0.019
ours-0.018
upid-0.018
Token big
Token ID1263
Feature activation+1.32e-02

Pos logit contributions
Brown+0.047
Ell+0.045
Twe+0.042
 Stretch+0.041
Ar+0.041
Neg logit contributions
 bubble-0.039
 foot-0.037
ours-0.036
 twin-0.035
 hand-0.034
Token t
Token ID256
Feature activation+3.21e-02

Pos logit contributions
Brown+0.025
Ell+0.024
Twe+0.023
Ar+0.022
 Stretch+0.022
Neg logit contributions
ours-0.015
upid-0.015
ry-0.015
 bubble-0.015
 foot-0.014
Tokeneddy
Token ID21874
Feature activation+5.30e-03

Pos logit contributions
Ell+0.047
Brown+0.047
Ar+0.043
 Stretch+0.043
Twe+0.042
Neg logit contributions
ry-0.053
ours-0.050
 foot-0.049
ress-0.048
upid-0.047
Token bear
Token ID6842
Feature activation-5.88e-04

Pos logit contributions
Brown+0.008
Ell+0.008
Twe+0.007
 Stretch+0.007
Ar+0.007
Neg logit contributions
ry-0.008
ours-0.008
upid-0.008
 bubble-0.008
 Sandy-0.007
Token that
Token ID326
Feature activation+6.39e-02

Pos logit contributions
ours+0.001
upid+0.001
ry+0.001
 Sandy+0.001
 bubble+0.001
Neg logit contributions
Brown-0.001
Ell-0.001
Twe-0.001
 Stretch-0.001
hy-0.001
Token she
Token ID673
Feature activation+5.28e-02

Pos logit contributions
ve+0.118
Brown+0.117
Ar+0.116
Ell+0.112
Twe+0.111
Neg logit contributions
 Sandy-0.072
 Linda-0.072
 bubble-0.071
 M-0.066
 Mia-0.066
Token loved
Token ID6151
Feature activation+4.68e-02

Pos logit contributions
Ell+0.100
Brown+0.097
 Barney+0.093
Frank+0.092
 Kirst+0.091
Neg logit contributions
 bubble-0.049
ry-0.049
 foot-0.048
 toe-0.046
 sprayed-0.046
Token to
Token ID284
Feature activation+2.70e-02

Pos logit contributions
Brown+0.101
Ell+0.099
Twe+0.099
Ar+0.096
ve+0.094
Neg logit contributions
ful-0.039
 bubble-0.036
 foot-0.035
 how-0.034
 Sandy-0.033
Token play
Token ID711
Feature activation+2.64e-02

Pos logit contributions
Brown+0.052
Ell+0.052
ve+0.047
Ar+0.047
 Stretch+0.046
Neg logit contributions
 bubble-0.030
 foot-0.028
upid-0.028
ry-0.028
 Sandy-0.027
Token with
Token ID351
Feature activation+1.76e-02

Pos logit contributions
Brown+0.049
hy+0.046
Ell+0.046
Twe+0.044
 Barney+0.043
Neg logit contributions
 bubble-0.031
ry-0.029
 foot-0.028
about-0.027
ours-0.027
Token,
Token ID11
Feature activation+1.78e-02

Pos logit contributions
ours+0.003
upid+0.003
ry+0.003
 Sandy+0.002
 bubble+0.002
Neg logit contributions
Brown-0.004
Ell-0.004
Twe-0.004
 Stretch-0.004
hy-0.004
Token Charlie
Token ID11526
Feature activation+3.47e-02

Pos logit contributions
Brown+0.034
Ell+0.032
Ar+0.030
Twe+0.030
hy+0.030
Neg logit contributions
 Sandy-0.020
ours-0.019
ry-0.019
 bubble-0.019
upid-0.018
Token and
Token ID290
Feature activation+2.11e-02

Pos logit contributions
Brown+0.078
Ell+0.076
Twe+0.074
Ar+0.072
hy+0.072
Neg logit contributions
ry-0.028
 bubble-0.023
ful-0.023
 Sandy-0.023
ours-0.022
Token Lucy
Token ID22162
Feature activation+2.46e-02

Pos logit contributions
Brown+0.037
Ell+0.034
Twe+0.032
ve+0.032
Ar+0.032
Neg logit contributions
 Sandy-0.028
upid-0.026
ours-0.026
 bubble-0.026
ry-0.026
Token.
Token ID13
Feature activation+4.86e-02

Pos logit contributions
Brown+0.050
Ell+0.048
Ar+0.045
Twe+0.045
 Stretch+0.044
Neg logit contributions
ry-0.026
ours-0.024
upid-0.024
ful-0.023
 foot-0.023
Token Charlie
Token ID11526
Feature activation+6.56e-02

Pos logit contributions
Brown+0.092
Twe+0.090
ve+0.088
hy+0.085
Ell+0.081
Neg logit contributions
 Sandy-0.052
ours-0.050
 bubble-0.048
upid-0.048
 Sam-0.047
Token had
Token ID550
Feature activation+2.94e-02

Pos logit contributions
Brown+0.147
Ell+0.143
Twe+0.139
Mic+0.134
Ar+0.133
Neg logit contributions
ry-0.056
 bubble-0.047
ours-0.046
 Sandy-0.046
bug-0.046
Token a
Token ID257
Feature activation+3.15e-02

Pos logit contributions
Brown+0.063
Ell+0.059
Twe+0.057
ve+0.057
 Stretch+0.056
Neg logit contributions
 Sandy-0.025
 bubble-0.024
 foot-0.023
upid-0.022
 shallow-0.022
Token brush
Token ID14093
Feature activation-1.32e-03

Pos logit contributions
Brown+0.055
Ell+0.052
Twe+0.049
hy+0.048
 Stretch+0.047
Neg logit contributions
 bubble-0.042
 foot-0.040
ours-0.039
upid-0.039
 twin-0.038
Token in
Token ID287
Feature activation+5.49e-03

Pos logit contributions
ours+0.002
upid+0.002
ry+0.002
 Sandy+0.002
 bubble+0.002
Neg logit contributions
Brown-0.002
Ell-0.002
Twe-0.002
 Stretch-0.002
hy-0.002
Token his
Token ID465
Feature activation+9.64e-03

Pos logit contributions
Brown+0.008
Ell+0.007
Twe+0.007
 Stretch+0.007
hy+0.007
Neg logit contributions
ours-0.009
upid-0.009
ry-0.009
 bubble-0.008
 Sandy-0.008
Token Every
Token ID3887
Feature activation-2.05e-02

Pos logit contributions
Brown+0.002
Ell+0.002
Twe+0.001
 Stretch+0.001
hy+0.001
Neg logit contributions
ours-0.003
ry-0.003
upid-0.003
 Sandy-0.003
 bubble-0.003
Token day
Token ID1110
Feature activation-1.26e-02

Pos logit contributions
ours+0.039
ry+0.039
upid+0.038
asing+0.037
ress+0.035
Neg logit contributions
Brown-0.023
Ell-0.021
Ar-0.020
hy-0.020
 Stretch-0.020
Token his
Token ID465
Feature activation+4.37e-02

Pos logit contributions
ours+0.025
upid+0.025
ry+0.024
asing+0.023
 bubble+0.023
Neg logit contributions
Brown-0.014
Ell-0.012
 Stretch-0.012
 Brown-0.011
hy-0.011
Token mum
Token ID25682
Feature activation+4.05e-02

Pos logit contributions
Brown+0.103
Ell+0.102
Twe+0.094
Ar+0.092
 Kirst+0.091
Neg logit contributions
 foot-0.038
 twin-0.037
 sister-0.037
 hand-0.035
ours-0.035
Token would
Token ID561
Feature activation+4.09e-02

Pos logit contributions
Brown+0.077
Ell+0.074
Twe+0.071
Ar+0.070
hy+0.069
Neg logit contributions
ry-0.048
ours-0.047
 bubble-0.042
bug-0.041
ress-0.040
Token buy
Token ID2822
Feature activation+5.78e-02

Pos logit contributions
Brown+0.083
Ell+0.080
Ar+0.073
 Stretch+0.073
Twe+0.073
Neg logit contributions
upid-0.041
 bubble-0.040
ours-0.039
ry-0.038
 foot-0.038
Token one
Token ID530
Feature activation+4.67e-03

Pos logit contributions
Brown+0.135
Ell+0.129
Ar+0.118
hy+0.117
Twe+0.116
Neg logit contributions
ours-0.050
 Sandy-0.050
 more-0.042
 bubble-0.042
ry-0.041
Token for
Token ID329
Feature activation+3.99e-02

Pos logit contributions
Brown+0.009
Ell+0.008
Twe+0.008
 Stretch+0.008
Ar+0.008
Neg logit contributions
ours-0.005
ry-0.005
upid-0.005
 Sandy-0.005
 bubble-0.005
Token him
Token ID683
Feature activation+3.30e-02

Pos logit contributions
Brown+0.066
Ell+0.063
Ar+0.057
hy+0.055
Twe+0.055
Neg logit contributions
 Sandy-0.059
ours-0.057
 bubble-0.057
 foot-0.054
ry-0.054
Token and
Token ID290
Feature activation+1.07e-02

Pos logit contributions
Brown+0.067
Ell+0.065
Ar+0.061
Twe+0.060
 Stretch+0.059
Neg logit contributions
ry-0.037
ours-0.036
upid-0.034
 Sandy-0.033
 foot-0.032
Token he
Token ID339
Feature activation+4.36e-02

Pos logit contributions
Brown+0.017
Ell+0.016
Twe+0.015
Ar+0.014
hy+0.014
Neg logit contributions
ours-0.016
ry-0.016
upid-0.016
 Sandy-0.015
 bubble-0.015
Token.
Token ID13
Feature activation+2.47e-02

Pos logit contributions
Brown+0.094
Ell+0.085
Twe+0.083
Ar+0.083
Mic+0.081
Neg logit contributions
ry-0.046
 Sandy-0.041
upid-0.041
 foot-0.041
apping-0.040
Token
Token ID198
Feature activation+9.75e-03

Pos logit contributions
Brown+0.042
Ell+0.039
hy+0.037
Ar+0.037
ve+0.036
Neg logit contributions
ours-0.032
 Sandy-0.031
ry-0.031
upid-0.031
ress-0.029
Token
Token ID198
Feature activation+2.21e-02

Pos logit contributions
Brown+0.016
Ell+0.015
hy+0.014
 Stretch+0.014
Twe+0.014
Neg logit contributions
ry-0.013
upid-0.013
ours-0.013
 Sandy-0.012
 bubble-0.012
TokenTogether
Token ID41631
Feature activation+3.46e-02

Pos logit contributions
Brown+0.037
 Stretch+0.036
hy+0.035
 Brown+0.035
Ell+0.035
Neg logit contributions
upid-0.026
about-0.025
ry-0.025
ours-0.024
ress-0.024
Token,
Token ID11
Feature activation+2.94e-02

Pos logit contributions
Brown+0.082
Ell+0.076
Twe+0.074
Ar+0.074
Mic+0.071
Neg logit contributions
ry-0.024
 Sandy-0.024
ours-0.024
upid-0.023
 of-0.022
Token Twe
Token ID24205
Feature activation+1.62e-02

Pos logit contributions
Brown+0.040
Ell+0.036
Ar+0.034
hy+0.034
ve+0.032
Neg logit contributions
 Sandy-0.048
ours-0.046
ry-0.046
 bubble-0.046
upid-0.044
Tokenety
Token ID2963
Feature activation-2.11e-02

Pos logit contributions
Brown+0.027
Ell+0.025
 Stretch+0.025
Ar+0.024
hy+0.024
Neg logit contributions
ry-0.022
upid-0.022
ours-0.021
 Sandy-0.020
ired-0.020
Token and
Token ID290
Feature activation+4.15e-02

Pos logit contributions
ours+0.032
upid+0.031
ry+0.029
 Sandy+0.029
 shallow+0.028
Neg logit contributions
Brown-0.033
Ell-0.031
 Stretch-0.029
Ar-0.028
ve-0.028
Token the
Token ID262
Feature activation+7.51e-03

Pos logit contributions
Brown+0.062
ve+0.061
hy+0.058
Ell+0.056
Ar+0.056
Neg logit contributions
 Sandy-0.056
 bubble-0.053
 Duck-0.050
 how-0.050
 foot-0.048
Token squirrel
Token ID33039
Feature activation-3.59e-03

Pos logit contributions
Brown+0.013
Ell+0.012
Twe+0.011
 Stretch+0.011
hy+0.011
Neg logit contributions
upid-0.010
ours-0.010
ry-0.010
 bubble-0.010
 Sandy-0.010
Token flew
Token ID13112
Feature activation+3.39e-02

Pos logit contributions
ours+0.006
upid+0.005
ry+0.005
 Sandy+0.005
 bubble+0.005
Neg logit contributions
Brown-0.006
Ell-0.005
Twe-0.005
 Stretch-0.005
hy-0.005
Token day
Token ID1110
Feature activation-9.43e-03

Pos logit contributions
ours+0.049
upid+0.049
asing+0.047
 ankle+0.047
ress+0.046
Neg logit contributions
Brown-0.020
Ell-0.018
 Christmas-0.017
hy-0.016
Ar-0.016
Token,
Token ID11
Feature activation-1.41e-02

Pos logit contributions
upid+0.014
ours+0.013
ry+0.013
 bubble+0.013
asing+0.012
Neg logit contributions
Brown-0.015
Ell-0.014
 Stretch-0.013
Twe-0.013
 Brown-0.013
Token they
Token ID484
Feature activation-1.31e-02

Pos logit contributions
upid+0.023
ry+0.023
ours+0.022
 ankle+0.021
asing+0.021
Neg logit contributions
Brown-0.020
Ell-0.019
 Brown-0.017
 Stretch-0.017
Twe-0.017
Token found
Token ID1043
Feature activation-2.27e-02

Pos logit contributions
upid+0.021
ours+0.021
ry+0.021
asing+0.019
 ankle+0.019
Neg logit contributions
Brown-0.018
Ell-0.018
Twe-0.016
Ar-0.016
 Brown-0.016
Token a
Token ID257
Feature activation+7.33e-03

Pos logit contributions
ry+0.040
upid+0.039
ours+0.037
about+0.037
asing+0.036
Neg logit contributions
Brown-0.028
 Brown-0.028
Ell-0.027
 Barney-0.027
 Stretch-0.027
Token shiny
Token ID22441
Feature activation+8.51e-03

Pos logit contributions
Brown+0.011
Ell+0.011
Twe+0.010
 Stretch+0.010
hy+0.010
Neg logit contributions
ours-0.011
ry-0.011
upid-0.011
 bubble-0.011
 Sandy-0.010
Token packet
Token ID19638
Feature activation-3.34e-02

Pos logit contributions
Brown+0.015
Ell+0.014
Twe+0.013
 Stretch+0.013
Ar+0.013
Neg logit contributions
ours-0.011
ry-0.011
upid-0.011
 bubble-0.010
 Sandy-0.010
Token of
Token ID286
Feature activation+1.96e-02

Pos logit contributions
upid+0.033
ours+0.032
ry+0.029
 Sandy+0.027
 rhythms+0.027
Neg logit contributions
Brown-0.066
Ell-0.065
hy-0.064
pe-0.062
Twe-0.062
Token seeds
Token ID11904
Feature activation-3.89e-03

Pos logit contributions
Brown+0.036
Ell+0.034
Twe+0.031
Ar+0.031
 Stretch+0.030
Neg logit contributions
 bubble-0.025
ours-0.025
upid-0.024
ry-0.023
 Sandy-0.023
Token that
Token ID326
Feature activation+2.85e-02

Pos logit contributions
ours+0.006
upid+0.006
ry+0.006
 Sandy+0.005
 bubble+0.005
Neg logit contributions
Brown-0.006
Ell-0.006
Twe-0.005
 Stretch-0.005
hy-0.005
Token said
Token ID531
Feature activation-1.41e-03

Pos logit contributions
Brown+0.057
Ell+0.053
Twe+0.050
hy+0.048
Ar+0.048
Neg logit contributions
 Sandy-0.033
ours-0.033
ry-0.033
 bubble-0.032
 foot-0.030
Token?"
Token ID1701
Feature activation+1.04e-02

Pos logit contributions
Brown+0.010
Ell+0.010
Twe+0.009
 Stretch+0.009
hy+0.009
Neg logit contributions
ours-0.007
ry-0.007
upid-0.007
 bubble-0.006
 Sandy-0.006
Token Dad
Token ID17415
Feature activation+7.61e-03

Pos logit contributions
Brown+0.023
Ell+0.021
Twe+0.020
ve+0.020
 Stretch+0.020
Neg logit contributions
ours-0.009
 Sandy-0.009
ry-0.009
upid-0.009
 bubble-0.009
Token asked
Token ID1965
Feature activation+3.65e-03

Pos logit contributions
Brown+0.013
Ell+0.012
Twe+0.012
 Stretch+0.011
Ar+0.011
Neg logit contributions
ry-0.011
ours-0.010
upid-0.010
 bubble-0.010
 Sandy-0.010
Token.
Token ID13
Feature activation+3.96e-03

Pos logit contributions
Brown+0.006
Ell+0.006
Twe+0.006
 Stretch+0.006
Ar+0.006
Neg logit contributions
ry-0.005
ours-0.005
upid-0.005
 Sandy-0.004
 bubble-0.004
Token
Token ID198
Feature activation+8.56e-03

Pos logit contributions
Brown+0.006
Ell+0.006
Twe+0.006
 Stretch+0.006
hy+0.005
Neg logit contributions
ours-0.006
ry-0.006
upid-0.006
 Sandy-0.005
 bubble-0.005
Token
Token ID198
Feature activation+5.81e-03

Pos logit contributions
Brown+0.014
Ell+0.013
hy+0.012
Twe+0.012
 Stretch+0.012
Neg logit contributions
ry-0.012
ours-0.012
upid-0.012
 Sandy-0.011
 bubble-0.011
Token"
Token ID1
Feature activation+1.59e-02

Pos logit contributions
Brown+0.010
Ell+0.009
 Stretch+0.009
Twe+0.008
hy+0.008
Neg logit contributions
ry-0.008
ours-0.008
upid-0.008
 Sandy-0.008
 bubble-0.007
TokenYes
Token ID5297
Feature activation-4.30e-03

Pos logit contributions
Brown+0.029
Ell+0.028
 Stretch+0.027
 Brown+0.026
 Barney+0.026
Neg logit contributions
ours-0.019
ry-0.018
 bubble-0.017
upid-0.017
about-0.017
Token,
Token ID11
Feature activation-2.76e-02

Pos logit contributions
ours+0.006
upid+0.006
ry+0.006
 bubble+0.005
 Sandy+0.005
Neg logit contributions
Brown-0.007
Ell-0.007
Twe-0.006
 Stretch-0.006
hy-0.006
Token please
Token ID3387
Feature activation-5.66e-04

Pos logit contributions
upid+0.045
ry+0.044
ours+0.044
 ankle+0.041
ress+0.041
Neg logit contributions
Brown-0.037
hy-0.033
Twe-0.033
Ell-0.032
 Brown-0.032
Token!"
Token ID2474
Feature activation+1.33e-02

Pos logit contributions
ours+0.001
upid+0.001
ry+0.001
 Sandy+0.001
 bubble+0.001
Neg logit contributions
Brown-0.001
Ell-0.001
Twe-0.001
 Stretch-0.001
hy-0.001
Token the
Token ID262
Feature activation+2.80e-02

Pos logit contributions
Brown+0.014
Ell+0.013
Twe+0.012
hy+0.012
 Stretch+0.012
Neg logit contributions
ours-0.008
ry-0.008
upid-0.008
 Sandy-0.007
 bubble-0.007
Token ju
Token ID7544
Feature activation+2.51e-02

Pos logit contributions
Brown+0.053
Ell+0.050
Twe+0.047
Ar+0.046
 Stretch+0.046
Neg logit contributions
 foot-0.033
upid-0.031
 bubble-0.031
 hand-0.031
ours-0.030
Tokenic
Token ID291
Feature activation+7.15e-03

Pos logit contributions
Brown+0.042
Ell+0.039
 Stretch+0.037
hy+0.037
Ar+0.037
Neg logit contributions
ry-0.036
ours-0.036
upid-0.035
about-0.033
 bubble-0.033
Tokeniest
Token ID6386
Feature activation+1.22e-02

Pos logit contributions
Brown+0.013
Ell+0.013
Twe+0.012
 Stretch+0.012
Ar+0.011
Neg logit contributions
ry-0.009
upid-0.009
ours-0.009
 Sandy-0.008
 bubble-0.008
Token ones
Token ID3392
Feature activation+3.42e-02

Pos logit contributions
Brown+0.021
Ell+0.019
 Stretch+0.018
Twe+0.018
Ar+0.018
Neg logit contributions
ours-0.017
ry-0.017
upid-0.016
 foot-0.016
 Sandy-0.016
Token.
Token ID13
Feature activation+1.93e-02

Pos logit contributions
Brown+0.065
Ell+0.061
Ar+0.056
Twe+0.056
Mic+0.053
Neg logit contributions
ry-0.046
 Sandy-0.043
ours-0.041
ful-0.041
upid-0.040
Token Finally
Token ID9461
Feature activation+1.35e-02

Pos logit contributions
Brown+0.028
Ell+0.026
Twe+0.024
ve+0.023
hy+0.023
Neg logit contributions
ours-0.032
ry-0.031
 Sandy-0.031
upid-0.031
 foot-0.030
Token,
Token ID11
Feature activation+1.40e-02

Pos logit contributions
Brown+0.028
Ell+0.027
Twe+0.026
Ar+0.025
hy+0.024
Neg logit contributions
ry-0.014
 Sandy-0.014
upid-0.013
ours-0.013
 foot-0.013
Token he
Token ID339
Feature activation-2.40e-03

Pos logit contributions
Brown+0.017
Ell+0.016
Twe+0.015
hy+0.014
Ar+0.014
Neg logit contributions
ry-0.026
 Sandy-0.025
ours-0.025
upid-0.025
 foot-0.025
Token found
Token ID1043
Feature activation+6.43e-03

Pos logit contributions
ours+0.003
upid+0.003
ry+0.003
 bubble+0.003
 Sandy+0.003
Neg logit contributions
Brown-0.004
Ell-0.004
Twe-0.003
 Stretch-0.003
Ar-0.003
Token a
Token ID257
Feature activation+5.97e-03

Pos logit contributions
Brown+0.013
Ell+0.012
Twe+0.011
hy+0.011
 Stretch+0.011
Neg logit contributions
ours-0.007
ry-0.007
upid-0.007
 Sandy-0.007
 bubble-0.006
Token side
Token ID1735
Feature activation-1.07e-02

Pos logit contributions
ry+0.004
ours+0.004
upid+0.004
 Sandy+0.004
 bubble+0.004
Neg logit contributions
Brown-0.006
Ell-0.005
hy-0.005
Twe-0.005
 Stretch-0.005
Token and
Token ID290
Feature activation-1.25e-02

Pos logit contributions
upid+0.015
ours+0.015
ry+0.015
 Sandy+0.014
 bubble+0.014
Neg logit contributions
Brown-0.017
Ell-0.016
hy-0.015
 Stretch-0.015
 Brown-0.015
Token Sam
Token ID3409
Feature activation-5.85e-03

Pos logit contributions
upid+0.016
ry+0.015
ours+0.014
about+0.013
olt+0.013
Neg logit contributions
Brown-0.023
Ell-0.022
 Brown-0.021
 Stretch-0.020
hy-0.020
Token's
Token ID338
Feature activation+6.66e-03

Pos logit contributions
ours+0.008
upid+0.008
ry+0.008
 Sandy+0.007
 bubble+0.007
Neg logit contributions
Brown-0.010
Ell-0.010
hy-0.009
 Stretch-0.009
Twe-0.009
Token side
Token ID1735
Feature activation-5.17e-03

Pos logit contributions
Brown+0.013
Ell+0.012
Twe+0.012
 Stretch+0.011
Ar+0.011
Neg logit contributions
ours-0.008
upid-0.008
ry-0.008
 bubble-0.007
 foot-0.007
Token.
Token ID13
Feature activation+1.36e-02

Pos logit contributions
ours+0.007
upid+0.007
ry+0.007
 Sandy+0.007
 bubble+0.007
Neg logit contributions
Brown-0.008
Ell-0.008
hy-0.007
Twe-0.007
 Stretch-0.007
Token She
Token ID1375
Feature activation-8.10e-03

Pos logit contributions
Brown+0.018
Ell+0.016
Twe+0.015
 Stretch+0.015
Ar+0.015
Neg logit contributions
ours-0.024
ry-0.024
 Sandy-0.023
upid-0.023
 bubble-0.022
Token frowned
Token ID32198
Feature activation-2.12e-02

Pos logit contributions
upid+0.012
ry+0.011
ours+0.011
 Sandy+0.010
 bubble+0.010
Neg logit contributions
Brown-0.014
Ell-0.013
Twe-0.012
Ar-0.012
 Stretch-0.012
Token.
Token ID13
Feature activation+7.25e-03

Pos logit contributions
upid+0.040
ours+0.038
ress+0.036
ry+0.036
 bubble+0.035
Neg logit contributions
Brown-0.025
Ell-0.023
 Brown-0.022
Twe-0.020
 Stretch-0.020
Token
Token ID198
Feature activation+9.96e-03

Pos logit contributions
Brown+0.010
Ell+0.009
Twe+0.009
 Stretch+0.009
Ar+0.008
Neg logit contributions
ours-0.012
ry-0.012
upid-0.012
 Sandy-0.012
 bubble-0.011
Token
Token ID198
Feature activation+1.46e-02

Pos logit contributions
Brown+0.017
Ell+0.016
 Stretch+0.015
hy+0.015
Twe+0.015
Neg logit contributions
ry-0.014
upid-0.014
ours-0.014
 Sandy-0.013
 bubble-0.012
Token the
Token ID262
Feature activation-1.20e-02

Pos logit contributions
ours+0.017
ry+0.017
upid+0.016
about+0.015
asing+0.015
Neg logit contributions
Brown-0.022
Ell-0.021
 Stretch-0.021
 Brown-0.021
Ar-0.020
Token park
Token ID3952
Feature activation-1.29e-02

Pos logit contributions
ry+0.016
ours+0.016
upid+0.015
asing+0.015
ress+0.014
Neg logit contributions
Brown-0.021
Ar-0.020
 Stretch-0.019
Ell-0.019
Twe-0.019
Token.
Token ID13
Feature activation-6.17e-03

Pos logit contributions
upid+0.019
ours+0.019
ry+0.018
 bubble+0.017
 Sandy+0.016
Neg logit contributions
Brown-0.021
Ell-0.020
hy-0.019
 Stretch-0.018
 Brown-0.018
Token He
Token ID679
Feature activation-1.25e-02

Pos logit contributions
ry+0.013
ours+0.013
upid+0.013
 bubble+0.012
asing+0.012
Neg logit contributions
Brown-0.006
Ell-0.005
 Stretch-0.005
 Brown-0.005
Twe-0.005
Token always
Token ID1464
Feature activation-1.94e-02

Pos logit contributions
upid+0.019
ours+0.019
ry+0.018
asing+0.017
about+0.017
Neg logit contributions
Brown-0.020
Ell-0.018
Ar-0.017
Twe-0.017
 Stretch-0.017
Token loved
Token ID6151
Feature activation+1.60e-03

Pos logit contributions
upid+0.031
ours+0.031
ry+0.029
about+0.028
ress+0.027
Neg logit contributions
Brown-0.028
Ell-0.027
 Stretch-0.025
 Brown-0.025
Ar-0.025
Token being
Token ID852
Feature activation-1.08e-02

Pos logit contributions
Brown+0.003
Ell+0.003
Twe+0.002
 Stretch+0.002
hy+0.002
Neg logit contributions
ours-0.002
ry-0.002
upid-0.002
 Sandy-0.002
 bubble-0.002
Token creative
Token ID7325
Feature activation-2.10e-02

Pos logit contributions
ours+0.017
ry+0.017
upid+0.016
about+0.015
 bubble+0.015
Neg logit contributions
Brown-0.017
Ell-0.016
 Stretch-0.015
Twe-0.015
hy-0.014
Token and
Token ID290
Feature activation-1.66e-02

Pos logit contributions
ours+0.030
upid+0.030
ry+0.028
 Sandy+0.027
 bubble+0.026
Neg logit contributions
Brown-0.034
Ell-0.032
 Stretch-0.031
 Brown-0.030
hy-0.030
Token making
Token ID1642
Feature activation-1.46e-02

Pos logit contributions
ours+0.027
ry+0.027
upid+0.027
about+0.025
ress+0.024
Neg logit contributions
Brown-0.024
Ell-0.023
 Stretch-0.021
hy-0.021
 Brown-0.021
Token new
Token ID649
Feature activation-2.00e-02

Pos logit contributions
ry+0.021
ours+0.020
upid+0.020
ress+0.018
about+0.018
Neg logit contributions
Brown-0.024
Ell-0.022
 Stretch-0.022
Twe-0.021
 Brown-0.021
Token and
Token ID290
Feature activation+3.20e-03

Pos logit contributions
Brown+0.011
Ell+0.010
Twe+0.010
 Stretch+0.009
hy+0.009
Neg logit contributions
ry-0.007
upid-0.007
ours-0.007
 bubble-0.006
 Sandy-0.006
Token make
Token ID787
Feature activation+8.59e-03

Pos logit contributions
Brown+0.006
Ell+0.005
Twe+0.005
 Stretch+0.005
hy+0.005
Neg logit contributions
ours-0.004
ry-0.004
upid-0.004
 Sandy-0.004
 bubble-0.004
Token loud
Token ID7812
Feature activation+1.07e-02

Pos logit contributions
Brown+0.015
Ell+0.015
Twe+0.014
hy+0.013
 Stretch+0.013
Neg logit contributions
upid-0.011
ours-0.011
ry-0.011
 Sandy-0.010
 bubble-0.010
Token noises
Token ID26782
Feature activation-1.50e-03

Pos logit contributions
Brown+0.015
Ell+0.014
Twe+0.013
 Stretch+0.012
Ar+0.012
Neg logit contributions
ry-0.018
upid-0.018
ours-0.018
 bubble-0.017
 Sandy-0.017
Token."
Token ID526
Feature activation-4.01e-02

Pos logit contributions
ours+0.002
ry+0.002
upid+0.002
 Sandy+0.002
 bubble+0.002
Neg logit contributions
Brown-0.003
Ell-0.002
Twe-0.002
 Stretch-0.002
hy-0.002
Token The
Token ID383
Feature activation+5.00e-03

Pos logit contributions
ours+0.056
upid+0.055
ry+0.054
asing+0.053
ress+0.051
Neg logit contributions
Twe-0.055
Ell-0.054
 Barney-0.053
 Twe-0.053
 Brown-0.053
Token squirrel
Token ID33039
Feature activation-6.09e-03

Pos logit contributions
Brown+0.008
Ell+0.008
Twe+0.007
hy+0.007
 Stretch+0.007
Neg logit contributions
upid-0.007
ours-0.007
ry-0.007
 bubble-0.007
 Sandy-0.007
Token said
Token ID531
Feature activation-2.82e-03

Pos logit contributions
ours+0.008
upid+0.008
ry+0.008
 Sandy+0.007
 bubble+0.007
Neg logit contributions
Brown-0.011
Ell-0.010
Twe-0.009
 Stretch-0.009
Ar-0.009
Token,
Token ID11
Feature activation-4.73e-03

Pos logit contributions
ours+0.004
upid+0.004
ry+0.004
 bubble+0.003
 Sandy+0.003
Neg logit contributions
Brown-0.005
Ell-0.005
 Stretch-0.004
Twe-0.004
hy-0.004
Token "
Token ID366
Feature activation+2.02e-02

Pos logit contributions
ours+0.006
upid+0.006
ry+0.006
 Sandy+0.005
 bubble+0.005
Neg logit contributions
Brown-0.009
Ell-0.008
 Stretch-0.008
Twe-0.008
 Barney-0.008
TokenI
Token ID40
Feature activation-1.22e-02

Pos logit contributions
Brown+0.033
Ell+0.031
 Brown+0.031
hy+0.030
 Stretch+0.030
Neg logit contributions
ry-0.025
upid-0.025
 Sandy-0.025
about-0.025
 bubble-0.025
TokenKate
Token ID45087
Feature activation-4.20e-03

Pos logit contributions
ours+0.003
ry+0.003
upid+0.003
 bubble+0.003
 Sandy+0.003
Neg logit contributions
Brown-0.003
Ell-0.003
Twe-0.003
 Stretch-0.002
Ar-0.002
Token smiled
Token ID13541
Feature activation+5.21e-03

Pos logit contributions
ours+0.006
upid+0.006
ry+0.005
 Sandy+0.005
 bubble+0.005
Neg logit contributions
Brown-0.007
Ell-0.007
Twe-0.006
 Stretch-0.006
hy-0.006
Token as
Token ID355
Feature activation+2.06e-04

Pos logit contributions
Brown+0.010
Ell+0.010
Twe+0.009
 Stretch+0.009
Ar+0.009
Neg logit contributions
ry-0.006
ours-0.006
upid-0.006
 Sandy-0.005
 bubble-0.005
Token she
Token ID673
Feature activation-5.20e-03

Pos logit contributions
Brown+0.000
Ell+0.000
Twe+0.000
 Stretch+0.000
hy+0.000
Neg logit contributions
ours-0.000
ry-0.000
upid-0.000
 bubble-0.000
 Sandy-0.000
Token put
Token ID1234
Feature activation-7.13e-04

Pos logit contributions
upid+0.008
ours+0.008
ry+0.008
 Sandy+0.007
 bubble+0.007
Neg logit contributions
Brown-0.008
Ell-0.007
Twe-0.007
Ar-0.007
 Stretch-0.007
Token on
Token ID319
Feature activation+1.70e-03

Pos logit contributions
ours+0.001
upid+0.001
ry+0.001
 bubble+0.001
 Sandy+0.001
Neg logit contributions
Brown-0.001
Ell-0.001
Twe-0.001
 Stretch-0.001
hy-0.001
Token her
Token ID607
Feature activation+9.83e-03

Pos logit contributions
Brown+0.002
Ell+0.002
Twe+0.002
 Stretch+0.002
hy+0.002
Neg logit contributions
ours-0.003
ry-0.003
upid-0.003
 Sandy-0.003
 bubble-0.003
Token swim
Token ID9422
Feature activation-6.00e-03

Pos logit contributions
Brown+0.017
Ell+0.016
Twe+0.015
 Stretch+0.015
Ar+0.014
Neg logit contributions
ours-0.013
ry-0.013
upid-0.013
 bubble-0.012
 Sandy-0.012
Tokensuit
Token ID6063
Feature activation-8.69e-03

Pos logit contributions
ours+0.010
upid+0.010
ry+0.010
 Sandy+0.009
 bubble+0.009
Neg logit contributions
Brown-0.008
Ell-0.008
Twe-0.007
hy-0.007
Ar-0.007
Token and
Token ID290
Feature activation-1.47e-02

Pos logit contributions
upid+0.012
ours+0.012
ry+0.011
 Sandy+0.011
 bubble+0.011
Neg logit contributions
Brown-0.015
Ell-0.014
hy-0.013
Twe-0.013
 Stretch-0.013
Token grabbed
Token ID13176
Feature activation-1.20e-02

Pos logit contributions
upid+0.023
ours+0.023
ry+0.022
asing+0.021
ired+0.020
Neg logit contributions
Brown-0.022
Ell-0.021
 Stretch-0.019
 Brown-0.019
 Barney-0.019
Token nice
Token ID3621
Feature activation-3.29e-02

Pos logit contributions
Brown+0.014
Ell+0.014
Twe+0.013
 Stretch+0.012
Ar+0.012
Neg logit contributions
upid-0.011
ours-0.010
ry-0.010
 Sandy-0.010
 bubble-0.010
Token.
Token ID13
Feature activation-8.52e-03

Pos logit contributions
ours+0.058
upid+0.057
 shallow+0.055
ry+0.055
 bubble+0.054
Neg logit contributions
Brown-0.035
 Brown-0.034
hy-0.034
 Stretch-0.032
 Barney-0.030
Token You
Token ID921
Feature activation-4.65e-03

Pos logit contributions
ry+0.013
upid+0.013
ours+0.013
 bubble+0.013
about+0.012
Neg logit contributions
Brown-0.013
Ell-0.011
Twe-0.011
 Stretch-0.011
hy-0.011
Token have
Token ID423
Feature activation-9.50e-03

Pos logit contributions
ours+0.006
ry+0.006
upid+0.006
 Sandy+0.005
 bubble+0.005
Neg logit contributions
Brown-0.009
Ell-0.008
Twe-0.008
 Stretch-0.008
hy-0.007
Token to
Token ID284
Feature activation+4.90e-04

Pos logit contributions
ry+0.015
ours+0.015
upid+0.015
about+0.014
 bubble+0.014
Neg logit contributions
Brown-0.014
Ell-0.013
hy-0.012
Twe-0.012
 Stretch-0.012
Token be
Token ID307
Feature activation-1.16e-02

Pos logit contributions
Brown+0.001
Ell+0.001
Twe+0.001
 Stretch+0.001
hy+0.001
Neg logit contributions
ours-0.001
ry-0.001
upid-0.001
 Sandy-0.001
 bubble-0.001
Token kind
Token ID1611
Feature activation-1.79e-02

Pos logit contributions
ry+0.017
ours+0.017
upid+0.017
about+0.016
 bubble+0.015
Neg logit contributions
Brown-0.019
Ell-0.018
Twe-0.017
hy-0.016
 Stretch-0.016
Token to
Token ID284
Feature activation+7.26e-03

Pos logit contributions
ours+0.029
upid+0.028
ry+0.027
 shallow+0.026
 bubble+0.026
Neg logit contributions
Brown-0.026
Ell-0.024
hy-0.023
 Brown-0.023
 Stretch-0.022
Token the
Token ID262
Feature activation+7.22e-04

Pos logit contributions
Brown+0.014
Ell+0.013
Twe+0.012
Ar+0.012
hy+0.012
Neg logit contributions
ours-0.009
upid-0.009
ry-0.009
 Sandy-0.008
 bubble-0.008
Token animals
Token ID4695
Feature activation-3.93e-02

Pos logit contributions
Brown+0.001
Ell+0.001
Twe+0.001
 Stretch+0.001
Ar+0.001
Neg logit contributions
ours-0.001
ry-0.001
upid-0.001
 Sandy-0.001
 bubble-0.001
Token.
Token ID13
Feature activation-1.10e-02

Pos logit contributions
ours+0.067
upid+0.065
about+0.062
 Sandy+0.062
ry+0.061
Neg logit contributions
Brown-0.050
Ell-0.045
 Stretch-0.045
 Brown-0.044
pe-0.044
Token truck
Token ID7779
Feature activation+3.16e-02

Pos logit contributions
ry+0.013
ours+0.013
upid+0.013
 Sandy+0.012
asing+0.012
Neg logit contributions
Brown-0.017
Ell-0.016
Twe-0.015
 Stretch-0.015
hy-0.015
Token is
Token ID318
Feature activation+7.51e-03

Pos logit contributions
Brown+0.063
Ell+0.062
Twe+0.058
 Stretch+0.056
Mic+0.056
Neg logit contributions
ry-0.036
ours-0.034
upid-0.034
 bubble-0.032
 foot-0.031
Token hard
Token ID1327
Feature activation+2.87e-02

Pos logit contributions
Brown+0.012
Ell+0.011
Twe+0.010
 Stretch+0.010
hy+0.010
Neg logit contributions
ours-0.011
upid-0.011
ry-0.011
 bubble-0.011
 Sandy-0.011
Token to
Token ID284
Feature activation+8.60e-03

Pos logit contributions
Brown+0.055
Ell+0.054
Twe+0.052
 Stretch+0.048
Mic+0.048
Neg logit contributions
ry-0.035
upid-0.033
ours-0.033
 bubble-0.033
 foot-0.031
Token do
Token ID466
Feature activation-5.87e-03

Pos logit contributions
Brown+0.016
Ell+0.016
Twe+0.014
 Stretch+0.014
hy+0.014
Neg logit contributions
upid-0.010
ry-0.010
ours-0.010
 bubble-0.010
 foot-0.009
Token a
Token ID257
Feature activation-4.65e-04

Pos logit contributions
ours+0.008
ry+0.008
upid+0.008
 Sandy+0.007
 bubble+0.007
Neg logit contributions
Brown-0.010
Ell-0.009
Twe-0.009
 Stretch-0.009
hy-0.009
Token jump
Token ID4391
Feature activation-1.59e-02

Pos logit contributions
ours+0.001
ry+0.001
upid+0.001
 Sandy+0.001
 bubble+0.001
Neg logit contributions
Brown-0.001
Ell-0.001
Twe-0.001
 Stretch-0.001
Ar-0.001
Token.
Token ID13
Feature activation-2.59e-02

Pos logit contributions
ours+0.023
upid+0.022
ry+0.022
 Sandy+0.021
asing+0.020
Neg logit contributions
Brown-0.026
Ell-0.024
Twe-0.022
hy-0.022
 Stretch-0.022
Token Maybe
Token ID6674
Feature activation-3.32e-02

Pos logit contributions
upid+0.041
ry+0.040
ours+0.040
about+0.039
ired+0.038
Neg logit contributions
Brown-0.040
 Brown-0.035
 Stretch-0.034
Ell-0.034
hy-0.034
Token you
Token ID345
Feature activation-2.09e-02

Pos logit contributions
upid+0.061
ours+0.057
olt+0.056
apping+0.055
ry+0.055
Neg logit contributions
Brown-0.043
 Brown-0.040
 Stretch-0.038
Ell-0.036
 Barney-0.036
Token can
Token ID460
Feature activation-6.01e-04

Pos logit contributions
ours+0.026
 Sandy+0.025
ry+0.024
upid+0.024
ress+0.024
Neg logit contributions
Brown-0.037
Ell-0.034
 Stretch-0.033
hy-0.033
ve-0.033
Token like
Token ID588
Feature activation+2.33e-03

Pos logit contributions
Brown+0.060
Ell+0.057
 Stretch+0.052
Twe+0.051
 Barney+0.051
Neg logit contributions
upid-0.033
ours-0.030
 Sandy-0.030
 bubble-0.029
ry-0.029
Token to
Token ID284
Feature activation-3.81e-03

Pos logit contributions
Brown+0.004
Ell+0.004
Twe+0.003
 Stretch+0.003
hy+0.003
Neg logit contributions
ours-0.003
ry-0.003
upid-0.003
 Sandy-0.003
 bubble-0.003
Token play
Token ID711
Feature activation-1.79e-02

Pos logit contributions
ours+0.006
ry+0.006
upid+0.006
 Sandy+0.005
 bubble+0.005
Neg logit contributions
Brown-0.006
Ell-0.005
Twe-0.005
hy-0.005
 Stretch-0.005
Token outside
Token ID2354
Feature activation-1.35e-02

Pos logit contributions
upid+0.031
ours+0.030
ry+0.030
ress+0.028
 Sandy+0.028
Neg logit contributions
Brown-0.024
Ell-0.023
 Stretch-0.022
Ar-0.021
Twe-0.021
Token.
Token ID13
Feature activation-2.42e-02

Pos logit contributions
upid+0.022
ours+0.022
ry+0.021
 Sandy+0.020
ress+0.020
Neg logit contributions
Brown-0.020
Ell-0.018
 Stretch-0.017
 Brown-0.017
Twe-0.017
Token One
Token ID1881
Feature activation-4.76e-02

Pos logit contributions
ry+0.052
asing+0.051
upid+0.051
ress+0.050
 ankle+0.049
Neg logit contributions
Brown-0.021
 Stretch-0.019
Twe-0.019
 Brown-0.019
Ell-0.018
Token day
Token ID1110
Feature activation-3.50e-03

Pos logit contributions
asing+0.118
ress+0.118
ours+0.118
 ankle+0.116
upid+0.115
Neg logit contributions
 saf-0.020
 Christmas-0.018
Brown-0.016
hy-0.016
 long-0.015
Token,
Token ID11
Feature activation-9.03e-03

Pos logit contributions
upid+0.005
ours+0.005
ry+0.005
 bubble+0.004
 Sandy+0.004
Neg logit contributions
Brown-0.006
Ell-0.006
Twe-0.005
 Stretch-0.005
Ar-0.005
Token they
Token ID484
Feature activation-8.56e-03

Pos logit contributions
upid+0.015
ry+0.015
ours+0.015
asing+0.014
 bubble+0.014
Neg logit contributions
Brown-0.012
Ell-0.011
 Stretch-0.011
Twe-0.011
 Brown-0.010
Token go
Token ID467
Feature activation-1.31e-02

Pos logit contributions
ours+0.012
ry+0.012
upid+0.012
ress+0.012
asing+0.011
Neg logit contributions
Brown-0.013
Ell-0.013
Twe-0.012
Ar-0.012
 Stretch-0.012
Token to
Token ID284
Feature activation-2.18e-02

Pos logit contributions
upid+0.022
ours+0.022
ry+0.021
ress+0.020
about+0.020
Neg logit contributions
Brown-0.018
Ell-0.017
Ar-0.016
 Brown-0.016
 Stretch-0.015
Token started
Token ID2067
Feature activation-5.18e-03

Pos logit contributions
Brown+0.003
Ell+0.003
Twe+0.003
 Stretch+0.003
hy+0.003
Neg logit contributions
ours-0.002
ry-0.002
upid-0.002
 Sandy-0.002
 bubble-0.002
Token to
Token ID284
Feature activation-1.60e-03

Pos logit contributions
upid+0.008
ours+0.008
ry+0.008
 Sandy+0.007
 bubble+0.007
Neg logit contributions
Brown-0.008
Ell-0.007
Twe-0.007
 Stretch-0.007
 Brown-0.007
Token cry
Token ID3960
Feature activation-5.28e-03

Pos logit contributions
ours+0.002
ry+0.002
upid+0.002
 Sandy+0.002
 bubble+0.002
Neg logit contributions
Brown-0.003
Ell-0.003
Twe-0.002
 Stretch-0.002
hy-0.002
Token and
Token ID290
Feature activation-1.03e-02

Pos logit contributions
ours+0.007
upid+0.007
ry+0.007
 Sandy+0.006
 bubble+0.006
Neg logit contributions
Brown-0.009
Ell-0.009
 Stretch-0.008
 Brown-0.008
Twe-0.008
Token asked
Token ID1965
Feature activation-1.07e-02

Pos logit contributions
ours+0.016
upid+0.016
ry+0.016
about+0.015
asing+0.015
Neg logit contributions
Brown-0.015
Ell-0.015
 Stretch-0.013
 Brown-0.013
Twe-0.013
Token all
Token ID477
Feature activation-5.37e-02

Pos logit contributions
ours+0.021
upid+0.021
ry+0.021
 bubble+0.020
 shallow+0.020
Neg logit contributions
Brown-0.012
Ell-0.011
 Brown-0.010
 Stretch-0.010
 Barney-0.010
Token of
Token ID286
Feature activation-1.12e-02

Pos logit contributions
ours+0.103
asing+0.102
ry+0.101
 goalie+0.101
 pitcher+0.099
Neg logit contributions
 Ellie-0.056
Ell-0.050
 Kirst-0.050
 Brown-0.050
Brown-0.047
Token the
Token ID262
Feature activation+2.77e-02

Pos logit contributions
ours+0.022
ry+0.022
upid+0.021
apping+0.020
about+0.020
Neg logit contributions
Brown-0.012
Ell-0.012
 Stretch-0.010
 Brown-0.010
Twe-0.010
Token animals
Token ID4695
Feature activation-2.79e-02

Pos logit contributions
Brown+0.062
hy+0.059
Ell+0.058
Twe+0.056
Ar+0.054
Neg logit contributions
ours-0.024
 foot-0.023
upid-0.022
 bubble-0.022
ry-0.021
Token to
Token ID284
Feature activation+3.84e-03

Pos logit contributions
ours+0.038
upid+0.038
 bubble+0.036
ry+0.036
about+0.036
Neg logit contributions
Brown-0.042
Ell-0.041
 Brown-0.040
 Stretch-0.040
hy-0.040
Token help
Token ID1037
Feature activation+1.11e-02

Pos logit contributions
Brown+0.007
Ell+0.006
Twe+0.006
 Stretch+0.006
Ar+0.006
Neg logit contributions
upid-0.005
ry-0.005
ours-0.005
 Sandy-0.005
 bubble-0.005
Token the
Token ID262
Feature activation-2.54e-02

Pos logit contributions
upid+0.018
ours+0.018
ry+0.018
asing+0.016
ress+0.016
Neg logit contributions
Brown-0.020
Ell-0.019
 Stretch-0.018
 Brown-0.018
 Barney-0.017
Token team
Token ID1074
Feature activation-4.52e-02

Pos logit contributions
ry+0.036
ours+0.035
upid+0.033
asing+0.030
ress+0.030
Neg logit contributions
Brown-0.045
Ar-0.043
Ell-0.042
 Stretch-0.041
 Brown-0.040
Token was
Token ID373
Feature activation-1.05e-02

Pos logit contributions
ours+0.080
upid+0.079
ry+0.073
 conclusions+0.071
asing+0.070
Neg logit contributions
Brown-0.056
Ell-0.055
Ar-0.050
 Brown-0.050
 Stretch-0.049
Token trying
Token ID2111
Feature activation-3.26e-02

Pos logit contributions
ry+0.017
ours+0.017
upid+0.017
about+0.016
 Sandy+0.015
Neg logit contributions
Brown-0.015
Ell-0.014
 Stretch-0.013
hy-0.013
 Brown-0.013
Token to
Token ID284
Feature activation-1.51e-02

Pos logit contributions
ours+0.064
upid+0.063
ry+0.063
olt+0.061
ress+0.061
Neg logit contributions
Brown-0.029
Ell-0.028
 Brown-0.027
 Stretch-0.026
 Barney-0.025
Token figure
Token ID3785
Feature activation-5.44e-02

Pos logit contributions
ours+0.022
ry+0.021
upid+0.020
 Sandy+0.018
asing+0.018
Neg logit contributions
Brown-0.026
Ell-0.024
Twe-0.023
 Stretch-0.023
Ar-0.023
Token out
Token ID503
Feature activation-4.49e-02

Pos logit contributions
ry+0.089
ours+0.088
 pitcher+0.086
about+0.085
olt+0.084
Neg logit contributions
 Stretch-0.068
 Brown-0.067
Brown-0.066
 Barney-0.065
 Elliot-0.062
Token how
Token ID703
Feature activation-1.58e-02

Pos logit contributions
 pitcher+0.055
ours+0.054
upid+0.053
olt+0.053
ry+0.050
Neg logit contributions
Brown-0.074
 Brown-0.074
 Stretch-0.074
Ell-0.071
 Ellie-0.070
Token to
Token ID284
Feature activation-3.41e-03

Pos logit contributions
ours+0.027
upid+0.027
ry+0.027
about+0.025
 Sandy+0.025
Neg logit contributions
Brown-0.020
Ell-0.019
hy-0.018
 Stretch-0.017
Twe-0.017
Token do
Token ID466
Feature activation-3.40e-02

Pos logit contributions
ours+0.004
ry+0.004
upid+0.004
 Sandy+0.004
 bubble+0.004
Neg logit contributions
Brown-0.006
Ell-0.006
Twe-0.005
 Stretch-0.005
hy-0.005
Token a
Token ID257
Feature activation-1.49e-02

Pos logit contributions
ours+0.047
ry+0.046
upid+0.046
asing+0.042
about+0.041
Neg logit contributions
Brown-0.053
Ell-0.051
 Stretch-0.050
 Brown-0.049
 Barney-0.048
Token around
Token ID1088
Feature activation+3.97e-03

Pos logit contributions
Brown+0.003
Ell+0.003
Twe+0.003
 Stretch+0.003
hy+0.003
Neg logit contributions
ours-0.002
ry-0.002
upid-0.002
 Sandy-0.002
 bubble-0.002
Token the
Token ID262
Feature activation-2.08e-02

Pos logit contributions
Brown+0.007
Ell+0.007
Twe+0.007
 Stretch+0.006
hy+0.006
Neg logit contributions
ry-0.005
ours-0.005
upid-0.005
 bubble-0.004
 Sandy-0.004
Token room
Token ID2119
Feature activation+1.29e-04

Pos logit contributions
ours+0.032
upid+0.030
ry+0.030
 Sandy+0.029
asing+0.028
Neg logit contributions
Brown-0.033
Ell-0.030
 Brown-0.029
 Stretch-0.028
Ar-0.028
Token again
Token ID757
Feature activation-1.58e-02

Pos logit contributions
Brown+0.000
Ell+0.000
Twe+0.000
 Stretch+0.000
hy+0.000
Neg logit contributions
ours-0.000
ry-0.000
upid-0.000
 Sandy-0.000
 bubble-0.000
Token.
Token ID13
Feature activation+4.80e-03

Pos logit contributions
ours+0.024
upid+0.023
ry+0.022
 Sandy+0.021
about+0.021
Neg logit contributions
Brown-0.025
Ell-0.024
 Stretch-0.022
 Brown-0.022
hy-0.021
Token She
Token ID1375
Feature activation-5.10e-02

Pos logit contributions
Brown+0.007
Ell+0.006
Twe+0.006
 Stretch+0.005
hy+0.005
Neg logit contributions
ours-0.008
ry-0.008
upid-0.008
 Sandy-0.008
 bubble-0.008
Token knew
Token ID2993
Feature activation-1.89e-02

Pos logit contributions
ours+0.094
upid+0.090
ry+0.083
ress+0.083
icking+0.081
Neg logit contributions
Brown-0.063
Ell-0.062
Ar-0.056
ve-0.054
 Stretch-0.053
Token she
Token ID673
Feature activation-1.53e-03

Pos logit contributions
upid+0.032
ours+0.032
ry+0.031
about+0.030
 pitcher+0.029
Neg logit contributions
Brown-0.026
Ell-0.024
 Stretch-0.023
 Brown-0.023
 Barney-0.022
Token could
Token ID714
Feature activation-3.30e-02

Pos logit contributions
ours+0.002
upid+0.002
ry+0.002
 Sandy+0.002
 bubble+0.002
Neg logit contributions
Brown-0.002
Ell-0.002
Twe-0.002
 Stretch-0.002
hy-0.002
Token always
Token ID1464
Feature activation-2.69e-02

Pos logit contributions
ours+0.053
ry+0.049
upid+0.048
ress+0.047
 shallow+0.045
Neg logit contributions
Brown-0.050
Ell-0.046
 Stretch-0.044
 Brown-0.043
Twe-0.041
Token rely
Token ID8814
Feature activation-3.70e-02

Pos logit contributions
ours+0.039
ry+0.037
upid+0.035
ress+0.035
 Sandy+0.033
Neg logit contributions
Brown-0.044
Ell-0.041
 Stretch-0.040
hy-0.038
Ar-0.038
Token not
Token ID407
Feature activation+4.85e-03

Pos logit contributions
ry+0.005
ours+0.005
upid+0.005
 Sandy+0.005
 bubble+0.005
Neg logit contributions
Brown-0.006
Ell-0.005
Twe-0.005
 Stretch-0.005
Ar-0.005
Token so
Token ID523
Feature activation+1.43e-02

Pos logit contributions
Brown+0.008
Ell+0.007
Twe+0.007
 Stretch+0.007
hy+0.007
Neg logit contributions
ours-0.007
upid-0.007
ry-0.007
 bubble-0.006
 Sandy-0.006
Token busy
Token ID8179
Feature activation+5.25e-04

Pos logit contributions
Brown+0.022
Ell+0.021
Twe+0.019
 Stretch+0.018
 Barney+0.018
Neg logit contributions
upid-0.023
ours-0.023
ry-0.022
 bubble-0.022
 foot-0.021
Token.
Token ID13
Feature activation-4.55e-02

Pos logit contributions
Brown+0.001
Ell+0.001
Twe+0.001
 Stretch+0.001
hy+0.001
Neg logit contributions
ours-0.001
ry-0.001
upid-0.001
 Sandy-0.001
 bubble-0.001
Token That
Token ID1320
Feature activation-4.72e-02

Pos logit contributions
ry+0.083
upid+0.081
about+0.080
ours+0.078
icking+0.078
Neg logit contributions
Brown-0.061
 Brown-0.053
 Stretch-0.051
Ell-0.051
Twe-0.050
Token way
Token ID835
Feature activation-5.34e-02

Pos logit contributions
 Sandy+0.081
ours+0.079
upid+0.077
ry+0.077
ulu+0.076
Neg logit contributions
Brown-0.062
 Brown-0.057
 Stretch-0.055
pe-0.053
Ar-0.052
Token,
Token ID11
Feature activation-4.06e-02

Pos logit contributions
udge+0.096
olt+0.096
asing+0.095
about+0.095
omp+0.095
Neg logit contributions
 Brown-0.057
Brown-0.050
 Stretch-0.046
 Ellie-0.046
 Kirst-0.043
Token we
Token ID356
Feature activation-9.81e-03

Pos logit contributions
ours+0.072
upid+0.072
ry+0.071
about+0.068
olt+0.067
Neg logit contributions
Brown-0.050
 Brown-0.047
Ell-0.045
 Stretch-0.045
 Barney-0.043
Token can
Token ID460
Feature activation+2.04e-02

Pos logit contributions
ours+0.011
ry+0.010
 Sandy+0.010
upid+0.010
ress+0.009
Neg logit contributions
Brown-0.020
Ell-0.018
 Stretch-0.017
Twe-0.017
Ar-0.017
Token have
Token ID423
Feature activation+1.95e-03

Pos logit contributions
Ell+0.040
Brown+0.040
hy+0.036
Ar+0.036
Twe+0.036
Neg logit contributions
upid-0.022
ry-0.021
 bubble-0.021
ours-0.021
 foot-0.020
Token fun
Token ID1257
Feature activation-1.32e-02

Pos logit contributions
Brown+0.003
Ell+0.003
Twe+0.002
 Stretch+0.002
hy+0.002
Neg logit contributions
ours-0.003
ry-0.003
upid-0.003
 Sandy-0.003
 bubble-0.003