TOP ACTIVATIONS
MAX = 0.058

fruit tree ! Sab rina was very thirsty ,
to normal . Sab rina was so happy and
one of the fruits . Sab rina pl ucked one from
y ride ." Sab rina wanted to help the
to a safe place . Wh isk ers was able to
excited . When Sab rina arrived , she was
original self . Sab rina was scared , and
was gone ! Ch ir py was sad that
A few days later , Sab rina had to stay in
in the center . Then Sab rina saw something even more
to do . Wh isk ers saw that his
soft , brown centre . Sab ina couldn 't resist picking
play . El mo and the dragon fly
from Buster . So , Wh isk ers jumped down from
There were princess dresses , superhero cap es , and even
he said . Sab ina felt sad . She
fly in his mouth . Ho ppy 's friends cheered and
her ! V iv ian enjoyed playing with
. The Ro aring Monster was so happy
mine !" Sab ina looked up to see

MIN ACTIVATIONS
MIN = -0.066

still wanted to eat something scr um pt ious .
bitter and made her face scr unch up .
Sour candy made her face scr unch up . She did
it made Jack 's face scr unch up !
sour it made her face scr unch up !
it made Bob 's face scr unch up . He didn
hard and made her face scr unch up . She spat
sour and made their faces scr unch up .
. It makes her face scr unch up . She sp
and her face began to scr unch up . Hot tears
. She felt her face scr unch up and her tongue
never played with the or nam ents again . They still
to look at the or nam ents on the tree .
to help sort the or nam ents . She picked up
hill and put her or nam ents all around . She
made a funny face , scr unch ing up her nose
help her mom put or nam ents on the tree .
and then she had a scr um pt ious snack .
he especially liked the or nam ents he kept around .
the rest of the or nam ents onto the tree .

INTERVAL 0.027 - 0.058
CONTAINS 0.193%

other around the piano and around the living room .
. It had a shiny thing in its be ak .
we will be there in a minute !" their parents said
struggling and asked , " Do you need help ?" Sp
passengers . " Ch oo cho o , this

INTERVAL -0.004 - 0.027
CONTAINS 69.248%

Her mom replied , " The store owner wouldn 't like
andy 's mother smiled and said , " Yes . We
together ." The next week , Lily and her
happy to have an owner who loved him . And every
" It 's OK , my loves , I 'm

INTERVAL -0.035 - -0.004
CONTAINS 30.543%

the slide . He lands on the ground with a th
He would eat so much candy that he became very ch
Lucy thanked the fairy and she took a bite of the
Lily said . " It 's not for us . It
new friend and went home , still rocking as he walked

INTERVAL -0.066 - -0.035
CONTAINS 0.017%

have fun . One day , he saw a big tree
It does not hurt your eyes . It makes your hair
said earlier about wiping their shoes . She smiled and wiped
<|endoftext|> One day , Mom and Dad took their daughter to
how miserable the task may seem , courage is needed and
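
The four INTERVAL ranges listed above appear to be equal-width bins spanning MIN to MAX. A minimal sketch of that arithmetic follows, using only the values printed in this listing. The helper name `equal_width_edges` is hypothetical, and treating the bins as equal-width is an assumption inferred from the numbers, not something the listing states.

```python
# Sketch: rebuild the INTERVAL boundaries from MIN and MAX, assuming
# (not confirmed by the listing) that the range is cut into equal-width bins.

def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 bin edges from lo to hi, rounded to 3 decimals."""
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]

# Values copied from the listing above.
MIN_ACT, MAX_ACT = -0.066, 0.058
edges = equal_width_edges(MIN_ACT, MAX_ACT, 4)

# CONTAINS percentages, ordered from the lowest interval to the highest
# (the listing itself shows them highest first).
shares = [0.017, 30.543, 69.248, 0.193]
total = round(sum(shares), 3)

print(edges)  # bin edges, lowest to highest
print(total)  # should be close to 100, up to rounding in the listing
```

Under this assumption each bin is 0.031 wide, and nearly all tokens fall in the two middle bins, so the strongly positive and strongly negative examples shown at the top of the listing are rare.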
Token fruit
Token ID8234
Feature activation-1.60e-02

Pos logit contributions
 tallest+0.020
 cube+0.020
 cubes+0.019
 nail+0.019
 brick+0.019
Neg logit contributions
jo-0.047
 Carly-0.046
fulness-0.044
iko-0.042
 Carol-0.042
Token tree
Token ID5509
Feature activation-5.74e-03

Pos logit contributions
 tallest+0.026
 cube+0.026
 nail+0.025
 cubes+0.025
 brick+0.025
Neg logit contributions
jo-0.024
 Carly-0.023
fulness-0.022
les-0.021
of-0.021
Token!
Token ID0
Feature activation+1.73e-02

Pos logit contributions
 tallest+0.009
 cube+0.009
 nail+0.008
 cubes+0.008
 brick+0.008
Neg logit contributions
jo-0.010
 Carly-0.009
fulness-0.009
iko-0.008
les-0.008
Token
Token ID198
Feature activation+1.24e-02

Pos logit contributions
fulness+0.028
jo+0.028
le+0.028
les+0.027
ock+0.026
Neg logit contributions
 tallest-0.025
 sp-0.024
 nail-0.023
 square-0.023
 brick-0.023
Token
Token ID198
Feature activation+1.67e-02

Pos logit contributions
fulness+0.026
les+0.025
le+0.024
jo+0.024
elle+0.023
Neg logit contributions
 very-0.011
 nail-0.011
 square-0.011
 sp-0.011
 tallest-0.011
TokenSab
Token ID39646
Feature activation+5.79e-02

Pos logit contributions
jo+0.022
 Carly+0.020
fulness+0.020
les+0.019
le+0.018
Neg logit contributions
 tallest-0.031
 nail-0.030
 cube-0.030
 sp-0.029
 brick-0.029
Tokenrina
Token ID22267
Feature activation+4.51e-03

Pos logit contributions
 Carly+0.071
 Rover+0.064
 Woo+0.060
 Wendy+0.060
ted+0.060
Neg logit contributions
 tallest-0.102
 cube-0.097
 nail-0.095
 cracked-0.094
 brick-0.093
Token was
Token ID373
Feature activation-3.23e-03

Pos logit contributions
jo+0.008
 Carly+0.008
fulness+0.008
iko+0.007
 Carol+0.007
Neg logit contributions
 cube-0.006
 tallest-0.006
 nail-0.006
 cubes-0.006
 brick-0.006
Token very
Token ID845
Feature activation-7.16e-03

Pos logit contributions
 cube+0.004
 cubes+0.004
 tallest+0.003
 nail+0.003
 brick+0.003
Neg logit contributions
jo-0.007
 Carly-0.007
 Debbie-0.006
fulness-0.006
 Carol-0.006
Token thirsty
Token ID47124
Feature activation-1.94e-02

Pos logit contributions
 cubes+0.010
 cube+0.010
 tallest+0.009
 brick+0.009
 bricks+0.009
Neg logit contributions
 Carly-0.013
jo-0.012
 Debbie-0.012
fulness-0.012
 Carol-0.011
Token,
Token ID11
Feature activation-1.73e-03

Pos logit contributions
 cube+0.029
 tallest+0.029
 cubes+0.027
 nail+0.027
 bucket+0.027
Neg logit contributions
jo-0.034
 Carly-0.032
fulness-0.030
iko-0.030
 Carol-0.029
Token to
Token ID284
Feature activation+1.09e-03

Pos logit contributions
jo+0.017
 Carly+0.017
fulness+0.016
iko+0.016
 Carol+0.015
Neg logit contributions
 cube-0.008
 tallest-0.008
 bucket-0.008
 cubes-0.008
 nail-0.008
Token normal
Token ID3487
Feature activation+6.00e-03

Pos logit contributions
jo+0.002
 Carly+0.002
les+0.002
azz+0.002
iko+0.002
Neg logit contributions
 tallest-0.001
 sp-0.001
 nail-0.001
 cube-0.001
 brick-0.001
Token.
Token ID13
Feature activation+6.91e-03

Pos logit contributions
jo+0.013
 Carly+0.012
azz+0.012
of+0.012
iko+0.012
Neg logit contributions
 tallest-0.006
 cube-0.006
 sp-0.006
 nail-0.005
 cubes-0.005
Token
Token ID198
Feature activation+9.63e-03

Pos logit contributions
le+0.013
les+0.013
jo+0.013
ock+0.012
fulness+0.012
Neg logit contributions
 tallest-0.008
 square-0.008
 brick-0.008
 nail-0.007
 sp-0.007
Token
Token ID198
Feature activation+1.25e-02

Pos logit contributions
les+0.020
le+0.020
jo+0.020
elle+0.019
fulness+0.019
Neg logit contributions
 very-0.008
 nail-0.008
 cube-0.008
 brick-0.008
 square-0.008
TokenSab
Token ID39646
Feature activation+5.45e-02

Pos logit contributions
jo+0.024
les+0.023
fulness+0.022
 Carly+0.022
le+0.022
Neg logit contributions
 tallest-0.015
 nail-0.014
 brick-0.014
 sp-0.014
 cube-0.014
Tokenrina
Token ID22267
Feature activation+5.31e-04

Pos logit contributions
 Carly+0.067
 Rover+0.061
 Woo+0.059
 Wendy+0.057
ted+0.057
Neg logit contributions
 tallest-0.097
 cube-0.093
 cracked-0.090
 nail-0.090
 brick-0.089
Token was
Token ID373
Feature activation-6.69e-03

Pos logit contributions
 Carly+0.001
fulness+0.001
jo+0.001
azz+0.001
reating+0.001
Neg logit contributions
 cube-0.001
 sp-0.001
 bucket-0.001
 cracked-0.001
 nail-0.001
Token so
Token ID523
Feature activation-5.51e-03

Pos logit contributions
 tallest+0.005
 cube+0.005
 nail+0.004
 bucket+0.004
 brick+0.004
Neg logit contributions
jo-0.017
 Carly-0.016
fulness-0.016
les-0.016
of-0.015
Token happy
Token ID3772
Feature activation-1.99e-03

Pos logit contributions
 tallest+0.007
 cube+0.007
 cubes+0.007
 nail+0.006
 brick+0.006
Neg logit contributions
jo-0.011
 Carly-0.010
fulness-0.010
 Debbie-0.010
 Carol-0.010
Token and
Token ID290
Feature activation-7.24e-03

Pos logit contributions
 very+0.001
 sp+0.001
 plant+0.001
 in+0.001
 square+0.001
Neg logit contributions
jo-0.005
rob-0.005
haw-0.005
iko-0.005
Eric-0.005
Token one
Token ID530
Feature activation-3.96e-03

Pos logit contributions
 bucket+0.010
 sp+0.010
 cube+0.010
 squash+0.010
 nail+0.010
Neg logit contributions
jo-0.015
 Carly-0.013
fulness-0.013
iko-0.012
 Carol-0.012
Token of
Token ID286
Feature activation-5.90e-03

Pos logit contributions
nam+0.006
 cubes+0.006
 tallest+0.006
 cube+0.006
 magnets+0.006
Neg logit contributions
jo-0.006
le-0.006
fulness-0.006
of-0.006
 Carly-0.005
Token the
Token ID262
Feature activation-5.23e-03

Pos logit contributions
 tallest+0.009
 cube+0.009
 cubes+0.008
 nail+0.008
 brick+0.008
Neg logit contributions
 Carly-0.009
jo-0.009
fulness-0.009
 Carol-0.008
 Debbie-0.008
Token fruits
Token ID15921
Feature activation-2.56e-02

Pos logit contributions
 cube+0.006
 tallest+0.005
 nail+0.005
nam+0.005
 cubes+0.005
Neg logit contributions
jo-0.010
 Carly-0.010
fulness-0.010
les-0.010
le-0.010
Token.
Token ID13
Feature activation+2.19e-03

Pos logit contributions
 tallest+0.041
 cube+0.041
 cubes+0.040
 nail+0.040
 brick+0.039
Neg logit contributions
jo-0.040
 Carly-0.038
fulness-0.036
of-0.034
les-0.034
Token Sab
Token ID9910
Feature activation+5.33e-02

Pos logit contributions
jo+0.004
 Carly+0.004
fulness+0.003
iko+0.003
les+0.003
Neg logit contributions
 tallest-0.003
 cube-0.003
 nail-0.003
 brick-0.003
 sp-0.003
Tokenrina
Token ID22267
Feature activation+7.66e-05

Pos logit contributions
 Carly+0.062
 Rover+0.055
 Audrey+0.051
 Rita+0.051
rob+0.051
Neg logit contributions
 tallest-0.098
 cube-0.096
 nail-0.094
 brick-0.094
 cubes-0.091
Token pl
Token ID458
Feature activation-1.09e-02

Pos logit contributions
jo+0.000
 Carly+0.000
fulness+0.000
 Carol+0.000
iko+0.000
Neg logit contributions
 cube-0.000
 tallest-0.000
 sp-0.000
 bucket-0.000
 nail-0.000
Tokenucked
Token ID17758
Feature activation+5.72e-04

Pos logit contributions
 tallest+0.017
 cube+0.017
 nail+0.016
 cubes+0.016
 brick+0.016
Neg logit contributions
jo-0.018
 Carly-0.017
fulness-0.016
of-0.016
iko-0.015
Token one
Token ID530
Feature activation-4.13e-03

Pos logit contributions
jo+0.001
 Carly+0.001
iko+0.001
fulness+0.001
azz+0.001
Neg logit contributions
 sp-0.001
 tallest-0.001
 dough-0.001
 brick-0.001
 bucket-0.001
Token from
Token ID422
Feature activation-2.41e-02

Pos logit contributions
 tallest+0.007
 cube+0.007
 nail+0.007
 cubes+0.006
 brick+0.006
Neg logit contributions
jo-0.006
 Carly-0.006
fulness-0.006
iko-0.005
 Carol-0.005
Tokeny
Token ID88
Feature activation-1.26e-02

Pos logit contributions
 cube+0.023
 tallest+0.023
 nail+0.022
 brick+0.022
 cubes+0.022
Neg logit contributions
jo-0.028
 Carly-0.028
 Carol-0.025
iko-0.025
fulness-0.025
Token ride
Token ID6594
Feature activation-6.81e-03

Pos logit contributions
 cube+0.013
 tallest+0.013
 nail+0.013
 brick+0.013
 bucket+0.012
Neg logit contributions
jo-0.027
 Carly-0.027
fulness-0.025
 Carol-0.024
iko-0.024
Token."
Token ID526
Feature activation+1.95e-02

Pos logit contributions
 cube+0.008
 tallest+0.008
 nail+0.008
 brick+0.007
 square+0.007
Neg logit contributions
jo-0.014
 Carly-0.013
iko-0.013
azz-0.012
 Carol-0.012
Token
Token ID198
Feature activation+1.67e-02

Pos logit contributions
les+0.035
fulness+0.034
le+0.034
ock+0.034
jo+0.033
Neg logit contributions
 square-0.024
 sp-0.024
 tallest-0.023
 stack-0.022
 nails-0.022
Token
Token ID198
Feature activation+1.85e-02

Pos logit contributions
les+0.034
fulness+0.033
le+0.033
ock+0.031
jo+0.031
Neg logit contributions
 nail-0.015
 very-0.015
 square-0.015
 sp-0.014
 third-0.014
TokenSab
Token ID39646
Feature activation+5.32e-02

Pos logit contributions
jo+0.022
fulness+0.022
les+0.020
 Carly+0.019
le+0.019
Neg logit contributions
 tallest-0.035
 nail-0.034
 sp-0.034
 brick-0.033
 square-0.033
Tokenrina
Token ID22267
Feature activation-4.04e-03

Pos logit contributions
 Carly+0.066
 Rover+0.060
 Woo+0.056
Rust+0.056
 Audrey+0.056
Neg logit contributions
 tallest-0.092
 cube-0.091
 nail-0.088
 cracked-0.087
 brick-0.087
Token wanted
Token ID2227
Feature activation-1.38e-03

Pos logit contributions
 cube+0.005
 sp+0.005
 nail+0.005
 bucket+0.005
 brick+0.005
Neg logit contributions
 Carly-0.008
jo-0.008
fulness-0.007
 Carol-0.007
iko-0.007
Token to
Token ID284
Feature activation+2.88e-03

Pos logit contributions
 nail+0.002
 sp+0.001
 brick+0.001
 nails+0.001
 hammer+0.001
Neg logit contributions
jo-0.003
 Carly-0.003
les-0.003
fulness-0.003
azz-0.003
Token help
Token ID1037
Feature activation-4.84e-03

Pos logit contributions
jo+0.006
 Carly+0.006
fulness+0.005
iko+0.005
les+0.005
Neg logit contributions
 tallest-0.003
 cube-0.003
 nail-0.003
 cubes-0.003
 brick-0.003
Token the
Token ID262
Feature activation-5.41e-03

Pos logit contributions
 nail+0.006
 hammer+0.006
 brick+0.006
 cube+0.005
 rake+0.005
Neg logit contributions
jo-0.010
 Carly-0.009
Eric-0.009
fulness-0.009
azz-0.009
Token to
Token ID284
Feature activation+1.54e-02

Pos logit contributions
jo+0.023
 Carly+0.022
iko+0.021
fulness+0.020
of+0.020
Neg logit contributions
 sp-0.011
 tallest-0.010
 cube-0.010
 square-0.010
 bucket-0.010
Token a
Token ID257
Feature activation+1.80e-02

Pos logit contributions
jo+0.039
les+0.035
azz+0.035
fulness+0.034
iko+0.034
Neg logit contributions
 stack-0.014
 rake-0.014
 nail-0.014
 bucket-0.013
 tallest-0.013
Token safe
Token ID3338
Feature activation+7.84e-03

Pos logit contributions
jo+0.038
 Carly+0.037
fulness+0.035
iko+0.034
les+0.034
Neg logit contributions
 tallest-0.020
 cube-0.019
 nail-0.019
 brick-0.018
 bucket-0.018
Token place
Token ID1295
Feature activation+1.22e-02

Pos logit contributions
jo+0.018
fulness+0.017
 Carly+0.017
azz+0.017
Eric+0.016
Neg logit contributions
 bucket-0.009
 nail-0.008
 square-0.008
 pit-0.008
 structure-0.008
Token.
Token ID13
Feature activation+1.43e-02

Pos logit contributions
 Carly+0.029
jo+0.029
iko+0.028
 Debbie+0.027
April+0.027
Neg logit contributions
 cube-0.011
 bucket-0.010
 square-0.010
 cubes-0.009
 tallest-0.009
Token Wh
Token ID854
Feature activation+5.26e-02

Pos logit contributions
le+0.026
jo+0.025
fulness+0.025
les+0.023
Kat+0.023
Neg logit contributions
 tallest-0.019
 nails-0.019
 hammer-0.018
 sp-0.018
 rake-0.018
Tokenisk
Token ID1984
Feature activation+1.57e-02

Pos logit contributions
 Carly+0.081
 Debbie+0.078
fulness+0.072
jo+0.071
 Carol+0.070
Neg logit contributions
 tallest-0.079
 cube-0.078
 cubes-0.076
 nail-0.076
 hammer-0.074
Tokeners
Token ID364
Feature activation-3.41e-05

Pos logit contributions
 Carly+0.030
jo+0.029
 Debbie+0.027
iko+0.027
azz+0.026
Neg logit contributions
 cube-0.018
 tallest-0.018
 cubes-0.018
 nail-0.018
 bricks-0.017
Token was
Token ID373
Feature activation-4.34e-04

Pos logit contributions
 cube+0.000
 sp+0.000
 bucket+0.000
 cracked+0.000
 split+0.000
Neg logit contributions
jo-0.000
 Carly-0.000
fulness-0.000
iko-0.000
 Debbie-0.000
Token able
Token ID1498
Feature activation+8.54e-03

Pos logit contributions
 tallest+0.000
 very+0.000
 best+0.000
 cube+0.000
 bucket+0.000
Neg logit contributions
jo-0.001
les-0.001
fulness-0.001
of-0.001
 Carly-0.001
Token to
Token ID284
Feature activation-7.30e-03

Pos logit contributions
jo+0.018
 Carly+0.017
le+0.017
les+0.017
fulness+0.016
Neg logit contributions
 cube-0.010
 cubes-0.009
 sp-0.009
 tallest-0.009
 bucket-0.009
Token excited
Token ID6568
Feature activation-2.96e-03

Pos logit contributions
 cube+0.008
 tallest+0.008
 nail+0.008
 cubes+0.008
 brick+0.008
Neg logit contributions
jo-0.013
 Carly-0.012
 Debbie-0.012
fulness-0.012
iko-0.011
Token.
Token ID13
Feature activation+1.24e-02

Pos logit contributions
 cube+0.002
 cubes+0.002
 sp+0.002
 tallest+0.002
 square+0.002
Neg logit contributions
jo-0.007
 Carly-0.007
iko-0.007
Eric-0.007
April-0.007
Token
Token ID198
Feature activation+1.06e-02

Pos logit contributions
le+0.024
fulness+0.024
jo+0.024
les+0.023
ez+0.022
Neg logit contributions
 tallest-0.014
 square-0.014
 brick-0.013
 sp-0.013
 nail-0.013
Token
Token ID198
Feature activation+1.66e-02

Pos logit contributions
fulness+0.025
les+0.024
le+0.023
elle+0.023
ez+0.023
Neg logit contributions
 very-0.007
 square-0.006
 also-0.006
 nail-0.005
 do-0.005
TokenWhen
Token ID2215
Feature activation+3.04e-03

Pos logit contributions
jo+0.029
fulness+0.028
les+0.027
 Carly+0.027
le+0.026
Neg logit contributions
 tallest-0.023
 nail-0.021
 sp-0.021
 square-0.021
 brick-0.020
Token Sab
Token ID9910
Feature activation+5.24e-02

Pos logit contributions
jo+0.005
 Carly+0.004
fulness+0.004
les+0.004
le+0.004
Neg logit contributions
 tallest-0.005
 cube-0.005
 sp-0.005
 nail-0.005
 brick-0.005
Tokenrina
Token ID22267
Feature activation+4.22e-03

Pos logit contributions
 Carly+0.061
 Rover+0.052
fulness+0.051
Rust+0.050
 Audrey+0.050
Neg logit contributions
 tallest-0.095
 cube-0.093
 nail-0.092
 brick-0.090
 cubes-0.090
Token arrived
Token ID5284
Feature activation-1.90e-03

Pos logit contributions
fulness+0.008
jo+0.008
 Carly+0.008
 Carol+0.007
 Debbie+0.007
Neg logit contributions
 sp-0.005
 cube-0.005
 bucket-0.005
 tallest-0.005
 nail-0.005
Token,
Token ID11
Feature activation+4.33e-03

Pos logit contributions
 tallest+0.002
 cube+0.002
 bucket+0.002
 nail+0.002
 cubes+0.002
Neg logit contributions
jo-0.004
 Carly-0.004
fulness-0.003
iko-0.003
of-0.003
Token she
Token ID673
Feature activation+6.53e-03

Pos logit contributions
jo+0.007
fulness+0.007
les+0.007
ez+0.006
le+0.006
Neg logit contributions
 tallest-0.007
nam-0.007
 sp-0.007
 nail-0.007
 brick-0.006
Token was
Token ID373
Feature activation-2.91e-03

Pos logit contributions
fulness+0.013
jo+0.013
 Carly+0.012
ock+0.012
elle+0.012
Neg logit contributions
 sp-0.009
 bucket-0.008
 cube-0.008
 tallest-0.008
 cracked-0.008
Token original
Token ID2656
Feature activation+5.42e-03

Pos logit contributions
 tallest+0.003
 cube+0.003
 nail+0.003
 bucket+0.003
 sp+0.003
Neg logit contributions
jo-0.006
 Carly-0.006
fulness-0.006
of-0.005
les-0.005
Token self
Token ID2116
Feature activation-5.97e-03

Pos logit contributions
jo+0.011
 Carly+0.011
of+0.010
fulness+0.010
azz+0.010
Neg logit contributions
 nail-0.006
 cube-0.006
 tallest-0.006
 bucket-0.006
 sp-0.006
Token.
Token ID13
Feature activation+1.51e-02

Pos logit contributions
 cube+0.007
 tallest+0.007
 cubes+0.006
 nail+0.006
 square+0.006
Neg logit contributions
 Carly-0.012
jo-0.012
 Carol-0.011
 Debbie-0.011
iko-0.011
Token
Token ID198
Feature activation+1.72e-02

Pos logit contributions
fulness+0.023
le+0.023
les+0.023
ock+0.022
jo+0.021
Neg logit contributions
 tallest-0.023
 square-0.022
 sp-0.022
 stack-0.021
 wall-0.021
Token
Token ID198
Feature activation+1.98e-02

Pos logit contributions
fulness+0.037
les+0.037
le+0.036
night+0.035
jo+0.033
Neg logit contributions
 nail-0.013
 very-0.013
 square-0.012
 third-0.012
 tallest-0.012
TokenSab
Token ID39646
Feature activation+5.15e-02

Pos logit contributions
jo+0.027
fulness+0.027
les+0.026
le+0.024
 Carly+0.024
Neg logit contributions
 tallest-0.034
 nail-0.032
 sp-0.032
 square-0.031
 brick-0.031
Tokenrina
Token ID22267
Feature activation-2.56e-03

Pos logit contributions
 Carly+0.063
 Rover+0.056
 Audrey+0.054
 Wendy+0.054
fulness+0.053
Neg logit contributions
 tallest-0.092
 cube-0.091
 nail-0.086
 cracked-0.086
 sp-0.086
Token was
Token ID373
Feature activation-3.14e-03

Pos logit contributions
 cube+0.003
 sp+0.003
 nail+0.003
 tallest+0.003
 bucket+0.003
Neg logit contributions
 Carly-0.005
jo-0.005
fulness-0.005
 Carol-0.005
reating-0.004
Token scared
Token ID12008
Feature activation+2.73e-03

Pos logit contributions
 tallest+0.003
 cube+0.002
 nail+0.002
 brick+0.002
 sp+0.002
Neg logit contributions
jo-0.008
 Carly-0.007
fulness-0.007
of-0.007
iko-0.007
Token,
Token ID11
Feature activation-9.26e-04

Pos logit contributions
rob+0.007
lor+0.007
jo+0.007
haw+0.006
Eric+0.006
Neg logit contributions
 very-0.002
 sp-0.002
 best-0.002
 messy-0.002
 selfish-0.002
Token and
Token ID290
Feature activation+3.35e-03

Pos logit contributions
 very+0.001
nam+0.001
 tallest+0.001
 sp+0.001
 nail+0.001
Neg logit contributions
jo-0.002
les-0.002
rob-0.002
lor-0.002
ucks-0.002
Token was
Token ID373
Feature activation+4.10e-03

Pos logit contributions
jo+0.044
 Carly+0.044
fulness+0.042
iko+0.040
 Debbie+0.040
Neg logit contributions
 sp-0.022
 cube-0.022
 tallest-0.021
 nail-0.021
 bucket-0.021
Token gone
Token ID3750
Feature activation+3.09e-03

Pos logit contributions
jo+0.009
 Carly+0.009
fulness+0.009
iko+0.008
of+0.008
Neg logit contributions
 tallest-0.004
 cube-0.004
 nail-0.004
 cubes-0.004
 brick-0.004
Token!
Token ID0
Feature activation+2.17e-02

Pos logit contributions
jo+0.006
 Carly+0.006
 Debbie+0.005
iko+0.005
fulness+0.005
Neg logit contributions
 tallest-0.004
 cube-0.004
 bucket-0.004
 sp-0.004
 brick-0.004
Token
Token ID198
Feature activation+1.52e-02

Pos logit contributions
le+0.043
fulness+0.042
les+0.042
reating+0.041
jo+0.040
Neg logit contributions
 nail-0.023
 square-0.023
 brick-0.023
 sp-0.022
 tallest-0.022
Token
Token ID198
Feature activation+2.37e-02

Pos logit contributions
fulness+0.031
 Carly+0.030
jo+0.030
les+0.029
reating+0.029
Neg logit contributions
 nail-0.015
 sp-0.014
 cube-0.014
 hammer-0.014
 brick-0.014
TokenCh
Token ID1925
Feature activation+5.09e-02

Pos logit contributions
jo+0.037
 Carly+0.035
fulness+0.035
les+0.033
 Debbie+0.032
Neg logit contributions
 nail-0.037
 tallest-0.036
 sp-0.036
 cube-0.036
 brick-0.035
Tokenir
Token ID343
Feature activation+1.85e-02

Pos logit contributions
 Carly+0.076
jo+0.072
 Debbie+0.066
fulness+0.065
 Carol+0.065
Neg logit contributions
 cube-0.088
 nail-0.085
 brick-0.084
 cubes-0.084
 tallest-0.084
Tokenpy
Token ID9078
Feature activation+2.37e-03

Pos logit contributions
 Carly+0.027
jo+0.026
 Debbie+0.024
fulness+0.023
 Carol+0.023
Neg logit contributions
 cube-0.032
 tallest-0.032
 nail-0.032
 brick-0.031
 cubes-0.031
Token was
Token ID373
Feature activation-8.05e-05

Pos logit contributions
fulness+0.005
 Carly+0.005
jo+0.005
reating+0.005
 Debbie+0.005
Neg logit contributions
 sp-0.003
 cube-0.003
 cracked-0.002
 bucket-0.002
 split-0.002
Token sad
Token ID6507
Feature activation-8.07e-05

Pos logit contributions
 tallest+0.000
 cube+0.000
 nail+0.000
 bucket+0.000
 brick+0.000
Neg logit contributions
jo-0.000
 Carly-0.000
fulness-0.000
of-0.000
les-0.000
Token that
Token ID326
Feature activation+6.23e-03

Pos logit contributions
 sp+0.000
 cube+0.000
 rake+0.000
 nail+0.000
 bucket+0.000
Neg logit contributions
rob-0.000
jo-0.000
 Carly-0.000
lor-0.000
Eric-0.000
TokenA
Token ID32
Feature activation+7.69e-03

Pos logit contributions
jo+0.018
fulness+0.017
les+0.017
 Carly+0.016
le+0.016
Neg logit contributions
 tallest-0.026
 nail-0.025
 sp-0.025
 brick-0.024
 square-0.024
Token few
Token ID1178
Feature activation+1.94e-02

Pos logit contributions
jo+0.012
 Carly+0.012
les+0.012
fulness+0.012
le+0.012
Neg logit contributions
 sp-0.011
 nail-0.011
 cube-0.011
 tallest-0.011
 square-0.011
Token days
Token ID1528
Feature activation+1.08e-03

Pos logit contributions
jo+0.032
 Carly+0.030
fulness+0.028
iko+0.027
 Carol+0.027
Neg logit contributions
 tallest-0.030
 cube-0.030
 nail-0.029
 cubes-0.029
 brick-0.029
Token later
Token ID1568
Feature activation+1.10e-02

Pos logit contributions
jo+0.002
 Carly+0.002
fulness+0.002
iko+0.002
 Carol+0.002
Neg logit contributions
 cube-0.001
 nail-0.001
 cubes-0.001
 bucket-0.001
 brick-0.001
Token,
Token ID11
Feature activation+5.53e-03

Pos logit contributions
jo+0.026
of+0.024
haw+0.024
azz+0.024
ock+0.024
Neg logit contributions
 sp-0.008
 nail-0.008
 square-0.007
 split-0.007
 cube-0.007
Token Sab
Token ID9910
Feature activation+5.04e-02

Pos logit contributions
jo+0.010
les+0.009
ez+0.009
fulness+0.008
ock+0.008
Neg logit contributions
 tallest-0.009
 sp-0.009
nam-0.008
 best-0.008
 nail-0.008
Tokenrina
Token ID22267
Feature activation+1.09e-03

Pos logit contributions
 Carly+0.058
 Rover+0.051
rob+0.051
 Audrey+0.050
Rust+0.049
Neg logit contributions
 tallest-0.091
 cube-0.089
 nail-0.088
 brick-0.087
 cracked-0.085
Token had
Token ID550
Feature activation-1.72e-03

Pos logit contributions
 Carly+0.002
fulness+0.002
jo+0.002
reating+0.002
 Carol+0.002
Neg logit contributions
 cube-0.001
 sp-0.001
 nail-0.001
 cracked-0.001
 bucket-0.001
Token to
Token ID284
Feature activation+1.76e-03

Pos logit contributions
 tallest+0.002
 sp+0.002
 cube+0.002
 nail+0.002
 bucket+0.002
Neg logit contributions
jo-0.004
 Carly-0.003
fulness-0.003
le-0.003
les-0.003
Token stay
Token ID2652
Feature activation-2.14e-03

Pos logit contributions
jo+0.004
 Carly+0.004
fulness+0.003
iko+0.003
les+0.003
Neg logit contributions
 tallest-0.002
 cube-0.002
 nail-0.002
 cubes-0.002
 brick-0.002
Token in
Token ID287
Feature activation+2.55e-03

Pos logit contributions
 tallest+0.002
 bucket+0.002
 cube+0.002
 nail+0.002
 sp+0.002
Neg logit contributions
jo-0.005
 Carly-0.005
fulness-0.005
of-0.005
azz-0.004
Token in
Token ID287
Feature activation+1.22e-02

Pos logit contributions
jo+0.007
 Carly+0.006
fulness+0.006
iko+0.006
 Carol+0.006
Neg logit contributions
 cube-0.004
 tallest-0.004
 cubes-0.004
 nail-0.004
 bucket-0.004
Token the
Token ID262
Feature activation+4.71e-03

Pos logit contributions
jo+0.029
fulness+0.027
les+0.027
anna+0.026
rob+0.026
Neg logit contributions
 sp-0.011
 square-0.011
 brick-0.011
 bucket-0.011
 nail-0.011
Token center
Token ID3641
Feature activation+9.08e-03

Pos logit contributions
jo+0.010
 Carly+0.009
fulness+0.009
 Carol+0.009
iko+0.009
Neg logit contributions
 tallest-0.006
 cube-0.005
 bucket-0.005
 sp-0.005
 cubes-0.005
Token.
Token ID13
Feature activation+1.89e-02

Pos logit contributions
jo+0.016
 Carly+0.015
fulness+0.014
iko+0.014
les+0.014
Neg logit contributions
 tallest-0.013
 cube-0.013
 nail-0.013
 cubes-0.013
 brick-0.012
Token Then
Token ID3244
Feature activation+1.67e-02

Pos logit contributions
jo+0.025
fulness+0.025
le+0.024
les+0.023
 Carly+0.023
Neg logit contributions
 tallest-0.034
 square-0.032
 sp-0.032
 nail-0.031
 brick-0.031
Token Sab
Token ID9910
Feature activation+5.04e-02

Pos logit contributions
jo+0.039
fulness+0.037
les+0.036
ock+0.035
ez+0.034
Neg logit contributions
 sp-0.016
 tallest-0.015
 best-0.013
 nail-0.013
 nails-0.013
Tokenrina
Token ID22267
Feature activation+2.11e-03

Pos logit contributions
 Carly+0.058
 Rover+0.051
rob+0.048
 Audrey+0.048
Rust+0.047
Neg logit contributions
 tallest-0.093
 cube-0.091
 nail-0.090
 brick-0.089
 cubes-0.087
Token saw
Token ID2497
Feature activation-8.19e-05

Pos logit contributions
 Carly+0.004
jo+0.004
fulness+0.004
 Carol+0.004
 Debbie+0.003
Neg logit contributions
 cube-0.003
 sp-0.003
 bucket-0.003
 tallest-0.003
 nail-0.003
Token something
Token ID1223
Feature activation+8.75e-03

Pos logit contributions
 sp+0.000
 tallest+0.000
 bricks+0.000
 nail+0.000
 brick+0.000
Neg logit contributions
jo-0.000
fulness-0.000
azz-0.000
 Carly-0.000
rett-0.000
Token even
Token ID772
Feature activation+2.50e-03

Pos logit contributions
jo+0.018
 Carly+0.018
fulness+0.017
les+0.016
 Carol+0.016
Neg logit contributions
 tallest-0.010
 cube-0.009
 cubes-0.009
 nail-0.009
 brick-0.009
Token more
Token ID517
Feature activation+9.53e-03

Pos logit contributions
jo+0.005
 Carly+0.005
fulness+0.005
iko+0.005
 Carol+0.005
Neg logit contributions
 tallest-0.003
 cube-0.003
 nail-0.003
 cubes-0.003
 sp-0.003
Token to
Token ID284
Feature activation-6.48e-03

Pos logit contributions
jo+0.017
 Carly+0.017
fulness+0.016
les+0.015
iko+0.015
Neg logit contributions
 cube-0.015
 tallest-0.015
 nail-0.015
 brick-0.014
 sp-0.014
Token do
Token ID466
Feature activation+1.60e-03

Pos logit contributions
 cube+0.003
 tallest+0.003
 nail+0.003
 bucket+0.003
 brick+0.003
Neg logit contributions
jo-0.018
 Carly-0.017
fulness-0.016
iko-0.016
les-0.016
Token.
Token ID13
Feature activation+1.92e-02

Pos logit contributions
jo+0.003
 Carly+0.003
iko+0.003
fulness+0.003
 Debbie+0.003
Neg logit contributions
 cube-0.002
 tallest-0.002
 cubes-0.002
 nail-0.002
 sp-0.002
Token
Token ID198
Feature activation+8.33e-03

Pos logit contributions
le+0.037
fulness+0.037
jo+0.036
reating+0.035
les+0.034
Neg logit contributions
 tallest-0.023
 sp-0.022
 hammer-0.022
 nails-0.022
 rake-0.021
Token
Token ID198
Feature activation+1.78e-02

Pos logit contributions
fulness+0.016
reating+0.016
le+0.016
jo+0.016
 Carly+0.016
Neg logit contributions
 nail-0.008
 hammer-0.008
 cube-0.007
 sp-0.007
 brick-0.007
TokenWh
Token ID1199
Feature activation+5.03e-02

Pos logit contributions
jo+0.028
 Carly+0.027
fulness+0.026
 Debbie+0.024
les+0.024
Neg logit contributions
 tallest-0.028
 cube-0.028
 nail-0.027
 sp-0.027
 brick-0.027
Tokenisk
Token ID1984
Feature activation+1.60e-02

Pos logit contributions
 Debbie+0.064
 Carly+0.062
 Ashley+0.053
fulness+0.053
Kat+0.053
Neg logit contributions
 tallest-0.084
 cube-0.081
 nail-0.080
 cubes-0.079
 rake-0.078
Tokeners
Token ID364
Feature activation+1.11e-03

Pos logit contributions
jo+0.029
 Carly+0.029
iko+0.027
 Debbie+0.027
 Carol+0.026
Neg logit contributions
 cube-0.020
 tallest-0.020
 cubes-0.019
 nail-0.019
 brick-0.018
Token saw
Token ID2497
Feature activation+1.45e-02

Pos logit contributions
jo+0.002
 Carly+0.002
fulness+0.002
 Debbie+0.002
iko+0.002
Neg logit contributions
 cube-0.001
 sp-0.001
 nail-0.001
 bucket-0.001
 tallest-0.001
Token that
Token ID326
Feature activation+1.03e-02

Pos logit contributions
jo+0.027
fulness+0.026
Kat+0.024
 Debbie+0.024
le+0.024
Neg logit contributions
 sp-0.020
 nails-0.019
 hammer-0.019
 nail-0.019
 bricks-0.018
Token his
Token ID465
Feature activation+1.60e-03

Pos logit contributions
jo+0.019
fulness+0.018
le+0.018
lor+0.018
les+0.018
Neg logit contributions
 hammer-0.014
 sp-0.014
 nail-0.014
 nails-0.014
nam-0.013
Token soft
Token ID2705
Feature activation+3.99e-03

Pos logit contributions
jo+0.027
 Carly+0.026
fulness+0.025
iko+0.024
les+0.024
Neg logit contributions
 tallest-0.012
 cube-0.012
 nail-0.012
 cubes-0.012
 brick-0.011
Token,
Token ID11
Feature activation+1.92e-03

Pos logit contributions
jo+0.009
 Carly+0.008
fulness+0.008
iko+0.008
azz+0.008
Neg logit contributions
 cube-0.004
 bucket-0.004
 nail-0.004
 brick-0.004
 sp-0.004
Token brown
Token ID7586
Feature activation+2.48e-03

Pos logit contributions
jo+0.005
 Carly+0.005
fulness+0.004
iko+0.004
of+0.004
Neg logit contributions
 tallest-0.002
 cube-0.001
 nail-0.001
 sp-0.001
 bucket-0.001
Token centre
Token ID7372
Feature activation+1.45e-02

Pos logit contributions
jo+0.005
 Carly+0.005
fulness+0.005
iko+0.005
 Carol+0.005
Neg logit contributions
 cube-0.003
 bucket-0.003
 nail-0.003
 brick-0.003
 sp-0.003
Token.
Token ID13
Feature activation+1.55e-02

Pos logit contributions
jo+0.028
 Carly+0.027
fulness+0.026
iko+0.025
 Carol+0.024
Neg logit contributions
 cube-0.018
 tallest-0.018
 nail-0.018
 cubes-0.018
 brick-0.017
Token Sab
Token ID9910
Feature activation+5.02e-02

Pos logit contributions
jo+0.024
fulness+0.023
le+0.022
les+0.022
ock+0.021
Neg logit contributions
 tallest-0.025
 sp-0.024
 square-0.024
 nail-0.023
 nails-0.023
Tokenina
Token ID1437
Feature activation+5.51e-03

Pos logit contributions
 Carly+0.066
jo+0.059
 Rover+0.058
 Carol+0.056
fulness+0.056
Neg logit contributions
 tallest-0.087
 cube-0.087
 nail-0.085
 cubes-0.084
 sp-0.083
Token couldn
Token ID3521
Feature activation+9.88e-03

Pos logit contributions
jo+0.011
 Carly+0.010
fulness+0.010
iko+0.009
 Carol+0.009
Neg logit contributions
 cube-0.007
 tallest-0.007
 nail-0.006
 cubes-0.006
 brick-0.006
Token't
Token ID470
Feature activation+5.24e-03

Pos logit contributions
jo+0.015
 Carly+0.014
fulness+0.013
iko+0.013
les+0.013
Neg logit contributions
 tallest-0.016
 cube-0.016
 nail-0.016
 brick-0.016
 cubes-0.016
Token resist
Token ID4180
Feature activation-5.67e-04

Pos logit contributions
jo+0.010
 Carly+0.009
fulness+0.009
iko+0.009
les+0.008
Neg logit contributions
 cube-0.007
 tallest-0.007
 nail-0.007
 brick-0.007
 cubes-0.007
Token picking
Token ID10868
Feature activation-1.59e-02

Pos logit contributions
 sp+0.001
 square+0.001
 squash+0.001
 nail+0.001
 bucket+0.001
Neg logit contributions
jo-0.001
unt-0.001
azz-0.001
rob-0.001
lor-0.001
Token play
Token ID711
Feature activation+9.03e-03

Pos logit contributions
jo+0.006
 Carly+0.005
iko+0.005
fulness+0.005
les+0.005
Neg logit contributions
 cube-0.003
 bucket-0.003
 nail-0.003
 tallest-0.003
 stack-0.003
Token.
Token ID13
Feature activation+1.30e-03

Pos logit contributions
jo+0.019
 Carly+0.018
iko+0.017
 Debbie+0.017
la+0.016
Neg logit contributions
 cube-0.011
 cubes-0.011
 square-0.010
 tallest-0.010
 bucket-0.010
Token
Token ID220
Feature activation-9.94e-03

Pos logit contributions
jo+0.003
le+0.003
les+0.002
reating+0.002
fulness+0.002
Neg logit contributions
 tallest-0.001
 cube-0.001
 square-0.001
 sp-0.001
 bucket-0.001
Token
Token ID198
Feature activation+5.47e-03

Pos logit contributions
 tallest+0.013
 cube+0.013
 square+0.012
 cubes+0.012
 nail+0.012
Neg logit contributions
jo-0.018
 Carly-0.017
fulness-0.017
le-0.017
iko-0.016
Token
Token ID198
Feature activation+8.72e-03

Pos logit contributions
jo+0.011
elle+0.010
 Carly+0.010
les+0.010
le+0.010
Neg logit contributions
 cube-0.006
 tallest-0.006
 sp-0.006
 square-0.006
 bucket-0.005
TokenEl
Token ID9527
Feature activation+5.00e-02

Pos logit contributions
jo+0.018
 Carly+0.017
fulness+0.016
les+0.016
elle+0.016
Neg logit contributions
 tallest-0.010
 cube-0.009
 sp-0.009
 brick-0.009
 nail-0.009
Tokenmo
Token ID5908
Feature activation-9.76e-03

Pos logit contributions
jo+0.074
 Carly+0.071
fulness+0.066
 Carol+0.063
iko+0.063
Neg logit contributions
 tallest-0.085
 cube-0.085
 nail-0.082
 cubes-0.082
 brick-0.081
Token and
Token ID290
Feature activation-1.28e-02

Pos logit contributions
 cube+0.012
 tallest+0.011
 sp+0.011
 bucket+0.011
 nail+0.011
Neg logit contributions
 Carly-0.019
jo-0.019
fulness-0.019
 Carol-0.017
reating-0.017
Token the
Token ID262
Feature activation+3.51e-03

Pos logit contributions
 cube+0.020
 tallest+0.020
 nail+0.020
 brick+0.019
 sp+0.019
Neg logit contributions
jo-0.021
 Carly-0.019
fulness-0.019
les-0.018
azz-0.018
Token dragon
Token ID10441
Feature activation+1.89e-03

Pos logit contributions
jo+0.006
 Carly+0.005
fulness+0.005
of+0.005
iko+0.005
Neg logit contributions
 tallest-0.006
 cube-0.006
 nail-0.006
 brick-0.006
 sp-0.005
Tokenfly
Token ID12254
Feature activation-1.48e-02

Pos logit contributions
 Carly+0.004
jo+0.004
fulness+0.004
 Debbie+0.004
 Carol+0.003
Neg logit contributions
 cube-0.002
 bucket-0.002
 sp-0.002
 tallest-0.002
 brick-0.002
Token from
Token ID422
Feature activation-1.70e-04

Pos logit contributions
jo+0.004
 Carly+0.004
iko+0.004
 Debbie+0.004
fulness+0.004
Neg logit contributions
 cube-0.002
 tallest-0.002
 nail-0.002
 bucket-0.002
 brick-0.002
Token Buster
Token ID34452
Feature activation+1.02e-02

Pos logit contributions
 nail+0.000
 tallest+0.000
 cube+0.000
 brick+0.000
 hammer+0.000
Neg logit contributions
jo-0.000
fulness-0.000
 Carly-0.000
lor-0.000
of-0.000
Token.
Token ID13
Feature activation+1.95e-02

Pos logit contributions
jo+0.020
 Carly+0.019
fulness+0.018
iko+0.018
 Debbie+0.018
Neg logit contributions
 tallest-0.012
 cube-0.012
 nail-0.012
 brick-0.012
 cubes-0.011
Token So
Token ID1406
Feature activation+2.28e-02

Pos logit contributions
jo+0.040
le+0.040
fulness+0.040
 Carly+0.038
 Debbie+0.038
Neg logit contributions
 tallest-0.019
 nails-0.019
 sp-0.019
 hammer-0.018
 cube-0.018
Token,
Token ID11
Feature activation+1.27e-02

Pos logit contributions
jo+0.048
Kat+0.045
fulness+0.045
 Debbie+0.044
le+0.044
Neg logit contributions
 sp-0.024
 hammer-0.024
 nails-0.023
 best-0.023
 nail-0.022
Token Wh
Token ID854
Feature activation+4.98e-02

Pos logit contributions
jo+0.025
Kat+0.023
fulness+0.023
anne+0.023
oph+0.022
Neg logit contributions
 hammer-0.016
 sp-0.015
 nails-0.015
 very-0.015
 best-0.015
Tokenisk
Token ID1984
Feature activation+1.65e-02

Pos logit contributions
 Carly+0.075
 Debbie+0.072
 Carol+0.065
fulness+0.065
jo+0.064
Neg logit contributions
 cube-0.077
 tallest-0.076
 cubes-0.076
 nail-0.074
 hammer-0.073
Tokeners
Token ID364
Feature activation-2.73e-03

Pos logit contributions
jo+0.030
 Carly+0.030
 Debbie+0.027
iko+0.027
 Carol+0.026
Neg logit contributions
 cube-0.021
 tallest-0.021
 nail-0.021
 cubes-0.021
 brick-0.020
Token jumped
Token ID11687
Feature activation+9.91e-03

Pos logit contributions
 cube+0.004
 nail+0.004
 bucket+0.004
 sp+0.004
 tallest+0.004
Neg logit contributions
jo-0.005
 Carly-0.005
fulness-0.005
 Debbie-0.004
iko-0.004
Token down
Token ID866
Feature activation+6.90e-03

Pos logit contributions
jo+0.021
 Carly+0.019
iko+0.018
fulness+0.018
 Debbie+0.018
Neg logit contributions
 sp-0.012
 cube-0.012
 tallest-0.012
 bucket-0.012
 brick-0.012
Token from
Token ID422
Feature activation-2.15e-03

Pos logit contributions
jo+0.012
 Carly+0.011
fulness+0.011
iko+0.011
le+0.011
Neg logit contributions
 cube-0.010
 sp-0.010
 brick-0.010
 nail-0.010
 tallest-0.010
Token There
Token ID1318
Feature activation+1.03e-02

Pos logit contributions
 Carly+0.001
jo+0.001
 Carol+0.001
fulness+0.001
 Margaret+0.001
Neg logit contributions
 tallest-0.001
 cube-0.001
 cubes-0.001
 nail-0.001
 brick-0.001
Token were
Token ID547
Feature activation+6.22e-03

Pos logit contributions
jo+0.021
 Carly+0.020
fulness+0.019
iko+0.018
les+0.018
Neg logit contributions
 tallest-0.012
 cube-0.012
 nail-0.012
 cubes-0.012
 brick-0.011
Token princess
Token ID21752
Feature activation+3.50e-02

Pos logit contributions
jo+0.014
fulness+0.014
 Carly+0.013
le+0.013
iko+0.013
Neg logit contributions
 tallest-0.006
 sp-0.006
 cubes-0.006
 square-0.006
 cube-0.006
Token dresses
Token ID27309
Feature activation+1.30e-02

Pos logit contributions
fulness+0.059
jo+0.054
 Carly+0.054
 Debbie+0.052
unt+0.052
Neg logit contributions
 sp-0.054
 tallest-0.052
 cube-0.051
 square-0.051
 cubes-0.051
Token,
Token ID11
Feature activation+1.71e-02

Pos logit contributions
jo+0.027
 Carly+0.026
iko+0.024
fulness+0.024
 Debbie+0.024
Neg logit contributions
 tallest-0.016
 cube-0.015
 sp-0.015
 cubes-0.015
 square-0.015
Token superhero
Token ID17343
Feature activation+4.97e-02

Pos logit contributions
jo+0.039
fulness+0.036
 Carly+0.036
rett+0.035
lor+0.034
Neg logit contributions
 sp-0.019
 tallest-0.018
 cubes-0.018
nam-0.018
 cube-0.017
Token cap
Token ID1451
Feature activation+2.25e-02

Pos logit contributions
fulness+0.092
Kat+0.088
jo+0.088
 Debbie+0.088
 Carly+0.087
Neg logit contributions
 sp-0.065
 cubes-0.061
 stack-0.060
 blocks-0.060
 nails-0.059
Tokenes
Token ID274
Feature activation+2.85e-02

Pos logit contributions
jo+0.042
 Carly+0.041
fulness+0.039
 Debbie+0.039
Kat+0.039
Neg logit contributions
 cubes-0.029
 sp-0.029
 cube-0.029
 tallest-0.028
 square-0.027
Token,
Token ID11
Feature activation+1.78e-02

Pos logit contributions
jo+0.064
 Carly+0.062
fulness+0.060
iko+0.059
 Debbie+0.059
Neg logit contributions
 sp-0.028
 cube-0.027
 square-0.027
 cubes-0.026
 bricks-0.025
Token and
Token ID290
Feature activation+1.17e-02

Pos logit contributions
jo+0.036
fulness+0.033
 Carly+0.033
rett+0.031
 Debbie+0.030
Neg logit contributions
 sp-0.024
 cubes-0.023
 tallest-0.023
 cube-0.022
 square-0.022
Token even
Token ID772
Feature activation+1.58e-02

Pos logit contributions
jo+0.024
 Carly+0.022
fulness+0.022
rett+0.021
iko+0.021
Neg logit contributions
 tallest-0.015
nam-0.014
 sp-0.014
 cubes-0.014
 square-0.014
Token he
Token ID339
Feature activation-6.11e-03

Pos logit contributions
fulness+0.026
jo+0.026
les+0.024
lor+0.023
ock+0.023
Neg logit contributions
 sp-0.017
 square-0.016
 nail-0.016
nam-0.016
 hammer-0.016
Token said
Token ID531
Feature activation+2.41e-03

Pos logit contributions
 sp+0.008
 cube+0.008
 nail+0.008
 square+0.008
 cracked+0.008
Neg logit contributions
jo-0.011
fulness-0.011
 Carly-0.010
les-0.010
iko-0.010
Token.
Token ID13
Feature activation+1.07e-02

Pos logit contributions
jo+0.005
 Carly+0.004
fulness+0.004
iko+0.004
les+0.004
Neg logit contributions
 cube-0.003
 tallest-0.003
 nail-0.003
 cubes-0.003
 brick-0.003
Token
Token ID198
Feature activation+2.07e-03

Pos logit contributions
le+0.024
fulness+0.024
ock+0.024
reating+0.023
les+0.023
Neg logit contributions
 square-0.009
 sp-0.008
 very-0.007
 hammer-0.007
 tallest-0.007
Token
Token ID198
Feature activation+4.43e-03

Pos logit contributions
fulness+0.005
le+0.005
les+0.005
elle+0.005
reating+0.005
Neg logit contributions
 square-0.001
 nail-0.001
 very-0.001
 hammer-0.001
 cube-0.001
TokenSab
Token ID39646
Feature activation+4.96e-02

Pos logit contributions
jo+0.008
fulness+0.007
les+0.007
 Carly+0.007
le+0.007
Neg logit contributions
 tallest-0.006
 nail-0.006
 square-0.006
 cube-0.006
 sp-0.006
Tokenina
Token ID1437
Feature activation-5.70e-03

Pos logit contributions
 Carly+0.071
 Rover+0.065
fulness+0.063
 Audrey+0.062
jo+0.062
Neg logit contributions
 cube-0.079
 tallest-0.078
 sp-0.076
 nail-0.076
 cracked-0.072
Token felt
Token ID2936
Feature activation-8.79e-04

Pos logit contributions
 tallest+0.008
 cube+0.008
 cubes+0.008
 nail+0.008
 brick+0.008
Neg logit contributions
jo-0.010
 Carly-0.009
fulness-0.009
les-0.009
of-0.009
Token sad
Token ID6507
Feature activation+2.67e-03

Pos logit contributions
 cube+0.001
 tallest+0.001
 cubes+0.001
 brick+0.001
 nail+0.001
Neg logit contributions
jo-0.002
 Carly-0.002
fulness-0.002
le-0.002
 Carol-0.002
Token.
Token ID13
Feature activation+3.88e-03

Pos logit contributions
jo+0.006
 Carly+0.006
 Debbie+0.006
rob+0.006
Jo+0.006
Neg logit contributions
 sp-0.003
 cube-0.002
 very-0.002
 best-0.002
 tallest-0.002
Token She
Token ID1375
Feature activation-3.06e-03

Pos logit contributions
jo+0.008
le+0.007
fulness+0.007
iko+0.007
les+0.007
Neg logit contributions
 tallest-0.005
 square-0.004
 cube-0.004
 brick-0.004
 nail-0.004
Token fly
Token ID6129
Feature activation-2.05e-03

Pos logit contributions
 cubes+0.013
nam+0.013
 cube+0.013
 tallest+0.013
 brick+0.013
Neg logit contributions
 Carly-0.021
jo-0.020
les-0.020
of-0.020
fulness-0.020
Token in
Token ID287
Feature activation-2.31e-03

Pos logit contributions
 cube+0.003
 tallest+0.003
 nail+0.003
 brick+0.003
 bucket+0.003
Neg logit contributions
jo-0.004
 Carly-0.004
fulness-0.003
iko-0.003
 Debbie-0.003
Token his
Token ID465
Feature activation+4.14e-03

Pos logit contributions
 sp+0.003
 square+0.003
 bucket+0.003
 brick+0.003
 nail+0.003
Neg logit contributions
jo-0.005
anna-0.005
iko-0.005
la-0.005
 Carly-0.005
Token mouth
Token ID5422
Feature activation-1.27e-02

Pos logit contributions
jo+0.008
 Carly+0.008
of+0.007
iko+0.007
les+0.007
Neg logit contributions
nam-0.005
 tallest-0.005
 cubes-0.005
 cube-0.004
 nail-0.004
Token.
Token ID13
Feature activation-7.86e-04

Pos logit contributions
 tallest+0.017
 cube+0.016
 nail+0.016
 cubes+0.016
 brick+0.016
Neg logit contributions
jo-0.024
 Carly-0.023
fulness-0.022
iko-0.021
 Carol-0.021
Token Ho
Token ID9544
Feature activation+4.91e-02

Pos logit contributions
 tallest+0.001
 cube+0.001
 nail+0.001
 cubes+0.001
 brick+0.001
Neg logit contributions
jo-0.001
 Carly-0.001
fulness-0.001
iko-0.001
of-0.001
Tokenppy
Token ID14097
Feature activation+7.40e-03

Pos logit contributions
 Carly+0.089
jo+0.087
 Carol+0.080
 Debbie+0.080
fulness+0.080
Neg logit contributions
 cube-0.066
 tallest-0.065
 nail-0.065
 sp-0.064
 cubes-0.062
Token's
Token ID338
Feature activation-7.02e-03

Pos logit contributions
jo+0.014
 Carly+0.014
fulness+0.013
iko+0.013
 Debbie+0.013
Neg logit contributions
 cube-0.009
 tallest-0.009
 sp-0.009
 nail-0.009
 bucket-0.009
Token friends
Token ID2460
Feature activation+2.04e-03

Pos logit contributions
 tallest+0.008
 cube+0.008
 cubes+0.008
 nail+0.008
 brick+0.008
Neg logit contributions
jo-0.014
 Carly-0.014
fulness-0.013
iko-0.013
les-0.012
Token cheered
Token ID34767
Feature activation+2.42e-02

Pos logit contributions
jo+0.004
 Carly+0.004
fulness+0.004
iko+0.004
 Debbie+0.004
Neg logit contributions
 cube-0.002
 sp-0.002
 bucket-0.002
 brick-0.002
 nail-0.002
Token and
Token ID290
Feature activation+9.49e-03

Pos logit contributions
jo+0.053
lor+0.052
rob+0.052
 Debbie+0.052
 Carly+0.051
Neg logit contributions
 very-0.024
 sp-0.023
 cube-0.023
 squash-0.021
 both-0.021
Token her
Token ID607
Feature activation+3.09e-03

Pos logit contributions
jo+0.019
les+0.017
lor+0.016
azz+0.016
le+0.016
Neg logit contributions
 nail-0.012
 hammer-0.011
 nails-0.011
 rake-0.010
nam-0.010
Token!
Token ID0
Feature activation+3.06e-03

Pos logit contributions
jo+0.005
 Carly+0.004
fulness+0.004
iko+0.004
les+0.004
Neg logit contributions
 tallest-0.005
 cube-0.005
 cubes-0.005
 nail-0.005
 brick-0.005
Token
Token ID220
Feature activation-3.38e-03

Pos logit contributions
le+0.006
jo+0.006
les+0.005
fulness+0.005
iko+0.005
Neg logit contributions
 square-0.004
 tallest-0.004
 nail-0.004
 cube-0.004
 brick-0.003
Token
Token ID198
Feature activation+7.81e-03

Pos logit contributions
 tallest+0.005
 nail+0.005
 cube+0.004
 hammer+0.004
 rake+0.004
Neg logit contributions
jo-0.006
le-0.006
fulness-0.005
 Carly-0.005
les-0.005
Token
Token ID198
Feature activation+1.07e-02

Pos logit contributions
jo+0.016
le+0.015
les+0.014
fulness+0.014
iko+0.014
Neg logit contributions
 nail-0.008
 cube-0.008
 tallest-0.008
 square-0.008
 rake-0.008
TokenV
Token ID53
Feature activation+4.88e-02

Pos logit contributions
jo+0.021
 Carly+0.019
fulness+0.019
le+0.018
les+0.018
Neg logit contributions
 tallest-0.014
 nail-0.014
 cube-0.013
 brick-0.013
 rake-0.013
Tokeniv
Token ID452
Feature activation+1.87e-02

Pos logit contributions
jo+0.075
 Carly+0.073
fulness+0.066
 Debbie+0.065
 Carol+0.065
Neg logit contributions
 cube-0.080
 cubes-0.077
 nail-0.077
 tallest-0.077
 sp-0.074
Tokenian
Token ID666
Feature activation+2.68e-03

Pos logit contributions
 Carly+0.028
jo+0.026
fulness+0.025
reating+0.024
 Carol+0.024
Neg logit contributions
 cube-0.033
 nail-0.032
 hammer-0.030
 brick-0.030
 sp-0.030
Token enjoyed
Token ID8359
Feature activation+4.65e-03

Pos logit contributions
 Carly+0.005
jo+0.005
fulness+0.005
iko+0.005
 Carol+0.004
Neg logit contributions
 cube-0.004
 nail-0.003
 square-0.003
 sp-0.003
 brick-0.003
Token playing
Token ID2712
Feature activation+8.52e-03

Pos logit contributions
jo+0.010
le+0.009
les+0.009
 Carly+0.009
azz+0.009
Neg logit contributions
 sp-0.006
 cube-0.006
 square-0.006
 cubes-0.005
 nail-0.005
Token with
Token ID351
Feature activation+9.17e-04

Pos logit contributions
jo+0.018
 Carly+0.017
iko+0.016
 Debbie+0.015
la+0.015
Neg logit contributions
 cube-0.011
 nail-0.011
 tallest-0.010
 square-0.010
 bucket-0.010
Token.
Token ID13
Feature activation+4.53e-03

Pos logit contributions
jo+0.006
 Carly+0.006
iko+0.005
fulness+0.005
les+0.005
Neg logit contributions
 cube-0.003
 tallest-0.003
 bucket-0.003
 cubes-0.003
 brick-0.003
Token
Token ID220
Feature activation-5.58e-03

Pos logit contributions
le+0.009
fulness+0.009
les+0.009
jo+0.009
ock+0.008
Neg logit contributions
 square-0.005
 tallest-0.005
 rake-0.004
 stack-0.004
 brick-0.004
Token
Token ID198
Feature activation+4.00e-03

Pos logit contributions
 tallest+0.007
 cube+0.007
 square+0.006
 cubes+0.006
 nail+0.006
Neg logit contributions
jo-0.010
le-0.010
iko-0.010
 Carly-0.010
elle-0.010
Token
Token ID198
Feature activation+8.85e-03

Pos logit contributions
le+0.009
fulness+0.008
les+0.008
jo+0.008
night+0.008
Neg logit contributions
 square-0.003
 tallest-0.003
 cube-0.003
 nail-0.003
 sp-0.003
TokenThe
Token ID464
Feature activation+1.70e-02

Pos logit contributions
jo+0.018
les+0.018
fulness+0.018
le+0.017
 Carly+0.016
Neg logit contributions
 tallest-0.008
 nail-0.008
 square-0.008
 rake-0.008
 sp-0.008
Token Ro
Token ID5564
Feature activation+4.88e-02

Pos logit contributions
jo+0.037
 Carly+0.035
fulness+0.034
 Debbie+0.033
of+0.033
Neg logit contributions
 tallest-0.019
 cube-0.018
 sp-0.018
 nail-0.017
 square-0.016
Tokenaring
Token ID1723
Feature activation+3.58e-03

Pos logit contributions
reating+0.062
 Carly+0.060
 Kathy+0.058
 Debbie+0.057
Kat+0.056
Neg logit contributions
 centimeters-0.075
 square-0.075
 sp-0.075
 tallest-0.074
 cube-0.074
Token Monster
Token ID12635
Feature activation+1.49e-03

Pos logit contributions
Kat+0.007
jo+0.006
irst+0.006
 Carly+0.006
Kelly+0.006
Neg logit contributions
 sp-0.004
 cube-0.004
 square-0.004
 tallest-0.004
 best-0.003
Token was
Token ID373
Feature activation+4.71e-04

Pos logit contributions
 Carly+0.003
jo+0.003
fulness+0.003
iko+0.003
 Debbie+0.003
Neg logit contributions
 cube-0.002
 sp-0.002
 tallest-0.002
 square-0.002
 stack-0.002
Token so
Token ID523
Feature activation+2.10e-03

Pos logit contributions
jo+0.001
 Carly+0.001
iko+0.001
fulness+0.001
of+0.001
Neg logit contributions
 tallest-0.000
 cube-0.000
 sp-0.000
 square-0.000
 nail-0.000
Token happy
Token ID3772
Feature activation-1.40e-03

Pos logit contributions
jo+0.005
 Carly+0.005
fulness+0.004
iko+0.004
les+0.004
Neg logit contributions
 tallest-0.002
 cube-0.002
 sp-0.002
 square-0.002
 nail-0.002
Token mine
Token ID6164
Feature activation-1.32e-02

Pos logit contributions
 tallest+0.014
 cube+0.014
 cubes+0.013
 nail+0.013
 brick+0.013
Neg logit contributions
jo-0.023
 Carly-0.022
fulness-0.021
iko-0.021
 Carol-0.020
Token!"
Token ID2474
Feature activation+2.56e-02

Pos logit contributions
 tallest+0.021
 cube+0.021
 cubes+0.020
 bucket+0.020
 nail+0.020
Neg logit contributions
jo-0.022
 Carly-0.020
fulness-0.019
iko-0.018
of-0.018
Token
Token ID220
Feature activation+1.04e-02

Pos logit contributions
fulness+0.037
jo+0.035
reating+0.034
les+0.034
ock+0.034
Neg logit contributions
 sp-0.043
 square-0.040
 nails-0.039
 nail-0.039
 wall-0.039
Token
Token ID198
Feature activation+1.98e-02

Pos logit contributions
fulness+0.021
jo+0.020
reating+0.020
le+0.020
ock+0.019
Neg logit contributions
 nail-0.010
 tallest-0.010
 nails-0.009
 square-0.009
 sp-0.009
Token
Token ID198
Feature activation+2.23e-02

Pos logit contributions
fulness+0.034
jo+0.032
les+0.032
le+0.031
ock+0.030
Neg logit contributions
 nail-0.027
 sp-0.027
 tallest-0.026
 square-0.026
 hammer-0.025
TokenSab
Token ID39646
Feature activation+4.87e-02

Pos logit contributions
jo+0.026
fulness+0.024
 Carly+0.023
les+0.023
le+0.021
Neg logit contributions
 tallest-0.043
 nail-0.043
 sp-0.043
 cube-0.042
 square-0.041
Tokenina
Token ID1437
Feature activation-2.15e-03

Pos logit contributions
 Carly+0.069
 Rover+0.063
fulness+0.062
jo+0.061
 Audrey+0.060
Neg logit contributions
 tallest-0.078
 cube-0.077
 sp-0.075
 nail-0.075
 cubes-0.072
Token looked
Token ID3114
Feature activation-2.99e-03

Pos logit contributions
 cube+0.003
 sp+0.002
 tallest+0.002
 nail+0.002
 bucket+0.002
Neg logit contributions
jo-0.004
 Carly-0.004
fulness-0.004
iko-0.004
 Debbie-0.004
Token up
Token ID510
Feature activation-3.67e-03

Pos logit contributions
 sp+0.004
 cube+0.004
 tallest+0.004
 bucket+0.004
 brick+0.003
Neg logit contributions
jo-0.006
fulness-0.006
 Carly-0.006
iko-0.006
lor-0.006
Token to
Token ID284
Feature activation-2.99e-03

Pos logit contributions
 sp+0.005
 bucket+0.005
 cube+0.005
 square+0.004
 brick+0.004
Neg logit contributions
jo-0.008
fulness-0.007
 Carly-0.007
iko-0.007
lor-0.007
Token see
Token ID766
Feature activation-4.86e-04

Pos logit contributions
 nail+0.003
 plant+0.003
 square+0.003
 brick+0.003
 bucket+0.002
Neg logit contributions
jo-0.007
lor-0.007
les-0.006
rob-0.006
ez-0.006
Token still
Token ID991
Feature activation+4.11e-03

Pos logit contributions
jo+0.013
 Carly+0.011
iko+0.011
le+0.011
Jo+0.011
Neg logit contributions
 tallest-0.010
 cubes-0.010
 brick-0.010
 bricks-0.010
 cube-0.010
Token wanted
Token ID2227
Feature activation-3.52e-03

Pos logit contributions
 Carly+0.009
jo+0.009
fulness+0.009
 Carol+0.009
iko+0.009
Neg logit contributions
 cube-0.004
 tallest-0.004
 sp-0.004
 nail-0.003
 bucket-0.003
Token to
Token ID284
Feature activation+5.38e-03

Pos logit contributions
 cube+0.005
 tallest+0.005
 cubes+0.005
 sp+0.005
 brick+0.005
Neg logit contributions
jo-0.007
 Carly-0.006
fulness-0.006
iko-0.006
les-0.006
Token eat
Token ID4483
Feature activation-1.03e-02

Pos logit contributions
jo+0.010
 Carly+0.010
fulness+0.009
of+0.009
 Carol+0.009
Neg logit contributions
 tallest-0.007
 cube-0.007
 cubes-0.006
 brick-0.006
 nail-0.006
Token something
Token ID1223
Feature activation-3.37e-03

Pos logit contributions
 tallest+0.015
 cube+0.015
 nail+0.014
 cubes+0.014
 brick+0.014
Neg logit contributions
jo-0.018
 Carly-0.017
fulness-0.016
iko-0.015
les-0.015
Token scr
Token ID6040
Feature activation-6.59e-02

Pos logit contributions
 tallest+0.004
 nail+0.004
 cube+0.004
 brick+0.004
 cubes+0.004
Neg logit contributions
jo-0.006
fulness-0.006
 Carly-0.006
le-0.006
of-0.006
Tokenum
Token ID388
Feature activation-8.79e-03

Pos logit contributions
 tallest+0.131
nam+0.128
 cube+0.126
 sp+0.125
 nail+0.122
Neg logit contributions
jo-0.077
of-0.068
fulness-0.067
 Carly-0.064
les-0.063
Tokenpt
Token ID457
Feature activation-1.15e-02

Pos logit contributions
 tallest+0.012
 cube+0.012
 nail+0.012
 cubes+0.012
 brick+0.012
Neg logit contributions
jo-0.016
 Carly-0.015
fulness-0.014
les-0.014
iko-0.014
Tokenious
Token ID699
Feature activation-7.51e-03

Pos logit contributions
 cube+0.018
 nail+0.018
 dough+0.017
 bucket+0.017
 tallest+0.017
Neg logit contributions
jo-0.018
fulness-0.016
 Carly-0.016
le-0.016
les-0.015
Token.
Token ID13
Feature activation-9.14e-03

Pos logit contributions
 cube+0.009
 tallest+0.009
 cubes+0.009
 nail+0.009
 brick+0.009
Neg logit contributions
jo-0.015
 Carly-0.014
fulness-0.013
iko-0.013
 Carol-0.013
Token
Token ID220
Feature activation-9.94e-03

Pos logit contributions
 tallest+0.012
 cube+0.012
 cubes+0.011
 square+0.011
 stack+0.011
Neg logit contributions
jo-0.017
fulness-0.016
le-0.016
 Carly-0.016
iko-0.016
Token bitter
Token ID12922
Feature activation-1.67e-02

Pos logit contributions
 tallest+0.002
 cube+0.002
 nail+0.002
 brick+0.002
 cubes+0.002
Neg logit contributions
jo-0.003
 Carly-0.003
fulness-0.003
of-0.003
 Debbie-0.003
Token and
Token ID290
Feature activation-5.05e-06

Pos logit contributions
 tallest+0.024
 cube+0.024
 nail+0.024
 brick+0.023
 cubes+0.023
Neg logit contributions
jo-0.028
 Carly-0.026
fulness-0.026
les-0.026
le-0.025
Token made
Token ID925
Feature activation+3.07e-03

Feature not active on prev token; no predictions were affected.
Token her
Token ID607
Feature activation-4.01e-03

Pos logit contributions
jo+0.006
 Carly+0.005
iko+0.005
lor+0.005
ark+0.005
Neg logit contributions
 dough-0.004
 sp-0.004
 nail-0.004
 tallest-0.004
 cube-0.004
Token face
Token ID1986
Feature activation-2.34e-02

Pos logit contributions
 tallest+0.005
 cube+0.005
 cubes+0.004
 brick+0.004
 nail+0.004
Neg logit contributions
jo-0.008
 Carly-0.007
fulness-0.007
le-0.007
les-0.007
Token scr
Token ID6040
Feature activation-6.58e-02

Pos logit contributions
 cube+0.025
 tallest+0.025
 brick+0.025
 cubes+0.024
 nail+0.024
Neg logit contributions
jo-0.046
le-0.044
fulness-0.044
les-0.043
 Carly-0.043
Tokenunch
Token ID3316
Feature activation-1.31e-02

Pos logit contributions
 tallest+0.134
nam+0.132
 cube+0.132
 brick+0.129
 nail+0.129
Neg logit contributions
jo-0.070
of-0.061
fulness-0.061
 Carly-0.058
les-0.057
Token up
Token ID510
Feature activation-2.03e-02

Pos logit contributions
 tallest+0.021
 cube+0.021
 nail+0.020
 cubes+0.020
 brick+0.020
Neg logit contributions
jo-0.021
 Carly-0.020
fulness-0.019
les-0.019
le-0.018
Token.
Token ID13
Feature activation-7.52e-04

Pos logit contributions
 tallest+0.028
 cube+0.028
 nail+0.028
 brick+0.027
 cubes+0.027
Neg logit contributions
jo-0.035
 Carly-0.034
fulness-0.033
les-0.031
le-0.031
Token
Token ID198
Feature activation-2.63e-03

Pos logit contributions
 square+0.001
 tallest+0.001
 dough+0.001
 cube+0.001
 sp+0.001
Neg logit contributions
jo-0.001
fulness-0.001
le-0.001
les-0.001
 Carly-0.001
Token
Token ID198
Feature activation+8.55e-04

Pos logit contributions
 nail+0.002
 cube+0.002
 square+0.002
 brick+0.002
 dough+0.002
Neg logit contributions
fulness-0.005
jo-0.005
les-0.005
 Carly-0.005
le-0.005
Token Sour
Token ID47694
Feature activation-8.55e-03

Pos logit contributions
jo+0.008
 Carly+0.007
fulness+0.007
les+0.007
le+0.007
Neg logit contributions
 tallest-0.006
 cube-0.006
 nail-0.006
 brick-0.006
 cubes-0.006
Token candy
Token ID18550
Feature activation-1.15e-02

Pos logit contributions
 cube+0.010
 tallest+0.010
 cubes+0.010
 nail+0.010
 sp+0.010
Neg logit contributions
jo-0.017
 Carly-0.016
fulness-0.015
iko-0.015
 Carol-0.015
Token made
Token ID925
Feature activation+1.33e-03

Pos logit contributions
 cube+0.015
 tallest+0.015
 cubes+0.015
 nail+0.014
 sp+0.014
Neg logit contributions
jo-0.022
 Carly-0.021
fulness-0.020
iko-0.019
 Carol-0.019
Token her
Token ID607
Feature activation-1.87e-03

Pos logit contributions
jo+0.003
 Carly+0.002
iko+0.002
Eric+0.002
lor+0.002
Neg logit contributions
 cube-0.002
 dough-0.002
 nail-0.002
 cubes-0.002
 sp-0.002
Token face
Token ID1986
Feature activation-2.12e-02

Pos logit contributions
 tallest+0.002
 cube+0.002
 nail+0.002
 cubes+0.002
 brick+0.002
Neg logit contributions
jo-0.004
 Carly-0.004
fulness-0.004
iko-0.003
 Carol-0.003
Token scr
Token ID6040
Feature activation-6.55e-02

Pos logit contributions
 tallest+0.024
 cube+0.024
 cubes+0.023
 nail+0.023
 brick+0.023
Neg logit contributions
jo-0.043
 Carly-0.041
fulness-0.040
les-0.038
of-0.038
Tokenunch
Token ID3316
Feature activation-1.24e-02

Pos logit contributions
 tallest+0.134
nam+0.131
 cube+0.130
 brick+0.129
 bucket+0.129
Neg logit contributions
jo-0.067
of-0.062
fulness-0.061
les-0.057
 Carly-0.056
Token up
Token ID510
Feature activation-1.78e-02

Pos logit contributions
 tallest+0.021
 cube+0.021
 nail+0.020
 cubes+0.020
 brick+0.020
Neg logit contributions
jo-0.018
 Carly-0.017
fulness-0.016
les-0.016
of-0.015
Token.
Token ID13
Feature activation+2.90e-03

Pos logit contributions
 tallest+0.023
 cube+0.023
 cubes+0.022
 nail+0.022
 brick+0.022
Neg logit contributions
jo-0.034
 Carly-0.033
fulness-0.031
iko-0.030
 Carol-0.030
Token She
Token ID1375
Feature activation-1.30e-02

Pos logit contributions
jo+0.005
 Carly+0.004
fulness+0.004
iko+0.004
les+0.004
Neg logit contributions
 tallest-0.005
 cube-0.005
 nail-0.004
 cubes-0.004
 brick-0.004
Token did
Token ID750
Feature activation-4.83e-03

Pos logit contributions
 cube+0.015
 sp+0.015
 tallest+0.015
 nail+0.014
 cubes+0.014
Neg logit contributions
jo-0.027
 Carly-0.026
fulness-0.025
 Carol-0.024
iko-0.024
Token it
Token ID340
Feature activation+5.54e-03

Pos logit contributions
jo+0.016
iko+0.015
 Carly+0.014
la+0.014
lor+0.014
Neg logit contributions
 cube-0.013
 sp-0.012
 cubes-0.012
 squash-0.012
 tallest-0.012
Token made
Token ID925
Feature activation+2.81e-03

Pos logit contributions
jo+0.011
 Carly+0.011
fulness+0.010
iko+0.010
 Carol+0.010
Neg logit contributions
 cube-0.007
 tallest-0.007
 nail-0.006
 cubes-0.006
 brick-0.006
Token Jack
Token ID3619
Feature activation-5.13e-03

Pos logit contributions
jo+0.005
iko+0.004
 Carly+0.004
lor+0.004
la+0.004
Neg logit contributions
 cube-0.005
 cubes-0.005
 dough-0.005
 sp-0.005
 brick-0.005
Token's
Token ID338
Feature activation-1.97e-02

Pos logit contributions
 cube+0.006
 tallest+0.006
 nail+0.006
 cubes+0.006
 brick+0.006
Neg logit contributions
jo-0.010
 Carly-0.010
fulness-0.009
iko-0.009
 Carol-0.009
Token face
Token ID1986
Feature activation-2.13e-02

Pos logit contributions
 tallest+0.021
 cube+0.021
 cubes+0.020
 nail+0.020
 brick+0.020
Neg logit contributions
jo-0.041
 Carly-0.040
fulness-0.038
of-0.037
 Carol-0.036
Token scr
Token ID6040
Feature activation-6.42e-02

Pos logit contributions
 tallest+0.021
 cube+0.021
 cubes+0.020
 nail+0.020
 brick+0.020
Neg logit contributions
jo-0.045
 Carly-0.043
fulness-0.042
le-0.041
of-0.041
Tokenunch
Token ID3316
Feature activation-1.05e-02

Pos logit contributions
nam+0.141
 tallest+0.141
 cube+0.137
 brick+0.133
 nail+0.133
Neg logit contributions
jo-0.055
of-0.052
fulness-0.050
 Carly-0.046
les-0.045
Token up
Token ID510
Feature activation-1.60e-02

Pos logit contributions
 tallest+0.020
 cube+0.020
 nail+0.019
 cubes+0.019
 brick+0.019
Neg logit contributions
jo-0.014
 Carly-0.013
fulness-0.012
iko-0.011
les-0.011
Token!
Token ID0
Feature activation-7.67e-04

Pos logit contributions
 cube+0.026
 tallest+0.026
 sp+0.025
 cubes+0.025
 bucket+0.025
Neg logit contributions
jo-0.027
 Carly-0.025
iko-0.023
fulness-0.023
 Carol-0.022
Token
Token ID220
Feature activation-7.84e-03

Pos logit contributions
 cube+0.001
 tallest+0.001
 brick+0.001
 cubes+0.001
 bucket+0.001
Neg logit contributions
jo-0.001
 Carly-0.001
iko-0.001
le-0.001
fulness-0.001
Token
Token ID198
Feature activation-2.28e-03

Pos logit contributions
 cube+0.012
 tallest+0.011
 cubes+0.011
 nail+0.011
 brick+0.011
Neg logit contributions
jo-0.013
 Carly-0.013
iko-0.012
le-0.012
fulness-0.012
Token sour
Token ID11348
Feature activation-1.07e-02

Pos logit contributions
 tallest+0.000
 cube+0.000
 nail+0.000
 cubes+0.000
 brick+0.000
Neg logit contributions
jo-0.000
 Carly-0.000
fulness-0.000
iko-0.000
les-0.000
Token it
Token ID340
Feature activation+1.05e-02

Pos logit contributions
 sp+0.015
 cube+0.015
 tallest+0.015
 squash+0.015
 dough+0.015
Neg logit contributions
jo-0.020
 Carly-0.018
iko-0.018
April-0.017
Eric-0.017
Token made
Token ID925
Feature activation+4.73e-03

Pos logit contributions
jo+0.022
 Carly+0.022
fulness+0.020
iko+0.020
 Carol+0.020
Neg logit contributions
 cube-0.012
 tallest-0.012
 sp-0.011
 nail-0.011
 brick-0.011
Token her
Token ID607
Feature activation-2.78e-03

Pos logit contributions
jo+0.010
rob+0.010
ark+0.009
lor+0.009
iko+0.008
Neg logit contributions
 sp-0.006
 dough-0.005
 nail-0.005
 rake-0.005
 very-0.005
Token face
Token ID1986
Feature activation-2.08e-02

Pos logit contributions
 tallest+0.003
 cube+0.003
 nail+0.003
 cubes+0.003
 brick+0.003
Neg logit contributions
jo-0.006
 Carly-0.005
fulness-0.005
iko-0.005
les-0.005
Token scr
Token ID6040
Feature activation-6.34e-02

Pos logit contributions
 tallest+0.024
 cube+0.024
 nail+0.023
 cubes+0.023
 brick+0.023
Neg logit contributions
jo-0.042
 Carly-0.041
fulness-0.039
iko-0.038
 Carol-0.037
Tokenunch
Token ID3316
Feature activation-4.17e-03

Pos logit contributions
 tallest+0.133
 cube+0.132
nam+0.131
 nail+0.127
 brick+0.127
Neg logit contributions
jo-0.064
of-0.056
fulness-0.054
 Carly-0.054
les-0.051
Token up
Token ID510
Feature activation-1.77e-02

Pos logit contributions
 cube+0.007
 tallest+0.007
 nail+0.007
 brick+0.007
 cubes+0.007
Neg logit contributions
jo-0.006
 Carly-0.006
iko-0.006
 Carol-0.005
fulness-0.005
Token!
Token ID0
Feature activation+5.82e-05

Pos logit contributions
 cube+0.023
 tallest+0.023
 bucket+0.022
 sp+0.022
 cubes+0.022
Neg logit contributions
jo-0.036
 Carly-0.033
iko-0.032
fulness-0.031
 Carol-0.031
Token
Token ID220
Feature activation-4.29e-03

Pos logit contributions
jo+0.000
le+0.000
fulness+0.000
ock+0.000
les+0.000
Neg logit contributions
 square-0.000
 tallest-0.000
 sp-0.000
 bucket-0.000
 brick-0.000
Token
Token ID198
Feature activation-5.97e-04

Pos logit contributions
 nail+0.005
 tallest+0.005
 cube+0.005
 square+0.005
 bucket+0.005
Neg logit contributions
jo-0.008
le-0.008
iko-0.008
fulness-0.007
 Carly-0.007
Token it
Token ID340
Feature activation+1.06e-02

Pos logit contributions
jo+0.018
 Carly+0.017
fulness+0.016
iko+0.015
les+0.015
Neg logit contributions
 tallest-0.025
 cube-0.025
 nail-0.024
 cubes-0.024
 brick-0.024
Token made
Token ID925
Feature activation+7.60e-03

Pos logit contributions
jo+0.019
les+0.018
 Carly+0.018
le+0.018
fulness+0.017
Neg logit contributions
 tallest-0.014
 cube-0.013
 brick-0.013
 cubes-0.013
 nail-0.013
Token Bob
Token ID5811
Feature activation-3.68e-03

Pos logit contributions
jo+0.011
 Carly+0.010
fulness+0.009
iko+0.009
azz+0.009
Neg logit contributions
 cube-0.013
 tallest-0.013
 brick-0.013
 sp-0.013
 cubes-0.013
Token's
Token ID338
Feature activation-1.02e-02

Pos logit contributions
 tallest+0.005
 brick+0.005
 bricks+0.005
 cubes+0.005
 nails+0.005
Neg logit contributions
jo-0.006
le-0.006
of-0.006
fulness-0.006
les-0.006
Token face
Token ID1986
Feature activation-2.39e-02

Pos logit contributions
 tallest+0.013
nam+0.013
 cubes+0.012
 cube+0.012
asuring+0.011
Neg logit contributions
jo-0.017
 Carly-0.016
of-0.016
fulness-0.016
les-0.015
Token scr
Token ID6040
Feature activation-6.29e-02

Pos logit contributions
 tallest+0.031
 brick+0.030
 cube+0.029
 bricks+0.029
 cubes+0.029
Neg logit contributions
le-0.040
les-0.039
jo-0.039
of-0.038
fulness-0.038
Tokenunch
Token ID3316
Feature activation-1.08e-02

Pos logit contributions
nam+0.133
 tallest+0.133
 cube+0.128
 brick+0.124
 nail+0.123
Neg logit contributions
jo-0.056
of-0.056
fulness-0.051
les-0.048
acks-0.047
Token up
Token ID510
Feature activation-1.68e-02

Pos logit contributions
 tallest+0.019
nam+0.018
 nail+0.018
 cube+0.018
 stack+0.018
Neg logit contributions
jo-0.015
les-0.014
fulness-0.014
le-0.014
 Carly-0.014
Token.
Token ID13
Feature activation+2.24e-03

Pos logit contributions
 tallest+0.026
 cube+0.025
 nail+0.025
 brick+0.025
 cubes+0.024
Neg logit contributions
jo-0.025
 Carly-0.025
fulness-0.024
les-0.023
le-0.023
Token He
Token ID679
Feature activation-5.93e-04

Pos logit contributions
jo+0.003
 Carly+0.003
fulness+0.002
 Debbie+0.002
 Carol+0.002
Neg logit contributions
 tallest-0.004
 cube-0.004
 nail-0.004
 cubes-0.004
 bucket-0.004
Token didn
Token ID1422
Feature activation+3.67e-03

Pos logit contributions
 tallest+0.001
 cube+0.001
 brick+0.001
 cubes+0.001
 nail+0.001
Neg logit contributions
jo-0.001
 Carly-0.001
les-0.001
le-0.001
of-0.001
Token hard
Token ID1327
Feature activation+4.30e-03

Pos logit contributions
jo+0.002
 Carly+0.002
fulness+0.002
les+0.002
of+0.002
Neg logit contributions
 tallest-0.001
 cube-0.001
 cubes-0.001
 brick-0.001
 nail-0.001
Token and
Token ID290
Feature activation-1.39e-03

Pos logit contributions
jo+0.009
 Carly+0.009
fulness+0.008
 Debbie+0.008
 Carol+0.008
Neg logit contributions
 cube-0.005
 tallest-0.005
 cubes-0.005
 sp-0.005
 nail-0.005
Token made
Token ID925
Feature activation-1.43e-03

Pos logit contributions
 cube+0.002
 tallest+0.002
 cubes+0.002
 brick+0.002
 nail+0.002
Neg logit contributions
jo-0.002
 Carly-0.002
fulness-0.002
les-0.002
of-0.002
Token her
Token ID607
Feature activation-6.80e-03

Pos logit contributions
 tallest+0.002
 cube+0.002
 nail+0.002
 sp+0.002
 cubes+0.002
Neg logit contributions
jo-0.003
 Carly-0.002
fulness-0.002
iko-0.002
 Debbie-0.002
Token face
Token ID1986
Feature activation-2.49e-02

Pos logit contributions
 cubes+0.009
 tallest+0.009
 cube+0.009
 brick+0.008
 bricks+0.008
Neg logit contributions
jo-0.012
 Carly-0.011
le-0.011
fulness-0.010
of-0.010
Token scr
Token ID6040
Feature activation-6.22e-02

Pos logit contributions
 tallest+0.030
 cube+0.029
 brick+0.028
 cubes+0.028
 bricks+0.028
Neg logit contributions
jo-0.046
le-0.045
fulness-0.044
les-0.043
of-0.043
Tokenunch
Token ID3316
Feature activation-1.20e-02

Pos logit contributions
 tallest+0.135
nam+0.134
 cube+0.132
 brick+0.128
 nail+0.128
Neg logit contributions
jo-0.058
of-0.052
fulness-0.051
 Carly-0.046
les-0.046
Token up
Token ID510
Feature activation-1.80e-02

Pos logit contributions
 tallest+0.020
 cube+0.020
 cubes+0.020
 nail+0.020
 brick+0.019
Neg logit contributions
jo-0.018
 Carly-0.017
fulness-0.016
les-0.016
le-0.015
Token.
Token ID13
Feature activation-7.85e-03

Pos logit contributions
 tallest+0.025
 cube+0.024
 nail+0.024
 cubes+0.023
 brick+0.023
Neg logit contributions
jo-0.032
 Carly-0.031
fulness-0.030
les-0.028
le-0.028
Token She
Token ID1375
Feature activation-1.37e-02

Pos logit contributions
 cube+0.013
 tallest+0.013
 cubes+0.013
 bucket+0.012
 brick+0.012
Neg logit contributions
jo-0.012
 Carly-0.012
fulness-0.011
 Carol-0.010
les-0.010
Token spat
Token ID15246
Feature activation+6.34e-03

Pos logit contributions
 tallest+0.018
 cube+0.018
 cubes+0.017
 brick+0.017
 nail+0.017
Neg logit contributions
jo-0.026
 Carly-0.025
fulness-0.023
les-0.023
iko-0.022
Token sour
Token ID11348
Feature activation-2.08e-02

Pos logit contributions
jo+0.003
 Carly+0.003
fulness+0.003
les+0.003
le+0.003
Neg logit contributions
 tallest-0.002
 cube-0.002
 brick-0.002
 nail-0.002
 cubes-0.002
Token and
Token ID290
Feature activation-4.67e-04

Pos logit contributions
 tallest+0.032
 cube+0.031
 nail+0.030
nam+0.030
 brick+0.030
Neg logit contributions
jo-0.034
 Carly-0.032
fulness-0.032
les-0.031
le-0.031
Token made
Token ID925
Feature activation-4.74e-03

Pos logit contributions
 tallest+0.000
 cube+0.000
 brick+0.000
 nail+0.000
 cubes+0.000
Neg logit contributions
jo-0.001
 Carly-0.001
fulness-0.001
les-0.001
le-0.001
Token their
Token ID511
Feature activation-1.14e-02

Pos logit contributions
 cube+0.008
 dough+0.008
 cubes+0.008
 sp+0.008
 nail+0.008
Neg logit contributions
jo-0.008
 Carly-0.007
iko-0.007
lor-0.006
rett-0.006
Token faces
Token ID6698
Feature activation-1.88e-02

Pos logit contributions
 tallest+0.012
 cube+0.012
 nail+0.011
 cubes+0.011
 brick+0.011
Neg logit contributions
jo-0.023
 Carly-0.022
le-0.021
fulness-0.021
les-0.021
Token scr
Token ID6040
Feature activation-6.20e-02

Pos logit contributions
 tallest+0.029
 cube+0.028
 brick+0.028
 cubes+0.028
 nail+0.028
Neg logit contributions
fulness-0.027
jo-0.027
le-0.027
les-0.026
of-0.025
Tokenunch
Token ID3316
Feature activation-1.36e-02

Pos logit contributions
 tallest+0.123
nam+0.121
 cube+0.119
 brick+0.115
 nail+0.114
Neg logit contributions
jo-0.071
fulness-0.065
of-0.064
les-0.060
 Carly-0.059
Token up
Token ID510
Feature activation-1.81e-02

Pos logit contributions
 tallest+0.026
 cube+0.025
 nail+0.024
 brick+0.024
 cubes+0.024
Neg logit contributions
jo-0.018
fulness-0.017
les-0.016
 Carly-0.016
le-0.016
Token.
Token ID13
Feature activation+7.88e-04

Pos logit contributions
 tallest+0.026
 cube+0.026
 nail+0.025
 brick+0.025
 cubes+0.025
Neg logit contributions
jo-0.030
 Carly-0.029
fulness-0.028
le-0.027
les-0.027
Token↵
Token ID198
Feature activation-9.54e-03

Pos logit contributions
jo+0.002
 Carly+0.001
fulness+0.001
le+0.001
iko+0.001
Neg logit contributions
 cube-0.001
 tallest-0.001
 cubes-0.001
 square-0.001
 nail-0.001
Token↵
Token ID198
Feature activation-7.65e-03

Pos logit contributions
 cube+0.012
 nail+0.012
 cubes+0.012
 square+0.012
 brick+0.012
Neg logit contributions
jo-0.018
 Carly-0.017
fulness-0.017
le-0.016
les-0.016
Token.
Token ID13
Feature activation+9.71e-03

Pos logit contributions
 tallest+0.021
 cube+0.020
 nail+0.020
 brick+0.019
 cubes+0.019
Neg logit contributions
jo-0.025
 Carly-0.024
fulness-0.023
les-0.023
le-0.022
Token It
Token ID632
Feature activation+1.72e-02

Pos logit contributions
jo+0.016
 Carly+0.015
fulness+0.014
 Carol+0.014
 Debbie+0.013
Neg logit contributions
 tallest-0.015
 cube-0.015
 cubes-0.015
 nail-0.014
 brick-0.014
Token makes
Token ID1838
Feature activation+1.53e-03

Pos logit contributions
jo+0.030
 Carly+0.027
les+0.027
le+0.027
of+0.026
Neg logit contributions
 tallest-0.024
 cubes-0.023
 cube-0.023
 brick-0.023
nam-0.023
Token her
Token ID607
Feature activation-2.01e-03

Pos logit contributions
jo+0.003
 Carly+0.003
fulness+0.002
iko+0.002
lor+0.002
Neg logit contributions
 nail-0.002
 sp-0.002
 cube-0.002
 tallest-0.002
 dough-0.002
Token face
Token ID1986
Feature activation-2.12e-02

Pos logit contributions
 tallest+0.002
 cube+0.002
 cubes+0.002
 brick+0.002
 nail+0.002
Neg logit contributions
jo-0.004
 Carly-0.004
fulness-0.003
le-0.003
 Debbie-0.003
Token scr
Token ID6040
Feature activation-6.02e-02

Pos logit contributions
 tallest+0.023
 cube+0.022
 cubes+0.022
 brick+0.022
 nail+0.021
Neg logit contributions
jo-0.042
le-0.040
 Carly-0.040
fulness-0.039
les-0.039
Tokenunch
Token ID3316
Feature activation-8.24e-03

Pos logit contributions
 tallest+0.126
nam+0.125
 cube+0.123
 bucket+0.119
 brick+0.119
Neg logit contributions
jo-0.060
of-0.055
fulness-0.052
 Carly-0.049
les-0.049
Token up
Token ID510
Feature activation-1.63e-02

Pos logit contributions
 tallest+0.015
 cube+0.015
 cubes+0.015
 nail+0.015
 brick+0.015
Neg logit contributions
jo-0.011
 Carly-0.010
fulness-0.010
les-0.010
le-0.009
Token.
Token ID13
Feature activation+2.68e-03

Pos logit contributions
 tallest+0.025
 cube+0.025
 nail+0.024
 cubes+0.024
 brick+0.024
Neg logit contributions
jo-0.026
 Carly-0.025
fulness-0.024
les-0.023
le-0.023
Token She
Token ID1375
Feature activation-5.02e-03

Pos logit contributions
jo+0.005
 Carly+0.004
fulness+0.004
les+0.004
iko+0.004
Neg logit contributions
 tallest-0.004
 cube-0.004
 nail-0.004
 sp-0.004
 brick-0.004
Token sp
Token ID599
Feature activation-6.75e-04

Pos logit contributions
 tallest+0.007
 cube+0.007
 nail+0.007
 cubes+0.007
 brick+0.007
Neg logit contributions
jo-0.009
 Carly-0.008
fulness-0.008
iko-0.008
 Carol-0.007
Token and
Token ID290
Feature activation-5.93e-04

Pos logit contributions
 cube+0.006
 sp+0.006
 nail+0.006
 rake+0.005
 brick+0.005
Neg logit contributions
jo-0.012
 Carly-0.012
rob-0.011
iko-0.011
lor-0.011
Token her
Token ID607
Feature activation-1.24e-02

Pos logit contributions
 cube+0.001
 tallest+0.001
 sp+0.001
 nail+0.001
 bucket+0.001
Neg logit contributions
jo-0.001
 Carly-0.001
fulness-0.001
iko-0.001
of-0.001
Token face
Token ID1986
Feature activation-2.66e-02

Pos logit contributions
 tallest+0.015
 cube+0.014
 cubes+0.014
 nail+0.014
 brick+0.014
Neg logit contributions
jo-0.024
 Carly-0.023
fulness-0.021
iko-0.021
 Carol-0.021
Token began
Token ID2540
Feature activation-1.08e-02

Pos logit contributions
 tallest+0.035
 cube+0.035
 cubes+0.034
 nail+0.033
 brick+0.033
Neg logit contributions
jo-0.050
 Carly-0.047
fulness-0.045
les-0.043
of-0.043
Token to
Token ID284
Feature activation-1.24e-02

Pos logit contributions
 cube+0.016
 nail+0.016
 tallest+0.015
 brick+0.015
 sp+0.015
Neg logit contributions
jo-0.019
 Carly-0.018
fulness-0.017
iko-0.017
les-0.017
Token scr
Token ID6040
Feature activation-5.81e-02

Pos logit contributions
 tallest+0.018
 cube+0.018
 cubes+0.017
 brick+0.017
 nail+0.017
Neg logit contributions
jo-0.021
 Carly-0.020
fulness-0.019
le-0.018
of-0.018
Tokenunch
Token ID3316
Feature activation-1.31e-02

Pos logit contributions
 tallest+0.122
 cube+0.119
nam+0.118
 sp+0.115
 bucket+0.114
Neg logit contributions
jo-0.061
of-0.054
 Carly-0.054
fulness-0.051
les-0.048
Token up
Token ID510
Feature activation-2.09e-02

Pos logit contributions
 tallest+0.021
 cube+0.021
 cubes+0.021
 nail+0.020
 brick+0.020
Neg logit contributions
jo-0.020
 Carly-0.019
fulness-0.018
les-0.018
le-0.017
Token.
Token ID13
Feature activation-1.44e-03

Pos logit contributions
 cube+0.027
 tallest+0.027
 cubes+0.026
 bucket+0.025
 sp+0.025
Neg logit contributions
jo-0.041
 Carly-0.039
fulness-0.037
iko-0.036
 Carol-0.035
Token Hot
Token ID6964
Feature activation+5.88e-03

Pos logit contributions
 square+0.002
 tallest+0.002
 brick+0.002
 cube+0.002
 sp+0.002
Neg logit contributions
jo-0.003
le-0.003
fulness-0.003
les-0.003
iko-0.003
Token tears
Token ID10953
Feature activation-8.20e-03

Pos logit contributions
jo+0.011
 Carly+0.011
fulness+0.010
 Debbie+0.010
iko+0.010
Neg logit contributions
 cube-0.008
 sp-0.008
 cubes-0.007
 dough-0.007
 nail-0.007
Token.
Token ID13
Feature activation+1.96e-03

Pos logit contributions
 tallest+0.004
 cube+0.004
 cubes+0.003
 nail+0.003
 brick+0.003
Neg logit contributions
jo-0.005
 Carly-0.004
fulness-0.004
les-0.004
of-0.004
Token She
Token ID1375
Feature activation-5.91e-03

Pos logit contributions
jo+0.003
 Carly+0.003
fulness+0.003
 Carol+0.002
 Debbie+0.002
Neg logit contributions
 tallest-0.003
 cube-0.003
 cubes-0.003
 nail-0.003
 brick-0.003
Token felt
Token ID2936
Feature activation+2.77e-03

Pos logit contributions
 cube+0.007
 tallest+0.007
 sp+0.007
 nail+0.007
 cubes+0.007
Neg logit contributions
jo-0.012
 Carly-0.011
fulness-0.011
iko-0.010
 Carol-0.010
Token her
Token ID607
Feature activation-1.21e-02

Pos logit contributions
 Carly+0.005
jo+0.005
fulness+0.005
le+0.005
les+0.005
Neg logit contributions
 cube-0.004
 tallest-0.004
 cubes-0.003
 brick-0.003
 nail-0.003
Token face
Token ID1986
Feature activation-2.12e-02

Pos logit contributions
 tallest+0.016
 cubes+0.015
nam+0.015
 brick+0.015
 structure+0.014
Neg logit contributions
 Carly-0.019
jo-0.019
fulness-0.019
 Debbie-0.018
les-0.018
Token scr
Token ID6040
Feature activation-5.76e-02

Pos logit contributions
 tallest+0.032
 cubes+0.032
 cube+0.031
 brick+0.031
 structure+0.031
Neg logit contributions
le-0.033
jo-0.032
les-0.031
fulness-0.031
of-0.030
Tokenunch
Token ID3316
Feature activation-3.31e-03

Pos logit contributions
 tallest+0.126
nam+0.125
 cube+0.124
 brick+0.121
 nail+0.120
Neg logit contributions
jo-0.050
of-0.046
fulness-0.046
les-0.042
 Carly-0.041
Token up
Token ID510
Feature activation-1.60e-02

Pos logit contributions
 tallest+0.005
 cube+0.005
 nail+0.005
 cubes+0.005
 brick+0.005
Neg logit contributions
jo-0.006
 Carly-0.005
fulness-0.005
iko-0.005
les-0.005
Token and
Token ID290
Feature activation+4.82e-04

Pos logit contributions
 tallest+0.024
 cube+0.024
 nail+0.023
 cubes+0.023
 brick+0.023
Neg logit contributions
jo-0.027
 Carly-0.026
fulness-0.024
les-0.023
le-0.023
Token her
Token ID607
Feature activation-1.83e-02

Pos logit contributions
jo+0.001
 Carly+0.001
fulness+0.001
iko+0.001
of+0.001
Neg logit contributions
 cube-0.001
 tallest-0.001
 sp-0.001
 bucket-0.001
 cubes-0.001
Token tongue
Token ID11880
Feature activation-1.61e-02

Pos logit contributions
 tallest+0.022
 brick+0.020
 nail+0.020
 cube+0.020
 structure+0.019
Neg logit contributions
jo-0.035
 Carly-0.034
fulness-0.033
 Debbie-0.032
les-0.032
Token never
Token ID1239
Feature activation-6.41e-03

Pos logit contributions
 cube+0.001
 tallest+0.001
 nail+0.001
 sp+0.001
 bucket+0.001
Neg logit contributions
jo-0.002
 Carly-0.002
fulness-0.002
les-0.002
iko-0.002
Token played
Token ID2826
Feature activation-8.29e-03

Pos logit contributions
 tallest+0.009
 cube+0.008
 cubes+0.008
 brick+0.008
 nail+0.008
Neg logit contributions
jo-0.012
 Carly-0.011
fulness-0.011
of-0.011
les-0.010
Token with
Token ID351
Feature activation-5.72e-03

Pos logit contributions
 tallest+0.011
 cube+0.011
 cubes+0.010
 nail+0.010
 dough+0.010
Neg logit contributions
jo-0.016
 Carly-0.015
fulness-0.014
iko-0.014
of-0.014
Token the
Token ID262
Feature activation-1.58e-02

Pos logit contributions
 cube+0.007
 tallest+0.007
 cubes+0.007
 nail+0.007
 brick+0.007
Neg logit contributions
jo-0.012
 Carly-0.011
fulness-0.011
of-0.010
iko-0.010
Token or
Token ID393
Feature activation-5.13e-03

Pos logit contributions
 tallest+0.016
 cube+0.016
 nail+0.015
 cubes+0.015
 brick+0.015
Neg logit contributions
jo-0.035
 Carly-0.034
fulness-0.032
iko-0.031
 Carol-0.031
Tokennam
Token ID7402
Feature activation-5.64e-02

Pos logit contributions
 tallest+0.006
 cube+0.006
 nail+0.006
 cubes+0.006
 brick+0.006
Neg logit contributions
jo-0.010
 Carly-0.010
fulness-0.009
iko-0.009
of-0.009
Tokenents
Token ID658
Feature activation-3.10e-02

Pos logit contributions
 tallest+0.083
 cube+0.083
 nail+0.080
 brick+0.079
 bucket+0.079
Neg logit contributions
jo-0.097
 Carly-0.091
fulness-0.085
of-0.083
les-0.082
Token again
Token ID757
Feature activation-9.75e-03

Pos logit contributions
 cube+0.060
 tallest+0.060
 cubes+0.058
 nail+0.058
 bucket+0.057
Neg logit contributions
jo-0.040
 Carly-0.039
iko-0.034
 Carol-0.033
 Debbie-0.032
Token.
Token ID13
Feature activation+3.59e-03

Pos logit contributions
 cube+0.010
 tallest+0.010
 sp+0.010
 nail+0.010
 square+0.010
Neg logit contributions
jo-0.021
 Carly-0.020
iko-0.019
 Debbie-0.019
fulness-0.019
Token They
Token ID1119
Feature activation-6.16e-03

Pos logit contributions
les+0.007
le+0.007
jo+0.007
fulness+0.007
 Carly+0.006
Neg logit contributions
 square-0.004
 cube-0.004
 tallest-0.004
 brick-0.004
 cubes-0.004
Token still
Token ID991
Feature activation-3.45e-04

Pos logit contributions
 tallest+0.008
 cube+0.008
 cubes+0.008
 nail+0.008
 brick+0.008
Neg logit contributions
jo-0.012
 Carly-0.011
fulness-0.011
iko-0.010
of-0.010
Token to
Token ID284
Feature activation+3.15e-03

Pos logit contributions
 tallest+0.035
 cube+0.033
 nail+0.033
 bucket+0.032
 brick+0.032
Neg logit contributions
jo-0.037
 Carly-0.036
fulness-0.033
iko-0.032
 Debbie-0.032
Token look
Token ID804
Feature activation-2.06e-02

Pos logit contributions
jo+0.006
 Carly+0.006
fulness+0.006
iko+0.005
les+0.005
Neg logit contributions
 tallest-0.004
 cube-0.004
 nail-0.004
 cubes-0.004
 brick-0.004
Token at
Token ID379
Feature activation-1.32e-02

Pos logit contributions
 cube+0.025
 tallest+0.025
 cubes+0.024
 brick+0.024
 nail+0.024
Neg logit contributions
jo-0.041
 Carly-0.039
fulness-0.037
iko-0.036
 Carol-0.036
Token the
Token ID262
Feature activation-1.47e-02

Pos logit contributions
 brick+0.015
 cube+0.015
 tallest+0.014
 nail+0.014
 cubes+0.014
Neg logit contributions
jo-0.029
fulness-0.028
 Carly-0.027
iko-0.026
le-0.026
Token or
Token ID393
Feature activation-1.07e-02

Pos logit contributions
 tallest+0.018
 cube+0.017
 nail+0.017
 cubes+0.016
 bucket+0.016
Neg logit contributions
jo-0.028
 Carly-0.027
les-0.025
fulness-0.025
iko-0.025
Tokennam
Token ID7402
Feature activation-5.37e-02

Pos logit contributions
 tallest+0.007
 cube+0.006
 nail+0.006
 cubes+0.006
 brick+0.006
Neg logit contributions
jo-0.028
 Carly-0.027
fulness-0.026
iko-0.025
les-0.025
Tokenents
Token ID658
Feature activation-1.91e-02

Pos logit contributions
 tallest+0.084
 cube+0.083
 bucket+0.082
 nail+0.081
 sp+0.081
Neg logit contributions
jo-0.087
 Carly-0.080
of-0.075
fulness-0.074
les-0.072
Token on
Token ID319
Feature activation-5.54e-03

Pos logit contributions
 tallest+0.027
 cube+0.026
 cubes+0.025
 brick+0.025
 nail+0.025
Neg logit contributions
jo-0.033
 Carly-0.031
fulness-0.031
le-0.030
les-0.030
Token the
Token ID262
Feature activation+1.07e-02

Pos logit contributions
 tallest+0.008
 cube+0.008
 nail+0.007
 cubes+0.007
 brick+0.007
Neg logit contributions
jo-0.010
 Carly-0.010
fulness-0.009
iko-0.009
of-0.009
Token tree
Token ID5509
Feature activation-5.77e-03

Pos logit contributions
jo+0.023
 Carly+0.022
fulness+0.021
les+0.021
 Debbie+0.021
Neg logit contributions
 cube-0.010
 cubes-0.009
 tallest-0.009
 bucket-0.009
 nail-0.009
Token.
Token ID13
Feature activation-1.02e-03

Pos logit contributions
 tallest+0.009
 cube+0.009
 cubes+0.009
 brick+0.008
 bucket+0.008
Neg logit contributions
jo-0.009
le-0.008
 Carly-0.008
of-0.008
fulness-0.008
Token to
Token ID284
Feature activation-8.84e-03

Pos logit contributions
 cube+0.004
 sp+0.004
 square+0.003
 cubes+0.003
 squash+0.003
Neg logit contributions
jo-0.008
rob-0.008
 Carly-0.007
iko-0.007
les-0.007
Token help
Token ID1037
Feature activation-9.12e-03

Pos logit contributions
 tallest+0.012
 cube+0.012
 cubes+0.012
 nail+0.011
 brick+0.011
Neg logit contributions
jo-0.016
 Carly-0.016
fulness-0.015
 Carol-0.014
iko-0.014
Token sort
Token ID3297
Feature activation-9.35e-03

Pos logit contributions
 nail+0.013
 cube+0.013
 brick+0.012
 tallest+0.012
 hammer+0.012
Neg logit contributions
jo-0.017
 Carly-0.016
fulness-0.015
iko-0.015
les-0.015
Token the
Token ID262
Feature activation-2.10e-02

Pos logit contributions
 cube+0.013
 cubes+0.013
 nail+0.013
 brick+0.013
 tallest+0.012
Neg logit contributions
jo-0.018
iko-0.017
 Carly-0.016
lor-0.016
fulness-0.015
Token or
Token ID393
Feature activation-1.13e-02

Pos logit contributions
 tallest+0.021
 cube+0.019
asuring+0.019
 cubes+0.018
nam+0.018
Neg logit contributions
 Carly-0.039
jo-0.038
les-0.037
 Carol-0.035
fulness-0.035
Tokennam
Token ID7402
Feature activation-5.37e-02

Pos logit contributions
 tallest+0.010
 cube+0.010
 brick+0.009
 cubes+0.009
 nail+0.009
Neg logit contributions
 Carly-0.024
jo-0.024
fulness-0.023
of-0.022
 Carol-0.022
Tokenents
Token ID658
Feature activation-2.21e-02

Pos logit contributions
 tallest+0.084
 cube+0.083
 sp+0.082
 bucket+0.081
 nail+0.081
Neg logit contributions
jo-0.084
 Carly-0.079
of-0.076
fulness-0.073
la-0.072
Token.
Token ID13
Feature activation-2.33e-03

Pos logit contributions
 tallest+0.031
 cube+0.031
 cubes+0.030
 nail+0.030
 brick+0.029
Neg logit contributions
jo-0.039
 Carly-0.037
fulness-0.036
iko-0.034
les-0.034
Token She
Token ID1375
Feature activation-1.00e-02

Pos logit contributions
 tallest+0.003
 cube+0.003
 nail+0.003
 brick+0.003
 cubes+0.003
Neg logit contributions
jo-0.004
 Carly-0.004
fulness-0.004
iko-0.004
le-0.004
Token picked
Token ID6497
Feature activation+8.94e-03

Pos logit contributions
 tallest+0.015
 cube+0.014
 brick+0.014
 cubes+0.014
 nail+0.013
Neg logit contributions
jo-0.017
 Carly-0.017
les-0.015
of-0.015
fulness-0.015
Token up
Token ID510
Feature activation-8.83e-03

Pos logit contributions
jo+0.019
iko+0.016
 Carly+0.016
fulness+0.016
azz+0.016
Neg logit contributions
 tallest-0.012
 dough-0.011
 cube-0.011
 squash-0.011
 cubes-0.011
Token hill
Token ID12788
Feature activation-1.17e-02

Pos logit contributions
jo+0.010
 Carly+0.010
unt+0.010
of+0.009
 Carol+0.009
Neg logit contributions
 cubes-0.006
asuring-0.006
 tallest-0.006
nam-0.006
 bricks-0.006
Token and
Token ID290
Feature activation+3.35e-03

Pos logit contributions
 cube+0.017
 tallest+0.017
 square+0.016
 bucket+0.016
 brick+0.016
Neg logit contributions
jo-0.021
 Carly-0.021
fulness-0.019
iko-0.019
 Debbie-0.019
Token put
Token ID1234
Feature activation+9.67e-03

Pos logit contributions
jo+0.007
fulness+0.007
 Carly+0.007
iko+0.006
lor+0.006
Neg logit contributions
 sp-0.004
 tallest-0.004
 bucket-0.004
 nail-0.004
 cube-0.004
Token her
Token ID607
Feature activation-5.67e-03

Pos logit contributions
jo+0.020
 Carly+0.017
iko+0.017
azz+0.016
fulness+0.016
Neg logit contributions
 cube-0.013
 tallest-0.013
 bucket-0.013
 cubes-0.013
 sp-0.013
Token or
Token ID393
Feature activation-1.29e-03

Pos logit contributions
 cube+0.006
 tallest+0.006
 cubes+0.006
 nail+0.006
 brick+0.006
Neg logit contributions
jo-0.010
 Carly-0.010
fulness-0.010
les-0.010
 Debbie-0.009
Tokennam
Token ID7402
Feature activation-5.34e-02

Pos logit contributions
 cube+0.001
 tallest+0.001
 brick+0.001
 cubes+0.001
 nail+0.000
Neg logit contributions
jo-0.003
 Carly-0.003
fulness-0.003
 Carol-0.003
 Debbie-0.003
Tokenents
Token ID658
Feature activation-1.89e-02

Pos logit contributions
 tallest+0.087
 cube+0.086
 sp+0.084
 rake+0.083
 brick+0.083
Neg logit contributions
jo-0.082
 Carly-0.074
of-0.072
fulness-0.068
la-0.068
Token all
Token ID477
Feature activation-2.45e-03

Pos logit contributions
 tallest+0.023
 cube+0.023
 nail+0.022
 cubes+0.022
 brick+0.022
Neg logit contributions
jo-0.037
 Carly-0.036
fulness-0.034
iko-0.033
 Carol-0.032
Token around
Token ID1088
Feature activation+1.93e-02

Pos logit contributions
 cube+0.003
 tallest+0.003
 cubes+0.003
 nail+0.003
 brick+0.003
Neg logit contributions
jo-0.004
 Carly-0.004
fulness-0.004
iko-0.004
 Carol-0.004
Token.
Token ID13
Feature activation-2.70e-03

Pos logit contributions
jo+0.037
 Carly+0.035
fulness+0.033
iko+0.032
of+0.032
Neg logit contributions
 cube-0.026
 tallest-0.026
 nail-0.025
 brick-0.025
 cubes-0.025
Token She
Token ID1375
Feature activation-4.27e-03

Pos logit contributions
 tallest+0.004
 cube+0.004
 cubes+0.004
 nail+0.004
 brick+0.004
Neg logit contributions
jo-0.004
 Carly-0.004
fulness-0.004
 Carol-0.004
iko-0.004
Token made
Token ID925
Feature activation+1.15e-02

Pos logit contributions
jo+0.001
 Carly+0.001
fulness+0.001
les+0.001
iko+0.001
Neg logit contributions
 tallest-0.001
 cube-0.001
 cubes-0.000
 nail-0.000
 brick-0.000
Token a
Token ID257
Feature activation-5.62e-04

Pos logit contributions
jo+0.027
azz+0.024
iko+0.024
 Carly+0.024
fulness+0.024
Neg logit contributions
 dough-0.012
 sp-0.011
 tallest-0.011
 cube-0.011
 squash-0.011
Token funny
Token ID8258
Feature activation-4.33e-03

Pos logit contributions
 tallest+0.001
 cubes+0.001
 bricks+0.001
nam+0.001
 cube+0.001
Neg logit contributions
jo-0.001
 Carly-0.001
of-0.001
le-0.001
fulness-0.001
Token face
Token ID1986
Feature activation-5.25e-03

Pos logit contributions
 cube+0.004
 tallest+0.004
 nail+0.004
 sp+0.004
 brick+0.004
Neg logit contributions
jo-0.010
 Carly-0.010
fulness-0.009
iko-0.009
 Carol-0.009
Token,
Token ID11
Feature activation+5.86e-03

Pos logit contributions
 tallest+0.009
 cube+0.009
 cubes+0.008
 nail+0.008
 brick+0.008
Neg logit contributions
jo-0.008
 Carly-0.008
fulness-0.007
iko-0.007
les-0.007
Token scr
Token ID6040
Feature activation-5.32e-02

Pos logit contributions
jo+0.012
iko+0.011
 Carly+0.011
lor+0.011
rett+0.011
Neg logit contributions
 sp-0.008
 dough-0.008
 square-0.007
 cube-0.007
 nail-0.007
Tokenunch
Token ID3316
Feature activation+2.23e-03

Pos logit contributions
 tallest+0.111
 cube+0.108
nam+0.108
 nail+0.105
 bucket+0.105
Neg logit contributions
jo-0.057
 Carly-0.051
fulness-0.050
of-0.049
les-0.046
Tokening
Token ID278
Feature activation+5.84e-03

Pos logit contributions
jo+0.003
 Carly+0.003
iko+0.003
fulness+0.003
 Carol+0.003
Neg logit contributions
 cube-0.004
 tallest-0.004
 nail-0.004
 brick-0.004
 sp-0.004
Token up
Token ID510
Feature activation-6.32e-03

Pos logit contributions
jo+0.011
 Carly+0.011
iko+0.010
azz+0.010
fulness+0.010
Neg logit contributions
 cube-0.008
 dough-0.008
 sp-0.008
 bucket-0.008
 cubes-0.008
Token her
Token ID607
Feature activation-8.55e-03

Pos logit contributions
 tallest+0.009
 cube+0.009
 cubes+0.009
 nail+0.009
 brick+0.009
Neg logit contributions
jo-0.011
 Carly-0.010
fulness-0.010
iko-0.009
 Carol-0.009
Token nose
Token ID9686
Feature activation-4.72e-03

Pos logit contributions
 tallest+0.009
 cube+0.009
 cubes+0.009
 brick+0.008
 nail+0.008
Neg logit contributions
jo-0.017
 Carly-0.016
fulness-0.016
le-0.016
les-0.016
Token help
Token ID1037
Feature activation-2.19e-02

Pos logit contributions
jo+0.012
 Carly+0.012
fulness+0.011
iko+0.011
of+0.011
Neg logit contributions
 tallest-0.007
 cube-0.007
 cubes-0.007
 nail-0.007
 brick-0.007
Token her
Token ID607
Feature activation-5.73e-03

Pos logit contributions
 tallest+0.035
 cube+0.035
 cubes+0.033
 nail+0.033
 brick+0.033
Neg logit contributions
jo-0.035
 Carly-0.033
fulness-0.031
les-0.030
iko-0.030
Token mom
Token ID1995
Feature activation-1.30e-02

Pos logit contributions
 tallest+0.010
 centimeters+0.009
 cubes+0.008
 inches+0.008
asuring+0.008
Neg logit contributions
jo-0.008
 Carly-0.008
 Carol-0.007
 Rita-0.007
 Debbie-0.007
Token put
Token ID1234
Feature activation-3.16e-03

Pos logit contributions
 tallest+0.027
nam+0.026
 structure+0.026
 nail+0.025
 brick+0.025
Neg logit contributions
le-0.012
les-0.011
jo-0.011
of-0.010
ock-0.010
Token or
Token ID393
Feature activation-9.70e-03

Pos logit contributions
 tallest+0.005
 cube+0.005
 nail+0.004
 cubes+0.004
 brick+0.004
Neg logit contributions
jo-0.005
 Carly-0.005
fulness-0.005
les-0.005
iko-0.005
Tokennam
Token ID7402
Feature activation-5.29e-02

Pos logit contributions
 tallest+0.012
 cube+0.011
 brick+0.010
 nail+0.010
 cubes+0.010
Neg logit contributions
jo-0.018
 Carly-0.018
fulness-0.017
of-0.017
 Debbie-0.017
Tokenents
Token ID658
Feature activation-1.67e-02

Pos logit contributions
 tallest+0.087
 sp+0.084
 cube+0.083
 rake+0.082
 brick+0.082
Neg logit contributions
jo-0.082
 Carly-0.073
of-0.072
fulness-0.069
le-0.068
Token on
Token ID319
Feature activation-7.88e-03

Pos logit contributions
 tallest+0.023
 cube+0.023
 nail+0.022
 cubes+0.022
 brick+0.022
Neg logit contributions
jo-0.030
 Carly-0.029
fulness-0.028
iko-0.026
les-0.026
Token the
Token ID262
Feature activation+3.13e-03

Pos logit contributions
 tallest+0.012
 cube+0.011
 nail+0.011
 brick+0.011
 cubes+0.011
Neg logit contributions
 Carly-0.013
jo-0.013
fulness-0.012
 Carol-0.012
 Debbie-0.012
Token tree
Token ID5509
Feature activation-9.77e-03

Pos logit contributions
jo+0.005
 Carly+0.005
unt+0.005
 Debbie+0.005
 Paula+0.005
Neg logit contributions
asuring-0.003
 cube-0.003
 cubes-0.003
 sp-0.003
 bucket-0.003
Token.
Token ID13
Feature activation-2.33e-02

Pos logit contributions
 tallest+0.018
 cube+0.017
 cubes+0.016
 biggest+0.016
 sp+0.016
Neg logit contributions
jo-0.012
le-0.010
of-0.010
land-0.010
les-0.010
Token and
Token ID290
Feature activation-6.23e-03

Pos logit contributions
jo+0.003
iko+0.003
 Carly+0.003
fulness+0.003
azz+0.003
Neg logit contributions
 sp-0.002
 square-0.002
 cube-0.002
 brick-0.002
 tallest-0.002
Token then
Token ID788
Feature activation-2.42e-03

Pos logit contributions
 sp+0.008
 cube+0.008
 tallest+0.008
 bucket+0.008
 square+0.008
Neg logit contributions
jo-0.013
fulness-0.012
iko-0.012
 Carly-0.011
lor-0.011
Token she
Token ID673
Feature activation-2.85e-03

Pos logit contributions
 sp+0.003
 bucket+0.003
 nail+0.003
 cube+0.003
 square+0.003
Neg logit contributions
jo-0.005
fulness-0.004
iko-0.004
ock-0.004
lor-0.004
Token had
Token ID550
Feature activation+2.25e-03

Pos logit contributions
 sp+0.004
 bucket+0.004
 cube+0.004
 tallest+0.004
 cubes+0.004
Neg logit contributions
jo-0.005
fulness-0.005
 Carly-0.005
iko-0.005
 Carol-0.005
Token a
Token ID257
Feature activation+7.15e-03

Pos logit contributions
jo+0.005
 Carly+0.005
fulness+0.005
le+0.004
les+0.004
Neg logit contributions
 tallest-0.002
 sp-0.002
 cube-0.002
 bucket-0.002
 cubes-0.002
Token scr
Token ID6040
Feature activation-5.26e-02

Pos logit contributions
jo+0.015
 Carly+0.014
 Carol+0.013
fulness+0.013
of+0.013
Neg logit contributions
 tallest-0.007
 cube-0.007
 cubes-0.007
 nail-0.007
 bricks-0.007
Tokenum
Token ID388
Feature activation+1.99e-03

Pos logit contributions
 tallest+0.091
 cube+0.090
 nail+0.087
 cubes+0.087
 brick+0.086
Neg logit contributions
jo-0.076
 Carly-0.070
fulness-0.067
of-0.064
les-0.064
Tokenpt
Token ID457
Feature activation-7.34e-03

Pos logit contributions
jo+0.004
 Carly+0.004
fulness+0.004
 Carol+0.003
iko+0.003
Neg logit contributions
 cube-0.003
 tallest-0.003
 cubes-0.003
 bucket-0.003
 nail-0.002
Tokenious
Token ID699
Feature activation-6.00e-03

Pos logit contributions
 tallest+0.010
 cube+0.010
 cubes+0.009
 nail+0.009
 brick+0.009
Neg logit contributions
jo-0.014
 Carly-0.013
fulness-0.013
iko-0.012
les-0.012
Token snack
Token ID26906
Feature activation-4.28e-04

Pos logit contributions
 cube+0.006
 tallest+0.006
 brick+0.006
 nail+0.005
 cubes+0.005
Neg logit contributions
jo-0.013
 Carly-0.013
fulness-0.012
iko-0.012
 Carol-0.012
Token.
Token ID13
Feature activation-3.29e-03

Pos logit contributions
 cube+0.001
 cubes+0.001
 tallest+0.001
 bucket+0.001
 nail+0.001
Neg logit contributions
jo-0.001
 Carly-0.001
iko-0.001
fulness-0.001
 Carol-0.001
Token he
Token ID339
Feature activation-2.78e-03

Pos logit contributions
 tallest+0.005
 cube+0.005
 nail+0.005
 cubes+0.005
 brick+0.005
Neg logit contributions
jo-0.013
 Carly-0.012
fulness-0.012
les-0.011
 Carol-0.011
Token especially
Token ID2592
Feature activation-1.07e-02

Pos logit contributions
 cube+0.003
 tallest+0.003
 sp+0.003
 nail+0.003
 bucket+0.003
Neg logit contributions
 Carly-0.006
jo-0.006
fulness-0.006
 Carol-0.005
iko-0.005
Token liked
Token ID8288
Feature activation-6.46e-03

Pos logit contributions
 cube+0.012
 tallest+0.012
 nail+0.011
 cubes+0.011
 sp+0.011
Neg logit contributions
jo-0.022
 Carly-0.022
fulness-0.021
iko-0.020
 Carol-0.020
Token the
Token ID262
Feature activation-9.97e-03

Pos logit contributions
 cube+0.009
 cubes+0.009
 tallest+0.009
 brick+0.009
 sp+0.009
Neg logit contributions
jo-0.012
 Carly-0.011
fulness-0.010
iko-0.010
of-0.010
Token or
Token ID393
Feature activation-2.60e-03

Pos logit contributions
 tallest+0.011
 cube+0.011
 nail+0.010
 brick+0.010
 cubes+0.010
Neg logit contributions
jo-0.022
 Carly-0.021
fulness-0.020
of-0.019
iko-0.019
Tokennam
Token ID7402
Feature activation-5.25e-02

Pos logit contributions
 tallest+0.003
 cube+0.003
 nail+0.003
 cubes+0.003
 brick+0.003
Neg logit contributions
jo-0.005
 Carly-0.005
fulness-0.005
 Carol-0.005
les-0.005
Tokenents
Token ID658
Feature activation-1.83e-02

Pos logit contributions
 tallest+0.083
 cube+0.082
 sp+0.081
 bucket+0.081
 nail+0.081
Neg logit contributions
jo-0.084
 Carly-0.076
of-0.071
fulness-0.070
les-0.070
Token he
Token ID339
Feature activation+3.92e-03

Pos logit contributions
 cube+0.023
 tallest+0.023
 cubes+0.022
 nail+0.022
 brick+0.022
Neg logit contributions
jo-0.035
 Carly-0.034
fulness-0.032
iko-0.031
 Carol-0.031
Token kept
Token ID4030
Feature activation-3.73e-03

Pos logit contributions
jo+0.009
 Carly+0.009
fulness+0.008
 Carol+0.008
 Debbie+0.008
Neg logit contributions
 tallest-0.004
 cube-0.004
 sp-0.004
 nail-0.003
 cubes-0.003
Token around
Token ID1088
Feature activation+1.21e-02

Pos logit contributions
 cube+0.005
 cubes+0.004
 tallest+0.004
 nail+0.004
 brick+0.004
Neg logit contributions
jo-0.008
 Carly-0.007
iko-0.007
of-0.007
 Carol-0.007
Token.
Token ID13
Feature activation+1.94e-03

Pos logit contributions
jo+0.025
 Carly+0.025
iko+0.023
of+0.023
 Debbie+0.023
Neg logit contributions
 cube-0.014
 tallest-0.014
 cubes-0.013
 brick-0.013
 nail-0.013
Token the
Token ID262
Feature activation-6.62e-03

Pos logit contributions
 cube+0.007
 brick+0.007
 bucket+0.007
 cubes+0.007
 sp+0.007
Neg logit contributions
jo-0.013
iko-0.012
 Carly-0.012
lor-0.011
fulness-0.011
Token rest
Token ID1334
Feature activation-3.64e-03

Pos logit contributions
 tallest+0.006
 cube+0.006
 cubes+0.006
 nail+0.005
 brick+0.005
Neg logit contributions
jo-0.014
 Carly-0.014
fulness-0.013
les-0.013
of-0.013
Token of
Token ID286
Feature activation+4.45e-03

Pos logit contributions
 cube+0.006
 tallest+0.006
 cubes+0.006
 bucket+0.006
 nail+0.006
Neg logit contributions
jo-0.006
 Carly-0.005
iko-0.005
fulness-0.005
 Carol-0.005
Token the
Token ID262
Feature activation-6.13e-03

Pos logit contributions
jo+0.010
 Carly+0.009
fulness+0.009
iko+0.009
les+0.009
Neg logit contributions
 cube-0.005
 brick-0.005
 nail-0.005
 cubes-0.005
 bucket-0.005
Token or
Token ID393
Feature activation-5.34e-03

Pos logit contributions
 tallest+0.006
 cube+0.005
 cubes+0.005
 brick+0.005
nam+0.005
Neg logit contributions
jo-0.013
 Carly-0.012
fulness-0.012
les-0.012
of-0.012
Tokennam
Token ID7402
Feature activation-5.16e-02

Pos logit contributions
 cube+0.006
 tallest+0.006
nam+0.006
 nail+0.006
 bucket+0.006
Neg logit contributions
jo-0.011
 Carly-0.010
iko-0.010
fulness-0.010
Jo-0.010
Tokenents
Token ID658
Feature activation-1.74e-02

Pos logit contributions
 tallest+0.081
 cube+0.079
 nail+0.077
 brick+0.077
 sp+0.077
Neg logit contributions
jo-0.084
 Carly-0.077
of-0.073
fulness-0.072
les-0.070
Token onto
Token ID4291
Feature activation-4.13e-03

Pos logit contributions
 cube+0.022
 tallest+0.022
 cubes+0.021
 nail+0.021
 brick+0.021
Neg logit contributions
jo-0.033
 Carly-0.032
iko-0.030
fulness-0.030
 Carol-0.029
Token the
Token ID262
Feature activation+8.20e-03

Pos logit contributions
 brick+0.005
 cube+0.005
 dough+0.005
 nail+0.005
 square+0.005
Neg logit contributions
jo-0.009
 Carly-0.008
iko-0.008
azz-0.007
of-0.007
Token tree
Token ID5509
Feature activation-5.72e-03

Pos logit contributions
jo+0.018
 Carly+0.018
fulness+0.017
les+0.016
iko+0.016
Neg logit contributions
 tallest-0.007
 cube-0.007
 cubes-0.007
 nail-0.007
 brick-0.007
Token.
Token ID13
Feature activation-1.16e-03

Pos logit contributions
 cube+0.007
 tallest+0.007
 nail+0.007
 brick+0.007
 cubes+0.007
Neg logit contributions
jo-0.011
 Carly-0.011
fulness-0.010
iko-0.010
 Debbie-0.010
Token other
Token ID584
Feature activation+5.91e-03

Pos logit contributions
jo+0.000
lor+0.000
 Carly+0.000
iko+0.000
fulness+0.000
Neg logit contributions
 tallest-0.000
 brick-0.000
 cube-0.000
 nail-0.000
 dough-0.000
Token around
Token ID1088
Feature activation+2.34e-02

Pos logit contributions
jo+0.012
 Carly+0.011
iko+0.011
fulness+0.011
 Debbie+0.011
Neg logit contributions
 cube-0.007
 tallest-0.007
 nail-0.006
 square-0.006
 brick-0.006
Token the
Token ID262
Feature activation+9.08e-03

Pos logit contributions
jo+0.052
lor+0.049
iko+0.048
 Carly+0.047
azz+0.047
Neg logit contributions
 square-0.023
 sp-0.022
 cube-0.022
 dough-0.022
 nail-0.021
Token piano
Token ID19132
Feature activation+1.31e-02

Pos logit contributions
jo+0.020
 Carly+0.019
fulness+0.018
iko+0.018
of+0.017
Neg logit contributions
 tallest-0.009
 cube-0.009
 nail-0.009
 cubes-0.009
 brick-0.009
Token and
Token ID290
Feature activation+1.44e-02

Pos logit contributions
jo+0.027
 Carly+0.027
fulness+0.025
iko+0.025
 Debbie+0.025
Neg logit contributions
 tallest-0.015
 cube-0.014
 square-0.014
 nail-0.014
 brick-0.013
Token around
Token ID1088
Feature activation+2.70e-02

Pos logit contributions
jo+0.029
lor+0.029
fulness+0.028
iko+0.028
of+0.027
Neg logit contributions
 square-0.017
 sp-0.017
 tallest-0.017
 stack-0.017
nam-0.017
Token the
Token ID262
Feature activation+1.10e-02

Pos logit contributions
jo+0.062
lor+0.061
iko+0.059
azz+0.058
unt+0.058
Neg logit contributions
 square-0.024
 sp-0.023
 cube-0.022
 tallest-0.022
 very-0.021
Token living
Token ID2877
Feature activation+3.02e-03

Pos logit contributions
jo+0.024
 Carly+0.023
fulness+0.022
iko+0.022
of+0.021
Neg logit contributions
 tallest-0.012
 cube-0.011
 nail-0.011
 brick-0.011
 cubes-0.011
Token room
Token ID2119
Feature activation+1.22e-02

Pos logit contributions
jo+0.008
 Carly+0.007
fulness+0.007
of+0.007
iko+0.007
Neg logit contributions
 cube-0.002
 tallest-0.002
 bucket-0.002
 cubes-0.002
 sp-0.002
Token.
Token ID13
Feature activation+8.80e-03

Pos logit contributions
jo+0.027
 Carly+0.026
iko+0.025
 Debbie+0.024
fulness+0.024
Neg logit contributions
 cube-0.013
 tallest-0.013
 square-0.012
 sp-0.012
 cubes-0.012
Token\n
Token ID198
Feature activation+1.05e-02

Pos logit contributions
le+0.018
fulness+0.017
les+0.017
jo+0.017
ock+0.017
Neg logit contributions
 tallest-0.010
 square-0.010
 brick-0.009
 cube-0.009
 sp-0.009
Token.
Token ID13
Feature activation+1.33e-02

Pos logit contributions
jo+0.049
 Carly+0.048
fulness+0.045
 Carol+0.044
 Debbie+0.044
Neg logit contributions
 cube-0.028
 cubes-0.026
 bucket-0.026
 tallest-0.026
 sp-0.025
Token It
Token ID632
Feature activation+1.25e-02

Pos logit contributions
jo+0.025
 Carly+0.024
fulness+0.023
iko+0.022
les+0.022
Neg logit contributions
 tallest-0.017
 cube-0.017
 nail-0.016
 cubes-0.016
 brick-0.016
Token had
Token ID550
Feature activation+7.34e-03

Pos logit contributions
jo+0.029
fulness+0.029
 Carly+0.028
reating+0.027
iko+0.026
Neg logit contributions
 sp-0.012
 cube-0.011
 cracked-0.011
 bucket-0.011
 square-0.011
Token a
Token ID257
Feature activation+2.29e-02

Pos logit contributions
jo+0.019
fulness+0.017
le+0.017
 Carly+0.016
unt+0.016
Neg logit contributions
 sp-0.006
 bucket-0.006
 cracked-0.006
 tallest-0.006
 square-0.006
Token shiny
Token ID22441
Feature activation-8.46e-03

Pos logit contributions
jo+0.052
 Carly+0.049
fulness+0.049
les+0.046
le+0.046
Neg logit contributions
 cube-0.022
 tallest-0.022
 nail-0.022
 bucket-0.022
 sp-0.021
Token thing
Token ID1517
Feature activation+3.07e-02

Pos logit contributions
 cube+0.009
 tallest+0.009
 nail+0.009
 bucket+0.008
 cubes+0.008
Neg logit contributions
jo-0.018
 Carly-0.018
fulness-0.017
iko-0.016
of-0.016
Token in
Token ID287
Feature activation+3.67e-03

Pos logit contributions
jo+0.062
 Carly+0.060
fulness+0.057
iko+0.055
 Carol+0.055
Neg logit contributions
 cube-0.037
 tallest-0.036
 nail-0.035
 cubes-0.035
 brick-0.034
Token its
Token ID663
Feature activation+1.73e-02

Pos logit contributions
jo+0.007
fulness+0.006
les+0.006
anna+0.006
azz+0.006
Neg logit contributions
 bucket-0.006
 nail-0.006
 brick-0.005
 square-0.005
 sp-0.005
Token be
Token ID307
Feature activation-8.42e-03

Pos logit contributions
jo+0.038
 Carly+0.037
fulness+0.035
iko+0.034
of+0.034
Neg logit contributions
 cube-0.016
 tallest-0.016
 cubes-0.016
 nail-0.015
 brick-0.015
Tokenak
Token ID461
Feature activation-6.70e-03

Pos logit contributions
 cube+0.009
 cubes+0.008
 tallest+0.008
 bucket+0.008
 sp+0.008
Neg logit contributions
 Carly-0.018
jo-0.018
fulness-0.017
 Carol-0.017
 Debbie-0.017
Token.
Token ID13
Feature activation+1.32e-02

Pos logit contributions
 tallest+0.009
 cube+0.009
 nail+0.008
 cubes+0.008
 brick+0.008
Neg logit contributions
jo-0.013
 Carly-0.012
fulness-0.011
iko-0.011
 Carol-0.011
Token we
Token ID356
Feature activation-7.97e-03

Pos logit contributions
jo+0.000
les+0.000
 Carly+0.000
rob+0.000
ez+0.000
Neg logit contributions
 sp-0.000
 cube-0.000
 nail-0.000
 bucket-0.000
 tallest-0.000
Token will
Token ID481
Feature activation-2.24e-03

Pos logit contributions
 tallest+0.009
 cube+0.009
 cubes+0.009
 nail+0.009
 brick+0.009
Neg logit contributions
jo-0.016
 Carly-0.016
fulness-0.015
of-0.014
iko-0.014
Token be
Token ID307
Feature activation-1.32e-02

Pos logit contributions
 tallest+0.002
 sp+0.002
 cube+0.002
 bucket+0.002
 rake+0.002
Neg logit contributions
jo-0.005
 Carly-0.004
fulness-0.004
les-0.004
elle-0.004
Token there
Token ID612
Feature activation-5.15e-03

Pos logit contributions
 tallest+0.013
 cube+0.011
 sp+0.011
 bucket+0.011
 cubes+0.011
Neg logit contributions
jo-0.031
fulness-0.029
 Carly-0.029
of-0.028
iko-0.027
Token in
Token ID287
Feature activation+1.12e-02

Pos logit contributions
 tallest+0.006
 cube+0.006
 square+0.006
 sp+0.006
 cubes+0.006
Neg logit contributions
jo-0.010
 Carly-0.010
iko-0.010
les-0.009
fulness-0.009
Token a
Token ID257
Feature activation+2.79e-02

Pos logit contributions
jo+0.024
 Carly+0.022
fulness+0.022
les+0.021
iko+0.021
Neg logit contributions
 tallest-0.012
 cube-0.012
 bucket-0.012
 nail-0.012
 square-0.012
Token minute
Token ID5664
Feature activation+9.74e-03

Pos logit contributions
jo+0.056
 Carly+0.054
fulness+0.050
 Carol+0.049
of+0.049
Neg logit contributions
 tallest-0.031
 cube-0.031
 cubes-0.030
 nail-0.030
nam-0.029
Token!"
Token ID2474
Feature activation+5.32e-03

Pos logit contributions
jo+0.018
 Carly+0.018
fulness+0.017
iko+0.017
 Carol+0.016
Neg logit contributions
 cube-0.012
 tallest-0.012
 square-0.012
 cubes-0.012
 sp-0.012
Token their
Token ID511
Feature activation-1.66e-03

Pos logit contributions
jo+0.012
lor+0.012
fulness+0.012
les+0.011
ock+0.011
Neg logit contributions
 square-0.005
 sp-0.005
 cube-0.005
 bucket-0.005
 squash-0.005
Token parents
Token ID3397
Feature activation-1.31e-02

Pos logit contributions
 tallest+0.002
 cube+0.002
 nail+0.002
 cubes+0.002
 brick+0.002
Neg logit contributions
jo-0.003
 Carly-0.003
fulness-0.003
iko-0.003
of-0.003
Token said
Token ID531
Feature activation+6.53e-03

Pos logit contributions
 cube+0.019
 sp+0.018
 cubes+0.017
 bucket+0.017
 hammer+0.017
Neg logit contributions
jo-0.024
 Carly-0.023
fulness-0.021
iko-0.021
 Carol-0.020
Token struggling
Token ID9648
Feature activation-1.87e-03

Pos logit contributions
jo+0.022
 Carly+0.021
fulness+0.020
les+0.019
le+0.019
Neg logit contributions
 tallest-0.021
 cube-0.020
 cubes-0.020
 nail-0.019
 brick-0.019
Token and
Token ID290
Feature activation+2.34e-02

Pos logit contributions
 cube+0.003
 cubes+0.002
 tallest+0.002
 nail+0.002
 brick+0.002
Neg logit contributions
jo-0.004
 Carly-0.003
iko-0.003
fulness-0.003
 Debbie-0.003
Token asked
Token ID1965
Feature activation+2.10e-02

Pos logit contributions
jo+0.047
 Carly+0.044
fulness+0.044
iko+0.042
le+0.040
Neg logit contributions
 sp-0.031
 cube-0.029
 nail-0.028
 tallest-0.028
 brick-0.028
Token,
Token ID11
Feature activation+1.63e-03

Pos logit contributions
jo+0.039
 Carly+0.036
iko+0.034
fulness+0.034
les+0.033
Neg logit contributions
 sp-0.029
 cube-0.029
 brick-0.028
 cubes-0.028
 tallest-0.028
Token "
Token ID366
Feature activation+1.15e-02

Pos logit contributions
jo+0.004
 Carly+0.004
fulness+0.003
les+0.003
iko+0.003
Neg logit contributions
 cube-0.002
 tallest-0.002
 nail-0.001
 brick-0.001
 sp-0.001
TokenDo
Token ID5211
Feature activation+2.74e-02

Pos logit contributions
jo+0.023
 Carly+0.021
fulness+0.020
 Carol+0.019
iko+0.019
Neg logit contributions
 cube-0.015
 sp-0.015
 tallest-0.015
 nail-0.015
 brick-0.014
Token you
Token ID345
Feature activation+1.23e-02

Pos logit contributions
 Carly+0.037
jo+0.037
of+0.035
fulness+0.035
 Carol+0.033
Neg logit contributions
 cube-0.049
 tallest-0.049
 brick-0.049
 bucket-0.048
 cubes-0.047
Token need
Token ID761
Feature activation+1.45e-02

Pos logit contributions
jo+0.019
 Carly+0.017
fulness+0.017
le+0.017
of+0.017
Neg logit contributions
 tallest-0.020
 cubes-0.020
 cube-0.020
 brick-0.019
 bricks-0.019
Token help
Token ID1037
Feature activation+1.39e-03

Pos logit contributions
jo+0.034
 Carly+0.032
fulness+0.030
iko+0.030
les+0.030
Neg logit contributions
 cube-0.013
 nail-0.013
 tallest-0.013
 sp-0.013
 cubes-0.013
Token?"
Token ID1701
Feature activation+1.54e-02

Pos logit contributions
jo+0.003
 Carly+0.002
fulness+0.002
iko+0.002
 Debbie+0.002
Neg logit contributions
 cube-0.002
 tallest-0.002
 nail-0.002
 brick-0.002
 hammer-0.002
Token Sp
Token ID1338
Feature activation+2.73e-02

Pos logit contributions
jo+0.028
 Carly+0.027
fulness+0.026
iko+0.025
la+0.024
Neg logit contributions
 cube-0.022
 sp-0.022
 square-0.021
 tallest-0.021
 nail-0.021
Token passengers
Token ID10405
Feature activation-1.43e-03

Pos logit contributions
jo+0.016
 Carly+0.016
fulness+0.015
iko+0.014
of+0.014
Neg logit contributions
 tallest-0.009
 cube-0.009
 cubes-0.009
 nail-0.009
 brick-0.009
Token.
Token ID13
Feature activation-3.19e-03

Pos logit contributions
 cube+0.002
 tallest+0.002
 cubes+0.002
 nail+0.002
 brick+0.002
Neg logit contributions
jo-0.003
 Carly-0.003
fulness-0.002
iko-0.002
 Carol-0.002
Token\n
Token ID198
Feature activation-7.49e-03

Pos logit contributions
 tallest+0.006
 cube+0.006
 nail+0.005
 cubes+0.005
 brick+0.005
Neg logit contributions
jo-0.005
 Carly-0.004
fulness-0.004
 Carol-0.004
of-0.004
Token\n
Token ID198
Feature activation-6.14e-03

Pos logit contributions
 cube+0.009
 square+0.009
 bucket+0.009
 brick+0.009
 cubes+0.009
Neg logit contributions
jo-0.014
 Carly-0.014
fulness-0.014
iko-0.013
elle-0.013
Token"
Token ID1
Feature activation-1.41e-02

Pos logit contributions
 tallest+0.011
 cube+0.011
 brick+0.010
 cubes+0.010
 nail+0.010
Neg logit contributions
jo-0.009
 Carly-0.008
of-0.008
fulness-0.008
le-0.008
TokenCh
Token ID1925
Feature activation+2.83e-02

Pos logit contributions
 cube+0.020
 tallest+0.020
 cubes+0.020
 nail+0.020
 brick+0.019
Neg logit contributions
jo-0.025
 Carly-0.023
fulness-0.022
iko-0.022
les-0.021
Tokenoo
Token ID2238
Feature activation+9.86e-03

Pos logit contributions
jo+0.037
 Carly+0.036
fulness+0.032
 Debbie+0.032
 Carol+0.031
Neg logit contributions
 cube-0.052
 cubes-0.051
 tallest-0.051
 brick-0.050
 nail-0.050
Token cho
Token ID1727
Feature activation+1.10e-02

Pos logit contributions
 Carly+0.017
jo+0.017
fulness+0.016
Kat+0.015
 Debbie+0.015
Neg logit contributions
 cube-0.014
 tallest-0.014
 cubes-0.014
 sp-0.013
 nail-0.013
Tokeno
Token ID78
Feature activation-9.09e-03

Pos logit contributions
 Carly+0.019
jo+0.018
 Carol+0.017
 Debbie+0.017
fulness+0.017
Neg logit contributions
 cube-0.016
 cubes-0.016
 nail-0.015
 tallest-0.015
 brick-0.015
Token,
Token ID11
Feature activation+7.77e-03

Pos logit contributions
 cube+0.013
 tallest+0.013
 cubes+0.012
 nail+0.012
 sp+0.012
Neg logit contributions
jo-0.016
 Carly-0.016
fulness-0.015
 Carol-0.014
 Debbie-0.014
Token this
Token ID428
Feature activation+2.16e-03

Pos logit contributions
jo+0.014
 Carly+0.013
fulness+0.013
iko+0.012
les+0.012
Neg logit contributions
 tallest-0.011
 cube-0.011
 nail-0.010
 cubes-0.010
 brick-0.010
TokenHer
Token ID9360
Feature activation-8.01e-03

Pos logit contributions
 tallest+0.025
 cubes+0.024
 cube+0.023
 bricks+0.022
nam+0.022
Neg logit contributions
jo-0.013
 Carly-0.012
of-0.012
unt-0.012
 Carol-0.011
Token mom
Token ID1995
Feature activation+3.57e-03

Pos logit contributions
 tallest+0.013
nam+0.012
 cube+0.012
 cubes+0.011
 nail+0.011
Neg logit contributions
jo-0.012
 Carly-0.012
unt-0.011
 Carol-0.011
iko-0.011
Token replied
Token ID8712
Feature activation+1.56e-02

Pos logit contributions
jo+0.006
 Carly+0.005
fulness+0.005
iko+0.005
 Debbie+0.005
Neg logit contributions
 cube-0.006
 tallest-0.006
 nail-0.006
 sp-0.006
 cubes-0.006
Token,
Token ID11
Feature activation+4.97e-03

Pos logit contributions
jo+0.026
 Carly+0.025
fulness+0.025
les+0.023
le+0.023
Neg logit contributions
 tallest-0.024
 cube-0.023
 cubes-0.022
 nail-0.022
 brick-0.022
Token "
Token ID366
Feature activation+4.85e-03

Pos logit contributions
jo+0.009
 Carly+0.008
fulness+0.008
 Carol+0.008
unt+0.008
Neg logit contributions
 tallest-0.007
 cube-0.007
 cubes-0.007
 nail-0.007
 bucket-0.007
TokenThe
Token ID464
Feature activation-1.82e-03

Pos logit contributions
jo+0.009
 Carly+0.008
fulness+0.008
iko+0.007
les+0.007
Neg logit contributions
 tallest-0.007
 cube-0.007
 nail-0.007
 cubes-0.007
 brick-0.007
Token store
Token ID3650
Feature activation-1.08e-02

Pos logit contributions
nam+0.002
 tallest+0.002
 cube+0.002
 cubes+0.002
 brick+0.002
Neg logit contributions
 Carly-0.004
jo-0.004
fulness-0.003
les-0.003
 Carol-0.003
Token owner
Token ID4870
Feature activation-1.83e-02

Pos logit contributions
 tallest+0.014
 cube+0.014
 cubes+0.013
 brick+0.013
 nail+0.013
Neg logit contributions
jo-0.020
 Carly-0.019
fulness-0.018
 Carol-0.018
les-0.018
Token wouldn
Token ID3636
Feature activation+9.52e-03

Pos logit contributions
 cube+0.020
 tallest+0.020
 nail+0.019
 cubes+0.019
 brick+0.019
Neg logit contributions
jo-0.039
 Carly-0.037
fulness-0.036
iko-0.034
 Carol-0.034
Token't
Token ID470
Feature activation-6.51e-03

Pos logit contributions
jo+0.013
 Carly+0.013
fulness+0.012
of+0.011
iko+0.011
Neg logit contributions
 tallest-0.017
 cube-0.017
 cubes-0.016
 nail-0.016
 brick-0.016
Token like
Token ID588
Feature activation+2.66e-03

Pos logit contributions
 cube+0.007
 tallest+0.007
 nail+0.006
 cubes+0.006
 brick+0.006
Neg logit contributions
jo-0.014
 Carly-0.014
fulness-0.013
iko-0.013
 Debbie-0.013
Tokenandy
Token ID10757
Feature activation+5.06e-04

Pos logit contributions
jo+0.012
 Carly+0.012
fulness+0.011
 Carol+0.011
les+0.011
Neg logit contributions
 cube-0.011
 tallest-0.010
 cubes-0.010
 nail-0.010
 brick-0.010
Token's
Token ID338
Feature activation-6.44e-03

Pos logit contributions
jo+0.001
 Carly+0.001
fulness+0.001
iko+0.001
 Carol+0.001
Neg logit contributions
 cube-0.001
 tallest-0.001
 sp-0.001
 nail-0.001
 bucket-0.001
Token mother
Token ID2802
Feature activation+1.35e-03

Pos logit contributions
 tallest+0.007
 cube+0.007
 nail+0.007
 cubes+0.007
 brick+0.007
Neg logit contributions
jo-0.013
 Carly-0.013
fulness-0.012
iko-0.012
 Carol-0.012
Token smiled
Token ID13541
Feature activation+5.51e-03

Pos logit contributions
fulness+0.002
 Carly+0.002
jo+0.002
rett+0.002
iko+0.002
Neg logit contributions
 sp-0.002
 cube-0.002
 stack-0.002
 cracked-0.002
 nail-0.002
Token and
Token ID290
Feature activation+6.06e-03

Pos logit contributions
jo+0.009
 Carly+0.009
fulness+0.008
iko+0.008
of+0.008
Neg logit contributions
 cube-0.008
 tallest-0.008
 sp-0.008
 brick-0.008
 cubes-0.008
Token said
Token ID531
Feature activation+1.45e-02

Pos logit contributions
jo+0.012
fulness+0.011
 Carly+0.010
iko+0.010
rett+0.010
Neg logit contributions
 sp-0.008
 cube-0.008
 bucket-0.008
 squash-0.008
 tallest-0.008
Token,
Token ID11
Feature activation+2.26e-03

Pos logit contributions
jo+0.027
 Carly+0.026
fulness+0.025
iko+0.024
of+0.023
Neg logit contributions
 tallest-0.019
 cube-0.019
 nail-0.018
 cubes-0.018
 brick-0.018
Token "
Token ID366
Feature activation+2.95e-04

Pos logit contributions
jo+0.005
 Carly+0.005
fulness+0.005
iko+0.004
les+0.004
Neg logit contributions
 cube-0.002
 tallest-0.002
 nail-0.002
 brick-0.002
 cubes-0.002
TokenYes
Token ID5297
Feature activation+1.04e-02

Pos logit contributions
jo+0.001
 Carly+0.001
les+0.001
iko+0.001
le+0.001
Neg logit contributions
 tallest-0.000
 cube-0.000
 brick-0.000
 nail-0.000
 square-0.000
Token.
Token ID13
Feature activation-2.57e-03

Pos logit contributions
jo+0.020
 Carly+0.019
fulness+0.018
iko+0.018
les+0.018
Neg logit contributions
 cube-0.013
 tallest-0.013
 cubes-0.013
 nail-0.013
 brick-0.013
Token We
Token ID775
Feature activation+8.89e-03

Pos logit contributions
 tallest+0.003
 cube+0.003
 nail+0.003
 cubes+0.003
 brick+0.003
Neg logit contributions
jo-0.005
 Carly-0.005
fulness-0.004
iko-0.004
les-0.004
Token together
Token ID1978
Feature activation-1.03e-02

Pos logit contributions
 tallest+0.013
 cube+0.013
 cubes+0.012
 bucket+0.012
 brick+0.012
Neg logit contributions
jo-0.020
 Carly-0.019
fulness-0.018
iko-0.018
 Carol-0.018
Token."
Token ID526
Feature activation+5.82e-03

Pos logit contributions
 tallest+0.014
 cube+0.013
 brick+0.013
 nail+0.013
 cubes+0.013
Neg logit contributions
jo-0.020
 Carly-0.019
fulness-0.017
iko-0.017
 Carol-0.017
Token\n
Token ID198
Feature activation+5.57e-03

Pos logit contributions
jo+0.011
 Carly+0.010
iko+0.010
fulness+0.010
le+0.010
Neg logit contributions
 cube-0.008
 dough-0.007
 square-0.007
 bucket-0.007
 brick-0.007
Token\n
Token ID198
Feature activation+5.03e-03

Pos logit contributions
 Carly+0.011
fulness+0.011
iko+0.011
jo+0.011
 Debbie+0.011
Neg logit contributions
 cube-0.005
 dough-0.005
 square-0.005
 brick-0.005
 nail-0.005
TokenThe
Token ID464
Feature activation-6.52e-03

Pos logit contributions
jo+0.007
 Carly+0.006
fulness+0.006
of+0.006
iko+0.006
Neg logit contributions
 tallest-0.009
 cube-0.009
 cubes-0.009
 nail-0.009
 brick-0.009
Token next
Token ID1306
Feature activation+8.51e-03

Pos logit contributions
 tallest+0.008
 cubes+0.008
 cube+0.008
 nail+0.007
nam+0.007
Neg logit contributions
jo-0.012
 Carly-0.011
unt-0.011
fulness-0.011
 Carol-0.010
Token week
Token ID1285
Feature activation-7.03e-03

Pos logit contributions
jo+0.017
 Carly+0.016
fulness+0.015
of+0.014
iko+0.014
Neg logit contributions
 tallest-0.011
 cube-0.011
 cubes-0.010
 nail-0.010
 brick-0.010
Token,
Token ID11
Feature activation-8.89e-03

Pos logit contributions
 cube+0.008
 bucket+0.007
 nail+0.007
 square+0.007
 sp+0.007
Neg logit contributions
jo-0.015
 Carly-0.014
fulness-0.014
iko-0.013
of-0.013
Token Lily
Token ID20037
Feature activation+3.79e-03

Pos logit contributions
 sp+0.012
 cube+0.011
 dough+0.011
 bucket+0.011
 squash+0.011
Neg logit contributions
jo-0.019
iko-0.016
fulness-0.016
 Carly-0.016
lor-0.016
Token and
Token ID290
Feature activation-1.09e-02

Pos logit contributions
jo+0.007
 Carly+0.007
fulness+0.007
iko+0.007
 Debbie+0.007
Neg logit contributions
 cube-0.005
 tallest-0.005
 sp-0.004
 nail-0.004
 bucket-0.004
Token her
Token ID607
Feature activation-9.54e-03

Pos logit contributions
 cube+0.019
 brick+0.018
 nail+0.018
nam+0.018
 cubes+0.018
Neg logit contributions
jo-0.016
 Carly-0.015
fulness-0.015
iko-0.014
les-0.014
Token happy
Token ID3772
Feature activation-4.96e-03

Pos logit contributions
 cube+0.003
 tallest+0.003
 sp+0.003
 bucket+0.003
 square+0.003
Neg logit contributions
jo-0.009
 Carly-0.008
fulness-0.008
iko-0.008
les-0.008
Token to
Token ID284
Feature activation-3.89e-03

Pos logit contributions
 cube+0.004
 sp+0.004
 cubes+0.004
 square+0.004
 tower+0.003
Neg logit contributions
jo-0.012
iko-0.011
azz-0.011
 Carly-0.011
April-0.011
Token have
Token ID423
Feature activation+5.35e-03

Pos logit contributions
 tallest+0.003
 cube+0.003
 nail+0.003
 square+0.003
 bucket+0.003
Neg logit contributions
jo-0.010
 Carly-0.009
les-0.009
fulness-0.009
elle-0.009
Token an
Token ID281
Feature activation-2.02e-03

Pos logit contributions
jo+0.012
iko+0.011
 Carly+0.011
fulness+0.011
azz+0.011
Neg logit contributions
 best-0.005
 sp-0.005
 split-0.005
 cubes-0.005
 bucket-0.005
Token owner
Token ID4870
Feature activation-1.06e-02

Pos logit contributions
 cube+0.002
 tallest+0.002
 cubes+0.002
 nail+0.002
 brick+0.002
Neg logit contributions
jo-0.004
 Carly-0.004
fulness-0.004
iko-0.004
 Carol-0.004
Token who
Token ID508
Feature activation-2.31e-03

Pos logit contributions
 cube+0.014
 tallest+0.013
 cubes+0.013
 bucket+0.013
 nail+0.012
Neg logit contributions
jo-0.021
 Carly-0.020
fulness-0.019
iko-0.018
les-0.018
Token loved
Token ID6151
Feature activation-1.68e-03

Pos logit contributions
 cube+0.002
 tallest+0.002
 sp+0.002
 nail+0.002
 hammer+0.002
Neg logit contributions
jo-0.005
 Carly-0.005
fulness-0.005
rett-0.005
azz-0.005
Token him
Token ID683
Feature activation-1.07e-02

Pos logit contributions
 cubes+0.002
 sp+0.002
 cube+0.002
 bricks+0.002
 dough+0.002
Neg logit contributions
jo-0.003
le-0.003
azz-0.003
les-0.003
lor-0.003
Token.
Token ID13
Feature activation-2.72e-03

Pos logit contributions
 tallest+0.013
 cube+0.013
 cubes+0.012
 sp+0.012
 nail+0.012
Neg logit contributions
jo-0.021
 Carly-0.020
fulness-0.019
iko-0.019
les-0.019
Token And
Token ID843
Feature activation+3.77e-03

Pos logit contributions
 tallest+0.003
 cube+0.003
 brick+0.003
 cubes+0.003
 square+0.003
Neg logit contributions
jo-0.005
le-0.005
 Carly-0.005
iko-0.005
les-0.005
Token every
Token ID790
Feature activation+4.99e-03

Pos logit contributions
jo+0.007
les+0.007
iko+0.007
 Carly+0.007
le+0.006
Neg logit contributions
 cube-0.005
 tallest-0.005
 best-0.005
 square-0.004
 sp-0.004
Token\n
Token ID198
Feature activation-2.60e-03

Pos logit contributions
 cube+0.004
 square+0.004
 nail+0.004
 brick+0.004
 bucket+0.004
Neg logit contributions
jo-0.006
 Carly-0.006
le-0.006
fulness-0.006
elle-0.006
Token"
Token ID1
Feature activation-2.94e-03

Pos logit contributions
 tallest+0.004
 cube+0.004
 cubes+0.004
 brick+0.004
 nail+0.004
Neg logit contributions
jo-0.004
 Carly-0.004
fulness-0.004
of-0.004
 Carol-0.004
TokenIt
Token ID1026
Feature activation+9.47e-03

Pos logit contributions
 tallest+0.004
 cube+0.004
 cubes+0.004
 nail+0.004
 brick+0.004
Neg logit contributions
jo-0.005
 Carly-0.005
fulness-0.005
iko-0.005
of-0.005
Token's
Token ID338
Feature activation-1.48e-03

Pos logit contributions
jo+0.019
 Carly+0.018
fulness+0.017
les+0.017
le+0.017
Neg logit contributions
 cubes-0.011
 tallest-0.011
nam-0.011
 cube-0.010
 nail-0.010
Token OK
Token ID7477
Feature activation+4.74e-04

Pos logit contributions
 cube+0.002
 cubes+0.002
 tallest+0.002
 bucket+0.002
 brick+0.002
Neg logit contributions
 Carly-0.002
jo-0.002
fulness-0.002
 Carol-0.002
 Margaret-0.002
Token,
Token ID11
Feature activation-3.43e-03

Pos logit contributions
jo+0.001
 Carly+0.001
rob+0.001
les+0.001
fulness+0.001
Neg logit contributions
 cube-0.000
 sp-0.000
 cubes-0.000
 bucket-0.000
 nail-0.000
Token my
Token ID616
Feature activation-5.22e-03

Pos logit contributions
 tallest+0.004
 cube+0.004
 nail+0.004
 bucket+0.004
 sp+0.004
Neg logit contributions
jo-0.007
 Carly-0.006
fulness-0.006
les-0.006
of-0.006
Token loves
Token ID10408
Feature activation-2.46e-03

Pos logit contributions
 cube+0.007
 tallest+0.007
 bucket+0.007
 nail+0.006
 brick+0.006
Neg logit contributions
jo-0.010
 Carly-0.010
fulness-0.009
 Debbie-0.009
les-0.009
Token,
Token ID11
Feature activation-2.60e-03

Pos logit contributions
 cubes+0.002
 cube+0.002
 dough+0.002
 nail+0.002
 sp+0.002
Neg logit contributions
 Carly-0.005
jo-0.005
 Carol-0.005
 Debbie-0.005
rett-0.005
Token I
Token ID314
Feature activation+1.19e-03

Pos logit contributions
 sp+0.003
 nail+0.003
 bucket+0.003
 tallest+0.003
 cube+0.003
Neg logit contributions
jo-0.005
 Carly-0.005
fulness-0.004
rett-0.004
les-0.004
Token'm
Token ID1101
Feature activation-3.18e-03

Pos logit contributions
 Carly+0.003
jo+0.003
fulness+0.003
 Debbie+0.003
 Carol+0.002
Neg logit contributions
 cube-0.001
 sp-0.001
 bucket-0.001
 tallest-0.001
 nail-0.001
Token the
Token ID262
Feature activation-2.71e-03

Pos logit contributions
jo+0.009
 Carly+0.008
fulness+0.008
iko+0.008
la+0.008
Neg logit contributions
 cube-0.004
 nail-0.004
 tallest-0.004
 cubes-0.004
 brick-0.004
Token slide
Token ID10649
Feature activation-2.08e-03

Pos logit contributions
nam+0.004
asuring+0.004
asure+0.003
 cubes+0.003
kins+0.003
Neg logit contributions
jo-0.004
unt-0.004
les-0.004
ock-0.004
 Carly-0.004
Token.
Token ID13
Feature activation-1.52e-03

Pos logit contributions
 tallest+0.003
 cube+0.003
 nail+0.003
 cubes+0.003
 brick+0.003
Neg logit contributions
jo-0.004
 Carly-0.003
fulness-0.003
les-0.003
of-0.003
Token He
Token ID679
Feature activation-4.99e-04

Pos logit contributions
 square+0.002
 cube+0.002
 sp+0.002
 tallest+0.002
 cubes+0.002
Neg logit contributions
jo-0.003
le-0.003
 Carly-0.003
iko-0.003
les-0.003
Token lands
Token ID8604
Feature activation+1.25e-02

Pos logit contributions
 tallest+0.001
 cube+0.001
 sp+0.001
 nail+0.001
 cubes+0.001
Neg logit contributions
jo-0.001
 Carly-0.001
fulness-0.001
iko-0.001
 Carol-0.001
Token on
Token ID319
Feature activation-1.02e-02

Pos logit contributions
jo+0.030
 Carly+0.028
iko+0.026
fulness+0.026
rett+0.025
Neg logit contributions
 cubes-0.013
 cube-0.012
 sp-0.012
 bucket-0.012
 tallest-0.012
Token the
Token ID262
Feature activation+3.71e-04

Pos logit contributions
 cube+0.011
 cubes+0.010
 sp+0.010
 nail+0.010
 square+0.010
Neg logit contributions
jo-0.024
fulness-0.022
 Carly-0.022
iko-0.021
le-0.020
Token ground
Token ID2323
Feature activation+4.63e-03

Pos logit contributions
jo+0.001
 Carly+0.001
les+0.001
fulness+0.001
ock+0.001
Neg logit contributions
nam-0.000
 cube-0.000
 tallest-0.000
asuring-0.000
 cubes-0.000
Token with
Token ID351
Feature activation-1.03e-02

Pos logit contributions
jo+0.009
 Carly+0.009
iko+0.009
fulness+0.008
 Debbie+0.008
Neg logit contributions
 cube-0.006
 square-0.006
 tallest-0.006
 sp-0.006
 nail-0.006
Token a
Token ID257
Feature activation+1.94e-02

Pos logit contributions
 sp+0.014
 cubes+0.014
 cube+0.014
 bucket+0.014
 tallest+0.014
Neg logit contributions
jo-0.020
 Carly-0.019
fulness-0.018
iko-0.018
lor-0.017
Token th
Token ID294
Feature activation-3.69e-03

Pos logit contributions
jo+0.038
 Carly+0.037
fulness+0.035
of+0.035
les+0.034
Neg logit contributions
 tallest-0.023
 cube-0.021
 cubes-0.021
nam-0.020
 brick-0.020
Token He
Token ID679
Feature activation-3.60e-03

Pos logit contributions
 tallest+0.001
 cube+0.001
 nail+0.001
 cubes+0.001
 bucket+0.001
Neg logit contributions
jo-0.001
 Carly-0.001
fulness-0.001
of-0.001
 Carol-0.001
Token would
Token ID561
Feature activation-7.99e-03

Pos logit contributions
 tallest+0.006
 brick+0.005
 cube+0.005
 nail+0.005
 cubes+0.005
Neg logit contributions
jo-0.006
 Carly-0.005
elle-0.005
iko-0.005
le-0.005
Token eat
Token ID4483
Feature activation-3.65e-03

Pos logit contributions
 tallest+0.011
 cube+0.011
 cubes+0.010
 nail+0.010
 brick+0.010
Neg logit contributions
jo-0.015
 Carly-0.014
fulness-0.013
of-0.013
 Debbie-0.013
Token so
Token ID523
Feature activation-4.07e-04

Pos logit contributions
 cube+0.004
 cubes+0.004
 tallest+0.004
 sp+0.004
 nail+0.004
Neg logit contributions
jo-0.008
 Carly-0.007
fulness-0.007
iko-0.007
 Carol-0.007
Token much
Token ID881
Feature activation+8.16e-03

Pos logit contributions
 tallest+0.001
 cube+0.001
 nail+0.001
 bucket+0.001
 brick+0.001
Neg logit contributions
jo-0.001
 Carly-0.001
fulness-0.001
of-0.001
iko-0.001
Token candy
Token ID18550
Feature activation-7.86e-03

Pos logit contributions
jo+0.018
 Carly+0.017
la+0.016
iko+0.016
 Carol+0.016
Neg logit contributions
 cube-0.009
 cubes-0.009
 sp-0.009
 dough-0.009
 nail-0.009
Token that
Token ID326
Feature activation-3.31e-03

Pos logit contributions
 cube+0.012
 tallest+0.012
 cubes+0.011
 nail+0.011
 brick+0.011
Neg logit contributions
jo-0.013
 Carly-0.013
fulness-0.012
iko-0.012
 Carol-0.011
Token he
Token ID339
Feature activation+1.97e-03

Pos logit contributions
 tallest+0.006
 cube+0.006
 cubes+0.005
 nail+0.005
 brick+0.005
Neg logit contributions
jo-0.005
 Carly-0.005
fulness-0.004
iko-0.004
les-0.004
Token became
Token ID2627
Feature activation-8.75e-03

Pos logit contributions
jo+0.003
 Carly+0.003
le+0.003
les+0.003
iko+0.003
Neg logit contributions
 tallest-0.003
 brick-0.003
 cube-0.003
 nail-0.003
 cubes-0.003
Token very
Token ID845
Feature activation-7.99e-03

Pos logit contributions
 cube+0.010
 brick+0.010
 structure+0.010
 nail+0.010
 bucket+0.010
Neg logit contributions
 Carly-0.015
jo-0.014
 knowing-0.014
 Hugo-0.014
les-0.014
Token ch
Token ID442
Feature activation-9.61e-03

Pos logit contributions
 brick+0.011
 nail+0.011
 nails+0.010
 bricks+0.010
 tallest+0.010
Neg logit contributions
fulness-0.012
 Carly-0.012
jo-0.012
le-0.011
 Debbie-0.011
Token Lucy
Token ID22162
Feature activation-4.30e-03

Pos logit contributions
 cube+0.002
 brick+0.002
 tallest+0.002
 square+0.002
 nail+0.002
Neg logit contributions
jo-0.003
le-0.003
iko-0.003
fulness-0.003
 Carly-0.003
Token thanked
Token ID26280
Feature activation-2.48e-03

Pos logit contributions
 cube+0.006
 sp+0.005
 tallest+0.005
 nail+0.005
 bucket+0.005
Neg logit contributions
jo-0.008
 Carly-0.008
fulness-0.008
 Carol-0.007
iko-0.007
Token the
Token ID262
Feature activation+7.56e-04

Pos logit contributions
 cube+0.002
 squash+0.002
 brick+0.002
 nail+0.002
 square+0.002
Neg logit contributions
jo-0.006
les-0.005
azz-0.005
lor-0.005
enn-0.005
Token fairy
Token ID25607
Feature activation+3.45e-03

Pos logit contributions
jo+0.001
 Carly+0.001
fulness+0.001
iko+0.001
les+0.001
Neg logit contributions
 tallest-0.001
 cube-0.001
 cubes-0.001
 nail-0.001
 brick-0.001
Token and
Token ID290
Feature activation-6.52e-03

Pos logit contributions
jo+0.006
 Carly+0.006
rob+0.006
Eric+0.006
iko+0.006
Neg logit contributions
 cube-0.005
 tallest-0.004
 brick-0.004
 square-0.004
 bucket-0.004
Token she
Token ID673
Feature activation-8.94e-03

Pos logit contributions
 bucket+0.007
 sp+0.007
 cube+0.007
 nail+0.007
nam+0.007
Neg logit contributions
jo-0.014
fulness-0.013
 Carly-0.013
iko-0.012
azz-0.012
Token took
Token ID1718
Feature activation+9.38e-04

Pos logit contributions
 sp+0.011
 bucket+0.011
 cube+0.011
 tallest+0.011
 cracked+0.010
Neg logit contributions
jo-0.018
 Carly-0.017
fulness-0.017
iko-0.015
 Carol-0.015
Token a
Token ID257
Feature activation+1.57e-03

Pos logit contributions
jo+0.002
azz+0.002
iko+0.002
ark+0.002
lor+0.002
Neg logit contributions
 dough-0.001
 squash-0.001
 clay-0.001
 sp-0.001
 brick-0.001
Token bite
Token ID13197
Feature activation-9.85e-03

Pos logit contributions
jo+0.003
 Carly+0.003
fulness+0.003
iko+0.003
elle+0.003
Neg logit contributions
 cube-0.002
 bucket-0.002
 nail-0.002
 sp-0.002
 tallest-0.002
Token of
Token ID286
Feature activation-8.18e-04

Pos logit contributions
 cube+0.013
 brick+0.013
 sp+0.013
 dough+0.013
 bucket+0.012
Neg logit contributions
jo-0.019
 Carly-0.018
iko-0.017
 Carol-0.016
Eric-0.016
Token the
Token ID262
Feature activation-8.75e-03

Pos logit contributions
 sp+0.001
 cube+0.001
 tallest+0.001
 brick+0.001
 dough+0.001
Neg logit contributions
jo-0.002
azz-0.002
 Carly-0.002
ark-0.002
fulness-0.002
Token Lily
Token ID20037
Feature activation-3.98e-03

Pos logit contributions
 sp+0.009
 cube+0.009
 square+0.009
 bucket+0.009
 cubes+0.008
Neg logit contributions
jo-0.013
fulness-0.012
lor-0.012
iko-0.012
 Carly-0.012
Token said
Token ID531
Feature activation-5.64e-03

Pos logit contributions
 tallest+0.007
 cube+0.007
 nail+0.007
 cubes+0.007
 brick+0.007
Neg logit contributions
jo-0.006
 Carly-0.005
fulness-0.005
iko-0.005
les-0.005
Token.
Token ID13
Feature activation+1.18e-03

Pos logit contributions
 tallest+0.008
 cube+0.008
 cubes+0.008
 nail+0.008
 brick+0.007
Neg logit contributions
jo-0.010
 Carly-0.010
fulness-0.009
 Carol-0.009
 Debbie-0.009
Token "
Token ID366
Feature activation+3.99e-03

Pos logit contributions
jo+0.002
fulness+0.002
 Carly+0.002
le+0.002
les+0.002
Neg logit contributions
 cube-0.002
 nail-0.002
 sp-0.002
 tallest-0.002
 bucket-0.002
TokenIt
Token ID1026
Feature activation+8.98e-03

Pos logit contributions
jo+0.007
 Carly+0.007
fulness+0.007
iko+0.006
of+0.006
Neg logit contributions
 tallest-0.005
 cube-0.005
 cubes-0.005
 nail-0.005
 brick-0.005
Token's
Token ID338
Feature activation-9.19e-03

Pos logit contributions
jo+0.019
 Carly+0.018
fulness+0.017
of+0.017
les+0.017
Neg logit contributions
 tallest-0.010
 cube-0.010
 cubes-0.010
 nail-0.009
 brick-0.009
Token not
Token ID407
Feature activation+7.43e-04

Pos logit contributions
 cube+0.012
 tallest+0.012
 cubes+0.012
 nail+0.011
 brick+0.011
Neg logit contributions
jo-0.017
 Carly-0.017
fulness-0.015
 Carol-0.015
iko-0.015
Token for
Token ID329
Feature activation-6.22e-03

Pos logit contributions
jo+0.001
 Carly+0.001
fulness+0.001
 Carol+0.001
iko+0.001
Neg logit contributions
 tallest-0.001
 cube-0.001
 cubes-0.001
 nail-0.001
 brick-0.001
Token us
Token ID514
Feature activation+1.08e-03

Pos logit contributions
 tallest+0.009
 cube+0.009
 nail+0.008
 cubes+0.008
 brick+0.008
Neg logit contributions
jo-0.011
 Carly-0.011
fulness-0.010
iko-0.010
les-0.010
Token.
Token ID13
Feature activation+2.07e-03

Pos logit contributions
jo+0.002
 Carly+0.002
fulness+0.002
iko+0.002
les+0.002
Neg logit contributions
 cube-0.002
 tallest-0.002
 cubes-0.001
 nail-0.001
 brick-0.001
Token It
Token ID632
Feature activation+1.49e-02

Pos logit contributions
jo+0.004
 Carly+0.003
fulness+0.003
 Carol+0.003
 Debbie+0.003
Neg logit contributions
 tallest-0.003
 cube-0.003
 cubes-0.003
 nail-0.003
 brick-0.003
Token new
Token ID649
Feature activation-6.20e-03

Pos logit contributions
 tallest+0.010
 cube+0.010
 nail+0.009
 cubes+0.009
 brick+0.009
Neg logit contributions
jo-0.015
 Carly-0.014
fulness-0.014
les-0.013
of-0.013
Token friend
Token ID1545
Feature activation-3.65e-03

Pos logit contributions
 cube+0.007
 brick+0.007
 tallest+0.007
 bucket+0.007
 nail+0.006
Neg logit contributions
jo-0.013
 Carly-0.013
fulness-0.013
les-0.012
azz-0.012
Token and
Token ID290
Feature activation-6.81e-03

Pos logit contributions
 cube+0.004
 square+0.004
 tallest+0.004
 sp+0.004
 brick+0.004
Neg logit contributions
jo-0.007
 Carly-0.007
fulness-0.007
iko-0.006
rett-0.006
Token went
Token ID1816
Feature activation+3.39e-03

Pos logit contributions
 bucket+0.008
 sp+0.007
 cube+0.007
 tallest+0.007
 nail+0.007
Neg logit contributions
jo-0.015
fulness-0.013
 Carly-0.013
les-0.013
iko-0.013
Token home
Token ID1363
Feature activation+1.52e-02

Pos logit contributions
jo+0.008
fulness+0.007
elle+0.007
 Carly+0.007
Jo+0.007
Neg logit contributions
 bucket-0.004
 cube-0.003
 square-0.003
 rake-0.003
 hammer-0.003
Token,
Token ID11
Feature activation-7.44e-03

Pos logit contributions
jo+0.035
 Carly+0.034
elle+0.033
les+0.033
of+0.033
Neg logit contributions
 square-0.013
 cube-0.013
 hammer-0.012
 very-0.012
 sp-0.012
Token still
Token ID991
Feature activation+3.43e-03

Pos logit contributions
 very+0.007
 bucket+0.007
 clean+0.007
 still+0.007
 hammer+0.007
Neg logit contributions
jo-0.018
les-0.018
le-0.017
rob-0.017
elle-0.016
Token rocking
Token ID38059
Feature activation+8.61e-03

Pos logit contributions
jo+0.007
les+0.007
 Carly+0.006
azz+0.006
of+0.006
Neg logit contributions
 tallest-0.004
 cube-0.004
 bucket-0.004
 sp-0.004
 square-0.004
Token as
Token ID355
Feature activation+4.26e-03

Pos logit contributions
jo+0.019
 Carly+0.018
azz+0.017
fulness+0.017
Kat+0.017
Neg logit contributions
 cube-0.009
 cubes-0.009
 bucket-0.009
 tallest-0.009
 sp-0.009
Token he
Token ID339
Feature activation-4.82e-03

Pos logit contributions
jo+0.009
 Carly+0.008
les+0.008
le+0.008
lor+0.008
Neg logit contributions
 cube-0.005
 tallest-0.005
 cubes-0.005
 square-0.005
 nail-0.005
Token walked
Token ID6807
Feature activation-1.25e-02

Pos logit contributions
 sp+0.006
 tallest+0.005
 cube+0.005
 bucket+0.005
 cracked+0.005
Neg logit contributions
 Carly-0.010
jo-0.010
fulness-0.010
 Carol-0.009
 Debbie-0.009
Token have
Token ID423
Feature activation+5.91e-03

Pos logit contributions
 tallest+0.003
 cube+0.003
 nail+0.003
 cubes+0.003
 brick+0.003
Neg logit contributions
jo-0.006
 Carly-0.006
fulness-0.005
iko-0.005
les-0.005
Token fun
Token ID1257
Feature activation+4.64e-03

Pos logit contributions
jo+0.013
 Carly+0.013
fulness+0.012
iko+0.012
les+0.012
Neg logit contributions
 tallest-0.006
 cube-0.006
 cubes-0.006
 sp-0.006
 bucket-0.005
Token.
Token ID13
Feature activation-1.58e-02

Pos logit contributions
jo+0.009
 Carly+0.009
fulness+0.009
iko+0.008
 Debbie+0.008
Neg logit contributions
 cube-0.006
 tallest-0.006
 cubes-0.005
 sp-0.005
 nail-0.005
Token One
Token ID1881
Feature activation-7.56e-03

Pos logit contributions
 tallest+0.027
 cube+0.026
 cubes+0.026
 nail+0.026
 brick+0.025
Neg logit contributions
jo-0.024
 Carly-0.023
fulness-0.021
iko-0.020
les-0.020
Token day
Token ID1110
Feature activation-2.95e-02

Pos logit contributions
 tallest+0.012
nam+0.012
 cubes+0.012
 nail+0.011
 nails+0.011
Neg logit contributions
jo-0.013
 Carly-0.011
fulness-0.011
le-0.010
unt-0.010
Token,
Token ID11
Feature activation-3.78e-02

Pos logit contributions
 tallest+0.042
 cube+0.040
 nail+0.039
 brick+0.039
 cubes+0.039
Neg logit contributions
jo-0.053
 Carly-0.051
fulness-0.048
of-0.046
 Carol-0.046
Token he
Token ID339
Feature activation-9.75e-03

Pos logit contributions
 tallest+0.068
 cube+0.068
 cubes+0.066
 nail+0.066
 brick+0.066
Neg logit contributions
jo-0.053
 Carly-0.050
fulness-0.046
iko-0.044
les-0.043
Token saw
Token ID2497
Feature activation-1.41e-02

Pos logit contributions
 sp+0.011
 bucket+0.010
 cube+0.010
 cracked+0.010
 square+0.009
Neg logit contributions
fulness-0.022
jo-0.021
 Carly-0.021
lor-0.020
 Carol-0.019
Token a
Token ID257
Feature activation-2.79e-03

Pos logit contributions
 cube+0.014
 cubes+0.013
 sp+0.013
 tallest+0.013
 brick+0.013
Neg logit contributions
jo-0.034
 Carly-0.032
fulness-0.031
of-0.030
iko-0.030
Token big
Token ID1263
Feature activation-2.99e-03

Pos logit contributions
 tallest+0.003
 cubes+0.003
 bricks+0.003
 cube+0.003
 nails+0.003
Neg logit contributions
jo-0.005
 Carly-0.005
unt-0.005
 Carol-0.005
iko-0.005
Token tree
Token ID5509
Feature activation-5.61e-03

Pos logit contributions
 tallest+0.003
nam+0.003
 cubes+0.003
 cube+0.003
 nail+0.003
Neg logit contributions
jo-0.006
 Carly-0.006
 Carol-0.005
iko-0.005
les-0.005
Token It
Token ID632
Feature activation+7.59e-03

Pos logit contributions
 tallest+0.004
 cube+0.004
 nail+0.004
 cubes+0.004
 brick+0.004
Neg logit contributions
jo-0.006
 Carly-0.005
fulness-0.005
iko-0.005
les-0.005
Token does
Token ID857
Feature activation+5.85e-03

Pos logit contributions
jo+0.015
 Carly+0.014
fulness+0.014
iko+0.013
 Carol+0.013
Neg logit contributions
 tallest-0.009
 cube-0.009
 nail-0.009
 cubes-0.009
 brick-0.009
Token not
Token ID407
Feature activation-5.94e-03

Pos logit contributions
jo+0.012
 Carly+0.011
fulness+0.011
le+0.011
of+0.011
Neg logit contributions
 cube-0.007
 tallest-0.007
 brick-0.006
 cubes-0.006
 nail-0.006
Token hurt
Token ID5938
Feature activation-8.80e-03

Pos logit contributions
 cube+0.007
 tallest+0.007
 nail+0.007
 sp+0.007
 cubes+0.007
Neg logit contributions
jo-0.012
 Carly-0.012
fulness-0.011
 Carol-0.011
iko-0.011
Token your
Token ID534
Feature activation-5.55e-03

Pos logit contributions
 cubes+0.009
 nail+0.009
 square+0.009
 bricks+0.009
 cube+0.009
Neg logit contributions
jo-0.019
 Carly-0.019
lor-0.018
rob-0.017
Jo-0.017
Token eyes
Token ID2951
Feature activation-3.67e-02

Pos logit contributions
 cube+0.004
 tallest+0.004
 sp+0.004
 bucket+0.004
 nail+0.004
Neg logit contributions
jo-0.014
 Carly-0.014
fulness-0.013
of-0.013
 Carol-0.013
Token.
Token ID13
Feature activation+7.26e-04

Pos logit contributions
 tallest+0.054
 cube+0.053
 brick+0.052
 cubes+0.052
 nail+0.052
Neg logit contributions
jo-0.062
fulness-0.058
 Carly-0.058
les-0.056
le-0.056
Token It
Token ID632
Feature activation+5.28e-03

Pos logit contributions
jo+0.001
 Carly+0.001
fulness+0.001
iko+0.001
les+0.001
Neg logit contributions
 tallest-0.001
 cube-0.001
 nail-0.001
 cubes-0.001
 brick-0.001
Token makes
Token ID1838
Feature activation-4.58e-03

Pos logit contributions
jo+0.010
 Carly+0.010
fulness+0.009
les+0.009
iko+0.009
Neg logit contributions
 tallest-0.007
 cube-0.006
 cubes-0.006
 nail-0.006
 brick-0.006
Token your
Token ID534
Feature activation-8.23e-03

Pos logit contributions
 dough+0.006
 sp+0.006
 cube+0.006
 nail+0.006
 cubes+0.006
Neg logit contributions
jo-0.009
 Carly-0.008
iko-0.008
rob-0.008
fulness-0.008
Token hair
Token ID4190
Feature activation-2.10e-02

Pos logit contributions
 cube+0.006
 tallest+0.006
 bucket+0.006
 sp+0.006
 nail+0.005
Neg logit contributions
jo-0.021
 Carly-0.020
fulness-0.020
of-0.019
iko-0.019
Token said
Token ID531
Feature activation+6.47e-04

Pos logit contributions
 cube+0.004
 nail+0.004
 sp+0.004
 tallest+0.004
 bucket+0.004
Neg logit contributions
jo-0.006
 Carly-0.006
fulness-0.005
 Carol-0.005
 Debbie-0.005
Token earlier
Token ID2961
Feature activation+3.76e-04

Pos logit contributions
jo+0.001
 Carly+0.001
fulness+0.001
iko+0.001
of+0.001
Neg logit contributions
 cube-0.001
 tallest-0.001
 nail-0.001
 cubes-0.001
 brick-0.001
Token about
Token ID546
Feature activation+1.14e-03

Pos logit contributions
jo+0.001
 Carly+0.001
rett+0.001
iko+0.001
les+0.001
Neg logit contributions
 cube-0.000
 nail-0.000
 sp-0.000
 brick-0.000
 dough-0.000
Token wiping
Token ID36906
Feature activation-1.05e-02

Pos logit contributions
jo+0.002
 Carly+0.002
fulness+0.002
les+0.002
of+0.002
Neg logit contributions
 cube-0.001
 tallest-0.001
 nail-0.001
 cubes-0.001
 brick-0.001
Token their
Token ID511
Feature activation-2.55e-02

Pos logit contributions
 cube+0.013
 dough+0.013
 sp+0.013
 nail+0.013
 tallest+0.013
Neg logit contributions
jo-0.021
 Carly-0.020
iko-0.019
azz-0.019
lor-0.019
Token shoes
Token ID10012
Feature activation-3.51e-02

Pos logit contributions
 tallest+0.039
 nail+0.036
 brick+0.036
 cube+0.035
nam+0.035
Neg logit contributions
jo-0.040
les-0.038
 Carly-0.038
fulness-0.036
le-0.036
Token.
Token ID13
Feature activation-4.53e-04

Pos logit contributions
 tallest+0.053
 cube+0.052
 cubes+0.051
 brick+0.051
 nail+0.050
Neg logit contributions
jo-0.055
fulness-0.053
 Carly-0.053
of-0.050
le-0.049
Token She
Token ID1375
Feature activation-9.85e-03

Pos logit contributions
 cube+0.001
 tallest+0.001
 brick+0.001
 bucket+0.001
 square+0.001
Neg logit contributions
jo-0.001
 Carly-0.001
fulness-0.001
le-0.001
iko-0.001
Token smiled
Token ID13541
Feature activation-2.30e-03

Pos logit contributions
 cube+0.014
 tallest+0.013
 nail+0.013
 sp+0.013
 bucket+0.013
Neg logit contributions
jo-0.018
 Carly-0.017
fulness-0.017
iko-0.016
 Carol-0.016
Token and
Token ID290
Feature activation-6.02e-03

Pos logit contributions
 cube+0.003
 tallest+0.003
 sp+0.003
 bucket+0.003
 brick+0.003
Neg logit contributions
jo-0.004
 Carly-0.004
iko-0.004
fulness-0.004
rob-0.004
Token wiped
Token ID21122
Feature activation+5.91e-03

Pos logit contributions
 cube+0.009
 tallest+0.009
 sp+0.009
 bucket+0.009
 nail+0.009
Neg logit contributions
jo-0.010
 Carly-0.009
fulness-0.009
iko-0.009
le-0.008
Token<|endoftext|>
Token ID50256
Feature activation+2.57e-03

Pos logit contributions
jo+0.002
 Carly+0.002
le+0.002
fulness+0.002
les+0.002
Neg logit contributions
 tallest-0.002
 cube-0.002
 brick-0.002
 nail-0.002
 cubes-0.002
TokenOne
Token ID3198
Feature activation-4.74e-03

Pos logit contributions
jo+0.005
 Carly+0.005
fulness+0.005
iko+0.004
elle+0.004
Neg logit contributions
 cube-0.003
 tallest-0.003
 bucket-0.003
 cubes-0.003
 square-0.003
Token day
Token ID1110
Feature activation-1.03e-02

Pos logit contributions
 tallest+0.007
 nail+0.007
nam+0.007
 cubes+0.007
 nails+0.007
Neg logit contributions
jo-0.008
 Carly-0.008
fulness-0.007
of-0.007
le-0.007
Token,
Token ID11
Feature activation-1.85e-02

Pos logit contributions
 cube+0.012
 cubes+0.012
 tallest+0.012
 sp+0.012
 brick+0.012
Neg logit contributions
jo-0.021
 Carly-0.020
fulness-0.019
iko-0.019
les-0.018
Token Mom
Token ID11254
Feature activation-9.03e-03

Pos logit contributions
 cube+0.020
 tallest+0.020
 nail+0.019
 cubes+0.019
 brick+0.019
Neg logit contributions
jo-0.039
 Carly-0.037
fulness-0.036
iko-0.035
les-0.034
Token and
Token ID290
Feature activation-3.67e-02

Pos logit contributions
 cube+0.015
 tallest+0.014
 cubes+0.014
 brick+0.014
 nail+0.014
Neg logit contributions
jo-0.014
 Carly-0.014
fulness-0.013
iko-0.012
 Carol-0.012
Token Dad
Token ID17415
Feature activation-1.42e-02

Pos logit contributions
 tallest+0.065
 nail+0.056
 stack+0.054
 centimeters+0.054
 square+0.054
Neg logit contributions
 Carly-0.056
 Debbie-0.052
jo-0.050
fulness-0.050
 Margaret-0.048
Token took
Token ID1718
Feature activation-9.52e-03

Pos logit contributions
 tallest+0.020
 nail+0.018
 cube+0.018
 brick+0.018
 cubes+0.018
Neg logit contributions
jo-0.025
 Carly-0.024
fulness-0.023
les-0.023
of-0.023
Token their
Token ID511
Feature activation-5.77e-03

Pos logit contributions
 dough+0.011
 cube+0.011
 sp+0.011
 cubes+0.011
 squash+0.011
Neg logit contributions
jo-0.019
fulness-0.018
 Carly-0.017
iko-0.017
lor-0.017
Token daughter
Token ID4957
Feature activation-1.81e-03

Pos logit contributions
 tallest+0.008
 nail+0.008
 cube+0.008
 cubes+0.007
nam+0.007
Neg logit contributions
jo-0.010
 Carly-0.010
 Carol-0.009
 Margaret-0.009
unt-0.009
Token to
Token ID284
Feature activation-4.10e-03

Pos logit contributions
 tallest+0.003
 cube+0.003
 nail+0.003
 cubes+0.003
 brick+0.003
Neg logit contributions
jo-0.003
 Carly-0.003
fulness-0.003
of-0.002
 Carol-0.002
Token how
Token ID703
Feature activation-4.59e-03

Pos logit contributions
 cube+0.001
 tallest+0.001
 cubes+0.001
 bucket+0.001
 brick+0.001
Neg logit contributions
jo-0.001
 Carly-0.001
fulness-0.001
of-0.001
 Carol-0.001
Token miserable
Token ID22444
Feature activation-8.07e-03

Pos logit contributions
 tallest+0.006
 cube+0.006
 cubes+0.006
 nail+0.005
 brick+0.005
Neg logit contributions
jo-0.009
 Carly-0.008
fulness-0.008
les-0.008
of-0.008
Token the
Token ID262
Feature activation-1.04e-02

Pos logit contributions
 cube+0.011
 tallest+0.011
 nail+0.010
 brick+0.010
 cubes+0.010
Neg logit contributions
jo-0.015
 Carly-0.014
iko-0.013
of-0.013
fulness-0.013
Token task
Token ID4876
Feature activation+2.27e-03

Pos logit contributions
 cube+0.012
 tallest+0.012
 cubes+0.011
 nail+0.011
 brick+0.011
Neg logit contributions
jo-0.021
 Carly-0.020
fulness-0.019
iko-0.018
 Carol-0.018
Token may
Token ID743
Feature activation+6.79e-03

Pos logit contributions
jo+0.005
 Carly+0.005
iko+0.004
 Debbie+0.004
fulness+0.004
Neg logit contributions
 cube-0.003
 nail-0.002
 tallest-0.002
 brick-0.002
 bucket-0.002
Token seem
Token ID1283
Feature activation-3.66e-02

Pos logit contributions
jo+0.015
 Carly+0.015
iko+0.014
fulness+0.014
 Debbie+0.014
Neg logit contributions
 tallest-0.007
 cube-0.006
 nail-0.006
 brick-0.006
 sp-0.006
Token,
Token ID11
Feature activation-5.22e-03

Pos logit contributions
 cube+0.053
 tallest+0.052
 cubes+0.051
 nail+0.050
 brick+0.050
Neg logit contributions
jo-0.062
 Carly-0.059
fulness-0.058
les-0.055
of-0.054
Token courage
Token ID11917
Feature activation+2.50e-03

Pos logit contributions
 tallest+0.007
 cube+0.007
 nail+0.007
 brick+0.007
 cubes+0.007
Neg logit contributions
jo-0.010
 Carly-0.009
iko-0.009
les-0.009
rett-0.009
Token is
Token ID318
Feature activation-4.18e-03

Pos logit contributions
jo+0.005
 Carly+0.005
iko+0.005
 Debbie+0.005
rett+0.005
Neg logit contributions
 cube-0.003
 tallest-0.003
 nail-0.003
 bucket-0.003
 brick-0.003
Token needed
Token ID2622
Feature activation+2.65e-03

Pos logit contributions
 tallest+0.005
 cube+0.005
 nail+0.005
 cubes+0.005
 brick+0.005
Neg logit contributions
jo-0.008
 Carly-0.008
fulness-0.007
iko-0.007
of-0.007
Token and
Token ID290
Feature activation+9.93e-03

Pos logit contributions
jo+0.006
 Carly+0.005
iko+0.005
 Debbie+0.005
azz+0.005
Neg logit contributions
 nail-0.003
 bucket-0.003
 brick-0.003
 cube-0.003
 nails-0.003